Commit 4c057a6

PMM-14411 - Enhance pg_custom_replication_wal query for accurate WAL … (#338)
* PMM-14411 - Enhance pg_custom_replication_wal query for accurate WAL metrics

  Updated the pg_custom_replication_wal query to conditionally return the correct WAL metrics based on the database recovery state. Added logic to differentiate between received and current WAL LSNs, and adjusted lag calculation accordingly. Set the 'master' flag to true for this query to ensure it runs on the primary database instance.

* Refactor pg_custom_replication_wal query for clarity and accuracy

  Updated the pg_custom_replication_wal query to improve readability and ensure accurate reporting of WAL metrics based on the node type (primary or replica). Adjusted the descriptions of the metrics to specify their relevance to either the primary or replica nodes, enhancing the overall understanding of the metrics collected.
1 parent 3720260 commit 4c057a6

1 file changed, +16 -8 lines changed


queries-hr.yml

Lines changed: 16 additions & 8 deletions
@@ -84,25 +84,33 @@ pg_custom_database_size_custom:
         description: "Disk space used by the database"
 
 pg_custom_replication_wal:
+  master: true
   query: |
     SELECT
-      pg_last_wal_receive_lsn() AS received_lsn,
-      pg_last_wal_replay_lsn() AS replayed_lsn,
-      pg_current_wal_lsn() AS current_lsn,
-      pg_current_wal_lsn() - pg_last_wal_replay_lsn() AS lag_bytes;
+      CASE WHEN pg_is_in_recovery() THEN 'replica' ELSE 'primary' END AS node_type,
+      CASE WHEN pg_is_in_recovery() THEN pg_last_wal_receive_lsn() ELSE NULL END AS received_lsn,
+      CASE WHEN pg_is_in_recovery() THEN pg_last_wal_replay_lsn() ELSE NULL END AS replayed_lsn,
+      CASE WHEN pg_is_in_recovery() THEN NULL ELSE pg_current_wal_lsn() END AS current_lsn,
+      CASE
+        WHEN pg_is_in_recovery() THEN pg_last_wal_receive_lsn() - pg_last_wal_replay_lsn()
+        ELSE NULL
+      END AS lag_bytes
   metrics:
+    - node_type:
+        usage: "LABEL"
+        description: "Type of node (primary or replica)."
     - received_lsn:
         usage: "GAUGE"
-        description: "Last WAL location received by the standby server."
+        description: "Last WAL location received by the standby server (replica only)."
     - replayed_lsn:
         usage: "GAUGE"
-        description: "Last WAL location replayed by the standby server."
+        description: "Last WAL location replayed by the standby server (replica only)."
     - current_lsn:
         usage: "GAUGE"
-        description: "Current WAL location on the primary server."
+        description: "Current WAL location on the primary server (primary only)."
     - lag_bytes:
         usage: "GAUGE"
-        description: "Current WAL replication lag in bytes."
+        description: "Current WAL replication lag in bytes (replica only)."
 
 pg_custom_stat_replication:
   query: |
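
For reviewers who want to sanity-check the new row shape, here is a minimal Python sketch of what the CASE expressions in the updated query return on each node type. It is not part of the commit. The helper names `lsn_to_bytes` and `wal_row` are hypothetical, and the LSN values are made up for illustration. The sketch relies on the textual `pg_lsn` form being two hexadecimal numbers (upper and lower 32 bits) separated by a slash, and on subtracting two LSNs giving a byte count. Note that on a replica the new `lag_bytes` is received minus replayed, i.e. WAL that has arrived but is not yet applied, rather than the distance to the primary's current position.

```python
from typing import Optional


def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual pg_lsn such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # The text form is two hexadecimal numbers: the upper and lower 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def wal_row(in_recovery: bool,
            received: Optional[str] = None,
            replayed: Optional[str] = None,
            current: Optional[str] = None) -> dict:
    """Mirror the query's CASE expressions for one node (illustrative only)."""
    if in_recovery:
        # Replica: report received/replayed positions and the replay backlog.
        lag = None
        if received is not None and replayed is not None:
            lag = lsn_to_bytes(received) - lsn_to_bytes(replayed)
        return {"node_type": "replica", "received_lsn": received,
                "replayed_lsn": replayed, "current_lsn": None, "lag_bytes": lag}
    # Primary: only the current write position is meaningful.
    return {"node_type": "primary", "received_lsn": None,
            "replayed_lsn": None, "current_lsn": current, "lag_bytes": None}


# Hypothetical sample values, not taken from a real server.
replica = wal_row(True, received="0/3000148", replayed="0/3000060")
primary = wal_row(False, current="0/3000148")
print(replica["node_type"], replica["lag_bytes"])
print(primary["node_type"], primary["lag_bytes"])
```

On a replica that has never received WAL, the received position is NULL in PostgreSQL, so the sketch leaves `lag_bytes` as `None` in that case as well.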
