From d3311da78fe155e303880958ef308d7abe934769 Mon Sep 17 00:00:00 2001
From: Benjamin Renard
Date: Wed, 17 Jul 2024 11:31:30 +0200
Subject: [PATCH] Update what the script does in README.md file

---
 README.md | 53 ++++++++++++++++++++++++++++++++---------------------
 1 file changed, 32 insertions(+), 21 deletions(-)

diff --git a/README.md b/README.md
index 7205cd0..1246996 100644
--- a/README.md
+++ b/README.md
@@ -4,34 +4,45 @@ This script could be used as Nagios check plugin to verify Postgres Streaming re

 This script :

-- check if Postgres is running (_CRITICAL_ raise if not)
-- check if Postgres is in recovery mode :
-  - if Postgres is in recovery mode :
-    - retrieve from Postgres the last _xlog_ file receive and the _xlog_ file replay
-    - check if Postgres recovery configuration file is NOT present (_CRITICAL_ raise if present)
-    - retrieve master connection information from Postgres recovery configuration file (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify.
-    - retrieve the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
-    - check if the current state of the host is "streaming" (_CRITICAL_ raise if not)
-    - check if the current sync state of the host is "sync" (or the state specified using `-e` parameter, _CRITICAL_ raise if not)
-    - if the check of the current XLOG file of the master host is enabled :
-      - retrieve current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
-      - check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raise if not)
-    - check if the last received _xlog_ file is the last replay _xlog_ file : if not, check the current delay with the last replayed transaction against _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise corresponding error if they are exceeded
-    - Return _OK_ state
-  - if Postgres is not in recovery mode :
-    - check if Postgres recovery configuration file is present (_CRITICAL_ raise if present)
-    - check if stand-by client(s) is connected (_WARNING_ raise if not)
-    - Return _OK_ state with list and count of stand-by client(s)
+- check if Postgres is running (_CRITICAL_ raised if not)
+- check if Postgres is in recovery mode (using `pg_is_in_recovery()`)
+- if the expected mode needs to be auto-detected (default, see `-e` parameter):
+  - if Postgres is in recovery mode: `hot-standby`
+  - if the recovery file is present and contains `primary_conninfo`: `hot-standby`
+  - otherwise: `master`
+- if the expected mode is `hot-standby`:
+  - check if Postgres is in recovery mode (_CRITICAL_ raised if not)
+  - retrieve from Postgres the last _xlog_ file received and the last _xlog_ file replayed
+  - retrieve master connection information from the Postgres `primary_conninfo` configuration parameter (_UNKNOWN_ raised on error). The default Postgres master TCP port is used if no port is specified.
+  - retrieve the master sync mode from the `synchronous_commit` setting (synchronous commit is assumed to be enabled if `synchronous_commit` is equal to `on` or `remote_apply`)
+  - retrieve the current state and sync state of the host from the Postgres master server by connecting to it (_UNKNOWN_ raised on error).
+  - check if the current state of the host is `streaming` (_CRITICAL_ raised if not)
+  - check if the current sync state of the host is the expected one (default: `sync`, see `-e` parameter, _CRITICAL_ raised if not)
+  - if the check of the current XLOG file of the master host is enabled:
+    - retrieve the current _xlog_ file from the Postgres master server by connecting to it (_UNKNOWN_ raised on error).
+    - check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raised if not)
+  - check if the last received _xlog_ file is the last replayed _xlog_ file: if not, check the current delay since the last replayed transaction against the _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise the corresponding error if they are exceeded
+  - if synchronous commit is enabled on master, check that the last _xlog_ file sent by the Postgres master is the last one received by the slave. If not, retrieve the difference (in bytes) and raise a _WARNING_.
+  - if synchronous commit is disabled on master, check that the last _xlog_ file sent by the Postgres master is the last one written by the slave. If not, retrieve the difference (in bytes) and raise a _WARNING_.
+  - otherwise, return _OK_ state
+- if the expected mode is `master`:
+  - check if Postgres is in recovery mode (_CRITICAL_ raised if it is)
+  - retrieve the current _xlog_ file (_UNKNOWN_ raised on error)
+  - retrieve the sync mode from the `synchronous_commit` setting (synchronous commit is assumed to be enabled if `synchronous_commit` is equal to `on` or `remote_apply`)
+  - list stand-by client(s) from master and check for each of them:
+    - if synchronous commit is enabled, check that the last _xlog_ file sent by the Postgres master is the current one on master (_WARNING_ raised if not)
+    - if synchronous commit is disabled, check that the last _xlog_ file sent by the Postgres master is the last written one on master (_WARNING_ raised if not)
+  - otherwise, return _OK_ state with list and count of stand-by client(s)

 **Note :** This script was originally write for PostgreSQL 9.1 and test on 9.1, 9.5, 9.6, 11, 13 and 15. Do not hesitate to tell me how this script work with other versions and share some fix. All contributions are welcome !

 ## Requirements

-- Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster`
+- Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lsclusters`

-- **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details).
+- **On master node:** Slaves must be able to connect with the user from `recovery.conf` / `postgresql.auto.conf` (or the user specified using `-U`) to the database with the same name (or another specified with `-D`) as `trust` (or using a password specified in `~/.pgpass`). This user must have the `SUPERUSER` privilege (needed to get replication details).

-- **On standby node:** `PG_USER` must be able to connect locally on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`).
+- **On standby node:** `PG_USER` must be able to connect locally to the database with the same name (or another specified with `-D`) as `trust` (or using a password specified in `~/.pgpass`).

 ## Installation