Compare commits

2 commits

Author | SHA1 | Message | Date
Benjamin Renard | 8d172e944c | Code cleaning and introduce some pre-commit hooks | 2024-06-03 15:47:30 +02:00
Benjamin Renard | 5f9573612b | Add details in script output in recovery mode | 2024-06-03 15:43:17 +02:00
3 changed files with 357 additions and 328 deletions

25
.pre-commit-config.yaml Normal file

@ -0,0 +1,25 @@
+ # Pre-commit hooks to run tests and ensure code is cleaned.
+ # See https://pre-commit.com for more information
+ ---
+ repos:
+   - repo: https://github.com/codespell-project/codespell
+     rev: v2.2.2
+     hooks:
+       - id: codespell
+         args:
+           - --ignore-words-list=exten
+           - --skip="./.*,*.csv,*.json,*.ini,*.subject,*.txt,*.html,*.log,*.conf"
+           - --quiet-level=2
+           - --ignore-regex=.*codespell-ignore$
+           # - --write-changes  # Uncomment to write changes
+         exclude_types: [csv, json]
+   - repo: https://github.com/adrienverge/yamllint
+     rev: v1.32.0
+     hooks:
+       - id: yamllint
+         ignore: .github/
+   - repo: https://github.com/pre-commit/mirrors-prettier
+     rev: v2.7.1
+     hooks:
+       - id: prettier
+         args: ["--print-width", "100"]

View file

@ -7,14 +7,14 @@ This script :
  - check if Postgres is running (_CRITICAL_ raise if not)
  - check if Postgres is in recovery mode :
  - if Postgres is in recovery mode :
- - retreive from Postgres the last _xlog_ file receive and the _xlog_ file replay
+ - retrieve from Postgres the last _xlog_ file receive and the _xlog_ file replay
  - check if Postgres recovery configuration file is NOT present (_CRITICAL_ raise if present)
- - retreive master connection informations from Postgres recovery configuration file (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify.
- - retreive the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
+ - retrieve master connection information from Postgres recovery configuration file (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify.
+ - retrieve the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
  - check if the current state of the host is "streaming" (_CRITICAL_ raise if not)
  - check if the current sync state of the host is "sync" (or the state specified using `-e` parameter, _CRITICAL_ raise if not)
  - if the check of the current XLOG file of the master host is enabled :
- - retreive current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
+ - retrieve current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
  - check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raise if not)
  - check if the last received _xlog_ file is the last replay _xlog_ file : if not, check the current delay with the last replayed transaction against _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise corresponding error if they are exceeded
  - Return _OK_ state
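
The README steps above compare the last received and last replayed positions. As a rough illustration only (this is not code from the plugin), the sketch below shows one way to compare two positions numerically in bash. The helper names `lsn_to_int` and `lsn_diff` are hypothetical. It assumes LSNs in PostgreSQL's textual `hi/lo` form, where both parts are hexadecimal and the high part counts 4 GiB blocks, and it assumes 64-bit bash arithmetic.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the plugin shown in this diff.

# Convert a textual LSN such as "16/B374D848" to an integer byte position.
lsn_to_int() {
    local hi="${1%%/*}" lo="${1##*/}"
    # The high part is shifted by 32 bits, the low part is the offset
    echo $(( (16#$hi << 32) + 16#$lo ))
}

# Print the distance in bytes between two LSNs (first minus second).
lsn_diff() {
    echo $(( $(lsn_to_int "$1") - $(lsn_to_int "$2") ))
}

# Example: received position is 0x60 bytes ahead of the replayed one
lsn_diff "0/3000060" "0/3000000"   # prints 96
```

A result of `0` means both positions are identical. The plugin itself reports the delay in seconds rather than in bytes, so this only illustrates the comparison step.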
@ -27,11 +27,11 @@ This script :
  ## Requirements
- * Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster`
- * **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details).
- * **On standby node:** `PG_USER` must be able to connect localy on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`).
+ - Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster`
+ - **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details).
+ - **On standby node:** `PG_USER` must be able to connect locally on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`).
  ## Installation

View file

@ -1,4 +1,5 @@
  #!/bin/bash
+ # vim: tabstop=4 shiftwidth=4 softtabstop=4 expandtab
  #
  # Nagios plugin to check Postgresql streamin replication state
  #
@ -14,7 +15,7 @@
  # ~/.pgpass). This user must have SUPERUSER privilege (need to get replication
  # details).
  #
- # On standby node: PG_USER must be able to connect localy on the database with the same name
+ # On standby node: PG_USER must be able to connect locally on the database with the same name
  # (or another specified with -D) as trust (or using password specified in
  # ~/.pgpass).
  #
@ -57,7 +58,7 @@ Usage: $0 [-d] [-h] [options]
  -m pg_main Specify Postgres main directory path (Default: try to auto-detect or use
  $DEFAULT_PG_MAIN)
  -r recovery_conf Specify Postgres recovery configuration file path
- (Default: [PG_MAIN]/recovery.conf on PG <= 11, [PG_MAIN]/postgresql.auto.conf on PG >= 12)
+ ( Default: [PG_MAIN]/recovery.conf on PG <= 11, [PG_MAIN]/postgresql.auto.conf on PG >= 12)
  -U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf file)
  -p pg_port Specify default Postgres master TCP port (Default: same as local PostgreSQL
  port if detected or use $DEFAULT_PG_PORT)
@ -125,7 +126,7 @@ do
  usage
  ;;
  \?)
- echo -n "Unkown option"
+ echo -n "Unknown option"
  usage
  esac
  done
@ -280,19 +281,19 @@ then
  debug "Last replayed LSN: $LAST_REPLAYED_LSN"
- # Get master connection informations from recovery.conf file
+ # Get master connection information from recovery.conf file
  MASTER_CONN_INFOS=$( egrep '^ *primary_conninfo' $RECOVERY_CONF|sed "s/^ *primary_conninfo *= *\(.\+\) *$/\1/" )
  if [ ! -n "$MASTER_CONN_INFOS" ]
  then
- echo "UNKNOWN: Can't retreive master connection informations form recovery.conf file"
+ echo "UNKNOWN: Can't retrieve master connection information form recovery.conf file"
  exit 3
  fi
- debug "Master connection informations: $MASTER_CONN_INFOS"
+ debug "Master connection information: $MASTER_CONN_INFOS"
  M_HOST=$( echo "$MASTER_CONN_INFOS"| grep 'host=' | sed 's/^.*host= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' )
  if [ ! -n "$M_HOST" ]
  then
- echo "UNKNOWN: Can't retreive master host from recovery.conf file"
+ echo "UNKNOWN: Can't retrieve master host from recovery.conf file"
  exit 3
  fi
  debug "Master host: $M_HOST"
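
The hunk above pulls the master host out of the `primary_conninfo` value with `grep` and `sed`. As a hedged alternative sketch (not the script's own method), the same lookup can be written in pure bash. The helper name `conninfo_get`, the sample connection string and the fallback port `5432` (PostgreSQL's standard port) are assumptions made for illustration. The sketch also assumes values contain no spaces or quoted substrings.

```shell
#!/bin/bash
# Hypothetical helper, not part of the plugin shown in this diff.

# Print the value of one key from a conninfo string like
# "'user=replicator host=192.0.2.10'"; return 1 if the key is absent.
conninfo_get() {
    local key="$1" conninfo="$2" word
    # Drop the surrounding single quotes, then split on whitespace
    for word in ${conninfo//\'/ }; do
        if [ "${word%%=*}" = "$key" ]; then
            echo "${word#*=}"
            return 0
        fi
    done
    return 1
}

# Sample value (assumption): no port given, so the fallback applies
CONNINFO="'user=replicator host=192.0.2.10 application_name=standby1'"
M_HOST=$(conninfo_get host "$CONNINFO")
M_PORT=$(conninfo_get port "$CONNINFO" || echo 5432)
echo "$M_HOST:$M_PORT"   # prints 192.0.2.10:5432
```

Matching on whole `key=value` words avoids a partial match when one key name is a suffix of another.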
@ -348,7 +349,7 @@ then
  M_CUR_REPL_STATE_INFO="$( psql_master_get "SELECT state, sync_state, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn FROM pg_stat_replication WHERE application_name='$M_APP_NAME';" )"
  if [ ! -n "$M_CUR_REPL_STATE_INFO" ]
  then
- echo "UNKNOWN: Can't retreive current replication state information from master server"
+ echo "UNKNOWN: Can't retrieve current replication state information from master server"
  exit 3
  fi
  debug "Master current replication state:\n\tstate|sync_state|sent_lsn|write_lsn\n\t$M_CUR_REPL_STATE_INFO"
@ -380,7 +381,7 @@ then
  M_CUR_LSN="$( psql_master_get "SELECT $pg_current_wal_lsn" )"
  if [ ! -n "$M_CUR_LSN" ]
  then
- echo "UNKNOWN: Can't retreive current LSN from master server"
+ echo "UNKNOWN: Can't retrieve current LSN from master server"
  exit 3
  fi
  debug "Master current LSN: $M_CUR_LSN"
@ -423,7 +424,10 @@ then
  exit 1
  fi
- echo "OK: Hot-standby server is uptodate"
+ echo "OK: Hot-standby server is up-to-date"
+ echo "Replication state: $M_CUR_SYNC_STATE"
+ echo "Last sent/writed LSN: '$M_CUR_SENT_LSN' / '$M_CUR_WRITED_LSN'"
+ [ "$LAST_RECEIVED_LSN" != "$LAST_REPLAYED_LSN" ] && echo "Replay delay: ${REPLAY_DELAY}s"
  exit 0
  else
  debug "File recovery.conf not found. Master mode."
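
The exits in this hunk follow the Nagios plugin convention: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The sketch below shows how a replay delay could be mapped to those codes. It is not taken from the plugin. The function name `check_replay_delay` is hypothetical, the thresholds are assumed to be whole seconds, and treating a delay equal to a threshold as exceeded is also an assumption.

```shell
#!/bin/bash
# Hypothetical helper, not part of the plugin shown in this diff.

# Map a replay delay (seconds) to a Nagios status line and return code.
# Arguments: delay, warning threshold, critical threshold (integers).
check_replay_delay() {
    local delay="$1" warn="$2" crit="$3"
    if [ "$delay" -ge "$crit" ]; then
        echo "CRITICAL: replay delay ${delay}s >= ${crit}s"
        return 2
    elif [ "$delay" -ge "$warn" ]; then
        echo "WARNING: replay delay ${delay}s >= ${warn}s"
        return 1
    fi
    echo "OK: replay delay ${delay}s"
    return 0
}

# Example: 4s of delay with warn=3 and crit=5 yields a WARNING
check_replay_delay 4 3 5 || echo "exit code: $?"
```

The critical threshold is tested first so that a delay above both thresholds is reported as CRITICAL rather than WARNING.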
@ -436,11 +440,11 @@ else
  fi
  debug "Postgres is not in recovery mode"
- # Retreive current lsn
+ # Retrieve current lsn
  CURRENT_LSN=$( psql_get "SELECT $pg_current_wal_lsn" )
  if [ -z "$CURRENT_LSN" ]
  then
- echo "UNKNOWN: Fail to retreive current LSN (Log Sequence Number)"
+ echo "UNKNOWN: Fail to retrieve current LSN (Log Sequence Number)"
  exit 3
  fi
  debug "Current LSN: $CURRENT_LSN"