Compare commits

...

8 commits

Author SHA1 Message Date
Benjamin Renard
e1392060a3
Add .editorconfig file and ignore dist directory by git
All checks were successful
Run tests / tests (push) Successful in 2m40s
2024-07-17 11:49:25 +02:00
Benjamin Renard
b61ab45ae1
debian package: provide a manpage built from README file and fix setting version and date in script 2024-07-17 11:49:24 +02:00
Benjamin Renard
fb02ac9269
Fix debian package build dependencies 2024-07-17 11:49:24 +02:00
Benjamin Renard
d3311da78f
Update what the script do in README.md file 2024-07-17 11:49:24 +02:00
Benjamin Renard
e5514e587f
Fix taking care of synchronous_commit master configuration and removed useless -C parameter and its check 2024-07-17 11:49:23 +02:00
Benjamin Renard
bc078f83e8
Code cleaning 2024-07-16 13:43:26 +02:00
Benjamin Renard
269b92415f
Update README.md 2024-07-16 10:40:47 +02:00
Benjamin Renard
ff5059ed79
Fix Debian package dependencies
All checks were successful
Run tests / tests (push) Successful in 2m46s
2024-07-16 10:37:28 +02:00
6 changed files with 320 additions and 160 deletions

15
.editorconfig Normal file
View file

@ -0,0 +1,15 @@
root = true
[*]
indent_style = space
indent_size = 4
trim_trailing_whitespace = true
insert_final_newline = true
charset = utf-8
end_of_line = lf
[*.{yaml,yml}]
indent_size = 2
[Makefile]
indent_style = tab

1
.gitignore vendored
View file

@ -1,2 +1,3 @@
*~
/.env
/dist

104
README.md
View file

@ -4,37 +4,60 @@ This script could be used as Nagios check plugin to verify Postgres Streaming re
This script :
- check if Postgres is running (_CRITICAL_ raise if not)
- check if Postgres is in recovery mode :
- if Postgres is in recovery mode :
- retrieve from Postgres the last _xlog_ file receive and the _xlog_ file replay
- check if Postgres recovery configuration file is NOT present (_CRITICAL_ raise if present)
- retrieve master connection information from Postgres recovery configuration file (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify.
- retrieve the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
- check if the current state of the host is "streaming" (_CRITICAL_ raise if not)
- check if the current sync state of the host is "sync" (or the state specified using `-e` parameter, _CRITICAL_ raise if not)
- if the check of the current XLOG file of the master host is enabled :
- retrieve current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
- check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raise if not)
- check if the last received _xlog_ file is the last replay _xlog_ file : if not, check the current delay with the last replayed transaction against _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise corresponding error if they are exceeded
- Return _OK_ state
- if Postgres is not in recovery mode :
- check if Postgres recovery configuration file is present (_CRITICAL_ raise if present)
- check if stand-by client(s) is connected (_WARNING_ raise if not)
- Return _OK_ state with list and count of stand-by client(s)
- check if Postgres is running (_CRITICAL_ raise if not)
- check if Postgres is in recovery mode (using ̀`pg_is_in_recovery()`) :
- if the expected mode need to be auto-detected (default, see ̀`-e` parameter):
- if Postgres is in recovery mode: `hot-standby`
- if recovery file is present and contain `primary_conninfo`: `hot-standby`
- otherwise: `master`
- if expected mode is `hot-standby`:
- check if Postgres is in recovery mode (_CRITICAL_ raise if not)
- retrieve from Postgres the last _xlog_ file received and the _xlog_ file replayed
- retrieve master connection information from Postgres `primary_conninfo` configuration parameter (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify.
- retrieve master sync mode from `synchronous_commit` setting and assume synchronous commit is enabled if `synchronous_commit` is equal to `on` or `remote_apply`)
- retrieve the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
- check if the current state of the host is `streaming` (_CRITICAL_ raise if not)
- check if the current sync state of the host is the expected one (default: `sync`, see `-e` parameter, _CRITICAL_ raise if not)
- if the check of the current XLOG file of the master host is enabled :
- retrieve current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
- check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raise if not)
- check if the last received _xlog_ file is the last replayed _xlog_ file : if not, check the current delay with the last replayed transaction against _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise corresponding error if they are exceeded
- if synchronous commit is enabled on master, check the last _xlog_ file sent by Postgres master is the last received by the slave. If not, retrieve difference (in bytes) and raise a _WARNING_.
- if synchronous commit is disabled on master, check the last _xlog_ file sent by Postgres master is the last writed by the slave. If not, retrieve difference (in bytes) and raise a _WARNING_.
- otherwise, return _OK_ state
- if expected mode is `master`:
- check if Postgres is in recovery mode (_CRITICAL_ raise if it is)
- retrieve current _xlog_ file (_UNKNOWN_ raise on error)
- retrieve sync mode from `synchronous_commit` setting and assume synchronous commit is enabled if `synchronous_commit` is equal to `on` or `remote_apply`)
- list stand-by client(s) from master and check for each to them:
- if synchronous commit is enabled, check the last _xlog_ file sent by Postgres master is the current one on master (_WARNING_ raise if not)
- if synchronous commit is disabled, check the last _xlog_ file sent by Postgres master is the laster writed one on master (_WARNING_ raise if not)
- otherwise, return _OK_ state with list and count of stand-by client(s)
**Note :** This script was originally write for PostgreSQL 9.1 and test on 9.1, 9.5, 9.6, 11, 13 and 15. Do not hesitate to tell me how this script work with other versions and share some fix. All contributions are welcome !
## Requirements
- Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster`
- Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster`
- **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details).
- **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details).
- **On standby node:** `PG_USER` must be able to connect locally on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`).
- **On standby node:** `PG_USER` must be able to connect locally on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`).
## Installation
### From debian packages
```
echo "deb http://debian.zionetrix.net stable main" | sudo tee /etc/apt/sources.list.d/zionetrix.list
sudo apt -o Acquire::AllowInsecureRepositories=true -o Acquire::AllowDowngradeToInsecureRepositories=true update
sudo apt -o APT::Get::AllowUnauthenticated=true install --yes zionetrix-archive-keyring
sudo apt update
sudo apt install check-pg-streaming-replication
```
### From sources
```
apt install sudo awk sed bc postgresql-client
git clone https://gitea.zionetrix.net/bn8/check_pg_streaming_replication.git \
@ -48,27 +71,34 @@ ln -s /usr/local/src/check_pg_streaming_replication/check_pg_streaming_replicati
```
Usage: ./check_pg_streaming_replication [-d] [-h] [options]
-u pg_user Specify local Postgres user (Default: try to auto-detect or use postgres)
-u pg_user Specify local Postgres user (Default: try to auto-detect or
use postgres)
-b psql_bin Specify psql binary path (Default: /usr/bin/psql)
-B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: /usr/bin/pg_lsclusters)
-V pg_version Specify Postgres version (Default: try to auto-detect or use 9.1)
-m pg_main Specify Postgres main directory path (Default: try to auto-detect or use
/var/lib/postgresql//main)
-V pg_version Specify Postgres version (Default: try to auto-detect or
use 9.1)
-m pg_main Specify Postgres main directory path (Default: try to auto-detect or
use /var/lib/postgresql//main)
-r recovery_conf Specify Postgres recovery configuration file path
( Default: [PG_MAIN]/recovery.conf on PG <= 11, [PG_MAIN]/postgresql.auto.conf on PG >= 12)
-U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf file)
-p pg_port Specify default Postgres master TCP port (Default: same as local PostgreSQL
port if detected or use 5432)
-D dbname Specify DB name on Postgres master/slave to connect on (Default: PG_USER, must
match with .pgpass one is used)
-C 1/0 Enable or disable check if the current LSN of the master host is the same
of the last received LSN (Default: 1)
-w replay_warn_delay Specify the replay warning delay in second (Default: 3)
-c replay_crit_delay Specify the replay critical delay in second (Default: 5)
-e expected_sync_state The expected replication state ('sync' or 'async', default: sync)
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto', default: 'auto')
(Default: [PG_MAIN]/recovery.conf on PG <= 11,
[PG_MAIN]/postgresql.auto.conf on PG >= 12)
-U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf
file)
-p pg_port Specify default Postgres master TCP port (Default: same as local
PostgreSQL port if detected or use 5432)
-D dbname Specify DB name on Postgres master/slave to connect on (Default:
PG_USER, must match with .pgpass one is used)
-w replay_warn_delay Specify the replay warning delay in second
(Default: 3)
-c replay_crit_delay Specify the replay critical delay in second
(Default: 5)
-e expected_sync_state The expected replication state ('sync' or 'async',
default: sync)
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto',
default: 'auto')
-d Debug mode
-h Show this message
```
## Copyright

View file

@ -1,16 +1,21 @@
#!/bin/bash
# vim: tabstop=4 shiftwidth=4 softtabstop=4 expandtab
QUIET_ARG=""
[[ "$1" == "--quiet" ]] && QUIET_ARG="--quiet"
set -e
QUIET_MODE=0
[[ "$1" == "--quiet" ]] && QUIET_MODE=1
# Enter source directory
cd "$( dirname "$0" )" || exit
CHECK_FILE="$( find "." -name 'check_*' ! -name '*~' -type f -executable | head -n 1 )"
PACKAGE_NAME="$( basename "$CHECK_FILE" | tr '_' '-' )"
if [[ -d dist ]]; then
echo "Clean previous build..."
rm -fr dist
fi
echo "Clean previous build..."
rm -fr dist
CHECK_FILE="$( find "." -maxdepth 1 -name 'check_*' ! -name '*~' -type f -executable | head -n 1 )"
PACKAGE_NAME="$( basename "$CHECK_FILE" | tr '_' '-' )"
echo "Detect version using git describe..."
VERSION="$( git describe --tags|sed 's/^[^0-9]*//' )"
@ -18,46 +23,80 @@ VERSION="$( git describe --tags|sed 's/^[^0-9]*//' )"
echo "Create building environemt..."
BDIR="dist/$PACKAGE_NAME-$VERSION"
mkdir -p "$BDIR"
RSYNC_ARG=""
[[ -z "$QUIET_ARG" ]] && RSYNC_ARG="-v"
rsync -a "$RSYNC_ARG" debian/ "$BDIR/debian/"
RSYNC_ARGS=( )
[[ $QUIET_MODE -eq 0 ]] && RSYNC_ARGS+=( -v )
rsync -a "${RSYNC_ARGS[@]}" debian/ "$BDIR/debian/"
cp "$CHECK_FILE" "$BDIR/"
echo "Set VERSION=$VERSION in gitdch using sed..."
sed -i "s/^VERSION *=.*$/VERSION = '$VERSION'/" "$BDIR/$( basename "$CHECK_FILE" )"
if [[ -e "README.md" ]]; then
echo "Build manpage from README.md file..."
MAN_TITLE=$( basename "$CHECK_FILE" )
MAN_MD_FILE="$BDIR/$( basename "$CHECK_FILE" ).1.md"
MAN_FILE="$BDIR/$( basename "$CHECK_FILE" ).1"
echo "# ${MAN_TITLE^^} 1" \
"$( git log --follow --format=%ad --date iso README.md | head -n 1 | awk '{print $1}')" \
"$PACKAGE_NAME" \
'"User Manuals"' > "$MAN_MD_FILE"
sed 1d README.md >> "$MAN_MD_FILE"
if ! which go-md2man > /dev/null; then
ARG_ARGS=()
[[ $QUIET_MODE -eq 0 ]] && ARG_ARGS+=( -qq )
apt update "${ARG_ARGS[@]}"
apt install -y "${ARG_ARGS[@]}" go-md2man
fi
go-md2man -in "$MAN_MD_FILE" -out "$MAN_FILE"
basename "$MAN_FILE" >> "$BDIR/debian/manpages"
grep -Eq '^Build-Depends: .*go-md2man' "$BDIR/debian/control" || \
sed -i 's/^Build-Depends: \(.*\)$/Build-Depends: \1, go-md2man/' "$BDIR/debian/control"
fi
if grep -Eq '^VERSION *=' "$BDIR/$( basename "$CHECK_FILE" )"; then
echo "Set VERSION=$VERSION in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^VERSION *=.*$/VERSION = '$VERSION'/" "$BDIR/$( basename "$CHECK_FILE" )"
elif grep -Eq '^# Version:' "$BDIR/$( basename "$CHECK_FILE" )"; then
echo "Set Version = $VERSION in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^#\(\s*Version:\s*\).*$/#\1$VERSION/" "$BDIR/$( basename "$CHECK_FILE" )"
fi
if grep -Eq '^# Date:' "$BDIR/$( basename "$CHECK_FILE" )"; then
DATE="$( git log --follow --format=%ad --date iso "$CHECK_FILE" | head -n 1 )"
echo "Set Date = $DATE in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^#\(\s*Date:\s*\).*$/#\1$DATE/" "$BDIR/$( basename "$CHECK_FILE" )"
fi
if [[ -z "$DEBIAN_CODENAME" ]]; then
echo "Retrieve debian codename using lsb_release..."
DEBIAN_CODENAME=$( lsb_release -c -s )
echo "Retrieve debian codename using lsb_release..."
DEBIAN_CODENAME=$( lsb_release -c -s )
else
echo "Use debian codename from environment ($DEBIAN_CODENAME)"
echo "Use debian codename from environment ($DEBIAN_CODENAME)"
fi
echo "Generate debian changelog using gitdch..."
GITDCH_ARGS=('--verbose')
[[ -n "$QUIET_ARG" ]] && GITDCH_ARGS=('--warning')
[[ $QUIET_MODE -eq 1 ]] && GITDCH_ARGS=('--warning')
if [[ -n "$MAINTAINER_NAME" ]]; then
echo "Use maintainer name from environment ($MAINTAINER_NAME)"
GITDCH_ARGS+=("--maintainer-name" "${MAINTAINER_NAME}")
echo "Use maintainer name from environment ($MAINTAINER_NAME)"
GITDCH_ARGS+=("--maintainer-name" "${MAINTAINER_NAME}")
fi
if [[ -n "$MAINTAINER_EMAIL" ]]; then
echo "Use maintainer email from environment ($MAINTAINER_EMAIL)"
GITDCH_ARGS+=("--maintainer-email" "$MAINTAINER_EMAIL")
echo "Use maintainer email from environment ($MAINTAINER_EMAIL)"
GITDCH_ARGS+=("--maintainer-email" "$MAINTAINER_EMAIL")
fi
gitdch \
--package-name "$PACKAGE_NAME" \
--version "${VERSION}" \
--code-name "$DEBIAN_CODENAME" \
--output "$BDIR"/debian/changelog \
--release-notes dist/release_notes.md \
"${GITDCH_ARGS[@]}"
--package-name "$PACKAGE_NAME" \
--version "${VERSION}" \
--code-name "$DEBIAN_CODENAME" \
--output "$BDIR"/debian/changelog \
--release-notes dist/release_notes.md \
"${GITDCH_ARGS[@]}"
if [[ -n "$MAINTAINER_NAME" ]] && [[ -n "$MAINTAINER_EMAIL" ]]; then
echo "Set Maintainer field in debian control file ($MAINTAINER_NAME <$MAINTAINER_EMAIL>)..."
sed -i "s/^Maintainer: .*$/Maintainer: $MAINTAINER_NAME <$MAINTAINER_EMAIL>/" \
"$BDIR"/debian/control
echo "Set Maintainer field in debian control file ($MAINTAINER_NAME <$MAINTAINER_EMAIL>)..."
sed -i "s/^Maintainer: .*$/Maintainer: $MAINTAINER_NAME <$MAINTAINER_EMAIL>/" \
"$BDIR"/debian/control
fi
echo "Build debian package..."
cd "$BDIR" || exit
[[ $QUIET_MODE -eq 0 ]] && export DH_VERBOSE=1
dpkg-buildpackage

View file

@ -20,7 +20,8 @@
# ~/.pgpass).
#
# Author: Benjamin Renard <brenard@easter-eggs.com>
# Date: Mon, 03 Jun 2024 15:31:29 +0200
# Version: dev
# Date: dev
# Source: https://gitea.zionetrix.net/bn8/check_pg_streaming_replication
# SPDX-License-Identifier: GPL-3.0-or-later
#
@ -39,7 +40,6 @@ RECOVERY_CONF=""
PG_DEFAULT_PORT=""
PG_DEFAULT_APP_NAME=$( hostname )
PG_DB=""
CHECK_CUR_MASTER_LSN=1
REPLAY_WARNING_DELAY=3
REPLAY_CRITICAL_DELAY=5
EXPECTED_SYNC_STATE=sync
@ -48,36 +48,42 @@ EXPECTED_MODE=auto
DEBUG=0
function usage () {
ERROR="$1"
ERROR="$*"
[[ -n "$ERROR" ]] && echo -e "$ERROR\n"
cat << EOF
Usage: $0 [-d] [-h] [options]
-u pg_user Specify local Postgres user (Default: try to auto-detect or use $DEFAULT_PG_USER)
-u pg_user Specify local Postgres user (Default: try to auto-detect or
use $DEFAULT_PG_USER)
-b psql_bin Specify psql binary path (Default: $PSQL_BIN)
-B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: $PG_LSCLUSTER_BIN)
-V pg_version Specify Postgres version (Default: try to auto-detect or use $DEFAULT_PG_VERSION)
-m pg_main Specify Postgres main directory path (Default: try to auto-detect or use
$DEFAULT_PG_MAIN)
-V pg_version Specify Postgres version (Default: try to auto-detect or
use $DEFAULT_PG_VERSION)
-m pg_main Specify Postgres main directory path (Default: try to auto-detect or
use $DEFAULT_PG_MAIN)
-r recovery_conf Specify Postgres recovery configuration file path
(Default: [PG_MAIN]/recovery.conf on PG <= 11, [PG_MAIN]/postgresql.auto.conf on PG >= 12)
-U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf file)
-p pg_port Specify default Postgres master TCP port (Default: same as local PostgreSQL
port if detected or use $DEFAULT_PG_PORT)
-D dbname Specify DB name on Postgres master/slave to connect on (Default: PG_USER, must
match with .pgpass one is used)
-C 1/0 Enable or disable check if the current LSN of the master host is the same
of the last received LSN (Default: $CHECK_CUR_MASTER_LSN)
-w replay_warn_delay Specify the replay warning delay in second (Default: $REPLAY_WARNING_DELAY)
-c replay_crit_delay Specify the replay critical delay in second (Default: $REPLAY_CRITICAL_DELAY)
-e expected_sync_state The expected replication state ('sync' or 'async', default: $EXPECTED_SYNC_STATE)
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto', default: '$EXPECTED_MODE')
(Default: [PG_MAIN]/recovery.conf on PG <= 11,
[PG_MAIN]/postgresql.auto.conf on PG >= 12)
-U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf
file)
-p pg_port Specify default Postgres master TCP port (Default: same as local
PostgreSQL port if detected or use $DEFAULT_PG_PORT)
-D dbname Specify DB name on Postgres master/slave to connect on (Default:
PG_USER, must match with .pgpass one is used)
-w replay_warn_delay Specify the replay warning delay in second
(Default: $REPLAY_WARNING_DELAY)
-c replay_crit_delay Specify the replay critical delay in second
(Default: $REPLAY_CRITICAL_DELAY)
-e expected_sync_state The expected replication state ('sync' or 'async',
default: $EXPECTED_SYNC_STATE)
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto',
default: '$EXPECTED_MODE')
-d Debug mode
-h Show this message
EOF
[[ -n "$ERROR" ]] && exit 1 || exit 0
}
while getopts "hu:b:B:V:m:r:U:p:D:C:w:c:e:E:d" OPTION; do
while getopts "hu:b:B:V:m:r:U:p:D:w:c:e:E:d" OPTION; do
case $OPTION in
u)
PG_USER=$OPTARG
@ -106,9 +112,6 @@ while getopts "hu:b:B:V:m:r:U:p:D:C:w:c:e:E:d" OPTION; do
D)
PG_DB=$OPTARG
;;
C)
CHECK_CUR_MASTER_LSN=$OPTARG
;;
w)
REPLAY_WARNING_DELAY=$OPTARG
;;
@ -117,12 +120,15 @@ while getopts "hu:b:B:V:m:r:U:p:D:C:w:c:e:E:d" OPTION; do
;;
e)
[[ "$OPTARG" != "sync" ]] && [[ "$OPTARG" != "async" ]] && \
usage "Invalid expected replication state '$OPTARG'. Possible values: sync or async."
usage "Invalid expected replication state '$OPTARG'." \
"Possible values: sync or async."
EXPECTED_SYNC_STATE=$OPTARG
;;
E)
[[ "$OPTARG" != "master" ]] && [[ "$OPTARG" != "hot-standby" ]] && [[ "$OPTARG" != "auto" ]] && \
usage "Invalid expected mode '$OPTARG'. Possible values: master, hot-standby or auto."
[[ "$OPTARG" != "master" ]] && [[ "$OPTARG" != "hot-standby" ]] && \
[[ "$OPTARG" != "auto" ]] && \
usage "Invalid expected mode '$OPTARG'. Possible values: master, hot-standby" \
"or auto."
EXPECTED_MODE=$OPTARG
;;
d)
@ -139,7 +145,7 @@ done
function debug() {
if [[ $DEBUG -eq 1 ]]; then
>&2 echo -e "[DEBUG] $1"
>&2 echo -e "[DEBUG] $*"
fi
}
@ -153,7 +159,6 @@ PG_MAIN = $PG_MAIN
RECOVERY_CONF = $RECOVERY_CONF
PG_DEFAULT_PORT = $PG_DEFAULT_PORT
PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME
CHECK_CUR_MASTER_LSN = $CHECK_CUR_MASTER_LSN
REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY
REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY
EXPECTED_SYNC_STATE = $EXPECTED_SYNC_STATE
@ -162,15 +167,19 @@ EXPECTED_MODE = $EXPECTED_MODE
# Auto-detect PostgreSQL information using pg_lsclusters
if [[ -x "$PG_LSCLUSTER_BIN" ]]; then
PG_CLUSTER=$( $PG_LSCLUSTER_BIN -h 2>/dev/null|head -n1 )
PG_CLUSTER=$( $PG_LSCLUSTER_BIN -h 2>/dev/null | head -n1 )
if [[ -n "$PG_CLUSTER" ]]; then
debug "pg_lsclusters output:\n\t$PG_CLUSTER"
# Output example:
# 9.6 main 5432 online,recovery postgres /var/lib/postgresql/9.6/main /var/log/postgresql/postgresql-9.6-main.log
[[ -z "$PG_VERSION" ]] && PG_VERSION=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $1}' )
[[ -z "$PG_DEFAULT_PORT" ]] && PG_DEFAULT_PORT=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $3}' )
[[ -z "$PG_USER" ]] && PG_USER=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $5}' )
[[ -z "$PG_MAIN" ]] && PG_MAIN=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $6}' )
# 9.6 main 5432 online,recovery postgres /var/lib/postgresql/9.6/main \
# /var/log/postgresql/postgresql-9.6-main.log
# 13 main 5432 online,recovery,pacemaker postgres /var/lib/postgresql/13/main \
# /var/log/postgresql/postgresql-13-main.log
[[ -z "$PG_VERSION" ]] && PG_VERSION=$( awk -F ' +' '{print $1}' <<< "$PG_CLUSTER" )
[[ -z "$PG_DEFAULT_PORT" ]] && \
PG_DEFAULT_PORT=$( awk -F ' +' '{print $3}' <<< "$PG_CLUSTER" )
[[ -z "$PG_USER" ]] && PG_USER=$( awk -F ' +' '{print $5}' <<< "$PG_CLUSTER" )
[[ -z "$PG_MAIN" ]] && PG_MAIN=$( awk -F ' +' '{print $6}' <<< "$PG_CLUSTER" )
fi
else
debug "pg_lsclusters not found ($PG_LSCLUSTER_BIN): parameters auto-detection disabled"
@ -194,7 +203,11 @@ id "$PG_USER" > /dev/null 2>&1 || { echo "UNKNOWN: Invalid Postgres user ($PG_US
# Check RECOVERY_CONF
if [[ -z "$RECOVERY_CONF" ]]; then
[[ $PG_VERSION -le 11 ]] && RECOVERY_CONF_FILENAME="recovery.conf" || RECOVERY_CONF_FILENAME="postgresql.auto.conf"
if [[ $PG_VERSION -le 11 ]]; then
RECOVERY_CONF_FILENAME="recovery.conf"
else
RECOVERY_CONF_FILENAME="postgresql.auto.conf"
fi
RECOVERY_CONF="$PG_MAIN/$RECOVERY_CONF_FILENAME"
else
RECOVERY_CONF_FILENAME=$( basename "$RECOVERY_CONF" )
@ -208,15 +221,17 @@ fi
[[ -z "$PG_DB" ]] && PG_DB="$PG_USER"
function psql_get () {
sql="$1"
debug "Exec 'echo \"$sql\"|sudo -u $PG_USER $PSQL_BIN -d \"$PG_DB\" -w -t -P format=unaligned"
local sql="$*"
debug "Exec 'sudo -u $PG_USER $PSQL_BIN -d \"$PG_DB\" -w -t -P format=unaligned <<< \"$sql\""
sudo -u "$PG_USER" "$PSQL_BIN" -d "$PG_DB" -w -t -P format=unaligned <<< "$sql"
}
function psql_master_get () {
sql="$1"
debug "Exec 'echo \"$sql\"|sudo -u $PG_USER $PSQL_BIN -U $M_USER -h $M_HOST -w -p $M_PORT -d $PG_DB -t -P format=unaligned"
sudo -u "$PG_USER" "$PSQL_BIN" -U "$M_USER" -h "$M_HOST" -w -p "$M_PORT" -d "$PG_DB" -t -P format=unaligned <<< "$sql"
local sql="$*"
debug "Exec 'sudo -u $PG_USER $PSQL_BIN -U $M_USER -h $M_HOST -w -p $M_PORT -d $PG_DB -t" \
"-P format=unaligned <<< \"$sql\""
sudo -u "$PG_USER" "$PSQL_BIN" \
-U "$M_USER" -h "$M_HOST" -w -p "$M_PORT" -d "$PG_DB" -t -P format=unaligned <<< "$sql"
}
debug "Running options:
@ -229,7 +244,6 @@ PG_MAIN = $PG_MAIN
RECOVERY_CONF = $RECOVERY_CONF
PG_DEFAULT_PORT = $PG_DEFAULT_PORT
PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME
CHECK_CUR_MASTER_LSN = $CHECK_CUR_MASTER_LSN
REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY
REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY
"
@ -273,7 +287,8 @@ if [[ "$EXPECTED_MODE" == "auto" ]]; then
if [[ $RECOVERY_MODE -eq 1 ]]; then
debug "Postgres is in recovery mode. Hot-standby mode."
EXPECTED_MODE="hot-standby"
elif [[ -f $RECOVERY_CONF ]] && [[ $( grep -cE '^\s*primary_conninfo' "$RECOVERY_CONF" ) -gt 0 ]]; then
elif [[ -f $RECOVERY_CONF ]] && \
[[ $( grep -cE '^\s*primary_conninfo' "$RECOVERY_CONF" ) -gt 0 ]]; then
debug "File $RECOVERY_CONF_FILENAME found and contain primary_conninfo. Hot-standby mode."
EXPECTED_MODE="hot-standby"
else
@ -301,19 +316,24 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
# Get master connection information from primary_conninfo configuration parameter
MASTER_CONN_INFOS=$( psql_get "SHOW primary_conninfo" )
if [[ -z "$MASTER_CONN_INFOS" ]]; then
echo "UNKNOWN: Can't retrieve master connection information from primary_conninfo configuration parameter"
echo "UNKNOWN: Can't retrieve master connection information from primary_conninfo" \
"configuration parameter"
exit 3
fi
debug "Master connection information: $MASTER_CONN_INFOS"
M_HOST=$( grep 'host=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*host= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' )
M_HOST=$(
grep 'host=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*host= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_HOST" ]]; then
echo "UNKNOWN: Can't retrieve master host from primary_conninfo configuration parameter"
exit 3
fi
debug "Master host: $M_HOST"
M_PORT=$( grep 'port=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*port= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' )
M_PORT=$(
grep 'port=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*port= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_PORT" ]]; then
debug "Master port not specified, use default: $PG_DEFAULT_PORT"
M_PORT=$PG_DEFAULT_PORT
@ -325,7 +345,9 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Master user provided by command-line, use it: $PG_MASTER_USER"
M_USER="$PG_MASTER_USER"
else
M_USER=$( grep 'user=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*user= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' )
M_USER=$(
grep 'user=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*user= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_USER" ]]; then
debug "Master user not specified, use default: $PG_USER"
M_USER=$PG_USER
@ -334,7 +356,10 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
fi
fi
M_APP_NAME=$( grep 'application_name=' <<< "$MASTER_CONN_INFOS" | sed "s/^.*application_name=[ \'\"]*\([^ \'\"]\+\)[ \'\"]*.*$/\1/" )
M_APP_NAME=$(
grep 'application_name=' <<< "$MASTER_CONN_INFOS" |
sed "s/^.*application_name=[ \'\"]*\([^ \'\"]\+\)[ \'\"]*.*$/\1/"
)
if [[ -z "$M_APP_NAME" ]]; then
if [[ $PG_VERSION -ge 12 ]]; then
debug "Master application name not specified, use cluster_name if defined"
@ -354,25 +379,45 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Master application name: $M_APP_NAME"
fi
# Check if master is configured for synchronous commit
SYNC_MODE="$(
psql_master_get "SELECT setting from pg_settings WHERE name = 'synchronous_commit';"
)"
debug "Master synchronous_commit=$SYNC_MODE"
if [[ "$SYNC_MODE" == "on" ]] || [[ "$SYNC_MODE" == "remote_apply" ]]; then
debug "Master is configured for synchronous commit"
SYNCHRONOUS_COMMIT=1
else
debug "Master is not configured for synchronous commit"
SYNCHRONOUS_COMMIT=0
fi
# Get current replication state information from master
M_CUR_REPL_STATE_INFO="$( psql_master_get "SELECT state, sync_state, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn FROM pg_stat_replication WHERE application_name='$M_APP_NAME';" )"
M_CUR_REPL_STATE_INFO="$(
psql_master_get \
"SELECT state, sync_state, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn" \
"FROM pg_stat_replication WHERE application_name='$M_APP_NAME';"
)"
if [[ -z "$M_CUR_REPL_STATE_INFO" ]]; then
echo "UNKNOWN: Can't retrieve current replication state information from master server"
exit 3
fi
debug "Master current replication state:\n\tstate|sync_state|sent_lsn|write_lsn\n\t$M_CUR_REPL_STATE_INFO"
debug "Master current replication state:\n" \
"\tstate|sync_state|sent_lsn|write_lsn\n\t$M_CUR_REPL_STATE_INFO"
M_CUR_STATE=$( cut -d'|' -f1 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current state: $M_CUR_STATE"
if [[ "$M_CUR_STATE" != "streaming" ]]; then
echo "CRITICAL: this host is not in streaming state according to master host (current state = '$M_CUR_STATE')"
echo "CRITICAL: this host is not in streaming state according to master host" \
"(current state = '$M_CUR_STATE')"
exit 2
fi
M_CUR_SYNC_STATE=$( cut -d'|' -f2 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current sync state: $M_CUR_SYNC_STATE"
if [[ "$M_CUR_SYNC_STATE" != "$EXPECTED_SYNC_STATE" ]]; then
echo "CRITICAL: unexpected replication state '$M_CUR_SYNC_STATE' (expected state = '$EXPECTED_SYNC_STATE')"
echo "CRITICAL: unexpected replication state '$M_CUR_SYNC_STATE'" \
"(expected state = '$EXPECTED_SYNC_STATE')"
exit 2
fi
@ -380,35 +425,24 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
M_CUR_WRITED_LSN=$( cut -d'|' -f4 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current last sent/writed LSN: '$M_CUR_SENT_LSN' / '$M_CUR_WRITED_LSN'"
# Check current master LSN vs last received LSN
if [[ "$CHECK_CUR_MASTER_LSN" == "1" ]]; then
# Get current LSN from master
M_CUR_LSN="$( psql_master_get "SELECT $pg_current_wal_lsn" )"
if [[ -z "$M_CUR_LSN" ]]; then
echo "UNKNOWN: Can't retrieve current LSN from master server"
exit 3
fi
debug "Master current LSN: $M_CUR_LSN"
# Master current LSN is the last received LSN ?
if [[ "$M_CUR_LSN" != "$LAST_RECEIVED_LSN" ]]; then
echo "CRITICAL: Master current LSN is not the last received LSN"
exit 2
fi
debug "Master current LSN is the last received LSN"
fi
# The last received LSN is the last replayed ?
if [[ "$LAST_RECEIVED_LSN" != "$LAST_REPLAYED_LSN" ]]; then
debug "/!\ The last received LSN is NOT the last replayed LSN ('$M_CUR_LSN' / '$LAST_REPLAYED_LSN')"
REPLAY_DELAY="$( psql_get 'SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp());' )"
debug "/!\ The last received LSN is NOT the last replayed LSN" \
"('$M_CUR_LSN' / '$LAST_REPLAYED_LSN')"
REPLAY_DELAY="$(
psql_get 'SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp());'
)"
debug "Replay delay is $REPLAY_DELAY second(s)"
if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_CRITICAL_DELAY" ) -gt 0 ]]; then
echo "CRITICAL: last received LSN is not the last replayed ('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and replay delay is $REPLAY_DELAY second(s)"
echo "CRITICAL: last received LSN is not the last replayed" \
"('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and" \
"replay delay is $REPLAY_DELAY second(s)"
exit 2
fi
if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_WARNING_DELAY" ) -gt 0 ]]; then
echo "WARNING: last received LSN is not the last replay file ('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and replay delay is $REPLAY_DELAY second(s)"
echo "WARNING: last received LSN is not the last replay file" \
"('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and" \
"replay delay is $REPLAY_DELAY second(s)"
exit 1
fi
debug "Replay delay is not worrying"
@ -416,10 +450,27 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Last received LSN is the last replayed file"
# The master last sent LSN is the last received (and synced) ?
if [[ "$M_CUR_SENT_LSN" != "$LAST_RECEIVED_LSN" ]]; then
echo "WARNING: master last sent LSN is not already received (and synced to disk) by slave. May be we have some network delay or load on slave"
if [[ $SYNCHRONOUS_COMMIT -eq 1 ]] && [[ "$M_CUR_SENT_LSN" != "$LAST_RECEIVED_LSN" ]]; then
LSN_DIFF=$(
psql_master_get "SELECT $pg_wal_lsn_diff('$M_CUR_SENT_LSN', '$LAST_RECEIVED_LSN');"
)
debug "LSN diff ('$M_CUR_SENT_LSN' vs '$LAST_RECEIVED_LSN'): $LSN_DIFF bytes"
echo "WARNING: master last sent LSN is not already received (and synced to disk) by slave" \
"(diff: $LSN_DIFF bytes). May be we have some network delay or load on slave"
echo "Master last sent LSN: $M_CUR_SENT_LSN"
echo "Slave last received (and synced to disk) LSN: $LAST_RECEIVED_LSN"
echo "Diff: $LSN_DIFF bytes"
exit 1
elif [[ $SYNCHRONOUS_COMMIT -eq 0 ]] && [ "$M_CUR_SENT_LSN" != "$M_CUR_WRITED_LSN" ];then
LSN_DIFF=$(
psql_master_get "SELECT pg_wal_lsn_diff('$M_CUR_SENT_LSN', '$M_CUR_WRITED_LSN');"
)
debug "LSN diff ('$M_CUR_SENT_LSN' vs '$M_CUR_WRITED_LSN'): $LSN_DIFF bytes"
echo "WARNING: master last sent LSN is not already received by slave " \
"(diff: $LSN_DIFF bytes). May be we have some network delay or load on slave"
echo "Master last sent LSN: $M_CUR_SENT_LSN"
echo "Slave last received LSN: $M_CUR_WRITED_LSN"
echo "Diff: $LSN_DIFF bytes"
exit 1
fi
@ -444,28 +495,44 @@ elif [[ "$EXPECTED_MODE" == "master" ]]; then
fi
debug "Current LSN: $CURRENT_LSN"
# Check if master is configured for synchronous commit
SYNC_MODE="$( psql_get "SELECT setting from pg_settings WHERE name = 'synchronous_commit';" )"
debug "synchronous_commit=$SYNC_MODE"
if [[ "$SYNC_MODE" == "on" ]] || [[ "$SYNC_MODE" == "remote_apply" ]]; then
debug "Master is configured for synchronous commit"
SYNCHRONOUS_COMMIT=1
else
debug "Master is not configured for synchronous commit"
SYNCHRONOUS_COMMIT=0
fi
# Check standby client
STANDBY_CLIENTS=$( psql_get "SELECT application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag
FROM (
SELECT application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag
FROM (
SELECT application_name, client_addr, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn, state, sync_state,
$pg_wal_lsn_diff($pg_current_wal_lsn, $write_lsn) AS current_lag
FROM pg_stat_replication
) AS s2
) AS s1" )
STANDBY_CLIENTS=$(
psql_get \
"SELECT
application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag
FROM (
SELECT
application_name, client_addr, sent_lsn, write_lsn, state, sync_state,
current_lag
FROM (
SELECT
application_name, client_addr, $sent_lsn AS sent_lsn,
$write_lsn AS write_lsn, state, sync_state,
$pg_wal_lsn_diff($pg_current_wal_lsn, $write_lsn) AS current_lag
FROM pg_stat_replication
) AS s2
) AS s1"
)
if [[ -z "$STANDBY_CLIENTS" ]]; then
echo "WARNING: no stand-by client connected"
exit 1
fi
debug "Stand-by client(s):\n\t${STANDBY_CLIENTS//$'\n'/\\n\\t}"
STANDBY_CLIENTS_TXT=""
STANDBY_CLIENTS_COUNT=0
STANDBY_CLIENTS_ROWS=()
CURRENT_LSN_IS_LAST_SENT=1
for line in $STANDBY_CLIENTS; do
(( STANDBY_CLIENTS_COUNT+=1 ))
NAME=$( cut -d '|' -f 1 <<< "$line" )
IP=$( cut -d '|' -f 2 <<< "$line" )
SENT_LSN=$( cut -d '|' -f 3 <<< "$line" )
@ -473,20 +540,28 @@ elif [[ "$EXPECTED_MODE" == "master" ]]; then
STATE=$( cut -d '|' -f 5 <<< "$line" )
SYNC_STATE=$( cut -d '|' -f 6 <<< "$line" )
LAG=$( cut -d '|' -f 7 <<< "$line" )
STANDBY_CLIENTS_TXT="$STANDBY_CLIENTS_TXT\n$NAME ($IP): $STATE/$SYNC_STATE (LSN: sent='$SENT_LSN' / writed='$WRITED_LSN', Lag: ${LAG}b)"
[[ "$SENT_LSN" != "$CURRENT_LSN" ]] && CURRENT_LSN_IS_LAST_SENT=0
STANDBY_CLIENTS_ROW="$NAME ($IP): $STATE/$SYNC_STATE"
STANDBY_CLIENTS_ROW+=" (LSN: sent='$SENT_LSN' / writed='$WRITED_LSN', Lag: ${LAG}b"
STANDBY_CLIENTS_ROWS+=( "$STANDBY_CLIENTS_ROW" )
if [[ $SYNCHRONOUS_COMMIT -eq 1 ]] && [[ "$SENT_LSN" != "$CURRENT_LSN" ]]; then
CURRENT_LSN_IS_LAST_SENT=0
elif [[ $SYNCHRONOUS_COMMIT -eq 0 ]] && [[ "$SENT_LSN" != "$WRITED_LSN" ]]; then
CURRENT_LSN_IS_LAST_SENT=0
fi
done
if [[ $CURRENT_LSN_IS_LAST_SENT -eq 1 ]]; then
echo "OK: $STANDBY_CLIENTS_COUNT stand-by client(s) connected"
echo "OK: ${#STANDBY_CLIENTS_ROWS[@]} stand-by client(s) connected"
EXIT_CODE=0
else
echo "WARNING: current master LSN is not the last sent to stand-by client(s) connected. May be we have some load ?"
echo "WARNING: current master LSN is not the last sent to stand-by client(s) connected." \
"May be we have some load ?"
EXIT_CODE=1
fi
echo "Current master LSN: $CURRENT_LSN"
echo -e "$STANDBY_CLIENTS_TXT"
IFS=$'\n'
echo "${STANDBY_CLIENTS_ROWS[*]}"
exit $EXIT_CODE
else
echo "UNKNOWN - Invalid mode '$EXPECTED_MODE'"

4
debian/control vendored
View file

@ -2,11 +2,11 @@ Source: check-pg-streaming-replication
Section: admin
Priority: optional
Maintainer: Debian Zionetrix - check-pg-streaming-replication <debian+check-pg-streaming-replication@zionetrix.net>
Build-Depends: debhelper (>> 11.0.0)
Build-Depends: debhelper (>> 11.0.0), findutils, rsync, sed, awk, git, gitdch
Standards-Version: 3.9.6
Package: check-pg-streaming-replication
Architecture: all
Depends: ${misc:Depends}, python3, python3-requests
Depends: ${misc:Depends}, sudo, gawk, sed, bc, postgresql-client
Description: Monitoring plugin to check Postgres Streaming replication state
This Icinga/Nagios check plugin permit to check Postgres Streaming replication state.