Compare commits


No commits in common. "master" and "2024.7.1" have entirely different histories.

6 changed files with 160 additions and 320 deletions


@ -1,15 +0,0 @@
root = true
[*]
indent_style = space
indent_size = 4
trim_trailing_whitespace = true
insert_final_newline = true
charset = utf-8
end_of_line = lf
[*.{yaml,yml}]
indent_size = 2
[Makefile]
indent_style = tab

1
.gitignore vendored

@ -1,3 +1,2 @@
*~ *~
/.env /.env
/dist

104
README.md

@ -4,60 +4,37 @@ This script could be used as Nagios check plugin to verify Postgres Streaming re
This script : This script :
- check if Postgres is running (_CRITICAL_ raise if not) - check if Postgres is running (_CRITICAL_ raise if not)
- check if Postgres is in recovery mode (using `pg_is_in_recovery()`) : - check if Postgres is in recovery mode :
- if the expected mode need to be auto-detected (default, see `-e` parameter): - if Postgres is in recovery mode :
- if Postgres is in recovery mode: `hot-standby` - retrieve from Postgres the last _xlog_ file receive and the _xlog_ file replay
- if recovery file is present and contain `primary_conninfo`: `hot-standby` - check if Postgres recovery configuration file is NOT present (_CRITICAL_ raise if present)
- otherwise: `master` - retrieve master connection information from Postgres recovery configuration file (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify.
- if expected mode is `hot-standby`: - retrieve the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
- check if Postgres is in recovery mode (_CRITICAL_ raise if not) - check if the current state of the host is "streaming" (_CRITICAL_ raise if not)
- retrieve from Postgres the last _xlog_ file received and the _xlog_ file replayed - check if the current sync state of the host is "sync" (or the state specified using `-e` parameter, _CRITICAL_ raise if not)
- retrieve master connection information from Postgres `primary_conninfo` configuration parameter (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify. - if the check of the current XLOG file of the master host is enabled :
- retrieve master sync mode from `synchronous_commit` setting and assume synchronous commit is enabled if `synchronous_commit` is equal to `on` or `remote_apply`) - retrieve current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
- retrieve the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error). - check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raise if not)
- check if the current state of the host is `streaming` (_CRITICAL_ raise if not) - check if the last received _xlog_ file is the last replay _xlog_ file : if not, check the current delay with the last replayed transaction against _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise corresponding error if they are exceeded
- check if the current sync state of the host is the expected one (default: `sync`, see `-e` parameter, _CRITICAL_ raise if not) - Return _OK_ state
- if the check of the current XLOG file of the master host is enabled : - if Postgres is not in recovery mode :
- retrieve current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error). - check if Postgres recovery configuration file is present (_CRITICAL_ raise if present)
- check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raise if not) - check if stand-by client(s) is connected (_WARNING_ raise if not)
- check if the last received _xlog_ file is the last replayed _xlog_ file : if not, check the current delay with the last replayed transaction against _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise corresponding error if they are exceeded - Return _OK_ state with list and count of stand-by client(s)
- if synchronous commit is enabled on master, check the last _xlog_ file sent by Postgres master is the last received by the slave. If not, retrieve difference (in bytes) and raise a _WARNING_.
- if synchronous commit is disabled on master, check the last _xlog_ file sent by Postgres master is the last writed by the slave. If not, retrieve difference (in bytes) and raise a _WARNING_.
- otherwise, return _OK_ state
- if expected mode is `master`:
- check if Postgres is in recovery mode (_CRITICAL_ raise if it is)
- retrieve current _xlog_ file (_UNKNOWN_ raise on error)
- retrieve sync mode from `synchronous_commit` setting and assume synchronous commit is enabled if `synchronous_commit` is equal to `on` or `remote_apply`)
- list stand-by client(s) from master and check for each to them:
- if synchronous commit is enabled, check the last _xlog_ file sent by Postgres master is the current one on master (_WARNING_ raise if not)
- if synchronous commit is disabled, check the last _xlog_ file sent by Postgres master is the laster writed one on master (_WARNING_ raise if not)
- otherwise, return _OK_ state with list and count of stand-by client(s)
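The auto-detection branch described in the list above (recovery mode first, then a recovery file containing `primary_conninfo`, otherwise `master`) can be sketched as below. This is an illustration under stated assumptions, not the script's own code: the function name `detect_expected_mode` and its two arguments are hypothetical, and the recovery flag is assumed to have already been reduced to `1` or `0`.

```bash
#!/bin/bash
# Hypothetical helper: decide which mode to check when the expected mode is 'auto'.
#   $1: 1 if Postgres reported being in recovery, 0 otherwise (assumed pre-computed)
#   $2: path of the recovery configuration file
detect_expected_mode() {
    local in_recovery="$1" recovery_conf="$2"
    if [[ "$in_recovery" -eq 1 ]]; then
        # Postgres is in recovery mode
        echo "hot-standby"
    elif [[ -f "$recovery_conf" ]] && grep -qE '^\s*primary_conninfo' "$recovery_conf"; then
        # Recovery file is present and contains primary_conninfo
        echo "hot-standby"
    else
        echo "master"
    fi
}
```

The order of the tests matters: a running standby is recognised from the server itself, and the configuration file is only consulted when the server does not report recovery.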
**Note :** This script was originally write for PostgreSQL 9.1 and test on 9.1, 9.5, 9.6, 11, 13 and 15. Do not hesitate to tell me how this script work with other versions and share some fix. All contributions are welcome ! **Note :** This script was originally write for PostgreSQL 9.1 and test on 9.1, 9.5, 9.6, 11, 13 and 15. Do not hesitate to tell me how this script work with other versions and share some fix. All contributions are welcome !
## Requirements ## Requirements
- Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster` - Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster`
- **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details). - **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details).
- **On standby node:** `PG_USER` must be able to connect locally on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`). - **On standby node:** `PG_USER` must be able to connect locally on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`).
## Installation ## Installation
### From debian packages
```
echo "deb http://debian.zionetrix.net stable main" | sudo tee /etc/apt/sources.list.d/zionetrix.list
sudo apt -o Acquire::AllowInsecureRepositories=true -o Acquire::AllowDowngradeToInsecureRepositories=true update
sudo apt -o APT::Get::AllowUnauthenticated=true install --yes zionetrix-archive-keyring
sudo apt update
sudo apt install check-pg-streaming-replication
```
### From sources
``` ```
apt install sudo awk sed bc postgresql-client apt install sudo awk sed bc postgresql-client
git clone https://gitea.zionetrix.net/bn8/check_pg_streaming_replication.git \ git clone https://gitea.zionetrix.net/bn8/check_pg_streaming_replication.git \
@ -71,34 +48,27 @@ ln -s /usr/local/src/check_pg_streaming_replication/check_pg_streaming_replicati
``` ```
Usage: ./check_pg_streaming_replication [-d] [-h] [options] Usage: ./check_pg_streaming_replication [-d] [-h] [options]
-u pg_user Specify local Postgres user (Default: try to auto-detect or -u pg_user Specify local Postgres user (Default: try to auto-detect or use postgres)
use postgres)
-b psql_bin Specify psql binary path (Default: /usr/bin/psql) -b psql_bin Specify psql binary path (Default: /usr/bin/psql)
-B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: /usr/bin/pg_lsclusters) -B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: /usr/bin/pg_lsclusters)
-V pg_version Specify Postgres version (Default: try to auto-detect or -V pg_version Specify Postgres version (Default: try to auto-detect or use 9.1)
use 9.1) -m pg_main Specify Postgres main directory path (Default: try to auto-detect or use
-m pg_main Specify Postgres main directory path (Default: try to auto-detect or /var/lib/postgresql//main)
use /var/lib/postgresql//main)
-r recovery_conf Specify Postgres recovery configuration file path -r recovery_conf Specify Postgres recovery configuration file path
(Default: [PG_MAIN]/recovery.conf on PG <= 11, ( Default: [PG_MAIN]/recovery.conf on PG <= 11, [PG_MAIN]/postgresql.auto.conf on PG >= 12)
[PG_MAIN]/postgresql.auto.conf on PG >= 12) -U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf file)
-U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf -p pg_port Specify default Postgres master TCP port (Default: same as local PostgreSQL
file) port if detected or use 5432)
-p pg_port Specify default Postgres master TCP port (Default: same as local -D dbname Specify DB name on Postgres master/slave to connect on (Default: PG_USER, must
PostgreSQL port if detected or use 5432) match with .pgpass one is used)
-D dbname Specify DB name on Postgres master/slave to connect on (Default: -C 1/0 Enable or disable check if the current LSN of the master host is the same
PG_USER, must match with .pgpass one is used) of the last received LSN (Default: 1)
-w replay_warn_delay Specify the replay warning delay in second -w replay_warn_delay Specify the replay warning delay in second (Default: 3)
(Default: 3) -c replay_crit_delay Specify the replay critical delay in second (Default: 5)
-c replay_crit_delay Specify the replay critical delay in second -e expected_sync_state The expected replication state ('sync' or 'async', default: sync)
(Default: 5) -E expected_mode The expected mode ('master', 'hot-standby' or 'auto', default: 'auto')
-e expected_sync_state The expected replication state ('sync' or 'async',
default: sync)
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto',
default: 'auto')
-d Debug mode -d Debug mode
-h Show this message -h Show this message
``` ```
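As a Nagios check plugin, the script reports its result through its exit status: the code in this diff exits with 2 after a `CRITICAL` message and 3 after an `UNKNOWN` one, and the usual Nagios convention adds 0 for `OK` and 1 for `WARNING`. A small wrapper can turn that status into a label. The function name `nagios_state_name` is hypothetical, and the option values in the commented invocation are only illustrative.

```bash
#!/bin/bash
# Hypothetical helper: map a Nagios plugin exit status to its state name.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3, and any unexpected status
    esac
}

# Example use (options come from the usage text; values are illustrative):
#   ./check_pg_streaming_replication -E hot-standby -e async -w 10 -c 30
#   echo "State: $(nagios_state_name $?)"
```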
## Copyright ## Copyright


@ -1,102 +1,63 @@
#!/bin/bash #!/bin/bash
# vim: tabstop=4 shiftwidth=4 softtabstop=4 expandtab
set -e QUIET_ARG=""
[[ "$1" == "--quiet" ]] && QUIET_ARG="--quiet"
QUIET_MODE=0
[[ "$1" == "--quiet" ]] && QUIET_MODE=1
# Enter source directory # Enter source directory
cd "$( dirname "$0" )" || exit cd "$( dirname "$0" )" || exit
if [[ -d dist ]]; then CHECK_FILE="$( find "." -name 'check_*' ! -name '*~' -type f -executable | head -n 1 )"
echo "Clean previous build..."
rm -fr dist
fi
CHECK_FILE="$( find "." -maxdepth 1 -name 'check_*' ! -name '*~' -type f -executable | head -n 1 )"
PACKAGE_NAME="$( basename "$CHECK_FILE" | tr '_' '-' )" PACKAGE_NAME="$( basename "$CHECK_FILE" | tr '_' '-' )"
echo "Clean previous build..."
rm -fr dist
echo "Detect version using git describe..." echo "Detect version using git describe..."
VERSION="$( git describe --tags|sed 's/^[^0-9]*//' )" VERSION="$( git describe --tags|sed 's/^[^0-9]*//' )"
echo "Create building environemt..." echo "Create building environemt..."
BDIR="dist/$PACKAGE_NAME-$VERSION" BDIR="dist/$PACKAGE_NAME-$VERSION"
mkdir -p "$BDIR" mkdir -p "$BDIR"
RSYNC_ARGS=( ) RSYNC_ARG=""
[[ $QUIET_MODE -eq 0 ]] && RSYNC_ARGS+=( -v ) [[ -z "$QUIET_ARG" ]] && RSYNC_ARG="-v"
rsync -a "${RSYNC_ARGS[@]}" debian/ "$BDIR/debian/" rsync -a "$RSYNC_ARG" debian/ "$BDIR/debian/"
cp "$CHECK_FILE" "$BDIR/" cp "$CHECK_FILE" "$BDIR/"
if [[ -e "README.md" ]]; then echo "Set VERSION=$VERSION in gitdch using sed..."
echo "Build manpage from README.md file..." sed -i "s/^VERSION *=.*$/VERSION = '$VERSION'/" "$BDIR/$( basename "$CHECK_FILE" )"
MAN_TITLE=$( basename "$CHECK_FILE" )
MAN_MD_FILE="$BDIR/$( basename "$CHECK_FILE" ).1.md"
MAN_FILE="$BDIR/$( basename "$CHECK_FILE" ).1"
echo "# ${MAN_TITLE^^} 1" \
"$( git log --follow --format=%ad --date iso README.md | head -n 1 | awk '{print $1}')" \
"$PACKAGE_NAME" \
'"User Manuals"' > "$MAN_MD_FILE"
sed 1d README.md >> "$MAN_MD_FILE"
if ! which go-md2man > /dev/null; then
ARG_ARGS=()
[[ $QUIET_MODE -eq 0 ]] && ARG_ARGS+=( -qq )
apt update "${ARG_ARGS[@]}"
apt install -y "${ARG_ARGS[@]}" go-md2man
fi
go-md2man -in "$MAN_MD_FILE" -out "$MAN_FILE"
basename "$MAN_FILE" >> "$BDIR/debian/manpages"
grep -Eq '^Build-Depends: .*go-md2man' "$BDIR/debian/control" || \
sed -i 's/^Build-Depends: \(.*\)$/Build-Depends: \1, go-md2man/' "$BDIR/debian/control"
fi
if grep -Eq '^VERSION *=' "$BDIR/$( basename "$CHECK_FILE" )"; then
echo "Set VERSION=$VERSION in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^VERSION *=.*$/VERSION = '$VERSION'/" "$BDIR/$( basename "$CHECK_FILE" )"
elif grep -Eq '^# Version:' "$BDIR/$( basename "$CHECK_FILE" )"; then
echo "Set Version = $VERSION in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^#\(\s*Version:\s*\).*$/#\1$VERSION/" "$BDIR/$( basename "$CHECK_FILE" )"
fi
if grep -Eq '^# Date:' "$BDIR/$( basename "$CHECK_FILE" )"; then
DATE="$( git log --follow --format=%ad --date iso "$CHECK_FILE" | head -n 1 )"
echo "Set Date = $DATE in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^#\(\s*Date:\s*\).*$/#\1$DATE/" "$BDIR/$( basename "$CHECK_FILE" )"
fi
if [[ -z "$DEBIAN_CODENAME" ]]; then if [[ -z "$DEBIAN_CODENAME" ]]; then
echo "Retrieve debian codename using lsb_release..." echo "Retrieve debian codename using lsb_release..."
DEBIAN_CODENAME=$( lsb_release -c -s ) DEBIAN_CODENAME=$( lsb_release -c -s )
else else
echo "Use debian codename from environment ($DEBIAN_CODENAME)" echo "Use debian codename from environment ($DEBIAN_CODENAME)"
fi fi
echo "Generate debian changelog using gitdch..." echo "Generate debian changelog using gitdch..."
GITDCH_ARGS=('--verbose') GITDCH_ARGS=('--verbose')
[[ $QUIET_MODE -eq 1 ]] && GITDCH_ARGS=('--warning') [[ -n "$QUIET_ARG" ]] && GITDCH_ARGS=('--warning')
if [[ -n "$MAINTAINER_NAME" ]]; then if [[ -n "$MAINTAINER_NAME" ]]; then
echo "Use maintainer name from environment ($MAINTAINER_NAME)" echo "Use maintainer name from environment ($MAINTAINER_NAME)"
GITDCH_ARGS+=("--maintainer-name" "${MAINTAINER_NAME}") GITDCH_ARGS+=("--maintainer-name" "${MAINTAINER_NAME}")
fi fi
if [[ -n "$MAINTAINER_EMAIL" ]]; then if [[ -n "$MAINTAINER_EMAIL" ]]; then
echo "Use maintainer email from environment ($MAINTAINER_EMAIL)" echo "Use maintainer email from environment ($MAINTAINER_EMAIL)"
GITDCH_ARGS+=("--maintainer-email" "$MAINTAINER_EMAIL") GITDCH_ARGS+=("--maintainer-email" "$MAINTAINER_EMAIL")
fi fi
gitdch \ gitdch \
--package-name "$PACKAGE_NAME" \ --package-name "$PACKAGE_NAME" \
--version "${VERSION}" \ --version "${VERSION}" \
--code-name "$DEBIAN_CODENAME" \ --code-name "$DEBIAN_CODENAME" \
--output "$BDIR"/debian/changelog \ --output "$BDIR"/debian/changelog \
--release-notes dist/release_notes.md \ --release-notes dist/release_notes.md \
"${GITDCH_ARGS[@]}" "${GITDCH_ARGS[@]}"
if [[ -n "$MAINTAINER_NAME" ]] && [[ -n "$MAINTAINER_EMAIL" ]]; then if [[ -n "$MAINTAINER_NAME" ]] && [[ -n "$MAINTAINER_EMAIL" ]]; then
echo "Set Maintainer field in debian control file ($MAINTAINER_NAME <$MAINTAINER_EMAIL>)..." echo "Set Maintainer field in debian control file ($MAINTAINER_NAME <$MAINTAINER_EMAIL>)..."
sed -i "s/^Maintainer: .*$/Maintainer: $MAINTAINER_NAME <$MAINTAINER_EMAIL>/" \ sed -i "s/^Maintainer: .*$/Maintainer: $MAINTAINER_NAME <$MAINTAINER_EMAIL>/" \
"$BDIR"/debian/control "$BDIR"/debian/control
fi fi
echo "Build debian package..." echo "Build debian package..."
cd "$BDIR" || exit cd "$BDIR" || exit
[[ $QUIET_MODE -eq 0 ]] && export DH_VERBOSE=1
dpkg-buildpackage dpkg-buildpackage
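
One difference between the two versions of this build script is how the optional rsync flag is passed: one builds a bash array (`RSYNC_ARGS`), the other a quoted string (`RSYNC_ARG`). In bash, a quoted empty string is still passed as one (empty) argument, while an empty array expands to nothing. The sketch below shows the difference with a hypothetical `count_args` helper standing in for rsync.

```bash
#!/bin/bash
# Hypothetical stand-in for rsync: print how many arguments were received.
count_args() { echo "$#"; }

# String variant: in quiet mode the variable is empty, yet "$RSYNC_ARG"
# still produces one empty argument.
RSYNC_ARG=""
count_args -a "$RSYNC_ARG" debian/ dist/debian/        # prints 4

# Array variant: an empty array expands to zero arguments.
RSYNC_ARGS=( )
count_args -a "${RSYNC_ARGS[@]}" debian/ dist/debian/  # prints 3
```

With the string variant, the extra empty argument reaches the command as if it were a path, which is why the array form is usually the safer choice for optional flags.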


@ -20,8 +20,7 @@
# ~/.pgpass). # ~/.pgpass).
# #
# Author: Benjamin Renard <brenard@easter-eggs.com> # Author: Benjamin Renard <brenard@easter-eggs.com>
# Version: dev # Date: Mon, 03 Jun 2024 15:31:29 +0200
# Date: dev
# Source: https://gitea.zionetrix.net/bn8/check_pg_streaming_replication # Source: https://gitea.zionetrix.net/bn8/check_pg_streaming_replication
# SPDX-License-Identifier: GPL-3.0-or-later # SPDX-License-Identifier: GPL-3.0-or-later
# #
@ -40,6 +39,7 @@ RECOVERY_CONF=""
PG_DEFAULT_PORT="" PG_DEFAULT_PORT=""
PG_DEFAULT_APP_NAME=$( hostname ) PG_DEFAULT_APP_NAME=$( hostname )
PG_DB="" PG_DB=""
CHECK_CUR_MASTER_LSN=1
REPLAY_WARNING_DELAY=3 REPLAY_WARNING_DELAY=3
REPLAY_CRITICAL_DELAY=5 REPLAY_CRITICAL_DELAY=5
EXPECTED_SYNC_STATE=sync EXPECTED_SYNC_STATE=sync
@ -48,42 +48,36 @@ EXPECTED_MODE=auto
DEBUG=0 DEBUG=0
function usage () { function usage () {
ERROR="$*" ERROR="$1"
[[ -n "$ERROR" ]] && echo -e "$ERROR\n" [[ -n "$ERROR" ]] && echo -e "$ERROR\n"
cat << EOF cat << EOF
Usage: $0 [-d] [-h] [options] Usage: $0 [-d] [-h] [options]
-u pg_user Specify local Postgres user (Default: try to auto-detect or -u pg_user Specify local Postgres user (Default: try to auto-detect or use $DEFAULT_PG_USER)
use $DEFAULT_PG_USER)
-b psql_bin Specify psql binary path (Default: $PSQL_BIN) -b psql_bin Specify psql binary path (Default: $PSQL_BIN)
-B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: $PG_LSCLUSTER_BIN) -B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: $PG_LSCLUSTER_BIN)
-V pg_version Specify Postgres version (Default: try to auto-detect or -V pg_version Specify Postgres version (Default: try to auto-detect or use $DEFAULT_PG_VERSION)
use $DEFAULT_PG_VERSION) -m pg_main Specify Postgres main directory path (Default: try to auto-detect or use
-m pg_main Specify Postgres main directory path (Default: try to auto-detect or $DEFAULT_PG_MAIN)
use $DEFAULT_PG_MAIN)
-r recovery_conf Specify Postgres recovery configuration file path -r recovery_conf Specify Postgres recovery configuration file path
(Default: [PG_MAIN]/recovery.conf on PG <= 11, (Default: [PG_MAIN]/recovery.conf on PG <= 11, [PG_MAIN]/postgresql.auto.conf on PG >= 12)
[PG_MAIN]/postgresql.auto.conf on PG >= 12) -U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf file)
-U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf -p pg_port Specify default Postgres master TCP port (Default: same as local PostgreSQL
file) port if detected or use $DEFAULT_PG_PORT)
-p pg_port Specify default Postgres master TCP port (Default: same as local -D dbname Specify DB name on Postgres master/slave to connect on (Default: PG_USER, must
PostgreSQL port if detected or use $DEFAULT_PG_PORT) match with .pgpass one is used)
-D dbname Specify DB name on Postgres master/slave to connect on (Default: -C 1/0 Enable or disable check if the current LSN of the master host is the same
PG_USER, must match with .pgpass one is used) of the last received LSN (Default: $CHECK_CUR_MASTER_LSN)
-w replay_warn_delay Specify the replay warning delay in second -w replay_warn_delay Specify the replay warning delay in second (Default: $REPLAY_WARNING_DELAY)
(Default: $REPLAY_WARNING_DELAY) -c replay_crit_delay Specify the replay critical delay in second (Default: $REPLAY_CRITICAL_DELAY)
-c replay_crit_delay Specify the replay critical delay in second -e expected_sync_state The expected replication state ('sync' or 'async', default: $EXPECTED_SYNC_STATE)
(Default: $REPLAY_CRITICAL_DELAY) -E expected_mode The expected mode ('master', 'hot-standby' or 'auto', default: '$EXPECTED_MODE')
-e expected_sync_state The expected replication state ('sync' or 'async',
default: $EXPECTED_SYNC_STATE)
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto',
default: '$EXPECTED_MODE')
-d Debug mode -d Debug mode
-h Show this message -h Show this message
EOF EOF
[[ -n "$ERROR" ]] && exit 1 || exit 0 [[ -n "$ERROR" ]] && exit 1 || exit 0
} }
while getopts "hu:b:B:V:m:r:U:p:D:w:c:e:E:d" OPTION; do while getopts "hu:b:B:V:m:r:U:p:D:C:w:c:e:E:d" OPTION; do
case $OPTION in case $OPTION in
u) u)
PG_USER=$OPTARG PG_USER=$OPTARG
@ -112,6 +106,9 @@ while getopts "hu:b:B:V:m:r:U:p:D:w:c:e:E:d" OPTION; do
D) D)
PG_DB=$OPTARG PG_DB=$OPTARG
;; ;;
C)
CHECK_CUR_MASTER_LSN=$OPTARG
;;
w) w)
REPLAY_WARNING_DELAY=$OPTARG REPLAY_WARNING_DELAY=$OPTARG
;; ;;
@ -120,15 +117,12 @@ while getopts "hu:b:B:V:m:r:U:p:D:w:c:e:E:d" OPTION; do
;; ;;
e) e)
[[ "$OPTARG" != "sync" ]] && [[ "$OPTARG" != "async" ]] && \ [[ "$OPTARG" != "sync" ]] && [[ "$OPTARG" != "async" ]] && \
usage "Invalid expected replication state '$OPTARG'." \ usage "Invalid expected replication state '$OPTARG'. Possible values: sync or async."
"Possible values: sync or async."
EXPECTED_SYNC_STATE=$OPTARG EXPECTED_SYNC_STATE=$OPTARG
;; ;;
E) E)
[[ "$OPTARG" != "master" ]] && [[ "$OPTARG" != "hot-standby" ]] && \ [[ "$OPTARG" != "master" ]] && [[ "$OPTARG" != "hot-standby" ]] && [[ "$OPTARG" != "auto" ]] && \
[[ "$OPTARG" != "auto" ]] && \ usage "Invalid expected mode '$OPTARG'. Possible values: master, hot-standby or auto."
usage "Invalid expected mode '$OPTARG'. Possible values: master, hot-standby" \
"or auto."
EXPECTED_MODE=$OPTARG EXPECTED_MODE=$OPTARG
;; ;;
d) d)
@ -145,7 +139,7 @@ done
function debug() { function debug() {
if [[ $DEBUG -eq 1 ]]; then if [[ $DEBUG -eq 1 ]]; then
>&2 echo -e "[DEBUG] $*" >&2 echo -e "[DEBUG] $1"
fi fi
} }
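
The `usage` and `debug` helpers differ between the two versions only in `$*` versus `$1`. With `$1`, anything after the first argument is silently dropped; with `"$*"`, all arguments are joined with a space, which is what lets callers split a long message over several quoted strings. A minimal sketch, with hypothetical function names and plain `echo` to stdout instead of the script's `>&2 echo -e`:

```bash
#!/bin/bash
# Hypothetical variants of a debug helper.
debug_first() { echo "[DEBUG] $1"; }   # keeps only the first argument
debug_all()   { echo "[DEBUG] $*"; }   # joins all arguments with a space

debug_first "Master current state:" "streaming"   # prints: [DEBUG] Master current state:
debug_all   "Master current state:" "streaming"   # prints: [DEBUG] Master current state: streaming
```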
@ -159,6 +153,7 @@ PG_MAIN = $PG_MAIN
RECOVERY_CONF = $RECOVERY_CONF RECOVERY_CONF = $RECOVERY_CONF
PG_DEFAULT_PORT = $PG_DEFAULT_PORT PG_DEFAULT_PORT = $PG_DEFAULT_PORT
PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME
CHECK_CUR_MASTER_LSN = $CHECK_CUR_MASTER_LSN
REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY
REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY
EXPECTED_SYNC_STATE = $EXPECTED_SYNC_STATE EXPECTED_SYNC_STATE = $EXPECTED_SYNC_STATE
@ -167,19 +162,15 @@ EXPECTED_MODE = $EXPECTED_MODE
# Auto-detect PostgreSQL information using pg_lsclusters # Auto-detect PostgreSQL information using pg_lsclusters
if [[ -x "$PG_LSCLUSTER_BIN" ]]; then if [[ -x "$PG_LSCLUSTER_BIN" ]]; then
PG_CLUSTER=$( $PG_LSCLUSTER_BIN -h 2>/dev/null | head -n1 ) PG_CLUSTER=$( $PG_LSCLUSTER_BIN -h 2>/dev/null|head -n1 )
if [[ -n "$PG_CLUSTER" ]]; then if [[ -n "$PG_CLUSTER" ]]; then
debug "pg_lsclusters output:\n\t$PG_CLUSTER" debug "pg_lsclusters output:\n\t$PG_CLUSTER"
# Output example: # Output example:
# 9.6 main 5432 online,recovery postgres /var/lib/postgresql/9.6/main \ # 9.6 main 5432 online,recovery postgres /var/lib/postgresql/9.6/main /var/log/postgresql/postgresql-9.6-main.log
# /var/log/postgresql/postgresql-9.6-main.log [[ -z "$PG_VERSION" ]] && PG_VERSION=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $1}' )
# 13 main 5432 online,recovery,pacemaker postgres /var/lib/postgresql/13/main \ [[ -z "$PG_DEFAULT_PORT" ]] && PG_DEFAULT_PORT=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $3}' )
# /var/log/postgresql/postgresql-13-main.log [[ -z "$PG_USER" ]] && PG_USER=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $5}' )
[[ -z "$PG_VERSION" ]] && PG_VERSION=$( awk -F ' +' '{print $1}' <<< "$PG_CLUSTER" ) [[ -z "$PG_MAIN" ]] && PG_MAIN=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $6}' )
[[ -z "$PG_DEFAULT_PORT" ]] && \
PG_DEFAULT_PORT=$( awk -F ' +' '{print $3}' <<< "$PG_CLUSTER" )
[[ -z "$PG_USER" ]] && PG_USER=$( awk -F ' +' '{print $5}' <<< "$PG_CLUSTER" )
[[ -z "$PG_MAIN" ]] && PG_MAIN=$( awk -F ' +' '{print $6}' <<< "$PG_CLUSTER" )
fi fi
else else
debug "pg_lsclusters not found ($PG_LSCLUSTER_BIN): parameters auto-detection disabled" debug "pg_lsclusters not found ($PG_LSCLUSTER_BIN): parameters auto-detection disabled"
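
The field extraction in this hunk can be tried in isolation on the sample `pg_lsclusters` line quoted in the comment. The sketch below uses the here-string form of the `awk` calls; the wrapper function `parse_cluster_line` is hypothetical and, unlike the script, always overwrites the variables.

```bash
#!/bin/bash
# Hypothetical wrapper: split one pg_lsclusters line into the fields the check uses.
parse_cluster_line() {
    local line="$1"
    PG_VERSION=$( awk -F ' +' '{print $1}' <<< "$line" )
    PG_DEFAULT_PORT=$( awk -F ' +' '{print $3}' <<< "$line" )
    PG_USER=$( awk -F ' +' '{print $5}' <<< "$line" )
    PG_MAIN=$( awk -F ' +' '{print $6}' <<< "$line" )
}

# Sample line taken from the comment in the script
parse_cluster_line '9.6 main 5432 online,recovery postgres /var/lib/postgresql/9.6/main /var/log/postgresql/postgresql-9.6-main.log'
echo "$PG_VERSION $PG_DEFAULT_PORT $PG_USER $PG_MAIN"
```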
@ -203,11 +194,7 @@ id "$PG_USER" > /dev/null 2>&1 || { echo "UNKNOWN: Invalid Postgres user ($PG_US
# Check RECOVERY_CONF # Check RECOVERY_CONF
if [[ -z "$RECOVERY_CONF" ]]; then if [[ -z "$RECOVERY_CONF" ]]; then
if [[ $PG_VERSION -le 11 ]]; then [[ $PG_VERSION -le 11 ]] && RECOVERY_CONF_FILENAME="recovery.conf" || RECOVERY_CONF_FILENAME="postgresql.auto.conf"
RECOVERY_CONF_FILENAME="recovery.conf"
else
RECOVERY_CONF_FILENAME="postgresql.auto.conf"
fi
RECOVERY_CONF="$PG_MAIN/$RECOVERY_CONF_FILENAME" RECOVERY_CONF="$PG_MAIN/$RECOVERY_CONF_FILENAME"
else else
RECOVERY_CONF_FILENAME=$( basename "$RECOVERY_CONF" ) RECOVERY_CONF_FILENAME=$( basename "$RECOVERY_CONF" )
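
The two forms in this hunk (`if`/`else` versus `A && B || C`) behave the same here because the assignment after `&&` cannot fail; in general the `if`/`else` form is the safer one, since `C` also runs whenever `B` returns non-zero. The choice of file name can be sketched as a standalone function. The name `recovery_conf_filename` is hypothetical, and reducing the version to its major part before comparing is an assumption of this sketch, not something shown in the diff.

```bash
#!/bin/bash
# Hypothetical helper: pick the recovery configuration file name for a version.
recovery_conf_filename() {
    # Assumption: compare on the major part only, e.g. 9.6 -> 9
    local major="${1%%.*}"
    if [[ "$major" -le 11 ]]; then
        echo "recovery.conf"
    else
        echo "postgresql.auto.conf"
    fi
}
```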
@ -221,17 +208,15 @@ fi
[[ -z "$PG_DB" ]] && PG_DB="$PG_USER" [[ -z "$PG_DB" ]] && PG_DB="$PG_USER"
function psql_get () { function psql_get () {
local sql="$*" sql="$1"
debug "Exec 'sudo -u $PG_USER $PSQL_BIN -d \"$PG_DB\" -w -t -P format=unaligned <<< \"$sql\"" debug "Exec 'echo \"$sql\"|sudo -u $PG_USER $PSQL_BIN -d \"$PG_DB\" -w -t -P format=unaligned"
sudo -u "$PG_USER" "$PSQL_BIN" -d "$PG_DB" -w -t -P format=unaligned <<< "$sql" sudo -u "$PG_USER" "$PSQL_BIN" -d "$PG_DB" -w -t -P format=unaligned <<< "$sql"
} }
function psql_master_get () { function psql_master_get () {
local sql="$*" sql="$1"
debug "Exec 'sudo -u $PG_USER $PSQL_BIN -U $M_USER -h $M_HOST -w -p $M_PORT -d $PG_DB -t" \ debug "Exec 'echo \"$sql\"|sudo -u $PG_USER $PSQL_BIN -U $M_USER -h $M_HOST -w -p $M_PORT -d $PG_DB -t -P format=unaligned"
"-P format=unaligned <<< \"$sql\"" sudo -u "$PG_USER" "$PSQL_BIN" -U "$M_USER" -h "$M_HOST" -w -p "$M_PORT" -d "$PG_DB" -t -P format=unaligned <<< "$sql"
sudo -u "$PG_USER" "$PSQL_BIN" \
-U "$M_USER" -h "$M_HOST" -w -p "$M_PORT" -d "$PG_DB" -t -P format=unaligned <<< "$sql"
} }
debug "Running options: debug "Running options:
@ -244,6 +229,7 @@ PG_MAIN = $PG_MAIN
RECOVERY_CONF = $RECOVERY_CONF RECOVERY_CONF = $RECOVERY_CONF
PG_DEFAULT_PORT = $PG_DEFAULT_PORT PG_DEFAULT_PORT = $PG_DEFAULT_PORT
PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME
CHECK_CUR_MASTER_LSN = $CHECK_CUR_MASTER_LSN
REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY
REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY
" "
@ -287,8 +273,7 @@ if [[ "$EXPECTED_MODE" == "auto" ]]; then
if [[ $RECOVERY_MODE -eq 1 ]]; then if [[ $RECOVERY_MODE -eq 1 ]]; then
debug "Postgres is in recovery mode. Hot-standby mode." debug "Postgres is in recovery mode. Hot-standby mode."
EXPECTED_MODE="hot-standby" EXPECTED_MODE="hot-standby"
elif [[ -f $RECOVERY_CONF ]] && \ elif [[ -f $RECOVERY_CONF ]] && [[ $( grep -cE '^\s*primary_conninfo' "$RECOVERY_CONF" ) -gt 0 ]]; then
[[ $( grep -cE '^\s*primary_conninfo' "$RECOVERY_CONF" ) -gt 0 ]]; then
debug "File $RECOVERY_CONF_FILENAME found and contain primary_conninfo. Hot-standby mode." debug "File $RECOVERY_CONF_FILENAME found and contain primary_conninfo. Hot-standby mode."
EXPECTED_MODE="hot-standby" EXPECTED_MODE="hot-standby"
else else
@ -316,24 +301,19 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
# Get master connection information from primary_conninfo configuration parameter # Get master connection information from primary_conninfo configuration parameter
MASTER_CONN_INFOS=$( psql_get "SHOW primary_conninfo" ) MASTER_CONN_INFOS=$( psql_get "SHOW primary_conninfo" )
if [[ -z "$MASTER_CONN_INFOS" ]]; then if [[ -z "$MASTER_CONN_INFOS" ]]; then
echo "UNKNOWN: Can't retrieve master connection information from primary_conninfo" \ echo "UNKNOWN: Can't retrieve master connection information from primary_conninfo configuration parameter"
"configuration parameter"
exit 3 exit 3
fi fi
debug "Master connection information: $MASTER_CONN_INFOS" debug "Master connection information: $MASTER_CONN_INFOS"
M_HOST=$( M_HOST=$( grep 'host=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*host= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' )
grep 'host=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*host= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_HOST" ]]; then if [[ -z "$M_HOST" ]]; then
echo "UNKNOWN: Can't retrieve master host from primary_conninfo configuration parameter" echo "UNKNOWN: Can't retrieve master host from primary_conninfo configuration parameter"
exit 3 exit 3
fi fi
debug "Master host: $M_HOST" debug "Master host: $M_HOST"
M_PORT=$( M_PORT=$( grep 'port=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*port= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' )
grep 'port=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*port= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_PORT" ]]; then if [[ -z "$M_PORT" ]]; then
debug "Master port not specified, use default: $PG_DEFAULT_PORT" debug "Master port not specified, use default: $PG_DEFAULT_PORT"
M_PORT=$PG_DEFAULT_PORT M_PORT=$PG_DEFAULT_PORT
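
The `grep` and `sed` pair used above for `host=`, `port=` and `user=` follows one pattern, which can be factored into a single function for experimentation. This is a sketch only: the function name `conninfo_field` and the sample connection string are hypothetical, and the `\+` in the expression relies on GNU sed, as the script's own expressions do.

```bash
#!/bin/bash
# Hypothetical helper: extract one key from a primary_conninfo string.
#   $1: key name (host, port, user)
#   $2: connection string
conninfo_field() {
    local key="$1" conninfo="$2"
    grep "${key}=" <<< "$conninfo" |
        sed "s/^.*${key}= *\([0-9a-zA-Z.-]\+\) *.*\$/\1/"
}

# Illustrative value, not taken from a real server
conninfo='user=replicator host=192.168.1.10 port=5432 application_name=standby1'
conninfo_field host "$conninfo"
conninfo_field port "$conninfo"
conninfo_field user "$conninfo"
```

When the key is absent, `grep` outputs nothing and the function prints an empty string, which is the case the script handles by falling back to a default port or user.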
@ -345,9 +325,7 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Master user provided by command-line, use it: $PG_MASTER_USER" debug "Master user provided by command-line, use it: $PG_MASTER_USER"
M_USER="$PG_MASTER_USER" M_USER="$PG_MASTER_USER"
else else
M_USER=$( M_USER=$( grep 'user=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*user= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' )
grep 'user=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*user= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_USER" ]]; then if [[ -z "$M_USER" ]]; then
debug "Master user not specified, use default: $PG_USER" debug "Master user not specified, use default: $PG_USER"
M_USER=$PG_USER M_USER=$PG_USER
@ -356,10 +334,7 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
fi fi
fi fi
M_APP_NAME=$( M_APP_NAME=$( grep 'application_name=' <<< "$MASTER_CONN_INFOS" | sed "s/^.*application_name=[ \'\"]*\([^ \'\"]\+\)[ \'\"]*.*$/\1/" )
grep 'application_name=' <<< "$MASTER_CONN_INFOS" |
sed "s/^.*application_name=[ \'\"]*\([^ \'\"]\+\)[ \'\"]*.*$/\1/"
)
if [[ -z "$M_APP_NAME" ]]; then if [[ -z "$M_APP_NAME" ]]; then
if [[ $PG_VERSION -ge 12 ]]; then if [[ $PG_VERSION -ge 12 ]]; then
debug "Master application name not specified, use cluster_name if defined" debug "Master application name not specified, use cluster_name if defined"
@ -379,45 +354,25 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Master application name: $M_APP_NAME" debug "Master application name: $M_APP_NAME"
fi fi
# Check if master is configured for synchronous commit
SYNC_MODE="$(
psql_master_get "SELECT setting from pg_settings WHERE name = 'synchronous_commit';"
)"
debug "Master synchronous_commit=$SYNC_MODE"
if [[ "$SYNC_MODE" == "on" ]] || [[ "$SYNC_MODE" == "remote_apply" ]]; then
debug "Master is configured for synchronous commit"
SYNCHRONOUS_COMMIT=1
else
debug "Master is not configured for synchronous commit"
SYNCHRONOUS_COMMIT=0
fi
# Get current replication state information from master # Get current replication state information from master
M_CUR_REPL_STATE_INFO="$( M_CUR_REPL_STATE_INFO="$( psql_master_get "SELECT state, sync_state, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn FROM pg_stat_replication WHERE application_name='$M_APP_NAME';" )"
psql_master_get \
"SELECT state, sync_state, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn" \
"FROM pg_stat_replication WHERE application_name='$M_APP_NAME';"
)"
if [[ -z "$M_CUR_REPL_STATE_INFO" ]]; then if [[ -z "$M_CUR_REPL_STATE_INFO" ]]; then
echo "UNKNOWN: Can't retrieve current replication state information from master server" echo "UNKNOWN: Can't retrieve current replication state information from master server"
exit 3 exit 3
fi fi
debug "Master current replication state:\n" \ debug "Master current replication state:\n\tstate|sync_state|sent_lsn|write_lsn\n\t$M_CUR_REPL_STATE_INFO"
"\tstate|sync_state|sent_lsn|write_lsn\n\t$M_CUR_REPL_STATE_INFO"
M_CUR_STATE=$( cut -d'|' -f1 <<< "$M_CUR_REPL_STATE_INFO" ) M_CUR_STATE=$( cut -d'|' -f1 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current state: $M_CUR_STATE" debug "Master current state: $M_CUR_STATE"
if [[ "$M_CUR_STATE" != "streaming" ]]; then if [[ "$M_CUR_STATE" != "streaming" ]]; then
echo "CRITICAL: this host is not in streaming state according to master host" \ echo "CRITICAL: this host is not in streaming state according to master host (current state = '$M_CUR_STATE')"
"(current state = '$M_CUR_STATE')"
exit 2 exit 2
fi fi
M_CUR_SYNC_STATE=$( cut -d'|' -f2 <<< "$M_CUR_REPL_STATE_INFO" ) M_CUR_SYNC_STATE=$( cut -d'|' -f2 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current sync state: $M_CUR_SYNC_STATE" debug "Master current sync state: $M_CUR_SYNC_STATE"
if [[ "$M_CUR_SYNC_STATE" != "$EXPECTED_SYNC_STATE" ]]; then if [[ "$M_CUR_SYNC_STATE" != "$EXPECTED_SYNC_STATE" ]]; then
echo "CRITICAL: unexpected replication state '$M_CUR_SYNC_STATE'" \ echo "CRITICAL: unexpected replication state '$M_CUR_SYNC_STATE' (expected state = '$EXPECTED_SYNC_STATE')"
"(expected state = '$EXPECTED_SYNC_STATE')"
exit 2 exit 2
fi fi
@ -425,24 +380,35 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
M_CUR_WRITED_LSN=$( cut -d'|' -f4 <<< "$M_CUR_REPL_STATE_INFO" ) M_CUR_WRITED_LSN=$( cut -d'|' -f4 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current last sent/writed LSN: '$M_CUR_SENT_LSN' / '$M_CUR_WRITED_LSN'" debug "Master current last sent/writed LSN: '$M_CUR_SENT_LSN' / '$M_CUR_WRITED_LSN'"
# Check current master LSN vs last received LSN
if [[ "$CHECK_CUR_MASTER_LSN" == "1" ]]; then
# Get current LSN from master
M_CUR_LSN="$( psql_master_get "SELECT $pg_current_wal_lsn" )"
if [[ -z "$M_CUR_LSN" ]]; then
echo "UNKNOWN: Can't retrieve current LSN from master server"
exit 3
fi
debug "Master current LSN: $M_CUR_LSN"
# Master current LSN is the last received LSN ?
if [[ "$M_CUR_LSN" != "$LAST_RECEIVED_LSN" ]]; then
echo "CRITICAL: Master current LSN is not the last received LSN"
exit 2
fi
debug "Master current LSN is the last received LSN"
fi
# The last received LSN is the last replayed ? # The last received LSN is the last replayed ?
if [[ "$LAST_RECEIVED_LSN" != "$LAST_REPLAYED_LSN" ]]; then if [[ "$LAST_RECEIVED_LSN" != "$LAST_REPLAYED_LSN" ]]; then
debug "/!\ The last received LSN is NOT the last replayed LSN" \ debug "/!\ The last received LSN is NOT the last replayed LSN ('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN')"
"('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN')" REPLAY_DELAY="$( psql_get 'SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp());' )"
REPLAY_DELAY="$(
psql_get 'SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp());'
)"
debug "Replay delay is $REPLAY_DELAY second(s)" debug "Replay delay is $REPLAY_DELAY second(s)"
if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_CRITICAL_DELAY" ) -gt 0 ]]; then if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_CRITICAL_DELAY" ) -gt 0 ]]; then
echo "CRITICAL: last received LSN is not the last replayed" \ echo "CRITICAL: last received LSN is not the last replayed ('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and replay delay is $REPLAY_DELAY second(s)"
"('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and" \
"replay delay is $REPLAY_DELAY second(s)"
exit 2 exit 2
fi fi
if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_WARNING_DELAY" ) -gt 0 ]]; then if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_WARNING_DELAY" ) -gt 0 ]]; then
echo "WARNING: last received LSN is not the last replayed" \ echo "WARNING: last received LSN is not the last replayed ('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and replay delay is $REPLAY_DELAY second(s)"
"('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and" \
"replay delay is $REPLAY_DELAY second(s)"
exit 1 exit 1
fi fi
debug "Replay delay is not worrying" debug "Replay delay is not worrying"
@ -450,27 +416,10 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Last received LSN is the last replayed file" debug "Last received LSN is the last replayed file"
# The master last sent LSN is the last received (and synced) ? # The master last sent LSN is the last received (and synced) ?
if [[ $SYNCHRONOUS_COMMIT -eq 1 ]] && [[ "$M_CUR_SENT_LSN" != "$LAST_RECEIVED_LSN" ]]; then if [[ "$M_CUR_SENT_LSN" != "$LAST_RECEIVED_LSN" ]]; then
LSN_DIFF=$( echo "WARNING: master last sent LSN is not already received (and synced to disk) by slave. May be we have some network delay or load on slave"
psql_master_get "SELECT $pg_wal_lsn_diff('$M_CUR_SENT_LSN', '$LAST_RECEIVED_LSN');"
)
debug "LSN diff ('$M_CUR_SENT_LSN' vs '$LAST_RECEIVED_LSN'): $LSN_DIFF bytes"
echo "WARNING: master last sent LSN is not already received (and synced to disk) by slave" \
"(diff: $LSN_DIFF bytes). May be we have some network delay or load on slave"
echo "Master last sent LSN: $M_CUR_SENT_LSN" echo "Master last sent LSN: $M_CUR_SENT_LSN"
echo "Slave last received (and synced to disk) LSN: $LAST_RECEIVED_LSN" echo "Slave last received (and synced to disk) LSN: $LAST_RECEIVED_LSN"
echo "Diff: $LSN_DIFF bytes"
exit 1
elif [[ $SYNCHRONOUS_COMMIT -eq 0 ]] && [[ "$M_CUR_SENT_LSN" != "$M_CUR_WRITED_LSN" ]]; then
LSN_DIFF=$(
psql_master_get "SELECT $pg_wal_lsn_diff('$M_CUR_SENT_LSN', '$M_CUR_WRITED_LSN');"
)
debug "LSN diff ('$M_CUR_SENT_LSN' vs '$M_CUR_WRITED_LSN'): $LSN_DIFF bytes"
echo "WARNING: master last sent LSN is not already received by slave" \
"(diff: $LSN_DIFF bytes). May be we have some network delay or load on slave"
echo "Master last sent LSN: $M_CUR_SENT_LSN"
echo "Slave last received LSN: $M_CUR_WRITED_LSN"
echo "Diff: $LSN_DIFF bytes"
exit 1 exit 1
fi fi
@ -495,44 +444,28 @@ elif [[ "$EXPECTED_MODE" == "master" ]]; then
fi fi
debug "Current LSN: $CURRENT_LSN" debug "Current LSN: $CURRENT_LSN"
# Check if master is configured for synchronous commit
SYNC_MODE="$( psql_get "SELECT setting from pg_settings WHERE name = 'synchronous_commit';" )"
debug "synchronous_commit=$SYNC_MODE"
if [[ "$SYNC_MODE" == "on" ]] || [[ "$SYNC_MODE" == "remote_apply" ]]; then
debug "Master is configured for synchronous commit"
SYNCHRONOUS_COMMIT=1
else
debug "Master is not configured for synchronous commit"
SYNCHRONOUS_COMMIT=0
fi
# Check standby client # Check standby client
STANDBY_CLIENTS=$( STANDBY_CLIENTS=$( psql_get "SELECT application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag
psql_get \ FROM (
"SELECT SELECT application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag
application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag FROM (
FROM ( SELECT application_name, client_addr, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn, state, sync_state,
SELECT $pg_wal_lsn_diff($pg_current_wal_lsn, $write_lsn) AS current_lag
application_name, client_addr, sent_lsn, write_lsn, state, sync_state, FROM pg_stat_replication
current_lag ) AS s2
FROM ( ) AS s1" )
SELECT
application_name, client_addr, $sent_lsn AS sent_lsn,
$write_lsn AS write_lsn, state, sync_state,
$pg_wal_lsn_diff($pg_current_wal_lsn, $write_lsn) AS current_lag
FROM pg_stat_replication
) AS s2
) AS s1"
)
if [[ -z "$STANDBY_CLIENTS" ]]; then if [[ -z "$STANDBY_CLIENTS" ]]; then
echo "WARNING: no stand-by client connected" echo "WARNING: no stand-by client connected"
exit 1 exit 1
fi fi
debug "Stand-by client(s):\n\t${STANDBY_CLIENTS//$'\n'/\\n\\t}" debug "Stand-by client(s):\n\t${STANDBY_CLIENTS//$'\n'/\\n\\t}"
STANDBY_CLIENTS_ROWS=() STANDBY_CLIENTS_TXT=""
STANDBY_CLIENTS_COUNT=0
CURRENT_LSN_IS_LAST_SENT=1 CURRENT_LSN_IS_LAST_SENT=1
for line in $STANDBY_CLIENTS; do for line in $STANDBY_CLIENTS; do
(( STANDBY_CLIENTS_COUNT+=1 ))
NAME=$( cut -d '|' -f 1 <<< "$line" ) NAME=$( cut -d '|' -f 1 <<< "$line" )
IP=$( cut -d '|' -f 2 <<< "$line" ) IP=$( cut -d '|' -f 2 <<< "$line" )
SENT_LSN=$( cut -d '|' -f 3 <<< "$line" ) SENT_LSN=$( cut -d '|' -f 3 <<< "$line" )
@ -540,28 +473,20 @@ elif [[ "$EXPECTED_MODE" == "master" ]]; then
STATE=$( cut -d '|' -f 5 <<< "$line" ) STATE=$( cut -d '|' -f 5 <<< "$line" )
SYNC_STATE=$( cut -d '|' -f 6 <<< "$line" ) SYNC_STATE=$( cut -d '|' -f 6 <<< "$line" )
LAG=$( cut -d '|' -f 7 <<< "$line" ) LAG=$( cut -d '|' -f 7 <<< "$line" )
STANDBY_CLIENTS_ROW="$NAME ($IP): $STATE/$SYNC_STATE" STANDBY_CLIENTS_TXT="$STANDBY_CLIENTS_TXT\n$NAME ($IP): $STATE/$SYNC_STATE (LSN: sent='$SENT_LSN' / writed='$WRITED_LSN', Lag: ${LAG}b)"
STANDBY_CLIENTS_ROW+=" (LSN: sent='$SENT_LSN' / writed='$WRITED_LSN', Lag: ${LAG}b)" [[ "$SENT_LSN" != "$CURRENT_LSN" ]] && CURRENT_LSN_IS_LAST_SENT=0
STANDBY_CLIENTS_ROWS+=( "$STANDBY_CLIENTS_ROW" )
if [[ $SYNCHRONOUS_COMMIT -eq 1 ]] && [[ "$SENT_LSN" != "$CURRENT_LSN" ]]; then
CURRENT_LSN_IS_LAST_SENT=0
elif [[ $SYNCHRONOUS_COMMIT -eq 0 ]] && [[ "$SENT_LSN" != "$WRITED_LSN" ]]; then
CURRENT_LSN_IS_LAST_SENT=0
fi
done done
if [[ $CURRENT_LSN_IS_LAST_SENT -eq 1 ]]; then if [[ $CURRENT_LSN_IS_LAST_SENT -eq 1 ]]; then
echo "OK: ${#STANDBY_CLIENTS_ROWS[@]} stand-by client(s) connected" echo "OK: $STANDBY_CLIENTS_COUNT stand-by client(s) connected"
EXIT_CODE=0 EXIT_CODE=0
else else
echo "WARNING: current master LSN is not the last sent to stand-by client(s) connected." \ echo "WARNING: current master LSN is not the last sent to stand-by client(s) connected. May be we have some load ?"
"May be we have some load ?"
EXIT_CODE=1 EXIT_CODE=1
fi fi
echo "Current master LSN: $CURRENT_LSN" echo "Current master LSN: $CURRENT_LSN"
IFS=$'\n' echo -e "$STANDBY_CLIENTS_TXT"
echo "${STANDBY_CLIENTS_ROWS[*]}"
exit $EXIT_CODE exit $EXIT_CODE
else else
echo "UNKNOWN - Invalid mode '$EXPECTED_MODE'" echo "UNKNOWN - Invalid mode '$EXPECTED_MODE'"

4
debian/control vendored

@ -2,11 +2,11 @@ Source: check-pg-streaming-replication
Section: admin Section: admin
Priority: optional Priority: optional
Maintainer: Debian Zionetrix - check-pg-streaming-replication <debian+check-pg-streaming-replication@zionetrix.net> Maintainer: Debian Zionetrix - check-pg-streaming-replication <debian+check-pg-streaming-replication@zionetrix.net>
Build-Depends: debhelper (>> 11.0.0), findutils, rsync, sed, awk, git, gitdch Build-Depends: debhelper (>> 11.0.0)
Standards-Version: 3.9.6 Standards-Version: 3.9.6
Package: check-pg-streaming-replication Package: check-pg-streaming-replication
Architecture: all Architecture: all
Depends: ${misc:Depends}, sudo, gawk, sed, bc, postgresql-client Depends: ${misc:Depends}, python3, python3-requests
Description: Monitoring plugin to check Postgres Streaming replication state Description: Monitoring plugin to check Postgres Streaming replication state
This Icinga/Nagios check plugin checks the state of Postgres streaming replication. This Icinga/Nagios check plugin checks the state of Postgres streaming replication.