Compare commits

...

7 commits

Author SHA1 Message Date
Benjamin Renard
e1392060a3
Add .editorconfig file and ignore dist directory by git
All checks were successful
Run tests / tests (push) Successful in 2m40s
2024-07-17 11:49:25 +02:00
Benjamin Renard
b61ab45ae1
debian package: provide a manpage built from README file and fix setting version and date in script 2024-07-17 11:49:24 +02:00
Benjamin Renard
fb02ac9269
Fix debian package build dependencies 2024-07-17 11:49:24 +02:00
Benjamin Renard
d3311da78f
Update what the script do in README.md file 2024-07-17 11:49:24 +02:00
Benjamin Renard
e5514e587f
Fix taking care of synchronous_commit master configuration and removed useless -C parameter and its check 2024-07-17 11:49:23 +02:00
Benjamin Renard
bc078f83e8
Code cleaning 2024-07-16 13:43:26 +02:00
Benjamin Renard
269b92415f
Update README.md 2024-07-16 10:40:47 +02:00
6 changed files with 319 additions and 159 deletions

15
.editorconfig Normal file

@ -0,0 +1,15 @@
root = true
[*]
indent_style = space
indent_size = 4
trim_trailing_whitespace = true
insert_final_newline = true
charset = utf-8
end_of_line = lf
[*.{yaml,yml}]
indent_size = 2
[Makefile]
indent_style = tab

1
.gitignore vendored

@ -1,2 +1,3 @@
*~ *~
/.env /.env
/dist

104
README.md

@ -4,37 +4,60 @@ This script could be used as Nagios check plugin to verify Postgres Streaming re
This script : This script :
- check if Postgres is running (_CRITICAL_ raise if not) - check if Postgres is running (_CRITICAL_ raise if not)
- check if Postgres is in recovery mode : - check if Postgres is in recovery mode (using `pg_is_in_recovery()`) :
- if Postgres is in recovery mode : - if the expected mode needs to be auto-detected (default, see `-E` parameter):
- retrieve from Postgres the last _xlog_ file receive and the _xlog_ file replay - if Postgres is in recovery mode: `hot-standby`
- check if Postgres recovery configuration file is NOT present (_CRITICAL_ raise if present) - if recovery file is present and contains `primary_conninfo`: `hot-standby`
- retrieve master connection information from Postgres recovery configuration file (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify. - otherwise: `master`
- retrieve the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error). - if expected mode is `hot-standby`:
- check if the current state of the host is "streaming" (_CRITICAL_ raise if not) - check if Postgres is in recovery mode (_CRITICAL_ raise if not)
- check if the current sync state of the host is "sync" (or the state specified using `-e` parameter, _CRITICAL_ raise if not) - retrieve from Postgres the last _xlog_ file received and the _xlog_ file replayed
- if the check of the current XLOG file of the master host is enabled : - retrieve master connection information from Postgres `primary_conninfo` configuration parameter (_UNKNOWN_ raise on error). Default Postgres master TCP port will be used if port is not specify.
- retrieve current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error). - retrieve master sync mode from `synchronous_commit` setting and assume synchronous commit is enabled if `synchronous_commit` is equal to `on` or `remote_apply`
- check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raise if not) - retrieve the current state and sync state of the host from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
- check if the last received _xlog_ file is the last replay _xlog_ file : if not, check the current delay with the last replayed transaction against _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise corresponding error if they are exceeded - check if the current state of the host is `streaming` (_CRITICAL_ raise if not)
- Return _OK_ state - check if the current sync state of the host is the expected one (default: `sync`, see `-e` parameter, _CRITICAL_ raise if not)
- if Postgres is not in recovery mode : - if the check of the current XLOG file of the master host is enabled :
- check if Postgres recovery configuration file is present (_CRITICAL_ raise if present) - retrieve current _xlog_ file from Postgres master server by making a connection on master server (_UNKNOWN_ raise on error).
- check if stand-by client(s) is connected (_WARNING_ raise if not) - check if the current master _xlog_ file is the last received _xlog_ file (_CRITICAL_ raise if not)
- Return _OK_ state with list and count of stand-by client(s) - check if the last received _xlog_ file is the last replayed _xlog_ file : if not, check the current delay with the last replayed transaction against _replay_warn_delay_ and _replay_crit_delay_ thresholds and raise corresponding error if they are exceeded
- if synchronous commit is enabled on master, check that the last _xlog_ file sent by Postgres master is the last received by the slave. If not, retrieve the difference (in bytes) and raise a _WARNING_.
- if synchronous commit is disabled on master, check that the last _xlog_ file sent by Postgres master is the last written by the slave. If not, retrieve the difference (in bytes) and raise a _WARNING_.
- otherwise, return _OK_ state
- if expected mode is `master`:
- check if Postgres is in recovery mode (_CRITICAL_ raise if it is)
- retrieve current _xlog_ file (_UNKNOWN_ raise on error)
- retrieve sync mode from `synchronous_commit` setting and assume synchronous commit is enabled if `synchronous_commit` is equal to `on` or `remote_apply`
- list stand-by client(s) from master and check for each of them:
- if synchronous commit is enabled, check that the last _xlog_ file sent by Postgres master is the current one on master (_WARNING_ raise if not)
- if synchronous commit is disabled, check that the last _xlog_ file sent by Postgres master is the last written one on master (_WARNING_ raise if not)
- otherwise, return _OK_ state with list and count of stand-by client(s)
**Note :** This script was originally write for PostgreSQL 9.1 and test on 9.1, 9.5, 9.6, 11, 13 and 15. Do not hesitate to tell me how this script work with other versions and share some fix. All contributions are welcome ! **Note :** This script was originally write for PostgreSQL 9.1 and test on 9.1, 9.5, 9.6, 11, 13 and 15. Do not hesitate to tell me how this script work with other versions and share some fix. All contributions are welcome !
## Requirements ## Requirements
- Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster` - Some CLI tools: `sudo`, `awk`, `sed`, `bc`, `psql` and `pg_lscluster`
- **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details). - **On master node:** Slaves must be able to connect with user from `recovery.conf` / `postgresql.auto.conf` (or user specify using `-U`) to database with the same name (or another specified with `-D`) as `trust` (or using password specified in `~/.pgpass`). This user must have `SUPERUSER` privilege (need to get replication details).
- **On standby node:** `PG_USER` must be able to connect locally on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`). - **On standby node:** `PG_USER` must be able to connect locally on the database with the same name `(or another specified with -D)` as `trust` (or using password specified in `~/.pgpass`).
## Installation ## Installation
### From debian packages
```
echo "deb http://debian.zionetrix.net stable main" | sudo tee /etc/apt/sources.list.d/zionetrix.list
sudo apt -o Acquire::AllowInsecureRepositories=true -o Acquire::AllowDowngradeToInsecureRepositories=true update
sudo apt -o APT::Get::AllowUnauthenticated=true install --yes zionetrix-archive-keyring
sudo apt update
sudo apt install check-pg-streaming-replication
```
### From sources
``` ```
apt install sudo awk sed bc postgresql-client apt install sudo awk sed bc postgresql-client
git clone https://gitea.zionetrix.net/bn8/check_pg_streaming_replication.git \ git clone https://gitea.zionetrix.net/bn8/check_pg_streaming_replication.git \
@ -48,27 +71,34 @@ ln -s /usr/local/src/check_pg_streaming_replication/check_pg_streaming_replicati
``` ```
Usage: ./check_pg_streaming_replication [-d] [-h] [options] Usage: ./check_pg_streaming_replication [-d] [-h] [options]
-u pg_user Specify local Postgres user (Default: try to auto-detect or use postgres) -u pg_user Specify local Postgres user (Default: try to auto-detect or
use postgres)
-b psql_bin Specify psql binary path (Default: /usr/bin/psql) -b psql_bin Specify psql binary path (Default: /usr/bin/psql)
-B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: /usr/bin/pg_lsclusters) -B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: /usr/bin/pg_lsclusters)
-V pg_version Specify Postgres version (Default: try to auto-detect or use 9.1) -V pg_version Specify Postgres version (Default: try to auto-detect or
-m pg_main Specify Postgres main directory path (Default: try to auto-detect or use use 9.1)
/var/lib/postgresql//main) -m pg_main Specify Postgres main directory path (Default: try to auto-detect or
use /var/lib/postgresql//main)
-r recovery_conf Specify Postgres recovery configuration file path -r recovery_conf Specify Postgres recovery configuration file path
( Default: [PG_MAIN]/recovery.conf on PG <= 11, [PG_MAIN]/postgresql.auto.conf on PG >= 12) (Default: [PG_MAIN]/recovery.conf on PG <= 11,
-U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf file) [PG_MAIN]/postgresql.auto.conf on PG >= 12)
-p pg_port Specify default Postgres master TCP port (Default: same as local PostgreSQL -U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf
port if detected or use 5432) file)
-D dbname Specify DB name on Postgres master/slave to connect on (Default: PG_USER, must -p pg_port Specify default Postgres master TCP port (Default: same as local
match with .pgpass one is used) PostgreSQL port if detected or use 5432)
-C 1/0 Enable or disable check if the current LSN of the master host is the same -D dbname Specify DB name on Postgres master/slave to connect on (Default:
of the last received LSN (Default: 1) PG_USER, must match with .pgpass one is used)
-w replay_warn_delay Specify the replay warning delay in second (Default: 3) -w replay_warn_delay Specify the replay warning delay in second
-c replay_crit_delay Specify the replay critical delay in second (Default: 5) (Default: 3)
-e expected_sync_state The expected replication state ('sync' or 'async', default: sync) -c replay_crit_delay Specify the replay critical delay in second
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto', default: 'auto') (Default: 5)
-e expected_sync_state The expected replication state ('sync' or 'async',
default: sync)
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto',
default: 'auto')
-d Debug mode -d Debug mode
-h Show this message -h Show this message
``` ```
## Copyright ## Copyright
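
The synchronous-commit rule introduced in the README hunk above can be sketched as a small helper. This is only an illustration, not code from the script: the function names `is_synchronous_commit` and `lsn_to_compare` are hypothetical, and only the rule itself (`on` or `remote_apply` means synchronous commit is enabled, which selects the received or written position on the standby) comes from the README text.

```bash
#!/bin/bash
# Hypothetical helper: return 0 when a synchronous_commit value should be
# treated as "synchronous commit enabled", following the README rule.
is_synchronous_commit() {
    case "$1" in
        on|remote_apply) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: name the standby-side position that the last position
# sent by the master is compared with in hot-standby mode.
lsn_to_compare() {
    if is_synchronous_commit "$1"; then
        echo "received"
    else
        echo "written"
    fi
}

for value in on remote_apply local off; do
    echo "synchronous_commit=$value: compare last sent with last $(lsn_to_compare "$value")"
done
```

Any value other than `on` or `remote_apply` (for example `local`, `remote_write` or `off`) falls into the "disabled" branch under this rule.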

View file

@ -1,16 +1,21 @@
#!/bin/bash #!/bin/bash
# vim: tabstop=4 shiftwidth=4 softtabstop=4 expandtab
QUIET_ARG="" set -e
[[ "$1" == "--quiet" ]] && QUIET_ARG="--quiet"
QUIET_MODE=0
[[ "$1" == "--quiet" ]] && QUIET_MODE=1
# Enter source directory # Enter source directory
cd "$( dirname "$0" )" || exit cd "$( dirname "$0" )" || exit
CHECK_FILE="$( find "." -name 'check_*' ! -name '*~' -type f -executable | head -n 1 )" if [[ -d dist ]]; then
PACKAGE_NAME="$( basename "$CHECK_FILE" | tr '_' '-' )" echo "Clean previous build..."
rm -fr dist
fi
echo "Clean previous build..." CHECK_FILE="$( find "." -maxdepth 1 -name 'check_*' ! -name '*~' -type f -executable | head -n 1 )"
rm -fr dist PACKAGE_NAME="$( basename "$CHECK_FILE" | tr '_' '-' )"
echo "Detect version using git describe..." echo "Detect version using git describe..."
VERSION="$( git describe --tags|sed 's/^[^0-9]*//' )" VERSION="$( git describe --tags|sed 's/^[^0-9]*//' )"
@ -18,46 +23,80 @@ VERSION="$( git describe --tags|sed 's/^[^0-9]*//' )"
echo "Create building environemt..." echo "Create building environemt..."
BDIR="dist/$PACKAGE_NAME-$VERSION" BDIR="dist/$PACKAGE_NAME-$VERSION"
mkdir -p "$BDIR" mkdir -p "$BDIR"
RSYNC_ARG="" RSYNC_ARGS=( )
[[ -z "$QUIET_ARG" ]] && RSYNC_ARG="-v" [[ $QUIET_MODE -eq 0 ]] && RSYNC_ARGS+=( -v )
rsync -a "$RSYNC_ARG" debian/ "$BDIR/debian/" rsync -a "${RSYNC_ARGS[@]}" debian/ "$BDIR/debian/"
cp "$CHECK_FILE" "$BDIR/" cp "$CHECK_FILE" "$BDIR/"
echo "Set VERSION=$VERSION in gitdch using sed..." if [[ -e "README.md" ]]; then
sed -i "s/^VERSION *=.*$/VERSION = '$VERSION'/" "$BDIR/$( basename "$CHECK_FILE" )" echo "Build manpage from README.md file..."
MAN_TITLE=$( basename "$CHECK_FILE" )
MAN_MD_FILE="$BDIR/$( basename "$CHECK_FILE" ).1.md"
MAN_FILE="$BDIR/$( basename "$CHECK_FILE" ).1"
echo "# ${MAN_TITLE^^} 1" \
"$( git log --follow --format=%ad --date iso README.md | head -n 1 | awk '{print $1}')" \
"$PACKAGE_NAME" \
'"User Manuals"' > "$MAN_MD_FILE"
sed 1d README.md >> "$MAN_MD_FILE"
if ! which go-md2man > /dev/null; then
ARG_ARGS=()
[[ $QUIET_MODE -eq 1 ]] && ARG_ARGS+=( -qq )
apt update "${ARG_ARGS[@]}"
apt install -y "${ARG_ARGS[@]}" go-md2man
fi
go-md2man -in "$MAN_MD_FILE" -out "$MAN_FILE"
basename "$MAN_FILE" >> "$BDIR/debian/manpages"
grep -Eq '^Build-Depends: .*go-md2man' "$BDIR/debian/control" || \
sed -i 's/^Build-Depends: \(.*\)$/Build-Depends: \1, go-md2man/' "$BDIR/debian/control"
fi
if grep -Eq '^VERSION *=' "$BDIR/$( basename "$CHECK_FILE" )"; then
echo "Set VERSION=$VERSION in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^VERSION *=.*$/VERSION = '$VERSION'/" "$BDIR/$( basename "$CHECK_FILE" )"
elif grep -Eq '^# Version:' "$BDIR/$( basename "$CHECK_FILE" )"; then
echo "Set Version = $VERSION in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^#\(\s*Version:\s*\).*$/#\1$VERSION/" "$BDIR/$( basename "$CHECK_FILE" )"
fi
if grep -Eq '^# Date:' "$BDIR/$( basename "$CHECK_FILE" )"; then
DATE="$( git log --follow --format=%ad --date iso "$CHECK_FILE" | head -n 1 )"
echo "Set Date = $DATE in $( basename "$CHECK_FILE" ) using sed..."
sed -i "s/^#\(\s*Date:\s*\).*$/#\1$DATE/" "$BDIR/$( basename "$CHECK_FILE" )"
fi
if [[ -z "$DEBIAN_CODENAME" ]]; then if [[ -z "$DEBIAN_CODENAME" ]]; then
echo "Retrieve debian codename using lsb_release..." echo "Retrieve debian codename using lsb_release..."
DEBIAN_CODENAME=$( lsb_release -c -s ) DEBIAN_CODENAME=$( lsb_release -c -s )
else else
echo "Use debian codename from environment ($DEBIAN_CODENAME)" echo "Use debian codename from environment ($DEBIAN_CODENAME)"
fi fi
echo "Generate debian changelog using gitdch..." echo "Generate debian changelog using gitdch..."
GITDCH_ARGS=('--verbose') GITDCH_ARGS=('--verbose')
[[ -n "$QUIET_ARG" ]] && GITDCH_ARGS=('--warning') [[ $QUIET_MODE -eq 1 ]] && GITDCH_ARGS=('--warning')
if [[ -n "$MAINTAINER_NAME" ]]; then if [[ -n "$MAINTAINER_NAME" ]]; then
echo "Use maintainer name from environment ($MAINTAINER_NAME)" echo "Use maintainer name from environment ($MAINTAINER_NAME)"
GITDCH_ARGS+=("--maintainer-name" "${MAINTAINER_NAME}") GITDCH_ARGS+=("--maintainer-name" "${MAINTAINER_NAME}")
fi fi
if [[ -n "$MAINTAINER_EMAIL" ]]; then if [[ -n "$MAINTAINER_EMAIL" ]]; then
echo "Use maintainer email from environment ($MAINTAINER_EMAIL)" echo "Use maintainer email from environment ($MAINTAINER_EMAIL)"
GITDCH_ARGS+=("--maintainer-email" "$MAINTAINER_EMAIL") GITDCH_ARGS+=("--maintainer-email" "$MAINTAINER_EMAIL")
fi fi
gitdch \ gitdch \
--package-name "$PACKAGE_NAME" \ --package-name "$PACKAGE_NAME" \
--version "${VERSION}" \ --version "${VERSION}" \
--code-name "$DEBIAN_CODENAME" \ --code-name "$DEBIAN_CODENAME" \
--output "$BDIR"/debian/changelog \ --output "$BDIR"/debian/changelog \
--release-notes dist/release_notes.md \ --release-notes dist/release_notes.md \
"${GITDCH_ARGS[@]}" "${GITDCH_ARGS[@]}"
if [[ -n "$MAINTAINER_NAME" ]] && [[ -n "$MAINTAINER_EMAIL" ]]; then if [[ -n "$MAINTAINER_NAME" ]] && [[ -n "$MAINTAINER_EMAIL" ]]; then
echo "Set Maintainer field in debian control file ($MAINTAINER_NAME <$MAINTAINER_EMAIL>)..." echo "Set Maintainer field in debian control file ($MAINTAINER_NAME <$MAINTAINER_EMAIL>)..."
sed -i "s/^Maintainer: .*$/Maintainer: $MAINTAINER_NAME <$MAINTAINER_EMAIL>/" \ sed -i "s/^Maintainer: .*$/Maintainer: $MAINTAINER_NAME <$MAINTAINER_EMAIL>/" \
"$BDIR"/debian/control "$BDIR"/debian/control
fi fi
echo "Build debian package..." echo "Build debian package..."
cd "$BDIR" || exit cd "$BDIR" || exit
[[ $QUIET_MODE -eq 0 ]] && export DH_VERBOSE=1
dpkg-buildpackage dpkg-buildpackage
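
The switch from a string (`RSYNC_ARG`) to an array (`RSYNC_ARGS`) in the build script hunk above matters because a quoted empty string is still passed as an argument. A minimal sketch of the difference follows, assuming a hypothetical `count_args` function as a stand-in for the real command:

```bash
#!/bin/bash
# Hypothetical stand-in for a command such as rsync: print how many
# arguments it received.
count_args() {
    echo "$#"
}

QUIET_MODE=1

# Old pattern: the quoted empty string is still passed as one (empty) argument.
RSYNC_ARG=""
if [[ $QUIET_MODE -eq 0 ]]; then RSYNC_ARG="-v"; fi
count_args -a "$RSYNC_ARG" src/ dst/        # prints 4

# New pattern: an empty array expands to no argument at all.
RSYNC_ARGS=()
if [[ $QUIET_MODE -eq 0 ]]; then RSYNC_ARGS+=( -v ); fi
count_args -a "${RSYNC_ARGS[@]}" src/ dst/  # prints 3
```

With the old pattern, a command run in quiet mode receives an empty string as an extra positional argument, which it may interpret as a path.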

View file

@ -20,7 +20,8 @@
# ~/.pgpass). # ~/.pgpass).
# #
# Author: Benjamin Renard <brenard@easter-eggs.com> # Author: Benjamin Renard <brenard@easter-eggs.com>
# Date: Mon, 03 Jun 2024 15:31:29 +0200 # Version: dev
# Date: dev
# Source: https://gitea.zionetrix.net/bn8/check_pg_streaming_replication # Source: https://gitea.zionetrix.net/bn8/check_pg_streaming_replication
# SPDX-License-Identifier: GPL-3.0-or-later # SPDX-License-Identifier: GPL-3.0-or-later
# #
@ -39,7 +40,6 @@ RECOVERY_CONF=""
PG_DEFAULT_PORT="" PG_DEFAULT_PORT=""
PG_DEFAULT_APP_NAME=$( hostname ) PG_DEFAULT_APP_NAME=$( hostname )
PG_DB="" PG_DB=""
CHECK_CUR_MASTER_LSN=1
REPLAY_WARNING_DELAY=3 REPLAY_WARNING_DELAY=3
REPLAY_CRITICAL_DELAY=5 REPLAY_CRITICAL_DELAY=5
EXPECTED_SYNC_STATE=sync EXPECTED_SYNC_STATE=sync
@ -48,36 +48,42 @@ EXPECTED_MODE=auto
DEBUG=0 DEBUG=0
function usage () { function usage () {
ERROR="$1" ERROR="$*"
[[ -n "$ERROR" ]] && echo -e "$ERROR\n" [[ -n "$ERROR" ]] && echo -e "$ERROR\n"
cat << EOF cat << EOF
Usage: $0 [-d] [-h] [options] Usage: $0 [-d] [-h] [options]
-u pg_user Specify local Postgres user (Default: try to auto-detect or use $DEFAULT_PG_USER) -u pg_user Specify local Postgres user (Default: try to auto-detect or
use $DEFAULT_PG_USER)
-b psql_bin Specify psql binary path (Default: $PSQL_BIN) -b psql_bin Specify psql binary path (Default: $PSQL_BIN)
-B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: $PG_LSCLUSTER_BIN) -B pg_lsclusters_bin Specify pg_lsclusters binary path (Default: $PG_LSCLUSTER_BIN)
-V pg_version Specify Postgres version (Default: try to auto-detect or use $DEFAULT_PG_VERSION) -V pg_version Specify Postgres version (Default: try to auto-detect or
-m pg_main Specify Postgres main directory path (Default: try to auto-detect or use use $DEFAULT_PG_VERSION)
$DEFAULT_PG_MAIN) -m pg_main Specify Postgres main directory path (Default: try to auto-detect or
use $DEFAULT_PG_MAIN)
-r recovery_conf Specify Postgres recovery configuration file path -r recovery_conf Specify Postgres recovery configuration file path
(Default: [PG_MAIN]/recovery.conf on PG <= 11, [PG_MAIN]/postgresql.auto.conf on PG >= 12) (Default: [PG_MAIN]/recovery.conf on PG <= 11,
-U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf file) [PG_MAIN]/postgresql.auto.conf on PG >= 12)
-p pg_port Specify default Postgres master TCP port (Default: same as local PostgreSQL -U pg_master_user Specify Postgres user to use on master (Default: user from recovery.conf
port if detected or use $DEFAULT_PG_PORT) file)
-D dbname Specify DB name on Postgres master/slave to connect on (Default: PG_USER, must -p pg_port Specify default Postgres master TCP port (Default: same as local
match with .pgpass one is used) PostgreSQL port if detected or use $DEFAULT_PG_PORT)
-C 1/0 Enable or disable check if the current LSN of the master host is the same -D dbname Specify DB name on Postgres master/slave to connect on (Default:
of the last received LSN (Default: $CHECK_CUR_MASTER_LSN) PG_USER, must match with .pgpass one is used)
-w replay_warn_delay Specify the replay warning delay in second (Default: $REPLAY_WARNING_DELAY) -w replay_warn_delay Specify the replay warning delay in second
-c replay_crit_delay Specify the replay critical delay in second (Default: $REPLAY_CRITICAL_DELAY) (Default: $REPLAY_WARNING_DELAY)
-e expected_sync_state The expected replication state ('sync' or 'async', default: $EXPECTED_SYNC_STATE) -c replay_crit_delay Specify the replay critical delay in second
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto', default: '$EXPECTED_MODE') (Default: $REPLAY_CRITICAL_DELAY)
-e expected_sync_state The expected replication state ('sync' or 'async',
default: $EXPECTED_SYNC_STATE)
-E expected_mode The expected mode ('master', 'hot-standby' or 'auto',
default: '$EXPECTED_MODE')
-d Debug mode -d Debug mode
-h Show this message -h Show this message
EOF EOF
[[ -n "$ERROR" ]] && exit 1 || exit 0 [[ -n "$ERROR" ]] && exit 1 || exit 0
} }
while getopts "hu:b:B:V:m:r:U:p:D:C:w:c:e:E:d" OPTION; do while getopts "hu:b:B:V:m:r:U:p:D:w:c:e:E:d" OPTION; do
case $OPTION in case $OPTION in
u) u)
PG_USER=$OPTARG PG_USER=$OPTARG
@ -106,9 +112,6 @@ while getopts "hu:b:B:V:m:r:U:p:D:C:w:c:e:E:d" OPTION; do
D) D)
PG_DB=$OPTARG PG_DB=$OPTARG
;; ;;
C)
CHECK_CUR_MASTER_LSN=$OPTARG
;;
w) w)
REPLAY_WARNING_DELAY=$OPTARG REPLAY_WARNING_DELAY=$OPTARG
;; ;;
@ -117,12 +120,15 @@ while getopts "hu:b:B:V:m:r:U:p:D:C:w:c:e:E:d" OPTION; do
;; ;;
e) e)
[[ "$OPTARG" != "sync" ]] && [[ "$OPTARG" != "async" ]] && \ [[ "$OPTARG" != "sync" ]] && [[ "$OPTARG" != "async" ]] && \
usage "Invalid expected replication state '$OPTARG'. Possible values: sync or async." usage "Invalid expected replication state '$OPTARG'." \
"Possible values: sync or async."
EXPECTED_SYNC_STATE=$OPTARG EXPECTED_SYNC_STATE=$OPTARG
;; ;;
E) E)
[[ "$OPTARG" != "master" ]] && [[ "$OPTARG" != "hot-standby" ]] && [[ "$OPTARG" != "auto" ]] && \ [[ "$OPTARG" != "master" ]] && [[ "$OPTARG" != "hot-standby" ]] && \
usage "Invalid expected mode '$OPTARG'. Possible values: master, hot-standby or auto." [[ "$OPTARG" != "auto" ]] && \
usage "Invalid expected mode '$OPTARG'. Possible values: master, hot-standby" \
"or auto."
EXPECTED_MODE=$OPTARG EXPECTED_MODE=$OPTARG
;; ;;
d) d)
@ -139,7 +145,7 @@ done
function debug() { function debug() {
if [[ $DEBUG -eq 1 ]]; then if [[ $DEBUG -eq 1 ]]; then
>&2 echo -e "[DEBUG] $1" >&2 echo -e "[DEBUG] $*"
fi fi
} }
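
The change from `$1` to `$*` in `usage` and `debug` is what allows the long messages elsewhere in this diff to be split across several quoted arguments. A reduced sketch of the behaviour follows, using hypothetical function names and plain `echo` on stdout rather than the script's `>&2 echo -e`:

```bash
#!/bin/bash
# Hypothetical reduced versions of the debug helper, before and after.
debug_first_only() { echo "[DEBUG] $1"; }
debug_all_args()   { echo "[DEBUG] $*"; }

debug_first_only "Master current replication state:" "state|sync_state"
# prints: [DEBUG] Master current replication state:

debug_all_args "Master current replication state:" "state|sync_state"
# prints: [DEBUG] Master current replication state: state|sync_state
```

`$*` joins all arguments with the first character of `IFS` (a space by default), so each continuation line contributes one space-separated piece of the message.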
@ -153,7 +159,6 @@ PG_MAIN = $PG_MAIN
RECOVERY_CONF = $RECOVERY_CONF RECOVERY_CONF = $RECOVERY_CONF
PG_DEFAULT_PORT = $PG_DEFAULT_PORT PG_DEFAULT_PORT = $PG_DEFAULT_PORT
PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME
CHECK_CUR_MASTER_LSN = $CHECK_CUR_MASTER_LSN
REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY
REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY
EXPECTED_SYNC_STATE = $EXPECTED_SYNC_STATE EXPECTED_SYNC_STATE = $EXPECTED_SYNC_STATE
@ -162,15 +167,19 @@ EXPECTED_MODE = $EXPECTED_MODE
# Auto-detect PostgreSQL information using pg_lsclusters # Auto-detect PostgreSQL information using pg_lsclusters
if [[ -x "$PG_LSCLUSTER_BIN" ]]; then if [[ -x "$PG_LSCLUSTER_BIN" ]]; then
PG_CLUSTER=$( $PG_LSCLUSTER_BIN -h 2>/dev/null|head -n1 ) PG_CLUSTER=$( $PG_LSCLUSTER_BIN -h 2>/dev/null | head -n1 )
if [[ -n "$PG_CLUSTER" ]]; then if [[ -n "$PG_CLUSTER" ]]; then
debug "pg_lsclusters output:\n\t$PG_CLUSTER" debug "pg_lsclusters output:\n\t$PG_CLUSTER"
# Output example: # Output example:
# 9.6 main 5432 online,recovery postgres /var/lib/postgresql/9.6/main /var/log/postgresql/postgresql-9.6-main.log # 9.6 main 5432 online,recovery postgres /var/lib/postgresql/9.6/main \
[[ -z "$PG_VERSION" ]] && PG_VERSION=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $1}' ) # /var/log/postgresql/postgresql-9.6-main.log
[[ -z "$PG_DEFAULT_PORT" ]] && PG_DEFAULT_PORT=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $3}' ) # 13 main 5432 online,recovery,pacemaker postgres /var/lib/postgresql/13/main \
[[ -z "$PG_USER" ]] && PG_USER=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $5}' ) # /var/log/postgresql/postgresql-13-main.log
[[ -z "$PG_MAIN" ]] && PG_MAIN=$( echo "$PG_CLUSTER"|awk -F ' +' '{print $6}' ) [[ -z "$PG_VERSION" ]] && PG_VERSION=$( awk -F ' +' '{print $1}' <<< "$PG_CLUSTER" )
[[ -z "$PG_DEFAULT_PORT" ]] && \
PG_DEFAULT_PORT=$( awk -F ' +' '{print $3}' <<< "$PG_CLUSTER" )
[[ -z "$PG_USER" ]] && PG_USER=$( awk -F ' +' '{print $5}' <<< "$PG_CLUSTER" )
[[ -z "$PG_MAIN" ]] && PG_MAIN=$( awk -F ' +' '{print $6}' <<< "$PG_CLUSTER" )
fi fi
else else
debug "pg_lsclusters not found ($PG_LSCLUSTER_BIN): parameters auto-detection disabled" debug "pg_lsclusters not found ($PG_LSCLUSTER_BIN): parameters auto-detection disabled"
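
The field extraction in the hunk above can be tried in isolation. The sketch below reuses the sample line from the script's "Output example" comment (with extra spaces added to mimic column padding) and wraps the `awk -F ' +'` call in a hypothetical `cluster_field` helper that does not exist in the script:

```bash
#!/bin/bash
# Sample line based on the "Output example" comment in the script.
PG_CLUSTER='9.6 main    5432 online,recovery postgres /var/lib/postgresql/9.6/main /var/log/postgresql/postgresql-9.6-main.log'

# Hypothetical helper: print field number $1 of a pg_lsclusters line ($2),
# using one or more spaces as the field separator.
cluster_field() {
    awk -F ' +' -v n="$1" '{print $n}' <<< "$2"
}

PG_VERSION=$( cluster_field 1 "$PG_CLUSTER" )
PG_DEFAULT_PORT=$( cluster_field 3 "$PG_CLUSTER" )
PG_USER=$( cluster_field 5 "$PG_CLUSTER" )
PG_MAIN=$( cluster_field 6 "$PG_CLUSTER" )

echo "version=$PG_VERSION port=$PG_DEFAULT_PORT user=$PG_USER main=$PG_MAIN"
# prints: version=9.6 port=5432 user=postgres main=/var/lib/postgresql/9.6/main
```

Feeding the line through a here-string (`<<<`) instead of `echo ... |` gives the same result while avoiding an extra process, which is the substance of the change in this hunk.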
@ -194,7 +203,11 @@ id "$PG_USER" > /dev/null 2>&1 || { echo "UNKNOWN: Invalid Postgres user ($PG_US
# Check RECOVERY_CONF # Check RECOVERY_CONF
if [[ -z "$RECOVERY_CONF" ]]; then if [[ -z "$RECOVERY_CONF" ]]; then
[[ $PG_VERSION -le 11 ]] && RECOVERY_CONF_FILENAME="recovery.conf" || RECOVERY_CONF_FILENAME="postgresql.auto.conf" if [[ $PG_VERSION -le 11 ]]; then
RECOVERY_CONF_FILENAME="recovery.conf"
else
RECOVERY_CONF_FILENAME="postgresql.auto.conf"
fi
RECOVERY_CONF="$PG_MAIN/$RECOVERY_CONF_FILENAME" RECOVERY_CONF="$PG_MAIN/$RECOVERY_CONF_FILENAME"
else else
RECOVERY_CONF_FILENAME=$( basename "$RECOVERY_CONF" ) RECOVERY_CONF_FILENAME=$( basename "$RECOVERY_CONF" )
@ -208,15 +221,17 @@ fi
[[ -z "$PG_DB" ]] && PG_DB="$PG_USER" [[ -z "$PG_DB" ]] && PG_DB="$PG_USER"
function psql_get () { function psql_get () {
sql="$1" local sql="$*"
debug "Exec 'echo \"$sql\"|sudo -u $PG_USER $PSQL_BIN -d \"$PG_DB\" -w -t -P format=unaligned" debug "Exec 'sudo -u $PG_USER $PSQL_BIN -d \"$PG_DB\" -w -t -P format=unaligned <<< \"$sql\""
sudo -u "$PG_USER" "$PSQL_BIN" -d "$PG_DB" -w -t -P format=unaligned <<< "$sql" sudo -u "$PG_USER" "$PSQL_BIN" -d "$PG_DB" -w -t -P format=unaligned <<< "$sql"
} }
function psql_master_get () { function psql_master_get () {
sql="$1" local sql="$*"
debug "Exec 'echo \"$sql\"|sudo -u $PG_USER $PSQL_BIN -U $M_USER -h $M_HOST -w -p $M_PORT -d $PG_DB -t -P format=unaligned" debug "Exec 'sudo -u $PG_USER $PSQL_BIN -U $M_USER -h $M_HOST -w -p $M_PORT -d $PG_DB -t" \
sudo -u "$PG_USER" "$PSQL_BIN" -U "$M_USER" -h "$M_HOST" -w -p "$M_PORT" -d "$PG_DB" -t -P format=unaligned <<< "$sql" "-P format=unaligned <<< \"$sql\""
sudo -u "$PG_USER" "$PSQL_BIN" \
-U "$M_USER" -h "$M_HOST" -w -p "$M_PORT" -d "$PG_DB" -t -P format=unaligned <<< "$sql"
} }
debug "Running options: debug "Running options:
@ -229,7 +244,6 @@ PG_MAIN = $PG_MAIN
RECOVERY_CONF = $RECOVERY_CONF RECOVERY_CONF = $RECOVERY_CONF
PG_DEFAULT_PORT = $PG_DEFAULT_PORT PG_DEFAULT_PORT = $PG_DEFAULT_PORT
PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME PG_DEFAULT_APP_NAME = $PG_DEFAULT_APP_NAME
CHECK_CUR_MASTER_LSN = $CHECK_CUR_MASTER_LSN
REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY REPLAY_WARNING_DELAY = $REPLAY_WARNING_DELAY
REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY REPLAY_CRITICAL_DELAY = $REPLAY_CRITICAL_DELAY
" "
@ -273,7 +287,8 @@ if [[ "$EXPECTED_MODE" == "auto" ]]; then
if [[ $RECOVERY_MODE -eq 1 ]]; then if [[ $RECOVERY_MODE -eq 1 ]]; then
debug "Postgres is in recovery mode. Hot-standby mode." debug "Postgres is in recovery mode. Hot-standby mode."
EXPECTED_MODE="hot-standby" EXPECTED_MODE="hot-standby"
elif [[ -f $RECOVERY_CONF ]] && [[ $( grep -cE '^\s*primary_conninfo' "$RECOVERY_CONF" ) -gt 0 ]]; then elif [[ -f $RECOVERY_CONF ]] && \
[[ $( grep -cE '^\s*primary_conninfo' "$RECOVERY_CONF" ) -gt 0 ]]; then
debug "File $RECOVERY_CONF_FILENAME found and contain primary_conninfo. Hot-standby mode." debug "File $RECOVERY_CONF_FILENAME found and contain primary_conninfo. Hot-standby mode."
EXPECTED_MODE="hot-standby" EXPECTED_MODE="hot-standby"
else else
@ -301,19 +316,24 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
# Get master connection information from primary_conninfo configuration parameter # Get master connection information from primary_conninfo configuration parameter
MASTER_CONN_INFOS=$( psql_get "SHOW primary_conninfo" ) MASTER_CONN_INFOS=$( psql_get "SHOW primary_conninfo" )
if [[ -z "$MASTER_CONN_INFOS" ]]; then if [[ -z "$MASTER_CONN_INFOS" ]]; then
echo "UNKNOWN: Can't retrieve master connection information from primary_conninfo configuration parameter" echo "UNKNOWN: Can't retrieve master connection information from primary_conninfo" \
"configuration parameter"
exit 3 exit 3
fi fi
debug "Master connection information: $MASTER_CONN_INFOS" debug "Master connection information: $MASTER_CONN_INFOS"
M_HOST=$( grep 'host=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*host= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' ) M_HOST=$(
grep 'host=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*host= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_HOST" ]]; then if [[ -z "$M_HOST" ]]; then
echo "UNKNOWN: Can't retrieve master host from primary_conninfo configuration parameter" echo "UNKNOWN: Can't retrieve master host from primary_conninfo configuration parameter"
exit 3 exit 3
fi fi
debug "Master host: $M_HOST" debug "Master host: $M_HOST"
M_PORT=$( grep 'port=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*port= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' ) M_PORT=$(
grep 'port=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*port= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_PORT" ]]; then if [[ -z "$M_PORT" ]]; then
debug "Master port not specified, use default: $PG_DEFAULT_PORT" debug "Master port not specified, use default: $PG_DEFAULT_PORT"
M_PORT=$PG_DEFAULT_PORT M_PORT=$PG_DEFAULT_PORT
@ -325,7 +345,9 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Master user provided by command-line, use it: $PG_MASTER_USER" debug "Master user provided by command-line, use it: $PG_MASTER_USER"
M_USER="$PG_MASTER_USER" M_USER="$PG_MASTER_USER"
else else
M_USER=$( grep 'user=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*user= *\([0-9a-zA-Z.-]\+\) *.*$/\1/' ) M_USER=$(
grep 'user=' <<< "$MASTER_CONN_INFOS" | sed 's/^.*user= *\([0-9a-zA-Z.-]\+\) *.*$/\1/'
)
if [[ -z "$M_USER" ]]; then if [[ -z "$M_USER" ]]; then
debug "Master user not specified, use default: $PG_USER" debug "Master user not specified, use default: $PG_USER"
M_USER=$PG_USER M_USER=$PG_USER
@ -334,7 +356,10 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
fi fi
fi fi
M_APP_NAME=$( grep 'application_name=' <<< "$MASTER_CONN_INFOS" | sed "s/^.*application_name=[ \'\"]*\([^ \'\"]\+\)[ \'\"]*.*$/\1/" ) M_APP_NAME=$(
grep 'application_name=' <<< "$MASTER_CONN_INFOS" |
sed "s/^.*application_name=[ \'\"]*\([^ \'\"]\+\)[ \'\"]*.*$/\1/"
)
if [[ -z "$M_APP_NAME" ]]; then if [[ -z "$M_APP_NAME" ]]; then
if [[ $PG_VERSION -ge 12 ]]; then if [[ $PG_VERSION -ge 12 ]]; then
debug "Master application name not specified, use cluster_name if defined" debug "Master application name not specified, use cluster_name if defined"
@ -354,25 +379,45 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Master application name: $M_APP_NAME" debug "Master application name: $M_APP_NAME"
fi fi
# Check if master is configured for synchronous commit
SYNC_MODE="$(
psql_master_get "SELECT setting from pg_settings WHERE name = 'synchronous_commit';"
)"
debug "Master synchronous_commit=$SYNC_MODE"
if [[ "$SYNC_MODE" == "on" ]] || [[ "$SYNC_MODE" == "remote_apply" ]]; then
debug "Master is configured for synchronous commit"
SYNCHRONOUS_COMMIT=1
else
debug "Master is not configured for synchronous commit"
SYNCHRONOUS_COMMIT=0
fi
# Get current replication state information from master # Get current replication state information from master
M_CUR_REPL_STATE_INFO="$( psql_master_get "SELECT state, sync_state, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn FROM pg_stat_replication WHERE application_name='$M_APP_NAME';" )" M_CUR_REPL_STATE_INFO="$(
psql_master_get \
"SELECT state, sync_state, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn" \
"FROM pg_stat_replication WHERE application_name='$M_APP_NAME';"
)"
if [[ -z "$M_CUR_REPL_STATE_INFO" ]]; then if [[ -z "$M_CUR_REPL_STATE_INFO" ]]; then
echo "UNKNOWN: Can't retrieve current replication state information from master server" echo "UNKNOWN: Can't retrieve current replication state information from master server"
exit 3 exit 3
fi fi
debug "Master current replication state:\n\tstate|sync_state|sent_lsn|write_lsn\n\t$M_CUR_REPL_STATE_INFO" debug "Master current replication state:\n" \
"\tstate|sync_state|sent_lsn|write_lsn\n\t$M_CUR_REPL_STATE_INFO"
M_CUR_STATE=$( cut -d'|' -f1 <<< "$M_CUR_REPL_STATE_INFO" ) M_CUR_STATE=$( cut -d'|' -f1 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current state: $M_CUR_STATE" debug "Master current state: $M_CUR_STATE"
if [[ "$M_CUR_STATE" != "streaming" ]]; then if [[ "$M_CUR_STATE" != "streaming" ]]; then
echo "CRITICAL: this host is not in streaming state according to master host (current state = '$M_CUR_STATE')" echo "CRITICAL: this host is not in streaming state according to master host" \
"(current state = '$M_CUR_STATE')"
exit 2 exit 2
fi fi
M_CUR_SYNC_STATE=$( cut -d'|' -f2 <<< "$M_CUR_REPL_STATE_INFO" ) M_CUR_SYNC_STATE=$( cut -d'|' -f2 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current sync state: $M_CUR_SYNC_STATE" debug "Master current sync state: $M_CUR_SYNC_STATE"
if [[ "$M_CUR_SYNC_STATE" != "$EXPECTED_SYNC_STATE" ]]; then if [[ "$M_CUR_SYNC_STATE" != "$EXPECTED_SYNC_STATE" ]]; then
echo "CRITICAL: unexpected replication state '$M_CUR_SYNC_STATE' (expected state = '$EXPECTED_SYNC_STATE')" echo "CRITICAL: unexpected replication state '$M_CUR_SYNC_STATE'" \
"(expected state = '$EXPECTED_SYNC_STATE')"
exit 2 exit 2
fi fi
@ -380,35 +425,24 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
M_CUR_WRITED_LSN=$( cut -d'|' -f4 <<< "$M_CUR_REPL_STATE_INFO" ) M_CUR_WRITED_LSN=$( cut -d'|' -f4 <<< "$M_CUR_REPL_STATE_INFO" )
debug "Master current last sent/writed LSN: '$M_CUR_SENT_LSN' / '$M_CUR_WRITED_LSN'" debug "Master current last sent/writed LSN: '$M_CUR_SENT_LSN' / '$M_CUR_WRITED_LSN'"
# Check current master LSN vs last received LSN
if [[ "$CHECK_CUR_MASTER_LSN" == "1" ]]; then
# Get current LSN from master
M_CUR_LSN="$( psql_master_get "SELECT $pg_current_wal_lsn" )"
if [[ -z "$M_CUR_LSN" ]]; then
echo "UNKNOWN: Can't retrieve current LSN from master server"
exit 3
fi
debug "Master current LSN: $M_CUR_LSN"
# Master current LSN is the last received LSN ?
if [[ "$M_CUR_LSN" != "$LAST_RECEIVED_LSN" ]]; then
echo "CRITICAL: Master current LSN is not the last received LSN"
exit 2
fi
debug "Master current LSN is the last received LSN"
fi
# The last received LSN is the last replayed ? # The last received LSN is the last replayed ?
if [[ "$LAST_RECEIVED_LSN" != "$LAST_REPLAYED_LSN" ]]; then if [[ "$LAST_RECEIVED_LSN" != "$LAST_REPLAYED_LSN" ]]; then
debug "/!\ The last received LSN is NOT the last replayed LSN ('$M_CUR_LSN' / '$LAST_REPLAYED_LSN')" debug "/!\ The last received LSN is NOT the last replayed LSN" \
REPLAY_DELAY="$( psql_get 'SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp());' )" "('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN')"
REPLAY_DELAY="$(
psql_get 'SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp());'
)"
debug "Replay delay is $REPLAY_DELAY second(s)" debug "Replay delay is $REPLAY_DELAY second(s)"
if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_CRITICAL_DELAY" ) -gt 0 ]]; then if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_CRITICAL_DELAY" ) -gt 0 ]]; then
echo "CRITICAL: last received LSN is not the last replayed ('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and replay delay is $REPLAY_DELAY second(s)" echo "CRITICAL: last received LSN is not the last replayed" \
"('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and" \
"replay delay is $REPLAY_DELAY second(s)"
exit 2 exit 2
fi fi
if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_WARNING_DELAY" ) -gt 0 ]]; then if [[ $( bc -l <<< "$REPLAY_DELAY >= $REPLAY_WARNING_DELAY" ) -gt 0 ]]; then
echo "WARNING: last received LSN is not the last replay file ('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and replay delay is $REPLAY_DELAY second(s)" echo "WARNING: last received LSN is not the last replay file" \
"('$LAST_RECEIVED_LSN' / '$LAST_REPLAYED_LSN') and" \
"replay delay is $REPLAY_DELAY second(s)"
exit 1 exit 1
fi fi
debug "Replay delay is not worrying" debug "Replay delay is not worrying"
@ -416,10 +450,27 @@ if [[ "$EXPECTED_MODE" == "hot-standby" ]]; then
debug "Last received LSN is the last replayed file" debug "Last received LSN is the last replayed file"
# The master last sent LSN is the last received (and synced) ? # The master last sent LSN is the last received (and synced) ?
if [[ "$M_CUR_SENT_LSN" != "$LAST_RECEIVED_LSN" ]]; then if [[ $SYNCHRONOUS_COMMIT -eq 1 ]] && [[ "$M_CUR_SENT_LSN" != "$LAST_RECEIVED_LSN" ]]; then
echo "WARNING: master last sent LSN is not already received (and synced to disk) by slave. May be we have some network delay or load on slave" LSN_DIFF=$(
psql_master_get "SELECT $pg_wal_lsn_diff('$M_CUR_SENT_LSN', '$LAST_RECEIVED_LSN');"
)
debug "LSN diff ('$M_CUR_SENT_LSN' vs '$LAST_RECEIVED_LSN'): $LSN_DIFF bytes"
echo "WARNING: master last sent LSN is not already received (and synced to disk) by slave" \
"(diff: $LSN_DIFF bytes). May be we have some network delay or load on slave"
echo "Master last sent LSN: $M_CUR_SENT_LSN" echo "Master last sent LSN: $M_CUR_SENT_LSN"
echo "Slave last received (and synced to disk) LSN: $LAST_RECEIVED_LSN" echo "Slave last received (and synced to disk) LSN: $LAST_RECEIVED_LSN"
echo "Diff: $LSN_DIFF bytes"
exit 1
elif [[ $SYNCHRONOUS_COMMIT -eq 0 ]] && [[ "$M_CUR_SENT_LSN" != "$M_CUR_WRITED_LSN" ]]; then
LSN_DIFF=$(
psql_master_get "SELECT $pg_wal_lsn_diff('$M_CUR_SENT_LSN', '$M_CUR_WRITED_LSN');"
)
debug "LSN diff ('$M_CUR_SENT_LSN' vs '$M_CUR_WRITED_LSN'): $LSN_DIFF bytes"
echo "WARNING: master last sent LSN is not already received by slave" \
"(diff: $LSN_DIFF bytes). May be we have some network delay or load on slave"
echo "Master last sent LSN: $M_CUR_SENT_LSN"
echo "Slave last received LSN: $M_CUR_WRITED_LSN"
echo "Diff: $LSN_DIFF bytes"
exit 1 exit 1
fi fi
@ -444,28 +495,44 @@ elif [[ "$EXPECTED_MODE" == "master" ]]; then
fi fi
debug "Current LSN: $CURRENT_LSN" debug "Current LSN: $CURRENT_LSN"
# Check if master is configured for synchronous commit
SYNC_MODE="$( psql_get "SELECT setting from pg_settings WHERE name = 'synchronous_commit';" )"
debug "synchronous_commit=$SYNC_MODE"
if [[ "$SYNC_MODE" == "on" ]] || [[ "$SYNC_MODE" == "remote_apply" ]]; then
debug "Master is configured for synchronous commit"
SYNCHRONOUS_COMMIT=1
else
debug "Master is not configured for synchronous commit"
SYNCHRONOUS_COMMIT=0
fi
# Check standby client # Check standby client
STANDBY_CLIENTS=$( psql_get "SELECT application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag STANDBY_CLIENTS=$(
FROM ( psql_get \
SELECT application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag "SELECT
FROM ( application_name, client_addr, sent_lsn, write_lsn, state, sync_state, current_lag
SELECT application_name, client_addr, $sent_lsn AS sent_lsn, $write_lsn AS write_lsn, state, sync_state, FROM (
$pg_wal_lsn_diff($pg_current_wal_lsn, $write_lsn) AS current_lag SELECT
FROM pg_stat_replication application_name, client_addr, sent_lsn, write_lsn, state, sync_state,
) AS s2 current_lag
) AS s1" ) FROM (
SELECT
application_name, client_addr, $sent_lsn AS sent_lsn,
$write_lsn AS write_lsn, state, sync_state,
$pg_wal_lsn_diff($pg_current_wal_lsn, $write_lsn) AS current_lag
FROM pg_stat_replication
) AS s2
) AS s1"
)
if [[ -z "$STANDBY_CLIENTS" ]]; then if [[ -z "$STANDBY_CLIENTS" ]]; then
echo "WARNING: no stand-by client connected" echo "WARNING: no stand-by client connected"
exit 1 exit 1
fi fi
debug "Stand-by client(s):\n\t${STANDBY_CLIENTS//$'\n'/\\n\\t}" debug "Stand-by client(s):\n\t${STANDBY_CLIENTS//$'\n'/\\n\\t}"
STANDBY_CLIENTS_TXT="" STANDBY_CLIENTS_ROWS=()
STANDBY_CLIENTS_COUNT=0
CURRENT_LSN_IS_LAST_SENT=1 CURRENT_LSN_IS_LAST_SENT=1
for line in $STANDBY_CLIENTS; do for line in $STANDBY_CLIENTS; do
(( STANDBY_CLIENTS_COUNT+=1 ))
NAME=$( cut -d '|' -f 1 <<< "$line" ) NAME=$( cut -d '|' -f 1 <<< "$line" )
IP=$( cut -d '|' -f 2 <<< "$line" ) IP=$( cut -d '|' -f 2 <<< "$line" )
SENT_LSN=$( cut -d '|' -f 3 <<< "$line" ) SENT_LSN=$( cut -d '|' -f 3 <<< "$line" )
@ -473,20 +540,28 @@ elif [[ "$EXPECTED_MODE" == "master" ]]; then
STATE=$( cut -d '|' -f 5 <<< "$line" ) STATE=$( cut -d '|' -f 5 <<< "$line" )
SYNC_STATE=$( cut -d '|' -f 6 <<< "$line" ) SYNC_STATE=$( cut -d '|' -f 6 <<< "$line" )
LAG=$( cut -d '|' -f 7 <<< "$line" ) LAG=$( cut -d '|' -f 7 <<< "$line" )
STANDBY_CLIENTS_TXT="$STANDBY_CLIENTS_TXT\n$NAME ($IP): $STATE/$SYNC_STATE (LSN: sent='$SENT_LSN' / writed='$WRITED_LSN', Lag: ${LAG}b)" STANDBY_CLIENTS_ROW="$NAME ($IP): $STATE/$SYNC_STATE"
[[ "$SENT_LSN" != "$CURRENT_LSN" ]] && CURRENT_LSN_IS_LAST_SENT=0 STANDBY_CLIENTS_ROW+=" (LSN: sent='$SENT_LSN' / writed='$WRITED_LSN', Lag: ${LAG}b)"
STANDBY_CLIENTS_ROWS+=( "$STANDBY_CLIENTS_ROW" )
if [[ $SYNCHRONOUS_COMMIT -eq 1 ]] && [[ "$SENT_LSN" != "$CURRENT_LSN" ]]; then
CURRENT_LSN_IS_LAST_SENT=0
elif [[ $SYNCHRONOUS_COMMIT -eq 0 ]] && [[ "$SENT_LSN" != "$WRITED_LSN" ]]; then
CURRENT_LSN_IS_LAST_SENT=0
fi
done done
if [[ $CURRENT_LSN_IS_LAST_SENT -eq 1 ]]; then if [[ $CURRENT_LSN_IS_LAST_SENT -eq 1 ]]; then
echo "OK: $STANDBY_CLIENTS_COUNT stand-by client(s) connected" echo "OK: ${#STANDBY_CLIENTS_ROWS[@]} stand-by client(s) connected"
EXIT_CODE=0 EXIT_CODE=0
else else
echo "WARNING: current master LSN is not the last sent to stand-by client(s) connected. May be we have some load ?" echo "WARNING: current master LSN is not the last sent to stand-by client(s) connected." \
"May be we have some load ?"
EXIT_CODE=1 EXIT_CODE=1
fi fi
echo "Current master LSN: $CURRENT_LSN" echo "Current master LSN: $CURRENT_LSN"
echo -e "$STANDBY_CLIENTS_TXT" IFS=$'\n'
echo "${STANDBY_CLIENTS_ROWS[*]}"
exit $EXIT_CODE exit $EXIT_CODE
else else
echo "UNKNOWN - Invalid mode '$EXPECTED_MODE'" echo "UNKNOWN - Invalid mode '$EXPECTED_MODE'"

2
debian/control vendored

@ -2,7 +2,7 @@ Source: check-pg-streaming-replication
Section: admin Section: admin
Priority: optional Priority: optional
Maintainer: Debian Zionetrix - check-pg-streaming-replication <debian+check-pg-streaming-replication@zionetrix.net> Maintainer: Debian Zionetrix - check-pg-streaming-replication <debian+check-pg-streaming-replication@zionetrix.net>
Build-Depends: debhelper (>> 11.0.0) Build-Depends: debhelper (>> 11.0.0), findutils, rsync, sed, awk, git, gitdch
Standards-Version: 3.9.6 Standards-Version: 3.9.6
Package: check-pg-streaming-replication Package: check-pg-streaming-replication