PostgreSQL pg_auto_failover 高可用 1:pg_auto_failover集群搭建
PostgreSQL pg_auto_failover 高可用 1:pg_auto_failover集群搭建
PostgreSQL pg_auto_failover 高可用 2:pg_auto_failover集群運(yùn)維
0,pg_auto_failover架構(gòu)
開始之前,先看一下pg_auto_failover的一個(gè)最基礎(chǔ)架構(gòu)原理如下,需要弄清楚幾個(gè)節(jié)點(diǎn)的作用
1,monitor節(jié)點(diǎn)的身份是一個(gè)監(jiān)控節(jié)點(diǎn),僅存儲(chǔ)元數(shù)據(jù),負(fù)責(zé)監(jiān)控primary和secondary的健康狀況以及異常情況下的故障轉(zhuǎn)移。
2,monitor不負(fù)責(zé)存儲(chǔ)用戶數(shù)據(jù),負(fù)責(zé)存儲(chǔ)用戶數(shù)據(jù)的是primary和secondary節(jié)點(diǎn)。
3,monitor節(jié)點(diǎn)是一個(gè)單點(diǎn),存在單點(diǎn)故障的可能性,這是pg_auto_failover的硬傷,但monitor節(jié)點(diǎn)故障后不影響primary和secondary的運(yùn)行。
4,primary和secondary節(jié)點(diǎn)是用戶數(shù)據(jù)庫的存儲(chǔ)節(jié)點(diǎn),先注冊(cè)到monitor中的節(jié)點(diǎn)為主節(jié)點(diǎn),后注冊(cè)到monitor的節(jié)點(diǎn)為從節(jié)點(diǎn),正常注冊(cè)后實(shí)現(xiàn)流復(fù)制,其身份可以互換。

1,環(huán)境準(zhǔn)備
1.1 主機(jī)環(huán)境
monitor: 192.168.152.121 ubuntu11
主: 192.168.152.122 ubuntu12
從: 192.168.152.123 ubuntu13
1.2 三臺(tái)主機(jī)同步修改,操作一致
2,pg_auto_failover源碼編譯安裝
2.1 下載源碼包,編譯安裝
--下載pg_auto_failover源碼包
cd /usr/local/pg_auto_failover
wget https://github.com/hapostgres/pg_auto_failover/archive/refs/tags/v2.2.zip
apt install unzip
unzip v2.2.zip
--解壓后路徑如下
drwxr-xr-x 3 root root 4096 Jun 4 16:27 ./
drwxr-xr-x 14 root root 4096 Jun 4 16:26 ../
drwxr-xr-x 8 root root 4096 Apr 3 20:05 pg_auto_failover-2.2/
-rw-r--r-- 1 root root 1364027 Jun 4 16:26 v2.2.zip
--這里的安裝,實(shí)際上將pg_auto_failover的編譯文件,安裝到上面配置postgres的環(huán)境變量的指定的路徑中
cd pg_auto_failover-2.2/
make
make install
--安裝完成后,重新將PGHOME相關(guān)目錄授權(quán)給postgres用戶,否則后續(xù)使用pg_autoctl的時(shí)候會(huì)報(bào)找不到命令的錯(cuò)誤
chown -R postgres:postgres /usr/local/pgsql16/
2.2 編譯安裝的pg_auto_failover位置
實(shí)際上pg_auto_failover的編譯安裝之后,把文件存放在上述環(huán)境變量的PGHOME=/usr/local/pgsql16/server目錄的bin和lib目錄下


3,monitor節(jié)點(diǎn)配置pg_auto_failover
3.1 pg_auto_failover初始化
postgres@ubuntu11:/usr/local/pg_install_packgae$ pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/ --auth trust --ssl-self-signed --hostname ubuntu11 --pgport 9300
pg_autoctl: command not found
postgres@ubuntu11:/usr/local/pg_install_packgae$
postgres@ubuntu11:/usr/local/pg_install_packgae$
postgres@ubuntu11:/usr/local/pg_install_packgae$ /usr/local/pgsql16/server/bin/pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/ --auth trust --ssl-self-signed --hostname 127.0.0.1 --pgport 9300
10:47:45 2821 ERROR Failed to create state directory "/run/user/0/pg_autoctl": Permission denied
10:47:45 2821 ERROR Failed to build pg_autoctl pid file pathname, see above.
10:47:45 2821 FATAL Failed to set pid filename from PGDATA "/usr/local/pgsql16/pg9300/data/", see above for details.
postgres@ubuntu11:/usr/local/pg_install_packgae$
postgres@ubuntu11:/usr/local/pg_install_packgae$
postgres@ubuntu11:/usr/local/pg_install_packgae$ exit
exit
root@ubuntu11:/usr/local/pg_install_packgae# sudo chmod -R 777 /run/user/0
root@ubuntu11:/usr/local/pg_install_packgae# sudo chown -R postgres:postgres /run/user/0
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# su - postgres #切換到postgres用戶下執(zhí)行
postgres@ubuntu11:~$
postgres@ubuntu11:~$
postgres@ubuntu11:~$
postgres@ubuntu11:~$ pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/ --auth trust --ssl-self-signed --hostname ubuntu11 --pgport 9300 --run
05:04:27 2216 INFO Using default --ssl-mode "require"
05:04:27 2216 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
05:04:27 2216 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
05:04:27 2216 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
05:04:27 2216 INFO Initialising a PostgreSQL cluster at "/usr/local/pgsql16/pg9300/data"
05:04:27 2216 INFO /usr/local/pgsql16/server/bin/pg_ctl initdb -s -D /usr/local/pgsql16/pg9300/data --option '--auth=trust'
05:04:29 2216 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu11"
05:04:29 2216 INFO Started pg_autoctl postgres service with pid 2238
05:04:29 2216 INFO Started pg_autoctl listener service with pid 2239
05:04:29 2238 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
05:04:29 2243 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
05:04:29 2238 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2243
05:04:29 2239 WARN NOTICE: installing required extension "btree_gist"
05:04:29 2239 INFO Granting connection privileges on 192.168.152.0/24
05:04:29 2239 WARN Skipping HBA edits (per --skip-pg-hba) for rule: hostssl "pg_auto_failover" "autoctl_node" 192.168.152.0/24 trust
05:04:29 2239 INFO Your pg_auto_failover monitor instance is now ready on port 9300.
05:04:29 2239 INFO Monitor has been successfully initialized.
05:04:29 2239 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service listener --pgdata /usr/local/pgsql16/pg9300/data/ -v
05:04:29 2239 INFO Managing the monitor at postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
05:04:29 2239 INFO Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
05:04:29 2239 INFO Reloading Postgres configuration and HBA rules
05:04:30 2239 INFO The version of extension "pgautofailover" is "2.2" on the monitor
05:04:30 2239 INFO Contacting the monitor to LISTEN to its events.
3.2 pg_auto_failover systemctl腳本配置
postgres@ubuntu11:~$ exit #切換到root用戶執(zhí)行
logout
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show systemd
05:06:26 2333 INFO HINT: to complete a systemd integration, run the following commands (as root):
05:06:26 2333 INFO pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
05:06:26 2333 INFO systemctl daemon-reload
05:06:26 2333 INFO systemctl enable pgautofailover
05:06:26 2333 INFO systemctl start pgautofailover
[Unit]
Description = pg_auto_failover
[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
[Install]
WantedBy = multi-user.target
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover
[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
[Install]
WantedBy = multi-user.target
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# systemctl daemon-reload
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# systemctl enable pgautofailover
Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# systemctl start pgautofailover
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# systemctl status pgautofailover
● pgautofailover.service - pg_auto_failover
Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2025-10-09 05:06:43 UTC; 5s ago
Main PID: 2421 (pg_autoctl)
Tasks: 14 (limit: 4550)
Memory: 29.2M
CGroup: /system.slice/pgautofailover.service
├─2421 /usr/local/pgsql16/server/bin/pg_autoctl run
├─2444 pg_autoctl: start/stop postgres
├─2445 pg_autoctl: monitor listener
├─2454 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
├─2455 postgres: pg_auto_failover monitor: logger
├─2456 postgres: pg_auto_failover monitor: checkpointer
├─2457 postgres: pg_auto_failover monitor: background writer
├─2459 postgres: pg_auto_failover monitor: walwriter
├─2460 postgres: pg_auto_failover monitor: autovacuum launcher
├─2461 postgres: pg_auto_failover monitor: pg_auto_failover monitor
├─2462 postgres: pg_auto_failover monitor: logical replication launcher
├─2463 postgres: pg_auto_failover monitor: pg_auto_failover monitor healthcheck worker postgres
├─2464 postgres: pg_auto_failover monitor: pg_auto_failover monitor healthcheck worker pg_auto_failover
└─2466 postgres: pg_auto_failover monitor: autoctl_node pg_auto_failover [local] idle
Oct 09 05:06:43 ubuntu11 pg_autoctl[2421]: 05:06:43 2421 INFO Started pg_autoctl postgres service with pid 2444
Oct 09 05:06:43 ubuntu11 pg_autoctl[2421]: 05:06:43 2421 INFO Started pg_autoctl listener service with pid 2445
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service listener --pgdata /usr/local/pgsql16/pg9300/data -v
Oct 09 05:06:43 ubuntu11 pg_autoctl[2444]: 05:06:43 2444 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data -v
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO Managing the monitor at postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
Oct 09 05:06:43 ubuntu11 pg_autoctl[2454]: 05:06:43 2454 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
Oct 09 05:06:43 ubuntu11 pg_autoctl[2444]: 05:06:43 2444 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2454
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO The version of extension "pgautofailover" is "2.2" on the monitor
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO Contacting the monitor to LISTEN to its events.
root@ubuntu11:/usr/local/pg_install_packgae#
3.3 pg_auto_failover 初始化驗(yàn)證

root@ubuntu11:/usr/local/pg_install_packgae# psql -h ubuntu11 -p 9300 postgres postgres
psql (16.4)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.
#1,自動(dòng)安裝了pg_auto_failover庫
postgres=# \l
List of databases
Name | Owner | Encoding | Locale Provider | Collate | Ctype | ICU Locale | ICU Rules | Access privileges
------------------+----------+----------+-----------------+------------+------------+------------+-----------+-----------------------
pg_auto_failover | autoctl | UTF8 | libc | en_US.utf8 | en_US.utf8 | | |
postgres | postgres | UTF8 | libc | en_US.utf8 | en_US.utf8 | | |
template0 | postgres | UTF8 | libc | en_US.utf8 | en_US.utf8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | libc | en_US.utf8 | en_US.utf8 | | | =c/postgres +
| | | | | | | | postgres=CTc/postgres
(4 rows)
#2,自動(dòng)創(chuàng)建了autoctl和autoctl_node兩個(gè)角色
postgres=# \du
List of roles
Role name | Attributes
--------------+------------------------------------------------------------
autoctl |
autoctl_node |
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS
#3,自動(dòng)安裝了如下兩個(gè)擴(kuò)展
postgres=# \c pg_auto_failover
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
You are now connected to database "pg_auto_failover" as user "postgres".
pg_auto_failover=# \dx
List of installed extensions
Name | Version | Schema | Description
----------------+---------+------------+-----------------------------------------------
btree_gist | 1.7 | public | support for indexing common datatypes in GiST
pgautofailover | 2.2 | public | pg_auto_failover
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(3 rows)
pg_auto_failover=#
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show uri
Type | Name | Connection String
-------------+---------+-------------------------------
monitor | monitor | postgres://autoctl_node@127.0.0.1:9300/pg_auto_failover?sslmode=require #一開始我懵逼了,為什么連接串里是127.0.0.1
formation | default |
root@ubuntu11:/usr/local/pg_install_packgae# su - postgres
postgres@ubuntu11:~$
postgres@ubuntu11:~$ source /etc/profile
postgres@ubuntu11:~$
postgres@ubuntu11:~$ pg_autoctl show uri
Type | Name | Connection String
-------------+---------+-------------------------------
monitor | monitor | postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require #后來嘗試切換到postgres用戶下查看,竟然變成了主機(jī)名,搞什么鬼哦
formation | default |
postgres@ubuntu11:~$
4,主節(jié)點(diǎn)和從節(jié)點(diǎn)依次加入pg_auto_failover集群
4.1 注冊(cè)主節(jié)點(diǎn)中遇到的異常處理


postgres@ubuntu12:/root$ pg_autoctl create postgres --hostname ubuntu12 --name ubuntu12 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
03:09:00 58370 ERROR Failed to create state directory "/run/user/0/pg_autoctl": Permission denied
03:09:00 58370 ERROR Failed to build pg_autoctl pid file pathname, see above.
03:09:00 58370 FATAL Failed to set pid filename from PGDATA "/usr/local/pgsql16/pg9300/data/", see above for details.
postgres@ubuntu12:/root$
postgres@ubuntu12:/root$
postgres@ubuntu12:/root$ exit
exit
root@ubuntu12:~# sudo chmod -R 777 /run/user/0
root@ubuntu12:~# sudo chown -R postgres:postgres /run/user/0
root@ubuntu11:/usr/local/pg_install_packgae# psql -h 127.0.0.1 -p 9300 postgres postgres
psql (16.4)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.
postgres=# \c pg_auto_failover
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
You are now connected to database "pg_auto_failover" as user "postgres".
pg_auto_failover=#
pg_auto_failover=# SELECT nodeid, nodename, nodehost, nodeport, goalstate, reportedstate FROM pgautofailover.node;
nodeid | nodename | nodehost | nodeport | goalstate | reportedstate
--------+----------+----------+----------+-----------+---------------
1 | ubuntu12 | ubuntu12 | 9300 | single | single
(1 row)
pg_auto_failover=# delete from pgautofailover.node where nodeid = 1;
DELETE 1
pg_auto_failover=#
pg_auto_failover=#
root@ubuntu11:/usr/local/pg_install_packgae# su - postgres
postgres@ubuntu11:~$
postgres@ubuntu11:~$ source /etc/profile
postgres@ubuntu11:~$
postgres@ubuntu11:~$ pg_autoctl show uri #這樣monitor節(jié)點(diǎn)恢復(fù)成原始狀態(tài),解決錯(cuò)誤后再重新注冊(cè)主/從節(jié)點(diǎn)
Type | Name | Connection String
-------------+---------+-------------------------------
monitor | monitor | postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require
formation | default |
postgres@ubuntu11:~$
本地清理 keeper state,再強(qiáng)制注冊(cè)
本地 pg_autoctl 會(huì)維護(hù)一個(gè)狀態(tài)文件(通常在 ~/.local/share/pg_autoctl/.../pg_autoctl.state)。
如果你只是清理 pgdata,但沒有清理這個(gè)文件,再次注冊(cè)會(huì)沖突。
解決:
rm -rf /home/postgres/.local/share/pg_autoctl
然后重新運(yùn)行 pg_autoctl create,這樣它會(huì)認(rèn)為是一個(gè)全新節(jié)點(diǎn)。
4.2 注冊(cè)主節(jié)點(diǎn)
/usr/local/pgsql16/server/bin/pg_autoctl create postgres \
--hostname ubuntu12 \
--name ubuntu12 \
--pgport 9300 \
--auth trust \
--ssl-self-signed \
--monitor 'postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require'
root@ubuntu12:~# su - postgres
postgres@ubuntu12:~$
postgres@ubuntu12:~$
postgres@ubuntu12:~$
postgres@ubuntu12:~$
postgres@ubuntu12:~$ pg_autoctl create postgres --hostname ubuntu12 --name ubuntu12 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
05:17:27 2094 INFO Using default --ssl-mode "require"
05:17:27 2094 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
05:17:27 2094 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
05:17:27 2094 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
05:17:27 2094 INFO Started pg_autoctl postgres service with pid 2096
05:17:27 2094 INFO Started pg_autoctl node-init service with pid 2097
05:17:27 2096 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
05:17:27 2097 INFO Registered node 52 "ubuntu12" (ubuntu12:9300) in formation "default", group 0, state "single"
05:17:27 2097 INFO Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.state"
05:17:27 2097 INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.init"
05:17:27 2097 INFO Successfully registered as "single" to the monitor.
05:17:27 2097 INFO FSM transition from "init" to "single": Start as a single node
05:17:27 2097 INFO Initialising postgres as a primary
05:17:27 2097 INFO Initialising a PostgreSQL cluster at "/usr/local/pgsql16/pg9300/data"
05:17:27 2097 INFO /usr/local/pgsql16/server/bin/pg_ctl initdb -s -D /usr/local/pgsql16/pg9300/data --option '--auth=trust'
05:17:27 2097 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu12"
05:17:28 2122 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
05:17:28 2096 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2122
05:17:28 2097 INFO The user "postgres" already exists, skipping.
05:17:28 2097 INFO CREATE USER postgres
05:17:28 2097 INFO CREATE DATABASE postgres;
05:17:28 2097 INFO The database "postgres" already exists, skipping.
05:17:28 2097 INFO CREATE EXTENSION pg_stat_statements;
05:17:28 2097 INFO Disabling synchronous replication
05:17:28 2097 INFO Reloading Postgres configuration and HBA rules
05:17:28 2097 WARN Failed to resolve hostname "Ubuntu11" to an IP address that resolves back to the hostname on a reverse DNS lookup.
05:17:28 2097 WARN Postgres might deny connection attempts from "Ubuntu11", even with the new HBA rules.
05:17:28 2097 WARN Hint: correct setup of HBA with host names requires proper reverse DNS setup. You might want to use IP addresses.
05:17:28 2097 WARN Using IP address "192.168.152.121" in HBA file instead of hostname "Ubuntu11"
05:17:28 2097 INFO Reloading Postgres configuration and HBA rules
05:17:28 2097 INFO Transition complete: current state is now "single"
05:17:28 2097 INFO keeper has been successfully initialized.
05:17:28 2094 WARN pg_autoctl service node-init exited with exit status 0
05:17:28 2096 INFO Postgres controller service received signal SIGTERM, terminating
05:17:28 2096 INFO Stopping pg_autoctl postgres service
05:17:28 2096 INFO /usr/local/pgsql16/server/bin/pg_ctl --pgdata /usr/local/pgsql16/pg9300/data --wait stop --mode fast
05:17:28 2094 INFO Stop pg_autoctl
postgres@ubuntu12:~$
postgres@ubuntu12:~$
postgres@ubuntu12:~$ exit
logout
root@ubuntu12:~# pg_autoctl show systemd
05:18:06 2160 INFO HINT: to complete a systemd integration, run the following commands (as root):
05:18:06 2160 INFO pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
05:18:06 2160 INFO systemctl daemon-reload
05:18:06 2160 INFO systemctl enable pgautofailover
05:18:06 2160 INFO systemctl start pgautofailover
[Unit]
Description = pg_auto_failover
[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
[Install]
WantedBy = multi-user.target
root@ubuntu12:~# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover
[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
[Install]
WantedBy = multi-user.target
root@ubuntu12:~# systemctl daemon-reload
root@ubuntu12:~# systemctl enable pgautofailover
Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.
root@ubuntu12:~# systemctl start pgautofailover
root@ubuntu12:~#
root@ubuntu12:~# systemctl status pgautofailover
● pgautofailover.service - pg_auto_failover
Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2025-10-09 05:18:32 UTC; 10s ago
Main PID: 2250 (pg_autoctl)
Tasks: 11 (limit: 4550)
Memory: 26.8M
CGroup: /system.slice/pgautofailover.service
├─2250 /usr/local/pgsql16/server/bin/pg_autoctl run
├─2267 pg_autoctl: start/stop postgres
├─2268 pg_autoctl: node active
├─2278 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
├─2279 postgres: logger
├─2280 postgres: checkpointer
├─2281 postgres: background writer
├─2283 postgres: walwriter
├─2284 postgres: autovacuum launcher
├─2285 postgres: logical replication launcher
└─2313 postgres: postgres postgres [local] startup
Oct 09 05:18:32 ubuntu12 pg_autoctl[2250]: 05:18:32 2250 INFO Started pg_autoctl postgres service with pid 2267
Oct 09 05:18:32 ubuntu12 pg_autoctl[2250]: 05:18:32 2250 INFO Started pg_autoctl node-active service with pid 2268
Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service node-active --pgdata /usr/local/pgsql16/pg9300/data -v
Oct 09 05:18:32 ubuntu12 pg_autoctl[2267]: 05:18:32 2267 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data -v
Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO pg_autoctl service is running, current state is "single"
Oct 09 05:18:32 ubuntu12 pg_autoctl[2278]: 05:18:32 2278 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 WARN PostgreSQL was not running, restarted with pid 2278
Oct 09 05:18:33 ubuntu12 pg_autoctl[2267]: 05:18:33 2267 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2278
Oct 09 05:18:33 ubuntu12 pg_autoctl[2268]: 05:18:33 2268 INFO New state for this node (node 52, "ubuntu12") (ubuntu12:9300): single ? single
root@ubuntu12:~#
4.3 monitor 節(jié)點(diǎn)查看已注冊(cè)的主節(jié)點(diǎn)
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show uri
Type | Name | Connection String
-------------+---------+-------------------------------
monitor | monitor | postgres://autoctl_node@127.0.0.1:9300/pg_auto_failover?sslmode=require
formation | default | postgres://ubuntu12:9300/postgres?target_session_attrs=read-write&sslmode=require
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show state
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu12 | 8 | ubuntu12:9300 | 1: 0/15596F8 | read-write | single | single
root@ubuntu11:/usr/local/pg_install_packgae#
5,從節(jié)點(diǎn)加入pg_auto_failover
5.1 注冊(cè)從節(jié)點(diǎn)Ubuntu13
root@ubuntu13:/usr/local/pg_install_package# systemctl stop postgresql9300
root@ubuntu13:/usr/local/pg_install_package# systemctl disable postgresql9300
root@ubuntu13:/usr/local/pg_install_package# rm -rf /usr/local/pgsql16/pg9300
postgres@ubuntu13:~$ pg_autoctl create postgres --hostname ubuntu13 --name ubuntu13 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
05:46:07 11100 INFO Using default --ssl-mode "require"
05:46:07 11100 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
05:46:07 11100 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
05:46:07 11100 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
05:46:07 11100 INFO Started pg_autoctl postgres service with pid 11102
05:46:07 11100 INFO Started pg_autoctl node-init service with pid 11103
05:46:07 11102 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
05:46:07 11103 INFO Registered node 60 "ubuntu13" (ubuntu13:9300) in formation "default", group 0, state "wait_standby"
05:46:07 11103 INFO Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.state"
05:46:07 11103 INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.init"
05:46:07 11103 INFO Successfully registered as "wait_standby" to the monitor.
05:46:07 11103 INFO FSM transition from "init" to "wait_standby": Start following a primary
05:46:07 11103 INFO Transition complete: current state is now "wait_standby"
05:46:07 11103 INFO New state for node 52 "ubuntu12" (ubuntu12:9300): single ? wait_primary
05:46:07 11103 INFO New state for node 52 "ubuntu12" (ubuntu12:9300): wait_primary ? wait_primary
05:46:07 11103 INFO FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
05:46:07 11103 INFO Initialising PostgreSQL as a hot standby
05:46:07 11103 INFO /usr/local/pgsql16/server/bin/pg_basebackup -w -d 'application_name=pgautofailover_standby_60 host=ubuntu12 port=9300 user=pgautofailover_replicator sslmode=require' --pgdata /usr/local/pgsql16/pg9300/backup/node_60 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_60
05:46:07 11103 INFO pg_basebackup:
05:46:07 11103 INFO
05:46:07 11103 INFO initiating base backup, waiting for checkpoint to complete
05:46:07 11103 INFO pg_basebackup:
05:46:07 11103 INFO
05:46:07 11103 INFO checkpoint completed
05:46:07 11103 INFO pg_basebackup:
05:46:07 11103 INFO
05:46:07 11103 INFO write-ahead log start point: 0/2000028 on timeline 1
05:46:07 11103 INFO pg_basebackup:
05:46:07 11103 INFO
05:46:07 11103 INFO starting background WAL receiver
05:46:07 11103 INFO 22591/22591 kB (100%), 0/1 tablespace (...backup/node_60/global/pg_control)
05:46:08 11103 INFO 22591/22591 kB (100%), 1/1 tablespace
05:46:08 11103 INFO pg_basebackup: write-ahead log end point: 0/2000138
05:46:08 11103 INFO pg_basebackup: waiting for background process to finish streaming ...
05:46:08 11103 INFO pg_basebackup: syncing data to disk ...
05:46:08 11103 INFO pg_basebackup: renaming backup_manifest.tmp to backup_manifest
05:46:08 11103 INFO pg_basebackup: base backup completed
05:46:08 11103 INFO Creating the standby signal file at "/usr/local/pgsql16/pg9300/data/standby.signal", and replication setup at "/usr/local/pgsql16/pg9300/data/postgresql-auto-failover-standby.conf"
05:46:08 11103 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu13"
05:46:08 11116 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
05:46:08 11102 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 11116
05:46:08 11103 INFO PostgreSQL started on port 9300
05:46:08 11103 INFO Fetched current list of 1 other nodes from the monitor to update HBA rules, including 1 changes.
05:46:08 11103 INFO Ensuring HBA rules for node 52 "ubuntu12" (ubuntu12:9300)
05:46:08 11103 INFO Adding HBA rule: hostssl replication "pgautofailover_replicator" ubuntu12 trust
05:46:08 11103 INFO Adding HBA rule: hostssl "postgres" "pgautofailover_replicator" ubuntu12 trust
05:46:08 11103 INFO Writing new HBA rules in "/usr/local/pgsql16/pg9300/data/pg_hba.conf"
05:46:08 11103 INFO Reloading Postgres configuration and HBA rules
05:46:08 11103 INFO Transition complete: current state is now "catchingup"
05:46:08 11103 INFO keeper has been successfully initialized.
05:46:08 11100 WARN pg_autoctl service node-init exited with exit status 0
05:46:08 11102 INFO Postgres controller service received signal SIGTERM, terminating
05:46:08 11102 INFO Stopping pg_autoctl postgres service
05:46:08 11102 INFO /usr/local/pgsql16/server/bin/pg_ctl --pgdata /usr/local/pgsql16/pg9300/data --wait stop --mode fast
05:46:08 11100 INFO Stop pg_autoctl
postgres@ubuntu13:~$
root@ubuntu13:/usr/local/pg_install_package# pg_autoctl show systemd
05:48:11 12172 INFO HINT: to complete a systemd integration, run the following commands (as root):
05:48:11 12172 INFO pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
05:48:11 12172 INFO systemctl daemon-reload
05:48:11 12172 INFO systemctl enable pgautofailover
05:48:11 12172 INFO systemctl start pgautofailover
[Unit]
Description = pg_auto_failover
[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
[Install]
WantedBy = multi-user.target
root@ubuntu13:/usr/local/pg_install_package# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover
[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload
[Install]
WantedBy = multi-user.target
root@ubuntu13:/usr/local/pg_install_package# systemctl daemon-reload
root@ubuntu13:/usr/local/pg_install_package# systemctl enable pgautofailover
root@ubuntu13:/usr/local/pg_install_package# systemctl start pgautofailover
root@ubuntu13:/usr/local/pg_install_package#
root@ubuntu13:/usr/local/pg_install_package# systemctl status pgautofailover
● pgautofailover.service - pg_auto_failover
Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2025-10-09 05:48:26 UTC; 5s ago
Main PID: 12381 (pg_autoctl)
Tasks: 9 (limit: 4550)
Memory: 42.0M
CGroup: /system.slice/pgautofailover.service
├─12381 /usr/local/pgsql16/server/bin/pg_autoctl run
├─12391 pg_autoctl: start/stop postgres
├─12392 pg_autoctl: node active
├─12402 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
├─12403 postgres: logger
├─12404 postgres: checkpointer
├─12405 postgres: background writer
├─12406 postgres: startup recovering 000000010000000000000003
└─12407 postgres: walreceiver streaming 0/3000110
Oct 09 05:48:27 ubuntu13 pg_autoctl[12391]: 05:48:27 12391 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 12402
Oct 09 05:48:27 ubuntu13 pg_autoctl[12392]: 05:48:27 12392 WARN PostgreSQL was not running, restarted with pid 12402
Oct 09 05:48:28 ubuntu13 pg_autoctl[12392]: 05:48:28 12392 INFO Updated the keeper's state from the local PostgreSQL instance, which is running
Oct 09 05:48:28 ubuntu13 pg_autoctl[12392]: 05:48:28 12392 INFO pg_autoctl managed to ensure current state "catchingup": PostgreSQL is running
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO Monitor assigned new state "secondary"
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO FSM transition from "catchingup" to "secondary": Convinced the monitor that I'm up and running, and eligible for promotion again
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO Reached timeline 1, same as upstream node 52 "ubuntu12" (ubuntu12:9300)
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO Creating replication slot "pgautofailover_standby_52"
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO Transition complete: current state is now "secondary"
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO New state for node 52 "ubuntu12" (ubuntu12:9300): primary ? primary
root@ubuntu13:/usr/local/pg_install_package#
5.2 monitor 節(jié)點(diǎn)查看已注冊(cè)的從節(jié)點(diǎn)
再次回到monitor節(jié)點(diǎn)查看,可看到ubuntu11作為monitor節(jié)點(diǎn),ubuntu12作為主節(jié)點(diǎn),ubuntu13作為從節(jié)點(diǎn)
root@ubuntu11:~#
root@ubuntu11:~# pg_autoctl show uri;
Type | Name | Connection String
-------------+---------+-------------------------------
monitor | monitor | postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
formation | default | postgres://ubuntu12:9300,ubuntu13:9300/postgres?target_session_attrs=read-write&sslmode=require
root@ubuntu11:~#
root@ubuntu11:~#
root@ubuntu11:~# pg_autoctl show state
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu12 | 52 | ubuntu12:9300 | 1: 0/3000148 | read-write | primary | primary
ubuntu13 | 60 | ubuntu13:9300 | 1: 0/3000148 | read-only | secondary | secondary
root@ubuntu11:~#
root@ubuntu11:~#
6,pg_auto_failover外部訪問配置
1,修改postgres用戶密碼
登錄主節(jié)點(diǎn)修改postgres用戶密碼
ALTER USER postgres WITH PASSWORD 'a-strong-password';
2,修改hba.conf
1,修改從節(jié)點(diǎn)ubuntu13的hba.conf,添加一下訪問規(guī)則
hostssl postgres all 192.168.0.0/16 md5
2,重啟主節(jié)點(diǎn)ubuntu12,此時(shí)故障轉(zhuǎn)移,ubuntu13從節(jié)點(diǎn)提升為主節(jié)點(diǎn),同時(shí)ubuntu13作為新的主節(jié)點(diǎn),配置規(guī)則會(huì)覆蓋從節(jié)點(diǎn)ubuntu12
pg_auto_failover的hba.conf覆蓋規(guī)則是“新的主節(jié)點(diǎn)覆蓋舊的主節(jié)點(diǎn)”,這一點(diǎn)有點(diǎn)繞,有興趣的自己測試驗(yàn)證
然后從客戶端連接至pg_auto_failover集群的主節(jié)點(diǎn),查看復(fù)制狀態(tài),其實(shí)跟手動(dòng)搭建的流復(fù)制就一樣了,只不過是pg_auto_failover把整個(gè)postgresql集群的搭建過程給屏蔽掉了
SELECT * FROM pg_replication_slots;
slot_name |plugin|slot_type|datoid|database|temporary|active|active_pid|xmin|catalog_xmin|restart_lsn|confirmed_flush_lsn|wal_status|safe_wal_size|two_phase|conflicting|
-------------------------+------+---------+------+--------+---------+------+----------+----+------------+-----------+-------------------+----------+-------------+---------+-----------+
pgautofailover_standby_60| |physical | | |false |true | 11783|747 | |0/5020868 | |reserved | |false | |
select * from pg_stat_replication;
pid |usesysid|usename |application_name |client_addr |client_hostname|client_port|backend_start |backend_xmin|state |sent_lsn |write_lsn|flush_lsn|replay_lsn|write_lag |flush_lag |replay_lag |sync_priority|sync_state|reply_time |
-----+--------+-------------------------+-------------------------+---------------+---------------+-----------+-----------------------------+------------+---------+---------+---------+---------+----------+---------------+---------------+---------------+-------------+----------+-----------------------------+
11783| 16416|pgautofailover_replicator|pgautofailover_standby_60|192.168.152.123|ubuntu13 | 60426|2025-10-09 13:23:00.095 +0800| |streaming|0/5020868|0/5020868|0/5020868|0/5020868 |00:00:00.001064|00:00:00.001766|00:00:00.001773| 1|quorum |2025-10-09 13:51:42.416 +0800|
可以看到,pg_auto_failover在一主一從的模式下,是同步復(fù)制
select * from pg_settings where name like '%synchronous_commit%';
name |setting|unit|category |short_desc |extra_desc|context|vartype|source |min_val|max_val|enumvals |boot_val|reset_val|sourcefile |sourceline|pending_restart|
------------------+-------+----+--------------------------+-----------------------------------------------------+----------+-------+-------+------------------+-------+-------+----------------------------------------+--------+---------+------------------------------------------------------------+----------+---------------+
synchronous_commit|on | |Write-Ahead Log / Settings|Sets the current transaction''s synchronization level.| |user |enum |configuration file| | |{local,remote_write,remote_apply,on,off}|on |on |/usr/local/pgsql16/pg9300/data/postgresql-auto-failover.conf| 12|false |
select * from pg_settings where name like '%synchronous_standby_names%' ;
name |setting |unit|category |short_desc |extra_desc|context|vartype|source |min_val|max_val|enumvals|boot_val|reset_val |sourcefile |sourceline|pending_restart|
-------------------------+---------------------------------+----+----------------------------+-------------------------------------------------------------------------------+----------+-------+-------+------------------+-------+-------+--------+--------+---------------------------------+---------------------------------------------------+----------+---------------+
synchronous_standby_names|ANY 1 (pgautofailover_standby_60)| |Replication / Primary Server|Number of synchronous standbys and list of names of potential synchronous ones.| |sighup |string |configuration file| | |NULL | |ANY 1 (pgautofailover_standby_60)|/usr/local/pgsql16/pg9300/data/postgresql.auto.conf| 4|false |
posted on 2025-10-09 18:56 MSSQL123 閱讀(57) 評(píng)論(0) 收藏 舉報(bào)
浙公網(wǎng)安備 33010602011771號(hào)