PostgreSQL is a powerful, open-source relational database with a strong reputation for reliability, robustness, and performance. It supports replication to achieve high availability for a very long. But it lacks an important feature on its own, that is the concept of automated failover. Of course, there are a lot of open-source tools available to do it, but it won’t come under the core PostgreSQL packages. Additional knowledge and maintenance are required to handle it.
In this blog post, we will look at the tool pg_auto_failover which is very simple to use, easy for configuration, and maintenance.
pg_auto_failover is an open-source tool(basically an extension) that can be used to perform automated failover in the PostgreSQL production environments. You can find its source code here.
Simple Architecture:
The architecture with pg_auto_failover requires N+1 nodes where N is the number of PostgreSQL servers and the additional one node is required, it is a monitor node that acts both as a witness and an orchestrator.
A simple architecture with 2 PostgreSQL servers and pg_auto_failover.
Here I am using 3 VMs to show the demonstration, the details are as follows
Installation of pg_auto_failover(On all the nodes):
Pg_auto_failover is available in the package managers for Linux distributions. The following steps for installation can be used in Fedora, CentOS, or Red Hat distributions. For other distributions, please refer here.
Perform the following steps in the monitor and the DB servers
pg_autoctl will be installed along with pg_auto_failover extension and other PostgreSQL client /server/ library packages. pg_autoctl is a binary with subcommands to initialize and manage the PostgreSQL service.
Note: If you are using PostgreSQL 14 or others, your binaries will be placed in the respective directories. (/usr/pgsql-XX/bin/pg_autoctl )
Configuration of pg_auto_failover:
Once we are done with installation on all the DB and the monitor servers, we are ready to start the configuration.
Configure Monitor First(Only on monitor):
We need to first configure the monitor as it will periodically monitor the DB nodes and watches their health. For complete configuration, we will be using pg_autoctl commands along with the subcommands for respective operations.
Create your data and tmp directories(if not exist) and change the ownership to postgres user.
The above successful execution of the command will print the following message
11:54:24 4205 INFO Your pg_auto_failover monitor instance is now ready on port 5432.
11:54:24 4205 INFO Monitor has been successfully initialized.
11:54:24 4200 WARN pg_autoctl service monitor-init exited with exit status 0
11:54:24 4204 INFO Postgres controller service received signal SIGTERM, terminating
11:54:24 4204 INFO Stopping pg_autoctl postgres service
11:54:24 4204 INFO /usr/pgsql-12/bin/pg_ctl --pgdata /var/lib/pgsql/12/data --wait stop --mode fast
11:54:25 4200 INFO Stop pg_autoctl
It also installs the pgautofailover extension and grants access to the new user autoctl_node in the local PostgreSQL database.
Note: The monitor server also runs the PostgreSQL service to store the state changes in the database servers(Primary or Secondary), only for its own purpose.
Execute the following command to start the monitor server by postgres user.
Check by login the PostgreSQL service (By postgres user)
psql -d pg_auto_failover
Where
-d —> pg_auto_failover is the database created by the monitor setup
pg_auto_failover=# \dx pgautofailover
List of installed extensions
Name | Version | Schema | Description
----------------+---------+--------+------------------
pgautofailover | 1.4 | public | pg_auto_failover
(1 row)
5. Run the pg_autoctl service as a monitor(Monitor server) under postgres user
/usr/pgsql-12/bin/pg_autoctl run &
It starts to listen on port 5432 in the monitor server for further configuration and monitoring
13:02:50 6588 INFO Restarting service listener
13:02:50 6600 INFO /usr/pgsql-12/bin/pg_autoctl do service listener --pgdata /var/lib/pgsql/12/data -v
13:02:50 6600 INFO Managing the monitor at postgres://autoctl_node@10.0.2.15:5432/pg_auto_failover?sslmode=require
13:02:50 6600 INFO Reloaded the new configuration from "/var/lib/pgsql/.config/pg_autoctl/var/lib/pgsql/12/data/pg_autoctl.cfg"
13:02:50 6600 INFO Contents of "/var/lib/pgsql/12/data/postgresql-auto-failover.conf" have changed, overwriting
13:02:50 6607 INFO /usr/pgsql-12/bin/postgres -D /var/lib/pgsql/12/data -p 5432 -h *
13:02:50 6597 INFO Postgres is now serving PGDATA "/var/lib/pgsql/12/data" on port 5432 with pid 6607
13:02:50 6600 INFO The version of extension "pgautofailover" is "1.4" on the monitor
13:02:50 6600 INFO Contacting the monitor to LISTEN to its events.
It is perfect now.
After post successful configuration of the monitor server, it is time to move to set up the DB server.
Configuring DB Servers(Only on DB servers):
The following command will be executed on the DB1, will configure the primary database.
hostname —> Hostname of the primary database (Current Server)
pg_ctl —> Absolute path of the pg_ctl binary
monitor:
autoctl_node —> Default user to connect to the monitor server. It will be created in the monitor server at the time of configuration as mentioned earlier.
192.168.33.22:5432 —> Hostname or IP of the monitor server along with the port.
pg_auto_failover —> Database created in the monitor server at the time of configuration
13:04:09 4202 INFO Postgres is now serving PGDATA "/var/lib/pgsql/12/data" on port 5432 with pid 4224
13:04:09 4203 INFO CREATE DATABASE mydbops;
13:04:09 4203 INFO CREATE EXTENSION pg_stat_statements;
13:04:09 4203 INFO /bin/openssl req -new -x509 -days 365 -nodes -text -out /var/lib/pgsql/12/data/server.crt -keyout /var/lib/pgsql/12/data/server.key -subj "/CN=192.168.33.23"
13:04:09 4203 INFO Contents of "/var/lib/pgsql/12/data/postgresql-auto-failover.conf" have changed, overwriting
13:04:09 4203 INFO Transition complete: current state is now "single"
13:04:09 4203 INFO keeper has been successfully initialized.
13:04:09 4199 WARN pg_autoctl service node-init exited with exit status 0
13:04:09 4202 INFO Postgres controller service received signal SIGTERM, terminating
13:04:09 4202 INFO Stopping pg_autoctl postgres service
13:04:09 4202 INFO /usr/pgsql-12/bin/pg_ctl --pgdata /var/lib/pgsql/12/data --wait stop --mode fast
13:04:09 4199 INFO Stop pg_autoctl
Run the pg_autoctl service as keeper(DB server) under postgres user
/usr/pgsql-12/bin/pg_autoctl run &
-bash-4.2$ 13:05:07 4274 INFO Started pg_autoctl postgres service with pid 4277
13:05:07 4277 INFO /usr/pgsql-12/bin/pg_autoctl do service postgres --pgdata /var/lib/pgsql/12/data -v
13:05:07 4274 INFO Started pg_autoctl node-active service with pid 4278
13:05:07 4278 INFO /usr/pgsql-12/bin/pg_autoctl do service node-active --pgdata /var/lib/pgsql/12/data -v
13:05:07 4278 INFO Reloaded the new configuration from "/var/lib/pgsql/.config/pg_autoctl/var/lib/pgsql/12/data/pg_autoctl.cfg"
13:05:07 4278 INFO pg_autoctl service is running, current state is "single"
13:05:07 4286 INFO /usr/pgsql-12/bin/postgres -D /var/lib/pgsql/12/data -p 5432 -h *
13:05:07 4278 WARN PostgreSQL was not running, restarted with pid 4286
13:05:07 4277 INFO Postgres is now serving PGDATA "/var/lib/pgsql/12/data" on port 5432 with pid 4286
Checking the current status
#-bash-4.2$ /usr/pgsql-12/bin/pg_autoctl show state
Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
-------+-------+--------------------+-----------+-----------+---------------------+--------------------
node_1 | 1 | 192.168.33.23:5432 | 0/16DBA48 | yes | single | single
-bash-4.2$
It is showing the status as single as currently, only one server is available.
Only changed the value for option hostname here. Used the IP of the server DB2.
Then, run the following
/usr/pgsql-12/bin/pg_autoctl run &
Now, check the current status. You could see the following
-bash-4.2$ /usr/pgsql-12/bin/pg_autoctl show state
Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
-------+-------+--------------------+-----------+-----------+---------------------+--------------------
node_1 | 1 | 192.168.33.23:5432 | 0/3000110 | yes | primary | primary
node_8 | 8 | 192.168.33.24:5432 | 0/3000110 | yes | secondary | secondary
While configuring the DB2, the pg_auto_failover performs the standby setup as well by performing pg_basebackup for the initial sync.
You can use the status command on any nodes in the architecture.
That’s all about the configuration. Try to do some testing with various scenarios
Shutting down the PostgreSQL service
Shutdown the VM
This will do the failover on its own.
In the next blog, we will look at the more advanced features available and also with the multiple standby architecture which is more opt for the production systems.