
PostgreSQL Replication: Types, Setup & Best Practices
As your app starts to grow and user traffic picks up, a single PostgreSQL database might start feeling a little... overwhelmed. You may notice things slowing down or becoming less reliable — especially when it’s handling both reads and writes at the same time. To address these challenges, replication offers a practical solution. By creating one or more copies of your database, you can share the load, reduce downtime, and build a system that’s much more scalable and resilient.
In this post, we’ll walk you through what PostgreSQL replication is, why it’s useful, the different types you can use, and how to set it up the right way and best practices to keep everything running smoothly.
Replication in PostgreSQL
Replication in PostgreSQL involves creating and maintaining one or more copies known as standby servers or replicas of a primary database server. Changes made to the primary server are continuously propagated to these replicas, ensuring data consistency across all instances.
The primary goals of replication include:
- High Availability (HA): In the event of a failure on the primary server, a standby server can take over, helping to minimize downtime.
- Read Scalability: Read-heavy workloads can be distributed across replicas, reducing the load on the primary server and improving overall performance.
- Disaster Recovery (DR): Deploying replicas in different physical locations adds a layer of protection against data loss due to unexpected incidents.
PostgreSQL supports several replication methods, each designed to address specific requirements and use cases.
Types of Replication in PostgreSQL
PostgreSQL offers replication methods, each with its own architecture and trade-offs.
.png)
Here are the main types you'll encounter:
Streaming Replication
This is the most common and generally recommended type of replication in modern PostgreSQL. It involves continuously streaming Write-Ahead Log (WAL) records from the primary to the standby servers. This ensures near real-time data synchronization.
Streaming replication can be configured as
- Asynchornous
- Synchronous
Asynchronous Replication
Asynchronous replication in PostgreSQL is a type of streaming replication where the primary server commits transactions without waiting for confirmation that the changes have been written to the standby servers.
Key characteristics:
- Faster Write Performance on Primary: Since the primary doesn't wait, write operations are generally faster.
- Potential Data Loss: In the event of a primary server failure, some recently committed transactions that haven't yet been applied to the standby servers might be lost. This is the main trade-off.
- Simpler Setup: Generally easier to configure compared to synchronous replication.
- Common Use Case: Suitable for read-heavy workloads where eventual consistency is acceptable and high write performance on the primary is crucial.
Synchronous Replication
Synchronous replication in PostgreSQL is a type of streaming replication where the primary server waits for confirmation from one or more synchronous standby servers that a transaction's changes have been successfully written to the disk before the primary server itself commits that transaction.
Key characteristics:
- Strong Data Consistency: Guarantees that if the primary server fails, any transaction that was committed on the primary will also be present on at least one synchronous standby. This significantly reduces the risk of data loss.
- Potential Write Latency: Because the primary has to wait for confirmation, write operations can experience higher latency compared to asynchronous replication. The primary's performance can be impacted by the slowest synchronous standby.
- More Complex Configuration: Requires careful configuration of synchronous standbys.
- Common Use Case: Ideal for critical applications where data integrity and zero data loss are paramount, even at the cost of slightly lower write performance.
Logical Replication
Introduced in PostgreSQL 10, Logical replication in PostgreSQL is a method of replicating data based on its content (logical changes) rather than physical storage (WAL segments). It allows you to replicate changes at the level of individual tables or even subsets of tables. This offers more granular control over what data is replicated .
Key characteristics
- Granular Control: You can choose which tables (or even specific rows and columns with more advanced filtering) are replicated.
- Cross-Version Compatibility: Logical replication can often work between different major PostgreSQL versions, making upgrades with minimal downtime easier.
- Schema Flexibility: The target database can have a different schema than the source, as long as the replicated data can be accommodated.
- Different Replication Unit: Instead of replicating the entire write-ahead log, it replicates individual data modifications (inserts, updates, deletes) in a logical format.
- More Complex Setup: Setting up publications and subscriptions requires more explicit configuration than basic streaming replication.
- Use Cases: Ideal for selective replication, data sharing between different applications, upgrades, and scenarios where the target database has a different structure or only needs a subset of the data.
Setting Up Streaming Replication in PostgreSQL
Here’s a high-level overview of setting up streaming replication:
Prerequisites
- Same PostgreSQL major version on both primary and standby
- Password authentication for replication user
- Network connectivity between both nodes
- Enough disk space for base backup and WAL retention
Configure the Primary Server:
Edit postgresql.conf:
These parameters are essential for enabling physical streaming replication:
wal_level = replica
(This controls how much information is written to the WAL.replica (formerly hot_standby) is required for physical replication.)
max_wal_senders = 5
(Maximum number of concurrent connections that can stream WAL data to replicas. Set this based on how many replicas you plan to connect.)
wal_keep_size = 64MB
(Minimum size of WAL files to retain in the pg_wal directory to prevent replicas from falling too far behind. Increase this if you anticipate high write volumes or slow replicas.)
Edit pg_hba.conf:
Add a line to allow the standby to connect for replication:
host replication replicator standby_ip/32 md5
(replicator user to be created in the DB with replication privileges and a strong password
E.g CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'securepass'; )
- replication: special database keyword for replication traffic
- replicator: the username
- standby_ip/32: IP of the replica (or subnet)
- md5: password-based authentication
Reload the postgresql.conf file post the changes :
select pg_reload_conf();
Prepare the standby server
On the standby server, stop the PostgreSQL service:
sudo systemctl stop postgresql
Clear the old data directory:
rm -rf /var/lib/postgresql/15/main/*
Take the basebackup from the primary server :
pg_basebackup -h <primary_ip> -U replicator -D /var/lib/postgresql/15/main -P --wal-method=stream --write-recovery-conf
- -h: Hostname or IP of the primary server.
- -U: Replication user
- -D: Destination directory
- --wal-method=stream: Stream WAL files during backup
- --write-recovery-conf: Automatically generate recovery info
Start the standby server once backup is completed
sudo systemctl start postgresql
Check replication status on the primary.
SELECT pid, state, sync_state, write_lag, replay_lag FROM pg_stat_replication;
Setting Up Logical Replication in PostgreSQL
Prerequisites :
- PostgreSQL 10 or higher
- Replicated tables must:
- Have a primary key
- Exist on both publisher and subscriber
- Have identical schema
Configure the Primary Server:
Edit postgresql.conf
These parameters are essential for enabling logical replication:
wal_level = logical # Enables logical (row-based) replication
max_replication_slots = 4 # Number of allowed logical replication slots
max_wal_senders = 4 # Number of concurrent WAL sender processes
These settings are crucial for enabling logical replication and supporting multiple subscriptions.
Edit pg_hba.conf:
Add the following line to allow the subscriber (standby) to connect for replication:
host database_name replicator subscriber_ip/32 md5
/32
for a single IP).After editing, reload the PostgreSQL configuration to apply the changes:
select pg_reload_conf();
Create a Publication on the Publisher:
Connect to the desired database on the publisher server, and create a publication.
For selected tables :
CREATE PUBLICATION my_pub FOR TABLE employees, departments;
For all tables :
CREATE PUBLICATION my_pub FOR ALL TABLES;
Prepare the Subscriber server:
On the subscriber, create the subscription:
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=primary_ip port=5432 dbname=mydb user=replicator password=StrongPassword123'
PUBLICATION my_pub;
This will create a replication slot on the primary and begin streaming data changes
Monitor the logical replication
On the subscriber :
SELECT * FROM pg_stat_subscription;
On the publisher :
SELECT * FROM pg_replication_slots WHERE active = true;
Learn how to Convert Streaming Replica to Logical Replication with pg_createsubscriber here.
Best Practices for PostgreSQL Replication
Choose the Right Replication Type
Monitor Replication Lag Continuously:
Monitoring lag helps catch issues early before replication breaks.
Use views like:
For Streaming Replication
SELECT pid, client_addr, state, write_lag, flush_lag, replay_lag FROM pg_stat_replication;
For Logical Replication
SELECT * FROM pg_stat_subscription;
Tune WAL Configuration Parameters
Efficient WAL settings help reduce lag and avoid disk bloat.
Secure Replication Traffic
- Use SSL to encrypt traffic between primary and replicas:
- Set ssl = on in postgresql.conf
- Use hostssl entries in pg_hba.conf
- Set ssl = on in postgresql.conf
- Use firewall rules to restrict replication traffic to trusted IPs.
Automate Failover (High Availability)
Streaming replication provides read replicas, but it doesn’t do automatic failover. Use HA tools:
Ensure Schema Consistency (Logical Replication)
Logical replication doesn't sync schema changes. To avoid breakage:
- Make DDL changes on both publisher and subscriber
- Avoid schema mismatch using schema comparison tools
Use Physical Replication Slots (Streaming)
Replication slots prevent WAL deletion before replicas consume them:
SELECT * FROM pg_replication_slots;
Creating replication slots :
SELECT * FROM pg_create_physical_replication_slot('replica_slot1');
NOTE : If a replica fails and doesn't consume WALs, the disk can fill up. Use monitoring to clean up orphaned or inactive slots.
Regularly Test and Document Your DR Plan
- Simulate failure and recovery
- Document recovery steps (e.g., promoting a replica):
pg_ctl promote -D /var/lib/postgresql/15/main
Troubleshooting Common Replication Issues in PostgreSQL
We can face multiple issues working with PostgreSQL replication , following tables lists the cause and fix for some of the most common ones :
Replication in PostgreSQL is a powerful feature that supports both high-availability and scalability goals. Streaming replication is ideal for full-cluster failover and redundancy, while logical replication offers flexibility for partial replication, version upgrades, and more complex topologies.
Setting things up correctly and following best practices will help you build a resilient PostgreSQL architecture that scales with your needs.
Need Help with PostgreSQL Replication?
Whether you're scaling your app, planning a zero-downtime migration, or need expert insights to optimize your PostgreSQL setup, Mydbops has you covered. We offer Managed Services, Consulting, and support services tailored for PostgreSQL environments. Contact Us Now.