How to Scale PostgreSQL with Replication: Methods, Setup & Pro Tips

Mydbops

Apr 22, 2025

Mins to Read

All

PostgreSQL Replication: Types, Setup & Best Practices

‍

As your app starts to grow and user traffic picks up, a single PostgreSQL database might start feeling a little... overwhelmed. You may notice things slowing down or becoming less reliable — especially when it’s handling both reads and writes at the same time. To address these challenges, replication offers a practical solution. By creating one or more copies of your database, you can share the load, reduce downtime, and build a system that’s much more scalable and resilient.

In this post, we’ll walk you through what PostgreSQL replication is, why it’s useful, the different types you can use, and how to set it up the right way and best practices to keep everything running smoothly.

Replication in PostgreSQL

Replication in PostgreSQL involves creating and maintaining one or more copies known as standby servers or replicas of a primary database server. Changes made to the primary server are continuously propagated to these replicas, ensuring data consistency across all instances.

The primary goals of replication include:

High Availability (HA): In the event of a failure on the primary server, a standby server can take over, helping to minimize downtime.
Read Scalability: Read-heavy workloads can be distributed across replicas, reducing the load on the primary server and improving overall performance.
Disaster Recovery (DR): Deploying replicas in different physical locations adds a layer of protection against data loss due to unexpected incidents.

‍

‍

PostgreSQL supports several replication methods, each designed to address specific requirements and use cases.

Types of Replication in PostgreSQL

PostgreSQL offers replication methods, each with its own architecture and trade-offs.

‍

Here are the main types you'll encounter:

Streaming Replication

‍This is the most common and generally recommended type of replication in modern PostgreSQL. It involves continuously streaming Write-Ahead Log (WAL) records from the primary to the standby servers. This ensures near real-time data synchronization.

Streaming replication can be configured as

Asynchornous
Synchronous

Asynchronous Replication

‍Asynchronous replication in PostgreSQL is a type of streaming replication where the primary server commits transactions without waiting for confirmation that the changes have been written to the standby servers.

Key characteristics:

Faster Write Performance on Primary: Since the primary doesn't wait, write operations are generally faster.
Potential Data Loss: In the event of a primary server failure, some recently committed transactions that haven't yet been applied to the standby servers might be lost. This is the main trade-off.
Simpler Setup: Generally easier to configure compared to synchronous replication.
Common Use Case: Suitable for read-heavy workloads where eventual consistency is acceptable and high write performance on the primary is crucial.

‍

Synchronous Replication

‍Synchronous replication in PostgreSQL is a type of streaming replication where the primary server waits for confirmation from one or more synchronous standby servers that a transaction's changes have been successfully written to the disk before the primary server itself commits that transaction.

Key characteristics:

Strong Data Consistency: Guarantees that if the primary server fails, any transaction that was committed on the primary will also be present on at least one synchronous standby. This significantly reduces the risk of data loss.
Potential Write Latency: Because the primary has to wait for confirmation, write operations can experience higher latency compared to asynchronous replication. The primary's performance can be impacted by the slowest synchronous standby.
More Complex Configuration: Requires careful configuration of synchronous standbys.
Common Use Case: Ideal for critical applications where data integrity and zero data loss are paramount, even at the cost of slightly lower write performance

‍

Logical Replication

‍Introduced in PostgreSQL 10, Logical replication in PostgreSQL is a method of replicating data based on its content (logical changes) rather than physical storage (WAL segments). It allows you to replicate changes at the level of individual tables or even subsets of tables. This offers more granular control over what data is replicated .

Key characteristics

Granular Control: You can choose which tables (or even specific rows and columns with more advanced filtering) are replicated.
Cross-Version Compatibility: Logical replication can often work between different major PostgreSQL versions, making upgrades with minimal downtime easier.
Schema Flexibility: The target database can have a different schema than the source, as long as the replicated data can be accommodated.
Different Replication Unit: Instead of replicating the entire write-ahead log, it replicates individual data modifications (inserts, updates, deletes) in a logical format.
More Complex Setup: Setting up publications and subscriptions requires more explicit configuration than basic streaming replication.
Use Cases: Ideal for selective replication, data sharing between different applications, upgrades, and scenarios where the target database has a different structure or only needs a subset of the data.

‍

‍

Setting Up Streaming Replication in PostgreSQL

Here’s a high-level overview of setting up streaming replication:

Prerequisites

Same PostgreSQL major version on both primary and standby
Password authentication for replication user
Network connectivity between both nodes
Enough disk space for base backup and WAL retention

Configure the Primary Server:

Edit postgresql.conf:

These parameters are essential for enabling physical streaming replication:

wal_level = replica

(This controls how much information is written to the WAL.replica (formerly hot_standby) is required for physical replication.)

max_wal_senders = 5

(Maximum number of concurrent connections that can stream WAL data to replicas. Set this based on how many replicas you plan to connect.)

wal_keep_size = ‘1GB’

(Minimum size of WAL files to retain in the pg_wal directory to prevent replicas from falling too far behind. Increase this if you anticipate high write volumes or slow replicas.)

Edit pg_hba.conf:

Add a line to allow the standby to connect for replication:

host replication replicator standby_ip/32 md5

(replicator user to be created in the DB with replication privileges and a strong password

E.g CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'securepass'; )

‍replication: special database keyword for replication traffic
‍replicator: the username
‍standby_ip/32: IP of the replica (or subnet)
‍md5: password-based authentication

Reload the postgresql.conf file post the changes :

select pg_reload_conf();

Prepare the standby server

On the standby server, stop the PostgreSQL service:

sudo systemctl stop postgresql

Clear the old data directory:

rm -rf /var/lib/postgresql/15/main/*

Take the basebackup from the primary server :

pg_basebackup -h <primary_ip> -U replicator -D /var/lib/postgresql/15/main -P --wal-method=stream --write-recovery-conf -C -S repslot -v

-h: Hostname or IP of the primary server.
-U: Replication user
-D: Destination directory
-C: Creates physical replication slot
-S: Physical replication slot name
--wal-method=stream: Stream WAL files during backup
--write-recovery-conf: Automatically generate recovery info
-v: Enables verbose mode

Start the standby server once backup is completed

sudo systemctl start postgresql

‍

Check replication status on the primary.

SELECT pid, state, sync_state, write_lag, replay_lag FROM pg_stat_replication;

Setting Up Logical Replication in PostgreSQL

Prerequisites :

PostgreSQL 10 or higher
Replicated tables must:
- Have a primary key
- Exist on both publisher and subscriber
- Have identical schema

Configure the Primary Server:

Edit postgresql.conf

These parameters are essential for enabling logical replication:

wal_level = logical             # Enables logical (row-based) replication
max_replication_slots = 4       # Number of allowed logical replication slots
max_wal_senders = 4             # Number of concurrent WAL sender processes

These settings are crucial for enabling logical replication and supporting multiple subscriptions.

Edit pg_hba.conf:

Add the following line to allow the subscriber (standby) to connect for replication:

host    database_name    replicator    subscriber_ip/32    md5

database_name: The name of the database to connect to.

replicator: The username used for replication.

subscriber_ip/32: The IP address of the subscriber server (use /32 for a single IP).

md5: Enables password-based authentication.

After editing, reload the PostgreSQL configuration to apply the changes:

select pg_reload_conf();

‍

Create a Publication on the Publisher:

Connect to the desired database on the publisher server, and create a publication.

For selected tables :

CREATE PUBLICATION my_pub FOR TABLE employees, departments;

‍

For all tables :

CREATE PUBLICATION my_pub FOR ALL TABLES;

Prepare the Subscriber server:

On the subscriber, create the subscription:

CREATE SUBSCRIPTION my_sub
CONNECTION 'host=primary_ip port=5432 dbname=mydb user=replicator password=StrongPassword123'
PUBLICATION my_pub;

‍

This will create a replication slot on the primary and begin streaming data changes

Monitor the logical replication

On the subscriber :

SELECT * FROM pg_stat_subscription;

‍

On the publisher :

SELECT * FROM pg_replication_slots WHERE active = true;

Learn how to Convert Streaming Replica to Logical Replication with pg_createsubscriber here.

Best Practices for PostgreSQL Replication

Choose the Right Replication Type

Use Case	Recommendation
Full cluster redundancy and failover	Streaming Replication
Selective table-level replication or cross-version upgrades	Logical Replication
Upgrades or migrations without downtime	Logical Replication
High-throughput with minimal overhead	Streaming Replication

Monitor Replication Lag Continuously:

Monitoring lag helps catch issues early before replication breaks.

Use views like:

For Streaming Replication

SELECT pid, client_addr, state, write_lag, flush_lag, replay_lag FROM pg_stat_replication;

For Logical Replication

SELECT * FROM pg_stat_subscription;

Tune WAL Configuration Parameters

Efficient WAL settings help reduce lag and avoid disk bloat.

Parameter	Recommended Practice
wal_level	Use `replica` for streaming, `logical` for logical
wal_keep_size	Ensure it can buffer short outages (e.g., 128MB–1GB)
max_wal_size / min_wal_size	Set based on write volume and disk size
checkpoint_timeout	Keep moderate (e.g., 5min) for recovery performance
archive_mode	Enable for point-in-time recovery or long-term replica catch-up

Secure Replication Traffic

Use SSL to encrypt traffic between primary and replicas:
- Set ssl = on in postgresql.conf
- Use hostssl entries in pg_hba.conf
Use firewall rules to restrict replication traffic to trusted IPs.

Automate Failover (High Availability)

Streaming replication provides read replicas, but it doesn’t do automatic failover. Use HA tools:

Tool	Features
Patroni	Etcd/Consul-backed failover, simple configuration
repmgr	Switchover, monitoring, command-line tools

Ensure Schema Consistency (Logical Replication)

Logical replication doesn't sync schema changes. To avoid breakage:

Make DDL changes on both publisher and subscriber
Avoid schema mismatch using schema comparison tools

Use Physical Replication Slots (Streaming)

Replication slots prevent WAL deletion before replicas consume them:

SELECT * FROM pg_replication_slots;

Creating replication slots :

SELECT * FROM pg_create_physical_replication_slot('replica_slot1');

‍

NOTE : If a replica fails and doesn't consume WALs, the disk can fill up. Use monitoring to clean up orphaned or inactive slots.

Regularly Test and Document Your DR Plan

Simulate failure and recovery
Document recovery steps (e.g., promoting a replica):

pg_ctl promote -D /var/lib/postgresql/15/main

Troubleshooting Common Replication Issues in PostgreSQL

We can face multiple issues working with PostgreSQL replication , following tables lists the cause and fix for some of the most common ones :

Issue	Cause	Resolution
Replica out of sync	Standby too far behind	Recreate from base backup
Replication lag	High load or slow disk/network	Monitor and optimize I/O, WAL settings
Subscription not receiving data	Table structure mismatch	Ensure identical schemas
Missing WALs	Short retention period	Increase `wal_keep_size` or archive WALs
Permissions errors	Misconfigured `pg_hba.conf`	Verify user roles and connection rules

‍

Replication in PostgreSQL is a powerful feature that supports both high-availability and scalability goals. Streaming replication is ideal for full-cluster failover and redundancy, while logical replication offers flexibility for partial replication, version upgrades, and more complex topologies.

Setting things up correctly and following best practices will help you build a resilient PostgreSQL architecture that scales with your needs.

Need Help with PostgreSQL Replication?

Whether you're scaling your app, planning a zero-downtime migration, or need expert insights to optimize your PostgreSQL setup, Mydbops has you covered. We offer Managed Services, Consulting, and support services tailored for PostgreSQL environments. Contact Us Now.

No items found.