Optimize Cross-Region Data Replication with YugabyteDB Read Replicas

Mydbops
Nov 18, 2023
8 Mins to Read

In our previous blog, we discussed setting up a 3-node YugabyteDB cluster in a single cloud region, achieving a seamless deployment on Linux. Continuing our journey into mastering YugabyteDB, this time we delve into a crucial aspect of database management: building read replicas.

Scaling a database horizontally is imperative for applications dealing with high traffic and complex queries. Asynchronous read replicas play a pivotal role in this scenario, ensuring both performance and fault tolerance. In this blog post, we will explore the steps for creating read replicas with YugabyteDB, empowering you to optimize your database for efficient read-heavy workloads.

Understanding the Importance of Read Replicas

Before we dive into the technicalities, let's take a moment to understand why read replicas are essential in a distributed database system like YugabyteDB. In a nutshell, read replicas serve as copies of the primary database that can handle read queries. Distributing read operations across multiple replicas not only enhances the system's read throughput but also alleviates the burden on the primary cluster, improving both the overall performance of your application and its fault tolerance.

Setting up a read replica cluster

A read replica cluster comprises follower nodes linked to a primary cluster. These nodes act solely as observers and do not participate in the RAFT consensus or elections. This unique characteristic allows read replicas to have a distinct replication factor (RF) from the primary cluster, and remarkably, the RF can even be set as an even number.

A few important takeaways while building read replica clusters:

  • The Replication Factor (RF) can be 1 or even an even number, since read replica nodes do not participate in leader elections
  • Once a read replica cluster is built with an RF, it cannot be modified later. So, plan the RF value carefully during the initial setup to ensure it aligns with your scalability and fault tolerance requirements
  • All schema changes are replicated with eventual consistency through Raft. There is no need to apply DDL separately on the read replica cluster, and there is no limitation on which objects can be replicated
  • Read replicas use asynchronous replication. Mind the replication lag before failing over, to avoid data loss; a quick way to eyeball the lag is sketched right after this list
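
Because the replica trails the primary asynchronously, it helps to keep an eye on how far behind it is once the replica cluster (built later in this post) is running. A minimal sketch, assuming the yb-tserver web endpoint is on its default port 9000 and that the per-tablet follower_lag_ms metric is exposed there; adjust the IP and port for your environment:

# Rough lag check against the replica tserver's metrics endpoint.
# A steadily growing follower_lag_ms value means the replica is falling behind.
curl -s http://192.168.33.55:9000/metrics | grep follower_lag_ms | head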

Now, let's build a three-node Primary Cluster with a one-node Read Replica Cluster. This topology can be used to distribute read traffic to a second region.

Architecture Overview

Three-node Primary Cluster with one-node Replica Cluster - Architecture

Inventory Details

  • Primary cluster nodes (aws / ap-south-1 / ap-south-1a): 192.168.33.21, 192.168.33.22, 192.168.33.23
  • Read replica node (aws / ap-south-2 / ap-south-2a): 192.168.33.55
  • Ports used: yb-master RPC 7100, yb-tserver RPC 9100, YSQL 5433, YCQL 9042

Replication Setup Steps

In our previous blog, we covered the installation and configuration of YugabyteDB. In this post, we will specifically delve into the replication setup.

  1. Start yb-master on Primary nodes and form a cluster

On Node 1:

./bin/yb-master --master_addresses 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 --rpc_bind_addresses 192.168.33.21:7100 --fs_data_dirs "/root/disk1,/root/disk2" --placement_cloud aws --placement_region ap-south-1 --placement_zone ap-south-1a >& /root/yb-master.out &

Start the YB-master on the other two nodes with the respective change in rpc_bind_addresses.

On Node 2:

./bin/yb-master --master_addresses 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 --rpc_bind_addresses 192.168.33.22:7100 --fs_data_dirs "/root/disk1,/root/disk2" --placement_cloud aws --placement_region ap-south-1 --placement_zone ap-south-1a >& /root/yb-master.out &

On Node 3:

./bin/yb-master --master_addresses 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 --rpc_bind_addresses 192.168.33.23:7100 --fs_data_dirs "/root/disk1,/root/disk2" --placement_cloud aws --placement_region ap-south-1 --placement_zone ap-south-1a >& /root/yb-master.out &
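
Before moving on, it's worth confirming that the three masters have actually formed a quorum. A quick sketch using yb-admin's list_all_masters against the same master addresses (run from any primary node):

# All three masters should be listed, with exactly one of them in the LEADER role.
./bin/yb-admin -master_addresses 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 list_all_masters
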
  2. Once the quorum is formed in the primary cluster, define the primary cluster placement on any one of the nodes in the primary cluster.

On Node 1:

./bin/yb-admin -master_addresses 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 modify_placement_info aws.ap-south-1.ap-south-1a 3 primary_cluster
  • aws.ap-south-1.ap-south-1a - cloudname.regionname.zonename; this must match the --placement_cloud, --placement_region, and --placement_zone values passed to the yb-master and yb-tserver processes
  • 3 - Replication Factor (RF) of the primary cluster
  • primary_cluster - The placement identifier of the Primary Cluster. It can be any meaningful unique string, but it must match the --placement_uuid passed to the primary cluster's yb-tservers.
  3. Now define the read replica placement on any one of the nodes in the primary cluster.

On Node 1:

./bin/yb-admin -master_addresses 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 add_read_replica_placement_info aws.ap-south-2.ap-south-2a:1 1 readreplica_cluster
  • aws.ap-south-2.ap-south-2a:1 - cloudname.regionname.zonename:number of replicas to place in that zone; again, this must match the placement flags passed to the replica's yb-tserver
  • 1 - The total number of read replicas (the RF of the read replica cluster)
  • readreplica_cluster - The placement identifier of the Read Replica Cluster. It can be any meaningful unique string, but it must match the --placement_uuid passed to the replica's yb-tserver.
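
To double-check that both placements were registered, you can dump the universe configuration. A minimal sketch using yb-admin's get_universe_config, whose JSON output should contain the live (primary) placement with RF 3 and the read replica placement with RF 1:

# Inspect the replication/placement info stored in the master's cluster config.
./bin/yb-admin -master_addresses 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 get_universe_config
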
  4. Start the YB-Tservers on all the nodes in the primary cluster

On Node 1:

./bin/yb-tserver --tserver_master_addrs 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 --rpc_bind_addresses 192.168.33.21:9100 --enable_ysql --pgsql_proxy_bind_address 192.168.33.21:5433 --cql_proxy_bind_address 192.168.33.21:9042 --fs_data_dirs "/root/disk1,/root/disk2" --placement_cloud aws --placement_region ap-south-1 --placement_zone ap-south-1a --placement_uuid primary_cluster >& /root/yb-tserver.out &

On Node 2:

./bin/yb-tserver --tserver_master_addrs 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 --rpc_bind_addresses 192.168.33.22:9100 --enable_ysql --pgsql_proxy_bind_address 192.168.33.22:5433 --cql_proxy_bind_address 192.168.33.22:9042 --fs_data_dirs "/root/disk1,/root/disk2" --placement_cloud aws --placement_region ap-south-1 --placement_zone ap-south-1a --placement_uuid primary_cluster >& /root/yb-tserver.out &

On Node 3:

./bin/yb-tserver --tserver_master_addrs 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 --rpc_bind_addresses 192.168.33.23:9100 --enable_ysql --pgsql_proxy_bind_address 192.168.33.23:5433 --cql_proxy_bind_address 192.168.33.23:9042 --fs_data_dirs "/root/disk1,/root/disk2" --placement_cloud aws --placement_region ap-south-1 --placement_zone ap-south-1a --placement_uuid primary_cluster >& /root/yb-tserver.out &

You can now verify the status of the tablet servers on the yb-master admin portal.
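
If you prefer the command line over the admin portal, the same check can be done with yb-admin; a sketch that lists every registered tablet server along with its liveness:

# The three primary-cluster tablet servers should be listed here as alive.
./bin/yb-admin -master_addresses 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 list_all_tablet_servers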

  5. Start the YB-Tservers on the replica node in the secondary region

On DR Node:

./bin/yb-tserver --tserver_master_addrs 192.168.33.21:7100,192.168.33.22:7100,192.168.33.23:7100 --rpc_bind_addresses 192.168.33.55:9100 --enable_ysql --pgsql_proxy_bind_address 192.168.33.55:5433 --cql_proxy_bind_address 192.168.33.55:9042 --fs_data_dirs "/root/disk1,/root/disk2" --placement_cloud aws --placement_region ap-south-2 --placement_zone ap-south-2a --placement_uuid readreplica_cluster >& /root/yb-tserver.out &

Now, the tablet servers list on the primary cluster's admin portal is updated to include the read replica node.


This confirms that replication between the two data centers is configured successfully; the read replica cluster will now start replicating data asynchronously.
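
With replication flowing, read traffic can be pointed at the replica node. Below is a minimal sketch of a follower read through YSQL; the table demo_orders is purely illustrative (create it on the primary first), and the ysqlsh path and yugabyte user are the defaults shipped with the database. Follower reads must run in a read-only transaction and may return slightly stale data, bounded by yb_follower_read_staleness_ms:

# Connect to the replica's YSQL endpoint and read locally from the follower copies.
./bin/ysqlsh -h 192.168.33.55 -p 5433 -U yugabyte -c "SET yb_read_from_followers = true; BEGIN TRANSACTION READ ONLY; SELECT count(*) FROM demo_orders; COMMIT;"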

To wrap up, we've successfully configured replication between the two clusters, initiating seamless data replication between regions. This enhances a multi-data-center setup, ensuring both resilience and reliability. For questions, please comment below or reach out to us. Happy replicating!

Stay connected with Mydbops Blogs for more technical insights and discoveries in the PostgreSQL and YugabyteDB ecosystem.

Also read: PostgreSQL 16: What's New? A Comprehensive Overview of the Latest Features


About the Author

Mydbops
