In our previous blog, we discussed setting up a three-node YugabyteDB cluster in the same cloud region, achieving a seamless deployment on Linux. Continuing our journey into mastering YugabyteDB, this time we delve into a crucial aspect of database management: building read replicas.
Scaling a database horizontally is imperative for applications dealing with high traffic and complex queries. Asynchronous read replicas play a pivotal role in this scenario, ensuring both performance and fault tolerance. In this blog post, we will explore the steps to create read replicas with YugabyteDB, empowering you to optimize your database for efficient read-heavy workloads.
Understanding the Importance of Read Replicas
Before we dive into the technicalities, let’s take a moment to understand why read replicas are essential in a distributed database system like YugabyteDB. In a nutshell, read replicas are copies of the primary database that can serve read queries. Distributing read operations across multiple replicas increases the system’s read throughput and alleviates the load on the primary cluster, improving both the overall performance of your application and its fault tolerance.
Setting up a read replica cluster
A read replica cluster comprises follower nodes attached to a primary cluster. These nodes act solely as observers: they do not vote in Raft leader elections or count toward the write quorum. This characteristic allows a read replica cluster to have a replication factor (RF) distinct from the primary cluster’s, and remarkably, the RF can even be an even number.
A few important takeaways when building read replica clusters:
- The replication factor (RF) of a read replica cluster can be 1, or even an even number, since its nodes do not participate in leader elections.
- Once a read replica cluster is built with a given RF, that RF cannot be modified later. Plan the value carefully during the initial setup so that it aligns with your scalability and fault tolerance requirements.
- All schema changes are replicated through Raft with eventual consistency. There is no need to apply DDL separately on the read replica cluster, and there is no restriction on which objects are replicated.
- Read replicas use asynchronous replication. Mind the replication lag when failing over to avoid data loss.
Now, let's build a three-node primary cluster with a one-node read replica cluster. This topology can be used to distribute read traffic.
Architecture Overview
Inventory Details
Replication Setup Steps
In our previous blog, we covered the installation and configuration of YugabyteDB. In this post, we will specifically delve into the replication setup.
- Start yb-master on Primary nodes and form a cluster
On Node 1:
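A sketch of the yb-master invocation on Node 1 is shown below. The IP addresses (10.0.1.11, 10.0.1.12, 10.0.1.13 for the three primary nodes) and the data directory path are placeholders; substitute the values from your own inventory.

```bash
# Placeholder IPs for the three primary nodes: 10.0.1.11, 10.0.1.12, 10.0.1.13.
# --master_addresses lists all three master endpoints so the nodes can form a quorum.
./bin/yb-master \
  --master_addresses 10.0.1.11:7100,10.0.1.12:7100,10.0.1.13:7100 \
  --rpc_bind_addresses 10.0.1.11:7100 \
  --fs_data_dirs /data/yugabyte \
  --placement_cloud aws \
  --placement_region ap-south-1 \
  --placement_zone a \
  > /data/yugabyte/yb-master.out 2>&1 &
```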
- rpc_bind_addresses - The IP address of the node on which the command is executed.
- fs_data_dirs - The location of the data directories. Ensure the directories exist and are empty.
- placement_cloud - The cloud provider
- placement_region - The region of the hosted servers
- placement_zone - The zone of the hosted servers
Start yb-master on the other two nodes, changing rpc_bind_addresses to each node's own IP address.
- Once the quorum is formed in the primary cluster, define the primary cluster placement on any one of the nodes in the primary cluster.
On Node 1:
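The placement is defined with yb-admin; the command below is a sketch using the same placeholder master addresses as above.

```bash
# Define the primary (live) placement: zone aws.ap-south-1.a, RF 3,
# identified by the placement string "primary_cluster".
./bin/yb-admin \
  --master_addresses 10.0.1.11:7100,10.0.1.12:7100,10.0.1.13:7100 \
  modify_placement_info aws.ap-south-1.a 3 primary_cluster
```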
- aws.ap-south-1.a - cloudname.regionname.zonename
- 3 - Replication factor (RF) of the primary cluster
- primary_cluster - The placement identifier of the Primary Cluster. It can be any meaningful unique string.
- Now define the read replica placement on any one of the nodes in the primary cluster.
On Node 1:
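Again with yb-admin, a sketch using the same placeholder master addresses:

```bash
# Define the read replica placement: 1 replica in zone aws.ap-south-2.a,
# identified by the placement string "readreplica_cluster".
./bin/yb-admin \
  --master_addresses 10.0.1.11:7100,10.0.1.12:7100,10.0.1.13:7100 \
  add_read_replica_placement_info aws.ap-south-2.a:1 1 readreplica_cluster
```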
- aws.ap-south-2.a:1 - cloudname.regionname.zonename:number of replicas in that zone
- 1 - The total number of read replicas
- readreplica_cluster - The placement identifier of the Read Replica Cluster. It can be any meaningful unique string.
- Start the YB-Tservers on all the nodes in the primary cluster
On Node 1:
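A sketch of the yb-tserver command on Node 1, using the same placeholder addresses and data directory; note that placement_uuid carries the placement identifier we defined for the primary cluster.

```bash
# Tablet server on Node 1 of the primary cluster (placeholder IP 10.0.1.11).
./bin/yb-tserver \
  --tserver_master_addrs 10.0.1.11:7100,10.0.1.12:7100,10.0.1.13:7100 \
  --rpc_bind_addresses 10.0.1.11:9100 \
  --fs_data_dirs /data/yugabyte \
  --placement_cloud aws \
  --placement_region ap-south-1 \
  --placement_zone a \
  --placement_uuid primary_cluster \
  > /data/yugabyte/yb-tserver.out 2>&1 &
```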
Run the same yb-tserver command on Node 2 and Node 3, changing rpc_bind_addresses to each node's own IP address.
You can now verify the status of the tablet servers on the yb-master admin portal (port 7000 by default).
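If you prefer the command line, the same information can be pulled with yb-admin (placeholder master addresses again):

```bash
# Lists every tablet server registered with the masters along with its placement,
# so all three primary-cluster tablet servers should appear here.
./bin/yb-admin \
  --master_addresses 10.0.1.11:7100,10.0.1.12:7100,10.0.1.13:7100 \
  list_all_tablet_servers
```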
- Start the YB-Tservers on the replica node in the secondary region
On DR Node:
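A sketch for the DR node (placeholder IP 10.0.2.21): it points at the primary cluster's masters, carries the read replica region and zone, and uses the readreplica_cluster placement identifier.

```bash
# Tablet server on the read replica node (placeholder IP 10.0.2.21).
./bin/yb-tserver \
  --tserver_master_addrs 10.0.1.11:7100,10.0.1.12:7100,10.0.1.13:7100 \
  --rpc_bind_addresses 10.0.2.21:9100 \
  --fs_data_dirs /data/yugabyte \
  --placement_cloud aws \
  --placement_region ap-south-2 \
  --placement_zone a \
  --placement_uuid readreplica_cluster \
  > /data/yugabyte/yb-tserver.out 2>&1 &
```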
Now, the tablet server list on the primary cluster's admin portal is updated to include the read replica node.
This confirms that replication between the two data centers is configured successfully; the read replica cluster will now start replicating data asynchronously.
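As a quick sanity check of the read path, a client can connect to the read replica node and enable follower reads so that read-only queries are served locally. The snippet below is only a sketch: the host is the same placeholder IP, and the orders table is hypothetical.

```bash
# Connect to the read replica node's YSQL endpoint (default port 5433) and run a
# read-only transaction with follower reads enabled, so the query is served locally.
./bin/ysqlsh -h 10.0.2.21 -p 5433 <<'SQL'
SET yb_read_from_followers = true;
START TRANSACTION READ ONLY;
SELECT count(*) FROM orders;  -- "orders" is only an example table
COMMIT;
SQL
```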
To wrap up, we've successfully configured replication between the two clusters, initiating seamless data replication across regions. This enhances the multi-data center setup, ensuring both resilience and reliability. For questions, please comment or reach out to us. Happy replicating!
Stay connected with Mydbops Blogs for more technical insights and discoveries in the PostgreSQL and YugabyteDB ecosystem.
Also read: PostgreSQL 16: What's New? A Comprehensive Overview of the Latest Features