How to Switch Replica Master of a non-GTID Slave in Percona Cluster ?

Mydbops
Jan 31, 2019
8
Mins to Read
All

Disaster averted! Node failure throws a wrench in your Percona cluster's replication? Don't fret! This blog post unveils a 4-step method to swiftly switch your non-GTID slave to a new master, minimizing downtime and ensuring business continuity.

Recently i worked on a production issue for one of our client under support .They have a architecture of a three node Galera cluster with one asynchronous slave .

  • Node1 – 172.10.2.11
  • Node2 – 172.10.2.12
  • Node3 – 172.10.2.13
  • Replica – 172.10.2.14

Architecture

Replica Master in a Non-GTID Percona Cluster
Replica Master in a Non-GTID Percona Cluster

The slave(replica) was configured with node3 as replica master. Unfortunately the node 3 was crashed with an OOM killer ,also server has a low gcache size, so when i am trying to start the node 3 , it went to SST . Here the data size was around 2.6 TB , in general for completion of whole SST and joining the node back to cluster will take around  approximately 12 hours.

As i told earlier, the replication slave was under node3  and all reporting applications were pointed to async slave only .So, I can’t wait upto 12 hours as it will affect my entire client reporting environment.

To overcome this scenario, i had planned to switch my async slave under node 2 ( 172.10.2.12 ) . By this blog post, i am going to explain the steps how i was able to achieve this .

Step 1:

Stop the slave server for getting persistent log file and position in async slave.

 
mysql> STOP SLAVE SQL_THREAD;
Query OK, 0 rows affected (0.11 sec)
	

From the command SHOW SLAVE STATUS\G get the Relay_Log_File and Read_Master_Log_Pos.

 
mysql> show slave status\G
*************************** 1. row ***************************
         Slave_IO_State: Reconnecting after a failed master event read
                 Master_Host: 172.10.2.13
                 Master_User: repl
                 Master_Port: 3306
               Connect_Retry: 60
              Master_Log_File: mysql-bin.003099
        Read_Master_Log_Pos: 677232126
            Relay_Log_File: replica-relay-bin.009093
                Relay_Log_Pos: 677232323
        Relay_Master_Log_File: mysql-bin.003099
	

Here Relay_Log_File is replica-relay-bin.009093 and Read_Master_Log_Pos is 677232126

Step 2:

Decode the respective Relay_Log_File and get the Xid value using the position of Read_Master_Log_Pos .

 
mysqlbinlog --no-defaults --base64-output=decode-rows -vv /data/mysql/replica-relay-bin.009093 > /home/mydbops/replica-relay-bin.009093.txt

[mydbops@replica ~]$ less replica-relay-bin.009093.txt | grep "677232126"#181202  3:27:13 server id 12  end_log_pos 677232126 CRC32 0xc818fec0   Xid = 572228464
	

Extended view of replica-relay-bin.009093

 
# at 677232001
#181202  3:27:13 server id 12  end_log_pos 677231872 CRC32 0x19bffe41   Query thread_id=7239194 exec_time=1     error_code=0
SET TIMESTAMP=1543701433/*!*/;
SET @@session.foreign_key_checks=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
BEGIN
/*!*/;
# at 677232069
#181202  3:27:13 server id 12  end_log_pos 677231983 CRC32 0x8c9b2a51   Rows_query
# insert into mydbops.wsrep_paused(tag,counter) values ( NAME_CONST('node',2),@new_value)
# at 677232180
#181202  3:27:13 server id 12  end_log_pos 677232046 CRC32 0xa0e15b87   Table_map: `mydbops`.`wsrep_paused` mapped to number 297
# at 677232243
#181202  3:27:13 server id 12  end_log_pos 677232095 CRC32 0xe5c9c37e   Write_rows: table id 297 flags: STMT_END_F
### INSERT INTO `mydbops`.`wsrep_paused`
### SET
###   @1=1517733 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @3=7555.01              /* FLOAT meta=4 nullable=1 is_null=0 */
###   @4=1543701433 /* TIMESTAMP(0) meta=0 nullable=1 is_null=0 */
# at 677232292
#181202  3:27:13 server id 12  end_log_pos 677232126 CRC32 0xc818fec0   Xid=572228464
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
	

Step 3:

Using Xid number get end_log_pos from node 2 to which we are going configure replication.

 
[mydbops@db02 ~]$  sudo mysqlbinlog /data/mysql/mysql-bin.002279 --base64-output=DECODE-ROWS --verbose | grep "Xid = 572228464"#181202  3:27:13 server id 12  end_log_pos 815896364 CRC32 0x519db5da   Xid = 572228464
	

from Xid  ‘572228464’ we got the binlog position ‘815896364‘ in mysql-bin.002279

Step 4:

Pointing the replication to node 2 by running change master in slave server.

 
mysql> reset slave;
Query OK, 0 rows affected (0.11 sec)

mysql> reset slave all; 
Query OK, 0 rows affected (0.01 sec)

mysql> change master to master_host='172.20.4.12',master_port=3306,master_log_file='mysql-bin.002279',master_log_pos=815896364;
Query OK, 0 rows affected (0.02 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G
*************************** 1. row ***************************
           Slave_IO_State: Queueing master event to the relay log
                  Master_Host: 172.20.4.12
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.002280
          Read_Master_Log_Pos: 100575837
               Relay_Log_File: replica-relay-bin.000002
                Relay_Log_Pos: 317
        Relay_Master_Log_File: mysql-bin.002279
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
	

After switch over current master for replica is node 2 (172.20.4.12):

after_switch
Replica Master in a Non-GTID Percona Cluster

Conclusion:

  • From the four simple steps i was able to get back the slave and make it  production ready, which reduces the impact of downtime of a switch with out GTID.

From replication to performance optimization, Mydbops empowers you to streamline tasks and ensure database health.

{{cta}}

No items found.

About the Author

Mydbops

Subscribe Now!

Subscribe here to get exclusive updates on upcoming webinars, meetups, and to receive instant updates on new database technologies.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.