6 Step MySQL Point-In-Time recovery on AWS RDS

Mar 13, 2021

Mins to Read

All

illustration representing mysql point in time recovery on AWS RDS

Point-In-Time Recovery in MySQL RDS: How We Rolled Back 16M Faulty Updates

‍

Recently one of our customers ran into an issue, wherein a bad actor(code) from the application had made the wrong update to 16 M records of a critical table in the database, causing the entire production process to go down. The application Team was able to find the bad actor and block it, our Remote DBA was involved in the Data Recovery/Rollback.

Here I would like to discuss possible recovery methods for the above said scenario

‍

Delayed Slave:

A simple and effective way to recover is by using a delayed slave, RDS started supporting this feature from version 5.6.40 and 5.7.22 i.e., you can induce a SQL thread delay-interval for applying the writes to a slave, detailed implementation is covered in our blog on delayed replication with Amazon RDS. It’s quite simple if the impact or the wrong update/delete from the application found within the delayed interval defined, just proceed to pause the replication in slave and recover the data. This option is crossed-out in our case since we don’t have one.

‍

Recovery From Binlog:

Binlogs are very trivial in MySQL as they are the heart of replication as well as the source of point in time recovery irrespective of the engine used. With RDS binlogs are purged as soon as possible and also once the replicas are in sync. To increase the retention of binlogs set the retention adequately based on the requirement and disk availability as below

 
call mysql.rds_set_configuration('binlog retention hours', 24);

When you have enough retention of binlogs and Binlog row image set to FULL, you can proceed to make a flash-back i.e, Rollback of the updates as mentioned in our blog on Flashback Recovery in MariaDB/MySQL Servers. This option is also crossed-out since binlogs was purged

‍

Point-In-Time-Recovery:

Now we had to fall back on the “Point-In-Time” recovery with MySQL RDS.

How Point In Time Recovery (PTIR) in RDS works?

This works in combination with both the automated backups(snapshots) as well as binlogs. Binlogs are backed up every 5 mins to S3(AWS object store) by default in RDS when you have it enabled in fact backup and binlogs are interdependent with RDS, and also backup retention(max 35 days) defines the max time you can go back in time with your data.

For example, say you have the backup retention of 7 days ie., you can move back and forth in time with data within 7 days interval. Lets say now you require to restore to a point in time say 10.00 Am on day 5 below are the steps it follows

The nearest full backup(automated) would be restored first ie., day 4 full backup taken at 12:00 AM
Now the incremental data gets applied by restoring the Binlogs post the full backup to the requested time ie., from 12:00 Am to 10:00 AM

I will now walk you through the steps that we followed for restoring a production instance

‍

Step 1

Choose the instance below for which you need to do a point-in-time recovery

Search and select a MySQL database instance for recovery in AWS RDS.

Step 2

- Now click Actions —-> Restore to point in time as shown below

Dropdown menu showing point-in-time restore option under Actions in AWS RDS.

Step 3

- Launch DB instance Wizard will appear as below, you can see “Last restorable time” defines the max time to which recovery is possible and “Custom” help you in choosing the custom time within the restorable window ie., within backup retention set

instance specifications for creating a new database instance on AWS.

Step 4

Choose “Custom” and you can note the date and time wizard. Set the time to the point based on the requirement to have your data restored.

"Launch DB Instance" that allows selecting a point-in-time restore for a database.

Step 5

Now proceed to chose a name for the instance i have made it as “recovery” engine type would be same as the source instance.

instance specifications of the AWS RDS console.

Step 6

You can also proceed to choose the storage, connectivity and log options am not mentioning all the options here , finally proceed with the “continue” option at the bottom

AWS RDS "Launch DB Instance" under the Maintenance section.

Now you can see the backup getting restored to a new instance with the provided name. Restoration time greatly depend upon the number of binlogs to be applied ie., lesser the number of binlogs to be applied between the snapshot to recovery time faster would be the restore. On successful restore, you can see the below event notification “Restored DB instance instance_name to time”

successful database recovery in AWS RDS.

We were able to successfully recover/rollback the data which was updated wrongly on to a new instance and restore it back to the production

‍

Key Take-aways

With PITR which greatly reduces RPO(Recovery Point Object) of RDS to 5 mins, and also comes with some cost.
Always make sure to have the durability settings enabled with RDS for best results with PITR again it comes with a performance penalty on IO.
The higher the number of backup retention greater is the ability to move much longer back in time again it adds storage cost for snapshots.
‍

Need help managing your MySQL RDS environment or preparing for disaster recovery? Let the experts handle it for you. Explore our MySQL Managed Services to ensure high availability, backup strategies, and proactive monitoring—so you’re never caught off guard.

#databaseadministrator

database management

MySQL Insights

No items found.