During our recent consulting with one of our client, We came across an interesting issue on RDS. The baseline is that “Low IO size on your RDS instance can affect your DB performance”. Yes, It’s IO size, Not IOPS.
We had our production systems running on RDS MySQL with a single master, 3 replicas. All instances are of same type db.m4.4xlarge with same parameter group configuration and the disk size is 1.5 TB. According to the AWS user guide, each of these instances can support up to 4500 (sustained IOPS) guaranteed IOPS.
Find below the Write IOPS graph for all the instances.
It’s understood that Write IOPS / pattern on Master can vary when compared with Slave, due to a lot of factors like binlog row format, log writing etc. But it has to be almost similar for all the slaves given that it has the same configuration. But when we look into the RDS Write IOPS graphs for all the nodes usage of slave-01 is almost 50% lesser than other slave-03 & slave-05.
With the same write set, slave-01 Write IOPS usage is 50% lesser. We have this question passed on to the RDS team requesting clarification. We have got the answer that due to the varying IO size IOPS are different. Following data was provided by the RDS team.
This IO size factor is not documented or This info is not available to check for the instances on the AWS console.
If the variation in IO size (Hidden to customers) can affect the Performance / Usage. We don’t think it’s fair enough.
When comes to the IO sizing, There is no guarantee that we could be getting same IO size instance on the creation of an RDS instance, Even RDS team has no control over it. As of writing today, It’s based on luck.
Further, We analyzed MySQL usage through graphs from PMM.
InnoDB IO Graph from InnoDB Metrics Dashboard
When we check this graph for all replicas, It reflected the same. Which means that InnoDB is performing data write in the same phase. IOPS (count) changes occur when there is Read / Write operations to underlying Disk depending on IO Size.
Summarising my thoughts with Questions and Answers:
1. Will, it only affects writes?
No, It will affect both the read and writes, For reads quantifying on the productions systems would be difficult, So I have taken writes as an example.
2. What exactly is IO size?
Not sure to be exact, My big bet should be on the disk block size or It can also be on Network throttling. If someone has better insights into RDS internals please comment.
3. Which is better Large IO Size or Smaller IO Size?
The instance with higher IO size will be better, As I previously said, Disk is metered on IOPS. High IO size means, Same write operation can be performed with Lesser IOPS, Which means more room to saturate. 16K can be a better one considering the InnoDB default page size.
4. General Recommendation?
As an end user, If you see huge variations in IO usage for your RDS servers, Feel free to reach out the RDS support for further investigation.
RDS should keep up standard hardware configuration. Otherwise, it has to be advised to the customers.