Boost MySQL Database Performance with LVM Striping!
I hope everyone is aware of LVM (Logical Volume Manager), an extremely useful tool for managing storage at various levels. LVM works by layering abstractions on top of physical storage devices, as shown in the illustration below.
Below is a simple diagrammatic representation of LVM:
sda1 sdb1 (PV:s on partitions or whole disks)
\ /
\ /
vgmysql (VG)
/ | \
/ | \
data log tmp (LV:s)
| | |
xfs ext4 xfs (filesystems)
IOPS (I/O operations per second) is an extremely important resource when it comes to storage: it defines the performance of a disk. Let’s not forget PIOPS (Provisioned IOPS), one of the major selling points of AWS and other cloud vendors for production machines such as databases. Since the disk is the slowest component in a server, we can compare the major components as follows: consider the CPU in the speed range of a fighter jet, RAM in the speed range of an F1 car, and a hard disk in the speed range of a bullock cart. With modern hardware improvements, IOPS is also seeing significant gains with SSDs.
In this blog, we are going to merge and stripe multiple HDD drives to reap the benefit of their combined IOPS.
Below are the disks attached to my server. Each is an ~11 TB disk with a maximum supported IOPS of 600.
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 10G 0 disk
sda1 8:1 0 10G 0 part
sdb 8:16 0 10.9T 0 disk
sdc 8:32 0 10.9T 0 disk
sdd 8:48 0 10.9T 0 disk
sde 8:64 0 10.9T 0 disk
sdf 8:80 0 10.9T 0 disk
sdg 8:96 0 10.9T 0 disk
sda is the root partition; sd[b-g] are the attached HDD disks.
Merely merging these disks in a linear fashion gives you easier space management. With striping, our aim is to reach 600 * 6 = 3600 IOPS, or at least a value somewhere around 3.2k to 3.4k.
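The arithmetic behind that target can be sketched in a couple of lines of shell (the 600 IOPS per-disk limit is the figure quoted above; the ~90% efficiency factor is an assumption to account for striping overhead, not a measured value):

```shell
# Theoretical aggregate IOPS for a 6-disk stripe at 600 IOPS per disk
PER_DISK_IOPS=600
DISKS=6
echo "theoretical: $((PER_DISK_IOPS * DISKS)) IOPS"             # 3600
# Assumed ~90% efficiency to account for striping overhead
echo "realistic:   $((PER_DISK_IOPS * DISKS * 90 / 100)) IOPS"  # 3240
```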
Now let’s proceed to create the PV (Physical volume)
# pvcreate /dev/sd[b-g]
Physical volume "/dev/sdb" successfully created.
Physical volume "/dev/sdc" successfully created.
Physical volume "/dev/sdd" successfully created.
Physical volume "/dev/sde" successfully created.
Physical volume "/dev/sdf" successfully created.
Physical volume "/dev/sdg" successfully created.
Validating the PV status:
# pvs
PV VG Fmt Attr PSize PFree
/dev/sdb lvm2 --- 10.91t 10.91t
/dev/sdc lvm2 --- 10.91t 10.91t
/dev/sdd lvm2 --- 10.91t 10.91t
/dev/sde lvm2 --- 10.91t 10.91t
/dev/sdf lvm2 --- 10.91t 10.91t
/dev/sdg lvm2 --- 10.91t 10.91t
Let’s proceed to create a volume group (VG) named “vgmysql” combining the PVs, with a physical extent (PE) size of 1 MiB. (The PE is the smallest unit of allocation within a VG, similar to the block size of a physical disk.)
# vgcreate -s 1M vgmysql /dev/sd[b-g] -v
Wiping internal VG cache
Wiping cache of LVM-capable devices
Wiping signatures on new PV /dev/sdb.
Wiping signatures on new PV /dev/sdc.
Wiping signatures on new PV /dev/sdd.
Wiping signatures on new PV /dev/sde.
Wiping signatures on new PV /dev/sdf.
Wiping signatures on new PV /dev/sdg.
Adding physical volume '/dev/sdb' to volume group 'vgmysql'
Adding physical volume '/dev/sdc' to volume group 'vgmysql'
Adding physical volume '/dev/sdd' to volume group 'vgmysql'
Adding physical volume '/dev/sde' to volume group 'vgmysql'
Adding physical volume '/dev/sdf' to volume group 'vgmysql'
Adding physical volume '/dev/sdg' to volume group 'vgmysql'
Archiving volume group "vgmysql" metadata (seqno 0).
Creating volume group backup "/etc/lvm/backup/vgmysql" (seqno 1).
Volume group "vgmysql" successfully created
We will check the volume group status with vgdisplay:
# vgdisplay -v
--- Volume group ---
VG Name vgmysql
System ID
Format lvm2
Metadata Areas 6
Metadata Sequence No 1
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 6
Act PV 6
VG Size 65.48 TiB
PE Size 1.00 MiB
Total PE 68665326
Alloc PE / Size 0 / 0
Free PE / Size 68665326 / 65.48 TiB
VG UUID 51KvHN-ZqgY-LyjH-znpq-Ufy2-AUVH-OqRNrN
Now that our volume group is ready, let’s create the logical volume (LV) with a stripe size of 16 KiB, equal to the InnoDB page size of MySQL, striped across the 6 attached disks.
# lvcreate -L 7T -I 16k -i 6 -n mysqldata vgmysql
Rounding size 7.00 TiB (234881024 extents) up to stripe boundary size 7.00 TiB (234881028 extents).
Logical volume "mysqldata" created.
-L volume size
-I stripe size
-i number of stripes (equal to the number of disks)
-n LV name
vgmysql volume group to use
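A more compact way to confirm the stripe layout (a sketch, assuming the vgmysql volume group created above; requires root) is lvs in segment mode:

```shell
# --segments adds Type, #Str (stripe count) and Stripe (stripe size)
# columns; the new LV should show type striped, 6 stripes, 16.00k
lvs --segments vgmysql
```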
lvdisplay -m provides a complete view of the logical volume, including the stripe layout:
# lvdisplay -m
--- Logical volume ---
LV Path /dev/vgmysql/mysqldata
LV Name mysqldata
VG Name vgmysql
LV UUID Y6i7ql-ecfN-7lXz-GzzQ-eNsV-oax3-WVUKn6
LV Write Access read/write
LV Creation host, time warehouse-db-archival-none, 2019-08-26 15:50:20 +0530
LV Status available
# open 0
LV Size 7.00 TiB
Current LE 7340034
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 384
Block device 254:0
--- Segments ---
Logical extents 0 to 7340033:
Type striped
Stripes 6
Stripe size 16.00 KiB
Now we will format the volume with XFS and mount it:
# mkfs.xfs /dev/mapper/vgmysql-mysqldata
Below are the mount options used. Note that mkfs.xfs picks up the stripe geometry from LVM automatically: sunit=32 and swidth=192 are expressed in 512-byte sectors, i.e. a 16 KiB stripe unit and a 96 KiB (6 x 16 KiB) stripe width, matching our 6-disk stripe.
/dev/mapper/vgmysql-mysqldata on /var/lib/mysql type xfs (rw,noatime,nodiratime,attr2,nobarrier,inode64,sunit=32,swidth=192,noquota)
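To make the mount persistent across reboots, a matching /etc/fstab entry could look like the sketch below (note: nobarrier has been deprecated and later removed from XFS in newer kernels, so omit it there):

```
/dev/mapper/vgmysql-mysqldata  /var/lib/mysql  xfs  noatime,nodiratime,attr2,nobarrier,inode64,noquota  0 0
```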
Now let’s proceed with an fio test to get an I/O benchmark. Since fio writes its test files to the current directory, run it from within the mounted filesystem.
Command:
# fio --randrepeat=1 --name=randrw --rw=randrw --direct=1 --ioengine=libaio --bs=16k --numjobs=10 --size=512M --runtime=60 --time_based --iodepth=64 --group_reporting
Result:
read : io=1467.8MB, bw=24679KB/s, iops=1542, runt= 60903msec
slat (usec): min=3, max=1362.7K, avg=148.74, stdev=8772.92
clat (msec): min=2, max=6610, avg=233.47, stdev=356.86
lat (msec): min=2, max=6610, avg=233.62, stdev=357.65
write: io=1465.1MB, bw=24634KB/s, iops=1539, runt= 60903msec
slat (usec): min=4, max=1308.1K, avg=162.97, stdev=8196.09
clat (usec): min=551, max=5518.4K, avg=180989.83, stdev=316690.67
lat (usec): min=573, max=5526.4K, avg=181152.80, stdev=317708.30
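While the fio job runs, it is worth confirming that the load really fans out across all six disks. A sketch using iostat (from the sysstat package; device names as in the lsblk output above):

```shell
# Extended per-device stats every 2 seconds; under the fio load,
# sdb through sdg should each show a similar share of r/s and w/s
iostat -x 2 | grep -E '^(Device|sd[b-g])'
```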
We achieved the desired ~3.1k IOPS (1542 read + 1539 write) with the merged and striped LVM, compared with the 600 IOPS of a single disk.
Key Takeaways:
- Storage management becomes very easy with LVM
- Distributing IOPS across disks with striping enhances disk performance
- LVM snapshots are available for point-in-time copies
Downsides:
Every tool has its downsides, and we should embrace them while considering the use case it serves best, i.e., IOPS in our case. The major downside is that if any one of the disks fails, there is potential data loss or corruption across the entire volume, since striping (like RAID 0) provides no redundancy.
Workarounds:
- To mitigate this risk of data loss/corruption, we set up HA by adding 3 replicas for this setup in production
- Take regular backups of the striped LVM with XtraBackup, MySQL Enterprise Backup (MEB), or via snapshots
- RAID 0 also serves the same purpose as striped LVM
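As a sketch of the snapshot-based backup mentioned above (it assumes free extents remain in vgmysql for the copy-on-write snapshot, and that writes are quiesced, e.g. with FLUSH TABLES WITH READ LOCK, at the moment the snapshot is taken; the 100G size is an assumption and only needs to absorb changes made while the backup runs):

```shell
# Create a copy-on-write snapshot of the data LV
lvcreate -s -L 100G -n mysqldata_snap /dev/vgmysql/mysqldata

# Mount read-only; nouuid is required because the XFS clone shares a UUID
mkdir -p /mnt/snap
mount -o ro,nouuid /dev/vgmysql/mysqldata_snap /mnt/snap

# ... copy /mnt/snap to backup storage ...

# Clean up
umount /mnt/snap
lvremove -f /dev/vgmysql/mysqldata_snap
```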
{{cta}}
Featured Image by Carl J on Unsplash