At Mydbops we implement load balancing with MaxScale or ProxySQL (see our presentation) for many of our clients, but these load balancers themselves can become a SPOF (Single Point of Failure). We explored services such as HAProxy, Nginx and Keepalived; except for Keepalived, all of them need to run on a standalone instance and did not satisfy our exact need.
Keepalived does not require any standalone instance. It can be deployed and configured with minimal effort and provides an HA solution for the DB infrastructure. This approach is not limited to our DB setup; the same can be used to provide HA for application or frontend servers as well. It is a very simple and elegant approach.
Below we will see how to set up and configure Keepalived for our requirements.
The question arises: how do you load balance the MaxScale or ProxySQL servers themselves without any standalone instance? One way is to use Keepalived. To use this solution, you need at least two MaxScale or ProxySQL servers, and Keepalived is installed on both of them as explained below. Both servers share a floating IP. The application servers will connect to 192.168.33.30 (the floating IP). Whichever MaxScale or ProxySQL server is currently the master owns that IP. If that server fails, the backup server starts issuing gratuitous ARP responses for the same IP, 192.168.33.30, and requests from the application servers then go through the backup MaxScale or ProxySQL server, which has now become the primary.
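Keepalived ships with most distributions; on the RHEL/CentOS-style hosts used in this lab it can typically be installed straight from the base repository:

yum install keepalived

A minimal keepalived.conf for the MASTER node could look like the sketch below. The interface (enp0s8), floating IP (192.168.33.30) and priorities (101 on the master, 100 on the backup) match what appears in the logs later in this post; the virtual_router_id and the authentication password are illustrative values and simply have to match on both nodes. The health check that ties Keepalived to the MaxScale service is shown further below.

vrrp_instance VI_1 {
    state MASTER
    interface enp0s8                # interface that will carry the floating IP
    virtual_router_id 51            # illustrative; must be identical on both nodes
    priority 101                    # the higher priority wins the MASTER role
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass StrongPass        # illustrative; must be identical on both nodes
    }
    virtual_ipaddress {
        192.168.33.30               # the floating IP the application connects to
    }
}

The BACKUP node uses the same block with state BACKUP and a lower priority (100). Restart Keepalived on both nodes once the configuration is in place.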
Note :
ProxySQL has clustering built in since ProxySQL 1.4 to synchronise its configuration. ProxySQL also offers more fine-grained configuration than the other SQL load balancers compared here.
MaxScale has a config-file-based setup, so the configuration has to be synced manually across the nodes (but changes are minimal on a production setup).
Verifying Keepalived Running on Master
systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2018-01-23 13:21:57 UTC; 2h 22min ago
Process: 3615 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 3616 (keepalived)
CGroup: /system.slice/keepalived.service
├─3616 /usr/sbin/keepalived -D
├─3617 /usr/sbin/keepalived -D
└─3618 /usr/sbin/keepalived -D
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Verifying Keepalived Running on Slave
systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2018-01-23 13:21:30 UTC; 2h 25min ago
Process: 8682 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 8683 (keepalived)
CGroup: /system.slice/keepalived.service
├─8683 /usr/sbin/keepalived -D
├─8684 /usr/sbin/keepalived -D
└─8685 /usr/sbin/keepalived -D
Jan 23 13:25:29 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Received advert with higher priority 101, ours 100
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) removing protocol VIPs.
As we know, 192.168.33.13 (mydbopslabs13) is the MASTER, and the floating IP is assigned to its enp0s8 interface as long as the master is up. This can be verified with:
ip add show enp0s8
Notice that the Keepalived configuration includes a vrrp_script section named check_maxscale. This check runs the MaxScale status on the MASTER. If the return code is '0', the service is considered up; any other return code marks the service as down, and the BACKUP host then takes over the floating IP.
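The exact check script from the original configuration is not reproduced in this section, so the following is only a minimal sketch of such a health check, assuming MaxScale is managed by systemd (systemctl is-active exits 0 while the service is running):

vrrp_script check_maxscale {
    script "/usr/bin/systemctl is-active maxscale"   # exit code 0 = MaxScale is up
    interval 2                                       # run the check every 2 seconds
    fall 2                                           # 2 consecutive failures = down
    rise 2                                           # 2 consecutive successes = up
}

The script is then referenced from inside the vrrp_instance VI_1 block, so that a failed check takes the VRRP instance down on that node and lets the backup take over the floating IP:

    track_script {
        check_maxscale
    }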
Testing the Failover and How Keepalived Helps Minimise Downtime
For this test, I will ping the floating IP 192.168.33.30 from a local machine and manually stop the MaxScale service on the master, so that Keepalived detects the service failure and the slave's Keepalived instance brings the floating IP up on that machine.
Initiate Ping to Floating IP
ping 192.168.33.30
Stop MaxScale on the Master
systemctl stop maxscale
systemctl status maxscale
maxscale.service - MariaDB MaxScale Database Proxy
Loaded: loaded (/usr/lib/systemd/system/maxscale.service; disabled; vendor preset: disabled)
Active: inactive (dead)
Jan 23 13:25:49 mydbopslabs13 maxscale[3883]: [mysqlmon] No Master can be determined
Jan 23 16:15:12 mydbopslabs13 systemd[1]: Stopping MariaDB MaxScale Database Proxy...
Jan 23 16:15:12 mydbopslabs13 maxscale[3883]: Waiting for housekeeper to shut down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Finished MaxScale log flusher.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Housekeeper shutting down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Housekeeper has shut down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale received signal SIGTERM. Exiting.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale is shutting down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale shutdown completed.
Jan 23 16:15:13 mydbopslabs13 systemd[1]: Stopped MariaDB MaxScale Database Proxy.
Ping Output
ping 192.168.33.30
PING 192.168.33.30 (192.168.33.30) 56(84) bytes of data.
64 bytes from 192.168.33.30: icmp_seq=1 ttl=64 time=0.288 ms
64 bytes from 192.168.33.30: icmp_seq=2 ttl=64 time=0.285 ms
64 bytes from 192.168.33.30: icmp_seq=3 ttl=64 time=0.220 ms
64 bytes from 192.168.33.30: icmp_seq=4 ttl=64 time=0.208 ms
64 bytes from 192.168.33.30: icmp_seq=5 ttl=64 time=0.187 ms
64 bytes from 192.168.33.30: icmp_seq=6 ttl=64 time=0.236 ms
64 bytes from 192.168.33.30: icmp_seq=7 ttl=64 time=0.293 ms
64 bytes from 192.168.33.30: icmp_seq=8 ttl=64 time=0.198 ms
64 bytes from 192.168.33.30: icmp_seq=9 ttl=64 time=0.255 ms
64 bytes from 192.168.33.30: icmp_seq=10 ttl=64 time=0.440 ms
64 bytes from 192.168.33.30: icmp_seq=11 ttl=64 time=0.288 ms
64 bytes from 192.168.33.30: icmp_seq=12 ttl=64 time=0.366 ms
64 bytes from 192.168.33.30: icmp_seq=13 ttl=64 time=0.507 ms
64 bytes from 192.168.33.30: icmp_seq=14 ttl=64 time=0.401 ms
64 bytes from 192.168.33.30: icmp_seq=15 ttl=64 time=0.284 ms
64 bytes from 192.168.33.30: icmp_seq=16 ttl=64 time=0.366 ms
64 bytes from 192.168.33.30: icmp_seq=18 ttl=64 time=0.510 ms
64 bytes from 192.168.33.30: icmp_seq=19 ttl=64 time=0.242 ms
64 bytes from 192.168.33.30: icmp_seq=20 ttl=64 time=0.254 ms
64 bytes from 192.168.33.30: icmp_seq=21 ttl=64 time=0.175 ms
64 bytes from 192.168.33.30: icmp_seq=22 ttl=64 time=0.205 ms
64 bytes from 192.168.33.30: icmp_seq=23 ttl=64 time=0.246 ms
64 bytes from 192.168.33.30: icmp_seq=24 ttl=64 time=0.231 ms
64 bytes from 192.168.33.30: icmp_seq=25 ttl=64 time=0.249 ms
64 bytes from 192.168.33.30: icmp_seq=26 ttl=64 time=0.243 ms
As we can see in the output above, only one packet was dropped (icmp_seq 17 is missing). Within a few seconds the floating IP was assigned to the slave and it started to respond.
Check Floating IP assigned to Slave
ip add show enp0s8
3: enp0s8: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 08:00:27:a8:1d:d7 brd ff:ff:ff:ff:ff:ff
inet 192.168.33.12/24 brd 192.168.33.255 scope global enp0s8
valid_lft forever preferred_lft forever
inet 192.168.33.30/32 scope global enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fea8:1dd7/64 scope link
valid_lft forever preferred_lft forever
Once the MaxScale service on the master is back online, the floating IP is automatically re-assigned to the master server. The same approach can be applied to a ProxySQL cluster as well.
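This automatic failback happens because the master carries the higher VRRP priority (101 versus 100 in the logs above), so it preempts the VIP as soon as its health check passes again. If keeping the VIP on the surviving node is preferred, Keepalived offers a nopreempt option; this is only a possible variation, not part of the setup shown here, and it requires both instances to start in state BACKUP:

vrrp_instance VI_1 {
    state BACKUP                  # both nodes must start as BACKUP for nopreempt
    nopreempt                     # do not take the VIP back when this node recovers
    interface enp0s8
    virtual_router_id 51          # illustrative, as in the earlier sketch
    priority 101                  # priorities still differ per node (101 vs 100)
    advert_int 1
    virtual_ipaddress {
        192.168.33.30
    }
}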
We believe this blog was worth your time. If you have any comments, please share them below.