Investigating a MySQL dead lock

Mydbops
Feb 13, 2018
10
Mins to Read
All

I was involved in one of the development support for MySQL. Here is an interesting case

The application needs a Sequence generator as it was not present inbuilt with MySQL ( MariaDB will have it in 10.3 ) they have made sequence generator function with ‘select for update‘ statement. But it tends to cause more locking at high concurrency with their XtraDB Cluster deployment.

To get rid of this , the sequence generation were made as direct Update queries and sequence is fetched using Select @next.

 
UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'dealer';
	

It was supposed to fix the problem but it does not in the real world test scenario. The table structure is below

 
show create table EO_PK_TABLE\G
*************************** 1. row ***************************
       Table: EO_PK_TABLE
Create Table: CREATE TABLE `EO_PK_TABLE` (
  `NAME` char(40) DEFAULT NULL,
  `PK` decimal(11,0) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
	

Application Exception:

 
:
    Next exception:SQL State:41000 
-- error code: 1205 -- msg: Lock wait timeout exceeded; try restarting transaction"}
	

Let us have a look at the InnoDB Engine status

 
---TRANSACTION 2821395544, ACTIVE 212 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 3397, OS thread handle 139772570343168, query id 195064 10.0.5.128 appuser updating
UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'contact'
------- TRX HAS BEEN WAITING 212 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1087 page no 3 n bits 248 index PRIMARY of table `ccbuser`.`EO_PK_TABLE` trx id 2821395544 lock_mode X locks rec but not gap waiting
	

The processlist snap shot

There is a lock over the index , but what is causing the lock and why it happens ?

 
---TRANSACTION 2821458353, ACTIVE 135 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2
MySQL thread id 5455, OS thread handle 139772578862848, query id 590796 10.0.5.128 appuser Searching rows for update
UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'service'
------- TRX HAS BEEN WAITING 135 SEC FOR THIS LOCK TO BE GRANTED:
	

It is noted that the transaction have UNDO log entries. While Updating the records from application perspective , the sequence generation was supposed to be the first step in the transaction. The UNDO entries further for the queries gives the hint that it is not start of the transaction or there is an open transaction.

 
1) Setting the ISOLATION LEVEL 
2) Increment the EO_PK_TABLE based on NAME Column.
3)Fetch the next ID 
4)Insert into relevant table 
5)Commit.
	

Query Log for Analysis .

I have enabled the query log on this non-prod environment to get the query write pattern. But at cases the transactions are not committed by the application .After the UPDATE over EO_PK_TABLE  it keeps on Updating the columns in  EO_PK_TABLE.

Let us consider a transaction handled by Query ID 5068.

 
5068 Query     commit
-- Transaction starts
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'contact'
5068 Query     SELECT @next contactId
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'account_vod_thrsh'
5068 Query     SELECT @next id
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'address'
5068 Query     SELECT @next addrId
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'service'
5068 Query     SELECT @next servId
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'customer_registration'
5068 Query     SELECT @next id
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'location'
5068 Query     SELECT @next locId
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'bill_delivery'
5068 Query     SELECT @next billDelId
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'account'
5068 Query     SELECT @next acctId
5068 Query     set tx_isolation='READ-COMMITTED'
5068 Query     UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'bob'
5068 Query     SELECT @next bobId
5068 Query     INSERT INTO account(sales_person, do_not_auto_script, direct_acct, mso_id, parental_control, create_date, bu_id, sales_method, acct_id, cancel_reasons_id, tv_dma_id, tax_number, verified_date, status_change_date, external_creation_date, customer_id, attach_bill_cycle, contact_id, compliment_acct, acct_comment, acct_status_id, complaintmode_acct, bill_del_type_id, legacy_id, acct_type_id, tax_exempt, verified, psa_acct_id, bb_acct_id, anonymized, acct_reg_num, email_inv_format, mso_name, verified_channel, offer_type, external_id_suffix, external_id_prefix, active_logins_count, acct_bill_mode, shipping_add_id) VALUES (NULL, 'F', NULL, NULL, NULL, '2017-11-30 14:43:16', 65, 'Call Center', 42389538, NULL, '001', NULL, NULL, NULL, NULL, '171130144316899', 'F', 42391292, 'F', NULL, 1, NULL, NULL, NULL, 2, NULL, NULL, NULL, '171130144316899', NULL, '20171130144316912', NULL, NULL, NULL, NULL, '30144316912', '201711', NULL, 'Monthly', 4.4585868E+7)
5068 Query     rollback
	

This makes sense the application is not closing the transaction perfectly. With persistent connection they use the same connection thread without a proper COMMIT. It causes the lock over the records and the dead lock arises. Now we have the root cause of the problem .

As it is non-prod environment I had the luxury to troubleshoot using Query log. But query log on production ENV will come with a high penalty on IO. So it is good to use the Performance Schema inbuilt with MySQL for trouble shooting.

Performance Schema to Troubleshoot:

The Performance schema is used to analyze the transaction history and its relevant info. Here the query ID = 5511 and Transaction ID = 2821461197

 
*************************** 3. row ***************************
PROCESS ID: 5511
    TXN ID: 2821461197
  EVENT_ID: 10911
EVENT NAME: statement/sql/insert
       SQL: INSERT INTO account(sales_person, do_not_auto_script, direct_acct, mso_id, parental_control, create_date, bu_id, sales_method, acct_id, cancel_reasons_id, tv_dma_id, tax_number, verified_date, status_change_date, external_creation_date, customer_id, attach_bill_cycle, contact_id, compliment_acct, acct_comment, acct_status_id, complaintmode_acct, bill_del_type_id, legacy_id, acct_type_id, tax_exempt, verified, psa_acct_id, bb_acct_id, anonymized, acct_reg_num, email_inv_format, mso_name, verified_channel, offer_type, external_id_suffix, external_id_prefix, active_logins_count, acct_bill_mode, shipping_add_id) VALUES (NULL, 'F', NULL, NULL, NULL, '2017-11-30 17:05:19', 65, 'Call Center', 42392550, NULL, '001', NULL, NULL, NULL, NULL, '171130155238743', 'F', 42394399, 'F', NULL, 1, NULL, NULL, NULL, 2, NULL, NULL, NULL, '171130155238743', NULL, '20171130170520142', NULL, NULL, NULL, NULL, '30170520142', '201711', NULL, 'Monthly', 4.458888E+7)
*************************** 4. row ***************************
PROCESS ID: 5511
    TXN ID: 2821461197
  EVENT_ID: 10912
EVENT NAME: statement/sql/rollback
       SQL: rollback
*************************** 5. row ***************************
PROCESS ID: 5511
    TXN ID: 2821461197
  EVENT_ID: 10913
EVENT NAME: statement/sql/insert
       SQL: INSERT INTO account(sales_person, do_not_auto_script, direct_acct, mso_id, parental_control, create_date, bu_id, sales_method, acct_id, cancel_reasons_id, tv_dma_id, tax_number, verified_date, status_change_date, external_creation_date, customer_id, attach_bill_cycle, contact_id, compliment_acct, acct_comment, acct_status_id, complaintmode_acct, bill_del_type_id, legacy_id, acct_type_id, tax_exempt, verified, psa_acct_id, bb_acct_id, anonymized, acct_reg_num, email_inv_format, mso_name, verified_channel, offer_type, external_id_suffix, external_id_prefix, active_logins_count, acct_bill_mode, shipping_add_id) VALUES (NULL, 'F', NULL, NULL, NULL, '2017-11-30 17:05:20', 65, 'Call Center', 42392549, NULL, '001', NULL, NULL, NULL, NULL, '171130155032832', 'F', 42394398, 'F', NULL, 1, NULL, NULL, NULL, 2, NULL, NULL, NULL, '171130155032832', NULL, '20171130170520142', NULL, NULL, NULL, NULL, '30170520142', '201711', NULL, 'Monthly', 4.4588879E+7)
*************************** 6. row ***************************
PROCESS ID: 5511
    TXN ID: 2821461197
  EVENT_ID: 10914
EVENT NAME: statement/sql/rollback
       SQL: rollback
*************************** 7. row ***************************
PROCESS ID: 5511
    TXN ID: 2821461197
  EVENT_ID: 10915
EVENT NAME: statement/sql/update
       SQL: UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'address'
*************************** 8. row ***************************
PROCESS ID: 5511
    TXN ID: 2821461197
  EVENT_ID: 10916
EVENT NAME: statement/sql/select
       SQL: SELECT @next addrId
*************************** 9. row ***************************
PROCESS ID: 5511
    TXN ID: 2821461197
  EVENT_ID: 10917
EVENT NAME: statement/sql/update
       SQL: UPDATE EO_PK_TABLE SET PK = (@next := PK + 100) WHERE NAME = 'bob'
*************************** 10. row ***************************
PROCESS ID: 5511
    TXN ID: 2821461197
  EVENT_ID: 10918
EVENT NAME: statement/sql/select
       SQL: SELECT @next bobId
10 rows in set (0.00 sec)
	

These information and stats were passed on to development team to fix the code. Once the issue is nailed down perfectly it can be fixed easily. The code was fixed, tested and shipped to production. Now the DB queries scales better .

No items found.

About the Author

Mydbops

Subscribe Now!

Subscribe here to get exclusive updates on upcoming webinars, meetups, and to receive instant updates on new database technologies.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.