As a MySQL Administrator, you will from time to time be called on to diagnose locking problems. Unfortuantely, MySQL does not provide many tools to help with this - you need to be able to find your own way around.
If this wasn't bad enough, there are usually other factors that make things even worse:
- Locks are short-lived.
- As soon as the offending transaction performs a COMMIT, the problem is gone.
- When other transactions reach the table_lock_wait_timeout, the statement stops waiting for the lock
- It tends to happen during "busy" periods. And problems during busy periods get managers attention. So it needs to be sorted quickly.
- They are intermittent.
- They re difficult to reproduce.
- The application may not handle SQL errors gracefully at all. This may in turn cascade into other problems, e.g.:
- Inconsistent data in the database. Although the data are consistent from a MySQL point-of-view, it may well be inconsistent from a business logic point of view. E.g. Orders being processed without appearing on the customers's account.
- Errors messages to end users.
To start with, create a SQL file (e.g. lock-diag.sql) that contains this:
FROM information_schema.`PROCESSLIST` P
where time > 1
and command not in ( 'Sleep', 'Connect');
show engine innodb status\G
and then run this script whenever you see a locking problem:
mysql -uroot < lock-diag.sql | less -S
The output contains all of the information you need to figure out which process is getting in the way.
First, find a process that is waiting for a lock -hopefully there is only one. The relevant fragment could look like this (the actual numbers will vary, but the important bits are highlighted in bold):
---TRANSACTION 0 10599, ACTIVE 530 sec, process no 10487, OS thread id 3004115856 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 320, 1 row lock(s)
MySQL thread id 3661, query id 77476 localhost karl updating
delete from foo
------- TRX HAS BEEN WAITING 7 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 56 n bits 72 index `GEN_CLUST_INDEX` of table `karl`.`foo` trx id 0 10599 lock_mode X waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
0: len 6; hex 0000000004fe; asc ;; 1: len 6; hex 000000002966; asc )f;; 2: len 7; hex 0000000035121f; asc 5 ;; 3: len 4; hex 80000007; asc ;;
The last bit indicates the ID of the transaction that holds the lock (000000002966 in this case). Unfortunately, this is in hexadecimal, so it needs to be converted to normal base 10 before we can use it. Most calculators should be able to do this. Here, 2966 in hex i= 10598 in base 10.
Then it is simply a question of finding the transaction (10598 here) and identifying the MySQL Thread ID:
---TRANSACTION 0 10598, ACTIVE 808 sec, process no 10487, OS thread id 3003980688
2 lock struct(s), heap size 320, 2 row lock(s), undo log entries 1
MySQL thread id 3660, query id 77457 localhost karl
which tells us that the MySQL Thread ID for transaction 10598 is 3660. Killing this should resolve things:
mysql> kill 3660;
Query OK, 0 rows affected (0.00 sec)
PS: Others have written about this problem too: http://www.xaprb.com/blog/2007/09/18/how-to-debug-innodb-lock-waits/