一个mysql事务引发的思考(血案)
问题简述
大家都知道mysql是支持ACID,支持事务的,事务是非常重要的一个特性,要不都执行成功,要不都不成功。我们在coding时也会大量用到,但是随着业务代码的累加与使用人数的增加,系统有某些场景下会出现下面这个错误
11:48:40.265[http-nio-8999-exec-3] ERROR c.x.e.l.c.e.ExceptionHandleFilter - exception message: {} com.xxx.common.exception.ExceptionHandleFilter:45
org.springframework.dao.CannotAcquireLockException:
### Error updating database. Cause: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
### The error may involve com.xxx.dao.issuegroup.CatalogTaskDao.insert-Inline
### The error occurred while setting parameters
### SQL: INSERT INTO `catalog_task` (`issue_id`,`catalog_id`,`status`,`cuserid`) VALUES ( ?,?,?,?)
### Cause: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
; SQL []; Lock wait timeout exceeded; try restarting transaction; nested exception is java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:259) ~[spring-jdbc-4.2.3.RELEASE.jar:4.2.3.RELEASE]
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) ~[spring-jdbc-4.2.3.RELEASE.jar:4.2.3.RELEASE]
at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:75) ~[mybatis-spring-1.3.0.jar:1.3.0]
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:447) ~[mybatis-spring-1.3.0.jar:1.3.0]
at com.sun.proxy.$Proxy99.insert(Unknown Source) ~[na:na]
at org.mybatis.spring.SqlSessionTemplate.insert(SqlSessionTemplate.java:279) ~[mybatis-spring-1.3.0.jar:1.3.0]
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:56) ~[mybatis-3.4.0.jar:3.4.0]
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:53) ~[mybatis-3.4.0.jar:3.4.0]
问题跟踪
复现操作场景,跟踪期间执行的sql语句
开启mysql generallog日志,并找到日志路径,如果未开启的话执行 set global general_log = on;
即可。
mysql> show global variables like '%general%';
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 16082
Current database: lean_dev
+------------------+---------------------------------+
| Variable_name | Value |
+------------------+---------------------------------+
| general_log | ON |
| general_log_file | /var/lib/mysql/9905e4ad340b.log |
+------------------+---------------------------------+
2 rows in set (0.61 sec)
需要注意的是问题复现期间建议使用单独的mysql机器,我使用的是已经废弃不用的机器(ip:10.94.120.110),这样才能得到特定场景操作下的纯粹的sql日志,截取具体的日志,
找到可能存在锁竞争的sql(过滤掉select语句):
2018-07-04T12:04:11.847548Z 16087 Query SET autocommit=0
2018-07-04T12:04:11.901111Z 16087 Query UPDATE lean_archive la SET la.status = -1, mtime = '2018-07-04 20:04:11.875' WHERE la.object_type = 1 AND la.object_id = 195959 AND la.status = 0
2018-07-04T12:04:12.083234Z 16087 Query insert into lean_change ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source) values (null,120859,now(),'list
',11,'status','195959',null,'柳志崇恢复了列“{2}”',0)
2018-07-04T12:04:12.099792Z 16087 Query update lean_list set name = 'done2', ctime = '2018-05-09 16:55:45.0', cuserid = 120859, list_order = 3, status = 0, boardid = 10114, finished = 0 wh
ere id = 195959
2018-07-04T12:04:15.023635Z 16087 Query UPDATE `catalog` SET `status` = 0,mtime=now() where status !=-1 and id in ( 1649 , 1650 , 3602 , 3603 , 3604 , 3605 )
2018-07-04T12:04:15.036100Z 16087 Query UPDATE `catalog_task` SET `status` = 0,mtime=now() where status!=-1 and issue_id in ( 118859 )
2018-07-04T12:04:15.051886Z 16087 Query update lean_issue SET mtime = now() ,boardid=10114 where id = 118859
2018-07-04T12:04:23.394968Z 16087 Query update lean_issue SET mtime = now() ,boardid=10114 where id = 118864
2018-07-04T12:04:24.308721Z 16085 Query SET autocommit=0
2018-07-04T12:04:24.497692Z 16085 Query insert ignore into lean_issue set title = 'bbbb', content = null, cuserid = 120859, status = 0, listid = 195406, boardid = 10114, priority = null,
source_type = 0, starttime = null, endtime = null, sequnce = 'LR4605-83', parentid = null, ctime = now(), mtime = null, workflowid = null, type = 4722, source = null, issue_order = null
2018-07-04T12:04:25.218430Z 16085 Query insert into lean_change ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source) values (123096,120859,now(),'is
sue',1,'list','','195406','柳志崇在“Done1”添加了任务',0)
2018-07-04T12:04:25.620718Z 16085 Query insert into lean_card_propose_user set issue_id = 123096, user_id = 120859, status = null
2018-07-04T12:04:25.639585Z 16085 Query insert into lean_issue_user set userid = 120859, issueid = 123096, type = 2
2018-07-04T12:04:25.832163Z 16085 Query update lean_board SET mtime = '2018-07-04 20:04:25' ,count = 83 where id = 10114
2018-07-04T12:04:26.449587Z 16087 Query UPDATE `catalog` SET `status` = 0,mtime=now() where status !=-1 and id in ( 3602 , 3604 , 3605 )
2018-07-04T12:04:26.458029Z 16087 Query UPDATE `catalog_task` SET `status` = 0,mtime=now() where status!=-1 and issue_id in ( 118864 )
2018-07-04T12:04:26.468883Z 16087 Query update lean_issue SET mtime = now() ,boardid=10114 where id = 118864
2018-07-04T12:04:26.951996Z 16085 Query INSERT INTO `catalog_task` (`issue_id`,`catalog_id`,`status`,`cuserid`) VALUES ( 123096,1649,0,120859)
...
2018-07-04T12:05:13.319018Z 16087 Query UPDATE `catalog_task` SET `status` = 0,mtime=now() where status!=-1 and issue_id in ( 118873 )
2018-07-04T12:05:13.330249Z 16087 Query update lean_issue SET mtime = now() ,boardid=10114 where id = 118873
2018-07-04T12:05:16.033715Z 16087 Query UPDATE `catalog` SET `status` = 0,mtime=now() where status !=-1 and id in ( 1649 , 1650 , 3602 , 3603 , 3604 , 3605 )
2018-07-04T12:05:16.042514Z 16087 Query UPDATE `catalog_task` SET `status` = 0,mtime=now() where status!=-1 and issue_id in ( 118874 )
2018-07-04T12:05:16.053007Z 16087 Query update lean_issue SET mtime = now() ,boardid=10114 where id = 118874
2018-07-04T12:05:17.967792Z 16085 Query rollback
2018-07-04T12:05:17.972078Z 16085 Query SET autocommit=1
2018-07-04T12:05:18.769333Z 16087 Query UPDATE `catalog` SET `status` = 0,mtime=now() where status !=-1 and id in ( 1649 , 1650 , 3602 , 3603 , 3604 , 3605 )
2018-07-04T12:05:18.779075Z 16087 Query UPDATE `catalog_task` SET `status` = 0,mtime=now() where status!=-1 and issue_id in ( 118874 )
2018-07-04T12:05:18.791102Z 16087 Query update lean_issue SET mtime = now() ,boardid=10114 where id = 118874
2018-07-04T12:05:21.380525Z 16087 Query UPDATE `catalog` SET `status` = 0,mtime=now() where status !=-1 and id in ( 1649 , 1650 , 3602 , 3603 , 3604 , 3605 )
2018-07-04T12:05:21.386858Z 16087 Query UPDATE `catalog_task` SET `status` = 0,mtime=now() where status!=-1 and issue_id in ( 118875 )
...
2018-07-04T12:08:29.544130Z 16087 Query update lean_issue SET mtime = now() ,boardid=10114 where id = 122925
2018-07-04T12:08:30.424652Z 16087 Query UPDATE `catalog` SET `status` = 0,mtime=now() where status !=-1 and id in ( 1649 , 1650 , 3602 , 3603 , 3604 , 3605 )
2018-07-04T12:08:30.437569Z 16087 Query UPDATE `catalog_task` SET `status` = 0,mtime=now() where status!=-1 and issue_id in ( 122925 )
2018-07-04T12:08:30.448584Z 16087 Query update lean_issue SET mtime = now() ,boardid=10114 where id = 122925
2018-07-04T12:08:30.718356Z 16087 Query INSERT INTO `lean_change` ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source) VALUES (118859,120859,now()
,'issue',10,'status','-1','0','lean恢复了任务',0) , (118859,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118861,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118861,120859,now(),'issue',10,
'status','-1','0','lean恢复了任务',0) , (118862,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118862,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118864,120859,now(),'issue',10,'status','-1
','0','lean恢复了任务',0) , (118864,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118865,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118865,120859,now(),'issue',10,'status','-1','0','lean恢复了任务'
,0) , (118866,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118866,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118867,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) , (118
867,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) ...
2018-07-04T12:08:30.733035Z 16087 Query UPDATE lean_archive la, lean_issue li SET la.status = -1, la.mtime = now(), li.status = la.object_status, li.mtime = now() WHERE la.object_type = 2 AND la.s
tatus = 0 AND li.listid = 195959 AND li.status = -1 AND li.id = la.object_id
2018-07-04T12:08:30.879701Z 16087 Query UPDATE lean_board_list_lane SET status = 0, archive_id = null WHERE status = -1 and board_id = 10114 and list_id = 195959 and archive_id = 57354
2018-07-04T12:08:31.623011Z 16087 Query insert into notice_info (`title`, `user_id`, `type`, `desc`, `url`, `from`, `from_id`, `c_time`) values ('柳志崇恢复了列“done2”。',38604,1,'','http://lean-dev.intra.xiaoju
keji.com/project/LR4605','柳志崇',120859,'2018-07-04 20:08:31.618')
2018-07-04T12:08:32.241940Z 16087 Query commit
2018-07-04T12:08:32.244986Z 16087 Query SET autocommit=1
至此问题基本锁定了表catalog_task(为什么这么说呢,只有这张表同时涉及到了update与insert变更操作)
场景还原
接下来就简化业务场景,准备测试表,测试数据进行验证
初始化
drop table catalog_task_test;
CREATE TABLE `catalog_task_test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`issue_id` int(11) NOT NULL,
`catalog_id` int(11) NOT NULL COMMENT '目录id',
`status` tinyint(3) NOT NULL,
`cuserid` int(11) NOT NULL,
`ctime` timestamp NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`mtime` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '更新时间',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=7199 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ;
INSERT INTO `catalog_task_test` (`issue_id`,`catalog_id`,`status`,`cuserid`) VALUES ( 111,1649,0,120859) ;
INSERT INTO `catalog_task_test` (`issue_id`,`catalog_id`,`status`,`cuserid`) VALUES ( 222,1649,0,120859) ;
测试脚本
session1:
set autocommit=0;
INSERT INTO `catalog_task_test` (`issue_id`,`catalog_id`,`status`,`cuserid`) VALUES ( 333,1649,0,120859) ;
session2:(session1语句执行完成后再执行session2中的语句)
set autocommit=0;
select now(); update catalog_task_test set status=1 where issue_id=111; select now();
session2实际执行时会报错,提示事务超时,错误结果:
mysql> select now(); update catalog_task_test set status=1 where issue_id=111; select now();
+---------------------+
| now() |
+---------------------+
| 2018-07-03 10:23:26 |
+---------------------+
1 row in set (0.01 sec)
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
+---------------------+
| now() |
+---------------------+
| 2018-07-03 10:24:17 |
+---------------------+
1 row in set (0.01 sec)
疑问点
- Q: insert into难道是表级锁吗,为什么在insert into语句没有执行commit; 时 update需要等待他的提交
- A: 以mysql的InnoDB引擎为例,有索引并且使用了该索引当条件的时候就是行锁,没有索引的时候就是表锁。innodb的行锁是在有索引的情况下,没有索引的表是锁定全表的。insert一般都是行锁, update能确定且走索引的是行锁,否则表锁。
结合上面的解答,补充2个问题
- Q: 那么先在session1中执行update ,然后在session2中执行insert 会出现锁超时的问题么?
- A: 会,表级锁被占用的情况下是拿不到行级锁的
- Q: 那么先在session1中执行insert ,然后在session2中执行insert 会出现锁超时的问题么?
- A:不会,因为insert 均是行级锁
- Q: 那么先在session1中执行update ,然后在session2中执行update 会出现锁超时的问题么?
- A:会
解决思路1
将update表级锁降为行级锁(可以通过按主键进行更新或者where条件列增加索引)
上述案例添加索引sql如下:
ALTER TABLE `catalog_task_test` ADD INDEX idx_issue_id ( `issue_id` ) ;
解决思路2
上述案例insert并不是拿不到锁就立即抛异常的,它会等待innodb_lock_wait_timeout时间后还拿不到锁才会抛出异常,可以优化先拿到锁的业务逻辑,在innodb_lock_wait_timeout时间以内处理完并释放锁即可的
获取锁的超时时间的方法(在这个时间内会阻塞一直等待锁,到时间还拿不到锁的话就会抛异常)
show variables like '%innodb_lock_wait_timeout%'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| innodb_lock_wait_timeout | 50 |
+--------------------------+-------+
上一篇: ST表求区间最值(模板)
下一篇: 多边形面积交模板