欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

一个mysql事务引发的思考(血案)

程序员文章站 2024-01-14 17:14:04
...

问题简述

大家都知道mysql是支持ACID,支持事务的,事务是非常重要的一个特性,要不都执行成功,要不都不成功。我们在coding时也会大量用到,但是随着业务代码的累加与使用人数的增加,系统有某些场景下会出现下面这个错误

11:48:40.265[http-nio-8999-exec-3] ERROR c.x.e.l.c.e.ExceptionHandleFilter - exception message: {} com.xxx.common.exception.ExceptionHandleFilter:45
org.springframework.dao.CannotAcquireLockException: 
### Error updating database.  Cause: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
### The error may involve com.xxx.dao.issuegroup.CatalogTaskDao.insert-Inline
### The error occurred while setting parameters
### SQL: INSERT INTO `catalog_task`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( ?,?,?,?)
### Cause: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
; SQL []; Lock wait timeout exceeded; try restarting transaction; nested exception is java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
    at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:259) ~[spring-jdbc-4.2.3.RELEASE.jar:4.2.3.RELEASE]
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) ~[spring-jdbc-4.2.3.RELEASE.jar:4.2.3.RELEASE]
    at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:75) ~[mybatis-spring-1.3.0.jar:1.3.0]
    at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:447) ~[mybatis-spring-1.3.0.jar:1.3.0]
    at com.sun.proxy.$Proxy99.insert(Unknown Source) ~[na:na]
    at org.mybatis.spring.SqlSessionTemplate.insert(SqlSessionTemplate.java:279) ~[mybatis-spring-1.3.0.jar:1.3.0]
    at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:56) ~[mybatis-3.4.0.jar:3.4.0]
    at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:53) ~[mybatis-3.4.0.jar:3.4.0]

问题跟踪

复现操作场景,跟踪期间执行的sql语句
开启mysql generallog日志,并找到日志路径,如果未开启的话执行 set global general_log = on; 即可。

mysql> show global variables like '%general%';
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    16082
Current database: lean_dev
+------------------+---------------------------------+
| Variable_name    | Value                           |
+------------------+---------------------------------+
| general_log      | ON                              |
| general_log_file | /var/lib/mysql/9905e4ad340b.log |
+------------------+---------------------------------+
2 rows in set (0.61 sec)

需要注意的是问题复现期间建议使用单独的mysql机器,我使用的是已经废弃不用的机器(ip:10.94.120.110),这样才能得到特定场景操作下的纯粹的sql日志,截取具体的日志,
找到可能存在锁竞争的sql(过滤掉select语句):

2018-07-04T12:04:11.847548Z 16087 Query SET autocommit=0
2018-07-04T12:04:11.901111Z 16087 Query UPDATE lean_archive la SET la.status = -1, mtime = '2018-07-04 20:04:11.875'  WHERE la.object_type = 1      AND la.object_id = 195959      AND la.status = 0
2018-07-04T12:04:12.083234Z 16087 Query insert into  lean_change  ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source)  values  (null,120859,now(),'list
',11,'status','195959',null,'柳志崇恢复了列“{2}”',0)
2018-07-04T12:04:12.099792Z 16087 Query update  lean_list  set  name = 'done2',  ctime = '2018-05-09 16:55:45.0',  cuserid = 120859,  list_order = 3,  status = 0,  boardid = 10114, finished = 0 wh
ere id = 195959
2018-07-04T12:04:15.023635Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:04:15.036100Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118859   )
2018-07-04T12:04:15.051886Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118859
2018-07-04T12:04:23.394968Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118864
2018-07-04T12:04:24.308721Z 16085 Query SET autocommit=0
2018-07-04T12:04:24.497692Z 16085 Query insert ignore into lean_issue set  title = 'bbbb',  content = null,  cuserid = 120859,  status = 0,  listid = 195406,  boardid = 10114,  priority = null,
   source_type = 0,   starttime = null,  endtime = null,  sequnce = 'LR4605-83',  parentid = null,  ctime = now(),  mtime = null,  workflowid = null,  type = 4722,  source = null,  issue_order = null
2018-07-04T12:04:25.218430Z 16085 Query insert into  lean_change  ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source)  values  (123096,120859,now(),'is
sue',1,'list','','195406','柳志崇在“Done1”添加了任务',0)
2018-07-04T12:04:25.620718Z 16085 Query insert into  lean_card_propose_user  set  issue_id = 123096,  user_id = 120859,  status = null
2018-07-04T12:04:25.639585Z 16085 Query insert into  lean_issue_user set  userid = 120859,  issueid = 123096,  type = 2
2018-07-04T12:04:25.832163Z 16085 Query update lean_board SET mtime = '2018-07-04 20:04:25' ,count = 83  where id = 10114
2018-07-04T12:04:26.449587Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    3602   ,   3604   ,   3605   )
2018-07-04T12:04:26.458029Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118864   )
2018-07-04T12:04:26.468883Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118864
2018-07-04T12:04:26.951996Z 16085 Query INSERT INTO `catalog_task`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( 123096,1649,0,120859)
...
2018-07-04T12:05:13.319018Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118873   )
2018-07-04T12:05:13.330249Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118873
2018-07-04T12:05:16.033715Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:05:16.042514Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118874   )
2018-07-04T12:05:16.053007Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118874
2018-07-04T12:05:17.967792Z 16085 Query rollback
2018-07-04T12:05:17.972078Z 16085 Query SET autocommit=1
2018-07-04T12:05:18.769333Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:05:18.779075Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118874   )
2018-07-04T12:05:18.791102Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 118874
2018-07-04T12:05:21.380525Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:05:21.386858Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    118875   )
...
2018-07-04T12:08:29.544130Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 122925
2018-07-04T12:08:30.424652Z 16087 Query UPDATE `catalog`  SET `status` = 0,mtime=now() where  status !=-1 and id in    (    1649   ,   1650   ,   3602   ,   3603   ,   3604   ,   3605   )
2018-07-04T12:08:30.437569Z 16087 Query UPDATE `catalog_task`  SET `status` = 0,mtime=now()   where status!=-1 and issue_id in    (    122925   )
2018-07-04T12:08:30.448584Z 16087 Query update lean_issue SET mtime = now()        ,boardid=10114  where id = 122925
2018-07-04T12:08:30.718356Z 16087 Query INSERT INTO `lean_change`  ( `issueid`,`cuserid`,`ctime`, `type`, `status`, `field`, `oldvalue`, `newvalue`, `desc`,source)  VALUES     (118859,120859,now()
,'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118859,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118861,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118861,120859,now(),'issue',10,
'status','-1','0','lean恢复了任务',0)  ,  (118862,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118862,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118864,120859,now(),'issue',10,'status','-1
','0','lean恢复了任务',0)  ,  (118864,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118865,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118865,120859,now(),'issue',10,'status','-1','0','lean恢复了任务'
,0)  ,  (118866,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118866,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118867,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0)  ,  (118
867,120859,now(),'issue',10,'status','-1','0','lean恢复了任务',0) ...
2018-07-04T12:08:30.733035Z 16087 Query UPDATE lean_archive la, lean_issue li SET la.status = -1, la.mtime = now(), li.status = la.object_status, li.mtime = now() WHERE la.object_type = 2 AND la.s
tatus = 0 AND li.listid = 195959  AND li.status = -1 AND li.id = la.object_id
2018-07-04T12:08:30.879701Z 16087 Query UPDATE lean_board_list_lane SET status = 0, archive_id = null WHERE  status = -1   and board_id = 10114   and list_id = 195959   and archive_id = 57354
2018-07-04T12:08:31.623011Z 16087 Query insert into  notice_info  (`title`, `user_id`, `type`, `desc`, `url`, `from`, `from_id`, `c_time`) values  ('柳志崇恢复了列“done2”。',38604,1,'','http://lean-dev.intra.xiaoju
keji.com/project/LR4605','柳志崇',120859,'2018-07-04 20:08:31.618')
2018-07-04T12:08:32.241940Z 16087 Query commit
2018-07-04T12:08:32.244986Z 16087 Query SET autocommit=1

至此问题基本锁定了表catalog_task(为什么这么说呢,只有这张表同时涉及到了update与insert变更操作)

场景还原

接下来就简化业务场景,准备测试表,测试数据进行验证

初始化

drop table catalog_task_test;
CREATE TABLE `catalog_task_test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `issue_id` int(11) NOT NULL,
  `catalog_id` int(11) NOT NULL COMMENT '目录id',
  `status` tinyint(3) NOT NULL,
  `cuserid` int(11) NOT NULL,
  `ctime` timestamp NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  `mtime` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '更新时间',
  PRIMARY KEY (`id`) 
) ENGINE=InnoDB AUTO_INCREMENT=7199 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ;
INSERT INTO `catalog_task_test`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( 111,1649,0,120859) ;
INSERT INTO `catalog_task_test`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( 222,1649,0,120859) ;

测试脚本

session1:
set autocommit=0;
INSERT INTO `catalog_task_test`  (`issue_id`,`catalog_id`,`status`,`cuserid`)  VALUES ( 333,1649,0,120859) ;
session2:(session1语句执行完成后再执行session2中的语句)
set autocommit=0; 
select now(); update catalog_task_test set status=1 where issue_id=111; select now();

session2实际执行时会报错,提示事务超时,错误结果:

mysql> select now(); update catalog_task_test set status=1 where issue_id=111; select now();
+---------------------+
| now()               |
+---------------------+
| 2018-07-03 10:23:26 |
+---------------------+
1 row in set (0.01 sec)
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
+---------------------+
| now()               |
+---------------------+
| 2018-07-03 10:24:17 |
+---------------------+
1 row in set (0.01 sec) 

疑问点

  • Q: insert into难道是表级锁吗,为什么在insert into语句没有执行commit; 时 update需要等待他的提交
  • A: 以mysql的InnoDB引擎为例,有索引并且使用了该索引当条件的时候就是行锁,没有索引的时候就是表锁。innodb的行锁是在有索引的情况下,没有索引的表是锁定全表的。insert一般都是行锁, update能确定且走索引的是行锁,否则表锁。

结合上面的解答,补充2个问题

  • Q: 那么先在session1中执行update ,然后在session2中执行insert 会出现锁超时的问题么?
  • A: 会,表级锁被占用的情况下是拿不到行级锁的
  • Q: 那么先在session1中执行insert ,然后在session2中执行insert 会出现锁超时的问题么?
  • A:不会,因为insert 均是行级锁
  • Q: 那么先在session1中执行update ,然后在session2中执行update 会出现锁超时的问题么?
  • A:会

解决思路1

将update表级锁降为行级锁(可以通过按主键进行更新或者where条件列增加索引)
上述案例添加索引sql如下:

ALTER TABLE `catalog_task_test` ADD INDEX idx_issue_id ( `issue_id` )  ;

解决思路2

上述案例insert并不是拿不到锁就立即抛异常的,它会等待innodb_lock_wait_timeout时间后还拿不到锁才会抛出异常,可以优化先拿到锁的业务逻辑,在innodb_lock_wait_timeout时间以内处理完并释放锁即可的

获取锁的超时时间的方法(在这个时间内会阻塞一直等待锁,到时间还拿不到锁的话就会抛异常)

show variables like '%innodb_lock_wait_timeout%'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_lock_wait_timeout | 50    |
+--------------------------+-------+
相关标签: mysql 事务