欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  IT编程

MySQL案例01:Last_SQL_Errno: 1755 Cannot execute the current event group in the par

程序员文章站 2024-01-18 10:50:58
周五同事监控报警,有个MySQL从库复制状态异常,让我帮忙排查下,经过排查发现是MySQL5.6并行复制的一个Bug所致,具体处理过程如下: 一、错误信息 登录mysql从库服务器,检查复制状态 mysql错误日志内容如下 二、错误原因 从错误日志字面意思理解,出现这个错误是因为数据库开启了并行复制 ......

周五同事监控报警,有个MySQL从库复制状态异常,让我帮忙排查下,经过排查发现是MySQL5.6并行复制的一个Bug所致,具体处理过程如下:

一、错误信息

登录mysql从库服务器,检查复制状态

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: xx.xx.xx.xx
                  Master_User: repuser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.001689
          Read_Master_Log_Pos: 145325388
               Relay_Log_File: mysql2-relay-bin.004390
                Relay_Log_Pos: 147850306
        Relay_Master_Log_File: mysql-bin.001673
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1755
                   Last_Error: Cannot execute the current event group in the parallel mode. Encountered event Format_desc, relay-log name ./mysql2-relay-bin.004392, position 238 which prevents execution of this event group in parallel mode. Reason: possible malformed group of events from an old master.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 147850096
              Relay_Log_Space: 4440639819
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1755
               Last_SQL_Error: Cannot execute the current event group in the parallel mode. Encountered event Format_desc, relay-log name ./mysql2-relay-bin.004392, position 238 which prevents execution of this event group in parallel mode. Reason: possible malformed group of events from an old master.
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 46
                  Master_UUID: f6d575b8-6a5b-11e5-8313-0050568d23c9
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 180314 01:01:17
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: f6d575b8-6a5b-11e5-8313-0050568d23c9:9393159-70288025
            Executed_Gtid_Set: 33a835bb-6a5a-11e5-8308-0050568d685b:1-146,
f6d575b8-6a5b-11e5-8313-0050568d23c9:1-69584557
                Auto_Position: 1
1 row in set (0.01 sec)

mysql错误日志内容如下

018-03-14 01:01:16 26349 [Note] Event Scheduler: Loaded 0 events
2018-03-14 01:01:16 26349 [Note] /data/mysql/bin/mysqld: ready for connections.
Version: '5.6.16-log'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution
2018-03-14 01:01:17 26349 [ERROR] Slave SQL: Cannot execute the current event group in the parallel mode. Encountered event Format_desc, relay-log name ./mysql2-relay-bin.004392, position 238 which prevents execution of this event group in parallel mode. Reason: possible malformed group of events from an old master. Error_code: 1755
2018-03-14 01:01:17 26349 [Warning] Slave: Cannot execute the current event group in the parallel mode. Encountered event Format_desc, relay-log name ./mysql2-relay-bin.004392, position 238 which prevents execution of this event group in parallel mode. Reason: possible malformed group of events from an old master. Error_code: 1755
2018-03-14 01:01:17 26349 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.001673' position 147850096

二、错误原因

从错误日志字面意思理解,出现这个错误是因为数据库开启了并行复制,一些event group在并行复制下,不支持从而导致复制失败,主从状态异常。

检查数据库版本:

mysql>  select version();
+------------+
| version()  |
+------------+
| 5.6.16-log |
+------------+
1 row in set (0.07 sec)

检查数据库并行复制信息:

mysql> show variables like '%parall%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| slave_parallel_workers | 1     |
+------------------------+-------+
1 row in set (0.05 sec)

官方搜索错误信息,发现此错误是一bug,具体bug编号为Bug #71495、Bug #72537

https://bugs.mysql.com/bug.php?id=71495

https://bugs.mysql.com/bug.php?id=72537

三、解决方案

1、既然是bug,通过分析binlog内容,然后跳过的方式跳过事物

(非gtid模式,通过set sql_slave_skip_counter=1;跳过事务)

主库分析binlog:

mysqlbinlog -vv --base64-output='decode-rows'   --start-position=147850096  --stop-position=147850096   mysql-bin.001674  >a.sql

SHOW BINLOG EVENTS in 'mysql-bin.001673' from 147850096  limit 10;
从库跳过事物(gtid模式)

stop slave;

SET @@SESSION.GTID_NEXT= 'f6d575b8-6a5b-11e5-8313-0050568d23c9:9393159-70288025';

begin;

commit;

SET SESSION GTID_NEXT = AUTOMATIC;

start slave;

2、通过关闭并行复制来恢复(最佳方法)

stop slave;

set global slave_parallel_workers = 0 ;

start slave;

(  更改my.cnf中并行复制参数slave_parallel_workers = 0)

3、如果上述方法都不行,可以使用重新部署从库的方式来恢复,恢复后关闭并行复制(5.6并行复制存在bug,不推荐使用)

如果使用mysqldump备份,最好指定参数--set-gtid-purged=off,否则需要充值gtid

主库备份数据库:

innobackupexp方式:innobackupex  --socket=/tmp/mysql.sock  /20180316/

从库恢复数据库:

innobackupex  --redo-only --apply-log /soft/2018-03-16_17-45-30
innobackupex  --copy-back /soft/2018-03-16_17-45-30

查看事物应用binlog信息:cat xtrabackup_info

binlog_pos = filename 'mysql-bin.001690', position 46997344, GTID of the last change 'f6d575b8-6a5b-11e5-8313-0050568d23c9:1-70316241'

innodb_from_lsn = 0
innodb_to_lsn = 84446557453

设置从库gtid信息

set global gtid_purged='f6d575b8-6a5b-11e5-8313-0050568d23c9:1-70316241';  

重新部署从库:

change master to master_host='xx.xx.xx.xx',master_user='usernamer',master_password='password',MASTER_AUTO_POSITION=84446557453;

(上述使用了reset master,否则可以直接指定从MASTER_AUTO_POSITION=1开始)