首页 > 代码库 > 【案例】利用innodb_force_recovery 解决MySQL服务器crash无法重启问题
【案例】利用innodb_force_recovery 解决MySQL服务器crash无法重启问题
一 背景
某一创业的朋友的主机因为磁盘阵列损坏机器crash,重启MySQL服务时 报如下错误:
-
InnoDB: Reading tablespace information from the .ibd files...
-
InnoDB: Restoring possible half-written data pages from the doublewrite
-
InnoDB: buffer...
-
InnoDB: Doing recovery: scanned up to log sequence number 9120034833
-
150125 16:12:51 InnoDB: Starting an apply batch of log records to the database...
-
InnoDB: Progress in percents: 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 150125 16:12:51 [ERROR] mysqld got signal 11 ;
-
This could be because you hit a bug. It is also possible that this binary
-
or one of the libraries it was linked against is corrupt, improperly built,
-
or misconfigured. This error can also be caused by malfunctioning hardware.
-
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
-
We will try our best to scrape up some info that will hopefully help
-
diagnose the problem, but since we have already crashed,
-
something is definitely wrong and this may fail.
-
Server version: 5.5.37-MariaDB-log
-
key_buffer_size=268435456
-
read_buffer_size=1048576
-
max_used_connections=0
-
max_threads=1002
-
thread_count=0
-
It is possible that mysqld could use up to
-
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332093 K bytes of memory
- 41 Hope that.
二 分析
主要关注 mysqld got signal 11 的问题,从日志内容分析来看,数据库在机器crash 导致日志文件损坏,重启之后无法正常恢复,更无法正常对外提供服务。
三 解决
因为日志已经损坏,这里采用非常规手段,首先修改innodb_force_recovery参数,使mysqld跳过恢复步骤,将mysqld 启动,将数据导出来然后重建数据库。
innodb_force_recovery可以设置为1-6,大的数字包含前面所有数字的影响。
1. (SRV_FORCE_IGNORE_CORRUPT):忽略检查到的corrupt页。
2. (SRV_FORCE_NO_BACKGROUND):阻止主线程的运行,如主线程需要执行full purge操作,会导致crash。
3. (SRV_FORCE_NO_TRX_UNDO):不执行事务回滚操作。
4. (SRV_FORCE_NO_IBUF_MERGE):不执行插入缓冲的合并操作。
5. (SRV_FORCE_NO_UNDO_LOG_SCAN):不查看重做日志,InnoDB存储引擎会将未提交的事务视为已提交。
6. (SRV_FORCE_NO_LOG_REDO):不执行前滚的操作。
注意
a 当设置参数值大于0后,可以对表进行select,create,drop操作,但insert,update或者delete这类操作是不允许的。
b 当innodb_purge_threads 和 innodb_force_recovery一起设置会出现一种loop现象:
-
150125 17:07:42 InnoDB: Waiting for the background threads to start
-
150125 17:07:43 InnoDB: Waiting for the background threads to start
-
150125 17:07:44 InnoDB: Waiting for the background threads to start
-
150125 17:07:45 InnoDB: Waiting for the background threads to start
-
150125 17:07:46 InnoDB: Waiting for the background threads to start
- 150125 17:07:47 InnoDB: Waiting for the background threads to start
在my.cnf中修改以下两个参数
innodb_force_recovery=6
innodb_purge_thread=0
重启MySQL
-
150125 17:10:47 [Note] Crash recovery finished.
-
150125 17:10:47 [Note] Server socket created on IP: ‘0.0.0.0‘.
-
150125 17:10:47 [Note] Event Scheduler: Loaded 0 events
-
150125 17:10:47 [Note] /vdata/webserver/mysql/bin/mysqld: ready for connections.
- Version: ‘5.5.37-MariaDB-log‘ socket: ‘/tmp/mysql.sock‘ port: 3306 Source distribution
立即对数据库做逻辑导出 ,完成之后将innodb_force_recovery设置为0 ,innodb_purge_thread=1 ,然后重建数据库 。
另外 MySQL 版本 5.5以及之前 ,当innodb_purge_threads =1,innodb_force_recovery >1 的情况会出现上文提到的循环报warning 问题(=1 没有问题),
原因:
MySQL 的源代码中显示 当innodb_purge_threads 和 innodb_force_recovery一起设置会出现loop循环
-
while (srv_shutdown_state == SRV_SHUTDOWN_NONE) {
-
if (srv_thread_has_reserved_slot(SRV_MASTER) == ULINT_UNDEFINED
-
|| (srv_n_purge_threads == 1
-
&& srv_thread_has_reserved_slot(SRV_WORKER)
-
== ULINT_UNDEFINED)) {
-
ut_print_timestamp(stderr);
- fprintf(stderr, " InnoDB: Waiting for the background threads to start\n");
-
os_thread_sleep(1000000);
-
} else {
-
break;
-
}
- }
所以当需要设置innodb_force_recovery>1的时候需要关闭 innodb_purge_threads,设置为0(默认)。
四 小结
MySQL crash 或者 MySQL 数据库服务器 crash 会导致各种各样的问题 ,比如主备之间的error 1594 (5.6 版本开启crash-safe ,会最大程度上避免 error 1594的问题,以后会写5.6新特性介绍该功能 ),error 1236, 日志损坏,数据文件损坏 ,等等,本案例只是其中的一种,细心从日志中找的相关错误提示,逐步解决即可。
【案例】利用innodb_force_recovery 解决MySQL服务器crash无法重启问题