
MHA 0.56: Installation, Usage, and Troubleshooting

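If you are not yet familiar with MHA, I suggest reading the blog posts linked below first.


http://os.51cto.com/art/201307/401702.htm        // This post explains the preparation work for setting up MHA very clearly

http://blog.itpub.net/26230597/viewspace-1570798/    // This post walks through the installation in detail and also notes that, besides automatic failover, MHA supports manual failover

http://www.dataguru.cn/thread-457284-1-1.html      // This post explains MHA's configuration parameters clearly

http://467754239.blog.51cto.com/4878013/1695175     // This post describes the whole MHA failover process. It also covers the virtual IP failover script that ships with MHA, so as I understand it keepalived should not be needed. However, re-adding the failed master to MHA as a new slave still seems somewhat problematic


If you already know MHA, you can read on directly.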

Environment: CentOS 6.5

             MySQL 5.7 (installed via yum)

             MHA 0.56

             master: 192.168.21.10

             backup: 192.168.21.11

             slave: 192.168.21.12

Installing MHA with yum

1 Install the EPEL repository

2 Download the rpm packages from the official site. Official page: https://code.google.com/p/mysql-master-ha/
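
The two steps above might look roughly like the following. This is only a sketch: the rpm file names and the list of Perl dependencies are my assumptions about a typical 0.56 install on CentOS 6, not something taken from the original setup, and the function only prints the commands (a dry run) so you can review them before running anything.

```shell
#!/bin/sh
# Dry run: print the install commands instead of executing them.
# The rpm file names and the dependency list are assumptions; adjust them
# to the files you actually downloaded.
DEPS="perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager"

install_plan() {
    role="$1"   # "node" for every MySQL host, "manager" for the manager host
    echo "yum install -y epel-release"
    echo "yum install -y $DEPS"
    echo "rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm"
    if [ "$role" = "manager" ]; then
        # The manager host needs the node package as well as the manager package.
        echo "rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm"
    fi
}

install_plan manager
```

Calling `install_plan node` prints the plan for a host that only runs MySQL.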


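I followed the blog post linked below to carry out the MHA experiment

http://blog.csdn.net/lichangzai/article/details/50470771     (blog link)


Below are the problems I ran into during the experiment. All of them occurred while running

masterha_check_repl --conf=/etc/masterha/app1/app1.cnf (in my setup the file is /etc/mha/app1.conf, as the output below shows)

Solutions to some of these problems are hard to find online, so I am sharing them here.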

Problem 1

[root@master ~]# masterha_check_repl --conf=/etc/mha/app1.conf

Fri Jul 22 09:08:54 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Fri Jul 22 09:08:54 2016 - [info] Reading application default configuration from /etc/mha/app1.conf..

Fri Jul 22 09:08:54 2016 - [info] Reading server configuration from /etc/mha/app1.conf..

Fri Jul 22 09:08:54 2016 - [info] MHA::MasterMonitor version 0.56.

Fri Jul 22 09:08:54 2016 - [info] GTID failover mode = 0

Fri Jul 22 09:08:54 2016 - [info] Dead Servers:

Fri Jul 22 09:08:54 2016 - [info] Alive Servers:

Fri Jul 22 09:08:54 2016 - [info]   192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:08:54 2016 - [info]   192.168.21.11(192.168.21.11:3306)

Fri Jul 22 09:08:54 2016 - [info]   192.168.21.12(192.168.21.12:3306)

Fri Jul 22 09:08:54 2016 - [info] Alive Slaves:

Fri Jul 22 09:08:54 2016 - [info]   192.168.21.11(192.168.21.11:3306)  Version=5.7.16 (oldest major version between slaves) log-bin:disabled

Fri Jul 22 09:08:54 2016 - [info]     Replicating from 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:08:54 2016 - [info]     Primary candidate for the new Master (candidate_master is set)

Fri Jul 22 09:08:54 2016 - [info]   192.168.21.12(192.168.21.12:3306)  Version=5.7.16 (oldest major version between slaves) log-bin:disabled

Fri Jul 22 09:08:54 2016 - [info]     Replicating from 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:08:54 2016 - [info] Current Alive Master: 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:08:54 2016 - [info] Checking slave configurations..

Fri Jul 22 09:08:54 2016 - [info]  read_only=1 is not set on slave 192.168.21.11(192.168.21.11:3306).

Fri Jul 22 09:08:54 2016 - [warning]  relay_log_purge=0 is not set on slave 192.168.21.11(192.168.21.11:3306).

Fri Jul 22 09:08:54 2016 - [warning]  log-bin is not set on slave 192.168.21.11(192.168.21.11:3306). This host cannot be a master.

Fri Jul 22 09:08:54 2016 - [info]  read_only=1 is not set on slave 192.168.21.12(192.168.21.12:3306).

Fri Jul 22 09:08:54 2016 - [warning]  relay_log_purge=0 is not set on slave 192.168.21.12(192.168.21.12:3306).

Fri Jul 22 09:08:54 2016 - [warning]  log-bin is not set on slave 192.168.21.12(192.168.21.12:3306). This host cannot be a master.

Fri Jul 22 09:08:54 2016 - [info] Checking replication filtering settings..

Fri Jul 22 09:08:54 2016 - [info]  binlog_do_db= , binlog_ignore_db= mysql

Fri Jul 22 09:08:54 2016 - [info]  Replication filtering check ok.

Fri Jul 22 09:08:54 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln361] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf

Fri Jul 22 09:08:54 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48

Fri Jul 22 09:08:54 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.

Fri Jul 22 09:08:54 2016 - [info] Got exit code 1 (Not master dead).


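Solution:

Enable the binary log on both slaves. The warnings "log-bin is not set on slave ... This host cannot be a master" in the output above point at the cause. (I spent a whole day without finding a solution online and finally solved it through my own understanding and testing, which I am proud of!) I won't paste the exact configuration here.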
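
For readers who want a starting point, here is a minimal sketch of the kind of my.cnf change involved. The base name `mysql-bin` matches the binlog file name seen later in this post, but the server-id values are placeholders I made up; each node needs its own unique value, and the lines belong in the existing [mysqld] section rather than a new one.

```shell
#!/bin/sh
# Print the [mysqld] lines that enable binary logging on one slave.
# The server-id passed in is a placeholder; it must be unique per node.
binlog_fragment() {
    server_id="$1"
    cat <<EOF
[mysqld]
server-id = ${server_id}
log-bin = mysql-bin
EOF
}

# Example for the slave at 192.168.21.11 (the id 11 is an arbitrary choice).
binlog_fragment 11
```

After merging the lines into my.cnf and restarting mysqld, `SHOW VARIABLES LIKE 'log_bin';` on the slave should report ON, and masterha_check_repl should show log-bin:enabled for that host.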


Problem 2

[root@master ~]# masterha_check_repl --conf=/etc/mha/app1.conf

Fri Jul 22 09:26:48 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Fri Jul 22 09:26:48 2016 - [info] Reading application default configuration from /etc/mha/app1.conf..

Fri Jul 22 09:26:48 2016 - [info] Reading server configuration from /etc/mha/app1.conf..

Fri Jul 22 09:26:48 2016 - [info] MHA::MasterMonitor version 0.56.

Fri Jul 22 09:26:48 2016 - [info] GTID failover mode = 0

Fri Jul 22 09:26:48 2016 - [info] Dead Servers:

Fri Jul 22 09:26:48 2016 - [info] Alive Servers:

Fri Jul 22 09:26:48 2016 - [info]   192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:26:48 2016 - [info]   192.168.21.11(192.168.21.11:3306)

Fri Jul 22 09:26:48 2016 - [info]   192.168.21.12(192.168.21.12:3306)

Fri Jul 22 09:26:48 2016 - [info] Alive Slaves:

Fri Jul 22 09:26:48 2016 - [info]   192.168.21.11(192.168.21.11:3306)  Version=5.7.16-log (oldest major version between slaves) log-bin:enabled

Fri Jul 22 09:26:48 2016 - [info]     Replicating from 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:26:48 2016 - [info]     Primary candidate for the new Master (candidate_master is set)

Fri Jul 22 09:26:48 2016 - [info]   192.168.21.12(192.168.21.12:3306)  Version=5.7.16-log (oldest major version between slaves) log-bin:enabled

Fri Jul 22 09:26:48 2016 - [info]     Replicating from 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:26:48 2016 - [info] Current Alive Master: 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:26:48 2016 - [info] Checking slave configurations..

Fri Jul 22 09:26:48 2016 - [info]  read_only=1 is not set on slave 192.168.21.11(192.168.21.11:3306).

Fri Jul 22 09:26:48 2016 - [warning]  relay_log_purge=0 is not set on slave 192.168.21.11(192.168.21.11:3306).

Fri Jul 22 09:26:48 2016 - [info]  read_only=1 is not set on slave 192.168.21.12(192.168.21.12:3306).

Fri Jul 22 09:26:48 2016 - [warning]  relay_log_purge=0 is not set on slave 192.168.21.12(192.168.21.12:3306).

Fri Jul 22 09:26:48 2016 - [info] Checking replication filtering settings..

Fri Jul 22 09:26:48 2016 - [info]  binlog_do_db= , binlog_ignore_db= mysql

Fri Jul 22 09:26:48 2016 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln443] Binlog filtering check failed on 192.168.21.11(192.168.21.11:3306)! All log-bin enabled servers must have same binlog filtering rules (same binlog-do-db and binlog-ignore-db). Check SHOW MASTER STATUS output and set my.cnf correctly.


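Solution:

I had enabled binlog filtering on the master (binlog_ignore_db= mysql in the output above), so the same filtering must be configured on the slaves as well. After editing the configuration file a reload is not enough; mysqld has to be restarted.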
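
In this setup that means adding the same `binlog-ignore-db = mysql` line to [mysqld] on each slave. To confirm that all nodes agree afterwards, something like the hypothetical helper below could be used. It does not talk to MySQL; it only compares values you copy by hand from `SHOW MASTER STATUS` on each server, and the `host|do_db|ignore_db` input format is my own invention.

```shell
#!/bin/sh
# Hypothetical helper: flag hosts whose binlog filter differs from the first
# host listed. Input is one line per host, "host|binlog_do_db|binlog_ignore_db",
# copied by hand from SHOW MASTER STATUS on each server.
check_filters() {
    awk -F'|' '
        NR == 1 { ref = $2 "|" $3; next }
        ($2 "|" $3) != ref { print "mismatch: " $1; bad = 1 }
        END { exit bad + 0 }
    '
}

# Example: the first slave has no filter yet, so it is reported.
printf '%s\n' \
    '192.168.21.10||mysql' \
    '192.168.21.11||' \
    '192.168.21.12||mysql' | check_filters || echo "filters differ"
```

The first line of input is treated as the reference (here the master), and the function exits non-zero when any other host differs.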


Problem 3

[root@master ~]# masterha_check_repl --conf=/etc/mha/app1.conf

Fri Jul 22 09:30:04 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Fri Jul 22 09:30:04 2016 - [info] Reading application default configuration from /etc/mha/app1.conf..

Fri Jul 22 09:30:04 2016 - [info] Reading server configuration from /etc/mha/app1.conf..

Fri Jul 22 09:30:04 2016 - [info] MHA::MasterMonitor version 0.56.

Fri Jul 22 09:30:04 2016 - [info] GTID failover mode = 0

Fri Jul 22 09:30:04 2016 - [info] Dead Servers:

Fri Jul 22 09:30:04 2016 - [info] Alive Servers:

Fri Jul 22 09:30:04 2016 - [info]   192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:30:04 2016 - [info]   192.168.21.11(192.168.21.11:3306)

Fri Jul 22 09:30:04 2016 - [info]   192.168.21.12(192.168.21.12:3306)

Fri Jul 22 09:30:04 2016 - [info] Alive Slaves:

Fri Jul 22 09:30:04 2016 - [info]   192.168.21.11(192.168.21.11:3306)  Version=5.7.16-log (oldest major version between slaves) log-bin:enabled

Fri Jul 22 09:30:04 2016 - [info]     Replicating from 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:30:04 2016 - [info]     Primary candidate for the new Master (candidate_master is set)

Fri Jul 22 09:30:04 2016 - [info]   192.168.21.12(192.168.21.12:3306)  Version=5.7.16-log (oldest major version between slaves) log-bin:enabled

Fri Jul 22 09:30:04 2016 - [info]     Replicating from 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:30:04 2016 - [info] Current Alive Master: 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:30:04 2016 - [info] Checking slave configurations..

Fri Jul 22 09:30:04 2016 - [info]  read_only=1 is not set on slave 192.168.21.11(192.168.21.11:3306).

Fri Jul 22 09:30:04 2016 - [warning]  relay_log_purge=0 is not set on slave 192.168.21.11(192.168.21.11:3306).

Fri Jul 22 09:30:04 2016 - [info]  read_only=1 is not set on slave 192.168.21.12(192.168.21.12:3306).

Fri Jul 22 09:30:04 2016 - [warning]  relay_log_purge=0 is not set on slave 192.168.21.12(192.168.21.12:3306).

Fri Jul 22 09:30:04 2016 - [info] Checking replication filtering settings..

Fri Jul 22 09:30:04 2016 - [info]  binlog_do_db= , binlog_ignore_db= mysql

Fri Jul 22 09:30:04 2016 - [info]  Replication filtering check ok.

Fri Jul 22 09:30:04 2016 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln393] 192.168.21.11(192.168.21.11:3306): User repl does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.

Fri Jul 22 09:30:04 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 1403

Fri Jul 22 09:30:04 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.

Fri Jul 22 09:30:04 2016 - [info] Got exit code 1 (Not master dead).


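Solution:

The user with replication privileges must be created on every node, and so must the user with administrative privileges. Many blog posts online do not make either point clear.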
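
As an illustration, the statements to run on each node could be generated like this. Only the user name `repl` comes from the error message above; the admin user name `mha`, the host mask `192.168.21.%`, the broad `ALL PRIVILEGES` grant and both passwords are assumptions of mine, so substitute whatever your MHA configuration file actually uses.

```shell
#!/bin/sh
# Emit the account statements to run on every node (master and both slaves).
# 'repl' comes from the error message; the admin name 'mha', the host mask
# and both passwords are placeholders, not values from the original setup.
account_sql() {
    repl_pass="$1"
    admin_pass="$2"
    cat <<EOF
CREATE USER IF NOT EXISTS 'repl'@'192.168.21.%' IDENTIFIED BY '${repl_pass}';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.21.%';
CREATE USER IF NOT EXISTS 'mha'@'192.168.21.%' IDENTIFIED BY '${admin_pass}';
GRANT ALL PRIVILEGES ON *.* TO 'mha'@'192.168.21.%';
EOF
}

account_sql 'change-me-repl' 'change-me-admin'
```

The output can be reviewed and then fed to the mysql client on each of the three servers. `IF NOT EXISTS` keeps the statements safe to repeat on a node where the account is already present.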


Problem 4

[root@master ~]# masterha_check_repl --conf=/etc/mha/app1.conf

Fri Jul 22 09:42:46 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Fri Jul 22 09:42:46 2016 - [info] Reading application default configuration from /etc/mha/app1.conf..

Fri Jul 22 09:42:46 2016 - [info] Reading server configuration from /etc/mha/app1.conf..

Fri Jul 22 09:42:46 2016 - [info] MHA::MasterMonitor version 0.56.

Fri Jul 22 09:42:46 2016 - [info] GTID failover mode = 0

Fri Jul 22 09:42:46 2016 - [info] Dead Servers:

Fri Jul 22 09:42:46 2016 - [info] Alive Servers:

Fri Jul 22 09:42:46 2016 - [info]   192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:42:46 2016 - [info]   192.168.21.11(192.168.21.11:3306)

Fri Jul 22 09:42:46 2016 - [info]   192.168.21.12(192.168.21.12:3306)

Fri Jul 22 09:42:46 2016 - [info] Alive Slaves:

Fri Jul 22 09:42:46 2016 - [info]   192.168.21.11(192.168.21.11:3306)  Version=5.7.16-log (oldest major version between slaves) log-bin:enabled

Fri Jul 22 09:42:46 2016 - [info]     Replicating from 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:42:46 2016 - [info]     Primary candidate for the new Master (candidate_master is set)

Fri Jul 22 09:42:46 2016 - [info]   192.168.21.12(192.168.21.12:3306)  Version=5.7.16-log (oldest major version between slaves) log-bin:enabled

Fri Jul 22 09:42:46 2016 - [info]     Replicating from 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:42:46 2016 - [info] Current Alive Master: 192.168.21.10(192.168.21.10:3306)

Fri Jul 22 09:42:46 2016 - [info] Checking slave configurations..

Fri Jul 22 09:42:46 2016 - [info]  read_only=1 is not set on slave 192.168.21.11(192.168.21.11:3306).

Fri Jul 22 09:42:46 2016 - [warning]  relay_log_purge=0 is not set on slave 192.168.21.11(192.168.21.11:3306).

Fri Jul 22 09:42:46 2016 - [info]  read_only=1 is not set on slave 192.168.21.12(192.168.21.12:3306).

Fri Jul 22 09:42:46 2016 - [warning]  relay_log_purge=0 is not set on slave 192.168.21.12(192.168.21.12:3306).

Fri Jul 22 09:42:46 2016 - [info] Checking replication filtering settings..

Fri Jul 22 09:42:46 2016 - [info]  binlog_do_db= , binlog_ignore_db= mysql

Fri Jul 22 09:42:46 2016 - [info]  Replication filtering check ok.

Fri Jul 22 09:42:47 2016 - [info] GTID (with auto-pos) is not supported

Fri Jul 22 09:42:47 2016 - [info] Starting SSH connection tests..

Fri Jul 22 09:42:48 2016 - [info] All SSH connection tests passed successfully.

Fri Jul 22 09:42:48 2016 - [info] Checking MHA Node version..

Fri Jul 22 09:42:49 2016 - [info]  Version check ok.

Fri Jul 22 09:42:49 2016 - [info] Checking SSH publickey authentication settings on the current master..

Fri Jul 22 09:42:49 2016 - [info] HealthCheck: SSH to 192.168.21.10 is reachable.

Fri Jul 22 09:42:49 2016 - [info] Master MHA Node version is 0.56.

Fri Jul 22 09:42:49 2016 - [info] Checking recovery script configurations on 192.168.21.10(192.168.21.10:3306)..

Fri Jul 22 09:42:49 2016 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/logs/mysqllog/mysql-bin --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000001

Fri Jul 22 09:42:49 2016 - [info]   Connecting to root@192.168.21.10(192.168.21.10:22)..

Failed to save binary log: Binlog not found from /logs/mysqllog/mysql-bin! If you got this error at MHA Manager, please set "master_binlog_dir=/path/to/binlog_directory_of_the_master" correctly in the MHA Manager's configuration file and try again.

at /usr/bin/save_binary_logs line 123

eval {...} called at /usr/bin/save_binary_logs line 70

main::main() called at /usr/bin/save_binary_logs line 66

Fri Jul 22 09:42:49 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln158] Binlog setting check failed!

Fri Jul 22 09:42:49 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln405] Master configuration failed.

Fri Jul 22 09:42:49 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48

Fri Jul 22 09:42:49 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.

Fri Jul 22 09:42:49 2016 - [info] Got exit code 1 (Not master dead).


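Solution:

If you have manually set the location of the binary log files, you must specify master_binlog_dir='directory containing the binary log files' in the MHA configuration file.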
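
Note that the failing command above was passed /logs/mysqllog/mysql-bin, which looks like the log-bin base name rather than a directory. Assuming the binlog files really live in /logs/mysqllog (I am inferring this from the log, not from the original configuration), the value can be derived from the server's base name, which `SHOW VARIABLES LIKE 'log_bin_basename';` reports. A small sketch:

```shell
#!/bin/sh
# Turn a log-bin base name into the matching master_binlog_dir line.
# The base name is whatever log_bin_basename reports on the master.
binlog_dir_setting() {
    printf 'master_binlog_dir=%s\n' "$(dirname "$1")"
}

binlog_dir_setting /logs/mysqllog/mysql-bin
# prints: master_binlog_dir=/logs/mysqllog
```

The resulting line goes into the MHA configuration file (/etc/mha/app1.conf in this post), after which masterha_check_repl can be run again.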


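Summary: with the MHA version described in this post, it seems that the binary log and the relay log need to be enabled on every database server, the grants should be identical everywhere, and the configuration files should be essentially the same. With those conditions met, installing and running MHA should not run into many more problems. I am just not yet sure whether this approach is the correct one.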


This article comes from the “点滴积累” blog; please keep this attribution: http://16769017.blog.51cto.com/700711/1878451
