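Continuing with the environment from the previous chapter, we will simulate a master failure, watch MHA fail over, and then restore the MHA setup. If your environment is not running yet, go back to the previous chapter and review it first.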

 

Failover drill

  1. Manually stop MySQL on db01.
    [root@db01 tools]# service mysqld stop
    Shutting down MySQL.. SUCCESS!


     

  2. Watch the MHA log on db03 with tail -f /var/log/mha/app1/manager. You will see that db02 (10.0.0.22) has become the new master.

    [root@db03 app1]# tail -f manager
    Tue Aug 27 17:58:10 2019 - [info] Starting master failover.
    Tue Aug 27 17:58:10 2019 - [info] 
    Tue Aug 27 17:58:10 2019 - [info] * Phase 1: Configuration Check Phase..
    Tue Aug 27 17:58:10 2019 - [info] 
    Tue Aug 27 17:58:15 2019 - [info] GTID failover mode = 1
    Tue Aug 27 17:58:15 2019 - [info] Dead Servers:
    Tue Aug 27 17:58:15 2019 - [info]   10.0.0.20(10.0.0.20:3306)
    Tue Aug 27 17:58:15 2019 - [info] Checking master reachability via MySQL(double check)...
                          ... (output omitted) ...
    Wed Aug 28 07:35:16 2019 - [info] Starting ping health check on 10.0.0.20(10.0.0.20:3306)..
    Wed Aug 28 07:35:17 2019 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
    Wed Aug 28 07:35:17 2019 - [warning] Connection failed 1 time(s)..    
    Wed Aug 28 07:35:17 2019 - [info] Executing SSH check script: exit 0
    Wed Aug 28 07:35:17 2019 - [info] HealthCheck: SSH to 10.0.0.20 is reachable.
    Wed Aug 28 07:35:19 2019 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
    Wed Aug 28 07:35:19 2019 - [warning] Connection failed 2 time(s)..
    Wed Aug 28 07:35:21 2019 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
    Wed Aug 28 07:35:21 2019 - [warning] Connection failed 3 time(s)..
    Wed Aug 28 07:35:23 2019 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
    Wed Aug 28 07:35:23 2019 - [warning] Connection failed 4 time(s)..
    Wed Aug 28 07:35:23 2019 - [warning] Master is not reachable from health checker!
    Wed Aug 28 07:35:23 2019 - [warning] Master 10.0.0.20(10.0.0.20:3306) is not reachable!    ### repeated attempts to connect to MySQL on db01 failed
    Wed Aug 28 07:35:23 2019 - [warning] SSH is reachable.   ### but db01 can still be reached over SSH
    Wed Aug 28 07:35:23 2019 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status..
    Wed Aug 28 07:35:23 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Wed Aug 28 07:35:23 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Wed Aug 28 07:35:23 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
    Wed Aug 28 07:35:24 2019 - [info] GTID failover mode = 1
    Wed Aug 28 07:35:24 2019 - [info] Dead Servers:   ## db01 is reported as dead
    Wed Aug 28 07:35:24 2019 - [info]   10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:24 2019 - [info] Alive Servers:  ## the surviving nodes are db02 and db03
    Wed Aug 28 07:35:24 2019 - [info]   10.0.0.22(10.0.0.22:3306)
    Wed Aug 28 07:35:24 2019 - [info]   10.0.0.23(10.0.0.23:3306)
    Wed Aug 28 07:35:24 2019 - [info] Alive Slaves:
    Wed Aug 28 07:35:24 2019 - [info]   10.0.0.22(10.0.0.22:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 07:35:24 2019 - [info]     GTID ON
    Wed Aug 28 07:35:24 2019 - [info]     Replicating from 10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:24 2019 - [info]   10.0.0.23(10.0.0.23:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 07:35:24 2019 - [info]     GTID ON
    Wed Aug 28 07:35:24 2019 - [info]     Replicating from 10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:24 2019 - [info] Checking slave configurations..
    Wed Aug 28 07:35:24 2019 - [info]  read_only=1 is not set on slave 10.0.0.22(10.0.0.22:3306).
    Wed Aug 28 07:35:24 2019 - [info]  read_only=1 is not set on slave 10.0.0.23(10.0.0.23:3306).
    Wed Aug 28 07:35:24 2019 - [info] Checking replication filtering settings..
    Wed Aug 28 07:35:24 2019 - [info]  Replication filtering check ok.
    Wed Aug 28 07:35:24 2019 - [info] Master is down!
    Wed Aug 28 07:35:24 2019 - [info] Terminating monitoring script.
    Wed Aug 28 07:35:24 2019 - [info] Got exit code 20 (Master dead).
    Wed Aug 28 07:35:24 2019 - [info] MHA::MasterFailover version 0.56.
    Wed Aug 28 07:35:24 2019 - [info] Starting master failover.
    Wed Aug 28 07:35:24 2019 - [info] 
    Wed Aug 28 07:35:24 2019 - [info] * Phase 1: Configuration Check Phase..
    Wed Aug 28 07:35:24 2019 - [info] 
    Wed Aug 28 07:35:25 2019 - [info] GTID failover mode = 1
    Wed Aug 28 07:35:25 2019 - [info] Dead Servers:
    Wed Aug 28 07:35:25 2019 - [info]   10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:25 2019 - [info] Checking master reachability via MySQL(double check)...
    Wed Aug 28 07:35:25 2019 - [info]  ok.
    Wed Aug 28 07:35:25 2019 - [info] Alive Servers:
    Wed Aug 28 07:35:25 2019 - [info]   10.0.0.22(10.0.0.22:3306)
    Wed Aug 28 07:35:25 2019 - [info]   10.0.0.23(10.0.0.23:3306)
    Wed Aug 28 07:35:25 2019 - [info] Alive Slaves:
    Wed Aug 28 07:35:25 2019 - [info]   10.0.0.22(10.0.0.22:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 07:35:25 2019 - [info]     GTID ON
    Wed Aug 28 07:35:25 2019 - [info]     Replicating from 10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:25 2019 - [info]   10.0.0.23(10.0.0.23:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 07:35:25 2019 - [info]     GTID ON
    Wed Aug 28 07:35:25 2019 - [info]     Replicating from 10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:25 2019 - [info] Starting GTID based failover.
    Wed Aug 28 07:35:25 2019 - [info] 
    Wed Aug 28 07:35:25 2019 - [info] ** Phase 1: Configuration Check Phase completed.
    Wed Aug 28 07:35:25 2019 - [info] 
    Wed Aug 28 07:35:25 2019 - [info] * Phase 2: Dead Master Shutdown Phase..
    Wed Aug 28 07:35:25 2019 - [info] 
    Wed Aug 28 07:35:25 2019 - [info] Forcing shutdown so that applications never connect to the current master..
    Wed Aug 28 07:35:25 2019 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
    Wed Aug 28 07:35:25 2019 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
    Wed Aug 28 07:35:26 2019 - [info] * Phase 2: Dead Master Shutdown Phase completed.
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] * Phase 3: Master Recovery Phase..
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] * Phase 3.1: Getting Latest Slaves Phase..
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] The latest binary log file/position on all slaves is mysql-bin.000004:191
    Wed Aug 28 07:35:26 2019 - [info] Retrieved Gtid Set: 551059f1-c57c-11e9-aaa0-000c29db612a:1-2
    Wed Aug 28 07:35:26 2019 - [info] Latest slaves (Slaves that received relay log files to the latest):
    Wed Aug 28 07:35:26 2019 - [info]   10.0.0.22(10.0.0.22:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 07:35:26 2019 - [info]     GTID ON
    Wed Aug 28 07:35:26 2019 - [info]     Replicating from 10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:26 2019 - [info]   10.0.0.23(10.0.0.23:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 07:35:26 2019 - [info]     GTID ON
    Wed Aug 28 07:35:26 2019 - [info]     Replicating from 10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:26 2019 - [info] The oldest binary log file/position on all slaves is mysql-bin.000004:191
    Wed Aug 28 07:35:26 2019 - [info] Retrieved Gtid Set: 551059f1-c57c-11e9-aaa0-000c29db612a:1-2
    Wed Aug 28 07:35:26 2019 - [info] Oldest slaves:
    Wed Aug 28 07:35:26 2019 - [info]   10.0.0.22(10.0.0.22:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 07:35:26 2019 - [info]     GTID ON
    Wed Aug 28 07:35:26 2019 - [info]     Replicating from 10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:26 2019 - [info]   10.0.0.23(10.0.0.23:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 07:35:26 2019 - [info]     GTID ON
    Wed Aug 28 07:35:26 2019 - [info]     Replicating from 10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] * Phase 3.3: Determining New Master Phase..
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] Searching new master from slaves..
    Wed Aug 28 07:35:26 2019 - [info]  Candidate masters from the configuration file:
    Wed Aug 28 07:35:26 2019 - [info]  Non-candidate masters:
    Wed Aug 28 07:35:26 2019 - [info] New master is 10.0.0.22(10.0.0.22:3306)
    Wed Aug 28 07:35:26 2019 - [info] Starting master failover..
    Wed Aug 28 07:35:26 2019 - [info] 
    From:
    10.0.0.20(10.0.0.20:3306) (current master)
     +--10.0.0.22(10.0.0.22:3306)
     +--10.0.0.23(10.0.0.23:3306)
    
    To:
    10.0.0.22(10.0.0.22:3306) (new master)
     +--10.0.0.23(10.0.0.23:3306)
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] * Phase 3.3: New Master Recovery Phase..
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info]  Waiting all logs to be applied.. 
    Wed Aug 28 07:35:26 2019 - [info]   done.
    Wed Aug 28 07:35:26 2019 - [info] Getting new masters binlog name and position..
    Wed Aug 28 07:35:26 2019 - [info]  mysql-bin.000003:440
    Wed Aug 28 07:35:26 2019 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.0.22', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';    # statement to run on the old master once it is repaired, so that it rejoins as a slave
    Wed Aug 28 07:35:26 2019 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000003, 440, 551059f1-c57c-11e9-aaa0-000c29db612a:1-2
    Wed Aug 28 07:35:26 2019 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
    Wed Aug 28 07:35:26 2019 - [info] ** Finished master recovery successfully.
    Wed Aug 28 07:35:26 2019 - [info] * Phase 3: Master Recovery Phase completed.
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] * Phase 4: Slaves Recovery Phase..
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] * Phase 4.1: Starting Slaves in parallel..
    Wed Aug 28 07:35:26 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info] -- Slave recovery on host 10.0.0.23(10.0.0.23:3306) started, pid: 9039. Check tmp log /var/log/mha/app1/10.0.0.23_3306_20190828073524.log if it takes time..
    Wed Aug 28 07:35:27 2019 - [info] 
    Wed Aug 28 07:35:27 2019 - [info] Log messages from 10.0.0.23 ...
    Wed Aug 28 07:35:27 2019 - [info] 
    Wed Aug 28 07:35:26 2019 - [info]  Resetting slave 10.0.0.23(10.0.0.23:3306) and starting replication from the new master 10.0.0.22(10.0.0.22:3306)..
    Wed Aug 28 07:35:26 2019 - [info]  Executed CHANGE MASTER.
    Wed Aug 28 07:35:26 2019 - [info]  Slave started.
    Wed Aug 28 07:35:26 2019 - [info]  gtid_wait(551059f1-c57c-11e9-aaa0-000c29db612a:1-2) completed on 10.0.0.23(10.0.0.23:3306). Executed 0 events.
    Wed Aug 28 07:35:27 2019 - [info] End of log messages from 10.0.0.23.
    Wed Aug 28 07:35:27 2019 - [info] -- Slave on host 10.0.0.23(10.0.0.23:3306) started.
    Wed Aug 28 07:35:27 2019 - [info] All new slave servers recovered successfully.
    Wed Aug 28 07:35:27 2019 - [info] 
    Wed Aug 28 07:35:27 2019 - [info] * Phase 5: New master cleanup phase..
    Wed Aug 28 07:35:27 2019 - [info] 
    Wed Aug 28 07:35:27 2019 - [info] Resetting slave info on the new master..
    Wed Aug 28 07:35:27 2019 - [info]  10.0.0.22: Resetting slave info succeeded.
    Wed Aug 28 07:35:27 2019 - [info] Master failover to 10.0.0.22(10.0.0.22:3306) completed successfully.
    Wed Aug 28 07:35:27 2019 - [info] Deleted server1 entry from /etc/mha/app1.cnf .
    Wed Aug 28 07:35:27 2019 - [info] 
    
    ----- Failover Report -----             #### summary of the master/slave switch
    
    app1: MySQL Master failover 10.0.0.20(10.0.0.20:3306) to 10.0.0.22(10.0.0.22:3306) succeeded
    
    Master 10.0.0.20(10.0.0.20:3306) is down!
    
    Check MHA Manager logs at db03:/var/log/mha/app1/manager for details.
    
    Started automated(non-interactive) failover.
    Selected 10.0.0.22(10.0.0.22:3306) as a new master.   ##  db02 has now been promoted to master
    10.0.0.22(10.0.0.22:3306): OK: Applying all logs succeeded.
    10.0.0.23(10.0.0.23:3306): OK: Slave started, replicating from 10.0.0.22(10.0.0.22:3306)
    10.0.0.22(10.0.0.22:3306): Resetting slave info succeeded.
    Master failover to 10.0.0.22(10.0.0.22:3306) completed successfully.
    

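    ## Read your log carefully: it contains a lot of useful information and shows exactly how MHA switches the master. The log above is annotated with comments. One line deserves special attention:

    CHANGE MASTER TO MASTER_HOST='10.0.0.22', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';  ## replace xxx with your own replication password

    This is the statement to run on db01 once its MySQL has been repaired. Because db01 was down, its data is no longer current, so it can only rejoin the topology as a slave.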
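
Rather than copying that statement by hand, it can be pulled out of the manager log. The following is a minimal sketch, assuming the log layout shown above; `extract_change_master` is a hypothetical helper name and is not part of MHA. The log masks the password as 'xxx', so it still has to be replaced before the statement is run.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with MHA): print the most recent
# CHANGE MASTER statement that MHA suggested in a manager log.
extract_change_master() {
    local log_file="$1"
    # Keep only the statement itself, from "CHANGE MASTER TO" up to the first ";".
    grep -o "CHANGE MASTER TO [^;]*;" "$log_file" | tail -n 1
}

# Example on db03 (path taken from this walkthrough):
#   extract_change_master /var/log/mha/app1/manager
```

`tail -n 1` is used because the log may contain statements from earlier failovers; only the last one matches the current master.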
     

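  3. Check the replication status
    db02: there is no slave status on db02 any more, because db02 was promoted to master and its slave configuration was reset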

    mysql> show slave status\G
    Empty set (0.00 sec)
    
    mysql> 

    db03:

    mysql> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: 10.0.0.22     ## the master is now db02 (10.0.0.22)
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mysql-bin.000003
              Read_Master_Log_Pos: 440
                   Relay_Log_File: db03-relay-bin.000002
                    Relay_Log_Pos: 408
            Relay_Master_Log_File: mysql-bin.000003
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                      ... (output omitted) ...
                   Last_SQL_Error: 
      Replicate_Ignore_Server_Ids: 
                 Master_Server_Id: 22
                      Master_UUID: 0ae0ef2e-c57d-11e9-aaa5-000c29e69491
                 Master_Info_File: /application/mysql-5.6.36/data/master.info
                        SQL_Delay: 0
              SQL_Remaining_Delay: NULL
          Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
               Master_Retry_Count: 86400
                      Master_Bind: 
          Last_IO_Error_Timestamp: 
         Last_SQL_Error_Timestamp: 
                   Master_SSL_Crl: 
               Master_SSL_Crlpath: 
               Retrieved_Gtid_Set: 
                Executed_Gtid_Set: 551059f1-c57c-11e9-aaa0-000c29db612a:1-2
                    Auto_Position: 1
    1 row in set (0.00 sec)
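
The two fields that matter most in this output are Slave_IO_Running and Slave_SQL_Running. As a rough sketch, the check can be scripted; `slave_threads_ok` is a hypothetical helper, and the usage line assumes the `mysql` client can log in without extra options, as it does in this walkthrough.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `SHOW SLAVE STATUS\G` output on stdin and
# succeed only if both replication threads report "Yes".
slave_threads_ok() {
    awk -F': ' '
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example on a slave such as db03:
#   mysql -e 'SHOW SLAVE STATUS\G' | slave_threads_ok && echo "replication threads running"
```

An empty result, as on the new master db02, makes the helper fail, since neither field is present.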


     

  4. Start MySQL on db01 again and rejoin db01 to the replication topology, this time as a slave of db02

    [root@db01 tools]# service mysqld start
    Starting MySQL. SUCCESS! 
    [root@db01 tools]# mysql
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 1
    Server version: 5.6.36-log Source distribution
    
    mysql> CHANGE MASTER TO MASTER_HOST='10.0.0.22', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='123456';
    Query OK, 0 rows affected, 2 warnings (0.08 sec)
    
    mysql> start slave;
    Query OK, 0 rows affected (0.03 sec)


     

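  5. On db03, look at the configuration file: the [server1] section is gone. MHA removed the failed node from the configuration automatically (this is what the --remove_dead_master_conf option does). Now that db01 has been repaired, add server1 back.
    Before: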

    [root@db03 ~]# vim /etc/mha/app1.cnf 
    
    [server default]
    manager_log=/var/log/mha/app1/manager
    manager_workdir=/var/log/mha/app1
    master_binlog_dir=/application/mysql/binlog
    password=mha
    ping_interval=2
    repl_password=123456
    repl_user=repl
    ssh_user=root
    user=mha
    
    [server2]
    hostname=10.0.0.22
    port=3306
    
    [server3]
    hostname=10.0.0.23
    port=3306

    After:

    [root@db03 ~]# vim /etc/mha/app1.cnf 
    
    [server default]
    manager_log=/var/log/mha/app1/manager
    manager_workdir=/var/log/mha/app1
    master_binlog_dir=/application/mysql/binlog
    password=mha
    ping_interval=2
    repl_password=123456
    repl_user=repl
    ssh_user=root
    user=mha
    
    [server1]
    hostname=10.0.0.20
    port=3306
    
    [server2]
    hostname=10.0.0.22
    port=3306
    
    [server3]
    hostname=10.0.0.23
    port=3306
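
If you prefer not to edit the file in vim, the same change can be scripted. This is only a sketch under stated assumptions: `readd_server1` is a hypothetical helper, and it appends the block at the end of the file rather than before [server2], so move it by hand if you want the original ordering.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a [server1] block for the repaired node
# unless the MHA config already has one.
# Arguments: config path, host, optional port (defaults to 3306).
readd_server1() {
    local conf="$1" host="$2" port="${3:-3306}"
    if grep -q '^\[server1\]' "$conf"; then
        echo "[server1] already present in $conf"
        return 0
    fi
    printf '\n[server1]\nhostname=%s\nport=%s\n' "$host" "$port" >> "$conf"
}

# Example on db03 (values taken from this walkthrough):
#   readd_server1 /etc/mha/app1.cnf 10.0.0.20
```

The grep guard makes the helper safe to run twice; a second call leaves the file unchanged.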


     

  6. Check SSH mutual trust and the replication status

    [root@db03 ~]#  masterha_check_ssh  --conf=/etc/mha/app1.cnf     ## SSH mutual-trust check
    Wed Aug 28 08:13:34 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Wed Aug 28 08:13:34 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Wed Aug 28 08:13:34 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
    Wed Aug 28 08:13:34 2019 - [info] Starting SSH connection tests..
    Wed Aug 28 08:13:35 2019 - [debug] 
    Wed Aug 28 08:13:34 2019 - [debug]  Connecting via SSH from root@10.0.0.20(10.0.0.20:22) to root@10.0.0.22(10.0.0.22:22)..
    Wed Aug 28 08:13:35 2019 - [debug]   ok.
    Wed Aug 28 08:13:35 2019 - [debug]  Connecting via SSH from root@10.0.0.20(10.0.0.20:22) to root@10.0.0.23(10.0.0.23:22)..
    Wed Aug 28 08:13:35 2019 - [debug]   ok.
    Wed Aug 28 08:13:35 2019 - [debug] 
    Wed Aug 28 08:13:35 2019 - [debug]  Connecting via SSH from root@10.0.0.22(10.0.0.22:22) to root@10.0.0.20(10.0.0.20:22)..
    Wed Aug 28 08:13:35 2019 - [debug]   ok.
    Wed Aug 28 08:13:35 2019 - [debug]  Connecting via SSH from root@10.0.0.22(10.0.0.22:22) to root@10.0.0.23(10.0.0.23:22)..
    Wed Aug 28 08:13:35 2019 - [debug]   ok.
    Wed Aug 28 08:13:36 2019 - [debug] 
    Wed Aug 28 08:13:35 2019 - [debug]  Connecting via SSH from root@10.0.0.23(10.0.0.23:22) to root@10.0.0.20(10.0.0.20:22)..
    Wed Aug 28 08:13:36 2019 - [debug]   ok.
    Wed Aug 28 08:13:36 2019 - [debug]  Connecting via SSH from root@10.0.0.23(10.0.0.23:22) to root@10.0.0.22(10.0.0.22:22)..
    Wed Aug 28 08:13:36 2019 - [debug]   ok.
    Wed Aug 28 08:13:36 2019 - [info] All SSH connection tests passed successfully.
    
    [root@db03 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf    ## replication status check
    Wed Aug 28 08:15:08 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Wed Aug 28 08:15:08 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Wed Aug 28 08:15:08 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
    Wed Aug 28 08:15:08 2019 - [info] MHA::MasterMonitor version 0.56.
    Wed Aug 28 08:15:09 2019 - [info] GTID failover mode = 1
    Wed Aug 28 08:15:09 2019 - [info] Dead Servers:
    Wed Aug 28 08:15:09 2019 - [info] Alive Servers:
    Wed Aug 28 08:15:09 2019 - [info]   10.0.0.20(10.0.0.20:3306)
    Wed Aug 28 08:15:09 2019 - [info]   10.0.0.22(10.0.0.22:3306)
    Wed Aug 28 08:15:09 2019 - [info]   10.0.0.23(10.0.0.23:3306)
    Wed Aug 28 08:15:09 2019 - [info] Alive Slaves:
    Wed Aug 28 08:15:09 2019 - [info]   10.0.0.20(10.0.0.20:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 08:15:09 2019 - [info]     GTID ON
    Wed Aug 28 08:15:09 2019 - [info]     Replicating from 10.0.0.22(10.0.0.22:3306)
    Wed Aug 28 08:15:09 2019 - [info]   10.0.0.23(10.0.0.23:3306)  Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
    Wed Aug 28 08:15:09 2019 - [info]     GTID ON
    Wed Aug 28 08:15:09 2019 - [info]     Replicating from 10.0.0.22(10.0.0.22:3306)
    Wed Aug 28 08:15:09 2019 - [info] Current Alive Master: 10.0.0.22(10.0.0.22:3306)
    Wed Aug 28 08:15:09 2019 - [info] Checking slave configurations..
    Wed Aug 28 08:15:09 2019 - [info]  read_only=1 is not set on slave 10.0.0.20(10.0.0.20:3306).
    Wed Aug 28 08:15:09 2019 - [info]  read_only=1 is not set on slave 10.0.0.23(10.0.0.23:3306).
    Wed Aug 28 08:15:09 2019 - [info] Checking replication filtering settings..
    Wed Aug 28 08:15:09 2019 - [info]  binlog_do_db= , binlog_ignore_db= 
    Wed Aug 28 08:15:09 2019 - [info]  Replication filtering check ok.
    Wed Aug 28 08:15:09 2019 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
    Wed Aug 28 08:15:09 2019 - [info] Checking SSH publickey authentication settings on the current master..
    Wed Aug 28 08:15:09 2019 - [info] HealthCheck: SSH to 10.0.0.22 is reachable.
    Wed Aug 28 08:15:09 2019 - [info] 
    10.0.0.22(10.0.0.22:3306) (current master)
     +--10.0.0.20(10.0.0.20:3306)
     +--10.0.0.23(10.0.0.23:3306)
    
    Wed Aug 28 08:15:09 2019 - [info] Checking replication health on 10.0.0.20..
    Wed Aug 28 08:15:09 2019 - [info]  ok.
    Wed Aug 28 08:15:09 2019 - [info] Checking replication health on 10.0.0.23..
    Wed Aug 28 08:15:09 2019 - [info]  ok.
    Wed Aug 28 08:15:09 2019 - [warning] master_ip_failover_script is not defined.
    Wed Aug 28 08:15:09 2019 - [warning] shutdown_script is not defined.
    Wed Aug 28 08:15:09 2019 - [info] Got exit code 0 (Not master dead).
    
    MySQL Replication Health is OK.
    [root@db03 ~]# 
    

    ## Everything is OK
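
The two checks can be chained so that the replication check only runs when the SSH check passes. A minimal sketch, assuming both tools return a non-zero exit status on failure; `mha_preflight` is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the SSH check first, then the replication check,
# and stop at the first failure. Assumes a non-zero exit status means failure.
mha_preflight() {
    local conf="$1"
    masterha_check_ssh --conf="$conf" || { echo "SSH check failed" >&2; return 1; }
    masterha_check_repl --conf="$conf" || { echo "replication check failed" >&2; return 1; }
    echo "preflight OK"
}

# Example on db03:
#   mha_preflight /etc/mha/app1.cnf
```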
     

  7. Restart MHA (after a failover, the masterha_manager process exits on its own)

    [root@db03 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &    ## start MHA
    [1] 9092
    [root@db03 ~]# masterha_check_status --conf=/etc/mha/app1.cnf    ## check that MHA is running
    app1 (pid:9092) is running(0:PING_OK), master:10.0.0.22
    [root@db03 ~]# 
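
To confirm from a script that the manager is monitoring the expected master, the status line can be parsed. This is a sketch that assumes the output format shown above; `mha_current_master` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read masterha_check_status output on stdin and print
# the master address if the manager reports PING_OK; fail otherwise.
mha_current_master() {
    local master
    master=$(sed -n 's/.*is running(0:PING_OK), master:\(.*\)$/\1/p')
    [ -n "$master" ] && echo "$master"
}

# Example on db03:
#   masterha_check_status --conf=/etc/mha/app1.cnf | mha_current_master
```

With the status line from this walkthrough the helper prints 10.0.0.22, which should match the master chosen during the failover.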

     

Last modified 2019-08-29 09:16:03