MHA High-Availability Architecture Deployment Details (see the configuration document for specifics)
- Three independent MySQL node instances, with hostnames and IPs set and firewalls disabled
- A 1-master, 2-slave GTID replication topology (a minimal sketch follows this list)
- Automatic relay-log purging disabled on every node
- The node toolkit and its dependency packages installed on every node
- The manager toolkit deployed on one of the slave nodes
- SSH key-based mutual trust configured between all nodes
- The manager node's configuration file written (note: add the mha management user and password in the database)
- SSH trust check and master/slave status check
- MHA started
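The 1-master, 2-slave GTID replication itself is assumed to already be in place (see the configuration document mentioned above). For reference, a minimal sketch of pointing each slave (db02/db03) at the master with GTID auto-positioning, using the repl/123456 replication account that also appears in the manager configuration later in this article:
mysql> change master to master_host='10.0.0.20', master_user='repl', master_password='123456', master_auto_position=1;
mysql> start slave;
mysql> show slave status\G
## master_auto_position=1 enables GTID auto-positioning, so no binlog file/position is required
## Slave_IO_Running and Slave_SQL_Running should both show Yes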
1. Three independent MySQL node instances: hostnames, IPs, firewalls disabled, etc.
Run on all nodes:
db01 (10.0.0.20/172.16.1.20), db02 (10.0.0.22/172.16.1.22), db03 (10.0.0.23/172.16.1.23)
## hosts resolution
[root@db01 ~]# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.20 db01
## Check the firewall
[root@db01 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
## Add the same hosts entries and run the same checks on db02 (10.0.0.22/172.16.1.22) and db03 (10.0.0.23/172.16.1.23)
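If the firewall is still enabled on any node, a minimal sketch for CentOS 6 (the release these el6 packages target) to turn it off now and keep it off across reboots:
[root@db01 ~]# service iptables stop ## stop the running firewall
[root@db01 ~]# chkconfig iptables off ## do not start it again on boot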
2. Disable automatic relay-log purging on all nodes
On all nodes:
[root@db01 ~]# vim /application/mysql/my.cnf
[mysqld]
basedir=/application/mysql
datadir=/application/mysql/data
socket=/application/mysql/tmp/mysql.sock
log-error=/application/mysql/log/mysql.log
log_bin=/application/mysql/binlog/mysql-bin
binlog_format=row
skip-name-resolve
server-id=20
gtid-mode=on
enforce-gtid-consistency=true
log-slave-updates=1
relay_log_purge=0 ### setting this option to 0 disables automatic relay-log purging
[client]
socket=/application/mysql/tmp/mysql.sock
Restart MySQL and check the variable; purging is now disabled:
mysql> show variables like '%relay_log_purge%';
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| relay_log_purge | OFF |
+-----------------+-------+
1 row in set (0.00 sec)
## After restarting MySQL, don't forget to start the SQL and IO threads again: start slave;
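relay_log_purge is also a dynamic variable, so if you want to avoid the restart you can change it online as well (a minimal sketch; the my.cnf edit above is still needed so the setting survives the next restart):
mysql> set global relay_log_purge=0;
mysql> show variables like 'relay_log_purge';
## changing it online does not stop the replication threads, so start slave is not needed in this case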
3. Install the node toolkit and dependency packages on all nodes
Download the MHA software: https://file.sudo.ren/MHA/MHA.zip. Downloading MHA normally requires access to servers outside China, so I provide it here. This address is a file server I set up specifically for hosting and downloading small files; please do not download from it needlessly, otherwise it eats into my server's bandwidth, thank you (if malicious bandwidth usage is detected, the service will be stopped immediately and no longer provided).
(1) Install the node package on all nodes
[root@db01 ~]# yum install perl-DBD-MySQL -y ## install the dependency package first
[root@db01 ~]# mkdir -p /server/tools
[root@db01 ~]# cd /server/tools/
[root@db01 tools]# ll
total 0
[root@db01 tools]# wget https://file.sudo.ren/MHA/MHA.zip
--2019-08-27 11:11:52-- https://file.sudo.ren/MHA/MHA.zip
Resolving file.sudo.ren... 218.78.4.211
Connecting to file.sudo.ren|218.78.4.211|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5028053 (4.8M) [application/zip]
Saving to: “MHA.zip”
100%[================================================================================================================>] 5,028,053 116K/s in 43s
2019-08-27 11:12:41 (114 KB/s) - “MHA.zip” saved [5028053/5028053]
[root@db01 tools]# unzip MHA.zip
Archive: MHA.zip
inflating: Atlas-2.2.1.el6.x86_64.rpm
inflating: mha4mysql-manager-0.56-0.el6.noarch.rpm
inflating: mha4mysql-node-0.56-0.el6.noarch.rpm
[root@db01 tools]# ll
total 9884
-rw-r--r-- 1 root root 4963681 Oct 30 2017 Atlas-2.2.1.el6.x86_64.rpm
-rw-r--r-- 1 root root 87119 Oct 30 2017 mha4mysql-manager-0.56-0.el6.noarch.rpm
-rw-r--r-- 1 root root 36326 Oct 30 2017 mha4mysql-node-0.56-0.el6.noarch.rpm
-rw-r--r-- 1 root root 5028053 Aug 29 2018 MHA.zip
[root@db01 tools]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
Preparing... ########################################### [100%]
1:mha4mysql-node ########################################### [100%]
[root@db01 tools]#
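The same node package must also be installed on db02 and db03. A minimal sketch that pushes and installs it from db01, assuming SSH trust between the nodes is already in place and using the same /server/tools path:
[root@db01 tools]# for host in 10.0.0.22 10.0.0.23; do
> ssh $host "mkdir -p /server/tools"
> scp mha4mysql-node-0.56-0.el6.noarch.rpm $host:/server/tools/
> ssh $host "yum install -y perl-DBD-MySQL && rpm -ivh /server/tools/mha4mysql-node-0.56-0.el6.noarch.rpm"
> done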
(2) Create the mha management user on the master (username mha, password mha); because of master/slave replication, the slaves will end up with this user as well
mysql> grant all privileges on *.* to mha@'10.0.0.%' identified by 'mha';
Query OK, 0 rows affected (0.01 sec)
mysql> select user,password,host from mysql.user;
+------+-------------------------------------------+-----------+
| user | password | host |
+------+-------------------------------------------+-----------+
| root | | localhost |
| repl | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 | 10.0.0.% |
| mha | *F4C9AC49A736981AE2739FC2F4A1FD92B4F07929 | 10.0.0.% |
+------+-------------------------------------------+-----------+
8 rows in set (0.00 sec)
(3) Create symlinks on all nodes. MHA always invokes MySQL through /usr/bin/mysqlbinlog and /usr/bin/mysql, so we have to follow its convention.
[root@db01 tools]# ln -s /application/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
[root@db01 tools]# ln -s /application/mysql/bin/mysql /usr/bin/mysql
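A quick sanity check that the symlinks resolve and that the mha account created above can log in over the network (a minimal sketch, run from any node):
[root@db01 tools]# which mysql mysqlbinlog ## both should now point to /usr/bin
[root@db01 tools]# mysql -umha -pmha -h10.0.0.20 -e "select user();" ## the mha account should connect to the master remotely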
4. Deploy the manager toolkit on one of the slave nodes
Deploy the manager on db03 (a slave node is recommended; it must not sit on the master, because if the master goes down the manager would stop working with it).
(1) Install the manager package
[root@db03 tools]# wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-6.repo ## download the EPEL repo
[root@db03 tools]# yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes ## install the dependency packages
[root@db03 tools]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm ## install the manager package from /server/tools/
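To see the masterha_* commands the manager package provides (they are used throughout the rest of this article), you can list the package contents; a minimal sketch:
[root@db03 tools]# rpm -ql mha4mysql-manager | grep /usr/bin ## lists masterha_manager, masterha_check_ssh, masterha_check_repl and the other masterha_* tools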
(2) Create the directories and configuration file the manager requires
[root@db03 ~]# mkdir -p /etc/mha
[root@db03 ~]# mkdir -p /var/log/mha/app1 ### one manager can manage multiple master/slave replication groups
(3) Create the configuration file (do not keep settings you don't need, and don't rely on comments: the file is rewritten after a failover)
vim /etc/mha/app1.cnf
[server default] ## the server default section holds the global settings (it can also be kept as a separate global file)
manager_log=/var/log/mha/app1/manager
manager_workdir=/var/log/mha/app1
master_binlog_dir=/application/mysql/binlog ## the master's binlog directory (the same path on every node in this setup)
user=mha ## mha management user
password=mha ## mha user password
ping_interval=2
repl_user=repl ## replication user
repl_password=123456 ## replication password
ssh_user=root
[server1]
hostname=10.0.0.20
port=3306
[server2]
hostname=10.0.0.22
port=3306
[server3]
hostname=10.0.0.23
port=3306
5. Run the SSH trust check and master/slave status check
On all nodes (these tests assume password-less SSH trust between the nodes is already configured; a minimal key setup sketch follows):
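A minimal sketch of establishing that trust, assuming RSA keys for root:
[root@db01 ~]# ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa ## generate a passphrase-less key pair
[root@db01 ~]# ssh-copy-id root@10.0.0.20
[root@db01 ~]# ssh-copy-id root@10.0.0.22
[root@db01 ~]# ssh-copy-id root@10.0.0.23
## repeat the same commands on db02 and db03 so every node can reach every other node (and itself) without a password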
db01:
[root@db01 binlog]# ssh 10.0.0.22 date
Tue Aug 27 13:17:19 CST 2019
[root@db01 binlog]# ssh 10.0.0.20 date
Tue Aug 27 13:17:27 CST 2019
[root@db01 binlog]# ssh 10.0.0.23 date
Tue Aug 27 13:17:33 CST 2019
db02:
[root@db02 ~]# ssh 10.0.0.20 date
Tue Aug 27 13:17:51 CST 2019
[root@db02 ~]# ssh 10.0.0.22 date
Tue Aug 27 13:17:53 CST 2019
[root@db02 ~]# ssh 10.0.0.23 date
Tue Aug 27 13:17:55 CST 2019
db03:
[root@db03 ~]# ssh 10.0.0.20 date
Tue Aug 27 13:18:03 CST 2019
[root@db03 ~]# ssh 10.0.0.22 date
Tue Aug 27 13:18:05 CST 2019
[root@db03 ~]# ssh 10.0.0.23 date
Tue Aug 27 13:18:07 CST 2019
MHA SSH mutual trust check
[root@db03 app1]# masterha_check_ssh --conf=/etc/mha/app1.cnf
Tue Aug 27 14:03:26 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug 27 14:03:26 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Tue Aug 27 14:03:26 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
Tue Aug 27 14:03:26 2019 - [info] Starting SSH connection tests..
Tue Aug 27 14:03:26 2019 - [debug]
Tue Aug 27 14:03:26 2019 - [debug] Connecting via SSH from root@10.0.0.20(10.0.0.20:22) to root@10.0.0.22(10.0.0.22:22)..
Tue Aug 27 14:03:26 2019 - [debug] ok.
Tue Aug 27 14:03:26 2019 - [debug] Connecting via SSH from root@10.0.0.20(10.0.0.20:22) to root@10.0.0.23(10.0.0.23:22)..
Tue Aug 27 14:03:26 2019 - [debug] ok.
Tue Aug 27 14:03:27 2019 - [debug]
Tue Aug 27 14:03:26 2019 - [debug] Connecting via SSH from root@10.0.0.22(10.0.0.22:22) to root@10.0.0.20(10.0.0.20:22)..
Tue Aug 27 14:03:27 2019 - [debug] ok.
Tue Aug 27 14:03:27 2019 - [debug] Connecting via SSH from root@10.0.0.22(10.0.0.22:22) to root@10.0.0.23(10.0.0.23:22)..
Tue Aug 27 14:03:27 2019 - [debug] ok.
Tue Aug 27 14:03:27 2019 - [debug]
Tue Aug 27 14:03:27 2019 - [debug] Connecting via SSH from root@10.0.0.23(10.0.0.23:22) to root@10.0.0.20(10.0.0.20:22)..
Tue Aug 27 14:03:27 2019 - [debug] ok.
Tue Aug 27 14:03:27 2019 - [debug] Connecting via SSH from root@10.0.0.23(10.0.0.23:22) to root@10.0.0.22(10.0.0.22:22)..
Tue Aug 27 14:03:27 2019 - [debug] ok.
Tue Aug 27 14:03:27 2019 - [info] All SSH connection tests passed successfully.
MHA master/slave replication check
[root@db03 app1]# masterha_check_repl --conf=/etc/mha/app1.cnf
Tue Aug 27 14:05:56 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug 27 14:05:56 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Tue Aug 27 14:05:56 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
Tue Aug 27 14:05:56 2019 - [info] MHA::MasterMonitor version 0.56.
Tue Aug 27 14:05:57 2019 - [info] GTID failover mode = 1
Tue Aug 27 14:05:57 2019 - [info] Dead Servers:
Tue Aug 27 14:05:57 2019 - [info] Alive Servers:
Tue Aug 27 14:05:57 2019 - [info] 10.0.0.20(10.0.0.20:3306)
Tue Aug 27 14:05:57 2019 - [info] 10.0.0.22(10.0.0.22:3306)
Tue Aug 27 14:05:57 2019 - [info] 10.0.0.23(10.0.0.23:3306)
Tue Aug 27 14:05:57 2019 - [info] Alive Slaves:
Tue Aug 27 14:05:57 2019 - [info] 10.0.0.22(10.0.0.22:3306) Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
Tue Aug 27 14:05:57 2019 - [info] GTID ON
Tue Aug 27 14:05:57 2019 - [info] Replicating from 10.0.0.20(10.0.0.20:3306)
Tue Aug 27 14:05:57 2019 - [info] 10.0.0.23(10.0.0.23:3306) Version=5.6.36-log (oldest major version between slaves) log-bin:enabled
Tue Aug 27 14:05:57 2019 - [info] GTID ON
Tue Aug 27 14:05:57 2019 - [info] Replicating from 10.0.0.20(10.0.0.20:3306)
Tue Aug 27 14:05:57 2019 - [info] Current Alive Master: 10.0.0.20(10.0.0.20:3306)
Tue Aug 27 14:05:57 2019 - [info] Checking slave configurations..
Tue Aug 27 14:05:57 2019 - [info] read_only=1 is not set on slave 10.0.0.22(10.0.0.22:3306).
Tue Aug 27 14:05:57 2019 - [info] read_only=1 is not set on slave 10.0.0.23(10.0.0.23:3306).
Tue Aug 27 14:05:57 2019 - [info] Checking replication filtering settings..
Tue Aug 27 14:05:57 2019 - [info] binlog_do_db= , binlog_ignore_db=
Tue Aug 27 14:05:57 2019 - [info] Replication filtering check ok.
Tue Aug 27 14:05:57 2019 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Tue Aug 27 14:05:57 2019 - [info] Checking SSH publickey authentication settings on the current master..
Tue Aug 27 14:05:58 2019 - [info] HealthCheck: SSH to 10.0.0.20 is reachable.
Tue Aug 27 14:05:58 2019 - [info]
10.0.0.20(10.0.0.20:3306) (current master)
+--10.0.0.22(10.0.0.22:3306)
+--10.0.0.23(10.0.0.23:3306)
Tue Aug 27 14:05:58 2019 - [info] Checking replication health on 10.0.0.22..
Tue Aug 27 14:05:58 2019 - [info] ok.
Tue Aug 27 14:05:58 2019 - [info] Checking replication health on 10.0.0.23..
Tue Aug 27 14:05:58 2019 - [info] ok.
Tue Aug 27 14:05:58 2019 - [warning] master_ip_failover_script is not defined.
Tue Aug 27 14:05:58 2019 - [warning] shutdown_script is not defined.
Tue Aug 27 14:05:58 2019 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
6. If all of the checks above pass, start MHA
On the db03 node:
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
## --remove_dead_master_conf: automatically remove the dead master's entry from the configuration file after a failover
## --ignore_last_failover: ignore the record of the previous failover (by default MHA refuses to fail over again shortly after a previous failover)
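Once the manager is running in the background, its status can be confirmed with masterha_check_status, which is part of the manager package; when everything is healthy it reports that app1 is running (PING_OK) together with the current master. A minimal sketch:
[root@db03 app1]# masterha_check_status --conf=/etc/mha/app1.cnf
## to stop the monitor later: masterha_stop --conf=/etc/mha/app1.cnf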
MHA also maintains its own runtime status and log files under /var/log/mha/app1/ (manager is the log file set by manager_log):
[root@db03 app1]# cd /var/log/mha/app1/
[root@db03 app1]# ll
total 12
-rw-r--r-- 1 root root 31 Aug 27 14:10 app1.master_status.health
-rw-r--r-- 1 root root 2537 Aug 27 14:10 manager
-rw-r--r-- 1 root root 299 Aug 27 14:10 manager.log
[root@db03 app1]# tail -f manager
10.0.0.20(10.0.0.20:3306) (current master)
+--10.0.0.22(10.0.0.22:3306)
+--10.0.0.23(10.0.0.23:3306)
Tue Aug 27 14:10:13 2019 - [warning] master_ip_failover_script is not defined.
Tue Aug 27 14:10:13 2019 - [warning] shutdown_script is not defined.
Tue Aug 27 14:10:13 2019 - [info] Set master ping interval 2 seconds.
Tue Aug 27 14:10:13 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Tue Aug 27 14:10:13 2019 - [info] Starting ping health check on 10.0.0.20(10.0.0.20:3306)..
Tue Aug 27 14:10:13 2019 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
With that, we have successfully set up MHA. The next chapter will show how to use it, along with failure simulations.
(I'm adding a chicken leg to my dinner tonight; writing this article was exhausting. Halfway through, a problem cropped up because of my own carelessness, and I had to go over everything again before it made sense.)

