欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  数据库

heartbeat+mysql双主复制实现高可用

程序员文章站 2022-05-13 12:37:53
...

一:搭建主主复制环境 1.1 实验环境 两台机器事先都已经装好了 MySQL 单实例。 IP: 10.192.203.201 10.192.203.202 端口都是 3307. 二者的端口号需要保持一致,否则在最后用 vip 连接的时候,不能使用相同端口号连接。 1.2 实验步骤 1.2.1 修改配置文件 修改

一:搭建主主复制环境


1.1实验环境

两台机器事先都已经装好了MySQL单实例。

IP: 10.192.203.201 10.192.203.202

端口都是3307.
二者的端口号需要保持一致,否则在最后用vip连接的时候,不能使用相同端口号连接。

1.2实验步骤


1.2.1修改配置文件

修改master1

[mysqld]下面添加:

server-id = 1
relay-log=/data/server/mysql_3307/binlog/ZabbixServer-relay-bin
relay-log-index=/data/server/mysql_3307/binlog/ZabbixServer-relay-bin.index
auto-increment-offset= 1    
auto-increment-increment= 2 
log-slave-updates=true 

修改master2

[mysqld]下面添加:

server-id = 3
relay-log=/data/server/mysql/binlog/single-relay-bin
relay-log-index=/data/server/mysql/binlog/single-relay-bin.index
auto-increment-offset= 2   
auto-increment-increment= 2 
log-slave-updates=true


添加auto-increment-offset那两项,是为了避免在MySQLINSERT时主键冲突。

修改完后记得重启mysql

1.2.2建复制用户

分别在两台mysql上执行

GRANTREPLICATION SLAVE ON *.* TO 'RepUser'@'%'identified by 'beijing';

1.2.3指向master

两台服务器均为新建立,且无其它写入操作,各服务器只需记录当前自己二进制日志文件及事件位置,以之作为另外的服务器复制起始位置即可。否则,需要先备份主库,在备库进行恢复,从而保持数据一致,然后再指向master

Master1:

mysql>show master status;

+------------------+----------+--------------+------------------+-------------------+

|File |Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+------------------+----------+--------------+------------------+-------------------+

|mysql-bin.000001 | 302| | | |

+------------------+----------+--------------+------------------+-------------------+

1 row inset (0.00 sec)

Master2:

mysql>show master status;

+------------------+----------+--------------+------------------+-------------------+

|File |Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+------------------+----------+--------------+------------------+-------------------+

|mysql-bin.000001 | 120| | | |

+------------------+----------+--------------+------------------+-------------------+

1 row inset (0.00 sec)

#Master1指向Master2

1. CHANGE MASTER TO MASTER_USER='RepUser',MASTER_HOST='10.192.203.202',MASTER_PASSWORD='beijing',MASTER_PORT=3307,MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=120;


#Master2
指向Master1

[1. CHANGE MASTER TO MASTER_USER='RepUser',MASTER_HOST='10.192.203.201',MASTER_PASSWORD='beijing', MASTER_PORT=3307,MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=302;

1.2.4分别启动slave

startslave ;

确保show slave status

Slave_IO_Running:Yes

Slave_SQL_Running:Yes

测试两边是否同步,略。


二 配心跳

每个主机分别带有两块以太网卡,其中一块用于网络通信,另一块用于心跳功能。

本实验都是在Oracle virtualbox虚拟机里做的,故添加一块儿用于内部连接的网卡,用于心跳测试,请参考:http://blog.csdn.net/yabingshi_tech/article/details/51445006

三:安装部署heartbeat


在两台机器上分别做以下操作:

3.1 安装依赖包

yum install PyXML cluster-glue cluster-glue-libs resource-agents –y

3.2 安装heartbeat

wgethttp://dl.fedoraproject.org/pub/epel/6/x86_64/heartbeat-3.0.4-2.el6.x86_64.rpm

wget http://dl.fedoraproject.org/pub/epel/6/x86_64/heartbeat-libs-3.0.4-2.el6.x86_64.rpm

rpm -ivh heartbeat-*

3.3 配置heartbeat

复制配置文件

cp /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/

cp /usr/share/doc/heartbeat-3.0.4/haresources /etc/ha.d/

cp /usr/share/doc/heartbeat-3.0.4/ha.cf /etc/ha.d/

3.3.1 配置心跳的加密方式:authkeys

vi /etc/ha.d/authkeys

#如果使用双机对联线(双绞线),可以配置如下:

auth 1

1 crc

#存盘退出,然后

chmod 600 /etc/ha.d/authkeys

3.3.2 配置心跳的监控:haresources

vi /etc/ha.d/haresources

#各主机这部分应完全相同。

添加:

PC IPaddr::10.192.203.203

#注意,PC这写你的master的主机名,Ipaddr写的是你的VIP地址。

也可设置heartbeat管理的资源或服务:在该目录下存放服务启动脚本(例如:mysqld),将相同脚本名称添到/etc/ha.d/haresources内容中,从而跟随heartbeat启动而启动该脚本。

如:PC IPaddr::10.192.203.203mysql #

但是,这样当heartbeat关闭的时候,也会关闭mysql,所以这里我就不添加了。

3.3.3 配置心跳的配置文件:ha.cf

主和从机器除了ucast eth1 10.0.0.2这一行不同外,其他都一样。

vi /etc/ha.d/ha.cf

添加:

logfile/var/log/ha_log/ha-log.log   ## ha的日志文件记录位置。如没有该目录,则需要手动添加
bcast eth1     ##使用eht1做心跳监测
ucast eth110.0.0.2   ##心跳网卡连接对方心跳地址
keepalive 2    ##设定心跳(监测)时间时间为2秒
warntime 10
deadtime 30
initdead 120
hopfudge 1
udpport 694    ##使用udp端口694 进行心跳监测
auto_failback off
node PC  ##节点1,必须要与 uname -n 指令得到的结果一致。
node slave2  ##节点2
ping 10.192.203.254   ##通过ping 网关来监测心跳是否正常。


3.3.4 创建日志文件路径

mkdir -p /var/log/ha_log

chmod 777 /var/log/ha_log/

3.3.5 设置ipvsadm的巡回监测

ipvsadm -A -t 10.192.203.203:3307 -s rr

ipvsadm -a -t 10.192.203.203:3307 -r 10.192.203.201:3307-m

ipvsadm -a -t 10.192.203.203:3307 -r 10.192.203.202:3307-m

[root@PC download]# ipvsadm --list

IP Virtual Server version 1.2.1 (size=4096)

Prot LocalAddress:Port Scheduler Flags

-> RemoteAddress:Port Forward Weight ActiveConn InActConn

TCP bogon:opsession-prxy rr

-> bogon:opsession-prxy Local 1 0 0

-> bogon:opsession-prxy Masq 1 0 0

[root@slave2 download]# ipvsadm --list

IP Virtual Server version 1.2.1 (size=4096)

Prot LocalAddress:Port Scheduler Flags

-> RemoteAddress:Port Forward Weight ActiveConn InActConn

TCP 10.192.203.203:opsession-prx rr

-> 10.192.203.201:opsession-prx Masq 1 0 0

-> 10.192.203.202:opsession-prx Local 1 0 0

3.4 开放防火墙端口

heartbeat 默认使用udp 694端口进行心跳监测。 如果系统有使用iptables 做防火墙,应记住把这个端口打开。

vi/etc/sysconfig/iptables

添加:-A INPUT -pudp --dport 694 -j ACCEPT

service iptables restart

3.5 HA服务的启动、关闭以及测试

启动HA: serviceheartbeat start

在主从都启动heartbeat

[root@PC init.d]# service heartbeat start

Starting High-Availability services:INFO: Resource is stopped

Done.

[root@PC ha_log]# service heartbeat status

heartbeat OK [pid 17943 et al] is runningon pc [pc]...

[root@slave2 ha_log]# service heartbeatstatus

heartbeat OK [pid 6536 et al] is running onslave2 [slave2]...

在主上看到虚拟IP了:

[root@PC ha_log]# ip addr

1: lo:  mtu16436 qdisc noqueue state UNKNOWN
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   inet 127.0.0.1/8 scope host lo
   inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen1000
   link/ether 08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
   inet 10.192.203.201/24 brd 10.192.203.255 scope global eth0
   inet 10.192.203.203/24 brd 10.192.203.255 scope global secondary eth0
   inet6 fe80::a00:27ff:fe04:516/64 scope link tentative dadfailed
      valid_lft forever preferred_lft forever
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen1000
   link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
   inet 10.0.0.1/24 brd 10.0.0.255 scope global eth1
   inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link tentative dadfailed
      valid_lft forever preferred_lft forever

在/var/log/ha_log下的日志文件或者/var/log/messages 都可以看到相关信息。

[root@PC network-scripts]# tail -f/var/log/messages

May 19 01:34:59 PCResourceManager(default)[17985]: info: Running /etc/ha.d/resource.d/IPaddr10.192.203.203 start
May 19 01:35:00 PCIPaddr(IPaddr_10.192.203.203)[18103]: INFO: Adding inet address10.192.203.203/24 with broadcast address 10.192.203.255 to device eth0
May 19 01:35:00 PCIPaddr(IPaddr_10.192.203.203)[18103]: INFO: Bringing device eth0 up
May 19 01:35:00 PCIPaddr(IPaddr_10.192.203.203)[18103]: INFO: /usr/libexec/heartbeat/send_arp -i200 -r 5 -p /var/run/resource-agents/send_arp-10.192.203.203 eth010.192.203.203 auto not_used not_used
May 19 01:35:00 PC/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[18089]:INFO:  Success
May 19 01:35:00 PCResourceManager(default)[17985]: info: Running /etc/init.d/mysql  start
May 19 01:35:03 PC heartbeat: [17972]:info: local HA resource acquisition completed (standby).
May 19 01:35:03 PC heartbeat: [17943]:info: Standby resource acquisition done [foreign].
May 19 01:35:03 PC heartbeat: [17943]:info: Initial resource acquisition complete (auto_failback)
May 19 01:35:03 PC heartbeat: [17943]:info: remote resource transition completed.

测试:

将主201上的心跳关闭

[root@PC ha_log]# service heartbeat stop

Stopping High-Availability services: Done.

查看日志:

May 19 01:46:57 PC heartbeat: [18561]: info:Giving up all HA resources.
May 19 01:46:58 PCResourceManager(default)[18574]: info: Releasing resource group: pcIPaddr::10.192.203.203 mysql
May 19 01:46:58 PCResourceManager(default)[18574]: info: Running /etc/init.d/mysql  stop
May 19 01:46:59 PC ResourceManager(default)[18574]:info: Running /etc/ha.d/resource.d/IPaddr 10.192.203.203 stop
May 19 01:46:59 PCIPaddr(IPaddr_10.192.203.203)[18652]: INFO: IP status = ok, IP_CIP=
May 19 01:46:59 PC/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[18638]:INFO:  Success
May 19 01:46:59 PC heartbeat: [18561]:info: All HA resources relinquished.
May 19 01:47:00 PC heartbeat: [17943]:WARN: 1 lost packet(s) for [slave2] [2777:2779]
May 19 01:47:00 PC heartbeat: [17943]:info: No pkts missing from slave2!
May 19 01:47:01 PC heartbeat: [17943]:info: killing HBWRITE process 17949 with signal 15
May 19 01:47:01 PC heartbeat: [17943]:info: killing HBREAD process 17950 with signal 15
May 19 01:47:01 PC heartbeat: [17943]:info: killing HBWRITE process 17951 with signal 15
May 19 01:47:01 PC heartbeat: [17943]:info: killing HBREAD process 17952 with signal 15
May 19 01:47:01 PC heartbeat: [17943]:info: killing HBFIFO process 17946 with signal 15
May 19 01:47:01 PC heartbeat: [17943]:info: killing HBWRITE process 17947 with signal 15
May 19 01:47:01 PC heartbeat: [17943]:info: killing HBREAD process 17948 with signal 15
May 19 01:47:01 PC heartbeat: [17943]:info: Core process 17951 exited. 7 remaining
May 19 01:47:02 PC heartbeat: [17943]:info: Core process 17946 exited. 6 remaining
May 19 01:47:02 PC heartbeat: [17943]:info: Core process 17947 exited. 5 remaining
May 19 01:47:02 PC heartbeat: [17943]:info: Core process 17948 exited. 4 remaining
May 19 01:47:02 PC heartbeat: [17943]:info: Core process 17949 exited. 3 remaining
May 19 01:47:02 PC heartbeat: [17943]:info: Core process 17950 exited. 2 remaining
May 19 01:47:02 PC heartbeat: [17943]:info: Core process 17952 exited. 1 remaining
May 19 01:47:02 PC heartbeat: [17943]:info: pc Heartbeat shutdown complete.

查看从202的日志:

harc(default)[8578]:         2016/05/19_01:47:00 info: Running /etc/ha.d//rc.d/statusstatus
mach_down(default)[8595]:    2016/05/19_01:47:00 info: Taking overresource group IPaddr::10.192.203.203
ResourceManager(default)[8622]: 2016/05/19_01:47:00 info: Acquiring resourcegroup: pc IPaddr::10.192.203.203 mysql
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[8650]: 2016/05/19_01:47:01 INFO:  Resource is stopped
ResourceManager(default)[8622]: 2016/05/19_01:47:01 info: Running/etc/ha.d/resource.d/IPaddr 10.192.203.203 start
IPaddr(IPaddr_10.192.203.203)[8746]:  2016/05/19_01:47:01 INFO: Adding inet address10.192.203.203/24 with broadcast address 10.192.203.255 to device eth0
IPaddr(IPaddr_10.192.203.203)[8746]:  2016/05/19_01:47:01 INFO: Bringing device eth0up
IPaddr(IPaddr_10.192.203.203)[8746]:  2016/05/19_01:47:01 INFO:/usr/libexec/heartbeat/send_arp -i 200 -r 5 -p/var/run/resource-agents/send_arp-10.192.203.203 eth0 10.192.203.203 autonot_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[8732]: 2016/05/19_01:47:01 INFO:  Success
ResourceManager(default)[8622]: 2016/05/19_01:47:02 info: Running/etc/init.d/mysql  start
mach_down(default)[8595]:    2016/05/19_01:47:05 info: /usr/share/heartbeat/mach_down:nice_failback: foreign resources acquired
mach_down(default)[8595]:    2016/05/19_01:47:05 info: mach_down takeovercomplete for node pc.
May 19 01:47:05 slave2 heartbeat: [6536]:info: mach_down takeover complete.
May 19 01:47:31 slave2 heartbeat: [6536]:WARN: node pc: is dead
May 19 01:47:31 slave2 heartbeat: [6536]:info: Dead node pc gave up resources.
May 19 01:47:31 slave2 heartbeat: [6536]:info: Link pc:eth1 dead.

显示202接管成功了。

在202上能看到vip已经漂移过来:

[root@slave2 ha_log]# ip addr

1: lo:  mtu16436 qdisc noqueue state UNKNOWN
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   inet 127.0.0.1/8 scope host lo
   inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
   link/ether 08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
   inet 10.192.203.202/24 brd 10.192.203.255 scope global eth0
   inet 10.192.203.203/24 brd 10.192.203.255 scope global secondary eth0
   inet6 fe80::a00:27ff:fe04:516/64 scope link
      valid_lft forever preferred_lft forever
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen1000
   link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
   inet 10.0.0.2/24 brd 10.0.0.255 scope global eth1
   inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link
      valid_lft forever preferred_lft forever

201已经没有vip

[root@PC ha_log]# ip addr

1: lo:  mtu16436 qdisc noqueue state UNKNOWN
   link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
   inet 127.0.0.1/8 scope host lo
   inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen1000
    link/ether08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
   inet 10.192.203.201/24 brd 10.192.203.255 scope global eth0
   inet6 fe80::a00:27ff:fe04:516/64 scope link tentative dadfailed
      valid_lft forever preferred_lft forever
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
   link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
   inet 10.0.0.1/24 brd 10.0.0.255 scope global eth1
   inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link tentative dadfailed
      valid_lft forever preferred_lft forever


四:heartbeat+mysql实现高可用

heartbeat只检测心跳也就是只检测设备是否宕机,不会检测MySQL服务,所以我们同样要有一个脚本来检测MySQL服务,如果mysql服务宕掉,则killheartbeat进程实现故障转移(nginx+keepalived原理一致),脚本内容如下:

分别在master1master2上新建检查mysql脚本

vi /root/check_mysql.sh

MYSQL=/usr/local/mysql/bin/mysql
MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=system@123 
 
$MYSQL -h $MYSQL_HOST -u $MYSQL_USER-p$MYSQL_PASSWORD -e "show status;" >/dev/null 2>&1
#$mysqlclient --host=$host --port=$port--user=$user --password=$password  -e"show databases;" > /dev/null 2>&1
if [ $? == 0 ]
then
   echo " $host mysql login successfully "
   exit 0
else
   #echo " $host mysql login faild"
   /etc/init.d/heartbeat stop
   exit 2
fi


这个脚本待写一些邮件通知的操作。

chmod +x /root/check_mysql.sh

设置成定时任务,每分钟检查一次:

*/1 * * * * /root/check_mysql.sh >>/root/check_mysql.log

关闭当前主的mysql,验证下vip是否漂移到了从。

本篇文章参考了以下文章:

http://www.linuxidc.com/Linux/2011-11/46764.htm

http://www.codesky.net/article/201111/173710.html

http://blog.chinaunix.net/uid-20639775-id-3337481.html

http://www.oschina.net/question/163914_31896

https://www.linuxzen.com/heartbeatshi-xian-mysqlshuang-ji-gao-ke-yong.html

http://www.it165.net/admin/html/201308/1702.html

http://blog.csdn.net/wyzxg/article/details/7741116