
Keepalived + LVS + MariaDB Galera Cluster 10.0: High-Availability, Load-Balanced Multi-Master Replication

程序员文章站 (Programmer Article Station), 2022-04-23 13:25:45

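I. Overview


1. Brief description: MariaDB Galera Cluster is a system architecture that implements multi-master operation and real-time data synchronization on top of the MySQL InnoDB storage engine. The application layer does not need to split reads and writes; database read and write load can be distributed across the nodes according to predefined rules. At the data level it is fully compatible with MariaDB, Percona Server, and MySQL.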



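2. Features:

(1). Synchronous replication

(2). Active-active multi-master topology

(3). Reads and writes can be performed on any node in the cluster

(4). Automatic membership control: failed nodes are removed from the cluster automatically

(5). Automatic node joining

(6). True parallel replication, at row level

(7). Direct client connections through the native MySQL interface

(8). Every node holds a complete copy of the data

(9). Data synchronization between the database servers is implemented through the wsrep interface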


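3. Limitations:

(1). Replication currently supports only the InnoDB storage engine. Writes to tables of any other engine, including the mysql.* tables, are not replicated. DDL statements are replicated, however, so creating a user is replicated, whereas insert into mysql.user… is not.

(2). DELETE is not supported on tables without a primary key. Rows in such tables may be ordered differently on different nodes, so SELECT…LIMIT… can return different result sets.

(3). LOCK/UNLOCK TABLES is not supported in a multi-master setup, nor are the lock functions GET_LOCK(), RELEASE_LOCK()…

(4). The query log cannot be stored in a table. If the query log is enabled, it can only be written to a file.

(5). The maximum transaction size is defined by wsrep_max_ws_rows and wsrep_max_ws_size. Any operation exceeding these limits is rejected, for example a large LOAD DATA.

(6). Because the cluster uses optimistic concurrency control, a transaction can be aborted at the commit stage. If two transactions write to the same row on different nodes and both commit, the one that loses is aborted. For cluster-level aborts the cluster returns the deadlock error code (Error: 1213 SQLSTATE: 40001 (ER_LOCK_DEADLOCK)).

(7). XA transactions are not supported, because a rollback may occur at commit.

(8). The write throughput of the whole cluster is limited by the weakest node: if one node becomes slow, the whole cluster becomes slow. For stable high performance, all nodes should use identical hardware.

(9). A minimum of 3 cluster nodes is recommended.

(10). A problematic DDL statement can break the cluster.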
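
Limitation (6) means that a client writing through the cluster should be prepared to retry a transaction that fails with error 1213. The following is a minimal sketch of such a retry wrapper, not a definitive implementation. It assumes the command prints an error message containing "ERROR 1213" on stderr, as the mysql command-line client does; the function name and the RETRY_DELAY variable are illustrative and not part of the original setup.

```shell
#!/bin/bash
# Sketch: retry a command that fails with MariaDB error 1213 (ER_LOCK_DEADLOCK).
# Usage: run_with_deadlock_retry MAX_ATTEMPTS command [args...]
# RETRY_DELAY (seconds between attempts, default 1) is an assumed knob.
run_with_deadlock_retry() {
    local max_attempts=$1
    shift
    local attempt=1 status err_file
    err_file=$(mktemp)
    while :; do
        if "$@" 2>"$err_file"; then
            rm -f "$err_file"
            return 0
        else
            status=$?
        fi
        # Pass the captured error text on so the caller still sees it.
        cat "$err_file" >&2
        # Give up on any other error, or once the attempts are used up.
        if ! grep -q "ERROR 1213" "$err_file" || [ "$attempt" -ge "$max_attempts" ]; then
            rm -f "$err_file"
            return "$status"
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-1}"
    done
}
```

A call could look like run_with_deadlock_retry 3 mysql -e "UPDATE ..." with whatever statement the application needs; errors other than 1213 are returned immediately without a retry.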


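II. Architecture


1. The classic Keepalived+LVS combination serves as the front-end load balancer and high-availability layer. Two dedicated hosts can act as master and backup; if the database cluster is small, for example two nodes, the combination can also run directly on the database hosts.


2. There are 5 hosts in total: 2 act as the Keepalived+LVS master and backup, and the other three are mdb1, mdb2, and mdb3. mdb1 serves as the reference node and executes no client SQL. This has the following benefits:

(1). Data consistency: because the "reference node" executes no client SQL, transaction conflicts are least likely to occur on it. If the cluster is found to hold inconsistent data, the data on the "reference node" should be the most accurate in the cluster.

(2). Data safety: because the "reference node" executes no client SQL, a catastrophic event is least likely to occur on it. When the whole cluster goes down, the "reference node" should be the best node from which to recover the cluster.

(3). High availability: the "reference node" can act as a dedicated state snapshot donor. Because it does not serve clients, using it for SST (State Snapshot Transfer) does not affect the user experience, and the front-end load balancer does not need to be reconfigured.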


III. Environment Preparation


1. System and software


System environment
  OS               CentOS release 6.5
  Architecture     x86_64
  Kernel version   2.6.32-431
Software versions
  Keepalived       1.2.13
  LVS              1.24
  MariaDB          10.0.16
  socat            1.7.3.0

2. Host environment

mdb1 (reference node) 172.16.21.180
mdb2 172.16.21.181
mdb3 172.16.21.182
ha1 (Keepalived+LVS master) 172.16.21.201
ha2 (Keepalived+LVS backup) 172.16.21.202
VIP 172.16.21.188


IV. Cluster Installation and Configuration

Using host mdb1 as the example:

1. Configure the hosts file

Edit /etc/hosts and add the following entries

[root@mdb1 ~]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.21.201   ha1
172.16.21.202   ha2
172.16.21.180   mdb1
172.16.21.181   mdb2
172.16.21.182   mdb3

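2. Prepare the YUM repositories

In addition to the official repositories shipped with the system, add the EPEL, Percona, and MariaDB repositories

[root@mdb1 ~]# vi /etc/yum.repos.d/MariaDB.repo
# MariaDB 10.0 RedHat repository list - created 2015-03-04 02:45 UTC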
# http://mariadb.org/mariadb/repositories/
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.0/rhel6-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1

[root@mdb1 ~]# rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@mdb1 ~]# rpm --import https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
[root@mdb1 ~]# vi /etc/yum.repos.d/Percona.repo
[percona]
name = CentOS $releasever - Percona
baseurl=http://repo.percona.com/centos/$releasever/os/$basearch/
enabled = 1
gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-percona
gpgcheck = 1

[root@mdb1 ~]# wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-percona http://www.percona.com/downloads/RPM-GPG-KEY-percona
[root@mdb1 ~]# yum clean all


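3. Install socat

socat is a versatile networking tool whose name stands for "Socket CAT"; it can be regarded as a much more powerful version of netcat.

In practice, if socat is not installed, the final data synchronization of MariaDB-Galera-server fails with an error. Many configuration guides online do not mention this, so remember to install it.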

[root@mdb1 ~]# tar -xzvf socat-1.7.3.0.tar.gz
[root@mdb1 ~]# cd socat-1.7.3.0
[root@mdb1 socat-1.7.3.0]# ./configure --prefix=/usr/local/socat
[root@mdb1 socat-1.7.3.0]# make && make install
[root@mdb1 socat-1.7.3.0]# ln -s /usr/local/socat/bin/socat /usr/sbin/

4. Install MariaDB, galera, and xtrabackup

[root@mdb1 ~]# rpm -e --nodeps mysql-libs
[root@mdb1 ~]# yum install MariaDB-Galera-server galera MariaDB-client xtrabackup
[root@mdb1 ~]# chkconfig mysql on
[root@mdb1 ~]# service mysql start


5. Set the MariaDB root password and harden security

[root@mdb1 ~]# /usr/bin/mysql_secure_installation

6. Create the SST account used for database synchronization

[root@mdb1 ~]# mysql -uroot -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 12
Server version: 10.0.16-MariaDB-wsrep-log MariaDB Server, wsrep_25.10.r4144

Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> grant all privileges on *.* to sst@'%' identified by '123456'; 
MariaDB [(none)]> flush privileges;
MariaDB [(none)]> quit


7. Create the wsrep.cnf file

[root@mdb1 ~]# cp /usr/share/mysql/wsrep.cnf /etc/my.cnf.d/
[root@mdb1 ~]# vi /etc/my.cnf.d/wsrep.cnf
Only the following 4 lines need to be changed:
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://"
wsrep_sst_auth=sst:123456
wsrep_sst_method=xtrabackup

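Note:

"gcomm://" is a special address used only when the Galera cluster is bootstrapped for the first time.

If the first node is shut down after the cluster has started, then before starting it again you must change "gcomm://" to the cluster address of another node. For example, the next start requires

wsrep_cluster_address="gcomm://172.16.21.182:4567"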

[Figure not preserved: cluster diagram with nodes labeled Node A through Node N]


In the figure, Node A is our mdb1, and Node N is the host mdb3 that is added later.
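
The address change described in the note above can be scripted rather than edited by hand. This is a minimal sketch, assuming wsrep.cnf contains exactly one uncommented wsrep_cluster_address= line; the function name is illustrative and not part of Galera.

```shell
#!/bin/bash
# Sketch: rewrite the wsrep_cluster_address line in a wsrep.cnf file.
# Usage: set_cluster_address /etc/my.cnf.d/wsrep.cnf "gcomm://172.16.21.182:4567"
set_cluster_address() {
    local cnf=$1 address=$2
    # Replace the whole line and keep the previous file as <cnf>.bak.
    # "|" is used as the sed delimiter because the address contains "/".
    sed -i.bak "s|^wsrep_cluster_address=.*|wsrep_cluster_address=\"${address}\"|" "$cnf"
}
```

The .bak copy makes it easy to restore the bootstrap address if the cluster ever has to be started from scratch again.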


8. Modify /etc/my.cnf

Add the following line

!includedir /etc/my.cnf.d/

It is also best to specify the datadir path in /etc/my.cnf

datadir = /var/lib/mysql

Otherwise you may hit an error saying the path cannot be found.


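9. Disable the iptables firewall and SELinux

Many people repeatedly fail to start the database cluster, most likely because the firewall is still enabled or the required ports are not open. The simplest approach is to flush iptables and disable SELinux.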

[root@mdb1 ~]# iptables -F
[root@mdb1 ~]# iptables-save > /etc/sysconfig/iptables
[root@mdb1 ~]# setenforce 0
[root@mdb1 ~]# vi /etc/selinux/config 

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
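
If flushing every rule is not acceptable, the replication ports can be opened per peer instead. The sketch below only prints the iptables commands so they can be reviewed before being applied. It assumes Galera's default ports, of which only 4567 appears in this article: 4567 for group communication, 4568 for IST, and 4444 for SST. The function name is illustrative, and client access to port 3306 still has to be allowed separately.

```shell
#!/bin/bash
# Sketch: print iptables commands that accept Galera replication traffic
# from each peer node given as an argument. Nothing is applied here.
galera_iptables_rules() {
    local peer port
    for peer in "$@"; do
        # 4567 = group communication, 4568 = IST, 4444 = SST (assumed defaults)
        for port in 4567 4568 4444; do
            echo "iptables -I INPUT -s ${peer} -p tcp --dport ${port} -j ACCEPT"
        done
    done
}

# Example on mdb1: review the output first, then pipe it to sh to apply it.
# galera_iptables_rules 172.16.21.181 172.16.21.182
```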


10. Restart MariaDB

[root@mdb1 ~]# service mysql restart
[root@mdb1 ~]#  netstat -tulpn | grep -e 4567 -e 3306
tcp        0      0 0.0.0.0:4567                0.0.0.0:*                   LISTEN      11325/mysqld        
tcp        0      0 0.0.0.0:3306                0.0.0.0:*                   LISTEN      11325/mysqld


At this point, the single-node configuration is complete.


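11. Add mdb2 and mdb3 to the cluster

The cluster is linked head to tail in a ring; put simply, each node has a different IP in "gcomm://": mdb3 -> mdb2 -> mdb1 -> mdb3. In production, consider using mdb1 as the reference node, which executes no client SQL and is used to safeguard data consistency and for data recovery. The procedure is as follows:

(1) Install and configure the other two hosts following steps 1-10 above

(2) The exception is step 7, where wsrep_cluster_address must be changed to the corresponding host address

mdb2:wsrep_cluster_address="gcomm://172.16.21.180:4567"

mdb3:wsrep_cluster_address="gcomm://172.16.21.181:4567"

If more hosts are to join the cluster, continue the same way: point wsrep_cluster_address at the previous host, and point the first host in the cluster at the last one.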
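
As an alternative to the ring, wsrep_cluster_address also accepts a comma-separated list of peers, so a node can join through whichever listed peer is reachable. The helper below is a small sketch that builds such an address. The function name and the GALERA_PORT variable are illustrative; 4567 is the port used throughout this article.

```shell
#!/bin/bash
# Sketch: build a gcomm:// address from a list of peer IPs.
# With no arguments it prints the bare bootstrap address "gcomm://".
build_gcomm_address() {
    local port=${GALERA_PORT:-4567} addr="" ip
    for ip in "$@"; do
        # Append ",ip:port", omitting the comma before the first entry.
        addr="${addr:+$addr,}${ip}:${port}"
    done
    echo "gcomm://${addr}"
}

# Example for mdb1, listing its two peers:
# build_gcomm_address 172.16.21.181 172.16.21.182
```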


12. Finally, start mdb2 and mdb3

[root@mdb2 ~]# service mysql start
[root@mdb3 ~]# service mysql start


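13. Add a Galera arbitrator to the cluster

A Galera Cluster with only 2 nodes, like other cluster software, has to face the "split-brain" state in extreme cases.

To avoid this problem, Galera introduces the "arbitrator".

An "arbitrator" node holds no data; its role is to arbitrate when the cluster splits. A cluster can have more than one "arbitrator" node.

Adding an "arbitrator" node to the cluster is simple; just run the following command: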

[root@mdb1 ~]# garbd -a gcomm://172.16.21.180:4567 -g my_wsrep_cluster -d

Parameters:

-d run in daemon mode

-a cluster address

-g cluster name
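
The reason an arbitrator helps is simple majority arithmetic: after a split, a partition stays primary only if it can still reach more than half of the previous members. A minimal sketch of that rule follows, assuming every member, including garbd, carries an equal vote; the function name is illustrative.

```shell
#!/bin/bash
# Sketch: succeed (exit 0) when the reachable members form a strict majority
# of the previous cluster size. Assumes equal weights; garbd counts as a member.
has_quorum() {
    local previous_size=$1 reachable=$2
    [ $((reachable * 2)) -gt "$previous_size" ]
}

# Two data nodes, one lost: 1 of 2 is not a majority, so the survivor stops serving.
# has_quorum 2 1   -> fails
# Two data nodes plus garbd, one data node lost: 2 of 3 is a majority.
# has_quorum 3 2   -> succeeds
```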


14. Confirm that the Galera cluster is installed and running correctly

MariaDB [(none)]> show status like 'ws%';
+------------------------------+----------------------------------------------------------+
| Variable_name                | Value                                                    |
+------------------------------+----------------------------------------------------------+
| wsrep_local_state_uuid       | 64784714-c23a-11e4-b7d7-5edbdea0e62c                     |  # UUID that uniquely identifies the cluster state
| wsrep_protocol_version       | 5                                                        |
| wsrep_last_committed         | 94049                                                    |  # sequence number of the last committed transaction
| wsrep_replicated             | 0                                                        |
| wsrep_replicated_bytes       | 0                                                        |
| wsrep_repl_keys              | 0                                                        |
| wsrep_repl_keys_bytes        | 0                                                        |
| wsrep_repl_data_bytes        | 0                                                        |
| wsrep_repl_other_bytes       | 0                                                        |
| wsrep_received               | 3                                                        |
| wsrep_received_bytes         | 287                                                      |
| wsrep_local_commits          | 0                                                        |  # transactions committed locally
| wsrep_local_cert_failures    | 0                                                        |  # local transactions that failed certification
| wsrep_local_replays          | 0                                                        |
| wsrep_local_send_queue       | 0                                                        |
| wsrep_local_send_queue_avg   | 0.333333                                                 |  # average send queue length
| wsrep_local_recv_queue       | 0                                                        |
| wsrep_local_recv_queue_avg   | 0.000000                                                 |
| wsrep_local_cached_downto    | 18446744073709551615                                     |
| wsrep_flow_control_paused_ns | 0                                                        |
| wsrep_flow_control_paused    | 0.000000                                                 |
| wsrep_flow_control_sent      | 0                                                        |
| wsrep_flow_control_recv      | 0                                                        |
| wsrep_cert_deps_distance     | 0.000000                                                 |  # how many transactions can be applied in parallel
| wsrep_apply_oooe             | 0.000000                                                 |
| wsrep_apply_oool             | 0.000000                                                 |
| wsrep_apply_window           | 0.000000                                                 |
| wsrep_commit_oooe            | 0.000000                                                 |
| wsrep_commit_oool            | 0.000000                                                 |
| wsrep_commit_window          | 0.000000                                                 |
| wsrep_local_state            | 4                                                        |
| wsrep_local_state_comment    | Synced                                                   |
| wsrep_cert_index_size        | 0                                                        |
| wsrep_causal_reads           | 0                                                        |
| wsrep_cert_interval          | 0.000000                                                 |
| wsrep_incoming_addresses     | 172.16.21.180:3306,172.16.21.182:3306,172.16.21.188:3306 |
| wsrep_cluster_conf_id        | 19                                                       |
| wsrep_cluster_size           | 3                                                        |  # number of cluster members
| wsrep_cluster_state_uuid     | 64784714-c23a-11e4-b7d7-5edbdea0e62c                     |
| wsrep_cluster_status         | Primary                                                  |  # this node is in the primary component
| wsrep_connected              | ON                                                       |  # whether the node is currently connected
| wsrep_local_bf_aborts        | 0                                                        |
| wsrep_local_index            | 0                                                        |
| wsrep_provider_name          | Galera                                                   |
| wsrep_provider_vendor        | Codership Oy                                             |
| wsrep_provider_version       | 25.3.5(rXXXX)                                            |
| wsrep_ready                  | ON                                                       |
| wsrep_thread_count           | 3                                                        |
+------------------------------+----------------------------------------------------------+

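If wsrep_ready is ON, the MariaDB Galera cluster is running correctly.


Monitoring status notes:

(1) Cluster integrity checks:

wsrep_cluster_state_uuid: should be identical on every node in the cluster; a node with a different value is not connected to the cluster.

wsrep_cluster_conf_id: normally identical on all nodes. A different value means the node has been temporarily "partitioned"; the value should converge again once network connectivity between the nodes is restored.

wsrep_cluster_size: if this matches the expected number of nodes, all cluster nodes are connected.

wsrep_cluster_status: the status of the cluster component. Anything other than "Primary" indicates a "partition" or "split-brain" condition.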


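(2) Node status checks:

wsrep_ready: ON means the node can accept SQL load. If it is OFF, check wsrep_connected.

wsrep_connected: if this is OFF and wsrep_ready is also OFF, the node is not connected to the cluster (possibly caused by a misconfigured wsrep_cluster_address or wsrep_cluster_name; check the error log for the specific error).

wsrep_local_state_comment: if wsrep_connected is ON but wsrep_ready is OFF, this item shows the reason.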


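(3) Replication health checks:

wsrep_flow_control_paused: the fraction of time replication was paused, i.e. how much the cluster is slowed down by slave lag. The value ranges from 0 to 1; the closer to 0 the better, and 1 means replication has stopped completely. Tuning wsrep_slave_threads can improve it.

wsrep_cert_deps_distance: how many transactions can be applied in parallel. wsrep_slave_threads should not be set much higher than this value.

wsrep_flow_control_sent: how many times this node has paused replication.

wsrep_local_recv_queue_avg: the average length of the slave transaction queue; an early sign of a slave bottleneck.

The slowest node has the highest wsrep_flow_control_sent and wsrep_local_recv_queue_avg. Lower values are better.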


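(4) Detecting slow-network problems:

wsrep_local_send_queue_avg: an early sign of a network bottleneck. If this value is high, a network bottleneck may exist.

(5) Number of conflicts or deadlocks:

wsrep_last_committed: sequence number of the last committed transaction

wsrep_local_cert_failures and wsrep_local_bf_aborts: rollbacks, i.e. the number of conflicts detected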
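
The node-level checks above can be combined into one scripted health test. This is a minimal sketch, assuming the status is supplied as tab-separated name/value lines, which is what the mysql client prints with the -N and -B options; the function name is illustrative.

```shell
#!/bin/bash
# Sketch: read "name<TAB>value" status lines on stdin and succeed only when the
# node is ready, in the primary component, and fully synced.
galera_node_healthy() {
    awk -F'\t' '
        $1 == "wsrep_ready"               { ready  = $2 }
        $1 == "wsrep_cluster_status"      { status = $2 }
        $1 == "wsrep_local_state_comment" { state  = $2 }
        END { exit !(ready == "ON" && status == "Primary" && state == "Synced") }
    '
}

# Example on a database node:
# mysql -uroot -p -N -B -e "SHOW STATUS LIKE 'wsrep%'" | galera_node_healthy && echo healthy
```

Because the function reports its result only through the exit status, it can be reused from cron jobs or from a load balancer health check script.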


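15. Test whether data is synchronized

On each node in turn, create a database and a table and then drop them, and check whether the other nodes follow. If the configuration is correct, they should stay in sync; the detailed steps are omitted.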
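
The omitted check can be scripted by running the same query on every node and comparing the answers. The following is only a sketch under stated assumptions: MYSQL_OPTS is a hypothetical variable holding the credentials, the function names are illustrative, and an empty answer from the first node is treated as a failure so that "missing everywhere" does not count as synchronized.

```shell
#!/bin/bash
# Succeed if every argument is identical (and at least one is given).
all_equal() {
    [ "$#" -gt 0 ] || return 1
    local first=$1 v
    for v in "$@"; do
        [ "$v" = "$first" ] || return 1
    done
    return 0
}

# Sketch: run QUERY on each node and compare the results.
# Usage: query_matches_on_all_nodes QUERY node1 [node2 ...]
# MYSQL_OPTS is an assumed variable, e.g. "-usomeuser -psomepassword".
query_matches_on_all_nodes() {
    local query=$1
    shift
    local results=() node
    for node in "$@"; do
        results+=("$(mysql ${MYSQL_OPTS:-} -h "$node" -N -B -e "$query")")
    done
    [ -n "${results[0]:-}" ] && all_equal "${results[@]}"
}

# Example: after CREATE DATABASE sync_test on one node, check all three:
# query_matches_on_all_nodes "SHOW DATABASES LIKE 'sync_test'" \
#     172.16.21.180 172.16.21.181 172.16.21.182 && echo "in sync"
```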


V. Keepalived+LVS Configuration


1. Install with YUM

[root@ha1 ~]# yum install keepalived ipvsadm
[root@ha2 ~]# yum install keepalived ipvsadm


2. Keepalived configuration

Configuration on the master host ha1

[root@ha1 ~]# vi /etc/keepalived/keepalived.conf
global_defs {
   notification_email {
        xx@xxxx.com
   }
   notification_email_from root@localhost
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id LVS_201
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.21.188/24 dev eth0 label eth0:0
    }
}
virtual_server 172.16.21.188 3306 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    nat_mask 255.255.255.0
    persistence_timeout 50
    protocol TCP

    real_server 172.16.21.181 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
    }
    real_server 172.16.21.182 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
    }
}


Configuration on the backup host ha2

global_defs {
   notification_email {
        xx@xxxx.com
   }
   notification_email_from root@localhost
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id LVS_202
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.21.188/24 dev eth0 label eth0:0
    }
}
virtual_server 172.16.21.188 3306 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    nat_mask 255.255.255.0
    persistence_timeout 50
    protocol TCP

    real_server 172.16.21.181 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
    }
    real_server 172.16.21.182 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
    }
}


3. LVS script configuration

The following script must be configured on both real servers

[root@mdb2 ~]# vi /etc/init.d/lvsdr.sh
#!/bin/bash
# description: Config realserver lo and apply noarp

VIP=172.16.21.188                    
. /etc/rc.d/init.d/functions
case "$1" in
start)
    /sbin/ifconfig lo down  
    /sbin/ifconfig lo up        
    echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
    echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
    echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
    /sbin/sysctl -p >/dev/null 2>&1
    /sbin/ifconfig lo:0 $VIP netmask 255.255.255.255 up
    /sbin/route add -host $VIP dev lo:0
    echo "LVS-DR real server starts successfully."            
    ;;
stop)    
    /sbin/ifconfig lo:0 down
    /sbin/route del $VIP >/dev/null 2>&1
    echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore 
    echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
    echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
    echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
    echo "LVS-DR real server stopped."
    ;;
status)
    # Report whether the VIP is bound to lo:0 or a host route to it exists.
    isLoOn=`/sbin/ifconfig lo:0 | grep "$VIP"`
    isRoOn=`/bin/netstat -rn | grep "$VIP"`
    if [ "$isLoOn" == "" -a "$isRoOn" == "" ]; then
       echo "LVS-DR real server is not running."
       exit 3
    else
       echo "LVS-DR real server is running."
    fi
    ;;
*)    
    echo "Usage: $0 {start|stop|status}" 
    exit 1    
esac                     
exit 0

[root@mdb2 ~]# chmod +x /etc/init.d/lvsdr.sh
[root@mdb3 ~]# chmod +x /etc/init.d/lvsdr.sh


4. Start Keepalived and LVS

[root@mdb2 ~]# /etc/init.d/lvsdr.sh start
[root@mdb3 ~]# /etc/init.d/lvsdr.sh start
[root@ha1 ~]# service keepalived start
[root@ha2 ~]# service keepalived start
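
Once keepalived is running, the LVS table on the active director can be checked with ipvsadm -Ln. The sketch below counts the real servers in that output. It assumes the usual ipvsadm -Ln layout, in which each real server row starts with "->" followed by ADDRESS:PORT; the function name is illustrative.

```shell
#!/bin/bash
# Sketch: count the real servers listed in `ipvsadm -Ln` output read from stdin.
# The header row ("-> RemoteAddress:Port ...") is skipped because its second
# field does not start with a digit.
count_real_servers() {
    awk '$1 == "->" && $2 ~ /^[0-9]/ { n++ } END { print n + 0 }'
}

# Example on ha1; with both database nodes passing the TCP check this
# should print 2 for the configuration shown in this article:
# ipvsadm -Ln | count_real_servers
```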


5. Enable start at boot

[root@mdb2 ~]# echo "/etc/init.d/lvsdr.sh start" >> /etc/rc.d/rc.local
[root@mdb3 ~]# echo "/etc/init.d/lvsdr.sh start" >> /etc/rc.d/rc.local
[root@ha1 ~]# chkconfig keepalived on
[root@ha2 ~]# chkconfig keepalived on

6. Testing

Stop keepalived on the master ha1, then watch the log and IP changes on the backup ha2

[root@ha1 ~]# service keepalived stop
[root@ha2 ~]# tail -f /var/log/messages
Mar  5 10:36:03 ha2 Keepalived_healthcheckers[11249]: Opening file '/etc/keepalived/keepalived.conf'.
Mar  5 10:36:03 ha2 Keepalived_healthcheckers[11249]: Configuration is using : 14697 Bytes
Mar  5 10:36:03 ha2 Keepalived_vrrp[11250]: Opening file '/etc/keepalived/keepalived.conf'.
Mar  5 10:36:03 ha2 Keepalived_vrrp[11250]: Configuration is using : 63250 Bytes
Mar  5 10:36:03 ha2 Keepalived_vrrp[11250]: Using LinkWatch kernel netlink reflector...
Mar  5 10:36:03 ha2 Keepalived_healthcheckers[11249]: Using LinkWatch kernel netlink reflector...
Mar  5 10:36:03 ha2 Keepalived_healthcheckers[11249]: Activating healthchecker for service [172.16.21.181]:3306
Mar  5 10:36:03 ha2 Keepalived_healthcheckers[11249]: Activating healthchecker for service [172.16.21.182]:3306
Mar  5 10:36:03 ha2 Keepalived_vrrp[11250]: VRRP_Instance(VI_1) Entering BACKUP STATE
Mar  5 10:36:03 ha2 Keepalived_vrrp[11250]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Mar  6 08:41:53 ha2 Keepalived_vrrp[11250]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar  6 08:41:54 ha2 Keepalived_vrrp[11250]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar  6 08:41:54 ha2 Keepalived_vrrp[11250]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar  6 08:41:54 ha2 Keepalived_vrrp[11250]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 172.16.21.188
Mar  6 08:41:54 ha2 Keepalived_healthcheckers[11249]: Netlink reflector reports IP 172.16.21.188 added
Mar  6 08:41:59 ha2 Keepalived_vrrp[11250]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 172.16.21.188
[root@ha2 ~]#ifconfig 
eth0      Link encap:Ethernet  HWaddr 00:0C:29:1D:77:9C  
          inet addr:172.16.21.202  Bcast:172.16.21.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe1d:779c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2969375670 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2966841735 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:225643845081 (210.1 GiB)  TX bytes:222421642143 (207.1 GiB)

eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:1D:77:9C  
          inet addr:172.16.21.188  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:55694 errors:0 dropped:0 overruns:0 frame:0
          TX packets:55694 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:3176387 (3.0 MiB)  TX bytes:3176387 (3.0 MiB)


Start keepalived on ha1 again, then watch the log and IP on ha1

[root@ha1 ~]# service keepalived start
[root@ha1 ~]# tail -f /var/log/messages
Mar  6 08:54:42 ha1 Keepalived[13310]: Starting Keepalived v1.2.13 (10/15,2014)
Mar  6 08:54:42 ha1 Keepalived[13311]: Starting Healthcheck child process, pid=13312
Mar  6 08:54:42 ha1 Keepalived[13311]: Starting VRRP child process, pid=13313
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: Netlink reflector reports IP 172.16.21.181 added
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: Netlink reflector reports IP fe80::20c:29ff:fe4d:8e83 added
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Netlink reflector reports IP 172.16.21.181 added
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: Registering Kernel netlink reflector
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Netlink reflector reports IP fe80::20c:29ff:fe4d:8e83 added
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: Registering Kernel netlink command channel
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: Registering gratuitous ARP shared channel
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Registering Kernel netlink reflector
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Registering Kernel netlink command channel
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: Opening file '/etc/keepalived/keepalived.conf'.
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Opening file '/etc/keepalived/keepalived.conf'.
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: Configuration is using : 63252 Bytes
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: Using LinkWatch kernel netlink reflector...
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Configuration is using : 14699 Bytes
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Using LinkWatch kernel netlink reflector...
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Activating healthchecker for service [172.16.21.181]:3306
Mar  6 08:54:42 ha1 Keepalived_healthcheckers[13312]: Activating healthchecker for service [172.16.21.182]:3306
Mar  6 08:54:42 ha1 Keepalived_vrrp[13313]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Mar  6 08:54:43 ha1 Keepalived_vrrp[13313]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar  6 08:54:43 ha1 Keepalived_vrrp[13313]: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
Mar  6 08:54:44 ha1 Keepalived_vrrp[13313]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar  6 08:54:44 ha1 Keepalived_vrrp[13313]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar  6 08:54:44 ha1 Keepalived_vrrp[13313]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 172.16.21.188
Mar  6 08:54:44 ha1 Keepalived_healthcheckers[13312]: Netlink reflector reports IP 172.16.21.188 added
Mar  6 08:54:49 ha1 Keepalived_vrrp[13313]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 172.16.21.188
[root@ha1 ~]#ifconfig 
eth0      Link encap:Ethernet  HWaddr 00:0C:29:4D:8E:83  
          inet addr:172.16.21.201  Bcast:172.16.21.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe4d:8e83/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2968402607 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2966256067 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:224206102960 (208.8 GiB)  TX bytes:221258814612 (206.0 GiB)

eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:4D:8E:83  
          inet addr:172.16.21.188  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:54918 errors:0 dropped:0 overruns:0 frame:0
          TX packets:54918 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:3096422 (2.9 MiB)  TX bytes:3096422 (2.9 MiB)


At this point, all configuration is complete.


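VI. Summary

When MySQL multi-master replication comes up, most people probably think of the MySQL+MMM architecture. MariaDB Galera Cluster is a good replacement for it and is more reliable; for a detailed comparison see http://www.oschina.net/translate/from-mysql-mmm-to-mariadb-galera-cluster-a-high-availability-makeover.

Of course, MariaDB Galera Cluster does not suit every situation that needs replication; you have to decide based on your own requirements. If data consistency is your main concern and you have many writes and updates but the write volume is not very large, MariaDB Galera Cluster suits you. If your workload is mostly queries and read/write splitting is easy to implement, plain replication is the better choice: it is simple to use, one master guarantees data consistency, and several slaves serve reads and share the load. As long as data consistency and uniqueness are handled well, replication suits you better. After all, MariaDB Galera Cluster follows the "wooden bucket" (weakest-link) principle: when the write volume is large, synchronization speed is determined by the node with the lowest I/O capacity, and overall writes are much slower than with replication.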

If there are any omissions or errors in this article