欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

MySQL-Cluster三机集群+HA高可用+负载均衡配置手册

程序员文章站 2022-03-12 09:17:21
...

本文中的配置已经在实验室进行过性能测试、可靠性测试验证。
一、介绍
这篇文档旨在介绍如何安装配置基于3台服务器的MySQL集群。并且实现任意一台服务器出现问题或宕机时MySQL集群依然能够继续运行。
1. MySQL-Cluster简介
MySQL-Cluster主要有三种类型节点:
NDB node,数据节点将保存数据库,它会自动复制所有的数据节点;
SQL node,守护节点相当于是SQL数据库和客户端之间的接口,它提供从数据节点查询等操作,类似于"网关";
Management node,管理节点,用以监控和管理整个集群。
-------------------------------此处插图----------------------------
2.安装环境及软件包
3台华为RH2285服务器——其中两台(16颗CPU、96G内存、500G磁盘阵列)作为NDB节点和SQL节点服务器,另一台服务器16颗CPU、32G内存作为管理节点


软件包:MySQL-Cluster-client-gpl-7.2.13-1.rhel5.x86_64.rpm
        MySQL-Cluster-server-gpl-7.2.13-1.rhel5.x86_64.rpm
        MySQL-Cluster-devel-gpl-7.2.13-1.rhel5.x86_64.rpm
        MySQL-Cluster-embedded-gpl-7.2.13-1.rhel5.x86_64.rpm
        MySQL-Cluster-shared-compat-gpl-7.2.13-1.rhel5.x86_64.rpm
        MySQL-Cluster-shared-gpl-7.2.13-1.rhel5.x86_64.rpm
操作系统:Redhat.Enterprise.Linux.v5.UPDATE.7.X86_64.DVD-HOTiSO.iso


服务器hostname和IP地址:
db1 172.16.1.74
db2 172.16.1.75
demo 172.16.1.78


数据库规划:
数据库管理节点(ndb_mgmd):172.16.1.78
数据库数据节点(NDB):172.16.1.74,172.16.1.75
数据库SQL节点(API节点):172.16.1.74,172.16.1.75


二、在db1和db2上安装MySQL-cluster
以下步骤需要在db1和db2上各做一次
1.首先检查linux操作系统中是否安装了自带的mysql数据库
一般linux自带mysql版本都比较低一些,因此要安装比较新的版本需要先将自带的低版本卸载。
检查操作系统中是否安装了mysql的命令:
rpm -qa |grep -i mysql
卸载mysql的命令:rpm -e mysqlXXXX 
或者使用yum -remove mysqlXXX来删除,这样可以避免卸载过程中由于各个包之间的依赖造成卸载停止
2.创建mysql用户和组
groupadd mysql
useradd -g mysql mysql
usermod -d /home/mysql mysql
3.将MySQL-Cluster上传到/home/mysql/目录下,开始安装MySQL-Cluster
[root@db1 sdd]# rpm -ivh MySQL-Cluster-server-gpl-7.2.14-1.rhel5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-Cluster-server-gp########################################### [100%]


PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:


/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h db1 password 'new-password'


Alternatively you can run:
/usr/bin/mysql_secure_installation


which will also give you the option of removing the test
databases and anonymous user created by default.  This is
strongly recommended for production servers.


See the manual for more instructions.


Please report any problems with the /usr/bin/mysqlbug script!




[root@db1 sbin]# rpm -ivh MySQL-Cluster-client-gpl-7.2.13-1.rhel5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-Cluster-client-gp ########################################### [100%]
依次完成MySQL-Cluster-devel-gpl-7.2.13-1.rhel5.x86_64.rpm
        MySQL-Cluster-embedded-gpl-7.2.13-1.rhel5.x86_64.rpm
        MySQL-Cluster-shared-compat-gpl-7.2.13-1.rhel5.x86_64.rpm
        MySQL-Cluster-shared-gpl-7.2.13-1.rhel5.x86_64.rpm
        的安装
        
4.安装完成之后创建连接
ln -s /usr/sbin/ndbd /usr/bin
ln -s /usr/sbin/ndb_mgmd /usr/bin
ln -s /usr/sbin/mysqld /usr/bin
#ln -s /usr/local/lib/mysqlmanager /usr/bin ---这个在7.2.13版本上不需要


管理节点服务器demo(172.16.1.78)上只安装MySQL-Cluster-server-gpl-7.2.13-1.rhel5.x86_64.rpm即可


三、安装并配置节点


1.配置管理节点(172.16.1.78)配置文件
# mkdir /var/lib/mysql-cluster
# cd /var/lib/mysql-cluster
# vi config.ini
在config.ini中添加如下内容:
[ndbd default]
NoOfReplicas= 2
MaxNoOfConcurrentOperations= 500000
MaxNoOfLocalOperations=500000
MaxNoOfConcurrentTransactions=500000
MaxNoOfConcurrentIndexOperations=50000
# Amount of memory required=(SizeofDatabase * NumberOfReplicas * 1.1 ) / NumberOfDataNodes
DataMemory= 49152M
IndexMemory= 6144M
TimeBetweenWatchDogCheck= 30000
DataDir=/var/lib/mysql/data
MaxNoOfAttributes=500000
MaxNoOfTables=1000
MaxNoOfOrderedIndexes= 512
#StartPartialTimeout=100
#StartPartitionedTimeout=100
#ArbitrationTimeout=5000
#MaxNoOfFiredTriggers=8000
#TransactionBufferMemory=5M
TransactionDeadlockDetectionTimeout=90000
HeartbeatIntervalDbDb=5000


#StopOnError=1


[ndb_mgmd default]
DataDir=/var/lib/mysql/data
[ndb_mgmd]
NodeId=1
HostName= 172.16.1.78
[ndbd]
NodeId= 2
HostName= 172.16.1.74
[ndbd]
NodeId= 3
HostName= 172.16.1.75
[mysqld]
ArbitrationRank=2
[mysqld]
ArbitrationRank=2
[mysqld]
[mysqld]
[mysqld]
[mysqld]
[tcp default]
portnumber=2279




以下步骤需要在db1和db2上各做一次
2.配置通用my.cnf文件(SQL节点及ndb节点均使用此文件)
# vi /etc/my.cnf
在my.cnf中添加如下内容:
[mysqld]
datadir=/var/lib/mysql/data/
socket=/var/lib/mysql/mysql.sock
lower_case_table_names=1
ndb_cluster_connection_pool=2
tmp_table_size=512M
max_heap_table_size=512M
max_connections =400
thread_stack=1M
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).


old_passwords=1


default-storage-engine=ndbcluster
ndbcluster
ndb-connectstring=172.16.1.78
[ndbd]
connect-string=172.16.1.78
[ndb_mgm]
connect-string=172.16.1.78
[ndb_mgmd]
config-file=/var/lib/mysql-cluster/config.ini
[mysql_cluster]
ndb-connectstring=172.16.1.78
[mysql.server]
user=mysql
basedir=/usr
[mysqld_safe]
log-error=/var/log/mysqld.log


#pid-file=/var/run/mysqld/mysqld.pid
[mysql]
#socket=/usr/local/var/mysql.sock
[mysqladmin]
#socket=/usr/local/var/mysql.sock
[ndb_restore default]




四、启动管理节点
MySQL安装目录初始化
执行/usr/bin/mysql_install_db --user=mysql --datadir=/usr/local/mysql/data即可。注意这里的datadir要和config.ini和my.cnf中配置的datadir一致


[root@demo sbin]# ndb_mgmd --ndb_nodeid=1
MySQL Cluster Management Server mysql-5.5.31 ndb-7.2.13
2013-10-29 15:45:36 [MgmtSrvr] INFO     -- The default config directory '/usr/mysql-cluster' does not exist. Trying to create it...
2013-10-29 15:45:36 [MgmtSrvr] INFO     -- Sucessfully created config directory
2013-10-29 15:45:36 [MgmtSrvr] WARNING  -- at line 38: [tcp] portnumber is deprecated
2013-10-29 15:45:36 [MgmtSrvr] WARNING  -- at line 38: Cluster configuration warning:
arbitrator with id 1 and db node with id 2 on same host 172.16.1.74
arbitrator with id 2 and db node with id 3 on same host 172.16.1.75
arbitrator with id 4 has no hostname specified
arbitrator with id 5 has no hostname specified
Running arbitrator on the same host as a database node may
cause complete cluster shutdown in case of host failure.
注:在启动时几个警告提示
2013-10-29 15:45:36 [MgmtSrvr] WARNING  -- at line 38: [tcp] portnumber is deprecated这个警告提示可以不必理会,不影响使用;
2013-10-29 15:45:36 [MgmtSrvr] WARNING  -- at line 38: Cluster configuration warning这个警告提示说节点1和3,2和4的arbitrator一样,可能引起整个集群失败,由于是双管理节点所示此警告可以不用放在心上


[root@demo sbin]# netstat -lntpu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address      Foreign Address    State       PID/Program name   
tcp        0      0 0.0.0.0:1186      0.0.0.0:*         LISTEN      9484/ndb_mgmd
看到1186端口开放了说明启动是正常的。




五、初始化集群
在db1中
[root@db1 sbin]# ndbd --ndb_nodeid=2 --initial
2013-10-30 09:37:07 [ndbd] INFO     -- Angel connected to '172.16.1.78:1186'
2013-10-30 09:37:07 [ndbd] INFO     -- Angel allocated nodeid: 2
执行netstat –lntpu,如果列表中有端口为2279,表示初始化成功
[root@db1 mysql-cluster]# netstat -lntpu
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address         Foreign Address  State      PID/Program name
tcp        0      0 0.0.0.0:1186           0.0.0.0:*       LISTEN      13713/ndb_mgmd
tcp        0      0 172.16.1.74:2279     0.0.0.0:*       LISTEN      13735/ndbd


在db2中
[root@db2 mysql]# ndbd --ndb_nodeid=3 --initial
2013-10-30 09:37:49 [ndbd] INFO     -- Angel connected to '172.16.1.78:1186'
2013-10-30 09:37:49 [ndbd] INFO     -- Angel allocated nodeid: 3
初始化集群也可以直接输入ndbd –initial命令,而不必输入节点号
注:只有在第一次启动ndbd时或者对config.ini进行改动后才需要使用--initial参数!数据库维护过程中慎用initial参数,
如果数据库中已经存在大量数据,并且在所有数据节点上都执行了 ndbd --initial,那么所有数据库中的表和数据都会消失。


检查集群工作状态
在db1或者db2任意一台服务器上启动管理终端:
[root@db2 mysql]# ndb_mgm -e show
Connected to Management Server at: 172.16.1.78:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2    @172.16.1.74  (mysql-5.5.31 ndb-7.2.13, Nodegroup: 0, Master)
id=3    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13, Nodegroup: 0)


[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.16.1.78  (mysql-5.5.31 ndb-7.2.13)


[mysqld(API)]   4 node(s)
id=4 (not connected, accepting connect from any host)
id=5 (not connected, accepting connect from any host)
id=6 (not connected, accepting connect from any host)
id=7 (not connected, accepting connect from any host)
id=8 (not connected, accepting connect from any host)
id=9 (not connected, accepting connect from any host)
如果上面没有问题,现在开始加入mysqld(API)
六、启动SQL节点
在db1 中:
[root@db1 sbin]# mysqld_safe --ndb_nodeid=4 --user=mysql &
[1] 12495
[root@db1 sbin]# 131030 10:26:00 mysqld_safe Logging to '/var/log/mysqld.log'.
131030 10:26:00 mysqld_safe Starting mysqld daemon with databases from /usr/local/mysql/data/
在db2 中:
[root@db2 mysql]# mysqld_safe --ndb_nodeid=6 --user=mysql &
[1] 11756
[root@db2 sbin]#131030 10:18:56 mysqld_safe Logging to '/var/log/mysqld.log'.
131030 10:18:57 mysqld_safe Starting mysqld daemon with databases from /usr/local/mysql/data/


出现131110  0:37:19 [ERROR] Fatal error: Can't open and lock privilege tables: Table 'mysql.host' doesn't exist
错误时怎样解决问题
/usr/bin/mysql_install_db --user=mysql --datadir=/var/lib/mysql/data/


再次检查集群工作状态,看SQL节点是否加入成功,出现下面的提示信息表示添加成功
[root@db1 sbin]# ndb_mgm -e show
Connected to Management Server at: 172.16.1.78:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2    @172.16.1.74  (mysql-5.5.31 ndb-7.2.13, Nodegroup: 0, Master)
id=3    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13, Nodegroup: 0)


[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.16.1.78  (mysql-5.5.31 ndb-7.2.13)


[mysqld(API)]   4 node(s)
id=4    @172.16.1.74  (mysql-5.5.31 ndb-7.2.13)
id=5    @172.16.1.74  (mysql-5.5.31 ndb-7.2.13)
id=6    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13)
id=7    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13)
id=8 (not connected, accepting connect from any host)
id=9 (not connected, accepting connect from any host)


此步也可以通过执行mysql服务器启动命令来代替
[root@db1 mysql-cluster]# service mysql start
Starting MySQL.......................[确定]......................


设置开机自动启动
将一下内容加入到/etc/rc.d/rc.local文件的最后一行,就可以使得ndb_mgmd和ndbd开机自动启动了
管理节点服务器172.16.1.78上添加ndb_mgmd -f /var/lib/mysql-cluster/config.ini
数据和SQL节点服务器172.16.1.74、172.16.1.75上添加ndbd
注:mysqld服务会自动重启,不需要添加


七、设置数据库的用户名和密码
执行如下命令:
./bin/mysqladmin -u root password 'huawei'
./bin/mysqladmin -u root -h db2 password 'huawei'
新增一个webapp用户,并且配置权限
mysql>GRANT ALL ON *.* TO 'webapp'@'%' IDENTIFIED BY 'huawei';
mysql>GRANT ALL ON *.* TO 'webapp'@'localhost' IDENTIFIED BY 'huawei';
mysql>GRANT ALL ON *.* TO 'webapp'@'db1' IDENTIFIED BY 'huawei';
mysql>GRANT ALL ON *.* TO 'webapp'@'db2' IDENTIFIED BY 'huawei';


八、功能测试
到管理节点查看下相关服务状态
# ndb_mgm
ndb_mgm> show
Connected to Management Server at: 172.16.1.78:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2    @172.16.1.74  (mysql-5.5.31 ndb-7.2.13, Nodegroup: 0, Master)
id=3    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13, Nodegroup: 0)


[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.16.1.78  (mysql-5.5.31 ndb-7.2.13)


[mysqld(API)]   6 node(s)
id=4    @172.16.1.74  (mysql-5.5.31 ndb-7.2.13)
id=5    @172.16.1.74  (mysql-5.5.31 ndb-7.2.13)
id=6    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13)
id=7    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13)
id=8 (not connected, accepting connect from any host)
id=9 (not connected, accepting connect from any host)
可以看到这里的数据节点、管理节点、sql节点都是正常的。现在我们在其中一个数据节点上进行相关数据库的创建,然后到另外一个数据节点上看看数据是否同步。
[root@db2 usr]# mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.31-ndb-7.2.13-cluster-gpl MySQL Cluster Community Server (GPL)


Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.


Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.


Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| ndb_4_fs           |
| ndbinfo            |
| performance_schema |
| test               |
+--------------------+
6 rows in set (0.08 sec)
mysql> create database aa;
Query OK, 1 row affected (3.56 sec)


mysql> use aa
Database changed
mysql> CREATE TABLE ctest2 (i INT) ;
Query OK, 0 rows affected (24.71 sec)


mysql> INSERT INTO ctest2 () VALUES (1);
Query OK, 1 row affected (0.54 sec)


mysql> SELECT * FROM ctest2;
+------+
| i    |
+------+
|    1 |
+------+
1 row in set (0.01 sec) 


现在到另外一个数据节点db1查看下aa数据库是否同步过来了。
[root@db1 mysql-cluster]# mysql -u root
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| aa                 |
| mysql              |
| ndb_3_fs           |
| ndbinfo            |
| performance_schema |
| test               |
+--------------------+
7 rows in set (4.80 sec) 
mysql> use aa
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A


Database changed
mysql> select * from ctest2;
+------+
| i    |
+------+
|    1 |
+------+
1 row in set (0.01 sec) 
从上面可以看到数据已经同步了,mysql集群环境已经搭建完成。
九、破坏性测试
在上面可以看到172.16.1.74作为主的数据节点,现在把172.16.1.74这台机器关闭,有如下的结果:
[root@db1 mysql-cluster]# shutdown -h now
Broadcast message from root (pts/2) (Thu Oct 31 14:00:56 2013):
The system is going down for system halt NOW!


[root@db2 ~]# ndb_mgm -e show
Connected to Management Server at: 172.16.1.75:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2 (not connected, accepting connect from 172.16.1.74)
id=3    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13, Nodegroup: 0, Master)


[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.16.1.78  (mysql-5.5.31 ndb-7.2.13)


[mysqld(API)]   4 node(s)
id=4 (not connected, accepting connect from any host)
id=5 (not connected, accepting connect from any host)
id=6    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13)
id=7    @172.16.1.75  (mysql-5.5.31 ndb-7.2.13)
id=8 (not connected, accepting connect from any host)
id=9 (not connected, accepting connect from any host)
从上面可以发现现在172.16.1.74这台机器的数据节点和sql节点已经连接不上了,172.16.1.75成为了主数据节点,现在在172.16.1.75数据节点上创建一个表,然后恢复172.16.1.74的数据节点,查看是否能够把数据同步过来了。
先在172.16.1.75数据节点做如下操作:
mysql> create table ctest3(id int(11)) ;
mysql> show tables;
mysql> create table ctest3(id int(11)) ;
Query OK, 0 rows affected (0.20 sec)


mysql> show tables;
+--------------+
| Tables_in_aa |
+--------------+
| ctest2       |
| ctest3       |
+--------------+
2 rows in set (0.00 sec)


mysql> insert into ctest3 values(1);
Query OK, 1 row affected (0.02 sec)


mysql> select * from ctest3;
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.01 sec)
然后我们恢复172.16.1.74数据节点,查看下ctest3数据是否同步过来了。
[root@db1 ~]# mysql -u root
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| aa                 |
| mysql              |
| ndb_3_fs           |
| ndbinfo            |
| performance_schema |
| test               |
+--------------------+
7 rows in set (0.19 sec)


mysql> use aa
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A


Database changed
mysql> show tables;
+--------------+
| Tables_in_aa |
+--------------+
| ctest2       |
| ctest3       |
+--------------+
2 rows in set (0.00 sec)


mysql> select * from ctest3;
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.06 sec)
可以看到172.16.1.74数据节点已经把172.16.1.75数据节点的数据同步过来了,说明mysql集群是没有问题的。




HA高可用+负载均衡的配置
十、实现高可用和负载均衡(还需要再完善整理)
ldirectord+heartbeat介绍
ldirectord的作用
在Apache服务器的前端,我们要放置一台服务器专门来做负载调度的任务(为了称呼简单和便于理解,我们将这样的负载调度服务器简称为“导演”),
用来把访问需求分发给两台Apache服务器。这个“导演”的任务,正是由ldirectord来完成的。
“ldirectord”(Linux Director Daemon)可以对服务和物理服务器进行监测,被广泛地用于http和https等服务。
它是专门为LVS(Linux Virtual Server)监控而编写的,
不仅能从heartbeat的配置文件/etc/ha.d/xxx.cf中读取所有有关IPVS(IP Virtul Server)路由表配置的信息,
还可以方便地被heartbeat管理(比如由heartbeat来启动和停止ldirectord服务)。


heartbeat的工作原理
因为我们要提供“高可用性”,所以要考虑到“导演”突然停止的情形;因此我们要安排两个导演,也就是要有两个调度服务器节点。
这两个节点的地位不同,其中一个是主节点,另外一个是辅节点(可以看成是一个“主导演”和一个“副导演”)。
这两个节点正是用heartbeat来互相监测对方的。heartbeat可以通过以太网(或者串行接口)来监控节点的“健康”状况。
如果有多个heartbeat节点(heartbeat 2.0及后续版本已经能够支持两个以上节点),我们既可以使用串行线又可以使用以太网连接它们,
这样将大大提高系统的可用性。heartbeat的核心功能有两个部分:心跳监测和资源接管。通过心跳监测,节点之间相互“打招呼”(发送报文)
来告诉对方自己当前的状态;如果在指定的时间内没“听”到对方“打招呼”(没收到报文),那么就认为对方停止了,
这时heartbeat会自动启动资源接管模块,运行相关的shell脚本来接管运行在对方主机上的资源或者服务。


注意:heartbeat+ldirectord要发挥作用,必须保证承担负载均衡的两台服务器时间是同步的。
具体到我的配置里就要求172.16.1.74与172.16.1.75服务器时间同步。


查看内核是否支持IPVS
[root@db1 init.d]# modprobe -l |grep ipvs
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_dh.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_ftp.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_lblc.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_lblcr.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_lc.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_nq.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_rr.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_sed.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_sh.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_wlc.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_wrr.ko


[root@db2 ipvs]# modprobe -l |grep ipvs
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_dh.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_ftp.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_lblc.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_lblcr.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_lc.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_nq.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_rr.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_sed.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_sh.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_wlc.ko
/lib/modules/2.6.18-274.el5/kernel/net/ipv4/ipvs/ip_vs_wrr.ko




启用内核的包转发功能
修改/etc/sysctl.conf
#net.ipv4.ip_forward = 0
net.ipv4.ip_forward = 1
[root@db1 etc]# sysctl -p //使其马上生效
net.ipv4.ip_forward = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296


安装heartbeat和lidirectord软件
[root@db1 heartbeat2_x86_64]# ls -l
-rw-r--r-- 1 root root 1691714 11-10 11:31 heartbeat-2.1.4-11.el5.x86_64.rpm
-rw-r--r-- 1 root root  111941 11-10 12:08 heartbeat-ldirectord-2.1.4-11.el5.x86_64.rpm
-rw-r--r-- 1 root root   93297 11-10 11:57 heartbeat-pils-2.1.4-11.el5.x86_64.rpm
-rw-r--r-- 1 root root  190686 11-10 11:56 heartbeat-stonith-2.1.4-11.el5.x86_64.rpm
-rw-r--r-- 1 root root   33839 11-10 12:10 ipvsadm-1.24-12.el5.x86_64.rpm
-rw-r--r-- 1 root root   62324 11-10 11:44 libnet-1.1.6-2.el5.pp.x86_64.rpm
-rw-r--r-- 1 root root   38792 2011-09-10 libtool-ltdl-1.5.22-7.el5_4.x86_64.rpm
-rw-r--r-- 1 root root  171524 2011-09-10 openhpi-libs-2.14.0-5.el5.x86_64.rpm
-rw-r--r-- 1 root root  224092 11-10 12:18 perl-DBD-mysql-4.005-1.el5.rf.x86_64.rpm
-rw-r--r-- 1 root root  860170 11-10 12:19 perl-DBI-1.58-2.el5.rf.x86_64.rpm
-rw-r--r-- 1 root root   93196 2011-09-10 perl-MailTools-1.77-1.el5.centos.noarch.rpm
-rw-r--r-- 1 root root   45282 11-10 12:26 perl-Net-Daemon-0.39-1.el5.rf.noarch.rpm
-rw-r--r-- 1 root root   33187 11-10 12:25 perl-PlRPC-0.2018-1.2.el5.rf.noarch.rpm
-rw-r--r-- 1 root root   32736 2007-07-02 perl-TimeDate-1.16-5.el5.noarch.rpm


[root@db1 heartbeat2_x86_64]# rpm -ivh *.rpm
[root@db2 heartbeat2_x86_64]# ls
heartbeat-2.1.4-11.el5.x86_64.rpm             ipvsadm-1.24-12.el5.x86_64.rpm          perl-DBD-mysql-4.005-1.el5.rf.x86_64.rpm     perl-PlRPC-0.2018-1.2.el5.rf.noarch.rpm
heartbeat-ldirectord-2.1.4-11.el5.x86_64.rpm  libnet-1.1.6-2.el5.pp.x86_64.rpm        perl-DBI-1.58-2.el5.rf.x86_64.rpm            perl-TimeDate-1.16-5.el5.noarch.rpm
heartbeat-pils-2.1.4-11.el5.x86_64.rpm        libtool-ltdl-1.5.22-7.el5_4.x86_64.rpm  perl-MailTools-1.77-1.el5.centos.noarch.rpm
heartbeat-stonith-2.1.4-11.el5.x86_64.rpm     openhpi-libs-2.14.0-5.el5.x86_64.rpm    perl-Net-Daemon-0.39-1.el5.rf.noarch.rpm
[root@db2 heartbeat2_x86_64]# rpm -ivh *.rpm


配置heartbeat
[root@db2 etc]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
172.16.1.75     db2.test.com      db2
172.16.1.74     db1.test.com      db1
172.16.1.78     demo.test.com     demo
172.16.1.73     db.test.com       db         #后面要专门设置这样一个虚拟IP地址


[root@db1 ~]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1                localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
172.16.1.74     db1.test.com      db1
172.16.1.75     db2.test.com      db2
172.16.1.78     demo.test.com     demo
172.16.1.73     db.test.com       db         #后面要专门设置这样一个虚拟IP地址


复制一些文件
[root@db2 doc]# cd /usr/share/doc/heartbeat-2.1.4/
[root@db2 heartbeat-3.0.3]# ls
apphbd.cf  authkeys  AUTHORS  ChangeLog  COPYING  COPYING.LGPL  ha.cf  haresources  README
[root@db2 heartbeat-3.0.3]# cp ha.cf /etc/ha.d/
[root@db2 heartbeat-3.0.3]# cp haresources /etc/ha.d/
[root@db2 heartbeat-3.0.3]# cp authkeys /etc/ha.d/


[root@db1 heartbeat-3.0.3]# ls
apphbd.cf  authkeys  AUTHORS  ChangeLog  COPYING  COPYING.LGPL  ha.cf  haresources  README
[root@db1 heartbeat-3.0.3]# cp ha.cf /etc/ha.d/
[root@db1 heartbeat-3.0.3]# cp haresources /etc/ha.d/
[root@db1 heartbeat-3.0.3]# cp authkeys /etc/ha.d/


vi /etc/ha.d/ha.cf添加如下内容
debugfile /var/log/ha.log
logfile /var/log/ha.log
logfacility     local0
bcast  eth0            # Linux
mcast eth0 225.0.0.1 694 1 0
auto_failback off
node   db1
node   db2
respawn hacluster /usr/lib/heartbeat/ipfail  #64位系统此处应该是/usr/lib64/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
keepalive 1
deadtime  10
#warntime 10
initdead 120
#udpport 694
#baud 19200


64位系统需要执行下面的操作:
[root@db2 lib]# mkdir heartbeat
[root@db2 lib]# ln -s /usr/lib64/heartbeat/ipfail /usr/lib/heartbeat/ipfail


vi /etc/ha.d/haresources 设置虚拟IP地址
db1.test.com ldirectord::ldirectord.cf LVSSyncDaemonSwap::master IPaddr::172.16.1.73/24/eth0/172.16.1.255


vi /etc/ha.d/authkeys
auth 3
3 md5 Hello!
[root@db1 ha.d]# chmod 600 authkeys
[root@db2 ha.d]# chmod 600 authkeys
上述三个文件的配置db1、db2节点要相同


配置ldirectord
[root@db1 ha.d]# find / -name ldirectord.cf
/usr/share/doc/ldirectord-1.0.4/ldirectord.cf
[root@db1 ha.d]# cd /usr/share/doc/ldirectord-1.0.4/
[root@db1 ldirectord-1.0.4]# cp ldirectord.cf /etc/ha.d/
[root@db1 ldirectord-1.0.4]# vi /etc/ha.d/ldirectord.cf
配置如下的内容:


# Global Directives
checktimeout=10
checkinterval=2
#fallback=127.0.0.1:80
#fallback6=[::1]:80
autoreload=yes
logfile="/var/log/ldirectord.log"
#logfile="local0"
#emailalert="admin@x.y.z"
#emailalertfreq=3600
#emailalertstatus=all
quiescent=yes


# Sample for an http virtual service
virtual=172.16.1.73:3306
        real=172.16.1.74:3306 gate
        real=172.16.1.75:3306 gate
#       real=192.168..: gate
#       fallback=127.0.0.1:3306 gate
        service=mysql
        scheduler=wrr
        #persistent=600
        #netmask=255.255.255.255
        protocol=tcp
        checktype=negotiate
#       checkport=80
        login="ldirectord"
        passwd="ldirectord"
        database="ldirectordb"
        request="select * from ldirectordtest"
        receive="ldirectord is OK"
#       virtualhost=www.x.y.z


阻止ldirectord服务开机启动,推迟heartbeat启动的操作怎么弄呢?让hearbeat来控制ldirectord
/sbin/chkconfig --level 2345 heartbeat on
/sbin/chkconfig --del ldirectord


还有怎样才能修改heartbeat启动时候的优先级别呢?
目前的方法是/etc/rc.d/init.d/目录下的heatbeat的脚本的注释
# chkconfig: - 75 05
# description: Startup script high-availability services.
# processname: heartbeat
# pidfile: /var/run/heartbeat.pid
# config: /etc/ha.d/ha.cf


创建数据库和测试表
mysql>GRANT ALL ON ldirectordb.* TO 'ldirectord'@'%' IDENTIFIED BY 'ldirectord';
mysql>GRANT ALL ON ldirectordb.* TO 'ldirectord'@'localhost' IDENTIFIED BY 'ldirectord';
mysql>GRANT ALL ON ldirectordb.* TO 'ldirectord'@'db1' IDENTIFIED BY 'ldirectord';
mysql>GRANT ALL ON ldirectordb.* TO 'ldirectord'@'db2' IDENTIFIED BY 'ldirectord';
mysql>FLUSH PRIVILEGES;
mysql>CREATE DATABASE ldirectordb; 
mysql>USE ldirectordb;
mysql>CREATE TABLE ldirectordtest (Status INT) ; 
mysql>INSERT INTO ldirectordtest() VALUES (1); 
mysql>quit


先设置IP路由
检查是否已经安装了iproute包
[root@db1 mysql]# rpm -qa |grep iproute
iproute-2.6.18-11.el5
修改/etc/sysctl.conf文件
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.eth0.arp_announce = 2
sysctl -p


配置虚拟IP地址,虚拟IP应该使用lo:0来配置,不应该使用eth0:0来配置,使用eth0:0来配置会导致虚拟IP地址在局域网内冲突,使用lo:0
不会有这样情况发生。
/etc/sysconfig/network-scripts/
db1服务器lo:0的设置(虚拟IP地址)
DEVICE=lo:0
IPADDR=172.16.1.73
NETMASK=255.255.255.255
ONBOOT=yes          
db2上设置lo:0的IP地址(虚拟IP地址)
DEVICE=lo:0
IPADDR=172.16.1.73
NETMASK=255.255.255.255
ONBOOT=yes




[root@db1.test.com network-scripts]# ifup ifcfg-lo:0
db2.test.com上执行同样的配置和操作即可。 




高可用性和负载均衡测试
[root@db2 init.d]# ldirectord stop
[root@db2 init.d]# heartbeat start


执行上面命令若没有报错,就表示一切正常
重启两台服务器db1和db2


ldirectord状态检查
[root@db1 ~]# ldirectord ldirectord.cf status
ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 6203
[root@db2 ~]# ldirectord ldirectord.cf status
ldirectord is stopped for /etc/ha.d/ldirectord.cf
heartbeat状态检查
[root@db1 log]# service heartbeat status
heartbeat OK [pid 5750 et al] is running on db1.test.com [db1.test.com]...
[root@db2 log]# service heartbeat status
heartbeat OK [pid 5796 et al] is running on db2.test.com [db2.test.com]...


----错误问题解决
[root@db1 init.d]# ./ldirectord status
ldirectord stale pid file /var/run/ldirectord.ldirectord.pid for /etc/ha.d/ldirectord.cf
ldirectord is stopped for /etc/ha.d/ldirectord.cf
此问题已经解决了,是因为缺少包造成的。缺少perl-DBI-mysql、perl-DBD-mysql




虚拟IP状态检查
[root@db2 log]# ip addr sh eth0
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 90:e2:ba:01:fe:24 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.75/24 brd 172.16.1.255 scope global eth0
    inet 172.16.1.73/24 brd 172.16.1.255 scope global secondary eth0:0
    inet6 fe80::92e2:baff:fe01:fe24/64 scope link 
       valid_lft forever preferred_lft forever
[root@db1 log]# ip addr sh eth0
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 90:e2:ba:01:fa:fa brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.74/24 brd 172.16.1.255 scope global eth0
    inet6 fe80::92e2:baff:fe01:fafa/64 scope link 
       valid_lft forever preferred_lft forever 
       
IPVS状态检查
[root@db2 log]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.16.1.73:3306 wrr
  -> 172.16.1.74:3306             Route   1      0          0         
  -> 172.16.1.75:3306             Local   0      0          0         


[root@db1 log]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn


db1上执行./LVSSyncDaemonSwap master status
[root@db1 resource.d]# ./LVSSyncDaemonSwap master status
master running
(ipvs_syncmaster pid: 4043)


db1上执行./LVSSyncDaemonSwap master status
[root@db2 ha.d]# resource.d/LVSSyncDaemonSwap master status
master stopped




[root@db2 log]# mysql -u ldirectord -p -h 172.16.1.73
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 492
Server version: 5.5.31-ndb-7.2.13-cluster-gpl MySQL Cluster Community Server (GPL) 


Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema | 
| ldirectordb     | 
| test               | 
+--------------------+
3 rows in set (0.00 sec)


mysql> use ldirectordb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A


Database changed
mysql> select * from ldirectordtest;
+--------+
| Status |
+--------+
|      1 | 
+--------+
1 row in set (0.00 sec)


mysql> 


[root@db1 log]# mysql -u ldirectord -p -h 172.16.1.73
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 492
Server version: 5.5.31-ndb-7.2.13-cluster-gpl MySQL Cluster Community Server (GPL)


Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema | 
| ldirectordb     | 
| test               | 
+--------------------+
3 rows in set (0.00 sec)


mysql> use ldirectordb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A


Database changed
mysql> select * from ldirectordtest;
+--------+
| Status |
+--------+
|      1 | 
+--------+
1 row in set (0.00 sec)


参考文档:
http://wenku.baidu.com/view/0054090e7cd184254b353546.html
http://blog.csdn.net/sdd00001/article/details/21464713
http://www.linuxfly.org/post/531/
http://blog.csdn.net/tongle_deng/article/details/7297597

 

转:http://blog.csdn.net/sdd00001/article/details/30460893