
A Detailed Deployment Tutorial for MHA, a High-Availability Failover Solution for MySQL

程序员文章站 2024-02-22 13:25:34

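MHA overview
MHA is a MySQL failover solution written in Perl by a Japanese MySQL expert to keep database systems highly available. When a master goes down, it completes failover within the downtime window, usually 10 to 30 seconds. Deploying MHA avoids master-slave consistency problems, saves the cost of buying additional servers, does not affect server performance, is easy to install, and leaves the existing deployment unchanged.

It also supports online switching from the currently running master to a new master. The switch takes only a short time (0.5 to 2 seconds), during which only writes are blocked and reads are unaffected, which makes host hardware maintenance easier.

For systems that require high availability and data consistency, MHA provides useful features and meets maintenance needs with almost no interruption.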
 
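Advantages:

1. Automated master monitoring and failover

In an existing master-slave replication environment, MHA can monitor the master for failure and fail over automatically.

Even if some slaves have not received the latest relay log events, MHA automatically identifies the differential relay log events from the most up-to-date slave and applies them to the other slaves, so all slaves end up consistent. MHA fails over within seconds: 9 to 12 seconds to detect the master failure, optionally 7 seconds to power off the master host to avoid split brain, then the time needed to apply the differential relay logs and register the slaves with the new master, for a total downtime of typically 10 to 30 seconds. In addition, the configuration file can give one slave priority to become the new master. Because MHA repairs consistency among the slaves, the DBA does not have to deal with consistency problems.

After the new master has been promoted, the other slaves are recovered in parallel. Even with thousands of slaves, the time to recover the master is not affected, and the slaves are recovered quickly as well.

DeNA uses MHA in 150+ master-slave environments. When one of the masters crashed, MHA completed failover in 4 seconds, which an active/passive cluster solution cannot achieve.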
 
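2. Interactive (manual) master failover

MHA can be used for failover only, without monitoring the master. In that case MHA is run interactively to perform the failover.

3. Non-interactive failover

Non-interactive failover is also available (the master is not monitored, but the failover itself runs automatically). This is especially useful when you already run other software that monitors the master. For example, Pacemaker (Heartbeat) can detect master failure and take over the VIP, while MHA handles failover and slave promotion.

4. Online master switch to a different host

In many cases the master has to be moved to another host (for example to replace a RAID controller or upgrade the master's hardware). This is not a master crash, but the planned maintenance still has to be done. Planned maintenance causes downtime, so service must be restored as quickly as possible. That requires a fast master switch and a graceful block on writes, and MHA provides both: a graceful master switch blocks writes for only 0.5 to 2 seconds. In many cases 0.5 to 2 seconds of downtime is acceptable even outside a planned maintenance window. This means a DBA can act easily when moving to a faster machine or upgrading to a newer version.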
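Items 3 and 4 are both driven by the `masterha_master_switch` tool that ships with MHA Manager. The following is a minimal sketch, not a definitive procedure: the wrapper name `build_switch_cmd` is hypothetical, and it only prints the command line so it can be reviewed before being run or wired into an external monitor.

```shell
#!/bin/sh
# Hypothetical wrapper: print (do not run) a masterha_master_switch command line.
#   $1 = dead   non-interactive failover away from a crashed master ($3)
#   $1 = alive  planned online switch to a new master ($3)
#   $2 = path to the MHA application config file
build_switch_cmd() {
  state="$1"; conf="$2"; host="$3"
  case "$state" in
    dead)
      echo "masterha_master_switch --master_state=dead --conf=$conf --dead_master_host=$host --interactive=0"
      ;;
    alive)
      echo "masterha_master_switch --master_state=alive --conf=$conf --new_master_host=$host --orig_master_is_new_slave"
      ;;
    *)
      echo "usage: build_switch_cmd dead|alive CONF HOST" >&2
      return 1
      ;;
  esac
}

# Example (prints the command for a planned switch to 192.168.216.52):
# build_switch_cmd alive /etc/masterha/app1.cnf 192.168.216.52
```

Printing instead of executing is a deliberate choice here: a master switch is disruptive, so the command is best inspected first.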
 
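5. A master crash does not cause master-slave data inconsistency

When the master crashes, MHA automatically identifies the differences in relay log events among the slaves and applies the missing events to each slave, so that all slaves end up in sync. Combined with semi-synchronous replication, almost no data is lost.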
 
 
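6. Deploying MHA does not affect the current environment

One of MHA's most important design principles is to be as easy to use as possible. It works with master-slave environments on MySQL 5.0 and later. Other HA solutions require changes to the MySQL deployment settings. MHA does not force DBAs to make such changes, and it works in both asynchronous and semi-synchronous replication environments. Starting, stopping, upgrading, downgrading, installing, or uninstalling MHA requires no change to MySQL replication (such as starting or stopping it).

When you need to upgrade MHA to a new version, you do not need to stop MySQL. Just update MHA and restart MHA Manager.

MHA supports MySQL 5.0/5.1/5.5 (it should also support 5.6; when the original document was translated, the MHA developer had not yet updated it for 5.6). Some HA solutions require specific MySQL versions (such as MySQL Cluster or MySQL with global transaction IDs), and you may not want to migrate applications just for master HA. In many cases a company has already deployed many legacy MySQL applications, and developers or DBAs do not want to spend much time migrating to a different storage engine or to newer bleeding-edge distributions.

7. No increase in server costs

MHA consists of MHA Manager and MHA Node. MHA Node runs on every MySQL server. Manager can be deployed on a separate machine and can monitor 100+ masters, so the total number of servers does not increase much. Note that Manager can also run on one of the slaves.

8. No performance impact

When monitoring the master, MHA only sends a ping packet every few seconds (3 seconds by default) and does not send heavy queries. Replication performance is not affected.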
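The probe interval corresponds to the `ping_interval` parameter in the MHA configuration file. As a small sketch (the helper name `get_ping_interval` is hypothetical, and the fallback of 3 simply mirrors the default mentioned above), the effective value for a given config file could be checked like this:

```shell
#!/bin/sh
# Hypothetical helper: print ping_interval from an MHA config file,
# falling back to 3 seconds when the parameter is not set.
get_ping_interval() {
  value=$(sed -n 's/^[[:space:]]*ping_interval[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1)
  echo "${value:-3}"
}

# Example:
# get_ping_interval /etc/masterha/app1.cnf
```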
 
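9. Works with any storage engine

MHA is not limited to the transaction-safe InnoDB engine. It works with any engine that works with MySQL replication. Even a legacy environment that uses the MyISAM engine can use MHA without migrating.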


Installation and configuration
1. Example environment

  •          CentOS 6.4 x64
  •          mysql-5.6.16-linux-glibc2.5-x86_64.tar.gz
  •          mha4mysql-manager-0.54.tar.gz
  •          mha4mysql-node-0.54.tar.gz

2. Servers

  •          manager             192.168.216.50
  •          master              192.168.216.51
  •          slave               192.168.216.52

         MySQL is installed on all three servers, and manager acts as the management node.

3. Configure SSH trust (passwordless login between all hosts)

         manager:

          ssh-keygen -t rsa
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.50
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.51
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.52
          ssh 192.168.216.50 date
          ssh 192.168.216.51 date
          ssh 192.168.216.52 date

         master:

          ssh-keygen -t rsa
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.50
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.51
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.52
          ssh 192.168.216.50 date
          ssh 192.168.216.51 date
          ssh 192.168.216.52 date

         slave:

          ssh-keygen -t rsa
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.50
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.51
          ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.216.52
          ssh 192.168.216.50 date
          ssh 192.168.216.51 date
          ssh 192.168.216.52 date
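Since the same key distribution and check are repeated on every host, they can be generated by a loop. This is only a sketch: the function name `print_ssh_setup` and the `HOSTS` variable are illustrative, and the function prints the commands rather than running them. Run `ssh-keygen -t rsa` once on the host first.

```shell
#!/bin/sh
# The three hosts of the example environment.
HOSTS="192.168.216.50 192.168.216.51 192.168.216.52"

# Print the key distribution commands, then the login checks, for every host.
print_ssh_setup() {
  for host in $HOSTS; do
    echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$host"
  done
  for host in $HOSTS; do
    echo "ssh $host date"
  done
}

# Example: review the output, then execute it on each of the three servers:
# print_ssh_setup | sh
```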

4. Install MySQL

     groupadd mysql

     useradd -g mysql -s /sbin/nologin -M mysql

     mkdir -pv /data/mysql

     tar -zxvf mysql-5.6.16-linux-glibc2.5-x86_64.tar.gz
     mv mysql-5.6.16-linux-glibc2.5-x86_64 /usr/local/mysql

     cd /usr/local/mysql/scripts

     ./mysql_install_db --user=mysql --basedir=/usr/local/mysql --datadir=/data/mysql

         Create the MySQL configuration file:

     vim /usr/local/mysql/my.cnf

 

    [mysql]

     # client #

     port              = 3306

     socket             = /tmp/mysql.sock

 

     [mysqld]

     # general #

     user              = mysql

     default-storage-engine     = innodb

     socket             = /tmp/mysql.sock

     pid-file            = /data/mysql/mysql.pid

 

     # myisam #

     key-buffer-size        = 32m

     myisam-recover         = force,backup

 

     # safety #

     max-allowed-packet       = 16m

     max-connect-errors       = 1000000

    

     # data storage #

     datadir            = /data/mysql/

 

     # binary logging #

     server_id                        = 1      # must be different on every server

     log-bin            = /data/mysql/mysql-bin

     expire-logs-days        = 14

     sync-binlog          = 1

 

     # caches and limits #

     tmp-table-size         = 32m

     max-heap-table-size      = 32m

     query-cache-type        = 0

     query-cache-size        = 0

     max-connections        = 500

     thread-cache-size       = 50

     open-files-limit        = 65535

     table-definition-cache     = 1024

     table-open-cache        = 2048

 

     # innodb #

     innodb-flush-method      = O_DIRECT

     innodb-log-files-in-group   = 2

     innodb-log-file-size      = 64m

     innodb-flush-log-at-trx-commit = 1

     innodb-file-per-table     = 1

     innodb-buffer-pool-size    = 592m

 

     # logging #

     log-error           = /data/mysql/mysql-error.log

     log-queries-not-using-indexes = 1

     slow-query-log         = 1

     slow-query-log-file      = /data/mysql/mysql-slow.log
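Because `server_id` has to be unique on every server, one common convention is to derive it from the last octet of the host's IP address. That convention is an assumption here, not something this setup requires, and the helper name `server_id_from_ip` is hypothetical. A minimal sketch:

```shell
#!/bin/sh
# Hypothetical helper: use the last octet of an IPv4 address as server_id.
# ${1##*.} strips everything up to and including the last dot.
server_id_from_ip() {
  echo "${1##*.}"
}

# Example: print the line to put into my.cnf on the master.
# echo "server_id = $(server_id_from_ip 192.168.216.51)"
```

This only yields unique IDs while all servers share one /24 network, as they do in the example environment.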

5. Configure privileges

         manager:

          grant all privileges on *.* to root@'127.0.0.1' identified by 'root';

          grant all privileges on *.* to root@'localhost' identified by 'root';

          grant all privileges on *.* to root@'192.168.216.50' identified by 'root';

          grant replication slave on *.* to slave@'192.168.216.50' identified by 'slave';

          grant replication slave on *.* to slave@'192.168.216.51' identified by 'slave';

          grant replication slave on *.* to slave@'192.168.216.52' identified by 'slave';

         master:

          grant all privileges on *.* to root@'127.0.0.1' identified by 'root';

          grant all privileges on *.* to root@'localhost' identified by 'root';

          grant all privileges on *.* to root@'192.168.216.50' identified by 'root';

          grant replication slave on *.* to slave@'192.168.216.50' identified by 'slave';

          grant replication slave on *.* to slave@'192.168.216.51' identified by 'slave';

          grant replication slave on *.* to slave@'192.168.216.52' identified by 'slave';

         slave:

          grant all privileges on *.* to root@'127.0.0.1' identified by 'root';

          grant all privileges on *.* to root@'localhost' identified by 'root';

          grant all privileges on *.* to root@'192.168.216.50' identified by 'root';

          grant all privileges on *.* to root@'192.168.216.51' identified by 'root';

          grant replication slave on *.* to slave@'192.168.216.50' identified by 'slave';

          grant replication slave on *.* to slave@'192.168.216.51' identified by 'slave';

          grant replication slave on *.* to slave@'192.168.216.52' identified by 'slave';
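The replication grants are identical on all three servers and differ only in the client host, so they can be generated rather than typed. A sketch under that assumption (the function name `print_repl_grants` is illustrative, and the account name and password are the ones used above):

```shell
#!/bin/sh
# Print one "grant replication slave" statement per host given as argument.
print_repl_grants() {
  for host in "$@"; do
    echo "grant replication slave on *.* to slave@'$host' identified by 'slave';"
  done
}

# Example: review the statements, then pipe them into the mysql client:
# print_repl_grants 192.168.216.50 192.168.216.51 192.168.216.52 | mysql -uroot -p
```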

6. Install dependency packages

         The management server (manager) needs all of the following installed:

perl-Config-Tiny

perl-Params-Validate

perl-Parallel-ForkManager

perl-Log-Dispatch

|- perl-MIME-Lite-3.027-2.el6.noarch.rpm

     |- perl-MIME-Types-1.28-2.el6.noarch.rpm

     |- perl-Email-Date-Format-1.002-5.el6.noarch.rpm

     |- perl-MailTools-2.04-4.el6.noarch.rpm

          |- perl-TimeDate-1.16-11.1.el6.noarch.rpm

          |- perl-Data-ShowTable-3.3-3.4.noarch.rpm

|- perl-Mail-Sender-0.8.22-21.1.noarch.rpm

     |- perl-IO-Socket-SSL-1.31-2.el6.noarch.rpm

          |- perl-Net-LibIDN-0.12-3.el6.x86_64.rpm

          |- perl-Net-SSLeay-1.35-9.el6.x86_64.rpm

     |- perl-Win32API-Registry

|- perl-Mail-Sendmail-0.79_16-4.2.noarch.rpm

     rpm -ivh ncftp-debuginfo-3.2.3-1.3.x86_64.rpm
     rpm -ivh perl-Parallel-ForkManager-0.7.5-2.2.el6.rf.noarch.rpm
     rpm -ivh perl-Params-Validate-0.91-2.4.x86_64.rpm
     rpm -ivh perl-Config-Tiny-2.12-7.1.el6.noarch.rpm
     rpm -ivh perl-MIME-Types-1.28-2.el6.noarch.rpm
     rpm -ivh perl-Email-Date-Format-1.002-5.el6.noarch.rpm
     rpm -ivh perl-TimeDate-1.16-11.1.el6.noarch.rpm
     rpm -ivh perl-Data-ShowTable-3.3-3.4.noarch.rpm
     rpm -ivh perl-MailTools-2.04-4.el6.noarch.rpm
     rpm -ivh perl-MIME-Lite-3.027-2.el6.noarch.rpm
     rpm -ivh perl-Net-LibIDN-0.12-3.el6.x86_64.rpm
     rpm -ivh perl-Net-SSLeay-1.35-9.el6.x86_64.rpm
     rpm -ivh perl-IO-Socket-SSL-1.31-2.el6.noarch.rpm
     rpm -ivh perl-Mail-Sendmail-0.79_16-4.2.noarch.rpm
     rpm -ivh perl-Mail-Sender-0.8.22-21.1.noarch.rpm
     rpm -ivh perl-Log-Dispatch-2.22-7.3.noarch.rpm

         If the installed perl-DBD-MySQL-4.013-3.el6.x86_64.rpm causes errors during the replication check, build and install DBD::mysql from source instead:

     tar -zxvf DBD-mysql-4.027.tar.gz
     cd DBD-mysql-4.027
     perl Makefile.PL
     make && make install

         On the node servers, install:

     rpm -ivh ncftp-debuginfo-3.2.3-1.3.x86_64.rpm
     rpm -ivh perl-DBD-MySQL-4.013-3.el6.x86_64.rpm

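7. Create symbolic links

     ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog

     ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql

         Add the MySQL library directory to the dynamic linker search path. Put the following line into the file, then run ldconfig:

     vim /etc/ld.so.conf.d/mysql-x86_64.conf

       /usr/local/mysql/lib

     ldconfig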

8. Install mha4mysql-node-0.54.tar.gz on all servers

     tar -zxvf mha4mysql-node-0.54.tar.gz

     cd mha4mysql-node-0.54

     perl Makefile.PL

     make && make install

9. Install mha4mysql-manager-0.54.tar.gz on the management server

     tar -zxvf mha4mysql-manager-0.54.tar.gz

     cd mha4mysql-manager-0.54

     perl Makefile.PL

     make && make install

     mkdir -pv /etc/masterha

     mkdir -pv /masterha/app1

     cp samples/conf/* /etc/masterha

     cp samples/scripts/* /usr/local/bin

         Create the application configuration file:

     vim /etc/masterha/app1.cnf

     [server default]

     manager_workdir=/masterha/app1

     manager_log=/masterha/app1/manager.log

 

     user=root

     password=root

 

     ssh_user=root

     repl_user=slave

     repl_password=slave

     shutdown_script=""

     #master_ip_failover_script="/usr/local/bin/masterha_ip_failover"

     master_ip_online_change_script="/usr/local/bin/masterha_ip_failover"

     report_script=""

 

     [server1]

     hostname=192.168.216.50

     master_binlog_dir="/data/mysql/"

     candidate_master=1

 

     [server2]

     hostname=192.168.216.51

     master_binlog_dir="/data/mysql/"

     candidate_master=1

 

     [server3]

     hostname=192.168.216.52

     master_binlog_dir="/data/mysql/"

     candidate_master=1

10. Test the SSH connections

     masterha_check_ssh --conf=/etc/masterha/app1.cnf

11. Test replication

     masterha_check_repl --conf=/etc/masterha/app1.cnf

12. Start the manager process

     masterha_manager --conf=/etc/masterha/app1.cnf
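Steps 10 to 12 only make sense in order: the manager should not be started if either check fails. A minimal sketch of that gating follows. It assumes each MHA tool returns a non-zero exit status on failure, and the helper name `run_in_order` is hypothetical.

```shell
#!/bin/sh
# Run each command line in order and stop at the first one that fails.
run_in_order() {
  for step in "$@"; do
    echo "running: $step"
    sh -c "$step" || { echo "failed: $step" >&2; return 1; }
  done
}

# Example (commented out so that the sketch itself has no side effects):
# run_in_order \
#   "masterha_check_ssh --conf=/etc/masterha/app1.cnf" \
#   "masterha_check_repl --conf=/etc/masterha/app1.cnf" \
#   "masterha_manager --conf=/etc/masterha/app1.cnf"
```

Note that `masterha_manager` stays in the foreground while it monitors the master, so the last step does not return until the manager exits.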

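13. Test failover

       Stop MySQL on the master, then check the slave's status to see whether its replication source IP has switched to the new master.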
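One way to make this check concrete is to read the `Master_Host` field from the output of `show slave status\G` on a slave. The helper below is a sketch with a hypothetical name (`master_host_from_status`). It only parses text from standard input, so it can be tried without a running server.

```shell
#!/bin/sh
# Read "show slave status\G" output on stdin and print the Master_Host value.
master_host_from_status() {
  awk -F': ' '/^ *Master_Host:/ { print $2; exit }'
}

# Example, run on a slave after the failover (credentials as configured above):
# mysql -uroot -proot -e 'show slave status\G' | master_host_from_status
```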

14. Configure the failover IP

     vim /etc/masterha/app1.cnf

          master_ip_failover_script="/usr/local/bin/masterha_ip_failover"

          master_ip_online_change_script="/usr/local/bin/masterha_ip_failover"

         Edit the failover script and set the VIP to 192.168.216.100:

     vim /usr/local/bin/masterha_ip_failover

#!/usr/bin/env perl

use strict;

use warnings FATAL => 'all';

 

use Getopt::Long;

 

my (

  $command,     $ssh_user,    $orig_master_host, $orig_master_ip,

  $orig_master_port, $new_master_host, $new_master_ip,  $new_master_port

);

 

# my $vip = '172.16.21.119/24'; # virtual ip

my $vip = '192.168.216.100/24'; # virtual ip

my $key = "1";

my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";

my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";

 

GetOptions(

  'command=s'     => \$command,

  'ssh_user=s'     => \$ssh_user,

  'orig_master_host=s' => \$orig_master_host,

  'orig_master_ip=s'  => \$orig_master_ip,

  'orig_master_port=i' => \$orig_master_port,

  'new_master_host=s' => \$new_master_host,

  'new_master_ip=s'  => \$new_master_ip,

  'new_master_port=i' => \$new_master_port,

);

 

exit &main();

 

sub main {

 

  print "\n\nin script test====$ssh_stop_vip==$ssh_start_vip===\n\n";

 

  if ( $command eq "stop" || $command eq "stopssh" ) {

 

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.

    # if you manage master ip address at global catalog database,

    # invalidate orig_master_ip here.

    my $exit_code = 1;

    eval {

      print "disabling the vip on old master: $orig_master_host \n";

      &stop_vip();

      $exit_code = 0;

    };

    if ($@) {

      warn "got error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "start" ) {

 

    # all arguments are passed.

    # if you manage master ip address at global catalog database,

    # activate new_master_ip here.

    # you can also grant write access (create user, set read_only=0, etc) here.

    my $exit_code = 10;

    eval {

      print "enabling the vip - $vip on the new master - $new_master_host \n";

      &start_vip();

      $exit_code = 0;

    };

    if ($@) {

      warn $@;

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "status" ) {

    print "checking the status of the script.. ok \n";

    `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;

    exit 0;

  }

  else {

    &usage();

    exit 1;

  }

}

 

# a simple system call that enable the vip on the new master

sub start_vip() {

  `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

# a simple system call that disable the vip on the old_master

sub stop_vip() {

  `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

 

sub usage {

  print

  "usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

 

Test the script:

/usr/local/bin/masterha_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.216.51 --orig_master_ip=192.168.216.51 --orig_master_port=3306

The virtual IP has to be brought up manually the first time. After a master failure it is moved automatically.

/usr/local/bin/masterha_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.216.51 --orig_master_ip=192.168.216.51 --orig_master_port=3306 --new_master_host=192.168.216.51

Test IP failover:

Stop MySQL on the master and check whether the VIP has moved to the new master.
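To check where the VIP currently lives, the address list of each host can be searched for it. The sketch below assumes the VIP appears in `ip addr` output as an `inet` entry. The helper name `has_vip` is hypothetical, and it reads the listing from standard input.

```shell
#!/bin/sh
# Succeed if the address listing on stdin contains the given IPv4 address.
# grep -F treats the dots in the address literally.
has_vip() {
  grep -qF "inet $1/"
}

# Example, run from the manager against the expected new master:
# ssh root@192.168.216.52 /sbin/ip addr show eth0 | has_vip 192.168.216.100 && echo "VIP is on 192.168.216.52"
```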