Redis High Availability and Clustering: A Practical Walkthrough

程序员文章站 2022-06-22 23:39:57


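Solving the single-point-of-failure problem
Method 1: Configure Redis master-slave replication
Method 2: Redis clustering
1. Sentinel
2. Redis Cluster
3. Redis Cluster node maintenance
(1) Adding a node dynamically
(2) Removing a node dynamically
(3) Simulating a master failure
(4) Importing existing Redis data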

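Method 1: Configure Redis master-slave replication

Master-slave (primary/backup) mode backs up Redis data across hosts.
The application connects to the VIP of a high-availability load balancer, which forwards the connection to the Redis back-end real servers configured on it, so the application does not need to know the real IP addresses of the Redis servers.

Main slave configuration

A Redis slave should also enable persistence and use the same connection password as the master, because it may later be promoted to master. Note that when a slave starts syncing from a new master, it discards all the data it held before.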

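#Turn the current master into a slave and point it at the master's IP, port and password
192.168.7.104:6379> REPLICAOF 192.168.7.103 6379 #make 104 a slave of 103
OK
192.168.7.104:6379> CONFIG SET masterauth 123456 #password used to authenticate with the master
OK
192.168.7.104:6379> INFO Replication #synchronization starts automatically
#To persist the setting, add "replicaof 192.168.7.103 6379" and "masterauth 123456" to redis.conf on 192.168.7.104
#Then restart the Redis service on 192.168.7.104 (for example, systemctl restart redis)
192.168.7.104:6379> INFO Replication #after the restart, master_link_status must be up
127.0.0.1:6379> KEYS * #verify the data on the slave
192.168.7.101:6379> SLAVEOF no one #stop replicating from the master
OK
192.168.7.101:6379> set key1 value1 #test whether the node now accepts writes
OK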
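
The INFO Replication checks above can also be automated. The following is a minimal sketch, assuming the INFO output has already been fetched as text; the function names and the sample text are illustrative and not part of the original setup.

```python
def parse_info(text):
    """Parse the 'key:value' lines of a Redis INFO section into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replica_is_synced(info_text):
    """Return True when the node is a slave and its link to the master is up."""
    info = parse_info(info_text)
    return info.get("role") == "slave" and info.get("master_link_status") == "up"


# Hand-written sample in the shape of INFO Replication output (assumption).
sample = """# Replication
role:slave
master_host:192.168.7.103
master_port:6379
master_link_status:up
"""
print(replica_is_synced(sample))
```

The same parser works for any INFO section, because every section uses the key:value layout.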

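Method 2: Redis clustering

Clustering improves on plain master-slave replication in two key ways:
① Master and slave roles switch over seamlessly, so the business is not affected.
② Redis servers can be scaled out dynamically, so several servers accept writes in parallel and overall concurrency increases.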

1. Sentinel

Sentinel provides seamless switching between the master and slave roles. Sentinel mode has been stable since Redis 2.8.
Environment: three servers
192.168.7.101:6379: master
192.168.7.102:6379: server A, slave1
192.168.7.103:6379: server B, slave2

#1. Configure master-slave replication manually:
#(1) Server A, configure slave1:
192.168.7.102:6379> REPLICAOF 192.168.7.101 6379
OK
192.168.7.102:6379> CONFIG SET masterauth "123456"
OK
192.168.7.102:6379> info Replication
#(2) Server B, configure slave2:
192.168.7.103:6379> REPLICAOF 192.168.7.101 6379
OK
192.168.7.103:6379> CONFIG SET masterauth "123456"
OK
192.168.7.103:6379> info Replication
#(3) Check the current master status
192.168.7.101:6379> info Replication
#(4) How does an application connect to Redis?
Jedis Sentinel: the connection pool (JedisSentinelPool) takes the sentinel addresses and the master name as parameters
#(5) Connecting to Redis from Python:
[root@redis-s3 ~]# yum install python-pip
[root@redis-s3 ~]# pip install redis
[root@redis-s3 ~]# cat test.py 
#!/usr/bin/env python
#Author: ZhangJie
import redis

#Connection pool pointing at the current master
pool = redis.ConnectionPool(host="192.168.7.101", port=6379, password="123456")
r = redis.Redis(connection_pool=pool)
#Write k0..k99 and read each key back
for i in range(100):
    r.set("k%d" % i, "v%d" % i)
    data = r.get("k%d" % i)
#Print the last value read back
print(data)
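
test.py connects to a fixed master IP, so it stops working once a failover moves the master role to another host. A hedged sketch of the Sentinel-aware alternative in redis-py follows; it assumes the three sentinels listen on port 26379 on the three servers, that the monitored master is named mymaster, and that the password is 123456. The helper name get_master_client is hypothetical.

```python
# Sentinel endpoints (assumption: one sentinel per server on port 26379).
SENTINELS = [
    ("192.168.7.101", 26379),
    ("192.168.7.102", 26379),
    ("192.168.7.103", 26379),
]


def get_master_client(master_name="mymaster", password="123456"):
    """Ask the sentinels for the current master and return a client for it."""
    # Imported lazily so this module loads even where redis-py is missing.
    from redis.sentinel import Sentinel

    sentinel = Sentinel(SENTINELS, socket_timeout=0.5)
    # The returned client resolves the master address through the sentinels,
    # so it keeps working after a failover promotes a different node.
    return sentinel.master_for(master_name, socket_timeout=0.5, password=password)


# Usage (needs a running Sentinel deployment):
#   r = get_master_client()
#   r.set("k1", "v1")
#   print(r.get("k1"))
```

The application is configured with the sentinel addresses and the master name only, which mirrors the Jedis Sentinel parameters mentioned in step (4).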

#2. Edit the configuration file sentinel.conf
#(1) Server1 configuration:
[root@redis-s1 etc]# grep "^[a-zA-Z]" /usr/local/redis/etc/sentinel.conf 
bind 0.0.0.0
port 26379
daemonize yes
pidfile "/usr/local/redis/redis-sentinel.pid"
logfile "/usr/local/redis/sentinel_26379.log"
dir "/usr/local/redis"
sentinel monitor mymaster 192.168.7.101 6379 2
sentinel auth-pass mymaster 123456
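sentinel down-after-milliseconds mymaster 30000 #time before the master is considered subjectively down (SDOWN)
sentinel parallel-syncs mymaster 1 #number of slaves that resync with the new master at the same time during a failover; the smaller the number, the longer the total sync takes
sentinel failover-timeout mymaster 180000 #timeout for repointing all slaves to the new master
sentinel deny-scripts-reconfig yes
#(2) Server2 configuration: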
bind 192.168.7.102
port 26379
daemonize yes
pidfile "/usr/local/redis/redis-sentinel.pid"
logfile "/usr/local/redis/sentinel_26379.log"
dir "/usr/local/redis"
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 192.168.7.101 6379 2
sentinel auth-pass mymaster 123456
#(3) Server3 configuration:
bind 192.168.7.103
port 26379
daemonize yes
pidfile "/usr/local/redis/redis-sentinel.pid"
logfile "/usr/local/redis/sentinel_26379.log"
dir "/usr/local/redis"
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 192.168.7.101 6379 2
sentinel auth-pass mymaster 123456
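
The trailing 2 in "sentinel monitor mymaster 192.168.7.101 6379 2" is the quorum: the number of sentinels that must agree before the master is marked objectively down. Starting the failover additionally needs the votes of a majority of all sentinels. A small sketch of that arithmetic, with hypothetical function names:

```python
def can_mark_odown(agreeing_sentinels, quorum):
    """The master is objectively down once `quorum` sentinels agree on it."""
    return agreeing_sentinels >= quorum


def can_authorize_failover(reachable_sentinels, total_sentinels, quorum):
    """A failover leader needs votes from a majority of all sentinels,
    and never fewer than the configured quorum."""
    majority = total_sentinels // 2 + 1
    return reachable_sentinels >= max(quorum, majority)


# Three sentinels with quorum 2, as configured above.
print(can_authorize_failover(2, 3, 2))  # one sentinel lost
print(can_authorize_failover(1, 3, 2))  # two sentinels lost
```

With three sentinels and a quorum of 2, the setup tolerates the loss of one sentinel; with only one sentinel left, no failover can be authorized.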

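#3. Start the sentinels
/usr/local/redis/bin/redis-sentinel /usr/local/redis/etc/sentinel.conf #start it on all three servers
ss -tnl #verify the listening port
tail -f /usr/local/redis/sentinel_26379.log #sentinel log

#4. Current status
192.168.7.101:6379> info Replication #current Redis replication status
[root@redis-s1 etc]# redis-cli -h 192.168.7.101 -p 26379
192.168.7.101:26379> info Sentinel #current sentinel status. Pay particular attention to the last line, which shows the master IP and the number of slaves and sentinels; these must match the actual number of servers.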
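
The last line of the INFO Sentinel output has the form master0:name=...,status=...,address=...,slaves=...,sentinels=.... A minimal sketch that parses it and compares the counts with the expected topology (one master, two slaves, three sentinels); the sample line is written by hand for this environment and the function names are illustrative.

```python
def parse_sentinel_master(line):
    """Split a 'masterN:key=value,...' line from INFO Sentinel into a dict."""
    _, _, body = line.strip().partition(":")
    fields = dict(item.split("=", 1) for item in body.split(","))
    for key in ("slaves", "sentinels"):
        fields[key] = int(fields[key])
    return fields


def topology_matches(line, expected_slaves, expected_sentinels):
    """True when the master is ok and the slave/sentinel counts are as expected."""
    master = parse_sentinel_master(line)
    return (
        master["status"] == "ok"
        and master["slaves"] == expected_slaves
        and master["sentinels"] == expected_sentinels
    )


# Hand-written sample line for this environment (assumption).
sample = "master0:name=mymaster,status=ok,address=192.168.7.101:6379,slaves=2,sentinels=3"
print(parse_sentinel_master(sample)["address"])
print(topology_matches(sample, expected_slaves=2, expected_sentinels=3))
```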

#5. Stop the Redis master to test failover:
[root@redis-s1 ~]# systemctl stop redis
[root@redis-s1 ~]# redis-cli -h 192.168.7.101 -p 6379 #check replication info
[root@redis-s1 ~]# redis-cli -h 192.168.7.101 -p 26379 #check sentinel info
[root@redis-s1 ~]# tail -f /usr/local/redis/sentinel_26379.log #sentinel messages during the failover
[root@redis-s1 ~]# grep "^[a-zA-Z]" /usr/local/redis/etc/sentinel.conf #configuration after the failover: the master IP on the replicaof line in redis.conf is rewritten, and so is the IP on the sentinel monitor line in sentinel.conf.
192.168.7.101:6379> info Replication #current Redis status

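2. Redis Cluster

Redis Cluster addresses the write bottleneck of a single Redis instance, whose write performance is limited by that machine's memory size, concurrency and NIC bandwidth. Redis 3.0 introduced Redis Cluster, which has a decentralized architecture with no central node.
Environment preparation:
Environment A: three servers, each running two Redis instances on ports 6379 and 6380.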
192.168.7.101:6379/6380
192.168.7.102:6379/6380
192.168.7.103:6379/6380
One more server is reserved for testing the addition of a node to the cluster.
192.168.7.104:6379/6380
Environment B: for production, six separate servers are recommended.
172.18.200.101
172.18.200.102
172.18.200.103
172.18.200.104
172.18.200.105
172.18.200.106
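Reserved server: 172.18.200.107
(1) Prerequisites for creating a Redis Cluster:
① Every Redis node uses the same hardware configuration and the same password
② Every node must enable the parameters listed below
③ None of the Redis servers may hold any data
④ Each node is first started as a standalone Redis with no keys or values

cluster-enabled yes #required to turn on cluster mode; once enabled, the redis process is shown with cluster in its name. Verify with: ps -ef|grep redis
cluster-config-file nodes-6380.conf #created and maintained automatically by Redis Cluster; do not edit it by hand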

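(2) Create the cluster: this requires the cluster management tool redis-trib.rb

#1. redis-trib.rb was written in Ruby by the author of Redis. The Ruby version installed by yum on CentOS is too old, as shown below: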
[root@s1 ~]# yum install ruby rubygems -y
[root@s1 ~]# find / -name redis-trib.rb 
/usr/local/src/redis-4.0.14/src/redis-trib.rb
[root@s1 ~]# cp /usr/local/src/redis-4.0.14/src/redis-trib.rb /usr/bin/
[root@s1 src]# gem install redis
Fetching: redis-4.1.2.gem (100%)
ERROR: Error installing redis:
redis requires Ruby version >= 2.3.0.
#Fix the outdated Ruby version:
[root@s1 src]# yum remove ruby rubygems -y
[root@s1 src]# wget https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.5.tar.gz
[root@s1 src]# tar xf ruby-2.5.5.tar.gz
[root@s1 src]# cd ruby-2.5.5
[root@s1 ruby-2.5.5]# ./configure
[root@s1 ruby-2.5.5]# make -j 2
[root@s1 ruby-2.5.5]# make install
[root@s1 ruby-2.5.5]# gem install redis #see https://rubygems.org/gems/redis; for an offline install: gem install -l redis-3.3.0.gem
#Verify that the redis-trib.rb command runs:
[root@s1 ruby-2.5.5]# redis-trib.rb 
Usage: redis-trib <command> <options> <arguments ...>
 create host1:port1 ... hostN:portN #create a cluster
 --replicas <arg> #number of replicas per master
 check host:port #check the cluster
 info host:port #show information about the cluster hosts
 fix host:port #repair the cluster
 --timeout <arg>
 reshard host:port #migrate slots of a given host online
 --from <arg>
 --to <arg>
 --slots <arg>
 --yes
 --timeout <arg>
 --pipeline <arg>
 rebalance host:port #balance the number of slots across the hosts
 --weight <arg>
 --auto-weights
 --use-empty-masters
 --timeout <arg>
 --simulate
 --pipeline <arg>
 --threshold <arg>
 add-node new_host:new_port existing_host:existing_port #add a host to the cluster
 --slave
 --master-id <arg>
 del-node host:port node_id #remove a host
 set-timeout host:port milliseconds #set the node timeout
 call host:port command arg arg .. arg #run a command on every node in the cluster
 import host:port #import data from an external Redis server into the cluster
 --from <arg>
 --copy
 --replace
 help (show this help)
[root@s1 ruby-2.5.5]# vim /usr/local/lib/ruby/gems/2.5.0/gems/redis-4.1.0/lib/redis/client.rb #set the password in this file to the Redis login password

#2. Redis 3/4:
[root@s1 ~]# redis-trib.rb create --replicas 1 172.18.200.101:6379 172.18.200.102:6379 172.18.200.103:6379 172.18.200.104:6379 172.18.200.105:6379 172.18.200.106:6379
#If leftovers from earlier operations make cluster creation fail, clear the data and reset the cluster state:
127.0.0.1:6379> FLUSHALL
OK
127.0.0.1:6379> cluster reset
OK

#3. Redis 5 (redis-trib.rb is replaced by redis-cli --cluster; pass the host:port of every node to --cluster create, with --cluster-replicas 1 for one slave per master):
[root@redis-s1 ~]# redis-cli -a 123456 --cluster create 192.168.7.101:6379 
M: f4cfc5cf821c0d855016488d6fbfb62c03a14fda 192.168.7.101:6379 #entries marked M are masters
 slots:[0-5460] (5461 slots) master #first and last slot owned by this master
S: 2b6e5d9c3944d79a5b64a19e54e52f83d48438d6 192.168.7.101:6380 #entries marked S are slaves
 replicates 70de3821dde4701c647bd6c23b9dd3c5c9f24a62
Can I set the above configuration? (type 'yes' to accept): yes #type yes to create the cluster
M: f4cfc5cf821c0d855016488d6fbfb62c03a14fda 192.168.7.101:6379 #master ID and address
 slots:[0-5460] (5461 slots) master #slots already assigned
 1 additional replica(s) #one slave assigned
S: 7186c6d03dd9a5e3be658f2d08e800bc55b04a09 192.168.7.102:6380
 slots: (0 slots) slave #slaves own no slots
[OK] All nodes agree about slots configuration. #all nodes agree on the slot assignment
>>> Check for open slots... #check for open slots
>>> Check slots coverage... #check slot coverage
[OK] All 16384 slots covered. #all 16384 slots are assigned
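
The slot ranges in the output (0-5460, 5461-10922, 10923-16383) come from dividing the 16384 slots as evenly as possible among the masters. The sketch below reproduces that split; it is an illustration that matches the ranges shown here, not the tool's own code, and split_slots is a hypothetical name.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(master_count):
    """Return a list of (first_slot, last_slot) ranges, one per master."""
    per_master = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(master_count):
        # Round the running boundary to the nearest slot (half rounds up).
        last = math.floor(cursor + per_master - 1 + 0.5)
        # The last master always takes everything that remains.
        if i == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


print(split_slots(3))  # [(0, 5460), (5461, 10922), (10923, 16383)]
```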

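(3) Check the status:
Because the masterauth password has not been set, the master-slave links are not established even though the cluster itself is already running. Set masterauth on every slave, either from the console with CONFIG SET or in each Redis configuration file; ideally set it from the console first and then also write it to the configuration file.

[root@redis-s1 ~]# redis-cli -h 192.168.7.101 -p 6380 -a 123456
master_link_status:down

(4) Set the masterauth password on each slave: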

[root@redis-s1 ~]# redis-cli -h 192.168.7.101 -p 6380 -a 123456
192.168.7.101:6380> CONFIG SET masterauth 123456
OK
[root@redis-s1 ~]# redis-cli -h 192.168.7.102 -p 6380 -a 123456
192.168.7.102:6380> CONFIG SET masterauth 123456
OK
[root@redis-s1 ~]# redis-cli -h 192.168.7.103 -p 6380 -a 123456
192.168.7.103:6380> CONFIG SET masterauth 123456
OK

(5) Confirm that the slave link status is up:

192.168.7.103:6380> info replication 
role:slave   
master_link_status:up

(6) Verify the master status:

[root@redis-s1 ~]# redis-cli -h 192.168.7.101 -p 6379 -a 123456
192.168.7.101:6379> INFO Replication
# Replication
role:master
connected_slaves:1
slave0:ip=192.168.7.102,port=6380,state=online,offset=840,lag=0
master_replid:0aa3281030eb29bf268f3317d4afe401f661a917
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:840
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:4026531840
repl_backlog_first_byte_offset:1
repl_backlog_histlen:840

(7) Verify the cluster status:

192.168.7.101:6379> CLUSTER INFO
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:1474
cluster_stats_messages_pong_sent:1507
cluster_stats_messages_sent:2981
cluster_stats_messages_ping_received:1502
cluster_stats_messages_pong_received:1474
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:2981

(8) View the mapping between cluster nodes:

192.168.7.103:6380> cluster nodes
7186c6d03dd9a5e3be658f2d08e800bc55b04a09 192.168.7.102:6380@16380 slave f4cfc5cf821c0d855016488d6fbfb62c03a14fda 0 1545659135000 4 connected
7eda0bcf0c01bb7343313b81267c42dd1b26c8a6 192.168.7.103:6380@16380 myself,slave 116c4c6de036fdbac5aaad25eb1a61ea262b64af 0 1545659135000 6 connected
f4cfc5cf821c0d855016488d6fbfb62c03a14fda 192.168.7.101:6379@16379 master - 0 1545659135000 1 connected 0-5460
116c4c6de036fdbac5aaad25eb1a61ea262b64af 192.168.7.102:6379@16379 master - 0 1545659136000 3 connected 5461-10922
70de3821dde4701c647bd6c23b9dd3c5c9f24a62 192.168.7.103:6379@16379 master - 0 1545659134000 5 connected 10923-16383
2b6e5d9c3944d79a5b64a19e54e52f83d48438d6 192.168.7.101:6380@16380 slave 70de3821dde4701c647bd6c23b9dd3c5c9f24a62 0 1545659135946 5 connected
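
Each CLUSTER NODES line holds, in order: node ID, address, flags, master ID (or - for a master), ping-sent, pong-received, config epoch, link state, and then any slot ranges. A minimal parsing sketch, using two lines from the output above as sample data; the function name and dict keys are illustrative.

```python
def parse_cluster_nodes(text):
    """Parse CLUSTER NODES output into a list of dicts, one per node."""
    nodes = []
    for line in text.strip().splitlines():
        parts = line.split()
        nodes.append({
            "id": parts[0],
            # Drop the "@cluster-bus-port" suffix from "ip:port@cport".
            "addr": parts[1].split("@")[0],
            "flags": parts[2].split(","),
            "master_id": None if parts[3] == "-" else parts[3],
            "link_state": parts[7],
            # Masters list their slot ranges after the first eight fields.
            "slots": parts[8:],
        })
    return nodes


sample = """
f4cfc5cf821c0d855016488d6fbfb62c03a14fda 192.168.7.101:6379@16379 master - 0 1545659135000 1 connected 0-5460
7186c6d03dd9a5e3be658f2d08e800bc55b04a09 192.168.7.102:6380@16380 slave f4cfc5cf821c0d855016488d6fbfb62c03a14fda 0 1545659135000 4 connected
"""
for node in parse_cluster_nodes(sample):
    print(node["addr"], node["flags"], node["master_id"], node["slots"])
```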

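(9) Verify writing a key to the cluster:

192.168.7.101:6379> SET key1 value1 #the slot is computed from the key, and the key must be written to the node that owns that slot
(error) MOVED 9189 192.168.7.102:6379 #the slot is not on this node, so the write is rejected
192.168.7.103:6379> SET key1 value1 
(error) MOVED 9189 192.168.7.102:6379 
192.168.7.102:6379> SET key1 value1 #the write succeeds on the node that owns the slot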
OK
192.168.7.102:6379> KEYS *
1) "key1"
192.168.7.101:6379> KEYS *
(empty list or set)
192.168.7.103:6379> KEYS *
(empty list or set)
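
The slot in the MOVED reply is CRC16(key) mod 16384, where CRC16 is the XMODEM variant; if the key contains a non-empty {...} hash tag, only the tag is hashed. A short sketch using Python's binascii.crc_hqx, which implements the same polynomial; key_slot is a hypothetical helper name.

```python
import binascii


def key_slot(key):
    """Compute the Redis Cluster hash slot for a key (str or bytes)."""
    data = key.encode() if isinstance(key, str) else key
    # Hash tag rule: hash only the text between the first "{" and the
    # first "}" after it, provided that text is not empty.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # crc_hqx with initial value 0 is CRC16-XMODEM (polynomial 0x1021).
    return binascii.crc_hqx(data, 0) % 16384


print(key_slot("key1"))  # 9189, the slot reported by MOVED above
```

Slot 9189 lies in the range 5461-10922, which is why the write is redirected to 192.168.7.102:6379.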

(10) Monitor the cluster status:

#1. Redis 4:
[root@s1 ~]# redis-trib.rb check 172.18.200.105:6379
>>> Performing Cluster Check (using node 172.18.200.105:6379)
S: dfa53d634b3bd798e256ef9861579d5e637fc4b0 172.18.200.105:6379
 slots: (0 slots) slave
 replicates 99e09216d4ca5739788791da81a816bd2322802d
M: 3ed26459bcdf4bbd3004c9a7506ba1f6e87dd55a 172.18.200.102:6379
 slots:5461-10922 (5462 slots) master
 1 additional replica(s)
M: 99e09216d4ca5739788791da81a816bd2322802d 172.18.200.101:6379
 slots:0-5460 (5461 slots) master
 1 additional replica(s)
S: 45e28f36573eb5123c27b359ae870e55a7d73017 172.18.200.104:6379
 slots: (0 slots) slave
 replicates 4d6357633a82f4ab48910d9c26dec8a2ef5b757c
S: adc943d76aa9ef1b123319f0772e4322756f34a2 172.18.200.106:6379
 slots: (0 slots) slave
 replicates 3ed26459bcdf4bbd3004c9a7506ba1f6e87dd55a
M: 4d6357633a82f4ab48910d9c26dec8a2ef5b757c 172.18.200.103:6379
 slots:10923-16383 (5461 slots) master
 1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@s1 ~]# redis-trib.rb info 172.18.200.105:6379
172.18.200.102:6379 (3ed26459...) -> 1 keys | 5462 slots | 1 slaves.
172.18.200.101:6379 (99e09216...) -> 0 keys | 5461 slots | 1 slaves.
172.18.200.103:6379 (4d635763...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.

#2. Redis 5:
redis-cli -a 123456 --cluster check 192.168.7.101:6379

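3. Redis Cluster node maintenance

(1) Adding a node dynamically: the new Redis nodes must run the same Redis version and use the same configuration as the existing nodes. Start two Redis instances on the new server, because it will host one master and one slave.
Example:
The business is growing quickly and the existing three-master, three-slave Redis Cluster may no longer keep up with concurrent write demand, so the company has urgently purchased a server, 192.168.7.104. It must be added to the cluster dynamically, without affecting the business or losing data. The procedure is as follows:

#Copy the configuration files of an existing Redis node to the Redis installation directory on 192.168.7.104, and adjust the listening IP in them.
 scp redis.conf 192.168.7.104:/usr/local/redis/etc/
 scp redis_6380.conf 192.168.7.104:/usr/local/redis/etc/
#Start both Redis services:
 systemctl daemon-reload
 systemctl restart redis
 /usr/local/redis/bin/redis-server /usr/local/redis/etc/redis_6380.conf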
 
#1. Add the node to the cluster:
#Redis 4:
[root@s1 ~]# redis-trib.rb add-node 172.18.200.107:6379 172.18.200.101:6379
#Redis 5:
[root@s1 ~]# redis-cli -a 123456 --cluster add-node 192.168.7.104:6379 192.168.7.101:6379
 
#2. Assign slots: a host that has just been added to the cluster owns no slots until the cluster is resharded
#Redis 3/4: reshard, then check the result
[root@s1 ~]# redis-trib.rb reshard 172.18.200.107:6379
[root@s1 ~]# redis-trib.rb check 172.18.200.101:6379
#Redis 5: check the current state
[root@redis-s3 etc]# redis-cli -a 123456 --cluster check 192.168.7.103:6379
M: 886338acd50c3015be68a760502b239f4509881c 192.168.7.104:6379  master
  slots: (0 slots) master #the newly added master has no slots
#Reassign slots to the new host:
[root@redis-s1 ~]# redis-cli -a 123456 --cluster reshard 192.168.7.104:6379 
How many slots do you want to move (from 1 to 16384)? 4096 #number of slots to assign to 192.168.7.104:6379
What is the receiving node ID? 886338acd50c3015be68a760502b239f4509881c #ID of the server that receives the slots; enter the node ID of 192.168.7.104 by hand
Source node #1: all #which source hosts give up slots to 192.168.7.104:6379; all takes them from every Redis node automatically. When removing a host from the cluster, the same command can move all of that host's slots to other hosts
Do you want to proceed with the proposed reshard plan (yes/no)? yes #confirm the plan

#3. Verify the cluster after resharding: resharding automatically moves some slots from every Redis node to the new master
[root@redis-s3 etc]# redis-cli -a 123456 --cluster check 192.168.7.103:6379
slots:[12288-16383] (4096 slots) master
slots:[6827-10922] (4096 slots) master
slots:[1365-5460] (4096 slots) master
slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
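
The three ranges received by the new master contain 1365, 1366 and 1365 slots, which add up to 4096. With "all" as the source, each existing master gives up a share proportional to the slots it owns. The sketch below reproduces those numbers under an assumed rounding rule (the largest source rounds up, the others round down); the rule is inferred from this output and the function name is hypothetical.

```python
import math


def reshard_plan(source_slots, slots_to_move):
    """Return {node: slots_taken}, proportional to each source's slot count."""
    total = sum(source_slots.values())
    # Process the source with the most slots first.
    ordered = sorted(source_slots.items(), key=lambda item: item[1], reverse=True)
    plan = {}
    for i, (node, count) in enumerate(ordered):
        share = slots_to_move / total * count
        # Assumed rounding: first source rounds up, the rest round down.
        plan[node] = math.ceil(share) if i == 0 else math.floor(share)
    return plan


before = {
    "192.168.7.101:6379": 5461,  # slots 0-5460
    "192.168.7.102:6379": 5462,  # slots 5461-10922
    "192.168.7.103:6379": 5461,  # slots 10923-16383
}
print(reshard_plan(before, 4096))
```

After the move every master, including the new one, owns exactly 4096 slots, as the check output shows.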

#4. Add a slave node for the new master:
[root@redis-s1 ~]# redis-cli -a 123456 --cluster add-node 192.168.7.104:6380 192.168.7.104:6379
>>> Send CLUSTER MEET to node 192.168.7.104:6380 to make it join the cluster.
#5. Change the new node's role to slave: it must be assigned manually as the slave of a specific master, otherwise its default role is master
[root@redis-s1 ~]# redis-cli -h 192.168.7.104 -p 6380 -a 123456 #log in to the newly added node
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.7.104:6380> CLUSTER NODES #list the cluster nodes and find the ID of the target master
886338acd50c3015be68a760502b239f4509881c 192.168.7.104:6379@16379 master - 0 1545700465468 7 connected 0-1364 5461-6826 10923-12287
192.168.7.104:6380> CLUSTER REPLICATE 886338acd50c3015be68a760502b239f4509881c #make this node a slave; the syntax is CLUSTER REPLICATE <master-id>
OK
192.168.7.104:6380> CLUSTER NODES #list the nodes again to confirm that this node is now a slave of the chosen master
b9a00d59fa3c2a322080a1c7d84f53a2c853b089 192.168.7.104:6380@16380 myself,slave 886338acd50c3015be68a760502b239f4509881c 0 1545700509000 0 connected
886338acd50c3015be68a760502b239f4509881c 192.168.7.104:6379@16379 master - 0 1545700516456 7 connected 0-1364 5461-6826 10923-12287

#6. Verify the current cluster status:
[root@redis-s3 etc]# redis-cli -a 123456 --cluster check 192.168.7.103:6379
192.168.7.104:6379(886338ac...)-> 0 keys | 4096 slots | 1 slaves.

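(2) Removing a node dynamically: first migrate the slots on the Redis node to be removed to other nodes in the cluster, then remove the node.
Example:
The server 192.168.7.101 has been in service for more than three years, is out of vendor warranty, and its disks are raising alerts. After the operations architect submitted a plan and discussed it with the developers, the team decided to take 192.168.7.101 offline temporarily, leaving three of the four servers (192.168.7.101, 192.168.7.102, 192.168.7.103, 192.168.7.104) in the cluster. The concurrent write capacity of three servers is enough to support business needs for the next 1-2 years. Redis node 192.168.7.101 is removed as follows: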

#1. Migrate the master's slots to another master: the Redis server being migrated must hold no data
[root@s1 ~]# redis-trib.rb reshard 172.18.200.101:6379
[root@s1 ~]# redis-trib.rb fix 172.18.200.101:6379 #repair the cluster if the migration fails
[root@redis-s1 ~]# redis-cli -a 123456 --cluster reshard 192.168.7.102:6379
How many slots do you want to move (from 1 to 16384)? 4096 #number of slots to move off the master
What is the receiving node ID? 886338acd50c3015be68a760502b239f4509881c #ID of the server that receives the slots
Source node #1: f4cfc5cf821c0d855016488d6fbfb62c03a14fda #server from which the 4096 slots are taken
Source node #2: done #type done when there are no more source masters
Do you want to proceed with the proposed reshard plan (yes/no)? yes #confirm

#2. Verify that the slot migration has finished
[root@redis-s3 etc]# redis-cli -a 123456 --cluster check 192.168.7.103:6379
M: f4cfc5cf821c0d855016488d6fbfb62c03a14fda 192.168.7.101:6379  master
  slots: (0 slots) master #the master to be removed no longer owns any slots
  
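#3. Remove the server from the cluster: the slots have been migrated, but the server's IP is still registered in the cluster
#Remove the master (del-node takes host:port followed by the ID of the node to remove):
#Redis 4:
[root@s1 ~]# redis-trib.rb del-node 172.18.200.102:6379 <node_id>
#Redis 5:
[root@redis-s1 ~]# redis-cli -a 123456 --cluster del-node 192.168.7.101:6379 f4cfc5cf821c0d855016488d6fbfb62c03a14fda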

#4. Verify that the node has been removed:
#192.168.7.101 is gone. However, 192.168.7.101:6380 used to be the slave of 192.168.7.103:6379, so that master is now left without a slave and one must be reassigned to it.
#The output below shows that 192.168.7.104 has two slaves, 192.168.7.102:6380 and 192.168.7.104:6380, so one of them should be moved to 192.168.7.103.
[root@redis-s3 etc]# redis-cli -a 123456 --cluster check 192.168.7.103:6379
192.168.7.103:6379(70de3821...)-> 0 keys | 4096 slots | 0 slaves.
192.168.7.102:6379(116c4c6d...)-> 1 keys | 4096 slots | 1 slaves.
192.168.7.104:6379(886338ac...)-> 0 keys | 8192 slots | 2 slaves.

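#5. Reassign a slave: make 192.168.7.104:6380 a slave of 192.168.7.103
[root@redis-s1 ~]# redis-cli -h 192.168.7.104 -p 6380 -a 123456
192.168.7.104:6380> CLUSTER NODES #find the ID of the target master
70de3821dde4701c647bd6c23b9dd3c5c9f24a62 192.168.7.103:6379@16379 master - 0 1545708440000 5 connected 12288-16383
192.168.7.104:6380> CLUSTER REPLICATE 70de3821dde4701c647bd6c23b9dd3c5c9f24a62
OK
#6. Verify the master/slave pairing:
#A Redis slave must never run on the same server as its master. Backups have to cross hosts, so that a single host failure cannot take down a master and its slave together. If a slave and its master end up on the same server, repeat the steps above to reassign slaves until every slave runs on a different host from its master.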
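
The cross-host rule can be checked mechanically once the slave-to-master pairing is known. The sketch below flags every slave that shares a host with its master; the pairings are taken from the cluster state before and after step 5, and the function names are illustrative.

```python
def host_of(addr):
    """Return the host part of an 'ip:port' address."""
    return addr.rsplit(":", 1)[0]


def same_host_pairs(slave_to_master):
    """Return (slave, master) pairs that live on the same host."""
    return [
        (slave, master)
        for slave, master in slave_to_master.items()
        if host_of(slave) == host_of(master)
    ]


# Pairing after 192.168.7.101 was removed, before step 5.
before = {
    "192.168.7.102:6380": "192.168.7.104:6379",
    "192.168.7.104:6380": "192.168.7.104:6379",
    "192.168.7.103:6380": "192.168.7.102:6379",
}
# Pairing after step 5 moved 192.168.7.104:6380 to 192.168.7.103:6379.
after = dict(before, **{"192.168.7.104:6380": "192.168.7.103:6379"})

print(same_host_pairs(before))  # [('192.168.7.104:6380', '192.168.7.104:6379')]
print(same_host_pairs(after))   # []
```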

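(3) Simulating a master failure: the current architecture is three masters and three slaves, paired across hosts. Test whether the cluster fails over to the slave automatically when a master goes down.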

#1. Test writing data: write data on a master, then verify it on the corresponding slave:
192.168.7.102:6379> SET key1 value1
OK
192.168.7.102:6379> get key1
"value1"

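#2. Verify the data on the slave:
192.168.7.103:6380> KEYS *
1) "key1"
192.168.7.103:6380> get key1
(error) MOVED 9189 192.168.7.102:6379 #by default a cluster slave redirects client commands to its master (reads are served only after the READONLY command); it acts as a data backup and as a candidate in master election

#3. Stop the master and verify failover: after the Redis master service stops, its slave is elected master and carries on serving reads and writes.
 systemctl stop redis
 
#4. Check the slave's log:
 tail -f /usr/local/redis/redis_6380.log #the failover takes some time to complete
	 i am the new master #the slave has become the new master
	 cluster state changed: OK #the state switched successfully
 
#5. Check the slave's status:
[root@redis-s3 ~]# redis-cli -h 192.168.7.103 -p 6380 -a 123456
role:master #the role has changed to master

#6. Verify reads and writes: confirm that, after being promoted to master, 192.168.7.103:6380 keeps serving reads and writes and that no data has been lost.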
192.168.7.103:6380> KEYS *
1) "key1"
192.168.7.103:6380> SET aaa bbb
OK
192.168.7.103:6380> get key1
"value1"
192.168.7.103:6380> get aaa
"bbb"
#Note: once the stopped service is back, verify the slaves of each master again.

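(4) Importing existing Redis data: the Redis Cluster must not contain keys with the same names as the data being imported; otherwise the import fails or is interrupted.
Example:
After deploying Redis Cluster, the company needs to import its existing data into the cluster. Because Redis Cluster shards keys across nodes, a traditional AOF file or RDB snapshot cannot simply be loaded, so the cluster import command is used instead.

#1. Prepare the environment: before importing, remove the password from every Redis server, including each node in the cluster and the source Redis server, so that mismatched authentication does not block the import. Adding --cluster-replace forces existing keys in the cluster to be overwritten.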
[root@redis-s1 ~]# redis-cli -h 192.168.7.102 -p 6379 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.7.102:6379> CONFIG SET requirepass ""
OK
192.168.7.102:6379> exit
[root@redis-s1 ~]# redis-cli -h 192.168.7.102 -p 6380 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.7.102:6380> CONFIG SET requirepass ""
OK
192.168.7.102:6380> exit
[root@redis-s1 ~]# redis-cli -h 192.168.7.103 -p 6379 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.7.103:6379> CONFIG SET requirepass ""
OK
192.168.7.103:6379> exit
[root@redis-s1 ~]# redis-cli -h 192.168.7.103 -p 6380 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.7.103:6380> CONFIG SET requirepass ""
OK
192.168.7.103:6380> exit
[root@redis-s1 ~]# redis-cli -h 192.168.7.104 -p 6379 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.7.104:6379> CONFIG SET requirepass ""
OK
192.168.7.104:6379> exit
[root@redis-s1 ~]# redis-cli -h 192.168.7.104 -p 6380 -a 123456
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
192.168.7.104:6380> CONFIG SET requirepass ""
OK
192.168.7.104:6380> exit

#2. Run the import: load the data from the source Redis server directly into the Redis Cluster.
#Redis 3/4:
[root@s1 ~]# redis-trib.rb import --from 172.18.200.107:6382 --replace 172.18.200.107:6379
#Redis 5:
[root@redis-s2 redis]# redis-cli --cluster import 192.168.7.103:6379 --cluster-from 192.168.7.101:6379 --cluster-copy

#3. Verify the data in the Redis Cluster:
192.168.7.102:6380> get k81
"v81"
192.168.7.102:6380> get k97
"v97"
192.168.7.102:6380> get k12
"v12"

Article URL: https://blog.csdn.net/weixin_44515412/article/details/107378187
