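MongoDB Disaster Recovery Across Two Data Centers
A data center can lose power or network connectivity without warning, so it pays to prepare. Below I simulate that kind of outage for each data center in turn. When data center B goes down, data center A keeps serving normally, because A holds three of the five members and still has a majority. When data center A goes down, data center B is left with only two of five members and has to be forced back into service. After doing that, the data all appears to still be there, but I have not yet verified whether any writes were lost. This was only a quick trial.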
Architecture overview
Data center A
192.168.70.214:27017 (primary; all other nodes are secondaries)
192.168.70.215:27017
192.168.70.216:27017
Data center B
192.168.71.214:27017
192.168.71.215:27017
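To see why the two outages behave differently, here is a minimal sketch of the majority rule in plain JavaScript. It assumes every member has one vote (the MongoDB default), and canElectPrimary is a hypothetical helper name, not a MongoDB API.

```javascript
// Hypothetical helper: can the surviving members still elect a primary?
// Assumes one vote per member, so a primary needs a strict majority
// of the configured members.
function canElectPrimary(totalMembers, survivingMembers) {
  var majority = Math.floor(totalMembers / 2) + 1;
  return survivingMembers >= majority;
}

var totalMembers = 5; // 3 nodes in data center A + 2 nodes in data center B

// Data center B is down: the 3 nodes in A are still a majority of 5.
console.log("A alone can elect a primary:", canElectPrimary(totalMembers, 3));

// Data center A is down: the 2 nodes in B are not a majority of 5.
console.log("B alone can elect a primary:", canElectPrimary(totalMembers, 2));
```

With five members the majority is three, so A alone passes the check and B alone does not. That is why B needs a forced reconfiguration down to a two-member set before it can elect a primary.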
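The rest of this post records the case where data center A has lost power and network, and how to get data center B serving again. It comes down to three steps, all run on a surviving node in B: read the current config with cfg = rs.conf(), trim cfg.members down to the two B nodes, and apply the result with rs.reconfig(cfg, {force: true}).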
repset:SECONDARY> use admin
switched to db admin
repset:SECONDARY> cfg=rs.conf()
{
    "_id" : "repset",
    "version" : 79,
    "members" : [
        {
            "_id" : 4,
            "host" : "192.168.71.214:27017",
            "priority" : 10
        },
        {
            "_id" : 5,
            "host" : "192.168.71.215:27017",
            "priority" : 9
        },
        {
            "_id" : 7,
            "host" : "192.168.70.214:27017",
            "priority" : 11
        },
        {
            "_id" : 8,
            "host" : "192.168.70.215:27017",
            "priority" : 6
        },
        {
            "_id" : 13,
            "host" : "192.168.70.216:27017",
            "priority" : 5
        }
    ]
}
repset:SECONDARY> cfg.members = [cfg.members[0], cfg.members[1]]
[
    {
        "_id" : 4,
        "host" : "192.168.71.214:27017",
        "priority" : 10
    },
    {
        "_id" : 5,
        "host" : "192.168.71.215:27017",
        "priority" : 9
    }
]
repset:SECONDARY> rs.reconfig(cfg, {force : true})
{ "ok" : 1 }
repset:SECONDARY> rs.status()
{
    "set" : "repset",
    "date" : ISODate("2018-11-09T03:45:29Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 4,
            "name" : "192.168.71.214:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 69133,
            "optime" : Timestamp(1541663971000, 1),
            "optimeDate" : ISODate("2018-11-08T07:59:31Z"),
            "self" : true
        },
        {
            "_id" : 5,
            "name" : "192.168.71.215:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 8,
            "optime" : Timestamp(1541663971000, 1),
            "optimeDate" : ISODate("2018-11-08T07:59:31Z"),
            "lastHeartbeat" : ISODate("2018-11-09T03:45:29Z"),
            "pingMs" : 0
        }
    ],
    "ok" : 1
}
repset:PRIMARY>
repset:PRIMARY> use test
switched to db test
repset:PRIMARY> show collections
system.indexes
test
testdb
repset:PRIMARY> rs.conf()
{
    "_id" : "repset",
    "version" : 38342,
    "members" : [
        {
            "_id" : 4,
            "host" : "192.168.71.214:27017",
            "priority" : 10
        },
        {
            "_id" : 5,
            "host" : "192.168.71.215:27017",
            "priority" : 9
        }
    ]
}
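The session above picks the surviving members by array index, which only works if you know where they sit in cfg.members. As a rough sketch of an alternative, the hypothetical helper below (keepMembersByHost is my own name, not a MongoDB function) selects them by host instead. The sample config mirrors the rs.conf() output from this post, and the block runs in plain Node.js without touching a replica set.

```javascript
// Hypothetical helper: return a copy of cfg that keeps only the members
// whose "host" appears in survivingHosts. The input cfg is not modified.
function keepMembersByHost(cfg, survivingHosts) {
  var kept = cfg.members.filter(function (member) {
    return survivingHosts.indexOf(member.host) !== -1;
  });
  if (kept.length === 0) {
    throw new Error("no member of the config matches the surviving hosts");
  }
  var trimmed = {};
  Object.keys(cfg).forEach(function (key) {
    trimmed[key] = cfg[key];
  });
  trimmed.members = kept;
  return trimmed;
}

// Sample config shaped like the rs.conf() output in the session above.
var sampleCfg = {
  _id: "repset",
  version: 79,
  members: [
    { _id: 4, host: "192.168.71.214:27017", priority: 10 },
    { _id: 5, host: "192.168.71.215:27017", priority: 9 },
    { _id: 7, host: "192.168.70.214:27017", priority: 11 },
    { _id: 8, host: "192.168.70.215:27017", priority: 6 },
    { _id: 13, host: "192.168.70.216:27017", priority: 5 }
  ]
};

var survivingHosts = ["192.168.71.214:27017", "192.168.71.215:27017"];
var trimmedCfg = keepMembersByHost(sampleCfg, survivingHosts);

// Only the two data center B members (ids 4 and 5) remain.
console.log(trimmedCfg.members.map(function (m) { return m._id; }));

// In the mongo shell on a surviving node (not runnable in Node.js):
//   cfg = keepMembersByHost(rs.conf(), survivingHosts)
//   rs.reconfig(cfg, { force : true })
```

Matching on host keeps the step correct even if the member order in the config differs from what you remember. The helper throws if the host list matches nothing, so a typo cannot produce an empty member list.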
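From the checks above: the newly elected primary keeps serving and the existing data is still there, so the single data center outage has been worked around. The config version jumping from 79 to 38342 is expected, since a forced reconfig bumps the version by a large amount to avoid collisions with configs held by unreachable members.
Suggestions and corrections are welcome. Let's learn together.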
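As a small aid for the "is everything back?" check, the sketch below summarizes an rs.status()-shaped document: how many primaries there are, which members are unhealthy, and how far each secondary's optimeDate trails the primary's. summarizeStatus is a hypothetical helper, and the sample document copies only the fields it needs from the status output in this post, with ISODate values written as plain JavaScript Date objects so the block runs in Node.js.

```javascript
// Hypothetical helper: summarize a document shaped like rs.status() output.
// state 1 is PRIMARY and state 2 is SECONDARY, as in the output above.
function summarizeStatus(status) {
  var primaries = status.members.filter(function (m) { return m.state === 1; });
  var unhealthy = status.members
    .filter(function (m) { return m.health !== 1; })
    .map(function (m) { return m.name; });

  // Lag is only meaningful when there is exactly one primary to compare with.
  var lagSeconds = {};
  if (primaries.length === 1) {
    var primaryTime = primaries[0].optimeDate.getTime();
    status.members.forEach(function (m) {
      if (m.state === 2) {
        lagSeconds[m.name] = (primaryTime - m.optimeDate.getTime()) / 1000;
      }
    });
  }
  return { primaryCount: primaries.length, unhealthy: unhealthy, lagSeconds: lagSeconds };
}

// Sample taken from the rs.status() output after the forced reconfig.
var sampleStatus = {
  set: "repset",
  members: [
    { _id: 4, name: "192.168.71.214:27017", health: 1, state: 1,
      optimeDate: new Date("2018-11-08T07:59:31Z") },
    { _id: 5, name: "192.168.71.215:27017", health: 1, state: 2,
      optimeDate: new Date("2018-11-08T07:59:31Z") }
  ]
};

var summary = summarizeStatus(sampleStatus);
console.log(JSON.stringify(summary, null, 2));
```

For the sample this reports one primary, no unhealthy members, and zero lag on the secondary. Note what it does not show: writes that reached only data center A before the outage never appear in B's oplog, so a clean summary here is not proof that nothing was lost.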