
Notes on problems encountered while setting up a ZooKeeper cluster

程序员文章站 2022-05-11 10:50:59

I won't go through the cluster setup itself here. This post covers the problems I ran into when starting ZooKeeper after the setup.

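There are four machines, and I am using three of them for the ZooKeeper cluster. Why only three: ZooKeeper needs a majority of its servers to be up, so an ensemble of three tolerates one failure, and a fourth server would not raise that number. That is why odd sizes are preferred (look up the details if you are interested).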
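The odd-number preference comes down to simple majority arithmetic. As a rough sketch (the function names `quorum_size` and `tolerated_failures` are made up for illustration, and this assumes the default majority quorum):

```shell
# Majority quorum size for an ensemble of $1 voting servers.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# How many servers can be down while a majority is still available.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Three and four servers both tolerate a single failure, so the fourth machine adds no fault tolerance.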

The IPs and hostnames of the three machines are:

10.131.14.138 slave1

10.131.14.139 slave2

10.131.14.140 slave3

ZooKeeper configuration file:

zookeeper/conf/zoo.cfg

zoo.cfg starts out as zoo_sample.cfg, so make a copy first:

cp zoo_sample.cfg zoo.cfg

Contents of zoo.cfg:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/usr/local/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.0=slave1:2888:3888
server.1=slave2:2888:3888
server.2=slave3:2888:3888
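The `server.N` lines at the end of the file are what tie each numeric id to a host, and the startup problems below all relate to that id. A small sketch for listing the mapping (the helper name `list_servers` is hypothetical, and it assumes each entry starts at the beginning of a line in the `server.N=host:port:port` form shown above):

```shell
# Print "<id> <host>" for every server.N line in a zoo.cfg file.
# $1: path to zoo.cfg
list_servers() {
    sed -n 's/^server\.\([0-9][0-9]*\)=\([^:]*\):.*/\1 \2/p' "$1"
}
```

For example, `list_servers /usr/local/zookeeper/conf/zoo.cfg` would print one `id host` pair per line for the configuration above.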

Problems encountered

After starting, check the processes:

# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
# jps
14692 ResourceManager
20221 Jps
14399 NameNode

There is no ZooKeeper process. The log file zookeeper.out shows:

ERROR [main:...] - Invalid config, exiting abnormally

....

Caused by: java.lang.IllegalArgumentException: /usr/local/zookeeper/data/myid file is missing

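Solution:

Create a myid file in /usr/local/zookeeper/data/ (the dataDir from zoo.cfg) on every node.

After starting again the result is the same: no process. The log file shows: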

ERROR [main:...] - Invalid config, exiting abnormally
...

Caused by: java.lang.IllegalArgumentException: serverid null is not a number

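Solution:

Write a number into the myid file on each node (at first I picked them arbitrarily):

slave1: 1

slave2: 2

slave3: 3

As a result, slave1 and slave2 started successfully but slave3 did not. Its log: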

2019-03-08 18:09:35,596 [myid:] - INFO  [main:...] - Reading configuration from: /usr/local/zookeeper/bin/../conf/zoo.cfg
2019-03-08 18:09:35,609 [myid:] - INFO  [main:...] - Resolved hostname: slave2 to address: slave2/10.131.14.139
2019-03-08 18:09:35,609 [myid:] - INFO  [main:...] - Resolved hostname: slave1 to address: slave1/10.131.14.138
2019-03-08 18:09:35,610 [myid:] - INFO  [main:...] - Resolved hostname: slave3 to address: slave3/10.131.14.140
2019-03-08 18:09:35,610 [myid:] - INFO  [main:...] - Defaulting to majority quorums
2019-03-08 18:09:35,612 [myid:3] - INFO  [main:...] - autopurge.snapRetainCount set to 3
2019-03-08 18:09:35,612 [myid:3] - INFO  [main:...] - autopurge.purgeInterval set to 0
2019-03-08 18:09:35,613 [myid:3] - INFO  [main:...] - Purge task is not scheduled.
2019-03-08 18:09:35,620 [myid:3] - INFO  [main:...] - Starting quorum peer
2019-03-08 18:09:35,626 [myid:3] - INFO  [main:...] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2019-03-08 18:09:35,631 [myid:3] - INFO  [main:...] - binding to port 0.0.0.0/0.0.0.0:2181
2019-03-08 18:09:35,638 [myid:3] - INFO  [main:...] - tickTime set to 2000
2019-03-08 18:09:35,638 [myid:3] - INFO  [main:...] - initLimit set to 10
2019-03-08 18:09:35,638 [myid:3] - INFO  [main:...] - minSessionTimeout set to -1
2019-03-08 18:09:35,639 [myid:3] - INFO  [main:...] - maxSessionTimeout set to -1
2019-03-08 18:09:35,644 [myid:3] - ERROR [main:...] - Setting LearnerType to PARTICIPANT but 3 not in QuorumPeers.
2019-03-08 18:09:35,645 [myid:3] - INFO  [main:...] - QuorumPeer communication is not secured!
2019-03-08 18:09:35,645 [myid:3] - INFO  [main:...] - quorum.cnxn.threads.size set to 20
2019-03-08 18:09:35,647 [myid:3] - INFO  [main:...] - currentEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2019-03-08 18:09:35,677 [myid:3] - INFO  [main:...] - acceptedEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2019-03-08 18:09:35,703 [myid:3] - ERROR [main:...] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: My id 3 not in the peer list
        at org.apache.zookeeper.server.quorum.QuorumPeer.startLeaderElection(QuorumPeer.java:718)
        at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:637)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)

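The message is: My id 3 not in the peer list

Solution

The myid on each node should be the number that follows server. on that node's line in zoo.cfg, otherwise startup fails. Changing only slave3's myid to 0 would also get past this check, but slave3 would then identify itself as server.0, whose entry points at slave1, so that is confusing to manage and not a sound configuration.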
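The three failures above (missing file, empty file, id not in the server list) can be caught before starting ZooKeeper. A minimal sketch, assuming the `server.N=...` layout shown earlier; `check_myid` is a hypothetical helper, not part of ZooKeeper, and its messages are its own rather than ZooKeeper's:

```shell
# Check dataDir/myid against the server.N ids in zoo.cfg.
# Prints a short diagnosis and returns non-zero on failure.
check_myid() {
    cfg=$1       # path to zoo.cfg
    data_dir=$2  # value of dataDir
    myid_file="$data_dir/myid"

    if [ ! -f "$myid_file" ]; then
        echo "missing: $myid_file"
        return 1
    fi

    # myid should hold just the id; strip whitespace and newlines
    id=$(tr -d '[:space:]' < "$myid_file")
    case $id in
        ''|*[!0-9]*)
            echo "not a number: '$id'"
            return 1
            ;;
    esac

    if grep -q "^server\.$id=" "$cfg"; then
        echo "ok: myid $id matches a server.$id entry"
    else
        echo "myid $id has no server.$id entry in $cfg"
        return 1
    fi
}
```

On each node this could be run as `check_myid /usr/local/zookeeper/conf/zoo.cfg /usr/local/zookeeper/data`. It only checks that the id exists in the list, not that the matching entry names the current host.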

So I changed the myid values of slave1, slave2 and slave3 to:

slave1: 0

slave2: 1

slave3: 2

After starting, check the processes with jps:
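To avoid typing the ids by hand, the value can be derived from zoo.cfg itself. The following is only a sketch: `write_myid` is a hypothetical helper, it assumes the hostname is written in zoo.cfg exactly as passed in, and it treats the hostname as a regular expression (so dots in a hostname would match any character):

```shell
# Look up a host's id from the server.N lines and write it to dataDir/myid.
write_myid() {
    cfg=$1       # path to zoo.cfg
    data_dir=$2  # value of dataDir
    host=$3      # hostname as it appears in zoo.cfg, e.g. slave1

    # Take the N from the first "server.N=<host>:..." line
    id=$(sed -n "s/^server\.\([0-9][0-9]*\)=$host:.*/\1/p" "$cfg" | head -n 1)
    if [ -z "$id" ]; then
        echo "no server.N entry for host $host in $cfg" >&2
        return 1
    fi
    mkdir -p "$data_dir" && echo "$id" > "$data_dir/myid"
}
```

On slave3, for instance, `write_myid /usr/local/zookeeper/conf/zoo.cfg /usr/local/zookeeper/data slave3` would write 2 into the myid file.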

slave1

# jps
20236 QuorumPeerMain
20317 Jps

slave2

# jps
20202 QuorumPeerMain
20256 Jps

slave3

# jps
18302 QuorumPeerMain
18348 Jps

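Startup succeeded.

Finally, one more possible reason why some machines in a cluster fail to start ZooKeeper: their clocks may be out of sync. If you have tried many fixes without success, it is worth synchronizing the time. On CentOS 7 you can install a time synchronization tool with yum install ntp.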