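Notes on problems encountered while setting up a ZooKeeper cluster
I won't go through the cluster setup itself here. This post covers the problems I ran into when starting the cluster afterwards.
I have four machines and planned to use three of them for the ZooKeeper cluster. Why only three? ZooKeeper works best with an odd number of servers: the cluster stays available only while a majority of its servers are up, so a fourth server would not let it survive any more failures than three do.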
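The odd-number preference follows from simple majority arithmetic. The sketch below is only an illustration of that arithmetic; the function names quorum_size and tolerated_failures are made up for this example and are not part of ZooKeeper.

```shell
#!/usr/bin/env bash
# Majority-quorum arithmetic for an ensemble of n voting servers.

# Smallest number of servers that forms a majority of n.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can be down while a majority is still up.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
  echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# servers=3 quorum=2 tolerated_failures=1
# servers=4 quorum=3 tolerated_failures=1
# servers=5 quorum=3 tolerated_failures=2
```

Three and four servers both tolerate a single failure, which is why the fourth machine is left out.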
The IPs and hostnames of the three machines are:
10.131.14.138 slave1
10.131.14.139 slave2
10.131.14.140 slave3
ZooKeeper configuration file:
zookeeper/conf/zoo.cfg
zoo.cfg does not exist at first. Only zoo_sample.cfg is provided, so make a copy of it:
cp zoo_sample.cfg zoo.cfg
The contents of zoo.cfg are:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/usr/local/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.0=slave1:2888:3888
server.1=slave2:2888:3888
server.2=slave3:2888:3888
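Each server.N line has the form server.N=host:port:port, where N is the server id, the first port is the one followers use to connect to the leader, and the second is used for leader election. As a rough sketch, such a line can be split into its parts like this (parse_server_line is a hypothetical helper name, not a ZooKeeper tool):

```shell
#!/usr/bin/env bash
# Split a "server.N=host:peerPort:electionPort" line into its fields.
parse_server_line() {
  local line="$1" key rest ports
  key=${line%%=*}      # e.g. "server.0"
  rest=${line#*=}      # e.g. "slave1:2888:3888"
  ports=${rest#*:}     # e.g. "2888:3888"
  echo "id=${key#server.} host=${rest%%:*} peer_port=${ports%%:*} election_port=${ports#*:}"
}

parse_server_line "server.0=slave1:2888:3888"
# id=0 host=slave1 peer_port=2888 election_port=3888
```

The id field is the part that matters for the rest of this post.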
The problem
Start ZooKeeper and check the processes:
# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
# jps
14692 ResourceManager
20221 Jps
14399 NameNode
No ZooKeeper process is listed. The log file zookeeper.out shows:
ERROR [main] - Invalid config, exiting abnormally
....
Caused by: java.lang.IllegalArgumentException: /usr/local/zookeeper/data/myid file is missing
Fix:
Create a myid file in /usr/local/zookeeper/data/ (the dataDir set in zoo.cfg) on every node.
After starting again the result is the same: no process. The log file shows:
ERROR [main] - Invalid config, exiting abnormally
...
Caused by: java.lang.IllegalArgumentException: serverid null is not a number
Fix:
Write a number into the myid file on each node. At first I picked the numbers arbitrarily:
slave1: 1
slave2: 2
slave3: 3
As a result, slave1 and slave2 started, but slave3 did not. Its log:
2019-03-08 18:09:35,596 [myid:] - INFO [main] - Reading configuration from: /usr/local/zookeeper/bin/../conf/zoo.cfg
2019-03-08 18:09:35,609 [myid:] - INFO [main] - Resolved hostname: slave2 to address: slave2/10.131.14.139
2019-03-08 18:09:35,609 [myid:] - INFO [main] - Resolved hostname: slave1 to address: slave1/10.131.14.138
2019-03-08 18:09:35,610 [myid:] - INFO [main] - Resolved hostname: slave3 to address: slave3/10.131.14.140
2019-03-08 18:09:35,610 [myid:] - INFO [main] - Defaulting to majority quorums
2019-03-08 18:09:35,612 [myid:3] - INFO [main] - autopurge.snapRetainCount set to 3
2019-03-08 18:09:35,612 [myid:3] - INFO [main] - autopurge.purgeInterval set to 0
2019-03-08 18:09:35,613 [myid:3] - INFO [main] - Purge task is not scheduled.
2019-03-08 18:09:35,620 [myid:3] - INFO [main] - Starting quorum peer
2019-03-08 18:09:35,626 [myid:3] - INFO [main] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2019-03-08 18:09:35,631 [myid:3] - INFO [main] - binding to port 0.0.0.0/0.0.0.0:2181
2019-03-08 18:09:35,638 [myid:3] - INFO [main] - tickTime set to 2000
2019-03-08 18:09:35,638 [myid:3] - INFO [main] - initLimit set to 10
2019-03-08 18:09:35,638 [myid:3] - INFO [main] - minSessionTimeout set to -1
2019-03-08 18:09:35,639 [myid:3] - INFO [main] - maxSessionTimeout set to -1
2019-03-08 18:09:35,644 [myid:3] - ERROR [main] - Setting LearnerType to PARTICIPANT but 3 not in QuorumPeers.
2019-03-08 18:09:35,645 [myid:3] - INFO [main] - QuorumPeer communication is not secured!
2019-03-08 18:09:35,645 [myid:3] - INFO [main] - quorum.cnxn.threads.size set to 20
2019-03-08 18:09:35,647 [myid:3] - INFO [main] - currentEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2019-03-08 18:09:35,677 [myid:3] - INFO [main] - acceptedEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2019-03-08 18:09:35,703 [myid:3] - ERROR [main] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: My id 3 not in the peer list
    at org.apache.zookeeper.server.quorum.QuorumPeer.startLeaderElection(QuorumPeer.java:718)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:637)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:170)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:114)
    at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:81)
The log reports: My id 3 not in the peer list
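All three failures so far (missing myid file, myid that is not a number, id not in the peer list) can be caught before starting the server. The following is a minimal sketch of such a pre-flight check, not something ZooKeeper ships; check_myid is a hypothetical helper, and the demo runs against a temporary directory rather than the real /usr/local/zookeeper paths.

```shell
#!/usr/bin/env bash
# Check that a myid file exists, holds a number, and that the number
# appears as a server.N entry in the given zoo.cfg-style file.
check_myid() {
  local cfg="$1" myid_file="$2" id
  if [ ! -f "$myid_file" ]; then
    echo "myid file is missing: $myid_file"
    return 1
  fi
  id=$(tr -d '[:space:]' < "$myid_file")
  case "$id" in
    ''|*[!0-9]*)
      echo "myid is not a number: '$id'"
      return 1
      ;;
  esac
  if grep -q "^server\.${id}=" "$cfg"; then
    echo "myid $id is listed in $cfg"
  else
    echo "myid $id is not in the server list of $cfg"
    return 1
  fi
}

# Demo with the server list from this post and the myid that failed on slave3.
demo=$(mktemp -d)
printf '%s\n' 'server.0=slave1:2888:3888' 'server.1=slave2:2888:3888' \
  'server.2=slave3:2888:3888' > "$demo/zoo.cfg"
echo 3 > "$demo/myid"
check_myid "$demo/zoo.cfg" "$demo/myid" || echo "this node would fail to start"
rm -rf "$demo"
```

Note that this check only confirms the id is in the list. It does not confirm that the id belongs to the host the file sits on.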
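Fix
The myid on each node should be the number that follows server. in that node's own line in zoo.cfg, otherwise startup fails. Changing slave3's myid to 0 would also get past this particular error, because 0 is in the list, but then the ids would no longer match the hosts they are assigned to in zoo.cfg, and each node identifies itself by the server.N entry that matches its myid.
So I changed the myid values of slave1, slave2 and slave3 to:
slave1: 0
slave2: 1
slave3: 2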
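Rather than typing the numbers by hand, the id can be looked up from zoo.cfg by hostname. This is a sketch under the assumption that each machine's hostname is exactly the name used in its server.N line; myid_for_host is a hypothetical helper, and the demo uses a temporary file instead of the real config.

```shell
#!/usr/bin/env bash
# Print N from the "server.N=<host>:..." line of a zoo.cfg-style file.
# Returns non-zero if the host is not listed.
myid_for_host() {
  local cfg="$1" host="$2" key value
  while IFS='=' read -r key value; do
    case "$key" in
      server.*)
        if [ "${value%%:*}" = "$host" ]; then
          echo "${key#server.}"
          return 0
        fi
        ;;
    esac
  done < "$cfg"
  return 1
}

# Demo with the server list from this post.
cfg=$(mktemp)
printf '%s\n' 'clientPort=2181' 'server.0=slave1:2888:3888' \
  'server.1=slave2:2888:3888' 'server.2=slave3:2888:3888' > "$cfg"
for h in slave1 slave2 slave3; do
  echo "$h -> myid $(myid_for_host "$cfg" "$h")"
done
rm -f "$cfg"
```

On a real node this could be used as id=$(myid_for_host /usr/local/zookeeper/conf/zoo.cfg "$(hostname)") && echo "$id" > /usr/local/zookeeper/data/myid, so the file is only written when the host was actually found.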
After starting again, check the processes with jps:
slave1
# jps
20236 QuorumPeerMain
20317 Jps
slave2
# jps
20202 QuorumPeerMain
20256 Jps
slave3
# jps
18302 QuorumPeerMain
18348 Jps
All three nodes started successfully.
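jps only shows that the JVM process exists. Running zkServer.sh status on each node additionally reports the node's role in a line beginning with "Mode:" (for example leader or follower). A small sketch for pulling that line out of the output follows; zk_mode is a hypothetical helper, and the sample text fed to it is illustrative, not output captured from the cluster in this post.

```shell
#!/usr/bin/env bash
# Extract the role from the "Mode: ..." line of `zkServer.sh status` output.
zk_mode() {
  sed -n 's/^Mode: //p'
}

# On a real node: zkServer.sh status 2>&1 | zk_mode
# Here the function is fed illustrative text instead.
printf '%s\n' \
  'ZooKeeper JMX enabled by default' \
  'Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg' \
  'Mode: follower' | zk_mode
```

In a healthy three-node ensemble one node should report leader and the other two follower.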
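Finally, one more possible reason why some machines in a cluster fail to start ZooKeeper: their clocks may be out of sync. If you have tried many fixes and none of them worked, it is worth synchronizing the time. On CentOS 7 you can install a time synchronization tool with yum install ntp.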