Hadoop 2.7.1 + HBase 1.2.1 Cluster Setup (5): HBase Installation
(1) hadoop 2.7.1 source compilation | http://aperise.iteye.com/blog/2246856 |
(2) hadoop 2.7.1 installation preparation | http://aperise.iteye.com/blog/2253544 |
(3) cluster installation (works for both 1.x and 2.x) | http://aperise.iteye.com/blog/2245547 |
(4) hbase installation preparation | http://aperise.iteye.com/blog/2254451 |
(5) hbase installation | http://aperise.iteye.com/blog/2254460 |
(6) snappy installation | http://aperise.iteye.com/blog/2254487 |
(7) hbase performance tuning | http://aperise.iteye.com/blog/2282670 |
(8) hbase performance testing with Yahoo YCSB | http://aperise.iteye.com/blog/2248863 |
(9) spring-hadoop in practice | http://aperise.iteye.com/blog/2254491 |
(10) ZooKeeper-based Hadoop HA cluster installation | http://aperise.iteye.com/blog/2305809 |
1. Hadoop Installation
See http://aperise.iteye.com/blog/2245547
2. HBase Pre-installation Preparation
See http://aperise.iteye.com/blog/2254451
3. HBase Installation
3.1 Download hbase-1.2.1-bin.tar.gz, place it under /opt, and extract it
tar zxvf hbase-1.2.1-bin.tar.gz
3.2 Configure environment variables
vi /etc/profile
# add the following:
export HBASE_HOME=/opt/hbase-1.2.1
export PATH=${HBASE_HOME}/bin:${PATH}
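After saving /etc/profile, apply it in the current shell and check that the hbase command resolves; a minimal sketch, assuming the install path used above:
source /etc/profile
hbase version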
3.3 Create the HBase temporary directory (it must be created on every node in the cluster)
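This directory corresponds to hbase.tmp.dir in hbase-site.xml below; a minimal sketch of creating it, assuming the hadoop user from the preparation steps should own it:
mkdir -p /home/hadoop/hbase
chown -R hadoop:hadoop /home/hadoop/hbase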
3.4 Edit /opt/hbase-1.2.1/conf/hbase-env.sh as follows:
#/**
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements. See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership. The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License. You may obtain a copy of the License at
# *
# * http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */
# Set environment variables here.
# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)
# The java implementation to use. Java 1.7+ required.
# export JAVA_HOME=/usr/java/jdk1.6.0/
export JAVA_HOME=/opt/java/jdk1.7.0_65/
# Extra Java CLASSPATH elements. Optional.
# export HBASE_CLASSPATH=
export HBASE_CLASSPATH=/opt/hbase-1.2.1/conf
# The maximum amount of heap to use. Default is left to JVM default.
# export HBASE_HEAPSIZE=1G
#export HBASE_HEAPSIZE=2G
export HBASE_HEAPSIZE=4G
# Uncomment below if you intend to use off heap cache. For example, to allocate 8G of
# offheap, set the value to "8G".
# export HBASE_OFFHEAPSIZE=1G
# Extra Java runtime options.
# Below are what we set by default. May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export HBASE_OPTS="$HBASE_OPTS -XX:CMSInitiatingOccupancyFraction=60"
# Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8+
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
#export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmx4g -Xms4g -Xmn1g -XX:SurvivorRatio=1 -XX:PermSize=128M -XX:MaxPermSize=128M -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false -XX:MaxTenuringThreshold=15 -XX:+CMSParallelRemarkEnabled -XX:+UseFastAccessorMethods -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+HeapDumpOnOutOfMemoryError -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -Xloggc:/opt/hbase-1.2.1/logs/gc-hbase-regionserver.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.
# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
# Uncomment one of the below three options to enable java garbage collection logging for the client processes.
# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/opt/hbase-1.2.1/logs/gc-hbase-client.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
# See the package documentation for org.apache.hadoop.hbase.io.hfile for other configurations
# needed setting up off-heap block caching.
# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
# NOTE: HBase provides an alternative JMX implementation to fix the random ports issue, please see JMX
# section in HBase Reference Guide for instructions.
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"
# File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
#HBASE_REGIONSERVER_MLOCK=true
#HBASE_REGIONSERVER_UID="hbase"
# File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
# Extra ssh options. Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"
# Where log files are stored. $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs
# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER
# The scheduling priority for daemon processes. See 'man nice'.
# export HBASE_NICENESS=10
# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids
# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1
# Tell HBase whether it should manage it's own instance of Zookeeper or not.
# export HBASE_MANAGES_ZK=true
export HBASE_MANAGES_ZK=false
# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the
# RFA appender. Please refer to the log4j.properties file to see more details on this appender.
# In case one needs to do log rolling on a date change, one should set the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
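A quick sanity check before moving on, a sketch using the paths configured above: confirm the JDK is 1.7+ and that the directory referenced by the -Xloggc options exists on every node.
/opt/java/jdk1.7.0_65/bin/java -version
mkdir -p /opt/hbase-1.2.1/logs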
3.5 Edit /opt/hbase-1.2.1/conf/hbase-site.xml as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Root directory on HDFS where HBase stores its data -->
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://192.168.181.66:9000/hbase</value>
    </property>
    <!-- Run in distributed mode -->
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <!-- ZooKeeper quorum; the port defaults to 2181 if not specified -->
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>nmsc0,nmsc1,nmsc2</value>
    </property>
    <!-- HBase temporary directory, used e.g. for table pre-split information -->
    <property>
        <name>hbase.tmp.dir</name>
        <value>/home/hadoop/hbase/</value>
    </property>
    <property>
        <name>hbase.master</name>
        <value>hdfs://192.168.181.66:60000</value>
    </property>
    <!-- Where ZooKeeper stores its data -->
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/hadoop/zookeeper</value>
    </property>
    <!-- Client-side write buffer of the HBase API: once buffered mutations exceed this value a commit is sent.
         Configuring 5M uniformly here in /opt/hbase-1.2.1/conf/hbase-site.xml applies to all HTables,
         so client code does not need to set it explicitly -->
    <property>
        <!-- htable.setWriteBufferSize(5242880); // 5M -->
        <name>hbase.client.write.buffer</name>
        <value>5242880</value>
    </property>
    <!-- Maximum number of concurrent RPC handler threads (the same value is used by the Master) -->
    <property>
        <name>hbase.regionserver.handler.count</name>
        <value>300</value>
        <description>Count of RPC Listener instances spun up on RegionServers. Same property is used by the Master for count of master handlers.</description>
    </property>
    <!-- hbase.table.sanity.checks is a switch for table-level parameter checks. When true, the checks are:
         1. check max file size, hbase.hregion.max.filesize, at least 2MB
         2. check flush size, hbase.hregion.memstore.flush.size, at least 1MB
         3. check that coprocessors and other specified plugin classes can be loaded
         4. check compression can be loaded
         5. check encryption can be loaded
         6. verify compaction policy
         7. check that we have at least 1 CF
         8. check blockSize
         9. check versions
         10. check minVersions <= maxVersions
         11. check replication scope
         12. check data replication factor; it can be 0 (default value) when the user has not set it explicitly,
             in which case the file system's default replication factor is used.
         For details see the method sanityCheckTableDescriptor in org.apache.hadoop.hbase.master.HMaster,
         located in the hbase-server module of the HBase source tree -->
    <property>
        <name>hbase.table.sanity.checks</name>
        <value>false</value>
    </property>
    <!-- ZooKeeper session timeout. HBase passes this value to the ZooKeeper ensemble as its suggested
         maximum session timeout -->
    <property>
        <!-- every 30s the master will check whether a regionserver is working -->
        <name>zookeeper.session.timeout</name>
        <value>30000</value>
    </property>
    <!-- Maximum region size, set here to 30G so that frequent splits do not block reads and writes: a region
         is split only after it exceeds 30G. In production, first estimate the data volume for the retention
         period, derive the number of regions from a 30G region size, and specify the splits at table creation
         time to avoid frequent splitting later -->
    <property>
        <!-- every region max file size set to 30G -->
        <name>hbase.hregion.max.filesize</name>
        <value>32212254720</value>
    </property>
    <!-- By default HBase runs a major compaction every 24 hours, which blocks reads and writes. It is disabled
         here, but the operation should still be performed: schedule it with a Linux shell script in cron while
         the cluster is idle -->
    <property>
        <name>hbase.hregion.majorcompaction</name>
        <value>0</value>
    </property>
    <!-- HBase is essentially an HDFS client: although core-site.xml on the Hadoop side sets a replication factor,
         the value supplied by the client takes precedence. It is set to 3 here, i.e. each file ends up with 3 copies
         in Hadoop; production should use 3, because it has been observed that with a value below 3, when the
         "All datanodes xxx.xxx.xxx.xxx:port are bad. Aborting..." error occurs after a DataNode goes down, the
         DFSClient used by HBase should in principle retry the write on another DataNode but does not -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- IncreasingToUpperBoundRegionSplitPolicy: with 2 pre-split regions and a memstore flush size of 128M,
         the next split happens at 2 squared times 128MB, i.e. 2*2*128M = 512MB.
         ConstantSizeRegionSplitPolicy: split only when a region exceeds the 30G configured above -->
    <property>
        <name>hbase.regionserver.region.split.policy</name>
        <value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
    </property>
    <!-- How long an edit is cached in memory; default 3600000 ms -->
    <property>
        <name>hbase.regionserver.optionalcacheflushinterval</name>
        <value>7200000</value>
        <description>Maximum amount of time an edit lives in memory before being automatically flushed. Default 1 hour. Set it to 0 to disable automatic flushing.</description>
    </property>
    <!-- Fraction of the maximum heap (-Xmx) allocated to the HFile/StoreFile block cache.
         The default 0.4 means 40%; 0 disables it, which is not recommended -->
    <property>
        <name>hfile.block.cache.size</name>
        <value>0.3</value>
        <description>Percentage of maximum heap (-Xmx setting) to allocate to block cache used by HFile/StoreFile. Default of 0.4 means allocate 40%. Set to 0 to disable but it's not recommended; you need at least enough cache to hold the storefile indices.</description>
    </property>
    <!-- When a memstore exceeds this size it is flushed to disk. A thread checks this every
         hbase.server.thread.wakefrequency -->
    <property>
        <name>hbase.hregion.memstore.flush.size</name>
        <value>52428800</value>
    </property>
    <!-- Upper bound on the combined size of all memstores on a region server. Above it, new updates are blocked
         and flushes are forced. In older versions this was hbase.regionserver.global.memstore.upperLimit, computed
         as HEAP_SIZE (e.g. 4G from hbase-env.sh) times the configured fraction; in HBase 1.2.1 it is
         hbase.regionserver.global.memstore.size, computed the same way, so 4G * 0.5 = 2G is the blocking threshold -->
    <property>
        <name>hbase.regionserver.global.memstore.size</name>
        <value>0.5</value>
    </property>
    <!-- When forced flushing drops memstore usage below this value, flushing stops. Default is 35% of the heap.
         If it equals hbase.regionserver.global.memstore.upperLimit, only the minimum flushing needed to unblock
         updates is done. In older versions this was hbase.regionserver.global.memstore.lowerLimit, computed as
         HEAP_SIZE times the configured fraction (e.g. 4G * 0.3 = 1.2G), at which point the largest memstore is
         flushed. In newer versions it is hbase.regionserver.global.memstore.size.lower.limit, computed relative to
         hbase.regionserver.global.memstore.size rather than the heap: with HEAP_SIZE = 4G,
         hbase.regionserver.global.memstore.size = 0.5 and lower.limit = 0.5, the threshold is 4G * 0.5 * 0.5 = 1G,
         at which point the largest memstore is flushed first -->
    <property>
        <name>hbase.regionserver.global.memstore.size.lower.limit</name>
        <value>0.5</value>
    </property>
    <property>
        <!-- HDFS client socket timeout; set it generously, otherwise HBase tends to crash repeatedly on this later -->
        <name>dfs.client.socket-timeout</name>
        <value>600000</value>
    </property>
</configuration>
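Before starting HBase it is worth verifying that the HDFS root and the ZooKeeper quorum referenced in this file are reachable; a minimal sketch (nc is assumed to be installed; ruok is a standard ZooKeeper four-letter command that should answer imok):
cd /opt/hadoop-2.7.1/bin
./hdfs dfs -ls hdfs://192.168.181.66:9000/
echo ruok | nc nmsc0 2181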
3.6 Edit /opt/hbase-1.2.1/conf/regionservers as follows (one hostname per line):
nmsc1
nmsc2
3.7 Copy the installation directory to the other nodes
scp -r /opt/hbase-1.2.1 root@nmsc2:/opt/
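The same directory has to be present on every node listed in regionservers; a hedged sketch of pushing it to both nodes in one go, assuming the copy is run from the master node:
for host in nmsc1 nmsc2; do scp -r /opt/hbase-1.2.1 root@${host}:/opt/; done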
3.8 Start and stop HBase. The commands can be run on any machine in the cluster; make sure Hadoop is running first.
cd /opt/hbase-1.2.1/bin/
./start-hbase.sh
jps
cd /opt/hbase-1.2.1/bin/
./stop-hbase.sh
3.9 Open the HBase web UI at http://192.168.181.66:16010
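Cluster status can also be checked from the command line; a minimal sketch that runs the HBase shell non-interactively:
cd /opt/hbase-1.2.1
echo "status 'simple'" | bin/hbase shell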
3.10 Use the built-in ImportTsv tool to load a TSV file into an HBase table
1) Create the HBase table sms_send_result with one column family info containing the columns info:sender, info:receiver, info:sendtime, info:sendstatus and info:message. The column family uses SNAPPY compression and a TTL of 60 days, and the table is pre-split by rowkey into 100 regions on the keys '01' through '99', as listed in the create statement below.
cd /opt/hbase-1.2.1
bin/hbase shell
disable 'sms_send_result'
drop 'sms_send_result'
create 'sms_send_result', {NAME => 'info', COMPRESSION => 'SNAPPY', TTL => '5184000'}, SPLITS => ['01','02','03','04','05','06','07','08','09','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','37','38','39','40','41','42','43','44','45','46','47','48','49','50','51','52','53','54','55','56','57','58','59','60','61','62','63','64','65','66','67','68','69','70','71','72','73','74','75','76','77','78','79','80','81','82','83','84','85','86','87','88','89','90','91','92','93','94','95','96','97','98','99']
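Typing all 99 split keys by hand is error-prone; because the HBase shell is JRuby, the same list can be generated with a range expression. A sketch equivalent to the literal SPLITS list above, run inside bin/hbase shell:
create 'sms_send_result', {NAME => 'info', COMPRESSION => 'SNAPPY', TTL => '5184000'}, SPLITS => ('01'..'99').to_a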
2) Upload the TSV file /opt/hadoop-2.7.1/bin/sms.tsv to the root directory of HDFS:
cd /opt/hadoop-2.7.1/bin/
./hdfs dfs -put /opt/hadoop-2.7.1/bin/sms.tsv /
3) Use the built-in ImportTsv tool to load the HDFS file /sms.tsv into the table sms_send_result:
cd /opt/hbase-1.2.1
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,info:sender,info:receiver,info:sendtime,info:sendstatus,info:message sms_send_result hdfs://192.168.181.66:9000/sms.tsv
4) sms.tsv looks roughly like this (two records, fields tab-separated):
1154011896700000000000000201112251548071060776636 106591145302 19999999999 20111225 15:48:07 DELIVRD 阿拉斯加的费
9908996845700000000000000201112251548071060776638 106591145302 19899999999 20111225 15:48:07 DELIVRD 暗室逢灯
In the TSV file, the fields must appear in the order given in step 3: HBASE_ROW_KEY, info:sender, info:receiver, info:sendtime, info:sendstatus, info:message, and they must be strictly TAB-separated. If a different separator is used, specify it at import time with -Dimporttsv.separator=.
Also note: the built-in ImportTsv tool is only efficient when importing into an empty table. Once the table already holds data, imports become slow because the incoming data keeps triggering splits and compactions.
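When the target table is not empty, the usual way around the split/compaction overhead is to have ImportTsv write HFiles instead of issuing puts, and then bulk-load them; a hedged sketch (the /tmp/sms_hfiles output path is only an example):
cd /opt/hbase-1.2.1
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,info:sender,info:receiver,info:sendtime,info:sendstatus,info:message -Dimporttsv.bulk.output=hdfs://192.168.181.66:9000/tmp/sms_hfiles sms_send_result hdfs://192.168.181.66:9000/sms.tsv
bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://192.168.181.66:9000/tmp/sms_hfiles sms_send_result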
4. Routine HBase Maintenance
1. Basic commands
Create a table:
create 'sms_send_result', {NAME => 'info', COMPRESSION => 'SNAPPY', TTL => '5184000'}, SPLITS => ['1','2','3','4','5','6','7','8','9']
Drop a table (it must be disabled first with disable 'sms_send_result'):
drop 'sms_send_result'
Enable / disable a table:
enable 'sms_send_result'
disable 'sms_send_result'
Describe a table:
describe 'testtable'
Alter a table:
disable 'sms_send_result'
alter 'sms_send_result', NAME => 'info', TTL => '7776000'
enable 'sms_send_result'
Scan the first 10 rows:
scan 'sms_send_result', {COLUMNS => 'info', LIMIT => 10, STARTROW => '666632331200000020160305', STOPROW => '666632331200000020160308'}
List all tables:
list
Merge two regions by their encoded names:
merge_region 'encoded name of region1','encoded name of region2'
cd /opt/hbase-1.2.1/bin
./hbase shell
major_compact 'table'
quit
cd /opt/hbase-1.2.1/bin
./hbase shell
flush 'table1'
flush 'table2'
quit
cd /opt/hbase-1.2.1/bin
./hbase shell
balance_switch true
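Because hbase.hregion.majorcompaction is set to 0 in hbase-site.xml above, major compactions have to be triggered manually. A hedged sketch of a small script that could be run from cron during an idle window (the table names are only examples):
#!/bin/bash
# manually trigger a major compaction on selected tables during an idle period
for table in sms_send_result table1 table2; do
  echo "major_compact '${table}'" | /opt/hbase-1.2.1/bin/hbase shell
done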
2. HBase day-to-day operations and maintenance: http://my.oschina.net/beiyou/blog/76456?fromerr=mPxT883K
5. Problems Encountered with HBase
1. Problem 1
[nmsc0:16000.activeMasterManager] zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at address=nmsc1,16020,1461818834056, exception=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on nmsc1,16020,1461819267215
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2898)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:947)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegionInfo(RSRpcServices.java:1232)
        at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22233)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
Solutions:
1. With the system firewall enabled, hostname-to-IP resolution can go wrong; delete HBase's tmp directory and restart (this must be done on every node).
2. The Hadoop cluster has entered safe mode; run hdfs dfsadmin -safemode leave to exit it.
# check whether safe mode is on
[hadoop@nmsc2 bin]$ cd /opt/hadoop-2.7.1/bin
[hadoop@nmsc2 bin]$ ./hdfs dfsadmin -safemode get
Safe mode is OFF
[hadoop@nmsc2 bin]$
# leave safe mode
[hadoop@nmsc2 bin]$ cd /opt/hadoop-2.7.1/bin
[hadoop@nmsc2 bin]$ ./hdfs dfsadmin -safemode leave
Safe mode is OFF
[hadoop@nmsc2 bin]$
3. Delete the temporary data under hbase.tmp.dir=/home/hadoop/hbase/ (as configured in hbase-site.xml) on every HBase node, so that HBase re-reads its metadata from HDFS on each regionserver:
rm -rf /home/hadoop/hbase/*
4. Data stored in HBase has been lost; recover it using Hadoop's trash mechanism, or delete the HBase data.
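If the tmp directory has to be cleared on all nodes, a hedged one-liner can save some typing (hostnames as above, assuming passwordless ssh for root is set up):
for host in nmsc0 nmsc1 nmsc2; do ssh root@${host} "rm -rf /home/hadoop/hbase/*"; done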
2. Problem 2: All datanodes xxx.xxx.xxx.xxx:port are bad. Aborting...
http://blog.sina.com.cn/s/blog_72827fb1010198j7.html
http://blog.sina.com.cn/s/blog_46d817650100t5yf.html
http://p-x1984.iteye.com/blog/989577
java.io.IOException: All datanodes 192.168.88.21:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
When this problem occurs, most solutions found online are to raise the Linux limits on open files and processes, as follows:
Raise the operating system ulimit -n limits:
1) Edit /etc/security/limits.conf and append:
* soft nofile 409600
* hard nofile 819200
* soft nproc 409600
* hard nproc 819200
2) Edit /etc/pam.d/login and append:
session required /lib/security/pam_limits.so
3) Reboot the system so the configuration takes effect.
# view current limits
ulimit -a
# temporarily raise the soft open-file limit
ulimit -Sn 409600
# temporarily raise the hard open-file limit
ulimit -Hn 819200
ulimit -a
However, the problem persisted after these changes. Looking more closely at the logs, every occurrence begins with a large burst of flushing activity, like this:
2016-05-18 20:00:17,795 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,795 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,795 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,795 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,796 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,796 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5004ms
2016-05-18 20:00:17,799 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,807 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,812 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,815 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,835 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5002ms
2016-05-18 20:00:17,836 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,839 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5002ms
2016-05-18 20:00:17,840 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,841 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5002ms
2016-05-18 20:00:17,841 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5002ms
2016-05-18 20:00:17,852 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,854 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:17,856 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 5003ms
2016-05-18 20:00:21,723 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal630000,88,1450432267618.f8770cd416800a624f3aca9cbcf4f24f. after a delay of 10408
2016-05-18 20:00:22,796 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,797 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10005ms
2016-05-18 20:00:22,797 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10005ms
2016-05-18 20:00:22,798 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10005ms
2016-05-18 20:00:22,798 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10006ms
2016-05-18 20:00:22,798 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10006ms
2016-05-18 20:00:22,798 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10006ms
2016-05-18 20:00:22,800 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,808 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,813 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,816 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,836 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10003ms
2016-05-18 20:00:22,838 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10005ms
2016-05-18 20:00:22,840 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10003ms
2016-05-18 20:00:22,841 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,842 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10003ms
2016-05-18 20:00:22,843 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,853 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,855 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:22,857 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 10004ms
2016-05-18 20:00:27,798 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,798 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,799 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15007ms
2016-05-18 20:00:27,799 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15007ms
2016-05-18 20:00:27,799 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15007ms
2016-05-18 20:00:27,799 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,799 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15007ms
2016-05-18 20:00:27,802 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,810 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,815 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,818 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,838 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15005ms
2016-05-18 20:00:27,839 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,842 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15005ms
2016-05-18 20:00:27,843 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,844 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15005ms
2016-05-18 20:00:27,844 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15005ms
2016-05-18 20:00:27,855 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,857 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:27,859 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 15006ms
2016-05-18 20:00:31,722 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal630000,88,1450432267618.f8770cd416800a624f3aca9cbcf4f24f. after a delay of 16850
2016-05-18 20:00:32,801 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,801 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,802 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20010ms
2016-05-18 20:00:32,802 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,802 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20010ms
2016-05-18 20:00:32,802 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20010ms
2016-05-18 20:00:32,802 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20010ms
2016-05-18 20:00:32,805 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,813 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,818 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,821 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,841 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20008ms
2016-05-18 20:00:32,842 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,845 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20008ms
2016-05-18 20:00:32,846 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,847 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20008ms
2016-05-18 20:00:32,847 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20008ms
2016-05-18 20:00:32,858 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,860 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:32,862 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 20009ms
2016-05-18 20:00:37,802 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,803 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,803 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25011ms
2016-05-18 20:00:37,804 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25011ms
2016-05-18 20:00:37,804 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25012ms
2016-05-18 20:00:37,804 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25012ms
2016-05-18 20:00:37,805 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25013ms
2016-05-18 20:00:37,806 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,814 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,819 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,822 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,842 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25009ms
2016-05-18 20:00:37,843 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,846 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25009ms
2016-05-18 20:00:37,847 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,848 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25009ms
2016-05-18 20:00:37,848 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25009ms
2016-05-18 20:00:37,859 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,861 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:37,863 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 25010ms
2016-05-18 20:00:41,722 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal630000,88,1450432267618.f8770cd416800a624f3aca9cbcf4f24f. after a delay of 18453
2016-05-18 20:00:42,804 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,805 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,805 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30013ms
2016-05-18 20:00:42,805 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30013ms
2016-05-18 20:00:42,806 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30014ms
2016-05-18 20:00:42,806 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30014ms
2016-05-18 20:00:42,806 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30014ms
2016-05-18 20:00:42,808 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,816 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,821 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,824 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,844 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30011ms
2016-05-18 20:00:42,845 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,848 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30011ms
2016-05-18 20:00:42,849 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,850 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30011ms
2016-05-18 20:00:42,850 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30011ms
2016-05-18 20:00:42,861 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,863 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:42,865 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 30012ms
2016-05-18 20:00:47,807 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,808 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,808 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35016ms
2016-05-18 20:00:47,808 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35016ms
2016-05-18 20:00:47,809 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35017ms
2016-05-18 20:00:47,809 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35017ms
2016-05-18 20:00:47,809 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35017ms
2016-05-18 20:00:47,811 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,819 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,824 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,827 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,847 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35014ms
2016-05-18 20:00:47,848 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,851 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35014ms
2016-05-18 20:00:47,852 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,853 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35014ms
2016-05-18 20:00:47,853 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35014ms
2016-05-18 20:00:47,864 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,866 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:47,868 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 35015ms
2016-05-18 20:00:51,723 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal460000,21,1450942786420.3366a30d7833e98cc51cb396f8b77d58. after a delay of 3320
2016-05-18 20:00:51,723 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal630000,88,1450432267618.f8770cd416800a624f3aca9cbcf4f24f. after a delay of 10640
2016-05-18 20:00:52,809 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40017ms
2016-05-18 20:00:52,809 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,810 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40018ms
2016-05-18 20:00:52,810 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40018ms
2016-05-18 20:00:52,810 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40018ms
2016-05-18 20:00:52,811 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40019ms
2016-05-18 20:00:52,811 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40019ms
2016-05-18 20:00:52,812 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,820 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,825 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,828 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,848 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40015ms
2016-05-18 20:00:52,849 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,852 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40015ms
2016-05-18 20:00:52,853 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,854 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40015ms
2016-05-18 20:00:52,854 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40015ms
2016-05-18 20:00:52,865 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,867 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:52,869 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 40016ms
2016-05-18 20:00:57,811 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45019ms
2016-05-18 20:00:57,811 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,812 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45020ms
2016-05-18 20:00:57,812 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45020ms
2016-05-18 20:00:57,812 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45020ms
2016-05-18 20:00:57,813 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45021ms
2016-05-18 20:00:57,813 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45021ms
2016-05-18 20:00:57,814 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,822 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,827 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,830 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,850 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45017ms
2016-05-18 20:00:57,851 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,854 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45017ms
2016-05-18 20:00:57,855 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,856 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45017ms
2016-05-18 20:00:57,856 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45017ms
2016-05-18 20:00:57,867 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,869 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:00:57,871 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 45018ms
2016-05-18 20:01:01,722 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal460000,21,1450942786420.3366a30d7833e98cc51cb396f8b77d58. after a delay of 14380
2016-05-18 20:01:01,722 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal630000,88,1450432267618.f8770cd416800a624f3aca9cbcf4f24f. after a delay of 10687
2016-05-18 20:01:02,814 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50022ms
2016-05-18 20:01:02,814 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,815 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50023ms
2016-05-18 20:01:02,815 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50023ms
2016-05-18 20:01:02,815 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50023ms
2016-05-18 20:01:02,816 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50024ms
2016-05-18 20:01:02,816 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50024ms
2016-05-18 20:01:02,817 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,825 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,830 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,833 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,853 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50020ms
2016-05-18 20:01:02,854 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,857 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50020ms
2016-05-18 20:01:02,858 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,859 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50020ms
2016-05-18 20:01:02,859 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50020ms
2016-05-18 20:01:02,870 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,872 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:02,874 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 50021ms
2016-05-18 20:01:07,816 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55024ms
2016-05-18 20:01:07,816 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55023ms
2016-05-18 20:01:07,817 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55025ms
2016-05-18 20:01:07,817 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55025ms
2016-05-18 20:01:07,817 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55025ms
2016-05-18 20:01:07,818 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55026ms
2016-05-18 20:01:07,818 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55026ms
2016-05-18 20:01:07,819 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55023ms
2016-05-18 20:01:07,827 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55023ms
2016-05-18 20:01:07,832 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55023ms
2016-05-18 20:01:07,834 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55022ms
2016-05-18 20:01:07,854 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55021ms
2016-05-18 20:01:07,855 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55022ms
2016-05-18 20:01:07,858 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55021ms
2016-05-18 20:01:07,860 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55021ms
2016-05-18 20:01:07,860 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55023ms
2016-05-18 20:01:07,861 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55022ms
2016-05-18 20:01:07,872 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55023ms
2016-05-18 20:01:07,874 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55023ms
2016-05-18 20:01:07,875 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 55022ms
2016-05-18 20:01:11,721 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal460000,21,1450942786420.3366a30d7833e98cc51cb396f8b77d58. after a delay of 10618
2016-05-18 20:01:11,722 INFO [nmsc1,16020,1463474336922_ChoreService_1] regionserver.HRegionServer: nmsc1,16020,1463474336922-MemstoreFlusherChore requesting flush for region signal630000,88,1450432267618.f8770cd416800a624f3aca9cbcf4f24f. after a delay of 20718
2016-05-18 20:01:12,818 WARN [B.defaultRpcServer.handler=44,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60026ms
2016-05-18 20:01:12,818 WARN [B.defaultRpcServer.handler=77,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60025ms
2016-05-18 20:01:12,819 WARN [B.defaultRpcServer.handler=38,queue=8,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60027ms
2016-05-18 20:01:12,819 WARN [B.defaultRpcServer.handler=134,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60027ms
2016-05-18 20:01:12,819 WARN [B.defaultRpcServer.handler=110,queue=20,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60027ms
2016-05-18 20:01:12,820 WARN [B.defaultRpcServer.handler=296,queue=26,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60028ms
2016-05-18 20:01:12,820 WARN [B.defaultRpcServer.handler=12,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60028ms
2016-05-18 20:01:12,821 WARN [B.defaultRpcServer.handler=132,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60025ms
2016-05-18 20:01:12,829 WARN [B.defaultRpcServer.handler=273,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60025ms
2016-05-18 20:01:12,834 WARN [B.defaultRpcServer.handler=275,queue=5,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60025ms
2016-05-18 20:01:12,836 WARN [B.defaultRpcServer.handler=271,queue=1,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60024ms
2016-05-18 20:01:12,856 WARN [B.defaultRpcServer.handler=234,queue=24,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60023ms
2016-05-18 20:01:12,857 WARN [B.defaultRpcServer.handler=287,queue=17,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60024ms
2016-05-18 20:01:12,860 WARN [B.defaultRpcServer.handler=156,queue=6,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60023ms
2016-05-18 20:01:12,862 WARN [B.defaultRpcServer.handler=244,queue=4,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60023ms
2016-05-18 20:01:12,862 WARN [B.defaultRpcServer.handler=164,queue=14,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60025ms
2016-05-18 20:01:12,863 WARN [B.defaultRpcServer.handler=178,queue=28,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60024ms
2016-05-18 20:01:12,874 WARN [B.defaultRpcServer.handler=33,queue=3,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60025ms
2016-05-18 20:01:12,876 WARN [B.defaultRpcServer.handler=42,queue=12,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60025ms
2016-05-18 20:01:12,877 WARN [B.defaultRpcServer.handler=62,queue=2,port=16020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 60024ms
This is followed by a flood of timeout errors, such as:
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1998)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1356)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1281)
java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.88.21:43696 remote=/192.168.88.21:50010]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1998)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1356)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1281)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.88.21:43706 remote=/192.168.88.21:50010]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1998)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1356)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1281)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
Finally the regionserver is shut down outright, with errors like the following:
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1998)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1356)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1281)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
2016-05-18 20:02:27,185 INFO [Thread-19599] hdfs.DFSClient: Exception in createBlockOutputStream
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.88.21:38128 remote=/192.168.88.22:50010]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1998)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1356)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1281)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
2016-05-18 20:04:23,546 ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: HRegionServer Aborted
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:68)
at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2651)
2016-05-18 20:04:23,578 INFO [Thread-6] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@2d969ab0
2016-05-18 20:04:23,578 INFO [Thread-6] regionserver.ShutdownHook: Starting fs shutdown hook thread.
2016-05-18 20:04:23,579 ERROR [Thread-19612] hdfs.DFSClient: Failed to close inode 5457064
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/data/default/signal460000/442c79376cb9b6e0cfa848dcb5569849/.tmp/917e9d8fb3fc40a0a8e1a3cc533a1d52 could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3034)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:723)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
2016-05-18 20:04:23,580 ERROR [Thread-19612] hdfs.DFSClient: Failed to close inode 5457058
java.io.IOException: All datanodes 192.168.88.22:50010 are bad. Aborting...
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
2016-05-18 20:04:23,580 ERROR [Thread-19612] hdfs.DFSClient: Failed to close inode 5457063
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/data/default/signal150000/53548fab04ee732f64c9362bffe289ae/.tmp/ba60c4b1a5694276b23f783c27b8f9b4 could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3034)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:723)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
2016-05-18 20:04:23,581 ERROR [Thread-19612] hdfs.DFSClient: Failed to close inode 5457062
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/WALs/nmsc1,16020,1463474336922/nmsc1%2C16020%2C1463474336922.default.1463572812007 could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1550)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3034)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:723)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy19.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
2016-05-18 20:04:23,582 INFO [Thread-6] regionserver.ShutdownHook: Shutdown hook finished.
Reading through the log above, the sequence of events is:
- 1. A huge batch of flush operations kicks in; HBase, acting as an HDFS client, calls DFSClient to flush each pre-split region's memstore to disk;
- 2. The volume is enormous - essentially one flush per region. While some flushes run, the rest sit queued; after timing out they are retried four or five times;
- 3. After the mass of flushes has failed repeatedly, the DataNode is declared dead, even though it is actually perfectly healthy - the flush volume was simply too large;
- 4. Finally ZooKeeper, the coordination service, gives up on the regionserver node that keeps failing to respond, and the regionserver is shut down.
With the above analysis in hand, the next step is finding a fix.
First, understand why such a large batch of memstore flushes gets triggered, then address it. That means looking at HBase's memstore model, summarized below:
- 1. Every Region has a Memstore; its default flush size is 128MB and can be changed via hbase.hregion.memstore.flush.size;
- 2. Regions multiply as splits happen. To keep the sum of all Memstores from causing an OOM error, older HBase versions bounded it with hbase.regionserver.global.memstore.upperLimit and hbase.regionserver.global.memstore.lowerLimit, while newer versions use hbase.regionserver.global.memstore.size and hbase.regionserver.global.memstore.size.lower.limit;
- 3. With HEAP_SIZE=4G in hbase-env.sh, the old-style hbase.regionserver.global.memstore.upperLimit (default HEAP_SIZE*0.4) is 1.6G and hbase.regionserver.global.memstore.lowerLimit (default HEAP_SIZE*0.35) is 1.4G;
- the new-style hbase.regionserver.global.memstore.size (default HEAP_SIZE*0.4) is 1.6G and hbase.regionserver.global.memstore.size.lower.limit (default 0.95 of the global memstore size, i.e. HEAP_SIZE*0.4*0.95) is 1.52G - the arithmetic is worked through in the sketch after the source excerpt below;
- 4. When the Memstore total reaches the first (lower) watermark, the single largest Memstore is selected and flushed; writes are not blocked;
- 5. When the Memstore total reaches the second (upper) watermark, all writes are blocked and every Memstore is flushed;
- 6. Each RegionServer also has a BlockCache, whose total size defaults to HEAP_SIZE*0.4 and is set via hfile.block.cache.size;
- 7. Every read request checks the BlockCache first (values held in the Memstore are, for the most part, also in the BlockCache) and falls back to the HFiles on a miss;
- 8. HBase requires that the Memstore fraction (hbase.regionserver.global.memstore.size, default 0.4) plus the BlockCache fraction (hfile.block.cache.size, default 0.4) not exceed 0.8, leaving at least 0.2 of HEAP_SIZE for everything else; the check lives in org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil in the HBase source:
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.hbase.io.util;

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.classification.InterfaceAudience;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HConstants;

@InterfaceAudience.Private
public class HeapMemorySizeUtil {

  public static final String MEMSTORE_SIZE_KEY = "hbase.regionserver.global.memstore.size";
  public static final String MEMSTORE_SIZE_OLD_KEY = "hbase.regionserver.global.memstore.upperLimit";
  public static final String MEMSTORE_SIZE_LOWER_LIMIT_KEY = "hbase.regionserver.global.memstore.size.lower.limit";
  public static final String MEMSTORE_SIZE_LOWER_LIMIT_OLD_KEY = "hbase.regionserver.global.memstore.lowerLimit";

  public static final float DEFAULT_MEMSTORE_SIZE = 0.4f;
  // Default lower water mark limit is 95% size of memstore size.
  public static final float DEFAULT_MEMSTORE_SIZE_LOWER_LIMIT = 0.95f;

  private static final Log LOG = LogFactory.getLog(HeapMemorySizeUtil.class);
  // a constant to convert a fraction to a percentage
  private static final int CONVERT_TO_PERCENTAGE = 100;

  /**
   * Checks whether we have enough heap memory left out after portion for Memstore and Block cache.
   * We need atleast 20% of heap left out for other RS functions.
   * @param conf
   */
  public static void checkForClusterFreeMemoryLimit(Configuration conf) {
    if (conf.get(MEMSTORE_SIZE_OLD_KEY) != null) {
      LOG.warn(MEMSTORE_SIZE_OLD_KEY + " is deprecated by " + MEMSTORE_SIZE_KEY);
    }
    float globalMemstoreSize = getGlobalMemStorePercent(conf, false);
    int gml = (int)(globalMemstoreSize * CONVERT_TO_PERCENTAGE);
    float blockCacheUpperLimit = getBlockCacheHeapPercent(conf);
    int bcul = (int)(blockCacheUpperLimit * CONVERT_TO_PERCENTAGE);
    if (CONVERT_TO_PERCENTAGE - (gml + bcul)
        < (int)(CONVERT_TO_PERCENTAGE * HConstants.HBASE_CLUSTER_MINIMUM_MEMORY_THRESHOLD)) {
      throw new RuntimeException("Current heap configuration for MemStore and BlockCache exceeds "
          + "the threshold required for successful cluster operation. "
          + "The combined value cannot exceed 0.8. Please check "
          + "the settings for hbase.regionserver.global.memstore.size and "
          + "hfile.block.cache.size in your configuration. "
          + "hbase.regionserver.global.memstore.size is " + globalMemstoreSize
          + " hfile.block.cache.size is " + blockCacheUpperLimit);
    }
  }

  /**
   * Retrieve global memstore configured size as percentage of total heap.
   * @param c
   * @param logInvalid
   */
  public static float getGlobalMemStorePercent(final Configuration c, final boolean logInvalid) {
    float limit = c.getFloat(MEMSTORE_SIZE_KEY,
        c.getFloat(MEMSTORE_SIZE_OLD_KEY, DEFAULT_MEMSTORE_SIZE));
    if (limit > 0.8f || limit <= 0.0f) {
      if (logInvalid) {
        LOG.warn("Setting global memstore limit to default of " + DEFAULT_MEMSTORE_SIZE
            + " because supplied value outside allowed range of (0 -> 0.8]");
      }
      limit = DEFAULT_MEMSTORE_SIZE;
    }
    return limit;
  }

  /**
   * Retrieve configured size for global memstore lower water mark as percentage of total heap.
   * @param c
   * @param globalMemStorePercent
   */
  public static float getGlobalMemStoreLowerMark(final Configuration c, float globalMemStorePercent) {
    String lowMarkPercentStr = c.get(MEMSTORE_SIZE_LOWER_LIMIT_KEY);
    if (lowMarkPercentStr != null) {
      return Float.parseFloat(lowMarkPercentStr);
    }
    String lowerWaterMarkOldValStr = c.get(MEMSTORE_SIZE_LOWER_LIMIT_OLD_KEY);
    if (lowerWaterMarkOldValStr != null) {
      LOG.warn(MEMSTORE_SIZE_LOWER_LIMIT_OLD_KEY + " is deprecated. Instead use "
          + MEMSTORE_SIZE_LOWER_LIMIT_KEY);
      float lowerWaterMarkOldVal = Float.parseFloat(lowerWaterMarkOldValStr);
      if (lowerWaterMarkOldVal > globalMemStorePercent) {
        lowerWaterMarkOldVal = globalMemStorePercent;
        LOG.info("Setting globalMemStoreLimitLowMark == globalMemStoreLimit "
            + "because supplied " + MEMSTORE_SIZE_LOWER_LIMIT_OLD_KEY + " was > "
            + MEMSTORE_SIZE_OLD_KEY);
      }
      return lowerWaterMarkOldVal / globalMemStorePercent;
    }
    return DEFAULT_MEMSTORE_SIZE_LOWER_LIMIT;
  }

  /**
   * Retrieve configured size for on heap block cache as percentage of total heap.
   * @param conf
   */
  public static float getBlockCacheHeapPercent(final Configuration conf) {
    // L1 block cache is always on heap
    float l1CachePercent = conf.getFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY,
        HConstants.HFILE_BLOCK_CACHE_SIZE_DEFAULT);
    float l2CachePercent = getL2BlockCacheHeapPercent(conf);
    return l1CachePercent + l2CachePercent;
  }

  /**
   * @param conf
   * @return The on heap size for L2 block cache.
   */
  public static float getL2BlockCacheHeapPercent(Configuration conf) {
    float l2CachePercent = 0.0F;
    String bucketCacheIOEngineName = conf.get(HConstants.BUCKET_CACHE_IOENGINE_KEY, null);
    // L2 block cache can be on heap when IOEngine is "heap"
    if (bucketCacheIOEngineName != null && bucketCacheIOEngineName.startsWith("heap")) {
      float bucketCachePercentage = conf.getFloat(HConstants.BUCKET_CACHE_SIZE_KEY, 0F);
      MemoryUsage mu = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
      l2CachePercent = bucketCachePercentage < 1 ? bucketCachePercentage
          : (bucketCachePercentage * 1024 * 1024) / mu.getMax();
    }
    return l2CachePercent;
  }
}
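To make the numbers in points 3 and 4 above concrete, here is a minimal, self-contained sketch (plain Java with hard-coded fractions matching the defaults just described; the class name MemstoreWatermarkMath is purely illustrative) that works out the old-style and new-style watermarks for a 4G heap:

public class MemstoreWatermarkMath {
    public static void main(String[] args) {
        long heapBytes = 4L * 1024 * 1024 * 1024;           // HBASE_HEAPSIZE=4G

        // Old-style keys: hbase.regionserver.global.memstore.upperLimit / lowerLimit
        double upperLimit = 0.4;                             // default upper limit fraction of heap
        double lowerLimit = 0.35;                            // default lower limit fraction of heap
        System.out.printf("old upper = %.2f GB%n", heapBytes * upperLimit / (1L << 30));   // 1.60 GB
        System.out.printf("old lower = %.2f GB%n", heapBytes * lowerLimit / (1L << 30));   // 1.40 GB

        // New-style keys: hbase.regionserver.global.memstore.size / size.lower.limit
        double memstoreSize = 0.4;                           // default global memstore fraction of heap
        double lowerMark = 0.95;                             // default lower limit, as a fraction OF the memstore size
        System.out.printf("new upper = %.2f GB%n", heapBytes * memstoreSize / (1L << 30));              // 1.60 GB
        System.out.printf("new lower = %.2f GB%n", heapBytes * memstoreSize * lowerMark / (1L << 30));  // 1.52 GB
    }
}

At the defaults the upper and lower marks are only about 80MB apart (1.6G vs 1.52G), which is why a sustained write burst can push a regionserver straight from "flush the largest memstore" into "block writes and flush everything".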
So the goal is to avoid ever triggering the flush of all memstores at once. The fix is to widen the gap between the two watermarks; the configuration I added to hbase-site.xml is:
<property>
  <name>hfile.block.cache.size</name>
  <value>0.3</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.size.lower.limit</name>
  <value>0.5</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.5</value>
</property>
With HEAP_SIZE=4G this works out to:
hfile.block.cache.size: 4G * 0.3 = 1.2G;
hbase.regionserver.global.memstore.size: 4G * 0.5 = 2G;
hbase.regionserver.global.memstore.size.lower.limit: 4G * 0.5 * 0.5 = 1G;
and 0.3 + 0.5 <= 0.8, so HBase's rule that the two fractions together must not exceed 0.8 is still satisfied.
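The same arithmetic and the 0.8 ceiling can also be checked programmatically with the HeapMemorySizeUtil methods quoted above. The sketch below is only an illustration (the class name CheckTunedFractions is hypothetical, and it assumes the HBase 1.2.1 jars are on the classpath); it feeds the three tuned fractions into those methods and prints the resulting sizes for a 4G heap:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil;

public class CheckTunedFractions {
    public static void main(String[] args) {
        // Mirror the three properties added to hbase-site.xml above
        Configuration conf = new Configuration(false);
        conf.setFloat("hfile.block.cache.size", 0.3f);
        conf.setFloat("hbase.regionserver.global.memstore.size", 0.5f);
        conf.setFloat("hbase.regionserver.global.memstore.size.lower.limit", 0.5f);

        float memstore = HeapMemorySizeUtil.getGlobalMemStorePercent(conf, true);        // 0.5
        float lowerMark = HeapMemorySizeUtil.getGlobalMemStoreLowerMark(conf, memstore); // 0.5
        float blockCache = HeapMemorySizeUtil.getBlockCacheHeapPercent(conf);            // 0.3

        long heapBytes = 4L * 1024 * 1024 * 1024;  // HBASE_HEAPSIZE=4G
        System.out.printf("block cache    = %.2f GB%n", heapBytes * blockCache / (1L << 30));           // 1.20 GB
        System.out.printf("memstore upper = %.2f GB%n", heapBytes * memstore / (1L << 30));             // 2.00 GB
        System.out.printf("memstore lower = %.2f GB%n", heapBytes * memstore * lowerMark / (1L << 30)); // 1.00 GB

        // Throws a RuntimeException if memstore + block cache fractions would exceed the 0.8 ceiling
        HeapMemorySizeUtil.checkForClusterFreeMemoryLimit(conf);
        System.out.println("0.5 + 0.3 <= 0.8 -> configuration accepted");
    }
}

Because checkForClusterFreeMemoryLimit throws a RuntimeException when the two fractions exceed 0.8, this is a quick way to validate an hbase-site.xml change before rolling it out to the cluster.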
One more point worth noting: an HDFS client timeout can also make ZooKeeper believe that HBase writes are failing, so ZooKeeper abandons the regionserver and the node goes down. To address this, add the following property to both the Hadoop configuration file hdfs-site.xml and the HBase configuration file hbase-site.xml:
<property>
  <!-- Maximum socket timeout for the HDFS client; set it generously high, otherwise HBase will keep going down on this problem later on -->
  <name>dfs.client.socket-timeout</name>
  <value>600000</value>
</property>
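To double-check which timeout value the client code will actually see, a small sketch like the following can be used (the class name ReadClientSocketTimeout is illustrative; it assumes the edited hdfs-site.xml and hbase-site.xml are on the classpath, and the 60000 ms fallback is only an assumed default):

import org.apache.hadoop.conf.Configuration;

public class ReadClientSocketTimeout {
    public static void main(String[] args) {
        // Load both files the fix above edits; resources added later override earlier ones
        Configuration conf = new Configuration();
        conf.addResource("hdfs-site.xml");
        conf.addResource("hbase-site.xml");

        // 60000 ms is only a fallback guess in case neither file sets the key
        long timeoutMs = conf.getLong("dfs.client.socket-timeout", 60000L);
        System.out.println("dfs.client.socket-timeout = " + timeoutMs + " ms");
    }
}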