What are some tips for configuring HBase?
2022-06-13 17:28:51
Jeff Hammerbacher 8 endorsements
2 votes by Oleksiy Kovyrin and Alex Kamil
Much of this content is taken from the HBase Overview [1] and the HBase Default Configuration [2].
System
HDFS
HBase
Per-Cluster
Per-Table
Per-Family
Per-Region Server
Per-Store
[1] http://hbase.apache.org/docs/cur...
[2] http://hbase.apache.org/docs/cur...
[3] http://wiki.apache.org/hadoop/Hb...
[4] http://wiki.apache.org/hadoop/Hb...
[5] http://hbase.markmail.org/thread...
[6] http://svn.apache.org/viewvc/hba...
[7] http://hbase.apache.org/docs/cur...
System
- Increase the default per-process file handle limit [3] in
/etc/security/limits.conf
HDFS
- Set dfs.datanode.max.xcievers (the property name is misspelled this way in Hadoop itself) to 2047 [4] in
$HADOOP_HOME/conf/hdfs-site.xml
- Set dfs.datanode.socket.write.timeout to 0 [5]
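As a rough sketch, the two HDFS settings above might look like this in hdfs-site.xml. Hadoop releases of this vintage spell the first property name dfs.datanode.max.xcievers, with the "i" and "e" transposed, so check the name your Hadoop version expects before copying this.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Upper bound on concurrent DataNode transfer threads; 2047 per [4]. -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2047</value>
  </property>
  <!-- 0 per [5]; a value of 0 is understood to disable the write timeout. -->
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>0</value>
  </property>
</configuration>
```

The DataNodes need a restart before changes to this file take effect.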
HBase
- First, note that the default configuration values are stored at src/main/resources/hbase-default.xml [6] in the source tree
- For your site-specific configuration values, edit conf/hbase-site.xml
- Set hbase.rootdir to point to the directory in HDFS where HBase will put its data; e.g.
hdfs://localhost:9000/hbase
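A minimal conf/hbase-site.xml along these lines might look like the sketch below. The URI is the example value from above; the host and port are placeholders that would need to match your own NameNode address.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- HDFS directory where HBase stores its data (example value). -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
</configuration>
```

Anything not overridden here falls back to the value in hbase-default.xml.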
Per-Cluster
- hfile.block.cache.size controls the amount of region server heap space to devote to the block cache. Currently defaults to 20%.
Per-Table
- Max File Size: on clusters with lots of data, this can be raised to 1 GB so that the cluster ends up with fewer regions.
- MemStore Flush Size
Per-Family
- Compression
- Bloom filters
Per-Region Server
- hbase.regionserver.global.memstore.upperLimit caps the fraction of heap in each region server that is reserved for all MemStores [7] served by that region server. It defaults to 40% of the heap.
- hbase.hregion.memstore.flush.size is the threshold for deciding when to flush a single MemStore to disk. It defaults to 64 MB.
- hbase.hregion.memstore.block.multiplier controls when to start blocking writes to keep the MemStore size sane. It defaults to 2, meaning writes block once a MemStore reaches 2 × hbase.hregion.memstore.flush.size. For production clusters with lots of RAM that you monitor closely, you can raise it to something like 8.
- hbase.hregion.max.filesize determines how big a StoreFile is allowed to grow before splitting a region. Defaults to 256 MB.
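For illustration, the four region server settings above could be written out in conf/hbase-site.xml roughly as follows. The values shown are the defaults quoted above, with sizes expressed in bytes; they are a starting point for tuning, not a recommendation.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Fraction of region server heap reserved for all MemStores (40%). -->
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.4</value>
  </property>
  <!-- Flush a single MemStore at 64 MB (64 * 1024 * 1024 bytes). -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>67108864</value>
  </property>
  <!-- Block writes at 2 x flush size = 128 MB; 8 would give 512 MB. -->
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>2</value>
  </property>
  <!-- Split a region once a StoreFile passes 256 MB (268435456 bytes). -->
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>268435456</value>
  </property>
</configuration>
```

Raising hbase.hregion.max.filesize to 1073741824 corresponds to the 1 GB figure mentioned under Per-Table.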
Per-Store
- hbase.hstore.blockingStoreFiles determines the maximum number of StoreFiles per Store to allow before blocking writes and forcing a compaction. The default is 7, but in closely monitored production clusters it may make sense to raise it to 15.
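A sketch of the override in conf/hbase-site.xml, using the higher value suggested above for closely monitored clusters:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Block writes once a Store holds this many StoreFiles (default 7). -->
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>15</value>
  </property>
</configuration>
```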