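Installing HBase and Basic Usage
Introduction to HBase
HBase is an open-source, non-relational (NoSQL) distributed database modeled after Google's Bigtable and written in Java. It is part of the Apache Software Foundation's Hadoop project and runs on top of HDFS, giving Hadoop Bigtable-like storage capabilities. As a result, it can store very large amounts of sparse data in a fault-tolerant way.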
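Installing HBase
Installation environment
Three virtual machines: master, slave1, and slave2
Hadoop and ZooKeeper are already installed
Download the HBase package, choosing the release that matches your needs
    wget http://archive.apache.org/dist/hbase/0.98.24/hbase-0.98.24-hadoop2-bin.tar.gz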
You can also download it directly from a mirror site.
After the download finishes, extract the archive
    tar -zxvf hbase-0.98.24-hadoop2-bin.tar.gz
Add the HBase environment variables
    # Open ~/.bashrc
    vim ~/.bashrc
    # Append these two lines
    export HBASE_HOME=/usr/local/src/hbase-0.98.24-hadoop2
    export PATH=$PATH:$HBASE_HOME/bin
    # Save and exit, then reload the file
    source ~/.bashrc
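Configuring HBase
Open conf/hbase-env.sh under the HBase directory (create it if it does not exist)
    vim conf/hbase-env.sh
    # Add the following two settings
    # Java home
    export JAVA_HOME=/usr/local/src/jdk1.8.0_171
    # Whether HBase manages its bundled ZooKeeper: false if you installed your own, true to use the bundled one
    export HBASE_MANAGES_ZK=false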
Configure the hbase-site.xml file
    vim conf/hbase-site.xml
Add the following configuration
    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://master:9000/hbase</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>master,slave1,slave2</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
    </configuration>
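If you maintain several clusters, the property list above can also be generated rather than typed by hand. The following is a minimal sketch using only the Python standard library; the helper name build_hbase_site and the SITE_PROPERTIES dictionary are made up for illustration, and the values are simply the four settings from this article.

```python
import xml.etree.ElementTree as ET


def build_hbase_site(properties):
    """Return hbase-site.xml content for a dict of property name -> value."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# The four settings used in this article (hypothetical variable name).
SITE_PROPERTIES = {
    "hbase.rootdir": "hdfs://master:9000/hbase",
    "hbase.cluster.distributed": "true",
    "hbase.zookeeper.quorum": "master,slave1,slave2",
    "dfs.replication": "2",
}

xml_text = build_hbase_site(SITE_PROPERTIES)
print(xml_text)
```

The output is a single unindented line of XML, which HBase reads the same way as the hand-formatted version; write it to conf/hbase-site.xml if you want to use it.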
Edit the regionservers file and list the nodes that should run a RegionServer, one per line
    vim conf/regionservers
File contents:
    slave1
    slave2
At this point a basic HBase environment is in place
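Starting HBase
Hadoop and ZooKeeper must be running before HBase is started
Start Hadoop
On the master node
    # From the sbin directory under the Hadoop installation
    ./start-all.sh
Check whether Hadoop started successfully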
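On the master node, run jps to list the Java processes; if the Hadoop daemons appear, the startup succeeded
On the slave nodes, check with jps in the same way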
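Start ZooKeeper
Run this on the master and on every slave node, from the bin directory under the ZooKeeper installation
    ./zkServer.sh start
Then check with jps; if QuorumPeerMain is listed, ZooKeeper started successfully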
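Start HBase
Once Hadoop and ZooKeeper are both running, HBase can be started. Go to the bin directory under the HBase installation
    ./start-hbase.sh
Check with jps: an HMaster process on master and an HRegionServer process on each slave node mean HBase started successfully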
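Checking jps by eye on three machines gets tedious, so the check can be scripted. The sketch below parses jps-style text and reports which expected process names are missing; the function name missing_processes and the sample output (including its PIDs) are made up for illustration.

```python
def missing_processes(jps_output, expected):
    """Return the expected process names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line looks like "<pid> <MainClass>"; skip anything else.
        if len(parts) >= 2:
            running.add(parts[1])
    return sorted(set(expected) - running)


# Hypothetical jps output from the master node (PIDs are invented).
sample = """\
2481 QuorumPeerMain
2733 HMaster
2950 Jps
"""

print(missing_processes(sample, ["HMaster", "QuorumPeerMain"]))  # []
print(missing_processes(sample, ["HRegionServer"]))  # ['HRegionServer']
```

On a real node you would pass in the actual output, for example from subprocess.check_output(["jps"], text=True).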
You can also check through the HBase web UI in a browser.
Basic HBase shell commands
To enter the shell, run the following from the bin directory
    ./hbase shell
    hbase(main):001:0>
- List all tables
    hbase(main):003:0> list
    TABLE
    0 row(s) in 0.1510 seconds
    => []
- Create a table
    hbase(main):006:0> create 'test_table', 'mate_data', 'action'
    0 row(s) in 2.4390 seconds
    => Hbase::Table - test_table
- Describe the table
    hbase(main):009:0> desc 'test_table'
    Table test_table is ENABLED
    test_table
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'action', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'mate_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    2 row(s) in 0.0520 seconds
- Add a column family
    hbase(main):010:0> alter 'test_table', {NAME => 'new', VERSIONS => '2', IN_MEMORY => 'true'}
    Updating all regions with the new schema...
    0/1 regions updated.
    1/1 regions updated.
    Done.
    0 row(s) in 2.2790 seconds
    hbase(main):011:0> desc 'test_table'
    Table test_table is ENABLED
    test_table
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'action', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'mate_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'new', BLOOMFILTER => 'ROW', VERSIONS => '2', IN_MEMORY => 'true', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    3 row(s) in 0.0570 seconds
- Delete a column family
    hbase(main):013:0> alter 'test_table', {NAME => 'new', METHOD => 'delete'}
    Updating all regions with the new schema...
    0/1 regions updated.
    1/1 regions updated.
    Done.
    0 row(s) in 2.2390 seconds
    hbase(main):014:0> desc 'test_table'
    Table test_table is ENABLED
    test_table
    COLUMN FAMILIES DESCRIPTION
    {NAME => 'action', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    {NAME => 'mate_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    2 row(s) in 0.0430 seconds
- Drop a table
    # A table must be disabled first
    hbase(main):016:0> disable 'test_table'
    0 row(s) in 1.2980 seconds
    # Then it can be dropped
    hbase(main):017:0> drop 'test_table'
    0 row(s) in 0.2020 seconds
    # Confirm that it is gone
    hbase(main):018:0> list
    TABLE
    0 row(s) in 0.0070 seconds
    => []
- Write data to the table and view it (re-create test_table first if you dropped it in the previous step)
    hbase(main):021:0> put 'test_table', '1001', 'mate_data:name', 'zhangsan'
    0 row(s) in 0.1400 seconds
    hbase(main):022:0> put 'test_table', '1002', 'mate_data:name', 'lisi'
    0 row(s) in 0.0110 seconds
    hbase(main):023:0> put 'test_table', '1001', 'mate_data:gender', 'woman'
    0 row(s) in 0.0170 seconds
    hbase(main):024:0> put 'test_table', '1002', 'mate_data:age', '25'
    0 row(s) in 0.0140 seconds
    hbase(main):025:0> scan 'test_table'
    ROW      COLUMN+CELL
    1001     column=mate_data:gender, timestamp=1540034584363, value=woman
    1001     column=mate_data:name, timestamp=1540034497293, value=zhangsan
    1002     column=mate_data:age, timestamp=1540034603800, value=25
    1002     column=mate_data:name, timestamp=1540034519659, value=lisi
    2 row(s) in 0.0410 seconds
- Read data
    hbase(main):026:0> get 'test_table', '1001'
    COLUMN               CELL
    mate_data:gender     timestamp=1540034584363, value=woman
    mate_data:name       timestamp=1540034497293, value=zhangsan
    2 row(s) in 0.0340 seconds
    hbase(main):027:0> get 'test_table', '1001', 'mate_data:name'
    COLUMN               CELL
    mate_data:name       timestamp=1540034497293, value=zhangsan
    1 row(s) in 0.0320 seconds
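The put, scan, and get sessions above reflect how HBase organizes data: each row key maps to a sparse set of family:qualifier columns, and each cell keeps a limited number of timestamped versions. The toy model below mimics that behaviour in memory. It is only an illustration of the data model, not how HBase is implemented, and the TinyTable class is invented for this sketch.

```python
import time
from collections import defaultdict


class TinyTable:
    """Toy in-memory model of an HBase table: row -> column -> [(ts, value)]."""

    def __init__(self, families, max_versions=1):
        self.families = set(families)
        self.max_versions = max_versions
        self.rows = defaultdict(dict)

    def put(self, row, column, value, ts=None):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError("unknown column family: " + family)
        if ts is None:
            ts = int(time.time() * 1000)
        versions = self.rows[row].setdefault(column, [])
        versions.append((ts, value))
        # Newest first; keep at most max_versions entries per cell.
        versions.sort(reverse=True)
        del versions[self.max_versions:]

    def get(self, row, column=None):
        """Return the latest value of each column (or of one column) in a row."""
        cells = self.rows.get(row, {})
        latest = {col: vers[0][1] for col, vers in cells.items()}
        if column is None:
            return latest
        return {column: latest[column]} if column in latest else {}


table = TinyTable(["mate_data", "action"])
table.put("1001", "mate_data:name", "zhangsan")
table.put("1002", "mate_data:name", "lisi")
table.put("1001", "mate_data:gender", "woman")
table.put("1002", "mate_data:age", "25")

print(table.get("1001"))
print(table.get("1001", "mate_data:name"))
print(len(table.rows))  # number of distinct row keys, like count
```

Rows 1001 and 1002 hold different columns and nothing is stored for the missing ones, which is what "sparse" means here. Raising max_versions corresponds to the VERSIONS setting on a column family.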
- Count rows
    hbase(main):028:0> count 'test_table'
    2 row(s) in 0.0390 seconds
    => 2
- Clear all data in a table
    hbase(main):029:0> truncate 'test_table'
    Truncating 'test_table' table (it may take a while):
     - Disabling table...
     - Truncating table...
    0 row(s) in 1.5220 seconds
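Operating HBase from Python scripts
In this setup a Python script does not talk to HBase directly; it goes through the Thrift service as a middle layer. That requires two Python modules, hbase and thrift, plus an installation of Thrift itself. The HBase Thrift server must also be running on the host the scripts connect to (for example, started with hbase-daemon.sh start thrift; it listens on port 9090 by default, the port used in the scripts below).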
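Because everything goes through the Thrift service, a script that cannot connect is often just a sign that the Thrift server is not listening. A quick reachability check using only the standard library might look like this; the function name thrift_port_open is invented for this sketch, and the host and port are the ones used in this article.

```python
import socket


def thrift_port_open(host, port=9090, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # 'master' and 9090 are the host and port used by the scripts in this article.
    print(thrift_port_open("master", 9090))
```

This only shows that something accepts TCP connections on the port; it does not prove that the listener is the HBase Thrift server.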
Install Thrift and obtain the thrift module
- Download and install Thrift
    wget http://archive.apache.org/dist/thrift/0.11.0/thrift-0.11.0.tar.gz
    tar -zxvf thrift-0.11.0.tar.gz
    cd thrift-0.11.0/
    ./configure
    make
    make install
    cd lib/py/build/lib.linux-x86_64-2.7
The thrift module is in this directory
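Obtain the hbase module
- Download the HBase source package
    wget http://archive.apache.org/dist/hbase/0.98.24/hbase-0.98.24-src.tar.gz
    tar -zxvf hbase-0.98.24-src.tar.gz
- Generate the hbase module
    # Go to the directory that contains the Thrift definition
    cd /usr/local/src/hbase-0.98.24/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift
    # Run the following command to produce the gen-py directory
    thrift --gen py Hbase.thrift
    # The generated hbase module is inside gen-py
    cd gen-py
Make sure both the thrift and hbase module directories can be imported by your scripts, for example by copying them into the directory that holds the scripts.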
Writing data with Python
- Create a table
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import *

    # Connect to the HBase Thrift server on master:9090
    transport = TSocket.TSocket('master', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    # Create a table with two column families, keeping one version per cell
    base_info_contents = ColumnDescriptor(name='columnname1', maxVersions=1)
    other_info_contents = ColumnDescriptor(name='columnname2', maxVersions=1)
    client.createTable('tablename', [base_info_contents, other_info_contents])

    transport.close()
- Insert data
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import *

    transport = TSocket.TSocket('master', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    table_name = 'tablename'
    rowkey = 'rowkeyname'
    # The column is "family:qualifier"; the family must already exist in the table
    mutations = [Mutation(column="columnname1:columnpro", value="valuename")]
    client.mutateRow(table_name, rowkey, mutations, None)

    transport.close()
- Read data
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import *

    transport = TSocket.TSocket('master', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    table_name = 'tablename'
    rowkey = 'rowkeyname'
    # Fetch a single row by its row key
    result = client.getRow(table_name, rowkey, None)
    for row_result in result:
        print("the row is " + row_result.row)
        for k, v in row_result.columns.items():
            print('\t'.join([k, v.value]))

    transport.close()
- Scan the table
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import *

    transport = TSocket.TSocket('master', 9090)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)
    transport.open()

    table_name = 'tablename'
    # Open a scanner over the whole table and read up to 10 rows
    scan = TScan()
    scanner_id = client.scannerOpenWithScan(table_name, scan, None)
    result = client.scannerGetList(scanner_id, 10)
    for row_result in result:
        print("=========")
        print("the row is " + row_result.row)
        for k, v in row_result.columns.items():
            print('\t'.join([k, v.value]))

    client.scannerClose(scanner_id)
    transport.close()
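All of the scripts above repeat the same open-the-transport step, and it is easy to leave a connection open when an exception interrupts the work. One possible tidy-up is a small context manager that opens a transport and always closes it. The sketch below is an assumption-laden illustration: the opened helper and the FakeTransport stand-in are invented here so the example runs without a Thrift server; with the real library you would pass in the TBufferedTransport object instead.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open a transport-like object and always close it afterwards."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


class FakeTransport:
    """Stand-in for a Thrift transport so the sketch runs without a server."""

    def __init__(self):
        self.events = []

    def open(self):
        self.events.append("open")

    def close(self):
        self.events.append("close")


fake = FakeTransport()
with opened(fake):
    fake.events.append("work")  # real code would call client methods here
print(fake.events)  # ['open', 'work', 'close']
```

The close call runs even if the body of the with block raises, which keeps connections from piling up on the Thrift server during repeated test runs.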