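Basic HBase Database Operations
Reference for this post: 数据酷客, "Hadoop基础.Hbase的Shell命令" (Hadoop Basics: HBase Shell Commands)
Last month I wrote a post on basic Hive data warehouse operations, and after all this time I still haven't gotten around to reviewing it. Today I learned a whole pile of HBase operations, so to avoid mixing the two up and to make later review and lookup quick, I'm writing another post, this time on the basics of the HBase shell. My memory isn't great, so I have to write things down.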
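Command | Purpose |
---|---|
create | Create a table |
desc | Show table information |
put | Insert data |
get | Query data by row key |
scan | Query data by scanning the table |
alter | Modify a table |
truncate | Empty a table |
drop | Delete a table |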
… | … |
… | … |
… | … |
Once HBase and the services it depends on are all running, type hbase shell to enter the HBase client.
[root@namenode opt]# hbase shell
Type help to see the HBase shell commands
hbase(main):001:0> help
HBase Shell, version 1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
......
There are many commands; this post lists only some of the basic ones
1. create: create a table
Type help 'create' to see the syntax for creating a table
hbase(main):002:0> help 'create'
......
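Basic syntax: create 'table name' , 'column family name'
An aside: you may run into a znode data == null error. One cause is that the user running HBase cannot write files to ZooKeeper, which leaves the znode empty
Fix: specify ZooKeeper's data directory in hbase-site.xml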
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/data.zk</value>
</property>
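When creating an HBase table, you specify only the table name and the column family names, not column names or types
For example, create a table named student1 with one column family, cf1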
hbase(main):003:0> create 'student1' , 'cf1'
0 row(s) in 3.4160 seconds
=> Hbase::Table - student1
Create a table named student2 with two column families, cf1 and cf2
hbase(main):011:0> create 'student2' , 'cf1' , 'cf2'
0 row(s) in 2.2410 seconds
=> Hbase::Table - student2
Type list to see the tables in HBase
hbase(main):016:0> list
TABLE
student1
student2
2 row(s) in 0.0080 seconds
=> ["student1", "student2"]
2. desc: show table information
Use desc 'table name' to see the details of table student1
hbase(main):018:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0590 seconds
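The content inside {} describes the column family. The main attributes are explained below
Attribute | Meaning |
---|---|
NAME | Column family name |
VERSIONS | Maximum number of versions kept per cell |
IN_MEMORY | Whether the family's data gets in-memory priority in the block cache |
TTL | Time to live: how long a cell is kept before it expires (FOREVER means it never expires) |
BLOCKSIZE | HFile block size in bytes (65536 = 64 KB) |
REPLICATION_SCOPE | Replication scope (0 means the family is not replicated to other clusters) |
Now look at table student2. It has two column families, so there are two {} entries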
hbase(main):019:0> desc 'student2'
Table student2 is ENABLED
student2
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.2050 seconds
3. put: insert data
put syntax: put 'table name' , 'row key' , 'column family:column name' , 'value'
Insert the record (name: July, age: 18, grade: 98, sex: M) into table student1 under row key 001
hbase(main):020:0> put 'student1' , '001' , 'cf1:name' , 'July'
0 row(s) in 0.2070 seconds
hbase(main):021:0> put 'student1' , '001' , 'cf1:age' , '18'
0 row(s) in 0.0240 seconds
hbase(main):022:0> put 'student1' , '001' , 'cf1:grade' , '98'
0 row(s) in 0.0210 seconds
hbase(main):023:0> put 'student1' , '001' , 'cf1:sex' , 'M'
0 row(s) in 0.0080 seconds
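The four puts above write columns (name, age, grade, sex) that were never declared when student1 was created; only the column family cf1 was. The rule can be sketched in plain Ruby as a toy in-memory model. ToyTable is a hypothetical class for illustration only, not part of the HBase client API, and the error it raises is a stand-in for the error HBase itself reports when a put names an undeclared column family.

```ruby
# Toy in-memory model of an HBase table (illustration only, not the HBase API).
class ToyTable
  def initialize(name, *families)
    @name = name
    @families = families                      # column families are fixed at create time
    @rows = Hash.new { |hash, key| hash[key] = {} }
  end

  # column is 'family:qualifier'; the qualifier needs no prior declaration.
  def put(row_key, column, value)
    family, _qualifier = column.split(':', 2)
    unless @families.include?(family)
      raise ArgumentError, "unknown column family #{family} in table #{@name}"
    end
    @rows[row_key][column] = value
  end

  # Returns the row's cells sorted by column name, as the shell does.
  def get(row_key)
    @rows.fetch(row_key, {}).sort.to_h
  end
end

student1 = ToyTable.new('student1', 'cf1')
student1.put('001', 'cf1:name', 'July')
student1.put('001', 'cf1:age', '18')
student1.get('001').each { |column, value| puts "#{column} value=#{value}" }
```

Any qualifier under cf1 is accepted, while a put to a family such as cf9 is rejected, which is why new column families have to be added with alter first.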
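Open namenode:50070 in a browser to browse HDFS
Data inserted into HBase is stored in HDFS (a file shows up there once the in-memory data has been flushed to disk)
The storage path is: /hbase/data/default/table name/region ID/column family/HFile name
default is the default namespace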
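To make the path layout concrete, here is a minimal Ruby sketch that assembles the directory for one column family. The helper name family_dir is hypothetical, the root /hbase and namespace default are taken from the path above and may differ on a cluster configured otherwise, and REGION_ID is only a placeholder because real region directory names are generated by HBase.

```ruby
# Build the HDFS directory that holds one column family's HFiles.
# root and namespace default to the values seen in the path above.
def family_dir(table, region, family, namespace: 'default', root: '/hbase')
  File.join(root, 'data', namespace, table, region, family)
end

# 'REGION_ID' is a placeholder, not a real region name.
puts family_dir('student1', 'REGION_ID', 'cf1')
# prints /hbase/data/default/student1/REGION_ID/cf1
```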
4. get: query data
Basic syntax: get 'table name' , 'row key'
Fetch the data in table student1 whose row key is 001
hbase(main):025:0> get 'student1' , '001'
COLUMN CELL
cf1:age timestamp=1589278473070, value=18
cf1:grade timestamp=1589278487224, value=98
cf1:name timestamp=1589278459460, value=July
cf1:sex timestamp=1589278496165, value=M
4 row(s) in 0.0760 seconds
(Here, timestamp is the time at which the value was written, and value is the stored value)
Query the name column in column family cf1 of row 001 in table student1
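The timestamps in the output are milliseconds since the Unix epoch. A small Ruby sketch for turning one into a readable time follows; the helper name cell_time is hypothetical, and the result is shown in UTC rather than the cluster's local time zone.

```ruby
# Convert an HBase cell timestamp (milliseconds since the Unix epoch) to UTC.
def cell_time(timestamp_ms)
  seconds, millis = timestamp_ms.divmod(1000)
  Time.at(seconds, millis * 1000).utc   # the second argument is microseconds
end

puts cell_time(1589278473070).strftime('%Y-%m-%d %H:%M:%S.%L UTC')
# prints 2020-05-12 10:14:33.070 UTC
```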
hbase(main):026:0> get 'student1' , '001' , 'cf1:name'
COLUMN CELL
cf1:name timestamp=1589278459460, value=July
1 row(s) in 0.0210 seconds
Change the value of the name column in table student1 from July to Mary, then query the result
hbase(main):027:0> put 'student1' , '001' , 'cf1:name' , 'Mary'
0 row(s) in 0.0220 seconds
hbase(main):028:0> get 'student1' , '001' , 'cf1:name'
COLUMN CELL
cf1:name timestamp=1589279637308, value=Mary
1 row(s) in 0.0150 seconds
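(In HBase, VERSIONS defaults to 1 for a column family, which means each cell keeps only its newest value: a later put to the same cell replaces the earlier one!)
To set VERSIONS when creating a table: create 'table name' , {NAME => 'column family name' , VERSIONS => 'number of versions'}
Create table student3 with VERSIONS set to 3 for column family cf1, then check the result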
hbase(main):031:0> create 'student3' , {NAME => 'cf1' , VERSIONS => '3'}
0 row(s) in 8.8800 seconds
=> Hbase::Table - student3
hbase(main):032:0> desc 'student3'
Table student3 is ENABLED
student3
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0410 seconds
Write to the same column of table student3 three times, then query the result
hbase(main):033:0> put 'student3' , '001' , 'cf1:name' , 'July'
0 row(s) in 0.0470 seconds
hbase(main):034:0> put 'student3' , '001' , 'cf1:name' , 'Tom'
0 row(s) in 0.0120 seconds
hbase(main):035:0> put 'student3' , '001' , 'cf1:name' , 'Mary'
0 row(s) in 0.0090 seconds
hbase(main):039:0> get 'student3' , '001' , {COLUMN => 'cf1:name' ,VERSIONS => 3}
COLUMN CELL
cf1:name timestamp=1589280435047, value=Mary
cf1:name timestamp=1589280430294, value=Tom
cf1:name timestamp=1589280423248, value=July
3 row(s) in 0.2580 seconds
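The effect of VERSIONS can be sketched with a toy model of a single cell. VersionedCell is a hypothetical class for illustration, not HBase code, and it trims surplus versions immediately on put for simplicity, whereas HBase discards them lazily (for example during compaction).

```ruby
# Toy model of one cell with a VERSIONS limit (not HBase's storage engine).
class VersionedCell
  def initialize(max_versions)
    @max_versions = max_versions
    @versions = {}                            # timestamp => value
  end

  def put(value, timestamp)
    @versions[timestamp] = value
    # Keep only the newest max_versions entries.
    newest = @versions.keys.sort.last(@max_versions)
    @versions = @versions.slice(*newest)
  end

  # Returns up to `versions` values, newest first, as in the shell output.
  def get(versions = 1)
    @versions.sort.reverse.first(versions).map { |_timestamp, value| value }
  end
end

name = VersionedCell.new(3)
name.put('July', 1589280423248)
name.put('Tom',  1589280430294)
name.put('Mary', 1589280435047)
p name.get(3)   # prints ["Mary", "Tom", "July"]
p name.get      # prints ["Mary"]
```

With VersionedCell.new(1), the same three puts would leave only Mary, which matches the student1 behavior shown earlier.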
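5. scan: query data
A get query requires a row key; it cannot query a column directly across rows
Use scan to query a specific column of a table
Syntax: scan 'table name' , {COLUMN => 'column family:column name' , VERSIONS => number of versions}
Query the name column of table student3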
hbase(main):005:0> scan 'student3' , {COLUMN => 'cf1:name' , VERSIONS => 3}
ROW COLUMN+CELL
001 column=cf1:name, timestamp=1589280435047, value=Mary
001 column=cf1:name, timestamp=1589280430294, value=Tom
001 column=cf1:name, timestamp=1589280423248, value=July
1 row(s) in 0.1310 seconds
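The difference from get can be sketched in Ruby: a scan walks the rows in row-key order and keeps only the requested column, so no row key has to be supplied. The scan function below is a toy stand-in rather than the HBase API, and rows 002 and 003 are made-up data added only to show the ordering and filtering.

```ruby
# Toy scan: visit rows in row-key order and keep only the requested column.
def scan(rows, column)
  rows.sort.each_with_object([]) do |(row_key, cells), out|
    out.push([row_key, cells[column]]) if cells.key?(column)
  end
end

rows = {
  '002' => { 'cf1:name' => 'Tom' },                    # hypothetical row
  '001' => { 'cf1:name' => 'Mary', 'cf1:age' => '18' },
  '003' => { 'cf1:age' => '20' }                       # hypothetical row, no name column
}
p scan(rows, 'cf1:name')   # prints [["001", "Mary"], ["002", "Tom"]]
```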
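6. alter: modify a table
alter can add a column family to a table
Syntax: alter 'table name' , NAME => 'column family name' , VERSIONS => number of versions
Add column family cf2 to table student1 with VERSIONS set to 3, then check the result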
hbase(main):007:0> alter 'student1' , NAME => 'cf2' , VERSIONS => 3
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 5.0950 seconds
hbase(main):008:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0500 seconds
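alter can also delete data from a table, but only a whole column family at a time
Syntax:
alter 'table name' , NAME => 'column family' , METHOD => 'delete' , or alternatively
alter 'table name' , 'delete' => 'column family'
Delete column family cf2 from table student1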
hbase(main):009:0> alter 'student1' , 'delete' => 'cf2'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.7580 seconds
hbase(main):010:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0280 seconds
7. truncate: empty a table
Syntax: truncate 'table name'
Empty the data in table student3
hbase(main):011:0> truncate 'student3'
Truncating 'student3' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 4.7220 seconds
hbase(main):014:0> get 'student3' , '001'
COLUMN CELL
0 row(s) in 0.2540 seconds
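When emptying a table, the system automatically disables the table first and then clears the data
Once the data has been cleared, the system automatically makes the table available again
Use is_enabled 'table name' to check whether a table is available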
hbase(main):015:0> is_enabled 'student3'
true
0 row(s) in 0.0190 seconds
8. drop: delete a table
Syntax: drop 'table name'
An enabled HBase table cannot be dropped directly
hbase(main):016:0> drop 'student2'
ERROR: Table student2 is enabled. Disable it first.
Here is some help for this command:
Drop the named table. Table must first be disabled:
hbase> drop 't1'
hbase> drop 'ns1:t1'
Before dropping a table, you must disable it first
Syntax: disable 'table name'
hbase(main):017:0> disable 'student2'
0 row(s) in 2.3580 seconds
hbase(main):018:0> drop 'student2'
0 row(s) in 1.3430 seconds
hbase(main):019:0> list
TABLE
student1
student3
2 row(s) in 0.0090 seconds
=> ["student1", "student3"]
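The disable-then-drop rule amounts to a small state check, sketched here in plain Ruby. ToyCatalog is a hypothetical class for illustration, not the HBase admin API; the error text simply mirrors the shell message shown above.

```ruby
# Toy model of the enabled/disabled state that guards drop (not the HBase API).
class ToyCatalog
  def initialize
    @enabled = {}                             # table name => true/false
  end

  def create(name)
    @enabled[name] = true                     # new tables start out enabled
  end

  def disable(name)
    @enabled[name] = false
  end

  def drop(name)
    raise "Table #{name} is enabled. Disable it first." if @enabled.fetch(name)
    @enabled.delete(name)
  end

  def list
    @enabled.keys.sort
  end
end

catalog = ToyCatalog.new
%w[student1 student2 student3].each { |table| catalog.create(table) }
catalog.disable('student2')
catalog.drop('student2')
p catalog.list   # prints ["student1", "student3"]
```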
…
…
…
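I will keep updating this post as I come across other commands
If you spot any mistakes (including in my earlier posts), feel free to send me a message
There is always more to learn in tech! Thanks for your support!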