
Basic HBase Database Operations

程序员文章站 2022-04-29 08:54:04

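Reference for this article: 数据酷客, <Hadoop基础.Hbase的Shell命令> (Hadoop Basics: HBase Shell Commands)
Last month I wrote a post on basic Hive data warehouse operations, and after all this time I still haven't gotten around to reviewing it. Today I learned a whole batch of HBase operations as well. To avoid mixing the two up, and to make later review and lookup faster, I'm writing another post on basic HBase Shell operations. My memory isn't great, so I have to write things down.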

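Command    Purpose
create     Create a table
desc       Show table information
put        Insert data
get        Query data (one row, by row key)
scan       Query data (by scanning the table)
alter      Modify a table
truncate   Empty a table
drop       Delete a table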

After making sure HBase and the services it depends on are all running, type hbase shell to enter the HBase client.

[root@namenode opt]# hbase shell

Type help to see the HBase shell commands

hbase(main):001:0> help
HBase Shell, version 1.2.6, rUnknown, Mon May 29 02:25:32 CDT 2017
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
......

There are many commands; this article lists only some of the basic ones.

1. create: create a table

Type help 'create' to see the table-creation syntax

hbase(main):002:0> help 'create'
......

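Basic statement: create 'table name', 'column family name'
An aside: you may run into a "znode data == null" error. This can happen when the user running HBase cannot write files to ZooKeeper's data directory, which leaves the znode empty.
Fix: specify the ZooKeeper data directory in hbase-site.xml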

<property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/opt/data.zk</value>
</property>

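When you create a table in HBase, you specify only the table name and the column family names, not the column names or types.
For example, create a table named student1 with a column family named cf1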

hbase(main):003:0> create 'student1' , 'cf1'
0 row(s) in 3.4160 seconds

=> Hbase::Table - student1

Create a table named student2 with column families cf1 and cf2

hbase(main):011:0> create 'student2' , 'cf1' , 'cf2'
0 row(s) in 2.2410 seconds

=> Hbase::Table - student2

Type the list command to see the tables in HBase

hbase(main):016:0> list
TABLE
student1
student2
2 row(s) in 0.0080 seconds

=> ["student1", "student2"]

2. desc: show table information

Use desc 'table name' to view the details of table student1

hbase(main):018:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0590 seconds

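The content inside {} describes the column family. Some of the fields are explained below

Field               Meaning
NAME                Column family name
VERSIONS            Maximum number of versions kept per cell
IN_MEMORY           Whether this family's blocks get in-memory priority in the block cache
TTL                 Time to live: how long a cell is kept before it expires (FOREVER means it never expires)
BLOCKSIZE           HFile block size for this family, in bytes
REPLICATION_SCOPE   Replication scope (0 means the family is not replicated to other clusters)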

View the information for table student2. It has two column families, so there are two {} blocks

hbase(main):019:0> desc 'student2'
Table student2 is ENABLED
student2
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.2050 seconds

3. put: insert data

put command: put 'table name', 'row key', 'column family:column name', 'value'
Insert the record (name: July, age: 18, grade: 98, sex: male, stored as M) into table student1

hbase(main):020:0> put 'student1' , '001' , 'cf1:name' , 'July'
0 row(s) in 0.2070 seconds

hbase(main):021:0> put 'student1' , '001' , 'cf1:age' , '18'
0 row(s) in 0.0240 seconds

hbase(main):022:0> put 'student1' , '001' , 'cf1:grade' , '98'
0 row(s) in 0.0210 seconds

hbase(main):023:0> put 'student1' , '001' , 'cf1:sex' , 'M'
0 row(s) in 0.0080 seconds
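
Each put writes exactly one cell, so a record with four fields needs four put commands. As a rough illustration, the Python sketch below builds those shell lines from a dictionary. The helper name put_commands is made up for this example, it does not talk to HBase at all, and it does no quoting or escaping of values.

```python
def put_commands(table, row_key, family, record):
    """Build one HBase shell `put` line per field of a record.

    Hypothetical helper for illustration: it only formats text and
    assumes the values contain no single quotes.
    """
    return [
        f"put '{table}', '{row_key}', '{family}:{column}', '{value}'"
        for column, value in record.items()
    ]


record = {"name": "July", "age": 18, "grade": 98, "sex": "M"}
commands = put_commands("student1", "001", "cf1", record)
for line in commands:
    print(line)
```

The printed lines can be pasted into the HBase shell one after another, which has the same effect as the four put commands typed by hand above.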

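In a browser, open namenode:50070
The data inserted into HBase is stored in HDFS (the files show up once the in-memory writes have been flushed to disk)
Storage path: /hbase/data/default/table/region ID/column family/HDFS file name
default is the default namespace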
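
To make the path layout concrete, here is a small Python sketch that assembles the directory for one column family. The function name hfile_dir and the region value "REGION_ID" are placeholders invented for this example; real region IDs are generated by HBase, and the /hbase root is taken from the path shown above.

```python
from posixpath import join


def hfile_dir(table, region, family, namespace="default", root="/hbase"):
    """Return the HDFS directory that holds one column family's files.

    Follows the layout root/data/namespace/table/region/family.
    """
    return join(root, "data", namespace, table, region, family)


# "REGION_ID" is a placeholder, not a real region identifier.
path = hfile_dir("student1", "REGION_ID", "cf1")
print(path)  # /hbase/data/default/student1/REGION_ID/cf1
```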

4. get: query data

Basic command: get 'table name', 'row key'
Get the data in table student1 whose row key is 001

hbase(main):025:0> get 'student1' , '001'
COLUMN                                              CELL
 cf1:age                                            timestamp=1589278473070, value=18
 cf1:grade                                          timestamp=1589278487224, value=98
 cf1:name                                           timestamp=1589278459460, value=July
 cf1:sex                                            timestamp=1589278496165, value=M
4 row(s) in 0.0760 seconds

Here, timestamp is the time at which the data was written and value is the corresponding value
Query the name column of column family cf1 in row 001 of table student1

hbase(main):026:0> get 'student1' , '001' , 'cf1:name'
COLUMN                                              CELL
 cf1:name                                           timestamp=1589278459460, value=July
1 row(s) in 0.0210 seconds

Change the value of the name column in table student1 from July to Mary, then query the result

hbase(main):027:0> put 'student1' , '001' , 'cf1:name' , 'Mary'
0 row(s) in 0.0220 seconds

hbase(main):028:0> get 'student1' , '001' , 'cf1:name'
COLUMN                                              CELL
 cf1:name                                           timestamp=1589279637308, value=Mary
1 row(s) in 0.0150 seconds

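In HBase, a column family's VERSIONS defaults to 1, which means each column keeps only one value: a newly inserted value replaces the previous one!
To set VERSIONS when creating a table: create 'table name', {NAME => 'column family name', VERSIONS => 'number of versions'}
Create table student3 with VERSIONS set to 3 for column family cf1, then check the result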

hbase(main):031:0> create 'student3' , {NAME => 'cf1' , VERSIONS => '3'}
0 row(s) in 8.8800 seconds

=> Hbase::Table - student3

hbase(main):032:0> desc 'student3'
Table student3 is ENABLED
student3
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0410 seconds

Insert data into the same column of table student3 three times, then query the result

hbase(main):033:0> put 'student3' , '001' , 'cf1:name' , 'July'
0 row(s) in 0.0470 seconds

hbase(main):034:0> put 'student3' , '001' , 'cf1:name' , 'Tom'
0 row(s) in 0.0120 seconds

hbase(main):035:0> put 'student3' , '001' , 'cf1:name' , 'Mary'
0 row(s) in 0.0090 seconds

hbase(main):039:0> get 'student3' , '001' , {COLUMN => 'cf1:name' ,VERSIONS => 3}
COLUMN                                              CELL
 cf1:name                                           timestamp=1589280435047, value=Mary
 cf1:name                                           timestamp=1589280430294, value=Tom
 cf1:name                                           timestamp=1589280423248, value=July
3 row(s) in 0.2580 seconds
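
The effect of VERSIONS can be mimicked with a toy in-memory model. The Python sketch below is not the HBase API: the class VersionedCell is invented for this example, a simple counter stands in for the write timestamp, and old versions are dropped immediately on write, which is a simplification of what a real cluster does.

```python
import itertools


class VersionedCell:
    """Toy model of one cell that keeps at most `max_versions` values."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self._versions = []               # (timestamp, value), newest first
        self._clock = itertools.count(1)  # stand-in for the write timestamp

    def put(self, value):
        self._versions.insert(0, (next(self._clock), value))
        del self._versions[self.max_versions:]  # drop versions over the limit

    def get(self, versions=1):
        """Return up to `versions` values, newest first."""
        return [value for _, value in self._versions[:versions]]


single = VersionedCell(max_versions=1)  # like cf1 in student1
triple = VersionedCell(max_versions=3)  # like cf1 in student3
for name in ["July", "Tom", "Mary"]:
    single.put(name)
    triple.put(name)

print(single.get(versions=3))  # ['Mary']
print(triple.get(versions=3))  # ['Mary', 'Tom', 'July']
```

With a limit of 1 only the latest value survives, while a limit of 3 returns all three values newest first, matching the order of the get output above.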

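5. scan: query data

When querying with get, you must supply a row key; you cannot query a column directly
Use scan to query a specified column of a table
Command: scan 'table name', {COLUMN => 'column family:column name', VERSIONS => number of versions}
Query the name column in table student3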

hbase(main):005:0> scan 'student3' , {COLUMN => 'cf1:name' , VERSIONS => 3}
ROW                                                 COLUMN+CELL
 001                                                column=cf1:name, timestamp=1589280435047, value=Mary
 001                                                column=cf1:name, timestamp=1589280430294, value=Tom
 001                                                column=cf1:name, timestamp=1589280423248, value=July
1 row(s) in 0.1310 seconds
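
The difference between get and scan can be sketched with a plain dictionary keyed by row key. This is only a toy model, not the HBase client API: the functions get and scan below are written for this example, and rows 002 and 003 are made-up data that does not appear in the tables above.

```python
# Toy table: row key -> {"family:column": value}. Rows 002 and 003 are invented.
table = {
    "001": {"cf1:name": "Mary", "cf1:age": "18"},
    "002": {"cf1:name": "Tom"},
    "003": {"cf1:age": "20"},
}


def get(table, row_key, column=None):
    """Look up a single row by its key, optionally narrowed to one column."""
    row = table.get(row_key, {})
    if column is None:
        return dict(row)
    return {column: row[column]} if column in row else {}


def scan(table, column):
    """Walk every row in row-key order and keep those that have the column."""
    return [(key, table[key][column]) for key in sorted(table) if column in table[key]]


print(get(table, "001", "cf1:name"))  # {'cf1:name': 'Mary'}
print(scan(table, "cf1:name"))        # [('001', 'Mary'), ('002', 'Tom')]
```

get needs the row key up front, whereas scan visits the rows in key order and returns the column wherever it exists, skipping rows such as 003 that do not have it.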

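6. alter: modify a table

alter can add a column family to a table
Command: alter 'table name', NAME => 'column family name', VERSIONS => number of versions
Add column family cf2 to table student1 with VERSIONS set to 3, then check the result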

hbase(main):007:0> alter 'student1' , NAME => 'cf2' , VERSIONS => 3
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 5.0950 seconds

hbase(main):008:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0500 seconds

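alter can also delete data from a table, but only a whole column family at a time
Command:
alter 'table name', NAME => 'column family', METHOD => 'delete', or alternatively
alter 'table name', 'delete' => 'column family'

Delete column family cf2 from table student1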

hbase(main):009:0> alter 'student1' , 'delete' => 'cf2'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.7580 seconds

hbase(main):010:0> desc 'student1'
Table student1 is ENABLED
student1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0280 seconds

7. truncate: empty a table

Command: truncate 'table name'
Empty the data in table student3

hbase(main):011:0> truncate 'student3'
Truncating 'student3' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 4.7220 seconds

hbase(main):014:0> get 'student3' , '001'
COLUMN                                              CELL
0 row(s) in 0.2540 seconds

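When emptying a table, the system automatically disables the table first and then clears its data
Once the data has been cleared, the system automatically makes the table available again

Use is_enabled 'table name' to check whether a table is enabled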

hbase(main):015:0> is_enabled 'student3'
true
0 row(s) in 0.0190 seconds

8. drop: delete a table

Command: drop 'table name'
An HBase table cannot be dropped directly while it is enabled

hbase(main):016:0> drop 'student2'

ERROR: Table student2 is enabled. Disable it first.

Here is some help for this command:
Drop the named table. Table must first be disabled:
  hbase> drop 't1'
  hbase> drop 'ns1:t1'

Before dropping a table, you must disable it first
Command: disable 'table name'

hbase(main):017:0> disable 'student2'
0 row(s) in 2.3580 seconds

hbase(main):018:0> drop 'student2'
0 row(s) in 1.3430 seconds

hbase(main):019:0> list
TABLE
student1
student3
2 row(s) in 0.0090 seconds

=> ["student1", "student3"]
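
The disable-then-drop rule is essentially a small state check. The Python sketch below models it with an invented TableCatalog class; it is not how HBase implements the rule, only an illustration of the behaviour seen in the transcript above, with the error text copied from that output.

```python
class TableCatalog:
    """Toy catalog that refuses to drop a table that is still enabled."""

    def __init__(self):
        self._enabled = {}  # table name -> True while the table is enabled

    def create(self, name):
        self._enabled[name] = True  # new tables start out enabled

    def disable(self, name):
        self._enabled[name] = False

    def drop(self, name):
        if self._enabled[name]:
            raise RuntimeError(f"Table {name} is enabled. Disable it first.")
        del self._enabled[name]

    def tables(self):
        return sorted(self._enabled)


catalog = TableCatalog()
for name in ["student1", "student2", "student3"]:
    catalog.create(name)

try:
    catalog.drop("student2")  # still enabled, so this is refused
except RuntimeError as error:
    refusal = str(error)
    print("ERROR:", refusal)

catalog.disable("student2")
catalog.drop("student2")      # succeeds now that the table is disabled
print(catalog.tables())       # ['student1', 'student3']
```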





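I'll keep updating this article as I come across other commands.

If you find any mistakes (including in earlier posts), feel free to send me a message.

Technology never stands still! Thanks for your support!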