Error When Querying an HBase Table, and How to Deal With It

程序员文章站 2022-06-18 16:20:38
ERROR: No server address listed in hbase:meta for region test1,,1517390330801.40ff7bbead5f57620c4ef2126403a109. containing row

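The above is an error returned when querying an HBase table. The business logic is simple: every day Sqoop extracts data into HBase, a Hive table mapped onto the HBase table is created, Spark SQL runs the join query, and Spark then PUTs the results into an HBase result table. The whole pipeline is deployed on Oozie.

One day the job suddenly hung and ran for more than 10 hours, so I went looking for the cause. Querying the table produced the following error (the session below is a reproduction):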

hbase(main):039:0> scan 'test1'
ROW                                        COLUMN+CELL                                                                                                                 

ERROR: No server address listed in hbase:meta for region test1,,1517390330801.40ff7bbead5f57620c4ef2126403a109. containing row 

Here is some help for this command:
Scan a table; pass table name and optionally a dictionary of scanner
specifications.  Scanner specifications may include one or more of:
TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, ROWPREFIXFILTER, TIMESTAMP,
MAXLENGTH or COLUMNS, CACHE or RAW, VERSIONS, ALL_METRICS or METRICS

If no columns are specified, all columns will be scanned.
To scan all members of a column family, leave the qualifier empty as in
'col_family'.
What caused it? The code is as follows:
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory

try {
  println("start to disable table test1")
  val config = HBaseConfiguration.create()
  config.set("hbase.zookeeper.quorum", "datanode01.isesol.com,datanode02.isesol.com,datanode03.isesol.com,datanode04.isesol.com,cmserver.isesol.com")
  config.set("hbase.zookeeper.property.clientPort", "2181")
  val connection = ConnectionFactory.createConnection(config)
  try {
    val admin = connection.getAdmin()
    try {
      // Disable the table, then truncate it; `true` preserves the existing split points.
      admin.disableTable(TableName.valueOf("test1"))
      admin.truncateTable(TableName.valueOf("test1"), true)
      //admin.enableTable(TableName.valueOf("test1"))
    } finally admin.close() // release the admin handle even if truncate fails
  } finally connection.close()
} catch {
  case ex: Exception => println(ex)
}
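This result table only holds the latest results, so every day the job clears it with disable and truncate and then inserts the new data. That is all it does.

Experiments show that if the table is empty, the error above appears after the truncate; if the table is not empty, there is no problem.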

hbase(main):040:0> scan 'hbase:meta' , {LIMIT=>10,FILTER=>"PrefixFilter('test1')"} 
ROW                                        COLUMN+CELL                                                                                                                 
 test1,,1517390330801.40ff7bbead5f57620c4e column=info:regioninfo, timestamp=1517390365180, value={ENCODED => 40ff7bbead5f57620c4ef2126403a109, NAME => 'test1,,1517390
 f2126403a109.                             330801.40ff7bbead5f57620c4ef2126403a109.', STARTKEY => '', ENDKEY => ''}                                                    
1 row(s) in 0.0370 seconds

hbase(main):041:0> 
Querying hbase:meta confirms that the region has no server information, only info:regioninfo. A healthy table should look like this:
hbase(main):042:0> 
hbase(main):043:0* scan 'hbase:meta' , {LIMIT=>10,FILTER=>"PrefixFilter('test2')"} 
ROW                                        COLUMN+CELL                                                                                                                 
 test2,,1506443929217.64139fa4ea9706556ddf column=info:regioninfo, timestamp=1506443925447, value={ENCODED => 64139fa4ea9706556ddf8a96958e5435, NAME => 'test2,,1506443
 8a96958e5435.                             929217.64139fa4ea9706556ddf8a96958e5435.', STARTKEY => '', ENDKEY => ''}                                                    
 test2,,1506443929217.64139fa4ea9706556ddf column=info:seqnumDuringOpen, timestamp=1506444316259, value=\x00\x00\x00\x00\x00\x00\x00\x0E                               
 8a96958e5435.                                                                                                                                                         
 test2,,1506443929217.64139fa4ea9706556ddf column=info:server, timestamp=1506444316259, value=datanode01.isesol.com:60020                                              
 8a96958e5435.                                                                                                                                                         
 test2,,1506443929217.64139fa4ea9706556ddf column=info:serverstartcode, timestamp=1506444316259, value=1506150299411                                                   
 8a96958e5435.                                       
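The comparison between the two hbase:meta scans above can also be done from code. The following is a minimal sketch, not a verified diagnostic: it assumes the HBase client library is on the classpath, and the names `MetaCheck`, `withoutServer` and `readMeta` are made up for illustration. It scans hbase:meta with the same kind of prefix filter as the shell command and reports the rows that have no `info:server` value.

```scala
import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.client.{Connection, Scan}
import org.apache.hadoop.hbase.filter.PrefixFilter
import org.apache.hadoop.hbase.util.Bytes

// Hypothetical helper, not part of HBase.
object MetaCheck {
  private val Info = Bytes.toBytes("info")
  private val Server = Bytes.toBytes("server")

  // Pure step: keep the meta row keys whose info:server value is missing or empty.
  def withoutServer(rows: Seq[(String, Option[String])]): Seq[String] =
    rows.collect { case (region, server) if server.forall(_.isEmpty) => region }

  // Reads (meta row key, info:server) pairs for one table from hbase:meta.
  // The trailing comma keeps the prefix from also matching tables such as "test10".
  def readMeta(connection: Connection, table: String): Seq[(String, Option[String])] = {
    val meta = connection.getTable(TableName.META_TABLE_NAME)
    try {
      val scan = new Scan()
      scan.setFilter(new PrefixFilter(Bytes.toBytes(table + ",")))
      val scanner = meta.getScanner(scan)
      try {
        // ResultScanner.next() returns null once the scan is exhausted.
        Iterator.continually(scanner.next()).takeWhile(_ != null).map { r =>
          val server = Option(r.getValue(Info, Server)).map(b => Bytes.toString(b))
          (Bytes.toString(r.getRow), server)
        }.toList
      } finally scanner.close()
    } finally meta.close()
  }
}
```

With a `connection` created as in the job's code, `MetaCheck.withoutServer(MetaCheck.readMeta(connection, "test1"))` would be expected to list the `test1` region shown above, and to return an empty list for `test2`.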
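So we know what triggers it, but why does it happen? I don't understand it, and frankly it is infuriating. What puzzles me most is why truncating from the CLI works fine while doing the same thing through the API fails.

This kind of business logic must be very common. If this approach doesn't work, what am I supposed to do? Do I really have to check every time whether the table has data, insert directly if it is empty, and truncate only if it is not?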
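For what it is worth, the check-before-truncate idea raised above could look roughly like the sketch below. It is only an illustration of that workaround, not a confirmed fix for the underlying problem. `TruncateGuard` and its methods are hypothetical names, and the code assumes the HBase client library is on the classpath. The emptiness check fetches at most one row with a `FirstKeyOnlyFilter`; the truncate step uses the same `disableTable` and `truncateTable` calls as the job.

```scala
import org.apache.hadoop.hbase.TableName
import org.apache.hadoop.hbase.client.{Connection, Scan}
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter

// Hypothetical helper, not part of HBase.
object TruncateGuard {
  // Pure decision step: truncate only when the table reports at least one row.
  // Returns true if a truncate was performed.
  def clearIfNotEmpty(hasRows: () => Boolean, truncate: () => Unit): Boolean =
    if (hasRows()) { truncate(); true } else false

  // Returns true if a scan of the table yields at least one row.
  def hasRows(connection: Connection, tableName: TableName): Boolean = {
    val table = connection.getTable(tableName)
    try {
      val scan = new Scan()
      scan.setFilter(new FirstKeyOnlyFilter()) // return only the first cell of each row
      scan.setCaching(1)                       // a single row is enough to decide
      val scanner = table.getScanner(scan)
      try scanner.next() != null finally scanner.close()
    } finally table.close()
  }

  // Same disable + truncate sequence as the job's code.
  def disableAndTruncate(connection: Connection, tableName: TableName): Unit = {
    val admin = connection.getAdmin
    try {
      admin.disableTable(tableName)
      admin.truncateTable(tableName, true) // true = preserve split points
    } finally admin.close()
  }

  // Wires the two HBase steps into the decision step.
  def clear(connection: Connection, tableName: TableName): Boolean =
    clearIfNotEmpty(
      () => hasRows(connection, tableName),
      () => disableAndTruncate(connection, tableName))
}
```

The job would call `TruncateGuard.clear(connection, TableName.valueOf("test1"))` before inserting. One caveat: if the table is already in the broken state shown earlier, the scan inside `hasRows` would presumably fail with the same "No server address listed" error, so this guard can only avoid triggering the problem, not repair it.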