Kylin "Failed to load Hive Table: Overwriting conflict": How to Fix It
1. Problem
Loading a Hive table into Kylin failed with the following error:
2018-01-25 15:55:47,581 TRACE [http-bio-7070-exec-5] hbase.HBaseResourceStore:311 : Update row /table_exd/NLOGS.BRO_DHCP.json from oldTs: 0, to newTs: 1516866947530, operation result: false
2018-01-25 15:55:47,583 ERROR [http-bio-7070-exec-5] controller.TableController:118 : Failed to load Hive Table
java.lang.IllegalStateException: Overwriting conflict /table_exd/NLOGS.BRO_DHCP.json, expect old TS 0, but it is 1516183299747
    at org.apache.kylin.storage.hbase.HBaseResourceStore.checkAndPutResourceImpl(HBaseResourceStore.java:315)
    at org.apache.kylin.common.persistence.ResourceStore.checkAndPutResourceCheckpoint(ResourceStore.java:294)
    at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:280)
    at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:260)
    at org.apache.kylin.metadata.MetadataManager.saveTableExt(MetadataManager.java:241)
    at org.apache.kylin.rest.service.TableService.loadHiveTablesToProject(TableService.java:166)
    at org.apache.kylin.rest.service.TableService$$FastClassBySpringCGLIB$$4a7fb179.invoke()
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720)
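2. Root cause analysis
The main error is a timestamp (TS) conflict that occurs while the metadata is being written to HBase. Presumably HBase already holds an entry for this table, so we need to find out which HBase table stores it and what its row key is.
The error message already tells us where the failure happened: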
org.apache.kylin.storage.hbase.HBaseResourceStore.checkAndPutResourceImpl(HBaseResourceStore.java:315)
I am running Kylin 2.1, so I opened the kylin project on GitHub, selected the "2.1.x" branch, and went to line 315 of HBaseResourceStore.java:
@Override
protected long checkAndPutResourceImpl(String resPath, byte[] content, long oldTS, long newTS)
        throws IOException, IllegalStateException {
    Table table = getConnection().getTable(TableName.valueOf(tableName));
    try {
        byte[] row = Bytes.toBytes(resPath);
        byte[] bOldTS = oldTS == 0 ? null : Bytes.toBytes(oldTS);
        Put put = buildPut(resPath, newTS, row, content, table);

        boolean ok = table.checkAndPut(row, B_FAMILY, B_COLUMN_TS, bOldTS, put);
        logger.trace("Update row " + resPath + " from oldTs: " + oldTS + ", to newTs: " + newTS
                + ", operation result: " + ok); // lines 311-312
        if (!ok) {
            long real = getResourceTimestampImpl(resPath);
            throw new IllegalStateException( // line 315
                    "Overwriting conflict " + resPath + ", expect old TS " + oldTS + ", but it is " + real);
        }

        return newTS;
    } finally {
        IOUtils.closeQuietly(table);
    }
}
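The check against oldTS fails (ok is false), which triggers the exception. Because oldTS is 0, bOldTS is null, so checkAndPut only succeeds if the TS column does not exist yet; here the row already exists with TS 1516183299747. The code also shows that resPath is used directly as the row key. From the TRACE output in the error message above and from lines 311-312 of the code, resPath = "/table_exd/NLOGS.BRO_DHCP.json":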
Update row /table_exd/NLOGS.BRO_DHCP.json …
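The compare-and-set behaviour described above can be reproduced without HBase. The following is a minimal sketch, not Kylin code: the class CheckAndPutSketch and its in-memory map are hypothetical stand-ins for the metadata table, and returning 0 for a missing row is an assumption. It shows why a write that expects "no existing TS" (oldTS = 0) is rejected once a row with a TS is already present.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory stand-in for the HBase metadata table; not Kylin code.
class CheckAndPutSketch {
    // Row key -> stored timestamp (plays the role of the TS column).
    private final Map<String, Long> tsByRow = new HashMap<>();

    // Same rule as in the method above: oldTS == 0 means "expect no TS value yet".
    synchronized long checkAndPut(String resPath, long oldTS, long newTS) {
        Long expected = (oldTS == 0) ? null : Long.valueOf(oldTS);
        Long actual = tsByRow.get(resPath);
        boolean ok = Objects.equals(expected, actual);
        if (!ok) {
            long real = (actual == null) ? 0 : actual; // assumption: 0 when the row is missing
            throw new IllegalStateException("Overwriting conflict " + resPath
                    + ", expect old TS " + oldTS + ", but it is " + real);
        }
        tsByRow.put(resPath, newTS);
        return newTS;
    }

    public static void main(String[] args) {
        CheckAndPutSketch store = new CheckAndPutSketch();
        String row = "/table_exd/NLOGS.BRO_DHCP.json";
        store.checkAndPut(row, 0L, 1516183299747L);     // first load: row absent, accepted
        try {
            store.checkAndPut(row, 0L, 1516866947530L); // later load still expects "absent"
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The second call fails for the same reason as the real load: the caller believes the resource is new, while the store already holds a timestamp for that row key.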
Next we need the HBase table name. It is set in the same file, at lines 87-107:
public HBaseResourceStore(KylinConfig kylinConfig) throws IOException {
    super(kylinConfig);
    metadataUrl = buildMetadataUrl(kylinConfig);
    tableName = metadataUrl.getIdentifier();
    createHTableIfNeeded(tableName);
}

private StorageURL buildMetadataUrl(KylinConfig kylinConfig) throws IOException {
    StorageURL url = kylinConfig.getMetadataUrl();
    if (!url.getScheme().equals("hbase"))
        throw new IOException("Cannot create HBaseResourceStore. Url not match. Url: " + url);

    // control timeout for prompt error report
    Map<String, String> newParams = new LinkedHashMap<>();
    newParams.put("hbase.client.scanner.timeout.period", "10000");
    newParams.put("hbase.rpc.timeout", "5000");
    newParams.put("hbase.client.retries.number", "1");
    newParams.putAll(url.getAllParameters());

    return url.copy(newParams);
}
tableName comes from the StorageURL class (via getIdentifier()), so I opened that class. Line 97 contains:
this.identifier = n.isEmpty() ? "kylin_metadata" : n;
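If your deployment uses a non-default metadata URL, the table name will differ, so it is worth checking before touching HBase. The sketch below is a simplified, hypothetical re-implementation of the identifier lookup, not the actual StorageURL parser: it assumes the URL has the form "identifier@scheme,key=value,..." (as in the common kylin.metadata.url=kylin_metadata@hbase setting), and only the fallback to "kylin_metadata" is taken from the line above.

```java
// Hypothetical, simplified sketch; the real StorageURL class handles more cases.
class MetadataUrlSketch {
    // Returns the part before '@' in the first comma-separated segment,
    // falling back to "kylin_metadata" when that part is empty.
    static String identifierOf(String metadataUrl) {
        String first = metadataUrl.split(",", -1)[0];
        int cut = first.lastIndexOf('@');
        String n = (cut < 0 ? first : first.substring(0, cut)).trim();
        return n.isEmpty() ? "kylin_metadata" : n;
    }

    public static void main(String[] args) {
        System.out.println(identifierOf("kylin_metadata@hbase"));
        System.out.println(identifierOf("@hbase"));
        System.out.println(identifierOf("my_meta@hbase,hbase.rpc.timeout=5000"));
    }
}
```

Under these assumptions, the first two calls resolve to kylin_metadata and the third to my_meta, which would then be the HBase table to inspect.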
At this point we know:
HBase table name: kylin_metadata, row key: /table_exd/NLOGS.BRO_DHCP.json
In the HBase shell, run:
get 'kylin_metadata', '/table_exd/NLOGS.BRO_DHCP.json'
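The get returns a result, so the table's metadata is already stored in HBase, but Kylin does not recognize it correctly. Every further Load therefore conflicts when it writes to HBase: Kylin checks the timestamp and refuses to update the row when the timestamp does not match.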
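To see when the conflicting record was written, the two TS values from the log can be decoded. This is a small sketch that assumes the values are epoch milliseconds; that assumption fits the data, since newTs decodes to 07:55:47 UTC, matching the 15:55:47 log time if the server runs at UTC+8. The class name TsDecodeSketch is only for illustration.

```java
import java.time.Instant;

// Illustrative helper; assumes Kylin's TS values are epoch milliseconds.
class TsDecodeSketch {
    static String toUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // TS already stored in HBase ("but it is ...")
        System.out.println(toUtc(1516183299747L)); // 2018-01-17T10:01:39.747Z
        // newTs of the failed load
        System.out.println(toUtc(1516866947530L)); // 2018-01-25T07:55:47.530Z
    }
}
```

Under that assumption, the existing row dates from 2018-01-17, about eight days before the failed load, which is consistent with a leftover record from an earlier load.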
3. Solution
Once the cause is known, the fix is simple: delete that record and load the Hive table again. The command is:
deleteall 'kylin_metadata', '/table_exd/NLOGS.BRO_DHCP.json'