A Detailed Look at How HBase Completes a Get Operation (Server Side)
An earlier post covered the client side of the process; this one follows what happens on the RegionServer.
After passing through HBaseRPC, the call arrives at HRegionServer.get(byte[] regionName, Get get):
    HRegion region = getRegion(regionName);
    return region.get(get, getLockFromId(get.getLockId()));
Next comes the HRegion.get(Get) method:
    /*
     * Do a get based on the get parameter.
     */
    private List<KeyValue> get(final Get get) throws IOException {
      Scan scan = new Scan(get);

      List<KeyValue> results = new ArrayList<KeyValue>();

      InternalScanner scanner = null;
      try {
        scanner = getScanner(scan);
        scanner.next(results);
      } finally {
        if (scanner != null)
          scanner.close();
      }
      return results;
    }
The scanner returned is a RegionScanner. Let's look at its constructor:
    RegionScanner(Scan scan, List<KeyValueScanner> additionalScanners) {
      //DebugPrint.println("HRegionScanner.<init>");
      this.filter = scan.getFilter();
      // Doesn't need to be volatile, always accessed under a sync'ed method
      this.oldFilter = scan.getOldFilter();
      if (Bytes.equals(scan.getStopRow(), HConstants.EMPTY_END_ROW)) {
        this.stopRow = null;
      } else {
        this.stopRow = scan.getStopRow();
      }
      this.isScan = scan.isGetScan() ? -1 : 0;

      this.readPt = ReadWriteConsistencyControl.resetThreadReadPoint(rwcc);

      List<KeyValueScanner> scanners = new ArrayList<KeyValueScanner>();
      if (additionalScanners != null) {
        scanners.addAll(additionalScanners);
      }
      for (Map.Entry<byte[], NavigableSet<byte[]>> entry :
          scan.getFamilyMap().entrySet()) {
        Store store = stores.get(entry.getKey());
        scanners.add(store.getScanner(scan, entry.getValue()));
      }
      this.storeHeap =
        new KeyValueHeap(scanners.toArray(new KeyValueScanner[0]), comparator);
    }
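The key code is where the scanners are obtained. This requires some understanding of HBase's data model: an HBase table has the notion of a column family, and one column family can contain different columns. In storage, each column family becomes a Store, and each Store's data consists of a memstore in memory plus one or more HFiles on disk. So store.getScanner(scan, entry.getValue()) returns a scanner over both the memstore and the HFiles. The second argument is the set of columns being queried.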
    /**
     * Return a scanner for both the memstore and the HStore files
     */
    protected KeyValueScanner getScanner(Scan scan,
        final NavigableSet<byte []> targetCols) {
      lock.readLock().lock();
      try {
        return new StoreScanner(this, scan, targetCols);
      } finally {
        lock.readLock().unlock();
      }
    }
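To make the data model described above concrete, here is a minimal sketch of the layout: one store per column family, each holding a memstore plus some files. The names StoreLayoutSketch, ToyRegion, ToyStore, sourcesFor and demoRegion are made up for illustration and are not HBase classes; sorted maps stand in for the memstore and the HFiles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model; these are not the real HBase classes.
class StoreLayoutSketch {

    /** One column family: an in-memory memstore plus zero or more on-disk files. */
    static final class ToyStore {
        final TreeMap<String, String> memstore = new TreeMap<String, String>();
        final List<TreeMap<String, String>> storeFiles =
            new ArrayList<TreeMap<String, String>>();

        /** Every sorted source that a read of this store has to consult. */
        List<TreeMap<String, String>> sources() {
            List<TreeMap<String, String>> all =
                new ArrayList<TreeMap<String, String>>(storeFiles);
            all.add(memstore);
            return all;
        }
    }

    /** A region keeps one store per column family. */
    static final class ToyRegion {
        final Map<String, ToyStore> stores = new TreeMap<String, ToyStore>();

        ToyStore store(String family) {
            ToyStore s = stores.get(family);
            if (s == null) {
                s = new ToyStore();
                stores.put(family, s);
            }
            return s;
        }

        /** Number of sorted sources touched when reading the given families. */
        int sourcesFor(List<String> families) {
            int n = 0;
            for (String family : families) {
                ToyStore s = stores.get(family);
                if (s != null) {
                    n += s.sources().size();
                }
            }
            return n;
        }
    }

    /** Builds a region with family "cf1" (two files) and "cf2" (memstore only). */
    static ToyRegion demoRegion() {
        ToyRegion region = new ToyRegion();
        ToyStore cf1 = region.store("cf1");
        cf1.memstore.put("row1/colA", "newest");
        cf1.storeFiles.add(new TreeMap<String, String>());
        cf1.storeFiles.add(new TreeMap<String, String>());
        region.store("cf2").memstore.put("row1/colB", "x");
        return region;
    }
}
```

In this toy layout a read that names only cf1 touches three sorted sources (two files plus the memstore), which is why a single Get fans out into several scanners.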
Because a Store includes both the memstore and StoreFiles, each StoreScanner in turn has to create several KeyValueScanners, as the following code shows:
    /**
     * Opens a scanner across memstore, snapshot, and all StoreFiles.
     *
     * @param store who we scan
     * @param scan the spec
     * @param columns which columns we are scanning
     */
    StoreScanner(Store store, Scan scan, final NavigableSet<byte[]> columns) {
      //DebugPrint.println("SS new");
      this.store = store;
      this.cacheBlocks = scan.getCacheBlocks();
      matcher = new ScanQueryMatcher(scan, store.getFamily().getName(),
          columns, store.ttl, store.comparator.getRawComparator(),
          store.versionsToReturn(scan.getMaxVersions()));

      this.isGet = scan.isGetScan();
      List<KeyValueScanner> scanners = getScanners();

      // Seek all scanners to the initial key
      // TODO if scan.isGetScan, use bloomfilters to skip seeking
      for(KeyValueScanner scanner : scanners) {
        scanner.seek(matcher.getStartKey());
      }

      // Combine all seeked scanners with a heap
      heap = new KeyValueHeap(
        scanners.toArray(new KeyValueScanner[scanners.size()]), store.comparator);

      this.store.addChangedReaderObserver(this);
    }
The StoreScanner constructor first calls getScanners() to obtain all the KeyValueScanners, then seeks every scanner to the requested key, and finally puts all the scanners into a heap that is used to merge the results to be returned.
    /*
     * @return List of scanners ordered properly.
     */
    private List<KeyValueScanner> getScanners() {
      List<KeyValueScanner> scanners = getStoreFileScanners();
      KeyValueScanner [] memstorescanners = this.store.memstore.getScanners();
      for (int i = memstorescanners.length - 1; i >= 0; i--) {
        scanners.add(memstorescanners[i]);
      }
      return scanners;
    }
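The "seek every scanner, then merge through a heap" pattern is easier to see in isolation. Below is a rough sketch of it using java.util.PriorityQueue over plain sorted string lists. HeapMergeSketch, Cursor and mergeFrom are hypothetical names, and the real KeyValueHeap deals with KeyValues and comparators that this toy version ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of a heap-based merge; not HBase's KeyValueHeap.
class HeapMergeSketch {

    /** A cursor over one sorted source (a stand-in for a KeyValueScanner). */
    static final class Cursor {
        private final List<String> sortedKeys;
        private int pos = 0;

        Cursor(List<String> sortedKeys) {
            this.sortedKeys = sortedKeys;
        }

        /** Positions the cursor at the first key >= target (linear scan for brevity). */
        void seek(String target) {
            pos = 0;
            while (pos < sortedKeys.size() && sortedKeys.get(pos).compareTo(target) < 0) {
                pos++;
            }
        }

        /** Current key, or null when the cursor is exhausted. */
        String peek() {
            return pos < sortedKeys.size() ? sortedKeys.get(pos) : null;
        }

        /** Returns the current key and advances. */
        String next() {
            return sortedKeys.get(pos++);
        }
    }

    /** Seeks every cursor to startKey, then merges them in key order via a heap. */
    static List<String> mergeFrom(String startKey, List<Cursor> cursors) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> a.peek().compareTo(b.peek()));
        for (Cursor c : cursors) {
            c.seek(startKey);
            if (c.peek() != null) {
                heap.add(c); // exhausted cursors never enter the heap
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();
            out.add(smallest.next());
            if (smallest.peek() != null) {
                heap.add(smallest); // re-insert so it is ordered by its new head
            }
        }
        return out;
    }
}
```

The heap only ever compares the head of each cursor, so the cost per returned key grows with the logarithm of the number of sources rather than with the amount of data.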
Now for each scanner's seek. Start with StoreFileScanner, where the seek is actually carried out on the HFile:
    public int seekTo(byte[] key, int offset, int length) throws IOException {
      int b = reader.blockContainingKey(key, offset, length);
      if (b < 0) return -1; // falls before the beginning of the file! :-(
      // Avoid re-reading the same block (that'd be dumb).
      loadBlock(b);
      return blockSeek(key, offset, length, false);
    }
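Note that every HFile of every Store stays open once the regionserver is up, and each HFile's block index is read into memory and kept there. The seek therefore 1) looks up in the index where the data block containing the key sits in the HFile; 2) reads that block in; 3) seeks to the wanted key within it.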
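As a rough idea of what the index lookup in step 1 amounts to, here is a sketch that assumes the index is simply a sorted array holding the first key of each block. BlockIndexSketch is a made-up class; only the method name blockContainingKey is borrowed from the code above, and the real one works on byte arrays with offsets.

```java
import java.util.Arrays;

// Hypothetical sketch of a block-index lookup; not the HFile implementation.
class BlockIndexSketch {

    /**
     * Returns the index of the block that could contain key, i.e. the last block
     * whose first key is <= key, or -1 if key sorts before the first block.
     */
    static int blockContainingKey(String[] blockFirstKeys, String key) {
        int pos = Arrays.binarySearch(blockFirstKeys, key);
        if (pos >= 0) {
            return pos; // key is exactly the first key of a block
        }
        int insertionPoint = -(pos + 1); // index of the first element greater than key
        return insertionPoint - 1;       // the block just before it; -1 if there is none
    }
}
```

With first keys {"d", "k", "r"}, a lookup for "m" lands in block 1, and a lookup for "a" returns -1, which matches the "falls before the beginning of the file" branch in seekTo.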
Of these three steps, the first is a binary search in memory. The second:
    private void loadBlock(int bloc) throws IOException {
      if (block == null) {
        block = reader.readBlock(bloc, this.cacheBlocks, this.pread);
        currBlock = bloc;
        blockFetches++;
      } else {
        if (bloc != currBlock) {
          block = reader.readBlock(bloc, this.cacheBlocks, this.pread);
          currBlock = bloc;
          blockFetches++;
        } else {
          // we are already in the same block, just rewind to seek again.
          block.rewind();
        }
      }
    }
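The gist: check whether the current block is exactly the one we want. If it is, great! If not, go load it.
The load code is long but the logic is simple: check the cache, and if the block is not there, load it from the file system. This involves one pread. My experience with HDFS tells me that random reads on HDFS are slow even with pread.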
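The "check the cache, otherwise read from the file system" logic can be sketched as a small read-through cache. This is only an illustration built on LinkedHashMap in access order: BlockCacheSketch, diskReader and diskReads are hypothetical names, and I am assuming a plain LRU policy here, which is not necessarily how HBase's block cache evicts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical read-through LRU block cache; not HBase's block cache.
class BlockCacheSketch {
    private final Map<Integer, byte[]> cache;
    private final IntFunction<byte[]> diskReader; // stands in for the pread on the file system
    int diskReads = 0;                            // counts how often we had to go to "disk"

    BlockCacheSketch(final int capacity, IntFunction<byte[]> diskReader) {
        this.diskReader = diskReader;
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the block from the cache if present, otherwise loads it. */
    byte[] readBlock(int blockIndex, boolean cacheBlock) {
        byte[] cached = cache.get(blockIndex);
        if (cached != null) {
            return cached; // cache hit: no file system access at all
        }
        byte[] loaded = diskReader.apply(blockIndex);
        diskReads++;
        if (cacheBlock) {
            cache.put(blockIndex, loaded); // may evict the least recently used block
        }
        return loaded;
    }
}
```

The cacheBlock flag plays the role of the cacheBlocks argument passed to readBlock above: a caller can read a block without letting it push other blocks out of the cache.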
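The third step finds the key inside the block that was just read, which is an in-memory comparison.
The seek on the memstore is fairly simple, so I'll skip it.
Once every scanner has been seeked into position, they are all added to the KeyValueHeap's priority queue, and the StoreScanner is fully constructed.
After that, the StoreScanners of all the Stores are themselves combined into another KeyValueHeap, used for merging the results from different column families as mentioned earlier.
With all this in place, the query can run! HRegion.next(List<KeyValue> outResults) calls HRegion.nextInternal():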
    do {
      this.storeHeap.next(results);
    } while (Bytes.equals(currentRow, nextRow = peekRow()));
The loop is presumably there to walk through the different column families.
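Here is a small sketch of how I picture that loop: each per-family scanner hands over the cells of its current row, and the loop keeps pulling from the heap while the next scanner is still on the same row. RowAssemblySketch, FamilyScanner, heapOf and nextRow are hypothetical names, cells are written as "row/family:column" strings for simplicity, and this is my reading of the behavior rather than the actual HBase code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of assembling one row across column families.
class RowAssemblySketch {

    /** Per-family scanner over cells sorted by row (a stand-in for a StoreScanner). */
    static final class FamilyScanner {
        private final List<String> cells;
        private int pos = 0;

        FamilyScanner(List<String> cells) {
            this.cells = cells;
        }

        /** Row of the current cell, or null when exhausted. */
        String peekRow() {
            return pos < cells.size() ? rowOf(cells.get(pos)) : null;
        }

        /** Appends every cell of the current row to out and advances past that row. */
        void next(List<String> out) {
            String row = peekRow();
            while (row != null && row.equals(peekRow())) {
                out.add(cells.get(pos++));
            }
        }
    }

    /** Extracts the row part of a "row/family:column" cell string. */
    static String rowOf(String cell) {
        return cell.substring(0, cell.indexOf('/'));
    }

    /** Builds a heap of non-exhausted scanners ordered by their current row. */
    static PriorityQueue<FamilyScanner> heapOf(List<FamilyScanner> scanners) {
        PriorityQueue<FamilyScanner> heap =
            new PriorityQueue<>((a, b) -> a.peekRow().compareTo(b.peekRow()));
        for (FamilyScanner s : scanners) {
            if (s.peekRow() != null) {
                heap.add(s);
            }
        }
        return heap;
    }

    /** Collects all cells of the smallest remaining row across every family. */
    static List<String> nextRow(PriorityQueue<FamilyScanner> heap) {
        List<String> results = new ArrayList<>();
        if (heap.isEmpty()) {
            return results;
        }
        String currentRow = heap.peek().peekRow();
        do {
            FamilyScanner top = heap.poll();
            top.next(results);          // drain this family's cells for the row
            if (top.peekRow() != null) {
                heap.add(top);          // still has later rows, so keep it around
            }
        } while (!heap.isEmpty() && currentRow.equals(heap.peek().peekRow()));
        return results;
    }
}
```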
That is as far as my understanding goes for now.