Fetch Task Conversion

程序员文章站 2022-04-25 20:01:41

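Fetch task conversion means that Hive can answer certain queries without running a MapReduce job. For example, for SELECT * FROM emp; Hive can simply read the files in the table's storage directory and print the rows to the console.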

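In hive-default.xml.template, hive.fetch.task.conversion defaults to more (Hive releases before 0.14.0 default to minimal). With the property set to more, full-table selects, column selects, LIMIT queries, and similar simple queries do not go through MapReduce.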

<property>
    <name>hive.fetch.task.conversion</name>
    <value>more</value>
    <description>
      Expects one of [none, minimal, more].
      Some select queries can be converted to single FETCH task minimizing latency.
      Currently the query should be single sourced not having any subquery and should not have
      any aggregations or distincts (which incurs RS), lateral views and joins.
      0. none : disable hive.fetch.task.conversion
      1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
      2. more  : SELECT, FILTER, LIMIT only (support TABLESAMPLE and virtual columns)
    </description>
</property>
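
The three modes in the description above can be read as a small decision rule. The Python sketch below is only a simplified model of that description, not Hive's actual implementation: QueryShape, its fields, and converts_to_fetch are hypothetical names introduced for illustration, and the model ignores anything else Hive may take into account, such as input size.

```python
from dataclasses import dataclass


@dataclass
class QueryShape:
    """Hypothetical summary of one query; not a Hive data structure."""
    select_star: bool = False                  # plain SELECT * with no column list
    filter_on_non_partition_col: bool = False  # WHERE touches a non-partition column
    has_subquery: bool = False
    has_aggregation_or_distinct: bool = False
    has_lateral_view: bool = False
    has_join: bool = False


def converts_to_fetch(mode: str, query: QueryShape) -> bool:
    """Return True if, under this simplified model, the query runs as a single FETCH task."""
    if mode not in ("none", "minimal", "more"):
        raise ValueError(f"expected one of none, minimal, more; got {mode!r}")
    if mode == "none":
        # Conversion disabled: every query is planned as a MapReduce job.
        return False
    # Restrictions the description applies to every mode.
    if (query.has_subquery or query.has_aggregation_or_distinct
            or query.has_lateral_view or query.has_join):
        return False
    if mode == "minimal":
        # SELECT STAR, FILTER on partition columns, LIMIT only.
        return query.select_star and not query.filter_on_non_partition_col
    # more: any simple SELECT, FILTER, LIMIT query.
    return True


if __name__ == "__main__":
    # Models "select ename from emp": a column select, not SELECT *.
    ename_query = QueryShape(select_star=False)
    for mode in ("none", "minimal", "more"):
        print(mode, converts_to_fetch(mode, ename_query))
```

Under this model, select ename from emp converts to a fetch task only when the mode is more, because it selects a named column rather than SELECT *.
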
1. Set hive.fetch.task.conversion to none, then run a query; every query now launches a MapReduce job.
hive (default)> set hive.fetch.task.conversion=none;
hive (default)> select ename from emp;
2. Set hive.fetch.task.conversion to more, then run the query again; queries of the following kind no longer launch a MapReduce job.
hive (default)> set hive.fetch.task.conversion=more;
hive (default)> select ename from emp;