欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

hive解析lzo文件失败,No LZO codec found, cannot run

程序员文章站 2024-01-23 16:35:34
...

Hive创建外部表,指向lzo格式文件时,无法解析出数据,报错如下:

java.io.IOException: No LZO codec found, cannot run.

hiveserver2日志报错如下:

Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
 at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:265)
 at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:212)
 at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:332)
 at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:721)
 at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
 at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:251)
 ... 11 more
Caused by: java.io.IOException: No LZO codec found, cannot run.
 at com.hadoop.mapred.DeprecatedLzoLineRecordReader.<init>(DeprecatedLzoLineRecordReader.java:53)
 at com.hadoop.mapred.DeprecatedLzoTextInputFormat.getRecordReader(DeprecatedLzoTextInputFormat.java:156)
 at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:66)
 ... 16 more


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

解决方法:

检查hadoop集群是否添加了hadoop-lzo-*.jar,该jar包是否已配置到HADOOP_CLASSPATH里

如果配置进去了,并检查core-site.xml里是否配置了如下信息:

<property>
 <name>io.compression.codecs</name>
 <value>
  org.apache.hadoop.io.compress.GzipCodec,
  org.apache.hadoop.io.compress.DefaultCodec,
  org.apache.hadoop.io.compress.BZip2Codec,
  org.apache.hadoop.io.compress.SnappyCodec,
  com.hadoop.compression.lzo.LzoCodec,
  com.hadoop.compression.lzo.LzopCodec
 </value>
 <description>
  A comma-separated list of the compression codec classes
  that can be
  used for compression/decompression. In addition to any
  classes
  specified with this property (which take precedence), codec
  classes
  on the classpath are discovered using a Java ServiceLoader.
 </description>
</property>

<property>
 <name>io.compression.codec.lzo.class</name>
 <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

如果满足这两点,应该就可以了(注意:hive安装包的lib目录不用添加hadoop-lzo-*.jar)。