欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  IT编程

sqoop 安装与使用

程序员文章站 2022-11-07 12:43:38
本文介绍了如何安装sqoop,以及如何使用sqoop导入数据到hadoop、hive中去 ......

sqoop(发音:skup)是一款开源的工具,主要用于在hadoop(hive)与传统的数据库间进行数据的传递,可以将一个关系型数据库(例如 : mysql ,oracle ,postgres等)中的数据导进到hadoop的hdfs中,也可以将hdfs的数据导进到关系型数据库中。

官网:http://sqoop.apache.org/

由于官方不建议将sqoop 2用于生产环境,因为仍然使用sqoop 1

1、准备

下载

找到对应的镜像地址安装,复制下载地址

wget http://apache-mirror.8birdsvideo.com/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

解压

tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

2、安装

mv sqoop-1.4.7.bin__hadoop-2.6.0  /usr/local/

添加jdbc到sqoop的lib中:

cp mysql-connector-java-5.1.17.jar /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0/lib/

3、配置

配置sqoop,加入path

vim /etc/profile

export sqoop_home=/usr/local/sqoop-1.4.7.bin__hadoop-2.6.0
export path=$sqoop_home/bin:$path

 

配置sqoop-evn.sh

在sqoop的conf目录下

cp sqoop-env-template.sh sqoop-env.sh

vim sqoop-env.sh

文件内容增加

 

export hadoop_common_home=/lib/hadoop
export hadoop_mapred_home=/lib/hadoop-mapreduce
export hive_home=/search/hive
export hive_conf_dir=/search/hive/conf

4、测试

测试与mysql的连通性

sqoop list-databases --connect jdbc:mysql://${host}:3306/test?characterencoding=utf-8 --username ${username} --password '$password'
warning: /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! hcatalog jobs will fail.
please set $hcat_home to the root of your hcatalog installation.
warning: /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! accumulo imports will fail.
please set $accumulo_home to the root of your accumulo installation.
19/12/12 16:39:52 info sqoop.sqoop: running sqoop version: 1.4.7
19/12/12 16:39:52 warn tool.basesqooptool: setting your password on the command-line is insecure. consider using -p instead.
19/12/12 16:39:53 info manager.mysqlmanager: preparing to use a mysql streaming resultset.
information_schema
test

测试从mysql中导入到hive中

sqoop import \
--connect jdbc:mysql://***:3306/test?characterencoding=utf-8 \
--username ***\
--password '***' \
--table testxxx \
-m 1 \
--warehouse-dir /user/teemo/hive/warehouse/lyj_temp.db \
--hive-database lyj_temp \
--hive-import \
--hive-table testxxx

运行结果:

进入hive,select  一下能看到结果

遇到得问题:

问题1

19/12/12 17:22:29 info orm.compilationmanager: writing jar file: /tmp/sqoop-root/compile/5c81c7e9c9d08472a6c1a1a272e4eeb3/testxxx.jar
exception in thread "main" java.lang.noclassdeffounderror: org/apache/hadoop/mapreduce/inputformat
        at java.lang.classloader.defineclass1(native method)
        at java.lang.classloader.defineclass(classloader.java:763)
        at java.security.secureclassloader.defineclass(secureclassloader.java:142)
        at java.net.urlclassloader.defineclass(urlclassloader.java:468)
        at java.net.urlclassloader.access$100(urlclassloader.java:74)
        at java.net.urlclassloader$1.run(urlclassloader.java:369)
        at java.net.urlclassloader$1.run(urlclassloader.java:363)
        at java.security.accesscontroller.doprivileged(native method)
        at java.net.urlclassloader.findclass(urlclassloader.java:362)
        at java.lang.classloader.loadclass(classloader.java:424)
        at sun.misc.launcher$appclassloader.loadclass(launcher.java:349)
        at java.lang.classloader.loadclass(classloader.java:357)
        at java.lang.classloader.defineclass1(native method)
        at java.lang.classloader.defineclass(classloader.java:763)
        at java.security.secureclassloader.defineclass(secureclassloader.java:142)
        at java.net.urlclassloader.defineclass(urlclassloader.java:468)
        at java.net.urlclassloader.access$100(urlclassloader.java:74)
        at java.net.urlclassloader$1.run(urlclassloader.java:369)
        at java.net.urlclassloader$1.run(urlclassloader.java:363)
        at java.security.accesscontroller.doprivileged(native method)
        at java.net.urlclassloader.findclass(urlclassloader.java:362)
        at java.lang.classloader.loadclass(classloader.java:424)
        at sun.misc.launcher$appclassloader.loadclass(launcher.java:349)
        at java.lang.classloader.loadclass(classloader.java:357)
        at java.lang.classloader.defineclass1(native method)
        at java.lang.classloader.defineclass(classloader.java:763)
        at java.security.secureclassloader.defineclass(secureclassloader.java:142)
        at java.net.urlclassloader.defineclass(urlclassloader.java:468)
        at java.net.urlclassloader.access$100(urlclassloader.java:74)
        at java.net.urlclassloader$1.run(urlclassloader.java:369)
        at java.net.urlclassloader$1.run(urlclassloader.java:363)
        at java.security.accesscontroller.doprivileged(native method)
        at java.net.urlclassloader.findclass(urlclassloader.java:362)
        at java.lang.classloader.loadclass(classloader.java:424)
        at sun.misc.launcher$appclassloader.loadclass(launcher.java:349)
        at java.lang.classloader.loadclass(classloader.java:357)
        at java.lang.classloader.defineclass1(native method)
        at java.lang.classloader.defineclass(classloader.java:763)
        at java.security.secureclassloader.defineclass(secureclassloader.java:142)
        at java.net.urlclassloader.defineclass(urlclassloader.java:468)
        at java.net.urlclassloader.access$100(urlclassloader.java:74)
        at java.net.urlclassloader$1.run(urlclassloader.java:369)
        at java.net.urlclassloader$1.run(urlclassloader.java:363)
        at java.security.accesscontroller.doprivileged(native method)
        at java.net.urlclassloader.findclass(urlclassloader.java:362)
        at java.lang.classloader.loadclass(classloader.java:424)
        at sun.misc.launcher$appclassloader.loadclass(launcher.java:349)
        at java.lang.classloader.loadclass(classloader.java:357)
        at org.apache.sqoop.manager.importjobcontext.<init>(importjobcontext.java:51)
        at com.cloudera.sqoop.manager.importjobcontext.<init>(importjobcontext.java:33)
        at org.apache.sqoop.tool.importtool.importtable(importtool.java:506)
        at org.apache.sqoop.tool.importtool.run(importtool.java:628)
        at org.apache.sqoop.sqoop.run(sqoop.java:147)
        at org.apache.hadoop.util.toolrunner.run(toolrunner.java:70)
        at org.apache.sqoop.sqoop.runsqoop(sqoop.java:183)
        at org.apache.sqoop.sqoop.runtool(sqoop.java:234)
        at org.apache.sqoop.sqoop.runtool(sqoop.java:243)
        at org.apache.sqoop.sqoop.main(sqoop.java:252)
caused by: java.lang.classnotfoundexception: org.apache.hadoop.mapreduce.inputformat
        at java.net.urlclassloader.findclass(urlclassloader.java:382)
        at java.lang.classloader.loadclass(classloader.java:424)
        at sun.misc.launcher$appclassloader.loadclass(launcher.java:349)
        at java.lang.classloader.loadclass(classloader.java:357)
        ... 58 more

配置的hive_home这个变量有问题,应该是hadoop-mapreduce这个目录

问题2

19/12/12 17:59:34 error tool.importtool: import failed: java.io.ioexception: java.lang.classnotfoundexception: org.apache.hadoop.hive.conf.hiveconf
        at org.apache.sqoop.hive.hiveconfig.gethiveconf(hiveconfig.java:50)
        at org.apache.sqoop.hive.hiveimport.gethiveargs(hiveimport.java:392)
        at org.apache.sqoop.hive.hiveimport.executeexternalhivescript(hiveimport.java:379)
        at org.apache.sqoop.hive.hiveimport.executescript(hiveimport.java:337)
        at org.apache.sqoop.hive.hiveimport.importtable(hiveimport.java:241)
        at org.apache.sqoop.tool.importtool.importtable(importtool.java:537)
        at org.apache.sqoop.tool.importtool.run(importtool.java:628)
        at org.apache.sqoop.sqoop.run(sqoop.java:147)
        at org.apache.hadoop.util.toolrunner.run(toolrunner.java:70)
        at org.apache.sqoop.sqoop.runsqoop(sqoop.java:183)
        at org.apache.sqoop.sqoop.runtool(sqoop.java:234)
        at org.apache.sqoop.sqoop.runtool(sqoop.java:243)
        at org.apache.sqoop.sqoop.main(sqoop.java:252)
caused by: java.lang.classnotfoundexception: org.apache.hadoop.hive.conf.hiveconf
        at java.net.urlclassloader.findclass(urlclassloader.java:382)
        at java.lang.classloader.loadclass(classloader.java:424)
        at sun.misc.launcher$appclassloader.loadclass(launcher.java:349)
        at java.lang.classloader.loadclass(classloader.java:357)
        at java.lang.class.forname0(native method)
        at java.lang.class.forname(class.java:264)
        at org.apache.sqoop.hive.hiveconfig.gethiveconf(hiveconfig.java:44)
        ... 12 more

将hive的lib里面的拷贝到sqoop的lib目录下

cp hive-exec-1.2.1.jar /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0/lib/