sqoop 安装与使用
sqoop(发音:skup)是一款开源的工具,主要用于在hadoop(hive)与传统的数据库间进行数据的传递,可以将一个关系型数据库(例如 : mysql ,oracle ,postgres等)中的数据导进到hadoop的hdfs中,也可以将hdfs的数据导进到关系型数据库中。
官网:http://sqoop.apache.org/
由于官方不建议将sqoop 2用于生产环境,因为仍然使用sqoop 1
1、准备
下载
找到对应的镜像地址安装,复制下载地址
wget http://apache-mirror.8birdsvideo.com/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
解压
tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
2、安装
mv sqoop-1.4.7.bin__hadoop-2.6.0 /usr/local/
添加jdbc到sqoop的lib中:
cp mysql-connector-java-5.1.17.jar /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
3、配置
配置sqoop,加入path
vim /etc/profile
export sqoop_home=/usr/local/sqoop-1.4.7.bin__hadoop-2.6.0 export path=$sqoop_home/bin:$path
配置sqoop-evn.sh
在sqoop的conf目录下
cp sqoop-env-template.sh sqoop-env.sh vim sqoop-env.sh
文件内容增加
export hadoop_common_home=/lib/hadoop
export hadoop_mapred_home=/lib/hadoop-mapreduce
export hive_home=/search/hive
export hive_conf_dir=/search/hive/conf
4、测试
测试与mysql的连通性
sqoop list-databases --connect jdbc:mysql://${host}:3306/test?characterencoding=utf-8 --username ${username} --password '$password' warning: /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! hcatalog jobs will fail. please set $hcat_home to the root of your hcatalog installation. warning: /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! accumulo imports will fail. please set $accumulo_home to the root of your accumulo installation. 19/12/12 16:39:52 info sqoop.sqoop: running sqoop version: 1.4.7 19/12/12 16:39:52 warn tool.basesqooptool: setting your password on the command-line is insecure. consider using -p instead. 19/12/12 16:39:53 info manager.mysqlmanager: preparing to use a mysql streaming resultset. information_schema test
测试从mysql中导入到hive中
sqoop import \ --connect jdbc:mysql://***:3306/test?characterencoding=utf-8 \ --username ***\ --password '***' \ --table testxxx \ -m 1 \ --warehouse-dir /user/teemo/hive/warehouse/lyj_temp.db \ --hive-database lyj_temp \ --hive-import \ --hive-table testxxx
运行结果:
进入hive,select 一下能看到结果
遇到得问题:
问题1
19/12/12 17:22:29 info orm.compilationmanager: writing jar file: /tmp/sqoop-root/compile/5c81c7e9c9d08472a6c1a1a272e4eeb3/testxxx.jar exception in thread "main" java.lang.noclassdeffounderror: org/apache/hadoop/mapreduce/inputformat at java.lang.classloader.defineclass1(native method) at java.lang.classloader.defineclass(classloader.java:763) at java.security.secureclassloader.defineclass(secureclassloader.java:142) at java.net.urlclassloader.defineclass(urlclassloader.java:468) at java.net.urlclassloader.access$100(urlclassloader.java:74) at java.net.urlclassloader$1.run(urlclassloader.java:369) at java.net.urlclassloader$1.run(urlclassloader.java:363) at java.security.accesscontroller.doprivileged(native method) at java.net.urlclassloader.findclass(urlclassloader.java:362) at java.lang.classloader.loadclass(classloader.java:424) at sun.misc.launcher$appclassloader.loadclass(launcher.java:349) at java.lang.classloader.loadclass(classloader.java:357) at java.lang.classloader.defineclass1(native method) at java.lang.classloader.defineclass(classloader.java:763) at java.security.secureclassloader.defineclass(secureclassloader.java:142) at java.net.urlclassloader.defineclass(urlclassloader.java:468) at java.net.urlclassloader.access$100(urlclassloader.java:74) at java.net.urlclassloader$1.run(urlclassloader.java:369) at java.net.urlclassloader$1.run(urlclassloader.java:363) at java.security.accesscontroller.doprivileged(native method) at java.net.urlclassloader.findclass(urlclassloader.java:362) at java.lang.classloader.loadclass(classloader.java:424) at sun.misc.launcher$appclassloader.loadclass(launcher.java:349) at java.lang.classloader.loadclass(classloader.java:357) at java.lang.classloader.defineclass1(native method) at java.lang.classloader.defineclass(classloader.java:763) at java.security.secureclassloader.defineclass(secureclassloader.java:142) at java.net.urlclassloader.defineclass(urlclassloader.java:468) at java.net.urlclassloader.access$100(urlclassloader.java:74) at java.net.urlclassloader$1.run(urlclassloader.java:369) at java.net.urlclassloader$1.run(urlclassloader.java:363) at java.security.accesscontroller.doprivileged(native method) at java.net.urlclassloader.findclass(urlclassloader.java:362) at java.lang.classloader.loadclass(classloader.java:424) at sun.misc.launcher$appclassloader.loadclass(launcher.java:349) at java.lang.classloader.loadclass(classloader.java:357) at java.lang.classloader.defineclass1(native method) at java.lang.classloader.defineclass(classloader.java:763) at java.security.secureclassloader.defineclass(secureclassloader.java:142) at java.net.urlclassloader.defineclass(urlclassloader.java:468) at java.net.urlclassloader.access$100(urlclassloader.java:74) at java.net.urlclassloader$1.run(urlclassloader.java:369) at java.net.urlclassloader$1.run(urlclassloader.java:363) at java.security.accesscontroller.doprivileged(native method) at java.net.urlclassloader.findclass(urlclassloader.java:362) at java.lang.classloader.loadclass(classloader.java:424) at sun.misc.launcher$appclassloader.loadclass(launcher.java:349) at java.lang.classloader.loadclass(classloader.java:357) at org.apache.sqoop.manager.importjobcontext.<init>(importjobcontext.java:51) at com.cloudera.sqoop.manager.importjobcontext.<init>(importjobcontext.java:33) at org.apache.sqoop.tool.importtool.importtable(importtool.java:506) at org.apache.sqoop.tool.importtool.run(importtool.java:628) at org.apache.sqoop.sqoop.run(sqoop.java:147) at org.apache.hadoop.util.toolrunner.run(toolrunner.java:70) at org.apache.sqoop.sqoop.runsqoop(sqoop.java:183) at org.apache.sqoop.sqoop.runtool(sqoop.java:234) at org.apache.sqoop.sqoop.runtool(sqoop.java:243) at org.apache.sqoop.sqoop.main(sqoop.java:252) caused by: java.lang.classnotfoundexception: org.apache.hadoop.mapreduce.inputformat at java.net.urlclassloader.findclass(urlclassloader.java:382) at java.lang.classloader.loadclass(classloader.java:424) at sun.misc.launcher$appclassloader.loadclass(launcher.java:349) at java.lang.classloader.loadclass(classloader.java:357) ... 58 more
配置的hive_home这个变量有问题,应该是hadoop-mapreduce这个目录
问题2
19/12/12 17:59:34 error tool.importtool: import failed: java.io.ioexception: java.lang.classnotfoundexception: org.apache.hadoop.hive.conf.hiveconf at org.apache.sqoop.hive.hiveconfig.gethiveconf(hiveconfig.java:50) at org.apache.sqoop.hive.hiveimport.gethiveargs(hiveimport.java:392) at org.apache.sqoop.hive.hiveimport.executeexternalhivescript(hiveimport.java:379) at org.apache.sqoop.hive.hiveimport.executescript(hiveimport.java:337) at org.apache.sqoop.hive.hiveimport.importtable(hiveimport.java:241) at org.apache.sqoop.tool.importtool.importtable(importtool.java:537) at org.apache.sqoop.tool.importtool.run(importtool.java:628) at org.apache.sqoop.sqoop.run(sqoop.java:147) at org.apache.hadoop.util.toolrunner.run(toolrunner.java:70) at org.apache.sqoop.sqoop.runsqoop(sqoop.java:183) at org.apache.sqoop.sqoop.runtool(sqoop.java:234) at org.apache.sqoop.sqoop.runtool(sqoop.java:243) at org.apache.sqoop.sqoop.main(sqoop.java:252) caused by: java.lang.classnotfoundexception: org.apache.hadoop.hive.conf.hiveconf at java.net.urlclassloader.findclass(urlclassloader.java:382) at java.lang.classloader.loadclass(classloader.java:424) at sun.misc.launcher$appclassloader.loadclass(launcher.java:349) at java.lang.classloader.loadclass(classloader.java:357) at java.lang.class.forname0(native method) at java.lang.class.forname(class.java:264) at org.apache.sqoop.hive.hiveconfig.gethiveconf(hiveconfig.java:44) ... 12 more
将hive的lib里面的拷贝到sqoop的lib目录下
cp hive-exec-1.2.1.jar /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0/lib/
上一篇: 迅雷智能充电器官宣:一边充电一边赚钱
下一篇: 武则天最后为什么会把*还给李家?