Hadoop Cluster (CDH4) in Practice (4): Setting Up Oozie
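Series Contents
Hadoop Cluster (CDH4) in Practice (0): Preface
Hadoop Cluster (CDH4) in Practice (1): Setting Up Hadoop (HDFS)
Hadoop Cluster (CDH4) in Practice (2): Setting Up HBase & Zookeeper
Hadoop Cluster (CDH4) in Practice (3): Setting Up Hive
Hadoop Cluster (CDH4) in Practice (4): Setting Up Oozie
Hadoop Cluster (CDH4) in Practice (5): Installing Sqoop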
This Article
Hadoop Cluster (CDH4) in Practice (4): Setting Up Oozie
References
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/CDH4-Installation-Guide.html
Environment
OS: CentOS 6.4 x86_64
Servers:
hadoop-master: 172.17.20.230, 10 GB RAM
- namenode
- hbase-master
hadoop-secondary: 172.17.20.234, 10 GB RAM
- secondarynamenode, jobtracker
- hive-server, hive-metastore
- oozie
hadoop-node-1: 172.17.20.231, 10 GB RAM
- datanode, tasktracker
- hbase-regionserver, zookeeper-server
hadoop-node-2: 172.17.20.232, 10 GB RAM
- datanode, tasktracker
- hbase-regionserver, zookeeper-server
hadoop-node-3: 172.17.20.233, 10 GB RAM
- datanode, tasktracker
- hbase-regionserver, zookeeper-server
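Later steps refer to hosts by name (for example hadoop-secondary), so every machine has to resolve these names. The sketch below prints /etc/hosts entries built from the server list above. It assumes the names are not already served by DNS and were not added in an earlier part of this series; the function name print_cluster_hosts is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print /etc/hosts entries for the five servers
# listed above. IPs and hostnames are copied from the server list.
print_cluster_hosts() {
  cat <<'EOF'
172.17.20.230 hadoop-master
172.17.20.234 hadoop-secondary
172.17.20.231 hadoop-node-1
172.17.20.232 hadoop-node-2
172.17.20.233 hadoop-node-3
EOF
}
```

If the entries are missing, something like `print_cluster_hosts | sudo tee -a /etc/hosts` on each machine would append them.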
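A brief introduction to the roles above:
namenode - manages the namespace of the entire HDFS
secondarynamenode - periodically checkpoints the namenode's metadata by merging the edit log into the fsimage; despite the name, it is not a hot standby for the namenode
jobtracker - manages jobs for parallel computation
datanode - the HDFS storage node service
tasktracker - executes the tasks of parallel computation jobs
hbase-master - the HBase management service
hbase-regionserver - serves client requests to insert, delete, and query data
zookeeper-server - the Zookeeper coordination and configuration management service
hive-server - the Hive management service
hive-metastore - Hive's metadata store; Hive uses it for type checking and semantic analysis of queries
oozie - a Java web application for defining and managing workflows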
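To avoid confusion when configuring multiple servers, this article follows one convention:
All of the operations below need to be run only on the host where Oozie lives, that is, hadoop-secondary.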
1. Prerequisites
Complete Hadoop Cluster (CDH4) in Practice (3): Setting Up Hive first.
2. Install Oozie
$ sudo yum install oozie oozie-client
3. Create the Oozie database
$ mysql -uroot -phiveserver
mysql> create database oozie;
mysql> grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
mysql> grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
mysql> exit;
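Before moving on, it can help to confirm that the new account really can log in to the new database. The following is a minimal sketch, not part of the original procedure: the function name check_oozie_db and the MYSQL_BIN override are hypothetical, while the user, password, and database name are the ones granted above.

```shell
#!/bin/sh
# Hypothetical helper: log in as the 'oozie' MySQL user created above and
# list the tables of the 'oozie' database.
# MYSQL_BIN is an assumed override so the client can be swapped out.
MYSQL_BIN="${MYSQL_BIN:-mysql}"

check_oozie_db() {
  "$MYSQL_BIN" -uoozie -poozie -h localhost -e 'SHOW TABLES;' oozie
}
```

Running `check_oozie_db` on hadoop-secondary should return without an "Access denied" error; an empty table list is enough to show that the login and the grants work.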
4. Configure oozie-site.xml
$ sudo vim /etc/oozie/conf/oozie-site.xml
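<configuration>
  <property>
    <name>oozie.service.ActionService.executor.ext.classes</name>
    <value>org.apache.oozie.action.email.EmailActionExecutor,org.apache.oozie.action.hadoop.HiveActionExecutor,org.apache.oozie.action.hadoop.ShellActionExecutor,org.apache.oozie.action.hadoop.SqoopActionExecutor,org.apache.oozie.action.hadoop.DistcpActionExecutor</value>
  </property>
  <property>
    <name>oozie.service.SchemaService.wf.ext.schemas</name>
    <value>shell-action-0.1.xsd,shell-action-0.2.xsd,email-action-0.1.xsd,hive-action-0.2.xsd,hive-action-0.3.xsd,hive-action-0.4.xsd,hive-action-0.5.xsd,sqoop-action-0.2.xsd,sqoop-action-0.3.xsd,ssh-action-0.1.xsd,ssh-action-0.2.xsd,distcp-action-0.1.xsd</value>
  </property>
  <property>
    <name>oozie.system.id</name>
    <value>oozie-${user.name}</value>
  </property>
  <property>
    <name>oozie.systemmode</name>
    <value>NORMAL</value>
  </property>
  <property>
    <name>oozie.service.AuthorizationService.security.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>oozie.service.PurgeService.older.than</name>
    <value>30</value>
  </property>
  <property>
    <name>oozie.service.PurgeService.purge.interval</name>
    <value>3600</value>
  </property>
  <property>
    <name>oozie.service.CallableQueueService.queue.size</name>
    <value>10000</value>
  </property>
  <property>
    <name>oozie.service.CallableQueueService.threads</name>
    <value>10</value>
  </property>
  <property>
    <name>oozie.service.CallableQueueService.callable.concurrency</name>
    <value>3</value>
  </property>
  <property>
    <name>oozie.service.coord.normal.default.timeout</name>
    <value>120</value>
  </property>
  <property>
    <name>oozie.db.schema.name</name>
    <value>oozie</value>
  </property>
  <property>
    <name>oozie.service.JPAService.create.db.schema</name>
    <value>true</value>
  </property>
  <property>
    <name>oozie.service.JPAService.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>oozie.service.JPAService.jdbc.url</name>
    <value>jdbc:mysql://localhost:3306/oozie</value>
  </property>
  <property>
    <name>oozie.service.JPAService.jdbc.username</name>
    <value>oozie</value>
  </property>
  <property>
    <name>oozie.service.JPAService.jdbc.password</name>
    <value>oozie</value>
  </property>
  <property>
    <name>oozie.service.JPAService.pool.max.active.conn</name>
    <value>10</value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.kerberos.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>local.realm</name>
    <value>LOCALHOST</value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.keytab.file</name>
    <value>${user.home}/oozie.keytab</value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.kerberos.principal</name>
    <value>${user.name}/localhost@${local.realm}</value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.jobTracker.whitelist</name>
    <value> </value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.nameNode.whitelist</name>
    <value> </value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=/etc/hadoop/conf</value>
  </property>
  <property>
    <name>oozie.service.WorkflowAppService.system.libpath</name>
    <value>/user/${user.name}/share/lib</value>
  </property>
  <property>
    <name>use.system.libpath.for.mapreduce.and.pig.jobs</name>
    <value>false</value>
  </property>
  <property>
    <name>oozie.authentication.type</name>
    <value>simple</value>
  </property>
  <property>
    <name>oozie.authentication.token.validity</name>
    <value>36000</value>
  </property>
  <property>
    <name>oozie.authentication.signature.secret</name>
    <value>oozie</value>
  </property>
  <property>
    <name>oozie.authentication.cookie.domain</name>
    <value> </value>
  </property>
  <property>
    <name>oozie.authentication.simple.anonymous.allowed</name>
    <value>true</value>
  </property>
  <property>
    <name>oozie.authentication.kerberos.principal</name>
    <value>HTTP/localhost@${local.realm}</value>
  </property>
  <property>
    <name>oozie.authentication.kerberos.keytab</name>
    <value>${oozie.service.HadoopAccessorService.keytab.file}</value>
  </property>
  <property>
    <name>oozie.authentication.kerberos.name.rules</name>
    <value>DEFAULT</value>
  </property>
  <property>
    <name>oozie.service.ProxyUserService.proxyuser.oozie.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>oozie.service.ProxyUserService.proxyuser.oozie.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>oozie.action.mapreduce.uber.jar.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>oozie.service.HadoopAccessorService.supported.filesystems</name>
    <value>hdfs,viewfs</value>
  </property>
</configuration>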
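With this many properties it is easy to mistype one, so a quick spot check of the values that matter most (the JDBC settings, for instance) may be worthwhile. The sketch below is a hypothetical helper, get_oozie_prop, that reads one property from the file. It assumes the usual Hadoop-style layout in which each `<name>` and its `<value>` sit on a single line, with the name first; it is a plain text scan, not a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one property from a
# Hadoop-style configuration file.
#   $1 = path to the file, $2 = property name
# Assumes <name>...</name> and <value>...</value> each fit on one line.
get_oozie_prop() {
  awk -v want="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>/, "", n)
      sub(/<\/name>.*/, "", n)
      name = n
    }
    /<value>/ {
      v = $0
      sub(/.*<value>/, "", v)
      sub(/<\/value>.*/, "", v)
      if (name == want) { print v; exit }
    }
  ' "$1"
}
```

For example, `get_oozie_prop /etc/oozie/conf/oozie-site.xml oozie.service.JPAService.jdbc.url` should print the JDBC URL configured above.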
5. Configure the Oozie Web Console
$ cd /tmp/
$ wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
$ cd /var/lib/oozie/
$ sudo unzip /tmp/ext-2.2.zip
$ cd ext-2.2/
$ sudo -u hdfs hadoop fs -mkdir /user/oozie
$ sudo -u hdfs hadoop fs -chown oozie:oozie /user/oozie
6. Configure the Oozie ShareLib
$ mkdir /tmp/ooziesharelib
$ cd /tmp/ooziesharelib
$ tar xzf /usr/lib/oozie/oozie-sharelib.tar.gz
$ sudo -u oozie hadoop fs -put share /user/oozie/share
$ sudo -u oozie hadoop fs -ls /user/oozie/share
$ sudo -u oozie hadoop fs -ls /user/oozie/share/lib
$ sudo -u oozie hadoop fs -put /usr/lib/hive/lib/hbase.jar /user/oozie/share/lib/hive/
$ sudo -u oozie hadoop fs -put /usr/lib/hive/lib/zookeeper.jar /user/oozie/share/lib/hive/
$ sudo -u oozie hadoop fs -put /usr/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.5.0.jar /user/oozie/share/lib/hive/
$ sudo -u oozie hadoop fs -put /usr/lib/hive/lib/guava-11.0.2.jar /user/oozie/share/lib/hive/
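The four jar uploads above differ only in the file name, so they can also be written as a loop. This is a sketch of the same commands, not a different procedure; the function name upload_hive_jars and the DRY_RUN switch are hypothetical additions that let the commands be printed for review before anything is copied.

```shell
#!/bin/sh
# Paths and jar names are copied from the commands in this step.
HIVE_LIB=/usr/lib/hive/lib
SHARELIB_HIVE=/user/oozie/share/lib/hive/
JARS="hbase.jar zookeeper.jar hive-hbase-handler-0.10.0-cdh4.5.0.jar guava-11.0.2.jar"

# Hypothetical helper: upload each jar to the Hive ShareLib directory.
# With DRY_RUN=1 (an assumed switch) the commands are only printed.
upload_hive_jars() {
  for jar in $JARS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "sudo -u oozie hadoop fs -put $HIVE_LIB/$jar $SHARELIB_HIVE"
    else
      sudo -u oozie hadoop fs -put "$HIVE_LIB/$jar" "$SHARELIB_HIVE"
    fi
  done
}
```

`DRY_RUN=1 upload_hive_jars` prints the four commands; `upload_hive_jars` on its own runs them. The jar versions are tied to the installed CDH release, so the list would need adjusting on a different version.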
7. Start Oozie
$ sudo service oozie start
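Once the service has started, the oozie-client package installed in step 2 can confirm that the server is answering. `oozie admin -oozie <url> -status` prints a line such as "System mode: NORMAL". The sketch below wraps that check in a hypothetical function, is_oozie_normal, which reads the command output on standard input; the URL in the usage comment assumes the default port 11000.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the status text on stdin reports
# NORMAL mode, as printed by `oozie admin ... -status`.
is_oozie_normal() {
  grep -q '^System mode: NORMAL$'
}

# Usage on the Oozie host (assumes the default port 11000):
#   oozie admin -oozie http://localhost:11000/oozie -status | is_oozie_normal \
#     && echo "Oozie is up"
```

If the check fails, the Oozie log directory is the first place to look; a database connection problem at startup is one likely cause.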
8. Access the Oozie Web Console
http://hadoop-secondary:11000/oozie
9. The Oozie setup is now complete.