Setting Up and Testing a Hadoop 2.2.0 Development Environment
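Single-node development setup for Hadoop 2.2.0.
Environment:
OS | CentOS 6.3, 64-bit |
JDK version | Oracle JDK 1.7 |
Hadoop version | 2.2.0 |
Linux user | hadoop |
Directory layout:
/home/hadoop | user home directory |
/app/hadoop/hadoop-2.2.0 | Hadoop installation directory (software home) |
/app/hadoop/dfs/name | NameNode metadata and edit logs |
/app/hadoop/dfs/data | DataNode block data |
/app/hadoop/mapred/local | MapReduce local data |
/app/hadoop/mapred/system | MapReduce system data |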
1. Install the JDK
Open /etc/profile:
sudo vim /etc/profile
Append the following lines:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH
Then reload the profile:
source /etc/profile
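As a quick sanity check after reloading the profile, the sketch below confirms that JAVA_HOME points at a usable JDK. The function name check_java_home is made up for this illustration and is not part of the JDK or Hadoop; it only tests that bin/java exists under the directory and is executable.

```shell
# check_java_home [DIR]: hypothetical helper, not part of the JDK or Hadoop.
# Succeeds when DIR (default: $JAVA_HOME) contains an executable bin/java;
# otherwise prints a message to stderr and returns non-zero.
check_java_home() {
    java_home="${1:-${JAVA_HOME:-}}"
    if [ -z "$java_home" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$java_home/bin/java" ]; then
        echo "no executable java under $java_home/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks usable: $java_home"
}
```

A typical call would be `check_java_home && java -version`, run in a shell where /etc/profile has been sourced.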
2. Passwordless SSH login
As the hadoop user:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
As root:
chmod go-w /home/hadoop/.ssh
chmod 600 /home/hadoop/.ssh/authorized_keys
Test, as the hadoop user (the login should succeed without a password prompt):
[hadoop@hadoop01 ~]$ ssh localhost
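If ssh localhost still asks for a password, the permissions set in the root step are the first thing to check, since sshd commonly rejects keys whose directory is writable by group or others. The sketch below verifies the two chmod results; check_ssh_perms is a hypothetical helper name, and it assumes GNU stat (`stat -c %a`), as shipped with CentOS.

```shell
# check_ssh_perms [DIR]: hypothetical helper for illustration.
# Verifies that DIR (default: ~/.ssh) is not writable by group or others
# and that DIR/authorized_keys has mode 600. Assumes GNU stat.
check_ssh_perms() {
    ssh_dir="${1:-$HOME/.ssh}"
    dir_mode=$(stat -c %a "$ssh_dir") || return 1
    key_mode=$(stat -c %a "$ssh_dir/authorized_keys") || return 1
    # 022 is the octal mask for the group-write and other-write bits.
    if [ $(( 0$dir_mode & 022 )) -ne 0 ]; then
        echo "$ssh_dir is writable by group or others (mode $dir_mode)" >&2
        return 1
    fi
    if [ "$key_mode" != "600" ]; then
        echo "authorized_keys has mode $key_mode, expected 600" >&2
        return 1
    fi
    echo "permissions OK"
}
```

For a non-interactive login test, `ssh -o BatchMode=yes localhost true` fails instead of prompting when key authentication does not work.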
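3. Install Hadoop
You can download the source package and compile it yourself to get native libraries that match your platform.
To keep things simple, I downloaded the prebuilt Hadoop package directly:
URL: http://apache.fayea.com/apache-mirror/hadoop/common/hadoop-2.2.0/
Extract the archive,
then move the extracted directory to the software directory:
/app/hadoop/hadoop-2.2.0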
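The extract-and-move step can be sketched as a small function. Everything named here is an assumption for illustration: install_hadoop is a hypothetical helper, the tarball name hadoop-2.2.0.tar.gz is assumed from the mirror directory, and the sketch extracts straight into the parent directory rather than extracting first and moving afterwards.

```shell
# install_hadoop TARBALL DEST_PARENT: hypothetical helper for illustration.
# Extracts TARBALL into DEST_PARENT, so an archive whose top-level directory
# is hadoop-2.2.0 ends up as DEST_PARENT/hadoop-2.2.0.
install_hadoop() {
    tarball="$1"
    dest_parent="$2"
    mkdir -p "$dest_parent" || return 1
    tar -xzf "$tarball" -C "$dest_parent"
}

# Example usage (tarball name is an assumption):
#   install_hadoop hadoop-2.2.0.tar.gz /app/hadoop
#   export HADOOP_HOME=/app/hadoop/hadoop-2.2.0
#   export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
```

Exporting HADOOP_HOME and putting bin and sbin on PATH is what lets later commands such as hdfs and start-yarn.sh be run without full paths.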
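4. Edit the Hadoop configuration files
The configuration files are under /app/hadoop/hadoop-2.2.0/etc/hadoop.
Edit core-site.xml:
vim core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-host:8020</value>
    <description>The name of the default file system. Either the literal string "local" or a host:port for NDFS.</description>
    <final>true</final>
  </property>
</configuration>
In Hadoop 2.x, fs.default.name is a deprecated alias of fs.defaultFS; both are accepted.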
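Edit hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/app/hadoop/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/app/hadoop/dfs/data</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>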
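Edit mapred-site.xml (Hadoop 2.2.0 ships only mapred-site.xml.template, so copy it to mapred-site.xml first if the file does not exist):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>file:/app/hadoop/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>file:/app/hadoop/mapred/local</value>
    <final>true</final>
  </property>
</configuration>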
Edit yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
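For a cluster environment, configure yarn-site.xml as follows instead:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>Master.Hadoop:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>Master.Hadoop:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>Master.Hadoop:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>Master.Hadoop:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>Master.Hadoop:8088</value>
  </property>
</configuration>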
Edit hadoop-env.sh and add the same JAVA_HOME that was set in /etc/profile in step 1:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
Create the local directories:
mkdir -p /app/hadoop/dfs/name
mkdir -p /app/hadoop/dfs/data
mkdir -p /app/hadoop/mapred/local
mkdir -p /app/hadoop/mapred/system
Start Hadoop
Format the NameNode:
[hadoop@hadoop01 ~]$ hdfs namenode -format
Start the HDFS daemons:
Start: start-dfs.sh
Stop: stop-dfs.sh
Start the YARN daemons:
Start: start-yarn.sh
Stop: stop-yarn.sh
Use jps to list the running processes:
[hadoop@ttpod sbin]$ jps
7621 NameNode
11834 Jps
7734 DataNode
7881 SecondaryNameNode
10156 NodeManager
10053 ResourceManager
If all of the processes above are listed, Hadoop has started.
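The same check can be scripted. The sketch below reads jps output and prints any of the five daemons from the listing above that is missing; missing_daemons is a hypothetical helper name, not a Hadoop command.

```shell
# missing_daemons: hypothetical helper for illustration.
# Reads `jps` output on stdin and prints, one per line, each expected
# daemon that does not appear in it. Prints nothing when all are running.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare against the whole second field, so that "NameNode"
        # is not satisfied by a "SecondaryNameNode" line.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Run it as `jps | missing_daemons`; empty output means every expected daemon is up.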
Open the Hadoop ResourceManager web UI:
http://192.168.6.124:8088/
Open the HDFS web UI: http://192.168.6.124:50070
Test:
Run the pi example program:
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 10 10
Or, from within $HADOOP_HOME/share/hadoop/mapreduce:
hadoop jar hadoop-mapreduce-examples-2.2.0.jar pi 10 10
Number of Maps  = 10
Samples per Map = 10
13/12/13 16:27:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
13/12/13 16:27:24 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
13/12/13 16:27:25 INFO input.FileInputFormat: Total input paths to process : 10
13/12/13 16:27:25 INFO mapreduce.JobSubmitter: number of splits:10
13/12/13 16:27:25 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
13/12/13 16:27:25 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
13/12/13 16:27:25 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/12/13 16:27:25 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
13/12/13 16:27:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1386923206015_0001
13/12/13 16:27:26 INFO impl.YarnClientImpl: Submitted application application_1386923206015_0001 to ResourceManager at /0.0.0.0:8032
13/12/13 16:27:26 INFO mapreduce.Job: The url to track the job: http://ttpod:8088/proxy/application_1386923206015_0001/
13/12/13 16:27:26 INFO mapreduce.Job: Running job: job_1386923206015_0001
13/12/13 16:27:34 INFO mapreduce.Job: Job job_1386923206015_0001 running in uber mode : false
13/12/13 16:27:34 INFO mapreduce.Job:  map 0% reduce 0%
13/12/13 16:27:56 INFO mapreduce.Job:  map 60% reduce 0%
13/12/13 16:28:13 INFO mapreduce.Job:  map 100% reduce 0%
13/12/13 16:28:14 INFO mapreduce.Job:  map 100% reduce 100%
13/12/13 16:28:15 INFO mapreduce.Job: Job job_1386923206015_0001 completed successfully
13/12/13 16:28:16 INFO mapreduce.Job: Counters: 43
    File System Counters
        FILE: Number of bytes read=226
        FILE: Number of bytes written=871752
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2610
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=43
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=185391
        Total time spent by all reduces in occupied slots (ms)=15412
    Map-Reduce Framework
        Map input records=10
        Map output records=20
        Map output bytes=180
        Map output materialized bytes=280
        Input split bytes=1430
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=280
        Reduce input records=20
        Reduce output records=0
        Spilled Records=40
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=1841
        CPU time spent (ms)=7600
        Physical memory (bytes) snapshot=2507419648
        Virtual memory (bytes) snapshot=9588948992
        Total committed heap usage (bytes)=1944584192
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1180
    File Output Format Counters
        Bytes Written=97
Job Finished in 51.367 seconds
Estimated value of Pi is 3.20000000000000000000
The job ran normally.
If any exception occurs, check the Hadoop run logs (by default under $HADOOP_HOME/logs).
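For a scripted smoke test, the final "Estimated value of Pi is" line is a convenient success marker. The sketch below pulls the number out of captured job output and fails when the line is absent; pi_estimate is a hypothetical helper name. With only 10 maps of 10 samples each (100 sample points in total), a coarse value such as 3.2 is to be expected.

```shell
# pi_estimate: hypothetical helper for illustration.
# Reads the pi job's output on stdin and prints the number from the
# "Estimated value of Pi is ..." line. Returns non-zero when no such
# line is found, because grep matches nothing.
pi_estimate() {
    sed -n 's/^.*Estimated value of Pi is \([0-9.]*\).*$/\1/p' | grep .
}

# Example usage (captures both stdout and stderr of the job):
#   hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 10 10 2>&1 | pi_estimate
```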
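Original article: Hadoop2.2.0 开发环境搭建测试 (Setting Up and Testing a Hadoop 2.2.0 Development Environment). Thanks to the original author for sharing.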