Hadoop 2.2.0: Setting Up and Testing a Development Environment

程序员文章站 2022-05-18 17:56:10
This walkthrough sets up Hadoop 2.2.0 on a single machine for development.

Environment:

OS: CentOS 6.3, 64-bit
JDK: Oracle JDK 1.7
Hadoop: 2.2.0
Linux user: hadoop

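Directory layout:

/home/hadoop               home directory of the hadoop user
/app/hadoop/hadoop-2.2.0   Hadoop installation (software home)
/app/hadoop/dfs/name       NameNode data and edit logs
/app/hadoop/dfs/data       DataNode block data
/app/hadoop/mapred/local   MapReduce local data
/app/hadoop/mapred/system  MapReduce system data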

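1. Install the JDK

Open /etc/profile (sudo vim /etc/profile) and append the following lines:

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH

Then reload the profile in the current shell:

source /etc/profile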
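Before moving on, it can help to confirm that JAVA_HOME really points at a JDK. The following is a minimal sketch, not part of the original setup; the function name check_java_home is made up for illustration, and it only checks that an executable bin/java exists under the given directory.

```shell
#!/bin/sh
# check_java_home DIR (hypothetical helper): succeed only if DIR/bin/java
# exists and is executable.
check_java_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable java under $dir/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks usable: $dir"
}

# Typical use after sourcing /etc/profile:
#   check_java_home "$JAVA_HOME" && java -version
```

If the check fails, the JDK path in /etc/profile most likely does not match where the JDK was actually unpacked.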

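2. Passwordless SSH login

As the hadoop user, generate a key pair with an empty passphrase and authorize it for local logins:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys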

As root, tighten the permissions:

chmod go-w /home/hadoop/.ssh
chmod 600 /home/hadoop/.ssh/authorized_keys

Test, as the hadoop user. The login should succeed without asking for a password:

[hadoop@hadoop01 ~]$ ssh localhost
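If ssh still asks for a password, file permissions are a common cause, because sshd in its default strict mode ignores keys kept in loosely protected directories. The sketch below re-checks the two permissions set above. The helper name check_ssh_perms is hypothetical, and the script assumes GNU stat (the -c option), as found on CentOS.

```shell
#!/bin/sh
# check_ssh_perms DIR (hypothetical helper): succeed only if DIR is not
# writable by group or others and DIR/authorized_keys has mode 600.
check_ssh_perms() {
    ssh_dir="$1"
    dir_mode=$(stat -c '%a' "$ssh_dir") || return 1
    key_mode=$(stat -c '%a' "$ssh_dir/authorized_keys") || return 1

    # The last two octal digits are the group and other permissions;
    # the digits 2, 3, 6 and 7 include the write bit.
    case "$dir_mode" in
        *[2367]?|*[2367])
            echo "$ssh_dir is writable by group or others (mode $dir_mode)" >&2
            return 1 ;;
    esac

    if [ "$key_mode" != "600" ]; then
        echo "authorized_keys has mode $key_mode, expected 600" >&2
        return 1
    fi
    echo "permissions look fine"
}

# Typical use:
#   check_ssh_perms /home/hadoop/.ssh
```

For a scripted login test, ssh -o BatchMode=yes localhost true fails immediately instead of prompting when key authentication does not work.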

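3. Install Hadoop

You can download the source package and compile it yourself to get native libraries that match the local platform.

To keep things simple, I downloaded a prebuilt Hadoop package instead:

Download: http://apache.fayea.com/apache-mirror/hadoop/common/hadoop-2.2.0/

Unpack the archive, then move the unpacked directory to the software directory:

/app/hadoop/hadoop-2.2.0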
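Later steps call hdfs and start-yarn.sh without a path and refer to $HADOOP_HOME, so the hadoop user's shell needs to know where the installation lives. The article does not show this step. The lines below are an assumed addition to the hadoop user's ~/.bashrc (or /etc/profile), not something taken from the original.

```shell
# Assumed environment setup for the hadoop user (not in the original article).
export HADOOP_HOME=/app/hadoop/hadoop-2.2.0
# bin holds the hadoop, hdfs and yarn commands;
# sbin holds the start-*.sh and stop-*.sh scripts.
export PATH="$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH"
```

After reloading the shell configuration, hadoop version should print the 2.2.0 release information.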

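4. Edit the Hadoop configuration files

The configuration files are in /app/hadoop/hadoop-2.2.0/etc/hadoop.

Edit core-site.xml (vim core-site.xml). The host name hadoop-host must resolve to this machine, for example through /etc/hosts:

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://hadoop-host:8020</value>
        <description>The name of the default file system. Either the literal string "local" or a host:port for NDFS.</description>
        <final>true</final>
    </property>
</configuration>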

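Edit hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/app/hadoop/dfs/name</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/app/hadoop/dfs/data</value>
        <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
        <final>true</final>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>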

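Edit mapred-site.xml. If the file does not exist yet, copy it from mapred-site.xml.template in the same directory:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapred.system.dir</name>
        <value>file:/app/hadoop/mapred/system</value>
        <final>true</final>
    </property>
    <property>
        <name>mapred.local.dir</name>
        <value>file:/app/hadoop/mapred/local</value>
        <final>true</final>
    </property>
</configuration>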

Edit yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

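For a cluster setup, yarn-site.xml would look like this instead, where Master.Hadoop is the host name of the master node:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>Master.Hadoop:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>Master.Hadoop:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>Master.Hadoop:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>Master.Hadoop:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>Master.Hadoop:8088</value>
    </property>
</configuration>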
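All four site files share one shape: a configuration root element that contains one property element per setting, each with a name and a value. As a rough illustration of that structure, the sketch below prints such a file from name/value pairs. The helper names hadoop_property and hadoop_site_xml are invented for this example, and the script does not escape XML special characters, so it only suits simple values like the ones used in this article.

```shell
#!/bin/sh
# hadoop_property NAME VALUE (hypothetical helper): print one <property> element.
hadoop_property() {
    printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

# hadoop_site_xml NAME VALUE [NAME VALUE ...] (hypothetical helper):
# print a complete site file built from the given pairs.
hadoop_site_xml() {
    printf '<?xml version="1.0"?>\n<configuration>\n'
    while [ "$#" -ge 2 ]; do
        hadoop_property "$1" "$2"
        shift 2
    done
    printf '</configuration>\n'
}

# Typical use, writing the single-node yarn-site.xml:
#   hadoop_site_xml yarn.nodemanager.aux-services mapreduce_shuffle > yarn-site.xml
```

Once the files are in place, hdfs getconf -confKey dfs.replication is one way to confirm that Hadoop picks up a value.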

Edit hadoop-env.sh and set JAVA_HOME to the JDK installed in step 1:

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45

Create the local directories:

mkdir -p /app/hadoop/dfs/name
mkdir -p /app/hadoop/dfs/data
mkdir -p /app/hadoop/mapred/local
mkdir -p /app/hadoop/mapred/system
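The same four directories can be created in one loop, which makes it easy to try the layout under a different base directory first. This is only a convenience sketch; the function name make_hadoop_dirs is hypothetical.

```shell
#!/bin/sh
# make_hadoop_dirs BASE (hypothetical helper): create the four data
# directories used in this article under BASE.
make_hadoop_dirs() {
    base="$1"
    for sub in dfs/name dfs/data mapred/local mapred/system; do
        mkdir -p "$base/$sub" || return 1
    done
}

# Typical use:
#   make_hadoop_dirs /app/hadoop
```

If the directories are created as root, they also need to be handed to the hadoop user afterwards, for example with chown -R hadoop /app/hadoop, or the daemons will not be able to write to them.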

Start Hadoop

Format the NameNode:

[hadoop@hadoop01 ~]$ hdfs namenode -format

Start the DFS daemons:

Start: start-dfs.sh

Stop: stop-dfs.sh

Start the YARN daemons:

Start: start-yarn.sh

Stop: stop-yarn.sh

Use jps to list the running processes:

[hadoop@ttpod sbin]$ jps
7621 NameNode
11834 Jps
7734 DataNode
7881 SecondaryNameNode
10156 NodeManager
10053 ResourceManager

If all of the daemons above are listed, Hadoop has started.
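Reading the jps list by eye works, but the check is easy to script. The sketch below reads jps output on standard input and prints every expected daemon that is missing. The function name missing_daemons is made up for this example, and the list of daemons is simply the five shown above.

```shell
#!/bin/sh
# missing_daemons (hypothetical helper): read `jps` output on stdin and
# print each expected daemon that is absent. Returns 0 only if none is missing.
missing_daemons() {
    jps_out=$(cat)
    status=0
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <class name>"; compare the whole second field so
        # that SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_out" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
            status=1
        fi
    done
    return $status
}

# Typical use:
#   jps | missing_daemons && echo "all daemons are running"
```

A daemon that shows up as missing usually left an explanation in its own log file.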

Open the Hadoop resource management page (YARN ResourceManager):

http://192.168.6.124:8088/

Open the HDFS web interface:

http://192.168.6.124:50070

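Test:

Run the bundled pi example. The two arguments are the number of map tasks and the number of samples per map:

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 10 10

Run from the directory that contains the jar, the command and its output look like this:

hadoop jar hadoop-mapreduce-examples-2.2.0.jar pi 10 10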

Number of Maps  = 10
Samples per Map = 10
13/12/13 16:27:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
13/12/13 16:27:24 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
13/12/13 16:27:25 INFO input.FileInputFormat: Total input paths to process : 10
13/12/13 16:27:25 INFO mapreduce.JobSubmitter: number of splits:10
13/12/13 16:27:25 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
13/12/13 16:27:25 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
13/12/13 16:27:25 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/12/13 16:27:25 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
13/12/13 16:27:25 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
13/12/13 16:27:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1386923206015_0001
13/12/13 16:27:26 INFO impl.YarnClientImpl: Submitted application application_1386923206015_0001 to ResourceManager at /0.0.0.0:8032
13/12/13 16:27:26 INFO mapreduce.Job: The url to track the job: http://ttpod:8088/proxy/application_1386923206015_0001/
13/12/13 16:27:26 INFO mapreduce.Job: Running job: job_1386923206015_0001
13/12/13 16:27:34 INFO mapreduce.Job: Job job_1386923206015_0001 running in uber mode : false
13/12/13 16:27:34 INFO mapreduce.Job:  map 0% reduce 0%
13/12/13 16:27:56 INFO mapreduce.Job:  map 60% reduce 0%
13/12/13 16:28:13 INFO mapreduce.Job:  map 100% reduce 0%
13/12/13 16:28:14 INFO mapreduce.Job:  map 100% reduce 100%
13/12/13 16:28:15 INFO mapreduce.Job: Job job_1386923206015_0001 completed successfully
13/12/13 16:28:16 INFO mapreduce.Job: Counters: 43
    File System Counters
        FILE: Number of bytes read=226
        FILE: Number of bytes written=871752
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2610
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=43
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=185391
        Total time spent by all reduces in occupied slots (ms)=15412
    Map-Reduce Framework
        Map input records=10
        Map output records=20
        Map output bytes=180
        Map output materialized bytes=280
        Input split bytes=1430
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=280
        Reduce input records=20
        Reduce output records=0
        Spilled Records=40
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=1841
        CPU time spent (ms)=7600
        Physical memory (bytes) snapshot=2507419648
        Virtual memory (bytes) snapshot=9588948992
        Total committed heap usage (bytes)=1944584192
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1180
    File Output Format Counters
        Bytes Written=97
Job Finished in 51.367 seconds
Estimated value of Pi is 3.20000000000000000000
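The estimate of 3.2 looks far from 3.14159, but it is what such a small run can deliver. The pi example samples points in a square and multiplies the fraction that falls inside the inscribed circle by 4. With 10 maps and 10 samples per map there are only 100 points, so the result can only move in steps of 4/100 = 0.04. The sketch below reproduces that arithmetic. The helper name estimate_pi is hypothetical, and the count of 80 points inside the circle is inferred from the printed result, not taken from the log.

```shell
#!/bin/sh
# estimate_pi INSIDE TOTAL (hypothetical helper): area-ratio estimate
# 4 * INSIDE / TOTAL, printed with two decimals.
estimate_pi() {
    awk -v inside="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 4 * inside / total }'
}

estimate_pi 80 100   # prints 3.20, matching the job output above
estimate_pi 79 100   # prints 3.16, the nearest value one step below
```

Raising either argument of the pi job gives more sample points and therefore a finer estimate, at the cost of a longer run.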

The job ran normally.

If anything goes wrong, check the run logs, which by default are written to $HADOOP_HOME/logs.