Hadoop 2.9.0 Cluster Deployment
程序员文章站
2022-03-09 08:13:00
1. Preparation
Three nodes and their role assignment:
192.168.50.235 hd01 master
192.168.50.236 hd02 slave
192.168.50.237 hd03 slave
2. Upload the installation packages to the master node
Extract them:
Add the environment variables and make them take effect:
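The upload and extraction commands are not shown above; a minimal sketch, assuming the JDK 8u131 and Hadoop 2.9.0 tarballs were uploaded to /root on the master (the archive file names are assumptions, substitute the files you actually uploaded):

```shell
# Assumed archive names -- adjust to what you uploaded.
INSTALL_DIR=/opt/cm/hadoop
mkdir -p "$INSTALL_DIR"
tar -zxf /root/jdk-8u131-linux-x64.tar.gz -C "$INSTALL_DIR"
tar -zxf /root/hadoop-2.9.0.tar.gz -C "$INSTALL_DIR"
```

The resulting paths match the JAVA_HOME and HADOOP_HOME values in the profile below.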
vi /etc/profile
#java
export JAVA_HOME=/opt/cm/hadoop/jdk1.8.0_131
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
#hadoop
export HADOOP_HOME=/opt/cm/hadoop/hadoop-2.9.0
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
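After saving, reload the profile and sanity-check that both tools resolve from the new PATH (the expected versions follow from the paths above):

```shell
# Reload the environment for the current shell and verify it took effect.
source /etc/profile
java -version       # should report java version "1.8.0_131"
hadoop version      # should report Hadoop 2.9.0
```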
3. Passwordless SSH between the nodes
Run on each node:
ssh-keygen -t rsa
Press Enter three times to accept the defaults.
cd /root/.ssh/
cat id_rsa.pub >> authorized_keys
On the master node, run:
ssh root@hd01 cat ~/.ssh/id_rsa.pub >> authorized_keys
ssh root@hd02 cat ~/.ssh/id_rsa.pub >> authorized_keys
ssh root@hd03 cat ~/.ssh/id_rsa.pub >> authorized_keys
scp authorized_keys known_hosts root@hd02:/root/.ssh/
scp authorized_keys known_hosts root@hd03:/root/.ssh/
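A quick way to confirm passwordless login works from the master (hostnames per section 1); each command should print the remote hostname without a password prompt:

```shell
# BatchMode makes ssh fail instead of prompting if key auth is broken.
for h in hd01 hd02 hd03; do
  ssh -o BatchMode=yes "$h" hostname
done
```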
4. Edit the hosts file
On the master node, edit the hosts file:
vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.50.235 hd01
192.168.50.236 hd02
192.168.50.237 hd03
scp it to the slave nodes:
scp /etc/hosts root@hd02:/etc
scp /etc/hosts root@hd03:/etc
5. Edit the Hadoop configuration files on the master node
1).core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hd01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131702</value>
</property>
</configuration>
2).hadoop-env.sh: set JAVA_HOME
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/opt/cm/hadoop/jdk1.8.0_131
#export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
export HADOOP_CONF_DIR=/opt/cm/hadoop/hadoop-2.9.0/etc/hadoop
Change these two defaults to the concrete paths for this environment.
3).hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hd01:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
4).mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hd01:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hd01:19888</value>
</property>
</configuration>
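Note that a stock Hadoop 2.x unpack ships this file only as a template; if mapred-site.xml does not exist yet, create it from the template first (path follows the install location above):

```shell
cd /opt/cm/hadoop/hadoop-2.9.0/etc/hadoop
# Create mapred-site.xml from the bundled template before editing it.
cp mapred-site.xml.template mapred-site.xml
```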
5).slaves
hd01
hd02
hd03
6).yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hd01:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hd01:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hd01:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hd01:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hd01:8088</value>
</property>
<!--
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>768</value>
</property>
Commented out because with this value the YARN NodeManagers failed to start
after the cluster came up; 768 MB is below YARN's default minimum container
allocation of 1024 MB.
-->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
</configuration>
7).scp the Hadoop and Java files to the slave nodes
scp -r /opt/cm/hadoop/ root@hd02:/opt/cm
scp -r /opt/cm/hadoop/ root@hd03:/opt/cm
scp /etc/profile root@hd02:/etc/
scp /etc/profile root@hd03:/etc/
6. On the slave nodes, run:
source /etc/profile
mkdir -p /home/hadoop/tmp
mkdir -p /home/hadoop/dfs/name
mkdir -p /home/hadoop/dfs/data
7. Format the NameNode:
hdfs namenode -format
18/08/27 16:05:22 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
18/08/27 16:05:22 INFO namenode.NameNode: createNameNode [-format]
18/08/27 16:05:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-513442d2-d886-4021-9fe2-e02c65c854e5
18/08/27 16:05:23 INFO namenode.FSEditLog: Edit logging is async:true
18/08/27 16:05:23 INFO namenode.FSNamesystem: KeyProvider: null
18/08/27 16:05:23 INFO namenode.FSNamesystem: fsLock is fair: true
18/08/27 16:05:23 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
18/08/27 16:05:23 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)
18/08/27 16:05:23 INFO namenode.FSNamesystem: supergroup = supergroup
18/08/27 16:05:23 INFO namenode.FSNamesystem: isPermissionEnabled = true
18/08/27 16:05:23 INFO namenode.FSNamesystem: HA Enabled: false
18/08/27 16:05:23 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
18/08/27 16:05:23 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
18/08/27 16:05:23 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
18/08/27 16:05:23 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
18/08/27 16:05:23 INFO blockmanagement.BlockManager: The block deletion will start around 2018 八月 27 16:05:23
18/08/27 16:05:23 INFO util.GSet: Computing capacity for map BlocksMap
18/08/27 16:05:23 INFO util.GSet: VM type = 64-bit
18/08/27 16:05:23 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
18/08/27 16:05:23 INFO util.GSet: capacity = 2^21 = 2097152 entries
18/08/27 16:05:23 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
18/08/27 16:05:23 WARN conf.Configuration: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
18/08/27 16:05:23 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
18/08/27 16:05:23 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
18/08/27 16:05:23 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
18/08/27 16:05:23 INFO blockmanagement.BlockManager: defaultReplication = 2
18/08/27 16:05:23 INFO blockmanagement.BlockManager: maxReplication = 512
18/08/27 16:05:23 INFO blockmanagement.BlockManager: minReplication = 1
18/08/27 16:05:23 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
18/08/27 16:05:23 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
18/08/27 16:05:23 INFO blockmanagement.BlockManager: encryptDataTransfer = false
18/08/27 16:05:23 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
18/08/27 16:05:23 INFO namenode.FSNamesystem: Append Enabled: true
18/08/27 16:05:23 INFO util.GSet: Computing capacity for map INodeMap
18/08/27 16:05:23 INFO util.GSet: VM type = 64-bit
18/08/27 16:05:23 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
18/08/27 16:05:23 INFO util.GSet: capacity = 2^20 = 1048576 entries
18/08/27 16:05:23 INFO namenode.FSDirectory: ACLs enabled? false
18/08/27 16:05:23 INFO namenode.FSDirectory: XAttrs enabled? true
18/08/27 16:05:23 INFO namenode.NameNode: Caching file names occurring more than 10 times
18/08/27 16:05:23 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: falseskipCaptureAccessTimeOnlyChange: false
18/08/27 16:05:23 INFO util.GSet: Computing capacity for map cachedBlocks
18/08/27 16:05:23 INFO util.GSet: VM type = 64-bit
18/08/27 16:05:23 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
18/08/27 16:05:23 INFO util.GSet: capacity = 2^18 = 262144 entries
18/08/27 16:05:23 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
18/08/27 16:05:23 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
18/08/27 16:05:23 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
18/08/27 16:05:23 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
18/08/27 16:05:23 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
18/08/27 16:05:23 INFO util.GSet: Computing capacity for map NameNodeRetryCache
18/08/27 16:05:23 INFO util.GSet: VM type = 64-bit
18/08/27 16:05:23 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
18/08/27 16:05:23 INFO util.GSet: capacity = 2^15 = 32768 entries
18/08/27 16:05:23 INFO namenode.FSImage: Allocated new BlockPoolId: BP-74007497-192.168.50.236-1535357123682
18/08/27 16:05:23 INFO common.Storage: Storage directory /home/hadoop/dfs/name has been successfully formatted.
18/08/27 16:05:23 INFO namenode.FSImageFormatProtobuf: Saving image file /home/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
18/08/27 16:05:23 INFO namenode.FSImageFormatProtobuf: Image file /home/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 321 bytes saved in 0 seconds.
18/08/27 16:05:23 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
18/08/27 16:05:23 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hd02/192.168.50.236
************************************************************/
The line "Storage directory /home/hadoop/dfs/name has been successfully formatted" means the format succeeded.
8. Start the cluster
On the master node, run start-all.sh:
[root@hd01 hadoop]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
18/08/27 16:22:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hd01]
hd01: starting namenode, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/hadoop-root-namenode-hd01.out
hd02: starting datanode, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/hadoop-root-datanode-hd02.out
hd03: starting datanode, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/hadoop-root-datanode-hd03.out
hd01: starting datanode, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/hadoop-root-datanode-hd01.out
Starting secondary namenodes [hd01]
hd01: starting secondarynamenode, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/hadoop-root-secondarynamenode-hd01.out
18/08/27 16:23:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/yarn-root-resourcemanager-hd01.out
hd02: starting nodemanager, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/yarn-root-nodemanager-hd02.out
hd01: starting nodemanager, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/yarn-root-nodemanager-hd01.out
hd03: starting nodemanager, logging to /opt/cm/hadoop/hadoop-2.9.0/logs/yarn-root-nodemanager-hd03.out
Check the running processes on each node:
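Given the roles in section 1 (hd01 appears in slaves too, so it also runs the worker daemons), jps on each node should show roughly the following:

```shell
# Expected daemons per node -- compare against the jps output on each host.
MASTER_DAEMONS="NameNode SecondaryNameNode ResourceManager DataNode NodeManager"
SLAVE_DAEMONS="DataNode NodeManager"
jps
```

You can also browse the HDFS web UI at http://hd01:50070 and the YARN web UI at http://hd01:8088 (the latter per yarn.resourcemanager.webapp.address above).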
The cluster is deployed successfully!