
Setting Up a Hadoop 2.7.3 Fully Distributed Cluster


System and software:
CentOS 7
jdk-8u131-linux-x64.tar.gz
Hadoop 2.7.3
Nodes:
spark1 (192.168.6.137)
spark2 (192.168.6.138)
spark3 (192.168.6.139)

1. Set a static IP address on each node

The nodes map to 192.168.6.137, 192.168.6.138, and 192.168.6.139 respectively.
The procedure was covered in an earlier post and is not repeated here; for details, see "Setting a Static IP Address for a VMware Virtual Machine".

2. Configure hosts

Edit /etc/hosts on spark1 and add the following lines:

192.168.6.137     spark1
192.168.6.138     spark2
192.168.6.139     spark3
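
A quick sanity check is to ping each hostname once from spark1 and confirm it resolves to the right address:

ping -c 1 spark2
ping -c 1 spark3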

3. Install the JDK

Upload the JDK archive to spark1 with SecureCRT, then extract jdk1.8.0_131:

tar -zxvf jdk-8u131-linux-x64.tar.gz -C /usr/local

Configure environment variables on the spark1 node: vi /etc/profile

# jdk environment (plus a convenience alias for the Hadoop config directory)
alias cdha='cd /usr/local/hadoop-2.7.3/etc/hadoop'
export JAVA_HOME=/usr/local/jdk1.8.0_131
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH
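
The snippet above only covers the JDK. Later steps call hdfs, start-dfs.sh, and start-yarn.sh directly, which assumes Hadoop's bin and sbin directories are also on the PATH; a minimal sketch, appended to /etc/profile (paths assume the /usr/local/hadoop-2.7.3 layout used throughout):

# hadoop environment (assumed addition; matches the install path used in this post)
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH

Reload the profile and verify the JDK is picked up:

source /etc/profile
java -version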

4. Distributed cluster configuration on the spark1 node

Edit hadoop-env.sh (in /usr/local/hadoop-2.7.3/etc/hadoop):

export JAVA_HOME=/usr/local/jdk1.8.0_131

Edit core-site.xml (in /usr/local/hadoop-2.7.3/etc/hadoop); note that fs.default.name is the deprecated alias of fs.defaultFS and still works in 2.7.3, though it logs a deprecation warning:

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://spark1:9000</value>
    </property>
</configuration>

Edit hdfs-site.xml (in /usr/local/hadoop-2.7.3/etc/hadoop); dfs.name.dir and dfs.data.dir are likewise deprecated aliases of dfs.namenode.name.dir and dfs.datanode.data.dir:

<configuration>
    <property>
        <name>dfs.name.dir</name>
        <value>/usr/local/data/namenode</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/usr/local/data/datanode</value>
    </property>
    <property>
        <name>dfs.tmp.dir</name>
        <value>/usr/local/data/tmp</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>
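
The storage directories referenced above need to exist (or be creatable by the Hadoop user) before the first start. Creating them on spark1 now means the clones made in step 5 inherit them:

mkdir -p /usr/local/data/namenode /usr/local/data/datanode /usr/local/data/tmp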

Edit mapred-site.xml (in /usr/local/hadoop-2.7.3/etc/hadoop):
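
A stock Hadoop 2.7.3 tarball ships only mapred-site.xml.template, so create the file from the template first:

cp /usr/local/hadoop-2.7.3/etc/hadoop/mapred-site.xml.template /usr/local/hadoop-2.7.3/etc/hadoop/mapred-site.xml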

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Edit yarn-site.xml (in /usr/local/hadoop-2.7.3/etc/hadoop):

<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>spark1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

Edit slaves (in /usr/local/hadoop-2.7.3/etc/hadoop). This file lists the worker hosts that run DataNode and NodeManager; here the master spark1 doubles as a worker:

spark1
spark2
spark3

5. Clone the whole spark1 system with VMware

Right-click the spark1 virtual machine and choose Manage -> Clone.
For the clone source, select "The current state in the virtual machine".
For the clone type, select "Create a full clone".
Clone spark1 twice to produce spark2 and spark3.
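
After cloning, give each copy its own identity before proceeding: assign the static IPs from step 1 (192.168.6.138 and 192.168.6.139) and set the hostnames. For example, on the second clone:

hostnamectl set-hostname spark2

and likewise hostnamectl set-hostname spark3 on the third.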

6. Configure passwordless SSH login, to the local machine and between the cluster machines

On spark1:

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

On spark2:

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

On spark3:

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

Then, on spark1:

ssh-copy-id -i spark2
ssh-copy-id -i spark3

On spark2:

ssh-copy-id -i spark1
ssh-copy-id -i spark3

On spark3:

ssh-copy-id -i spark1
ssh-copy-id -i spark2
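
Each node should now reach the other two without a password prompt; a quick check from spark1:

ssh spark2 hostname
ssh spark3 hostname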

7. Start the Hadoop cluster

Format the NameNode on spark1 (once only; reformatting wipes existing HDFS metadata):

hdfs namenode -format

Start HDFS:

start-dfs.sh

[root@spark1 hadoop]# jps
3345 Jps
3236 SecondaryNameNode
3078 DataNode
2952 NameNode

[root@spark2 hadoop]# jps
2025 Jps
1951 DataNode

[root@spark3 hadoop]# jps
1970 DataNode
2035 Jps
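
With HDFS running on all three nodes, a small smoke test (assuming the PATH additions from step 3) confirms writes and reads work:

hdfs dfs -mkdir -p /tmp/smoke
hdfs dfs -put /etc/hosts /tmp/smoke/
hdfs dfs -ls /tmp/smoke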

Start YARN:

start-yarn.sh

[root@spark1 hadoop]# jps
1840 ResourceManager
1521 DataNode
2289 Jps
2005 NodeManager
1658 SecondaryNameNode
1389 NameNode

[root@spark2 ~]# jps
1173 NodeManager
1063 DataNode
1319 Jps

[root@spark3 ~]# jps
1312 Jps
1176 NodeManager
1066 DataNode
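
To exercise YARN end to end, you can run the pi estimator from the examples jar bundled with the 2.7.3 tarball (2 map tasks, 10 samples each):

yarn jar /usr/local/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10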

8. View the cluster in the web UIs

With everything running, the NameNode web UI is at http://spark1:50070 and the ResourceManager UI at http://spark1:8088 (the Hadoop 2.x defaults).