Deploying and Installing a Cloudera Big Data Cluster: A Detailed Guide

程序员文章站 2024-03-22 12:21:34
This guide deploys a Cloudera big data cluster on CentOS 7.

Node name     IP              Spec         Role
Manager-node  192.168.42.100  4C/8G/100G   Management node
Agent1-node   192.168.42.101  8C/32G/1T    Data node
Agent2-node   192.168.42.102  8C/32G/1T    Data node
Agent3-node   192.168.42.103  8C/32G/1T    Data node
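
The NTP client configuration later in this guide refers to Manager-node by name, so every node has to be able to resolve these hostnames. The source does not show how that was done; the sketch below assumes /etc/hosts is used instead of DNS, and the helper name render_hosts is made up for illustration.

```shell
#!/bin/sh
# Print one /etc/hosts entry per node, taken from the node table above.
render_hosts() {
  cat <<'EOF'
192.168.42.100 Manager-node
192.168.42.101 Agent1-node
192.168.42.102 Agent2-node
192.168.42.103 Agent3-node
EOF
}

# Preview the entries. To apply them, run as root on every node:
#   render_hosts >> /etc/hosts
render_hosts
```

Appending is not idempotent, so check /etc/hosts for existing entries before running the append a second time.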

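I. Pre-deployment preparation

System performance tuning

For CDH services to perform well, the parameters below need adjusting: swap usage is minimized, IPv6 is disabled, the local port range is widened, and transparent huge pages are turned off. Run the script below directly on the system as root.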

cat << EOF >> /etc/sysctl.conf
vm.swappiness = 0
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv4.ip_local_port_range = 1024 65000
EOF
sysctl -p
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
cat << EOF >> /etc/rc.local
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
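EOF
# On CentOS 7 rc.local is not executable by default, so make it run at boot
chmod +x /etc/rc.d/rc.local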
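
To confirm that the transparent huge page settings took effect, the active mode can be read back from the same files. This is a minimal sketch; the function names thp_active_mode and check_thp are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# The kernel marks the active mode with brackets, e.g. "always madvise [never]".
# Print just the bracketed word from such a status line.
thp_active_mode() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active mode of each THP control file that exists on this host.
check_thp() {
  for f in /sys/kernel/mm/transparent_hugepage/enabled \
           /sys/kernel/mm/transparent_hugepage/defrag; do
    if [ -r "$f" ]; then
      echo "$f: $(thp_active_mode "$(cat "$f")")"
    fi
  done
  return 0
}

check_thp
```

After the tuning script has run, both files should report never. The sysctl values can be checked the same way with sysctl -n vm.swappiness.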

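Configure the NTP time synchronization service

Install and enable NTP on all nodes. On an isolated intranet, have the other machines synchronize against one host inside the cluster.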

yum -y install ntp

service ntpd restart

Configure the time server on the management node

vi /etc/ntp.conf

restrict 127.0.0.1

restrict -6 ::1

restrict default nomodify notrap

server ntp1.aliyun.com prefer 

includefile /etc/ntp/crypto/pw

keys /etc/ntp/keys

Configure the time synchronization client on the data nodes

vi /etc/ntp.conf

restrict 127.0.0.1

restrict -6 ::1

restrict default kod nomodify notrap nopeer noquery

restrict -6 default kod nomodify notrap nopeer noquery

# Hostname or IP of the manager node

server Manager-node

includefile /etc/ntp/crypto/pw

keys /etc/ntp/keys

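After editing the configuration file, save and exit, then start the service: service ntpd start. To have it start at boot on CentOS 7, also run: systemctl enable ntpd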
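
Once ntpd is running, ntpq -p lists the configured peers and marks the one currently selected for synchronization with a leading asterisk. The sketch below pulls that peer out of the listing; synced_peer is a hypothetical helper name, and the sample listing contains illustrative values only, not output captured from this cluster.

```shell
#!/bin/sh
# Read `ntpq -p` output on stdin and print the peer selected as sync source.
# ntpq prefixes that peer's line with "*"; prints nothing if no peer is selected.
synced_peer() {
  awk '/^\*/ { print substr($1, 2) }'
}

# On a live node:  ntpq -p | synced_peer
# Demo on an illustrative sample listing (values are made up):
printf '%s\n' \
  '     remote           refid      st t when poll reach   delay   offset  jitter' \
  '==============================================================================' \
  '*Manager-node    10.0.0.1         3 u   33   64  377    0.210    0.034   0.051' \
  | synced_peer
```

On a data node the expected result is Manager-node. An empty result usually means the client has not yet selected a source, which can take several minutes after the service starts.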

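JDK installation

Install the JDK on every server in the cluster (the CM management node and each agent node). The AuthParam token in the URL belongs to a past Oracle download session, so it has to be replaced with a current one.

wget -O jdk-7u80-linux-x64.tar.gz "http://download.oracle.com/otn/java/jdk/7u80-b15/jdk-7u80-linux-x64.tar.gz?AuthParam=1528156044_59d0d3a22c59b5ac6d9f0dddd4418808"

# The target directory must exist before extracting into it
mkdir -p /usr/local/java
tar -zxvf jdk-7u80-linux-x64.tar.gz -C /usr/local/java
# Quote EOF so $JAVA_HOME and $PATH are written literally and expanded at login
cat >> ~/.bashrc << 'EOF'
export JAVA_HOME=/usr/local/java/jdk1.7.0_80
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
EOF

source ~/.bashrc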

MySQL database installation

root# wget https://dev.mysql.com/get/mysql80-community-release-el7-1.noarch.rpm

root# sudo rpm -Uvh mysql80-community-release-el7-1.noarch.rpm

root# yum repolist all | grep mysql

root# sudo yum-config-manager --enable mysql57-community

Edit the repo file so that only mysql57-community is enabled for installation

root# vi /etc/yum.repos.d/mysql-community.repo

# Enable to use MySQL 5.7
[mysql57-community]
name=MySQL 5.7 Community Server
baseurl=http://repo.mysql.com/yum/mysql-5.7-community/el/7/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-mysql

root# yum repolist enabled | grep mysql

root# yum install mysql-community-server

root# systemctl start mysqld.service

root# sudo grep 'temporary password' /var/log/mysqld.log

root# mysql -uroot -p

mysql>ALTER USER 'root'@'localhost' IDENTIFIED BY 'MyNewPass4!';

MySQL configuration recommended on the Cloudera website:

vi /etc/my.cnf

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0

key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1

max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1

binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit  = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

sql_mode=STRICT_ALL_TABLES

 

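Configure start on boot

For the yum installation above, systemctl enable mysqld.service is sufficient. For a binary installation under /usr/local/mysql, run the following in its support-files directory:

cp mysql.server /etc/init.d/mysql

Start MySQL

/etc/init.d/mysql start

Change the initial password (the account is masked as aaa@qq.com in the source; the account table below lists root with this password, so a form such as 'root'@'localhost' is the likely target)

alter user aaa@qq.com identified by '123456HW';

flush privileges;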

Configure the MySQL users and passwords that Cloudera Manager needs

  • Oozie Server - Contains Oozie workflow, coordinator, and bundle data. Can grow very large.
  • Sqoop Server - Contains entities such as the connector, driver, links and jobs. Relatively small.
  • Activity Monitor - Contains information about past activities. In large clusters, this database can grow large. Configuring an Activity Monitor database is only necessary if a MapReduce service is deployed.
  • Reports Manager - Tracks disk utilization and processing activities over time. Medium-sized.
  • Hive Metastore Server - Contains Hive metadata. Relatively small.
  • Hue Server - Contains user account information, job submissions, and Hive queries. Relatively small.
  • Sentry Server - Contains authorization metadata. Relatively small.
  • Cloudera Navigator Audit Server - Contains auditing information. In large clusters, this database can grow large.
  • Cloudera Navigator Metadata Server - Contains authorization, policies, and audit report metadata. Relatively small.

Role                                Database  User    Password
root                                          root    123456HW
Activity Monitor                    amon      amon    amon
Reports Manager                     rman      rman    rman
Hive Metastore Server               hive      hive    hive
Sentry Server                       sentry    sentry  sentry
Cloudera Navigator Audit Server     nav       nav     nav
Cloudera Navigator Metadata Server  navms     navms   navms
Oozie                               oozie     oozie   oozie
Hue                                 hue       hue     hue
Cloudera Manager Server             cmf       cmf     cmf

Create the databases and grant privileges

create database amon DEFAULT CHARACTER SET utf8;
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
 
create database rman DEFAULT CHARACTER SET utf8;
grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
 
create database hive DEFAULT CHARACTER SET utf8;
grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
 
create database sentry DEFAULT CHARACTER SET utf8;
grant all on sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
 
create database nav DEFAULT CHARACTER SET utf8;
grant all on nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
 
create database navms DEFAULT CHARACTER SET utf8;
grant all on navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
 
create database oozie DEFAULT CHARACTER SET utf8;
grant all on oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
 
create database hue DEFAULT CHARACTER SET utf8;
grant all on hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
 
create database cmf DEFAULT CHARACTER SET utf8;
grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY 'cmf';
 
flush privileges;
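
Because every service account above follows the same pattern (database, user, and password share one name), the statements can also be generated rather than typed. This is a sketch under that assumption; gen_sql is a hypothetical helper name, and the GRANT ... IDENTIFIED BY form it emits matches the statements above, which MySQL 5.7 accepts.

```shell
#!/bin/sh
# Emit CREATE DATABASE and GRANT statements for each name given as an argument,
# using the same name for the database, the user, and the password.
gen_sql() {
  for name in "$@"; do
    printf "create database %s DEFAULT CHARACTER SET utf8;\n" "$name"
    printf "grant all on %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" \
      "$name" "$name" "$name"
  done
  printf 'flush privileges;\n'
}

# Preview the statements for all nine accounts.
# To apply them:  gen_sql amon rman hive ... | mysql -uroot -p
gen_sql amon rman hive sentry nav navms oozie hue cmf
```

Passwords identical to the user name are only suitable for a test cluster. For anything else, pass distinct passwords in and check them against the validate_password policy that MySQL 5.7 enables by default.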

 

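Passwordless SSH authentication

Run on the manager node. The copy targets are masked as aaa@qq.com in the source; each one should be a user@host pair for one of the three data nodes.

ssh-keygen -t rsa

ssh-copy-id aaa@qq.com

ssh-copy-id aaa@qq.com

ssh-copy-id aaa@qq.com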
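
The key has to be copied to each of the three data nodes from the node table. The sketch below prints one ssh-copy-id command per node so the targets can be reviewed before running them. The root account is an assumption, since the source does not show the real user, and copy_id_cmds is a hypothetical helper name.

```shell
#!/bin/sh
# Print an ssh-copy-id command for each host; first argument is the login user.
copy_id_cmds() {
  user=$1
  shift
  for host in "$@"; do
    echo "ssh-copy-id ${user}@${host}"
  done
}

# "root" is assumed here; substitute the account actually used on the data nodes.
copy_id_cmds root Agent1-node Agent2-node Agent3-node
```

Piping the output to sh runs the commands; each one prompts once for that node's password.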

Disable the firewall

systemctl stop firewalld

systemctl disable firewalld

setenforce 0
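
setenforce 0 switches SELinux to permissive mode only until the next reboot. A minimal sketch of making that persistent follows, assuming the standard /etc/selinux/config file with a SELINUX= line. The helper name set_selinux_permissive is made up for illustration, and the demo runs against a scratch file rather than the real configuration.

```shell
#!/bin/sh
# Rewrite the SELINUX= line of the given config file to "permissive".
# Other lines (including SELINUXTYPE=) are left untouched.
set_selinux_permissive() {
  cfg=$1
  tmp=$(mktemp) || return 1
  sed 's/^SELINUX=.*/SELINUX=permissive/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
  status=$?
  rm -f "$tmp"
  return $status
}

# Demo on a scratch copy. On a real node, run as root:
#   set_selinux_permissive /etc/selinux/config
demo=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo"
set_selinux_permissive "$demo"
cat "$demo"
rm -f "$demo"
```

Permissive is chosen here because it is the state setenforce 0 produces; SELINUX=disabled is a stronger setting that only takes effect after a reboot.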

 

Install the JDBC driver

wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz

tar zxvf mysql-connector-java-5.1.46.tar.gz
sudo mkdir -p /usr/share/java/
cd mysql-connector-java-5.1.46
sudo cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar

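II. Deployment

1. Install Cloudera Manager

Deploy CM on Manager-node. The latest CM (version 5.15) can be downloaded from the Cloudera website for a binary installation; here I install it with yum, which assumes the Cloudera Manager yum repository is already configured on the node.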

sudo rpm --import https://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera
sudo yum install cloudera-manager-daemons cloudera-manager-server

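Initialize the Cloudera Manager database with the MySQL script (MySQL and CM are installed on the same node). The last two arguments are the database name and the database user, and both must already exist in MySQL; note that the accounts created earlier in this guide are cmf/cmf, not scm/scm.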

sudo /usr/share/cmf/schema/scm_prepare_database.sh mysql -h 192.168.42.100 --scm-host 192.168.42.100 scm scm
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.7.0_80-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing:  /usr/java/jdk1.7.0_80-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[                          main] DbCommandExecutor              INFO  Successfully connected to database.
All done, your SCM database is configured correctly!

Start the Cloudera Manager Server

systemctl start cloudera-scm-server

Log in to CM: http://192.168.42.100:7180/cmf

Username/password: admin/admin

2. Install CDH and other software

Before installing the cluster, log in to CM on the manager node: http://192.168.42.100:7180/cmf

3. Install the cluster

Choose one of the following options for the installation.

Select services

  • Core Hadoop

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, and Hue

  • Core with HBase

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, and HBase

  • Core with Impala

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, and Impala

  • Core with Search

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, and Solr

  • Core with Spark

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, and Spark

  • All Services

    HDFS, YARN (MapReduce 2 Included), ZooKeeper, Oozie, Hive, Hue, HBase, Impala, Solr, Spark, and Key-Value Store Indexer