
CDH 6.3.2 Installation and Deployment (Ubuntu)

程序员文章站 (Programmer Article Site), 2022-07-14 21:53:53

Deploying CDH 6 on Ubuntu 18.04

Pre-installation preparation

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/installation_reqts.html

Configure network names (using foo-1 as an example)

1. Set the hostname

sudo hostnamectl set-hostname foo-1.example.com

2. Edit the /etc/hosts file

1.1.1.1  foo-1.example.com  foo-1
2.2.2.2  foo-2.example.com  foo-2
3.3.3.3  foo-3.example.com  foo-3
4.4.4.4  foo-4.example.com  foo-4
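Each /etc/hosts entry above has the same shape (IP address, fully qualified name, short name), so the lines can be generated instead of typed by hand. A minimal sketch, assuming a hypothetical helper named hosts_line (not part of Ubuntu or Cloudera tooling) and the placeholder addresses from this guide:

```shell
# hosts_line is a hypothetical helper for illustration only.
# $1 = IP address, $2 = fully qualified domain name.
# It prints one /etc/hosts entry; the short name is the FQDN
# with everything from the first dot onward removed.
hosts_line() {
    printf '%s  %s  %s\n' "$1" "$2" "${2%%.*}"
}

# Print the four example entries used in this guide.
hosts_line 1.1.1.1 foo-1.example.com
hosts_line 2.2.2.2 foo-2.example.com
hosts_line 3.3.3.3 foo-3.example.com
hosts_line 4.4.4.4 foo-4.example.com
```

Review the output first, then append it on each host, for example by piping it to sudo tee -a /etc/hosts.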

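3. Verify that every host identifies itself consistently on the network

a. Run uname -a and hostname, and check that both report the same hostname

b. Run /sbin/ifconfig and note the inet addr value for eth0, for example: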

eth0      Link encap:Ethernet  HWaddr 00:0C:29:A4:E8:97  
          inet addr:172.29.82.176  Bcast:172.29.87.255  Mask:255.255.248.0
...

c. Run host -v -t A $(hostname) and check that the name in the answer matches the output of the hostname command and that the IP address matches the ifconfig output

Trying "foo-1.example.com"
...
;; ANSWER SECTION:
foo-1.example.com. 60 IN A 172.29.82.176
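Step a can be scripted so it is easy to repeat on every host. A minimal sketch, assuming a hypothetical helper named names_match; it uses uname -n (the node name field that uname -a also prints) so the two values can be compared directly:

```shell
# names_match is a hypothetical helper for illustration only.
# $1 and $2 are the two names to compare. It prints the result
# and returns non-zero when the names differ.
names_match() {
    if [ "$1" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 vs $2"
        return 1
    fi
}

# Usage on a cluster host:
#   names_match "$(hostname)" "$(uname -n)"
```

The same helper can compare the address returned by host with the one shown by ifconfig once both values have been extracted.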

Disable the firewall

sudo systemctl stop ufw
sudo systemctl disable ufw

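Set the SELinux mode

Ubuntu does not install selinux-utils by default. If SELinux is not present, skip this step.

If SELinux is in use, disable it first and re-enable it after the CDH installation is complete.

Set up clock synchronization

Cloudera Manager requires the NTP service, so it must be installed. Configuring NTP for clock synchronization is not covered here; consult other resources for that.

In addition, Ubuntu ships with timedatectl, which has time synchronization enabled by default and can be used to adjust the server time.

Check the current status: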

timedatectl

Set the time zone:

sudo timedatectl set-timezone "Asia/Shanghai"
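To confirm synchronization on many hosts without reading the full timedatectl output, the relevant line can be checked with grep. A minimal sketch, assuming a hypothetical helper named clock_synced and assuming the status line reads "System clock synchronized: yes" (the wording I expect on Ubuntu 18.04) or "NTP synchronized: yes" (older systemd releases); verify the exact wording on your own system:

```shell
# clock_synced is a hypothetical helper for illustration only.
# It reads timedatectl output on stdin and succeeds only if a line
# reports the clock as synchronized. The two label variants are
# assumptions about the systemd version in use.
clock_synced() {
    grep -Eq '^[[:space:]]*(System clock|NTP) synchronized: yes[[:space:]]*$'
}

# Usage:
#   timedatectl | clock_synced && echo "clock is synchronized"
```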

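SSH settings

1. Allow root login

Edit /etc/ssh/sshd_config, find the PermitRootLogin setting and change it to PermitRootLogin yes, then restart the SSH service with sudo systemctl restart ssh.

2. Set the root password

Switch to root: sudo su

Set the password: passwd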
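After editing sshd_config it is worth checking which PermitRootLogin value is actually active, since a commented-out line is easy to mistake for the live one. A minimal sketch, assuming a hypothetical helper named permit_root_login; it relies on sshd using the first uncommented occurrence of a keyword and does not follow Include directives:

```shell
# permit_root_login is a hypothetical helper for illustration only.
# $1 = path to an sshd_config file.
# It prints the value of the first uncommented PermitRootLogin line.
# Keywords are matched case-insensitively, as sshd does.
permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' "$1"
}

# Usage:
#   permit_root_login /etc/ssh/sshd_config
```

For the effective configuration as sshd itself sees it, sudo sshd -T prints every setting after parsing.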

Installation

Set up the CM repository

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_ig_create_local_package_repo.html#internal_package_repo_content

Set up the web server

sudo apt-get install apache2
sudo systemctl start apache2
sudo systemctl enable apache2

Download and publish the package repository

sudo mkdir -p /var/www/html/cloudera-repos/cm6
wget https://archive.cloudera.com/cm6/6.3.1/repo-as-tarball/cm6.3.1-ubuntu1804.tar.gz
sudo tar xvfz cm6.3.1-ubuntu1804.tar.gz -C /var/www/html/cloudera-repos/cm6 --strip-components=1
cd /var/www/html/cloudera-repos/cm6
sudo wget https://archive.cloudera.com/cm6/6.3.1/allkeys.asc
sudo chmod -R ugo+rX /var/www/html/cloudera-repos/cm6

Configure and use the internal repository

Create the file /etc/apt/sources.list.d/cloudera-repo.list with the following content:

deb http://<web_server>/cm <codename> <components>

The values for <codename> and <components> can be found in ./conf/distributions

You can download the official list file for reference:

sudo wget https://archive.cloudera.com/cm6/6.3.1/ubuntu1804/apt/cloudera-manager.list

A complete example:

# Cloudera Manager 6.3.1
deb [arch=amd64] http://foo-1.example.com/cloudera-repos/cm6 bionic-cm6.3.1 contrib
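Since the same list file is needed on every host, the deb line can be produced from its three variable parts. A minimal sketch, assuming a hypothetical helper named cm_repo_line and the /cloudera-repos/cm6 path used in this guide:

```shell
# cm_repo_line is a hypothetical helper for illustration only.
# $1 = web server host, $2 = codename, $3 = components
# (the last two as listed in ./conf/distributions).
cm_repo_line() {
    printf 'deb [arch=amd64] http://%s/cloudera-repos/cm6 %s %s\n' "$1" "$2" "$3"
}

# Usage (writes the list file; review the output first):
#   cm_repo_line foo-1.example.com bionic-cm6.3.1 contrib \
#       | sudo tee /etc/apt/sources.list.d/cloudera-repo.list
```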

Import the repository signing GPG key. archive.key is in the cloudera-repos/cm6 directory of the local repository set up above:

sudo apt-key add archive.key

Update the package index:

sudo apt-get update

Install CM

Install the JDK (required on every machine)

sudo apt-get install openjdk-8-jdk

Install CM:

sudo apt-get install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server

Install MariaDB:

sudo apt-get install mariadb-server

Stop MariaDB

sudo systemctl stop mariadb

MariaDB (MySQL) configuration in /etc/mysql/conf.d/mysql.cnf:

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
# Settings user and group are ignored when systemd is used.
# If you need to run mysqld under a different user or group,
# customize your systemd unit file for mariadb according to the
# instructions in http://fedoraproject.org/wiki/Systemd

key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1

max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

#In later versions of MariaDB, if you enable the binary log and do not set
#a server_id, MariaDB will not start. The server_id must be unique within
#the replicating group.
server_id=1

binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit  = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M

[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

Allow remote access

Edit /etc/mysql/mariadb.conf.d/50-server.cnf

and comment out: bind-address = 127.0.0.1

Start MariaDB

sudo systemctl start mariadb

Set the root password

sudo /usr/bin/mysql_secure_installation

Answer the prompts as follows:

[...]
Enter current password for root (enter for none):
OK, successfully used password, moving on...
[...]
Set root password? [Y/n] Y
New password:
Re-enter new password:
[...]
Remove anonymous users? [Y/n] Y
[...]
Disallow root login remotely? [Y/n] N
[...]
Remove test database and access to it [Y/n] Y
[...]
Reload privilege tables now? [Y/n] Y
[...]
All done!  If you've completed all of the above steps, your MariaDB
installation should now be secure.

Thanks for using MariaDB!

Install the MySQL JDBC Driver for MariaDB

sudo apt-get install libmysql-java

Create the databases

1. Log in to MariaDB

sudo mysql -u root -p

2. Create the databases and grant privileges (the list below is not exhaustive)

CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';

CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';

CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';

CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';

CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';

CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';

CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
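The seven CREATE/GRANT pairs above differ only in the database name, so they can be generated in a loop. A minimal sketch, assuming hypothetical helpers named db_sql and cdh_db_sql; as in the statements above, the user name equals the database name, and the password is passed separately so that it can be changed:

```shell
# db_sql and cdh_db_sql are hypothetical helpers for illustration only.
# db_sql: $1 = database (and user) name, $2 = password.
# It prints the same two statements used in this guide.
db_sql() {
    printf 'CREATE DATABASE %s DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;\n' "$1"
    printf "GRANT ALL ON %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$1" "$1" "$2"
}

# cdh_db_sql prints the statements for all databases listed above,
# reusing each name as its password exactly as the guide does.
cdh_db_sql() {
    for db in scm amon rman hue hive sentry oozie; do
        db_sql "$db" "$db"
    done
}

# Usage:
#   cdh_db_sql | sudo mysql -u root -p
```

Reusing the database name as the password mirrors the original statements; on anything other than a test cluster, pass a different second argument to db_sql.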

Set up the CM database

sudo /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm scm -h localhost

Set up the CDH parcels repository

Download the CDH 6 files into a directory named cdh6

wget https://archive.cloudera.com/cdh6/6.3.2/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554-bionic.parcel
wget https://archive.cloudera.com/cdh6/6.3.2/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554-bionic.parcel.sha1
wget https://archive.cloudera.com/cdh6/6.3.2/parcels/manifest.json
mv CDH-6.3.2-1.cdh6.3.2.p0.1605554-bionic.parcel.sha1 CDH-6.3.2-1.cdh6.3.2.p0.1605554-bionic.parcel.sha

Download the gplextras6 files into a directory named gplextras6

wget https://archive.cloudera.com/gplextras6/6.3.2/parcels/GPLEXTRAS-6.3.2-1.gplextras6.3.2.p0.1605554-bionic.parcel
wget https://archive.cloudera.com/gplextras6/6.3.2/parcels/GPLEXTRAS-6.3.2-1.gplextras6.3.2.p0.1605554-bionic.parcel.sha1
wget https://archive.cloudera.com/gplextras6/6.3.2/parcels/manifest.json
mv GPLEXTRAS-6.3.2-1.gplextras6.3.2.p0.1605554-bionic.parcel.sha1 GPLEXTRAS-6.3.2-1.gplextras6.3.2.p0.1605554-bionic.parcel.sha
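Parcel files are large, and a truncated download is easy to miss, so it can help to compare each parcel with its hash file before copying it into the repository. A minimal sketch, assuming a hypothetical helper named verify_parcel and assuming the renamed .sha file holds the SHA-1 hex digest as its first field:

```shell
# verify_parcel is a hypothetical helper for illustration only.
# $1 = path to a .parcel file; the expected digest is read from "$1.sha".
# It succeeds only when the SHA-1 of the parcel equals the stored digest.
verify_parcel() {
    expected=$(awk '{ print $1; exit }' "$1.sha")
    actual=$(sha1sum "$1" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage:
#   verify_parcel cdh6/CDH-6.3.2-1.cdh6.3.2.p0.1605554-bionic.parcel \
#       && echo "checksum OK"
```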

Create the local parcel repository

sudo cp cdh6/* /opt/cloudera/parcel-repo/
sudo chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo/*

Installation

https://docs.cloudera.com/documentation/enterprise/6/latest/topics/install_cm_cdh.html

CM

Install CDH

sudo systemctl start cloudera-scm-server

Monitor the log

sudo tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
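Instead of watching the log by eye, a script can poll it until the server reports that its web server is up. A minimal sketch, assuming a hypothetical helper named wait_for_cm and assuming the readiness marker is a log line containing "Started Jetty server" (this string is my recollection of the Cloudera Manager log, so confirm it against your own log file):

```shell
# wait_for_cm is a hypothetical helper for illustration only.
# $1 = log file, $2 = maximum number of checks, $3 = seconds between checks.
# It returns 0 as soon as the assumed marker appears, 1 if it never does.
wait_for_cm() {
    log="$1"
    tries="$2"
    interval="$3"
    while [ "$tries" -gt 0 ]; do
        if grep -q 'Started Jetty server' "$log" 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Usage (run as a user that can read the log):
#   wait_for_cm /var/log/cloudera-scm-server/cloudera-scm-server.log 60 5 \
#       && echo "Cloudera Manager is up"
```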

Open the web UI

http://<server_host>:7180

Default credentials: admin/admin

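Failures encountered during installation:

These appear to have been caused by network latency and should not occur under normal conditions.

1. HDFS NameNode format failed

Fix: delete the directory and run the step again

2. Hive and Oozie failed to create their tables during initialization

Fix: retry a few times

3. Hive tables missing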

Fix: initialize the Hive metastore tables manually. Under /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive/scripts/metastore/upgrade/mysql/, find the matching hive-schema-<version>.mysql.sql,

for example hive-schema-2.1.1.mysql.sql. Log in to MySQL, select the Hive metastore database (hive in this guide), and run source /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-2.1.1.mysql.sql
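To see which schema scripts are available before picking one, the directory can be listed and sorted by version. A minimal sketch, assuming a hypothetical helper named list_hive_schemas and GNU sort (for the -V version sort); it only lists candidates, and choosing the version that matches your Hive release is still up to you:

```shell
# list_hive_schemas is a hypothetical helper for illustration only.
# $1 = directory that holds the hive-schema-<version>.mysql.sql files.
# It prints the file names in ascending version order.
list_hive_schemas() {
    for f in "$1"/hive-schema-*.mysql.sql; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] && basename "$f"
    done | sort -V
}

# Usage:
#   list_hive_schemas /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive/scripts/metastore/upgrade/mysql
```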

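4. Uploading the Oozie shared library timed out

The machines are probably underpowered, so the upload was slow enough to hit the timeout.

Fix: at the failed step, click the Oozie link next to it to open the Oozie monitoring page, then search for oozie_upload_sharelib_cmd_timeout in the upper right corner. This finds the timeout setting for the Oozie Upload ShareLib command, which defaults to 270; change it to a larger value such as 2700. (The test machines were very low-spec, so there was no way around it.)
Back on the page of the step that failed, click Resume to retry.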