
Installing a Kubernetes Cluster on CentOS 7

程序员文章站 2022-03-01 13:05:41

I. System Setup

Environment

[root@st01015vm192 /]# cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
[root@st01015vm192 /]# uname -a
Linux st01015vm192 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

All commands below are run as root unless stated otherwise.

1. Disable swap

Disable swap temporarily:

swapoff -a

To disable it permanently, comment out the swap entry in /etc/fstab:

#/dev/mapper/centos-swap swap                    swap    defaults        0 0
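Commenting out the entry can also be scripted. A minimal sketch, shown here against a temporary copy so nothing is changed by accident (point it at /etc/fstab on the real host):

```shell
# Demo fstab with a swap entry (on a real host, operate on /etc/fstab itself)
printf '/dev/mapper/centos-swap swap swap defaults 0 0\n' > /tmp/fstab.demo

# Prefix any uncommented line containing a swap mount with '#'
sed -i 's|^[^#].*[[:space:]]swap[[:space:]]|#&|' /tmp/fstab.demo

cat /tmp/fstab.demo
```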

2. Disable SELinux

The kubelet does not support SELinux enforcement, so set SELinux to permissive mode:

# Check the current status
# /usr/sbin/sestatus -v  
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

3. Disable the firewall

systemctl disable firewalld
systemctl stop firewalld

4. Configure sysctl

Create the file /etc/sysctl.d/k8s.conf with the following content:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

net.ipv4.ip_forward = 1

Apply the settings:

sysctl -p /etc/sysctl.d/k8s.conf
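One caveat: the net.bridge.* keys only exist while the br_netfilter kernel module is loaded, so sysctl -p can fail with "No such file or directory" until it is. Loading the module now and on every boot looks roughly like this (run as root on the target host):

```shell
# Load the bridge netfilter module now...
modprobe br_netfilter
# ...and make it load on every boot
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
# Re-apply the sysctl settings
sysctl -p /etc/sysctl.d/k8s.conf
```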

On RHEL/CentOS 7, network requests have been routed incorrectly because iptables was bypassed for bridged traffic. You must ensure that net.bridge.bridge-nf-call-iptables is set to 1 in your sysctl configuration.
Network plugins connect containers to a Linux bridge, and they must set the net/bridge/bridge-nf-call-iptables sysctl to 1 to ensure that the iptables proxy functions correctly.

Finally, enable IP forwarding in the kernel (so that the kernel forwards packets for bridged containers):
sysctl net.ipv4.ip_forward=1
The result of all this is that every Pod can reach every other Pod, and traffic can be sent out to the internet.

Kubernetes network plugins
https://kubernetes.io/zh/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/

Kubernetes network model
https://kubernetes.io/zh/docs/concepts/cluster-administration/networking/

5. Switch the package sources to the Aliyun mirrors

5.1 Configure the yum repository

## Back up the original repo file
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
## Download the Aliyun repo file
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

5.2 Configure the Kubernetes repository

vim /etc/yum.repos.d/kubernetes.repo

[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

5.3 Rebuild the yum cache

yum clean all
yum makecache fast
yum -y update

II. Install Docker

1. Install Docker

See the official documentation: https://docs.docker.com/engine/install/centos/

  • Remove old versions
yum remove docker \
           docker-client \
           docker-client-latest \
           docker-common \
           docker-latest \
           docker-latest-logrotate \
           docker-logrotate \
           docker-engine
  • Install Docker
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager \
    --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

yum install -y docker-ce docker-ce-cli containerd.io

To install a specific version of Docker:

yum list docker-ce --showduplicates | sort -r
sudo yum install docker-ce-<VERSION_STRING> docker-ce-cli-<VERSION_STRING> containerd.io

2. Configure Docker

Create the file /etc/docker/daemon.json and write the configuration below. Note that daemon.json must be strict JSON: comments are not allowed.
mkdir -p /etc/docker/
vim /etc/docker/daemon.json

{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
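Because daemon.json must be strict JSON (no comments, no trailing commas), it is worth validating the file before restarting Docker. A quick check with Python's json.tool, demonstrated here against a throwaway file (on a real host, point it at /etc/docker/daemon.json):

```shell
# Write a minimal demo config; on a real host, validate /etc/docker/daemon.json
cat > /tmp/daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "storage-driver": "overlay2"
}
EOF

# Exits non-zero and prints the parse error if the file is not valid JSON
python3 -m json.tool /tmp/daemon.json > /dev/null && echo "daemon.json is valid JSON"
```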

If you are installing from mainland China, also add a registry-mirrors key to daemon.json:

 "registry-mirrors": [
        "https://1nj0zren.mirror.aliyuncs.com",
        "https://docker.mirrors.ustc.edu.cn",
        "http://f1361db2.m.daocloud.io",
        "https://registry.docker-cn.com"
    ]

3. Restart Docker

mkdir -p /etc/systemd/system/docker.service.d
systemctl daemon-reload
systemctl restart docker

This may fail with an error, docker.service failed:

[root@st01015vm193 ~]# journalctl -xe
May 08 15:36:09 st01015vm193 systemd[1]: Dependency failed for Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit docker.service has failed.
-- 
-- The result is dependency.
May 08 15:36:09 st01015vm193 systemd[1]: Job docker.service/start failed with result 'dependency'.
May 08 15:36:09 st01015vm193 systemd[1]: Unit docker.socket entered failed state.

The system needs a docker group. If baseline security hardening has been applied, groupadd may fail with a permission error; use chattr -i to clear the immutable attribute on /etc/group first:

chattr -i /etc/group
groupadd docker
systemctl enable docker  && systemctl start docker
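Once Docker is running, it is worth confirming that the cgroup driver really is systemd, since kubeadm expects the kubelet and the container runtime to agree on it. A verification one-liner (requires the Docker daemon to be up):

```shell
# With the daemon.json above this should report: Cgroup Driver: systemd
docker info 2>/dev/null | grep -i 'cgroup driver'
```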

III. Install the Cluster

1. Install kubeadm, kubelet, and kubectl

yum install -y kubelet kubeadm kubectl kubernetes-cni --disableexcludes=kubernetes
systemctl enable --now kubelet
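Without a version pin, yum installs the newest packages (v1.18.2 at the time of this install). To keep all three components at the same version, and to make the install reproducible, you can pin them explicitly; a sketch, using this article's version as an example:

```shell
# List the versions available in the repo
yum list kubeadm --showduplicates | sort -r | head

# Install matching versions of all three components (example version)
yum install -y kubelet-1.18.2 kubeadm-1.18.2 kubectl-1.18.2 --disableexcludes=kubernetes
```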

2. Create the cluster with kubeadm

Run the following only on the master node:

# run on the master node:
kubeadm init \
 --apiserver-advertise-address 10.10.45.192 \
 --pod-network-cidr=10.244.0.0/16
# --kubernetes-version=v1.15.0 \
# --image-repository=registry.aliyuncs.com/google_containers
# --apiserver-advertise-address: the address used to communicate with the other nodes
# --pod-network-cidr: the pod network subnet; this CIDR is required when using the flannel network

You may get errors like the following:

[root@st01015vm192 ~]# kubeadm init \
>  --apiserver-advertise-address 10.10.45.192 \
>  --pod-network-cidr=10.244.0.0/16
W0508 15:09:35.577282   28115 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
	[ERROR Swap]: running with swap on is not supported. Please disable swap
	[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

To continue past these checks, add the flag --ignore-preflight-errors=all (use with care).

Another possible error:

[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.

This usually means the kubelet failed to start. Check its status and logs to find the cause; in my case swap had only been disabled temporarily, so it came back on after the machine rebooted:
systemctl status kubelet
journalctl -xeu kubelet

On success, kubeadm prints the following:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.10.45.192:6443 --token 82scon.3zopf5qra2b1s25i \
    --discovery-token-ca-cert-hash sha256:2ea38c2a269d105b09bbf2964c089c067f0c8e7b44c0504b5854fd9acac263e0 
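Note that the bootstrap token in this printout expires after 24 hours by default. If you add a node later, generate a fresh join command on the master:

```shell
# Creates a new token and prints the full `kubeadm join ...` command for it
kubeadm token create --print-join-command
```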

3. Set up kubectl access (the root user needs this too)

# run on the master node:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
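Alternatively, for the root user, pointing KUBECONFIG at the admin config works without copying the file (though the export only lasts for the current shell session unless added to the shell profile):

```shell
export KUBECONFIG=/etc/kubernetes/admin.conf
```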

4. Apply the flannel network

sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
## Check whether flannel was installed successfully
sudo kubectl -n kube-system get po -l app=flannel -o wide

5. Join the worker nodes

Following the output printed when the cluster was created on the master node, run the join command on each worker node:
kubeadm join 10.10.45.192:6443 --token 82scon.3zopf5qra2b1s25i \
    --discovery-token-ca-cert-hash sha256:2ea38c2a269d105b09bbf2964c089c067f0c8e7b44c0504b5854fd9acac263e0

W0508 15:42:29.235510    5131 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
	[WARNING Hostname]: hostname "st01015vm194" could not be reached
	[WARNING Hostname]: hostname "st01015vm194": lookup st01015vm194 on 192.168.16.24:53: no such host

Fix this by adding the hosts to /etc/hosts:

10.10.45.192 st01015vm192
10.10.45.193 st01015vm193
10.10.45.194 st01015vm194
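Appending these entries can be made idempotent, so re-running the setup does not create duplicates. A sketch operating on a temporary file (set HOSTS=/etc/hosts on the real nodes):

```shell
HOSTS=/tmp/hosts.demo   # demo path; use HOSTS=/etc/hosts on a real node
: > "$HOSTS"

for entry in '10.10.45.192 st01015vm192' \
             '10.10.45.193 st01015vm193' \
             '10.10.45.194 st01015vm194'; do
  # -qxF: quiet, whole-line, fixed-string match — append only if missing
  grep -qxF "$entry" "$HOSTS" || echo "$entry" >> "$HOSTS"
done

cat "$HOSTS"
```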

Run the join command on each worker node; once done, check the node status on the master:

[root@st01015vm192 ~]# kubectl get nodes
NAME           STATUS     ROLES    AGE     VERSION
st01015vm192   Ready      master   16d     v1.18.2
st01015vm193   NotReady   <none>   29s     v1.18.2
st01015vm194   NotReady   <none>   6m43s   v1.18.2

The worker nodes stay in NotReady; check the pod status:
kubectl get pods -n kube-system -o wide

kube-flannel-ds-amd64-sdvpf            0/1     Init:ImagePullBackOff   0          8m12s   10.10.45.193   st01015vm193   <none>           <none>
kube-flannel-ds-amd64-x58td            0/1     Init:ImagePullBackOff   0          14m     10.10.45.194   st01015vm194   <none>           <none>

The flannel pods on the worker nodes are failing; inspect the details:
kubectl describe pod -n kube-system kube-flannel-ds-amd64-sdvpf

Events:
  Type     Reason     Age                    From                   Message
  ----     ------     ----                   ----                   -------
  Normal   Scheduled  <unknown>              default-scheduler      Successfully assigned kube-system/kube-flannel-ds-amd64-sdvpf to st01015vm193
  Warning  Failed     2m22s (x2 over 4m33s)  kubelet, st01015vm193  Failed to pull image "quay.io/coreos/flannel:v0.12.0-amd64": rpc error: code = Unknown desc = context canceled
  Warning  Failed     2m22s (x2 over 4m33s)  kubelet, st01015vm193  Error: ErrImagePull
  Normal   BackOff    2m11s (x2 over 4m33s)  kubelet, st01015vm193  Back-off pulling image "quay.io/coreos/flannel:v0.12.0-amd64"
  Warning  Failed     2m11s (x2 over 4m33s)  kubelet, st01015vm193  Error: ImagePullBackOff
  Normal   Pulling    118s (x3 over 7m28s)   kubelet, st01015vm193  Pulling image "quay.io/coreos/flannel:v0.12.0-amd64"

The events show the image pull failing, most likely due to network problems. Failed pods are retried automatically, so you can wait patiently, or pull the image manually on the failing node and change imagePullPolicy in the flannel DaemonSet from Always to IfNotPresent, so that the local image is used:

By default, the kubelet will try to pull each image from the specified registry. However, if the imagePullPolicy property of the container is set to IfNotPresent or Never, then a local image is used (preferentially or exclusively, respectively).

docker pull quay.io/coreos/flannel:v0.12.0-amd64

After a while, all nodes report Ready:

[root@st01015vm192 ~]# kubectl get nodes
NAME           STATUS   ROLES    AGE   VERSION
st01015vm192   Ready    master   17d   v1.18.2
st01015vm193   Ready    <none>   33m   v1.18.2
st01015vm194   Ready    <none>   39m   v1.18.2

IV. Install the Web UI (Dashboard)

The Dashboard is not deployed by default. It can be deployed with the following command:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
To protect your cluster data, the Dashboard is deployed with a minimal RBAC configuration by default. Currently it only supports logging in with a Bearer token.

So we need to download the YAML and configure it ourselves.
We will modify the following section:

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443

  selector:
    k8s-app: kubernetes-dashboard

After the change:

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  selector:
    k8s-app: kubernetes-dashboard
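If the Dashboard has already been applied, the same change can be made to the live Service with kubectl patch instead of re-editing the YAML. A sketch (requires the Dashboard to be deployed and the cluster to be reachable):

```shell
# Switch the Service to NodePort and pin the node port to 30001
kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard \
  --type merge \
  -p '{"spec":{"type":"NodePort","ports":[{"port":443,"targetPort":8443,"nodePort":30001}]}}'
```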

Also change the image pull policy to IfNotPresent or Never, and pull the images in advance on every node in the cluster, to avoid pod creation failing on image pulls due to network problems:
docker pull kubernetesui/metrics-scraper:v1.0.4
docker pull kubernetesui/dashboard:v2.0.0

If one machine pulled the images successfully while the others keep failing, you can export the images on the working machine and import them on the failing ones.
Export:

docker save -o dashboard.tar kubernetesui/dashboard:v2.0.0
docker save -o metrics-scraper.tar kubernetesui/metrics-scraper:v1.0.4

Import:

docker load -i dashboard.tar 
docker load -i metrics-scraper.tar 

Apply:

kubectl apply -f recommended.yaml

Check the status:

kubectl get pod,svc,ing,deploy -n kubernetes-dashboard

Once all pods are running, open:

https://10.10.45.192:30001/

After installation, get the login token on the master node:

[root@st01015vm192 ~]# kubectl -n kubernetes-dashboard get secret
NAME                               TYPE                                  DATA   AGE
default-token-25jb9                kubernetes.io/service-account-token   3      23m
kubernetes-dashboard-certs         Opaque                                0      23m
kubernetes-dashboard-csrf          Opaque                                1      23m
kubernetes-dashboard-key-holder    Opaque                                2      23m
kubernetes-dashboard-token-fm795   kubernetes.io/service-account-token   3      23m
[root@st01015vm192 ~]# kubectl -n kubernetes-dashboard describe secret kubernetes-dashboard-token-fm795
Name:         kubernetes-dashboard-token-fm795
Namespace:    kubernetes-dashboard
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: kubernetes-dashboard
              kubernetes.io/service-account.uid: af7a61cf-901f-42f9-bcbe-6f521d026bc2

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1025 bytes
namespace:  20 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IlhCZGZQa0MtaVFmMHJ0YTRBS083emppS0tKSENvb24xeW9scHIxY19zU0kifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi1mbTc5NSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImFmN2E2MWNmLTkwMWYtNDJmOS1iY2JlLTZmNTIxZDAyNmJjMiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.haW6XvBAyog0BasqbaWJxPqWjTJKiemVBwP3J8dwFE43Q93Jx41yjxK41NRNaUflL8xL3Aj4CNIJ0YUQwlpIutIzOJq7rXWkneRI6tmgr3jsCarFtjwETph7-spg-WQAHXRQxt7hwMyxcNkJprEc13q6zGO_ycx9ei_hjjliXo0O8JMuQsL0rlm2zXrWOpRer5U77Hj33dnVSGrjvlD3X_5NsI0dlzG2MmKMFZHM0_PVbYFnSvWcEmLl_04_u5CJPtPfp9Pu6RTjy1lMOZtsHgBxqDC-oXxm0UP2Tcn2qlu_UDfIPhiL3r-QrwWFy7b3WpxJCcXwcm07pfUzijQ77A
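Instead of listing secrets and eyeballing the generated name, the token can be fetched in one go by resolving the secret name from the service account. A sketch (valid on this Kubernetes version, where service-account token secrets are created automatically; requires a reachable cluster):

```shell
# Resolve the auto-generated token secret name from the service account
SECRET=$(kubectl -n kubernetes-dashboard get sa kubernetes-dashboard \
  -o jsonpath='{.secrets[0].name}')

# Extract and decode the bearer token
kubectl -n kubernetes-dashboard get secret "$SECRET" \
  -o jsonpath='{.data.token}' | base64 -d
```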

Log in at https://10.10.45.192:30001/ with the token.

References

https://my.oschina.net/u/2539854/blog/3023384
https://juejin.im/post/5d089f49f265da1baa1e7611