Setting up a Kubernetes cluster with kubeadm
0. Environment
Components:
- kubernetes: v1.16.8
- docker: 18.09.9
- calico: v3.14.1
Nodes:
- k8s01: 10.13.84.186 (master)
- k8s02: 10.13.84.187 (node)
- k8s03: 10.13.84.188 (node)
Note: unless stated otherwise, run every command on all nodes as root.
1. Preparation
Following the node list above, set each node's hostname, for example:
hostnamectl set-hostname k8s01
Log in again and the new hostname takes effect.
For convenient access between nodes, add the following entries:
cat >> /etc/hosts <<EOF
10.13.84.186 k8s01
10.13.84.187 k8s02
10.13.84.188 k8s03
EOF
Add an extra DNS server so nodes can resolve external domains without errors:
It is recommended to also put this into /etc/rc.local (see the sketch after the block below), otherwise the change is lost after a reboot.
cat >> /etc/resolv.conf <<EOF
nameserver 114.114.114.114
EOF
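A minimal sketch of persisting this via /etc/rc.local as suggested (assuming the stock CentOS 7 rc.local, which must be executable to run at boot; the grep guard avoids appending a duplicate line on every boot):
cat >> /etc/rc.local <<'RCEOF'
grep -q "nameserver 114.114.114.114" /etc/resolv.conf || echo "nameserver 114.114.114.114" >> /etc/resolv.conf
RCEOF
chmod +x /etc/rc.local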
2. Upgrade the kernel (CentOS 7)
# Import the ELRepo GPG key; if the download fails, wget it locally first and then import it
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# Install the ELRepo repository; if the download fails, wget it locally first and then install it
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# Load the elrepo-kernel repository metadata
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist
# List the available kernel packages
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*
# Install the long-term support (LTS) kernel
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64
# Remove the old kernel tools packages
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64 -y
# Install the matching new tools package
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64
# You may hit the following error:
# Error: Package: kernel-lt-tools-4.4.218-1.el7.elrepo.x86_64 (elrepo-kernel)
#        Requires: libpci.so.3(LIBPCI_3.5)(64bit)
# Error: Package: kernel-lt-tools-4.4.218-1.el7.elrepo.x86_64 (elrepo-kernel)
#        Requires: libpci.so.3(LIBPCI_3.3)(64bit)
# Fix: https://centos.pkgs.org/7/centos-x86_64/pciutils-libs-3.5.1-3.el7.x86_64.rpm.html
# Run the command below, then re-run the previous command
yum install pciutils-libs
# List the boot menu entries (default boot order)
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
# Set the default boot entry to the new kernel (use its index from the list above, counting from 0)
grub2-set-default 1
# Reboot and verify
reboot
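After the reboot, confirm the new kernel is actually running (uname prints the active kernel release):
uname -r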
3. Kernel tuning
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
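Note that modprobe only loads br_netfilter for the current boot; a small sketch of making it persistent via systemd-modules-load (a standard mechanism on CentOS 7), plus a check that the sysctl values took effect:
echo "br_netfilter" > /etc/modules-load.d/br_netfilter.conf
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward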
4. Raise the file descriptor limits
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
Notes on these limits:
- A hard limit can only be raised by a privileged user (root); an ordinary user may lower it, but the change only lasts for that user's session.
- A soft limit is the value the kernel actually enforces on a process; it can be changed by the user but may not exceed the hard limit.
1) soft nofile / hard nofile: the soft and hard limits on how many files a single user may have open (set to 65536 here), no matter how many shells they start.
2) soft nproc / hard nproc: the soft and hard limits on the maximum number of processes for a single user.
3) memlock: the maximum amount of physical memory a task may lock (set to unlimited here).
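These limits only apply to new login sessions; after logging in again, a quick check (ulimit -n prints the soft nofile limit and ulimit -u the nproc limit for the current shell):
ulimit -n
ulimit -u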
5. Disable SELinux, the firewall and swap
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
swapoff -a
yes | cp /etc/fstab /etc/fstab_bak
grep -v swap /etc/fstab_bak > /etc/fstab
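A quick check that swap is fully off (free should report 0 swap and swapon -s should print nothing):
free -m
swapon -s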
6. Time synchronization
# Out-of-sync host clocks will cause synchronization errors between nodes
# If NTP was already enabled when the OS was installed, skip this step
# Check that the time and timezone are correct and whether NTP is enabled
yum install -y ntp
ntpdate pool.ntp.org
systemctl enable ntpd.service
systemctl restart ntpd.service
systemctl status ntpd.service
ntpdate pool.ntp.org
timedatectl
7. Install dependencies and tools
yum install -y epel-release
# If that fails, try the following
rpm -vih http://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/epel-release-7-12.noarch.rpm
yum clean all && yum makecache
yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools conntrack-tools libseccomp libtool-ltdl lrzsz
8. Configure the IPVS kernel modules
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
yum install ipset ipvsadm -y
9. Configure the Kubernetes yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
10. Install Docker
Remove any old versions:
yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-selinux \
docker-engine-selinux \
docker-engine
Install the new version:
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Check the repo files
ll /etc/yum.repos.d/
yum makecache fast
# Pick the version you want to install; the newest is not always the best choice
yum list docker-ce --showduplicates | sort -r
yum install -y docker-ce-18.09.9-3.el7
cat > /etc/docker/daemon.json <<EOF
{
  "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn", "https://hub-mirror.c.163.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "max-concurrent-downloads": 20,
  "live-restore": true,
  "max-concurrent-uploads": 10,
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  }
}
EOF
systemctl daemon-reload
systemctl enable docker.service && systemctl start docker.service
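Once Docker is up, it is worth confirming that the systemd cgroup driver is in effect, since kubelet must use the same driver:
docker info | grep -i cgroup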
11. Install kubelet, kubeadm and kubectl
export K8S_VERSION=1.16.8
yum install -y --disableexcludes=kubernetes kubelet-$K8S_VERSION kubeadm-$K8S_VERSION kubectl-$K8S_VERSION
systemctl enable kubelet.service
# Do not start kubelet yet
12. Pre-pull the required images (for hosts inside mainland China)
kubeadm config print init-defaults > /root/kubeadm.conf
Edit the following lines in /root/kubeadm.conf:
imageRepository: registry.aliyuncs.com/google_containers
kubernetesVersion: v1.16.8
Pull the images:
kubeadm config images pull --config /root/kubeadm.conf
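To preview exactly which images kubeadm will pull before downloading anything, the list can be printed first:
kubeadm config images list --config /root/kubeadm.conf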
Recommendation: finish all of the commands above on every node before moving on to the steps below.
13. Initialize the cluster (run on the master)
Mind the version:
kubeadm init --kubernetes-version=v1.16.8 \
--pod-network-cidr=10.244.0.0/16
# --service-cidr is not set here because we will deploy the Calico network below; Calico will take care of the service network for us, and setting it here can cause the Calico deployment to fail
Output (save this somewhere, you will need it later):
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.13.84.186:6443 --token 4q9g1x.j42gsbmfz1e9d1jv \
--discovery-token-ca-cert-hash sha256:2d960f1d625e95087b295c322df6d5eb5e0d7f8b84cf986b75ba5a7fc09dae97
Create the kubeconfig:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
Check:
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-5644d7b6d9-tv99t 0/1 Pending 0 4m31s
kube-system coredns-5644d7b6d9-vfhpc 0/1 Pending 0 4m31s
kube-system etcd-k8s01 1/1 Running 0 3m44s
kube-system kube-apiserver-k8s01 1/1 Running 0 3m26s
kube-system kube-controller-manager-k8s01 1/1 Running 0 3m52s
kube-system kube-proxy-rp67r 1/1 Running 0 4m31s
kube-system kube-scheduler-k8s01 1/1 Running 0 3m43s
# coredns is Pending; ignore it for now, this is caused by the network plugin not being installed yet
14. Deploy the Calico network plugin (run on the master)
Official documentation:
https://docs.projectcalico.org/getting-started/kubernetes/quickstart
Configure NetworkManager
Configure NetworkManager before attempting to use Calico networking.
NetworkManager manipulates the routing table for interfaces in the default network namespace where Calico veth pairs are anchored for connections to containers. This can interfere with the Calico agent’s ability to route correctly.
On every node:
Create the following configuration file at /etc/NetworkManager/conf.d/calico.conf
to prevent NetworkManager from interfering with the interfaces:
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*
# This was v3.14.1 at the time of writing
$ wget https://docs.projectcalico.org/manifests/calico.yaml
$ vim calico.yaml
1) Turn off IPIP mode and set typha_service_name
- name: CALICO_IPV4POOL_IPIP
value: "off"
typha_service_name: "calico-typha"
Calico defaults to IPIP mode (a tunl0 interface is created on every node and all container traffic is tunneled through it; the official docs recommend it when nodes sit in different IP subnets, e.g. AWS hosts in different zones).
Here we switch to BGP mode: Calico runs as a DaemonSet on every node and starts a bird (BGP client) process on each host, which advertises the IP ranges assigned to each node to the rest of the cluster; traffic is then forwarded directly through the host NIC (eth0, ens33, ...).
2) Adjust replicas
replicas: 1
3) Set the pod network CIDR via CALICO_IPV4POOL_CIDR
- name: CALICO_IPV4POOL_CIDR
value: "10.244.0.0/16"
4) If you want to pull the images by hand, check the image versions referenced in calico.yaml; otherwise they are pulled automatically when the manifest is applied
$ cat calico.yaml |grep image
5) Deploy Calico
$ kubectl apply -f calico.yaml
6) Check
$ kubectl get pods --all-namespaces
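It usually takes a few minutes for calico-node and calico-kube-controllers to pull their images and become Running; watching the namespace is an easy way to follow progress:
$ watch kubectl get pods -n kube-system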
15. Join the worker nodes to the cluster (run on every worker node)
kubeadm join 10.13.84.186:6443 --token 4q9g1x.j42gsbmfz1e9d1jv \
--discovery-token-ca-cert-hash sha256:2d960f1d625e95087b295c322df6d5eb5e0d7f8b84cf986b75ba5a7fc09dae97
Note: if more machines need to join the cluster later, here is how to get the token and hash again; otherwise skip this.
# 1) Get a token
$ kubeadm token list
# By default a token expires after 24 hours; once it has expired, generate a new one with
$ kubeadm token create
# 2) Get the CA cert hash
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
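As a shortcut, kubeadm can also print a complete, ready-to-use join command (token plus hash) in one step:
$ kubeadm token create --print-join-command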
Be patient and wait a while.
Then check the cluster state on the master:
# Run
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s01 Ready master 11m v1.16.8
k8s02 Ready <none> 3m29s v1.16.8
k8s03 Ready <none> 3m27s v1.16.8
# Allow scheduling on the master (remove its taint)
$ kubectl taint nodes --all node-role.kubernetes.io/master-
# Once Calico is running normally, run
$ ip route show
default via 10.13.84.1 dev ens32 proto static metric 100
10.13.84.0/23 dev ens32 proto kernel scope link src 10.13.84.186 metric 100
10.244.1.0/26 via 10.13.84.187 dev ens32 proto bird
10.244.2.0/26 via 10.13.84.188 dev ens32 proto bird
10.244.73.64 dev cali2fd97f91c35 scope link
blackhole 10.244.73.64/26 proto bird
10.244.73.65 dev cali316268cf0c4 scope link
10.244.73.66 dev cali4a5794afb0a scope link
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
$ curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v3.14.1/calicoctl
$ chmod +x ./calicoctl
#export CALICO_DATASTORE_TYPE=kubernetes
#export CALICO_KUBECONFIG=~/.kube/config
$ ./calicoctl node status
Calico process is running.
IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+--------------+-------------------+-------+----------+-------------+
| 10.13.84.188 | node-to-node mesh | up | 03:09:22 | Established |
| 10.13.84.187 | node-to-node mesh | up | 03:09:35 | Established |
+--------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
16. Enable IPVS mode for kube-proxy
# Edit config.conf in the kube-system/kube-proxy ConfigMap and set `mode: "ipvs"`:
$ kubectl edit cm kube-proxy -n kube-system
# Wait ~10s, then delete the kube-proxy pods so they are recreated with the new config
$ kubectl get pod -n kube-system | grep kube-proxy | awk '{system("kubectl delete pod "$1" -n kube-system")}'
$ ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.96.0.1:443 rr
-> 10.13.84.186:6443 Masq 1 0 0
TCP 10.96.0.10:53 rr
-> 10.244.73.65:53 Masq 1 0 0
-> 10.244.73.66:53 Masq 1 0 0
TCP 10.96.0.10:9153 rr
-> 10.244.73.65:9153 Masq 1 0 0
-> 10.244.73.66:9153 Masq 1 0 0
UDP 10.96.0.10:53 rr
-> 10.244.73.65:53 Masq 1 0 0
-> 10.244.73.66:53 Masq 1 0 0
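Another way to confirm the switch is a kube-proxy pod's log, which should mention the ipvs proxier on startup (exact wording varies by version):
$ kubectl logs -n kube-system $(kubectl get pod -n kube-system -l k8s-app=kube-proxy -o name | head -1) | grep -i ipvs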
17. Install NodeLocal DNS
nodelocaldns.yaml
# Copyright 2018 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
#apiVersion: v1
#kind: Service
#metadata:
#  name: kube-dns-upstream
#  namespace: kube-system
#  labels:
#    k8s-app: kube-dns
#    kubernetes.io/cluster-service: "true"
#    addonmanager.kubernetes.io/mode: Reconcile
#    kubernetes.io/name: "KubeDNSUpstream"
#spec:
#  ports:
#  - name: dns
#    port: 53
#    protocol: UDP
#    targetPort: 53
#  - name: dns-tcp
#    port: 53
#    protocol: TCP
#    targetPort: 53
#  selector:
#    k8s-app: kube-dns
#---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  Corefile: |
    cluster.local:53 {
        errors
        cache {
                success 9984 30
                denial 9984 5
        }
        reload
        loop
        bind 169.254.20.10
        forward . 10.96.0.10 {
                force_tcp
        }
        prometheus :9253
        health 169.254.20.10:8080
        }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10
        forward . 10.96.0.10 {
                force_tcp
        }
        prometheus :9253
        }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10
        forward . 10.96.0.10 {
                force_tcp
        }
        prometheus :9253
        }
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10
        forward . /etc/resolv.conf
        prometheus :9253
        }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    k8s-app: node-local-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
      annotations:
        prometheus.io/port: "9253"
        prometheus.io/scrape: "true"
    spec:
      priorityClassName: system-node-critical
      serviceAccountName: node-local-dns
      hostNetwork: true
      dnsPolicy: Default  # Don't use cluster DNS.
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - effect: "NoExecute"
        operator: "Exists"
      - effect: "NoSchedule"
        operator: "Exists"
      containers:
      - name: node-cache
        image: k8s.gcr.io/k8s-dns-node-cache:1.15.13
        resources:
          requests:
            cpu: 25m
            memory: 5Mi
        args: [ "-localip", "169.254.20.10", "-conf", "/etc/Corefile", "-upstreamsvc", "kube-dns" ]
        securityContext:
          privileged: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9253
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            host: 169.254.20.10
            path: /health
            port: 8080
          initialDelaySeconds: 60
          timeoutSeconds: 5
        volumeMounts:
        - mountPath: /run/xtables.lock
          name: xtables-lock
          readOnly: false
        - name: config-volume
          mountPath: /etc/coredns
        - name: kube-dns-config
          mountPath: /etc/kube-dns
      volumes:
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      - name: config-volume
        configMap:
          name: node-local-dns
          items:
          - key: Corefile
            path: Corefile.base
$ kubectl apply -f nodelocaldns.yaml
$ kubectl get pods -n kube-system
Once it is running normally, edit /var/lib/kubelet/config.yaml on every node so that kubelet points pods at the local cache:
clusterDNS:
- 169.254.20.10
Then on every node:
systemctl daemon-reload && systemctl restart kubelet.service
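Once the pods are running, the cache can be queried directly from any node; node-local-dns listens on 169.254.20.10, so a cluster name should resolve through it (dig comes from the bind-utils package, which may need installing first):
$ dig @169.254.20.10 kubernetes.default.svc.cluster.local +short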
18. Test
$ kubectl run cirros-$RANDOM --rm -it --image=cirros -- sh
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
If you don't see a command prompt, try pressing enter.
/ # cat /etc/resolv.conf
nameserver 169.254.20.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ #
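Inside the pod, service names should now resolve via the local cache; cirros ships the busybox nslookup applet, so a minimal check is:
/ # nslookup kubernetes.default.svc.cluster.local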
Cleanup
To tear the cluster down, run the following on all nodes:
kubeadm reset
ipvsadm --clear
iptables -F
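kubeadm reset does not remove the CNI configuration (its own output points this out), so when rebuilding a cluster it helps to clear it as well, along with the old kubeconfig:
rm -rf /etc/cni/net.d
rm -f $HOME/.kube/config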
Problems
The most common problems by far are node networking issues.