
Kubernetes etcd Backup and Restore

程序员文章站 2022-07-13 22:36:39

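Introduction to etcd

etcd is a distributed, consistent key-value store used for shared configuration and service discovery. It is an open-source project started by CoreOS and released under the Apache license.

etcd use cases

etcd has many use cases, including but not limited to:

  • Configuration management
  • Service registration and discovery
  • Leader election
  • Application scheduling
  • Distributed queues
  • Distributed locks

etcd stores all k8s data

etcd is a critical service in a k8s cluster: it stores all of the cluster's data. If a disaster strikes or the etcd data is lost, the cluster cannot be recovered unless a backup exists. This article therefore focuses on how to back up and restore that data.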

Some etcd query operations

Check cluster health

# ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key  --endpoints="https://172.16.2.91:2379,https://172.16.2.92:2379,https://172.16.2.93:2379" endpoint health -w table
+--------------------------+--------+-------------+-------+
|         ENDPOINT         | HEALTH |    TOOK     | ERROR |
+--------------------------+--------+-------------+-------+
| https://172.16.2.91:2379 |   true |  14.99871ms |       |
| https://172.16.2.93:2379 |   true | 15.118918ms |       |
| https://172.16.2.92:2379 |   true | 15.465182ms |       |
+--------------------------+--------+-------------+-------+

Get the value of a specific key

ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key  --endpoints="https://172.16.2.91:2379,https://172.16.2.92:2379,https://172.16.2.93:2379" get /registry/apiregistration.k8s.io/apiservices/v1.apps

Get etcd version information

ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key  --endpoints="https://172.16.2.91:2379,https://172.16.2.92:2379,https://172.16.2.93:2379" version

List all keys in etcd

ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key  --endpoints="https://172.16.2.91:2379,https://172.16.2.92:2379,https://172.16.2.93:2379" get / --prefix --keys-only
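
Every query above repeats the same TLS flags and endpoint list. etcdctl can also read its flags from environment variables prefixed with ETCDCTL_, so a shell session can set them once. A minimal sketch, assuming the certificate paths and endpoints used in this article:

```shell
# Set once per shell session. Paths and endpoints are the ones used in this article.
export ETCDCTL_API=3
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
export ETCDCTL_ENDPOINTS=https://172.16.2.91:2379,https://172.16.2.92:2379,https://172.16.2.93:2379

# After that, the queries shorten to, for example:
#   etcdctl endpoint health -w table
#   etcdctl get / --prefix --keys-only
```

Depending on the etcdctl version, setting a variable and also passing the matching command-line flag may be reported as a conflict, so it is safer to use one style or the other.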

Environment

Host         IP
k8s-master1  172.16.2.91
k8s-master2  172.16.2.92
k8s-master3  172.16.2.93
  • ETCD version 3.4.9
  • Kubernetes version v1.14.10, admin install

Backup

Note: etcdctl commands differ somewhat between etcd versions, but they are largely the same. This article backs up with snapshot save, and each backup only needs to be taken from one node.

Backup by command (run on k8s-master1):

# mkdir -p /tmp/backup/etcd/		# directory that holds the backup data
# ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key  --endpoints="https://172.16.2.91:2379" snapshot save /tmp/backup/etcd/etcd-snapshot-`date +%Y%m%d`.db
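
A snapshot is only useful if it can be read back, so it is worth inspecting the file right after saving it. The sketch below wraps etcdctl's snapshot status subcommand; the function name check_snapshot is a hypothetical helper introduced here, not part of etcdctl.

```shell
# check_snapshot FILE: fail early if the snapshot is missing or empty,
# then print etcdctl's own summary of the file.
check_snapshot() {
  local file="$1"
  if [ ! -s "$file" ]; then
    echo "snapshot missing or empty: $file" >&2
    return 1
  fi
  ETCDCTL_API=3 etcdctl snapshot status "$file" -w table
}

# Usage (the file name follows the date pattern used above):
#   check_snapshot "/tmp/backup/etcd/etcd-snapshot-$(date +%Y%m%d).db"
```

The status subcommand works on the local file, so no certificates or endpoints are needed for this check.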

Backup script (run on k8s-master1):

#!/usr/bin/env bash
set -euo pipefail

CACERT="/etc/kubernetes/pki/etcd/ca.crt"
CERT="/etc/kubernetes/pki/etcd/server.crt"
KEY="/etc/kubernetes/pki/etcd/server.key"
ENDPOINTS="https://172.16.2.91:2379"
BACKUP_DIR="/tmp/backup/etcd"

# Make sure the backup directory exists
mkdir -p "${BACKUP_DIR}"

# Take a snapshot named after today's date
ETCDCTL_API=3 etcdctl \
--cacert="${CACERT}" --cert="${CERT}" --key="${KEY}" \
--endpoints="${ENDPOINTS}" \
snapshot save "${BACKUP_DIR}/etcd-snapshot-$(date +%Y%m%d).db"

# Keep backups for 30 days
find "${BACKUP_DIR}" -name '*.db' -mtime +30 -exec rm -f {} \;
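
The retention rule at the end of the script deletes files, so it can be reassuring to rehearse it on a scratch directory first. A minimal sketch, assuming GNU find and GNU touch; prune_backups and demo_dir are hypothetical names used only for this illustration.

```shell
# prune_backups DIR DAYS: delete *.db files directly inside DIR that were
# last modified more than DAYS days ago, printing each file it removes.
prune_backups() {
  local dir="$1" days="$2"
  find "$dir" -maxdepth 1 -type f -name '*.db' -mtime +"$days" -print -delete
}

# Rehearsal in a throwaway directory (touch -d is GNU coreutils syntax).
demo_dir=$(mktemp -d)
touch -d '40 days ago' "$demo_dir/etcd-snapshot-old.db"
touch "$demo_dir/etcd-snapshot-new.db"
prune_backups "$demo_dir" 30
ls "$demo_dir"
```

Quoting the '*.db' pattern matters: left unquoted, the shell may expand it against the current directory before find ever sees it.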

Restore

Preparation

Stop the kube-apiserver service on every master

$ systemctl stop kube-apiserver
# Confirm that kube-apiserver has stopped
$ ps -ef | grep kube-apiserver

Stop the etcd service on every node in the cluster

systemctl stop etcd

Move the existing data out of the etcd data directory on every node

mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd.bak

Copy the etcd snapshot to the other nodes

# Copy the backup from k8s-master1 to the other two masters (replace user with your SSH account)
$ scp /tmp/backup/etcd/etcd-snapshot-20210610.db user@172.16.2.92:/tmp/backup/etcd/
$ scp /tmp/backup/etcd/etcd-snapshot-20210610.db user@172.16.2.93:/tmp/backup/etcd/

Restore the backup

# Run on k8s-master1
$ ETCDCTL_API=3 etcdctl snapshot restore /tmp/backup/etcd/etcd-snapshot-20210610.db \
  --name k8s-m1 \
  --initial-cluster "k8s-m1=https://172.16.2.91:2380,k8s-m2=https://172.16.2.92:2380,k8s-m3=https://172.16.2.93:2380" \
  --initial-cluster-token etcd-cluster \
  --initial-advertise-peer-urls https://172.16.2.91:2380 \
  --data-dir=/var/lib/etcd/default.etcd
  
# Run on k8s-master2
$ ETCDCTL_API=3 etcdctl snapshot restore /tmp/backup/etcd/etcd-snapshot-20210610.db \
  --name k8s-m2 \
  --initial-cluster "k8s-m1=https://172.16.2.91:2380,k8s-m2=https://172.16.2.92:2380,k8s-m3=https://172.16.2.93:2380"  \
  --initial-cluster-token etcd-cluster \
  --initial-advertise-peer-urls https://172.16.2.92:2380 \
  --data-dir=/var/lib/etcd/default.etcd
  
# Run on k8s-master3
$ ETCDCTL_API=3 etcdctl snapshot restore /tmp/backup/etcd/etcd-snapshot-20210610.db \
  --name k8s-m3 \
  --initial-cluster "k8s-m1=https://172.16.2.91:2380,k8s-m2=https://172.16.2.92:2380,k8s-m3=https://172.16.2.93:2380"  \
  --initial-cluster-token etcd-cluster \
  --initial-advertise-peer-urls https://172.16.2.93:2380 \
  --data-dir=/var/lib/etcd/default.etcd
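
The three restore commands differ only in the member name and the advertised peer URL. One way to avoid copy-and-paste mistakes is to parameterize them, as in the sketch below. restore_member and INITIAL_CLUSTER are hypothetical names introduced here; the member names, addresses, token and data directory are taken from the commands above.

```shell
# Cluster layout shared by all three members (same value as in the commands above).
INITIAL_CLUSTER="k8s-m1=https://172.16.2.91:2380,k8s-m2=https://172.16.2.92:2380,k8s-m3=https://172.16.2.93:2380"

# restore_member NAME IP SNAPSHOT: restore the local etcd member from SNAPSHOT.
restore_member() {
  local name="$1" ip="$2" snapshot="$3"
  ETCDCTL_API=3 etcdctl snapshot restore "$snapshot" \
    --name "$name" \
    --initial-cluster "$INITIAL_CLUSTER" \
    --initial-cluster-token etcd-cluster \
    --initial-advertise-peer-urls "https://${ip}:2380" \
    --data-dir=/var/lib/etcd/default.etcd
}

# Usage, one call on each machine:
#   restore_member k8s-m1 172.16.2.91 /tmp/backup/etcd/etcd-snapshot-20210610.db   # on k8s-master1
#   restore_member k8s-m2 172.16.2.92 /tmp/backup/etcd/etcd-snapshot-20210610.db   # on k8s-master2
#   restore_member k8s-m3 172.16.2.93 /tmp/backup/etcd/etcd-snapshot-20210610.db   # on k8s-master3
```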

Once all three etcd members have been restored, log in to each of the three machines in turn and start etcd

$ systemctl start etcd

After all three etcd members have started, check the etcd cluster health

ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key  --endpoints="https://172.16.2.91:2379,https://172.16.2.92:2379,https://172.16.2.93:2379" endpoint health -w table
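
Members can take a moment to elect a leader after starting, so the first health check may fail even though the restore went well. A minimal sketch of a retry loop around the same health check; wait_for_etcd, the retry count and the delay are hypothetical choices, and the loop relies on endpoint health exiting non-zero while an endpoint is unhealthy.

```shell
# wait_for_etcd ENDPOINTS [RETRIES] [DELAY_SECONDS]:
# poll "endpoint health" until it succeeds or the retries run out.
wait_for_etcd() {
  local endpoints="$1" retries="${2:-10}" delay="${3:-3}"
  local attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if ETCDCTL_API=3 etcdctl \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/server.crt \
        --key=/etc/kubernetes/pki/etcd/server.key \
        --endpoints="$endpoints" endpoint health >/dev/null 2>&1; then
      echo "etcd healthy after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "etcd still unhealthy after $retries attempts" >&2
  return 1
}

# Usage:
#   wait_for_etcd "https://172.16.2.91:2379,https://172.16.2.92:2379,https://172.16.2.93:2379"
```

Starting kube-apiserver only after this returns 0 keeps the restore order described in this article.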

When all three etcd members report healthy, start kube-apiserver on each master

$ systemctl start kube-apiserver

Check whether the Kubernetes cluster is back to normal

$ kubectl get cs

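Summary

Backing up a Kubernetes cluster mainly means backing up the etcd cluster. When restoring, the main thing to get right is the order of the steps:

Stop kube-apiserver --> stop etcd --> restore data --> start etcd --> start kube-apiserver

Note: when backing up an etcd cluster, a snapshot from a single member is enough; when restoring, restore every member from that same snapshot.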