16.9.2 High Availability and Disaster Recovery Design
In a production Kubernetes environment, high availability (HA) and disaster recovery (DR) must be planned across two layers: the control plane and the workloads. The goals are to eliminate single points of failure, shorten recovery time, and preserve data consistency. This section gives actionable deployment steps, architecture notes, and troubleshooting pointers.
Principle and architecture (control plane spread across multiple availability zones): diagram omitted.
Control plane high availability: deployment essentials and command examples#
Goal: multiple control plane nodes with an odd number of etcd members, a load balancer in front of the API servers, and leader election for the controller manager and scheduler.
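Why an odd member count: etcd commits writes only with a quorum of floor(n/2)+1 members, so a fourth member raises the quorum without improving fault tolerance. A quick sketch of the arithmetic:

```shell
# etcd fault tolerance: a cluster of n members needs quorum = n/2 + 1
# (integer division) and tolerates n - quorum member failures.
# Note that 4 members tolerate no more failures than 3 do.
for n in 1 2 3 4 5; do
  q=$(( n / 2 + 1 ))
  echo "members=$n quorum=$q tolerated_failures=$(( n - q ))"
done
```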
1) Verify control plane replicas and leader election
# Inspect kube-controller-manager and kube-scheduler leader election.
# On Kubernetes 1.22+ leader election uses Lease objects (older releases
# recorded the leader in Endpoints annotations):
kubectl -n kube-system get lease kube-controller-manager -o yaml
kubectl -n kube-system get lease kube-scheduler -o yaml
# Expected: each Lease names exactly one holderIdentity (the current leader)
2) etcd backup and compaction strategy (example)
# 1) Save a snapshot
ETCDCTL_API=3 etcdctl \
  --endpoints=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /backup/etcd-$(date +%F).db
# 2) Inspect the snapshot
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-$(date +%F).db
# 3) Trigger compaction (keep the most recent 1000 revisions)
REV=$(ETCDCTL_API=3 etcdctl endpoint status -w json \
  --endpoints=https://10.0.0.11:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key | jq -r '.[0].Status.header.revision')
# The subcommand is `compaction`, not `compact`; compaction frees old
# revisions logically, run `etcdctl defrag` afterwards to reclaim disk space
ETCDCTL_API=3 etcdctl compaction $((REV-1000)) \
  --endpoints=https://10.0.0.11:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
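The snapshot command above only helps if it runs unattended. One way is a systemd timer pair; the unit names, wrapper script path, and schedule below are assumptions, a sketch rather than a canonical layout:

```ini
# /etc/systemd/system/etcd-backup.service
[Unit]
Description=Nightly etcd snapshot

[Service]
Type=oneshot
# etcd-backup.sh wraps the `etcdctl snapshot save` command shown above
# and deletes snapshots older than the retention window
ExecStart=/usr/local/bin/etcd-backup.sh

# /etc/systemd/system/etcd-backup.timer
[Unit]
Description=Run etcd-backup daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Activate with `systemctl enable --now etcd-backup.timer`; `Persistent=true` runs a missed backup at the next boot.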
3) Load balancer in front of the API servers (HAProxy example)
# /etc/haproxy/haproxy.cfg
# Run on dedicated LB nodes (or bind a different port) so the frontend does
# not clash with a local apiserver already listening on 6443
global
    maxconn 2000
    # Admin socket used by the health-check commands in the troubleshooting
    # section below
    stats socket /run/haproxy/admin.sock mode 660 level admin
defaults
    mode tcp
    timeout connect 5s
    timeout client 30s
    timeout server 30s
frontend k8s_api
    bind *:6443
    default_backend k8s_api_backend
backend k8s_api_backend
    balance roundrobin
    option tcp-check
    server apiserver1 10.0.0.11:6443 check
    server apiserver2 10.0.0.12:6443 check
    server apiserver3 10.0.0.13:6443 check
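The load balancer must not itself become a new single point of failure. A common pattern is two HAProxy nodes sharing a virtual IP via keepalived (VRRP); the interface name, router ID, and VIP below are assumptions:

```text
# /etc/keepalived/keepalived.conf on the primary LB node; the backup node
# uses state BACKUP and a lower priority (e.g. 90)
vrrp_instance K8S_API {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        10.0.0.100/24
    }
}
```

When the primary LB node fails, the backup takes over the VIP within a few advert intervals, so clients keep a single stable API endpoint.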
4) Verify access through the load balancer
# The client kubeconfig points at the LB VIP/domain
kubectl --kubeconfig=/root/.kube/config get nodes
# Expected: the API stays reachable when a single control plane node fails
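For reference, this is the part of the kubeconfig that must point at the LB rather than any individual apiserver (10.0.0.100 is an assumed VIP):

```yaml
# Excerpt from a client kubeconfig
clusters:
- cluster:
    server: https://10.0.0.100:6443   # LB VIP/domain, not a single apiserver
    certificate-authority: /etc/kubernetes/pki/ca.crt
  name: prod
```

Note that the apiserver serving certificate must include the VIP or domain in its SANs; with kubeadm this is achieved by setting `--control-plane-endpoint` at init time.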
Troubleshooting examples
- API server unreachable:
# Check the LB port and backend health
ss -lntp | grep 6443
echo "show stat" | socat stdio /run/haproxy/admin.sock | head
# Check apiserver logs (journalctl applies to systemd-managed apiservers;
# with kubeadm the apiserver is a static pod, so use `crictl logs` or
# `kubectl -n kube-system logs kube-apiserver-<node>` instead)
journalctl -u kube-apiserver -n 200
- etcd cluster unstable:
ETCDCTL_API=3 etcdctl endpoint health --cluster \
--endpoints=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
Workload high availability: examples and key parameters#
1) Anti-affinity and PDB for critical components (example)
# CoreDNS: multiple replicas + zone anti-affinity + PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: coredns-pdb
  namespace: kube-system
spec:
  minAvailable: 2
  selector:
    matchLabels:
      k8s-app: kube-dns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  replicas: 3
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  k8s-app: kube-dns
              topologyKey: "topology.kubernetes.io/zone"
      containers:
      - name: coredns
        image: coredns/coredns:1.11.1
        ports:
        - containerPort: 53
          protocol: UDP
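On Kubernetes 1.19+, topologySpreadConstraints express "spread these pods across zones" more directly than preferred anti-affinity. A sketch of the equivalent constraint, placed under `template.spec` of the Deployment above:

```yaml
# Caps the difference in matching-pod count between any two zones at 1;
# ScheduleAnyway makes this a soft preference rather than a hard block
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      k8s-app: kube-dns
```

Unlike pairwise anti-affinity, spread constraints keep distribution even as replica counts grow beyond the number of zones.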
2) Service probes and graceful termination (example)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      terminationGracePeriodSeconds: 30
      containers:
      - name: web
        image: nginx:1.25
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
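On shutdown, the SIGTERM to the container races with the propagation of endpoint updates, so a terminating pod can still receive traffic briefly. A short preStop sleep closes that gap; the 5-second value is an assumption to tune, and terminationGracePeriodSeconds must exceed the sleep plus the application's own shutdown time. Added under the web container above:

```yaml
# Delay SIGTERM briefly so endpoint/LB updates propagate before
# the server stops accepting connections
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 5"]
```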
3) Stateful service with persistence (StatefulSet + PVC)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7.2
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 10Gi
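The `serviceName: redis` field refers to a headless Service that the manifest above does not define; it is what gives each pod a stable DNS name (redis-0.redis, redis-1.redis, ...). A minimal definition:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  clusterIP: None   # headless: per-pod DNS records instead of a single VIP
  selector:
    app: redis
  ports:
  - port: 6379
```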
Troubleshooting examples
- Pod restarts frequently:
kubectl describe pod web-xxx | sed -n '/Events/,$p'
kubectl logs web-xxx --previous
- Anti-affinity not taking effect (checking the CoreDNS example above):
kubectl get pod -n kube-system -o wide -l k8s-app=kube-dns
# Check whether pods landed on the same node or in the same zone
kubectl get node --show-labels | grep topology.kubernetes.io/zone
Disaster recovery design: RTO/RPO, backup, and restore procedures#
Goal: define the RTO (recovery time objective) and RPO (recovery point objective), and implement multi-cluster or off-site disaster recovery.
1) Resource manifest backup (GitOps / periodic export)
# Export key namespace resources
# Note: `kubectl get all` omits ConfigMaps, Secrets, and custom resources;
# prefer a GitOps repository or a dedicated backup tool for full coverage
kubectl get all -n prod -o yaml > /backup/prod-all-$(date +%F).yaml
kubectl get pvc -n prod -o yaml > /backup/prod-pvc-$(date +%F).yaml
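A fuller export loop covering the kinds that `kubectl get all` misses might look like the sketch below. The kind list is an assumption to adjust per workload; `KUBECTL` is stubbed with `echo` so the loop can be dry-run, set `KUBECTL=kubectl` against a real cluster:

```shell
# Dry-runnable export loop: each kind lands in its own dated YAML file
KUBECTL=${KUBECTL:-echo kubectl}        # stub; use KUBECTL=kubectl for real
BACKUP_DIR=${BACKUP_DIR:-/tmp/backup}
mkdir -p "$BACKUP_DIR"
for kind in deployment statefulset daemonset service ingress configmap secret pvc; do
  $KUBECTL get "$kind" -n prod -o yaml > "$BACKUP_DIR/prod-${kind}-$(date +%F).yaml"
done
ls "$BACKUP_DIR"
```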
2) etcd restore (drill)
# Stop the apiserver and etcd (commands differ between distributions; with
# kubeadm both run as static pods, so move their manifests out of
# /etc/kubernetes/manifests instead of using systemctl)
systemctl stop kube-apiserver
systemctl stop etcd
# Restore the snapshot into a fresh data directory
# (for a multi-member cluster, also pass --name, --initial-cluster and
# --initial-advertise-peer-urls on each member)
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-2024-01-01.db \
  --data-dir /var/lib/etcd-new
# Swap in the restored data directory
mv /var/lib/etcd /var/lib/etcd-bak
mv /var/lib/etcd-new /var/lib/etcd
systemctl start etcd
systemctl start kube-apiserver
3) Multi-cluster failover (DNS-level switch for Ingress, example)
# Shift traffic to the DR cluster (example: adjust record weights)
# Pseudocode: in practice this is done by your DNS management platform
dnscli set-weight --record app.example.com --cluster dr --weight 100
dnscli set-weight --record app.example.com --cluster primary --weight 0
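As one concrete realization of this weighted switch, assuming the records live in AWS Route 53 (zone ID, IPs, and TTL below are illustrative), a change batch applied via `aws route53 change-resource-record-sets --hosted-zone-id <zone-id> --change-batch file://failover.json` shifts all traffic to the DR cluster:

```json
{
  "Comment": "Shift all traffic to the DR cluster",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": "dr",
        "Weight": 100,
        "TTL": 60,
        "ResourceRecords": [{ "Value": "203.0.113.20" }]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": "primary",
        "Weight": 0,
        "TTL": 60,
        "ResourceRecords": [{ "Value": "203.0.113.10" }]
      }
    }
  ]
}
```

Keep the TTL low (here 60s) ahead of time, since cached records bound how fast clients actually move.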
Troubleshooting examples
- Resources missing after restore:
kubectl get ns
kubectl get all -A | head
# If resources are missing, verify the snapshot used and the restore procedure
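That check can be made systematic by diffing a pre-disaster inventory against the restored cluster. The inventories are simulated with fixed lists below; in a real drill generate them with `kubectl get ns -o name | sort`:

```shell
# Namespaces present before the disaster but missing after the restore
printf 'default\nkube-system\nprod\n' > /tmp/ns-before.txt
printf 'default\nkube-system\n' > /tmp/ns-after.txt
comm -23 /tmp/ns-before.txt /tmp/ns-after.txt
```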
Failure drills and verification (actions and expectations)#
1) Simulate a node failure
# Drain the node
kubectl cordon worker-1
kubectl drain worker-1 --ignore-daemonsets --delete-emptydir-data
# Expected: the PDB protects critical components and the service stays available
kubectl get pod -n kube-system -l k8s-app=kube-dns -o wide
2) Simulate an API server failure
# Stop the apiserver on one control plane node (with kubeadm, move the
# static pod manifest out of /etc/kubernetes/manifests instead)
systemctl stop kube-apiserver
# Expected: kubectl still works through the load balancer
kubectl get nodes
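Drills are also the moment to measure the actual RTO. The sketch below times how long until a health probe succeeds again; the probe is stubbed with a flag file so the loop is self-contained, in a real drill replace `probe` with `kubectl get --raw /readyz`:

```shell
# Time from failure until the (stubbed) API answers again
probe() { [ -e /tmp/api-up ]; }
rm -f /tmp/api-up
( sleep 2; touch /tmp/api-up ) &   # simulate recovery after ~2 seconds
start=$(date +%s)
until probe; do sleep 1; done
rto=$(( $(date +%s) - start ))
echo "RTO: ${rto}s"
```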
Exercises (with verification commands)#
1) Add a PDB and anti-affinity to a stateless service and verify its pods spread across availability zones.
Verification commands:
kubectl get pdb
kubectl get pod -o wide -l app=<your-app>
kubectl get node --show-labels | grep topology.kubernetes.io/zone
2) Take an etcd snapshot, run a restore drill, and record the RTO.
Verification commands:
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-*.db
kubectl get nodes
3) Run the Ingress Controller with multiple replicas and configure an HPA for it.
Verification commands:
kubectl get deploy -n ingress-nginx
kubectl get hpa -n ingress-nginx
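For exercise 3, a sketch of the HPA using the autoscaling/v2 API (which requires metrics-server; the Deployment name matches a typical ingress-nginx install but may differ in yours):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingress-nginx-controller
  minReplicas: 3      # keep the HA floor even at low load
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```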
The above can serve as a minimal, production-grade blueprint for high availability and disaster recovery; validate recovery capability continuously through regular drills and automation scripts.