16.9.2 High Availability and Disaster Recovery Design

In a production Kubernetes environment, high availability and disaster recovery must be planned across two layers: the control plane and the workloads. The goals are to eliminate single points of failure, shorten recovery time, and preserve data consistency. The sections below give actionable rollout steps, an architecture sketch, and troubleshooting notes.

Architecture sketch (control plane + multiple availability zones):

[Figure: control-plane + multi-AZ architecture sketch]

Control-plane high availability: deployment essentials and example commands

Goal: multiple control-plane nodes with an odd number of etcd members, a load balancer in front of the API Servers, and leader election for the controller manager and scheduler.
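On kubeadm-based clusters, this stacked-etcd HA layout is typically declared at init time via `controlPlaneEndpoint`. A minimal sketch, where the version number and the `lb.example.com` address are placeholders for your own values:

```yaml
# kubeadm-config.yaml -- sketch only; version and LB address are assumptions
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0
# All kubeconfigs and join commands target the LB, never a single master
controlPlaneEndpoint: "lb.example.com:6443"
etcd:
  local:
    dataDir: /var/lib/etcd
```

Initialize with `kubeadm init --config kubeadm-config.yaml --upload-certs`, then add the remaining control-plane nodes with the printed `kubeadm join ... --control-plane` command.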

1) Verify control-plane component replicas and leader election

# Check leader election for kube-controller-manager and kube-scheduler
# (recent Kubernetes versions record the leader in Lease objects;
#  older versions annotated the Endpoints of the same name instead)
kubectl -n kube-system get lease kube-controller-manager -o yaml
kubectl -n kube-system get lease kube-scheduler -o yaml
# Expected: holderIdentity names exactly one leader per component

2) etcd backup and compaction policy (example)

# 1) Save a snapshot
ETCDCTL_API=3 etcdctl \
  --endpoints=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /backup/etcd-$(date +%F).db

# 2) Inspect the snapshot
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-$(date +%F).db

# 3) Trigger compaction (keep the most recent 1000 revisions)
REV=$(ETCDCTL_API=3 etcdctl endpoint status -w json \
  --endpoints=https://10.0.0.11:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key | jq -r '.[0].Status.header.revision')
ETCDCTL_API=3 etcdctl compact $((REV-1000)) \
  --endpoints=https://10.0.0.11:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
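The snapshot and retention steps above are usually automated rather than run by hand. A minimal cron-style sketch, assuming the same endpoints and cert paths as above and a hypothetical /backup directory with 7-day retention:

```shell
#!/bin/sh
# Sketch: daily etcd snapshot with simple age-based retention.
# Endpoints, cert paths, and /backup are assumptions -- adjust to your cluster.

BACKUP_DIR="${BACKUP_DIR:-/backup}"
KEEP_DAYS="${KEEP_DAYS:-7}"

take_snapshot() {
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://10.0.0.11:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    snapshot save "$BACKUP_DIR/etcd-$(date +%F-%H%M).db"
}

prune_old() {
  # Delete snapshots older than KEEP_DAYS days
  find "$BACKUP_DIR" -name 'etcd-*.db' -type f -mtime +"$KEEP_DAYS" -delete
}

# Typical cron entry:  0 2 * * *  /usr/local/bin/etcd-backup.sh
# where the script body runs: take_snapshot && prune_old
```

Keeping the retention step separate from the snapshot step means a failed backup never deletes existing snapshots.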

3) Load balancer in front of the API Servers (HAProxy example)

# /etc/haproxy/haproxy.cfg
global
  maxconn 2000
defaults
  mode tcp
  timeout connect 5s
  timeout client  30s
  timeout server  30s

frontend k8s_api
  bind *:6443
  default_backend k8s_api_backend

backend k8s_api_backend
  balance roundrobin
  option tcp-check
  server apiserver1 10.0.0.11:6443 check
  server apiserver2 10.0.0.12:6443 check
  server apiserver3 10.0.0.13:6443 check
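The HAProxy instance above is itself a single point of failure, so in practice it is usually run on two nodes behind a VRRP-managed VIP. A minimal keepalived sketch, where the interface name and VIP are assumptions:

```
# /etc/keepalived/keepalived.conf (on the primary LB node)
vrrp_instance K8S_API {
  state MASTER          # BACKUP on the second LB node
  interface eth0
  virtual_router_id 51
  priority 100          # use a lower value (e.g. 90) on the backup
  virtual_ipaddress {
    10.0.0.100/24       # the VIP that client kubeconfigs point at
  }
}
```

The VIP (or a DNS name resolving to it) is what `controlPlaneEndpoint` and all kubeconfigs should reference.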

4) Verify load-balancer behavior

# The client kubeconfig points at the LB VIP/domain
kubectl --kubeconfig=/root/.kube/config get nodes
# Expected: the API stays reachable when one control-plane node fails

Troubleshooting examples
- API Server unreachable:

# Check the LB port and backend health
ss -lntp | grep 6443
echo "show stat" | socat stdio /run/haproxy/admin.sock | head
# Check the apiserver logs
journalctl -u kube-apiserver -n 200

- etcd cluster unstable:
ETCDCTL_API=3 etcdctl endpoint health --cluster \
  --endpoints=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

Workload high availability: examples and key parameters

1) Anti-affinity and PDB for critical components (example)

# CoreDNS high availability + anti-affinity + PDB
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: coredns-pdb
  namespace: kube-system
spec:
  minAvailable: 2
  selector:
    matchLabels:
      k8s-app: kube-dns
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  replicas: 3
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  k8s-app: kube-dns
              topologyKey: "topology.kubernetes.io/zone"
      containers:
      - name: coredns
        image: coredns/coredns:1.11.1
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP

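An alternative (or complement) to preferred anti-affinity is topologySpreadConstraints, which bounds the imbalance between zones directly. A sketch for the same pod template as above, added under `template.spec`:

```yaml
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway   # use DoNotSchedule for a hard rule
        labelSelector:
          matchLabels:
            k8s-app: kube-dns
```

`maxSkew: 1` means no zone may hold more than one pod above the least-loaded zone.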
2) Service probes and graceful termination (example)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      terminationGracePeriodSeconds: 30
      containers:
      - name: web
        image: nginx:1.25
        ports:
        - containerPort: 80
        lifecycle:
          preStop:
            exec:
              # Brief sleep so the endpoint is removed from Services
              # before the container receives SIGTERM
              command: ["sh", "-c", "sleep 5"]
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
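Probes and graceful termination only yield zero-downtime rollouts when paired with a conservative update strategy. A fragment to add under `spec` of the Deployment above:

```yaml
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # readiness gates each step; never drop below desired
      maxSurge: 1         # bring up one replacement pod at a time
```

With `maxUnavailable: 0`, old pods are removed only after new pods pass their readiness probe.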

3) Stateful services with persistence (StatefulSet + PVC)

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7.2
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 10Gi
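The StatefulSet above sets `serviceName: redis`, which requires a headless Service of that name to exist so each pod gets stable per-pod DNS (redis-0.redis, redis-1.redis, ...). A minimal sketch:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  clusterIP: None        # headless: per-pod DNS records, no virtual IP
  selector:
    app: redis
  ports:
  - port: 6379
    name: redis
```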

Troubleshooting examples
- Pods restarting frequently:

kubectl describe pod web-xxx | sed -n '/Events/,$p'
kubectl logs web-xxx --previous

- Anti-affinity not taking effect:

kubectl get pod -o wide -l app=web
# Check whether pods landed on the same node or in the same zone
kubectl get node --show-labels | grep topology.kubernetes.io/zone

Disaster recovery design: RTO/RPO, backup, and restore workflow

Goal: define RTO (Recovery Time Objective, how long a restore may take) and RPO (Recovery Point Objective, how much data loss is tolerable), then implement multi-cluster or off-site disaster recovery to meet them.

1) Resource-manifest backup (GitOps / periodic export)

# Export resources from key namespaces
# (note: "kubectl get all" skips ConfigMaps, Secrets, and most CRDs --
#  export those separately, or treat a GitOps repo as the source of truth)
kubectl get all -n prod -o yaml > /backup/prod-all-$(date +%F).yaml
kubectl get pvc -n prod -o yaml > /backup/prod-pvc-$(date +%F).yaml
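Since `kubectl get all` covers only a handful of resource types, a fuller export helper looks like the following sketch. The kind list and file-naming scheme are illustrative assumptions, not exhaustive:

```shell
#!/bin/sh
# Sketch: export resource types that "kubectl get all" does not include.
# The kind list is illustrative -- extend it for your CRDs.

export_extra() {
  ns="$1"
  dir="$2"
  for kind in configmap secret ingress serviceaccount role rolebinding; do
    # One dated YAML file per resource type per namespace
    kubectl get "$kind" -n "$ns" -o yaml > "$dir/$ns-$kind-$(date +%F).yaml"
  done
}

# Usage: export_extra prod /backup
```

For anything beyond occasional drills, a purpose-built tool (e.g. Velero) or a GitOps repository is the more robust source of truth.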

2) etcd restore (drill)

# Stop the apiserver and etcd (commands differ between distributions;
# on kubeadm clusters both run as static pods -- move their manifests
# out of /etc/kubernetes/manifests instead of using systemctl)
systemctl stop kube-apiserver
systemctl stop etcd

# Restore the snapshot into a new data directory
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-2024-01-01.db \
  --data-dir /var/lib/etcd-new

# Swap in the restored etcd data directory
mv /var/lib/etcd /var/lib/etcd-bak
mv /var/lib/etcd-new /var/lib/etcd

systemctl start etcd
systemctl start kube-apiserver

3) Multi-cluster failover (DNS-level switch for Ingress, example)

# Shift traffic to the DR cluster (example: adjust DNS weights)
# Pseudocode: in practice this is performed by your DNS management platform
dnscli set-weight --record app.example.com --cluster dr --weight 100
dnscli set-weight --record app.example.com --cluster primary --weight 0

Troubleshooting examples
- Resources missing after a restore:

kubectl get ns
kubectl get all -A | head
# If resources are missing, verify the correct snapshot and restore procedure were used

Failure drills and verification (operations and expected results)

1) Simulate a node failure

# Cordon and drain the node
kubectl cordon worker-1
kubectl drain worker-1 --ignore-daemonsets --delete-emptydir-data

# Expected: PDBs protect critical components and the workload stays available
kubectl get pod -n kube-system -l k8s-app=kube-dns -o wide

2) Simulate an API Server failure

# Stop the apiserver on one control-plane node
systemctl stop kube-apiserver

# Expected: kubectl still works through the LB
kubectl get nodes
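During either drill it helps to watch API availability continuously from a client machine. A small sketch, assuming the kubeconfig points at the LB address:

```shell
#!/bin/sh
# Sketch: check API server readiness through the load balancer.
# verify_api returns 0 when /readyz answers within the timeout.

verify_api() {
  kubectl get --raw /readyz --request-timeout=5s >/dev/null 2>&1
}

# Poll during a drill (Ctrl-C to stop):
#   while true; do verify_api && echo "$(date +%T) OK" || echo "$(date +%T) FAIL"; sleep 2; done
```

A gap of FAIL lines in the output is a direct measurement of the observed outage window for the drill log.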

Exercises (with verification commands)

1) Add a PDB and anti-affinity to a stateless service, then verify it is spread across availability zones.
Verification commands

kubectl get pdb
kubectl get pod -o wide -l app=<your-app>
kubectl get node --show-labels | grep topology.kubernetes.io/zone

2) Take an etcd snapshot and run a restore drill; record the RTO.
Verification commands

ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-*.db
kubectl get nodes

3) Run the Ingress Controller with multiple replicas and configure an HPA.
Verification commands

kubectl get deploy -n ingress-nginx
kubectl get hpa -n ingress-nginx

The material above can serve as a minimal actionable blueprint for production-grade high availability and disaster recovery; keep validating recovery capability through regular drills and automation scripts.