16.6.7 Resource Elasticity and Autoscaling (HPA/VPA/CA)

Resource elasticity and autoscaling address fluctuating application load and changing cluster capacity. They are implemented by three cooperating mechanisms: HPA (horizontal scaling), VPA (vertical scaling), and CA (Cluster Autoscaler), which adjust replica count, resource sizing, and node count respectively. Combined sensibly, they balance cost against performance.


HPA: Horizontal Pod Autoscaling

Installation and prerequisites

# 1) Install metrics-server (example for Kubernetes 1.26+)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# 2) Verify that metrics are available
kubectl top nodes
kubectl top pods -A

Example: CPU-based HPA

# File: k8s/hpa/nginx-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa-demo
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-hpa-demo
  template:
    metadata:
      labels:
        app: nginx-hpa-demo
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
        ports:
        - containerPort: 80
---
# File: k8s/hpa/nginx-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa-demo
  minReplicas: 2
  maxReplicas: 10
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 120
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60

# Apply the manifests
kubectl apply -f k8s/hpa/nginx-deploy.yaml
kubectl apply -f k8s/hpa/nginx-hpa.yaml

# Watch HPA status and scaling events
kubectl get hpa nginx-hpa -w
kubectl describe hpa nginx-hpa

Load test to trigger scale-up

# Create a load-generating Pod
kubectl run loadgen --image=busybox:1.36 -it --rm -- \
  /bin/sh -c "while true; do wget -q -O- http://nginx-hpa-demo.default.svc.cluster.local; done"

Expected result: once the HPA TARGET approaches or exceeds 60%, the replica count increases.
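
The scale-up the HPA computes can be sanity-checked by hand. A minimal sketch of the standard autoscaling/v2 formula, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization); the utilization figures below are illustrative, not live metrics:

```shell
# Sketch of the HPA scaling formula (illustrative numbers, not live metrics)
current_replicas=2
current_utilization=150   # average CPU usage as % of requests across Pods
target_utilization=60     # the HPA's averageUtilization target

# Integer ceiling division: ceil(a / b) == (a + b - 1) / b
desired=$(( (current_replicas * current_utilization + target_utilization - 1) / target_utilization ))
echo "$desired"   # ceil(2 * 150 / 60) = 5 replicas
```

Note that 5 is still below maxReplicas: 10, so the HPA would scale straight to it, subject to the scaleUp behavior policy above.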

Troubleshooting

# 1) HPA not working: check metrics-server
kubectl get apiservice | grep metrics
kubectl -n kube-system logs deploy/metrics-server | tail -n 50

# 2) TARGET shows <unknown>: Pods lack resource requests or metrics are unavailable
kubectl get deploy nginx-hpa-demo -o yaml | grep -A5 requests

# 3) Frequent flapping: inspect behavior and stabilization settings
kubectl describe hpa nginx-hpa | grep -A20 Behavior

VPA: Vertical Pod Autoscaling

Installation (using the official VPA components)

# 1) Clone and install
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

# 2) Verify the VPA components
kubectl get deploy -n kube-system | grep vpa

Example: VPA in Auto mode

# File: k8s/vpa/redis-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-vpa-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-vpa-demo
  template:
    metadata:
      labels:
        app: redis-vpa-demo
    spec:
      containers:
      - name: redis
        image: redis:7.2
        resources:
          requests:
            cpu: "50m"
            memory: "64Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
---
# File: k8s/vpa/redis-vpa.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: redis-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: redis-vpa-demo
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: redis
      minAllowed:
        cpu: "50m"
        memory: "64Mi"
      maxAllowed:
        cpu: "800m"
        memory: "1Gi"

# Apply the manifests
kubectl apply -f k8s/vpa/redis-deploy.yaml
kubectl apply -f k8s/vpa/redis-vpa.yaml

# Inspect VPA recommendations and actual adjustments
kubectl describe vpa redis-vpa
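
Whatever the recommender produces is clamped to the minAllowed/maxAllowed range above before it is applied. A rough sketch of that clamping in shell; the 900m recommendation is made up for illustration (the real recommender derives targets from usage histograms):

```shell
# Clamp a hypothetical CPU recommendation to the VPA's allowed range (millicores)
recommended=900   # made-up recommender output, in m
min_allowed=50    # from minAllowed.cpu
max_allowed=800   # from maxAllowed.cpu

clamped=$recommended
[ "$clamped" -lt "$min_allowed" ] && clamped=$min_allowed
[ "$clamped" -gt "$max_allowed" ] && clamped=$max_allowed
echo "${clamped}m"   # 800m: capped at maxAllowed
```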

Troubleshooting

# 1) VPA not taking effect: check the admission-controller, recommender, and updater
kubectl -n kube-system get pods | grep vpa
kubectl -n kube-system logs deploy/vpa-updater | tail -n 50

# 2) Frequent Pod recreation: check updateMode and PDBs
kubectl get pdb -A

CA: Cluster Autoscaler

Installation (standard Cluster Autoscaler deployment)

# Cloud-provider templates must be adapted to your environment;
# this example uses --cloud-provider=clusterapi (or your cloud provider's equivalent)
curl -LO https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/clusterapi/examples/cluster-autoscaler-deployment.yaml

Key configuration example (excerpt)

# File: k8s/ca/cluster-autoscaler.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: cluster-autoscaler
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.0
        command:
        - ./cluster-autoscaler
        - --cloud-provider=clusterapi
        - --nodes=1:5:worker-pool-a   # min 1, max 5 nodes in pool worker-pool-a
        - --scale-down-delay-after-add=10m
        - --scale-down-unneeded-time=10m
        - --balance-similar-node-groups
        - --expander=least-waste

# Apply the manifest
kubectl apply -f k8s/ca/cluster-autoscaler.yaml

# View Cluster Autoscaler logs
kubectl -n kube-system logs deploy/cluster-autoscaler | tail -n 50

Demo: unschedulable Pods trigger scale-up

# Create a Pod with large resource requests to force a scale-up.
# kubectl run no longer supports --requests, so --overrides sets them instead.
kubectl run big-pod --image=busybox:1.36 --restart=Never \
  --overrides='{"apiVersion":"v1","spec":{"containers":[{"name":"big-pod","image":"busybox:1.36","command":["sleep","3600"],"resources":{"requests":{"cpu":"2","memory":"2Gi"}}}]}}'

# Watch the Pod stay Pending and the CA decision
kubectl get pod big-pod -w
kubectl -n kube-system logs deploy/cluster-autoscaler | grep -i "scale up" -n

Troubleshooting

# 1) CA not scaling up: check node pools and label/taint constraints
kubectl get nodes --show-labels
kubectl describe pod big-pod | sed -n '/Events/,$p'

# 2) Resource fragmentation: compare node capacity with Pod requests
kubectl top nodes
kubectl describe node <node-name> | sed -n '/Allocated resources/,$p'
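
Whether CA can help at all comes down to a per-node-group fit check: would the pending Pod's requests fit on a freshly provisioned node's allocatable resources? A simplified sketch; the allocatable numbers are hypothetical, since real nodes lose capacity to kube/system reserves:

```shell
# Simplified CA fit check: pending Pod requests vs. a template node's allocatable
node_alloc_cpu=1930   # hypothetical allocatable CPU (m) after system reserves
node_alloc_mem=3500   # hypothetical allocatable memory (Mi)
pod_req_cpu=2000      # big-pod's request: 2000m
pod_req_mem=2048      # big-pod's request: 2Gi

if [ "$pod_req_cpu" -le "$node_alloc_cpu" ] && [ "$pod_req_mem" -le "$node_alloc_mem" ]; then
  echo "fits: scale-up would help"
else
  echo "does not fit: CA will not scale this node group up"
fi
```

This is why a node pool of small instances may never satisfy a single large Pod: CA only adds nodes it can prove the Pod would schedule onto.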

Combining Autoscalers and Avoiding Conflicts

  • HPA and VPA should not control the same resource dimension (CPU/memory) at the same time.
  • A typical combination: HPA manages replica count, VPA runs only in Initial or Off mode to provide recommendations, and CA adds nodes.

Example: VPA in recommendation-only mode

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa-recommend
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa-demo
  updatePolicy:
    updateMode: "Off"   # Recommendations only; resources are never modified
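
The Initial mode mentioned above is not shown in the examples; a minimal sketch targeting the earlier demo Deployment (the object name redis-vpa-initial is made up for illustration):

```yaml
# Assumption: reuses the earlier redis-vpa-demo Deployment as the target.
# Initial mode sets requests only when a Pod is (re)created; it never evicts running Pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: redis-vpa-initial
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: redis-vpa-demo
  updatePolicy:
    updateMode: "Initial"
```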

Exercises

  1. HPA: lower averageUtilization from 60 to 40, run the load test, and record the HPA events and replica changes.
  2. VPA: set updateMode: Initial, recreate the Pods, observe how requests change, and record the VPA Recommendation.
  3. CA: reduce the node pool maximum from 5 to 2, create three large-resource Pods, observe the Pending events, and explain why they occur.

Quick Troubleshooting Reference (command + conclusion)

# Missing HPA metrics
kubectl get hpa -A
kubectl top pods -A

# VPA components unhealthy
kubectl -n kube-system get deploy | grep vpa
kubectl -n kube-system logs deploy/vpa-admission-controller | tail -n 30

# CA not scaling up
kubectl -n kube-system logs deploy/cluster-autoscaler | grep -i "no scale up" -n
kubectl describe pod <pending-pod> | sed -n '/Events/,$p'

Validate scaling behavior through monitoring: watch metric latency, scaling duration, and failure causes (missing metrics, insufficient resource quota, unavailable node pools), and use PDBs and Pod priorities to limit scale-down risk. Rehearse peak-traffic and failure scenarios, keep tuning thresholds and policies, and build a stable, predictable elasticity system.