16.6.7 Resource Elasticity and Autoscaling (HPA/VPA/CA)
Resource elasticity and autoscaling address fluctuating application load and changing cluster capacity. They are implemented by three cooperating mechanisms: HPA (Horizontal Pod Autoscaler), VPA (Vertical Pod Autoscaler), and CA (Cluster Autoscaler), which manage replica count, per-Pod resource sizing, and node count respectively. Combined sensibly, they balance cost against performance.
HPA: Horizontal Pod Autoscaling
Installation and prerequisites

```bash
# 1) Install metrics-server (example for Kubernetes 1.26+)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# 2) Verify that metrics are available
kubectl top nodes
kubectl top pods -A
```
Example: CPU-based HPA
```yaml
# File: k8s/hpa/nginx-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa-demo
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-hpa-demo
  template:
    metadata:
      labels:
        app: nginx-hpa-demo
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
        ports:
        - containerPort: 80
---
# File: k8s/hpa/nginx-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa-demo
  minReplicas: 2
  maxReplicas: 10
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 120
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
```
```bash
kubectl apply -f k8s/hpa/nginx-deploy.yaml
kubectl apply -f k8s/hpa/nginx-hpa.yaml
# Watch HPA status and scaling activity
kubectl get hpa nginx-hpa -w
kubectl describe hpa nginx-hpa
```
Triggering scale-up with a load test

```bash
# The manifests above do not define a Service, so expose the Deployment first
kubectl expose deployment nginx-hpa-demo --port=80
# Run a load-generator Pod
kubectl run loadgen --image=busybox:1.36 -it --rm --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://nginx-hpa-demo.default.svc.cluster.local; done"
```

Expected result: once the HPA TARGET approaches or exceeds 60%, the replica count increases.
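The scale-up seen here follows the HPA's documented formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with a tolerance band around the target inside which no scaling happens. A minimal Python sketch (the function name is illustrative; 0.1 is the controller's default tolerance):

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA formula: desired = ceil(current * currentMetric / targetMetric)."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band (default 10%), the HPA does not scale.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

print(desired_replicas(2, 90, 60))   # utilization above target -> 3 replicas
print(desired_replicas(4, 61, 60))   # within tolerance -> stays at 4
```

With the example above (target 60%), sustained utilization around 90% on 2 replicas yields a recommendation of 3, subject to the `behavior` rate limits.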
Troubleshooting

```bash
# 1) HPA not working: check metrics-server
kubectl get apiservice | grep metrics
kubectl -n kube-system logs deploy/metrics-server | tail -n 50
# 2) TARGET shows <unknown>: Pods lack resource requests, or metrics are unavailable
kubectl get deploy nginx-hpa-demo -o yaml | grep -A5 requests
# 3) Frequent flapping: inspect behavior and stabilization settings
kubectl describe hpa nginx-hpa | grep -A20 Behavior
```
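On the stabilization point: for scale-down, the HPA looks at the recommendations produced during the stabilization window and applies the highest one, so a brief dip in load does not immediately remove replicas. A simplified rolling-max sketch (function and window handling are illustrative, not the controller's actual code):

```python
from collections import deque

def stabilized_scale_down(history: deque, new_recommendation: int,
                          window_size: int) -> int:
    """Apply the highest recommendation seen within the window, so
    replicas only drop once every recent recommendation is low."""
    history.append(new_recommendation)
    while len(history) > window_size:
        history.popleft()
    return max(history)

h = deque()
for rec in [5, 3, 5, 2, 2, 2]:
    print(stabilized_scale_down(h, rec, window_size=3))
# prints: 5, 5, 5, 5, 5, 2 — the drop to 2 only lands after three low readings
```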
VPA: Vertical Pod Autoscaling
Installation (using the upstream VPA components)

```bash
# 1) Clone and install
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
# 2) Verify the VPA components
kubectl get deploy -n kube-system | grep vpa
```
Example: VPA in Auto mode
```yaml
# File: k8s/vpa/redis-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-vpa-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-vpa-demo
  template:
    metadata:
      labels:
        app: redis-vpa-demo
    spec:
      containers:
      - name: redis
        image: redis:7.2
        resources:
          requests:
            cpu: "50m"
            memory: "64Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
---
# File: k8s/vpa/redis-vpa.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: redis-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: redis-vpa-demo
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: redis
      minAllowed:
        cpu: "50m"
        memory: "64Mi"
      maxAllowed:
        cpu: "800m"
        memory: "1Gi"
```
```bash
kubectl apply -f k8s/vpa/redis-deploy.yaml
kubectl apply -f k8s/vpa/redis-vpa.yaml
# Inspect the VPA recommendation and actual adjustments
kubectl describe vpa redis-vpa
```
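The recommendation reported by `kubectl describe vpa` is clamped to the `minAllowed`/`maxAllowed` bounds from `containerPolicies`. A minimal sketch of that clamping, assuming plain numbers (CPU in millicores, memory in MiB) instead of Kubernetes quantity strings:

```python
def clamp_recommendation(recommended: dict, min_allowed: dict,
                         max_allowed: dict) -> dict:
    """Clamp a VPA recommendation into [minAllowed, maxAllowed] per resource
    (illustrative units: CPU in millicores, memory in MiB)."""
    return {
        res: min(max(value, min_allowed[res]), max_allowed[res])
        for res, value in recommended.items()
    }

# A recommendation above maxAllowed CPU (800m) gets capped.
print(clamp_recommendation(
    {"cpu": 1200, "memory": 256},
    min_allowed={"cpu": 50, "memory": 64},
    max_allowed={"cpu": 800, "memory": 1024},
))
```

This is why, in the manifest above, the redis container's requests can never be pushed past 800m CPU / 1Gi memory no matter what the recommender observes.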
Troubleshooting

```bash
# 1) VPA not taking effect: check the admission controller, recommender, and updater
kubectl -n kube-system get pods | grep vpa
kubectl -n kube-system logs deploy/vpa-updater | tail -n 50
# 2) Frequent Pod evictions: review updateMode and any PodDisruptionBudgets
kubectl get pdb -A
```
CA: Cluster Autoscaling
Installation (standard Cluster Autoscaler deployment)

```bash
# Cloud-provider deployment templates must be adapted to your environment;
# this example uses --cloud-provider=clusterapi (substitute your provider)
curl -LO https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/clusterapi/examples/cluster-autoscaler-deployment.yaml
```
Key configuration (excerpt)
```yaml
# File: k8s/ca/cluster-autoscaler.yaml (excerpt)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: cluster-autoscaler
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.0
        command:
        - ./cluster-autoscaler
        - --cloud-provider=clusterapi
        - --nodes=1:5:worker-pool-a   # min 1, max 5 nodes for this node group
        - --scale-down-delay-after-add=10m
        - --scale-down-unneeded-time=10m
        - --balance-similar-node-groups
        - --expander=least-waste
```
```bash
kubectl apply -f k8s/ca/cluster-autoscaler.yaml
# Check the CA logs
kubectl -n kube-system logs deploy/cluster-autoscaler | tail -n 50
```
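The `--expander=least-waste` flag makes CA prefer the node group that would leave the least idle CPU and memory once the pending Pods are scheduled. A simplified scoring sketch — the node-group shapes and the equal CPU/memory weighting are illustrative assumptions, not CA's exact implementation:

```python
def least_waste(node_groups: dict, pod_cpu: float, pod_mem: float) -> str:
    """Pick the node group whose node shape leaves the smallest idle
    fraction of CPU + memory after placing the pending pod (simplified)."""
    def waste(capacity):
        cpu_cap, mem_cap = capacity
        if pod_cpu > cpu_cap or pod_mem > mem_cap:
            return float("inf")  # pod does not fit on this node shape
        # Average idle fraction across CPU and memory (equal weighting).
        return ((cpu_cap - pod_cpu) / cpu_cap + (mem_cap - pod_mem) / mem_cap) / 2
    return min(node_groups, key=lambda name: waste(node_groups[name]))

# Hypothetical node groups: (CPU cores, memory GiB) per node.
groups = {
    "small-2cpu-4gb": (2.0, 4.0),
    "large-8cpu-32gb": (8.0, 32.0),
}
# A 2-CPU / 2-GiB pod wastes far less capacity on the small node shape.
print(least_waste(groups, pod_cpu=2.0, pod_mem=2.0))
```

Other expanders (e.g. `random`, `most-pods`, `priority`) trade this packing behavior for different goals.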
Demo: unschedulable Pods trigger scale-up

```bash
# Create a Pod with requests large enough to force a scale-up
# (note: recent kubectl releases removed --requests from kubectl run;
#  on those versions, apply an equivalent Pod manifest instead)
kubectl run big-pod --image=busybox:1.36 --restart=Never \
  --requests=cpu=2000m,memory=2Gi -- \
  /bin/sh -c "sleep 3600"
```
```bash
# Watch the Pending Pod and the CA's decision
kubectl get pod big-pod -w
kubectl -n kube-system logs deploy/cluster-autoscaler | grep -i -n "scale up"
```
Troubleshooting

```bash
# 1) CA not scaling up: check node groups and label/taint constraints
kubectl get nodes --show-labels
kubectl describe pod big-pod | sed -n '/Events/,$p'
# 2) Resource fragmentation: compare node capacity with Pod requests
kubectl top nodes
kubectl describe node <node-name> | sed -n '/Allocated resources/,$p'
```
Combination Strategies and Conflict Avoidance
- HPA and VPA should not both control the same resource dimension (CPU/memory).
- A typical combination: HPA manages replica count; VPA runs only in Initial or Off mode to provide recommendations; CA adds or removes nodes.
Example: VPA in recommendation-only mode
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa-recommend
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa-demo
  updatePolicy:
    updateMode: "Off"   # recommendations only, no resource changes
```
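The conflict rule above can be expressed as a small policy check. This is purely an illustrative sketch (not a Kubernetes API or tool): it flags any resource dimension that an HPA scales on while an actuating VPA also controls it.

```python
def autoscaler_conflicts(hpa_resource_metrics: set, vpa_update_mode: str,
                         vpa_controlled_resources: set) -> list:
    """Flag combinations where an HPA and an actuating VPA manage the
    same resource dimension; Off/Initial modes do not resize running Pods."""
    conflicts = []
    if vpa_update_mode not in ("Off", "Initial"):
        for res in sorted(hpa_resource_metrics & vpa_controlled_resources):
            conflicts.append(f"HPA and VPA both act on {res}")
    return conflicts

print(autoscaler_conflicts({"cpu"}, "Auto", {"cpu", "memory"}))  # conflict on cpu
print(autoscaler_conflicts({"cpu"}, "Off", {"cpu", "memory"}))   # no conflict
```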
Exercises
- HPA exercise: change `averageUtilization` from 60 to 40, rerun the load test, observe how much further the Deployment scales out, and record the HPA events and replica changes.
- VPA exercise: set `updateMode: Initial`, recreate the Pod, observe how its requests change, and record the VPA Recommendation.
- CA exercise: lower the node group maximum from 5 to 2, create 3 large-request Pods, observe the Pending events, and explain why they occur.
Quick Failure Triage (command + conclusion)

```bash
# HPA metrics missing
kubectl get hpa -A
kubectl top pods -A
# VPA components unhealthy
kubectl -n kube-system get deploy | grep vpa
kubectl -n kube-system logs deploy/vpa-admission-controller | tail -n 30
# CA not scaling up
kubectl -n kube-system logs deploy/cluster-autoscaler | grep -i -n "no scale up"
kubectl describe pod <pending-pod> | sed -n '/Events/,$p'
```
Validate scaling behavior through monitoring: track metric latency, scale-up/scale-down duration, and failure causes (missing metrics, insufficient resource quota, unavailable node groups), and use PDBs and Pod priorities to control scale-down risk. Rehearse peak-traffic and failure scenarios, then keep tuning thresholds and policies until the system scales in a stable, predictable way.