12.6.1 Keepalived + Nginx High-Availability Web Architecture

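Goals and value
- Provide a VIP (virtual IP) for the web entry point so that Master/Backup failover is automatic and the single point of failure is removed.
- Use health checks to keep the service available and maintainable.
- Provide a high-availability template that can be drilled and rolled back.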

Schematic (VIP failover)

[Figure: VIP failover schematic]

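Typical architecture and traffic flow
- Two nodes, Master and Backup, each running Nginx + Keepalived.
- Clients access the VIP; VRRP moves the VIP to whichever node is healthy.
- The backend can be an application cluster or a static resource directory.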


1) Installation and basic preparation (CentOS/RHEL example)

# Run on both machines
sudo yum install -y keepalived nginx

# Start the services and enable them at boot
sudo systemctl enable --now nginx
sudo systemctl enable --now keepalived

# Check service status
sudo systemctl status nginx
sudo systemctl status keepalived

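Command notes
- systemctl enable --now nginx: starts the service immediately and enables it at boot.
- systemctl status: shows the current service state and a summary of recent log lines.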


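2) Nginx basic configuration and health check page

# Create a simple health check page
sudo mkdir -p /usr/share/nginx/html/health
echo "OK" | sudo tee /usr/share/nginx/html/health/index.html

# Confirm the default Nginx site is reachable
curl -s http://127.0.0.1/health/  # Expected output: OK

Key point
- Use /health/ as the health check path so the check does not hit complex business logic.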


3) Keepalived configuration (Master/Backup)

Master configuration: /etc/keepalived/keepalived.conf

global_defs {
  router_id nginx-ha-master
}

vrrp_script chk_nginx {
  script "/etc/keepalived/check_nginx.sh"
  interval 2
  fall 3
  rise 2
}

vrrp_instance VI_1 {
  state MASTER
  interface eth0
  virtual_router_id 51
  priority 120
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 12345678
  }
  virtual_ipaddress {
    10.0.0.100/24 dev eth0
  }
  track_script {
    chk_nginx
  }
  notify_master "/etc/keepalived/notify.sh master"
  notify_backup "/etc/keepalived/notify.sh backup"
  notify_fault  "/etc/keepalived/notify.sh fault"
}

Backup configuration: /etc/keepalived/keepalived.conf

global_defs {
  router_id nginx-ha-backup
}

vrrp_script chk_nginx {
  script "/etc/keepalived/check_nginx.sh"
  interval 2
  fall 3
  rise 2
}

vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 51
  priority 100
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 12345678
  }
  virtual_ipaddress {
    10.0.0.100/24 dev eth0
  }
  track_script {
    chk_nginx
  }
  notify_master "/etc/keepalived/notify.sh master"
  notify_backup "/etc/keepalived/notify.sh backup"
  notify_fault  "/etc/keepalived/notify.sh fault"
}

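Configuration notes
- priority: the node with the higher value is elected Master.
- virtual_router_id: must be identical on all nodes in the same VRRP group.
- track_script: when the check script returns non-zero `fall` times in a row, the instance is treated as failed and gives up the VIP (with no `weight` set, it enters the FAULT state).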
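
Because the two configurations must agree on some fields and differ on others, a quick comparison before restarting can catch copy-paste mistakes. The following is a minimal sketch, not part of Keepalived itself: the function names `vrrp_field` and `compare_vrrp_configs` are hypothetical, and it assumes each directive appears once per file as `key value`, as in the configurations above.

```shell
#!/bin/bash
# Print the first value of a keepalived directive (e.g. virtual_router_id) in a config file.
vrrp_field() {
  local file=$1 key=$2
  awk -v k="$key" '$1 == k { print $2; exit }' "$file"
}

# Compare two configs: shared VRRP settings must match, priority must differ.
compare_vrrp_configs() {
  local a=$1 b=$2 rc=0 key
  for key in virtual_router_id advert_int auth_pass; do
    if [ "$(vrrp_field "$a" "$key")" != "$(vrrp_field "$b" "$key")" ]; then
      echo "MISMATCH: $key differs"
      rc=1
    fi
  done
  if [ "$(vrrp_field "$a" priority)" = "$(vrrp_field "$b" priority)" ]; then
    echo "WARNING: priority is identical on both nodes"
    rc=1
  fi
  [ "$rc" -eq 0 ] && echo "OK: VRRP settings are consistent"
  return "$rc"
}
```

One way to use it is to copy the Backup file to the Master under a temporary name (for example /tmp/backup.conf, a hypothetical path) and run `compare_vrrp_configs /etc/keepalived/keepalived.conf /tmp/backup.conf`.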


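4) Health check and notification scripts

Health check script: /etc/keepalived/check_nginx.sh

#!/bin/bash
# Port and HTTP health check; any non-zero exit counts as a failed check
# Require whitespace after ":80" so that ports such as 8080 do not match
ss -lntp | grep -q ':80[[:space:]]' || exit 1
# --max-time keeps a hung Nginx from blocking the check
code=$(curl -s -o /dev/null --max-time 1 -w "%{http_code}" http://127.0.0.1/health/)
[ "$code" = "200" ] || exit 1
exit 0

Notification script: /etc/keepalived/notify.sh

#!/bin/bash
# $1 is the new role passed by keepalived: master, backup, or fault
role=$1
logger -t keepalived "Role changed to $role on $(hostname)"

Permissions and restart

# Make both scripts executable
sudo chmod +x /etc/keepalived/check_nginx.sh /etc/keepalived/notify.sh

# Restart keepalived
sudo systemctl restart keepalived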

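5) Verification and drills

Verify that the VIP moves

# On the Master, check that the VIP is present
ip addr show eth0 | grep 10.0.0.100

# From a client, access the VIP
curl -s http://10.0.0.100/health/   # Expected output: OK

Failover drill

# On the Master, stop Nginx
sudo systemctl stop nginx

# On the Backup, check whether it has taken over the VIP
ip addr show eth0 | grep 10.0.0.100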

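Expected result
- After Nginx stops on the Master, the Backup takes over the VIP. Requests to the VIP fail briefly, for roughly the detection window (fall × interval) plus the VRRP takeover time, and then recover.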
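
To see how long the interruption actually lasts during a drill, it can help to probe the VIP from a client at a fixed rate while Nginx is stopped on the Master. The following is a minimal sketch under stated assumptions: `probe_vip` is a hypothetical helper name, and the one-second timeout per probe is an arbitrary choice.

```shell
#!/bin/bash
# Probe a URL repeatedly and print one timestamped status line per attempt.
# Args: URL, number of probes, delay between probes in seconds.
probe_vip() {
  local url=$1 count=$2 delay=$3 i code
  for ((i = 0; i < count; i++)); do
    # --max-time bounds each probe; curl reports 000 when no HTTP response arrives
    code=$(curl -s -o /dev/null --max-time 1 -w '%{http_code}' "$url" || true)
    echo "$(date +%T) $code"
    sleep "$delay"
  done
}
```

For example, running `probe_vip http://10.0.0.100/health/ 60 0.5` on a client just before stopping Nginx gives a timeline in which the lines that are not 200 mark the outage window.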


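6) Common failures and troubleshooting

Problem: the VIP does not move

# Check Keepalived status
sudo systemctl status keepalived

# Check whether VRRP multicast is being blocked
sudo tcpdump -i eth0 vrrp
  • Fix: make sure the firewall allows VRRP (IP protocol 112) and that the network supports multicast; otherwise switch to unicast VRRP.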
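
Where multicast is unavailable (common on cloud networks), unicast VRRP is configured with Keepalived's `unicast_src_ip` and `unicast_peer` directives inside the `vrrp_instance` block. The sketch below only prints that fragment for review; the helper name `unicast_block` and the node addresses 10.0.0.11 and 10.0.0.12 are assumptions for illustration, not values from the setup above.

```shell
#!/bin/bash
# Print the unicast directives to place inside a vrrp_instance block.
# Args: IP of this node, IP of the peer node (both hypothetical here).
unicast_block() {
  local src=$1 peer=$2
  cat <<EOF
  unicast_src_ip $src
  unicast_peer {
    $peer
  }
EOF
}
```

On the Master this would be called as `unicast_block 10.0.0.11 10.0.0.12`, and on the Backup with the two addresses swapped, so that each node names the other as its peer.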

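Problem: frequent switching (flapping)
- Check whether the fall/rise values are too small.
- Check whether Nginx briefly times out or is overloaded, causing the health check to fail.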
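
Raising fall or interval reduces flapping but lengthens failover, so it is worth estimating the trade-off before changing them. The sketch below computes a rough upper bound: detection time is fall × interval, and the Backup then waits up to the VRRP master-down interval (3 × advert_int plus a priority-based skew). The function name `estimate_failover` is hypothetical, and real takeover may be faster than this bound.

```shell
#!/bin/bash
# Rough upper bound on failover time, in seconds.
# Args: check interval, fall count, advert_int, Backup priority.
estimate_failover() {
  awk -v interval="$1" -v fall="$2" -v advert="$3" -v prio="$4" 'BEGIN {
    detect = interval * fall              # time until the check is declared failed
    skew = (256 - prio) / 256 * advert    # VRRPv3 form; equals the VRRPv2 value when advert_int is 1
    master_down = 3 * advert + skew       # Backup waits this long without advertisements
    printf "detect=%.1fs takeover=%.1fs total=%.1fs\n", detect, master_down, detect + master_down
  }'
}
```

With the values used in this section, `estimate_failover 2 3 1 100` prints `detect=6.0s takeover=3.6s total=9.6s`; doubling fall to 6 would add another 6 seconds of detection time.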

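Problem: the service is healthy but the health check fails

# Request the health check path directly
curl -v http://127.0.0.1/health/
  • Fix: make sure the path exists, file permissions are correct, and Nginx is listening on port 80.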

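7) Exercises and self-test

  1. Increase advert_int (for example to 2 seconds) and observe how the failover delay changes.
  2. Add an HTTPS path to the health check and simulate failover on a certificate error.
  3. Write a script that sends a DingTalk or WeCom alert automatically on failover.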

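8) Key recommendations

  • Use a configuration management tool to keep the Nginx and Keepalived configuration consistent on both nodes.
  • Keep the health check path lightweight and stable; avoid dependencies on external components such as databases.
  • Drill Master/Backup failover regularly to evaluate RTO and business impact.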