12.6.4 Keepalived + Redis Sentinel High Availability

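The goal of combining Keepalived with Redis Sentinel is to let clients always connect to a single VIP: Sentinel handles master/replica failover, and Keepalived moves the entry point. After the master changes, a health-check script detects each node's role and drives the VIP migration, so clients do not have to track which node is currently the master.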

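Core principles:
- Sentinel detects a master failure and carries out the election and promotion;
- Keepalived uses a script to decide "is this node the master?", and only the master holds the VIP;
- VIP failover time ≈ time for Sentinel to complete the failover + time for Keepalived to detect the new role and move the VIP.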
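
As a rough illustration of the last point, the sketch below simply adds up the delays involved. It is a simplified model, not a measurement: the function name `estimate_failover_ms` is hypothetical, `promote_ms` (the time Sentinel needs for election and promotion) is an assumed figure, and the sample values are only illustrative.

```shell
#!/usr/bin/env bash
# Rough estimate of VIP failover time in milliseconds (simplified model).
# Every argument is supplied by the caller; promote_ms in particular is a guess.
estimate_failover_ms() {
  local down_after_ms=$1   # Sentinel down-after-milliseconds
  local promote_ms=$2      # assumed time for election + promotion
  local interval_s=$3      # vrrp_script interval, in seconds
  local rise=$4            # vrrp_script rise (successes needed on the new master)
  local advert_s=$5        # vrrp_instance advert_int, in seconds
  echo $(( down_after_ms + promote_ms + (interval_s * rise + advert_s) * 1000 ))
}

# 5000 + 1000 + (2 * 1 + 1) * 1000
estimate_failover_ms 5000 1000 2 1 1   # prints 9000
```

Real failover times should be measured on the actual deployment; the model ignores network delay and scheduling jitter.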


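Installation and basic preparation (example)

Take two candidate master nodes (node1 and node2) as an example, with Redis and Keepalived deployed on the same hosts.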

# Install Keepalived
yum install -y keepalived

# Install Redis (example)
yum install -y redis
systemctl enable redis && systemctl start redis

# Check the Redis port
ss -lntp | grep 6379
# Expected: LISTEN 0 128 *:6379

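Redis Sentinel configuration (example)

Run one Sentinel process on each node, with identical Sentinel configuration. Note that a failover must be authorized by a majority of the known Sentinels, so with only two Sentinels the loss of an entire host leaves the survivor unable to fail over; production deployments usually add at least a third Sentinel on a separate host.

File path: /etc/redis/sentinel.conf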

port 26379
sentinel monitor mymaster 10.0.0.11 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 15000
sentinel parallel-syncs mymaster 1

# Authentication example (when a password is set)
# sentinel auth-pass mymaster redispass

Start Sentinel:

redis-sentinel /etc/redis/sentinel.conf
# Expected: the log reports sentinel mode and a "+monitor master mymaster 10.0.0.11 6379 quorum 2" event
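
Once Sentinel is up, it helps to confirm which address it currently considers the master. The sketch below is one possible way to do that: `format_master_addr` is a hypothetical helper that joins the two-line reply of `SENTINEL get-master-addr-by-name` into `host:port`. The demo line feeds it sample input so it runs without a live Sentinel.

```shell
#!/usr/bin/env bash
# Join the two-line reply of SENTINEL get-master-addr-by-name into host:port.
format_master_addr() {
  local host port
  read -r host
  read -r port
  # Fail if either line is missing (for example, unknown master name).
  [ -n "$host" ] && [ -n "$port" ] || return 1
  echo "${host}:${port}"
}

# Real usage (assumes a Sentinel listening on port 26379 on this host):
#   redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster | format_master_addr
#   redis-cli -p 26379 SENTINEL ckquorum mymaster

# Demo with sample input:
printf '10.0.0.11\n6379\n' | format_master_addr   # prints 10.0.0.11:6379
```

`SENTINEL ckquorum` additionally reports whether enough Sentinels are reachable to reach the quorum and authorize a failover.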

Keepalived configuration and health-check script (example)

1) Role check script

File: /etc/keepalived/check_redis_master.sh

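#!/usr/bin/env bash
# Exit 0 only when the local Redis instance reports the master role.
REDIS_HOST="127.0.0.1"
REDIS_PORT=6379
REDIS_PASS=""  # fill in if a password is set

# Optional: port reachability check, done first so a dead Redis fails fast
if ! nc -z -w 1 "$REDIS_HOST" "$REDIS_PORT"; then
  exit 1
fi

# Query the role; the first line of the ROLE reply names the role
# (stderr is discarded so the -a password warning does not clutter the logs)
if [ -n "$REDIS_PASS" ]; then
  ROLE=$(redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" -a "$REDIS_PASS" ROLE 2>/dev/null | head -n 1)
else
  ROLE=$(redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" ROLE 2>/dev/null | head -n 1)
fi

# Treat the node as healthy only when ROLE returns "master"
if [ "$ROLE" = "master" ]; then
  exit 0
else
  exit 1
fi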

Permissions and test:

chmod +x /etc/keepalived/check_redis_master.sh
/etc/keepalived/check_redis_master.sh
echo $?
# Expected: 0 on the master node, 1 on a replica

2) Keepalived configuration

File: /etc/keepalived/keepalived.conf (node1 example)

vrrp_script chk_redis_master {
  script "/etc/keepalived/check_redis_master.sh"
  interval 2
  fall 2
  rise 1
  weight 20
}

vrrp_instance VI_1 {
  state BACKUP
  interface eth0
  virtual_router_id 51
  priority 120
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 1111
  }
  track_script {
    chk_redis_master
  }
  virtual_ipaddress {
    10.0.0.100/24
  }
}

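node2 changes only priority (for example 110); everything else is identical. Keep the gap between the two priorities smaller than the script weight (20): with priority 100, a promoted node2 would reach only 100 + 20 = 120, which ties with node1's base priority of 120 instead of exceeding it.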
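
To make the interaction between priority and weight concrete, the sketch below computes the effective priority of each node under a positive vrrp_script weight, which Keepalived adds only while the check script succeeds. `effective_priority` is a hypothetical helper, and the base priority of 110 for node2 is an illustrative alternative to 100.

```shell
#!/usr/bin/env bash
# Effective VRRP priority under a positive vrrp_script weight:
# the weight is added only while the check script succeeds.
effective_priority() {
  local base=$1 weight=$2 script_ok=$3
  if [ "$script_ok" -eq 1 ]; then
    echo $(( base + weight ))
  else
    echo "$base"
  fi
}

WEIGHT=20
# node1 (base 120) is the Redis master, node2 (base 100) is a replica: 140 vs 100
echo "node1 master:           node1=$(effective_priority 120 "$WEIGHT" 1) node2=$(effective_priority 100 "$WEIGHT" 0)"
# node2 promoted with base 100: 120 vs 120, a tie
echo "node2 master, base 100: node1=$(effective_priority 120 "$WEIGHT" 0) node2=$(effective_priority 100 "$WEIGHT" 1)"
# node2 promoted with base 110: 120 vs 130, node2 wins
echo "node2 master, base 110: node1=$(effective_priority 120 "$WEIGHT" 0) node2=$(effective_priority 110 "$WEIGHT" 1)"
```

The tie in the second case is why the priority gap should stay below the weight: the VIP follows the Redis master only when the node running the master ends up with the strictly higher effective priority.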

Start:

systemctl enable keepalived
systemctl restart keepalived
ip a | grep 10.0.0.100
# Expected: only the master node has the VIP bound

Failover walkthrough (example)

1) Stop the Redis process on the current master:

systemctl stop redis

2) Watch Sentinel and the VIP move:

# The Sentinel log should show the master switch
tail -f /var/log/redis/sentinel.log

# The VIP moves to the new master
ip a | grep 10.0.0.100

3) The client keeps using the same address:

redis-cli -h 10.0.0.100 -p 6379 ping
# Expected: PONG (once the VIP has moved to the new master)
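
A single ping only shows the end state. To get a feel for how long the VIP is unreachable during the switch, one option is to probe it repeatedly while the failover runs. The sketch below is a minimal example; `count_failures`, `vip_ping`, and the `PROBE_SLEEP` variable are hypothetical names introduced here.

```shell
#!/usr/bin/env bash
# Run a probe command a fixed number of times and count the failed attempts.
count_failures() {
  local attempts=$1; shift
  local failures=0 i
  for (( i = 0; i < attempts; i++ )); do
    "$@" >/dev/null 2>&1 || failures=$(( failures + 1 ))
    sleep "${PROBE_SLEEP:-1}"   # seconds between probes
  done
  echo "$failures"
}

# Probe that succeeds only when the VIP answers PONG.
vip_ping() {
  [ "$(redis-cli -h 10.0.0.100 -p 6379 ping 2>/dev/null)" = "PONG" ]
}

# Real usage: start this, then stop Redis on the current master.
#   count_failures 30 vip_ping

# Demo that runs without Redis: a probe that always succeeds.
PROBE_SLEEP=0 count_failures 3 true   # prints 0
```

With one probe per second, the number of failures gives an approximate outage length in seconds as seen by a client.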

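Troubleshooting and operations practice (example)

Common problems and commands:

# 1. VIP did not move: check the keepalived status and the script's exit code
systemctl status keepalived
journalctl -u keepalived -n 100
/etc/keepalived/check_redis_master.sh; echo $?

# 2. Wrong role detected: check the ROLE output
redis-cli -h 127.0.0.1 -p 6379 ROLE

# 3. Sentinel did not complete the failover: inspect the Sentinel state
redis-cli -p 26379 SENTINEL master mymaster
redis-cli -p 26379 SENTINEL slaves mymaster

# 4. Stale ARP cache makes clients keep reaching the old master
arp -an | grep 10.0.0.100
# A gratuitous ARP can be sent at switchover time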
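
Keepalived normally announces the VIP with gratuitous ARP on its own when it becomes MASTER. If a neighbor still holds a stale entry, one can be sent by hand. The sketch below assumes the iputils `arping` tool and the interface name `eth0` from the example configuration; `send_garp` and `DRY_RUN` are hypothetical names, and the dry-run mode only prints the command.

```shell
#!/usr/bin/env bash
# Send (or, with DRY_RUN=1, just print) a gratuitous ARP for the VIP.
send_garp() {
  local vip=$1 iface=$2
  # -U: unsolicited (gratuitous) ARP, -c 3: three packets, -I: outgoing interface
  local cmd=(arping -U -c 3 -I "$iface" "$vip")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Dry run: only prints the command that would be executed.
DRY_RUN=1 send_garp 10.0.0.100 eth0   # prints arping -U -c 3 -I eth0 10.0.0.100
```

Run it without `DRY_RUN` as root on the node that currently holds the VIP.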

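Recommendations:
- Tune interval together with Sentinel's down-after-milliseconds so the VIP does not move before Sentinel has finished promoting a new master;
- Use fall/rise as debouncing to reduce false positives;
- Isolate resources between Keepalived and Redis so that load on the shared host does not cause the script to misjudge.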


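Exercises

  1. Change down-after-milliseconds to 3000 and observe how the failover time changes.
  2. Force the script to return a fixed 0 or 1 and observe the VIP behavior and the logs.
  3. Stop the Sentinel processes and verify whether Keepalived still keeps the VIP (expected: no switchover).
  4. Set a password for Redis, update the script and the Sentinel configuration, and verify that role detection still works.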