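12.6.4 Keepalived + Redis Sentinel High Availability
The goal of the Keepalived + Redis Sentinel setup is to have clients always connect to a VIP: Sentinel handles master/replica failover, and Keepalived moves the entry point. After the master changes, the health-check script detects each node's role and drives the VIP migration, so clients do not need to track which node is currently the master.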
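Core principles:
- Sentinel detects master failure, then carries out leader election and replica promotion;
- Keepalived runs a script to decide whether the local node is the master, and only the master holds the VIP;
- VIP drift time ≈ time for Sentinel to complete the failover + time for Keepalived to detect the new role and move the VIP.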
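The drift-time formula can be made concrete with the values used later in this section (down-after-milliseconds 5000, script interval 2 s, rise 1). The sketch below is a rough estimate, not a measurement: `PROMOTE_MS` is an assumed placeholder for Sentinel election and promotion time, and VRRP election time (a few `advert_int` periods) is ignored.

```shell
#!/usr/bin/env bash
# Rough estimate of VIP drift time after the master's Redis process dies.
# Values marked "assumed" are illustrative, not measured.
DOWN_AFTER_MS=5000      # sentinel down-after-milliseconds
PROMOTE_MS=1000         # assumed: Sentinel election + replica promotion
CHECK_INTERVAL_S=2      # vrrp_script interval
CHECK_RISE=1            # vrrp_script rise

# Time for Keepalived on the new master to see the check succeed `rise` times.
KEEPALIVED_MS=$(( CHECK_INTERVAL_S * CHECK_RISE * 1000 ))
DRIFT_MS=$(( DOWN_AFTER_MS + PROMOTE_MS + KEEPALIVED_MS ))
echo "estimated VIP drift: ${DRIFT_MS} ms"
```

Lowering down-after-milliseconds shortens the first term but makes false failovers more likely, so the terms are best tuned together.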
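Installation and basic preparation (example)
This example uses two candidate master nodes (node1 and node2), with Redis and Keepalived deployed on the same hosts.
# Install Keepalived
yum install -y keepalived
# Install Redis (example)
yum install -y redis
systemctl enable redis && systemctl start redis
# Check the Redis port
ss -lntp | grep 6379
# Expected (when Redis is bound to all interfaces): LISTEN 0 128 *:6379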
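Installing Redis on both nodes does not by itself set up replication, and Sentinel needs a working master/replica pair before it can fail over. The sketch below checks the replication link on a replica. It assumes node1 is 10.0.0.11 (the address used in the Sentinel configuration) and that node2 has been attached with `REPLICAOF 10.0.0.11 6379` (`SLAVEOF` on Redis versions before 5). `REDIS_CLI` is a hypothetical override variable, introduced only so the function can be exercised without a live server.

```shell
#!/usr/bin/env bash
# Check that a replica's link to its master is up by reading INFO replication.
# REDIS_CLI is a hypothetical override (defaults to the real redis-cli).
REDIS_CLI="${REDIS_CLI:-redis-cli}"

replica_link_up() {
    # $1 = host to query, $2 = port
    local status
    # INFO lines end in CRLF, so strip the carriage returns before parsing.
    status=$($REDIS_CLI -h "$1" -p "$2" INFO replication \
        | tr -d '\r' \
        | awk -F: '$1 == "master_link_status" { print $2 }')
    [ "$status" = "up" ]
}

# Usage on node2 (after running: redis-cli REPLICAOF 10.0.0.11 6379):
#   replica_link_up 127.0.0.1 6379 && echo "replication OK"
```

On a master the `master_link_status` field is absent, so the function returns non-zero there; it is meant to be run against replicas only.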
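Redis Sentinel configuration (example)
Run one Sentinel process on each node, with identical configuration. Note that with a quorum of 2 and only two Sentinels, losing an entire host leaves the surviving Sentinel unable to authorize a failover; production deployments normally run at least three Sentinels, for example a third one on a separate host.
File path: /etc/redis/sentinel.conf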
port 26379
sentinel monitor mymaster 10.0.0.11 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 15000
sentinel parallel-syncs mymaster 1
# Authentication example (when Redis requires a password)
# sentinel auth-pass mymaster redispass
Start Sentinel:
redis-sentinel /etc/redis/sentinel.conf
# Expected: the log shows Sentinel starting and a "+monitor master mymaster 10.0.0.11 6379 quorum 2" line
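Once Sentinel is running, it is worth confirming that enough Sentinels can see each other to reach the configured quorum. The sketch below wraps the `SENTINEL ckquorum` command and only inspects whether the reply starts with `OK`. `REDIS_CLI` is a hypothetical override variable, added so the function can be tested without a running Sentinel.

```shell
#!/usr/bin/env bash
# Ask a Sentinel whether quorum and failover authorization are reachable.
# REDIS_CLI is a hypothetical override (defaults to the real redis-cli).
REDIS_CLI="${REDIS_CLI:-redis-cli}"

sentinel_quorum_ok() {
    # $1 = Sentinel port, $2 = master name
    local reply
    reply=$($REDIS_CLI -p "$1" SENTINEL ckquorum "$2" 2>/dev/null)
    case "$reply" in
        OK*) return 0 ;;   # quorum reachable
        *)   return 1 ;;   # error reply or no answer
    esac
}

# Usage: sentinel_quorum_ok 26379 mymaster && echo "quorum reachable"
```

With the quorum of 2 configured above, this check fails as soon as fewer than two Sentinels are usable.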
Keepalived configuration and health-check script (example)
1) Role-check script
File: /etc/keepalived/check_redis_master.sh
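#!/usr/bin/env bash
# Exit 0 only when the local Redis reports the master role.
REDIS_HOST="127.0.0.1"
REDIS_PORT=6379
REDIS_PASS=""   # fill in if Redis requires a password
# Optional: port reachability check, done first so an unreachable Redis fails fast
if ! nc -z "$REDIS_HOST" "$REDIS_PORT"; then
    exit 1
fi
# The first line of the ROLE reply is the role name
if [ -n "$REDIS_PASS" ]; then
    # 2>/dev/null hides the redis-cli warning about passing -a on the command line
    ROLE=$(redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" -a "$REDIS_PASS" ROLE 2>/dev/null | head -n 1)
else
    ROLE=$(redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" ROLE | head -n 1)
fi
# Healthy only if ROLE returns "master"
if [ "$ROLE" = "master" ]; then
    exit 0
else
    exit 1
fi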
Permissions and test:
chmod +x /etc/keepalived/check_redis_master.sh
/etc/keepalived/check_redis_master.sh
echo $?
# Expected: 0 on the master node, 1 on a replica
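The role check trusts the local Redis alone. One possible hardening, sketched below, is to also ask the local Sentinel which address it currently considers the master and require both answers to agree. This narrows the window in which a demoted node still reports `master`, but it does not eliminate split-brain during a network partition. `REDIS_CLI` is a hypothetical override variable, and the node's own IP is passed in by the caller.

```shell
#!/usr/bin/env bash
# Cross-check: does Sentinel's view of the master match this node's address?
# REDIS_CLI is a hypothetical override (defaults to the real redis-cli).
REDIS_CLI="${REDIS_CLI:-redis-cli}"

sentinel_agrees() {
    # $1 = this node's IP, $2 = Sentinel port, $3 = master name
    local master_ip
    # The reply has two lines: master IP, then master port.
    master_ip=$($REDIS_CLI -p "$2" SENTINEL get-master-addr-by-name "$3" 2>/dev/null | head -n 1)
    [ "$master_ip" = "$1" ]
}

# Usage inside the check script on node1 (10.0.0.11 in this section):
#   sentinel_agrees 10.0.0.11 26379 mymaster || exit 1
```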
2) Keepalived configuration
File: /etc/keepalived/keepalived.conf (node1 example)
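vrrp_script chk_redis_master {
    script "/etc/keepalived/check_redis_master.sh"
    interval 2
    fall 2
    rise 1
    # weight is left at its default of 0, so a failing check puts the instance
    # into the FAULT state and a replica can never hold the VIP. A positive weight
    # only adjusts priority and must exceed the priority gap between the nodes
    # (20 in this example) to have any effect.
}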
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 120
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_redis_master
    }
    virtual_ipaddress {
        10.0.0.100/24
    }
}
node2 differs only in priority (for example 100); everything else is identical.
Start:
systemctl enable keepalived
systemctl restart keepalived
ip a | grep 10.0.0.100
# Expected: only the master node has the VIP bound
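To see when and why the VIP moves, Keepalived can call a `notify` script on every VRRP state transition, passing the type, the instance name, the new state, and the priority as arguments. The sketch below only logs those transitions. The script path and the log file path are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical /etc/keepalived/notify.sh: log every VRRP state transition.
# Keepalived invokes a notify script with: TYPE NAME STATE PRIORITY.
LOG_FILE="${LOG_FILE:-/var/log/keepalived-notify.log}"   # assumed path

log_transition() {
    # $1 = GROUP|INSTANCE, $2 = name, $3 = MASTER|BACKUP|FAULT, $4 = priority
    printf '%s %s %s -> %s (priority %s)\n' \
        "$(date '+%F %T')" "$1" "$2" "$3" "$4" >> "$LOG_FILE"
}

# Run only when invoked with arguments, as Keepalived does.
if [ "$#" -ge 3 ]; then
    log_transition "$@"
fi
```

To enable it, add a line such as `notify /etc/keepalived/notify.sh` inside the `vrrp_instance VI_1` block and make the script executable.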
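Failover walkthrough (example)
1) Stop the Redis process on the current master:
systemctl stop redis
2) Watch Sentinel and the VIP move:
# The Sentinel log should show the failover (path assumes logfile is set to this location)
tail -f /var/log/redis/sentinel.log
# The VIP moves to the new master
ip a | grep 10.0.0.100
3) Clients reach the new master through the same VIP (existing connections are dropped and must reconnect):
redis-cli -h 10.0.0.100 -p 6379 ping
# Expected: PONG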
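A single `ping` after the fact does not show how long the VIP was unavailable. The sketch below probes the VIP repeatedly and counts failed probes, which gives a rough measure of the outage window. `REDIS_CLI` is a hypothetical override variable; a probe that blocks on an unreachable address lengthens the loop, so the count is only approximate.

```shell
#!/usr/bin/env bash
# Probe the VIP repeatedly and count failed probes to gauge the outage window.
# REDIS_CLI is a hypothetical override (defaults to the real redis-cli).
REDIS_CLI="${REDIS_CLI:-redis-cli}"

count_failed_probes() {
    # $1 = host (VIP), $2 = port, $3 = number of probes, $4 = seconds between probes
    local failed=0 i=0
    while [ "$i" -lt "$3" ]; do
        if [ "$($REDIS_CLI -h "$1" -p "$2" PING 2>/dev/null)" != "PONG" ]; then
            failed=$(( failed + 1 ))
        fi
        i=$(( i + 1 ))
        sleep "$4"
    done
    echo "$failed"
}

# Usage: start this before stopping the master, e.g. 60 probes one second apart:
#   count_failed_probes 10.0.0.100 6379 60 1
```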
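Troubleshooting and operations practice (example)
Common problems and commands:
# 1. VIP does not move: check keepalived status and the script's exit code
systemctl status keepalived
journalctl -u keepalived -n 100
/etc/keepalived/check_redis_master.sh; echo $?
# 2. Wrong role detected: inspect the ROLE output
redis-cli -h 127.0.0.1 -p 6379 ROLE
# 3. Sentinel did not complete the failover: inspect Sentinel state
redis-cli -p 26379 SENTINEL master mymaster
redis-cli -p 26379 SENTINEL slaves mymaster
# 4. Stale ARP cache makes clients keep reaching the old master
arp -an | grep 10.0.0.100
# Keepalived sends gratuitous ARP when it takes over the VIP; it can also be sent manually, e.g. arping -U -I eth0 -c 3 10.0.0.100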
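Recommendations:
- Tune the script interval together with Sentinel's down-after-milliseconds so that the VIP does not move prematurely;
- Use fall/rise to debounce the check and reduce false positives;
- Isolate Keepalived's resources from Redis so that load on the shared host does not cause the script to misjudge.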
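Exercises
- Change down-after-milliseconds to 3000 and observe how the failover time changes.
- Force the script to return a fixed 0 or 1 and observe the VIP behavior and the logs.
- Stop the Sentinel processes and verify whether Keepalived still keeps the VIP (expected: no switch).
- Set a password on Redis, update the script and the Sentinel configuration, and verify that role detection still works.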