14.6.5 读写分离场景下的切换流程
在读写分离架构中,ProxySQL将写流量指向 writer hostgroup,读流量指向 reader hostgroup。切换流程目标:写入只落在新主库、读流量快速回收到健康从库、降低连接抖动与一致性风险。
原理草图(读写分离切换):
切换触发来源:健康检查主库不可用、外部高可用组件完成主从切换后的通知、人工运维触发。核心动作:下线旧主、提升新主、调整读库权重、回收存量连接。
关键配置示例(完整可执行,ProxySQL Admin 端口默认 6032):
-- 1) 定义 writer/reader hostgroup 关联
INSERT INTO mysql_replication_hostgroups(writer_hostgroup, reader_hostgroup, comment)
VALUES (10, 20, 'rw-split');
-- 2) 注册后端
INSERT INTO mysql_servers(hostgroup_id, hostname, port, weight, max_replication_lag, comment)
VALUES
(10, '10.0.0.11', 3306, 100, 0, 'master'),
(20, '10.0.0.12', 3306, 100, 2, 'slave1'),
(20, '10.0.0.13', 3306, 100, 2, 'slave2');
-- 3) 监控账号与健康检查
UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_username';
UPDATE global_variables SET variable_value='monitor_pwd' WHERE variable_name='mysql-monitor_password';
UPDATE global_variables SET variable_value='2000' WHERE variable_name='mysql-monitor_connect_interval';
UPDATE global_variables SET variable_value='2' WHERE variable_name='mysql-monitor_connect_timeout';
-- 4) 读写路由规则(示例:SELECT 走读库)
INSERT INTO mysql_query_rules(rule_id, active, match_pattern, destination_hostgroup, apply)
VALUES (1, 1, '^SELECT', 20, 1);
-- 5) 生效与持久化
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;
LOAD MYSQL VARIABLES TO RUNTIME; SAVE MYSQL VARIABLES TO DISK;
切换流程与命令示例(手工触发,读写分离场景):
# 登录 ProxySQL 管理端
mysql -uadmin -padmin -h127.0.0.1 -P6032
-- Step1: 下线旧主(平滑下线,允许存量连接自然结束)
UPDATE mysql_servers
SET status='OFFLINE_SOFT'
WHERE hostgroup_id=10 AND hostname='10.0.0.11';
-- Step2: 提升新主(假设 10.0.0.12 已完成主从切换)
UPDATE mysql_servers
SET hostgroup_id=10, status='ONLINE', weight=100
WHERE hostname='10.0.0.12';
-- Step3: 调整其余从库权重/延迟阈值
UPDATE mysql_servers
SET weight=80, max_replication_lag=2
WHERE hostgroup_id=20 AND hostname='10.0.0.13';
-- Step4: 运行时生效
LOAD MYSQL SERVERS TO RUNTIME;
-- Step5: 确认状态
SELECT hostgroup_id, hostname, status, weight, max_replication_lag
FROM runtime_mysql_servers ORDER BY hostgroup_id, hostname;
外部高可用组件通知切换(示例脚本,收到通知后自动执行):
#!/usr/bin/env bash
# /usr/local/bin/proxysql_failover.sh
OLD_MASTER="$1"
NEW_MASTER="$2"
mysql -uadmin -padmin -h127.0.0.1 -P6032 <<SQL
UPDATE mysql_servers SET status='OFFLINE_SOFT'
WHERE hostgroup_id=10 AND hostname='${OLD_MASTER}';
UPDATE mysql_servers SET hostgroup_id=10, status='ONLINE', weight=100
WHERE hostname='${NEW_MASTER}';
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
SQL
命令解释:
- OFFLINE_SOFT:平滑下线,允许已有连接结束,减少中断。
- hostgroup_id=10:指定 writer 组,确保写只命中新主库。
- max_replication_lag:延迟阈值,超过即自动摘除读库。
验证方法(切换后快速验证):
-- 验证写命中主库(可在主库查看写入)
SELECT @@hostname AS backend, @@read_only AS ro;
-- 验证读路由(在应用端执行 SELECT,观察 ProxySQL 统计)
SELECT hostgroup, srv_host, srv_port, status, ConnUsed
FROM stats_mysql_connection_pool ORDER BY hostgroup, srv_host;
-- 验证路由规则命中
SELECT rule_id, hits, match_pattern, destination_hostgroup
FROM stats_mysql_query_rules;
排错要点与示例:
-- 问题1:写请求仍命中旧主
SELECT hostgroup_id, hostname, status FROM runtime_mysql_servers WHERE hostgroup_id=10;
-- 处理:确认旧主已 OFFLINE_SOFT/OUT,且规则未误路由
-- 问题2:读库被全部摘除
SELECT hostname, status, max_replication_lag
FROM runtime_mysql_servers WHERE hostgroup_id=20;
-- 处理:放宽 max_replication_lag 或修复复制延迟
-- 问题3:健康检查失败
SELECT * FROM runtime_mysql_monitor_connect_log ORDER BY time_start_us DESC LIMIT 5;
-- 处理:检查监控账号权限、网络连通性、防火墙
练习(建议在测试环境完成):
1. 手工将主库设置为 OFFLINE_SOFT,观察写请求路由是否停止。
2. 提升某从库为新主并更新 hostgroup,验证写入是否命中新主。
3. 将某从库的 max_replication_lag 设置为 1 秒,模拟延迟后观察是否自动摘除。
4. 通过 stats_mysql_connection_pool 验证连接回收是否生效。