hghac集群主备数据不一致

环境

系统平台:中标麒麟(龙芯)7
版本:4.5

症状

hghac集群中主备库数据不一致,存在较大数据差异。

问题原因

用户修改数据库中流复制相关用户密码后,未修改hghac.yml中的对应的用户密码,也未检查流复制备库连接字符串中的用户密码是否正确,导致流复制断开,集群数据不一致。

解决方案

查看集群状态主库正常,两个备库均存在数据差异,猜测备库与主库连接已断开。

[root@test-hghac-0 /]# /opt/HighGo/tools/hghac/hghactl  list
+ Cluster: test-hghac (6969050704707002401) --+---------+----+-----------+
| Member       | Host               | Role    | State   | TL | Lag in MB |
+--------------+--------------------+---------+---------+----+-----------+
| test-hghac-0 | 10.233.103.89:5866 | Replica | running |    |     689.0 |
| test-hghac-1 | 10.233.105.48:5866 | Replica | running |    |     945.0 |
| test-hghac-2 | 10.233.92.57:5866  | Leader  | running |  6 |           |
+--------------+--------------------+---------+---------+----+-----------+

wal进程已停止,检查数据库日志发现如下大量报错:

2021-06-28 00:00:14.254 CST,"a_replication","",2328,"127.0.0.1:46089",60d8a08e.918,2,"authentication",2021-06-28 00:00:14 CST,3/47794,0,FATAL,22023,"Too many invalid attempts. Please contact the syssso user to unlock",,,,,,,,,""
2021-06-28 16:13:59.633 CST,"a_rewind","highgo",23226,"10.233.103.85:50824",60d984c7.5aba,2,"authentication",2021-06-28 16:13:59 CST,5/13199,0,FATAL,22023,"Too many invalid attempts. Please contact the syssso user to unlock",,,,,,,,,""

登录test-hghac-2服务器,对被锁定的用户进行解锁:

[root@test-hghac-2 /]# psql highgo sysdba
Password for user sysdba:
NOTICE: 
-------------------------------------------
Login User: sysdba
Login time: 2021-06-30 09:48:10.725705+08
Login Address: [local]
Last Login Status: FAILED
Login Failures: 1
Valied Until: 2022-06-08 15:28:25+08
-------------------------------------------

psql (4.5.6)
Type "help" for help.

highgo=# select user_unlock('a_replication'); 
highgo=# select user_unlock('a_rewind'); 

解锁完成后,a_replication用户立刻再次被锁。与客户沟通后得知客户修改过a_replication、a_rewind用户密码,但未修改hghac.yml 以及流复制连接字符串中的用户密码,导致流复制断开主备数据不一致。

修改流复制相关用户的密码并修改hghac.yml文件以及流复制连接字符串中的信息。

[root@test-hghac-2 ]# psql highgo syssso
highgo=> alter user  a_rewind  encrypted password  'test@123';
ALTER ROLE
highgo=> alter user a_replication encrypted password  'test@123';
ALTER ROLE
highgo=> \q

登陆test-hghac-1 以及test-hghac-0 服务器,确认/root/pgpass 中的密码正确后重启备库。

[root@test-hghac-1 ]# cat postgresql.conf |grep primary_conninfo 
primary_conninfo = 'user=a_replication passfile=/root/pgpass host=10.10.10.89 port=5866 sslmode=prefer application_name=test-hghac-1'

重新启动备库后检查数据库日志以及集群状态:

FATAL,XX000,"highest timeline 6 of the primary is behind recovery timeline 7",,,,,,,,,""
[root@test-hghac-0 /]# /opt/HighGo/tools/hghac/hghactl  list
+ Cluster: test-hghac (6969050704707002401) --+---------+----+-----------+
| Member       | Host               | Role    | State   | TL | Lag in MB |
+--------------+--------------------+---------+---------+----+-----------+
| test-hghac-0 | 10.233.103.89:5866 | Replica | crashed |    |   unknown |
| test-hghac-1 | 10.233.105.48:5866 | Replica | stoppped|    |   unknown |
| test-hghac-2 | 10.233.92.57:5866  | Leader  | running |  6 |           |
+--------------+--------------------+---------+---------+----+-----------+

因为时间线问题无法将备库加入集群,只能重新初始化集群备库:

[root@test-hghac-1 /]# /opt/HighGo/tools/hghac/hghactl  reinit  test-hghac test-hghac-1
[root@test-hghac-0 /]# /opt/HighGo/tools/hghac/hghactl  reinit  test-hghac test-hghac-0

完成后查看数据库集群状态:

[root@test-hghac-0 /]# /opt/HighGo/tools/hghac/hghactl  list
+ Cluster: test-hghac (6969050704707002401) --+---------+----+-----------+
| Member       | Host               | Role    | State   | TL | Lag in MB |
+--------------+--------------------+---------+---------+----+-----------+
| test-hghac-0 | 10.233.103.89:5866 | Leader  | running |  8 |           |
| test-hghac-1 | 10.233.105.48:5866 | Replica | running |  8 |       0.0 |
| test-hghac-2 | 10.233.92.57:5866  | Replica | running |  8 |       0.0 |
+--------------+--------------------+---------+---------+----+-----------+

集群状态正常,处理完成。

相关文档

报错编码

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值