Cluster node rebooted due to link failover in network bonding - Red Hat Customer Portal
Environment
Red Hat Enterprise Linux 6 (with the High Availability or Resilient Storage Add on)
Red Hat Enterprise Linux 7 (with the High Availability or Resilient Storage Add on)
Red Hat Enterprise Linux 8 (with the High Availability or Resilient Storage Add on)
Red Hat Enterprise Linux 9 (with the High Availability or Resilient Storage Add on)
Issue
Cluster node rebooted due to link failover in network bonding
Jun 28 21:57:00 Node1 kernel: Bond: (slave ens1): link status definitely up, 10000 Mbps full duplex
Jun 28 21:57:00 Node1 kernel: Bond: (slave ens1): making interface the new active one
Jun 28 21:57:01 Node1 corosync[17893]: [KNET ] link: host: 1 link: 0 is down
Jun 28 21:57:01 Node1 corosync[17893]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Jun 28 21:57:01 Node1 corosync[17893]: [KNET ] host: host: 1 has no active links
Jun 28 21:57:02 Node1 corosync[17893]: [TOTEM ] Token has not been received in 2250 ms
Jun 28 21:57:03 Node1 corosync[17893]: [TOTEM ] A processor failed, forming new configuration: token timed out (3000ms), waiting 3600ms for consensus.
Jun 28 21:57:07 Node1 corosync[17893]: [QUORUM] Sync members[1]: 2
Jun 28 21:57:07 Node1 corosync[17893]: [QUORUM] Sync left[1]: 11
Jun 28 21:57:07 Node1 corosync[17893]: [TOTEM ] A new membership (2.528) was formed. Members left: 1
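For context on the timings in the log above: per corosync.conf(5), the consensus timeout defaults to 1.2 × the token timeout, which is why a 3000 ms token produces the 3600 ms consensus wait shown. A minimal sketch of that relationship (the helper name here is ours, not corosync's):

```python
# Sketch: corosync's consensus timeout defaults to 1.2x the token timeout
# (corosync.conf(5)), matching the 3000 ms / 3600 ms pair in the log above.
def consensus_timeout(token_ms: int, multiplier: float = 1.2) -> int:
    """Default consensus timeout derived from the totem token timeout."""
    return int(token_ms * multiplier)

print(consensus_timeout(3000))  # 3600
```

Raising the token timeout therefore also lengthens the consensus wait unless consensus is set explicitly.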
https://access.redhat.com/solutions/7078878 (retrieved 2/18/25, 11:34 AM)
Resolution
By default the cluster is very sensitive to unresponsiveness, which ensures quick recovery of resources. Increasing the cluster's timeout makes it less sensitive to short periods of unresponsiveness and allows a node more time to check in before it needs to be rebooted.
The totem token value is a tunable attribute, so a suitable value varies from environment to environment depending on the workload, network, infrastructure, and so on. Check with your network team how long the network switch takes to complete a recovery, and update the totem token value accordingly.
Refer
How to change totem token timeout value in a RHEL 5, 6, 7, 8 or 9 High Availability cluster?
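As a sketch (the 10000 ms value below is only illustrative and should be sized to your switch's measured recovery time): on releases where pcs supports it, the timeout can be updated cluster-wide with `pcs cluster config update totem token=10000`; the equivalent setting lives in the totem section of /etc/corosync/corosync.conf on every node:

```
totem {
    version: 2
    cluster_name: my_cluster   # hypothetical name; keep your existing value
    token: 10000               # totem token timeout in milliseconds
}
```

After a manual edit, the file must be present on all nodes (for example via `pcs cluster sync`) and corosync restarted for the change to take effect.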
Root Cause
During a link failover on a bonded network interface, the network may require additional time to stabilize. The corosync totem timings therefore need to be increased to allow the network switch sufficient time to stabilize before totem communication can be re-established between the nodes.
Additional resources:
Why did a RHEL High Availability cluster node reboot - and how can I prevent it from
happening again?
Support Policies for RHEL High Availability Clusters - Cluster Interconnect Network
Interfaces
This solution is part of Red Hat’s fast-track publication program, providing a huge library of
solutions that Red Hat engineers have created while supporting our customers. To give you the
knowledge you need the instant it becomes available, these articles may be presented in a raw
and unedited form.