Cluster node rebooted due to link failover in network bonding - Red Hat Customer Portal

The document discusses an issue where a cluster node rebooted due to link failover in network bonding across various versions of Red Hat Enterprise Linux. It suggests increasing the totem token timeout value to allow more time for network stabilization before rebooting the node. Additional resources and solutions are provided to prevent future occurrences of this issue.

Uploaded by manujaleel24

2/18/25, 11:34 AM Cluster node rebooted due to link failover in network bonding - Red Hat Customer Portal

Cluster node rebooted due to link failover in network bonding
SOLUTION IN PROGRESS - Updated August 8 2024 at 8:32 AM - English

Environment
Red Hat Enterprise Linux 6 (with the High Availability or Resilient Storage Add-On)
Red Hat Enterprise Linux 7 (with the High Availability or Resilient Storage Add-On)
Red Hat Enterprise Linux 8 (with the High Availability or Resilient Storage Add-On)
Red Hat Enterprise Linux 9 (with the High Availability or Resilient Storage Add-On)

Issue
Cluster node rebooted due to link failover in network bonding

Jun 28 21:57:00 Node1 kernel: Bond: (slave ens1): link status definitely up, 10000 Mbps full duplex
Jun 28 21:57:00 Node1 kernel: Bond: (slave ens1): making interface the new active one
Jun 28 21:57:01 Node1 corosync[17893]: [KNET ] link: host: 1 link: 0 is down
Jun 28 21:57:01 Node1 corosync[17893]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Jun 28 21:57:01 Node1 corosync[17893]: [KNET ] host: host: 1 has no active links
Jun 28 21:57:02 Node1 corosync[17893]: [TOTEM ] Token has not been received in 2250 ms
Jun 28 21:57:03 Node1 corosync[17893]: [TOTEM ] A processor failed, forming new configuration: token timed out (3000ms), waiting 3600ms for consensus.
Jun 28 21:57:07 Node1 corosync[17893]: [QUORUM] Sync members[1]: 2
Jun 28 21:57:07 Node1 corosync[17893]: [QUORUM] Sync left[1]: 11
Jun 28 21:57:07 Node1 corosync[17893]: [TOTEM ] A new membership (2.528) was formed. Members left: 1
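
The three durations in the log messages above (2250 ms, 3000 ms, 3600 ms) are related to each other. The following is a minimal sketch of that arithmetic, assuming the defaults described in corosync.conf(5): a token warning at 75% of the token timeout, a consensus timeout of 1.2 times the token timeout, and a token_coefficient of 650 ms added per node beyond two. The function names are hypothetical, and the two-node cluster size is an assumption inferred from the membership lines, not something the article states.

```python
# Sketch only: reproduces the timing values seen in the log from the
# token timeout, under assumed corosync defaults (see corosync.conf(5)).

def effective_token_ms(token_ms: int, nodes: int, token_coefficient_ms: int = 650) -> int:
    """Token timeout after the per-node adjustment for clusters of 3+ nodes."""
    extra_nodes = max(0, nodes - 2)
    return token_ms + extra_nodes * token_coefficient_ms

def token_warning_ms(token_ms: int, warning_pct: int = 75) -> int:
    """Point at which 'Token has not been received in N ms' is logged."""
    return token_ms * warning_pct // 100

def consensus_ms(token_ms: int) -> int:
    """Default consensus timeout, 1.2 x token (integer math avoids float rounding)."""
    return token_ms * 12 // 10

# Assumption: two nodes and a 3000 ms token, matching the log excerpt.
token = effective_token_ms(3000, nodes=2)
print(token, token_warning_ms(token), consensus_ms(token))  # prints "3000 2250 3600"
```

Under these assumptions, raising the token timeout moves the warning and consensus values with it, which is why the token value is the setting the Resolution focuses on.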

https://access.redhat.com/solutions/7078878 1/5

Resolution
By default the cluster is very sensitive, to ensure quick recovery of resources. Increasing the cluster's timeout makes it less sensitive to short periods of unresponsiveness and allows more time for a node to check in before it needs to be rebooted.

The totem token value is a tunable attribute, so a suitable value varies from environment to environment depending on the workload, network, infrastructure, and so on. Check with your network team how long the network switch takes to recover completely, and update the totem token value accordingly.

Refer to:
How to change totem token timeout value in a RHEL 5, 6, 7, 8 or 9 High Availability cluster?
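
As an illustration of sizing the token value from the switch recovery time, the sketch below pads a measured recovery time with a safety margin, rounds up to a whole second, and prints the matching totem fragment in corosync.conf syntax. The helper names, the 5000 ms recovery time, and the 50% margin are all hypothetical assumptions for the example, not Red Hat recommendations. Use the linked solution for the supported procedure to apply the change on each RHEL release.

```python
# Sketch only: derive a candidate totem token value from a measured
# switch recovery time. The margin and rounding step are arbitrary
# assumptions chosen for illustration.

def suggest_token_ms(recovery_ms: int, margin_pct: int = 50, step_ms: int = 1000) -> int:
    """Pad the recovery time by margin_pct, then round up to a multiple of step_ms."""
    padded = recovery_ms * (100 + margin_pct) // 100
    return -(-padded // step_ms) * step_ms  # ceiling division, then scale back up

def totem_fragment(token_ms: int) -> str:
    """Render only the token setting; a real totem section carries other options too."""
    return "totem {\n    token: %d\n}" % token_ms

# Assumption: the network team reports a 5000 ms switch recovery time.
token = suggest_token_ms(5000)
print(token)                 # prints 8000
print(totem_fragment(token))
```

The value produced this way is only a starting point. A longer token timeout also delays detection of real node failures, so it trades recovery speed for tolerance of short network interruptions.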

Root Cause
When a link fails over in a bonded network interface, the network may need more time to stabilize. The corosync totem token timeout therefore needs to be increased so that the network switch has sufficient time to stabilize before corosync totem communication is re-established between the nodes.

Additional resources:

Why did a RHEL High Availability cluster node reboot - and how can I prevent it from happening again?
Support Policies for RHEL High Availability Clusters - Cluster Interconnect Network Interfaces

Product(s): Red Hat Enterprise Linux
Component: corosync totem
Category: Learn more
Tags: high_availability_add-on

This solution is part of Red Hat’s fast-track publication program, providing a huge library of
solutions that Red Hat engineers have created while supporting our customers. To give you the
knowledge you need the instant it becomes available, these articles may be presented in a raw
and unedited form.

Copyright © 2025 Red Hat, Inc.


Privacy statement
Terms of use
All policies and guidelines
Digital accessibility
Cookie preferences

https://access.redhat.com/solutions/7078878 5/5

You might also like