Stateful Switchover
Note • For complete syntax and usage information for the commands used in this chapter, see these
publications:
http://www.cisco.com/en/US/products/ps11845/prod_command_reference_list.html
• Cisco IOS Release 15.0SY supports only Ethernet interfaces. Cisco IOS Release 15.0SY does not
support any WAN features or commands.
• SSO and NSF do not support IPv6 multicast traffic.
Tip For additional information about Cisco Catalyst 6500 Series Switches (including configuration examples
and troubleshooting information), see the documents listed on this page:
http://www.cisco.com/en/US/products/hw/switches/ps708/tsd_products_support_series_home.html
Participate in the Technical Documentation Ideas forum
General Restrictions
• Two RPs must be installed in the chassis, each running the same version of the Cisco IOS software.
• Both RPs must run the same Cisco IOS image. If the RPs are operating different Cisco IOS images,
the system reverts to RPR mode even if SSO is configured.
• Configuration changes made through SNMP may not be automatically configured on the standby RP
after a switchover occurs.
• Load sharing between dual processors is not supported.
• The Hot Standby Router Protocol (HSRP) is not supported with Cisco Nonstop Forwarding with
Stateful Switchover. Do not use HSRP with Cisco Nonstop Forwarding with Stateful Switchover.
• Enhanced Object Tracking (EOT) is not stateful switchover-aware and cannot be used with HSRP,
Virtual Router Redundancy Protocol (VRRP), or Gateway Load Balancing Protocol (GLBP) in SSO
mode.
• Multicast is not SSO-aware and restarts after switchover; therefore, multicast tables and data
structures are cleared upon switchover.
• Interfaces on the RP itself are not stateful and will experience a reset across switchovers. In
particular, the GE interfaces on the RPs are reset across switchovers and do not support SSO.
• Any line cards that are not online at the time of a switchover (line cards not in Cisco IOS running
state) are reset and reloaded on a switchover.
SSO Overview
The switch supports fault resistance by allowing a redundant supervisor engine to take over if the
primary supervisor engine fails. Cisco SSO (frequently used with NSF) minimizes the time a network is
unavailable to its users following a switchover while continuing to forward IP packets. The switch
also supports route processor redundancy (RPR). For more information, see Chapter 1,
“Route Processor Redundancy (RPR).”
SSO is particularly useful at the network edge. Traditionally, core routers protect against network faults
using router redundancy and mesh connections that allow traffic to bypass failed network elements. SSO
provides protection for network edge devices with dual Route Processors (RPs) that represent a single
point of failure in the network design, and where an outage might result in loss of service for customers.
SSO has many benefits. Because the SSO feature maintains stateful feature information, user session
information is maintained during a switchover, and line cards continue to forward network traffic with
no loss of sessions, providing improved network availability. SSO provides a faster switchover than RPR
by fully initializing and fully configuring the standby RP, and by synchronizing state information, which
can reduce the time required for routing protocols to converge. Network stability may be improved by
reducing the number of route flaps that were created when routers in the network failed and lost
their routing tables.
SSO is required by the Cisco Nonstop Forwarding (NSF) feature (see Chapter 1, “Nonstop Forwarding
(NSF)”).
Figure 1-1 illustrates how SSO is typically deployed in service provider networks. In this example, Cisco
NSF with SSO is primarily at the access layer (edge) of the service provider network. A fault at this point
could result in loss of service for enterprise customers requiring access to the service provider network.
For Cisco NSF protocols that require neighboring devices to participate in Cisco NSF, Cisco NSF-aware
software images must be installed on those neighboring distribution layer devices. Additional network
availability benefits might be achieved by applying Cisco NSF and SSO features at the core layer of your
network; however, consult your network design engineers to evaluate your specific site requirements.
Figure 1-1 Cisco NSF with SSO Network Deployment: Service Provider Networks
Additional levels of availability may be gained by deploying Cisco NSF with SSO at other points in the
network where a single point of failure exists. Figure 1-2 illustrates an optional deployment strategy that
applies Cisco NSF with SSO at the enterprise network access layer. In this example, each access point
in the enterprise network represents another single point of failure in the network design. In the event of
a switchover or a planned software upgrade, enterprise customer sessions would continue uninterrupted
through the network.
Figure 1-2 Cisco NSF with SSO Network Deployment: Enterprise Networks
(Figure labels: the enterprise access layer is a secondary deployment position for Cisco NSF with
SSO-capable or -aware routers; the enterprise distribution layer is a good position for NSF-aware
routers.)
SSO Operation
SSO establishes one of the RPs as the active processor while the other RP is designated as the standby
processor. SSO fully initializes the standby RP, and then synchronizes critical state information between
the active and standby RP.
During an SSO switchover, the line cards are not reset, which provides faster switchover between the
processors. The following events cause a switchover:
• A hardware failure on the active supervisor engine
• Clock synchronization failure between supervisor engines
• A manual switchover or shutdown
An SSO switchover does not interrupt Layer 2 traffic. An SSO switchover preserves FIB and adjacency
entries and can forward Layer 3 traffic after a switchover. SSO switchover duration is between 0 and 3
seconds.
Synchronization Overview
In networking devices running SSO, both RPs must be running the same configuration so that the
standby RP is always ready to assume control if the active RP fails. SSO synchronizes the configuration
information from the active RP to the standby RP at startup and whenever changes to the active RP
configuration occur. This synchronization occurs in two separate phases:
• While the standby RP is booting, the configuration information is synchronized in bulk from the
active RP to the standby RP.
• When configuration or state changes occur, an incremental synchronization is conducted from the
active RP to the standby RP.
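The two phases above can be pictured with a small model. The following Python sketch is illustrative only: the class names, method names, and configuration keys are hypothetical and do not correspond to any Cisco IOS interface. It assumes that a configuration can be represented as a simple key-value mapping.

```python
from dataclasses import dataclass, field


@dataclass
class RouteProcessor:
    """Toy stand-in for an RP; it holds only a configuration mapping."""
    name: str
    config: dict = field(default_factory=dict)


class SyncSession:
    """Hypothetical model of active-to-standby configuration synchronization."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby
        self.bulk_done = False

    def bulk_sync(self):
        # Phase 1: while the standby RP boots, copy the whole configuration.
        self.standby.config = dict(self.active.config)
        self.bulk_done = True

    def apply_change(self, key, value):
        # Phase 2: a later change is applied on the active RP and then
        # pushed to the standby RP as it occurs.
        self.active.config[key] = value
        if self.bulk_done:
            self.standby.config[key] = value


active = RouteProcessor("active", config={"hostname": "Router"})
standby = RouteProcessor("standby")

session = SyncSession(active, standby)
session.bulk_sync()                              # standby finishes booting
session.apply_change("redundancy mode", "sso")   # incremental update
```

After both calls, the standby mapping matches the active mapping, which is the condition the standby RP needs in order to assume control.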
Incremental Synchronization
• Incremental Synchronization Overview
• CLI Commands
• SNMP SET Commands
• Routing and Forwarding Information
• Chassis State
• Line Card State
• Counters and Statistics
Incremental Synchronization Overview
After both RPs are fully initialized, any further changes to the running configuration or active RP states
are synchronized to the standby RP as they occur. Active RP states are updated as a result of processing
feature information, external events (such as an interface going up or down), user configuration
commands (entered through the CLI or Simple Network Management Protocol [SNMP]), or other internal
events.
CLI Commands
CLI changes to the running configuration are synchronized from the active RP to the standby RP. In
effect, the CLI command is run on both the active and the standby RP.
SNMP SET Commands
Configuration changes caused by an SNMP set operation are synchronized on a case-by-case basis.
Currently only two SNMP configuration set operations are supported:
• shut and no-shut (of an interface)
• link up/down trap enable/disable
Chassis State
Changes to the chassis state due to line card insertion or removal are synchronized to the standby RP.
Line Card State
Changes to the line card states are synchronized to the standby RP. Line card state information is initially
obtained during bulk synchronization of the standby RP. Following bulk synchronization, line card
events, such as whether the interface is up or down, received at the active processor are synchronized to
the standby RP.
Counters and Statistics
The various counters and statistics maintained in the active RP are not synchronized because they may
change often and because the degree of synchronization they require is substantial. The volume of
information associated with statistics makes synchronizing them impractical.
Note Not synchronizing counters and statistics between RPs may create problems for external network
management systems that monitor this information.
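One way a monitoring script might tolerate this is to treat a counter that moves backward as restarted. The Python sketch below is illustrative only. The function name is hypothetical, and the sketch assumes that the newly active RP reports a lower value than the last poll of the previously active RP, which the text above does not guarantee. Counter wrap is not handled.

```python
def counter_delta(previous, current):
    """Return the increase between two polls of an increasing counter.

    Assumption: if the current value is lower than the previous one, the
    counter was restarted (for example, a different RP is now answering),
    so only the current value can be counted.
    """
    if current >= previous:
        return current - previous
    return current


# Four polls of one interface counter; a switchover is assumed to occur
# between the second and third poll.
samples = [1000, 1500, 40, 90]
deltas = [counter_delta(a, b) for a, b in zip(samples, samples[1:])]
```

With these samples, the deltas are 500, 40, and 50 rather than a large negative value at the switchover.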
SSO Operation
• SSO Conditions
• Switchover Time
• Online Removal of the Active RP
• Fast Software Upgrade
• Core Dump Operation
SSO Conditions
An automatic or manual switchover may occur under the following conditions:
• A fault condition that causes the active RP to crash or reboot—automatic switchover
• The active RP is declared dead (not responding)—automatic switchover
• The CLI is invoked—manual switchover
The user can force the switchover from the active RP to the standby RP by using a CLI command. This
manual procedure allows for a “graceful” or controlled shutdown of the active RP and switchover to the
standby RP. This graceful shutdown allows critical cleanup to occur.
Note This procedure should not be confused with the graceful shutdown procedure for routing protocols in
core routers—they are separate mechanisms.
Caution The SSO feature introduces a number of new commands and command changes, including commands to
manually cause a switchover. The reload command does not cause a switchover. The reload command
causes a full reload of the box, removing all table entries, resetting all line cards, and interrupting
nonstop forwarding.
Switchover Time
The time required by the device to switch over from the active RP to the standby RP is between zero and
three seconds.
Although the newly active processor takes over almost immediately following a switchover, the time
required for the device to begin operating again in full redundancy (SSO) mode can be several minutes,
depending on the platform. The length of time can be due to a number of factors including the time
needed for the previously active processor to obtain crash information, load code and microcode, and
synchronize configurations between processors.
On DFC-equipped switching modules, forwarding information is distributed, and packets forwarded
from the same line card should have little to no forwarding delay; however, forwarding packets between
line cards requires interaction with the RP, meaning that packet forwarding might have to wait for the
switchover time.
Fast Software Upgrade
Note During the upgrade process, different images will be loaded on the RPs for a short period of time. During
this time, the device will operate in RPR mode.
Core Dump Operation
Note Core dumps are generally useful only to your technical support representative. The core dump file, which
is a very large binary file, must be transferred using a TFTP, FTP, or remote copy protocol (rcp) server
and subsequently interpreted by a Cisco Technical Assistance Center (TAC) representative who has
access to source code and detailed memory maps.
SSO-Aware Features
A feature is SSO-aware if it maintains, either partially or completely, undisturbed operation through an
RP switchover. State information for SSO-aware features is synchronized from active to standby to
achieve stateful switchover for those features.
The dynamically created state of SSO-unaware features is lost on switchover and must be reinitialized
and restarted on switchover.
The output of the show redundancy clients command displays the SSO-aware features (see the
“Verifying SSO Features” section).
Either the SSO or RPR redundancy mode is always configured. The SSO redundancy mode is configured
by default. To revert to the default SSO redundancy mode from the RPR redundancy mode, perform
this task:
        Command                                       Purpose
Step 1  Router> enable                                Enables privileged EXEC mode (enter your
                                                      password if prompted).
Step 2  Router# configure terminal                    Enters global configuration mode.
Step 3  Router(config)# redundancy                    Enters redundancy configuration mode.
Step 4  Router(config-red)# mode sso                  Sets the redundancy configuration mode to SSO on
                                                      both the active and standby RP.
                                                      Note After configuring SSO mode, the standby
                                                      RP will automatically reset.
Step 5  Router(config-red)# end                       Exits redundancy configuration mode and returns
                                                      the switch to privileged EXEC mode.
Step 6  Router# copy running-config startup-config    Saves the configuration changes to the startup
                                                      configuration file.
Router(config-red)# end
Router# copy running-config startup-config
Router#
Troubleshooting SSO
• Possible SSO Problem Situations
• SSO Troubleshooting
Possible SSO Problem Situations
• The show redundancy states command shows an operating mode that is different than what is
configured on the networking device—On certain platforms the output of the show redundancy
states command displays the actual operating redundancy mode running on the device, and not the
configured mode as set by the platform. The operating mode of the system can change depending
on system events. For example, SSO requires that both RPs on the networking device be running the
same software image; if the images are different, the device will not operate in SSO mode, regardless
of its configuration.
For example, during the upgrade process different images will be loaded on the RPs for a short
period of time. If a switchover occurs during this time, the device will recover in RPR mode.
• Reloading the device disrupts SSO operation—The SSO feature introduces a number of commands,
including commands to manually cause a switchover. The reload command is not an SSO command.
This command causes a full reload of the box, removing all table entries, resetting all line cards, and
thereby interrupting network traffic forwarding. To avoid reloading the box unintentionally, use the
redundancy force-switchover command.
• During a software upgrade, the networking device appears to be in a mode other than SSO—During
the software upgrade process, the show redundancy command indicates that the device is running in
a mode other than SSO.
This is normal behavior. Until the FSU procedure is complete, each RP will be running a different
software version. While the RPs are running different software versions, the mode will change to
RPR. The device will change to SSO mode once the upgrade has completed.
• The previously active processor is being reset and reloaded before the core dump completes—Use
the crashdump-timeout command to set the maximum time that the newly active processor waits
before resetting and reloading the previously active processor.
• Issuing a “send break” does not cause a system switchover—This is normal operation. Using “send
break” to break or pause the system is not recommended and may cause unpredictable results. To
initiate a manual switchover, use the redundancy force-switchover command.
In Cisco IOS software, you can enter ROM monitor mode by restarting the switch and then pressing
the Break key or issuing a “send break” command from a Telnet session during the first 60 seconds
of startup. The send break function can be useful for experienced users or for users under the
direction of a Cisco Technical Assistance Center (TAC) representative to recover from certain
system problems or to evaluate the cause of system problems.
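For the first situation in the list above (the operating mode differs from the configured mode), a script that has already captured the text of the show redundancy states command could compare the two values. The Python sketch below is illustrative only. The sample text is not captured from a device, and the field labels "Redundancy Mode (Operational)" and "Redundancy Mode (Configured)" are assumptions that may need to be adjusted for your platform and release. The function names are hypothetical.

```python
import re

# Illustrative text only; the labels are assumed, not taken from a device.
SAMPLE_OUTPUT = """\
Redundancy Mode (Operational) = rpr
Redundancy Mode (Configured)  = sso
"""

_MODE_LINE = re.compile(
    r"^\s*Redundancy Mode \((Operational|Configured)\)\s*=\s*(.+?)\s*$",
    re.MULTILINE,
)


def redundancy_modes(text):
    """Return a dict such as {'operational': 'rpr', 'configured': 'sso'}.

    Only the labels that are present in the text appear in the result.
    """
    return {label.lower(): value.lower() for label, value in _MODE_LINE.findall(text)}


def mode_mismatch(text):
    """Return True when both values are found and they differ.

    If neither label is found, both lookups return None and the result
    is False, so missing data is not reported as a mismatch.
    """
    modes = redundancy_modes(text)
    return modes.get("operational") != modes.get("configured")
```

A mismatch is not necessarily a fault. As described above, the device operates in RPR mode while the RPs run different images, for example during a software upgrade.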
SSO Troubleshooting
The following commands may be used as needed to troubleshoot the SSO feature. These commands do
not have to be entered in any particular order.
Command                                                 Purpose
Router(config-red)# crashdump-timeout [mm | hh:mm]      Sets the longest time that the newly active RP will wait
                                                        before reloading the formerly active RP.
Router# debug redundancy {all | ui | clk | hub}         Debugs redundancy on the networking device.
Router# show diag [slot-number | chassis | subslot      Displays hardware information.
slot/subslot] [details | summary]
Router# show redundancy [clients | counters |           Displays the redundancy configuration mode of the RP. Also
debug-log | handover | history | switchover history |   displays information about the number of switchovers, system
states | inter-device]                                  uptime, processor uptime, and redundancy state, and reasons
                                                        for any switchovers.
Router# show version                                    Displays image information for each RP.
Compiled ...
BOOT = disk0:0726_c4,12
CONFIG_FILE =
BOOTLDR =
Configuration register = 0x2102
Router#
Router#