IM and Presence Server High Availability - Cisco PDF
Contents
Introduction
Prerequisites
Requirements
Components used
Manual Fallback
Automatic Fallback
Troubleshoot IM and Presence High Availability
Introduction
This document describes how Instant Message and Presence (IM&P) High Availability (also known as
Redundancy) works in an enterprise IM and Presence environment and how to troubleshoot it.
Prerequisites
Requirements
Cisco recommends that you have knowledge of these topics:
Cisco Unified IM&P
Cisco Jabber clients
Components used
Cisco Unified IM&P 10.0 and above
Cisco Jabber clients 9.6 and above
The information in this document was created from the components in a specific lab environment. All of the
components used in this document started with a cleared (default) configuration. If your network is live,
ensure that you understand the potential impact of any command.
IM and Presence offers high availability (redundancy) in the form of logical server groups in the CUCM
configuration. This configuration is passed to IM and Presence and then used to provide redundancy in
the event of an IM and Presence service or server failure. When an HA event takes place, the end users'
sessions are moved from the failed server to the backup. When the server has been restored to a normal
state, user sessions are moved back either automatically or manually by the administrator.
When the administrator adds the IM&P Publisher to the System > Server configuration on CUCM and the
IM&P server is saved, the DefaultCUPSubCluster redundancy group gets created with the Publisher assigned
to it.
When created, the Redundancy Group will look like this:
This Redundancy Group translates to the IM and Presence subcluster. With the Redundancy Group
configuration in CUCM in its current state, this is how it appears in the IM and Presence Cluster Topology
web page:
https://www.cisco.com/c/en/us/support/docs/unified-communications/unified-communications-manager-im-presence-service/200958-IM-and-Presence-… 2/8
9/4/2020 IM and Presence Server High Availability - Cisco
We see that the IM&P Publisher is assigned to the DefaultCUPSubcluster and the Subscriber server is not.
This is because the IM&P Subscriber server is not assigned to the Redundancy Group in the CUCM
configuration.
Assign the Subscriber to the Redundancy Group:
In order to assign the Subscriber server to the Redundancy Group, select the Subscriber server from
the drop-down list, then Save the configuration change.
After the secondary node (the Subscriber) is added, the High Availability option becomes available. In order
to enable High Availability, select the Enable High Availability checkbox and Save
the configuration change.
After High Availability is enabled:
The page auto-refreshes the server state and reason. When a server is in an initialization state, this
means that the two servers are able to communicate. The servers then verify service status before
the state transitions to a Normal state. If the two servers can connect to each other and all monitored
services are up on both, the result is a Normal-Normal state. This means that all monitored services
are active on both IM&P servers.
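The Normal-Normal condition described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this document, not the Server Recovery Manager implementation. The function name, the argument names, and the placeholder service names are all assumptions made for the example.

```python
from typing import Mapping


def is_normal_normal(nodes_connected: bool,
                     primary_services: Mapping[str, bool],
                     secondary_services: Mapping[str, bool]) -> bool:
    """Return True when both nodes can reach each other and every
    monitored service is reported as up on both nodes.

    The mappings go from a (hypothetical) service name to True if the
    service is up. An empty mapping counts as "nothing is down".
    """
    all_services_up = (all(primary_services.values())
                       and all(secondary_services.values()))
    return nodes_connected and all_services_up


# Placeholder service names, for illustration only.
publisher = {"service-a": True, "service-b": True}
subscriber = {"service-a": True, "service-b": False}

print(is_normal_normal(True, publisher, publisher))   # both nodes healthy
print(is_normal_normal(True, publisher, subscriber))  # one service down
```

In this sketch a single stopped monitored service on either node, or a loss of connectivity between the nodes, is enough to leave the Normal-Normal state.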
Normal-Normal Redundancy Group State:
In order for the user session to become fully active on the secondary IM&P node after a failover event, the
user must attempt to log in to that server via SOAP (Client Profile Agent). This happens automatically with the
one-time password that is passed from the IMDB database. Because logins are extremely expensive in terms of
resources on the IM and Presence server, there must be a way to throttle logins when a failover event
occurs. This throttle, or buffer, allows all failed-over users to log in to the secondary node without service disruption
for users already on the secondary node. The mechanism used to throttle user logins is a pair of
Server Recovery Manager (SRM) service parameters: Client Re-Login Lower Limit and Client Re-Login Upper Limit.
Client Re-Login Lower Limit - the parameter that defines the minimum amount of time (in seconds) that the
Jabber client waits before it attempts to log in to the secondary server when an HA event
occurs.
Client Re-Login Upper Limit - the parameter that defines the maximum amount of time (in seconds) that the
Jabber client waits before it attempts to log in to the secondary server when an HA event
occurs.
The Jabber client receives these parameters at login to the server and caches the values for future use.
When the client receives an HA event from the IM&P server, it chooses a random number of seconds
between the lower and upper limits and waits that amount of time before it attempts to log in
to the secondary node. Once the timer expires, the client attempts a SOAP login to the secondary node.
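The random wait described above can be illustrated with a short Python sketch. It is not Jabber client code. The function name is hypothetical, and the limit values used below are arbitrary examples rather than the product defaults. The sketch also assumes the limits are inclusive and that the delay is a whole number of seconds.

```python
import random
from typing import Optional


def pick_relogin_delay(lower_limit_s: int,
                       upper_limit_s: int,
                       rng: Optional[random.Random] = None) -> int:
    """Pick a random whole number of seconds between the two limits
    (both inclusive, an assumption) to wait before the re-login attempt."""
    if lower_limit_s < 0 or upper_limit_s < lower_limit_s:
        raise ValueError("limits must satisfy 0 <= lower <= upper")
    rng = rng or random.Random()
    return rng.randint(lower_limit_s, upper_limit_s)


# Example values only; substitute the limits configured on the server.
delays = [pick_relogin_delay(120, 600) for _ in range(5)]
print(delays)  # five independent delays, each between 120 and 600 seconds
```

Because each client draws its own delay, login attempts are spread across the window between the two limits instead of all arriving at the secondary node at once. A wider window spreads the load further, at the cost of a longer wait for some users.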
Manual Fallback
Manual fallback (the default configuration for Server Recovery Manager) takes place when service has been
restored and the redundancy group presents the Fallback button. When this button is selected, the user
sessions that were moved to the secondary node are moved back to their homed node. The Jabber client
applies the re-login lower and upper limits for the fallback.
Automatic Fallback
Automatic fallback takes place when the Server Recovery Manager (SRM) service, which monitors the
services, automatically falls users back to their homed nodes. The key in this configuration is
that the SRM service waits for a previously failed service/server to remain
active for 30 minutes before an automatic fallback is initiated. Once this 30-minute uptime is established, user sessions are
moved back to their homed nodes. The Jabber client applies the re-login lower and upper limits for the
fallback.
Automatic fallback is not the default configuration, but it can be enabled. To enable automatic fallback,
set the Enable Automatic Fallback parameter in the Server Recovery Manager Service Parameters to
True.
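The automatic fallback rule can be summarized in a small Python sketch: fallback is only considered when the feature is enabled, and only after the restored service/server has stayed up for 30 minutes. The function and argument names are hypothetical, and the sketch models only the timing rule stated in this document, not the SRM service itself.

```python
from datetime import datetime, timedelta

# Uptime the restored service/server must reach, as stated in this document.
REQUIRED_UPTIME = timedelta(minutes=30)


def automatic_fallback_due(auto_fallback_enabled: bool,
                           restored_at: datetime,
                           now: datetime) -> bool:
    """Return True when automatic fallback may start: the feature is
    enabled and the node has been continuously up for the required time."""
    if not auto_fallback_enabled:
        # Manual fallback remains the only option in this case.
        return False
    return now - restored_at >= REQUIRED_UPTIME


restored = datetime(2020, 9, 4, 10, 0)
print(automatic_fallback_due(True, restored, datetime(2020, 9, 4, 10, 15)))
print(automatic_fallback_due(True, restored, datetime(2020, 9, 4, 10, 30)))
```

If the service fails again before the 30 minutes have passed, `restored_at` would be reset on the next recovery in this model, so the wait starts over.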