Lotus Domino Cluster
Lotus Domino 6
Disclaimer THIS DOCUMENTATION IS PROVIDED FOR REFERENCE PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS DOCUMENTATION, THIS DOCUMENTATION IS PROVIDED AS IS WITHOUT ANY WARRANTY WHATSOEVER AND TO THE MAXIMUM EXTENT PERMITTED, IBM DISCLAIMS ALL IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION THE IMPLIED WARRANTIES OF MERCHANTABILITY, NONINFRINGEMENT AND FITNESS FOR A PARTICULAR PURPOSE, WITH RESPECT TO THE SAME. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION, DIRECT, INDIRECT, CONSEQUENTIAL OR INCIDENTAL DAMAGES, ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS DOCUMENTATION OR ANY OTHER DOCUMENTATION. NOTWITHSTANDING ANYTHING TO THE CONTRARY, NOTHING CONTAINED IN THIS DOCUMENTATION OR ANY OTHER DOCUMENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF THIS SOFTWARE. Copyright Under the copyright laws, neither the documentation nor the software may be copied, photocopied, reproduced, translated, or reduced to any electronic medium or machine-readable form, in whole or in part, without the prior written consent of IBM, except in the manner described in the documentation or the applicable licensing agreement governing the use of the software. Copyright IBM Corporation 1985, 2002 All rights reserved. Lotus Software IBM Software Group One Rogers Street Cambridge, MA 02142 US Government Users Restricted Rights Use, duplication or disclosure restricted by GS ADP Schedule Contract with IBM Corp. 
List of Trademarks 1-2-3, cc:Mail, Domino, Domino Designer, Freelance Graphics, iNotes, Lotus, Lotus Discovery Server, Lotus Enterprise Integrator, Lotus Mobile Notes, Lotus Notes, Lotus Organizer, LotusScript, Notes, QuickPlace, Sametime, SmartSuite, and Word Pro are trademarks or registered trademarks of Lotus Development Corporation and/or IBM Corporation in the United States, other countries, or both. AIX, AS/400, DB2, IBM, iSeries, MQSeries, Netfinity, OfficeVision, OS/2, OS/390, OS/400, S/390, Tivoli, and WebSphere are registered trademarks of International Business Machines Corporation in the United States, other countries, or both. Pentium is a trademark of Intel Corporation in the United States, other countries, or both. Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. All other trademarks are the property of their respective owners.
Contents
Preface

1 Cluster Benefits and Requirements
What is a Domino cluster?
How do clusters help you?
Clustering requirements

2 How Domino Clustering Works
Clustering basics
How failover works
When failover occurs
How workload balancing works
The cluster components
The Cluster Manager
The Cluster Database Directory
The Cluster Database Directory Manager
The Cluster Administrator
The Cluster Replicator
How replication works in a cluster
Replication history in a cluster
Private folder replication in a cluster
Mail failover in a cluster
How calendars work in a cluster
How operating system clusters work
Benefits of using OS clusters with Domino clusters

3 Planning a Cluster
Determining how many servers to include in a cluster
Determining the number and placement of replicas in a cluster
How many replicas to create
Distributing databases in a cluster
Distributing mail databases
Distributing application databases
Determining whether to create a private LAN for your cluster
Clustering over a wide area network
Fault recovery in a cluster
Examples of cluster configurations
Example of clustering two servers for mail and applications
Example of clustering three servers for mail and applications
Example of clustering six servers for mail and applications
Examples of clustering hub servers
Example of clustering for disaster preparedness
Example of clustering partitioned servers
Example of clustering passthru servers
Example of using a Domino cluster with an operating system cluster

Running Cluster Analysis
Managing user access to databases
Setting up database ACLs in a cluster
Controlling other settings that restrict database access
Displaying a list of cluster members and their availability
Enabling statistic reporting in the Monitoring Results database
Enabling statistic report generation
Creating a server statistic collection
Starting the Statistic Collector task
Viewing Cluster Manager events and statistics
Viewing failover and workload balancing events
Using standard replication in a cluster
Scheduled replication in a cluster
Replicating with all servers in a cluster
Enabling the display of cluster replication status messages
Obeying database size quotas during cluster replication
Viewing cluster replication events
Viewing cluster replication statistics
Using Tell commands to display cluster replication information
About the Tell commands for cluster replication
Viewing the cluster replication information
Setting up existing users for roaming in a cluster
Setting up cluster access for mobile users
Using the Server Web Navigator in a cluster
Monitoring all the servers in a cluster at the same time
Balancing the workload in a cluster
Limiting the workload of a server
Moving a server from one cluster to another
Viewing information in the Cluster Database Directory
Creating a new Cluster Database Directory
Generating URLs that refer to the ICM
Planning to use the ICM
Planning the cluster
Planning the location of the ICM
How the Internet Cluster Manager works
Example of a single ICM outside the cluster
Example of multiple ICMs outside the cluster
Example of a single ICM inside the cluster
Forcing the Cluster Replicator to update the Cluster Database Directory information immediately
Forcing the Cluster Replicator to log immediately
Creating configuration settings for all servers in a cluster
Configuring the ICM
To configure the ICM
Starting the ICM
Failover and workload balancing
When a server fails
Security
Managing and monitoring the ICM
Viewing the log file
Viewing ICM statistics
Monitoring all the ICM servers at the same time
Setting up a separate IP address for the ICM
Example of one ICM outside the cluster and one ICM inside the cluster
Compatibility with previous releases of Domino
Using an IP sprayer with Domino for HTTP and POP3 failover
Setting up failover for IMAP
Preface
The documentation for IBM Lotus Notes, IBM Lotus Domino, and IBM Lotus Domino Designer is available online in Help databases and, with the exception of the Notes client documentation, in print format.
License information
Any information or reference related to license terms in this document is provided to you for your information. However, your use of Notes and Domino, and any other IBM program referenced in this document, is solely subject to the terms and conditions of the IBM International Program License Agreement (IPLA) and related License Information (LI) document accompanying each such program. You may not rely on this document should there be any questions concerning your right to use Notes and Domino. Please refer to the IPLA and LI for Notes and Domino that is located in the file LICENSE.TXT.
System requirements
Information about the system requirements for Lotus Notes and Domino is listed in the Release Notes.
Related information
In addition to the documentation that is available with the product, other information about Notes and Domino is available on the Web sites listed here. IBM Redbooks are available at www.redbooks.ibm.com.
A technical journal, discussion forums, demos, and other information are available on the Lotus Developer Domain site at www-10.lotus.com/ldd.
Table of conventions
This table lists conventions used in the Notes and Domino documentation.
Convention         Description
italics            Variables and book titles are shown in italic type.
monospaced type    Code examples and console commands are shown in monospaced type.
uppercase          File names are shown in uppercase, for example NAMES.NSF.
hyphens            Hyphens are used between menu names, to show the sequence of menus.
Description
Describes how to upgrade existing Domino servers and Notes clients to Notes and Domino 6. Also describes how to move users from other messaging and directory systems to Notes and Domino 6.
Describes how to plan a Domino installation; how to configure Domino to work with network protocols such as Novell SPX, TCP/IP, and NetBIOS; how to install servers; and how to install and begin using Domino Administrator and the Web Administrator.
Describes how to register and manage users and groups, and how to register and manage servers including managing directories, connections, mail, replication, security, calendars and scheduling, activity logging, databases, and system monitoring. This book also describes how to use Domino in a service provider environment, how to use Domino Off-Line Services, and how to use IBM Tivoli Analyzer for Lotus Domino.
Describes how to set up, manage, and troubleshoot Domino clusters.
Documentation for Domino Designer
The following table describes the books that comprise the Domino Designer documentation set. The information in these books is also found online in the Lotus Domino Designer 6 Help database (HELP6_DESIGNER.NSF) with one exception: Domino Enterprise Connection Services (DECS) Installation and User Guide is available online in a separate database, DECS User Guide Template (DECSDOC6.NSF). The printed documentation set also includes Domino Objects posters.
In addition to the books listed here, the Domino Designer Templates Guide is available for download in NSF or PDF format. This guide presents an in-depth look at three commonly used Designer templates: TeamRoom, Discussion, and Documentation Library.
Title: Application Development with Domino Designer
Description: Explains how to create all the design elements used in building Domino applications, how to share information with other applications, and how to customize and manage applications.

Title: Domino Designer Programming Guide, Volume 1: Overview and Formula Language
Description: Introduces programming in Domino Designer and describes the formula language.

Title: Domino Designer Programming Guide, Volumes 2A and 2B: LotusScript/COM/OLE Classes
Description: Describes the LotusScript/COM/OLE classes for access to databases and other Domino structures.

Title: Domino Designer Programming Guide, Volume 3: Java/CORBA Classes
Description: Provides reference information on using the Java and CORBA classes to provide access to databases and other Domino structures.

Title: Domino Designer Programming Guide, Volume 4: XML Domino DTD and JSP Tags
Description: Describes the XML and JSP interfaces for access to databases and other Domino structures.

Title: LotusScript Language Guide
Description: Describes the LotusScript programming language.

Title: Domino Enterprise Connection Services (DECS) Installation and User Guide
Description: Describes how to use Domino Enterprise Connection Services (DECS) to access enterprise data in real time.

Title: Lotus Connectors and Connectivity Guide
Description: Describes how to configure Lotus Connectors for use with either DECS or IBM Lotus Enterprise Integrator for Domino (LEI). It also describes how to test connectivity between DECS or LEI and an external system, such as DB2, Oracle, or Sybase. Lastly, it describes usage and feature options for all of the base connection types that are supplied with LEI and DECS. This online documentation file name is LCCON6.NSF.

Description: Describes how to use the LC LSX to programmatically perform Lotus Connector-related tasks outside of, or in conjunction with, either LEI or DECS. This online documentation file name is LSXLC6.NSF.

Description: Describes installation, configuration, and migration information and instructions for LEI. The online documentation file names are LEIIG.NSF and LEIIG.PDF. This document is for LEI customers only and is supplied with LEI, not with Domino.

Title: IBM Lotus Enterprise Integrator for Domino (LEI) Activities and User Guide
Description: Provides information and instructions for using LEI and its activities. The online documentation file names are LEIDOC.NSF and LEIDOC.PDF. This document is for LEI customers only and is supplied with LEI, not with Domino.
Workload balancing
When users try to access databases on heavily used servers, Domino can redirect the user requests to other cluster servers that aren't as busy so that the workload is evenly distributed across the cluster. Workload balancing of cluster servers helps your system achieve optimum performance, which leads to faster data access.

Scalability
As the number of users you support increases, you can easily add servers to a cluster to keep server performance high. You can also create multiple database replicas to maximize data availability, and you can move users to other servers or clusters as you plan for future growth. As your enterprise grows, you can distribute user accounts across clusters and balance the additional workload to optimize system performance within a cluster.

Data synchronization
A key to effective clustering is setting up replicas on two or more cluster servers so that users have access to data when a server is down or is being used heavily. Cluster replication ensures that all changes, whether to databases or to the cluster membership itself, are immediately passed to other databases or servers in the cluster. Thus, databases are continuously synchronized to provide high availability of information.

Analysis tools
Using the cluster analysis tools, as well as the log file, the Monitoring Configuration and Monitoring Results databases, and the server monitor, you can analyze cluster activity and make any changes necessary to improve performance.

Ease of changing operating systems, hardware, or versions of Domino
When you want to change your hardware, operating system, or Domino release, you can mark the clustered server as RESTRICTED so that requests to access a database on the server fail over to other cluster servers that contain replicas. This lets you make changes without interrupting the productivity of your users.

Data backup and disaster planning
You can set up a cluster server as a backup server to protect crucial data. You can prevent users from accessing the server, but cluster replication keeps the server updated at all times. You can even do this over a WAN so that the backup is in a different geographical location.

Easy administration
You can create a cluster with a few keystrokes. You can also add servers to a cluster, remove servers from a cluster, and move servers between clusters with a few keystrokes. In addition, you can drag and drop databases into a cluster and specify which cluster servers should receive replicas. You can also create multiple mail replicas and roaming file replicas for users when you register them, and you can monitor all the servers in a cluster simultaneously.

Use of any hardware and operating system that Domino supports
You can set up a cluster using the same hardware you use for your Domino servers. You do not need to use special hardware to create a Domino cluster. In addition, the cluster can contain servers that use any operating system that Domino supports.
Clustering requirements
Server requirements
All servers in a cluster must run one of the following: the Lotus Domino 6 Enterprise server, the Lotus Domino 6 Utility server, the Domino Release 5 or Domino Release 4.62 Enterprise server, or the Domino Release 4.6 or Domino Release 4.5 Advanced Services server.
Note Earlier releases of Domino do not have access to features that are new in Domino 6.
All servers in a cluster must be connected using a high-speed local area network (LAN) or a high-speed wide area network (WAN). You can also set up a private LAN for cluster traffic.
All servers in a cluster must use TCP/IP and be on the same Notes named network.
All servers in a cluster must be in the same Domino domain and share a common Domino Directory.
You must specify an administration server for the Domino Directory in the domain that contains the cluster. If you do not specify an administration server, the Administration Process cannot change cluster membership. The administration server does not have to be a member of a cluster.
Each server in the cluster must have a hierarchical server ID. If any servers have flat IDs, you must convert them to hierarchical IDs to use them in a cluster.
A server can be a member of only one cluster at a time.
Each server must have adequate disk space to function as a cluster member. Because clusters usually require more database replicas, servers in clusters require more disk space than unclustered servers.
Each server must have adequate processing power and memory capacity. In general, clustered servers require more computer power than unclustered servers.
For more information on determining the resources you need to set up a cluster, see the chapter Planning a Cluster.

Client requirement
Notes clients must run Notes Release 4.5 or later to take advantage of the cluster failover feature.
Clustering basics
All the servers in a Domino cluster continually communicate with each other to keep updated on the status of each server and to keep database replicas synchronized. Each server in a cluster contains cluster components that are installed with the Lotus Domino 6 Enterprise server or the Lotus Domino 6 Utility server. These components, and the Administration Process, perform the cluster management and monitoring tasks that run the cluster and let you administer the cluster. The components keep replica databases synchronized, and they communicate with each other to ensure that the cluster is running efficiently and smoothly. They also let you set limits for workload balancing, track the availability of servers and databases, and add servers and databases to the cluster.

To take advantage of failover and workload balancing, you distribute databases and replicas throughout the cluster. You do not need a replica of every database on every server. The number of replicas you create for a database depends on how busy the database is and how important it is for users to have constant access to that database. For some databases, you may not need to create any replicas; for others, you may need to create multiple replicas.

For information about deciding how many replicas to create, see the chapter Planning a Cluster.
[Figure: failover example. Labels: Server 1, Server 2, Server 3, Server Not Responding; databases DB1, DB2, DB3.]
2-2 Administering Domino Clusters

1. A Notes user attempts to open a database on Server 1.
2. Notes realizes that Server 1 is not responding.
3. Instead of displaying a message that says the server is not responding, Notes looks in its cluster cache to see if this server is a member of a cluster and to find the names of the other servers in the cluster. (When a Notes client first accesses a server in a cluster, the names of all the servers in the cluster are added to the cluster cache on the client. This cache is updated every 15 minutes.)
4. Notes accesses the Cluster Manager on the next server listed in the cluster cache.
5. The Cluster Manager looks in the Cluster Database Directory to find which servers in the cluster contain a replica of the desired database.
6. The Cluster Manager looks in its server cluster cache to find the availability of each server that contains a replica. (The server cluster cache contains information about all the servers in the cluster. Cluster servers obtain this information when they send probes to the other cluster servers.)
7. The Cluster Manager creates a list of the servers in the cluster that contain a replica of the database, sorts the list in order of availability, and sends the list to Notes.
8. Notes opens the replica on the first server in the list (the most available server). If that server is no longer available, Notes opens the replica on the next server in the list. In this example, Server 2 was the most available server.

When the Notes client shuts down, it stores the contents of the cluster cache in the file CLUSTER.NCF. Each time the client starts, it populates the cluster cache from the information in CLUSTER.NCF.
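The failover sequence described above can be sketched in a few lines of Python. This is an illustrative model only, not Domino code: the `ClusterManager` class, the `open_with_failover` function, the `is_up` callback, and the numeric availability values are all hypothetical names and data chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ClusterManager:
    """Hypothetical stand-in for the Cluster Manager on one cluster server."""

    # Models the Cluster Database Directory: database -> servers with a replica.
    database_directory: dict
    # Models the server cluster cache: server -> availability (higher is better).
    server_cache: dict

    def replicas_by_availability(self, database):
        """Steps 5-7: find the replicas and sort them, most available first."""
        servers = self.database_directory.get(database, [])
        known = [s for s in servers if s in self.server_cache]
        return sorted(known, key=lambda s: self.server_cache[s], reverse=True)


def open_with_failover(database, target, cluster_cache, managers, is_up):
    """Return the server whose replica gets opened, or None if none is found.

    cluster_cache models the client-side cluster cache (a list of server names).
    managers maps a server name to its ClusterManager.
    is_up is a callback that reports whether a server responds.
    """
    if is_up(target):
        return target                      # Step 1 succeeds, no failover needed.
    if target not in cluster_cache:
        return None                        # Step 3: not a known cluster member.
    for server in cluster_cache:           # Step 4: next responding cluster server.
        if server == target or not is_up(server):
            continue
        for candidate in managers[server].replicas_by_availability(database):
            if is_up(candidate):           # Step 8: first available replica wins.
                return candidate
        return None
    return None


if __name__ == "__main__":
    directory = {"DB1": ["Server1", "Server2", "Server3"]}
    cache = {"Server2": 90, "Server3": 60}
    managers = {name: ClusterManager(directory, cache) for name in cache}
    responding = {"Server2", "Server3"}
    print(open_with_failover("DB1", "Server1",
                             ["Server1", "Server2", "Server3"],
                             managers, lambda s: s in responding))
```

In this model the client only needs the list of cluster members; the decision about which replica is most available stays with the Cluster Manager, which mirrors the division of work in the steps above.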
When a server or database is not available, failover occurs when a user attempts to use Notes to perform certain actions. The following table describes the actions that trigger failover.
Category: Database open operations
Actions that trigger failover:
Opening a database from a bookmark
Clicking a document link, a view link, or a database link
Using Domain Search when a clustered indexing server is unavailable
Accessing roaming files when a clustered roaming server is unavailable
Activating a field, action, or button that contains @Command([FileOpenDatabase])
Running a LotusScript routine that contains the OpenWithFailover method of the NotesDatabase class
Using Java that contains the OpenDatabase method of the DbDirectory class
Replicating with a database on a cluster server that is not running or not reachable on the network

Category: Mail server related operations
Actions that trigger failover:
Sending mail
Name lookups
Type-ahead
Routing mail messages
Mail predelivery agents
Meeting invitations
Free time lookups
Server lookups

Category: Web server operations
Actions that trigger failover:
Selecting the Open URL icon
Clicking a URL hotspot
Accessing a URL with a Web browser
When failover does not occur
Failover does not occur in the following cases:
When a server becomes unavailable while a user has a database open
Note The user can reopen the database, which causes failover to a different replica, if one exists in the cluster. If the user was editing a document when the server became unavailable, the user can copy the document to the replica.
When a user chooses File - Database - Properties or File - Database - Open
When the router attempts to deliver mail when you've disabled mail routing failover or set MailClusterFailover to 0
When the template server is unavailable while creating a new database
When a server fails while running agents, other than the mail predelivery agents
When a server fails while running the Administration Process
When replicating with a server that is restricted by the administrator or has reached the maximum number of users or the maximum usage level set by the administrator. Also, when replicating with a database marked Out of Service. Replication occurs regardless of such restrictions, so there is no need for failover to occur.
users you want to access a server. When the server reaches this limit, users are redirected to another server. This keeps the workload balanced and keeps the server working at optimum performance.

When a user tries to open a database on a BUSY server, the Cluster Manager looks in the Cluster Database Directory for a replica of that database. It then checks the availability of the servers that contain a replica and redirects the user to the most available server. If no other cluster server contains a replica or if all cluster servers are BUSY, the original database opens, even though the server is BUSY.

Example
This example describes how Domino performs workload balancing. This cluster contains three servers. Server 2 is currently BUSY because the workload has reached the availability threshold that the administrator set for this server. The Cluster Managers on Server 1 and Server 3 are aware that Server 2 is BUSY.
[Figure: workload balancing example. Labels: Notes Client, Server 3, DB2.]
1. A Notes user attempts to open a database on Server 2.
2. Domino sends Notes a message that the server is BUSY.
3. Notes looks in its cluster cache to find the names of the other servers in the cluster.
4. Notes accesses the Cluster Manager on the next server listed in the cluster cache.
5. The Cluster Manager looks in the Cluster Database Directory to find which servers in the cluster contain a replica of the desired database.
6. The Cluster Manager looks in its server cluster cache to find the availability of each server that contains a replica.
7. The Cluster Manager creates a list of the servers in the cluster that contain a replica of the database, sorts the list in order of availability, and sends the list to Notes.
8. Notes opens the replica on the first server in the list (the most available server). If that server is no longer available, Notes opens the replica on the next server in the list.

For information about deciding how many replicas to create, see the chapter Planning a Cluster.
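The redirect rule for a BUSY server, including the fallback to the original database when no better replica exists, can be illustrated with a short Python sketch. The function name `choose_server`, the state strings, and the availability numbers are assumptions made for this example and are not part of Domino.

```python
def choose_server(requested, replica_servers, state, availability):
    """Pick the server that should open the database.

    requested        -- server the user tried to open the database on
    replica_servers  -- servers that hold a replica (hypothetical directory data)
    state            -- server -> "AVAILABLE" or "BUSY"
    availability     -- server -> availability value (higher is more available)
    """
    if state[requested] != "BUSY":
        return requested
    # Consider only other servers that hold a replica and are not BUSY.
    alternatives = [s for s in replica_servers
                    if s != requested and state[s] != "BUSY"]
    if not alternatives:
        # No replica elsewhere, or every replica server is BUSY:
        # the original database opens even though the server is BUSY.
        return requested
    return max(alternatives, key=lambda s: availability[s])


if __name__ == "__main__":
    state = {"Server1": "AVAILABLE", "Server2": "BUSY", "Server3": "AVAILABLE"}
    availability = {"Server1": 70, "Server3": 85}
    print(choose_server("Server2", ["Server2", "Server3"], state, availability))
```

The sketch keeps the fallback explicit because it is the part that is easiest to overlook: a BUSY state redirects users when possible, but it never blocks access to the only usable replica.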
These components are described in the following sections, except the Internet Cluster Manager, which is described in the chapter Clustering Domino Servers That Run Internet Protocols.
The Cluster Manager then informs the client which servers contain a replica and the availability of those servers. This lets the client redirect the request to the most available server that contains a replica.

The tasks of the Cluster Manager include:
Determining which servers belong to the cluster. It does this by periodically monitoring the Domino Directory for changes to the ClusterName field in the Server document and the cluster membership list.
Monitoring server availability and workload in the cluster.
Informing other Cluster Managers of changes in server availability.
Informing clients about available replicas and availability of cluster servers so the clients can redirect database requests based on the availability of cluster servers (failover).
Balancing server workloads in the cluster based on the availability of cluster servers.
Logging failover and workload balance events in the server log file.
When it starts, the Cluster Manager checks the Domino Directory to determine which servers belong to the cluster. It maintains this information in memory in the server's Cluster Name Cache. The Cluster Manager uses this information to exchange probes with other Cluster Managers. The Cluster Manager also uses the Cluster Name Cache to store the availability information it receives from these probes. This information helps the Cluster Manager perform the functions listed above, such as failover and workload balancing. To view the information in the Cluster Name Cache, type show cluster at the server console.
changes to other servers. Periodically (every 15 seconds by default), the Cluster Replicator checks for changes in the Cluster Database Directory. When the Cluster Replicator detects a change in the Cluster Database Directory (for example, an added or deleted database or a database that now has Cluster Replication disabled), it updates the information it has stored in memory. The Cluster Replicator pushes changes to servers in the cluster only. The standard replicator task (REPLICA) replicates changes to and from servers outside the cluster.
The Cluster Replicator leaves the processing of replication formulas to the standard replicator. Because these formulas can use a lot of processing power, the Cluster Replicator does not process them, which minimizes the overhead of cluster replication. If you use selective replication, therefore, a database may temporarily include documents that do not match the selection formula. Domino deletes these documents when you run standard replication.

In addition, the Cluster Replicator does not honor the settings on the Advanced panel in the Replication Settings dialog box. Therefore, you cannot disable the replication of specific elements of a database, such as the ACL, agents, and design elements. The Cluster Replicator always attempts to make all replicas identical so that users who fail over do not notice that they failed over.

Caution Standard replication cannot automatically remove changes to specific database elements, such as the ACL, agents, or design elements. If limiting the replication of these items is important for a database, consider using only standard replication, not cluster replication, with that database.
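The interaction between cluster replication and a selection formula can be pictured with a toy Python model. It is a deliberately simplified sketch: databases are plain dictionaries, the selection formula is a Python predicate rather than Notes formula language, and the function names `cluster_push` and `standard_replication_cleanup` are hypothetical.

```python
def cluster_push(source, replica):
    """Model the Cluster Replicator: copy every document, ignoring any formula."""
    replica.update(source)


def standard_replication_cleanup(replica, matches_formula):
    """Model standard replication removing documents the formula excludes."""
    excluded = [doc_id for doc_id, doc in replica.items()
                if not matches_formula(doc)]
    for doc_id in excluded:
        del replica[doc_id]


if __name__ == "__main__":
    source = {1: {"Region": "East"}, 2: {"Region": "West"}}
    replica = {}

    cluster_push(source, replica)
    # The replica temporarily holds a document the formula would exclude.
    print(sorted(replica))

    standard_replication_cleanup(replica, lambda doc: doc["Region"] == "East")
    # After standard replication runs, only matching documents remain.
    print(sorted(replica))
```

The model shows why a selectively replicated database can briefly contain extra documents in a cluster, and why scheduling standard replication alongside cluster replication removes them.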
For more information about shared mail, see the book Administering the Domino System.
The following example shows a basic configuration for active-passive operating system clustering. A Domino server runs on Node 1. Node 2 monitors Node 1 and waits for a failure to occur.
[Figure: active-passive operating system cluster before a failure. Labels: Node 1, Node 2, Monitor, Data.]
When a failure occurs, Node 2 picks up the resources of Node 1 and takes over running the Domino server. Node 2 uses the same disk set and the same IP address for the Domino server that Node 1 used.
[Figure: active-passive operating system cluster after Node 1 fails. Labels: Node 1, Node 2, Monitor, Data.]
To run Domino in an active-active cluster, you must use Domino partitioned servers on the nodes. Doing so lets each node take over the tasks of the other node while also maintaining its own tasks. The following example shows a basic configuration for active-active operating system clustering. Node 1 and Node 2 each have Domino running in the first partition. The second partition on each node duplicates the resources of the first partition on the other node. Each node has its own disk set, but both nodes have access to both disk sets in case failover occurs. The nodes monitor each other.
[Figure: active-active operating system cluster before a failure. Labels: Node 1, Node 2, Data 1, Data 2.]
When Node 1 fails, Node 2 picks up the resources of Node 1 and runs the Domino servers for both nodes.
[Figure: active-active cluster after Node 1 fails. Node 2 runs the Domino servers for both nodes, using Data 1 and Data 2.]
To use an active-active configuration, you must be sure that each node can handle the load of the other node if failover occurs.
For these features, it is a good idea to set up an active-passive operating system cluster to run in conjunction with the Domino cluster.
After the cluster is up and running, you can further balance the workload by setting a maximum number of users for each server and setting the availability threshold in a way that does not allow any server to become overloaded. You can track the cluster statistics to determine whether you need to make any changes to the cluster setup. For more information about balancing the workload, see the chapter Managing and Monitoring a Cluster.
when it probes other cluster servers to find out their status and when it does cluster replication. Therefore, do not add servers to a cluster until you need the additional capacity or additional redundancy.
In a larger organization, you must decide whether to create large clusters or small clusters. A larger cluster is better able to absorb the workload when a cluster server fails. If you have a cluster with only two servers, for example, and one of the servers fails, the other server must absorb 100% of the failed server's workload. That means that you could run each server at only 50% of its capacity so that it has enough capacity available to absorb the workload of the other server. If the cluster has six servers, however, each of the remaining five servers must absorb only 20% of the failed server's workload. That means you could run each server at 80% of capacity, and they would still be able to absorb the workload if a server goes down. (Of course, there are other factors that determine how the workload of a failed server is absorbed, such as the way you have distributed replicas across the cluster servers.)
Hardware considerations
The number of servers you decide to include in a cluster can be affected by the amount of disk space and the processing power of each server. Keep the following in mind as you decide which hardware to use in your cluster:
The more replicas you create, the more disk space you need and the more processing power you need for cluster replication.
The Cluster Database Directory requires approximately 2MB of disk space plus an additional 1MB for each 2,000 databases in the cluster.
The more servers in the cluster, the more processing power each server uses to communicate with the other cluster servers.
The more server tasks and CPU-intensive applications you run on a server, the more processing power you need.
Each server needs adequate processing power for the databases it contains and for any databases that might fail over to the server.
Clustered servers require more memory than nonclustered servers. The actual amount you will require depends on the level of activity on the server. To see if you need additional memory or processing power on your computer, check the Platform statistics. For information about the Platform statistics, see the book Administering the Domino System.
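The capacity figures and the Cluster Database Directory sizing rule above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, it assumes that a failed server's workload is split evenly among the surviving servers, and it treats the 1MB-per-2,000-databases rule as a proportional estimate.

```python
def share_per_survivor(n_servers: int) -> float:
    """Fraction of a failed server's workload that each surviving server
    absorbs, assuming the workload is split evenly (an assumption)."""
    if n_servers < 2:
        raise ValueError("a cluster needs at least two servers for failover")
    return 1.0 / (n_servers - 1)


def load_after_failover(n_servers: int, utilization: float) -> float:
    """Utilization of each surviving server after one server fails, when
    every server was running at `utilization` (0.0 to 1.0) beforehand."""
    return utilization * (1.0 + share_per_survivor(n_servers))


def cldbdir_size_mb(n_databases: int) -> float:
    """Approximate Cluster Database Directory size: about 2MB plus
    1MB for each 2,000 databases in the cluster."""
    return 2.0 + n_databases / 2000.0


if __name__ == "__main__":
    # Two servers at 50% each: the survivor ends up fully loaded.
    print(load_after_failover(2, 0.50))
    # Six servers at 80% each: each survivor absorbs 20% of the failed load.
    print(share_per_survivor(6), load_after_failover(6, 0.80))
    # Directory size estimate for a cluster with 10,000 databases.
    print(cldbdir_size_mb(10000))
```

Treat the result as a starting point only, because the real distribution of failover load depends on where replicas are placed.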
When you have a large cluster or a cluster with a heavy workload, you might need to use multiple Cluster Replicators to improve the performance of cluster replication. Check the Cluster Replicator statistics to see if there is a large queue of databases waiting to be processed. If so, add Cluster Replicators one at a time until the statistics improve adequately. Because Cluster Replicators use system resources, the overall performance of the system could decrease as you add Cluster Replicators. Therefore, do not add more Cluster Replicators than you need. For more information about using multiple Cluster Replicators, see the chapter Managing and Monitoring a Cluster.
has significantly more or less processing power than the other servers, consider changing the number of databases on the server and the number of databases that can fail over to the server. Also, distribute mail files across a cluster, or set up separate servers or separate clusters for mail.
Because busy databases in a cluster can create a lot of replication events, it is a good idea to install these replicas on the fastest disk hardware available in the cluster. If possible, place these replicas where other processes are not in contention; for example, on a partition other than the one that contains the operating system swap file.
To view which databases and replicas already exist in the cluster, open the Cluster Database Directory (CLDBDIR.NSF). It contains a document that stores information about each database and replica in a cluster.
Note Selective replication formulas work differently in a cluster. For more information about selective replication in a cluster, see the chapter How Domino Clustering Works.
Consider the power and bandwidth of your system when creating replicas. The busier a database is, the more network traffic and processing power it takes to keep replicas updated. If you have systems with limited power and bandwidth, you may want to create fewer replicas of busy databases than you would if you had more power and bandwidth, or you may want to add more processors and other resources to the servers. In a cluster with limited resources, creating replicas of busy databases can be counterproductive because of the additional resources needed for cluster replication. (Clustering is not a solution for inadequate resources.) The less busy a database is, however, the less overhead it takes to keep that database updated. If you aren't sure how many replicas to create, start with one and track the cluster statistics. If the statistics show that the server becomes unavailable or that performance becomes a problem, increasing the number of replicas may solve the problem. Do not create replicas of databases for which availability or workload balancing is not one of your goals.
Analyzing databases to determine the number of replicas
There are many factors to consider when deciding how many replicas to create. Some factors suggest creating more replicas, and some suggest creating fewer replicas. Below is a list of factors and how they might affect your cluster traffic and performance. Prior to distributing databases in a cluster, it can be helpful to create a table of information about the databases and the cluster hardware. You can use the table to determine how important specific databases are and how adequate your resources are. You can include some or all of the following in the table:
Titles of the databases
This identifies each database.
Size of each database
Large databases consume a lot of disk space. Depending on your disk capacity, you may want to create fewer replicas of larger databases to preserve disk space.
Number and distribution of database users
If you have a large number of users, they will probably experience better performance if usage is spread across multiple servers. This requires multiple replicas. If the number of users is small, they probably won't notice a performance improvement from additional replicas.
How often user transactions take place
If the transaction rate is high, creating multiple replicas may improve performance. To find out the rate of activity for a database, look in the Notes log file.
Expected volume of new data
If you expect a large amount of new data in the database, additional replicas may slow down performance because cluster replication will cause a lot of additional traffic. If you have powerful servers and a lot of bandwidth, this may not create a problem.
Capacity of Domino server hardware
The more powerful the servers and the more disk space they have, the more active replicas you can create without significantly affecting performance.
Type of network connections between servers
Cluster replication can create a bottleneck on a network that does not have enough bandwidth. Therefore, the greater the bandwidth, the more replicas you can create.
How critical the database is to the functioning of your business
For databases that are mission-critical, you should create multiple replicas. For databases where availability is less important, create fewer replicas or none at all.
Example table
When you create a table of database information, include the factors that are most important to you. The following table uses a subset of the preceding information to determine the number of replicas needed.
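[Table: Database title | Size | Maximum concurrent users | Transaction rate | Growth rate | Need for availability | Suggested number of replicas. Surviving values: maximum concurrent users 600, 200, 20, 50; suggested number of replicas 2, 2 or more, 0 or 1, 0. Other surviving cell values, in extraction order: High, Medium, Low, Medium, High, High, High, Critical.]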
This table helps identify which databases require high availability, which databases are busiest, and how much additional disk space you will need in the future. In this example, two databases are very important and are growing rapidly. You should be sure that there are enough replicas of these
databases so that they are always available. You should also be sure there is adequate disk space for growth on every server that contains a replica of these databases. One database is of medium importance, not growing as quickly, and not very active. You should provide no more than one replica of this database, unless it would affect your business negatively if the database was not available for a while. One database is not very important and does not require a replica in the cluster. The number of concurrent users helps you determine the need for workload balancing. In this example, two databases are very busy and both are very important. Therefore, you should consider placing these databases on different servers to balance the workload. You should also be sure that workload balancing parameters are set on the servers that contain these databases so that users will fail over to another server when these databases become busy. For more information about managing workload balancing, see the chapter Managing and Monitoring a Cluster.
increasing each server's workload by 33%. You might be tempted to place all 300 replicas from Server 1 onto Server 2 and all the replicas for Server 3 onto Server 4. In such a case, however, if Server 1 fails, all 300 users fail over to Server 2, increasing the workload on Server 2 by 100% but not increasing the workload on Server 3 or Server 4 at all. The following figure shows a mail cluster that contains four servers with 300 mail databases on each server. Replicas of the mail databases are evenly distributed among all the other servers in the cluster, keeping the workload of the other servers as low as possible, even when failover occurs.
[Figure: four-server mail cluster with 300 users on each mail server (Mail Server 2 and Mail Server 3 are labeled); 100 users fail over to each of the remaining servers.]
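The even distribution described above can be sketched in Python. This is a minimal illustration, not a Domino tool: the function names and server labels are hypothetical, and it assumes every server hosts the same number of mail users.

```python
def spread_replicas(servers, users_per_server):
    """For each home server, split its mail replicas as evenly as possible
    across all the other servers. Returns {home: {other: replica_count}}."""
    if len(servers) < 2:
        raise ValueError("need at least two servers to place replicas")
    plan = {}
    for home in servers:
        others = [s for s in servers if s != home]
        base, extra = divmod(users_per_server, len(others))
        # The first `extra` servers take one more replica when the
        # division is not exact.
        plan[home] = {
            s: base + (1 if i < extra else 0) for i, s in enumerate(others)
        }
    return plan


def added_load_percent(plan, failed, users_per_server):
    """Percentage workload increase on each surviving server when
    `failed` goes down and its users fail over to their replicas."""
    return {
        server: 100.0 * count / users_per_server
        for server, count in plan[failed].items()
    }


if __name__ == "__main__":
    servers = ["Server 1", "Server 2", "Server 3", "Server 4"]
    even = spread_replicas(servers, 300)
    print(even["Server 1"])
    print(added_load_percent(even, "Server 1", 300))
    # The tempting but unbalanced layout: all replicas on one partner.
    paired = {"Server 1": {"Server 2": 300}}
    print(added_load_percent(paired, "Server 1", 300))
```

With the even plan, each survivor takes on about a third more work; with the paired plan, one server takes on the entire extra load.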
The following figure shows a mail cluster that contains two servers with 100 mail files on each server. Because there are only two servers, each server must fail over to the other server. Therefore, each server contains replicas of all the mail databases on the other server.
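[Figure: two-server mail cluster. Mail Server 1 (100 Mail Users) and the other mail server each hold replicas of the other server's 100 mail databases.]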
Since users often open mail databases once a day and leave them open, distributing the mail databases is usually adequate for workload balancing. You do not usually have to use separate workload balancing settings, especially if you dedicate servers to mail only. After failing over to a replica mail database, users automatically return to the mail database on their mail server the next time they start their Notes clients, as long as the Location document that points to that mail database is the current Location document.
Note If you do not create a dedicated mail cluster, you should distribute mail databases equally among the cluster servers, if the cluster servers are approximately equal in power. If some servers are more powerful than others, distribute more databases to the more powerful servers. This distribution helps to keep the workload balanced.
Caution If you plan to create a cluster that includes some Domino 6 servers and some Domino Release 5 or Domino Release 4.6 or 4.5 servers, keep the following in mind:
The Domino 6 mail template does not work properly on Domino Release 5 or Domino Release 4.6 or 4.5 servers. If a user has a Domino 6 mail database, do not create a replica on a Domino Release 5 or a Domino Release 4.6 or 4.5 server.
Because the Cluster Replicator always replicates the template design between replicas, a user's mail replicas should all use the same template: the Domino 6 mail template, the Domino Release 5 mail template, or the Domino Release 4.6 or 4.5 mail template.
The following figure shows a cluster with four servers of varying amounts of power. The databases in the cluster are distributed in a way that takes advantage of the resources of each server.
[Figure: four servers of varying power (one Powerful Server, one Above Average Powered Server, two Average Powered Servers) holding different numbers of replicas of DB1 through DB7.]
The following figure shows a cluster with four servers that are equal in power. The databases in this example all receive a similar amount of use. DB1 is a critical database, so each server contains a replica.
[Figure: four servers of equal power. Each server holds a replica of DB1 plus a share of the replicas of DB2 through DB6.]
If you create a private LAN for your cluster, all cluster members must be connected to both the private LAN, for intra-cluster communication, and the primary LAN, for client access. For information about setting up a private LAN, see the chapter Setting Up a Cluster.
recovery works well. Fault recovery restarts Domino on its current server, and no operating system failover occurs. If you configured your operating system cluster to fail over on both hardware and software failures, you don't need fault recovery because the operating system cluster will restart Domino on another server in the cluster. In fact, you should disable fault recovery so that Domino does not restart itself while the operating system cluster is also restarting it. Having both restart Domino can lead to problems.
By default, fault recovery is disabled. You enable it in the Server document.
1. From the Domino Administrator or the Web Administrator, click the Configuration tab.
2. In the Task pane, expand Server, and click All Server Documents.
3. In the Results pane, select the Server document you want, and click Edit Server.
4. In the Fault Recovery field, choose Enabled.
5. (Optional) Complete any of the following fields that you want.
   In the Cleanup Script Name field, enter the name of a cleanup script.
   In the Cleanup Script Maximum Execution Time field, enter the maximum time for a cleanup script to run before being terminated.
   In the Maximum Crash Limits field, enter the maximum number of restarts allowed during the specified period. If the number of restarts exceeds the limit, the server won't restart.
   In the Mail Crash Notification to field, enter the names of the people to notify each time the server restarts.
6. Make any other changes you want to the Server document, and then click Save & Close.
For more information about fault recovery, see the book Administering the Domino System.
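To make the idea behind the Maximum Crash Limits field concrete, here is a small Python sketch of a restart limiter. It is an illustration of the general technique only: the class name is hypothetical, and the sliding time window is an assumption, since this section does not describe how Domino counts restarts internally.

```python
from collections import deque


class RestartLimiter:
    """Allow at most `max_restarts` restarts within any window of
    `period_seconds`. The sliding window is an assumption made for
    illustration, not a description of Domino's implementation."""

    def __init__(self, max_restarts: int, period_seconds: float):
        self.max_restarts = max_restarts
        self.period_seconds = period_seconds
        self._restart_times = deque()

    def allow_restart(self, now: float) -> bool:
        # Forget restarts that have aged out of the window.
        while self._restart_times and now - self._restart_times[0] >= self.period_seconds:
            self._restart_times.popleft()
        if len(self._restart_times) >= self.max_restarts:
            return False  # limit reached: the server stays down
        self._restart_times.append(now)
        return True


if __name__ == "__main__":
    limiter = RestartLimiter(max_restarts=3, period_seconds=300)
    for crash_time in (0, 60, 120, 180, 400):
        print(crash_time, limiter.allow_restart(crash_time))
```

In this sketch the fourth crash within five minutes is refused, and restarts are allowed again once earlier crashes fall outside the window.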
Cluster partitioned servers
Cluster passthru servers
Use a Domino cluster with an operating system cluster
These examples assume that all servers in a cluster are equal in processing power and resources. If that is not true in your cluster, you may need to make adjustments.
[Figure: cluster in which Server 3, Server 4, Server 5, and Server 6 each hold a replica of DB1 plus a selection of replicas of DB2 through DB7.]
[Figure: hub configuration showing Hub 2, Server 3, and Server 4.]
When the primary purpose of the hub servers is replication, the previous configuration may not work well in some enterprises because of the amount of replication that is required. Because both hub servers need to contain the same databases, this configuration causes a lot of replication. Depending on the equipment you use, this configuration could slow down servers and create a lot of network traffic. If that is the case in your enterprise, you might consider using an active-passive operating system cluster for your hub servers, as in the following figure.
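[Figure: Application Server 1 and Application Server 2 with hub servers in an active-passive operating system cluster (Monitor).]
[Figure: a server marked RESTRICTED and a Remote Server.]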
For more information about designating a server as RESTRICTED, see the chapter Managing and Monitoring a Cluster.
[Figure: partitioned servers in clusters. Server 1 at 192.94.222.169, Server 2 at 192.94.222.170, and Server 3 at 192.94.222.171; Server 4 at 206.34.80.234, Server 5 at 206.34.80.235, and Server 6 at 206.34.80.236. Cluster labels shown: Cluster 2, Cluster 3.]
When you include a partitioned server in a cluster, you do not have to include all the partitioned servers on a machine in the cluster. The following figure shows two computers that each have three partitioned servers. Four of the partitioned servers are configured in two clusters, and two of the partitioned servers are not in a cluster.
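[Figure: two computers with three partitioned servers each. Server 1 at 192.94.222.169, Server 2 at 192.94.222.170, and Server 3 at 192.94.222.171; Server 4 at 206.34.80.234, Server 5 at 206.34.80.235, and Server 6 at 206.34.80.236. Four of the servers form two clusters (one labeled Cluster 1); two servers are not in a cluster.]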
[Figure: Passthru Server 1 and Passthru Server 2, with failover between them, in front of Server 1, Server 2, and Server 3.]
This configuration works only for clients that have a LAN or WAN connection to the passthru servers. For information about setting up mobile clients to use passthru servers, see the chapter Setting Up a Cluster.
[Figure: Server 1, Server 1 Backup, and Server 2 with a Monitor link; replicas of Database A and Database B.]
For more information about clustering requirements, see the chapter Cluster Benefits and Requirements. For more information about distributing databases, see the chapter Planning a Cluster. For more information about setting up scheduled replication, see the book Administering the Domino System.
Creating a cluster
To create a cluster, you must have the following access rights:
Author access and Delete Documents rights and the ServerModifier and ServerCreator roles in the Domino Directory
Author access with Create documents rights in the Administration Requests database
If possible, use the administration server when creating a cluster. This makes the creation process faster. The administration server does not have to be part of the cluster. If a server belongs to a different cluster, you do not have to remove the server from that cluster before you add it to the new cluster. The Cluster Administration Process removes the server from the original cluster and then adds it to the new cluster.
Note You cannot use the Web Administrator to create a cluster.
1. From the Domino Administrator, make sure the administration server or another server is current.
2. Click the Configuration tab.
3. In the Tasks pane, expand Server, and click All Server Documents.
4. In the Results pane, select the servers that you want to add to the cluster.
5. Click Add to Cluster.
6. In the Cluster Name dialog box, choose Create New Cluster, and then click OK.
7. Type the name of the new cluster, and then click OK.
8. Choose Yes to add the servers to the cluster immediately, or choose No to submit a request to the Administration Process to add the servers to the cluster.
9. (Optional) If you chose No in Step 8 and you did not add the servers on the administration server, force replication between the server you used and the administration server so that the administration server receives the requested changes sooner.
10. (Optional) If you chose No in Step 8, force replication between the administration server and the cluster servers so the cluster servers receive all the changes sooner.
11. (Optional) If you chose Yes in Step 8, the cluster information is added immediately to the Domino Directory on the server you used to create the cluster. If this server is not part of the new cluster, replicate the changes to one of the servers you added to the cluster.
The Cluster Administrator replicates the Cluster Database Directory and the Domino Directory with the other servers in the cluster so they are all synchronized. The Schedule Manager creates the Free Time database (CLUBUSY.NSF). The Free Time database replicates with the other cluster servers so they are all synchronized.
When the Domino Directory updates to include the new cluster, each cluster server begins to send messages, known as probes, to the other servers in the cluster. These probes gather information about the status of the other servers in the cluster.
From the Domino Administrator or the Web Administrator, expand Clusters in the Server pane.
   Expected result: The name of the cluster followed by the names of the cluster servers.
From the Domino Administrator or the Web Administrator, click the Configuration tab. In the Task pane, expand Cluster, and then click Clusters.
   Expected result: The name of the cluster followed by the names of the cluster servers displayed in the Results pane.
Compare the replica IDs of the Cluster Database Directories on each cluster server.
   Expected result: The same replica ID on each server.
From the server console, send the following command: show cluster
   Expected result: The name of the cluster, some statistics for the current server, and the names of all the cluster servers.
You can also use Cluster Analysis to generate reports that show if there are any configuration problems in the cluster. For information about Cluster Analysis, see the topic Using Cluster Analysis to check the cluster configuration.
Description
Consistent ACLs: Compares the ACLs of replicas throughout the cluster to be sure the ACLs are consistent. If they are not, users could fail over to replicas that they can't access or replicas that give them different rights to view and alter database information.
Disabled Replication: Checks to see if cluster replication is enabled for the databases on the server. If users fail over to a database that does not have cluster replication enabled, they may see different information than in the original database.
Consistent replication formulas: Checks for inconsistent replication formulas among replicas that share the same path. Replicas with the same path should have the same replication formulas.
Replicas exist within cluster: Checks to see if databases on the current server have replicas in the cluster. Returns failed if no replica exists. (Not all databases require replicas.)
Web Navigator: Checks to see if the Web databases (WEB.NSF) on cluster members are replicas of each other. If they aren't, the Web databases will not fail over to each other.
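The kind of comparison that a consistent-ACL check performs can be sketched in Python. This is not how Cluster Analysis is implemented: the function name, the data layout (one dictionary of ACL entries per server), and the sample names are all hypothetical.

```python
def inconsistent_acl_entries(replica_acls):
    """Compare ACLs of one database across its replicas.

    `replica_acls` maps a server name to that replica's ACL, given as
    {entry_name: access_level}. Returns {entry_name: {server: level}} for
    every entry whose level differs between replicas; a missing entry is
    reported as None."""
    all_names = set()
    for acl in replica_acls.values():
        all_names.update(acl)

    differences = {}
    for name in sorted(all_names):
        levels = {server: acl.get(name) for server, acl in replica_acls.items()}
        if len(set(levels.values())) > 1:
            differences[name] = levels
    return differences


if __name__ == "__main__":
    # Hypothetical ACLs for two replicas of the same database.
    acls = {
        "Server1": {"ClusterServers": "Manager", "Alice": "Editor"},
        "Server2": {"ClusterServers": "Manager", "Alice": "Reader"},
    }
    print(inconsistent_acl_entries(acls))
```

An empty result means that every replica grants each entry the same access level, which is what users need in order to see the same rights after failover.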
3. In the Tools pane, expand Analyze, and click Cluster.
4. (Optional) To write the results of the analysis tests to a database other than the Cluster Analysis database, click Results Database and specify the server, title, and file name of the database. Then click OK. If the database does not already exist, Domino creates it.
5. (Optional) If a Cluster Analysis database already exists and you want to append the new reports to this database, select Append to this database. Otherwise, the cluster analysis overwrites the existing database.
6. Select the types of reports you want Domino to generate: Server, Databases, or Web Navigator.
7. If you selected Databases as a Report type in Step 6, select the type of database tests you want to run: Consistent ACLs, Disabled Replication, Consistent replication formulas, and/or Replicas exist within cluster.
8. Click OK to run the analysis and to open the Results Database.
Viewing the results of a Cluster Analysis
1. Open the Cluster Analysis database if it is not already open.
2. Click one of the following views:
   By Cluster
   By Date
   By Test
3. Open a Cluster Analysis Results document.
3. Click the Files tab.
4. Do one of the following:
   In the Task pane of the Domino Administrator or the Web Administrator, select the folder or view that contains the database you want.
   In the Task pane of the Domino Administrator only, expand Cluster Directory (6), and then select the view you want.
5. In the Results pane, select the database you want.
6. In the Tools pane, expand Database, and then click Manage ACL.
7. Click the Advanced icon.
8. Choose Enforce a consistent Access Control List across all replicas of this database, and then click OK.
This setting ensures that ACLs are consistent across replicas and also enforces the ACL when replicas are accessed locally on either a server or a client.
Another way to keep ACLs consistent across replicas is to give all servers in a cluster Manager access to all databases in the cluster. This ensures that every server can update the ACL of every database. To give the cluster servers Manager access to all databases, you can create a Group document in the Domino Directory that includes all the servers in the cluster. Then add this group to the ACL of each database, select the user type Server group, and give the group Manager access.
It is important that cluster servers have adequate access so they can replicate all data from one replica to another. If there are any restrictions in one replica that are not in another replica, some information will not be available to users when failover occurs. Therefore, be sure that servers not only have Manager access, but that they can all replicate the same data without restrictions.
Private folders replicate differently in a cluster than outside a cluster. When outside a cluster, private folders and their contents do not replicate during server-to-server replication but do replicate during client-to-server replication.
In a cluster, however, private folders replicate from server to server so that users are able to access their private folders if they fail over to a different replica. To ensure that private folders replicate between servers in a cluster, be sure to set the user type of the servers in the ACL to Server or Server group.
For example, if the servers in the cluster contain database or directory links that include access lists, be sure that the cluster servers are in the access lists. Otherwise, they will not have access to those databases or directories and will not be able to replicate with those databases, even if they have Manager access in the ACLs. If a document in a database includes a Readers field, the cluster servers must be listed in the Readers field or the servers will not have access to that document and will not be able to replicate the document. The same is true if a folder or view includes a Readers field. Because Readers fields are often maintained by a database designer rather than a network administrator, network administrators need to communicate with database designers about this issue.
into the shared mail database. If shared mail is not being used, the server deposits the entire message into the replica of the recipient's mail database. To set up shared mail in a cluster and have replicated messages stored in the shared mail database, you use the same procedure you use for setting up shared mail with replicas that are not in a cluster. This procedure includes the Load Object Set - Always command. You do this on every server that uses shared mail in the cluster. For more information about setting up shared mail for replica mail databases, see the book Administering the Domino System.
From the Web Administrator
1. Click the People & Groups tab.
2. In the Tools pane, expand People, and then click Register.
3. Choose a CA certifier and, optionally, an explicit policy. Then click OK.
4. In the Register Person dialog box, select Advanced, and then click the Mail tab.
5. In the Mail system field, choose Lotus Notes.
6. In the Mail server field, choose a cluster server as the Mail server.
7. In the Mail template field, choose Mail (6).
8. Complete any other fields you want on the Mail tab, and then click the Replica tab.
9. Select Create replica(s) of. A list is displayed of servers in the same cluster as the Mail server.
10. Do one of the following:
   To create a replica of the mail database on all of the cluster servers, skip this step.
   To change the list of servers to receive a replica, use the Add button and the Remove button.
11. Complete the rest of the user registration the way you normally would.
To replicate databases for which you have disabled cluster replication
You may have databases that you want to replicate but not every time they are updated. You can disable cluster replication for these databases. To see whether cluster replication is disabled for a database, open the Cluster Database Directory. Databases with the letter X in the left column have cluster replication disabled. You can also check this by looking in the Cluster Replication field in the document for each database in the Cluster Database Directory. For more information about disabling cluster replication and viewing information in the Cluster Database Directory, see the chapter Managing and Monitoring a Cluster.
To replicate based on selective replication formulas
The Cluster Replicator leaves the processing of replication formulas to the standard replicator. Before using replication formulas in a cluster, you should be aware of how this affects cluster replication. For more information about selective replication in a cluster, see the chapter How Domino Clustering Works.
To replicate replicas that are on the same server
The Cluster Replicator pushes changes to other servers that contain replicas but does not update other replicas on its own server.
Note If there are multiple replicas on a server, the Cluster Manager uses failover by path to select the replica for a user to open during failover. If you put multiple replicas on a server, be sure that all replicas in the cluster that have the same path use the same selective replication formulas. Otherwise, the replica to which users fail over may contain different data than they expect.
You should run standard replication on a regular basis. The number of times per day you run standard replication depends on how important it is for you to keep all replicas synchronized. In most cases, once or twice per day is sufficient. If it is absolutely critical to keep data synchronized at all times, you may want to replicate every hour or two. In addition, you should replicate whenever you start the server to be sure that all databases are up-to-date. You can create a Program document in the Domino Directory to accomplish this.
Issuing the Replicate command with a cluster name
From the server console, send the following Replicate commands to replicate databases on a local server with databases in the specified cluster.
Purpose: To replicate all the databases that the local server has in common with servers in a specific cluster
Command: replicate cluster_name
Explanation of variables: cluster_name is the name of the cluster
Purpose: To replicate a specific database only
Command: replicate cluster_name filename
Explanation of variables: filename is the file name of a database
Purpose: To replicate with all the databases in a specific directory
Command: replicate cluster_name local_directory
Explanation of variables: local_directory is the name of a directory that contains databases
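The three command forms above differ only in an optional trailing argument. The following Python sketch assembles the console command text for each case. The helper name and the sample cluster, file, and directory names are hypothetical, and the sketch only builds strings; it does not send anything to a server.

```python
def replicate_command(cluster_name, target=None):
    """Build the text of a Replicate console command for a cluster.

    `target` is optional: a database file name or a local directory name.
    With no target, the command covers all databases that the local
    server has in common with the cluster."""
    if not cluster_name:
        raise ValueError("cluster_name is required")
    parts = ["replicate", cluster_name]
    if target:
        parts.append(target)
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(replicate_command("SalesCluster"))               # all common databases
    print(replicate_command("SalesCluster", "names.nsf"))  # one database
    print(replicate_command("SalesCluster", "mail"))       # one directory
```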
Specifying a cluster in a Connection document
You can create a Connection document to replicate with a cluster. Type the appropriate information in these fields:
Tab: Basics
   Field name: Source server
   Information you enter: Type the name of a server. The server cannot be a member of the cluster.
   Field name: Destination server
   Information you enter: Type the cluster name.
Tab: Replication/Routing
   Field name: Replication task
   Information you enter: Choose Enabled.
   Field name: Replication Type
   Information you enter: Select Pull Push, Pull Only, or Push Only. Pull Pull does not work with a cluster name.
   Field name: Files/Directory Paths to Replicate
   Information you enter: Leave blank to select all databases; type one or more file names or a directory name to specify databases.
Replicating with a cluster from a Notes client
Replicating with a cluster name is useful when you don't know the location of a database within the cluster.
Note A client must access a server in the cluster directly once before replicating with a cluster name. Doing so makes the client aware of the cluster by adding the names of the servers in the cluster to the CLUSTER.NCF file on the client.
1. Open the database you want, or right-click its bookmark.
2. Do one of the following:
   If you opened the database in Step 1, choose File - Replication - Replicate.
   If you right-clicked the bookmark in Step 1, choose Replication - Replicate.
3. If a box appears asking whether to Replicate via background replicator or Replicate with options, choose Replicate with options and click OK.
4. In the with field, type the cluster name.
5. Make any other changes you want, and click OK.
where the possible values for n are 1, which enables the display of cluster replication status messages, and 0, which disables the display of cluster replication status messages.
4. Click the NOTES.INI Settings tab.
5. Click Set/Modify Parameters.
6. In the Item field, select or enter CLREPL_OBEYS_QUOTAS.
7. In the Value field, enter 1.
8. Click Add, and then click OK.
9. Click Save & Close.
Note To ignore database size quotas again, place 0 (zero) in the Value field in step 7, or delete CLREPL_OBEYS_QUOTAS from the Configuration Settings document.

From the Web Administrator
1. Click the Configuration tab.
2. In the Task pane, expand Server, and then click Configurations.
3. Do one of the following:
   - If a Configuration Settings document already exists for the server you want, open that document, and then click Edit Server Configuration.
   - If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab.
4. Click the NOTES.INI Settings tab.
5. Click Set/Modify Parameters.
6. In the Available Parameters box, click CLREPL_OBEYS_QUOTAS, and then click Add.
7. In the Value field, enter 1, and then click OK.
8. Click Save & Close.
Note To ignore database size quotas again, place 0 (zero) in the Value field in step 7, or delete CLREPL_OBEYS_QUOTAS from the Configuration Settings document.
6. Select a cluster server as the Roaming server. Domino displays a list of all the servers in the cluster.
7. Select the additional servers you want to receive replicas, and then click OK.
8. Complete the rest of the fields the way you normally would when you upgrade a user for roaming, and then click OK.
[Figure: Passthru Server, Server 1, Server 2, Server 3]
Because mobile clients can typically call only one server at a time, they cannot take advantage of a cluster of passthru servers. However, you can set up a hunt group of passthru servers to ensure that mobile clients have high availability to passthru servers.
If you do not have access to a passthru server, users can still use their Replicator page to simulate failover if a cluster server is down. To replicate a mail database, for example, set up the Replicator page to call and replicate with the user's mail server, and then call and replicate with the server that contains the user's replica mail database. That way, if either server is available, the user will have access to the mail database. The call to the second server will take very little time if the first call was successful.
The following figure shows a mobile client calling Server 1, which is the user's mail server, and then calling Server 2, which contains a replica of the user's mail database.
[Figure: Remote Notes Client calling Mail Server 1 and Mail Server 2]
For example, add a port named CLUSTER, and then add the following information to the Port - Notes Network Ports tab in the Server document to enable the port.
Field name       Example
Port             CLUSTER
Protocol         TCP
Notes Network    Cluster Network
Net Address      Acme_clu.acme.com
Enabled          ENABLED
For more information about adding and enabling a new port, see Administering the Domino System.
7. Assign each port an IP address from the corresponding subnets, and place this information in the NOTES.INI file in the following form:
   PORT1_TcpIPAddress=0,a.b.c.d:1352
   PORT2_TcpIPAddress=0,e.f.g.h:1352
   where PORT1 and PORT2 are the port names and a.b.c.d and e.f.g.h are the IP addresses for the ports. If you have ports named TCPIP and CLUSTER, for example, these lines might be:
   TCPIP_TcpIPAddress=0,192.114.32.5:1352
   CLUSTER_TcpIPAddress=0,192.168.64.1:1352
8. Do one of the following:
   - Reorder the ports so that the cluster port is first. This ensures that all cluster traffic uses this port. Be sure that all other traffic is assigned to use other ports.
   - Add the following line to the NOTES.INI file:
     Server_Cluster_Default_Port=Cluster Port
     where Cluster Port is the port you created for the cluster. In this example, this line would be:
     Server_Cluster_Default_Port=CLUSTER
     This ensures that all cluster traffic uses this port for cluster communications no matter what order the ports are in.
   Note There is a disadvantage to using the Server_Cluster_Default_Port setting to assign a port to the private LAN for cluster traffic. If a cluster server encounters a problem connecting over this port, it will not try another port. Therefore, the server will not be able to communicate or replicate with other cluster servers. You will have to resolve the network problem or remove this setting from the NOTES.INI file before the server will be able to communicate with the cluster again.
   For information about reordering network ports on a server, see Administering the Domino System.
9. Restart the server.
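The NOTES.INI lines from steps 7 and 8 follow a fixed pattern, so they can be generated rather than typed by hand. The following Python sketch only formats the text. The function names are hypothetical, it does not read or write NOTES.INI, and it assumes IPv4 addresses and the port number 1352 shown in the example.

```python
import ipaddress
from typing import Dict, List


def port_address_line(port_name: str, ip: str, tcp_port: int = 1352) -> str:
    """Format one <port>_TcpIPAddress line in the form shown in step 7."""
    ipaddress.IPv4Address(ip)  # raises ValueError if the address is malformed
    return f"{port_name}_TcpIPAddress=0,{ip}:{tcp_port}"


def cluster_port_settings(ports: Dict[str, str], cluster_port: str) -> List[str]:
    """Return the address lines for every port plus the default-port line.

    ports maps each Notes port name to its IP address; cluster_port is
    the name of the port reserved for cluster traffic.
    """
    if cluster_port not in ports:
        raise KeyError(f"unknown cluster port: {cluster_port}")
    lines = [port_address_line(name, ip) for name, ip in ports.items()]
    lines.append(f"Server_Cluster_Default_Port={cluster_port}")
    return lines


if __name__ == "__main__":
    example = {"TCPIP": "192.114.32.5", "CLUSTER": "192.168.64.1"}
    for line in cluster_port_settings(example, "CLUSTER"):
        print(line)
```

Keep in mind the disadvantage noted in step 8: with Server_Cluster_Default_Port set, the server does not fall back to another port if the cluster port fails.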
These values should be fairly close to each other, although they will not be the same.
8. Compare the NET.portname.BytesSent value with the Replica.Cluster.SessionBytes.Out value. These values should also be close to each other. They won't match exactly because the private network is used for more than just cluster replication.
Monitoring a cluster
Domino provides several ways to find out what is happening in a cluster and make adjustments to keep the cluster running smoothly and efficiently, so that no server is overloaded. When running as part of a cluster, a Domino server constantly monitors its workload, the workload of the other servers in the cluster, and the availability of databases throughout the cluster. In addition, Domino monitors statistics and events that are relevant to a cluster.
There are many ways to view this information. For example, you can view it from the server console, in the log file, or in the Statistics pane in the Domino Administrator. In addition, you can collect statistic reports in the Monitoring Results database and then use the Domino Administrator to look at the statistic reports.
Some of the ways to monitor a cluster are:
   - Displaying a list of cluster members and their availability
   - Enabling statistic reporting in the Monitoring Results database
   - Viewing Cluster Manager events and statistics
   - Viewing cluster replication events and statistics
   - Using Tell commands to display cluster replication information
   - Monitoring all the servers in a cluster at the same time
This command displays the name of the cluster, several statistics for the current server, and the names and availability indexes of all the servers in the cluster.
   - From the following servers. If you choose From the following servers, do the following: In the Server(s) field, enter the names of the servers from which you want to collect statistics.
     To collect statistics from all the servers in a cluster, choose From the following servers, and then enter the names of the cluster servers in the Server(s) field.
6. Click the Options tab, and select Log statistics to a database.
7. (Optional) Do any of the following:
   - In the Database to receive reports field, enter the name of a database to store reports. By default, this is STATREP.NSF (the Monitoring Results database).
   - In the Collection report interval field, enter the number of minutes between reports. The minimum is 15.
   - In the Collection alarm interval field, enter the number of minutes between alarms. The minimum is 15.
   - In the Statistic Filters field, select the types of statistics you do NOT want to collect. By default, the server statistic collection includes all the types of statistics. To collect cluster replication statistics, do NOT select REPLICA in this field. The Cluster Manager statistics are always collected in statistic collections.
8. Click Save & Close.
3. Click the Server - Status tab.
4. In the Task pane, do one of the following:
   - From the Domino Administrator, click Server Tasks.
   - From the Web Administrator, click All Server Tasks.
5. In the Tools pane, expand Task, and then click Start.
6. Select Statistic Collector, and then click Start Task.
7. Click Done.

From the server console
Send the following Domino command from the server console:
load collect
When Domino fails over to balance the workload, the event may look like this in the Domino server log file:
08/23/2002 11:08:48 AM Load balancing off of Sales/Acme!!Customer.nsf for replica ID 852560C9:007232D, directing open to Sales2/Acme
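If you export log text for analysis, a workload balancing event like the one above can be split into its parts with a regular expression. The following Python sketch is based only on the single sample line shown here. The pattern, the class name, and the function name are assumptions for illustration. Real log lines may vary (for example, in date format or in database names that contain spaces), so treat this as a starting point.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class LoadBalanceEvent(NamedTuple):
    timestamp: datetime
    source_server: str
    database: str
    replica_id: str
    target_server: str


# Pattern derived from the one sample event in this topic (an assumption).
_EVENT = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M)\s+"
    r"Load balancing off of (?P<server>.+?)!!(?P<db>\S+) "
    r"for replica ID (?P<replica>[0-9A-Fa-f]+:[0-9A-Fa-f]+), "
    r"directing open to (?P<target>.+)$"
)


def parse_load_balance_event(line: str) -> Optional[LoadBalanceEvent]:
    """Return the parsed event, or None if the line has a different shape."""
    match = _EVENT.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match["ts"], "%m/%d/%Y %I:%M:%S %p")
    return LoadBalanceEvent(
        timestamp,
        match["server"],
        match["db"],
        match["replica"],
        match["target"].strip(),
    )


if __name__ == "__main__":
    sample = (
        "08/23/2002 11:08:48 AM Load balancing off of Sales/Acme!!Customer.nsf "
        "for replica ID 852560C9:007232D, directing open to Sales2/Acme"
    )
    print(parse_load_balance_event(sample))
```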
You can view these events in the log file. Do one of the following.

From the Domino Administrator or the Web Administrator
1. In the Server pane, expand All Servers or expand Clusters.
2. Select the server that stores the log file you want to view.
3. Click the Server - Analysis tab.
4. In the Task pane, expand Notes Log.
5. Click Miscellaneous Events.
6. In the Results pane, open the document you want to view.

From the Domino server log file
1. Open the Domino server log file (LOG.NSF).
2. Open the Miscellaneous Events view.
3. Open the Notes Log document you want to view.

You can also run Log Analysis to gather all of the failover and workload balancing events into a database. For more information about the Domino server log file and Log Analysis, see the book Administering the Domino System.
Viewing a list of Cluster Manager statistics
You can view a list of Cluster Manager statistics from the Domino Administrator, the Web Administrator, or the server console.

From the Domino Administrator or the Web Administrator
1. In the Server pane, expand All Servers or expand Clusters.
2. Select the server you want.
3. Click the Server - Statistics tab.
4. In the statistics list, expand Server, and then expand Cluster.
Note To see the availability index, the availability threshold, and the expansion factor of the current server, look in the Server section of the statistics, not the Server - Cluster section.

From the server console
Send the following Domino command from the server console:
show stat server.cluster*
The Cluster Manager statistics begin with Server.Cluster. They give you information about failover, workload balancing, and the state of the servers in the cluster. Among other things, the statistics tell you how often the Cluster Manager attempted failover and workload balancing, and how many of these attempts were successful. Note To see the availability index, the availability threshold, and the expansion factor of the current server, send the Domino command show stat server from the server console. For an explanation of all the cluster statistics, see the appendix Cluster Statistics.
Events Unable to replicate from customer.nsf to Sales2/Acme customer.nsf: Remote system no longer responding.
Database Sales2 names.nsf Access Manager Added Deleted Updated KBrec KBsent 34 0 0 2 1 2 3 1 15 13 From cldbdir.nsf names.nsf
You can also run Log Analysis to gather all of the replication events into a database. For more information about the Domino server log file and Log Analysis, see the book Administering the Domino System.
2. Select the server you want.
3. Click the Server - Analysis tab.
4. In the Task pane, expand Monitoring Results, and then expand Statistics Reports.
5. Click Clusters.
6. In the Results pane, open the document you want, and then look in the Replica cluster statistics section of the document.
Note If you prefer, you can view these reports directly in the Monitoring Results database (STATREP.NSF). Open the database, expand Statistics Reports, and then click Clusters.

From the Web Administrator only
You can use the Web Administrator to monitor a predetermined set of cluster replication statistics. These statistics show cluster replication activity, workload, and status. These statistics refresh automatically every minute.
1. In the Server pane, expand All Servers or expand Clusters.
2. Select the server you want.
3. Click the Replication tab.
4. In the Task pane, click Replication Statistics.

Viewing a list of cluster replication statistics
You can view a list of cluster replication statistics from the Domino Administrator, the Web Administrator, or the server console.

From the Domino Administrator or the Web Administrator
1. In the Server pane, expand All Servers or expand Clusters.
2. Select the server you want.
3. Click the Server - Statistics tab.
4. In the statistics list, expand Replica, and then expand Cluster.

From the server console
Send the following Domino command from the server console:
show stat replica.cluster*
The cluster replication statistics begin with Replica.Cluster. They give you information about cluster replication events, such as the number of documents updated, the number of times the Cluster Replicator retried pending replication, and the number of bytes received during cluster replication. For an explanation of all the cluster statistics, see the appendix Cluster Statistics.
Using cluster replication statistics to find replication backlogs During peak activity periods, servers may show an especially high frequency of replication events. Replication backlogs may occur if the Cluster Replicator is unable to handle all replication requests. Examine the Replica.Cluster.WorkQueueDepth statistic. This statistic shows the number of modified databases awaiting replication. In addition, examine the Replica.Cluster.SecondsOnQueue statistic. This statistic shows how long a database waited to be replicated. If the number of databases waiting to be replicated is consistently much greater than zero, or if the amount of time a database waits to be replicated is consistently longer than you would like, consider enabling additional Cluster Replicators to help decrease the replication backlog. You could also decrease the server workload by removing very active databases from the server or by decreasing the number of users who can access the server. For more information about enabling multiple Cluster Replicators, see the topic Using multiple Cluster Replicators. For an explanation of all the Cluster Replicator statistics, see the appendix Cluster Statistics.
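The guidance above depends on what "consistently" means for your site. The following Python sketch shows one way to turn periodic samples of Replica.Cluster.WorkQueueDepth and Replica.Cluster.SecondsOnQueue into a yes-or-no signal. The limits (5 databases, 60 seconds) and the 80% consistency fraction are hypothetical defaults chosen for illustration, not Domino recommendations, and the sketch assumes you have already collected the samples by some other means.

```python
from typing import Sequence


def fraction_above(samples: Sequence[float], limit: float) -> float:
    """Return the share of samples that exceed limit (0.0 if no samples)."""
    if not samples:
        return 0.0
    return sum(1 for value in samples if value > limit) / len(samples)


def backlog_suspected(
    queue_depths: Sequence[float],
    seconds_on_queue: Sequence[float],
    depth_limit: float = 5,        # hypothetical: databases waiting
    wait_limit: float = 60.0,      # hypothetical: seconds a database waited
    consistency: float = 0.8,      # hypothetical: share of samples over limit
) -> bool:
    """Flag a backlog when either statistic is over its limit most of the time."""
    return (
        fraction_above(queue_depths, depth_limit) >= consistency
        or fraction_above(seconds_on_queue, wait_limit) >= consistency
    )


if __name__ == "__main__":
    # Sustained deep queue: candidate for an additional Cluster Replicator.
    print(backlog_suspected([12, 9, 15, 11, 8], [3, 2, 4, 3, 2]))
    # A single spike during peak activity: not flagged.
    print(backlog_suspected([0, 1, 0, 7, 0], [1, 2, 1, 90, 1]))
```

Requiring a high share of samples over the limit avoids reacting to a single peak-period spike, which the text describes as normal.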
Note For information about these commands, see About the Tell commands for cluster replication, which follows.

From the Web Administrator
1. In the Server pane, expand All Servers or expand Clusters.
2. Select the server you want.
3. Click the Server - Status tab.
4. In the Task pane, click All Server Tasks.
5. In the Results pane, click Cluster Replicator.
6. In the Tools pane, expand Task, and then click Tell.
7. In the Tell Task dialog box, enter one of the following, and then click OK.
   tell clrepl dump
   tell clrepl dump server
   tell clrepl dump retry
Note For information about these commands, see About the Tell commands for cluster replication, which follows.
   - To see which databases did not replicate and which replicas are not up-to-date
   - To determine if cluster replication is being delayed due to long queues or to replication that needs to be retried
   - To determine whether to make any changes to the way you have configured cluster replication
   - To determine if you want to change the availability threshold on a server or redistribute replicas in the cluster
   - To determine if you want to force the Cluster Replicator to retry all failed replications immediately rather than waiting for the next scheduled retry
Choice in Domino Administrator: All known cluster mates
Command in Web Administrator: tell clrepl dump server
Information displayed:
   - The number of Cluster Replicators running
   - The work queue depth
   - The number of cluster replication retries in progress
   - The last time cluster replication was unsuccessful with each of the other cluster servers
   - The last time replication was retried with each of the other cluster servers

Choice in Domino Administrator: All databases that have failed to replicate
Command in Web Administrator: tell clrepl dump retry
Information displayed:
   - The number of Cluster Replicators running
   - The work queue depth
   - The number of cluster replication retries in progress
   - The names of the databases that are waiting to be replicated
   - The time the next retry is due for each database
   - The retry interval for each database

Use
   - To view the status of cluster replication
   - To see which databases did not replicate
   - To see if particular servers are having problems so you can attend to those servers
   - To determine if you want the Cluster Replicator to retry all failed replications immediately rather than waiting for the next scheduled retry
The following procedure tells how to copy the output information from the Domino Administrator or the Web Administrator to another program.
1. From the Domino Administrator or the Web Administrator, click the Server - Status tab.
2. Do one of the following:
   - In the Task pane of the Domino Administrator, click Server Console. Be sure that the console is not live.
   - In the Task pane of the Web Administrator, click Quick Console.
3. In the Results pane, select the text you want to copy.
4. Choose Edit - Copy.
5. Paste the copied text into the program you want.
Customizing a monitoring profile for a cluster
You can create new profiles and edit existing profiles to customize the tasks and statistics that Domino displays.
1. From the Domino Administrator, click the Server - Monitoring tab.
2. In the Monitoring profiles field, select an existing profile. Selecting a profile initializes the server monitor if it is not already initialized. You cannot make changes to a profile until the server monitor is initialized.
3. To add one or more tasks to monitor, choose Monitoring - Monitor New Task, select the tasks you want to add, and then click OK. For clustering, it can be useful to monitor the Cluster Database Directory Manager and the Cluster Replicator.
4. To add one or more statistics to monitor, choose Monitoring - Monitor New Statistic, do the following in the Add Statistic(s) to this profile dialog box, and then click OK.
   - Expand Replica - Cluster, and then select the statistics you want to monitor for cluster replication. Many of these statistics are helpful, but SecondsOnQueue and WorkQueueDepth are particularly helpful in determining whether you need to increase the number of Cluster Replicators you are running on the server.
   - Expand Server - Cluster, and then select the other cluster statistics you want to monitor. If Availability Index and Availability Threshold are not already included in your profile, it is helpful to monitor those. It is also helpful to monitor OpenRedirects - Failover and OpenRedirects - LoadBalance, as well as OpenRequest - LoadBalanced and OpenRequest - ClusterBusy, to track how often failover occurs.
5. (Optional) To add a server to the profile, select Monitoring - Monitor New Server, and then select the server from the list; or drag a server from the Server pane to the server monitor.
6. (Optional) To remove a server from the profile, click the name of the server you want to remove, and then select Monitoring - Remove Server.
7. To save your changes to the profile, do one of the following:
   - To save this profile as a new profile while also preserving the original profile, choose Monitoring - Profiles - Save As, and then enter a name for the profile.
   - To have this modified profile replace the original profile, you do not have to do anything. The profile is saved automatically when you close the Domino Administrator.
For more information about monitoring Domino servers, see the book Administering the Domino System.
Keep in mind that workload balancing is not the solution for a general lack of capacity in your enterprise. If your Domino servers are struggling with their current workload, and there are no additional servers to handle the excess load, enabling workload balancing will not solve the problem. To balance the workload, there must be somewhere to send the overflow from the overworked servers. If there is nowhere to send these requests, they will be handled by the original busy servers. However, the process of looking for another available server for each request will increase the workload on the server. If you do not have enough capacity in your enterprise, consider adding more memory or processors or otherwise upgrading your equipment to handle a larger load. If the workload in your cluster is normally well distributed, consider configuring the cluster for failover only, not for workload balancing.
Domino stores the minimum time for each type of transaction in memory and in the LOADMON.NCF file, which the server reads each time it starts. When the server shuts down, Domino updates the LOADMON.NCF file with the latest information.
To determine the current expansion factor, Domino tracks the most commonly used types of Domino transactions for specified periods of time. By default Domino tracks these transactions for 5 periods of 15 seconds each. Domino then determines the average time it took to complete each type of transaction and divides that time by the minimum time it ever took to complete that same type of transaction. This determines an expansion factor for each type of transaction. To determine the expansion factor for the entire server, Domino averages the expansion factors for all the types of transactions, giving a heavier weighting to the most frequently used types of transactions.
As the server gets busier, adding more load has an increasingly greater effect on performance and availability. Thus, adding more load to a busy server increases the expansion factor faster than adding more load to a less busy server. An expansion factor of 64 is considered the maximum load for a server. In other words, if the server is taking 64 times longer to complete transactions than it does under optimal conditions, Domino considers the server to be fully loaded.

How the availability index compares to the expansion factor
To determine the availability index, Domino uses a formula that converts the expansion factor into an approximation of the percentage of the total server capacity that is still available. The following table shows a few examples of expansion factors converted to availability indexes.
Expansion factor    Availability index
1                   100
2                   83
4                   67
8                   50
16                  33
32                  17
64                  0
Note The expansion factor and the availability index measure only the response time of the server, which is usually only a small portion of the response time clients experience. For example, the network response time between a client and a server often accounts for a significant portion of the response time the client experiences.
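This guide does not state the conversion formula itself. The example values in the table are, however, consistent with a logarithmic mapping in which each doubling of the expansion factor removes one sixth of the available capacity (64 is 2 to the sixth power). The following Python sketch reproduces the table under that assumption. It is an inference from the table, not a documented Domino formula, and the function name is hypothetical.

```python
import math

MAX_EXPANSION_FACTOR = 64  # described in this guide as a fully loaded server


def availability_index(expansion_factor: float) -> int:
    """Approximate the availability index for an expansion factor.

    Assumed mapping: index = 100 * (1 - log2(ef) / log2(64)), rounded.
    Inputs outside the range 1..64 are clamped to that range.
    """
    ef = min(max(expansion_factor, 1.0), float(MAX_EXPANSION_FACTOR))
    used = math.log2(ef) / math.log2(MAX_EXPANSION_FACTOR)
    return round(100 * (1 - used))


if __name__ == "__main__":
    for ef in (1, 2, 4, 8, 16, 32, 64):
        print(ef, availability_index(ef))
```

Under this mapping, going from an expansion factor of 1 to 2 costs as many index points as going from 32 to 64.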
Changing the amount of data used to compute the expansion factor
Although it is not usually necessary, you can use the following NOTES.INI settings to change the amount of data that Domino collects in order to figure the expansion factor.
To change the number of data collection periods that Domino uses, use the NOTES.INI setting
   Server_Transinfo_Max=x
where x is the number of collection periods you want Domino to use.
To change the length of each data collection period, use the NOTES.INI setting
   Server_Transinfo_Update_Interval=x
where x is the length of each period in seconds.
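To see how these two settings interact with the expansion factor calculation described earlier, consider the following Python sketch. It computes the length of the collection window and a server-wide expansion factor from per-transaction figures. The guide says only that more frequently used transaction types receive a heavier weighting. Weighting each type by its transaction count is an assumption made here for illustration, and all names and sample numbers are hypothetical.

```python
from typing import Dict, Tuple


def collection_window_seconds(periods: int = 5, period_seconds: int = 15) -> int:
    """Total time covered by the data, using the defaults stated in this guide."""
    return periods * period_seconds


def server_expansion_factor(transactions: Dict[str, Tuple[int, float, float]]) -> float:
    """Return a count-weighted average of per-type expansion factors.

    transactions maps a transaction type name to a tuple of
    (count, average_time, minimum_time). The two times must use the
    same unit, and minimum_time must be greater than zero.
    """
    total = sum(count for count, _, _ in transactions.values())
    if total == 0:
        return 1.0  # no activity: treat the server as running at its minimum times
    weighted = sum(
        count * (average / minimum)
        for count, average, minimum in transactions.values()
    )
    return weighted / total


if __name__ == "__main__":
    # Times in milliseconds; sample figures only.
    sample = {
        "open_database": (90, 20.0, 10.0),   # per-type factor 2
        "update_note": (10, 120.0, 10.0),    # per-type factor 12
    }
    print(collection_window_seconds())        # default window
    print(server_expansion_factor(sample))
```

With the defaults, the window is 5 periods of 15 seconds. Raising either setting smooths the result over a longer interval, so the figure reacts more slowly to sudden load changes.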
Choosing the server availability threshold
Setting the server availability threshold on each server is a key factor in balancing the workload in the cluster. Setting the server availability threshold too high can result in failover occurring unnecessarily. Setting it too low can result in poor performance for users who could have received better performance from a different server.
To determine the proper value for the server availability threshold, do the following:
1. During periods of normal to heavy load, use one of the following methods to observe the server availability index:
   - From the Domino Administrator or the Web Administrator, make the server you want current. Then click the Server - Statistics tab. Then, in the statistics list, expand Server.
   - In the Server pane of the Domino Administrator, expand All Servers or expand Clusters, right-click the server you want, choose Server Properties, and then click the Cluster tab. This method is not available in the Web Administrator.
   - At the server console, type show cluster.
   - At the server console, type show stat server.
   Note Prior to using the next two methods, you must enable statistic reporting.
   - From the Domino Administrator or the Web Administrator, click the Server - Analysis tab. In the Task pane, expand Monitoring Results - Statistics Reports - Clusters. In the Results pane, open the Monitoring Results document you want.
   - Open the Monitoring Results database (STATREP.NSF), and look in the Statistics Reports - Cluster view.
2. Set an initial availability threshold based on the results of your observation. Consider the following when setting this value:
   - The value should be near the lower end of the values you observed.
   - Add some extra capacity (lower the availability threshold number) to accommodate servers that may fail over to this server. When a server fails, the workload fails over to other servers in the cluster. If there are only two servers in the cluster, each with the same workload, this would result in approximately a 100% increase in the workload of the remaining server. If there are six servers in the cluster, this would result in approximately a 20% increase in the workload of the remaining servers. Therefore, you should set the availability threshold of each cluster server low enough to allow the server to absorb an adequate portion of the workload if another cluster server fails.
3. Track other cluster statistics to see if the workload is reasonably balanced. The following table lists some of the statistics that are helpful in determining if the workload is balanced.
Statistic name: Server.AvailabilityIndex
Description: The current value of the server availability index. The values range from 0 to 100. A value of 0 indicates that there are no resources available on the server. A value of 100 means that the server is completely available.

Statistic name: Server.ExpansionFactor
Description: The current expansion factor. The values range from 1 to 64. A value of 1 indicates that the server is completing transactions at the minimum time for that server. A value of 64 indicates that it is taking 64 times longer than the minimum time to complete transactions. An expansion factor of 64 is considered fully loaded.

Description: The number of times a BUSY server successfully redirects a client to another cluster member.

Description: The number of times a BUSY server is unsuccessful in redirecting a client to another cluster member. A server will be unsuccessful if the target server is also in a BUSY state or otherwise unavailable.

Description: The number of times a BUSY server tries to redirect a client request when all other cluster servers were also BUSY.

Description: The number of times a user tried to open a database on this server when the server was BUSY.
These statistics are cumulative from when the server was started.
4. Compare these statistics for all the servers in the cluster.
5. Adjust the server availability threshold on any servers that do not seem well balanced.
Note Workload balancing is not a substitute for having adequate computer resources for your enterprise. If your servers are already overloaded, workload balancing merely increases the problem because there is no place for a BUSY server to send client requests. Looking for an available server, however, decreases the performance on an already busy server.

Setting the server availability threshold
The server availability threshold specifies the lowest acceptable server availability index. Approximately once each minute, Domino computes the server availability index and compares it to the server availability threshold that you set. If the availability index is less than or equal to the availability threshold, the server is marked as BUSY. When a server is marked as BUSY, requests to open databases are redirected to another server, if one is available. When the availability index becomes higher than the availability threshold again, the BUSY condition is withdrawn.

From the Domino Administrator
1. Click the Configuration tab.
2. In the Task pane, expand Server, and then click Configurations.
3. Do one of the following:
   - If a Configuration Settings document already exists for the server you want, select that document, and then click Edit Configuration.
   - If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab.
4. Click the NOTES.INI Settings tab.
5. Click Set/Modify Parameters.
6. In the Item field, select or enter SERVER_AVAILABILITY_THRESHOLD.
7. In the Value field, enter the number you want for the server availability threshold. The higher the number you enter, the less workload the server can carry before going into the BUSY state. Entering the number 100 automatically puts the server into the BUSY state, regardless of its actual availability. Entering the number 0 disables workload balancing for that server. The default value is 0.
8. Click Add, and then click OK.
9. Click Save & Close.

From the Web Administrator
1. Click the Configuration tab.
2. In the Task pane, expand Server, and then click Configurations.
3. Do one of the following:
   - If a Configuration Settings document already exists for the server you want, open that document, and then click Edit Server Configuration.
   - If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab.
4. Click the NOTES.INI Settings tab.
5. Click Set/Modify Parameters.
6. In the Available Parameters box, click SERVER_AVAILABILITY_THRESHOLD, and then click Add.
7. In the Value field, enter the number you want for the server availability threshold, and then click OK. The higher the number you enter, the less workload the server can carry before going into the BUSY state. Entering the number 100 automatically puts the server into the BUSY state, regardless of its actual availability. Entering the number 0 disables workload balancing for that server. The default value is 0.
8. Click Save & Close.

Using the availability threshold when you restart a server in a cluster
When you restart a server in a cluster, it is a good idea to make the server BUSY until all replication to the server is complete. This ensures that users access up-to-date information in the databases on the server. You can make a server BUSY by setting the availability threshold to 100. When replication is complete, make the server available to users.
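The threshold rule from "Setting the server availability threshold" and the failover arithmetic from "Choosing the server availability threshold" can be sketched in a few lines of Python. This is an illustration of the rules as this guide states them, not Domino's implementation. The function names are hypothetical, and the load calculation assumes that every server carries the same workload, as in the two-server and six-server examples.

```python
def is_busy(availability_index: int, threshold: int) -> bool:
    """Apply the BUSY rule described in this guide.

    A threshold of 0 disables workload balancing, so the server is
    never marked BUSY. Otherwise the server is BUSY when the index is
    less than or equal to the threshold, which means a threshold of
    100 always yields BUSY.
    """
    if threshold == 0:
        return False
    return availability_index <= threshold


def failover_load_increase(cluster_size: int) -> float:
    """Fractional workload increase on each surviving server when one fails.

    Assumes equal workloads: the failed server's share is split evenly
    among the remaining cluster_size - 1 servers.
    """
    if cluster_size < 2:
        raise ValueError("a cluster needs at least two servers for failover")
    return 1.0 / (cluster_size - 1)


if __name__ == "__main__":
    print(is_busy(availability_index=85, threshold=100))  # restart case: held BUSY
    print(is_busy(availability_index=85, threshold=70))   # index above threshold
    print(f"{failover_load_increase(2):.0%}")             # two-server cluster
    print(f"{failover_load_increase(6):.0%}")             # six-server cluster
```

The second function shows why smaller clusters need a lower threshold: each surviving server must absorb a larger share of a failed server's workload.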
Using the server availability threshold to control failover to specific servers
In some cases, you may want to limit failover to a server. For example, if you set up a cluster over a WAN and one of the cluster servers is more distant than the other servers, you may want to limit failover to the distant server. You can limit failover to this server by setting its availability threshold very high.
For example, if you have three servers (one in Boston, one in New York, and one in Hong Kong), the Boston server would fail over to the Hong Kong server if it is more available than the New York server. However, if you set the availability threshold on the Hong Kong server to 100, the other cluster servers will not fail over to the Hong Kong server unless no other available cluster server contains a replica of the requested database.
When you control failover in this manner, be sure that the other cluster servers (the servers in Boston and New York in the example) have enough resources to handle most of the failover in the cluster.
5. Click Set/Modify Parameters.
6. In the Item field, select or enter SERVER_MAXUSERS.
7. In the Value field, enter the maximum number of users you want to access the server at the same time.
8. Click Add, and then click OK.
9. Click Save & Close.

From the Web Administrator
1. Click the Configuration tab.
2. In the Task pane, expand Server, and then click Configurations.
3. Do one of the following:
   - If a Configuration Settings document already exists for the server you want, open that document, and then click Edit Server Configuration.
   - If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab.
4. Click the NOTES.INI Settings tab.
5. Click Set/Modify Parameters.
6. In the Available Parameters box, click SERVER_MAXUSERS, and then click Add.
7. In the Value field, enter the maximum number of users you want to access the server at the same time, and then click OK.
8. Click Save & Close.

To see how often requests are redirected, check the log file for workload balancing events or check the Cluster Manager statistics. For information about viewing the log file and the Cluster Manager statistics, see the topic Monitoring Cluster Manager events and statistics.
Note You can use the Server_MaxUsers setting with any Domino server. However, only the servers in a cluster redirect access requests to another server when a server is in a MAXUSERS state. Servers that are not in a cluster reject the access requests.
Redistributing replicas
Often you can redistribute replicas to other servers in the cluster in order to better balance the workload. For example, if one of the servers in the cluster is significantly more busy than the other servers, consider moving one or more replicas from the busy server to the less busy servers. You can also create more replicas of a busy database so that the workload is shared by more servers, thus reducing the workload on a busy server. To move or create replicas in a cluster, you use the same procedures you use to move or create replicas on any Domino server. While moving or creating the replicas, you can select Show me only cluster members for cluster: cluster name in the Move Database or Create Replica dialog box. This causes the Domino Administrator or the Web Administrator to display only the names of the servers in the cluster. You can then easily place the replicas on every cluster server you want. For more information about moving replicas and creating replicas, see the book Administering the Domino System.
7. In the Value field, enter 0, 1, or 2 where these numbers mean the following: 0 - The server is unrestricted 1 - The server is RESTRICTED for the current session only. Restarting the server clears the setting. 2 - The server is RESTRICTED persistently, even after the server restarts. 8. Click Add, and then click OK. 9. Click Save & Close. From the Web Administrator 1. Click the Configuration tab. 2. In the Task pane, expand Server, and then click Configurations. 3. Do one of the following: If a Configuration Settings document already exists for the server you want, open that document, and then click Edit Server Configuration. If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab. 4. Click the NOTES.INI Settings tab. 5. Click Set/Modify Parameters. 6. In the Available Parameters box, click SERVER_RESTRICTED, and then click Add. 7. In the Value field, enter 0, 1, or 2 where these numbers mean the following: 0 - The server is unrestricted 1 - The server is RESTRICTED for the current session only. Restarting the server clears the setting. 2 - The server is RESTRICTED persistently, even after the server restarts. 8. Click OK, and then click Save & Close.
From the server console Send the following Domino command from the server console:
set config server_restricted=n
where n can be 0, 1, or 2. These numbers mean the following:
0 - The server is unrestricted.
1 - The server is RESTRICTED for the current session only. Restarting the server clears the setting.
2 - The server is RESTRICTED persistently, even after the server restarts.

If you want to restrict a server and do not want to wait for all users to close their existing sessions, enter the Drop All command at the console after you put the server into the RESTRICTED state. The Drop All command closes all existing sessions on the server. When users try to reopen the databases they were using, they fail over to a different server, if one is available.
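As a quick reference, the three values and their behavior across a restart can be modeled as follows. This Python sketch is only an illustration of the table above; the helper names restrict_command and value_after_restart are hypothetical.

```python
RESTRICTED_MEANINGS = {
    0: "unrestricted",
    1: "RESTRICTED for the current session only",
    2: "RESTRICTED persistently",
}


def restrict_command(value: int) -> str:
    """Build the console command shown above for a valid value."""
    if value not in RESTRICTED_MEANINGS:
        raise ValueError("server_restricted must be 0, 1, or 2")
    return f"set config server_restricted={value}"


def value_after_restart(value: int) -> int:
    """Return the effective setting once the server has restarted."""
    if value not in RESTRICTED_MEANINGS:
        raise ValueError("server_restricted must be 0, 1, or 2")
    # A restart clears value 1; values 0 and 2 are unchanged.
    return 0 if value == 1 else value


for n in (0, 1, 2):
    print(restrict_command(n), "->", RESTRICTED_MEANINGS[value_after_restart(n)])
```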
From the Domino Administrator 1. Click the Configuration tab. 2. In the Task pane, expand Server, and then click Configurations. 3. Do one of the following: If a Configuration Settings document already exists for the server you want, select that document and then click Edit Configuration. If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab. 4. Click the NOTES.INI Settings tab. 5. Click Set/Modify Parameters. 6. In the Item field, select or enter CLUSTER_REPLICATORS. 7. In the Value field, enter the number of Cluster Replicators you want to run on this server. Note Entering 0 (zero) in the Value field does not stop all Cluster Replicators. One Cluster Replicator will still run. To turn off all Cluster Replicators, see the topic Disabling cluster replication for an entire server. 8. Click Add, and then click OK. 9. Click Save & Close. 10. Restart the server so the setting takes effect. From the Web Administrator 1. Click the Configuration tab. 2. In the Task pane, expand Server, and then click Configurations. 3. Do one of the following: If a Configuration Settings document already exists for the server you want, open that document, and then click Edit Server Configuration. If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab. 4. Click the NOTES.INI Settings tab. 5. Click Set/Modify Parameters. 6. In the Available Parameters box, click CLUSTER_REPLICATORS, and then click Add. 7. In the Value field, enter the number of Cluster Replicators you want to run on this server, and then click OK.
Note Entering 0 (zero) in the Value field does not stop all Cluster Replicators. One Cluster Replicator will still run. To turn off all Cluster Replicators, see the topic Disabling cluster replication for an entire server. 8. Click Save & Close. 9. Restart the server so the setting takes effect. Starting multiple Cluster Replicators for the current session only To run multiple Cluster Replicators for the current session only, do one of the following. From the Domino Administrator or the Web Administrator 1. In the Server pane, expand All Servers or expand Clusters. 2. Select the server you want. 3. Click the Server - Status tab. 4. In the Task pane, do one of the following: From the Domino Administrator, click Server Tasks. From the Web Administrator, click All Server Tasks. 5. In the Tools pane, expand Task, and then click Start. 6. Select Cluster Replicator. 7. Click Start Task once for each Cluster Replicator you want to start, and then click Done. From the server console Send the following command from the server console once for each Cluster Replicator you want to start.
load clrepl
Each time you send this command, the server starts another Cluster Replicator.
Use one of the following procedures to disable cluster replication. Note Disabling the Cluster Replicator prevents only replication from that server to other cluster servers. It does not prevent replication to the server from other cluster servers. Disabling cluster replication automatically at server startup You can use the Domino Administrator or the Web Administrator to configure Domino to disable cluster replication at server startup. From the Domino Administrator 1. Click the Configuration tab. 2. In the Task pane, expand Server, and then click Configurations. 3. Do one of the following: If a Configuration Settings document already exists for the server you want, select that document, and then click Edit Configuration. If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab. 4. Click the NOTES.INI Settings tab. 5. Click Set/Modify Parameters. 6. In the Item field, select or enter DISABLE_CLUSTER_REPLICATOR. 7. In the Value field, enter 1 (one). 8. Click Add, and then click OK. 9. Click Save & Close. 10. Restart the server so the setting takes effect. Note To restart cluster replication, set Disable_Cluster_Replicator to 0 (zero) or remove this line from the Configuration Settings document. Then restart the server. From the Web Administrator 1. Click the Configuration tab. 2. In the Task pane, expand Server, and then click Configurations. 3. Do one of the following: If a Configuration Settings document already exists for the server you want, open that document, and then click Edit Server Configuration. If a Configuration Settings document does not already exist for the server you want, click Add Configuration, and add the name of the server in the Group or Server name field on the Basics tab. 4. Click the NOTES.INI Settings tab.
5. Click Set/Modify Parameters. 6. In the Available Parameters box, click DISABLE_CLUSTER_REPLICATOR, and then click Add. 7. In the Value field, enter 1 (one), and then click OK. 8. Click Save & Close. 9. Restart the server so the setting takes effect. Disabling cluster replication for the current session only To disable cluster replication for the current session only, do one of the following. From the Domino Administrator or the Web Administrator 1. Click the Server - Status tab. 2. In the Task pane, do one of the following: From the Domino Administrator, click Server Tasks. From the Web Administrator, click All Server Tasks. 3. In the Results pane, select a Cluster Replicator. 4. In the Tools pane, expand Task, and then click Stop. From the server console Send the following Domino command from the server console:
tell clrepl quit
Using these procedures shuts down all Cluster Replicators, even if multiple Cluster Replicators are running.
Databases by Server

The Databases by Filename view is particularly useful for disabling cluster replication on specific databases.
6. In the Results pane, select the databases for which you want to disable cluster replication.
7. Do one of the following:
• From the Domino Administrator, click Tools - Disable Cluster Replication on Selected Databases.
• From the Web Administrator, click Tools - Disable Replication.

Tip Databases with the letter X in the left column in the Cluster Database Directory have cluster replication disabled.

Disabling cluster replication of a database prevents only replication of changes from that database to other servers in the cluster. It does not prevent replication to the database from other cluster servers. Disabling cluster replication has no effect on standard replication.

Note To reenable cluster replication for specific databases, follow the procedure above with the following exception: In Step 7, click Tools - Enable Cluster Replication on Selected Databases from the Domino Administrator, or click Tools - Enable Replication from the Web Administrator.
11. In the Action field, choose Modify Field. 12. In the Modify by field, choose Replacing. 13. In the The value in field field, choose ClusterReplicate. 14. In the With the new value field, type 0 (zero). 15. Click Add. 16. Choose File - Save to save the agent. Creating an agent to reenable cluster replication for specific databases If you create an agent to disable cluster replication on specific databases, you might want to create another agent to reenable cluster replication on specific databases. To create this agent, follow the previous procedure with the following exceptions: In step 3, type a different name for the agent. In step 14, type 1 (one) in the With the new value field.
Running the agents 1. Open the Cluster Database Directory. 2. Select the databases for which you want to disable cluster replication or enable cluster replication. 3. Choose the appropriate agent name from the Actions menu.
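The effect of the two agents on the selected documents can be pictured with a short Python sketch. The dictionaries below are hypothetical stand-ins for Cluster Database Directory documents, the database names are invented, and storing the ClusterReplicate value as text is an assumption; a real agent runs inside Notes and is created as described above.

```python
def set_cluster_replication(documents, enabled: bool):
    """Replace the ClusterReplicate field on each selected document."""
    new_value = "1" if enabled else "0"  # 1 = enabled, 0 = disabled
    for doc in documents:
        doc["ClusterReplicate"] = new_value
    return documents


# Hypothetical selection of two documents in the Cluster Database Directory.
selected = [
    {"Database": "sales.nsf", "ClusterReplicate": "1"},
    {"Database": "hr.nsf", "ClusterReplicate": "1"},
]
set_cluster_replication(selected, enabled=False)
print([doc["ClusterReplicate"] for doc in selected])
```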
5. In the Results pane, select the Cluster Replicator. 6. In the Tools pane, expand Task, and then click Tell. 7. Click Pause, and then click OK. From the Web Administrator 1. In the Server pane, expand All Servers or expand Clusters. 2. Select the server you want. 3. Click the Server - Status tab. 4. In the Task pane, click All Server Tasks. 5. In the Results pane, select the Cluster Replicator. 6. In the Tools pane, expand Task, and then click Tell. 7. Type tell clrepl pause and then click OK. From the server console Send the following Domino command from the server console:
tell clrepl pause
From the server console Send the following Domino command from the server console:
tell clrepl resume
Forcing the Cluster Replicator to update the Cluster Database Directory information immediately
The Cluster Replicator stores information from the Cluster Database Directory in memory and uses this information to replicate changes to the other cluster servers. Every 15 seconds, the Cluster Replicator checks the Cluster Database Directory for changes, such as databases that were added or deleted or databases that have a different cluster replication status. If the Cluster Replicator detects changes, it updates the information it has stored in memory. You can force the Cluster Replicator to check immediately for changes to the Cluster Database Directory. From the Domino Administrator 1. In the Server pane, expand All Servers or expand Clusters. 2. Select the server you want. 3. Click the Server - Status tab. 4. In the Task pane, click Server Tasks. 5. In the Results pane, select the Cluster Replicator. 6. In the Tools pane, expand Task, and then click Tell. 7. Click Refresh cluster configuration changes, and then click OK. From the Web Administrator 1. In the Server pane, expand All Servers or expand Clusters. 2. Select the server you want. 3. Click the Server - Status tab. 4. In the Task pane, click All Server Tasks. 5. In the Results pane, select the Cluster Replicator. 6. In the Tools pane, expand Task, and then click Tell. 7. Type tell clrepl refresh and then click OK. From the server console Send the following Domino command from the server console:
tell clrepl refresh
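The relationship between the 15-second check and a forced refresh can be sketched as a small cache model in Python. This is not how the Cluster Replicator is implemented; the DirectoryCache class, its method names, and the read_directory callback are assumptions made for illustration.

```python
class DirectoryCache:
    """In-memory copy of directory information with a periodic check."""

    CHECK_INTERVAL = 15  # seconds between checks, as described above

    def __init__(self, read_directory, now: float = 0.0):
        self._read_directory = read_directory
        self.entries = read_directory()
        self.last_check = now

    def tick(self, now: float) -> bool:
        """Periodic check; reloads only when the interval has elapsed."""
        if now - self.last_check >= self.CHECK_INTERVAL:
            self.refresh(now)
            return True
        return False

    def refresh(self, now: float) -> None:
        """Forced check, comparable in effect to tell clrepl refresh."""
        self.entries = self._read_directory()
        self.last_check = now


directory = {"names.nsf": "enabled"}
cache = DirectoryCache(lambda: dict(directory))
directory["sales.nsf"] = "enabled"      # a database is added
print(cache.tick(5), "sales.nsf" in cache.entries)
cache.refresh(5)                         # the administrator forces a refresh
print("sales.nsf" in cache.entries)
```

The sketch reloads the whole directory on each check for simplicity, rather than detecting individual changes.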
From the Domino Administrator 1. In the Server pane, expand All Servers or expand Clusters. 2. Select the server you want. 3. Click the Server - Status tab. 4. In the Task pane, click Server Tasks. 5. In the Results pane, select the Cluster Replicator. 6. In the Tools pane, expand Task, and then click Tell. 7. Select Write a Replication Event Log record immediately, and then click OK. From the Web Administrator 1. In the Server pane, expand All Servers or expand Clusters. 2. Select the server you want. 3. Click the Server - Status tab. 4. In the Task pane, click All Server Tasks. 5. In the Results pane, select the Cluster Replicator. 6. In the Tools pane, expand Task, and then click Tell. 7. Type tell clrepl log and then click OK. From the server console Send the following Domino command from the server console:
tell clrepl log
8. In the Group or Server name field on the Basics tab, type the name of the group you created.
9. Choose the settings you want.
10. Save and close the document.

Note You can include a server name in more than one Configuration Settings document, but there are specific rules for which document takes precedence. For more information about the rules of precedence in Configuration Settings documents, see the book Administering the Domino System.
To mark a database out of service, follow this procedure: 1. In the Server pane of the Domino Administrator or the Web Administrator, expand All Servers or expand Clusters. 2. Select the server that contains the database you want to mark out of service. 3. Click the Files tab. 4. Do one of the following: In the Task pane in the Domino Administrator or the Web Administrator, select the folder or view that contains the database you want. In the Task pane in the Domino Administrator only, expand Cluster Directory (6), and then select the view you want. 5. In the Results pane, select the database or databases you want. 6. In the Tools pane, expand Database, and then click Cluster. 7. Click Out of service, and then click OK. To mark all databases on a server out of service, use the Server_Restricted setting. A restricted server does not accept new database open requests. For information about the Server_Restricted setting, see the topic Managing failover in a cluster.
6. In the Tools pane, expand Database, and then click Cluster. 7. Click In service, and then click OK.
If possible, use the administration server when adding a server to a cluster. The administration server does not have to be part of the cluster. If a server is part of a different cluster, you do not have to remove the server from that cluster to add it to the new cluster. The Cluster Administration Process removes the server from its original cluster and then adds it to the new cluster. Note You cannot use the Web Administrator to add a server to a cluster. 1. From the Domino Administrator, make sure the administration server or another server is current. 2. Click the Configuration tab. 3. In the Task pane, expand Server, and click All Server Documents. 4. In the Results pane, select the server(s) you want to add to the cluster. 5. Click Add to Cluster. 6. Choose the name of the cluster you want to add the server(s) to, and then click OK. 7. Choose Yes to add the server to the cluster immediately, or choose No to submit a request to the Administration Process to add the server to the cluster. 8. (Optional) If you chose No in Step 7 and you did not use the administration server to add the server to the cluster, force replication between the server you used and the administration server so that the administration server receives the requested changes sooner. 9. (Optional) If you chose No in Step 7, force replication between the administration server and the cluster servers so the cluster servers receive the changes sooner. 10. (Optional) If you chose Yes in Step 7, the cluster information is added immediately to the Domino Directory on the server you used to add the server to the cluster. If the server you used is not part of the cluster, replicate the changes to one of the cluster servers. When you add a server to a cluster, there can be a performance impact because of the amount of replication that must take place initially. Depending on the types of databases on the server and the number of
replicas you create, adding a server can affect CPU performance, Input/Output, and network traffic. Therefore, it is a good idea to add only one server at a time to a cluster, depending on the ability of the equipment in the cluster. If you set up a private LAN for cluster traffic, adding servers is less of a concern because it does not affect your primary network.
When the added server's Domino Directory updates, the added server begins to send probes to other servers in the cluster. In return, cluster members begin to probe the new server when their Domino Directories update. This is the way that all the servers in the cluster keep track of the availability and status of the other servers. To verify that the server was added correctly to the cluster, follow the same procedure you follow to check that a cluster was created correctly. For more information, see the chapter Setting Up a Cluster.
If possible, use the administration server when removing a server from a cluster. The administration server does not have to be part of the cluster. Note You cannot use the Web Administrator to remove a server from a cluster. 1. From the Domino Administrator, make sure the administration server or another server is current. 2. Click the Configuration tab.
3. In the Task pane, expand Cluster and click Clusters. 4. In the Results pane, select the server you want to remove from the cluster. 5. Click Remove from Cluster. 6. Choose Yes to remove the server from the cluster immediately, or choose No to wait until the Administration Process removes the server from the cluster.
Note It is possible to remove a server from a cluster even if the server is shut down. However, the documents for this server's databases remain in the Cluster Database Directory. To remove these documents, open the Cluster Database Directory, select the Databases by Server view, and manually remove the documents for the server.
request. The Administration Process then removes the cluster information from the Server document in the administration server's Domino Directory. The next time the administration server replicates with the removed server, Domino replicates these changes to the removed server's Domino Directory. Cluster membership changes do not take effect until the removed server receives the changes to the Server document. If you choose to remove a server from a cluster immediately, Domino immediately makes the changes to the Server document on the server from which you initiated the Remove from Cluster command. If you initiate the request on the server you are removing, Domino updates the cluster information immediately on that server. You do not have to wait for the Administration Process to update the Domino Directories on the cluster servers. Although this removes the server from the cluster faster, it can also lead to replication conflicts.
Databases by Filename Databases by Pathname Databases by Replica ID Databases by Server 6. In the Results pane, open the database document you want. The following table describes the views in the Cluster Database Directory.
View: Databases by Filename
Description: Shows databases in a cluster sorted by database name.
Use: To find which databases are in a cluster and on which servers replicas reside. Using this view is a convenient way to see which databases have cluster replication enabled or disabled and to alter that setting. This is also a convenient view for deleting replicas and creating new replicas.

View: Databases by Pathname
Description: Shows databases in a cluster sorted by the path name relative to the Domino Data directory.
Use: To find information about a database when you know the database file name or when there is more than one replica of a database on the same server. You can also use this view to see which databases have cluster replication enabled or disabled.

View: Databases by Replica ID
Description: Shows databases in a cluster sorted by Replica ID.
Use: To find which databases are in the cluster, where the databases are located, and how many replicas of each database exist in the cluster.

View: Databases by Server
Description: Shows databases in a cluster sorted by the name of the server on which they reside.
Use: To find a database when you know where it is located or to find which databases are on each server. You can also use this to determine if the databases on a server are added to the Cluster Database Directory when the server is added to the cluster, and to determine if the databases on a server are removed from the Cluster Database Directory when a server is removed from a cluster.
3. Restart the Cluster Database Directory Manager by sending the command load CLDBDIR from the server console. It is important that you create the replica before starting the Cluster Database Directory Manager. Otherwise, the Cluster Database Directory Manager will create a new Cluster Database Directory and add to it a document for each database on the server. However, the existing Cluster Database Directories on the other cluster servers already contain a document for each database on this server. Therefore, after these Cluster Database Directories replicate, they would contain two documents for every database on this server. Although these duplicate documents are deleted in subsequent replications, it can be confusing to view the Cluster Database Directory while it contains so many duplicate documents.
If the page that a Web server displays to a client includes links to other databases, the Web server includes the host name of the ICM in the URLs to those databases in the following instances:
• When generating URLs to databases on the same server as the original database
• When generating URLs to databases on different servers if there are replicas of those databases on the server that contains the original database
This ensures that users accessing those links go through the ICM. Note In cases not mentioned above, you can use the Redirect URL command to create links to other servers. For information about the Redirect URL command, see the book Administering the Domino System. The following figure shows an HTTP client asking the ICM to open a database, and the ICM redirecting the client to the best server that contains the requested database, Server 2. The client then connects directly to Server 2.
[Figure: An HTTP client sends a request to the ICM on Server A, which redirects the client to one of Web Server 1, Web Server 2, or Web Server 3.]
The ICM can run on a server in the cluster or outside the cluster. When the ICM runs on a server in the cluster, it accesses the local copy of the Cluster Database Directory. When the ICM runs on a server outside the cluster, it selects a server in the cluster and accesses the Cluster Database Directory on that server. If the server that the ICM selects becomes unavailable, this connection fails over to another server in the cluster.
The ICM always uses its local copy of the Domino Directory. Therefore, the ICM must be in the same Domino domain as the cluster. Performance considerations In most cases, users will experience better performance when you use the ICM. The overhead of using the ICM is very small, but the benefit to performance from workload balancing can be significant. In cases where the workload was already balanced, there will not be a significant increase or decrease in performance.
The following conditions can affect the way the ICM generates URLs:
For more information about planning a cluster, see the chapter Planning a Cluster.
[Figure: Server A (ICM) with redirection to Web Server 1, Web Server 2, Web Server 3, and Web Server 4.]
[Figure: Server A (ICM) and Server B (ICM) with redirection to Web Server 1, Web Server 2, Web Server 3, and Web Server 4.]
[Figure: Redirection to Web Server 2, Web Server 3, and Web Server 4.]
[Figure: Redirection to Web Server 2 and Web Server 3.]
Example of one ICM outside the cluster and one ICM inside the cluster
You can configure one ICM to run outside the cluster and one to run inside the cluster. If the dedicated server outside the cluster becomes unavailable, you have a backup ICM available without having to dedicate a server to the additional ICM. The following figure shows an HTTP client with access to two ICMs, one outside the cluster and one inside the cluster. Each ICM can redirect the client to any of the four Web servers in the cluster.
[Figure: HTTP client, Server A, and an ICM, with redirection to Web Server 2, Web Server 3, and Web Server 4.]
Field name: Get configuration from
Description: Lets you specify a different Server document from which to get configuration information. This field lets multiple ICMs share the same configuration.

Field name: Obtain ICM configuration from
Description: This field appears when you select another server document in the field Get configuration from. Enter the name of the server whose Server document contains the configuration you want to use.

Field name: ICM hostname
Description: The fully qualified name of the host that clients should use to communicate with the ICM. This can be the registered DNS name or the IP address. The Domino Web server uses this field to create URLs that reference the ICM. If this field is blank, the Web server will not be able to generate URLs that refer to the ICM.

Field name: TCP/IP port number
Description: Enter the port number for the ICM to use. If you are running the ICM on the same server as the Web server, you must avoid address and port conflicts. If you do not give the ICM its own IP address, be sure the port number the ICM uses is different from any of the other port numbers you use on the server.

Field name: TCP/IP port status
Description: To enable HTTP communication with the ICM, choose Enabled. To disable HTTP communication with the ICM, choose Disabled.

Field name: SSL port number
Description: Enter the port number to use for SSL. If you are running the ICM on the same server as the Web server and you do not give the ICM its own IP address, be sure the SSL port number is different from any of the other port numbers you use on the server.

Field name: SSL port status
Description: To enable HTTPS communication with the ICM, choose Enabled. To disable HTTPS communication with the ICM, choose Disabled.
When the ICM starts, it looks at the Server document on the server on which it is running to find the ICM cluster name and its network address. It then obtains the host name and port settings from the same Server document or from the Server document specified in the field Obtain ICM configuration from. If you run the ICM on the same system as a Domino Web server, you must avoid IP address or port number conflicts. The best approach is to assign the ICM its own IP address. You can also have the ICM share an IP address with the Web server if you specify different port numbers for the ICM and the other protocols on the Web server.
5. From the Domino Administrator or the Web Administrator, click the Configuration tab. 6. In the Tasks pane, expand Server, and click All Server Documents. 7. Do one of the following: In the Results pane of the Domino Administrator, select the Server document for the server on which you want to run the ICM. Then click Edit Server. In the Results pane of the Web Administrator, open the Server document for the server on which you want to run the ICM. Then click Edit Server. 8. Click the Server Tasks - Internet Cluster Manager tab. 9. In the field ICM Notes port, enter the name of the port you configured, such as ICMPORT. 10. If you want to use port 80 for both the ICM and the Web server, you must do the following: In the Server document, click the Internet Protocols - HTTP tab. In the Host name(s) field, enter the IP address or host name of the Web server. In the Bind to host name field, select Enabled. 11. Click Save & Close.
The ICM maintains the following information so that it can find a replica when a client asks for one:
• Information about which databases are available in the cluster and where they are stored. The ICM obtains this information from the Cluster Database Directory.
• Information about the availability of each server. The ICM obtains this information each time it probes the servers in the cluster.
• Information about which Web servers are configured for HTTP and which are configured for HTTPS. The ICM obtains this information from the Server documents of each server in the cluster.
To determine which replica of a database to open, the ICM does the following:
• Determines where replicas reside and whether they are marked out of service or pending delete.
• Checks the server availability index of each server that contains a replica.
• Checks the availability of the server by pinging the HTTP port or the HTTPS port, depending on the client request.
• Eliminates any servers that are not reachable or are RESTRICTED.
• Eliminates any servers that are BUSY or in the MAXUSERS state.
• Selects a server from those remaining. If there are no servers remaining, the ICM chooses a server that is BUSY or in the MAXUSERS state, if one is available. If there are multiple servers remaining, the ICM chooses the server with the lightest current workload.
After choosing the server to access, the ICM looks at the Server document to determine which port to use to access the server.
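The selection steps listed above can be sketched in Python. This is a simplified model, not the ICM's actual implementation: the Replica class and choose_replica function are hypothetical, replicas marked out of service or pending delete are assumed to be skipped, and the availability index is assumed to stand in for "lightest current workload" (a higher index meaning a lighter load).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    server: str
    availability_index: int        # assumed: higher means lighter workload
    reachable: bool = True         # result of pinging the HTTP or HTTPS port
    state: str = "AVAILABLE"       # AVAILABLE, BUSY, MAXUSERS, or RESTRICTED
    out_of_service: bool = False
    pending_delete: bool = False


def choose_replica(replicas) -> Optional[Replica]:
    """Pick the replica a client would be redirected to, or None."""
    # Assumption: replicas marked out of service or pending delete are skipped.
    usable = [r for r in replicas if not (r.out_of_service or r.pending_delete)]
    # Eliminate servers that are not reachable or are RESTRICTED.
    candidates = [r for r in usable if r.reachable and r.state != "RESTRICTED"]
    # Eliminate BUSY and MAXUSERS servers, unless nothing else remains.
    preferred = [r for r in candidates if r.state not in ("BUSY", "MAXUSERS")]
    pool = preferred or candidates
    if not pool:
        return None
    return max(pool, key=lambda r: r.availability_index)


replicas = [
    Replica("Web Server 1", 95, state="RESTRICTED"),
    Replica("Web Server 2", 80),
    Replica("Web Server 3", 60),
    Replica("Web Server 4", 90, state="BUSY"),
]
print(choose_replica(replicas).server)
```

When every remaining server is BUSY or in the MAXUSERS state, the sketch picks the one with the highest availability index; the source text does not say how the ICM chooses among them.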
The user may or may not have to reauthenticate with the new server. This is determined by the following factors:
• If the user already authenticated with the new server during this session, no authentication is necessary.
• If the HTTP client and the server both support SSL3, reauthentication occurs automatically.
Security
The ICM supports SSL. The ICM can use the same SSL certificates that the Domino Web server uses, or you can specify a different set of SSL certificates for the ICM. You configure this on the Server Tasks - Internet Cluster Manager tab of the Server document. The ICM uses settings on the Ports - Internet Ports tab of the Server document to determine the SSL protocol version and whether to accept expired certificates. In addition, normal Domino server and database security are in effect when using the ICM. The ICM, however, does not participate in the security process. When an HTTP client wants to access a database, it sends an anonymous request to the ICM. The ICM responds by telling the client which server to access. The client then redirects its request to the appropriate server. The server then establishes a dialog with the client and uses whatever security measures are in effect on that server to authenticate the user. If you want to protect the ICM itself from unauthorized access, you can use a firewall or another hardware security system. For more information about SSL, firewalls, and other network security methods, see the book Administering the Domino System.
From the Domino Administrator or the Web Administrator
1. In the Server pane, select the server that stores the log file you want to view.
2. Click the Server - Analysis tab.
3. In the Task pane, expand Notes Log, and then click Miscellaneous Events.
4. In the Results pane, open the document you want to view.

From the Domino server log file
1. Open the Domino server log file (LOG.NSF).
2. Open the Miscellaneous Events view.
3. Open the document you want to view.
To start the server monitor manually
1. From the Domino Administrator, click the Server - Monitoring tab.
2. In the Monitoring profiles field, select the profile you want.
3. Click the Start button. Once pressed, the Start button becomes the Stop button.

To start the server monitor automatically when the server starts
1. Choose File - Preferences - Administration Preferences.
2. Click Monitoring.
3. Select Automatically monitor servers at startup.
4. Make any other changes you want, and then click OK.

For more information about monitoring Domino servers, see the book Administering the Domino System.

Creating or customizing a monitoring profile for the ICM
You can create new profiles and edit existing profiles to customize the tasks and statistics that Domino displays.
1. From the Domino Administrator, click the Server - Monitoring tab.
2. In the Monitoring profiles field, select an existing profile. Selecting a profile initializes the server monitor if it is not already initialized. You cannot make changes to a profile until the server monitor is initialized.
3. To add one or more tasks to monitor, choose Monitoring - Monitor New Task, select the tasks you want to add, and then click OK. To monitor the ICM task, select Internet Cluster Manager (ICM).
4. To add one or more statistics to monitor, choose Monitoring - Monitor New Statistic.
5. In the Add Statistic(s) to this profile dialog box, expand ICM, select the ICM statistics you want to monitor, and then click OK. For example, the Command - Redirects statistics and the Requests statistics tell you how busy each ICM is. This can help you to balance the workload.
6. (Optional) To add a server to the profile, select Monitoring - Monitor New Server, and then select the server from the list; or drag a server from the Server pane to the server monitor.
7. (Optional) To remove a server from the list, click the name of the server you want to remove, and then select Monitoring - Remove Server.
8. To save your changes to the profile, do one of the following:
To save this profile as a new profile while also preserving the original profile, choose Monitoring - Profiles - Save As, and enter a name for the profile.
To have this modified profile replace the original profile, you do not have to do anything. The profile is saved automatically when you close the Domino Administrator.
In the following figure, an IP sprayer directs an HTTP client to Web Server 1. If Web Server 1 is not available, the IP sprayer directs the client to Web Server 2, which contains a replica of the database the client requested. You can configure the IP sprayer to alternate between Web Server 1 and Web Server 2 when directing HTTP client requests. This helps to balance the workload. If either server becomes unavailable, the IP sprayer directs all requests to the server that is still available.
[Figure: an HTTP client connects through an IP sprayer to Web Server 1 and Web Server 2.]
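The alternating and failover behavior described for the IP sprayer can be modeled as a round-robin that skips unavailable servers. The following Python sketch is a simplified model under that assumption, not the behavior of any particular IP sprayer product. The IPSprayer class and its method names are hypothetical.

```python
class IPSprayer:
    """Toy model: alternate between servers, skipping any that are down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.up = set(self.servers)   # servers currently considered available
        self._next = 0                # index of the server to try first

    def mark(self, server, available):
        """Record that a server became available or unavailable."""
        if available:
            self.up.add(server)
        else:
            self.up.discard(server)

    def route(self):
        """Return the server for the next request, or None if all are down."""
        count = len(self.servers)
        for offset in range(count):
            candidate = self.servers[(self._next + offset) % count]
            if candidate in self.up:
                # Start the next search just after the server we picked.
                self._next = (self._next + offset + 1) % count
                return candidate
        return None
```

With both servers available, successive calls to route() alternate between Web Server 1 and Web Server 2. After mark("Web Server 1", False), every request goes to Web Server 2 until the first server is marked available again.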
You can also use an IP sprayer for failover with POP3 clients.
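[Figure: an LDAP client connects through an IP sprayer to two directory (DIR) servers. Labels recovered: Mail 2, DIR, DIR, LDAP Client, IP Sprayer.]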
You can also use an operating system cluster for LDAP failover. If you are using directory assistance, you can use either directory assistance failover or Domino failover. For more information on configuring directory assistance to fail over, see the book Administering the Domino System.
This appendix describes these statistics. For more information about viewing statistics, see the book Administering the Domino System.
Statistic name: Description
AvailabilityThreshold: [description not recovered]
[name not recovered]: The current expansion factor. This value is used to compute the availability index. The values range from 1 to 64. A value of 1 indicates that the server is completing transactions at the minimum time for that server. A value of 64 indicates that it is taking 64 times longer than the minimum time to complete transactions. An expansion factor of 64 is considered fully loaded, and results in an availability index of 0 (zero).
OpenRedirects.Failover.Successful: Total times that server successfully redirects a client to another cluster member after the client fails to open a database by replica ID
[name not recovered]: Total times that server is unable to redirect a client to another cluster member after the client fails to open a database by replica ID
[name not recovered]: Total times server successfully redirects a client to another cluster member after the client fails to open a database by path name
[name not recovered]: Total times server is unable to redirect a client to another cluster member after the client fails to open a database by path name
[name not recovered]: Total times server successfully redirects a client to another cluster member after the client tries to open a database by replica ID when the server is BUSY
OpenRedirects.LoadBalance.Unsuccessful: Total times server is unable to redirect a client to another cluster member after the client tries to open a database by replica ID when the server is BUSY
OpenRedirects.LoadBalanceByPath.Unsuccessful: Total times that server is unable to redirect a client to another cluster member after the client tries to open a database by path name when the server is BUSY
OpenRedirects.LoadBalanceByPath.Successful: Total times that server successfully redirects a client to another cluster member after the client tries to open a database by path name when the server is BUSY
OpenRequest.ClusterBusy: Total client requests when all servers are BUSY
OpenRequest.DatabaseOutOfService: Total times a client tries to open a database that is marked out-of-service on the server
OpenRequest.LoadBalanced: Total times a client tries to open a database on the server when the server is BUSY
PortName: Default port used for intra-cluster network traffic or an asterisk, which indicates there is no default port and any available active port can be used
[name not recovered]: Total times that a server completes a probe of the other cluster members
[name not recovered]: Total times that a server receives an error when probing another server
[name not recovered]: Shows the interval at which an intracluster probe occurs
* This statistic does not appear in the Cluster statistics report, but you can use the Show Stat command to view it or you can add it to the Cluster statistics report form.
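If you capture the console output of the Show Stat command, a short script can turn it into a lookup table for further checks. The sketch below assumes that each statistic is printed on its own line in the form "Name = Value" and that numeric values may contain thousands separators. Both points are assumptions about the output format, not something this appendix specifies. The function name parse_show_stat is hypothetical.

```python
def parse_show_stat(text):
    """Parse captured 'Name = Value' lines into a dictionary.

    Assumed format: one statistic per line, name and value separated by '='.
    Lines without '=' (banners, prompts, blank lines) are ignored.
    """
    stats = {}
    for line in text.splitlines():
        name, separator, value = line.partition("=")
        name = name.strip()
        value = value.strip()
        if not separator or not name:
            continue
        try:
            # Assumption: numbers may be printed with commas, such as 1,204.
            stats[name] = int(value.replace(",", ""))
        except ValueError:
            # Keep non-numeric values, such as a port name or '*', as text.
            stats[name] = value
    return stats
```

A caller could then read a single value with, for example, stats.get("OpenRequest.ClusterBusy"), which returns None when the statistic is not present in the captured text.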
Description
Total times the Cluster Replicator did not attempt to replicate a database. The retry is skipped when the destination server is known to be unreachable or the database is waiting for another retry attempt.
Total number of replicas that are waiting for retry attempts
Total time, in seconds, that the last database replicated spent on the work queue
Average time, in seconds, that a database spent on the work queue
Maximum time, in seconds, that a database spent on the work queue
Total bytes received during cluster replication
Total bytes sent during cluster replication
Current number of databases awaiting replication by the Cluster Replicator
Average work queue depth since the server started
Maximum work queue depth since the server started
Statistic name: Description
[name not recovered]: The number of HTTP requests the ICM received in the past hour
[name not recovered]: The number of HTTP requests the ICM received in the past minute
ICM.Requests.Per5Minutes.Total: The number of HTTP requests the ICM received in the past 5 minutes
ICM.Server.Running: Tells whether the ICM task is running
ICM.Sessions.Inbound.Accept.Queue: The number of new connections that have been detected and are waiting to be serviced by a server thread
ICM.Sessions.Inbound.Active: Current number of inbound connections
ICM.Sessions.Inbound.Active.SSL: Current number of inbound connections that are SSL connections
ICM.Sessions.Inbound.BytesReceived: Total number of bytes received by all inbound connections since the server started
ICM.Sessions.Inbound.BytesSent: Total number of bytes sent by all inbound connections since the server started
ICM.Sessions.Inbound.Peak: The maximum number of concurrent inbound connections since the server started
ICM.Sessions.Inbound.SSL: The maximum number of concurrent inbound SSL connections since the server started
ICM.Sessions.Inbound.Total: Total number of inbound connections since the server started
ICM.Sessions.Inbound.Total.SSL: Total number of inbound SSL connections since the server started
ICM.Sessions.Outbound.Active: Current number of outbound connections
ICM.Sessions.Outbound.Active.SSL: Current number of outbound connections that are SSL connections
ICM.Sessions.Outbound.BytesReceived: Total number of bytes received by all outbound connections since the server started
ICM.Sessions.Outbound.BytesSent: Total number of bytes sent by all outbound connections since the server started
ICM.Sessions.Outbound.Peak: The maximum number of concurrent outbound connections since the server started
ICM.Sessions.Outbound.Peak.SSL: The maximum number of concurrent outbound SSL connections since the server started
[name not recovered]: Total number of outbound connections since the server started
[name not recovered]: Total number of outbound SSL connections since the server started
[name not recovered]: Number of threads that are currently idle
[name not recovered]: Current number of threads created
The modified database and its replicas on other servers are listed in all the Cluster Database Directories.
All replicas of the modified databases have the same Replica ID. To check this, open the Databases by Replica ID view in the Cluster Database Directory.
The Cluster Replicator is not encountering errors when it attempts to replicate to other servers in the cluster. Check the Replica.Cluster.Failed and Replica.Cluster.Retry.Waiting statistics to see if error conditions exist. Also, examine the Replication Events log documents generated by the Cluster Replicator.
The Cluster Replicator is able to keep up with the current server replication workload. Check the Replica.Cluster.WorkQueueDepth and Replica.Cluster.SecondsOnQueue statistics to determine if there is a backlog of replication requests. If so, consider starting an additional Cluster Replicator.
Cluster replication is enabled for all replicas of the database. Open the Cluster Database Directory, and check the left column for the letter X. Databases with the letter X in the left column have cluster replication disabled.
CLREPL_OBEYS_QUOTAS is set to 0 (zero) if you used this setting in the Configuration Settings document or in the NOTES.INI file. When this setting is set to 1, cluster replication obeys the database size quotas that were set by the database administrator. The Cluster Replicator will not push changes to a replica if the changes would result in the replica exceeding its size quota. If CLREPL_OBEYS_QUOTAS is set to 0 or is not present at all, the Cluster Replicator ignores database size quotas.
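As a quick way to check this setting on a copy of NOTES.INI, the following Python sketch reads the file text and reports whether the Cluster Replicator would obey quotas according to the rule above (1 means obey, 0 or absent means ignore). It assumes plain KEY=value lines and uses the first matching line. It does not look at the Configuration Settings document, which can also carry the setting. The function name is hypothetical.

```python
def clrepl_obeys_quotas(notes_ini_text):
    """Return True only if CLREPL_OBEYS_QUOTAS is present and set to 1.

    Assumptions: one KEY=value pair per line, key matched without regard
    to case, and the first occurrence of the key is the one that counts.
    """
    for line in notes_ini_text.splitlines():
        key, separator, value = line.partition("=")
        if separator and key.strip().upper() == "CLREPL_OBEYS_QUOTAS":
            return value.strip() == "1"
    # Setting not present: the Cluster Replicator ignores size quotas.
    return False
```

If this returns True and changes are not reaching a replica, compare the size of the replica with its quota before looking for other causes.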
Client requests do not fail over for certain databases even though the replicas are listed in the Cluster Database Directory
When there are two or more replicas of a database on a server, the Cluster Manager uses failover by path, not failover by replica ID. To ensure that client requests fail over correctly, do not include multiple replicas of a database on the same server; or if you do, create the replicas using the same names and paths as the replicas to which you want to fail over.
The Cluster Database Directory includes two copies of the database documents for all the databases on a particular server
If the Cluster Database Directory on a server is deleted, the Cluster Database Directory Manager recreates it and then populates it with a document for each database on the server. These documents then replicate to the other servers in the cluster. Since each server in the cluster already has documents for this server's databases, their Cluster Database Directories will then contain two documents for each database on this server. This is a temporary condition and causes no system errors. The next time the server's Cluster Database Directory Manager starts, it detects the problem and removes the extra documents. To avoid creating duplicate documents, replicate the Cluster Database Directory from another server to the server on which the Cluster Database Directory was deleted before you restart the server.
To view the error conditions, examine each of the Log documents generated by this command (one for each server being replicated to), and then correct the errors. You can sometimes correct the problem by restarting a server that is currently unreachable. When the errors are corrected, cluster replication succeeds, and the Replica.Cluster.Retry.Waiting statistic becomes zero.
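The statistic checks described in this appendix can be combined into a small health check. The Python sketch below takes a dictionary of statistic values (however you collected them) and returns plain-language warnings. The statistic names are the ones mentioned in this appendix. The function name and the two threshold defaults are hypothetical and should be tuned for your own cluster.

```python
def replication_warnings(stats, max_queue_depth=10, max_seconds_on_queue=60):
    """Return a list of warnings based on Cluster Replicator statistics.

    stats maps statistic names to numbers. Missing statistics count as 0.
    The thresholds are illustrative assumptions, not documented limits.
    """
    warnings = []
    if stats.get("Replica.Cluster.Failed", 0) > 0:
        warnings.append("Replication failures recorded: check the "
                        "Replication Events log documents.")
    if stats.get("Replica.Cluster.Retry.Waiting", 0) > 0:
        warnings.append("Replicas are waiting for retry: a destination "
                        "server may be unreachable.")
    depth = stats.get("Replica.Cluster.WorkQueueDepth", 0)
    seconds = stats.get("Replica.Cluster.SecondsOnQueue", 0)
    if depth > max_queue_depth or seconds > max_seconds_on_queue:
        warnings.append("Replication backlog: consider starting an "
                        "additional Cluster Replicator.")
    return warnings
```

An empty list means none of the three conditions was detected, which matches the healthy state in which Replica.Cluster.Retry.Waiting is zero.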
Cluster Replicator was unable to configure using Cluster Database Directory cldbdir.nsf: File does not exist
This message can occur for the following reasons:
The Cluster Replicator cannot find the Cluster Database Directory. This often occurs when you first add a server to a cluster and the Cluster Replicator starts before the Cluster Database Directory Manager has created the server's Cluster Database Directory. If this is the cause of the problem, it will resolve itself.
The Cluster Database Directory has been deleted. Replicate the Cluster Database Directory from a different cluster server.
Cluster Replicator was unable to configure using Cluster Database Directory cldbdir.nsf: Invalid replica ID for cluster database directory. If cluster name changed, delete cluster database directory and restart cldbdir task.
The ClReplD field in the Server document in the Domino Directory does not match the replica ID of the Cluster Database Directory. To fix this, you can delete the Cluster Database Directory from this server and then replicate it from a different cluster server. If this doesn't correct the problem, remove the server from the cluster and add it to the cluster again.
HTTP Server Initialization error. Could not bind port 80. Port may be in use.
This message can occur if you have conflicting IP addresses or port numbers when you attempt to start the Domino Web server on a server that is running the ICM. The most likely cause is that the ICM and HTTP task are both attempting to use the same IP address and TCP/IP port. Check the Server document to ensure that the ICM and the HTTP server have been assigned different TCP/IP port numbers or that the ICM is configured to use a different IP address than the HTTP server. If the ICM and HTTP server are both using port 80, but on different IP addresses, make sure that you have chosen Enabled in the Bind to host name field on the Internet Protocols - HTTP tab in the Server document.
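The conflict described above comes down to two tasks asking for the same address and port pair. The Python sketch below expresses that check for values you have read from the Server document. It is a simplified model: it assumes that an empty address means the task listens on every address of the machine, which is an assumption of this example and not a statement about how Domino stores the setting. The function name is hypothetical.

```python
def bindings_conflict(icm, http):
    """Return True if the ICM and the HTTP task would collide.

    Each argument is an (ip_address, port) tuple. An empty ip_address is
    treated as 'all addresses' (an assumption made for this sketch).
    """
    icm_ip, icm_port = icm
    http_ip, http_port = http
    if icm_port != http_port:
        return False          # different ports never collide
    if not icm_ip or not http_ip:
        return True           # one task claims the port on every address
    return icm_ip == http_ip  # same port: collide only on the same address
```

For example, bindings_conflict(("192.0.2.10", 80), ("192.0.2.11", 80)) is False because the two tasks share port 80 on different addresses, which is the configuration that requires the Bind to host name field to be enabled.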
Clients receive the message Server Not Responding instead of failing over
If a server becomes unavailable while a database is open, failover does not occur. The user should reopen the database. Reopening the database causes failover to occur if a replica exists on an available server in the cluster. If the user was editing a document when the server became unavailable, the user can copy the document to the replica.
Server Not Responding can also appear when a user tries to send and save a message when the user's mail server is unavailable. The message is sent successfully because the mail router fails over. (The user can see that the message was sent successfully by clicking the status history list in the status bar.) However, saving a message or document does not cause failover. To save the message, the user can reopen the database, which causes failover if a replica mail database exists on an available server. The user can then copy the sent message to the replica.