AIX Version 7.2: Cluster management
Cluster management
IBM
This edition applies to AIX Version 7.3 and to all subsequent releases and modifications until otherwise indicated in new
editions.
© Copyright International Business Machines Corporation 2021.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with
IBM Corp.
Contents
Cluster management
  Cluster Aware concepts
    CAA ports
    Cluster repository
    Cluster system architecture flow
    Naming a cluster
    Cluster communication
    Deadman switch
    Linked cluster
    Asymmetric topology reconciliation
    Unicast communication
    Rolling upgrade and coexistence with prior AIX technology levels
    IPv6 support
  Configuring Cluster Aware
    Setting up cluster SAN communication
    Configuring cluster security
    CAA licensing
  Managing clusters with commands
  Managing cluster events
  Programming cluster sockets
  Troubleshooting Cluster Aware
    Troubleshooting with the snap command
    Troubleshooting with node maintenance mode
    Troubleshooting with component trace
  Sample output for cluster commands
    clcmd date command sample output
    lscluster -d command sample output
    lscluster -i command sample output
    lscluster -m command sample output
    lscluster -s command sample output
    nodeState cluster event sample output
  Code samples for cluster events
    Cluster events using AHAFS sample code
Notices
  Privacy policy considerations
  Trademarks
About this document
The Cluster Aware function is part of the AIX operating system. Using Cluster Aware AIX, you can create a
cluster of AIX nodes and build a highly available architectural solution for a data center.
Highlighting
The following highlighting conventions are used in this document:
Bold        Identifies commands, subroutines, keywords, files, structures, directories, and other
            items whose names are predefined by the system. Also identifies graphical objects
            such as buttons, labels, and icons that the user selects.
Italics     Identifies parameters whose actual names or values are to be supplied by the user.
Monospace   Identifies examples of specific data values, examples of text similar to what you
            might see displayed, examples of portions of program code similar to what you
            might write as a programmer, messages from the system, or information you should
            actually type.
Case-sensitivity in AIX
Everything in the AIX operating system is case-sensitive, which means that it distinguishes between
uppercase and lowercase letters. For example, you can use the ls command to list files. If you type LS,
the system responds that the command is not found. Likewise, FILEA, FiLea, and filea are three
distinct file names, even if they reside in the same directory. To avoid causing undesirable actions to be
performed, always ensure that you use the correct case.
ISO 9000
ISO 9000 registered quality systems were used in the development and manufacturing of this product.
Cluster repository
The cluster repository disk is used as the central repository for the cluster configuration data.
The cluster repository disk must be accessible from all nodes in the cluster. The minimal size of the
repository is largely dependent upon the cluster configuration. A minimal disk size of 10 GB is preferred.
For VIOS, PowerHA pureScale cluster, see the respective release notes for the minimal size.
The cluster repository disk is backed up by a redundant and highly available storage configuration.
The cluster repository disk should be configured for RAID to accommodate the requirements of the data
center.
The cluster repository disk is a special device for the cluster. LVM commands are not supported
on the cluster repository disk. The AIX LVM commands are single-node administrative
commands and are not applicable in a clustered configuration.
Due to the special device characteristics required by the cluster repository disk, a raw section of the disk
and a section of the disk that contains a special volume group and special logical volumes are used during
cluster operations.
When CAA is configured with repos_loss mode set to assert and CAA loses access to the repository
disk, the system automatically shuts down.
Reservation policy for repository disk
The following is an explanation of the reservation policy used in Cluster Aware.
All storage area network (SAN) provisioned disks must be zoned to all Fibre Channel adapters on the
Virtual I/O Servers that will be members of the shared storage pool cluster.
The disks must have the reserve policy set to no_reserve. One disk with a minimum of 1 GB is used as
the repository disk for the cluster.
Notes:
• Cluster Aware AIX (CAA) opens the repository disk, and CAA sets the ODM reserve attribute to
no_reserve for all storage types.
• For nonrepository disks, use the chdev command to change the attribute to no_reserve.
• The cluster repository disk must be compliant with the 512 byte block size.
Related information
chdev Command
Naming a cluster
When you name a cluster, you must follow specific guidelines.
The only acceptable ASCII characters you can use when naming a cluster are A - Z, a - z, 0 - 9, -
(hyphen), . (period), and _ (underscore). The first character of the cluster name and domain name cannot
be a hyphen. The maximum length of a cluster name is 63 characters.
Cluster communication
Cluster communication takes advantage of traditional networking interfaces, such as IP based network
communications and storage interface communication through Fibre Channel and SAS adapters.
When you use both the IP-based network communications and the storage interface communications, all
nodes in the cluster can always communicate with any other nodes in the cluster configuration. Having
clusters in this configuration eliminates "split brain" incidents.
You must complete the Fibre Channel setup before the cluster can use the storage interfaces as an
alternative communication path. The SAS adapter does not require special setup.
During storage area network (SAN) port configuration, you must verify that your server interfaces are
connected to the SAN fabric ports in the same zone.
Related concepts
Setting up cluster SAN communication
You must complete the following setup before creating a cluster that uses storage communication
interfaces.
shutdown -restart
2. On the Hardware Management Console (HMC), add a virtual Ethernet adapter to the profile of each
PowerHA SystemMirror virtual client node that has a VLAN ID of 3358.
3. Reactivate the partition by using the new profile. The partition boots with the new profile and
displays a new entX adapter. To display the interface status, enter the lscluster -i command.
Notes:
1. VLAN 3358 must be created on the virtual client LPARs and VIOS servers.
2. VLAN 3358 is the only value that CAA uses. The VLAN tag of sfw0 must not be changed.
3. The entX adapter that is associated with VLAN 3358 does not require an enX interface or an IP
address.
4. VLAN 3358 must not be bridged to the Shared Ethernet Adapter (SEA).
5. When SAN communication is configured properly, the lscluster -m command shows the status of
the sfwcom (storage framework communication) interface as up.
6. The VIOS fcs adapter that serves the repository disk through N_Port ID Virtualization (NPIV) can also
be used for SAN communication. However, this configuration represents a single point of failure;
therefore, different VIOS fcs adapters must be used for the repository and for SAN communication.
Deadman switch
A deadman switch is an action that occurs when Cluster Aware AIX (CAA) detects that a node has become
isolated in a multinode environment. A node is isolated when it can communicate with the other nodes
through neither the network nor the repository disk.
The AIX operating system reacts according to the deadman switch setting, which is controlled by the
deadman_mode tunable. The deadman switch mode can be set either to force a system shutdown
or to generate an Autonomic Health Advisor File System (AHAFS) event.
Related information
clctrl Command
Linked cluster
IBM AIX 7.1 with Technology Level 2 Cluster Aware AIX (CAA) introduces the concept of a linked cluster.
A linked cluster provides the reliable exchange of data and control messages between two or more nodes
that are part of the same cluster but that are separated by geographical boundaries. Each location is
called a site. The AIX 7 with 7100-02 CAA supports up to two sites.
The only mode of communication between nodes that are in two sites is through TCP/IP. There is no
Storage Area Network (SAN) or disk communication.
Unicast communication
Cluster Aware AIX (CAA) uses multicast communications for heartbeat and other protocol messages,
which might require additional network setup at the customer site. A unicast cluster gives CAA the
capability to support clustering with simultaneous unicasting of CAA protocol messages instead of
multicasting. Unicast mode applies to all sites within the CAA cluster.
The communication mode of the cluster can be toggled at run time by using the clctrl -tune
command and changing the value of the communication_mode tunable parameter, between u (for
unicast) and m (for multicast). The CAA default value is m but it can vary depending on product. For
example, VIOS SSP defaults to the unicast mode.
Rolling upgrade and coexistence with prior AIX technology levels
A rolling upgrade of a cluster is done by taking a node offline and upgrading it to a new AIX technology
level, while the other nodes remain active. After a node is upgraded, the node is rebooted and brought
online by issuing the clctrl command. This process is repeated until all the nodes are upgraded.
In a mixed cluster environment, nodes running AIX 7 with 7100-02 (CAA) maintain compatibility with
nodes that are still running prior AIX technology levels by running at the lowest effective level. New
features are not enabled until all the cluster nodes are upgraded to the new technology level.
For example, AIX 7 with 7100-02 (CAA) introduces support for IPv6 networks and multiple sites. This
support is not available until the entire cluster is upgraded to AIX 7 with 7100-02 (CAA).
Rolling upgrade and coexistence support are not provided for nodes running AIX 7.1 or AIX 7.1 SP5
(CAA) unless the mandatory APARs are installed. Nodes that have AIX 7.1 must have APAR IV16481. If
your nodes do not have the required APARs, a total cluster outage is still required. In that situation, you
must remove your cluster, install AIX 7 with 7100-02 (CAA) on all of your nodes, and then re-create your
cluster.
Note: Applying the mandatory APARs also requires a total cluster outage, so it is worthwhile to install the
mandatory APARs, if you immediately plan to install AIX 7 with 7100-02 (CAA).
If you are running other clustering software, such as PowerHA SystemMirror, on top of your CAA cluster,
see the documentation for that software for additional information and instructions for upgrading your
cluster.
IPv6 support
IBM AIX 7.1 with Technology Level 2 Cluster Aware AIX (CAA) introduces support for Internet Protocol
version 6 (IPv6) for network-based communications.
With this support, nodes are now able to participate in homogeneous IPv6 and heterogeneous IPv4 and
IPv6 network environments.
Network interfaces configured with IPv6 are automatically detected and used by the CAA kernel
communications services. Network interfaces configured with both IPv4 and IPv6 maintain heartbeat
and communicate over both versions of IP.
The lscluster command has been updated to support IPv6:
• IPv6 addresses configured over monitored network interfaces will be displayed.
• The IP protocol for each network-based point-of-contact will be displayed.
The IPv6 multicast group is of site-local scope and is generated by using the IPv4 multicast group that
was either manually specified or automatically generated. Specifically, the IPv4 multicast group occupies
the bottom 32-bit word of a standard IPv6 site-local multicast address. The AIX 7 with 7100-02 CAA
does not allow you to specify or change the IPv6 multicast group used for the cluster. The multiple-site
feature introduced in AIX 7 with 7100-02 CAA requires that each site have its own unique multicast
group. The site multicast group is either specified or automatically generated when the site is created.
The ability to directly define a site's IPv6 multicast group is not supported.
Through a rolling upgrade, you can upgrade an existing CAA cluster at AIX 7 with 7100-01 or AIX 7 with
7100-01 SP4, which does not support IPv6, to AIX 7 with 7100-02, which does support IPv6.
Additionally, for clusters that you plan to run exclusively over IPv6, you must specify the IPv6
capabilities flag during cluster creation to indicate that IPv6 support is required on all nodes to
create the cluster.
VLAN pseudoadapter support
The IBM AIX 7 with 7100-02 release of Cluster Aware AIX (CAA) supports VLAN pseudoadapters
for participation in VLAN networks. Network interfaces configured over VLAN pseudoadapters are
automatically detected and used for CAA kernel communications services.
Related concepts
Linked cluster
Note: If you booted from the Fibre Channel adapter, you do not need to complete this step.
2. Run the following command:
Note: If you booted from the Fibre Channel adapter, add the -P flag.
3. Run the following command:
5. Verify the configuration changes by running the following command:
The following is an example of the output displayed from the lsdev -C | grep sfwcom command:
After you create the cluster, you can list the cluster interfaces and view the storage interfaces by running
the following command:
lscluster -i
Related concepts
Cluster communication
Cluster communication takes advantage of traditional networking interfaces, such as IP based network
communications and storage interface communication through Fibre Channel and SAS adapters.
smitty clustsec
Related information
clctrl Command
CAA licensing
A list of product versions for which CAA is licensed.
The following table lists the product versions for which CAA is licensed:
CAA licensed                     AIX 6.1                        AIX 7.1
                                 Express  Standard  Enterprise  Express  Standard  Enterprise
PowerHA®                         Yes      Yes       Yes         Yes      Yes       Yes
VIOS SSP (shared storage pools)  Yes      Yes       Yes         Yes      Yes       Yes
External consumer                No       No        No          No       Yes       Yes
chcluster
Use this command to change the cluster configuration. The following example adds a node to the
cluster configuration:
rmcluster
Use this command to remove the cluster configuration. The following example removes the cluster
configuration:
rmcluster -n mycluster
lscluster
Use this command to list cluster configuration information. The following example lists the cluster
configuration for all nodes:
lscluster -m
clcmd
Use this command to distribute a command to a set of nodes that are members of a cluster. The
following example lists the date for all the nodes in the cluster:
clcmd date
Related concepts
Sample output for cluster commands
You can view sample output for the lscluster -d command, the lscluster -i command, the
lscluster -m command, and the lscluster -s command.
Related information
chcluster command
clcmd command
lscluster command
mkcluster command
rmcluster command
The AHAFS file system is automatically mounted when you create the cluster. If the AHAFS file system is
already mounted by another application before the cluster is created, the original mount point is used by
the cluster configuration.
./socksimple -r -a 1
./socksimple -s -a 1
Note: The -a (address) option sends the packets to node 1 in this local cluster.
The following code is output from running the socksimple -s -a 1 command:
./socksimple -s -a 1
socksimple version 1.2
socksimple 1/12 with ttl=1:
snap caa
The following structure is an example of the data files collected during the snap script execution for
Cluster Aware AIX:
/tmp/ibmsupt
 |
 '-- caa
      |
      '-- Data
           |
           |-- 20100817215934 (For example, a timestamp at which "snap caa" was run)
           |    |
           |    |-- nodeA.austin.ibm.com.tar.gz
           |    |-- ...
           |    |-- nodeB.austin.ibm.com.tar.gz
           |    |--
           |    |-- nodeC.austin.ibm.com.tar.gz
           |
           '-- ... (For example, more timestamp directories to distinguish separate "snap caa" invocations)
Related information
snap command
Nodes that have been stopped do not participate in cluster configuration or communications and are seen
by the other nodes as down. The stopped state is persistent. Nodes that have been stopped must be
explicitly started via the clctrl -start command before they can resume cluster participation.
To set a node in maintenance mode, run the following command:
-------------------------------
NODE nodeB.austin.ibm.com
-------------------------------
Fri Jul 30 08:00:00 CDT 2010
-------------------------------
NODE nodeC.austin.ibm.com
-------------------------------
Fri Jul 30 08:00:00 CDT 2010
Node nodeA.austin.ibm.com
Node UUID = 1602a950-e651-11e1-84be-00145e76c700
Number of disks discovered = 2
hdisk6:
State : UP
uDid : 200B75DC891480507210790003IBMfcp
uUid : 447dac46-c779-c5ff-ca46-7f885ec6f742
Site uUid : 51735173-5173-5173-5173-517351735173
Type : CLUSDISK
hdisk7:
State : UP
uDid : 200B75DC891480607210790003IBMfcp
uUid : 3e77c6b6-5624-d27a-01d9-9b291c5e8437
Site uUid : 51735173-5173-5173-5173-517351735173
Type : REPDISK
Node nodeB.austin.ibm.com
Node UUID = ebc9b154-e70b-11e1-a379-00145e76c700
Number of disks discovered = 2
hdisk6:
State : UP
uDid : 200B75DC891480507210790003IBMfcp
uUid : 447dac46-c779-c5ff-ca46-7f885ec6f742
Site uUid : 51735173-5173-5173-5173-517351735173
Type : CLUSDISK
hdisk7:
State : UP
uDid : 200B75DC891480607210790003IBMfcp
uUid : 3e77c6b6-5624-d27a-01d9-9b291c5e8437
Site uUid : 51735173-5173-5173-5173-517351735173
Type : REPDISK
Node nodeA.austin.ibm.com
Node UUID = 1602a950-e651-11e1-84be-00145e76c700
Number of interfaces discovered = 2
Interface number 1, en0
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 00:14:5E:E7:01:F1
Smoothed RTT across interface = 8
Mean deviation in network RTT across interface = 3
Probe interval for interface = 110 ms
IFNET flags for interface = 0x1E080863
NDD flags for interface = 0x0061081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.3.207.183 broadcast 10.3.207.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.3.207.179
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 330
Mean deviation in network RTT across interface = 214
Probe interval for interface = 5440 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
Node nodeB.austin.ibm.com
Node UUID = 6bdfd974-e651-11e1-a546-00145e76c700
Number of interfaces discovered = 2
Interface number 1, en0
IFNET type = 6 (IFT_ETHER)
NDD type = 7 (NDD_ISO88023)
MAC address length = 6
MAC address = 00:14:5E:E7:2C:B1
Smoothed RTT across interface = 7
Mean deviation in network RTT across interface = 3
Probe interval for interface = 100 ms
IFNET flags for interface = 0x1E080863
NDD flags for interface = 0x0061081B
Interface state = UP
Number of regular addresses configured on interface = 1
IPv4 ADDRESS: 10.3.207.197 broadcast 10.3.207.255 netmask 255.255.255.0
Number of cluster multicast addresses configured on interface = 1
IPv4 MULTICAST ADDRESS: 228.3.207.179
Interface number 2, dpcom
IFNET type = 0 (none)
NDD type = 305 (NDD_PINGCOMM)
Smoothed RTT across interface = 701
Mean deviation in network RTT across interface = 413
Probe interval for interface = 11140 ms
IFNET flags for interface = 0x00000000
NDD flags for interface = 0x00000009
Interface state = UP RESTRICTED AIX_CONTROLLED
---------------------------------------------------------------------
BEGIN_EVENT_INFO
TIME_tvsec=1280597380
TIME_tvnsec=591097152
SEQUENCE_NUM=4
RC_FROM_EVPROD=0
BEGIN_EVPROD_INFO
EVENT_TYPE=NODE_DOWN
NODE_NUMBER=1
NODE_ID=0xDCE3A808999111DFAA800245C0004002
CLUSTER_ID=0x22A3BFAE9CC611DFA9B80245C0002004
END_EVPROD_INFO
END_EVENT_INFO
Related concepts
Managing cluster events
AIX event management is implemented using a pseudofile system architecture. The use of the pseudofile
system allows you to use existing application programming interfaces (APIs) to program the monitoring of
events, such as a select() call or a blocking read() call.
The following is the code for test_prog:
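#include <stdio.h>
#include <stdlib.h> /* for atoi(), exit(), system() */
#include <string.h> /* for strcmp() */
#include <limits.h> /* for PATH_MAX */
#include <unistd.h> /* for read(), write() */
#include <fcntl.h>
#include <errno.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <libgen.h>
#include <usersec.h>
char *monFile;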
test_prog :: main
int main (int argc, char *argv[])
{
    int fd, outfd, rc, i=0, cnt=0;
    fd_set readfds;
    char *outputFile;
    char wrStr[MAX_WRITE_STR_LEN+1];
    char waitInRead[] = "WAIT_TYPE=WAIT_IN_READ";

    if (argc < 5)
        syntax( argv[0]);
    monFile = argv[1];
    if ( ! ahaMonFile(monFile) ) /* Not a .mon file under /aha */
        syntax( argv[0]);

    /* Create intermediate directories of the .mon file */
    rc = mk_parent_dirs(monFile);
    if (rc)
    {
        fprintf (stderr,
            "Could not create intermediate directories of the file %s !\n", monFile);
        return(-1);
    }
    printf("Monitor file name: %s\n", monFile);

    /* Copy the monitor specification; truncate rather than overflow wrStr */
    snprintf (wrStr, sizeof(wrStr), "%s", argv[2]);
    cnt = atoi(argv[3]);
    printf("Write String : %s\n", wrStr);
    outputFile = argv[4];

    /* open() requires a mode argument whenever O_CREAT is specified */
    fd = open (monFile, O_CREAT|O_RDWR, 0644);
    if (fd < 0)
    {
        fprintf (stderr,"Could not open the file %s; errno = %d\n", monFile,errno);
        exit (1);
    }
    outfd = open (outputFile, O_CREAT|O_RDWR, 0644);
    if (outfd < 0)
    {
        /* Report the output file, which is the file that failed to open */
        fprintf (stderr, "Could not open the file %s; errno = %d !\n", outputFile, errno);
        return(-1);
    }

    /* Write the monitor specification to the .mon file to register the request */
    write(fd, wrStr, strlen(wrStr));
test_prog :: syntax
/* -------------------------------------------------------------------------- */
void syntax(char *prog)
{
printf("\nSYNTAX: %s <aha-monitor-file> [<key1>=<value1>[;<key2>=<value2>;...]] <count> <outfile> \n",prog);
exit (1);
}
test_prog :: ahaMonFile
/* --------------------------------------------------------------------------
* PURPOSE: To check whether the file provided is an AHA monitor file.
*/
int ahaMonFile(char *str)
{
char cwd[PATH_MAX];
int len1=strlen(str), len2=strlen(".mon");
int rc = 0;
struct stat sbuf;
test_prog :: mk_parent_dirs
/*-----------------------------------------------------------------
* NAME: mk_parent_dirs()
* PURPOSE: To create intermediate directories of a .mon file if
* they are not created.
*/
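static int
mk_parent_dirs (char *path)
{
    char pathCopy[PATH_MAX];   /* dirname() may modify its argument, so work on a copy */
    char s[PATH_MAX + 32];     /* leaves room for the mkdir command prefix */
    char *dirp;
    struct stat buf;
    int rc=0;

    snprintf(pathCopy, sizeof(pathCopy), "%s", path);
    dirp = dirname(pathCopy);
    if (stat(dirp, &buf) != 0)
    {
        /* Parent directory does not exist yet; create it and any ancestors */
        snprintf(s, sizeof(s), "/usr/bin/mkdir -p %s", dirp);
        rc = system(s);
    }
    return (rc);
}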
test_prog :: read_data
/*-----------------------------------------------------------------
* PURPOSE: To parse and print the data received at the occurrence
* of the event.
*/
void
read_data (int fd,int outfd)
{
#define READ_BUF_SIZE 3072
char data[READ_BUF_SIZE];
char *p, *line;
char cmd[64];
time_t sec, nsec;
pid_t pid;
uid_t uid, luid;
gid_t gid;
char curTm[64];
int n;
int stackInfo = 0;
char uname[64], lname[64], gname[64];
For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual
Property Department in your country or send inquiries, in writing, to:
Such information may be available, subject to appropriate terms and conditions, including in some cases,
payment of a fee.
Portions of this code are derived from IBM Corp. Sample Programs.
© Copyright IBM Corp. _enter the year or years_.
For more information about the use of various technologies, including cookies, for these purposes,
see IBM’s Privacy Policy at http://www.ibm.com/privacy and IBM’s Online Privacy Statement at
http://www.ibm.com/privacy/details in the section entitled “Cookies, Web Beacons and Other Technologies”
and the “IBM Software Products and Software-as-a-Service Privacy Statement” at
http://www.ibm.com/software/info/product-privacy.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at
"Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.
IBM®