
User Guide

FlexFrame™ for mySAP™ Business Suite 3.2


Administration and Operation

This manual was produced by
cognitas. Gesellschaft für Technik-Dokumentation mbH (www.cognitas.de)

FlexFrame™ for mySAP™ Business Suite
Version 3.2

Administration and Operation

Edition May 2006


Document Version 1.0
Fujitsu Siemens Computers GmbH
Global Competence Center SAP
D 69190 Walldorf, Germany
© Copyright Fujitsu Siemens Computers GmbH 2006
FlexFrame™, PRIMECLUSTER™, PRIMEPOWER™ and PRIMERGY™ are trademarks of
Fujitsu Siemens Computers.
SPARC64® is a registered trademark of Fujitsu Ltd.
SAP®, mySAP™ and NetWeaver™ are trademarks or registered trademarks of SAP AG in
Germany and in several other countries.
Linux® is a registered trademark of Linus Torvalds.
SuSE® is a registered trademark of SuSE Linux AG.
Java™ and Solaris™ are trademarks of Sun Microsystems, Inc. in the United States and
other countries.
Intel® and PXE® are registered trademarks of Intel Corporation in the United States and
other countries.
MaxDB® is a registered trademark of MySQL AB, Sweden.
MySQL® is a registered trademark of MySQL AB, Sweden.
NetApp®, Network Appliance®, Open Network Technology for Appliance Products™, Write
Anywhere File Layout™ and WAFL™ are trademarks or registered trademarks of Network
Appliance, Inc. in the United States and other countries.
Oracle® is a registered trademark of ORACLE Corporation.
SPARC™ is a trademark of SPARC International, Inc. in the United States and other
countries.
Ethernet® is a registered trademark of XEROX, Inc., Digital Equipment Corporation and
Intel Corporation.
Windows®, Excel® and Word® are registered trademarks of Microsoft Corporation.
All other hardware and software names used are trademarks of their respective companies.

All rights, including rights of translation, reproduction by printing, copying or similar methods,
in part or in whole, are reserved.
Offenders will be liable for damages.
All rights, including rights created by patent grant or registration of a utility model or design,
are reserved.
Delivery subject to availability. Right of technical modification reserved.
Contents
1 Introduction ..................................................................................................... 1
1.1 Requirements .................................................................................................... 1
1.2 Notational Conventions ..................................................................................... 1
1.3 Document History.............................................................................................. 2
1.4 Related Documents........................................................................................... 2
1.5 Special Hints for FlexFrame 3.2 ........................................................................ 2

2 FlexFrame Architecture .................................................................................. 3


2.1 General Notes on FlexFrame 3.2 ...................................................................... 4
2.2 Hardware........................................................................................................... 5
2.3 Software ............................................................................................................ 6
2.4 Shared Operating System ................................................................................. 8
2.4.1 Shared OS Boot Concept .................................................................................. 8
2.4.2 Control Nodes ................................................................................................. 10
2.4.3 Application Nodes ........................................................................................... 10
2.4.3.1 Linux Application Nodes.................................................................................. 10
2.4.3.2 Solaris Application Nodes ............................................................................... 11
2.5 Switch and Filer Configuration......................................................................... 11
2.5.1 Adding a Switch to a Switch Group ................................................................. 11
2.5.2 Removing a Switch from a Switch Group ........................................................ 13
2.5.3 Listing a Switch Group Configuration .............................................................. 14
2.5.4 Changing the Password of a Switch Group ..................................................... 16
2.5.5 Changing the Host Name of a Switch Group................................................... 16
2.5.6 Adding a Switch Port Configuration................................................................. 17
2.5.7 Removing a Switch Port Configuration............................................................ 19
2.5.8 Displaying a Switch Port Configuration ........................................................... 20
2.5.9 Adding a New Filer .......................................................................................... 21
2.5.10 Displaying All Configured Filers....................................................................... 23
2.5.11 Displaying Filer Configuration.......................................................................... 24
2.5.12 Adding a Pool to a Filer ................................................................................... 25
2.5.13 Removing a Pool from a Filer .......................................................................... 26
2.5.14 Removing a Filer ............................................................................................. 27
2.6 LDAP............................................................................................................... 28
2.6.1 FlexFrame Structure in LDAP.......................................................................... 28
2.6.2 Working with LDAP ......................................................................................... 28
2.6.3 Disaster Recovery – Turning Replica into Master ........................................... 28
2.7 PRIMECLUSTER ............................................................................................ 29
2.7.1 PRIMECLUSTER Components ....................................................................... 29
2.7.2 FlexFrame Specific RMS Configuration .......................................................... 30
2.7.3 RMS Configuration – Schematic Overview ..................................................... 32
2.7.4 Node Failures .................................................................................................. 33
2.7.5 PRIMECLUSTER CLI Commands .................................................................. 33


2.7.6 Usage Example: Switch Application ................................................................ 35


2.7.7 PRIMECLUSTER Log Files ............................................................................. 37
2.7.8 Disaster Repair .............................................................................. 37
2.8 Network ........................................................................................................... 38
2.8.1 LAN Failover.................................................................................................... 38
2.8.2 Segments ........................................................................................................ 38
2.8.3 Network Switches ............................................................................................ 40
2.8.4 Automounter Concept...................................................................................... 40
2.9 Network Appliance Filer................................................................................... 44
2.9.1 Built-in Cluster File System.............................................................................. 45
2.9.2 Volume Layout................................................................................................. 45
2.9.3 Snapshots........................................................................................................ 45
2.9.4 Filer Cluster ..................................................................................................... 45

3 FlexFrame Basic Administration.................................................................. 46


3.1 Accessing a FlexFrame Landscape (Remote Administration) ......................... 46
3.2 Powering up the FlexFrame Landscape .......................................................... 46
3.3 Powering off the FlexFrame Landscape .......................................................... 48
3.4 Reactivating ANs after Power Shutdown by FA Agents .................................. 49

4 Displaying the Current FlexFrame Configuration State ............................. 51


4.1 Networks.......................................................................................................... 52
4.1.1 Script: ff_netscan.sh ........................................................................................ 54
4.1.2 Script: ff_pool_defrt.sh..................................................................................... 61
4.1.3 Script: ff_pool_dnssrv.sh ................................................................................. 62
4.2 State of Pools .................................................................................................. 63
4.3 State of Application Nodes .............................................................................. 63
4.4 State of SAP Systems ..................................................................................... 63
4.5 State of SID Instances ..................................................................................... 63

5 Web Interfaces ............................................................................................... 65


5.1 FlexFrame Web Portal..................................................................................... 65
5.2 FA Autonomous Agents................................................................................... 66
5.3 PRIMECLUSTER Administration..................................................................... 66
5.3.1 Cluster Foundation (CF) .................................................................................. 70
5.3.2 Reliant Monitor Services (RMS) ...................................................................... 71
5.3.3 Inconsistent and Faulted Applications ............................................................. 72
5.3.4 Switching Applications ..................................................................................... 74
5.3.5 Application Maintenance Mode........................................................................ 74
5.4 ServerView S2................................................................................................. 75

6 Hardware Changes ........................................................................................ 83


6.1 Changing BIOS Settings for Netboot ............................................................... 83
6.2 Replacing Network Cards ................................................................................ 89


6.2.1 Replacing a Network Card – Control Node ..................................................... 89


6.2.2 Replacing a Network Card – Application Node................................................ 90
6.3 Replacing Power Control Hardware ................................................................ 90
6.4 Exchanging a Control Node ............................................................................ 91
6.4.1 Hardware Failed, Hard Disk and Installed OS Are Not Affected...................... 91
6.4.2 One Hard Disk Is Defective, the Other One Is Undamaged ................ 92
6.4.3 The Control Node's OS Is Damaged................................................. 92
6.5 Replacing Switch Blades................................................................................. 92

7 Software Updates .......................................................................................... 93


7.1 Updating the entire FlexFrame Landscape ..................................................... 93
7.1.1 Upgrading from FlexFrame 3.0 or lower Version to FlexFrame 3.2................. 93
7.1.2 Upgrading from FlexFrame 3.1 to FlexFrame 3.2............................................ 93
7.2 Software Update on the Control Node............................................................. 93
7.2.1 ServerView Update via RPM ........................................................................... 94
7.2.2 Updating/Installing a New Linux Kernel........................................................... 95
7.2.2.1 Software Stage................................................................................................ 95
7.2.2.2 Install the New Kernel ..................................................................................... 96
7.2.2.3 New Kernel Source for PCL4 and ServerView ................................................ 98
7.2.2.4 Reboot the Control Node................................................................................. 99
7.2.3 Installing a New OS Image .............................................................................. 99
7.3 Installation of a New FA Agent Version ........................................................... 99
7.3.1 Migration at Pool Level.................................................................................. 100
7.3.2 FlexFrame Autonomy Command Line Interface ............................................ 101
7.3.3 Migration of FA Agent Versions on Pool Level .............................................. 101
7.4 The FA Migration Tool................................................................................... 103
7.4.1 Pool Mode ..................................................................................................... 103
7.4.2 File Mode ...................................................................................................... 103
7.4.3 Usage of Help................................................................................................ 104
7.4.4 Parameters of the FA Migration Tool............................................................. 104
7.5 Installing ONTAP Patches ............................................................................. 105
7.6 Third Party Software...................................................................................... 105

8 Administrating Application Nodes............................................................. 107


8.1 Listing Application Nodes .............................................................................. 107
8.1.1 Displaying Information on a Specific Application Node ................................. 107
8.1.2 Displaying Information on all Application Nodes............................................ 111
8.2 Adding Application Nodes ............................................................................. 113
8.3 Removing Application Nodes ........................................................................ 116
8.4 Renaming Application Nodes ........................................................................ 117
8.5 Administrating Blade Server Cabinets........................................................... 117
8.5.1 Listing Blade Server Cabinets ....................................................................... 117
8.5.1.1 Displaying Information on a Specific Blade Server Cabinet .......................... 117
8.5.1.2 Displaying Information on all Configured Blade Server Cabinets .................. 118
8.5.2 Adding Blade Server Cabinets ...................................................................... 119


8.5.3 Removing Blade Server Cabinets.................................................................. 123


8.5.4 Changing Switch Blade Type......................................................................... 124
8.5.5 Changing Switch Blade Name ....................................................................... 125
8.5.6 Changing Switch Blade Password................................................................. 126
8.5.7 Getting Switch Blade Initial Configuration...................................................... 127
8.6 Maintenance of Linux Application Node Images............................................ 129
8.6.1 Installing an Application Node Image from Installation Media ....................... 129
8.6.1.1 Installing the Application Node Image ........................................................... 130
8.6.1.2 Creating the New Netboot Configuration ....................................................... 131
8.6.1.3 Creating the New var Image .......................................................................... 131
8.6.2 Creating a New Linux OS Image for Application Nodes ................................ 132
8.6.2.1 Schematic Overview of the Maintenance Cycle............................................. 133
8.6.2.2 Creating a Custom Image Tree for Linux Application Nodes ......................... 134
8.6.2.3 Disabling the FA Agents ................................................................................ 134
8.6.2.4 Modifying the Netboot Configuration ............................................................. 135
8.6.2.5 Creating the New var Image .......................................................................... 135
8.6.2.6 Enabling Read-Write Access to the New Root Image ................................... 135
8.6.2.7 Maintaining the Application Node Image ....................................................... 136
8.6.2.8 Creating the New var_template ..................................................................... 136
8.6.2.9 Enabling the FA Agents ................................................................................. 137
8.6.2.10 Disabling Read-Write Access to the New Root Image................................... 137
8.6.2.11 Migrating the Remaining Application Nodes .................................................. 137
8.6.3 Service Packs................................................................................................ 137
8.6.4 Updating / Installing a New Linux Kernel ....................................................... 138
8.6.4.1 Software Stage .............................................................................................. 138
8.6.4.2 Installing a New Linux Kernel ........................................................................ 139
8.6.4.3 Creating a New initrd for Application Nodes .................................................. 141
8.6.4.4 New Netboot Configuration............................................................................ 141
8.6.4.5 Netboot Configuration for the First Test......................................................... 142
8.6.4.6 Removing Write Permissions for Maintenance Application Node on the Root
Image ............................................................................................................ 143
8.6.4.7 Changing All Netboot Configuration Templates to the New Linux Kernel...... 144
8.6.5 ServerView Update........................................................................................ 144
8.6.6 Upgrading RPM Packages on an Application Node ...................................... 146
8.6.7 Upgrading the Application Software .............................................................. 146
8.6.8 Migrating Remaining Application Nodes to the New Application Node Image .. 147
8.6.8.1 Modifying the Netboot Configuration ............................................................. 147
8.6.8.2 Creating the New var Images ........................................................................ 147
8.7 Installation / Activation of New Solaris Images .............................................. 148
8.7.1 Introduction.................................................................................................... 148
8.7.1.1 General Notes ............................................................................................... 148
8.7.1.2 How to Access the Console of a PW 250 / PW 450 ...................................... 149
8.7.1.3 OBP Flag: Local-mac-address?..................................................................... 150
8.7.1.4 OBP Flag: Auto-boot? ................................................................................... 150
8.7.1.5 What Happens When the Solaris Client Boots .............................................. 151


8.7.2 Solaris Image – rc scripts .............................................................................. 152


8.7.3 Preparation for Solaris Application Nodes ..................................................... 153
8.7.3.1 Using a Running Solaris Application Node as Helper System for Preparation .. 154
8.7.3.2 Setup a Solaris Helper System for Preparation, which is Connected to
FlexFrame but Booted from Local Disk ......................................................... 158
8.8 Solaris Image Maintenance Cycle ................................................................. 162
8.8.1 Introduction ................................................................................................... 162
8.8.2 Overview ....................................................................................................... 163
8.8.3 Running the Cycle ......................................................................................... 165
8.8.4 Create Solaris Application Nodes on the Maintained Image ......................... 168
8.9 Image Customization for Experts .................................................................. 169
8.10 Troubleshooting............................................................................................. 170
8.10.1 Solaris Image – Traces of Solaris rc-scripts .................................................. 170
8.10.2 Problems With /usr During Maintenance Cycle ............................................. 170
8.10.3 Boot Hangs With “Timeout waiting…” ........................................................... 171
8.10.4 Boot Hangs After “router IP is…”................................................................... 171
8.10.5 Boot Stops Complaining About /usr............................................................... 171
8.10.6 Boot Hangs When Trying to Mount /usr ........................................................ 171
8.10.7 Boot Asks for Date, Time and/or Locale........................................................ 171
8.11 Pools and Groups.......................................................................................... 172
8.11.1 Adding a Pool ................................................................................................ 172
8.11.2 Removing a Pool ........................................................................................... 175
8.11.3 Listing Pool Details........................................................................................ 176
8.11.4 Listing all Pools ............................................................................................. 181
8.11.5 Adding a Group to a Pool .............................................................................. 183
8.11.6 Removing Pool Group ................................................................................... 184
8.11.7 Changing Group Assignment of Application Nodes....................................... 184
8.11.8 Changing Group and Pool Assignment of Application Nodes ....................... 185
8.12 The Hosts Database...................................................................................... 185
8.12.1 Script: ff_hosts.sh.......................................................................................... 185
8.13 Rebooting All Application Nodes ................................................................... 187

9 Security ........................................................................................................ 189


9.1 Requested Passwords and Password Settings During Installation ............... 189
9.1.1 Requested Passwords During Installation of the Control Nodes ................... 189
9.1.2 Setting Passwords During Installation of a NetApp Filer ............................... 190
9.1.3 Initial SSH Configuration ............................................................................... 190
9.2 Password Settings During Operation, Update and Upgrade ......................... 191
9.2.1 User Administration ....................................................................................... 191
9.2.2 Password Management on Control Nodes .................................................... 192
9.2.2.1 Passwords for Root and Standard Unix Users .............................................. 192
9.2.2.2 Password for Root of LDAP Server and Replica ........................................... 193
9.2.2.3 Password for LDAPadmins ........................................................................... 195
9.2.2.4 Password for SNMP Community ................................................................... 195
9.2.2.5 Key for Access to the Name Server (named) ................................................ 196


9.2.2.6 Password for myAMC WebGUI ..................................................................... 196


9.2.2.7 Password for mySQL Database .................................................................... 196
9.2.2.8 Password for Power Shutdown...................................................................... 196
9.2.2.9 Password for SAPDB/MaxDB in FlexFrame Start/Stop Scripts ..................... 197
9.2.3 Password Management on Linux Application Nodes..................................... 197
9.2.3.1 Password for Admin on BX(3,6)00 ................................................................ 197
9.2.3.2 Password for PowerShutdown on BX(3,6)00, RXxxx .................................... 197
9.2.3.3 Passwords for Root and Standard Unix Users .............................................. 197
9.2.3.4 Passwords for SAP Users ............................................................................. 197
9.2.4 Password Management on Solaris Application Nodes .................................. 198
9.2.4.1 Password for PowerShutdown....................................................................... 198
9.2.4.2 Passwords for Root and Standard Unix Users .............................................. 198
9.2.4.3 Passwords for SAP Users ............................................................................. 198
9.2.5 Passwords for Networking Components........................................................ 198
9.2.5.1 Password for Cisco Switch Enable/Login and SNMP_Community ................ 198
9.2.5.2 Admin Password for SwitchBlade BX(3,6)00................................................. 200
9.2.5.3 Admin and SNMP_Community Storage Passwords ...................................... 200
9.2.6 Preparation for Linux Application Nodes ....................................................... 201
9.2.6.1 Script: ff_install_an_linux_images.sh............................................................. 201
9.2.7 Preparation for Solaris Application Nodes ..................................................... 202
9.2.7.1 Script: nb_unpack_bi ..................................................................................... 202
9.3 Passwords Stored in Initialization Files ......................................................... 202
9.3.1 Switch Group Switch Definition Files ............................................................. 202
9.4 Configuration of the SCON Shutdown Agent on the Control Nodes .............. 203
9.5 FA Agents...................................................................................................... 204
9.5.1 Power-Shutdown Configuration ..................................................................... 205
9.5.1.1 BX300/600..................................................................................................... 205
9.5.1.2 RX300/RX300 S2 .......................................................................................... 205
9.5.1.3 RX600 ........................................................................................................... 206
9.5.1.4 RX800 ........................................................................................................... 206
9.5.1.5 PRIMEPOWER 250/450................................................................................ 206
9.5.1.6 PRIMEPOWER 650/850................................................................................ 207
9.5.2 SNMP Traps .................................................................................................. 207
9.5.2.1 General.......................................................................................................... 207
9.5.2.2 Configuring User, Password and Community ................................................ 208
9.5.2.3 Configuring Management Blades .................................................................. 208
9.6 PRIMECLUSTER .......................................................................................... 208

10 Administrating SAP Systems ..................................................................... 211


10.1 Listing SAP SIDs and Instances .................................................................... 211
10.2 Adding / Removing SAP SIDs and Instances ................................................ 212
10.3 Cloning a SAP SID into a Different Pool ........................................................ 214
10.3.1 Script: ff_clone_sid.pl .................................................................................... 214
10.3.2 Script: ff_change_id.pl ................................................................................... 214
10.3.3 Changing User and Group IDs after Cloning ................................................. 217


10.4 Multiple Filers and Multiple Volumes ............................................................. 218


10.5 Upgrading a SAP System.............................................................................. 220
10.5.1 Service Port................................................................................................... 220
10.5.2 FA Agents ..................................................................................................... 221
10.6 SAP Kernel Updates and Patches................................................................. 222

11 Administrating SAP Services ..................................................................... 223


11.1 Displaying Status of SAP Services................................................................ 223
11.1.1 myAMC.FA WebGUI ..................................................................................... 223
11.1.2 Script: ff_list_services.sh ............................................................................... 224
11.2 Starting and Stopping Application Services................................................... 225
11.2.1 SAP Service Scripts ...................................................................................... 225
11.2.2 SAP Service Script Actions ........................................................................... 226
11.2.3 SAP Service Scripts User Exits ..................................................................... 227
11.3 Return Code of the SAP Service Scripts ....................................................... 228
11.4 Starting and Stopping Multiple SAP Services ................................................ 229
11.4.1 Details on Controlling Multiple SAP Services ................................................ 229
11.5 Removing an Application from Monitoring by FA Agents .............................. 230
11.5.1 Stopping and Starting an Application for Upgrades Using r3up .................... 231
11.6 Service Switchover........................................................................................ 232

12 SAP ACC ...................................................................................................... 235


12.1 Integration of New Servers, Pools and SAP Services ................................... 235
12.1.1 Integration of New ACC Pools (=FF groups) ................................................. 235
12.1.2 Integration of New Servers ............................................................................ 235
12.1.3 Integration of new SAP Services ................................................................... 236
12.1.3.1 Moving SAP Services into SLD ..................................................................... 236
12.2 User Administration ....................................................................................... 236
12.3 Usage of ACC ............................................................................................... 236
12.3.1 Displaying Status of SAP Services................................................................ 236
12.3.2 Starting SAP Services ................................................................................... 237
12.3.3 Stopping SAP Services .................................................................. 237
12.3.4 Relocating SAP Services .............................................................................. 237
12.3.5 Archiving the ACC Log .................................................................................. 238

13 Configuring FA Agents ............................................................................... 239


13.1 Groups .......................................................................................................... 240
13.1.1 General ......................................................................................................... 240
13.1.2 Service Classes............................................................................................. 241
13.1.3 Service Priority .............................................................................................. 241
13.1.4 Service Power Value ..................................................................................... 242
13.1.5 Class Creation Rules..................................................................................... 242
13.2 Traps ............................................................................................................. 242
13.2.1 General ......................................................................................................... 242
13.2.2 Changing the Trap Destinations .................................................................... 242


13.3 FlexFrame Autonomy .................................................................................... 243


13.3.1 General Parameters ...................................................................................... 243
13.3.2 Node-Related Parameters ............................................................................. 244
13.3.3 Service-Related Parameters.......................................................................... 246
13.3.4 Path Configuration......................................................................................... 248
13.4 Power Management (On/Off/Power-Cycle) ................................................... 249
13.4.1 General.......................................................................................................... 249
13.4.2 Architecture ................................................................................................... 250
13.4.3 Configuring User, Password and Community ................................................ 251
13.4.4 Configuring Management Blades .................................................................. 251
13.5 Linux Kernel Crash Dump (LKCD) Utilities .................................................... 252

14 Data Protection – Backup and Restore ..................................................... 255


14.1 Backup of Filer Volumes with NetApp Snapshot ........................................... 256
14.1.1 Filer Volumes................................................................................................. 256
14.1.2 Snapshot Schedules...................................................................................... 257
14.2 Backup of SAP Databases ............................................................................ 258
14.3 Restore SnapShot ......................................................................................... 262
14.4 FlexFrame Backup with Tape Library ............................................................ 263
14.4.1 Arcserve ........................................................................................................ 263
14.4.2 NetWorker ..................................................................................................... 264
14.5 Backup / Restore of FlexFrame Control Nodes ............................................. 265
14.5.1 Backup of a Control Node.............................................................................. 265
14.5.2 Restore of a Control Node ............................................................................. 265
14.6 Backing Up Switch Configurations................................................................. 267
14.7 Restoring Switch Configuration ..................................................................... 268

15 Error Handling & Trouble Shooting ........................................................... 269


15.1 Log Files ........................................................................................................ 269
15.2 Network Errors............................................................................................... 270
15.3 NFS Mount Messages ................................................................................... 270
15.4 LDAP Error Codes and Messages................................................................. 271
15.5 FA Agents Error Diagnosis ............................................................................ 272
15.6 FA Agents Operation and Log Files............................................................... 276
15.6.1 General.......................................................................................................... 276
15.6.2 Overview, Important Files and Directories ..................................... 277
15.6.3 Special Files .................................................................................................. 280
15.6.3.1 Livelist ........................................................................................................... 281
15.6.3.2 Services List .................................................................................................. 281
15.6.3.3 Services Log.................................................................................................. 281
15.6.3.4 Reboot ........................................................................................................... 281
15.6.3.5 Switchover ..................................................................................................... 281
15.6.3.6 XML Repository ............................................................................................. 281
15.6.3.7 BlackBoard .................................................................................................... 282
15.6.4 FA Autonomy Diagnostic Tool ....................................................................... 282


15.6.5 Data for Diagnosis in the Support Department .............................................. 282


15.7 Start/Stop Script Errors ................................................................................. 283
15.7.1 Common Error Messages for all Start/Stop Scripts ....................................... 283
15.7.2 SAPDB Specific Error Messages .................................................................. 285
15.7.3 Sapci-specific Error Messages ...................................................................... 287
15.7.4 Sapscs-specific Error Messages ................................................................... 288
15.7.5 Sapascs-specific Error Messages ................................................................. 289
15.7.6 Sapjc-specific Error Messages ...................................................................... 289
15.7.7 Sapapp-specific Error Messages................................................................... 289
15.7.8 Sapj-specific Error Messages........................................................................ 289
15.8 SAP ACC Troubleshooting ............................................................................ 289
15.8.1 ACC Logging ................................................................................................. 289
15.8.2 Missing Server in the ACC Physical Landscape............................................ 289
15.8.3 Reset of Service Status in Case of Failures .................................................. 290
15.8.4 Hanging Locks............................................................................................... 290
15.9 PRIMECLUSTER .......................................................................................... 290
15.9.1 Problem Reporting ........................................................................................ 290
15.9.2 Removing “Ghost Devices” from RMS GUI ................................................... 290
15.10 Script Debugging........................................................................................... 291
15.10.1 Shell Scripts .................................................................................................. 291
15.10.2 Perl Scripts .................................................................................................... 291

16 Abbreviations .............................................................................................. 293

17 Glossary....................................................................................................... 297

18 Index............................................................................................................. 303



1 Introduction
This document provides instructions on administrating and operating an installed
FlexFrame™ 3.2 environment. It focuses on general aspects of the architecture as well
as on software updates, hardware extensions and FlexFrame-specific configuration.
It does not cover the installation of an entire FlexFrame environment. Please refer to the
“Installation of a FlexFrame Environment” manual for information on the initial installation.

1.1 Requirements
This document addresses administrators of FlexFrame environments. We assume that
the reader has technical background knowledge in the areas of operating systems
(Linux®, Solaris™), IP networking and SAP® Basis.

1.2 Notational Conventions


The following conventions are used in this manual:

Additional information that should be observed.

Warning that must be observed.

fixed font        Names of paths, files, commands, and system output.

<fixed font>      Names of variables.

fixed font        User inputs in command examples (if applicable using <> with
                  variables).

Command prompt: # The notation
                     control1:/<somewhere> # <command>
                  indicates that the command <command> is issued on the first
                  Control Node in the directory /<somewhere>.
                  The reader may need to change into the directory first, e.g.
                     control1:~ # cd /<somewhere>
                     control1:/<somewhere> # <command>


1.3 Document History


Document Version    Changes          Date
1.0                 First Edition    2006-05-01

1.4 Related Documents


FlexFrame™ for mySAP™ Business Suite – Planning Tool
FlexFrame™ for mySAP™ Business Suite – Installation of a FlexFrame Environment
FlexFrame™ for mySAP™ Business Suite – Installation Guide for mySAP Solutions
FlexFrame™ for mySAP™ Business Suite – FA Agents - Installation and Administration
PRIMECLUSTER Documentation
ServerView Documentation
SuSE Linux Enterprise Server 8 Documentation
SuSE Linux Enterprise Server 9 Documentation
Solaris Documentation

1.5 Special Hints for FlexFrame 3.2


In this document you will often find console output, configuration data and installation
examples which are based on FlexFrame 3.1 and SLES 8 / Solaris 8 Application Nodes.
Please keep in mind that these are examples and may look slightly different on the new
operating systems introduced in FlexFrame 3.2.



2 FlexFrame Architecture
The FlexFrame solution V3.1 is a revolutionary approach to running complex mySAP™
solutions with higher efficiency. At the same time, some major changes to the
configuration paradigms for infrastructures have been implemented. These changes are:
● A shared operating system booted via IP networks for the SAP Application Nodes.
● Decoupling of application software and operating system, called virtualization of SAP
software or Adaptive Computing.
● Shared Network Attached Storage from Network Appliance® providing Write
Anywhere File Layout (WAFL™) and sophisticated snap functionality.
● FlexFrame Autonomous Agents (FA Agents) providing revolutionary mechanisms to
implement high-availability functions without cluster software.
The concept of FlexFrame for mySAP Business Suite consists of several components
which implement state-of-the-art functionality. Together with new components, such as
the FlexFrame Autonomous Agents, the whole solution is far more than just the sum of its
components. A major part of the benefit lies in a dramatic reduction of day-to-day
operating costs for SAP environments.
It is, of course, possible to use parts of the FlexFrame solution in project-based
implementations. However, such implementations cannot be called FlexFrame.


2.1 General Notes on FlexFrame 3.2


FlexFrame was designed and developed as a platform for mySAP Business Applications.
Its major purpose is to simplify and abstract the basic components so that the
administrator of an SAP system landscape can focus on SAP and does not have to worry
about servers, networking and storage.
The FlexFrame Control Nodes are to be seen as an appliance. Like a toaster or a
microwave oven, they have a well-defined purpose, and their built-in components must
work together to achieve that purpose. Fujitsu Siemens Computers has chosen SuSE®
Linux Enterprise Server (SLES) 8 as the operating system for the Control Nodes;
however, it is not intended that the customer uses it as a regular server. Installing
additional software on it or applying patches to it is not wanted unless the Fujitsu
Siemens Computers support line instructs you to do so. Upcoming versions of the
Control Node's operating system may be totally different and may not allow any
modifications at all. The installation and backup/restore functionality of a Control Node is
based on fixed images which are delivered on DVD. Modifications of installed images will
not be preserved when a new version is installed.
Modifications may even lead to errors which are hard to find. We therefore strongly
recommend that customers do not install any software or patches on the Control Nodes
without confirmation from Fujitsu Siemens Computers support.
The Application Nodes are handled similarly to the Control Nodes: fixed images are
shipped for installation in the FlexFrame landscape.
Another aspect of FlexFrame is the reduction of the TCO (total cost of ownership). A
static approach (once started, never touched) will not be very efficient. To achieve the
best savings, it is recommended to actively manage where a certain SAP application
instance or database instance is running. If, for example, an SAP instance requires the
power of two CPUs of an Application Node during most days of a month and eight CPUs
during month-end calculations, it is best to move it back and forth between Application
Nodes of the appropriate size. While the application instance is running on the two-CPU
Application Node, another SAP instance can use the bigger eight-CPU Application Node,
which avoids the need for more eight-CPU Application Nodes than in a static approach.


2.2 Hardware
A typical FlexFrame environment consists of the following hardware types:

[Figure: Schematic overview of a typical FlexFrame environment with its hardware
components: Control Nodes (1), network switches (2), storage (3) and Application
Nodes (4, 5); backup is an add-on outside of FlexFrame.]

1. Control Nodes: PRIMERGY™ RX300 S2 with SuSE Linux Enterprise Server 8 (on
local disks).
2. Two or more identical network switches Cisco Catalyst 3750G (per switch group)
3. Network Attached Storage with one or more Network Appliance Filer heads, each
with a disk storage of at least 2*14 disks with 144 GB each hosting shared OS file
systems and application data.
4. Intel® based PRIMERGY servers (standard rack servers or blade servers) with SuSE
Linux Enterprise Server 8 (shared OS).
5. SPARC64® V based PRIMEPOWER servers with Solaris 8 and 9 (shared OS).


6. Network cabling (1000Mbit Copper and 1000Mbit optical).


Any other functions such as backup have to be implemented separately as an add-on to
FlexFrame and need dedicated hardware, operating system, high availability etc. for
production systems. The correct size is a result of a detailed SAP sizing process.

2.3 Software
The FlexFrame solution consists of both hardware and software. To guarantee proper
functioning of the whole landscape, the entire software set is strictly defined. Anything
other than the software components listed below is not part of FlexFrame. This also
applies if software from the list below is missing, is installed in versions other than those
listed, or if software other than the actual SAP components is added.

No. 1
   Hardware: Control Nodes: 2x PRIMERGY RX300 S2
   OS:       SLES 8 SP3, kernel-smp-2.4.21-291
   Software: FlexFrame 3.2 file system image (Control Node); PRIMECLUSTER™ 4.1B00;
             FA Agents V2.0 (Control Agents); ServerView S2
   Services: TFTP, DHCP, LDAP, NTP, RARP, BOOT-PARAM, SNMP

No. 2
   Hardware: Network switches: 2x Cisco Catalyst WS-C3750G-24TS or WS-C3750G-24T
   OS:       IOS 12.1.19 or greater
   Software: (as delivered)

No. 3
   Hardware: Network Attached Storage: one or more Network Appliance Filer heads with
             at least 2*14 disks with 144 GB each for production systems, hosting
             shared OS file systems and application data
   OS:       ONTAP 6.5.2P3 or higher
   Software: (as delivered)
   Services: NFS

No. 4
   Hardware: Linux Application Nodes: Intel based PRIMERGY servers (standard rack or
             blade servers)
   OS:       SLES 9 SP4, Kernel 2.6.5-7.244 (SLES9) and SLES 8 SP3,
             Kernel 2.4.21-291 (SLES8)
   Software: FlexFrame 3.2 file system image (for Linux Application Nodes);
             FA Agents (Application Agents)
   Services: SAP & DB, DomainAdmin

No. 5
   Hardware: Solaris Application Nodes: SPARC64® V based PRIMEPOWER servers
   OS:       Solaris 8 and Solaris 9
   Software: FlexFrame 3.2 file system image; FA Agents (Application Agents)
   Services: SAP & DB services, DomainAdmin


2.4 Shared Operating System


One major aspect of FlexFrame is its shared operating system. Sharing in this case
means that the very same files of essential parts of the underlying operating system are
used to run multiple Application Nodes. This part of the file system is mounted read-only,
so none of the Application Nodes that run the actual applications can modify it. Server-
specific information is linked to a file system area that is mounted read-write. The shared
operating system is kept on a Network Attached Storage Filer from Network Appliance.
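As an illustration, the mount table of a Linux Application Node might contain entries of
the following kind. The Filer name, volume and paths are hypothetical and only sketch
the principle; the actual layout is created by the FlexFrame installation tools:

   # shared root file system, used by many Application Nodes, read-only
   filer1:/vol/volFF/os/Linux/root          /      nfs   ro,...
   # node-specific area, read-write
   filer1:/vol/volFF/os/Linux/var/node21    /var   nfs   rw,...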

2.4.1 Shared OS Boot Concept


The chart below shows the boot process of a FlexFrame Application Node
(PRIMERGY/Linux):

The FlexFram e Concept – Shared Linux


C ontrol Node (clustered)
PR IMER G Y ƒ assigns server names
FSC3
BX300 ƒ provides boot info
ƒ distributes SAP services
FSC4 ƒ adm in/m onitoring
1 get boot info

2 Server Nam e IP address


where to boot from

Storage NetApp Filer


3 m ount root directory
PR IMER G Y FSC1 OS-shared
RX600 4 boot Linux

PR IMER G Y FSC2
RX800

The network boot concept used for Solaris-based PRIMEPOWER servers is derived from
the Solaris Diskless Client Concept. A shared /usr area is mounted read-only. Each
server has its own root file system which is mounted read-write.


The following steps explain the boot process:

In step (1) the PRIMEPOWER™ server sends an RARP (Reverse Address Resolution
Protocol) request into the Storage LAN network segment. The Control Node with the
active in.rarpd process receives this request. The request asks for an IP address
based on the MAC address of the initiating NIC (Network Interface Card).
The in.rarpd process on the Control Node searches the file /etc/ethers for the
MAC address provided. If it finds a line with this MAC address, it returns the associated
IP address as an RARP response (2).
Now the PRIMEPOWER server can set up the NIC with the IP address. The next step (3)
is to get a file via TFTP (Trivial File Transfer Protocol) from the Control Node which
responded to the RARP request. The file name is built from the hexadecimal notation of
the IP address.
The in.tftpd of the Control Node sends this file to the PRIMEPOWER server. This
file contains the first stage of the Solaris kernel for booting via the network.
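As a worked example with a hypothetical address: an Application Node that was
assigned the IP address 10.10.1.21 would request the file 0A0A0115, because
10 = 0x0A, 10 = 0x0A, 1 = 0x01 and 21 = 0x15.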
Now additional parameters are required. The Solaris kernel sends a bootparam request
to the Control Node (5). The rpc.bootparamd process on the Control Node reads
the file /etc/bootparams for the given Application Node's host name and replies with
detailed information about the NFS server and path of the root file system along with
mount options and other parameters.
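A minimal sketch of the two files involved on the Control Node. Host name, MAC
address, Filer name and path are hypothetical examples and do not show the exact
entries generated by the FlexFrame tools:

   /etc/ethers
      0:e0:0:12:34:56   an-pw250-1

   /etc/bootparams
      an-pw250-1   root=filer1:/vol/volFF/os/Solaris/an-pw250-1/root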


2.4.2 Control Nodes


A productive FlexFrame landscape always includes two Control Nodes. Their purpose is
to be a single point of control for the Application Nodes and to check and manage their
proper functioning.
Control Nodes do not run SAP software (with the exception of saprouter, as an
option). They exclusively run SuSE Linux Enterprise Server Version 8 (SLES 8), installed
on local disks. Control Nodes provide and run services such as:
● Timeserver for the complete FlexFrame landscape
● Control Agents
● Web server to provide the Control Agents user interface
● Assignment of IP addresses to Application Nodes using DHCP (Linux) or RARPD
and BOOTPARAMD (Solaris)
● TFTP server for the boot process
● saprouter (optional)
Control Nodes have to be of the server type PRIMERGY RX300 S2.

2.4.3 Application Nodes


Application Nodes are the workhorses of the FlexFrame solution. They provide CPU and
memory and run database and SAP services. Application Nodes must have local disks,
which are used exclusively for swap space and do not contain any other data.
There are two major variants of Application Nodes:
● SLES 8 and SLES 9 on PRIMERGY hardware
● Solaris 8 and Solaris 9 on PRIMEPOWER hardware

2.4.3.1 Linux Application Nodes


During the boot process using Intel’s PXE® technology, each Application Node will be
identified using the hardware address of its boot interface (MAC address). The Control
Node will assign an IP address to it and supply the operating system via the network.
File systems, especially the root file system (/), are mounted via the network in read-only
mode. If, for any reason, an Application Node needs to be replaced or added, only a
handful of settings need to be adjusted to integrate it into the FlexFrame landscape.
Intel's PXE technology is implemented in Fujitsu Siemens Computers PRIMERGY
servers and allows booting via the network. DHCP is used with a static MAC address
mapping for all Application Nodes.
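As a sketch only, such a static DHCP host entry on the Control Node typically looks like the
example below; the host name, addresses and the boot loader file name are illustrative
assumptions, since the actual configuration is generated by the FlexFrame maintenance tools:

host rx300-13 {
    hardware ethernet 00:0b:5d:12:34:56;   # MAC address of the boot NIC
    fixed-address 192.168.11.54;           # IP address assigned to this Application Node
    next-server 192.168.11.206;            # TFTP server (Control Node)
    filename "pxelinux.0";                 # network boot loader fetched via PXE/TFTP
}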


2.4.3.2 Solaris Application Nodes


For Solaris Application Nodes, the diskless client concept of Sun Microsystems, Inc. was
enhanced for the FlexFrame solution. This concept has a dedicated root file system for
each Application Node while sharing the same /usr file system amongst a group of
servers.
On the Filer, preconfigured file systems for each PRIMEPOWER model are provided.
The boot process is similar to the above Linux network boot. As mechanisms for booting
via the network, RARP and BOOTPARAM are used. IP address information, boot
information and location of the root file system of the Application Nodes are delivered by
the BOOTPARAM protocol.

2.5 Switch and Filer Configuration


A switch group consists of at least two switches of the Cisco Catalyst 3750G switch
family. They are interconnected on the back with Cisco StackWise technology to form a
stack and behave like a virtual modular switch.
For a description of how to interconnect the switches, please refer to the Cisco
Catalyst 3750 Installation Manual or the "Cisco StackWise Technology" White
Paper.
Adding or removing a switch to or from a switch group means adding or removing
a switch to/from a stack. Cisco notes that this may be done during normal operation
(see the Catalyst 3750 White Paper "Cisco StackWise Technology").
To ensure safe operation we recommend doing this during a downtime to minimize
the influence on running systems. This requires shutting down all systems
connected to this switch group. In case a Filer is connected, all systems that
have mounted file systems from the Filer have to be shut down as well.

2.5.1 Adding a Switch to a Switch Group


To add a new switch to an existing switch group, the new switch has to be inserted into
the stack. Mount the switch into the system cabinet next to the existing switch group and
insert it into the StackWise cabling. See the notes above for recommendations.
To be able to use the new switch with the maintenance tools, it has to be added to the
LDAP database. Use the command ff_swgroup_adm.pl to do this.

Synopsis

ff_swgroup_adm.pl --op add-sw --group <switch_group_id>
    --type <switch_type>


Command Options
--op add-sw
Adds a new member to the switch group.
--group <switch_group_id>
Defines the switch group to be used.
--type <switch_type>
Defines the type of the new member to be added to the switch group. Currently the
type has to be cat3750g-24t or cat3750g-24ts.

Command Output
The command displays some information about processing steps. The output may look
like this:
cn1:/opt/FlexFrame/bin # ff_swgroup_adm.pl --op add-sw --group 1
--type cat3750g-24ts

If program is aborted by Ctrl-C or a failure remove left overs
by calling:
ff_swgroup_adm.pl --op rem-sw --group 1 --switch 3

Switch was added to LDAP data.

Keep in mind: INSERTING SWITCH TO STACK NEEDS A DOWN TIME !

To add the switch to switch stack (switch group 1) write down the
current switch ids as they may change inserting the new switch to
stack. To connect the switch to the stack use the provided
stacking cable and stacking ports at rear side. See Cisco
installation manual for details.

If switch ids get scrambled use the IOS command
"switch <current nmbr> renumber <new nmbr>" to put them in
same order as before.

In short the to do list:


-> write down switch ids of each switch of group
-> power down entire switch group
-> insert switch into stack
-> power on entire switch group
-> look at switch ids and compare with your noticed
-> in case of differences use IOS command to renumber switches
Switch group is ready for use


See file /tmp/swgrp-add-1-3/next_steps for same instructions as above.

2.5.2 Removing a Switch from a Switch Group


A switch may be removed if it is unused and has the highest ID within the stack. Remove
it first from the LDAP database with the command ff_swgroup_adm.pl and then from
the stack. See notes above for recommendations.

Synopsis

ff_swgroup_adm.pl --op rem-sw --group <switch_group_id>
    --switch <switch_id>

Command Options
--op rem-sw
Removes the last member from a switch group.
--group <switch_group_id>
Defines the switch group to be used.
--switch <switch_id>
Defines the stack ID of the switch to be removed from a switch group.

Command Output
The command displays some information about processing steps. The output may look
like this:
cn1:/opt/FlexFrame/bin # ff_swgroup_adm.pl --op rem-sw --group 1
--switch 3
Switch was successfully removed from LDAP data.

Keep in mind: REMOVING SWITCH FROM STACK NEEDS A DOWN TIME !


In short the to do list:
-> power down entire switch group
-> remove switch from stack
-> power on entire switch group
Switch group is ready for use

See file /tmp/swgrp-rem-1-3/next_steps for same instructions as above.


2.5.3 Listing a Switch Group Configuration


Invoking the command ff_swgroup_adm.pl with the list operation mode displays
the configuration of a switch group like used switch types, port channels, port usage
statistics and used switch ports.

Synopsis

ff_swgroup_adm.pl --op list --group <switch_group_id>

Command Options
--op list
Displays switch group configuration.
--group <switch_group_id>
Defines the switch group to be used.

Command Output
The output may look like this:
cn1:/opt/FlexFrame/bin # ff_swgroup_adm.pl --op list --group 1
Switch Group 1

Name/IP: switch-i-2 / 192.168.13.15


Password: passwort
SNMP Community: public;ro

Switch Types: (switch id, switch type)


1 cat3750g-24ts
2 cat3750g-24ts

Port Channels: (channel id, switch ports, connected device)


3 1/3,2/3 swb-1-1/11,swb-1-1/12
4 1/4,2/4 swb-1-2/11,swb-1-2/12
5 1/13,2/13 filer-ip1

Switch port usage: (switch id, used, free tx, free fx ports)
1 21 used 3 free tx 4 free fx
2 21 used 3 free tx 4 free fx

Switch port list: (switch id, port id, connected device, vlans)
1 1 pw250-10 t10,t12,u11
1 2 pw250-11 t10,t12,u11
1 3 swb-1-1/11 t1,t10,t11,t12,t13


1 4 swb-1-2/11 t1,t10,t11,t12,t13
1 5 pw650-20 t10,t12,u11
1 6 pw900-211 t10,t12,u11
1 7 rx300-13 t10,t12,u11
1 8 rx300-14 t10,t12,u11
1 9 pw250-12 t10,t12,u11
1 10 pw250-13 t10,t12,u11
1 11 cn1 t10,t11,t12,u13
1 12 cn2 t10,t11,t12,u13
1 13 filer-ip1 t11
1 14 extern. Connect t10,t11,t12
1 15 --- unused ---
1 16 --- unused ---
1 17 --- unused ---
1 18 cn1 mgmt u13
1 19 pw250-12 mgmt u13
1 20 rx300-13 mgmt u13
1 21 - u13
1 22 pw650-20 mgmt u13
1 23 pw250-10 mgmt u13
1 24 Corporate LAN u10
2 1 pw250-10 t10,t12,u11
2 2 pw250-11 t10,t12,u11
2 3 swb-1-1/12 t1,t10,t11,t12,t13
2 4 swb-1-2/12 t1,t10,t11,t12,t13
2 5 pw650-20 t10,t12,u11
2 6 pw900-211 t10,t12,u11
2 7 rx300-13 t10,t12,u11
2 8 rx300-14 t10,t12,u11
2 9 pw250-12 t10,t12,u11
2 10 pw250-13 t10,t12,u11
2 11 cn1 t10,t11,t12,u13
2 12 cn2 t10,t11,t12,u13
2 13 filer-ip1 t11
2 14 extern. Connect t10,t11,t12
2 15 --- unused ---
2 16 --- unused ---
2 17 --- unused ---
2 18 cn2 mgmt u13
2 19 pw250-13 mgmt u13
2 20 rx300-14 mgmt u13
2 21 - u13
2 22 - u13
2 23 pw250-11 mgmt u13
2 24 Corporate LAN u10


2.5.4 Changing the Password of a Switch Group


To change the access password of a switch group, it has to be changed at the switch group
as well as in the LDAP database. The command ff_swgroup_adm.pl changes both.

Synopsis

ff_swgroup_adm.pl --op pass --group <switch_group_id>
    --passwd <password>

Command Options
--op pass
Changes the switch group access password.
--group <switch_group_id>
Defines the switch group to be used.
--passwd <password>
Defines the new password as clear text.

Command Output
The output may look like this:
cn1:/opt/FlexFrame/bin # ff_swgroup_adm.pl --op pass --group 1
--passwd passwort
update switch 1/1 configuration
Notice: Update will take about 1 minute.
............+

Password changed from "password" to "passwort".

See file /tmp/swgrp-pass-1/info for same information as above.

2.5.5 Changing the Host Name of a Switch Group


To change the host name of a switch group, it has to be changed at the switch group as well
as in the LDAP database and in the host files on both Control Nodes. The command
ff_swgroup_adm.pl changes all of them.

Synopsis

ff_swgroup_adm.pl --op name --group <switch_group_id>
    --name <name>


Command Options
--op name
Changes the switch group host name.
--group <switch_group_id>
Defines the switch group to be used.
--name <name>
Defines the new host name to be used.

Command Output
The output may look like this:
cn1:/opt/FlexFrame/bin # ff_swgroup_adm.pl --op name --group 1
--name swg1
update switch 1/1 configuration
Notice: Update will take about 1 minute.
...+

Switch name changed from "swg-1" to "swg1".

See file /tmp/swgrp-name-1/info for same information as above.

2.5.6 Adding a Switch Port Configuration


Switch ports are typically configured directly by the maintenance tools. But some tasks, like
configuring ports for gateways, backup or migration systems, need a way to do this on a
per-port basis. For this type of configuration a special peer type is used. The program
ff_swport_adm.pl is used to configure or remove this type of port configuration.

Synopsis

ff_swport_adm.pl --op add --port <swgroup:switch:port>
    --lan <pool:lan[:lan][,pool:lan[:lan]]>
    [--native <pool:lan>] [--desc <description>]

Command Options
--op add
Adds a switch port configuration.
--port <swgroup:switch:port>
Defines the switch group, switch and port ID of the port to be used.


--lan <pool:lan[:lan][,pool:lan[:lan]]>
Defines the accessible VLANs. For better readability, a VLAN is specified with its
pool and LAN name. Use only client, server or storage as LAN names. For more than
one LAN per pool the LAN names may be added to the same pool statement. The
VLANs are not restricted to belong to the same pool. To directly add VLAN IDs not
used within any pool, use '#' as pool name and the VLAN ID(s) as LAN(s).


If only a single VLAN is configured, it is accessible as the native VLAN. This means the
data packets contain no VLAN tag, which is the behavior of a standard server
network interface. If more than one LAN is given, they are configured as tagged. To define
which of them should be used as the native VLAN, use the option --native.

Examples:
--lan poolA:client:server,poolB:client:server
--lan poolA:client,poolA:server,poolB:client:server
--lan poolA:storage,poolB:storage
--lan poolA:server
--lan '#:417:891'
--lan poolA:server,'#:417:891'
--lan 'poolA:server,#:417:891'

--native <pool:lan>
Use this option to define the native VLAN of the accessible VLANs defined with
option --lan. To directly add VLAN ID not used within any pool, use '#' as pool
name and the VLAN ID as LAN.

Examples:
--native poolA:server
--native '#:417'

--desc <description>
The description is added to configuration of switch port and the LDAP data of the
switch port configuration.


Command Output
The command displays some information about processing steps. The output may look
like this:
cn1:/opt/FlexFrame/bin # ff_swport_adm.pl --op add --port 1:1:15
--lan ip1:storage:server,'#:4000' --native ip1:storage

Execution may take some minutes. If program is aborted
by Ctrl-C or a failure remove left overs by calling:
ff_swport_adm.pl --op rem --port 1:1:15

update switch 1/1 configuration


Notice: Update will take about 1 minute.
...........+

If not reported any error the port is configured and LDAP is updated successfully.

2.5.7 Removing a Switch Port Configuration


To remove a switch port configuration that was previously configured by
ff_swport_adm.pl or as external connectivity with PlanningTool, use the
ff_swport_adm.pl command. Other ports are configured with maintenance tools like
ff_an_adm.pl, ff_pool_adm.pl or ff_bx_cabinet_adm.pl.
The switch port configuration will be removed from the switch and the LDAP database.

Synopsis

ff_swport_adm.pl --op rem --port <swgroup:switch:port>

Command Options
--op rem
Removes the configuration of a switch port.
--port <swgroup:switch:port>
Defines the switch group, switch and port ID of the port to be used.


Command Output
The command displays some information about processing steps. The output may look
like this:
cn1:/opt/FlexFrame/bin # ff_swport_adm.pl --op rem --port 1:1:15

Execution may take some minutes. If program is aborted
by Ctrl-C or a failure remove left overs by calling:
ff_swport_adm.pl --op rem --port 1:1:15

update switch 1/1 configuration


Notice: Update will take about 1 minute.
.............+

If not reported any error the port is unconfigured and LDAP is updated successfully.

2.5.8 Displaying a Switch Port Configuration


To display the configuration of a switch port as known by LDAP database in detail, use
the command ff_swport_adm.pl with operation mode list.

Synopsis

ff_swport_adm.pl --op list --port <swgroup:switch:port>

Command Options
--op list
Displays configuration of the switch port.
--port <swgroup:switch:port>
Defines the switch group, switch and port ID of the port to be used.

Command Output
The output may look like this:
cn1:/opt/FlexFrame/bin # ff_swport_adm.pl --op list --port 1:1:4

Switch Port Configuration of
1:1:4 (Switch Group : Switch : Port)

assigned VLAN IDs: 24,25,26


assigned VLAN Names: pool1:client,pool1:server,pool1:storage


native VLAN: 26
Port Peer Type: AN
Peer Node: rx300-1

The display of an unconfigured port looks like:

ERROR: wrong switch port "1:1:8".


Port configuration unknown.

2.5.9 Adding a New Filer


To add a Filer to the FlexFrame environment, the Filer has to be configured, the network
prepared and some data stored in the LDAP database. The Filer configuration has
to be done manually, but all necessary data and locations are displayed by this
program. The network and LDAP preparation is done directly by ff_filer_adm.pl.
The network configuration differs by Filer type. FAS2xx family Filers are limited to two
Ethernet interfaces (NICs). For this type of Filer the Control LAN is configured onto the
link aggregate (multi vif) of the two NICs. For all other Filer types the Control LAN is
configured to use interface e0 and the multi vif is reserved for pool LANs.

Synopsis

ff_filer_adm.pl --op add --name <node_name> --type <filer_type>
    --swgroup <switch_group_id>
    [--host <ip_host_part>] [--ports <port_count>]

Command Options
--op add
Adds a Filer.
--name <node_name>
Defines the node name of Filer.
--type <filer_type>
Defines the type of the Filer. See usage for a list of known Filer types.
--swgroup <switch_group_id>
Defines the switch group the Filer should be added to. See usage for a list of
configured switch group IDs.
--host <ip_host_part>
Defines the host part to be used to build IP addresses for the Control or Storage LAN
networks. If this option is omitted the script uses a free host number to calculate the
IP address.


--ports <port_count>
Defines the count of ports to be used with pool Storage LAN networks. If this option
is omitted the script uses the default of two ports.

Command Output
The command displays some information about processing steps. The output may look
like this:
cn1:/opt/FlexFrame/bin # ff_filer_adm.pl --op add --name filer2
--type FAS270 --swgroup 1
update LDAP
.....
update switch 1/1 configuration
Notice: Update will take about 1 minute.
................................+
Some manual interventions are nescessary to integrate the filer
into FlexFrame environment.
The following list of actions have to be performed in order to
integrate the filer into your FlexFrame landscape. Since your
exact configuration may vary, these steps have to be performed
manually. However, the VIF must be named 'storage'.

The /etc/rc, /etc/exports and /etc/hosts.equiv have to be edited
at volume vol0.

These lines have to be added to or changed at /etc/rc:

hostname filer2
vif create multi storage <-- add your NICs here, eg. e0a e0b
vlan create storage 13
ifconfig storage-13 192.168.20.4 netmask 255.255.255.0 broadcast
192.168.20.255 mtusize 1500 -wins up
options dns.enable off
options nis.enable off
savecore

These lines have to be added to or changed at /etc/exports:

/vol/vol0 -sec=sys,rw=192.168.20.2:192.168.20.1,anon=0

These lines have to be added to /etc/hosts.equiv:

192.168.20.2 root
192.168.20.1 root


As the switch ports are already configured the correct wiring
between filer and switch ports has to be done. See below a list of
cable connections.
Connect your filer LAN interfaces to named switch ports:
SwitchGroup / Switch / Port LAN Interface
1 / 2 / 3 filer2 (FAS270): port "data NIC-1"
1 / 1 / 3 filer2 (FAS270): port "data NIC-2"

Finally execute command "mount /FlexFrame/filer2/vol0"
on both Control Nodes to mount filers vol0 at Control Nodes. This
is nescessary for further automated configuration of filer.

The complete instruction above is listed at file /tmp/filer2-add/todo

2.5.10 Displaying All Configured Filers


To get an overview of all configured Filers within the FlexFrame environment use the
operation mode list-all of the program ff_filer_adm.pl. It displays IP addresses
and names, type, switch ports and the link aggregation id, separated by Filer.

Synopsis

ff_filer_adm.pl --op list-all

Command Option
--op list-all
Displays all configured Filers.

Command Output
The command displays some information about the Filer configuration. The output may
look like this:
cn1:/opt/FlexFrame/bin # ff_filer_adm.pl --op list-all
Filer configurations

filer
Control Lan
192.168.20.3 filer-co
Type: FAS940
Switch Link Aggregation
Port Count: 2


Link Aggr.ID: 5
Storage LAN switch ports
1 / 1 / 13 SwGroup / Switch / Port
1 / 2 / 13 SwGroup / Switch / Port
Control LAN switch ports
1 / 2 / 15 SwGroup / Switch / Port
Pools
pool1 192.168.2.131 filer-pool1-st master
pool2 10.3.1.3 filer-pool2-st master
pool5 192.168.40.3 filer-pool5 master
usr 192.168.5.3 filer-usr-st master

2.5.11 Displaying Filer Configuration


To display the detailed configuration of a Filer as known by LDAP database, use the
command ff_filer_adm.pl with operation mode list.

Synopsis

ff_filer_adm.pl --op list --name <node_name>

Command Options
--op list
Displays the configuration of a Filer.
--name <node_name>
Defines the node name of a Filer.

Command Output
The output may look like this:
cn1:/opt/FlexFrame/bin # ff_filer_adm.pl --op list --name filer
Filer configurations

filer
Control Lan
192.168.20.3 filer-co
Type: FAS940
Switch Link Aggregation
Port Count: 2
Link Aggr.ID: 5


Storage LAN switch ports


1 / 1 / 13 SwGroup / Switch / Port
1 / 2 / 13 SwGroup / Switch / Port
Control LAN switch ports
1 / 2 / 15 SwGroup / Switch / Port
Pools
pool1 192.168.2.131 filer-pool1-st master
pool2 10.3.1.3 filer-pool2-st master
pool5 192.168.40.3 filer-pool5 master
usr 192.168.5.3 filer-usr-st master

2.5.12 Adding a Pool to a Filer


To be able to use an existing Filer for a pool, the network connection to the Filer has to
be enhanced. On the Filer, a new VIF has to be created and on the switch ports, the new
VLAN has to be configured. All these steps are done with the ff_filer_adm.pl using
the operation mode add-pool, but the program will not change any exports.

Synopsis

ff_filer_adm.pl --op add-pool --name <node_name>
    --pool <pool_name> [--role {master|slave}]
    [--host <ip_host_part>]

Command Options
--op add-pool
Adds the given pool to named Filer.
--name <node_name>
Defines the node name of Filer.
--pool <pool_name>
Defines the pool name the Filer has to support. See usage for a list of configured
pools.
--role {master|slave}
Defines the role of Filer within given pool. This information is used by the
ff_setup_sid_folder program. If this option is omitted the script uses the default
role master.
--host <ip_host_part>
Defines the host part to be used to build IP addresses for the Control or Storage LAN
networks. If this option is omitted, the script uses a free host number to calculate the
IP address.


Command Output
The output may look like this:
cn1:/opt/FlexFrame/bin # ff_filer_adm.pl --op add-pool
--name filer --pool pool4
update LDAP
....
update switch 1/1 configuration
Notice: Update will take about 1 minute.
...........+
vlan: storage-25 has been created
Pool pool4 successfully added to filer, LDAP and network.

2.5.13 Removing a Pool from a Filer


The rem-pool mode is the opposite of the add-pool operation mode of
ff_filer_adm.pl. It removes the pool VIF and removes the VLAN access for the given
pool on the Filer's switch ports. This action is not permitted if any SID of the pool uses
this Filer or if the Filer is the default Filer of the pool. In the latter case, the Filer's pool
interface will be removed when the pool itself is removed.

Synopsis

ff_filer_adm.pl --op rem-pool --name <node_name>
    --pool <pool_name>

Command Options
--op rem-pool
Removes the given pool from the named Filer.
--name <node_name>
Defines the node name of the Filer.
--pool <pool_name>
Defines the pool name the Filer has to support. See usage for a list of configured
pools.


Command Output
The output may look like this:
cn1:/opt/FlexFrame/bin # ff_filer_adm.pl --op rem-pool
--name filer --pool pool4
update switch 1/1 configuration
Notice: Update will take about 1 minute.
...........+
update LDAP
....
Pool pool4 successfully removed from filer, LDAP and network.

2.5.14 Removing a Filer


To remove a Filer from the FlexFrame landscape, it must no longer be used by any pool.
Use the ff_filer_adm.pl command with the operation mode rem.
The Filer's switch port configuration will be removed from the switch and the LDAP database.

Synopsis

ff_filer_adm.pl --op rem --name <node_name>

Command Options
--op rem
Removes a Filer from FlexFrame landscape.
--name <node_name>
Defines the node name of the Filer.

Command Output
The command displays some information about processing steps. The output may look
like this:
cn1:/opt/FlexFrame/bin # ff_filer_adm.pl --op rem --name filer2
update switch 1/1 configuration
Notice: Update will take about 1 minute.
.............................+
update LDAP
.....
Filer successfully removed from network and LDAP.


2.6 LDAP
A directory service is typically a specialized database which is optimized for reading and
searching hierarchical structures. Unlike an RDBMS, these databases do not support
transaction-based processing. LDAP (Lightweight Directory Access Protocol, see RFC
2251) defines means of setting and retrieving information in such directories. The
structure of the LDAP data is called Directory Information Tree (DIT).

2.6.1 FlexFrame Structure in LDAP


FlexFrame utilizes LDAP for two different purposes:
(1) for operating naming services (such as host name resolution, user/password
retrieval, tcp/udp service lookup, etc.) and
(2) for storing FlexFrame specific data on the structure of the installed environment
Application Nodes are only able to search in area (1). It is separated into pool-specific
sections in order to prevent pools from accessing other pools' data. Each of these sections
contains pool-specific network information service (NIS) like data. The LDAP servers have
access lists to prevent searches outside of a node's own pool.
The other main DIT part contains FlexFrame configuration data (2). It should only be
accessed through maintenance tools from one of the Control Nodes. This part of the DIT
contains a lot of cross references, which need to be kept in sync. Do not try to change
this data, unless you are explicitly instructed by Fujitsu Siemens Computers support to do
so.
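For example, the pool-specific naming data of area (1) can be inspected read-only from a
Control Node with a standard LDAP query. The pool name pool1 is only an example, and the
assumption that host entries use the usual NIS-style ipHost object class is illustrative:

control1:~ # ldapsearch -x -LLL -b 'ou=pool1,ou=Pools,ou=FlexFrame,dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com' '(objectClass=ipHost)' cn ipHostNumber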

2.6.2 Working with LDAP


The usage of LDAP specific commands like ldapadd or ldapmodify is limited to very
few actions. One is to create or remove a PUT service for a SAP system copy. This
action is described within the “Installation Guide for mySAP Solutions” manual. Other
direct interaction through LDAP commands is limited to service issues.
No other interventions have to and should be done. The FlexFrame maintenance tools
provide the necessary functionality.

2.6.3 Disaster Recovery – Turning Replica into Master


As there are two LDAP databases running in a FlexFrame environment (one on each
Control Node), one of them is run as writable master DB while the other one is a non-
writable replica.
In case of a disaster, one Control Node may fail, and therefore its LDAP server becomes
unusable.


In case (only) the replica fails, the remaining master LDAP DB continues to work.
In case the master DB fails, the remaining replica LDAP DB will provide the functionality.
However, if the LDAP DB needs to be changed, the replica needs to act as master (i.e.
become read-writable).
This requires manual interaction. The procedure is described in the document "Disaster
Procedures for FlexFrame Landscapes", slide “FlexFrame Control Node (LDAP master)".

2.7 PRIMECLUSTER
This section briefly describes how PRIMECLUSTER is implemented in FlexFrame.
PRIMECLUSTER is used for high availability of Control Nodes only, not for Application
Nodes.

2.7.1 PRIMECLUSTER Components


PRIMECLUSTER in FlexFrame consists of several components:
● Cluster Foundation (CF)
CF controls the cluster node interoperability and builds a base for all other
PRIMECLUSTER components.
● Shutdown Facility (SF)
SF controls power shutdown of the cluster nodes in case of cluster node failures.
In FlexFrame, Shutdown Facility agents will also be used by the FA Control Agent.
● Reliant Monitor Services (RMS)
RMS manages all defined service groups (“userApplication”) by starting, stopping
and monitoring services. It will react if services fail and switch it over to another
cluster node.


2.7.2 FlexFrame Specific RMS Configuration


FlexFrame delivers a pre-configured RMS configuration divided into five service groups,
called “userApplication”. The table below shows the application groups and other
respective properties:

userApplication        may switch   service                        used by / needed for
                       over?
ff_manage              yes          apache web server              FlexFrame web portal
This application                    FA Control Agent               n/a
contains services                   ServerView WebExtension        ff_udp_forwarder needed
for automatic and                   and Alarm Service processes;   by myAMC FA Messenger
manual management                   FlexFrame UDP forwarder        Server
facilities.                         mySQL database                 myAMC FA Messenger Server
                                    myAMC FA Messenger             n/a
                                    jakarta tomcat                 myAMC FA Messenger Server
                                                                   web interface
netboot_srv            yes          inet daemon                    Solaris and Linux netboot; tftp
This application                    DHCP daemon                    Linux netboot
contains services                   RARP daemon                    Solaris netboot
required for netboot                bootparam daemon               Solaris netboot
ldap_srv1              no,          LDAP master server (slapd)     important component of the
                       CN1 only     ldap replication daemon        complete FlexFrame
                                    (slurpd)                       environment
ldap_srv2              no,          LDAP replica server (slapd)
                       CN2 only
saprouter              yes          optional saprouter,            n/a
                                    initially unconfigured


All these services are monitored by PRIMECLUSTER and automatically restarted if they
fail. If a recovery is not possible, the application is set into the faulted state and an attempt
is made to start it on the other Control Node. This process is called "failover".
Individual userApplications may be switched to the other Control Node, i.e. it is possible
to run ff_manage on the first and netboot_srv on the second Control Node.
PRIMECLUSTER uses the scripts in /etc/PRIMECLUSTER/bin for starting, stopping
and monitoring services. These scripts are delivered with the ffPCL RPM.
If you try to stop running services manually, RMS will restart them. To prevent
this, you should set the corresponding userApplication into maintenance mode.
See the command overview table below and section “PRIMECLUSTER
Administration” on page 66.
You may damage your configuration if you ignore this!
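For example, to work manually on the netboot services, a typical sequence using the
commands from the table below would be (netboot_srv is used here only as an example):

control1:~ # hvutil -m on netboot_srv
(stop, modify and restart the affected services manually)
control1:~ # hvutil -m off netboot_srv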


2.7.3 RMS Configuration – Schematic Overview


The following picture shows the initial state of PRIMECLUSTER RMS:
● ldap_srv1 is running on Control Node 1 and may not be switched to
Control Node 2
● ldap_srv2 is running on Control Node 2 and may not be switched to
Control Node 1
● netboot_srv, ff_manage and saprouter (if configured) are running on Control
Node 1 and can be individually switched to Control Node 2

[Figure: Initial RMS configuration. Control Node 1 runs the active services ldap_srv1 (LDAP
master: slapd and slurpd), netboot_srv (bootparamd, rarpd, dhcpd, inetd), ff_manage (apache,
myAMC Control Agent, ServerView, mysql, myAMC Messenger, jakarta tomcat) and the optional
saprouter (saprouter, virtual IP); Control Node 2 runs ldap_srv2 (LDAP replica: slapd).
netboot_srv, ff_manage and saprouter can be switched over between the Control Nodes; the
ldap_srv applications cannot.]


2.7.4 Node Failures


When a Control Node crashes or freezes, CF will detect an interconnect timeout. SF will
then try to issue a power cycle command to the IPMI interface of the failed Control Node.
If this command is successful, RMS will switch over netboot_srv, ff_manage and, if
configured, saprouter if they were online on the faulted Control Node.
If the IPMI interface does not respond, PRIMECLUSTER cannot exactly determine which
Control Node is still alive. To prevent inconsistencies and data loss, RMS will not switch
over any applications. In this case you have to interact manually.
PRIMECLUSTER cannot handle situations where a node is completely powered
off including the on-board IPMI interface. Therefore two power supplies have to
be installed in the Control Nodes to prevent inconsistent cluster states in case of
power failures.
If you boot with only one Control Node, RMS will not start switchable services to
prevent inconsistencies. In this situation you have to do a forced online switch of
netboot_srv, ff_manage, ldap_srv1 (on the first Control Node),
ldap_srv2 (on the second Control Node) and, if configured, saprouter.
See command table below or use the PRIMECLUSTER Administration interface.
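As a sketch, if only the first Control Node is available after booting, the forced online
switches mentioned above could be issued there as follows; add saprouter only if it is
configured, and use ldap_srv2 instead of ldap_srv1 when working on the second Control Node:

control1:~ # hvswitch -f ldap_srv1 control1
control1:~ # hvswitch -f netboot_srv control1
control1:~ # hvswitch -f ff_manage control1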

2.7.5 PRIMECLUSTER CLI Commands


Here is a quick overview of PRIMECLUSTER CLI commands.
For further information, read the appropriate man pages by typing man <command> and
the official PRIMECLUSTER manuals.

Action                                                     Command
display state of all resources                             hvdisp -a
display and continuously update state of all resources     hvdisp -u
display state of all resources of type <resource_type>     hvdisp [-u] -T <resource_type>
(-u = update)
start or switch over <userApplication>                     hvswitch [-f] <userApplication> [<node>]
(-f = forced switch)
stop <userApplication>                                     hvutil -f <userApplication>
clear faulted <userApplication>                            hvutil -c <userApplication>
set <userApplication> into maintenance mode                hvutil -m on <userApplication>
set all userApplications into maintenance mode             hvutil -M on
exit <userApplication> maintenance mode (forced exit)      hvutil -m [force]off <userApplication>
exit cluster maintenance mode (forced exit)                hvutil -M [force]off
stop RMS on all nodes (do offline processing)              hvshut -a
stop RMS on all nodes (don't do offline processing)        hvshut -A
forced/emergency stop of RMS on local node                 hvshut -f
(don't do offline processing)
stop RMS on the local node (do offline processing)         hvshut -l
forced stop of RMS on local node                           hvshut -L
(don't do offline processing)
start RMS on local node                                    hvcm
verify shutdown facility status                            sdtool -s
reinitialize shutdown facility                             sdtool -r
verify CF status                                           cftool -n
stop PRIMECLUSTER CF including all dependent components    /opt/SMAW/SMAWcf/dep/master unload
                                                           /etc/init.d/cf stop
start PRIMECLUSTER CF including all dependent components   /etc/init.d/cf start
                                                           /opt/SMAW/SMAWcf/dep/master load
Test the IPMI Login (the ipmipower tool can be found       ipmipower -s <IP_address> -u OEM
on the ServerView CD)                                      -p <password>
Display the Cluster Foundation Nodes and their             cfconfig -n
current status
Display all LAN interfaces that were recognized by         cfconfig -d
the OS (Solaris/Linux)

Some commands (hvswitch, hvutil, etc.) return without showing any result.
Verify the result using hvdisp -a or hvdisp -u and/or check the logfiles
(especially switchlog, see below).

2.7.6 Usage Example: Switch Application


If you are not familiar with PRIMECLUSTER, here is an example how to switch the
application netboot_srv from the second Control Node to the first one.
First, take a look at the initial situation:
control1:~ # hvdisp -T userApplication
Local System: control1RMS
Configuration: /opt/SMAW/SMAWRrms/build/flexframe.us
Resource Type HostName State
StateDetails
------------------------------------------------------------------
saprouter userApp Offline
netboot_srv userApp Offline
netboot_srv userApp control2RMS Online
ldap_srv2 userApp
ldap_srv2 userApp control2RMS Online
ldap_srv1 userApp Online
ff_manage userApp Online

As you can see, netboot_srv is online on the second Control Node.


To switch the netboot services to the first Control Node, call:


control1:~ # hvswitch netboot_srv control1

The suffix RMS of the target node is not necessary for most commands.
The target node is optional. If not specified, the application will be switched to the
preferred node, therefore in this case you could have also used:
control1:~ # hvswitch netboot_srv

This command returns immediately.


In the background, RMS will perform the switch operation in two steps:
1. The netboot services rarpd, bootparamd, inetd and dhcpd will be taken offline on the
second Control Node
2. The netboot services will be taken online on the first Control Node
To see the result, call:
control1:~ # hvdisp -T userApplication

Local System: control1RMS


Configuration: /opt/SMAW/SMAWRrms/build/flexframe.us

Resource Type HostName State
StateDetails
------------------------------------------------------------------
saprouter userApp Offline
netboot_srv userApp Online
ldap_srv2 userApp
ldap_srv2 userApp control2RMS Online
ldap_srv1 userApp Online
ff_manage userApp Online

As you can see after a few seconds, netboot_srv has been switched over to the first
Control Node.
The command hvswitch will also be used to start an application that is offline on both
nodes. If nothing happens after issuing the hvswitch command, see switchlog and
netboot_srv.log for the cause (see below).
Remember that some applications (especially ff_manage) can take minutes to stop and
start.
For information how to access the administration interface of PRIMECLUSTER,
please refer to section “PRIMECLUSTER Administration” on page 66.


2.7.7 PRIMECLUSTER Log Files


All PRIMECLUSTER components log their actions to log files. Every action will be
logged. These files are important to find out the reasons for errors.
Whenever PRIMECLUSTER reports an error, it is very unlikely that
PRIMECLUSTER itself is the cause of this fault. In most situations, a service
controlled by PRIMECLUSTER has failed (e.g. due to misconfiguration) and
PRIMECLUSTER has detected and reported this failure.
Please take a look into the log files before contacting Fujitsu Siemens Computers
support.
Here is an overview of important log files:

Component Log File


RMS switchlog /var/opt/reliant/log/switchlog
RMS application log /var/opt/reliant/log/<userApplication>.log
CF /var/log/messages (dmesg)
SF /var/opt/SMAWsf/log/rcsd.log

2.7.8 Disaster Repair


If the FlexFrame landscape is distributed over two locations and if one side fails
completely, it may be necessary to clean up the state of the services monitored by
PRIMECLUSTER. This can be done using the tool ff_rms.sh.
The purpose of this tool is to support the administrator in recovering the RMS services in
a disaster situation only.
The following services are monitored by PRIMECLUSTER's RMS component:
netboot_srv
Monitoring inetd, dhcpd, rarpd and bootparamd
ldap_srv1
Monitoring LDAP server on Control Node 1 (master)
ldap_srv2
Monitoring LDAP server on Control Node 2 (replica)
ff_manage
Monitoring FA Control Agents
If these services are not in state Online, this tool tries to restart them. If the partner
Control Node is not available, ALL services (which may run locally) will be stopped and
restarted again.


Those services will be unavailable during the restart.

The program will stop if a service is in maintenance mode. Since this indicates an
operator intervention, it must be fixed manually, e.g. by calling:
hvutil -m off <service>

Debugging information can be found in /tmp/ff_rms.DEBUGLOG. In case of problems,
please provide this file.

2.8 Network
The network is the backbone of the FlexFrame solution. Communication between the
various nodes and storage devices is done exclusively via the IP network infrastructure. It
serves both the communication between servers and clients and the delivery of I/O data
blocks between the NAS (Network Attached Storage) and the servers.
The IP network infrastructure is essential for every FlexFrame configuration. FlexFrame is
designed with a dedicated network for connections between servers and storage that is
reserved for FlexFrame traffic only. One network segment, the Client LAN (see below)
can be routed outside the FlexFrame network to connect to the existing network.

2.8.1 LAN Failover


The term "LAN failover" describes the ability of a FlexFrame environment to use a logical
network interface that consists of several physical network interface cards (NICs), which
in turn use redundant network paths (cables and switches). When a network
component (NIC, cable, switch, etc.) fails, the network management logic switches over
to another network interface card and path.
2.8.2 Segments
FlexFrame 3.2 introduces a new network concept, providing higher availability as well as
increased flexibility in virtualizing the whole FlexFrame landscape.
This concept relies on VLAN technology that allows running multiple virtual networks
across a single physical network. Additionally, in order to ensure high network availability,
LAN bonding is used on every node. This includes a doubled switch and wiring
infrastructure to keep the whole environment working even when a network switch or
cable should fail. Additionally, the network segments are prioritized to ensure that
important connections are preferred.


Similar to former versions, there are four virtual networks within FlexFrame. The
difference of FlexFrame 3.2 is that these networks are run through one logical redundant
NIC, using bonding on Linux and IPMP on Solaris.
The following figure outlines the basic network segments of a typical FlexFrame
landscape with Linux-based Application Nodes.

[Figure: Virtual network segments. The two Control Nodes (bond0 over their onboard LANs plus a
PCI NIC, with IPMI and the cluster interconnect cip0), the Application Nodes (PRIMERGY rack and
blade servers with bond0, PRIMEPOWER servers with IPMP over fjgi0/fjgi1) and the Network
Appliance Filer (a multi vif named "storage" over e4a/e4b plus e0 for management) are all
connected to the Control, Storage, Server and Client VLANs; clients attach via the Client LAN.
The power shutdown facility may be IPMI, RSB, RPS, XSCF or SCF.]

The following virtual network segments are mandatory:


● Client LAN
The purpose of the Client LAN segment is to have dedicated user connectivity to the
SAP instances. This segment also allows administrators to access the Control
Nodes.
● Control LAN
The Control LAN segment carries all RSB, IPMI, XSCF, RPS, e0 and administrative
communication.
Control LAN access to the Application Nodes is required for interventions
using the Remote Service Board (RSB).


● Server LAN
The Server LAN segment is used for the communication between SAP instances
among each other and the databases.
● Storage LAN
The Storage LAN segment is dedicated to NFS communication for accessing the
executables of SAP and the RDBMS as well as the IO of the database content and
SAP instances.
The network bandwidth has to be one Gigabit for all components, since intense storage
traffic has to be handled.

2.8.3 Network Switches


Network switching components play a very important role within FlexFrame. Therefore,
only the following switch types are tested and supported:
● Cisco Catalyst WS-C3750G-24TS
● Cisco Catalyst WS-C3750G-24T
with IOS 12.1.19 or higher.
This model supports VLAN technology for a flexible configuration for the various network
segments and QoS (quality of service) to prioritize traffic (e.g. higher priority for
production systems over test systems etc).

2.8.4 Automounter Concept


The Automounter Concept is based on the ability of Linux and Solaris to mount file
systems automatically as their mount points are accessed.
During the boot process of an Application Node some file systems are mounted. For
Linux these are the root file system (read-only) as well as the /var mount point (read-
write). For Solaris Application Nodes the root file system is mounted read-write and the
/usr mount point is mounted read-only. These are the basic file systems which must be
accessible for the Application Node to function properly. There is no data in the two file
systems that is specific for a SAP service.
Data which is specific for a SAP service, a database or a FlexFrame Autonomous Agent
is found in directories which are mounted on first access. Some of the mounts will stay as
long as the Application Node is operational. Others will be unmounted again, if
directories or files below that mount point have not been accessed for a certain period
of time.
Within the LDAP database there are two types of data which relate to the automounter
configuration: automountMap and automount.


An automountMap is a base for automount objects. Here's how to list the automountMaps:
control1:~ # ldapsearch -x -LLL '(objectClass=automountMap)'
dn: ou=auto.FlexFrame,ou=Automount,ou=pool2,ou=Pools,ou=FlexFrame,
dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com
objectClass: top
objectClass: automountMap
ou: auto.FlexFrame

dn: ou=auto_FlexFrame,ou=Automount,ou=pool2,ou=Pools,ou=FlexFrame,
dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com
objectClass: top
objectClass: automountMap
ou: auto_FlexFrame
...

The base directory looks like this:

dn: cn=/FlexFrame,ou=auto_master,ou=Automount,ou=pool1,ou=Pools,
ou=FlexFrame,dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com
objectClass: top
objectClass: automount
cn: /FlexFrame
automountInformation: auto_FlexFrame

Further on there are entries like:

dn: cn=myAMC,ou=auto_FlexFrame,ou=Automount,ou=pool1,ou=Pools,
ou=FlexFrame,dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com
objectClass: top
objectClass: automount
cn: myAMC
automountInformation:
-rw,nointr,hard,rsize=32768,wsize=32768,proto=tcp,nolock,
vers=3 filpool1-st:/vol/volFF/pool-pool1/pooldata/&

There are two things that have to be pointed out.
First, the ou=auto_FlexFrame denotes an entry which is dedicated to Solaris
Application Nodes and refers to the base directory as shown before. The same entries
for Linux Application Nodes are named ou=auto.FlexFrame (point instead of an
underscore).
The second notable aspect of this entry is the use of the wildcard &. In essence this entry
tells the autofs process that, if the folder /FlexFrame/myAMC is accessed, it should try to
mount it from the path filpool1-st:/vol/volFF/pool-pool1/pooldata/myAMC.
If the folder myAMC is found and the permissions allow the clients to access it, it will be


mounted to /FlexFrame/myAMC/<name>. If myAMC is not found or the client does not
have the permissions, the folder will not be mounted. In such a case, try to mount the
folder manually to a different mount point, e.g.:

an_linux:~ # mount filpool1-st:/vol/volFF/pool-pool1/pooldata/myAMC /mnt

If you get an error message like Permission denied, check the exports on the Filer
and the existence of the directory myAMC/ itself.
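The exports can be checked, for example, with the following commands; exportfs on the Filer
lists the currently active export rules, and filpool1-st is the illustrative Filer storage
interface name used above:

an_linux:~ # showmount -e filpool1-st
control1:~ # rsh <filer_name> exportfs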
Other entries in LDAP make use of variables like ${OSNAME} which is either Linux or
SunOS:

dn: cn=/,ou=auto.oracle,ou=Automount,ou=pool2,ou=Pools,
ou=FlexFrame,dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com
objectClass: top
objectClass: automount
cn: /
description: catch-all for Linux automount
automountInformation:
-rw,nointr,hard,rsize=32768,wsize=32768,proto=tcp,nolock,
vers=3 filpool2-st:/vol/volFF/pool-pool2/oracle/${OSNAME}/&

On Linux, the automount mount points can be read using the following command:
an_linux:~ # mount
rootfs on / type rootfs (rw)
/dev/root on / type nfs (ro,v3,rsize=32768,wsize=32768,reserved,
hard,intr,tcp,nolock,addr=192.168.11.206)
192.168.11.206:/vol/volFF/os/Linux/FSC_3.2B00-000.SLES-
9.X86_64/var_img/var-c0a80b36
on /var type nfs (rw,v3,rsize=32768,wsize=32768,reserved,
hard,intr,tcp,nolock,addr=192.168.11.206)
192.168.11.206:/vol/volFF/os/Linux/FSC_3.2B00-000.SLES-
9.X86_64/var_img/var-c0a80b36
/dev on /dev type nfs (rw,v3,rsize=32768,wsize=32768,reserved,
hard,intr,tcp,nolock,addr=192.168.11.206)
192.168.11.206:/vol/volFF/os/Linux/pool_img/pool-c0a80bff on
/pool_img type nfs (rw,v3,rsize=32768,wsize=32768,reserved,
hard,intr,tcp,nolock,addr=192.168.11.206)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw)
shmfs on /dev/shm type shm (rw)
/dev/ram on /var/agentx type ext2 (rw)
automount(pid1750) on /FlexFrame type autofs (rw)
automount(pid1772) on /saplog/mirrlogA type autofs (rw)
automount(pid1752) on /home_sap type autofs (rw)


automount(pid1788) on /saplog/saplog1 type autofs (rw)


automount(pid1766) on /sapdata/sapdata5 type autofs (rw)
automount(pid1778) on /saplog/origlogA type autofs (rw)
automount(pid1762) on /sapdata/sapdata3 type autofs (rw)
automount(pid1758) on /sapdata/sapdata1 type autofs (rw)
automount(pid1784) on /saplog/saparch type autofs (rw)
automount(pid1764) on /sapdata/sapdata4 type autofs (rw)
automount(pid1786) on /saplog/sapbackup type autofs (rw)
automount(pid1760) on /sapdata/sapdata2 type autofs (rw)
automount(pid1754) on /myAMC type autofs (rw)
automount(pid1796) on /usr/sap type autofs (rw)
automount(pid1780) on /saplog/origlogB type autofs (rw)
automount(pid1768) on /sapdata/sapdata6 type autofs (rw)
automount(pid1792) on /saplog/sapreorg type autofs (rw)
automount(pid1776) on /saplog/oraarch type autofs (rw)
automount(pid1774) on /saplog/mirrlogB type autofs (rw)
automount(pid1770) on /sapdb type autofs (rw)
automount(pid1756) on /oracle type autofs (rw)
automount(pid1790) on /saplog/saplog2 type autofs (rw)
automount(pid1794) on /sapmnt type autofs (rw)
filpool2-st:/vol/volFF/pool-pool2/pooldata on /FlexFrame/pooldata
type nfs (rw,v3,rsize=32768,wsize=32768,reserved,hard,tcp,
nolock,addr=filpool2-st)

If your Application Node is a Solaris server, use the following command:


an_solaris:~ # ldaplist -l ou=auto_master,ou=Automount
dn: cn=/FlexFrame,ou=auto_master,ou=Automount,ou=pool1,ou=Pools,
ou=FlexFrame,dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com
objectClass: top
objectClass: automount
cn: /FlexFrame
automountInformation: auto_FlexFrame

dn: cn=/home_sap,ou=auto_master,ou=Automount,ou=pool1,ou=Pools,
ou=FlexFrame,dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com
objectClass: top
objectClass: automount
cn: /home_sap
automountInformation: auto_home_sap

dn: cn=/myAMC,ou=auto_master,ou=Automount,ou=pool1,ou=Pools,
ou=FlexFrame,dc=flexframe,dc=wdf,dc=fujitsu-siemens,dc=com
objectClass: top
objectClass: automount


cn: /myAMC
automountInformation: auto_myAMC

dn: cn=/sapdata/sapdata1,ou=auto_master,ou=Automount,ou=pool1,
ou=Pools,ou=FlexFrame,dc=flexframe,dc=wdf,dc=fujitsu-siemens,
dc=com
objectClass: top
objectClass: automount
cn: /sapdata/sapdata1
automountInformation: auto_sapdata1
...

The cn: parts show the mount point.

2.9 Network Appliance Filer


In FlexFrame 3.2, the storage for all Application Nodes is consolidated to one or more
NAS Filers from Network Appliance.
Fujitsu Siemens Computers are working jointly with Network Appliance (see
http://www.netapp.com) on the development of the FlexFrame concept. The Network
Appliance product class "Filer" is an essential part of the FlexFrame solution.
The operating system of the Filer is called "ONTAP". The disks are grouped into RAID
groups. A combination of RAID groups makes up a volume. Starting with ONTAP 7,
aggregates can also be created. FlexVolumes can be created on top of aggregates. A volume
contains a file system (WAFL - Write Anywhere File Layout) and can serve as NFS (for
UNIX systems) or CIFS (for Windows systems) volumes or mount points. The Filer has
NVRAM (Non Volatile RAM) that buffers committed IO blocks. The contents of the
NVRAM will remain intact if the power of the Filer should fail. Data will be flushed to the
disks once power is back online.
The minimal FlexFrame landscape has at least the following volumes:
● vol0 (ONTAP, configuration of Filer)
● sapdata (database files)
● saplog (database log files)
● volFF (OS images of Application Nodes, SAP and database software,
pool related files)
In FlexFrame 3.2, the volume volFF will separate FlexFrame data (file system of
Application Nodes and other software) from the Filer's configuration and ONTAP. In
larger installations, multiple sapdata and saplog volumes can be created (e.g. to
separate production and QA etc.).


2.9.1 Built-in Cluster File System


The Network Appliance implementation of NFS (Network File System) allows
sharing of the same data files between multiple hosts. No additional product (e.g. a cluster
file system) is required.

2.9.2 Volume Layout


The FlexFrame concept reduces the amount of "wasted" disk space since multiple SAP
systems can optionally share the same volume of disks. As the data grow, one can easily
add additional disks and enlarge the volumes without downtime.
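As an illustration, a volume can be enlarged online with Data ONTAP commands such as the
following; the volume names and the size are examples only, and which command applies depends
on whether a traditional volume or a FlexVol is used:

control1:~ # rsh <filer_name> vol add sapdata 2
control1:~ # rsh <filer_name> vol size volFF +100g

The first command adds two spare disks to the traditional volume sapdata, the second grows
the flexible volume volFF by 100 GB.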

2.9.3 Snapshots
When a snapshot is taken, no data blocks are copied; only the information where the
data blocks are located is saved. If a data block is modified, it is written to a new
location, while the content of the original data block is preserved (also known as "copy on
write"). Therefore, the creation of a snapshot is done very quickly, since only little data
has to be copied. Besides that, the snapshot functionality provided by NetApp is unique,
because the usage of snapshots does not decrease the throughput and performance of
the storage system.
The snapshot functionality allows the administrator to create up to 250 backup views of a
volume. The functionality "SnapRestore" provided by NetApp significantly reduces the
time to restore any of the copies if required. Snapshots are named and can be renamed
and deleted. Nested snapshots can be used to create e.g. hourly and daily backups of all
databases. In a FlexFrame landscape, a single backup server is sufficient to create tape
backups of all volumes. Even a server-less backup can be implemented. Off-line backups
require a minimal downtime of the database because the backup to tape can be done
reading from a quickly taken snapshot.
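For illustration, typical Data ONTAP commands for handling snapshots look like the examples
below; the volume and snapshot names are arbitrary, snap restore requires the SnapRestore
license, and in a FlexFrame landscape database snapshots are normally triggered by the backup
tooling rather than created by hand:

control1:~ # rsh <filer_name> snap create sapdata daily.0
control1:~ # rsh <filer_name> snap list sapdata
control1:~ # rsh <filer_name> snap restore -s daily.0 sapdata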

2.9.4 Filer Cluster


A Filer can be clustered to protect data against the failure of a single Filer. Switching from
one Filer to its cluster counterpart is transparent to the Application Nodes.
Filer clustering is a functionality of Network Appliance Filers.


3 FlexFrame Basic Administration

3.1 Accessing a FlexFrame Landscape (Remote Administration)
A FlexFrame landscape can be accessed through Secure Shell (ssh, scp) connections
to the Control Node. Any other remote administration tools like rsh or telnet have been
disabled for reasons of security.

3.2 Powering up the FlexFrame Landscape


If the complete FlexFrame landscape was powered off, the following power-on sequence
is recommended:


Before an Application Node can be booted, the NTP server must be set up correctly.
This can be verified by running the following command:
control1:~ # ntpq -p
remote refid st t when poll reach delay offset jitter
=================================================================
*LOCAL(0) LOCAL(0) 5 l 45 64 377 0.000 0.000 0.004
control2-se control1-se 7 u 248 1024 377 2.287 2.587 0.815


This command needs to be repeated until an asterisk (*) is displayed at the beginning of
one of the data lines. This character indicates that the NTP server is now ready. If you do
not wait and continue with booting, the Application Nodes may work with a different time
than the Control Nodes and may (among other possible side effects) create files which
may have wrong time stamp information.
If this sequence is not used and all servers are powered on at the same time,
the Application Nodes will try to boot while the Control Nodes are not ready to
receive the Application Node’s boot request. If this is the case, manual
intervention is required to re-initiate the boot process of the Application Nodes.

3.3 Powering off the FlexFrame Landscape


If you need to power off the complete FlexFrame landscape (e.g. to move it to a different
location), we recommend following the steps outlined below:

Before shutting down all SAP and DB services, check if users, batch jobs, print
jobs and RFC connections to other SAP systems have finished working.


To shut down the SAP and DB services, use the following command for each pool:
control1:~ # stop_all_sapservices <pool_name>

Before you can stop the Filer, you need to stop all processes on the Control Nodes. Since
those processes are under control of PRIMECLUSTER you can use a single command:
control1:~ # hvshut -a

To send the halt command to the Filer, use the following command:
control1:~ # rsh <filer_name> halt

There is no explicit power-off sequence for the switches. Assuming there are no
other devices connected to the switches, they may simply be unplugged after
all components have been powered down.
If you do not send an explicit “halt” command to the Filer, the backup battery of
the Filer may be drained, since the Filer assumes a power loss and tries to
preserve the contents of its NVRAM. If the Filer is powered off for too long, the
result can be a loss of NVRAM data and a long waiting period during the next
startup of the Filer.

3.4 Reactivating ANs after Power Shutdown by FA Agents

FlexFrame Autonomy places an Application Node out of service if it is confronted with a
problem it cannot solve.
The reactions, messages and alarms which take place in this case are described in the
“FA Agents - Installation and Administration” manual in the context of the switchover
scenarios.
It is the responsibility of the administrator to analyze why the node could not be used any
longer.
We recommend analyzing the FA log and work files as these may be able to supply
valuable information.
A node which is to start operating as a Spare Node after switchover must be validated
using suitable test scenarios.



4 Displaying the Current FlexFrame Configuration State
To obtain a general overview of an active FlexFrame system, use the FA WebGUI.
In principle, the FA WebGUI can be used with any browser with SUN JAVA Plugin
V1.4.1 or higher that has access to the page on the Control Node. The WebGUI can
always be accessed when the Apache Tomcat service is running. This service is normally
started by PRIMECLUSTER.
The WebGUI is described in detail in chapter 5.
The login mask expects a user name and password for authentication purposes. You can
only use the WebGUI with a valid combination of user name and password. For details on
the configuration of the users, see the myAMC documentation for the WebGUI.


The FA WebGUI provides a presentation of all elements of a FlexFrame system. On the
left-hand side the pools, groups, nodes and the active SAP services on the individual
nodes are shown in a TreeView.
The TreeView of the FA WebGUI can show either the physical view or the application-
related view of the active SAP systems and their instances or a mixed view.
The panels derived from the TreeView (Application Server Panel, Application System
Panel, ServiceView and MessageView) always show the objects in relation to the
selected hierarchical level in the TreeView.
The FA WebGUI is thus the central cockpit for displaying the static configuration of a
FlexFrame infrastructure, but it also displays the active SAP systems and their
instances. Here, all user interactions such as startup and shutdown are shown directly.
All reactions initiated by the FA Agents are displayed as well.
If the FA Messenger component has been configured and activated, all traps are stored
in a support database. This permits the chronological sequence of the traps to be displayed
very simply at pool, group or node level.

4.1 Networks
The network of FlexFrame is its backbone. Here are some tips to get an overview of the
current situation on the various networks:
To double-check the network addresses, their names and pool assignment you can use
the getent command:

control1:~ # getent networks


loopback 127.0.0.0
control 192.168.20.0
storage_pool1 192.168.10.0
server_pool1 192.168.3.0
client_pool1 10.1.1.0
storage_pool2 192.168.11.0
server_pool2 192.168.4.0
client_pool2 10.1.2.0

The loopback network is local for each host and always has the IP address 127.0.0.0.
The control network is the Control LAN network segment for the complete FlexFrame
landscape.
In the example we have configured two pools called pool1 and pool2. For each pool
there are the three dedicated and distinct segments storage, server and client. The
building rule of the network name is <segment>_<pool name>.


On the Control Nodes you can see the relation of each pool specific segment to its
interface using the netstat -r command like this:

control1:~ # netstat -r
Kernel IP routing table
Destination     Gateway   Genmask          Flags MSS Window irtt Iface
192.168.100.0   *         255.255.255.252  U     0   0      0    cip0
server_pool2    *         255.255.255.0    U     0   0      0    vlan42
control         *         255.255.255.0    U     0   0      0    bond0
control         *         255.255.255.0    U     0   0      0    eth1
control         *         255.255.255.0    U     0   0      0    eth2
server_pool1    *         255.255.255.0    U     0   0      0    vlan32
storage_pool2   *         255.255.255.0    U     0   0      0    vlan41
client_pool1    *         255.255.255.0    U     0   0      0    vlan30
storage_pool1   *         255.255.255.0    U     0   0      0    vlan31
client_pool2    *         255.255.255.0    U     0   0      0    vlan40
default         gw216p1   0.0.0.0          UG    0   0      0    vlan30

The cip0 interface is used for PRIMECLUSTER communication between the two Control
Nodes; this network does not have a name associated with it.
Here you can quickly see that the Server LAN segment of pool2 (server_pool2) is
using the VLAN ID 42 on interface vlan42.
Note that the control (Control LAN) segment is shown on three interfaces (bond0,
eth1 and eth2), because bond0 is the combination of eth1 and eth2 grouped
together for redundancy purposes.


4.1.1 Script: ff_netscan.sh


To get more detailed information on each network component, FlexFrame provides the tool
ff_netscan.sh. The tool probes all known hosts (derived from LDAP) and those that
respond to a broadcast ping in each network (except for the Client LANs).
Gathering the information via SNMP can take some time.
The following outputs are sections of the output of a single ff_netscan.sh run. It was
split into several sections for explanation purposes only.
control1:~ # ff_netscan.sh
Netscan version 1.28 started
Copyright (C) 2004, 2005 Fujitsu Siemens Computers. All rights
reserved.
SNMP @ 10.1.1.10 (an_0600 ):ok.
SNMP @ 10.1.1.20 (an_0700 ):ok.
SNMP @ 10.1.1.30 (an_0800 ):ok.
SNMP @ 10.1.1.40 (an_0900 ):Not reached.
SNMP @ 10.1.1.60 (an_0510 ):ok.
SNMP @ 10.1.1.61 (an_0511 ):ok.

Each line represents a request to a certain IP address. Since Application Nodes have
multiple IP addresses they are queried multiple times, which is required to detect wrong
configurations.
The Application Node an_0900 was reported as “Not reached“, which means that this IP address (read
from the LDAP database) did not respond to an ICMP request. A possible cause may be that
this server is powered off.

SNMP @ 10.1.2.205 (control2 ):ok.


SNMP @ 10.1.2.254 (gw216p2 ):No SNMP info.

The message No SNMP info. is printed if the IP address is available but did not respond
to the SNMP request. A possible cause is that the server may not be part of the
FlexFrame landscape. In our example it is a gateway which does not have an SNMP-
daemon running.

SNMP @ 192.168.3.10 (an_0600-se ):ok.


SNMP @ 192.168.3.12 ( ):ok.
SNMP @ 192.168.3.20 (an_0700-se ):ok.
SNMP @ 192.168.3.22 ( ):ok.


There are three IP addresses listed which do not have a host name associated with them;
therefore the names in brackets are missing. In our case this is caused by the test
addresses for Solaris Application Nodes (IPMP).

SNMP @ 192.168.20.252 (Cis-Sum ):ok.


Scanning switches...

After the last server was probed with SNMP, we scan for available switches.
Now each Control Node or Application Node is listed. There are different types of
components which produce slightly different output.
Here is a sample of a Control Node:

FlexFrame Node information:


Nodename OS OS Version Model
control1 cn SuSE SLES-8 FSC

In the header of the block for this Control Node you see its name and the OS as cn which
denotes a Control Node. The OS Version indicates that this Control Node is running
SuSE SLES-8. The Model FSC (Fujitsu Siemens Computers) is displayed if no detailed
information on the server's exact model name can be found.
Now each network device is listed along with information on the associated IP address,
the MTU size (maximum transmission unit), the MAC address, and the switch and port
the device is connected to.

Device IP Addr MTU MAC Switch Port


lo 127.0.0.1 16436 n.a. n.a. n.a.
eth0 not up 1500 003005406ab8 !undef! !undef!
eth1 192.168.20.201 1500 003005406ab9 Cisco1 Gi1/0/3
eth2 no ip 1500 003005406ab9 Cisco1 Gi1/0/3
bond0 no ip 1500 003005406ab9 Cisco1 Gi1/0/3
vlan30 10.1.1.201 1500 003005406ab9 Cisco1 Gi1/0/3
vlan31 192.168.10.201 1500 003005406ab9 Cisco1 Gi1/0/3
vlan32 192.168.3.201 1500 003005406ab9 Cisco1 Gi1/0/3
vlan40 10.1.2.204 1500 003005406ab9 Cisco1 Gi1/0/3
vlan41 192.168.11.204 1500 003005406ab9 Cisco1 Gi1/0/3
vlan42 192.168.4.204 1500 003005406ab9 Cisco1 Gi1/0/3
cip0 192.168.100.1 1500 00c0a8640100 !undef! !undef!
cip1 not up 1500 !undef! !undef!
cip2 not up 1500 !undef! !undef!
cip3 not up 1500 !undef! !undef!
cip4 not up 1500 !undef! !undef!
cip5 not up 1500 !undef! !undef!
cip6 not up 1500 !undef! !undef!
cip7 not up 1500 !undef! !undef!
sit0 not up 1480 !undef! !undef!


There are several abbreviations which are explained in this table:

Abbreviation Explanation
n.a.        Not applicable. Since the device lo (loopback) is a logical
            interface with no connection to the outside world, no information on
            MAC, switch or port can be displayed.
!undef!     Marks an undefined status. The device is not configured or is in an
            undefined status. Please check these interfaces. On a Control
            Node, eth0 is used for IPMI and hence not used by the OS. The
            devices sit0 and cip1 through cip7 are not used.
no ip       There is no IP address associated with this device.
not up      This device or interface is not up.
Gi<s>/0/<p> This denotes a port on a switch. Gi is short for “GigabitEthernet”,
            <s> is the switch number within the stack of switches and <p> is the
            number of the port this particular interface is connected to.
            The example Gi1/0/3 tells us that the eth1 interface is connected
            to port 3 of switch #1 of this stack of switches.
            Since the “bonding” driver of Linux shares the same MAC
            address between both lines, only the active port of the
            redundant interfaces is seen.
Po<c>       If Po<c> is shown as a port, it is a PortChannel (a combination of
            multiple ports). To see further details of a port channel, you need to
            read the switch configuration using the command show
            etherchannel details.

The bond0 interface is built of eth1 and eth2. Unfortunately, Linux does not
deliver the correct information in such a case. The bond0 interface is actually up
and using the IP address shown for eth1.
The name of the switch and the short notation of the port are shown if possible. The
information is derived from ARP caches, which are very volatile. There may be cases
where !undef! is listed although a port is associated. Furthermore, only the active interface
of eth1 and eth2 is seen.


The following output is from a Filer:

FlexFrame Node information:


Nodename OS OS Version Model
filpool1-co ONTAP 7.0 FAS940
Device IP Addr MTU MAC Switch Port
e0 192.168.20.203 1500 00a098011e92 Cisco1 Gi1/0/18
e9a not up 1500 0007e9391884 !undef! !undef!
e9b not up 1500 0007e9391885 !undef! !undef!
e10a not up 1500 02a098011e92 Cisco1 Po5
e10b not up 1500 0007e93ef757 !undef! !undef!
e11a not up 1500 0007e93919d8 !undef! !undef!
e11b not up 1500 0007e93919d9 !undef! !undef!
e4a not up 1500 02a098011e92 Cisco1 Po5
e4b not up 1500 0007e93ef3e1 !undef! !undef!
lo 127.0.0.1 9188 00000000 !undef! !undef!
vh not up 9188 00000000 !undef! !undef!
storage not up 1500 02a098011e92 Cisco1 Po5
storage-31 192.168.10.203 1500 02a098011e92 Cisco1 Po5
storage-41 192.168.11.206 1500 02a098011e92 Cisco1 Po5

Here is a sample for a BX600 or BX300 Linux blade server:

FlexFrame Node information:


Nodename OS OS Version Model
an_0504 Linux SuSE SLES-8 FSC
Device IP Addr MTU MAC Switch Port
lo 127.0.0.1 16436 n.a. n.a. n.a.
eth0 192.168.11.54 1500 00c09f3a874e Cisco1 Po4
eth1 no ip 1500 00c09f3a874e Cisco1 Po4
bond0 no ip 1500 00c09f3a874e Cisco1 Po4
vlan40 10.1.2.54 1500 00c09f3a874e Cisco1 Po4
vlan42 192.168.4.54 1500 00c09f3a874e Cisco1 Po4

The blade servers are connected to switch blades. The related port is shown as
a PortChannel. This represents the “view” from the Cisco switch.

A sample switch blade configuration is shown below:

FlexFrame Node information:


Nodename OS OS Version Model
bx600-2-swb2 BX_SwitchBlade 1.0.0.2 Switch Blade
Device IP Addr MTU MAC Switch Port
port:1 192.168.20.52 1522 0030f1c42f81 !undef! !undef!
port:2 no ip 1522 0030f1c42f82 !undef! !undef!
port:3 no ip 1522 0030f1c42f83 !undef! !undef!


port:4 not up 1522 0030f1c42f84 !undef! !undef!


port:5 not up 1522 0030f1c42f85 !undef! !undef!
port:6 not up 1522 0030f1c42f86 !undef! !undef!
port:7 not up 1522 0030f1c42f87 !undef! !undef!
port:8 no ip 1522 0030f1c42f88 !undef! !undef!
port:9 no ip 1522 0030f1c42f89 !undef! !undef!
port:10 no ip 1522 0030f1c42f8a !undef! !undef!
port:11 no ip 1522 0030f1c42f8b Cisco1 Po4
port:12 no ip 1522 0030f1c42f8c !undef! !undef!
port:13 not up 1522 0030f1c42f80 Cisco1 Po4
port:14 no ip 1522 0030f1c42f8b Cisco1 Po4
port:15 no ip 0 000000000000 !undef! !undef!
port:1001 no ip 0 0030f1c42f80 Cisco1 Po4
port:1008 no ip 0 0030f1c42f80 Cisco1 Po4
port:1030 no ip 0 0030f1c42f80 Cisco1 Po4
port:1031 no ip 0 0030f1c42f80 Cisco1 Po4
port:1032 no ip 0 0030f1c42f80 Cisco1 Po4
port:1040 no ip 0 0030f1c42f80 Cisco1 Po4
port:1041 no ip 0 0030f1c42f80 Cisco1 Po4
port:1042 no ip 0 0030f1c42f80 Cisco1 Po4

The management blade of a blade server chassis is displayed as follows:

FlexFrame Node information:


Nodename OS OS Version Model
"BX600" frame BX 1.42
--- Slot # 1 serial#: XXXX000000
--- Slot # 2 serial#: XXXX000001
--- Slot # 3 serial#: XXXX000002
--- Slot # 4 serial#: XXXX000003
--- Slot # 5 serial#: XXXX000004
--- Slot # 6 serial#: XXXX000005
--- Slot # 7 serial#: XXXX000006
--- Slot # 8 serial#: XXXX000007
--- Slot # 9 serial#: XXXX000008
--- Slot #10 serial#: XXXX000009
Device IP Addr MTU MAC Switch Port
eth0 192.168.20.50 1500 00c09f37e6a8 Cisco1 Gi2/0/22

Here you can see if there is a certain blade server in a given slot. If so, its serial number
is listed.
Only one interface (eth0) of the two available management blades is active at a
given time.


Sample output for a Solaris Application Node:

FlexFrame Node information:


Nodename OS OS Version Model
an_0800 Solaris 5.8. PW250
Device IP Addr MTU MAC Switch Port
fjgi1 192.168.10.31 1500 00e000c51a0b Cisco1 Gi1/0/9
lo0 127.0.0.1 8232 n.a. n.a. n.a.
fjgi0 192.168.10.32 1500 00e000a6d1d8 Cisco1 Gi2/0/9
fjgi0:1 192.168.10.30 1500 00e000a6d1d8 Cisco1 Gi2/0/9
fjgi30000 not up 1500 00e000a6d1d8 Cisco1 Gi2/0/9
fjgi30000:1 10.1.1.30 1500 00e000a6d1d8 Cisco1 Gi2/0/9
fjgi30001 not up 1500 00e000c51a0b Cisco1 Gi1/0/9
fjgi32000 not up 1500 00e000a6d1d8 Cisco1 Gi2/0/9
fjgi32000:1 192.168.3.30 1500 00e000a6d1d8 Cisco1 Gi2/0/9
fjgi32001 not up 1500 00e000c51a0b Cisco1 Gi1/0/9
XSCF board @ 192.168.20.30
FlexFrame Node information:
Nodename OS OS Version Model
an_0800-co xscf 04040001 unknown
Device IP Addr MTU MAC Switch Port
lo0 127.0.0.1 1536 000000000000 !undef! !undef!
scflan0 192.168.20.30 1500 00e000c59a0b Cisco1 Gi1/0/20

PRIMEPOWER 250 and 450 models have a built-in XSCF which is listed after
the Application Node itself. Here, the XSCF is connected to port Gi1/0/20.

The following example shows some error messages. These error messages indicate that
one NIC of this Application Node is not connected to the port it should be connected to
(according to the LDAP database). If, for example, two ports of two Application Nodes of the same
pool are swapped, the functionality remains the same. However, if one of these Application Nodes
is moved into a different pool, the port assignments change, which would lead to a malfunction
of both ports.
If this error message is displayed, check the cabling and run ff_netscan.sh again.

FlexFrame Node information:


Nodename OS OS Version Model
an_0600 Solaris 5.8. PW250
Device IP Addr MTU MAC Switch Port
Error: Node "an_0600" is connected to port=Gi1/0/6, but should be
Gi1/0/5 or Gi2/0/5.
fjgi1 192.168.10.11 1500 00e000c51941 Cis-Sum-1 Gi1/0/6
lo0 127.0.0.1 8232 n.a. n.a. n.a.
fjgi0 192.168.10.12 1500 00e000a6d13a Cis-Sum-1 Gi2/0/5
fjgi0:1 192.168.10.10 1500 00e000a6d13a Cis-Sum-1 Gi2/0/5


fjgi30000 not up 1500 00e000a6d13a Cis-Sum-1 Gi2/0/5


fjgi30000:1 10.1.1.10 1500 00e000a6d13a Cis-Sum-1 Gi2/0/5
Error: Node "an_0600" is connected to port=Gi1/0/6, but should be
Gi1/0/5 or Gi2/0/5.
fjgi30001 not up 1500 00e000c51941 Cis-Sum-1 Gi1/0/6
fjgi32000 not up 1500 00e000a6d13a Cis-Sum-1 Gi2/0/5
fjgi32000:1 192.168.3.10 1500 00e000a6d13a Cis-Sum-1 Gi2/0/5
Error: Node "an_0600" is connected to port=Gi1/0/6, but should be
Gi1/0/5 or Gi2/0/5.
fjgi32001 not up 1500 00e000c51941 Cis-Sum-1 Gi1/0/6
XSCF board @ 192.168.20.10
FlexFrame Node information:
Nodename OS OS Version Model
an_0600-co xscf 04040001 unknown
Device IP Addr MTU MAC Switch Port
lo0 127.0.0.1 1536 000000000000 !undef! !undef!
scflan0 192.168.20.10 1500 00e000c59941 Cis-Sum-1
Gi1/0/21

Similar error messages are:

Error: Node <node_name> NIC <nic_name>: IP address <ip_address> is in wrong VLAN (<number>)

This error message indicates that the IP address shown does not belong to the interface
on which it is configured; it belongs to a different VLAN (network segment).
Not all combinations can be verified this way.

Error Node <node_name> unknown MAC (<mac_addr>) instead of (<MAC1> or <MAC2>)

This indicates that the two MAC addresses (MAC1 and MAC2) which are known to the
LDAP database were not found on this Application Node (<node_name>). One possible
cause is that a NIC was replaced but the LDAP database was not updated using the
appropriate tool.
If the known NIC fails and the Application Node tries to boot again, it will not get an IP
address and cannot boot.

Error: Node <node_name> does not have IP address <ldap_an_ip> configured!

The IP address ldap_an_ip could not be found on this Application Node. Maybe there
was a problem during configuration of the network interfaces.


There may still be wrong settings which cannot be detected by ff_netscan.sh.

At the end of the ff_netscan.sh output you should find the line:

Netscan finished.

For further details on options refer to the online manual page of ff_netscan.sh using
the following command:
control1:~ # man ff_netscan.sh

4.1.2 Script: ff_pool_defrt.sh

Description
This command sets the default router for a FlexFrame pool. LDAP entries for DHCP
configuration and pool information are adjusted. Each Solaris Application Node (if
installed) will get an /etc/defaultrouter file with the default router you provided in
the command line.
The default router must be a valid router (resolvable name or IP address) of any
of the available networks for all Application Nodes in the given pool (by pool
name). This tool will not check if the default router is reachable from the
Application Nodes.

Synopsis

ff_pool_defrt.sh [-d] -p <pool_name> -r <defrouter>

Command Options
-d Debugging information will be logged to /tmp/ff_pool_defrt.DEBUGLOG.
-p <pool_name>
Name of the pool. Note: The pool name is case sensitive.
-r <defrouter>
Name or IP address for the default router for the given pool.
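A typical invocation could look like this (the pool name and the router address are example
values only):
control1:~ # ff_pool_defrt.sh -p pool1 -r 10.1.1.254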

Debugging
/tmp/ff_pool_defrt.DEBUGLOG will hold debugging information. In case of problems
please provide this file.


4.1.3 Script: ff_pool_dnssrv.sh

Description
This command adds or removes a DNS server for a pool. LDAP entries for DHCP
configuration and pool information are adjusted. Each Solaris Application Node will get an
/etc/resolv.conf file with the DNS server and DNS domain name given in the
command line. Each Linux node will get an entry in
/FlexFrame/volFF/pool-<pool name>/pooldata/config/etc/resolv.conf,
which is a symbolic link from the /etc/resolv.conf file of the Application Node.

Synopsis

ff_pool_dnssrv.sh [-d -f] -p <pool_name> -r <dns_server> -u <action> -n <domain_name>

Command Options
-d Debugging information will be logged to /tmp/ff_pool_dnssrv.DEBUGLOG.
-p <pool_name>
Name of the pool. Note: The pool name is case sensitive.
-r <dns_server>
IP address of the DNS server for the given pool. If the IP address does not match any of
the local pool networks (Client VLAN, Storage VLAN or Server VLAN) and no default
router is installed, processing is aborted.
-u <action>
Possible actions are add or remove.
-n <domain_name>
DNS domain name of the given pool. If the domain name does not match the one given
in LDAP and the -f option is not used, processing is aborted.
-f Use force option to change the domain name when adding a DNS server.
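For example, adding a DNS server to a pool could look like this (the IP address and domain name
are example values only):
control1:~ # ff_pool_dnssrv.sh -p pool1 -u add -r 10.1.1.253 -n mydomain.example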

Debugging
/tmp/ff_pool_dnssrv.DEBUGLOG will hold debugging information. In case of
problems please provide this file.


4.2 State of Pools


The active configured pools are displayed on the FA WebGUI. The nodes or systems
belonging to a pool are displayed in the FA WebGUI TreeView. Each node element in the
tree which represents a pool is identified by the prefixed keyword Pool.

4.3 State of Application Nodes


Each Application Node with a running FA Application Agent is shown in the FA WebGUI.
It is shown in the Nodes TreeView with its host name (Linux) or node name (Solaris)
and also in the Application Server Panel. In the Application Server Panel the data
displayed depend on the hierarchical level selected in the TreeView.

4.4 State of SAP Systems


The active SAP system IDs can be displayed very easily in the Systems TreeView of the
FA WebGUI and also in the SAP System Panel. The SAP System Panel is shown in
parallel to the Application Server Panel. In the SAP System Panel, the data displayed
depend on the hierarchical level selected in the TreeView.

4.5 State of SID Instances


The active instances of a SID can be displayed very simply in the InstancesView of the
FA WebGUI by clicking on a SID in the SAP System Panel.
For each view, information is provided in tabular form specifying the service’s current
pool, group, node, priority and status.



5 Web Interfaces

5.1 FlexFrame Web Portal


The FlexFrame Control Nodes provide a web portal with links to Web interfaces of
several FlexFrame components:
● FA Autonomous Agents
● PRIMECLUSTER Administration
● ServerView S2
To access this portal, start Mozilla and enter the Control Node’s IP address in the location
bar. If you are directly on the Control Node you want to configure, just call mozilla from
the shell:
mozilla localhost

You will see an overview of all Web interfaces installed on the Control Nodes.


If you cannot connect to any Control Node, the http service might be running on
the other Control node – if not, check the PRIMECLUSTER configuration.

5.2 FA Autonomous Agents


Use this tool to manage the virtualized services in your FlexFrame environment.
For information on usage, please refer to the FA Autonomous Agents manuals.
You can access the FlexFrame Autonomous Agents WebGUI directly from the
active Control Node by entering the following URL:
http://localhost:8080/FAwebgui
This only works if the Jakarta Tomcat web server is running. If it is not running,
check the PRIMECLUSTER configuration.
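One way to check from the command line whether the PRIMECLUSTER-controlled applications
(including the Tomcat service) are online is the RMS display command shown below; this is only a
quick sketch, refer to the PRIMECLUSTER manuals for details:
control1:~ # hvdisp -a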

5.3 PRIMECLUSTER Administration


The PRIMECLUSTER administration interface allows you to manage the PRIMECLUSTER
components of FlexFrame.
To start, click on the “PRIMECLUSTER Administration” link on the FlexFrame web portal.

You can directly access the PRIMECLUSTER Administration interface from any
Control Node by accessing the following URL:
http://localhost:8081/Plugin.cgi
This even works if the Apache web server is not running and therefore the
FlexFrame web portal is down.

You see the logon window.

Enter user name and password of the Linux root user.


After successful login you will see the Web-Based Admin View tool:

The Web-Based Admin View is a framework for integration of different administration
interfaces running on dedicated http daemons, independent of other installed web
servers.


Click on Global Cluster Services and then click on Cluster Admin.


This will start the PRIMECLUSTER administration interface JAVA applet.


To run the Cluster Admin JAVA applet, you have to accept the security certificate by
Fujitsu Siemens Computers. The following warning is displayed:

Select Yes or Always. Selecting Always will automatically accept the certificate the next
time you call this applet.
Next, you will be asked to which management server Cluster Admin should connect:

Select a server and click on Ok. For the PRIMECLUSTER management of FlexFrame it
does not matter which one you select.


5.3.1 Cluster Foundation (CF)


In the first window you will see the state of Cluster Foundation:

This part of PRIMECLUSTER is the base for Cluster interconnectivity and node
monitoring. Both Control Nodes should be displayed in green colour.
In some cases, PRIMECLUSTER cannot detect the correct state of the partner node and
will display LEFTCLUSTER and/or DOWN. For example this will happen if you remove all
network cables to the remote node including the IPMI interface.
In case of a fault:
● Check the connectivity and restart CF by selecting Tools - Stop CF and
Tools - Start CF. If this does not work, reboot at least one Control Node.
● If a node is really down and cannot be rebooted, select
Tools - Mark Node Down from the pull down menu.
Do not try to change CF configuration and do not unconfigure CF unless
instructed to do so by Fujitsu Siemens Computers support!


5.3.2 Reliant Monitor Services (RMS)


RMS is a part of PRIMECLUSTER which is used to control “userApplications”.
In the following window you can see a properly working FlexFrame environment (in this
example cn1_pool1RMS is the first Control Node, cn2_pool1RMS the second one).
Do not change configuration parameters of RMS unless instructed by Fujitsu
Siemens Computers support!

Select the tab labeled rms at the bottom of the left panel to proceed to PRIMECLUSTER
RMS administration.
For a description of how PRIMECLUSTER is implemented in FlexFrame, refer to section
PRIMECLUSTER on page 29.


5.3.3 Inconsistent and Faulted Applications


Inconsistent applications (see example 1) may occur when a sub application is started or
stopped manually and therefore the whole application including its children is neither
completely online nor completely offline.
If ff_manage is online on the first Control Node and Apache is started manually on the
second one, PRIMECLUSTER will mark ff_manage as “Inconsistent” on this node. To
see and understand this inconsistency, expand the tree view of the application to view its
children by clicking on the leftmost symbol.

Example 1: Inconsistent application

To clear this state, right click on ff_manage on the second Control Node and select
Offline or Clear fault.
You could also select Online in this case; PRIMECLUSTER would then shut down
ff_manage on the first Control Node and start it on the second one (this may take some
time). The application will be coloured yellow (Wait state) until the switch-over is
finished.
Faulted applications (see examples 2 and 3) may occur when a sub application crashes
and cannot be restarted or when a sub application cannot be started during
PRIMECLUSTER startup.
Before clearing a fault, you should look into the logfile to find out the reason for the fault.
To view the logfile, right click on the faulted application and select View logfile.


Example 2: Faulted application

Example 3: Inconsistent faulted application

Common causes for faults are misconfigured services, a broken network connection to
the Filer or a damaged mySQL database which is used by myAMC Messenger, to name
just a few examples.
To clear the fault state, right click on the application and select Clear fault to tell
PRIMECLUSTER that it should do offline processing on the application.
You have to switch the application online on one node to make sure that
FlexFrame will work correctly. This can be established by using the Online
(hvswitch) or Switch (hvswitch) entry of the application context menu.


5.3.4 Switching Applications


This section describes how to shutdown, start and switch-over applications including their
services.
Switch application offline
Right click on the application and select Offline (hvutil -f) from the context
menu.
Switch application online
Right click on the application and select Online (hvswitch) from the context
menu.
Switch over application to other Control Node
Right click on the application and select Switch (hvswitch) from the context menu,
then select the Node which the application should be switched to.
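The context menu entries named above correspond to PRIMECLUSTER command line tools which
can also be run directly on a Control Node. The following lines are only a sketch; the application
name and the RMS node name are placeholders, and the exact syntax is described in the
PRIMECLUSTER manuals:
control1:~ # hvutil -f <application>
control1:~ # hvswitch <application> <rms_node_name>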

5.3.5 Application Maintenance Mode


PRIMECLUSTER RMS provides a special maintenance mode. If activated, RMS will not
take any corrective actions if the status of a service changes.
Whenever you need to manually stop a service which is controlled by PRIMECLUSTER
(e.g. when installing updates and/or patches), you should switch the corresponding
application into maintenance mode.
To switch an application into maintenance mode, right click on the application and select
Enter Maintenance Mode.
To leave maintenance mode, right click on the application and select Exit
Maintenance Mode.
If you attempt to stop a service (for example LDAP by calling
/etc/init.d/ldap stop) without entering maintenance mode or shutting
down RMS, PRIMECLUSTER will try to recover the service after a few seconds
by calling the start script.
You may damage your configuration if you ignore this!
It is possible to set all applications into maintenance mode at once by right-clicking on the
cluster name (FLEXFRAME) and selecting Enter Maintenance Mode or Exit
Maintenance Mode, respectively, in this popup menu.
For further information on how to use PRIMECLUSTER, refer to the
PRIMECLUSTER manuals.


5.4 ServerView S2
ServerView S2 allows you to query information of PRIMERGY servers in the FlexFrame
environment.
This tool is not needed for administration of FlexFrame but is required by Fujitsu Siemens
Computers support.
If you would like to monitor the hardware status of PRIMERGY servers in your
FlexFrame™ environment, click on ServerView and on the next page, click on the
button labeled start:

Please note that SSL connections are not supported.


An overview of all configured servers (initially only the local host) will be shown on the
main screen:

To add more servers, run the Server Browser by either clicking on Administration /
Server Browser on the top navigation bar or right clicking anywhere in the list and
selecting New Server from the context menu.


The Server Browser window will open:


The easiest way to add servers is to scan a subnet for each pool, for example the
“Server” subnet for Application Nodes and the “Control” subnet for management blades.
To scan a subnet, enter the first three bytes of the network address in the Subnet input
field on the bottom left of the screen and click on Start Browsing:

The browsing process is finished when the button Stop Browsing changes its caption to
Start Browsing.

Please do not interrupt the browsing process as it may hang the JAVA applet or
the browsing processes. If this happens, the Server Browser or even ServerView
S2 must be restarted.
To restart ServerView S2, run the following commands on the console:
cn1:~ # /etc/PRIMECLUSTER/bin/srvst.sh stop
PRIMECLUSTER will detect the “faulted” application and try to restart it after
about 10 seconds. Verify this by calling:
cn1:~ # /etc/PRIMECLUSTER/bin/srvst.sh status


To add all manageable servers, right click anywhere on the list and click on
Select Manageables.


After clicking on Select Manageables, all servers with known type will be selected:

To finally add those selected servers, click on Apply in the upper right corner of the
Server Browser window.
Note that the Control Nodes are accessible from all subnets, so a message about already
existing servers will be shown which should be confirmed by clicking on OK:


If ServerView S2 detects blade servers, it asks to add the whole Blade Center instead of
each individual blade. This makes management easier and is highly recommended.
So in the following dialogue you should reply with Yes:

If ServerView cannot detect the host name of the Management Blade, it will ask for it:

You may enter anything you want here.


After closing the Server Browser, the main window containing the server list will show the
added servers. Further servers can be added by repeating the previous steps using a
different subnet address.

To view the state of a single server, click on the blue server name. To view the state of a
Server Blade, click on the Management Blade / Blade Center, then select the right blade
and click on ServerView on the bottom navigation bar.

Events such as SNMP Traps can be viewed by navigating to Event Management on the
top navigation bar. This will open ServerView AlarmService, which will not be described
in detail. To monitor SNMP Traps, we recommend using FA WebGUI / myAMC Messenger.



6 Hardware Changes

6.1 Changing BIOS Settings for Netboot


For detailed information on setting BIOS parameters, please refer to the hardware
documentation. Look at some of the configuration examples below. The exact procedure
may differ depending on the BIOS version of your hardware.

BX300 / 600
● While booting, press F2
● Menu:
Advanced -> PCI Configuration -> OnBoard LAN1 + LAN2 [Enable]
● Reboot
● Menu: Boot 2 x MBA …. on top
● Menu: Exit -> Saving Changes

BX600 S2
● While booting, press F2
● Menu:
Advanced -> PCI Configuration -> OnBoard LAN1 + LAN2 [Enable]
● Reboot
● While booting, press F2
● Menu: Boot -> Boot Device Priority -> 1st Boot Device
● Select IBA GE Slot 0420 …
● Menu: Boot -> Boot Device Priority -> 2nd Boot Device
● Select IBA GE Slot 0421 …
● Menu: Exit -> Saving Changes

RX300
● While booting, press F2
● Menu: Advanced -> LAN Remote Boot A+B [PXE]
● Reboot
● While booting, press F2


● Menu: Boot 2 x MBA … on top


● Menu: Exit -> Saving Changes

RX300 S2
First, you have to make the PCI-X NICs net bootable.
Download Proboot.exe from the INTEL website:
http://downloadfinder.intel.com/scripts-df-external/Product_Filter.aspx?ProductID=412
Extract proboot.exe and move it to a bootable CD-Rom.
Power on the RX300 S2.
After booting start IBAutil from the DOS Prompt:
<LW>:\ cd <IBAutil-DIR>
<LW>:\ IBAutil.exe -all -flashenable

Reboot.
After booting start IBAutil from the DOS Prompt:
<LW>:\ cd <IBAutil-DIR>
<LW>:\ IBAutil.exe -all -install pxe

Repeat the following steps for each NIC:


1. Answer the first question with “No”.
2. Answer the second question with “Yes”
3. At the end you will find a list of all NICs with the corresponding MAC addresses
4. Write down the MAC address of NIC1.

Remove the CD and reboot.


● While booting, press F2
● Menu: Advanced -> Peripheral Configuration
● Select LAN Controller [Channel A&B]
● Select LAN Remote Boot Ch. A: [Disabled]
● Select LAN Remote Boot Ch. B: [PXE]
● Menu: Exit, Save Changes + Exit
● Select Yes
● Reboot


● While booting, press F2


● Menu: Main -> Boot Option -> Boot Sequence
● Select BootManage PXE, SLOT 0500 (Onboard LAN B)
● Move it behind CDROM, Diskette by pressing +
● Select IBA GE Slot 0320 v1228 (extension LAN card NIC 1)
● Move it behind BootManage PXE, SLOT 0500 by pressing +
● Select IBA GE Slot 0321 v1228 (extension LAN card NIC 2)
● Press space to deactivate (!)
● Menu: Exit, Save Changes + Exit
● Select Yes

RX600
● While booting, press F2
● Menu: Advanced -> Ethernet on Board -> Rom Scan [Enable]
Enable Master [Enable]
● Menu: Advanced -> Embedded SCSI Bios Scan Order [LAST]
● Menu: Main -> Boot Options 2 x MBA …. on top
● Menu: Exit -> Saving Changes

RX600 S2
First, configure the PCI-E NICs as network-bootable.
Download proboot.exe from the INTEL website:
http://developer.intel.com/design/network/products/ethernet/linecard_ec.htm

Extract proboot.exe and move it to a bootable CD-Rom. Insert it into the drive to boot
the node from CD.
● Power on the RX600 S2 .
● While booting you will be requested to Press any key …
● Select Boot Manager
● Select Primary Master CD-ROM


After booting, start IBAutil from the DOS Prompt:

<LW>:\ cd <IBAutil-DIR>
<LW>:\ IBAutil.exe -all -install pxe

Repeat the following steps for each NIC:


1. Answer the first question with “No”.
2. Answer the second question with “Yes”
3. At the end you will find a list of all NICs with the corresponding MAC addresses
4. Write down the MAC address of NIC1 and NIC3.

Remove the CD from the drive and reboot.


● While booting you will be requested to Press any key …
● Select Maintenance Manager
● Select Boot Options
● Select Legacy BEV Order
● Select BEV Drive #00
● Select NIC1 from card1 e.g IBA GE Slot 0220 v1220
● Select BEV Drive #01
● Select NIC3 from card2 e.g IBA GE Slot 0920 v1220
● Select Apply Changes
● Go Back To Main Page
● Select Boot Option
● Select Change Boot Order
● Check if NIC1 is in 2nd position
● Press ESC 3 x to go back to Main Page
● Select Continue Booting


RX800
● While booting, press F1.
● Menu: Configuration/Setup Utility
● Select Menu Devices and I/O Ports
● Set Planar Ethernet to [Enabled]
● Press ESC to go back
● Select Menu Start Options
● Select Menu Startup Sequence Options
● Set First Startup Device to [Network]
● Press ESC to go back
● Menu: Start Options
● Set Planar Ethernet PXE/DHCP to [Planar Ethernet 2]
● Press ESC to go back
● Select Menu Advanced Setup
● Select Menu CPU Options
● Set Hyper-Threading Technology to [Enabled]
● Press 2 x ESC to go back
● Select Save settings
● Press ENTER to save
● Select Exit Setup
● Select Yes, exit the Setup Utility

RX800 S2
● While booting, press F1.
● Menu Configuration/Setup Utility
● Select Menu Start Options
● Select Menu Startup Sequence Options


● Set First Startup Device to [Network]


● Press ESC to go back


● Set Planar Ethernet PXE/DHCP to [Planar Ethernet 2]


● Press ESC to go back
● Select Save settings
● Press ENTER to save
● Select Exit Setup
● Select Yes, exit the Setup Utility

6.2 Replacing Network Cards

6.2.1 Replacing a Network Card – Control Node


If the motherboard or the PCI network card in a Control Node is replaced, no further
actions are required since there is no reference to the replaced MAC addresses.


6.2.2 Replacing a Network Card – Application Node


If a network card on an Application Node fails and is replaced, further actions are
necessary to ensure proper operation.
During their boot processes, both PRIMEPOWER and PRIMERGY nodes need to
retrieve some information through lower level protocols which are based on hardware
addresses (MAC addresses). The PRIMERGY's BIOS utilizes DHCP (Dynamic Host
Configuration Protocol) for that purpose; the PRIMEPOWER's OBP (Open Boot Prom)
uses RARP (Reverse Address Resolution Protocol).
The MAC address is the built-in, physical address which is unique to that network card.
Since there are two network cards configured for booting, you may not even notice the
MAC address change of a single one, because the other card will be used for booting.
Once the server is up, operation is normal. For full redundancy – even for booting – we
recommend adjusting the configuration accordingly.
● On PRIMEPOWER systems, use OBP to determine all MAC addresses, including
the new one.
● On PRIMERGY blade systems, use the management blade to determine the server
blade MAC addresses.
● On PRIMERGY non-blade systems, use BIOS or network card labels to determine
the MAC addresses.
To change the MAC addresses for booting, use the command below. Replace the node
name and MAC addresses of the sample with your Application Node’s name and noted
addresses:
control:~ # ff_an_adm.pl --op mac --name appnode5
--mac 00:e0:00:c5:19:41,00:e0:00:c5:19:41
MAC addresses changed.

The program changed the MAC addresses of the node within DHCP or RARP
configuration and LDAP. No further intervention is necessary.
Some software licenses are derived from information based on server specific
numbers, such as MAC addresses of network cards. On Linux, double-check
your SAP licenses after you have replaced a network card.

6.3 Replacing Power Control Hardware


This section describes what to do after the replacement of power control hardware,
depending on your Application Nodes and Control Nodes hardware.
For instructions on replacing the power control hardware, please refer to the
documentation delivered with your hardware.


Description of the Power Shutdown Function


The power shutdown is necessary for FlexFrame Autonomous Agents to switch off
Application Nodes in case of a switchover scenario to make sure that a system in an
undefined state will be switched off. It is also necessary for the PRIMECLUSTER
Shutdown Facility on the Control Nodes.
The power shutdown will be implemented by the Shutdown Agents (SA) of the
PRIMECLUSTER Shutdown Facility and the ff_xscf.sh script.
Each Shutdown Agent (SA_blade, SA_ipmi, SA_rsb, SA_rps and SA_scon) has its
own config file which will be configured automatically by the FlexFrame Autonomous
Agents and the PRIMECLUSTER Shutdown Facility on the Control Nodes.
The PRIMEPOWER XSCF (eXtended System Control Facility) is a built-in component for
PRIMEPOWER 250 and 450 systems. It enables either serial or LAN access to the
server’s console or power-on, power-off and reset commands.
Only the hardware and software which is used by the Shutdown Agents and XSCF must
be prepared with IP address, user, password etc.
For details how to configure the power control hardware, please refer to chapter “Power-
Shutdown Configuration” in the “Installation of a FlexFrame Environment” manual.
More information about the functionality of the SA tools can be found in section “Power
Management (On/Off/Power-Cycle)” on page 249.

6.4 Exchanging a Control Node


If a Control Node must be replaced for any reason, attention has to be paid to the fact
that the replacement hardware has to be the same as the replaced component. Otherwise,
problems with missing Linux drivers might occur. Problems due to hard disk failures
should usually not occur, since the Control Node's hard disks are mirrored through
hardware-based RAID-1.

6.4.1 Hardware Failed, Hard Disk and Installed OS Are Not Affected

If hardware components other than one of the hard disks are broken (e.g. memory,
CPU) and the installed OS image is still operational, these hardware components
have to be replaced with equivalent ones. Even the entire hardware can be replaced,
excluding the hard disks. The hard disks can be removed from the existing hardware, plugged into
the new hardware and booted, after checking in the RAID controller's BIOS that the hardware
RAID settings are enabled. Even if one of the hard disks is broken, it can be replaced
with the same model and synchronized with the working hard disk through the RAID
controller's BIOS.


In this case the Control Node hardware must be replaced by the same model with the
same disk controller and the same network interface controllers. The approach is very
simple: plug the hard disks into the new Control Node in the same order as in the old one,
power on the Control Node, enter the RAID controller BIOS and check the parameters;
hardware RAID should be enabled as before.
See also the manual “Installation of a FlexFrame Environment“, chapter “Control Node
Host RAID Configuration”.

6.4.2 One Hard Disk Is Defective, the Other One Is Undamaged


The Control Node's OS installation must be on a RAID-1 disk array.
Replace the defective hard disk with the same model and synchronize it with the
undamaged one via the RAID controller BIOS. This approach depends on the
RAID controller model and its BIOS.

6.4.3 The Control Node's OS Is Damaged


If, for any reason, the operating system of a Control Node is damaged, it has to be
installed anew from the original Control Node installation DVD. The configuration then has to
be recovered from a previous backup. See section “Backup / Restore of FlexFrame
Control Nodes” on page 265.

6.5 Replacing Switch Blades


To reconfigure a replaced switch blade, see chapter Restoring Switch Configuration on
page 268.



7 Software Updates

7.1 Updating the entire FlexFrame Landscape

7.1.1 Upgrading from FlexFrame 3.0 or Lower Version to FlexFrame 3.2


If you want to upgrade your FlexFrame environment from version 3.0 or lower to release
3.2, please contact the Competence Center SAP (CCSAP). A defined process exists
which has to be adapted to project-specific needs.

7.1.2 Upgrading from FlexFrame 3.1 to FlexFrame 3.2


For information on upgrading a FlexFrame 3.1 environment to release 3.2, please see the
document “Upgrading FlexFrame™ 3.1 to 3.2”.

7.2 Software Update on the Control Node


This section describes FlexFrame software updates. Control Nodes are not designed to
run any third party software (except saprouter).
Installing third party software on Control Nodes or Application Nodes may cause
functional restrictions and other malfunctions of the FlexFrame system software.
Fujitsu Siemens Computers shall have no liability whatsoever whether of a
direct, indirect or consequential nature with regard to damages caused by the
third party software or its erroneous installation.

For further information on third party software within FlexFrame, please refer to section
“Third Party Software” on page 105.
Nevertheless, we describe here the general approach to updating/installing software on the
Control Nodes, based on a few examples. In your specific case this approach may differ from the
ones described here.
New FlexFrame software should be updated/installed as RPMs. Only RPMs are
documented in the RPM database. If there is no RPM package available (because the
vendor does not deliver this piece of software as an RPM and SuSE does not provide
this version as an RPM either), you may also install this software from a tar archive. In this
case you should document the installation in the directory
/FlexFrame/volFF/FlexFrame/documentation/ as a plain text file.
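A minimal sketch of such a tar-based installation and its documentation (the archive name, target
directory and file name below are examples only, not fixed FlexFrame conventions):
control1:~ # tar -xzf /FlexFrame/volFF/FlexFrame/stage/<package>.tar.gz -C /opt
control1:~ # echo "<package> installed from tar archive on $(date)" > /FlexFrame/volFF/FlexFrame/documentation/<package>.txt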
Normally, an already installed software package must be updated with the rpm command
rpm -U <package_name>.rpm.


Don’t forget to install the same software package on the other Control Node, too.

7.2.1 ServerView Update via RPM


Download the newest ServerView packages from Fujitsu Siemens Computers, in this example
version 3.10-14 for SLES8. Copy the RPMs to your software stage directory and install
the RPMs.
In this example, the software stage is the directory
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8.
Copy the RPMs to your software stage:
control1: # cp -p <somewhere>/srvmagt-mods_src-3.10-14.suse.rpm /FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8
control1: # cp -p <somewhere>/srvmagt-eecd-3.10-14.suse.rpm /FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8
control1: # cp -p <somewhere>/srvmagt-agents-3.10-14.suse.rpm /FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8

Update the installed RPMs:


control1: # cd /FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8
control1:/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8 # rpm -U
srvmagt-mods_src-3.10-14.suse.rpm
insserv: script ff_mysql: service mysql already provided!
insserv: script mysql.org: service mysql already provided!
insserv: script ff_myAMC.MessengerSrv: service myAMC.Messenger
already provided!
Compiling modules for 2.4.21-286-smp:
copa(Ok) ipmi(Ok) smbus(Ok)
..done
Loading modules: copa ipmi smbus msr cpuid

..done
control1:/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8 #

control1:/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8 # rpm -U
srvmagt-eecd-3.10-14.suse.rpm
Running pre (1) for srvmagt-eecd-3.10-14
-rwxr-xr-x 1 7484 Sep 26 2004 /sbin/halt
Running post (1) for srvmagt-eecd-3.10-14
insserv: script ff_mysql: service mysql already provided!
insserv: script mysql.org: service mysql already provided!
insserv: script ff_myAMC.MessengerSrv: service myAMC.Messenger
already provided!


Starting eecd..done
You have new mail in /var/mail/root
control1:/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8 #

control1:/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8 # rpm -U
srvmagt-agents-3.10-14.suse.rpm
Running pre (1) for srvmagt-agents-3.10-14
ONUCDSNMP=true
Shutting down snmpd:..done
Running post (1) for srvmagt-agents-3.10-14
Linking 32-bit/64-bit binaries for i686
insserv: script ff_mysql: service mysql already provided!
insserv: script mysql.org: service mysql already provided!
insserv: script ff_myAMC.MessengerSrv: service myAMC.Messenger
already provided!
-rwxr-xr-x 1 47507 Oct 7 2004 /usr/sbin/snmpd
Running triggerin (1, 1) for srvmagt-3.10-14
lrwxrwxrwx 1 16 Oct 27 18:09 /usr/sbin/snmpd -> ucdsnmpd-
srvmagt
Starting snmpd..done
Starting agents: sc bus hd unix ether bios secur status inv
vv..done
control1:/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8 #

Do this also on the second Control Node.

7.2.2 Updating/Installing a New Linux Kernel


The following sections describe how to install a new Linux kernel for Control Nodes in
FlexFrame 3.2. This only has to be done if instructed to do so by Fujitsu Siemens
Computers support.
The process is described using examples carried out on a development system. The
version numbers in this document are not generally binding; they must be replaced
with the correct information for your system.
This approach describes the kernel installation without the recommended server-start-
CD.

7.2.2.1 Software Stage


To install and update any of the software packages in FlexFrame, it is useful to mount
your Filer or jukebox or any other software stage on a local mount point.
For example, your software stage is /FlexFrame/volFF/FlexFrame/stage/.


Copy the delivered software packages to the software stage:


control1: # cd /FlexFrame/volFF/FlexFrame/stage
control1:/FlexFrame/volFF/FlexFrame/stage # mkdir -p SuSE/SLES8

Kernel:
control1: # cp -p <somewhere>/k_smp-2.4.21-286.i586.rpm
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8

Kernel source:
control1: # cp -p <somewhere>/kernel-source-2.4.21-286.i586.rpm
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8

Control Node initrd:

control1: # cp -p <somewhere>/initrd-2.4.21-286-smp
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8

This is not the original initrd as created while installing the kernel-RPM. This initrd
already contains all necessary drivers for the Control Node while booting.
Special Fujitsu Siemens Computers modules:
control1: # cp -p <somewhere>/FSC-2.4.21-286-smp.tar
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8

In this case only the net/bcm5700.o module is necessary. Normally all special Fujitsu
Siemens Computers drivers will be delivered in a zip archive. In our case we have
extracted only the necessary bcm5700 module and repacked it in a tar archive to simplify
the next installation steps.

7.2.2.2 Install the New Kernel


To install the new kernel, execute the following commands:
control1: # cd /FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8
control1:/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8 #
rpm -i k_smp-2.4.21-286.i586.rpm

Replace the original initrd with the Control Node initrd:

control1: # cd /boot
control1:/boot # cp -p initrd-2.4.21-286-smp initrd-2.4.21-286-smp.ORIG
control1:/boot # cp -p /FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8/initrd-2.4.21-286-smp .


Because various PRIMERGYs (e.g. RX300 and RX300 S2) with different SCSI
controllers are supported, we load more than one kernel module in our initrd.
The modules that cannot find the appropriate hardware will display
corresponding (non-critical) error messages.
control1:/boot/mnt/lib/modules/2.4.21-286-smp/kernel/drivers/scsi
# insmod megaraid.o
megaraid.o: init_module: No such device
Hint: insmod errors can be caused by incorrect module parameters,
including invalid IO or IRQ parameters.
You may find more information in syslog or the output from
dmesg
control1:/boot/mnt/lib/modules/2.4.21-286-smp/kernel/drivers/scsi

Replace the original kernel modules with the Fujitsu Siemens Computers modules.
Rename the original driver:

control1:/lib/modules/2.4.21-286-smp/kernel/drivers/net/bcm #
mv bcm5700.o bcm5700.o.ORIG

Install the special Fujitsu Siemens Computers driver bcm5700.o; unpack this tar archive in “/”:
control1:/ # cd /
control1:/ # tar -xvf
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8/FSC-2.4.21-286-smp.tar
lib/modules/2.4.21-286-smp/net/bcm5700.o

/etc/modules.conf should look like this:


..
..
options bcm5700 vlan_tag_mode=1
alias eth0 bcm5700
alias eth1 bcm5700
alias eth2 e1000
alias eth3 e1000
alias eth4 e1000
alias eth5 e1000
alias bond0 bonding
options bond0 miimon=100 mode=1

Configure the old kernel as a fall-back in the boot loader. Create a new entry for the
previous kernel:
control1: # cd /boot/grub
control1:/boot/grub # cp -p menu.lst menu.lst.ORIG
control1:/boot/grub # vi menu.lst

gfxmenu (hd0,0)/message
color white/blue black/light-gray
default 0
timeout 8

title linux
kernel (hd0,0)/vmlinuz root=/dev/sda5 vga=788 acpi=off
initrd (hd0,0)/initrd
title floppy
root (fd0)
chainloader +1
title failsafe
kernel (hd0,0)/vmlinuz.shipped root=/dev/sda5 ide=nodma apm=off
acpi=off vga=normal nosmp disableapic maxcpus=0 3
initrd (hd0,0)/initrd.shipped
title oldkernel-251
kernel (hd0,0)/vmlinuz-2.4.21-251-smp root=/dev/sda3 vga=788
acpi=off
initrd (hd0,0)/initrd-2.4.21-251-smp

7.2.2.3 New Kernel Source for PCL4 and ServerView


The kernel source is necessary to check and recompile the PCL4 cf.o module and the
ServerView modules when the Control Node is rebooted.
control1: # cd /usr/share/doc/packages
control1:/usr/share/doc/packages # mv kernel-source kernel-source-2.4.21-251
control1: # cd /usr/src
control1:/usr/src # rm linux.old; mv linux linux.old
control1:/usr/src # rm linux-include.old; mv linux-include linux-include.old

control1: # cd /FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8
control1:/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8 # rpm -i --force kernel-source-2.4.21-286.i586.rpm

control1: # cd /usr/src/linux
control1:/usr/src/linux # make menuconfig

SAVE + EXIT

control1:/usr/src/linux # make dep

7.2.2.4 Reboot the Control Node


Reboot the Control Node. Then test the functionality of the updated Control Node (PCL, network connectivity, and so on) and check that the new kernel is running:
control1:~ # uname -r
2.4.21-286-smp
control1:~ #

7.2.3 Installing a New OS Image


If a new version of the Control Node OS image becomes available from Fujitsu Siemens
Computers, the Control Nodes can be re-installed with this new version, which is delivered
as a bootable DVD in the same manner as the previous installation DVD. The previously
saved configuration files have to be restored afterwards (see page 265).
Follow these steps:
● Read the documentation shipped with new Control Node image
● Backup the Control Node (see page 265)
● Switch application (see page 35)
● Shutdown the Control Node
● Install the new Control Node image from DVD as described in the Installation Guide
● After the first boot, check the network connectivity and the NFS-mounted file
systems (see the sketch after this list).
● Restore settings (see page 265)
● If required, perform the same steps with the other Control Node.
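A quick check after the first boot might look like this (a hedged sketch; the Filer host name is a placeholder and depends on your installation):

control1:~ # ping -c 1 <filer>-st
control1:~ # mount -t nfs
control1:~ # ls /FlexFrame/volFF/FlexFrame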

7.3 Installation of a New FA Agent Version


The following sections describe how to update the FA Agents. Below, you find a brief
description of what is to be done. For details on how it is to be done, please refer to the
“FA Agents - Installation and Administration” documentation.
In accordance with the FlexFrame concept, migration of FlexFrame Autonomy is flexible
and involves considerably less effort than for conventional systems:
● Installation of multiple agent versions or revision levels in parallel
● Separation of the installation process from the activation process
● Parameter data and agent versions are only loosely linked
● Pool-by-pool activation of the new agent versions

● Conversion and saving of old parameter files with support of the CLI
● Option of reverting to the previous FA Agent version in the event of an unsatisfactory
update
Versioning of the FlexFrame Autonomous Agents is linked to the FlexFrame versions via
a matrix. In other words, it is possible that multiple agent versions can exist for operation
with one FlexFrame version. What is decisive then is the scope of functions and rules that
is required. In concrete terms this means that the FlexFrame Autonomous Agent Version
2.0 can, for example, also be used in compatibility mode for FlexFrame 3.0. In this way
you can profit from enhanced detector and rule characteristics of a new FlexFrame
Autonomous Agent without at the same time having to update the entire FlexFrame.
As soon as new functions of a release are used, functional compatibility only applies to a
limited degree. A matrix shows which functions of the FA Agents are compatible with which
release and which are upward- or downward-compatible with a FlexFrame version.

7.3.1 Migration at Pool Level


Version 2.0 of the FlexFrame Autonomous Agents enables multiple independent pools to
be set up.
Installation and startup of the agents must be regarded as two separate processes. In
principle, multiple versions of the agents can be installed. This permits, for example, a
patch for the FA Agents to be tested in one pool while a previous version is used in
another pool.
The version or revision level of the FA Agents must be entered or modified in the .info
file with an editor (such as vi). The .info file is contained in the pool directory
/opt/myAMC/vFF/vFF_<pool_name>. The example below shows such an .info file:

# Version V20K16
pool.release.current=V20K16
pool.release.base=V20K16

To switch to another version or revision level, you have to adjust the value of the
pool.release.current entry. The syntax is

V<version_number>K<revision_number>
This syntax is mandatory. Ensure that the version entered is actually installed.
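For example, to activate a newly installed revision V20K17 (a hypothetical version used for illustration) for a pool, only the pool.release.current entry is adjusted:

pool.release.current=V20K17
pool.release.base=V20K16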
For further information on migrating an FA Agent version please refer to section 7.3.3 ff.

7.3.2 FlexFrame Autonomy Command Line Interface


FlexFrame Autonomy contains a command line interface (PGTool.sh) to provide various
information about FlexFrame.
The command line interface is an autonomous program which is a component of every
FlexFrame Autonomy installation package.
The FlexFrame Autonomy command line interface supplies the following information about
the node on which the tool is executed:
● the pool
● the running FA Agent version
● the hardware

7.3.3 Migration of FA Agent Versions on Pool Level


The FlexFrame Autonomous Agents offer a whole raft of strategies and functionalities for
installing and activating patches and new release versions for a wide range of security,
test and release scenarios.
The administrator can use the update and activation functionality provided by the agents
in line with the specific requirements. The following basic functions are available:
1. Reading and observing update, patch and release notes.
2. Installation of a new FA Agent version parallel to an operating FA Agent version. All
data and configuration information for the operating FA Agent version are retained.
3. Taking over the configuration data for the new FA Agent version using the FA
migration tool.
4. Pool-by-pool configuration/parameterization and activation of the new FA Agent
version.
5. Testing a new FA Agent version, e.g. in a separate test pool, if required by
deactivating the autonomous reactions for test operation.
The following activities are required to install or update a FlexFrame Autonomous Agent
patch or a newer release version.
1. Read the update, patch and release notes and observe any required modifications
and special features, particularly when updating FlexFrame and operating system
versions and patches simultaneously.
2. Installation of a patch or the new release (FA Control Agent and FA Application
Agent).

3. Parameterization/configuration, possibly using the FA migration tool.


– Copying the parameters from the active agent version to the migration
configuration directory using the FA migration tool.
– Testing, parameterizing and configuring the parameters specified in the
configuration files.
– Testing any new parameters and, if necessary, modifying the default values
entered.
– Parameterizing and configuring of FlexFrame/operating system version
dependencies if the FlexFrame basis is updated at the same time.
– Check the modifications made by the migration tool, according to the file
MIGRATION-INSTRUCTIONS.txt in the target directory of the migration.

4. Pool-specific deactivation of the active FA Agent.


– Stopping the FA CtrlAgent for the pool whose agents are to be updated.
– Stopping the FA AppAgents on all nodes of the pool whose agents are to be
updated.
5. Pool-specific activation of the new FA Agents.
– Modifying the active agent version in the .info file in the associated pool
directory.
– Transferring the migrated configuration to the configuration directory.
– Starting the FA AppAgents on all Application Nodes of the updated pool.
– Starting the FA CtrlAgent.
6. Checking the new active FA Agent version.
– Checking the agent processes.
– Checking agent messages at startup.
– Diagnosis and checking if the shown data are correct.
– Performing FlexFrame Autonomy tests (restart, reboot, etc.).
Steps 1, 2 and 3 can take place while FlexFrame Autonomy is active. The FA Autonomy
functions are unavailable only for the brief period between deactivation of the active agent
version and activation of the new agent version.
Only version-compatible FA CtrlAgents, FA AppAgents and FlexFrame versions
can be used. Compatibility of the agent versions with various FlexFrame
versions results in dependencies which must be taken into account.

For further information on the migration of FA Agent versions on pool level, please see
documentation "FA Agents - Installation and Administration", section 4.8.

7.4 The FA Migration Tool


The FA migration tool is used to migrate configurations of a selected pool to and from a
particular FA Agent version. The FA migration tool also enables you to merge
configuration files.

7.4.1 Pool Mode


Pool mode generates a migrated configuration in the subdirectory
Migration.<Version>_<timestamp>, including the backup of the current files.
To enable the migrated configuration to be used it must be copied into the relevant
configuration directory of the pool concerned.
Required/useful parameters:
-p/--migrate-pool=<pool>
-r/--target-release=<release>
-b/--backup
[-V/--verbose] (optional)
[-d/--pools-basedir=<dir>] (optional)
[-c/--clean] (optional)
[-s/--source-release=<release>] (optional)
See section “Parameters of the FA Migration Tool” on page 104 for a description of the
parameters.
Example:
MGRTool.sh --migrate-pool=<pool> --target-release=<release>
--backup
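A concrete call for a pool named pool1 that is to be migrated to version V20K17 might look like this (a hedged example; pool name and release are placeholders):

MGRTool.sh --migrate-pool=pool1 --target-release=V20K17 --backup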

7.4.2 File Mode


File mode merges two files which are in myAMC configuration format. The two files are
defined with the parameters merge-file and template.
File mode can only be used on files which are in myAMC configuration format (e.g.
myAMC_FA.xml). The myAMC_Pools.xml and myAMC_FA_Groups.xml files are not in
this format. These files can therefore not be migrated using file mode, but only in pool
mode.

Required / useful parameters:


-m/--merge-file=<file>
-t/--template=<template>
-o/--out-file=<file>
[-V/--verbose] (optional)
[-c/--clean] (optional)
See section “Parameters of the FA Migration Tool” on page 104 for a description of the
parameters.
Example:
MGRTool.sh --merge-file=File1.xml --template=File1-default.xml
--out-file=File-out.xml

7.4.3 Usage of Help


The usage of the FA migration tool can be displayed with the command
/opt/myAMC/FA_CtrlAgent/MGRTool.sh --help.

7.4.4 Parameters of the FA Migration Tool


Parameter Description
-m/--merge-file=<file> Merges the <file> file with the template.
-t/--template=<template> Specifies the template to be used for the
merge.
-o/--out-file=<file> Writes merged results into the <file> file
(use '-' for standard output).
-p/--migrate-pool=<pool> Specifies the <pool> pool to be migrated.
-s/--source-release=<release> Migrates from specified version (parameter is
optional). If parameter is not specified, the
version of the pool to be converted is used.
-r/--target-release=<release> Migration to the specified version.
-b/--backup Generates a backup of all files.
-d/--pools-basedir=<dir> Basic directory of the pools
(default: /opt/myAMC/vFF).
-l/--list-releases Lists all available (installed) versions.
-c/--clean Removes unnecessary files and configuration
settings.

-V/--verbose Detailed output during migration.
-lf/--logfile <file> Writes log messages to log file <file>.
-lp/--logpath <path> Generates the log file in the specified directory.
-h/--help Prints usage.
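To find out which versions can be used as source or target of a migration, the installed versions can be listed first (a hedged example; the output depends on the installed agent versions):

/opt/myAMC/FA_CtrlAgent/MGRTool.sh --list-releases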

7.5 Installing ONTAP Patches


Before applying a new patch to the Filer’s ONTAP make sure that it is supported for the
FlexFrame concept. If in doubt contact Fujitsu Siemens Computers support.
Download the tar file(s) for the new ONTAP onto one of your Control Nodes.
Create a temporary directory, e.g. /FlexFrame/volFF/FlexFrame/stage/ontap
(create the directory if it does not exist), and extract the tar file there.
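The preparation of the software stage might look like this (a hedged sketch; the name of the downloaded tar file is a placeholder):

control1: # mkdir -p /FlexFrame/volFF/FlexFrame/stage/ontap
control1: # cd /FlexFrame/volFF/FlexFrame/stage/ontap
control1:/FlexFrame/volFF/FlexFrame/stage/ontap # tar -xf <somewhere>/<ontap_patch>.tar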
Proceed as described in the manual or README information of that patch.
Before you activate the new patch by rebooting the Filer, make sure that no SAP system
or database is busy. You do not need to reboot the Application Nodes; however, we
recommend double-checking the mount points after the Filer is up again. If an Application
Node gets stuck, reboot it.

7.6 Third Party Software


If the customer needs to install third party software on the Application Nodes, the
maintenance procedures described in this manual must be followed. Installation of
software or patches may have to be performed again after installing a maintenance
image shipped by Fujitsu Siemens Computers.

If you have installed additional 3rd party software in your FlexFrame landscape,
please make sure to back up these software components and their configuration
data for a later restore. Please consider: if you install a new FlexFrame image, the
whole file system and its directories are overwritten by a completely new root file
system image. Afterwards, any 3rd party software and configurations have to be
re-installed or restored.
Installing third party software on Control Nodes or Application Nodes may cause
functional restrictions and other malfunctions of the FlexFrame system software.
Fujitsu Siemens Computers shall have no liability whatsoever, whether of a
direct, indirect or consequential nature, with regard to damages caused by the
third party software or its erroneous installation.

8 Administrating Application Nodes
Manual Application Node maintenance would be very complex and error-prone. The
script ff_an_adm.pl performs the major changes and supports adding, changing, removing
and listing of Application Nodes. Below, each action is described in detail.
In this document, you will often find console output, configuration data and
installation examples which are based on FlexFrame 3.1 and SLES 8 / Solaris 8
Application Nodes. Please keep in mind that these are examples and may look
slightly different on the new operating systems introduced in FlexFrame 3.2.

8.1 Listing Application Nodes


To list details of an Application Node use the maintenance script
/opt/FlexFrame/bin/ff_an_adm.pl.

8.1.1 Displaying Information on a Specific Application Node


This section describes how to list detailed information on a specific Application Node.

Synopsis

ff_an_adm.pl --op list --name <node_name>

Command Options
--op list
Lists the configuration details of an Application Node.
--name <node_name>
The name of the Application Node to be listed. Use operation mode list-all to get
a list of all configured Application Nodes with their node names (see 8.1.2).
The output is structured in sections: hardware, software, network, assigned pool and
group, switch ports.
● Hardware
This section contains information about system type, rack ID, device rack name,
shutdown facility with IP address and host name, mac addresses and on blade
servers or partition servers the chassis and slot/partition ID.
● Software
This section contains information on OS type, vendor and version and the root
image path.

● Network
This section lists the VLAN ID, IP address and host name of all configured networks,
sorted by LAN segments.
● Pool and Group
This section lists the names of the assigned pool and group.
● Switch ports
In case link aggregates are configured, this section identifies the aggregate and its
ports. Each used switch port is shown with the switch group ID, switch ID and port ID
(cabinet ID, switch blade ID and port on blade servers) for the common LANs
(Storage, Server and Client LAN) and the Control LAN (if used).
Definition of Switch Group:
A number of Cisco Catalyst 3750G switches within one system cabinet. The
switches are connected as a loop with Cisco StackWise cables at the rear of
each switch. With connected cables, the switches form a stack that behaves like
a virtual switch including all ports of connected switches. To identify a switch
port within the entire FlexFrame environment, three items are used:
• the number of the switch group (it is like a numbering of the virtual switches)
starting with 1.
• the number of the switch within the switch group starting with 1.
• the number of the switch port starting with 1.
Definition of Switch Port:
A switch has a number of ports where other network devices and the host network
interfaces are connected. The port is identified by a number starting at 1.
Within a switch group, the port number is prefixed by the switch number (the
identification number of the switch within the switch group).

Command Output
The command displays detailed information on the selected Application Node. The output
differs between blade servers and all others.
A PRIMERGY blade server output may look like this:
cn1:/opt/FlexFrame/bin # ff_an_adm.pl --op list --name bx2-6

Configuration details of node bx2-6

Hardware
System: BX620S2 RackID: 2 AN Name: AN6
Shut.Facil.: Mgmt Blade bx2-co (195.40.224.78)
MAC Addr.: 00:C0:9F:99:E9:F4 00:C0:9F:99:E9:F5
IDs: 2 / 6 (System Cabinet / Slot|Partition)

Software
OS: SuSE / Linux / SLES-9.X86_64 (Vendor / Type /
Version)
OS Path: f1-pool2-st:/vol/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/root_img

Network
VlanID Host IP Hostname
Storage LAN: 890 195.40.231.98 bx2-6-st
Server LAN: 741 195.40.231.2 bx2-6-se
Client LAN: 740 172.28.26.2 bx2-6

Pool and Group


Pool: pool2
Group: bx600_o

Switch ports
Cabinet SwBlade Port
Common LANs: 2 1 6
Common LANs: 2 2 6

Below, a sample of a PRIMEPOWER output:


cn1:~ # ff_an_adm.pl --op list --name pw651

Configuration details of node pw651

Hardware
System: PW650 RackID: 1 AN Name: AN5
Shut.Facil.: RPS pw651-co (192.168.10.171)
MAC Addr.: 00:e0:00:a6:0a:10 00:e0:00:a6:da:0c

Software
OS: Sun / SunOS / 5.8 (Vendor / Type / Version)
OS Path: filer-p1-
st:/vol/volFF/os/Solaris/FSC_5.8_202_20050530/bi_FJSV,GPUZC-M_PW-
CMZ/root/pw651-st

Network
VlanID Host IP Hostname
Storage LAN: 200 192.168.21.171 pw651-st
200 192.168.21.181 pw651-stt1
200 192.168.21.191 pw651-stt2
Server LAN: 300 192.168.31.171 pw651-se
300 192.168.31.181 pw651-set1
300 192.168.31.191 pw651-set2
Client LAN: 400 192.168.41.171 pw651
400 192.168.41.181 pw651-clt1
400 192.168.41.191 pw651-clt2

Pool and Group


Pool: p1
Group: group15

Switch ports
SW Grp Switch Port
Common LANs: 1 1 9
Common LANs: 1 2 9
Control LAN: 1 1 21

8.1.2 Displaying Information on all Application Nodes


This section describes how to list information on all configured Application Nodes.
To list all configured Application Nodes of entire FlexFrame or of a specific pool use the
operation mode list-all.

Synopsis

ff_an_adm.pl --op list-all [--pool <pool_name>]

Command Options
--op list-all
Lists all configured Application Nodes.
--pool <pool_name>
The name of the pool of which the Application Nodes have to be listed.

Command Output
The output may look like this:
cn1:/opt/FlexFrame/bin # ff_an_adm.pl --op list-all

Nodes sorted by pool, group and name


Pool pool1
Pool Group bx600_a
bx1-1
Node Type: BX600 Rack/Cabinet/Slot|Partition ID: 1/1/1
OS: SuSE / Linux / SLES-9.X86_64 (Vendor / Type /
Version)
OS Path: f1-pool1-st:/vol/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/root_img
Host IP Hostname
Storage LAN: 195.40.224.222 bx1-1-st
Server LAN: 195.40.224.30 bx1-1-se
Client LAN: 143.161.72.30 bx1-1
MAC Addr.: 00:c0:9f:95:5f:ac 00:c0:9f:95:5f:ad

bx1-2
Node Type: BX600 Rack/Cabinet/Slot|Partition ID: 1/1/2
OS: SuSE / Linux / SLES-9.X86_64 (Vendor / Type /
Version)
OS Path: f1-pool1-st:/vol/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/root_img

Host IP Hostname
Storage LAN: 195.40.224.223 bx1-2-st
Server LAN: 195.40.224.31 bx1-2-se
Client LAN: 143.161.72.31 bx1-2
MAC Addr.: 00:c0:9f:95:5f:8a 00:c0:9f:95:5f:8b

bx2-1
Node Type: BX600 Rack/Cabinet/Slot|Partition ID: 2/2/1
OS: SuSE / Linux / SLES-9.X86_64 (Vendor / Type /
Version)
OS Path: f1-pool1-st:/vol/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/root_img
Host IP Hostname
Storage LAN: 195.40.224.227 bx2-1-st
Server LAN: 195.40.224.35 bx2-1-se
Client LAN: 143.161.72.35 vies1pyx
MAC Addr.: 00:c0:9f:95:60:60 00:c0:9f:95:60:61

bx2-2
Node Type: BX600 Rack/Cabinet/Slot|Partition ID: 2/2/2
OS: SuSE / Linux / SLES-9.X86_64 (Vendor / Type /
Version)
OS Path: f1-pool1-st:/vol/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/root_img
Host IP Hostname
Storage LAN: 195.40.224.228 bx2-2-st
Server LAN: 195.40.224.36 bx2-2-se
Client LAN: 143.161.72.36 bx2-2
MAC Addr.: 00:c0:9f:93:7f:cc 00:c0:9f:93:7f:cd

Pool pool2
Pool Group bx600_o
bx1-6
Node Type: BX620S2 Rack/Cabinet/Slot|Partition ID: 1/1/6
OS: SuSE / Linux / SLES-9.X86_64 (Vendor / Type /
Version)
OS Path: f1-pool2-st:/vol/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/root_img
Host IP Hostname
Storage LAN: 195.40.231.97 bx1-6-st
Server LAN: 195.40.231.1 bx1-6-se
Client LAN: 172.28.26.1 bx1-6
MAC Addr.: 00:C0:9F:99:E6:CC 00:C0:9F:99:E6:CD

bx2-6

Node Type: BX620S2 Rack/Cabinet/Slot|Partition ID: 2/2/6


OS: SuSE / Linux / SLES-9.X86_64 (Vendor / Type /
Version)
OS Path: f1-pool2-st:/vol/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/root_img
Host IP Hostname
Storage LAN: 195.40.231.98 bx2-6-st
Server LAN: 195.40.231.2 bx2-6-se
Client LAN: 172.28.26.2 bx2-6
MAC Addr.: 00:C0:9F:99:E9:F4 00:C0:9F:99:E9:F5

The output of list-all is less detailed than the list output. It is used to get an
overview. It shows the Application Nodes sorted by pool and group in alphabetical order.
For each node the system type, the cabinet and slot/partition ID (if node is a blade or
partition server), the OS type, vendor, version, the root image path, the main IP
addresses, host names and the MAC addresses are listed.

8.2 Adding Application Nodes


This section describes how to provide the required information for adding a new AN to an
existing FlexFrame environment. See also sections “Changing BIOS Settings for Netboot”
on page 83 and “Power Management (On/Off/Power-Cycle)” on page 249.
To add an Application Node, use the maintenance script
/opt/FlexFrame/bin/ff_an_adm.pl. You have to define some parameters at
command line. They are used to configure switch ports, to create the boot information
and the OS image.
Adding an Application Node changes the exports file on the common volFF Filer.
Temporary exports (not written to the exports file /vol0/etc/exports) on this
Filer will be gone after running ff_new_an.sh.
Be sure not to have temporary exports.

Synopsis

ff_an_adm.pl --op add --type <system_type> --name <node_name>


--pool <pool_name> --group <group_name>
--swgroup <switch_group_id> --mac <mac_addresses>
--ospath <path_to_os_image>
[--host <ip_host_number>[,<test1_host_number>,
<test2_host_number>]]
[--slot <BXxxx_cabinet/slot>]
[--part <PW_cabinet/partition>]

Command Options
--op add
Adds an Application Node.
--type <system_type>
Specifies the product name and type like “PW250” or “RX300S2”. These are the
common system type (family) terms. More detailed product identifiers are not
necessary. See usage (call ff_an_adm.pl without any parameter) to get a list of
supported system types.
--name <node_name>
The name of the Application Node. This name has to be unique for the entire
FlexFrame system. All interface names are based on this node name. We
recommend using lower case names if possible.
--pool <pool_name>
The name of the pool this node should belong to. See usage (call ff_an_adm.pl
without any parameter) to get a list of currently configured pools.
--group <group_name>
The name of the pool group this node is a member of. A group must consist of
Application Nodes of the same OS image version and should be of the same
capacity (CPU, Memory etc.). There should be at least one spare Node in a group.
Otherwise, take-over of failing services will not be possible. Use command
ff_pool_adm.pl with op mode list or list-all to get the pool groups.
--swgroup <switch_group_id>
Defines the switch group the Application Node is connected to. This information is
necessary to assign and configure switch ports. The switch group was numbered
during installation with the FlexFrame PlanningTool. Use this number here too. See
usage (call ff_an_adm.pl without any parameter) to get a list of currently
configured switch group IDs.
--mac <mac_addresses>
Specify both MAC addresses of the data NICs used for booting. Use colon-separated
hex notation for each MAC address and concatenate the two addresses with
a comma. The MAC address syntax is six colon-separated hex values, e.g.
00:e0:00:c5:19:41.
--ospath <path_to_os_image>
Defines the OS image to be used. Add the relative path to
/FlexFrame/volFF/os/ as seen from the Control Node. See usage (call
ff_an_adm.pl without any parameter) to get a list of available OS paths.
--host <ip_host_number>[,<test1_host_number>,<test2_host_number>]
Host part used to build the IP addresses for the three networks. On Solaris systems
(PRIMEPOWER) you have to give three different host numbers, separated by
commas. If this option is omitted, the script uses free host numbers to calculate the IP
addresses.
With some server types you have to give additional information, i.e. PRIMERGY server
blades or PRIMEPOWER server partitions:
--slot <BXxxx_cabinet/slot>
With PRIMERGY server blades use this option to define the cabinet and slot ID of the
server blade. New cabinets have to be defined with the
/opt/FlexFrame/bin/ff_bx_cabinet_adm.pl command.
--part <PW_cabinet/partition>
With PRIMEPOWER server partitions (models PW900, PW1500 and PW2500) use
this option to define the cabinet and partition ID of the server partition. With new
cabinets, a separate SMC Control LAN IP address and switch port are assigned
additionally.

Command Output
The command displays some information about processing steps. The output for a blade
server may look like this:
cn1:/opt/FlexFrame/bin # ff_an_adm.pl --op add --type BX620S2
--name bx1-6 --pool pool1 --group bx600_o --ospath
Linux/FSC_3.2B00-000.SLES-9.X86_64 --host 1 --slot 1/6 --mac
00:C0:9F:99:E6:CC,00:C0:9F:99:E6:CD
update swblade 1/1 configuration
Notice: Update will take about 1 minute.
update swblade 1/2 configuration
Notice: Update will take about 1 minute.
If not reported any error all precautions are done to create
application nodes os image. To do this call:
ff_new_an.sh -n bx1-6
Creating and customizing an image may take some minutes.
Don't get anxious.

The output for a non-blade server may look like this:


cn1:/opt/FlexFrame/bin # ff_an_adm.pl --op add --name pw250-3
--type PW250 --pool p1 --group Solaris --swgroup 1
--mac 00:e0:00:c5:42:ab,00:e0:00:a6:eb:9f
--ospath Solaris/FSC_5.8_202_20050530
update switch 1/1 configuration
Notice: Update will take about 1 minute.
Connect your systems LAN interfaces to named switch ports:
SwitchGroup / Switch / Port LAN Interface
1 / 2 / 14 data NIC-1

1 / 1 / 14 data NIC-2
1 / 2 / 15 mgmt NIC-1

If not reported any error all precautions are done to create


application nodes os image. To do this call:
ff_new_an.sh -n pw250-3
Creating and customizing an image may take some minutes.
Don't get anxious.

The script first checks all arguments and aborts with error messages in case of errors.
Then it fetches free IP addresses and switch ports. The switch ports are reconfigured to
match requirements, the LDAP data is created and a netboot file is written. The netboot
file is used by ff_new_an.sh to create Application Node images and extend the Filer’s
exports list.
At the end you get a cabling advice and instructions how to call ff_new_an.sh script to
finish the Application Node creation.

8.3 Removing Application Nodes


To remove an Application Node use the maintenance script
/opt/FlexFrame/bin/ff_an_adm.pl. You only have to give the node name to be
removed at the command line. All switch ports will be unconfigured and the boot
information and OS image are deleted.

Removing an Application Node results in direct deletion of its image, removal of
its LDAP entries as well as disabling of the respective switch ports.
Please make sure you really want to remove the Application Node (AN) when
calling the script; the script does not ask for further confirmation.
Removing an Application Node changes the exports file on the common volFF
Filer. Temporary exports (not written to the exports file /vol0/etc/exports)
on this Filer will be gone after running ff_an_adm.pl.
Please make sure not to have temporary exports.

Synopsis

ff_an_adm.pl --op rem --name <node_name>

Command Options
--op rem
Removes an Application Node.

--name <node_name>
The name of the Application Node to be removed. Use operation mode list-all to
get all configured Application Nodes and their names (see 8.1.2).

Command Output
The command displays only errors and warnings. An output may look like this:
cn1:/opt/FlexFrame/bin # ff_an_adm.pl --op rem --name pw250-3

8.4 Renaming Application Nodes


At this time there is no tool available to do this directly. Remove the Application Node and
add it with a new name instead (as described above).
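A hedged sketch of such a rename, reusing the parameters from the examples in sections 8.2 and 8.3 (the new node name pw250-new is hypothetical; use the values that apply to your node):

cn1:/opt/FlexFrame/bin # ff_an_adm.pl --op rem --name pw250-3
cn1:/opt/FlexFrame/bin # ff_an_adm.pl --op add --name pw250-new
--type PW250 --pool p1 --group Solaris --swgroup 1
--mac 00:e0:00:c5:42:ab,00:e0:00:a6:eb:9f
--ospath Solaris/FSC_5.8_202_20050530
cn1:/opt/FlexFrame/bin # ff_new_an.sh -n pw250-new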

8.5 Administrating Blade Server Cabinets


Some network settings have to be changed to add or remove a blade server cabinet. The
script ff_bx_cabinet_adm.pl will simplify the administration by doing LDAP changes
automatically and preparing configurations to be done manually. The script supports
adding, removing and listing of blade server cabinets. Each action is described in detail
below.

8.5.1 Listing Blade Server Cabinets


List known cabinets from the LDAP database with the maintenance script
/opt/FlexFrame/bin/ff_bx_cabinet_adm.pl.

8.5.1.1 Displaying Information on a Specific Blade Server Cabinet


This section describes how to list detailed information on a specific blade server cabinet.

Synopsis

ff_bx_cabinet_adm.pl --op list --name <cabinet_name>

Command Options
--op list
Lists the configuration details of a blade server cabinet.
--name <cabinet_name>
The name of the blade server cabinet to be listed.

The output is structured in sections: system type, management blade with its core switch
ports, and switch blades with their core switch port connections.

Command Output
The command displays detailed information on the selected blade server cabinet. The
output may look like this:

Primergy Cabinet 1 (cab1)


System Type: BX600
Management Blade
Hostname / IP Address: cab1-co 195.40.224.75
Core Switch Ports: SwitchGroup SwitchID PortID
1 1 8
1 2 8

Switch Blade
SwitchID Type Switch name Hostname IP Address
1 Quanta bx600-2-swb1 bx600-2-swb1 195.40.224.76
2 Quanta bx600-2-swb2 bx600-2-swb2 195.40.224.77

Switch Blade Port Core Switch Port


Switch Blade ID PortID <--> SwitchGroup SwitchID PortID
1 11 1 2 11
1 12 1 1 11
2 11 1 2 12
2 12 1 1 12

As seen from the sample above, the cabinet ID and name, the cabinet system type, the
management blade and the switch blades are listed.
For the management blade the host name, the IP address and both core switch ports are
displayed.
The switch blade information shows the switch and host name, the IP address and the
switch blade port to core switch port connections, structured by switch blade ID.

8.5.1.2 Displaying Information on all Configured Blade Server Cabinets


This section describes how to list information on all configured blade server cabinets.

Synopsis

ff_bx_cabinet_adm.pl --op list-all

Command Option
--op list-all
Lists all configured blade server cabinets.

Command Output
The command displays the configured blade server cabinets. An output may look like
this:

Primergy Cabinets

1 (cab1) BX600
Management Blade: cab1-co / 195.40.224.75
Switch Group ID: 1
Server Blades (by slot id)
1 (blade1) BX600
Pool / Group: pool1 / PROD
2 (blade2) BX600
Pool / Group: pool1 / PROD
3 (blade3) BX600
Pool / Group: pool1 / PROD
4 (blade4) BX600S2D
Pool / Group: pool2 / DEV
5 (blade5) BX600
Pool / Group: pool2 / DEV

For each cabinet the ID, the cabinet name, the management host name and IP address
and the server blades are displayed.
Each server blade is shown with its slot ID and name, the system type and the pool and
group it belongs to.

8.5.2 Adding Blade Server Cabinets


This section describes how to provide the required information for adding a new blade
server cabinet to an existing FlexFrame environment.
To add a blade server cabinet, use the maintenance script
/opt/FlexFrame/bin/ff_bx_cabinet_adm.pl. You have to define some
parameters at the command line. They are used to configure switch ports and to create
the switch blade configurations.

Synopsis

ff_bx_cabinet_adm.pl --op add --type <system_type>


--name <cabinet_name>
--swgroup <switch_group_id>
[--swblades <type_of_switch_blades>]

Command Options
--op add
Adds a blade server cabinet.
--type <system_type>
PRIMERGY blade system type e.g. BX300 or BX600.
Call ff_bx_cabinet_adm.pl without any parameter to get a list of supported
system types.
--name <cabinet_name>
Name of the subsystem (cabinet). It is used to generate a new name for the
management blade (has to be unique within entire FlexFrame).
--swgroup <switch_group_id>
Switch group number (starts with 1) the cabinet has to be connected to (physically).
See usage (call ff_bx_cabinet_adm.pl without any parameter) to get a list of
currently configured switch group IDs.
--swblades <type_of_switch_blades>
The type of the switch blades. The default is the newer type of switch blades for this
cabinet type (Quanta). Valid types are: Quanta, Accton, Pass-Through. For the
default switch blades this option may be omitted. Otherwise, give the type for each
used switch blade, e.g. "Accton,Accton", or any combination of valid types.
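An invocation matching the sample output below might look like this (a hedged example; cabinet name and switch group are taken from the output shown):

cn1:/opt/FlexFrame/bin # ff_bx_cabinet_adm.pl --op add --type BX600
--name cab1 --swgroup 1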

Command Output
At the end of the output, the command displays further instructions. The output may look
like this:

A new Primergy BX600 cabinet numbered 1 connected at SwitchGroup 1


was added to LDAP.
You have to configure your ManagementBlade with the control lan
settings:
control lan IP address: 195.40.224.75
control lan name: cab1-co
to interoperate correctly with the FA Agents.

Interconnect the ManagementBlades and SwitchBlades with the


switches of SwitchGroup 1 as noted below:

SwitchID/Port Mgmt/SwitchBlade
1 / 8 slave ManagementBlade
1 / 11 SwitchBlade 1 Port 12
1 / 12 SwitchBlade 2 Port 12
2 / 8 master ManagementBlade
2 / 11 SwitchBlade 1 Port 11
2 / 12 SwitchBlade 2 Port 11

Upload of initial SwitchBlade configurations have to be done


manualy. See installation guide for details. For a quick
instruction see below.
The files to be uploaded are named:

SwitchBlade Blade Type File Path


1 Quanta /tftpboot/swblade-2-1.config
2 Accton /tftpboot/swblade-2-2.config

Quick instruction for switch blade type "Accton"

Plug only port 11 to core switches to prevent disabling of


portchannel at core switches.

Use the management blade console redirection (telnet cab1-co 3172)


to get the console of the switch blade.
See below a session snippet as sample how to upload the
configuration.

Console> enable
Console # configure
Console(config)# interface vlan 200
Console(config-if)# ip address 195.40.224.77 255.255.255.0
Console(config-if)# end
Console # copy tftp startup-config
TFTP server ip address: 195.40.224.3
Source configuration file name: swblade-2-2.config
Startup configuration file name [startup-config]:
\Write to FLASH Programming.
-Write to FLASH finish.
Success.

Console # reload

System will be restarted, continue <y/n>? y

Plug port 12 to core switches. Now the switch blade should be


fully operational.

Quick instruction for switch blade type "Quanta"

Plug only port 11 to core switches to prevent disabling of


portchannel at core switches.

Use the management blade console redirection (telnet cab1-co 3172)


to get the console of the switch blade.
See below a session snippet as sample how to upload the
configuration.

(Vty-0) #configure
(Vty-0) (Config)#vlan database
(Vty-0) (Vlan)#vlan 200
(Vty-0) (Vlan)#exit
(Vty-0) (Config)#interface vlan 200
(Vty-0) (if-vlan 200)#ip address 195.40.224.76 255.255.255.0
(Vty-0) (if-vlan 200)#exit
(Vty-0) (Config)#end
(Vty-0) #copy tftp://195.40.224.3/swblade-2-1.config script
config.scr

Mode....................................... TFTP
Set TFTP Server IP......................... 195.40.224.3
TFTP Path..................................
TFTP Filename.............................. swblade-2-1.config
Data Type.................................. Config Script
Destination Filename....................... config.scr

Management access will be blocked for the duration of the


transfer
Are you sure you want to start? (y/n) y

Validating configuration script...

File transfer operation completed successfully.

(Vty-0) #script apply config.scr

Are you sure you want to apply the configuration script?


(y/n)y

The system has unsaved changes.


Would you like to save them now? (y/n) y

a lot of lines ...

Configuration script 'config.scr' applied.

(bx600-klaus-swb1) #copy running-config startup-config


Configuration Saved!

(swblade-2-1) #

Plug all other uplink ports to core switches. Now the switch blade
should be fully operational.

Unless any errors are reported follow instructions above to solve


all precautions needed to create new application nodes.
Look at "/opt/FlexFrame/network/wiring-BX600-cab1.txt" to get a
copy of this message.

Set up the management blade initially with the name and IP address listed in the output
shown above. Use the console redirection of the management blade to connect to the
console of the switch blades, and upload the configuration as described in the FlexFrame
Installation Guide.
Finally, plug in the network cables according to the wiring plan given by the command
output.

8.5.3 Removing Blade Server Cabinets


To remove a blade server cabinet, use the maintenance script
/opt/FlexFrame/bin/ff_bx_cabinet_adm.pl. You only have to give the ID of the
cabinet to be removed at the command line. All core switch ports will be
unconfigured.
Removing a blade server cabinet requires removing all of its server blades first.

Synopsis

ff_bx_cabinet_adm.pl --op rem --id <cabinet_id>

Command Options
--op rem
Removes a blade server cabinet.
--id <cabinet_id>
Specifies the subsystem (cabinet) ID of the cabinet to be removed. Use the
list-all option to get the ID (see section 8.5.1.2).

Command Output
If there are any server blades configured for this cabinet, an error message is displayed
like in the sample below:

ERROR: there are server blades configured for this cabinet.


To remove the cabinet, remove application nodes (server
blades) first.
Use command ff_an_adm.pl to do this.

Use the list operation mode to list the configured server blades. You have to remove
them before you can remove the cabinet they are in.
If no server blades are configured for this cabinet, the command displays a summary at
the end. The output may look like this:

If not reported any warnings or errors the cabinet was removed


from LDAP and core switches.

The cabinet has been removed successfully from LDAP and the core switch ports used
by the cabinet have been reconfigured to default.

8.5.4 Changing Switch Blade Type


In service cases it may be necessary to change the type of a switch blade due to the
replacement of a defective part. Only switching blades can be used for a type change.
The proper program for this action is
/opt/FlexFrame/bin/ff_bx_cabinet_adm.pl. To change the switch blade type,
the cabinet ID, the switch blade ID and the new switch blade type have to be specified.

Synopsis

ff_bx_cabinet_adm.pl --op swb-change --id <cabinet_id>


--swbid <switch_blade_id>
--swbtype <switch_blade_type>

Command Options
--op swb-change
Selects the operation mode. Change the type of a switch blade.
--id <cabinet_id>
Specifies the subsystem (cabinet) ID of the cabinet. Use the list-all option to
get the ID.
--swbid <switch_blade_id>
Specifies the ID of the switch blade. The ID is the slot number of the selected switch
blade.
--swbtype <switch_blade_type>
Defines the new type of the switch blade. See usage for the currently supported
types.
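A call matching the sample output below might look like this (a hedged example; the cabinet ID is an assumption):

cn1:/opt/FlexFrame/bin # ff_bx_cabinet_adm.pl --op swb-change --id 1
--swbid 2 --swbtype Quanta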

Command Output
The output may look like this:

Switch type of switch blade 2 was successfully changed from


"Accton" to "Quanta" at LDAP database.

The switch blade type was changed in the LDAP database. To get the initial configuration,
use operation mode swb-config of this program. It will also display instructions on how
to upload the configuration.

8.5.5 Changing Switch Blade Name


Adding a new cabinet names the switch blades like the cabinet, with a slot extension. In
some cases the names of the switch blades have to be changed to match naming
conventions. To do this, use /opt/FlexFrame/bin/ff_bx_cabinet_adm.pl.

Synopsis

ff_bx_cabinet_adm.pl --op swb-name --id <cabinet_id>


--swbid <switch_blade_id>
--swbname <switch_blade_name>

Command Options
--op swb-name
Selects the operation mode. Change the name of a switch blade.

--id <cabinet_id>
Specifies the subsystem (cabinet) ID of the cabinet. Use the list-all option to get
the ID.
--swbid <switch_blade_id>
Specifies the ID of the switch blade. The ID is the slot number of the selected switch
blade.
--swbname <switch_blade_name>
Defines the new name of the switch blade.

Command Output
The output may look like this:

If not reported any warnings or errors the hostname was


successfully changed at switch blade, hosts files and LDAP.

As noted by the program, the switch name is changed in /etc/hosts on both Control
Nodes and in the LDAP database; in addition, the hostname and, if possible, the SNMP
sysname are changed on the selected switch blade itself.

8.5.6 Changing Switch Blade Password


The default password used when adding a new cabinet is password. This is not secure.
To change the password to a secure one, use
/opt/FlexFrame/bin/ff_bx_cabinet_adm.pl.

Synopsis

ff_bx_cabinet_adm.pl --op swb-passwd --id <cabinet_id>


--swbid <switch_blade_id>
--swbpwd <password>

Command Options
--op swb-passwd
Selects the operation mode. Change the login password of a switch blade.
--id <cabinet_id>
Specifies the subsystem (cabinet) ID of the cabinet. Use the list-all option to get
the ID.
--swbid <switch_blade_id>
Specifies the ID of the switch blade. The ID is the slot number of the selected switch
blade.

--swbpwd <password>
Defines the new login and enable password of the switch blade.

Command Output
The output may look like this:

If not reported any warnings or errors the password was


successfully changed at switch blade and LDAP.

As noted by the program, the password will be changed in the LDAP database and at the
selected switch blade. At the switch blade, the login password and the enable password
are changed and have to be the same.

8.5.7 Getting Switch Blade Initial Configuration


In case of a service issue it may be necessary to get an initial switch blade
configuration, which has to be uploaded manually. To create such a configuration, use
/opt/FlexFrame/bin/ff_bx_cabinet_adm.pl.

Synopsis

ff_bx_cabinet_adm.pl --op swb-config --id <cabinet_id>


--swbid <switch_blade_id>

Command Options
--op swb-config
Selects the operation mode. Create the initial switch blade configuration.
--id <cabinet_id>
Specifies the subsystem (cabinet) ID of the cabinet. Use the list-all option to get
the ID.
--swbid <switch_blade_id>
Specifies the ID of the switch blade. The ID is the slot number of the selected switch
blade.
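A call to create the initial configuration for switch blade 2 of cabinet 1 might look like this (a hedged example; the IDs are assumptions):

cn1:/opt/FlexFrame/bin # ff_bx_cabinet_adm.pl --op swb-config --id 1
--swbid 2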

Command Output
The output may look like this:

If not reported any warnings or errors the switch configuration


was successfully created and stored into
/tftpboot/swblade2.config.

To upload the initial switch blade configuration see detailed


description at "FlexFrame(TM) Installation Guide" chapter
"SwitchBlade Configuration".

For a quick guide see below.

Depending on switch blade type there are different upload


procedures.

The sample below shows the procedure according to this switch


bladetype (Accton).
Plug only port 11 to core switches to prevent disabling of
portchannel at core switches.

Use the management blade console redirection (telnet cab1-co 3172)


to get the console of the switch blade.
See below a session snippet as sample how to upload the
configuration.

Console> enable
Console # configure
Console(config)# interface vlan 200
Console(config-if)# ip address 195.40.224.77 255.255.255.0
Console(config-if)# end
Console # copy tftp startup-config
TFTP server ip address: 195.40.224.3
Source configuration file name: swblade2-2.config
Startup configuration file name [startup-config]:
\Write to FLASH Programming.
-Write to FLASH finish.
Success.

Console # reload
System will be restarted, continue <y/n>? y

Plug port 12 to core switches. Now the switch blade should be


fully operational.
This hint is additionaly stored at:
/tmp/swb-config_bx_cabinet/todo.txt

The configuration file is stored directly in /tftpboot. The upload of the configuration is
described using TFTP, which uses /tftpboot as its top level directory. The program
displays a short description, depending on the switch blade type, of how to upload the
configuration to the switch blade. This short description uses actual parameters like file
names and IP addresses so that it can be copied and pasted where possible. A more
detailed instruction can be found in the "Installation Guide".

8.6 Maintenance of Linux Application Node Images


This section describes how to …
● install an Application Node image from the installation media.
● create a new customer specific Linux Application Nodes image.
● install Linux service packs.
● update/install a new Linux kernel.
● update ServerView.
● upgrade RPM packages on an Application Node.
● upgrade the application software.
● migrate remaining Application Nodes to the new Application Node image.
Choose an Application Node for the maintenance and isolate it. This means all running
applications like SAP services must be moved to another Application Node, e.g. a Spare
Node. In this document, the Application Node “an_0504” will be used for the
maintenance.

8.6.1 Installing an Application Node Image from Installation Media

This section describes how to install an Application Node image delivered by Fujitsu
Siemens Computers. This Application Node image contains two components, one root
image which is shared for all Application Nodes and one var image as a template for the
individual Application Node var images.
All images delivered by FSC are named FSC_<FF version>.<OS
version>.<OS architecture>. For SLES8 images, OS version and
architecture may be missing due to backward compatibility. The prefix FSC_ is
reserved for originals delivered from FSC and must not be used for customer
specific modified images.
The following approach is similar to the Installation Guide. A newly delivered image
must be installed in a new, separate image directory tree beside the old one.
For example, the old image directory tree is FSC_3.2B00-000.SLES-9.X86_64,
containing a 3.2 SLES9 (64 bit) image. The new one is FSC_3.2.x.

control1:/FlexFrame/volFF/os/Linux # ls -l
total 20
drwxr-xr-x 5 root root 4096 Apr 29 16:33 .
drwxr-xr-x 4 root root 4096 Mar 15 15:10 ..
drwxr-xr-x 4 root root 4096 Apr 22 13:41
FSC_3.2B00-000.SLES-9.X86_64
drwxr-xr-x 4 root root 4096 Apr 22 14:15 pool_img

8.6.1.1 Installing the Application Node Image


Install the Application Node image from installation media with the installation tool
ff_install_an_linux_images.sh on the Control Node.

Synopsis

ff_install_an_linux_images.sh [-p <path_to_images>]

Command Option
-p <path_to_images>
Each image version has its own image path, e.g. FSC_3.2.x. If the default path
already exists, you have to specify a new image path with the -p option.
Example for destination path FSC_3.2.x:
control1:/ # mount /media/dvd
control1:/ # /media/dvd/ff_install_an_linux_images.sh
-p /FlexFrame/volFF/os/Linux/FSC_3.2.x
..
..
control1:/FlexFrame/volFF/os/Linux # ls -l
total 20
drwxr-xr-x 5 root root 4096 Apr 29 16:33 .
drwxr-xr-x 4 root root 4096 Mar 15 15:10 ..
drwxr-xr-x 4 root root 4096 Apr 22 13:41
FSC_3.2B00-000.SLES-9.X86_64
drwxr-xr-x 4 root root 4096 Apr 29 17:10 FSC_3.2.x
drwxr-xr-x 4 root root 4096 Apr 22 14:15 pool_img

The location for the Linux Application Node images always has to be
/FlexFrame/volFF/os/Linux/.

8.6.1.2 Creating the New Netboot Configuration


Create the new netboot configuration for one Application Node with
ff_create_an_cfg.pl.

Synopsis

ff_create_an_cfg.pl --name <app node name>


--lin_os_path <directory>

Example:
control1:/ # ff_create_an_cfg.pl --name an_0504
--lin_os_path /FlexFrame/volFF/os/Linux/FSC_3.2.x

8.6.1.3 Creating the New var Image


Create the new var image for one Application Node. The following steps will be done
automatically:
● create the new var images for the Application Nodes
● create the new entries for the Filer exports
● create the new pxelinux configuration
● update the LDAP database

Synopsis

ff_new_an.sh -n <app_node_name>

Example:
control1:/ # ff_new_an.sh -n an_0504

You now have a new Application Node Image tree with one newly configured Application
Node. The newly configured Application Node will use this image tree upon the next
reboot and can be used for maintenance purposes.

8.6.2 Creating a New Linux OS Image for Application Nodes


This section describes how to build a custom Application Node image for test purposes or
small software updates. Usually, new Application Node images will be delivered from
Fujitsu Siemens Computers, see section “Installing an Application Node Image from
Installation Media” on page 129.
Software maintenance for Application Nodes must be done in the right order:
● Choose an appropriate Application Node for the maintenance.
● Copy the productive image FSC_<fsc_version> to a maintenance image
<Custom_Image>, this includes the root-image and the var_template.
● Isolate the maintenance Application Node. FA Agents have to be disabled.
● Create the new netboot configuration for one Application Node with
ff_create_an_cfg.pl.
● Create the new var image for one Application Node.
● Install or update the Software on the maintenance Application Node. New software
parts in the var image must be back-ported to the new template var image
var_template, e.g. copy the maintenance var image to the var_template and
delete all files in /FlexFrame/volFF/os/Linux/<Custom_Image>/var_img
/var_template/tmp/.
● Test all applications on the maintenance Application Node
● Migrate the remaining Application Nodes to the new image tree.

8.6.2.1 Schematic Overview of the Maintenance Cycle

[Flow chart: The maintenance path runs from Create Custom Image Tree → Disable FA Agents in Custom root Image → Isolate one Application Node for Maintenance → Create Netboot Configuration for Maintenance Application Node → Create and Configure Maintenance Application Node Image → Change exports on Filer to Read-Write for root Image → Boot or Reboot Maintenance Application Node → Remount root Filesystem as Read-Write → Start Maintenance Actions (Software Update, Kernel Update, ...). The rollout path runs from Finalize Maintenance Actions → Remount root Filesystem as Read-Only → Reboot and Test Maintenance Application Node (back to the maintenance actions if the test failed) → Create var_template from Customized var Image → Enable FA Agents in Custom root Image → Change exports on Filer to Read-Only for root Image → Create Netboot Configuration for Remaining Application Nodes → Create and Configure Remaining Application Node Images → Boot or Reboot Remaining Application Nodes → next maintenance cycle.]
8.6.2.2 Creating a Custom Image Tree for Linux Application Nodes


In this example, the customer specific image tree is CustomImage_3.2.
Copy the original image directory tree FSC_3.2B00-000.SLES-9.X86_64 to
CustomImage_3.2:

control1:/FlexFrame/volFF/os/Linux/FSC_3.2B00-000.SLES-9.X86_64 #
find root_img var_img/var_template | cpio -pdum ../CustomImage_3.2
control1:/FlexFrame/volFF/os/Linux/FSC_3.2B00-000.SLES-9.X86_64 #
cd ..
control1:/FlexFrame/volFF/os/Linux # ls -l
total 20
drwxr-xr-x 5 root root 4096 Apr 29 16:33 .
drwxr-xr-x 4 root root 4096 Mar 15 15:10 ..
drwxr-xr-x 4 root root 4096 Apr 22 13:41
FSC_3.2B00-000.SLES-9.X86_64
drwxr-xr-x 4 root root 4096 Apr 29 17:10
CustomImage_3.2
drwxr-xr-x 4 root root 4096 Apr 22 14:15 pool_img

This image directory tree contains all necessary images, the root image and all var
images:
control1:/FlexFrame/volFF/os/Linux # ls -l CustomImage_3.2
total 16
drwxr-xr-x 4 root root 4096 Apr 29 17:10 .
drwxr-xr-x 5 root root 4096 Apr 29 16:33 ..
drwxr-xr-x 37 root root 4096 May 3 11:07 root_img
drwxr-xr-x 9 root root 4096 Apr 22 14:15 var_img

8.6.2.3 Disabling the FA Agents


The FA Agents must be disabled for the maintenance cycle before booting or rebooting the maintenance Application Node. To do this, deactivate the service using the insserv -r command. You can do this directly in the maintenance image without influencing the productive images.
Execute in the maintenance root image on the Control Node:

control1:/FlexFrame/volFF/os/Linux/CustomImage_3.2/root_img/etc/
init.d # insserv -r ./myAMC*
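
To check that the runlevel links of the FA Agents have really been removed, you can, for example, search the init.d directory of the maintenance root image (a sketch; the exact script names depend on the installed FA Agents version). After the insserv -r call only the scripts themselves, but no rc?.d links, should be listed:

control1:/FlexFrame/volFF/os/Linux/CustomImage_3.2/root_img/etc/
init.d # find . -name '*myAMC*'
./myAMC.FA_AppAgent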


8.6.2.4 Modifying the Netboot Configuration


Modify the netboot configuration for the maintenance Application Node with
ff_an_adm.pl:

control1:/ # ff_an_adm.pl --op os --name an_0504 --ospath Linux/CustomImage_3.2

8.6.2.5 Creating the New var Image


Create the new var image for the maintenance Application Node on the Control Node.
The following steps will be done automatically:
● create the new var images for the Application Nodes
● create the new entries for the Filer exports
● create the new pxelinux configuration
● update the LDAP database
control1:/ # ff_new_an.sh -n an_0504

You now have a new Application Node image tree with one newly configured Application
Node.
The newly configured Application Node will use this image tree upon the next reboot and
can be used for maintenance purposes.
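
You can verify, for example, that the pxelinux configuration for the new image tree has been created. The file name follows the scheme <hw_class>_<pool_name>_<image_tree_name>; rx300_pool1_CustomImage_3.2 is the name used in the examples below and may differ on your installation:

control1:/ # ls /tftpboot/pxelinux.cfg | grep CustomImage_3.2
rx300_pool1_CustomImage_3.2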

8.6.2.6 Enabling Read-Write Access to the New Root Image


Modify the exports of the new image on the Filer to permit read-write access to the root image for this Application Node in the Storage LAN.
Get the Storage LAN IP address:
control1:/FlexFrame/<filer>/vol0/etc # getent hosts an_0504-st
192.168.11.53 an_0504-st

Modify the exports file of the Filer:

control1:/FlexFrame/<filer>/vol0/etc # vi exports
....
/vol/volFF/os/Linux/CustomImage_3.2/root_img
-sec=sys,rw=192.168.11.53,anon=0


Activate the permissions on the Filer:


control1:/ # rsh <filer>-st exportfs -a

Reboot this Application Node.


Remount the root image as read-write:
an_0504:/ # mount -o remount,rw /

Check the write access:


an_0504:/ # cd /
an_0504:/ # touch delme
an_0504:/ # rm delme

8.6.2.7 Maintaining the Application Node Image


The image maintenance can now go on. Normally RPMs will be installed. Examples for
this will be described below.

8.6.2.8 Creating the New var_template


Once the new Application Node's root image and var image are fully operational, copy this var image to the new template var image var_template. Obsolete files should be removed.
Copy:
Copy:

control1:/vol/volFF/os/Linux/CustomImage_3.2/var_img #
mv var_template var_template.OLD
control1:/vol/volFF/os/Linux/CustomImage_3.2/var_img #
cp -pdR var-coa80b35 var_template

Cleanup:

control1:/vol/volFF/os/Linux/CustomImage_3.2/var_img/var_template/
tmp # rm -rf *

control1:/vol/volFF/os/Linux/CustomImage_3.2/var_img/var_template/
FlexFrame/etc/sysconfig/network # rm ifcfg-vlan*

Do not remove files in CustomImage_3.2/var_img/var_template/log/ because some files there are required by several Linux components.


8.6.2.9 Enabling the FA Agents


The FA Agents must be re-enabled on the new root-image:
an_0504:/etc/init.d # insserv ./myAMC*

or, from the Control Node:

control1:/FlexFrame/volFF/os/Linux/CustomImage_3.2/root_img/etc/
init.d # insserv ./myAMC*

8.6.2.10 Disabling Read-Write Access to the New Root Image


Modify the exports of the new image on the Filer to permit only read access to the root image for all Application Nodes:
control1:/FlexFrame/<filer>-st/vol0/etc # getent hosts an_0504-st
192.168.11.53 an_0504-st
control1:/FlexFrame/<filer>-st/vol0/etc # vi exports
....
/vol/volFF/os/Linux/CustomImage_3.2/root_img -sec=sys,ro,anon=0

control1:/ # rsh <filer>-st exportfs -a

Remount the root image as read-only:


an_0504:/ # mount -o remount,ro /

8.6.2.11 Migrating the Remaining Application Nodes


Migrate the remaining Application Nodes as described in section “Migrating Remaining
Application Nodes to the New Application Node Image” on page 147.

8.6.3 Service Packs


It is not recommended to install any Service Packs on Application Node images manually. The only way to obtain a new Service Pack for Application Nodes is to install new Application Node images delivered by Fujitsu Siemens Computers. Service Packs contain a lot of changes that can affect the FlexFrame functionality. Service Packs can also contain initial configuration files that would overwrite the FlexFrame-specific configuration.


8.6.4 Updating / Installing a New Linux Kernel


This section describes how to install an additional (alternative) Linux kernel into an
existing Application Node image.
For new Linux kernels it is not necessary to do the upgrade in a maintenance image.
Every kernel version has its own module directory, its own initial ramdisk initrd and its
own kernel file.
The netboot configuration
/tftpboot/pxelinux.cfg/<hw_class>_<pool_name>_<image_tree_name>
will determine which Linux kernel and initial ramdisk need to be used for a subset of
Application Nodes. The image tree name of original images delivered by FSC begins with FSC_ and follows the scheme
FSC_<FF version>.<OS version>.<OS architecture>.

8.6.4.1 Software Stage


To install and update any software packages in FlexFrame, it is useful to mount your Filer, a jukebox or any other software stage on a local mount point. The same software stage must be mounted on the Application Node which is appointed to install the new kernel RPMs. This Application Node must be able to run the SMP kernel and the default kernel; note that the SMP kernel cannot be used with the BX300 Mono-Blades with Pentium-M processor.
Our standard software stage in FlexFrame is /FlexFrame/volFF/FlexFrame/stage.
Create a subdirectory for the delivered software:
control1:/FlexFrame/volFF/FlexFrame/stage # mkdir -p SuSE/SLES8

Copy the delivered software packages to the software stage.


SMP kernel:
control1: # cp -p <somewhere>/k_smp-2.4.21-286.i586.rpm
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8

Default kernel:
control1: # cp -p <somewhere>/k_deflt-2.4.21-286.i586.rpm
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8

Kernel source:
control1: # cp -p <somewhere>/kernel-source-2.4.21-286.i586.rpm
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8


Control Node initrd:

control1: # cp -p <somewhere>/initrd_2.4.21.gz
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8

This is not the original initrd as created while installing the kernel RPM. This initrd
already contains all necessary drivers for the Application Nodes while net-booting.

8.6.4.2 Installing a New Linux Kernel


Logon to the appointed Application Node:
control1:/ # ssh -X an_0504

Check the currently installed Linux kernels:


an_0504:/ # rpm -qa|grep k_
k_deflt-2.4.21-251
k_smp-2.4.21-251
an_0504:/ #

To install any software on the Application Nodes it is necessary to get write permission on the mounted root file system. In this example, the root image is exported as read-write only for the maintenance Application Node 192.168.12.19 and afterwards remounted read-write on that node.
Edit vol0/etc/exports on the Filer:

control1:/FlexFrame/<filer>/vol0/etc # vi exports
...
/vol/volFF/os/Linux/CustomImage_3.2/root_img -
ro,anon=0,rw=192.168.12.19

Export the temporary permissions via rsh on the Filer. Telnet to the Filer is also possible:
control1:/ # rsh <filer> exportfs -a

Do not forget to export the root image as read-only after the maintenance!
On the Application Node:
Remount the Application Nodes root filesystem with write permissions:
an_0504:/ # mount -o remount,rw /


Mount the software stage:


an_0504:/mnt # mkdir /mnt/stage
an_0504:/mnt #
mount -t nfs <filer>:/vol/volFF/FlexFrame/stage /mnt/stage

Install the new Linux kernel 2.4.21-286: the default kernel (-default) for BX300 Mono-Blades with Pentium-M processor and the SMP kernel (-smp) for all other systems.
For SMP kernels:
an_0504:/mnt/stage/SuSE/SLES8 # rpm -i k_smp-2.4.21-286.i586.rpm

For default kernels:


an_0504:/mnt/stage/SuSE/SLES8 # rpm -i k_deflt-2.4.21-286.i586.rpm

Kernel source:
Save the old symbolic links:
an_0504:/usr/share/doc/packages # mv kernel-source
kernel-source-2.4.21-251

an_0504:/usr/src # rm linux.old; mv linux linux.old


an_0504:/usr/src #
rm linux-include.old; mv linux-include linux-include.old

Install the kernel source:


an_0504:/mnt/stage/SuSE/SLES8 # rpm -i --force
kernel-source-2.4.21-286.i586.rpm

For ServerView you have to configure the kernel source:


an_0504:/usr/src/linux # make menuconfig

Leave the configuration unchanged and continue as follows:


SAVE + EXIT

an_0504:/usr/src/linux # make dep


8.6.4.3 Creating a New initrd for Application Nodes


This section describes how to create a new initrd for Application Nodes within the
FlexFrame environment.
Replace the original initrd by the new Application Node initrd. The new Application
Node initrd contains modules for the SMP kernel and default kernel and previous
kernel versions, too.
Execute on the Control Node:
control1:/tftpboot # cp -p initrd_2.4.21.gz initrd_2.4.21.gz.OLD
control1:/tftpboot # cp -p
/FlexFrame/volFF/FlexFrame/stage/SuSE/SLES8/initrd_2.4.21.gz .

Because we support different PRIMERGYs with different network, SCSI and IDE controllers, the appropriate modules are loaded depending on the netboot configuration parameter DRV in the special Application Node initrd.

8.6.4.4 New Netboot Configuration


Copy the new kernel to /tftpboot:

control1:/FlexFrame/volFF/os/Linux/CustomImage_3.2/root_img/boot #
cp -p vmlinuz-2.4.21-286-smp /tftpboot
control1:/FlexFrame/volFF/os/Linux/CustomImage_3.2/root_img/boot #
cp -p vmlinuz-2.4.21-286-default /tftpboot

Create new symbolic links for the netboot configuration:


control1:/tftpboot # ln -s vmlinuz-2.4.21-286-smp SLES8_2.4.21-286
control1:/tftpboot #
ln -s vmlinuz-2.4.21-286-default SLES8_2.4.21-286-def
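
You can check the new links, for example, with ls (output shortened):

control1:/tftpboot # ls -l SLES8_2.4.21-286 SLES8_2.4.21-286-def
lrwxrwxrwx 1 root root ... SLES8_2.4.21-286 -> vmlinuz-2.4.21-286-smp
lrwxrwxrwx 1 root root ... SLES8_2.4.21-286-def -> vmlinuz-2.4.21-286-default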


8.6.4.5 Netboot Configuration for the First Test


Change the kernel descriptors to the new one, in our example -286:

control1:/tftpboot/pxelinux.cfg # vi rx300_pool1_CustomImage_3.2

default SLES8-286
..
..
LABEL SLES8-286
KERNEL SLES8_2.4.21-286
APPEND initrd=initrd_2.4.21.gz
nfsroot=192.168.12.204:/vol/volFF/os/Linux/CustomImage_3.2/
root_img,intr,v3,nolock,rsize=32768,wsize=32768,tcp BI=1:2
DRV=BCM5700,E1000,AIC79 apm=off

ServerView will generate (compile) new modules for the new Linux Kernel during the next
reboot. The root image has to be writable for this, too.
While rebooting the new kernel, this message should appear on the console:

Compiling modules for 2.4.21-286-smp:


copa(Ok) cop(Ok) ihpci(Ok) ipmi(Ok) smbus(Ok)
done

Because of the mount option “ro” on the Application Node, automatic module compilation
will fail. In this case this has to be done manually with:
an_0504:/ # mount -o remount,rw /
an_0504:/ # /etc/init.d/eecd_mods_src start
Compiling modules for 2.4.21-286-smp:
copa(Ok) cop(Ok) ihpci(Ok) ipmi(Ok) smbus(Ok)
done

For SLES8 Application Node images, this is required for both the SMP kernel and the default kernel. With SLES9 there is only an SMP kernel, so you are finished now.


Therefore, for SLES8, the netboot configuration should be changed like this:
control1:/tftpboot/pxelinux.cfg # vi rx300_pool1_CustomImage_3.2

default SLES8-286-def
..
LABEL SLES8-286-def
KERNEL SLES8_2.4.21-286-def
APPEND initrd=initrd_2.4.21.gz
nfsroot=192.168.12.204:/vol/volFF/os/Linux/CustomImage_3.2/
root_img,intr,v3,nolock,rsize=32768,wsize=32768,tcp BI=1:2
DRV=BCM5700,E1000,AIC79 apm=off

Reboot the Application Node and compile the ServerView modules again:
an_0504:/ # mount -o remount,rw /
an_0504:/ # /etc/init.d/eecd_mods_src start
Compiling modules for 2.4.21-286-default:
copa(Ok) cop(Ok) ihpci(Ok) ipmi(Ok) smbus(Ok)
done

8.6.4.6 Removing Write Permissions for the Maintenance Application Node on the Root Image
Mount the root file system as ro:

an_0504:/ # mount -o remount,ro /

Export the root image as ro for all Application Nodes.


Edit the vol0/etc/exports on the Filer:

control1:/FlexFrame/<filer>/vol0/etc # vi exports
..
/vol/volFF/os/Linux/CustomImage_3.2/root_img -ro,anon=0

Export the regular permissions via rsh on the Filer. Telnet to the Filer is also possible:

control1:/ # rsh <filer> exportfs -a

Test the functionality with this Application Node.


8.6.4.7 Changing All Netboot Configuration Templates to the New Linux Kernel
If the kernel tests have run successfully, the netboot configuration templates for creating new Application Nodes should be switched to the new configuration.
In this example the old Linux kernel descriptor was “-251” and the new one is “-286”:

control1:/tftpboot/pxelinux.cfg/templates #
perl -p -i.$(date +'%Y%m%d%H%M%S') -e 's/-251/-286/' *_template
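
To verify the substitution you can, for example, search the templates for the new kernel descriptor (a simple check; the template file names depend on your pools and hardware classes):

control1:/tftpboot/pxelinux.cfg/templates # grep -l 2.4.21-286 *_template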

8.6.5 ServerView Update


Prepare the custom Application Node image as described in section “Creating a New
Linux OS Image for Application Nodes” on page 132.
To update or install new software or patches we need write permission for the root image on the Filer. This should be granted for this maintenance Application Node only. For this purpose we temporarily export and remount the root image read-write.
On the Control Node:
control1:/FlexFrame/<filer>/vol0/etc # vi exports
..
#/vol/volFF/os/Linux/CustomImage_3.2/root_img -ro,anon=0
/vol/volFF/os/Linux/CustomImage_3.2/root_img
-ro,anon=0,rw=192.168.11.53

Activate the changes:


control1:/ # rsh <filer> exportfs -a

On the Application Node:


an_0504:/ # mount -o remount,rw /

Mount your software stage:


an_0504:/mnt # mkdir /mnt/stage
an_0504:/mnt #
mount -t nfs <filer>:/vol/volFF/FlexFrame/stage /mnt/stage

Update the installed RPMs:

an_0504:/mnt/stage/ServerView #
rpm -U srvmagt-mods_src-3.10-14.suse.rpm
insserv: can not stat(_sap_acc_agents)
Loading modules: ipmi smbus msr cpuid
..done


(The message insserv: can not stat ... can be ignored.)

an_0504:/mnt/stage/ServerView #
rpm -U srvmagt-eecd-3.10-14.suse.rpm
Running pre (2) for srvmagt-eecd-3.10-14
Shutting down eecd: TERM..done
lrwxrwxrwx 1 12 Sep 24 18:01 /sbin/halt -> halt-srvmagt
Running post (2) for srvmagt-eecd-3.10-14
insserv: can not stat(_sap_acc_agents)
Starting eecd..done

an_0504:/mnt/stage/ServerView #
rpm -U srvmagt-agents-3.10-14.suse.rpm
Running pre (2) for srvmagt-agents-3.10-14
ONUCDSNMP=true
Stopping agents: sc bus hd unix ether bios secur status inv
vv..done
Shutting down snmpd:..done
Running post (2) for srvmagt-agents-3.10-14
Linking 32-bit/64-bit binaries for i686
insserv: can not stat(_sap_acc_agents)
lrwxrwxrwx 1 16 Sep 24 17:47 /usr/sbin/snmpd -> ucdsnmpd-
srvmagt
Running triggerin (2, 1) for srvmagt-3.10-14
lrwxrwxrwx 1 16 Sep 24 17:47 /usr/sbin/snmpd -> ucdsnmpd-
srvmagt
Starting snmpd..done
Starting agents: sc bus hd unix ether bios secur status inv
vv..done
an_0504:/mnt/stage/ServerView #

On the Control Node:


For security reasons, we now remount our root image as read-only again:
control1:/FlexFrame/<filer>/vol0/etc # vi exports
...
/vol/volFF/os/Linux/CustomImage_3.2/root_img -ro,anon=0
#/vol/volFF/os/Linux/CustomImage_3.2/root_img
-ro,anon=0,rw=192.168.11.53
control1:/ # rsh <filer>-st exportfs -a

On the maintenance Application Node:


an_0504:/ # mount -o remount,ro /

The ServerView Agent update is now finished.


8.6.6 Upgrading RPM Packages on an Application Node


The best practice for updating RPM packages is "Creating a New Linux OS Image for Application Nodes" - see section 8.6.2. Alternatively, the deployment of the RPM content belonging to the var images can be done manually.
This is not recommended!
Nevertheless, the necessary steps are described below.
Application Nodes do not have a local Linux installation. The installation is divided into a
shared root image and a private var image.
The contents of an RPM package are distributed on a package-specific basis to the file
system. Therefore, it may occur that the RPM package is located partly in the root image
and partly in the var image. If an RPM package is to be installed, the root image must be temporarily mounted read-write.
Possible approaches:
● The RPM package is loaded separately on each Application Node; the part in the
shared root image is always overwritten.
● The RPM package is only loaded on one Application Node, and the part in the var
image, including the RPM database, is then distributed manually to the other
Application Nodes.

The RPM database is linked from the var image to the root image.
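
A minimal sketch of the first approach for a single Application Node (the package name foo-1.0-1.i586.rpm is a placeholder; the temporary read-write export of the root image for this node has to be prepared and removed again as shown in section 8.6.5):

an_0504:/ # mount -o remount,rw /
an_0504:/mnt/stage # rpm -U foo-1.0-1.i586.rpm
an_0504:/ # mount -o remount,ro /

Repeat this on each Application Node using the same image tree; the part of the package installed in the shared root image is simply overwritten each time, while the var image part and the RPM database are updated per node.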

8.6.7 Upgrading the Application Software


The rules for installation of application software (SAP/Oracle) apply here as well.
An Application Node should be available exclusively for upgrading the application
software, so that the physical host name can be set to be identical with the virtual host
name of the application.
Provided this has been done, upgrading can be carried out in accordance with the
software vendor’s standard guides.
All parts of the application software must reside on the network file system mounted from the Filer.
For more details see the SAP installation Guides.


8.6.8 Migrating Remaining Application Nodes to the New Application Node Image
Deploy the new image to other Application Nodes with the administration tools.

8.6.8.1 Modifying the Netboot Configuration


Modify the netboot configuration for each Application Node with ff_an_adm.pl:

Synopsis

control1:/ # ff_an_adm.pl --op os --name <app node name> --ospath <directory>

To modify the netboot configuration files for a single Application Node, e.g. Application Node an_0506, invoke:

control1:/ #
ff_an_adm.pl --op os --name an_0506 --ospath Linux/CustomImage_3.2

8.6.8.2 Creating the New var Images


Create the new var images for the Application Nodes. The following steps will be done
automatically:
● create the new var images for the Application Nodes
● create the new entries for the Filer exports
● create the new pxelinux configuration
● update the LDAP database

Synopsis

ff_new_an.sh {-n <app node name>|-s <search criteria>}

To create a new var image for one Application Node, e.g. an_0506, invoke:

control1:/ # ff_new_an.sh -n an_0506

To create new var images for all Application Nodes belonging to the new Application Node image CustomImage_3.2, invoke:

control1:/ # ff_new_an.sh
-s 'PATH_TO_IMAGES=/FlexFrame/volFF/os/Linux/CustomImage_3.2'


This will create all var-images for the new Application Node image at once.
The newly configured Application Nodes will use their new images upon the next reboot.

8.7 Installation / Activation of New Solaris Images

8.7.1 Introduction

8.7.1.1 General Notes


Seen from a Control Node, the Solaris Application Node images are located in a folder on the NAS storage at /FlexFrame/volFF/os/Solaris/<identifier>.

<identifier> denotes the version and origin of the images below. For images created
at Fujitsu Siemens Computers the identifier is built as follows:
<identifier> = FSC_<SunOS_version>_<version_details>
Names for customer specific Solaris images must not start with “FSC”. It is
recommended to use a similar naming convention.

In this directory you will find one or more boot images (bi*) or master boot image
directories (mbi*) in the form
bi_<server_class> or mbi_<server_class>

<server_class> denotes a group of PRIMEPOWER servers of the same server class which share the same basic technology.
As of the time this document was written, the following server classes are supported:
FJSV,GPUZC-M_PW-P (PW 250 and PW 450)
FJSV,GPUZC-M_PW-CMZ (PW 650 and PW 850)
FJSV,GPUZC-L_PW-CLZ (PW 900, PW 1500 and PW 2500)
The classification in different hardware platforms is necessary because of the ESF
(Enhanced Shutdown Facility) software, which has to be installed on each
PRIMEPOWER system.
The next directory level holds a structure similar to the Solaris diskless client concept:
Solaris_<version>
exec
root

In exec/Solaris_<version>_sparc.all/usr you will find the read-only area (/usr). This folder is shared between all Application Nodes of the same server class.


The root folder holds the clone and directories named like the Application Nodes using
this image (including -st suffix for Storage LAN). The clone is the basis for creating an
Application Node specific image.
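
For illustration, such an image tree might look like this on a Control Node (a sketch using the image identifier and server class from the examples below; the clone and further Application Node directories are omitted):

control1:/ # ls /FlexFrame/volFF/os/Solaris/FSC_5.8_202_20050211
bi_FJSV,GPUZC-M_PW-P
control1:/ # ls /FlexFrame/volFF/os/Solaris/FSC_5.8_202_20050211/bi_FJSV,GPUZC-M_PW-P/root
an_0800-st ...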

8.7.1.2 How to Access the Console of a PW 250 / PW 450


The PRIMEPOWER server models 250 and 450 have a built-in XSCF (eXtended System
Control Facility) which allows the administrator to access the console of the server via
LAN interface. During the first boot of a PRIMEPOWER the console is redirected to this
LAN interface.
Here’s how you access the console (e.g. of Application Node an_0800):

control1:~ # telnet an_0800-co


Trying 192.168.20.30...
Connected to an_0800-co.
Escape character is '^]'.

an_0800 console login:

Now you have direct access to the Application Node’s console. We recommend keeping
the consoles of the Application Nodes open during boot and maintenance work.
If the console is not accessible (locked by another administrator or the Application Node
is not responding, etc.) you can connect directly to the XSCF itself using port 8010:
control1:~ # telnet an_0800-co 8010
Trying 192.168.20.30...
Connected to an_0800-co.
Escape character is '^]'.

SCF Shell

login:root
Password:
SCF Version 03190001
ALL RIGHTS RESERVED, COPYRIGHT (C) FUJITSU LIMITED 2003
ff.ff.ff[192.168.20.30]
SCF>

To hang-up a telnet console connection enter:


SCF> hangup 23

If the Application Node gets stuck, check the console logs using the following command:
SCF> show-console-logs


To initiate a hardware-reset enter the following.


The Application Node will unconditionally enter the OBP prompt and processes
running on the Application Node will be interrupted without the chance of a
controlled stop!
SCF> xir
Are You sure to Reset(XIR)?
[y/n] y

On the console you will see:

Externally Initiated Reset


{1} ok

Now you can reboot the Application Node.

8.7.1.3 OBP Flag: Local-mac-address?


The following option must be set:
{0} ok setenv local-mac-address? true

If this setting is not done, strange network behaviour may be the result. Some
network switches cannot handle the same MAC address at different
ports/VLANs.
The address reported by banner may differ! banner typically reports hme0's
mac address, which is not used in FlexFrame.
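
You can verify the current setting from the OBP prompt, for example with printenv:

{0} ok printenv local-mac-address?
local-mac-address? = true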

8.7.1.4 OBP Flag: Auto-boot?


For testing purposes, it may be useful to disable auto-boot:
{0} ok setenv auto-boot? false
{0} ok printenv auto-boot?
auto-boot? = false

You can enable it again after testing with


{0} ok setenv auto-boot? true


8.7.1.5 What Happens When the Solaris Client Boots


A freshly installed and not yet booted Solaris has a flag file; at the first boot, rc scripts run depending on this flag file. Be sure to have at least a line like the following

network_interface=PRIMARY {protocol_ipv6=no hostname=pw250f-st
    ip_address=192.168.1.100 netmask=255.255.255.0}

properly configured in /etc/sysidcfg. The host name must match the network name
of the NIC in the Storage LAN you are booting from.
On the very first boot /etc/sysidcfg is read by a couple of tools, which in turn edit e.g.
/etc/hosts, /etc/default/init etc. After the next reboot /etc/sysidcfg is no
longer used. After having mounted its root file system, the client mounts /usr as defined
in /etc/vfstab. The system should now go to multiuser level as usual.
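
Apart from the network_interface line, a minimal /etc/sysidcfg usually contains a few more keywords. A hedged example (all values are placeholders and have to be adapted to your installation):

system_locale=C
timezone=Europe/Berlin
terminal=vt100
name_service=NONE
security_policy=NONE
timeserver=localhost
root_password=<encrypted_root_password>
network_interface=PRIMARY {protocol_ipv6=no hostname=pw250f-st
    ip_address=192.168.1.100 netmask=255.255.255.0}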
We start the boot process from the console of the PRIMEPOWER (from the OBP).
More information is displayed with the options -dv.

Fujitsu Siemens PRIMEPOWER250 2x SPARC64 V, No Keyboard


OpenBoot 3.10.1-1, 4096 MB memory installed
Ethernet address 0:e0:0:c4:9a:b, Host ID: 80f39a0b.
XSCF Version: 3.19.1

{0} ok boot -dv


Boot device: /pci@83,2000/FJSV,pwga@1 File and args: -dv
Timeout waiting for ARP/RARP packet
2b600 Using RARP/BOOTPARAMS...

Requesting Internet address for 0:e0:0:c5:1a:b


Internet address is: 192.168.10.30
hostname: an_0800-st
whoami: Router ip is: 192.168.10.201
Found 192.168.10.201 @ 0:30:5:40:6a:b9
root server: <filer>-st(192.168.10.203)
root directory:
/vol/volFF/os/Solaris/FSC_5.8_202_20050211/bi_FJSV,GPUZC-M_PW-
P/root/an_0800-st
Found 192.168.10.203 @ 2:a0:98:1:1e:92
Size: 376880+105005+158015 Bytes
SunOS Release 5.8 Version Generic_117350-18 64-bit
Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
Ethernet address = 0:e0:0:c4:9a:b
...


Explanation of this screen:


The message Timeout waiting... may occur a few times. If it does not continue,
check if the rarpd process is running on the Control Node and if there is an appropriate
entry in /etc/ethers on the Control Nodes.
Another cause could be that the network interface of the Application Node is not in the
network segment where the rarpd of the Control Node is listening on. In this case check
the network cabling and switch configuration.
The 2b600 is given as an example; this value may vary on your installation. It is the size in hex (0x2b600 = 177,664 bytes) of the TFTP kernel file that is linked to the Application Node's IP address (in hex) in the /tftpboot directory of the Control Node.
The Router ip is: ... denotes the Control Node; for the boot process, the Control Node acts as the router (although it does not really provide a routing function).
The root server and root directory must point to the Filer and the directory where
the root file system of the Application Node is located. If the boot process stops right after
this message, the root file system may not be found or the access permissions in
/vol0/etc/exports of the Filer are not set correctly. Another possibility could be
network segment issues.
The boot phase should run until you see a login prompt.

The system is ready.


an_0800 console login:

8.7.2 Solaris Image – rc scripts


If the Application Node does not come up (e.g. “hangs” during execution of an rc script),
it is useful to know which rc script is currently being executed.
To get this information you need to create a file /etc/.trace.sd into the root file
system of the Application Node (e.g. from the Control Node), e.g. like:

control1:~ #
touch /FlexFrame/volFF/os/Solaris/FSC_5.8_202_20050211/
bi_FJSV,GPUZC-M_PW-P/root/an_0800-st/etc/.trace.sd

The messages look like:

...
##### rc2: /etc/rc2.d/S01SMAWswapmirror start
##### rc2: /etc/rc2.d/S05RMTMPFILES start
##### rc2: /etc/rc2.d/S06log3 start
Starting the logging daemon


Started!
##### rc2: /etc/rc2.d/S10lu start
##### rc2: /etc/rc2.d/S19FJSVdmpsnap start
##### rc2: /etc/rc2.d/S20sysetup start
...

If you want to turn off the messages again, remove the file .trace.sd, e.g. like:

control1:~ # rm /FlexFrame/volFF/os/Solaris/FSC_5.8_202_20050211/
bi_FJSV,GPUZC-M_PW-P/root/an_0800-st/etc/.trace.sd

8.7.3 Preparation for Solaris Application Nodes


If you have an already running Solaris Application Node in your FlexFrame Environment
you can continue with the steps in section 8.7.3.1 (see page 154). If not, please refer first
to section 8.7.3.2 (see page 158).
When you use a FlexFrame 3.1 or FlexFrame 3.2A environment to build the Solaris Boot Images of FlexFrame 3.2B, please keep in mind that the netboot packages of FlexFrame are overwritten with new packages!
This results in new scripts with a completely different functionality than the old ones. Rebuilding a FlexFrame 3.1 or FlexFrame 3.2A Solaris Application Node with the ff_new_an.sh command will not work. In that case you have to reinstall the netboot packages of FlexFrame 3.1 resp. FlexFrame 3.2A.
You will find the old packages on the DVDs "FlexFrame Images & Tools for Solaris 8 Application Node for PRIMEPOWER" or "FlexFrame Images & Tools for Solaris 9 Application Node for PRIMEPOWER".
Insert the DVD into the CD/DVD drive and use the following commands for the installation:
mount /media/dvd
cd /media/dvd/packages
rpm -U --test --nodeps SMAWnbpw.rpm
rpm -U --test --nodeps SMAWnbpw-FFMySAP.rpm   # on FlexFrame 3.2A
or
rpm -U --test --nodeps SMAWnbpw-FF31_00.rpm   # on FlexFrame 3.1


You need sufficient free space on the file system of the Filer where the Solaris Boot Images will be stored. Before creating a Solaris Boot Image, make sure that the usage level of this file system will not exceed 80 percent *after* the image has been created. Keep in mind that one Solaris Boot Image needs about 5 GB of space in the file system!
Otherwise the commands called by nb_unpack_bi will fail and the Solaris Boot Image will be unusable! If this happens, you will get error messages like:
Patch cluster install script for Solaris 9 Recommended
Determining if sufficient save space exists...
expr: syntax error
expr: syntax error
expr: syntax error
.....
expr: syntax error
./Rinstall_cluster: test: argument expected

8.7.3.1 Using a Running Solaris Application Node as Helper System for Preparation
Due to legal regulations it is no longer permitted to provide prepared "ready to use" Solaris Boot Images. Therefore you have to create the Solaris Boot Images yourself. This process is supported by the tool described in this section.
All you need is:
● The original Solaris CDs/DVD for Solaris 8 or 9
● The DVD "FlexFrame Sources & Tools for Solaris Application Node for PRIMEPOWER"
The DVD "FlexFrame Sources & Tools for Solaris Application Node for PRIMEPOWER"
contains all needed software - except the original Solaris OS - to create a Boot Image for
your Solaris Application Nodes. Furthermore it contains tools which have to be installed
on both Control Nodes. The tools are packed in rpm packages.
With this DVD and the original Solaris CDs/DVD you are able to create four different
types of Solaris Boot Images for your Solaris Application Nodes, depending on the
hardware class of your Application Node.
● Solaris 8 Application Node for PRIMEPOWER 250/450 (FJSV,GPUZC-M_PW-P)
● Solaris 8 Application Node for PRIMEPOWER 650/850 (FJSV,GPUZC-M_PW-CMZ)
● Solaris 9 Application Node for PRIMEPOWER 250/450 (FJSV,GPUZC-M_PW-P)
● Solaris 9 Application Node for PRIMEPOWER 650/850 (FJSV,GPUZC-M_PW-CMZ)
The Solaris Boot Image creation process takes approximately 5 hours for a Solaris 9 Boot Image and about five and a half hours for a Solaris 8 Boot Image.


For each hardware class of the Application Nodes you have to create the corresponding
Boot Image on your Control Node respectively Filer.
The following overview and explanations describe the necessary preparations for creating a Boot Image.

Step 1: Sources & Tools for Solaris
Where:   Control Node
Why:     Access to the "FlexFrame Sources & Tools for Solaris Application Node for PRIMEPOWER" DVD
How to:  Insert the "FlexFrame Sources & Tools for Solaris Application Node for PRIMEPOWER" DVD into the DVD drive of the CN.
Example: control1:/ # mount /media/dvd

Step 2: Solaris 9 DVD/CDs or Solaris 8 CDs
Where:   Solaris AN helper system
Why:     Access to the original Solaris OS DVD/CDs
How to:  Insert the Solaris OS DVD/CD into the DVD/CD drive of the Solaris AN. It will be mounted automatically by vold.

Step 3: modify exports of Filer (3) *
Where:   Control Node
Why:     Access to the Filer for the Solaris AN
How to:  Add an entry to the exports file of the Filer:
Example: /vol/volFF/os/Solaris/new_FSC_SOLBUID* -sec=sys,rw=<Storage_IP Solaris_AN>,root=<Storage_IP Solaris_AN>,anon=0
How to:  Activate the new exports on the Filer.
Example: control1:/ # rsh <FilerSystem-st> exportfs -a

Step 4: modify exports of CN (4) *
Where:   Control Node
Why:     Access to the DVD drive of the CN for the Solaris AN
How to:  Share the DVD drive of the CN for the Solaris AN:
Example: control1:/ # exportfs <Solaris_AN-st>:/media/dvd

Step 5: nfsserver (5) *
Where:   Control Node
Why:     Access to the DVD drive of the CN for the Solaris AN
How to:  Start the nfsserver service on the CN.
Example: control1:/ # /etc/init.d/nfsserver start

Step 6: ssh (6)
Where:   Control Node
Why:     Create a known_hosts entry on the Solaris AN
How to:  Set up an ssh connection via the Storage LAN from the CN to the Solaris AN.
Example: control1:/ # ssh <name_or_IP_address_of_Solaris_AN>
Why:     Commands for creating the Base Boot Image and the Boot Image are called via ssh on the CN. *
How to:  Check that ssh access to the Solaris AN works without entering a user or password.
Example: control1:/ # ssh <name_or_IP_address_of_Solaris_AN> <cmd>

Step 7: nfs-mount (7) *
Where:   Solaris AN helper system
Why:     Mount the "Sources & Tools for Solaris" DVD located in the DVD drive of the CN on the Solaris AN.
Example: control1:/ # ssh <HelperSystem> "mkdir /tmp/DVD_<iso_name>"
         control1:/ # ssh <HelperSystem> "mount -F nfs <CN>:<DVD_path_to_iso> /tmp/DVD_<iso_name>"


Step 8: nfs-mount (8) *
Where:   Solaris AN helper system
Why:     Mount the file system on the Filer where the Boot Image will be stored.
How to:  The mount commands are called via ssh (Storage LAN) from the CN on the Solaris AN.
Example: control1:/ # ssh <HelperSystem> "mkdir /tmp/<FSC_TMP_PATH>"
         control1:/ # ssh <HelperSystem> "mount -F nfs <FILER>:<OS_FILER_PATH>/<FSC_TMP_PATH> /tmp/<FSC_TMP_PATH>"

* automatically done by nb_unpack_bi.sh.
The following describes the Boot Image creation process itself:
1. Choose one of your running Solaris Application Nodes as a helper system to create
the boot image.
2. Insert the (first of your) original Solaris CDs/DVD of the Solaris version you want to
build the image for into the CD/DVD drive of your helper system. (At the moment, ESF does not permit building a Solaris 8 Image using a Solaris 9 system. Therefore you have to choose a running Solaris 9 system to build a Solaris 9 Image and a running Solaris 8 system to build a Solaris 8 or a Solaris 9 Image.)
3. Put the DVD "FlexFrame Sources & Tools for Solaris Application Node for
PRIMEPOWER" into the DVD drive of your Control Node, mount it and change the
directory to the root directory of the mounted DVD:
# mount /media/dvd
# cd /media/dvd

Check the path /FlexFrame/volFF. It has to be a file system mounted from the Filer. Otherwise the installation of the Boot Images in the next step will be done to the local disks of the Control Node. If no Filer is mounted, check your network.
4. To create the Boot Image and install the netboot package, call the command
# ./nb_unpack_bi

and follow the instructions. For further details, please refer to the section
“Preparation for Solaris Application Nodes” in the chapter “Installation Scripts” of the
“Installation of a FlexFrame Environment” manual.


5. Install the DVD on the second Control Node as well. Here, only some of the tools will
be installed, not the entire Boot Image. The Boot Image has already been copied to
the Filer by installing the DVD on the first Control Node.
mount /media/dvd
cd /media/dvd
./nb_unpack_bi

The mount point /FlexFrame/volFF is used from both Control Nodes


simultaneously.

8.7.3.2 Setup of a Solaris Helper System for Preparation, which is Connected to FlexFrame but Booted from Local Disk
If you don’t have an already running Solaris Application Node, you have to do some
preparations to get a helper system for the creation of the Solaris Boot Image. When you
have finished these preparations you have to go back to section 8.7.3.1 (see page 154)
to build your Solaris Boot Image using the nb_unpack_bi command.
Preparing the helper system which boots from local disk
1. Prerequisites
● The helper system is part of an existing FlexFrame environment. This means:
  The helper system is connected to the FlexFrame network in accordance with the wiring.txt file.
  The hostname of the helper system exists in the FlexFrame environment and must remain the same for the installation of the helper system on local disk.
● Solaris 8 or Solaris 9 is installed on the local disk of the helper system (installed at least using SUNWCall).
● The newest ESF software is installed on the local disk of the helper system.

2. OpenBoot settings for helper system


Check the OpenBoot settings on the helper system. Set boot-device to disk:
{0} ok setenv boot-device disk


3. Console Settings for PRIMEPOWER250/450 Helper System


3.1 Switch the console to the serial port:
In the normal case, use /opt/FJSVmadm/sbin/madmin.
If network settings were done incorrectly and the IP address information that
was set has been lost, the console cannot be connected. However, if this
situation occurs, the following method can be used to forcibly switch from the
standard console (OS console) port to the serial port:
1. Power off the main unit.
2. Connect the console terminal to the serial port (tty-a).
3. Set the MODE switch on the front panel of the main unit to UNLOCK
MODE.
4. Press the POWER SWITCH button, and hold it down for at least 10
seconds. The ONLINE LED and the CHECK LED blink several times to
indicate that the standard console has been switched to the serial port
(tty-a).
5. Confirm that the console screen is displayed on the standard console
terminal that is connected to the serial port (tty-a).
3.2 Configure the XSCF according to the manuals "PRIMEPOWER250 Installation Instructions" resp. "PRIMEPOWER450 Installation Instructions".

4. IP addresses, host names and file shares


4.1 Helper system
Check or create the files hostname.fjgi0 and hostname.fjgi1 in the directory /etc. Both must have the same entry: <AN_name>-st (see the example after this list).
4.2 Helper system
Check the /etc/hosts file and if necessary add the following line:
<storage_IP_of_AN> <AN_name>-st
4.3 Helper system
Add lines for your filer (storage LAN) and your CN (storage-LAN) into the
/etc/hosts file:
<filer_storage_IP> <filer_name>-st
<CN_storage_IP> <CN_name>-st
4.4 Control Node
Check the /etc/hosts file and if necessary add the following line:
<storage_IP_of_AN> <AN_name>-st
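
As an example for 4.1 to 4.3 on the helper system (host names and addresses taken from the boot example in section 8.7.1.5; adapt them to your installation):

echo "an_0800-st" > /etc/hostname.fjgi0
echo "an_0800-st" > /etc/hostname.fjgi1
cat >> /etc/hosts <<EOF
192.168.10.30 an_0800-st
192.168.10.203 <filer_name>-st
192.168.10.201 <CN_name>-st
EOF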


5. Patches for Solaris


The following patches have to be installed on the helper system.
Helper system running Solaris8:
110934-18
110380-06
Helper system running Solaris9:
113713-14
112951-12
115689-01
The patches are included in the DVD "FlexFrame Sources & Tools for Solaris
Application Node for PRIMEPOWER" in the directories RSPC/8_Recommended
respectively RSPC/9_Recommended. Install them if needed using the patchadd
command.
6. SSH connection
6.1 Helper system
Create ssh-keys for RSA encryption:
/usr/bin/ssh-keygen -t rsa

For storing the keys use the default directory (~/.ssh).


It's not necessary to enter passphrases.
The ssh-keygen command creates a pair of keys for rsa encryption in the files
~/.ssh/id_rsa and ~/.ssh/id_rsa.pub.
6.2 Copy the ssh public key of the helper system to the Control Node.
Helper system:
cd /.ssh
scp id_rsa.pub <CN_name>-st:/tmp/<AN_name>_id_rsa.pub

You will be asked to confirm the connection to the Control Node:
The authenticity of host <CN_name> (IP.IP.IP.IP) can't be established.
RSA key fingerprint is 12:f2:7f:c4:ea:98:8b:32:dd:49:02:f5:ae:3b:95:f7.

Are you sure you want to continue connecting (yes/no)?

Answer with yes.


Then you are asked for the password of user root on the Control Node:

root@<CN_name>-st's password:


Type in the password of user root on the Control Node.


The copy process will be done.
As user root, connect to the Control Node:
ssh <CN_name>-st

Then you are asked for the password of user root on the Control Node:

root@<CN_name>-st's password:

Type in the password of user root on the Control Node.


Now on the Control Node:
cat /tmp/<AN_name>_id_rsa.pub >>
/root/.ssh/authorized_keys
chmod 644 /root/.ssh/authorized_keys
rm /tmp/<AN_name>_id_rsa.pub
6.3 Copy the ssh public key of the Control Node to the helper system.
If the files id_rsa.pub and id_rsa do not exist, connect to the Control Node and create them analogously to 6.1.
On helper system copy the rsa public key of the Control Node to the helper
system:
scp <CN_name>-st:/root/.ssh/id_rsa.pub
/tmp/<CN_name>_id_rsa.pub

Install the keys on the helper system:


cat /tmp/<CN_name>_id_rsa.pub >> /.ssh/authorized_keys
chmod 644 /.ssh/authorized_keys
rm /tmp/<CN_name>_id_rsa.pub

7. Further configurations on helper system


7.1 Set the entry PermitRootLogin in the /etc/ssh/sshd_config file to yes to allow user root to establish an ssh connection to the helper system:

PermitRootLogin=yes

7.2 Restart sshd:


/etc/init.d/sshd stop
/etc/init.d/sshd start


8. Check SSH Connections


8.1 Check if ssh connection from the helper system to the Control Node is working
without any interaction (yes or password).
On helper system:
ssh <CN_name>-st

8.2 Check if ssh connection from the Control Node to the helper system is working
without any interaction (yes or password).
On Control Node:
ssh <AN_name>-st

After all these preparations please go back to section 8.7.3.1 (see page 154) to start the
Solaris Boot Image creation using nb_unpack_bi command.

8.8 Solaris Image Maintenance Cycle


This section describes how to run a Maintenance Cycle for an existing Solaris Image.
It does not describe the creation of a new Solaris Image. If you want to create a new
Solaris Image please refer to the manual “Installation of a FlexFrame Environment”,
section “Preparation for Solaris Application Nodes".
The Solaris Image Maintenance Cycle has been redesigned and is now easier
than in former FlexFrame releases. Please read this section carefully.

8.8.1 Introduction
Running a Solaris Image Maintenance Cycle means maintaining an existing Solaris Image: You might want to apply patches, install or upgrade software, modify several configurations or change other settings on a Solaris Image which your Solaris Application Nodes run on.
In order to run a Solaris Image Maintenance Cycle you will need:
● a Solaris Image
● a Solaris Application Node running on that Solaris Image
● a time slot of at least 2 hours
● the software to install and/or a plan of what you want to maintain


The Solaris Application Node selected for running the Solaris Image
Maintenance Cycle won't be visible to FlexFrame i.e. in particular won't serve as
a spare node during that time.

We would like to point out that changes due to 3rd party software, patches or customer modifications may have severe (negative) side effects on the FlexFrame functionality.
Before installing any 3rd party software, please see section "Third Party Software" on page 105.
We would also like to point out that there is software on the market which cannot be installed on NFS or is not ready for Adaptive Computing (e.g. moving SAP services from one node to another). If this is the case, contact your 3rd party software vendor. It is usually not an issue of FlexFrame.

8.8.2 Overview
This section gives an example to show how the Solaris Image Maintenance Cycle works. Basically the Solaris Image Maintenance Cycle comprises just 3 steps:
1. Transfer a Boot Image into a Maintenance Image.
2. Boot a Solaris Application Node from it and maintain it: install or upgrade software,
modify, apply.
3. Transfer the Maintenance Image into a Boot Image and boot all of the Solaris
Application Nodes of that group from the maintained Boot Image.

Example:
Assume you run a couple of Solaris Application Nodes on a certain Image; to make it short, let's call it GOODIMAGE.
Now you decide that you need a certain piece of software, SOFTWARE, on these Solaris Application Nodes.
You know that you can't install software directly on the Solaris Application Nodes:
● On the one hand, the Solaris Application Nodes of a group share the same /usr file system, which is read-only for all of them.
● On the other hand, you don't want to install software on all of them (via the shared file system) without a careful test procedure.
● Besides that, you must prevent FlexFrame from taking over this Solaris Application Node in case another Solaris Application Node fails during the installation of your software SOFTWARE.


Therefore you have to select one of your Solaris Application Nodes and bring it into a so-called maintenance state. This comprises:
1. Choose one Solaris Application Node which is running on the Image you like to
maintain.
2. Save the configuration data of this Solaris Application Node - you might want to bring
it up with the same Internet Address, group, pool, etc.
3. Remove this Solaris Application Node.
4. Create a so called Maintenance Image out of the Solaris Image GOODIMAGE the
chosen Solaris Application Node run on.
5. Create configuration information for this Solaris Application Node.
6. Configure this Solaris Application Node.
7. Boot this Solaris Application Node.
Now this Solaris Application Node runs on a Maintenance Image which is separated from the original GOODIMAGE. All the other Solaris Application Nodes on the GOODIMAGE run without being touched. The Maintenance Image allows software installation since it has, in particular, its own writable /usr file system and the FlexFrame FA Agents do not control it. That's why you can continue now:
8. Install the software SOFTWARE on the Solaris Application Node and test it. Configure, reboot, change settings of SOFTWARE until you are satisfied.
The result is a modified GOODIMAGE which contains the software SOFTWARE. To make it
short let's call it BETTERIMAGE. In order to have the software SOFTWARE available on all
the Solaris Application Nodes in the group you have to:
9. Shut down the Solaris Application Node.
10. Transfer the Maintenance Image BETTERIMAGE back to a Boot Image (now
recognizable by ff_an_adm.pl).
11. Remove the Solaris Application Node you used in Maintenance.
12. Create configuration information for this Solaris Application Node: just like #5 but
now on the BETTERIMAGE instead of on the GOODIMAGE.
13. Configure this Solaris Application Node.
14. Boot this Solaris Application Node.
Now the Maintenance Cycle is actually done, but we have to switch the Solaris Application Nodes onto the new maintained Image.
15. Using ff_an_adm.pl and ff_new_an.sh, set all of the Solaris Application Nodes of the group to the BETTERIMAGE. Just reboot them, and all of them will contain the modifications you made during maintenance.


Switching from GOODIMAGE to BETTERIMAGE costs only a reboot! In case of


problems you can switch back in the same way.
But please bear in mind:
It is very important that all the Solaris Application Nodes of one group have the
same configuration (software, patches) since each one must be able to take
over if another one fails. That's why all of the Solaris Application Nodes of a
group must run on the same image.

8.8.3 Running the Cycle


This section shows in detail how the process works. It includes a real-life example you can easily follow. The numbering refers to the "Overview" section.
Please log on to a FlexFrame Control Node and change the directory to /opt/SMAW/SMAWnbpw/bin.
1. Define the Boot Image we want to run a maintenance cycle on. The Boot Image
could be any Image suitable for running clients on. Normally it should be a proven
Boot Image which clients run successfully on. Our Solaris Boot Image is /FlexFrame
/volFF/os/Solaris/FSC_5.9_904_20060607/bi_FJSV,GPUZC-M_PW-P.
Choose a Solaris Application Node running on that image. This Solaris Application
Node has now to be brought down and will not be available during the Maintenance
Cycle for normal FlexFrame operation. Our Solaris Application Node has the name
kid20_4:

control1: ssh kid20_4 init 0

2. Save the configuration data of this Solaris Application Node. This is helpful since we
need most of the configuration data in the next steps. In addition we might want to
bring up this Solaris Application Node after the Maintenance Cycle with the same
Internet Address, group, pool and so on.
control1: ff_an_adm.pl --op list --name kid20_4
--cmdline > kid20_4.config

For our task the last line of the generated file kid20_4.config is the most
important: it contains the command line used to configure the node.
3. Remove this Solaris Application Node. This allows the use of the configuration data
of kid20_4 during the maintenance cycle.

control1: ff_an_adm.pl --op rem --name kid20_4

4. Create a so called Maintenance Image out of the Boot Image. The command to do
that is nb_boot2maintenance (for details and example output refer to the
respective chapter in the User Guide). The nb_boot2maintenance needs two


parameters to work: The Boot Image we like to maintain and a target directory. The
latter will get the Maintenance Image. Please note that it must not exist and should
get a meaningful name: It must not start with FSC but should start with
MAINTENANCE and reflect the date or version or other tags we need to identify it. We
choose /FlexFrame/volFF/os/Solaris/MAINTENANCE_5.9_904_20060704.

control1: ./nb_boot2maintenance
-s /FlexFrame/volFF/os/Solaris/FSC_5.9_904_20060607/
bi_FJSV,GPUZC-M_PW-P
-d /FlexFrame/volFF/os/Solaris/MAINTENANCE_5.9_904_20060704

The command will need some time to finish.

5. Create configuration information for the Solaris Application Node during its life in the
Maintenance cycle. The command to do this is again ff_an_adm.pl. The command
line to do this looks complicated but most of it is contained in the saved file
kid20_4.config. Just pick the last line but change the following:
First, the ospath parameter to our Maintenance Image path and, second, the name parameter to maintenance:

control1: ff_an_adm.pl --op add --name maintenance


--type PW250 --pool pool1 --group test1 --swgroup 1
--mac 00:E0:00:C5:46:A0,00:E0:00:A6:F4:7B --ospath
Solaris/MAINTENANCE_5.9_904_20060704 --host 129,130,131

In this exceptional case, please ignore the instruction ff_new_an.sh -n maintenance given by ff_an_adm.pl!
The command writes a file
/tftpboot/config/netboot_pool1_maintenance.cfg
we will need later.
6. Configure this Solaris Application Node in order to make it ready to boot. The
command to do this is nb_configure_client. This command reads the
configuration file we got on the fly from the last ff_an_adm.pl call. Don't forget to
type in the -c parameter: it will run the command in cluster mode and therefore not
harm the cluster consistency monitored by RMS. Don't forget to type in the -M
parameter: it will run the command in maintenance mode resulting in writable /usr
file systems later on.
control1: ./nb_configure_client
-M -c -r /tftpboot/config/netboot_pool1_maintenance.cfg

7. Now we are able to boot the Solaris Application Node.


{0} ok boot -dv


8. Since the Solaris Application Node runs on a separate Image with writable
filesystems the real maintenance can start. We can apply patches, install or upgrade
software, modify several configurations or change other settings on the system. We
should carefully check the system after applying the changes and respect the
warning we read in the “Introduction” section.
Any change we do will remain in the image i.e. any Solaris Application Node
which will run on that image later on will "inherit" everything. This is good for
sure in case of intended changes. But keep an eye on unwanted log files,
notes, and so on. Clean them up.
9. After the test run of the Solaris Application Node on the Maintenance Image we have
to shut it down again. Please do this using the console of the Solaris Application
Node:
maintenance-st console # init 0

10. Now we have to transfer the Maintenance Image into a Boot Image. The command
to do that is nb_maintenance2boot (for details and example output refer to the
respective chapter in the User Guide). This is the counterpart to
nb_boot2maintenance and transfers the Maintenance Image into a Boot Image
by moving appropriate contents and cleaning up the Image. The command reads the
well known configuration file:
control1: ./nb_maintenance2boot
-r /tftpboot/config/netboot_pool1_maintenance.cfg

11. Now we will bring our former Solaris Application Node up again and therefore have to release its configuration first: Remove the Solaris Application Node - the
Node in Maintenance state is no longer needed.
control1: ff_an_adm.pl --op rem --name maintenance

12. Create configuration information for the Solaris Application Node kid20_4 in order
to bring it back to life. The command to do this is again ff_an_adm.pl. The
command line to do this looks complicated but most of it is contained in the saved file
kid20_4.config. Just pick the last line but change the ospath parameter to our
Maintenance Image path:
control1: ff_an_adm.pl --op add --name kid20_4 --type PW250
--pool pool1 --group test1 --swgroup 1
--mac 00:E0:00:C5:46:A0,00:E0:00:A6:F4:7B
--ospath Solaris/MAINTENANCE_5.9_904_20060704
--host 129,130,131

13. Create the Solaris Application Node Image


control1: ff_new_an.sh -n kid20_4


14. Now we are able to boot the Solaris Application Node, but we have to consider: The test run we did was a test run without the specific FlexFrame software. The ff_new_an.sh -n kid20_4 call enabled the FlexFrame functions. Therefore, after coming up, the FlexFrame FA Agents will recognize the Application Node. In order to prevent this new Application Node from being treated as a spare node, it is wise to hide the Node. Disable the respective link by renaming it:
control1: cd /FlexFrame/volFF/os/Solaris/MAINTENANCE_5.9_904
_20060704/bi_FJSV,GPUZC-M_PW-P/root/kid20_4-st/etc/rc3.d
control1: mv S20myAMC.FA_AppAgent _S20myAMC.FA_AppAgent

Now we boot the Solaris Application Node:


{0} ok boot -dv

The kid20_4 system will come up and will contain all the changes we made during
maintenance. It runs in the FlexFrame environment without serving as a spare node. If
this also works without any problems, we are done.
The Maintenance cycle is actually finished but we have to switch the Solaris Application
Nodes onto the new maintained Image.

8.8.4 Create Solaris Application Nodes on the Maintained Image

The very last step is to assign the new - maintained - Image to all the other Solaris
Application Nodes. E.g. we have the Nodes kid20_1, kid20_2, and kid20_3 in our
environment. Since we only have to change their Boot Image path it is easy to do that:
control1: ff_an_adm.pl --op os --name kid20_1
--ospath Solaris/MAINTENANCE_5.9_904_20060704
control1: ff_an_adm.pl --op os --name kid20_2
--ospath Solaris/MAINTENANCE_5.9_904_20060704
control1: ff_an_adm.pl --op os --name kid20_3
--ospath Solaris/MAINTENANCE_5.9_904_20060704

After that, launch ff_new_an.sh as in step 13.

control1: ff_new_an.sh -n kid20_1


control1: ff_new_an.sh -n kid20_2
control1: ff_new_an.sh -n kid20_3

Up to this point the Solaris Application Nodes can still operate on the previous
version of the Solaris Image. Finally we need only to reboot the Solaris
Application Nodes to change their Image from the previous to the new
maintained Image.


control1: ssh kid20_1 init 6


control1: ssh kid20_2 init 6
control1: ssh kid20_3 init 6

And now we can activate the kid20_4 as a fully functional FlexFrame Solaris Application
Node by moving back the link and rebooting the system:
control1: cd /FlexFrame/volFF/os/Solaris/MAINTENANCE_5.9_904
_20060704/bi_FJSV,GPUZC-M_PW-P/root/kid20_4-st/etc/rc3.d
control1: mv _S20myAMC.FA_AppAgent S20myAMC.FA_AppAgent
control1: ssh kid20_4 init 6

Since all the Solaris Application Nodes of a group must have the very same
software, it is recommended to bring up the Solaris Application Nodes on the
new maintained Image within a short period of time.

8.9 Image Customization for Experts


If new software has been brought into the image, it may have to be configured individually
for each Application Node. Here is how to achieve this:
In /opt/SMAW/SMAWnbpw/bin/nbrc.FF31_00 the customer may create scripts on
both Control Nodes in the form nb_customn_<NR>, where <NR> is a two-digit number.
There are sample scripts in the same folder.
The pool name can be read from $POOL_NAME, but there is currently no variable denoting
the group name within a pool.
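A minimal sketch of such a customization script, assuming the hypothetical number 01 and a hypothetical application whose configuration differs per pool (the application name, its config file and the branch values are illustrative only; see the sample scripts in the same folder for the authoritative conventions):

#!/bin/sh
# illustrative script /opt/SMAW/SMAWnbpw/bin/nbrc.FF31_00/nb_customn_01
# POOL_NAME is provided by the calling framework as described above
case "$POOL_NAME" in
  pool1) echo "loglevel=debug" > /etc/myapp.conf ;;  # hypothetical config file
  *)     echo "loglevel=info"  > /etc/myapp.conf ;;
esac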
The scripts must be backed up carefully since a new Control Node image or Solaris
image may remove the complete folder to install new scripts of the software package.
For detailed information, please take a look at the “FlexFrame Installation Guide”, section
“Script: nb_customize_client”.


8.10 Troubleshooting

8.10.1 Solaris Image – Traces of Solaris rc-scripts


If the Application Node does not come up (e.g. “hangs” during execution of an rc-script), it
is useful to know which rc-script is currently being executed. To get this information you
need to create a file /etc/.trace.sd in the root file system of the Application Node
(e.g. from the Control Node), for example:
control1:~ # touch /FlexFrame/volFF/os/Solaris/
FSC_5.8_202_20050211/bi_FJSV,GPUZC-M_PW-P/root/
an_0800-st/etc/.trace.sd

The messages look like:

##### rc2: /etc/rc2.d/S01SMAWswapmirror start


##### rc2: /etc/rc2.d/S05RMTMPFILES start
##### rc2: /etc/rc2.d/S06log3 start
Starting the logging daemon
Started!
##### rc2: /etc/rc2.d/S10lu start
##### rc2: /etc/rc2.d/S19FJSVdmpsnap start
##### rc2: /etc/rc2.d/S20sysetup start

The messages can be turned off by removing the file .trace.sd again, for example:

control1:~ # rm /FlexFrame/volFF/os/Solaris/
FSC_5.8_202_20050211/bi_FJSV,GPUZC-M_PW-P/root/
an_0800-st/etc/.trace.sd

8.10.2 Problems With /usr During Maintenance Cycle


Please verify that the /usr file system is mounted read-write. You can test this by
checking the output of the mount command, or by trying to create a file in /usr:

an_Solaris# touch /usr/testfile

If an error message like

touch: /usr/testfile cannot create

appears, the /usr file system is still mounted read-only.


Make sure to remove the testfile to avoid confusion.

an_Solaris# rm /usr/testfile

8.10.3 Boot Hangs With “Timeout waiting…”


If the boot process does not continue after the message Timeout waiting..., check
whether the rarpd process is running on the Control Node and whether an appropriate
entry is listed in /etc/ethers. Another cause could be that the network interface of the
Application Node is not in the network segment the rarpd of the Control Node is
listening on. The file /etc/ethers should not be modified manually, but by using
ff_new_an.sh instead.
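A quick way to perform these checks on the Control Node (the MAC address below is just an example):

cn1:~ # ps -ef | grep rarpd
cn1:~ # grep -i 00:E0:00:C5:46:A0 /etc/ethers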

8.10.4 Boot Hangs After “router IP is…”


If the boot process stops right after this message, the root file system may not have been
found or the access permissions in /vol0/etc/exports of the Filer are not set
correctly. Another possibility could be network segment issues.

8.10.5 Boot Stops Complaining About /usr


Check the client's /etc/vfstab.

8.10.6 Boot Hangs When Trying to Mount /usr


Hostname: pw250f-1a
nfs mount: file1a: RPC: Rpcbind failure - RPC: Success
nfs mount: retrying: /usr

Check /usr in the vfstab and the Filer’s exports. Check that it is actually exported via
exportfs. Check /etc/hosts.
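If the showmount command is available on the Control Node, the Filer's current export list can also be checked from there (the Filer name below is an example):

cn1:~ # showmount -e filer1-st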

8.10.7 Boot Asks for Date, Time and/or Locale


The boot phase should run until you see a login prompt. If the Application Node asks for
input (like time, date, locale etc.), something went wrong: the settings in the Application
Node's ../etc/sysidcfg file could not be interpreted correctly.


8.11 Pools and Groups

8.11.1 Adding a Pool


A new pool can be added using the maintenance tool ff_pool_adm.pl. Some
parameters have to be defined on the command line. They are used to configure switch
VLANs and ports, to create the Filer volume folder structures, to create the LDAP pool
subtree and to configure the Control Nodes.
Adding a pool changes the exports file on all given Filers. Temporary exports
(not written to the exports file /vol0/etc/exports) on these Filers will be
gone after running ff_pool_adm.pl. Be sure not to have temporary exports.

Synopsis

ff_pool_adm.pl
--op add --name <pool_name>
--storage <vlan_id>,<network_ip>,<netmask>
--server <vlan_id>,<network_ip>,<netmask>
--client <vlan_id>,<network_ip>,<netmask>
--dns <domain_name>,[<dns_server_ip>]
[--sapdata <nas_name>[,<volume_path>]]
[--saplog <nas_name>[,<volume_path>]]
[--volff <nas_name>[,<volume_path>]]
[--volff_common <nas_name>[,<volume_path>]]
[--defrouter <default_router_ip>] [--switchgrp <id>[,<id>]]

Command Options
--op add
Adds a pool.
--name <pool_name>
Name of new pool (has to be unique within entire FlexFrame). We recommend using
short lowercase names for <pool_name>.
--storage <vlan_id>,<network_ip>,<netmask>
Pool specific storage network segment. The option is followed by a comma
separated list of the VLAN ID, the network IP and the netmask of the network IP
address.
--server <vlan_id>,<network_ip>,<netmask>
Pool specific server network segment. The option is followed by a comma separated
list of the VLAN ID, the network IP and the netmask of the network IP address.


--client <vlan_id>,<network_ip>,<netmask>
Pool specific client network segment. The option is followed by a comma separated
list of the VLAN ID, the network IP and the netmask of network IP.
--dns <domain_name>,[<dns_server_ip>]
DNS domain name and servers to be used for this pool. More than one DNS server
IP address may be given. Keep in mind to use the default router option if the server
IP addresses do not match any of the pool networks. A DNS option may look like
this: my.domain.com,192.168.251.17,192.168.251.18
--sapdata <nas_name>[,<volume_path>]
Optional NAS name and volume path the pool should use for sapdata. A missing
volume path is auto-filled with the default (/vol/sapdata/<pool_name>). e.g.
filer1,/vol/sapdata/pool1. The entire option defaults to common NAS name
with default path. <nas_name> is the Filer's node name for this pool (without
-st suffix).
--saplog <nas_name>[,<volume_path>]
Optional NAS name and volume path the pool should use for saplog. A missing
volume path is auto filled with the default (/vol/saplog/<pool_name>). e.g.
filer1,/vol/saplog/pool1. The entire option defaults to common NAS name
with default path. <nas_name> is the Filer's node name for this pool (without
-st suffix).
--volff <nas_name>[,<volume_path>]
Optional NAS name and volume path the pool should use for volFF. A missing
volume path is auto filled with the default (/vol/volFF/pool-<pool_name>). e.g.
filer1,/vol/volFF/pool-pool1. The entire option defaults to common NAS
name with default path. <nas_name> is the Filer's node name for this pool (without
-st suffix).
--volff_common <nas_name>[,<volume_path>]
Optional NAS name and volume path the pool should use for common volFF data. A
missing volume path is auto filled with the default (/vol/volFF). e.g.
filer1,/vol/volFF. The entire option defaults to common NAS name with default
path. <nas_name> is the Filer's node name for this pool (without -st suffix).
Currently it has to be the first Filer of the FlexFrame landscape and
/vol/volFF/pool-<pool_name>.

--defrouter <default_router_ip>
The default router is a gateway to route IP data to other, non-pool local networks. All
IP data that can not be addressed to a local network will be sent to the default router
to be forwarded to the destination network. The option parameter is an IP address of
this default router. Keep in mind to use a default router IP address matching one of
the local pool networks, because otherwise it will not be accessible by Application
Nodes.


--switchgrp <id>[,<id>]
The switch group ID(s) on which the Client LAN to corporate LAN ports should be
configured. If not given, the client VLAN is assigned to the existing trunk ports or to a
new port pair on the first of both switch groups. No more than two switch group IDs
are accepted.

Command Output
The command displays information about processing steps, errors and warnings. The
output may look like this:
cn1:~ # ff_pool_adm.pl --op add --name pool4
--storage 30,192.168.30.0,255.255.255.0
--server 31,192.168.31.0,255.255.255.0
--client 32,192.168.32.0,255.255.255.0 --sapdata filer
--saplog filer --volff filer --volff_common filer
--dns my.domain.com --defrouter 192.168.32.254
update LDAP
..........................
..........................
..........................
..........................
..........................
..........................
.....
update switch 1/1 configuration
Notice: Update will take about 1 minute.
vlan: storage-30 has been created
restart cluster service ldap_srv1
Notice: restart will take up to 1 minute.
stop and wait until service is offline
start and wait until service is online
restart cluster service ldap_srv2
Notice: restart will take up to 1 minute.
stop and wait until service is offline
start and wait until service is online
restart cluster service netboot_srv
Notice: restart will take up to 2 minutes.
stop and wait until service is offline
start and wait until service is online

If not reported any warnings or errors all precautions are done


and the pool was successfully created.
Use ff_poolgroup_adm.pl to define the host groups of this pool
to be able to add application nodes.


See /tmp/pool-pool4/ff_pool_adm.errlog for complete error and


warning log.

8.11.2 Removing a Pool


A pool can be removed using the maintenance tool ff_pool_adm.pl. Some
parameters have to be defined on the command line. Switch VLANs will be removed and
the affected ports reconfigured. The LDAP pool subtree will be removed and Control
Node configurations rewritten.
A pool may not be removed if any Application Node or SID is defined. The first
pool may not be removed due to system requirements.
Removing a pool changes the exports file on all Filers used by this pool. Use
operation mode list or list-all to get the storage configuration of the pool to
be removed. Temporary exports (not written to the exports file
/vol0/etc/exports) on these Filers will be gone after running
ff_pool_adm.pl.
Be sure not to have temporary exports.

Synopsis

ff_pool_adm.pl --op rem --name <pool_name>

Command Options
--op rem
Removes a pool.
--name <pool_name>
Name of pool to be removed. Use ff_pool_adm.pl --op list-all to get a list
of currently configured pools (see 8.11.4).

Command Output
The command displays only errors and warnings. The output may look like this:
cn1:~ # ff_pool_adm.pl --op rem --name pool4
update LDAP
..........................
..........................
..........................
..........................
..........................
..........................


update switch 1/1 configuration


Notice: Update will take about 1 minute.
restart cluster service ldap_srv1
Notice: restart will take up to 1 minute.
stop and wait until service is offline
start and wait until service is online
restart cluster service ldap_srv2
Notice: restart will take up to 1 minute.
stop and wait until service is offline
start and wait until service is online
restart cluster service netboot_srv
Notice: restart will take up to 2 minutes.
stop and wait until service is offline
start and wait until service is online

If not reported any warnings or errors the pool was successfully


removed.
Keep in mind, the volumes and their data were not harmed. It's on
you to remove them.

See /tmp/pool-pool4/ff_pool_adm.errlog for complete error and


warning log.

8.11.3 Listing Pool Details


To list the configuration details of a pool, like used networks, pool groups, SIDs and
Application Nodes, the maintenance tool ff_pool_adm.pl can be used. The pool name
has to be defined on the command line.

Synopsis

ff_pool_adm.pl --op list --name <pool_name>


[--list <part>[,<part>]]

Command Options
--op list
Lists pool details
--name <pool_name>
Name of the pool to be listed. Use ff_pool_adm.pl --op list-all to get a list of
currently configured pools (see 8.11.4).


--list <part>[,<part>]
To reduce the output to the interesting parts, use this option. The parameters to this
option are the display sections. Add them as a comma separated list. The default
sections are: network,storage,dns,group,sid,an,cn,filer. You may also use a two
character abbreviation instead of the full section name, like ne for network.

Command Output
The command displays the pool configuration details. The output may look like this:
cn1:/opt/FlexFrame/bin # ff_pool_adm.pl --op list --name p1
Pool configuration details of pool p1

Networks
Client-LAN
Network: 10.10.10.0 Netmask: 255.255.255.0 VLAN ID: 100
Server-LAN
Network: 192.168.10.0 Netmask: 255.255.255.0 VLAN ID: 110
Storage-LAN
Network: 192.168.20.0 Netmask: 255.255.255.0 VLAN ID: 120

Def.Router: 192.168.10.254

Storage Volumes
sapdata fas01-p1-st:/vol/sapdata/p1
saplog fas01-p1-st:/vol/saplog/p1
volFF fas01-p1-st:/vol/volFF/pool-p1
volFF shared fas01-p1-st:/vol/volFF

DNS data
Domain Name: my.domain.com

Pool Groups
Linux
OS: SuSE Linux SLES-9.X86_64
Solaris
OS: Sun SunOS 5.8

SIDs and their instances


D01
SAP Version: SAP-6.20 DB Version: Oracle-9
Instances
Type db
ID 0 Server-LAN: dbd01-se 192.168.10.110
Type ci


ID 9 Client-LAN: cid01 10.10.10.111


Server-LAN: cid01-se 192.168.10.111
Type app
ID 10 Client-LAN: app10d01 10.10.10.112
Server-LAN: app10d01-se 192.168.10.112
ID 11 Client-LAN: app11d01 10.10.10.113
Server-LAN: app11d01-se 192.168.10.113
P01
SAP Version: SAP-6.20 DB Version: Oracle-9
Instances
Type db
ID 0 Server-LAN: dbp01-se 192.168.10.100
Type ci
ID 0 Client-LAN: cip01 10.10.10.101
Server-LAN: cip01-se 192.168.10.101
Type app
ID 1 Client-LAN: app01p01 10.10.10.102
Server-LAN: app01p01-se 192.168.10.102
ID 3 Client-LAN: app03p01 10.10.10.103
Server-LAN: app03p01-se 192.168.10.103
ID 4 Client-LAN: app04p01 10.10.10.104
Server-LAN: app04p01-se 192.168.10.104
ID 5 Client-LAN: app05p01 10.10.10.105
Server-LAN: app05p01-se 192.168.10.105
Q01
SAP Version: SAP-6.20 DB Version: Oracle-9
Instances
Type db
ID 0 Server-LAN: dbq01-se 192.168.10.106
Type ci
ID 6 Client-LAN: ciq01 10.10.10.107
Server-LAN: ciq01-se 192.168.10.107
Type app
ID 7 Client-LAN: app07q01 10.10.10.108
Server-LAN: app07q01-se 192.168.10.108
ID 8 Client-LAN: app08q01 10.10.10.109
Server-LAN: app08q01-se 192.168.10.109

Application Nodes
PW250-1
Type: PW250
OS: Sun SunOS 5.8
Group: Solaris
Client-LAN PW250-1 10.10.10.33
local test1 PW250-1-clt1 10.10.10.34


local test2 PW250-1-clt2 10.10.10.35


Server-LAN PW250-1-se 192.168.10.33
local test1 PW250-1-set1 192.168.10.34
local test2 PW250-1-set2 192.168.10.35
Storage-LAN PW250-1-st 192.168.20.33
local test1 PW250-1-stt1 192.168.20.34
local test2 PW250-1-stt2 192.168.20.35
blade01
Type: BX600 Cabinet ID: 1 Slot/Partition ID: 1
OS: SuSE Linux SLES-9.X86_64
Group: Linux
Client-LAN blade01 10.10.10.23
Server-LAN blade01-se 192.168.10.23
Storage-LAN blade01-st 192.168.20.23
blade02
Type: BX600 Cabinet ID: 1 Slot/Partition ID: 2
OS: SuSE Linux SLES-9.X86_64
Group: Linux
Client-LAN blade02 10.10.10.24
Server-LAN blade02-se 192.168.10.24
Storage-LAN blade02-st 192.168.20.24

blade03
Type: BX600 Cabinet ID: 1 Slot/Partition ID: 3
OS: SuSE Linux SLES-9.X86_64
Group: Linux
Client-LAN blade03 10.10.10.25
Server-LAN blade03-se 192.168.10.25
Storage-LAN blade03-st 192.168.20.25
pw250-2
Type: PW250
OS: Sun SunOS 5.8
Group: Solaris
Client-LAN pw250-2 10.10.10.36
local test1 pw250-2-clt1 10.10.10.37
local test2 pw250-2-clt2 10.10.10.38
Server-LAN pw250-2-se 192.168.10.36
local test1 pw250-2-set1 192.168.10.37
local test2 pw250-2-set2 192.168.10.38
Storage-LAN pw250-2-st 192.168.20.36
local test1 pw250-2-stt1 192.168.20.37
local test2 pw250-2-stt2 192.168.20.38
rx801
Type: RX800
OS: SuSE Linux SLES-9.X86_64


Group: Linux
Client-LAN rx801 10.10.10.2
Server-LAN rx801-se 192.168.10.2
Storage-LAN rx801-st 192.168.20.2

Control Nodes
cn1
Client-LAN cn1-p1 10.10.10.10
Server-LAN cn1-p1-se 192.168.10.10
Storage-LAN cn1-p1-st 192.168.20.10
cn2
Client-LAN cn2-p1 10.10.10.11
Server-LAN cn2-p1-se 192.168.10.11
Storage-LAN cn2-p1-st 192.168.20.11

Filer Nodes
fas01-p1
Storage-LAN fas01-p1-st 192.168.20.14

A sample with a reduced output:


cn1:/opt/FlexFrame/bin # ff_pool_adm.pl --op list --name p1
--list ne,gr
Pool configuration details of pool p1

Networks
Client-LAN
Network: 10.10.10.0 Netmask: 255.255.255.0 VLAN ID: 100
Server-LAN
Network: 192.168.10.0 Netmask: 255.255.255.0 VLAN ID: 110
Storage-LAN
Network: 192.168.20.0 Netmask: 255.255.255.0 VLAN ID: 120

Def.Router: 192.168.10.254

Pool Groups
Linux
OS: SuSE Linux SLES-9.X86_64
Solaris
OS: Sun SunOS 5.8


8.11.4 Listing all Pools


To display an overview of all pools with their used networks, pool groups, SIDs and
Control Node and Filer interfaces, the maintenance tool ff_pool_adm.pl can be used.
No arguments except the operation mode have to be defined on the command line.

Synopsis

ff_pool_adm.pl --op list-all [--list <part>[,<part>]]

Command Options
--op list-all
Displays all configured pools.
--list <part>[,<part>]
To reduce output to interesting parts use this option. The parameters to this option
are the display sections. Add them as a comma separated list. The default sections
are: network,storage,group,sid,cn,filer. You may also use a two character
abbreviation instead of the full section name, like ne for network.

Command Output
The command displays the pool configuration details. The output may look like this:
cn1:/opt/FlexFrame/bin # ff_pool_adm.pl --op list-all
Pool configurations

p1
Pool Networks
Client-LAN
Network: 10.10.10.0 Netmask: 255.255.255.0 VLAN ID: 100
Server-LAN
Network: 192.168.10.0 Netmask: 255.255.255.0 VLAN ID: 110
Storage-LAN
Network: 192.168.20.0 Netmask: 255.255.255.0 VLAN ID: 120

Pool Storage Volumes


sapdata fas01-p1-st:/vol/sapdata/p1
saplog fas01-p1-st:/vol/saplog/p1
volFF fas01-p1-st:/vol/volFF/pool-p1
volFF shared fas01-p1-st:/vol/volFF


Pool Groups
Linux
OS: SuSE Linux SLES-9.X86_64
Solaris
OS: Sun SunOS 5.8

Pool SIDs
D01
SAP Version: SAP-6.20 DB Version: Oracle-9
P01
SAP Version: SAP-6.20 DB Version: Oracle-9
Q01
SAP Version: SAP-6.20 DB Version: Oracle-9

Pool Control Node Interfaces


cn1
Client-LAN cn1-p1 10.10.10.10
Server-LAN cn1-p1-se 192.168.10.10
Storage-LAN cn1-p1-st 192.168.20.10

cn2
Client-LAN cn2-p1 10.10.10.11
Server-LAN cn2-p1-se 192.168.10.11
Storage-LAN cn2-p1-st 192.168.20.11

Pool Filer Node Interfaces


fas01-p1
Storage-LAN fas01-p1-st 192.168.20.14

A sample with a reduced output on a single pool configuration:


cn1:/opt/FlexFrame/bin # ff_pool_adm.pl --op list-all
--list sid,group
Pool configurations

p1
Pool Groups
Linux
OS: SuSE Linux SLES-9.X86_64
Solaris
OS: Sun SunOS 5.8

Pool SIDs
D01
SAP Version: SAP-6.20 DB Version: Oracle-9


P01
SAP Version: SAP-6.20 DB Version: Oracle-9
Q01

8.11.5 Adding a Group to a Pool


A pool may have more than one group. To add a group to a pool, the maintenance tool
/opt/FlexFrame/bin/ff_poolgroup_adm.pl can be used. To associate a group
to a pool, some parameters have to be defined on the command line. Groups are used
with the FlexFrame Autonomous Agents.
The group is added to a pool in the LDAP database. Since the FlexFrame Autonomous
Agents are not able to read pool and group configurations from LDAP, groups have to be
configured manually in the FA Agent configurations to take effect.

Synopsis

ff_poolgroup_adm.pl --op add --pool <pool_name>


--group <group_name> --ostype {Linux|SunOS}
--osversion <version_string>
--osvendor {Sun|SuSE}

Command Options
--op add
Adds a group to a pool.
--pool <pool_name>
Name of pool the group should be added. We recommend using short lowercase
names for <pool_name>.
--group <group name>
The name of the group to be added.
--ostype {Linux|SunOS}
Type of operating system (OS) the systems of this group work with. Currently only
Linux and SunOS (Solaris) are valid choices.
--osversion <version_string>
The version of the OS. Can be omitted if only one image version (Linux or Solaris) is
installed.
--osvendor {Sun|SuSE}
The vendor of the OS. Currently only Sun (Solaris) and SuSE (Linux) are supported.
Can be omitted if only one image version (Linux or Solaris) is installed.
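An example invocation (the pool name, group name and OS version string are illustrative and must match the images installed in your landscape):

cn1:~ # ff_poolgroup_adm.pl --op add --pool pool1 --group linux1
--ostype Linux --osvendor SuSE --osversion SLES-9.X86_64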


8.11.6 Removing Pool Group


To remove a group from a pool, use the maintenance tool
/opt/FlexFrame/bin/ff_poolgroup_adm.pl. You have to define the pool and the
group name on the command line.
The group is removed from a pool in the LDAP database. Since the FlexFrame
Autonomous Agents are not able to read pool and group configurations from LDAP,
groups have to be configured manually in the FA Agent configurations to take effect.

Synopsis

ff_poolgroup_adm.pl --op rem --pool <pool_name>


--group <group_name>

Command Options
--op rem
Removes a group from a pool.
--pool <pool_name>
Name of the pool the group should be removed from.
--group <group_name>
The name of the group to be removed.
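Example, removing the hypothetical group used in the example above:

cn1:~ # ff_poolgroup_adm.pl --op rem --pool pool1 --group linux1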

8.11.7 Changing Group Assignment of Application Nodes


Change the assigned pool group of an Application Node with
/opt/FlexFrame/bin/ff_an_adm.pl. Command line arguments are the Application
Node name for which the pool group should be changed and the name of the new pool
group.
The command changes the pool group of the Application Node in the LDAP database. The
configuration of FA Agents currently has to be changed manually.

Synopsis

ff_an_adm.pl --op group --name <node_name> --group <group_name>

Command Options
--op group
Changes pool group of Application Node.


--name <node_name>
Name of the Application Node whose pool group is to be changed.
--group <group_name>
The name of the pool group the Application Node should be assigned to. Use
ff_pool_adm.pl --op list-all to get available pool groups (see 8.11.4).
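An example invocation, assuming the Application Node blade01 is to be moved to a hypothetical pool group named linux1:

cn1:~ # ff_an_adm.pl --op group --name blade01 --group linux1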

8.11.8 Changing Group and Pool Assignment of Application Nodes

There is currently no maintenance tool to do this. The recommended way is to remove the
Application Node and add it again with the new pool and group name.
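A sketch of this procedure, reusing the command forms shown earlier in this chapter; the node name, pool, group, MAC addresses, OS path and host IDs are examples and must be replaced with the values of your landscape:

cn1:~ # ff_an_adm.pl --op rem --name kid20_4
cn1:~ # ff_an_adm.pl --op add --name kid20_4 --type PW250
--pool pool2 --group test2 --swgroup 1
--mac 00:E0:00:C5:46:A0,00:E0:00:A6:F4:7B
--ospath Solaris/MAINTENANCE_5.9_904_20060704
--host 129,130,131
cn1:~ # ff_new_an.sh -n kid20_4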

8.12 The Hosts Database


It may become necessary to have additional entries in the hosts database. Those entries
may be required by 3rd-party products installed on customized Application Node images.
The hosts database is stored in LDAP. To maintain the additional host entries use the
command ff_hosts.sh.

You cannot remove names or addresses which are essential to the FlexFrame
landscape.

Each pool has its own hosts database. Therefore you have to maintain each
pool individually.

8.12.1 Script: ff_hosts.sh


This tool allows the administrator to list, add and delete host names or aliases to the
LDAP database. Note: When adding host names with an IP address that matches one of
the pool's network segments or the Control LAN segment, the list of IP addresses for that
segment gets extended by the IP address of this host name to prevent automatic
allocation of the same IP address by other FlexFrame tools.
Only one option of -l, -a or -r can be used at one time.


Synopsis

ff_hosts.sh [-d] -p <pool_name> [{-l|-a <ip> -n <name>|-r <name>}]

Command Options
-d debug. This option will log debug information which can be found in the file
/tmp/ff_hosts.DEBUGLOG
-l List all the hosts entries for the pool as provided by option -p.
-p <pool_name>
Pool to which the host name should be added.
-a <ip>
Add the IP address <ip>. Has to be used together with option -n. If an entry with the
IP address <ip> already exists, the name provided will be added as an alias.
-n <name>
Host name <name> will be added to the list.
-r <name>
Deletes the host name or alias <name>. The host name cannot be deleted if it has
additional aliases. Remove the aliases first.

Debugging
/tmp/ff_hosts.DEBUGLOG will hold debugging information. In case of problems,
please provide this file.
Examples:
The following example will list all additional hosts entries for pool poolname created
using this tool:
cn1:~ # ff_hosts.sh -l -p poolname

The following example will add a host newhost with the IP address 1.2.3.4 to the
pool poolname:

cn1:~ # ff_hosts.sh -p poolname -a 1.2.3.4 -n newhost

The following example will remove the hosts entry for newhost:

cn1:~ # ff_hosts.sh -p poolname -r newhost


8.13 Rebooting All Application Nodes


There may be disaster situations where the Application Nodes remain in an undefined or
uncertain state. This may be caused by a loss of the NFS storage system. Once the NFS
storage is back online again, the Application Nodes need to be restarted. This can be
done using the tool ff_reboot_all_an.sh. Note that this tool will not shut down the
systems gracefully. If the command is used during regular operations, it can lead to loss
of data.
The purpose of this tool is to support the administrator in a disaster scenario. In case all
Application Nodes need to be rebooted, this command can be used to issue the power-off
and power-on commands to the Application Nodes. The user has to enter Yes (capital Y)
to all questions. Otherwise the program will not continue and will not send any
commands to the Application Nodes. The program will display the affected Application
Nodes as read from the LDAP database. Then, the power-off commands are sent to each
Application Node. After a delay of 5 seconds, the power-on commands are sent to each
Application Node.
Currently only servers of the following types are supported:
● PRIMEPOWER 250
● PRIMEPOWER 450
● PRIMERGY BX* series (blades)

Synopsis

ff_reboot_all_an.sh

Debugging
/tmp/ff_reboot_all_an.DEBUGLOG will hold debugging information. In case of
problems, please provide this file.



9 Security

9.1 Requested Passwords and Password Settings During Installation

9.1.1 Requested Passwords During Installation of the Control Nodes

Entering all passwords is mandatory. For IPMI and LDAP, passwords have to be equal
on both Control Nodes.

Please note that the passwords are displayed on the screen – the input will not
be hidden!


9.1.2 Setting Passwords During Installation of a NetApp Filer

Setting the administrative (root) password for <FILER-NAME> ...
New password:
Retype new password:

9.1.3 Initial SSH Configuration


In order to initialize ssh authorization between the Control Nodes, call

/opt/FlexFrame/bin/ff_ssh_tool.sh -i
You will be asked to enter the passwords of both Control Nodes.
cn1:~ # ff_ssh_tool.sh -i
Copyright (C) 2005 Fujitsu Siemens Computers. All rights reserved.

Would you like to (re)initialize SSH authorization of the Control


Nodes?
Warning: If you enter "yes", current SSH authorization will be
overwritten.
That means you will lose your authorized_keys and known_hosts
files.
Note that /etc/hosts from this node will be transferred to the
other Control Node if it is different.

It is recommended to backup theses files on both Control Nodes


before you continue. Following files will be affected:
/root/.ssh/authorized_keys /root/.ssh/known_hosts /etc/hosts

Continue? yes

You are on first Control Node (cn1)


Enter password for "cn1" (this node):
Enter password for "cn2" (leave empty for same password):

~~~ Initial test of ssh connection between Control Nodes ~~~


Testing ssh to this Control Node (cn1)
SSH connection to "cn1" works

Testing ssh to other Control Node (cn2)


SSH connection to "cn2" works


~~~ ssh connection works, configuring authorization ~~~


root@cn2's password:

~~~ authorization complete, testing all addresses... ~~~


SSH connection to "..." works
SSH connection to "..." works
SSH connection to "..." works
...
...

~~~ transferring known_hosts file to other Control Node ~~~


SSH initialization is complete

~~~ syncing /etc/hosts with other Control Node ~~~


Copying hosts to "cn2"

ssh initialization finished successfully.

9.2 Password Settings During Operation, Update and Upgrade

9.2.1 User Administration


In order to change user and/or group IDs as well as passwords, several tools can be
utilized.
For changing user passwords on Solaris-based Application Nodes the standard passwd
tool can be used. On Linux based Application Nodes this tool will not work and
ff_ldappasswd.sh should be used instead.
ff_ldappasswd.sh will retrieve the current user, will prompt the user for a new
password and change it accordingly.
Do not use ldappasswd, it will leave your LDAP password entries unusable.

The default password for all Unix users is password.


In order to change Unix user passwords from a Control Node use

ff_ldappasswd.sh -l <login_name> -p <pool_name> [-r]


The -r option enables the root user to change a password without authentication by using
the rootdn of the LDAP server, i.e. the tool will retrieve the rootdn and its password for
authorization from the configuration files.
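For example, to change the password of a hypothetical user p01adm in pool pool1 without entering the old password:

cn1:~ # ff_ldappasswd.sh -l p01adm -p pool1 -r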


In order to change the password for the LDAPadmins use -p LDAPadmins instead of
providing a pool name, and provide admins or replicate as the user option's value. This
will change passwords of admins or replicate LDAP users.
In cases where existing SAP systems are transferred into FlexFrame or recovered from an
earlier installation, it might be necessary to change user or group IDs to different values
than the ones provided by the installation tools. In order to do that, ff_change_id.pl can
be utilized:

ff_change_id.pl --pool <pool_name>


[--uid <uid> <user_name> | --gid <gid> <group_name>]

The script should be run on a Control Node. It will change IDs according to the values given, as long
as those values are not occupied by other entries.

9.2.2 Password Management on Control Nodes

9.2.2.1 Passwords for Root and Standard Unix Users


The root password and the user passwords are located in the file /etc/shadow on both
Control Nodes and can be changed with the standard passwd command.

cn1:~ # vi /etc/shadow
cn1:~ # passwd
Changing password for root.
New password:
Password will be truncated to 8 characters
Re-enter new password:
Password changed

The user passwords of the Control Nodes may (but should not) differ and can be
changed at any moment. A change has no other impact.


9.2.2.2 Password for Root of LDAP Server and Replica


The password is stored in the configuration file /etc/openldap/slapd.conf as the value
of the entry rootpw on both Control Nodes, and within the configuration inside the LDAP
server itself under cn=root,ou=LDAPadmins and cn=replica,ou=LDAPadmins respectively.
Modify the password with an editor in /etc/openldap/slapd.conf on both Control
Nodes.

Modify the password on both the root and the replica of the LDAP server.


The passwords can be changed at any moment. A change has no other impact.
If the LDAP server and its replica are not rebooted, both the old password and the new
password will be valid. After rebooting both, only the new password will be valid.
The LDAP servers should be rebooted from the PRIMECLUSTER interface (ldap_srv1
and ldap_srv2).
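If the rootpw entry is kept in hashed form, a new hash can be generated with the standard OpenLDAP utility slappasswd and pasted into slapd.conf on both Control Nodes; the hash shown below only illustrates the output format:

cn1:~ # slappasswd -s <new_password>
{SSHA}xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx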

9.2.2.3 Password for LDAPadmins


The password is stored in cn=sysclient,ou=LDAPadmins,ou=<pool>,ou=Pools
for each pool inside the LDAP server. Modify the password with an LDAP editor. The
password has to match the password given to the Solaris LDAPcachemgr.

9.2.2.4 Password for SNMP Community


The password is stored on both Control Nodes in the /etc/snmpd.conf and
/opt/FlexFrame/etc/ff_misc.conf configuration files. Modify this password with
an editor. The password can be changed at any moment. A change has no other impact.


9.2.2.5 Key for Access to the Name Server (named)


The key is stored on both Control Nodes in the file /etc/rndc.key. Modify this key with
an editor. The key can be changed at any moment. A change has no other impact.
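If you prefer to generate a fresh key instead of editing the file by hand, the standard BIND utility rndc-confgen can be used; the resulting file must then be made identical on both Control Nodes. This is a sketch, not a mandatory procedure:

cn1:~ # rndc-confgen -a
cn1:~ # scp /etc/rndc.key cn2:/etc/rndc.key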

9.2.2.6 Password for myAMC WebGUI


The password is stored on both Control Nodes in the configuration file
/opt/myAMC/config/FA_WebGui.conf. Modify this password with an editor. The
password can be changed at any moment. A change has no other impact.

9.2.2.7 Password for mySQL Database


This password is stored in the configuration file /opt/myAMC/config/amc.conf and
inside the mySQL server on both Control Nodes.
Step 1: Password for the connection of myAMCMessenger to database
Modify the password in the following command:

/etc/init.d/myAMC.MessengerSrv connect <user> <password>


Step 2: Password inside the database
Login to mySQL database:
cn1:~ # mysql -u myAMC -p FlexFrame
mysql> SET PASSWORD FOR 'myAMC'@'localhost' =
PASSWORD('<new_password>');

The password can be changed at any moment. A change has no other impact.

9.2.2.8 Password for Power Shutdown


1. The password is stored inside the server BIOS. For detailed description see section
“Configuring User, Password and Community” on page 251.
2. The password is stored on both Control Nodes in the configuration file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml.
Modify this password with an editor or the WebGUI.
3. When modified with the WebGUI the password is automatically changed in the PCL
configuration file /etc/opt/SMAW/SMAWsf/SA_ipmi.cfg.
The password can be changed at any moment. A change has no other impact.


9.2.2.9 Password for SAPDB/MaxDB in FlexFrame Start/Stop Scripts


The password is stored alongside the user control in /FlexFrame/scripts/* for
SAPDB/MaxDB on both Control Nodes. The default value is control.

cn1:~ # cd /FlexFrame/scripts


cn1:~ # grep control, *

Modify this password with an editor. The user control must not be changed. This
password must be identical for user control in all SAPDBs/MaxDBs across the site.

9.2.3 Password Management on Linux Application Nodes


Refer also to the section “User Administration” on page 191.

9.2.3.1 Password for Admin on BX(3,6)00


For detailed information how to change see section “Configuring User, Password and
Community” on page 251.

9.2.3.2 Password for PowerShutdown on BX(3,6)00, RXxxx


For detailed information how to change the password for SNMP community see section
“Configuring User, Password and Community” on page 251.
1. The password is located in the myAMC configuration file myAMC_FA_SD_Sec.xml
on both Control Nodes. Modify this password with an editor or the WebGUI.
2. When modified with the WebGUI the password is automatically changed in the PCL
configuration files /etc/opt/SMAW/SMAWsf/SA_*.cfg.
3. For RX300 application nodes the file is /etc/opt/SMAW/SMAWsf/SA_ipmi.conf.
The configuration can be tested with
cn1:~ # ipmipower -s <ip_address> -u <user> -p <password>

The password can be changed at any moment. A change has no other impact.

9.2.3.3 Passwords for Root and Standard Unix Users


See section “User Administration” on page 191.

9.2.3.4 Passwords for SAP Users


See section “User Administration” on page 191.


9.2.4 Password Management on Solaris Application Nodes


Refer also to the section “User Administration” on page 191.

9.2.4.1 Password for PowerShutdown


For detailed information how to change password for community see section
“Configuring User, Password and Community” on page 251.
1. The password is stored on both Control Nodes in the configuration file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml.
Modify the password with an editor or the WebGUI.
2. When modified with the WebGUI the password is automatically changed in the PCL
configuration files /etc/opt/SMAW/SMAWsf/SA_*.cfg.
The password can be changed at any moment. A change has no other impact.

9.2.4.2 Passwords for Root and Standard Unix Users


See section “User Administration” on page 191.

9.2.4.3 Passwords for SAP Users


See section “User Administration” on page 191.

9.2.5 Passwords for Networking Components

9.2.5.1 Password for Cisco Switch Enable/Login and SNMP_Community


1. Change password in LDAP cn=<n>,ou=Switch,ou=Network,ou=FF_conf with
LDAP editor:
cn1:~ # ldapbrowser &

Navigate to the switch group where you want to change the password:


Double-click userPassword and enter the new password.

2. Change password on switch:

telnet session:
enable
configure terminal
enable password <new_password>
line vty 0 15
password <new_password>
end
copy running-config startup-config

3. On Control Node:
cd /tftpboot
touch sw<n>-<m>.config
chmod 666 sw<n>-<m>.config


4. On switch:
copy running-config tftp <control_lan_ip_of_CN>
sw<n>-<m>.config

Enable and login passwords must be identical!

9.2.5.2 Admin Password for SwitchBlade BX(3,6)00


1. Change password in LDAP cn=<m>,cn=<n>,ou=SwitchBlade,ou=Network,
ou=FF_conf with LDAP editor.

2. Change password on switch:

telnet session:
enable
configure
enable password level 0 <new passwd>
line vty
password <new password>
end
copy running config startup-config

3. On control node:
cd /tftpboot
touch bx[3,6]00-<n>-swb<m>.config
chmod 666 bx[3,6]00-<n>-swb<m>.config

4. On switch:
copy running config tftp <control_lan_ip_of_CN>
bx[3,6]00-<n>-swb<m>.config

9.2.5.3 Admin and SNMP_Community Storage Passwords


Passwords can be modified with the ONTAP WebGUI or by a telnet session to each Filer.


9.2.6 Preparation for Linux Application Nodes

9.2.6.1 Script: ff_install_an_linux_images.sh

Synopsis

ff_install_an_linux_images.sh [-v] [-p <path_to_images>]

Description
● Setting up the root password in the root image
You need a password to log onto the Application Nodes as root. This password is
valid for Application Nodes with the same shared root Image. The script asks for the
root password twice. If both the entered strings are the same, the password will be
accepted.
Enter password: ********
Enter password again: ********
Password accepted.

If you plan to use more root images, for example a separate root image for
every pool with different root passwords, you must copy these root images
manually after this installation procedure.
● Setting up ssh in the root image
To log on from the Control Nodes to the Application Nodes without entering the root
password, set up ssh as follows:
– Generate host keys for the root image
– Deploy authorized keys from the Control Nodes into the root image
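A typical invocation, assuming the Application Node Linux images have been copied to a staging directory beforehand (the path is an example only):

cn1:~ # ff_install_an_linux_images.sh -p /tmp/an_images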


9.2.7 Preparation for Solaris Application Nodes

9.2.7.1 Script: nb_unpack_bi

Synopsis

nb_unpack_bi [-x] [-q|-v] | [-h]

Description
Please enter the password for your Solaris Application Nodes. It applies to all Solaris
Application Nodes.

INFO: A root password is required for the Application Nodes


You will now be asked for the root password (valid for all
Application Nodes using this image).
Enter password:
Enter password again:
Password accepted.

9.3 Passwords Stored in Initialization Files

9.3.1 Switch Group Switch Definition Files


The script ff_wiring.pl creates (amongst others) switch group switch definition files
(used with ff_switch_conf.pl).
/opt/FlexFrame/network/switchgroup*.def
Switch group switch definition files. These may contain data of more than one switch
if the switch type is stackable and usable as one virtual switch (as supported by the
CISCO Catalyst 3750 switch family). Each file contains switch type, host name, SNMP, NTP,
syslog, VLAN and port definitions. The file content may look like this:

SWITCHTYPE=cat3750g-24ts
HOSTNAME=sw1-1
IP=100.100.13.18;255.255.255.0;14
SNMP=public;ro
SYSLOG=100.100.13.13;local0
PASSWORD=passwort;root
NTP=100.100.13.12;server
NTP=100.100.13.13;server


9.4 Configuration of the SCON Shutdown Agent on the Control Nodes

For this purpose the rcsd.cfg file was supplemented under /etc/opt/SMAW/SMAWsf
and the SA_scon.cfg file created.

cn1:~ # more /etc/opt/SMAW/SMAWsf/SA_ipmi.cfg


# FlexFrame(TM)
# Generated file at Mon Nov 29 11:53:03 CET 2004
# A backup of a previous version (if existed) my be found
# in directory: /opt/FlexFrame/etc/backup_20041129115223
#
# DO NOT EDIT MANUALLY!
#
CN1-1 :OEM:passwort cycle
CN2-1 :OEM:passwort cycle
# end of configuration file


9.5 FA Agents

The password is stored in the configuration file /opt/myAMC/config/FA_WebGui.conf in


the entries messengerdb.jdbc.username and messengerdb.jdbc.password.
The default entries are:

messengerdb.jdbc.url=jdbc:mysql://localhost:3306/messenger
messengerdb.jdbc.username=myAMC
messengerdb.jdbc.password=FlexFrame

For modification(s) use a text editor on the Control Nodes.


9.5.1 Power-Shutdown Configuration


The power shutdown is necessary for FlexFrame Autonomous Agents to switch off
Application Nodes in the case of a switchover scenario to make sure that a system in an
undefined state will be switched off. This is also necessary for the PRIMECLUSTER
Shutdown Facility on the Control Nodes.
The power shutdown will be implemented by the Shutdown Agents (SA) of the
PRIMECLUSTER Shutdown Facility and the ff_xscf.sh script.
Each Shutdown Agent (SA_blade, SA_ipmi, SA_rsb, SA_rps and SA_scon) has its
own config file which will be automatically configured by the FlexFrame Autonomous
Agents and the PRIMECLUSTER Shutdown Facility on the Control Nodes.
The PRIMEPOWER XSCF (eXtended System Control Facility) is a built-in component for
PRIMEPOWER 250 and 450 systems. It enables either serial or LAN access to the
server’s console or power-on, power-off and reset commands.
Only the hardware and software which is used by the Shutdown Agents and XSCF must
be prepared with IP address, user, password, etc.
You must therefore perform the following steps in accordance with the Application Node
and Control Node hardware.
Detailed information on the power-shutdown configuration is available in the "FA Agents -
Installation and Administration" manual. The following sections just show where
usernames and passwords are stored.

9.5.1.1 BX300/600
Default: <Username> = root, <Password> = root
Use the same user name, password and community string in config file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section Security_default or use a separate config section entry for this server. The
SNMP community string should be “read-write”.

9.5.1.2 RX300/RX300 S2
IPMI configuration:
● LAN A onboard should only be used for IPMI on the Control LAN
● LAN B onboard and the upper NIC of the separate LAN card should be used for
bonding.
Use the same user name and password in config file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section Security_default or use a separate config section entry for this server.


9.5.1.3 RX600
RSB configuration:
● RX600: RSB1 onboard
Connect the Control LAN cable with the LAN port for the remote management controller.
Use the same user name and password in config file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section "Security_default" or use a separate config section entry for this server.
Use the same community string in config file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section Security_default or use a separate config section entry for this server.

9.5.1.4 RX800
ASM, RSB and the connection between must be configured.
Set user name and password (same as in
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section Security_default or separate entry for this server).
Set Permissions for Reset/SwitchOff to 1 and then press <F1>.
Use the same user name and password in config file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section Security_default or use a separate config section entry for this server.
Create new SNMP Community 2 for tcc and then press F1.
Use the same community string in config file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section Security_default or use a separate config section entry for this server.

9.5.1.5 PRIMEPOWER 250/450


The PRIMEPOWER XSCF (eXtended System Control Facility) is a built-in component to
enable either serial or LAN access to the server’s console or power-on, power-off and
reset commands.
The FlexFrame images for Solaris will automatically configure most of the XSCF settings
(serial/lan, IP address etc.) during system startup. The /etc/default/SMAW_xscf
configuration file of each server contains the settings.
For security reasons, password information is not stored in this configuration file and
must be configured manually. Log on to the server and call the program
/opt/FJSVmadm/sbin/madmin.


Set the same user name and password in config file


/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section Security_default or use a separate config section entry for this server.
You can test the user and password by accessing the port 8010 of the server’s control
LAN interface using telnet:

cn1:~ # telnet server-co 8010

The initial values are root (username) and fsc (password).

9.5.1.6 PRIMEPOWER 650/850


To power down servers with RPS (PRIMEPOWER 650/850), only one RPS per server is
used controlling one (PRIMEPOWER 650) or two (PRIMEPOWER 850) protection boxes.
When the RPS has been powered down it has to be ensured that the associated
protection boxes and the server are also powered down so that a consistent status is
achieved.
Prerequisite: The RSB inside the RPS must have the firmware version 1.0.3.105 or
higher.
Set the same user and password in the configuration file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml, config
section Security_default or use a separate config section entry for this server.
The initial values are root (username) and fsc (password).

9.5.2 SNMP Traps


The FA Agents are able to send SNMP traps to configured trap destinations. The traps
contain status information and status changes of services in the FlexFrame environment.
For example, an SNMP trap is sent when a service is starting. SNMP traps can be used to
connect an external system monitoring application to FlexFrame.
Detailed information on the SNMP trap configuration is available in the "FA Agents -
Installation and Administration" manual. The following sections just show where SNMP
community strings are stored.

9.5.2.1 General
The TrapTargets.xml file contains all the trap destinations, i.e. information which is
needed to send SNMP traps. Two parameters are required for each target:
● Host name or IP address
● SNMP community


The community roughly corresponds to a password.


Generally public is configured as the default value.

9.5.2.2 Configuring User, Password and Community


To use agent power shutdown, user, password and community must be defined in the
configuration of the FA Agent. This configuration is specified in the pool-specific
configuration file
/opt/myAMC/vFF/vFF_<pool_name>/config/myAMC_FA_SD_Sec.xml. The
entries for user, password and community must be the same as those configured in the
Application Nodes.

9.5.2.3 Configuring Management Blades


The management blades have to be configured. This is done in the Managementblades
configuration section of the myAMC_FA.xml configuration file.

9.6 PRIMECLUSTER


Read access with the PCL Web-based Admin View is possible for every Unix user
defined on the Control Nodes. Write access is only possible as root.
See section “User Administration” on page 191.



10 Administrating SAP Systems
This chapter describes the management of SAP System IDs (so-called SIDs) and their
respective instances within the FlexFrame environment. It further describes how to clone
SIDs as well as their instances for a different pool than the one they were installed to.
The tools described below only maintain the LDAP database entries, rather than adding
or removing the data and binaries of the respective SAP systems. These steps need to
be performed manually.
Listing, adding, removing and cloning the above entities in the LDAP server is supported
by two tools, ff_sid_adm.pl and ff_clone_sid.pl. Both scripts will take care of
keeping the data accessed by the operating system’s naming service mechanism in sync
with the FlexFrame internal configuration data, both of which reside in the LDAP server.
This data should not be manipulated manually.

10.1 Listing SAP SIDs and Instances


The script ff_sid_adm.pl is used to list existing SAP System IDs and their instances
(for adding and removing SIDs please see 10.2).
Synopsis

ff_sid_adm.pl --op list --pool <pool_name> [--sid <SAP_system_id>]

Command Options
--op list
Determines the list operation.
--pool <pool_name>
Determines the FlexFrame pool to which the operation should be applied.
--sid <SAP_system_id>
Determines the SID being used.

Examples:
%> ff_sid_adm.pl --op list --pool Pan
OL1
OL4
OL2
SHT


%> ff_sid_adm.pl --op list --pool Pan --sid SHT


01
00
DB0

10.2 Adding / Removing SAP SIDs and Instances


The script ff_sid_adm.pl is used for adding and removing SAP SIDs and/or their
instances (for listing configured SID and instance see 10.1).

Synopsis
ff_sid_adm.pl --op add --pool <pool_name> --sid <SAP_system_id>
--sapversion {4.6|6.20|6.40|7.0}
--db {ORACLE9|ORACLE10|SAPDB73|SAPDB74|MAXDB75|MAXDB76}:
<db_loghost>:{<db_loghost_ip>|*}
--group <groupname1>:<gidnumber1>,<groupname2>:<gidnumber2>,...
--user <username1>:<uidnumber1>,<username2>:<uidnumber2>,...
--sap {ci|app|jc|j|scs|ascs}:<SYSNR>:
<loghost-client>:{<loghost_client_ip>|*}:
<loghost-server>:{<loghost_server_ip>|*}

ff_sid_adm.pl --op del --pool <pool_name> --sid <SAP_system_id>


[--sysnr <SYSNR>]

Command Options
--op add
Determines the add operation.
--op del
Determines the del operation.
--pool <pool_name>
Determines the FlexFrame pool to which the operation should be applied.
--sid <SAP_system_id>
Determines the SID being used.
--sapversion {4.6|6.20|6.40|7.0}
Specifies the SAP basis version being used.


--db {ORACLE9|ORACLE10|SAPDB73|SAPDB74|MAXDB75|MAXDB76}
Specifies the database type as well as the respective version being used.
--group <groupname1>:<gidnumber1>,<groupname2>:<gidnumber2>,...
--user <username1>:<uidnumber1>,<username2>:<uidnumber2>,...
user and group enable specially selected user numbers and group numbers to be
assigned to SAP users and SAP groups respectively. In this case a check is made to
see whether the user or group has already been defined for the DB system involved.
A user or group is created only if they do not already exist. For example, a group dba
which already exists cannot be assigned a group number which deviates from the
default value.
<db_loghost>:{<db_loghost_ip>|*}
These values are appended to the --db option: the logical host name used for the
database as well as the IP address for that host name. Use an asterisk if you want the
IP address to be chosen automatically. All the entries need
to be specified in a colon separated format.
--sap {ci|app|jc|j|scs|ascs}:<SYSNR>:
<loghost-client>:{<loghost_client_ip>|*}:
<loghost-server>:{<loghost_server_ip>|*}
Specifies an SAP instance (optionally multiple of those) through its type (ci, app, jc,
j, scs, ascs), its SAP system number, the logical host name in the client network,
the respective IP address, the logical host name in the server network and the
respective IP address. Again, the IP addresses can be replaced with asterisks in
order to have them chosen automatically. All the entries need to be specified in a
colon separated format.

--sysnr <SYSNR>
Removes a specific SAP instance instead of the entire system (SID).
Examples:
Adding an SID with one Central Instance:
control1:~ # ff_sid_adm.pl --op add --sid SHT --pool Otto
--sapversion 6.40 --db ORACLE9:dbsht:192.168.1.1
--sap ci:00:sht00-client:\*:sht00-server:\*

Adding an instance to an existing SAP System:


control1:~ # ff_sid_adm.pl --op add --sid SHT --pool Otto
--sapversion 6.40 --sap app:01:sht01-client:\*:sht01-server:\*

Removing an entire SID (including its instances):


%> ff_sid_adm.pl --op del --sid SHT --pool Otto


Removing an Application Server:


%> ff_sid_adm.pl --op del --sid SHT --pool Otto --sysnr 01

10.3 Cloning a SAP SID into a Different Pool

10.3.1 Script: ff_clone_sid.pl


The script ff_clone_sid.pl allows users to clone (basically copy) an entire SID from
one pool to another. Note that only the FlexFrame-specific administrative data in the
LDAP server and the information required for the operating system's naming services are
copied and/or added to the LDAP database. Any additional work (such as copying
SAP/database binaries and the database content) is not performed by this tool and has to
be done separately (see "Data Transfer when Cloning an SID" below).
If the script detects conflicts in user IDs, group IDs, service entries etc., it performs no
further steps in order to avoid potential damage to the target pool. In that case the data
has to be added to the target pool manually, which requires some knowledge of the LDIF
format and of tools such as ldapsearch and ldapmodify. In addition, user IDs, group IDs
and other entries might have to be adapted accordingly.

Synopsis

ff_clone_sid.pl --sid=<SID_name> --srcpool=<pool_name>
    --trgtpool=<pool_name>
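
For example, cloning the SID OSI (used in the examples below) from pool p1 to pool p2
might look like this (all names are illustrative):
control1:~ # ff_clone_sid.pl --sid=OSI --srcpool=p1 --trgtpool=p2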

10.3.2 Script: ff_change_id.pl


This script allows changing user IDs or group IDs in LDAP. This is meant to support the
cloning of SAP systems to other pools. In some cases - when IDs are already in use - it
may be required to change them, after cloning a system into a different pool.

Synopsis

ff_change_id.pl --pool=<pool_name>
[{--uid=<id> <user_name>|--gid=<id> <group_name>}]

Example:
ff_change_id.pl --pool=pool1 --uid=501 orasht


Data Transfer when Cloning an SID


After cloning the information in the LDAP database with the script ff_clone_sid.pl, it
is necessary to copy the SAP and database binaries as well as the database content. The
copy is done from the Control Node as user root. If the database size is greater than
50 GB, it is recommended to use the ONTAP command ndmpcopy on the NetApp Filer to
copy the database content without using the network connections (see Example 3 below).
Please check carefully that no data will be overwritten in the target pool when
copying data from the source pool!

The following directories have to be copied from the source pool to the target pool:
/FlexFrame/volFF/pool-<pool_name>/<dbtype>/<OS>/<SID>
/FlexFrame/volFF/pool-<pool_name>/sap/sapmnt/<SID>
/FlexFrame/volFF/pool-<pool_name>/sap/usr_sap/<SID>
/FlexFrame/volFF/pool-<pool_name>/sap/home_sap/<sid>adm
/FlexFrame/sapdata/<pool_name>/<SID>
/FlexFrame/saplog/<pool_name>/<SID>
/FlexFrame/volFF/pool-<pool_name>/oracle/<OS>/client/*
(Only necessary if the Oracle client software is not already installed in the target pool.)

You have to edit the file /FlexFrame/scripts/ora_listener_names.

This file is pool-dependent. Add a new line for the cloned SID in the format
<SID>:<ora_listener_name>, as shown below:
DE2:LISTENER_DE2
QA2:LISTENER_QA2
PR2:LISTENER_PR2
PRD:LISTENER_PRD # <-- add this line

If SAPDB is used, please also copy
/FlexFrame/volFF/pool-<pool_name>/sap/home_sap/sqd<sid>.

<pool_name>   Name of the source or target pool
<dbtype>      oracle or sapdb
<OS>          Linux or SunOS
<SID>         The upper-case three-character system ID
<sid>         The lower-case three-character system ID


Example 1 (Oracle-DB on Solaris):


SID = OSI
Source pool = p1
Target pool = p2

control1:/ # cd /FlexFrame/volFF/pool-p1/oracle/SunOS
control1:/FlexFrame/volFF/pool-p1/oracle/SunOS # cp -r OSI
../../../pool-p2/oracle/SunOS
control1:/FlexFrame/volFF/pool-p1/oracle/SunOS # cd ../../sap
control1:/FlexFrame/volFF/pool-p1/sap # cp -r sapmnt/OSI
../../pool-p2/sap/sapmnt
control1:/FlexFrame/volFF/pool-p1/sap # cp -r usr_sap/OSI
../../pool-p2/sap/usr_sap
control1:/FlexFrame/volFF/pool-p1/sap # cp -r home_sap/osiadm
../../pool-p2/sap/home_sap
control1:/FlexFrame/volFF/pool-p1/sap # cd /FlexFrame/sapdata/p1
control1:/FlexFrame/sapdata/p1 # cp -r OSI ../p2
control1:/FlexFrame/sapdata/p1 # cd ../../saplog/p1
control1:/FlexFrame/saplog/p1 # cp -r OSI ../p2

Example 2 (SAPDB-DB on Linux):


SID = MLI
Source pool = p1
Target pool = p2

control1:/ # cd /FlexFrame/volFF/pool-p1/sapdb/Linux
control1:/FlexFrame/volFF/pool-p1/sapdb/Linux # cp -r MLI
../../../pool-p2/sapdb/Linux
control1:/FlexFrame/volFF/pool-p1/sapdb/Linux # cd ../../sap
control1:/FlexFrame/volFF/pool-p1/sap # cp -r sapmnt/MLI
../../pool-p2/sap/sapmnt
control1:/FlexFrame/volFF/pool-p1/sap # cp -r usr_sap/MLI
../../pool-p2/sap/usr_sap
control1:/FlexFrame/volFF/pool-p1/sap # cp -r home_sap/mliadm
../../pool-p2/sap/home_sap
control1:/FlexFrame/volFF/pool-p1/sap # cp -r home_sap/sqdmli
../../pool-p2/sap/home_sap
control1:/FlexFrame/volFF/pool-p1/sap # cd /FlexFrame/sapdata/p1
control1:/FlexFrame/sapdata/p1 # cp -r MLI ../p2
control1:/FlexFrame/sapdata/p1 # cd ../../saplog/p1
control1:/FlexFrame/saplog/p1 # cp -r MLI ../p2


Example 3 using ndmpcopy (Oracle-DB on Solaris):


Filer IP = 192.168.10.203 (Source and Target identical)
SID = OSI
Source pool = p1
Target pool = p2

control1:/ # rsh 192.168.10.203 ndmpcopy -da root:password \
/vol/volFF/pool-p1/oracle/SunOS/OSI \
192.168.10.203:/vol/volFF/pool-p2/oracle/SunOS/OSI
control1:/ # rsh 192.168.10.203 ndmpcopy -da root:password \
/vol/volFF/pool-p1/sap/sapmnt/OSI \
192.168.10.203:/vol/volFF/pool-p2/sap/sapmnt/OSI
control1:/ # rsh 192.168.10.203 ndmpcopy -da root:password \
/vol/volFF/pool-p1/sap/usr_sap/OSI \
192.168.10.203:/vol/volFF/pool-p2/sap/usr_sap/OSI
control1:/ # rsh 192.168.10.203 ndmpcopy -da root:password \
/vol/volFF/pool-p1/sap/home_sap/osiadm \
192.168.10.203:/vol/volFF/pool-p2/sap/home_sap/osiadm
control1:/ # rsh 192.168.10.203 ndmpcopy -da root:password \
/vol/sapdata/pool-p1/OSI 192.168.10.203:/vol/sapdata/pool-p2/OSI
control1:/ # rsh 192.168.10.203 ndmpcopy -da root:password \
/vol/saplog/pool-p1/OSI 192.168.10.203:/vol/saplog/pool-p2/OSI

10.3.3 Changing User and Group IDs after Cloning


The ff_change_id.pl script allows changes of user IDs or group IDs in LDAP. This
utility is meant to support the cloning of SAP systems to other pools. In some cases -
when IDs are already in use - it may be required to change them, after cloning a system
into a different pool.

Synopsis

ff_change_id.pl --pool=<pool_name>
[{--uid=<id> <user_name>|--gid=<id> <group_name>}]

Example:
# ff_change_id.pl --pool=pool1 --uid=501 orasht


10.4 Multiple Filers and Multiple Volumes


During the installation process, FlexFrame assumes that there is one Filer with sapdata
and saplog volumes. Larger installations may require more than one Filer or multiple
volumes on the same Filer.
It is possible to distribute SAP databases across multiple Filers and multiple volumes
under the following conditions:
1. All Filers were entered in the FlexFrame Planning tool prior to installation of the
FlexFrame landscape.
2. The software for SAP and the database are always located in a centralized volume
volFF of the first Filer. No distribution here.
3. For each SID you can assign a sapdata volume for the database's data files. This
sapdata volume can be shared with other SIDs or solely for this SID.
4. For each SID you can assign a saplog volume for the database's online redolog files.
This saplog volume can be shared with other SIDs or solely for this SID.
The volumes must be created manually on the Filer, e.g.:
filer2> vol create dataC11 10
filer2> vol create logC11 4

Here, dataC11 and logC11 are the names of the new volumes and 10 and 4 are the
numbers of disks to be used.
We recommend using the volume names sapdata and saplog (if on a different
Filer) or data<SID> and log<SID> for SID-specific volumes on the same Filer.

You may use FlexVols (ONTAP 7G) or regular volumes.


The following options must be set for the volume:
filer2> vol options dataC11 nosnap on
filer2> vol options dataC11 nosnapdir on
filer2> vol options dataC11 minra on
filer2> vol options dataC11 no_atime_update on
filer2> vol options logC11 nosnap on
filer2> vol options logC11 nosnapdir on
filer2> vol options logC11 minra on
filer2> vol options logC11 no_atime_update on

Next, create qtrees for each FlexFrame pool which will store data and logs in those
volumes:
filer2> qtree create /vol/dataC11/pool1
filer2> qtree create /vol/logC11/pool1


If you use more than the first Filer, make sure it is reachable using, e.g.:
control1:~ # ping filer2-st
PING filer2-st (192.168.10.203) from 192.168.10.201 : 56(84) bytes
of data.
64 bytes from filer2-st (192.168.10.203): icmp_seq=1 ttl=255
time=0.117 ms
64 bytes from filer2-st (192.168.10.203): icmp_seq=2 ttl=255
time=0.107 ms
64 bytes from filer2-st (192.168.10.203): icmp_seq=3 ttl=255
time=0.103 ms

Now, the LDAP database has to be told that this SID is not using the default (first) Filer
and sapdata/saplog volumes:
control1:~ # ff_sid_mnt_adm.pl --op=add --pool=pool1 --sid=C11
--sapdata=filer2:/vol/dataC11/pool1/C11
--saplog=filer2:/vol/logC11/pool1/C11

Now, the volumes on the Control Nodes need to be mounted. To do so you should add
the following lines to each Control Node's /etc/fstab:

filer2-st:/vol/vol0 /FlexFrame/filer2-st/vol0 nfs nfsvers=3,rw,bg,udp,soft,nolock,wsize=32768,rsize=32768
filer2-st:/vol/dataC11/pool1 /FlexFrame/filer2-st/pool1/dataC11 nfs nfsvers=3,rw,bg,udp,soft,nolock,wsize=32768,rsize=32768
filer2-st:/vol/logC11/pool1 /FlexFrame/filer2-st/pool1/logC11 nfs nfsvers=3,rw,bg,udp,soft,nolock,wsize=32768,rsize=32768

Repeat the sapdata and saplog-lines for each pool, if there's more than one pool
for those volumes.
Use the volume name for the last directory in the mount point.

Now we need the mount points:


control1:~ # mkdir -p /FlexFrame/filer2-st/vol0
control1:~ # mkdir -p /FlexFrame/filer2-st/pool1/dataC11
control1:~ # mkdir -p /FlexFrame/filer2-st/pool1/logC11

(Again, sapdata and saplog for each pool)


Before we can mount the volumes, we need to tell the Filer to export them
appropriately:
control1:~ # mount /FlexFrame/filer2-st/vol0
control1:~ # vi /FlexFrame/filer2-st/vol0/etc/exports


Insert the following lines:

/vol/dataC11/pool1 -sec=sys,rw=192.168.10.0/24,anon=0
/vol/logC11/pool1 -sec=sys,rw=192.168.10.0/24,anon=0

Save the file.


The network 192.168.10.0/24 must match the Storage LAN segment of pool
pool1.

Now you need to make the Filer re-read its exports file:
control1:~ # rsh filer2-st exportfs -a

Now we can mount the volumes:


control1:~ # mount -a

Before you can install SAP and the database on those volumes, some folders for the SID
in question have to be created in advance. To do so, run the following command for each
SID (replace "pool1" with your pool name and "C11" with your SID):
control1:~ # ff_setup_sid_folder.sh pool1 C11

Now you can continue with the SAP installation.

10.5 Upgrading a SAP System

10.5.1 Service Port


If you plan to upgrade your SAP release, you have to add a special service port (the so-
called shadow instance port) to LDAP.
SAP shadow service ports are required during an SAP release upgrade. To list, add or
remove these service ports in the LDAP database, you can use the tool
ff_sap_shadowport.sh.

Synopsis

ff_sap_shadowport.sh [-d] -l -p <pool_name> [-s <sid>] [-o <port_no>]

ff_sap_shadowport.sh [-d] {-a|-r} -p <pool_name> -s <sid> [-o <port_no>]


Command Options
-d Writes debugging information to a log file (see below).
-l Lists all SAP shadow service ports of the pool provided with the -p option. If the
option -s is used, only the port of the specified SID is displayed.

-a Adds an entry.

-r Removes an entry.
-p <pool_name>
Specifies the name of the pool (e.g. pool1).
-s <sid>
Specifies the SAP System ID (SID) by a 3 character string (e.g. C11).
-o <port_no>
Specifies the service port number. The default is 3694 (optional).
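
For example, adding the default shadow service port for SID C11 in pool pool1 and
listing it afterwards might look like this (pool and SID names are illustrative):
control1:~ # ff_sap_shadowport.sh -a -p pool1 -s C11
control1:~ # ff_sap_shadowport.sh -l -p pool1 -s C11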

Debugging
/tmp/ff_sap_shadowport.DEBUGLOG holds debugging information if option -d was
used. In case of problems, please provide this file.

10.5.2 FA Agents
Please make sure that the FA Application Agents are stopped on the hosts while you are
installing, updating or removing any SAP or database software:
Stop the FA Agent:
/etc/init.d/myAMC.FA_AppAgent stop

Check the status:


/etc/init.d/myAMC.FA_AppAgent status

There should be no running processes listed.
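
Once the installation, update or removal is finished, the FA Application Agent can be
started again (assuming the start action of the init script):
/etc/init.d/myAMC.FA_AppAgent start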


10.6 SAP Kernel Updates and Patches


For an SAP kernel update (binary patches), log on to the Application Node running the CI
of the SAP system that is to be updated. Please make sure that the FA Application Agents
are stopped on the host while you are updating the SAP kernel (see also section "FA
Agents" on page 221).
SAP’s OSS note 19466 describes where to find kernel patches and how to handle the
installation.
For the installation of SAP ABAP patches or similar, please refer to the SAP
documentation. There are no FlexFrame specific changes.



11 Administrating SAP Services
In a FlexFrame environment, any type of SAP instance, i.e. database (DB), SAP Central
Services (SCS), Java central instance (JC), Central Instance (CI), application instance
(APP) or Java-only stack (J), is called an SAP service. The management can be done
either web based with the FA WebGUI or script based with the adapted start/stop scripts.
If the SAP ACC is running, SAP services can be managed by ACC (see chapter “SAP
ACC” on page 235).

11.1 Displaying Status of SAP Services

11.1.1 myAMC.FA WebGUI


The myAMC.FA WebGUI can be used to display the states of SAP services in a
FlexFrame environment. To see all active SAP services and nodes in the FA WebGUI,
the FA Application Agent should be started on all Application Nodes. The displayed active
SAP services can be shown system-related or node-related.
The following table shows the node and service states of FlexFrame Autonomy and the
corresponding colors of these states in the FA WebGUI:

Color    Meaning
white    inactive or no further information
green    normal, everything ok
yellow   warning
red      critical
black    critical

Node states:
RUNNING, SWITCH INT, SWITCH EXT, PowerOff

Service states:
SHUTDOWN, DOWN, WATCH, NOWATCH, UNKNOWN, NULL, RUNNING, STOPPING, STARTING,
RESTART, RESTARTING, REBOOT, REBOOTING, RBGET, SWITCH, SWITCHOVER, SWGET, ERROR

If you don't want to configure the parameters in the plain xml files, you may use the FA
WebGUI to do this more conveniently. Details are provided in the “FA Agents –
Installation and Administration“ manual.

11.1.2 Script: ff_list_services.sh


The script ff_list_services.sh lists the status of all installed SAP services.

Synopsis

ff_list_services.sh [-ivcCH] [<pool> ...]

Command Options
-i Shows inactive services
-v Verbose mode

-c Force use of colors for status information

-C Suppress colors for status information


-H Suppress headers
-h or -? Shows usage
If no pools are specified, services of all pools will be shown.
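
A typical call on the Control Node might look like this (the pool name is illustrative):
control1:~ # ff_list_services.sh -v pool1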


11.2 Starting and Stopping Application Services


Virtualization of the applications and services demands special measures for starting,
stopping and restarting them. These measures are handled by an SAP service script for
each service type.

The application or service must not be started directly, e.g. for an SAP
instance as <sid>adm with startsap, since in this case the interfaces are
neither supplied with IP addresses, nor is the service control file maintained.
The started application will not work due to the lack of a network connection.

11.2.1 SAP Service Scripts


Each service type has its own script. The service type is part of the script name:

Script name Application


sapascs Start and stop an ABAP-SAP Central Services Instance
sapdb Start and stop Oracle, SAPDB etc.
sapci Start and stop an ABAP Central Instance
sapscs Start and stop a SAP Central Services Instance
sapjc Start and stop a Java Central Instance
sapapp Start and stop an ABAP Application Instance (application server)
sapj Start and stop a Java Application Instance

The call syntax for sapdb, sapci, sapscs, sapascs and sapjc is:
sapdb <sid> <action>
sapci <sid> <action>
sapscs <sid> <action>
sapascs <sid> <action>
sapjc <sid> <action>

The call syntax for sapapp and sapj is:


sapapp <id> <sid> <action>
sapj <id> <sid> <action>

The call parameters are:


<id>
Distinction of several similar instances of a service type of an SID; 2-digit numerical.


<sid>
System ID (SID), 3 characters, in lower case.
<action>
The action to be performed with the application or service. Actions are start, stop,
restart, status, cleanup, watch and nowatch.
Call the following from the Control Node, using ssh [-t] with reference to the example
of sapapp:
ssh <application_node_name> sapapp <ID> <SID> <action>

Without the -t option, the standard output is redirected to the logfile


/FlexFrame/scripts/log/<script_name>_[<ID>_]<SID>_<action>.log.
Example:
The application server of the system CB1 with the ID 01 is to be started on the node
blade3; let us assume that the Control Node is control1.
Execute the following as root in the command shell of control1:

control1:~ # ssh blade3 sapapp 01 cb1 start

11.2.2 SAP Service Script Actions


In the following, the SAP service script actions are described:
Start
This checks whether the application or service in question is already running. If it is
running, it is not restarted. It also checks whether required applications or services
are running. The required virtual IP addresses are assigned to the relevant interfaces
for Client LAN and Server LAN (ifconfig <ifc> <ip-adr> netmask <netmask>
up), the application is started and the service control file is written.
Stop
This checks whether the application or service in question is running. If it is not, it is
not stopped. The application is terminated, the service control file is deleted and the
virtual IP addresses are brought down again (ifconfig <ifc> down).
Status
This checks the logical status of the application or service. The functional availability
is not tested.
Restart
This merges the actions stop, cleanup and start in one call. restart is
intended for restarting a malfunctioning application.


Cleanup
This kills application processes that are still running and deletes occupied resources
such as shared memory, semaphores and the message queue.
Note that this action may only be performed after stopping the application has failed.
Nowatch
This removes the application from monitoring by the high-availability software (FA
Agent) without the application having to be restarted. The application itself retains its
current status.
Watch
This includes the application again into monitoring by the high-availability software
(FA Agent) without the application having to be restarted. The application itself
retains its current status.
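
For illustration, the status action for the application server with ID 01 of the system CB1
from the example above can be invoked from the Control Node as follows (with the -t
option the output is shown on the terminal instead of being written to the log file):
control1:~ # ssh -t blade3 sapapp 01 cb1 status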

11.2.3 SAP Service Scripts User Exits


The FlexFrame start/stop scripts for SAP services (sapdb, sapci, sapapp etc.) provide
a user exit as a shell script /FlexFrame/scripts/user_script. If this user_script
exists, the FlexFrame start/stop scripts call it twice: once at the beginning and once at the
end of the start/stop script. This results in two phases, a pre-phase and a post-phase. The
pre-phase runs before any action, the post-phase runs after all actions. If a plausibility
error occurs while executing the FlexFrame start/stop script, the post-phase may be
omitted.
A sample user_script is delivered as user_script.TEMPLATE.
To achieve a pool-dependent function of user_script, put the script into the pool image
and create a symbolic link to it in /FlexFrame/scripts. Seen from the Control Node:

control1:~ # ln -s /FlexFrame/pooldata/config/scripts_config/user_script \
/FlexFrame/scripts/user_script

If no pool dependent function is needed, the script user_script can be located directly
in /FlexFrame/scripts.
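
As an illustration, a minimal user_script could simply log each invocation. This sketch
makes no assumptions about the arguments passed by the service scripts; refer to
user_script.TEMPLATE for the authoritative interface:

#!/bin/sh
# minimal user exit sketch: log every call (pre- and post-phase)
# any arguments passed by the service scripts are logged unchanged
logger -t ff_user_script "user_script called with: $*"
exit 0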


11.3 Return Code of the SAP Service Scripts


The SAP service scripts issue a return code (exit code). The meaning of this code can be
looked up in the sapservice_functions file:

# common exit codes for service scripts
#
no_error=0                      # Bit 0
wrong_parameter_count=1         # Bit 1, wrong number of parameters
plausibility_error=2            # Bit 2, plausibility error
interface_server_lan_error=4    # Bit 3, error at server lan interface up/down
interface_client_lan_error=8    # Bit 4, error at client lan interface up/down
service_start_stop_error=16     # Bit 5, error at service start/stop/status/...
any_error=32                    # Bit 6, any other error
# rule is logical OR:
# let exit_code="exit_code|new_exit_code"
# if [ `expr $exit_code&4` -eq 4 ];then ......
# if [ `let xxx="$rc & 4";echo $xxx` -ne 4 ];then ...

For actions like start, stop, nowatch and watch, the exit code should normally be
no_error. For the action status, the exit code depends on various factors:
● the installation check for the SAP service may have failed (plausibility_error),
● the SAP service is not running, or not running on this host (any_error),
● the SAP service is running on this host (no_error).
In addition to these exit codes, the SAP service scripts print out messages as described
in the section “Start/Stop Script Errors” on page 283.
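
As an illustration, the exit code of a status call can be evaluated on the Control Node like
this (node, instance ID and SID are taken from the earlier example):
control1:~ # ssh blade3 sapapp 01 cb1 status
control1:~ # echo $?
An exit code of 0 (no_error) indicates that the service is running on that host; a non-zero
value can be decomposed bit-wise as shown in the excerpt above.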


11.4 Starting and Stopping Multiple SAP Services


The following scripts are provided for starting and stopping multiple applications and
services:

Script name                   Application                                     Place of execution

start_all_sapservices         Initial start of all configured applications   Only on a Control Node
stop_all_sapservices          Stopping all running applications              Only on a Control Node
stop_all_sapservices_SID      Stopping all running applications of one SID   Only on a Control Node
stop_all_sapservices_local    Stopping all running applications on the       Only on an Application Node
                              local node

The call syntax for pool-dependent scripts is:


stop_all_sapservices [<pool>]
stop_all_sapservices_SID <SID> [<pool>]
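
For example, stopping all SAP services belonging to the SID OL4 in pool pool1 might
look like this (SID and pool name are illustrative):
control1:~ # stop_all_sapservices_SID OL4 pool1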

11.4.1 Details on Controlling Multiple SAP Services


This section describes the functionality of the SAP service scripts for multiple services.
start_all_sapservices
Initial start script for customer defined sap service landscapes. This can be the whole
FlexFrame landscape, a pool, a SAP system landscape with a productive system, a
quality assurance system and a development system, or a single SAP system with
database, central instance and application instances. This script can be duplicated to
create various initial start scripts with different names, e.g.
start_all_sapservices_pool2. A sample start_all_sapservices script is
delivered as start_all_sapservices.TEMPLATE.
stop_all_sapservices
Shutdown script for all running sap services in a dedicated pool. The advantage of
this script is that sap services have not to be stopped one-by-one.
stop_all_sapservices_SID
Shutdown script for a single SAP system (one SID) with all associated SAP services,
such as application instances, central instance and database instance, stopped in the
right order.


stop_all_sapservices_local
Shutdown script for all running SAP services on one Application Node. This script is
integrated via runlevels. The information base for these stop_all_sapservices*
scripts are the /FlexFrame/scripts/log/*_host files.

11.5 Removing an Application from Monitoring by FA Agents
If the applications or services are started with the scripts for virtualization, they are
monitored by the FA Agents. If you do not want this, be it for tests, installation or
upgrades, you have to inform the high-availability software of this, using the additional
parameter nowatch when the application is started.
Example:
The central instance of BW1 is to be started without monitoring by the high-availability
software:
blade1 # sapci bw1 nowatch
blade1 # sapci bw1 start

or from a Control Node:


control1 # ssh blade1 sapci bw1 nowatch
control1 # ssh blade1 sapci bw1 start

If a running application is to be excluded from monitoring by the high-availability software


without being restarted, this is possible, using the nowatch option. The application then
retains its current status.
Example:
The central instance of BW1 is to be excluded from monitoring by the high-availability
software while running:
blade1 # sapci bw1 nowatch

Alternatively, call the following from a Control Node:


control1 # ssh blade1 sapci bw1 nowatch

If a running application is to be included (again) into monitoring by the high-availability


software without being restarted, this is possible using the watch option. The application
then retains its current status.


Example:
The central instance of BW1 is to be included into monitoring by the high-availability
software while running:
blade1 # sapci bw1 watch

or from a Control Node:


control1 # ssh blade1 sapci bw1 watch

11.5.1 Stopping and Starting an Application for Upgrades


Using r3up
The r3up upgrade control program starts and stops the central instance or application
server in various upgrade phases, using the conventional startsap and stopsap. The
start and stop scripts for virtualization are therefore bypassed.
Suitable measures must be taken to ensure that the virtual host name for <sid>adm is
always available and that the interfaces are supplied with the correct IP addresses. In
addition, the application has to be removed from monitoring by the high-availability
software.
Note that for the following solution steps, the application must have been started
correctly beforehand with start and stop scripts for virtualization, so that the
matching virtual host name is set by default in the $HOME/hostname_default
script.

● Remove the running application from monitoring by the high-availability software
(nowatch option).
● r3up can now shut down and start the application directly with stopsap and
startsap; the IP addresses remain assigned to the interfaces and the host name for
<sid>adm remains set to the correct virtual host name.
● Include the application again into monitoring by the high-availability software after
the upgrade (watch option).
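
Sketched for the central instance of BW1 from the examples above (node name
illustrative), the sequence around an upgrade could look like this:
control1 # ssh blade1 sapci bw1 nowatch
(run the r3up upgrade phases; r3up calls stopsap/startsap directly)
control1 # ssh blade1 sapci bw1 watch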


11.6 Service Switchover


There is no command line tool to enforce a direct switchover. To switch over an SAP
service such as a Central Instance, Database Instance or Application Instance, first stop
the SAP service on the current Application Node and then start it on the destination
Application Node.
Examples:
The central instance of OL4 from pool pool1 is running on klinge4 and should be
switched over to klinge5.
First, check whether the Central Instance OL4 is running on klinge4. For a first overview
of which SAP service should be running on which Application Node in the pool pool1, run

control1 # view_hosts pool1


app_44_ol4 should be running on klinge4
app_56_ml4 should be running on Baby_4
ci_ml4 should be running on RX300-01
ci_ol4 should be running on klinge1
ci_osm should be running on Baby_3
db_osm should be running on Baby_3
db_ml4 should be running on RX300-01
db_ol4 should be running on klinge4

The wording "should be running" means that the script view_hosts does not really
check the state of the SAP services. It only checks the /FlexFrame/scripts/log/*_host
files, which are created immediately after a successful start of the corresponding SAP
service.
For detailed status information you can use:
control1 # ssh klinge4 sapci ol4 status

The exit code of the SAP service script will be “0” in this case. If any error occurs while
executing the SAP service script, it returns an exit code not equal to “0”.
control1:/ # ssh klinge1 sapci ol4 status
Central-Instance OL4 should be running on this host.
Central-Instance OL4 has really running processes on this host.
client_lan interface vlan2002:1 10.1.7.151 is already up.
server_lan interface vlan2006:2 172.16.4.151 is already up.
ciol4: klinge1 /FlexFrame/scripts/sapci done!
ciol4: exit_code=0


Another example on Solaris:


control1:~ # ssh Baby_3 sapci osm status
Central-Instance OSM should be running on this host.
Central-Instance OSM has really running processes on this host.
client_lan interface fjgi2002000:2: 10.1.7.106 is already up.
server_lan interface fjgi2006000:3: 172.16.4.106 is already up.
ciosm: Baby_3 /FlexFrame/scripts/sapci done!
ciosm: exit_code=0

Stop the running Central Instance OL4 on klinge4:

control1 # ssh klinge4 sapci ol4 stop

And now start the stopped Central Instance OL4 on the new destination klinge5:

control1 # ssh klinge5 sapci ol4 start

The manual switchover is now complete.



12 SAP ACC
This chapter describes the administration and use of ACC in a FlexFrame environment.
Below, you find a brief description of what is to be done. For detail on how it is to be
done, please refer to the following documents:
● ACCImplementation.pdf
● ACCSecurity.pdf
● ACCCustomizing.pdf

12.1 Integration of New Servers, Pools and SAP Services
This section summarizes how to integrate new hosts and SAP services into ACC.
Detailed information how to do this can be found in the Installation Guide for SAP
Adaptive Computing Controller.
The following sections refer to the Installation Guide for ACC from FSC.

Assumption
Running ACC and SolMan services.

12.1.1 Integration of New ACC Pools (=FF groups)


For information on creating a new ACC pool, please see chapter “Configure ACC Pools”.

12.1.2 Integration of New Servers


For information on integrating new servers in a FlexFrame environment with ACC Agents
started, please see chapter “ACC Agents”.

Integration of a new Server to ACC


For information on making a new registered server adaptive, please see chapter
“Adaptive Enabled Computing Nodes”.


12.1.3 Integration of new SAP Services


There are different procedures to integrate new SAP services into ACC, depending on
ABAP or Java services.

12.1.3.1 Moving SAP Services into SLD


See chapter “Register managed Application Services into SLD”.

SolMan
Start the transaction smsy_setup (see chapter “Activate and check Transfer from SLD to
Solution Manager”).
Make the services adaptive (see chapter “Integration of SAP Services”).

ACC
Click on the refresh button.
The new services should be visible in the ACC WebGUI.

12.2 User Administration


ACC is a single point of control component which needs to be secured against
unauthorized access. Details are described in the “ACCSecurity.pdf”.
For administration of ACC users see the manual chapter “User Administration for ACC
Users”.

12.3 Usage of ACC


The ACC WebGUI has a Help menu integrated on the right side of the status
bar. For detailed information use the Help menu.

12.3.1 Displaying Status of SAP Services


● Start the ACC WebGUI (see chapter “Start of ACC”).
● The Logical Landscape appears.
● Click on the desired pool or select “Expand All” in the Options menu of the tray.
● You should see the status of each integrated SAP service.


12.3.2 Starting SAP Services


● Start the ACC WebGUI (see chapter “Start of ACC”).
● The Logical Landscape appears.
● Click on the desired pool or select “Expand All” in the Options menu of the tray.
● Choose an SAP service by clicking on the instance name.
● Select the desired service by clicking on the blue quad.
● Select a server from the Selected Server assortment.
● Click on the Start Application Service button to start.
● Confirm the upcoming window and check the status.

12.3.3 Stopping SAP Services


● Start the ACC WebGUI (see chapter “Start of ACC”).
● The Logical Landscape appears.
● Click on the desired pool or select “Expand All” in the Options menu of the tray.
● Choose an SAP service by clicking on the instance name.
● Select the desired service by clicking on the blue quad.
● Click on the “Stop Application Services” button to stop.
● Confirm the upcoming window and check the status.

12.3.4 Relocating SAP Services


● Start the ACC WebGUI (see chapter “Start of ACC”).
● The Logical Landscape appears.
● Click on the desired pool or select “Expand All” in the Options menu of the tray.
● Choose an SAP service by clicking on the instance name.
● Select the desired service by clicking on the blue quad.
● Select a server from the “Selected Server” assortment.
● Click on the “Relocate Application Service” button to relocate.
● Confirm the upcoming window and check the status.


12.3.5 Archiving the ACC Log


It is also possible to archive the logging information:
Navigate in the ACC WebGUI to “Technical Settings” => “Archive Log”.
Delete, Archive and Retrieve are the possible functions. For more information use
the Help button in the status bar.



13 Configuring FA Agents
Operation of the FA Agents does not necessarily require individual parameterization.
Usable parameter files are available after installation. For productive use, however, the
values must be tested and, if necessary, adjusted to the requirements and to the start,
stop, ping and restart times of the services monitored by myAMC.FA on the customer
system.
The FlexFrame Autonomy solution is configured using files contained in the directory
/opt/myAMC/vFF/vFF_<pool_name>/config on the Control Node.
The files are available in XML format.
TrapTargets.xml
Trap target (pool-dependent). In this file you can configure the trap receivers to which
messages are sent.
myAMC_FA_Groups.xml
Groups (pool-dependent). The group affiliation is configured in this file.
myAMC_FA.xml
FlexFrame Autonomy (pool-dependent), settings for the autonomous reactions.
myAMC_FA_ACC.xml
ACC Connector (pool-dependent), settings for the interface to SAP ACC.
myAMC_FA_GUI.xml
FA WebGUI (pool-dependent), settings for the FlexFrame WebGUI.
myAMC_FA_SD_Sec.xml
FA shutdown security (pool-dependent), settings for the power shutdown.
After the installation, the default files for all configurations are available as
<xxxx>-default.xml.
Example:
In its as-supplied status, the myAMC_FA_default.xml file is identical to the
myAMC_FA.xml file. It can be used to restore a modified or destroyed myAMC_FA.xml
file.
Details are provided in the “FA Agents – Installation and Administration“ manual.


13.1 Groups

13.1.1 General
FlexFrame offers advanced functions for partitioning a FlexFrame environment into
service or customer specific server pools and groups. This may be interesting for large
installations or application service providers.

Server pools
On FlexFrame 3.2, a pool is a number of Application Nodes belonging to the same
department (or customer) with exclusive hardware requirements. FlexFrame systems can
be divided into pools. Each FlexFrame system consists of at least one pool. Within a
pool, all servers may communicate with each other, but not with the systems of other
pools.
Servers of different pools can use different copies of the OS and can be separated into
different network segments.
The autonomous reactions of FlexFrame Autonomy are always pool-related. Pools can
be configured in LDAP by using the FF administration tools. Details about the pooling
within FA Agents are provided in the “FA Agents – Installation and Administration“
manual. The FA configuration file myAMC_FA_Pools.xml is used as a cache file in the
case LDAP is not responding.

Server groups
Within a server pool, various types of hardware can be used with different characteristics,
such as operating system and architecture, number of CPUs and RAM size. The bulk of
servers can be divided into groups (pool groups) of servers with similar operating
systems and hardware performance. This may be very useful for groups of high-
performance database servers or groups of medium-performance application servers.
Each pool consists of at least one group.
Each SAP application running in a pool can use one or more servers in one or more
groups of servers in the same pool. Each instance of this application runs in a selected
group. In case of a failure, switchover to a spare server is possible in the same group.
For example, a pair of servers can be divided into a high-performance database server
and a smaller application server. The group configuration makes sure that database
instances run in the group of high-performance servers, while application instances stay
on groups of smaller servers without interfering with each other.
All servers in a pool share the same VLAN network segments, even if they belong to
different groups. They do not share VLAN segments with other pools, except the Control
VLAN.


The autonomous reactions of FlexFrame Autonomy are always group-related. Groups


can be configured in LDAP by using the FF administration tools. Details about the
grouping within FA Agents are provided in the “FA Agents – Installation and
Administration“ manual. The FA configuration file myAMC_FA_Groups.xml is used as a
cache file in the case LDAP is not responding.
After modifying the group configuration in LDAP, the FA agents must be stopped and
restarted in the affected pool. The new group configuration is now used.

13.1.2 Service Classes


SAP systems and their services can be classified. Classification on system or service
level permits various reaction scenarios.
Systems and services are classified in the myAMC_FA_Groups.xml file. The
classification is pool-related.
The service classes are defined in the group configuration file of a virtual FlexFrame pool.
A service class is defined by the following variables:
System ID ("P46", "O20", ...)
Service type ("db", "app", "ci", ...)
Service ID ("00", ...)

The following attributes are defined in accordance with these variables:


service-priority
service-powervalue
This value is currently only provided for information purposes.

13.1.3 Service Priority


A priority can be assigned to all services of a service class.
The highest service priority is 1. By default each service is assigned this priority, i.e. if no
service classes are defined, all services have priority 1. The higher the number, the lower
the priority of a service. The highest possible number is 1000.
Priority 0 is a special case: setting the priority of a service class to 0 deactivates the
autonomous functions for the services of that class.
The service priority is evaluated for all autonomous reactions. If, for example, a service of
a productive system and a service of a test system are running on the same node and a
failure of the test system service (priority 5) would trigger a reboot, this reboot is not
executed, because the service of the productive system, which is running without error,
has priority 1 and thus a higher priority.


13.1.4 Service Power Value


For this service, the service power value attribute defines a performance value which
specifies the maximum performance (SAPS) this service requires.
This attribute is intended for use in future versions of the FA agents in the field of load
distribution and load transfer.

13.1.5 Class Creation Rules


A service either belongs to the default class which always exists or it can be assigned
unambiguously to another class by evaluating the aforementioned variables.
Details are provided in the “FA Agents – Installation and Administration“ manual.

13.2 Traps
The FA Agents are able to send SNMP traps to configured trap destinations. The traps
contain status information and status changes of services in the FlexFrame environment.
For example, an SNMP trap is sent when a service is starting. SNMP traps can be used
to connect an external system monitoring application to FlexFrame.

13.2.1 General
The TrapTargets.xml file contains all the trap destinations, i.e. information which is
needed to send SNMP traps. Two parameters are required for each target:
● Host name or IP address
● SNMP community
The community roughly corresponds to a password.
Generally public is configured as the default value.
Details are provided in the ”FA Agents – Installation and Administration“ manual.

13.2.2 Changing the Trap Destinations


In a FlexFrame configuration, at least the two Control Nodes of the pool involved must be
entered as trap destinations. Other systems may also be entered as trap destinations,
however they need to be part of a network reachable from the FF environment.
After the trap destinations have been modified or extended, the FA Control Agent and FA
Application Agents must be restarted to apply the modification.


13.3 FlexFrame Autonomy


The behavior of the FlexFrame Autonomous Agents can be influenced by a number of
parameters.
The FlexFrame Autonomous Agents’ parameters are described in detail in the FlexFrame
Autonomy documentation. You should always check whether the default values can be
used in the configuration involved.
In particular, the parameters for startup time, restart time, etc. should be checked, for
example in relation to the database size or memory size, as very large deviations from
the default values are possible here. Every operator has to measure the dynamic values
of his/her configuration and correlate them with the parameters set.
Only when the parameters are set correctly can the FlexFrame Autonomous Agents
perform their tasks securely and reliably.
The information to be configured relates to the following components:
● General parameters
● Node-related parameters
● Service-related parameters
● Path configurations
● Power shutdown configuration

13.3.1 General Parameters


The list below describes a selection of the parameters which influence the behavior of the
FlexFrame Autonomous Agents. Further information on parameterization with
corresponding sample scenarios is provided in the manual ”FA Agents – Installation and
Administration“.
CheckCycleTime
This parameter defines the cycles in which the internal detector modules of the FA
Agents supply results and the rule modules evaluate the status derived from these.
The parameter value may not be less than the minimum processing time which the
detector modules, rule modules and reaction modules require to process a cycle. The
default value in the as-supplied status is 10 seconds. The parameter value must also
always be at least 1/3 of the lifetime of the MonitorAlerts. In the FlexFrame standard
installation the lifetime of the MonitorAlerts is 30 seconds.
LivelistWriterTime
This parameter defines the intervals at which the FA Agents must generate a Livelist.
It is specified in seconds.


ControlAgentTime
This parameter specifies how often the Control Agent checks the Livelists of the
Application Agents. The parameter should thus be about the same as the
LiveListWriterTime.
MaxHeartbeatTime
This parameter specifies the maximum time which may elapse between two Livelist
entries of an Application Agent before the Control Agent intervenes. The intervention
is a check if the Application Node is alive. If not, there will be an external switchover
to another Application Node. The MaxHeartbeatTime must therefore always be
greater than the ControlAgentTime and the LivelistWriterTime. In practice,
a factor of 3 between LivelistWriterTime and MaxHeartbeatTime has proved
suitable.
MaxRebootTime
This parameter specifies the maximum time which may elapse between two Livelist
entries of an Application Agent before the Control Agent intervenes during a reboot of
this Application Node.
MaxFailedReachNumber
This parameter specifies how often the FA Control Agent attempts to reach an
Application Node after it has exceeded the MaxHeartbeatTime. After this number of
failed attempts, an external switchover is initiated.

13.3.2 Node-Related Parameters


Node_MaxRebootNumber
This parameter specifies how many consecutive reboots may be performed to
restore a service. If 3 is specified, the Application Agent attempts to make the system
available again with up to three reboots. Keep in mind that a reboot is also
unsuccessful if the system could not be restored within the MaxRebootTime set. In
the event of reboot problems, the MaxRebootTime parameter must therefore also
always be checked and compared with the reboot time actually needed.
Node_MaxSwitchOverNumber
This parameter specifies how many consecutive switchovers may be performed to
restore a service.
Node_SwitchOverServiceStartDelayTime
This parameter specifies the waiting time for starting of services on the target
Application Node during an internal switchover. This waiting time is necessary to
avoid errors caused by IP caching in the network switch. The parameter therefore has
to be larger than the caching time of the network switch used, otherwise the start of a
service could fail.


Node_PowerDownTime
This parameter specifies the maximum time a Control Agent will wait for a node to
shut down before the services are started on a spare node by means of a switchover.
Node_CheckAvailabilityTime
This parameter specifies the maximum time a Control Agent will wait for the
Node_CheckAvailabilityCommand to complete. If no positive acknowledgment
from the Application Agent is received in that time, the node is regarded as
unavailable.
Node_SendTrapsAllowed
This parameter releases or blocks the sending of node traps.
Node_RebootCommand
This parameter specifies which command is executed when the Application Agent
initiates a reboot. Normally this is a shutdown with a subsequent reboot.
Node_ShutdownCommand
This parameter specifies which command is executed when the Application Agent
initiates a switchover. Normally this is a shutdown without a subsequent reboot.
Node_PowerDownCommand
This parameter specifies which command is executed by the Control Agent before an
external switchover is initiated. In this way it is ensured that the services on the node
being switched over are really stopped and can be taken over without any problem
by other nodes. The Control Agent waits at most for the period specified with the
Node_PowerDownTime parameter before it continues with the switchover.
Node_CheckAvailabilityCommand
This parameter specifies which command is executed by the Control Agent to check
the availability of a node. A return value of 0 is interpreted as a positive result, every
other return value as negative. The Control Agent waits at most for the period
specified with Node_CheckAvailabilityTime. If the command has not been
executed completely by then, it is assumed that the test is negative, i.e. the node is
no longer available, resulting in an external switchover.
Node_RemoteExecutionCommand
This parameter specifies which command the Control Agent puts ahead of a
command to be executed on another node. This is used, for example, to start or stop
a service remotely on an Application Node. Normally ssh is used here.


13.3.3 Service-Related Parameters


The following parameters can be set individually for each service type or for multiple
services simultaneously. This is also a result of the hierarchical structure of the parameter
file. In the parameter file there is also an option for configuring the values for the DB, CI,
APP, SCS, JC and J services individually. The value of the default service is used for any
value which is not service-specific.
● Service_EnableMonitoring
● Service_SendTraps
● Service_MaxRestartNumber
● Service_TrapSendDelayTime
● Service_ReactionDelayTime
● Service_MaxStartTime
● Service_MaxStopTime
The dynamic behavior of the FA Application and Control Agents depends very much on
the values in the configuration file and the physical conditions. You should therefore
check very carefully that the relation between certain values is secure and application-
oriented.
Service_EnableMonitoring
This parameter defines whether monitoring is enabled or disabled for the service type
in question.
Service_SendTraps
This parameter releases or blocks the sending of service traps.
Service_MaxRestartNumber
This parameter defines how many attempts are made to restart a failed service. This
value can be configured individually for each service type. The value is typically in
the range 1 to 10. The value 0 means that no attempt is made to restart a failed
service. If reboots are permitted on the node, failure of a service leads directly to a
reboot.
Service_TrapSendDelayTime
This parameter defines the send delay time for the service traps.


Service_ReactionDelayTime
This parameter interacts directly with CheckCycleTime. It can be set individually for
each service type. This time defines how long the triggering of a reaction is delayed
after a failure has been detected.
Examples:
CheckCycleTime = 10 sec; ServiceReactionDelayTime = 30 sec

In this example a failed service is detected in a cycle. However, the reaction only
takes place after 30 seconds. The failure must therefore have been identified as a
failure over at least three detection cycles. This prevents a detection error from resulting
in an incorrect reaction.
CheckCycleTime = 10 sec; ServiceReactionDelayTime = 0 sec
In this example the required reaction takes place immediately in the cycle in which
the problem was detected.
Service_MaxRestartTime
This parameter defines the maximum time which may be required for a service type
in the event of a restart. If this time is exceeded, a second or further attempt is made
in accordance with Service_MaxRestartNumber. Thus if the time is selected too
short for the service to be monitored and the hardware used, i.e. the service requires
longer to restart than permitted by Service_MaxRestartTime, a problem situation
is triggered incorrectly.
Service_MaxStartTime
This parameter defines how long a service may take to start up. If this time is
exceeded, the Agent interprets the service as not started and initiates further
reactions.
Service_MaxStopTime
This parameter defines how long a service may take to stop. If this time is exceeded,
the Agent interprets the service as not stopped and initiates appropriate reactions.
Service_PingVirtualServiceInterface
This parameter defines whether the associated virtual FlexFrame service interface is
pinged to determine the availability of a service. If it is set to 0, the virtual LAN
interfaces of the client and server network are not queried. Interface availability then
has no influence on the status change of a service.


13.3.4 Path Configuration


The path configuration is used to define the directories in which the FlexFrame Autonomy
components store their various work files.
A FlexFrame Autonomy solution stores a range of information in various files, such as
display information for the FA WebGUI and logging information used for support
purposes.
To ensure that performance and clarity are retained even in larger configurations, we
recommend that you do not modify these settings!
If the suggested path configuration is changed, though, make sure that clarity is still
retained and no problems arise regarding performance and accessibility.
LiveListLogFilePath
This parameter specifies the directory in which the Livelist is stored.
LiveListXmlFilePath
This parameter specifies the directory in which the XML representation of the Livelist
is stored. This file is required by the FA WebGUI. The parameter should contain the
same path as ServicesXmlFilePath.
ServicesXmlFilePath
This parameter specifies the directory in which the XML representation of the
services list is stored. These files are required by the FA WebGUI. The parameter
should contain the same path as LiveListXmlFilePath.
ServicesListFilePath
This parameter specifies the directory in which the services list files are stored.
ServicesLogFilePath
This parameter specifies the directory in which the services log files are stored.
RebootListFilePath
This parameter specifies the directory in which the reboot files are stored. These files
contain a list of all services which must be restored after a reboot.
SwitchOverListFilePath
This parameter specifies the directory in which the switchover files are stored. These
files contain a list of all services which must be restored on another node after a
switchover.
PerformanceFilePath
This parameter specifies the directory in which the performance files are stored.
These files contain measured values for performance data.


SAPScriptFilePath
This parameter specifies the directory in which the start and stop scripts for the SAP
services (sapdb, sapci, sapapp etc.) can be found. The default path
(/opt/myAMC/scripts/sap) is normally a symbolic link to the actual script
directory.
ControlFilePath
This parameter specifies the directory of the control files
(<service type><service ID><service SID>_host) generated by the
start/stop scripts.
BlackboardFilePath
This parameter specifies the directory in which the BlackBoard file can be found.
Commands can be entered in it that are executed by the FA Application Agents.
GroupConfigFile
This parameter specifies the file in which the group affiliation is configured.

13.4 Power Management (On/Off/Power-Cycle)


Details are provided in the “FA Agents – Installation and Administration“ manual.

13.4.1 General
The power shutdown concept of FlexFrame Autonomy provides an easy-to-configure
method for implementing secure shutdown of various hardware platforms. Various blade,
PRIMERGY and PRIMEPOWER systems (Midrange, Enterprise) can be installed
simultaneously in a FlexFrame. Each of these systems has different requirements which
must be taken into consideration for the power shutdown.
The FlexFrame Autonomy Control Agents make direct use of the Shutdown Agents from
the PRIMECLUSTER shutdown facility. These are part of the high-availability solution
PRIMECLUSTER (PCL) which is used on all Control Nodes of a FlexFrame system.
Normally, these agents are provided with their configuration information in the course of
the PCL configuration. However, in a FlexFrame solution only the Control Node PCLs are
configured; no configuration information on any Application Nodes exists in
PRIMECLUSTER.
The FlexFrame Autonomous Agents ascertain this lack of configuration information
automatically at runtime and then generate the configuration information required for the
agents.
Different configuration information is generated in accordance with the type of system for
which the power shutdown is performed.


The FlexFrame Autonomous Agents consequently create an agent-specific configuration


file:
SA_blade.cfg
for Blade configuration
SA_ipmi.cfg
for PRIMERGY configuration
SA_rsb.cfg
for PRIMERGY configuration
SA_xscf.cfg
for PRIMEPOWER configuration
SA_rps.cfg
for PRIMEPOWER (Midrange) configuration
SA_scon.cfg
for PRIMEPOWER (Enterprise) configuration

13.4.2 Architecture
The figure below provides an overview of the components involved and how these
interact in a FlexFrame environment.
The PRIMECLUSTER software runs on the two Control Nodes in a FlexFrame solution.
The FA Control Agent runs on the active Control Node defined by PCL. The FA
Application Agents provide the Control Agent with information on the computer type and
further information which is required for generic creation of configuration files, insofar as
this is technically possible and the information is unambiguous and can be ascertained
securely.
Information which cannot be ascertained generically must be entered manually in the
BrutForceShutdown config section of the myAMC_FA.xml file.
For further information on configuring the power shutdown manually, please see the "FA
Agents - Installation and Administration" manual.


Figure: FlexFrame Autonomy ACC Integration (the FA Control Agent on the active Control Node uses the PRIMECLUSTER Shutdown Agents SA_Blade, SA_IPMI, SA_XSCF and SA_SCON with their generated configuration files to control the Blade, PRIMERGY and PRIMEPOWER Application Nodes)
13.4.3 Configuring User, Password and Community


To use agent power shutdown, user, password and community must be defined in the
configuration of the FA Agent. This configuration is specified in the pool-specific
configuration file myAMC_FA_SD_Sec.xml. The entries for user, password and
community must be the same as those configured in the Application Nodes.
Details are provided in the manual “FA Agents - Installation and Administration”.

13.4.4 Configuring Management Blades


The management blades have to be configured. This is done in the Managementblades
configuration section of the myAMC_FA.xml configuration file. For further information on
Management Blades configuration, please see the “FA Agents - Installation and
Administration” manual.


13.5 Linux Kernel Crash Dump (LKCD) Utilities


In case of kernel panics on a Control Node, the Linux Kernel Crash Dump Utilities
(lkcdutils) can be configured to dump the memory to a crash dump file in the local file
system of the Control Node.
To activate this functionality, edit the /etc/sysconfig/dump file and set
DUMP_ACTIVE="1". Then run lkcd config from the shell or reboot the Control Node.
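As a rough sketch (the exact layout of /etc/sysconfig/dump depends on the lkcdutils version, and the grep output shown here is only illustrative), the activation might look like this:

control1:~# grep DUMP_ACTIVE /etc/sysconfig/dump
DUMP_ACTIVE="1"
control1:~# lkcd config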
However, there is a conflict between PRIMECLUSTER’s Shutdown Facility and the crash
dump itself: During a crash dump on one Control Node, PRIMECLUSTER on the other
Control Node can’t query the status of its partner and assumes that it has crashed.
Therefore it will issue an IPMI kill command to the Control Node that is currently writing its
crash dump. The IPMI power cycle is executed immediately and interrupts the crash
dump, which then becomes unusable.
If you expect kernel panics, you have two possibilities:

Disabling the Shutdown Facility


● This can easily be done by unplugging LAN A of the affected Control Node(s).
● Disadvantage: PRIMECLUSTER will not recover any Control Node faults. A kernel
panic may result into an inconsistent and/or faulted cluster without application take-
over.
This solution is ideal if kernel panics occur regularly or can be reproduced easily.

Adjust the interconnect timeout value.


● On both Control Nodes edit the file /usr/opt/reliant/bin/hvenv.local and
set HV_CONNECT_TIMEOUT=360.
● Also edit /etc/sysconfig/dump and set PANIC_TIMEOUT=0 to prevent an
automatic reboot after the dump.
● Reboot both Control Nodes one after another.
With this configuration, PRIMECLUSTER will wait 10 minutes and 45 seconds before
sending the IPMI kill command. This might not be enough, depending on the memory
size of your Control Node.
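In summary, the adjusted settings from the steps above might look as follows (only a sketch; whether entries in hvenv.local additionally require an export statement depends on your PRIMECLUSTER installation and is not shown here):

# /usr/opt/reliant/bin/hvenv.local (on both Control Nodes)
HV_CONNECT_TIMEOUT=360

# /etc/sysconfig/dump
PANIC_TIMEOUT=0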
If you want to measure the time needed for a crash dump, you can provoke one as follows:
control1: # echo 1 > /proc/sys/kernel/sysrq

Press ALT GR + SYSRQ(=PRINTSCREEN) + c to start a dump.


To see what happens, you should change to the syslog console by pressing
CTRL + ALT +F10 before starting the dump.


Remember to disable the Shutdown Facility before testing; otherwise the dump may be
interrupted if it takes too much time.
You should see increased activity of the local disks during the dump.
On the next reboot, the crash dump will be saved from the swap devices to
/var/log/dump.
To analyze this dump, change to this directory and call lcrash -n <number>, where
<number> has to be the suffix of the dump files, for example map.0, dump.0 and
kerntypes.0.



14 Data Protection – Backup and Restore
Data protection means backing up data and being able to recover it. You protect the data
by making copies of it so that it is available for restoration even if the original is no longer
available.
Reasons that businesses need data backups and protection systems include the
following:
● To protect data from accidentally deleted files, application crashes, data corruption,
and viruses
● To archive data for future use
● To recover from a disaster
Depending on your data protection and backup needs, Data ONTAP from Network
Appliance offers a variety of features and methods to ensure against accidental,
malicious, or disaster-induced loss of Filer data. The following list describes the Data
ONTAP online features to protect data. For detailed information see the “Network
Appliance Online Backup and Recovery Guide”.
Data protection features:
● Snapshot™
Backup within a volume.
This feature allows you to manually or automatically create, schedule, and maintain
multiple backups of data on a volume. Snapshots use only a minimal amount of
additional volume space on your Filer and do not reduce performance significantly.
If a user accidentally modifies or deletes crucial data on a volume with Snapshot
enabled, these data can be easily and quickly restored from one of the last
snapshots taken.
● SnapRestore® (license required)
Fast, space efficient restoration of large volumes of data backed up to snapshots.
The SnapRestore feature performs on-request snapshot recovery from snapshots on
an entire volume.


● SnapMirror® (license required)


Volume-to-volume and qtree-to-qtree replication.
This feature enables you to periodically create snapshots of data on one volume or
qtree, replicate that data to a partner volume or qtree, usually on another Filer, and
archive one or more iterations of that data as snapshots.
Replication on the partner volume or qtree ensures quick availability and restoration
of data, from the point of the last snapshot, in case the Filer of the original volume or
qtree should be unavailable.
If you conduct tape backup and archival operations, you can carry them out on the
data that were already backed up to the SnapMirror partner Filer, thus freeing the
original Filer of this time-consuming, performance-degrading chore.
● SyncMirror™ (cluster configuration required)
Continuous mirroring of data to two separate Filer volumes.
This feature allows you to mirror Filer data in real-time to matching volumes
physically connected to the same Filer head. In case of unrecoverable disk errors on
one volume, the Filer automatically switches access to the mirrored volume.
Filer cluster configuration is required for this feature.
● MetroCluster
SyncMirror functionality, enhanced to provide continuous volume mirroring across
distances from 500 meters to 10 kilometers.

14.1 Backup of Filer Volumes with NetApp Snapshot

14.1.1 Filer Volumes


A snapshot is a frozen, read-only image of the entire Data ONTAP file system that
reflects the state of the file system at the time the snapshot was created. Data ONTAP
maintains a configurable snapshot schedule that creates and deletes snapshots
automatically. Snapshots can also be created and deleted manually. It is possible to store
up to 255 snapshots at one time on each Filer volume. You can specify the percentage of
disk space that snapshots can occupy. The default setting is 20% of the total (both used
and unused) space on the disk.


A snapshot schedule should be configured for the following volumes in a FlexFrame 3.2
environment:
vol0
Root volume with Data ONTAP.
volFF
FlexFrame-specific volume with the OS images, FA Agents, scripts and pool-specific
data (e.g. SAP and DB executables, profiles etc.).
sapdata
FlexFrame-specific volume with all data files of the SAP databases.
saplog
FlexFrame-specific volume with all log files of the SAP databases.

14.1.2 Snapshot Schedules


When you install Data ONTAP on a Filer, it creates a default snapshot schedule. The
default snapshot schedule automatically creates one nightly snapshot Monday through
Saturday at midnight, and four hourly snapshots at 8 a.m., noon, 4 p.m., and 8 p.m. Data
ONTAP retains the two most recent nightly snapshots and the six most recent hourly
snapshots, and deletes the oldest nightly and hourly snapshots when new snapshots are
created.
For newly created volumes, the default snapshot schedule is not activated. In
FlexFrame environments this has to be done for the volumes volFF, sapdata
and saplog. Please use the FilerView GUI or the snap command to activate it.
There are three types of schedules that you can set up to run automatically using the
snap sched command:
Weekly
Data ONTAP creates these snapshots every Sunday at midnight.
Weekly snapshots are called weekly.n, where n is an integer. The most recent
weekly snapshot is weekly.0, and weekly.1 is the next most recent weekly
snapshot.
Nightly
Data ONTAP creates these snapshots every night at midnight, except when a weekly
snapshot is scheduled to occur at the same time.
Nightly snapshots are called nightly.n, where n is an integer. The most recent
nightly snapshot is nightly.0, and nightly.1 is the next most recent nightly
snapshot.


Hourly
Data ONTAP creates these snapshots on the hour or at specified hours, except at
midnight if a nightly or weekly snapshot is scheduled to occur at the same time.
Hourly snapshots are called hourly.n, where n is an integer. The most recent
hourly snapshot is hourly.0, and hourly.1 is the next most recent hourly
snapshot.
To display the snapshot schedule for one or all volumes on a Filer, enter the following
command:
filer> snap sched [volume_name]

Example:
filer> snap sched volFF
Volume volFF: 2 6 8@8,12,16,20

The result means that for volume volFF weekly (max. 2), nightly (max. 6) and hourly
(max. 8) snapshots will be created. The hourly snapshots will be at 8:00, 12:00, 16:00
and 20:00 h.
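To activate such a schedule on a newly created volume (see the note above), the same snap sched command can be used with explicit values. This is only a sketch; the figures merely repeat the default pattern and can be adapted to your needs:

filer> snap sched volFF 2 6 8@8,12,16,20
filer> snap sched sapdata 2 6 8@8,12,16,20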
You can create different snapshot schedules for different volumes on a Filer. On a very
active volume like saplog, schedule snapshots every hour and keep them for just a few
hours. For example, the following schedule creates a snapshot every hour and keeps the
last three:
filer> snap sched saplog 0 0 3

This schedule does not consume much disk space, and it lets users recover files in
recent snapshots as long as they notice their mistake within a couple of hours.

14.2 Backup of SAP Databases


SAP backup and recovery presents several challenges:
● Performance impact on the productive SAP system
Backups typically have a significant performance impact on the productive SAP
system because there is a high load on the database server, the storage system,
and the storage network during backups.
● Shrinking backup windows
Since conventional backups have a significant performance impact on the productive
SAP system, backups can be made only during times with low dialog or batch
activities on the SAP system. It becomes more and more difficult to define an
appropriate backup window when the SAP system is used 24x7.


● Rapid data growth


Databases are growing. Rapid data growth together with shrinking backup windows
results in ongoing investments in the backup infrastructure – more tape drives, new
tape drive technology, faster storage networks etc. Growing databases also require
more tape media or disk space for backups. Incremental backups can address these
issues, but result in a very slow restore process that usually is not acceptable.
● Increasing cost of downtime and decreasing mean time to recover
The mean time to recover (MTTR) is the time needed to recover from a database
failure (logical or physical error). The MTTR comprises two parts: the time that is
necessary to restore the database and the time that is necessary to do the forward
recovery of the database. The forward recovery time depends on the number of redo
logs that need to be applied after a restore. Unplanned downtime of an SAP system
will always have a financial impact on the business process. A significant part of the
unplanned downtime is the time that is needed to restore and recover the SAP
system in the case of a database failure. The backup and recovery architecture has
to be designed according to the maximum acceptable unplanned downtime.
● Backup and recovery time included in SAP upgrade projects
The project plan for an SAP upgrade always includes at least three backups of the
SAP database. The time needed to perform these backups will reduce the total
available time for the upgrade process.
A backup and recovery solution using a Network Appliance Filer will always consist of two
parts:
1. Backup and restore/recovery using Snapshot and SnapRestore
2. Backup and restore to/from a second location, which can be disks or tape
A backup to a second location will always be based on Snapshot copies created on the
primary storage. Therefore, the data will be directly read from the primary storage system
without generating load on the SAP database server. Several options to back up the data
to a second location are possible.
Snapshots can be used to create an online/offline backup of Oracle databases or an
online backup of SAPDB/MaxDB databases. With an online backup the Oracle database
has to be put in hot backup mode before the Snapshot copy is created. With an offline
backup the database is shut down before the Snapshot copy is created. SAPDB/MaxDB
databases are always consistent during online backups, but it is recommended to use the
GUI-based Database Manager from SAPDB/MaxDB to create backups.
Backups using the Snapshot technology directly are not visible in the SAP
systems (using transaction DB24 for database operations). This will cause at
least warnings during SAP Early Watch sessions. By using the SAP standard
backup tools brbackup and brarchive the backups will be visible in the SAP
system.


Example of an Oracle backup script using snapshots (excerpt):

##################################################################
# Name: sapsnap
# Create a snapshot
# Turning tablespaces in hot backup mode or shut down DB
#
# Syntax:    sapsnap <online|offline> SID
#
# Parameter: online   Turns all tablespaces in hot backup mode
#                     before doing the snapshot. Ends hot backup
#                     mode after creating the snapshot.
#            offline  Shuts down the DB before doing the snapshot.
#                     Starts the DB after creating the snapshot.
#
##################################################################
# functions
##################################################################

# Put the database tablespaces into BEGIN BACKUP (hot backup) mode
hotbackup_start (){
    echo "sqlplus \"/ as sysdba\" @/tmp/prebackup_${SID}.sql" > /tmp/prebackup_${SID}.sh
    echo "set feedback off;" > /tmp/prebackup_${SID}.sql
    echo "set pagesize 0;" >> /tmp/prebackup_${SID}.sql
    echo "alter system switch logfile;" >> /tmp/prebackup_${SID}.sql
    echo "spool /tmp/hotbackup${SID}start.sql;" >> /tmp/prebackup_${SID}.sql
    echo "select 'alter tablespace '||tablespace_name||' begin backup;' from dba_tablespaces;" >> /tmp/prebackup_${SID}.sql
    echo "spool off;" >> /tmp/prebackup_${SID}.sql
    echo "@/tmp/hotbackup${SID}start.sql;" >> /tmp/prebackup_${SID}.sql
    echo "exit;" >> /tmp/prebackup_${SID}.sql
    chmod 777 /tmp/prebackup_${SID}.sh
    su - ora${sid} -c /tmp/prebackup_${SID}.sh
    rm /tmp/prebackup_${SID}.sql
    rm /tmp/prebackup_${SID}.sh
    rm /tmp/hotbackup${SID}start.sql
}

# Take the tablespaces back out of hot backup mode (END BACKUP)
hotbackup_end (){
    echo "sqlplus \"/ as sysdba\" @/tmp/postbackup_${SID}.sql" > /tmp/postbackup_${SID}.sh
    echo "set feedback off" > /tmp/postbackup_${SID}.sql
    echo "set pagesize 0" >> /tmp/postbackup_${SID}.sql
    echo "spool /tmp/hotbackup${SID}end.sql" >> /tmp/postbackup_${SID}.sql
    echo "select 'alter tablespace '||tablespace_name||' end backup;' from dba_tablespaces;" >> /tmp/postbackup_${SID}.sql
    echo "spool off" >> /tmp/postbackup_${SID}.sql
    echo "@/tmp/hotbackup${SID}end.sql;" >> /tmp/postbackup_${SID}.sql
    echo "exit;" >> /tmp/postbackup_${SID}.sql
    chmod 777 /tmp/postbackup_${SID}.sh
    su - ora${sid} -c /tmp/postbackup_${SID}.sh
    rm /tmp/postbackup_${SID}.sh
    rm /tmp/hotbackup${SID}end.sql
    rm /tmp/postbackup_${SID}.sql
}

# Rotate the existing snapshots and create a new one (database volume only)
online_snap (){
    rsh ${DFILER} snap delete ${DATAVOL} sap_online_${SID}_old2
    rsh ${DFILER} snap rename ${DATAVOL} sap_online_${SID}_old1 sap_online_${SID}_old2
    rsh ${DFILER} snap rename ${DATAVOL} sap_online_${SID}_new sap_online_${SID}_old1
    rsh ${DFILER} snap create ${DATAVOL} sap_online_${SID}_new
}

For more information about backup of SAP systems, see the NetApp Technical Library
report 3365 about “SAP Backup and Recovery with NetApp Filers”:
http://www.netapp.com/tech_library/ftp/3365.pdf

14.3 Restore SnapShot


You might need to restore a file from a snapshot if the file was accidentally erased or
corrupted. If you have purchased the SnapRestore license, you can automatically restore
files or volumes from snapshots with one command.
To restore a file from a snapshot, execute the following steps.
1. If the original file still exists and you don’t want it to be overwritten by the snapshot
file, then use your UNIX client to rename the original file or move it to a different
directory.
2. Locate the snapshot containing the version of the file you want to restore.
3. Copy the file from the .snapshot directory to the directory in which the file originally
existed.
Example with SnapRestore license:
filer> snap restore -t file /vol/vol0/etc/testfile -s nightly.0

filer> WARNING! This will restore a file from a snapshot into the
active filesystem. If the file already exists in the active
filesystem, it will be overwritten with the contents from the
snapshot.

Are you sure you want to do this? Y

You have selected file /vol/vol0/etc/testfile, snapshot nightly.0

Proceed with restore? y


Example with copying the same file from the .snapshot directory using a Control Node
(no SnapRestore license required):
control1> df -k
Filesystem            1K-blocks     Used  Available Use% Mounted on
filer:/vol/vol0/       50119928   620012   49499916   2% /FlexFrame/vol0

control1> cd /FlexFrame/vol0/etc

control1> cp /FlexFrame/vol0/.snapshot/nightly.0/etc/testfile testfile

Copying will allocate new space while snap restore will use the old data blocks.
The data on volume volFF is shared among multiple SAP systems. Restoring
the complete volume has an effect on all SAP systems of the complete
FlexFrame landscape.
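If restoring a complete volume is nevertheless required and a SnapRestore license is available, the restore can also be done at volume level. The following is only a sketch; the volume and snapshot names are examples and must be adapted to your environment:

filer> snap restore -t vol -s nightly.0 sapdata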

14.4 FlexFrame Backup with Tape Library

14.4.1 Arcserve
Detailed information on a backup solution with Arcserve is available at
http://extranet.fujitsu-siemens.com/com/ep/storage/management/brightstor/arcserve_backup/general/arcserve_with_flexframe
as well as at the bottom of
http://extranet.fujitsu-siemens.com/flexframe


14.4.2 NetWorker
This concept is based on snapshots and uses the NDMP protocol for transferring data
from the NetApp Filer directly to the tape library. As a backup tool NetWorker is used
including add-on products like NSR-ORA-NDMP. A dedicated backup server will be used
for maintaining NetWorker.
Figure: Configuration example for Oracle

Detailed information on
● FlexFrame Backup with NetWorker
● slidesets for different target groups, also with implementation and configuration
information
● order units - example
will be provided by Fujitsu Siemens Computers Storage Consulting and is available at
http://extranet.fujitsu-siemens.com/com/ep/storage/solutions/it-solutions/FlexFrame-Backup
as well as at the bottom of
http://extranet.fujitsu-siemens.com/flexframe


14.5 Backup / Restore of FlexFrame Control Nodes

14.5.1 Backup of a Control Node


Most of the files on a Control Node are shipped with the installation DVD. Necessary data
can be backed up using ff_cn_backup.sh.

Synopsis

ff_cn_backup.sh [-dr] [-f <file_name>] [-l <list>]

Command Options
-d Debug messages to /tmp/ff_cn_backup.DEBUGLOG
-r Restore
-f <file_name>
Use file name for restore
-l <list>
Blank separated list of files (quoted)
The default behavior (without any options) is to back up the configuration files in a zipped
tar file at /FlexFrame/volFF/FlexFrame/backup/<name_of_cn><date>.tgz.
Additional files can be backed up using the -l option.
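A typical backup call might look like this (only a sketch; the file names given with -l are arbitrary examples):

control1:~# ff_cn_backup.sh -l "/etc/ntp.conf /etc/resolv.conf"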
The Control Nodes are not intended to have additional software installed on them.
Therefore a backup in the regular sense is not required. Installation from the DVD
is much faster than any restore.

14.5.2 Restore of a Control Node


A restore can be done after installation using the Installation DVD along with the
configuration media. Make sure to install the same patches and RPM packages of the
FlexFrame tools as before.
After the installation, ff_cn_backup.sh must be called using the option -r.
By default, the restore will pick the latest backup it can find. If this should not be the
appropriate backup file, use the option -f <file_name>.
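A restore call might then look like this (only a sketch; the backup file name is a fictitious example following the <name_of_cn><date>.tgz pattern):

control1:~# ff_cn_backup.sh -r -f /FlexFrame/volFF/FlexFrame/backup/control120060510.tgz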


In detail, follow these steps to restore a Control Node:


1. Boot the Control Node Installation DVD.
2. Provide the configuration files (*.conf) with the USB stick or floppy. They may be
obtained manually from the /opt/FlexFrame/etc folder of the remaining Control
Node or from a backup file.
You will get a warning message about wrong checksums if you reuse
configuration files which had already been installed on a Control Node. This
warning message is only intended to give you a hint to double-check the
configuration files. It can be ignored safely.

3. Abort the installation at the first “<YES> <NO>” screen with the title “FlexFrame(TM)
Setup for Control Nodes”.
4. Enter s to get into a subshell.
5. Install the latest patch set.
6. Reboot the Control Node using reboot -f
7. Again the setup screen will appear.
8. If the passwords are requested, enter the current passwords.
9. Enter the correct Control Node (first or second) for this Control Node during
installation.
10. Once the system is booted execute the following command:
control1:~# ff_cn_folders.pl -notrans

11. Check the access to /FlexFrame/volFF/FlexFrame/backup


12. Use ff_cn_backup.sh with the option -r and optionally -f <file_name> to
restore the original contents of the Control Node's configuration files.
13. Compare the two Control Nodes using the command:
control1:~# ff_cn_cmp.sh

If there are differences in files listed, copy them over from the other Control Node
using:
scp -p control2:/<path> /<path>

14. Reboot the Control Node using the init 6 command.


15. The Control Node should now be ok.


14.6 Backing Up Switch Configurations


Due to the operation of a FlexFrame landscape, the switch configuration may differ from
the configuration at the time of installation. New Application Nodes and switches may be
added, ports configured and so on. For general system backup, there are a lot of
well-known products and programs.
To simplify backing up the configuration of all switches used within the FlexFrame
landscape, the program /opt/FlexFrame/bin/ff_save_switch_config.pl should
be used. It is designed to be run by cron(8); it is recommended to call it once a day.
The program scans the LDAP database for switches and switch blades. It connects to the
switches and the switch blades and stores the running or startup configuration via TFTP
into /tftpboot/<switch host name>.config on the Control Nodes. The switch host
name is derived from LDAP, not from the switch being backed up. The program is able to
back up all switches (this is the default) or a single switch given by its host name as known in LDAP.

Synopsis

ff_save_switch_config.pl [--silent] [--running] [--name <switch_name>] [--ip <control_node_ip>]

Command Options
--silent
The silent mode suppresses any message, even error messages.
--running
Normally the configuration which the switch reads on booting, the so-called
startup-config, is backed up. With the --running option the current running switch
configuration is backed up instead. For a Quanta BX600 switch blade (which has six
external ports), only the running switch configuration is backed up.
--name <switch_name>
Back up only the switch with the given name. Keep in mind to use the name as known
in the LDAP database; it may differ from the current switch name if it was not synchronized.
--ip <control_node_ip>
Use the Control Node with the given Control LAN IP address to save the configurations
to. Normally the program is able to detect the Control Node running the netboot_srv
service itself. If it is not able to detect the active node or cannot determine the Control
LAN IP of that Control Node, the program requests the use of the --ip option.
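To run the backup once a day as recommended, an entry in the system crontab of the Control Nodes can be used. The following line is only a sketch; the schedule (02:15) is an arbitrary example:

# /etc/crontab
15 2 * * *   root   /opt/FlexFrame/bin/ff_save_switch_config.pl --silent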


14.7 Restoring Switch Configuration


To restore a switch configuration, select the proper configuration file from /tftpboot and
follow the instructions in the chapter “Switch Configuration” of the “Installation Guide”.
For switch blades the configuration may alternatively be created with
/opt/FlexFrame/bin/ff_bx_cabinet_adm.pl using operation mode swb-config
(see section 8.5.7).



15 Error Handling & Trouble Shooting

15.1 Log Files


For easy gathering of log file information on all Application Nodes in FlexFrame, it is not
necessary to log in on each Application Node, because all log files are directly readable at
the Control Nodes on the mounted Filer.
To make it easier for the administrator to read these log files, it is useful to create symbolic
links on the Control Nodes in the directory /var like this:

control1:/ # cd /var
control1:/var # ls -ld log*
drwxr-xr-x 20 root root 1440 May 2 14:01 log
lrwxrwxrwx 1 root root 58 Apr 29 17:28
log_pool1_klinge1 -> /FlexFrame/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/var_img/var-ac10020e/log
lrwxrwxrwx 1 root root 58 Apr 29 17:29
log_pool1_klinge2 -> /FlexFrame/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/var_img/var-ac10020f/log
lrwxrwxrwx 1 root root 58 Apr 29 17:30
log_pool1_klinge3 -> /FlexFrame/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/var_img/var-ac100210/log
lrwxrwxrwx 1 root root 58 Apr 29 17:30
log_pool1_klinge4 -> /FlexFrame/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/var_img/var-ac100211/log
lrwxrwxrwx 1 root root 58 Apr 29 17:30
log_pool1_klinge5 -> /FlexFrame/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/var_img/var-ac100212/log
lrwxrwxrwx 1 root root 58 Apr 29 17:31
log_otto_RX300-01 -> /FlexFrame/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/var_img/var-ac100113/log
lrwxrwxrwx 1 root root 58 Apr 29 17:33
log_otto_RX300-02 -> /FlexFrame/volFF/os/Linux/FSC_3.2B00-
000.SLES-9.X86_64/var_img/var-ac100114/log
control1:/var #

To read the /var/log/messages file of Application Node klinge5 in pool pool1,
invoke this:
control1:/ # cd /var
control1:/var # less log_pool1_klinge5/messages


Or enter the log directory of this Application Node and look for log files:
control1:/ # cd /var/log_pool1_klinge5
control1:/var/log_pool1_klinge5 # ls -lrt|tail
-rw-r--r-- 1 root root 278 May 4 16:23 log.scagt
-rw-r--r-- 1 root root 174 May 4 16:24 log.vvagt
-rw-r--r-- 1 root root 257 May 4 16:24
log.statusagt
-rw-r--r-- 1 root root 612 May 4 16:28 ntp
-rw-r--r-- 1 root root 5542 May 4 16:35 warn
-rw-r--r-- 1 root root 1696 May 4 16:35 auth.log
-rw-r--r-- 1 root root 484 May 4 18:16 log.busagt
drwxr-xr-x 2 root root 4096 May 4 18:19 sa
-rw-r--r-- 1 root root 27622 May 4 18:19 messages
prw------- 1 root root 0 May 4 18:24 psadfifo
control1:/var/log_pool1_klinge5 #

15.2 Network Errors


In most cases, network problems are caused by configuration mistakes. All switches
except the switch blades are configured to send SNMP traps and log via syslog to Control
Nodes. To see what happened, look at the Control Nodes’ /var/log/messages. Any
conditions reported by the switches can be found here. For SNMP traps look at the FA
Agents support database. The FA Agents collect all kinds of SNMP traps for the entire
FlexFrame environment.

15.3 NFS Mount Messages


During start/stop procedures of SAP instances, NFS mount failures for /usr/sap/SYS
and /oracle/db_sw can be seen in /var/log/messages. These messages are not
FlexFrame-specific failures, but may occur on any system that has /usr/sap and
/oracle in its automount maps. These messages are caused by the way SAP links
binaries during the software development process.


15.4 LDAP Error Codes and Messages


LDAP failures may have various reasons. To determine the reason, first have a look at
the Control Node’s /var/log/messages. If slapd does not report anything, you may
have to increase the log level configured in /etc/openldap/slapd.conf. Log
levels are additive, and the available levels are:

Log level Meaning


1 Trace function calls
2 Debug packet handling
4 Heavy trace debugging
8 Connection management
16 Print out packets sent and received
32 Search filter processing
64 Configuration file processing
128 Access control list processing
256 Stats log connections/operations/results
512 Stats log entries sent
1024 Print communication with shell backends
2048 Entry parsing

Most LDAP problems are related to access control and connection management.
If you have changed the log level, send a -HUP signal to slapd to advise it to reread its
configuration files (use pkill -HUP slapd).
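For example, to trace connection management and access control list processing, the corresponding values from the table above are added up (a sketch only):

# /etc/openldap/slapd.conf
# 8 (connection management) + 128 (access control list processing) = 136
loglevel 136

control1:~# pkill -HUP slapd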
Another reason may be the LDAP client configuration. Check the following issues:
● Does the client use the correct server address?
● Does it use the correct user and password (Solaris only)?
Look at the client messages file to get this information.
If ldapsearch -x on Linux and ldaplist on Solaris work properly, the problem may
be the nsswitch configuration. These are the most common problems.


When reporting problems with the administration scripts, please provide the
following information:
● The error message of the admin tool
● A dump of the LDAP database (ldapsearch -x > <file name>)
● The configuration files /etc/openldap/slapd.conf and
/FlexFrame/volFF/FlexFrame/ldap/common/slapd.acl.conf

15.5 FA Agents Error Diagnosis


The FA Agents offer a large number of diagnostic options for detecting and diagnosing
problems on the FA Agents themselves or other components.
Problems concerning FA Agents can be assigned to one of the following categories:
● FlexFrame installation and configuration errors
● Parameter errors
● Configuration errors
● Detection, reaction errors, start, stop, maintenance errors
● Power shutdown errors

Typical consequences of installation and configuration errors are:


● FA Agents fail to start
● Error messages during startup of FA Agents

Error: Mount points missing


Diagnosis:
In the case of missing mount points monitored by FA Autonomy, traps are sent to
the central trap consoles. With other mount points which are absolutely essential for
the operation of the node in question, it can happen that the agents cannot be
started as the directories required are not available.
Response:
Provide the mount points required with the appropriate mount options.


Error: Mount points without “File Locking”


Diagnosis:
The FA Agents log this situation both in the operating system’s Syslog and in special
files (/opt/myAMC/vFF/log/log_syslog*).
Response:
Provide the mount points required with the appropriate mount options (lock).

Error: Rights for the directories/files are not sufficient


Diagnosis:
In the case of files monitored by FA Autonomy, traps are sent to the central trap
consoles. With other directories/files which are absolutely essential for the operation
of the node concerned, it can happen that the agents cannot be started as the
directories/files required are not available.
Response:
Provide the directories/files with the required rights.

Error: Agents do not have the authorization to write to the directories assigned
Diagnosis:
The FA Autonomy production and log files are not written.
Response:
Provide the directories/files with the required permissions.

Error: Version incompatibility


Diagnosis:
FlexFrame installation and FlexFrame Autonomy installation are not directly
compatible. This can always be the case when older FlexFrame installations are
updated with new FlexFrame Autonomous Agents.
Response:
For diagnosis and troubleshooting, the mount points, the directory structure and the
access rights to the directories used by the agents must be checked.
Use the migration tool, check that the parameters used in the FA config files are
compatible with the version and syntactically correct.


Error: Pool assignment not found


Displayed in the FA WebGUI or in the agent’s start trap and on an event console.
Diagnosis:
A node is assigned to the wrong pool or to the default pool.
Response:
Check the LDAP configuration parameters, call PGTool_Pool.sh and check the
pool name returned. Check the pool membership for each node.

Error: Group assignment is not correct


Display on the FA WebGUI or in the agent’s start trap and display on an event console.
Response:
Check the group configuration in the group configuration file. Check the group
membership for each node.

Error: Service priority not recognized


Display on the FA WebGUI or in the agent’s start trap and display on an event console.
Response:
Check the configuration of the service class and service priority in the group
configuration file. Check the values for each node.

Error: Availability problem not rectified by autonomous reaction


Diagnosis:
Services are discontinued (possibly due to a hardware fault) and are not made
available again by FlexFrame Autonomy.
Response:
Check whether nodes are available for taking over the services (Spare Nodes).
Check whether the FA Agents on the nodes involved have been started.


Error: Services do not start - Constant reboot, Permanent switchover


Diagnosis:
SAP services which are started do not enter run mode but are repeatedly restarted
or, if the problem escalates, the node is rebooted or an internal switchover takes
place. Possible causes:
– The MaxRestart time for the service is too short. This parameter can be
adjusted in the FA configuration.
– The virtual interfaces cannot be reached.
– There is a permanent problem which prevents a service being started (e.g.
necessary database recovery).
Response:
Stop the FA Agents to interrupt escalation of the reaction and check whether the
service can be started manually.
If the service cannot be started manually, this problem must be corrected by the
administrator.
If the service can be started manually, the time required for this must be matched to
the MaxStart time and MaxRestart time in the configuration and the
configuration must be adjusted, if necessary.
If the virtual interfaces cannot be reached from the Application Agent, the network
configuration must be checked.

Error: Service cannot be stopped


Diagnosis:
An active SAP service is repeatedly restarted after a manual stop command.
Response:
The problem could result from not using the FlexFrame SAP scripts to stop the service
manually. The FlexFrame SAP scripts must be used for every action concerning
FlexFrame services.
If the FlexFrame SAP scripts were used, the Monitor Alert Script might not be
available or does not have the required rights.
However, it is also possible that the Monitor Alert Time and CycleTime are
configured incorrectly. The agents’ CycleTime is too long in relation to the Monitor
Alert Time. The Monitor Alert Time must be at least 3 times the CycleTime.


Error: Maintenance activities are interrupted by autonomous reactions


Diagnosis:
Unwanted autonomous reactions during maintenance.
Response:
Set NoWatch for the service in question or stop the Application Agents for the node
concerned and restart them after maintenance has been completed.

Error: Incorrect Display on the FA WebGUI


Diagnosis:
The state checked manually does not match the display.
Response:
Check whether the Application Agents in question, the Control Agents concerned
and the web server are running properly for the WebGUI. The log files of the
FlexFrame Autonomous Agents can be used for the diagnosis.
The FlexFrame Autonomous Agents write detailed log files. The functions of the FA
Agents are documented in their own files. These files are created dynamically during
ongoing operation and may not be modified manually as this can impair fault-free
operation of the FA Agents or lead to erroneous reactions. Deleting these files
results in a state in which the Autonomous Agents reorganize themselves and, from
this point on, analyze the situation from their current viewpoint without any previous
information.

15.6 FA Agents Operation and Log Files

15.6.1 General
The activities and dynamic states of the FA Agents are documented in various files.
These files may not be changed manually as this can impact error free operation of the
FA Agents or result in incorrect reactions.
These files are created dynamically during ongoing operation. Deleting these files leads
to a status in which the Autonomous Agents reorganize themselves, and from this point
they re-evaluate the situation from the current viewpoint without any previous knowledge.


15.6.2 Overview, important Files and Directories


This section shows a number of FF specific files and directories and their respective
content.
Base directory is /opt/myAMC/.
Version numbers: In the list below, V<v>K<r> corresponds to
V<version number>K<revision number>.

Subdirectories and their content:

./scripts
    Scripts for various tasks
./scripts/sap
    Link to the FlexFrame scripts
./scripts/acc
    Scripts for the SAPACC Interface
./scripts/PowerMng
    Scripts for the power management blades
./scripts/ShutDown_Node
    Scripts to shut down a node
./config
    General configuration data
./config/FA_WebGui.conf
    General settings for the WebGUI (directories, cycle times, database settings)
./config/amc-users.xml
    User management
./FA_AppAgent
    Installation path of myAMC.FA_AppAgent and of diverse scripts
./FA_AppAgent/myAMC.FA_AppAgent
    Start/stop script of the Application Agent
./FA_AppAgent/PGTool_Pool.sh
    Determination of pool membership
./FA_AppAgent/PGTool_Version.sh
    Determination of pool version
./FA_AppAgent/PVget.sh
    Determination of the SAPS number of a node
./FA_AppAgent/BBTool.sh
    BlackBoard control
./FA_AppAgent/BBT_dialog.sh
    BlackBoard dialog mode control
./FA_AppAgent/bin_Solaris_V<v>K<r>
./FA_AppAgent/bin_Linux_V<v>K<r>
./FA_AppAgent/bin_Linux_SLES9_V<v>K<r>
./FA_AppAgent/lib_Solaris_V<v>K<r>
./FA_AppAgent/lib_Linux_V<v>K<r>
./FA_AppAgent/lib_Linux_SLES9_V<v>K<r>
    Binaries and libraries for Solaris and Linux, for each version
./FA_AppAgent/config
    myAMC.FA_AppAgent-specific configuration data
./FA_AppAgent/log
    empty
./FA_CtrlAgent
    Installation path of myAMC.FA_CtrlAgent and of scripts
./FA_CtrlAgent/myAMC.FA_AppAgent
    Start/stop script of the Control Agent
./FA_CtrlAgent/PGTool_Pool.sh
    Determination of pool membership
./FA_CtrlAgent/PGTool_Version.sh
    Determination of pool version
./FA_CtrlAgent/PVget.sh
    Determination of the SAPS number of a node
./FA_CtrlAgent/BBTool.sh
    BlackBoard control
./FA_CtrlAgent/BBT_dialog.sh
    BlackBoard dialog mode control
./FA_CtrlAgent/bin_Solaris_V<v>K<r>
./FA_CtrlAgent/bin_Linux_V<v>K<r>
./FA_CtrlAgent/bin_Linux_SLES9_V<v>K<r>
./FA_CtrlAgent/lib_Solaris_V<v>K<r>
./FA_CtrlAgent/lib_Linux_V<v>K<r>
./FA_CtrlAgent/lib_Linux_SLES9_V<v>K<r>
    Binaries and libraries for Solaris and Linux, for each version
./FA_CtrlAgent/config
    myAMC.FA_CtrlAgent-specific configuration data
./FA_CtrlAgent/log
    empty
./vFF
    Pool-specific (vFF) data
./vFF/log
    Pool-specific log files
./vFF/Common/myAMC_FA_Pools.xml
./vFF/Common/myAMC_FA_Pools-default.xml
    Pools configuration file and its default version; this is used as LDAP cache
./vFF/Common/.vFF_template.V<v>K<r>
    Template of pool-specific data for each version
./vFF/Common/.vFF_template.V<v>K<r>/config
    Configuration
./vFF/Common/.vFF_template.V<v>K<r>/config/TrapTargets.xml
    Trap targets
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA.xml
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA-default.xml
    myAMC.FA configuration and default
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA_ACC.xml
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA_ACC-default.xml
    myAMC.FA ACC configuration and default
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA_GUI.xml
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA_GUI-default.xml
    myAMC.FA GUI configuration and default
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA_Groups.xml
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA_Groups-default.xml
    myAMC.FA groups configuration and default
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA_SD_Sec.xml
./vFF/Common/.vFF_template.V<v>K<r>/config/myAMC_FA_SD_Sec-default.xml
    myAMC.FA shutdown security configuration and default
./vFF/Common/.vFF_template.V<v>K<r>/log
./vFF/Common/.vFF_template.V<v>K<r>/log/AppAgt
./vFF/Common/.vFF_template.V<v>K<r>/log/CtlrAgt
    Logfiles of myAMC.FA_AppAgent and myAMC.FA_CtrlAgent for each pool
./vFF/Common/.vFF_template.V<v>K<r>/data
./vFF/Common/.vFF_template.V<v>K<r>/data/FA
    Work files
./vFF/Common/.vFF_template.V<v>K<r>/data/FA/livelist
    Livelist: livelist.log
./vFF/Common/.vFF_template.V<v>K<r>/data/FA/xmlrepository
    XML repository for the web interface: livelist.xml, Services_<node name>.xml
./vFF/Common/.vFF_template.V<v>K<r>/data/FA/servicelists
    Service lists: Services_<node name>.lst
./vFF/Common/.vFF_template.V<v>K<r>/data/FA/servicelogs
    Service logs (history): Services_<node name>.log
./vFF/Common/.vFF_template.V<v>K<r>/data/FA/reboot
    Reboot files: Reboot_<node name>.lst
./vFF/Common/.vFF_template.V<v>K<r>/data/FA/switchover
    SwitchOver files: SwitchOver_<node name>.lst
./vFF/Common/.vFF_template.V<v>K<r>/data/FA/blackboard
    BlackBoard: blackboard.txt
./vFF/Common/.vFF_template.V<v>K<r>/data/FA/performance
    Measured performance data
./vFF/vFF_Cust_1
./vFF/vFF_Cust_1/config
./vFF/vFF_Cust_1/log/…
./vFF/vFF_Cust_1/data
./vFF/vFF_Cust_1/data/FA/…
    Pool-specific data for pool “Cust_1” (example); see above for the description of the subdirectories and files
./vFF/vFF_Cust_2
./vFF/vFF_Cust_2/config
./vFF/vFF_Cust_2/log/…
./vFF/vFF_Cust_2/data
./vFF/vFF_Cust_2/data/FA/…
    Pool-specific data for pool “Cust_2” (example); see above for the description of the subdirectories and files

15.6.3 Special Files


The write cycle for the entries (with the exception of reboot, switchover and
BlackBoard) and the storage location of the files described in the following are defined
using a parameter in the configuration file myAMC_FA.xml.


15.6.3.1 Livelist
Each FA Application Agent regularly enters itself in this list. Through these entries the
Control Agent recognizes whether the various Application Agents are available and
functioning without error.

15.6.3.2 Services List


This file (testament) exists for each FA Application Agent on a node-specific basis. In this
file, the FA Application Agent logs the services detected by its internal detector together
with their current status, so the status information in this file is service-related.
The content is updated with every detector cycle.

15.6.3.3 Services Log


The contents of this file are identical to those of the Services-List file, with the
difference that this file also contains the history. This enables status changes and
reaction decisions to be traced and reproduced.

15.6.3.4 Reboot
The contents of this file are identical to those of the Services-List file. The file serves
as information storage when a reboot takes place. It is written only for the autonomous
reaction reboot and is deleted again after the reboot has been completed and the
services have been started up.

15.6.3.5 Switchover
The contents of this file are identical to those of the Services-List file. The file serves
as information storage (testament) when a switchover takes place. It is written only for the
autonomous reaction switchover and is deleted again after the services were taken
over.

15.6.3.6 XML Repository


In terms of contents, the files in the XML Repository are the same as those in the Livelist
and Services List. By contrast, the contents are written in XML notation and can thus be
visualized directly with the associated FA WebGUI. The write cycle for the entries and the
storage location of the file are defined using a parameter in the configuration file
myAMC_FA.xml.


15.6.3.7 BlackBoard
The BlackBoard is an input interface for the FA Agents. Commands can be entered here
which are executed by the FA Application Agents. The commands have a specific validity
period and are secured against manipulation. The file is written manually using a tool
which guarantees, among other things, protection against manipulation.

15.6.4 FA Autonomy Diagnostic Tool


Manual diagnosis of the log files can be very time-consuming. The Fujitsu Siemens
Computers support organization works with specialized diagnostic tools which can
analyze even large quantities of data very quickly and efficiently. This service can be
utilized when required and if a corresponding service agreement exists.
To utilize this service, either individual log files or the entire virtual FA directory of a pool
can be sent to the support department, e.g. as a compressed and protected zip archive.

15.6.5 Data for Diagnosis in the Support Department


If support is required, the FlexFrame support team needs specific data. This
information is required to analyze problems with FlexFrame and the Autonomous Agents.
● Error description, as precise as possible
What is the problem or error? On which nodes does it occur?
● Version of the FA Agents installed
Run “rpm -qa | grep myAMC“ on the Control Node.
● Configuration, work and log files of the FA Agents
The following script creates an archive with the desired information:
/opt/myAMC/FA_CtrlAgent/SAVE_FA_files_for_diag.sh

This script has to be invoked from a Control Node!


cd /opt/myAMC/FA_CtrlAgent
./SAVE_FA_files_for_diag.sh

The functions of the FA Agents are documented in various files. These files may not be
changed manually as this can impair error-free operation of the FA Agents or result in
incorrect reactions.
These files are created dynamically during ongoing operation. Deleting these files leads
to a status in which the Autonomous Agents reorganize themselves, and from this point
they re-evaluate the situation from the current viewpoint without any previous knowledge.


For further information on collecting diagnosis data see “FA Agents - Installation and
Administration“ manual, section 4.7.3.

15.7 Start/Stop Script Errors


While executing the SAP service start/stop scripts, several errors may occur for various
reasons. They can be grouped into error classes such as:
● wrong parameters invoked
● installation errors of the SAP service (e.g. virtual host names, instance profiles,
service entries)
● functionality errors with interface configuration
● logical errors (e.g. with start_flags)
The $-variables in the messages shown below depend on the system ID,
instance number and instance type.

15.7.1 Common Error Messages for all Start/Stop Scripts


This section lists common error messages delivered by the start/stop service scripts:

$SERVICE_SCRIPT_PATH/sapservice_config does not exist;


you are possibly not on an Application Node!
All start/stop scripts use a common configuration file; in a FlexFrame 3.2
environment it is pool-specific.

Wrong parameter count


Too few or too many parameters were given; please refer to the usage information.
For clean up:
$v_service_text ${ID} ${SID} have still running processes.
For sure to kill running processes on $v_service_text ${ID}
${SID}? (yes/no):

Clean up is the last-resort method to kill processes that are still running and to free
shared memory and semaphores occupied by this SAP service. You should never
invoke this if you are not really sure!

homedir from $v_user not found in passwd.


For the OS user $v_user there is no home directory in /etc/passwd or
getent passwd.


homedir $v_home from $v_user not exist.


The home directory $v_home of $v_user does not exist.

Interface errors

no interface defined for $v_lan_type.


In /FlexFrame/scripts/sapservice_config it must be defined which
interface is to be used for which LAN type.

no IPMP interface found for $v_interface.


No IPMP interface is configured for Solaris. Please check with ifconfig -a.

no netmask defined for $v_lan_type.


The netmasks to be used for each LAN type have to be defined in
/FlexFrame/scripts/sapservice_config.

$v_lan_type interface $v_interface:$I ypcat/getent failure for


host $v_vhost.
getent hosts delivers no entry for this virtual host name $v_vhost.

$v_lan_type interface $v_interface_n $MY_IP is already up.


This is a warning only, not an error.

my ${v_lan_type} ip-addr $MY_IP is already in use, ping gets


answers from a foreign interface.
While configuring a virtual interface to start a SAP service, it is checked that the
virtual IP is not already in use.

$v_lan_type interface $v_interface:$I $MY_IP is not up.


The newly configured interface seems not to work properly for ping/arping.

$v_lan_type interface $v_interface: no free interface found.


This message occurs when all 64 virtual interfaces are in use. There may be a
maximum of 64 virtual interfaces on one physical interface.


$v_interface_n $MY_IP is already down.


This warning occurs while trying to shut down an already deconfigured virtual
interface.

$v_interface_n $MY_IP is not down.


Shutting down a virtual interface has failed.

start_flag errors

$v_service_text $ID $SID is possibly running on another host:


$v_vhost.

$v_service_text $ID $SID is possibly not started.

$v_service_text $ID $SID is already running on this host $v_vhost.

$v_service_text $ID $SID is already running on another host:


$v_vhost, MY_HOST: $MY_HOST.
The expected situation does not correspond to the situation documented in the start
flag files *_host.

announce_start_flag errors

$v_service_text $ID $SID is possibly already started.

$v_service_text $ID $SID is possibly already starting.


A concurrent start situation for one SAP service has occurred. Only one attempt
to start it can be successful, the other one will fail with this message.

15.7.2 SAPDB Specific Error Messages


This section lists common error messages delivered by the start/stop service scripts.

we have neither a ABAP CI nor a Java JC instance profile for


$vhostname_ci!
To gather the dbms_type (ORA, ADA) from the $<sid>adm environment, the
script looks for instance profiles to get the appropriate virtual host name and to
set the corresponding environment. If no instance profile exists or if none is
reachable via NFS, this message will occur.


dbms_type is not set!


The dbms_type could not be determined.

Database /sapdb/$SID not found!


or
Database /oracle/$SID not found!
This directory does not exist or is not reachable via NFS.

Unknown Database Type !


The dbms_type is neither ORA nor ADA. The sapdb script supports only these
two database types.

$SIDADM_HOME/.dbenv_${vhostname_ci}.csh not found for SID $SID


Instance $ID !
and
$SIDADM_HOME/.sapenv_${vhostname_ci}.csh not found for SID $SID
Instance $ID !
The database is started as ${sid}adm, hence the environment files for this user
have to exist.

For MaxDB or SAPDB Databases (i.e. dbms_type ADA):

Database /sapdb/$SID has no appropriate XUser found!


xuser list as ${sid}adm delivers neither a “c” nor a “c_J2EE” key. Please
check /home/${sid}adm/.XUSER.62 for a correct installation.

For Oracle Databases (i.e. dbms_type ORA):

tns-listener ${SID} is possibly not started.


The TNS listener process is not running. Please check the TNS listener configuration.

For Oracle Databases (i.e. dbms_type ORA):

tns-listener is possibly not correct configured for ${SID}.


tnsping has failed. Please check the TNS listener configuration.


$DB_SERVICE ${SID} not started.


startdb as ${sid}adm has failed, the reason can be found in the original
start logfile os4adm/startdb.log.

$DB_SERVICE ${SID} is possibly not stopped.


stopdb as ${sid}adm has failed, the reason can be found in the original stop
logfile os4adm/stopdb.log.

For dbms_type ADA:

vserver ${SID} is possibly not stopped.

For dbms_type ORA:

tns-listener ${SID} is possibly not stopped, we kill them.

15.7.3 Sapci-specific Error Messages


This section lists common error messages delivered by the start/stop service scripts.

System /usr/sap/$SID not found!


The directory /usr/sap/$SID does not exist or is not reachable via NFS.

No startprofile found for SID $SID $CI_SERVICE !


and
No instance profile $v_profile found for SID $SID $CI_SERVICE !
The profiles in directory /sapmnt/${SID}/profile/ must be reachable.

$CI_SERVICE ${ID} ${SID} could not determine the System Nr


(SAPSYSTEM) in $v_profile !
The instance number is the value of the instance parameter SAPSYSTEM, which must
be present in the profile. The profile name is represented by $v_profile here.


$SIDADM_HOME/.dbenv_${vhostname}.csh not found for SID $SID Instance $v_sysnr !
and
$SIDADM_HOME/.sapenv_${vhostname}.csh not found for SID $SID Instance $v_sysnr !

The SAP instance is started as ${sid}adm; hence the environment files for this user must exist. The instance number is $v_sysnr here.

One or more /etc/services entries
sapms$SID sapgw$v_sysnr sapgw${v_sysnr}s sapdp$v_sysnr sapdp${v_sysnr}s
for SID $SID Instance $v_sysnr missed !

All entries in /etc/services (getent services) for the SAP instance must exist. $v_sysnr represents the instance number here.
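
The resolvable entries can be checked on the Application Node with getent, as the message suggests (a sketch with the example SID C11 and instance number 00):

getent services sapmsC11
getent services sapdp00
getent services sapgw00

Each call should return one line with the service name and its port; a missing line means the corresponding entry cannot be resolved via the configured name services (/etc/services or LDAP) and has to be added.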

$DB_SERVICE ${SID} host $vhost is not answering to ping.


The host $vhost of the dependent service, e.g. the database (Oracle or SAPDB/MaxDB), is not answering to ping.

$DB_SERVICE ${SID} is not running, please start it first.


Connecting to the database (Oracle or SAPDB/MaxDB) via R3trans -d was not possible.
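
The connection test can be repeated by hand as the SAP administration user (a sketch; R3trans writes its findings to a file trans.log in the current working directory):

su - <sid>adm
R3trans -d

A return code of 0000 means the database connection works; for any other return code, trans.log contains the underlying database error messages.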

$CI_SERVICE ${SID} not started.


startsap r3 as ${sid}adm has failed; the reason can be found in the
original start logfile: os4adm/startsap_DVEBMGS67.log.

$CI_SERVICE ${SID} is possibly not stopped.


stopsap r3 as ${sid}adm has failed; the reason can be found in the original
stop logfile: os4adm/stopsap_DVEBMGS67.log.

15.7.4 Sapscs-specific Error Messages


The same messages as for sapci are used; however, instead of $CI_SERVICE,
$SCS_SERVICE is used.


15.7.5 Sapascs-specific Error Messages


The same messages as for sapci are used; however, instead of $CI_SERVICE,
$ASCS_SERVICE is used.

15.7.6 Sapjc-specific Error Messages


The same messages as for sapci are used; however, instead of $CI_SERVICE, $JC_SERVICE is
used.

15.7.7 Sapapp-specific Error Messages


The same messages as for sapci are used; however, instead of $CI_SERVICE, $APP_SERVICE
is used.

15.7.8 Sapj-specific Error Messages


The same messages as for sapci are used; however, instead of $CI_SERVICE, $J_SERVICE is
used.

15.8 SAP ACC Troubleshooting


See also chapter “Troubleshooting” in the ACC Installation Guide.

15.8.1 ACC Logging


The ACC stores all logging information. To access the log information, navigate in the WebGUI to Controller Log.
For detailed information, click on the relevant line and the details will be shown.
It is also possible to archive the logging information. Navigate to the Technical Settings -> Archive Log.
Delete, Archive and Retrieve are the available functions. For more information, use the Help link in the status bar.

15.8.2 Missing Server in the ACC Physical Landscape


● Check if the server is already up and reachable via network
● Start the ACCagents:
Server # /usr/sap/adaptive/ACCagents start


● ACC: Press the Refresh button after a while and check if the server is visible in the
Physical Landscape.

15.8.3 Reset of Service Status in Case of Failures


If the service has the status failed, the following steps have to be performed to reset
the status, depending on where the failure happened:
● Shut down the application service (App, CI, DB, …) by hand using the FF
start/stop scripts.
● Reset the status of the application service: Navigate to the detailed view of the
application service, click on the underlined status message and change the status.

15.8.4 Hanging Locks


For safe handling of the different adaptive hosts, services and configurations, locks are set on the currently used components. In some error situations it is possible that those locks are not deleted correctly.
To delete those locks by hand:
● Log in to the Visual Administrator.
● Navigate to Server 0…. -> Services -> Locking Adapter.
● Type * into the field Name and press the Refresh button.
● Choose the right lock and press Delete selected locks.

15.9 PRIMECLUSTER

15.9.1 Problem Reporting


When reporting PRIMECLUSTER problems, always provide the logfiles of both Control
Nodes (CN1 and CN2).

15.9.2 Removing “Ghost Devices” from RMS GUI


To remove non-existing “Ghost Devices” that may be displayed in the RMS GUI, run the
following command:
rcqconfig -c -a CN1 CN2

You may need to reboot afterwards.


15.10 Script Debugging

15.10.1 Shell Scripts


For many FlexFrame scripts, debugging is activated by default. If debugging is not active,
it can be activated with the option "-d".
Example:
ff_netscan.sh -d

Furthermore, shell scripts can be traced using the shell option "-x".
For details, see the man page of the shell in use (man sh, man bash, etc.):
sh -x ff_netscan.sh
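
Tracing can also be switched on only for the interesting part of a script by using the shell built-ins set -x and set +x; this is a generic shell feature, shown here as a small sketch (the wrapper script is only an example):

#!/bin/sh
# trace only the critical section
set -x              # print every command before it is executed
ff_netscan.sh -d
set +x              # switch tracing off again

Since the trace output is written to stderr, it can be collected in a file for a problem report, e.g. sh -x ff_netscan.sh 2>/tmp/ff_netscan.trace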

15.10.2 Perl Scripts


Before debugging, activate the session logging of the shell by calling "script" (for details, see man script).
The debugger is called using:
perl -d <script_name>

Functions:
h                Help
x <expression>   Evaluate an expression (hash/array) and display the result
p <string>       Print the given string or expression
s                Execute next command; step into subroutines
n                Execute next command; skip subroutines
b <line_number>  Set a breakpoint at line <line_number>
q                Quit
For further information, see the man page of perldebug.
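
A typical debugging session, recorded with "script", might look as follows (a sketch; the script name ff_example.pl, the line number and the %config hash are placeholders, not part of FlexFrame):

script /tmp/perl-debug.log
perl -d ff_example.pl
  DB<1> b 42
  DB<2> n
  DB<3> x \%config
  DB<4> q
exit

This sets a breakpoint at line 42, executes the next statement, dumps the contents of the hash %config and quits the debugger; exit ends the terminal recording, so /tmp/perl-debug.log contains the complete dialog.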



16 Abbreviations
ABAP Advanced Business Application Programming
ACC Adaptive Computing Controller
ACI Adaptive Computing Infrastructure
ACPI Advanced Configuration and Power Interface
APM Advanced Power Management
APOLC Advanced Planner & Optimizer liveCache
CCU Console Connection Unit
CIFS Common Internet File System
DHCP Dynamic Host Configuration Protocol
DIT Domain Information Tree
ERP Enterprise Resource Planning
ESF Enhanced System Facility
EULA End User License Agreement
FAA FlexFrame Autonomous Agent
FC Fibre Channel
FTP File Transfer Protocol
IP Internet Protocol
LAN Local Area Network
LDAP Lightweight Directory Access Protocol
LUN Logical Unit Number
MAC Media Access Control
MINRA Minimal Read Ahead
NAS Network Attached Storage
NDMP Network Data Management Protocol
NFS Network File System
NIC Network Interface Card
NVRAM Non-Volatile Random Access Memory

OBP Open Boot Prom
OLTP On-Line Transaction Processing
ONTAP Open Network Technology for Appliance Products
OSS Open Source Software
POST Power-On Self Test
PCL PRIMECLUSTER
PW PRIMEPOWER
PXE Preboot Execution Environment
PY PRIMERGY
QA Quality Assurance
QoS Quality of Service
RAID Redundant Array of Independent (or Inexpensive) Disks
RARP Reverse Address Resolution Protocol
RDBMS Relational Database Management System
RHEL Red Hat Enterprise Linux
RSB Remote Service Board
SCS System Console Software
SAP BW SAP Business Warehouse
SAPGUI SAP Graphical User Interface
SAPOSS SAP Online System Service
SID System Identifier
SLD System Landscape Directory
SLES SuSE Linux Enterprise Server
SMB Server Message Block
SMC System Management Console
SNMP Simple Network Management Protocol
SPOC Single Point Of Control
TELNET Telecommunications Network
TFTP Trivial File Transfer Protocol

UDP User Datagram Protocol
UPS Uninterruptible Power Supply
VLAN Virtual Local Area Network
VTOC Virtual Table Of Contents
WAN Wide Area Network
WAS Web Application Server
WAFL Write Anywhere File Layout
XSCF Extended System Control Facility



17 Glossary
Adaptive Computing Controller
SAP system for monitoring and controlling SAP environments.
Advanced Business Application Programming
Proprietary programming language of SAP.
Advanced Power Management
A standard for power saving and management in computers.
Application Agent
A software program monitoring and managing applications.
Application Node
A host for applications (e.g. SAP instances db, ci, agate, wgate, app etc.). This
definition includes Application Servers as well as Database Servers.
Automounter
The automounter is an NFS utility that automatically mounts directories on an NFS
client as they are needed, and unmounts them when they are no longer needed.
Blade
A special form factor for computer nodes.
BOOTPARAM
Boot time parameters of the Solaris kernel.
BRBACKUP
SAP backup and restore tools.
Client LAN
A virtual network segment within FlexFrame, used for client-server traffic.
Common Internet File System
A protocol for the sharing of file systems (same as SMB).
Computing Node
From the SAP ACI perspective: A host that is used for applications.
Control Agent
A software program monitoring and managing nodes within FlexFrame.
Control LAN
A virtual network segment within FlexFrame, used for system management traffic.
Control Node
A physical computer system, controlling and monitoring the entire FlexFrame
landscape and running shared services in the rack (dhcp, tftp, ldap etc.).


Control Station
An Application Node running SAP ACC.
Dynamic Host Configuration Protocol
DHCP is a protocol for assigning dynamic IP addresses to devices on a network.
Dynamic Host Configuration Protocol server
DHCP service program.
Enterprise Resource Planning
Enterprise Resource Planning systems are management information systems that
integrate and automate many of the business practices associated with the
operations or production aspects of a company.
Ethernet
A Local Area Network technology which supports data transfer rates of 10 megabits per second.
Fibre Channel
Fibre Channel is a serial computer bus intended for connecting high speed storage
devices to computers.
Filer
Network attached storage for file systems.
FlexFrame Autonomous Agent
Central system management and high availability software component of FlexFrame.
FlexFrame for mySAP Business Suite
FlexFrame for mySAP Business Suite is a radically new architecture for mySAP
environments. It exploits the latest business critical computing technology to deliver
major cost savings for SAP customers.
Gigabit Ethernet
A Local Area Network which supports data transfer rates of 1 gigabit (1,024
megabits) per second.
Host name
Name of a physical server as seen from outside the network. One physical server
may have multiple host names.
Image
In the FlexFrame documentation, “Image” is used as a term for “Hard Disk Image”.
Internet Protocol Address
A unique number used by computers to refer to each other when sending information
through networks using the Internet Protocol.
Lightweight Directory Access Protocol
Protocol for accessing on-line directory services.

Local host name
Name of the node (physical computer); it can be displayed and set using the
command /bin/hostname.
Logical Unit Number
An address for a single (SCSI) disk drive.
MaxDB
A relational database system from MySQL AB (formerly ADABAS and SAPDB).
Media Access Control address
An identifier for network devices, usually unique. The MAC address is stored
physically on the device.
mySAP Business Suite
The main ERP software product of SAP AG.
NDMPcopy
NDMPcopy transfers data between Filers using the Network Data Management Protocol.
Netboot
A boot procedure for computers, where the operating system is provided via the
network instead of local disks.
NetWeaver
SAP NetWeaver is the technical foundation of mySAP Business Suite solutions.
Network Appliance Filer
See “Filer”.
Network Attached Storage
A data storage device that is connected via a network to one or multiple computers.
Network File System
A network protocol for network-based storage access.
Network Interface Card
A hardware device that allows computer communication via networks.
Node
A physical computer system, controlled by an OS.
Node name
Name of a physical node, as returned by the command uname -n. Each node name
within a FlexFrame environment has to be unique.
Non Volatile Random Access Memory
A type of computer memory that retains its information when the power is switched
off.

On-Line Transaction Processing
Transaction processing via computer networks.
OpenLDAP
An Open Source LDAP service implementation.
Open Network Technology for Appliance Products
The operating system of Network Appliance Filers.
Open Source Software
Software that is distributed free of charge under an open source license, such as the
GNU Public License.
Oracle RAC
A cluster database by Oracle Corporation.
Physical host name
Name of a physical computer system (node).
Power-On Self Test
Part of a computer's boot process; automatic testing of diverse hardware
components.
Preboot Execution Environment
An environment that allows a computer to boot from a network resource without
having a local operating system installed.
PRIMECLUSTER
Fujitsu Siemens Computers’ high availability and clustering software.
PRIMEPOWER
Fujitsu Siemens Computers’ SPARC-based server product line.
PRIMERGY
Fujitsu Siemens Computers’ i386-based server product line.
Qtree
A special subdirectory in a volume that acts as a virtual subvolume with special
attributes, primary quotas and permissions.
Red Hat Enterprise Linux
Linux distribution by Red Hat, Inc., targeting business customers.
Reverse Address Resolution Protocol
A protocol allowing resolution of an IP address corresponding to a MAC address.
SAP Service
In FlexFrame: SAP Service and DB Services.
SAP service script
An administration script for starting and stopping an SAP application on a virtual host.

SAP Solution Manager
Service portal for the implementation, operation and optimization of an SAP solution.
SAPGUI
SAP graphical user interface.
SAPLogon
Front-end software for SAPGUI.
SAPRouter
Router for SAP services like SAPGUI or SAPTELNET.
Server
A physical host (hardware), same as node.
Service
A software program providing functions to clients.
Service type
The type of an application or service (db, ci, app, agate, wgate etc.).
Single Point of Control
In FlexFrame: One user interface to control a whole FlexFrame landscape.
Storage LAN
A virtual LAN segment within a FlexFrame environment, carrying the traffic to the
Filer.
SuSE Linux Enterprise Server
A Linux distribution by Novell, specializing in server installations.
Telecommunications Network (TELNET)
A network protocol that mainly provides command line login to a remote host.
Trivial File Transfer Protocol server
A service program implementing a simple, UDP-based file transfer protocol, used e.g. for network booting.
Virtual host name
The name of the virtual host on which an application runs; it is assigned to a physical
node when an application is started.
The following rule forms the host names of virtual services:
<service_type>[<ID>]<SID>[-<LAN type>]
where <service_type> can be one of:
ci - central instance (ABAP)
db - database instance
app - application instance (ABAP)
ascs - ABAP SAP central services instance
scs - JAVA SAP central services instance
jc - JAVA central instance

j - JAVA application instance
<ID> is a number from 00 to 96 (except: 2, 25, 43, 72, 89) for the service types app
and j only. It is empty for other service types.
<SID> is the system ID of an SAP system.
<LAN type>
-se Server LAN
empty string Client LAN
This host name formation rule for virtual services is mandatory in version 3.2 of the
FlexFrame solution. Some components rely on this rule.
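As an illustration (C11 is an example SID, not part of the rule): the central instance of system C11 addressed via the Client LAN is called ciC11, its database instance addressed via the Server LAN is called dbC11-se, and the ABAP application instance 01 is called app01C11.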
Within a FlexFrame environment, each node name must be unique. However, each
node may have multiple host names that are derived from the node name by a
defined naming rule.
In SAP environments host names are currently limited to 13 alphanumeric
characters including the hyphen (“-“). The first character must be a letter. In the SAP
environment host names are case-sensitive (see “SAP Note No. 611361”).
Virtual Local Area Network
A VLAN is a logically segmented network mapped over physical hardware according
to the IEEE 802.1q standard.
Virtualization
Virtualization means the separation of hardware and processes. In a virtualized
environment (FlexFrame), a process can be moved between hardware nodes while
staying transparent to the user and application.



18 Index
A state of 63
abbreviations 293 upgrading RPM packages 146
ACC application services
integration of new servers 235 starting and stopping 225
integration of pools 235 application software
integration of SAP serevices 235 upgrading 146
usage of 236 automounter concept 40
user administration 236 B
Application Blade Server Cabinets
removing from monitoring by FA administrating 117
Agents 230
C
stopping and starting for upgrades
cloning of SIDs 217
using 231
cluster file system, built 45
application maintenance mode 74
Cluster Foundation 70
Application Nodes 10
Cluster Foundation (CF) 29
adding 113
configuration
administrating 107
power-shutdown 205
create new Linux OS image 132
Control Node
displaying information on a
specific 107 defect hardware 92
displaying information on all 111 exchanging 91
installing image update 129 failed hardware 91
Linux 10 OS damaged 92
listing 107 software updates 93
reactivating after power shutdown by Control Nodes
FA Agents 49
backup and restore 265
removing 116
Control Nodes 10
renaming 117
SSH configuration 190
Solaris 11


D ff_netscan.sh 54
data protection 255 ff_pool_defrt.sh 61
desaster repair 37 ff_pool_dnssrv.sh 62
document history 2 Filer
Domain Information Tree (DIT) 28 add new 21
E display all configured 23
error handling 269 display configuration 24
F remove 27
FA Agent version Filer cluster 45
install new 99 Filer configuration 11
migration on pool level 101 Filer volumes
FA Agents backup with NetApp Snapshot 256
configuring 239 Filers
error diagnosis 272 multiple 218
FlexFrame Autonomy 243 FlexFrame
groups 240 architecture 3
operation and log files 276 backup with tape library 263
power management 249 basic administration 46
traps 207, 242 general notes 4
FA Autonomous Agents 66 network 52
FA Autonomy FlexFrame Autonomous Agents
diagnostic tool 282 migration at pool level 100
FA migration tool 103 FlexFrame Autonomy
file mode 103 command line interface 101
parameters 104 FlexFrame configuration state,
displaying 51
pool mode 103
FlexFrame landscape
ff_change_id.pl 217
accessing 46
ff_install_an_linux_images.sh 201
power-off 48
ff_list_services.sh 224
power-on 46


FlexFrame Web portal 65 LAN failover 38


G segments 38
glossary 297 switches 40
H Network Appilance Filer 44
hardware 5 network cards
hardware changes 83 replacing 89
I network errors 270
image customization for experts 169 NFS mount messages 270
inconsistent and faulted node failures 33
applications 72
notational conventions 1
J
O
Jakarta Tomcat 66
ONTAP patches
L
installing 105
LDAP 28, 217
OS image
FlexFrame structure in 28
install new 99
working with 28
P
LDAP error codes and messages 271
password management
Linux Application Nodes
Control Nodes 192
preparation 201
Linux Application Nodes 197
Linux images
Solaris Application Nodes 198
installing new 129
passwords
Linux kernel
requested during installation 189
update/install new 95
setting during installation 190
update/install new 138
setting during operation, update and
log files 269 upgrade 191
N pool
nb_unpack_bi 202 add to a Filer 25
NetApp Snapshot 256 remove from a Filer 26
network 38 pool groups
automounter concept 40


changing group and pool assignment S


of Application Nodes 185
SAP ACC 235
changing group assignment of
hanging locks 290
Application Nodes 184
logging 289
removing 184
missing serverin ACC physical
pools
landscape 289
adding 172
reset of service status in case of
adding a group 183 failures 290
listing details 176, 181 troubleshooting 289
removing 175 SAP databases
state of 63 backup 258
pools and groups 172 SAP kernel
power control hardware updates and patches 222
replacing 90 SAP service scripts 225
power-shutdown configuration 205 actions 226
PRIMECLUSTER 29, 66 return code 228
administration 66 user exits 227
CLI commands 33 SAP services
log files 37 administrating 223
PRIMECLUSTER components 29 details on controlling multiple 229
R display status 223
related documents 2 list status 224
Reliant Monitor Services 71 starting and stopping multiple 229
Reliant Monitor Services (RMS) 29 SAP SIDs
remote administration 46 adding instances 212
requirements 1 removing instances 212
RMS configuration SAP SIDs
FlexFrame specific 30 adding 212
schematic overview 32 cloning 214
data transfer when cloning 215


listing 211 nb_unpack_bi 202


listing instances 211, 214 security 189
removing 212 ServerView
SAP system update 144
upgrading 220 ServerView S2 75
Sap systems ServerView update via RPM 94
administrating 211 service packs 137
SAP systems service switchover 232
state of 63 shared operating system 8
Sapapp shared operating system, boot
concept 8
specific error messages 289
Shutdown Facility (SF) 29
Sapascs
SID instances
specific error messages 289
state of 63
Sapci
snapshot 45
specific error messages 287
SnapShot
SAPDB
restore 262
specific error messages 285
software 6
Sapj
software updates 93
specific error messages 289
Solaris Application Nodes
Sapjc
preparation 153, 202
specific error messages 289
Solaris images
Sapscs
install new/activate 148
specific error messages 288
maintenance cycle 162
scripts
rc script 152
ff_install_an_linux_images.sh 201
troubleshooting 170
ff_list_services.sh 224
special files 280
ff_netscan.sh 54
SSH configuration 190
ff_pool_defrt.sh 61
start/stop script errors 283
ff_pool_dnssrv.sh 62
start/stop scripts


common error messages 283 switch group


state change host name 16
of Applicatrion Nodes 63 change password 16
of pools 63 list configuration 14
of SAP systems 63 switch port
of SID instances 63 add configuration 17
support deartment display configuration 20
diagnosis data 282 remove configuration 19
switch T
add to a switch group 11 Third Party software 105
remove from a switch group 13 U
switch application user administration 191
usage example 35 V
switch applications 74 volume layout 45
switch blades volumes
replacing 92 multiple 218
switch configuration 11 W
backing up 267 Web interfaces 65
restoring 268
