SGI Management Center System Administrator's Guide
007-5642-005
COPYRIGHT
© 2010, 2011 SGI. All rights reserved; provided portions may be copyright in third parties, as indicated elsewhere
herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic
documentation in any manner, in whole or in part, without the prior written permission of SGI.
LIMITED RIGHTS LEGEND
The software described in this document is commercial computer software provided with restricted rights (except
as to included open/free source) as specified in the FAR 52.227-19 and/or the DFAR 227.7202, or successive
sections. Use beyond license provisions is a violation of worldwide intellectual property laws, treaties and
conventions. This document is provided with limited rights as defined in 52.227-14.
The electronic (software) version of this document was developed at private expense; if acquired under an agreement
with the USA government or any contractor thereto, it is acquired as commercial computer software subject to the
provisions of its applicable license agreement, as specified in (a) 48 CFR 12.212 of the FAR; or, if acquired for
Department of Defense units, (b) 48 CFR 227-7202 of the DoD FAR Supplement; or sections succeeding thereto.
Contractor/manufacturer is Silicon Graphics, 46600 Landing Parkway, Fremont, CA 94538.
TRADEMARKS AND ATTRIBUTIONS
Silicon Graphics, SGI, the SGI logo, SGI Prism, and Altix are trademarks or registered trademarks of Silicon
Graphics International Corp. or its subsidiaries in the United States and/or other countries worldwide.
AMD and AMD Opteron are trademarks or registered trademarks of Advanced Micro Devices, Inc. Intel, Pentium,
and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and
other countries. Java is a registered trademark of Sun Microsystems, Inc. Linux is a registered trademark of Linus
Torvalds, used with permission by SGI. NVIDIA is a registered trademark of NVIDIA Corporation in the United
States and/or other countries. PBS Professional is a trademark of Altair Grid Technologies, a subsidiary of Altair
Engineering, Inc. Red Hat and all Red Hat-based trademarks are trademarks or registered trademarks of Red Hat,
Inc. in the United States and other countries. SUSE LINUX and the SUSE logo are registered trademarks of Novell,
Inc. UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open
Company, Ltd. Windows, Windows Server, and Windows Vista are trademarks or registered trademarks of Microsoft
Corporation in the United States and/or other countries.
All other trademarks mentioned herein are the property of their respective owners.
Table of Contents
Preface .......................................................................................................................................................vii
Product Definition ................................................................................................................................vii
Audience ..............................................................................................................................................vii
Revision History ...................................................................................................................................vii
Related Documentation .......................................................................................................................viii
Annotations ........................................................................................................................................... ix
Product Support ...................................................................................................................................... x
Reader Comments .................................................................................................................................. x
Chapter 1
Getting Started ............................................................................................................................................ 1
System Requirements ............................................................................................................................. 1
Minimum Hardware Requirements .......................................................................................................1
Operating System Requirements ...........................................................................................................2
Software Requirements .........................................................................................................................3
Licensing ................................................................................................................................................ 8
Importing Existing Hosts ....................................................................................................................... 8
Starting and Stopping the SGI Management Center Server ................................................................... 8
Verifying SGI Management Center Services Are Running ................................................................... 9
Chapter 2
Introduction to SGI Management Center .............................................................................................. 11
Overview .............................................................................................................................................. 11
Product Definition ..............................................................................................................................11
Chapter 3
Preferences and Settings ...........................................................................................................................23
Preferences ...........................................................................................................................................23
General .............................................................................................................................................. 23
Configure Network and Email Settings .............................................................................................. 24
Platform Management ........................................................................................................................ 25
Applications ....................................................................................................................................... 32
Provisioning Settings ......................................................................................................................... 33
Partitions ...............................................................................................................................................55
Adding Partitions ............................................................................................................................... 55
Editing Partitions ............................................................................................................................... 56
Deleting Partitions ............................................................................................................................. 56
Regions ................................................................................................................................................. 57
Creating Regions ................................................................................................................................57
Editing Regions ..................................................................................................................................58
Deleting Regions ................................................................................................................................59
Racks .................................................................................................................................................... 60
Adding Racks .....................................................................................................................................61
Editing Racks......................................................................................................................................61
Deleting Racks....................................................................................................................................61
Chapter 5
User Administration ................................................................................................................................. 63
Default User Administration Settings .................................................................................................. 64
Adding a User .....................................................................................................................................64
Editing User Accounts ........................................................................................................................66
Disabling a User Account ...................................................................................................................66
Deleting a User Account .....................................................................................................................66
Groups .................................................................................................................................................. 67
Adding a Group ..................................................................................................................................67
Editing a Group ..................................................................................................................................69
Deleting a Group.................................................................................................................................69
Roles ..................................................................................................................................................... 70
Adding a Role .....................................................................................................................................70
Editing a Role .....................................................................................................................................72
Deleting Roles ....................................................................................................................................72
Privileges ............................................................................................................................................73
Chapter 6
Imaging, Version Control, and Provisioning ......................................................................................... 75
Overview .............................................................................................................................................. 75
Payload Management ........................................................................................................................... 76
Configuring a Payload Source.............................................................................................................76
Creating a Payload ..............................................................................................................................78
Importing Kernel Parameters from a Running Host ............................................................................83
Adding a Package to an Existing Payload ...........................................................................................84
Remove a Payload Package.................................................................................................................87
Payload File Configuration .................................................................................................................89
Payload Authentication Management ..................................................................................................90
Payload Local User and Group Account Management ........................................................................92
Add and Update Payload Files or Directories......................................................................................96
Edit a Payload File with the Text Editor .............................................................................................97
Delete Payload Files ...........................................................................................................................98
Delete a Payload .................................................................................................................................98
Install Management Center into the Payload .......................................................................................99
Installation on a Running Altix UV SSI or Cluster Compute Node ................................................... 100
Provisioning ........................................................................................................................................141
Select an Image and Provision.......................................................................................................... 141
VCS Upgrade ................................................................................................................................... 144
Advanced Provisioning Options ....................................................................................................... 145
Chapter 7
Instrumentation and Events ...................................................................................................................147
Instrumentation ...................................................................................................................................147
States ............................................................................................................................................... 148
Event Log......................................................................................................................................... 148
Menu Controls ................................................................................................................................. 149
Overview Tab................................................................................................................................... 150
Thumbnail Tab ................................................................................................................................. 151
List Tab............................................................................................................................................ 152
CPU Tab .......................................................................................................................................... 153
Memory Tab..................................................................................................................................... 154
Disk Tab .......................................................................................................................................... 155
Network Tab .................................................................................................................................... 156
Kernel Tab ....................................................................................................................................... 157
Load Tab .......................................................................................................................................... 158
Environmental Tab ........................................................................................................................... 159
Environmental List Tab .................................................................................................................... 160
GPU Tab .......................................................................................................................................... 161
Power Tab ........................................................................................................................................ 162
Chapter 8
Upgrading SGI Management Center .................................................................................................... 193
General Tasks ..................................................................................................................................... 193
Upgrading from a Previous Version of SGI Management Center ..................................................... 194
Upgrading from SGI ISLE Cluster Manager 2.x .............................................................................. 195
Chapter 9
Using the Discover Interface .................................................................................................................. 197
Software Requirements ...................................................................................................................... 197
The Graphical Interface ...................................................................................................................... 198
The Command-Line Interface ............................................................................................................ 202
Chapter 10
Troubleshooting ...................................................................................................................................... 203
Debug Logs ....................................................................................................................................... 204
Support Information Tool .................................................................................................................. 204
Startup Daemon Fails on the Master Host ......................................................................................... 204
Nodes in Provisioning or Unknown State after Provisioning ............................................................ 205
Temperatures and Fan Speeds Not Registering ................................................................................. 205
Inordinately High CPU Usage on Head Node ................................................................................... 205
Insufficient Number of Provisioning Channels .................................................................................. 205
Kernel Modules Not Loading on Compute Nodes ............................................................................. 206
Command-line Boot Parameters Not Honored .................................................................................. 206
Payload Check-in Error ...................................................................................................................... 206
Invalid or Expired License Message .................................................................................................. 207
Resource Usage Too High on Head Node ......................................................................................... 207
Altix UV Provisioning Stops While Loading Kernel ........................................................................ 208
Chapter 11
Command-Line Interface ....................................................................................................................... 209
Command-Line Syntax and Conventions .......................................................................................... 209
CLI Commands .................................................................................................................................. 210
conman ............................................................................................................................................... 216
cwhost ................................................................................................................................................ 219
cwpower ............................................................................................................................................. 228
cwprovision ........................................................................................................................................ 230
cwuser ................................................................................................................................................ 233
dbix ..................................................................................................................................................... 239
dbx ...................................................................................................................................................... 240
imgr .................................................................................................................................................... 241
kmgr ................................................................................................................................................... 242
pdcp ....................................................................................................................................................243
pdsh ....................................................................................................................................................245
pmgr ....................................................................................................................................................248
powerman ...........................................................................................................................................249
vcs .......................................................................................................................................................251
Appendix ..................................................................................................................................................255
Pre-configured Metrics .......................................................................................................................255
CPU ................................................................................................................................................. 255
Disk ................................................................................................................................................. 256
Kernel .............................................................................................................................................. 256
Load ................................................................................................................................................. 257
Memory............................................................................................................................................ 257
Network ........................................................................................................................................... 258
Glossary ...................................................................................................................................................259
Index .........................................................................................................................................................263
Preface
The SGI Management Center System Administrator's Guide is written in a modular style in which each section builds upon another to deliver progressively advanced scenarios and configurations. Depending on your system configuration and implementation, certain sections of this guide may be optional, but they warrant your attention as the needs of your system evolve. This guide assumes that you, the reader, have a working knowledge of Linux.
Product Definition
SGI Management Center is actually a suite of products to manage your cluster:
As the name implies, SGI Management Center for Altix ICE is specific to the SGI Altix ICE platform and has a
separate manual. Refer to Related Documentation on page viii. This manual pertains to all other supported platforms.
Audience
This guide's intended audience is the system administrator who will be working with the SGI Management Center
software to manage and control the cluster.
Revision History
Revision    Date            Description
001         April 2010
002         May 2010
003         October 2010
004         January 2011
005         July 2011
Related Documentation
The following documents provide additional information relevant to the SGI Management Center product:
To access the IPMI guide, contact your local sales representative. The following paragraphs describe the general access
method for SGI customer documentation.
You can obtain SGI documentation, release notes, or man pages in the following ways:
Refer to the SGI Technical Publications Library at http://docs.sgi.com. Various formats are available. This library
contains the most recent and most comprehensive set of online books, release notes, man pages, and other information.
You can also view man pages by typing man <title> on a command line.
SGI systems include a set of Linux man pages, formatted in the standard UNIX man page style. Important system
configuration files and commands are documented on man pages. These are found online on the internal system disk (or
DVD-ROM) and are displayed using the man command. For example, to display the man page for the rlogin
command, type the following on a command line:
man rlogin
For additional information about displaying man pages using the man command, see man(1).
In addition, the apropos command locates man pages based on keywords. For example, to display a list of man pages
that describe disks, type the following on a command line:
apropos disk
Annotations
This guide uses the following annotations throughout the text:
Warning: Indicates impending danger. Ignoring these messages may result in serious injury or death.
Caution: Warns users about how to prevent equipment damage and avoid future problems.
Note: Informs users of related information and provides details to enhance or clarify user activities.
Product Support
SGI provides a comprehensive product support and maintenance program for its products. SGI also offers services to
implement and integrate Linux applications in your environment.
Refer to http://www.sgi.com/support/
If you are in North America, contact the Technical Assistance Center at
+1 800 800 4SGI or contact your authorized service provider.
If you are outside North America, contact the SGI subsidiary or authorized distributor in your country.
Reader Comments
If you have comments about the technical accuracy, content, or organization of this document, contact SGI. Be sure to
include the title and document number of the manual with your comments. (Online, the document number is located in
the front matter of the manual. In printed manuals, the document number is located at the bottom of each page.)
You can contact SGI in any of the following ways:
Chapter 1
Getting Started
To set up SGI Management Center in your environment, you must first install SGI Management Center Server on a
Master Host. After your SGI Management Center Server is installed, you can create images to distribute the SGI
Management Center Client to the host nodes you want to manage. This lets you monitor and manage compute hosts
from a central access point.
System Requirements
Before you attempt to install SGI Management Center, make sure your master host and compute hosts meet the
following minimum hardware and software requirements:
Compute Nodes
3.0 GHz Intel Pentium 4 (32-bit) or 2.2 GHz Intel Xeon or AMD Opteron (64-bit)
1 GB RAM
100 MB of local disk space (typical); diskless operation is also supported
100 Mbps management network (including switches and interface card); 1000 Mbps recommended
Roamer
IPMI
DRAC
ILO
Intel Power Node Manager (IPNM), powered by Intel Data Center Manager (DCM)
Operating System Requirements
When using Intelligent Platform Management Interface (IPMI), version 2.0 is recommended for power control, serial
access, and environmental monitoring. IPMI 1.5, ILO 1.6 (or later), DRAC 3, and DRAC 4 offer power control only.
Roamer provides power control and console access. IPNM/DCM provides only power management.
The SGI Management Center client is also supported on the following Windows versions:
Windows 7
Windows Server 2003
Windows Server 2008/Windows Server 2008 R2
Windows Vista
Windows XP
Software Requirements
SGI Management Center requires the following RPM packages:
You must enable the DHCP server, TFTP server, NTP server, and IPMI daemon (if using OpenIPMI/ipmitool) to start
at system bootup. TFTP, NTP, and IPMI should also be started.
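For example, on a Red Hat-based master host you might enable and start these services with commands similar to the following (service names vary by distribution and are shown here only as an illustration):
# chkconfig dhcpd on
# chkconfig tftp on
# chkconfig ntpd on
# chkconfig ipmi on
# service dhcpd start
# service ntpd start
# service ipmi start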
If you do not enable the NTP daemon on the master host, you should set an alternate NTP server when configuring
network preferences or bypass the NTP synchronization by entering 127.0.0.1 as the NTP server. See Configure
Network and Email Settings on page 24. An incorrect NTP configuration can cause the nodes to hang during the SGI
boot process.
SGI Foundation Software is required if you want to use the Memory Failure Analysis feature of Management Center.
Contact your SGI representative or visit http://www.sgi.com/products/software/sfs.html.
In order to support SGI Altix UV large-memory systems, SGI Management Center requires the SGI SMN bundle
software to be pre-installed.
This host name can be changed by setting the host and system.rna.host values in $MGR_HOME/@genesis.profile.
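For example, entries of the following form set both values (the host name is a placeholder, and the exact layout of @genesis.profile may differ on your system):
host=admin1
system.rna.host=admin1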
Server Installation
To install SGI Management Center on the master host, you can use any front end for RPM, such as YaST, Yum, or the Red Hat Package Management Tool. Add the SGI Management Center CD-ROM or ISO image as an installation source and install the following packages and all dependencies:
sgimc
sgimc-server
sgi-cm-agnostic (Required if you are using the Dynamic Provisioning feature with PBS Professional 10.2 or higher.)
shout (Only needed if you are installing on an SGI Altix UV System Management Node.)
Other packages such as powerman, conman, and pdsh are provided on the media for convenience and are supported by
their software manufacturers. For more information about conman, powerman, and pdsh, see
https://computing.llnl.gov/linux/.
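For example, once the media or ISO image has been added as an installation source, a Yum-based installation might look like the following (zypper or YaST can be used similarly on SLES):
# yum install sgimc sgimc-server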
Once you have installed the SGI Management Center RPM packages on the master host, you will not be able to start the
application GUI until you restart the X session on your host. Alternatively, you can source the /etc/profile.d/mgr.sh
script from the command line:
# . /etc/profile.d/mgr.sh
By default, the SGI Management Center password is root. For information on how to change this password, see Editing
User Accounts on page 66. When you provision a host, SGI Management Center sets up a root account for your hosts.
If the management network is something other than 10.0.0.0 following an installation or upgrade, you need to log in as
root and update it in SGI Management Center preferences. See Preferences on page 23.
Client Installation
The client allows you to remotely manage your cluster from a computer that is not part of the cluster. The client
installation also gives you superior performance because it significantly reduces network traffic. You can install the
client on a computer running Linux or Windows.
Insert the SGI Management Center CD in your CD/DVD-ROM drive and allow the SGI Management Center
installer to launch.
If the installer does not start automatically, launch the installer manually (assuming the CD-ROM drive is d:):
d:\windows\launch_installer.vbs
2.
3.
Specify the Installation Directory and Host Name, then click Next.
The SGI Management Center Server or Master Host must use a valid host name that can be resolved through name
resolution (for example, DNS, /etc/hosts). For information on changing the name of the Master Host, see Renaming the
Management Center Master Host on page 48.
4.
5.
6.
When you finish installing SGI Management Center, use Explorer to navigate to the installation directory.
7.
8.
You can also start SGI Management Center from the command-line interface. For example, if you installed to the
default location c:\program files\sgi, enter the following:
c:\program files\sgi\sgimc\bin\mgrclient.vbs
To run SGI Management Center from a remote share, map the network drive where you installed SGI Management
Center and create a copy of the shortcut on your local machine.
2.
Select Programs.
3.
4.
Prerequisites
This advanced configuration requires the following prerequisites:
Configuration
The following steps describe how to configure the SGI Management Center for scale-out:
1.
Designate one system to be the primary host for the SGI Management Center.
This system will manage the first 4096 compute nodes and will be utilized for image, kernel, and payload management.
2.
Each host can manage 4096 compute nodes. For example, 32,768 compute nodes require 1 primary host and 7 service
nodes.
3.
Install the SGI Management Center on all of the participating host and service nodes.
4.
Populate the various SGI Management Center databases with their respective 4096 compute nodes.
5.
Export the $MGR_HOME/vcs directory on the primary host across the shared filesystem for the service nodes.
For NFS, add an export entry such as the following to /etc/exports on the primary host:
/opt/sgi/sgimc/vcs 10.0.10.*(rw,sync,no_root_squash)
The primary host will be the only system managing the VCS mechanism. The other subordinate service node directories
will not be populated or managed.
6.
Mount the shared VCS directory from the primary host on each service node.
For NFS:
# mount master:/opt/sgi/sgimc/vcs /opt/sgi/sgimc/vcs
7.
Modify the IGMP multicast base addresses on the participating service nodes from their default settings.
service1    239.192.1.128
service2    239.192.2.128
service3    239.192.3.128
service4    239.192.4.128
service5    239.192.5.128
service6    239.192.6.128
service7    239.192.7.128
Remember to modify your IGMP multicast routing tables as well on these nodes.
For example:
239.192.0.0/24, 239.192.1.0/24, 239.192.2.0/24, etc.
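For example, a multicast route for one of these ranges can be added with a command similar to the following (eth0 is a placeholder for the cluster-facing interface; repeat for each range you use):
# route add -net 239.192.1.0 netmask 255.255.255.0 dev eth0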
8.
Configure your images, kernels, and payloads on the primary host for your cluster.
The primary host can be utilized for validation of images, kernels, and payloads for your system using the working and
versioned check-out mechanism. This can be useful in provisioning the primary group of 4096 compute nodes initially
and ensuring desired functionality.
Provisioning
Do the following to provision the cluster:
1.
The primary host will manage the VCS and working copies of the images, kernels, and payloads for the subordinate
service nodes.
The separate provisioning of each block of 4096 compute nodes does not imply that you cannot boot all nodes
simultaneously. It only means that the SGI Management Center instances are sharing the same VCS imaging database.
This is to avoid complications within the VCS system.
2.
Log in to each service node and start the SGI Management Center GUI.
The VCS entries from the primary host will be populated on these nodes.
3.
If the SGI Management Center GUIs are open and you make changes to the primary host VCS entries, you will need to
refresh the service node GUIs to see the modifications. You can do this by toggling between the Working Images and
Versioned Images tabs.
Instrumentation
Each instance of the SGI Management Center will monitor the environmental, thermal, and other metric data from its assigned compute node group. In order for each compute node to know where (that is, to which instance of SGI Management Center) to send its instrumentation data, you must modify the image on each compute node after installation.
To make this modification to the image, do the following:
1.
Examine the script scaleout_prefinalize.sh carefully to determine whether or not you need to modify the script for
your particular installation.
2.
Add the script as a prefinalize script for the image that you will be provisioning to your hosts.
Licensing
In order to use SGI Management Center, you will need to obtain a license from SGI. For information about software
licensing, refer to the licensing FAQ on the following webpage:
http://www.sgi.com/support/licensing/faq.html
Open the /etc/lk/keys.dat file in a text editor. Copy and paste the license string, exactly as given, and save the file.
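If the SGI LK licensing tools are installed on the master host, you can typically confirm that the key is recognized with a command such as the following (availability and path may vary with your installation):
# /usr/sbin/lk_verify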
Run the /etc/init.d/mgr status command to verify that the following services are running:
DNA.<host IP address>
DatabaseService
DistributionService.provisioning-00
DistributionService.provisioning-01
.
.
DistributionService.provisioning-nn
SGI Management Center includes two distribution services for each provisioning channel pair defined in the
preferences.
FileService.<host name>
HostAdministrationService.<host name>
IceboxAdministrationService
ImageAdministrationService
InstrumentationService
KernelAdministrationService
LogMonitoringService
NotificationService
PayloadAdministrationService
PayloadNodeService.<hostname>
PlatformManagementService
PowerMonitoringService
ProvisioningService
RNA
RemoteProcessService.<hostname>
SynchronizationService
TreeMonitoringService
VersionService
VersionService.<host_name>
com.sgi.clusterman.server.CommunicationServerFactory
Chapter 2
Introduction to SGI Management Center
Overview
SGI Management Center reduces the total cost of cluster ownership by streamlining and simplifying all aspects of
cluster management. Through a single point of control, you can automate repetitive installation and configuration tasks.
SGI Management Center automates problem determination and system recovery, and monitors and reports health
information and resource utilization.
SGI Management Center provides administrators with increased power and flexibility in controlling cluster system
resources, and improved scalability and performance allow SGI Management Center to manage cluster systems of any
size. Version-controlled provisioning allows administrators to easily install the operating system (OS) and applications
on all hosts in the cluster and facilitates changes to an individual host or group of hosts.
Product Definition
SGI Management Center is actually a suite of products to manage your cluster:
As the name implies, SGI Management Center for Altix ICE is specific to the SGI Altix ICE platform and has a
separate manual. Refer to Related Documentation on page viii. This manual pertains to all other supported platforms.
The following figure illustrates the packaging of the features in the Standard Edition and Premium Edition along with
the supported technologies and platforms.
(Figure: feature packaging of the Standard Edition, Premium Edition, and options. Features include data management, platform control, power management, user management, host management, failure analysis, and a dashboard. The framework comprises a presentation layer (multiplatform GUI, web services, TCP, Java, RMI), a middleware API, a CLI, and informatics (logical/physical topology, inventory, metrics, health monitoring). Platform management technologies include DCM, SNMP, IPMI, PCP, DRAC, and iLO, along with services, data management, and file system management. Supported operating systems include Red Hat Enterprise Linux and CentOS Linux, on x86_64 platforms such as Rackable, SGI Altix UV, CloudRack, and SGI Octane.)
Comprehensive System Monitoring
In cases where only minor changes are made to VCS-controlled images, SGI Management Center allows you to apply
updates without re-provisioning. See VCS Upgrade on page 144.
Partition monitoring
Provisioning
Hierarchical tree population
Event management
In order to support SGI Altix UV large-memory systems, SGI Management Center requires the SGI SMN bundle
software to be pre-installed.
Support for SGI Altix UV Systems
For UV systems (with the exception of UV 10 systems), a UV license is required for SGI Management Center. If you
install a cluster/server license, some features will not work correctly, such as automatic discovery of blades and chassis
management controllers (CMCs).
Configuration
The following are SGI Management Center configuration requirements for UV systems:
With the SMN software running on the SMN, SGI Management Center supports automatic discovery of the CMCs,
blades, and the SSI partition. The CMCs and blades are shown in the host tree in the physical view (shown in the
following figure). The SSI partition is shown in the logical view.
Provisioning
When a UV license is installed, new images will be set up as UV SSI images by default. In the event that an image is
not intended for a UV SSI, right-click on the image and go to Properties. Uncheck UV SSI and click Apply.
A UV SSI image includes a defined EFI boot partition. During provisioning, /boot/EFI will be mounted as VFAT and used for EFI booting. Existing contents of the EFI boot partition will be preserved. Note that SGI
Management Center does not create any of the contents of /boot/EFI.
To provision a UV SSI, right-click on the SSI partition in the logical view of the host tree and mouse over Provisioning.
Alternatively, you can go to the Provisioning tab, select the VCS or working image, and select the SSI partition to
provision in the host tree on the left.
Once the provisioning is in progress, you can monitor the status of the blades by right-clicking on a blade in the
physical view of the SGI Management Center host tree and clicking Connect to Console. To monitor the status of the
provisioning of the SSI, right-click on the SSI partition in the logical view and click Connect to Console.
Kernel Parameters
The UV SSI automatically determines the best kernel boot parameters for your UV hardware. Consult the UV
documentation for details. If these kernel parameters change (for example, due to a hardware change), SGI
Management Center will send a warning alert in the Event Log panel. The warning will indicate that the kernel
parameters have changed and will instruct you to import the new kernel parameters for the appropriate kernel.
Section Importing Kernel Parameters from a Running Host on page 83 describes how you import the kernel
parameters.
2.
Log in as root.
3.
4.
Enter a user name (root by default) and password (root by default) and click OK.
(Figure: the Management Center interface, showing the menus, tool bar, system tabs, frame controls, navigation tree, frame tabs, docked frames, and upper and lower panes.)
Menus A collection of pull-down menus that provide access to system features and functionality.
Tool Bar The tool bar provides quick access to common tasks and features.
(Figure: the tool bar, with callouts for VCS Status, Power On, VCS Check In, Host Compare, Power Off, Layouts, VCS, and Frame.)
Server Name The name of the server on which Management Center is running.
System Tabs Allow you to navigate and configure the cluster. Tabs may be opened, closed, and repositioned as
needed.
Frame Controls Lets you dock, un-dock, hide, minimize, and close frames.
(Figure: the frame control buttons Auto-hide, Hide, and Close.)
Frames Provide you with specific control over common aspects of cluster systems (for example, imaging and user
accounts). Each frame tab opens a frame containing a navigation tree that allows you to manage system components
easily. The navigation tree is found in most frames and is used to help organize cluster components. You may dock,
close, or relocate frames and frame tabs as needed.
Upper/Lower Panes These panes allow you to view cluster information in a structured environment.
Closing Tabs
1.
2.
Opening Tabs
Arranging Tabs
REORDERING TABS
Right-click a tab and select New Horizontal Group or New Vertical Group.
2.
To move tabs between groups, right-click the tab and select Move to Next Tab Group. You can also drag and drop tabs
between groups.
Dockable Frames
Management Center dockable frames can be opened, closed and repositioned to meet your needs.
Before you can reposition a frame, you must click the Auto-Hide button to make the frame always visible. See Frame Controls on page 18.
2.
Click the frame's title bar and drag it to a new position in the interface.
Layouts
Customized views of the Management Center interface are easily saved and accessed from the View menu or opened
with the Layouts button on the toolbar.
2.
To overwrite an existing layout with the current view, move the mouse over the layout and select Overwrite from the
popup menu.
3.
On the tool bar, click the Layouts button, or select Layouts from the View menu.
2.
Select the layout you want to open from the popup menu.
Renaming a Layout
1.
2.
Move the mouse over the layout you want to rename and select Rename from the popup menu.
3.
2.
Move the mouse over the layout you want to add a description to and select Describe from the popup menu.
3.
Deleting a Layout
1.
2.
Move the mouse over the layout you want to delete and select Delete from the popup menu.
2.
Move the mouse over the layout you want to set as the default and select Set as Default from the popup menu.
Chapter 3
Preferences and Settings
Preferences
Management Center preferences allow you to configure the global settings and default behavior for your cluster.
Preferences include general settings, platform management configurations, applications, and provisioning. Although
these settings apply to the entire cluster, you may override certain preferences as needed (such as provisioning).
You can access preferences by selecting Preferences from the Edit menu.
General
Configure Network and Email Settings
In the Management Center interface, select Preferences from the Edit menu.
2.
3.
In the Email Settings section, enter the sender, server, and domain information.
Use the email settings to send notifications of cluster events.
Sender Used as the From address.
Server Must be a valid SMTP server and must be configured to receive emails from the authorized domain.
Domain The domain used to send email.
4.
5.
Platform Management
Global Options
Management Center supports multiple platform management interfaces. This is useful if you are using multiple
platforms for system management (for instance, one interface for power management and another for environmental
monitoring). The global options section of the preferences dialog allows you to set the default options used for the
majority of hosts in the system, although some hosts may still need additional configuration.
Set the most common options by configuring Device 1, 2, and 3. From the configuration dialog, select the Platform
Management Device Type you want to use: Icebox (not currently supported), Roamer, IPMI, FreeIPMI, DRAC, ILO,
Powerman, or Conman. The check boxes below each device indicate which features are available to be managed by the
device. If you configure multiple devices, you can select or clear these check boxes to indicate which device will
manage this feature. See also Intel Data Center Manager (DCM) on page 28.
Not all management controllers have the same feature set. DRAC and ILO support only power control and Conman
supports only serial console control. Roamer supports serial console control and power control. IPMI supports all
features, but you may want to use other interfaces for power and serial console control and IPMI for controlling the
beacon and environmental monitoring.
To configure platform management to use a remote power control device such as IPMI, ILO, or DRAC, you must first
create the power control user. See Configure the Master Host and Management Center on page 37.
In order for DRAC to successfully control power on DRAC-enabled hosts, you must install the racadm utility on the
Master Host. You may obtain the racadm RPM, mgmtst-racadm-4.5.0-335.i386.rpm, from the /misc directory of the
Management Center CD or from SGI technical support.
Dynamic If you are setting up the Management Device dynamically and the device's interface MAC address is an
offset of the management interface, set the Management Device IP Address Type to Dynamic and enter the MAC
Address Offset. This is typical for IPMI implementations with on-board BMC controllers. For example, a host whose
management interface MAC address is 00:11:22:33:44:55 might have a Management Device with a MAC address of
00:11:22:33:44:58. In this case, the MAC offset would be 00:00:00:00:00:03 (Greater Than).
Relative If you are setting up the Management Device dynamically or statically and the device's interface IP address is
an offset of the management interface, set the Management Device IP address type to Relative and use the IP Address
Offset. This is typical when using ILO or an IPMI controller with an add-on BMC daughter card. For example, a host
with an IP address of 10.0.0.1 might have a Management Device with an IP address of 10.0.2.1. In this case, the IP
offset would be 0.0.2.0 (Greater).
Static If you are setting up the Management Device dynamically or statically and the device's interface MAC address or IP address does not correlate with either the MAC or IP address of the management interface, set the Management Device IP address type to Static; this is not typical. If you select Static, you must configure the IP address manually on a per-host basis.
Conman
Conman is a serial console management program designed to support a large number of devices simultaneously.
Conman supports multiple serial controllers (including IPMI) and provides continuous serial logging and multiplexing
that allows you to share a serial connection for logging and access, or between multiple consoles.
Conman is available under the GPL and is installed by default on SGI systems. Conman can be obtained from SGI as
RPM packages or from http://home.gna.org/conman/.
Prior to selecting Conman for serial access, you must install the conman RPM on the Master Host, then configure
conman by defining the serial devices and consoles in /etc/conman.conf. Additional information on conman is available
from the man pages by entering man conman.conf.
Before you can begin using conman, you must start its daemon, conmand (installed as
/etc/init.d/conmand). For information on using conman, see conman on page 216.
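The following is a minimal sketch of an /etc/conman.conf; the host names, addresses, and credentials are placeholders, and the directives supported by your version are documented in the conman.conf man page:
SERVER keepalive=ON
SERVER logdir="/var/log/conman"
GLOBAL log="console.%N.log"
CONSOLE name="n001" dev="ipmi:10.0.10.1" ipmiopts="U:admin,P:password"
CONSOLE name="n002" dev="ipmi:10.0.10.2" ipmiopts="U:admin,P:password"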
ENABLING DCM
To enable DCM-based management for compliant hardware, do the following:
1.
2.
(Table: DCM configuration fields, including Select DCM and Nameplate Power.)
Once configured, SGI Management Center will synchronize with DCM and achieve model consistency. Management
Center will begin to receive event updates from the DCM service. You can access power monitoring data through
Instrumentation > Power. See Power Tab on page 162.
Selecting Policy from the Power sub-menu displays the Policy Management dialog box.
New policies may be defined for the selected physical target or for the cluster as a whole (if selected) by clicking Add
from the Policy Management dialog box. The GUI displays the Add Custom Policy dialog box for policy definition.
You can select one of the policy types described in the table below:
Policy Type    Description
CUSTOM_PWR_LIMIT
MIN_PWR: Defines a policy that enforces the minimum operational power for the selected target (an entity or a group of entities).
MIN_PWR_ON_INLET_TEMP_TRIGGER
STATIC_PWR_LIMIT: Constructs immutable power budgets for entities that have special characteristics. Policies that are defined on groups of nodes (a rack or cluster, for example) may overlap with those defined for individual nodes and, typically, the policies attempt to achieve the highest possible power savings. STATIC_PWR_LIMIT may be used to preserve a specific power allocation budget for targeted endpoints within the cluster. If further CUSTOM_PWR_LIMIT restrictions are placed upon the endpoint directly or by the application of a group policy, the STATIC_PWR_LIMIT budget will take precedence.
The updated Policy Management dialog box reflects the new addition.
Activating Policies
You can activate a power policy in an on-demand fashion or non-interactively via schedule definition. To do so
interactively (on demand), use the Edit feature of the Policy Management dialog box. Simply select the Enable check
box in the resulting display.
You can create and enable a schedule definition upon policy creation by entering valid values in the following fields:
Start Date
End Date
Hour Start
Hour End
You can later access these fields using the Edit feature from the Policy Management dialog box. The Hour Start to
Hour End interval defines how the policy will be activated on a day-to-day basis.
Disabling Policies
A policy may be disabled at any time either by deleting the policy from the Policy Management dialog box (using the
Delete button) or by using the Edit feature and de-selecting the Enable check box.
Applications
APPLICATIONS
The applications option allows you to select the default applications used for specific actions and file types.
Terminal Enter the executable path of the application you want to use for your terminal window. The terminal
application is used when opening a serial console to the host. By default, Management Center uses an xterm with the
following options:
xterm -geom 80x25 -T Console of {host} -sb -bg black -fg gray -sl 1000 -e /usr/bin/
telnet {system.rna.host} {port}
The Management Center terminal field supports the use of the following variables:
{host} The host name used to set the console name (optional).
{system.rna.host} The host name of the Master Host (required).
{port} The dynamic port set by the Master Host (required).
Management Center uses any terminal that supports spawning an external command (usually the '-e' flag). The full
path to the terminal and the '-e /usr/bin/telnet {system.rna.host} {port}' statement are the only
requirements. All other items are optional. Consider the following examples:
Cygwin terminal on Windows:
C:\cygwin\bin\rxvt.exe -sr -sl 10000 -fg white -bg black -fn fixedsys -fb fixedsys -T
Console of {host} -tn cygwin -e /usr/bin/telnet {system.rna.host} {port}
Gnome-terminal on Linux:
/usr/bin/gnome-terminal -t Console of {host} -e /usr/bin/telnet {system.rna.host}
{port}
If you use Konsole or Gnome-terminal, you can use the default settings used by your desktop.
HTML Browser Enter the executable path of the application you want to use as your HTML browser. On Linux, the
default browser is Firefox. On Windows, Management Center uses your default browser.
PDF Viewer Enter the executable path of the application you want to use to view PDFs such as the SGI Management
Center System Administrator's Guide or Release Notes. On Linux, the defaults are Acrobat Reader, then xpdf. On
Windows, Management Center uses your default PDF reader.
Provisioning Settings
Provisioning
These settings let you control the default provisioning behavior.
You can overwrite these settings from the Advanced Provisioning dialog. See Advanced Provisioning Options on
page 145.
Enable Confirmation Dialogs Select this option if you want to display a confirmation dialog when you provision
hosts.
Provision at Next Reboot When checked, hosts are not provisioned until you reboot them manually or with a script.
When unchecked, Management Center automatically restarts hosts or powers them down to begin the provisioning
process.
Multicast TTL Sets the Multicast TTL or Time-To-Live on a multicast packet. The default, 1, restricts multicast
packets to the subnet (the cluster's internal network). If you are using multicast across networks and multiple switches
across a private network, select 32. If you plan to use multicast across a company WAN, use 64 (the maximum TTL that
multicast supports).
Multicast Packet Size Sets the maximum size of multicast packets (by default, 1446).
Number of Multicast Channel Pairs Management Center uses one channel for downloading the kernel and
RAMdisk, and another channel for downloading the payload. Typically, you will need only one channel per image
used; however, depending on the number of images in use on the system, you may require additional multicast
channels. If you run out of channels, a No Available Channels error occurs when you attempt to provision. By
default, 10 channel pairs are configured on your system.
Multicast Base Address The multicast base address specifies what multicast subnet you will use, starting at the last
octet and increasing by 1. By default, Management Center sets the base multicast address to 239.192.0.128 with 10
channels, which uses addresses from 239.192.0.128-137. If you have multiple Management Center Master Hosts on the
same network, they should use a different subnet or different ranges within that subnet. For example, Master 1 might
use 239.192.0.128-137 and Master 2 might use 239.192.1.128-137. Other multicast ranges such as 224.0.0.x may also
be suitable for your network.
If you change your multicast base address, you must verify that the multicast default route includes the new base
address. See Configure Multicast Routes on page 39 for information on configuring multicast routes.
Specify the Download Path of the Payload During the provisioning process, Management Center downloads the
payload to the host's root directory. Depending on the size of the payload, this may require a very large root partition.
To use a smaller root partition, you may download the payload to a different partition by specifying the image.path in
$MGR_HOME/etc/ProvisioningService.profile:
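The exact entry is site-specific; as an illustration only (the /scratch path is a hypothetical partition with sufficient
free space), the profile line might look like the following:
image.path=/scratch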
Versioning
These options allow you to configure default directories used to check items in and out of VCS and to open large files
created when importing a payload.
Default Checkout Directory When enabled, Management Center uses this directory as a scratch directory for
checking items in and out of VCS. Use this if you have limited space on the partition containing $MGR_HOME.
Default Deflate Directory When enabled, this option allows you to specify an alternate path in which to open large
files created when importing a payload. Use this if you have limited space on the partition containing $MGR_HOME.
Configuring IPMI
Configure the IPMI BMC
The BMC(s) for the nodes should be set up to use networking and serial over LAN. You will also need to know the
username and password that will be used for power control and serial with the BMC(s) in order to use power control
and serial over LAN with the Management Center. The ipmitool utility allows you to set the username and password
used to access the BMC on a host. This tool also allows you to set the LAN parameters of the BMC. For more
information, consult the manual Guide to Administration, Programming Environments, and Tools Available on SGI
Altix XE Systems (007-4901-xxx) or third-party documentation (in the case of third-party node types).
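As a sketch only (the channel number, user ID, and addresses below are assumptions that vary by hardware), the BMC
user and LAN settings might be configured with ipmitool commands such as the following, using the admin/ipmi
credentials typical of SGI systems:
ipmitool user set name 2 admin
ipmitool user set password 2 ipmi
ipmitool lan set 1 ipsrc static
ipmitool lan set 1 ipaddr 10.0.2.1
ipmitool lan set 1 netmask 255.255.0.0
ipmitool sol set enabled true 1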
Configure the ipmitool_options.profile
Example:
# Use standard options globally
ipmitool.power._default_=-I lanplus
ipmitool.status._default_=-I lanplus
ipmitool.sol._default_=-I lanplus
Add the following modules (available in drivers/char/ipmi under the kernel modules tree) to the kernel with
which you will be provisioning:
ipmi_devintf
ipmi_si
ipmi_msghandler
2.
In the kernel parameters, set the serial console and baud rate (see the example following these steps).
For SGI clusters, the defaults are ttyS1 and 115200, respectively.
3.
Install either OpenIPMI or Freeipmi into the payload if neither is already installed. You can obtain OpenIPMI from
the SLES or RHEL distribution CD/DVD. Freeipmi can be obtained from http://www.gnu.org/software/freeipmi.
4.
If you are using OpenIPMI, run the following command to enable the ipmi daemon on the master host:
chroot $MGR_HOME chkconfig ipmi on
OpenIPMI requires the kernel binary RPM installed in the payload in order for the ipmi daemon to run properly.
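For step 2, using the SGI defaults noted above, the kernel boot parameters would include an entry similar to the
following (the device and baud rate may differ on other hardware):
console=ttyS1,115200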
Configure the Master Host and Management Center
Install Ipmitool on the Master Host to allow you to perform IPMI-related tasks such as powering off hosts, executing beacon operations, and activating SOL.
The SDR cache is created in $MGR_HOME/ipmi/sdrcache.dat on each host. If the $MGR_HOME/ipmi directory or the
sdrcache.dat file cannot be created, monitoring will fail.
2.
3.
4.
Assign the new user the name and password configured for BMC controllers (for SGI systems, admin and ipmi).
This gives you full access to IPMI controls on the hosts.
5.
Assign the user to the power group and make power the primary group for the user.
This user is not required for monitoring temperature and fans but is required for power control and beaconing. This user
cannot log into Management Center.
6.
7.
8.
9.
(Optional) Select the MAC Address vs. Host MAC Address type:
A. Not Related
B. Greater Than
C. Less Than
13. (Optional) Enter the IP address offset from the management interface.
14. (Optional) Enter the IP address for the host.
15. Select a Platform Management User.
Users must belong to Power as their primary group to appear in this list. See Groups on page 67.
16. Click OK.
Configuring DHCP
If you are using Dynamic Host Configuration Protocol (DHCP), you need to configure it on your Master Host to ensure
proper communication with your compute nodes.
1.
2.
Open /etc/sysconfig/dhcpd.
3.
Look for the DHCPD_INTERFACE line and make sure it ends with ="ethX" (see the example following these steps).
4.
5.
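For step 3, the relevant line in /etc/sysconfig/dhcpd typically looks like the following (eth1 is only an example;
use the interface connected to your compute nodes):
DHCPD_INTERFACE="eth1"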
When working with DHCP, ensure that the server installation includes DHCP and, if the subnet on which the cluster
will run differs from 10.0.0.0, edit the Network subnet field in the preferences dialog.
The DHCP option of the Actions menu allows you to perform the following operations:
Changes made to /etc/dhcpd.conf are overwritten when you provision the host.
Configure Multicast Routes
The following examples use a multicast network of 224.0.0.0/4 to provide broad multicast support, but you can also use
a more narrow multicast route such as 239.192.0.0/16. By default, the base multicast address in Management Center is
239.192.0.128.
SLES
1.
Enter the following from the command line to temporarily add the route (where eth1 is the management interface):
route add -net 224.0.0.0 netmask 240.0.0.0 dev eth1
2.
RHEL
1.
Enter the following from the command line to temporarily add the route (where eth1 is the management interface):
route add -net 224.0.0.0 netmask 240.0.0.0 dev eth1
2.
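The route added above does not survive a reboot. Step 2 in each procedure makes the route persistent; the mechanism
is distribution-specific, and the entries below are a sketch only (assuming eth1 is the management interface):
SLES: append a line such as  224.0.0.0 0.0.0.0 240.0.0.0 eth1  to /etc/sysconfig/network/routes
RHEL: create /etc/sysconfig/network-scripts/route-eth1 containing  224.0.0.0/4 dev eth1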
Configure TFTP
Management Center places boot files in /tftpboot/mgr. The tftp or atftp daemon must use /tftpboot as the home for tftp
boot files. If you are using the tftp package that is included with the RHEL 6 distribution, you must change the
parameter server_args in /etc/xinetd.d/tftp to use /tftpboot as the home for tftp boot files.
Example /etc/xinetd.d/tftp file:
service tftp
{
        disable         = no
        socket_type     = dgram
        protocol        = udp
        wait            = yes
        user            = root
        server          = /usr/sbin/in.tftpd
        server_args     = -s /tftpboot
        per_source      = 11
        cps             = 100 2
        flags           = IPv4
}
Chapter 4
Cluster Configuration
Clustered Environments
In a clustered environment, there is always at least one host that acts as the master of the remaining hosts (for large
systems, multiple masters may be required). This host, commonly referred to as the Management Center Master Host,
is reserved exclusively for managing the cluster and is not typically available to perform tasks assigned to the remaining
hosts.
To manage the remaining hosts in the cluster, you can use the following grouping mechanisms:
Partitions
Partitions include a strict set of hosts that may not be shared with other partitions.
Regions
Regions are a subset of a partition and may contain any hosts that belong to the same partition. Hosts contained
within a partition may belong to a single region or may be shared with multiple regions. Dividing up the system can
help simplify cluster management and allows you to enable different privileges on various parts of the system.
Racks
You can use racks to represent the physical layout of your cluster.
(Figure: cluster grouping hierarchy showing the cluster, partitions, regions, hosts, and shared hosts)
Adding Hosts
To add a host, you must provide the host name, description, MAC address, IP address, and the partition and region to
which the host belongs. Hosts can be added only after you have set up a Master Host.
You can also import a list of existing hosts. See Import Hosts on page 49.
1.
2.
Select New Host from the File menu or right-click in the host navigation tree and select New Host.
A new host pane appears.
3.
4.
5.
(Optional) Select the name of the partition to which this host belongs from the drop-down menu.
If you right-click a partition or region in the navigation tree and select New Host, the host is automatically assigned to
that partition or region.
6.
7.
Add Interfaces
The Interfaces pane allows you to create new interfaces and assign host management responsibilities.
1.
2.
To find the MAC address of a new, un-provisioned host, you must watch the output from the serial console. Etherboot
displays the host's MAC address on the console when the host first boots. For example:
Etherboot 5.1.2rc5.eb7 (GPL) Tagged ELF64 ELF (Multiboot) for EEPRO100]
Relocating _text from: [000242d8,00034028) to [17fdc2b0,17fec000)
Boot from (N)etwork (D)isk (F)loppy or from (L)ocal?
Probing net...
Probing pci...Found EEPRO100 ROM address 0x0000
[EEPRO100]Ethernet addr: 00:02:B3:11:03:77
Searching for server (DHCP)...
(If conman is set up and working, this information is also contained in the conman log file for the host, typically
located in /var/log/conman/console.n[1-x].)
To find the MAC address on a host that is already running, enter ifconfig -a in the CLI and look for the HWaddr of
the management interface.
3.
Click Management to use the Management Center interface to manage the host.
Management Center stores the interface and automatically writes it to dhcp.conf.
4.
If you are using IPMI or another third-party power controller, you should add the BMC's MAC address and the IP
address you are going to assign it. Management Center will set up DHCP to connect to the BMC. In the Platform
Management settings, you can select this interface and use it for operations.
5.
Click OK.
Assign Regions
The Regions pane allows you to identify any regions to which the host belongs.
1.
2.
Select the region to which the host belongs. (To select multiple regions, use the Shift or Ctrl keys.)
3.
Click OK.
By default, platform management uses the device specified in your Global preferences settings to control hosts in the
cluster. To override this setting, select Override Global Settings.
IPMI
Typically, hosts use one or more Ethernet interfaces. With IPMI, ILO, and DRAC, each host uses at least two
interfaces: one management interface and one IPMI/ILO/DRAC interface. The management interface is configured for
booting and provisioning; the IPMI/ILO/DRAC interface is used to gather environmental and sensor data (for example,
fan speeds) from the host and perform power operations. Additional interfaces are used only for setting up host names
and IP addresses.
ILO and DRAC support power control only; they do not support temperature and sensor monitoring.
In order for Platform Management to work correctly, you must first define interfaces for each host (see Add Interfaces
on page 44). In some cases, you must manually configure an IP address for the Platform Management Controller; in
most cases, however, you can use DHCP to configure this address. To view information about each interface, see
dhcpd.conf.
The IPMI dialog defines which interface is used for Platform Management. Typically, the Management Device is easily
identified because its MAC or IP address is an offset of the host. For example, a host with a MAC address of
00:11:22:33:44:56 and an IP address of 10.0.0.1 might have a Management Device with a MAC address
00:11:22:33:44:59 and an IP address of 10.0.2.1. In this case, the MAC offset would be 000000000003 (Greater)
and the IP offset would be 0.0.2.0 (Greater).
2.
Select IPMI or Roamer from the Platform Management Device Type drop-down list.
3.
4.
(Optional) Select the MAC Address vs. Host MAC Address type:
A. Not Related
B. Greater Than
C. Less Than
5.
6.
7.
8.
9.
Users must belong to Power as their primary group to appear in this list.
2.
3.
4.
(Optional) Select the MAC Address vs. Host MAC Address type:
A. Not Related
B. Greater Than
C. Less Than
5.
6.
7.
8.
9.
Edit a Host
Editing hosts allows you to change information previously saved about a host, edit host configurations, or move hosts in
and out of partitions and regions.
To Edit a Host
1.
Select a host from the host navigation tree. (To select multiple hosts, use the Shift or Ctrl keys.)
2.
Select Edit from the Edit menu or right-click the hosts in the navigation tree and select Edit.
Management Center displays the host pane for each selected host. From this view, you can make changes to the
hosts.
Changing the name of the Master Host may prevent the cluster from functioning correctly. For information on changing
the name of the Master Host, see Renaming the Management Center Master Host on page 48.
3.
Click Apply.
2.
Select Edit from the Edit menu or right-click on the Master Host and select Edit. Management Center displays the
host pane.
3.
4.
5.
In a command line, enter /etc/init.d/mgr stop to shut down Management Center services on the system.
6.
On the Master Host, edit the $MGR_HOME/@genesis.profile to use the new name (system.rna.host).
7.
On the Master Host, edit the $MGR_HOME/etc/Activator.profile and change all instances of the host name to use
the new name.
8.
Add the new Master Host name to the alias list in /etc/hosts. For example:
10.168.18.3 host.sgi.com host <new_name>
9.
Find a Host
To Find a Host in the Host Navigation Tree
1.
2.
3.
Delete a Host
Deleting a host removes it from the cluster.
To Delete a Host
1.
Select the host you want to delete from the host navigation tree. (To select multiple hosts, use the Shift or Ctrl
keys).
2.
Select Delete from the Edit menu or right-click the selected hosts in the navigation tree and select Delete.
Management Center asks you to confirm your action.
3.
Import Hosts
Management Center provides an easy way to import a large group of hosts from a file. When importing a list of hosts, it
is important to note that Management Center imports only host information. Management Center accepts the following
file types: nodes.conf, dbix, or CSV.
Obtain or create a host list file for importing. The following examples depict nodes.conf, dbix, and CSV file
formats:
A. nodes.conf
SGI nodes.conf format lists one host per line with properties being space or tab
delimited:
MAC HOSTNAME IP_ADDRESS BOOT_MODE UNIQUE_NUM DESCRIPTION
Example:
0050455C0392 n001 192.168.4.1 boot_mode 1 Node_n001
0050455C03A2 n002 192.168.4.2 boot_mode 2 Node_n002
B. dbix
hosts.<hostname>.description: <description>
hosts.<hostname>.enabled:true
hosts.<hostname>.name:<hostname>
hosts.<hostname>.partition:<partition>
interfaces.<MAC_address1>.address:<IP_address1>
interfaces.<MAC_address1>.mac:<MAC_address1>
interfaces.<MAC_address1>.management:true
interfaces.<MAC_address1>.owner:<hostname>
interfaces.<MAC_address2>.address:<IP_address2>
interfaces.<MAC_address2>.mac:<MAC_address2>
interfaces.<MAC_address2>.management:false
interfaces.<MAC_address2>.owner:<hostname>
Example:
hosts.n1.description:Added automatically by add_hosts.sh
hosts.n1.enabled:true
hosts.n1.name:n1
hosts.n1.partition:computehosts
interfaces.0030482acc96.address:10.0.1.1
interfaces.0030482acc96.mac:0030482acc96
interfaces.0030482acc96.management:true
interfaces.0030482acc96.owner:n1
interfaces.0030482acc9a.address:10.0.2.1
interfaces.0030482acc9a.mac:0030482acc9a
interfaces.0030482acc9a.management:false
interfaces.0030482acc9a.owner:n1
Dbix files are created primarily by obtaining and editing a Management Center database file.
C. CSV
HOSTNAME,MAC_ADDRESS1,IP_ADDRESS1,DESCRIPTION,MAC_ADDRESS2,IP_ADDRESS2
Example:
n14,0040482acc96,10.4.1.1,Description,0040482acc9a,10.4.2.1
2.
3.
4.
Enter the path for the file you want to import or click Browse to locate the file.
5.
Review the list of hosts to import and un-check any hosts you do not want.
Errors display for items that cannot be imported.
7.
Click Close.
System
The System options in the right-click menu execute power-related events on the hosts.
POWER OFF
Issues the Linux /sbin/poweroff command to stop all applications and services running on the host and, if the
hardware allows, to power off the host. If you have used the /sbin/shutdown command to successfully shut down
and reboot hosts at the next power cycle, you should be safe to enable this option. To enable shutdown, set the
shutdown.button.enable option in HostAdministrationService.profile to true.
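The corresponding entry in $MGR_HOME/etc/HostAdministrationService.profile would look similar to the
following (shown only as an illustration):
shutdown.button.enable=true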
Using the shutdown option requires that the BIOS is enabled to support boot at power up (the default behavior for
LinuxBIOS). This setting, also referred to as Power State Control or Power On Boot, is typically enabled for most
server-type motherboards.
If you do not enable this BIOS setting, hosts that are shut down may become unusable until you press the power button
on each host. For the location of your host power switch, please consult your host installation documentation.
The power connection to the host remains active unless you click Off. To return the host to normal operational status,
cycle the power.
HALT
Issues the Linux /sbin/halt command to stop all applications and services running on the host and, if the hardware
allows, power off the host.
REBOOT
Shuts down and restarts all applications and services on the host.
RESTART SERVICES
Restarts the Management Center services on the selected hosts.
You cannot restart Management Center services on the Master Host from the GUI. You must perform this action from
the CLI.
Power
The Power options in the right-click menu execute power-related events from your power management device.
ON
Turn on power to the host.
If you are unable to power a host on or off, the port may be locked.
OFF
Immediately turn off power to the host.
CYCLE
Turn off the power, then back on. This is useful for multiple hosts.
RESET
Send a signal to the motherboard to perform a soft boot of the host.
Beacon
BEACON ON
To identify a specific host in a cluster for troubleshooting purposes, click Beacon On to flash a light from the host. Use
the Shift and Ctrl keys to select multiple hosts. By default, the beacon icon appears next to the selected host(s) for 180
seconds. You can change this default time by changing the timeout.beacon.seconds parameter in the file
$MGR_HOME/etc/PlatformManagementService.profile.
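For example, to extend the beacon indicator to five minutes, the profile entry might read (the value is only an
illustration):
timeout.beacon.seconds=300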
The beacon function works only if the hardware installed in your cluster supports beacons (i.e., the hosts support IPMI
or ILO).
BEACON OFF
Turn off the beacon.
Console
Connecting to the console allows you to monitor activity on a host-by-host basis. When you connect to the console,
Management Center opens a terminal window for each host and allows you to view host activity or execute bash and
other general command-line operations necessary for troubleshooting. You can also use the console to apply specific
configurations or enhancements to a payload that you can import and use at a later time.
Select the host on which you want to open a console from the host navigation tree.
To select multiple hosts, use the Shift or Ctrl keys.
2.
3.
Enter bash or other general CLI commands as needed to configure the host.
4.
Roamer KVM
When using Roamer-enabled nodes, you can connect to the Roamer KVM from the Management Center GUI. When
you connect to the KVM console, Management Center opens a console window using Java. This allows you to control
the host as you would with a keyboard, monitor, and mouse.
Before you can connect to a console, you must configure the platform management settings for your host to use Roamer
and enable the Console option. This configures the host to use Roamer for the serial console.
To connect to the Roamer KVM, use the following steps:
1.
Select the host on which you want to open a console from the host navigation tree.
2.
3.
Partitions
You can use partitions to group clusters into non-overlapping collections of hosts. Instrumentation, provisioning, power
control, and administrative tasks can be performed on this collection of hosts by selecting the partition in the host tree.
Adding Partitions
1.
Right-click in the Hosts navigation tree and select New Partition or select New Partition from the File menu.
2.
3.
4.
In the Hosts pane, click Add to display the Select Hosts dialog.
5.
6.
Click Apply.
Editing Partitions
Editing a partition allows you to change previously saved information about a partition. You can edit or remove regions,
alter partition configurations, disable partitions, or remove partitions from the host.
1.
2.
Select Edit from the Edit menu or right-click on the partitions in the host navigation tree and select Edit.
3.
4.
Click Apply to accept the changes or click Close to abort this action.
Deleting Partitions
Deleting a partition allows you to remove unused partitions from the system.
If you delete a partition, all regions and hosts associated with the partition will move to the default partition. To delete
regions and hosts, refer to Regions on page 57 and Adding Hosts on page 43.
1.
Select the partitions you want to delete from the host navigation tree.
2.
Select Delete from the Edit menu or right-click on the partitions in the navigation tree and select Delete.
3.
Regions
A region is a subset of a partition and may share any hosts that belong to the same partition even if the hosts are
currently used by another region.
Creating Regions
1.
Select New Region from the File menu or right-click in the host navigation tree and select New Region.
2.
3.
4.
(Optional) Select the name of the partition you want to assign the region to from the drop-down list.
Regions not assigned to a partition become part of the default or unassigned partition.
5.
6.
In the Select Hosts dialog, select the hosts you want to add to the region.
7.
8.
9.
Editing Regions
Editing regions allows you to change previously saved information about a region or to modify region memberships by
adding or removing groups or hosts.
1.
2.
Select Edit from the Edit menu or right-click the regions in the navigation tree and select Edit.
3.
4.
Click Apply.
Deleting Regions
Deleting a region allows you to remove unused regions from the system.
1.
Select the region you want to delete from the host navigation tree.
2.
Select Delete from the Edit menu or right-click on the regions in the navigation tree and select Delete.
3.
If you delete a region, all hosts associated with the region return to the partition to which the region belonged. If the
region was not part of a partition, the hosts move to the default partition.
Racks
To aid in the management of the cluster, you can use racks to represent the physical layout of the cluster as
non-overlapping collections of hosts. Hosts that are not assigned to a rack appear in a rack labeled Unassigned.
Adding Racks
1.
Right-click in the Hosts navigation tree and select New Rack or select New Rack from the File menu.
2.
3.
4.
In the Hosts pane, click Add to display the Select Hosts dialog.
5.
6.
Click Apply.
Editing Racks
Editing a rack allows you to change previously saved information about a rack. You can edit rack information, alter
rack configurations, or remove racks.
1.
2.
Select Edit from the Edit menu or right-click on the racks in the host navigation tree and select Edit.
3.
4.
Deleting Racks
If you delete a rack, all hosts associated with the rack will be moved to rack Unassigned.
1.
Select the rack(s) you want to delete from the host navigation tree.
2.
Select Delete from the Edit menu or right-click on the rack(s) in the navigation tree and select Delete.
3.
Chapter 5
User Administration
Management Center allows you to configure groups, users, roles, and privileges to establish a working environment on
the cluster. A group refers to an organization with shared or similar needs that is structured using specific roles
(permissions and privileges) and region access that may be unique to the group or shared with other groups. Members
of a group (users) inherit all rights and privileges defined for the group(s) to which they belong.
(Figure: relationships among groups, users, roles, and regions)
For example, a user assigned to multiple groups (as indicated by the following diagram) has different rights and
privileges within each group. This flexibility allows you to establish several types of user roles: full administration,
group administration, user, or guest.
(Figure: Multi-Group Users)
Management Center currently supports adding users and groups to payloads only; it does not support the management
of local users and groups on the Master Host. Users with local Unix accounts do not automatically have Management
Center accounts, and this information cannot be imported into Management Center.
If you are using local authentication in your payloads and intend to add Management Center users or groups, ensure that
the user and group IDs (UIDs and GIDs, respectively) match up between the accounts on the Master Host and
Management Center. Otherwise, NFS may not work properly.
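One quick way to compare IDs on the Master Host (a sketch only; jsmith is a hypothetical account name) is:
id jsmith
grep jsmith /etc/passwd /etc/group
Compare the UID and GID values shown with those defined for the corresponding Management Center user and group.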
After installation, Management Center allows you to create, modify, or delete groups, users, roles, and privileges as
needed.
Adding a User
Adding a user to Management Center creates an account for the user and grants access to the system.
1.
Select New User from the File menu or right-click in the user navigation tree and select New User.
2.
3.
(Optional) Management Center assigns a system-generated user ID. Enter any changes to the ID in the User ID
field.
4.
Enter the user's first and last name in the Full Name field.
If a user already has an account and you would like to apply the account to the Master Host and compute hosts, add the
user to your payload during payload creation. When you provision, Management Center creates the account on the
hosts. See Payload Local User and Group Account Management on page 92.
5.
6.
7.
(Optional) Enter a shell for this user or select an existing one from the drop-down list. (By default, Management
Center uses /bin/bash.)
8.
Click Apply.
2.
Each user must belong to a primary group. If not, Management Center automatically assigns the user to the users
group. If you are using third-party power controls such as IPMI, the power group must be the primary group for all
users who will use these controls. See Power on page 67.
3.
Click OK.
4.
(Optional) Select Create a private group for the user to create a new group with the same name as the user.
5.
(Optional) Check Disable Account to prevent users from logging into this account and to exclude this account
from future payloads without deleting the account.
2.
Select Edit from the Edit menu or right-click a user in the navigation tree and select Edit.
3.
Click Apply.
2.
Select Edit from the Edit menu or right-click a user in the navigation tree and select Edit.
3.
4.
Click Apply.
To Delete a User
1.
Select the users you want to delete from the user navigation tree.
2.
Select Delete from the Edit menu or right-click the user names and select Delete.
3.
Groups
The following sections outline the fundamentals of adding, editing, and deleting groups. By default, Management
Center enables the following groups, but you can create new groups as needed:
Power The power group contains the user names and passwords that will be used to manage IPMI and other 3rd-party
power controllers. By default, this group has no role associated with it, so users assigned to this group cannot typically
log into Management Center. Although temperature and fan monitoring do not require that a user is assigned to this
group, you must assign a user to the power group in order to use power control and beaconing for IPMI-enabled
devices.
When using third-party power controls such as IPMI, the power group must be the primary group for all users who will
access these controls (see Defining User Groups on page 65). Users who belong to the power group cannot log into
Management Center.
Root The root group typically contains users with full administrative privileges.
Users The users group typically includes all users with access to the cluster. By default, the Users group is associated
with the Users role. Management Center automatically assigns all users to the users group.
Adding a Group
Adding groups creates a collection of users with shared or similar needs (for example, an engineering, testing, or
administrative group).
1.
Select New Group from the File menu or right-click in the user navigation tree and select New Group.
2.
3.
(Optional) Management Center assigns a system-generated Group ID. Enter any changes to the ID in the Group ID
field.
4.
5.
Click Apply.
ADD USERS
The Users pane allows you to identify the users that belong to the current group. Users are allowed to be part of any
number of groups, but granting access to multiple groups may allow users unnecessary privileges to various parts of the
system. See Roles on page 70.
1.
2.
Select the users to add to the group (use the Shift or Ctrl keys to select multiple users).
3.
Click OK.
ASSIGN ROLES
The Roles pane allows you to assign specific roles to the group.
1.
2.
3.
Click OK.
ASSIGN REGIONS
The Regions pane allows you to grant a group access to specific regions of the system. See User Administration on
page 63.
1.
2.
3.
Click OK.
Editing a Group
Editing a group allows you to change previously saved information about a group or modify group memberships by
adding or removing users.
1.
2.
Select Edit from the Edit menu or right-click a group name in the navigation tree and select Edit.
3.
4.
Click Apply.
Deleting a Group
Deleting a group allows you to remove unused groups from the system.
1.
Select the groups you want to delete from the user navigation tree.
2.
Select Delete from the Edit menu or right-click group names in the navigation tree and select Delete.
3.
Click OK.
Roles
The following sections outline the fundamentals of adding, editing, and deleting roles. Roles are associated with groups
and privileges, and define the functionality assigned to each group. Several groups can use the same role.
Adding a Role
Adding a role to Management Center allows you to define and grant system privileges to groups.
1.
Select New Role from the File menu or right-click in the Users frame and select New Role.
2.
3.
4.
Click Apply.
Adding or revoking privileges will not affect users that are currently logged into Management Center. Changes take
effect only after the users close Management Center and log in again.
2.
3.
Click OK.
GRANTING PRIVILEGES
The Privileges pane allows you to assign permissions to a role. Any user with the role will have these permissions in the
system. See Privileges on page 73.
1.
2.
3.
Click OK.
Editing a Role
Editing roles allows you to modify privileges defined for a group.
1.
2.
Select Edit from the Edit menu or right-click role names in the navigation tree and select Edit.
3.
Deleting a role will not affect the privileges of a user that is currently logged into Management Center. Changes will
take effect only after you restart the Management Center client.
Deleting Roles
Deleting a role removes any user privileges assigned to the role.
1.
Select the role you want to delete from the user navigation tree.
2.
Select Delete from the Edit menu or right-click role names in the navigation tree and select Delete.
3.
Click OK.
Deleting a role does not affect the privileges of a user that is currently logged into Management Center. Changes take
effect only after you restart the Management Center client. Also note that you cannot delete the root role.
Privileges
Privileges are permissions or rights that grant varying levels of access to system users. Management Center allows you
to assign privileges as part of a role, then assign the role to specific user groups. Users assigned to multiple groups will
have different roles and access within each group. This flexibility allows you to establish several types of roles you can
assign to users: full administration, group administration, user, or guest. See User Administration on page 63. The
following list shows the privileges established for the Management Center module at the function and sub-function
levels:
Database, Host, Icebox, Image, Instrumentation, Logging, Power, Provisioning, Serial, and User.
Chapter 6
Imaging, Version Control, and Provisioning
Overview
Management Center version-controlled image management allows you to create and store images that can be used to
install and configure hosts in your system. An image may contain file system information, utilities used for
provisioning, one payload, and one kernel, although you may create and store many payloads and kernels. The
payload contains the operating system, applications, libraries, configuration files, locale and time zone settings, file
system structure, selected local user and group accounts (managed by Management Center), and any centralized user
authentication settings to install on each host (e.g., NIS, LDAP, and Kerberos). The kernel is the Linux kernel.
For a list of Management Center-supported operating systems, see Operating System Requirements on page 2.
(Figure: an image is composed of one payload and one kernel, selected from the stored payloads and stored kernels)
This chapter provides both GUI and command-line interface directions to assist you in configuring and maintaining
images, and in using them to provision hosts. The image configuration process allows you to select a kernel and
payload, and also configures the boot utilities and partition layout. Once the new image is complete, you can check it
into the Version Control System and provision hosts with the new image. See Version Control System (VCS) on
page 134 and Provisioning on page 141.
Payload Management
Payloads are stored versions of the operating system and any applications installed on the hosts. Payloads are
compressed and transferred to the hosts via multicast during the provisioning process.
Configuring a Payload Source
Physical Media
If you are using physical media, you must insert it and mount it at your CD-ROM or DVD mount point, for example:
/mnt/cdrom
or
/media/dvd
CD ISOs
If you are using the CD ISOs, you must mount the ISOs one at a time to simulate using the CDROM:
mount -o loop <ISO_name> <mount_point>
Using either the multiple disks or multiple ISOs may require switching between disks several times.
DVD ISOs
DVD ISOs are perhaps the most convenient because they are simply mounted and do not require changing disks. To use
a DVD ISO:
mount -o loop <ISO_name> <mount_point>
FTP or HTTP
You must follow the operating system vendor's recommendations for setting up a network-based installation. Some
problems have been reported using Apache 2.2.
If you choose to copy the entire contents of each disc rather than the files described below, you must copy disc1 LAST.
Failure to copy disks in the correct order may produce payload creation failures (for example, package aaa_base may
not be found).
1.
Mount disk 1 and copy the contents of the entire disk to a location on the hard drive:
mount /mnt/cdrom
or
mount -o loop <disc1_ISO_name> /mnt/cdrom
mkdir /mnt/redhat
cp -r /mnt/cdrom/* /mnt/redhat
2.
Mount disk 2 and copy the *.rpm files from the RPMS directory to the RPMS directory on the hard drive:
cp /mnt/cdrom/RedHat/RPMS/*.rpm /mnt/redhat/RedHat/RPMS
3.
Mount each remaining disk and copy the RPMS directory to the RPMS directory on the hard drive.
If you choose to copy the entire contents of each disc rather than the files described below, you must copy disc1 LAST.
Failure to copy disks in the correct order may produce payload creation failures (e.g., package aaa_base may not be
found).
1.
Mount disk 1 and copy the contents of the entire disk to a location on the hard drive:
mount /media/cdrom
or
mount -o loop SLES-9-x86-64-CD1.iso /media/cdrom
mkdir /mnt/suse
cp -r /media/cdrom/* /mnt/suse
2.
Mount disk 2 and copy the RPMs from each architecture subdirectory to the SuSE directory on the hard drive:
cp -r /media/cdrom/suse/noarch/* /mnt/suse/suse/noarch
cp -r /media/cdrom/suse/i586/* /mnt/suse/suse/i586
cp -r /media/cdrom/suse/i686/* /mnt/suse/suse/i686
cp -r /media/cdrom/suse/src/* /mnt/suse/suse/src
cp -r /media/cdrom/suse/nosrc/* /mnt/suse/suse/nosrc
cp -r /media/cdrom/suse/x86_64/* /mnt/suse/suse/x86_64
3.
Mount each remaining disk and copy the RPMs from each architecture subdirectory to the SUSE directory.
Creating a Payload
Payloads are initially created using a supported Linux distribution's installation media (CD-ROM, FTP, NFS) to build a
base payload (see Operating System Requirements on page 2 for a list of supported distributions) or by importing a
payload from a previously provisioned host. Additions and changes are applied by adding or removing packages, or by
editing files through the GUI or CLI. Changes to the Payload are managed by the Management Center Version Control
System (VCS). Package information and files are stored and may be browsed through Management Center.
Please consult SGI before upgrading your Linux distribution or kernel. Upgrading to a distribution or kernel not
approved for use on your system may render Management Center inoperable or otherwise impair system functionality.
Technical Support is not provided for unapproved system configurations.
To create a new payload from a Linux distribution:
1.
Select New Payload from the File menu or right-click in the imaging navigation tree and select New Payload.
To create a new payload using a payload from a host you have already configured, see Importing a Payload from an
Existing Host on page 81.
2.
3.
4.
5.
Select the Scheme (file, http://, or ftp://) from the drop-down list.
6.
Enter the location of the top level directory for the Linux distribution or, click the Browse icon if you selected the
File scheme to locate the directory.
If you are creating multiple payloads from the same distribution source, it may be faster and easier to copy the
distribution onto the hard drive. This also prevents you from having to switch CD-ROMs during the payload creation
process. See Red Hat Installations on page 76 and SUSE Linux Enterprise Server Installations on page 77 for specific
details on installing these distributions.
7.
8.
9.
Click OK.
As the distribution loads, the progress of the payload creation is displayed along with the operation status
messages.
Select Hide on Completion to close the Task Progress dialog if no errors or warnings occur.
If Management Center is unable to detect payload attributes, the Distribution Unknown dialog appears. From this
dialog, select the distribution type that most closely resembles your distribution and Management Center will attempt to
create your payload.
10. (Optional) In the packages pane, click Add to include additional packages in the payload.
11. Select which payload categories to install or remove by clicking the checkbox next to each package.
When you select a core category to include in a payload, Management Center automatically selects packages that are
essential in allowing the capability to run. However, you may include additional packages at any time. See Adding a
Package to an Existing Payload on page 84.
12. Click OK.
13. (Optional) From the Packages pane, select packages you want to remove from the payload, then click Delete in the
packages pane.
14. (Optional) Configure advanced settings you want to apply to the payload. See Payload File Configuration on
page 89, Payload Authentication Management on page 90, and Payload Local User and Group Account
Management on page 92.
If an RPM installation error occurs during the payload creation process, Management Center enables the Details button
and allows you to view which RPM produced the error.
To view error information about a failed command, click the command description field. You may copy the contents of
this field and run it from the CLI to view specific details about the error.
16. (Optional) Select any payload files you wish to include with, remove from, or edit from the File drop-down list.
See Add and Update Payload Files or Directories on page 96.
17. (Optional) Click Check In to import the new payload into VCS. See also Version Control System (VCS) on
page 134.
If a payload is open in the GUI, click Copy in the lower left of the panel to create a copy of the payload.
When you copy a payload, Management Center creates a working copy of the payload; in other words, the payload
that is checked out into the $MGR_HOME/imaging/<username>/payloads directory. To create a copy of a versioned
payload, use VCS Management on page 137.
2.
In the Copy Payload dialog, enter the name of the new payload and click OK.
Importing a Payload from an Existing Host
On RHEL, temporarily disable SE Linux while importing the payload. If you do not require SE Linux, you may want to
leave it disabled.
To disable SE Linux:
1. Navigate to the Imaging tab.
2. Select the kernel you are using and edit the kernel parameters.
3. Add selinux=0 as a parameter.
4. Reboot the host and import the payload.
3.
You can also import a payload using pmgr from the command line. See pmgr on page 248.
4.
5.
6.
Enter the host name you are creating the payload from or select a host from the drop-down list.
7.
Use the following check box selections to indicate whether or not an image and kernel should be created:
* Create kernel from imported payload creates a kernel with the same name as the payload and populates the
list of modules in the kernel to match that of the running host.
* Create image from imported payload and kernel creates an image and attempts to re-create all local
filesystems from the list of partitions on the running host.
* In the imported kernel, check the list of kernel modules that is generated and the list of kernel boot parameters that
are generated. You may need to customize them according to your needs.
* In the imported image, check the partition scheme and add/remove partitions as necessary.
* LVM is not supported. If you have LVM partitions on the running host, you will need to create traditional partitions
in the image manually.
* If you have any remote filesystem mounts on the running host, such as static NFS mounts, they will not be defined in
the new image. They will need to be defined manually.
8.
(Optional) Review the Excluded Files list and remove any files you want to exclude from the payload.
If you include a symlink when creating a payload, excluding the target produces a dangling symbolic link. This link
may cause an exception and abort payload creation when Management Center attempts to repair missing directories.
9.
(Optional) Enter the location of any file you want to exclude from the payload and click Add. Click Browse to
locate a file on your system.
Importing Kernel Parameters from a Running Host
Open the imaging pane, find the desired kernel, and double-click it.
2.
On the resulting kernel configuration panel (shown in the following figure), click the Import... button.
3.
In the Import Kernel Parameters window that appears, select the desired host and click Import.
4.
Select either Replace kernel parameters or Merge kernel parameters and click the OK button.
5.
Examine the list of kernel parameters that was generated and make any needed changes.
6.
Adding a Package to an Existing Payload
Right-click a payload name in the imaging navigation tree and select Edit.
2.
3.
4.
Enter the Location of the top level directory for the Linux distribution, a directory containing RPM packages, or the
location of an individual package. If you selected the File scheme, click the Browse icon to locate the package.
If the browse button does not launch a dialog, a DNS name resolution error may exist. The DNS server must be
specified in the client by name, not by IP address.
If you have several packages in a directory, select the directory. Management Center displays all packages in the
directory, and you can choose which packages you want to install. Management Center resolves package dependencies
(see Payload Package Dependency Checks on page 87).
5.
6.
7.
Click OK.
8.
9.
Click OK.
Before adding the package, Management Center performs a package dependency check. See Payload Package
Dependency Checks on page 87 for information about dependency errors.
11. Click Check In to check the payload into VCS.
12. Update the image to use the new payload.
13. Re-provision the hosts with the new image or update the payload on the hosts using VCS Upgrade on page 144.
Remove a Payload Package
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
From the package list in the Packages pane, select a package group or expand the group to view individual
packages.
To view individual packages instead of package groups, change the View Packages By option.
3.
Click Delete.
4.
5.
Before removing the package, Management Center performs a package dependency check. See Payload Package
Dependency Checks on page 87 for information about dependency errors.
ADDING A PACKAGE
When adding a package, you may correct dependency failures by selecting one of the following options:
REMOVING A PACKAGE
When removing a package, you may correct dependency failures by selecting one of the following options:
Payload File Configuration
The list of options available is based on the distribution selected. The options displayed in the example below are for
SUSE-based distributions (SUSE Linux Enterprise Server 10).
To Configure a Payload
1.
2.
Select Configuration from the Advanced drop-down list and click the check box by each script you want to
enable.
3.
Click Apply.
Payload Authentication Management
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Authentication from the Advanced pull-down menu. The Authentication dialog appears.
3.
4.
Click Close.
5.
Click Apply to save changes. Click Revert or Close to abort this action.
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Authentication from the Advanced pull-down menu. The Authentication dialog appears.
3.
Click Close.
5.
Click Apply to save changes. Click Revert or Close to abort this action.
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Authentication from the Advanced pull-down menu. The Authentication dialog appears.
3.
Click Close.
5.
Click Apply to save changes. Click Revert or Close to abort this action.
Payload Local User and Group Account Management
Add a local user or group account known to Management Center to the payload (see User Administration on
page 63).
Delete a local user or group account from the payload.
Local account management does not support moving local accounts from the host.
Local user and group accounts that are reserved for system use do not display and cannot be added or deleted. The root
account is added automatically. Management Center handles group dependencies.
Software that requires you to add groups (e.g., Myrinet Group) can be managed through user accounts.
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Local Accounts from the Advanced pull-down menu. The Local Accounts dialog appears.
3.
In the Users pane, click Add. The Add User dialog appears.
4.
Select the user(s) to add to the payload (use the Shift or Ctrl keys to select multiple users).
5.
6.
Click Apply to save changes. Click Revert or Close to abort this action.
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Local Accounts from the Advanced pull-down menu. The Local Accounts dialog appears.
3.
Select the user(s) to remove from the payload (use the Shift or Ctrl keys to select multiple users).
4.
5.
Click Close.
6.
Click Apply to complete the process. Click Revert or Close to abort this action.
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Local Accounts from the Advanced pull-down menu. The Local Accounts dialog appears.
3.
In the Groups pane, click Add. The Add Group dialog appears.
4.
Select the group(s) to add to the payload (use the Shift or Ctrl keys to select multiple users).
5.
6.
Click Apply to complete the process. Click Revert or Close to abort this action.
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Local Accounts from the Advanced pull-down menu. The Local Accounts dialog appears.
3.
Select the group(s) to remove from the payload (use the Shift or Ctrl keys to select multiple groups).
4.
5.
Click Close.
6.
Click Apply to complete the process. Click Revert or Close to abort this action.
Add and Update Payload Files or Directories
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Add File from the Files pull-down menu. The Add File or Directory dialog appears.
3.
Enter the source for the new file in the Source field or click Browse to locate the source.
4.
Enter the destination for the new file in the Destination field or click Browse to select the destination.
6.
Click Apply to complete the process. Click Revert or Close to abort this action.
If a working copy of a payload is available, you can enter the payload directory and make changes to the payload
manually from the CLI. Working copies of payloads are stored at:
$MGR_HOME/imaging/<username>/payloads/<payload_name>
From this directory, run the chroot command to make the payload directory your root (/) directory. After making
changes, exit the chroot environment and check the payload into VCS.
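As a sketch (the payload name Compute is illustrative), a manual edit session from the CLI might look like:
cd $MGR_HOME/imaging/root/payloads/Compute
chroot . /bin/bash
After making your changes, type exit to leave the chroot environment, then check the payload into VCS.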
Edit a Payload File with the Text Editor
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Edit File from the Files pull-down menu. The Remote File Chooser appears.
3.
Select the file to edit and click Open. The text editor window appears.
4.
Edit the file as necessary, then click OK to save changes or click Cancel to abort this action.
5.
Click Apply to complete the configuration. Click Revert or Close to abort this action.
Delete Payload Files
Right-click on a payload in the imaging navigation tree and select Edit. The payload panel appears.
2.
Select Delete File from the Files pull-down menu. The Remote File Chooser appears.
3.
Select the file(s) you want to remove, then click Delete to remove the files or Cancel to abort this action.
4.
Click Apply to complete the process. Click Revert or Close to abort this action.
Delete a Payload
To Delete a Working Copy of a Payload
Before you delete the working copy of your payload, use the VCS status option to verify that the payload is checked in.
See Version Control System (VCS) on page 134 for details on using version control.
Once you check the payload into VCS, you may remove the directory from within your working user directory (e.g., to
save space):
$MGR_HOME/imaging/<username>/payloads/<name>
To verify that your changes were checked in, use the VCS status option. See Version Control System (VCS) on page 134
for details on using the version control system.
1.
2.
Install Management Center into the Payload
2.
3.
4.
On the Management Center media, browse to the sgi/x86_64 directory (if using SLES) or the RPMS directory (if
using RHEL) and select the following packages:
sgimc-payload
java-1.6.0-sun
You can also install into the payload from the command line using the RPM "root" parameter. For example:
# cd /mnt/cdrom/sgi/x86_64
# rpm -ivh --root=$MGR_HOME/imaging/root/payloads/Compute java-1.6.0-sun-1.6.0.17sgi700c1.sles11.x86_64.rpm
# rpm -ivh --root=$MGR_HOME/imaging/root/payloads/Compute sgimc-payload-1.0.0sgi700c1.sles11.x86_64.rpm
Installation on a Running Altix UV SSI or Cluster Compute Node
The following is an example of the configure_payload.sh script with user entries shown in bold:
# configure_payload.sh
This script will configure the SGI Management Center Payload Daemon service to work
correctly with the SGI Management Center server. You will need the following pieces of
information to enable this script to successfully complete the setup.
1. The name of the server where the SGI Management Center server is running. This name
should match what the server is calling itself in the file /opt/sgi/sgimc/
@genesis.profile. The entry is listed as system.rna.host.
2. The IP address of the server where the SGI Management Center Server is running. This
should be the IP address for the management network.
3. The name that this host is known by in the database of SGI Management Center. This does
not have to match the actual hostname that this system knows itself as.
4. The IP address of this system on the management network. This IP address should also
be present in the SGI Management Center database.
Note: the two host names should be simple host names, not fully qualified domain names.
Please enter the name of the server: host
Please enter the IP address of the server: 172.21.0.1
Please enter the name of this host from the server database: UV00000014-P000
Please enter the IP address of this host: 172.21.1.0
You entered the following:
Server Name = host
Server IP = 172.21.0.1
Host Name = UV00000014-P000
Host IP = 172.21.1.0
The SGI Management Center Payload Daemon service has been configured. You should restart
the Name Service Cache Daemon before attempting to start the payload daemon service.
Execute the following commands to restart the Name Service Cache Daemon and the SGI
Management Center Payload Daemon service:
/etc/init.d/nscd restart
/etc/init.d/mgr restart
Note that the script prompts for the host name as defined in the server database; that is, the host name from the host tree
in the SGI Management Center GUI. If you are installing on a running UV SSI, use the name of the partition (usually of
the format UVxxxxxxxx-Pyyy). If you are installing on a non-UV, generic cluster node, it will be the name of the node in
the host tree (for example, n001). If the node is not present in the tree, you must add an entry for it.
Kernel Management
Kernels may be customized for particular applications and used on specific hosts to achieve optimal system
performance. Management Center uses VCS to help you manage kernels used on your system.
Create a Kernel
The following sections review the steps necessary to create a kernel for use in provisioning your cluster.
Select New Kernel from the File menu or right-click in the imaging navigation tree and select New Kernel. A new
kernel pane appears.
2.
3.
4.
5.
Specify the full path to the kernel binary or click Browse to open the Remote File Chooser and select the kernel
binary.
Make sure you select a kernel binary that begins with vmlinuz and not vmlinux; selecting a vmlinux binary will result in provisioning problems later on.
6.
Specify the location of the modules directory (e.g., /lib/modules) or click Browse to open the Remote File Chooser.
7.
8.
Click Apply to create the kernel. Click Revert or Close to abort this action.
9.
To make configuration changes to the kernel, see Edit a Kernel on page 107.
To Create a Copy of an Existing Kernel
2.
Select a kernel from the navigation tree, then right-click on the kernel and select Copy.
You may also open a kernel for editing, then click the Copy button at the lower left of the panel.
3.
Management Center prompts you for the name of the new kernel.
4.
Enter the name of the new kernel and click OK. Click Cancel to abort this action.
Please consult SGI before upgrading your Linux distribution or kernel. Upgrading to a distribution or kernel not
approved for use on your system may render Management Center inoperable or otherwise impair system functionality.
Technical Support is not provided for unapproved system configurations.
1.
Obtain and install the kernel source RPM for your distribution from your distribution CD-ROMs or distribution
vendor. This places the kernel source code under /usr/src, typically in a directory named
linux-2.<minor>.<patch>-<revision> (if building a Red Hat Enterprise Linux kernel, Management Center places
the source code into /usr/src/kernels/2.<minor>.<patch>-<revision>).
Because you don't need the kernel source RPM in your payload, install the RPM on the host.
2.
If present, review the README file inside the kernel source for instructions on how to build and configure the
kernel.
It is highly recommended that you use, or at least base your configuration on, one of the vendor's standard kernel configurations.
3.
To use a stock configuration, copy it to the kernel source directory and run make oldconfig.
4.
Build the kernel and its modules using the make bzImage && make modules command. If your distribution uses
the Linux 2.4 kernel, use make dep && make bzImage && make modules but DO NOT install the kernel.
5.
Select Source Kernel from the File menu. A new kernel pane appears.
6.
7.
8.
9.
Enter the location of the kernel source (i.e., where you unpacked the kernel source) in the Source Directory field or
click Browse to open the Remote File Chooser. By default, kernel source files are located in /usr/src.
To make configuration changes to the kernel, see Edit a Kernel on page 107.
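As an illustration of the configure and build steps (steps 3 and 4) in the procedure above, the commands below build a 2.6 kernel starting from a vendor configuration. The kernel version and configuration file name are examples only and will differ on your system:
# cd /usr/src/linux-2.6.16.60-0.21
# cp /boot/config-2.6.16.60-0.21-default .config
# make oldconfig
# make bzImage && make modules
Remember not to run make install or make modules_install; the Source Kernel procedure reads the kernel binary and modules from the source directory you specify.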
Edit a Kernel
To Edit a Kernel
1.
2.
3.
(Optional) Click Update to update a kernel that has been recompiled for some reason (e.g., a change in kernel
configuration). Management Center updates the kernel based on the Source Directory and Binary Path used when
you created the kernel. See To Create a Kernel Using an Existing Binary on page 101.
4.
(Optional) Click Properties to view the *.config and System.map files for the kernel (if they existed when you
imported the kernel).
5.
(Optional) Edit the Parameters pane using the Form or Advanced view. The Form view organizes and displays the basic required options and provides the default values required for IPMI. The Advanced view allows you to view all configurations in an editable text field and allows you to configure the kernel's command-line parameter string.
A. Select Serial Console to specify which console (tty0 or tty1) you will use to communicate with hosts.
B. Select Baud Rate to change the baud rate used on your system.
C. Select RAMdisk Size to change the size of the RAMdisk configured on your system.
6.
(Optional) In the modules pane, click Add to include new modules in this kernel. You may select modules
individually (files ending in *.ko) or you can add a directory and allow Management Center to automatically select
all modules and directories recursively. See Modules on page 108.
7.
(Optional) In the modules pane, select any module(s) you want to remove from the kernel and click Delete.
8.
Click Apply to complete the process. Click Revert or Close to abort this action.
9.
10. (Optional) Click Copy to create a copy of this kernel. See To Create a Copy of an Existing Kernel on page 103.
MODULES
Many provisioning systems use a basic kernel to boot and provision the host, then reboot with an optimized kernel that
will run on the host. Management Center requires only a single kernel to boot and run; however, you must compile any
additional functionality into the kernel (i.e., monolithic) or add loadable kernel modules to the kernel (i.e., modular).
Management Center loads the modules during the provisioning process.
If you encounter problems when provisioning hosts on your cluster, check to see that you compiled your kernel
correctly. If you compiled a modular kernel, you must include ethernet or file system modules before the host can
provision properly. Use the serial console to watch the host boot.
In some cases, it may be necessary to install kernel modules on a host during the provisioning process, but not load
them at boot time. Because an image ties a kernel and payload together, modules can be copied to the host by adding
them to an image rather than adding them to a payload.
To add modules to an image, run mkdir -p ramdisk/lib/modules from the image's directory. For example, if you were running as root and your image name were ComputeHost:
cd $MGR_HOME/imaging/root/images/ComputeHost
mkdir -p ramdisk/lib/modules/<linux name & version>/kernel/
mkdir -p ramdisk/lib/modules/<linux name & version>/kernel/net/e1000
Then copy the modules you want to an appropriate subdirectory of the modules directory:
cp /usr/src/linux/drivers/net/e1000/e1000.ko \
   ramdisk/lib/modules/<linux name & version>/kernel/net/e1000/
You may wish to look at your local /lib/modules directory if you have questions about the directory structure. During
the boot process, the kernel automatically loads the modules that were selected in the kernel configuration screen. The
additional modules will be copied to the host during the finalize stage. This method keeps the payload independent from
the kernel and allows you to load the modules after the host boots.
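As a concrete sketch of the commands above, the following copies the e1000 driver into a ComputeHost image for a hypothetical 2.6.16.60-0.21-smp kernel; substitute your own image name, kernel version, and module path:
cd $MGR_HOME/imaging/root/images/ComputeHost
mkdir -p ramdisk/lib/modules/2.6.16.60-0.21-smp/kernel/net/e1000
cp /usr/src/linux/drivers/net/e1000/e1000.ko \
   ramdisk/lib/modules/2.6.16.60-0.21-smp/kernel/net/e1000/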
Delete a Kernel
To Delete a Working Copy of a Kernel
1.
2.
Right-click on the kernel in the imaging navigation tree and select Delete.
3.
Before you delete the working copy of your kernel, check VCS to verify that the kernel is checked in. See Version
Control System (VCS) on page 134 for details on using version control.
Once you check the kernel into VCS, you may delete the working copy of the kernel from your working directory (e.g.,
to save space).
$MGR_HOME/imaging/<username>/<kernel>/<name>
Image Management
Images contain exactly one payload and one kernel, and allow you to implement tailored configurations on various
hosts throughout the cluster.
Please consult SGI before upgrading your Linux distribution or kernel. Upgrading to a distribution or kernel not
approved for use on your system may render Management Center inoperable or otherwise impair system functionality.
Technical Support is not provided for unapproved system configurations.
Create an Image
To Create an Image
1.
Select New Image from the File menu or right-click in the imaging navigation tree and select New Image. A New
Image pane appears.
2.
3.
4.
5.
Select a Kernel by clicking Browse. To install additional kernel modules that do not load at boot time, see Modules
on page 108.
6.
7.
Define the partition scheme used for the compute hosts; the partition scheme must include a root (/) partition. See
To Create a Partition for an Image on page 114.
Kernel support for selected file systems must be included in the selected kernel (or as modules).
8.
9.
(Optional) If you need to make modifications to the way hosts boot during the provisioning process, select the
RAM Disk tab. See RAM Disk on page 128.
10. (Optional) Click the Advanced button to display the Advanced Options dialog. This dialog allows you to configure
partitioning behavior and payload download settings (see Advanced Imaging Options).
11. Click Apply to complete the process. Click Revert or Close to abort this action.
PARTITIONING OPTIONS
This option allows you to configure the partition settings used when provisioning a host. You may automatically
partition a host if the partitioning scheme changes or choose to never partition the host. You may also specify if the
image should use GPT partition tables or EFI. See Managing Partitions on page 114.
FORMATTING OPTIONS
These options allow you to configure the partition formatting settings used when provisioning a host. You may
automatically format when drives need to be formatted (for example, if the payload or the partitioning scheme changes),
always re-create all partitions (including those that are exempt from being overwritten), or choose to never format.
DOWNLOAD OPTIONS
These options allow you to automatically download a payload if a newer version is available (or if the current payload
is not identical to that contained in the image), always download the payload, or choose to never download a payload.
KERNEL VERBOSITY
The kernel verbosity level (1-8) allows you to control debug messages displayed by the kernel during provisioning. The
default value, 1, is the least verbose and 8 is the most.
boot.profile
Management Center generates the file, boot.profile, each time you save an image (overwriting the previous file in
/etc/boot.profile). The boot profile contains information about the image and is required for the boot process to function
properly. You may configure the following temporary parameters:
dmesg.level - The verbosity level (1-8) of the kernel; 1 (the default) is the least verbose and 8 is the most.
partition - Configure the hard drive re-partitioning status (Automatic, Always, Never). By default, Automatic.
partition.once - Override the current drive re-partitioning status (Default, On, Off). By default, Default.
image - Configure the image download behavior (Automatic, Always, Never). By default, Automatic. Always downloads the image even if it is up-to-date.
image.once - Override the current image download behavior (Default, On, Off). By default, Default. To view the current download behavior, see Advanced Imaging Options on page 111.
image.path - Specifies where to store the downloaded image. By default, /mnt.
To change the configuration of one of these parameters, add the parameter (e.g., dmesg.level: 7) to the boot.profile and
provision using that image. You may also configure most of these values from the GUI. See Select an Image and
Provision on page 141.
Changes made to image settings remain in effect until the next time you save the image.
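For example, a boot.profile that raises kernel verbosity and forces a one-time image download might contain entries like the following; the values are illustrative and use only the options listed above:
dmesg.level: 7
partition: Automatic
partition.once: Default
image: Automatic
image.once: On
image.path: /mnt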
To Create a Copy of an Existing Image
2.
Select an image from the navigation tree, then right-click on the image and select Copy.
You may also open an image for editing, then click the Copy button.
3.
Management Center prompts you for the name of the new image.
4.
Enter the name of the new image and click OK. Click Cancel to abort this action.
Delete an Image
To Delete a Working Copy of an Image
1.
2.
Once you check the image into VCS, you may remove the directory from within your working user directory (e.g., to
save space).
$MGR_HOME/imaging/<username>/images/<name>
To verify that your changes were checked in, use the VCS status option. See Version Control System (VCS) on page 134
for details on using version control.
Managing Partitions
To Create a Partition for an Image
1.
Right-click on an image in the imaging navigation tree and select Edit. The Image panel appears.
2.
In the partitions pane, click Add to create a new partition. The New Partition dialog appears.
3.
Select a file system type from the Filesystem pull-down menu. To create a diskless host, see Diskless Hosts on
page 125.
4.
Enter the device on which to add the partition or select a device from the drop-down list. Supported devices include
the following, but the most common is /dev/hda because hosts typically have only one disk and use IDE:
/dev/hda - Primary IDE Disk
/dev/hdb - Secondary IDE Disk
/dev/sda - Primary SCSI Disk
/dev/sdb - Secondary SCSI Disk
If you are using non-standard hosts, you can add additional storage devices to the partitioning drop-down list. The
Image Administration Service profile, $MGR_HOME/etc/ImageAdministrationService.profile, allows you to configure
non-standard hard drives. This profile contains options that allow you to set the drive name (available when partitioning
the disk at the time of creating or modifying an image) and the prefix for a partition on the drive (if one exists). By
default these values are commented out, but may be commented in as needed. Once drives are configured, they become
available via Management Center.
Profile options are as follows:
partitioning.devices:cciss/c0d0
The name of the storage device where the device file is located (e.g., /dev/cciss/c0d0).
partitioning.devices.cciss/c0d0.naming:p
The partition prefix for the device defined by the previous key (e.g., cciss/c0d0).
In this example, the partition will look like c0d0p1, c0d0p2, and so on.
5.
6.
(Optional) Enter the fstab options. The /etc/fstab file controls where directories are mounted and, because
Management Center writes and manages the fstab on the hosts, any changes made on the hosts are overwritten
during provisioning.
7.
(Optional) Enter the mkfs options to use when creating the file system (i.e., file size limits, symlinks, journalling).
For example, to change the default block size for ext3 to 4096, enter -b 4096 in the mkfs options field.
8.
9.
10. (Optional) Un-check the Format option to make the partition exempt from being overwritten or formatted when
you provision the host. This may be overridden by the Force formatting option or from the boot.profile (see Select
an Image and Provision on page 141 and boot.profile on page 112).
After partitioning the hard disk(s) on a host for the first time, you can make a partition on the disk exempt from being overwritten or formatted when you provision the host. However, deciding not to format the partition may have an adverse effect on future payloads; some files may remain from previous payloads. This option is not allowed if the partition sizes change when you provision the host.
For nodes with external storage, detach the storage when provisioning. The discovery order may present the external
storage first and, consequently, Management Center will use the storage for the filesystems it manages.
11. Select the partition size:
Fixed size allows you to define the size of the partition (in MBs).
Fill to end of disk allows you to create a partition that uses any space that remains after defining partitions with
fixed sizes.
It is wise to allocate slightly more memory than is required on some partitions. To estimate the amount of memory
needed by a partition, use the du -hc command.
12. Click Apply to save changes or click Cancel to abort this action.
13. (Optional) Click Check In to import the image into VCS.
14. Click Apply to complete the process. Click Revert or Close to abort this action.
Management Center generates the file, boot.profile, each time you save an image. For a description of the information
contained in this file, see boot.profile on page 112.
RAID Partitions
To Create a RAID Partition
When adding a RAID partition, the host typically requires two disks and at least two previously created software RAID
partitions (one per disk).
1.
Right-click on an image in the imaging navigation tree and select Edit. The image pane appears.
2.
In the partitions pane, click Add to create the appropriate number of software RAID partitions for the RAID you
are creating. See To Create a Partition for an Image on page 114.
The RAID button is disabled until you create at least two RAID partitions.
3.
Click the RAID button to assign the partitions a file system, mount point, and RAID level. The Add RAID dialog appears.
4.
5.
6.
Select a RAID level from the RAID Level pull-down menu. This level affects the size of the resulting RAID and
the number of RAID partitions required to create it (e.g., RAID0 and RAID1 require 2 RAID partitions, RAID5
requires 3 RAID partitions).
7.
(Optional) Enter the fstab options. The /etc/fstab file controls where directories are mounted and, because
Management Center writes and manages the fstab on the hosts, any changes made on the hosts are overwritten
during provisioning.
8.
(Optional) Enter the mkfs options to use when creating the file system (i.e., file size limits, symlinks, journalling).
For example, to change the default block size for ext3 to 4096, enter -b 4096 in the mkfs field.
9.
From the RAID Members list, select the currently unused RAID partitions to include in this RAID.
Edit a Partition
To Edit a Partition on an Image
1.
Right-click an image in the imaging navigation tree and select Edit. The image panel appears.
2.
In the partitions pane, select the partition you want to edit from the list of partitions.
3.
Click Edit in the partitions pane. The Edit Partition dialog appears.
4.
Make any necessary changes to the partition, then click Apply to accept the changes. Click Cancel to abort this
action.
Delete a Partition
To Delete a Partition from an Image
1.
Right-click an image in the imaging navigation tree and select Edit. The image panel appears.
2.
From the partitions pane, select the partition you want to delete from the list of partitions. To select multiple
partitions, use the Shift or Ctrl keys.
3.
Click Delete.
User-Defined File Systems
1.
Right-click on an image in the imaging navigation tree and select Edit. The image panel appears.
2.
From the partitions pane, click Add. The New Partition dialog appears.
3.
4.
Enter the device on which to add the partition or select a device from the pull-down menu. Supported devices
include the following, but the most common is /dev/hda because hosts typically have only one disk and use IDE:
/dev/hda - Primary IDE Disk
/dev/hdb - Secondary IDE Disk
/dev/sda - Primary SCSI Disk
/dev/sdb - Secondary SCSI Disk
If you are using non-standard hosts, you can add additional storage devices to the partitioning drop-down list. The
Image Administration Service profile, $MGR_HOME/etc/ImageAdministrationService.profile, allows you to configure
non-standard hard drives. This profile contains options that allow you to set the drive name (available when partitioning
the disk at the time of creating or modifying an image) and the prefix for a partition on the drive (if one exists). By
default these values are commented out, but may be commented in as needed. Once drives are configured, they become
available via Management Center.
Profile options are as follows:
partitioning.devices:cciss/c0d0
The name of the storage device where the device file is located (e.g., /dev/cciss/c0d0).
partitioning.devices.cciss/c0d0.naming:p
The partition prefix for the device defined by the previous key (e.g., cciss/c0d0).
In this example, the partition will look like c0d0p1, c0d0p2, and so on.
5.
Create a plug-in to create the user-defined file system. Everything required to build and mount the file system will
need to be included in the RAMdisk. Kernel modules needed to support the file system must be added to the kernel
you selected. See Plug-ins for the Boot Process on page 130.
6.
7.
8.
9.
Click Apply to complete the process. Click Revert or Close to abort this action.
Management Center generates the file, boot.profile, each time you save an image. See boot.profile on page 112 for a
description of the information contained in this file.
Diskless Hosts
Management Center provides support for diskless hosts. For optimal performance, Management Center implements
diskless hosts by installing the operating system into the host's physical memory, generally referred to as RAMfs or
TmpFS. Because the OS is stored in memory, it is recommended that you use a minimal Linux installation to avoid
consuming excess memory. An optimized Linux installation is typically around 100-150MB, but may be as small as
30MB depending on which libraries are installed. Management Center also supports local scratch or swap space on the
hosts.
Potentially large directories like /home should never be stored in RAM. Rather, they should be shared through a global
storage solution.
When using diskless hosts, the file system is stored in memory. Changes made to the host's file system will be lost when
the host reboots. If changes are required, make them in the payload first.
SGI offers secure diskless systems for classified environments. These include integration of micro installation with a
globally mounted file system and scripts that optimize and simplify diskless management. Additional options for
diskless systems are available through SGI Professional Services. Please contact SGI or speak with your SGI
representative for more information.
1.
Right-click on an image in the imaging navigation tree and select Edit. The image panel appears.
2.
From the partitions pane, click Add. The New Partition dialog appears.
3.
Select the tmpfs or nfs file system type from the Filesystem pull-down menu.
Although diskless hosts may use either tmpfs or nfs partitions, they must use only one type. If you are converting or
editing a diskless host, change all partitions to the same type.
4.
Enter the Mount Point or select one from the pull-down menu (diskless hosts use root / as the mount point).
In most Linux installations, the majority of the OS is stored in the /usr directory. To help conserve memory, you may
elect to share the /usr directory via NFS or another global file system.
5.
(Optional) Enter the fstab options. The /etc/fstab file controls where directories are mounted.
Because Management Center writes and manages the fstab on the hosts, any changes made on the hosts are overwritten
during provisioning.
6.
It is wise to allocate slightly more memory than is required on some partitions. To estimate the amount of memory
needed by a partition, use the du -hc command.
It is important to note that memory allocated to a partition is not permanently consumed. For example, consider
programs that need to write temporary files in a /tmp partition. Although you may configure the partition to use a
maximum of 50 MB of memory, the actual amount used depends on the contents of the partition. If the /tmp partition is
empty, the amount of memory used is 0 MB.
7.
8.
9.
Click Apply to complete the process. Click Revert or Close to abort this action.
Management Center generates the file, boot.profile, each time you save an image. See boot.profile on page 112 for a
description of the information contained in this file.
RAM Disk
The RAM Disk is a small disk image that is created and loaded with the utilities required to provision the host. When
the host first powers on, it loads the kernel and mounts the RAM Disk as the root file system. In order for host
provisioning to succeed, the RAM Disk must contain specific boot utilities. Under typical circumstances, you will not
need to add boot utilities unless you are creating something such as a custom pre-finalize script that needs utilities not
required by standard Linux versions (e.g., modprobe).
Management Center uses two skeleton RAM Disks: one for ia32 and another for both AMD-64 and EM64T. These
skeleton disks are located in $MGR_HOME/ramdisks and should never be modified manually. All changes must be
performed through Management Center or in $MGR_HOME/imaging/<username>/images/<image_name>/ramdisk.
Modifications made to the RAM Disk are permanent for ALL images.
To Add Boot Utilities
1.
Right-click on an image in the imaging navigation tree and select Edit. The image panel appears.
2.
Click the RAM Disk button. The RAM Disk dialog appears. Default files from the skeleton RAM Disk are grayed out; any changes or updates appear in black.
3.
4.
Enter the boot utility path in the Source field or click Browse to locate a utility.
5.
Specify the Destination location in which to install the boot utility in the RAM Disk file system.
6.
Click OK to install the boot utility or click Cancel to abort this action.
7.
(Optional) Select Add Debug Utilities to apply additional debugging utilities to the RAM Disk.
8.
Click Apply to complete the process. Click Revert or Close to abort this action.
Management Center generates the file, boot.profile, each time you save an image. See boot.profile on page 112 for a
description of the information contained in this file.
Plug-ins for the Boot Process
The provisioning boot process runs in five stages, with plug-in hook points before and after each stage:
Stage 1 - initialize (followed by /plugins/postinitialize)
Stage 2 - identify (preceded by /plugins/preidentify, followed by /plugins/postidentify)
Stage 3 - partition (preceded by /plugins/prepartition, followed by /plugins/postpartition)
Stage 4 - image (preceded by /plugins/preimage, followed by /plugins/postimage)
Stage 5 - finalize (preceded by /plugins/prefinalize)
Stage one creates writable directories and loads any kernel modules.
Stage two uses DHCP to get the IP address and host name.
Stage three creates partitions and file systems.
Stage four downloads and extracts the payload.
Stage five configures Management Center services to run with the host name retrieved from
DHCP.
All plug-ins must be added inside the RAM Disk under /plugins/<filename>.
The provisioning plug-in scripts run each time the node is booted with an image that contained that plug-in at the time it was provisioned. This is not dependent on whether or not a new payload is being downloaded or similar situations. Plug-in scripts should be written in such a way that running them multiple times against the same installed payload will not cause problems.
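For illustration, a minimal postpartition plug-in might look like the following shell script. The module name is hypothetical, and any utility the script calls (such as modprobe) must already be present in the RAM Disk:
#!/bin/sh
# /plugins/postpartition - runs immediately after the partition stage
# Written to be safe if it runs more than once on the same host.
echo "postpartition plug-in: partitions created" > /dev/console
# Load an extra driver that was added to the RAM Disk (example only)
modprobe e1000 2>/dev/null || true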
To Add a Plug-in
The following example depicts how to run a script during the boot process.
1.
Write a shell or Perl script to run during the boot process. For example, to run a script immediately after
partitioning a drive, name the script postpartition and add it to the plugins directory in the RAMdisk
(i.e., /plugins/<filename>).
You must add all necessary utilities for your plug-in script to the RAM Disk. For example, if you use a Perl script as a
plug-in, you must add the Perl binary and all necessary shared libraries and modules to the RAM Disk. The shared
libraries for a utility may be determined using the ldd(1) command. Please note that adding these items significantly
increases the size of the RAM Disk. See To Add Boot Utilities on page 128.
2.
Right-click on an image in the imaging navigation tree and select Edit. The image panel appears.
3.
Click the RAM Disk button. The RAM Disk dialog appears.
4.
5.
Enter the boot utility path in the Source field or click Browse to locate a plug-in.
6.
All scripts must be installed in the /plugins/ directory. However, you can overwrite other utilities.
7.
Click OK to install the utility or click Cancel to abort this action. The new plugins appear in the RAM Disk dialog.
8.
(Optional) Select Add Debug Utilities to apply additional debugging utilities to the RAM Disk.
9.
Click Close.
10. Click Apply to complete the process. Click Revert or Close to abort this action.
Management Center generates the file, boot.profile, each time you save an image. See boot.profile on page 112 for a
description of the information contained in this file.
You can also use VCS Management to copy a payload, kernel, or image and create a new version. See VCS
Management on page 137.
Version Control
The following diagram illustrates version control for a kernel. The process begins with a working copy of a kernel that
is checked into VCS as a versioned kernel. The kernel is then checked out of VCS, modified (as a working copy of the
kernel), and checked back into VCS as a new version of the original kernel.
[Diagram: a working copy of a kernel is checked into VCS as a versioned kernel, checked out and modified, then checked back in as a new version.]
If another user checks out a copy of the same item you are working with and checks it back into VCS before you do,
you must either discard your changes and check out the latest version of the item or create a new branch that does not
contain the items checked in by the other user.
A Working Copy of a payload, kernel, or image is currently present in the working area (e.g. $MGR_HOME/imaging/
<user>/payloads). A Versioned payload, kernel, or image is a revision of a payload, kernel, or image stored in VCS.
Management Center displays payloads, kernels, or images that are currently checked out of VCS in the imaging tree.
These items may be edited only while they are checked out, but you may check them into VCS to store your changes. If
you are not using a working copy of an item (e.g., it is checked into VCS), you can delete it to conserve space.
Version Branching
Image management works with VCS to allow you to branch any payload, kernel, or image under version control
arbitrarily from any version. Suppose, for example, that a payload under version control was gradually optimized to suit
specific hardware contained in a cluster. If the optimization were performed in stages (where each stage was a different
VCS revision), VCS would contain multiple versions of the payload.
Now suppose that you added some new hosts with slightly different hardware specifications to the cluster, but the last
few revisions of the payload use optimizations that are incompatible with the new hardware. Using the version
branching feature, you could create a new branch of the payload based on an older version that does not contain the
offending optimizations. The new branch could be used with the new hosts, while the remaining hosts could use the
original payload.
1.
After making changes to a payload, kernel, or image, click Check In or select Check In from the VCS menu. The
VCS Import dialog appears.
2.
(Optional) Enter an alias to use when referring to this version. The alias is the name displayed in the VCS Log
between the parentheses:
1(<Alias>)
February 26, 2004 9:14:17 AM MST, root
Description of changes...
3.
(Optional) Select Branch to create a new branch of this item. Do not select this option if you want Management
Center to create a new revision on the current branch.
If another user checks out a copy of the same item you are working with and checks it back into VCS before you do,
you must either discard your changes and check out the latest version of the item or create a new branch that does not
contain the items checked in by the other user.
4.
VCS Check In may fail if you have insufficient disk space. To monitor the amount of available disk space, configure the
disk space monitor to log this information, e-mail the administrator, or run a script when disk space is low. See
Management Center Monitoring and Event Subsystem on page 165 for details.
1.
Select Check Out from the VCS option in the Actions menu. The VCS Check Out dialog appears.
2.
Select the payload, kernel, or image you want to check out of VCS (use the Shift or Ctrl keys to select multiple
items).
When you check out a payload, kernel, or image, Management Center creates a working copy of the item. If you check
out the root of a payload, kernel, or image, Management Center selects the tip revision.
Every time a user creates a payload (or checks a payload out of VCS), Management Center stores a working copy of the
payload in the users $MGR_HOME/imaging directory. To accommodate this process, Management Center requires a
minimum of 10 GB of disk space. Once the payload is checked into VCS, the user may safely remove the contents of
the imaging directory.
3.
Click OK. Management Center places the item(s) into a working directory where you may make changes. Click
Cancel to abort this action.
VCS Management
The VCS management console allows you to copy, delete, or view the change history for a particular payload, kernel,
or image.
1.
Select Manage from the VCS option in the Actions menu. The VCS Management dialog appears.
2.
Click the Add (A), Modify (M), or Delete (D) options to include or exclude specific information.
3.
To remove a versioned payload, kernel, or image from VCS, select the item from the navigation tree and click
Delete. When deleting a version of any item, all subsequent versions are also deleted (i.e., deleting version 4 also
removes versions 5, 6, and so on).
If you select Payloads, Kernels, or Images from the navigation tree, clicking Delete will remove ALL payloads, kernels,
or images from the system.
4.
To copy a payload, kernel, or image, right-click on the item in the navigation tree and select Copy. Management
Center prompts you for a new name, then creates a new copy of the item in VCS.
1.
Open the file, $MGR_HOME/etc/exclude.files (a copy of this file should exist on all hosts):
proc
dev/pts
etc/ssh/ssh_host_dsa_key
etc/ssh/ssh_host_dsa_key.pub
etc/ssh/ssh_host_key
etc/ssh/ssh_host_key.pub
etc/ssh/ssh_host_rsa_key
etc/ssh/ssh_host_rsa_key.pub
media
mnt
root/.ssh
scratch
sys
tmp
usr/local/src
usr/share/doc
usr/src
var/cache/
var/lock
var/log
var/run
var/spool/anacron
var/spool/at
var/spool/atjobs
var/spool/atspool
var/spool/clientmqueue
var/spool/cron
var/spool/mail
var/spool/mqueue
var/tmp
2.
It is best to edit this file while it is in the payload so it can be copied to all hosts.
VersionControlService.profile
Management Center uses VersionControlService.profile, a global default exclude list that is not distribution-specific. You may add files or directories to this list to prevent Management Center from checking them into VCS, which is particularly helpful when importing payloads from the working directory. To remove items from the exclusion list, comment them out of the profile.
Also contained in the VersionControlService.profile, the deflate.temp:/<dir> parameter allows you to specify an
alternate path for large files created while importing a payload.
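For example, to stage those temporary files on a larger scratch file system, the entry might look like this; the directory is illustrative:
deflate.temp:/scratch/tmp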
Provisioning
The Management Center provisioning service allows you to create an image from a payload and kernel, then apply that
image to multiple hosts. When provisioning, you can select a versioned image stored in VCS or use a working copy of
an image from your working directory. The following illustration depicts an image that is provisioned to multiple hosts.
[Illustration: a payload and a kernel are combined into an image, which is then provisioned to multiple hosts.]
Select an Image and Provision
1.
Select the host(s) you want to provision from the navigation tree (use the Shift or Ctrl keys to select multiple hosts).
If you want to provision a host using the latest revision of an image stored in VCS, you can right-click a host and select
Provision. Management Center displays a popup menu and allows you to select the image you want to use to provision.
If you have made only minor changes to an image and want to upgrade your hosts to use the new image, see VCS
Upgrade on page 144.
2.
3.
A Versioned image is a revision of an image that is checked into VCS. A Working image has not been checked into VCS
and is currently present in the working area (e.g., $MGR_HOME/imaging/<user>/images). This allows you to test
changes prior to checking in. See Version Control System (VCS) on page 134 for details on using the version control
system.
A Working Copy of an image is currently present in the working area (e.g., $MGR_HOME/imaging/<user>/images). A
Versioned image is a revision of an image stored in VCS. See Version Control System (VCS) on page 134 for details on
using the version control system.
4.
5.
(Optional) Click the Advanced button to display the Advanced Options dialog (see Advanced Provisioning Options
on page 145). This dialog allows you to override partitioning, payload, and kernel verbosity settings.
6.
Click Provision to distribute the image to the selected hosts. Management Center asks you to confirm your action.
7.
When you click Yes, Management Center re-provisions the hosts using the new image. Any pending or running jobs on
the selected host(s) are lost.
Right-click Provisioning
1.
Select the host(s) you want to provision from the navigation tree (use the Shift or Ctrl keys to select multiple hosts).
2.
Right-click a host and select Provision. Management Center displays a popup menu and allows you to select the
image you will use to provision.
VCS Upgrade
VCS Upgrade is a quick, easy way to make small changes to hosts. Unlike provisioning (which requires rebooting the
host and reformatting its hard drive), the VCS Upgrade feature copies the VCS revision to the host and inflates it while
the host is running. Using the upgrade feature requires that you check all changes into the payload, that the payload
revision is updated in the image, and that you check in the image.
The update feature will update only those hosts with files managed by the payload and will not affect the running kernel
or file system information. If there are changes to the kernel or image, they will not take place until the host is reprovisioned with that image. You cannot downgrade a host by using an older version of a payload.
Major changes made to hosts should be done using provisioning. This ensures that all hosts are homogenous and takes
full advantage of multicast. Also, VCS Upgrade leaves the image and payload on the host out of sync from what is
available in the VCS repository; for this reason, SGI recommends that you use Advanced Provisioning Options on
page 145 to schedule the hosts to be re-provisioned with the selected image the next time they reboot.
To Upgrade a Host(s)
1.
2.
Open the Versioned Images tab and select the image you want to use to upgrade the host(s).
3.
Select the host(s) you want to upgrade from the navigation tree (use the Shift or Ctrl keys to select multiple hosts).
4.
Click Update to update the image to the selected hosts. As the operation begins, a status dialog appears.
Advanced Provisioning Options
To change the default scheduled provisioning setting, see Provisioning on page 34.
Scheduling a provision at next reboot can be especially useful when used with PBS. For example, you may make
updates to a payload, then schedule provisioning to occur only after the current tasks are complete. To do this, the root
user (who must be allowed to submit jobs) can submit a job to each host instructing it to reboot.
The root user can submit jobs to PBS only if acl_roots is configured. To configure acl_roots, run qmgr and enter the
following from the qmgr prompt:
qmgr: set server acl_roots += root
If you already set up additional ACLs, you will also need to add root to those ACLs. For example, suppose you have an
acl_users list that allows access to a queue, workq. The command to add root to the ACL would be:
# set queue workq acl_users += root
The following is a sample PBS script you might use to reboot hosts:
#################################################
#!/bin/bash
for i in `seq 1 64`
do
echo \#PBS -N Reboot_n$i > Reboot_n$i.pbs
echo \#PBS -joe >> Reboot_n$i.pbs
echo \#PBS -V >> Reboot_n$i.pbs
echo \#PBS -l nodes=n$i >> Reboot_n$i.pbs
echo \#PBS -q workq >> Reboot_n$i.pbs
echo \#PBS -o /dev/null >> Reboot_n$i.pbs
echo \/sbin\/reboot >> Reboot_n$i.pbs
echo done >> Reboot_n$i.pbs
qsub < Reboot_n$i.pbs
rm Reboot_n$i.pbs
done
#################################################
PARTITIONING OPTIONS
This option allows you to override the current partition settings. You can automatically re-partition a host if the partitioning scheme changes or choose not to re-partition drives.
FORMATTING OPTIONS
You can automatically format partitions if the payload or partitioning scheme changes, force formatting of all partitions, including those that are exempt from being overwritten (see Partitions on page 55), or choose not to format.
KERNEL VERBOSITY
The kernel verbosity level (1-8) allows you to control debug messages displayed by the kernel during provisioning. The
default value, 1, is the least verbose and 8 is the most.
Chapter 7
Instrumentation and Events
Instrumentation
The Management Center instrumentation service provides the ability to monitor system health and activity for every
host in the cluster. Hosts may be monitored collectively to provide a general system overview, or individually to allow
you to view the configuration of a particular host (useful when diagnosing problems with a particular host or
configuration). From the Instrumentation tab, you can view statistical data for the following areas:
Overview
Thumbnail
List
CPU
Memory
Disk
Network
Kernel
Load
Environmental
Environmental List
GPU
Power
When monitoring the Management Center Master Host, the name of the Master Host must match the name assigned in
$MGR_HOME/@genesis.profile.
When using the Management Center client by exporting an X session over an SSH connection, enabling the gradient fill
and anti-aliasing options for instrumentation may adversely affect the performance of the GUI. This is common on
slower systems. To improve system performance, disable the Gradient Fill and Anti-Aliasing options under the View
menu. For best performance, install a Management Center Client.
States
Management Center uses the following icons to provide visual cues about system status. These icons appear next to
each host viewed with the instrumentation service or from the navigation tree. Similar icons appear next to clusters,
partitions, and regions to indicate the status of hosts contained therein.
The states shown are: Logging, Healthy, Informational, Warning, Critical Error, Provisioning, Off, Unknown, and On.
Event Log
Management Center also tracks events logged for each host in the cluster. The Management Center event log is located
on the instrumentation overview screen. If you select multiple hosts (or a container such as a cluster, partition, or
region), the log shows messages for any host in the selection. If you select a single host, the event log shows messages
for this host only. Events have three severity levels: error, warning, and information. For additional details on
instrumentation event monitoring, see Management Center Monitoring and Event Subsystem on page 165.
Menu Controls
The output for the instrumentation service is easily configured and displayed using menu controls located in the View
menu.
View Menu
Metrics - Select and display custom metrics defined for your system; this option is not available to all tab views. See Metrics on page 178 for information on defining metrics.
Interval - Set the frequency (in seconds) with which to gather and display data: 10, 5, or 1.
Layout - Arrange how the instrumentation panel displays information.
Filter - List hosts that are in specific states (Thumbnail tab only).
Size - Change the display size of thumbnails (Small, Medium, Large).
Sort - Organize and display statistical data according to the name or state of the host(s).
Temperatures - Select the format in which to display temperatures (Celsius, Fahrenheit).
Anti-Aliasing - Apply smoothing to line graphs.
Gradient Fill - Apply fill colors to line graphs.
Overview Tab
The Overview tab provides details about the configuration, power status, resource utilization, and health status of the
host(s) selected in the host navigation tree. Selecting a Cluster, Partition, or Region in the tree displays all hosts
contained in it. See States on page 148 for a list of system health indicators and Event Log on page 148 for information
regarding messages generated by the host(s).
Thumbnail Tab
The Thumbnail tab displays a graphical representation of the system health, event log status, CPU usage, memory
availability, and disk space. From the View menu, you may filter hosts to display only those in a specific state, resize
the thumbnails, or sort the hosts by name or state. See States on page 148 for a list of system health indicators.
List Tab
The List tab displays all pre-configured and custom metrics being observed by the instrumentation service. To add
metrics to this list, select Metrics from the View menu. To create new metrics, see Instrumentation on page 147.
You may copy and paste the contents of list view tables for use in other applications.
CPU Tab
Select the CPU tab to monitor the CPU utilization for the selected host(s).
Memory Tab
Select the Memory tab to monitor the physical and virtual memory utilization for the selected host(s).
Disk Tab
Select the Disk tab to monitor the disk I/O and usage for the selected host(s).
Network Tab
Select the Network tab to monitor packet transmissions and errors for the selected host(s).
Kernel Tab
Select the Kernel tab to monitor the kernel information for the selected host(s).
Load Tab
Select the Load tab to monitor the load placed on the selected host(s).
Environmental Tab
Select the Environmental tab to view the temperature summary readings for the selected host(s). Each summary contains up to five temperature readings: four processor temperatures followed by the ambient host temperature (which requires an Icecard). On hosts that support IPMI, these temperature readings differ slightly: two processor temperatures, two power supply temperatures, and the ambient host temperature.
The processor temperature readings for IPMI-based hosts indicate the amount of temperature change that must occur before the CPU's thermal control circuitry activates to prevent damage to the CPU. These are not actual CPU temperatures.
From the Environmental tab, you can access the following options from the View menu:
Filter - Filter and display hosts based on error status.
Size - Change the size of the thumbnail view (small, medium, or large). Small thumbnails support a mouse-over function to display a host summary.
Temperatures - Set temperature options to display values as Celsius or Fahrenheit. Temperatures range from green (cool) to yellow (warm) to red (hot). Fan speeds follow the same conventions; slow or stopped fans appear in red.
Environmental List Tab
GPU Tab
As shown in the panel above, SGI Management Center provides monitoring of supported GPUs (items like
temperature, fan speed, memory usage, and ECC). For a listing of GPU solutions supported by SGI see
http://www.sgi.com/pdfs/4235.pdf.
Power Tab
The SGI Management Center DCM integration uses indirect TCP communications through an external web service
provider to accumulate and display power monitoring data. This results in a delay in updating instrumentation for every
tree selection change. Consequently, the waiting time for initial Power panel updates for large-scale systems using
DCM may be several minutes in duration.
The following are the primary components of the Power panel:
Details table
Status table
Power Utilization pie chart
Power Trend chart
Details Table
The Details table contains configuration details for various power-related entities.
System Count - Displays the number of systems associated with the selected entity, if available.
Monitor Server
BMC Address - Shows the endpoint BMC address currently in use (if several or none are offered, this entry is unavailable).
Derated Power
Nameplate Power
Power Status
Policy Status
Capabilities
Status Table
The Status table contains power sampling and measurement data.
Maximum Power - Total maximum power measurement recorded in any monitoring cycle for all sampling intervals within the aggregation period applied for the selected entity (which may be a group): max{ sum_T1(max_N1{ P1, P2,...,Pn}, ..., max_Nn{...}), ..., sum_Tn(...)}
Average Power - Sum of the entity/group mean power measurements as given by the sum of the arithmetic mean of power measurements for all sub-nodes within the specified entity/group for all sampling intervals within the aggregation period: avg{ sum_T1(avg_N1{ P1, P2, ..., Pn}, ..., avg_Nn{...}), ..., sum_Tn(...)}
Minimum Power - Result calculated much like Maximum Power, but with floor. Pn is the last monitoring cycle in a sampling interval. Tn is the last sampling interval in an aggregation period.
A measured, calculated and/or configured sum of all of the derated power components for the selected endpoint.
Taken from a prescribed IPMI/SMBUS-accessible inlet-air sensor for the endpoint (if available).
Used - Indicates the last instantaneously sampled power measurement for the endpoint.
Unused
Lost - Calculated using the configured or estimated power factor for the endpoint, and is generally an estimate of the efficiency of the power distribution for a node or rack.
Failure Analysis
SGI Management Center supports failure analysis for memory errors via memlog, a software component of SGI
Foundation Software.
Monitors run at a set interval and collect information from each host. Listeners receive information about metrics from
the instrumentation service, then determine if the values are reasonable. If a listener determines that a metric is above or
below a set threshold, the listener triggers a logger to take a specific action.
Typically, configuration files are host-specific and are located in the $MGR_HOME/etc directory. If you modify the
configuration files, you can copy them into the payload to make them available on each host after you provision.
By default, Management Center creates a backup of the $MGR_HOME/etc directory during installation and copies it to
$MGR_HOME/etc.bak.<date>.<timestamp>
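For example, after editing a monitoring profile on the Master Host, you might copy it into a payload so that newly provisioned hosts receive the change. The payload name and the payload-relative location of $MGR_HOME (/opt/sgi/sgimc) are assumptions here:
# cp $MGR_HOME/etc/InstrumentationMonitors.profile \
     $MGR_HOME/imaging/root/payloads/Compute/opt/sgi/sgimc/etc/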
Monitors
Management Center Monitors run periodically on the cluster and provide metrics that are gathered, processed, and
displayed using the Management Center instrumentation GUI. Using monitors allows you to tune Management
Center to meet your exact system needs by enabling or disabling specific monitors or by setting the rate at which
monitors run. In cases where pre-defined monitors simply do not meet your specific needs, Management Center also
allows you to create custom monitors (see Custom Monitors on page 174). The following table lists the Management
Center default monitors.
Monitor Name     Interval
NFS Client
NFS Server
BlueSmoke        500
Disk
Disk Space       60
Identity
Kernel
LinuxBIOS        86400
Load             15
Memory
Network
Uptime           60
Environmental
All standard Management Center monitors are configured in the InstrumentationMonitors.profile in the
$MGR_HOME/etc directory. The format of the monitor configuration in the file is generally as follows (where <time>
is in milliseconds):
<name>: com.lnxi.instrumentation.server.<monitor_name>
<name>.interval: <time>
When working with standard monitors, it is strongly recommended that you leave all monitors enabled; however, you
can increase how often these monitors run. Raising the interval can reduce CPU time and network use for monitoring.
Because Management Center uses very little CPU processing time on the compute hosts, values as high as 1 second
(1000 milliseconds) are nearly undetectable. By default, some monitors are set to run at 5 seconds (5000 milliseconds)
or longer.
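Following that format, an entry that runs a monitor every 5 seconds (5000 milliseconds) might look like the following; the monitor name and class shown are placeholders rather than exact names from the product:
load: com.lnxi.instrumentation.server.LoadMonitor
load.interval: 5000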
When monitoring the Management Center Master Host, the name of the Master Host must match the name assigned in
$MGR_HOME/@genesis.profile.
1.
Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
Select Monitors.
3.
Check or un-check the box next to each monitor you want to enable or disable.
4.
(Optional) Click Apply as Default to apply the listener configuration as the default on the Master Host and payload.
Management Center saves the listeners in InstrumentationMonitors.profile.default.
5.
(Optional) Click Apply to Hosts to apply the monitor to a specific host(s). The Export to Hosts dialog appears.
A. Select the host(s) to which to export the monitors from the navigation tree.
(Optional) Click Apply to Payloads to include these monitors as part of a payload. The Export to Payloads dialog appears.
Click Close to complete this action and close the Event Administration dialog.
If you click close without applying your changes, all modifications will be lost.
To Add a Monitor
1.
Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
Select Monitors.
3.
For information on creating a custom monitor, see Custom Monitors on page 174.
4.
5.
Enter the path of the executable script used for this monitor or click Browse to locate the script.
6.
7.
8.
9.
When you add a monitor and click Apply as Default, Management Center saves the monitor as one of the default
monitors; all future payloads will contain the new monitor. Furthermore, the new monitor will be included any time
you install Management Center into a payload.
11. Click Close.
To Import Monitors
IMPORT FROM HOST
1.
Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
Select Monitors.
3.
Click Import and select Import from Host. The Import from Hosts dialog appears.
4.
Select the host from which to import monitors and click Import. Click Cancel to abort this action.
IMPORT FROM PAYLOAD
1.
Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
Select Monitors.
3.
Click Import and select Import from Payload. The Import from Payloads dialog appears.
4.
Select the payload from which to import monitors and click Import. Click Cancel to abort this action.
IMPORT DEFAULT
1.
Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
Select Monitors.
3.
Click Import and select Import Default. Management Center restores all monitors stored as default monitors in
InstrumentationMonitors.profile.default. See To Enable or Disable a Listener on page 183 for information on
adding default listeners.
RESTORE FACTORY SETTINGS
1.
Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
Select Monitors.
3.
Click Import and select Restore Factory Settings. Management Center reverts to the default monitors that shipped with Management Center.
To Edit a Monitor
1.
Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
Select Monitors.
3.
Double-click a monitor in the list or select the monitor and click Edit. The edit dialog appears.
4.
Make any necessary modifications, then click OK to apply your changes. Click Cancel to abort this action.
5.
6.
When you change a monitor and click Apply as Default, Management Center saves the monitor as one of the default
monitors; all future payloads will contain the new monitor. Furthermore, the new monitor will be included any time
you install Management Center into a payload.
7.
Click Close.
To Delete a Monitor
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2. Select Monitors.
3.
You cannot delete Management Center default monitors; these monitors can be disabled only.
4.
5.
Custom Monitors
Custom monitors are added by creating a new monitor with the Management Center GUI and including a user-defined
program or script that returns information in a format Management Center can process.
Test scripts carefully! Running an invalid script may cause undesired results with Management Center.
Because monitors typically invoke a script (e.g., bash, perl), using monitor intervals of less than 5 seconds is not recommended (but is supported). To use a custom monitor, the program or script called by the monitor must return values to STDOUT
in key:value pairs that use the following format:
hosts.<hostname>.<name>.<key1>:<value1>\n
hosts.<hostname>.<name>.<key2>:<value2>\n
The <hostname> refers to the name of the host from which you are running the script.
When monitoring the Management Center Master Host, the name of the Master Host must match the name assigned in
$MGR_HOME/@genesis.profile.
The <name> is the same name used in the InstrumentationMonitors.profile.default.
The <key> parameter refers to what is being monitored.
The <value> is the return value for that key. The script can return one or more items as long as they all have a key and
value. The value can be any string or number, but the script is responsible for the formatting. The \n at the end is a
newline character (required).
2. Add the new monitor to the custom monitors profile. The following example uses perl to monitor how many users are logged into a host. The script returns two values: how many people are logged in and who the people are. The script name is $MGR_HOME/bin/who.pl and returns who.who and who.count.
#!/usr/bin/perl -w
# Basic modules are allowed
use IO::File;
use Sys::Hostname;

$host = hostname;
my @users;
# This opens the program and runs it. Don't forget the '|' on the end
my $fh = new IO::File('/usr/bin/who |');
# If the program was started
if (defined $fh) {
    # Then loop through its output until you get an eof.
    while (defined($line = <$fh>)) {
        if ($line =~ m/^\w+.*/) {
            $line =~ m/^(\w+).*$/;
            push(@users, $1);
        }
    }
    # Close the file.
    $fh->close();
}
# Remove duplicate entries of who.
%seen = ();
foreach $item (@users) {
    push(@uniq, $item) unless $seen{$item}++;
}
# Count how many items are in the array for our count
$count = scalar(@uniq);
# Rather than an array of values, just return a single comma-separated text string
my $who = "";
foreach $users (@uniq) {
    $who .= $users . ",";
}
chop($who);
print "hosts." . $host . ".who.count:" . $count . "\n";
print "hosts." . $host . ".who.who:" . $who . "\n";
When you run the script on host n2 (assuming that perl and the perl modules above are installed correctly), the
following prints to STDOUT:
[root@n2 root]# ./who.pl
hosts.n2.who.count:1
hosts.n2.who.who:root
The script MUST exist on the hosts that will run this monitor. Therefore, you must either copy this script to each host
($MGR_HOME/bin) or configure the payload to include the script and provision the hosts with the new payload.
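One way to copy the script to a set of hosts is with the pdcp command described in the Command-Line Interface chapter; the host range below is only an example, and it assumes $MGR_HOME is set in your shell:
# pdcp -w n[1-10] $MGR_HOME/bin/who.pl $MGR_HOME/bin/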
3.
4. Select Event Administration from the Edit menu. The Event Administration dialog appears.
5.
6.
7.
When applying listeners to a host, the image used to provision the host must use a payload that contains Management
Center. See Install Management Center into the Payload on page 99.
8.
9.
When you add a monitor and click Apply as Default, Management Center saves the monitor as one of the default monitors; all future payloads will contain the new monitor. Furthermore, the new monitor will be included any time you install Management Center into a payload.
10. Click Close.
Metrics
Metrics refer to data collected by monitors that is processed and displayed by the Management Center instrumentation
service. The types of metrics collected are feature-specific and Management Center allows you to view metrics for an
individual host or group of hosts. For a list of available metrics, see Pre-configured Metrics on page 255.
Before you can display a custom metric, you must define a custom monitor to collect the data. See Custom Monitors on
page 174.
2. Select the host(s) for which you want to display metrics in the host navigation tree.
3.
4. Select Metrics from the Edit menu. The Metric Selector appears.
5. Select the metrics you want to include, then click OK. The metrics appear in the List tab.
Metrics Selector
The Metrics Selector reads from Metrics.profile in the $MGR_HOME/etc directory on each Management Center client.
You may add custom metrics to this profile by making additions in the proper file format:
hosts.<name>.<key>.label:<metric_title>
hosts.<name>.<key>.description:<description>
hosts.<name>.<key>.type:java.lang.<type>
hosts.<name>.<key>.pattern:<pattern>
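For example, a hypothetical set of entries for the who.count value produced by the custom monitor example earlier in this chapter might look like the following (the label, description, and pattern shown here are illustrative only):
hosts.who.count.label:Users Logged In
hosts.who.count.description:Number of users currently logged in to the host
hosts.who.count.type:java.lang.Integer
hosts.who.count.pattern:##0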
Event Listeners
Event Listeners allow you to easily monitor your cluster and trigger events (loggers) when you exceed specific
thresholds. Event listeners may be configured on specific hosts (including the Master Host) and included on payloads
that contain Management Center (see Install Management Center into the Payload on page 99). By default,
Management Center includes a basic collection of listeners, but allows you to add custom listeners as needed. You may
also import listeners from an existing host or payload, import the default listeners, or restore the factory settings. The
following table lists the default listeners:
Listener Name / Threshold / Message
512000000
Memory (EDAC) (Correctable Errors)
500
Memory (EDAC) (Uncorrectable Errors)
LinuxBIOS Bootmode
Management Center has detected that LinuxBIOS is running in Fallback mode. This may indicate an error with BIOS settings. As a result, this host may not be running at full performance.
System Load
2.1
Five minute load average limit {0} exceeded on host {3} (current load average {2})
Ambient Temperature limit {0} exceeded on host {3} (current temperature {2}).
Ambient Temperature limit {0} exceeded on host {3} (current temperature {2}). Shutting down.
CPU Temperature 1 limit {0} exceeded on host {3} (current temperature {2}).
CPU Temperature 1 limit {0} exceeded on host {3} (current temperature {2}).
The temperature listener is divided into a CPU temperature listener and an ambient temperature listener. The CPU
temperature listener is triggered by any CPU and the CPU that trips it is specified in the message. By separating the
ambient temperature, Management Center supports a negative threshold for PEKI temperatures and a positive threshold
for ambient temperatures.
To Enable or Disable a Listener
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
3. Check or un-check the box next to each listener you want to enable or disable.
4. (Optional) Click Apply as Default to apply the listener configuration as the default on the Master Host and payload. Management Center saves the listeners in InstrumentationListeners.profile.default.
5. (Optional) Click Apply to Hosts to apply the listener to a specific host(s). The Export to Hosts dialog appears.
   A. Select the host(s) to which to export the listeners from the navigation tree.
6. (Optional) Click Apply to Payloads to include these listeners as part of a payload. The Export to Payloads dialog appears.
7. Click Close to complete this action and close the Event Administration dialog.
If you click Close without applying your changes, all modifications will be lost.
To Add a Listener
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
3.
4.
5.
6. Select the metric to monitor. For a list of available metrics, see Pre-configured Metrics on page 255.
If you write a custom monitor and want to use one or more of the metrics from that monitor, you must edit the CustomMetrics.profile to include the metrics, then restart Management Center; otherwise, no custom listeners will be defined. CustomMetrics.profile uses the same format as Metrics.profile, discussed in Metrics Selector on page 180.
7.
8. Enter the threshold for the metric and click the Max/Min button to specify whether this value is the maximum or minimum threshold.
9.
The message is user-configurable and contains the content of the log message or e-mail message. Several variables are
available in the message:
{0} = Threshold
{1} = Metric Name
{2} = Metric Value at the time the listener was triggered
{3} = Hostname
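For example, with a threshold of 2.1 on host n2 and a current load average of 3.5 (hypothetical values), the message Five minute load average limit {0} exceeded on host {3} (current load average {2}) would be delivered as Five minute load average limit 2.1 exceeded on host n2 (current load average 3.5).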
11. Add actions to perform if this event is triggered. Available actions are listed in the following table:
Action / Description
email: Send an event notification e-mail to a comma-delimited list of recipients.
script
snmp
beacon
console
file
halt
log
pbsoff: Automatically set the host status to offline. The pbsoff action requires some additional configuration. See PBS Configuration on page 187.
powercycle
poweron
poweroff
reboot
shutdown
syslog
The Actions list allows you to configure the order in which actions should occur. You may also click Delete to remove an action from the list.
12. Check the Enable option to activate the listener.
13. Click OK to continue or click Cancel to abort this action.
14. Click Apply as Default to save the listener.
When you add a listener and click Apply as Default, Management Center saves the listener as one of the default listeners; all future payloads will contain the new listener. Furthermore, the new listener will be included any time you install Management Center into a payload.
15. Click Close.
PBS CONFIGURATION
The pbsoff action uses the pbsnodes command. This command is installed on the hosts as part of the PBS package; however, the PBS server is not typically configured to authenticate from other hosts in the system. In order for the pbsoff action to be successful, you must allow pbsnodes to run from the hosts. To do this, set the PBS manager via qmgr:
qmgr -c "set server managers = root@*.<cluster>.<domain>.<base>"
For example:
qmgr -c "set server managers = root@*.engr.mycompany.com"
You can test this configuration by running the following command on one of the hosts:
pbsnodes -o <hostname>
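For example, to test against a hypothetical host n1:
pbsnodes -o n1
If the command completes without an authorization error, the pbsoff action should work from that host; remember to clear the offline state afterward using the appropriate pbsnodes option for your PBS version.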
To Edit a Listener
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
3. Double-click a listener in the list or select the listener and click Edit. The edit dialog appears.
4. Make any necessary modifications, then click OK to apply your changes. Click Cancel to abort this action.
5.
6.
When you change a listener and click Apply as Default, Management Center saves the listener as one of the default listeners; all future payloads will contain the new listener. Furthermore, the new listener will be included any time you install Management Center into a payload.
7. Click Close.
To Delete a Listener
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
3.
4.
To Import Listeners
IMPORT FROM HOST
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
3. Click Import and select Import from Host. The Import from Hosts dialog appears.
4. Select the host from which to import listeners and click Import. Click Cancel to abort this action.
IMPORT FROM PAYLOAD
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
3. Click Import and select Import from Payload. The Import from Payloads dialog appears.
4. Select the payload from which to import listeners and click Import. Click Cancel to abort this action.
IMPORT DEFAULT
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
3. Click Import and select Import Default. Management Center restores all listeners stored as default listeners in InstrumentationListeners.profile.default. See To Enable or Disable a Listener on page 183 for information on adding default listeners.
RESTORE FACTORY SETTINGS
1. Select Event Administration from the Edit menu. The Event Administration dialog appears.
2.
3. Click Import and select Restore Factory Settings. Management Center restores the default listeners that shipped with Management Center.
Loggers
Loggers refer to actions taken when a monitored metric exceeds its maximum or minimum threshold. Common logger events
include sending messages to the centralized Management Center event log, logging to a file, logging to the serial
console, and shutting down the host.
TEMPLATEFORMATTER
You may extend the abilities of pre-configured and custom loggers (located in $MGR_HOME/etc/Logging.profile)
using the template field of the TemplateFormatter. The template field allows you to configure the types of messages
displayed by loggers. For example, the message template type used in the following example is %m:
formatters.com.lnxi.instrumentation.event: \
com.xeroone.logging.TemplateFormatter
formatters.com.lnxi.instrumentation.event.template: %m
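For example, a template that prefixes each message with its creation time and severity (an illustrative combination of the template variables listed below) might be:
formatters.com.lnxi.instrumentation.event.template: %T %S: %M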
Template
Description
%N
Sequential record number. This number resets each time the virtual machine restarts.
%T
Creation time.
%C
Channel.
%S
Severity.
%M
Message.
%E
Event.
%EN
Event name.
%ET
Event trace.
%AN
Application name.
%AM
Application moniker.
%AST
%AV
Application version.
%HN
Host name.
%HM
Host moniker.
%MS
Memory size.
%MF
Memory free.
%OSN
Operating system name.
%OSV
Operating system version.
%%
Literal % character.
''
Literal ' character.
Chapter 8
Upgrading SGI Management Center
This chapter describes how you upgrade SGI Management Center from the following base versions:
General Tasks
Regardless of the base version, you will need to do the following:
The following sections describe upgrading items specific to the base version from which you are upgrading.
If you upgrade from a cluster running SGI Management Center 1.3 or older, the contents of all partitions on a node will
be lost the first time you provision.
For example:
# dbix -x > /root/SMC1.0-backup.dbix
2.
3.
4.
You should check that there is text in the dbix backup file in order to confirm that the dbix backup succeeded.
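For example, using the backup file name from the previous step:
# wc -l /root/SMC1.0-backup.dbix
# head /root/SMC1.0-backup.dbix
A non-empty file with readable entries indicates the export produced data.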
2.
3.
4.
5. Move over the imaging and vcs contents to their new home.
Note: The following steps assume the destination directories are empty or non-existent. Make sure there are no files or directories in the destination with the same name as the files being moved.
# mkdir -p /opt/sgi/sgimc/vcs
# mkdir -p /opt/sgi/sgimc/imaging/root/payloads
# mkdir -p /opt/sgi/sgimc/imaging/root/kernels
# mkdir -p /opt/sgi/sgimc/imaging/root/images
# mv /opt/sgi/islecm/vcs/* /opt/sgi/sgimc/vcs/
# mv /opt/sgi/islecm/imaging/root/payloads/* /opt/sgi/sgimc/imaging/root/payloads/
# mv /opt/sgi/islecm/imaging/root/kernels/* /opt/sgi/sgimc/imaging/root/kernels/
# mv /opt/sgi/islecm/imaging/root/images/* /opt/sgi/sgimc/imaging/root/images/
6.
NOTE: Most of the files end in .profile and do not contain static path information. If you copy a custom version of
exclude.files, change all references from /opt/sgi/islecm to /opt/sgi/sgimc. Example:
# cd /opt/sgi/sgimc/etc
# sed -i "s/opt\/sgi\/islecm/opt\/sgi\/sgimc/g" exclude.files
7.
8.
You can safely remove the "jdk" package (for example, jdk-1.5.0_17-fcs) if it was installed for SGI ISLE Cluster
Manager and you do not need it otherwise. Any leftover islecm-java packages from previous versions of SGI ISLE
Cluster Manager may also be removed. Similarly, you can remove the db46 package if it was installed with a previous
version of SGI Management Center and is no longer being used otherwise.
Chapter 9
Using the Discover Interface
When you add new nodes to your cluster, you must provide profile information about the new nodes to SGI Management Center; notably, information like the MAC addresses of the compute nodes and of the BMCs in the new compute nodes. SGI Management Center provides the Discover interface to assist you in determining (discovering) the pertinent MAC addresses and adding the new nodes to your cluster.
There is both a graphical and command-line interface for Discover. This chapter describes how you can use each to
discover compute nodes.
This does not pertain to SGI Altix UV large-memory platforms. For those platforms, SGI Management Center uses the
system management node (SMN) and its associated SMN software bundle to discover its chassis management
controllers (CMCs) and blades.
Software Requirements
In order to use Discover, a premium-licensed feature, you need to install the following packages:
discover
discover-common
discover-server
sgi-common-python
sgi-management-device
cattr
After installing these packages, you must also start the Discover daemon:
# service discoverd start
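If you also want the daemon to start automatically at boot (assuming the discover-server package installs a standard init script named discoverd), you can enable it with chkconfig:
# chkconfig discoverd on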
2.
3.
4. Apply power to the nodes but do not turn the nodes on.
The system searches for MAC addresses and determines if they belong to a BMC, Ethernet switch, or unknown device.
The Continue button is enabled as soon as at least one BMC is found.
5. Once Discover finds the MACs for the BMCs, press Continue.
6. As Discover detects the BMCs, it associates the MACs with the BMCs in order.
Once the BMCs have been ordered, Discover will determine the system MAC for each node.
After discovering the system MACs, the nodes will be added to the hosts tree.
Chapter 10
Troubleshooting
This chapter describes some troubleshooting steps for various problems that may arise. If you encounter a problem not
listed here or the suggested solution does not work, contact SGI Customer Support. See Product Support on page x.
The following topics appear in this chapter:
Debug Logs
When you are encountering problems with SGI Management Center, it is often helpful to turn on debugging. In the
Management Center GUI, go to the Preferences screen:
Edit > Preferences
Check the box next to Enable debugging on the master host.
Alternatively, you can turn on debugging by modifying file /opt/sgi/sgimc/etc/system-clustermanager.profile on any
master host, client install node, or payload install node. Add the following text to the file:
system.logging:com.lnxi.debug
logging.level: DEBUG
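For example, one way to append these lines from a shell (using the profile path shown above):
# cat >> /opt/sgi/sgimc/etc/system-clustermanager.profile <<'EOF'
system.logging:com.lnxi.debug
logging.level: DEBUG
EOF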
The following logs are generated:
/opt/sgi/sgimc/log/debug.log
/opt/sgi/sgimc/log/SGIMC-server.log
/tmp/SGIMC-<username>.log
Once you reproduce the problem that is occurring, examine these logs for information about the cause of the problem.
There are often log entries (like warnings and/or exceptions) that are not an indication of an actual problem.
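For example, a quick way to pull likely problem entries out of the debug log (the search terms are only a suggestion):
# grep -iE 'error|exception|fail' /opt/sgi/sgimc/log/debug.log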
The /etc/hosts file should contain proper entries for the Master Host and the compute hosts, for example entries of the form (with each host's real IP address):
<IP address>  admin.default.domain  admin loghost
<IP address>  n001.default.domain  n001
The Master Host entry should not be a loopback address such as 127.0.0.1. You can find the RNA host name by checking /opt/sgi/sgimc/@genesis.profile.
Provisioning progress remains at 0%.
Resolution
The most common cause of this behavior is a misconfiguration of IGMP in the management network switches. Please
verify multicast routing is enabled on the switch. In some cases, you may need to enable IGMP Snooping. For some
switches, it is required to disable spanning tree protocol or enable RSTP / Edge Routing. Consult your switch
documentation for information about how to configure your switch.
2. Click Provisioning.
3.
If the preceding steps fail to resolve the issue, you may have too many combinations of payloads/kernels assigned to
various nodes in the system. This can be resolved by re-provisioning all nodes with one image, which will free the rest
of the available payload/kernel combinations.
2.
3.
Ensure you have not run out of disk space in the filesystem that contains /opt/sgi/sgimc.
2. Ensure that you do not have any extra mounts in the payload. For example, unmount sys or dev in /opt/sgi/sgimc/imaging/root/payloads/<name>.
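For example (the payload name compute is hypothetical):
# mount | grep /opt/sgi/sgimc/imaging/root/payloads
# umount /opt/sgi/sgimc/imaging/root/payloads/compute/sys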
3. With debugging enabled, re-attempt the check-in and then examine /opt/sgi/sgimc/log/debug.log for additional information about the error.
Inspect file /etc/lk/keys.dat for corruption or missing characters (for example, missing lines or an unmatched single quotation mark).
Ensure that the system time/date is correct.
Run the following command and examine the output for information about the current state of the license:
# lk_verify
2. Click Monitoring.
3. Un-check the box for Enable environmental gathering from the master host.
4.
5.
6.
7.
You must restart the Management Center daemon on the master host and on all nodes.
Chapter 11
Command-Line Interface
Command-Line Syntax and Conventions
CLI commands documented in this guide adhere to the following rules; commands entered incorrectly may produce the Command not recognized error message.
Convention
Description
xyz
<variable>
<> Angle brackets and italics indicate a user-defined variable (e.g., an IP address or host name)
[x]
[x|y|z]
{x|y|z}
[x{y|z}]
[ { | } ] A combination of square brackets and braces with vertical bars indicates a required choice of an
optional parameter.
Help for all CLI commands is available through man pages. To access a man page, enter man followed by the command name (for example, man cwx) from the CLI. The cwx man page describes all command-line utilities available in Management Center.
All CLI command arguments documented in this chapter are shown using colon notation only (e.g., {--partition:|-p:}). You may also use a space or an equal sign with these arguments (e.g., --description <value> or -M=<value>).
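For example, the following invocations are equivalent ways of passing a description to cwhost partadd (the partition name and description are illustrative):
cwhost partadd --description:'Login nodes' part1
cwhost partadd -d 'Login nodes' part1
cwhost partadd -d='Login nodes' part1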
CLI Commands
Most of the CLI commands outlined in this chapter are exclusive to the Management Center Master Host.
conman {
[[-b <host>[ <host> ...<host_n>]]|
[-d <destination>[:<port>]]|
[-e <character>]|
[-f]|
[-F <file_name>]|
[-h]|
[-j]|
[-l <file_name>]|
[-L]|
[-m]|
[-q]|
[-Q]|
[-r]|
[-v]|
[-V]]
<host_console>
}
cwhost {
[partadd [{--description:|-d:} <partition_description>] [--enable:] [--disable:]
[{--regions:|-R} <region1>[,<region2>...]] [{--hosts:|-h} <host1>[,<host2>...]]
<partition>|
[partmod {[{--name:|-n:} <partition_name>] [{--description:|-d:}
<partition_description>]
[--enable:] [--disable:] [{--regions:|-R} <region1>[,<region2>...]]
[{--hosts:|-h} <host1>[,<host2>...]]} <partition>]|
[partdel <partition_name>]|
[partshow [<partition_1>[ <partition_2> ...<partition_n>]]]|
[regionadd [{--description:|-d:} <region_description>] [{--partition:|-p:}
<partition_description>]
[--enable:] [--disable:] [{--hosts:|-h} <host1>[,<host2>...]]
[{--groups:|-g} <group1>[,<group2>...]] <region>]|
[regionmod {--name:|-n:} <region> [{--description:|-d:} <region_description>]
[{--partition:|-p:} <partition_description>] [--enable:] [--disable:]
[{--hosts:|-h} <host1>[,<host2>...]]
[{--groups:|-g} <group1>[,<group2>...]] <region>]<region>]|
[regiondel <region>]|
[regionshow [<region_1>[ <region_2> ...<region_n>]]]|
[hostadd <host1> <mac1> <ip1>[ <host2> <mac2> <ip2>] [{--description:|-d:}
<host_description>]
[--enable:] [--disable:] [{--partition:|-p:} <partition_description>]
[{--regions:|-R:} <region_1>[,<region_2>,...<region_n>]]
[{--iceboxes:|-i:} <icebox_1>:<port>[,<icebox_2>:<port>,...<icebox_3>:<port>]]]|
[hostmod <host> [{--name:|-n:} <host>] [{--interfaces:|-I}
<mac1>|<ip1>[,<mac2>|<ip2>]]
[{--description:|-d:} <host_description>] [--enable:] [--disable:]
[{--partition:|-p:} <partition_description>]
[{--regions:|-R:} <region_1>[,<region_2>,...<region_n>]]
[{--iceboxes:|-i:} <icebox_1>:<port>[,<icebox_2>:<port>,...<icebox_3>:<port>]]]|
[hostdel <host>]|
[hostshow [<host_1>[ <host_1> ...<host_n>]]]|
[ifaceadd <host> <mac> <ip> [{--management:|-M:}]]|
[ifacemod <mac>|<ip> [{--management:|-M:}] [--mac:|-m:} <mac>] [{--ip:|-i:} <ip>]
[{--hostname:|-h:} <host>]]|
[ifacedel <mac>|<ip>]|
[ifaceshow [<mac_1>|<ip_1>[ <mac_2>|<ip_2> ...<mac_n>|<ip_n>]]]|
[iceboxadd <icebox> <mac> <ip> [{--description:|-d:} <icebox_description>]
[{--password:|-p:} <password>] [{--hosts:|-h:} <host1>:<port1>[,<host2>:<port2>...]]]|
[iceboxmod <icebox> [{--name:|-n:} <icebox>] [{--mac:|-m:} <mac>] [{--ip:|-i:} <ip>]
[{--description:|-d:} <icebox_description>] [{--password:|-p:} <password>]
[{--hosts:|-h:} <host1>:<port1>[,<host2>:<port2>...]]]|
[iceboxdel <icebox>]|
[iceboxshow [<icebox_1>[ <icebox_2> ...<icebox_n>]]]|
[inflate <host-range1>[ <host-range2> ...]]|
[deflate <host1>[ <host2> ...]]|
[{--verbose|-v}]|
[-signature]|
[{-usage|-help|-?}]
}
cwpower {
{
[--on:|-1:]|
[--off:|-0:]|
[--cycle:|-C:]|
[--reset:|-R:]|
[--powerstatus:|-S:]|
[--reboot:|-r:]|
[--halt:|-h:]|
[--down:|-d:]|
[--hoststatus:|-s:]|
[--flash|-f]|
[--unflash|-u]|
[--beacon|-b]|
[--severity|-e]|
[{--verbose:|-v:} [--progressive:|-p:]]
}
<host_1>[ <host_1> ...<host_n>]|
[-signature]|
[{-usage|-help|-?}]
}
cwprovision {
[{--download-path:|-d:}<path>
{--image:|-i:}<image>
{--image.revision:|-I:}<revision>
{--kernel:|-k:}[<kernel>]
[{--kernel-log-level:|-l:}[<level>]]
{--payload:|-p:}[<payload>]
[{--payload-download:|-D:}yes|no|default]
[--update --payload.revision:<revision>]
[{--repartition:|-R:}yes|no|default]
[{--working-image:|-w:}<name>]|
[{--next-reboot:|-n:}]]|
[{--query-last-image:|-q} [--uncompressed-hostnames:|-u]]
<host_1>[ <host_1> ...<host_n>]}|
[-signature]|
[{-usage|-help|-?}]
}
cwuser {
[useradd [{--description:|-c:}<description>] [{--home:|-d:}<home_directory>]
[{--group:|-g:}<primary_group>]
[{--groups:|-G:}<secondary_group_1>[,<secondary_group_2>,...<secondary_group_n>]]
[{--password:|-p:}<encrypted_password>] [{--shell:|-s:}<shell>] [{--uid:|-u:}<uid>]
[{--enable:|-U}] [{--disable:|-L:}] [{--normal:|-n:}] <user>]|
[usermod [{--description:|-c:}<description>] [{--home:|-d:}<home_directory>]
[{--group:|-g:}<primary_group>]
[{--groups:|-G:}<secondary_group_1>[,<secondary_group_2>,...<secondary_group_n>]]
[{--password:|-p:}<encrypted_password>] [{--shell:|-s:}<shell>] [{--uid:|-u:}<uid>]
[{--enable:|-U}] [{--disable:|-L:}] [{--name:|-l:}<user>] <user>]|
[userdel <user>]|
[usershow [<user_1>[ <user_2> ...<user_n>]]]|
[passwd <user>]|
[encryptpasswd]|
[groupadd [{--description:|-d:}<description>] [{--gid:|-g:}<gid>]
[[{--roles:|-r:}<role_1>] [,<role_2>...<role_n>]]
[{--regions:|-R:}<region_1>[,<region_2>...<region_3>]] <group>]|
[groupmod [{--description:|-d:}<description>] [{--gid:|-g:}<gid>]
[[{--roles:|-r:}<role_1>] [,<role_2>,...<role_n>]]
[{--regions:|-R:}<region_1>[,<region_2>,...<region_3>]]
[{--name:|-n:}<group>] <group>]|
[groupdel <group>]|
[groupshow [<group_1>[ <group_2> ...<group_n>]]]|
[roleadd [{--description:|-d:}<description>]
[{--privileges:|-p:}<privilege_1>[,<privilege_2>,...<privilege_n>]] <role>]|
[rolemod [{--description:|-d:}<description>]
[{--privileges:|-p:}<privilege_1>[,<privilege_2>,...<privilege_n>]]
[{--name:|-n:}<role>] <role>]|
[roledel <role>]|
[roleshow [<role_1>[ <role_2> ...<role_n>]]]|
[privshow [<privilege_1>[ <privilege_2> ...<privilege_n>]]]|
[{--verbose|-v}]|
[-signature]|
[{-usage|-help|-?}]
}
dbix {
[{-d|--delete} <context_1>[ <context_2> ...<context_n>]]|
[{-i|--import} <context>] |
[{-x|--export} <context_1>[ <context_2> ...<context_n>]]|
[{-usage|-help|-?}]
}
dbx {
[{--domain:|-d} <domain>] [{--format:|-f:} <format>] [{-usage|-help|-?}] [runtime[:verbose]]
[-signature] [-splash]
}
imgr {
{--image:|-i:}<image> [{--kernel:|-k:}<kernel>] [{--kernelrevision:|-K:}<kernel_revision>]
[{--payload:|-p:}<payload>] [{--payload.revision:|-P:}<payload_revision>]
[{--force:|-f:}] [{--list:|-l:}]|
[{-usage|-help|-?}]
}
kmgr {
{--name:|-n:}<name> [{--description:|-d:}<description>]
{--path:|-p:}<path_to_Linux_kernel_source> [{--kernel:|-k:}<name_of_binary>]
[{--architecture:|-a:}<architecture>] [{--modules:|-m:}] [{--binary:|-b:}] [{--list:|l:}]|
[{-usage|-help|-?}]
}
pdcp {[
[-w <host>[,<host>...,<host_n>]]|
[-x <host>[,<host>...,<host_n>]]|
[-a]|
[-i]|
[-r]|
[-p]|
[-q]|
[-f <number>]|
[-l <user>]|
[-t <seconds>]|
[-d]]
<source>[ <source>... <source_n>]
<destination>
}
pdsh {
[[-w <host>[,<host>...,<host_n>]]|
[-x <host>[,<host>...,<host_n>]]|
[-a]|
[-i]|
[-q]|
[-f <number>]|
[-s]|
[-l <user>]|
[-t <seconds>]|
[-u <seconds>]|
[-n <tasks_per_host>]|
[-d]|
[-S]|
<host>[,<host>...,<host_n>]]
<command>
}
pmgr {
[[{--description:|-d:}<description>] [{--include:|-i:}<include_file_or_directory>]
[{--include-from:|-I:}<file_containing_list>] [{--location:|-l:}<location_dir>]
[{--silent:|-s:}<silent>]
[{--exclude:|-x:}<exclude_file_or_dir>]] [{--exclude-from:|-X:}<file_containing_list>]
<payload_name>| [{-usage|-help|-?}]
}
powerman {
[[{--on|-1}]|
[{--off|-0}]|
[{--cycle|-c}]|
[{--reset|-r}]|
[{--flash|-f}]|
[{--unflash|-u}]|
[{--list|-l}]|
[{--query|-q}]|
[{--node|-n}]|
[{--beacon|-b}]|
[{--temp|-t}]|
[{--help|-h}]|
[{--license|-L}]|
[{--destination|-d} host[:port]]|
[{--version|-V}]|
[{--device|-D}]|
[{--telemetry|-T}]|
[{--exprange|-x}]]
<host>[ <host> ...<host_n>]
}
vcs {
[{identify| id}]|
[status]|
[include <files>]|
[exclude <files>]|
[archive <filename>]|
[import -R:<repository> -M:<module> [-n:<name>] [-d:<description>] [<files>]]|
[commit [-n:<name>] [-d:<description>] [<files>]]|
[branch [-n:<name>] [-d:<description>] [<files>]]|
[{checkout | co} -R:<repository> -M:<module> [-r:<revision>|<branch>|<name>]]|
[{update | up} [-r:<revision>|<branch>|<name>] [<files>]]|
[name [-R:<repository>] [-M:<module>] [-r:<revision>|<branch>|<name>] <text>]|
[describe [-R:<repository>] [-M:<module>] [-r:<revision>|<branch>|<name>] <text>]|
[{narrate | log} [-R:<repository> -M:<module>] [-r:<revision>|<branch>|<name>]]|
[iterate [-R:<repository> [-M:<module> [-r:<revision>|<branch>|<name>]]]]|
[list]|
[{-usage|-help|-?}]
}
conman
conman {
[[-b <host>[ <host> ...<host_n>]]|
[-d <destination>[:<port>]]|
[-e <character>]|
[-f]|
[-F <file_name>]|
[-h]|
[-j]|
[-l <file_name>]|
[-L]|
[-m]|
[-q]|
[-Q]|
[-r]|
[-v]|
[-V]]
<host_console>
}
Description
The Conman client allows you to connect to remote consoles managed by conmand. Console names are separated by
spaces or commas and matched to the configuration via globbing. Regular expression matching can be enabled with the
-r option.
Conman supports three console access modes: monitor (read-only), interactive (read-write), and broadcast (write-only).
Unless otherwise specified, conman opens the console session in interactive mode (the default).
To use Conman for serial access (that is, as your platform management device), Conman must be installed on the
Master Host and the console(s) must be configured in /etc/conman.conf. The Conman daemon (installed as /etc/init.d/
conmand) must also be started.
You can obtain Conman from http://home.gna.org/conman/. Additional information on Conman is available from the
man pages by entering man conman.conf.
Parameters
[-b <host>[ <host> ...<host_n>]]
(Optional) Broadcast to multiple host consoles (write-only). You may enter a range of
hosts or a space-delimited list of hosts (e.g., host[1-4 7 9]).
Data sent by the client is copied to all specified consoles in parallel, but console output
is not sent back to the client. You can use this option in conjunction with -f or -j.
[-d <destination>[:<port>]]
(Optional) Specify the location of the conmand daemon, overriding the default [127.0.0.1:7890]. This location may contain a host name or IP address and be followed by an optional colon and port number.
[-e <character>]
(Optional) Specify the client escape character, overriding the default (&).
[-f]
(Optional) Specify that write-access to the console should be forced, thereby stealing the console away from existing clients with write privileges. As connections are terminated, conmand informs the original clients of who perpetrated the theft.
[-F <file_name>]
(Optional) Read console names or patterns from a file with the specified name. Only one console name may be specified per line. Leading and trailing white space, blank lines, and comments (i.e., lines beginning with a #) are ignored.
[-h]
(Optional) Display a summary of the command-line options.
[-j]
(Optional) Specify that write-access to the console should be joined, thereby sharing the console with existing clients that have write privileges. As privileges are granted, conmand informs the original clients that privileges have been granted to new clients.
[-l <file_name>]
(Optional) Log console session output to a file with the specified name.
[-L]
(Optional) Display license information.
[-m]
(Optional) Monitor a console (read-only).
[-q]
(Optional) Query conmand for consoles matching the specified names or patterns. Output from this query can be saved to file for use with the -F option.
[-Q]
(Optional) Enable quiet-mode, suppressing informational messages. This mode can be toggled on and off from within a console session via the &Q escape.
[-r]
(Optional) Match console names via regular expressions instead of globbing.
[-v]
(Optional) Enable verbose mode.
[-V]
(Optional) Display version information.
<host_console>
The name of the host to which to connect.
ESCAPE CHARACTERS
Conman supports the following escapes and assumes the default escape character (&):
&?
Display a list of all escapes currently available.
&.
Terminate the connection.
&&
Send a single escape character.
&B
Send a serial-break to the remote console.
&F
Switch from read-only to read-write via a force.
&I
Display information about the connection.
&J
Switch from read-only to read-write via a join.
&L
Replay the last 4KB of console output. This escape requires that logging is enabled for
the console in the conmand configuration.
&M
Switch from read-write to read-only.
&Q
Toggle quiet-mode to display or suppress informational messages.
&R
Reset the host associated with this console. This escape requires that resetcmd is
specified in the conmand configuration.
&Z
Suspend the client.
ENVIRONMENT
The following environment variables may be used to override default settings.
CONMAN_HOST
Specifies the host name or IP address at which to contact conmand, but may be
overridden with the -d command-line option. Although a port number separated by a
colon may follow the host name (i.e., host:port), the CONMAN_PORT environment
variable takes precedence. If you do not specify a host, the default host IP address
(127.0.0.1) is used.
CONMAN_PORT
Specifies the port on which to contact conmand, but may be overridden by the -d
command-line option. If not set, the default port (7890) is used.
CONMAN_ESCAPE
The first character of this variable specifies the escape character, but may be overridden
by the -e command-line option. If not set, the default escape character (&) is used.
Example 1
To connect to host console n1, enter:
conman n1
Once in conman, enter &. to exit or &? to display a list of conman commands.
Example 2
To broadcast (write-only) to multiple hosts, enter:
conman -b n[1-10]
To view the output of broadcast commands on a group of hosts, use the conmen command before you begin entering
commands from conman. Conmen opens a new window for each host and displays the host output.
For example, the following command opens new consoles for hosts n2-n4:
conmen n[2-4]
cwhost
cwhost {
[partadd [{--description:|-d:} <partition_description>] [--enable:] [--disable:]
[{--regions:|-R} <region1>[,<region2>...]] [{--hosts:|-h} <host1>[,<host2>...]]
<partition>|
[partmod {[{--name:|-n:} <partition_name>] [{--description:|-d:}
<partition_description>]
[--enable:] [--disable:] [{--regions:|-R} <region1>[,<region2>...]]
[{--hosts:|-h} <host1>[,<host2>...]]} <partition>]|
[partdel <partition_name>]|
[partshow [<partition_1>[ <partition_2> ...<partition_n>]]]|
[regionadd [{--description:|-d:} <region_description>] [{--partition:|-p:}
<partition_description>]
[--enable:] [--disable:] [{--hosts:|-h} <host1>[,<host2>...]]
[{--groups:|-g} <group1>[,<group2>...]] <region>]|
[regionmod {--name:|-n:} <region> [{--description:|-d:} <region_description>]
[{--partition:|-p:} <partition_description>] [--enable:] [--disable:]
[{--hosts:|-h} <host1>[,<host2>...]]
[{--groups:|-g} <group1>[,<group2>...]] <region>]<region>]|
[regiondel <region>]|
[regionshow [<region_1>[ <region_2> ...<region_n>]]]|
[hostadd <host1> <mac1> <ip1>[ <host2> <mac2> <ip2>] [{--description:|-d:}
<host_description>]
[--enable:] [--disable:] [{--partition:|-p:} <partition_description>]
[{--regions:|-R:} <region_1>[,<region_2>,...<region_n>]]
[{--iceboxes:|-i:} <icebox_1>:<port>[,<icebox_2>:<port>,...<icebox_3>:<port>]]]|
[hostmod <host> [{--name:|-n:} <host>] [{--interfaces:|-I} <mac1>|<ip1>[,<mac2>|<ip2>]]
[{--description:|-d:} <host_description>] [--enable:] [--disable:]
[{--partition:|-p:} <partition_description>]
[{--regions:|-R:} <region_1>[,<region_2>,...<region_n>]]
[{--iceboxes:|-i:} <icebox_1>:<port>[,<icebox_2>:<port>,...<icebox_3>:<port>]]]|
[hostdel <host>]|
[hostshow [<host_1>[ <host_1> ...<host_n>]]]|
[ifaceadd <host> <mac> <ip> [{--management:|-M:}]]|
[ifacemod <mac>|<ip> [{--management:|-M:}] [--mac:|-m:} <mac>] [{--ip:|-i:} <ip>]
[{--hostname:|-h:} <host>]]|
[ifacedel <mac>|<ip>]|
[ifaceshow [<mac_1>|<ip_1>[ <mac_2>|<ip_2> ...<mac_n>|<ip_n>]]]|
[iceboxadd <icebox> <mac> <ip> [{--description:|-d:} <icebox_description>]
[{--password:|-p:} <password>] [{--hosts:|-h:} <host1>:<port1>[,<host2>:<port2>...]]]|
[iceboxmod <icebox> [{--name:|-n:} <icebox>] [{--mac:|-m:} <mac>] [{--ip:|-i:} <ip>]
[{--description:|-d:} <icebox_description>] [{--password:|-p:} <password>]
[{--hosts:|-h:} <host1>:<port1>[,<host2>:<port2>...]]]|
[iceboxdel <icebox>]|
[iceboxshow [<icebox_1>[ <icebox_2> ...<icebox_n>]]]|
[inflate <host-range1>[ <host-range2> ...]]|
[deflate <host1>[ <host2> ...]]|
[{--verbose|-v}]|
[-signature]|
[{-usage|-help|-?}]
}
Description
The Host Administration (cwhost) utility allows you to add, modify, view the current state of, or delete any partition,
region, host, interface, or Icebox in your cluster.
Subcommands
partadd
[--enable:] [--disable:]
(Optional) A brief description of the partition. If you do not specify a description, this
field remains blank.
(Optional) Indicates whether or not the partition is enabled. If you do not specify this
option, Management Center will enable the partition.
[{--regions:|-R} <region1>[,<region2>...]]
(Optional) The list of regions that are members of this partition. If you do not specify
any regions, none are included in the partition.
[{--hosts:|-h} <host1>[,<host2>...]]
<partition>
(Optional) The list of hosts that are members of this partition. If you do not specify any
hosts, none are included in the partition.
The name of the partition to add.
partmod
(Optional) Change the partition name. If you do not specify a name, Management
Center uses the current partition name.
[{--description:|-d:} <partition_description>]
[--enable:] [--disable:]
[{--regions:|-R} <region1>[,<region2>...]]
(Optional) The list of regions that are members of this partition. If you do not specify
any regions, the partition remains in its original state.
[{--hosts:|-h} <host1>[,<host2>...]]
<partition>
(Optional) The list of hosts that are members of this partition. If you do not specify any
hosts, the partition remains in its original state.
The name of the partition to add.
partdel
partshow
(Optional) The name(s) of the partition(s) for which to display the current settings. Multiple entries are delimited by spaces. Leave this option blank to display all partitions.
regionadd
(Optional) A brief description of the region. If you do not specify a description, this
field remains blank.
[{--partition:|-p:} <partition_description>]
[--enable:] [--disable:]
(Optional) The partition to which this region belongs. If you do not specify a partition,
Management Center assigns the region to the default or unassigned partition.
(Optional) Indicates whether or not the region is enabled. If you do not specify this
option, Management Center will enable the region.
[{--hosts:|-h} <host1>[,<host2>...]]
(Optional) The list of hosts that are members of this region. If you do not specify this
option, the region will not contain any member hosts.
[{--groups:|-g} <group1>[,<group2>...]]
<region>
(Optional) The list of groups that may access this region. If you do not specify this
option, the region will not be available to any groups.
The name of the new region.
regionmod
[--enable:] [--disable:]
(Optional) The partition to which this region belongs. If you do not specify a partition,
Management Center assigns the region to the original partition specified.
(Optional) Indicates whether or not the region is enabled. If you do not specify this
option, the region remains in its original state.
[{--hosts:|-h} <host1>[,<host2>...]]
(Optional) The list of hosts that are members of this region. If you do not specify any
hosts, the region remains in its original state.
[{--groups:|-g} <group1>[,<group2>...]]
<region>
(Optional) The list of groups that may access this region. If you do not specify any
groups, the region remains in its original state.
The name of the region to modify.
regiondel
regionshow
(Optional) The name of the region(s) for which to display the current settings. Multiple
entries are delimited by spaces. Leave this option blank to display all regions.
hostadd
The name of each new host, its MAC address, and its IP address. The first host specified
is the management interface. Multiple entries are space-delimited.
[{--description:|-d:} <host_description>]
[--enable:] [--disable:]
(Optional) A brief description of the host. If you do not specify a description, this field
remains blank.
(Optional) Indicates whether or not the host is enabled. If you do not specify this option,
Management Center enables the host.
[{--partition:|-p:} <partition_description>]
(Optional) The partition to which this host belongs. If you do not specify a partition,
Management Center assigns the host to the default or unassigned partition.
[{--regions:|-r:} <region_1>[,<region_2>,...<region_n>]]
(Optional) The region(s) to which this host belongs. If you do not specify a region,
Management Center does not assign the host to any region. Multiple entries are commadelimited.
[{--iceboxes:|-i:} <icebox_1>:<port>[,<icebox_2>:<port>,...<icebox_3>:<port>]]
(Optional) The Icebox(es) and port(s) to which this host is connected. If you do not
specify an Icebox and port, Management Center assumes that the host is not connected
to an Icebox. Multiple entries are comma-delimited.
hostmod
(Optional) A list of interfaces with which this host is associated. If none of the specified
interfaces are management interfaces, Management Center marks the first interface as
the management interface.
[{--description:|-d:} <host_description>]
[--enable: {yes|no}]
[{--partition:|-p:} <partition_description>]
(Optional) The partition to which this host belongs. If you do not specify a partition, the
host remains associated with the original partition specified.
[{--regions:|-r:} <region_1>[,<region_2>,...<region_n>]]
(Optional) The region(s) to which this host belongs. If you do not specify a partition, the
host will not belong to any region. Multiple entries are comma-delimited.
[{--iceboxes:|-i:} <icebox_1>:<port>[,<icebox_2>:<port>,...<icebox_3>:<port>]]
(Optional) The Iceboxes and ports to which this host is connected. If you do not specify
an Icebox and port, Management Center assumes that the host is not connected to an
Icebox. Multiple entries are comma-delimited.
hostdel
Delete a host.
<host>
hostshow
(Optional) The name of the host(s) for which to display the current settings. Multiple
entries are delimited by spaces. Leave this option blank to display all hosts.
ifaceadd
[{--management:|-M:}]
(Optional) Specify whether or not this interface is a management interface. If you do not
specify this option, Management Center assumes that this interface is not a management
interface.
ifacemod
[--mac:|-m:} <mac>]
[{--ip:|-i:} <ip>]
(Optional) Specify whether or not this interface is a management interface. If you do not
specify this option, the interface remains in its original state.
(Optional) Change the interface's hardware or MAC address.
(Optional) Change the interface's IP address.
[{--hostname:|-h:} <host>]
ifaceshow
(Optional) The MAC or IP address(es) of the interface(s) for which to display the
current settings. Multiple entries are delimited by spaces. Leave this option blank to
display all interfaces.
iceboxadd
[{--description:|-d:} <icebox_description>]
(Optional) A brief description of the Icebox. If you do not specify a description, this
field remains blank.
[{--password:|-p:} <password>]
(Optional) A list of hosts connected to the Icebox and the ports to which they are
connected. If you do not specify this option, Management Center assumes that the hosts
are not connected to an Icebox.
iceboxmod
[{--description:|-d:} <icebox_description>]
(Optional) A list of hosts connected to the Icebox and the ports to which they are
connected. If you do not specify this option, Management Center assumes that the hosts
remain in their original state.
iceboxdel
iceboxshow
(Optional) The Icebox(es) for which to display the current setting(s). Multiple entries
are delimited by spaces. Leave this option blank to display all Iceboxes.
inflate <host-range1>[ <host-range2> ...]
(Optional) Allows you to change between full and compressed host list format. Inflate
the specified host range(s) to display a full list of hosts.
(Optional) Allows you to change between full and compressed host list format. Deflate
the specified host range(s) to display a compressed host list.
(Optional) Display verbose output when performing operations. This option is common
to all subcommands.
(Optional) Displays the application signature. The application signature contains the
name, description, version, and build information of this application.
(Optional) Display help information for the command and exit. All other options are
ignored.
Examples
EXAMPLE 1
View the layout of the system:
cwhost hostshow
EXAMPLE 2
Get details of the system:
cwhost hostshow -v
EXAMPLE 3
Create a region called group1:
cwhost regionadd group1
EXAMPLE 4
Add a host to region group1 with the host name n1, the mac 0005b342afe1, and the IP address 10.0.0.1:
cwhost hostadd -r:group1 n1 0005b342afe1 10.0.0.1
EXAMPLE 5
Add host n2 to the group1 region:
cwhost hostmod -r:group1 n2
EXAMPLE 6
Deflate the host list n1, n2, n3, and n4:
cwhost deflate n1 n2 n3 n4
n[1-4]
EXAMPLE 7
Inflate the host list n[1-4]:
cwhost inflate n[1-4]
n1
n2
n3
n4
cwpower
cwpower {
{
[--on:|-1:]|
[--off:|-0:]|
[--cycle:|-C:]|
[--reset:|-R:]|
[--powerstatus:|-S:]|
[--reboot:|-r:]|
[--halt:|-h:]|
[--down:|-d:]|
[--hoststatus:|-s:]|
[--flash|-f]|
[--unflash|-u]|
[--beacon|-b]|
[--severity|-e]|
[{--verbose:|-v:} [--progressive:|-p:]]
}
<host_1>[ <host_1> ...<host_n>]|
[-signature]|
[{-usage|-help|-?}]
}
Description
The Power Administration (cwpower) utility allows you to perform power administration operations on a host(s) within
the cluster. Operations include power on, power off, power cycle, reset, reboot, halt, and power down (a soft power off).
You may also query the current power status of a particular host(s).
You may specify only one power administration operation option each time you use the cwpower command.
Parameters
[--beacon|-b]
[--severity|-e]|
[--on|-1]
[--off|-0]
[--cycle|-C]
[--reset|-R]
[--powerstatus|-S]
[--reboot|-r]
[--halt|-h]
[--down|-d]
[--hoststatus|-s]
[--flash|-f]
[--unflash|-u]
[{--verbose|-v} [--progressive|-p]]
(Optional) Change the standard output to verbose. Output displays the power status of each host, one per line. To display output as information becomes available, select the progressive option; progressive output is not guaranteed to be sorted and is not summarized.
<host_1>[ <host_2> ...<host_n>]
The name of the host(s) for which to execute the specified operation. You may enter a range of hosts or a space-delimited list of hosts (e.g., host[1-4 7 9]).
[-signature]
(Optional) Displays the application signature. The application signature contains the name, description, version, and build information of this application.
[{-usage|-help|-?}]
(Optional) Display help information for the command and exit. All other options are ignored.
Examples
EXAMPLE 1
To power on hosts 1-10:
cwpower -1 n[1-10]
EXAMPLE 2
Power off host 1:
cwpower -0 n1
EXAMPLE 3
Power cycle hosts 2-5:
cwpower -C n[2-5]
EXAMPLE 4
Check the status (On, Off, Unknown, Provisioning) of hosts 1-10:
cwpower -s n[1-10]
cwprovision
cwprovision {
[{--download-path:|-d:}<path>
{--image:|-i:}<image>
{--image.revision:|-I:}<revision>
{--kernel:|-k:}[<kernel>]
[{--kernel-log-level:|-l:}[<level>]]
{--payload:|-p:}[<payload>]
[{--payload-download:|-D:}yes|no|default]
[--update --payload.revision:<revision>]
[{--repartition:|-R:}yes|no|default]
[{--working-image:|-w:}<name>]|
[{--next-reboot:|-n:}]]|
[{--query-last-image:|-q} [--uncompressed-hostnames:|-u]]
<host_1>[ <host_1> ...<host_n>]}|
[-signature]|
[{-usage|-help|-?}]
}
Description
The Provisioning (cwprovision) utility allows you to provision or update a host(s) on the cluster and use working copies
to override the kernel and payload associated with the image. See Provisioning on page 141 and Version Control
System (VCS) on page 134.
Parameters
{--download-path:|-d:}<path>
{--image:|-i:}<image>
The path to which to download the image during the boot process (by default,
/mnt).
The image to use to provision the host(s). Unless you specify the working image option,
Management Center assumes that the image is a version-controlled image.
{--image.revision:|-I:}<revision>
The revision of the image to use to provision the host(s). If you specify a branch
revision, Management Center uses the tip revision of the branch. If you do not specify a
revision or a working image, Management Center uses the tip revision of the image.
Revisions may be specified either numerically or by alias.
The image.revision option is not available in conjunction with the working-image option.
{--kernel:|-k:}[<kernel>]
The working copy of the kernel associated with the image used to provision the host(s).
The name is required only if two or more working copies of the kernel exist.
[{--kernel-log-level:|-l:}[<level>]]
Select the kernel verbosity level used to control debug messages. This level may range
from 1 (the least verbose) to 8 (the most verbose). By default, the verbosity level is 1.
{--payload:|-p:}[<payload>]
The working copy of the payload associated with the image used to provision the
host(s). The name is required only if two or more working copies of the payload exist.
[{--payload-download:|-D:}yes|no|default]
(Optional) Specify whether or not to force a download of the payload to the host during
this provisioning operation. The default option automatically detects whether or not to
download the payload. See Advanced Provisioning Options on page 145.
[--update --payload.revision:<revision>]
[{--repartition:|-R:}yes|no|default]
(Optional) Specify whether or not to force a repartition of the host during this provisioning operation. The default option automatically detects whether or not to repartition the host. See Advanced Provisioning Options on page 145.
[{--working-image:|-w:}<name>]
(Optional) Use the working copy of the specified image to provision the host(s).
The working-image option is not available in conjunction with the image.revision option.
[{--next-reboot:|-n:}]
[{--query-last-image:|-q}]
The kernel and payload specify zero (0) if you use the VCS version and one (1) if you use the working version to
override the kernel or payload using the advanced provisioning options.
The query-last-image option can display image and host information even if the host is down.
[{--uncompressed-hostnames:|-u}]
(Optional) Select this option to change the output format for query-last-image to list one host name and corresponding image per line. This option can be used only with query-last-image.
<host_1>[ <host_2> ...<host_n>]
The name of the host(s) to provision. You may enter a range of hosts or a space-delimited list of hosts (e.g., host[1-4 7 9]).
[-signature]
(Optional) Displays the application signature. The application signature contains the name, description, version, and build information of this application.
[{-usage|-help|-?}]
(Optional) Display help information for the command and exit. All other options are ignored.
Examples
Use vcs iterate -R:images to see what images are available for provisioning. For a list of working images, use imgr -list.
EXAMPLE 1
To provision hosts 2-4 with image Compute_Host:
cwprovision -i:Compute_Host n[2-4]
EXAMPLE 2
To provision hosts 2-4 with an older version (version 3) of the image Compute_Host:
cwprovision -i:Compute_Host -I:3 n[2-4]
EXAMPLE 3
To set advanced options to force re-partitioning and download the payload for hosts 2-4:
cwprovision -i:Compute_Host -I:3 -R:yes -D:yes n[2-4]
EXAMPLE 4
To provision hosts 2-10 after the next reboot:
cwprovision -i:rhel4_img --next-reboot n[2-10]
EXAMPLE 5
To update hosts 6-8 with revision 9 of the payload:
cwprovision --update --payload.revision:9 n[6-8]
cwuser
cwuser {
[useradd [{--description:|-c:}<description>] [{--home:|-d:}<home_directory>]
[{--group:|-g:}<primary_group>]
[{--groups:|-G:}<secondary_group_1>[,<secondary_group_2>,...<secondary_group_n>]]
[{--password:|-p:}<encrypted_password>] [{--shell:|-s:}<shell>] [{--uid:|-u:}<uid>]
[{--enable:|-U}] [{--disable:|-L:}] [{--normal:|-n:}] <user>]|
[usermod [{--description:|-c:}<description>] [{--home:|-d:}<home_directory>]
[{--group:|-g:}<primary_group>]
[{--groups:|-G:}<secondary_group_1>[,<secondary_group_2>,...<secondary_group_n>]]
[{--password:|-p:}<encrypted_password>] [{--shell:|-s:}<shell>] [{--uid:|-u:}<uid>]
[{--enable:|-U}] [{--disable:|-L:}] [{--name:|-l:}<user>] <user>]|
[userdel <user>]|
[usershow [<user_1>[ <user_2> ...<user_n>]]]|
[passwd <user>]|
[encryptpasswd]|
[groupadd [{--description:|-d:}<description>] [{--gid:|-g:}<gid>]
[[{--roles:|-r:}<role_1>] [,<role_2>...<role_n>]]
[{--regions:|-R:}<region_1>[,<region_2>...<region_3>]] <group>]|
[groupmod [{--description:|-d:}<description>] [{--gid:|-g:}<gid>]
[[{--roles:|-r:}<role_1>] [,<role_2>,...<role_n>]]
[{--regions:|-R:}<region_1>[,<region_2>,...<region_3>]]
[{--name:|-n:}<group>] <group>]|
[groupdel <group>]|
[groupshow [<group_1>[ <group_2> ...<group_n>]]]|
[roleadd [{--description:|-d:}<description>]
[{--privileges:|-p:}<privilege_1>[,<privilege_2>,...<privilege_n>]] <role>]|
[rolemod [{--description:|-d:}<description>]
[{--privileges:|-p:}<privilege_1>[,<privilege_2>,...<privilege_n>]]
[{--name:|-n:}<role>] <role>]|
[roledel <role>]|
[roleshow [<role_1>[ <role_2> ...<role_n>]]]|
[privshow [<privilege_1>[ <privilege_2> ...<privilege_n>]]]|
[{--verbose|-v}]|
[-signature]|
[{-usage|-help|-?}]
}
Description
The User Administration (cwuser) utility allows you to perform user, group, and role administration operations on the
cluster. Operations include adding, modifying, deleting, and displaying the current state of users, groups, and roles.
Subcommands
useradd
The user's description (e.g., the user's full name). If you do not specify a description, this field remains blank.
[{--home:|-d:}<home_directory>]
[{--group:|-g:}<primary_group>]
The user's primary group. You may enter the group name or its numerical gid. If you do not enter a primary group, Management Center will do one of the following:
Red Hat Linux
Create a group with the same name as the user and assign the primary
group to that group (unless you specify the [--normal:|-n:] option).
SuSE Linux
The primary group for the user is the default group specified for users,
usually users.
[{--groups:|-G:}<secondary_group_1>[,<secondary_group_2>,...<secondary_group_n>]]
The secondary group(s) to which the user belongs. If you do not specify this option, the
user belongs to no secondary groups. Multiple entries are delimited by commas.
[{--password:|-p:}<encrypted_password>]
The user's encrypted password. If you do not specify a password, Management Center disables the account.
[{--shell:|-s:}<shell>]
The user's login shell. If you do not specify this option, Management Center assigns /bin/bash as the user's login shell.
[{--uid:|-u:}<uid>]
The user's uid. If you do not specify a uid, Management Center assigns the first available uid greater than 499.
[{--enable:|-U}] [{--disable:|-L:}]
These options allow you to enable or disable the user's account. The -U (unlock) and -L (lock) options are provided for compatibility with the useradd utility and allow you to enable and disable the user's account respectively. If you do not specify either of these options, the user's account is enabled by default (unless no password is supplied).
[{--normal:|-n:}]
If you do not specify a group for the user on Red Hat Linux, Management Center will behave as it does with most other versions of Linux. The user's primary group uses the default user group, users.
<user>
The user's login name.
usermod
[{--description:|-c:}<description>]
The user's description (e.g., the user's full name). If you do not specify a description, Management Center uses the current description.
[{--home:|-d:}<home_directory>]
The user's home directory. If left blank, the current home directory is retained.
[{--group:|-g:}<primary_group>]
The user's primary group. You may enter the group name or its numerical gid. If you do
not enter a primary group, Management Center uses the current group assignment.
[{--groups:|-G:}<secondary_group_1>[,<secondary_group_2>,...<secondary_group_n>]]
The secondary group(s) to which the user belongs. If you do not specify this option,
Management Center assigns the user to any secondary groups previously assigned.
Multiple entries are delimited by commas.
[{--password:|-p:}<encrypted_password>]
Change the user's encrypted password. If you do not specify a password, Management Center uses the current password.
[{--shell:|-s:}<shell>]
The user's login shell. If you do not specify this option, Management Center uses the login shell previously assigned to the user.
[{--uid:|-u:}<uid>]
The user's uid. If you do not specify a uid, Management Center uses the current uid.
[{--enable:|-U}] [{--disable:|-L:}]
These options allow you to enable or disable the user's account. The -U (unlock) and -L (lock) options are provided for compatibility with the useradd utility and allow you to enable and disable the user's account, respectively. If you do not specify either of these options, the user's account is enabled by default (unless no password is supplied).
[{--name:|-l:}<user>]
Change the login name for the user's account. If you do not specify this option, Management Center uses the previous login name.
<user>
The user's login name.
userdel
<user>
The user's login name.
usershow
[<user_1>[ <user_2> ...<user_n>]]
(Optional) The login name(s) of the user(s) to display. Multiple entries are delimited by spaces. Leave this option blank to display all users.
passwd
Alter the password for a Management Center user. After you enter the new password, Management Center prompts you to re-enter it for verification.
<user>
The user's login name.
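For example, to change the password for a hypothetical user named john:
cwuser passwd john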
encryptpasswd
This option allows you to encrypt a clear text password into the Management Center encrypted format and display it on
screen. You may then copy and paste the encrypted password when creating a new user account. See example on
page 238.
Encrypted password strings often contain characters with which the Linux shell has problems. To overcome this,
encrypted text must be escaped using single quotes:
cwuser usermod '-p:$1$Jx^VLEZy$/7SmJmEbmbVMQW13kxaIg.' john
groupadd
[{--description:|-d:}<description>]
The group's description. If you do not specify a description, this field remains blank.
[{--gid:|-g:}<gid>]
The group's gid. If you do not specify a gid, Management Center assigns the first available gid greater than 499.
[{--roles:|-r:}<role_1>[,<role_2>,...<role_n>]]
The roles associated with the group. If you do not specify a role(s), the group is not
associated with any roles. Multiple entries are delimited by commas.
[{--regions:|-R:}<region_1>[,<region_2>,...<region_n>]]
The region(s) associated with the group. If you do not specify a region(s), Management Center does not associate the group with any regions. Multiple entries are delimited by commas.
<group>
Group name.
groupmod
[{--description:|-d:}<description>]
The group's description. If you do not specify a description, Management Center uses the current group description.
[{--gid:|-g:}<gid>]
The group's gid. If you do not specify a gid, Management Center uses the gid previously assigned.
[{--roles:|-r:}<role_1>[,<role_2>,...<role_n>]]
The roles associated with the group. If you do not specify a role(s), the group maintains
its previous role associations. Multiple entries are delimited by commas.
[{--regions:|-R:}<region_1>[,<region_2>,...<region_n>]]
The regions associated with the group. If you do not specify a region(s), Management Center maintains the current region associations. Multiple entries are delimited by commas.
[{--name:|-n:}<group>]
Use this option to change the group name. If you do not specify a name, the group name remains unchanged.
<group>
Current group name.
groupdel
<group>
Group name.
groupshow
[<group_1>[ <group_2> ...<group_n>]]
(Optional) Group name(s) for which to display the current settings. Multiple entries are delimited by spaces. Leave this option blank to display all groups.
roleadd
[{--description:|-d:}<description>]
The role's description. If you do not specify a role description, this field remains blank.
[{--privileges:|-p:}<privilege_1>[,<privilege_2>,...<privilege_n>]]
The privileges associated with the role. If you do not specify a privilege(s), Management Center does not assign any privileges to the role. Multiple entries are delimited by commas.
<role>
The name of the role.
rolemod
[{--description:|-d:}<description>]
The role's description. If you do not specify a description for the role, Management
Center uses the current description.
[{--privileges:|-p:}<privilege_1>[,<privilege_2>,...<privilege_n>]]
The privileges associated with the role. If you do not specify a privilege(s), Management Center uses current privilege associations. Multiple entries are delimited by commas.
[{--name:|-n:}<role>]
Use this option to change the name of the role. If you do not specify a name, the role name remains unchanged.
<role>
The name of the current role.
roledel
<role>
The name of the role.
roleshow
[<role_1>[ <role_2> ...<role_n>]]
(Optional) The name of the role(s) for which to display the current settings. Multiple entries are delimited by spaces. Leave this option blank to display all roles.
privshow
[<privilege_1>[ <privilege_2> ...<privilege_n>]]
(Optional) The privilege(s) for which to display the current settings. Multiple entries are delimited by spaces. Leave this option blank to display all privileges.
[{--verbose|-v}]
(Optional) Display verbose output when performing operations. This option is common to all subcommands.
[-signature]
(Optional) Displays the application signature. The application signature contains the name, description, version, and build information of this application.
[{-usage|-help|-?}]
(Optional) Display help information for the command and exit. All other options are ignored.
Examples
EXAMPLE 1
Display the current users in the system:
cwuser usershow -v
EXAMPLE 2
Add the user john to the users group:
cwuser useradd -g:users john
EXAMPLE 3
Add an encrypted password to a new user account:
cwuser encryptpasswd
<Enter, then verify password>
The command outputs an encrypted string to use when creating the new account.
$1$Jx^VLEZy$/7SmJmEbmbVMQW13kxaIg.
Because encrypted password strings often contain characters with which the Linux shell has problems, encrypted text
and user names containing spaces (e.g., John Johnson) must be escaped using single quotes.
Create the new user account using the encrypted password.
cwuser useradd '-p:$1$Jx^VLEZy$/7SmJmEbmbVMQW13kxaIg.' -d:/home/john -s:/bin/bash -u:510 -g:users '-c:John Johnson' john
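EXAMPLE 4
The group and role subcommands follow the same option style. As an illustrative sketch only (the role name, privilege names, group name, and region name below are hypothetical, not values shipped with Management Center), you might create a role and then a group that uses it:
cwuser roleadd -d:Operators -p:power,instrumentation operator
cwuser groupadd -d:Operations -r:operator -R:compute ops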
dbix
dbix {
[{-d|--delete} <context_1>[ <context_2> ...<context_n>]]|
[{-i|--import} <context>] |
[{-x|--export} <context_1>[ <context_2> ...<context_n>]]|
[{-usage|-help|-?}]
}
Description
The dbix application provides support for importing, exporting, and deleting Management Center database entries. The
application uses the standard input and output streams for reading and writing data, and the delete and export options
accept an optional space-delimited list of contexts (a context refers to the path to the database attributes on which to
perform the operation).
Parameters
[{-d|--delete} <context_1>[ <context_2> ...<context_n>]]
Delete the specified context(s) from the database. If you do not specify a context, the entire database is deleted (you are asked to confirm the action).
[{-i|--import} <context>]
Import database entries from standard input.
[{-x|--export} <context_1>[ <context_2> ...<context_n>]]
Export the specified context(s) to standard output. If you do not specify a context, the entire database is exported.
[{-usage|-help|-?}]
Display help information for the command and exit. All other options are ignored.
Examples
EXAMPLE 1
Export the entire database to a file:
dbix -x > cwx.4.0-May.20.2007.db
EXAMPLE 2
Export the hosts section of the database to a file:
dbix -x hosts > cwx.4.0-hosts.db
EXAMPLE 3
Delete the entire database:
dbix -d
(confirm action)
EXAMPLE 4
Import a new database (or additions):
dbix -i < cwx.4.0-new_hosts.db
dbx
dbx {
[{--domain:|-d} <domain>] [{--format:|-f:} <format>] [{-usage|-help|-?}] [-runtime[:verbose]]
[-signature] [-splash]
}
Description
This utility exports specific file formats from the database. Supported formats include a simple host name list typically
used for mpich, pdsh, etc., an IP address to host name map (/etc/hosts), and configuration files for powerman and
conman.
Parameters
Arguments and option values are case sensitive. Option names are not.
[{--domain:|-d} <domain>]
[{--format:|-f:} <format>]
[{-usage|-help|-?}]
[-runtime[:verbose]]
[-signature]
[-splash]
Examples
EXAMPLE 1
Use dbx to configure a conman.conf file:
dbx -f:conman > /etc/conman.conf
EXAMPLE 2
Use dbx to configure a hosts file:
dbx -f:hosts -d:sgi.com > /etc/hosts
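EXAMPLE 3
The description above also lists powerman configuration as a supported format. Assuming the format keyword is powerman (the keyword is not confirmed in this guide), the invocation would follow the same pattern:
dbx -f:powerman > /etc/powerman/powerman.conf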
imgr
imgr {
{--image:|-i:}<image> [{--kernel:|-k:}<kernel>] [{--kernelrevision:|-K:}<kernel_revision>]
[{--payload:|-p:}<payload>] [{--payload.revision:|-P:}<payload_revision>]
[{--force:|-f:}] [{--list:|-l:}]|
[{-usage|-help|-?}]
}
Description
The imgr command is used to modify the kernel or payload of an existing image. To create a new image, please refer to Image Management on page 110.
If you change a kernel or payload, Management Center rebuilds the image but still requires that you commit the image to VCS. See vcs on page 251.
Parameters
{--image:|-i:}<image>
The name of the image to modify. By default, Management Center selects the version of
the image that was most recently checked in.
[{--kernel:|-k:}<kernel>]
(Optional) Specify which kernel revision to use. If you do not specify a revision, you
will be asked whether or not to use the latest revision.
[{--payload:|-p:}<payload>]
(Optional) Specify which payload revision to use. If you do not specify a revision, you will be asked whether or not to use the latest revision.
[{--force:|-f:}]
(Optional) Select the force option to automatically select the latest revision of a payload or kernel. Selecting this option suppresses the prompt that asks you whether or not to use the latest revision.
[{--list:|-l:}]
(Optional) Display a list of working images.
[{-usage|-help|-?}]
(Optional) Display help information for the command and exit. All other options are ignored.
Examples
Update image Compute to use revision 4 of the linux-2.4 kernel:
imgr -i:Compute -k:linux-2.4 -K:4
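Because Management Center rebuilds the image but does not commit it, check the modified image back into VCS afterward. A sketch, assuming the image working copy resides under the imaging working directory in the same way the payload examples in the vcs section do:
cd $MGR_HOME/imaging/<username>/images/Compute
vcs commit -d:updated-kernel-revision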
kmgr
kmgr {
{--name:|-n:}<name> [{--description:|-d:}<description>]
{--path:|-p:}<path_to_Linux_kernel_source> [{--kernel:|-k:}<name_of_binary>]
[{--architecture:|-a:}<architecture>] [{--modules:|-m:}] [{--binary:|-b:}] [{--list:|-l:}]|
[{-usage|-help|-?}]
}
Description
The kmgr command is used to create a kernel package from a binary kernel or from a kernel source directory. The
utility copies the binary kernel, .config, System.map, and modules to the kernel directory.
Parameters
{--name:|-n:}<name>
[{--description:|-d:}<description>]
Example 1
Create a new kernel named linux-2.4:
kmgr -n:linux-2.4 -p:/usr/src/linux-2.4.20-8 -a:i386
Example 2
Create a new kernel, linux-2.6, from a binary kernel:
kmgr -b -n:linux-2.6 -k:/boot/vmlinuz-2.6.16-smp -a:x86_64 '-d:Linux 2.6.16 SMP kernel'
pdcp
pdcp {[
[-w <host>[,<host>...,<host_n>]]|
[-x <host>[,<host>...,<host_n>]]|
[-a]|
[-i]|
[-r]|
[-p]|
[-q]|
[-f <number>]|
[-l <user>]|
[-t <seconds>]|
[-d]]
<source>[ <source>... <source_n>]
<destination>
}
Description
Pdcp is a parallel copy command used to copy files from a Master Host to all or selected hosts in the cluster. Unlike rcp
which copies files only to an individual host, pdcp can copy files to multiple remote hosts in parallel. When pdcp
receives SIGINT (Ctrl+C), it lists the status of current threads. A second SIGINT within one second terminates the
program.
Parameters
TARGET HOST LIST OPTIONS
If you do not specify any of the following options, the WCOLL environment variable must point to a file that contains a
list of hosts, one per line.
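For example, a minimal sketch that assumes /etc/pdsh/machines already lists the target hosts one per line:
export WCOLL=/etc/pdsh/machines
pdcp /etc/hosts /etc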
[-w <host>[,<host>...,<host_n>]]
(Optional) Execute this operation on the specified host(s). You may enter a range of
hosts or a comma-delimited list of hosts (e.g., host[1-4,7,9]). Any list that consists of a
single - character causes pdsh to read the target hosts from stdin, one per line.
[-x <host>[,<host>...,<host_n>]]
(Optional) Exclude the specified hosts from this operation. You may enter a range of hosts or a comma-delimited list of hosts (e.g., host[1-4,7,9]). You may use this option in conjunction with other target host list options such as -a.
[-a]
(Optional) Perform this operation on all hosts in the cluster.
[-i]
(Optional) Use this option in conjunction with -a or -g to request canonical host names. By default, pdsh uses reliable host names.
[-r]
[-p]
[-q]
[-f <number>]
[-l <user>]
[-t <seconds>]
[-d]
<source>[ <source>... <source_n>]
List the source file(s) you want to copy from the Master Host. To copy multiple files, enter a space-delimited list of files (e.g., pdcp -a /source1 /source2 /source3 /destination).
<destination>
The location to which to copy the file. The destination is set off from the source by a space.
Example 1
Copy /etc/hosts to foo01-foo05:
pdcp -w foo[01-05] /etc/hosts /etc
Example 2
Copy /etc/hosts to foo0 and foo2-foo5:
pdcp -w foo[0-5] -x foo1 /etc/hosts /etc
Example 3
To copy a file to all hosts in the cluster:
pdcp -a /etc/hosts /etc/
Example 4
To copy a directory recursively:
pdcp -a -r /scratch/dir /scratch
Example 5
To copy multiple files to a directory:
pdcp -a /etc/passwd /etc/shadow /etc/group /etc
pdsh
pdsh {
[[-w <host>[,<host>...,<host_n>]]|
[-x <host>[,<host>...,<host_n>]]|
[-a]|
[-i]|
[-q]|
[-f <number>]|
[-s]|
[-l <user>]|
[-t <seconds>]|
[-u <seconds>]|
[-n <tasks_per_host>]|
[-d]|
[-S]|
<host>[,<host>...,<host_n>]]
<command>
}
Description
To use pdsh, it must be installed and configured. You can obtain pdsh from http://sourceforge.net/projects/pdsh/.
Pdsh is a variant of the rsh command. However, unlike rsh which runs commands only on an individual host, pdsh
allows you to issue parallel commands on groups of hosts. When pdsh receives SIGINT (Ctrl+C), it lists the status of
current threads. A second SIGINT within one second terminates the program. If set, the DSHPATH environment
variable is the PATH for the remote shell.
If a command is not specified on the command line, pdsh runs interactively, prompting for commands, then executing
them when terminated with a carriage return. In interactive mode, target hosts that time-out on the first command are
not contacted for subsequent commands. Commands prefaced with an exclamation point are executed on the local
system.
Parameters
TARGET HOST LIST OPTIONS
[-w <host>[,<host>...,<host_n>]]
(Optional) Execute this operation on the specified host(s). You may enter a range of
hosts or a comma-delimited list of hosts (e.g., host[1-4,7,9]). Any list that consists of a
single - character causes pdsh to read the target hosts from stdin, one per line.
[-x <host>[,<host>...,<host_n>]]
(Optional) Exclude the specified hosts from this operation. You may enter a range of hosts or a comma-delimited list of hosts (e.g., host[1-4,7,9]). You may use this option in conjunction with other target host list options such as -a.
[-a]
(Optional) Perform this operation on all hosts in the cluster. By default, a list of all hosts installed in the cluster is available under /etc/pdsh/machines.
[-i]
(Optional) Use this option in conjunction with -a or -g to request canonical host names.
By default, pdsh uses reliable host names.
[-l <user>]
[-t <seconds>]
[-u <seconds>]
[-n <tasks_per_host>]
[-d]
[-S]
<host>[,<host>...,<host_n>]
The name of the host(s) on which to execute the specified operation. You may enter a
range of hosts or a comma-delimited list of hosts (e.g., host[1-4,7,9]).
Example 1
Run a command on foo7 and foo9-foo15:
pdsh -w foo[7,9-15] <command>
Example 2
Run a command on foo0 and foo2-foo5:
pdsh -w foo[0-5] -x foo1 <command>
Example 3
In some instances, it is preferable to run pdsh commands using a pdsh shell. To open the shell for a specific group of
hosts, enter the following:
pdsh -w foo[0-5]
From the shell, you may enter commands without specifying the host names:
pdsh> date
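As noted in the description, commands prefaced with an exclamation point are executed on the local system rather than on the target hosts, for example:
pdsh> !uptime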
pmgr
pmgr {
[[{--description:|-d:}<description>] [{--include:|-i:}<include_file_or_directory>]
[{--include-from:|-I:}<file_containing_list>] [{--location:|-l:}<location_dir>]
[{--silent:|-s:}<silent>]
[{--exclude:|-x:}<exclude_file_or_dir>]] [{--exclude-from:|-X:}<file_containing_list>]
<payload_name>| [{-usage|-help|-?}]
}
Description
The pmgr utility generates a Management Center payload from an existing Linux installation to use on a specified host; however, Management Center services must be running on the remote host. An exclude list (or file) allows you to manage which files and directories you want to exclude from the payload (e.g., remote NFS mounted directories or /proc).
Parameters
[-d:<description>]
[-i:<include_file_or_directory>]
(Optional) Enter the name of the file or directory to include in the payload. When you
specify a directory, the payload will include all files and subdirectories contained in the
directory.
To include a previously excluded item (i.e., a file or directory contained in an excluded directory), enter the name of the
file or subdirectory.
[{--include-from:|-I:}<file_containing_list>]
(Optional) Enter the name of the file that contains a list of all files to include in the payload.
[-l:<location_dir>]
(Optional) The directory in which to create the payload. By default, the user's payload working directory with the payload name appended.
[-s:<silent>]
(Optional) Omit all output other than errors, including the payload creation progress meter and final summary. This is useful when scripting pmgr.
[-x:<exclude_file_or_dir>]
(Optional) Exclude the named file or directory from the payload. Excluding a directory
excludes all files and subdirectories.
[{--exclude-from:|-X:}<file_containing_list>]
(Optional) Enter the name of the file that contains a list of all files to exclude from the payload.
<payload_name>
The name of the payload.
[{-usage|-help|-?}]
(Optional) Display help information for the command and exit. All other options are ignored.
Example
The following example demonstrates how to create a new payload from an existing host installation, n2, and exclude
some unwanted directories from the payload:
pmgr -x:/proc:/home:/var/log:/dev/pts:/mnt -h=n2 n2_payload
powerman
powerman {
[[{--on|-1}]|
[{--off|-0}]|
[{--cycle|-c}]|
[{--reset|-r}]|
[{--flash|-f}]|
[{--unflash|-u}]|
[{--list|-l}]|
[{--query|-q}]|
[{--node|-n}]|
[{--beacon|-b}]|
[{--temp|-t}]|
[{--help|-h}]|
[{--license|-L}]|
[{--destination|-d} host[:port]]|
[{--version|-V}]|
[{--device|-D}]|
[{--telemetry|-T}]|
[{--exprange|-x}]]
<host>[ <host> ...<host_n>]
}
Description
To use Powerman for power control (that is, as your platform management device), Powerman must be installed and
configured. You can obtain Powerman from http://sourceforge.net/projects/powerman/.
Powerman offers power management controls for hosts in clustered environments. Controls include power on, power
off, and power cycle via remote power control (RPC) devices. Target host names are mapped to plugs on RPC devices
in powerman.conf.
Parameters
[{--on|-1}]
[{--off|-0}]
[{--cycle|-c}]
[{--reset|-r}]
[{--flash|-f}]
[{--unflash|-u}]
[{--list|-l}]
[{--query|-q}]
(Optional) Query power status. Note that powerman returns the host's power status only, not its operational status. A host in the Off state could be On at the plug and operating in standby power mode.
[{--node|-n}]
[{--beacon|-b}]
(Optional) Query beacon status (if implemented by RPC). If you do not specify a host(s), powerman queries the beacon status of all hosts.
[{--temp|-t}]
(Optional) Query host temperature (if implemented by RPC). If you do not specify a host(s), powerman queries the temperature of all hosts. Temperature information is not interpreted by powerman and is reported as received from the RPC on one line per host, prefixed by the host name.
[{--help|-h}]
(Optional) Display option summary.
[{--license|-L}]
(Optional) Show powerman license information.
[{--destination|-d} host[:port]]
[{--version|-V}]
[{--device|-D}]
[{--telemetry|-T}]
[{--exprange|-x}]
<host>[ <host> ...<host_n>]
The name of the host(s) on which to execute the specified operation. You may enter a range of hosts or a space- or comma-delimited list of hosts (e.g., host[1-4 7 9] or host[1-4 7,9]).
FILES
/usr/sbin/powermand
/usr/bin/powerman
/usr/bin/pm
/etc/powerman/powerman.conf
/etc/powerman/*.dev
Example 1
To power on hosts bar, baz, and n01-n05:
powerman --on bar baz n[01-05]
Example 2
To turn off hosts n4 and n7-n9:
powerman -0 n4,n[7-9]
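Example 3
To query the power state of hosts n1 through n10 using the --query option described above:
powerman -q n[1-10]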
vcs
vcs {
[{identify| id}]|
[status]|
[include <files>]|
[exclude <files>]|
[archive <filename>]|
[import -R:<repository> -M:<module> [-n:<name>] [-d:<description>] [<files>]]|
[commit [-n:<name>] [-d:<description>] [<files>]]|
[branch [-n:<name>] [-d:<description>] [<files>]]|
[{checkout | co} -R:<repository> -M:<module> [-r:<revision>|<branch>|<name>]]|
[{update | up} [-r:<revision>|<branch>|<name>] [<files>]]|
[name [-R:<repository>] [-M:<module>] [-r:<revision>|<branch>|<name>] <text>]|
[describe [-R:<repository>] [-M:<module>] [-r:<revision>|<branch>|<name>] <text>]|
[{narrate | log} [-R:<repository> -M:<module>] [-r:<revision>|<branch>|<name>]]|
[iterate [-R:<repository> [-M:<module> [-r:<revision>|<branch>|<name>]]]]|
[list]|
[{-usage|-help|-?}]
}
Description
Manage version controlled directories within Management Center.
Parameters
[{identify | id}]
(Optional) Display information about the module contained in the current working directory.
[status]
(Optional) Display the status of the files within the current working directory, including whether they have been added (A), modified (M), or deleted (D).
[include <files>]
(Optional) Add provided list of files to the include list. You may also use this option to override a specific file exclusion.
[exclude <files>]
(Optional) Add provided list of files to the exclude list. Excluding files allows you to remove files that may cause problems (e.g., when trying to archive files).
[archive <filename>]
(Optional) Create an archive of the current working directory in the given file. This option may be used to archive a host and include it in VCS as a payload.
[import -R:<repository> -M:<module> [-n:<name>] [-d:<description>] [<files>]]
(Optional) Create a new module with the provided list of files or all of the current working directory.
[commit [-n:<name>] [-d:<description>] [<files>]]
(Optional) Insert a new revision in the module using the provided list of files or any
working copy modifications.
[branch [-n:<name>] [-d:<description>] [<files>]]
(Optional) Insert a new revision that is not on tip using the provided list of files or any
working copy modifications.
[{checkout| co} -R:<repository> -M:<module> [-r:<revision>|<branch>|<name>]]
(Optional) Retrieve an existing revision from a module. The contents of the module will
be stored in a new directory named after the module.
[{update | up} [-r:<revision>|<branch>|<name>] [<files>]]
(Optional) Update the current directory to use the latest tip revision of a branch (3.4), the main trunk of a specific branch (4), or a branch with a specific name (Golden). The files option allows you to update a specific file contained in a payload.
[name [-R:<repository>] [-M:<module>] [-r:<revision>|<branch>|<name>] <text>]
(Optional) Add, modify or delete the optional name or alias of a revision. Names are
unique revision identifiers for the entire module. A blank for the name will delete the
previous value.
[describe [-R:<repository>] [-M:<module>] [-r:<revision>|<branch>|<name>] <text>]
(Optional) Add, modify or delete the optional description of a revision. A blank for the
description will delete the previous value.
[{narrate| log} [-R:<repository> -M:<module>] [-r:<revision>|<branch>|<name>]]
Examples
EXAMPLE 1
Display a list of images contained in the Version Control System:
vcs iterate -R:images
EXAMPLE 2
Display a list of files that have changed since the last time the Compute payload was checked out:
cd $MGR_HOME/imaging/root/payloads/Compute
vcs status
EXAMPLE 3
List current versions of all category types (payloads, kernels, and images) checked into VCS:
vcs list
Images
---------------------------------------------------------
MyImage (1) - Kernel: MyKernel (3) Payload: MyPayload (6.1.4)
TestImage (1) - Kernel: Compute (2) Payload: SLES10 (23)
Kernels
---------------------------------------------------------
MyKernel (5)
Compute (2)
Payloads
---------------------------------------------------------
MyPayload (6.1.7)
SLES9 (34)
SLES10 (23)
EXAMPLE 4
Check out a specific revision, 8, of a version controlled payload named Compute:
vcs checkout -R:payloads -M:Compute -r:8
EXAMPLE 5
Use VCS to make sure you have the latest revision of what was originally checked out in the previous example:
cd $MGR_HOME/imaging/<username>/payloads/Compute
vcs update
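EXAMPLE 6
Assign a name to a revision so that it can be referenced by name rather than by number (a sketch; the name Golden is only an illustration):
vcs name -R:payloads -M:Compute -r:8 Golden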
Appendix
Pre-configured Metrics
The CustomMetrics.profile is the file used to define which metrics are available in the Add a Custom Metric Listener
dialog. The Metrics.profile is the file used to define which metrics are available from the Metrics Selector dialog to
view in the instrumentation service.
Both the Metrics.profile and CustomMetrics.profile use the same format and need to be edited only if you have written
a custom monitoring script and configured it as a custom monitor. Then, if you want to:
Display the custom metrics in the List View, add the new metrics to the Metrics.profile.
Set thresholds on the custom metrics, add the new metrics to the CustomMetrics.profile.
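As a rough sketch only, and assuming the profiles use the same dotted metric-path and pattern syntax shown in the tables below (the metric path here is hypothetical and must match the values your custom monitor actually reports), an added entry might look like:
hosts.{host.moniker}.mymonitor.value.pattern=0.00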
CPU
Metric Name
CPU Percent Idle Aggregate
hosts.{host.moniker}.cpu.iowait.pattern=100%
The total cycles used by all CPUs waiting for I/O.
hosts.{host.moniker}.cpu.nice.pattern=100%
The total cycles used by all CPUs in user mode with low priority.
hosts.{host.moniker}.cpu.system.pattern=100%
The total cycles used by all CPUs in kernel mode.
hosts.{host.moniker}.cpu.user.pattern=100%
The total cycles used by all CPUs in user mode.
Disk
Metric Name
Disk Reads (blocks per second)
hosts.{host.moniker}.disks.hda.block.writes.pattern=000000
The number of blocks written to a disk.
hosts.{host.moniker}.disks.hda.io.reads.pattern=000000
The number of I/O reads from a disk.
hosts.{host.moniker}.disks.hda.io.writes.pattern=000000
The number of I/O writes to a disk.
hosts.{host.moniker}.disks.hda1.capacity.used.pattern=0,000 MB
The disk capacity used for disk hda1, hda2, hda3, or hda4.
hosts.{host.moniker}.disks.hda1.capacity.free.pattern=0,000 MB
The disk capacity free for disk hda1, hda2, hda3, or hda4.
hosts.{host.moniker}.disks.hda1.percentage.used.pattern=100%
The disk percentage used for disk hda1, hda2, hda3, or hda4.
hosts.{host.moniker}.disks.sda1.capacity.used.pattern=0,000 MB
The disk capacity used for disk sda1, sda2, sda3, or sda4.
hosts.{host.moniker}.disks.sda1.capacity.free.pattern=0,000 MB
The disk capacity free for disk sda1, sda2, sda3, or sda4.
hosts.{host.moniker}.disks.sda1.percentage.used.pattern=100%
The disk percentage used for disk sda1, sda2, sda3, or sda4.
Kernel
Metric Name
Kernel Context Switches (per second)
hosts.{host.moniker}.kernel.interrupts.pattern=000000
hosts.{host.moniker}.kernel.processes.pattern=00000
Kernel Swaps In
hosts.{host.moniker}.kernel.swaps.out.pattern=000000
The number of swap pages that have been sent out.
Load
Metric Name
Load - 15 Minute
Load - 1 Minute
hosts.{host.moniker}.load.1m.pattern=0.00
The number of tasks in the run state averaged over 1 minute.
Load - 5 Minute
hosts.{host.moniker}.load.5m.pattern=0.00
The number of tasks in the run state averaged over 5 minutes.
Memory
Metric Name
Memory Active (bytes)
hosts.{host.moniker}.memory.cached.pattern=0,000 MB
The amount of cached memory.
hosts.{host.moniker}.memory.committed.pattern=0,000 MB
The amount of used memory.
hosts.{host.moniker}.memory.free.pattern=0,000 MB
The total amount of free memory.
hosts.{host.moniker}.memory.swap.cached.pattern=0,000 MB
The amount of cached swap.
hosts.{host.moniker}.memory.swap.free.pattern=0,000 MB
The amount of free swap space.
hosts.{host.moniker}.memory.total.pattern=0,000 MB
The total amount of memory.
Network
Metric Name
Network (eth0) Bytes Received (per second)
hosts.{host.moniker}.network.eth0.rx.packets.pattern=0,000 MB
The total number of received packets on all interfaces.
hosts.{host.moniker}.network.eth0.tx.bytes.pattern=0,000 MB
The total number of transmitted bytes on all interfaces.
hosts.{host.moniker}.network.eth0.tx.packets.pattern=0,000 MB
The total number of packets transmitted on all interfaces.
hosts.{host.moniker}.network.eth1.rx.bytes.pattern=0,000 MB
The total number of bytes received on all interfaces.
hosts.{host.moniker}.network.eth1.rx.packets.pattern=0,000 MB
The total number of received packets on all interfaces.
hosts.{host.moniker}.network.eth1.tx.bytes.pattern=0,000 MB
The total number of transmitted bytes on all interfaces.
hosts.{host.moniker}.network.eth1.tx.packets.pattern=0,000 MB
The total number of packets transmitted on all interfaces.
Glossary
Anti-aliasing A technique used to smooth images and text to improve their appearance on screen.
Architecture-independent Allows hardware or software to function regardless of hardware platform.
Baud rate A unit of measure that describes data transmission rates (in bits per second).
Block size The largest amount of data that the file system will allocate contiguously.
boot.profile A file that contains instructions on how to boot a host.
Boot utilities Utilities added to the RAM Disk that run during the boot process. Boot utilities allow you to create such
things as custom, pre-finalized scripts using utilities that are not required for standard Linux versions.
Cluster Clustering is a method of linking multiple computers or compute hosts together to form a unified and more
powerful system. These systems can perform complex computations at the same level as a traditional supercomputer by
dividing the computations among all of the processors in the cluster, then gathering the data once the computations are
completed. A cluster refers to all of the physical elements of your SGI solution, including the Management Center
Master Host, compute hosts, Management Center, UPS, high-speed network, storage, and the cabinet.
Management Center Master Host The Management Center Master Host is the host that controls the remaining hosts
in a cluster (for large systems, multiple masters may be required). This host is reserved exclusively for managing the
cluster and is not typically available to perform tasks assigned to the remaining hosts.
DHCP Dynamic Host Configuration Protocol. Assigns dynamic IP addresses to devices on a network.
Diskless host A host whose operating system and file system are installed into physical memory. This method is
generally referred to as RAMfs or TmpFS.
EBI An ELF Binary Image that contains the kernel, kernel options, and a RAM Disk.
Event engine Allows administrators to trigger events based on a change in system status (e.g., when processors rise
above a certain temperature or experience a power interruption). Administrators may configure triggers to inform users
of a specific event or to take a specific action.
Ext Original extended file system for Linux systems. Provides 255-character filenames and supports file sizes up to 2
Gigabytes.
Ext2 The second extended file system for Linux systems. Offers additional features that make the file system more
compatible with other file systems and provides support for file system extensions, larger file sizes (up to 4 Terabytes),
symbolic links, and special file types.
Ext3 Provides a journaling extension to the standard ext2 file system on Linux. Journaling reduces time spent
recovering a file system, critical in environments where high availability is important.
Group A group refers to an organization with shared or similar needs. A cluster may contain multiple groups with
unique or shared rights and privileges. A group may also refer to an administrator-defined collection of hosts within a
cluster that perform tasks such as data serving, Web serving, and computational number crunching.
Health monitoring An element of the Instrumentation Service used to track and display the state of all hosts in the
system. Health status icons appear next to each host viewed with the instrumentation service or from the navigation tree
to provide visual cues about system health. Similar icons appear next to clusters, partitions, and regions to indicate the
status of hosts contained therein.
Host An individual server or computer within the cluster that operates in parallel with other hosts in the cluster. Hosts
may contain multiple processors.
image.profile A file used to generate boot.profile. This file contains information about the image, including the
payload, kernel, and partition layout.
IP address A 32-bit number that identifies each sender or receiver of information.
Kerberos Kerberos is a network authentication protocol. It is designed to provide strong authentication for client/
server applications by using secret-key cryptography.
Kernel The binary kernel, a .config file, System.map, and modules (if any).
LDAP Lightweight Directory Access Protocol is an Internet protocol that email programs use to look up contact
information from a server.
Listener A listener constantly reads and reviews system metrics. Configuring listener thresholds allows you to trigger
loggers to address specific issues as they arise.
Logger The action taken when a threshold exceeds its maximum or minimum value. Common logger events include
sending messages to the centralized Management Center message log, logging to a file, logging to the serial console,
and shutting down the host.
MAC address A hardware address unique to each device installed in the system.
Metrics Used to track logger events and report data to the instrumentation service (where it may be monitored).
MIB Management Information Base. The MIB is a tree-shaped information structure that defines what sort of data can
be manipulated via SNMP.
Monitors Monitors run periodically on hosts and provide the metrics that are gathered, processed, and displayed using
the Management Center instrumentation service.
Multi-user Allows multiple administrators to simultaneously log into and administer the cluster.
Netmask A string of 0's and 1's that mask or screen out the network part of an IP address so only the host computer
portion of the address remains. The binary 1's at the beginning of the mask turn the network ID portion of the IP address
into 0's. The binary 0's that follow allow the host ID to remain. A commonly used netmask is 255.255.255.0 (255 is the
decimal equivalent of a binary string of eight ones).
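For example, applying the netmask 255.255.255.0 to the address 192.168.1.42 masks off the first three octets as the network ID (192.168.1.0) and leaves the final octet (42) as the host ID.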
NIS Network Information Service makes information available throughout the entire network.
Node See Host.
Partition Partitions are used to separate clusters into non-overlapping collections of hosts.
Payload A compressed file system that is downloaded via multicast during the provisioning process.
Plug-ins Programs or utilities added to the boot process that expand system capabilities.
RAID Redundant Array of Independent Disks. Provides a method of accessing multiple, independent disks as if the
array were one large disk. Spreading data over multiple disks improves access time and reduces the risk of losing all
data if a drive fails.
RAM Disk A small, virtual drive that is created and loaded with the utilities that are required when you provision the
host. In order for host provisioning to succeed, the RAM Disk must contain specific boot utilities. Under typical
circumstances, you will not need to add boot utilities unless you are creating something such as a custom, pre-finalized
script that needs utilities not required by standard Linux versions (e.g., modprobe).
RHEL Red Hat Enterprise Linux.
Region A region is a subset of a partition and may share any hosts that belong to the same partition, even if the hosts
are currently used by another region.
Role Roles are associated with groups and privileges, and define the functionality assigned to each group.
Secure remote access The ability to monitor and control the cluster from a distant location through an SSL-encrypted
connection. Administrators have the benefit of secure remote access to their clusters through any Java-enhanced
browser. Management Center can be used remotely, allowing administrators access to the cluster from anywhere in the
world.
Secure Shell (SSH) SSH is used to create a secure connection to the CLI. Connections made with SSH are encrypted
and safe to use over insecure networks.
SLES SUSE Linux Enterprise Server.
Version branching The ability to modify an existing payload, kernel, or image under version control and check it back
into VCS as a new, versioned branch of the original item.
Version Control System (VCS) The Management Center Version Control System allows users with privileges to
manage changes to payloads, kernels, or images (similar in nature to managing changes in source code with a version
control system such as CVS). The Version Control System supports common Check-Out and Check-In operations.
Versioned copy A versioned copy of a payload, kernel, or image is stored in VCS.
Working copy A working copy of a payload, kernel, or image is currently present in the working area only
(e.g., $MGR_HOME/imaging/<user>/payloads). Working copies are not stored in VCS.
Index
A
accounts
disable user 65
enable 65
manage group 92
manage local 92
acl_roots 146
add
boot utilities 128
custom monitors 175
directory to payload 96
file to payload 96
group 67
user account to payload 94
host 15, 43, 197
kernel modules without loading 108
listener 185
local user account to payload 93
monitor 169
package
to existing payload 84
partition 55
plug-in 131
RAID partition 117
role 70
user 64
to group 68
administration levels 63
Altix ICE vii, 11
Altix UV systems 3, 13, 14, 100, 208
AMD GPUs 15
annotations
electric shock ix
note ix
tip ix
warning ix
anti-aliasing 147, 149
appearance
interface 19
applications preferences 32
apply listeners
as default 167, 183
to hosts 167, 183
to payloads 168, 184
authentication
management, payload 90
auto node discovery 15, 197
B
beacon
turn off 54
turn on 53
block size 107, 116
boot
process, plug-ins for 130
troubleshooting 206
utilities, add 128
boot.profile 112, 129
branch, version 135
C
chassis management controllers (CMCs) 14
check into VCS
image 136
kernel 136
payload 136
check out of VCS
image 137
kernel 137
payload 137
CLI 259
client platforms 3
cluster 41, 63
environment 63
host administration 219
power administration 228
provisioning 230
system monitoring 147
user administration 233
CMCs 14
command-line interface 209, 259
conman 216
cwhost 219
cwpower 228
cwprovision 230
cwuser 233
dbix 239
dbx 240
imgr 241
kmgr 242
pdcp 243
pdsh 245
pmgr 248
powerman 249
vcs 251
compute host (See host)
configure
NIS 90
conman 25, 216
connect to console 54
console 54
copy
from VCS 138
image 112
kernel 103
payload 81
CPU
metrics 255
tab 153
utilization 153
create
group 67
host 43
image 110
kernel 101
kernel from binary 242
multiple payloads from source 79
partition 55, 114
password
Icebox 225
user 65, 234
payload 78
region 57
role 70
csv 49
Customer Service x, 203, 204
customize the interface 19
cwhost 219
cwpower 228
cwprovision 230
cwuser 233
D
Data Center Manager (DCM) 2, 28
dbix 49, 239
dbx 240
DCM 2, 28
debugging 204
default user administration settings 64
delete
all payloads, kernels, and images 138
file(s) from payload 98
group 69
account from payload 95
host 49
image partition 121
listener 189
local user account from payload 94
monitor 173
package from payload 84
partition 56
payload 98
region 59
role 72
user account 66
working copy of image 112
working copy of kernel 109
working copy of payload 98
dependency checks, package 87
DHCP 3, 38
dhcpd.conf 38
disable
anti-aliasing 147
gradient fill 147
Kerberos 91
LDAP 91
listener 183
monitor 167
NIS 90
user account 65, 66
Discover interface 15, 197
disk
E
edit
group 69
host 48
Icebox password 225
image partition 119
kernel 107
listener 188
monitor 173
partition 56
password 66, 235
payload 87
using text editor 97
region 58
role 72
user account 66
electric shock ix
enable
anti-aliasing 147
gradient fill 147
Kerberos 91
LDAP 91
listeners 183
monitor 167
NIS 90
user account 65
environment monitoring 160, 165
environmental tab 159
errors
messages 191
RPM 81
troubleshooting 203
event
listeners 182
log 148, 191
monitoring 165
exclude
files and directories from VCS 140
exclude file(s) from payload 83
F
features, Management Center 12
feedback, documentation ix
file
exclude file(s) from payload 83
system, user-defined 122
fill to end of disk 126
filter 149, 151
find
host 49
format partition 116
frames
controls 18
dockable 18, 20
FreeIPMI 25
Freeipmi 3
fstab 116, 126
G
general preferences 23
general tab 150
GID 64, 68
GPU monitoring 15, 161
gradient fill 147, 149
group 63, 67
add 67
account to payload 94
assign roles to 68
assign to role 71
assign user to 65
delete 69
account from payload 95
edit 69
GID 64, 68
grant access to region 68
power 65, 67
primary 65
region, add to 58
root 67
user membership 65
users 67
states 148
upgrade 144
I
H
halt host 52
hardware
system requirements 1
health
monitoring 147
event log 148
system status icons 148
status 150
host 41, 63
add 15, 43, 197
to partition 55
administration 41, 63
grant privileges 73
beacon
turn off 54
turn on 53
CLI administration 220
configure
diskless host 125
cycle power to 53
delete 49
diskless 125
edit 48
event log 148
find 49
halt 52
import 49
load monitoring 158
Management Center Master 41
rename 48
names 4
power
turn off 52, 53
turn on 53
power management 52
provision 141
using CLI 230
reboot 52
region
add host to 57
assign host to 45
reset 53
shared 41, 63
shut down 52
Icebox
administration privileges 73
create password for 225
modify password 225
icons, system status 148
ILO 1, 25, 47
image 75, 112
add modules without loading 108
check into VCS 136
check out from VCS 137
CLI controls 241
copy 112
create 110
delete all 138
delete partition 121
delete working copy of image 112
edit image partition 119
management 110
partition 114
privileges, enable imaging 73
provision 141
select image 141
versioned 134
working copy 134
image.once 112
image.path 112
imgr 241
import
binary kernel 242
default listeners 190
host list 49
listener 189
listener from payload 190
listeners from payload 190
monitors
from payload 172
monitors from host 171
informational messages 191
install
Management Center 4
client 5
into payload 99
instrumentation 147
CPU utilization 153
custom monitors 174
disk
aggregate usage 155
I/O 155
enhance performance 147
event log 148
health status 150
host load 158
kernel information 157
list view 152
memory utilization 154
menu controls 149
metrics, define 178
metrics, pre-configured 255
monitoring and event subsystem 165
packet transmissions 156
power status 150
resource utilization 150
system configuration 150
system status 148
overview 150
temperature readings 159, 160
thumbnail view 151
Intel Data Center Manager (DCM) 28
DCM
Data Center Manager (DCM) 1
Intel Power Node Manager (IPNM) 1, 2, 28
interface
customized appearance 19
management 44
map 16
split-pane view 19
interval 149
IP address 260
host 44
IPMI 1, 25, 35, 45
IPMItool 3
ipmitool utility 35
IPNM 1, 2, 28
ISLE Cluster Manager, upgrading from 193
J
Jpackage Utilities 3
K
Kerberos 91
kernel 75
build from source 101
check into VCS 136
L
layouts 20
open saved 21
save 20
set default 21
LDAP 91
licensing 8, 207
links, dangling symbolic 83
list view 149, 152
listeners 165, 190
add 185
apply as default 167, 183
apply to hosts 167, 183
apply to payloads 168, 184
delete 189
disable 183
edit 188
enable 183
event 182
import 189
load
metrics 257
load tab 158
loadable kernel modules 108
loggers 165, 191
TemplateFormatter 191
M
MAC addresses 197
maintenance operations 8
management
interface 44
management network 4
VCS 138
Management Center
administration
grant privileges 73
features 12
install into payload 99
install on client 5
interface
customize 19
map 16
split-pane view 19
introduction 11
Master Host
rename 48
preferences 23
applications 32
general 23
platform management 25
provisioning settings 33
Premium Edition vii, 11
product definition vii, 11
server, start and stop 8
services 9
Standard Edition vii, 11
upgrading 193
Management Center
platforms 1
system requirements 2
Master Host
definition 41
rename 48
system requirements 1
memlog 3, 13, 164
memory
estimate partition requirements 116, 126
metrics 257
utilization 154
Memory Failure Analysis 3
memory failure analysis 13, 164
memory tab 154
metrics 178, 255
alignment 180
CPU 255
custom 180
disk 256
display custom 178
instrumentation service 149
kernel 256
load 257
memory 257
metrics selector 180
network 258
mkfs 116
modules
install without loading 108
loadable kernel 108
modules subtab 108
monitoring
event 165
system health 147
monitors 165, 166
add 169
add custom 175
custom 174
delete 173
disable 167
edit 173
enable 167
import from host 171
import from payload 172
multicast
route configuration 39
N
navigation tree 18
netmask 260
network metrics 258
network tab 156
Network Time Protocol (NTP) 3
NFS 64
NIS 90
nodes.conf 49
note ix
NTP 3
NVIDIA GPUs 15
O
open
layouts 21
operating system requirements 2
override global settings 45
P
package
add to existing payload 84
dependency checks 87
remove from payload 87
packet transmissions 156
partition 41, 55, 63, 112
add 55
host to 55
RAID 117
create 114
user-defined file system 122
delete 56
delete from image 121
edit 56
edit image partition 119
estimate memory requirements 116, 126
format 116
manage 114
overwrite protection 116
partition this time 146
partitioning behavior 111
save 116
size
fill to end of disk 116, 124, 126
fixed 116, 124, 126
partition.once 112
password
create Icebox 225
create new 65, 234
encrypt 235
modify 66, 235
modify Icebox 225
payload 75
account management, local user 92
add
directory to 96
file to 96
group user account to 94
local user account to 93
package to existing 84
attributes, troubleshoot 80
authentication management 90
check into VCS 136
check out from VCS 137
check-in error 206
CLI controls 248
configure 89
copy 81
create 78
multiple payloads from source 79
dangling symbolic links 83
delete 98
file(s) from payload 98
group account from payload 95
local user account from payload 94
working copy of payload 98
delete all 138
download this time 146
edit
using CLI 96
with text editor 97
exclude file(s) 83
file configuration 89
group account management 92
install Management Center into 99
management 76
package dependency checks 87
pmgr 248
remove package from 87
script, enable 89
update directory 96
update file 96
versioned 134
working copy 134
PBS 145
PBS Professional 4
pdcp 243
pdsh 245
PEKI temperatures 182
permissions 73
See role; privileges
physical memory utilization 154
platforms, Management Center 1
platform management 1
DRAC 47
ILO 47
IPMI 45
platform management preferences 25
platforms, Management Center
Management Center
platforms 12
plug-ins
add 131
for boot process 130
power 53
CLI administration 228
control 52
cycle
to host 53
group 65, 67
management 28, 162
management, host 52
monitoring 28, 162
policy 28
powerman 249
status 150
turn off
to host 53
turn off host 52
turn on
to host 53
powerman 25, 249
pre-configured metrics 255
preferences
applications 32
general 23
Management Center 23
platform management 25
provisioning settings 33
Premium Edition, Management Center vii, 11
primary
group 65
Prism XL platforms 15
privileges 73
change user 72
database 73
host administration 73
Icebox administration 73
imaging 73
instrumentation 73
logging 73
Management Center 73
power 73
provisioning 73
serial 73
user administration 73
problems 203
product definition, Management Center vii, 11
Product Support x, 203, 204
provision 141
CLI controls 230
disable confirmation dialog 143
enable confirmation dialog 143
format partition 116
provisioning settings preferences 33
right-click 143
Q
qmgr 146
R
racks 41, 60
RAID 117
RAM Disk 128
block size 107
RAMfs 125
reboot host 52
region 41, 57, 63
add
group to 58
host to 57
assign to host 45
create 57
delete 59
edit 58
grant group access to 68
remove
file(s) from payload 98
group 69
group account from payload 95
host 49
local user account from payload 94
package from payload 87
partition 56
region 59
role 72
user account 66
rename
host 48
Management Center Master Host 48
requirements
hardware 1
operating system 2
software 3
reset
host 53
resource
utilization 150
restore factory settings 190
RHEL 260
right-click menu 52
connect to console 54
provisioning 143
rights
See role; privileges
Roamer 1, 25
Roamer KVM 54
role 63, 70
add 70
assign group to 71
assign to group 68
delete 72
edit 72
grant privileges and permissions 71
root group 67
routes, multicast 39
RPM
errors 81
S
save
layouts 20
partition 116
scalability 6
schedule provision at next reboot 145
script, enable in payload 89
search, tree 49
server platforms 2
SGI Altix UV systems 3, 13, 14, 100, 208
SGI Foundation Software 13, 164
SGI Prism XL platforms 15
shut down a host 52
size
thumbnail 151
SLES 261
SMN 13, 14
SMN bundle software 3, 13
software
requirements 3
sort 149
split-pane view 19
SSL 91
Standard Edition, Management Center vii, 11
start Management Center server 8
state, host 148
stop Management Center server 8
symbolic links, dangling 83
symlink 83
system
configuration 150
health 147
requirements
hardware 1
operating system 2
status
event log 148
icons 148
overview 150
system management node (SMN) 3, 13, 14
T
task progress dialog 79
Technical Support x, 203, 204
Telnet client 3
temperature
changing thresholds 165
monitoring 149, 160
PEKI 182
readings 159
troubleshooting 205
TemplateFormatter 191
TFTP 3, 39
third-party power controls 65, 67
thumbnail
size 149
view 151
thumbnail view 149
tip ix
TmpFS 125
toolbar 17
transmissions, packet 156
Trivial File Transfer Protocol (TFTP) 3, 39
troubleshooting
general 203
payload attributes 80
RPM errors 81
U
UID 64, 65
upgrade
distribution 2
kernel 2
VCS upgrade 144
upgrading Management Center 193
user 63, 64
add 64
local user account to payload 93
to group 68
administration 63
default settings 64
privileges 73
assign to group 65
CLI administration 233
delete
local user account from payload 94
delete account 66
disable account 66
edit account 66
group membership 65
multi-group 63
UID 64, 65
user-defined file system 122
users group 67
UV systems 3, 13, 14, 100, 208
V
VCS 134
branch 136
CLI controls 251
command-line controls 251
copy 138
exclude files and directories 140
management console 138
upgrade 144
verbosity level, kernel 111, 146, 230
version
branching 135
control system 134
check into 136
check out 137
vcs command 251
versioned copy 134, 261
VersionControlService.profile 140
virtual memory utilization 154
W
warning ix
warning messages 191
Windows clients 3
working copy 134, 261