
OpenHPC (v1.3.5)
Cluster Building Recipes

CentOS7.5 Base OS
xCAT/SLURM Edition for Linux* (x86_64)

Document Last Update: 2018-06-13


Document Revision: a78eeac38

Legal Notice

Copyright © 2016-2018, OpenHPC, a Linux Foundation Collaborative Project. All rights reserved.

This documentation is licensed under the Creative Commons Attribution 4.0 International License. To view
a copy of this license, visit http://creativecommons.org/licenses/by/4.0.

Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.


Contents

1 Introduction
  1.1 Target Audience
  1.2 Requirements/Assumptions
  1.3 Inputs

2 Install Base Operating System (BOS)

3 Install OpenHPC Components
  3.1 Enable OpenHPC repository for local use
  3.2 Enable xCAT repository for local use
  3.3 Installation template
  3.4 Add provisioning services on master node
  3.5 Add resource management services on master node
  3.6 Optionally add InfiniBand support services on master node
  3.7 Optionally add Omni-Path support services on master node
  3.8 Complete basic xCAT setup for master node
  3.9 Define compute image for provisioning
    3.9.1 Build initial BOS image
    3.9.2 Add OpenHPC components
    3.9.3 Customize system configuration
    3.9.4 Additional Customization (optional)
      3.9.4.1 Increase locked memory limits
      3.9.4.2 Enable ssh control via resource manager
      3.9.4.3 Add Lustre client
      3.9.4.4 Add Nagios monitoring
      3.9.4.5 Add Ganglia monitoring
      3.9.4.6 Add ClusterShell
      3.9.4.7 Add mrsh
      3.9.4.8 Add genders
      3.9.4.9 Add ConMan
      3.9.4.10 Add NHC
    3.9.5 Identify files for synchronization
  3.10 Finalizing provisioning configuration
  3.11 Add compute nodes into xCAT database
  3.12 Boot compute nodes

4 Install OpenHPC Development Components
  4.1 Development Tools
  4.2 Compilers
  4.3 MPI Stacks
  4.4 Performance Tools
  4.5 Setup default development environment
  4.6 3rd Party Libraries and Tools
  4.7 Optional Development Tool Builds

5 Resource Manager Startup

6 Run a Test Job
  6.1 Interactive execution
  6.2 Batch execution

Appendices
  A Installation Template
  B Upgrading OpenHPC Packages
    B.1 New component variants
  C Integration Test Suite
  D Customization
    D.1 Adding local Lmod modules to OpenHPC hierarchy
    D.2 Rebuilding Packages from Source
  E Package Manifest
  F Package Signatures


1 Introduction
This guide presents a simple cluster installation procedure using components from the OpenHPC software
stack. OpenHPC represents an aggregation of a number of common ingredients required to deploy and
manage an HPC Linux* cluster including provisioning tools, resource management, I/O clients, develop-
ment tools, and a variety of scientific libraries. These packages have been pre-built with HPC integration
in mind while conforming to common Linux distribution standards. The documentation herein is intended
to be reasonably generic, but uses the underlying motivation of a small, 4-node stateless cluster installation
to define a step-by-step process. Several optional customizations are included and the intent is that these
collective instructions can be modified as needed for local site customizations.

Base Linux Edition: this edition of the guide highlights installation without the use of a companion con-
figuration management system and directly uses distro-provided package management tools for component
selection. The steps that follow also highlight specific changes to system configuration files that are required
as part of the cluster install process.

1.1 Target Audience


This guide is targeted at experienced Linux system administrators for HPC environments. Knowledge of
software package management, system networking, and PXE booting is assumed. Command-line input
examples are highlighted throughout this guide via the following syntax:

[sms]# echo "OpenHPC hello world"

Unless specified otherwise, the examples presented are executed with elevated (root) privileges. The
examples also presume use of the BASH login shell, though the equivalent commands in other shells can
be substituted. In addition to specific command-line instructions called out in this guide, an alternate
convention is used to highlight potentially useful tips or optional configuration options. These tips are
highlighted via the following format:

Tip

How much sugar is in a cup? It depends on how big the cup is! –D. Brayford

1.2 Requirements/Assumptions
This installation recipe assumes the availability of a single head node (master) and four compute nodes. The
master node serves as the overall system management server (SMS) and is provisioned with CentOS7.5 and
is subsequently configured to provision the remaining compute nodes with xCAT in a stateless configuration.
The terms master and SMS are used interchangeably in this guide. For power management, we assume that
the compute node baseboard management controllers (BMCs) are available via IPMI from the chosen master
host. For file systems, we assume that the chosen master server will host an NFS file system that is made
available to the compute nodes. Installation information is also provided for optionally mounting a parallel
file system; in that case, the parallel file system is assumed to already exist.

An outline of the physical architecture discussed is shown in Figure 1 and highlights the high-level
networking configuration. The master host requires at least two Ethernet interfaces with eth0 connected to
the local data center network and eth1 used to provision and manage the cluster backend (note that these


interface names are examples and may be different depending on local settings and OS conventions). Two
logical IP interfaces are expected for each compute node: the first is the standard Ethernet interface that
will be used for provisioning and resource management. The second is used to connect to each host’s BMC
and is used for power management and remote console access. Physical connectivity for these two logical
IP networks is often accommodated via separate cabling and switching infrastructure; however, an alternate
configuration can also be accommodated via the use of a shared NIC, which runs a packet filter to divert
management packets between the host and BMC.

[Figure 1: Overview of physical cluster architecture. The master (SMS) host connects to the local data
center network via eth0 and uses eth1 to reach the compute node Ethernet and BMC interfaces over TCP
networking; an optional high speed network links the master, the compute nodes, and a parallel file system.]

In addition to the IP networking, there is an optional high-speed network (InfiniBand or Omni-Path
in this recipe) that is also connected to each of the hosts. This high speed network is used for application
message passing and optionally for parallel file system connectivity as well (e.g. to existing Lustre or BeeGFS
storage targets).

1.3 Inputs
As this recipe details installing a cluster starting from bare-metal, there is a requirement to define IP ad-
dresses and gather hardware MAC addresses in order to support a controlled provisioning process. These
values are necessarily unique to the hardware being used, and this document uses variable substitution
(${variable}) in the command-line examples that follow to highlight where local site inputs are required.
A summary of the required and optional variables used throughout this recipe is presented below. Note
that while this recipe targets a small 4-node compute subsystem, the compute
parameters are defined in array format to accommodate logical extension to larger node counts.


• ${sms_name}          # Hostname for SMS server
• ${sms_ip}            # Internal IP address on SMS server
• ${domain_name}       # Local network domain name
• ${sms_eth_internal}  # Internal Ethernet interface on SMS
• ${internal_netmask}  # Subnet netmask for internal network
• ${ntp_server}        # Local ntp server for time synchronization
• ${bmc_username}      # BMC username for use by IPMI
• ${bmc_password}      # BMC password for use by IPMI
• ${num_computes}      # Total # of desired compute nodes
• ${c_ip[0]}, ${c_ip[1]}, ...       # Desired compute node addresses
• ${c_bmc[0]}, ${c_bmc[1]}, ...     # BMC addresses for computes
• ${c_mac[0]}, ${c_mac[1]}, ...     # MAC addresses for computes
• ${c_name[0]}, ${c_name[1]}, ...   # Host names for computes
• ${compute_regex}     # Regex matching all compute node names (e.g. “c*”)
• ${compute_prefix}    # Prefix for compute node names (e.g. “c”)
• ${iso_path}          # Directory location of OS iso for xCAT install
• ${synclist}          # Filesystem location of xCAT synclist file

Optional:

• ${sysmgmtd_host}     # BeeGFS System Management host name
• ${mgs_fs_name}       # Lustre MGS mount name
• ${sms_ipoib}         # IPoIB address for SMS server
• ${ipoib_netmask}     # Subnet netmask for internal IPoIB
• ${c_ipoib[0]}, ${c_ipoib[1]}, ... # IPoIB addresses for computes
• ${nagios_web_password} # Nagios web access password

2 Install Base Operating System (BOS)


In an external setting, installing the desired BOS on a master SMS host typically involves booting from a
DVD ISO image on a new server. With this approach, insert the CentOS7.5 DVD, power cycle the host, and
follow the distro provided directions to install the BOS on your chosen master host. Alternatively, if choos-
ing to use a pre-installed server, please verify that it is provisioned with the required CentOS7.5 distribution.

Prior to beginning the installation process of OpenHPC components, several additional considerations
are noted here for the SMS host configuration. First, the installation recipe herein assumes that the SMS
host name is resolvable locally. Depending on the manner in which you installed the BOS, there may be an
adequate entry already defined in /etc/hosts. If not, the following addition can be used to identify your
SMS host.

[sms]# echo ${sms_ip} ${sms_name} >> /etc/hosts

While it is theoretically possible to enable SELinux on a cluster provisioned with xCAT, doing so is
beyond the scope of this document. Even the use of permissive mode can be problematic and we therefore
recommend disabling SELinux on the master SMS host. If SELinux components are installed locally, the
selinuxenabled command can be used to determine if SELinux is currently enabled. If enabled, consult
the distro documentation for information on how to disable.
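
For reference, the following is a minimal sketch of checking for and persistently disabling SELinux, assuming
the standard /etc/selinux/config location (a reboot is required for the change to take effect):

# Check whether SELinux is currently enabled
[sms]# selinuxenabled && echo "SELinux enabled" || echo "SELinux disabled"

# Persistently disable SELinux (takes effect after the next reboot)
[sms]# perl -pi -e "s/^SELINUX=.*/SELINUX=disabled/" /etc/selinux/config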

Finally, provisioning services rely on DHCP, TFTP, and HTTP network protocols. Depending on the
local BOS configuration on the SMS host, default firewall rules may prohibit these services. Consequently,


this recipe assumes that the local firewall running on the SMS host is disabled. If installed, the default
firewall service can be disabled as follows:

[sms]# systemctl disable firewalld


[sms]# systemctl stop firewalld

3 Install OpenHPC Components


With the BOS installed and booted, the next step is to add desired OpenHPC packages onto the master
server in order to provide provisioning and resource management services for the rest of the cluster. The
following subsections highlight this process.

3.1 Enable OpenHPC repository for local use


To begin, enable use of the OpenHPC repository by adding it to the local list of available package repositories.
Note that this requires network access from your master server to the OpenHPC repository, or alternatively,
that the OpenHPC repository be mirrored locally. In cases where network external connectivity is available,
OpenHPC provides an ohpc-release package that includes GPG keys for package signing and repository
enablement. The example which follows illustrates installation of the ohpc-release package directly from
the OpenHPC build server.

[sms]# yum install http://build.openhpc.community/OpenHPC:/1.3/CentOS_7/x86_64/ohpc-release-1.3-1.el7.x86_64.rpm

Tip

Many sites may find it useful or necessary to maintain a local copy of the OpenHPC repositories. To facilitate
this need, standalone tar archives are provided – one containing a repository of binary packages as well as any
available updates, and one containing a repository of source RPMS. The tar files also contain a simple bash
script to configure the package manager to use the local repository after download. To use, simply unpack
the tarball where you would like to host the local repository and execute the make_repo.sh script. Tar files
for this release can be found at http://build.openhpc.community/dist/1.3.5

3.2 Enable xCAT repository for local use


Next, enable use of the public xCAT repository by adding it to the local list of available package repositories.
This also requires network access from your master server to the internet, or alternatively, that the repository
be mirrored locally. In this case, we highlight network enablement by downloading the latest xCAT repo
file.

[sms]# yum -y install yum-utils


[sms]# wget -P /etc/yum.repos.d https://xcat.org/files/xcat/repos/yum/latest/xcat-core/xcat-core.repo

xCAT has a number of installation dependencies that are housed in separate public
repositories for various distributions. To enable them for local use, issue the following:

[sms]# wget -P /etc/yum.repos.d https://xcat.org/files/xcat/repos/yum/xcat-dep/rh7/x86_64/xcat-dep.repo


In addition to the OpenHPC and xCAT package repositories, the master host also requires access to
the standard base OS distro repositories in order to resolve necessary dependencies. For CentOS7.5, the
requirements are to have access to both the base OS and EPEL repositories for which mirrors are freely
available online:

• CentOS-7 - Base 7.5.1804 (e.g. http://mirror.centos.org/centos-7/7/os/x86_64)

• EPEL 7 (e.g. http://download.fedoraproject.org/pub/epel/7/x86_64)

The public EPEL repository will be enabled automatically upon installation of the ohpc-release package.
Note that this requires the CentOS Extras repository, which is shipped with CentOS and is enabled by
default.
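
As an optional sanity check, the package manager can be queried to confirm that the OpenHPC, xCAT,
base OS, and EPEL repositories are all enabled (output will vary by site):

# List enabled repositories
[sms]# yum repolist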

3.3 Installation template


The collection of command-line instructions that follow in this guide, when combined with local site inputs,
can be used to implement a bare-metal system installation and configuration. The format of these com-
mands is intended to be usable via direct cut and paste (with variable substitution for site-specific settings).
Alternatively, the OpenHPC documentation package (docs-ohpc) includes a template script which includes
a summary of all of the commands used herein. This script can be used in conjunction with a simple text
file to define the local site variables defined in the previous section (§ 1.3) and is provided as a convenience
for administrators. For additional information on accessing this script, please see Appendix A.
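
A brief sketch of obtaining the template script follows; the recipe location shown is an assumption based on
the usual docs-ohpc layout under /opt/ohpc/pub/doc/recipes and may differ slightly between releases:

# Install documentation package containing the template script
[sms]# yum -y install docs-ohpc

# Locate the recipe.sh template corresponding to this guide
[sms]# find /opt/ohpc/pub/doc/recipes -name recipe.sh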

3.4 Add provisioning services on master node


With the OpenHPC and xCAT repositories enabled, we can now begin adding desired components onto
the master server. This repository provides a number of aliases that group logical components together in
order to help aid in this process. For reference, a complete list of available group aliases and RPM packages
available via OpenHPC are provided in Appendix E. To add support for provisioning services, the following
commands illustrate addition of a common base package followed by the xCAT provisioning system.

# Install base meta-package


[sms]# yum -y install ohpc-base

[sms]# yum -y install xCAT

# enable xCAT tools for use in current shell


[sms]# . /etc/profile.d/xcat.sh

Tip

Many server BIOS configurations have PXE network booting configured as the primary option in the boot
order by default. If your compute nodes have a different device as the first in the sequence, the ipmitool
utility can be used to enable PXE.

[sms]# ipmitool -E -I lanplus -H ${bmc_ipaddr} -U root chassis bootdev pxe options=persistent

HPC systems rely on synchronized clocks throughout the system and the NTP protocol can be used to
facilitate this synchronization. To enable NTP services on the SMS host with a specific server ${ntp_server},
issue the following:


[sms]# systemctl enable ntpd.service


[sms]# echo "server ${ntp_server}" >> /etc/ntp.conf
[sms]# systemctl restart ntpd

3.5 Add resource management services on master node


OpenHPC provides multiple options for distributed resource management. The following command adds the
Slurm workload manager server components to the chosen master host. Note that client-side components
will be added to the corresponding compute image in a subsequent step.

# Install slurm server meta-package


[sms]# yum -y install ohpc-slurm-server

# Identify resource manager hostname on master host


[sms]# perl -pi -e "s/ControlMachine=\S+/ControlMachine=${sms_name}/" /etc/slurm/slurm.conf

Tip

SLURM requires enumeration of the physical hardware characteristics for compute nodes under its control.
In particular, three configuration parameters combine to define consumable compute resources: Sockets,
CoresPerSocket, and ThreadsPerCore. The default configuration file provided via OpenHPC assumes dual-
socket, 8 cores per socket, and two threads per core for this 4-node example. If this does not reflect your
local hardware, please update the configuration file at /etc/slurm/slurm.conf accordingly to match your
particular hardware. Note that the SLURM project provides an easy-to-use online configuration tool that
can be accessed at https://slurm.schedmd.com/configurator.html.
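
For illustration, a hedged sketch of the relevant slurm.conf entries for this 4-node example is shown below;
the node names, counts, and partition settings are assumptions and should be adjusted to match local
hardware:

# Example slurm.conf node and partition definitions (illustrative)
NodeName=c[1-4] Sockets=2 CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN
PartitionName=normal Nodes=c[1-4] Default=YES MaxTime=24:00:00 State=UP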

Other versions of this guide are available that describe installation of alternate resource management
systems, and they can be found in the docs-ohpc package.

3.6 Optionally add InfiniBand support services on master node


The following command adds OFED and PSM support using base distro-provided drivers to the chosen
master host.

[sms]# yum -y groupinstall "InfiniBand Support"


[sms]# yum -y install infinipath-psm

# Load IB drivers
[sms]# systemctl start rdma

Tip

InfiniBand networks require a subnet management service that can typically be run on either an
administrative node, or on the switch itself. The optimal placement and configuration of the subnet
manager is beyond the scope of this document, but CentOS7.5 provides the opensm package should
you choose to run it on the master node.


With the InfiniBand drivers included, you can also enable (optional) IPoIB functionality which provides
a mechanism to send IP packets over the IB network. If you plan to mount a Lustre file system over
InfiniBand (see §3.9.4.3 for additional details), then having IPoIB enabled is a requirement for the Lustre
client. OpenHPC provides a template configuration file to aid in setting up an ib0 interface on the master
host. To use, copy the template provided and update the ${sms_ipoib} and ${ipoib_netmask} entries to
match local desired settings (alter ib0 naming as appropriate if system contains dual-ported or multiple
HCAs).

[sms]# cp /opt/ohpc/pub/examples/network/centos/ifcfg-ib0 /etc/sysconfig/network-scripts

# Define local IPoIB address and netmask


[sms]# perl -pi -e "s/master_ipoib/${sms_ipoib}/" /etc/sysconfig/network-scripts/ifcfg-ib0
[sms]# perl -pi -e "s/ipoib_netmask/${ipoib_netmask}/" /etc/sysconfig/network-scripts/ifcfg-ib0

# Initiate ib0
[sms]# ifup ib0

3.7 Optionally add Omni-Path support services on master node


The following command adds Omni-Path support using base distro-provided drivers to the chosen master
host.

[sms]# yum -y install opa-basic-tools

# Load RDMA services


[sms]# systemctl start rdma

Tip

Omni-Path networks require a subnet management service that can typically be run on either an
administrative node, or on the switch itself. The optimal placement and configuration of the subnet
manager is beyond the scope of this document, but CentOS7.5 provides the opa-fm package should
you choose to run it on the master node.

3.8 Complete basic xCAT setup for master node


At this point, all of the packages necessary to use xCAT on the master host should be installed. Next, we
enable support for local provisioning using a second private interface (refer to Figure 1) and register this
network interface with xCAT.

# Enable internal interface for provisioning


[sms]# ifconfig ${sms_eth_internal} ${sms_ip} netmask ${internal_netmask} up

# Register internal provisioning interface with xCAT for DHCP


[sms]# chdef -t site dhcpinterfaces="xcatmn|${sms_eth_internal}"


3.9 Define compute image for provisioning


With the provisioning services enabled, the next step is to define and customize a system image that can
subsequently be used to provision one or more compute nodes. The following subsections highlight this
process.

3.9.1 Build initial BOS image


The following steps illustrate the process to build a minimal, default image for use with xCAT. To begin,
you will first need to have a local copy of the ISO image available for the underlying OS. In this recipe, the
relevant ISO image is CentOS-7-x86_64-DVD-1804.iso (available from the CentOS mirrors). We initialize
the image creation process using the copycds command assuming that the necessary ISO image is available
locally in ${iso_path} as follows:

[sms]# copycds ${iso_path}/CentOS-7-x86_64-DVD-1804.iso

Once completed, several OS images should be available for use within xCAT. These can be queried via:

# Query available images


[sms]# lsdef -t osimage
centos7.5-x86_64-install-compute (osimage)
centos7.5-x86_64-netboot-compute (osimage)
centos7.5-x86_64-statelite-compute (osimage)

In this example, we leverage the stateless (netboot) image for compute nodes and proceed by using
genimage to initialize a chroot-based install. Note that the previous query highlights the existence of other
provisioning images as well. OpenHPC also provides a stateful xCAT recipe; follow that guide if you are
interested in a stateful install. Statelite installation is an intermediate type of install, in which a limited number of files
and directories persist across reboots. Please consult available xCAT documentation if interested in this
type of install.

# Save chroot location for compute image


[sms]# export CHROOT=/install/netboot/centos7.5/x86_64/compute/rootimg/
# Build initial chroot image
[sms]# genimage centos7.5-x86_64-netboot-compute

3.9.2 Add OpenHPC components


The genimage process used in the previous step is designed to provide a minimal CentOS7.5 configuration.
Next, we add additional components to include resource management client services, InfiniBand drivers,
and other additional packages to support the default OpenHPC environment. This process augments the
chroot-based install performed by genimage to modify the base provisioning image and requires access to the
BOS and OpenHPC repositories to resolve package install requests. The following steps are used to first
enable the necessary package repositories for use within the chroot.

[sms]# yum-config-manager --installroot=$CHROOT --enable base


[sms]# cp /etc/yum.repos.d/OpenHPC.repo $CHROOT/etc/yum.repos.d
[sms]# cp /etc/yum.repos.d/epel.repo $CHROOT/etc/yum.repos.d

Next, install a base compute package:

# Install compute node base meta-package


[sms]# yum -y --installroot=$CHROOT install ohpc-base-compute


Now, we can include additional components to the compute instance using yum to install into the chroot
location defined previously:

# Add Slurm client support meta-package


[sms]# yum -y --installroot=$CHROOT install ohpc-slurm-client

# Add Network Time Protocol (NTP) support


[sms]# yum -y --installroot=$CHROOT install ntp

# Add kernel drivers


[sms]# yum -y --installroot=$CHROOT install kernel

# Include modules user environment


[sms]# yum -y --installroot=$CHROOT install lmod-ohpc

# Optionally add IB support and enable


[sms]# yum -y --installroot=$CHROOT groupinstall "InfiniBand Support"
[sms]# yum -y --installroot=$CHROOT install infinipath-psm
[sms]# chroot $CHROOT systemctl enable rdma

3.9.3 Customize system configuration


Prior to assembling the image, it is advantageous to perform any additional customization within the chroot
environment created for the desired compute instance. The following steps document the process to identify
the resource manager server, configure NTP for compute resources, and enable NFS mounting of a $HOME
file system and the public OpenHPC install path (/opt/ohpc/pub) that will be hosted by the master host
in this example configuration.

# Add NFS client mounts of /home and /opt/ohpc/pub to base image


[sms]# echo "${sms_ip}:/home /home nfs nfsvers=3,nodev,nosuid,noatime 0 0" >> $CHROOT/etc/fstab
[sms]# echo "${sms_ip}:/opt/ohpc/pub /opt/ohpc/pub nfs nfsvers=3,nodev,noatime 0 0" >> $CHROOT/etc/fstab

# Export /home and OpenHPC public packages from master server


[sms]# echo "/home *(rw,no_subtree_check,fsid=10,no_root_squash)" >> /etc/exports
[sms]# echo "/opt/ohpc/pub *(ro,no_subtree_check,fsid=11)" >> /etc/exports
[sms]# exportfs -a
[sms]# systemctl restart nfs-server
[sms]# systemctl enable nfs-server

# Enable NTP time service on computes and identify master host as local NTP server
[sms]# chroot $CHROOT systemctl enable ntpd
[sms]# echo "server ${sms_ip}" >> $CHROOT/etc/ntp.conf

3.9.4 Additional Customization (optional)


This section highlights common additional customizations that can optionally be applied to the local cluster
environment. These customizations include:

• Add InfiniBand or Omni-Path drivers
• Increase memlock limits
• Restrict ssh access to compute resources
• Add BeeGFS client
• Add Lustre client
• Add Nagios Core monitoring
• Add Ganglia monitoring
• Add Sensys monitoring
• Add ClusterShell
• Add mrsh
• Add genders
• Add ConMan


Details on the steps required for each of these customizations are discussed further in the following sections.

3.9.4.1 Increase locked memory limits In order to utilize InfiniBand or Omni-Path as the underlying
high speed interconnect, it is generally necessary to increase the locked memory settings for system users.
This can be accomplished by updating the /etc/security/limits.conf file and this should be performed
within the compute image and on all job submission hosts. In this recipe, jobs are submitted from the master
host, and the following commands can be used to update the maximum locked memory settings on both the
master host and the compute image:

# Update memlock settings on master


[sms]# perl -pi -e 's/# End of file/\* soft memlock unlimited\n$&/s' /etc/security/limits.conf
[sms]# perl -pi -e 's/# End of file/\* hard memlock unlimited\n$&/s' /etc/security/limits.conf

# Update memlock settings within compute image


[sms]# perl -pi -e 's/# End of file/\* soft memlock unlimited\n$&/s' $CHROOT/etc/security/limits.conf
[sms]# perl -pi -e 's/# End of file/\* hard memlock unlimited\n$&/s' $CHROOT/etc/security/limits.conf

3.9.4.2 Enable ssh control via resource manager An additional optional customization that is
recommended is to restrict ssh access on compute nodes to only allow access by users who have an active
job associated with the node. This can be enabled via the use of a pluggable authentication module (PAM)
provided as part of the Slurm package installs. To enable this feature within the compute image, issue the
following:

[sms]# echo "account required pam_slurm.so" >> $CHROOT/etc/pam.d/sshd

3.9.4.3 Add Lustre client To add Lustre client support on the cluster, it is necessary to install the client
and associated modules on each host needing to access a Lustre file system. In this recipe, it is assumed
that the Lustre file system is hosted by servers that are pre-existing and are not part of the install process.
Outlining the variety of Lustre client mounting options is beyond the scope of this document, but the general
requirement is to add a mount entry for the desired file system that defines the management server (MGS)
and underlying network transport protocol. To add client mounts on both the master server and compute
image, the following commands can be used. Note that the Lustre file system to be mounted is identified
by the ${mgs_fs_name} variable. In this example, the file system is configured to be mounted locally as
/mnt/lustre.

# Add Lustre client software to master host


[sms]# yum -y install lustre-client-ohpc

# Include Lustre client software in compute image


[sms]# yum -y --installroot=$CHROOT install lustre-client-ohpc

# Include mount point and file system mount in compute image


[sms]# mkdir $CHROOT/mnt/lustre
[sms]# echo "${mgs_fs_name} /mnt/lustre lustre defaults,localflock,noauto,x-systemd.automount 0 0" \
>> $CHROOT/etc/fstab


Tip

The suggested mount options shown for Lustre leverage the localflock option. This is a Lustre-specific
setting that enables client-local flock support. It is much faster than cluster-wide flock, but if you have an
application requiring cluster-wide, coherent file locks, use the standard flock attribute instead.

The default underlying network type used by Lustre is tcp. If your external Lustre file system is to be
mounted using a network type other than tcp, additional configuration files are necessary to identify the de-
sired network type. The example below illustrates creation of modprobe configuration files instructing Lustre
to use an InfiniBand network with the o2ib LNET driver attached to ib0. Note that these modifications
are made to both the master host and compute image.

[sms]# echo "options lnet networks=o2ib(ib0)" >> /etc/modprobe.d/lustre.conf


[sms]# echo "options lnet networks=o2ib(ib0)" >> $CHROOT/etc/modprobe.d/lustre.conf

With the Lustre configuration complete, the client can be mounted on the master host as follows:

[sms]# mkdir /mnt/lustre


[sms]# mount -t lustre -o localflock ${mgs_fs_name} /mnt/lustre

3.9.4.4 Add Nagios monitoring Nagios is an open source infrastructure monitoring package that
monitors servers, switches, applications, and services and offers user-defined alerting facilities. As provided
by OpenHPC, it consists of a base monitoring daemon and a set of plug-ins for monitoring various aspects
of an HPC cluster. The following commands can be used to install and configure a Nagios server on the
master node, and add the facility to run tests and gather metrics from provisioned compute nodes.

# Install Nagios meta-package on master host


[sms]# yum -y install ohpc-nagios

# Install plugins into compute node image


[sms]# yum -y --installroot=$CHROOT install nagios-plugins-all-ohpc nrpe-ohpc

# Enable and configure NRPE in compute image


[sms]# chroot $CHROOT systemctl enable nrpe
[sms]# perl -pi -e "s/^allowed_hosts=/# allowed_hosts=/" $CHROOT/etc/nagios/nrpe.cfg
[sms]# echo "nrpe 5666/tcp # NRPE" >> $CHROOT/etc/services
[sms]# echo "nrpe : ${sms_ip} : ALLOW" >> $CHROOT/etc/hosts.allow
[sms]# echo "nrpe : ALL : DENY" >> $CHROOT/etc/hosts.allow
[sms]# chroot $CHROOT /usr/sbin/groupadd -r nrpe
[sms]# chroot $CHROOT /usr/sbin/useradd -c "NRPE user for the NRPE service" -d /var/run/nrpe \
-r -g nrpe -s /sbin/nologin nrpe

# Configure remote services to test on compute nodes


[sms]# mv /etc/nagios/conf.d/services.cfg.example /etc/nagios/conf.d/services.cfg

# Define compute nodes as hosts to monitor


[sms]# mv /etc/nagios/conf.d/hosts.cfg.example /etc/nagios/conf.d/hosts.cfg
[sms]# for ((i=0; i<$num_computes; i++)) ; do
perl -pi -e "s/HOSTNAME$(($i+1))/${c_name[$i]}/ || s/HOST$(($i+1))_IP/${c_ip[$i]}/" \
/etc/nagios/conf.d/hosts.cfg
done

# Update location of mail binary for alert commands


[sms]# perl -pi -e "s/ \/bin\/mail/ \/usr\/bin\/mailx/g" /etc/nagios/objects/commands.cfg


# Update email address of contact for alerts


[sms]# perl -pi -e "s/nagios\@localhost/root\@${sms_name}/" /etc/nagios/objects/contacts.cfg

# Add check_ssh command for remote hosts


[sms]# echo "command[check_ssh]=/usr/lib64/nagios/plugins/check_ssh localhost" \
>> $CHROOT/etc/nagios/nrpe.cfg

# define password for nagiosadmin to be able to connect to web interface


[sms]# htpasswd -bc /etc/nagios/passwd nagiosadmin ${nagios_web_password}

# Enable Nagios on master, and configure


[sms]# chkconfig nagios on
[sms]# systemctl start nagios
[sms]# chmod u+s `which ping`

3.9.4.5 Add Ganglia monitoring Ganglia is a scalable distributed system monitoring tool for high-
performance computing systems such as clusters and grids. It allows the user to remotely view live or
historical statistics (such as CPU load averages or network utilization) for all machines running the gmond
daemon. The following commands can be used to enable Ganglia to monitor both the master and compute
hosts.

# Install Ganglia meta-package on master


[sms]# yum -y install ohpc-ganglia

# Install Ganglia compute node daemon


[sms]# yum -y --installroot=$CHROOT install ganglia-gmond-ohpc

# Use example configuration script to enable unicast receiver on master host


[sms]# cp /opt/ohpc/pub/examples/ganglia/gmond.conf /etc/ganglia/gmond.conf
[sms]# perl -pi -e "s/<sms>/${sms_name}/" /etc/ganglia/gmond.conf

# Add configuration to compute image and provide gridname


[sms]# cp /etc/ganglia/gmond.conf $CHROOT/etc/ganglia/gmond.conf
[sms]# echo "gridname MySite" >> /etc/ganglia/gmetad.conf

# Start and enable Ganglia services


[sms]# systemctl enable gmond
[sms]# systemctl enable gmetad
[sms]# systemctl start gmond
[sms]# systemctl start gmetad
[sms]# chroot $CHROOT systemctl enable gmond

# Restart web server


[sms]# systemctl try-restart httpd

Once enabled and running, Ganglia should provide access to a web-based monitoring console on the master
host. Read access to monitoring metrics will be enabled by default and can be accessed via a web browser.
When running a web browser directly on the master host, the Ganglia top-level overview is available at
http://localhost/ganglia. When accessing remotely, replace localhost with the chosen name of your master
host (${sms_name}).


3.9.4.6 Add ClusterShell ClusterShell is an event-based Python library to execute commands in par-
allel across cluster nodes. Installation and basic configuration defining three node groups (adm, compute,
and all) is as follows:

# Install ClusterShell
[sms]# yum -y install clustershell-ohpc

# Setup node definitions


[sms]# cd /etc/clustershell/groups.d
[sms]# mv local.cfg local.cfg.orig
[sms]# echo "adm: ${sms_name}" > local.cfg
[sms]# echo "compute: ${compute_prefix}[1-${num_computes}]" >> local.cfg
[sms]# echo "all: @adm,@compute" >> local.cfg

3.9.4.7 Add mrsh mrsh is a secure remote shell utility, like ssh, which uses munge for authentication
and encryption. By using the munge installation used by Slurm, mrsh provides shell access to systems using
the same munge key without having to track ssh keys. Like ssh, mrsh provides a remote copy command,
mrcp, and can be used as an rcmd by pdsh. Example installation and configuration is as follows:

# Install mrsh
[sms]# yum -y install mrsh-ohpc mrsh-rsh-compat-ohpc
[sms]# yum -y --installroot=$CHROOT install mrsh-ohpc mrsh-rsh-compat-ohpc mrsh-server-ohpc

# Identify mshell and mlogin in services file


[sms]# echo "mshell 21212/tcp # mrshd" >> /etc/services
[sms]# echo "mlogin 541/tcp # mrlogind" >> /etc/services

# Enable xinetd in compute node image


[sms]# chroot $CHROOT systemctl enable xinetd
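
Once the compute nodes are provisioned and munge is running cluster-wide, a hedged usage example follows
(c1 is an illustrative node name, and the pdsh invocation assumes its mrsh rcmd module is available):

# Run a command on a compute node via mrsh
[sms]# mrsh c1 hostname

# Use mrsh as the remote command layer for pdsh
[sms]# pdsh -R mrsh -w c[1-4] uptime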

3.9.4.8 Add genders genders is a static cluster configuration database or node typing database used
for cluster configuration management. Other tools and users can access the genders database in order to
make decisions about where an action, or even what action, is appropriate based on associated types or
“genders”. Values may also be assigned to and retrieved from a gender to provide further granularity. The
following example highlights installation and configuration of two genders: compute and bmc.

# Install genders
[sms]# yum -y install genders-ohpc

# Generate a sample genders file


[sms]# echo -e "${sms_name}\tsms" > /etc/genders
[sms]# for ((i=0; i<$num_computes; i++)) ; do
echo -e "${c_name[$i]}\tcompute,bmc=${c_bmc[$i]}"
done >> /etc/genders
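
With the genders file in place, the nodeattr query utility shipped with genders can be used by tools and
administrators; a brief illustrative example:

# List all nodes carrying the compute gender
[sms]# nodeattr -n compute

# List all nodes that have a bmc value assigned
[sms]# nodeattr -n bmc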

3.9.4.9 Add ConMan ConMan is a serial console management program designed to support a large
number of console devices and simultaneous users. It supports logging console device output and connecting
to compute node consoles via IPMI serial-over-lan. Installation and example configuration is outlined below.

# Install conman to provide a front-end to compute consoles and log output


[sms]# yum -y install conman-ohpc

# Configure conman for computes (note your IPMI password is required for console access)


[sms]# for ((i=0; i<$num_computes; i++)) ; do


echo -n 'CONSOLE name="'${c_name[$i]}'" dev="ipmi:'${c_bmc[$i]}'" '
echo 'ipmiopts="'U:${bmc_username},P:${IPMI_PASSWORD:-undefined},W:solpayloadsize'"'
done >> /etc/conman.conf

# Enable and start conman


[sms]# systemctl enable conman
[sms]# systemctl start conman

Note that additional options are typically necessary to enable serial console output. These are set up during
the node registration process in §3.11.
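
Once configured, a console can be attached interactively; a hedged example for the first compute host:

# Connect to the serial console of node c1 (illustrative node name)
[sms]# conman c1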

3.9.4.10 Add NHC Resource managers provide for a periodic “node health check” to be performed
on each compute node to verify that the node is working properly. Nodes which are determined to be
“unhealthy” can be marked as down or offline so as to prevent jobs from being scheduled or run on them.
This helps increase the reliability and throughput of a cluster by reducing preventable job failures due to
misconfiguration, hardware failure, etc. OpenHPC distributes NHC to fulfill this requirement.
In a typical scenario, the NHC driver script is run periodically on each compute node by the resource
manager client daemon. It loads its configuration file to determine which checks are to be run on the current
node (based on its hostname). Each matching check is run, and if a failure is encountered, NHC will exit
with an error message describing the problem. It can also be configured to mark nodes offline so that the
scheduler will not assign jobs to bad nodes, reducing the risk of system-induced job failures.

# Install NHC on master and compute nodes


[sms]# yum -y install nhc-ohpc
[sms]# yum -y --installroot=$CHROOT install nhc-ohpc

# Register as SLURM's health check program


[sms]# echo "HealthCheckProgram=/usr/sbin/nhc" >> /etc/slurm/slurm.conf
[sms]# echo "HealthCheckInterval=300" >> /etc/slurm/slurm.conf # execute every five minutes

3.9.5 Identify files for synchronization


The xCAT system includes functionality to synchronize files located on the SMS server for distribution to
managed hosts. This is one way to distribute user credentials to compute nodes (alternatively, you may
prefer to use a central authentication service like LDAP). To import local file-based credentials, issue the
following to enable the synclist feature and register user credential files:

# Define path for xCAT synclist file


[sms]# mkdir -p /install/custom/netboot
[sms]# chdef -t osimage -o centos7.5-x86_64-netboot-compute synclists="/install/custom/netboot/compute.synclist"

# Add desired credential files to synclist


[sms]# echo "/etc/passwd -> /etc/passwd" > /install/custom/netboot/compute.synclist
[sms]# echo "/etc/group -> /etc/group" >> /install/custom/netboot/compute.synclist
[sms]# echo "/etc/shadow -> /etc/shadow" >> /install/custom/netboot/compute.synclist

Similarly, to import the global Slurm configuration file and the cryptographic key that is required by the
munge authentication library to be available on every host in the resource management pool, issue the
following:

[sms]# echo "/etc/slurm/slurm.conf -> /etc/slurm/slurm.conf" >> /install/custom/netboot/compute.synclist
[sms]# echo "/etc/munge/munge.key -> /etc/munge/munge.key" >> /install/custom/netboot/compute.synclist


Tip

The “updatenode compute -F” command can be used to distribute changes made to any defined synchro-
nization files on the SMS host. Users wishing to automate this process may want to consider adding a crontab
entry to perform this action at defined intervals.
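
For example, a hedged sketch of such an entry (the interval, cron.d file name, and xCAT binary path are
illustrative):

# Re-sync credential/config files to the compute group every 15 minutes
[sms]# echo "*/15 * * * * root /opt/xcat/bin/updatenode compute -F > /dev/null 2>&1" >> /etc/cron.d/xcat-synclist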

3.10 Finalizing provisioning configuration


To finalize the xCAT provisioning configuration, this section first highlights packing of the stateless image
from the chroot environment followed by the registration of desired compute nodes. To assemble the final
compute image use packimage as follows:

[sms]# packimage centos7.5-x86_64-netboot-compute

3.11 Add compute nodes into xCAT database


Next, we add compute nodes and define their properties as objects in the xCAT database. These hosts are
grouped logically into a group named compute to facilitate group-level commands used later in the recipe.
Note the use of variable names for the desired compute hostnames, node IPs, MAC addresses, and BMC
login credentials, which should be modified to accommodate local settings and hardware. To enable serial
console access via xCAT, serialport and serialspeed properties are also defined.

# Define nodes as objects in xCAT database


[sms]# for ((i=0; i<$num_computes; i++)) ; do
mkdef -t node ${c_name[$i]} groups=compute,all ip=${c_ip[$i]} mac=${c_mac[$i]} netboot=xnba \
arch=x86_64 bmc=${c_bmc[$i]} bmcusername=${bmc_username} bmcpassword=${bmc_password} \
mgt=ipmi serialport=0 serialspeed=115200
done

Tip

Defining nodes one-by-one, as done above, is only efficient for a small number of nodes. For larger node
counts, xCAT provides capabilities for automated detection and configuration. Consult the xCAT Hardware
Discovery & Define Node Guide. Alternatively, confluent, a tool related to xCAT, also has robust discovery
capabilities and can be used to detect and auto-configure compute hosts.

xCAT requires a network domain name specification for system-wide name resolution. This value can be set
to match your local DNS schema or given a unique identifier such as “local”. In this recipe, we leverage the
${domain_name} variable to define it as follows:

[sms]# chdef -t site domain=${domain_name}

If enabling optional IPoIB functionality (e.g. to support Lustre over InfiniBand), additional settings are
required to define the IPoIB network with xCAT and specify desired IP settings for each compute. This can
be accomplished as follows for the ib0 interface:

# Define ib0 netmask


[sms]# chdef -t network -o ib0 mask=$ipoib_netmask net=${c_ipoib[0]}


# Enable secondary NIC configuration


[sms]# chdef compute -p postbootscripts=confignics

# Register desired IPoIB IPs per compute


[sms]# for ((i=0; i<$num_computes; i++)) ; do
chdef ${c_name[i]} nicips.ib0=${c_ipoib[i]} nictypes.ib0="InfiniBand" nicnetworks.ib0=ib0
done

With the desired compute nodes and domain identified, the remaining steps in the provisioning configura-
tion process are to define the provisioning mode and image for the compute group and use xCAT commands
to complete configuration for network services like DNS and DHCP. These tasks are accomplished as follows:

# Complete network service configurations


[sms]# makehosts
[sms]# makenetworks
[sms]# makedhcp -n
[sms]# makedns -n

# Associate desired provisioning image for computes


[sms]# nodeset compute osimage=centos7.5-x86_64-netboot-compute

3.12 Boot compute nodes


Prior to booting the compute hosts, we configure them to use PXE as their next boot mode. After the initial
PXE, ensuing boots will return to using the default boot device specified in the BIOS.

[sms]# rsetboot compute net

At this point, the master server should be able to boot the newly defined compute nodes. This is done by
using the rpower xCAT command leveraging the IPMI protocol set up during the compute node definition
in § 3.11. The following power cycles each of the desired hosts.

[sms]# rpower compute reset

Once kicked off, the boot process should take about 5 minutes (depending on BIOS post times). You
can monitor the provisioning by using the rcons command, which displays the serial console for a selected node.
Note that the escape sequence is CTRL-e c . typed sequentially.
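
For example, to watch the console of the first compute host defined earlier (detach again with the escape
sequence noted above):

# Attach to the serial console of the first compute node
[sms]# rcons ${c_name[0]}
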
Successful provisioning can be verified by a parallel command on the compute nodes. The default install
provides two such tools: the xCAT-provided psh command, which uses xCAT node names and groups, and pdsh,
which works independently. For example, to run a command on the newly imaged compute hosts using psh,
execute the following:

[sms]# psh compute uptime


c1: 12:56:50 up 14 min, 0 users, load average: 0.00, 0.01, 0.04
c2: 12:56:50 up 13 min, 0 users, load average: 0.00, 0.02, 0.05
c3: 12:56:50 up 14 min, 0 users, load average: 0.00, 0.02, 0.05
c4: 12:56:50 up 14 min, 0 users, load average: 0.00, 0.01, 0.04

Note that the equivalent pdsh command is pdsh -w c[1-4] uptime.


4 Install OpenHPC Development Components


The install procedure outlined in §3 highlighted the steps necessary to install a master host, assemble
and customize a compute image, and provision several compute hosts from bare-metal. With these steps
completed, additional OpenHPC-provided packages can now be added to support a flexible HPC development
environment including development tools, C/C++/Fortran compilers, MPI stacks, and a variety of 3rd party
libraries. The following subsections highlight the additional software installation procedures.

4.1 Development Tools


To aid in general development efforts, OpenHPC provides recent versions of the GNU autotools collection,
the Valgrind memory debugger, EasyBuild, and Spack. These can be installed as follows:

# Install autotools meta-package


[sms]# yum -y install ohpc-autotools

[sms]# yum -y install EasyBuild-ohpc


[sms]# yum -y install hwloc-ohpc
[sms]# yum -y install spack-ohpc
[sms]# yum -y install valgrind-ohpc

4.2 Compilers
OpenHPC presently packages the GNU compiler toolchain integrated with the underlying modules-environment
system in a hierarchical fashion. The modules system will conditionally present compiler-dependent software
based on the toolchain currently loaded.

[sms]# yum -y install gnu7-compilers-ohpc

The llvm compiler toolchains are also provided as a standalone additional compiler family (i.e., users can
easily switch between gcc/clang environments), but we do not provide the full complement of downstream
library builds.

[sms]# yum -y install llvm5-compilers-ohpc

4.3 MPI Stacks


For MPI development and runtime support, OpenHPC provides pre-packaged builds for a variety of MPI
families and transport layers. Currently available options and their applicability to various network trans-
ports are summarized in Table 1. The command that follows installs a starting set of MPI families that are
compatible with ethernet fabrics.

Table 1: Available MPI variants

                        Ethernet (TCP)   InfiniBand   Intel® Omni-Path
    MPICH                     X
    MVAPICH2                                  X
    MVAPICH2 (psm2)                                            X
    OpenMPI                   X              X                 X
    OpenMPI (PMIx)            X              X                 X


[sms]# yum -y install openmpi3-gnu7-ohpc mpich-gnu7-ohpc

If your system includes InfiniBand and you enabled underlying support in §3.6 and §3.9.4, an additional
MVAPICH2 family is available for use:

[sms]# yum -y install mvapich2-gnu7-ohpc

Alternatively, if your system includes Intel® Omni-Path, use the (psm2) variant of MVAPICH2 instead:

[sms]# yum -y install mvapich2-psm2-gnu7-ohpc

An additional OpenMPI build variant is listed in Table 1 which enables PMIx job launch support for use
with Slurm. This optional variant is available as openmpi3-pmix-slurm-gnu7-ohpc.
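
If desired, it can be installed in the same manner as the other MPI families:

[sms]# yum -y install openmpi3-pmix-slurm-gnu7-ohpc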

4.4 Performance Tools


OpenHPC provides a variety of open-source tools to aid in application performance analysis (refer to Ap-
pendix E for a listing of available packages). This group of tools can be installed as follows:

# Install perf-tools meta-package


[sms]# yum -y install ohpc-gnu7-perf-tools

4.5 Setup default development environment


System users often find it convenient to have a default development environment in place so that compilation
can be performed directly for parallel programs requiring MPI. This setup can be conveniently enabled via
modules and the OpenHPC modules environment is pre-configured to load an ohpc module on login (if
present). The following package install provides a default environment that enables autotools, the GNU
compiler toolchain, and the OpenMPI stack.

[sms]# yum -y install lmod-defaults-gnu7-openmpi3-ohpc

Tip

If you want to change the default environment from the suggestion above, OpenHPC also provides the GNU
compiler toolchain with the MPICH and MVAPICH2 stacks:
• lmod-defaults-gnu7-mpich-ohpc
• lmod-defaults-gnu7-mvapich2-ohpc

4.6 3rd Party Libraries and Tools


OpenHPC provides pre-packaged builds for a number of popular open-source tools and libraries used by HPC
applications and developers. For example, OpenHPC provides builds for FFTW and HDF5 (including serial
and parallel I/O support), and the GNU Scientific Library (GSL). Again, multiple builds of each package
are available in the OpenHPC repository to support multiple compiler and MPI family combinations where
appropriate. Note, however, that not all combinatorial permutations may be available for components where
there are known license incompatibilities. The general naming convention for builds provided by OpenHPC
is to append the compiler and MPI family name that the library was built against directly into the package


name. For example, libraries that do not require MPI as part of the build process adopt the following RPM
name:

package-<compiler family>-ohpc-<package version>-<release>.rpm

Packages that do require MPI as part of the build expand upon this convention to additionally include the
MPI family name as follows:

package-<compiler family>-<mpi family>-ohpc-<package version>-<release>.rpm

To illustrate this further, the command below queries the locally configured repositories to identify all of
the available PETSc packages that were built with the GNU toolchain. The resulting output should show
that pre-built versions are available for each of the supported MPI families presented in §4.3.
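
The exact query shown here is an illustrative sketch; the package naming follows the gnu7 convention
described above, and the actual output depends on the configured repositories:

# Identify available PETSc builds for the GNU toolchain
[sms]# yum search petsc-gnu7-ohpc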

Tip

OpenHPC-provided 3rd party builds are configured to be installed into a common top-level repository so that
they can be easily exported to desired hosts within the cluster. This common top-level path (/opt/ohpc/pub)
was previously configured to be mounted on compute nodes in §3.9.3, so the packages will be immediately
available for use on the cluster after installation on the master host.

For convenience, OpenHPC provides package aliases for these 3rd party libraries and utilities that can
be used to install available libraries for use with the GNU compiler family toolchain. For parallel libraries,
aliases are grouped by MPI family toolchain so that administrators can choose a subset should they favor a
particular MPI stack. Please refer to Appendix E for a more detailed listing of all available packages in each
of these functional areas. To install all available package offerings within OpenHPC, issue the following:

# Install 3rd party libraries/tools meta-packages built with GNU toolchain


[sms]# yum -y install ohpc-gnu7-serial-libs
[sms]# yum -y install ohpc-gnu7-io-libs
[sms]# yum -y install ohpc-gnu7-python-libs
[sms]# yum -y install ohpc-gnu7-runtimes

# Install parallel lib meta-packages for all available MPI toolchains


[sms]# yum -y install ohpc-gnu7-mpich-parallel-libs
[sms]# yum -y install ohpc-gnu7-openmpi3-parallel-libs

4.7 Optional Development Tool Builds


In addition to the 3rd party development libraries built using the open source toolchains mentioned in §4.6,
OpenHPC also provides optional compatible builds for use with the compilers and MPI stack included in
newer versions of the Intel® Parallel Studio XE software suite. These packages provide a hierarchical user
environment experience similar to that of the other compiler and MPI families present in OpenHPC.
To take advantage of the available builds, the Parallel Studio software suite must be obtained and installed
separately. Once installed locally, the OpenHPC-compatible packages can be installed using standard package
manager semantics. Note that licenses are provided free of charge for many categories of use. In particular,
licenses for compilers and development tools are provided at no cost to academic researchers and to developers
contributing to open-source software projects. More information on this program can be found at:
https://software.intel.com/en-us/qualify-for-free-software


Tip

As noted in §3.9.3, the default installation path for OpenHPC (/opt/ohpc/pub) is exported over
NFS from the master to the compute nodes, but the Parallel Studio installer defaults to a path of
/opt/intel. To make the Intel® compilers available to the compute nodes one must either customize
the Parallel Studio installation path to be within /opt/ohpc/pub, or alternatively, add an additional
NFS export for /opt/intel that is mounted on desired compute nodes.
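A minimal sketch of the NFS-export approach is shown below. It assumes the $CHROOT image location and
${sms_ip} variable used in earlier sections of this guide, and the export and mount options are illustrative
only; adjust them to local policy.

# Export /opt/intel from the master host (sketch; options are illustrative)
[sms]# echo "/opt/intel *(ro,no_subtree_check)" >> /etc/exports
[sms]# exportfs -a

# Mount the export within the compute image so it is available after provisioning
[sms]# mkdir -p $CHROOT/opt/intel
[sms]# echo "${sms_ip}:/opt/intel /opt/intel nfs nfsvers=3,nodev 0 0" >> $CHROOT/etc/fstab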

To enable all 3rd party builds available in OpenHPC that are compatible with Intel® Parallel Studio, issue
the following:

# Install OpenHPC compatibility packages (requires prior installation of Parallel Studio)


[sms]# yum -y install intel-compilers-devel-ohpc
[sms]# yum -y install intel-mpi-devel-ohpc

# Optionally, choose the Omni-Path enabled build for MVAPICH2. Otherwise, skip to retain IB variant
[sms]# yum -y install mvapich2-psm2-intel-ohpc

# Install 3rd party libraries/tools meta-packages built with Intel toolchain


[sms]# yum -y install ohpc-intel-serial-libs
[sms]# yum -y install ohpc-intel-io-libs
[sms]# yum -y install ohpc-intel-perf-tools
[sms]# yum -y install ohpc-intel-python-libs
[sms]# yum -y install ohpc-intel-runtimes
[sms]# yum -y install ohpc-intel-mpich-parallel-libs
[sms]# yum -y install ohpc-intel-mvapich2-parallel-libs
[sms]# yum -y install ohpc-intel-openmpi3-parallel-libs
[sms]# yum -y install ohpc-intel-impi-parallel-libs

5 Resource Manager Startup


In §3, the Slurm resource manager was installed and configured for use on both the master host and
compute node instances. With the cluster nodes up and functional, we can now start up the resource manager
services in preparation for running user jobs. Generally, this is a two-step process that requires starting
the controller daemons on the master host and the client daemons on each of the compute hosts. Note that
Slurm leverages the munge library to provide authentication services, and the munge daemon also needs
to be running on all hosts within the resource management pool. The following commands can be used to
start the necessary services to support resource management under Slurm.

# Start munge and slurm controller on master host


[sms]# systemctl enable munge
[sms]# systemctl enable slurmctld
[sms]# systemctl start munge
[sms]# systemctl start slurmctld

# Start slurm clients on compute hosts


[sms]# pdsh -w $compute_prefix[1-4] systemctl start slurmd
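As a quick sanity check (a sketch; the partition and node names follow the example cluster used in this
guide and will differ locally), the controller should now report the compute hosts as available:

# Verify that compute nodes have registered with the Slurm controller (illustrative output)
[sms]# sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
normal*      up   infinite      4   idle c[1-4]

# If nodes linger in a down/unknown state, they can typically be returned to service with:
# scontrol update nodename=c[1-4] state=resume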


6 Run a Test Job


With the resource manager enabled for production usage, users should now be able to run jobs. To demon-
strate this, we will add a “test” user on the master host that can be used to run an example job.

[sms]# useradd -m test

Next, the user’s credentials need to be distributed across the cluster. xCAT’s xdcp has a merge function-
ality that adds new entries into credential files on compute nodes:

# Create a sync file for pushing user credentials to the nodes


[sms]# echo "MERGE:" > syncusers
[sms]# echo "/etc/passwd -> /etc/passwd" >> syncusers
[sms]# echo "/etc/group -> /etc/group" >> syncusers
[sms]# echo "/etc/shadow -> /etc/shadow" >> syncusers
# Use xCAT to distribute credentials to nodes
[sms]# xdcp compute -F syncusers

Alternatively, the updatenode compute -f command can be used. This re-synchronizes (i.e. copies) all of
the files defined in the syncfile that was set up in §3.9.5.
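For example:

# Re-sync all files defined in the xCAT syncfile to the compute nodes
[sms]# updatenode compute -f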
OpenHPC includes a simple “hello-world” MPI application in the /opt/ohpc/pub/examples directory that
can be used for a quick compilation and execution test. OpenHPC also provides a companion job-launch
utility named prun that is installed in concert with the pre-packaged MPI toolchains. This convenience
script abstracts job launch across different resource managers and MPI stacks so that a single launch
command can be used for parallel jobs in a variety of OpenHPC environments. It also provides a centralized
mechanism for administrators to customize desired environment settings for their users.


6.1 Interactive execution


To use the newly created “test” account to compile and execute the application interactively through the
resource manager, execute the following (note the use of prun for parallel job launch, which reports the
underlying native job launch mechanism being used):

# Switch to "test" user


[sms]# su - test

# Compile MPI "hello world" example


[test@sms ~]$ mpicc -O3 /opt/ohpc/pub/examples/mpi/hello.c

# Submit interactive job request and use prun to launch executable


[test@sms ~]$ srun -n 8 -N 2 --pty /bin/bash

[test@c1 ~]$ prun ./a.out

[prun] Master compute host = c1


[prun] Resource manager = slurm
[prun] Launch cmd = mpiexec.hydra -bootstrap slurm ./a.out

Hello, world (8 procs total)


--> Process # 0 of 8 is alive. -> c1
--> Process # 4 of 8 is alive. -> c2
--> Process # 1 of 8 is alive. -> c1
--> Process # 5 of 8 is alive. -> c2
--> Process # 2 of 8 is alive. -> c1
--> Process # 6 of 8 is alive. -> c2
--> Process # 3 of 8 is alive. -> c1
--> Process # 7 of 8 is alive. -> c2

Tip

The following table provides approximate command equivalences between SLURM and PBS Pro:

Command                       PBS Pro                  SLURM
Submit batch job              qsub [job script]        sbatch [job script]
Request interactive shell     qsub -I /bin/bash        srun --pty /bin/bash
Delete job                    qdel [job id]            scancel [job id]
Queue status                  qstat -q                 sinfo
Job status                    qstat -f [job id]        scontrol show job [job id]
Node status                   pbsnodes [node name]     scontrol show node [node id]


6.2 Batch execution


For batch execution, OpenHPC provides a simple job script for reference (also housed in the
/opt/ohpc/pub/examples directory). This example script can be used as a starting point for submitting
batch jobs to the resource manager, and the example below illustrates use of the script to submit a batch
job for execution using the same executable referenced in the previous interactive example.

# Copy example job script


[test@sms ~]$ cp /opt/ohpc/pub/examples/slurm/job.mpi .

# Examine contents (and edit to set desired job sizing characteristics)


[test@sms ~]$ cat job.mpi
#!/bin/bash

#SBATCH -J test # Job name


#SBATCH -o job.%j.out # Name of stdout output file (%j expands to %jobId)
#SBATCH -N 2 # Total number of nodes requested
#SBATCH -n 16 # Total number of mpi tasks requested
#SBATCH -t 01:30:00 # Run time (hh:mm:ss) - 1.5 hours

# Launch MPI-based executable

prun ./a.out

# Submit job for batch execution


[test@sms ~]$ sbatch job.mpi
Submitted batch job 339

Tip

The use of the %j option in the example batch job script shown is a convenient way to track application output
on an individual job basis. The %j token is replaced with the Slurm job allocation number once assigned
(job #339 in this example).
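For instance, once the batch job above completes, its output can be inspected in the correspondingly named
file (a sketch using the job id from the example; the file name follows the -o directive in the script):

# Examine output from the completed batch job
[test@sms ~]$ cat job.339.out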


Appendices
A Installation Template
This appendix highlights the availability of a companion installation script that is included with OpenHPC
documentation. This script, when combined with local site inputs, can be used to implement a starting
recipe for bare-metal system installation and configuration. This template script is used during validation
efforts to test cluster installations and is provided as a convenience for administrators as a starting point for
potential site customization.

Tip

Note that the template script provided is intended for use during initial installation and is not designed for
repeated execution. If modifications are required after using the script initially, we recommend running the
relevant subset of commands interactively.

The template script relies on the use of a simple text file to define local site variables that were outlined
in §1.3. By default, the template installation script attempts to use local variable settings sourced from
the /opt/ohpc/pub/doc/recipes/vanilla/input.local file; however, this choice can be overridden by
the use of the ${OHPC_INPUT_LOCAL} environment variable. The template install script is intended for
execution on the SMS master host and is installed as part of the docs-ohpc package into
/opt/ohpc/pub/doc/recipes/vanilla/recipe.sh. After enabling the OpenHPC repository and reviewing the
guide for additional information on the intent of the commands, the general starting approach for using
this template is as follows:

1. Install the docs-ohpc package

[sms]# yum -y install docs-ohpc

2. Copy the provided template input file to use as a starting point to define local site settings:

[sms]# cp /opt/ohpc/pub/doc/recipes/centos7/input.local input.local

3. Update input.local with desired settings


4. Copy the template installation script which contains command-line instructions culled from this guide.

[sms]# cp -p /opt/ohpc/pub/doc/recipes/centos7/x86_64/xcat/slurm/recipe.sh .

5. Review and edit recipe.sh to suit local needs.

6. Use the environment variable to define the local input file and execute recipe.sh to perform a local installation.

[sms]# export OHPC_INPUT_LOCAL=./input.local


[sms]# ./recipe.sh


B Upgrading OpenHPC Packages


As newer OpenHPC releases are made available, users are encouraged to upgrade their locally installed
packages against the latest repository versions to obtain access to bug fixes and newer component versions.
This can be accomplished with the underlying package manager as OpenHPC packaging maintains versioning
state across releases. Also, package builds available from the OpenHPC repositories have “-ohpc” appended
to their names so that wild cards can be used as a simple way to obtain updates. The following general
procedure highlights a method for upgrading existing installations. When upgrading from a minor release
older than v1.3, you will first need to update your local OpenHPC repository configuration to point against
the v1.3 release (or update your locally hosted mirror). Refer to §3.1 for more details on enabling the latest
repository. In contrast, when upgrading between micro releases on the same branch (e.g. from v1.3 to 1.3.2),
there is no need to adjust local package manager configurations when using the public repository as rolling
updates are pre-configured.
1. (Optional) Ensure repo metadata is current (on the head node and in chroot location(s)). Package
managers will naturally do this on their own over time, but if you want to access updates immediately
after a new release, the following can be used to sync to the latest.

[sms]# yum clean expire-cache


[sms]# yum --installroot=$CHROOT clean expire-cache

2. Upgrade master (SMS) node

[sms]# yum -y upgrade "*-ohpc"

3. Upgrade packages in compute image

[sms]# yum -y --installroot=$CHROOT upgrade "*-ohpc"

4. Rebuild image(s)

[sms]# packimage centos7.5-x86_64-netboot-compute

In the case where packages were upgraded within the chroot compute image, you will need to reboot the
compute nodes when convenient to enable the changes.
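One way to do this is via xCAT's power control, assuming the "compute" node group name used throughout
this guide (a sketch; use whatever reboot mechanism matches local site practice):

# Reboot compute nodes to pick up the updated image
[sms]# rpower compute reset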

B.1 New component variants


As newer variants of key compiler/MPI stacks are released, OpenHPC will periodically add toolchains
enabling the latest variant. To stay consistent throughout the build hierarchy, minimize recompilation
requirements for existing binaries, and allow multiple variants to coexist, unique delimiters are used to
distinguish RPM package names and the module hierarchy.
In the case of a fresh install, OpenHPC recipes default to installation of the latest toolchains available
in a given release branch. However, if upgrading a previously installed system, administrators can opt in to
enable new variants as they become available. To illustrate this point, consider the previous OpenHPC 1.3.2
release, which contained an “openmpi” MPI variant providing OpenMPI 1.10.x along with runtimes and
libraries compiled with this toolchain. That release also contained the “llvm4” compiler family, which was
updated to “llvm5” in OpenHPC 1.3.3. In the case where an administrator would like to enable the newer
“openmpi3” toolchain, installation of these additions is simplified with the use of OpenHPC's meta-packages
(see Table 2 in Appendix E). The following example illustrates adding the complete “openmpi3” toolchain.
Note that we leverage the convenience meta-packages containing MPI-dependent builds, and we also update
the modules environment to make it the default.


# Update default environment


[sms]# yum -y remove lmod-defaults-gnu7-openmpi-ohpc
[sms]# yum -y install lmod-defaults-gnu7-openmpi3-ohpc

# Install OpenMPI 3.x-compiled meta-packages with dependencies


[sms]# yum -y install ohpc-gnu7-perf-tools \
ohpc-gnu7-io-libs \
ohpc-gnu7-python-libs \
ohpc-gnu7-runtimes \
ohpc-gnu7-openmpi3-parallel-libs

# Install LLVM/Clang 5.x


[sms]# yum -y install llvm5-compilers-ohpc


C Integration Test Suite


This appendix details the installation and basic use of the integration test suite used to support OpenHPC
releases. This suite is not intended to replace the validation performed by component development teams,
but is instead devised to confirm component builds are functional and interoperable within the modular
OpenHPC environment. The test suite is generally organized by components, and the OpenHPC CI workflow
relies on running the full suite using Jenkins to test multiple OS configurations and installation recipes. To
facilitate customization and running of the test suite locally, we provide these tests in a standalone RPM.

[sms]# yum -y install test-suite-ohpc

The RPM installation creates a user named ohpc-test to house the test suite and provide an isolated
environment for execution. Configuration of the test suite is done using standard GNU autotools semantics
and the BATS shell-testing framework is used to execute and log a number of individual unit tests. Some
tests require privileged execution, so a different combination of tests will be enabled depending on which user
executes the top-level configure script. Non-privileged tests requiring execution on one or more compute
nodes are submitted as jobs through the SLURM resource manager. The tests are further divided into
“short” and “long” run categories. The short run configuration is a subset of approximately 180 tests to
demonstrate basic functionality of key components (e.g. MPI stacks) and should complete in 10-20 minutes.
The long run (around 1000 tests) is comprehensive and can take an hour or more to complete.
Most components can be tested individually, but a default configuration is set up to enable collective
testing. To test an isolated component, use the configure option to disable all tests, then re-enable the
desired test to run. The --help option to configure will display all possible tests. Example output is
shown below (some output is omitted for the sake of brevity).

[sms]# su - ohpc-test
[test@sms ~]$ cd tests
[test@sms ~]$ ./configure --disable-all --enable-fftw
checking for a BSD-compatible install... /bin/install -c
checking whether build environment is sane... yes
...
---------------------------------------------- SUMMARY ---------------------------------------------

Package version............... : test-suite-1.3.0

Build user.................... : ohpc-test


Build host.................... : sms001
Configure date................ : 2017-03-24 15:41
Build architecture............ : x86_64
Compiler Families............. : gnu
MPI Families.................. : mpich mvapich2 openmpi
Resource manager ............. : SLURM
Test suite configuration...... : short
...
Libraries:
Adios .................... : disabled
Boost .................... : disabled
Boost MPI................. : disabled
FFTW...................... : enabled
GSL....................... : disabled
HDF5...................... : disabled
HYPRE..................... : disabled
...

Many OpenHPC components exist in multiple flavors to support multiple compiler and MPI runtime
permutations, and the test suite takes this into account by iterating through these combinations by default.


If make check is executed from the top-level test directory, all configured compiler and MPI permutations
of a library will be exercised. The following highlights the execution of the FFTW related tests that were
enabled in the previous step.

[test@sms ~]$ make check


make --no-print-directory check-TESTS
PASS: libs/fftw/ohpc-tests/test_mpi_families
============================================================================
Testsuite summary for test-suite 1.3.0
============================================================================
# TOTAL: 1
# PASS: 1
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================
[test@sms ~]$ cat libs/fftw/tests/family-gnu-*/rm_execution.log
1..3
ok 1 [libs/FFTW] Serial C binary runs under resource manager (SLURM/gnu/mpich)
ok 2 [libs/FFTW] MPI C binary runs under resource manager (SLURM/gnu/mpich)
ok 3 [libs/FFTW] Serial Fortran binary runs under resource manager (SLURM/gnu/mpich)
PASS rm_execution (exit status: 0)
1..3
ok 1 [libs/FFTW] Serial C binary runs under resource manager (SLURM/gnu/mvapich2)
ok 2 [libs/FFTW] MPI C binary runs under resource manager (SLURM/gnu/mvapich2)
ok 3 [libs/FFTW] Serial Fortran binary runs under resource manager (SLURM/gnu/mvapich2)
PASS rm_execution (exit status: 0)
1..3
ok 1 [libs/FFTW] Serial C binary runs under resource manager (SLURM/gnu/openmpi)
ok 2 [libs/FFTW] MPI C binary runs under resource manager (SLURM/gnu/openmpi)
ok 3 [libs/FFTW] Serial Fortran binary runs under resource manager (SLURM/gnu/openmpi)
PASS rm_execution (exit status: 0)


D Customization
D.1 Adding local Lmod modules to OpenHPC hierarchy
Locally installed applications can easily be integrated into OpenHPC systems by following the Lmod
convention laid out by the provided packages. Two sample module files are included in the examples-ohpc
package: one representing an application with no compiler or MPI runtime dependencies, and one dependent
on OpenMPI and the GNU toolchain. Simply copy these files to the prescribed locations, and Lmod should
pick them up automatically.

[sms]# mkdir /opt/ohpc/pub/modulefiles/example1


[sms]# cp /opt/ohpc/pub/examples/example.modulefile \
/opt/ohpc/pub/modulefiles/example1/1.0
[sms]# mkdir /opt/ohpc/pub/moduledeps/gnu7-openmpi3/example2
[sms]# cp /opt/ohpc/pub/examples/example-mpi-dependent.modulefile \
/opt/ohpc/pub/moduledeps/gnu7-openmpi3/example2/1.0
[sms]# module avail

----------------------------------- /opt/ohpc/pub/moduledeps/gnu7-openmpi3 -----------------------------------


adios/1.12.0 imb/2018.0 netcdf-fortran/4.4.4 ptscotch/6.0.4 sionlib/1.7.1
boost/1.65.1 mpi4py/2.0.0 netcdf/4.4.1.1 scalapack/2.0.2 slepc/3.7.4
example2/1.0 mpiP/3.4.1 petsc/3.7.6 scalasca/2.3.1 superlu_dist/4.2
fftw/3.3.6 mumps/5.1.1 phdf5/1.10.1 scipy/0.19.1 tau/2.26.1
hypre/2.11.2 netcdf-cxx/4.3.0 pnetcdf/1.8.1 scorep/3.1 trilinos/12.10.1

--------------------------------------- /opt/ohpc/pub/moduledeps/gnu7 ----------------------------------------


R/3.4.2 metis/5.1.0 ocr/1.0.1 pdtoolkit/3.24 superlu/5.2.1
gsl/2.4 mpich/3.2 openblas/0.2.20 plasma/2.8.0
hdf5/1.10.1 numpy/1.13.1 openmpi3/3.0.0 (L) scotch/6.0.4

---------------------------------------- /opt/ohpc/admin/modulefiles -----------------------------------------


spack/0.10.0

----------------------------------------- /opt/ohpc/pub/modulefiles ------------------------------------------


EasyBuild/3.4.1 cmake/3.9.2 hwloc/1.11.8 pmix/1.2.3 valgrind/3.13.0
autotools (L) example1/1.0 (L) llvm5/5.0.0 prun/1.2 (L)
clustershell/1.8 gnu7/7.2.0 (L) ohpc (L) singularity/2.4

Where:
L: Module is loaded

Use "module spider" to find all possible modules.


Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".


D.2 Rebuilding Packages from Source


Users of OpenHPC may find it desirable to rebuild one of the supplied packages to apply build customizations
or satisfy local requirements. One way to accomplish this is to install the appropriate source RPM, modify
the specfile as needed, and rebuild to obtain an updated binary RPM. A brief example using the FFTW
library is highlighted below. Note that the source RPMs can be downloaded from the community build
server at https://build.openhpc.community via a web browser or directly via yum as highlighted below.
The OpenHPC build system design leverages several keywords to control the choice of compiler and MPI
families for relevant development libraries and the rpmbuild example illustrates how to override the default
mpi family.

# Install rpm-build package and yum tools from base OS distro


[test@sms ~]$ sudo yum -y install rpm-build yum-utils

# Install FFTW’s build dependencies


[test@sms ~]$ sudo yum-builddep fftw-gnu7-openmpi3-ohpc

# Download SRPM from OpenHPC repository and install locally


[test@sms ~]$ yumdownloader --source fftw-gnu7-openmpi3-ohpc
[test@sms ~]$ rpm -i ./fftw-gnu7-openmpi3-ohpc-3.3.6-28.11.src.rpm

# Modify spec file as desired


[test@sms ~]$ cd ~/rpmbuild/SPECS
[test@sms ~rpmbuild/SPECS]$ perl -pi -e "s/enable-static=no/enable-static=yes/" fftw.spec

# Increment RPM release so package manager will see an update


[test@sms ~rpmbuild/SPECS]$ perl -pi -e "s/Release: 28.11/Release: 29.1/" fftw.spec

# Rebuild binary RPM. Note that additional directives can be specified to modify build
[test@sms ~rpmbuild/SPECS]$ rpmbuild -bb --define "mpi_family mpich" fftw.spec

# Install the new package


[test@sms ~]$ sudo yum -y install ~test/rpmbuild/RPMS/x86_64/fftw-gnu-mpich-ohpc-3.3.6-29.1.x86_64.rpm
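If the rebuilt package should also be available on the compute nodes, it can additionally be installed into
the chroot image and the image regenerated, reusing the $CHROOT and packimage conventions from earlier
sections (a sketch; the RPM path follows the example above):

# Optionally, add the rebuilt package to the compute image and regenerate it
[sms]# yum -y --installroot=$CHROOT install ~test/rpmbuild/RPMS/x86_64/fftw-gnu-mpich-ohpc-3.3.6-29.1.x86_64.rpm
[sms]# packimage centos7.5-x86_64-netboot-compute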


E Package Manifest

This appendix provides a summary of available meta-package groupings and all of the individual RPM
packages that are available as part of this OpenHPC release. The meta-packages provide a mechanism to
group related collections of RPMs by functionality and provide a convenience mechanism for installation. A
list of the available meta-packages and a brief description is presented in Table 2.


Table 2: Available OpenHPC Meta-packages

Group Name Description


ohpc-autotools Collection of GNU autotools packages.
ohpc-base Collection of base packages.
ohpc-base-compute Collection of compute node base packages.
ohpc-ganglia Collection of Ganglia monitoring and metrics packages.
ohpc-gnu7-io-libs Collection of IO library builds for use with GNU compiler toolchain.
ohpc-gnu7-mpich-io-libs Collection of IO library builds for use with GNU compiler toolchain and the
MPICH runtime.
ohpc-gnu7-mpich-parallel-libs Collection of parallel library builds for use with GNU compiler toolchain and
the MPICH runtime.
ohpc-gnu7-mpich-perf-tools Collection of performance tool builds for use with GNU compiler toolchain
and the MPICH runtime.
ohpc-gnu7-mvapich2-io-libs Collection of IO library builds for use with GNU compiler toolchain and the
MVAPICH2 runtime.
ohpc-gnu7-mvapich2-parallel-libs Collection of parallel library builds for use with GNU compiler toolchain and
the MVAPICH2 runtime.
ohpc-gnu7-mvapich2-perf-tools Collection of performance tool builds for use with GNU compiler toolchain
and the MVAPICH2 runtime.
ohpc-gnu7-openmpi3-io-libs Collection of IO library builds for use with GNU compiler toolchain and the
OpenMPI runtime.
ohpc-gnu7-openmpi3-parallel-libs Collection of parallel library builds for use with GNU compiler toolchain and
the OpenMPI runtime.
ohpc-gnu7-openmpi3-perf-tools Collection of performance tool builds for use with GNU compiler toolchain
and the OpenMPI runtime.
ohpc-gnu7-parallel-libs Collection of parallel library builds for use with GNU compiler toolchain.
ohpc-gnu7-perf-tools Collection of performance tool builds for use with GNU compiler toolchain.
ohpc-gnu7-python-libs Collection of python related library builds for use with GNU compiler
toolchain.
ohpc-gnu7-python2-libs Collection of python2 related library builds for use with GNU compiler
toolchain.
ohpc-gnu7-python3-libs Collection of python3 related library builds for use with GNU compiler
toolchain.
ohpc-gnu7-runtimes Collection of runtimes for use with GNU compiler toolchain.
ohpc-gnu7-serial-libs Collection of serial library builds for use with GNU compiler toolchain.
ohpc-intel-impi-parallel-libs Collection of parallel library builds for use with Intel(R) Parallel Studio XE
toolchain and the Intel(R) MPI Library.
ohpc-intel-io-libs Collection of IO library builds for use with Intel(R) Parallel Studio XE soft-
ware suite.
ohpc-intel-mpich-parallel-libs Collection of parallel library builds for use with Intel(R) Parallel Studio XE
toolchain and the MPICH runtime.
ohpc-intel-mvapich2-parallel-libs Collection of parallel library builds for use with Intel(R) Parallel Studio XE
toolchain and the MVAPICH2 runtime.
ohpc-intel-openmpi3-parallel-libs Collection of parallel library builds for use with Intel(R) Parallel Studio XE
toolchain and the OpenMPI runtime.
ohpc-intel-perf-tools Collection of performance tool builds for use with Intel(R) Parallel Studio XE
toolchain.


Table 2 (cont): Available OpenHPC Meta-packages

Group Name Description


ohpc-intel-python-libs Collection of python related library builds for use with Intel(R) Parallel Studio XE
toolchain.
ohpc-intel-python2-libs Collection of python2 related library builds for use with Intel(R) Parallel Studio XE
toolchain.
ohpc-intel-python3-libs Collection of python3 related library builds for use with Intel(R) Parallel Studio XE
toolchain.
ohpc-intel-runtimes Collection of runtimes for use with Intel(R) Parallel Studio XE toolchain.
ohpc-intel-serial-libs Collection of serial library builds for use with Intel(R) Parallel Studio XE toolchain.
ohpc-nagios Collection of Nagios monitoring and metrics packages.
ohpc-slurm-client Collection of client packages for SLURM.
ohpc-slurm-server Collection of server packages for SLURM.
ohpc-warewulf Collection of base packages for Warewulf provisioning.


What follows next in this Appendix is a series of tables that summarize the underlying RPM packages
available in this OpenHPC release. These packages are organized by groupings based on their general
functionality and each table provides information for the specific RPM name, version, brief summary, and
the web URL where additional information can be obtained for the component. Note that many of the 3rd
party community libraries that are pre-packaged with OpenHPC are built using multiple compiler and MPI
families. In these cases, the RPM package name includes delimiters identifying the development environment
for which each package build is targeted. Additional information on the OpenHPC package naming scheme
is presented in §4.6. The relevant package groupings and associated Table references are as follows:

• Administrative tools (Table 3)
• Resource management (Table 4)
• Compiler families (Table 5)
• MPI families (Table 6)
• Development tools (Table 7)
• Performance analysis tools (Table 8)
• Distro support packages and dependencies (Table 9)
• Lustre (Table 10)
• IO Libraries (Table 11)
• Runtimes (Table 12)
• Serial/Threaded Libraries (Table 13)
• Parallel Libraries (Table 14)


Table 3: Administrative Tools

RPM Package Name Version Info/URL


Python framework for efficient cluster administration.
clustershell-ohpc 1.8
http://clustershell.sourceforge.net
ConMan: The Console Manager.
conman-ohpc 0.2.8
http://dun.github.io/conman
OpenHPC documentation.
docs-ohpc 1.3.5
https://github.com/openhpc/ohpc
Example source code and templates for use within OpenHPC
examples-ohpc 1.5
environment. https://github.com/openhpc/ohpc
Distributed Monitoring System.
ganglia-ohpc 3.7.2
http://ganglia.sourceforge.net
Static cluster configuration database.
genders-ohpc 1.22
https://github.com/chaos/genders
lmod-defaults-gnu-impi-ohpc
lmod-defaults-gnu-mpich-ohpc
1.2
lmod-defaults-gnu-mvapich2-ohpc
lmod-defaults-gnu-openmpi-ohpc
lmod-defaults-gnu7-impi-ohpc
lmod-defaults-gnu7-mpich-ohpc
lmod-defaults-gnu7-mvapich2-ohpc OpenHPC default login environments.
lmod-defaults-gnu7-openmpi-ohpc https://github.com/openhpc/ohpc
lmod-defaults-intel-impi-ohpc 1.3.1
lmod-defaults-intel-mpich-ohpc
lmod-defaults-intel-mvapich2-ohpc
lmod-defaults-intel-openmpi-ohpc
lmod-defaults-gnu7-openmpi3-ohpc
1.3.3
lmod-defaults-intel-openmpi3-ohpc
Lua based Modules (lmod).
lmod-ohpc 7.7.14
https://github.com/TACC/Lmod
A Linux operating system framework for managing HPC clus-
losf-ohpc 0.55.0
ters. https://github.com/hpcsi/losf
Remote shell program that uses munge authentication.
mrsh-ohpc 2.12
https://github.com/chaos/mrsh
Host/service/network monitoring program plugins for Nagios.
nagios-plugins-ohpc 2.2.1
https://www.nagios-plugins.org
Host/service/network monitoring program.
nagios-ohpc 4.3.4
http://www.nagios.org
Stores all configuration and event data from Nagios in a
ndoutils-ohpc 2.1.3
database. http://www.nagios.org/download/addons
LBNL Node Health Check.
nhc-ohpc 1.4.2
https://github.com/mej/nhc
Host/service/network monitoring agent for Nagios.
nrpe-ohpc 3.2.0
http://www.nagios.org
OpenHPC release files.
ohpc-release 1.3
https://github.com/openhpc/ohpc
Parallel remote shell program.
pdsh-ohpc 2.33
http://sourceforge.net/projects/pdsh
Convenience utility for parallel job launch.
prun-ohpc 1.2
https://github.com/openhpc/ohpc
Integration test suite for OpenHPC.
test-suite-ohpc 1.3.5
https://github.com/openhpc/ohpc/tests


Table 4: Resource Management

RPM Package Name Version Info/URL


MUNGE authentication service.
munge-ohpc 0.5.13
http://dun.github.io/munge
PBS Professional for an execution host.
pbspro-execution-ohpc 14.1.2
https://github.com/PBSPro/pbspro
PBS Professional for a client host.
pbspro-client-ohpc 14.1.2
https://github.com/PBSPro/pbspro
PBS Professional for a server host.
pbspro-server-ohpc 14.1.2
https://github.com/PBSPro/pbspro
An extended/exascale implementation of PMI.
pmix-ohpc 2.1.1
https://pmix.github.io/pmix
Development package for Slurm.
slurm-devel-ohpc 17.11.7
https://slurm.schedmd.com
Example config files for Slurm.
slurm-example-configs-ohpc 17.11.7
https://slurm.schedmd.com
Graphical user interface to view and modify Slurm state.
slurm-sview-ohpc 17.11.7
https://slurm.schedmd.com
PAM module for restricting access to compute nodes via Slurm.
slurm-pam_slurm-ohpc 17.11.7
https://slurm.schedmd.com
Perl API to Slurm.
slurm-perlapi-ohpc 17.11.7
https://slurm.schedmd.com
Perl tool to print Slurm job state information.
slurm-contribs-ohpc 17.11.7
https://slurm.schedmd.com
Slurm Workload Manager.
slurm-ohpc 17.11.7
https://slurm.schedmd.com
Slurm compute node daemon.
slurm-slurmd-ohpc 17.11.7
https://slurm.schedmd.com
Slurm controller daemon.
slurm-slurmctld-ohpc 17.11.7
https://slurm.schedmd.com
Slurm database daemon.
slurm-slurmdbd-ohpc 17.11.7
https://slurm.schedmd.com
Slurm's implementation of the PMI libraries.
slurm-libpmi-ohpc 17.11.7
https://slurm.schedmd.com
Torque/PBS wrappers for transition from Torque/PBS to Slurm.
slurm-torque-ohpc 17.11.7
https://slurm.schedmd.com
openlava/LSF wrappers for transition from OpenLava/LSF to
slurm-openlava-ohpc 17.11.7
Slurm. https://slurm.schedmd.com


Table 5: Compiler Families

RPM Package Name Version Info/URL


The GNU C Compiler and Support Files.
gnu-compilers-ohpc 5.4.0
http://gcc.gnu.org
The GNU C Compiler and Support Files.
gnu7-compilers-ohpc 7.3.0
http://gcc.gnu.org
OpenHPC compatibility package for Intel(R) Parallel Studio
intel-compilers-devel-ohpc 2018
XE. https://github.com/openhpc/ohpc
The LLVM Compiler Infrastructure.
llvm4-compilers-ohpc 4.0.1
http://www.llvm.org
The LLVM Compiler Infrastructure.
llvm5-compilers-ohpc 5.0.1
http://www.llvm.org

Table 6: MPI Families

RPM Package Name Version Info/URL


OpenHPC compatibility package for Intel(R) MPI Library.
intel-mpi-devel-ohpc 2018
https://github.com/openhpc/ohpc
mpich-gnu-ohpc
MPICH MPI implementation.
mpich-gnu7-ohpc 3.2.1
http://www.mpich.org
mpich-intel-ohpc
mvapich2-gnu-ohpc
mvapich2-gnu7-ohpc
mvapich2-intel-ohpc OSU MVAPICH2 MPI implementation.
2.2
mvapich2-psm2-gnu-ohpc http://mvapich.cse.ohio-state.edu/overview/mvapich2
mvapich2-psm2-gnu7-ohpc
mvapich2-psm2-intel-ohpc
openmpi-gnu-ohpc
openmpi-gnu7-ohpc
openmpi-intel-ohpc A powerful implementation of MPI.
1.10.7
openmpi-psm2-gnu-ohpc http://www.open-mpi.org
openmpi-psm2-gnu7-ohpc
openmpi-psm2-intel-ohpc
openmpi3-gnu7-ohpc
openmpi3-intel-ohpc A powerful implementation of MPI.
3.1.0
openmpi3-pmix-slurm-gnu7-ohpc http://www.open-mpi.org
openmpi3-pmix-slurm-intel-ohpc


Table 7: Development Tools

RPM Package Name Version Info/URL


Build and installation framework.
EasyBuild-ohpc 3.6.1
http://easybuilders.github.io/easybuild
A GNU tool for automatically creating Makefiles.
automake-ohpc 1.15
http://www.gnu.org/software/automake
A GNU tool for automatically configuring source code.
autoconf-ohpc 2.69
http://www.gnu.org/software/autoconf
CMake is an open-source, cross-platform family of tools
cmake-ohpc 3.11.1
designed to build, test and package software. https:
//cmake.org
Portable Hardware Locality.
hwloc-ohpc 1.11.10
http://www.open-mpi.org/projects/hwloc
The GNU Portable Library Tool.
libtool-ohpc 2.4.6
http://www.gnu.org/software/libtool
python-scipy-gnu-mpich-ohpc
python-scipy-gnu-mvapich2-ohpc
0.19.1
python-scipy-gnu-openmpi-ohpc
Scientific Tools for Python.
python-scipy-gnu7-openmpi-ohpc
http://www.scipy.org
python-scipy-gnu7-mpich-ohpc
python-scipy-gnu7-mvapich2-ohpc 1.1.0
python-scipy-gnu7-openmpi3-ohpc
python34-scipy-gnu7-mpich-ohpc
Scientific Tools for Python.
python34-scipy-gnu7-mvapich2-ohpc 1.1.0
http://www.scipy.org
python34-scipy-gnu7-openmpi3-ohpc
python-numpy-gnu-ohpc 1.12.1 NumPy array processing for numbers, strings, records
python-numpy-gnu7-ohpc and objects.
1.14.3
python-numpy-intel-ohpc http://sourceforge.net/projects/numpy
python34-numpy-gnu7-ohpc NumPy array processing for numbers, strings, records
1.14.3
python34-numpy-intel-ohpc and objects. http://sourceforge.net/projects/numpy
python-mpi4py-gnu7-impi-ohpc
python-mpi4py-gnu7-mpich-ohpc
python-mpi4py-gnu7-mvapich2-ohpc
Python bindings for the Message Passing Interface
python-mpi4py-gnu7-openmpi3-ohpc
3.0.0 (MPI) standard.
python-mpi4py-intel-impi-ohpc
https://bitbucket.org/mpi4py/mpi4py
python-mpi4py-intel-mpich-ohpc
python-mpi4py-intel-mvapich2-ohpc
python-mpi4py-intel-openmpi3-ohpc
python34-mpi4py-gnu7-impi-ohpc
python34-mpi4py-gnu7-mpich-ohpc
python34-mpi4py-gnu7-mvapich2-ohpc
Python bindings for the Message Passing Interface
python34-mpi4py-gnu7-openmpi3-ohpc
3.0.0 (MPI) standard.
python34-mpi4py-intel-impi-ohpc
https://bitbucket.org/mpi4py/mpi4py
python34-mpi4py-intel-mpich-ohpc
python34-mpi4py-intel-mvapich2-ohpc
python34-mpi4py-intel-openmpi3-ohpc
HPC software package management.
spack-ohpc 0.11.2
https://github.com/LLNL/spack
Valgrind Memory Debugger.
valgrind-ohpc 3.13.0
http://www.valgrind.org


Table 8: Performance Analysis Tools

RPM Package Name Version Info/URL


imb-gnu-impi-ohpc
imb-gnu-mpich-ohpc
imb-gnu-mvapich2-ohpc
imb-gnu-openmpi-ohpc
imb-gnu7-impi-ohpc
imb-gnu7-mpich-ohpc
imb-gnu7-mvapich2-ohpc Intel MPI Benchmarks (IMB).
2018.1
imb-gnu7-openmpi-ohpc https://software.intel.com/en-us/articles/intel-mpi-benchmarks
imb-gnu7-openmpi3-ohpc
imb-intel-impi-ohpc
imb-intel-mpich-ohpc
imb-intel-mvapich2-ohpc
imb-intel-openmpi-ohpc
imb-intel-openmpi3-ohpc
likwid-gnu7-ohpc Toolsuite of command line applications for performance
4.3.2
likwid-intel-ohpc oriented programmers. https://github.com/RRZE-HPC/likwid
mpiP-gnu-impi-ohpc
mpiP-gnu-mpich-ohpc
mpiP-gnu-mvapich2-ohpc
mpiP-gnu-openmpi-ohpc
mpiP-gnu7-impi-ohpc
mpiP-gnu7-mpich-ohpc
mpiP-gnu7-mvapich2-ohpc mpiP: a lightweight profiling library for MPI applications.
3.4.1
mpiP-gnu7-openmpi-ohpc http://mpip.sourceforge.net
mpiP-gnu7-openmpi3-ohpc
mpiP-intel-impi-ohpc
mpiP-intel-mpich-ohpc
mpiP-intel-mvapich2-ohpc
mpiP-intel-openmpi-ohpc
mpiP-intel-openmpi3-ohpc
Performance Application Programming Interface.
papi-ohpc 5.6.0
http://icl.cs.utk.edu/papi
pdtoolkit-gnu-ohpc
PDT is a framework for analyzing source code.
pdtoolkit-gnu7-ohpc 3.25
http://www.cs.uoregon.edu/Research/pdt
pdtoolkit-intel-ohpc
scalasca-gnu-impi-ohpc
scalasca-gnu-mpich-ohpc
scalasca-gnu-mvapich2-ohpc
scalasca-gnu-openmpi-ohpc
scalasca-gnu7-impi-ohpc
scalasca-gnu7-mpich-ohpc
Toolset for performance analysis of large-scale parallel
scalasca-gnu7-mvapich2-ohpc
2.3.1 applications.
scalasca-gnu7-openmpi-ohpc
http://www.scalasca.org
scalasca-gnu7-openmpi3-ohpc
scalasca-intel-impi-ohpc
scalasca-intel-mpich-ohpc
scalasca-intel-mvapich2-ohpc
scalasca-intel-openmpi-ohpc
scalasca-intel-openmpi3-ohpc


Table 8 (cont): Performance Analysis Tools

RPM Package Name Version Info/URL


scorep-gnu-impi-ohpc
scorep-gnu-mpich-ohpc
scorep-gnu-mvapich2-ohpc
3.1
scorep-gnu-openmpi-ohpc
scorep-gnu7-openmpi-ohpc
scorep-intel-openmpi-ohpc
Scalable Performance Measurement Infrastructure for Parallel
scorep-gnu7-impi-ohpc
Codes.
scorep-gnu7-mpich-ohpc
http://www.vi-hps.org/projects/score-p
scorep-gnu7-mvapich2-ohpc
scorep-gnu7-openmpi3-ohpc
4.0
scorep-intel-impi-ohpc
scorep-intel-mpich-ohpc
scorep-intel-mvapich2-ohpc
scorep-intel-openmpi3-ohpc
tau-gnu-impi-ohpc
tau-gnu-mpich-ohpc
tau-gnu-mvapich2-ohpc
2.27
tau-gnu-openmpi-ohpc
tau-gnu7-openmpi-ohpc
tau-intel-openmpi-ohpc
tau-gnu7-impi-ohpc Tuning and Analysis Utilities Profiling Package.
tau-gnu7-mpich-ohpc http://www.cs.uoregon.edu/research/tau/home.php
tau-gnu7-mvapich2-ohpc
tau-gnu7-openmpi3-ohpc
2.27.1
tau-intel-impi-ohpc
tau-intel-mpich-ohpc
tau-intel-mvapich2-ohpc
tau-intel-openmpi3-ohpc

Table 9: Distro Support Packages/Dependencies

RPM Package Name Version Info/URL


Module for Lua which adds bitwise operations on numbers.
lua-bit-ohpc 1.0.2
http://bitop.luajit.org
Lua library to Access Directories and Files.
lua-filesystem-ohpc 1.6.3
http://keplerproject.github.com/luafilesystem
POSIX library for Lua.
lua-posix-ohpc 33.2.1
https://github.com/luaposix/luaposix


Table 10: Lustre

RPM Package Name Version Info/URL


Lustre File System.
lustre-client-ohpc 2.11.0
https://wiki.hpdd.intel.com
Lustre administration utility.
shine-ohpc 1.5
http://lustre-shine.sourceforge.net


Table 11: IO Libraries

RPM Package Name Version Info/URL


adios-gnu-impi-ohpc
adios-gnu-mpich-ohpc
adios-gnu-mvapich2-ohpc
1.12.0
adios-gnu-openmpi-ohpc
adios-gnu7-openmpi-ohpc
adios-intel-openmpi-ohpc
adios-gnu7-impi-ohpc The Adaptable IO System (ADIOS).
adios-gnu7-mpich-ohpc http://www.olcf.ornl.gov/center-projects/adios
adios-gnu7-mvapich2-ohpc
adios-gnu7-openmpi3-ohpc
1.13.1
adios-intel-impi-ohpc
adios-intel-mpich-ohpc
adios-intel-mvapich2-ohpc
adios-intel-openmpi3-ohpc
hdf5-gnu-ohpc 1.10.1 A general purpose library and file format for storing scientific
hdf5-gnu7-ohpc data.
1.10.2
hdf5-intel-ohpc http://www.hdfgroup.org/HDF5
netcdf-fortran-gnu-impi-ohpc
netcdf-fortran-gnu-mpich-ohpc
netcdf-fortran-gnu-mvapich2-ohpc
netcdf-fortran-gnu-openmpi-ohpc
netcdf-fortran-gnu7-impi-ohpc
netcdf-fortran-gnu7-mpich-ohpc
Fortran Libraries for the Unidata network Common Data
netcdf-fortran-gnu7-mvapich2-ohpc
4.4.4 Form.
netcdf-fortran-gnu7-openmpi-ohpc
http://www.unidata.ucar.edu/software/netcdf
netcdf-fortran-gnu7-openmpi3-ohpc
netcdf-fortran-intel-impi-ohpc
netcdf-fortran-intel-mpich-ohpc
netcdf-fortran-intel-mvapich2-ohpc
netcdf-fortran-intel-openmpi-ohpc
netcdf-fortran-intel-openmpi3-ohpc
netcdf-cxx-gnu-impi-ohpc
netcdf-cxx-gnu-mpich-ohpc
netcdf-cxx-gnu-mvapich2-ohpc
netcdf-cxx-gnu-openmpi-ohpc
netcdf-cxx-gnu7-impi-ohpc
netcdf-cxx-gnu7-mpich-ohpc
netcdf-cxx-gnu7-mvapich2-ohpc C++ Libraries for the Unidata network Common Data Form.
4.3.0
netcdf-cxx-gnu7-openmpi-ohpc http://www.unidata.ucar.edu/software/netcdf
netcdf-cxx-gnu7-openmpi3-ohpc
netcdf-cxx-intel-impi-ohpc
netcdf-cxx-intel-mpich-ohpc
netcdf-cxx-intel-mvapich2-ohpc
netcdf-cxx-intel-openmpi-ohpc
netcdf-cxx-intel-openmpi3-ohpc


Table 11 (cont): IO Libraries

RPM Package Name Version Info/URL


netcdf-gnu-impi-ohpc
netcdf-gnu-mpich-ohpc
netcdf-gnu-mvapich2-ohpc
4.5.0
netcdf-gnu-openmpi-ohpc
netcdf-gnu7-openmpi-ohpc
netcdf-intel-openmpi-ohpc
netcdf-gnu7-impi-ohpc C Libraries for the Unidata network Common Data Form.
netcdf-gnu7-mpich-ohpc http://www.unidata.ucar.edu/software/netcdf
netcdf-gnu7-mvapich2-ohpc
netcdf-gnu7-openmpi3-ohpc
4.6.1
netcdf-intel-impi-ohpc
netcdf-intel-mpich-ohpc
netcdf-intel-mvapich2-ohpc
netcdf-intel-openmpi3-ohpc
phdf5-gnu-impi-ohpc
phdf5-gnu-mpich-ohpc
phdf5-gnu-mvapich2-ohpc
1.10.1
phdf5-gnu-openmpi-ohpc
phdf5-gnu7-openmpi-ohpc
phdf5-intel-openmpi-ohpc
A general purpose library and file format for storing scientific
phdf5-gnu7-impi-ohpc
data.
phdf5-gnu7-mpich-ohpc
http://www.hdfgroup.org/HDF5
phdf5-gnu7-mvapich2-ohpc
phdf5-gnu7-openmpi3-ohpc
1.10.2
phdf5-intel-impi-ohpc
phdf5-intel-mpich-ohpc
phdf5-intel-mvapich2-ohpc
phdf5-intel-openmpi3-ohpc
pnetcdf-gnu7-openmpi-ohpc
1.8.1
pnetcdf-intel-openmpi-ohpc
pnetcdf-gnu7-impi-ohpc
pnetcdf-gnu7-mpich-ohpc
pnetcdf-gnu7-mvapich2-ohpc A Parallel NetCDF library (PnetCDF).
pnetcdf-gnu7-openmpi3-ohpc http://cucis.ece.northwestern.edu/projects/PnetCDF
1.9.0
pnetcdf-intel-impi-ohpc
pnetcdf-intel-mpich-ohpc
pnetcdf-intel-mvapich2-ohpc
pnetcdf-intel-openmpi3-ohpc
sionlib-gnu-impi-ohpc
sionlib-gnu-mpich-ohpc
sionlib-gnu-mvapich2-ohpc
sionlib-gnu-openmpi-ohpc
sionlib-gnu7-impi-ohpc
sionlib-gnu7-mpich-ohpc
Scalable I/O Library for Parallel Access to Task-Local Files.
sionlib-gnu7-mvapich2-ohpc
1.7.1 http://www.fz-juelich.de/ias/jsc/EN/Expertise/Support/Software/SIONlib/_node.html
sionlib-gnu7-openmpi-ohpc
sionlib-gnu7-openmpi3-ohpc
sionlib-intel-impi-ohpc
sionlib-intel-mpich-ohpc
sionlib-intel-mvapich2-ohpc
sionlib-intel-openmpi-ohpc
sionlib-intel-openmpi3-ohpc


Table 12: Runtimes

RPM Package Name Version Info/URL


Lightweight user-defined software stacks for high-performance
charliecloud-ohpc 0.2.4
computing. https://hpc.github.io/charliecloud
ocr-gnu-ohpc
Open Community Runtime (OCR) for shared memory.
ocr-gnu7-ohpc 1.0.1
https://xstack.exascale-tech.com/wiki
ocr-intel-ohpc
Application and environment virtualization.
singularity-ohpc 2.5.1
http://singularity.lbl.gov

Table 13: Serial/Threaded Libraries

RPM Package Name Version Info/URL


R is a language and environment for statistical computing and
R-gnu7-ohpc 3.5.0
graphics (S-Plus like). http://www.r-project.org
gsl-gnu-ohpc GNU Scientific Library (GSL).
2.4
gsl-gnu7-ohpc http://www.gnu.org/software/gsl
metis-gnu-ohpc
Serial Graph Partitioning and Fill-reducing Matrix Ordering.
metis-gnu7-ohpc 5.1.0
http://glaros.dtc.umn.edu/gkhome/metis/metis/overview
metis-intel-ohpc
openblas-gnu-ohpc An optimized BLAS library based on GotoBLAS2.
0.2.20
openblas-gnu7-ohpc http://www.openblas.net
plasma-gnu7-ohpc Parallel Linear Algebra Software for Multicore Architectures.
2.8.0
plasma-intel-ohpc https://bitbucket.org/icl/plasma
scotch-gnu7-ohpc Graph, mesh and hypergraph partitioning library.
6.0.4
scotch-intel-ohpc http://www.labri.fr/perso/pelegrin/scotch
superlu-gnu-ohpc A general purpose library for the direct solution of linear
superlu-gnu7-ohpc 5.2.1 equations.
superlu-intel-ohpc http://crd.lbl.gov/~xiaoye/SuperLU


Table 14: Parallel Libraries

RPM Package Name Version Info/URL


boost-gnu-impi-ohpc
boost-gnu-mpich-ohpc
boost-gnu-mvapich2-ohpc
1.66.0
boost-gnu-openmpi-ohpc
boost-gnu7-openmpi-ohpc
boost-intel-openmpi-ohpc
boost-gnu7-impi-ohpc Boost free peer-reviewed portable C++ source libraries.
boost-gnu7-mpich-ohpc http://www.boost.org
boost-gnu7-mvapich2-ohpc
boost-gnu7-openmpi3-ohpc
1.67.0
boost-intel-impi-ohpc
boost-intel-mpich-ohpc
boost-intel-mvapich2-ohpc
boost-intel-openmpi3-ohpc
fftw-gnu-mpich-ohpc
fftw-gnu-mvapich2-ohpc
fftw-gnu-openmpi-ohpc
A Fast Fourier Transform library.
fftw-gnu7-mpich-ohpc 3.3.7
http://www.fftw.org
fftw-gnu7-mvapich2-ohpc
fftw-gnu7-openmpi-ohpc
fftw-gnu7-openmpi3-ohpc
hypre-gnu-impi-ohpc
hypre-gnu-mpich-ohpc
hypre-gnu-mvapich2-ohpc
2.13.0
hypre-gnu-openmpi-ohpc
hypre-gnu7-openmpi-ohpc
hypre-intel-openmpi-ohpc
hypre-gnu7-impi-ohpc Scalable algorithms for solving linear systems of equations.
hypre-gnu7-mpich-ohpc http://www.llnl.gov/casc/hypre
hypre-gnu7-mvapich2-ohpc
hypre-gnu7-openmpi3-ohpc
2.14.0
hypre-intel-impi-ohpc
hypre-intel-mpich-ohpc
hypre-intel-mvapich2-ohpc
hypre-intel-openmpi3-ohpc
mfem-gnu7-impi-ohpc
mfem-gnu7-mpich-ohpc
mfem-gnu7-mvapich2-ohpc
Lightweight, general, scalable C++ library for finite element
mfem-gnu7-openmpi3-ohpc
3.3.2 methods.
mfem-intel-impi-ohpc
http://mfem.org
mfem-intel-mpich-ohpc
mfem-intel-mvapich2-ohpc
mfem-intel-openmpi3-ohpc


Table 14 (cont): Parallel Libraries

RPM Package Name Version Info/URL


mumps-gnu-impi-ohpc
mumps-gnu-mpich-ohpc
mumps-gnu-mvapich2-ohpc
mumps-gnu-openmpi-ohpc
mumps-gnu7-impi-ohpc
mumps-gnu7-mpich-ohpc
mumps-gnu7-mvapich2-ohpc A MUltifrontal Massively Parallel Sparse direct Solver.
5.1.2
mumps-gnu7-openmpi-ohpc http://mumps.enseeiht.fr
mumps-gnu7-openmpi3-ohpc
mumps-intel-impi-ohpc
mumps-intel-mpich-ohpc
mumps-intel-mvapich2-ohpc
mumps-intel-openmpi-ohpc
mumps-intel-openmpi3-ohpc
petsc-gnu-impi-ohpc
petsc-gnu-mpich-ohpc
petsc-gnu-mvapich2-ohpc
3.8.3
petsc-gnu-openmpi-ohpc
petsc-gnu7-openmpi-ohpc
petsc-intel-openmpi-ohpc
petsc-gnu7-impi-ohpc Portable Extensible Toolkit for Scientific Computation.
petsc-gnu7-mpich-ohpc http://www.mcs.anl.gov/petsc
petsc-gnu7-mvapich2-ohpc
petsc-gnu7-openmpi3-ohpc
3.9.1
petsc-intel-impi-ohpc
petsc-intel-mpich-ohpc
petsc-intel-mvapich2-ohpc
petsc-intel-openmpi3-ohpc
ptscotch-gnu7-impi-ohpc
ptscotch-gnu7-mpich-ohpc
ptscotch-gnu7-mvapich2-ohpc
ptscotch-gnu7-openmpi-ohpc
ptscotch-gnu7-openmpi3-ohpc Graph, mesh and hypergraph partitioning library using MPI.
6.0.4
ptscotch-intel-impi-ohpc http://www.labri.fr/perso/pelegrin/scotch
ptscotch-intel-mpich-ohpc
ptscotch-intel-mvapich2-ohpc
ptscotch-intel-openmpi-ohpc
ptscotch-intel-openmpi3-ohpc
slepc-gnu7-openmpi-ohpc
3.8.2
slepc-intel-openmpi-ohpc
slepc-gnu7-impi-ohpc
slepc-gnu7-mpich-ohpc
slepc-gnu7-mvapich2-ohpc A library for solving large scale sparse eigenvalue problems.
slepc-gnu7-openmpi3-ohpc http://slepc.upv.es
3.9.1
slepc-intel-impi-ohpc
slepc-intel-mpich-ohpc
slepc-intel-mvapich2-ohpc
slepc-intel-openmpi3-ohpc


Table 14 (cont): Parallel Libraries

RPM Package Name Version Info/URL


superlu_dist-gnu-impi-ohpc
superlu_dist-gnu-mpich-ohpc
superlu_dist-gnu-mvapich2-ohpc
4.2
superlu_dist-gnu-openmpi-ohpc
superlu_dist-gnu7-openmpi-ohpc
superlu_dist-intel-openmpi-ohpc
A general purpose library for the direct solution of linear
superlu_dist-gnu7-impi-ohpc
equations.
superlu_dist-gnu7-mpich-ohpc
http://crd-legacy.lbl.gov/~xiaoye/SuperLU
superlu_dist-gnu7-mvapich2-ohpc
superlu_dist-gnu7-openmpi3-ohpc
5.3.0
superlu_dist-intel-impi-ohpc
superlu_dist-intel-mpich-ohpc
superlu_dist-intel-mvapich2-ohpc
superlu_dist-intel-openmpi3-ohpc
trilinos-gnu-impi-ohpc
trilinos-gnu-mpich-ohpc
trilinos-gnu-mvapich2-ohpc
trilinos-gnu-openmpi-ohpc
trilinos-gnu7-impi-ohpc
trilinos-gnu7-mpich-ohpc
trilinos-gnu7-mvapich2-ohpc A collection of libraries of numerical algorithms.
12.12.1
trilinos-gnu7-openmpi-ohpc http://trilinos.sandia.gov/index.html
trilinos-gnu7-openmpi3-ohpc
trilinos-intel-impi-ohpc
trilinos-intel-mpich-ohpc
trilinos-intel-mvapich2-ohpc
trilinos-intel-openmpi-ohpc
trilinos-intel-openmpi3-ohpc


F Package Signatures
All of the RPMs provided via the OpenHPC repository are signed with a GPG signature. By default, the
underlying package managers will verify these signatures during installation to ensure that packages have
not been altered. The RPMs can also be manually verified and the public signing key fingerprint for the
latest repository is shown below:

Fingerprint: DD5D 8CAA CB57 364F FCC2 D3AE C468 07FF 26CE 6884

The following command can be used to verify an RPM once it has been downloaded locally, confirming
whether the package is signed and, if so, indicating which key was used to sign it. The example below
highlights usage for a local copy of the docs-ohpc package and illustrates how the key ID matches the
fingerprint shown above.

[sms]# rpm --checksig -v docs-ohpc-*.rpm


docs-ohpc-1.0-1.1.x86_64.rpm:
Header V3 RSA/SHA256 Signature, key ID 26ce6884: OK
Header SHA1 digest: OK (c3873bf495c51d2ea6d3ef23ab88be105983c72c)
V3 RSA/SHA256 Signature, key ID 26ce6884: OK
MD5 digest: OK (43d067f33fb370e30a39789439ead238)
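To cross-check which public signing keys have been imported into the local RPM database (a sketch using
standard rpm query formatting; the OpenHPC key should appear with the 26ce6884 id shown above):

# List imported GPG public keys known to the local RPM database
[sms]# rpm -q gpg-pubkey --qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n'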

