HCIA-Intelligent Computing V1.0 Lab Guide


Management Software Operation Guide for Trainees

Issue: 1.0

Huawei Technologies Co., Ltd.


Copyright © Huawei Technologies Co., Ltd. 2019 All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without
prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.

All other trademarks and trade names mentioned in this document are the property of their
respective holders.

Notice
The purchased products, services and features are stipulated by the contract made between Huawei
and the customer. All or part of the products, services and features described in this document may
not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all
statements, information, and recommendations in this document are provided "AS IS" without
warranties, guarantees or representations of any kind, either expressed or implied.

The information in this document is subject to change without notice. Every effort has been made in
the preparation of this document to ensure accuracy of the contents, but all statements, information,
and recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base, Bantian, Longgang Shenzhen 518129

People's Republic of China

Website: https://e.huawei.com/en

Huawei Proprietary and Confidential


Copyright © Huawei Technologies Co., Ltd

HCIA – Management Software Operation Guide for Trainees Page 1

Huawei Certificate System


Huawei Certification follows the "platform + ecosystem" development strategy,
which is a new collaborative architecture of ICT infrastructure based on
"Cloud-Pipe-Terminal". Huawei has set up a complete certification system consisting
of three categories: ICT infrastructure certification, Platform and Service certification,
and ICT vertical certification. This makes Huawei certification the only all-range
technical certification in the industry.
Huawei offers three levels of certification: Huawei Certified ICT Associate (HCIA),
Huawei Certified ICT Professional (HCIP), and Huawei Certified ICT Expert (HCIE).
Huawei Certified ICT Associate-Intelligent Computing (HCIA-Intelligent
Computing) is intended for Huawei engineers in representative offices and branch
offices, and other engineers who want to learn Huawei intelligent computing
products. The HCIA-Intelligent Computing certification covers the computing
industry, chip development history and trend, computing system architecture
overview, computing platform products and common technologies, and industry
solution cases and practices.
The HCIA-Intelligent Computing certificate system introduces you to the
industry and market, helps you in innovation, and enables you to stand atop the
intelligent computing frontiers.

Contents

1 References and Tools
1.1 References and Tools

2 Management Software Operation Guide
2.1 Course Introduction
2.2 Objectives
2.3 Case Background
2.4 Tasks
Scenario 1: Configure the iBMC
Scenario 2: Configure the RAID
Scenario 3: Configure the BIOS
2.5 Scoring Form
2.6 Auxiliary Materials and Props
2.6.1 Network Diagram and Data



1 References and Tools

1.1 References and Tools


Use the commands and reference documents listed in this document based on the
product version.

Reference documents:

 Huawei V5 Server RAID Controller Card User Guide

 Huawei Server Purley Platform BIOS Parameter Reference

 FusionServer Pro Rack Server iBMC (V300 to V369) User Guide

Software:

 BIOS

 iBMC

Reference links:

 https://support.huawei.com/enterprise/en/doc/EDOC1100019358/

 https://e.huawei.com/en

2 Management Software Operation Guide

2.1 Course Introduction


Perform management software operations based on typical scenarios on site.

2.2 Objectives
After the course, the trainees will be able to:

 Configure the server iBMC.

 Configure RAID settings.

 Configure BIOS settings.

2.3 Case Background


A disaster prevention institute uses a V5 rack server to provide computing power for
its seismic monitoring platform. The seismic monitoring platform has problems, such
as low data read speed and high maintenance and monitoring costs. Now, the institute
wants to perform basic configuration and debugging of the server to meet service
requirements.

Make a simple deployment plan.

2.4 Tasks
[Task Overview]-Task Flowchart

Scenario 1: Configure the iBMC


[Task Overview]-Task Flowchart

Background
The monitoring and O&M of the seismic monitoring platform is not intelligent. For
example, faults need to be identified and rectified manually one by one, which results
in high labor and material costs.

The Huawei intelligent Baseboard Management Controller (iBMC) is embedded
software used for server lifecycle management. It implements hardware status
monitoring and deployment, energy saving, and security management, and provides
standardized interfaces to build a more comprehensive server management
ecosystem. The iBMC implements precise server management.

Suppose you have a Huawei rack server. Log in to the iBMC web user interface (WebUI),
view the alarms and logs of the server, and perform system configuration and
management.

Question
How do you perform operations on the iBMC CLI?

Task 1: Query Information and Configure Settings


Section 1: Log in to the iBMC WebUI using the user name and password provided, view
server information, and configure the server settings.

Requirements: Screenshot the key steps for viewing information and configuring the
system, and name the screenshots in the "1.1 iBMC Configure-N" format. N indicates the
sequence number of the screenshot. The screenshots for each question are numbered
from 1.

Evaluation criteria:

1.1 Query the server iBMC IP address.

1.2 Query the server system information.

1.3 Configure trap notification for alarms.

1.4 Configure email notification for alarms.

1.5 View the latest server screenshot.

1.6 Enable power capping and set the smart cooling mode to High performance mode.

1.7 Configure SNMPv2 settings.

1.8 Set the hard drive as the first boot device.

1.9 Switch over iBMC images.

1.10 Mount an image file to the server through the remote console.

1.11 Query information about all users on the iBMC CLI.

Scenario 2: Configure the RAID


[Task Overview]-Task Flowchart

Background
RAID is configured to reduce errors and improve the performance and reliability of the
storage system. Generally, RAID needs to be configured for a newly purchased server.

Suppose you have a rack server (configured with an LSI SAS3108 RAID controller card).
Restart the server, access the RAID Configuration Utility, and create a RAID 5 array.

Notice:

 During the login process, you are asked to install and run the Java program. Perform
operations as prompted. In addition, you need to manually add the iBMC to the
Exception Site List on the Java Control Panel or lower the Java security level.

 Data on a hard disk will be deleted after the hard disk is added to a RAID array.
Before creating a RAID array, check that the hard disks contain no data or that the
data on them is no longer required.

 Disks of the same type and specifications must be used in a RAID array.

Question
What are the precautions to be observed when you configure RAID 5? What are the
application scenarios of other RAID levels?

Reference: RAID levels and Huawei V5 Server RAID Controller Card User Guide
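
One precaution follows from how RAID 5 protects data: each stripe carries one parity block, so the array can rebuild the contents of only one failed member disk. The following is a minimal Python sketch of that XOR parity idea, assuming short byte strings stand in for disk blocks. The function name xor_blocks and the sample data are hypothetical and are not part of any Huawei or LSI tool.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of one stripe (hypothetical two-byte blocks).
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Simulate losing the second data block, then rebuild it from the
# surviving data blocks plus the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

If two blocks of the same stripe are lost, the single parity block is no longer enough to recover them, which is why a failed disk in a RAID 5 array should be replaced promptly.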

Task 1: Compare RAID Levels


Section 1: Fill in the following table.

Table 2-1 RAID levels

RAID Level  Reliability  Read Performance  Write Performance  Min. Number of Disks  Disk Utilization

RAID 0

RAID 1

RAID 5

RAID 6

RAID 1E

RAID 10

RAID 50

RAID 60

Task 2: Configure a RAID 5 Array


Section 1: Log in to the HTML5 Integrated Remote Console of the server, access the
Configuration Utility (CU), and configure the RAID properties.

Requirements: Screenshot the key steps and name the screenshots in the "1.1 RAID
Configure-N" format. N indicates the sequence number of the screenshot. The
screenshots for each question are numbered from 1.

Evaluation criteria:

1.1 Create a RAID 5 array.

1.2 Create two virtual drives.

1.3 Configure advanced settings.

1.4 Check the configuration result.

Scenario 3: Configure the BIOS


[Task Overview]-Task Flowchart

Background
Suppose you have a Huawei rack server. Access the BIOS interface and query the
internal information, including the CPU, memory, and disk information of the server.
Then, set the boot mode of the server.

Question
How do you set the server boot mode to Legacy?

What is the iBMC IP address?

Task 1: Configure the BIOS


Section 1: Log in to the virtual console of the server, go to the BIOS startup screen, and
check the server information one by one.

Requirements: Screenshot the key steps and name the screenshots in the "1.1 BIOS
Configure-N" format. N indicates the sequence number of the screenshot. The
screenshots for each question are numbered from 1.

Evaluation criteria:

1.1 Check the CPU information.

1.2 Check the information about all hard disks.

1.3 Set the server boot device to DVD.

1.4 Complete iBMC network settings.



2.5 Scoring Form


Table 2-2 Scoring form

Case: XXX case, XXX (trainee/group)

Task         Score  Description

Task 1

Task 2

Task 3

Task 4

Total score

2.6 Auxiliary Materials and Props

2.6.1 Network Diagram and Data

Network diagram and data planning.xlsx
Management Software Operation Guide

Copyright © 2019 Huawei Technologies Co., Ltd. All rights reserved.


Foreword
 This slide deck provides guidance on deploying the server management
software through a case study.

Contents
1. Background

2. iBMC Management Platform Operations

3. Creation of a RAID 5 Array

4. BIOS Configuration

Drill Background

Background

 A disaster prevention institute uses a V5 rack server to provide computing
power for its seismic monitoring platform. The seismic monitoring platform
has problems, such as low data read speed and high maintenance and
monitoring costs. Now, the institute wants to perform basic configuration
and debugging of the server to meet service requirements.
 Make a simple deployment plan.

Objectives
 After completing this course, you will be able to understand and grasp:
 Basic functions of the server management software
 Application scenarios of different RAID levels
 iBMC, RAID, and BIOS operation processes

iBMC: intelligent Baseboard Management Controller


RAID: redundant array of independent disks
BIOS: basic input/output system

V5 Rack Server Management Software Deployment
Case Study

Objectives
 Task 1: Configure iBMC settings.
 Task 2: Configure RAID settings.
 Task 3: Configure BIOS settings.

Forms of Discussion
 Activity 1: Group discussion
 Activity 2: Group presentation
 Activity 3: Comments on each other

Time
 Group discussion: 40 minutes
 Presentation/group: 10 minutes
 Comments: 10 minutes

Related Information
 iBMC functions
 RAID levels
 Basic functions of the BIOS


Task Flowchart

Operations on the iBMC management platform
 View alarm and diagnosis information.
 Configure system management settings.
 Use the iBMC CLI.

RAID feature deployment
 Create a RAID 5 array.
 Configure advanced settings.

Operations on the BIOS
 Query internal information.
 Set the server boot mode.

iBMC Management Platform Operations
Case Study

Objectives
 Task 1: Configure iBMC settings.

Forms of Discussion
 Activity 1: Group discussion
 Activity 2: Group presentation
 Activity 3: Comments on each other

Time
 Group discussion: 8 minutes
 Presentation/group: 3 minutes
 Comments: 5 minutes

Related Information
 iBMC functions
 Operations on the iBMC Management Platform

Operations on the iBMC Management Platform
 Background:
The monitoring and O&M of the seismic monitoring platform is not intelligent. For
example, faults need to be identified and rectified manually one by one, which results in
high labor and material costs.

Now, use the intelligent Baseboard Management Controller (iBMC) to implement
intelligent O&M. Log in to the iBMC web user interface (WebUI) of a Huawei rack server,
query alarms and logs, and configure and manage the system.

 Question:
How do you perform operations on the iBMC command-line interface (CLI)?

Operations on the iBMC Management Platform
 [Task Overview]-Task Flowchart

Start
1. Log in to the iBMC WebUI.
2. Query system information.
3. Query alarms and events.
4. View diagnosis information.
5. Set the server boot mode.
6. Perform system configuration.
7. Perform system management.
8. Perform remote control.
9. Perform operations on the iBMC CLI.
End

Operations on the iBMC Management Platform
 Task 1: Log in to the iBMC WebUI of a 2288H V5, query the system
information, and fill in the following table.

Basic Information            Description  Remarks

IP address of the iBMC

Processor model

iBMC primary U-Boot version

Operations on the iBMC Management Platform
 Reference answer

Basic Information            Description                               Remarks

IP address of the iBMC       192.168.2.100

Processor model              Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz

iBMC primary U-Boot version  2.1.07

RAID Operations
Case Study

Objectives
 Task 1: Configure a RAID 5 array and compare features of different RAID levels.

Forms of Discussion
 Activity 1: Group discussion
 Activity 2: Group presentation
 Activity 3: Comments on each other

Time
 Group discussion: 7 minutes
 Presentation/group: 3 minutes
 Comments: 5 minutes

Related Information
 RAID levels and features
 Operations on the RAID management platform

RAID Operations
 Background:
RAID is configured to reduce errors and improve the performance and reliability of the
storage system. Generally, RAID needs to be configured for a newly purchased server.

Suppose you have a rack server. Restart the server, access the RAID Configuration Utility,
and create a RAID 5 array.

 Question:
What are the precautions to be observed during the configuration of a RAID 5 array? What
are the application scenarios of other RAID levels?
Reference: common RAID types of 2288H V5 servers and Huawei V5 Server RAID Controller Card User
Guide

RAID Operations
 [Task Overview]-Task Flowchart

Start
1. Set the RAID level.
2. Set the number of disks in a span.
3. Add disks.
4. Set the RAID capacity and name.
5. Check the configuration.
6. Exit the RAID Configuration Utility.
End

RAID Operations
 Task 1: Compare RAID levels.

RAID Level  Reliability  Read Performance  Write Performance  Min. Number of Disks  Disk Utilization

RAID 0

RAID 1

RAID 5

RAID 6

RAID 1E

RAID 10

RAID 50

RAID 60

RAID Operations
 [Reference answer]

RAID Level  Reliability      Read Performance  Write Performance  Min. Number of Disks  Disk Utilization

RAID 0      Low              High              High               2                     100%

RAID 1      High             Low               Low                2                     1/N

RAID 5      Relatively high  High              Medium             3                     (N-1)/N

RAID 6      Relatively high  High              Medium             4                     (N-2)/N

RAID 1E     High             Medium            Medium             3                     M/N

RAID 10     High             Medium            Medium             4                     M/N

RAID 50     High             High              Relatively high    6                     (N-M)/N

RAID 60     High             High              Relatively high    8                     (N-M*2)/N
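
The disk utilization column can be turned into a quick capacity estimate. The following is a minimal Python sketch that transcribes the formulas and minimum disk counts from the reference answer. It assumes that N is the total number of member disks and M is the number of spans, which the table itself does not define. RAID 1E is left out because the meaning of M for that level is not clear from the table. The function name usable_capacity_gb and the 600 GB disk size are hypothetical.

```python
from fractions import Fraction

# Utilization formulas from the reference table.
# Assumption: n = total member disks, m = number of spans.
UTILIZATION = {
    "RAID 0": lambda n, m: Fraction(1),
    "RAID 1": lambda n, m: Fraction(1, n),
    "RAID 5": lambda n, m: Fraction(n - 1, n),
    "RAID 6": lambda n, m: Fraction(n - 2, n),
    "RAID 10": lambda n, m: Fraction(m, n),
    "RAID 50": lambda n, m: Fraction(n - m, n),
    "RAID 60": lambda n, m: Fraction(n - 2 * m, n),
}

# Minimum number of disks, also taken from the reference table.
MIN_DISKS = {
    "RAID 0": 2, "RAID 1": 2, "RAID 5": 3, "RAID 6": 4,
    "RAID 10": 4, "RAID 50": 6, "RAID 60": 8,
}


def usable_capacity_gb(level, disks, disk_size_gb, spans=1):
    """Estimate usable capacity in GB for equally sized member disks."""
    if disks < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    return UTILIZATION[level](disks, spans) * disks * disk_size_gb


# Hypothetical example: 600 GB disks.
print(float(usable_capacity_gb("RAID 5", 3, 600)))
print(float(usable_capacity_gb("RAID 50", 6, 600, spans=2)))
```

Exact fractions are used so that ratios such as (N-1)/N do not pick up rounding error before the final conversion to a float.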

BIOS Management Platform Operations
Case Study

Objectives
 Task 1: Query disk information.
 Task 2: Set the server boot mode to Legacy.

Forms of Discussion
 Activity 1: Group discussion
 Activity 2: Group presentation
 Activity 3: Comments on each other

Time
 Group discussion: 7 minutes
 Presentation/group: 3 minutes
 Comments: 5 minutes

Related Information
 BIOS functions and features
 Operations on the BIOS management platform

BIOS Management Platform Operations
 Background:
Suppose you have a Huawei rack server. Access the BIOS interface and query the
internal information, including the CPU, memory, and disk information of the
server. Then, set the boot mode of the server.

 Question:
How do you set the server boot mode to Legacy?

BIOS Management Platform Operations
 [Task Overview]-Task Flowchart

Start
1. Access the BIOS interface.
2. Query CPU information.
3. Query memory information.
4. Query disk information.
5. Set the server boot mode.
6. Set and query the iBMC network.
End

BIOS Management Platform Operations
 Task 1: Check the disk information and fill in the following table.

Basic Information  Status  Remarks

Port 0

sSATA device type

SATA controller

Port 1

BIOS Management Platform Operations
 [Reference answer]

Basic Information  Status   Remarks

Port 0             Enabled

sSATA device type  HDD

SATA controller    AHCI

Port 1             Enabled

BIOS Management Platform Operations
 Task 2: Set the server boot mode to Legacy, write down the operation
procedure, and take a screenshot.

BIOS Management Platform Operations
 [Reference answer]
1. Log in to the BIOS. For details, see the user guide. Choose Boot > Boot Type, and
press Enter.
2. In the dialog box displayed, choose Legacy.

Summary
 Perform initial configuration of V5 rack servers after the study.
 Understand the functions and basic working principles of the management
software.

Recommendations

 Huawei V5 Server RAID Controller Card User Guide

 Huawei Server Purley Platform BIOS Parameter Reference

 FusionServer Pro Rack Server iBMC (V300 to V369) User Guide

Thank You
www.huawei.com
Revision Record
Course Code Product Product Version Course Version

Author Date Reviewer New/Update Update Description

Lu Fangming 2019.05.25 Shui Shaolan New

Liu Chao 2019.07.21 Shui Shaolan Update


Server Intelligent O&M Guide for Trainees

ISSUE: 1.0

HUAWEI TECHNOLOGIES CO., LTD.


HCIA - Server Intelligent O&M Guide for Trainees Page 1


Contents

1 References and Tools
1.1 References and Tools

2 Overview
2.1 Course Introduction
2.2 Objectives
2.3 Case Background
2.4 Tasks
Scenario 1: Install and Configure Ansible
Scenario 2: Manage Servers in Batches Using the ad-hoc Command
Scenario 3: Deploy Nginx Automatically Using a Playbook
2.5 Scoring Form



1 References and Tools

1.1 References and Tools


Use the commands and reference documents listed in this document based on site
requirements.

Reference links:

https://docs.ansible.com/

https://support-open.huawei.com/en

https://e.huawei.com/en

2 Overview

2.1 Course Introduction


This course walks through cluster experiments based on typical live network scenarios:
Ansible installation and configuration, batch server management, and automatic
Nginx deployment. Trainees work through these typical requirements in group discussion
to build Ansible deployment and automated O&M skills.

2.2 Objectives
Upon completion of this course, you will be able to:
 Understand the modes and scenarios of Ansible installation and deployment.
 Manage servers in batches using the ad-hoc command of Ansible.

 Perform the configuration and debugging using a playbook.

2.3 Case Background


To improve work efficiency, eliminate duplicate tasks, and reduce error risks, company
Z requires that the modification of the servers on the live network be minimized.
Therefore, Ansible is selected from the four mainstream O&M automation tools
(Puppet, SaltStack, Chef, and Ansible) to automate O&M management.

Ansible is an IT automation tool that can be used to configure systems, deploy
software, and coordinate more advanced IT tasks, such as continuous deployment
and rolling updates. Ansible is applicable to enterprise IT infrastructure management,
ranging from small-scale environments with a few hosts to enterprise environments
with thousands of instances. Ansible is also a simple automation language that can
describe an IT application infrastructure.

Assume that you are an IT system engineer of company Z, and you need to complete
the following tasks and configuration.

2.4 Tasks

Scenario 1: Install and Configure Ansible

Task 1: Confirm Service Environment


Server configuration

Python version: 2.7

OS: CentOS 7.2

IP address: 192.168.1.100

Configuration of the managed end

Python version: 2.7

OS: CentOS 7.2

SSH server software: OpenSSH

IP address of Host01: 192.168.1.101

IP address of Host02: 192.168.1.102

IP address of Host03: 192.168.1.103

[Figure: the Controller connects through a Switch to Host01, Host02, and Host03.]

Figure 2-1 Topology in the lab environment



Task 2: Install Ansible Using Yum Commands on the Control End


Write the installation commands:

Task 3: Install Python and Configure SSH Login Without a Password


Write the commands:

Task 4: Modify the ansible.cfg Configuration File and Configure the Controlled Hosts


Write the commands:
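
As a starting point for the controlled-host configuration, the following minimal Python sketch renders an INI-style Ansible inventory for the three managed hosts listed in Task 1. The group name webservers is taken from the Scenario 2 task titles. The helper name render_inventory is hypothetical, and the sketch only prints the text rather than writing it to the default inventory file, /etc/ansible/hosts.

```python
# Managed hosts from Task 1 of this scenario.
MANAGED_HOSTS = ["192.168.1.101", "192.168.1.102", "192.168.1.103"]


def render_inventory(group, hosts):
    """Return INI-style inventory text with one group and one host per line."""
    lines = [f"[{group}]"] + list(hosts)
    return "\n".join(lines) + "\n"


inventory_text = render_inventory("webservers", MANAGED_HOSTS)
print(inventory_text, end="")
```

The printed text can be reviewed and then copied into the inventory file on the control end by hand, which keeps the sketch free of side effects.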

Scenario 2: Manage Servers in Batches Using the ad-hoc Command

Task 1: Test the Connectivity of All Hosts in the Remote Host Group Webservers


Write the command:

Task 2: Check the Information about eth0 of the Remote Host Group Webservers


Write the command:

Task 3: Run the Remote Host Script test.sh


Write the commands:

Task 4: Copy the test.sh File from the Control End to the /tmp/
Directory on the Target Host, and Set the Owner and Group of the
File to root with the File Permission rwxr-xr-x
Write the command:
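
File-copy commands usually take the permission as an octal mode rather than as the symbolic string rwxr-xr-x given in this task. The following minimal Python sketch converts one form to the other. The helper name symbolic_to_octal is hypothetical, and the sketch handles only the nine plain r, w, x, and - characters, not setuid, setgid, or sticky bits.

```python
import stat


def symbolic_to_octal(symbolic):
    """Convert a nine-character string such as 'rwxr-xr-x' to an octal mode string."""
    if len(symbolic) != 9 or any(ch not in "rwx-" for ch in symbolic):
        raise ValueError("expected nine characters from 'r', 'w', 'x', '-'")
    bits = 0
    for ch in symbolic:
        # Each position contributes one bit: set unless the character is '-'.
        bits = (bits << 1) | int(ch != "-")
    return format(bits, "04o")


mode = symbolic_to_octal("rwxr-xr-x")
print(mode)

# Cross-check with the standard library: render the mode of a regular file.
print(stat.filemode(stat.S_IFREG | int(mode, 8)))
```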

Task 5: Check the uid and gid Information in the /etc/sysctl.conf File
of the Remote Host Group Webservers
Write the command:

Task 6: Install HTTPD on All Remote Host Group Webservers


Write the command:

Task 7: Enable the HTTP Service for the Remote Host Group
Webservers and Check the Service Status
Write the commands:

Task 8: Create and Delete the /home/f1 File on the Remote Server
Group Webservers
Write the commands:
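
The ad-hoc tasks in this scenario share one command shape: a host pattern, a module selected with -m, and optional module arguments passed with -a. The following minimal Python sketch assembles such command lines for several of the tasks so that they can be reviewed before being run on the control end. It assumes that the inventory defines a group named webservers and that Python 3.8 or later is available (for shlex.join). The helper name ad_hoc is hypothetical, and the module arguments shown are one possible choice, not the official answer to the tasks.

```python
import shlex


def ad_hoc(pattern, module, args=None):
    """Build the argument list for one Ansible ad-hoc command."""
    argv = ["ansible", pattern, "-m", module]
    if args:
        argv += ["-a", args]
    return argv


COMMANDS = [
    ad_hoc("webservers", "ping"),                              # connectivity test
    ad_hoc("webservers", "command", "ifconfig eth0"),          # eth0 information
    ad_hoc("webservers", "copy",
           "src=test.sh dest=/tmp/ owner=root group=root mode=0755"),
    ad_hoc("webservers", "stat", "path=/etc/sysctl.conf"),     # includes uid and gid
    ad_hoc("webservers", "yum", "name=httpd state=present"),   # install HTTPD
    ad_hoc("webservers", "service", "name=httpd state=started"),
    ad_hoc("webservers", "file", "path=/home/f1 state=touch"),   # create the file
    ad_hoc("webservers", "file", "path=/home/f1 state=absent"),  # delete the file
]

for argv in COMMANDS:
    # shlex.join quotes the module arguments so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Keeping each command as an argument list avoids quoting mistakes if the commands are later passed to subprocess.run instead of being printed.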

Scenario 3: Deploy Nginx Automatically Using a Playbook

Task 1: Deploy Nginx Automatically Using a Playbook


Write the playbook:

2.5 Scoring Form


(This table is for reference only. Case scores will be counted in the final capability
assessment.)

Table 2-1 Scoring form

Case: Case xx, Trainee/Group xx

Item                Score  Description

Assessment point 1

Assessment point 2

Assessment point 3

Assessment point 4

Total score
Server Intelligent O&M Guide Slides

Copyright © 2019 Huawei Technologies Co., Ltd. All rights reserved.


Contents
1. Case Background

2. Installing and Configuring Ansible

3. Managing Servers in Batches Using the ad-hoc Command

4. Deploying Nginx Using a Playbook

Background
Introduction

To improve work efficiency, eliminate duplicate tasks, and reduce the risk of errors,
company Z wants to automate O&M management while keeping modifications to the
servers on the live network to a minimum. Ansible is therefore selected from the four
mainstream O&M automation tools (Puppet, SaltStack, Chef, and Ansible): it is
agentless and manages hosts over SSH, so no agent needs to be installed on the
managed servers.
Objectives
Upon completion of this course, you will be able to:

• Install, deploy, and configure Ansible.

• Manage servers in batches using the ad-hoc command.

• Deploy Nginx using a playbook.
Topology

(Figure: the controller connects through a switch to Host 01, Host 02, and Host 03.)
Installing and Configuring Ansible
Case Study

Discussion Objectives
• Task 1: Confirm the environment
• Task 2: Install Ansible
• Task 3: Install Python and log in to the system using SSH without a password
• Task 4: Configure the controlled hosts

Form of Discussion
• Activity 1: Group discussion
• Activity 2: Group presentation
• Activity 3: Comments on each other

Discussion Duration
• Group discussion: 8 minutes
• Presentation of each group: 3 minutes
• Inter-group interaction: 5 minutes

Related Knowledge
• Ansible installation
• SSH login without a password
Installing and Configuring Ansible
Task 1: Confirm Service Environment

Device OS Version IP Address Pingable from Other Hosts Remarks

Server

Managed end

Installing and Configuring Ansible
[Reference Answer]

Device        OS Version   IP Address      Pingable from Other Hosts   Remarks

Server        CentOS 7.2   192.168.1.100   Yes
Managed end   CentOS 7.2   192.168.1.101   Yes
Managed end   CentOS 7.2   192.168.1.102   Yes
Managed end   CentOS 7.2   192.168.1.103   Yes
Installing and Configuring Ansible
Task 2: Install Ansible Using Yum Commands on the Control End

Installing and Configuring Ansible
[Reference Answer]
CentOS (Yum)

1. Add the third-party EPEL repository by installing the epel-release package.

$ sudo yum install -y epel-release

2. Install Ansible.
$ sudo yum install -y ansible
Installing and Configuring Ansible
Task 3: Install Python and Configure SSH Login Without a Password

Installing and Configuring Ansible
[Reference Answer]
1. Use Yum to install the SSH server and Python on all nodes.
$ sudo yum install -y openssh-server python

2. Generate an SSH key pair on the control node:

[root@centos ~]# ssh-keygen

3. Copy the public key from the control node to each managed host:

[root@centos ~]# ssh-copy-id 192.168.1.101
[root@centos ~]# ssh-copy-id 192.168.1.102
[root@centos ~]# ssh-copy-id 192.168.1.103
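
The three ssh-copy-id calls can also be written as a loop. The following is a minimal sketch and not part of the reference answer: the `run` helper and the `DRY_RUN` switch are hypothetical names introduced here so that the commands are only printed by default, and logging in as `root` is an assumption based on the `[root@centos ~]#` prompt.

```shell
#!/bin/sh
# Managed hosts from the lab topology.
HOSTS="192.168.1.101 192.168.1.102 192.168.1.103"

# DRY_RUN=1 (the default in this sketch) only prints each command.
# Set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Generate a key pair only if none exists yet.
# An empty passphrase (-N "") is an assumption that suits a lab, not production.
[ -f "$HOME/.ssh/id_rsa" ] || run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"

# Copy the public key to every managed host.
for host in $HOSTS; do
    run ssh-copy-id "root@$host"
done
```

With `DRY_RUN=0`, ssh-copy-id prompts once for the password of each host; after that, SSH logins from the control node use the key.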

Installing and Configuring Ansible
Task 4: Modify the ansible.cfg Configuration File and Configure the
Controlled Hosts

Installing and Configuring Ansible
[Reference Answer]
# vi /etc/ansible/ansible.cfg
[defaults]
inventory = /etc/ansible/hosts
forks = 5
remote_port = 22
host_key_checking = False
timeout = 10
log_path = /var/log/ansible.log
private_key_file = /root/.ssh/id_rsa

[privilege_escalation]
become_user = root

# cat /etc/ansible/hosts
[webservers]
192.168.1.101
192.168.1.102
192.168.1.103
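
Before running ad-hoc commands, it can help to confirm that the inventory lists the expected hosts. The sketch below works on a scratch copy created with mktemp, an assumption made so that /etc/ansible/hosts is not touched; `INVENTORY` and `HOST_COUNT` are illustrative variable names. The `--list-hosts` option only prints the hosts that a pattern matches and does not connect to them.

```shell
#!/bin/sh
# Write a scratch copy of the inventory shown above.
INVENTORY="$(mktemp)"
cat > "$INVENTORY" <<'EOF'
[webservers]
192.168.1.101
192.168.1.102
192.168.1.103
EOF

# Count host lines: skip blank lines, comments, and [group] headers.
HOST_COUNT="$(grep -Ev '^[[:space:]]*(\[|#|$)' "$INVENTORY" | wc -l | tr -d ' ')"
echo "hosts in inventory: $HOST_COUNT"

# If Ansible is installed, let it resolve the group itself.
if command -v ansible >/dev/null 2>&1; then
    ansible -i "$INVENTORY" webservers --list-hosts
fi
```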

Solution Architecture Design
Case Study

Discussion Objectives
• Task 1: Test the connectivity
• Task 2: Check the NIC information
• Task 3: Execute the remote script
• Task 4: Copy the file remotely
• Task 5: Check the remote host file
• Task 6: Install HTTPD
• Task 7: Remotely start the service
• Task 8: Create and delete the file remotely

Form of Discussion
• Activity 1: Group discussion
• Activity 2: Group presentation
• Activity 3: Comments on each other

Discussion Duration
• Group discussion: 8 minutes
• Presentation of each group: 3 minutes
• Inter-group interaction: 5 minutes

Related Knowledge
• Usage of Ansible modules
• Basic Linux commands
Managing Servers in Batches Using the
ad-hoc Command
Task 1: Test the Connectivity of All Remote Host Group Webservers

Managing Servers in Batches Using the
ad-hoc Command
[Reference Answer]

[root@localhost ~]# ansible webservers -m ping

Managing Servers in Batches Using the
ad-hoc Command
Task 2: Check the Information about eth0 of the Remote Host Group
Webservers

Managing Servers in Batches Using the
ad-hoc Command
[Reference Answer]
[root@localhost ~]# ansible webservers -m command -a 'ip addr show dev eth0'

Managing Servers in Batches Using the
ad-hoc Command
Task 3: Run the Remote Host Script test.sh

Managing Servers in Batches Using the
ad-hoc Command
[Reference Answer]
[root@localhost ~]# ansible webservers -m shell -a "/home/test.sh"

Note: The /home/test.sh script must exist on each remote host and have the
execute permission. For example:
# more /home/test.sh
#!/bin/bash
echo "Welcome to Huawei Cloud"
# chmod 755 /home/test.sh
Managing Servers in Batches Using the
ad-hoc Command
Task 4: Copy the test.sh File from the Control End to the /tmp/ Directory on
the Target Host, and Set the Owner and Group of the File to root with the File
Permission rwxr-xr-x

Managing Servers in Batches Using the
ad-hoc Command
[Reference Answer]
[root@localhost ~]# ansible webservers -m copy -a "src=/home/test.sh dest=/tmp/ owner=root group=root mode=0755"

Note: src refers to the test.sh script on the control node. The mode 0755 corresponds to rwxr-xr-x.
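
The mode 0755 passed to the copy module is the octal form of rwxr-xr-x. The sketch below checks this locally on a scratch copy of test.sh; `WORKDIR`, `PERMS`, and `OUTPUT` are illustrative names, and the scratch directory is an assumption made to avoid touching /home or /tmp/test.sh.

```shell
#!/bin/sh
# Create a scratch copy of the script (WORKDIR is a throwaway directory).
WORKDIR="$(mktemp -d)"
cat > "$WORKDIR/test.sh" <<'EOF'
#!/bin/sh
echo "Welcome to Huawei Cloud"
EOF

# Apply the same mode that the copy task sets on the target hosts.
chmod 0755 "$WORKDIR/test.sh"

# The first ten characters of `ls -l` are the file type and permission bits.
PERMS="$(ls -l "$WORKDIR/test.sh" | cut -c1-10)"
echo "$PERMS"

# Run the script through sh so the check also works if /tmp is mounted noexec.
OUTPUT="$(sh "$WORKDIR/test.sh")"
echo "$OUTPUT"
```

The first line printed is `-rwxr-xr-x`: read, write, and execute for the owner, and read and execute for the group and for others.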

Managing Servers in Batches Using the
ad-hoc Command
Task 5: Check the uid and gid Information of the /etc/sysctl.conf File on the
Remote Host Group Webservers
Managing Servers in Batches Using the
ad-hoc Command
[Reference Answer]

[root@localhost ~]# ansible webservers -m stat -a "path=/etc/sysctl.conf"

Managing Servers in Batches Using the
ad-hoc Command
Task 6: Install HTTPD on All Remote Host Group Webservers

Managing Servers in Batches Using the
ad-hoc Command
[Reference Answer]
[root@localhost ~]# ansible webservers -m yum -a "name=httpd state=latest disable_gpg_check=yes enablerepo=epel"
Managing Servers in Batches Using the
ad-hoc Command
Task 7: Enable the HTTP Service for the Remote Host Group Webservers and
Check the Service Status

Managing Servers in Batches Using the
ad-hoc Command
[Reference Answer]
# Start the service:
[root@localhost ~]# ansible webservers -m service -a "name=httpd state=started"

# Check the service status:
[root@localhost ~]# ansible webservers -a "systemctl status httpd"

# Stop the service:
[root@localhost ~]# ansible webservers -m service -a "name=httpd state=stopped"
Managing Servers in Batches Using the
ad-hoc Command
Task 8: Create and Delete the /home/f1 File on the Remote Server Group
Webservers

Managing Servers in Batches Using the
ad-hoc Command
[Reference Answer]
# Create the file:
[root@localhost ~]# ansible webservers -m file -a 'path=/home/f1 state=touch'
# Delete the file:
[root@localhost ~]# ansible webservers -m file -a 'path=/home/f1 state=absent'
Solution Implementation
Case Study

Discussion Objectives
• Task 1: Deploy Nginx automatically using a playbook

Form of Discussion
• Activity 1: Group discussion
• Activity 2: Group presentation
• Activity 3: Comments on each other

Discussion Duration
• Group discussion: 100 minutes
• Presentation of each group: 10 minutes
• Inter-group interaction: 10 minutes

Related Knowledge
• Ansible module knowledge
• Playbook syntax rules
Deploying Nginx Using a Playbook
Task 1: Deploy Nginx Automatically Using a Playbook

Deploying Nginx Using a Playbook
[Reference Answer]
# main.yml
---
- hosts: webservers
  tasks:
    - name: Add repo
      yum_repository:
        name: nginx
        description: nginx repo
        baseurl: http://nginx.org/packages/centos/7/$basearch/
        gpgcheck: no
        enabled: 1
    - name: Install nginx
      yum:
        name: nginx
        state: latest
    - name: Start nginx
      service:
        name: nginx
        state: started

Execute the playbook:

# ansible-playbook main.yml
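
Because YAML is sensitive to indentation, it can be worth validating a playbook before running it against the hosts. The sketch below is one way to do that, not part of the reference answer: `validate_playbook` is a hypothetical helper name, while `--syntax-check` and `--check` are standard ansible-playbook options.

```shell
#!/bin/sh
# Hypothetical helper: validate a playbook before running it for real.
# Usage: validate_playbook main.yml
# Returns 2 if the file is missing, 3 if ansible-playbook is not installed,
# 1 if the syntax check fails, otherwise the exit status of the dry run.
validate_playbook() {
    playbook="$1"
    if [ ! -f "$playbook" ]; then
        echo "playbook not found: $playbook" >&2
        return 2
    fi
    if ! command -v ansible-playbook >/dev/null 2>&1; then
        echo "ansible-playbook is not installed on this node" >&2
        return 3
    fi
    # Parse the YAML and the task structure without contacting any host.
    ansible-playbook --syntax-check "$playbook" || return 1
    # Dry run: report what would change without changing it.
    ansible-playbook --check "$playbook"
}
```

A typical call is `validate_playbook main.yml && ansible-playbook main.yml`. In check mode the repository file is not actually written, so the install task may report that nginx is unavailable on a host where the repository has never been configured.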

Summary
Three experiment scenarios:
• Installing and Configuring Ansible

• Managing Servers in Batches Using the ad-hoc Command

• Deploying Nginx Using a Playbook
Quiz
1. Which of the following are Ansible modules?

A. copy

B. command

C. file

D. yum
More Information
https://docs.ansible.com/

Revision Record
Course Code Product Product Version Course Version

Author Date Reviewer New/Update Update Description

Lu Fangming 2019.7.25 Shui Shaolan New


Industry Solution Practice
Guide

For Trainees

Issue 1.0

Huawei Technologies Co., Ltd.


Copyright © Huawei Technologies Co., Ltd. 2019. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without
prior written consent of Huawei Technologies Co., Ltd.


Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen Postal code: 518129

Website: https://e.huawei.com/en

Industry Solution Practice Guide for Trainees Page 1

Huawei Certificate System


Huawei Certification follows the "platform + ecosystem" development strategy,
which is a new collaborative architecture of ICT infrastructure based on
"Cloud-Pipe-Terminal". Huawei has set up a complete certification system consisting
of three categories: ICT infrastructure certification, platform and service certification,
and ICT vertical certification. It is the only all-range technical certification system
in the industry.
Huawei offers three levels of certification: Huawei Certified ICT Associate (HCIA),
Huawei Certified ICT Professional (HCIP), and Huawei Certified ICT Expert (HCIE).
Huawei Certified ICT Associate-Intelligent Computing (HCIA-Intelligent
Computing) is intended for Huawei engineers in representative offices and branch
offices, and other engineers who want to learn about Huawei intelligent computing
products. The HCIA-Intelligent Computing certification covers the computing
industry, chip development history and trends, an overview of computing system
architecture, computing platform products and common technologies, and industry
solution cases and practices.
The HCIA-Intelligent Computing certification system introduces you to the
industry and market, helps you innovate, and enables you to stand at the
forefront of intelligent computing.

Contents

1 Reference Documents and Tools
1.1 Reference Documents and Tools
2 HPC Case Study
2.1 Course Introduction
2.2 Objectives
2.3 Background
2.4 Tasks
Scenario 1 Discussion on HPC
Scenario 2 Connecting Devices
Scenario 3 Acceptance Test
2.5 Score Form

1 Reference Documents and Tools

1.1 Reference Documents and Tools


Use the commands and reference documents based on the product version.

Reference documents:

1. HPC Solution V100R001C08 HPL Performance Test Guide


2. HPC Solution Deployment Guide
3. HPC Solution TaiShan Platform OpenHPC Installation and Deployment Guide
4. HPC Solution TaiShan Platform CPU Linpack Test Guide
5. HPC Solution STREAM Test Guide
6. HPC Solution TaiShan Platform IOR Test Guide
For details, see the following links:

1. https://support.huawei.com/enterprise/en/index.html

2. https://e.huawei.com/en/

2 HPC Case Study

2.1 Course Introduction


This course is a case study based on the HPC knowledge we have learned. In recent
years, universities in China have been undertaking more scientific research tasks and
have stronger requirements for computing efficiency on complex tasks. HPC, which in
the past was used by only a few scientific research institutions, has become necessary
infrastructure for many universities. The case study covers the requirement
analysis, network planning, delivery and implementation, and acceptance testing
of a specific project. Through this case study, we can consolidate and review what we
have learned before.

2.2 Objectives
• Understand the characteristics and components of the HPC solution.
• Understand how to select device models.
• Understand how to design the network of a small- and medium-sized HPC cluster.
• Understand the delivery process of an HPC basic environment.
• Understand the HPC project acceptance process.

2.3 Background
Note: The case in this document is for reference only. The actual configuration may
vary. For details, see the corresponding product documentation.

With the rapid development of computer technology and the national economy, HPC has
become a necessary tool for scientific research and plays an important role in
various basic disciplines and production systems. HPC has been applied in industrial
simulation, teaching and scientific research, energy exploration, weather forecasting,
and other fields.

Based on the project survey, M company decides to deploy an HPC cloud simulation
platform. You are the implementation engineer of this project and need to complete
several basic tasks.

This section describes the acceptance scope of the HPC solution implementation
service, including:

1. Devices involved in the project, such as servers, storage devices, and network
switching devices

2. Software involved in the project, such as OSs, parallel file system software,
application environment software, and cluster management software

3. Tools involved in the project, such as FusionServer Tools

According to the HPC solution design and implementation requirements, the Huawei
HPC solution is deployed in equipment room A. The solution provides a complete
service running platform, an HPC cloud simulation platform, centralized management
and scheduling services, and unified storage space. Huawei provides the overall
solution design, software and hardware installation service, commissioning service,
and acceptance service.

2.4 Tasks

Scenario 1 Discussion on HPC


Background
Based on the project survey, M company decides to deploy an HPC cloud simulation
platform. The storage and computing product models have been selected.

You are an engineer. Compare HPC and common computing such as server
virtualization in terms of computing, storage, and networking.

Question
What are the differences between HPC and common computing in terms of
computing, storage, and networking?

Scenario 2 Connecting Devices


Background
The compute nodes, network devices, and storage devices have been selected. Some
devices have no FlexIO card. Select FlexIO cards and fill in the physical connection
planning table.

Task 1 Identifying Components


Fill in the table with component names corresponding to the numbers in the device
rear view.

1. Provide the names of TaiShan X6000 & XA320C components.

Figure 2-1 Rear view of the TaiShan X6000 & XA320C

Table 2-1 TaiShan X6000 & XA320C components

No. Component No. Component

1 2

3 4

5 6

7 8

9 10

11 - -

2. Fill in the table with the component names of the Atlas G5500 & G560 V5.

Figure 2-2 Atlas G5500 & G560 V5

Table 2-2 Atlas G5500 & G560 V5 component names

No. Component No. Component

1 2

3 4

3. Fill in the table with the component names of the FusionServer Pro 2488H V5.

Figure 2-3 FusionServer Pro 2488H V5

Table 2-3 FusionServer Pro 2488H V5 component names

No. Component No. Component

1 2

3 4

5 6

7 8

9 10

11 12

Task 2 Adding Interface Cards


Insert the following two FlexIO cards into the G5500 server and the FusionServer Pro
2488 server respectively. Provide the schematic diagram.

FlexIO card 1

Figure 2-4 IN200 Intelligent Ethernet NIC, Standard NIC

FlexIO card 2

Figure 2-5 4 x 10GE or 4 x 25GE FlexIO

Logical diagram:

Task 3 Designing Logical Connections


Design the logical connections of the devices by drawing lines.

Figure 2-6 HPC network topology (devices to connect. Nodes: X6000 THIN, 2488 FAT, 1288 MGMT, Atlas G5500. Switches: CE8861, S5720. Storage nodes: P12X-1, P12X-2, P12X-3.)

Task 4 Planning Physical Connections


After the logical connections are designed, fill in the physical connection planning
table.

Switch ports:

Figure 2-7 S5720 ports (48 ports, numbered 1 to 48: odd numbers on the bottom row, even numbers on the top row)

Figure 2-8 CE8861 ports (two port groups, numbered 1 to 8 and 1 to 24: odd numbers on the bottom row, even numbers on the top row)

Figure 2-9 Storage node rear view

Table 2-4 Physical connection planning table

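Network Plane        Product                   Device Node   Port           Switch   Switch Port

Storage network      OceanStor 9000            P12X-1        Slot 1-0
Storage network      OceanStor 9000            P12X-2        Slot 1-0
Storage network      OceanStor 9000            P12X-3        Slot 1-0

Computing network    TaiShan X6000             XA320C-1      100GE port 1
Computing network    TaiShan X6000             XA320C-2      100GE port 1
Computing network    TaiShan X6000             XA320C-3      100GE port 1
Computing network    TaiShan X6000             XA320C-4      100GE port 1
Computing network    Atlas G5500               G560 V5       25GE port 1
Computing network    2488 V5 fat node          /             25GE port 1

IPMI network         OceanStor 9000            P12X-1        MGMT
IPMI network         OceanStor 9000            P12X-2        MGMT
IPMI network         OceanStor 9000            P12X-3        MGMT
IPMI network         TaiShan X6000             XA320C-1      MGMT
IPMI network         TaiShan X6000             XA320C-2      MGMT
IPMI network         TaiShan X6000             XA320C-3      MGMT
IPMI network         TaiShan X6000             XA320C-4      MGMT
IPMI network         Atlas G5500               G560 V5       MGMT
IPMI network         2488 V5 fat node          /             MGMT
IPMI network         1288 V5 management node   /             MGMT

Management network   OceanStor 9000            P12X-1        GE port 1
Management network   OceanStor 9000            P12X-2        GE port 1
Management network   OceanStor 9000            P12X-3        GE port 1
Management network   TaiShan X6000             XA320C-1      GE port 1
Management network   TaiShan X6000             XA320C-2      GE port 1
Management network   TaiShan X6000             XA320C-3      GE port 1
Management network   TaiShan X6000             XA320C-4      GE port 1
Management network   Atlas G5500               G560 V5       GE port 1
Management network   2488 V5 fat node          /             GE port 1
Management network   1288 V5 management node   /             GE port 1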
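
Before assigning switch ports, it may help to tally how many cables each network plane needs. The sketch below counts the rows of Table 2-4, one row per cable. Placing both the IPMI and the management plane on the 48-port S5720 is an assumption made for illustration, and all variable names are hypothetical.

```shell
#!/bin/sh
# Cable counts read off Table 2-4 (one table row = one cable).
STORAGE=3         # P12X-1 to P12X-3, Slot 1-0
COMPUTE_100GE=4   # XA320C-1 to XA320C-4
COMPUTE_25GE=2    # G560 V5 and the 2488 V5 fat node
IPMI=10           # MGMT port on every node
MGMT=10           # GE port 1 on every node

# Assumption: both GE planes share the 48-port S5720.
S5720_PORTS=48
S5720_USED=$((IPMI + MGMT))
S5720_FREE=$((S5720_PORTS - S5720_USED))

echo "S5720: $S5720_USED of $S5720_PORTS ports used, $S5720_FREE free"
echo "High-speed links: $COMPUTE_100GE x 100GE, $COMPUTE_25GE x 25GE, $STORAGE storage"
```

Under that assumption the S5720 carries 20 links and keeps 28 ports free for expansion.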

Scenario 3 Acceptance Test


Background
You are the acceptance engineer of the project. You need to complete the acceptance
of the project after the cluster software configuration and storage configuration are
complete.

Task 1 Testing the Cluster HPL Performance


1. What are the steps for testing the cluster HPL performance?

2. Which field shows the final result of the floating-point computing test?

Task 2 Testing the Performance of the File System


What are the steps for testing the file system?

2.5 Score Form


(This table is for reference only. The case scores will be recorded in the final capability
assessment.)

Scoring Item Score Description

Assessment point 1

Assessment point 2
XXX Case

XXX Assessment point 3


(Trainee/Group)
Assessment point 4

Total score
Industry Solution Practice Guide
HPC Scenario
Copyright © 2019 Huawei Technologies Co., Ltd. All rights reserved.
Contents
1. Background

2. Discussion on HPC

3. Device Connection

4. Acceptance Test

Background
With the rapid development of computer technology and the national economy,
high-performance computing (HPC) has become a necessary tool for scientific research
and is playing an important role in various basic disciplines and production systems.
HPC has been applied in industrial simulation, teaching and scientific research,
energy exploration, weather forecasting, and other fields.

Based on the project survey, M company decides to deploy an HPC cloud
simulation platform. You are the implementation engineer of this project and need
to complete several basic tasks.
Objectives
• Understand the characteristics and components of the HPC solution.
• Understand how to select device models.
• Understand how to design the network of a small- and medium-sized HPC cluster.
• Understand the delivery process of an HPC basic environment.
• Understand the HPC project acceptance process.
Differences Between HPC and Common Computing
• Background

Based on the project survey, M company decides to deploy an HPC cloud
simulation platform. The storage and computing product models have been
selected.

You are an engineer. Compare HPC and common computing such as server
virtualization in terms of computing, storage, and networking without considering
the software.

• Task 1

What are the differences between HPC and common computing in terms of
computing, storage, and networking?
Key to the HPC Discussion
An HPC system is built on three networks (management, computing, and storage) that
connect compute nodes, fat nodes, acceleration nodes, management nodes, login nodes,
and parallel file systems.
Three types of compute nodes:
• Compute nodes (thin nodes): high-performance blade servers or rack servers
• Fat nodes: SMP high-performance servers with multiple processors and large memory capacity
• GPU compute nodes: use GPGPU cards for GPU computing acceleration
Three-plane networking:
1. Computing network: used for message transmission during computing
2. Management network: used for cluster system management
3. Storage network: used for storage or data transmission
Key to the HPC Discussion
Type: MPI compute node (thin node)
Characteristics: Usually, 2-socket servers are used to form a cluster.
Application scenario: MPI cluster computing is applicable to most HPC applications. Generally, MPI nodes are the most numerous node type in a project.

Type: SMP compute node (fat node)
Characteristics: 4-socket or 8-socket servers with large memory capacity.
Application scenario: Applicable to scenarios that demand large memory on a single node. Generally, the memory size is greater than 512 GB.

Type: GPU compute node
Characteristics: Uses a GPU/PHI coprocessor for computing acceleration. Typical configurations are 1, 2, or 4 GPUs per node.
Application scenario: Some HPC applications support GPU computing acceleration, for example, some software in the life science and oil exploration fields. The NVIDIA Tesla series GPUs are recommended.
Key to the HPC Discussion
Type: NAS, NFS
Characteristics: Uses a storage-type server to deploy the NFS server; small capacity and relatively low performance. For example, deploy the NFS server by using RH2288 V3.
Application scenario: Applicable to small projects that do not require high performance.

Type: NAS, unified storage
Characteristics: Directly uses NAS or unified storage to provide storage services; supports NFS and CIFS, and provides large capacity and relatively high performance, for example, the OceanStor V3 unified storage.
Application scenario: Applicable to HPC systems with budgets below CNY2 million and without expansion plans. Required performance is less than 2 GB/s. Applicable to systems in which Windows clients access the storage.

Type: Parallel storage, Lustre storage
Characteristics: Uses RH2288 servers and OceanStor V3 FC SAN with the Intel Lustre file system. The system provides high performance and good scalability. The native system supports only Linux clients.
Application scenario: Applicable to projects with budgets of over CNY2 million for the HPC system. Required performance is 2 GB/s to 20 GB/s. All nodes accessing the storage in the cluster run Linux.

Type: Parallel storage, OceanStor 9000
Characteristics: Dedicated storage with integrated software and hardware; supports Linux and Windows access; good scalability.
Application scenario: For scenarios requiring Windows client access, OceanStor 9000 is preferred.
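
The selection criteria in the storage table can be read as a small decision procedure. The sketch below is one possible reading, not an official sizing rule: `suggest_storage` is a hypothetical function name, the thresholds (CNY 2 million, 2 GB/s, 20 GB/s) come from the table, and treating "either threshold reached" as the trigger for parallel storage is an assumption. The table gives no numeric boundary between an NFS server and unified storage, so the sketch does not try to separate them.

```shell
#!/bin/sh
# Hypothetical helper: suggest a storage type from the table's criteria.
# Usage: suggest_storage BUDGET_CNY THROUGHPUT_GBPS WINDOWS_CLIENTS
#   BUDGET_CNY       whole number, for example 1500000
#   THROUGHPUT_GBPS  required throughput as a whole number of GB/s
#   WINDOWS_CLIENTS  yes | no
suggest_storage() {
    budget="$1"
    gbps="$2"
    windows="$3"

    # Below both thresholds from the table: NAS-class storage is enough.
    if [ "$budget" -lt 2000000 ] && [ "$gbps" -lt 2 ]; then
        echo "NAS (NFS server or unified storage)"
        return 0
    fi

    # Parallel storage with Windows clients points to OceanStor 9000.
    if [ "$windows" = "yes" ]; then
        echo "OceanStor 9000"
        return 0
    fi

    # Linux-only clients: the table lists Lustre up to 20 GB/s.
    if [ "$gbps" -le 20 ]; then
        echo "Lustre storage"
    else
        echo "outside the ranges listed in the table"
    fi
}
```

For example, `suggest_storage 3000000 10 no` prints `Lustre storage`, and the same call with `yes` as the last argument prints `OceanStor 9000`.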

Key to the HPC Discussion
Type: Out-of-band management network
Characteristics: BMC hardware management network, which is usually a Fast Ethernet (FE) or GE network. Through out-of-band management, the network implements functions including hardware power-on and power-off and hardware device monitoring. Generally, the hardware management network is connected to the system management network because the cluster management software and dual-node cluster HA software need to communicate with the BMC.

Type: Management network
Characteristics: Used by the cluster management software to implement system management functions; generally a GE network.

Type: Computing network
Characteristics: Network for computing communication between cluster nodes. Generally, low latency and high bandwidth are required. In most cases, it is an InfiniBand network; 10GE and 40GE networks are used in some scenarios.

Type: Storage network
Characteristics: The network that compute nodes use to access the storage. Generally, the data network and the computing network are combined and share the same link.
Device Connection
Background
The compute nodes, network devices, and storage devices have been
selected. Some devices have no FlexIO card. Select FlexIO cards and
fill in the physical connection planning table.

Task 1 Identifying Components
Fill in the table with component names corresponding to the numbers in the device rear view.
Step 1:
Rear view of the TaiShan X6000 & XA320C

No. Component No. Component

1 2

3 4

5 6

7 8

9 10

11 - -

Task 1 Identifying Components
Key:

No.   Component                           No.   Component

1     Mezzanine card                      2     Water outlet
3     Water inlet                         4     Standard PCIe card
5     Universal connector port            6     LOM port 1 (GE electrical port)
7     LOM port 2 (GE electrical port)     8     Power button/indicator
9     iBMC management network port        10    Label (including the SN)
11    LOM port 3 (100GE optical port)     -     -
Task 1 Identifying Components
Step 2
Rear view of the Atlas G5500 & G560 V5

No. Component No. Component

1 2

3 4

Task 1 Identifying Components
Key:

No.  Component                   No.  Component
1    Chassis management module   2    I/O module
3    Fan module                  4    Power module

Task 1 Identifying Components
Step 3
Rear view of the FusionServer Pro 2488H V5

No. Component No. Component

1 2

3 4

5 6

7 8

9 10

11 12

Task 1 Identifying Components
Key:

No.  Component                 No.  Component
1    PCIe slot 1               2    PCIe slot 2
3    PSU socket                4    USB 3.0 port
5    GE electrical port        6    10GE optical port
7    Management network port   8    Serial port
9    VGA port                  10   PCIe slots (slots 3 to 11 from left to right)
11   PSU 1                     12   PSU 2

Task 2 Adding Interface Cards
Insert the following two FlexIO cards into the G5500 server and the FusionServer Pro
2488 server respectively, and provide the schematic diagram.

IN200 Intelligent Ethernet NIC, standard NIC
4 x 10GE or 4 x 25GE FlexIO card

Task 2 Adding Interface Cards
Key:

Task 3 Designing Logical Connections
Design the logical connections of the devices by drawing lines.

Nodes: X6000 THIN, 2488 FAT, 1288 MGMT, Atlas G5500
Switches: CE8861, S5720
Storage nodes: P12X-1, P12X-2, P12X-3

Task 3 Designing Logical Connections
Key:

Nodes: X6000 THIN, 2488 FAT, 1288 MGMT, Atlas G5500
Switches: CE8861, S5720
Storage nodes: P12X-1, P12X-2, P12X-3
[Figure: logical connection diagram. Legend: Management/IPMI, Computing/Network]

Task 4 Planning Physical Connections
After the logical connections are designed, plan the physical connections and fill in the table.
Switch ports:
S5720: ports 1 to 48 in two rows (even numbers on top, odd numbers on the bottom)
CE8861: one group of ports 1 to 8 and one group of ports 1 to 24, each in two rows (even numbers on top, odd numbers on the bottom)

Rear view of a storage node:

Task 4 Planning Physical Connections

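Fill in the physical connection planning table in the manual.

Network Plane       Product                  Device Node  Port          Switch  Switch Port
Storage network     OceanStor 9000           P12X-1       Slot 1-0
Storage network     OceanStor 9000           P12X-2       Slot 1-0
Storage network     OceanStor 9000           P12X-3       Slot 1-0
Computing network   TaiShan X6000            XA320C-1     100GE port 1
Computing network   TaiShan X6000            XA320C-2     100GE port 1
Computing network   TaiShan X6000            XA320C-3     100GE port 1
Computing network   TaiShan X6000            XA320C-4     100GE port 1
Computing network   Atlas G5500              G560 V5      25GE port 1
Computing network   2488 V5 fat node         /            25GE port 1
IPMI network        OceanStor 9000           P12X-1       MGMT
IPMI network        OceanStor 9000           P12X-2       MGMT
IPMI network        OceanStor 9000           P12X-3       MGMT
IPMI network        TaiShan X6000            XA320C-1     MGMT
IPMI network        TaiShan X6000            XA320C-2     MGMT
IPMI network        TaiShan X6000            XA320C-3     MGMT
IPMI network        TaiShan X6000            XA320C-4     MGMT
IPMI network        Atlas G5500              G560 V5      MGMT
IPMI network        2488 V5 fat node         /            MGMT
IPMI network        1288 V5 management node  /            MGMT
Management network  OceanStor 9000           P12X-1       GE port 1
Management network  OceanStor 9000           P12X-2       GE port 1
Management network  OceanStor 9000           P12X-3       GE port 1
Management network  TaiShan X6000            XA320C-1     GE port 1
Management network  TaiShan X6000            XA320C-2     GE port 1
Management network  TaiShan X6000            XA320C-3     GE port 1
Management network  TaiShan X6000            XA320C-4     GE port 1
Management network  Atlas G5500              G560 V5      GE port 1
Management network  2488 V5 fat node         /            GE port 1
Management network  1288 V5 management node  /            GE port 1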

Task 4 Planning Physical Connections
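Key:

Network Plane       Product                  Device Node  Port          Switch  Switch Port
Storage network     OceanStor 9000           P12X-1       Slot 1-0      CE8861  25GE 2/1
Storage network     OceanStor 9000           P12X-2       Slot 1-0      CE8861  25GE 2/2
Storage network     OceanStor 9000           P12X-3       Slot 1-0      CE8861  25GE 2/3
Computing network   TaiShan X6000            XA320C-1     100GE port 1  CE8861  100GE 1/1
Computing network   TaiShan X6000            XA320C-2     100GE port 1  CE8861  100GE 1/2
Computing network   TaiShan X6000            XA320C-3     100GE port 1  CE8861  100GE 1/3
Computing network   TaiShan X6000            XA320C-4     100GE port 1  CE8861  100GE 1/4
Computing network   Atlas G5500              G560 V5      25GE port 1   CE8861  25GE 2/4
Computing network   2488 V5 fat node         /            25GE port 1   CE8861  25GE 2/5
IPMI network        OceanStor 9000           P12X-1       MGMT          S5720   GE 1
IPMI network        OceanStor 9000           P12X-2       MGMT          S5720   GE 2
IPMI network        OceanStor 9000           P12X-3       MGMT          S5720   GE 3
IPMI network        TaiShan X6000            XA320C-1     MGMT          S5720   GE 4
IPMI network        TaiShan X6000            XA320C-2     MGMT          S5720   GE 5
IPMI network        TaiShan X6000            XA320C-3     MGMT          S5720   GE 6
IPMI network        TaiShan X6000            XA320C-4     MGMT          S5720   GE 7
IPMI network        Atlas G5500              G560 V5      MGMT          S5720   GE 8
IPMI network        2488 V5 fat node         /            MGMT          S5720   GE 9
IPMI network        1288 V5 management node  /            MGMT          S5720   GE 10
Management network  OceanStor 9000           P12X-1       GE port 1     S5720   GE 11
Management network  OceanStor 9000           P12X-2       GE port 1     S5720   GE 12
Management network  OceanStor 9000           P12X-3       GE port 1     S5720   GE 13
Management network  TaiShan X6000            XA320C-1     GE port 1     S5720   GE 14
Management network  TaiShan X6000            XA320C-2     GE port 1     S5720   GE 15
Management network  TaiShan X6000            XA320C-3     GE port 1     S5720   GE 16
Management network  TaiShan X6000            XA320C-4     GE port 1     S5720   GE 17
Management network  Atlas G5500              G560 V5      GE port 1     S5720   GE 18
Management network  2488 V5 fat node         /            GE port 1     S5720   GE 19
Management network  1288 V5 management node  /            GE port 1     S5720   GE 20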
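
A planning table like the key above can be checked automatically before cabling starts. The following is a minimal Python sketch, not part of the original lab material: the `Link` type, the `PLAN` list, and the `find_port_conflicts` helper are hypothetical names introduced for illustration, and `PLAN` holds only a few rows copied from the key. It reports any switch port that has been assigned to more than one node.

```python
from collections import Counter
from typing import NamedTuple


class Link(NamedTuple):
    """One row of the physical connection planning table (hypothetical type)."""
    plane: str        # network plane, for example "Storage"
    node: str         # device node name from the table
    node_port: str    # port on the device node
    switch: str       # switch model
    switch_port: str  # port on the switch


# A subset of rows copied from the key; extend with the remaining rows as needed.
PLAN = [
    Link("Storage", "P12X-1", "Slot 1-0", "CE8861", "25GE 2/1"),
    Link("Storage", "P12X-2", "Slot 1-0", "CE8861", "25GE 2/2"),
    Link("Computing", "XA320C-1", "100GE port 1", "CE8861", "100GE 1/1"),
    Link("Computing", "G560 V5", "25GE port 1", "CE8861", "25GE 2/4"),
    Link("IPMI", "P12X-1", "MGMT", "S5720", "GE 1"),
    Link("Management", "P12X-1", "GE port 1", "S5720", "GE 11"),
]


def find_port_conflicts(plan):
    """Return the (switch, switch_port) pairs that appear in more than one row."""
    counts = Counter((link.switch, link.switch_port) for link in plan)
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    conflicts = find_port_conflicts(PLAN)
    if conflicts:
        for switch, port in conflicts:
            print(f"Conflict: {switch} {port} is assigned more than once")
    else:
        print("No switch port is assigned twice")
```

Keeping each row as a named tuple mirrors the table columns, so the same list could also be used to print cabling labels or to check that every node has a link on each required network plane.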


Acceptance Test
Background

You are the acceptance engineer for the project. After the cluster software
configuration and storage configuration are complete, you need to perform
the project acceptance.

Task 1 Testing the Cluster HPL Performance

1. What are the steps for testing the cluster HPL performance?

2. Which field shows the final result of the floating-point computing test?

Task 1 Testing the Cluster HPL Performance
Key:

1. For details, see the HPC Solution TaiShan Platform CPU Linpack Test Guide.

2. WC00C2R2
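
In the standard HPL output, the result row that begins with a tag such as WC00C2R2 carries the measured performance in its last column (Gflops). The sketch below shows one way to pull that value out of a saved HPL log. It is an illustration under stated assumptions: the function names `parse_hpl_results` and `best_gflops` are hypothetical, the row layout (T/V, N, NB, P, Q, Time, Gflops) is assumed to be the standard one, and the numbers in `SAMPLE` are made up rather than measured.

```python
# Illustrative sample only; the values are invented, not test results.
SAMPLE = """\
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC00C2R2       40000   192     2     2             123.45             3.4567e+02
"""


def parse_hpl_results(text):
    """Return one dict per HPL result row found in the text."""
    results = []
    for line in text.splitlines():
        fields = line.split()
        # A result row has seven columns and starts with the T/V tag (W...).
        if len(fields) != 7 or not fields[0].startswith("W"):
            continue
        try:
            results.append({
                "variant": fields[0],
                "n": int(fields[1]),
                "nb": int(fields[2]),
                "p": int(fields[3]),
                "q": int(fields[4]),
                "time_s": float(fields[5]),
                "gflops": float(fields[6]),
            })
        except ValueError:
            # Not a numeric result row; skip it.
            continue
    return results


def best_gflops(text):
    """Return the highest Gflops value in the log, or None if no row is found."""
    rows = parse_hpl_results(text)
    return max((row["gflops"] for row in rows), default=None)


if __name__ == "__main__":
    print(best_gflops(SAMPLE))
```

The actual test procedure and pass criteria remain those of the referenced Linpack test guide; this sketch only automates reading the result field.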

Task 2 Testing the Performance of the File System
What are the steps for testing the file system?

Task 2 Testing the Performance of the File System
Key:

For details, see the HPC Solution TaiShan Platform IOR Test Guide.
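
Once an IOR run has finished, the bandwidth figures can be collected from its log for the acceptance report. The sketch below is a hedged illustration, not a step from the referenced guide: it assumes the log contains summary lines of the form `Max Write: <value> MiB/sec` and `Max Read: <value> MiB/sec`, the function name `parse_ior_summary` is hypothetical, and the figures in `SAMPLE` are invented.

```python
import re

# Illustrative sample only; the values are invented, not test results.
SAMPLE = """\
Max Write: 1000.00 MiB/sec (1048.58 MB/sec)
Max Read:  2000.00 MiB/sec (2097.15 MB/sec)
"""

# Assumed summary line format: "Max Write: <number> MiB/sec ..."
SUMMARY_LINE = re.compile(r"^Max (Write|Read):\s+([0-9.]+)\s+MiB/sec")


def parse_ior_summary(text):
    """Return {'write': MiB/s, 'read': MiB/s} for the summary lines found."""
    result = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            result[match.group(1).lower()] = float(match.group(2))
    return result


if __name__ == "__main__":
    summary = parse_ior_summary(SAMPLE)
    for operation, bandwidth in summary.items():
        print(f"{operation}: {bandwidth} MiB/s")
```

If the IOR version in use prints its summary differently, only the regular expression needs to change.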

Summary
This course covers the following contents:

1. Background

2. Discussion on HPC

3. Device Connection

4. Acceptance Test

By completing these tasks, you learn the server device models and the basic networking rules.

References and Tools
Reference documents:
1. HPC Solution V100R001C08 HPL Performance Test Guide
2. HPC Solution Deployment Guide
3. HPC Solution TaiShan Platform OpenHPC Installation and Deployment Guide
4. HPC Solution TaiShan Platform CPU Linpack Test Guide
5. HPC Solution STREAM Test Guide
6. HPC Solution TaiShan Platform IOR Test Guide
For details, see the following links:
https://support.huawei.com/enterprise/en/index.html
https://e.huawei.com/en/

Thank You
www.huawei.com
