General HPC Architecture On AWS

This document describes a general HPC architecture on AWS using a lift and shift approach. It involves installing AWS ParallelCluster to provision HPC resources defined in a configuration file. This allows building an HPC system in AWS similar to an on-premises environment without much change for users.


A series of processes that constitute high-performance computing (HPC) on AWS with a lift and shift approach.

• This architecture can be applied when the on-premises HPC architecture is migrated to the AWS Cloud using the lift and shift method.
• This method is called a traditional architecture, and its advantage is that users who used HPC systems in an on-premises environment can build and use HPC systems in the AWS Cloud environment without much burden.
• It is almost the same as the on-premises environment, except that the resources are defined in the form of a script using AWS ParallelCluster.
• Use this architecture to intuitively configure your HPC system in the AWS Cloud environment and use it to perform your simulations.

[Architecture diagram: the company data center (HPC user, on-premises storage, AWS ParallelCluster install, configure file setting) connects to an AWS Region via Snowball, DX/VPN, etc. for import/export. The Region contains AWS CloudFormation, Amazon CloudWatch, AWS Auto Scaling, and a VPC with the head node (scheduler and job queue), compute nodes forming the HPC cluster with Amazon EBS, a network file system (NFS) share, and Amazon FSx for Lustre; an Amazon S3 bucket holds exported results. Numbered callouts 1-13 correspond to the steps below.]

1. Install AWS ParallelCluster, which is used to provision HPC resources.
2. Use the installed AWS ParallelCluster to define the resources you want to provision in the form of a script. This is called the "configure file".
3. Provision the configure file defined in Step 2 with an AWS ParallelCluster command.
4. The actual provisioning of resources is performed through an infrastructure as code (IaC) service called AWS CloudFormation, linked with AWS ParallelCluster.
5. When provisioning is complete, the defined resources are created: a head node (including the defined scheduler) and a file system (Amazon FSx for Lustre).
6. To perform the simulation, the user connects to the created head node through a secure shell protocol (SSH) or DCV connection.
7. Create a job script on the head node and submit it to the scheduler already installed on the head node. The job is queued until it is processed.
8. The amount of computing power defined in the job script is allocated to process the job.
9. A compute cluster to process the job in the queue is created, and computing is performed.
10. The created cluster nodes and various HPC resources are monitored through a monitoring service called Amazon CloudWatch.
11. The processed results can be stored in Amazon Simple Storage Service (Amazon S3) and sent to the on-premises environment if necessary.
12. If necessary, you can do post-processing with DCV without transmitting the result data to the on-premises environment.
13. When there are no more jobs to process, the cluster is deleted.
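The sketches that follow illustrate how the numbered steps above might be carried out on the command line, assuming AWS ParallelCluster 3.x with the Slurm scheduler; every concrete name used (the cluster name hpc-demo, subnet IDs, key names, bucket names, and file paths) is a placeholder rather than something taken from this architecture. For Step 1, the ParallelCluster CLI is commonly installed with pip, for example:

    # Step 1: install the AWS ParallelCluster CLI in a virtual environment
    # (pip-based install; assumes Python 3 is available on the workstation).
    python3 -m venv ~/pcluster-env
    source ~/pcluster-env/bin/activate
    pip install --upgrade aws-parallelcluster
    pcluster version   # confirm the installation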
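For Step 2, a minimal "configure file" might look like the following sketch in the ParallelCluster 3 YAML format; the instance types, subnet ID, key name, and FSx for Lustre capacity are illustrative values only:

    # Step 2: define the resources to provision in a configure file.
    # All values below are placeholders to adapt to your own account and VPC.
    cat > cluster-config.yaml <<'EOF'
    Region: us-east-1
    Image:
      Os: alinux2
    HeadNode:
      InstanceType: c5.xlarge
      Networking:
        SubnetId: subnet-0123456789abcdef0
      Ssh:
        KeyName: my-ssh-key
      Dcv:
        Enabled: true
    Scheduling:
      Scheduler: slurm
      SlurmQueues:
        - Name: compute
          ComputeResources:
            - Name: c5n-large
              InstanceType: c5n.large
              MinCount: 0
              MaxCount: 16
          Networking:
            SubnetIds:
              - subnet-0123456789abcdef0
    SharedStorage:
      - MountDir: /fsx
        Name: fsx-scratch
        StorageType: FsxLustre
        FsxLustreSettings:
          StorageCapacity: 1200
    EOF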
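For Step 3, the provisioning command in ParallelCluster 3 would be along these lines (hypothetical cluster name hpc-demo):

    # Step 3: ask ParallelCluster to provision the cluster described in the configure file.
    pcluster create-cluster \
      --cluster-name hpc-demo \
      --cluster-configuration cluster-config.yaml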
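For Steps 4 and 5, ParallelCluster drives an AWS CloudFormation stack behind the scenes; either view below can be used to follow provisioning until the head node and FSx for Lustre file system exist (that the stack carries the cluster name is an assumption of this sketch):

    # Steps 4-5: watch provisioning from the ParallelCluster or CloudFormation side.
    pcluster describe-cluster --cluster-name hpc-demo
    aws cloudformation describe-stacks \
      --stack-name hpc-demo \
      --query 'Stacks[0].StackStatus'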
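For Step 6, the head node can be reached over SSH, for example:

    # Step 6: connect to the head node over SSH with the key pair
    # named in the configure file (placeholder key path shown).
    pcluster ssh --cluster-name hpc-demo -i ~/.ssh/my-ssh-key.pem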
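For Step 7, assuming the Slurm scheduler chosen in the configure file, a minimal job script and submission on the head node might look like this (the solver command is a placeholder):

    # Step 7: create a job script on the head node and submit it to Slurm.
    cat > run-sim.sbatch <<'EOF'
    #!/bin/bash
    #SBATCH --job-name=sim
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=2
    #SBATCH --output=/fsx/sim-%j.out
    srun hostname   # placeholder for the real simulation command
    EOF
    sbatch run-sim.sbatch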
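For Steps 8 and 9, the job waits in the queue while compute nodes are brought up to match the resources requested in the job script; standard Slurm commands show the progression:

    # Steps 8-9: the job stays pending (PD) until compute nodes join, then runs (R).
    squeue   # job state in the queue
    sinfo    # compute nodes appearing as they are provisioned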
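For Step 10, cluster monitoring data lands in Amazon CloudWatch; a rough way to locate the dashboard and log groups created for the cluster (the log-group prefix is an assumption, as naming varies by version) is:

    # Step 10: locate CloudWatch resources associated with the cluster.
    aws cloudwatch list-dashboards
    aws logs describe-log-groups --log-group-name-prefix /aws/parallelcluster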
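For Step 11, results on the shared file system can be copied to an S3 bucket (bucket name is a placeholder) and pulled back to the on-premises environment from there if needed:

    # Step 11: push results from the FSx for Lustre mount to Amazon S3.
    aws s3 sync /fsx/results s3://my-results-bucket/hpc-demo/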
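For Step 12, if DCV was enabled on the head node in the configure file, a remote desktop session for post-processing can be opened without moving the result data, for example:

    # Step 12: open a DCV session to the head node for graphical post-processing.
    pcluster dcv-connect --cluster-name hpc-demo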
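For Step 13, the cluster is removed once no jobs remain:

    # Step 13: tear down the cluster when there are no more jobs to process.
    pcluster delete-cluster --cluster-name hpc-demo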

© 2021, Amazon Web Services, Inc. or its affiliates. All rights reserved. AWS Reference Architecture