Slurm Usage Guide

Concept

SSH flow: first connect to hanoi, then connect to login-sp.vinai-systems.com.

Log in with your AD account.

ssh hanoi
ssh <username>@login-sp.vinai-systems.com
Ex: ssh [email protected]

HOME_FOLDER_ISILON <=> /home/your_username (on login node) <=> /vinai/your_username

SUPERPOD_STORAGE_DDN_FOLDER <=> /lustre/scratch/client (on all nodes)

PERSONAL_STORAGE_DDN_FOLDER <=> /lustre/scratch/client/vinai/user/your_username

You have to put your training data in DDN storage; the Isilon HOME folder is intended for
long-term data archiving.
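
For example (the paths are placeholders), copying a dataset from your Isilon home folder to your personal DDN folder could look like this:

rsync -avh /home/your_username/datasets/ /lustre/scratch/client/vinai/user/your_username/datasets/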

Introduction
Slurm is an open-source job-scheduling system for Linux clusters, most frequently used for
high-performance computing (HPC) applications. This guide covers the basics of getting
started with Slurm as a user. For more information, the Slurm docs are a good place to
start.

After Slurm is deployed on a cluster, a slurmd daemon runs on each compute node. Users do not
log directly into each compute node to do their work. Instead, they execute Slurm commands
(e.g. srun, sinfo, scancel, scontrol) from a Slurm login node. These commands communicate with
the slurmd daemons on each host to perform the work.

Simple Commands

Cluster state with sinfo

To "see" the cluster, ssh to the Slurm login node for your cluster and run the `sinfo`
command:
dgxuser@sdc2-hpc-login-mgmt001:~$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
batch* up 1-00:00:00 8 idle sdc2-hpc-dgx-a100-[001-008]
batch* up 1-00:00:00 2 down sdc2-hpc-dgx-a100-[013,015]
Eight nodes in this cluster are idle and available; two nodes (013 and 015) are down. If a node
is busy, its state will change from idle to alloc; if a node goes down, its state will change
from idle to down.
dgxuser@sdc2-hpc-login-mgmt001:~$ sinfo -lN
Fri Jul 16 10:47:52 2021
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT
AVAIL_FE REASON
sdc2-hpc-dgx-a100-001 1 batch* idle 256 2:64:2 103100 0 1 (null) none
sdc2-hpc-dgx-a100-002 1 batch* idle 256 2:64:2 103100 0 1 (null) none
sdc2-hpc-dgx-a100-003 1 batch* idle 256 2:64:2 103100 0 1 (null) none
sdc2-hpc-dgx-a100-004 1 batch* idle 256 2:64:2 103100 0 1 (null) none
sdc2-hpc-dgx-a100-005 1 batch* idle 256 2:64:2 103100 0 1 (null) none
sdc2-hpc-dgx-a100-006 1 batch* idle 256 2:64:2 103100 0 1 (null) none
sdc2-hpc-dgx-a100-007 1 batch* idle 256 2:64:2 103100 0 1 (null) none
sdc2-hpc-dgx-a100-008 1 batch* idle 256 2:64:2 103100 0 1 (null) none
sdc2-hpc-dgx-a100-013 1 batch* down 256 2:64:2 103100 0 1 (null) VinAI use
sdc2-hpc-dgx-a100-015 1 batch* down 256 2:64:2 103100 0 1 (null) VinAI use

The `sinfo` command can be used to output a lot more information about the cluster. Check out
the sinfo doc for more information.
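
For instance, the output can be summarized or its columns customized with the -o/--format option (the format string below is only one possible choice):

sinfo -s                              # one summary line per partition
sinfo -o "%P %a %l %D %t %N"          # partition, availability, time limit, node count, state, node list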

Running a job with srun


To run a job, use the srun command:
dgxuser@sdc2-hpc-login-mgmt001:~$ srun --partition=batch --gres=gpu:8 env | grep CUDA
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
dgxuser@sdc2-hpc-login-mgmt001:~$ srun --partition=batch --ntasks 8 -l hostname
5: sdc2-hpc-dgx-a100-001
2: sdc2-hpc-dgx-a100-001
7: sdc2-hpc-dgx-a100-001
6: sdc2-hpc-dgx-a100-001
0: sdc2-hpc-dgx-a100-001
3: sdc2-hpc-dgx-a100-001
1: sdc2-hpc-dgx-a100-001
4: sdc2-hpc-dgx-a100-001
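
srun can also spread tasks across several nodes; for example, the following (assuming at least two idle nodes in the batch partition) runs one task on each of two nodes and prints their hostnames:

srun --partition=batch --nodes=2 --ntasks-per-node=1 -l hostname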

Running an interactive job


Especially when developing and experimenting, it's helpful to run an interactive job, which
requests a resource and provides a command prompt as an interface to it (the example below
requests a maximum run time of 2 hours):

dgxuser@sdc2-hpc-login-mgmt001:~$ srun --partition=batch --time=02:00:00 --pty /bin/bash


dgxuser@sdc2-hpc-dgx-a100-001:~$ hostname
sdc2-hpc-dgx-a100-001
dgxuser@sdc2-hpc-dgx-a100-001:~$ exit

In interactive mode, the resource stays reserved until the prompt is exited (as shown above),
and commands can be run in succession.
Note: before starting an interactive session with srun, it may be helpful to create a session
on the login node with a tool like tmux or `screen`. This prevents you from losing the
interactive job if there is a network outage or the terminal is closed; see the sketch below.
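A typical workflow (the session name is arbitrary) might look like:

tmux new -s interactive
srun --partition=batch --time=02:00:00 --pty /bin/bash
# ... work inside the interactive job, detach with Ctrl-b d ...
tmux attach -t interactive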
More Advanced Use

Run a batch job


While the srun command blocks the terminal until it finishes, sbatch queues a job for execution
once resources become available in the cluster. A batch job also lets you queue up several jobs
that run as nodes become available. It is therefore good practice to encapsulate everything that
needs to be run in a script and submit it with sbatch rather than srun. Note that #SBATCH
directives must appear before the first executable command in the script.
Example: running a Python job

dgxuser@sdc2-hpc-login-mgmt001:~$ cat script.sh

#!/bin/bash -e
#SBATCH --job-name=demo              # create a short name for your job
#SBATCH --output=/lustre/scratch/client/vinai/users/youruser/yourfolder/slurm_%A.out   # output file
#SBATCH --error=/lustre/scratch/client/vinai/users/youruser/yourfolder/slurm_%A.err    # error file
#SBATCH --partition=batch            # choose partition (batch or phase2)
#SBATCH --gpus=1                     # GPU count
#SBATCH --nodes=1                    # node count
#SBATCH --mem-per-cpu=2G             # memory per CPU core (4G is the default)
#SBATCH --cpus-per-gpu=8             # CPU cores per GPU
#SBATCH --mail-type=all              # mail events: begin, end, fail, requeue, all
#SBATCH [email protected]        # your email

python3 demo.py
dgxuser@sdc2-hpc-login-mgmt001:~$ sbatch script.sh

Resources can be requested in several different ways:

sbatch/srun option        Description

-N, --nodes=              Total number of nodes to request
-n, --ntasks=             Total number of tasks to request
--ntasks-per-node=        Number of tasks per node
--gpus-per-node=          Number of GPUs per node
-G, --gpus=               Total number of GPUs to allocate for the job
--gpus-per-task=          Number of GPUs per task
--cpus-per-task=          Number of CPU cores per task
--exclusive               Guarantee that nodes are not shared among jobs
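
For example, a multi-node request that combines several of these options (the values are only illustrative) could look like:

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --gpus-per-node=8
#SBATCH --cpus-per-task=8
#SBATCH --exclusive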

Observing running jobs with squeue


To see which jobs are running in the cluster, use the `squeue` command:
dgxuser@sdc2-hpc-login-mgmt001:~$ squeue -a -l
Fri Jul 16 11:01:38 2021
JOBID PARTITION NAME USER STATE TIME TIME_LIMI NODES NODELIST(REASON)
125 batch demo dgxuser COMPLETI 0:09 1-00:00:00 1 sdc2-hpc-dgx-a100-001
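
squeue can also be filtered, for example to show only your own jobs or a specific job ID:

squeue -u $USER        # jobs belonging to the current user
squeue -j 125          # a specific job ID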

Cancel a job with scancel

First find the job ID with squeue, then cancel the job:

dgxuser@sdc2-hpc-login-mgmt001:~$ squeue
dgxuser@sdc2-hpc-login-mgmt001:~$ scancel JOBID
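
scancel can also cancel jobs by user or by job name, for example:

scancel -u $USER             # cancel all of your own jobs
scancel --name=demo          # cancel jobs with a given job name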

Running a job with modules


List of available modules

dgxuser@sdc2-hpc-login-mgmt001:~$ module avail

------------------------------------------------ /sw/modules/all ------------------------------------------------
mpi/3.0.6 python/2.7.18 python/3.6.10 python/3.8.10 python/miniconda3/miniconda3
python/pytorch/1.9.0+cu111 python/tensorflow/2.3.0

Use "module spider" to find all possible modules.


Use "module keyword key1 key2 ..." to search for all possible modules matching any of the
"keys".

Create your environment


dgxuser@sdc2-hpc-login-mgmt001:~$ module load python/miniconda3/miniconda3
dgxuser@sdc2-hpc-login-mgmt001:~$ conda create -p /lustre/scratch/client/vinai/users/youruser/yourfolder python=yourversion
dgxuser@sdc2-hpc-login-mgmt001:~$ conda activate /lustre/scratch/client/vinai/users/youruser/yourfolder

Because the environment is created with a prefix (-p), activate it by its full path.

Install the libraries and packages you need (pip is preferred). Export the proxy settings if you
have problems with the Internet connection:
export HTTP_PROXY=http://proxytc.vingroup.net:9090/
export HTTPS_PROXY=http://proxytc.vingroup.net:9090/
export http_proxy=http://proxytc.vingroup.net:9090/
export https_proxy=http://proxytc.vingroup.net:9090/
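
With the proxy exported and your environment activated, packages can then be installed as usual (the package names below are only examples):

pip install numpy torch torchvision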
Example: running a job on 1 A100 node with 4 GPUs:

dgxuser@sdc2-hpc-login-mgmt001:~$ cat conda.sh


#!/bin/bash -e
#SBATCH --job-name=py-job
#SBATCH --output=/lustre/scratch/client/vinai/users/youruser/yourfolder/slurm_%A.out
#SBATCH --error=/lustre/scratch/client/vinai/users/youruser/yourfolder/slurm_%A.err
#SBATCH --gpus=4
#SBATCH --nodes=1
#SBATCH --mem-per-gpu=36G
#SBATCH --cpus-per-gpu=8
#SBATCH --partition=batch            # choose partition (batch or phase2)
#SBATCH --mail-type=all
#SBATCH [email protected]        # your email

module purge
module load python/miniconda3/miniconda3
eval "$(conda shell.bash hook)"
conda activate /lustre/scratch/client/vinai/users/youruser/yourfolder

command ...
dgxuser@sdc2-hpc-login-mgmt001:~$ sbatch conda.sh

Running a job with a Docker container


List of available containers on harbor.vinai-systems.com

harbor.vinai-systems.com/library/dc-miniconda:3-cuda10.0-cudnn7-ubuntu18.04
harbor.vinai-systems.com/library/cuda:10.0-cudnn7-ubuntu18.04
harbor.vinai-systems.com/library/pytorch:1.4.0-python3.7-cuda10.1-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/dc-tensorflow:1.14.0-python3.7-cuda10.0-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/dc-python:3.6-cuda10.0-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/dc-tf-torch:1.15.0-1.4.0-python2.7-cuda10.0-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/dc-miniconda:3-cuda10.1-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/miniconda:3-cuda10.1-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/dc-pytorch:1.4.0-python3.7-cuda10.0-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/dc-miniconda:3-cuda10.0-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/miniconda:3-cuda10.0-cudnn7-ubuntu16.04
harbor.vinai-systems.com/library/pytorch:1.4.0-python3.7-cuda10.0-cudnn7-ubuntu16.04

You can build your own image from an nvcr.io base image; a Dockerfile example is in the attached ZipFile.
On the login node:

docker login harbor.vinai-systems.com    (use your login-node account)
docker tag your_image harbor.vinai-systems.com/library/your_image:your_tag
docker push harbor.vinai-systems.com/library/your_image:your_tag

Contact the admin if you need an account for harbor.vinai-systems.com.

You can run a container job as in the following example:

dgxuser@sdc2-hpc-login-mgmt001:~$ cat container.sh


#!/bin/bash -e
#SBATCH --job-name=container-job
#SBATCH --output=/lustre/scratch/client/vinai/users/youruser/yourfolder/slurm_%A.out
#SBATCH --error=/lustre/scratch/client/vinai/users/youruser/yourfolder/slurm_%A.err
#SBATCH --gpus=2
#SBATCH --nodes=1
#SBATCH --mem-per-gpu=36G
#SBATCH --cpus-per-gpu=8
#SBATCH --partition=batch
#SBATCH --mail-type=all
#SBATCH [email protected] //your email
srun --container-image="harbor.vinai-systems.com#library/cuda:10.0-cudnn7-ubuntu18.04" \
--container-mounts=lustre_folder:container_folder \
python …
dgxuser@sdc2-hpc-login-mgmt001:~$ sbatch container.sh

Note: save your checkpoints to your Lustre (DDN) folder.
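
As an illustration (the image is taken from the list above; the mount paths and script name are placeholders), mounting your personal DDN folder into the container and writing checkpoints there might look like:

srun --container-image="harbor.vinai-systems.com#library/pytorch:1.4.0-python3.7-cuda10.1-cudnn7-ubuntu16.04" \
     --container-mounts=/lustre/scratch/client/vinai/users/youruser/yourfolder:/workspace \
     python /workspace/train.py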
