
Introduction to MPI

Preeti Malakar
[email protected]

An Introductory Course on High-Performance


Computing in Science and Engineering
25th February 2019
1
Parallel Programming Models
• Libraries: MPI, TBB, Pthread, OpenMP, …
• New languages: Haskell, X10, Chapel, …
• Extensions: Coarray Fortran, UPC, Cilk, OpenCL, …

• Shared memory
– OpenMP, Pthreads, …
• Distributed memory
– MPI, UPC, …
• Hybrid
– MPI + OpenMP
2
Hardware and Network Model

[Figure: four hosts (host1–host4), each with cores, memory, and persistent storage, connected by a network]

• Interconnected systems
• Distributed memory
• NO centralized server/master

3
Message Passing Interface (MPI)
• Standard for message passing in a distributed
memory environment
• Efforts began in 1991, led by Jack Dongarra, Tony Hey, and David W. Walker
• MPI Forum
– Version 1.0: 1994
– Version 2.0: 1997
– Version 3.0: 2012

4
MPI Implementations
• MPICH (ANL)
• MVAPICH (OSU)
• Intel MPI
• Open MPI

5
MPI Processes

[Figure: the same four hosts (host1–host4), now with one MPI process running on each host]

6
MPI Internals
Process Manager
• Start and stop processes in a scalable way
• Setup communication channels for parallel processes
• Provide system-specific information to processes

Job scheduler
• Schedule MPI jobs on a cluster/supercomputer
• Allocate required number of nodes
– Two jobs generally do not run on the same core
• Enforce other policies (queuing etc.)

7
Communication Channels

• Sockets for network I/O
• MPI handles communications, progress, etc.

Reference: "Design and Evaluation of Nemesis, a Scalable, Low-Latency, Message-Passing Communication Subsystem" by Buntinas et al.
8
Message Passing Paradigm
• Message sends and receives
• Explicit communication

Communication types
• Blocking
• Non-blocking

9
Getting Started

Function names: MPI_*
Initialization and Finalization
MPI_Init
• gather information about the parallel job
• set up internal library state
• prepare for communication

MPI_Finalize
• cleanup

11
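A minimal skeleton matching this slide (the print statement is only illustrative):

#include <mpi.h>
#include <stdio.h>

int main (int argc, char **argv)
{
    /* Set up internal library state and communication channels */
    MPI_Init (&argc, &argv);

    printf ("Hello from an MPI process\n");

    /* Clean up MPI state; no MPI calls are allowed after this */
    MPI_Finalize ();
    return 0;
}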
MPI_COMM_WORLD

12
Communication Scope
Communicator (communication handle)
• Defines the scope
• Specifies communication context

Process
• Belongs to a group
• Identified by a rank within a group

Identification
• MPI_Comm_size – total number of processes in communicator
• MPI_Comm_rank – rank in the communicator

13
MPI_COMM_WORLD
• Each process is identified by its rank/id

[Figure: processes in MPI_COMM_WORLD labelled with their ranks]

14
Getting Started

• Total number of processes – MPI_Comm_size
• Rank of a process – MPI_Comm_rank
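A small fragment in the style of the later examples, reporting the size and rank (it would sit between MPI_Init and MPI_Finalize of the skeleton above):

int rank, size;

/* Rank of this process within MPI_COMM_WORLD */
MPI_Comm_rank (MPI_COMM_WORLD, &rank);

/* Total number of processes in MPI_COMM_WORLD */
MPI_Comm_size (MPI_COMM_WORLD, &size);

printf ("Hello from rank %d of %d\n", rank, size);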
MPI Message
• Data and header/envelope
• Typically, MPI communications send/receive messages

Message Envelope
Source: Origin of message
Destination: Receiver of message
Communicator
Tag (0:MPI_TAG_UB)

16
MPI Communication Types

• Point-to-point
• Collective
Point-to-point Communication
Blocking send and receive

• MPI_Send (sender side)
int MPI_Send (const void *buf, int count,
MPI_Datatype datatype, int dest, int tag,
MPI_Comm comm)

• MPI_Recv (receiver side)
int MPI_Recv (void *buf, int count,
MPI_Datatype datatype, int source, int tag,
MPI_Comm comm, MPI_Status *status)

• The tag in the send must match the tag in the receive
MPI_Datatype
• MPI_BYTE
• MPI_CHAR
• MPI_INT
• MPI_FLOAT
• MPI_DOUBLE

19
Example 1
char message[20];
MPI_Status status;
int myrank;

MPI_Comm_rank (MPI_COMM_WORLD, &myrank);

// Sender process
if (myrank == 0) /* code for process 0 */
{
    strcpy (message, "Hello, there");
    MPI_Send (message, strlen(message)+1, MPI_CHAR, 1, 99,   /* 99 is the message tag */
              MPI_COMM_WORLD);
}

// Receiver process
else if (myrank == 1) /* code for process 1 */
{
    MPI_Recv (message, 20, MPI_CHAR, 0, 99, MPI_COMM_WORLD, &status);
    printf ("received :%s\n", message);
}
20
MPI_Status
• Source rank
• Message tag
• Message length
– MPI_Get_count

21
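A hedged fragment showing how the status object can be inspected on the receiver side (the buffer size and datatype are illustrative):

char buf[100];
MPI_Status status;
int count;

MPI_Recv (buf, 100, MPI_CHAR, MPI_ANY_SOURCE, MPI_ANY_TAG,
          MPI_COMM_WORLD, &status);

int src = status.MPI_SOURCE;   /* who actually sent the message */
int tag = status.MPI_TAG;      /* which tag it carried          */

/* Actual number of MPI_CHAR elements received (message length) */
MPI_Get_count (&status, MPI_CHAR, &count);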
MPI_ANY_*
• MPI_ANY_SOURCE
– Receiver may specify wildcard value for source

• MPI_ANY_TAG
– Receiver may specify wildcard value for tag

22
Example 2
MPI_Comm_rank (MPI_COMM_WORLD, &myrank);

// Sender process
if (myrank == 0) /* code for process 0 */
{
strcpy (message,"Hello, there");
MPI_Send (message, strlen(message)+1, MPI_CHAR, 1, 99,
MPI_COMM_WORLD);
}

// Receiver process
else if (myrank == 1) /* code for process 1 */
{
MPI_Recv (message, 20, MPI_CHAR, MPI_ANY_SOURCE, 99,
MPI_COMM_WORLD, &status);
printf ("received :%s\n", message);
}
23
MPI_Send (Blocking)

• Does not return until the buffer can be reused
• Message buffering is implementation-dependent
• Standard communication mode

24
Buffering

[Figure source: Cray presentation]
25


Safety

Process 0              Process 1
MPI_Send; MPI_Send     MPI_Recv; MPI_Recv     Safe
MPI_Send; MPI_Recv     MPI_Send; MPI_Recv     Unsafe (both may block in MPI_Send)
MPI_Send; MPI_Recv     MPI_Recv; MPI_Send     Safe
MPI_Recv; MPI_Send     MPI_Recv; MPI_Send     Unsafe (deadlock: both block in MPI_Recv)
26
Message Protocols
• Short
– Message sent with envelope/header
• Eager
– Send completes without acknowledgement from destination
– Small messages – typically 128 KB (at least in MPICH)
– MPIR_CVAR_CH3_EAGER_MAX_MSG_SIZE (check mpivars)
• Rendezvous
– Requires an acknowledgement from a matching receive
– Large messages

27
Other Send Modes

• MPI_Bsend (buffered)
– May complete before a matching receive is posted

• MPI_Ssend (synchronous)
– Completes only after a matching receive has been posted

• MPI_Rsend (ready)
– May be started only if a matching receive has already been posted
28
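All three variants take the same argument list as MPI_Send. A minimal sketch of a synchronous send (destination and tag are illustrative):

int value = 42;

/* MPI_Ssend returns only after the matching receive has started,
   so its completion also acts as a synchronization point */
MPI_Ssend (&value, 1, MPI_INT, 1 /* dest */, 0 /* tag */, MPI_COMM_WORLD);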
Non-blocking Point-to-Point
• MPI_Isend (buf, count, datatype, dest, tag, comm, request)
• MPI_Irecv (buf, count, datatype, source, tag, comm, request)
• MPI_Wait (request, status)

Process 0              Process 1
MPI_Isend; MPI_Recv    MPI_Isend; MPI_Recv    Safe (MPI_Isend does not block)

29
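A sketch of the safe pattern above, assuming exactly two processes and that rank has already been obtained:

int sendval = rank, recvval = -1;
int peer = (rank == 0) ? 1 : 0;
MPI_Request request;
MPI_Status status;

/* Post the send without blocking ... */
MPI_Isend (&sendval, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &request);

/* ... receive from the peer (a blocking receive is fine here) ... */
MPI_Recv (&recvval, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &status);

/* ... and only then wait for the send to complete */
MPI_Wait (&request, &status);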
Computation Communication Overlap

[Timeline: process 0 calls MPI_Isend, keeps computing while the message is in flight, and calls MPI_Wait later; process 1 computes, calls MPI_Recv, and then continues computing]
30
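The overlap idea from this timeline as a sender-side sketch; compute() stands in for any work that does not touch the send buffer:

double buf[1000];              /* data to send; filled elsewhere */
MPI_Request request;
MPI_Status status;

/* Start the send, but do not wait for it yet */
MPI_Isend (buf, 1000, MPI_DOUBLE, 1 /* dest */, 0 /* tag */,
           MPI_COMM_WORLD, &request);

/* Useful computation overlapped with the communication */
compute ();

/* Block only when the send buffer is needed again */
MPI_Wait (&request, &status);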
Collective Communications
• Must be called by all processes that are part of the
communicator

Types
• Synchronization (MPI_Barrier)
• Global communication (MPI_Bcast, MPI_Gather, …)
• Global reduction (MPI_Reduce, …)

31
Barrier
• Synchronization across all group members
• Collective call
• Blocks until all processes have entered the call
• MPI_Barrier (comm)

32
Broadcast

• Root process sends a message to all processes
• Any process can be the root process, but it has to be the same in all processes
• int MPI_Bcast (buffer, count, datatype, root, comm)
• count – number of elements in buffer
• buffer – Input or output?

[Figure: the root's value X is copied to every process]

Q: Can you use point-to-point communication for the same?
33
Example 3
int rank, size, color;
MPI_Status status;

MPI_Init (&argc, &argv);


MPI_Comm_rank (MPI_COMM_WORLD, &rank);
MPI_Comm_size (MPI_COMM_WORLD, &size);

color = rank + 2;
int oldcolor = color;
MPI_Bcast (&color, 1, MPI_INT, 0, MPI_COMM_WORLD);

printf ("%d: %d color changed to %d\n", rank, oldcolor, color);

34
Gather
• Gathers values from all processes to a root process
• int MPI_Gather (sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, root, comm)
• Arguments recv* not relevant on non-root processes

[Figure: values A0, A1, A2 from processes 0, 1, 2 are collected into one array at the root]

35
Example 4

MPI_Comm_rank (MPI_COMM_WORLD, &rank);


MPI_Comm_size (MPI_COMM_WORLD, &size);

color = rank + 2;

int colors[size];
MPI_Gather (&color, 1, MPI_INT, colors, 1, MPI_INT, 0,
MPI_COMM_WORLD);

if (rank == 0)
for (i=0; i<size; i++)
printf ("color from %d = %d\n", i, colors[i]);

36
Scatter
• Scatters values from a root process to all processes
• int MPI_Scatter (sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, root, comm)
• Arguments send* not relevant on non-root processes
• Output parameter – recvbuf

[Figure: the root's array elements A0, A1, A2 are distributed, one per process]

37
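A sketch distributing one integer per process from rank 0, assuming rank and size have been obtained as in the earlier examples:

int sendbuf[size];             /* significant only at the root */
int recvval;

if (rank == 0)
    for (int i = 0; i < size; i++)
        sendbuf[i] = i * 10;   /* element i goes to rank i */

/* Every process receives exactly one int from root 0 */
MPI_Scatter (sendbuf, 1, MPI_INT, &recvval, 1, MPI_INT, 0, MPI_COMM_WORLD);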
Allgather
• All processes gather values from all processes
• int MPI_Allgather (sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, comm)

[Figure: every process ends up with the full array A0, A1, A2]

38
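A sketch in the same style, where every process contributes one value and receives the full array:

int myval = rank * rank;       /* each process contributes one value */
int allvals[size];             /* every process gets the full array  */

/* Like MPI_Gather followed by a broadcast of the gathered array */
MPI_Allgather (&myval, 1, MPI_INT, allvals, 1, MPI_INT, MPI_COMM_WORLD);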
Reduce

• MPI_Reduce (inbuf, outbuf, count, datatype, op, root, comm)
• Combines elements in inbuf of each process
• Combined value in outbuf of root
• op: MIN, MAX, SUM, PROD, …

Example (element-wise MAX at the root):
Process 0: 0 1 2 3 4 5 6
Process 1: 2 1 2 3 2 5 2
Process 2: 0 1 1 0 1 1 0
Root:      2 1 2 3 4 5 6

39
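The MAX example above expressed as a call (a count of 7 matches the rows shown; filling inbuf is omitted):

int inbuf[7];                  /* per-process data             */
int outbuf[7];                 /* significant only at the root */

/* Element-wise maximum across all processes, result stored at rank 0 */
MPI_Reduce (inbuf, outbuf, 7, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);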
Allreduce
• MPI_Allreduce (inbuf, outbuf, count, datatype, op, comm)
• op: MIN, MAX, SUM, PROD, …
• Combines elements in inbuf of each process
• Combined value in outbuf of each process

Example (element-wise MAX, result at every process):
Process 0: 0 1 2 3 4 5 6
Process 1: 2 1 2 3 2 5 2
Process 2: 0 1 1 0 1 1 0
All ranks: 2 1 2 3 4 5 6
40
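The same reduction without a root; every rank receives the combined result:

int inbuf[7];
int outbuf[7];                 /* result available on every rank */

MPI_Allreduce (inbuf, outbuf, 7, MPI_INT, MPI_MAX, MPI_COMM_WORLD);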
Sub-communicator
- Logical subset
- Different contexts

41
MPI_COMM_SPLIT
MPI_Comm_split (MPI_Comm oldcomm, int
color, int key, MPI_Comm *newcomm)
• Collective call
• Logically divides based on color
– Same color processes form a group
– Some processes may not be part of newcomm
(MPI_UNDEFINED)
• Rank assignment based on key

42
Logical subsets of processes

[Figure: the processes of MPI_COMM_WORLD divided into two groups, each forming a sub-communicator with its own ranks]

How do you assign one color to odd processes and another color to even processes?
color = rank % 2

43
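A sketch of the even/odd split, using the old rank as the key so the original ordering is kept within each group:

int color = rank % 2;          /* even ranks form one group, odd ranks the other */
MPI_Comm newcomm;

/* Processes with the same color end up in the same sub-communicator */
MPI_Comm_split (MPI_COMM_WORLD, color, rank, &newcomm);

int newrank, newsize;
MPI_Comm_rank (newcomm, &newrank);
MPI_Comm_size (newcomm, &newsize);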
MPI Programming
Hands-on
How to run an MPI program on a cluster?

[Figure: 8 MPI processes placed on 4 nodes, ppn=2 (two processes per node)]

mpiexec -n <number of processes> -f <hostfile> ./exe

<hostfile>
host1:2
host2:2
host3:2

45
How to run an MPI program on a managed cluster/supercomputer?

[Figure: 8 MPI processes placed on 4 nodes, ppn=2]

Execution on HPC2010: qsub sub.sh

46
Practice Examples

• egN directory (N=1,2,3,4)


– egN.c
• Compile
– source /opt/software/intel/initpaths intel64
– make
• Execute
– qsub sub.sh

47
Your Code
• Each sub-directory (a1, a2, a3) has
– sub.sh [Required number of cores mentioned]
– Makefile
– .c
– Edit the .c [Look for “WRITE YOUR CODE HERE”]

48
Assignments
1. Even processes send their data to odd processes

2. Element-wise sum of distributed arrays

3. Sum of array elements of 2 large arrays


– Make two groups of processes {0,2,4,6} and {1,3,5,7}
– The 0th process of each group should distribute its array to the other
group members (equal division)
– All processes sum up their individual array chunks

49
Reference Material
• Marc Snir, Steve W. Otto, Steven Huss-Lederman, David W. Walker and Jack Dongarra, MPI - The Complete Reference, Second Edition, Volume 1: The MPI Core.
• William Gropp, Ewing Lusk, Anthony Skjellum, Using MPI: Portable Parallel Programming with the Message-Passing Interface, 3rd Ed., MIT Press, 2014.

50
