The document discusses parallel computing paradigms like distributed memory and GPU computing. It introduces MPI (Message Passing Interface) as the standard for exchanging data between processors. MPI uses calls to subroutines to control data exchange between CPUs. The document provides examples of MPI routines like Broadcast and Reduce. It also discusses how to structure Fortran code for MPI and provides an example of using MPI to compute an integral in parallel.


Introduction to High Performance Scientific Computing

Autumn, 2016

Lecture 15

Prasun Ray
Imperial College London
28 November 2016
Parallel computing paradigms

Distributed memory
•  Each (4-core) chip has its own memory

•  The chips are connected by network ‘cables’

•  MPI coordinates communication between two or more CPUs

Parallel computing paradigms

Related approaches:
•  Hybrid programming: mix of shared-memory (OpenMP) and
distributed-memory (MPI) programming

•  GPUs: shared-memory programming (CUDA or OpenCL)

•  Coprocessors and co-array programming

MPI intro
•  MPI: Message Passing Interface

•  Standard for exchanging data between processors

•  Supports Fortran, C, C++

•  Can also be used with Python

OpenMP schematic
Program starts with a single master thread.

Then, launch a parallel region with multiple threads. Each thread has
access to all variables introduced previously.

Can end the parallel region if/when desired and launch parallel regions
again in the future as needed.

[Schematic: Start program → master thread → FORK → Parallel region (4 threads) → JOIN → Serial region (1 thread)]
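As a reminder of what this fork–join pattern looks like in code, here is a minimal sketch (an illustration, not one of the course codes), assuming OpenMP support is enabled at compile time, e.g. gfortran -fopenmp:

program omp_sketch
use omp_lib
implicit none
!serial region: only the master thread exists here
print *, 'serial region: master thread'
!$omp parallel
!parallel region: a team of threads is created (FORK)
print *, 'hello from thread ', omp_get_thread_num(), &
         ' of ', omp_get_num_threads()
!$omp end parallel
!the threads join back into the master thread (JOIN)
print *, 'serial region again'
end program omp_sketch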

MPI schematic
Program starts with all processes running.

MPI controls communication between processes.

[Schematic: Start program → Parallel region (4 processes)]

MPI intro
•  Basic idea: calls to MPI subroutines control data exchange
between processors

•  Example:

call MPI_BCAST(n, 1, MPI_INTEGER,0,MPI_COMM_WORLD,ierr)

This will send the integer n (count 1, type MPI_INTEGER) from processor 0 to
all of the other processors in MPI_COMM_WORLD.

MPI broadcast

Before broadcast:          After broadcast:
P0: data                   P0: data
P1:                        P1: data
P2:                        P2: data
P3:                        P3: data

MPI intro
•  Basic idea: calls to MPI subroutines control data exchange
between processors

•  Example:

call MPI_BCAST(n, 1, MPI_INTEGER,0,MPI_COMM_WORLD,ierr)

This will send the integer n (count 1, type MPI_INTEGER) from processor 0 to
all of the other processors in MPI_COMM_WORLD.

Generally, need to specify:


•  source and/or destination of message
•  size of data contained in message
•  type of data contained in message (integer, double precision, …)
•  the data itself (or its location)
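Putting these pieces together, here is a minimal, self-contained sketch (illustrative, not one of the course codes) in which process 0 sets a value and broadcasts it to every process:

program bcast_sketch
use mpi
implicit none
integer :: n, myid, numprocs, ierr

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)

!only the root (process 0) knows n initially, e.g. read from input
if (myid == 0) n = 1000

!send n (count 1, type MPI_INTEGER) from process 0 to all other processes
call MPI_BCAST(n, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
print *, 'process ', myid, ' of ', numprocs, ' has n = ', n

call MPI_FINALIZE(ierr)
end program bcast_sketch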

Fortran code structure
! Basic Fortran 90 code structure
!
!1. Header
program template
!
!2. Variable declarations (e.g. integers, real numbers,...)
!
!3. basic code: input, loops, if-statements, subroutine calls
print *, 'template code'
!
!
!4. End program
end program template
!
! To compile this code:
! $ gfortran -o f90template.exe f90template.f90
! To run the resulting executable: $ ./f90template.exe

MPI intro
! Basic MPI + Fortran 90 code structure (see mpif90template.f90)
!
!1. Header
program template
use mpi
!
!2a. Variable declarations (e.g. integers, real numbers,...)
integer :: myid, numprocs, ierr
!
!2b. Initialize MPI
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
!
!3. basic code: input, loops, if-statements, subroutine calls
print *, 'this is proc # ',myid, 'of ', numprocs
!
!
!4. End program
call MPI_FINALIZE(ierr)
end program template
!
! To compile this code:
! $ mpif90 -o mpitemplate.exe mpif90template.f90
! To run the resulting executable with 4 processes: $ mpiexec -n 4 mpitemplate.exe
MPI intro
•  Compile + run:

$ mpif90 -o mpif90template.exe mpif90template.f90

$ mpiexec -n 4 mpif90template.exe
this is proc # 0 of 4
this is proc # 3 of 4
this is proc # 1 of 4
this is proc # 2 of 4

Note: The number of processes specified with mpiexec can be larger than the
number of cores on your machine, but the extra processes then have to share
cores, so there is no further speedup.

MPI+Fortran example: computing an integral

•  Estimate an integral using the midpoint rule.
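For reference, the composite midpoint rule for an integral over [a, b] split into N intervals (a standard statement of the rule, matching the use of dx*(i-0.5) in the code later) is:

\int_a^b f(x)\,dx \;\approx\; \sum_{i=1}^{N} f\!\left(a + \left(i - \tfrac{1}{2}\right)\Delta x\right)\Delta x,
\qquad \Delta x = \frac{b-a}{N}.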

MPI+Fortran quadrature
Two most important tasks:

1.  Decide how many intervals per processor

2.  Each processor will compute its own partial sum, sum_proc,
how do we compute sum(sum_proc)?

MPI+Fortran quadrature
Two most important tasks:

1.  Decide how many intervals per processor

2.  Each processor will compute its own partial sum, sum_proc,
how do we compute sum(sum_proc)?

•  N = number of intervals

•  numprocs = number of processors

•  Need to compute Nper_proc: intervals per processor

MPI+Fortran quadrature
•  N = number of intervals

•  numprocs = number of processors

•  Need to compute Nper_proc: intervals per processor

§  Basic idea: if N = 8 * numprocs, then Nper_proc = 8

§  But with integer division, if N < numprocs, then N/numprocs = 0, so round up:

Nper_proc = (N + numprocs - 1)/numprocs
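For example, with N = 1000 and numprocs = 2 (the run shown later), integer division gives Nper_proc = (1000 + 2 - 1)/2 = 500; with N = 3 and numprocs = 4, Nper_proc = (3 + 4 - 1)/4 = 1, so rounding up ensures every interval is assigned even when N is not a multiple of numprocs.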

MPI+Fortran quadrature
Two most important tasks:

1.  Decide how many intervals per processor

2.  Each processor will compute its own partial sum, sum_proc,
how do we compute sum(sum_proc)?

Use MPI_REDUCE

MPI reduce

Before reduction:          After reduction (e.g. sum):
P0: data1                  P0: result
P1: data2                  P1:
P2: data3                  P2:
P3: data4                  P3:

MPI+Fortran quadrature
Two most important tasks:

1.  Decide how many intervals per processor

2.  Each processor will compute its own partial sum, sum_proc,
how do we compute sum(sum_proc)?

•  Use MPI_REDUCE

•  Reduction options: MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD

MPI+Fortran quadrature
Two most important tasks:

1.  Decide how many intervals per processor

2.  Each processor will compute its own partial sum, sum_proc,
how do we compute sum(sum_proc)?

•  Use MPI_REDUCE

•  Reduction options: MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD

•  For quadrature, we need MPI_SUM

MPI+Fortran quadrature

For quadrature, we need MPI_SUM:

call MPI_REDUCE(data, result, 1, MPI_DOUBLE_PRECISION, MPI_SUM, 0, MPI_COMM_WORLD, ierr)

This will:

1.  Collect the double precision variable data (size 1) from each processor.

2.  Compute the sum (because we have chosen MPI_SUM) and store the value in
result on processor 0.

Note: Only processor 0 will have the final sum. With MPI_ALLREDUCE, the result
will be on every processor.
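For comparison, the corresponding MPI_ALLREDUCE call (a sketch using the same illustrative variable names) has no root argument, since every process receives the result:

!every process ends up with the reduced value in result
call MPI_ALLREDUCE(data, result, 1, MPI_DOUBLE_PRECISION, MPI_SUM, MPI_COMM_WORLD, ierr)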

MPI+Fortran quadrature
midpoint_p.f90: distribute data

!set number of intervals per processor
Nper_proc = (N + numprocs - 1)/numprocs
!
!starting and ending points for processor
istart = myid * Nper_proc + 1
iend = (myid+1) * Nper_proc
if (iend>N) iend = N
!

MPI+Fortran quadrature
midpoint_p.f90: 1. distribute data, 2. compute sum_proc

!set number of intervals per processor
Nper_proc = (N + numprocs - 1)/numprocs
!
!starting and ending points for processor
istart = myid * Nper_proc + 1
iend = (myid+1) * Nper_proc
if (iend>N) iend = N
!
!loop over intervals computing each interval's contribution to integral
do i1 = istart,iend
    xm = dx*(i1-0.5) !midpoint of interval i1
    call integrand(xm,f)
    sum_i = dx*f
    sum_proc = sum_proc + sum_i !add contribution from interval to total integral
end do

MPI+Fortran quadrature
midpoint_p.f90: 1. distribute data, 2. compute sum_proc, 3. reduction

!set number of intervals per processor
Nper_proc = (N + numprocs - 1)/numprocs
!
!starting and ending points for processor
istart = myid * Nper_proc + 1
iend = (myid+1) * Nper_proc
if (iend>N) iend = N
!
!loop over intervals computing each interval's contribution to integral
do i1 = istart,iend
    xm = dx*(i1-0.5) !midpoint of interval i1
    call integrand(xm,f)
    sum_i = dx*f
    sum_proc = sum_proc + sum_i !add contribution from interval to total integral
end do
!
!collect double precision variable, sum, with size 1 on process 0 using the MPI_SUM option
call MPI_REDUCE(sum_proc,sum,1,MPI_DOUBLE_PRECISION,MPI_SUM, &
                0,MPI_COMM_WORLD,ierr)
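The integrand subroutine itself is not shown on the slide. The output below (sum ≈ π with N = 1000 intervals) is consistent with the classic test integral ∫₀¹ 4/(1+x²) dx = π, so a plausible sketch of integrand is the following (an assumption for illustration, not the actual midpoint_p.f90 source):

!assumed integrand, consistent with the printed result (not the original source)
subroutine integrand(x,f)
implicit none
double precision, intent(in) :: x
double precision, intent(out) :: f
f = 4.d0/(1.d0 + x*x) !integrates to pi on [0,1]
end subroutine integrand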
MPI+Fortran quadrature
Compile and run:

$ mpif90 -o midpoint_p.exe midpoint_p.f90

$ mpiexec -n 2 midpoint_p.exe
number of intervals = 1000
number of procs = 2
Nper_proc= 500
The partial sum on proc # 0 is: 1.8545905426699112
The partial sum on proc # 1 is: 1.2870021942532193
N= 1000
sum= 3.1415927369231307
error= 8.3333337563828991E-008
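As a sanity check on the printed error (again assuming the integrand sketched above), the composite midpoint rule has leading error (Δx²/24)[f'(b) − f'(a)]; with f(x) = 4/(1+x²) on [0,1] and Δx = 10⁻³ this gives

|\text{error}| \approx \frac{\Delta x^{2}}{24}\,|f'(1) - f'(0)| = \frac{\Delta x^{2}}{24}\cdot 2 = \frac{10^{-6}}{12} \approx 8.3\times 10^{-8},

which matches the value reported above.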

Other collective operations
•  Scatter and gather

MPI scatter

Before scatter:            After scatter:
P0: [f1,f2,f3,f4]          P0: f1
P1:                        P1: f2
P2:                        P2: f3
P3:                        P3: f4

MPI gather

Before gather:             After gather:
P0: f1                     P0: [f1,f2,f3,f4]
P1: f2                     P1:
P2: f3                     P2:
P3: f4                     P3:
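A sketch of the corresponding calls (illustrative names: fvec is a length-4 double precision array on the root, f a scalar on each of the 4 processes):

!scatter: process 0 sends one element of fvec to the variable f on each process
call MPI_SCATTER(fvec, 1, MPI_DOUBLE_PRECISION, f, 1, MPI_DOUBLE_PRECISION, &
                 0, MPI_COMM_WORLD, ierr)
!gather: the variable f from each process is collected into fvec on process 0
call MPI_GATHER(f, 1, MPI_DOUBLE_PRECISION, fvec, 1, MPI_DOUBLE_PRECISION, &
                0, MPI_COMM_WORLD, ierr)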

Other collective operations

•  Scatter and gather

•  Gather all particles on one processor

•  Compute the interaction forces for the particles on that processor:

\frac{d^2 x_i}{dt^2} = \sum_{j=1}^{N} f(|x_i - x_j|), \qquad i = 1, 2, \ldots, N

•  Avoid for big problems (why?)
MPI collective data movement

[Figure from Using MPI]
