Shared Memory: OpenMP
Environment and Synchronization
OpenMP API Overview
The API is a set of compiler directives inserted in the source program, in addition to some library functions. Ideally, the compiler directives do not affect the sequential code: they are pragmas in C/C++ and (special) comments in Fortran code.

API Semantics
The master thread executes the sequential code. The master and the slaves execute the parallel code. Note: this is very similar to the fork-join semantics of the Pthreads create/join primitives.

OpenMP Directives
Parallelization directives: parallel region, parallel for.
Data environment directives: shared, private, threadprivate, reduction, etc.
Synchronization directives: barrier, critical.

General Rules about Directives
A directive always applies to the next statement, which must be a structured block. Examples:

  #pragma omp ...
  statement

  #pragma omp ...
  {
    statement1;
    statement2;
    statement3;
  }

OpenMP Parallel Region

  #pragma omp parallel

A number of threads are spawned at entry. Each thread executes the same code, and each thread waits at the end. This is very similar to a number of create/join's with the same function in Pthreads.

Getting Threads to do Different Things
Either through explicit thread identification (as in Pthreads), or through work-sharing directives.
Thread Identification
int omp_get_thread_num() gets the thread id of the calling thread.
int omp_get_num_threads() gets the total number of threads in the team.

Example

  #pragma omp parallel
  {
    if( !omp_get_thread_num() )
      master();
    else
      slave();
  }
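A minimal, runnable variant of the master/slave example above; the printed messages are only for illustration. With GCC or Clang it can be compiled with the -fopenmp flag.

  #include <stdio.h>
  #include <omp.h>

  int main(void)
  {
      #pragma omp parallel
      {
          int id = omp_get_thread_num();        /* this thread's id */
          int nthreads = omp_get_num_threads(); /* size of the team */

          if (id == 0)
              printf("master: team has %d threads\n", nthreads);
          else
              printf("slave %d of %d\n", id, nthreads);
      }
      return 0;
  }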
Work Sharing Directives
Work sharing directives always occur within a parallel region directive. The two principal ones are parallel for and parallel sections.

OpenMP Parallel For

  #pragma omp parallel
  #pragma omp for
  for( ... )
  {
    ...
  }

Each thread executes a subset of the iterations. All threads wait at the end of the parallel for.

Multiple Work Sharing Directives
May occur within a single parallel region:

  #pragma omp parallel
  {
    #pragma omp for
    for( ; ; ) { ... }

    #pragma omp for
    for( ; ; ) { ... }
  }

All threads wait at the end of the first for.

The NoWait Qualifier

  #pragma omp parallel
  {
    #pragma omp for nowait
    for( ; ; ) { ... }

    #pragma omp for
    for( ; ; ) { ... }
  }

Threads proceed to the second for without waiting.
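A minimal, runnable sketch of this pattern, assuming the two loops touch different arrays so that the nowait is safe; the array size and values are made up for illustration.

  #include <stdio.h>
  #include <omp.h>

  #define N 1000

  int main(void)
  {
      double a[N], b[N];

      #pragma omp parallel
      {
          #pragma omp for nowait           /* no barrier after this loop */
          for (int i = 0; i < N; i++)
              a[i] = 2.0 * i;

          #pragma omp for                  /* independent of the first loop */
          for (int i = 0; i < N; i++)
              b[i] = (double)i * i;
      }                                    /* implicit barrier at the end of the region */

      printf("a[%d] = %f, b[%d] = %f\n", N - 1, a[N - 1], N - 1, b[N - 1]);
      return 0;
  }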
Sections
A parallel loop is an example of independent work units that are numbered. If you have a pre-determined number of independent work units, the sections construct is more appropriate. A sections construct can contain any number of section constructs, each of which should be independent; they can be executed by any available thread in the current team.

Parallel Sections Directive

  #pragma omp parallel
  {
    #pragma omp sections
    {
      { ... }
      #pragma omp section    /* this is a delimiter */
      { ... }
      #pragma omp section
      { ... }
      ...
    }
  }

Example: y = f(x) + g(x)

  double y1, y2;
  #pragma omp sections
  {
    #pragma omp section
    y1 = f(x);
    #pragma omp section
    y2 = g(x);
  }
  y = y1 + y2;
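A self-contained version of this example; f and g here are placeholder functions invented for the illustration.

  #include <stdio.h>
  #include <math.h>
  #include <omp.h>

  /* Placeholders standing in for two expensive, independent computations. */
  double f(double x) { return sin(x); }
  double g(double x) { return cos(x); }

  int main(void)
  {
      double x = 1.0, y1, y2, y;

      #pragma omp parallel sections
      {
          #pragma omp section
          y1 = f(x);                 /* one thread evaluates f(x) */
          #pragma omp section
          y2 = g(x);                 /* another thread may evaluate g(x) concurrently */
      }

      y = y1 + y2;
      printf("y = %f\n", y);
      return 0;
  }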
Single directive
It limits the execution of a block to a single thread, for computations that need to be done only once. Helpful for initializing shared variables.

  #pragma omp parallel
  {
    #pragma omp single
    printf("Inside section single!\n");
    // Try to get thread numbers using omp_get_thread_num
    // parallel code
  }
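A runnable sketch that follows the suggestion in the comment above: the thread number is printed inside the single block (executed once) and in the surrounding parallel code (executed by every thread).

  #include <stdio.h>
  #include <omp.h>

  int main(void)
  {
      #pragma omp parallel
      {
          #pragma omp single
          printf("single block run by thread %d\n", omp_get_thread_num());
          /* implicit barrier at the end of single (no nowait given) */

          printf("parallel code run by thread %d\n", omp_get_thread_num());
      }
      return 0;
  }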
Exercise 1:
Matrix multiplication using the sections primitive; observe the time taken.
Matrix multiplication using serial programming; observe the time taken.

Exercise 2:

Data Environment Directives (2 of 2)
Private, Threadprivate, Reduction.

Private Variables

  #pragma omp parallel for private( list )

Makes a private copy for each thread of each variable in the list. This and all further examples use parallel for, but the same applies to the other region and work-sharing directives.

Private Variables: Example (1 of 2)

  for( i=0; i<n; i++ ) {
    tmp = a[i];
    a[i] = b[i];
    b[i] = tmp;
  }

Swaps the values in a and b. There is a loop-carried dependence on tmp, easily fixed by privatizing tmp.

Private Variables: Example (2 of 2)

  #pragma omp parallel for private( tmp )
  for( i=0; i<n; i++ ) {
    tmp = a[i];
    a[i] = b[i];
    b[i] = tmp;
  }

Removes the dependence on tmp. This would be more difficult to do in Pthreads.

Threadprivate
Private variables are private on a parallel region basis. Threadprivate variables are global variables that are private throughout the execution of the program.

  #pragma omp threadprivate( list )

Example: #pragma omp threadprivate( x )
Achieving the same in Pthreads requires a program change: an array of size p, accessed as x[pthread_self()]. Costly if accessed frequently. Not cheap in OpenMP either.
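A minimal sketch of threadprivate, assuming dynamic thread adjustment is turned off and the team size stays the same, so each thread's copy persists from one parallel region to the next; the variable name and values are made up.

  #include <stdio.h>
  #include <omp.h>

  int counter = 0;                      /* one private copy per thread */
  #pragma omp threadprivate(counter)

  int main(void)
  {
      omp_set_dynamic(0);               /* keep the team size stable between regions */

      #pragma omp parallel
      counter = omp_get_thread_num();   /* each thread initializes its own copy */

      #pragma omp parallel
      {
          counter += 10;                /* still this thread's own copy */
          printf("thread %d: counter = %d\n", omp_get_thread_num(), counter);
      }
      return 0;
  }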
Reduction Variables

  #pragma omp parallel for reduction( op:list )

op is one of +, *, -, &, ^, |, &&, or ||. The variables in list must be used with this operator in the loop. The variables are automatically initialized to sensible values.

Reduction Variables: Example

  #pragma omp parallel for reduction( +:sum )
  for( i=0; i<n; i++ )
    sum += a[i];

sum is automatically initialized to zero.
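A complete, runnable version of the reduction example; the array contents are made up so the expected result is known.

  #include <stdio.h>
  #include <omp.h>

  #define N 1000

  int main(void)
  {
      double a[N], sum = 0.0;
      for (int i = 0; i < N; i++)
          a[i] = 1.0;                        /* expected sum is N */

      #pragma omp parallel for reduction(+:sum)
      for (int i = 0; i < N; i++)
          sum += a[i];                       /* each thread sums a private copy; copies are combined at the end */

      printf("sum = %.1f (expected %d)\n", sum, N);
      return 0;
  }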
Example: shared variable and barrier

  #include <stdio.h>
  #include <omp.h>

  int main(void)
  {
      int x;
      x = 2;
      #pragma omp parallel num_threads(2) shared(x)
      {
          if (omp_get_thread_num() == 0) {
              x = 5;
          } else {
              /* Print 1: the following read of x has a race */
              printf("1: Thread# %d: x = %d\n", omp_get_thread_num(), x);
          }

          #pragma omp barrier

          if (omp_get_thread_num() == 0) {
              /* Print 2 */
              printf("2: Thread# %d: x = %d\n", omp_get_thread_num(), x);
          } else {
              /* Print 3 */
              printf("3: Thread# %d: x = %d\n", omp_get_thread_num(), x);
          }
      }
      return 0;
  }

Before the barrier, the read of x in Print 1 races with the write x = 5; after the barrier, both threads see x == 5.

Synchronization Primitives

Critical

  #pragma omp critical ( name )

Implements critical sections by name. Similar to Pthreads mutex locks (name ~ lock).

Barrier

  #pragma omp barrier

Implements a global barrier.

Reduction

  #pragma omp parallel for reduction(+:sum)
  for( i=0; i<n; i++ )
    sum += a[i];

The dependence on sum is removed.
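A runnable sketch of a named critical section, assuming a made-up input array: each thread finds a local maximum over its share of the iterations and then updates the shared maximum inside the critical section.

  #include <stdio.h>
  #include <omp.h>

  #define N 1000

  int main(void)
  {
      int a[N];
      for (int i = 0; i < N; i++)
          a[i] = (i * 37) % 1000;           /* sample data */

      int global_max = a[0];
      #pragma omp parallel
      {
          int local_max = a[0];

          #pragma omp for nowait
          for (int i = 0; i < N; i++)
              if (a[i] > local_max)
                  local_max = a[i];

          /* Only one thread at a time updates the shared maximum. */
          #pragma omp critical ( max_update )
          if (local_max > global_max)
              global_max = local_max;
      }
      printf("max = %d\n", global_max);
      return 0;
  }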
Exercise
Use OpenMP to implement a producer-consumer program in which some of the threads are producers and others are consumers. The producers read text from a collection of files, one per producer, and insert lines of text into a single shared queue. The consumers take the lines of text and tokenize them; tokens are "words". A sketch of this pattern is given below.

A search engine can be implemented using a farm of servers, each of which contains a subset of the data that can be searched. Assume that this server farm has a single front-end that interacts with clients who submit queries. Implement the above server farm using the master-worker pattern.
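As a starting point for the producer-consumer exercise, here is a minimal sketch, not a full solution: producers generate lines instead of reading files, the queue is a fixed-size array protected by a named critical section, and consumers count whitespace-separated words. All sizes, names, and strings are invented for the illustration.

  #include <stdio.h>
  #include <stdlib.h>
  #include <omp.h>

  #define QUEUE_CAP 1024
  #define LINES_PER_PRODUCER 8

  static char *queue[QUEUE_CAP];
  static int head = 0, tail = 0;      /* shared queue state */
  static int producers_done = 0;      /* number of producers that have finished */

  int main(void)
  {
      int num_producers = 2, num_consumers = 2;

      #pragma omp parallel num_threads(num_producers + num_consumers)
      {
          int id = omp_get_thread_num();

          if (id < num_producers) {
              /* Producer: in the real exercise, read lines from one file per producer. */
              for (int i = 0; i < LINES_PER_PRODUCER; i++) {
                  char *line = malloc(64);
                  snprintf(line, 64, "producer %d line %d some words here", id, i);
                  #pragma omp critical ( queue_lock )
                  queue[tail++] = line;
              }
              #pragma omp critical ( queue_lock )
              producers_done++;
          } else {
              /* Consumer: pop lines and count whitespace-separated tokens. */
              int words = 0, running = 1;
              while (running) {
                  char *line = NULL;
                  #pragma omp critical ( queue_lock )
                  {
                      if (head < tail)
                          line = queue[head++];
                      else if (producers_done == num_producers)
                          running = 0;      /* queue drained and no producers left */
                  }
                  if (line) {
                      int in_word = 0;
                      for (char *p = line; *p; p++) {
                          if (*p != ' ' && !in_word) { words++; in_word = 1; }
                          else if (*p == ' ') in_word = 0;
                      }
                      free(line);
                  }
              }
              printf("consumer %d counted %d words\n", id, words);
          }
      }
      return 0;
  }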