HPC_SUMMARY
May 2025
Contents
1 Introduction
2 OpenMP
  2.1 Overview
  2.2 Compiling and Running
  2.3 Setting the Number of Threads
  2.4 Private and Shared Variables
  2.5 Parallelizing For Loops
  2.6 Scheduling
  2.7 Reduction Clauses
  2.8 Synchronization
  2.9 Sections
3 MPI
  3.1 Overview
  3.2 Compiling and Running
  3.3 General Structure
  3.4 Number of Processes and Rank
  3.5 Message Passing
  3.6 Broadcast
  3.7 Scatter and Gather
  3.8 Reduce
  3.9 Synchronization
  3.10 Cluster Setup
1 Introduction
This document provides a detailed course on OpenMP and MPI, two key technologies
for parallel computing. OpenMP is designed for shared-memory systems using threads,
while MPI is suited for distributed-memory systems using processes. The course includes
concepts, syntax, examples, and comparisons to aid in preparing for your High
Performance Computing exam.
Note: This document uses blue for OpenMP-related content and green for MPI-related
content to enhance clarity.
2 OpenMP
OpenMP (Open Multi-Processing) is an API for shared-memory parallel programming in
C, C++, and Fortran. It uses threads to execute code concurrently on a single machine
with shared memory.
2.1 Overview
• Purpose: Parallelize code on multi-core processors using threads.
• Key Features:
– Shared memory model: All threads access the same memory space.
– Directives to parallelize loops, sections, or tasks.
– Synchronization mechanisms to manage thread interactions.
2.2 Compiling and Running
To compile, use the -fopenmp flag with gcc, for example gcc -fopenmp program.c -o program, and then run the resulting executable as usual.
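A minimal "hello world" example (a sketch; the file name program.c is just an assumption):

    #include <stdio.h>
    #include <omp.h>

    int main() {
        // every thread in the team executes the body of the parallel region
        #pragma omp parallel
        {
            printf("Hello from thread %d of %d\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }
        return 0;
    }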
Note: The order of output lines may vary due to thread scheduling.
2.3 Setting the Number of Threads
The number of threads can be set in three ways:
• Programmatically: omp_set_num_threads(NUM_THREADS);
• With a clause on the directive: #pragma omp parallel num_threads(NUM_THREADS)
• With the environment variable: export OMP_NUM_THREADS=4
Comparison:
    #include <stdio.h>
    #include <omp.h>
    int main() {
        omp_set_num_threads(4); // Programmatic
        #pragma omp parallel
        {
            printf("Thread %d (programmatic)\n", omp_get_thread_num());
        }
        #pragma omp parallel num_threads(4) // Directive
        {
            printf("Thread %d (directive)\n", omp_get_thread_num());
        }
        #pragma omp parallel // Environment (export OMP_NUM_THREADS=4)
        {
            printf("Thread %d (environment)\n", omp_get_thread_num());
        }
        return 0;
    }
Thread 0 (programmatic)
Thread 1 (programmatic)
Thread 2 (programmatic)
Thread 3 (programmatic)
Thread 0 (directive)
Thread 1 (directive)
Thread 2 (directive)
Thread 3 (directive)
Thread 0 (environment)
Thread 1 (environment)
Thread 2 (environment)
Thread 3 (environment)
2.4 Private and Shared Variables
Variables are shared by default. Use private to give each thread its own copy.
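A minimal sketch, assuming a shared accumulator named pub (as in the note below) and a per-thread private copy named priv:

    #include <stdio.h>
    #include <omp.h>

    int main() {
        int pub = 0;   // shared: one copy visible to all threads
        int priv = 0;  // listed as private below: each thread gets its own copy
        #pragma omp parallel num_threads(4) shared(pub) private(priv)
        {
            priv = omp_get_thread_num();  // private copy, undefined on entry
            #pragma omp atomic
            pub += priv;                  // accumulate thread IDs in the shared variable
            printf("Thread %d: priv = %d\n", omp_get_thread_num(), priv);
        }
        printf("pub = %d\n", pub);        // sum of thread IDs: 0 + 1 + 2 + 3 = 6
        return 0;
    }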
Expected Output: each thread prints its own private value, and pub ends up holding the sum of the thread IDs (0 + 1 + 2 + 3 = 6 with four threads).
Note: The shared variable pub accumulates thread IDs, and the output order may vary.
2.5 Parallelizing For Loops
#pragma omp for distributes loop iterations across threads. Example: 8 iterations
across 4 threads.
    #include <stdio.h>
    #include <omp.h>
    #define NUM_THREADS 4
    #define NUM_ITERS 8
    int main() {
        #pragma omp parallel num_threads(NUM_THREADS)
        {
            #pragma omp for
            for (int i = 0; i < NUM_ITERS; i++) {
                int id = omp_get_thread_num();
                printf("Thread %d: Loop iteration %d\n", id, i);
            }
        }
        return 0;
    }
Note: The assignment of iterations to threads depends on the scheduling type (default is
static). Output order may vary.
2.6 Scheduling
Scheduling controls how loop iterations are assigned to threads; the clauses are demonstrated in the sketch after this list:
• Static: Divides the loop iterations into equal-sized chunks and assigns each chunk to a
thread in a fixed pattern determined before the loop runs. The optional chunk parameter
specifies the size of each chunk.
(Fixed, equal chunks.)
• Dynamic: Divides the iterations of the loop into chunks of size chunk, and assigns
these chunks dynamically to threads as threads become available to do work.
(Assigns chunks dynamically as threads become available.)
• Guided: Similar to dynamic scheduling, but the size of the chunks decreases over time.
Initially, larger chunks are assigned, which helps reduce scheduling overhead; as the
loop progresses, the chunk size decreases to balance the workload among threads.
(Larger chunks first, decreasing over time.)
• Runtime: Allows the scheduling policy to be determined at runtime using the
OMP_SCHEDULE environment variable or the omp_set_schedule function. This provides
flexibility in choosing the scheduling policy without recompiling the code.
(Set via OMP_SCHEDULE.)
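A minimal sketch of the scheduling clauses, assuming 16 iterations, 4 threads, and a dynamic chunk size of 2 to match the sample output below (the static clause is written the same way, e.g. schedule(static, 4)):

    #include <stdio.h>
    #include <omp.h>
    #define NUM_THREADS 4
    #define NUM_ITERS 16

    int main() {
        printf("Dynamic Scheduling:\n");
        #pragma omp parallel num_threads(NUM_THREADS)
        {
            #pragma omp for schedule(dynamic, 2)  // chunks of 2 handed out on demand
            for (int i = 0; i < NUM_ITERS; i++)
                printf("Thread %d: Iteration %d\n", omp_get_thread_num(), i);
        }

        printf("Guided Scheduling:\n");
        #pragma omp parallel num_threads(NUM_THREADS)
        {
            #pragma omp for schedule(guided)      // chunk size shrinks as the loop progresses
            for (int i = 0; i < NUM_ITERS; i++)
                printf("Thread %d: Iteration %d\n", omp_get_thread_num(), i);
        }

        printf("Runtime Scheduling:\n");
        #pragma omp parallel num_threads(NUM_THREADS)
        {
            #pragma omp for schedule(runtime)     // policy taken from OMP_SCHEDULE
            for (int i = 0; i < NUM_ITERS; i++)
                printf("Thread %d: Iteration %d\n", omp_get_thread_num(), i);
        }
        return 0;
    }

Expected Output: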
Dynamic Scheduling:
Thread 0: Iteration 0
Thread 0: Iteration 1
Thread 1: Iteration 2
Thread 1: Iteration 3
Thread 2: Iteration 4
Thread 2: Iteration 5
Thread 3: Iteration 6
Thread 3: Iteration 7
Thread 0: Iteration 8
Thread 0: Iteration 9
Thread 1: Iteration 10
Thread 1: Iteration 11
Thread 2: Iteration 12
Thread 2: Iteration 13
Thread 3: Iteration 14
Thread 3: Iteration 15
Guided Scheduling:
Thread 0: Iteration 0
Thread 0: Iteration 1
Thread 0: Iteration 2
Thread 0: Iteration 3
Thread 1: Iteration 4
Thread 1: Iteration 5
Thread 1: Iteration 6
Thread 2: Iteration 7
Thread 2: Iteration 8
Thread 2: Iteration 9
Thread 3: Iteration 10
Thread 3: Iteration 11
Thread 0: Iteration 12
Thread 1: Iteration 13
Thread 2: Iteration 14
Thread 3: Iteration 15
Runtime Scheduling:
Thread 0: Iteration 0
Thread 0: Iteration 1
Thread 0: Iteration 2
Thread 0: Iteration 3
Thread 1: Iteration 4
Thread 1: Iteration 5
Thread 1: Iteration 6
Thread 1: Iteration 7
Thread 2: Iteration 8
Thread 2: Iteration 9
Thread 2: Iteration 10
Thread 2: Iteration 11
Thread 3: Iteration 12
Thread 3: Iteration 13
Thread 3: Iteration 14
Thread 3: Iteration 15
Note: Static scheduling assigns fixed chunks (e.g., 4 iterations per thread). Dynamic and
guided scheduling assign chunks as threads become available, with guided reducing
chunk sizes over time. Runtime scheduling depends on OMP_SCHEDULE, so the output
may vary.
2.7 Reduction Clauses
The reduction clause gives each thread its own private copy of a variable and combines the
private copies with the specified operator (for example +) at the end of the parallel loop.
Example: computing a total with reduction(+:total).
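A minimal sketch, assuming the loop sums the integers 1 through 8 (which gives the total shown below):

    #include <stdio.h>
    #include <omp.h>
    #define NUM_THREADS 4
    #define NUM_ITERS 8

    int main() {
        int total = 0;
        // reduction(+:total): every thread works on a private partial sum,
        // and the partial sums are added together at the end of the loop
        #pragma omp parallel for num_threads(NUM_THREADS) reduction(+:total)
        for (int i = 1; i <= NUM_ITERS; i++) {
            total += i;   // 1 + 2 + ... + 8 = 36
        }
        printf("Total = %d\n", total);
        return 0;
    }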
Expected Output:
Total = 36
2.8 Synchronization
• Critical: Ensures that a specific block of code is executed by only one thread at a time;
all threads eventually execute it, just not at the same time. (One thread at a time.)
• Atomic: Protects a single update of a shared variable (e.g. counter++) so it is performed
as one indivisible operation. (One protected update.)
• Single: Specifies that a block of code should be executed by only one thread, but does
not specify which one. (One thread, others wait.)
• Master: Like single, but the block is guaranteed to be executed by the master thread
(the thread with ID 0). (Master thread (ID 0) only.)
• Barrier: Synchronizes all threads in a parallel region. Each thread waits at the barrier
until all threads have reached it, then they all proceed. (All threads wait.)
These constructs are combined in the sketch below.
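A minimal sketch combining these constructs, assuming 4 threads and a shared counter that each thread increments once inside critical and once with atomic (so the final value is 8, as in the note below):

    #include <stdio.h>
    #include <omp.h>
    #define NUM_THREADS 4

    int main() {
        int counter = 0;
        #pragma omp parallel num_threads(NUM_THREADS)
        {
            int id = omp_get_thread_num();

            #pragma omp critical      // one thread at a time
            {
                counter++;
                printf("Thread %d incremented counter to %d\n", id, counter);
            }

            #pragma omp atomic        // protected single update, no printing
            counter++;

            #pragma omp single        // exactly one (unspecified) thread
            printf("Thread %d in single section\n", id);

            #pragma omp master        // only the thread with ID 0
            printf("Master thread %d\n", id);

            #pragma omp barrier       // wait for all threads before continuing
            printf("Thread %d passed barrier\n", id);
        }
        printf("Final counter: %d\n", counter);
        return 0;
    }

Expected Output: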
Thread 0 incremented counter to 1
Thread 1 incremented counter to 2
Thread 2 incremented counter to 3
Thread 3 incremented counter to 4
Thread 2 in single section
Master thread 0
Thread 0 passed barrier
Thread 1 passed barrier
Thread 2 passed barrier
Thread 3 passed barrier
Final counter: 8
Note: The order of critical section outputs may vary. The single section is executed by
one thread (ID varies). The final counter is 8 (4 from critical + 4 from atomic).
2.9 Sections
#pragma omp sections assigns independent tasks to threads. Example: Parallelizing
two functions.
    #include <stdio.h>
    #include <omp.h>
    int f1(int b, int c) { return b + c; }
    int f2(int d, int e) { return d * e; }
    int main() {
        int a = 0, b = 1, c = 2, d = 3, e = 4;
        int a1, a2;
        #pragma omp parallel sections
        {
            #pragma omp section
            {
                a1 = f1(b, c);
                printf("Section 1: a1 = %d\n", a1);
            }
            #pragma omp section
            {
                a2 = f2(d, e);
                printf("Section 2: a2 = %d\n", a2);
            }
        }
        a = a1 + a2;
        printf("a = %d\n", a);
        return 0;
    }
Expected Output:
Section 1: a1 = 3
Section 2: a2 = 12
a = 15
Note: The order of section outputs may vary due to thread assignment.
3 MPI
MPI (Message Passing Interface) is a standard for distributed-memory parallel comput-
ing using processes.
3.1 Overview
• Purpose: Parallelize code across multiple nodes/processes.
• Key Features:
– Distributed memory model: each process has its own address space.
– Explicit communication: processes exchange data via messages (MPI_Send, MPI_Recv).
– Collective operations: broadcast, scatter/gather, and reduce across all processes.
3.2 Compiling and Running
To compile, use the mpicc wrapper, e.g. mpicc program.c -o program. To run with, for
example, four processes: mpirun -np 4 ./program (or mpiexec -n 4 ./program).
3.3 General Structure
• MPI_Init(&argc, &argv);
• MPI_Finalize();
• MPI_Comm_rank(MPI_COMM_WORLD, &rank);
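A minimal skeleton combining these calls (a sketch; error checking is omitted):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);                 // start the MPI runtime

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // ID of this process

        printf("Hello from process %d\n", rank);  // parallel work goes here

        MPI_Finalize();                         // shut down the MPI runtime
        return 0;
    }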
3.4 Number of Processes and Rank
MPI_Comm_size(MPI_COMM_WORLD, &size); returns the total number of processes, and
MPI_Comm_rank(MPI_COMM_WORLD, &rank); returns the ID (rank) of the calling process,
from 0 to size - 1.
3.5 Message Passing
MPI_Send and MPI_Recv exchange data between two specific processes (point-to-point
communication); a receive blocks until a matching message arrives.
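A minimal point-to-point sketch, assuming rank 0 sends a single integer (here 42) to rank 1:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int value;
        if (rank == 0) {
            value = 42;
            // send one int to rank 1 with message tag 0
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            // blocking receive of one int from rank 0 with tag 0
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("Process 1 received %d\n", value);
        }
        MPI_Finalize();
        return 0;
    }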
3.6 Broadcast
MPI_Bcast sends data from a root process to all others. Example: Broadcasting a num-
ber.
    #include <stdio.h>
    #include <mpi.h>
    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        int number = 0;
        if (rank == 0) {
            number = 100;
        }
        MPI_Bcast(&number, 1, MPI_INT, 0, MPI_COMM_WORLD);
        printf("Process %d has number %d\n", rank, number);
        MPI_Finalize();
        return 0;
    }
3.7 Scatter and Gather
MPI_Scatter distributes equal-sized pieces of an array from the root process to all
processes; MPI_Gather collects one result from each process back at the root. Example:
scattering four values and gathering the doubled results.
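A minimal sketch consistent with the output below, assuming 4 processes, where each process doubles the value it receives:

    #include <stdio.h>
    #include <mpi.h>
    #define N 4   // assumes the program is started with 4 processes

    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int data[N] = {10, 20, 30, 40};   // only used at the root
        int recv, result, gathered[N];

        // the root sends one element of data to each process
        MPI_Scatter(data, 1, MPI_INT, &recv, 1, MPI_INT, 0, MPI_COMM_WORLD);
        printf("Process %d received %d\n", rank, recv);

        result = 2 * recv;                // each process doubles its value

        // the root collects one result from each process
        MPI_Gather(&result, 1, MPI_INT, gathered, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (rank == 0) {
            printf("Gathered data:");
            for (int i = 0; i < N; i++) printf(" %d", gathered[i]);
            printf("\n");
        }
        MPI_Finalize();
        return 0;
    }

Expected Output (with 4 processes):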
Process 0 received 10
Process 1 received 20
Process 2 received 30
Process 3 received 40
Gathered data: 20 40 60 80
Note: The order of scatter outputs may vary, but the gathered data is deterministic.
3.8 Reduce
MPI_Reduce combines data using an operation. Example: Summing values.
    #include <stdio.h>
    #include <mpi.h>
    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        int value = rank + 1;
        int result;
        MPI_Reduce(&value, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) {
            printf("Sum of values: %d\n", result);
        }
        MPI_Finalize();
        return 0;
    }
Expected Output (with 4 processes):
Sum of values: 10
3.9 Synchronization
MPI_Barrier(MPI_COMM_WORLD); makes every process wait until all processes have reached the barrier. A barrier alone does not order output, so the "Hello World" example below instead chains MPI_Send/MPI_Recv so that each process prints only after its predecessor has printed.
    #include <stdio.h>
    #include <mpi.h>
    int main(int argc, char *argv[]) {
        MPI_Init(&argc, &argv);
        int size, rank;
        char dummy;
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank != 0) {
            int src = (rank + size - 1) % size;
            MPI_Recv(&dummy, 1, MPI_CHAR, src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
        printf("Process %d is saying Hello World.\n", rank);
        if (rank != size - 1) {
            MPI_Send(&dummy, 1, MPI_CHAR, (rank + 1) % size, 0, MPI_COMM_WORLD);
        }
        MPI_Finalize();
        return 0;
    }
3.10 Cluster Setup
On each slave (worker) node:
1. Update the system.
Table 1: Comparison of OpenMP and MPI
Aspect        | OpenMP                         | MPI
Memory model  | Shared memory (single machine) | Distributed memory (multiple nodes)
Parallel unit | Threads                        | Processes
Communication | Shared variables               | Explicit messages (MPI_Send/MPI_Recv)
Compilation   | gcc -fopenmp                   | mpicc
• Memorize: Key directives (#pragma omp parallel) and MPI functions (MPI_Send).