Lecture 4: Parallel Programming Models
Yasir Noman Khalid
Overview
• There are several parallel programming models in common use:
• Shared Memory (without threads)
• Threads
• Distributed Memory / Message Passing
• Data Parallel
• Hybrid
• Single Program Multiple Data (SPMD)
• Multiple Program Multiple Data (MPMD)
• Parallel programming models exist as an abstraction above hardware and memory
architectures.
• Although it might not seem apparent, these models are NOT specific to a particular type of
machine or memory architecture. In fact, any of these models can (theoretically) be
implemented on any underlying hardware. Examples are discussed in the next 2 slides.
SHARED memory model on a DISTRIBUTED
memory machine
• Machine memory is physically distributed across networked machines but appears to the user as a single shared-memory global address space. Generically, this approach is referred to as "virtual shared memory".
DISTRIBUTED memory model on a SHARED
memory machine
• Message Passing Interface (MPI) on the SGI Origin 2000. The SGI Origin 2000 employed the CC-NUMA type of shared memory architecture, where every task has direct access to the global address space spread across all machines. However, the ability to send and receive messages using MPI, as is commonly done over a network of distributed memory machines, was implemented and commonly used.
Which model to use?
• Often a combination of what is available and personal choice.
• There is no "best" model, although there certainly are better implementations of some
models over others.
Shared Memory Model (without threads)
• In this programming model, processes/tasks share a common address space, which they read
and write to asynchronously.
• Various mechanisms such as locks / semaphores are used to
• Control access to the shared memory,
• Resolve contentions and
• Prevent race conditions and deadlocks.
• Perhaps the simplest parallel programming model.
• An advantage of this model from the programmer's point of view is that the notion of data
"ownership" is lacking
• There is no need to specify explicitly the communication of data between tasks.
• All processes see and have equal access to shared memory.
• Program development can often be simplified.
• An important disadvantage in terms of performance is that it becomes more difficult to
understand and manage data locality:
• Keeping data local to the process that works on it conserves memory accesses, cache refreshes and bus
traffic that occurs when multiple processes use the same data.
• Unfortunately, controlling data locality is hard to understand and may be beyond the control of the average
user.
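• A minimal sketch of this model (illustrative, not from the original slides): two processes share a counter through POSIX shared memory and coordinate with a process-shared semaphore. The segment name /demo_shm, the counter layout and the loop counts are assumptions made for the example.

/* Minimal sketch: two processes share a counter through POSIX shared memory.
 * Compile with: cc shm_demo.c -o shm_demo  (may need -lrt and/or -pthread on some systems)
 */
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct {
    sem_t lock;     /* controls access to the shared counter */
    long  counter;  /* data both processes read and write asynchronously */
} shared_t;

int main(void) {
    /* Create a shared memory object and map it into this process. */
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(shared_t));
    shared_t *shm = mmap(NULL, sizeof(shared_t),
                         PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    sem_init(&shm->lock, 1, 1);   /* second argument = 1: shared between processes */
    shm->counter = 0;

    if (fork() == 0) {            /* child: same mapping, same physical memory */
        for (int i = 0; i < 100000; i++) {
            sem_wait(&shm->lock);
            shm->counter++;       /* critical section: prevents a race condition */
            sem_post(&shm->lock);
        }
        return 0;
    }
    for (int i = 0; i < 100000; i++) {
        sem_wait(&shm->lock);
        shm->counter++;
        sem_post(&shm->lock);
    }
    wait(NULL);
    printf("counter = %ld\n", shm->counter);  /* 200000 with the lock in place */
    shm_unlink("/demo_shm");
    return 0;
}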
Threads Model
• This programming model is a type of shared memory programming.
• In the threads model of parallel programming, a single "heavy weight" process can have multiple "light weight",
concurrent execution paths.
• For example:
• The main program a.out is scheduled to run by the native operating system. a.out loads and acquires all the
necessary system and user resources to run. This is the "heavy weight" process.
• a.out performs some serial work, and then creates several tasks (threads) that can be scheduled and run by
the operating system concurrently.
• Each thread has local data, but also shares the entire resources of a.out. This saves the overhead associated
with replicating a program's resources for each thread ("light weight"). Each thread also benefits from a
global memory view because it shares the memory space of a.out.
• A thread's work may best be described as a subroutine within the main program. Any thread can execute any
subroutine at the same time as other threads.
• Threads communicate with each other through global memory (updating address locations). This requires
synchronization constructs to ensure that more than one thread is not updating the same global address at
any time.
• Threads can come and go, but a.out remains present to provide the necessary shared resources until the application has completed.
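• A minimal Pthreads sketch of the description above (illustrative, not from the original slides): the main program plays the role of a.out, creates several threads that share one global variable, and uses a mutex so that no two threads update the same address at the same time.

/* Sketch of the threads model: one process, several concurrent execution paths
 * sharing the process's global memory. Compile with: cc threads_demo.c -pthread
 */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

long shared_sum = 0;                               /* global memory shared by all threads */
pthread_mutex_t sum_lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    long id = (long)arg;                           /* thread-local data */
    long local = 0;
    for (int i = 0; i < 1000; i++)                 /* independent work on private data */
        local += id + i;

    pthread_mutex_lock(&sum_lock);                 /* synchronize the shared update */
    shared_sum += local;
    pthread_mutex_unlock(&sum_lock);
    return NULL;
}

int main(void) {                                   /* the "heavy weight" process */
    pthread_t tid[NTHREADS];

    for (long t = 0; t < NTHREADS; t++)            /* create "light weight" threads */
        pthread_create(&tid[t], NULL, worker, (void *)t);

    for (int t = 0; t < NTHREADS; t++)             /* threads come and go; main remains */
        pthread_join(tid[t], NULL);

    printf("shared_sum = %ld\n", shared_sum);
    return 0;
}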
Implementations: Threads Model
• From a programming perspective, threads implementations commonly comprise:
• A library of subroutines that are called from within parallel source code
• A set of compiler directives embedded in either serial or parallel source code
• In both cases, the programmer is responsible for determining the parallelism (although
compilers can sometimes help).
• Threaded implementations are not new in computing. Historically, hardware vendors have
implemented their own proprietary versions of threads. These implementations differed
substantially from each other making it difficult for programmers to develop portable threaded
applications.
• Unrelated standardization efforts have resulted in two very different implementations of
threads: POSIX Threads and OpenMP.
• POSIX Threads
• Specified by the IEEE POSIX 1003.1c standard (1995). C Language only.
• Part of Unix/Linux operating systems
• Library based
• Commonly referred to as Pthreads.
• Very explicit parallelism; requires significant programmer attention to detail.
• OpenMP
• Industry standard, jointly defined and endorsed by a group of major computer hardware and software
vendors, organizations and individuals.
• Compiler directive based
• Portable / multi-platform, including Unix and Windows platforms
• Available in C/C++ and Fortran implementations
• Can be very easy and simple to use - provides for "incremental parallelism". Can begin with serial code.
• Other threaded implementations are common, but not discussed here:
• Microsoft threads
• Java, Python threads
• CUDA threads for GPUs
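• For contrast with Pthreads, a short OpenMP sketch (illustrative, not from the original slides) shows the "incremental parallelism" idea: a serial loop becomes threaded by adding a single compiler directive.

/* OpenMP sketch: one directive parallelizes an otherwise serial loop.
 * Compile with an OpenMP-capable compiler, e.g.: cc omp_demo.c -fopenmp
 */
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)            /* serial initialization */
        a[i] = (double)i;

    /* The directive turns the serial loop into a threaded one;
     * the reduction clause handles the shared update safely. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        b[i] = 2.0 * a[i];
        sum += b[i];
    }

    printf("sum = %f (threads available: %d)\n", sum, omp_get_max_threads());
    return 0;
}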
Distributed Memory/Message Passing Model
• This model demonstrates the following characteristics:
• A set of tasks that use their own local memory during computation. Multiple tasks can reside on the same
physical machine and/or across an arbitrary number of machines.
• Tasks exchange data through communications by sending and receiving messages.
• Data transfer usually requires cooperative operations to be performed by each process. For example, a send
operation must have a matching receive operation.
Implementations: Message Passing Model
• From a programming perspective, message passing implementations usually comprise a library of subroutines. Calls to these subroutines are embedded in source code. The programmer is responsible for determining all parallelism.
• Historically, a variety of message passing libraries have been available since the 1980s. These implementations differed substantially from each other, making it difficult for programmers to develop portable applications.
• In 1992, the MPI Forum was formed with the primary goal of establishing a standard interface for message passing implementations.
• MPI is the "de facto" industry standard for message passing, replacing virtually all other message passing implementations used for production work. MPI implementations exist for virtually all popular parallel computing platforms. Not all implementations include everything in MPI-1, MPI-2 or MPI-3.
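• A minimal MPI sketch of cooperative data transfer (illustrative, not from the original slides): a send on task 0 is matched by a receive on task 1.

/* MPI sketch: a send must be matched by a receive.
 * Typical build/run: mpicc mpi_demo.c -o mpi_demo && mpirun -np 2 ./mpi_demo
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* each task learns its identity */

    if (rank == 0) {
        value = 42;                          /* data lives in rank 0's local memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}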
Data Parallel Model
• May also be referred to as the Partitioned Global Address Space (PGAS) model.
• The data parallel model demonstrates the following characteristics:
• Address space is treated globally
• Most of the parallel work focuses on performing operations on a data set. The data set is typically organized
into a common structure, such as an array.
• A set of tasks works collectively on the same data structure; however, each task works on a different partition of that data structure.
• Tasks perform the same operation on their partition of work, for example, "add 4 to every array element".
• On shared memory architectures, all tasks may have access to the data structure through
global memory.
• On distributed memory architectures, the global data structure can be split up logically and/or
physically across tasks.
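• To make the partitioning concrete, the sketch below (illustrative, not from the original slides) block-partitions a global array across MPI tasks; each task applies the same operation, "add 4 to every array element", to its own partition. The array size and the even block split are assumptions made for the example.

/* Data parallel sketch: same operation, different partition of one global data set.
 * Typical build/run: mpicc dp_demo.c -o dp_demo && mpirun -np 4 ./dp_demo
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 16                                /* global array size (assumed to divide evenly) */

int main(int argc, char **argv) {
    int rank, ntasks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);

    int chunk = N / ntasks;                 /* size of each task's partition */
    double *global = NULL;
    double *local = malloc(chunk * sizeof(double));

    if (rank == 0) {                        /* rank 0 holds the logical global array */
        global = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) global[i] = (double)i;
    }

    /* Split the global structure across tasks... */
    MPI_Scatter(global, chunk, MPI_DOUBLE, local, chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (int i = 0; i < chunk; i++)         /* every task: "add 4 to every element" */
        local[i] += 4.0;

    /* ...and collect the partitions back into the global array. */
    MPI_Gather(local, chunk, MPI_DOUBLE, global, chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < N; i++) printf("%g ", global[i]);
        printf("\n");
        free(global);
    }
    free(local);
    MPI_Finalize();
    return 0;
}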
Implementations: Data Parallel Model
• Currently, there are several relatively popular, and sometimes developmental, parallel programming
implementations based on the Data Parallel / PGAS model.
• Coarray Fortran: a small set of extensions to Fortran 95 for SPMD parallel programming. Compiler
dependent.
• Unified Parallel C (UPC): an extension to the C programming language for SPMD parallel programming.
Compiler dependent.
• Global Arrays: provides a shared memory style programming environment in the context of distributed array
data structures. Public domain library with C and Fortran77 bindings.
• X10: a PGAS based parallel programming language being developed by IBM at the Thomas J. Watson
Research Center.
• Chapel: an open-source parallel programming language project being led by Cray.
Hybrid Model
• A hybrid model combines more than one of the previously described programming models.
• Currently, a common example of a hybrid model is the combination of the message passing
model (MPI) with the threads model (OpenMP).
• Threads perform computationally intensive kernels using local, on-node data
• Communication between processes on different nodes occurs over the network using MPI
• This hybrid model lends itself well to the most popular hardware environment of clustered
multi/many-core machines.
• Another similar and increasingly popular example of a hybrid model is using MPI with CPU-GPU
(Graphics Processing Unit) programming.
• MPI tasks run on CPUs using local memory and communicating with each other over a network.
• Computationally intensive kernels are off-loaded to GPUs on-node.
• Data exchange between node-local memory and GPUs uses CUDA (or something equivalent).
• Other hybrid models are common:
• MPI with Pthreads
• MPI with non-GPU accelerators, etc.
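• A short sketch of the MPI + OpenMP combination (illustrative, not from the original slides): MPI handles communication between processes, while an OpenMP parallel region performs the on-node computation within each process.

/* Hybrid sketch: MPI between processes, OpenMP threads within each process.
 * Typical build/run: mpicc hybrid_demo.c -fopenmp -o hybrid && mpirun -np 2 ./hybrid
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(int argc, char **argv) {
    int rank, provided;
    double local_sum = 0.0, global_sum = 0.0;

    /* Request an MPI library that tolerates threads inside each process. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Computationally intensive kernel: threads work on local, on-node data. */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < N; i++)
        local_sum += (double)i * (rank + 1);

    /* Communication between processes occurs over the network using MPI. */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %g (each rank may use up to %d threads)\n",
               global_sum, omp_get_max_threads());

    MPI_Finalize();
    return 0;
}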
Single Program Multiple Data (SPMD)
• SPMD is a "high level" programming model that can be built upon any combination of the
previously mentioned parallel programming models.
• SINGLE PROGRAM: All tasks execute their copy of the same program simultaneously. This
program can be threads, message passing, data parallel or hybrid.
• MULTIPLE DATA: All tasks may use different data
• SPMD programs usually have the necessary logic programmed into them to allow different
tasks to branch or conditionally execute only those parts of the program they are designed to
execute. That is, tasks do not necessarily have to execute the entire program - perhaps only a
portion of it.
• The SPMD model, using message passing or hybrid programming, is probably the most used
parallel programming model for multi-node clusters.
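• A minimal SPMD sketch (illustrative, not from the original slides): every task runs its own copy of the same program, and rank-based branching selects which portion each task executes.

/* SPMD sketch: one program, every task executes its own copy,
 * and rank-based branching selects which portion each task runs.
 * Typical run: mpirun -np 4 ./spmd_demo
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, ntasks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);

    if (rank == 0) {
        /* Only the coordinating portion of the program runs here. */
        printf("rank 0: coordinating %d tasks\n", ntasks);
    } else {
        /* The other tasks execute only the worker portion. */
        printf("rank %d: doing my share of the work\n", rank);
    }

    MPI_Barrier(MPI_COMM_WORLD);   /* all copies of the same program meet here */
    MPI_Finalize();
    return 0;
}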
Multiple Program Multiple Data (MPMD)
• Like SPMD, MPMD is a "high level" programming model that can be built upon any
combination of the previously mentioned parallel programming models.
• MULTIPLE PROGRAM: Tasks may execute different programs simultaneously. The programs can
be threads, message passing, data parallel or hybrid.
• MULTIPLE DATA: All tasks may use different data
• MPMD applications are not as common as SPMD applications but may be better suited for
certain types of problems, particularly those that lend themselves better to functional
decomposition than domain decomposition.