Basic Communication Operations
Preliminaries
• A big problem is divided into smaller tasks (logical units)
• A process is an entity that executes tasks
• Mapping is performed to allocate tasks to processes
• Several processes execute at the same time and perform inter-process communication (interaction)
• Interaction is performed to share data, work, and synchronization information
• There are various patterns for communication
Assumptions for the Operations
• Interconnects support cut-through routing
• The communication time between any pair of nodes in the network is the same, regardless of the number of intermediate nodes
• Links are bidirectional: two directly connected nodes can simultaneously send each other messages of m words without any congestion
• Single-port communication model:
• A node can send on only one of its links at a time
• A node can receive on only one of its links at a time
• However, a node can receive one message while sending another at the same time, on the same or a different link
Patterns
1. One-to-All Broadcast / All-to-One Reduction
2. All-to-All Broadcast / All-to-All Reduction
3. All-Reduce (All-to-One Reduction + One-to-All Broadcast)
4. Scatter (One-to-All Personalized Communication) / Gather
Topologies
1. Ring / Linear Array (one-dimensional)
2. Mesh (two-dimensional)
3. Hypercube (d-dimensional; the examples here use a three-dimensional, 8-node cube)
One-to-All Broadcast and All-to-One Reduction
One-to-All Broadcast
• A single process sends identical data to all other processes.
• Initially, only one process holds the m-word data.
• After the broadcast, every process has its own copy of the m-word data.
All-to-One Reduction
• Dual of one-to-all broadcast
• The m-word data from all processes are combined through an associative operator
• The result is accumulated at a single destination process in one buffer of size m
One-to-All Broadcast and All-to-One Reduction
• Application: used in many parallel algorithms, including matrix-vector multiplication, shortest paths, and Gaussian elimination.
• Naïve approach: the source sequentially sends the message to each of the other p-1 processes.
• Disadvantages:
• The source becomes a bottleneck
• The communication network is underutilized, because only the connection between a single pair of nodes is used at a time
• Solution: recursive doubling
Recursive doubling (Linear Array or Ring)
Recursive Doubling Broadcast
• The source process sends the message to one other process
• In the next communication phase, both processes can simultaneously propagate the message further
• The message “HI” from source node P0 is passed to all other nodes of an 8-node ring in the following three steps (see the sketch after this list):
1. P0 to P4 (distance 4)
2. P0 to P2 and P4 to P6, in parallel (distance 2)
3. P0 to P1, P2 to P3, P4 to P5, P6 to P7, in parallel (distance 1)
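Below is a minimal Python sketch (not from the slides) that simulates these three steps; the 8-node ring, source node 0, and the function name are illustrative assumptions.

```python
# Minimal simulation of recursive-doubling one-to-all broadcast on an
# 8-node ring. Node 0 starts with the message; in each step every node
# that already holds it forwards a copy over half the remaining distance.
def recursive_doubling_broadcast(p=8, source=0, msg="HI"):
    data = {source: msg}            # node -> message it currently holds
    d = p // 2                      # first hop covers half the ring
    step = 1
    while d >= 1:
        for s in list(data):
            dest = (s + d) % p      # forward the message d nodes away
            data[dest] = data[s]
        print(f"step {step}: nodes with message -> {sorted(data)}")
        d //= 2
        step += 1
    return data

recursive_doubling_broadcast()
# step 1: nodes with message -> [0, 4]
# step 2: nodes with message -> [0, 2, 4, 6]
# step 3: nodes with message -> [0, 1, 2, 3, 4, 5, 6, 7]
```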
Recursive doubling (Linear Array or Ring)
Recursive Doubling Reduction
Example: sum of the numbers held by all nodes (a sketch follows below)
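A minimal sketch of this dual operation, assuming an 8-node ring, a sum operator, and node 0 as the destination; the function name and example values are illustrative.

```python
# Minimal simulation of all-to-one reduction (sum) on an 8-node ring by
# reversing the broadcast steps: nodes send their partial sums toward the
# destination, with the distance doubling each step.
def recursive_halving_reduction(values, dest=0):
    p = len(values)
    partial = dict(enumerate(values))        # node -> its current partial sum
    d = 1                                    # start with nearest neighbours
    while d < p:
        for s in list(partial):
            if (s - dest) % p % (2 * d) == d:        # node s sends in this step
                partial[(s - d) % p] += partial.pop(s)
        d *= 2
    return partial[dest]

print(recursive_halving_reduction([1, 2, 3, 4, 5, 6, 7, 8]))  # 36
```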
Mesh
• Each row and each column of a square mesh of p nodes can be regarded as a linear array of √p nodes
• Communication algorithms on the mesh are therefore simple extensions of their linear-array counterparts
Broadcast and Reduction
• Two-step breakdown:
i. The operation is performed along one dimension, treating each row as a linear array
ii. Then all the columns are treated similarly
One to all Broadcast on a 16 node mesh
3 7 11 15

2 6 10 14

1 5 9 13

0 4 8 12
“HI”
Step 1 (0th row recursive doubling)
3 7 11 15

2 6 10 14

1 5 9 13

0 4 8 12
“HI” “HI”
Step 2 (0th row recursive doubling)
3 7 11 15

2 6 10 14

1 5 9 13

0 4 8 12
“HI” “HI” “HI” “HI”
Step 3 (All Column recursive doubling)
3 7 11 15

2 6 10 14
“HI” “HI” “HI” “HI”

1 5 9 13

0 4 8 12
“HI” “HI” “HI” “HI”
Step 4 (All Column recursive doubling)
3 7 11 15
“HI” “HI” “HI” “HI”

2 6 10 14
“HI” “HI” “HI” “HI”

1 5 9 13
“HI” “HI” “HI” “HI”

0 4 8 12
“HI” “HI” “HI” “HI”
Reduction
3 7 11 15
“HI” “HI” “HI” “HI”

2 6 10 14
“HI” “HI” “HI” “HI”

1 5 9 13
“HI” “HI” “HI” “HI”

0 4 8 12
“HI” “HI” “HI” “HI”
3 7 11 15

2 6 10 14
“HI” “HI” “HI” “HI”

1 5 9 13

0 4 8 12
“HI” “HI” “HI” “HI”
3 7 11 15

2 6 10 14

1 5 9 13

0 4 8 12
“HI” “HI” “HI” “HI”
3 7 11 15

2 6 10 14

1 5 9 13

0 4 8 12
“HI” “HI”
Mesh (Broadcast and Reduction)
Hypercube
Broadcast
• The source node first sends the data to one node along the highest dimension
• Communication then proceeds successively along lower dimensions in subsequent steps
• The algorithm is the same as the one used for the linear array
• However, on the hypercube, changing the order of the dimensions does not congest the network
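A minimal sketch of this broadcast on a d-dimensional hypercube using bitwise node labels; it assumes node 0 is the source and d = 3, matching the 8-node examples.

```python
# Minimal simulation of one-to-all broadcast on a d-dimensional hypercube.
# Node labels are d-bit integers; in each step, nodes that already hold the
# message send it to the neighbour whose label differs in the current bit,
# starting from the highest dimension.
def hypercube_broadcast(d=3, msg="HI"):
    holders = {0: msg}                        # source is assumed to be node 0
    for bit in reversed(range(d)):            # highest dimension first
        for node in list(holders):
            partner = node ^ (1 << bit)       # neighbour across this dimension
            holders[partner] = holders[node]
        print(f"after dim {bit}: {sorted(holders)}")
    return holders

hypercube_broadcast()
# after dim 2: [0, 4]
# after dim 1: [0, 2, 4, 6]
# after dim 0: [0, 1, 2, 3, 4, 5, 6, 7]
```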
Hypercube (Broadcast)
Matrix-Vector Multiplication (An Application)
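The slide only names the application, so the following is a hypothetical illustration assuming a row-wise 1-D partitioning of the matrix: the vector is delivered to every process by a one-to-all broadcast, each process multiplies its block of rows, and the partial results are gathered.

```python
# Hypothetical illustration (row-wise 1-D partitioning is an assumption):
# one-to-all broadcast distributes the vector x, each process multiplies its
# local block of rows, and a gather collects the result vector y.
import numpy as np

def simulated_matvec(A, x, p=4):
    n = A.shape[0]
    rows_per_proc = n // p                          # assumes p divides n
    x_local = [x.copy() for _ in range(p)]          # one-to-all broadcast of x
    y_blocks = []
    for rank in range(p):                           # each process works on its rows
        block = A[rank * rows_per_proc:(rank + 1) * rows_per_proc]
        y_blocks.append(block @ x_local[rank])
    return np.concatenate(y_blocks)                 # gather the partial results

A = np.arange(16, dtype=float).reshape(4, 4)
x = np.ones(4)
print(np.allclose(simulated_matvec(A, x), A @ x))   # True
```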
All-to-All Broadcast and All-to-All Reduction
• All-to-All Broadcast
• A generalization of one-to-all broadcast.
• Every process broadcasts an m-word message.
• The message broadcast by each process can be different from those of the other processes.
• All-to-All Reduction
• Dual of all-to-all broadcast
• Each node is the destination of one all-to-one reduction, out of p reductions in total.
All-to-All Broadcast and All-to-All Reduction
Linear Ring Broadcast (All to All)
Linear Ring Reduction (All to All)
• Draw an all-to-all broadcast on a p-node linear ring
• Reverse the direction of each step without changing the messages
• After each communication step, combine messages having the same broadcast destination with the associative operator (a sketch of the broadcast being reversed follows below)
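A minimal sketch of the all-to-all broadcast on a ring that this reduction reverses; the 4-node ring size and function name are illustrative.

```python
# Minimal simulation of all-to-all broadcast on a p-node ring: in each of the
# p-1 steps, every node forwards the message it received in the previous step
# to its right neighbour while keeping a copy for itself.
def ring_all_to_all_broadcast(p=4):
    collected = [{i} for i in range(p)]      # messages each node has so far
    in_transit = list(range(p))              # message each node forwards next
    for _ in range(p - 1):
        new_transit = [None] * p
        for node in range(p):
            right = (node + 1) % p
            new_transit[right] = in_transit[node]   # send to right neighbour
            collected[right].add(in_transit[node])  # neighbour keeps a copy
        in_transit = new_transit
    return collected

print(ring_all_to_all_broadcast())
# every node ends up with {0, 1, 2, 3}; the all-to-all reduction runs these
# steps in reverse, combining values instead of collecting them
```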
Task
• Draw an All-to-All Broadcast on a 4-node linear ring
• Reverse the directions and combine the results using ‘SUM’
All-to-All Broadcast on 2D Mesh
• Based on the linear array algorithm, treating the rows and columns of the mesh as linear arrays
• Communication takes place in two phases:
• Row-wise all-to-all broadcast
• Column-wise all-to-all broadcast
All-to-All Broadcast on HyperCube
• The hypercube algorithm for all-to-all broadcast extends
the mesh algorithm to log p dimensions.
• Procedure: Requires log p steps.
• Communication: Occurs along a different dimension (x, y,
z) of the p-node hypercube in each step.
• Step Process: Pairs of nodes exchange data, doubling the
message size for the next step by concatenating received
messages with current data.
• Figure: the figure illustrates these steps for an eight-node hypercube with bidirectional communication channels.
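A minimal sketch of this exchange for d = 3, where each node's data is modelled as a list that doubles in length every step; the function name is illustrative.

```python
# Minimal simulation of all-to-all broadcast on a d-dimensional hypercube: in
# each of the log p steps, partner nodes across one dimension exchange
# everything they currently hold, doubling their message size.
def hypercube_all_to_all_broadcast(d=3):
    p = 1 << d
    held = [[i] for i in range(p)]            # node i starts with message i
    for bit in range(d):                      # one dimension per step
        next_held = [list(h) for h in held]
        for node in range(p):
            partner = node ^ (1 << bit)
            next_held[node] += held[partner]  # concatenate the partner's data
        held = next_held
    return held

print(sorted(hypercube_all_to_all_broadcast()[5]))  # [0, 1, 2, 3, 4, 5, 6, 7]
```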
All-Reduce
• All-reduce = all-to-one reduction + one-to-all broadcast
• It can be implemented as an all-to-one reduction followed by a one-to-all broadcast
• It can also follow the all-to-all broadcast pattern, with messages combined by the associative operator instead of concatenated; every node then holds the reduced result, with less traffic congestion (see the sketch below)
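A minimal sketch of all-reduce (sum) on a 3-dimensional hypercube using the recursive-doubling pattern just described, combining values instead of concatenating them; the function name and input values are illustrative.

```python
# Minimal sketch of all-reduce (sum) on a d-dimensional hypercube: partners
# across each dimension exchange and combine their values with the associative
# operator, so the message size stays m throughout.
def hypercube_all_reduce(values):
    p = len(values)                    # assumed to be a power of two
    d = p.bit_length() - 1
    vals = list(values)
    for bit in range(d):
        new_vals = list(vals)
        for node in range(p):
            partner = node ^ (1 << bit)
            new_vals[node] = vals[node] + vals[partner]   # combine, don't concatenate
        vals = new_vals
    return vals                        # every node now holds the full sum

print(hypercube_all_reduce([1, 2, 3, 4, 5, 6, 7, 8]))  # [36, 36, ..., 36]
```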
Example
• All-to-All Broadcast
• All-to-One Reduction
• One-to-All Broadcast
Prefix-Sums
• Prefix sums are also known as scan operations
• Given p numbers n_0, n_1, ..., n_{p-1} (one on each node), the problem is to compute the sums
  $s_k = \sum_{i=0}^{k} n_i$
• Here $s_k$ is the prefix sum computed at the k-th node after the operation.
• Example:
• Original sequence: <3, 1, 4, 0, 2>
• Sequence of prefix sums: <3, 4, 8, 8, 10>
Rules
• Round bracket ( ): the message is sent to the other node in the next step
• Square bracket [ ]: the message is kept at that node
• The lower-indexed node keeps its square-bracket message unchanged
• The higher-indexed node adds the message it received from the lower-indexed node to its square-bracket message
(A sketch of the corresponding hypercube prefix-sum procedure follows below.)
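A minimal sketch of the hypercube prefix-sum (scan) procedure following these bracket rules, keeping a "square bracket" result and a "round bracket" message per node. It assumes a power-of-two node count, so the slide's 5-element example is extended to eight illustrative values.

```python
# Minimal simulation of prefix sums (scan) on a d-dimensional hypercube: every
# node keeps a running result (the "square bracket" value) and a message
# buffer (the "round bracket" value) exchanged with its partner in each
# dimension; only messages coming from lower-numbered nodes are folded into
# the result, while the message buffer always accumulates.
def hypercube_prefix_sum(values):
    p = len(values)                    # assumed to be a power of two
    d = p.bit_length() - 1
    result = list(values)              # value kept at each node
    msg = list(values)                 # value forwarded to the partner
    for bit in range(d):
        incoming = [msg[node ^ (1 << bit)] for node in range(p)]
        for node in range(p):
            if (node ^ (1 << bit)) < node:        # partner has a lower label
                result[node] += incoming[node]    # fold into the prefix sum
            msg[node] += incoming[node]           # message always accumulates
    return result

print(hypercube_prefix_sum([3, 1, 4, 0, 2, 5, 1, 6]))
# [3, 4, 8, 8, 10, 15, 16, 22]
```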
Scatter and Gather
• Scatter (one-to-all personalized communication): the source sends a distinct message to every process
• Gather (concatenation) is different from all-to-one reduction, as it does not combine the results with an associative operator
The scatter operation on an eight-node hypercube
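A minimal sketch of scatter on a 3-dimensional hypercube: in each step a holder passes on the half of its messages destined for the other half of its subcube; node labels and message names are illustrative.

```python
# Minimal simulation of scatter on a d-dimensional hypercube: node 0 starts
# with one personalized message per node; in each step a holder sends its
# partner the messages destined for the partner's half of the subcube.
def hypercube_scatter(d=3):
    p = 1 << d
    held = {0: {dest: f"msg{dest}" for dest in range(p)}}   # source holds all p messages
    for bit in reversed(range(d)):                 # highest dimension first
        for node in list(held):
            partner = node ^ (1 << bit)
            to_send = {dst: m for dst, m in held[node].items()
                       if ((dst >> bit) & 1) != ((node >> bit) & 1)}
            held[partner] = to_send                # partner's half of the cube
            for dst in to_send:
                del held[node][dst]
    return held                                    # each node keeps exactly its own message

print(hypercube_scatter())
# {0: {0: 'msg0'}, 4: {4: 'msg4'}, 2: {2: 'msg2'}, ..., 7: {7: 'msg7'}}
```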
All-to-All personalized Communication
• Each node sends a distinct message of size m to every
other node.
• Also known as total exchange
Example (Transpose Matrix)

All-to-all personalized communication in transposing a 4 x 4 matrix using four processes.
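A sketch of why this transpose is a total exchange, assuming a row-block layout (an assumption, since the figure is not reproduced here): the block that process i holds in column j must end up on process j.

```python
# Sketch of a distributed matrix transpose as an all-to-all personalized
# exchange: with a row-wise layout, process i owns row block i, and the piece
# it holds in column block j must travel to process j.
import numpy as np

def transpose_by_total_exchange(A, p=4):
    n = A.shape[0]
    b = n // p                                    # block size per process
    # message from process i to process j: the transposed (i, j) block
    msgs = {(i, j): A[i*b:(i+1)*b, j*b:(j+1)*b].T
            for i in range(p) for j in range(p)}
    # after the exchange, process j assembles its row block of A^T
    out_blocks = [[msgs[(i, j)] for i in range(p)] for j in range(p)]
    return np.block(out_blocks)

A = np.arange(16).reshape(4, 4)
print(np.array_equal(transpose_by_total_exchange(A), A.T))  # True
```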
All-to-All personalized [Ring]
Cont.
• All-to-all personalized communication on a six-node
ring.
• The label of each message is of the form {x, y},
where x is the label of the node that originally owned
the message, and y is the label of the node that is the
final destination of the message.
• The label ({x1, y1}, {x2, y2}, ..., {xn, yn}) indicates a
message that is formed by concatenating n individual
messages.
All-to-All personalized [Mesh]
• Two steps:
1. All-to-all personalized communication (row-wise)
2. All-to-all personalized communication (column-wise)
All-to-All Personalized [Hypercube]
• 0th process (node 0)
• 1st step (x-axis), 0 <-> 1: sends (0,1), (0,3), (0,5), (0,7)
• 2nd step (y-axis), 0 <-> 2: sends (0,2), (0,6), (1,2), (1,6)
• 3rd step (z-axis), 0 <-> 4: sends (0,4), (1,4), (2,4), (3,4)
• 2nd process (node 2)
• 1st step (x-axis), 2 <-> 3: sends (2,3), (2,7), (2,5), (2,1)
• 2nd step (y-axis), 2 <-> 0: sends (2,0), (2,4), (3,0), (3,4)
• 3rd step (z-axis), 2 <-> 6: sends (2,6), (3,6), (0,6), (1,6)
(A sketch reproducing these steps follows below.)
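A minimal sketch that reproduces the (source, destination) pairs listed above: in step k, partners across dimension k exchange exactly those messages whose destination differs from the holder in bit k.

```python
# Minimal simulation of all-to-all personalized communication on a
# d-dimensional hypercube; each message is a (source, destination) pair.
def hypercube_total_exchange(d=3):
    p = 1 << d
    held = {n: [(n, dst) for dst in range(p)] for n in range(p)}
    for bit in range(d):                              # x, y, z dimensions in turn
        snapshot = {n: list(m) for n, m in held.items()}
        for node in range(p):
            partner = node ^ (1 << bit)
            keep, send = [], []
            for src, dst in snapshot[node]:
                if ((dst >> bit) & 1) != ((node >> bit) & 1):
                    send.append((src, dst))           # destination lies in partner's half
                else:
                    keep.append((src, dst))
            recv = [m for m in snapshot[partner]
                    if ((m[1] >> bit) & 1) == ((node >> bit) & 1)]
            held[node] = keep + recv
            if node == 0:
                print(f"step {bit + 1}: node 0 sends {send}")
    return held

hypercube_total_exchange()
# step 1: node 0 sends [(0, 1), (0, 3), (0, 5), (0, 7)]
# step 2: node 0 sends [(0, 2), (0, 6), (1, 2), (1, 6)]
# step 3: node 0 sends [(0, 4), (1, 4), (2, 4), (3, 4)]
```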
Circular Shift
• A circular q-shift is the operation in which node i sends a data packet to node (i + q) mod p in a p-node ensemble (0 < q < p).
Circular Shift [Linear/Ring]
• Use min(q, p - q) neighbour-to-neighbour steps, shifting in whichever direction around the ring gives the shorter path.
Circular Shift [Mesh]
• A circular shift on the mesh topology is done in the following steps:
1. Circular shift over the rows by (q mod √p)
2. Compensatory column shift
3. Circular shift over the columns by ⌊q / √p⌋
The communication steps in a circular 5-shift on a 4 x 4 mesh.
Circular Shift [Hypercube]
• For a q-shift, e.g. a 5-shift:
• First write q in binary: 5 = (101)_2
• Expand the set bits as powers of 2: 2^2 + 2^0
• i.e. 5 = 4 + 1
• A 5-shift = a 4-shift followed by a 1-shift (a sketch of this decomposition follows below)
The mapping of an eight-node linear array onto a three-dimensional hypercube to perform a circular 5-shift as a combination of a 4-shift and a 1-shift.
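A minimal sketch of this decomposition, modelling only the shift arithmetic (the routing of each power-of-two sub-shift over the hypercube links is not simulated); the function name and packet labels are illustrative.

```python
# Minimal sketch of decomposing a circular q-shift into power-of-two shifts,
# as in the 5-shift = 4-shift + 1-shift example above.
def circular_shift_by_powers_of_two(data, q):
    p = len(data)
    shifted = list(data)
    for k in range(p.bit_length()):
        if (q >> k) & 1:                        # bit k of q is set -> do a 2^k shift
            step = 1 << k
            shifted = [shifted[(i - step) % p] for i in range(p)]
    return shifted

data = [f"pkt{i}" for i in range(8)]
print(circular_shift_by_powers_of_two(data, 5))
# node (i + 5) mod 8 ends up with pkt_i, e.g. index 5 holds 'pkt0'
```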
