Parallel
Algorithms
Shashikant V. Athawale
Assistant Professor, Computer Engineering Department,
AISSMS College of Engineering,
Kennedy Road, Pune, MS, India - 411001
Parallel Algorithms
Parallel: perform more than one operation at a time.
PRAM model: Parallel Random Access Machine.
[Figure: processors p0, p1, …, pn-1, each connected to a shared memory]
Multiple processors connected to a shared memory.
Each processor can access any location in unit time.
All processors can access memory in parallel.
All processors can perform operations in parallel.
Concurrent vs. Exclusive Access
Four models:
EREW: exclusive read and exclusive write
CREW: concurrent read and exclusive write
ERCW: exclusive read and concurrent write
CRCW: concurrent read and concurrent write
Handling write conflicts
Common-write model: a concurrent write succeeds only if all processors write the same value.
Arbitrary-write model: an arbitrary one of the writers succeeds.
Priority-write model: the writer with the smallest index succeeds.
EREW and CRCW are the most popular.
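As a rough illustration of the three write-conflict policies, the sketch below resolves a batch of simultaneous writes to a single memory cell. The function name `resolve_write` and the representation of a write as a `(processor_index, value)` pair are assumptions made for this example; they are not part of the PRAM definition.

```python
import random

def resolve_write(writes, model):
    """Resolve simultaneous writes to ONE cell.

    writes: list of (processor_index, value) pairs (hypothetical format).
    model:  "common", "arbitrary", or "priority".
    Returns the value that ends up stored in the cell.
    """
    if not writes:
        raise ValueError("no processor wrote to this cell")
    if model == "common":
        values = {value for _, value in writes}
        if len(values) != 1:
            raise ValueError("common-write model: all values must agree")
        return values.pop()
    if model == "arbitrary":
        # Any one writer may win; a random choice models "arbitrary".
        return random.choice(writes)[1]
    if model == "priority":
        # The processor with the smallest index wins.
        return min(writes, key=lambda w: w[0])[1]
    raise ValueError(f"unknown model: {model}")
```

An algorithm written for the arbitrary-write model must be correct whichever writer wins, so the random choice here is only one possible behaviour.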
Synchronization and Control
Synchronization:
One of the most important and complicated issues.
Suppose all processors are inherently tightly synchronized:
  All processors execute the same statement at the same time.
  There are no races among processors, i.e., all proceed at the same pace.
Termination control of a parallel loop:
Depends on the state of all processors.
Assumed to be testable in O(1) time.
Pointer jumping – list ranking
Given a singly linked list L with n objects, compute, for each object in L, its distance from the end of the list.
Formally, with next as the pointer field:
d[i] = 0                 if next[i] = nil
d[i] = d[next[i]] + 1    if next[i] ≠ nil
Serial algorithm: Θ(n) time.
List ranking – EREW algorithm
LIST-RANK(L) (in O(lg n) time)
1. for each processor i, in parallel
2.     do if next[i] = nil
3.         then d[i] ← 0
4.         else d[i] ← 1
5. while there exists an object i such that next[i] ≠ nil
6.     do for each processor i, in parallel
7.         do if next[i] ≠ nil
8.             then d[i] ← d[i] + d[next[i]]
9.                  next[i] ← next[next[i]]
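[Figure: d values for the list 3 → 4 → 6 → 1 → 0 → 5; (a) is the state after initialization, (b) to (d) the state after each iteration of the while loop]
object:  3 4 6 1 0 5
(a) d:   1 1 1 1 1 0
(b) d:   2 2 2 2 1 0
(c) d:   4 4 3 2 1 0
(d) d:   5 4 3 2 1 0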
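The pointer-jumping steps of LIST-RANK can be traced with a small sequential simulation. This is only a sketch: it runs on one processor and mimics the synchronous PRAM step by computing all new values from the old ones before storing any of them. The name `list_rank`, the use of `None` for nil, and the array layout (object numbers as indices, with index 2 left as an unused singleton) are assumptions for the example.

```python
def list_rank(next_ptr):
    """Simulate LIST-RANK with synchronous pointer jumping.

    next_ptr[i] is the successor of object i, or None at the end of a list.
    Returns (d, rounds): d[i] is the distance from i to the end of its list.
    """
    nxt = list(next_ptr)            # work on a copy; the input is kept
    n = len(nxt)
    d = [0 if nxt[i] is None else 1 for i in range(n)]
    rounds = 0
    while any(p is not None for p in nxt):
        # Read phase: every "processor" computes from the old values ...
        new_d, new_nxt = list(d), list(nxt)
        for i in range(n):
            if nxt[i] is not None:
                new_d[i] = d[i] + d[nxt[i]]
                new_nxt[i] = nxt[nxt[i]]
        # ... write phase: all updates become visible together.
        d, nxt = new_d, new_nxt
        rounds += 1
    return d, rounds

# The list from the figure: 3 -> 4 -> 6 -> 1 -> 0 -> 5
order = [3, 4, 6, 1, 0, 5]
next_ptr = [None] * 7               # index 2 does not occur in the figure
for a, b in zip(order, order[1:]):
    next_ptr[a] = b
d, rounds = list_rank(next_ptr)
print([d[i] for i in order], rounds)   # prints [5, 4, 3, 2, 1, 0] 3
```

The three rounds correspond to panels (b), (c) and (d) of the figure.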
List ranking – correctness of the EREW algorithm
Loop invariant: for each i, the sum of the d values in the sublist headed by i is the correct distance from i to the end of the original list L.
Parallel memory accesses must be synchronized: in lines 8 and 9, all reads on the right-hand side must occur before any write on the left-hand side. Moreover, in line 8 every processor reads d[i] first and then d[next[i]].
The algorithm is EREW: every read and write is exclusive. For an object i, its own processor reads d[i] first, and then the processor of its predecessor reads it; the two reads happen in separate steps, so no cell is read by two processors at once. All writes go to distinct locations.
List ranking – running time of the EREW algorithm
O(lg n):
The initialization for loop runs in O(1).
Each iteration of the while loop runs in O(1).
There are exactly ⌈lg n⌉ iterations:
  Each iteration transforms each list into two interleaved lists: one consisting of the objects in even positions, and the other of the objects in odd positions. Thus, each iteration doubles the number of lists but halves their lengths.
The termination test in line 5 runs in O(1).
Define work = #processors × running time. Here the work is n × O(lg n) = O(n lg n), compared with Θ(n) for the serial algorithm.
Parallel prefix on a list
A prefix computation is defined as:
Input: <x_1, x_2, …, x_n>
Binary associative operation ⊗
Output: <y_1, y_2, …, y_n>
Such that:
  y_1 = x_1
  y_k = y_{k-1} ⊗ x_k for k = 2, 3, …, n, i.e., y_k = x_1 ⊗ x_2 ⊗ … ⊗ x_k.
Suppose <x_1, x_2, …, x_n> are stored in order in a linked list.
Define the notation: [i,j] = x_i ⊗ x_{i+1} ⊗ … ⊗ x_j
Prefix computation LIST-PREFIX(L)
1. for each processor i, in parallel
2. do y[i]← x[i]
3. while there exists an object i such that next[i]≠nil
4. do for each processor i, in parallel
5. do if next[i]≠nil
6. then y[next[i]]← y[i] ⊗ y[next[i]]
7. next[i] ←next[next[i]]
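[Figure: LIST-PREFIX on a six-object list holding x1, …, x6; (a) is the state after initialization, (b) to (d) the y values after each iteration of the while loop, in the [i,j] notation]
object:  x1     x2     x3     x4     x5     x6
(a) y:   [1,1]  [2,2]  [3,3]  [4,4]  [5,5]  [6,6]
(b) y:   [1,1]  [1,2]  [2,3]  [3,4]  [4,5]  [5,6]
(c) y:   [1,1]  [1,2]  [1,3]  [1,4]  [2,5]  [3,6]
(d) y:   [1,1]  [1,2]  [1,3]  [1,4]  [1,5]  [1,6]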
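LIST-PREFIX can be simulated in the same way as list ranking. The sketch below is sequential and only imitates the synchronous step (all reads before all writes). The name `list_prefix`, the `op` parameter standing for ⊗, and the use of `None` for nil are assumptions for the example.

```python
from operator import add

def list_prefix(x, next_ptr, op):
    """Simulate LIST-PREFIX: y[i] becomes the op-prefix ending at object i.

    x[i] is the value at object i; next_ptr[i] is its successor or None.
    op must be associative; it need not be commutative.
    """
    nxt = list(next_ptr)
    y = list(x)
    n = len(y)
    while any(p is not None for p in nxt):
        new_y, new_nxt = list(y), list(nxt)
        for i in range(n):
            if nxt[i] is not None:
                # Processor i updates its successor's value, not its own.
                # Successors are distinct, so the writes are exclusive.
                new_y[nxt[i]] = op(y[i], y[nxt[i]])
                new_nxt[i] = nxt[nxt[i]]
        y, nxt = new_y, new_nxt
    return y

# Six objects stored in list order, as in the figure, with + as the operation.
print(list_prefix([1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, None], add))
# prints [1, 3, 6, 10, 15, 21]
```

Because the earlier operand is always passed first, the sketch also works for a non-commutative operation such as string concatenation.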
Find roots – CREW algorithm
Given a forest of binary trees in which each node i has a pointer parent[i],
find, for each node, the identity of the root of its tree.
Assume that each node has an associated processor.
Assume that each node i has a field root[i].
Find roots – CREW algorithm
FIND-ROOTS(F)
1. for each processor i, in parallel
2.     do if parent[i] = nil
3.         then root[i] ← i
4. while there exists a node i such that parent[i] ≠ nil
5.     do for each processor i, in parallel
6.         do if parent[i] ≠ nil
7.             then root[i] ← root[parent[i]]
8.                  parent[i] ← parent[parent[i]]
Find roots – CREW algorithm
Running time: O(lg d), where d is the depth of the deepest tree in the forest.
All the writes are exclusive.
But the reads in line 7 are concurrent, since several nodes may have the same node as parent.
See figure 30.5.
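A sequential sketch of FIND-ROOTS, assuming the forest is given as a `parent` array with `None` for nil (the function name `find_roots` and the example forest are hypothetical). On a real CREW PRAM the inner loop would be one parallel step; here the old values are copied first so that every read sees the state from before the step.

```python
def find_roots(parent):
    """Simulate FIND-ROOTS; returns (root, rounds)."""
    par = list(parent)
    n = len(par)
    root = [i if par[i] is None else None for i in range(n)]
    rounds = 0
    while any(p is not None for p in par):
        new_root, new_par = list(root), list(par)
        for i in range(n):
            if par[i] is not None:
                # Several children may read the same parent's fields here:
                # this is the concurrent read that makes the algorithm CREW.
                new_root[i] = root[par[i]]
                new_par[i] = par[par[i]]
        root, par = new_root, new_par
        rounds += 1
    return root, rounds

# Two trees: 0 <- {1, 2}, 1 <- 3, 3 <- 4   and   5 <- 6
print(find_roots([None, 0, 0, 1, 3, None, 5]))
# prints ([0, 0, 0, 0, 0, 5, 5], 2)
```

Nodes 1 and 2 both read the fields of node 0 in the first round, which an EREW machine would not allow.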
Find roots – CREW vs. EREW
How fast can n nodes in a forest determine their roots using only exclusive reads?
Ω(lg n)
Argument: with exclusive reads, a given piece of information can be copied to only one other memory location in each step, so the number of locations containing a given piece of information at most doubles at each step. Consider a forest with one tree of n nodes: the root's identity is stored in one place initially. After the first step, it is stored in at most two places; after the second step, in at most four places; and so on. So ⌈lg n⌉ steps are needed for it to be stored in n places.
So CREW: O(lg d) and EREW: Ω(lg n).
If d = 2^{o(lg n)}, CREW asymptotically outperforms any EREW algorithm.
If d = Θ(lg n), then CREW runs in O(lg lg n), and EREW is much slower.
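The doubling argument can be checked numerically. The sketch below counts how many steps are needed for one value to reach n cells when the number of copies can at most double per step; the function name `erew_broadcast_steps` is an assumption for the example.

```python
import math

def erew_broadcast_steps(n):
    """Steps for one value to reach n cells if copies at most double per step."""
    copies, steps = 1, 0
    while copies < n:
        copies = min(2 * copies, n)   # best case: every copy is read once
        steps += 1
    return steps

for n in (1, 2, 5, 8, 1000):
    # The count matches ceil(lg n) for each of these sizes.
    print(n, erew_broadcast_steps(n), math.ceil(math.log2(n)))
```

This illustrates the best case only; it does not replace the lower-bound argument.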
Find maximum – CRCW algorithm
Given n elements A[0..n-1], find the maximum.
Suppose there are n^2 processors; each processor (i,j) compares A[i] and A[j], for 0 ≤ i, j ≤ n-1.
FAST-MAX(A)
1. n ← length[A]
2. for i ← 0 to n-1, in parallel
3.     do m[i] ← true
4. for i ← 0 to n-1 and j ← 0 to n-1, in parallel
5.     do if A[i] < A[j]
6.         then m[i] ← false
7. for i ← 0 to n-1, in parallel
8.     do if m[i] = true
9.         then max ← A[i]
10. return max
The running time is O(1).
Note: there may be multiple maximum values, so their processors will write to max concurrently (all with the same value).
Its work = n^2 × O(1) = O(n^2).
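Example with A = <5, 6, 9, 2, 9>. The entry in row A[i], column A[j] is T if A[i] < A[j] and F otherwise; the last column is m[i].
           A[j]
A[i]    5  6  9  2  9    m
  5     F  T  T  F  T    F
  6     F  F  T  F  T    F
  9     F  F  F  F  F    T
  2     T  T  T  F  T    F
  9     F  F  F  F  F    T
max = 9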
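The comparison table can be reproduced with a sequential sketch of FAST-MAX. The double loop enumerates the n^2 comparisons that a CRCW PRAM would perform in a single step, so the sketch shows the logic but not the O(1) running time. The name `fast_max` and the choice to return the m array alongside the maximum are assumptions for the example; A is assumed to be non-empty.

```python
def fast_max(A):
    """Sequential imitation of FAST-MAX; returns (maximum, m)."""
    n = len(A)
    m = [True] * n
    # One "processor" per pair (i, j).
    for i in range(n):
        for j in range(n):
            if A[i] < A[j]:
                m[i] = False          # concurrent writers all store False
    # Every surviving index holds a maximum, so all writers to max
    # would store the same value.
    winners = [A[i] for i in range(n) if m[i]]
    return winners[0], m

print(fast_max([5, 6, 9, 2, 9]))
# prints (9, [False, False, True, False, True])
```

All concurrent writes in the algorithm store identical values, which is why the common-write model suffices.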
Find maximum – CRCW vs. EREW
Finding the maximum with an EREW algorithm takes Ω(lg n) time.
Argument: consider how many elements “think” that they might be the maximum.
Initially, n.
After the first step, at least n/2.
After the second step, at least n/4, and so on: each step can at most halve the number.
Moreover, a CREW algorithm also takes Ω(lg n).
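For comparison, a standard pairwise tournament (not taken from the slides) reaches the Ω(lg n) bound: each round halves the number of candidates. The sketch below is sequential; on an EREW PRAM each comparison in a round would be done by its own processor. The name `tournament_max` is an assumption, and A is assumed to be non-empty.

```python
def tournament_max(A):
    """Pairwise-comparison maximum; returns (maximum, rounds)."""
    candidates = list(A)
    rounds = 0
    while len(candidates) > 1:
        # Each comparison touches two cells that no other comparison uses,
        # so all reads and writes within a round are exclusive.
        survivors = [max(candidates[k], candidates[k + 1])
                     for k in range(0, len(candidates) - 1, 2)]
        if len(candidates) % 2:       # an unpaired element advances unchanged
            survivors.append(candidates[-1])
        candidates = survivors
        rounds += 1
    return candidates[0], rounds

print(tournament_max([5, 6, 9, 2, 9]))   # prints (9, 3)
```

With n = 5 the candidate counts are 5, 3, 2, 1, giving ⌈lg 5⌉ = 3 rounds.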
Simulating CRCW with EREW
Theorem:
A p-processor CRCW algorithm can be no more than O(lg p) times faster than the best p-processor EREW algorithm for the same problem.
Proof: each step of the CRCW algorithm can be simulated by O(lg p) steps of an EREW computation.
Consider a concurrent write:
  CRCW processor p_i writes data x_i to location l_i (l_i may be the same for several p_i).
  The corresponding EREW processor p_i writes the pair (l_i, x_i) to location A[i]; the A[i] are distinct, so the write is exclusive.
  Sort all pairs (l_i, x_i) by l_i, which brings equal locations together, in O(lg p) time.
  Each EREW processor p_i compares A[i] = (l_j, x_j) with A[i-1] = (l_k, x_k). If l_j ≠ l_k or i = 0, then EREW processor p_i writes x_j to l_j (an exclusive write).
See figure 30.7.
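The write-simulation steps of the proof can be sketched as follows. This is a sequential illustration under stated assumptions: the name `simulate_concurrent_write`, the `(location, value)` request format, and the dict used as memory are hypothetical, and Python's built-in sort stands in for the O(lg p)-time parallel sort that the proof relies on.

```python
def simulate_concurrent_write(requests, memory):
    """Perform one CRCW write step using only exclusive writes.

    requests[i] = (location, value) issued by processor i.
    memory is a dict from location to value and is updated in place.
    """
    # Step 1: processor i writes its request to its own cell A[i].
    A = list(requests)
    # Step 2: sort by location so equal locations become adjacent.
    A.sort(key=lambda pair: pair[0])
    # Step 3: processor i compares A[i] with A[i-1]; only the first request
    # for each location performs the write, so no two writers collide.
    for i, (loc, val) in enumerate(A):
        if i == 0 or A[i - 1][0] != loc:
            memory[loc] = val
    return memory

reqs = [(3, "a"), (1, "b"), (3, "c"), (1, "d"), (2, "e")]
print(simulate_concurrent_write(reqs, {}))
```

Because Python's sort is stable, the request from the lowest-numbered processor wins each location in this sketch, so it happens to behave like the priority-write model.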
CRCW vs. EREW
CRCW:
Some say: it is easier to program and faster.
Others say: the hardware for CRCW is slower than for EREW, and one cannot really find a maximum in O(1) time.
Still others say: the PRAM, whether EREW or CRCW, is the wrong model. Processors must be connected by a network and can communicate with each other only via the network, so the network should be part of the model.
 Thank You
  • 22. CRCW vs. EREW CRCW: Some says: easier to program and more faster. Others say: The hardware to CRCW is slower than EREW. And One can not find maximum in O(1). Still others say: either EREW or CRCW is wrong. Processors must be connected by a network, and only be able to communicate with other via the network, so network should be part of the model. 22