Chap3 Slides Week4
• It would appear that the parallel time can be made arbitrarily small
by making the decomposition finer in granularity.
• There is an inherent bound on how fine the granularity of a
computation can be. For example, when multiplying a dense n x n
matrix with a vector, there can be no more than n^2 concurrent
tasks, one per scalar multiplication.
• Concurrent tasks may also have to exchange data with other tasks.
This results in communication overhead. The tradeoff between the
granularity of a decomposition and associated overheads often
determines performance bounds.
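The granularity bound for the matrix-vector example can be made concrete with a small sketch. The function names (row_block_tasks, matvec_by_tasks) and the 4 x 4 sample data are illustrative assumptions, not part of the slides; the tasks are executed one after another here, only to show how many independent units each decomposition yields.

```python
def row_block_tasks(n, rows_per_task):
    """Split the n output rows into tasks of at most rows_per_task rows each."""
    return [range(start, min(start + rows_per_task, n))
            for start in range(0, n, rows_per_task)]


def matvec_by_tasks(A, b, tasks):
    """Compute y = A b one task at a time; no two tasks write the same y[i]."""
    y = [0] * len(A)
    for rows in tasks:            # each task is independent of the others
        for i in rows:
            y[i] = sum(A[i][j] * b[j] for j in range(len(b)))
    return y


# Hypothetical sample data.
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, 2, 1]
n = len(A)

coarse = row_block_tasks(n, rows_per_task=2)   # 2 tasks of 2 rows each
fine = row_block_tasks(n, rows_per_task=1)     # n tasks of 1 row each
# Finest grain: one task per scalar product A[i][j] * b[j], i.e. n^2 tasks.
finest = [(i, j) for i in range(n) for j in range(n)]

y_coarse = matvec_by_tasks(A, b, coarse)
y_fine = matvec_by_tasks(A, b, fine)
```

Both decompositions produce the same vector; only the number of tasks changes. In the finest decomposition the partial products of a row must still be summed, which is one source of the interaction overhead mentioned above.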
Task Interaction Graphs
Note: These criteria often conflict with each other. For example, a
decomposition into one task (or no decomposition at all) minimizes
interaction but does not result in a speedup at all! Can you think of
other such conflicting cases?
Processes and Mapping: Example
• recursive decomposition
• data decomposition
• exploratory decomposition
• speculative decomposition
Recursive Decomposition
In this example (quicksort), once the list has been partitioned around
the pivot, each sublist can be processed concurrently (i.e., each sublist
represents an independent subtask). This can be repeated recursively.
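A minimal sketch of this recursive decomposition, assuming the first element is used as the pivot and that one thread is spawned per sublist down to a fixed depth. The names partition, quicksort and max_depth are illustrative choices, not taken from the slides. In CPython the threads mainly illustrate the task structure; for pure-Python work like this they are not expected to give a speedup.

```python
import threading


def partition(items):
    """Split items around the first element (assumed pivot choice)."""
    pivot = items[0]
    smaller = [x for x in items[1:] if x < pivot]
    larger = [x for x in items[1:] if x >= pivot]
    return smaller, pivot, larger


def quicksort(items, depth=0, max_depth=2):
    """Sort items; the two sublists are independent subtasks."""
    if len(items) <= 1:
        return list(items)
    smaller, pivot, larger = partition(items)
    if depth >= max_depth:
        # Below max_depth, stop creating tasks and recurse serially.
        return (quicksort(smaller, depth + 1, max_depth) + [pivot]
                + quicksort(larger, depth + 1, max_depth))
    result = {}

    def sort_smaller():
        result["smaller"] = quicksort(smaller, depth + 1, max_depth)

    worker = threading.Thread(target=sort_smaller)
    worker.start()                       # subtask 1 runs in its own thread
    sorted_larger = quicksort(larger, depth + 1, max_depth)  # subtask 2
    worker.join()                        # wait before combining the results
    return result["smaller"] + [pivot] + sorted_larger


sorted_items = quicksort([5, 3, 8, 1, 9, 2, 7])
```

Capping the depth keeps the number of tasks bounded (at most 2^max_depth threads at a time), which reflects the granularity tradeoff: very small sublists are not worth a task of their own.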
Recursive Decomposition: Example
The problem of finding the minimum number in a given list (or indeed
any other associative operation such as sum, AND, etc.) can be
fashioned as a divide-and-conquer algorithm. The following algorithm
illustrates this.
We first start with a simple serial loop for computing the minimum entry
in a given list:
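The serial loop and its divide-and-conquer reformulation can be sketched as follows. This is an illustrative version under the assumption of a non-empty list; the names serial_min and recursive_min are chosen here and may differ from the original listing.

```python
def serial_min(values):
    """Scan the list once, keeping the smallest entry seen so far."""
    smallest = values[0]
    for v in values[1:]:
        if v < smallest:
            smallest = v
    return smallest


def recursive_min(values, lo=0, hi=None):
    """Minimum of values[lo:hi] by divide and conquer (non-empty range assumed)."""
    if hi is None:
        hi = len(values)
    if hi - lo == 1:
        return values[lo]
    mid = (lo + hi) // 2
    # The two halves are independent and could be evaluated concurrently.
    left = recursive_min(values, lo, mid)
    right = recursive_min(values, mid, hi)
    return left if left < right else right


values = [4, 9, 1, 7, 8, 11, 2, 12]   # hypothetical input
```

The serial loop has a chain of dependencies through the running minimum, whereas the recursive form exposes a tree of independent subproblems whose results are combined pairwise.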
Output Data Decomposition: Example
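The block product C = A B can be split into eight tasks in more than one way.

First decomposition:
Task 1: C_{1,1} = A_{1,1} B_{1,1}
Task 2: C_{1,1} = C_{1,1} + A_{1,2} B_{2,1}
Task 3: C_{1,2} = A_{1,1} B_{1,2}
Task 4: C_{1,2} = C_{1,2} + A_{1,2} B_{2,2}
Task 5: C_{2,1} = A_{2,1} B_{1,1}
Task 6: C_{2,1} = C_{2,1} + A_{2,2} B_{2,1}
Task 7: C_{2,2} = A_{2,1} B_{1,2}
Task 8: C_{2,2} = C_{2,2} + A_{2,2} B_{2,2}

Second decomposition:
Task 1: C_{1,1} = A_{1,1} B_{1,1}
Task 2: C_{1,1} = C_{1,1} + A_{1,2} B_{2,1}
Task 3: C_{1,2} = A_{1,2} B_{2,2}
Task 4: C_{1,2} = C_{1,2} + A_{1,1} B_{1,2}
Task 5: C_{2,1} = A_{2,2} B_{2,1}
Task 6: C_{2,1} = C_{2,1} + A_{2,1} B_{1,1}
Task 7: C_{2,2} = A_{2,1} B_{1,2}
Task 8: C_{2,2} = C_{2,2} + A_{2,2} B_{2,2}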
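The following sketch checks that both task orderings compute the same blocks of C. It assumes that each even-numbered update listed above is preceded by an odd-numbered task that initialises the block with the other product. The helper names, the first_k mapping and the 1 x 1 sample blocks are illustrative assumptions.

```python
def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matadd(X, Y):
    """Elementwise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def run_decomposition(A, B, first_k):
    """first_k[(i, j)] is the inner index k used by the initialising task."""
    C = {}
    for (i, j), k0 in first_k.items():
        k1 = 3 - k0                                   # the other inner index
        C[i, j] = matmul(A[i, k0], B[k0, j])          # odd-numbered task
        C[i, j] = matadd(C[i, j], matmul(A[i, k1], B[k1, j]))  # even-numbered task
    return C


# Hypothetical 1 x 1 blocks, indexed from 1 as in the task lists.
A = {(1, 1): [[1]], (1, 2): [[2]], (2, 1): [[3]], (2, 2): [[4]]}
B = {(1, 1): [[5]], (1, 2): [[6]], (2, 1): [[7]], (2, 2): [[8]]}

first = {(1, 1): 1, (1, 2): 1, (2, 1): 1, (2, 2): 1}
second = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 1}

C_first = run_decomposition(A, B, first)
C_second = run_decomposition(A, B, second)
```

The two tasks that touch the same output block must run in order, but tasks for different blocks are independent. The decompositions differ only in which input blocks each task reads first, which can matter for how data is distributed.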
Output Data Decomposition: Example
Often input and output data decomposition can be combined for a higher
degree of concurrency. For the itemset counting example, the transaction
set (input) and itemset counts (output) can both be decomposed as follows:
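A small sketch of the combined decomposition, assuming an itemset is counted once for every transaction that contains all of its items. The sample transactions, itemsets and the two-way split of each are hypothetical; they only illustrate the four resulting tasks.

```python
def count_itemsets(transactions, itemsets):
    """Count, for each itemset, the transactions that contain it."""
    return {s: sum(1 for t in transactions if s <= t) for s in itemsets}


# Hypothetical data: each transaction and itemset is a set of item labels.
transactions = [frozenset("ABC"), frozenset("AC"),
                frozenset("BD"), frozenset("ABD")]
itemsets = [frozenset("A"), frozenset("AB"),
            frozenset("C"), frozenset("BD")]

# Decompose the input (transactions) and the output (itemset counts) in two.
t_parts = [transactions[:2], transactions[2:]]
s_parts = [itemsets[:2], itemsets[2:]]

# Four independent tasks, one per (transaction part, itemset part) pair.
partial = {(p, q): count_itemsets(t_parts[p], s_parts[q])
           for p in range(2) for q in range(2)}

# Combine: counts for the same itemset from different transaction parts add up.
combined = {}
for counts in partial.values():
    for s, c in counts.items():
        combined[s] = combined.get(s, 0) + c
```

Splitting both dimensions yields four tasks instead of two, at the cost of a final step that sums the partial counts for each itemset.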
Intermediate Data Partitioning
Stage I
Task 01: D_{1,1,1} = A_{1,1} B_{1,1}    Task 02: D_{2,1,1} = A_{1,2} B_{2,1}
Task 03: D_{1,1,2} = A_{1,1} B_{1,2}    Task 04: D_{2,1,2} = A_{1,2} B_{2,2}
Task 05: D_{1,2,1} = A_{2,1} B_{1,1}    Task 06: D_{2,2,1} = A_{2,2} B_{2,1}
Task 07: D_{1,2,2} = A_{2,1} B_{1,2}    Task 08: D_{2,2,2} = A_{2,2} B_{2,2}
Stage II
Task 09: C_{1,1} = D_{1,1,1} + D_{2,1,1}    Task 10: C_{1,2} = D_{1,1,2} + D_{2,1,2}
Task 11: C_{2,1} = D_{1,2,1} + D_{2,2,1}    Task 12: C_{2,2} = D_{1,2,2} + D_{2,2,2}
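The two stages can be sketched as follows, with the intermediate blocks indexed as D[k, i, j] = A[i, k] B[k, j] to match the task list. The helper names and the 1 x 1 sample blocks are illustrative assumptions; the tasks are run serially here only to show the dependency structure.

```python
def matmul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matadd(X, Y):
    """Elementwise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


# Hypothetical 1 x 1 blocks, indexed from 1 as in the task list.
A = {(1, 1): [[1]], (1, 2): [[2]], (2, 1): [[3]], (2, 2): [[4]]}
B = {(1, 1): [[5]], (1, 2): [[6]], (2, 1): [[7]], (2, 2): [[8]]}
idx = (1, 2)

# Stage I: eight mutually independent multiplication tasks.
D = {(k, i, j): matmul(A[i, k], B[k, j])
     for k in idx for i in idx for j in idx}

# Stage II: four independent addition tasks, each needing two Stage I results.
C = {(i, j): matadd(D[1, i, j], D[2, i, j]) for i in idx for j in idx}
```

Compared with decomposing only the output, partitioning the intermediate data D exposes eight concurrent multiplications instead of four, at the price of storing D and adding a second stage.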
Intermediate Data Partitioning: Example