
Matrix Case Study

notes

Uploaded by

nithyasmail6
Copyright
© All Rights Reserved

Matrix Multiplication in MapReduce
Matrix Multiplication

• Matrices are very practical: sciences, engineering, statistics, etc.
• Multiplication is a fundamental nontrivial matrix operation.
• Simpler than something like matrix inversion (although the time complexity is the same).
Matrix Multiplication

• Problem: Some applications need enormous matrices that cannot be handled on one machine.
• Take advantage of MapReduce parallelism to approach this problem.
• Rough sizes:
– 10,000 x 10,000: 100,000,000 entries
– 100,000 x 100,000: 10,000,000,000 entries
Two-Pass Matrix Multiplication

• The two matrices can be seen as two relational tables with columns (i, j, v) and (j, k, v). Matrix multiplication closely resembles a natural join over the j column, followed by a sum aggregation.
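The join-then-aggregate view can be sketched in plain Python. This is only an illustration, not part of the original slides: the helper names `to_table` and `join_and_sum` are made up here, and the matrices A (5 x 3) and B (3 x 4) are the ones that appear in the example tables of this deck.

```python
from collections import defaultdict

# Example matrices read off the tables in this deck (A is 5 x 3, B is 3 x 4).
A = [[1, 2, 0], [3, 0, 5], [0, 7, 0], [4, 0, 0], [1, 8, 2]]
B = [[1, 4, 5, 0], [0, 4, 0, 1], [6, 3, 9, 1]]


def to_table(matrix):
    """Flatten a dense matrix into relational rows (row, column, value)."""
    return [(r, c, v) for r, row in enumerate(matrix) for c, v in enumerate(row)]


def join_and_sum(table_a, table_b):
    """Natural join on j, then SUM(v_a * v_b) grouped by (i, k)."""
    rows_of_b_by_j = defaultdict(list)
    for j, k, v_b in table_b:
        rows_of_b_by_j[j].append((k, v_b))

    result = defaultdict(int)
    for i, j, v_a in table_a:
        for k, v_b in rows_of_b_by_j[j]:  # the join on column j
            result[(i, k)] += v_a * v_b   # the sum aggregation
    return dict(result)


product = join_and_sum(to_table(A), to_table(B))
print(product[(1, 2)])  # 60
```

The join corresponds to Pass 1 of the two-pass algorithm and the grouped sum to Pass 2.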
Two – Pass visualization
Map Function Pass 1 output
• For each element vij of matrix A, create a key-value pair of the form j: (A, i, vij). Similarly, for each element vjk of matrix B, create a key-value pair of the form j: (B, k, vjk).
Key Value Key Value Key Value

0 A, 0, 1 1 A, 0, 2 2 A, 0, 0
0 A, 1, 3 1 A, 1, 0 2 A, 1, 5
0 A, 2, 0 1 A, 2, 7 2 A, 2, 0
0 A, 3, 4 1 A, 3, 0 2 A, 3, 0
0 A, 4, 1 1 A, 4, 8 2 A, 4, 2

Key Value Key Value Key Value


0 B, 0, 1 1 B, 0, 0 2 B, 0, 6
0 B, 1, 4 1 B, 1, 4 2 B, 1, 3
0 B, 2, 5 1 B, 2, 0 2 B, 2, 9
0 B, 3, 0 1 B, 3, 1 2 B, 3, 1
Reduce Function Pass 1
• For a key j, take each value that comes from A, of the form (A, i, vij), and each value that comes from B, of the form (B, k, vjk), and produce a key-value pair of the form (i, k): vij * vjk.

Input for key j = 0
Key Value Key Value
0 A, 0, 1 0 B, 0, 1
0 A, 1, 3 0 B, 1, 4
0 A, 2, 0 0 B, 2, 5
0 A, 3, 4 0 B, 3, 0
0 A, 4, 1

Output for key j = 0
Key Value Key Value Key Value
0, 0 1 2, 0 0 4, 0 1
0, 1 4 2, 1 0 4, 1 4
0, 2 5 2, 2 0 4, 2 5
0, 3 0 2, 3 0 4, 3 0
1, 0 3 3, 0 4
1, 1 12 3, 1 16
1, 2 15 3, 2 20
1, 3 0 3, 3 0
All Reduce Pass 1 Results
Key Value Key Value Key Value Key Value Key Value Key Value
0, 0 1 2, 0 0 0, 0 0 2, 0 0 0, 0 0 2, 0 0
0, 1 4 2, 1 0 0, 1 8 2, 1 28 0, 1 0 2, 1 0
0, 2 5 2, 2 0 0, 2 0 2, 2 0 0, 2 0 2, 2 0
0, 3 0 2, 3 0 0, 3 2 2, 3 7 0, 3 0 2, 3 0
1, 0 3 3, 0 4 1, 0 0 3, 0 0 1, 0 30 3, 0 0
1, 1 12 3, 1 16 1, 1 0 3, 1 0 1, 1 15 3, 1 0
1, 2 15 3, 2 20 1, 2 0 3, 2 0 1, 2 45 3, 2 0
1, 3 0 3, 3 0 1, 3 0 3, 3 0 1, 3 5 3, 3 0

Key Value Key Value Key Value


4, 0 1 4, 0 0 4, 0 12
4, 1 4 4, 1 32 4, 1 6
4, 2 5 4, 2 0 4, 2 18
4, 3 0 4, 3 8 4, 3 2
Map Pass 2 Results
• The Pass 2 map function passes every (i, k): value pair through unchanged, so its output is identical to the table under "All Reduce Pass 1 Results".
Reduce Pass 2 Process – Aggregate the
values with same key
Key Value Key Value Key Value
0, 0 1 2, 0 0 4, 0 13
0, 1 12 2, 1 28 4, 1 42
0, 2 5 2, 2 0 4, 2 23
0, 3 2 2, 3 7 4, 3 10
1, 0 33 3, 0 4
1, 1 27 3, 1 16
1, 2 60 3, 2 20
1, 3 5 3, 3 0
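The two passes above can be simulated in memory with a few lines of Python. This is a minimal sketch under stated assumptions, not a Hadoop job: `shuffle`, `map_pass1`, `reduce_pass1` and `two_pass_multiply` are hypothetical names, and the grouping by key that a MapReduce framework performs is imitated with a dictionary.

```python
from collections import defaultdict

# Example matrices from the tables above (A is 5 x 3, B is 3 x 4).
A = [[1, 2, 0], [3, 0, 5], [0, 7, 0], [4, 0, 0], [1, 8, 2]]
B = [[1, 4, 5, 0], [0, 4, 0, 1], [6, 3, 9, 1]]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def map_pass1(a, b):
    """Emit j: (A, i, vij) for every element of A and j: (B, k, vjk) for B."""
    for i, row in enumerate(a):
        for j, v in enumerate(row):
            yield j, ("A", i, v)
    for j, row in enumerate(b):
        for k, v in enumerate(row):
            yield j, ("B", k, v)


def reduce_pass1(values):
    """For one key j, emit (i, k): vij * vjk for every A/B combination."""
    from_a = [(i, v) for tag, i, v in values if tag == "A"]
    from_b = [(k, v) for tag, k, v in values if tag == "B"]
    for i, v_a in from_a:
        for k, v_b in from_b:
            yield (i, k), v_a * v_b


def two_pass_multiply(a, b):
    pass1_output = []
    for values in shuffle(map_pass1(a, b)).values():
        pass1_output.extend(reduce_pass1(values))
    # Pass 2: the map is the identity, the reduce sums the values per key.
    result = {key: sum(vals) for key, vals in shuffle(pass1_output).items()}
    return result, len(pass1_output)


C, pass2_pairs = two_pass_multiply(A, B)
print(C[(1, 0)], C[(4, 1)], pass2_pairs)  # 33 42 60
```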
One pass Matrix-Multiplication
Sample Representation of One-Pass
Map reduce
Map Function output
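• For each element mij of A, produce the key-value pairs ((i, k), (A, j, mij)) for k = 1, 2, . . . up to the number of columns of B. Similarly, for each element njk of B, produce the key-value pairs ((i, k), (B, j, njk)) for i = 1, 2, . . . up to the number of rows of A. The example tables index rows and columns from 0; the pairs produced from A are shown first.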
Key Value Key Value Key Value Key Value Key Value
0, 0 A, 0, 1 1, 0 A, 0, 3 2, 0 A, 0, 0 3, 0 A, 0, 4 4, 0 A, 0, 1
0, 0 A, 1, 2 1, 0 A, 1, 0 2, 0 A, 1, 7 3, 0 A, 1, 0 4, 0 A, 1, 8
0, 0 A, 2, 0 1, 0 A, 2, 5 2, 0 A, 2, 0 3, 0 A, 2, 0 4, 0 A, 2, 2
0, 1 A, 0, 1 1, 1 A, 0, 3 2, 1 A, 0, 0 3, 1 A, 0, 4 4, 1 A, 0, 1
0, 1 A, 1, 2 1, 1 A, 1, 0 2, 1 A, 1, 7 3, 1 A, 1, 0 4, 1 A, 1, 8
0, 1 A, 2, 0 1, 1 A, 2, 5 2, 1 A, 2, 0 3, 1 A, 2, 0 4, 1 A, 2, 2
0, 2 A, 0, 1 1, 2 A, 0, 3 2, 2 A, 0, 0 3, 2 A, 0, 4 4, 2 A, 0, 1
0, 2 A, 1, 2 1, 2 A, 1, 0 2, 2 A, 1, 7 3, 2 A, 1, 0 4, 2 A, 1, 8
0, 2 A, 2, 0 1, 2 A, 2, 5 2, 2 A, 2, 0 3, 2 A, 2, 0 4, 2 A, 2, 2
0, 3 A, 0, 1 1, 3 A, 0, 3 2, 3 A, 0, 0 3, 3 A, 0, 4 4, 3 A, 0, 1
0, 3 A, 1, 2 1, 3 A, 1, 0 2, 3 A, 1, 7 3, 3 A, 1, 0 4, 3 A, 1, 8
0, 3 A, 2, 0 1, 3 A, 2, 5 2, 3 A, 2, 0 3, 3 A, 2, 0 4, 3 A, 2, 2
Map Function output
• Pairs produced from the second matrix: for each element njk of B, the key-value pairs ((i, k), (B, j, njk)) for every row i of A.
Key Value Key Value Key Value Key Value Key Value
0, 0 B, 0, 1 1, 0 B, 0, 1 2, 0 B, 0, 1 3, 0 B, 0, 1 4, 0 B, 0, 1
0, 0 B, 1, 0 1, 0 B, 1, 0 2, 0 B, 1, 0 3, 0 B, 1, 0 4, 0 B, 1, 0
0, 0 B, 2, 6 1, 0 B, 2, 6 2, 0 B, 2, 6 3, 0 B, 2, 6 4, 0 B, 2, 6
0, 1 B, 0, 4 1, 1 B, 0, 4 2, 1 B, 0, 4 3, 1 B, 0, 4 4, 1 B, 0, 4
0, 1 B, 1, 4 1, 1 B, 1, 4 2, 1 B, 1, 4 3, 1 B, 1, 4 4, 1 B, 1, 4
0, 1 B, 2, 3 1, 1 B, 2, 3 2, 1 B, 2, 3 3, 1 B, 2, 3 4, 1 B, 2, 3
0, 2 B, 0, 5 1, 2 B, 0, 5 2, 2 B, 0, 5 3, 2 B, 0, 5 4, 2 B, 0, 5
0, 2 B, 1, 0 1, 2 B, 1, 0 2, 2 B, 1, 0 3, 2 B, 1, 0 4, 2 B, 1, 0
0, 2 B, 2, 9 1, 2 B, 2, 9 2, 2 B, 2, 9 3, 2 B, 2, 9 4, 2 B, 2, 9
0, 3 B, 0, 0 1, 3 B, 0, 0 2, 3 B, 0, 0 3, 3 B, 0, 0 4, 3 B, 0, 0
0, 3 B, 1, 1 1, 3 B, 1, 1 2, 3 B, 1, 1 3, 3 B, 1, 1 4, 3 B, 1, 1
0, 3 B, 2, 1 1, 3 B, 2, 1 2, 3 B, 2, 1 3, 3 B, 2, 1 4, 3 B, 2, 1
Reduce Function Output
• For each key (i, k), the values (A, j, mij) and (B, j, njk) that share the same j are multiplied. Then, these products are summed and the result is paired with (i, k) in the output of the Reduce function.

Reducer inputs for keys (0, 0) to (0, 3)
Key Value Key Value
0, 0 A, 0, 1 0, 0 B, 0, 1
0, 0 A, 1, 2 0, 0 B, 1, 0
0, 0 A, 2, 0 0, 0 B, 2, 6
0, 1 A, 0, 1 0, 1 B, 0, 4
0, 1 A, 1, 2 0, 1 B, 1, 4
0, 1 A, 2, 0 0, 1 B, 2, 3
0, 2 A, 0, 1 0, 2 B, 0, 5
0, 2 A, 1, 2 0, 2 B, 1, 0
0, 2 A, 2, 0 0, 2 B, 2, 9
0, 3 A, 0, 1 0, 3 B, 0, 0
0, 3 A, 1, 2 0, 3 B, 1, 1
0, 3 A, 2, 0 0, 3 B, 2, 1

Reducer outputs for all keys
Key Value Key Value Key Value
0, 0 1 2, 0 0 4, 0 13
0, 1 12 2, 1 28 4, 1 42
0, 2 5 2, 2 0 4, 2 23
0, 3 2 2, 3 7 4, 3 10
1, 0 33 3, 0 4
1, 1 27 3, 1 16
1, 2 60 3, 2 20
1, 3 5 3, 3 0
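The one-pass map and reduce functions can be sketched the same way. Again this is an in-memory illustration with hypothetical function names (`map_one_pass`, `reduce_one_pass`, `one_pass_multiply`), using 0-based indices like the tables above.

```python
from collections import defaultdict

# Example matrices from the tables above (A is 5 x 3, B is 3 x 4).
A = [[1, 2, 0], [3, 0, 5], [0, 7, 0], [4, 0, 0], [1, 8, 2]]
B = [[1, 4, 5, 0], [0, 4, 0, 1], [6, 3, 9, 1]]


def map_one_pass(a, b):
    """Send every element to each output cell (i, k) that needs it."""
    m, p = len(a), len(b[0])
    for i, row in enumerate(a):
        for j, v in enumerate(row):
            for k in range(p):          # A[i][j] is needed by all of row i
                yield (i, k), ("A", j, v)
    for j, row in enumerate(b):
        for k, v in enumerate(row):
            for i in range(m):          # B[j][k] is needed by all of column k
                yield (i, k), ("B", j, v)


def reduce_one_pass(values):
    """Match A and B values on j, multiply them, and sum the products."""
    from_a = {j: v for tag, j, v in values if tag == "A"}
    from_b = {j: v for tag, j, v in values if tag == "B"}
    return sum(from_a[j] * from_b[j] for j in from_a)


def one_pass_multiply(a, b):
    groups = defaultdict(list)
    for key, value in map_one_pass(a, b):
        groups[key].append(value)
    return {key: reduce_one_pass(vals) for key, vals in groups.items()}


C = one_pass_multiply(A, B)
pairs = sum(1 for _ in map_one_pass(A, B))
print(C[(0, 1)], C[(4, 0)], pairs)  # 12 13 120
```

Every reducer receives 2n = 6 values here, and the map phase emits 120 pairs for 27 inputs.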
Complexity Theory for MapReduce

• The communication cost of an algorithm is the sum of the communication costs of all the tasks implementing that algorithm.
• Two parameters characterize families of MapReduce algorithms: reducer size (q) and replication rate (r).
• Reducer size: upper bound on the number of values that are allowed to appear in the list associated with a single key. A smaller reducer size means more reducers, so this is a measure of the degree of parallelism.
• Replication rate: number of key-value pairs produced by all the map tasks divided by the number of inputs. It is the average communication from map tasks to reduce tasks per input, so it measures communication cost, that is, the cost of sending data from the map phase to the reduce phase.
Complexity of Two-pass algorithm
• Consider multiplying an m x n matrix by an n x p matrix. In our example, 5 x 3 and 3 x 4.
• Replication rate: Pass 1 = 1, Pass 2 = 1
• Reducer size: Pass 1 = 5 + 4 = 9 (m + p in general), Pass 2 = 3 (n in general)
• For n x n matrices, q is 2n in Pass 1 and n in Pass 2
• Number of reducers: Pass 1 = n = 3, Pass 2 = m x p = 20
• Communication cost: Pass 1 map output = 27 pairs (mn + np = 15 + 12), Pass 2 map output = 60 pairs (mnp)
• Total cost = 87
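The figures above follow directly from the matrix dimensions. A small sketch, assuming dense matrices and one reducer per distinct key (`two_pass_costs` is a hypothetical helper, not something from the slides):

```python
def two_pass_costs(m, n, p):
    """Cost figures for multiplying an m x n matrix by an n x p matrix."""
    inputs = m * n + n * p          # every element is emitted once in Pass 1
    pass2_pairs = m * n * p         # one product per (i, j, k) combination
    return {
        "reducer_size_pass1": m + p,    # column j of A plus row j of B
        "reducer_size_pass2": n,        # n products per output cell
        "reducers_pass1": n,            # one per value of j
        "reducers_pass2": m * p,        # one per output cell
        "communication_pass1": inputs,
        "communication_pass2": pass2_pairs,
        "total_communication": inputs + pass2_pairs,
    }


costs = two_pass_costs(5, 3, 4)
print(costs["communication_pass1"], costs["communication_pass2"],
      costs["total_communication"])  # 27 60 87
```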
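Complexity of One-pass algorithm
• Consider multiplying an m x n matrix by an n x p matrix. In our example, 5 x 3 and 3 x 4.
• Replication rate: each element of A is sent to p = 4 reducers and each element of B to m = 5 reducers, so r ≈ (4 + 5)/2 = 9/2, or (m + p)/2 in general. Weighted by the 15 and 12 input elements, the exact value is 120/27 ≈ 4.44; the two agree when m = p.
• Reducer size, q = 6, or 2n in general

For square matrices of size n x n
• r = (n + n)/2 = n, q = 2n
• Number of reducers = n x n = n^2 reducers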
Communication cost
• Communication cost = number of inputs * replication rate, which equals the total number of key-value pairs generated by the map tasks
• For our example: 15 * 4 + 12 * 5 = 120
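As a cross-check of these numbers, the one-pass costs can be computed from the dimensions alone. The helper `one_pass_costs` is hypothetical, and the replication rate is computed exactly as pairs divided by inputs, following the definition given earlier.

```python
from fractions import Fraction


def one_pass_costs(m, n, p):
    """Cost figures for the one-pass algorithm on m x n times n x p."""
    inputs = m * n + n * p
    # Each element of A goes to p reducers, each element of B to m reducers.
    pairs = (m * n) * p + (n * p) * m
    return {
        "inputs": inputs,
        "pairs": pairs,                               # communication cost
        "replication_rate": Fraction(pairs, inputs),  # pairs per input
        "reducer_size": 2 * n,                        # a row of A and a column of B
        "reducers": m * p,                            # one per output cell
    }


costs = one_pass_costs(5, 3, 4)
print(costs["pairs"], costs["replication_rate"], costs["reducer_size"])  # 120 40/9 6
```

For square matrices, `one_pass_costs(n, n, n)` gives a replication rate of exactly n, matching the formula above.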
Matrix Case Study
Recall
• Reducer Size : Upper bound on the number of
values that are allowed to appear in the list
with single key.
• Replication Rate: Number of key value pairs
produced by all the map tasks / Number of
Inputs.
Complexity of One-pass algorithm with grouping
• Let's group rows and columns into bands → g groups
• Each pair consisting of a band of rows of the first matrix and a band of columns of the second matrix is used by one reducer to produce a square of elements of the output matrix
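Why do we go for grouping?

• If a reducer covers outputs $p_{ik}$ and $p_{jl}$, it can also cover $p_{il}$ and $p_{jk}$, because it already holds rows i and j of the first matrix and columns k and l of the second.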
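Matrix Multiplication
• Map
– For each element of A or B, generate g (key, value) pairs
– Key is the element's own group paired with every group of the other matrix
– Value is (i, j, mij) or (i, j, nij)
• Reduce
– Each reducer corresponds to one key, a pair (row band of A, column band of B)
– All the elements in that band of A and that band of B are combined to compute the corresponding block of the output.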
Example
One Pass Algorithm
Map Output – first matrix (row bands A, B, C; value is (i, j, mij))
Key Value Key Value Key Value Key Value Key Value Key Value
A,D 1, 1, 1 A,E 1, 1, 1 B,D 3, 1, 3 B,E 3, 1, 3 C,D 5, 1, 4 C,E 5, 1, 4
A,D 1, 2, 2 A,E 1, 2, 2 B,D 3, 2, 2 B,E 3, 2, 2 C,D 5, 2, 1 C,E 5, 2, 1
A,D 1, 3, 3 A,E 1, 3, 3 B,D 3, 3, 1 B,E 3, 3, 1 C,D 5, 3, 2 C,E 5, 3, 2
A,D 1, 4, 4 A,E 1, 4, 4 B,D 3, 4, 8 B,E 3, 4, 8 C,D 5, 4, 7 C,E 5, 4, 7
A,D 1, 5, 5 A,E 1, 5, 5 B,D 3, 5, 1 B,E 3, 5, 1 C,D 5, 5, 5 C,E 5, 5, 5
A,D 1, 6, 6 A,E 1, 6, 6 B,D 3, 6, 6 B,E 3, 6, 6 C,D 5, 6, 6 C,E 5, 6, 6
A,D 2, 1, 7 A,E 2, 1, 7 B,D 4, 1, 7 B,E 4, 1, 7 C,D 6, 1, 7 C,E 6, 1, 7
A,D 2, 2, 8 A,E 2, 2, 8 B,D 4, 2, 8 B,E 4, 2, 8 C,D 6, 2, 8 C,E 6, 2, 8
A,D 2, 3, 9 A,E 2, 3, 9 B,D 4, 3, 9 B,E 4, 3, 9 C,D 6, 3, 9 C,E 6, 3, 9
A,D 2, 4, 10 A,E 2, 4, 10 B,D 4, 4, 10 B,E 4, 4, 10 C,D 6, 4, 10 C,E 6, 4, 10
A,D 2, 5, 11 A,E 2, 5, 11 B,D 4, 5, 11 B,E 4, 5, 11 C,D 6, 5, 11 C,E 6, 5, 11
A,D 2, 6, 12 A,E 2, 6, 12 B,D 4, 6, 12 B,E 4, 6, 12 C,D 6, 6, 12 C,E 6, 6, 12
Map Output – second matrix (column bands D, E; value is (i, j, nij))
Key Value Key Value Key Value Key Value Key Value Key Value
A,D 1, 1, 1 B,D 1, 1, 1 C,D 1, 1, 1 A,E 1, 3, 5 B,E 1, 3, 5 C,E 1, 3, 5
A,D 2, 1, 2 B,D 2, 1, 2 C,D 2, 1, 2 A,E 2, 3, 6 B,E 2, 3, 6 C,E 2, 3, 6
A,D 3, 1, 3 B,D 3, 1, 3 C,D 3, 1, 3 A,E 3, 3, 11 B,E 3, 3, 11 C,E 3, 3, 11
A,D 4, 1, 4 B,D 4, 1, 4 C,D 4, 1, 4 A,E 4, 3, 6 B,E 4, 3, 6 C,E 4, 3, 6
A,D 5, 1, 5 B,D 5, 1, 5 C,D 5, 1, 5 A,E 5, 3, 3 B,E 5, 3, 3 C,E 5, 3, 3
A,D 6, 1, 6 B,D 6, 1, 6 C,D 6, 1, 6 A,E 6, 3, 0 B,E 6, 3, 0 C,E 6, 3, 0
A,D 1, 2, 3 B,D 1, 2, 3 C,D 1, 2, 3 A,E 1, 4, 7 B,E 1, 4, 7 C,E 1, 4, 7
A,D 2, 2, 4 B,D 2, 2, 4 C,D 2, 2, 4 A,E 2, 4, 8 B,E 2, 4, 8 C,E 2, 4, 8
A,D 3, 2, 7 B,D 3, 2, 7 C,D 3, 2, 7 A,E 3, 4, 15 B,E 3, 4, 15 C,E 3, 4, 15
A,D 4, 2, 5 B,D 4, 2, 5 C,D 4, 2, 5 A,E 4, 4, 7 B,E 4, 4, 7 C,E 4, 4, 7
A,D 5, 2, 2 B,D 5, 2, 2 C,D 5, 2, 2 A,E 5, 4, 4 B,E 5, 4, 4 C,E 5, 4, 4
A,D 6, 2, 0 B,D 6, 2, 0 C,D 6, 2, 0 A,E 6, 4, 2 B,E 6, 4, 2 C,E 6, 4, 2
Reduce Output – One reducer Calculation

Inputs for key A,D (first matrix on the left, second matrix on the right)
Key Value Key Value
A,D 1, 1, 1 A,D 1, 1, 1
A,D 1, 2, 2 A,D 2, 1, 2
A,D 1, 3, 3 A,D 3, 1, 3
A,D 1, 4, 4 A,D 4, 1, 4
A,D 1, 5, 5 A,D 5, 1, 5
A,D 1, 6, 6 A,D 6, 1, 6
A,D 2, 1, 7 A,D 1, 2, 3
A,D 2, 2, 8 A,D 2, 2, 4
A,D 2, 3, 9 A,D 3, 2, 7
A,D 2, 4, 10 A,D 4, 2, 5
A,D 2, 5, 11 A,D 5, 2, 2
A,D 2, 6, 12 A,D 6, 2, 0

Outputs for key A,D
Key Value
1, 1 91
1, 2 62
2, 1 217
2, 2 188

Output (1, 1): 1·1 + 2·2 + 3·3 + 4·4 + 5·5 + 6·6 = 1 + 4 + 9 + 16 + 25 + 36 = 91
All Reducers Output
Key Value Key Value Key Value
1, 1 91 3, 1 83 5, 1 101
1, 2 62 3, 2 66 5, 2 75
1, 3 89 3, 3 89 5, 3 105
1, 4 128 3, 4 124 5, 4 147
2, 1 217 4, 1 217 6, 1 217
2, 2 188 4, 2 188 6, 2 188
2, 3 275 4, 3 275 6, 3 275
2, 4 386 4, 4 386 6, 4 386
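The grouped example can be reproduced with a short in-memory simulation. This is a sketch under assumptions: the matrices are called X (6 x 6) and Y (6 x 4) here to avoid clashing with the band labels A to E, bands are numbered from 0 instead of lettered, indices are 0-based, and all function names are hypothetical.

```python
from collections import defaultdict

# Matrices read off the map-output tables above.
X = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [3, 2, 1, 8, 1, 6],
     [7, 8, 9, 10, 11, 12], [4, 1, 2, 7, 5, 6], [7, 8, 9, 10, 11, 12]]
Y = [[1, 3, 5, 7], [2, 4, 6, 8], [3, 7, 11, 15],
     [4, 5, 6, 7], [5, 2, 3, 4], [6, 0, 0, 2]]


def grouped_map(x, y, g1, g2):
    """Key = (row band of X, column band of Y); value keeps the element's indices."""
    rows_per_band = len(x) // g1
    cols_per_band = len(y[0]) // g2
    for i, row in enumerate(x):
        for j, v in enumerate(row):
            for col_band in range(g2):      # pair with every column band
                yield (i // rows_per_band, col_band), ("X", i, j, v)
    for j, row in enumerate(y):
        for k, v in enumerate(row):
            for row_band in range(g1):      # pair with every row band
                yield (row_band, k // cols_per_band), ("Y", j, k, v)


def grouped_reduce(values):
    """Compute every output cell covered by one (row band, column band) pair."""
    x_rows = defaultdict(dict)  # i -> {j: value}
    y_cols = defaultdict(dict)  # k -> {j: value}
    for tag, r, c, v in values:
        if tag == "X":
            x_rows[r][c] = v
        else:
            y_cols[c][r] = v
    for i, row in x_rows.items():
        for k, col in y_cols.items():
            yield (i, k), sum(row[j] * col[j] for j in row)


def grouped_multiply(x, y, g1, g2):
    groups = defaultdict(list)
    for key, value in grouped_map(x, y, g1, g2):
        groups[key].append(value)
    output = {}
    for values in groups.values():
        output.update(grouped_reduce(values))
    sizes = {key: len(vals) for key, vals in groups.items()}
    return output, sizes


C, reducer_sizes = grouped_multiply(X, Y, g1=3, g2=2)
print(C[(0, 0)], C[(4, 3)], reducer_sizes[(0, 0)])  # 91 147 24
```

With 0-based indices, `C[(0, 0)]` is output (1, 1) in the table and `C[(4, 3)]` is output (5, 4).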
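Replication Rate and Reducer Size
– With $g_1$ row bands of A (m x n) and $g_2$ column bands of B (n x p), each reducer gets $mn/g_1$ and $np/g_2$ elements from the A and B matrices respectively
– Replication rate $= (g_1 + g_2)/2$ (= 5/2 for our case, with $g_1 = 3$ and $g_2 = 2$)
– Reducer size $= mn/g_1 + np/g_2$ (= 12 + 12 = 24 for our case)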
One Pass Algorithm – Square Matrices
A’s Map Output – (Value is (i,j,mij) )
K V K V K V K V K V K V
A,D 1, 1, 1 A,E 1, 1, 1 A,F 1, 1, 1 B,D 3, 1, 3 B,E 3, 1, 3 B,F 3, 1, 3
A,D 1, 2, 2 A,E 1, 2, 2 A,F 1, 2, 2 B,D 3, 2, 2 B,E 3, 2, 2 B,F 3, 2, 2
A,D 1, 3, 3 A,E 1, 3, 3 A,F 1, 3, 3 B,D 3, 3, 1 B,E 3, 3, 1 B,F 3, 3, 1
A,D 1, 4, 4 A,E 1, 4, 4 A,F 1, 4, 4 B,D 3, 4, 8 B,E 3, 4, 8 B,F 3, 4, 8
A,D 1, 5, 5 A,E 1, 5, 5 A,F 1, 5, 5 B,D 3, 5, 1 B,E 3, 5, 1 B,F 3, 5, 1
A,D 1, 6, 6 A,E 1, 6, 6 A,F 1, 6, 6 B,D 3, 6, 6 B,E 3, 6, 6 B,F 3, 6, 6
A,D 2, 1, 7 A,E 2, 1, 7 A,F 2, 1, 7 B,D 4, 1, 7 B,E 4, 1, 7 B,F 4, 1, 7
A,D 2, 2, 8 A,E 2, 2, 8 A,F 2, 2, 8 B,D 4, 2, 8 B,E 4, 2, 8 B,F 4, 2, 8
A,D 2, 3, 9 A,E 2, 3, 9 A,F 2, 3, 9 B,D 4, 3, 9 B,E 4, 3, 9 B,F 4, 3, 9
A,D 2, 4, 10 A,E 2, 4, 10 A,F 2, 4, 10 B,D 4, 4, 10 B,E 4, 4, 10 B,F 4, 4, 10
A,D 2, 5, 11 A,E 2, 5, 11 A,F 2, 5, 11 B,D 4, 5, 11 B,E 4, 5, 11 B,F 4, 5, 11
A,D 2, 6, 12 A,E 2, 6, 12 A,F 2, 6, 12 B,D 4, 6, 12 B,E 4, 6, 12 B,F 4, 6, 12
A’s Map Output – (Value is (i,j,mij) )
K V K V K V
C,D 5, 1, 4 C,E 5, 1, 4 C,F 5, 1, 4
C,D 5, 2, 1 C,E 5, 2, 1 C,F 5, 2, 1
C,D 5, 3, 2 C,E 5, 3, 2 C,F 5, 3, 2
C,D 5, 4, 7 C,E 5, 4, 7 C,F 5, 4, 7
C,D 5, 5, 5 C,E 5, 5, 5 C,F 5, 5, 5
C,D 5, 6, 6 C,E 5, 6, 6 C,F 5, 6, 6
C,D 6, 1, 7 C,E 6, 1, 7 C,F 6, 1, 7
C,D 6, 2, 8 C,E 6, 2, 8 C,F 6, 2, 8
C,D 6, 3, 9 C,E 6, 3, 9 C,F 6, 3, 9
C,D 6, 4, 10 C,E 6, 4, 10 C,F 6, 4, 10
C,D 6, 5, 11 C,E 6, 5, 11 C,F 6, 5, 11
C,D 6, 6, 12 C,E 6, 6, 12 C,F 6, 6, 12
B’s Map Output – (Value is (i,j,nij) )
K V K V K V K V K V K V
A,D 1, 1, 1 B,D 1, 1, 1 C,D 1, 1, 1 A,E 1, 3, 3 B,E 1, 3, 3 C,E 1, 3, 3
A,D 2, 1, 7 B,D 2, 1, 7 C,D 2, 1, 7 A,E 2, 3, 9 B,E 2, 3, 9 C,E 2, 3, 9
A,D 3, 1, 3 B,D 3, 1, 3 C,D 3, 1, 3 A,E 3, 3, 1 B,E 3, 3, 1 C,E 3, 3, 1
A,D 4, 1, 7 B,D 4, 1, 7 C,D 4, 1, 7 A,E 4, 3, 9 B,E 4, 3, 9 C,E 4, 3, 9
A,D 5, 1, 4 B,D 5, 1, 4 C,D 5, 1, 4 A,E 5, 3, 2 B,E 5, 3, 2 C,E 5, 3, 2
A,D 6, 1, 7 B,D 6, 1, 7 C,D 6, 1, 7 A,E 6, 3, 9 B,E 6, 3, 9 C,E 6, 3, 9
A,D 1, 2, 2 B,D 1, 2, 2 C,D 1, 2, 2 A,E 1, 4, 4 B,E 1, 4, 4 C,E 1, 4, 4
A,D 2, 2, 8 B,D 2, 2, 8 C,D 2, 2, 8 A,E 2, 4, 10 B,E 2, 4, 10 C,E 2, 4, 10
A,D 3, 2, 2 B,D 3, 2, 2 C,D 3, 2, 2 A,E 3, 4, 8 B,E 3, 4, 8 C,E 3, 4, 8
A,D 4, 2, 8 B,D 4, 2, 8 C,D 4, 2, 8 A,E 4, 4, 10 B,E 4, 4, 10 C,E 4, 4, 10
A,D 5, 2, 1 B,D 5, 2, 1 C,D 5, 2, 1 A,E 5, 4, 7 B,E 5, 4, 7 C,E 5, 4, 7
A,D 6, 2, 8 B,D 6, 2, 8 C,D 6, 2, 8 A,E 6, 4, 10 B,E 6, 4, 10 C,E 6, 4, 10
B’s Map Output – (Value is (i,j,nij) )
K V K V K V
A,F 1, 5, 5 B,F 1, 5, 5 C,F 1, 5, 5
A,F 2, 5, 11 B,F 2, 5, 11 C,F 2, 5, 11
A,F 3, 5, 1 B,F 3, 5, 1 C,F 3, 5, 1
A,F 4, 5, 11 B,F 4, 5, 11 C,F 4, 5, 11
A,F 5, 5, 5 B,F 5, 5, 5 C,F 5, 5, 5
A,F 6, 5, 11 B,F 6, 5, 11 C,F 6, 5, 11
A,F 1, 6, 6 B,F 1, 6, 6 C,F 1, 6, 6
A,F 2, 6, 12 B,F 2, 6, 12 C,F 2, 6, 12
A,F 3, 6, 6 B,F 3, 6, 6 C,F 3, 6, 6
A,F 4, 6, 12 B,F 4, 6, 12 C,F 4, 6, 12
A,F 5, 6, 6 B,F 5, 6, 6 C,F 5, 6, 6
A,F 6, 6, 12 B,F 6, 6, 12 C,F 6, 6, 12
Reducer sample output calculation

Inputs for key B,E (rows 3-4 of A on the left, columns 3-4 of B on the right)
K V K V
B,E 3, 1, 3 B,E 1, 3, 3
B,E 3, 2, 2 B,E 2, 3, 9
B,E 3, 3, 1 B,E 3, 3, 1
B,E 3, 4, 8 B,E 4, 3, 9
B,E 3, 5, 1 B,E 5, 3, 2
B,E 3, 6, 6 B,E 6, 3, 9
B,E 4, 1, 7 B,E 1, 4, 4
B,E 4, 2, 8 B,E 2, 4, 10
B,E 4, 3, 9 B,E 3, 4, 8
B,E 4, 4, 10 B,E 4, 4, 10
B,E 4, 5, 11 B,E 5, 4, 7
B,E 4, 6, 12 B,E 6, 4, 10

Outputs for key B,E
Key Value
3, 3 156
3, 4 187
4, 3 322
4, 4 477

Output (3, 4): 3 · 4 + 2 · 10 + 1 · 8 + 8 · 10 + 1 · 7 + 6 · 10 = 12 + 20 + 8 + 80 + 7 + 60 = 187
Reducer all output
Key Value Key Value Key Value
1, 1 114 3, 1 122 5, 1 128
1, 2 109 3, 2 137 5, 2 129
1, 3 124 3, 3 156 5, 3 150
1, 4 183 3, 4 187 5, 4 207
1, 5 165 3, 5 197 5, 5 201
1, 6 198 3, 6 222 5, 6 234
2, 1 288 4, 1 288 6, 1 288
2, 2 283 4, 2 283 6, 2 283
2, 3 322 4, 3 322 6, 3 322
2, 4 477 4, 4 477 6, 4 477
2, 5 429 4, 5 429 6, 5 429
2, 6 522 4, 6 522 6, 6 522
Replication Rate and Reducer Size
– Each reducer gets $2n^2/g$ elements from the 2 matrices
– Replication rate $r = (g + g)/2 = g$
– Reducer size $q = 2n^2/g = 2n^2/r$

Tradeoff!
– $r = 2n^2/q$: the bigger the reducers, the less communication.
– Replication rate varies inversely with reducer size.
Calculation of Lower bound on replication rate
Figure: an n x n matrix divided into bands of width k = n/g

Coverage of one reducer
• Each reducer gets 2nk inputs
• Thus, reducer size q = 2nk
• k = q/2n
• No. of outputs covered by this is $k^2 = q^2/(4n^2)$
• Thus one reducer covers $q^2/(4n^2)$ outputs --------- (1)

Replication rate as a function of Number of reducers
• Total outputs = $n^2$ ----------------- (2)
• Eq. (2) divided by (1) → $4n^4/q^2$ reducers
• Reducer size => 1 reducer → q inputs
• $4n^4/q^2$ reducers → $4n^4/q$ (no. of inputs to all the reducers), which is the communication cost (used for later discussion)
• Replication rate = communication cost / number of inputs $= (4n^4/q)/(2n^2) = 2n^2/q$
• Replication rate $r = 2n^2/q$
• Extreme cases
• Case 1: $q = 2n^2$ → r = 1
• Case 2: $q = 2n$ → r = n
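The relation r = 2n^2/q can be checked numerically for the grouped one-pass algorithm on square matrices. A minimal sketch, assuming g divides n evenly; `grouped_square_costs` and `replication_lower_bound` are hypothetical helper names.

```python
def grouped_square_costs(n, g):
    """Reducer size and replication rate when n x n matrices use g bands each."""
    if n % g != 0:
        raise ValueError("this sketch assumes g divides n")
    k = n // g                       # band width
    q = 2 * n * k                    # k rows of one matrix + k columns of the other
    reducers = g * g                 # one per (row band, column band) pair
    communication = reducers * q     # inputs delivered to all reducers
    r = communication / (2 * n * n)  # divided by the number of inputs
    return q, r


def replication_lower_bound(n, q):
    """Bound derived above: r = 2n^2 / q."""
    return 2 * n * n / q


n = 6
for g in (1, 2, 3, 6):
    q, r = grouped_square_costs(n, g)
    # g = 1 is the extreme q = 2n^2 (r = 1); g = n is the extreme q = 2n (r = n).
    print(g, q, r, replication_lower_bound(n, q))
```

For every g in the loop, the replication rate of the grouped algorithm equals g and matches the value of 2n^2/q.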
Grouped two-pass approach
