
Parallel Algorithms

The document discusses bitonic sorting and bitonic sequences. It explains how an unsorted sequence can be converted into a bitonic sequence using bitonic merges. It then describes how a bitonic sequence can be sorted in parallel using bitonic splits and bitonic merges in O(log^2 n) time.

Uploaded by

Rajat Dabas

Kenneth E. Batcher
Professor, Kent State University

http://www.cs.kent.edu/~batcher
“Sorting networks and their applications”, AFIPS Proc. of 1968 Spring Joint Computer Conference, Vol. 32, pp. 307-314.
Background
Sorting is fundamental
The lower bound for any comparison-based sequential sorting algorithm is Ω(n log n)
Can we improve the time complexity further?
– Parallel algorithms
– Circuit/Network Design
– Parallel Computing Models
①Bitonic Sequence

A sequence of elements {a0, a1, …, an-1} where either
– (1) there exists an index i, 0 ≤ i ≤ n-1, such that {a0, …, ai} is monotonically increasing and {ai+1, …, an-1} is monotonically decreasing,
– e.g. {1, 2, 4, 7, 6, 0}
or
– (2) there exists a cyclic shift of indices so that (1) is satisfied
– e.g. {8, 9, 2, 1, 0, 4} → {0, 4, 8, 9, 2, 1}
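
The definition above can be checked mechanically. The following is a minimal brute-force sketch, not part of the original slides: the function name is_bitonic is hypothetical, and it assumes non-strict comparisons (equal neighbors are allowed) and a Python list as input.

```python
def is_bitonic(seq):
    """Return True if seq is bitonic: some cyclic shift of it rises, then falls.

    Assumption: comparisons are non-strict, so repeated values are accepted.
    """
    n = len(seq)
    if n <= 2:
        return True
    # Case (2) of the definition: try every cyclic shift (shift 0 is case (1)).
    for shift in range(n):
        rotated = seq[shift:] + seq[:shift]
        i = 0
        # Walk up the increasing part.
        while i + 1 < n and rotated[i] <= rotated[i + 1]:
            i += 1
        # Walk down the decreasing part.
        while i + 1 < n and rotated[i] >= rotated[i + 1]:
            i += 1
        if i == n - 1:
            return True
    return False
```

This check takes O(n^2) time, which is fine for testing small examples such as the ones on these slides.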
①Bitonic Sequence: Examples

[Figures: value of element plotted against index a0 … a7 for each sequence]
– { 3, 5, 7, 9, 8, 6, 4, 2 }
– { 8, 6, 4, 2, 3, 5, 7, 9 }
– { 3, 5, 7, 9, 11, 13, 15, 17 }
– { 5, 3, 1, 2, 4, 6, 8, 7 }
Bitonic Sort: basic idea

Consider a bitonic sequence S of size n where
– the first half ( {a0, a1, …, an/2-1} ) is increasing, and the second half ( {an/2, an/2+1, …, an-1} ) is decreasing

[Figure: value of element against index a0, a1, …, an/2-1, an/2, an/2+1, …, an-1]

②“Bitonic Split”

Pair-wise min-max comparison (compare and exchange)
– s1 = {min(a0, an/2), min(a1, an/2+1), …, min(an/2-1, an-1)}
– s2 = {max(a0, an/2), max(a1, an/2+1), …, max(an/2-1, an-1)}

[Figure: element value against index a0 … an-1; after compare and exchange the values form a lower curve S1 and an upper curve S2]
There exists
– an element b in S1 such that all elements before b are increasing and all elements after b are decreasing
– an element c in S2 such that all elements before c are decreasing and all elements after c are increasing
S1 and S2
– Both S1 and S2 are bitonic sequences
– Every element of S1 ≤ every element of S2 (because b ≤ c, b is the maximum value in S1, and c is the minimum value in S2)

[Figure: S1 peaks at b, S2 dips to c, and S2 lies above S1]
Pair-wise min-max comparison

e.g. { 2, 4, 6, 8, 7, 5, 3, 1 }
Compare and exchange { 2, 4, 6, 8 } with { 7, 5, 3, 1 }
=> S1 = {2, 4, 3, 1}
   S2 = {7, 5, 6, 8}
One bitonic sequence of size 8 => 2 bitonic sequences of size 4
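
The split can be written in a few lines. This is a minimal sequential sketch, assuming an even-length Python list; the name bitonic_split is hypothetical. On a parallel machine the n/2 comparisons are independent and can run in a single step.

```python
def bitonic_split(seq):
    """Split a bitonic sequence of even length n into (S1, S2).

    S1 holds the pairwise minima and S2 the pairwise maxima of
    elements that are n/2 positions apart.
    """
    half = len(seq) // 2
    s1 = [min(seq[i], seq[i + half]) for i in range(half)]
    s2 = [max(seq[i], seq[i + half]) for i in range(half)]
    return s1, s2
```

Applied to the example above, bitonic_split([2, 4, 6, 8, 7, 5, 3, 1]) gives ([2, 4, 3, 1], [7, 5, 6, 8]).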
②Bitonic Split

The split is applicable to any bitonic sequence. The first half need not be increasing (or decreasing) and the second half decreasing (or increasing):

Bitonic(n) → Bitonic Split → 2 × Bitonic(n/2)


Sorting a bitonic sequence

By using bitonic split recursively,
INPUT: a bitonic sequence of size n
– Phase 1: 2 bitonic sequences of size n/2
– Phase 2: 4 bitonic sequences of size n/4
– …
– Phase (log n): n bitonic sequences of size 1
– A sorted sequence is obtained by concatenating the n bitonic sequences of size 1
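
The phases above amount to a short recursion. The sketch below is an illustration under stated assumptions: the input is a bitonic Python list whose length is a power of two, and the name sort_bitonic is hypothetical. Each recursion level corresponds to one phase, so there are log n levels.

```python
def sort_bitonic(seq):
    """Sort a bitonic sequence (length a power of two) in increasing order."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    # One phase: split into pairwise minima (s1) and maxima (s2).
    s1 = [min(seq[i], seq[i + half]) for i in range(half)]
    s2 = [max(seq[i], seq[i + half]) for i in range(half)]
    # Both halves are bitonic and every element of s1 <= every element of s2,
    # so sorting each half and concatenating yields a sorted sequence.
    return sort_bitonic(s1) + sort_bitonic(s2)
```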
③Bitonic Merge

Sort a bitonic sequence using bitonic splits

[Figure: bitonic merge of a sequence of length 16, wires 1 to 16]
Anything wrong with this slide?


Bitonic Merge Circuit : BM[16]


What do you think of ?


Questions?

How can we convert an unsorted sequence to a bitonic sequence? (Then, by using bitonic split recursively, a sorted sequence can be formed.)
Turn an unsorted sequence into a bitonic sequence: ③Bitonic Merge (BM) Operation

[Figure: network over a sequence of length 16, wires 1 to 16]

At every phase, sort bitonic sequences of size 2, 4, 8, 16 into monotonically increasing or decreasing sequences
④Bitonic Sort

[Figure: bitonic sorting network over a sequence of length 16, wires 1 to 16]
Sorting a sequence in any initial order

Using bitonic merge repeatedly
Definition:
– ⊕BM[n]: increasing bitonic merge of size n
  • a bitonic merge that sorts a bitonic sequence of size n into a monotonically increasing sequence
– ⊖BM[n]: decreasing bitonic merge of size n
  • a bitonic merge that sorts a bitonic sequence of size n into a monotonically decreasing sequence
Steps:
– Divide the sequence into groups of 2
  – any sequence of size 2 is a bitonic sequence: either the increasing part is of size 2 and the decreasing part is of size 0, or vice versa
– Apply an increasing BM[2] (⊕BM[2]) to one group to form an increasing sequence, and a decreasing BM[2] (⊖BM[2]) to the adjacent group to form a decreasing sequence
– Concatenate the two groups to form a bitonic sequence of size 4
– Repeat the above steps on the other groups
– Repeat the above steps recursively, until a bitonic sequence of size n is formed
– Use bitonic merge once more to turn the bitonic sequence into a sorted sequence
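
The steps above can be sketched as a pair of recursive functions. This is a sequential illustration, not the circuit from the slides: it assumes the input length is a power of two, and the names bitonic_merge and bitonic_sort and the ascending flag are hypothetical. The flag selects between the increasing and the decreasing merge.

```python
def bitonic_merge(seq, ascending=True):
    """Sort a bitonic sequence; ascending selects increasing or decreasing order."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    lo = [min(seq[i], seq[i + half]) for i in range(half)]
    hi = [max(seq[i], seq[i + half]) for i in range(half)]
    if ascending:
        return bitonic_merge(lo, True) + bitonic_merge(hi, True)
    return bitonic_merge(hi, False) + bitonic_merge(lo, False)


def bitonic_sort(seq, ascending=True):
    """Sort any sequence whose length is a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    # Sort one half increasing and the other decreasing: together they
    # form a bitonic sequence of size n.
    first = bitonic_sort(seq[:half], True)
    second = bitonic_sort(seq[half:], False)
    # A final merge turns the bitonic sequence into a sorted one.
    return bitonic_merge(first + second, ascending)
```

The recursion is top-down, whereas the slides build groups of 2, 4, 8, … bottom-up; both orderings apply the same merges.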
Bitonic Sorting Circuit: BS(18)

[Figure: bitonic sorting circuit built from bitonic merge blocks]
– ⊕BM[n]: increasing bitonic merge of size n
– ⊖BM[n]: decreasing bitonic merge of size n
Sorting a sequence in any initial order

Hence,
n unsorted numbers
→ n/2 groups of 2-number bitonic sequences
→ n/4 groups of 4-number bitonic sequences
→ …
→ 1 group of one n-number bitonic sequence
→ a sorted sequence
⑤Complexity of Bitonic Sort

Parallel bitonic sort with n processors
– The last stage of an n-element bitonic sort merges n elements and has a depth of log(n)
– The earlier stages perform complete sorts of n/2 elements
– Depth: d(n) = d(n/2) + log(n)
– d(n) = 1 + 2 + 3 + … + log(n) = log(n)(log(n) + 1)/2 = Θ(log² n)
– Complexity: T(n) = Θ(log² n)
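
The depth recurrence can be checked numerically. The sketch below assumes n is a power of two and d(1) = 0; the function name depth is hypothetical.

```python
def depth(n):
    """Depth of the bitonic sorting network, from d(n) = d(n/2) + log2(n)."""
    if n == 1:
        return 0
    log_n = n.bit_length() - 1  # exact log2 for a power of two
    return depth(n // 2) + log_n
```

For n = 16 this gives 1 + 2 + 3 + 4 = 10, which matches log(n)(log(n) + 1)/2 with log(n) = 4.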
⑤Complexity of Bitonic Sort

Parallel sorting with a block of elements per processor
– sort the local block of elements first (using any sorting algorithm such as quicksort or bitonic sort)
– sort the elements among processors using parallel bitonic sort
– T(n) = T(local_sort) + T(comparisons) + T(communication)
Only computation time is considered here (you also need to account for all communication time)
⑥Concluding Remarks

Bitonic Sorting: Common Sense
Regression to Computer Science
One of 10 Most Important Papers
Parallel Algorithm: Ascend/Descend
– Processors communicate with their neighbors along dimension i, where i runs from 1 to N or from N to 1
– Another example: Prefix sum
Network Model:
Bitonic Sorting Network

[Figure: bitonic sorting network]

Hypercube connections!
Try to write the bitonic sorting algorithm on a hypercube.
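
One way to approach this exercise is the well-known iterative form of bitonic sort, in which element i is compared with element i XOR j, and j is a power of two. On a hypercube with one element per node, i XOR j is a direct neighbor of node i. The sketch below simulates this sequentially; it assumes the length is a power of two, and the name bitonic_sort_hypercube is hypothetical.

```python
def bitonic_sort_hypercube(values):
    """Simulate bitonic sort with one element per hypercube node."""
    a = list(values)
    n = len(a)
    k = 2  # size of the bitonic sequences being merged in this stage
    while k <= n:
        j = k // 2  # partner distance: one hypercube dimension per step
        while j >= 1:
            # All pairs (i, i ^ j) are disjoint, so on a real hypercube
            # this loop is a single parallel compare-exchange step.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    out_of_order = a[i] > a[partner] if ascending else a[i] < a[partner]
                    if out_of_order:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

The outer loop runs over merge sizes 2, 4, …, n and the inner loop over the dimensions used within each merge.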
Bitonic Sort on Butterfly

[Figures: step-by-step bitonic sort on a butterfly network, 12 slides]
PRAM Model

[Figure: processors P1, P2, P3, …, Pn connected to a shared Memory]

Access time from any processor to any memory unit is equal
This is impossible in practice
So it is an idealized model for parallel computing
It lets us focus only on algorithm design
PRAM Model

Equal access time is impossible even in a sequential computer
PRAM Model
Program for Sum = a(1)+a(2)+…+a(N)

for i = 1 to log N
    for j = 1 to N/2^i parallel do
        a(j) = a(j) + a(N/2^i + j)
    endpar
endfor

Finally a(1) is the sum
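
The program above can be simulated sequentially. This is a minimal sketch under stated assumptions: N is a power of two, indices are 0-based (so the result ends up in a[0] rather than a(1)), and the name pram_sum is hypothetical. A snapshot of the array stands in for the synchronous reads of one parallel step.

```python
def pram_sum(values):
    """Simulate the log N-step PRAM sum; the length must be a power of two."""
    a = list(values)
    stride = len(a) // 2  # equals N/2^i in step i
    while stride >= 1:
        old = a[:]  # all processors read before any processor writes
        for j in range(stride):  # these additions run in parallel on a PRAM
            a[j] = old[j] + old[j + stride]
        stride //= 2
    return a[0]
```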


PRAM Model
Program for Sum = a(1)+a(2)+…+a(N)

Processor P(i) holds a(i)

Finally a(1) is the sum


PRAM Model (Prefix Sum)

Initially s(j) = a(j)

for i = 1 to log N
    for j = 1 to N - N/2^i parallel do
        s(N/2^i + j) = s(j) + s(N/2^i + j)
    endpar
endfor
PRAM Model (Prefix Sum)

Initial values:
a(1)  a(2)  a(3)  a(4)  a(5)  a(6)  a(7)  a(8)

After step 1:
a(1)  a(2)  a(3)  a(4)  a(1)+a(5)  a(2)+a(6)  a(3)+a(7)  a(4)+a(8)

After step 2:
a(1)  a(2)  a(1)+a(3)  a(2)+a(4)  a(1)+a(3)+a(5)  a(2)+a(4)+a(6)  a(1)+a(3)+a(5)+a(7)  a(2)+a(4)+a(6)+a(8)

After step 3:
a(1)  a(1)+a(2)  a(1)+a(2)+a(3)  a(1)+a(2)+a(3)+a(4)  a(1)+a(2)+a(3)+a(4)+a(5)  a(1)+a(2)+a(3)+a(4)+a(5)+a(6)  a(1)+a(2)+a(3)+a(4)+a(5)+a(6)+a(7)  a(1)+a(2)+a(3)+a(4)+a(5)+a(6)+a(7)+a(8)
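
The trace above suggests the following reading of the prefix-sum procedure: in step i, every position adds the value that sits N/2^i places to its left. The sketch below simulates that reading; it is an interpretation of the figure rather than the slide's exact program. It assumes N is a power of two and 0-based indices, and the name pram_prefix_sum is hypothetical.

```python
def pram_prefix_sum(values):
    """Simulate a log N-step prefix sum with strides N/2, N/4, ..., 1."""
    s = list(values)
    n = len(s)
    stride = n // 2
    while stride >= 1:
        old = s[:]  # synchronous step: read old values, then write
        for j in range(stride, n):  # these updates run in parallel on a PRAM
            s[j] = old[j - stride] + old[j]
        stride //= 2
    return s
```

For the input [1, 2, 3, 4, 5, 6, 7, 8] the steps produce [1, 2, 3, 4, 6, 8, 10, 12], then [1, 2, 4, 6, 9, 12, 16, 20], and finally [1, 3, 6, 10, 15, 21, 28, 36].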
Hypercube Model

Suppose node N(x) holds element a(x), where x is the value of the node index x1x2…xn

for j = 1 to n
    for all xj+1…xn parallel do
        N(00…0 (xj=0) xj+1…xn) ← N(00…0 (xj=1) xj+1…xn);
        a(00…0 (xj=0) xj+1…xn) =
            a(00…0 (xj=0) xj+1…xn) + a(00…0 (xj=1) xj+1…xn)
    endpar
endfor

Finally node 00…0 holds the sum
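
A sequential simulation of this reduction is sketched below. It assumes 2^n nodes, that x1 is the most significant bit of the node index, and that one dimension is processed per step; the name hypercube_sum is hypothetical.

```python
def hypercube_sum(values):
    """Simulate the hypercube sum; node x initially holds values[x]."""
    a = list(values)
    n = len(a).bit_length() - 1  # hypercube dimension, len(a) == 2**n
    for j in range(1, n + 1):
        bit = 1 << (n - j)  # mask for x_j, taking x_1 as the leftmost bit
        old = a[:]          # values before this communication step
        # Nodes below `bit` have x_1 .. x_j all zero; each receives from the
        # neighbor that differs only in x_j. All transfers run in parallel.
        for node in range(bit):
            a[node] = old[node] + old[node | bit]
    return a[0]
```

With 8 nodes, step 1 pairs node 000 with 100, step 2 pairs 000 with 010, and step 3 pairs 000 with 001.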


Hypercube Model

Suppose node 000 holds element a(0) and node 111 holds element a(7)

Initial values:
000: a(0)   001: a(1)   010: a(2)   011: a(3)
100: a(4)   101: a(5)   110: a(6)   111: a(7)

After step 1:
000: a(0)+a(4)   001: a(1)+a(5)   010: a(2)+a(6)   011: a(3)+a(7)

After step 2:
000: a(0)+a(4)+a(2)+a(6)   001: a(1)+a(5)+a(3)+a(7)

After step 3:
000: a(0)+a(4)+a(2)+a(6)+a(1)+a(5)+a(3)+a(7)


2 Hypercube Model (Prefix Sum)