DS MergeSort QuickSort (4) SLM

Uploaded by

m.biswas2014.mb
Copyright
© © All Rights Reserved

Divide-and-Conquer

• Divide the problem into a number of sub-problems


– Similar sub-problems of smaller size

• Conquer the sub-problems


– Solve the sub-problems recursively

– When a sub-problem is small enough, solve it in a straightforward manner

• Combine the solutions of the sub-problems


– Obtain the solution for the original problem

Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
– Divide the n-element sequence to be sorted into two
subsequences of n/2 elements each
• Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing
more to do
• Combine
– Merge the two sorted subsequences
Merge Sort
Example array A[1 . . 8], with p = 1, q = 4, r = 8:
5 2 4 7 1 3 2 6

Alg.: MERGE-SORT(A, p, r)
    if p < r                              ▹ Check for base case
        then q ← ⌊(p + r)/2⌋              ▹ Divide
             MERGE-SORT(A, p, q)          ▹ Conquer
             MERGE-SORT(A, q + 1, r)      ▹ Conquer
             MERGE(A, p, q, r)            ▹ Combine

• Initial call: MERGE-SORT(A, 1, n)
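
The MERGE-SORT steps above can be sketched in Python. This is a minimal illustration, not the slides' own code: it assumes 0-based inclusive indices (so the initial call is `merge_sort(a, 0, len(a) - 1)`), and it uses the standard library's `heapq.merge` as a stand-in for the MERGE procedure. The name `merge_sort` is chosen for this example.

```python
import heapq

def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place (0-based, inclusive bounds; an assumption of this sketch)."""
    if r is None:
        r = len(a) - 1
    if p < r:                      # base case: 0 or 1 element, nothing to do
        q = (p + r) // 2           # divide: floor of the midpoint
        merge_sort(a, p, q)        # conquer the left half
        merge_sort(a, q + 1, r)    # conquer the right half
        # combine: heapq.merge stands in for MERGE(A, p, q, r)
        a[p:r + 1] = list(heapq.merge(a[p:q + 1], a[q + 1:r + 1]))

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data)
print(data)
```

The slices passed to `heapq.merge` are copies, which mirrors the extra space that the merge step needs.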

Example – n Power of 2
Divide (top to bottom), indices 1 . . 8:

[5 2 4 7 1 3 2 6]                        q = 4
[5 2 4 7] [1 3 2 6]
[5 2] [4 7] [1 3] [2 6]
[5] [2] [4] [7] [1] [3] [2] [6]

Example – n Power of 2
Conquer and Merge (read bottom to top), indices 1 . . 8:

[1 2 2 3 4 5 6 7]
[2 4 5 7] [1 2 3 6]
[2 5] [4 7] [1 3] [2 6]
[5] [2] [4] [7] [1] [3] [2] [6]

Example – n Not a Power of 2
Divide (top to bottom), indices 1 . . 11:

[4 7 2 6 1 4 7 3 5 2 6]                  q = 6
[4 7 2 6 1 4] [7 3 5 2 6]                q = 3 (left), q = 9 (right)
[4 7 2] [6 1 4] [7 3 5] [2 6]
[4 7] [2] [6 1] [4] [7 3] [5] [2] [6]
[4] [7] [6] [1] [7] [3]                  (indices 1, 2, 4, 5, 7, 8)

Example – n Not a Power of 2
Conquer and Merge (read bottom to top), indices 1 . . 11:

[1 2 2 3 4 4 5 6 6 7 7]
[1 2 4 4 6 7] [2 3 5 6 7]
[2 4 7] [1 4 6] [3 5 7] [2 6]
[4 7] [2] [1 6] [4] [3 7] [5] [2] [6]
[4] [7] [6] [1] [7] [3]                  (indices 1, 2, 4, 5, 7, 8)

Merging
Example with p = 1, q = 4, r = 8:
[2 4 5 7] [1 2 3 6]

• Input: Array A and indices p, q, r such that p ≤ q < r
– Subarrays A[p . . q] and A[q + 1 . . r] are sorted
• Output: One single sorted subarray A[p . . r]

Merging
• Idea for merging:
– Two piles of sorted cards
• Choose the smaller of the two top cards
• Remove it and place it in the output pile
– Repeat the process until one pile is empty
– Take the remaining input pile and place it face-down onto the output pile

Example with p = 1, q = 4, r = 8: [2 4 5 7] [1 2 3 6]
A1 = A[p . . q] and A2 = A[q + 1 . . r] are merged into A[p . . r]

Merge - Pseudocode
Alg.: MERGE(A, p, q, r)
Compute n1 ← q − p + 1 and n2 ← r − q
Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
i ← 1; j ← 1
for k ← p to r
    do if L[ i ] ≤ R[ j ]
        then A[k] ← L[ i ]
             i ← i + 1
        else A[k] ← R[ j ]
             j ← j + 1

Example with p = 1, q = 4, r = 8: A = 2 4 5 7 1 2 3 6, so n1 = n2 = 4, L = 2 4 5 7 ∞ and R = 1 2 3 6 ∞
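
The sentinel-based MERGE can be sketched in Python as follows. This is an illustrative version under stated assumptions: indices are 0-based and inclusive, `math.inf` plays the role of the ∞ sentinel (so the elements are assumed to be finite numbers), and the name `merge_with_sentinels` is made up for this example.

```python
import math

def merge_with_sentinels(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] in place (0-based, inclusive bounds)."""
    # Copy both runs and append an infinite sentinel to each,
    # so neither run needs an explicit "is it empty?" check.
    left = a[p:q + 1] + [math.inf]
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):          # exactly r - p + 1 iterations
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1

a = [2, 4, 5, 7, 1, 2, 3, 6]
merge_with_sentinels(a, 0, 3, 7)
print(a)
```

Because the loop runs exactly r − p + 1 times, the sentinels themselves are never copied back into the array.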
Running Time of Merge
(assume the last for loop)
• Initialization (copying into temporary arrays):
– Θ(n1 + n2) = Θ(n)
• Adding the elements to the final array:
– n iterations, each taking constant time ⇒ Θ(n)
• Total time for Merge:
– Θ(n)

Analyzing Divide-and-Conquer Algorithms
• The recurrence is based on the three steps of the paradigm:
– T(n): running time on a problem of size n
– Divide the problem into a subproblems, each of size n/b: takes D(n)
– Conquer (solve) the subproblems: takes aT(n/b)
– Combine the solutions: takes C(n)

T(n) = Θ(1)                          if n ≤ c
T(n) = aT(n/b) + D(n) + C(n)         otherwise
MERGE-SORT Running Time
• Divide:
– compute q as the average of p and r: D(n) = Θ(1)
• Conquer:
– recursively solve 2 subproblems, each of size n/2 ⇒ 2T(n/2)
• Combine:
– MERGE on an n-element subarray takes Θ(n) time ⇒ C(n) = Θ(n)

T(n) = Θ(1)                  if n = 1
T(n) = 2T(n/2) + Θ(n)        if n > 1

Solve the Recurrence
T(n) = c                 if n = 1
T(n) = 2T(n/2) + cn      if n > 1

Use the master theorem with a = 2, b = 2:

Compare n^(log_b a) = n^(log_2 2) = n with f(n) = cn

Case 2: f(n) = Θ(n^(log_b a)), so T(n) = Θ(nlgn)
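
As a sanity check on the Θ(nlgn) result, the recurrence can be evaluated directly for powers of two. The sketch below is only an illustration: it assumes c = 1 by default and n = 2^k, in which case unrolling the recurrence gives the closed form T(n) = cn lg n + cn. The function name `t` is chosen for this example.

```python
def t(n, c=1):
    """Evaluate T(n) = c if n == 1, else 2*T(n/2) + c*n, for n a power of two."""
    if n == 1:
        return c
    return 2 * t(n // 2, c) + c * n

# Compare the recurrence with the closed form c*n*lg(n) + c*n for n = 2**k.
for k in range(6):
    n = 2 ** k
    print(n, t(n), n * k + n)
```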

Merge Sort - Discussion
• Running time is insensitive to the input order

• Advantage:
– Guaranteed to run in Θ(nlgn)

• Disadvantage:
– Requires extra space proportional to N

Sorting Challenge 1
Problem: Sort a file of huge records with tiny
keys
Example application: Reorganize your MP-3 files

Which method to use?


A. merge sort, guaranteed to run in time NlgN
B. selection sort
C. bubble sort
D. a custom algorithm for huge records/tiny keys
E. insertion sort

Sorting Files with Huge Records and
Small Keys

• Insertion sort or bubble sort?

– NO, too many exchanges

• Selection sort?

– YES, it makes only a linear number of exchanges

• Merge sort or custom method?

– Probably not: selection sort is simpler and does fewer swaps
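
The claim about exchanges can be illustrated with a small Python sketch that counts how many records selection sort actually moves. The function name `selection_sort_exchanges` and the `(key, payload)` record layout are assumptions made for this example; the point is that at most N − 1 exchanges happen, however large each record is.

```python
def selection_sort_exchanges(records, key):
    """Return (sorted copy, number of exchanges) using selection sort."""
    a = list(records)
    exchanges = 0
    for i in range(len(a) - 1):
        # Find the index of the smallest key in a[i..]; only keys are compared.
        m = min(range(i, len(a)), key=lambda k: key(a[k]))
        if m != i:
            a[i], a[m] = a[m], a[i]   # one exchange of (possibly huge) records
            exchanges += 1
    return a, exchanges

# Hypothetical records: tiny integer key, large payload.
records = [(3, "x" * 1000), (1, "y" * 1000), (2, "z" * 1000)]
ordered, swaps = selection_sort_exchanges(records, key=lambda rec: rec[0])
print([rec[0] for rec in ordered], swaps)
```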

Sorting Challenge 2
Problem: Sort a huge randomly-ordered file
of small records
Application: Process transaction records for a phone company

Which sorting method to use?


A. Bubble sort
B. Selection sort
C. Mergesort guaranteed to run in time NlgN
D. Insertion sort

Sorting Huge, Randomly - Ordered Files

• Selection sort?
– NO, always takes quadratic time

• Bubble sort?
– NO, quadratic time for randomly-ordered keys

• Insertion sort?
– NO, quadratic time for randomly-ordered keys

• Mergesort?
– YES, it is designed for this problem

Sorting Challenge 3
Problem: sort a file that is already almost in
order
Applications:
– Re-sort a huge database after a few changes
– Double-check that someone else sorted a file
Which sorting method to use?
A. Mergesort, guaranteed to run in time NlgN
B. Selection sort
C. Bubble sort
D. A custom algorithm for almost in-order files
E. Insertion sort
Sorting Files That are Almost in Order
• Selection sort?
– NO, always takes quadratic time
• Bubble sort?
– NO, bad for some definitions of “almost in order”
– Ex: B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
• Insertion sort?
– YES, takes linear time for most definitions of “almost
in order”
• Mergesort or custom method?
– Probably not: insertion sort simpler and faster
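
The contrast above can be checked on the slide's own example, B C D ... Z A. The Python sketch below counts key comparisons for both methods; the function names are made up for this example, and the bubble sort shown is one common variant (left-to-right passes with early exit), which is an assumption since the slides do not specify one.

```python
import string

def insertion_sort_comparisons(items):
    """Return (sorted copy, number of key comparisons) for insertion sort."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break
            a[j + 1] = a[j]        # shift the larger element right
            j -= 1
        a[j + 1] = key
    return a, comparisons

def bubble_sort_comparisons(items):
    """Return (sorted copy, number of key comparisons) for bubble sort with early exit."""
    a = list(items)
    comparisons = 0
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        for j in range(end):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:
            break
    return a, comparisons

almost_sorted = list(string.ascii_uppercase[1:] + "A")   # B C D ... Z A
print(insertion_sort_comparisons(almost_sorted)[1])      # 49
print(bubble_sort_comparisons(almost_sorted)[1])         # 325
```

In this variant the single out-of-place A moves only one position left per bubble sort pass, so every pass is needed, while insertion sort handles it in one sweep.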

Quicksort
A[p…q] ≤ A[q+1…r]
• Sort an array A[p…r]
• Divide
– Partition the array A into 2 subarrays A[p..q] and A[q+1..r],
such that each element of A[p..q] is smaller than or equal to
each element in A[q+1..r]
– Need to find index q to partition the array

Quicksort
A[p…q] ≤ A[q+1…r]

• Conquer
– Recursively sort A[p..q] and A[q+1..r] using
Quicksort
• Combine
– Trivial: the arrays are sorted in place
– No additional work is required to combine them
– The entire array is now sorted

QUICKSORT

Alg.: QUICKSORT(A, p, r) Initially: p=1, r=n

if p < r

then q ← PARTITION(A, p, r)

QUICKSORT (A, p, q)

QUICKSORT (A, q+1, r)


Recurrence:
T(n) = T(q) + T(n – q) + f(n), where f(n) is the cost of PARTITION()

Partitioning the Array
• Choosing PARTITION()
– There are different ways to do this
– Each has its own advantages/disadvantages

• Hoare partition
– Select a pivot element x around which to partition
– Grows two regions
A[p…i] ≤ x and x ≤ A[j…r], with i advancing from the left and j from the right
Example
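A[p…r] = 5 3 2 6 4 1 3 7, pivot x = 5

– Scan 1: j stops at 3 (position 7), i stops at 5 (position 1); exchange ⇒ 3 3 2 6 4 1 5 7
– Scan 2: j stops at 1 (position 6), i stops at 6 (position 4); exchange ⇒ 3 3 2 1 4 6 5 7
– Scan 3: j stops at 4 (position 5), i stops at 6 (position 6); i and j have crossed, so return q = j = 5

Result: A[p…q] = 3 3 2 1 4 and A[q+1…r] = 6 5 7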
Partitioning the Array
Alg.: PARTITION(A, p, r)
x ← A[p]
i ← p – 1
j ← r + 1
while TRUE
    do repeat j ← j – 1
           until A[j] ≤ x
       repeat i ← i + 1
           until A[i] ≥ x
       if i < j
           then exchange A[i] ↔ A[j]
           else return j

Example: A = 5 3 2 6 4 1 3 7, with i and j starting just outside positions p and r
On return, j = q and A[p…q] ≤ A[q+1…r]
Each element is visited once! Running time: Θ(n), where n = r – p + 1
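
PARTITION and QUICKSORT can be sketched together in Python. This is a minimal illustration rather than the slides' own code: it assumes 0-based inclusive indices (so the returned q is one less than in the 1-based slides), and the names `hoare_partition` and `quicksort` are chosen for this example.

```python
def hoare_partition(a, p, r):
    """Hoare partition of a[p..r] around the pivot x = a[p]; returns q."""
    x = a[p]
    i = p - 1
    j = r + 1
    while True:
        j -= 1                     # repeat j <- j - 1 until a[j] <= x
        while a[j] > x:
            j -= 1
        i += 1                     # repeat i <- i + 1 until a[i] >= x
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j               # a[p..j] <= a[j+1..r] elementwise

def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place using Hoare partitioning."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = hoare_partition(a, p, r)
        quicksort(a, p, q)         # note: q, not q - 1, with Hoare partition
        quicksort(a, q + 1, r)

data = [5, 3, 2, 6, 4, 1, 3, 7]
quicksort(data)
print(data)
```

The left recursive call includes position q because Hoare partitioning does not guarantee that the pivot ends up at its final position.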
Recurrence

Alg.: QUICKSORT(A, p, r) Initially: p=1, r=n

if p < r

then q ← PARTITION(A, p, r)

QUICKSORT (A, p, q)

QUICKSORT (A, q+1, r)


Recurrence:
T(n) = T(q) + T(n – q) + n
Worst Case Partitioning
• Worst-case partitioning
– One region has one element and the other has n – 1 elements

– Maximally unbalanced
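• Recurrence (q = 1):
T(n) = T(1) + T(n – 1) + n,
T(1) = Θ(1)
T(n) = T(n – 1) + n = n + (Σ_{k=1}^{n} k) – 1 = Θ(n) + Θ(n²) = Θ(n²)

[Figure: recursion tree in which each level peels off one element (n splits into 1 and n – 1, n – 1 into 1 and n – 2, and so on), with per-level costs n, n, n – 1, n – 2, ..., 3, 2, summing to Θ(n²)]

When does the worst case happen?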
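
One input that triggers this behavior, assuming the Hoare partition with the first element as pivot, is an array of distinct keys that is already sorted: every call then splits off a single element. The Python sketch below sums the sizes of all subarrays handed to the partition step; the names `hoare_partition` and `partition_work` are chosen for this example, and indices are 0-based.

```python
def hoare_partition(a, p, r):
    """Hoare partition of a[p..r] around the pivot x = a[p]; returns q."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

def partition_work(a, p, r):
    """Quicksort a[p..r] and return the total size of all partitioned subarrays."""
    if p >= r:
        return 0
    q = hoare_partition(a, p, r)
    return (r - p + 1) + partition_work(a, p, q) + partition_work(a, q + 1, r)

n = 8
work = partition_work(list(range(1, n + 1)), 0, n - 1)
print(work, n * (n + 1) // 2 - 1)   # both are 35 for n = 8
```

For sorted input the total is n + (n − 1) + ... + 2 = n(n + 1)/2 − 1, which matches the quadratic sum in the recurrence above.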
Best Case Partitioning
• Best-case partitioning
– Partitioning produces two regions of size n/2
• Recurrence: q=n/2
T(n) = 2T(n/2) + Θ(n)
T(n) = Θ(nlgn) (Master theorem)

Case Between Worst and Best

• 9-to-1 proportional split


Q(n) = Q(9n/10) + Q(n/10) + n
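
The 9-to-1 recurrence can be evaluated numerically to see that it still grows like n log n, since the recursion tree has depth about log_{10/9} n and each level costs at most n. The Python sketch below is an illustration under assumptions that the slide does not state: sizes are rounded to integers (the larger part is ⌊9n/10⌋), and the base case is Q(1) = 1. The name `q_cost` is made up for this example.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def q_cost(n):
    """Q(n) = Q(floor(9n/10)) + Q(n - floor(9n/10)) + n, with Q(1) = 1 (assumed)."""
    if n <= 1:
        return 1
    big = (9 * n) // 10        # the 9/10 side of the split
    small = n - big            # the 1/10 side (at least 1 for n >= 2)
    return q_cost(big) + q_cost(small) + n

for n in (10, 1000, 100000):
    bound = n * (math.log(n, 10 / 9) + 2)
    print(n, q_cost(n), round(bound))
```

The printed bound n(log_{10/9} n + 2) is a loose upper estimate, differing from n lg n only by a constant factor.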

How does partition affect performance?

Performance of Quicksort
• Average case
– All permutations of the input numbers are equally likely
– On a random input array, we will have a mix of well balanced
and unbalanced splits
– Good and bad splits are distributed randomly throughout the tree

[Figure: two recursion-tree fragments. Alternation of a bad and a good split: n splits into 1 and n – 1, then n – 1 splits into (n – 1)/2 and (n – 1)/2; combined partitioning cost: n + (n – 1) = 2n – 1 = Θ(n). Nearly well-balanced split: n splits into (n – 1)/2 + 1 and (n – 1)/2; partitioning cost: n = Θ(n).]

• Running time of Quicksort when levels alternate between good and bad splits is O(nlgn)