02 D&C Merge and Quicksort

Divide & Conquer Approach
Recurrences
• A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs.
The divide-and-conquer approach
• Recursive in structure
Steps:
1. DIVIDE the problem into a number of smaller subproblems
2. CONQUER the subproblems by solving them recursively
3. COMBINE the solutions of the subproblems into a solution of the original problem
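To make the three steps concrete, here is a minimal, generic divide-and-conquer skeleton in Python. It is only an illustrative sketch: the parameter names is_base, solve_base, divide, and combine are placeholders, not part of these slides.

def divide_and_conquer(problem, is_base, solve_base, divide, combine):
    # CONQUER trivially small instances directly
    if is_base(problem):
        return solve_base(problem)
    # DIVIDE the problem into smaller subproblems
    subproblems = divide(problem)
    # CONQUER each subproblem recursively
    solutions = [divide_and_conquer(p, is_base, solve_base, divide, combine)
                 for p in subproblems]
    # COMBINE the sub-solutions into a solution of the original problem
    return combine(solutions)

Merge sort (Strategy #2 below) and quicksort (Strategy #3) are both instances of this pattern.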
Case Study
• Find different ways to sort ‘n’ numbers……
• Strategy #1 : Insertion sort….
Insertion Sort- Recursive Version
• The array is divided into two parts: a sorted part followed by an unsorted part

• Take the first element of the unsorted part and insert it into the sorted part
Assume:
A[1,…,i-1] = sorted
A[i,…,n] = yet to be sorted
• Insert A[i] into A[1,…,i-1]
• Recursively sort A[i+1,…,n] (i.e., repeat the insertion for the next element)
• Stop when i exceeds n
Insertion Sort- Recursive Version
Insert_sort(A, start, n)
    If (start > n)
        Return
    // otherwise insert the value at position start into the sorted part of A
    Insert(A, start)
    Insert_sort(A, start+1, n)
    Return

Insert(A, start)               // insert A[start] into A[1,…,start-1]
    key = A[start]
    i = start - 1
    while i > 0 and A[i] > key
        A[i+1] = A[i]
        i = i - 1
    A[i+1] = key
    Return

Initial call: Insert_sort(A, start=2, n)
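For reference, a small runnable Python sketch of this recursive insertion sort (0-indexed, unlike the 1-indexed pseudocode above; the names insert and insert_sort simply mirror the slides):

def insert(A, start):
    # insert A[start] into the already-sorted prefix A[0 .. start-1]
    key = A[start]
    i = start - 1
    while i >= 0 and A[i] > key:
        A[i + 1] = A[i]            # shift larger elements one slot to the right
        i -= 1
    A[i + 1] = key

def insert_sort(A, start=1):
    # base case: every element has been inserted into the sorted prefix
    if start >= len(A):
        return
    insert(A, start)               # place A[start] into the sorted prefix
    insert_sort(A, start + 1)      # recursively handle the rest of the array

data = [5, 2, 4, 6, 1, 3]
insert_sort(data)
print(data)                        # prints [1, 2, 3, 4, 5, 6]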
Recurrence Relation
• T(n): time to run Insert_sort (recursive version) on an array of length n
• Sorting the remaining n-1 elements takes time T(n-1), using the recursive version
• T(n) = 1 + T(n-1)
  (Question from the slide: why does the next line add a 2, and not a 1?)
• T(n) = 1 + 2 + T(n-2)
• T(n) = 1 + 2 + 3 + ⋯ + (n-1) + T(1)
• T(n) = n(n-1)/2 = O(n²)
Clever sorting
• Divide the array into two equal parts
• Separately sort each part
• Combine the results
Strategy #2
• Given two sorted arrays A[m] and B[n] (non-empty)
• Let the pointers into these arrays be 'a' and 'b' respectively
• Initialize a new array C[m+n] and a pointer 'c'
• At the current pointer positions 'a' and 'b', compare:
  – If A[a] < B[b] then C[c] = A[a], increment 'a'
  – Else C[c] = B[b], increment 'b'
• (When one array is exhausted, copy the remainder of the other into C)

• Intuitively: merge two sorted arrays!
Strategy #2
• Example
• A=[ 4 12 40]
• B=[ 3 9 24]

• Then C=[ ]?
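A short Python sketch of this two-pointer merge, shown on the example arrays above (merge_sorted is just an illustrative name; the extra step at the end copies whatever remains once one array is exhausted):

def merge_sorted(A, B):
    C = []
    a = b = 0
    # while both arrays still have elements, copy over the smaller front element
    while a < len(A) and b < len(B):
        if A[a] < B[b]:
            C.append(A[a])
            a += 1
        else:
            C.append(B[b])
            b += 1
    # one array is exhausted: append whatever remains of the other
    C.extend(A[a:])
    C.extend(B[b:])
    return C

print(merge_sorted([4, 12, 40], [3, 9, 24]))   # prints the merged, sorted array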

Merge Sort
• Divide array into two equal parts
• Sort left part
• Sort right part
• Merge two sorted halves

• How to sort the two halves?

  – Recursively, on each piece, using the same strategy!
  – Until we reach the base case, i.e., only one element is left
  – Let's see how…
Merge Sort
Algorithm for MergeSort
Algorithm MergeSort(A, left, right, B)
1. If right - left > 0
2.     mid = ⌊(left + right)/2⌋
3.     Let L[1…m] and R[1…n] be two new arrays, where m = (mid - left) + 1 and n = right - (mid+1) + 1 = right - mid
4.     MergeSort(A, left, mid, L)
5.     MergeSort(A, mid+1, right, R)
6.     B = Merge(L, m, R, n)
7. Else B = [A[left]]    // base case: a single element is already sorted
Subroutine Merge
Algorithm Merge(L, m, R, n)
1. Initialize i=1, j=1, k=1
2. Let C[1,…,m+n] be a new array
3. While k ≤ m+n
4.     If j > n, or (i ≤ m and L[i] ≤ R[j])    // take from L if R is exhausted or L[i] is no larger
5.         C[k] = L[i]
6.         i = i+1, k = k+1
7.     Else                                    // take from R if L is exhausted or R[j] is smaller
8.         C[k] = R[j]
9.         j = j+1, k = k+1
10. Return C
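Putting the pieces together, a compact runnable Python sketch of merge sort. It uses slicing instead of the explicit left/right/mid indices of the pseudocode above, so it is a simplified illustration rather than a literal transcription:

def merge(L, R):
    # merge two already-sorted lists into one sorted list (same logic as the pseudocode above)
    C = []
    i = j = 0
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            C.append(L[i]); i += 1
        else:
            C.append(R[j]); j += 1
    C.extend(L[i:])    # one side is exhausted:
    C.extend(R[j:])    # copy over whatever remains of the other
    return C

def merge_sort(A):
    # base case: zero or one element is already sorted
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    # recursively sort each half, then merge the two sorted halves
    return merge(merge_sort(A[:mid]), merge_sort(A[mid:]))

print(merge_sort([5, 3, 7, 18, 24, 4, 8]))   # prints [3, 4, 5, 7, 8, 18, 24]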
MergeSort Analysis
• Analysis of Merge(L, m, R, n)
• For each comparison between L[i] and R[j], one element gets placed into C[k]
• How many elements in total? m + n
• Total number of comparisons: at most m + n
• Thus the complexity of Merge() is θ(m + n), i.e., linear in the number of elements merged
MergeSort Analysis
• Merge sort on just one element takes constant time.
• When we have n > 1 elements, we break the running time down into the divide, conquer, and combine costs described below.
MergeSort Analysis
• Suppose that our division of the problem yields a subproblems, each of which is 1/b the size of the original

• D(n): time to divide the problem into subproblems

• C(n): time to combine the solutions of the subproblems into the solution of the original problem

• In general, T(n) = a·T(n/b) + D(n) + C(n) for n above the base-case size
MergeSort Analysis
• Let T(n) be the running time on a problem of size n

• It can be simplified as:

  T(n) = c,              if n = 1
  T(n) = 2T(n/2) + cn,   if n > 1

• where the constant c represents the time required to solve problems of size 1
MergeSort Analysis
• Therefore: T(n)=2T(n/2)+cn

MergeSort Analysis
• T(n) = 2T(n/2) + cn

• But we know that T(n/2) = 2T(n/4) + cn/2

• So T(n) = 2[2T(n/2²) + cn/2] + cn = 2²T(n/2²) + 2cn

• But we know that T(n/4) = 2T(n/8) + cn/4

• So T(n) = 2²[2T(n/2³) + cn/4] + 2cn = 2³T(n/2³) + 3cn

• After some iterations we can write…
MergeSort Analysis
• After k iterations we can write
• T(n) = 2ᵏT(n/2ᵏ) + kcn
• Let the base case occur when T(1) = c
• For that we need n/2ᵏ = 1, which means n = 2ᵏ, or k = log₂n
• Thus
• T(n) = nc + cn·log₂n = θ(n log n)
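A quick numeric sanity check of this closed form: the sketch below evaluates T(n) directly from the recurrence for a few powers of two and compares it with nc + cn·log₂n (taking c = 1 purely for illustration):

import math

def T(n, c=1):
    # evaluate T(n) = 2*T(n/2) + c*n with T(1) = c, for n a power of two
    if n == 1:
        return c
    return 2 * T(n // 2, c) + c * n

for n in [2, 8, 64, 1024]:
    closed_form = n * 1 + 1 * n * math.log2(n)   # n*c + c*n*log2(n), with c = 1
    print(n, T(n), closed_form)                  # the recurrence and the closed form agree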
Recursion Tree
MergeSort Analysis
• Does the order of the input elements have any effect on the worst case or the best case?
Tutorial Question
1. Although merge sort runs in θ(n lg n) worst-case time and insertion sort runs in θ(n²) worst-case time, the constant factors in insertion sort can make it faster in practice for small problem sizes on many machines. Thus, it makes sense to coarsen the leaves of the recursion by using insertion sort within merge sort when subproblems become sufficiently small. Consider a modification to merge sort in which n/k sublists of length k are sorted using insertion sort and then merged using the standard merging mechanism, where k is a value to be determined.
• a. Show that insertion sort can sort the n/k sublists, each of length k, in θ(nk) worst-case time.
• b. Show how to merge the sublists in θ(n lg(n/k)) worst-case time.
• c. Given that the modified algorithm runs in θ(nk + n lg(n/k)) worst-case time, what is the largest value of k, as a function of n, for which the modified algorithm has the same running time as standard merge sort, in terms of θ-notation?
(page 40, CLRS)
Tutorial Question
• Observe that the while loop of lines 5–7 of the INSERTION-SORT procedure uses a linear search to scan (backward) through the sorted subarray A[1,…, j-1].
• Can we use a binary search (see Exercise 2.3-5) instead to improve the overall worst-case running time of insertion sort to O(n log n)?
Sorting Strategy #3
Partition about the Pivot
• Given an array:                              5 3 7 18 24 4 8

• Assume a pivot (say, the last element, 8):   5 3 7 18 24 4 8

• Re-arrange the array so that:                5 3 7 4 8 18 24

  – elements < pivot → left of pivot
  – elements > pivot → right of pivot
• This only buckets the elements, without caring about their order!
• Note: it puts the pivot element in its final "rightful position".
Strategy #3
Intuition:
• Divide the array into two parts such that
  – every element 'x' in the left part is less than
  – every element 'y' in the right part
• And every element must occur in either the left part or the right part

• Then recursively sort the left part and the right part

• Important: figure out how to divide the array so that x < y for every such pair!
Two Cool Facts about Partitioning
• Partitioning can be implemented in linear time, without any extra memory.
• It reduces the problem size, enabling Divide & Conquer.
• Divide (pivot = last element of each part):
  Original array A[1..7]:          5 3 7 18 24 4 8
  Partition(A, First=1, Last=7) →  5 3 7 4 | 8 | 18 24   (pivot 8 now at index 5, its final position)
  Partition(A, First=1, Last=4) →  3 | 4 | 7 5           (pivot 4 now at index 2)
  Partition(A, First=6, Last=7) →  18 | 24               (pivot 24 now at index 7)
  Partition(A, First=3, Last=4) →  5 | 7                 (pivot 5 now at index 3)
• Conquer: every element is already in its final position: 3 4 5 7 8 18 24
High Level Description
Algorithm QUICKSORT(A, First, Last)
// initial call: QUICKSORT(A, 1, N)
1. If (First < Last)
2.     P = Partition(A, First, Last)    // P is the pivot's final position
3.     QUICKSORT(A, First, P-1)
4.     QUICKSORT(A, P+1, Last)

• Key subroutine: P = Partition(A, First, Last)
  – Selects the pivot and buckets the remaining elements around it
  – Returns the sorted (final) position P of the pivot element
Think of an imaginary wall….
• Set the pivot… say, to the last element of the array
• Set up an imaginary wall at index = First - 1
• Traverse the array:
  – If the current element < pivot: swap it with the first element to the right of the wall and move the wall one position right
  – Else, do nothing
• Finally, swap the pivot with the first element to the right of the wall
Insight into Partition Subroutine
 Set the pivot to the last element in the array: pivot = 8
 Current index CI = First; Wall index WI = First - 1 = 0

  Start:         5 3 17 6 14 1 8      WI=0, CI=1
  CI=1: 5 < 8 →  swap A[WI+1]=A[1] with A[1] (no visible change), increment wall
                 5 3 17 6 14 1 8      WI=1, CI=2
  CI=2: 3 < 8 →  swap A[2] with A[2] (no visible change), increment wall
                 5 3 17 6 14 1 8      WI=2, CI=3
  CI=3: 17 > 8 → no swap, no wall increment
                 5 3 17 6 14 1 8      WI=2, CI=4
  CI=4: 6 < 8 →  swap A[WI+1]=A[3]=17 with A[4]=6, increment wall
                 5 3 6 17 14 1 8      WI=3, CI=5
  (continuing: 14 > 8 → no change; 1 < 8 → swap with A[4]=17, giving 5 3 6 1 14 17 8, WI=4;
   finally swap the pivot 8 with A[WI+1]=A[5]=14, giving 5 3 6 1 8 17 14, and return WI+1 = 5)
Key Subroutine- Partition about Pivot
• Set the pivot to the last element in the array
• Initialize two indices: Current index CI = First; Wall index WI = First - 1

PARTITION(A, First, Last)
1. Pivot = A[Last]
2. WI = First - 1, CI = First
3. For CI = First to Last-1
4.     Current_item = A[CI]
5.     If (Current_item ≤ Pivot) then
6.         Swap(A[WI+1], A[CI])     // swap the current item with the first element to the right of the wall
7.         WI = WI + 1              // and move the wall one position right
8.     Else
9.         Do nothing
10. Swap(A[WI+1], A[Last])          // finally, swap the pivot with the first element to the right of the wall
11. Return (WI + 1)
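A runnable Python sketch of this wall-based partition and the quicksort that drives it (0-indexed; the variable wall plays the role of WI):

def partition(A, first, last):
    # choose the last element as the pivot
    pivot = A[last]
    wall = first - 1                  # WI: the "wall" index
    for ci in range(first, last):     # CI runs from First to Last-1
        if A[ci] <= pivot:
            # swap the current item with the first element to the right of the wall,
            # then move the wall one position right
            wall += 1
            A[wall], A[ci] = A[ci], A[wall]
    # finally, swap the pivot with the first element to the right of the wall
    A[wall + 1], A[last] = A[last], A[wall + 1]
    return wall + 1                   # the pivot's final position

def quicksort(A, first=0, last=None):
    if last is None:
        last = len(A) - 1
    if first < last:
        p = partition(A, first, last)   # pivot ends up at index p
        quicksort(A, first, p - 1)      # sort the part left of the pivot
        quicksort(A, p + 1, last)       # sort the part right of the pivot

data = [5, 3, 17, 6, 14, 1, 8]
quicksort(data)
print(data)                             # prints [1, 3, 5, 6, 8, 14, 17]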
Quicksort Analysis
• Worst case arises when the array is already sorted
  T(n) = T(n-1) + Cn
  – Complexity is O(n²)

• Best case arises when the partition divides the array into two equal subparts, i.e., the pivot is the median of the array elements
  T(n) = 2T((n-1)/2) + Cn
  – Complexity is O(n log n)
  – Effect of pivot choice: first, last, middle, or median-of-three
Quicksort Analysis
• Assume the input is a permutation of [1, 2, 3, …, n]
  – Actual values are not important; only the relative order matters
  – Each input permutation is equally likely (uniform probability distribution)
• Calculate the running time averaged over all n! input permutations
Quicksort Analysis
• In the average case, PARTITION produces a mix of "good" and "bad" splits.
• In the recursion tree, the good and bad splits are distributed randomly throughout the tree.
  – Intuition: assume that good and bad splits alternate between levels of the tree,
  – that the good splits are best-case splits, and that the bad splits are worst-case splits.
• The combination of a bad split followed by a good split produces three subarrays of sizes 0, (n-1)/2 - 1, and (n-1)/2,
• at a combined partitioning cost of θ(n) + θ(n-1) = θ(n).
Quicksort Analysis
• Intuitively, the θ(n-1) cost of the bad split can be absorbed into the θ(n) cost of the good split, and the resulting split is good.
• Thus, the running time of quicksort, when levels alternate between good and bad splits, is like the running time for good splits alone: still O(n lg n), but with a slightly larger constant hidden by the O-notation.
Dual Pivot Quick Sort
• Proposed by Vladimir Yaroslavskiy, 2009.
• The Java 7 runtime library (Arrays.sort) uses dual-pivot quicksort.
• Pick two pivots, LP (left) and RP (right), e.g., the first and last elements; if LP > RP, Swap(LP, RP).
• Partition the array into three parts: elements < LP, elements between LP and RP, and elements > RP; then recurse on each part.
• Example:           24  8 42 75 29 77 38 57     (LP = 24, RP = 57)
  After one pass:     8 |24| 42 38 29 |57| 75 77
  Recurse on the three parts: [8], [42 38 29] → [29 38 42], [75 77]
  Sorted result:      8 24 29 38 42 57 75 77
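For illustration only, a hedged Python sketch of the dual-pivot idea described above. This is a simplified three-way partition, not Yaroslavskiy's exact Java 7 implementation; the function name dual_pivot_quicksort is just illustrative.

def dual_pivot_quicksort(A, lo=0, hi=None):
    if hi is None:
        hi = len(A) - 1
    if lo >= hi:
        return
    # use the first and last elements as the two pivots, ensuring LP <= RP
    if A[lo] > A[hi]:
        A[lo], A[hi] = A[hi], A[lo]
    lp, rp = A[lo], A[hi]
    lt, i, gt = lo + 1, lo + 1, hi - 1
    # three-way partition: (< LP) | (LP .. RP) | (> RP)
    while i <= gt:
        if A[i] < lp:
            A[i], A[lt] = A[lt], A[i]
            lt += 1
            i += 1
        elif A[i] > rp:
            A[i], A[gt] = A[gt], A[i]
            gt -= 1                     # the swapped-in element is examined next
        else:
            i += 1
    lt -= 1
    gt += 1
    # move the two pivots into their final positions
    A[lo], A[lt] = A[lt], A[lo]
    A[hi], A[gt] = A[gt], A[hi]
    # recurse on the three parts
    dual_pivot_quicksort(A, lo, lt - 1)
    dual_pivot_quicksort(A, lt + 1, gt - 1)
    dual_pivot_quicksort(A, gt + 1, hi)

data = [24, 8, 42, 75, 29, 77, 38, 57]
dual_pivot_quicksort(data)
print(data)                             # prints [8, 24, 29, 38, 42, 57, 75, 77]

On the example array, the first partitioning pass of this sketch reproduces the intermediate state 8 24 42 38 29 57 75 77 shown above.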
