
Quicksort
Algorithm: Design & Analysis [4]
In the last class…
◼ Recursive procedures
◼ Proving correctness of recursive procedures
◼ Deriving recurrence equations
◼ Solving recurrence equations
  ◼ Guess and prove by induction
  ◼ Recursion tree
  ◼ Master theorem
◼ Divide-and-conquer
Quicksort
◼ Insertion sort
◼ Analysis of insertion sorting algorithm
◼ Lower bound of local comparison-based sorting algorithms
◼ General pattern of divide-and-conquer
◼ Quicksort
◼ Analysis of Quicksort
Comparison-Based Algorithm
◼ The class of “algorithms that sort by comparison of keys”
  ◼ comparing (and, perhaps, copying) the keys
  ◼ no other operations are allowed
◼ The measure of work used for analysis is the number of comparisons.
As Simple as Inserting
◼ Initial status: the array consists of a sorted segment (initially empty) followed by the unsorted segment.
◼ On going: the “vacancy” left by the current element is shifted leftward by comparisons until its place is found.
◼ Final status: the sorted segment covers the whole array; the unsorted segment is empty.
Shifting Vacancy: the Specification
◼ int shiftVac(Element[] E, int vacant, Key x)
◼ Precondition: vacant is nonnegative.
◼ Postconditions: let xLoc be the value returned to the caller, then:
  ◼ Elements in E at indexes less than xLoc are in their original positions and have keys less than or equal to x.
  ◼ Elements in E at positions xLoc+1, …, vacant are greater than x and were shifted up by one position from their positions when shiftVac was invoked.
Shifting Vacancy: Recursion
int shiftVacRec(Element[] E, int vacant, Key x)
  int xLoc;
  if (vacant == 0)
    xLoc = vacant;
  else if (E[vacant-1].key <= x)
    xLoc = vacant;
  else
    E[vacant] = E[vacant-1];
    xLoc = shiftVacRec(E, vacant-1, x);
  return xLoc;
Notes: the recursive call works on a smaller range, so the recursion terminates; its second argument stays nonnegative, so the precondition holds. The worst-case frame stack size is Θ(n).
Shifting Vacancy: Iteration
int shiftVac(Element[] E, int xindex, Key x)
  int vacant, xLoc;
  vacant = xindex;
  xLoc = 0;                      // assume failure
  while (vacant > 0)
    if (E[vacant-1].key <= x)
      xLoc = vacant;             // succeed
      break;
    E[vacant] = E[vacant-1];
    vacant--;                    // keep looking
  return xLoc;
Insertion Sorting: the Algorithm
◼ Input: E (array), n ≥ 0 (size of E)
◼ Output: E, ordered nondecreasingly by keys
◼ Procedure:
void insertSort(Element[] E, int n)
  int xindex;
  for (xindex = 1; xindex < n; xindex++)
    Element current = E[xindex];
    Key x = current.key;
    int xLoc = shiftVac(E, xindex, x);
    E[xLoc] = current;
  return;
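
For concreteness, here is a minimal runnable Java sketch of the two procedures above. It assumes plain int keys and a hypothetical class name InsertSortDemo; both are illustrative choices, not part of the slides.

// Minimal runnable sketch of insertion sort via the "shifting vacancy" idea.
// Assumes plain int keys instead of the slides' Element/Key types.
public class InsertSortDemo {

    // Shift the vacancy at index xindex leftward until the right slot for x is found.
    static int shiftVac(int[] e, int xindex, int x) {
        int vacant = xindex;
        int xLoc = 0;                    // assume the vacancy ends up at index 0
        while (vacant > 0) {
            if (e[vacant - 1] <= x) {    // found the insertion point
                xLoc = vacant;
                break;
            }
            e[vacant] = e[vacant - 1];   // shift the larger element one slot up
            vacant--;                    // keep looking
        }
        return xLoc;
    }

    static void insertSort(int[] e) {
        for (int xindex = 1; xindex < e.length; xindex++) {
            int current = e[xindex];
            e[shiftVac(e, xindex, current)] = current;
        }
    }

    public static void main(String[] args) {
        int[] a = {45, 14, 62, 51, 75, 96, 33, 84, 20};   // sample input
        insertSort(a);
        System.out.println(java.util.Arrays.toString(a)); // [14, 20, 33, 45, 51, 62, 75, 84, 96]
    }
}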
Worst-Case Analysis
◼ To find the right position for x in the sorted segment of i entries, i comparisons must be done in the worst case.
◼ At the beginning, there are n-1 entries in the unsorted segment, so:

$$W(n) \le \sum_{i=1}^{n-1} i = \frac{n(n-1)}{2}$$

◼ The input for which the upper bound is reached does exist, so W(n) ∈ Θ(n²).
Average Behavior
◼ When inserting x into the sorted segment of i entries, x may be located in any one of the i+1 intervals (inclusive), assumed to be equally likely.
◼ Assumptions:
  ◼ All permutations of the keys are equally likely as input.
  ◼ No two entries have the same key.
◼ Note: for the (i+1)th interval (the leftmost), only i comparisons are needed.
Average Complexity
◼ The average number of comparisons in shiftVac to find the location for the ith element (the standalone i term accounts for the leftmost interval, which needs only i comparisons):

$$\frac{1}{i+1}\sum_{j=1}^{i} j + \frac{1}{i+1}\,i = \frac{i}{2} + \frac{i}{i+1} = \frac{i}{2} + 1 - \frac{1}{i+1}$$

◼ For all n-1 insertions:

$$A(n) = \sum_{i=1}^{n-1}\left(\frac{i}{2} + 1 - \frac{1}{i+1}\right) = \frac{n(n-1)}{4} + n - 1 - \sum_{j=2}^{n}\frac{1}{j} = \frac{n(n-1)}{4} + n - \sum_{j=1}^{n}\frac{1}{j} \approx \frac{n^2}{4} + \frac{3n}{4} - \ln n \in \Theta(n^2)$$
Inversion and Sorting
◼ An unsorted sequence E: x1, x2, x3, …, xn-1, xn
◼ If there are no equal keys, then for the purpose of sorting it is a reasonable assumption that {x1, x2, x3, …, xn-1, xn} = {1, 2, 3, …, n-1, n}
◼ <xi, xj> is an inversion if xi > xj but i < j
◼ All the inversions must be eliminated during the process of sorting
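
As an aside (not from the slides), the inversion count of a concrete input can be checked with a brute-force sketch like the one below; InversionCount and countInversions are hypothetical names used only for illustration. The reversed input (n, n-1, …, 1) indeed yields n(n-1)/2 inversions.

// Illustrative brute-force check: count inversions by examining every pair
// (i, j) with i < j and a[i] > a[j]; O(n^2) by design.
public class InversionCount {
    static long countInversions(int[] a) {
        long count = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] > a[j]) {
                    count++;           // <a[i], a[j]> is an inversion
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The reversed sequence (5, 4, 3, 2, 1) has n(n-1)/2 = 10 inversions.
        System.out.println(countInversions(new int[]{5, 4, 3, 2, 1}));
    }
}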
Eliminating Inversions: Worst Case
◼ A local comparison is done between two adjacent elements.
◼ At most one inversion is removed by a local comparison.
◼ There do exist inputs with n(n-1)/2 inversions, such as (n, n-1, …, 3, 2, 1).
◼ So, the worst-case behavior of any sorting algorithm that removes at most one inversion per key comparison must be in Ω(n²).
Eliminating Inversions: Average
◼ Computing the average number of inversions in inputs of size n (n > 1):
  ◼ Transpose pair: x1, x2, x3, …, xn-1, xn and xn, xn-1, …, x3, x2, x1
  ◼ For any i, j (1 ≤ j < i ≤ n), the inversion <xi, xj> is in exactly one sequence of a transpose pair.
  ◼ The number of such pairs <xi, xj> on n distinct integers is n(n-1)/2.
  ◼ So, the average number of inversions over all possible inputs is n(n-1)/4, since exactly n(n-1)/2 inversions appear in each transpose pair.
◼ So, the average behavior of any sorting algorithm that removes at most one inversion per key comparison must be in Ω(n²).
Quicksort: the Strategy
◼ Divide the array segment to be sorted, from [first] to [last], into two parts: a “small” segment, in which every key is less than the pivot, and a “large” segment, in which every key is not less than the pivot. The two parts are then sorted recursively.
QuickSort: the algorithm
◼ Input: Array E and indexes first and last, such that elements E[i] are defined for first ≤ i ≤ last.
◼ Output: E[first], …, E[last] is a sorted rearrangement of the same elements.
◼ The procedure (the splitting point is chosen arbitrarily, here as the first element of the array segment):
void quickSort(Element[] E, int first, int last)
  if (first < last)
    Element pivotElement = E[first];
    Key pivot = pivotElement.key;
    int splitPoint = partition(E, pivot, first, last);
    E[splitPoint] = pivotElement;
    quickSort(E, first, splitPoint-1);
    quickSort(E, splitPoint+1, last);
  return;
Partition: the Strategy
◼ The segment is maintained as three regions: a “small” segment at the left, an unexamined segment in the middle, and a “large” segment at the right.
◼ Expanding directions: the “small” and “large” segments grow toward each other until the unexamined segment is empty.
Partition: the Process
◼ Always keep a vacancy before completion.
◼ The vacancy is at the beginning of the segment; its key is taken as the pivot.
◼ Scanning down from the right, the first key met that is less than the pivot is moved as far as possible, into the current low vacancy; the position it leaves behind is highVac.
◼ Scanning up from the left, the first key met that is larger than the pivot is moved into highVac; the position it leaves behind is lowVac, the vacancy left after moving.
Partition: the Algorithm
◼ Input: Array E; pivot, the key around which to partition; and indexes first and last, such that elements E[i] are defined for first+1 ≤ i ≤ last and E[first] is vacant. It is assumed that first < last.
◼ Output: Returning splitPoint, the elements originally in first+1, …, last are rearranged into two subranges, such that
  ◼ the keys of E[first], …, E[splitPoint-1] are less than pivot, and
  ◼ the keys of E[splitPoint+1], …, E[last] are not less than pivot, and
  ◼ first ≤ splitPoint ≤ last, and E[splitPoint] is vacant.
Partition: the Procedure
int partition(Element[] E, Key pivot, int first, int last)
  int low, high;
  low = first; high = last;
  while (low < high)
    int highVac = extendLargeRegion(E, pivot, low, high);
    int lowVac = extendSmallRegion(E, pivot, low+1, highVac);
    low = lowVac; high = highVac - 1;
  return low;   // this is the splitPoint; highVac has been filled by now
Extending Regions
◼ Specification for extendLargeRegion(Element[] E, Key pivot, int lowVac, int high)
◼ Precondition:
  ◼ lowVac < high
◼ Postcondition:
  ◼ If there are elements in E[lowVac+1], …, E[high] whose key is less than pivot, then the rightmost of them is moved to E[lowVac], and its original index is returned.
  ◼ If there is no such element, lowVac is returned.
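
The slides give only the specification; the sketch below is one possible Java rendering (with int keys) of extendLargeRegion, together with an extendSmallRegion assumed to be its mirror image. These are illustrative assumptions, written as static methods of a hypothetical QuickSortDemo class.

// Part of a hypothetical QuickSortDemo class (int keys for simplicity).

// Extend the "large" region from the right: keys >= pivot stay; the rightmost
// key < pivot (if any) is moved into the low vacancy, and its old index is returned.
static int extendLargeRegion(int[] e, int pivot, int lowVac, int high) {
    for (int curr = high; curr > lowVac; curr--) {
        if (e[curr] < pivot) {
            e[lowVac] = e[curr];   // move it into the vacancy at the low end
            return curr;           // its old slot becomes the new vacancy (highVac)
        }
    }
    return lowVac;                 // no such element: the vacancy stays put
}

// Assumed mirror image: extend the "small" region from the left; the leftmost
// key >= pivot (if any) is moved into the high vacancy, and its old index is returned.
static int extendSmallRegion(int[] e, int pivot, int low, int highVac) {
    for (int curr = low; curr < highVac; curr++) {
        if (e[curr] >= pivot) {
            e[highVac] = e[curr];  // move it into the vacancy at the high end
            return curr;           // its old slot becomes the new vacancy (lowVac)
        }
    }
    return highVac;                // no such element: the vacancy stays put
}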
Example of Quicksort
45 14 62 51 75 96 33 84 20      45 as pivot; the vacancy is E[0]; low = 0, high = 8
20 14 62 51 75 96 33 84 __      20 (rightmost key < 45) moved into the vacancy; highVac = 8
20 14 __ 51 75 96 33 84 62      62 (leftmost key ≥ 45) moved into E[8]; lowVac = 2
                                next loop: low = lowVac = 2, high = highVac - 1 = 7
20 14 33 51 75 96 __ 84 62      33 moved into the vacancy at E[2]; highVac = 6
20 14 33 __ 75 96 51 84 62      51 moved into E[6]; lowVac = 3
                                to be processed in the next loop: low = 3, high = 5
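
Putting the pieces together: the remaining methods of the same hypothetical QuickSortDemo class (a sketch assuming the int-keyed helpers above, not the slides' exact code) reproduce the quickSort and partition procedures and can be run on the example array.

// Remaining methods of the hypothetical QuickSortDemo class sketched above.
static int partition(int[] e, int pivot, int first, int last) {
    int low = first, high = last;
    while (low < high) {
        int highVac = extendLargeRegion(e, pivot, low, high);
        int lowVac = extendSmallRegion(e, pivot, low + 1, highVac);
        low = lowVac;
        high = highVac - 1;
    }
    return low;                          // the split point; this slot is vacant
}

static void quickSort(int[] e, int first, int last) {
    if (first < last) {
        int pivot = e[first];            // E[first] becomes the initial vacancy
        int splitPoint = partition(e, pivot, first, last);
        e[splitPoint] = pivot;           // put the pivot into the split point
        quickSort(e, first, splitPoint - 1);
        quickSort(e, splitPoint + 1, last);
    }
}

public static void main(String[] args) {
    int[] a = {45, 14, 62, 51, 75, 96, 33, 84, 20};    // the example array above
    quickSort(a, 0, a.length - 1);
    System.out.println(java.util.Arrays.toString(a));  // [14, 20, 33, 45, 51, 62, 75, 84, 96]
}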
Divide and Conquer: General Pattern
solve(I)
  n = size(I);
  if (n ≤ smallSize)
    solution = directlySolve(I);
  else
    divide I into I1, …, Ik;
    for each i in {1, …, k}
      Si = solve(Ii);
    solution = combine(S1, …, Sk);
  return solution;

T(n) = B(n) for n ≤ smallSize

$$T(n) = D(n) + \sum_{i=1}^{k} T(\mathrm{size}(I_i)) + C(n) \qquad \text{for } n > \text{smallSize}$$
Workhorse
◼ “Hard division, easy combination”
◼ “Easy division, hard combination”

◼ Usually, the “real work” is in one part.


Worst Case: a Paradox
◼ For a range of k positions, k-1 keys are compared with the pivot (one position is vacant).
◼ If the pivot is the smallest, then the “large” segment has all the remaining k-1 elements, and the “small” segment is empty.
◼ If the elements in the array to be sorted are already in ascending order (the goal!), then the number of comparisons that Partition has to do is:

$$\sum_{k=2}^{n} (k-1) = \frac{n(n-1)}{2} \in \Theta(n^2)$$
Average Analysis
◼ Assumption: all permutations of the keys are equally likely.
◼ A(n) is the average number of key comparisons done for a range of size n.
◼ In the first cycle of Partition, n-1 comparisons are done.
◼ If the split point is E[i] (each i has probability 1/n), the sorting is then executed recursively on the subranges [0, …, i-1] and [i+1, …, n-1].
The Recurrence Equation
◼ The split point E[i] divides E[0], …, E[n-1] into subrange 1 of size i and subrange 2 of size n-1-i, with i ∈ {0, 1, 2, …, n-1}, each value having probability 1/n.
◼ Why does the assumed probability distribution still hold for each subrange? Because no two keys within a subrange have been compared with each other!
◼ So, the average number of key comparisons A(n) is:

$$A(n) = (n-1) + \frac{1}{n}\sum_{i=0}^{n-1}\bigl[A(i) + A(n-1-i)\bigr] \qquad \text{for } n \ge 2$$

and A(1) = A(0) = 0. (The number of key comparisons in the first cycle, finding the splitPoint, is n-1.)
Simplified Recurrence Equation
◼ Note:

$$\sum_{i=0}^{n-1} A(i) = \sum_{i=0}^{n-1} A\bigl((n-1)-i\bigr) \qquad \text{and} \qquad A(0) = 0$$

◼ So:

$$A(n) = (n-1) + \frac{2}{n}\sum_{i=1}^{n-1} A(i) \qquad \text{for } n \ge 2$$

◼ Two approaches to solve the equation:
  ◼ Guess and prove by induction
  ◼ Solve directly
Guess the Solution
◼ A special case as a clue for the guess:
  ◼ Assume that Partition divides the problem range into 2 subranges of about the same size.
  ◼ Then the number of comparisons Q(n) satisfies: Q(n) ≈ n + 2Q(n/2)
◼ Applying the Master Theorem, case 2: Q(n) ∈ Θ(n log n)
◼ Note: here, b = c = 2, so E = lg(b)/lg(c) = 1, and f(n) = n = n^E.
Inductive Proof: A(n) ∈ O(n ln n)
◼ Theorem: A(n) ≤ cn ln n for some constant c, with A(n) defined by the recurrence equation above.
◼ Proof:
  ◼ By induction on n, the number of elements to be sorted. The base case (n = 1) is trivial.
  ◼ Inductive assumption: A(i) ≤ ci ln i for 1 ≤ i < n

$$A(n) = (n-1) + \frac{2}{n}\sum_{i=1}^{n-1} A(i) \le (n-1) + \frac{2}{n}\sum_{i=1}^{n-1} ci\ln i$$

$$\text{Note: } \frac{2}{n}\sum_{i=1}^{n-1} ci\ln i \le \frac{2c}{n}\int_{1}^{n} x\ln x\,dx \approx \frac{2c}{n}\left[\frac{n^2\ln n}{2} - \frac{n^2}{4}\right] = cn\ln n - \frac{cn}{2}$$

$$\text{So, } A(n) \le cn\ln n + n\left(1 - \frac{c}{2}\right) - 1$$

Let c = 2; we have A(n) ≤ 2n ln n.
For Your Reference
$$\int_{1}^{n} x^k \ln x \, dx \approx \frac{1}{k+1}\, n^{k+1} \ln n - \frac{1}{(k+1)^2}\, n^{k+1}$$

Harmonic series:

$$\sum_{i=1}^{n} \frac{1}{i} \approx \ln n + 0.577$$

Bounding a sum by integrals (for a nondecreasing function f):

$$\int_{a-1}^{b} f(x)\,dx \;\le\; \sum_{i=a}^{b} f(i) \;\le\; \int_{a}^{b+1} f(x)\,dx$$
Inductive Proof: A(n) ∈ Ω(n ln n)
◼ Theorem: A(n) ≥ cn ln n for some constant c, for sufficiently large n.
◼ Inductive reasoning (using the inductive assumption A(i) ≥ ci ln i):

$$A(n) = (n-1) + \frac{2}{n}\sum_{i=1}^{n-1} A(i) \ge (n-1) + \frac{2}{n}\sum_{i=1}^{n-1} ci\ln i$$

$$= (n-1) + \frac{2c}{n}\sum_{i=2}^{n} i\ln i - 2c\ln n \ge (n-1) + \frac{2c}{n}\int_{1}^{n} x\ln x\,dx - 2c\ln n$$

$$\ge cn\ln n + \left[(n-1) - c\left(\frac{n}{2} + 2\ln n\right)\right]$$

$$\text{Let } c \le \frac{n-1}{\frac{n}{2} + 2\ln n}, \text{ then } A(n) \ge cn\ln n \qquad \left(\text{Note: } \lim_{n\to\infty}\frac{n-1}{\frac{n}{2} + 2\ln n} = 2\right)$$
Directly Derived Recurrence Equation
We have:

$$A(n) = (n-1) + \frac{2}{n}\sum_{i=1}^{n-1} A(i) \qquad \text{and} \qquad A(n-1) = (n-2) + \frac{2}{n-1}\sum_{i=1}^{n-2} A(i)$$

Combining the two equations, we can remove all A(i) for i = 1, 2, …, n-2:

$$nA(n) - (n-1)A(n-1) = n(n-1) + 2\sum_{i=1}^{n-1} A(i) - (n-1)(n-2) - 2\sum_{i=1}^{n-2} A(i) = 2A(n-1) + 2(n-1)$$

So, nA(n) = (n+1)A(n-1) + 2(n-1).
Solving the Equation
◼ Dividing nA(n) = (n+1)A(n-1) + 2(n-1) by n(n+1), and letting B(n) = A(n)/(n+1):

$$\frac{A(n)}{n+1} = \frac{A(n-1)}{n} + \frac{2(n-1)}{n(n+1)}$$

◼ We have the equation:

$$B(n) = B(n-1) + \frac{2(n-1)}{n(n+1)}, \qquad B(1) = 0$$

$$B(n) = \sum_{i=1}^{n}\frac{2(i-1)}{i(i+1)} = 2\sum_{i=1}^{n}\frac{(i+1)-2}{i(i+1)} = 2\sum_{i=1}^{n}\frac{1}{i} - 4\sum_{i=1}^{n}\frac{1}{i(i+1)} \qquad \left(\text{Note: } \frac{1}{i(i+1)} = \frac{1}{i} - \frac{1}{i+1}\right)$$

$$= 4\sum_{i=1}^{n}\frac{1}{i+1} - 2\sum_{i=1}^{n}\frac{1}{i} = 4\sum_{i=2}^{n+1}\frac{1}{i} - 2\sum_{i=1}^{n}\frac{1}{i} = 4\sum_{i=1}^{n}\frac{1}{i} - 2\sum_{i=1}^{n}\frac{1}{i} + \frac{4}{n+1} - 4 = 2\sum_{i=1}^{n}\frac{1}{i} - \frac{4n}{n+1}$$

So, B(n) ≈ 2(ln n + 0.577) − 4n/(n+1), and A(n) = (n+1)B(n) ≈ 1.386 n lg n − 2.846 n.
(Note: ln n ≈ 0.693 lg n)
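
As a quick sanity check (not from the slides), A(n) can be computed directly from the simplified recurrence and compared with the closed-form approximation above. The class name and the sample values of n below are illustrative choices.

// Computes A(n) = (n-1) + (2/n) * sum_{i=1}^{n-1} A(i), with A(0) = A(1) = 0,
// and compares it with the approximation 1.386 n lg n - 2.846 n derived above.
public class QuicksortAverageCheck {
    public static void main(String[] args) {
        int max = 1_000_000;
        double[] a = new double[max + 1];          // a[n] holds A(n)
        double prefixSum = 0.0;                    // running sum A(1) + ... + A(n-1)
        for (int n = 2; n <= max; n++) {
            prefixSum += a[n - 1];
            a[n] = (n - 1) + 2.0 * prefixSum / n;
        }
        for (int n : new int[]{100, 10_000, 1_000_000}) {
            double approx = 1.386 * n * (Math.log(n) / Math.log(2)) - 2.846 * n;
            System.out.printf("n=%d  A(n)=%.0f  approx=%.0f%n", n, a[n], approx);
        }
    }
}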
Space Complexity
◼ Good news:
◼ Partition is in-place
◼ Bad news:
◼ In the worst case, the depth of recursion will be n-1
◼ So, the largest size of the recursion stack will be in Θ(n)
Home Assignment
◼ pp.208-
◼ 4.6
◼ 4.8-4.9
◼ 4.11-4.12
◼ 4.17-4.18
◼ 4.21-4.22
