Lecture 12
Administrative
Reminder: homework 3 due today
Reminder: Exam 1 Wednesday, Feb 13
1 8.5x11 crib sheet allowed
Both sides, mechanical reproduction okay
You will turn it in with the exam
Review Of Topics
Asymptotic notation
Solving recurrences
Sorting algorithms
Insertion sort
Merge sort
Medians/order statistics
Randomized algorithm
Worst-case algorithm
Structures for dynamic sets
Heap sort
Priority queues
Quick sort
BST basics
Counting sort
Radix sort
Review: Induction
Suppose
S(k) is true for fixed constant k
Often k = 0
S(n) implies S(n+1) for all n >= k
Proof By Induction
Claim: S(n) is true for all n >= k
Basis:
Show formula is true when n = k
Inductive hypothesis:
Assume formula is true for an arbitrary n
Step:
Show that formula is then true for n+1
Induction Example:
Gaussian Closed Form
Prove 1 + 2 + 3 + ... + n = n(n+1) / 2
Basis:
If n = 0, then 0 = 0(0+1) / 2
Inductive hypothesis:
Assume 1 + 2 + 3 + ... + n = n(n+1) / 2
Step (show true for n+1):
1 + 2 + ... + n + (n+1) = (1 + 2 + ... + n) + (n+1)
= n(n+1)/2 + (n+1) = [n(n+1) + 2(n+1)]/2
= (n+1)(n+2)/2 = (n+1)((n+1) + 1) / 2
Induction Example:
Geometric Closed Form
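Prove a^0 + a^1 + ... + a^n = (a^(n+1) - 1)/(a - 1) for all a != 1
Basis: show that a^0 = 1 = (a^1 - 1)/(a - 1)
Inductive hypothesis:
Assume a^0 + a^1 + ... + a^n = (a^(n+1) - 1)/(a - 1)
Step (show true for n+1):
a^0 + a^1 + ... + a^(n+1) = (a^0 + a^1 + ... + a^n) + a^(n+1)
= (a^(n+1) - 1)/(a - 1) + a^(n+1) = (a^(n+1) - 1 + a^(n+2) - a^(n+1))/(a - 1)
= (a^((n+1)+1) - 1)/(a - 1)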
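A quick numeric sanity check can complement the two induction proofs. The sketch below compares each closed form against a brute-force sum for small n. The function names `gauss_sum` and `geometric_sum` are illustrative and not from the lecture, and a check like this is evidence, not a proof.

```python
def gauss_sum(n: int) -> int:
    """Closed form for 1 + 2 + ... + n."""
    return n * (n + 1) // 2


def geometric_sum(a: int, n: int) -> int:
    """Closed form for a^0 + a^1 + ... + a^n; only valid for a != 1."""
    if a == 1:
        raise ValueError("closed form requires a != 1")
    # For integer a the division is exact, so floor division loses nothing.
    return (a ** (n + 1) - 1) // (a - 1)


# Compare both closed forms with brute-force sums for small n.
for n in range(20):
    assert gauss_sum(n) == sum(range(1, n + 1))
    for a in (2, 3, 10, -2):
        assert geometric_sum(a, n) == sum(a ** i for i in range(n + 1))
```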
Big O fact:
A polynomial of degree k is O(n^k)
Lower bound: a function f(n) is Ω(g(n)) if there exist positive constants c and n0 such that
0 <= c g(n) <= f(n) for all n >= n0
Review:
Other Asymptotic Notations
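A function f(n) is o(g(n)) if for every positive constant c there exists an n0 such that f(n) < c g(n) for all n >= n0
A function f(n) is ω(g(n)) if for every positive constant c there exists an n0 such that c g(n) < f(n) for all n >= n0
Intuitively:
o() is like <
O() is like <=
ω() is like >
Ω() is like >=
Θ() is like =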
Effort (merge sort on n elements):
Whole call: T(n)
Constant-time steps: Θ(1) each
Two recursive calls: T(n/2) each
Merge step: Θ(n)
$$
T(n) = \begin{cases}
c & n = 1 \\
a\,T(n/b) + cn & n > 1
\end{cases}
$$
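Master theorem: if T(n) = aT(n/b) + f(n) with a >= 1 and b > 1, then
$$
T(n) = \begin{cases}
\Theta\left(n^{\log_b a}\right) & f(n) = O\left(n^{\log_b a - \varepsilon}\right) \\
\Theta\left(n^{\log_b a} \log n\right) & f(n) = \Theta\left(n^{\log_b a}\right) \\
\Theta\left(f(n)\right) & f(n) = \Omega\left(n^{\log_b a + \varepsilon}\right) \text{ AND } a f(n/b) \le c f(n) \text{ for large } n
\end{cases}
\qquad \varepsilon > 0,\ c < 1
$$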
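For the common special case f(n) = Θ(n^d), the three master theorem cases reduce to comparing d with log_b a. The sketch below covers only that polynomial case; the name `master_theorem` and its string output format are assumptions made for illustration. For polynomial f(n) with d > log_b a, the regularity condition holds automatically because a/b^d < 1.

```python
import math


def master_theorem(a: int, b: int, d: float) -> str:
    """Classify T(n) = a*T(n/b) + Theta(n^d) for a >= 1, b > 1, d >= 0."""
    if a < 1 or b <= 1 or d < 0:
        raise ValueError("need a >= 1, b > 1, d >= 0")
    crit = math.log(a, b)  # critical exponent log_b a
    if math.isclose(d, crit):
        # f(n) matches n^(log_b a): every level costs the same, add a log factor
        return f"Theta(n^{d:g} log n)"
    if d < crit:
        # f(n) is polynomially smaller: the leaves dominate
        return f"Theta(n^{crit:g})"
    # f(n) is polynomially larger: the root dominates
    return f"Theta(n^{d:g})"
```

For example, merge sort has a = 2, b = 2, d = 1, which lands in the middle case.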
Review: Heaps
A heap is a complete binary tree, usually
represented as an array:
[Figure: example heap drawn as a binary tree with root 16; its array form begins A = 16 14 10 8]
Review: Heaps
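To represent a heap as an array: store the root at A[1]; node i has Parent(i) = floor(i/2), Left(i) = 2i, Right(i) = 2i + 1
The heap property: A[Parent(i)] >= A[i] for all nodes i > 1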
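The array layout can be sketched with three index helpers and a heap-property check. This assumes the 1-based indexing that the pseudocode in these slides uses (the loops stop at index 1), so slot 0 of the Python list is left unused. The helper names are illustrative.

```python
def parent(i: int) -> int:
    return i // 2


def left(i: int) -> int:
    return 2 * i


def right(i: int) -> int:
    return 2 * i + 1


def is_max_heap(a: list, heap_size: int) -> bool:
    """Check A[Parent(i)] >= A[i] for every non-root node; a[0] is unused."""
    return all(a[parent(i)] >= a[i] for i in range(2, heap_size + 1))


# Example using the array prefix shown on the slide.
A = [None, 16, 14, 10, 8]
assert is_max_heap(A, 4)
```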
Review: Heapify()
Heapify(): maintain the heap property
Given: a node i in the heap with children l and r
Given: two subtrees rooted at l and r, assumed to be heaps
Action: let the value of the parent node float down so the subtree at i satisfies the heap property
If A[i] < A[l] or A[i] < A[r], swap A[i] with the larger of A[l] and A[r]
Recurse on that subtree
Running time: O(h), h = height of heap = O(lg n)
Review: BuildHeap()
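We can build a heap in a bottom-up manner by running Heapify() on each node from length[A]/2 down to 1 (the remaining nodes are leaves, which are already heaps)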
Review: BuildHeap()
// given an unsorted array A, make A a heap
BuildHeap(A)
{
    heap_size[A] = length[A];
    // nodes past floor(length[A]/2) are leaves, so start at the last internal node
    for (i = floor(length[A]/2) downto 1)
        Heapify(A, i);
}
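Heapify() and BuildHeap() can be sketched in Python as follows. This is one possible rendering, not the lecture's own code: it keeps the 1-based indexing of the pseudocode by leaving `a[0]` unused, and passes `heap_size` explicitly instead of storing it on the array.

```python
def heapify(a: list, i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at i is a max-heap."""
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heap_size and a[l] > a[largest]:
        largest = l
    if r <= heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, heap_size)  # recurse on the subtree that changed


def build_heap(a: list) -> int:
    """Turn a[1..n] into a max-heap in place; returns the heap size n."""
    heap_size = len(a) - 1  # a[0] is an unused placeholder
    for i in range(heap_size // 2, 0, -1):
        heapify(a, i, heap_size)
    return heap_size
```

Each heapify() call costs O(h), so the loop is O(n lg n) by the simple bound; a tighter analysis gives O(n).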
Review:
Implementing Priority Queues
HeapInsert(A, key)
// what's the running time?
{
    heap_size[A]++;
    i = heap_size[A];
    // shift smaller ancestors down until key's slot is found
    while (i > 1 AND A[Parent(i)] < key)
    {
        A[i] = A[Parent(i)];
        i = Parent(i);
    }
    A[i] = key;
}
Review:
Implementing Priority Queues
HeapExtractMax(A)
{
    if (heap_size[A] < 1) { error; }
    max = A[1];
    A[1] = A[heap_size[A]];    // move the last leaf to the root
    heap_size[A]--;
    Heapify(A, 1);             // restore the heap property
    return max;
}
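The two priority-queue operations can be combined into a small Python class. This is a minimal sketch under stated assumptions: the class name `MaxPriorityQueue` and its method names are invented for illustration, slot 0 of the list is unused to keep the 1-based index arithmetic, and heapify is written iteratively. Both operations follow one root-to-leaf path, so each takes O(lg n) time.

```python
class MaxPriorityQueue:
    def __init__(self) -> None:
        self.a = [None]  # a[0] unused; the heap lives in a[1..n]

    def __len__(self) -> int:
        return len(self.a) - 1

    def insert(self, key) -> None:
        self.a.append(key)
        i = len(self)
        # Shift smaller ancestors down until key's slot is found.
        while i > 1 and self.a[i // 2] < key:
            self.a[i] = self.a[i // 2]
            i //= 2
        self.a[i] = key

    def extract_max(self):
        if len(self) < 1:
            raise IndexError("heap underflow")
        top = self.a[1]
        last = self.a.pop()
        if len(self) >= 1:
            self.a[1] = last  # move the last leaf to the root
            self._heapify(1)
        return top

    def _heapify(self, i: int) -> None:
        n = len(self)
        while True:
            l, r, largest = 2 * i, 2 * i + 1, i
            if l <= n and self.a[l] > self.a[largest]:
                largest = l
            if r <= n and self.a[r] > self.a[largest]:
                largest = r
            if largest == i:
                return
            self.a[i], self.a[largest] = self.a[largest], self.a[i]
            i = largest
```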
Review: Quicksort
Another divide-and-conquer algorithm
The array A[p..r] is partitioned into two non-empty subarrays A[p..q] and A[q+1..r]
Invariant: all elements in A[p..q] are <= all elements in A[q+1..r]
The subarrays are recursively sorted by calls to quicksort
Unlike merge sort, no combining step: the two subarrays form an already-sorted array
Review: Partition
Clearly, all the action takes place in the
partition() function
Rearranges the subarray in place
End result:
Two subarrays
All values in first subarray <= all values in second
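Worst case (partition always maximally unbalanced):
T(1) = Θ(1)
T(n) = T(n - 1) + Θ(n)
Works out to
T(n) = Θ(n^2)
Best case (partition always balanced): T(n) = 2T(n/2) + Θ(n), which works out to
T(n) = Θ(n lg n)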
(n)?
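A compact Python sketch of quicksort with a two-subarray partition is shown below. It assumes a Hoare-style partition with the first element as pivot, which matches the A[p..q] / A[q+1..r] split described in these slides; the lecture's exact partition code is not in this text, so treat the details as one plausible version. Indices are 0-based here.

```python
def partition(a: list, p: int, r: int) -> int:
    """Rearrange a[p..r] in place; return q so that a[p..q] <= a[q+1..r]."""
    x = a[p]  # pivot value
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:  # scan from the right for an element <= pivot
            j -= 1
        i += 1
        while a[i] < x:  # scan from the left for an element >= pivot
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort(a: list, p: int = 0, r=None) -> None:
    """Sort a[p..r] in place; no combine step is needed after the recursion."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q)
        quicksort(a, q + 1, r)
```

With this pivot choice, already-sorted input produces the unbalanced split at every level and so triggers the quadratic worst case.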
Sorting Summary
Insertion sort:
Easy to code
Fast on small inputs (less than ~50 elements)
Fast on nearly-sorted inputs
O(n^2) worst case
O(n^2) average (equally-likely inputs) case
O(n^2) reverse-sorted case
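For reference, insertion sort fits in a few lines, which is the "easy to code" point above. This is a standard textbook version written in Python with 0-based indexing, not code taken from the lecture.

```python
def insertion_sort(a: list) -> list:
    """Sort a in place and return it."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift larger elements one slot right to open a gap for key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```

On nearly-sorted input the inner loop exits almost immediately, which is why the algorithm is fast there; on reverse-sorted input every element is shifted all the way to the front.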
Sorting Summary
Merge sort:
Divide-and-conquer:
Split array in half
Recursively sort subarrays
Linear-time merge step
O(n lg n) worst case
Doesn't sort in place
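The three merge sort steps (split, recursive sort, linear-time merge) can be sketched as below. This version returns a new list instead of sorting in place, which illustrates the last bullet; the function name and the list-slicing approach are choices made for this sketch.

```python
def merge_sort(a: list) -> list:
    """Return a new sorted list; the input is left untouched."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left = merge_sort(a[:mid])   # recursively sort each half
    right = merge_sort(a[mid:])
    merged = []
    i = j = 0
    # Linear-time merge: repeatedly take the smaller front element.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```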
Sorting Summary
Heap sort:
Uses the very useful heap data structure
Complete binary tree
Heap property: parent key >= children's keys
O(n lg n) worst case
Sorts in place
Fair amount of shuffling memory around
Sorting Summary
Quick sort:
Divide-and-conquer:
Partition array into two subarrays, recursively sort
All of first subarray < all of second subarray
No merge step needed!
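CountingSort(A, B, k)
    // A[1..n] holds keys in 1..k; B[1..n] receives the sorted output
    for i=1 to k
        C[i] = 0;
    for j=1 to n
        C[A[j]] += 1;
    // after this loop, C[i] = number of keys <= i
    for i=2 to k
        C[i] = C[i] + C[i-1];
    for j=n downto 1
        B[C[A[j]]] = A[j];
        C[A[j]] -= 1;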
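The same counting sort can be written in Python as a sketch. It assumes, like the pseudocode, that every key is an integer in 1..k. The output list is 0-based, so the final position is `c[key] - 1`; slot `c[0]` is unused.

```python
def counting_sort(a: list, k: int) -> list:
    """Return a sorted copy of a, whose keys are integers in 1..k."""
    c = [0] * (k + 1)
    for key in a:
        c[key] += 1             # c[i] = number of keys equal to i
    for i in range(2, k + 1):
        c[i] += c[i - 1]        # c[i] = number of keys <= i
    b = [None] * len(a)
    for key in reversed(a):     # walking backwards keeps the sort stable
        b[c[key] - 1] = key
        c[key] -= 1
    return b
```

The running time is O(n + k): two passes over the input and one over the count array.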
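The ith order statistic in a set of n elements is the ith smallest element
The minimum is thus the 1st order statistic
The maximum is (duh) the nth order statistic
The median is the n/2 order statistic
If n is even, there are 2 medians
The selection problem: find the ith smallest element of a set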
Two algorithms:
A practical randomized algorithm with O(n) expected running time
A deterministic algorithm with O(n) worst-case running time
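[Figure: A[p..r] after partitioning around the pivot at index q; elements left of q are <= A[q], elements right of q are >= A[q]]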
$$
T(n) \le \frac{1}{n} \sum_{k=0}^{n-1} T\left(\max(k,\, n-k-1)\right) + \Theta(n)
\le \frac{2}{n} \sum_{k=n/2}^{n-1} T(k) + \Theta(n)
$$
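The randomized selection algorithm analyzed by this recurrence can be sketched in Python. The sketch assumes a random pivot and a Lomuto-style partition that leaves the pivot at index q; the helper name `_randomized_partition` is invented here, and the function works on a copy so the caller's list is unchanged.

```python
import random


def _randomized_partition(a: list, p: int, r: int) -> int:
    """Partition a[p..r] around a random pivot; return the pivot's index."""
    s = random.randint(p, r)
    a[s], a[r] = a[r], a[s]
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_select(a: list, i: int):
    """Return the i-th smallest element of a (i is 1-based)."""
    if not 1 <= i <= len(a):
        raise IndexError("i out of range")
    a = list(a)
    p, r = 0, len(a) - 1
    while p < r:
        q = _randomized_partition(a, p, r)
        k = q - p + 1  # rank of the pivot within a[p..r]
        if i == k:
            return a[q]
        if i < k:
            r = q - 1  # answer lies left of the pivot
        else:
            p = q + 1  # answer lies right of the pivot
            i -= k
    return a[p]
```

Only one side of the partition is ever explored, which is what brings the expected cost down from O(n lg n) to O(n).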
Review:
Worst-Case Linear-Time Selection
Randomized algorithm works well in practice
What follows is a worst-case linear-time algorithm
Review:
Worst-Case Linear-Time Selection
The algorithm in words:
1. Divide n elements into groups of 5
2. Find median of each group (How? How long?)
3. Use Select() recursively to find median x of the n/5 medians
4. Partition the n elements around x. Let k = rank(x)
5. if (i == k) then return x
   if (i < k) then use Select() recursively to find the ith smallest element in the first partition
   else (i > k) use Select() recursively to find the (i-k)th smallest element in the last partition
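The five steps can be sketched in Python as below. For readability this version departs from the in-place description in two labeled ways: it partitions into three new lists (less than, equal to, greater than x) so duplicates are handled, and it sorts each 5-element group to find its median, which is constant work per group.

```python
def select(a: list, i: int):
    """Return the i-th smallest element of a (i is 1-based)."""
    if not 1 <= i <= len(a):
        raise IndexError("i out of range")
    if len(a) <= 5:
        return sorted(a)[i - 1]
    # Steps 1-2: split into groups of 5 and take each group's median.
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Step 3: recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x.
    less = [v for v in a if v < x]
    equal = [v for v in a if v == x]
    greater = [v for v in a if v > x]
    # Step 5: return x or recurse into the side that holds the answer.
    if i <= len(less):
        return select(less, i)
    if i <= len(less) + len(equal):
        return x
    return select(greater, i - len(less) - len(equal))
```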
Review:
Worst-Case Linear-Time Selection
(Sketch situation on the board)
How many of the 5-element medians are <= x?
How many elements are <= x?
Review:
Worst-Case Linear-Time Selection
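Thus after partitioning around x, step 5 will call Select() on at most 3n/4 elements
The recurrence is therefore:
$$
\begin{aligned}
T(n) &\le T(\lfloor n/5 \rfloor) + T(3n/4) + \Theta(n) \\
&\le T(n/5) + T(3n/4) + \Theta(n) && \lfloor n/5 \rfloor \le n/5 \\
&\le cn/5 + 3cn/4 + \Theta(n) && \text{substitute } T(n) = cn \\
&= 19cn/20 + \Theta(n) && \text{combine fractions} \\
&= cn - \left(cn/20 - \Theta(n)\right) && \text{express in desired form} \\
&\le cn \quad \text{if } c \text{ is big enough}
\end{aligned}
$$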
[Figure: example binary search tree]
TreeWalk(x)
    if (x != NULL)
    {
        TreeWalk(left[x]);
        print(x);
        TreeWalk(right[x]);
    }
This inorder tree walk prints the keys in sorted order
Easy to show by induction on the BST property
Preorder tree walk: print root, then left, then right
Postorder tree walk: print left, then right, then root
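The three walks differ only in where the root is visited. The Python sketch below collects keys into a list instead of printing them, so the orders are easy to compare; the `Node` class and function names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder(x: Optional[Node]) -> List[int]:
    """Left subtree, then root, then right subtree: sorted order for a BST."""
    if x is None:
        return []
    return inorder(x.left) + [x.key] + inorder(x.right)


def preorder(x: Optional[Node]) -> List[int]:
    """Root first, then left subtree, then right subtree."""
    if x is None:
        return []
    return [x.key] + preorder(x.left) + preorder(x.right)


def postorder(x: Optional[Node]) -> List[int]:
    """Left subtree, then right subtree, then root last."""
    if x is None:
        return []
    return postorder(x.left) + postorder(x.right) + [x.key]


# A small BST: 5 at the root, 3 and 8 as children, 2 and 4 under 3.
tree = Node(5, Node(3, Node(2), Node(4)), Node(8))
```

Each walk visits every node once, so all three run in Θ(n) time on an n-node tree (the list concatenation in this sketch adds copying overhead that a print-based walk would not have).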