Unit-1 DAA

What is an algorithm?

A clearly specified set of simple instructions to be followed to solve a problem.
✓ Takes a set of values as input, and
✓ produces a value, or set of values, as output.
May be specified:
✓ in English or Hindi or Telugu,
✓ as a computer program, or
✓ as pseudo-code.
Some Vocabulary with an Example

Problem: Sorting of given keys.

Input: A sequence of n keys a1, . . . , an.

Output: The permutation (reordering) of the input sequence such that
a1 ≤ a2 ≤ · · · ≤ an−1 ≤ an.

Instance: An instance of sorting might be an array of names, like {Mike, Bob, Sally, Jill, Jan}, or a list of numbers like {154, 245, 568, 324, 654, 324}.

Algorithm: An algorithm is a procedure that takes any of the possible input instances and transforms it to the desired output.
Which algorithm is better?

Task: find the maximum of A[1..n].

Algorithm 1:
    Sort A into decreasing order.
    Output A[1].

Algorithm 2:
    int i;
    int m = A[1];
    for (i = 2; i <= n; i++)
        if (A[i] > m)
            m = A[i];
    return m;

Which is better?
Who’s the champion?

“Better” = more efficient

✓ Time
✓ Space

Efficiency is measured with asymptotic notation:
O(n), o(n), Ω(n), Θ(n)
Sorting Algorithms

Comparison based:
✓ Bubble Sort
✓ Quick Sort
✓ Insertion Sort
✓ Merge Sort
✓ Heap Sort

Non-comparison based:
✓ Radix Sort
✓ Bucket Sort

Randomized:
✓ Quick Sort

Lower bound on comparison based algorithms
Reference Books

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein: Introduction to Algorithms (also known as the CLRS book).

Sanjoy Dasgupta, Christos Papadimitriou, Umesh Vazirani: Algorithms (Tata McGraw-Hill Publishers).

Ellis Horowitz, Sartaj Sahni, Sanguthevar Rajasekaran: Computer Algorithms.
Definition: Data Structure
• Organization of large amounts of data such that the operations we perform on the data are efficient.
• Criteria for judging an operation (algorithm):
  • CPU time
  • Memory
  • Disk (I/O)
  • Number of message exchanges to perform a task (network)
Pseudo Code
A mixture of natural language and high-level programming concepts used to represent an algorithm.
Characteristics of an algorithm:

· Must take an input.
· Must give some output (yes/no, value, etc.).
· Definiteness – each instruction is clear and unambiguous.
· Finiteness – the algorithm terminates after a finite number of steps.
· Effectiveness – every instruction must be basic, i.e. a simple instruction.
Expectations from an algorithm

• Correctness:
  Correct: the algorithm must produce the correct result.
  Approximation algorithm: the exact solution is not found, but a near-optimal solution can be. (Applied to optimization problems.)
• Less resource usage:
  The algorithm should use fewer resources (time and space).
Why Algorithms?
• Routing uses shortest-path algorithms
• Cryptography uses number theory
• Databases need balanced-tree data structures and algorithms
• Sorting a bunch of numbers
• Shortest paths
• Route optimization
• Searching
• Indexing
• Any and every computation, done efficiently
To analyse an algorithm
• Code and execute it, and measure the actual time.
• What does the total time depend upon?
  § The algorithm
  § The number of inputs
• Count the number of primitive operations: assignment, function call, control transfer, arithmetic, etc.
The solution to all these issues is: asymptotic analysis of algorithms.
Surveys suggest that we have gained more efficiency through improvements in algorithms than through improvements in hardware or processor speed.
Primitive Operations

Basic computations performed by an algorithm:
• Identifiable in pseudocode
• Largely independent of the programming language
• Exact definition not important (we will see why later)

Examples:
– Evaluating an expression
– Assigning a value to a variable
– Indexing into an array
– Calling a method
– Returning from a method
Analysis of Algorithms
Analyzing pseudocode (by counting)
1. For each line of pseudocode, count the number of primitive operations in it. Pay attention to the word "primitive" here; sorting an array is not a primitive operation.
2. Multiply this count by the number of times the line is executed.
3. Sum over all lines.

Counting Primitive Operations
By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size:

Algorithm arrayMax(A, n)              Cost   Times
  currentMax ← A[0]                    2      1
  for i ← 1 to n − 1 do                2      n
      if A[i] > currentMax then        2      n − 1
          currentMax ← A[i]            2      n − 1
      { increment counter i }          2      n − 1
  return currentMax                    1      1
                               Total:  8n − 3
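
As a cross-check, here is a minimal C rendering of arrayMax; the per-line comments mirror the cost column above, while the function signature and the main() driver are illustrative additions, not from the slides:

    #include <stdio.h>

    int arrayMax(const int A[], int n) {
        int currentMax = A[0];                /* 2 ops: index + assign   */
        for (int i = 1; i <= n - 1; i++) {    /* 2 ops per test/increment */
            if (A[i] > currentMax)            /* 2 ops: index + compare  */
                currentMax = A[i];            /* 2 ops: index + assign   */
        }
        return currentMax;                    /* 1 op                    */
    }

    int main(void) {
        int A[] = {3, 7, 2, 9, 4};
        printf("%d\n", arrayMax(A, 5));       /* prints 9 */
        return 0;
    }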
Insertion Sort
Input array: 5 2 4 6 1 3

At each iteration, the array is divided into two sub-arrays:
the left sub-array (sorted) and the right sub-array (unsorted).
INSERTION-SORT
Alg.: INSERTION-SORT(A)
  for j ← 2 to n
      do key ← A[j]
         ⊳ Insert A[j] into the sorted sequence A[1 . . j − 1]
         i ← j − 1
         while i > 0 and A[i] > key
             do A[i + 1] ← A[i]
                i ← i − 1
         A[i + 1] ← key
• Insertion sort – sorts the elements in place
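
For reference, a C sketch of the pseudocode above, using 0-based indexing instead of the slides' A[1..n]; the driver reuses the example array from the earlier slide:

    #include <stdio.h>

    void insertion_sort(int A[], int n) {
        for (int j = 1; j < n; j++) {       /* A[0..j-1] is already sorted */
            int key = A[j];
            int i = j - 1;
            while (i >= 0 && A[i] > key) {  /* shift larger elements right */
                A[i + 1] = A[i];
                i--;
            }
            A[i + 1] = key;                 /* drop key into its slot */
        }
    }

    int main(void) {
        int A[] = {5, 2, 4, 6, 1, 3};
        insertion_sort(A, 6);
        for (int k = 0; k < 6; k++) printf("%d ", A[k]);  /* 1 2 3 4 5 6 */
        printf("\n");
        return 0;
    }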

Correctness of Algorithms by Loop Invariant
Loop Invariant for Insertion Sort
Alg.: INSERTION-SORT(A)
  for j ← 2 to n
      do key ← A[j]
         ⊳ Insert A[j] into the sorted sequence A[1 . . j − 1]
         i ← j − 1
         while i > 0 and A[i] > key
             do A[i + 1] ← A[i]
                i ← i − 1
         A[i + 1] ← key
Invariant: at the start of each iteration of the for loop, the elements in A[1 . . j − 1] are in sorted order.
Proving Loop Invariants
• Proving loop invariants works like induction
• Initialization (base case):
– It is true prior to the first iteration of the loop
• Maintenance (inductive step):
– If it is true before an iteration of the loop, it remains true before
the next iteration
• Termination:
– When the loop terminates, the invariant gives us a useful
property that helps show that the algorithm is correct
– Stop the induction when the loop terminates

Loop Invariant for Insertion Sort
• Initialization:
  – Just before the first iteration, j = 2: the subarray A[1 . . j − 1] = A[1]
    (the element originally in A[1]), which is trivially sorted.
Loop Invariant for Insertion Sort
• Maintenance:
  – The inner while loop moves A[j − 1], A[j − 2], A[j − 3], and so on, one
    position to the right until the proper position for key (which holds the
    value that started out in A[j]) is found.
  – At that point, the value of key is placed into this position.
Loop Invariant for Insertion Sort
• Termination:
  – The outer for loop ends when j = n + 1 ⇒ j − 1 = n.
  – Replacing j − 1 with n in the loop invariant:
    • the subarray A[1 . . n] consists of the n elements originally in
      A[1 . . n], but in sorted order.

Invariant: at the start of each iteration of the for loop, the elements in A[1 . . j − 1] are in sorted order.

• The entire array is sorted!
Analysis of Insertion Sort

INSERTION-SORT(A)                                          cost   times
for j ← 2 to n                                              c1     n
    do key ← A[j]                                           c2     n − 1
       ⊳ Insert A[j] into the sorted sequence A[1..j−1]      0     n − 1
       i ← j − 1                                            c4     n − 1
       while i > 0 and A[i] > key                           c5     Σ_{j=2}^{n} t_j
           do A[i + 1] ← A[i]                               c6     Σ_{j=2}^{n} (t_j − 1)
              i ← i − 1                                     c7     Σ_{j=2}^{n} (t_j − 1)
       A[i + 1] ← key                                       c8     n − 1

t_j: number of times the while-loop test is executed at iteration j

T(n) = c1·n + c2(n−1) + c4(n−1) + c5 Σ_{j=2}^{n} t_j + c6 Σ_{j=2}^{n} (t_j − 1) + c7 Σ_{j=2}^{n} (t_j − 1) + c8(n−1)
Best Case Analysis
• The array is already sorted: in the test “while i > 0 and A[i] > key”,
  A[i] ≤ key the first time the while-loop test is run (when i = j − 1),
  so t_j = 1.

• T(n) = c1·n + c2(n−1) + c4(n−1) + c5(n−1) + c8(n−1)
       = (c1 + c2 + c4 + c5 + c8)·n − (c2 + c4 + c5 + c8)
       = an + b = Θ(n)
Worst Case Analysis
• The array is in reverse sorted order: in the test “while i > 0 and A[i] > key”,
  A[i] > key always holds, so key must be compared with all j − 1 elements to
  the left of the j-th position ⇒ t_j = j.

Using Σ_{j=1}^{n} j = n(n+1)/2, we get Σ_{j=2}^{n} j = n(n+1)/2 − 1 and
Σ_{j=2}^{n} (j − 1) = n(n−1)/2, so we have:

T(n) = c1·n + c2(n−1) + c4(n−1) + c5·(n(n+1)/2 − 1) + c6·n(n−1)/2 + c7·n(n−1)/2 + c8(n−1)
     = an² + bn + c, a quadratic function of n

• T(n) = Θ(n²): the order of growth is n².
Comparisons and Exchanges in Insertion Sort

INSERTION-SORT(A)                                          cost   times
for j ← 2 to n                                              c1     n
    do key ← A[j]                                           c2     n − 1
       ⊳ Insert A[j] into the sorted sequence A[1..j−1]      0     n − 1
       i ← j − 1                                            c4     n − 1
       while i > 0 and A[i] > key   ≈ n²/2 comparisons      c5     Σ_{j=2}^{n} t_j
           do A[i + 1] ← A[i]       ≈ n²/2 exchanges        c6     Σ_{j=2}^{n} (t_j − 1)
              i ← i − 1                                     c7     Σ_{j=2}^{n} (t_j − 1)
       A[i + 1] ← key                                       c8     n − 1
Insertion Sort – Summary
• Advantages
  – Good running time for “almost sorted” arrays: Θ(n)
• Disadvantages
  – Θ(n²) running time in the worst and average case
  – ≈ n²/2 comparisons and exchanges
Thank You!!

Contents

1. Use of asymptotic notation:
   a) Big-Oh,
   b) Omega,
   c) Theta,
   d) Little-Oh, and
   e) Little-Omega notation
2. Analyzing recursive algorithms:
   a) Recurrence relations,
   b) Specifying the runtime of recursive algorithms,
   c) Solving recurrence equations,
   d) Master theorem.
3. Case study: analysing algorithms
Algorithms and their Specification

• An algorithm is a step-by-step procedure for performing some task in a finite amount of time.
• A data structure is a systematic way of organizing and accessing data.
• These concepts are central to computing, but to be able to classify some algorithms and data structures as "good", we must have precise ways of analyzing them.
Time complexity analysis: some general rules

Order of Growth

• Simplifying abstraction: we are interested in the rate of growth, or order of growth, of the running time of the algorithm.
• This allows us to compare algorithms without worrying about implementation performance.
• Usually only the highest-order term, without its constant coefficient, is kept.
• Uses “theta” notation:
  – Best case of insertion sort is Θ(n)
  – Worst case of insertion sort is Θ(n²)
Designing Algorithms

• Several techniques/patterns for designing algorithms exist.
• Incremental approach: builds the solution one component at a time.
• Divide-and-conquer approach: breaks the original problem into several smaller instances of the same problem.
  – Results in recursive algorithms
  – Easy to analyze complexity using proven techniques
Divide-and-Conquer

• The technique (or paradigm) involves:
  – “Divide” stage: express the problem in terms of several smaller subproblems.
  – “Conquer” stage: solve the smaller subproblems by applying the solution recursively; the smallest subproblems may be solved directly.
  – “Combine” stage: construct the solution to the original problem from the solutions of the smaller subproblems.
Merge Sort Strategy

• Divide stage: split the n-element (unsorted) sequence into two subsequences of n/2 elements each.
• Conquer stage: recursively sort the two subsequences (MERGE SORT applied to each half).
• Combine stage: merge the two sorted n/2-element subsequences into one sorted sequence of n elements (the solution) via MERGE.
Merging Sorted Sequences

• MERGE combines the sorted subarrays A[p..q] and A[q+1..r] into one sorted array A[p..r].
• It makes use of two working arrays L and R which initially hold copies of the two subarrays (Θ(n) copying work).
• It uses a sentinel value (∞) as the last element of L and R to simplify the logic.
• The remaining bookkeeping is Θ(1), so the merge loop and hence MERGE runs in Θ(n) time overall.
Merge Sort Algorithm

The divide step costs Θ(1), each of the two recursive calls costs T(n/2), and the merge step costs Θ(n), giving the recurrence:

T(n) = 2T(n/2) + Θ(n)
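To make the strategy concrete, here is a minimal C sketch of merge sort. It merges through a temporary buffer rather than the sentinel trick described above, so the function names and that design choice are assumptions, not the slides' code:

    #include <stdio.h>
    #include <stdlib.h>

    /* merge sorted halves A[p..q] and A[q+1..r] into A[p..r] */
    void merge(int A[], int p, int q, int r, int tmp[]) {
        int i = p, j = q + 1, k = p;
        while (i <= q && j <= r)               /* take the smaller head */
            tmp[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
        while (i <= q) tmp[k++] = A[i++];      /* drain leftovers */
        while (j <= r) tmp[k++] = A[j++];
        for (k = p; k <= r; k++) A[k] = tmp[k];
    }

    void merge_sort(int A[], int p, int r, int tmp[]) {
        if (p < r) {
            int q = (p + r) / 2;               /* divide */
            merge_sort(A, p, q, tmp);          /* conquer left half */
            merge_sort(A, q + 1, r, tmp);      /* conquer right half */
            merge(A, p, q, r, tmp);            /* combine */
        }
    }

    int main(void) {
        int A[] = {5, 2, 4, 7, 1, 3, 2, 6};
        int n = 8;
        int *tmp = malloc(n * sizeof(int));
        merge_sort(A, 0, n - 1, tmp);
        for (int k = 0; k < n; k++) printf("%d ", A[k]);
        printf("\n");
        free(tmp);
        return 0;
    }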

Analysis of Merge Sort

Analysis of the recursive calls: the recursion tree has lg n + 1 levels and each level costs cn, so

T(n) = cn(lg n + 1) = cn·lg n + cn

T(n) is Θ(n lg n)
Asymptotic Notation

a) Big-Oh
b) Omega
c) Theta
d) Little-Oh, and
e) Little-Omega Notation
Symbol Meaning

∃ (there exists): ∃ x: P(x) means there is at least one x such that P(x) is true
∀ (for all; for any; for each): ∀ x: P(x) means P(x) is true for all x
⊃: superset
⊂: subset
Asymptotic Complexity
The running time of an algorithm as a function of input size n, for large n.
Expressed using only the highest-order term in the expression for the exact running time.
– Instead of the exact running time, we say Θ(n²).
Describes the behavior of the function in the limit.
Written using asymptotic notation.
Asymptotic Notation
Θ, O, Ω, o, ω
Defined for functions over the natural numbers.
– Ex: f(n) = Θ(n²).
– Describes how f(n) grows in comparison to n².
Each notation defines a set of functions; in practice it is used to compare the sizes of two functions.
The notations describe different rate-of-growth relations between the defining function and the defined set of functions.
Θ-notation
For a function g(n), we define Θ(g(n)), big-Theta of g of n, as the set:
Θ(g(n)) = { f(n) : ∃ positive constants c1, c2, and n0
            such that ∀ n ≥ n0, we have 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) }
Intuitively: the set of all functions that have the same rate of growth as g(n).
g(n) is an asymptotically tight bound for f(n).
Example
Θ(g(n)) = { f(n) : ∃ positive constants c1, c2, and n0 such that ∀ n ≥ n0, 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) }

n²/2 − 2n = Θ(n²)
What constants for n0, c1, and c2 will work?
Make c1 a little smaller than the leading coefficient, and c2 a little bigger.
To compare orders of growth, look at the leading term.
Exercise: Prove that 2n² = Θ(n²).

• n²/2 − 2n = Θ(n²): c1 = 1/4, c2 = 1/2, and n0 = 8
• 2n² = Θ(n²): c1 = 1, c2 = 3 (or c1 = c2 = 2) and n0 = 0
O-notation
For a function g(n), we define O(g(n)), big-O of g of n, as the set:
O(g(n)) = { f(n) : ∃ positive constants c and n0
            such that ∀ n ≥ n0, we have 0 ≤ f(n) ≤ c·g(n) }
Intuitively: the set of all functions whose rate of growth is the same as or lower than that of g(n).
g(n) is an asymptotic upper bound for f(n).
f(n) = Θ(g(n)) ⇒ f(n) = O(g(n)).
Θ(g(n)) ⊂ O(g(n)).
• Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that f(n) ≤ c·g(n) for n ≥ n0.
• Example: 2n + 10 is O(n)
  – 2n + 10 ≤ c·n
  – (c − 2)·n ≥ 10
  – n ≥ 10/(c − 2)
  – Pick c = 3 and n0 = 10
More Big-Oh Examples
• 7n − 2 is O(n):
  need c > 0 and n0 ≥ 1 such that 7n − 2 ≤ c·n for n ≥ n0;
  this is true for c = 7 and n0 = 1.
• 3n³ + 20n² + 5 is O(n³):
  need c > 0 and n0 ≥ 1 such that 3n³ + 20n² + 5 ≤ c·n³ for n ≥ n0;
  this is true for c = 4 and n0 = 21.
• 3 log n + 5 is O(log n):
  need c > 0 and n0 ≥ 1 such that 3 log n + 5 ≤ c·log n for n ≥ n0;
  this is true for c = 8 and n0 = 2.
Big-Oh Rules (shortcuts)
If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.:
1. Drop lower-order terms
2. Drop constant factors
Use the smallest possible class of functions
– Say “2n is O(n)” instead of “2n is O(n²)”
Use the simplest expression of the class
– Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”
Examples
O(g(n)) = { f(n) : ∃ positive constants c and n0 such that ∀ n ≥ n0, we have 0 ≤ f(n) ≤ c·g(n) }

• 2n² = O(n³): c = 1 and n0 = 2
• 2n² = O(n²): c = 2 and n0 = 0
Ω-notation
For a function g(n), we define Ω(g(n)), big-Omega of g of n, as the set:
Ω(g(n)) = { f(n) : ∃ positive constants c and n0
            such that ∀ n ≥ n0, we have 0 ≤ c·g(n) ≤ f(n) }
Intuitively: the set of all functions whose rate of growth is the same as or higher than that of g(n).
g(n) is an asymptotic lower bound for f(n).
f(n) = Θ(g(n)) ⇒ f(n) = Ω(g(n)).
Θ(g(n)) ⊂ Ω(g(n)).
Example
Ω(g(n)) = { f(n) : ∃ positive constants c and n0 such that ∀ n ≥ n0, we have 0 ≤ c·g(n) ≤ f(n) }

√n = Ω(lg n). Choose c and n0: c = 1 and n0 = 16.
o-notation
For a given function g(n), the set little-o:
o(g(n)) = { f(n) : ∀ c > 0, ∃ n0 > 0 such that ∀ n ≥ n0, we have 0 ≤ f(n) < c·g(n) }.
f(n) becomes insignificant relative to g(n) as n approaches infinity:
lim_{n→∞} [f(n) / g(n)] = 0
g(n) is an upper bound for f(n) that is not asymptotically tight.
Observe the difference in this definition from the previous ones: the inequality must hold for every c > 0, not just for some c.
Ex:
• 2n = o(n²)
but
• 2n² ≠ o(n²)
ω-notation
For a given function g(n), the set little-omega:
ω(g(n)) = { f(n) : ∀ c > 0, ∃ n0 > 0 such that ∀ n ≥ n0, we have 0 ≤ c·g(n) < f(n) }.
f(n) becomes arbitrarily large relative to g(n) as n approaches infinity:
lim_{n→∞} [f(n) / g(n)] = ∞
g(n) is a lower bound for f(n) that is not asymptotically tight.
Ex:
• n²/2 = ω(n)
but
• n²/2 ≠ ω(n²)
Recursive Algorithms & Recurrence Relations

1. Analyzing recursive algorithms:
   a) Recurrence relations,
   b) Specifying the runtime of recursive algorithms,
   c) Solving recurrence equations,
   d) Master theorem.
2. Case study: analyzing algorithms
Analysing Recursive Algorithms

• Iteration is not the only interesting way of solving a problem. Another useful technique, employed by many algorithms, is recursion.
• In this technique, we define a procedure P that is allowed to make calls to itself as a subroutine, provided those calls to P are for solving subproblems of smaller size.
• The subroutine calls to P on smaller instances are called "recursive calls".
• A recursive procedure should always define a base case, which is small enough that the algorithm can solve it directly without using recursion.
• Analyzing the running time of a recursive algorithm takes a bit of additional work, however.
• In particular, to analyze such a running time, we use a recurrence equation: a mathematical statement that the running time of the recursive algorithm must satisfy.
• We introduce a function T(n) that denotes the running time of the algorithm on an input of size n, and we write equations that T(n) must satisfy.
• For example:
  T(n) = 1                 if n = 1
  T(n) = T(n − 1) + 1      if n > 1
Recurrences

• When an algorithm contains a recursive call to itself, its running time can often be described by a recurrence.
• A recurrence is a function defined in terms of:
  • one or more base cases, and
  • itself, with smaller arguments.
• Examples:

  T(n) = 1                       if n = 1
  T(n) = T(n − 1) + 1            if n > 1
  Solution: T(n) = O(n)

  T(n) = 1                       if n = 1
  T(n) = 2T(n/2) + n             if n > 1
  Solution: T(n) = O(n lg n)

  T(n) = 1                       if n = 1
  T(n) = T(n/3) + T(2n/3) + n    if n > 1
  Solution: T(n) = O(n lg n)
Solving recurrence equations.

• The Iterative Substitution Method


• The Recursion Tree
• The Guess-and-Test Method
• The Master Method
Iterative Substitution
In the iterative substitution, or “plug-and-chug,” technique, we iteratively apply the recurrence equation to itself and see if we can find a pattern:

T(n) = 2T(n/2) + bn
     = 2(2T(n/2²) + b(n/2)) + bn
     = 2²T(n/2²) + 2bn
     = 2³T(n/2³) + 3bn
     = 2⁴T(n/2⁴) + 4bn
     = ...
     = 2^i T(n/2^i) + i·bn

Note that the base case, T(n) = b, occurs when 2^i = n. That is, i = log n. So,

T(n) = bn + bn log n

Thus, T(n) is O(n log n).
Solving T(n) = 3T(n − 2) with the iterative method

T(n) = 3T(n − 2)
The first step is to iteratively substitute terms to arrive at a general form:
T(n − 2) = 3T(n − 2 − 2) = 3T(n − 4),
so T(n) = 3 · 3T(n − 4) = 3²T(n − 4),
leading to the general form:
T(n) = 3^k T(n − 2k)
Solve n − 2k = 1 for k, which is the point where the recurrence stops (where we reach T(1)): k = n/2 − 1/2.
Inserting that value into the general form:
T(n) = 3^(n/2 − 1/2) · T(1)
Recursion Tree Method to Solve Recurrence Relations
The recursion tree is another method for solving recurrence relations.
A recursion tree is a tree where each node represents the cost of a certain recursive subproblem.
We sum up the values in all nodes to get the cost of the entire algorithm.
Steps in Recursion Tree Method to Solve Recurrence
Relations

Step-01:
Draw a recursion tree based on the given recurrence
relation.
Step-02:
Determine-
– Cost of each level
– Total number of levels in the recursion tree
– Number of nodes in the last level
– Cost of the last level
Step-03:
Add cost of all the levels of the recursion tree and simplify
the expression so obtained in terms of asymptotic notation.
The Recursion Tree
Draw the recursion tree for the recurrence relation and look for a pattern:

T(n) = b               if n < 2
T(n) = 2T(n/2) + bn    if n ≥ 2

depth   # of T's   size of each   time per level
0       1          n              bn
1       2          n/2            bn
...     ...        ...            ...
i       2^i        n/2^i          bn

Total time = bn + bn log n
(last level plus all previous levels)
Recursion-Tree Method
T(n) = T(n/3) + T(2n/3) + O(n)
Guess-and-Test Method
In the guess-and-test method, we guess a closed-form solution and then try to prove it is true by induction:

T(n) = b                       if n < 2
T(n) = 2T(n/2) + bn log n      if n ≥ 2

Guess: T(n) < cn log n.

T(n) = 2T(n/2) + bn log n
     = 2(c(n/2) log(n/2)) + bn log n
     = cn(log n − log 2) + bn log n
     = cn log n − cn + bn log n

Wrong: we cannot make this last line be less than cn log n.
Guess-and-Test Method, Part 2
Recall the recurrence equation:

T(n) = b                       if n < 2
T(n) = 2T(n/2) + bn log n      if n ≥ 2

Guess #2: T(n) < cn log² n.

T(n) = 2T(n/2) + bn log n
     = 2(c(n/2) log²(n/2)) + bn log n
     = cn(log n − log 2)² + bn log n
     = cn log² n − 2cn log n + cn + bn log n
     ≤ cn log² n,  if c > b.

So, T(n) is O(n log² n).
In general, to use this method, you need to have a good guess and you need to be good at induction proofs.
Master Method
Many divide-and-conquer recurrence equations have the form:

T(n) = c                   if n < d
T(n) = aT(n/b) + f(n)      if n ≥ d

The Master Theorem:
1. If f(n) is O(n^(log_b a − ε)), then T(n) is Θ(n^(log_b a)).
2. If f(n) is Θ(n^(log_b a) · log^k n), then T(n) is Θ(n^(log_b a) · log^(k+1) n).
3. If f(n) is Ω(n^(log_b a + ε)), then T(n) is Θ(f(n)), provided a·f(n/b) ≤ δ·f(n) for some δ < 1.
Master Method, Examples 1–7
Each example below uses the form and the theorem stated above.

Example 1: T(n) = 4T(n/2) + n.
  log_b a = log₂ 4 = 2, so case 1 says T(n) is Θ(n²).

Example 2: T(n) = 2T(n/2) + n log n.
  log_b a = log₂ 2 = 1, so case 2 says T(n) is Θ(n log² n).

Example 3: T(n) = T(n/3) + n log n.
  log_b a = log₃ 1 = 0, so case 3 says T(n) is Θ(n log n).

Example 4: T(n) = 8T(n/2) + n².
  log_b a = log₂ 8 = 3, so case 1 says T(n) is Θ(n³).

Example 5: T(n) = 9T(n/3) + n³.
  log_b a = log₃ 9 = 2, so case 3 says T(n) is Θ(n³).

Example 6: T(n) = T(n/2) + 1 (binary search).
  log_b a = log₂ 1 = 0, so case 2 says T(n) is Θ(log n).

Example 7: T(n) = 2T(n/2) + log n (heap construction).
  log_b a = log₂ 2 = 1, so case 1 says T(n) is Θ(n).
Solve: T(n) = 9T(n/3) + n.

Here a = 9, b = 3, f(n) = n, and n^(log_b a) = n^(log₃ 9) = Θ(n²).
Since f(n) = O(n^(log₃ 9 − ε)) with ε = 1, case 1 of the master theorem applies, and the solution is T(n) = Θ(n²).
Solve: T(n) = T(2n/3) + 1.

Here a = 1, b = 3/2, f(n) = 1, and n^(log_b a) = n⁰ = 1.
Since f(n) = Θ(n^(log_b a)), case 2 of the master theorem applies, so the solution is T(n) = Θ(log n).
Solve: T(n) = 3T(n/4) + n log n.

Here a = 3, b = 4, f(n) = n log n, and n^(log_b a) = n^(log₄ 3) = O(n^0.793).
Since f(n) = Ω(n^(log₄ 3 + ε)) for ε ≈ 0.2, case 3 applies if we can show that
a·f(n/b) ≤ c·f(n) for some c < 1 and all sufficiently large n.
This would mean 3(n/4) log(n/4) ≤ c·n log n.
Setting c = 3/4 causes this condition to be satisfied.
Solution: T(n) = Θ(n log n).
Changing Variables
• Example: consider the recurrence
  T(n) = 2T(√n) + lg n,
  which looks difficult. We can simplify this recurrence, though, with a change of variables. For convenience, we shall not worry about rounding off values, such as √n, to be integers.
• Renaming m = lg n (so n = 2^m) yields
  T(2^m) = 2T(2^(m/2)) + m.
• We can now rename S(m) = T(2^m) to produce the new recurrence
  S(m) = 2S(m/2) + m,
  which has the solution S(m) = O(m lg m).
• Changing back from S(m) to T(n), we obtain
  T(n) = T(2^m) = S(m) = O(m lg m) = O(lg n lg lg n).
Summary
• The running time of algorithm prefixAverages2 is given by the sum of three terms.
• The first and the third terms are O(n), and the second term is O(1).
• Thus the running time of prefixAverages2 is O(n), which is much better than the quadratic-time algorithm prefixAverages1.
Shell Sort – Introduction
More properly, Shell’s Sort.
Created in 1959 by Donald Shell.
Reference:
Donald Shell, “A High-Speed Sorting Procedure”, Communications of the ACM, Vol. 2, No. 7 (July 1959), 30–32.
Originally Shell built his idea on top of Bubble Sort, but it has since been transported over to Insertion Sort.
Shell Sort – General Description

• Essentially a segmented insertion sort:
  – Divides an array into several smaller non-contiguous segments.
  – The distance between successive elements in one segment is called a gap.
  – Each segment is sorted within itself using insertion sort.
  – Then resegment into larger segments (smaller gaps) and repeat the sort.
  – Continue until there is only one segment (gap = 1); the final sort finishes the array.
Shell Sort – Background
General theory:
Makes use of insertion sort, which is fastest when:
  – the array is nearly sorted, and
  – the array contains only a small number of data items.
Shell sort works well because:
  – it always deals with a small number of elements, and
  – elements are moved a long way through the array with each swap, which leaves it more nearly sorted.
Shell Sort – Example

Initial segmenting gap = 4:
80 93 60 12 42 30 68 85 10
10 30 60 12 42 93 68 85 80

Resegmenting gap = 2:
10 30 60 12 42 93 68 85 80
10 12 42 30 60 85 68 93 80

Resegmenting gap = 1:
10 12 42 30 60 85 68 93 80
10 12 30 42 60 68 80 85 93
Gap Sequences for Shell Sort

• The sequence h1, h2, h3, . . . , ht is a sequence of increasing integer values used (from right to left) as gap values.
  – Any sequence will work as long as it is increasing and h1 = 1.
  – For any gap value hk we have A[i] ≤ A[i + hk].
  – An array A for which this is true is hk-sorted.
  – An array which is hk-sorted and is then hk−1-sorted remains hk-sorted.
Shell Sort – Ideal Gap Sequence

• Although any increasing sequence will work (if h1 = 1):
  – Best results are obtained when all values in the gap sequence are relatively prime (the sequence does not share any divisors).
  – Obtaining a relatively prime sequence is often not practical in a program, so practical solutions try to approximate relatively prime sequences.
Shell Sort – Added Gap Sequence

• Donald Knuth, in his discussion of Shell’s Sort, recommended another sequence of gaps:
  • h0 = 1
  • hj+1 = hj * 3 + 1
• Find the first hj > n, then start with hj / 3 (as sketched below).
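
A small C helper, as a sketch, computes Knuth's starting gap; the function name knuth_start_gap is illustrative:

    /* Compute Knuth's starting gap for an array of length n:
       grow h by h = 3*h + 1 until h > n, then back off one step. */
    int knuth_start_gap(int n) {
        int h = 1;
        while (h <= n)       /* 1, 4, 13, 40, 121, ... */
            h = 3 * h + 1;
        return h / 3;        /* first h_j > n, divided by 3 */
    }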
Shell Sort – Time Complexity

• Time complexity: O(n^r) with 1 < r < 2.
• This is better than O(n²) but generally worse than O(n log₂ n).
Shellsort – Code
SHELL-SORT(A, n)
  // we take the gap sequence in the order ⌊n/2⌋, ⌊n/4⌋, ⌊n/8⌋, ..., 1
  for gap = n/2; gap > 0; gap /= 2 do:
      // perform a gapped insertion sort for this gap size
      for i = gap; i < n; i += 1 do:
          temp = A[i]
          // shift earlier gap-sorted elements up until
          // the correct location for A[i] is found
          for j = i; j >= gap && A[j-gap] > temp; j -= gap do:
              A[j] = A[j-gap]
          end for
          // put temp in its correct location
          A[j] = temp
      end for
  end for
end func
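
A runnable C version of the routine above, using the same ⌊n/2⌋, ⌊n/4⌋, ... gap sequence; the main() driver reuses the earlier example array and is illustrative:

    #include <stdio.h>

    void shell_sort(int A[], int n) {
        for (int gap = n / 2; gap > 0; gap /= 2) {
            /* gapped insertion sort for this gap size */
            for (int i = gap; i < n; i++) {
                int temp = A[i];
                int j;
                /* shift earlier gap-sorted elements up until the
                   correct location for A[i] is found */
                for (j = i; j >= gap && A[j - gap] > temp; j -= gap)
                    A[j] = A[j - gap];
                A[j] = temp;   /* put temp in its correct location */
            }
        }
    }

    int main(void) {
        int A[] = {80, 93, 60, 12, 42, 30, 68, 85, 10};
        shell_sort(A, 9);
        for (int k = 0; k < 9; k++) printf("%d ", A[k]);
        printf("\n");
        return 0;
    }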
Shellsort Examples

Sort: 18 32 12 5 38 33 16 2
8 numbers to be sorted; Shell’s increment will be floor(n/2):
floor(8/2) → 4

Increment 4 (segments 1 2 3 4; visualize underlining):
18 32 12 5 38 33 16 2
Step 1) Only look at 18 and 38 and sort them in order;
18 and 38 stay at their current positions because they are in order.
Step 2) Only look at 32 and 33 and sort them in order;
32 and 33 stay at their current positions because they are in order.
Step 3) Only look at 12 and 16 and sort them in order;
12 and 16 stay at their current positions because they are in order.
Step 4) Only look at 5 and 2 and sort them in order;
2 and 5 need to be switched to be in order.
Shellsort Examples (cont’d)

Resulting numbers after the increment-4 pass:
18 32 12 2 38 33 16 5
floor(4/2) → 2

Increment 2 (segments 1 2):
18 32 12 2 38 33 16 5
Step 1) Look at 18, 12, 38, 16 and sort them into their appropriate locations:
12 32 16 2 18 33 38 5
Step 2) Look at 32, 2, 33, 5 and sort them into their appropriate locations:
12 2 16 5 18 32 38 33
Shellsort Examples (cont’d)

floor(2/2) → 1

Increment 1:
12 2 16 5 18 32 38 33
2 5 12 16 18 32 33 38

The last increment, or phase, of Shellsort is basically an insertion sort of the whole array.
Introduction to Algorithms: Quicksort
• Divide and conquer
• Partitioning
• Worst-case analysis
• Intuition
• Randomized quicksort
• Analysis


Quicksort
• Proposed by C.A.R. Hoare in 1962.
• Divide-and-conquer algorithm.
• Sorts “in place” (like insertion sort, but not like merge sort).
• Very practical (with tuning).
Divide and Conquer
Quicksort an n-element array:
1. Divide: Partition the array into two subarrays around a pivot x such that
   elements in the lower subarray ≤ x ≤ elements in the upper subarray.
2. Conquer: Recursively sort the two subarrays.
3. Combine: Trivial.
Key: a linear-time partitioning subroutine.
Partitioning Subroutine
PARTITION(A, p, q)            ⊳ A[p . . q]
  x ← A[p]                    ⊳ pivot = A[p]
  i ← p
  for j ← p + 1 to q
      do if A[j] ≤ x
             then i ← i + 1
                  exchange A[i] ↔ A[j]
  exchange A[p] ↔ A[i]
  return i

Running time: O(n) for n elements.

Invariant: A[p] = x, then A[p+1 . . i] ≤ x, A[i+1 . . j−1] ≥ x, and A[j . . q] is unexamined (?).
Example of Partitioning

Initial array (pivot x = 6), i just after the pivot, j scanning right:
6 10 13 5 8 3 2 11

After 5 is exchanged into the low side:
6 5 13 10 8 3 2 11

After 3 is exchanged into the low side:
6 5 3 10 8 13 2 11

After 2 is exchanged into the low side:
6 5 3 2 8 13 10 11

Finally, the pivot is exchanged into its place (i = 4):
2 5 3 6 8 13 10 11
Pseudocode for Quicksort
QUICKSORT(A, p, r)
  if p < r
      then q ← PARTITION(A, p, r)
           QUICKSORT(A, p, q − 1)
           QUICKSORT(A, q + 1, r)

Initial call: QUICKSORT(A, 1, n)
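
A C sketch of the quicksort above with the PARTITION subroutine, translated to 0-based indexing (pivot = first element); names and the driver are illustrative:

    #include <stdio.h>

    static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    int partition(int A[], int p, int q) {
        int x = A[p];                  /* pivot = A[p] */
        int i = p;
        for (int j = p + 1; j <= q; j++)
            if (A[j] <= x)             /* grow the "<= pivot" region */
                swap(&A[++i], &A[j]);
        swap(&A[p], &A[i]);            /* place pivot between the regions */
        return i;
    }

    void quicksort(int A[], int p, int r) {
        if (p < r) {
            int q = partition(A, p, r);
            quicksort(A, p, q - 1);
            quicksort(A, q + 1, r);
        }
    }

    int main(void) {
        int A[] = {6, 10, 13, 5, 8, 3, 2, 11};
        quicksort(A, 0, 7);
        for (int k = 0; k < 8; k++) printf("%d ", A[k]);
        printf("\n");
        return 0;
    }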

Analysis of Quicksort

• Assume all input elements are distinct.
• In practice, there are better partitioning algorithms for when duplicate input elements may exist.
• Let T(n) = the worst-case running time on an array of n elements.
Worst Case of Quicksort
• Input sorted or reverse sorted.
• Partition around the min or max element.
• One side of the partition always has no elements.

T(n) = T(0) + T(n − 1) + Θ(n)
     = Θ(1) + T(n − 1) + Θ(n)
     = T(n − 1) + Θ(n)
     = Θ(n²)   (arithmetic series)
Worst-Case Recursion Tree
T(n) = T(0) + T(n − 1) + cn

Unrolling the tree: the root costs cn, its non-trivial child costs c(n − 1), then c(n − 2), and so on down a path of height h = n, with a Θ(1) leaf hanging off each level. The per-level costs sum to

Σ_{k=1}^{n} ck = Θ(n²),

so T(n) = Θ(n) + Θ(n²) = Θ(n²).
Best-Case Analysis
(For intuition only!)
If we’re lucky, PARTITION splits the array evenly:
T(n) = 2T(n/2) + Θ(n)
     = Θ(n lg n)   (same as merge sort)

What if the split is always 1/10 : 9/10?
T(n) = T(n/10) + T(9n/10) + Θ(n)
What is the solution to this recurrence?
Analysis of the “Almost-Best” Case
Expanding T(n) = T(n/10) + T(9n/10) + cn as a recursion tree: every full level contributes cn in total ((1/10)cn + (9/10)cn = cn, and likewise further down). The shallowest leaves appear at depth log₁₀ n and the deepest at depth log₁₀/₉ n, so the total cost lies between cn·log₁₀ n and cn·log₁₀/₉ n plus Θ(n) for the leaves.

T(n) = Θ(n lg n)
Randomized Quicksort
IDEA: Partition around a random element.
• Running time is independent of the input order.
• No assumptions need to be made about the input distribution.
• No specific input elicits the worst-case behavior.
• The worst case is determined only by the output of a random-number generator.
Randomized Quicksort Analysis
Let T(n) = the random variable for the running time of randomized quicksort on an input of size n, assuming random numbers are independent.
For k = 0, 1, …, n − 1, define the indicator random variable

X_k = 1 if PARTITION generates a k : n−k−1 split,
      0 otherwise.

E[X_k] = Pr{X_k = 1} = 1/n, since all splits are equally likely, assuming elements are distinct.
Randomized Quicksort
RANDOMIZED-QUICKSORT(A, p, r)
  if p < r then
      q ← RANDOMIZED-PARTITION(A, p, r)
      RANDOMIZED-QUICKSORT(A, p, q − 1)
      RANDOMIZED-QUICKSORT(A, q + 1, r)
_________________________________________
RANDOMIZED-PARTITION(A, p, r)
  i ← RANDOM(p, r)
  swap(A[i], A[p])
  return PARTITION(A, p, r)
____________________________________________________

Initial call: RANDOMIZED-QUICKSORT(A, 1, n)
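
A matching C sketch of RANDOMIZED-PARTITION, assuming the partition() function from the earlier quicksort sketch is in the same file; rand() stands in for RANDOM(p, r):

    #include <stdlib.h>

    int partition(int A[], int p, int q);      /* from the earlier sketch */

    int randomized_partition(int A[], int p, int r) {
        int i = p + rand() % (r - p + 1);      /* uniform index in [p, r] */
        int t = A[i]; A[i] = A[p]; A[p] = t;   /* move random pivot to front */
        return partition(A, p, r);             /* then partition as before */
    }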


Randomizing Quicksort
Randomly permute the elements of the input array before sorting, OR modify the PARTITION procedure:

• At each step of the algorithm we exchange element A[p] with an element chosen at random from A[p…r].
• The pivot element x = A[p] is then equally likely to be any one of the r − p + 1 elements of the subarray.
Randomized Algorithms

No input can elicit worst-case behavior:
• The worst case occurs only if we get “unlucky” numbers from the random-number generator.
• The worst case becomes less likely.
• Randomization can NOT eliminate the worst case, but it can make it less likely!
Quicksort in Practice

• Quicksort is a great general-purpose sorting algorithm.
• Quicksort is typically over twice as fast as merge sort.
• Quicksort behaves well even with caching and virtual memory.
Sorting in Linear Time
Counting sort: no comparisons between elements.
• Input: A[1 . . n], where A[j] ∈ {1, 2, …, k}.
• Output: B[1 . . n], sorted.
• Auxiliary storage: C[1 . . k].
Counting Sort
for i ← 1 to k
    do C[i] ← 0
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1       ⊳ C[i] = |{key = i}|
for i ← 2 to k
    do C[i] ← C[i] + C[i − 1]      ⊳ C[i] = |{key ≤ i}|
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] − 1
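
A C sketch of the pseudocode above for keys in 1..k; the output array is 0-based, and calloc zeroes C, which replaces the first loop:

    #include <stdio.h>
    #include <stdlib.h>

    void counting_sort(const int A[], int B[], int n, int k) {
        int *C = calloc(k + 1, sizeof(int));   /* C[1..k], zeroed */
        for (int j = 0; j < n; j++)
            C[A[j]]++;                          /* C[i] = |{key = i}| */
        for (int i = 2; i <= k; i++)
            C[i] += C[i - 1];                   /* C[i] = |{key <= i}| */
        for (int j = n - 1; j >= 0; j--) {      /* right-to-left keeps it stable */
            B[C[A[j]] - 1] = A[j];              /* -1: 0-based output index */
            C[A[j]]--;
        }
        free(C);
    }

    int main(void) {
        int A[] = {4, 1, 3, 4, 3}, B[5];
        counting_sort(A, B, 5, 4);
        for (int i = 0; i < 5; i++) printf("%d ", B[i]);  /* 1 3 3 4 4 */
        printf("\n");
        return 0;
    }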

Counting-Sort Example
A = [4, 1, 3, 4, 3]   (n = 5, k = 4)

Loop 1 (clear counts):   C = [0, 0, 0, 0]
Loop 2 (count keys):     C = [1, 0, 2, 2]
Loop 3 (prefix sums):    C' = [1, 1, 3, 5]    ⊳ C[i] = |{key ≤ i}|
Loop 4 (place right-to-left):
  j = 5: A[5] = 3 → B[3] = 3,  C' = [1, 1, 2, 5]
  j = 4: A[4] = 4 → B[5] = 4,  C' = [1, 1, 2, 4]
  j = 3: A[3] = 3 → B[2] = 3,  C' = [1, 1, 1, 4]
  j = 2: A[2] = 1 → B[1] = 1,  C' = [0, 1, 1, 4]
  j = 1: A[1] = 4 → B[4] = 4,  C' = [0, 1, 1, 3]

Result: B = [1, 3, 3, 4, 4]
Analysis
for i ← 1 to k                        Θ(k)
    do C[i] ← 0
for j ← 1 to n                        Θ(n)
    do C[A[j]] ← C[A[j]] + 1
for i ← 2 to k                        Θ(k)
    do C[i] ← C[i] + C[i − 1]
for j ← n downto 1                    Θ(n)
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] − 1

Total: Θ(n + k)
Running Time
If k = O(n), then counting sort takes Θ(n) time.
• But, sorting takes Ω(n lg n) time!
• Where’s the fallacy?
Answer:
• Comparison sorting takes Ω(n lg n) time.
• Counting sort is not a comparison sort.
• In fact, not a single comparison between elements occurs!
Stable Sorting
Counting sort is a stable sort: it preserves the input order among equal elements.

A = [4, 1, 3, 4, 3]
B = [1, 3, 3, 4, 4]

Exercise: What other sorts have this property?
Radix Sort
• Origin: Herman Hollerith’s card-sorting machine for the 1890 U.S. Census.
• Digit-by-digit sort.
• Hollerith’s original (bad) idea: sort on the most-significant digit first.
• Good idea: sort on the least-significant digit first with an auxiliary stable sort.
Radix Sort
Main idea:
Break each key into a “digit” representation:
  key = i_d, i_{d−1}, …, i_2, i_1
A "digit" can be a number in any base, a character, etc.

Radix sort:
  for i = 1 to d
      sort “digit” i using a stable sort

Analysis: Θ(d × (stable sort time)), where d is the number of “digits”.
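
A least-significant-digit radix-sort sketch in C, using a stable counting sort on each base-10 digit; the fixed buffer size and d = 3 are assumptions matching the 3-digit example below:

    #include <stdio.h>
    #include <string.h>

    static void counting_sort_by_digit(int A[], int n, int exp) {
        int out[64], C[10] = {0};            /* assumes n <= 64 for the sketch */
        for (int j = 0; j < n; j++)
            C[(A[j] / exp) % 10]++;          /* histogram of this digit */
        for (int i = 1; i < 10; i++)
            C[i] += C[i - 1];                /* prefix sums -> positions */
        for (int j = n - 1; j >= 0; j--)     /* right-to-left: stable */
            out[--C[(A[j] / exp) % 10]] = A[j];
        memcpy(A, out, n * sizeof(int));
    }

    void radix_sort(int A[], int n, int d) {
        for (int i = 0, exp = 1; i < d; i++, exp *= 10)
            counting_sort_by_digit(A, n, exp);   /* least-significant first */
    }

    int main(void) {
        int A[] = {329, 457, 657, 839, 436, 720, 355};
        radix_sort(A, 7, 3);
        for (int j = 0; j < 7; j++) printf("%d ", A[j]);
        printf("\n");
        return 0;
    }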

Operation of Radix Sort

329       720       720       329
457       355       329       355
657       436       436       436
839       457       839       457
436       657       355       657
720       329       457       720
355       839       657       839
(input)   (by 1s)   (by 10s)  (by 100s)
Non-Comparison Sort – Bucket Sort

Assumption: uniform distribution.
  – Input numbers are uniformly distributed in [0, 1).
  – Suppose the input size is n.
Idea:
  – Divide [0, 1) into n equal-sized subintervals (buckets).
  – Distribute the n numbers into the buckets.
  – Expect that each bucket contains few numbers.
  – Sort the numbers in each bucket (insertion sort by default).
  – Then go through the buckets in order, listing the elements.
Can be shown to run in linear time on average.
Bucket Sort

BUCKET-SORT(A)
1. n ← length[A]
2. for i = 1 to n do
3.     insert A[i] into list B[⌊n·A[i]⌋]
4. for i = 0 to n − 1 do
5.     sort list B[i] with insertion sort
6. concatenate the lists B[0], B[1], . . ., B[n−1] together in order
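
A C sketch of BUCKET-SORT for doubles uniform in [0, 1); the fixed-capacity buckets are an assumption to keep the sketch short (a real implementation would use linked lists as in the pseudocode):

    #include <stdio.h>

    #define MAXB 16                        /* assumed max items per bucket */

    void bucket_sort(double A[], int n) {
        double B[64][MAXB];                /* assumes n <= 64 for the sketch */
        int cnt[64] = {0};
        for (int i = 0; i < n; i++) {      /* distribute into bucket floor(n*A[i]) */
            int b = (int)(n * A[i]);
            B[b][cnt[b]++] = A[i];
        }
        int k = 0;
        for (int b = 0; b < n; b++) {      /* insertion-sort each bucket ... */
            for (int i = 1; i < cnt[b]; i++) {
                double key = B[b][i];
                int j = i - 1;
                while (j >= 0 && B[b][j] > key) { B[b][j + 1] = B[b][j]; j--; }
                B[b][j + 1] = key;
            }
            for (int i = 0; i < cnt[b]; i++)   /* ... then concatenate in order */
                A[k++] = B[b][i];
        }
    }

    int main(void) {
        double A[] = {0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68};
        bucket_sort(A, 10);
        for (int i = 0; i < 10; i++) printf("%.2f ", A[i]);
        printf("\n");
        return 0;
    }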
Generalizing Bucket Sort

Q: What if the input numbers are NOT uniformly distributed in [0, 1)?
A: The algorithm can be generalized in different ways; e.g., if the distribution is known, we can design (unequal-sized) bins that will hold a roughly equal number of elements on average.

You might also like