Module 1: Introduction and Asymptotic Analysis: CS 240 - Data Structures and Data Management
http://www.student.cs.uwaterloo.ca/
~
cs240/s13/
Course newsgroup:
www.piazza.com/class
Main resource: Lectures
Course slides will be posted on the course webpage but are not a
substitute for in-class lectures
Textbooks:
Topics and references for each lecture will be posted on the webpage
Storjohann (CS, UW) CS240 - Module 1 Fall 2012 4 / 48
Mark Breakdown
Final 50%
Midterm 25%
No lates allowed
Lecture slides
Assignments
Course policies
Piazza
A forum that is optimized for asking questions and giving answers.
You must sign up using your uwaterloo email address.
Worst-case complexity of an algorithm: The worst-case running time
of an algorithm A is a function f : Z⁺ → R mapping n (the input size) to
the longest running time for any input instance of size n:
T_A(n) = max{ T_A(I) : Size(I) = n }.
Growth Rates
If f(n) ∈ Θ(g(n)), then the growth rates of f(n) and g(n) are the same.
If f(n) ∈ o(g(n)), then we say that the growth rate of f(n) is
less than the growth rate of g(n).
If f(n) ∈ ω(g(n)), then we say that the growth rate of f(n) is
greater than the growth rate of g(n).
Typically, f(n) may be complicated and g(n) is chosen to be a very
simple function.
Common Growth Rates
Commonly encountered growth rates in the analysis of algorithms include the
following (in increasing order of growth rate):
Θ(1) (constant complexity),
Θ(log n) (logarithmic complexity),
Θ(n) (linear complexity),
Θ(n log n) (linearithmic/pseudo-linear complexity),
Θ(n^2) (quadratic complexity),
Θ(n^3) (cubic complexity),
Θ(2^n) (exponential complexity).
How Growth Rates Affect Running Time
It is interesting to see how the running time is affected when the size of
the problem instance doubles (i.e., n → 2n).
constant complexity: T(n) = c, T(2n) = c.
logarithmic complexity: T(n) = c log n, T(2n) = T(n) + c.
linear complexity: T(n) = cn, T(2n) = 2T(n).
Θ(n log n): T(n) = cn log n, T(2n) = 2T(n) + 2cn.
quadratic complexity: T(n) = cn^2, T(2n) = 4T(n).
cubic complexity: T(n) = cn^3, T(2n) = 8T(n).
exponential complexity: T(n) = c2^n, T(2n) = (T(n))^2/c.
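These doubling relationships can be checked numerically. A quick sketch (hypothetical helper names, taking the constant c = 1 throughout):

```python
import math

# Hypothetical running-time functions with the constant c = 1.
def t_log(n):   return math.log2(n)
def t_lin(n):   return n
def t_nlogn(n): return n * math.log2(n)
def t_quad(n):  return n ** 2
def t_cube(n):  return n ** 3
def t_exp(n):   return 2 ** n

n = 1024
assert t_log(2 * n) == t_log(n) + 1              # logarithmic: add a constant
assert t_lin(2 * n) == 2 * t_lin(n)              # linear: double
assert t_nlogn(2 * n) == 2 * t_nlogn(n) + 2 * n  # n log n: double, plus 2cn
assert t_quad(2 * n) == 4 * t_quad(n)            # quadratic: quadruple
assert t_cube(2 * n) == 8 * t_cube(n)            # cubic: multiply by 8
assert t_exp(2 * n) == t_exp(n) ** 2             # exponential: square (since c = 1)
```

Powers of 2 are used for n so the floating-point comparisons above are exact.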
Complexity vs. Running Time
Suppose that algorithms A1 and A2 both solve some specified problem.
Suppose that the complexity of algorithm A1 is lower than the
complexity of algorithm A2. Then, for sufficiently large problem
instances, A1 will run faster than A2. However, for small problem
instances, A1 could be slower than A2.
Now suppose that A1 and A2 have the same complexity. Then we
cannot determine from this information which of A1 or A2 is faster; a
more delicate analysis of the algorithms A1 and A2 is required.
Example
Suppose an algorithm A1 with linear complexity has running time
T_A1(n) = 75n + 500 and an algorithm A2 with quadratic complexity has
running time T_A2(n) = 5n^2. Then A2 is faster when n < 20 (n = 20 is the
crossover point). When n > 20, A1 is faster.
[Figure: plot of T_A1(n) = 75n + 500 and T_A2(n) = 5n^2 for 0 ≤ n ≤ 25,
crossing at n = 20.]
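The crossover point can be verified directly from the two running-time functions:

```python
# Running times from the example above.
def t1(n): return 75 * n + 500   # linear algorithm A1
def t2(n): return 5 * n ** 2     # quadratic algorithm A2

assert t1(20) == t2(20) == 2000                    # crossover at n = 20
assert all(t2(n) < t1(n) for n in range(1, 20))    # A2 faster for n < 20
assert all(t1(n) < t2(n) for n in range(21, 100))  # A1 faster for n > 20
```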
O-notation and Complexity of Algorithms
It is important not to try to compare algorithms using O-notation alone.
For example, suppose algorithms A1 and A2 both solve the same
problem, A1 has complexity O(n^3) and A2 has complexity O(n^2).
The above statements are perfectly reasonable.
Observe that we cannot conclude that A2 is more efficient than A1 in
this situation! (Why not?)
Techniques for Order Notation
Suppose that f(n) > 0 and g(n) > 0 for all n ≥ n₀. Suppose that
L = lim_{n→∞} f(n)/g(n).
Then
f(n) ∈ o(g(n)) if L = 0
f(n) ∈ Θ(g(n)) if 0 < L < ∞
f(n) ∈ ω(g(n)) if L = ∞.
The required limit can often be computed using l'Hôpital's rule. Note that
this result gives sufficient (but not necessary) conditions for the stated
conclusions to hold.
An Example
Compare the growth rates of log n and n^i (where i > 0 is a real number).
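Applying the limit rule from the previous slide (working with ln n to keep the derivative clean); a sketch:

```latex
L = \lim_{n \to \infty} \frac{\ln n}{n^i}
  = \lim_{n \to \infty} \frac{1/n}{i\,n^{i-1}}   % l'H\^opital's rule
  = \lim_{n \to \infty} \frac{1}{i\,n^i} = 0 ,
```

so ln n ∈ o(n^i), and since log n = ln n / ln 2 the same holds for log n: logarithms grow more slowly than any positive power of n.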
Example
Prove that n(2 + sin n/2) is Θ(n). Note that lim_{n→∞}(2 + sin n/2) does
not exist.
[Figure: plot of n(2 + sin n/2) for 0 ≤ n ≤ 25.]
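Since the limit rule does not apply here, one can argue directly from the definition of Θ; a sketch:

```latex
-1 \le \sin(n/2) \le 1
\;\Longrightarrow\;
n \le n\,(2 + \sin(n/2)) \le 3n \quad \text{for all } n \ge 0,
```

so the constants c₁ = 1, c₂ = 3, n₀ = 1 witness n(2 + sin n/2) ∈ Θ(n), even though lim f(n)/n does not exist. This illustrates that the limit rule gives sufficient but not necessary conditions.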
Relationships between Order Notations
f(n) ∈ Θ(g(n)) ⇔ g(n) ∈ Θ(f(n))
f(n) ∈ O(g(n)) ⇔ g(n) ∈ Ω(f(n))
f(n) ∈ o(g(n)) ⇔ g(n) ∈ ω(f(n))
f(n) ∈ Θ(g(n)) ⇔ f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n))
f(n) ∈ o(g(n)) ⇒ f(n) ∈ O(g(n))
f(n) ∈ o(g(n)) ⇒ f(n) ∉ Ω(g(n))
f(n) ∈ ω(g(n)) ⇒ f(n) ∈ Ω(g(n))
f(n) ∈ ω(g(n)) ⇒ f(n) ∉ O(g(n))
Algebra of Order Notations
Maximum rules: Suppose that f(n) > 0 and g(n) > 0 for all n ≥ n₀.
Then:
O(f(n) + g(n)) = O(max{f(n), g(n)})
Ω(f(n) + g(n)) = Ω(max{f(n), g(n)})
Θ(f(n) + g(n)) = Θ(max{f(n), g(n)})
Transitivity: If f(n) ∈ O(g(n)) and g(n) ∈ O(h(n)) then f(n) ∈ O(h(n)).
Summation Formulae
Arithmetic sequence:
Σ_{i=0}^{n-1} (a + di) = na + dn(n-1)/2 ∈ Θ(n^2).
Geometric sequence:
Σ_{i=0}^{n-1} a·r^i =
  a·(r^n − 1)/(r − 1) ∈ Θ(r^n)   if r > 1
  na ∈ Θ(n)                      if r = 1
  a·(1 − r^n)/(1 − r) ∈ Θ(1)     if 0 < r < 1.
Harmonic sequence:
H_n = Σ_{i=1}^{n} 1/i ∈ Θ(log n)
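The three formulae are easy to sanity-check numerically; a sketch using exact rational arithmetic (sample values a = 3, d = 7, r = 3 chosen arbitrarily):

```python
from fractions import Fraction
from math import log

n = 50

# Arithmetic sequence with a = 3, d = 7.
a, d = 3, 7
assert sum(a + d * i for i in range(n)) == n * a + d * n * (n - 1) // 2

# Geometric sequence with ratio r = 3; Fraction keeps the arithmetic exact.
r = Fraction(3)
assert sum(a * r ** i for i in range(n)) == a * (r ** n - 1) / (r - 1)

# Harmonic number H_n lies within a constant of ln n, consistent with Theta(log n).
H = sum(Fraction(1, i) for i in range(1, n + 1))
assert log(n) < float(H) < log(n) + 1
```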
More Formulae and Miscellaneous Math Facts
Σ_{i=1}^{n} i·r^i = n·r^{n+1}/(r − 1) − (r^{n+1} − r)/(r − 1)^2   (r ≠ 1)
Σ_{i=1}^{∞} 1/i^2 = π^2/6
for k ≥ 0, Σ_{i=1}^{n} i^k ∈ Θ(n^{k+1})
log_b a = 1/log_a b
log_b a = log_c a / log_c b
a^{log_b c} = c^{log_b a}
n! ∈ Θ(n^{n+1/2} e^{−n})
log n! ∈ Θ(n log n)
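A numeric sanity check of the logarithm identities and the Stirling-type bound on n! (sample values chosen arbitrarily):

```python
from math import e, log, factorial, isclose

a, b, c = 5.0, 2.0, 7.0
assert isclose(log(a, b), 1 / log(b, a))          # log_b a = 1 / log_a b
assert isclose(log(a, b), log(a, c) / log(b, c))  # change of base
assert isclose(a ** log(c, b), c ** log(a, b))    # a^{log_b c} = c^{log_b a}

# n! / (n^{n+1/2} e^{-n}) approaches sqrt(2*pi) ~ 2.5066, consistent with
# n! being Theta(n^{n+1/2} e^{-n}).
n = 20
ratio = factorial(n) / (n ** (n + 0.5) * e ** (-n))
assert 2.5 < ratio < 2.6
```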
Techniques for Algorithm Analysis
Two general strategies are as follows.
Use Θ-bounds throughout the analysis and thereby obtain a Θ-bound for the
complexity of the algorithm.
Prove an O-bound and a matching Ω-bound separately to get a
Θ-bound. Sometimes this technique is easier because arguments for
O-bounds may use simpler upper bounds
(and arguments for Ω-bounds may use simpler lower bounds)
than arguments for Θ-bounds do.
Techniques for Loop Analysis
Identify elementary operations that require constant time
(denoted Θ(1) time).
The complexity of a loop is expressed as the sum of the complexities
of each iteration of the loop.
Analyze independent loops separately, and then add the results
(use maximum rules and simplify whenever possible).
If loops are nested, start with the innermost loop and proceed
outwards. In general, this kind of analysis requires evaluation of
nested summations.
Example of Loop Analysis
Test1(n)
1. sum ← 0
2. for i ← 1 to n do
3.     for j ← i to n do
4.         sum ← sum + (i − j)^2
5. sum ← sum^2
6. return sum
Example of Loop Analysis
Test2(A, n)
1. max ← 0
2. for i ← 1 to n do
3.     for j ← i to n do
4.         sum ← 0
5.         for k ← i to j do
6.             sum ← sum + A[k]
7.         if sum > max then
8.             max ← sum
9. return max
Example of Loop Analysis
Test3(n)
1. sum ← 0
2. for i ← 1 to n do
3.     j ← i
4.     while j ≥ 1 do
5.         sum ← sum + i/j
6.         j ← ⌊j/2⌋
7. return sum
Design of MergeSort
Input: Array A of n integers
Step 1: We split A into two subarrays: A_L consists of the first ⌈n/2⌉
elements in A and A_R consists of the last ⌊n/2⌋ elements in A.
Step 2: Recursively run MergeSort on A_L and A_R.
Step 3: After A_L and A_R have been sorted, use a function Merge to
merge them into a single sorted array. This can be done in time Θ(n).
MergeSort
MergeSort(A, n)
1. if n = 1 then
2.     S ← A
3. else
4.     n_L ← ⌈n/2⌉
5.     n_R ← ⌊n/2⌋
6.     A_L ← [A[1], . . . , A[n_L]]
7.     A_R ← [A[n_L + 1], . . . , A[n]]
8.     S_L ← MergeSort(A_L, n_L)
9.     S_R ← MergeSort(A_R, n_R)
10.    S ← Merge(S_L, n_L, S_R, n_R)
11. return S
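A runnable Python sketch of the algorithm above (returning a new sorted list, and taking the length from the list itself rather than as a parameter, which is idiomatic in Python):

```python
def merge(SL, SR):
    """Merge two sorted lists into one sorted list in Theta(n) time."""
    out, i, j = [], 0, 0
    while i < len(SL) and j < len(SR):
        if SL[i] <= SR[j]:
            out.append(SL[i]); i += 1
        else:
            out.append(SR[j]); j += 1
    return out + SL[i:] + SR[j:]

def merge_sort(A):
    if len(A) <= 1:             # base case: n = 1 (or empty)
        return list(A)
    nL = (len(A) + 1) // 2      # ceil(n/2) elements go to A_L
    return merge(merge_sort(A[:nL]), merge_sort(A[nL:]))
```

For instance, merge_sort([5, 2, 4, 6, 1, 3]) returns [1, 2, 3, 4, 5, 6].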
Analysis of MergeSort
Let T(n) denote the time to run MergeSort on an array of length n.
Step 1 takes time Θ(n).
Step 2 takes time T(⌈n/2⌉) + T(⌊n/2⌋).
Step 3 takes time Θ(n).
Altogether:
T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n)   if n > 1
T(n) = Θ(1)                         if n = 1.
Analysis of MergeSort
The mergesort recurrence is
T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n)   if n > 1
T(n) = Θ(1)                         if n = 1.
It is simpler to consider the following exact recurrence, with
unspecified constant factors c and d replacing the Θ's:
T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + cn   if n > 1
T(n) = d                          if n = 1.
Analysis of MergeSort
The following is the corresponding sloppy recurrence
(it has floors and ceilings removed):
T(n) = 2 T(n/2) + cn   if n > 1
T(n) = d               if n = 1.
The exact and sloppy recurrences are identical when n is a power of 2.
The recurrence can easily be solved by various methods when n = 2^j.
The solution has growth rate T(n) ∈ Θ(n log n).
It is possible to show that T(n) ∈ Θ(n log n) for all n
by analyzing the exact recurrence.
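For n = 2^j the sloppy recurrence unrolls directly; a sketch:

```latex
T(n) = 2\,T(n/2) + cn
     = 4\,T(n/4) + 2cn
     = \cdots
     = 2^{j}\,T(1) + j\,cn
     = dn + cn\log_2 n \;\in\; \Theta(n \log n).
```

Each of the j = log₂ n levels of the unrolling contributes exactly cn, plus dn for the n base cases.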