Ch 2 DS&A
Introduction
Algorithm design is all about the mathematical theory behind the design of good programs.
Why study algorithm design? There are many facets to good program design, and good algorithm design is
one of them (and an important one). To be a really complete algorithm designer, it is important to be aware
of programming and machine issues as well. In any important programming project there are two major
types of issues: macro issues and micro issues.
Macro issues involve elements such as how one coordinates the efforts of many programmers
working on a single piece of software, and how one establishes that a complex programming system
satisfies its various requirements. These macro issues are the primary subject of courses on software
engineering.
A great deal of the programming effort on most complex software systems consists of elements whose
programming is fairly mundane (input and output, data conversion, error checking, report generation).
However, there is often a small critical portion of the software, which may involve only tens to hundreds
of lines of code, but where the great majority of computational time is spent. (Or as the old adage goes:
80% of the execution time takes place in 20% of the code.) The micro issues in programming involve
how best to deal with these small critical sections.
It may be very important for the success of the overall project that these sections of code be written in the
most efficient manner possible. An unfortunately common approach to this problem is to first design an
inefficient algorithm and data structure to solve the problem, and then take this poor design and attempt
to fine-tune its performance by applying clever coding tricks or by implementing it on the most expensive
and fastest machines around to boost performance as much as possible. The problem is that if the
underlying design is bad, then often no amount of fine-tuning is going to make a substantial difference.
Before you implement, first be sure you have a good design. This course is all about how to design good
algorithms. Because the lesson cannot be taught in just one course, there are a number of companion
courses that are important as well. CS301 deals with how to design good data structures. This is not
really an independent issue, because most of the fastest algorithms are fast because they use fast data
structures, and vice versa. In fact, many of the courses in the computer science program deal with
efficient algorithms and data structures, but just as they apply to various applications: compilers,
operating systems, databases, artificial intelligence, computer graphics and vision, etc. Thus, a good
understanding of algorithm design is a central element to a good understanding of computer science and
good programming.
We will, however, keep an open eye to implementation issues down the line that will be important for the final implementation. For
example, we will study three fast sorting algorithms this semester, heap-sort, merge-sort, and quick-sort.
From our mathematical analysis, all three have the same asymptotic running time. However, among the three (barring any
extra considerations) quick sort is the fastest on virtually all modern machines. Why? It is the best from
the perspective of locality of reference. However, the difference is typically small (perhaps a 10-20%
difference in running time).
Thus this course is not the last word in good program design, and in fact it is perhaps more accurately just
the first word in good program design. The overall strategy that I would suggest to any programmer
would be to first come up with a few good designs from a mathematical and algorithmic perspective.
Next prune this selection by consideration of practical matters (like locality of reference). Finally
prototype (that is, do test implementations) a few of the best designs and run them on data sets that will
arise in your application for the final fine-tuning. Also, be sure to use whatever development tools
you have, such as profilers (programs which pinpoint the sections of the code that are responsible for
most of the running time).
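As a quick illustration of the profiling advice, here is a sketch using Python's standard cProfile module. The function being profiled is a made-up stand-in for a program's critical section, not anything from these notes:

```python
import cProfile
import io
import pstats

def square_sum(n):
    # A deliberately naive loop, standing in for a program's hot spot.
    total = 0
    for i in range(1, n + 1):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = square_sum(10000)
profiler.disable()

# Report where the running time went, sorted by cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
print(buffer.getvalue())
```

A report like this pinpoints the few functions where most of the time is spent, which is exactly the 20% of the code the adage refers to.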
It is often necessary to design algorithms that are simple, and easily modified if problem parameters and
specifications are slightly changed. Fortunately, most of the algorithms that we will discuss in this class
are quite simple, and are easy to modify subject to small problem variations.
2.9 BRUTE-FORCE ALGORITHM

• Let a point p in 2-dimensional space be given by its integer coordinates, p = (p.x, p.y).
• A point p is said to dominate a point q if p.x ≥ q.x and p.y ≥ q.y.
• Given a set of n points, P = {p1, p2, . . . , pn} in 2-space, a point is said to be maximal if it is not
dominated by any other point in P.
The car selection problem can be modelled this way: with each car we associate an (x, y) pair, where x is the
speed of the car and y is the negation of the price. A high y value means a cheap car and a low y value means an
expensive car. Think of y as the money left in your pocket after you have paid for the car. Maximal
points correspond to the fastest and cheapest cars.
The 2-dimensional maxima problem is thus defined as:
• Given a set of points P = {p1, p2, . . . , pn} in 2-space, output the set of maximal points of P, i.e.,
those points pi such that pi is not dominated by any other point of P.
[Figure: the point set (2,5), (4,4), (4,11), (5,1), (7,7), (7,13), (9,10), (11,5), (12,12), (13,3), (14,10), (15,7)]
Our description of the problem is at a fairly mathematical level. We have intentionally not discussed how
the points are represented. We have not discussed any input or output formats. We have avoided
programming and other software issues.
This English description is clear enough that any (competent) programmer should be able to implement
it. However, if you want to be a bit more formal, it could be written in pseudocode as follows:
MAXIMA(int n, Point P[1 . . . n])
1  for i ← 1 to n
2    do maximal ← true
3       for j ← 1 to n
4         do if (i ≠ j) and (P[i].x ≤ P[j].x) and (P[i].y ≤ P[j].y)
5              then maximal ← false; break
6       if (maximal = true)
7         then output P[i]
There are no formal rules to the syntax of this pseudo code. In particular, do not assume that more detail
is better. For example, I omitted type specifications for the procedure Maxima and the variable maximal,
and I never defined what a Point data type is, since I felt that these are pretty clear from context or just
unimportant details. Of course, the appropriate level of detail is a judgement call. Remember, algorithms
are to be read by people, and so the level of detail depends on your intended audience. When writing
pseudo code, you should omit details that detract from the main ideas of the algorithm, and just go with
the essentials.
You might also notice that I did not insert any checking for consistency. For example, I assumed that the
points in P are all distinct. If there is a duplicate point then the algorithm may fail to output even a single
point. (Can you see why?) Again, these are important considerations for implementation, but we will
often omit error checking because we want to see the algorithm in its simplest form.
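To make the pseudocode concrete, here is a direct transcription in Python (0-indexed, with points as (x, y) tuples; the point set is the one from the figure):

```python
def brute_force_maxima(points):
    """Return the maximal points of the set: those not dominated by any
    other point (q dominates p when q.x >= p.x and q.y >= p.y)."""
    result = []
    for i, (px, py) in enumerate(points):
        maximal = True
        for j, (qx, qy) in enumerate(points):
            # A distinct point that is at least as large in both
            # coordinates dominates (px, py).
            if i != j and px <= qx and py <= qy:
                maximal = False
                break
        if maximal:
            result.append((px, py))
    return result

P = [(2, 5), (4, 4), (4, 11), (5, 1), (7, 7), (7, 13),
     (9, 10), (11, 5), (12, 12), (13, 3), (14, 10), (15, 7)]
print(brute_force_maxima(P))  # [(7, 13), (12, 12), (14, 10), (15, 7)]
```

Note that, just as in the pseudocode, two duplicate points dominate each other and so neither is reported — the failure mode described above.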
Here are a series of figures that illustrate point domination.
[Figure 2.2: Points that dominate (4, 11)]
[Figure 2.3: Points that dominate (9, 10)]
[Figure 2.4: Points that dominate (7, 7)]
[Figure 2.5: The maximal points]

2.10 RUNNING TIME ANALYSIS
Worst-case time is the maximum running time over all (legal) inputs of size n. Let I denote an input
instance, let |I| denote its length, and let T(I) denote the running time of the algorithm on input I.
Then

Tworst(n) = max_{|I|=n} T(I)

Average-case time is the average running time over all inputs of size n. Let p(I) denote the probability
of seeing input I. The average-case time is the weighted sum of running times, with the probabilities as weights:

Tavg(n) = Σ_{|I|=n} p(I) T(I)
We will almost always work with worst-case time. Average-case time is more difficult to compute; it is
difficult to specify a probability distribution on inputs. Worst-case time also specifies an upper limit on the
running time.
Thus we might express the worst-case running time as a pair of nested summations, one for the i-loop
and the other for the j-loop:

T(n) = Σ_{i=1}^{n} ( 2 + Σ_{j=1}^{n} 4 )
     = Σ_{i=1}^{n} (4n + 2)
     = (4n + 2)n = 4n² + 2n
For small values of n, any algorithm is fast enough. What happens when n gets large? Running time
does become an issue. When n is large, the n² term will be much larger than the n term and will dominate
the running time.
We will say that the worst-case running time is Θ(n2). This is called the asymptotic growth rate of the
function. We will discuss this Θ-notation more formally later.
The analysis involved computing a summation. Summations should be familiar, but let us review a bit
here. Given a finite sequence of values a1, a2, . . . , an, their sum a1 + a2 + . . . + an is expressed in
summation notation as

Σ_{i=1}^{n} a_i

If n = 0, then the sum is the additive identity, 0.
Some facts about summations: if c is a constant, then

Σ_{i=1}^{n} c·a_i = c Σ_{i=1}^{n} a_i

and

Σ_{i=1}^{n} (a_i + b_i) = Σ_{i=1}^{n} a_i + Σ_{i=1}^{n} b_i
Arithmetic series:

Σ_{i=1}^{n} i = 1 + 2 + . . . + n = n(n + 1)/2 = Θ(n²)
Quadratic series:

Σ_{i=1}^{n} i² = 1 + 4 + 9 + . . . + n² = (2n³ + 3n² + n)/6 = Θ(n³)
Geometric series: for x ≠ 1,

Σ_{i=0}^{n} x^i = 1 + x + x² + . . . + x^n = (x^(n+1) − 1)/(x − 1)

If 0 < x < 1 then this is Θ(1), and if x > 1, then this is Θ(x^n).
Harmonic series: for n ≥ 1,

H_n = Σ_{i=1}^{n} 1/i = 1 + 1/2 + 1/3 + . . . + 1/n ≈ ln n = Θ(ln n)
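These closed forms are easy to sanity-check numerically; here is a small, throwaway Python script comparing each series against its formula:

```python
import math

n, x = 50, 3

# Arithmetic series: 1 + 2 + ... + n = n(n+1)/2
assert sum(range(1, n + 1)) == n * (n + 1) // 2

# Quadratic series: 1 + 4 + ... + n^2 = (2n^3 + 3n^2 + n)/6
assert sum(i * i for i in range(1, n + 1)) == (2 * n**3 + 3 * n**2 + n) // 6

# Geometric series: 1 + x + ... + x^n = (x^(n+1) - 1)/(x - 1), for x != 1
assert sum(x**i for i in range(0, n + 1)) == (x**(n + 1) - 1) // (x - 1)

# Harmonic series: H_n is close to ln n (they differ by less than 1)
H_n = sum(1.0 / i for i in range(1, n + 1))
assert abs(H_n - math.log(n)) < 1.0
```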
NESTED-LOOPS()
1 for i ← 1 to n
2   do for j ← 1 to 2i
3        do k = j
4           while (k ≥ 0)
5             do k = k − 1
How do we analyze the running time of an algorithm that has complex nested loops? The answer is: we
write out the loops as summations and then solve the summations. To convert loops into summations, we
work from the inside out.
Consider the innermost while loop.
NESTED-LOOPS()
1 for i ← 1 to n
2   do for j ← 1 to 2i
3        do k = j
4           while (k ≥ 0)    ◁
5             do k = k − 1
It is executed for k = j, j − 1, j − 2, . . . , 0. The time spent inside the while loop is constant. Let I(j) be the
number of iterations of the while loop for a given j. Thus

I(j) = Σ_{k=0}^{j} 1 = j + 1
NESTED-LOOPS()
1 for i ← 1 to n
2   do for j ← 1 to 2i    ◁
3        do k = j
4           while (k ≥ 0)
5             do k = k − 1
2.11 ANALYSIS: A HARDER EXAMPLE
Its running time is determined by i. Let M(i) be the time spent in the for loop:

M(i) = Σ_{j=1}^{2i} I(j)
     = Σ_{j=1}^{2i} (j + 1)
     = Σ_{j=1}^{2i} j + Σ_{j=1}^{2i} 1
     = 2i(2i + 1)/2 + 2i
     = 2i² + 3i
NESTED-LOOPS()
1 for i ← 1 to n    ◁
2   do for j ← 1 to 2i
3        do k = j
4           while (k ≥ 0)
5             do k = k − 1
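The closed forms I(j) = j + 1 and M(i) = 2i² + 3i can be checked by simply simulating the loops in Python (a throwaway verification, not part of the algorithm):

```python
def inner_while_steps(j):
    # The while loop runs for k = j, j-1, ..., 0: exactly j + 1 iterations.
    k, steps = j, 0
    while k >= 0:
        steps += 1
        k -= 1
    return steps

def middle_loop_steps(i):
    # Total while-loop iterations over j = 1 .. 2i.
    return sum(inner_while_steps(j) for j in range(1, 2 * i + 1))

# I(j) = j + 1 and M(i) = 2i^2 + 3i, as derived above.
assert all(inner_while_steps(j) == j + 1 for j in range(0, 50))
assert all(middle_loop_steps(i) == 2 * i * i + 3 * i for i in range(1, 50))
```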
Recall the problem of 2-dimensional maxima: given a set of n points, P = {p1, p2, . . . , pn} in 2-space, a point is said to be maximal if it is not dominated by any other
point in P. The problem is to output all the maximal points of P. We introduced a brute-force algorithm
that ran in Θ(n²) time. It operated by comparing all pairs of points. Is there an approach that is
significantly better?
The problem with the brute-force algorithm is that it uses no intelligence in pruning out decisions. For
example, once we know that a point pi is dominated by another point pj, we do not need to use pi for
eliminating other points. This follows from the fact that the dominance relation is transitive: if pj dominates
pi and pi dominates ph, then pj also dominates ph, so pi is not needed.
The question is whether we can make a significant improvement in the running time. Here is an idea for
how we might do it. We will sweep a vertical line across the plane from left to right. As we sweep this
line, we will build a structure holding the maximal points lying to the left of the sweep line. When the
sweep line reaches the rightmost point of P, then we will have constructed the complete set of maxima.
This approach of solving geometric problems by sweeping a line across the plane is called plane sweep.
Although we would like to think of this as a continuous process, we need some way to perform the plane
sweep in discrete steps. To do this, we will begin by sorting the points in increasing order of their
x-coordinates. For simplicity, let us assume that no two points have the same x-coordinate. (This limiting
assumption is actually easy to overcome, but it is good to work with the simpler version, and save the
messy details for the actual implementation.) Then we will advance the sweep-line from point to point in
n discrete steps. As we encounter each new point, we will update the current list of maximal points.
As we sweep, we will store the existing maximal points in a list. The points that a new point pi dominates
will appear at the end of the list, because the points are sorted by x-coordinate. Every maximal point with
y-coordinate less than that of pi is eliminated from the computation. Since we add maximal points onto
the end of the list and delete them from the end of the list, we can use a stack to store the maximal points.
The point at the top of the stack will have the highest x-coordinate.
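The sweep can be sketched in Python as follows (a sketch assuming distinct x-coordinates, as in the text; the point set is the one used in the sweep figures):

```python
def plane_sweep_maxima(points):
    """Plane-sweep maxima: process points in increasing x order,
    keeping the maximal points seen so far on a stack."""
    stack = []
    for p in sorted(points):  # sort by x-coordinate (assumed distinct)
        # The new point has the largest x so far, so it dominates every
        # stored point whose y-coordinate it matches or exceeds; pop them.
        while stack and stack[-1][1] <= p[1]:
            stack.pop()
        stack.append(p)
    return stack

P = [(2, 5), (3, 13), (4, 11), (5, 1), (7, 7), (9, 10),
     (10, 5), (12, 12), (13, 3), (14, 10), (15, 7)]
print(plane_sweep_maxima(P))  # [(3, 13), (12, 12), (14, 10), (15, 7)]
```

Each point is pushed once and popped at most once, so the sweep itself takes linear time after the initial sort.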
Here are a series of figures that illustrate the plane sweep. The figures also show the contents of the stack.
[Figure 2.6: Sweep line at (2, 5)]
[Figure 2.7: Sweep line at (3, 13)]
[Figure 2.8: Sweep line at (4, 11)]
[Figure 2.9: Sweep line at (5, 1)]
[Figure 2.10: Sweep line at (7, 7)]
[Figure 2.11: Sweep line at (9, 10)]
[Figure 2.12: Sweep line at (10, 5)]
[Figure 2.13: Sweep line at (12, 12)]
[Figure: The completed sweep; the stack contains the maximal points (3, 13), (12, 12), (14, 10), (15, 7)]
The ratio of the two running times is

n² / (n log n) = n / log n

n        log n    n/log n
100      7        15
1000     10       100
10000    13       752
100000   17       6021
1000000  20       50171
For n = 1,000,000, if plane-sweep takes 1 second, the brute-force algorithm will take about 14 hours! From this
we get an idea of the importance of asymptotic analysis. It tells us which algorithm is better for large
values of n. As we mentioned before, if n is not very large, then almost any algorithm will be fast. But
efficient algorithm design is most important for large inputs, and the general rule of computing is that
input sizes continue to grow until people can no longer tolerate the running times. Thus, by designing
algorithms efficiently, you make it possible for the user to run large inputs in a reasonable amount of time.
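The table's entries can be reproduced approximately with a few lines of Python (assuming base-2 logarithms; the notes do not state the base, and the printed entries are rounded):

```python
import math

# n, log n, and the speed-up ratio n / log n for growing input sizes.
for n in [100, 1000, 10000, 100000, 1000000]:
    log_n = math.log2(n)
    print(f"{n:>8}  {log_n:8.2f}  {n / log_n:10.1f}")
```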