Final Exam: 1. Lots of Ones (20 Points)
Final Exam
Instructions
Please write your answers in the blue book(s). Work alone. Do not use any
notes or books. You have approximately three hours to complete this exam.
Unless otherwise specified, you should justify your answers. Running times
should be given in asymptotic notation. When describing an algorithm, you
should feel free to use as a subroutine any algorithm found in Levitin or the
lectures; you do not need to write down the details of such an algorithm or
re-prove its properties. Similarly, standard mathematical facts can be stated
and used without proof.
Ones(n):
    if n = 0:
        print 1
    else:
        for i = 1 to 2^n:
            Ones(n-1)
(Note: You should read the upper bound of the for loop as $2^n$, that is, 2 raised to the power n.)
Solution:
The recurrence is $T(n) = 2^n T(n-1)$, with $T(0) = 1$. It is not hard to see that
this has the solution
$$T(n) = \prod_{i=1}^{n} 2^i = 2^{\sum_{i=1}^{n} i} = 2^{n(n+1)/2}.$$
Note that this quantity can be written as $\Theta(2^{n(n+1)/2})$ or $2^{\Theta(n^2)}$ but not as
$\Theta(2^{n^2})$, since this last quantity is roughly the square of the correct answer and
is not bounded by a constant multiple of it.
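As a sanity check on the closed form, the procedure can be simulated directly for small n. The sketch below is illustrative only: the function name count_ones is hypothetical, and it returns the number of 1s that would be printed instead of printing them.

```python
def count_ones(n):
    """Count the 1s that Ones(n) would print, by simulating its loop directly."""
    if n == 0:
        return 1  # base case: a single 1 is printed
    total = 0
    for _ in range(2 ** n):  # the loop runs 2^n times
        total += count_ones(n - 1)
    return total


if __name__ == "__main__":
    for n in range(5):
        # Compare the simulated count with the closed form 2^(n(n+1)/2).
        print(n, count_ones(n), 2 ** (n * (n + 1) // 2))
```

The simulation is only practical for very small n, since the count itself grows as $2^{n(n+1)/2}$.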
Solution:
Simple O(n log n) algorithm
Run binary search separately on each row.
O(n) reduce-and-conquer algorithm
Examine A[1][n]. If A[1][n] > x, then x does not appear anywhere in the first
row: recurse on A[2..n][1..n]. If A[1][n] < x, then x does not appear anywhere
in the last column: recurse on A[1..n][1..n − 1]. In either case we reduce the
sum of the number of rows and columns by 1, so if we don’t find x we reach an
empty matrix in at most 2n such steps. Total cost is O(n).
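The reduce-and-conquer search can be sketched as follows. The problem statement is not reproduced in this section, so the sort order is an assumption: the sketch assumes every row is non-increasing from left to right and every column is non-increasing from top to bottom, which is the ordering under which the two elimination rules as stated above are valid. For a matrix sorted in increasing order the two comparisons would swap roles. The function name and the 0-based indexing are also choices made for this example.

```python
def search_sorted_matrix(A, x):
    """Return (row, col) of some occurrence of x in A, or None.

    Assumption (not from the exam text): rows are non-increasing left to
    right and columns are non-increasing top to bottom. Indices are 0-based.
    """
    n = len(A)
    top, right = 0, n - 1  # remaining submatrix: rows top..n-1, cols 0..right
    while top < n and right >= 0:
        v = A[top][right]
        if v == x:
            return (top, right)
        if v > x:
            # v is the smallest entry left in this row, so the row is all > x.
            top += 1
        else:
            # v is the largest entry left in this column, so it is all < x.
            right -= 1
    return None  # at most 2n steps were taken before the matrix emptied


if __name__ == "__main__":
    A = [[9, 7, 5],
         [8, 6, 3],
         [4, 2, 1]]
    print(search_sorted_matrix(A, 6))
    print(search_sorted_matrix(A, 10))
```

Each iteration discards one row or one column, which is where the bound of at most 2n steps comes from.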
There are other ways to get O(n) time from divide-and-conquer algorithms
(e.g., by finding all values less than or greater than x in the middle row by binary
search and then recursing on the approximately $n^2/2$ positions that might still
hold x), but the analysis of these variants is more complicated. There are no
solutions that do better than O(n) in the worst case: see the lower bound below.
Lower bound
It is not possible to solve this problem in $o(n)$ time in the worst case.
Consider an input where every element off the i = j diagonal is either 0 or 2x,
and every element on the diagonal is either x or x + 1. Without reading the
entire diagonal it is impossible to tell if it contains any instances of x.
Solution:
If we assume that the n-th INSERT causes a sort, we get for the total cost the
rather horrible recurrence $T(n) = \Theta(n \lg n) + T(n - \lg n)$, which expands into a
sum of terms of the form $\Theta(n_i \lg n_i)$ where each $n_{i+1} = n_i - \lg n_i$. Solving this
sum exactly may be tricky, but we can reasonably guess that it is dominated by
the larger terms, e.g., those for which $n_i > n/2$ and $\lg n_i > \lg(n/2) = \lg n - 1$.
For these terms we have $\Theta(n_i \lg n_i) = \Theta(n \lg n)$. There are somewhere between
$\frac{n/2}{\lg n}$ and $\frac{n/2}{\lg n - 1}$ such terms, which we can write as $\Theta(n/\lg n)$. Multiplying the
bound on each large term by the bound on the number of large terms gives
$$\Theta\left((n \lg n)(n/\lg n)\right) = \Theta(n^2).$$
This will be our guess for $T(n)$. We have just shown that $T(n) = \Omega(n^2)$, but
we still need to prove that the guess works as an upper bound even if we throw
in the smaller terms.
Suppose that $T(k) \le ak^2$ for $k < n$; then
$$T(n) = \Theta(n \lg n) + T(n - \lg n) \le cn \lg n + a(n - \lg n)^2 = cn \lg n + an^2 - 2an \lg n + a \lg^2 n \le an^2$$
when $a > \frac{1}{2}c$ and $n$ is large enough that $a \lg n \le (2a - c)n$,
giving $T(n) = O(n^2)$.
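The quadratic growth can also be observed numerically by unrolling the recurrence. This is a rough sketch under assumptions that are not in the solution above: the hidden constant in $\Theta(n \lg n)$ is taken to be 1, $\lg n$ is rounded down when subtracting so that n stays an integer, and the cost is taken to be 0 once n drops below 2. The function name is hypothetical.

```python
import math


def unrolled_cost(n):
    """Sum the terms n_i * lg(n_i) of T(n) = n lg n + T(n - lg n).

    Assumptions: constant factor 1, floor(lg n) as the step size, and
    zero cost for n < 2.
    """
    total = 0.0
    while n >= 2:
        total += n * math.log2(n)
        n -= math.floor(math.log2(n))  # at least 1 for n >= 2, so this terminates
    return total


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        # If T(n) = Theta(n^2), this ratio should stay within constant bounds.
        print(n, unrolled_cost(n) / n ** 2)
```

A bounded ratio across increasing n is consistent with the $\Theta(n^2)$ bound, though it is of course not a proof.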
to skip some questions. The reason you might choose to do this is that even
though you can solve any individual question $i$ and obtain the $p_i$ points, some
questions are so frustrating that after solving them you will be unable to solve
any of the following $f_i$ questions.
Suppose that you are given the $p_i$ and $f_i$ values for all the questions as
input. Devise the most efficient algorithm you can for choosing a set of questions
to answer that maximizes your total points, and compute its asymptotic worst-case
running time as a function of $n$.
Solution:
Short version: use dynamic programming.
More details: Let S(i) be the maximum total score that can be obtained
from questions i through n. Any such score is obtained from a set of questions
that either includes $i$ or not; in the first case, the best score is $p_i + S(i + f_i + 1)$,
and in the second case, the best score is $S(i + 1)$. The following loop calculates
the best possible total score, given a large array S with all entries initialized to
0:
    for i = n downto 1:
        S[i] = max(p[i] + S[i+f[i]+1], S[i+1])
    return S[1]
The running time is easily seen to be O(n) (possibly with some additional
tinkering to catch indices off of the end of the array).
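A minimal runnable sketch of this loop follows. The function name best_score, the 0-based input lists, and the clamping of out-of-range indices to a single sentinel entry are choices made for this example; the clamping is one possible form of the tinkering mentioned above.

```python
def best_score(p, f):
    """Maximum total points, where p[i-1] and f[i-1] describe question i.

    S[i] is the best score obtainable from questions i..n (1-based).
    S[n+1] = 0 is a sentinel meaning "no questions left".
    """
    n = len(p)
    S = [0] * (n + 2)
    for i in range(n, 0, -1):
        # Answering question i blocks the next f_i questions; clamp the
        # next usable index so it never runs off the end of S.
        nxt = min(i + f[i - 1] + 1, n + 1)
        S[i] = max(p[i - 1] + S[nxt], S[i + 1])
    return S[1]


if __name__ == "__main__":
    print(best_score([3, 2, 2], [2, 0, 0]))
    print(best_score([10, 1, 1], [2, 0, 0]))
```

Each entry of S is computed once with constant work, matching the O(n) bound.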
Solution:
We’ll show that it’s NP-complete. It’s easy to see that it is in NP (guess $S'$ and
verify). To show that it is NP-hard, reduce from SUBSET-SUM. Given an input
(S, K) to SUBSET-SUM, we will construct a new set T in polynomial time such
that (S, K) is in SUBSET-SUM if and only if T is in AVERAGE-SUM, by first
removing elements of S that are too big, and then adding a single huge new
element to S to bring the average up to K.
Let $S_K = \{x \in S : x \le K\}$. Let $n = |S_K|$ and let $S_K = \{x_1, x_2, \ldots, x_n\}$. Let
$T = \{x_1, x_2, \ldots, x_n, y\}$, where $y = (n + 1)K - \sum_{i=1}^{n} x_i$. Then $|T| = n + 1$ and
$\sum_{i \in T} i = (n + 1)K$, so $\frac{1}{|T|} \sum_{i \in T} i = K$.
If there is a subset $S'$ of $S$ that sums to $K$, then this same $S'$ is a subset
of $T$; it follows that $(S, K)$ in SUBSET-SUM implies $T$ in AVERAGE-SUM.
Conversely, suppose that $T$ is in AVERAGE-SUM, i.e., that there is some subset
$T'$ of $T$ that sums to $K$. If $y > K$ then $y$ is not in $T'$ and $T'$ is a subset of $S$
that sums to $K$. If $y \le K$ then $\sum_{i \in S_K} i \ge nK$, which implies that every $x_i = K$.
In this case there is also a subset $S'$ of $S$ that sums to $K$ (any single $x_i$ will
do). So $T$ in AVERAGE-SUM implies $(S, K)$ in SUBSET-SUM as well, and the
reduction works as advertised.
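The construction can be checked by brute force on small instances. The sketch below rests on assumptions that this section does not state: the elements of S are positive integers, K is positive, a qualifying subset must be non-empty, AVERAGE-SUM asks whether some subset of T sums to the average of T, and $S_K$ is non-empty (the argument's "any single $x_i$ will do" needs at least one element). T is held in a list so that y may coincide with an existing element. All function names are hypothetical.

```python
from itertools import combinations


def has_subset_with_sum(items, target):
    """Brute force: does some non-empty sub-collection of items sum to target?"""
    return any(sum(c) == target
               for r in range(1, len(items) + 1)
               for c in combinations(items, r))


def reduce_to_average_sum(S, K):
    """Build T from (S, K): drop elements above K, then add the padding element y."""
    S_K = [x for x in S if x <= K]
    y = (len(S_K) + 1) * K - sum(S_K)  # forces the average of T to be exactly K
    return S_K + [y]


def in_average_sum(T):
    """Assumed definition: some non-empty subset of T sums to the average of T."""
    total = sum(T)
    if total % len(T) != 0:
        return False  # a non-integer average cannot be an integer subset sum
    return has_subset_with_sum(T, total // len(T))


if __name__ == "__main__":
    S, K = [1, 2, 7], 3
    T = reduce_to_average_sum(S, K)
    print(T, has_subset_with_sum(S, K), in_average_sum(T))
```

Exhaustive search takes exponential time and is only a checking aid; the reduction itself runs in polynomial time.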
CS 365 home page: http://zoo.cs.yale.edu/classes/cs365/
Fri 07 May 2004 17:40:21 EDT final.solutions.tyx Copyright © 1998–2004 by Jim Aspnes