MIT6 046JS15 Pset4sols
(a) [5 points] Describe your data structure. Include clear invariants describing its key
properties. Hint: Use an actual queue plus auxiliary data structure(s) for bookkeeping.
Solution: For example, we might use a FIFO queue Main and an auxiliary linked
list, Min, satisfying the following invariants:
1. Item x appears in Min if and only if x is the minimum element of some tail-
segment of Main.
2. Min is sorted in increasing order, front to back.
Solution:
2 Problem Set 4 Solutions
ENQUEUE(x)
1 Add x to the end of Main.
2 Starting at the end of the list, examine elements of Min and remove those that are larger than x; stop at the first element that is not larger than x.
3 Add x to the end of Min.
DEQUEUE()
1 Remove and return the first element x of Main.
2 If x is the first element in Min, remove it.
FIND-MIN()
1 Return the first element of Min.
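The three procedures above can be sketched in Python as follows. This is a minimal illustration rather than the course's reference code: the class name `MinQueue` and the attribute names `main` and `mins` are hypothetical, and both lists are stored as `collections.deque` objects so that removals at either end take constant time.

```python
from collections import deque


class MinQueue:
    """FIFO queue with constant-time find_min, following the Main/Min description."""

    def __init__(self):
        self.main = deque()  # the actual queue (Main)
        self.mins = deque()  # the tail-mins of Main, smallest at the front (Min)

    def enqueue(self, x):
        self.main.append(x)
        # Remove entries larger than x from the back of Min: every tail-segment
        # now contains x, so those entries can no longer be tail-mins.
        while self.mins and self.mins[-1] > x:
            self.mins.pop()
        self.mins.append(x)

    def dequeue(self):
        x = self.main.popleft()
        # x can only appear in Min as its first element.
        if self.mins[0] == x:
            self.mins.popleft()
        return x

    def find_min(self):
        return self.mins[0]


q = MinQueue()
for v in [4, 2, 5, 3]:
    q.enqueue(v)
smallest = q.find_min()  # Min now holds 2 and 3, so the minimum is 2
```

Calling `dequeue` or `find_min` on an empty queue raises `IndexError` in this sketch; the pseudocode does not say how that case should be handled.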
(c) [5 points] Prove that your operations give the right answers. Hint: You may want to
prove that their correctness follows from your data structure invariants. In that case
you should also sketch arguments for why the invariants hold.
Solution: This solution is for the choices of data structure and procedures given
above; your own may be different.
The only two operations that return answers are DEQUEUE and FIND-MIN. DEQUEUE
returns the first element of Main, which is correct because Main maintains the actual
queue. FIND-MIN returns the first element of Min. This is the smallest element of
Min because Min is sorted in increasing order (by Invariant 2 above). The smallest
element of Main is the minimum of the tail-segment consisting of all of Main, which
is the smallest of all the tail-mins of Main. This is the smallest element in Min (by
Invariant 1). Therefore, FIND-MIN returns the smallest element of Main, as needed.
Proofs for the invariants: The invariants are vacuously true in the initial state. We
argue that ENQUEUE and DEQUEUE preserve them; FIND-MIN does not affect them.
It is easy to see that both operations preserve Invariant 2: Since a DEQUEUE operation
can only remove an element from Min, the order of the remaining elements is
preserved. For ENQUEUE(x), we remove elements from the end of Min until we find
one that is less than x, and then add x to the end of Min. Because Min was in sorted
order prior to the ENQUEUE(x), when we stop removing elements, we know that all
the remaining elements in Min are less than x. Since we do not change the order of
any elements previously in Min, all the elements are still in sorted order.
So it remains to prove Invariant 1. There are two directions:
• The new Min list contains all the tail-mins.
ENQUEUE(x): x is the minimum element of the singleton tail-segment of Main and
it is added to Min. Additionally, since every tail-segment now contains the value
x, all elements with value greater than x can no longer be tail-mins. So, after their
removal, Min still contains all the tail-mins.
DEQUEUE of element x: The only element that could be removed from Min is x.
It is OK to remove x, because it can no longer be a tail-min since it is no longer in
Main. All other tail-mins remain in Min.
• All elements of the new Min are tail-mins.
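ENQUEUE(x): x is the only value that is added to Min. It is the min of the singleton
tail-segment. Every other element y remaining in the Min list was a tail-min before
the ENQUEUE and is less than x. So y is still a tail-min after the ENQUEUE.
DEQUEUE of element x: No element is added to Min. Every element y remaining in Min
was the minimum of some tail-segment before the DEQUEUE. If that tail-segment began
at x, then y is also the minimum of the tail-segment that begins just after x, so y
is still a tail-min after the DEQUEUE.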
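The invariants can also be checked empirically. The sketch below is an illustration, not part of the proof: it replays a random mix of ENQUEUE and DEQUEUE operations on distinct keys and, after every operation, compares Min against a brute-force list of tail-mins. The function names `tail_mins` and `check_invariants`, the 40% dequeue rate, and the fixed seed are all assumptions made for the example.

```python
import random
from collections import deque


def tail_mins(main):
    """Return the minimum of every tail-segment (suffix) of main, front to back."""
    result = []
    current = None
    for x in reversed(main):
        # Scanning from the back, each new strict minimum is a tail-min.
        if current is None or x < current:
            current = x
            result.append(x)
    return result[::-1]


def check_invariants(num_ops=2000, seed=0):
    rng = random.Random(seed)
    # Distinct keys, so "increasing order" is unambiguous.
    values = iter(rng.sample(range(10 * num_ops), num_ops))
    main, mins = deque(), deque()
    for _ in range(num_ops):
        if main and rng.random() < 0.4:
            x = main.popleft()  # DEQUEUE
            if mins[0] == x:
                mins.popleft()
        else:
            x = next(values)  # ENQUEUE(x)
            main.append(x)
            while mins and mins[-1] > x:
                mins.pop()
            mins.append(x)
        # Invariant 1: Min holds exactly the tail-mins of Main.
        # Invariant 2: they appear in increasing order, front to back.
        assert list(mins) == tail_mins(main)
    return True


ok = check_invariants()
```

A passing run is only evidence for the tested sequences; the argument above is what establishes the invariants in general.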
(d) [10 points] Analyze the time complexity: the worst-case cost for each operation, and
the amortized cost of any sequence of m operations.
Solution: DEQUEUE and FIND-MIN are O(1) operations, in the worst case.
ENQUEUE is O(m) in the worst case. To see that the cost can be this large, suppose
that ENQUEUE operations are performed for the elements 2, 3, 4, ..., m − 1, m, in order.
After these, Min contains {2, 3, 4, ..., m − 1, m}. Then perform ENQUEUE(1).
This takes Ω(m) time because all the other entries from Min are removed one by one.
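The worst-case sequence can be replayed directly. The sketch below tracks only Min, since Main does not affect the count, and reports how many entries the final ENQUEUE(1) removes; the function name `removals_on_final_enqueue` is hypothetical.

```python
def removals_on_final_enqueue(m):
    """Enqueue 2, 3, ..., m and then 1; count removals from Min in the last ENQUEUE."""
    mins = []  # only Min is simulated; Main is omitted
    removed = 0
    for x in list(range(2, m + 1)) + [1]:
        removed = 0  # reset so that only the last operation's count is returned
        while mins and mins[-1] > x:
            mins.pop()
            removed += 1
        mins.append(x)
    return removed


worst = removals_on_final_enqueue(10)  # the 9 entries 2, ..., 10 are all removed
```

For a sequence of m operations of this form, the last one removes m − 1 entries, which matches the Ω(m) bound.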
However, the amortized cost of any sequence of m operations is O(m). To see this,
we use a potential argument. First, define the actual costs of the operations as follows:
The cost of any FIND-MIN operation is 1. The cost of any DEQUEUE operation is 2,
for removal from Main and possible removal from Min. The cost of an ENQUEUE
operation is 2 + s, where s is the number of elements removed from Min. Define the
potential function Φ = |Min|.
Now consider a sequence $o_1, o_2, \ldots, o_m$ of operations and let $c_i$ denote the actual
cost of operation $o_i$. Let $\Phi_i$ denote the value of the potential function after exactly
$i$ operations; let $\Phi_0$ denote the initial value of $\Phi$, which here is 0. Define the amortized
cost $\hat{c}_i$ of operation instance $o_i$ to be $c_i + \Phi_i - \Phi_{i-1}$.
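We claim that $\hat{c}_i \le 3$ for every $i$. If we show this, then we know that the actual cost of
the entire sequence of operations satisfies:
$$\sum_{i=1}^{m} c_i = \sum_{i=1}^{m} \hat{c}_i + \Phi_0 - \Phi_m \le \sum_{i=1}^{m} \hat{c}_i \le 3m,$$
where the first inequality holds because $\Phi_0 = 0$ and $\Phi_m \ge 0$.
To show that $\hat{c}_i \le 3$ for every $i$, we consider the three types of operations. If $o_i$ is a
FIND-MIN operation, then Min is unchanged, so
$$\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = 1 + 0 \le 3.$$
If $o_i$ is a DEQUEUE, then since the lengths of the lists cannot increase, we have:
$$\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} \le 2 + 0 \le 3.$$
If $o_i$ is an ENQUEUE, then Min loses $s$ elements and gains $x$, so $\Phi_i - \Phi_{i-1} = 1 - s$ and
$$\hat{c}_i = c_i + \Phi_i - \Phi_{i-1} = (2 + s) + (1 - s) = 3,$$
where $s$ is the number of elements removed from Min. Thus, in every case, $\hat{c}_i \le 3$, as
claimed.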
Alternatively, we could use the accounting method. Use the same actual costs as
above. Assign each ENQUEUE an amortized cost of 3, each DEQUEUE an amortized
cost of 2, and each FIND-MIN an amortized cost of 1. Then we must argue that
$$\sum_{i=1}^{m} \hat{c}_i \ge \sum_{i=1}^{m} c_i$$
for any sequence of operations and costs as above. This is so because each ENQUEUE(x)
contributes an amortized cost of 3, which covers its own actual cost of 2 plus the
possible cost of removing x from Min later.
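The accounting argument can be sanity-checked by simulation. The sketch below runs a random operation sequence, charges the actual costs defined above (1 for FIND-MIN, 2 for DEQUEUE, 2 + s for ENQUEUE) and the amortized charges 1, 2, and 3, and compares the totals. The function name `run_and_charge`, the key range, and the seed are assumptions; operations attempted on an empty queue are skipped and cost nothing.

```python
import random
from collections import deque


def run_and_charge(num_ops, seed=0):
    """Run random operations; return (total actual cost, total amortized charge)."""
    rng = random.Random(seed)
    main, mins = deque(), deque()
    actual = charged = 0
    for _ in range(num_ops):
        op = rng.choice(["enqueue", "dequeue", "find_min"])
        if op == "enqueue":
            x = rng.randrange(1000)
            main.append(x)
            s = 0  # number of elements removed from Min
            while mins and mins[-1] > x:
                mins.pop()
                s += 1
            mins.append(x)
            actual += 2 + s
            charged += 3
        elif op == "dequeue" and main:
            x = main.popleft()
            if mins[0] == x:
                mins.popleft()
            actual += 2
            charged += 2
        elif op == "find_min" and main:
            _ = mins[0]
            actual += 1
            charged += 1
    return actual, charged


actual, charged = run_and_charge(10_000)
print(actual, charged)
```

Because every element is removed from Min at most once, the total of the s terms never exceeds the number of ENQUEUE operations, so the actual total stays at or below the charged total, which in turn is at most 3 per operation.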
(b) [9 points] Consider a particular element $x_i$. Prove that, with probability at least $1 - \frac{1}{n^2}$,
the total number of times the algorithm compares $x_i$ with pivots is at most $d \lg n$, for
a particular constant $d$. Give a value for $d$ explicitly.
Solution: We use part (a) and the Claim. By part (a), each time QUICKSORT is called
for a subarray containing $x_i$, with probability at least $\frac{1}{2}$, either $x_i$ is chosen as the pivot
value or else the size of the subarray containing $x_i$ reduces to at most $\frac{3}{4}$ of what it
was before the call. Let’s say that a call is “successful” if either of these two cases
happens. That is, with probability at least $\frac{1}{2}$, the call is successful.
Now, at most $\log_{4/3} n$ successful calls can occur for subarrays containing $x_i$ during an
execution, because after that many successful calls, the size of the subarray containing
$x_i$ would be reduced to 1. Using the change of base formula for logarithms,
$\log_{4/3} n = c \lg n$, where $c = \log_{4/3} 2$.
Now we can model the sequence of calls to QUICKSORT for subarrays containing $x_i$
as a sequence of tosses of a fair coin, where heads corresponds to successful calls.
By the Claim, with $c = \log_{4/3} 2$ and $\alpha = 2$, we conclude that, with probability at
least $1 - \frac{1}{n^2}$, we have at least $c \lg n$ successful calls within $d \lg n$ total calls, where
$d = 3(2 + c)$. Each comparison of $x_i$ with a pivot occurs as part of one of these calls,
so with probability at least $1 - \frac{1}{n^2}$, the total number of times the algorithm compares
$x_i$ with pivots is at most $d \lg n = 3(2 + c) \lg n = 3(2 + \log_{4/3} 2) \lg n$. The required
value of $d$ is $3(2 + \log_{4/3} 2) \le 14$.
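The constants can be evaluated numerically as a quick check. The snippet below computes c by change of base and then d = 3(2 + c), and confirms the change-of-base identity for one sample value of n; the choice n = 1024 is arbitrary.

```python
import math

c = math.log(2) / math.log(4 / 3)  # c = log_{4/3} 2, via change of base
d = 3 * (2 + c)                    # d = 3(alpha + c) with alpha = 2

n = 1024  # arbitrary sample size for the identity check
# Change of base: log_{4/3} n equals c * lg n.
assert math.isclose(math.log(n, 4 / 3), c * math.log2(n))

print(round(c, 3), round(d, 3))
```

The computed d is a little over 13.2, consistent with the bound d ≤ 14 stated above.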
(c) [6 points] Now consider all of the elements $x_1, x_2, \ldots, x_n$. Apply your result from
part (b) to prove that, with probability at least $1 - \frac{1}{n}$, the total number of comparisons
made by QUICKSORT on the given array input is at most $d' n \lg n$, for a particular
constant $d'$. Give a value for $d'$ explicitly. Hint: The Union Bound may be useful for
your analysis.
Solution: Using a union bound for all the $n$ elements of the original array A, we get
that, with probability at least $1 - n \cdot \frac{1}{n^2} = 1 - \frac{1}{n}$, every value in the array is compared
with pivots at most $d \lg n$ times, with $d$ as in part (b). Therefore, with probability at
least $1 - \frac{1}{n}$, the total number of such comparisons is at most $d n \lg n$. Using $d' = d$
works fine.
Since all the comparisons made during execution of QUICKSORT involve comparison
of some element with a pivot, we get the same probabilistic bound for the total number
of comparisons.
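The bounds from parts (b) and (c) can be compared against a simulated run. The sketch below is an illustration under stated assumptions, not the problem set's QUICKSORT: it simulates the recursion of a randomized quicksort that picks each pivot uniformly at random from the current subarray, assumes distinct keys, and counts one comparison for every non-pivot element in each call. The function name `pivot_comparisons`, the input size, and the seed are hypothetical choices.

```python
import math
import random


def pivot_comparisons(values, rng):
    """Simulate randomized quicksort on distinct values.

    Returns a dict mapping each value to its number of comparisons with pivots.
    """
    counts = {v: 0 for v in values}
    stack = [list(values)]  # explicit stack instead of recursion
    while stack:
        sub = stack.pop()
        if len(sub) <= 1:
            continue
        pivot = rng.choice(sub)
        smaller, larger = [], []
        for v in sub:
            if v == pivot:
                continue
            counts[v] += 1  # v is compared with the pivot once in this call
            (smaller if v < pivot else larger).append(v)
        stack.append(smaller)
        stack.append(larger)
    return counts


n = 1024
rng = random.Random(0)
counts = pivot_comparisons(rng.sample(range(10 * n), n), rng)

d = 3 * (2 + math.log(2) / math.log(4 / 3))
per_element_bound = d * math.log2(n)   # bound from part (b)
total_bound = d * n * math.log2(n)     # bound from part (c), with d' = d
print(max(counts.values()), "vs per-element bound", per_element_bound)
print(sum(counts.values()), "vs total bound", total_bound)
```

A single run proves nothing about the probability, but in practice both observed counts tend to sit well below the bounds, which reflects how loose the constant d is.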
(d) [5 points] Generalize your results above to obtain a bound on the number of comparisons
made by QUICKSORT that holds with probability $1 - \frac{1}{n^\alpha}$, for any positive integer
$\alpha$, rather than just probability $1 - \frac{1}{n}$ (i.e., $\alpha = 1$).
Solution: The modifications are easy. The Claim and part (a) are unchanged. For
part (b), we now prove that with probability at least $1 - \frac{1}{n^{\alpha+1}}$, the total number of
times the algorithm compares $x_i$ with pivots is at most $d \lg n$, for $d = 3(\alpha + 1 + c)$. The
argument is the same as before, but we use the Claim with the value $\alpha + 1$ instead of 2.
Then for part (c), the union bound over the $n$ elements shows that with probability at
least $1 - n \cdot \frac{1}{n^{\alpha+1}} = 1 - \frac{1}{n^\alpha}$, the total number
of times the algorithm compares any value with a pivot is at most $d n \lg n$, where
$d = 3(\alpha + 1 + c)$. For $\alpha = 1$ this gives $d = 3(2 + c)$, matching parts (b) and (c).
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.