Hashing Practice Problems Hints
2. Still slide the needle over the haystack, but instead of comparing characters directly, compare the
hash of the key k with the hash of the window you are currently checking. For this to be efficient, the
hash function must let you calculate the hash of the next window from the hash of the previous window
in O(1) time. This is called a rolling hash.
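As an illustration of the hint above, here is a minimal sketch of a rolling-hash search. It assumes a polynomial hash; the function name find_all and the values of base and mod are illustrative choices, not taken from the problem set.

```python
def find_all(needle: str, haystack: str, base: int = 256, mod: int = 1_000_000_007) -> list:
    """Return every index where needle occurs in haystack, using a rolling hash."""
    n, m = len(haystack), len(needle)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the window's leading character
    needle_hash = 0
    window_hash = 0
    for i in range(m):
        needle_hash = (needle_hash * base + ord(needle[i])) % mod
        window_hash = (window_hash * base + ord(haystack[i])) % mod
    matches = []
    for start in range(n - m + 1):
        # Equal hashes may be a false positive, so confirm with a direct comparison.
        if window_hash == needle_hash and haystack[start:start + m] == needle:
            matches.append(start)
        if start + m < n:
            # O(1) update: drop the leading character, shift, append the next one.
            window_hash = (window_hash - ord(haystack[start]) * high) % mod
            window_hash = (window_hash * base + ord(haystack[start + m])) % mod
    return matches
```

The direct comparison runs only when the two hashes agree, so with a good hash function it is rarely triggered on a non-matching window.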
3. The probability of a false positive is equal to the probability that the hash function you choose maps
two different strings to the same value (a collision). So, if the hash function is good, the probability
will be low.
1. Show that the expected size of $\max_{i=0}^{n-1} T[i]$ is $O(\log n)$.
Note: The expected value of any random variable X is defined as
$$E(X) = \sum_{x} x \, P(X = x)$$
2. Argue that the expected value of the maximum time taken to verify the membership of elements of U in
S is $O(\log n)$.
Hints:
Let $X = \max_i T[i]$. Note that we just want an upper bound on $E(X)$. Here are some questions for you to
think about:
• Consider the event $E_i$, which is that the size of the ith chain is greater than some constant k. Can you
find an upper bound on $P(E_i)$?
• Consider the event $E = \cup_i E_i$. What does this event signify? Can you find (an upper bound on) $P(E)$?
• Since the maximum value of X is n, convince yourself that you can bound the $E(X)$ term as the
following:
$$E(X) \le n P(E) + k P(E^c)$$
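One way to see the last bound is to split the defining sum for $E(X)$ at k. The sketch below relies only on the definitions above, together with the observation that E is exactly the event X > k (some chain is longer than k):

```latex
\begin{align*}
E(X) &= \sum_{x \le k} x \, P(X = x) + \sum_{x > k} x \, P(X = x) \\
     &\le k \sum_{x \le k} P(X = x) + n \sum_{x > k} P(X = x) \\
     &= k \, P(E^c) + n \, P(E).
\end{align*}
```

The first sum uses x ≤ k on every term, and the second uses x ≤ n, the maximum value of X.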
Hints: Use an accounting method over a series of hashtable operations, similar to the analysis of stacks
using dynamically sized arrays. You may want to refer to the analysis presented here.
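To make the accounting argument concrete, the sketch below counts the total work for a run of insertions. It assumes a table that doubles its capacity whenever it is full and charges one unit per element written; the function name total_insert_cost and this cost model are illustrative assumptions, not part of the problem statement.

```python
def total_insert_cost(num_inserts: int) -> int:
    """Count element writes for num_inserts insertions into a doubling table."""
    capacity, size, cost = 1, 0, 0
    for _ in range(num_inserts):
        if size == capacity:
            cost += size   # rehash every existing element into the new table
            capacity *= 2
        cost += 1          # write the new element itself
        size += 1
    return cost
```

Under these assumptions the total never exceeds 3 units per insertion, which is the kind of constant per-operation charge the accounting method aims to establish.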
Hints: The issue with both quadratic and cubic probing is secondary clustering: keys that hash to the
same initial slot follow the same probe sequence. A cubic offset grows faster, but it therefore also wraps
around (is reduced modulo the table size) sooner, since the size of the hashtable is the same. There should
not be any practical improvement in general.
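The clustering behaviour can be inspected with a small sketch. It assumes the probe at step i is (home + i^power) mod table_size, with power 2 for quadratic and 3 for cubic probing; the function names are hypothetical helpers for illustration.

```python
def probe_sequence(home: int, table_size: int, power: int, steps: int) -> list:
    """Slots visited by the probe (home + i**power) mod table_size for i = 0..steps-1."""
    return [(home + i ** power) % table_size for i in range(steps)]


def distinct_slots(table_size: int, power: int) -> int:
    """Number of distinct slots reached from home slot 0 within table_size probes."""
    return len(set(probe_sequence(0, table_size, power, table_size)))
```

The sequence depends only on the home slot, so any two keys that collide on their first probe collide on every later probe as well, whichever power is used. Comparing distinct_slots(m, 2) with distinct_slots(m, 3) for a few table sizes m gives a feel for how much of the table each scheme can reach.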