Hashing
Key Points
Data Structure
Hashing
What is Hashing
- If you know the index of an element in the array, you can retrieve the
element using the index in O(1) time. So, can we store the values in
an array and use the key as the index to find the value? The answer is
yes if you can map a key to an index.
- The array that stores the values is called a hash table. The function
that maps a key to an index in the hash table is called a hash
function.
Why Hashing?
The preceding chapters introduced search trees. An element can be
found in O(log n) time in a well-balanced search tree. Is there a more
efficient way to search for an element in a container? This chapter
introduces a technique called hashing. You can use hashing to
implement a map or a set that can search for, insert, and delete an
element in O(1) time.
Hash Function and Hash Codes
A typical hash function first converts a search key to an integer value
called a hash code, and then compresses the hash code into an index to
the hash table.
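The two steps described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function names hash_code, compress, and hash_index are hypothetical, and the base-31 polynomial used for string keys is just one common choice assumed for the example.

```python
def hash_code(key: str) -> int:
    """Step 1: convert a search key to an integer hash code.

    Assumption: a base-31 polynomial over the character codes.
    """
    code = 0
    for ch in key:
        code = code * 31 + ord(ch)
    return code


def compress(code: int, table_size: int) -> int:
    """Step 2: compress the hash code into an index in [0, table_size)."""
    return code % table_size


def hash_index(key: str, table_size: int) -> int:
    """Full hash function: key -> hash code -> table index."""
    return compress(hash_code(key), table_size)
```

Whatever the key type, only the first step changes; the compression step stays the same.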
Hash Functions and Modulus
- A simple and effective hash function is:
- Convert the key value to an integer, x
- h(x) = x mod tablesize
- We want the keys to be distributed evenly over the underlying array
- This can usually be achieved by choosing a prime number as the table
size
Example: if tableSize = 23, the key 10 maps to index 10 % 23 = 10. Two
keys whose values differ by a multiple of 23 map to the same index and
collide.
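The modulus hash function with a table size of 23 can be tried out directly. In this small sketch the keys 10, 58, and 81 are chosen only for illustration, and the name h follows the notation h(x) used on the slide.

```python
TABLE_SIZE = 23  # prime table size, matching the slide's example


def h(x: int, table_size: int = TABLE_SIZE) -> int:
    """Map an integer key to a table index with the modulus operator."""
    return x % table_size


# Illustrative keys (an assumption, not taken from the slide).
keys = [10, 58, 81]
indices = {key: h(key) for key in keys}
# 10 -> 10, 58 -> 12, 81 -> 12: the keys 58 and 81 collide at index 12.
```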
Dealing with Collisions
- A collision occurs when two different keys are mapped to the same
index
* Collisions may occur even when the hash function is good
- There are two main ways of dealing with collisions:
* Open addressing
* Separate chaining
Open Addressing
- Idea: when an insertion results in a collision, look for an empty
array element
* Start at the index to which the hash function mapped the inserted
item
* Look for a free space in the array following a particular search
pattern, known as probing
- There are three open addressing schemes:
* Linear probing
* Quadratic probing
* Double hashing
Linear Probing
- The hash table is searched sequentially
* Starting with the original hash location
* Search h(search key) + 1, then h(search key) + 2, and so on until an
available location is found
* If the sequence of probes reaches the last element of the array, wrap
around to arr[0]
- Linear probing leads to primary clustering
* The table contains groups of consecutively occupied locations
* These clusters tend to get larger as time goes on, reducing the
efficiency of the hash table
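Insertion with linear probing can be sketched as below. The function name linear_probe_insert is hypothetical, and the sketch assumes integer keys, a table stored as a Python list with None marking a free slot, and h(key) = key mod table size.

```python
from typing import List, Optional


def linear_probe_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key using linear probing and return the index used."""
    size = len(table)
    home = key % size  # original hash location
    for step in range(size):
        # Probe home, home + 1, home + 2, ... wrapping around to index 0.
        index = (home + step) % size
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")
```

With a table of size 5, inserting 3, 8, and 13 in that order places them at indices 3, 4, and 0: all three keys hash to 3, and the third probe sequence wraps around the end of the array.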
Linear Probing Example
- Hash table is size 23
- The hash function, h = x mod 23, where x is the search key value
- The search key values are shown in the table.
- Insert 60, h = 60 mod 23 = 14
- Note that even though the key doesn't hash to 12, it still collides with
an item that did. First look at 14 + 1 = 15, which is free.
- Insert 12, h = 12 mod 23 = 12. Indices 12 through 15 are occupied, so
the item will be inserted at index 16.
- Notice that primary clustering is beginning to develop, making
insertions less efficient
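The two insertions can be replayed in code. The slide's table is not reproduced here, so this sketch assumes that indices 12 to 14 were filled earlier by the keys 58, 81, and 35 (keys borrowed from the quadratic probing example); that starting state is an assumption, as is the helper name insert.

```python
TABLE_SIZE = 23


def insert(table, key):
    """Linear probing insert; returns the index where key was stored."""
    home = key % len(table)
    for step in range(len(table)):
        index = (home + step) % len(table)
        if table[index] is None:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")


table = [None] * TABLE_SIZE
# Assumed earlier contents: all three keys hash to 12.
for key in (58, 81, 35):
    insert(table, key)  # fills indices 12, 13, 14

pos_60 = insert(table, 60)  # hashes to 14 (occupied), stored at 15
pos_12 = insert(table, 12)  # hashes to 12, probes 13, 14, 15, stored at 16
```

The cluster now spans indices 12 to 16, so any new key hashing into that range needs several probes before it finds a free slot.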
Searching
- Searching for an item is similar to insertion
- Find 59, h = 59 mod 23 = 13. Index 13 does not contain 59 but is
occupied, so use linear probing to look for 59 or an empty space.
Reaching an empty space first shows that 59 is not in the table.
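A search with linear probing might look like the sketch below. The function name linear_probe_search is hypothetical, the table contents at indices 12 to 16 are assumed for illustration (the slide's table is not reproduced here), and the sketch assumes no items have ever been deleted from the table.

```python
from typing import List, Optional

TABLE_SIZE = 23


def linear_probe_search(table: List[Optional[int]], key: int) -> Optional[int]:
    """Return the index of key, or None if key is not in the table."""
    size = len(table)
    home = key % size
    for step in range(size):
        index = (home + step) % size
        if table[index] is None:
            return None  # empty slot reached: key cannot be further along
        if table[index] == key:
            return index
    return None  # every slot probed without finding key


# Assumed table state: one cluster occupying indices 12 to 16.
table: List[Optional[int]] = [None] * TABLE_SIZE
table[12:17] = [58, 81, 35, 60, 12]

found_59 = linear_probe_search(table, 59)  # probes 13..16, then empty 17
found_12 = linear_probe_search(table, 12)  # probes 12..16, found at 16
```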
Quadratic Probing
- Quadratic probing is a refinement of linear probing that prevents
primary clustering. For each successive probe, i, add i^2 to the
original location index.
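Written as a formula, the index examined on probe i is the home index plus the square of i. The wrap-around modulo the table size is an assumption carried over from linear probing, and the symbol h_i is introduced here only for illustration.

```latex
h_i(x) = \bigl(h(x) + i^2\bigr) \bmod \text{tableSize},
\qquad h(x) = x \bmod \text{tableSize},
\qquad i = 0, 1, 2, \dots
```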
Quadratic Probing Example
- Insert 81, 81 mod 23 = 12, which collides with 58, so use quadratic
probing to find a free space. First look at 12 + 1^2 = 13, which is free,
so insert the item at index 13.
- Insert 35, 35 mod 23 = 12, which collides with 58. First look at 12 +
1^2 = 13, which is occupied, then look at 12 + 2^2 = 16 and insert the
item there.
- Insert 60, 60 mod 23 = 14. The location is free, so insert the item.
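The three insertions above can be reproduced with a short sketch. The function name quadratic_probe_insert is hypothetical, and the table is assumed to start with only 58 stored at index 12, which is all the example states about its contents.

```python
from typing import List, Optional

TABLE_SIZE = 23


def quadratic_probe_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key using quadratic probing and return the index used."""
    size = len(table)
    home = key % size
    for i in range(size):
        # Probe i examines home + i^2, wrapping around the table.
        index = (home + i * i) % size
        if table[index] is None:
            table[index] = key
            return index
    # Quadratic probing does not visit every slot, so this can happen
    # even when the table still has free space.
    raise RuntimeError("no free slot found along the probe sequence")


table: List[Optional[int]] = [None] * TABLE_SIZE
table[12] = 58  # assumed starting state: 58 mod 23 = 12

pos_81 = quadratic_probe_insert(table, 81)  # 12 occupied, 12 + 1 = 13 free
pos_35 = quadratic_probe_insert(table, 35)  # 12, 13 occupied, 12 + 4 = 16
pos_60 = quadratic_probe_insert(table, 60)  # 14 is free
```

Compared with linear probing, the key 60 lands in its own home slot here, because the earlier collisions were spread out instead of filling indices 13 and 14.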