IT245 - Module 8
26/12/2021
1
College of Computing and Informatics
Data Structure
2
Data Structure
Module 8
Hashing & Heaps
3
Contents
1. Hash Table and Hash Function
2. Implementing HashSet and HashTable
3. Rehashing
4. Implementation of Priority Queue
5. Binary Heap
4
Weekly Learning Outcomes
1. Understand the hash table data structure
2. Recognize the role of the hash function
3. Understand the rehashing operation
4. Implement hash tables in the standard library
5. Implement a priority queue
5
Required Reading
1. Chapter 5: Hashing
2. Chapter 6: Priority Queues (Heaps) (pp. 225-234)
(Data Structures and Algorithm Analysis in Java
by Mark Allen Weiss)
Recommended Reading
1. Chapter 11 (Cormen, Thomas H., et al. Introduction
to Algorithms. Cambridge, MA: MIT Press, 2009)
6
This presentation is mainly based on the textbook Data Structures and Algorithm Analysis in Java by Mark Allen Weiss.
• Hash Table and Hash Function
7
Hash Table and Hash Function
• Objects are stored as array items, and their hash codes are used as array indexes.
• In Figure 1, each object is associated with a unique hash code.
• In Figure 2, three objects are associated with one hash code (the collision problem).
9
Hash Table and Hash Function
• Example of a good hash function:
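The example figure from this slide is not reproduced in this text. As a stand-in, the sketch below shows one widely used style of string hash: a polynomial in the character codes, evaluated with Horner's rule and reduced modulo the table size. The class name StringHash, the method name hash, and the multiplier 37 are illustrative assumptions and are not taken from the slide.

```java
// Illustrative sketch only: the names and the multiplier 37 are assumptions.
class StringHash {
    // Computes a polynomial hash of key with Horner's rule,
    // then maps the result into the range [0, tableSize).
    static int hash(String key, int tableSize) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++) {
            // int overflow may wrap around here; that is acceptable for hashing
            hashVal = 37 * hashVal + key.charAt(i);
        }
        hashVal %= tableSize;
        if (hashVal < 0) {
            hashVal += tableSize; // overflow can make the remainder negative
        }
        return hashVal;
    }
}
```

Every character of the key contributes to the result, which helps spread similar keys across the table.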
10
Hash Table and Hash Function
• Collision Problem:
• If an element being inserted hashes to the same value as an already inserted element,
we have a collision and need to resolve it.
• Example of a probing hash function: h_i(x) = (hash(x) + f(i)) mod TableSize, with f(0) = 0, where i counts the attempts.
• The function f is the collision resolution strategy.
• Two approaches to resolving collisions:
1. Separate Chaining
2. Open Addressing
11
Hash Table and Hash Function
1. Separate Chaining:
• Keep a list of all elements that hash to the same value.
• All keys that map to the same table location are kept in
a linked list.
• We can use the standard library list implementations.
• To search for an object:
1. Compute its hash code.
2. Search for the object in the appropriate list.
• To insert an object:
1. Compute its hash code.
2. Check whether it already exists in the appropriate list.
3. If it does not exist, insert it at the front of the list.
• Disadvantages:
• Allocating new list nodes takes time, which slows down the
algorithm.
• A second data structure (the list) has to be implemented.
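The search and insert steps above can be sketched as follows. This is a minimal illustration for int keys, built on the standard library LinkedList; the class name ChainingHashSet and its method names are assumptions made for this example.

```java
import java.util.LinkedList;
import java.util.List;

// Minimal sketch; the class and method names are illustrative assumptions.
class ChainingHashSet {
    private final List<Integer>[] lists;

    @SuppressWarnings("unchecked")
    ChainingHashSet(int tableSize) {
        lists = new LinkedList[tableSize];
        for (int i = 0; i < tableSize; i++) {
            lists[i] = new LinkedList<>();
        }
    }

    // Step 1: compute the hash code and map it to a table index.
    private int indexFor(int x) {
        return Math.floorMod(Integer.hashCode(x), lists.length);
    }

    // Step 2 of searching: look only in the list stored at that index.
    boolean contains(int x) {
        return lists[indexFor(x)].contains(x);
    }

    // Inserts x at the front of its list unless it is already present.
    boolean insert(int x) {
        List<Integer> chain = lists[indexFor(x)];
        if (chain.contains(x)) {
            return false;
        }
        chain.add(0, x);
        return true;
    }
}
```

With a table of size 10, the keys 15 and 25 both map to index 5 and simply share one list, so no probing is needed.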
12
Hash Table and Hash Function
2. Open Addressing:
• Linear Probing
• When a collision occurs, sequentially search the table until an empty location is
found.
• Quadratic Probing
• Probes at offsets that grow quadratically (1, 4, 9, …), which avoids the primary
clustering problem found with linear probing.
• Double Hashing
• When the hash function results in a collision, a second hash function
determines the distance between probes.
13
Hash Table and Hash Function
• Linear Probing
• When a collision occurs, sequentially search the table until an empty
location is found.
• Example: Suppose the hash function hashes the key 343-567 (SEU
Dammam Branch) into index #4 which is already occupied by the key
343-567 (SEU Medina Branch). Collision!
14
Hash Table and Hash Function
• Linear Probing (Example, continued)
• Solution: Search sequentially for an empty location:
1. Index #5 is occupied.
2. Index #6 is occupied.
3. Index #7 is empty, so the key is stored there.
17
Hash Table and Hash Function
• Linear Probing
• Problem: What if the end of the table is reached without finding an
empty location?
• Solution: The table is treated as circular: when the end of the table
has been probed, probing continues at the beginning.
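A minimal sketch of linear probing with the circular wrap-around described above. It assumes non-negative int keys, the hash function key modulo table size, and no duplicate handling; the class name LinearProbingTable is an assumption made for this example.

```java
// Minimal sketch; names are illustrative assumptions. Keys are non-negative ints.
class LinearProbingTable {
    private final Integer[] table;

    LinearProbingTable(int tableSize) {
        table = new Integer[tableSize];
    }

    // Inserts key and returns the index where it was stored,
    // or -1 if every cell of the table is occupied.
    int insert(int key) {
        int start = key % table.length;
        for (int i = 0; i < table.length; i++) {
            int index = (start + i) % table.length; // wrap around at the end of the table
            if (table[index] == null) {
                table[index] = key;
                return index;
            }
        }
        return -1; // all cells were probed and none was empty
    }
}
```

In a table of size 5, inserting 4 and then 9 stores 9 at index 0: index 4 is taken, and the next probe wraps around to the start of the table.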
18
Hash Table and Hash Function
• Quadratic Probing
• Probes at offsets that grow quadratically, which avoids the primary clustering problem
found with linear probing.
• General formula: index = (hashed key + n²) modulo <table size>, for n = 0, 1, 2, …
– First attempt: index = hashed key
– Second attempt: index = (hashed key + 1²) modulo <table size>, i.e. hashed key + 1 with wrap-around
– Third attempt: index = (hashed key + 2²) modulo <table size>, i.e. hashed key + 4 with wrap-around
– Fourth attempt: index = (hashed key + 3²) modulo <table size>, i.e. hashed key + 9 with wrap-around
– …
19
Hash Table and Hash Function
• Quadratic Probing (Example 1)
• To insert the following keys (11, 85, 77, 10, 35)
• Use the hash function: Key modulo <TableSize>
• In case of a collision: use the previous steps
TableSize = 10
Insert(11) = 11 % 10 = 1
Insert(85) = 85 % 10 = 5
Insert(77) = 77 % 10 = 7
Insert(10) = 10 % 10 = 0
Insert(35) = 35 % 10 = 5 Collision!
(35 + 1) % 10 = 6 Vacant
Resulting table:
[0] 10
[1] 11
[2]
[3]
[4]
[5] 85
[6] 35
[7] 77
[8]
[9]
20
Hash Table and Hash Function
• Quadratic Probing (Example 2)
• To insert the following keys (99, 28, 88)
• Use the formula: Key modulo <TableSize>
• In case of a collision: use the previous steps
TableSize = 10
Insert(99) = 99 % 10 = 9
Insert(28) = 28 % 10 = 8
Insert(88) = 88 % 10 = 8 Collision!
(88 + 1) % 10 = 9 Collision!
(88 + 4) % 10 = 2 Vacant
Resulting table:
[0]
[1]
[2] 88
[3]
[4]
[5]
[6]
[7]
[8] 28
[9] 99
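The two quadratic probing examples can be reproduced with a short sketch. It assumes non-negative int keys and the hash function key modulo table size; the class name QuadraticProbingTable and the choice to give up after TableSize attempts are assumptions made for this example.

```java
// Minimal sketch; names are illustrative assumptions. Keys are non-negative ints.
class QuadraticProbingTable {
    final Integer[] table;

    QuadraticProbingTable(int tableSize) {
        table = new Integer[tableSize];
    }

    // Tries index (key + n*n) % tableSize for n = 0, 1, 2, ...
    // Returns the index used, or -1 if no vacant cell was found
    // within tableSize attempts.
    int insert(int key) {
        for (int n = 0; n < table.length; n++) {
            int index = (key + n * n) % table.length;
            if (table[index] == null) {
                table[index] = key;
                return index;
            }
        }
        return -1;
    }
}
```

Inserting 11, 85, 77, 10, 35 into a table of size 10 places 35 at index 6, and inserting 99, 28, 88 into a fresh table places 88 at index 2, matching the two examples. Unlike linear probing, a quadratic probe sequence does not necessarily visit every cell, so an insert can fail even when the table still has vacant cells.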
21
Hash Table and Hash Function
• Double Hashing
• Both linear and quadratic probing are key-independent:
• The sequence of probe offsets is the same for every key; it is not affected by the value of the key.
• When the hash function results in a collision, apply a second hash
function to x and probe at distances hash2(x), 2·hash2(x), . . . , and so on.
22
Hash Table and Hash Function
• Double Hashing (Example)
• To insert the following keys {89, 18, 49, 58, 69}:
• Use the hash function: Key modulo <TableSize>
• In case of a collision: add the result of the second hash function to the index (modulo <TableSize>)
• hash2(x) = R − (x mod R), where R is a prime smaller than TableSize (R = 7 in this example)
TableSize = 10
Insert(89) = 89 % 10 = 9
Insert(18) = 18 % 10 = 8
Insert(49) = 49 % 10 = 9 Collision!
hash2(49) = 7 – (49 % 7) = 7, (9 + 7) % 10 = 6, insert in position 6
Insert(58) = 58 % 10 = 8 Collision!
hash2(58) = 7 – (58 % 7) = 5, (8 + 5) % 10 = 3, insert in position 3
Insert(69) = 69 % 10 = 9 Collision!
hash2(69) = 7 – (69 % 7) = 1, (9 + 1) % 10 = 0, insert in position 0
23
Hash Table and Hash Function
• Double Hashing (Example) TableSize = 10
Q1. What if another collision occurs at the position given by the second hash function?
Answer: probe at distances hash2(x), 2·hash2(x), . . . , and so on.
(Ex. insert 60 into the table above.)
Insert(60) = 60 % 10 = 0 Collision!
hash2(60) = 7 – (60 % 7) = 3, (0 + 3) % 10 = 3 Collision!
Try positions 3, 6, 9, 2, 5, … until an empty spot is found
Now, Where should 60 be inserted?
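The double hashing example, including the question about key 60, can be traced with a short sketch. It assumes non-negative int keys, the hash function key modulo table size, and R = 7 as in the example; the class name DoubleHashingTable and the limit of TableSize attempts are assumptions made for this example.

```java
// Minimal sketch; names are illustrative assumptions. Keys are non-negative ints.
class DoubleHashingTable {
    private static final int R = 7; // prime smaller than the table size, as in the example
    final Integer[] table;

    DoubleHashingTable(int tableSize) {
        table = new Integer[tableSize];
    }

    // Second hash function; the result is always between 1 and R, never 0.
    static int hash2(int key) {
        return R - (key % R);
    }

    // Probes at start, start + hash2, start + 2*hash2, ... (modulo table size).
    // Returns the index used, or -1 if no vacant cell was found.
    int insert(int key) {
        int start = key % table.length;
        int step = hash2(key);
        for (int i = 0; i < table.length; i++) {
            int index = (start + i * step) % table.length;
            if (table[index] == null) {
                table[index] = key;
                return index;
            }
        }
        return -1;
    }
}
```

After inserting 89, 18, 49, 58 and 69 into a table of size 10, inserting 60 probes positions 0, 3, 6 and 9, which are all occupied, and stores the key at index 2.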
24
• Implementing HashSet and HashTable
25
Implementing HashSet and HashTable
• The items in the HashSet (or the keys in the HashMap) must provide
equals and hashCode methods that are consistent with each other
26
Implementing HashSet and HashTable
(Example: HashSet)
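The example figure from this slide is not reproduced in this text. The sketch below shows, under that caveat, how an item class can provide equals and hashCode so that java.util.HashSet treats equal objects as one element. The Branch class, its fields, and the sample values are hypothetical and chosen only for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical item class, used only for illustration.
class Branch {
    private final String code;
    private final String city;

    Branch(String code, String city) {
        this.code = code;
        this.city = city;
    }

    // Two branches are equal when both fields are equal.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Branch)) {
            return false;
        }
        Branch b = (Branch) other;
        return code.equals(b.code) && city.equals(b.city);
    }

    // Equal objects must return equal hash codes, so the same fields are used.
    @Override
    public int hashCode() {
        return Objects.hash(code, city);
    }
}

class HashSetDemo {
    // Returns the set size after adding two equal branches and one different branch.
    static int demo() {
        Set<Branch> branches = new HashSet<>();
        branches.add(new Branch("343-567", "Dammam"));
        branches.add(new Branch("343-567", "Dammam")); // equal to the first, so not added again
        branches.add(new Branch("343-567", "Medina"));
        return branches.size();
    }
}
```

If hashCode were not overridden, the two equal Dammam objects would usually get different hash codes, and the set would store both.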
27
• Rehashing
28
Rehashing
• Solution:
• If the table reaches a predetermined percentage utilization (load factor):
• Create another table about twice the size of the current one (not exactly doubled, so that
the table size is still prime).
• Associate a new hash function with the new table.
• Rehash the current items to new locations based on the new table size and the
associated hash function.
29
Rehashing
• Example:
• To insert the following keys (13, 15, 24, 6, 23) into a linear probing hash
table of size 7.
• h(x) = x mod 7
• Linear probing is used to resolve collisions.
• After the five insertions, more than 70% of the table cells are occupied (5 of 7).
• Rehashing:
• Create a table of size 17. Why?
• 17 is the first prime number greater than twice the current table size (2 × 7 = 14).
• The new hash function is h(x) = x mod 17
• Current items in the table are scanned and inserted into new
locations in the new table.
Before rehashing (TableSize = 7)
[0] 6
[1] 15
[2] 23
[3] 24
[4]
[5]
[6] 13
After rehashing (TableSize = 17)
[0]
[1]
[2]
[3]
[4]
[5]
[6] 6
[7] 23
[8] 24
[9]
[10]
[11]
[12]
[13] 13
[14]
[15] 15
[16]
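The rehashing example can be reproduced with a short sketch. It assumes non-negative int keys, linear probing, and a rehash as soon as more than 70% of the cells are occupied (the threshold suggested by the example); the class name RehashingTable and the helper names are assumptions made for this example.

```java
// Minimal sketch; names are illustrative assumptions. Keys are non-negative ints.
class RehashingTable {
    private static final double MAX_LOAD = 0.7; // threshold taken from the example
    Integer[] table;
    private int size;

    RehashingTable(int tableSize) {
        table = new Integer[tableSize];
    }

    // Linear probing insert; rehashes once the load exceeds MAX_LOAD.
    void insert(int key) {
        int index = key % table.length;
        while (table[index] != null) {
            index = (index + 1) % table.length;
        }
        table[index] = key;
        size++;
        if (size > MAX_LOAD * table.length) {
            rehash();
        }
    }

    // Allocates a table whose size is the first prime above twice the old
    // size, then scans the old table and reinserts every key.
    private void rehash() {
        Integer[] old = table;
        table = new Integer[nextPrime(2 * old.length)];
        size = 0;
        for (Integer key : old) {
            if (key != null) {
                insert(key);
            }
        }
    }

    // Smallest prime strictly greater than n.
    private static int nextPrime(int n) {
        int candidate = n + 1;
        while (!isPrime(candidate)) {
            candidate++;
        }
        return candidate;
    }

    private static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }
}
```

Starting from a table of size 7 and inserting 13, 15, 24, 6, 23 triggers the rehash on the fifth insertion, leaving a table of size 17 with 6, 23, 24, 13 and 15 at indexes 6, 7, 8, 13 and 15.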
30
• Implementation of Priority Queue
31
Implementation of Priority Queue
33
Implementation of Priority Queue
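• Heap-Order Property:
• In a min-heap, the key of every node other than the root is greater than or equal to
the key of its parent, so the minimum element is always at the root.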
34
Implementation of Priority Queue
• Basic Heap Operations:
1. Steps for inserting an element (X):
1. Add an empty slot (a hole) at the end of the tree.
2. If the parent of the hole is larger than X, move the parent value into the
hole, which moves the hole up one level (bubble the hole up).
3. Repeat step 2 until the parent is less than or equal to X, or the hole reaches the root.
4. Insert X in the empty slot.
35
Implementation of Priority Queue
• Basic Heap Operations (Example):
• Insert the element 14.
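The insertion steps can be sketched for a min-heap of int values stored in an array with the root at index 1, so that the parent of index i is at index i / 2. The class name MinHeapInsert and the sample values in the note below are illustrative assumptions; they are not necessarily the heap shown in the slide figure.

```java
import java.util.Arrays;

// Minimal min-heap sketch; names are illustrative assumptions.
// The heap occupies array[1..currentSize]; index 0 is unused.
class MinHeapInsert {
    int[] array = new int[16];
    int currentSize = 0;

    void insert(int x) {
        if (currentSize == array.length - 1) {
            array = Arrays.copyOf(array, array.length * 2); // grow when full
        }
        int hole = ++currentSize; // step 1: new hole at the end of the tree
        // steps 2 and 3: while the parent is larger than x, move it down into the hole
        while (hole > 1 && array[hole / 2] > x) {
            array[hole] = array[hole / 2];
            hole /= 2;
        }
        array[hole] = x; // step 4: place x in the hole
    }
}
```

For instance, if the heap holds 13, 21, 16, 24, 31, 19, 68, 65, 26, 32 in level order, inserting 14 moves 31 and then 21 down one level each and places 14 as the left child of the root.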
36
Implementation of Priority Queue
• Basic Heap Operations:
2. Steps for deleting the minimum element:
1. Find the minimum (smallest) element and remove it.
• Finding the minimum is easy because it is the root of the heap.
2. Removing it leaves an empty slot (a hole) at the root.
3. Move the smaller child of the hole into the hole, which moves
the hole down one level (bubble the hole down).
4. Repeat step 3 until the last element (X) in the heap can be placed in
the hole without violating the heap order.
37
Implementation of Priority Queue
• Basic Heap Operations (Example):
• delete the minimum element 13.
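The deletion steps can be sketched for a min-heap of int values stored in an array with the root at index 1, so that the children of index i are at 2i and 2i + 1. The class name MinHeapDelete, the constructor that accepts values already in heap order, and the sample values in the note below are illustrative assumptions; they are not necessarily the heap shown in the slide figure.

```java
import java.util.NoSuchElementException;

// Minimal min-heap sketch; names are illustrative assumptions.
// The heap occupies array[1..currentSize]; index 0 is unused.
class MinHeapDelete {
    final int[] array;
    int currentSize;

    // Assumes the values already satisfy the heap order when read level by level.
    MinHeapDelete(int... values) {
        array = new int[values.length + 1];
        System.arraycopy(values, 0, array, 1, values.length);
        currentSize = values.length;
    }

    int deleteMin() {
        if (currentSize == 0) {
            throw new NoSuchElementException("heap is empty");
        }
        int min = array[1];              // the minimum is at the root
        int last = array[currentSize--]; // X: the last element must be placed again
        int hole = 1;                    // removing the root leaves a hole there
        while (hole * 2 <= currentSize) {
            int child = hole * 2;
            if (child < currentSize && array[child + 1] < array[child]) {
                child++;                 // pick the smaller of the two children
            }
            if (array[child] >= last) {
                break;                   // X fits in the hole
            }
            array[hole] = array[child];  // move the smaller child up, hole down
            hole = child;
        }
        array[hole] = last;
        return min;
    }
}
```

For instance, with the heap 13, 14, 16, 19, 21, 19, 68, 65, 26, 32, 31 in level order, deleteMin returns 13, moves 14, 19 and 26 up one level each, and places the last element 31 in the hole left by 26.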
38
Main Reference
1. Chapter 5: Hashing
2. Chapter 6: Priority Queues (Heaps) (pp. 225-234)
(Data Structures and Algorithm Analysis in Java
by Mark Allen Weiss)
3. Chapter 11 (Cormen, Thomas H., et al.
Introduction to Algorithms. Cambridge, MA: MIT
Press, 2009)
4. Big Java by Cay Horstmann, 4th Edition.
40