Tutorial 3 - Part1

Hash Maps

Tutorial 3
Ahmed Fahmy
ECE 250 @uWaterloo
Motivation

• Look-ups for key-value pairs
• For example:
• Does an item (key) exist (value) in the data structure?
• Given a student name (key), would they pass ECE250 (value)?
• Complexity
• Linked lists
• Trees
• Vectors...?

ADT: Dictionary

Ahmed   1
Lulu    0
Maaz    0
John    1
Gloria  1
Motivation

• Using vectors for lookups
• If keys are integers
• Space complexity depends on the key range!
• A quick solution would be mapping keys into a smaller index range
• Problem: collisions!
• Later...
• If keys are objects:
• Strings
• User-defined class
• Solution: transform the object into some integer

Figure: a vector with cells indexed 0 to 4
Hashing

• Hash: give each object an unsigned int (hash) value, ideally a different one for each object.

• Requirements:
• Fast
• An object will always have the same hash value
• Uniform probability for a collision (very important)
Mapping

• We can use index = hash mod size
• A bit slow operation!
• Solution: make size = 2^m, so that index = hash & (size - 1)
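
A minimal sketch of the two mappings, assuming the index is the hash value reduced modulo the table size and that the faster variant relies on a power-of-two size, as the mask in the hash() snippet on the following slides suggests. The names indexByModulo, indexByMask and isPowerOfTwo are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when n is a power of two (1, 2, 4, 8, ...).
constexpr bool isPowerOfTwo(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// General mapping: works for any non-zero table size, but needs a division.
inline std::size_t indexByModulo(std::size_t hashValue, std::size_t size) {
    assert(size != 0);
    return hashValue % size;
}

// Power-of-two mapping: keeps the low m bits, where size == 1 << m.
inline std::size_t indexByMask(std::size_t hashValue, std::size_t size) {
    assert(isPowerOfTwo(size));
    return hashValue & (size - 1);
}
```

For a power-of-two size both functions return the same index, so the mask can replace the modulo without changing where a key lands.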
Collision
unsigned int hash(type obj, unsigned int size) {
    // size must be a power of two: size == 1 << m
    return obj.hash() & (size - 1);
}

• Insert
• 4, 10, 33, 2

• Chaining
• Open-addressing

Table with 8 cells (assuming each integer is its own hash value):

Cell:  0   1   2   3   4   5   6   7
Key:   -   33  10  -   4   -   -   -
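
A minimal sketch of chaining, assuming unsigned integer keys that act as their own hash value and a power-of-two number of cells. ChainedSet and its member names are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical chained hash set: each cell holds a linked list of keys.
class ChainedSet {
public:
    explicit ChainedSet(std::size_t size) : buckets_(size) {
        assert(size != 0 && (size & (size - 1)) == 0);  // power of two
    }

    // Inserts key; returns false if it was already present.
    bool insert(unsigned key) {
        std::list<unsigned>& chain = buckets_[indexOf(key)];
        for (unsigned k : chain)
            if (k == key) return false;
        chain.push_back(key);  // a colliding key simply extends the chain
        return true;
    }

    // Searches only the chain of the cell the key maps to.
    bool contains(unsigned key) const {
        for (unsigned k : buckets_[indexOf(key)])
            if (k == key) return true;
        return false;
    }

    // Number of keys stored in one cell (the chain length).
    std::size_t chainLength(std::size_t cell) const {
        return buckets_[cell].size();
    }

private:
    std::size_t indexOf(unsigned key) const {
        return key & (buckets_.size() - 1);
    }

    std::vector<std::list<unsigned>> buckets_;
};
```

With 8 cells, inserting 4, 10, 33, 2 puts both 10 and 2 into the chain of cell 2, while 33 and 4 sit alone in cells 1 and 4.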
Linear-Probing
unsigned int hash(type obj, unsigned int size) {
    // size must be a power of two: size == 1 << m
    return obj.hash() & (size - 1);
}

• Insert
• 4, 10, 33, 2
• If the cell is occupied, check the next location (wrapping around at the end)

• Search
• Stop at an empty cell or after scanning the whole table

Table with 8 cells (assuming each integer is its own hash value):

Cell:  0   1   2   3   4   5   6   7
Key:   -   33  10  -   4   -   -   -
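
A minimal sketch of linear probing, assuming unsigned integer keys that act as their own hash value, a power-of-two table size, and no deletions (removing keys would need extra bookkeeping that is left out here). LinearProbingSet and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical open-addressing set that resolves collisions by linear probing.
class LinearProbingSet {
public:
    explicit LinearProbingSet(std::size_t size)
        : keys_(size, 0), used_(size, false) {
        assert(size != 0 && (size & (size - 1)) == 0);  // power of two
    }

    // Returns the cell where key is stored, or -1 if the table is full.
    long insert(unsigned key) {
        const std::size_t mask = keys_.size() - 1;
        std::size_t cell = key & mask;
        for (std::size_t probes = 0; probes < keys_.size(); ++probes) {
            if (!used_[cell]) {            // free cell: store the key here
                keys_[cell] = key;
                used_[cell] = true;
                return static_cast<long>(cell);
            }
            if (keys_[cell] == key)        // already present
                return static_cast<long>(cell);
            cell = (cell + 1) & mask;      // check the next location
        }
        return -1;
    }

    bool contains(unsigned key) const {
        const std::size_t mask = keys_.size() - 1;
        std::size_t cell = key & mask;
        for (std::size_t probes = 0; probes < keys_.size(); ++probes) {
            if (!used_[cell]) return false;      // empty cell: key is absent
            if (keys_[cell] == key) return true;
            cell = (cell + 1) & mask;
        }
        return false;                            // whole table scanned
    }

private:
    std::vector<unsigned> keys_;
    std::vector<bool> used_;
};
```

With 8 cells, inserting 4, 10, 33, 2 places 2 in cell 3, because its home cell 2 is already taken by 10.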
Double Hashing

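• Usually the most efficient of the open-addressing schemes here, since colliding keys follow different probe sequences

• Hash again to get the next cell index: index_i = (h1(key) + i * h2(key)) mod size

• Different hash functions for the initial value (h1) and the jump (h2)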
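
A minimal sketch of double hashing, assuming unsigned integer keys, a power-of-two table size, and no deletions. The two hash functions are illustrative choices, not taken from the slides: the initial cell uses the low bits of the key, and the jump is an odd number derived from the remaining high bits. An odd jump is coprime with a power-of-two size, so the probe sequence reaches every cell. DoubleHashingSet and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical open-addressing set that resolves collisions by double hashing.
class DoubleHashingSet {
public:
    explicit DoubleHashingSet(std::size_t size)
        : keys_(size, 0), used_(size, false) {
        assert(size != 0 && (size & (size - 1)) == 0);  // power of two
    }

    // Returns the cell where key is stored, or -1 if the table is full.
    long insert(unsigned key) {
        const std::size_t mask = keys_.size() - 1;
        const std::size_t step = jump(key);
        std::size_t cell = firstCell(key);
        for (std::size_t probes = 0; probes < keys_.size(); ++probes) {
            if (!used_[cell]) {
                keys_[cell] = key;
                used_[cell] = true;
                return static_cast<long>(cell);
            }
            if (keys_[cell] == key) return static_cast<long>(cell);
            cell = (cell + step) & mask;   // jump by the second hash value
        }
        return -1;
    }

    bool contains(unsigned key) const {
        const std::size_t mask = keys_.size() - 1;
        const std::size_t step = jump(key);
        std::size_t cell = firstCell(key);
        for (std::size_t probes = 0; probes < keys_.size(); ++probes) {
            if (!used_[cell]) return false;
            if (keys_[cell] == key) return true;
            cell = (cell + step) & mask;
        }
        return false;
    }

private:
    // First hash: low bits of the key select the initial cell.
    std::size_t firstCell(unsigned key) const {
        return key & (keys_.size() - 1);
    }

    // Second hash: an odd jump built from the high bits of the key.
    std::size_t jump(unsigned key) const {
        return (2 * (key / keys_.size()) + 1) & (keys_.size() - 1);
    }

    std::vector<unsigned> keys_;
    std::vector<bool> used_;
};
```

With 8 cells, the keys 10, 2 and 18 all start at cell 2 but use jumps of 3, 1 and 5, so after 10 takes cell 2 the key 2 lands in cell 3 and 18 lands in cell 7 instead of queuing up in neighbouring cells.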


Quality of Hashing
• How can we assess the quality of a hash function?
• Load factor: number of stored keys divided by the number of cells, i.e. the expected number of keys per cell
• Another way to assess it:
• How many times do we probe "on average" to find an item?
Quality of Hashing
• Let us have an experiment:
• Pick a hash function
• Insert random numeric strings into a hash map
• Draw the hash map as a picture:
• Each pixel is a cell
• Colored if cell is occupied
• White if cell is empty

Figure: hash map picture for SDBM [1]
Figure: hash map picture for DJB2a [1]
Figure: hash map picture for FNV1 [1]
Figure: hash map picture for FNV1-A [1]
Figure: hash map picture for Murmur2 [1]
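
The experiment can be approximated in a few lines. The sketch below assumes the 32-bit FNV-1a variant, a power-of-two number of cells, and a text rendering ('#' for an occupied cell, '.' for an empty one) in place of pixels. The helper names occupancy, randomNumericStrings and render are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// 32-bit FNV-1a over the bytes of a string.
std::uint32_t fnv1a(const std::string& s) {
    std::uint32_t h = 2166136261u;        // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;                   // FNV prime
    }
    return h;
}

// Marks which cells of a table with `size` cells receive at least one key.
std::vector<bool> occupancy(const std::vector<std::string>& keys,
                            std::size_t size) {
    assert(size != 0 && (size & (size - 1)) == 0);  // power of two
    std::vector<bool> cells(size, false);
    for (const std::string& k : keys)
        cells[fnv1a(k) & (size - 1)] = true;
    return cells;
}

// Generates `count` random numeric strings from a fixed seed.
std::vector<std::string> randomNumericStrings(std::size_t count, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<unsigned> dist(0, 999999999u);
    std::vector<std::string> keys;
    keys.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        keys.push_back(std::to_string(dist(gen)));
    return keys;
}

// Renders the table as rows of `width` characters: '#' occupied, '.' empty.
std::string render(const std::vector<bool>& cells, std::size_t width) {
    assert(width != 0);
    std::string out;
    for (std::size_t i = 0; i < cells.size(); ++i) {
        out += cells[i] ? '#' : '.';
        if ((i + 1) % width == 0) out += '\n';
    }
    return out;
}
```

Printing render(occupancy(randomNumericStrings(n, seed), size), width) for different hash functions gives a rough visual check: an even scatter of '#' suggests a uniform spread, while stripes or blocks suggest clustering.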
Problem Solving

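• Remove duplicates from an (unsorted) vector
• Complexity: O(n) on average (the order of the elements is not preserved)

#include <unordered_set>
#include <vector>
using namespace std;

// Removes duplicates from v (sorted or unsorted); the resulting order is unspecified.
void removeDubFast(vector<int>& v) {
    unordered_set<int> m;
    for (int i : v)        // each insert is O(1) on average
        m.insert(i);
    v.clear();
    for (int i : m)        // copy the unique values back
        v.push_back(i);
}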
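
Because std::unordered_set does not iterate in insertion order, removeDubFast may return the unique values in any order. If the original order matters, one possible variant (a sketch, with the hypothetical name removeDupStable) keeps the first occurrence of each value and compacts the vector in place:

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical order-preserving variant: keeps the first occurrence of each value.
void removeDupStable(std::vector<int>& v) {
    std::unordered_set<int> seen;
    std::size_t write = 0;                   // next position for a kept value
    for (std::size_t read = 0; read < v.size(); ++read) {
        if (seen.insert(v[read]).second) {   // true only for a value not seen before
            v[write++] = v[read];
        }
    }
    v.resize(write);                         // drop the leftover tail
}
```

The average cost is still linear in the size of the vector, since each element triggers one set insertion.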
Real Performance

Figure: performance plot (x-axis from 0 to 966000, y-axis from 0 to 450000000)
Thank You
References

• [1] https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed
