Hash Tables

Dr. Dibakar Saha

Assistant Professor
Department of Computer Applications
National Institute of Technology Raipur
Hash Table - Introduction

❑ A hash table is a generalization of an array.

❑ With an array, we store the element whose key is k at position k of the array.

❑ That means, given a key k, we find the element whose key is k by just looking in the kth position of the array.

This is called direct addressing.
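A minimal Python sketch of direct addressing, assuming integer keys in the range 0 to universe_size - 1. The class name DirectAddressTable and its method names are illustrative choices, not taken from the slides.

```python
class DirectAddressTable:
    """Direct addressing: the element with key k lives at position k."""

    def __init__(self, universe_size):
        # One slot for every possible key 0 .. universe_size - 1.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        # The key itself is the array index; no hash function is involved.
        self.slots[key] = value

    def search(self, key):
        # Returns None when nothing is stored under this key.
        return self.slots[key]

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(15)
table.insert(3, "three")
table.insert(9, "nine")
```

Every operation is a single array access, but the array must have one slot per possible key.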

Hash Table - Direct Addressing

❑ Direct addressing is applicable when we can afford to allocate an array with one position for every possible key.

❑ But if we do not have enough space to allocate a location for each possible key, then we need a mechanism to handle this case.

❑ If we have fewer locations than possible keys, then a simple array implementation is not enough.

In these cases, one option is to use hash tables.

Hash Table

❑ A hash table (or hash map) is a data structure that stores keys and their associated values.

❑ A hash table uses a hash function to compute, from each key, the slot in which its associated value is stored.

❑ We use a hash table when the number of keys actually stored is small relative to the number of possible keys.
Hash Function
❑ The hash function is used to transform the key into the index. Ideally, the hash function should map each possible key to a unique slot index, but it is difficult to achieve in practice.

❑ Given a collection of elements, a hash function that maps each item into a unique slot is referred to as a perfect hash function.

❑ If we know the elements and the collection will never change, then it is possible to construct a perfect hash function.

❑ Unfortunately, given an arbitrary collection of elements, there is no systematic way to construct a perfect hash function.
Example

Key values: 3, 2, 9, 6, 11, 13, 7, 12

[Figure: an empty hash table of size M=15, with slots 0 to 14]

Example

Key values: 3, 2, 9, 6, 11, 13, 7, 12

We can use the key as the index of the hash table.

[Figure: hash table of size M=15, with slots 0 to 14]
Example

Key values: 3, 2, 9, 6, 11, 13, 7, 12

[Figure: hash table of size M=15 with each key stored at the slot equal to its value; slots 2, 3, 6, 7, 9, 11, 12 and 13 are occupied]

Now the question is what to do if a new key value, 20, appears: it does not fit in a table of size 15 (size issue).

Or what if the same key re-appears? For example, 6: its slot is already occupied.
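The two problems in this example can be reproduced with a short Python sketch. The helper name can_store_directly is hypothetical. Folding the key with a modulo is only one possible remedy, shown here because the slides use h(k)=k%M in the Collisions section.

```python
M = 15  # table size from the example
keys = [3, 2, 9, 6, 11, 13, 7, 12]

table = [None] * M
for key in keys:
    table[key] = key  # the key itself is used as the index


def can_store_directly(table, key):
    """Return True if key is a valid index and its slot is still empty."""
    return 0 <= key < len(table) and table[key] is None


size_issue = not can_store_directly(table, 20)       # 20 lies outside 0..14
already_occupied = not can_store_directly(table, 6)  # slot 6 already holds 6
folded_slot = 20 % M  # a modulo folds key 20 back into the table range
```

Here folded_slot is 5, which happens to be free in this table. In general, a folded key can still land on an occupied slot.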
Hash Function
❑ There is no systematic way to construct a perfect hash function.

❑ One way to always have a perfect hash function is to increase the size of the hash table so that each possible value in the element range can be accommodated.

❑ This guarantees that each element will have a unique slot.

❑ Although this is practical for small numbers of elements, it is not feasible when the number of possible elements is large.
Characteristics of Good Hash Functions

A good hash function should have the following characteristics:

❑ Minimize collision

❑ Be easy and quick to compute

❑ Distribute key values evenly in the hash table

❑ Use all the information provided in the key

❑ Have a high load factor for a given set of keys

Load Factor

❑ The load factor of a non-empty hash table is the number of items stored in the table divided by the size of the table.

❑ This is the decision parameter used when we want to rehash or expand the existing hash table entries.

❑ This also helps us in determining the efficiency of the hashing function.

❑ That means, it tells whether the hash function is distributing the keys uniformly or not.

Load Factor = (Number of elements in the hash table) / (Hash table size)
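As a rough Python illustration of the formula, and of how the load factor can serve as the decision parameter for rehashing. The function names and the threshold of 0.75 are assumptions made for this sketch; the slides do not state a threshold.

```python
def load_factor(num_items, table_size):
    """Number of stored items divided by the table size."""
    if table_size <= 0:
        raise ValueError("table_size must be positive")
    return num_items / table_size


def needs_rehash(num_items, table_size, threshold=0.75):
    # threshold is an assumed value, not one given in the slides.
    return load_factor(num_items, table_size) > threshold


# 8 keys stored in a table of size 15, as in the earlier example.
example = load_factor(8, 15)
```

With 8 keys in 15 slots the load factor is 8/15, about 0.53, so needs_rehash(8, 15) is False under the assumed threshold.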
How to Choose Hash Function?
The basic problems associated with the creation of hash tables are:

• An efficient hash function should be designed so that it distributes the index values of inserted objects uniformly across the table.

• An efficient collision resolution algorithm should be designed so that it computes an alternative index for a key whose hash index corresponds to a location previously inserted in the hash table.

• We must choose a hash function which can be calculated quickly, returns values within the range of locations in our table, and minimizes collisions.
Collisions

Ideally, a hash function maps each key to a different address, but in practice it is not possible to create such a hash function. The resulting problem is called a collision.

A collision is the condition where two records hash to the same location.

Example: hash function h(k)=k%M with a hash table of size M=10

h(6)=6%10=6
h(26)=26%10=6

Keys 6 and 26 both map to location 6, so they collide.
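The collision in this example can be checked with a few lines of Python. The helper find_collisions is a hypothetical name; it simply groups keys by h(k)=k%M and reports the slots that receive more than one key.

```python
from collections import defaultdict


def find_collisions(keys, M):
    """Group keys by h(k) = k % M and keep only slots hit more than once."""
    slots = defaultdict(list)
    for key in keys:
        slots[key % M].append(key)
    return {slot: ks for slot, ks in slots.items() if len(ks) > 1}


collisions = find_collisions([6, 26], 10)
```

For the keys 6 and 26 with M=10, the result is {6: [6, 26]}.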

Collision Resolution Techniques
The process of finding an alternate location is called collision resolution.
Even though hash tables have collision problems, they are more efficient in many cases than other data structures, such as search trees.

❑ There are a number of collision resolution techniques, and the most popular are direct chaining and open addressing.

❑ Direct Chaining: an array of linked lists

    ❑ Separate chaining

❑ Open Addressing: array-based implementation

    ❑ Linear probing (linear search)
    ❑ Quadratic probing (nonlinear search)
    ❑ Double hashing (use two hash functions)
Types of Hashing

Open Hashing: Chaining

Closed Hashing: Linear probing, Quadratic probing, Double hashing
Direct Chaining - Separate Chaining

❑ Collision resolution by chaining combines a linked representation with the hash table.

❑ When two or more records hash to the same location, these records are linked into a singly-linked list called a chain.
Chaining Example
Key values: 3, 2, 9, 6, 11, 13, 7, 12
Hash Function: h(ki)=2ki+3
M=10

Key   h(ki)=2ki+3    h(ki)%M    Location
3     2*3+3=9        9%10=9     9
2     2*2+3=7        7%10=7     7
9     2*9+3=21       21%10=1    1
6     2*6+3=15       15%10=5    5
11    2*11+3=25      25%10=5    5
13    2*13+3=29      29%10=9    9
7     2*7+3=17       17%10=7    7
12    2*12+3=27      27%10=7    7

Resulting hash table (each slot holds a chain):

Slot  Chain
0
1     9
2
3
4
5     6 -> 11
6
7     2 -> 7 -> 12
8
9     3 -> 13
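The chaining example can be reproduced with a small Python sketch. It assumes integer keys and uses Python lists in place of singly-linked lists, which is a simplification. The class name ChainedHashTable and its methods are illustrative, not from the slides.

```python
class ChainedHashTable:
    """Separate chaining with h(k) = (2k + 3) % M, as in the example."""

    def __init__(self, size=10):
        self.size = size
        # One chain per slot; a Python list stands in for a linked list.
        self.buckets = [[] for _ in range(size)]

    def _hash(self, key):
        return (2 * key + 3) % self.size

    def insert(self, key):
        chain = self.buckets[self._hash(key)]
        if key not in chain:      # ignore a key that re-appears
            chain.append(key)     # colliding keys extend the chain

    def search(self, key):
        return key in self.buckets[self._hash(key)]


table = ChainedHashTable(10)
for key in [3, 2, 9, 6, 11, 13, 7, 12]:
    table.insert(key)
```

After the loop, slot 1 holds [9], slot 5 holds [6, 11], slot 7 holds [2, 7, 12] and slot 9 holds [3, 13]; all other slots are empty.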
Open Addressing
In open addressing, all keys are stored in the hash table itself. This approach is also known as closed hashing.

A collision is resolved by probing, that is, by examining other locations of the table until a free one is found.

❑ Open Addressing: array-based implementation
❑ Linear probing (linear search)
❑ Quadratic probing (nonlinear search)
❑ Double hashing (use two hash functions)
Linear Probing
❑ The interval between probes is fixed at 1.
❑ In linear probing, we search the hash table sequentially, starting from the original hash location.
❑ If a location is occupied, we check the next location.
❑ We wrap around from the last table location to the first table location if necessary.

❑ The function for rehashing is the following, where n is the location just probed:

rehash(key) = (n + 1) % M
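A minimal Python sketch of insertion with linear probing. It assumes the initial hash h(k)=k%M from the Collisions section, and the keys 6, 26, 36, 9 and 19 are made up for illustration. The function name linear_probe_insert is hypothetical.

```python
def linear_probe_insert(table, key):
    """Insert key using linear probing; return the slot that was used."""
    M = len(table)
    n = key % M                  # original hash location (assumed h(k) = k % M)
    for _ in range(M):           # at most M probes before giving up
        if table[n] is None or table[n] == key:
            table[n] = key
            return n
        n = (n + 1) % M          # rehash: next location, wrapping around
    raise RuntimeError("hash table is full")


table = [None] * 10
slots = [linear_probe_insert(table, k) for k in [6, 26, 36, 9, 19]]
```

Keys 6, 26 and 36 all hash to location 6 and end up in slots 6, 7 and 8. Key 19 hashes to location 9, which is taken by key 9, so it wraps around to slot 0.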
Linear Probing
❑ One of the problems with linear probing is that table items tend to cluster together in the hash table.

❑ This means that the table contains groups of consecutively occupied locations, called clusters. Clusters can get close to one another and merge into a larger cluster.

❑ Thus, one part of the table might be quite dense, even though another part has relatively few items.

❑ Clustering causes long probe searches and therefore decreases the overall efficiency.
Quadratic Probing
The interval between probes increases linearly with the number of probes made so far, so the probed indices are described by a quadratic function.

In quadratic probing, we start from the original hash location i.

If a location is occupied, we check the locations i + 1², i + 2², i + 3², i + 4², ...

The function for rehashing is the following, where n is the original hash location and k is the probe number:

rehash(key) = (n + k²) % M
Quadratic Probing
Example: Let us assume that the table size is 11 (0..10)

Hash Function: h(key) = key mod 11

Keys to insert: 31, 19, 2, 13, 25, 24, 21, 9
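The example can be worked through with a short Python sketch. It assumes the keys are inserted in the order listed and that the k-th probe examines location (h(key) + k²) mod 11. The function name quadratic_probe_insert is hypothetical.

```python
def quadratic_probe_insert(table, key):
    """Insert key using quadratic probing; return the slot that was used."""
    M = len(table)
    home = key % M                      # h(key) = key mod M
    for k in range(M):                  # k = 0 probes the home location
        slot = (home + k * k) % M       # i, i + 1^2, i + 2^2, ... (mod M)
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found along the probe sequence")


table = [None] * 11
slots = [quadratic_probe_insert(table, k)
         for k in [31, 19, 2, 13, 25, 24, 21, 9]]
```

Under these assumptions the keys land in slots 9, 8, 2, 3, 4, 6, 10 and 7 respectively. For instance, key 9 hashes to location 9 (occupied by 31), then tries 10 (occupied by 21) and 2 (occupied by 2), and finally settles in slot 7.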
Double Hashing
❑ The interval between probes is computed by another hash function.

❑ Double hashing reduces clustering more effectively than linear or quadratic probing.

❑ The increments for the probing sequence are computed by using a second hash function.

❑ The second hash function h2 should satisfy:

h2(key) ≠ 0 and h2 ≠ h1

➢ We first probe the location h1(key).

➢ If the location is occupied, we probe the locations h1(key) + h2(key), h1(key) + 2 * h2(key), ..., each taken modulo the table size M.
Double Hashing
Table size is M=13 (slots 0..12)

Hash Functions: h1(key)=key % 13 and h2(key)=7-(key % 7)
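A Python sketch of double hashing with these two hash functions. The slide text does not list the keys for this example, so the keys 18, 41, 22, 44, 59, 32, 31 and 73 below are an assumption chosen only for illustration. The function name double_hash_insert is hypothetical.

```python
def double_hash_insert(table, key):
    """Insert key using double hashing; return the slot that was used."""
    M = len(table)                # 13 in this example
    start = key % M               # h1(key) = key % 13
    step = 7 - (key % 7)          # h2(key) = 7 - (key % 7), always in 1..7
    for j in range(M):
        slot = (start + j * step) % M   # h1, h1 + h2, h1 + 2*h2, ... (mod M)
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 13
slots = [double_hash_insert(table, k)
         for k in [18, 41, 22, 44, 59, 32, 31, 73]]
```

With these assumed keys, 44 collides with 18 at location 5 and moves by h2(44)=5 to slot 10. Key 31 also starts at location 5, moves by h2(31)=4 to the occupied slot 9, and then to slot 0. Because h2 never returns 0 and M=13 is prime, the probe sequence can reach every slot.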
Comparisons: Open Addressing Method
Thank you!
