Hash Table

Hash_table DSA by amruta_navale


A hash table is a data structure that stores data in an associative manner: each value is stored under a key. The data is held in an array, and each value is placed at an index computed from its key. Access is very fast when the index of the desired data is known.

As a result, insertion and search take constant time on average, regardless of how much data is stored. A hash table uses an array as its storage medium and a hashing technique to compute the index where an element is to be inserted or located.

Hashing
Hashing is a technique to convert a range of key values into a range of indexes of an array.
We use the modulo operator to map key values to array indexes. Consider a hash table of
size 20 in which the following items are to be stored. Items are in (key, value) format.

Hash Function

(1,20)
(2,70)
(42,80)
(4,25)
(12,44)
(14,32)
(17,11)
(13,78)
(37,98)
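
As a rough sketch of how these keys map to array indexes, assuming the hash function is key % 20 (the table size given above). The names `SIZE`, `hash_index` and `print_home_indexes` are illustrative and not part of the slides.

```c
#include <stdio.h>

#define SIZE 20  /* table size from the example above */

/* Division-method hash: maps a non-negative key to an index in [0, SIZE). */
int hash_index(int key) {
    return key % SIZE;
}

/* Prints the index that each example key hashes to. */
void print_home_indexes(void) {
    int keys[] = {1, 2, 42, 4, 12, 14, 17, 13, 37};
    int n = (int)(sizeof keys / sizeof keys[0]);
    for (int i = 0; i < n; ++i)
        printf("key %2d -> index %2d\n", keys[i], hash_index(keys[i]));
}
```

Under this assumption, keys 2 and 42 both map to index 2, and keys 17 and 37 both map to index 17, so two of the items need a collision-handling step.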

Linear Probing
As the items above show, the hash function may produce an index of the array that is already
in use (for example, keys 2 and 42 both map to index 2 in a table of size 20). In such a case,
we search for the next empty location by examining each following cell in turn until we find
an empty one. This technique is called linear probing.

Basic Operations
Following are the basic operations of a hash table.

Search − Searches an element in a hash table.

Insert − Inserts an element in a hash table.

Delete − Deletes an element from a hash table.

DataItem
Define a data item having some data and key, based on which the search is to be conducted in a
hash table.

struct DataItem {
    int data;
    int key;
};

Hash Method
Define a hashing method to compute the hash code of the key of the data item.

int hashCode(int key) {
    // SIZE is the number of slots in the table (20 in the example above)
    return key % SIZE;
}

Search Operation
Whenever an element is to be searched, compute the hash code of the key passed and
locate the element using that hash code as the index in the array. If the element is not found
at the computed index, use linear probing to check the following cells until the element or
an empty cell is found.

struct DataItem *search(int key) {
    // get the hash
    int hashIndex = hashCode(key);
    int probes = 0;

    // move through the array until an empty cell is found;
    // stop after SIZE probes so a full table cannot loop forever
    while (hashArray[hashIndex] != NULL && probes < SIZE) {
        if (hashArray[hashIndex]->key == key)
            return hashArray[hashIndex];

        // go to next cell
        ++hashIndex;

        // wrap around the table
        hashIndex %= SIZE;
        ++probes;
    }

    return NULL;
}

What is a Hash function?
A hash function maps a key to a value using a mathematical formula. The result of the hash
function is referred to as a hash value, or simply a hash. The hash value is a representation of
the original string of characters but is usually smaller than the original.
For example: Consider an array as a map where the key is the index and the value is the element
at that index. For an array A, if index i is treated as the key, we can find the value by
simply looking up A[i].

Types of Hash functions:
Many hash functions exist for numeric or alphanumeric keys. Four common ones are:
1. Division Method
2. Mid Square Method
3. Folding Method
4. Multiplication Method
A good hash function should have the following properties:


1. Efficiently computable.
2. Should uniformly distribute the keys (each table position is equally likely for
each key).
3. Should minimize collisions.
4. Should have a low load factor (number of items in the table divided by the size
of the table).
Complexity of calculating hash value using the hash function
• Time complexity: O(n), where n is the length of the key (for example, the number of characters in a string key)
• Space complexity: O(1)

Problem with Hashing
Suppose the hash function is the sum of the letters of a string. Examined closely, the problem
is easy to see: the hash function generates the same hash value for different strings.
For example: "ab" and "ba" both have the same hash value, and "cd" and "be" also generate
the same hash value. This is known as a collision, and it creates problems in searching,
insertion, deletion, and updating of values.
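
A small sketch of such a letter-sum hash makes the collision concrete. It assumes lowercase ASCII input and the letter values a = 1, b = 2, and so on; the name `letter_sum_hash` is illustrative.

```c
#include <stddef.h>

/* Sums letter positions (a = 1, b = 2, ...) of a lowercase ASCII string.
   Illustrative only: any reordering of the same letters collides. */
unsigned letter_sum_hash(const char *s) {
    unsigned sum = 0;
    for (size_t i = 0; s[i] != '\0'; ++i)
        sum += (unsigned)(s[i] - 'a' + 1);
    return sum;
}
```

With these assumed letter values, "ab" and "ba" both hash to 3, and "cd" and "be" both hash to 7.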
What is collision?
The hashing process generates a small number for a big key, so there is a possibility that two
keys could produce the same value. A collision is the situation where a newly inserted key maps
to an already occupied slot; it must be handled using some collision handling technique.

[Figure: What is collision in hashing]
How to handle Collisions?
There are mainly two methods to handle collisions:
1. Separate Chaining
2. Open Addressing

1) Separate Chaining
The idea is to make each cell of the hash table point to a linked list of records that have the
same hash function value. Chaining is simple but requires additional memory outside the
table.
Example: We are given a hash function and must insert some elements into the hash table,
using separate chaining as the collision resolution technique.
Hash function = key % 5,
Elements = 12, 15, 22, 25 and 37.

Let's solve the above problem step by step:
• Step 1: First draw the empty hash table, which has a possible range of hash
values from 0 to 4 according to the hash function provided.

• Step 2: Now insert all the keys in the hash table one by one. The first key to be
inserted is 12. It maps to bucket number 2 because 12%5=2.

Insert 12

• Step 3: Now the next key is 22. It will map to bucket number 2 because 22%5=2.
But bucket 2 is already occupied by key 12, so separate chaining handles the collision by
linking 22 into the list at bucket 2.

Insert 22
• Step 4: The next key is 15. It will map to slot number 0 because 15%5=0.

• Step 5: Now the next key is 25. Its bucket number will be 25%5=0. But bucket 0 is already
occupied by key 15. So the separate chaining method will again handle the collision by adding
25 to the linked list at bucket 0.

Insert 25
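
The steps above can be sketched in C as follows. This is a minimal illustration that assumes non-negative integer keys and appends each new key to the tail of its chain; `BUCKETS`, `struct Node`, `chain_insert` and `chain_contains` are hypothetical names.

```c
#include <stdlib.h>

#define BUCKETS 5  /* hash function from the example: key % 5 */

struct Node {
    int key;
    struct Node *next;
};

struct Node *table[BUCKETS];  /* every bucket starts as an empty (NULL) chain */

/* Appends key to the tail of its bucket's chain.
   Returns 0 on success, -1 if allocation fails. */
int chain_insert(int key) {
    struct Node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->key = key;
    node->next = NULL;

    struct Node **link = &table[key % BUCKETS];
    while (*link != NULL)          /* walk to the end of the chain */
        link = &(*link)->next;
    *link = node;
    return 0;
}

/* Returns 1 if key is stored in the table, 0 otherwise. */
int chain_contains(int key) {
    for (struct Node *n = table[key % BUCKETS]; n != NULL; n = n->next)
        if (n->key == key)
            return 1;
    return 0;
}
```

Inserting 12, 22, 15 and 25 in that order leaves bucket 2 holding 12 -> 22 and bucket 0 holding 15 -> 25. The remaining element, 37, also maps to bucket 2 (37%5=2) and would be appended after 22.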

2.a) Linear Probing
Linear probing is an open addressing technique. The hash table is searched sequentially,
starting from the original location given by the hash. If that location is already occupied,
we check the next location.
Algorithm:
1. Calculate the hash key, i.e. key = data % size
2. Check if hashTable[key] is empty
• if so, store the value directly by hashTable[key] = data
3. If the hash index already has some value then
• check the next index using key = (key+1) % size
4. If hashTable[key] at the next index is empty, store the value there. Otherwise
try the following index.
5. Repeat the above process until an empty slot is found.

Example: Let us consider a simple hash function, "key mod 5", and the sequence of keys
50, 70, 76, 85, 93 to be inserted.
• Step 1: First draw the empty hash table which will have a possible range of hash
values from 0 to 4 according to the hash function provided.

Hash table
• Step 2: Now insert all the keys in the hash table one by one. The first key is 50. It will
map to slot number 0 because 50%5=0. So insert it into slot number 0.

• Step 3: The next key is 70. It will map to slot number 0 because 70%5=0, but 50 is
already at slot number 0, so search for the next empty slot, which is slot number 1, and insert it there.

Insert 70 into hash table


• Step 4: The next key is 76. It will map to slot number 1 because 76%5=1, but 70 is
already at slot number 1, so search for the next empty slot, which is slot number 2, and insert it there.

Insert 76 into hash table
• Step 5: The next key is 85. It will map to slot number 0 because 85%5=0, but 50 is
already at slot number 0, so search for the next empty slot. Slots 1 and 2 are also occupied,
so insert it into slot number 3.

Insert 85 into hash table


• Step 6: The next key is 93. It will map to slot number 3 because 93%5=3, but 85 is
already at slot number 3, so search for the next empty slot and insert it into slot
number 4.

Insert 93 into hash table
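
The worked example can be reproduced with a short C sketch. It assumes non-negative keys so that -1 can mark an empty slot; `SIZE`, `EMPTY` and `lp_insert` are illustrative names.

```c
#define SIZE 5       /* table size from the example: key mod 5 */
#define EMPTY (-1)   /* assumed sentinel: keys are non-negative */

int table[SIZE] = {EMPTY, EMPTY, EMPTY, EMPTY, EMPTY};

/* Inserts key with linear probing.
   Returns the slot used, or -1 if the table is full. */
int lp_insert(int key) {
    int index = key % SIZE;
    for (int probes = 0; probes < SIZE; ++probes) {
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
        index = (index + 1) % SIZE;   /* try the next slot, wrapping around */
    }
    return -1;
}
```

Inserting 50, 70, 76, 85 and 93 in that order places them in slots 0, 1, 2, 3 and 4, matching the steps above. Any further insert returns -1 because the table is full.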


Quadratic Probing
Quadratic probing is an open addressing scheme in computer programming for resolving hash
collisions in hash tables. Quadratic probing operates by taking the original hash index and
adding successive values of an arbitrary quadratic polynomial until an open slot is found.
An example sequence using quadratic probing is:
H + 1², H + 2², H + 3², H + 4², ..., H + k²
In this method we look for the i²-th probe (slot) in the i-th iteration, for i = 0, 1, ..., n - 1.
(Some sources call this the mid-square method, but that name more commonly refers to the hash
function of the same name.) We always start from the original hash location. Only if that
location is occupied do we check the other slots.
Let hash(x) be the slot index computed using the hash function and n be the size of the hash
table.
If the slot hash(x) % n is full, then we try (hash(x) + 1²) % n.
If (hash(x) + 1²) % n is also full, then we try (hash(x) + 2²) % n.
If (hash(x) + 2²) % n is also full, then we try (hash(x) + 3²) % n.
This process is repeated for successive values of i until an empty slot is found.

Example: Let us consider table size = 7, hash function Hash(x) = x % 7 and collision resolution
strategy f(i) = i². Insert 22, 30, and 50.
• Step 1: Create a table of size 7.

•Step 2 – Insert 22 and 30


•Hash(22) = 22 % 7 = 1, Since the cell at index 1 is empty, we can easily insert 22 at slot 1.
•Hash(30) = 30 % 7 = 2, Since the cell at index 2 is empty, we can easily insert 30 at slot 2.

Insert key 22 and 30 in the hash table
• Step 3: Inserting 50
• Hash(50) = 50 % 7 = 1
• In our hash table slot 1 is already occupied. So, we will search for slot 1+1², i.e.
1+1 = 2,
• Again slot 2 is found occupied, so we will search for cell 1+2², i.e. 1+4 = 5,
• Now, cell 5 is not occupied so we will place 50 in slot 5.

Insert key 50 in the hash table
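
A minimal C sketch of this example, assuming non-negative keys and -1 as the empty marker; `qp_insert` and the other names are illustrative. It gives up after SIZE probes, since quadratic probing is not guaranteed to reach every slot.

```c
#define SIZE 7       /* table size from the example */
#define EMPTY (-1)   /* assumed sentinel: keys are non-negative */

int table[SIZE] = {EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY};

/* Inserts key with quadratic probing: slot (hash + i*i) % SIZE on probe i.
   Returns the slot used, or -1 if no free slot is found within SIZE probes. */
int qp_insert(int key) {
    int home = key % SIZE;
    for (int i = 0; i < SIZE; ++i) {
        int index = (home + i * i) % SIZE;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;
}
```

Inserting 22, 30 and 50 in that order places them in slots 1, 2 and 5, as in the steps above.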

Double Hashing
Double hashing is a collision resolution technique for open-addressed hash tables. Double
hashing makes use of two hash functions:
• The first hash function is h1(k), which takes the key and gives a location in the
hash table. If that location is empty, we can simply place our key there.
• If the location is occupied (a collision), we use the secondary hash function
h2(k) in combination with the first hash function h1(k) to find a new location in the hash
table.
This combination of hash functions is of the form
h(k, i) = (h1(k) + i * h2(k)) % n

where
• i is a non-negative integer that indicates the collision number,
• k = the element/key being hashed,
• n = the hash table size.
Complexity of the double hashing algorithm:
Time complexity: O(n) in the worst case
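
The formula can be tried out with a short C sketch. The slides do not specify h1 and h2, so the choices below (h1(k) = k % 7 and h2(k) = 5 - (k % 5)) are assumptions, as are the sample keys and all names.

```c
#define SIZE 7       /* table size n; prime, so every step size visits all slots */
#define PRIME 5      /* assumed prime smaller than SIZE, used by h2 */
#define EMPTY (-1)   /* assumed sentinel: keys are non-negative */

int table[SIZE] = {EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY};

/* Assumed first hash: the home slot of the key. */
int h1(int key) { return key % SIZE; }

/* Assumed second hash; it never returns 0, so the probe always moves. */
int h2(int key) { return PRIME - (key % PRIME); }

/* Inserts key using h(k, i) = (h1(k) + i * h2(k)) % SIZE.
   Returns the slot used, or -1 if no free slot is found. */
int dh_insert(int key) {
    for (int i = 0; i < SIZE; ++i) {
        int index = (h1(key) + i * h2(key)) % SIZE;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;
}
```

With these assumed functions, inserting 27, 43, 692 and 72 places them in slots 6, 1, 2 and 5: 692 collides with 27 at slot 6 and moves by h2(692) = 3, and 72 collides at slot 2 and moves by h2(72) = 3.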

What is meant by Load Factor in Hashing?
The load factor of a hash table is the number of items the table contains divided by the
size of the table. The load factor is the deciding parameter when we want to rehash the
table or add more elements to it.
It tells us how full the table is and therefore how likely collisions are. A high load factor
slows operations down even when the hash function distributes the keys uniformly.
Load Factor = Total elements in hash table / Size of hash table

What is Rehashing?
As the name suggests, rehashing means hashing again. When the load factor rises above a
predefined threshold (0.75 is a common default, used for example by Java's HashMap), the
time complexity of operations increases. To overcome this, the size of the array is increased
(typically doubled) and all the values are hashed again and stored in the new, larger array
to maintain a low load factor and low complexity.
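
Rehashing might be sketched as follows. This is an illustration only: it assumes non-negative integer keys, linear probing, the 0.75 threshold mentioned above, and a table that doubles in size; `struct Table` and the function names are hypothetical.

```c
#include <stdlib.h>

#define EMPTY (-1)      /* assumed sentinel: keys are non-negative */
#define MAX_LOAD 0.75   /* threshold mentioned in the text */

struct Table {
    int *slots;
    int capacity;
    int count;
};

/* Allocates a table with every slot empty. Returns 0 on success, -1 on failure. */
int table_init(struct Table *t, int capacity) {
    t->slots = malloc((size_t)capacity * sizeof *t->slots);
    if (t->slots == NULL)
        return -1;
    for (int i = 0; i < capacity; ++i)
        t->slots[i] = EMPTY;
    t->capacity = capacity;
    t->count = 0;
    return 0;
}

/* Places key by linear probing; the caller guarantees a free slot exists. */
static void place(struct Table *t, int key) {
    int index = key % t->capacity;
    while (t->slots[index] != EMPTY)
        index = (index + 1) % t->capacity;
    t->slots[index] = key;
    t->count++;
}

/* Doubles the capacity and hashes every stored key again. Returns 0 on success. */
static int rehash(struct Table *t) {
    struct Table bigger;
    if (table_init(&bigger, t->capacity * 2) != 0)
        return -1;
    for (int i = 0; i < t->capacity; ++i)
        if (t->slots[i] != EMPTY)
            place(&bigger, t->slots[i]);
    free(t->slots);
    *t = bigger;
    return 0;
}

/* Inserts key, rehashing first if the insert would push the load factor above MAX_LOAD. */
int table_insert(struct Table *t, int key) {
    if ((double)(t->count + 1) / t->capacity > MAX_LOAD && rehash(t) != 0)
        return -1;
    place(t, key);
    return 0;
}
```

Starting from a capacity of 4, three inserts bring the load factor to exactly 0.75 without rehashing; a fourth insert would exceed it, so the table grows to 8 slots before the key is placed.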

Applications of Hash Data structure
•Hash is used in databases for indexing.
•Hash is used in disk-based data structures.
•In some programming languages, such as Python and JavaScript, hashing is used to implement objects.

Real-Time Applications of Hash Data structure


•Hash is used for cache mapping for fast access to the data.
•Hash can be used for password verification.
•Hash is used in cryptography as a message digest.
•Rabin-Karp algorithm for pattern matching in a string.
•Calculating the number of different substrings of a string.

Advantages of Hash Data structure


•Hash provides better synchronization than other data structures.
•Hash tables are more efficient on average than search trees and many other data structures for lookups.
•Hash provides constant time for searching, insertion, and deletion operations on average.

Disadvantages of Hash Data structure


•Hash is inefficient when there are many collisions.
•Hash collisions cannot practically be avoided for a large set of possible keys.
•Some hash table implementations do not allow null values (for example, Java's Hashtable).

Coalesced hashing
Coalesced hashing is a collision resolution technique for data of a fixed size. It combines
separate chaining and open addressing: it uses open addressing (linear probing) to find the
first empty place for a colliding element, searching from the bottom of the hash table, and
separate chaining to link the colliding elements to each other through pointers.

The hash function used is h = (key) % (total number of keys). Inside the hash table, each node
has three fields:

h(key): The value of the hash function for a key.
Data: The key itself.
Next: The link to the next colliding element.
The basic operations of coalesced hashing are:

INSERT(key): Inserts the key at the index given by its hash value if that position is empty.
Otherwise the key is inserted in the first empty place from the bottom of the hash table, and
the address of this place is stored in the NEXT field of the last node of the chain (explained
in the example below).
DELETE(key): Deletes the key if it is present. If the node to be deleted contains the
address of another node in the hash table, that address is copied into the NEXT field of the
node pointing to the deleted node.
SEARCH(key): Returns True if the key is present, otherwise returns False.
Example:

n = 10
Input : {20, 35, 16, 40, 45, 25, 32, 37, 22, 55}
Hash function

h(key) = key%10
Steps:

i) Initially an empty hash table is created with every NEXT field initialised to NULL and h(key)
values ranging from 0 to 9.
ii) Start by inserting 20: h(20)=0 and index 0 is empty, so we insert 20 at index 0.
iii) The next element is 35: h(35)=5 and index 5 is empty, so we insert 35 there.
iv) Next is 16: h(16)=6, which is empty, so 16 is inserted at index 6.
v) Now we insert 40: h(40)=0, which is already occupied, so we search for the first empty
block from the bottom and insert it there, i.e. at index 9. The address of this newly inserted
node (by address we mean the index of a node), i.e. 9, is stored in the NEXT field of the node
at index 0.
vi) To insert 45: h(45)=5, which is occupied, so again we search for the first empty block from
the bottom, i.e. index 8, and store the address of the new node (8) in the NEXT field of the
node at index 5, i.e. the node with key=35.

vii) To insert 25: h(25)=5 is occupied, so we search for the first empty block from the bottom,
i.e. index 7, and insert 25 there. Note that the address of this new node cannot be stored in
the node at index 5, which already points to another node. Instead we follow the chain from
the node at index 5 until we reach a node with NULL in its NEXT field, and store the address
there: from index 5 we go to index 8, whose NEXT field is NULL, so we store the address of the
new node (7) in the NEXT field of the node at index 8.

viii) To insert 32: h(32)=2, which is empty, so insert 32 at index 2.
ix) To insert 37: h(37)=7, which is occupied, so we search for the first free block from the
bottom, which is index 4. Insert 37 at index 4 and store the address of this node in the NEXT
field of the node at index 7.
x) To insert 22: h(22)=2, which is occupied, so insert it at index 3 and store the address of
this node in the NEXT field of the node at index 2.
xi) Finally, to insert 55: h(55)=5, which is occupied, and the only empty space is index 1, so
insert 55 there. To link the new node we again follow the chain starting from the node at
index 5 until we reach NULL in a NEXT field, i.e. index 5 -> index 8 -> index 7 -> index 4,
whose NEXT field is NULL, and we store the address of the new node (1) in the NEXT field of
the node at index 4.
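
The insertion steps above can be sketched in C. This is a simplified illustration that stores index values in the NEXT field and uses -1 for both "no key" and "no next node"; `struct Cell`, `ch_init`, `ch_insert` and `ch_search` are hypothetical names, and keys are assumed to be non-negative.

```c
#define SIZE 10      /* table size from the example: h(key) = key % 10 */
#define EMPTY (-1)   /* assumed sentinel for "no key" and "no next node" */

struct Cell {
    int data;   /* the key itself */
    int next;   /* index of the next colliding element, or EMPTY */
};

struct Cell table[SIZE];

/* Marks every cell as unused with no successor. */
void ch_init(void) {
    for (int i = 0; i < SIZE; ++i) {
        table[i].data = EMPTY;
        table[i].next = EMPTY;
    }
}

/* Inserts key and returns the index used, or -1 if the table is full. */
int ch_insert(int key) {
    int home = key % SIZE;
    if (table[home].data == EMPTY) {
        table[home].data = key;
        return home;
    }
    /* find the first empty cell, searching from the bottom of the table */
    int free_slot = SIZE - 1;
    while (free_slot >= 0 && table[free_slot].data != EMPTY)
        --free_slot;
    if (free_slot < 0)
        return -1;
    table[free_slot].data = key;

    /* follow the chain from the home cell and link the new cell at its end */
    int tail = home;
    while (table[tail].next != EMPTY)
        tail = table[tail].next;
    table[tail].next = free_slot;
    return free_slot;
}

/* Returns 1 if key is reachable from its home cell, 0 otherwise. */
int ch_search(int key) {
    for (int i = key % SIZE; i != EMPTY; i = table[i].next)
        if (table[i].data == key)
            return 1;
    return 0;
}
```

After `ch_init()`, inserting 20, 35, 16, 40, 45, 25, 32, 37, 22 and 55 in that order uses indexes 0, 5, 6, 9, 8, 7, 2, 4, 3 and 1, and the chain starting at index 5 runs 5 -> 8 -> 7 -> 4 -> 1, as described in the steps.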

The deletion process is simple. For example:

Case 1:
To delete key=37, first search for 37. If it is present, check
whether its NEXT field contains an address and whether the node is
itself pointed to by some other node (here the node with key=25).
If so, copy the address in the NEXT field of 37 to the NEXT field
of the node pointing to 37 (key=25). Then reset the NEXT field of
37 to NULL and erase the key 37.

Hashing with Chaining in Data Structure
In this section we will see what hashing with chaining is. Chaining is one collision
resolution technique. We cannot avoid collisions, but we can try to reduce them, and we can
store multiple elements for the same hash value.

For this technique, suppose our hash function h(x) ranges from 0 to 6. With more than 7 elements,
some elements must be placed in the same slot. For that we create a list at each slot to store
them. Each new element is added at the beginning of the list, so insertion takes O(1) time.

Let us see the following example to get a better idea. Suppose we have the elements {15, 47, 23,
34, 85, 97, 65, 89, 70} and our hash function is h(x) = x mod 7.

The hash values will be h(15) = 1, h(47) = 5, h(23) = 2, h(34) = 6, h(85) = 1, h(97) = 6,
h(65) = 2, h(89) = 5 and h(70) = 0.


The Hashing with chaining will be like −
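
The chained table for this example can be sketched in C, assuming non-negative keys and insertion at the head of each list as described above. `BUCKETS`, `struct Node`, `buckets` and `push_front` are illustrative names.

```c
#include <stdlib.h>

#define BUCKETS 7  /* h(x) = x mod 7 */

struct Node {
    int key;
    struct Node *next;
};

struct Node *buckets[BUCKETS];  /* every chain starts out empty (NULL) */

/* Pushes key onto the front of its chain, so insertion takes O(1) time.
   Returns 0 on success, -1 if allocation fails. */
int push_front(int key) {
    struct Node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->key = key;
    node->next = buckets[key % BUCKETS];
    buckets[key % BUCKETS] = node;
    return 0;
}
```

Inserting 15, 47, 23, 34, 85, 97, 65, 89 and 70 in that order gives the chains 0: 70, 1: 85 -> 15, 2: 65 -> 23, 5: 89 -> 47 and 6: 97 -> 34, while slots 3 and 4 stay empty. Later keys appear first because each insert goes to the head of the list.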

Separate Chaining:
With separate chaining, each slot of the array holds a chain, which is a linked list.
Separate chaining is one of the most popular and commonly used methods for handling
collisions.
When several elements hash to the same slot index, they are added to that slot's chain,
a singly linked list.
To search for a key K, we traverse the linked list linearly. If the key of any entry
equals K, we have found our entry. If we reach the end of the linked list without finding
it, the entry does not exist.
In short, if two different entries have the same hash value, we store them both in the
same linked list, one after the other.
Let's use "key mod 7" as our simple hash function with the following key values:
50, 700, 76, 85, 92, 73, 101.

Frequently Asked Questions
How is a hash table created?
There are three main steps for creating a hash table. First, a hash function must be
chosen; good hash functions distribute data uniformly and are easy to compute. Second,
keys are hashed and assigned indexes. Third, once a free position has been found for each
key, the data is entered into the array.

What is a hash table, and what is an example of one?


A hash table is a data structure in which data is stored in an array for efficient access.
This makes it a popular choice for data retrieval. One good example is a database of
usernames and passwords.

What is the purpose of a hash table?


The purpose of a hash table is to store information efficiently for later retrieval. By
using a key-value storage method, large amounts of data can be stored while still allowing
for quick retrieval.

What is a hash table key?


A hash table key is what is used for retrieving the value stored in the hash table. The key is
passed through the hash function, which returns the index. The index is the location in
the array where the data is stored.

Difference between separate chaining and open addressing

Separate Chaining | Open Addressing
Keys are stored inside the hash table as well as outside the hash table. | All the keys are stored only inside the hash table. No key is present outside the hash table.
The number of keys to be stored in the hash table can even exceed the size of the hash table. | The number of keys to be stored in the hash table can never exceed the size of the hash table.
Deletion is easier. | Deletion is difficult.
Extra space is required for the pointers to store the keys outside the hash table. | No extra space is required.
Cache performance is poor. This is because of linked lists which store the keys outside the hash table. | Cache performance is better. This is because here no linked lists are used.
Some buckets of the hash table are never used, which leads to wastage of space. | Buckets may be used even if no key maps to those particular buckets.
