Hash

Hashing is a technique that maps data to specific locations in a hash table using a hash function for efficient storage and retrieval. It has components like keys, hash functions, and hash tables, and is used in various applications such as database indexing and password storage. Collision resolution techniques include open hashing and closed hashing, each with its own advantages and disadvantages.

Uploaded by

student09hub

Hashing

Definition: Hashing is a fundamental technique in data structures and algorithms that facilitates
efficient data storage and retrieval by mapping data (keys) to specific locations (indices) in a data
structure called a hash table. This process is achieved through the use of a hash function, which
computes an index into an array of buckets or slots, from which the desired value can be found.

Situations Where Hashing Is Not Used

• When sorted data must be maintained along with search, insert, and delete. A self-balancing BST is used in these cases.

• When strings are keys and operations like prefix search are needed along with search, insert, and delete. A trie is used in these cases.

• When operations like floor and ceiling are needed along with search, insert, and/or delete. A self-balancing BST is used in these cases.

Components of Hashing

There are three major components of hashing:

1. Key: A key can be any value, such as a string or an integer, that is fed as input to the hash function, which determines the index or location where the item is stored in the data structure.

2. Hash Function: Receives the input key and returns the index of an element in an array called a hash table. The index is known as the hash index.

3. Hash Table: A hash table is typically an array of lists. It stores the values corresponding to the keys in an associative manner: each value is placed at the index computed from its key. Ideally each key maps to its own index, although two keys can map to the same index (a collision).

How does Hashing work?

Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to store them in a table.

• Step 1: A hash function (some mathematical formula) is used to calculate the hash value, which acts as the index in the data structure where the value will be stored.

• Step 2: Let’s assign

o “a” = 1,

o “b” = 2, and so on, to all alphabetical characters.

• Step 3: The numerical value of each string is then the sum of the values of its characters:

o “ab” = 1 + 2 = 3,

o “cd” = 3 + 4 = 7,

o “efg” = 5 + 6 + 7 = 18

• Step 4: Now, assume that we have a table of size 7 to store these strings. The hash function used here is the sum of the characters in the key mod the table size. We can compute the location of a string in the array as sum(string) mod 7.

• Step 5: So we then store

o “ab” at index 3 mod 7 = 3,

o “cd” at index 7 mod 7 = 0, and

o “efg” at index 18 mod 7 = 4.
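The five steps above can be sketched in a few lines of Python. This is only an illustration: the function name letter_sum_hash is made up for this example, and it assumes the keys contain lowercase letters a to z only.

```python
def letter_sum_hash(key: str, table_size: int) -> int:
    """Sum the letter values (a=1, b=2, ..., z=26), then reduce mod table_size.

    Assumes key contains only lowercase letters a-z.
    """
    total = sum(ord(ch) - ord("a") + 1 for ch in key)
    return total % table_size


table_size = 7
table = [None] * table_size

# Place each string at the index computed by the hash function.
for s in ["ab", "cd", "efg"]:
    table[letter_sum_hash(s, table_size)] = s

print(table)  # ['cd', None, None, 'ab', 'efg', None, None]
```

Note that this simple function gives the same index to any two strings with the same letters in a different order (for example “ab” and “ba”), which is one reason real hash functions are more elaborate.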

Properties of a Good Hash Function

• Deterministic: The same input should always produce the same output.

• Efficiently Computable: The function should compute the hash value quickly.

• Uniform Distribution: It should distribute the keys uniformly across the hash table to minimize collisions.

• Minimize Collisions: Different keys should ideally hash to different indices.

Collision Resolution Techniques

Collisions occur when two or more keys hash to the same index in a hash table, so every hash table needs a strategy for resolving them. The two primary strategies are open hashing (also known as separate chaining) and closed hashing (also known as open addressing).

1. Open Hashing (Separate Chaining):

In open hashing, each slot in the hash table contains a reference to a collection (commonly a linked list)
of all keys that hash to the same index. When a collision occurs, the new key is added to the collection
at that slot.

Example:

Consider a hash table with 10 slots and a hash function h(k) = k % 10. Inserting keys 12, 22, and 32
would result in:

h(12) = 2

h(22) = 2

h(32) = 2

All three keys hash to index 2. Using separate chaining, index 2 would contain a linked list with
elements [12, 22, 32].
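The example above can be reproduced with a minimal sketch of separate chaining. The class name ChainedHashTable and its methods are hypothetical, and ordinary Python lists stand in for the linked lists to keep the code short.

```python
class ChainedHashTable:
    """Separate chaining: each slot holds a list of all keys hashing to it."""

    def __init__(self, size: int = 10) -> None:
        self.size = size
        self.buckets = [[] for _ in range(size)]  # one empty chain per slot

    def _index(self, key: int) -> int:
        return key % self.size  # h(k) = k % 10 when size is 10

    def insert(self, key: int) -> None:
        bucket = self.buckets[self._index(key)]
        if key not in bucket:  # ignore duplicate keys
            bucket.append(key)

    def contains(self, key: int) -> bool:
        # Only the chain at the key's index has to be searched.
        return key in self.buckets[self._index(key)]


t = ChainedHashTable(10)
for k in (12, 22, 32):
    t.insert(k)

print(t.buckets[2])    # [12, 22, 32]
print(t.contains(22))  # True
print(t.contains(42))  # False
```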

Advantages:

• Simplifies collision resolution by allowing multiple elements at each index.

• The hash table can hold more elements than it has slots, so it handles a growing number of elements with only gradual performance degradation.

Disadvantages:

• Requires additional memory for pointers in the linked lists.

• Performance may degrade if many keys hash to the same index, leading to long chains that must be searched sequentially.

2. Closed Hashing (Open Addressing):

In closed hashing, all keys are stored within the hash table itself. When a collision occurs, the algorithm
probes the table to find the next available slot. Common probing methods include:

Linear Probing: Sequentially checks the next slots, wrapping around at the end of the table, until an empty one is found.

Example:

If slot h(k) = 2 is occupied, check (h(k) + 1) mod m, (h(k) + 2) mod m, etc., where m is the table size.

Quadratic Probing: Probes slots at offsets of 1², 2², 3², etc., from the original hash index.

Example:

If slot h(k) = 2 is occupied, check (h(k) + 1²) mod m, (h(k) + 2²) mod m, etc.

Double Hashing: Uses a second hash function to determine the probe step size.

Example:

If slot h1(k) = 2 is occupied, compute the step size h2(k) and check (h1(k) + 1·h2(k)) mod m, (h1(k) + 2·h2(k)) mod m, etc. The step size h2(k) must never be 0.
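To make the three probing methods concrete, the sketch below lists the first few slots each one would visit. The function names are hypothetical, and the table size of 10 and the step size h2(k) = 3 are assumptions chosen only for illustration.

```python
def linear_probes(home: int, table_size: int, count: int) -> list[int]:
    """Slots visited by linear probing: home, home+1, home+2, ..."""
    return [(home + i) % table_size for i in range(count)]


def quadratic_probes(home: int, table_size: int, count: int) -> list[int]:
    """Slots visited by quadratic probing: home, home+1^2, home+2^2, ..."""
    return [(home + i * i) % table_size for i in range(count)]


def double_hash_probes(home: int, step: int, table_size: int, count: int) -> list[int]:
    """Slots visited by double hashing; step plays the role of h2(k), never 0."""
    return [(home + i * step) % table_size for i in range(count)]


# Home slot h(k) = 2 in a table of size 10 (assumed values).
print(linear_probes(2, 10, 4))          # [2, 3, 4, 5]
print(quadratic_probes(2, 10, 4))       # [2, 3, 6, 1]
print(double_hash_probes(2, 3, 10, 4))  # [2, 5, 8, 1]
```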

Advantages:

• Avoids the need for additional data structures like linked lists.

• Can be more cache-friendly due to contiguous memory usage.

Disadvantages:

• Performance can degrade as the load factor increases, because occupied slots cluster together and probe sequences grow longer.

• Deletion of keys can be complex, often requiring special markers to indicate deleted slots.

Choosing between open and closed hashing depends on factors like memory availability, expected load factor, and performance requirements. Understanding these methods is crucial for designing efficient hash tables and ensuring optimal data retrieval performance.
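The deletion problem in closed hashing is easier to see in code. The following is a minimal sketch, not a production design: it assumes integer keys, h(k) = k % size, linear probing, and no resizing, and the names LinearProbingTable, EMPTY, and DELETED are hypothetical.

```python
EMPTY = object()    # slot has never held a key
DELETED = object()  # special marker: slot held a key that was removed


class LinearProbingTable:
    def __init__(self, size: int = 10) -> None:
        self.size = size
        self.slots = [EMPTY] * size

    def _probe(self, key: int):
        """Yield the slots to examine for key, starting at its home slot."""
        home = key % self.size
        for i in range(self.size):
            yield (home + i) % self.size

    def contains(self, key: int) -> bool:
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:  # a never-used slot ends the search
                return False
            if slot is not DELETED and slot == key:
                return True
        return False

    def insert(self, key: int) -> bool:
        if self.contains(key):
            return True
        for idx in self._probe(key):
            if self.slots[idx] is EMPTY or self.slots[idx] is DELETED:
                self.slots[idx] = key  # deleted slots can be reused
                return True
        return False  # table is full

    def delete(self, key: int) -> bool:
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return False
            if slot is not DELETED and slot == key:
                self.slots[idx] = DELETED  # keep the probe chain intact
                return True
        return False


t = LinearProbingTable(10)
for k in (12, 22, 32):  # all three hash to slot 2, so they fill slots 2, 3, 4
    t.insert(k)
t.delete(22)            # slot 3 becomes a DELETED marker
print(t.contains(32))   # True
```

If delete simply reset slot 3 to EMPTY, the search for 32 would stop at slot 3 and wrongly report that 32 is absent, which is why the marker is needed.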

Applications of Hashing

• Database Indexing: Quickly locate data without searching every row.

• Caches: Implement associative arrays for fast data retrieval.

• Sets: Implement sets that can check for membership efficiently.

• Password Storage: Store a hash of each password instead of the password itself, so that the original passwords are not exposed if the stored data leaks.

Advantages of Hashing

• Fast Data Retrieval: Provides constant time complexity, O(1), on average for search, insert, and delete operations. In the worst case, when many keys collide, these operations can take O(n) time.

• Efficient Memory Usage: Only stores the keys and values actually inserted, rather than reserving a slot for every possible key.

Disadvantages of Hashing

• Collisions: Handling collisions can complicate implementation and affect performance.

• Fixed Size: A basic hash table has a fixed size. If it is too small, collisions become frequent; if it is too large, memory is wasted. Resizing requires rehashing the stored keys.

• Not Ordered: Data is not stored in a sorted order, making range queries inefficient.

Hash Function

A hash function takes an input (or 'key') and returns an integer, which is typically used as an index in a hash table. The goal is to distribute the keys uniformly across the hash table to minimize collisions. Below are some common types of hash functions, each explained with an example:

1. Division (Modulo) Method:

o Description: This method computes the hash value by taking the remainder of the
division of the key by the size of the hash table.

o Formula: hash(key) = key % table_size

o Example: If the key is 1234 and the table size is 10, the hash value would be 1234 %
10 = 4.

2. Multiplication Method:

o Description: This method involves multiplying the key by a constant fractional value
(A), extracting the fractional part of the result, and then multiplying it by the table size
to get the hash value.

o Formula: hash(key) = floor(table_size * ((key * A) % 1)), where 0 < A < 1 and x % 1 denotes the fractional part of x

o Example: If the key is 1234, table size is 10, and A is 0.618, then 1234 * 0.618 = 762.612, the fractional part is 0.612, and the hash value is floor(10 * 0.612) = 6.

3. Mid-Square Method:

o Description: This method squares the key and then extracts a portion of the resulting
digits to use as the hash value.

o Example: If the key is 56, squaring it gives 3136. Extracting the middle two digits (13)
could serve as the hash value.

4. Folding Method:

o Description: This method divides the key into equal parts, adds these parts together,
and then applies a modulo operation with the table size to get the hash value.

o Example: If the key is 987654 and the table size is 100, dividing the key into two parts
(987 and 654), summing them gives 1641. Then, 1641 % 100 = 41, so the hash value
is 41.

5. Digit Extraction Method:

o Description: This method selects specific digits from the key to form the hash value.

o Example: If the key is 7654321, extracting the 2nd, 4th, and 6th digits (counting from the left) gives 642, which can be used as the hash value.

6. Radix Transformation Method:

o Description: This method changes the base of the key to another number system (radix)
and then applies a hash function.

o Example: Converting a decimal key 255 to binary gives 11111111. Applying a hash
function to this binary representation yields the hash value.

7. Pseudo-Random Method:

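o Description: This method uses a pseudo-random number generator to produce a hash value based on the key.

o Example: Seeding a pseudo-random number generator with the key and using the first number it generates, reduced to the table size, as the hash value. Because the generator is deterministic for a given seed, the same key always yields the same hash value.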

8. Cryptographic Hash Functions:

o Description: These functions are designed for security purposes, producing a fixed-
size hash value from input data, making it computationally infeasible to reverse the
process.

o Example: SHA-256 produces a 256-bit hash value from any input data.
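As a small illustration of the SHA-256 example, the sketch below uses Python's standard hashlib module. The helper names sha256_hex and sha256_index are hypothetical, and reducing the digest modulo a table size is just one assumed way of turning it into a table index.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def sha256_index(data: bytes, table_size: int) -> int:
    """Reduce the 256-bit digest to an index in range(table_size)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") % table_size


digest = sha256_hex(b"hello")
print(len(digest) * 4)                               # 256 (64 hex digits, 4 bits each)
print(sha256_hex(b"hello") == sha256_hex(b"hello"))  # True: deterministic
print(sha256_hex(b"hello") == sha256_hex(b"hellp"))  # False: a small change alters the digest
```

Cryptographic hashes are much slower to compute than the arithmetic methods, so they are normally chosen for security rather than for ordinary hash table indexing.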

Each of these hash functions has its own use cases and is chosen based on factors like the nature of the
keys, the required distribution uniformity, and specific application requirements. Understanding these
functions helps in designing efficient hash tables and ensuring optimal performance in data retrieval
operations.
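The arithmetic methods in the list (division, multiplication, mid-square, folding, and digit extraction) can be sketched as follows, using the same example keys. The function names and default parameters are assumptions made for this illustration, and the keys are assumed to be non-negative integers.

```python
import math


def division_hash(key: int, table_size: int) -> int:
    """Division method: remainder of key divided by table_size."""
    return key % table_size


def multiplication_hash(key: int, table_size: int, a: float = 0.618) -> int:
    """Multiplication method: scale the fractional part of key * a."""
    fractional = (key * a) % 1
    return math.floor(table_size * fractional)


def mid_square_hash(key: int, num_digits: int = 2) -> int:
    """Mid-square method: take the middle digits of key squared."""
    squared = str(key * key)
    start = (len(squared) - num_digits) // 2
    return int(squared[start:start + num_digits])


def folding_hash(key: int, table_size: int, part_len: int = 3) -> int:
    """Folding method: split the digits into parts, add them, take mod."""
    digits = str(key)
    parts = [int(digits[i:i + part_len]) for i in range(0, len(digits), part_len)]
    return sum(parts) % table_size


def digit_extraction_hash(key: int, positions=(2, 4, 6)) -> int:
    """Digit extraction: keep the digits at the given 1-based positions."""
    digits = str(key)
    return int("".join(digits[p - 1] for p in positions))


print(division_hash(1234, 10))         # 4
print(multiplication_hash(1234, 10))   # 6
print(mid_square_hash(56))             # 13
print(folding_hash(987654, 100))       # 41
print(digit_extraction_hash(7654321))  # 642
```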
