Week 14 lec 1 Hashing

The document discusses hashing as a technique for efficient data storage and retrieval, highlighting its advantages over traditional searching methods. It explains the properties of a good hash function, common collision resolution techniques, and various hashing methods including cryptographic hashing. Additionally, it provides examples of hashing applications in data structures and algorithms.

Hashing

Dr. Sohail
Week 14 Feb ..17
Objectives
• Today’s lecture objectives include:

– Hashing

Data Structures & Algorithms 2


Review of Searching Techniques
• Recall the efficiency of searching techniques covered earlier.

• The sequential search algorithm takes time proportional to the data size, i.e., O(n).

• Binary search improves on linear search, reducing the search time to O(log n).

• With a BST, an O(log n) search efficiency can be obtained, but the worst-case complexity is O(n).

• To guarantee the O(log n) search time, BST height balancing is required.


Review of Searching Techniques
• Big O notation describes the time complexity of an algorithm in terms of the input size n.

• O(n) – Linear Time. Execution time grows in proportion to the input size: if you double the input size, the runtime roughly doubles as well. Acceptable for moderate inputs but can be slow for very large ones.

• O(log n) – Logarithmic Time. Execution time grows slowly as the input size increases. Typical of binary search on sorted data; efficient for large inputs.

• O(n²) – Quadratic Time. Execution time grows with the square of the input size. Common with nested loops (e.g., Bubble Sort). Poor for large inputs, since doubling the input size roughly quadruples the runtime.

• O(1) – Constant Time. The execution time does not depend on the input size.

def get_first_element(arr):
    return arr[0]  # Always takes the same time, regardless of array size


Hashing – Introduction
• Suppose that we want to store 10,000 student records (each with a 5-digit ID) in a given container.
– A linked list implementation would take O(n) access time.
– A height-balanced tree would give O(log n) access time.
– Using an array of size 100,000, indexed directly by the 5-digit ID, would give O(1) access time but would waste a lot of space.

• Is there some way that we could get O(1) access without wasting a lot of space?
– The answer is hashing.


What is Hashing?
• A technique that determines an index or location for the storage of an item in a data structure

• The hash function receives the search key
– Returns the index of an element in an array called the hash table
– The index is known as the hash index

• A perfect hash function maps each search key into a different integer suitable as an index to the hash table
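
The idea of a hash function returning a hash index can be sketched in a few lines of Python. This is only an illustration: the modulo hash function, the table size of 13, and the names `hash_index`, `put`, and `get` are assumptions, and collisions are deliberately ignored at this point.

```python
TABLE_SIZE = 13  # assumed table size, chosen for illustration

def hash_index(key: int, table_size: int = TABLE_SIZE) -> int:
    """Map an integer search key to a hash index in [0, table_size)."""
    return key % table_size

# The hash table is an array; each slot holds one (key, value) pair or None.
table = [None] * TABLE_SIZE

def put(key: int, value: str) -> None:
    # Store the record in the slot chosen by the hash function.
    # Collisions are not handled in this sketch: a later key overwrites.
    table[hash_index(key)] = (key, value)

def get(key: int):
    # Recompute the same index and check that the stored key matches.
    slot = table[hash_index(key)]
    return slot[1] if slot is not None and slot[0] == key else None

put(985926, "Ziyad")
put(970876, "Ahmad")
print(hash_index(985926), get(985926))  # 6 Ziyad
```

Both `put` and `get` do a constant amount of work, which is where the O(1) access time comes from.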


What is Hashing?...
• A hash function in data structures is a function that maps input data (keys) to a fixed-size value (hash code or hash value), usually for fast data retrieval.
• It is primarily used in hash tables, where the hash function determines the index at which data is stored.




Key Properties of a Good Hash Function

1. Deterministic – The same input always produces the same hash value.

2. Efficient – The function should compute the hash quickly.

3. Uniform Distribution – It should distribute hash values evenly to avoid clustering.

4. Minimizes Collisions – Different inputs should ideally have different hash values.

5. Fixed Output Size – Regardless of input size, the hash value should be of a fixed length.




What is Hashing?...
Hashing in Data Structures
• Hash Tables – Store key-value pairs and use a hash function to determine the index.
• Hash Maps – Implement hash tables for quick lookups (e.g., Python's dictionary).
• Cryptographic Hashing – Secure functions like SHA-256 for data integrity.

Handling Collisions
• Collisions occur when two keys produce the same hash value.
• Common resolution techniques include:
1. Chaining (Separate Chaining) – Use linked lists at each index.
2. Open Addressing (Linear Probing, Quadratic Probing, Double Hashing) – Find another available slot within the array.
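
Separate chaining can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `ChainedHashTable`, the modulo hash function, and the table size are hypothetical, and each chain is a Python list standing in for the linked list mentioned above.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by separate chaining."""

    def __init__(self, size: int = 10):
        self.size = size
        # One chain (a Python list here, in place of a linked list) per index.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key: int) -> int:
        return key % self.size  # assumed hash function

    def put(self, key: int, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:            # key already present: update in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))  # new key: append to the chain

    def get(self, key: int):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None                  # key not found

t = ChainedHashTable(size=10)
t.put(25, "a")
t.put(15, "b")   # 15 % 10 == 25 % 10 == 5, so both share chain 5
print(t.buckets[5])  # [(25, 'a'), (15, 'b')]
```

Lookups stay fast as long as the chains stay short, which is why a uniform hash function matters.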


Hashing methods

Handling Collisions
• Separate Chaining: Store multiple values at the same index using linked lists.
• Linear Probing: On a collision, place the key in the next available slot. In the slide's example, 15 goes to the next available index (6), and 40 to the next available index (1).
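
A minimal sketch of linear probing is given here. The key set (10, 25, 15, 40), the table size of 10, and the function name `insert_linear_probing` are assumptions chosen so that the result matches the placements mentioned above (15 at index 6, 40 at index 1); the key set used on the original slide is not shown.

```python
def insert_linear_probing(table: list, key: int) -> int:
    """Insert key using h(key) = key % len(table); return the slot used."""
    size = len(table)
    home = key % size                 # assumed hash function
    for step in range(size):
        idx = (home + step) % size    # probe the next slot, wrapping around
        if table[idx] is None:
            table[idx] = key
            return idx
    raise RuntimeError("hash table is full")

table = [None] * 10
placements = {k: insert_linear_probing(table, k) for k in (10, 25, 15, 40)}
print(placements)  # {10: 0, 25: 5, 15: 6, 40: 1}
```

Here 15 collides with 25 at index 5 and moves to 6, while 40 collides with 10 at index 0 and moves to 1.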


Hashing functions

Mid-Square
• The key is squared and the middle part of the result is taken as the hash value.

• To map the key 3121 into a hash table of size 1000, we square it, 3121² = 9740641, and extract the middle three digits, 406, as the hash value.
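
The mid-square calculation can be sketched in Python. The function name `mid_square` and the rule for picking the window when the square has an even number of digits (lean left) are assumptions; the slides only show the seven-digit case.

```python
def mid_square(key: int, num_digits: int = 3) -> int:
    """Square the key and return its middle num_digits digits."""
    squared = str(key * key).zfill(num_digits)  # pad very small squares
    # Start of the middle window; leans left if it cannot be centred exactly.
    start = (len(squared) - num_digits) // 2
    return int(squared[start:start + num_digits])

print(mid_square(3121))  # 406
```

With three digits extracted, the result always lies in 0 to 999, which fits a table of size 1000.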




Hashing functions

Folding
• It involves splitting keys into two or more parts and then combining the parts to form the hash addresses.

• To map the key 25936715 to a range between 0 and 9999, we can:
– split the number into two parts, 2593 and 6715, and
– add these two to obtain 9308 as the hash value.

• Very useful if we have keys that are very large.

• Fast and simple, especially with bit patterns.

• A great advantage is the ability to transform non-integer keys into integer values.
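
Folding can be sketched as below. The function name `fold_hash` is hypothetical, and the final modulo step is an assumption added so that the sum of the parts cannot leave the range 0 to 9999; the slide's example does not need it because 2593 + 6715 is already in range.

```python
def fold_hash(key: int, part_len: int = 4) -> int:
    """Split the decimal digits of key into parts and add the parts."""
    digits = str(key)
    # Cut the digit string into chunks of part_len, left to right.
    parts = [int(digits[i:i + part_len])
             for i in range(0, len(digits), part_len)]
    # Assumed: wrap the sum back into [0, 10**part_len) if it overflows.
    return sum(parts) % (10 ** part_len)

print(fold_hash(25936715))  # 9308
```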


Hashing methods Cont…

SHA-256: Secure Hash Algorithm (256-bit)


SHA-256 is a cryptographic hash function that produces a 256-bit (32-byte) fixed-length hash value from any input. It is
part of the SHA-2 (Secure Hash Algorithm 2) family, designed by the NSA (National Security Agency) and published by
NIST (National Institute of Standards and Technology).





Hashing methods Cont…
5. Hashing in programming
A process of converting data (such as a string or number) into a fixed-length value (a hash) using a hash function.
Commonly used in data structures like hash tables, cryptographic applications, and checksums.

Explanation (Python's hashlib):
• The hashlib module provides hash functions like SHA-256.
• The encode() function converts the string into bytes (required for hashing).
• hashlib.sha256() creates a SHA-256 hash object.
• hexdigest() returns the hash in hexadecimal format.
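
The four points in the explanation correspond to a short Python snippet along these lines. The input string "hello" and the variable names are assumptions for illustration; the code on the original slide is not reproduced in this text.

```python
import hashlib

message = "hello"                     # example input (assumed)
data = message.encode()               # str -> bytes, UTF-8 by default
hash_object = hashlib.sha256(data)    # create a SHA-256 hash object
digest = hash_object.hexdigest()      # hash as a hexadecimal string

print(digest)
print(len(digest))  # 64 hex characters, i.e. 256 bits
```

Running it twice on the same input gives the same digest, which is the deterministic property listed among the properties of a good hash function.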


Hashing functions

6. Radix Conversion
• Transforms a key into another number base to obtain the hash value.

• Typically uses a number base other than base 10 and base 2 to calculate the hash addresses.

• To map the key 55354 into the range 0 to 9999 using base 11, we have:

55354₁₀ = 38652₁₁

• We may truncate the high-order digit 3 to yield 8652 as our hash address within 0 to 9999.
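
Radix conversion can be sketched as follows. The helper names `to_base` and `radix_hash` are hypothetical. The slides do not say what to do when a base-11 digit equals 10, so treating each digit as a decimal place value and reducing modulo 10000 is an assumption of this sketch.

```python
def to_base(number: int, base: int) -> list:
    """Return the digits of a non-negative number in the given base."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(remainder)
    return digits[::-1]  # most significant digit first

def radix_hash(key: int, base: int = 11, num_digits: int = 4) -> int:
    """Convert key to another base and keep the low-order digits."""
    digits = to_base(key, base)[-num_digits:]  # drop high-order digits
    value = 0
    for d in digits:
        value = value * 10 + d   # read the kept digits as a decimal number
    return value % (10 ** num_digits)  # assumed guard for digits equal to 10

print(to_base(55354, 11))  # [3, 8, 6, 5, 2]
print(radix_hash(55354))   # 8652
```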


Hashing Example

Example 1: Illustrating Hashing

• Use the function h(r) = id % 13 to load the following records into an array of size 13.

Name      Marks   ID
Ziyad     1.73    985926
Ahmad     1.60    970876
Mahtab    1.58    980962
Saad      1.80    986074
Adnan     1.73    970728
Yousuf    1.66    994593
Husain    1.70    996321


Hashing Example …

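Name      ID       h(r) = id % 13
Ziyad     985926   6
Ahmad     970876   10
Mahtab    980962   8
Saad      986074   11
Adnan     970728   5
Yousuf    994593   2
Husain    996321   1

Resulting array (indices 0 to 12):

Index   Record
0       (empty)
1       Husain
2       Yousuf
3       (empty)
4       (empty)
5       Adnan
6       Ziyad
7       (empty)
8       Mahtab
9       (empty)
10      Ahmad
11      Saad
12      (empty)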
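
The example can be checked with a short Python sketch that loads the seven records using h(r) = id % 13. The variable names are assumptions; the IDs and the hash function are taken from the example.

```python
records = [
    ("Ziyad", 985926), ("Ahmad", 970876), ("Mahtab", 980962),
    ("Saad", 986074), ("Adnan", 970728), ("Yousuf", 994593),
    ("Husain", 996321),
]

TABLE_SIZE = 13
table = [None] * TABLE_SIZE

for name, student_id in records:
    index = student_id % TABLE_SIZE       # h(r) = id % 13
    # This particular data set happens to produce no collisions.
    assert table[index] is None, "unexpected collision"
    table[index] = name

for index, name in enumerate(table):
    print(index, name)
```

Each of the seven records lands in its own slot, so no collision resolution is needed for this data set.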


Hash Function – Cont…
• A good hash function should:
– Minimize collisions.

– Be easy and quick to compute.

– Distribute key values evenly in the hash table.

– Use all the information provided in the key.



Some Applications of Hash Tables

• Database systems

• Symbol tables

• Data dictionaries

• Network processing algorithms



THANK YOU
