
7.- Hashing, extendible hashing

The document explains hashing, a method of mapping data to integer values for fast searching, and introduces hash functions used in hash tables for rapid data lookup. It discusses hash conflicts and various resolution techniques such as separate chaining, linear probing, quadratic probing, and double hashing. Additionally, it covers rehashing and the unordered_map data structure in STL, which utilizes hash tables for efficient data retrieval.


What is hashing?

• Hashing takes data of any type and assigns an integer value to it.

• It is a very fast method of searching.

• It involves fewer key comparisons, and searching can be performed in constant time on average.


Hash Function

• A hash function is used to map data of arbitrary size to data of a fixed size.

• Hash functions are often used in combination with hash tables, a common data structure used for rapid data lookup.

• Hash functions accelerate table or database lookup by detecting duplicated records in a large file.

E.g. of a hash function application: finding similar stretches in DNA sequences.


Example of hashing

• In the following example, a hash function is used to map names to integers (from 0 to 15).

• Note that two keys are assigned the same hash value; this is called a “collision” (a hash conflict).
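The idea of mapping names to the integers 0 to 15 can be sketched in C++ as follows. The function name_hash and its rule (sum of the character codes, modulo 16) are assumptions chosen for illustration; they are not the hash function used on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy hash function (an assumption for illustration, not the one from the slide):
// add up the character codes and reduce the sum to the range 0..15.
std::size_t name_hash(const std::string& name) {
    std::size_t sum = 0;
    for (unsigned char c : name) {
        sum += c;
    }
    return sum % 16;  // 16 slots, indices 0 to 15
}
```

With this rule, names that contain the same letters in a different order (for example "amy" and "may") always receive the same index, which is exactly the kind of collision described above.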


Running time complexity of hashing

Operation    Time complexity (average case)
Search       O(1)
Insertion    O(1)
Deletion     O(1)


The hash table

• Also known as a hash map, it is a data structure that implements an associative array* abstract data type: a structure that maps keys to values.

• A hash table uses a hash function to compute an index into an array of buckets or slots.

E.g.: a small phone book as a hash table.

*Associative array: a.k.a. map, symbol table, or dictionary; an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears at most once in the collection.

How to make a good hash function:

• It can be computed in O(1) time.

• It spreads the elements evenly across the table.

• It causes as few hash conflicts as possible (ideally none).


Solving hash conflicts:

Hash conflicts can be dealt with in several ways:

1. Separate chaining (bucket hashing)

2. Linear probing

3. Quadratic probing

4. Double hashing


Solving hash conflicts:

1. Separate chaining (bucket hashing)

Example: [figure: the table before and after]
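In separate chaining, every slot of the table holds a chain (bucket) of all keys that hash to it, so a collision simply makes one chain longer. The following is a minimal sketch under stated assumptions: the class name ChainedHashSet, the use of std::list for the chains, and h(x) = x mod bucket count are illustrative choices, not taken from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Minimal separate-chaining set of unsigned integer keys (illustrative sketch).
class ChainedHashSet {
public:
    explicit ChainedHashSet(std::size_t bucket_count) : buckets_(bucket_count) {
        assert(bucket_count > 0);
    }

    // Appends the key to its bucket; returns false if it was already present.
    bool insert(unsigned key) {
        if (contains(key)) return false;
        buckets_[index_of(key)].push_back(key);
        return true;
    }

    // Walks only the one chain selected by the hash function.
    bool contains(unsigned key) const {
        for (unsigned stored : buckets_[index_of(key)]) {
            if (stored == key) return true;
        }
        return false;
    }

    // Number of keys currently chained in the given bucket.
    std::size_t chain_length(std::size_t bucket) const {
        return buckets_.at(bucket).size();
    }

private:
    std::size_t index_of(unsigned key) const { return key % buckets_.size(); }

    std::vector<std::list<unsigned> > buckets_;
};
```

For example, with 10 buckets the keys 89 and 49 both hash to bucket 9, and that bucket ends up with a chain of length 2.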

Solving hash conflicts:

2. Linear probing

• When a collision happens because a new key hashes to an already occupied slot, linear probing searches the table for the closest following free location and inserts the new key there.

• Lookups are performed in the same way: the table is searched sequentially, starting at the position given by the hash function, until a cell with a matching key or an empty cell is found.


Solving hash conflicts:


Linear probing – Example

H(x) = (h(x) + f(i)) mod size, where f(i) = i and h(x) = x mod size

We look at the last digit of the inserted element!
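A minimal C++ sketch of linear probing with f(i) = i is given below. The class name LinearProbingTable, the sentinel kEmptySlot, and the restriction to non-negative integer keys are assumptions made for this illustration. The hint about the last digit suggests a table of size 10, so that size is assumed in the usage note.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // marks a free slot; keys are assumed non-negative

class LinearProbingTable {
public:
    explicit LinearProbingTable(std::size_t size) : slots_(size, kEmptySlot) {
        assert(size > 0);
    }

    // Tries slots h(x), h(x)+1, h(x)+2, ... (modulo the table size).
    // Returns the slot used, or -1 if the table is full.
    int insert(int key) {
        assert(key >= 0);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            const std::size_t pos = probe(key, i);
            if (slots_[pos] == kEmptySlot || slots_[pos] == key) {
                slots_[pos] = key;
                return static_cast<int>(pos);
            }
        }
        return -1;
    }

    // Follows the same probe sequence until the key or an empty slot is found.
    bool contains(int key) const {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            const std::size_t pos = probe(key, i);
            if (slots_[pos] == kEmptySlot) return false;
            if (slots_[pos] == key) return true;
        }
        return false;
    }

private:
    // H(x) = (h(x) + i) mod size, with h(x) = x mod size.
    std::size_t probe(int key, std::size_t i) const {
        return (static_cast<std::size_t>(key) % slots_.size() + i) % slots_.size();
    }

    std::vector<int> slots_;
};
```

With a table of size 10 (assumed), inserting 89, 18 and 49 in that order places them in slots 9, 8 and 0: 49 collides with 89 at slot 9 and wraps around to the next free slot.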
th


Solving hash conflicts:

3. Quadratic probing

• It operates by taking the original hash index and adding successive values of an arbitrary quadratic polynomial until an open slot is found.

• It better avoids the clustering problem that can occur with linear probing, although it is not immune to it.


Solving hash conflicts:

3. Quadratic probing – Considerations & Example

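Hash function: h(x) = x % TableSize

Probe sequence: h’(x) = [h(x) + f(i)] % TableSize, where i = 0, 1, 2, ... is the probe number and f(i) is a quadratic function of i (commonly f(i) = i^2)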
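The probe sequence can be sketched in C++ as follows, assuming the common choice f(i) = i^2. The function name quadratic_insert, the sentinel kEmptySlot, and the restriction to non-negative integer keys are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // marks a free slot; keys are assumed non-negative

// Tries slots h(x), h(x)+1, h(x)+4, h(x)+9, ... (all modulo the table size).
// Returns the slot used, or -1 if no free slot was found within table.size() probes.
int quadratic_insert(std::vector<int>& table, int key) {
    assert(key >= 0 && !table.empty());
    const std::size_t size = table.size();
    const std::size_t home = static_cast<std::size_t>(key) % size;
    for (std::size_t i = 0; i < size; ++i) {
        const std::size_t pos = (home + i * i) % size;  // f(i) = i^2
        if (table[pos] == kEmptySlot) {
            table[pos] = key;
            return static_cast<int>(pos);
        }
    }
    return -1;
}
```

Unlike linear probing, this sequence does not visit every slot, so an insert can fail even when the table still has free slots; that is one reason tables are usually kept at most half full.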


Solving hash conflicts:

4. Double hashing

• Double hashing is similar to linear probing; the only difference is the interval between successive probes. Here, the interval between probes is computed by a second hash function, so two hash functions are used in total.

• When the index computed by the first hash function points to a slot that is already occupied, we must traverse a specific probing sequence to look for an unoccupied slot.


Solving hash conflicts:

4. Double hashing - Example

When inserting 49 there is a collision. The probe step is given by a second hash function:

hash(x) = x mod size
hash2(x) = R − (x mod R)
hash2(49) = 7 − 0 = 7

49 will be inserted in position 6.

Where:
• “R” = a prime smaller than the size of the table
• mod = the remainder of dividing two numbers. In this case: x mod R = 49 mod 7 = 0
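The example can be reproduced with the short C++ sketch below. The table size of 10 and the previously inserted keys 89 and 18 are assumptions (the original figure is not available); they are chosen because 49 mod 10 = 9 then collides with 89, and (9 + 7) mod 10 = 6 matches the position stated above. The names double_hash_insert and kEmptySlot are also illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // marks a free slot; keys are assumed non-negative

// hash2(x) = R - (x mod R); the result is always in 1..R, never 0.
int hash2(int key, int R) { return R - (key % R); }

// Probes h(x), h(x) + hash2(x), h(x) + 2*hash2(x), ... modulo the table size.
// Returns the slot used, or -1 if no free slot was found within table.size() probes.
int double_hash_insert(std::vector<int>& table, int key, int R) {
    assert(key >= 0 && R > 0 && !table.empty());
    const std::size_t size = table.size();
    const std::size_t home = static_cast<std::size_t>(key) % size;
    const std::size_t step = static_cast<std::size_t>(hash2(key, R));
    for (std::size_t i = 0; i < size; ++i) {
        const std::size_t pos = (home + i * step) % size;
        if (table[pos] == kEmptySlot) {
            table[pos] = key;
            return static_cast<int>(pos);
        }
    }
    return -1;
}
```

Because hash2 never returns 0, the probe always moves away from the occupied home slot.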


What is rehashing?

• Rehashing is the operation by which we create another hash table about twice as big as the original one (using a new hash function), because the old table has become too full (load factor lambda > 50%).

• It is a very expensive operation: its running time is O(N), since there are N elements to rehash.

• The size of the new table will be the first prime that is at least twice as large as the old table size.

• Summarizing, we can use rehashing when:

    • The hash table is half full

    • An insert fails

    • Lambda (load factor) is greater than a threshold value


Example

• h(x) = x mod N (N = table size)

• New h(x) = x mod (first prime ≥ 2N)

Previous h(x) = x mod 7

New h(x) = x mod 17
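The step from x mod 7 to x mod 17 can be sketched in C++ as follows. The helper names is_prime, next_prime and rehash, the sentinel kEmptySlot, and the use of linear probing while reinserting are assumptions made for this illustration; the slides do not say which collision strategy the example uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmptySlot = -1;  // marks a free slot; keys are assumed non-negative

// Trial division; sufficient for table sizes.
bool is_prime(std::size_t n) {
    if (n < 2) return false;
    for (std::size_t d = 2; d * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Smallest prime that is greater than or equal to n.
std::size_t next_prime(std::size_t n) {
    while (!is_prime(n)) ++n;
    return n;
}

// Builds a table of size next_prime(2 * old size) and reinserts every key
// with h(x) = x mod new_size, resolving collisions by linear probing.
std::vector<int> rehash(const std::vector<int>& old_table) {
    const std::size_t new_size = next_prime(2 * old_table.size());
    std::vector<int> new_table(new_size, kEmptySlot);
    for (std::size_t s = 0; s < old_table.size(); ++s) {
        const int key = old_table[s];
        if (key == kEmptySlot) continue;
        std::size_t pos = static_cast<std::size_t>(key) % new_size;
        while (new_table[pos] != kEmptySlot) {
            pos = (pos + 1) % new_size;  // always terminates: the new table has spare slots
        }
        new_table[pos] = key;
    }
    return new_table;
}
```

For an old table of size 7, 2N = 14 and the next prime is 17, so a key such as 24 moves from slot 24 mod 7 = 3 to slot 24 mod 17 = 7. Every stored key is visited once, which is where the O(N) cost comes from.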


The unordered map in STL

• unordered_map is a dictionary-like data structure that stores (key, value) pairs, which allows fast retrieval of an individual element based on its unique key.

• Internally, unordered_map is implemented using a hash table: the key provided to the map is hashed into an index of the hash table. This is why the performance of the data structure depends heavily on the hash function, but on average the cost of search, insert, and delete is O(1).

• Implemented by “std::unordered_map”.
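A short usage sketch of std::unordered_map, in the spirit of the phone book mentioned earlier, is shown below. The type alias PhoneBook, the helper functions, and the names and numbers stored are all hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

typedef std::unordered_map<std::string, std::string> PhoneBook;

// Builds a tiny phone book; operator[] inserts a new key or overwrites an existing one.
PhoneBook make_phone_book() {
    PhoneBook book;
    book["Alice"] = "555-0100";
    book["Bob"] = "555-0101";
    return book;
}

// Returns the stored number, or "unknown" when the name is not in the book.
// find() hashes the key and inspects a single bucket: O(1) on average.
std::string find_number(const PhoneBook& book, const std::string& name) {
    PhoneBook::const_iterator it = book.find(name);
    return it != book.end() ? it->second : std::string("unknown");
}
```

Using find() rather than operator[] for lookups avoids inserting an empty entry when the key is missing. Entries are removed with erase(key), which is also O(1) on average.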


Thank you!
