Hashing Techniques
Dr.D.Sumathi
Why do we need hashing?
• We need something faster than binary search, which takes
O(log N).
• We want O(1) lookups on average.
• Solution: Hashing
• In fact hashing is used in:
• Web searches
• Spell checkers
• Databases
• Compilers
• Password verification
What is Hashing?
• Hashing is a technique for storing and retrieving information as
quickly as possible.
• It is used to perform fast searches and is useful in implementing
symbol tables.
Hashing: Definition
• When the set of possible keys is huge, storing one entry per
possible key is not feasible.
• We have a large universe of keys but only a limited number of
locations in memory.
• A simple array is not the right choice for problems where the
range of possible keys is very large.
• The process of mapping the keys to locations is called hashing.
Components of Hashing
1) Hash Table
2) Hash Functions
3) Collisions
4) Collision Resolution Techniques
Hash Table
• Hash table or hash map is a data structure used to store key-
value pairs.
• It is a collection of items stored to make it easy to find them later.
• It uses a hash function to compute an index into an array of
buckets or slots from which the desired value can be found.
• With chaining, it is an array of lists, where each list is known
as a bucket.
• It looks up a value based on its key.
• In Java, the Hashtable class implements the Map interface and
extends the Dictionary class.
• Java's Hashtable is synchronized and contains only unique keys.
• Given a key k, we find the element whose key is k by just
looking in the kth position of the array. This is called direct
addressing.
• Direct addressing is applicable when we can afford to allocate
an array with one position for every possible key.
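Direct addressing can be sketched as follows. This is a minimal illustration, assuming the keys are small non-negative integers; the class name DirectAddressTable and the parameter universe_size are hypothetical names chosen for this example.

```python
class DirectAddressTable:
    """Direct-address table: the element with key k lives at position k."""

    def __init__(self, universe_size):
        # One slot for every possible key in 0 .. universe_size - 1.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        self.slots[key] = value

    def search(self, key):
        # No hashing and no searching: just look in the kth position.
        return self.slots[key]

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(universe_size=100)
table.insert(42, "Alice")
print(table.search(42))  # Alice
print(table.search(7))   # None (nothing stored under key 7)
```

Every operation is a single array access, but the array needs as many slots as there are possible keys, whether or not they are used.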
Hash Function
• A hash function transforms the key into an index.
• Ideally, the hash function should map each possible key to a
unique slot index.
• Given a collection of elements, a hash function that maps each
item into a unique slot is referred to as a perfect hash function.
• For example, if the keys were nine-digit SSNs, giving every
possible key its own slot would require almost one billion slots;
for a class of 25 students, a huge amount of memory would be wasted.
Goal of hash function
• Minimize the number of collisions
• Be easy to compute
• Distribute the elements evenly in the hash table
• There are a number of common ways to construct such a function
Folding method
• The folding method for constructing hash functions begins by
dividing the key into equal-size pieces (the last piece may not
be of equal size).
• These pieces are then added together to give the resulting hash
value.
• For the phone number 436-555-4601, we would take the digits and
divide them into groups of 2 (43, 65, 55, 46, 01).
• After the addition, 43+65+55+46+01, we get 210. If we assume
our hash table has 11 slots, then we need to perform the extra
step of dividing by 11 and keeping the remainder.
• In this case 210 % 11 is 1, so the phone number 436-555-4601
hashes to slot 1.
• Some folding methods go one step further and reverse every
other piece before the addition.
• For the above example, we get 43+56+55+64+01=219 which
gives 219 % 11 = 10.
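The folding steps above can be sketched in code. This is an illustrative version only; the function name folding_hash and its parameters piece_size and reverse_alternate are assumptions made for this example, not part of the slides.

```python
def folding_hash(key, table_size, piece_size=2, reverse_alternate=False):
    """Fold the digits of `key` into pieces, add them, and reduce modulo table_size."""
    # Keep only the digits, so "436-555-4601" becomes "4365554601".
    digits = "".join(ch for ch in str(key) if ch.isdigit())
    # Split into pieces of piece_size digits; the last piece may be shorter.
    pieces = [digits[i:i + piece_size] for i in range(0, len(digits), piece_size)]
    if reverse_alternate:
        # Reverse every other piece (the 2nd, 4th, ...) before adding.
        pieces = [p[::-1] if i % 2 == 1 else p for i, p in enumerate(pieces)]
    total = sum(int(p) for p in pieces)
    return total % table_size


print(folding_hash("436-555-4601", 11))                          # 1
print(folding_hash("436-555-4601", 11, reverse_alternate=True))  # 10
```

Both results match the worked example: 210 % 11 = 1 without reversal and 219 % 11 = 10 with it.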
How to Choose Hash Function?
• The basic problems in creating a hash table are:
• Distributing the index values of inserted objects uniformly across
the table.
• Making the calculation fast, returning values within the range of
locations in the table, and minimizing collisions.
• Designing an efficient collision resolution algorithm that computes
an alternative index for a key whose hash index corresponds to a
location already occupied in the hash table.
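One way to compute such an alternative index is sketched below. The slides do not name a specific technique at this point, so linear probing (try the next slot, wrapping around) is used here purely as an assumption; the function name probe_insert is hypothetical.

```python
def probe_insert(table, key, hash_index):
    """Insert key at hash_index, or at the next free slot after it (linear probing)."""
    size = len(table)
    for step in range(size):
        # Alternative index: move forward one slot at a time, wrapping around.
        index = (hash_index + step) % size
        if table[index] is None or table[index] == key:
            table[index] = key
            return index
    raise RuntimeError("hash table is full")


table = [None] * 11
first = probe_insert(table, 22, 22 % 11)   # slot 0 is free
second = probe_insert(table, 33, 33 % 11)  # slot 0 is taken, so slot 1 is used
print(first, second)  # 0 1
```

Keys 22 and 33 both hash to index 0 with a table of 11 slots, so the second insert falls through to the alternative index 1.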
Characteristics of Good Hash Functions
• Minimize collision
• Be easy and quick to compute
• Distribute key values evenly in the hash table
• Use all the information provided in the key
• Have a high load factor for a given set of keys
[Figure: decision factor, efficiency of the hashing function]
Collisions
• Ideally, a hash function maps each key to a different address,
but in practice it is rarely possible to create such a function;
the resulting problem is called a collision.
• A collision is the condition where two records hash to the
same location.
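A small sketch makes collisions concrete. It assumes a simple remainder hash (key % table_size) and a table of 11 slots; the function name find_collisions and the sample keys are chosen for illustration only.

```python
from collections import defaultdict


def find_collisions(keys, table_size):
    """Return {slot: [keys]} for every slot that more than one key hashes to."""
    slots = defaultdict(list)
    for key in keys:
        slots[key % table_size].append(key)
    return {slot: ks for slot, ks in slots.items() if len(ks) > 1}


collisions = find_collisions([54, 26, 93, 17, 77, 31, 44], 11)
print(collisions)  # {0: [77, 44]}
```

Here 77 and 44 are different keys, yet both leave remainder 0 when divided by 11, so they compete for the same slot.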