11-Searching and Hashing Final

The document compares linear search and binary search algorithms, highlighting their efficiency and implementation. Linear search is optimal for unsorted arrays but slower, while binary search is faster on sorted arrays, operating in logarithmic time. Additionally, it discusses the use of hash tables and direct access tables for efficient record storage and retrieval, emphasizing the trade-off between speed and memory usage.

Uploaded by

Santosh Deshmukh

Linear Search

vs
Binary Search

Kumkum Saxena
Linear Search
◼ Your code should look something like this:

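int search(int array[], int len, int value) {
    int i;
    /* Check every element in turn; stop as soon as a match is found. */
    for (i = 0; i < len; i++) {
        if (array[i] == value)
            return 1;
    }
    return 0;   /* value is not in the array */
}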

Kumkum Saxena Searching and Hashing page 2

Linear Search
◼ Analyze code:
◼ Clearly, if the array is unsorted, this algorithm is
optimal
▪ The ONLY way to be sure that a value isn’t in the array is
to look at every single spot of the array
▪ Just like you can’t be sure that you DON’T have some
piece of paper or form unless you look through ALL of your
pieces of paper

◼ But we ask a question:

◼ Could we find an item in an array faster if it were
already sorted?


Binary Search
◼ Number Guessing Game from childhood
◼ Remember the game you most likely played as
a child
◼ I have a secret number between 1 and 100.
◼ Make a guess and I’ll tell you whether your guess is
too high or too low.
◼ Then you guess again. The process continues until
you guess the correct number.
◼ Your job is to MINIMIZE the number of guesses you
make.


Binary Search
◼ Number Guessing Game from childhood
◼ What is the first guess of most people?
◼ 50.
◼ Why?
◼ No matter the response (too high or too low), at most
50 possible values remain for your search
(either 1-49 or 51-100)
◼ Any other first guess risks leaving more than 50
possible values.
▪ Example: you guess 75
▪ I respond: too high
▪ So now you have to guess between 1 and 74
▪ 74 values to guess from instead of 50
Binary Search
◼ Number Guessing Game from childhood
◼ Basic Winning Strategy
◼ Always guess the number that is halfway between the
lowest possible value in your search range and the
highest possible value in your search range

◼ Can we now adapt this idea to work for
searching for a given value in an array?
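
Before moving to arrays, the winning strategy can be sketched in C. This is only an illustration under stated assumptions, not code from the slides: the function name `count_guesses` and its parameters are made up for the example. It plays the game against a known secret and reports how many guesses the halving strategy needs.

```c
/* Hypothetical helper (not from the slides): plays the guessing game for a
   secret number in [low, high] and returns how many guesses the
   "always guess halfway" strategy makes. Returns -1 if the secret lies
   outside the starting range. */
int count_guesses(int low, int high, int secret) {
    int guesses = 0;
    while (low <= high) {
        int guess = low + (high - low) / 2;  /* halfway point of the range */
        guesses++;
        if (guess == secret)
            return guesses;
        else if (guess > secret)
            high = guess - 1;   /* "too high": discard the upper part */
        else
            low = guess + 1;    /* "too low": discard the lower part */
    }
    return -1;
}
```

For the range 1 to 100, no secret needs more than 7 guesses with this strategy.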


Binary Search
◼ Array Search
◼ We are given the following sorted array:
index 0 1 2 3 4 5 6 7 8
value 2 6 19 27 33 37 38 41 118

◼ We are searching for the value, 19

◼ So where is halfway between?
◼ One guess would be to look at 2 and 118 and take
their average (60).
◼ But 60 isn’t even in the list
◼ And if we look at the number closest to 60
▪ It is almost at the end of the array


Binary Search
◼ Array Search
◼ We quickly realize that if we want to adapt the
number guessing game strategy to searching an
array, we MUST search in the middle INDEX of
the array.
◼ In this case:
◼ The lowest index is 0
◼ The highest index is 8
◼ So the middle index is 4


Binary Search
◼ Array Search
◼ Correct Strategy
◼ We would ask, “Is the number I am searching for, 19,
greater or less than the number stored in index 4?”
▪ Index 4 stores 33
◼ The answer would be “less than”
◼ So we would narrow our search range to index 0
through index 3
▪ Note that index 4 is no longer in the search space
◼ We then continue this process
▪ The second index we’d look at is index 1, since (0+3)/2 = 1 (integer division)
▪ Index 1 stores 6, which is less than 19, so the range becomes index 2 through index 3
▪ Then we’d finally get to index 2, since (2+3)/2 = 2 (integer division)
▪ And at index 2, we would find the value, 19, in the array
Binary Search
◼ Binary Search code:
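int binsearch(int a[], int len, int value) {
    int low = 0, high = len - 1;

    while (low <= high) {
        /* Midpoint written this way so low + high cannot overflow. */
        int mid = low + (high - low) / 2;
        if (value < a[mid])
            high = mid - 1;     /* continue in the lower half */
        else if (value > a[mid])
            low = mid + 1;      /* continue in the upper half */
        else
            return 1;           /* found */
    }

    return 0;                   /* not found */
}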
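
The function above only reports whether the value is present. As a small variation for illustration (an assumption, not part of the slides), the sketch below returns the index where the value was found, or -1 if it is absent. The name `binsearch_index` is hypothetical.

```c
/* Hypothetical variant of binsearch: returns the index of value in the
   sorted array a[0..len-1], or -1 if value is not present. */
int binsearch_index(const int a[], int len, int value) {
    int low = 0, high = len - 1;

    while (low <= high) {
        int mid = low + (high - low) / 2;   /* middle INDEX of the range */
        if (value < a[mid])
            high = mid - 1;
        else if (value > a[mid])
            low = mid + 1;
        else
            return mid;
    }
    return -1;
}
```

On the sample array 2, 6, 19, 27, 33, 37, 38, 41, 118, searching for 19 probes indexes 4, 1, and 2 and returns 2.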
Binary Search
◼ Binary Search code:

◼ At the end of each loop iteration, all we do is
update either low or high
◼ This modifies our search region
◼ Essentially halving it

Binary Search
◼ Efficiency of Binary Search
◼ Analysis:
◼ Let’s analyze how many comparisons (guesses) are
necessary when running this algorithm on an array of
n items
◼ First, let’s try n = 100
▪ After 1 guess, we have 50 items left,
▪ After 2 guesses, we have 25 items left,
▪ After 3 guesses, we have 12 items left,
▪ After 4 guesses, we have 6 items left,
▪ After 5 guesses, we have 3 items left,
▪ After 6 guesses, we have 1 item left
▪ After 7 guesses, we have 0 items left.
Binary Search
◼ Efficiency of Binary Search
◼ Analysis:
◼ Notes:
▪ The last iteration is needed because the number of
items left represents the number of other possible values
still to search
▪ We need to reduce this to 0.
▪ Also, when n is odd, such as when n=25
▪ We search the middle element, # 13
▪ There are 12 elements smaller than # 13
▪ And 12 elements bigger than # 13
▪ This is why the number of items left is slightly less than ½ in
those cases

Binary Search
◼ Efficiency of Binary Search
◼ Analysis:
◼ General case:

◼ After 1 guess, we have n/2 items left
◼ After 2 guesses, we have n/4 items left
◼ After 3 guesses, we have n/8 items left
◼ After 4 guesses, we have n/16 items left
◼ …
◼ After k guesses, we have n/2^k items left

Binary Search
◼ Efficiency of Binary Search
◼ Analysis:
◼ General case:
◼ So, after k guesses, we have n/2^k items left
◼ The question is:
▪ How many guesses, k, do we need to make in order to find
our answer?
▪ Or until we have one and only one guess left to make?
◼ So we want to get only 1 item left
◼ If we can find the value of k that makes the above fraction
equal to 1, then we know that in one more guess, we’ll
narrow down the item

Binary Search
◼ Efficiency of Binary Search
◼ Analysis:
◼ General case:
◼ So, after k guesses, we have n/2^k items left
▪ Again, we want only 1 item left
▪ So set this equal to 1 and solve for k
$\frac{n}{2^k} = 1 \quad\Rightarrow\quad n = 2^k \quad\Rightarrow\quad k = \log_2 n$
◼ This means that a binary search takes roughly log_2 n
comparisons when searching in a sorted array of n
items
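
The halving argument can be checked with a few lines of C. This is a minimal sketch, not code from the slides; the name `max_guesses` is an assumption. It repeats the integer halving used in the n = 100 table (100, 50, 25, 12, ...) and counts the guesses until no items are left.

```c
/* Hypothetical helper: counts the guesses needed to shrink n candidate
   items down to 0 by repeated integer halving, as in the n = 100 table. */
int max_guesses(long n) {
    int k = 0;
    while (n > 0) {
        n /= 2;   /* one guess leaves at most half of the items (rounded down) */
        k++;
    }
    return k;
}
```

For n = 100 this gives 7, matching the guess-by-guess count for n = 100. In general the result is floor(log_2 n) + 1, which is the "log_2 n comparisons, plus one more guess" from the analysis.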
Binary Search
◼ Efficiency of Binary Search
◼ Analysis:
◼ Runs in logarithmic (log n) time
◼ This is MUCH faster than searching linearly
◼ Consider the following chart:
n            log_2 n
8            3
1024         10
65536        16
1048576      20
33554432     25
1073741824   30

◼ Basically, any log n algorithm is SUPER FAST.

Hash Tables
& Hashing

Terminology
◼ Table
◼ An abstract data type that stores & retrieves
records according to their search key values

◼ Record
◼ Each individual row in the table
◼ Example:
◼ A database of student records
◼ So each record will have a pid, first name, last name,
SSN, address, phone, email, etc.


Record Example
sid (key)   name    score
0012345     andy    81.5
0033333     betty   90
0056789     david   56.8
...
9903030     tom     73
9908080     bill    49
...

This is an example of a table. Each individual row is a record.

Consider this problem. We want to store 1,000 student records and search them by student id.

Motivation
◼ Problem:
◼ Given this table of records
◼ We need to be able to:
◼ Add new records
◼ Delete records
◼ Search for records

◼ What’s the most efficient way of doing this?


Motivation
◼ Problem:
◼ What’s the most efficient way of doing this?
◼ Use an array to store the records, in unsorted order
◼ Running time:
▪ Adding a record:
▪ O(1) since we simply add at the end of the unsorted array
▪ Deleting a record:
▪ Very slow, or O(n), since we have to search through the entire
array to find the desired record to delete
▪ We then have a “hole” in the array.
▪ We can quickly fill that hole by moving the last element into it,
which can happen in O(1) time.
▪ Search for a record:
▪ Very slow, or O(n), since we search through the entire table
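
The three operations on an unsorted array can be sketched as follows. This is a minimal illustration, assuming records are reduced to integer keys and the capacity is fixed; the type `UnsortedTable` and the function names are hypothetical, not from the slides.

```c
#define CAPACITY 1000   /* assumption: room for 1,000 records */

/* Hypothetical record store: an unsorted array of integer keys. */
typedef struct {
    int keys[CAPACITY];
    int count;          /* number of keys currently stored */
} UnsortedTable;

/* O(1): append at the end. Returns 0 if the table is full. */
int table_add(UnsortedTable *t, int key) {
    if (t->count >= CAPACITY)
        return 0;
    t->keys[t->count++] = key;
    return 1;
}

/* O(n): linear search. Returns the index of key, or -1 if absent. */
int table_find(const UnsortedTable *t, int key) {
    for (int i = 0; i < t->count; i++)
        if (t->keys[i] == key)
            return i;
    return -1;
}

/* O(n) to find the record, then O(1) to fill the "hole"
   by moving the last element into it. Returns 0 if key is absent. */
int table_delete(UnsortedTable *t, int key) {
    int i = table_find(t, key);
    if (i < 0)
        return 0;
    t->keys[i] = t->keys[t->count - 1];
    t->count--;
    return 1;
}
```

Note that the delete step does not preserve the order of the remaining keys, which is fine here because the array is unsorted anyway.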


Motivation
◼ Problem:
◼ What’s the most efficient way of doing this?
◼ Use an array to store the records, in sorted order
◼ Running time:
▪ Adding a record:
▪ Must insert at correct position
▪ And then ALL other records, after insertion spot, must be moved
▪ Very slow, or O(n)
▪ Deleting a record:
▪ Must find the record to delete, O(logn) using binary search
▪ Must fill the “hole”, which means moving all later items, O(n)
▪ Search for a record:
▪ Binary search!
▪ Fast, or O(logn)
Motivation
◼ Problem:
◼ What’s the most efficient way of doing this?
◼ Use a binary search tree to store the records
◼ Running time:
▪ Adding a record:
▪ Inserting into proper position in BST
▪ Fast, or O(logn), provided the tree stays reasonably balanced
▪ Deleting a record:
▪ Must find correct position to delete
▪ Fast, or O(logn), under the same assumption
▪ Search for a record:
▪ Also Fast, or O(logn), under the same assumption

Motivation
◼ Problem:
◼ What’s the most efficient way of doing this?
◼ Use a binary search tree to store the records
◼ BSTs seem to be the best solution to this
◼ But there’s something that is WAAAAAY faster
▪ Adding, Deleting, and Searching are all O(1): CONSTANT time
◼ A very simple, naive solution that you could come up with
before even taking this class
◼ Just use an array! But a special type of an array.
◼ Specifically, use an array that is SOOOOO large that every
record has its own, exclusive cell in the array
◼ Often called a Direct Access Table
Direct Access Table
index       name    score
0
:           :       :
123456789   andy    81.5
:           :       :
334561894   betty   90
:           :       :
589224751   david   56.8
:           :       :
990847852   bill    49
:           :       :
999999999

◼ Assume we stored records based on a social security #.
◼ One way is to store the records in a huge array, index 0..999999999
◼ The index into the array is simply an individual’s SSN.
◼ So this is VERY FAST
◼ Adding, Deleting, and Searching: O(1)

Motivation
◼ Problem:
◼ What’s the most efficient way of doing this?
◼ Use a Direct Access Table
◼ So a Direct Access Table is WAAAAAY fast
◼ But what is the obvious, HUUUGE problem???
◼ Let’s say we want to store 1000 students based on SSN
◼ SSN is 9 digits
▪ Assume the largest SSN is 999-99-9999
◼ So we need an array that is 1 BILLION in size
◼ So, yeah, this direct access table is O(1) in speed
◼ But it is O(stupid) in size and memory
▪ HUGE overkill to have an array of 1 billion to store 1000 records
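
A direct access table is easy to sketch once the key range is small. The example below is an illustration only: it assumes 4-digit keys (0 to 9999) so that it stays runnable, and the names `Cell`, `dat_add`, `dat_delete`, and `dat_search` are hypothetical. With 9-digit SSNs the same array would need one billion cells.

```c
#include <stddef.h>

#define KEY_RANGE 10000   /* assumption: keys are 0..9999, not full SSNs */

typedef struct {
    int    present;   /* 1 if a record is stored in this cell */
    double score;
} Cell;

/* One exclusive cell per possible key; static storage starts zeroed. */
static Cell dat[KEY_RANGE];

/* All three operations index the array directly, so each is O(1).
   Callers must pass a key in the range 0..KEY_RANGE-1. */
void dat_add(int key, double score) {
    dat[key].present = 1;
    dat[key].score = score;
}

void dat_delete(int key) {
    dat[key].present = 0;
}

/* Returns a pointer to the record, or NULL if no record has this key. */
const Cell *dat_search(int key) {
    return dat[key].present ? &dat[key] : NULL;
}
```

Even in this small version, storing a handful of records still reserves all 10,000 cells, which is the space problem described above.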
Motivation
◼ We need a better solution!
◼ We want constant add/delete/search time
◼ And a reasonably sized array
◼ What we ideally want:
◼ Let’s say we want to store 1000 students
◼ So ideally, we only want an array of size 1000
▪ So we don’t waste space
◼ But we still want the “direct access” that results in O(1)
lookup time
◼ How can we do this?
▪ Remembering that it was the SIZE of the array that allowed for
direct access in the first place


Motivation
◼ What we ideally want:
◼ This array is size 1000
◼ And we will place students into this array based on their SSN.
◼ So we need a way of mapping an SSN to an index
◼ Example:
◼ We want SSN: 527-44-7521 to somehow refer to index 368.
◼ If we can do that, then we accomplish our goal

index   SSN           name
0
:       :             :
150     842-33-5821   Andy
:       :             :
368     527-44-7521   Betty
:       :             :
527     452-85-6829   David
:       :             :
884     651-54-3218   Bill
:       :             :
999

Magic Address Calculator
◼ Solution:
◼ Let’s build a make-believe function:
◼ the “magic address calculator”
◼ The input to this function is the “key” (i.e., SSN)
◼ The function converts this SSN into an index into the
reasonably sized array
◼ Ideally, each SSN will “map” into its own index in the array
◼ So this is still in constant time!
◼ Assuming the “magic address calculator” does the
conversion in constant time …which it does!
◼ And we are using a reasonably sized array!
◼ This is the concept of a hash table.
Terminology
◼ Hash table
◼ An array of table items, where the index is
calculated by a hash function
◼ Searching in a hash table:
◼ Let’s say you are searching for a record with key 4256
◼ To find an item in a hash table, you do NOT follow the
standard protocol of searching the entire table, record by
record, comparing the key you are looking for to the key
in each record.
◼ Rather, we use a hash function on the search key to
quickly calculate the index of the item
▪ The hash function converts the key into the correct index into
the table
Terminology
◼ Hash function
◼ A mathematical calculation that maps the search
key to an index in a hash table
◼ Should be fast to calculate
▪ Time for calculation should be O(1)
◼ Should distribute items evenly

◼ Hashing
◼ A way to access a table (array) in relatively
constant (quick) time
◼ Uses a hash function & collision resolution scheme
Hash Example
◼ UCF System for storing student records
◼ Could store everyone’s records with name,
address, and telephone number using SSN as the
search key
◼ Could use entire SSN, but wastes too much space
▪ Again, SSN’s have 9 digits…that’s 1 BILLION different #’s to
account for
▪ But UCF has only 50,000 students...so in an array of size 1
BILLION, only 50,000 spots will be used
▪ EPIC WASTE!
▪ On a side note, there will be no “collisions”
▪ Each record will have its own, personal spot in the array based
on its key (SSN)

Hash Example
◼ UCF System for storing student records
◼ Could store everyone’s records with name,
address, and telephone number using SSN as the
search key
◼ Better to use last five digits of SSN number
◼ For example, instead of using HashTable [589475127] to
access that record, use HashTable[75127]
◼ Now you need an array of size 100,000
▪ Since we are using 5 digits
▪ The array can go from index 0 to index 99999
◼ So this is still twice the # of UCF students
◼ BUT, much better than an array of size 1 BILLION


Hash Example
◼ UCF System for storing student records
◼ Could store everyone’s records with name,
address, and telephone number using SSN as the
search key
◼ Better to use last five digits of SSN number
◼ However, there is a chance of collisions
▪ SSN # 589475127 and SSN # 428475127 have the same last
five digits
▪ So they will end up “mapping” to the same index in the array
▪ This is called a “collision”
▪ That is CLEARLY a problem.
▪ Can’t store two items in one index of the array
▪ So, we will need to know how to handle collisions
▪ Will discuss in a bit
Hash Function
◼ A hash function is written h(x)=i
◼ h is the name of the hash function
◼ x is the record search key
◼ Such as the SSN in our example
◼ i is the output of the hash function
◼ which refers to an index in the array (hash table)
◼ Let’s say we are trying to add to a hash table
◼ Once i is calculated, we can then add the record at
HashTable[i]


Hash Function
◼ A hash function is written h(x)=i
◼ In the UCF student example,
h(589475127)=75127
◼ So now we can take the record (name, address,
phone, etc.) of the student with SSN 589475127
◼ and we can store that record at HashTable[75127]
◼ So this mock UCF hash function simply takes an
SSN and keeps the last five digits
◼ Hash functions can be as easy or as difficult as you
want

Example Hash Functions
◼ Three simple hash functions for integers
1. Selecting digits
2. Folding
3. Modulo arithmetic
◼ Again, these are just examples!
◼ Remember the goal here
◼ Given some key (i.e., SSN, student ID, phone #, etc.)
◼ We want to make a “smaller” version of that key
▪ Because when a key is smaller, that means the size of the
array needed can also be smaller
◼ Use this new key to index the record


3 Simple Hash Functions
◼ Selecting digits hash function
◼ Instead of using the whole integer, only select
several digits
◼ For example, if you have the SSN 123-45-6789, just
use the first 3 digits
◼ h(123456789)=123
◼ This is like the example we already did
◼ Fast & easy to calculate, but usually does not
distribute keys evenly
◼ The first three digits of a social security number
have historically been based on location, so people from the same
state usually share the same first three digits

3 Simple Hash Functions
◼ Folding hash function
◼ Add the digits of the integer together
◼ For example, if you have the SS#123-45-6789, add all
the digits together
◼ h(123456789)=1+2+3+4+5+6+7+8+9=45 with hash
table index range 0 ≤ h(search key) ≤ 81
◼ Can add in different ways for hash tables of
different sizes
◼ h(123456789)=123+456+789=1368 with hash table
index range 0 ≤ h(search key) ≤ 2997

3 Simple Hash Functions
◼ Modulo arithmetic hash function
◼ Using modulus as a hash function
◼ h(x) = x mod tableSize
◼ Using a prime number as tableSize reduces
collisions
◼ For tableSize = 31,
h(123456789) = 123456789 mod 31 = 2
with hash table index range 0 ≤ h(search key) ≤ 30
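
The three simple hash functions (selecting digits, folding, and modulo arithmetic) can be written in a few lines of C each. This is a sketch for illustration; the function names are assumptions, and the digit-selection version assumes a 9-digit key.

```c
/* Selecting digits: keep only the first three digits of a 9-digit key.
   Assumption: key has exactly nine digits. */
int hash_select(long key) {
    return (int)(key / 1000000);
}

/* Folding: add the individual decimal digits together. */
int hash_fold_digits(long key) {
    int sum = 0;
    while (key > 0) {
        sum += (int)(key % 10);
        key /= 10;
    }
    return sum;
}

/* Folding in groups of three digits (e.g. 123 + 456 + 789). */
int hash_fold_groups(long key) {
    int sum = 0;
    while (key > 0) {
        sum += (int)(key % 1000);
        key /= 1000;
    }
    return sum;
}

/* Modulo arithmetic: h(x) = x mod tableSize, for a non-negative key. */
int hash_mod(long key, int table_size) {
    return (int)(key % table_size);
}
```

For the key 123456789 these give 123, 45, 1368, and (with a table size of 31) 2, the same values worked out by hand above.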


Hash Functions
◼ Hash functions only need to be designed to
operate on integers
◼ Although objects such as strings can be used as a
search key, they can be easily converted into an
integer value
◼ Then apply hash function to the integer value


Convert String to Integer
◼ Ways to convert a string to an integer
1. Assign A to Z the numbers 0 to 25, and add the
integers together
2. Use the ASCII or Unicode integer value for each
character, and add the integers together
3. Use the binary number for the ASCII or Unicode
integer value for each character, and
concatenate the binary numbers together


Convert String to Integer
◼ Examples of converting a string to an integer
1. “ABC” would be 0 + 1 + 2 = 3
2. “ABC” would be 65 + 66 + 67 = 198
3. “ABC” would be 01000001, 01000010, and
01000011 concatenated = 010000010100001001000011 =
4,276,803
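
The three conversions can be sketched in C as shown below. The function names are hypothetical, and the first method assumes the string contains only uppercase ASCII letters. For the third method, concatenating 8-bit codes is done by shifting left by 8 bits and OR-ing in the next character, so only as many characters as fit in an `unsigned long` are retained.

```c
#include <stddef.h>

/* Method 1: map A..Z to 0..25 and add.
   Assumption: s holds only uppercase ASCII letters. */
unsigned long str_sum_letters(const char *s) {
    unsigned long sum = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        sum += (unsigned long)(s[i] - 'A');
    return sum;
}

/* Method 2: add the character codes (ASCII values for plain text). */
unsigned long str_sum_codes(const char *s) {
    unsigned long sum = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        sum += (unsigned char)s[i];
    return sum;
}

/* Method 3: concatenate the 8-bit character codes into one number.
   Older characters are shifted out once the value no longer fits. */
unsigned long str_concat_codes(const char *s) {
    unsigned long value = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        value = (value << 8) | (unsigned char)s[i];
    return value;
}
```

The resulting integer would then be passed to a hash function such as x mod tableSize to obtain a table index.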


Terminology
◼ Perfect hash function
◼ Ideal situation where hash function maps each
search key into a different location in the hash
table
◼ Telephone numbers would all map to different indexes
◼ Collision
◼ When a hash function maps two or more search
keys into the same location in the hash table
◼ h(key1) = h(key2), so have the same index value


Example Collision
◼ Need to store the student records of ICS 211
students based on student ID
◼ Student ID has 8 digits, so need array of size
100,000,000
◼ This is a waste of space, so instead use an
array of size 31, with hash function h(x) = x mod
31
◼ h(12345678)=h(26508090)=21 is an example of
a collision
◼ Both should be stored at table[21]


Collision Resolution
◼ In case of a collision, a collision resolution
scheme must be implemented
◼ Assigns the search keys with the same hash
value to different locations in the hash table
◼ Whenever possible, items should be placed evenly in the
hash table in order to avoid these collisions
◼ Or we use another method called Bucket Hashing
or Separate Chaining


Resolving Collisions
◼ Two main approaches to collision resolution
1. Open addressing
2. Restructure the hash table
❖ Bucket Hashing
❖ Separate Chaining


Open Addressing
◼ Open addressing
◼ Probe (search) for open locations in the hash
table
◼ Probe sequence
◼ The sequence of locations that are examined
for a possible open location to put the next
item


Open Addressing
◼ Three types of probing
1. Linear probing
2. Quadratic probing
3. Double hashing


Open Addressing
◼ Linear probing
◼ In the case of a collision, keep going to the
next hash table location until you find an open
location
◼ In other words, if table[i] is occupied, check
table[i+1], table[i+2], table[i+3], …
◼ Need 3 states for each hash table location:
empty, occupied, deleted
◼ Common problem
◼ Items tend to cluster together in the hash table


Open Addressing
◼ Linear probing example
◼ Table size = 31
◼ Hash function = key mod 31
◼ h(1234) = 25 table[25] = 1234
◼ h(4055) = 25+1 table[26] = 4055
◼ h(3962) = 25+2 table[27] = 3962
◼ h(5853) = 25+3 table[28] = 5853
◼ h(1766) = 30 table[30] = 1766
◼ h(1270) = 30+1 table[0] = 1270 (wraps around)
◼ All other table entries are empty
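
The example above can be reproduced with a short insert routine. This is a minimal sketch, assuming non-negative integer keys and using -1 as the "empty" marker; the names `lp_init` and `lp_insert` are hypothetical.

```c
#define TABLE_SIZE 31
#define EMPTY (-1)      /* assumption: real keys are never negative */

/* Mark every slot as empty. */
void lp_init(int table[]) {
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Insert key using linear probing. Returns the index where the key was
   stored, or -1 if the table is full. */
int lp_insert(int table[], int key) {
    int home = key % TABLE_SIZE;            /* h(key) = key mod 31 */
    for (int step = 0; step < TABLE_SIZE; step++) {
        int i = (home + step) % TABLE_SIZE; /* wraps around past the end */
        if (table[i] == EMPTY) {
            table[i] = key;
            return i;
        }
    }
    return -1;
}
```

Inserting 1234, 4055, 3962, 5853, 1766, and 1270 in that order places them at indexes 25, 26, 27, 28, 30, and 0.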


Open Addressing
◼ Empty, occupied, & deleted states
◼ Assume we delete record #3962
◼ Its state must be changed to deleted (not
empty), so we can still locate record #5853
◼ h(1234) = 25 table[25] = 1234
◼ h(4055) = 25 table[26] = 4055
◼ delete(3962) table[27] = “deleted”
◼ h(5853) = 25 table[28] = 5853
◼ no record added table[29] = “empty”
◼ h(1766) = 30 table[30] = 1766
◼ h(1270) = 30 table[0] = 1270 (wraps around)
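
The reason for the third state shows up clearly in code. The sketch below is an illustration under stated assumptions (unique non-negative integer keys, table size 31, hypothetical names `Slot`, `ht_insert`, `ht_search`, `ht_delete`): a search skips over deleted slots but stops at the first empty one.

```c
#define TABLE_SIZE 31

typedef enum { EMPTY, OCCUPIED, DELETED } SlotState;

typedef struct {
    SlotState state;
    int key;
} Slot;

/* Insert with linear probing; the first EMPTY or DELETED slot is used.
   Assumption: the key is not already in the table. Returns the index
   used, or -1 if the table is full. */
int ht_insert(Slot table[], int key) {
    int home = key % TABLE_SIZE;
    for (int step = 0; step < TABLE_SIZE; step++) {
        int i = (home + step) % TABLE_SIZE;
        if (table[i].state != OCCUPIED) {
            table[i].state = OCCUPIED;
            table[i].key = key;
            return i;
        }
    }
    return -1;
}

/* Search: keep probing past DELETED slots, stop at the first EMPTY slot.
   Returns the index of key, or -1 if it is not in the table. */
int ht_search(const Slot table[], int key) {
    int home = key % TABLE_SIZE;
    for (int step = 0; step < TABLE_SIZE; step++) {
        int i = (home + step) % TABLE_SIZE;
        if (table[i].state == EMPTY)
            return -1;
        if (table[i].state == OCCUPIED && table[i].key == key)
            return i;
    }
    return -1;
}

/* Delete: mark the slot DELETED rather than EMPTY, so later keys in the
   same probe sequence can still be found. Returns 0 if key is absent. */
int ht_delete(Slot table[], int key) {
    int i = ht_search(table, key);
    if (i < 0)
        return 0;
    table[i].state = DELETED;
    return 1;
}
```

If `ht_delete` set the slot to EMPTY instead, a later search for 5853 would stop at index 27 and wrongly report that the record is missing.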


Open Addressing
◼ Linear Probing: after checking spot h(k), try spot h(k)+1;
if that is full, try h(k)+2, then h(k)+3, etc.
◼ Insert: 38, 19, 8, 109, 10
[Figure: an empty hash table with indexes 0 through 9]
Linear Probing – Clustering

[Figure: clustering under linear probing, with cases labeled “no collision”, “collision in small cluster”, “no collision”, and “collision in large cluster”. Credit: R. Sedgewick]
Open Addressing
◼ Quadratic probing
◼ Instead of checking the next location
sequentially, check the next location based on
a sequence of squares
◼ In other words, if table[i] is occupied, check
table[i+1^2], table[i+2^2], table[i+3^2], …
◼ Still have clustering (called “secondary clustering”),
but this method is not as problematic as linear
probing


Open Addressing
◼ Quadratic probing example
◼ Table size = 31
◼ Hash function = key mod 31
◼ h(1234) = 25 table[25] = 1234
◼ h(4055) = 25+1^2 table[26] = 4055
◼ h(3962) = 25+2^2 table[29] = 3962
◼ h(5853) = 25+3^2 table[3] = 5853 (wraps around)
◼ h(1766) = 30 table[30] = 1766
◼ h(1270) = 30+1^2 table[0] = 1270 (wraps around)
◼ All other table entries are empty
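
The same example can be run with a small change to the probing step. This is a sketch only, assuming non-negative integer keys with -1 as the empty marker; `qp_init` and `qp_insert` are hypothetical names.

```c
#define TABLE_SIZE 31
#define EMPTY (-1)      /* assumption: real keys are never negative */

/* Mark every slot as empty. */
void qp_init(int table[]) {
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = EMPTY;
}

/* Insert key using quadratic probing: try h, h+1, h+4, h+9, ...
   (all mod TABLE_SIZE). Returns the index used, or -1 if no free slot
   was found along this probe sequence. */
int qp_insert(int table[], int key) {
    int home = key % TABLE_SIZE;
    for (int step = 0; step < TABLE_SIZE; step++) {
        int i = (home + step * step) % TABLE_SIZE;
        if (table[i] == EMPTY) {
            table[i] = key;
            return i;
        }
    }
    return -1;
}
```

Unlike linear probing, a return value of -1 here does not prove the table is full, because the quadratic sequence does not necessarily visit every slot.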


Quadratic Probing
◼ Insert: 89, 18, 49, 58, 79
[Figure: an empty hash table with indexes 0 through 9]
Open Addressing
◼ Double hashing
◼ Use two hash functions, where second hash
function determines the step size to next hash
table index
◼ Some restrictions
◼ h2(searchKey) != 0 (step size should not be zero)
◼ h2 != h1 (avoids clustering)


Quadratic Probing:
Success guarantee for λ < ½
◼ If size is prime and λ < ½ (λ is the load factor, the fraction of
the table that is full), then quadratic probing
will find an empty slot in size/2 probes or fewer.
◼ show for all 0 ≤ i, j ≤ size/2 and i ≠ j
$(h(x) + i^2) \bmod \text{size} \ne (h(x) + j^2) \bmod \text{size}$
◼ by contradiction: suppose that for some i ≠ j:
$(h(x) + i^2) \bmod \text{size} = (h(x) + j^2) \bmod \text{size}$
$\Rightarrow i^2 \bmod \text{size} = j^2 \bmod \text{size}$
$\Rightarrow (i^2 - j^2) \bmod \text{size} = 0$
$\Rightarrow [(i + j)(i - j)] \bmod \text{size} = 0$

Because size is prime, (i - j) or (i + j) must be zero mod size, and neither can be, since both are nonzero and smaller than size in absolute value.
Quadratic Probing: Properties
◼ For any λ < ½, quadratic probing will find an empty
slot; for bigger λ, quadratic probing may find a slot
◼ Quadratic probing does not suffer from primary
clustering: keys hashing to the same area are not
bad
◼ But what about keys that hash to the same spot?
◼ Secondary Clustering!
Open Addressing
◼ Double hashing example
◼ Table size = 31
◼ Hash function #1 = key mod 31
◼ Hash function #2 = 23 – (key mod 23)
◼ h1(1234) = 25, table[25] = 1234
◼ h1(4055) = 25, h2(4055) = 16, (25+16) mod 31 = 10, table[10] = 4055
◼ h1(3962) = 25, h2(3962) = 17, (25+17) mod 31 = 11, table[11] = 3962
◼ h1(5853) = 25, h2(5853) = 12, (25+12) mod 31 = 6, table[6] = 5853
◼ h1(1766) = 30, table[30] = 1766
◼ h1(1270) = 30, h2(1270) = 18, (30+18) mod 31 = 17, table[17] = 1270
◼ All other table entries are empty

Open Addressing
◼ Double hashing example
◼ h1(key) = key mod 13
◼ h2(key) = 11 – (key mod 11)
◼ If key = 30, probe sequence would be 4, 7, 10, 0, 3,
6, 9, 12, 2, 5, 8, 11, 1 (step 3 each time)
◼ If key = 50, probe sequence would be 11, 3, 8, 0, 5,
10, 2, 7, 12, 4, 9, 1, 6 (step 5 each time)
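
The probe sequences above can be generated with a short routine. This is a minimal sketch using the two hash functions from this example (table size 13); the function name `probe_sequence` and its parameters are assumptions made for illustration.

```c
#define TABLE_SIZE 13

/* The two hash functions from the example (for non-negative keys). */
int h1(int key) { return key % 13; }
int h2(int key) { return 11 - (key % 11); }   /* step size, never 0 */

/* Hypothetical helper: writes the first `count` probe indexes for key
   into out[]. The caller must provide room for `count` ints. */
void probe_sequence(int key, int out[], int count) {
    int index = h1(key);   /* starting location */
    int step = h2(key);    /* fixed step for this key */
    for (int n = 0; n < count; n++) {
        out[n] = index;
        index = (index + step) % TABLE_SIZE;
    }
}
```

Because the step depends on the key, two keys that collide at the same starting index usually follow different probe sequences, which is what reduces clustering.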


Resolving Collisions with Double Hashing
Hash Functions:
H(K) = K mod M
H2(K) = 1 + ((K/M) mod (M-1))
M =
Insert these values into the hash table in this order. Resolve any collisions
with double hashing:
13, 28, 33, 147, 43
[Figure: an empty hash table with indexes 0 through 9]
Open Addressing
◼ If table size is prime, then the double hashing probe
sequence will visit all table locations
◼ With open addressing, increasing table size
will reduce collisions
◼ When increasing the size, the hash function
needs to be reapplied to every item in the old
hash table to place it in the new hash table


Restructuring the Hash Table
◼ How is a hash table restructured for
collision resolution?
◼ The structure of the hash table is changed so
that the same index location can store multiple
items
◼ Two ways to restructure a hash table for
collision resolution
1. Bucket hashing
2. Separate chaining


Restructuring the Hash Table
◼ Bucket hashing
◼ A hash table that has an array at each location
table[i], so that items of the same hash index
are stored here
◼ Choosing the size of the bucket is problematic
◼ If too small, will have collisions
◼ If too big, will waste space


Restructuring the Hash Table
◼ Bucket hashing example
◼ Table size = 31
◼ Hash function = key mod 31
◼ h(1234) = 25 table[25][0] = 1234
◼ h(4055) = 25 table[25][1] = 4055
◼ h(3962) = 25 table[25][2] = 3962
◼ h(5853) = 25 table[25][3] = 5853
◼ h(1766) = 30 table[30][0] = 1766
◼ h(1270) = 30 table[30][1] = 1270
◼ All other table entries are empty
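
Bucket hashing maps naturally onto a two-dimensional array. The sketch below is illustrative only: the bucket size of 4 is an assumption chosen so the example above just fits, and the names `Bucket`, `bucket_insert`, and `bucket_contains` are hypothetical.

```c
#define TABLE_SIZE 31
#define BUCKET_SIZE 4   /* assumption: each bucket holds at most 4 keys */

typedef struct {
    int keys[BUCKET_SIZE];
    int count;          /* number of keys stored in this bucket */
} Bucket;

/* Insert key into its bucket. Returns the position inside the bucket,
   or -1 if the bucket is already full (overflow). */
int bucket_insert(Bucket table[], int key) {
    Bucket *b = &table[key % TABLE_SIZE];
    if (b->count >= BUCKET_SIZE)
        return -1;
    b->keys[b->count] = key;
    return b->count++;
}

/* Returns 1 if key is stored in its bucket, 0 otherwise. */
int bucket_contains(const Bucket table[], int key) {
    const Bucket *b = &table[key % TABLE_SIZE];
    for (int i = 0; i < b->count; i++)
        if (b->keys[i] == key)
            return 1;
    return 0;
}
```

The fixed `BUCKET_SIZE` shows the sizing problem directly: a fifth key hashing to index 25 would be rejected, while most of the other 30 buckets sit empty.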


Restructuring the Hash Table
◼ Separate chaining
◼ A hash table that has a linked list (a chain) at
each location table[i], so that items of the
same hash index are stored here
◼ Size of the table is dynamic
◼ Less problematic than static bucket implementation


Restructuring the Hash Table
◼ Separate chaining example
◼ Table size = 31
◼ Hash function = key mod 31
◼ h(1234) = 25, table[25]=>1234
◼ h(4055) = 25, table[25]=>4055=>1234
◼ h(3962) = 25, table[25]=>3962=>4055=>1234
◼ h(5853) = 25, table[25]=>5853=>3962=>4055=>1234
◼ h(1766) = 30, table[30]=>1766
◼ h(1270) = 30, table[30]=>1270=>1766
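
The chains in this example grow at the front, which suggests insertion at the head of each list. A minimal sketch under that assumption follows; the names `Node`, `chain_insert`, `chain_contains`, and `chain_free` are hypothetical, and keys are assumed to be non-negative integers.

```c
#include <stdlib.h>

#define TABLE_SIZE 31

typedef struct Node {
    int key;
    struct Node *next;
} Node;

/* Insert key at the head of its chain. Returns 1 on success,
   0 if memory could not be allocated. */
int chain_insert(Node *table[], int key) {
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    int i = key % TABLE_SIZE;   /* h(key) = key mod 31 */
    n->key = key;
    n->next = table[i];         /* old head becomes the second node */
    table[i] = n;
    return 1;
}

/* Walk the chain at h(key). Returns 1 if key is found, 0 otherwise. */
int chain_contains(Node *table[], int key) {
    for (const Node *n = table[key % TABLE_SIZE]; n != NULL; n = n->next)
        if (n->key == key)
            return 1;
    return 0;
}

/* Release every node and reset all chains to empty. */
void chain_free(Node *table[]) {
    for (int i = 0; i < TABLE_SIZE; i++) {
        while (table[i] != NULL) {
            Node *next = table[i]->next;
            free(table[i]);
            table[i] = next;
        }
    }
}
```

Inserting at the head keeps insertion O(1); searching costs time proportional to the length of one chain rather than the size of the whole table.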


Hash Tables
◼ Summary:
◼ We use a hash table to accomplish O(1) access
time into a table
◼ While keeping the table to a reasonable size
◼ Use a hash function to map the record “keys” into an
index in the hash table
◼ Collisions are bound to happen and are taken care of
using several possible methods
◼ Comparison of Implementations (slowest to
quickest)
◼ Linear probing, quadratic probing, double hashing,
separate chaining
Here Is Another Example. The Insertion Sequence Is: Single Rot. Left at 50
No ratings yet
Here Is Another Example. The Insertion Sequence Is: Single Rot. Left at 50
2 pages
FASTA Algorithm
No ratings yet
FASTA Algorithm
15 pages
2014-Aut
No ratings yet
2014-Aut
5 pages
The Steps of The Simplex Algorithm
No ratings yet
The Steps of The Simplex Algorithm
8 pages
15 String Matching
No ratings yet
15 String Matching
45 pages
Showclassmst
No ratings yet
Showclassmst
17 pages
Assignment 1 Front Sheet: Qualification BTEC Level 5 HND Diploma in Computing
No ratings yet
Assignment 1 Front Sheet: Qualification BTEC Level 5 HND Diploma in Computing
15 pages
Nptel Week 8
No ratings yet
Nptel Week 8
3 pages
CS301 Final Term Solved MCQs by JUNAID
No ratings yet
CS301 Final Term Solved MCQs by JUNAID
34 pages
Session 5 - Print Version - Stiffness Method-Truss Analysis-Matrix
No ratings yet
Session 5 - Print Version - Stiffness Method-Truss Analysis-Matrix
17 pages
Look & Clook Scheduling
No ratings yet
Look & Clook Scheduling
4 pages
2 - Decision Tree
No ratings yet
2 - Decision Tree
23 pages
QUIZ-1 - Attempt Review
No ratings yet
QUIZ-1 - Attempt Review
4 pages
Introduction To Greedy Algorithms
No ratings yet
Introduction To Greedy Algorithms
12 pages
Algorithm Quiz
No ratings yet
Algorithm Quiz
40 pages
3 Numerical Optimization
No ratings yet
3 Numerical Optimization
17 pages
The Simplex Method: Standard Form of LPP
No ratings yet
The Simplex Method: Standard Form of LPP
6 pages
