BCS304 DS Module 5 Notes

Data Structures and Applications (BCS304) Module 5: Hashing, Priority Queues & Efficient OBST

Module 5

Hashing: Introduction, Static Hashing, Dynamic Hashing.

Priority Queues: Single and double ended Priority Queues, Leftist Trees

Introduction to Efficient Binary Search Trees: Optimal Binary Search Trees

Text Book: Chapter 8: 8.1 to 8.3, Chapter 9: 9.1, 9.2, Chapter 10: 10.1

Department of AI & ML, Vemana IT. Prepared by: Dr. Kantharaju H C



5.1. Hashing: Introduction


What is Hashing?
• Hashing is a technique or process of mapping keys and values into a hash table by using a hash
function.
• It is done for faster access to elements. The efficiency of the mapping depends on the efficiency of
the hash function used.
• Let a hash function H(x) map the value x to the index x % 10 in an array.
• For example, the values [11, 12, 13, 14, 15] will be stored at positions {1, 2, 3, 4, 5} in the
array or hash table respectively.

Need for Hashing
• Every day, the data on the internet is increasing multifold and it is always a struggle to store this
data efficiently.
• In day-to-day programming, this amount of data might not be that big, but still, it needs to be
stored, accessed, and processed easily and efficiently.
• A very common data structure that is used for such a purpose is the Array data structure.

Components of Hashing
• There are three main components of hashing (refer to the figure below):
o Key: A Key can be anything string or integer which is fed as input in the hash function that
determines an index or location for storage of an item in a data structure.
o Hash Function: The hash function receives the input key and returns the index of an
element in an array called a hash table. The index is known as the hash index.
o Hash Table: Hash table is a data structure that maps keys to values using a special function
called a hash function. Hash stores the data in an associative manner in an array where each
data value has its own unique index.

Hash Table
• Hash table is one of the most important data structures that uses a special function known as a hash
function that maps a given value with a key to access the elements faster.
• A Hash table is a data structure that stores some information, and the information has basically
two main components, i.e., key and value.


• The hash table can be implemented with the help of an associative array. The efficiency of mapping
depends upon the efficiency of the hash function used for mapping.
• For example, suppose the key is John and the value is the phone number; we pass the key
to the hash function as shown below:
Hash(key) = index;
When we pass the key to the hash function, it gives the index.
Hash(john) = 3;
The above example adds john at index 3.

Drawback of Hash function


• A hash function should assign each key a unique index. Sometimes a hash table uses an imperfect
hash function that causes a collision, because the hash function generates the same index for two
different keys.

How does Hashing work?


• Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to store it in a table.
• Our main objective here is to search or update the values stored in the table quickly in O(1) time
and we are not concerned about the ordering of strings in the table.
• So the given set of strings can act as keys, and the string itself will act as the value,
but how do we store the value corresponding to the key?
• Step 1: We know that hash functions (which is some mathematical formula) are used to calculate
the hash value which acts as the index of the data structure where the value will be stored.
• Step 2: So, let’s assign


“a” = 1,
“b”=2, .. etc, to all alphabetical characters.
• Step 3: Therefore, compute the numerical value of each string by summing its characters:
“ab” = 1 + 2 = 3,
“cd” = 3 + 4 = 7 ,
“efg” = 5 + 6 + 7 = 18
• Step 4: Now, assume that we have a table of size 7 to store these strings. The hash function that is
used here is the sum of the characters in key mod Table size. We can compute the location of the
string in the array by taking the sum(string) mod 7.
• Step 5: So we will then store
“ab” in 3 mod 7 = 3,
“cd” in 7 mod 7 = 0, and
“efg” in 18 mod 7 = 4.

• The above technique enables us to calculate the location of a given string by using a simple hash
function and rapidly find the value that is stored in that location.
• Therefore the idea of hashing seems like a great way to store (key, value) pairs of the data in a
table.

What is a Hash function?


• The hash function creates a mapping between key and value, this is done through the use of
mathematical formulas known as hash functions. The result of the hash function is referred to as a
hash value or hash.
• The hash value is a representation of the original string of characters but usually smaller than the
original.
• For example: Consider an array as a Map where the key is the index and the value is the value at
that index.
• So for an array A if we have index i which will be treated as the key then we can find the value by
simply looking at the value at A[i].


Properties of a Good Hash Function


1. Low cost:
▪ The cost of executing a hash function must be small, so that using the hashing technique
becomes preferable over other approaches.
▪ For example, if the binary search algorithm can search for an element in a sorted table of n items
with log2 n key comparisons, then the hash function must cost less than performing log2 n key
comparisons.
2. Determinism:
▪ A hash procedure must be deterministic. This means that the same hash value must be generated
for a given input value.
▪ However, this criterion excludes hash functions that depend on external variable parameters
(such as the time of day) or on the memory address of the object being hashed (because the address
of the object may change during processing).
3. Uniformity:
▪ A good hash function must map the keys as evenly as possible over its output range.
▪ This means that the probability of generating every hash value in the output range should
roughly be the same.
▪ The property of uniformity also minimizes the number of collisions.

4. Efficiently computable.
5. Should uniformly distribute the keys (each table position should be equally likely for each key).
6. Should have a low load factor (the number of items in the table divided by the size of the table).

5.2. Different Hash Functions


• There are four common ways of calculating the hash function:
o Division Method
o Multiplication Method
o Mid-Square method
o Folding Method
Division method:
▪ It is the simplest method of hashing an integer x.
▪ This method divides x by M and then uses the remainder obtained.
▪ The hash function can be defined as:


h(x) = x % M;
where M is the size of the hash table.
▪ For example, if the key value is 6 and the size of the hash table is 10. When we apply the hash
function to key 6 then the index would be:
h(6) = 6%10 = 6
The index is 6 at which the value is stored.
▪ Example: Calculate the hash values of keys 1234 and 5642.
Solution Setting M = 97, hash values can be calculated as:
h(1234) = 1234 % 97 = 70
h(5642) = 5642 % 97 = 16

Multiplication Method
▪ The steps involved in the multiplication method are as follows:
Step 1: Choose a constant A such that 0 < A < 1.
Step 2: Multiply the key k by A.
Step 3: Extract the fractional part of kA.
Step 4: Multiply the result of Step 3 by the size of hash table (m).
▪ Hence, the hash function can be given as:
h(k) = floor(m (kA mod 1))
▪ Example: Given a hash table of size 1000, map the key 12345 to an appropriate location in the
hash table.
Solution: We will use A = 0.618033, m = 1000, and k = 12345
h(12345) = floor(1000 × (12345 × 0.618033 mod 1))
h(12345) = floor(1000 × (7629.617385 mod 1))
h(12345) = floor(1000 × 0.617385)
h(12345) = floor(617.385)
h(12345) = 617

Mid-Square Method
▪ The mid-square method is a good hash function which works in two steps:
Step 1: Square the value of the key. That is, find k².
Step 2: Extract the middle r digits of the result obtained in Step 1.


▪ The algorithm works well because most or all digits of the key value contribute to the result.
▪ This is because all the digits in the original key value contribute to produce the middle digits of
the squared value.
▪ Therefore, the result is not dominated by the distribution of the bottom digit or the top digit of the
original key value.
▪ In the mid-square method, the same r digits must be chosen from all the keys. Therefore, the
hash function can be given as:
h(k) = s
where s is obtained by selecting r digits from k².
▪ Example: Calculate the hash value for keys 1234 and 5642 using the mid-square method.
The hash table has 100 memory locations.
Solution: Note that the hash table has 100 memory locations whose indices vary from 0 to 99.
This means that only two digits are needed to map the key to a location in the hash table, so r = 2.
When k = 1234, k² = 1522756, h(1234) = 27
When k = 5642, k² = 31832164, h(5642) = 21
Observe that the 3rd and 4th digits starting from the right are chosen.

Folding Method
▪ The folding method works in the following two steps:
▪ Step 1: Divide the key value into a number of parts. That is, divide k into parts k1, k2, ..., kn, where
each part has the same number of digits except the last part, which may have fewer digits than the
other parts.
▪ Step 2: Add the individual parts. That is, obtain the sum of k1 + k2 + ... + kn. The hash value is
produced by ignoring the last carry, if any.
▪ Note that the number of digits in each part of the key will vary depending upon the size of the hash
table. For example, if the hash table has a size of 1000, then there are 1000 locations in the hash
table. To address these 1000 locations, we need at least three digits; therefore, each part of the key
must have three digits except the last part, which may have fewer digits.
▪ Example: Given a hash table of 100 locations, calculate the hash value using folding method for
keys 5678, 321, and 34567.


▪ Solution: Since there are 100 memory locations to address, we will break the key into parts where
each part (except the last) will contain two digits. The hash values can be obtained as shown below:
h(5678): 56 + 78 = 134; ignoring the carry, h(5678) = 34
h(321): 32 + 1 = 33, so h(321) = 33
h(34567): 34 + 56 + 7 = 97, so h(34567) = 97

5.3. Collisions
▪ The hashing process generates a small number for a big key, so there is a possibility that two keys
could produce the same value.
▪ The situation where a newly inserted key maps to an already occupied slot is called a collision, and
it must be handled using some collision resolution technique.

How to handle Collisions?


▪ There are mainly two methods to handle collisions:
o Separate Chaining
o Open Addressing


Separate Chaining (Open Hashing)


o The idea is to make each cell of the hash table point to a linked list of records that have the same
hash function value. Chaining is simple but requires additional memory outside the table.
o Example: We are given a hash function and we have to insert some elements into the hash table
using the separate chaining method as the collision resolution technique.
Hash function = key % 5,
Elements = 12, 15, 22, 25 and 37.
o Let’s see step by step approach to how to solve the above problem:
o Step 1: First draw the empty hash table which will have a possible range of hash values from 0 to
4 according to the hash function provided.

o Step 2: Now insert all the keys in the hash table one by one. The first key to be inserted is 12
which is mapped to bucket number 2 which is calculated by using the hash function 12%5=2.

o Step 3: Now the next key is 22. It will map to bucket number 2 because 22%5=2. But bucket 2
is already occupied by key 12, so the collision is handled by linking 22 after 12 in bucket 2.


o Step 4: The next key is 15. It will map to slot number 0 because 15%5=0.


o Step 5: Now the next key is 25. Its bucket number will be 25%5=0. But bucket 0 is already
occupied by key 15. So the separate chaining method will again handle the collision by adding
25 to the linked list at bucket 0.

o Hence In this way, the separate chaining method is used as the collision resolution technique.

Open Addressing
o In open addressing, all elements are stored in the hash table itself. Each table entry contains
either a record or NIL. When searching for an element, we examine the table slots one by one
until the desired element is found or it is clear that the element is not in the table.

a. Linear Probing
o In linear probing, the hash table is searched sequentially, starting from the original hash
location. If the location that we get is already occupied, then we check the next
location.

o Algorithm:
I. Calculate the hash key. i.e. key = data % size
II. Check, if hashTable[key] is empty
o store the value directly by hashTable[key] = data
III. If the hash index already has some value then
o check for next index using key = (key+1) % size
IV. Check if the next index hashTable[key] is available, then store the value there.
Otherwise try the next index.
V. Repeat the above process till we find free space.

Example: Let us consider a simple hash function as “key mod 5” and a sequence of keys that are
to be inserted are 50, 70, 76, 85, 93.
• Step1: First draw the empty hash table which will have a possible range of hash values from 0 to
4 according to the hash function provided.

• Step 2: Now insert all the keys in the hash table one by one. The first key is 50. It will map to
slot number 0 because 50%5=0. So insert it into slot number 0.


• Step 3: The next key is 70. It will map to slot number 0 because 70%5=0, but 50 is already at slot
number 0, so search for the next empty slot and insert it at slot 1.

• Step 4: The next key is 76. It will map to slot number 1 because 76%5=1, but 70 is already at slot
number 1, so search for the next empty slot and insert it at slot 2.

• Step 5: The next key is 85. It will map to slot number 0 because 85%5=0, but slots 0, 1 and 2 are
already occupied, so insert it into the next empty slot, slot 3.

• Step 6: The next key is 93. It will map to slot number 3 because 93%5=3, but 85 is already at slot
number 3, so insert it into the next empty slot, slot 4.

b) Quadratic Probing
• Quadratic probing is an open addressing scheme in computer programming for resolving hash
collisions in hash tables. Quadratic probing operates by taking the original hash index and adding
successive values of an arbitrary quadratic polynomial until an open slot is found.
• An example sequence using quadratic probing is:
H + 1², H + 2², H + 3², H + 4², …, H + k²


• In this method we look for the i²-th probe (slot) in the i-th iteration, where i = 0, 1, . . ., n – 1. We
always start from the original hash location. If that location is occupied, then we check the
other slots.
• Let hash(x) be the slot index computed using the hash function and n be the size of the hash
table.
If the slot hash(x) % n is full, then we try (hash(x) + 1²) % n.
If (hash(x) + 1²) % n is also full, then we try (hash(x) + 2²) % n.
If (hash(x) + 2²) % n is also full, then we try (hash(x) + 3²) % n.

This process is repeated for all values of i until an empty slot is found.
Example: Let us consider table size = 7, the hash function Hash(x) = x % 7, and the collision
resolution strategy f(i) = i². Insert 22, 30, and 50.
Step 1: Create a table of size 7.

Step 2 – Insert 22 and 30


Hash(22) = 22 % 7 = 1, Since the cell at index 1 is empty, we can easily insert 22 at slot 1.
Hash(30) = 30 % 7 = 2, Since the cell at index 2 is empty, we can easily insert 30 at slot 2.


Step 3: Inserting 50
• Hash(50) = 50 % 7 = 1
• In our hash table slot 1 is already occupied. So, we will search for slot 1+1², i.e. 1+1 = 2.
• Again slot 2 is found occupied, so we will search for cell 1+2², i.e. 1+4 = 5.
• Now, cell 5 is not occupied so we will place 50 in slot 5.


c) Double Hashing
• Double hashing is a collision resolution technique for open addressed hash tables. Double
hashing makes use of two hash functions.
• The first hash function is h1(k), which takes the key and gives a location on the hash table.
If that location is empty, we can place our key there directly.
• But in case the location is occupied (a collision), we will use the secondary hash function h2(k) in
combination with the first hash function h1(k) to find a new location on the hash table.
• This combination of hash functions is of the form
h(k, i) = (h1(k) + i * h2(k)) % n
where
i is a non-negative integer that indicates a collision number,
k = element/key which is being hashed
n = hash table size.

• Complexity of the Double hashing algorithm: Time complexity: O(n)

• Example: Insert the keys 27, 43, 692, 72 into the Hash Table of size 7. where first hash-function
is h1(k) = k mod 7 and second hash-function is h2(k) = 1 + (k mod 5)
• Step 1: Insert 27
27 % 7 = 6, location 6 is empty so insert 27 into 6 slot.

Step 2: Insert 43
43 % 7 = 1, location 1 is empty so insert 43 into 1 slot.

Step 3: Insert 692


692 % 7 = 6, but location 6 is already being occupied and this is a collision


So we need to resolve this collision using double hashing.


hnew = [h1(692) + i * h2(692)] % 7
     = [6 + 1 * (1 + 692 % 5)] % 7
     = 9 % 7
     = 2

Now, as 2 is an empty slot, we can insert 692 into slot 2.

Step 4: Insert 72
72 % 7 = 2, but location 2 is already being occupied and this is a collision.
So we need to resolve this collision using double hashing.
hnew = [h1(72) + i * h2(72)] % 7
     = [2 + 1 * (1 + 72 % 5)] % 7
     = 5 % 7
     = 5
Now, as 5 is an empty slot, we can insert 72 into slot 5.

What is meant by Load Factor in Hashing?


• The load factor of the hash table can be defined as the number of items the hash table contains
divided by the size of the hash table.
• Load factor is the decisive parameter that is used when we want to rehash the previous hash
function or want to add more elements to the existing hash table.


• It helps us in determining the efficiency of the hash function i.e. it tells whether the hash function
which we are using is distributing the keys uniformly or not in the hash table.
Load Factor = Total elements in hash table/ Size of hash table

What is Rehashing?
• As the name suggests, rehashing means hashing again. Basically, when the load factor increases
to more than its predefined value (the default value of the load factor is 0.75), the complexity
increases.
• So to overcome this, the size of the array is increased (doubled) and all the values are hashed again
and stored in the new double-sized array to maintain a low load factor and low complexity.

Applications of Hash Data structure


• Hash is used in databases for indexing.
• Hash is used in disk-based data structures.
• In some programming languages such as Python and JavaScript, hashing is used to implement objects.
• Password verification
• Associating filename with their paths in operating systems
• Data Structures, where a key-value pair is created in which the key is a unique value, whereas the
value associated with the keys can be either same or different for different keys.
• Board games such as Chess, tic-tac-toe, etc.
• Graphics processing, where a large amount of data needs to be matched and fetched.

Real-Time Applications of Hash Data structure


• Hash is used for cache mapping for fast access to the data.
• Hash can be used for password verification.
• Hash is used in cryptography as a message digest.

Advantages of Hash Data structure


• Hash provides better synchronization than other data structures.
• Hash tables are more efficient than search trees or other data structures
• Hash provides constant time for searching, insertion, and deletion operations on average.


Disadvantages of Hash Data structure


• Hashing is inefficient when there are many collisions.
• Hash collisions are practically unavoidable for a large set of possible keys.
• Many hash table implementations do not allow null values.

5.4. Static and Dynamic Hashing


5.4.1. Static Hashing
What is Static Hashing?
• It is a hashing technique that enables users to look up a definite data set. Meaning, the data in the
directory is not changing; it is "static" or fixed.
• In static hashing, when a search key value is provided, the hash function always computes the
same address. It is used in databases.
• Example: For student id = 76, using a mod-5 hash function always results in the same bucket address.
• There will not be any changes to the bucket address.

Operations Provided by Static Hashing


• Static hashing provides the following operations −
o Delete − Search for a record's address and delete the record at that address, or delete a chunk
of records stored at that address in memory.
o Insertion − While entering a new record using static hashing, the hash function (h)
calculates bucket address "h(K)" for the search key (k), where the record is going to be
stored.


o Search − A record can be obtained using a hash function by locating the address of the
bucket where the data is stored.
o Update − It supports updating a record once it is traced in the data bucket.

Advantages of Static Hashing


• Offers unparalleled performance for small-size databases.
• Allows Primary Key value to be used as a Hash Key.

Disadvantages of Static Hashing


• It cannot work efficiently with the databases that can be scaled.
• It is not a good option for large-size databases.
• Bucket overflow issue occurs if there is more data and less memory.

5.4.2. Dynamic Hashing


• The major drawback of static hashing is that it does not expand or shrink dynamically as the size of
the data grows or shrinks.
• In dynamic hashing, data buckets grow or shrink (are added or removed dynamically). It is also
called extendible hashing.
• Thus, the resulting data bucket keeps increasing or decreasing depending on the number of records.
• In this hashing technique, the resulting number of data buckets in memory is ever-changing.

• Extendible hashing: it is a dynamic hashing method in which directories and buckets are used to
hash the data.
• It is a flexible method in which the hash function also experiences dynamic change.
• Directories: The directories store addresses of bucket pointers, and an ID is assigned to each
directory, which may change each time the directory is expanded.
• Bucket: The buckets are used to hash the actual data into the hash table.


Dynamic Hashing using Directories:


• Consider an example: insert 16, 4, 6, 22, 24, 10, 31, 7, 9, 20, 26 into a hash table of order 3
(each bucket holds at most 3 keys).
• Convert the given keys into binary representation to store the values in the hash table:
Key Binary Representation
16 10000
4 00100
6 00110
22 10110
24 11000
10 01010
31 11111
7 00111
9 01001
20 10100
26 11010

• Create a directory with 2 initial slots, addressed by the last bit of the binary representation. The
values 16, 4 and 6 all have last bit 0, so they are added to bucket 0 in sequence.

• 22 also has last bit 0, but bucket 0 is already full (order 3), so the directory is split into 4 slots
addressed by the last 2 bits, and the keys are redistributed accordingly (as shown). Since buckets
01 and 11 do not overflow, each is retained as a single bucket. (After insertion of 16, 4, 6, 22, 24,
10, 31, 7, 9.)


• Now, if we insert 20, the 00 bucket already holds 3 keys (its order), so that slot is split further
using the last 3 bits, and insertion continues based on the binary values.

• All the keys are now inserted into the hash table. Buckets whose keys end in 1 are not split, as
their size has not exceeded the order.

Operations Provided by Dynamic Hashing


• Delete − Locate the desired location and support deleting data (or a chunk of data) at that location.
• Insertion − Support inserting new data into the data bucket if there is a space available in the data
bucket.
• Query − Perform querying to compute the bucket address.
• Update − Perform a query to update the data.

Advantages of Dynamic Hashing


• It works well with scalable data.
• It can handle addressing large amount of memory in which data size is always changing.
• Bucket overflow issue comes rarely or very late.

Disadvantages of Dynamic Hashing


• The location of the data in memory keeps changing according to the bucket size. Hence if there is
a phenomenal increase in data, then maintaining the bucket address table becomes a challenge.


5.4.3. Static Hashing vs Dynamic Hashing

Key Factor      | Static Hashing                                               | Dynamic Hashing
Form of Data    | Fixed-size, non-changing data.                               | Variable-size, changing data.
Result          | The resulting data bucket is of fixed length.                | The resulting data bucket is of variable length.
Bucket Overflow | Bucket overflow can arise often, depending upon memory size. | Bucket overflow occurs very late or not at all.
Complexity      | Simple                                                       | Complex

2) Design and develop a program in C that uses the hash function H: K → L as H(K) = K mod m
(remainder method) and implement the hashing technique to map a given key K to the
address space L. Resolve collisions (if any) using linear probing.
#include <stdio.h>
#include <stdlib.h>
#define MAX 10
int create(int);
void linear_prob(int[], int, int);
void display (int[]);
void main() {
    int a[MAX], num, key, i;
    int ans = 1;
    printf("Collision handling by linear probing:\n");
    for (i = 0; i < MAX; i++)
        a[i] = -1;                       /* -1 marks an empty slot */
    do {
        printf("\nEnter the data: ");
        scanf("%4d", &num);
        key = create(num);
        linear_prob(a, key, num);
        printf("\nDo you wish to continue? (1/0) ");
        scanf("%d", &ans);
    }


    while (ans);
    display(a);
}

int create(int num) {
    int key;
    key = num % MAX;   /* H(K) = K mod m, where m = MAX, the table size */
    return key;
}

void linear_prob(int a[MAX], int key, int num) {
    int flag, i, count = 0;
    flag = 0;
    if (a[key] == -1) {
        a[key] = num;
    }
    else {
        printf("\nCollision Detected...!!!\n");
        i = 0;
        while (i < MAX) {                /* count the occupied slots */
            if (a[i] != -1)
                count++;
            i++;
        }
        printf("Collision avoided successfully using LINEAR PROBING\n");
        if (count == MAX) {
            printf("\nHash table is full");
            display(a);
            exit(1);
        }
        for (i = key + 1; i < MAX; i++)  /* probe forward from key + 1 */
            if (a[i] == -1) {
                a[i] = num;
                flag = 1;


                break;
            }
        i = 0;                           /* wrap around to the start */
        while ((i < key) && (flag == 0)) {
            if (a[i] == -1) {
                a[i] = num;
                flag = 1;
                break;
            }
            i++;
        }
    }
}

void display(int a[MAX]) {
    int i, choice;
    printf("1. Display ALL\n2. Filtered Display\n");
    scanf("%d", &choice);
    if (choice == 1) {
        printf("\nThe hash table is:\n");
        for (i = 0; i < MAX; i++)
            printf("\n%d %d", i, a[i]);
    }
    else {
        printf("\nThe hash table is:\n");
        for (i = 0; i < MAX; i++)
            if (a[i] != -1)
                printf("\n%d %d", i, a[i]);
    }
}


5.5. Priority Queues: Single and double ended Priority Queues, Leftist Trees

5.5.1. Priority Queues : Introduction


• A priority queue is a data structure, an extension of the normal queue, that orders elements
based on their priority. The elements in a priority queue are processed in order of their priority,
with the highest-priority element processed first.
• New elements are added to the priority queue based on their priority order, and the highest-priority
element is always at the front of the queue.
• The heap is a classic data structure for representing a priority queue.

Example
• Consider inserting the elements 7, 2, 45, 32, and 12 into a priority queue.
• The element with the least value has the highest priority. Thus, you should maintain the lowest
element at the front node.

• The above illustrates how the priority is maintained during insertion into the queue. But, if you
carry out N comparisons for each insertion, the time complexity will become O(N²).


Characteristics of a Priority queue


• Every element in a priority queue has some priority associated with it.
• Every element of this queue must be comparable.
• An element with the higher priority will be deleted before the deletion of the lower priority.
• If two elements in a priority queue have the same priority, they will be arranged using the FIFO
principle.

Representation of Priority Queue

Implementation | Insert   | Remove   | Peek
Array          | O(n)     | O(1)     | O(1)
Linked List    | O(n)     | O(1)     | O(1)
Binary Heap    | O(log n) | O(log n) | O(1)
Binary Tree    | O(log n) | O(log n) | O(1)

Priority Queue using Linked List


• Consider a linked queue having 3 data elements 3, 17, 43

• For instance, to insert a node with element 45, it must be compared with each element in the
queue, so the insertion costs O(N). The linked-queue representation below shows how element
45 is inserted into the priority queue.

Types of Priority Queue


• There are two types of priority queues based on the priority of elements.
o If the element with the smallest value has the highest priority, then that priority queue is
called the min priority queue.
o If the element with a higher value has the highest priority, then that priority queue is known
as the max priority queue.
• Furthermore, you can implement the min priority queue using a min heap, whereas you can
implement the max priority queue using a max heap.

Applications of Priority Queue in Data Structure


• Used in the Dijkstra's shortest path algorithm.
• IP Routing to Find Open Shortest Path First
• Data Compression in like Huffman code, WINZIP / GZIP
• Used in implementing Prim’s algorithm
• Used to perform the heap sort
• Used in Scheduling, Load balancing and Interrupt handling

Operations supported by priority queue.


• Insertion (Enqueue): Insert an element into the priority queue along with its priority.

• Deletion (Dequeue): Remove and return the element with the highest priority from the priority
queue.
• Peek: Retrieve the element with the highest priority without removing it from the priority queue.
• Size: Get the number of elements currently stored in the priority queue.
• Clear: Remove all elements from the priority queue, making it empty.
• Update Priority: Change the priority of an existing element in the priority queue.

Representation of Priority Queue using Heap


• A heap is a tree-based data structure that forms a complete binary tree. In a max heap, every
parent node's value is greater than or equal to its children's values; in a min heap, every
parent's value is less than or equal to its children's.

Insert
• The new element is placed in the next empty position (levels are filled top to bottom, left to
right) and the heap is then re-heapified.

Delete
• The maximum/minimum element is the root node, which is deleted depending on whether it is
a max or min heap (refer to the diagram above right).
• Example 2: 32, 15, 20, 30, 12, 25, 16

5.5.2. Single ended priority queue


• A single-ended priority queue is a type of priority queue where elements are inserted with
associated priorities, and removal (dequeuing) is performed only from one end, typically from the
front. In this type of priority queue, the elements with the highest priority are dequeued first.
• There are 2 types:
o Ascending Priority Queue (Min Priority Queue)
▪ Return an element with minimum priority
▪ Insert an element at arbitrary priority
▪ Delete an element with minimum priority
o Descending Priority Queue (Max Priority Queue)
▪ Return an element with Maximum priority
▪ Insert an element at arbitrary priority
▪ Delete an element with maximum priority

• Common Operations
o Return an element with minimum/Maximum priority
o Insert an element at arbitrary priority
o Delete an element with maximum / minimum priority

• Example: Consider elements to insert: (5, 10), (2, 20), (8, 30), (1,40), (7, 50)
Insert (5, 10)
o Start with an empty priority queue.
o The first element (5, 10) is inserted as the root node since the priority queue is initially
empty.

Insert (2, 20)


o The second element (2, 20) is inserted at the next free position and, since its priority (2) is
higher than the root's (5), heapify-up swaps it with the root.

o Similarly, Insert (8, 30) & Insert (5, 10)


Insert (1, 40)
o The fourth element (1, 40) is inserted as the left child of the left child of the root since it
has the highest priority among all elements.

Insert (7, 50)


o The fifth element (7, 50) is inserted as the right child of the left child of the root since it
has a higher priority than the root (5) but lower priority than (2) and (1).

• Delete min
o The minimum element (1, 40) is deleted from the priority queue. The root node is replaced
with the last node of the heap, and the last node is then deleted.
o After deleting the minimum element, check the child nodes of (7, 50) and move the node
with the smaller priority up to the root.

o After adjusting, check the heap property again at node (7, 50). Node (7, 50) has 2 child
nodes; apply the min-heap property and move the node with minimum priority up.

5.5.3. Double ended priority queue


• Common Operations
o Return an element with minimum priority
o Return an element with maximum priority
o Insert an element at arbitrary priority
o Delete an element with minimum priority
o Delete an element with maximum priority

Insertion
• Insertion is the same as in a single-ended priority queue

Inserting element (5, 10):


• Initially, the priority queue is empty.
• The first element (5,10) is inserted into the priority queue.
• Since this is the first element, it becomes the root of the tree.

Inserting element (3, 20):


• The second element (3,20) is inserted into the priority queue.
• Since the priority of (3,20) is smaller than the priority of the root (5,10), it becomes the left child
of the root.

Inserting element (8, 30):


• The third element (8,30) is inserted into the priority queue.
• Since the priority of (8,30) is greater than the priority of the root (5,10), it becomes the right child
of the root.
• (8,30) is inserted as the right child of (5,10) since its priority is higher than the root.

Inserting element (4, 40):


• The fourth element (4,40) is inserted into the priority queue.
• Since the priority of (4,40) is greater than the priority of its parent (3,20) but smaller than the
priority of the root (5,10), it becomes the right child of (3,20).
• (4,40) is inserted as the right child of (3,20) since its priority is higher than its parent but lower
than the root.

Inserting element (2, 50):


• The fifth element (2,50) is inserted into the priority queue.
• Since the priority of (2,50) is smaller than the priority of (3,20), it is inserted as the left child
of (3,20).

Inserting element (6, 60):


• The sixth element (6, 60) is inserted into the priority queue.

• Since the priority of (6, 60) is greater than the priority of the root (5, 10), it becomes the right
child of the root.
• After insertion, the priority queue contains six elements: (5, 10), (3, 20), (8, 30), (4, 40),
(2, 50), and (6, 60).

Returning the min/max priority element is the same as in a single-ended priority queue


Deleting the min/max priority element is the same as in a single-ended priority queue

Returning the maximum priority element using a min heap


• Start at the root node of the min-heap.
• Compare the priority of the root node with the priorities of its children.
• If any child has a higher priority than the root node, recursively explore that subtree.
• Continue this process until reaching a leaf node, which represents the end of the traversal.
• The element with the highest priority encountered during the traversal will be the maximum
priority element in the min-heap.

Delete an element with maximum priority


• Convert the min-heap into a max-heap.
• The maximum priority element will now be at the root of the max-heap. Delete this element.
• Restore the max-heap property by adjusting the heap structure as necessary.

Conversion of min heap to max heap


• Start at the last non-leaf node of the heap (i.e., the parent of the last leaf node). For a 0-indexed
binary heap, this node is located at index floor(n/2) – 1, where n is the number of nodes in the heap.
• For each non-leaf node, perform a “heapify” operation to fix the heap property.
o In a min heap, this operation involves checking whether the value of the node is greater
than that of its children, and if so, swapping the node with the smaller of its children.

o In a max heap, the operation involves checking whether the value of the node is less than
that of its children, and if so, swapping the node with the larger of its children.
• Repeat step 2 for each of the non-leaf nodes, working your way up the heap. When you reach the
root of the heap, the entire heap should now be a max heap.
• Consider the tree in Fig. 1 and convert it to a max heap

• Example 2: 3, 5, 9, 6, 8, 20, 10, 12, 18, 9

• Example 3: 3, 4, 8, 11, 13

5.5.4. Leftist Trees


Limitations of Binary Heap
• A binary heap is a complete binary tree in which every node satisfies the heap property with
respect to its descendants:
Max heap: node(x) ≥ descendants(x)    Min heap: node(x) ≤ descendants(x)

• Consider joining the above 2 trees into a single binary heap


• To combine the two heaps we must create an array, copy all m + n elements, and build a new heap:
T(n) = O(m + n) + O(m + n) · O(log(m + n))
Letting t = m + n: T(n) = O(t) + O(t) · O(log t) = O(t log t) = O(n log n)
where O(m + n) is the cost of copying the nodes from both trees, and
O(m + n) · O(log(m + n)) is the cost of inserting the m + n elements at O(log(m + n)) each.

• This O(n log n) cost of combining binary heaps is high compared to the O(log n) meld of leftist trees.

Leftist Tree / Leftist Heap


• Definition: A leftist tree is a binary tree such that, if it is not empty, then for every internal node x:
shortest(left_child(x)) ≥ shortest(right_child(x))
• Let x be a node in an extended binary tree. Let left_child(x) and right_child(x), respectively,
denote the left and right children of the internal node x.
• Define shortest(x) to be the length of a shortest path from x to an external node. It is easy to see
that shortest(x) satisfies the following recurrence:

shortest(x) = 0, if x is an external node
shortest(x) = 1 + min{shortest(left_child(x)), shortest(right_child(x))}, otherwise

The number outside each internal node x in the figure above is the value of shortest(x)

• Example 2

Properties of Leftist trees


• The length of the rightmost path from any node x to an external node is shortest(x)
• The number of internal nodes in the subtree with root x is at least 2^shortest(x) − 1
• If the subtree with root x has n nodes, then s(x) is at most log₂(n + 1).
o This follows from the 2nd property: n ≥ 2^s(x) − 1, so n + 1 ≥ 2^s(x);
taking log on both sides gives
s(x) ≤ log₂(n + 1)
Lemma 1: Let x be the root of a leftist tree that has n (internal) nodes.
a. n ≥ 2^shortest(x) − 1
b. The rightmost root-to-external-node path is the shortest root-to-external-node path. Its length is
shortest(x)

Proof: (a) From the definition of shortest(x) it follows that there are no external nodes on the first
shortest(x) levels of the leftist tree. Hence, the leftist tree has at least
Σ_{i=1}^{shortest(x)} 2^(i−1) = 2^shortest(x) − 1 internal nodes.

(b) This follows directly from the definition of a leftist tree.

• Leftist trees are represented with nodes that have the fields left_child, right_child, shortest, and
data.

typedef struct {
    int key;
    /*---------------*/
} element;

struct leftist {
    struct leftist *left_child;
    element data;
    struct leftist *right_child;
    int shortest;
};

Definition:
• A min-leftist tree (max-leftist tree) is a leftist tree in which the key value in each node is
smaller (larger) than the key values in its children (if any). In other words, a min (max) leftist
tree is a leftist tree that is also a min (max) tree.
• The figure below depicts two min-leftist trees. The number inside a node x is the key of the
element in x, and the number outside x is shortest(x). The operations insert, delete-min
(delete-max), and combine can be performed in logarithmic time using a min (max) leftist tree.
• Examples of Leftist trees computing s(x)

Criteria to be followed for Leftist Heap


• Must be a binary tree
• For all x, shortest(left(x)) ≥ shortest(right(x))
• Must follow the heap property

Operations of Leftist tree


• Merge
• Insertion
• Deletion
• Initialization of Heap:

Merge Operation (Melding)


• Consider an example 1:

• Merge

• Example 2:
o Step 1: Consider 2 Leftist trees

o Step 2: To merge the above 2 leftist trees, find the minimum root of the two trees. The
minimum root is 1; pass the right subtree of root 1 along with the first tree (a recursive
call). This process repeats until one leftist tree has no nodes, i.e., reaches NULL.

o Step 3: Again find the minimum root of the two leftist trees. The minimum root is 3; pass
the right subtree of root 3 along with the other tree.

o Step 4: Again find the minimum root of the two leftist trees. The minimum root is 7; pass
the right subtree of root 7 along with the other tree. Here, the right subtree is NULL.

o Step 5: Since the right leftist tree is NULL (the base condition is reached), return the left tree
as the result to the previous step (Step 4). Attach the result obtained and apply the merge step.
o To merge, compare the shortest(x) values of both trees:
For all x, shortest(left(x)) ≥ shortest(right(x))
o Since shortest(left(x)) ≥ shortest(right(x)), add the subtree as the right child of the minimum
root. The result after adding is shown in C.
o Pass the result back to Step 2 (return of the recursive call)

o Step 6: Consider the smallest root in Step 2 and merge the result obtained in Step 5 with it.
o Merge the left subtree of Step 2 (the remaining part, i.e., the left part, since the right
subtree of Step 2 has already been processed).
o To merge, compare the shortest(x) values of both trees:
For all x, shortest(left(x)) ≥ shortest(right(x))
o Compare the shortest value of the root's left subtree (i.e., 4) with that of the root of the
second leftist tree. Since the criterion is satisfied, add the second tree (root 7) as the right
child of root 3.

o Transfer the resultant leftist tree to Step 1, ignoring the already-transferred subtree in Step 1.

o Step 7: Consider the smallest root in Step 1 and merge the result obtained in Step 6 with it.
o Merge the left subtree of Step 1 (the remaining part, i.e., the left part, since the right
subtree of Step 1 has already been processed).
o To merge, compare the shortest(x) values of both trees:
For all x, shortest(left(x)) ≥ shortest(right(x))
o Compare the shortest value of the root's left subtree (i.e., 1) with that of the root of the
second leftist tree. Since the criterion is not satisfied, swap the left subtree of root 1 and
the second leftist tree to obtain the final tree.

• C Function to merge two leftist trees


Node *Merge(Node *root1, Node *root2) {
    if (root1 == NULL)                  /* base condition */
        return root2;
    if (root2 == NULL)
        return root1;

    Node *FinalRoot, *RetHeap;

    if (root1->data < root2->data) {    /* heap-order property */
        RetHeap = Merge(root1->right, root2);
        FinalRoot = root1;
    }
    else {
        RetHeap = Merge(root1, root2->right);
        FinalRoot = root2;
    }

    /* leftist property: shortest(left) >= shortest(right) */
    if (FinalRoot->left != NULL && FinalRoot->left->Svalue >= RetHeap->Svalue) {
        FinalRoot->right = RetHeap;
    }
    else {
        Node *temp = FinalRoot->left;
        FinalRoot->left = RetHeap;
        FinalRoot->right = temp;
    }
    /* update the s-value: one more than the right child's s-value */
    FinalRoot->Svalue = (FinalRoot->right != NULL ? FinalRoot->right->Svalue : 0) + 1;
    return FinalRoot;
}
• Time Complexity: T(n) = O(log n)

Insert Operation
• Consider the leftist tree

• Step 1: Insert node 6, Apply Merge operations

• Step 2: Find the minimum root and pass its right subtree along with the 2nd tree for further merging.

• Step 3: The smallest root in both leftist trees is 6; consider the right subtree of 6 along with the
remaining leftist tree.

• Step 4: Since the right leftist tree (root2) is empty, return the left subtree as the result to the
previous step. The remaining tree and the result of Step 3 are:

• Find the S value to merge these two leftist trees. The S value of the left child of the smaller root
(root 6) is not greater than that of the root of the right leftist tree, so swap; the result is:

• Step 5: Pass the resultant tree to Step 1 to merge. Find the shortest value for both leftist trees.
The smaller root is 5; check the shortest value of the left child of root 5 against the result
obtained from Step 4, i.e., the shortest value of root 6. Since S(8) ≥ S(6), add root 6 as the
right child of 5.

• Time Complexity:
o Creating the new node is O(1) and the merge is O(log n), so insertion costs O(log n)

Initialize Heap
6 2 9 8 3 4 11 18 7 24 1 5

• Create an empty leftist heap.
• Read the elements one by one and merge each into the heap.
• Complexity:
o Each insertion costs O(log n); for N insertions the total is O(n log n)

Delete Operation
• Delete the root element. This leaves two leftist trees; apply merge to these two trees.
• Complexity:
o Deleting the root is O(1) and the merge is O(log n), so deletion costs O(log n)
• Consider an example

• Delete the root node; we get 2 leftist trees, then apply merge

5.6. Optimal Binary Search Tree


• An Optimal Binary Search Tree (OBST), also known as a Weighted Binary Search Tree, is a binary
search tree that minimizes the expected search cost. In a binary search tree, the search cost is the
number of comparisons required to search for a given key.
• In an OBST, each node is assigned a weight that represents the probability of the key being
searched for.
• To construct an OBST, we start with a sorted list of keys and their probabilities. We then build a
table that contains the expected search cost for all possible sub-trees of the original list. We can
use dynamic programming to fill in this table efficiently. Finally, we use this table to construct the
OBST.

Example
• 10, 20, 30 are the keys, and the following are the binary search trees that can be formed from
these keys.

• The formula for calculating the number of binary search trees on n keys is the Catalan number:

C(n) = (2n)! / ((n + 1)! · n!)

When n = 3, the number of binary search trees is 5.


• The cost of searching for an element depends on the number of comparisons made. We now
calculate the average search cost of each of the binary search trees above.
• Consider the 1st tree
To search key value 10, we require 1 comparison
To search key value 20, we require 2 comparisons
To search key value 30, we require 3 comparisons
Average no. of comparisons = (1 + 2 + 3) / 3 = 2

• Consider the 2nd tree

To search key value 10, we require 1 comparison
To search key value 30, we require 2 comparisons
To search key value 20, we require 3 comparisons
Average no. of comparisons = (1 + 2 + 3) / 3 = 2

• Consider the 3rd tree

To search key value 20, we require 1 comparison
To search key value 10, we require 2 comparisons
To search key value 30, we require 2 comparisons
Average no. of comparisons = (1 + 2 + 2) / 3 ≈ 1.66

• Consider the 4th tree

To search key value 30, we require 1 comparison
To search key value 10, we require 2 comparisons
To search key value 20, we require 3 comparisons
Average no. of comparisons = (1 + 2 + 3) / 3 = 2

• Consider the 5th tree

To search key value 30, we require 1 comparison
To search key value 20, we require 2 comparisons
To search key value 10, we require 3 comparisons
Average no. of comparisons = (1 + 2 + 3) / 3 = 2

• Among the five possible trees, tree (iii) has the lowest average cost. Hence, tree (iii) is the
optimal binary search tree.

Example: refer to the Module 5 Optimal Binary Search Tree document uploaded separately
