
Lecture Notes

Design and Analysis of Algorithms


ECS-355
  

Submitted in partial fulfillment for the award of the degree of B. Tech in


Computer Science & Engineering

HARCOURT BUTLER TECHNICAL UNIVERSITY


SESSION 2022-23

Submitted by: Harsh Prajapati, 3rd Year B. Tech CSE (200104031)
Submitted to: Dr. Imran Khan, Associate Professor, Dept. of CSE

Course Outcomes:

1. Understand and apply mathematical preliminaries to the analysis and design stages of different types of algorithms. (Understand, Apply)
2. Analyze the worst-case time complexity of various algorithms using asymptotic methods. (Analyze)
3. Understand and apply the divide-and-conquer paradigm and synthesize divide-and-conquer algorithms for problems such as sorting, searching, and finding an MST. (Understand, Apply)
4. Describe the greedy paradigm, explain when an algorithmic design situation calls for it, and develop greedy algorithms for a given problem. (Apply, Analyze)
5. Apply the dynamic-programming paradigm to model engineering problems using graphs and write the corresponding algorithms to solve them. (Apply)
6. Explain the ways to analyze randomized and approximation algorithms. (Apply, Analyze)

UNIT 1
Table of contents-

Ø Algorithm
Ø Analysis of algorithms
Ø Algorithm complexity
Ø Asymptotic notations
v Omega notation
v Theta notation
v Big-O notation
Ø Recurrence relations
v Substitution method
v Master's theorem method
Ø Searching algorithms
v Linear search
v Binary search
Ø Sorting algorithms
v Comparison of various algorithms
v Bubble sort
v Insertion sort
v Selection sort
v Merge sort
v Quick sort
v Bucket sort
v Radix sort
v Heap sort
Ø Order statistics
v Randomized algorithms

Algorithm:-
An algorithm is "a set of rules to be followed in calculations or other problem-solving operations", or "a procedure for solving a mathematical problem in a finite number of steps that frequently involves recursive operations". It can be understood by taking the example of cooking a new recipe: to cook it, one reads the instructions and executes the steps one by one, in the given sequence. In programming, algorithms help to carry out a task and get the expected output.

Characteristics of a good Algorithm-

· Well-defined inputs
· Well-defined outputs
· Clear and unambiguous
· Language independent
· Finite
· Feasible
 
Advantages of Algorithms-
· Easy to understand: a stepwise representation of the solution to a given problem is easy to follow.
· Language independent: an algorithm does not depend on any programming language, so anyone can understand it.
· Debugging / error finding: every step is independent and part of a clear flow, so errors are easy to spot and fix.
· Sub-problems: because the solution is written as a flow of steps, the programmer can divide the task into sub-problems that are easier to code.
 
Analysis of Algorithm-
There can be more than one way to implement an algorithm (one problem, many solutions). For example, the sum of 3 numbers can be computed in several ways: with the + operator, with bit-wise operators, etc. For a standard algorithm to be good it must be efficient, so the efficiency of an algorithm must be checked and maintained. This can be done at two stages:
1. Priori analysis
2. Posterior analysis
 
S.no. | Priori Analysis | Posterior Analysis
1 | "Priori" means "before": priori analysis checks the algorithm before its implementation. | "Posterior" means "after": posterior analysis checks the algorithm after its implementation.
2 | The algorithm is checked while it is written in the form of theoretical steps. | The algorithm is checked after it is converted into a program.
3 | Efficiency is measured by assuming that all other factors (for example, processor speed) are constant and have no effect on the implementation; the analysis is independent of the hardware and of the compiler's language. | The algorithm is checked by implementing it in a programming language and executing it, so the result depends on the compiler's language and the hardware used.
4 | It gives approximate answers for the complexity of the program. | It gives the actual and real analysis report about correctness, space required, time consumed, etc.

Algorithm complexity-
An algorithm is judged complex based on the amount of space and time it consumes. The complexity of an algorithm therefore refers to the measure of the time it needs to execute and produce the expected output, and the space it needs to store all the data (input, temporary data, and output). These two factors define the efficiency of an algorithm:
· Time factor: time is measured by counting the number of key operations, such as comparisons in a sorting algorithm.
· Space factor: space is measured by counting the maximum memory space required by the algorithm.
Accordingly, the complexity of an algorithm is divided into two types:
1. Space complexity: the amount of memory the algorithm uses to store variables and produce the result. This covers inputs, temporary operations, and outputs.
2. Time complexity: the amount of time the algorithm requires to execute and produce the result. This covers ordinary operations, conditional if-else statements, loop statements, etc.

How to calculate Time Complexity?

The time complexity of an algorithm is calculated from the following 2 components:
· Constant time part: any instruction that is executed just once, for example input, output, if-else, switch, etc.
· Variable time part: any instruction that is executed more than once, say n times, for example loops, recursion, etc.
Therefore the time complexity of an algorithm P is T(P) = C + TP(I), where C is the constant time part and TP(I) is the variable part, which depends on the instance characteristic I.
 
Example: -
for (i = 0; i < n; i++)     /* n+1 steps (the condition is tested n+1 times) */
{
    printf("Good");         /* n steps */
}
Total steps = 2n + 1        /* a linear polynomial of degree 1 */
 

Asymptotic notations-
In computing, asymptotic analysis of an algorithm refers to defining the mathematical bound of its run-time performance in terms of the input size. For example, the running time of one operation may be computed as f(n) = n while that of another is g(n) = n². The first running time grows linearly with n, while the second grows quadratically as n increases. Usually, an algorithm is analyzed for three cases:
1. Best case (Big-Omega notation (Ω))
2. Average case (Big-Theta notation (Θ))
3. Worst case (Big-O notation (O))

Omega (Ω) Notation:

Omega (Ω) notation specifies an asymptotic lower bound for a function f(n). For a given function g(n), Ω(g(n)) is defined as:
Ω(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n0}.
Example 1: - Consider f(n) = 2*n + 2, g(n) = n and c = 2. Then
0 ≤ c*g(n) ≤ f(n)
Hence g(n) is a lower bound of f(n), i.e. f(n) = Ω(n).
 
Theta (Θ) Notation:
Big-Theta (Θ) notation specifies an asymptotically tight bound for a function f(n). For a given function g(n), Θ(g(n)) is defined as:
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0}.
Example: - Consider f(n) = 2*n + 2, g(n) = n, c1 = 2 and c2 = 4. Then
c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ 1
Hence the average-case complexity is Big-Theta(n).
 
Big-O Notation:
Big-O (O) notation specifies an asymptotic upper bound for a function f(n). For a given function g(n), O(g(n)) is defined as:
O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0}.
It is the most widely used notation because it is the easiest to calculate: there is no need to check every type of input as with theta notation. Since the worst-case input is taken into account, it gives an upper bound on the time the program will take to execute.
Example 1: - Consider f(n) = 2*n + 4, g(n) = n and c = 6. Then
c*g(n) >= f(n) for all n ≥ 1
Hence g(n) is the upper bound of f(n) and the complexity is O(n).

Example 2: - Consider the following code:

for (i = 0; i < n; i++)         /* n+1 steps */
{
    for (j = 0; j < n; j++)     /* (n+1)*n steps */
        printf("Good");         /* n*n steps */
}

Total steps = 2n² + 2n + 1
Let c = 6 and g(n) = n²
c*g(n) >= f(n)
Hence g(n) is the upper bound of f(n) and the complexity is O(n²).

 
Example 3: - Consider the following code:
for (i = 0; i < n; i += 2)      /* about n/2 steps */
{
    printf("Good");             /* about n/2 steps */
}
Total steps ≈ n
Let c = 1 and g(n) = n
c*g(n) >= f(n)
Hence g(n) is the upper bound of f(n) and the complexity is O(n).
 
Example 4: - Consider the following code:
for (i = 1; i < n; i *= 2)      /* log₂n steps (note i must start at 1, not 0, or the loop never advances) */
{
    printf("Good");             /* log₂n - 1 steps */
}
Total steps = 2*log₂n - 1
Let c = 2 and g(n) = log₂n
c*g(n) >= f(n)
Hence g(n) is the upper bound of f(n) and the complexity is O(log₂n).

 
Recurrence relations-
A recurrence relation is a way of determining the running time of a recursive algorithm or program. It is an equation or inequality that describes a function in terms of its value on smaller inputs.
For example, let T(n) be the running time of a given problem of size n, where the problem is finding the nth Fibonacci number. With F(n) = F(n-1) + F(n-2) computed by the naive recursive procedure, the running time satisfies the recurrence T(n) = T(n-1) + T(n-2) + c with T(0) = a, where a and c are constants.
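To make this concrete, here is a minimal C sketch (an illustration, not from the original notes) of the naive recursive computation; each call performs constant work plus the recursive calls, which is exactly the recurrence above.

#include <stdio.h>

/* Naive recursive Fibonacci. Its running time satisfies
   T(n) = T(n-1) + T(n-2) + c, with constant base cases. */
long fib(int n)
{
    if (n <= 1)
        return n;                       /* base cases: constant work */
    return fib(n - 1) + fib(n - 2);     /* two smaller subproblems + constant work */
}

int main(void)
{
    printf("%ld\n", fib(10));           /* prints 55 */
    return 0;
}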

There are mainly four methods of solving recurrence relation:


Ø Substitution method
Ø Iteration method
Ø Master method
Ø Recursion tree method
The two most used methods are substitution method and master theorem method.

Substitution method-
The whole working of the substitution method can be divided into two processes:
v Take a guess at the solution
v Find boundary conditions using the principles of mathematical induction and prove that the
guess is correct
Example 1: Recurrence relation: T(n) = T(n-1) + 1 for n > 1
Step 1: Base condition: T(0) = 1
Step 2: Expanding the recurrence repeatedly:
T(n) = [T(n-2) + 1] + 1
     = [[T(n-3) + 1] + 1] + 1
and so on, giving T(n) = T(0) + n*1.
Step 3: Substituting the base condition:
T(n) = T(0) + n = n + 1
Therefore T(n) = O(n).

Example 2: Recurrence relation: T(n) = T(n-1) + c for n > 1
Step 1: Base condition: T(0) = 1
Step 2: Expanding the recurrence repeatedly:
T(n) = [T(n-2) + c] + c
     = [[T(n-3) + c] + c] + c
and so on, giving T(n) = T(0) + n*c.
Step 3: Substituting the base condition:
T(n) = T(0) + c*n = c*n + 1. Since c is a constant, T(n) = O(n).

Example 3: Recurrence relation: T(n) = 3*T(n-1) + n² for n > 1

Step 1: Base condition: T(0) = 1
Step 2: Expanding the recurrence repeatedly (bounding each term (n-i)² by n²):
T(n) ≤ 3*[3*T(n-2) + n²] + n²
     ≤ 3*[3*[3*T(n-3) + n²] + n²] + n²
and so on, giving T(n) ≤ 3^n*T(0) + (3^(n-1) + ... + 3 + 1)*n².
Step 3: The geometric sum is less than 3^n, so
T(n) = O(3^n*n²).

Example 4: Recurrence relation: T(n) = 5*T(n-1) + n⁷ for n > 1
Step 1: Base condition: T(0) = 1
Step 2: Expanding the recurrence repeatedly (bounding each term by n⁷):
T(n) ≤ 5*[5*T(n-2) + n⁷] + n⁷
     ≤ 5*[5*[5*T(n-3) + n⁷] + n⁷] + n⁷
and so on, giving T(n) ≤ 5^n*T(0) + (5^(n-1) + ... + 5 + 1)*n⁷.
Step 3: The geometric sum is less than 5^n, so
T(n) = O(5^n*n⁷).
 
Example 5: Recurrence relation: T(n) = T(n/2) + 1 for n > 1
Step 1: Base condition: T(1) = 1
Step 2: Expanding the recurrence repeatedly:
T(n) = [T(n/4) + 1] + 1
     = [[T(n/8) + 1] + 1] + 1
and so on, giving T(n) = T(n/2^m) + m for m ≥ 0.
Step 3: Using the base condition, n/2^m = 1, i.e.
m = log₂n
T(n) = 1 + log₂n
Therefore T(n) = O(log₂n).
 
Example 6: Recurrence relation: T(n) = T(n/2) + n for n > 1
Step 1: Base condition: T(1) = 1
Step 2: Expanding the recurrence repeatedly:
T(n) = [T(n/4) + n/2] + n
     = [[T(n/8) + n/4] + n/2] + n
and so on, giving T(n) = T(n/2^m) + n*(1 + 1/2 + 1/2² + ... + 1/2^(m-1)) for m ≥ 0.
Step 3: Using the base condition, m = log₂n:
T(n) = 1 + n*(1 + 1/2 + 1/2² + ... + 1/2^(m-1))
The bracketed geometric sum is less than 2, so T(n) ≤ 1 + 2n.
Therefore T(n) = O(n).
 
Example 7: Recurrence relation: T(n) = 2*T(n/2) + n for n > 1
Step 1: Base condition: T(1) = 1
Step 2: Expanding the recurrence repeatedly:
T(n) = 2*[2*T(n/4) + n/2] + n = 4*T(n/4) + 2n
     = 2*[2*[2*T(n/8) + n/4] + n/2] + n = 8*T(n/8) + 3n
and so on, giving T(n) = 2^m*T(n/2^m) + m*n for m ≥ 0.
Step 3: Using the base condition, n/2^m = 1, i.e. m = log₂n:
T(n) = n*1 + n*log₂n
Therefore T(n) = O(n*log₂n).

Master's theorem method-

Case 1: Theorem for decreasing functions-

T(n) = a*T(n-b) + f(n), with a > 0, b > 0 and f(n) polynomial.
For such a relation the complexity is:
· If a < 1: O(f(n))
· If a = 1: O(n*f(n))
· If a > 1: O(a^(n/b)*f(n))

Example 1: Recurrence relation: T(n) = 2*T(n-3) + n for n > 1
        a = 2 > 1, so the complexity is O(2^(n/3)*n).
Example 2: Recurrence relation: T(n) = T(n-1) + n² for n > 1
        a = 1, so the complexity is O(n*n²) = O(n³).
Example 3: Recurrence relation: T(n) = 2*T(n-1) + 1 for n > 1
        a = 2 > 1, so the complexity is O(2^n).
Example 4: Recurrence relation: T(n) = 5*T(n-1) + n² for n > 1
        a = 5 > 1, so the complexity is O(5^n*n²).
 
Case 2: Theorem for dividing functions-
T(n) = a*T(n/b) + f(n), with a ≥ 1, b > 1 and
f(n) = n^k * log^p(n)

For such a relation the complexity is decided by 3 conditions-

· If log_b(a) > k —> Complexity: O(n^(log_b(a)))
· If log_b(a) = k, check the following:
v If p > -1 —> Complexity: O(n^k * log^(p+1)(n))
v If p = -1 —> Complexity: O(n^k * log(log n))
v If p < -1 —> Complexity: O(n^k)
· If log_b(a) < k, check the following:
v If p >= 0 —> Complexity: O(n^k * log^p(n))
v If p < 0 —> Complexity: O(n^k)
 
Example 1: Recurrence relation: T(n) = 2*T(n/2) + n for n > 1
        a = 2, b = 2, k = 1, p = 0
        log_b(a) = k and p > -1
        Complexity: O(n*log n)
Example 2: Recurrence relation: T(n) = 9*T(n/3) + 1 for n > 1
        a = 9, b = 3, k = 0, p = 0
        log_b(a) > k
        Complexity: O(n²)
Example 3: Recurrence relation: T(n) = 9*T(n/3) + n/log²n for n > 1
        a = 9, b = 3, k = 1, p = -2
        log_b(a) > k
        Complexity: O(n²)
Example 4: Recurrence relation: T(n) = 3*T(n/3) + n/log²n for n > 1
        a = 3, b = 3, k = 1, p = -2
        log_b(a) = k and p < -1
        Complexity: O(n)
Example 5: Recurrence relation: T(n) = 3*T(n/3) + n³/log²n for n > 1
        a = 3, b = 3, k = 3, p = -2
        log_b(a) < k and p < 0
        Complexity: O(n³)
 
 
Searching Algorithms-
These are designed to check for an element or to retrieve an element from any data structure where it is stored. Based on the search procedure, these algorithms are classified into 2 types-
· Sequential search: the list or array is traversed sequentially and every element is checked. Example: Linear Search.
· Interval search: these algorithms are specifically designed for searching in sorted data structures. They are much more efficient than linear search because they repeatedly target the center of the search structure and divide the search space in half. Example: Binary Search.
Uses of searching algorithms-
Ø To retrieve information from databases
Ø In routing
Ø Disk scheduling
Ø Allotment of memory to operations and storage

Linear Search-
Linear search is a sequential search algorithm that starts at one end and goes through each element of a list until the desired element is found; otherwise the search continues to the end of the data set. It is the simplest searching algorithm, with a complexity of O(n). A C sketch follows the steps.
Algorithm: 1- Initialize A[N]
2- For I = 1 to N
2.1- If A[I] = KEY, return I
3- Return -1
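A direct C rendering of these steps (0-indexed, unlike the 1-indexed pseudocode above):

/* Returns the index of key in a[0..n-1], or -1 if it is absent. */
int linear_search(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++)    /* examine every element in turn */
        if (a[i] == key)
            return i;              /* found: report its position */
    return -1;                     /* reached the end without a match */
}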

Binary Search-
Binary search is a searching algorithm for a sorted array that works by repeatedly dividing the search interval in half. The idea is to use the information that the array is sorted to reduce the time complexity to O(log n).
Algorithm: 1- Initialize A[N], LOWER = 1, UPPER = N
2- While LOWER <= UPPER
2.1- Set MID = (LOWER + UPPER)/2
2.2- If A[MID] = KEY, return MID
2.3- If A[MID] < KEY, set LOWER = MID + 1
2.4- If A[MID] > KEY, set UPPER = MID - 1
3- Return -1
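A matching C sketch (0-indexed) of the iterative version:

/* Iterative binary search on a sorted array a[0..n-1];
   returns the index of key, or -1 if it is not present. */
int binary_search(const int a[], int n, int key)
{
    int lower = 0, upper = n - 1;
    while (lower <= upper) {
        int mid = lower + (upper - lower) / 2;   /* avoids overflow of lower+upper */
        if (a[mid] == key)
            return mid;
        else if (a[mid] < key)
            lower = mid + 1;       /* key can only be in the right half */
        else
            upper = mid - 1;       /* key can only be in the left half */
    }
    return -1;
}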

Sorting Algorithms-
Sorting is the process of arranging the elements of an array so that they are placed in either ascending or descending order. Consider an array A[10] = {5, 4, 10, 2, 30, 45, 34, 14, 18, 9}. The array sorted in ascending order is A[] = {2, 4, 5, 9, 10, 14, 18, 30, 34, 45}.
There are many techniques by which sorting can be performed. Some of them are-
Ø Bubble Sort
Ø Merge Sort
Ø Selection Sort
Ø Insertion Sort
Ø Quick Sort
Ø Heap Sort
Ø Radix Sort
Ø Bucket Sort
 
Algorithms comparison table-

S.no. | Sorting Algorithm | Best case  | Average case | Worst case
1     | Bubble Sort       | O(n)       | O(n²)        | O(n²)
2     | Insertion Sort    | O(n)       | O(n²)        | O(n²)
3     | Merge Sort        | O(n log n) | O(n log n)   | O(n log n)
4     | Selection Sort    | O(n²)      | O(n²)        | O(n²)
5     | Quick Sort        | O(n log n) | O(n log n)   | O(n²)
6     | Heap Sort         | O(n log n) | O(n log n)   | O(n log n)
7     | Radix Sort        | O(n+k)     | O(n+k)       | O(n+k)
8     | Bucket Sort       | O(n)       | O(n+k)       | O(n²)

Bubble sort: It is the simplest sorting method, which sorts by repeatedly moving the largest remaining element toward the highest index of the array. It works by comparing each element with its adjacent element and swapping them when they are out of order.
Algorithm-
1. Initialize A[N]
2. For I = 1 to N-1
3.     For J = 1 to N-I
4.         If A[J] > A[J+1]
5.             Swap(A[J], A[J+1])

Insertion sort: As the name suggests, insertion sort inserts each element of the array into its proper place among the already-sorted elements before it. It is a very simple method, similar to the way a hand of cards is arranged while playing bridge.
Algorithm-
1. Initialize A[N]
2. For I = 2 to N
    1. KEY = A[I]
    2. J = I - 1
    3. While J >= 1 and A[J] > KEY
           A[J+1] = A[J]
           J = J - 1
    4. A[J+1] = KEY
 
Selection sort - Selection sort finds the smallest element in the array and places it in the first position of the list; it then finds the second smallest element and places it in the second position. This process continues until all elements are in their correct order. Its running time is O(n²), which is worse than insertion sort on nearly-sorted input.
Algorithm-
1. Initialize A[N]
2. For I = 1 to N-1
    1. Set SMALL = A[I]
    2. Set POS = I
    3. For J = I+1 to N
        1. If SMALL > A[J]
            1. Set SMALL = A[J]
            2. Set POS = J
    4. Swap A[I] with A[POS]
 
Merge sort - Merge sort follows the divide-and-conquer approach: the list is divided into two halves, each half is sorted recursively by merge sort, and the two sorted halves are then merged to form the final sorted array.
Algorithm for SORT function-
1. Initialize A[N]
2. If BEG < END
3.     Set MID = (BEG + END)/2
4.     Call function SORT(A, BEG, MID)
5.     Call function SORT(A, MID + 1, END)
6.     Call function MERGE(A, BEG, MID, END)
Algorithm for MERGE function-
1. Initialize N1 = MID - BEG + 1 and N2 = END - MID; create LEFT[N1], RIGHT[N2]
2. For I = 0 to N1-1
    1. LEFT[I] = A[BEG + I]
3. For J = 0 to N2-1
    1. RIGHT[J] = A[MID + 1 + J]
4. Initialize I = 0, J = 0, K = BEG
5. While I < N1 and J < N2
    1. If LEFT[I] <= RIGHT[J]
        1. A[K] = LEFT[I]
        2. I = I + 1
    2. Else
        1. A[K] = RIGHT[J]
        2. J = J + 1
    3. K = K + 1
6. While I < N1
    1. A[K] = LEFT[I]
    2. I = I + 1
    3. K = K + 1
7. While J < N2
    1. A[K] = RIGHT[J]
    2. J = J + 1
    3. K = K + 1
 
Quick sort - Quick sort is one of the most efficient sorting algorithms, performing sorting in O(n log n) comparisons on average. Like merge sort, quick sort works by the divide-and-conquer approach.
Sorting function algorithm-
1. Initialize A[N]
2. If START < END
    1. Call function PARTITION(A, START, END) and store its return value in P
    2. Call function QUICKSORT(A, START, P - 1)
    3. Call function QUICKSORT(A, P + 1, END)
Partition algorithm:
The partition algorithm rearranges the sub-array in place.
1. Initialize PIVOT = A[END], I = START - 1
2. For J = START to END - 1
    1. If A[J] < PIVOT
        1.1. I = I + 1
        1.2. Swap A[I] with A[J]
3. Swap A[I+1] with A[END]
4. Return I + 1

Heap sort - The idea of heap sort is to remove elements one by one from the heap part of the list and insert them into the sorted part of the list. Heapsort is an in-place sorting algorithm. The first step is the creation of a heap by adjusting the elements of the array. After the heap is created, the root element is repeatedly removed by shifting it to the end of the array, and the heap structure is restored for the remaining elements.

Working of Heap Sort-

· Since the tree satisfies the max-heap property, the largest item is stored at the root node.
· Swap: remove the root element and put it at the end of the array (the nth position); put the last item of the tree (heap) in the vacant place.
· Remove: reduce the size of the heap by 1.
· Heapify: heapify the root element again (restore the max-heap property in the complete binary tree) so that the largest remaining element is at the root.
· The process is repeated until all the items of the list are sorted.

Algorithm for HEAPIFY function-
1. Initialize LARGEST = I, LEFT = 2*I + 1, RIGHT = 2*I + 2
2. If LEFT < N and ARR[LEFT] > ARR[LARGEST]
    1. LARGEST = LEFT
3. If RIGHT < N and ARR[RIGHT] > ARR[LARGEST]
    1. LARGEST = RIGHT
4. If LARGEST != I
    1. Swap ARR[I] with ARR[LARGEST]
    2. HEAPIFY(ARR, N, LARGEST)
Algorithm for building the max-heap-
1. For I = N/2 - 1 down to 0
    1. Call function HEAPIFY(ARR, N, I)
Algorithm for heap sort function-
1. Initialize A[N] and build the max-heap as above
2. For I = N - 1 down to 1
    1. Swap ARR[0] with ARR[I]
    2. Call function HEAPIFY(ARR, I, 0)

Radix sort: Radix sort is a linear-time sorting algorithm for integers. It sorts digit by digit, starting from the least significant digit and moving to the most significant digit.
Algorithm for radix sort function-
1. Initialize A[N]
2. Initialize MAX by calling function GETMAX
3. For PLACE = 1; while MAX/PLACE > 0; PLACE = PLACE*10
    1. Call function COUNTINGSORT(A, N, PLACE)
Algorithm for GETMAX function-
1. Initialize MAX = A[0]
2. For I = 1 to N-1
    1. If A[I] > MAX
        1. MAX = A[I]
3. Return MAX
Algorithm for COUNTINGSORT function-
1. Initialize OUTPUT[N] and COUNT[10] = {0}
2. For I = 0 to N-1
    1. COUNT[(A[I] / PLACE) % 10]++
3. For I = 1 to 9
    1. COUNT[I] = COUNT[I] + COUNT[I-1]
4. For I = N - 1 down to 0
    1. OUTPUT[COUNT[(A[I] / PLACE) % 10] - 1] = A[I]
    2. COUNT[(A[I] / PLACE) % 10]--
5. For I = 0 to N-1
    1. A[I] = OUTPUT[I]
Bucket sort: It is a sorting algorithm that separates the elements into multiple groups called buckets. The elements are first distributed uniformly into the buckets, each bucket is then sorted by any other sorting algorithm, and finally the buckets are gathered in sorted order.
The basic procedure of bucket sort is as follows -
· First, partition the range into a fixed number of buckets.
· Then, toss every element into its appropriate bucket.
· After that, sort each bucket individually by applying a sorting algorithm.
· And at last, concatenate all the sorted buckets.
Algorithm (assuming the input values lie in [0, 1))-
1. Initialize A[N], INDEX = 0
2. Create N empty buckets B[0..N-1] (e.g., vectors)
3. For I = 0 to N-1
    1. BI = floor(N * A[I])
    2. B[BI].PUSH_BACK(A[I])
4. For I = 0 to N-1
    1. SORT(B[I].BEGIN(), B[I].END())
5. For I = 0 to N-1
    1. For J = 0 to B[I].SIZE()-1
        1. A[INDEX++] = B[I][J]

Order Statistics-
Suppose we have a set of values X1=4, X2=2, X3=7, X4=11, X5=5. The kth order statistic of this data is the kth smallest value in the set {4, 2, 7, 11, 5}. So the 1st order statistic is 2 (the smallest value), the 2nd order statistic is 4 (the next smallest), and so on; the 5th order statistic is the fifth smallest value (here, the largest), which is 11.
Ø If a sorted list is given, the complexity of finding the kth smallest element is O(1).
Ø If an unsorted list is given, the complexity of finding the kth smallest element depends on the sorting algorithm used.
Ø We can instead make use of the concept of a "randomized algorithm".

Randomized Algorithm-
An algorithm that uses random numbers to decide what to do next anywhere in its logic is called a randomized algorithm. For example, in randomized quick sort we use a random number to pick the next pivot (or we randomly shuffle the array). Typically, this randomness is used to reduce the time or space complexity of standard algorithms.
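Combining this idea with the order-statistics problem above, here is a C sketch (an illustration, not from the original notes) of randomized quickselect, which finds the kth smallest element in O(n) expected time by partitioning around a randomly chosen pivot:

#include <stdlib.h>

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Lomuto partition around a pivot chosen uniformly at random
   (seeding with srand() is omitted for brevity). */
static int random_partition(int a[], int lo, int hi)
{
    int r = lo + rand() % (hi - lo + 1);   /* random pivot index */
    swap_int(&a[r], &a[hi]);
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot)
            swap_int(&a[++i], &a[j]);
    swap_int(&a[i + 1], &a[hi]);
    return i + 1;
}

/* Returns the kth smallest element (1-based k) of a[lo..hi].
   Expected time O(n); worst case O(n^2). */
int quickselect(int a[], int lo, int hi, int k)
{
    int p = random_partition(a, lo, hi);
    int rank = p - lo + 1;                 /* pivot's rank within a[lo..hi] */
    if (rank == k)
        return a[p];
    if (k < rank)
        return quickselect(a, lo, p - 1, k);
    return quickselect(a, p + 1, hi, k - rank);
}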
 

UNIT-2
Table of contents-

Ø Binary Search Trees
v Operations
v Advantages & Disadvantages
Ø AVL Trees
v Operations
v Complexity Analysis
Ø Red-Black Tree
v Properties
v Advantages
v Operations
v Applications
v R-B Tree vs AVL Tree
Ø B-Tree
v Operations
v Complexity Analysis
v Applications
Ø Binomial Heap
v Properties
v Binary representation of a number and binomial heaps
v Operations
Ø Fibonacci Heap
v Properties
v Operations
Ø Amortized Analysis

Binary Search Tree
A binary Search Tree is a node-based binary tree data structure which has the following properties:  
 The left subtree of a node contains only nodes with keys lesser than the node’s key.
 The right subtree of a node contains only nodes with keys greater than the node’s key.
 The left and right subtree each must also be a binary search tree. 
left subtree (keys) < node (key) ≤ right subtree (keys)

Representation: (example BST figure, not reproduced here: one subtree rooted at 8 with children 3 and 9, the other rooted at 12 with children 11 and 15)

Operations on BST
1. Searching
2. Insertion
3. Deletion
4. Traversal

1. Searching-
Searching in a BST involves comparison of the key values. If the key is equal to the root's key the search is successful; if it is less than the root's key, search the key in the left subtree; and if it is greater than the root's key, search the key in the right subtree. A C sketch follows the steps.
Searching in BST algorithm: -
1. Check if the tree is NULL; if the tree is not NULL, follow the steps below.
2. Compare the key to be searched with the root of the BST.
3. If the key is less than the root, search in the left subtree.
4. If the key is greater than the root, search in the right subtree.
5. If the key is equal to the root, return and report the search successful.
6. Repeat steps 3, 4, or 5 for the obtained subtree.

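A C sketch of the node type and the recursive search (field names are illustrative):

#include <stdlib.h>

struct node {
    int key;
    struct node *left, *right;
};

/* Returns the node holding key, or NULL if the key is absent. */
struct node *bst_search(struct node *root, int key)
{
    if (root == NULL || root->key == key)
        return root;                          /* empty subtree or match */
    if (key < root->key)
        return bst_search(root->left, key);   /* search the left subtree */
    return bst_search(root->right, key);      /* search the right subtree */
}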
2. Insertion in a BST:
Insertion in a BST also involves comparison of the key values. If the key is less than or equal to the root's key, go to the left subtree; if it is greater than the root's key, go to the right subtree. Follow the search procedure until an empty position is found and insert the data there. A C sketch follows.
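A matching C sketch of recursive insertion, reusing the node type above (keys less than or equal to the node's key go left, larger keys go right):

/* Inserts key and returns the (possibly new) subtree root. */
struct node *bst_insert(struct node *root, int key)
{
    if (root == NULL) {                        /* found an empty spot */
        struct node *n = malloc(sizeof *n);
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key <= root->key)
        root->left = bst_insert(root->left, key);
    else
        root->right = bst_insert(root->right, key);
    return root;
}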

3. Deletion in a BST:
Deletion in a BST involves three cases. First, search for the key to be deleted using the search algorithm and locate the node; then find the number of children of the node to be deleted.
Case 1 - The node to be deleted is a leaf node: simply delete it.
Case 2 - The node to be deleted has one child: delete the node and place its child in the position of the deleted node.
Case 3 - The node to be deleted has two children: find the inorder successor (or inorder predecessor) of the node, whichever has the nearest suitable value; delete that successor/predecessor using the cases above; and replace the node with the inorder successor or predecessor.

4. Traversal in a BST
1. In order - left subtree, then node, then right subtree (visits the keys in sorted order).
2. Pre order - node, then left subtree, then right subtree.
3. Post order - left subtree, then right subtree, then node.
Time Complexity: O(N)
Auxiliary Space: O(1) if we don't count the stack space for the recursive calls, otherwise O(h), where h is the height of the tree.

Note->
· The height of a skewed tree is n (the number of elements), so the worst-case auxiliary space is O(N); the height of a balanced tree is log N, so the best case is O(log N).
· When the frequency of insertions and deletions is very high, AVL trees are used.
· When the tree tends to become skewed, a red-black tree is used.

Advantages of BST-
· A BST is fast in insertion and deletion when it is balanced.
· A BST is efficient.
· We can also do range queries - find all keys between N and M (N <= M).
· BST code is simple compared to other data structures.

Disadvantages of BST-
· The main disadvantage is that we should always use a balanced binary search tree; otherwise the cost of operations may not be logarithmic and may degenerate into a linear search over an array.
· Accessing an element in a BST is slightly slower than in an array.
· A BST can become imbalanced or degenerate, which increases the complexity.

AVL Trees
AVL tree is a self-balancing Binary Search Tree (BST) where the difference between heights of left and right
subtrees cannot be more than one for all nodes. 
Example: a tree with root 12, left child 8 (children 5 and 11) and right child 18 (child 17).

This tree is AVL because the difference between the heights of the left and right subtrees of every node is at most 1.

Significance of AVL trees


Most of the BST operations (e.g., search, max, min, insert, delete, etc.) take O(h) time where h is the
height of the BST. The cost of these operations may become O(n) for a skewed Binary tree. If we make sure
that the height of the tree remains O(log(n)) after every insertion and deletion, then we can guarantee an upper
bound of O(log(n)) for all these operations. The height of an AVL tree is always O(log(n)) where n is the
number of nodes in the tree.

Insertion in AVL tree

Two basic operations can be performed to rebalance a BST without violating the BST property (keys(left) < key(root) < keys(right)):
· Left Rotation
· Right Rotation
A C sketch of both rotations follows the list of cases below.

(Rotation figure, not reproduced here: right-rotating a node Y with left child X makes X the subtree root with Y as its right child, and X's old right subtree T2 becomes Y's new left subtree; left rotation is the mirror image.)

Possible arrangements:
1. Left Left case
2. Right Right case
3. Left Right case
4. Right Left case
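A C sketch of the two rotations (height bookkeeping omitted for brevity; a full AVL insert would also update heights and balance factors):

struct avl {
    int key;
    struct avl *left, *right;
};

/* Right rotation: y's left child x becomes the subtree root,
   and x's old right subtree t2 moves under y. Runs in O(1). */
struct avl *rotate_right(struct avl *y)
{
    struct avl *x = y->left;
    struct avl *t2 = x->right;
    x->right = y;
    y->left = t2;
    return x;            /* new subtree root */
}

/* Left rotation: the mirror image of rotate_right. */
struct avl *rotate_left(struct avl *x)
{
    struct avl *y = x->right;
    struct avl *t2 = y->left;
    y->left = x;
    x->right = t2;
    return y;            /* new subtree root */
}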

Complexity Analysis
Time Complexity: O(log(n)) for insertion
Space Complexity: O(1)
The rotation operations (left and right rotate) take constant time, as only a few pointers change. Updating the height and getting the balance factor also take constant time. So the time complexity of AVL insert remains the same as BST insert, O(h), where h is the height of the tree. Since an AVL tree is balanced, the height is O(log(n)), so the time complexity of AVL insert is O(log(n)).

Red Black Tree


A red-black tree is a kind of self-balancing binary search tree where each node has an extra bit, and
that bit is often interpreted as the color (red or black). These colors are used to ensure that the tree remains
balanced during insertions and deletions. Although the balance of the tree is not perfect, it is good enough to
reduce the searching time and maintain it around O(log n) time, where n is the total number of elements in the
tree.

Properties of Red Black Tree

· Every node has a color, either red or black.
· The root of the tree is always black.
· There are no two adjacent red nodes (a red node cannot have a red parent or a red child).
· Every path from a node (including the root) to any of its descendant NULL nodes has the same number of black nodes.
· All leaf (NULL) nodes are black.

Node Structure
Each node stores: Color | Left | Key | Parent | Right

Advantage of R-B Trees


Most of the BST operations (e.g., search, max, min, insert, delete, etc.) take O(h) time where h is the
height of the BST. The cost of these operations may become O(n) for a skewed Binary tree. If we make sure
that the height of the tree remains O(log n) after every insertion and deletion, then we can guarantee an upper
bound of O(log n) for all these operations. The height of a Red-Black tree is always O(log n) where n is the
number of nodes in the tree. 

Some examples

(The example figures are not reproduced here.)
Note:
· If a red node is a leaf, we attach NULL sentinels to it, which are always black; NULL sentinels can also be attached to black nodes.
· Two adjacent nodes cannot both be red: a parent and child cannot be red at the same time (siblings can).
· Counting the black nodes on any path from a given node down to a leaf always gives the same number of black nodes (for example, in a tree with 4 such paths, each path may have 1 black node, the root not counted).
· Consequently, the black height from any node to any leaf below it is the same.

Search Operation in RB Tree

Function: searchElement(TREE, VAL)
1. If TREE = NULL or TREE -> DATA = VAL
    1. Return TREE
2. Else
    1. If VAL < TREE -> DATA
        1. Return searchElement(TREE -> LEFT, VAL)
    2. Else
        1. Return searchElement(TREE -> RIGHT, VAL)

Insertion Operation in RB Tree

· Insert the new node the way it is done in binary search trees.
· Color the node red.
· If an inconsistency arises for the red-black tree, fix the tree according to the type of discrepancy.
Function: RB-INSERT(T, z)
1. y ← NIL[T]
2. x ← ROOT[T]
3. While x ≠ NIL[T]
4.     do y ← x
5.         If KEY[z] < KEY[x]
6.             then x ← LEFT[x]
7.             else x ← RIGHT[x]
8. P[z] ← y
9. If y = NIL[T]
10.     then ROOT[T] ← z
11.     else If KEY[z] < KEY[y]
12.         then LEFT[y] ← z
13.         else RIGHT[y] ← z
14. LEFT[z] ← NIL[T]
15. RIGHT[z] ← NIL[T]
16. COLOR[z] ← RED
17. RB-INSERT-FIXUP(T, z)

Example: Show the red-black trees that result after successively inserting the keys 41, 38, 31, 12 into an initially empty red-black tree.
1. Insert 41: 41 becomes the (black) root.
2. Insert 38: 38 becomes the red left child of 41.
3. Insert 31: 31 is inserted red as the left child of 38, creating two adjacent red nodes (a left-left case); a right rotation at 41 plus recoloring makes 38 the black root with red children 31 and 41.
4. Insert 12: 12 is inserted red as the left child of 31; its parent 31 and uncle 41 are both red, so both are recolored black (the root 38 stays black).

Applications of RB-Tree
1. Most self-balancing BST library functions, like map, multiset, and multimap in C++, use red-black trees.
2. It is used to implement CPU scheduling in Linux.
3. It is also used in the K-means clustering algorithm in machine learning to reduce time complexity.
4. Moreover, MySQL also uses red-black trees for indexes on tables in order to reduce searching and insertion time.
Comparison with AVL Tree
The AVL tree and other self-balancing search trees like the red-black tree are useful for getting all basic operations done in O(log(n)) time. AVL trees are more strictly balanced than red-black trees, but they may cause more rotations during insertion and deletion. So if the application involves many frequent insertions and deletions, red-black trees should be preferred; if insertions and deletions are less frequent and search is the more frequent operation, the AVL tree should be preferred over the red-black tree.

B Tree-
A B-Tree is a self-balancing search tree. In most other self-balancing search trees it is assumed that everything fits in main memory; the B-Tree is a specialized m-way tree widely used for disk access. A B-Tree of order m can have at most m-1 keys and m children per node. One of the main reasons for using a B-Tree is its capability to store a large number of keys in a single node while keeping the height of the tree relatively small.
A B-Tree of order m has all the properties of an m-way tree. In addition, it has the following properties:
1. Every node in a B-Tree contains at most m children.
2. Every node in a B-Tree, except the root node and the leaf nodes, contains at least m/2 children.
3. The root node must have at least 2 children (unless it is a leaf).
4. All leaf nodes must be at the same level.

Operations on B-tree:

1. Searching
Searching in a B-Tree is similar to searching in a binary search tree. For example, suppose we search for the item 49 in a B-Tree (the figure is not reproduced here). The process goes roughly as follows:
1. Compare item 49 with the root node 78. Since 49 < 78, move to its left sub-tree.
2. Since 40 < 49 < 56, traverse the right sub-tree of 40.
3. 49 > 45, so move to the right and compare again.
4. Match found; return.
Searching in a B-Tree depends on the height of the tree; the search algorithm takes O(log n) time to find any element.
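A C sketch of the search under a common node layout (the layout and names are assumptions for illustration, not taken from these notes):

#define T_MIN 3   /* minimum degree; illustrative value */

struct btree_node {
    int n;                           /* number of keys currently stored */
    int keys[2 * T_MIN - 1];         /* keys kept in sorted order */
    struct btree_node *child[2 * T_MIN];
    int leaf;                        /* nonzero if the node is a leaf */
};

/* Returns the node containing k, or NULL if k is absent. */
struct btree_node *btree_search(struct btree_node *x, int k)
{
    int i = 0;
    while (i < x->n && k > x->keys[i])    /* find the first key >= k */
        i++;
    if (i < x->n && x->keys[i] == k)
        return x;                         /* found in this node */
    if (x->leaf)
        return NULL;                      /* nowhere further to descend */
    return btree_search(x->child[i], k);  /* recurse into the i-th child */
}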

2. Insertion
Example: Insert 20, 12, 50, 60, 18, 65, 70, 11, 10 into a B-Tree of order 4 (at most 3 keys per node).

1. Insert 20: [20]
2. Insert 12: [12 20]
3. Insert 50: [12 20 50]
4. Insert 60: [12 20 50] is full, so it splits around the median 20: root [20] with children [12] and [50 60]
5. Insert 18: root [20] with children [12 18] and [50 60]
6. Insert 65, then 70: inserting 70 splits [50 60 65] around 60: root [20 60] with children [12 18], [50], [65 70]
7. Insert 11: root [20 60] with children [11 12 18], [50], [65 70]
8. Insert 10: [10 11 12 18] would overflow, so the leaf splits around 12: root [12 20 60] with children [10 11], [18], [50], [65 70]

3. Deletion

Deletion example (delete 32, then 31, then 30): the worked figures are not reproduced here.
Complexity Analysis
Every operation (search, insert, delete) takes O(log(n)) time.

Applications of B-Trees:
 It is used in large databases to access data stored on the disk
 Searching for data in a data set can be achieved in significantly less time using the B-Tree
 With the indexing feature, multilevel indexing can be achieved.
 Most of the servers also use the B-tree approach.

Binomial Heap
A binomial heap is a collection of binomial trees, each of which also satisfies the min-heap property.
Binomial Tree: A binomial tree of order 0 has 1 node. A binomial tree of order k is constructed by taking two binomial trees of order k-1 and making one the leftmost child of the other.

1. For k=0, B0 has a single node.
2. For k=1, B1 has 2 nodes (two B0 trees form a B1).
3. For k=2, B2 has 4 nodes (two B1 trees form a B2).
4. For k=3, B3 has 8 nodes (two B2 trees form a B3).

Properties of Binomial Tree
· It has exactly 2^k nodes.
· It has depth k.
· There are exactly kCi (k choose i) nodes at depth i, for i = 0, 1, ..., k.
· The root has degree k, and the children of the root are themselves binomial trees of order k-1, k-2, ..., 0 from left to right.
Examples
A binomial heap with 13 nodes, consisting of 3 binomial trees of orders 0, 2, and 3 (written below with each node's children in parentheses):
B0: 12
B2: 10 (children: 15 (child: 30), 50)
B3: 20 (children: 70 (children: 80 (child: 100), 85), 50 (child: 65), 40)

Binary Representation of a number and Binomial Heaps

A binomial heap with n nodes has a number of binomial trees equal to the number of set bits in the binary representation of n. For example, let n be 13; there are 3 set bits in the binary representation of n (00001101), hence 3 binomial trees. We can also relate the orders of these binomial trees to the positions of the set bits. From this relation we can conclude that a binomial heap with n nodes contains O(log(n)) binomial trees.
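A quick C check of this set-bit count (a small illustration, not from the original notes):

#include <stdio.h>

/* Number of binomial trees in a binomial heap with n nodes
   = number of set bits in the binary representation of n. */
int tree_count(int n)
{
    int count = 0;
    while (n) {
        count += n & 1;   /* add the lowest bit */
        n >>= 1;
    }
    return count;
}

int main(void)
{
    printf("%d\n", tree_count(13));   /* 13 = 1101 in binary, so prints 3 */
    return 0;
}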

Operations of Binomial Heap:

The main operation on a binomial heap is union(); all other operations mainly use this operation. The union() operation combines two binomial heaps into one.

1. Insertion
Inserting an element into the heap is done by creating a new heap containing only the element to be inserted and then merging it with the original heap. Because of the merging, a single insertion takes O(log(n)) time.

Consider the heap below, written with each node's children in parentheses; it has three binomial trees of degrees 0, 1, and 2, with B0 at the head of the root list:
12 | 7 (25) | 15 (28 (41), 33)
Suppose we insert a new node 15. First the two heaps are combined; as both node 12 and the new node 15 are of degree 0, node 15 is placed next to node 12 in the root list:
12 | 15 | 7 (25) | 15 (28 (41), 33)
Now assign x to the B0 with value 12, next(x) to the B0 with value 15, and sibling(next(x)) to the B1 with value 7. The degrees of x and next(x) are equal, and the key of x is smaller than the key of next(x), so next(x) is removed and attached under x:
12 (15) | 7 (25) | 15 (28 (41), 33)
Now x points to node 12 with degree 1, next(x) to node 7 with degree 1, and sibling(next(x)) to node 15 with degree 2. The degree of x equals the degree of next(x) but not that of sibling(next(x)). The key of x is greater than the key of next(x), so x is removed and attached under next(x):
7 (12 (15), 25) | 15 (28 (41), 33)
Now x points to node 7 and next(x) to node 15. Both have degree 2, and the key of x is less than the key of next(x), so next(x) is removed and attached under x, giving the final heap:
7 (15 (28 (41), 33), 12 (15), 25)

2. Deletion
To delete a node from the heap, first decrease its key to negative infinity (-∞) and then extract the minimum from the heap. Consider the heap below (children in parentheses) and suppose we have to delete the node 41:
12 | 7 (25) | 15 (28 (41), 33)
First, replace 41 with -∞:
12 | 7 (25) | 15 (28 (-∞), 33)
Now swap -∞ with its parent in order to maintain the min-heap property:
12 | 7 (25) | 15 (-∞ (28), 33)
Now, again swap -∞ with its parent:
12 | 7 (25) | -∞ (15 (28), 33)
The next step is to extract the minimum key from the heap. Since the minimum key is now -∞, we remove that root, and its children become roots of their own trees:
12 | 7 (25) | 15 (28) | 33
Merging the two B0 trees (12 and 33) gives:
12 (33) | 7 (25) | 15 (28)
Merging the two B1 trees (7 (25) and 15 (28)) gives the final heap after deleting node 41:
12 (33) | 7 (15 (28), 25)

*The time complexity of deleting a node is O(log(n)).

Fibonacci Heap-
A Fibonacci heap is a data structure that consists of a collection of trees which follow the min-heap or max-heap property. In a Fibonacci heap, a node can have more than two children or no children at all, and its heap operations are more efficient than those supported by binomial and binary heaps. The Fibonacci heap is so called because the trees are constructed in such a way that a tree of order n has at least F(n+2) nodes in it, where F(n+2) is the (n+2)th Fibonacci number.

Properties of a Fibonacci Heap

Important properties of a Fibonacci heap are:
1. It is a set of min-heap-ordered trees (i.e., the parent is always smaller than its children).
2. A pointer is maintained to the minimum element node.
3. It contains a set of marked nodes (used by the decrease-key operation).
4. The trees within a Fibonacci heap are unordered, but each is rooted.

Memory Representation of the Nodes in a Fibonacci Heap
(The memory-layout figure is not reproduced here.)

Operations on a Fibonacci Heap

1. Insertion
(The worked figure is not reproduced here: a new node is simply spliced into the root list and the minimum pointer is updated, so insertion takes O(1) time.)
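A minimal C sketch of the O(1) insert, assuming the usual circular doubly linked root list (node fields and names are illustrative assumptions):

#include <stdlib.h>

struct fnode {
    int key;
    struct fnode *left, *right;   /* siblings in the circular root list */
};

/* Splices a new node into the root list next to min and returns the
   (possibly new) minimum pointer; no consolidation is needed, so the
   cost is O(1). */
struct fnode *fib_insert(struct fnode *min, int key)
{
    struct fnode *n = malloc(sizeof *n);
    n->key = key;
    if (min == NULL) {            /* heap was empty: node points to itself */
        n->left = n->right = n;
        return n;
    }
    n->right = min->right;        /* splice n just after min */
    n->left = min;
    min->right->left = n;
    min->right = n;
    return (key < min->key) ? n : min;   /* update the min pointer if needed */
}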

Complexity Analysis
· Insertion: O(1)
· Deletion: O(log(n))

Amortized Analysis
Amortized analysis is used for algorithms where an occasional operation is very slow but most other operations are fast. In amortized analysis, we analyze a sequence of operations and guarantee a worst-case average time per operation that is lower than the worst-case time of a single, particularly expensive operation.
Example data structures whose operations are analyzed using amortized analysis are hash tables, disjoint sets, and splay trees.
For a sequence of n operations,
Amortized cost = (cost of the normal operations + cost of the expensive operations) / n

Amortized analysis of insertion in Dynamic Array

In a dynamic array, an item can be appended in O(1) time while spare capacity remains. When the array is full, the append cannot be done in constant time: the array first doubles its capacity, copies the existing elements into the new storage, and then inserts the element. Because the expensive doubling happens only occasionally, the amortized cost of an append is still O(1). A C sketch follows.
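A C sketch of the append operation (illustrative; error handling for realloc is omitted):

#include <stdlib.h>

struct dynarray {
    int *data;
    size_t size;       /* number of elements stored */
    size_t capacity;   /* number of slots allocated */
};

/* Most appends cost O(1); an occasional append pays O(n) to move
   everything into a buffer twice as large. Averaged over n appends,
   the amortized cost per append is O(1). */
void push(struct dynarray *a, int value)
{
    if (a->size == a->capacity) {                 /* table is full */
        size_t newcap = a->capacity ? 2 * a->capacity : 1;
        a->data = realloc(a->data, newcap * sizeof *a->data);  /* expensive step */
        a->capacity = newcap;
    }
    a->data[a->size++] = value;                   /* cheap step */
}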

Initially the table is empty and its size is 0. Appending items 1 through 7 one at a time grows it as follows:

Insert item 1: [1]
Insert item 2: [1 2]
Insert item 3: [1 2 3]
Insert item 4: [1 2 3 4]
Insert item 5: [1 2 3 4 5]
Insert item 6: [1 2 3 4 5 6]
Insert item 7: [1 2 3 4 5 6 7]