DAA Unit 3 Notes
Course Outcomes:
1. Understand and apply mathematical preliminaries to the analysis and design stages of
methods. (Analyze)
3. Understand and apply the divide-and-conquer paradigm and synthesize divide-and-conquer algorithms on
problems of sorting, searching, finding MST, etc. (Understand, Apply)
4. Describe the greedy paradigm and explain when an algorithmic design situation calls
for it. For a given problem, develop the greedy algorithms. (Apply, Analyze)
5. Apply the dynamic-programming paradigm to model engineering problems using graphs and write the
corresponding algorithms to solve the problems. (Apply)
6. Explain the ways to analyze randomized and approximation algorithms. (Apply, Analyze)
UNIT 1
Table of contents-
Unit-1
Unit-2
v Operations
v Complexity Analysis
Ø Red-Black Tree
v Properties
v Advantages
v Operations
v Applications
v R-B Tree Vs AVL Tree
Ø B-Tree
v Operations
v Complexity Analysis
v Applications
Ø Binomial Heap
v Properties
v Binary Representation of a number using heap
v Operations
Ø Fibonacci Heap
v Properties
v Binary Representation of a number using heap
v Operations
Ø Amortized Analysis
Unit-3
Algorithm:-
Algorithm means "a set of rules to be followed in calculations or other problem-solving operations",
or "a procedure for solving a mathematical problem in a finite number of steps that frequently involves
recursive operations". It can be understood by taking the example of cooking a new recipe: to cook a new
recipe, one reads the instructions and steps and executes them one by one, in the given sequence. Algorithms
help to do a task in programming to get the expected output.
(a priori analysis vs. a posteriori analysis, continued:)
· A priori analysis is independent of the type of hardware and the language of the compiler; these have no
effect on it. A posteriori analysis is dependent on the language of the compiler and the type of hardware used.
· A priori analysis gives approximate answers for the complexity of the program. A posteriori analysis helps
to get the actual, real report about correctness, space required, time consumed, etc.
Algorithm complexity-
An algorithm is judged complex based on the amount of space and time it consumes. Hence the
complexity of an algorithm refers to the measure of the time it will need to execute and get the expected
output, and the space it will need to store all the data (input, temporary data, and output). These two
factors define the efficiency of an algorithm.
The two factors of Algorithm Complexity are:
· Time Factor: Time is measured by counting the number of key operations such as comparisons in the
sorting algorithm.
· Space Factor: Space is measured by counting the maximum memory space required by the algorithm.
Therefore, the complexity of an algorithm can be divided into two types:
1. Space Complexity: The space complexity of an algorithm refers to the amount of memory used by the
algorithm to store the variables and get the result. This can be for inputs, temporary operations, or outputs.
2. Time Complexity: The time complexity of an algorithm refers to the amount of time that is required by the
algorithm to execute and get the result. This can be for normal operations, conditional if-else statements, loop
statements, etc.
Asymptotic notations-
In computing, asymptotic analysis of an algorithm refers to defining the mathematical bounds of its
run-time performance based on the input size. For example, the running time of one operation may be computed
as f(n) and, for another operation, as g(n²). This means the running time of the first operation
will increase linearly with the increase in n, while the running time of the second operation will increase
quadratically when n increases. Usually, the analysis of an algorithm is done based on three cases:
1. Best Case (Big-Omega Notation (Ω))
2. Average Case (Big-Theta Notation (Θ))
3. Worst Case (Big-O Notation(O))
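For reference, these three notations have standard formal definitions; they are stated here (as a supplement, in the same c*g(n) notation used by the examples below):
f(n) = O(g(n)) if there exist constants c > 0 and n0 such that f(n) <= c*g(n) for all n >= n0 (g is an upper bound).
f(n) = Ω(g(n)) if there exist constants c > 0 and n0 such that f(n) >= c*g(n) for all n >= n0 (g is a lower bound).
f(n) = Θ(g(n)) if f(n) is both O(g(n)) and Ω(g(n)) (g is a tight bound).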
Example 2: - Consider the following code:
for(i=0;i<n;i++) —> (n+1) steps
{
for(j=0;j<n;j++) —> (n+1)*n steps
printf("Good"); —> n² steps
}
Total steps = 2n² + 2n + 1
c*g(n) >= f(n)
Hence, g(n) is the upper bound of f(n) and the complexity is O(n²)
Example 3: - Consider the following code:
for(i=0;i<n;i+=2) —> (n/2) steps
{
printf("Good"); —> (n/2 - 1) steps
}
Total steps = n - 1
Let c=1 & g(n)=n
c*g(n)>=f(n)
Hence, g(n) is the upper bound of f(n) and complexity is O(n)
Example 4: - Consider the following code:
for(i=1;i<n;i*=2) —> (log₂n) steps
{
printf("Good"); —> (log₂n - 1) steps
}
Total steps = 2log₂n - 1
c*g(n) >= f(n)
Hence, g(n) is the upper bound of f(n) and the complexity is O(log₂n)
Recurrence relations-
A recurrence relation is a way of determining the running time of a recursive algorithm or program. It is an
equation or inequality that describes a function in terms of its values on smaller inputs.
For example, let T(n) be the running time of a given problem on size n; the problem in our case
will be finding the nth Fibonacci number. Let F(n) denote the nth Fibonacci number; therefore F(n) = F(n-1) + F(n-2)
and T(0) = a, where a is a constant. The recurrence relation will be: T(n) = c + T(n-1), where c is a constant.
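As a small illustration (a C++ sketch of ours, not from the original notes), the running time of the following recursive function satisfies exactly the relation T(n) = c + T(n-1) with T(0) = a, and therefore runs in O(n) time:

#include <cstdio>

// Recursive sum of 1..n. Each call does constant work (cost c)
// plus one recursive call on input size n-1, so its running time
// satisfies T(n) = c + T(n-1), T(0) = a, which solves to O(n).
long long sum(int n) {
    if (n == 0) return 0;   // base case: constant cost a
    return n + sum(n - 1);  // constant work c plus T(n-1)
}

int main() {
    printf("%lld\n", sum(10)); // prints 55
    return 0;
}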
Substitution method-
The whole working of the substitution method can be divided into two processes:
v Take a guess at the solution
v Find boundary conditions using the principles of mathematical induction and prove that the
guess is correct
Example 1: Recurrence relation: T(n) = T(n-1) + 1 for n > 1
Step 1: Base condition: T(0) = 1
Step 2: Using the above statement we can write that:
T(n) = [T(n-2) + 1]+1
= [[T(n-3)+1] + 1]+1
And so on: T(n) = T(0) + n*1
Step 3: Now we need to put the values into the recurrence formula:
T(n) = T(0) + n = n + 1
Therefore T(n) = O(n)
Example 2: Recurrence relation: T(n) = T(n-1) + c for n > 1
Step 1: Base condition: T(0) = 1
Step 2: Using the above statement we can write that:
T(n) = [T(n-2) + c]+c
= [[T(n-3)+c] + c]+c
And so on: T(n) = T(0) + n*c
Step 3: Now we need to put the values into the recurrence formula:
T(n) = T(0) + c·n = cn + 1. Since c is a constant, T(n) = O(n).
Linear Search-
Linear Search is defined as a sequential search algorithm that starts at one end and goes through each
element of a list until the desired element is found, otherwise the search continues till the end of the data set. It is
the easiest searching algorithm, with a complexity of O(n).
Algorithm: 1-Initialize A[N]
2-For I=1 to N
2.1-If A[I]=KEY, return I
3-Return -1
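A minimal C++ version of this algorithm (our sketch; 0-indexed, unlike the 1-indexed pseudocode above):

#include <cstdio>

// Returns the index of key in a[0..n-1], or -1 if it is absent.
// Scans the array sequentially, so the worst case is O(n) comparisons.
int linearSearch(const int a[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}

int main() {
    int a[] = {7, 3, 9, 14, 2};
    printf("%d\n", linearSearch(a, 5, 14)); // prints 3
    return 0;
}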
Binary Search-
Binary Search is a searching algorithm used in a sorted array by repeatedly dividing the search interval in
half. The idea of binary search is to use the information that the array is sorted and reduce the time complexity to
O(log n).
Algorithm: 1-Initialize A[N], LOWER=1, UPPER=N
2-While LOWER<=UPPER
2.1-Set MID=(LOWER+UPPER)/2
2.2-If A[MID]=KEY, return MID
2.3-If A[MID]<KEY, set LOWER=MID+1
2.4-If A[MID]>KEY, set UPPER=MID-1
3-Return -1
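A C++ sketch of the same algorithm (ours; note the overflow-safe way of computing MID):

#include <cstdio>

// Binary search over a sorted array a[0..n-1].
// Each iteration halves the search interval, so the cost is O(log n).
int binarySearch(const int a[], int n, int key) {
    int lower = 0, upper = n - 1;
    while (lower <= upper) {
        int mid = lower + (upper - lower) / 2; // avoids overflow of lower+upper
        if (a[mid] == key) return mid;
        if (a[mid] < key)  lower = mid + 1;
        else               upper = mid - 1;
    }
    return -1;
}

int main() {
    int a[] = {2, 4, 5, 9, 10, 14, 18, 30, 34, 45};
    printf("%d\n", binarySearch(a, 10, 14)); // prints 5
    return 0;
}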
Sorting Algorithms-
Sorting is the process of arranging the elements of an array so that they can be placed either in ascending
or descending order. Consider an array: A[10] = { 5, 4, 10, 2, 30, 45, 34, 14, 18, 9 }. The array sorted in ascending
order will be given as: A[] = { 2, 4, 5, 9, 10, 14, 18, 30, 34, 45 }
There are many techniques by using which, sorting can be performed. Some of them are-
Ø Bubble Sort
Ø Merge Sort
Ø Selection Sort
Ø Insertion Sort
Ø Quick Sort
Ø Heap Sort
Ø Radix Sort
Ø Bucket Sort
Algorithms comparison table-
S.no.  Sorting Algorithm   Best case      Average case   Worst case
1      Bubble Sort         O(n)           O(n²)          O(n²)
2      Insertion Sort      O(n)           O(n²)          O(n²)
3      Merge Sort          O(n log n)     O(n log n)     O(n log n)
4      Selection Sort      O(n²)          O(n²)          O(n²)
5      Quick Sort          O(n log n)     O(n log n)     O(n²)
6      Heap Sort           O(n log n)     O(n log n)     O(n log n)
7      Radix Sort          O(n+k)         O(n+k)         O(n+k)
8      Bucket Sort         O(n)           O(n+k)         O(n²)
Bubble sort: It is the simplest sorting method, which sorts by repeatedly moving the largest element to the
highest index of the array. It compares each element to its adjacent element and swaps them if they are out of
order.
Algorithm-
1. Initialize A[N]
2. For I=1 to N-1
3. For J=1 to N-I
4. If A[J] > A[J+1]
5. Swap(A[J], A[J+1])
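A runnable C++ version of this algorithm (our sketch, 0-indexed):

#include <cstdio>

// Bubble sort: repeatedly compare adjacent elements and swap them
// if they are out of order; after pass i, the largest i+1 elements
// are in their final positions. Worst case O(n^2).
void bubbleSort(int a[], int n) {
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - 1 - i; j++)
            if (a[j] > a[j + 1]) {
                int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
            }
}

int main() {
    int a[] = {5, 4, 10, 2, 30, 45, 34, 14, 18, 9};
    bubbleSort(a, 10);
    for (int x : a) printf("%d ", x); // 2 4 5 9 10 14 18 30 34 45
    return 0;
}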
Insertion sort: As the name suggests, insertion sort inserts each element of the array to its proper place. It is a
very simple sort method which is used to arrange the deck of cards while playing bridge.
Algorithm-
1. Initialize A[N]
2. For I=2 to N
1. KEY=A[I]
2. J=I-1
3. While J>=1 and A[J] > KEY
A[J+1]= A[J]
J=J-1
4. A[J+1]=KEY
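The same algorithm as compact C++ (our sketch, 0-indexed):

#include <cstdio>

// Insertion sort: grow a sorted prefix; insert a[i] into it by
// shifting larger elements one slot to the right.
// Best case O(n) (already sorted), worst case O(n^2).
void insertionSort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j]; // shift right
            j--;
        }
        a[j + 1] = key;
    }
}

int main() {
    int a[] = {9, 5, 1, 4, 3};
    insertionSort(a, 5);
    for (int x : a) printf("%d ", x); // 1 3 4 5 9
    return 0;
}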
Selection sort - Selection sort finds the smallest element in the array and place it on the first place on the list, then
it finds the second smallest element in the array and place it on the second place. This process continues until all
the elements are moved to their correct ordering. It carries running time O(n²), which is worse than insertion sort.
Algorithm-
1. Initialize A[N]
2. For I=1 to N
1.Set SMALL = A[I]
2. Set POS=I
3. For J=I+1 to N
1.If SMALL>A[J]
1.Set SMALL=A[J]
2. Set POS=J
4. Swap A[I] with A[POS]
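In C++ the same algorithm looks like this (our sketch, 0-indexed):

#include <cstdio>

// Selection sort: find the minimum of the unsorted suffix and swap
// it into position i. Always O(n^2) comparisons, regardless of input.
void selectionSort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int pos = i;
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[pos]) pos = j;
        int t = a[i]; a[i] = a[pos]; a[pos] = t; // Swap A[I] with A[POS]
    }
}

int main() {
    int a[] = {5, 4, 10, 2, 9};
    selectionSort(a, 5);
    for (int x : a) printf("%d ", x); // 2 4 5 9 10
    return 0;
}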
Merge sort - Merge sort follows the divide and conquer approach in which the list is first divided into two
halves, each half is sorted recursively using merge sort, and the two sorted halves are then combined to
form the final sorted array.
Algorithm for Sort function; -
1. Initialize A[N]
2. If BEG < END
3. Set MID = (BEG + END)/2
4. Call function SORT(A, BEG, MID)
5. Call function SORT(A, MID + 1, END)
6. Call function MERGE (A, BEG, MID, END)
Algorithm for Merge function; -
1. Initialize N1 = MID - BEG + 1 & N2 = END - MID, LEFT[N1], RIGHT[N2]
2. For I=0 to N1-1
1. LEFT[I] = A[BEG + I]
3. For J=0 to N2-1
1. RIGHT[J] = A[MID + 1 + J]
4. Initialize I=0, J=0, K=BEG
5. While I < N1 and J < N2
1. If LEFT[I] <= RIGHT[J]
1. A[K] = LEFT[I]
2. I=I+1
2. Else
1. A[K] = RIGHT[J]
2. J=J+1
3. K=K+1
6. While (I<N1)
1. A[K] = LEFT[I]
2. I=I+1
3. K=K+1
7. While (J<N2)
1. A[K] = RIGHT[J]
2. J=J+1
3. K=K+1
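A runnable C++ rendering of the two functions above (our sketch; it uses std::vector for the temporary LEFT/RIGHT arrays):

#include <cstdio>
#include <vector>

// Merge two sorted halves a[beg..mid] and a[mid+1..end].
void merge(int a[], int beg, int mid, int end) {
    std::vector<int> left(a + beg, a + mid + 1);
    std::vector<int> right(a + mid + 1, a + end + 1);
    size_t i = 0, j = 0;
    int k = beg;
    while (i < left.size() && j < right.size())
        a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
    while (i < left.size())  a[k++] = left[i++];
    while (j < right.size()) a[k++] = right[j++];
}

// Split, sort each half recursively, then merge:
// T(n) = 2T(n/2) + O(n) = O(n log n).
void mergeSort(int a[], int beg, int end) {
    if (beg >= end) return;
    int mid = beg + (end - beg) / 2;
    mergeSort(a, beg, mid);
    mergeSort(a, mid + 1, end);
    merge(a, beg, mid, end);
}

int main() {
    int a[] = {5, 4, 10, 2, 30, 45};
    mergeSort(a, 0, 5);
    for (int x : a) printf("%d ", x); // 2 4 5 10 30 45
    return 0;
}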
Quick sort- Quick sort is one of the most efficient sorting algorithms, performing sorting in O(n log n)
comparisons on average. Like merge sort, quick sort also works by using the divide and conquer approach.
Sorting function algorithm-
1. Initialize A[N]
2. If (START < END)
1. Call function PARTITION(A, START, END) and store its return value in P
2. Call function QUICKSORT (A, START, P - 1)
3. Call function QUICKSORT (A, P + 1, END)
Partition Algorithm:
The partition algorithm rearranges the subarray in place around a pivot.
1. Initialize PIVOT =A[END], I= START-1
2. For J =START to END -1
1. If A[J] < PIVOT
1.1. I= I + 1
1.2. Swap A[I] WITH A[J]
3. Swap A[I+1] with A[END]
4. Return I+1
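Both functions in C++ (our sketch of the Lomuto partition scheme used above):

#include <cstdio>

// Lomuto partition: pivot = a[end]; elements smaller than the pivot
// are moved to the front; returns the pivot's final index.
int partition(int a[], int start, int end) {
    int pivot = a[end];
    int i = start - 1;
    for (int j = start; j < end; j++)
        if (a[j] < pivot) {
            i++;
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    int t = a[i + 1]; a[i + 1] = a[end]; a[end] = t;
    return i + 1;
}

// Average case O(n log n); worst case O(n^2), e.g. on sorted input.
void quickSort(int a[], int start, int end) {
    if (start < end) {
        int p = partition(a, start, end);
        quickSort(a, start, p - 1);
        quickSort(a, p + 1, end);
    }
}

int main() {
    int a[] = {10, 7, 8, 9, 1, 5};
    quickSort(a, 0, 5);
    for (int x : a) printf("%d ", x); // 1 5 7 8 9 10
    return 0;
}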
Heap sort - The concept of heap sort is to eliminate the elements one by one from the heap part of the list and
insert them into the sorted part of the list. Heap sort is an in-place sorting algorithm. The first step is
the creation of a heap by adjusting the elements of the array. After the creation of the heap, repeatedly remove
the root element of the heap by shifting it to the end of the array, and then restore the heap structure with the
remaining elements.
Algorithm for HEAPIFY function
1. Initialize LARGEST = I, LEFT = 2 * I + 1; & RIGHT = 2 * I + 2;
2. If (LEFT < N && ARR[LEFT] > ARR[LARGEST])
1. LARGEST = LEFT
3. If (RIGHT < N && ARR[RIGHT] > ARR[LARGEST])
1. LARGEST = RIGHT
4. IF (LARGEST != I)
1. SWAP(&ARR[I], &ARR[LARGEST])
2. HEAPIFY(ARR, N, LARGEST)
Algorithm for Building Max heap function-
1. FOR (I = N / 2 - 1) to 0
1. Call function HEAPIFY(ARR, N, I)
Algorithm for heap sort function-
1. Initialize A[N]
2. Call function BUILDMAXHEAP(ARR, N)
3. FOR I = N - 1 to 1
1. SWAP(&ARR[0], &ARR[I])
2. Call function HEAPIFY(ARR, I, 0)
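All three functions as runnable C++ (our sketch):

#include <cstdio>

// Sift a[i] down so that the subtree rooted at i is a max-heap
// (assumes both of its children already head max-heaps).
void heapify(int a[], int n, int i) {
    int largest = i, left = 2 * i + 1, right = 2 * i + 2;
    if (left < n && a[left] > a[largest])   largest = left;
    if (right < n && a[right] > a[largest]) largest = right;
    if (largest != i) {
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        heapify(a, n, largest);
    }
}

// Build a max-heap, then repeatedly move the root (the maximum) to
// the end and re-heapify the shrunken prefix: O(n log n), in place.
void heapSort(int a[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--) heapify(a, n, i); // build max heap
    for (int i = n - 1; i > 0; i--) {
        int t = a[0]; a[0] = a[i]; a[i] = t;
        heapify(a, i, 0);
    }
}

int main() {
    int a[] = {12, 11, 13, 5, 6, 7};
    heapSort(a, 6);
    for (int x : a) printf("%d ", x); // 5 6 7 11 12 13
    return 0;
}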
Radix sort: Radix sort is a linear-time sorting algorithm used for integers. In radix sort, sorting is performed
digit by digit, starting from the least significant digit and moving to the most significant digit.
Algorithm for radix sort function-
1. Initialize A[N]
2. Initialize MAX by calling function GETMAX
3. FOR (PLACE = 1; MAX / PLACE > 0; PLACE = PLACE * 10)
1. Call function COUNTINGSORT(A, N, PLACE)
Algorithm for GETMAX function-
1. Initialize MAX = A[0]
2. FOR I = 1 to N-1
1. IF (A[I] > MAX)
1. MAX = A[I]
3. RETURN MAX
Algorithm for COUNTINGSORT function-
1. Initialize OUTPUT[N] & COUNT[10] = {0}
2. FOR I = 0 to N-1
1. COUNT[(A[I] / PLACE) % 10]++
3. FOR I = 1 to 9
1. COUNT[I] = COUNT[I] + COUNT[I - 1]
4. FOR I = N - 1 to 0
1. OUTPUT[COUNT[(A[I] / PLACE) % 10] - 1] = A[I]
2. COUNT[(A[I] / PLACE) % 10]--
5. FOR I = 0 to N-1
1. A[I] = OUTPUT[I]
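The three functions combined into runnable C++ (our sketch; the fixed-size output buffer assumes n <= 100 here):

#include <cstdio>

// Stable counting sort of a[0..n-1] by the digit at 'place'
// (place = 1, 10, 100, ...).
void countingSort(int a[], int n, int place) {
    int output[100], count[10] = {0};   // assumes n <= 100 in this sketch
    for (int i = 0; i < n; i++) count[(a[i] / place) % 10]++;
    for (int i = 1; i < 10; i++) count[i] += count[i - 1]; // prefix sums
    for (int i = n - 1; i >= 0; i--)    // backwards keeps the sort stable
        output[--count[(a[i] / place) % 10]] = a[i];
    for (int i = 0; i < n; i++) a[i] = output[i];
}

// Radix sort: counting-sort once per digit of the maximum element.
void radixSort(int a[], int n) {
    int mx = a[0];
    for (int i = 1; i < n; i++) if (a[i] > mx) mx = a[i]; // GETMAX
    for (int place = 1; mx / place > 0; place *= 10)
        countingSort(a, n, place);
}

int main() {
    int a[] = {170, 45, 75, 90, 802, 24, 2, 66};
    radixSort(a, 8);
    for (int x : a) printf("%d ", x); // 2 24 45 66 75 90 170 802
    return 0;
}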
Bucket sort: It is a sorting algorithm that separates the elements into multiple groups called buckets. The
elements are first uniformly distributed into the buckets, each bucket is then sorted by any other sorting
algorithm, and finally the buckets are gathered back in sorted order.
The basic procedure of performing the bucket sort is given as follows -
· First, partition the range into a fixed number of buckets.
· Then, toss every element into its appropriate bucket.
· After that, sort each bucket individually by applying a sorting algorithm.
· And at last, concatenate all the sorted buckets.
Algorithm-
1. Initialize A[N], INDEX = 0
2. Create vector B[N]
3. FOR I = 0 to N-1
1. BI = N * A[I] (assumes the elements lie in [0, 1))
2. B[BI].PUSH_BACK(A[I])
4. FOR I = 0 to N-1
1. SORT(B[I].BEGIN(), B[I].END())
5. FOR I = 0 to N-1
1. FOR J = 0 to B[I].SIZE()-1
1. A[INDEX++] = B[I][J]
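A C++ version using std::vector buckets (our sketch; it assumes the inputs lie in [0, 1), which is what makes BI = N * A[I] a valid bucket index):

#include <algorithm>
#include <cstdio>
#include <vector>

// Bucket sort for values in [0, 1): value v goes into bucket
// floor(n*v); each bucket is sorted individually, then the
// buckets are concatenated in order.
void bucketSort(float a[], int n) {
    std::vector<std::vector<float>> b(n);
    for (int i = 0; i < n; i++)
        b[(int)(n * a[i])].push_back(a[i]); // assumes 0 <= a[i] < 1
    for (int i = 0; i < n; i++)
        std::sort(b[i].begin(), b[i].end());
    int index = 0;
    for (int i = 0; i < n; i++)
        for (float v : b[i]) a[index++] = v;
}

int main() {
    float a[] = {0.897f, 0.565f, 0.656f, 0.123f, 0.665f, 0.343f};
    bucketSort(a, 6);
    for (float v : a) printf("%.3f ", v); // 0.123 0.343 0.565 0.656 0.665 0.897
    return 0;
}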
Order Statistics-
Suppose we have a set of values X1=4, X2=2, X3=7, X4=11, X5=5. The kth order statistic for this data is
the kth smallest value from the set {4, 2, 7, 11, 5}. So, the 1st order statistic is 2 (the smallest value), the 2nd order
statistic is 4 (the next smallest), and so on. The 5th order statistic is the fifth smallest value (the largest value), which
is 11.
Ø If a sorted list is given, the complexity of finding kth smallest element is O(1).
Ø If an unsorted list is given, the complexity of finding kth smallest element depends upon the sorting
algorithm used.
Ø We can make use of the concept of “Randomized Algorithm”.
Randomized Algorithm-
An algorithm that uses random numbers to decide what to do next anywhere in its logic is called
Randomized Algorithm. For example, in Randomized Quick Sort, we use a random number to pick the next pivot
(or we randomly shuffle the array). Typically, this randomness is used to reduce time complexity or space
complexity in other standard algorithms.
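As a concrete illustration of both ideas above, here is a sketch (our own code, not from the notes) of randomized quickselect, which finds the kth order statistic of an unsorted array in expected O(n) time:

#include <cstdio>
#include <cstdlib>

// Lomuto partition around a randomly chosen pivot.
int randomPartition(int a[], int lo, int hi) {
    int r = lo + std::rand() % (hi - lo + 1);
    int t = a[r]; a[r] = a[hi]; a[hi] = t;  // move the random pivot to the end
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) { i++; t = a[i]; a[i] = a[j]; a[j] = t; }
    t = a[i + 1]; a[i + 1] = a[hi]; a[hi] = t;
    return i + 1;
}

// Returns the kth smallest element (k is 1-based).
int quickSelect(int a[], int lo, int hi, int k) {
    int p = randomPartition(a, lo, hi);
    int rank = p - lo + 1;                  // pivot's rank within a[lo..hi]
    if (rank == k) return a[p];
    if (k < rank)  return quickSelect(a, lo, p - 1, k);
    return quickSelect(a, p + 1, hi, k - rank);
}

int main() {
    int a[] = {4, 2, 7, 11, 5};
    printf("%d\n", quickSelect(a, 0, 4, 2)); // 2nd order statistic: 4
    return 0;
}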
Unit-2
Binary Search Tree
A binary Search Tree is a node-based binary tree data structure which has the following properties:
The left subtree of a node contains only nodes with keys lesser than the node’s key.
The right subtree of a node contains only nodes with keys greater than the node’s key.
The left and right subtree each must also be a binary search tree.
left subtree (keys) < node (key) ≤ right subtree (keys)
Representation: (figure of a binary search tree over the keys 3, 8, 9, 11, 12, 15 omitted)
Operations on BST
1. Searching
2. Insertion
3. Deletion
4. Traversal
1. Searching-
Searching in a BST involves comparison of key values. If the key equals the root's key, the search is
successful; if it is less than the root's key, search the key in the left subtree; and if the key is greater than
the root's key, search the key in the right subtree.
Searching in BST algorithm: -
Check if tree is NULL, if the tree is not NULL then follow the following steps.
Compare the key to be searched with the root of the BST.
If the key is lesser than the root then search in the left subtree.
If the key is greater than the root then search in the right subtree.
If the key is equal to root, then return and print search successful.
Repeat step 3, 4 or 5 for the obtained subtree.
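A minimal pointer-based C++ sketch of this search (the node layout and the small hand-built tree are our own illustration):

#include <cstdio>

struct Node {
    int key;
    Node *left, *right;
};

// Recursive BST search following the comparison rules above:
// equal -> found; smaller -> left subtree; greater -> right subtree.
// Cost is O(h), where h is the height of the tree.
Node* search(Node* root, int key) {
    if (root == nullptr || root->key == key) return root;
    if (key < root->key) return search(root->left, key);
    return search(root->right, key);
}

int main() {
    Node n3{3, nullptr, nullptr}, n9{9, nullptr, nullptr};
    Node n8{8, &n3, &n9};
    Node n12{12, nullptr, nullptr};
    Node root{10, &n8, &n12};
    printf("%s\n", search(&root, 9) ? "found" : "not found"); // found
    return 0;
}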
2. Insertion in a BST:
Insertion in a BST involves comparison of key values. If the key is less than or equal to the root's key,
go to the left subtree, find an empty position by following the search algorithm, and insert the data; if the key
is greater than the root's key, go to the right subtree, find an empty position by following the search algorithm, and
insert the data.
3. Deletion in a BST:
Deletion in BST involves three cases: First, search the key to be deleted using searching algorithm and find the
node. Then, find the number of children of the node to be deleted.
Case 1- If the node to be deleted is leaf node: If the node to be deleted is a leaf node, then
delete it.
Case 2- If the node to be deleted has one child: If the node to be deleted has one child, then
delete the node and place the child of the node at the position of the deleted node.
Case 3- If the node to be deleted has two children: find the inorder successor or inorder
predecessor of the node (whichever is the closest value to the node being deleted). Delete the
inorder successor or predecessor using the above cases, and then replace the node with that
inorder successor or predecessor.
4. Traversal in a BST
1. In order – visit left subtree, then node, then right subtree (yields the keys in sorted order: left subtree (keys) < node (key) ≤ right subtree (keys))
2. Pre order – visit node, then left subtree, then right subtree
3. Post order – visit left subtree, then right subtree, then node
Time Complexity: O(N)
Auxiliary Space: If we don’t consider the size of the stack for function calls then O(1) otherwise O(h) where
h is the height of the tree.
Note->
· The height of the skewed tree is n (no. of elements) so the worst space complexity is O(N) and the
height is (log N) for the balanced tree so the best space complexity is O(log N).
· When the frequency of insertion & deletion of nodes is very high, an AVL tree is used.
· When the tree can become skewed, a RED-BLACK tree is used.
Advantages of BST-
BST is fast in insertion and deletion when balanced.
BST is efficient.
We can also do range queries – find keys between N and M (N <= M).
BST code is simple as compared to other data structures.
Disadvantages of BST-
The main disadvantage is that we should always implement a balanced binary search tree.
Otherwise, the cost of operations may not be logarithmic and degenerate into a linear search on an
array.
Accessing the element in BST is slightly slower than array.
A BST can be imbalanced or degenerated which can increase the complexity.
AVL Trees
AVL tree is a self-balancing Binary Search Tree (BST) where the difference between heights of left and right
subtrees cannot be more than one for all nodes.
Example: (figure) an AVL tree with root 12, children 8 and 18, where 8 has children 5 and 11, and 18 has child 17.
The above tree is AVL because the differences between heights of left and right subtrees for every node are less
than or equal to 1.
(figure) Right Rotation / Left Rotation: for a node y with left child x, where x has subtrees T1 and T2 and y has
right subtree T3, a right rotation makes x the new root with y as its right child (T2 becomes y's left subtree);
a left rotation is the mirror image.
Possible arrangements :
1. Left Left case
2. Right Right case
3. Left Right case
4. Right Left case
Complexity Analysis
Time Complexity: O(log(n)), For Insertion
Space Complexity: O(1)
The rotation operations (left and right rotate) take constant time as only a few pointers are being changed there.
Updating the height and getting the balance factor also take constant time. So, the time complexity of AVL
insert remains the same as BST insert, which is O(h), where h is the height of the tree. Since the AVL tree is
balanced, the height is O(log n). So, the time complexity of AVL insert is O(log n).
Red-Black Tree
Node Structure: each node stores Color, Left, Key, Parent, Right.
Some examples: (figures of example trees, including one that is not an RB tree, omitted)
Note:
In case we have a red leaf, we need to attach NIL sentinels, which are always black.
Every path from a given node down to a leaf contains the same number of black nodes (the black height).
RB-INSERT (T, Z)
1. Y ← NIL [T]
2. X ← ROOT [T]
3. WHILE X ≠ NIL [T]
4. DO Y ← X
5. IF KEY [Z] < KEY [X]
6. THEN X ← LEFT [X]
7. ELSE X ← RIGHT [X]
8. P [Z] ← Y
9. IF Y = NIL [T]
10. THEN ROOT [T] ← Z
11. ELSE IF KEY [Z] < KEY [Y]
12. THEN LEFT [Y] ← Z
13. ELSE RIGHT [Y] ← Z
14. LEFT [Z] ← NIL [T]
15. RIGHT [Z] ← NIL [T]
16. COLOR [Z] ← RED
17. RB-INSERT-FIXUP (T, Z)
Example Show the red-black trees that result after successively inserting the keys 41,38,31,12 into an initially
empty red-black tree.
1. Insert 41:
   41
2. Insert 38:
      41
     /
   38
3. Insert 31 (left-left case: right-rotate around 41 and recolor):
   Before:          After:
       41              38
      /               /  \
    38              31    41
   /
  31
4. Insert 12 (parent 31 and uncle 41 are red: recolor):
   Before:          After:
       38               38
      /  \             /  \
    31    41         31    41
   /                /
  12              12
Applications of RB-Tree
1. Most of the self-balancing BST library functions like map, multiset, and multimap in C++ use
Red-Black Trees.
2. It is used to implement CPU scheduling in Linux; the Completely Fair Scheduler uses a red-black tree.
3. It is also used in the K-means clustering algorithm in machine learning for reducing time
complexity.
4. Moreover, MySQL also uses the Red-Black tree for indexes on tables in order to reduce the
searching and insertion time.
B Tree-
B-Tree is a self-balancing search tree. In most of the other self-balancing search trees it is assumed that
everything is in the main memory. B Tree is a specialized m-way tree that can be widely used for disk access. A
B-Tree of order m can have at most m-1 keys and m children. One of the main reasons of using B tree is its
capability to store large number of keys in a single node and large key values by keeping the height of the tree
relatively small.
A B tree of order m contains all the properties of an M way tree. In addition, it contains the following properties.
1. Every node in a B-Tree contains at most m children.
2. Every node in a B-Tree except the root node and the leaf nodes contains at least ⌈m/2⌉ children.
3. The root node must have at least 2 children (unless it is a leaf).
4. All leaf nodes must be at the same level.
Operations on B-tree:
1.Searching
Searching in B-Trees is similar to that in a binary search tree. For example, suppose we search for an item 49 in the
following B-Tree (figure omitted). The process will look something like the following:
1. Compare item 49 with root node 78. Since 49 < 78, move to its left sub-tree.
2. Since 40 < 49 < 56, traverse the right sub-tree of 40.
3. Since 49 > 45, move to the right and compare with 49.
4. Match found; return.
Searching in a B tree depends upon the height of the tree. The search algorithm takes O(log n) time to search any
element in a B tree.
2.Insertion
Example Insert 20, 12, 50, 60, 18, 65,70, 11, 10.
1. 20
2. 12 20
3. 12 20 50
4. 20
12 50 60
5. 20
12 18 50 60
6.
20 60
12 18 50 65 70
7.
20 60
11 12 18 50 65 70
8.
12 20 60
10 11 18 65 70
50
3.Deletion
Example: successively delete 32, then 31, then 30 (the accompanying B-tree figures are omitted).
Complexity Analysis
For every operation it is O(log n).
Applications of B-Trees:
It is used in large databases to access data stored on the disk
Searching for data in a data set can be achieved in significantly less time using the B-Tree
With the indexing feature, multilevel indexing can be achieved.
Most of the servers also use the B-tree approach.
Binomial Heap
Binomial heap is a collection of binomial trees, each of which also satisfies the min-heap property.
Binomial Tree: A binomial tree of order 0 has 1 node. A binomial tree of order k can be constructed by
taking two binomial trees of order k-1 and making one the leftmost child of the other.
Properties of a Binomial Tree of order k
It has exactly 2^k nodes.
It has depth k.
There are exactly C(k, i) nodes at depth i for i = 0, 1, . . ., k.
The root has degree k, and the children of the root are themselves binomial trees of order k-1, k-2, .
. ., 0 from left to right.
Examples
12------------10--------------------20
/ \ / |\
15 50 70 50 40
| /| |
30 80 85 65
|
100
A Binomial Heap with 13 nodes. It is a collection of 3 Binomial Trees of orders 0, 2, and 3 from left to right.
1. Insertion
Inserting an element in the heap can be done by simply creating a new heap only with the element to be inserted,
and then merging it with the original heap. Due to the merging, the single insertion in a heap takes O(log(n)) time.
12------------7--------------------15
/ / |
25 28 33
/
41
In the above heap, three binomial trees of degrees 0, 1, and 2 are given, where B0 is attached to the head
of the heap. First, we have to combine both of the heaps. As both node 12 and node 15 are of degree 0, node 15
is attached to node 12 as shown below –
12------15-------7--------------------15
/ / |
25 28 33
/
41
Now, assign x to B0 with value 12, next(x) to B0 with value 15, and assign sibling(next(x)) to B1 with value 7. As
the degree of x and next(x) is equal. The key value of x is smaller than the key value of next(x), so next(x) is
removed and attached to the x.
----- 12------------7--------------------15
/ / / |
15 25 28 33
/
41
Now, x points to node 12 with degree B1, next(x) to node 7 with degree B1, and sibling(next(x)) points to node 15
with degree B2. The degree of x is equal to the degree of next(x) but not equal to the degree of sibling(next(x)).
The key value of x is greater than the key value of next(x); therefore, x is removed and attached to the next(x).
----7--------------------15
/| / |
12 25 28 33
/ /
15 41
Now, x points to node 7, and next(x) points to node 15. The degree of both x and next(x) is B2, and the key value
of x is less than the key value of next(x), so next(x) will be removed and attached to x.
---- 7
/| \
15 12 25
/ | |
28 33 15
|
41
2. Deletion
To delete a node from the heap, first, we have to decrease its key to negative infinity (or -∞) and then delete the
minimum in the heap. Now we will see how to delete a node with the help of an example. Consider the below
heap, and suppose we have to delete the node 41 from the heap –
12------------7--------------------15
/ / |
25 28 33
/
41
First, replace the node with negative infinity (or -∞) as shown below –
12------------7--------------------15
/ / |
25 28 33
/
-∞
Now, swap the negative infinity with its root node in order to maintain the min-heap property.
12------------7--------------------15
/ / |
25 -∞ 33
/
28
Now, again swap the negative infinity with its root node
12------------7------------------- -∞
/ / |
25 15 33
/
28
The next step is to extract the minimum key from the heap. Since the minimum key in the above heap is -infinity
so we will extract this key, and the heap would be:
12------------7------------------- 15 ----- 33
/ /
25 28
12------------7------------------- 15
/ / /
33 25 28
---12------------------ 7
/ / |
33 15 25
/
28
The above is the final heap after deleting the node 41.
Fibonacci Heap-
A Fibonacci heap is a data structure that consists of a collection of trees which follow min heap or max
heap property. These two properties are the characteristics of the trees present on a Fibonacci heap. In a Fibonacci
heap, a node can have more than two children or no children at all. Also, it has more efficient heap operations
than those supported by the binomial and binary heaps. The Fibonacci heap is called a Fibonacci heap because
the trees are constructed in such a way that a tree of order n has at least F(n+2) nodes in it, where F(n+2) is the
(n+2)th Fibonacci number.
1. Insertion — the new node is simply spliced into the root list of trees, and the minimum pointer is updated if
the new key is smaller; no restructuring is done at insert time.
Complexity Analysis
· Insertion: O(1)
· Deletion: O(log n)
Amortized Analysis
Amortized Analysis is used for algorithms where an occasional operation is very slow, but most of the other
operations are faster. In Amortized Analysis, we analyze a sequence of operations and guarantee a worst-case
average time that is lower than the worst-case time of a particularly expensive operation.
The example data structures whose operations are analyzed using Amortized Analysis are Hash Tables, Disjoint
Sets, and Splay Trees.
For a sequence of n operations, the amortized cost is
Cost(n operations)/n = (Cost(normal operations) + Cost(expensive operations))/n
(figure) A dynamic table shown holding 1 2 3 4 5 6 after "Insert Item 6" and 1 2 3 4 5 6 7 after "Insert Item 7".
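The classic concrete case is a dynamic table that doubles its capacity when full: one occasional insertion costs O(size) for the copy, yet any sequence of n insertions does at most n + n/2 + n/4 + ... < 2n copy steps in total, so the amortized cost per insertion is O(1). A sketch (our own code, with hypothetical names):

#include <cstdio>
#include <cstdlib>

// A dynamic table that doubles when full. Most pushes cost O(1);
// a rare push pays O(size) to copy into a bigger buffer.
struct DynTable {
    int *data = nullptr;
    int size = 0, capacity = 0;
    void push(int x) {
        if (size == capacity) {                 // expensive (rare) case
            capacity = capacity ? 2 * capacity : 1;
            int *bigger = (int*)std::malloc(capacity * sizeof(int));
            for (int i = 0; i < size; i++) bigger[i] = data[i];
            std::free(data);
            data = bigger;
        }
        data[size++] = x;                       // cheap (common) case
    }
};

int main() {
    DynTable t;
    for (int i = 1; i <= 7; i++) t.push(i);     // copies happen at sizes 1, 2, 4
    for (int i = 0; i < t.size; i++) printf("%d ", t.data[i]); // 1 2 3 4 5 6 7
    std::free(t.data);
    return 0;
}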
Unit-3
Divide & Conquer
The following are some standard algorithms that follow the divide and conquer paradigm.
1. Quicksort is a sorting algorithm. The algorithm picks a pivot element and rearranges the array
elements so that all elements smaller than the picked pivot element move to the left side of the pivot,
and all greater elements move to the right side. Finally, the algorithm recursively sorts the subarrays
on the left and right of the pivot element.
2. Merge Sort is also a sorting algorithm. The algorithm divides the array into two halves, recursively
sorts them, and finally merges the two sorted halves.
3. Closest Pair of Points The problem is to find the closest pair of points in a set of points in the x-y
plane. The problem can be solved in O(n^2) time by calculating the distances of every pair of points
and comparing the distances to find the minimum. The Divide and Conquer algorithm solves the
problem in O(N log N) time.
4. Strassen’s Algorithm is an efficient algorithm to multiply two matrices. A simple method to
multiply two matrices needs 3 nested loops and is O(n^3). Strassen's algorithm multiplies two
matrices in O(n^2.8074) time.
5. Cooley–Tukey Fast Fourier Transform (FFT) algorithm is the most common algorithm for FFT.
It is a divide and conquer algorithm which works in O(N log N) time.
6. Karatsuba algorithm for fast multiplication multiplies two n-digit numbers in at most
n^(log₂3) ≈ n^1.585
single-digit multiplications in general (and exactly that many when n is a power of 2). It is, therefore, faster
than the classical algorithm, which requires n² single-digit products. If n = 2^10 = 1024, in particular,
the exact counts are 3^10 = 59,049 and (2^10)^2 = 1,048,576, respectively.
Example 1:
To find the maximum and minimum element in a given array.
Approach: Finding the maximum and minimum elements of a given array is an application of divide
and conquer. In this problem, we use the divide and conquer approach (DAC), which has three steps:
divide, conquer, and combine.
For Maximum:
We use a recursive approach: the recursion narrows the range until only two elements are left, at which
point a single comparison decides the answer, i.e. if (a[index] > a[index+1]).
Algorithm
if (index >= l - 2)
{
    if (a[index] > a[index + 1])
    {
        // a[index] is the larger of the two remaining elements,
        // so it is the maximum of this range
    }
    else
    {
        // a[index + 1] is the larger of the two remaining elements,
        // so it is the maximum of this range
    }
}
In the above condition, we have handled the base case on the last two elements. Otherwise, a recursive
call checks the right side at the current index: the maximum of the range is the larger of a[index] and the
maximum computed for the rest of the array.
For Minimum:
The same recursive structure finds the minimum: in the base case the smaller of the two remaining elements
is taken, and otherwise the result is the smaller of a[index] and the minimum of the rest of the array. A
complete sketch follows.
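Here is a complete runnable sketch of the DAC maximum/minimum described above (our own code; it uses an inclusive range [lo, hi] instead of the index/l variables of the fragment):

#include <cstdio>

// Divide-and-conquer maximum: with one or two elements left, compare
// directly; otherwise take the larger of the two halves' maxima.
int dacMax(const int a[], int lo, int hi) {
    if (hi - lo <= 1)
        return (a[lo] > a[hi]) ? a[lo] : a[hi];
    int mid = (lo + hi) / 2;                 // divide
    int lmax = dacMax(a, lo, mid);           // conquer left half
    int rmax = dacMax(a, mid + 1, hi);       // conquer right half
    return (lmax > rmax) ? lmax : rmax;      // combine
}

// Symmetric divide-and-conquer minimum.
int dacMin(const int a[], int lo, int hi) {
    if (hi - lo <= 1)
        return (a[lo] < a[hi]) ? a[lo] : a[hi];
    int mid = (lo + hi) / 2;
    int lmin = dacMin(a, lo, mid);
    int rmin = dacMin(a, mid + 1, hi);
    return (lmin < rmin) ? lmin : rmin;
}

int main() {
    int a[] = {1, 24, 145, 20, 8, 64, 32};
    printf("max=%d min=%d\n", dacMax(a, 0, 6), dacMin(a, 0, 6)); // max=145 min=1
    return 0;
}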
Example 2:
Problem: We are given an array of n points in the plane, and the problem is to find out the closest pair
of points in the array. This problem arises in a number of applications. For example, in air-traffic control,
you may want to monitor planes that come too close together, since this may indicate a possible collision.
Recall the formula for the distance between two points p and q: dist(p, q) = sqrt((p.x − q.x)² + (p.y − q.y)²).
Algorithm
Following are the detailed steps of an O(n (log n)²) algorithm.
Input: An array of n points P[]
Output: The smallest distance between two points in the given array.
As a pre-processing step, the input array is sorted according to x coordinates.
1) Find the middle point in the sorted array; we can take P[n/2] as the middle point.
2) Divide the given array in two halves. The first subarray contains points from P[0] to P[n/2]. The
second subarray contains points from P[n/2+1] to P[n-1].
3) Recursively find the smallest distances in both subarrays. Let the distances be dl and dr. Find the
minimum of dl and dr. Let the minimum be d.
4) From the above 3 steps, we have an upper bound d on the minimum distance. Now we need to consider the
pairs such that one point of the pair is from the left half and the other is from the right half. Consider the
vertical line passing through P[n/2] and find all points whose x coordinate is closer than d to the middle
vertical line. Build an array strip[] of all such points.
5) Sort the array strip[] according to y coordinates. This step is O(n log n). It can be optimized to O(n) by
recursively sorting and merging.
6) Find the smallest distance in strip[]. This is tricky. At first glance it seems to be an O(n²) step, but it is
actually O(n). It can be proved geometrically that for every point in the strip, we only need to check
at most 7 points after it (note that the strip is sorted according to y coordinate).
7) Finally, return the minimum of d and the distance calculated in the above step (step 6).
Example 3:
Karatsuba algorithm for fast multiplication using Divide and Conquer algorithm
Given two binary strings that represent value of two integers, find the product of two strings. For
example, if the first bit string is “1100” and second bit string is “1010”, output should be 120.
For simplicity, let the length of two strings be same and be n.
A Naive Approach is to follow the process we study in school. One by one take all bits of second
number and multiply it with all bits of first number. Finally add all multiplications. This algorithm takes
O(n^2) time.
Using Divide and Conquer, we can multiply two integers with less time complexity. We divide the given
numbers into two halves. Let the given numbers be X and Y. For simplicity, let us assume that n is even and
write X = Xl·2^(n/2) + Xr and Y = Yl·2^(n/2) + Yr, where Xl, Yl are the left halves (the most significant n/2
bits) and Xr, Yr are the right halves. Then:
XY = 2^(2⌈n/2⌉)·XlYl + 2^(⌈n/2⌉)·[(Xl + Xr)(Yl + Yr) − XlYl − XrYr] + XrYr
The above algorithm is called the Karatsuba algorithm, and it can be used with any base.
Dynamic Programming-
1. Tabulation: Bottom Up
2. Memoization: Top Down
Tabulation Method – Bottom-Up Dynamic Programming
Let us write the code for the factorial problem in the bottom-up fashion:
// Tabulated version to find factorial of n
int dp[MAXN];

// base case
dp[0] = 1;
for (int i = 1; i <= n; i++)
{
    dp[i] = dp[i - 1] * i;
}
The above code clearly follows the bottom-up approach as it starts its transition from the bottom-most
base case dp[0] and reaches its destination state dp[n]. Here, we may notice that the DP table is being
populated sequentially and we are directly accessing the calculated states from the table itself and hence,
we call it the tabulation method.
Memoization Method – Top-Down Dynamic Programming
Once, again let’s describe it in terms of state transition. If we need to find the value for some state say
dp[n] and instead of starting from the base state that i.e dp[0] we ask our answer from the states that can
reach the destination state dp[n] following the state transition relation, then it is the top-down fashion of
DP.
Here, we start our journey from the top most destination state and compute its answer by taking in count
the values of states that can reach the destination state, till we reach the bottom-most base state.
Once again, let’s write the code for the factorial problem in the top-down fashion
// Memoized version to find factorial x.
// To speed up, we store the values of calculated states.
int dp[MAXN]; // all entries initialized to -1

// return x!
int solve(int x)
{
    if (x == 0)
        return 1;
    if (dp[x] != -1)
        return dp[x];
    return (dp[x] = x * solve(x - 1));
}
As we can see we are storing the most recent cache up to a limit so that if next time we got a call from the
same state we simply return it from the memory. So, this is why we call it memoization as we are storing
the most recent state values.
In this case, the memory layout is linear that’s why it may seem that the memory is being filled in a
sequential manner like the tabulation method, but you may consider any other top-down DP having 2D
memory layout like Min Cost Path, here the memory is not filled in a sequential manner.
To dynamically solve a problem, we need to check two necessary
conditions:
1. Overlapping Subproblems: When the solutions to the same subproblems are needed repetitively for
solving the actual problem. The problem is said to have overlapping subproblems property.
2. Optimal Substructure Property: If the optimal solution of the given problem can be obtained by using
optimal solutions of its subproblems then the problem is said to have Optimal Substructure Property.
● Typically, all the problems that require maximizing or minimizing certain quantities or counting problems
that say to count the arrangements under certain conditions or certain probability problems can be solved by
using Dynamic Programming.
● All dynamic programming problems satisfy the overlapping subproblems property and most of the classic
Dynamic programming problems also satisfy the optimal substructure property. Once we observe these
properties in a given problem be sure that it can be solved using Dynamic Programming.
Dynamic Programming problems are all about the state and its transition. This is the most basic step
which must be done very carefully because the state transition depends on the choice of state definition
you make.
State:
A state can be defined as the set of parameters that can uniquely identify a certain position or standing in
the given problem. This set of parameters should be as small as possible to reduce state space.
Example:
In our famous Knapsack problem, we define our state by two parameters index and weight i.e
DP[index][weight]. Here DP[index][weight] tells us the maximum profit it can make by taking
items from range 0 to index having the capacity of sack to be weight. Therefore, here the
parameters index and weight together can uniquely identify a subproblem for the knapsack
problem.
The first step to solving a Dynamic Programming problem will be deciding on a state for the
problem after identifying that the problem is a Dynamic Programming problem. As we know
Dynamic Programming is all about using calculated results to formulate the final result.
So, our next step will be to find a relation between previous states to reach the current
state.
This part is the hardest part of solving a Dynamic Programming problem and requires a lot of intuition,
observation, and practice.
Example:
Given 3 numbers {1, 3, 5}, The task is to tell the total number of ways we can form a number N using the
sum of the given three numbers. (allowing repetitions and different arrangements).
The total number of ways to form 6 is: 8
1+1+1+1+1+1
1+1+1+3
1+1+3+1
1+3+1+1
3+1+1+1
3+3
1+5
5+1
The steps to solve the given problem will be:
● We decide a state for the given problem.
● We will take a parameter N to decide the state as it uniquely identifies any subproblem.
● DP state will look like state(N), state(N) means the total number of arrangements to form N by using
{1, 3, 5} as elements. Derive a transition relation between any two states.
● Now, we need to compute state(N).
How to Compute the state?
As we can only use 1, 3, or 5 to form a given number N. Let us assume that we know the result for N =
1,2,3,4,5,6
Let us say we know the result for:
state (n = 1), state (n = 2), state (n = 3) ……… state (n = 6)
Now, we wish to know the result of the state (n = 7). See, we can only add 1, 3, and 5. Now we can get a
sum total of 7 in the following 3 ways:
1) Adding 1 to all possible combinations of state (n = 6)
Eg : [ (1+1+1+1+1+1) + 1]
[ (1+1+1+3) + 1]
[ (1+1+3+1) + 1]
[ (1+3+1+1) + 1]
[ (3+1+1+1) + 1]
[ (3+3) + 1]
[ (1+5) + 1]
[ (5+1) + 1]
2) Adding 3 to all possible combinations of state (n = 4);
[(1+1+1+1) + 3]
[(1+3) + 3]
[(3+1) + 3]
3) Adding 5 to all possible combinations of state(n = 2)
[ (1+1) + 5]
(Note how it is sufficient to add only on the right side – all the add-from-the-left-side cases are covered,
either in the same state or in another; e.g. [1 + (1+1+1+3)] is not needed in state (n = 6) because it is covered
by state (n = 4) as [(1+1+1+1) + 3].)
Now, think carefully and satisfy yourself that the above three cases are covering all possible ways to
form a sum total of 7;
Therefore, we can say that result for
state(7) = state (6) + state (4) + state (2)
OR
state(7) = state (7-1) + state (7-3) + state (7-5)
In general,
state(n) = state(n-1) + state(n-3) + state(n-5)
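This transition translates directly into bottom-up code (our sketch; the fixed array bound is illustrative):

#include <cstdio>

// Count the arrangements that sum to n using 1, 3 and 5:
// state(n) = state(n-1) + state(n-3) + state(n-5), state(0) = 1.
int countWays(int n) {
    int state[64] = {0};     // assumes n < 64 in this sketch
    state[0] = 1;            // one way to form 0: the empty sum
    for (int i = 1; i <= n; i++) {
        state[i] = state[i - 1];
        if (i >= 3) state[i] += state[i - 3];
        if (i >= 5) state[i] += state[i - 5];
    }
    return state[n];
}

int main() {
    printf("%d\n", countWays(6)); // prints 8, matching the listing above
    return 0;
}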
Another classic DP example is the ugly-number sequence (numbers whose only prime factors are 2, 3, and 5).
Here is a time efficient solution with O(n) extra space. The ugly-number sequence is 1, 2, 3, 4, 5, 6, 8, 9,
10, 12, 15, … Because every ugly number is divisible only by 2, 3, and 5, one way to look at the sequence is to
split the sequence into three groups as below:
(1) 1×2, 2×2, 3×2, 4×2, 5×2, …
(2) 1×3, 2×3, 3×3, 4×3, 5×3, …
(3) 1×5, 2×5, 3×5, 4×5, 5×5, …
We can see that every subsequence is the ugly sequence itself (1, 2, 3, 4, 5, …) multiplied by 2, 3, and 5
respectively. We then use a merge method similar to the one in merge sort to get every ugly number from the
three subsequences: at every step we choose the smallest candidate and advance one step in the subsequence it
came from.
Algorithm
1 Declare an array for ugly numbers: ugly[n]
2 Initialize first ugly no: ugly[0] = 1
3 Initialize three array index variables i2, i3, i5 to point to
1st element of the ugly array:
i2 = i3 = i5 =0;
4 Initialize 3 choices for the next ugly no:
next_multiple_of_2 = ugly[i2]*2;
next_multiple_of_3 = ugly[i3]*3
next_multiple_of_5 = ugly[i5]*5;
5 Now go in a loop to fill all ugly numbers till 150:
For (i = 1; i < 150; i++ )
{
next_ugly_no = Min(next_multiple_of_2,
next_multiple_of_3,
next_multiple_of_5);
ugly[i] = next_ugly_no
if (next_ugly_no == next_multiple_of_2)
{
i2 = i2 + 1;
next_multiple_of_2 = ugly[i2]*2;
}
if (next_ugly_no == next_multiple_of_3)
{
i3 = i3 + 1;
next_multiple_of_3 = ugly[i3]*3;
}
if (next_ugly_no == next_multiple_of_5)
{
i5 = i5 + 1;
next_multiple_of_5 = ugly[i5]*5;
}
}/* end of for loop */
6. Return next_ugly_no
A binomial coefficient C(n, k) can be defined as the coefficient of x^k in the expansion of (1 + x)^n.
A binomial coefficient C(n, k) also gives the number of ways, disregarding order, that k objects can be
chosen from among n objects more formally, the number of k-element subsets (or k-combinations) of a n-
element set.
The Problem
Write a function that takes two parameters n and k and returns the value of Binomial Coefficient C(n,
k). For example, your function should return 6 for n = 4 and k = 2, and it should return 10 for n = 5 and k = 2.
Memoization Approach: The idea is to create a lookup table and follow the recursive top-down
approach. Before computing any value, we check if it is already in the lookup table. If yes, we return the
value. Else we compute the value and store it in the lookup table. Following is the Top-down approach of
dynamic programming to finding the value of the Binomial Coefficient.
int binomialCoeffUtil(int n, int k, int** dp)
{
if (dp[n][k] != -1)
return dp[n][k];
if (k == 0) {
dp[n][k] = 1;
return dp[n][k];
}
if (k == n) {
dp[n][k] = 1;
return dp[n][k];
}
dp[n][k] = binomialCoeffUtil(n - 1, k - 1, dp) +
binomialCoeffUtil(n - 1, k, dp);
return dp[n][k];
}
int binomialCoeff(int n, int k)
{
int** dp;
dp = new int*[n + 1];
for (int i = 0; i < (n + 1); i++) {
dp[i] = new int[k + 1];
}
for (int i = 0; i < (n + 1); i++) {
for (int j = 0; j < (k + 1); j++) {
dp[i][j] = -1;
}
}
return binomialCoeffUtil(n, k, dp);
}
Greedy Algorithms-
Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece that
offers the most obvious and immediate benefit. So the problems where choosing locally optimal also leads to
global solution are the best fit for Greedy. A greedy algorithm is any algorithm that follows the problem-
solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not
produce an optimal solution, but a greedy heuristic can yield locally optimal solutions that approximate a globally
optimal solution in a reasonable amount of time.
For example consider the Fractional Knapsack Problem. The local optimal strategy is to choose the item that has
maximum value vs weight ratio. This strategy also leads to a globally optimal solution because we are allowed to
take fractions of an item.
Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input
characters, lengths of the assigned codes are based on the frequencies of corresponding characters. The most
frequent character gets the smallest code and the least frequent character gets the largest code.
The variable-length codes assigned to input characters are Prefix Codes, means the codes (bit sequences) are
assigned in such a way that the code assigned to one character is not the prefix of code assigned to any other
character. This is how Huffman Coding makes sure that there is no ambiguity when decoding the generated
bitstream.
Let us understand prefix codes with a counter example. Let there be four characters a, b, c and d, and their
corresponding variable length codes be 00, 01, 0 and 1. This coding leads to ambiguity because code assigned to c
is the prefix of codes assigned to a and b. If the compressed bit stream is 0001, the de-compressed output may be
“cccd” or “ccb” or “acd” or “ab”.
There are mainly two major parts in Huffman Coding
1. Build a Huffman Tree from input characters.
2. Traverse the Huffman Tree and assign codes to characters.
Steps to build Huffman Tree
Input is an array of unique characters along with their frequency of occurrences and output is Huffman Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes (Min Heap is used as a
priority queue. The value of frequency field is used to compare two nodes in min heap. Initially, the least
frequent character is at root)
2. Extract two nodes with the minimum frequency from the min heap.
3. Create a new internal node with a frequency equal to the sum of the two nodes' frequencies. Make the first
extracted node its left child and the other extracted node its right child. Add this node to the min
heap.
4. Repeat steps#2 and #3 until the heap contains only one node. The remaining node is the root node and the
tree is complete.
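Before the worked example, here is a compact C++ sketch of steps 1-4 using std::priority_queue as the min heap (our own illustration; tree nodes are deliberately leaked for brevity):

#include <cstdio>
#include <queue>
#include <string>
#include <vector>

// A Huffman tree node: leaves hold characters; internal nodes hold
// only the combined frequency (marked with '$' here).
struct HNode {
    char ch;
    int freq;
    HNode *left, *right;
};

struct Cmp { // min-heap ordering by frequency
    bool operator()(const HNode* a, const HNode* b) const { return a->freq > b->freq; }
};

// Walk the finished tree: left edge = '0', right edge = '1'.
void printCodes(const HNode* n, const std::string& code) {
    if (!n) return;
    if (!n->left && !n->right) { printf("%c: %s\n", n->ch, code.c_str()); return; }
    printCodes(n->left, code + "0");
    printCodes(n->right, code + "1");
}

int main() {
    char chars[] = {'a', 'b', 'c', 'd', 'e', 'f'};
    int freqs[]  = {5, 9, 12, 13, 16, 45};
    std::priority_queue<HNode*, std::vector<HNode*>, Cmp> pq;
    for (int i = 0; i < 6; i++)
        pq.push(new HNode{chars[i], freqs[i], nullptr, nullptr}); // step 1
    while (pq.size() > 1) {                  // steps 2-4: merge the two minima
        HNode* l = pq.top(); pq.pop();
        HNode* r = pq.top(); pq.pop();
        pq.push(new HNode{'$', l->freq + r->freq, l, r});
    }
    printCodes(pq.top(), ""); // the most frequent character, f, gets the shortest code
    return 0;
}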
Let us understand the algorithm with an example:
character Frequency
a 5
b 9
c 12
d 13
e 16
f 45
Step 1. Build a min heap that contains 6 nodes where each node represents root of a tree with single node.
Step 2 Extract two minimum frequency nodes from min heap. Add a new internal node with frequency 5 + 9
= 14.
Now min heap contains 5 nodes where 4 nodes are roots of trees with single element each, and one heap
node is root of tree with 3 elements
character Frequency
c 12
d 13
Internal Node 14
e 16
f 45
Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with frequency 12 +
13 = 25
Now min heap contains 4 nodes where 2 nodes are roots of trees with single element each, and two heap nodes
are root of tree with more than one nodes
character Frequency
Internal Node 14
e 16
Internal Node 25
f 45
Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14 + 16 = 30
Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency 25 + 30 = 55.
Now min heap contains 2 nodes.
character Frequency
f 45
Internal Node 55
Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100
Time complexity: O(nlogn) where n is the number of unique characters. If there are n nodes, extractMin() is
called 2*(n – 1) times. extractMin() takes O(logn) time as it calls minHeapify(). So, overall complexity is
O(nlogn).
If the input array is sorted, there exists a linear time algorithm.
Applications of Huffman Coding:
1. They are used for transmitting fax and text.
2. They are used by conventional compression formats like PKZIP, GZIP, etc.
3. Multimedia codecs like JPEG, PNG, and MP3 use Huffman encoding(to be more precise the prefix
codes).
It is useful in cases where there is a series of frequently occurring characters.
Policemen catch thieves (a greedy example): given an array where each element is either a policeman ('P') or a
thief ('T'), and each policeman can catch at most one thief within k units of his position, maximize the number
of thieves caught.
Approach:
This approach takes the following steps:
1. First find the left most police and thief and store the indices. There can be two cases:
2. CASE 1: If the distance between the police and thief <= k (given), the thief can be caught, so increment
the res counter
3. CASE 2: If the distance between the police and thief > k, the current thief cannot be caught by the
current police
1. For CASE 2, if the police is behind the thief, we need to find the next police and check if it can catch
the current thief
2. if the thief is behind the police, we need to find the next thief and check if the current police can catch
the thief
4. Repeat the process with the next police and thief pair, incrementing the result counter whenever
CASE 1 applies.
Algorithm:
1. Initialize the current lowest indices of policeman in pol and thief in thi variables as -1.
2. Find the lowest index of policeman and thief.
3. If the lowest index of either policeman or thief remains -1, then return 0.
4. If |pol – thi| <= k then make an allotment and find the next policeman and thief.
5. Else increment min(pol, thi) to the next policeman or thief found.
6. Repeat the above two steps while a next policeman and thief can be found.
7. Return the number of allotments made.
Time Complexity: O(N)
Auxiliary Space: O(1)
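A short C++ sketch of this algorithm (our own code; it collects the two index lists up front, which is an equivalent formulation of "find the next policeman/thief"):

#include <cstdio>
#include <cstdlib>
#include <vector>

// Greedy matching: always consider the leftmost unmatched
// policeman/thief pair; catch if within k, else advance the
// smaller index.
int policeThief(const char arr[], int n, int k) {
    std::vector<int> pol, thi;
    for (int i = 0; i < n; i++)
        (arr[i] == 'P' ? pol : thi).push_back(i);
    int i = 0, j = 0, res = 0;
    while (i < (int)pol.size() && j < (int)thi.size()) {
        if (std::abs(pol[i] - thi[j]) <= k) { res++; i++; j++; } // CASE 1: catch
        else if (pol[i] < thi[j]) i++;   // CASE 2: police behind, take next police
        else j++;                        // CASE 2: thief behind, take next thief
    }
    return res;
}

int main() {
    char arr[] = {'P', 'T', 'T', 'P', 'T'};
    printf("%d\n", policeThief(arr, 5, 1)); // 2 thieves can be caught
    return 0;
}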
Back-Tracking-
Backtracking can be defined as a general algorithmic technique that considers searching every possible
combination in order to solve a computational problem.
There are three types of problems in backtracking –
1. Decision Problem – In this, we search for a feasible solution.
2. Optimization Problem – In this, we search for the best solution.
3. Enumeration Problem – In this, we find all feasible solutions.
How to determine if a problem can be solved using Backtracking?
Generally, every constraint satisfaction problem which has clear and well-defined constraints on any objective
solution, that incrementally builds candidate to the solution and abandons a candidate (“backtracks”) as soon as it
determines that the candidate cannot possibly be completed to a valid solution, can be solved by Backtracking.
However, most of the problems that are discussed, can be solved using other known algorithms like Dynamic
Programming or Greedy Algorithms in logarithmic, linear, linear-logarithmic time complexity in order of input
size, and therefore, outshine the backtracking algorithm in every respect (since backtracking algorithms are
generally exponential in both time and space). However, a few problems still remain, that only have backtracking
algorithms to solve them until now.
Consider a situation that you have three boxes in front of you and only one of them has a gold coin in it but you do
not know which one. So, in order to get the coin, you will have to open all of the boxes one by one. You will first
check the first box, if it does not contain the coin, you will have to close it and check the second box and so on
until you find the coin. This is what backtracking is, that is solving all sub-problems one by one in order to reach
the best possible solution.
Given an instance of any computational problem P and data D corresponding to the instance, all the constraints
that need to be satisfied in order to solve the problem are represented by C. A backtracking algorithm will then
work as follows:
The algorithm begins to build up a solution, starting with an empty solution set S. S = { }
1. Add to S the first move that is still left (all possible moves are added to S one by one). This now creates a
new sub-tree s in the search tree of the algorithm.
2. Check if S+s satisfies each of the constraints in C.
● If yes, then the sub-tree s is "eligible" to add more "children".
● Else, the entire sub-tree s is useless, so recurse back to step 1 using argument S.
3. In the event of "eligibility" of the newly formed sub-tree s, recurse back to step 1, using argument S+s.
4. If the check for S+s returns that it is a solution for the entire data D, output it and terminate the program.
If not, then return that no solution is possible with the current s and hence discard it.
Given an N×N board with a knight placed on the first block of an empty board: moving according to the
rules of chess, the knight must visit each square exactly once. Print the order in which the cells are
visited.
Backtracking Algorithm for Knight’s tour
Following is the Backtracking algorithm for Knight’s tour problem.
If all squares are visited
print the solution
Else
a) Add one of the next moves to solution vector and recursively
check if this move leads to a solution. (A Knight can make maximum
eight moves. We choose one of the 8 moves in this step).
b) If the move chosen in the above step doesn't lead to a solution
then remove this move from the solution vector and try other
alternative moves.
c) If none of the alternatives work then return false (Returning false
will remove the previously added item in recursion and if false is
returned by the initial call of recursion then "no solution exists" )
Following are implementations for Knight’s tour problem. It prints one of the possible solutions in 2D
matrix form. Basically, the output is a 2D 8*8 matrix with numbers from 0 to 63 and these numbers show
steps made by Knight.
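A sketch of such an implementation (a standard backtracking version of ours, not necessarily the notes' original; the move ordering affects how quickly a tour is found):

#include <cstdio>

const int N = 8;
int board[N][N];
// the 8 possible knight moves
const int dx[8] = {2, 1, -1, -2, -2, -1, 1, 2};
const int dy[8] = {1, 2, 2, 1, -1, -2, -2, -1};

bool safe(int x, int y) {
    return x >= 0 && x < N && y >= 0 && y < N && board[x][y] == -1;
}

// Try each of the 8 moves from (x, y); undo the move (backtrack) on failure.
bool solve(int x, int y, int step) {
    if (step == N * N) return true;          // all squares visited
    for (int m = 0; m < 8; m++) {
        int nx = x + dx[m], ny = y + dy[m];
        if (safe(nx, ny)) {
            board[nx][ny] = step;            // choose this move
            if (solve(nx, ny, step + 1)) return true;
            board[nx][ny] = -1;              // un-choose (backtrack)
        }
    }
    return false;
}

int main() {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) board[i][j] = -1;
    board[0][0] = 0;                         // knight starts on the first block
    if (solve(0, 0, 1))
        for (int i = 0; i < N; i++, printf("\n"))
            for (int j = 0; j < N; j++) printf("%3d", board[i][j]);
    else
        printf("no solution exists\n");
    return 0;
}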
Backtracking vs. Branch-and-Bound:
· Traversal: Backtracking traverses the state space tree by DFS (depth first search); Branch-and-Bound may
traverse the tree in any manner, DFS or BFS.
· Applications: Backtracking is useful in solving the N-Queen problem and Sum of Subsets; Branch-and-Bound
is useful in solving the Knapsack problem and the Travelling Salesman problem.
· Scope: Backtracking can be applied to almost any problem (chess, sudoku, etc.); Branch-and-Bound cannot,
as it is mainly suited to optimization problems.
Time Complexity: The worst case complexity of Branch and Bound remains the same as that of brute force,
because in the worst case we may never get a chance to prune a node. In practice, however, it performs very
well depending on the particular instance of the TSP. The complexity also depends on the choice of the bounding
function, as that is what decides how many nodes get pruned.
0-1 Knapsack Problem-
Given weights and values of n items, put these items in a knapsack of capacity W to get the maximum total value
in the knapsack. In other words, given two integer arrays val[0..n-1] and wt[0..n-1] which represent values and
weights associated with n items respectively. Also given an integer W which represents knapsack capacity, find
out the maximum value subset of val[] such that sum of the weights of this subset is smaller than or equal to W.
You cannot break an item, either pick the complete item or don’t pick it (0-1 property).
1. A Greedy approach is to pick the items in decreasing order of value per unit weight. The Greedy approach
works only for fractional knapsack problem and may not produce correct result for 0/1 knapsack.
2. We can use Dynamic Programming (DP) for the 0/1 Knapsack problem (see the sketch after this list). In DP,
we use a 2D table of size n x W. The DP solution doesn't work if item weights are not integers.
3. Since the DP solution doesn't always work, another option is Brute Force. With n items, there are 2^n
candidate solutions to be generated; check each to see if it satisfies the constraint, and save the maximum
solution that satisfies the constraint. This solution can be expressed as a tree.
4. We can use Backtracking to optimize the Brute Force solution. In the tree representation, we can do DFS of
tree. If we reach a point where a solution no longer is feasible, there is no need to continue exploring. In the
given example, backtracking would be much more effective if we had even more items or a smaller knapsack
capacity.
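A minimal bottom-up sketch of the DP table from point 2 of the list above (our own code; dp[i][w] is the best value using the first i items with capacity w):

#include <cstdio>
#include <vector>

// Classic O(n*W) table for 0/1 knapsack with integer weights.
int knapsack(const int val[], const int wt[], int n, int W) {
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(W + 1, 0));
    for (int i = 1; i <= n; i++)
        for (int w = 0; w <= W; w++) {
            dp[i][w] = dp[i - 1][w];                        // skip item i-1
            if (wt[i - 1] <= w) {                           // or take it
                int take = val[i - 1] + dp[i - 1][w - wt[i - 1]];
                if (take > dp[i][w]) dp[i][w] = take;
            }
        }
    return dp[n][W];
}

int main() {
    int val[] = {60, 100, 120}, wt[] = {10, 20, 30};
    printf("%d\n", knapsack(val, wt, 3, 50)); // prints 220
    return 0;
}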
Branch and Bound The backtracking based solution works better than brute force by ignoring infeasible
solutions. We can do better (than backtracking) if we know a bound on best possible solution subtree rooted with
every node. If the best in subtree is worse than current best, we can simply ignore this node and its subtrees. So we
compute bound (best solution) for every node and compare the bound with current best solution before exploring
the node. Example bounds used in the accompanying diagram (omitted here) are: A down can give $315, B down
can give $275, C down can give $225, D down can give $125, and E down can give $30.
Branch and bound is a very useful technique for searching for a solution, but in the worst case we still need to
fully evaluate the entire tree. At best, we only need to fully calculate one path through the tree and can prune
the rest of it.
Graph Coloring-
Using Backtracking-
Given an undirected graph and a number m, determine if the graph can be colored with at most m colors
such that no two adjacent vertices of the graph are colored with the same color
To solve the problem follow the below idea:
The idea is to assign colors one by one to different vertices, starting from vertex 0. Before assigning a
color, check for safety by considering already assigned colors to the adjacent vertices i.e check if the
adjacent vertices have the same color or not. If there is any color assignment that does not violate the
conditions, mark the color assignment as part of the solution. If no assignment of color is possible then
backtrack and return false.
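A short backtracking sketch of this idea (our own code; V and the sample graph are illustrative):

#include <cstdio>

const int V = 4;

// Is it safe to give vertex v color c, given the colors already assigned?
bool isSafe(bool graph[V][V], const int color[], int v, int c) {
    for (int u = 0; u < V; u++)
        if (graph[v][u] && color[u] == c) return false;
    return true;
}

// Try colors 1..m on vertex v; backtrack when none fits.
bool colorGraph(bool graph[V][V], int m, int color[], int v) {
    if (v == V) return true;                  // all vertices colored
    for (int c = 1; c <= m; c++)
        if (isSafe(graph, color, v, c)) {
            color[v] = c;                     // assign
            if (colorGraph(graph, m, color, v + 1)) return true;
            color[v] = 0;                     // backtrack
        }
    return false;
}

int main() {
    bool graph[V][V] = {{0,1,1,1},{1,0,1,0},{1,1,0,1},{1,0,1,0}};
    int color[V] = {0};
    if (colorGraph(graph, 3, color, 0))
        for (int i = 0; i < V; i++) printf("%d ", color[i]); // e.g. 1 2 3 2
    else
        printf("no solution\n");
    return 0;
}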
N-Queen Problem-
The N-Queen problem is the problem of placing N chess queens on an N×N chessboard so that no two queens
attack each other (an example solution figure for the 4-Queen problem is omitted).
Using Backtracking
The idea is to place queens one by one in different columns, starting from the leftmost column. When we
place a queen in a column, we check for clashes with already placed queens. In the current column, if we
find a row for which there is no clash, we mark this row and column as part of the solution. If we do not
find such a row due to clashes, then we backtrack and return false.
1) Start in the leftmost column
2) If all queens are placed
return true
3) Try all rows in the current column.
Do following for every tried row.
a) If the queen can be placed safely in this row
then mark this [row, column] as part of the solution and recursively check if placing
queen here leads to a solution.
b) If placing the queen in [row, column] leads to
a solution then return true.
c) If placing queen doesn't lead to a solution then unmark this [row, column] (Backtrack) and go to
step (a) to try other rows.
4) If all rows have been tried and nothing worked,
return false to trigger backtracking.
Time Complexity: O(N!)
Auxiliary Space: O(N²)
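A compact C++ sketch of these steps for N = 4 (our own code):

#include <cstdio>

const int N = 4;
int board[N][N];

// A queen at (row, col) is safe if no queen shares its row or either
// diagonal in the columns to its left.
bool safe(int row, int col) {
    for (int j = 0; j < col; j++)
        if (board[row][j]) return false;
    for (int i = row, j = col; i >= 0 && j >= 0; i--, j--)
        if (board[i][j]) return false;
    for (int i = row, j = col; i < N && j >= 0; i++, j--)
        if (board[i][j]) return false;
    return true;
}

// Place one queen per column, left to right, backtracking on clashes.
bool solveNQ(int col) {
    if (col == N) return true;                // all queens placed
    for (int row = 0; row < N; row++)
        if (safe(row, col)) {
            board[row][col] = 1;              // mark [row, column]
            if (solveNQ(col + 1)) return true;
            board[row][col] = 0;              // unmark (backtrack)
        }
    return false;
}

int main() {
    if (solveNQ(0))
        for (int i = 0; i < N; i++, printf("\n"))
            for (int j = 0; j < N; j++) printf("%d ", board[i][j]);
    return 0;
}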
Job Scheduling-
Given an array of jobs where every job has a deadline and associated profit if the job is finished before
the deadline. It is also given that every job takes a single unit of time, so the minimum possible deadline
for any job is 1. Maximize the total profit if only one job can be scheduled at a time.
Greedy Approach-
Greedily choose the jobs with maximum profit first, by sorting the jobs in decreasing order of their profit.
This would help to maximize the total profit as choosing the job with maximum profit for every time slot
will eventually maximize the total profit
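A hedged sketch of this greedy strategy (our own code; the slot-array bound and the job data are illustrative):

#include <algorithm>
#include <cstdio>

struct Job { char id; int deadline, profit; };

// Greedy job sequencing: sort by profit (descending) and place each
// job in the latest still-free unit time slot on or before its deadline.
int jobScheduling(Job jobs[], int n, int maxDeadline) {
    std::sort(jobs, jobs + n,
              [](const Job& a, const Job& b) { return a.profit > b.profit; });
    bool slot[16] = {false};                 // assumes maxDeadline < 16 here
    int total = 0;
    for (int i = 0; i < n; i++)
        for (int t = std::min(maxDeadline, jobs[i].deadline) - 1; t >= 0; t--)
            if (!slot[t]) {                  // latest free slot found
                slot[t] = true;
                total += jobs[i].profit;
                break;
            }
    return total;
}

int main() {
    Job jobs[] = {{'a',2,100},{'b',1,19},{'c',2,27},{'d',1,25},{'e',3,15}};
    printf("%d\n", jobScheduling(jobs, 5, 3)); // maximum profit: 142 (jobs c, a, e)
    return 0;
}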
Branch Bound-
Let there be N workers and N jobs. Any worker can be assigned to perform any job, incurring some cost that may
vary depending on the worker-job assignment. It is required to perform all jobs by assigning exactly one worker to
each job and exactly one job to each worker in such a way that the total cost of the assignment is minimized.
The selection rule for the next node in BFS and DFS is "blind", i.e. it does not give any preference
to a node that has a very good chance of getting the search to an answer node quickly. The search for an optimal
solution can often be sped up by using an "intelligent" ranking function, also called an approximate cost function,
to avoid searching sub-trees that do not contain an optimal solution. This is similar to a BFS-like search but with one
major optimization: instead of following FIFO order, we choose the live node with the least cost. Following the most
promising node does not guarantee finding an optimal solution immediately, but it provides a very good chance of
getting the search to an answer node quickly.
There are two approaches to calculate the cost function:
1. For each worker, we choose job with minimum cost from list of unassigned jobs (take minimum entry from
each row).
2. For each job, we choose a worker with lowest cost for that job from list of unassigned workers (take minimum
entry from each column).
Complete Algorithm:
/* findMinCost uses Least() and Add() to maintain the
list of live nodes;
it implements the list of live nodes as a min heap */
(The remainder of the implementation is omitted.)