Ada Reference Notes Unit 1 - Unit 5
Subject Name: Analysis and Design of Algorithm
Subject Code: CS-402
Semester: 4th
Unit-1
ALGORITHM:
An Algorithm is a finite sequence of instructions, each of which has a clear meaning and can be
performed with a finite amount of effort in a finite length of time. No matter what the input values
may be, an algorithm terminates after executing a finite number of instructions. In addition, every
algorithm must satisfy the following criteria:
Input: there are zero or more quantities, which are externally supplied;
Output: at least one quantity is produced;
Definiteness: each instruction must be clear and unambiguous;
Finiteness: if we trace out the instructions of an algorithm, then for all cases the algorithm will
terminate after a finite number of steps;
Effectiveness: every instruction must be sufficiently basic that it can in principle be carried out by a
person using only pencil and paper. It is not enough that each operation be definite, but it must also
be feasible.
In formal computer science, one distinguishes between an algorithm and a program. A program does
not necessarily satisfy the fourth condition (finiteness). One important example of such a program for a
computer is its operating system, which never terminates (except for system crashes) but continues
in a wait loop until more jobs are entered.
We represent algorithm using a pseudo language that is a combination of the constructs of a
programming language together with informal English statements.
DESIGNING ALGORITHMS:
In Computer Science, developing an algorithm is an art or a skill. Before the actual implementation of the
program, designing an algorithm is a very important step.
Steps are:
1. Understand the problem
2. Decision making on
a. Capabilities of computational devices
b. Select exact or approximate methods
c. Data Structures
d. Algorithmic strategies
3. Specification of algorithms
4. Algorithmic verification
5. Analysis of algorithm
6. Implementation or coding of algorithm
ANALYZING ALGORITHMS:
The efficiency of an algorithm can be decided by measuring the performance of an algorithm.
Performance of a program:
The performance of a program is the amount of computer memory and time needed to run a
program. We use two approaches to determine the performance of a program. One is analytical, and
the other experimental. In performance analysis we use analytical methods, while in performance
measurement we conduct experiments.
Time Complexity:
The time needed by an algorithm expressed as a function of the size of a problem is called the time
complexity of the algorithm. The time complexity of a program is the amount of computer time it
needs to run to completion.
The limiting behavior of the complexity as size increases is called the asymptotic time complexity. It is
the asymptotic complexity of an algorithm, which ultimately determines the size of problems that
can be solved by the algorithm.
Space Complexity:
The space complexity of a program is the amount of memory it needs to run to completion. The
space needed by a program has the following components:
Instruction space: Instruction space is the space needed to store the compiled version of the program
instructions.
Data space: Data space is the space needed to store all constant and variable values. Data space has
two components:
Space needed by constants and simple variables in program.
Space needed by dynamically allocated objects such as arrays and class instances.
Environment stack space: The environment stack is used to save information needed to resume
execution of partially completed functions.
Instruction Space: The amount of instruction space that is needed depends on factors such as:
The compiler used to compile the program into machine code.
The compiler options in effect at the time of compilation.
The target computer.
A program that runs faster is a better program, so saving time is an obvious goal. Likewise, a program
that saves space over a competing program is considered desirable. We also want to "save face" by
preventing the program from locking up or generating reams of garbled data.
In this section we will briefly describe these techniques with appropriate examples.
1. Divide & conquer technique is a top-down approach to solve a problem: the problem is broken
into sub-problems, the sub-problems are solved recursively, and their solutions are combined.
2. Greedy technique builds up a solution one element at a time by choosing, at each stage, the
locally best option (discussed in detail in Unit 2).
3. Dynamic programming technique is similar to divide and conquer approach. Both solve a
problem by breaking it down into a several sub problems that can be solved recursively. The
difference between the two is that in dynamic programming approach, the results obtained
from solving smaller sub problems are reused (by maintaining a table of results) in the
calculation of larger sub problems. Thus, dynamic programming is a Bottom-up approach
that begins by solving the smaller sub-problems, saving these partial results, and then
reusing them to solve larger sub-problems until the solution to the original problem is
obtained. Reusing the results of sub-problems (by maintaining a table of results) is the major
advantage of dynamic programming because it avoids the re-computations (computing
results twice or more) of the same problem.
Thus, the dynamic programming approach takes much less time than naïve or straightforward
methods such as the divide-and-conquer approach, which solves problems top-down and performs
lots of re-computations. When the principle of optimality holds, the dynamic programming approach
is guaranteed to produce an optimal solution.
4. The term backtrack was coined by American mathematician D.H. Lehmer in the 1950s.
Backtracking can be applied only to problems which admit the concept of a "partial
candidate solution" and a relatively quick test of whether it can possibly be completed to a
valid solution. Backtracking algorithms try each possibility until they find the right one. It is a
depth-first search of the set of possible solutions. During the search, if an alternative doesn't
work, the search backtracks to the choice point, the place which presented different
alternatives, and tries the next alternative. When the alternatives are exhausted, the search
returns to the previous choice point and tries the next alternative there. If there are no more
choice points, the search fails.
5. Branch-and-Bound (B&B) is a rather general optimization technique that applies where the
greedy method and dynamic programming fail.
B&B design strategy is very similar to backtracking in that a state-space- tree is used to solve
a problem. Branch and bound is a systematic method for solving optimization problems.
However, it is much slower. Indeed, it often leads to exponential time complexities in the
worst case. On the other hand, if applied carefully, it can lead to algorithms that run
reasonably fast on average. The general idea of B&B is a BFS-like search for the optimal
solution, but not all nodes get expanded (i.e., their children generated). Rather, a carefully
selected criterion determines which node to expand and when, and another criterion tells
the algorithm when an optimal solution has been found. Branch and Bound (B&B) is the
most widely used tool for solving large scale NP-hard combinatorial optimization problems.
The following table 1.1 summarizes these techniques with some common problems that follow them,
together with their running times. Each technique has a different running time (time complexity).
Table 1.1:
Divide & Conquer: Multiplication of two n-bit numbers, Quick Sort, Heap Sort, Merge Sort
Greedy Method: Knapsack (fractional) problem, Minimum cost spanning tree (Kruskal's algorithm, Prim's algorithm), Single source shortest path problem (Dijkstra's algorithm)
Dynamic Programming: All pair shortest path (Floyd's algorithm), Chain matrix multiplication, Longest common subsequence (LCS), 0/1 Knapsack problem, Traveling salesman problem (TSP)
Backtracking: N-queens problem, Sum of subsets
Branch & Bound: Assignment problem, Traveling salesman problem (TSP)
Complexity of Algorithms
The complexity of an algorithm M is the function f(n) which gives the running time and/or storage
space requirement of the algorithm in terms of the size n of the input data. Mostly, the storage
space required by an algorithm is simply a multiple of the data size n. "Complexity" shall refer to the
running time of the algorithm.
The function f(n), which gives the running time of an algorithm, depends not only on the size n of the input
data but also on the particular data. The complexity function f(n) for certain cases is:
1. Best Case : The minimum possible value of f(n) is called the best case.
2. Average Case : The expected value of f(n).
3. Worst Case : The maximum possible value of f(n) over all possible inputs.
[Figures: asymptotic bounds on f(n) in terms of c·g(n) beyond a threshold n0. Big-O: f(n) ≤ c·g(n) for all n ≥ n0. Big-Omega: f(n) ≥ c·g(n) for all n ≥ n0. Big-Theta: c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.]
Analyzing Algorithms
Suppose M is an algorithm, and suppose n is the size of the input data. Clearly the complexity f(n)
of M increases as n increases. It is usually the rate of increase of f(n) we want to examine. This is
usually done by comparing f(n) with some standard functions. The most common computing times
are:
O(1), O(log n), O(n), O(n log n), O(n²), O(n³), O(2ⁿ), O(n!) and O(nⁿ)
Numerical Comparison of Different Algorithms
The execution time for six of the typical functions is given below:
n      log n   n log n   n²        n³           2ⁿ
1      0       0         1         1            2
2      1       2         4         8            4
4      2       8         16        64           16
8      3       24        64        512          256
16     4       64        256       4,096        65,536
32     5       160       1,024     32,768       4,294,967,296
64     6       384       4,096     262,144      approx. 1.8 × 10¹⁹
128    7       896       16,384    2,097,152    approx. 3.4 × 10³⁸
256    8       2,048     65,536    16,777,216   approx. 1.2 × 10⁷⁷
Binary Heap:
A complete binary tree is a binary tree in which every level, except possibly the last,
is completely filled, and all nodes are as far left as possible.
A Binary Heap is a complete binary tree where items are stored in a special order
such that the value in a parent node is greater (or smaller) than the values in its two
children nodes. The former is called a max heap and the latter a min heap.
The heap can be represented by binary tree or array.
The root of the tree is A[1]; given the index i of a node, the indices of its parent, left child
and right child can be computed as:
PARENT (i)
return floor(i/2)
LEFT (i)
return 2i
RIGHT (i)
return 2i + 1
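As a quick illustration (a minimal Python sketch, not taken from these notes), the same index
arithmetic can be written for a heap stored in a list whose index 0 is left unused so that the root
sits at index 1:

# Heap stored in a Python list with index 0 unused, so the root is at index 1.
def parent(i):
    return i // 2          # floor(i/2)

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

# Example: for the heap [None, 16, 14, 10, 8, 7], the children of the root (index 1)
# are at indices left(1) = 2 and right(1) = 3.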
Types of Heaps
Heap can be of 2 types:
Max Heap
Store data in ascending order (used in heap sort)
Has property of
A[Parent(i)] ≥ A[i]
Min Heap
Store data in descending order
Has property of
A[Parent(i)] ≤ A[i]
Procedures on Heap
Heapify
Build Heap
Heap Sort
1. Heapify
Heapify picks the largest child key and compares it to the parent key. If the parent key is larger,
heapify quits; otherwise it swaps the parent key with the largest child key, so that the parent
now becomes larger than its children.
Heapify(A, i)
{
    l ← left(i)
    r ← right(i)
    if l <= heapsize[A] and A[l] > A[i]
        then largest ← l
        else largest ← i
    if r <= heapsize[A] and A[r] > A[largest]
        then largest ← r
    if largest != i
        then swap A[i] ↔ A[largest]
             Heapify(A, largest)
}
2. Build Heap
We can use the procedure 'Heapify' in a bottom-up fashion to convert an array A[1 . . n] into a
heap. Since the elements in the subarray A[n/2 +1 . . n] are all leaves, the procedure
BUILD_HEAP goes through the remaining nodes of the tree and runs 'Heapify' on each one. The
bottom-up order of processing node guarantees that the subtree rooted at children are heap
before 'Heapify' is run at their parent.
Buildheap(A)
{
    heapsize[A] ← length[A]
    for i ← floor(length[A]/2) down to 1
        do Heapify(A, i)
}
3. Heap Sort
The heap sort algorithm starts by using procedure BUILD-HEAP to build a heap on the input
array A[1 . . n]. Since the maximum element of the array is stored at the root A[1], it can be put
into its correct final position by exchanging it with A[n] (the last element in A). If we now
discard node n from the heap, the remaining elements can be made into a heap. Note that
the new element at the root may violate the heap property, so a single call to Heapify is all
that is needed to restore it.
Heapsort(A)
{
    Buildheap(A)
    for i ← length[A] down to 2
        do swap A[1] ↔ A[i]
           heapsize[A] ← heapsize[A] - 1
           Heapify(A, 1)
}
Complexity
Time complexity of heapify is O(Logn). Time complexity of create and BuildHeap() is O(n) and overall time
complexity of Heap Sort is O(n Logn).
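The pseudocode above can be turned into a runnable sketch. The following Python version is an
illustration only; it uses 0-based list indices, so the index arithmetic shifts slightly compared with
the 1-based pseudocode:

def heapify(a, heap_size, i):
    # Sift a[i] down until the subtree rooted at i satisfies the max-heap property.
    left, right = 2 * i + 1, 2 * i + 2
    largest = i
    if left < heap_size and a[left] > a[largest]:
        largest = left
    if right < heap_size and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, heap_size, largest)

def build_heap(a):
    # Leaves occupy indices n//2 .. n-1, so heapify the remaining nodes bottom-up.
    for i in range(len(a) // 2 - 1, -1, -1):
        heapify(a, len(a), i)

def heap_sort(a):
    build_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # move the current maximum to its final position
        heapify(a, end, 0)            # restore the heap property on the reduced heap

# Example: heap_sort on [4, 10, 3, 5, 1] leaves the list as [1, 3, 4, 5, 10].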
BINARY SEARCH:
The Binary search technique is a search technique based on the Divide & Conquer strategy.
The array must be sorted before searching. We calculate the location of the mid element
using the formula mid = (Beg + End)/2, where Beg and End represent the first and last positions of
the array. In this technique we compare the key element to the mid element, so there may be three
cases:
1. If array[mid] = = Key (Element found and Location is Mid)
2. If array[mid] > Key, then set End = mid-1. (continue the process)
3. If array [mid] < Key, then set Beg=Mid+1. (Continue the process)
Time complexity
As we dispose of one half of the search space during every step of binary search and perform the
search operation on the other half, this results in a worst case time complexity of O(log₂ n).
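A short iterative Python sketch of the same idea (an illustration; it assumes the input list arr is
already sorted in ascending order):

def binary_search(arr, key):
    beg, end = 0, len(arr) - 1
    while beg <= end:
        mid = (beg + end) // 2
        if arr[mid] == key:
            return mid            # element found at index mid
        elif arr[mid] > key:
            end = mid - 1         # continue in the left half
        else:
            beg = mid + 1         # continue in the right half
    return -1                     # element not found

# Example: binary_search([2, 5, 8, 12, 16], 12) returns 3.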
MERGE SORT:
Merge sort is a divide-and-conquer algorithm based on the idea of breaking down a list into several
sub-lists until each sublist consists of a single element and merging those sublists in a manner that
results into a sorted list.
Divide the unsorted list into N sublists, each containing 1 element.
Take adjacent pairs of two singleton lists and merge them to form a list of 2 elements. The N sublists will
now be converted into N/2 lists of size 2.
Repeat the process till a single sorted list is obtained.
While comparing two sublists for merging, the first element of both lists is taken into consideration.
While sorting in ascending order, the element that is of a lesser value becomes a new element of the
sorted list. This procedure is repeated until both the smaller sublists are empty and the new combined
sublist comprises all the elements of both the sublists.
Merge sort first divides the array into equal halves and then combines them in a sorted manner.
Algorithm:
MergeSort(A, p, r)
{
if( p < r )
{
q = (p+r)/2;
mergeSort(A, p, q);
mergeSort(A, q+1, r);
merge(A, p, q, r);
}
}
Merge (A, p, q, r)
{
    n1 = q – p + 1
    n2 = r – q
    declare L[1 … n1 + 1] and R[1 … n2 + 1] temporary arrays
    for i = 1 to n1
        L[i] = A[p + i - 1]
    for j = 1 to n2
        R[j] = A[q + j]
    L[n1 + 1] = ∞
    R[n2 + 1] = ∞
    i = 1
    j = 1
    for k = p to r
        if L[i] ≤ R[j]
            A[k] = L[i]
            i = i + 1
        else
            A[k] = R[j]
            j = j + 1
}
Time Complexity:
Merge Sort is a recursive algorithm and its time complexity can be expressed as the following
recurrence relation:
T(n) = 2T(n/2) + Θ(n)
The above recurrence can be solved either using the Recurrence Tree method or the Master method. It falls in
case 2 of the Master method and the solution of the recurrence is Θ(n log n).
The time complexity of Merge Sort is Θ(n log n) in all 3 cases (worst, average and best) as merge sort always
divides the array into two halves and takes linear time to merge the two halves.
Auxiliary Space: O(n)
Algorithmic Paradigm: Divide and Conquer
Sorting In Place: No in a typical implementation
Stable: Yes
QUICK SORT:
Like Merge Sort, QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and
partitions the given array around the picked pivot. There are many different versions of quickSort that
pick pivot in different ways.
Always pick first element as pivot.
Always pick last element as pivot (implemented below)
Pick a random element as pivot.
Pick median as pivot.
The key process in quickSort is partition(). Target of partitions is, given an array and an element x of
array as pivot, put x at its correct position in sorted array and put all smaller elements (smaller than x)
before x, and put all greater elements (greater than x) after x. All this should be done in linear time.
/* low --> Starting index, high --> Ending index */
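A Python sketch of one common way partition() and quickSort() can be written, using the Lomuto
scheme with the last element as pivot as described above (the function and variable names here are
illustrative, not the notes' own):

def partition(arr, low, high):
    # Place the pivot (last element) at its correct sorted position and put all
    # smaller elements before it and all greater elements after it.
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def quick_sort(arr, low, high):
    if low < high:
        p = partition(arr, low, high)
        quick_sort(arr, low, p - 1)
        quick_sort(arr, p + 1, high)

# Example: a = [10, 7, 8, 9, 1, 5]; quick_sort(a, 0, len(a) - 1) sorts a in place.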
Analysis of QuickSort
Time taken by QuickSort in general can be written as follows:
T(n) = T(k) + T(n - k - 1) + Θ(n)
The first two terms are for the two recursive calls, the last term is for the partition process. k is the number
of elements which are smaller than the pivot.
The time taken by QuickSort depends upon the input array and partition strategy. Following are three
cases.
Worst Case: The worst case occurs when the partition process always picks greatest or smallest
element as pivot. If we consider above partition strategy where last element is always picked as pivot,
the worst case would occur when the array is already sorted in increasing or decreasing order.
Following is recurrence for worst case.
T(n) = T(0) + T(n-1) + Θ(n)
which is equivalent to
T(n) = T(n-1) + Θ(n)
The solution of the above recurrence is Θ(n²).
Best Case: The best case occurs when the partition process always picks the middle element as pivot.
Following is recurrence for best case.
T(n) = 2T(n/2) + Θ(n)
The solution of the above recurrence is Θ(n log n). It can be solved using case 2 of the Master Theorem.
Average Case:
To do average case analysis, we need to consider all possible permutations of the array and calculate the time
taken by every permutation, which doesn't look easy.
We can get an idea of the average case by considering the case when partition puts O(n/9) elements in one
set and O(9n/10) elements in the other set. Following is the recurrence for this case.
T(n) = T(n/9) + T(9n/10) + Θ(n)
Although the worst-case time complexity of QuickSort is O(n²), which is more than many other sorting
algorithms like Merge Sort and Heap Sort, QuickSort is faster in practice, because its inner loop can be
efficiently implemented on most architectures, and in most real-world data. QuickSort can be
implemented in different ways by changing the choice of pivot, so that the worst case rarely occurs for
a given type of data. However, merge sort is generally considered better when data is huge and stored
in external storage.
9. X6 MMult(A22,B21, n/2)
10. X7 MMult(A21,B12, n/2)
11. X8 MMult(A22,B22, n/2)
12. C11 X1 + X2
13. C12 X3 + X4
14. C21 X5 + X6
15. C22 X7 + X8
16. Output C
17. End If
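The numbered steps above appear to be the tail of a straightforward divide-and-conquer matrix
multiplication: eight recursive half-size multiplications X1..X8 are combined into the quadrants of C.
A minimal Python sketch of the whole procedure is given below as an illustration; it assumes square
matrices (nested lists) whose order n is a power of two, and the recurrence T(n) = 8T(n/2) + Θ(n²)
gives Θ(n³) overall:

def mmult(A, B):
    # Divide-and-conquer multiplication of two n x n matrices, n a power of two.
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quad(M, r, c):                      # extract an h x h quadrant
        return [row[c:c + h] for row in M[r:r + h]]
    def add(X, Y):
        return [[X[i][j] + Y[i][j] for j in range(h)] for i in range(h)]
    A11, A12, A21, A22 = quad(A, 0, 0), quad(A, 0, h), quad(A, h, 0), quad(A, h, h)
    B11, B12, B21, B22 = quad(B, 0, 0), quad(B, 0, h), quad(B, h, 0), quad(B, h, h)
    C11 = add(mmult(A11, B11), mmult(A12, B21))   # X1 + X2
    C12 = add(mmult(A11, B12), mmult(A12, B22))   # X3 + X4
    C21 = add(mmult(A21, B11), mmult(A22, B21))   # X5 + X6
    C22 = add(mmult(A21, B12), mmult(A22, B22))   # X7 + X8
    top = [C11[i] + C12[i] for i in range(h)]     # stitch the quadrants back together
    bottom = [C21[i] + C22[i] for i in range(h)]
    return top + bottom

# Example: mmult([[1, 2], [3, 4]], [[5, 6], [7, 8]]) returns [[19, 22], [43, 50]].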
Unit-2 Notes
Study of Greedy strategy, examples of greedy method like optimal merge patterns, Huffman
coding, minimum spanning trees, knapsack problem, job sequencing with deadlines, single
source shortest path algorithm
Greedy Technique
Greedy is the most straightforward design technique. Most of the problems have n inputs and
require us to obtain a subset that satisfies some constraints. Any subset that satisfies these
constraints is called a feasible solution. We need to find a feasible solution that either maximizes
or minimizes the objective function. A feasible solution that does this is called an optimal
solution.
The greedy method is a simple strategy of progressively building up a solution, one element at a
time, by choosing the best possible element at each stage. At each stage, a decision is made
regarding whether or not a particular input is in an optimal solution. This is done by considering
the inputs in an order determined by some selection procedure. If the inclusion of the next
input, into the partially constructed optimal solution will result in an infeasible solution then this
input is not added to the partial solution. The selection procedure itself is based on some
optimization measure. Several optimization measures are plausible for a given problem. Most of
them, however, will result in algorithms that generate sub-optimal solutions. This version of
greedy technique is called subset paradigm. Some problems like Knapsack, Job sequencing with
deadlines and minimum cost spanning trees are based on subset paradigm.
struct treenode
{
treenode * lchild;
treenode * rchild;
};
Analysis:
T= O (n-1) * max (O (Least), O (Insert)).
- Case 2: L is sorted.
Case 2.1
O (Least)= O (1)
O (Insert)= O (n)
T= O (n2)
Case 2.2
L is represented as a min-heap. Value in the root is <= the values of its children.
O (Least)= O (1)
O (Insert)= O (log n)
T= O (n log n).
Huffman Codes
Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input
characters, lengths of the assigned codes are based on the frequencies of corresponding characters. The most
frequent character gets the smallest code and the least frequent character gets the largest code.
The variable-length codes assigned to input characters are Prefix Codes, means the codes (bit sequences) are
assigned in such a way that the code assigned to one character is not prefix of code assigned to any other
character. This is how Huffman Coding makes sure that there is no ambiguity when decoding the generated bit
stream.
Let us understand prefix codes with a counter example. Let there be four characters a, b, c and d, and their
corresponding variable length codes be 00, 01, 0 and 1. This coding leads to ambiguity because the code assigned
to c is a prefix of the codes assigned to a and b. If the compressed bit stream is 0001, the de-compressed output
may be "cccd" or "ccb" or "acd" or "ab" (see the figure).
Steps to build Huffman code
Input is array of unique characters along with their frequency of occurrences and output is Huffman Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes (Min Heap is used as a
priority queue. The value of frequency field is used to compare two nodes in min heap. Initially, the least
frequent character is at root)
2. Extract two nodes with the minimum frequency from the min heap.
3. Create a new internal node with frequency equal to the sum of the two nodes frequencies. Make the first
extracted node as its left child and the other extracted node as its right child. Add this node to the min heap.
4. Repeat steps#2 and #3 until the heap contains only one node. The remaining node is the root node and the
tree is complete.
Example:
Letter A B C D E F
Frequency 10 20 30 40 50 60
[Huffman tree figure: merging the two smallest frequencies repeatedly gives the internal nodes X = 30 (children A 10, B 20), Y = 60 (children X, C 30), Z = 90 (children D 40, E 50), W = 120 (children Y, F 60) and the root V = 210 (children Z, W); each left edge is labelled 0 and each right edge 1.]
Algorithm:
Huffman(A)
{
n = |A|;
Q = A;
for i = 1 to n-1
{
z = new node;
left[z] =Extract-Min(Q);
right[z] =Extract-Min(Q);
f[z] = f[left[z]] +f[right[z]];
Insert(Q, z);
}
return Extract-Min(Q);
}
Analysis of algorithm
Each priority queue operation (e.g. heap): O(log n)
In each iteration: one less subtree.
Initially: n subtrees.
Total: O(n log n) time.
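A compact Python sketch of the same construction, using the standard library heapq module as the
min-heap. This is an illustration; the nested-tuple node layout and the function name are choices made
here, not notation from these notes:

import heapq

def huffman_codes(freq):
    # freq: dict mapping each character to its frequency.
    # Build the tree with a min-heap; the counter breaks ties so tuples compare cleanly.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # two least-frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):           # internal node: recurse with 0 / 1 appended
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
        return codes
    _, _, root = heap[0]
    return walk(root, "")

# Example: huffman_codes({'A': 10, 'B': 20, 'C': 30, 'D': 40, 'E': 50, 'F': 60})
# assigns shorter codes to the more frequent letters and longer codes to A and B.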
Kruskal's Algorithm
This is a greedy algorithm. A greedy algorithm chooses some local optimum (i.e. picking an edge with the least
weight in a MST).
Kruskal's algorithm works as follows: Take a graph with 'n' vertices, keep on adding the shortest (least cost)
edge, while avoiding the creation of cycles, until (n - 1) edges have been added. Sometimes two or more edges
may have the same cost. The order in which the edges are chosen, in this case, does not matter. Different
MSTs may result, but they will all have the same total cost, which will always be the minimum cost.
t [i, 1] := u; t [i, 2] := v; mincost :=mincost + cost [u, v]; Union (j, k);
}
}
if (i >n-1) then write ("no spanning tree");
else return mincost;
}
Running time:
• The number of finds is at most 2e, and the number of unions at most n-1. Including the initialization
time for the trees, this part of the algorithm has a complexity that is just slightly more than O (n + e).
• We can add at most n-1 edges to tree T. So, the total time for operations on T is O(n).
Summing up the various components of the computing times, we get O (n + e log e) as asymptotic complexity.
This simple modified algorithm of spanning tree is called Prim's algorithm for finding a
minimal cost spanning tree.
Prim's algorithm is an example of a greedy algorithm.
return mincost;
}
Running time:
We do the same set of operations with dist as in Dijkstra's algorithm (initialize the structure, m times
decrease value, n - 1 times select minimum). Therefore, we get O(n²) time when we implement dist with an
array, and O((n + |E|) log n) when we implement it with a heap.
Kruskal's algorithm: For getting the MCST, it is not necessary to choose adjacent vertices of already
selected vertices (in any successive steps). At intermediate steps of the algorithm, there may be more
than one connected component. Time complexity: O(|E| log |V|).
Prim's algorithm: For getting the MCST, it is necessary to select an edge (u, v) of minimum weight to an
adjacent vertex (say, v) of the already selected vertices (in any successive steps). At intermediate steps of
the algorithm, there will be only one connected component. Time complexity: O(V²).
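A hedged Python sketch of Kruskal's algorithm using a simple union-find (disjoint set) structure for
cycle detection; the edge-list representation and function names are assumptions made here, not part
of the notes:

def kruskal(n, edges):
    # n: number of vertices (0 .. n-1); edges: list of (weight, u, v) tuples.
    parent = list(range(n))
    def find(x):                      # find the root of x's set (with path halving)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    mincost, tree = 0, []
    for w, u, v in sorted(edges):     # consider edges in non-decreasing order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding this edge does not create a cycle
            parent[ru] = rv           # union the two components
            tree.append((u, v, w))
            mincost += w
            if len(tree) == n - 1:
                break
    return tree, mincost

# Example: kruskal(4, [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)])
# returns a spanning tree of total cost 1 + 2 + 3 = 6.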
KNAPSACK PROBLEM
Let us apply the greedy method to solve the knapsack problem. We are given n objects and a knapsack. The
object i has a weight wi and the knapsack has a capacity m. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed into
the knapsack then a profit of pi·xi is earned. The objective is to fill the knapsack in a way that maximizes the total profit
earned.
Since the knapsack capacity is m, we require the total weight of all chosen objects to be at most m.
Algorithm
If the objects have already been sorted into non-increasing order of p[i] / w[i], then the algorithm given below
obtains solutions corresponding to this strategy.
Greedy Fractional-Knapsack (P[1..n], W[1..n], X[1..n], M)
/* P[1..n] and W[1..n] contain the profit and weight of the n objects ordered
such that P[i]/W[i] >= P[i+1]/W[i+1].
X[1..n] is the solution set and M is the capacity of the knapsack. */
{
1: For i ← 1 to n do
2:     X[i] ← 0
3: profit ← 0      // Total profit of items filled in the knapsack
4: weight ← 0      // Total weight of items packed in the knapsack
5: i ← 1
6: While (weight < M) do    // M is the knapsack capacity
   {
7:     if (weight + W[i] ≤ M) then
8:         X[i] = 1
9:         weight = weight + W[i]
10:    else
11:        X[i] = (M - weight) / W[i]
12:        weight = M
13:    profit = profit + P[i] * X[i]
14:    i = i + 1
   }
15: return X
}
Running time:
The objects are to be sorted into non-increasing order of the pi / wi ratio. If we disregard the time to initially
sort the objects, the remaining greedy loop requires only O(n) time.
We still have to discuss the running time of the algorithm. The initial sorting can be done in time O(n log n),
and the rest loop takes time O(n). It is not hard to implement each body of the second loop in time O(n), so
the total loop takes time O(n2). So the total algorithm runs in time O(n2). Using a more sophisticated data
structure one can reduce this running time to O(n log n), but in any case it is a polynomial-time algorithm.
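A short Python sketch of the greedy fractional knapsack (an illustration; unlike the pseudocode above,
it sorts the items by profit/weight ratio itself rather than assuming pre-sorted input):

def fractional_knapsack(profits, weights, capacity):
    # Greedily take items in non-increasing order of profit/weight ratio.
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0.0] * len(profits)
    total_profit, remaining = 0.0, capacity
    for i in order:
        if weights[i] <= remaining:          # the whole object fits
            x[i] = 1.0
            remaining -= weights[i]
            total_profit += profits[i]
        else:                                # take only the fitting fraction and stop
            x[i] = remaining / weights[i]
            total_profit += profits[i] * x[i]
            break
    return x, total_profit

# Example: fractional_knapsack([25, 24, 15], [18, 15, 10], 20) takes object 2 fully and
# half of object 3, for a total profit of 31.5 (the optimum for this instance).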
SINGLE SOURCE SHORTEST PATH (DIJKSTRA'S ALGORITHM):
for i := 1 to n do
{
    S[i] := false;              // Initialize S.
    dist[i] := cost[v, i];
}
S[v] := true; dist[v] := 0.0;   // Put v in S.
for num := 2 to n - 1 do
{
    // Determine n - 1 paths from v.
    Choose u from among those vertices not in S such that dist[u] is minimum;
    S[u] := true;               // Put u in S.
    for (each w adjacent to u with S[w] = false) do
        if (dist[w] > dist[u] + cost[u, w]) then    // Update distances
            dist[w] := dist[u] + cost[u, w];
}
Running time:
For heap A = O (n); B = O (log n); C = O (log n) which gives O (n + m log n) total.
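For comparison, a heap-based Python sketch of Dijkstra's single-source shortest path. This is an
illustration only; the adjacency-list dictionary and the function name are assumptions, not the
array-based representation used in the pseudocode above:

import heapq

def dijkstra(graph, source):
    # graph: dict mapping each vertex to a list of (neighbour, edge_cost) pairs.
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq = [(0, source)]                       # min-heap of (distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:                      # stale entry, already improved
            continue
        for w, cost in graph[u]:
            if dist[u] + cost < dist[w]:     # relax edge (u, w)
                dist[w] = dist[u] + cost
                heapq.heappush(pq, (dist[w], w))
    return dist

# Example:
# g = {'a': [('b', 4), ('c', 1)], 'b': [('d', 1)], 'c': [('b', 2), ('d', 5)], 'd': []}
# dijkstra(g, 'a') returns {'a': 0, 'b': 3, 'c': 1, 'd': 4}.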
Unit-3
Syllabus: Concept of dynamic programming, problems based on this approach such as 0/1 knapsack,
multistage graph, reliability design, Floyd Warshall algorithm
Concept of Dynamic Programming
Dynamic programming is a name coined by Richard Bellman in the 1950s. Dynamic programming, like the
greedy method, is a powerful algorithm design technique that can be used when the solution to the
problem may be viewed as the result of a sequence of decisions. In the greedy method we make
irrevocable decisions one at a time, using a greedy criterion. However, in dynamic programming we
examine the decision sequence to see whether an optimal decision sequence contains optimal
decision subsequences.
When optimal decision sequences contain optimal decision subsequences, we can establish
recurrence equations, called dynamic-programming recurrence equations, that enable us to solve the
problem in an efficient way.
Dynamic programming is based on the principle of optimality (also coined by Bellman). The
principle of optimality states that no matter whatever the initial state and initial decision are, the
remaining decision sequence must constitute an optimal decision sequence with regard to the state
resulting from the first decision. The principle implies that an optimal decision sequence is comprised
of optimal decision subsequences. Since the principle of optimality may not hold for some
formulations of some problems, it is necessary to verify that it does hold for the problem being
solved. Dynamic programming cannot be applied when this principle does not hold.
• Solve the dynamic-programming recurrence equations for the value of the optimal solution.
• Perform a trace back step in which the solution itself is constructed.
0/1 – KNAPSACK
Given weights and values of n items, put these items in a knapsack of capacity W to get the
maximum total value in the knapsack. In other words, given two integer arrays val[0..n-1] and
wt[0..n-1] which represent values and weights associated with n items respectively. Also given an
integer W which represents knapsack capacity, find out the maximum value subset of val[] such that
sum of the weights of this subset is smaller than or equal to W. You cannot break an item; either pick
the complete item, or don't pick it (0-1 property).
Dynamic-0-1-Knapsack (v, w, n, W)
for w = 0 to W do
    c[0, w] = 0
for i = 1 to n do
    c[i, 0] = 0
    for w = 1 to W do
        if wi ≤ w then
            if vi + c[i-1, w-wi] > c[i-1, w] then
                c[i, w] = vi + c[i-1, w-wi]
            else c[i, w] = c[i-1, w]
        else
            c[i, w] = c[i-1, w]
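A runnable Python version of the same table-filling recurrence. This is a sketch; v and w are 0-indexed
lists here, unlike the 1-indexed pseudocode above:

def knapsack_01(v, w, W):
    n = len(v)
    # c[i][j] = best value using the first i items with capacity j.
    c = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            if w[i - 1] <= j and v[i - 1] + c[i - 1][j - w[i - 1]] > c[i - 1][j]:
                c[i][j] = v[i - 1] + c[i - 1][j - w[i - 1]]
            else:
                c[i][j] = c[i - 1][j]
    return c[n][W]

# Example: knapsack_01([60, 100, 120], [10, 20, 30], 50) returns 220 (items 2 and 3).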
MULTISTAGE GRAPH:
int[] MStageForward(Graph G)
{
// returns vector of vertices to follow through the graph
// let c[i][j] be the cost matrix of G
Time complexity:
Complexity is O (|V| + |E|). Where the |V| is the number of vertices and |E| is the number of edges.
Example
Consider the following example to understand the concept of multistage graph.
Cost(2, 3) = min {c(3, 4) + Cost(4, 8) + Cost(8, 9), c(3, 5) + Cost(5, 8) + Cost(8, 9), c(3, 6) + Cost(6, 8) + Cost(8, 9)} = 10
RELIABILITY DESIGN
In reliability design, the problem is to design a system that is composed of several devices connected
in series.
If we imagine that ri is the reliability of device Di, then the reliability of the whole system can be given
by Πri. If ri = 0.99 and n = 10, that is, 10 devices are set in series with 1 <= i <= 10, then the reliability of the whole
system Πri is (0.99)^10 ≈ 0.904, which is quite low.
So, if we duplicate the devices at each stage then the reliability of the system can be increased.
It can be said that multiple copies of the same device type are connected in parallel through the use of
switching circuits. Here, switching circuit determines which devices in any given group are functioning
properly. Then they make use of such devices at each stage, that result is increase in reliability at each
stage. If at each stage, there are mi similar types of devices Di, then the probability that all mi have a
malfunction is (1 - ri)^mi, which is very less.
And the reliability of stage i becomes (1 – (1 - ri)^mi). Thus, if ri = 0.99 and mi = 2, then the stage
reliability becomes 0.9999, which is almost equal to 1. This is much better than the previous
case; strictly, the reliability is a little less than 1 - (1 - ri)^mi because of the less-than-perfect reliability
of the switching circuits.
In reliability design, we try to use device duplication to maximize reliability. But this maximization
should be considered along with the cost.
ALL PAIRS SHORTEST PATH (FLOYD-WARSHALL ALGORITHM):
In the all pairs shortest path problem, we are to find a shortest path between every pair of vertices in
a directed graph G. That is, for every pair of vertices (i, j), we are to find a shortest path from i to j as
well as one from j to i. These two paths are the same when G is undirected.
When no edge has a negative length, the all-pairs shortest path problem may be solved by using
Dijkstra's greedy single-source algorithm n times, once with each of the n vertices as the source
vertex.
The all pairs shortest path problem is to determine a matrix A such that A (i, j) is the length of a
shortest path from i to j. The matrix A can be obtained by solving n single-source problems using the
algorithm shortest Paths. Since each application of this procedure requires O(n²) time, the matrix A
can be obtained in O(n³) time.
Complexity Analysis:
A dynamic programming algorithm based on this recurrence involves calculating n+1 matrices,
each of size n x n. Therefore, the algorithm has a complexity of O(n³).
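A minimal Python sketch of the Floyd-Warshall recurrence over an adjacency matrix (an illustration,
not the notes' own code; INF marks a missing edge):

def floyd_warshall(cost):
    # cost: n x n matrix; cost[i][j] is the edge length from i to j (INF if absent).
    n = len(cost)
    dist = [row[:] for row in cost]              # work on a copy
    for k in range(n):                           # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Example:
# INF = float('inf')
# floyd_warshall([[0, 3, INF], [INF, 0, 1], [2, INF, 0]])
# returns [[0, 3, 4], [3, 0, 1], [2, 5, 0]].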
Problem: Consider a directed weighted graph on 4 vertices (figure omitted). Computing the distance
matrices D1, D2, D3 and D4 in turn, the last matrix D4 represents the shortest path distance between
every pair of vertices.
Unit-4
Syllabus: Backtracking concept and its examples like 8 queens problem, Hamiltonian cycle, Graph
coloring problem etc. Introduction to branch & bound method, examples of branch and bound
method like traveling salesman problem etc. Meaning of lower bound theory and its use in solving
algebraic problems, introduction to parallel algorithms.
BACKTRACKING:
Backtracking is used to solve problems in which a sequence of objects is chosen from a specified set so
that the sequence satisfies some criterion. The desired solution is expressed as an n-tuple (x1, . . . ,
xn) where each xi ∈ Si, Si being a finite set.
The solution is based on finding one or more vectors that maximize, minimize, or satisfy a criterion
function P(x1, . . . , xn). Form a solution and check at every step if this has any chance of success. If
the solution at any point seems not promising, ignore it. All solutions require a set of constraints
divided into two categories: explicit and implicit constraints.
Solution states are those problem states s for which the path from the root node to s defines a tuple
in the solution space. Answer states are those solution states s for which the path from the root node
to s defines a tuple that is a member of the set of solutions.
State space is the set of paths from the root node to other nodes. State space tree is the tree
organization of the solution space. Such state space trees are called static trees. This
terminology follows from the observation that the tree organizations are independent of the
problem instance being solved. For some problems it is advantageous to use different tree
organizations for different problem instances. In this case the tree organization is determined
dynamically as the solution space is being searched. Tree organizations that are problem
instance dependent are called dynamic trees.
Live node is a node that has been generated but whose children have not yet been generated.
E-node is a live node whose children are currently being explored. In other words, an E-node
is a node currently being expanded.
Dead node is a generated node that is not to be expanded or explored any further. All
children of a dead node have already been expanded.
Depth first node generation with bounding functions is called backtracking. State generation
methods in which the E-node remains the E-node until it is dead lead to branch and bound
methods.
N-QUEENS PROBLEM:
The N queens puzzle is the problem of placing N chess queens on an N×N chessboard so that no two
queens threaten each other. Thus, a solution requires that no two queens share the same row,
column, or diagonal.
8-QUEENS PROBLEM:
The eight queens problem is the problem of placing eight queens on an 8×8 chessboard such that
none of them attack one another (no two are in the same row, column, or diagonal).
Algorithm for whether a new queen can be placed:
Algorithm Place(k, i)
// Returns true if a queen can be placed in the kth row and ith column;
// otherwise returns false.
{
    for j := 1 to k-1 do
        if (x[j] = i or Abs(x[j] - i) = Abs(j - k)) then
            return false;
    return true;
}
All solutions to the n-queens problem:
Algorithm NQueens(k, n)
// Prints all possible placements of n queens on an n×n chessboard.
{
    for i := 1 to n do
    {
        if Place(k, i) then
        {
            x[k] := i;
            if (k = n) then write (x[1:n]);
            else NQueens(k+1, n);
        }
    }
}
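A compact Python sketch of the same backtracking scheme, where x[k] holds the column of the queen
placed in row k (0-indexed; an illustration only, not the notes' own code):

def place(x, k, i):
    # Can a queen go in row k, column i, given queens already placed in rows 0 .. k-1?
    for j in range(k):
        if x[j] == i or abs(x[j] - i) == abs(j - k):   # same column or same diagonal
            return False
    return True

def n_queens(k, n, x, solutions):
    for i in range(n):
        if place(x, k, i):
            x[k] = i
            if k == n - 1:
                solutions.append(x[:])         # a complete placement
            else:
                n_queens(k + 1, n, x, solutions)
    return solutions

# Example: n_queens(0, 4, [0] * 4, []) returns the two solutions [1, 3, 0, 2] and [2, 0, 3, 1].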
HAMILTONIAN CYCLES:
Let G = (V, E) be a connected graph with n vertices. A Hamiltonian cycle (suggested by William
Hamilton) is a round-trip path along n edges of G that visits every vertex once and returns to its
starting position.
In graph G, a Hamiltonian cycle begins at some vertex v1 ∈ G and the vertices of G are visited in the
order v1, v2, ..., vn+1 (with vn+1 = v1); then the edges (vi, vi+1) are in E, 1 ≤ i ≤ n.
Figure-4.3: Graph
There is no known easy way to determine whether a given graph contains a Hamiltonian cycle.
By using backtracking method, it can be possible
Backtracking algorithm, that finds all the Hamiltonian cycles in a graph.
The graph may be directed or undirected. Only distinct cycles are output.
From graph g1 backtracking solution vector= {1, 2, 8, 7, 6, 5, 4, 3, 1}
Complexity Analysis
In Hamiltonian cycle, in each recursive call one of the remaining vertices is selected in the worst case.
In each recursive call the branch factor decreases by 1. Recursion in this case can be thought of as n
nested loops where in each loop the number of iterations decreases by one. Hence the time
complexity is given by:
T(N)=N*(T(N-1)+O(1))
T(N) = N*(N-1)*(N-2).. = O(N!)
GRAPH COLORING :
Let G be a graph and m be a given positive integer. We want to discover whether the nodes of G can be
colored in such a way that no two adjacent nodes have the same color, yet only m colors are used. This
is termed the m-colorabiltiy decision problem. The m-colorability optimization problem asks for the
smallest integer m for which the graph G can be colored.
Note that if d is the degree of the given graph then it can be colored with d+1 colors.
The m-colorability optimization problem asks for the smallest integer m for which the graph G can be
colored. This integer is referred to as the chromatic number of the graph.
Algorithm:
Finding all m-colorings of a graph:
Algorithm mColoring(k)
{
    // g[1:n, 1:n] : boolean adjacency matrix.
    // k : index (node) of the next vertex to color.
    repeat
    {
        NextValue(k);               // assign to x[k] a legal color.
        if (x[k] = 0) then return;  // no new color possible
        if (k = n) then write(x[1:n]);
        else mColoring(k+1);
    } until (false);
}
Getting the next color:
Algorithm NextValue(k)
{
    // x[1], x[2], ..., x[k-1] have been assigned integer values in the range [1, m].
    repeat
    {
        x[k] := (x[k] + 1) mod (m + 1);   // next highest color
        if (x[k] = 0) then return;        // all colors have been used
        for j := 1 to n do
        {
            if (g[k, j] ≠ 0 and x[k] = x[j]) then break;
        }
        if (j = n + 1) then return;       // new color found
    } until (false);
}
Complexity Analysis
1) 2-colorability
There is a simple algorithm for determining whether a graph is 2-colorable and assigning colors to its
vertices: do a breadth-first search, assigning "red" to the first layer, "blue" to the second layer, "red"
to the third layer, etc. Then go over all the edges and check whether the two endpoints of this edge
have different colors. This algorithm is O(|V|+|E|) and the last step ensures its correctness.
BRANCH AND BOUND:
It has a branching function, which can be a depth first search, breadth first search or based on a
bounding function.
It has a bounding function, which goes far beyond the feasibility test as a means to prune
the search tree efficiently.
Branch and Bound refers to all state space search methods in which all children of the E-node
are generated before any other live node becomes the E-node.
Branch and Bound is the generalization of graph search strategies, BFS and D-search.
o A BFS-like state space search is called FIFO (First In First Out) search, as the list of live
nodes is kept as a first-in-first-out list (queue).
o A D-search-like state space search is called LIFO (Last In First Out) search, as the list of
live nodes is kept as a last-in-first-out list (stack).
0/1 knapsack
Quadratic assignment problem
Nearest neighbor search
Now find the reduced matrix by:
Subtracting the smallest element from row i (for example r1), so one element becomes 0
and the rest of the elements remain non-negative.
Then subtracting the smallest element from column j (for example c1), so one element becomes 0
and the rest remain non-negative.
Here the path obtained using the above state space tree is: 1 -> 4 -> 2 -> 3 -> 1
function CheckBounds(st, des, cost[n][n])
    Global variable: cost[N][N] - the cost assignment.
    pencost[0] = t
    for i ← 0, n−1 do
        for j ← 0, n−1 do
            reduced[i][j] = cost[i][j]
        end for
    end for
    for j ← 0, n−1 do
        reduced[st][j] = ∞
    end for
    for i ← 0, n−1 do
        reduced[i][des] = ∞
    end for
    reduced[des][st] = ∞
    RowReduction(reduced)
    ColumnReduction(reduced)
    pencost[des] = pencost[st] + row + col + cost[st][des]
    return pencost[des]
end function

function RowMin(cost[n][n], i)
    min = cost[i][0]
    for j ← 0, n−1 do
        if cost[i][j] < min then
            min = cost[i][j]
        end if
    end for
    return min
end function

function ColMin(cost[n][n], j)
    min = cost[0][j]
    for i ← 0, n−1 do
        if cost[i][j] < min then
            min = cost[i][j]
        end if
    end for
    return min
end function

function RowReduction(cost[n][n])
    row = 0
    for i ← 0, n−1 do
        rmin = RowMin(cost, i)
        if rmin ≠ ∞ then
            row = row + rmin
        end if
        for j ← 0, n−1 do
            if cost[i][j] ≠ ∞ then
                cost[i][j] = cost[i][j] − rmin
            end if
        end for
    end for
end function

Fragment of the main branch-and-bound loop:
        cost[p][k] = ∞
    end for
    cost[k][j] = ∞
    RowReduction(cost)
    ColumnReduction(cost)
end while
end function
Complexity Analysis:
Traveling salesman problem is an NP-hard problem. Until now, researchers have not found a
polynomial time algorithm for traveling salesman problem. Among the existing algorithms, dynamic
programming algorithm can solve the problem in time O(n^2*2^n) where n is the number of nodes in
the graph. The branch-and-cut algorithm has been applied to solve the problem with a large number
of nodes. However, branch-and-cut algorithm also has an exponential worst-case running time.
• Lower Bound, L(n), is a property of the specific problem, i.e. sorting problem, MST, matrix
multiplication, not of any particular algorithm solving that problem.
• Lower bound theory says that no algorithm can do the job in fewer than L(n) time units for
arbitrary inputs, i.e., every comparison-based sorting algorithm must take at least L(n) time in
the worst case.
• L(n) is the minimum, over all possible algorithms, of the maximum complexity.
• Upper bound theory says that for any arbitrary inputs, we can always sort in time at most U(n).
How long it would take to solve a problem using one of the known algorithms with worst-case input
gives us an upper bound.
• Improving an upper bound means finding an algorithm with better worst-case performance.
• U(n) is the minimum, over all known algorithms, of the maximum complexity.
• Both upper and lower bounds are minima over the maximum complexity of inputs of size n.
• The ultimate goal is to make these two functions coincide. When this is done, the optimal algorithm
will have L(n) = U(n).
2) Information Theory:
The information theory method establishes lower bounds by computing the limitations on
information gained by a basic operation and then showing how much information is required before
a given problem is solved.
• This is used to show that any possible algorithm for solving a problem must do some minimal
amount of work.
• The most useful principle of this kind is that the outcome of a comparison between two items
contains one bit of information.
PARALLEL ALGORITHM:
A parallel algorithm can be executed simultaneously on many different processing devices and then
combined together to get the correct result. Parallel algorithms are highly useful in processing huge
volumes of data in quick time. This tutorial provides an introduction to the design and analysis of
parallel algorithms. In addition, it explains the models followed in parallel algorithms, their
structures, and implementation.
An algorithm is a sequence of steps that take inputs from the user and after some computation,
produces an output. A parallel algorithm is an algorithm that can execute several instructions
simultaneously on different processing devices and then combine all the individual outputs to
produce the final result.
Concurrent Processing
The easy availability of computers along with the growth of Internet has changed the way we store
and process data. We are living in a day and age where data is available in abundance. Every day we
deal with huge volumes of data that require complex computing and that too, in quick time.
Sometimes, we need to fetch data from similar or interrelated events that occur simultaneously. This
is where we require concurrent processing that can divide a complex task and process it multiple
systems to produce the output in quick time.
Concurrent processing is essential where the task involves processing a huge bulk of complex data.
Examples include accessing large databases, aircraft testing, astronomical calculations, atomic and
nuclear physics, biomedical analysis, economic planning, image processing, robotics, weather
forecasting, web-based services, etc.
Parallelism is the process of processing several set of instructions simultaneously. It reduces the total
computational time. Parallelism can be implemented by using parallel computers, i.e. a computer
with many processors. Parallel computers require parallel algorithm, programming languages,
compilers and operating system that support multitasking.
Unit-5
Syllabus: Binary search trees, height balanced trees, 2-3 trees, B-trees, basic search and traversal
techniques for trees and graphs (In order, preorder, postorder, DFS, BFS), NP-completeness.
Example
The following tree is a Binary Search Tree. In this tree, left subtree of every node contains nodes with
smaller values and right subtree of every node contains larger values.
The following operations are performed on a binary search tree:-1) Search 2) Insertion 3) Deletion
In a binary search tree, the search operation is performed with O(log n) time complexity. The search
operation is performed as follows:-
Step 5: If search element is smaller, then continue the search process in left subtree.
Step 6: If search element is larger, then continue the search process in right subtree.
Step 7: Repeat the same until we find the exact element or we reach a leaf node.
Step 8: If we reach the node with the search value, then display "Element is found" and terminate the
function.
Step 9: If we reach a leaf node and it is also not matching, then display "Element not found" and
terminate the function.
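An iterative Python sketch of the search just described (illustrative only; the Node class with key, left
and right fields is an assumed representation, not one defined in these notes):

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def bst_search(root, key):
    node = root
    while node is not None:
        if key == node.key:
            return node                 # "Element is found"
        node = node.left if key < node.key else node.right
    return None                         # reached a null link: "Element not found"

# Example: root = Node(8, Node(3, Node(1), Node(6)), Node(10));
# bst_search(root, 6) returns the node holding 6, and bst_search(root, 7) returns None.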
AVL tree is a self-balancing Binary Search Tree (BST) where the difference between heights of left and
right subtrees cannot be more than one for all nodes.Most of the BST operations (e.g., search, max,
min, insert, delete.. etc) take O(h) time where h is the height of the BST. The cost of these operations
may become O(n) for a skewed Binary tree. If we make sure that height of the tree remains O(Logn)
after every insertion and deletion, then we can guarantee an upper bound of O(Logn) for all these
operations. The height of an AVL tree is always O(Logn) where n is the number of nodes in the tree.
An AVL tree is a balanced binary search tree. In an AVL tree, balance factor of every node is either -1,
0 or +1.
AVL Tree Rotations: Rotation is the process of moving the nodes to either left or right to make tree
balanced.
There are four rotations and they are classified into two types.
1) Single Rotation
Left rotation
Right rotation
2)Double Rotation
Left-Right rotation
Right-Left rotation
So time complexity of AVL insert is O(Logn). The AVL tree and other self balancing search trees like Red
Black are useful to get all basic operations done in O(Log n) time. The AVL trees are more balanced
compared to Red Black Trees, but they may cause more rotations during insertion and deletion.
Insertion Operation:
1. Insert the new node using recursion, so that while backtracking you visit all the parent nodes to check
whether they are still balanced or not.
2. Every node has a field called height with default value as 1.
3. When a new node is added, its parent node's height gets increased by 1.
4. So as mentioned in step 1, every ancestor's height will get updated while backtracking to the root.
5. At every node the balance factor will also be checked. balance factor = (height of left subtree - height
of right subtree).
6. If the balance factor is -1, 0 or +1, the tree is balanced at that node.
7. If the balance factor is > 1, the tree is not balanced at that node; the left height is more than the right
height, so we need a rotation. (Either Left-Left case or Left-Right case.)
8. Say the current node which we are checking is X: if the new node is less than X.left then it is
the Left-Left case, and if the new node is greater than X.left then it is the Left-Right case. See the
figures above.
9. If the balance factor is < -1, the tree is not balanced at that node; the right height is more than the left
height, so we need a rotation. (Either Right-Right case or Right-Left case.)
10. Say the current node which we are checking is X: if the new node is greater than X.right then it is
the Right-Right case, and if the new node is less than X.right then it is the Right-Left case.
Examples:
An important example of AVL trees is the behavior on a worst-case add sequence for regular binary trees:
1, 2, 3, 4, 5, 6, 7
All insertions are right-right and so rotations are all single rotate from the right. All but two insertions require
re-balancing:
Deletion in AVL Tree: If we want to delete any element from the AVL tree, we can delete it in the same way
as BST deletion. For example, delete 30 in the AVL tree of figure 5.4.
A node with value 30 is being deleted in figure 5.5. After deleting 30, we travel up and find the first
unbalanced node, which is 18. We apply a rotation and shift 18 up to balance the tree; since 18 has to
move up, we perform a left rotation.
2-3 TREES:
A 2-3 Tree is a specific form of a B tree. A 2-3 tree is a search tree. However, it is very different from a binary
search tree.
Here are the properties of a 2-3 tree:
each node has either one value or two values
a node with one value is either a leaf node or has exactly two children (non-null). Values in left subtree
< value in node < values in right subtree
a node with two values is either a leaf node or has exactly three children (non-null). Values in left
subtree < first value in node < values in middle subtree < second value in node < value in right subtree.
all leaf nodes are at the same level of the tree
Insertion algorithm
Insertion into a two-three tree is quite different from the insertion algorithm into a binary search tree. In a two-three
tree, the algorithm will be as follows:
1. If the tree is empty, create a node and put value into the node
2. Otherwise find the leaf node where the value belongs.
3. If the leaf node has only one value, put the new value into the node
4. If the leaf node has more than two values, split the node and promote the median of the three values to
parent.
5. If the parent then has three values, continue to split and promote, forming a new root node if necessary.
Since all leaves are at the same depth and at least half the nodes of the tree are leaves, the time for lookup
is O(log M), where M is the number of key values stored in the tree.
Complexity Analysis
keys are stored only at leaves, ordered left-to-right
non-leaf nodes have 2 or 3 children (never 1)
non-leaf nodes also have leftMax and middleMax values (as well as pointers to children)
all leaves are at the same depth
the height of the tree is O(log N), where N = # nodes in tree
at least half the nodes are leaves, so the height of the tree is also O(log M) for M = # values stored in
tree
the lookup, insert, and delete methods can all be implemented to run in time O(log N), which is also
O(log M)
B-TREE:
B-Tree is a self-balancing search tree. In most of the other self-balancing search trees (like AVL and Red Black
Trees), it is assumed that everything is in main memory. To understand use of B-Trees, we must think of huge
amount of data that cannot fit in main memory. When the number of keys is high, the data is read from disk in
the form of blocks. Disk access time is very high compared to main memory access time. The main idea of
using B-Trees is to reduce the number of disk accesses. Most of the tree operations (search, insert, delete,
max, min, ..etc ) require O(h) disk accesses where h is height of the tree. B-tree is a fat tree. Height of B-Trees
is kept low by putting maximum possible keys in a B-Tree node. Generally, a B-Tree node size is kept equal to
the disk block size. Since h is low for B-Tree, total disk accesses for most of the operations are reduced
significantly compared to balanced Binary Search Trees like AVL Tree, Red Black Tree, ..etc.
Properties of B-Tree
1) All leaves are at same level.
2) A B-Tree is defined by the term minimum degree 't'. The value of t depends upon the disk block size.
3) Every node except root must contain at least t-1 keys. Root may contain minimum 1 key.
4) All nodes (including root) may contain at most 2t – 1 keys.
5) Number of children of a node is equal to the number of keys in it plus 1.
6) All keys of a node are sorted in increasing order. The child between two keys k1 and k2 contains all keys in
the range from k1 to k2.
7) B-Tree grows and shrinks from root which is unlike Binary Search Tree. Binary Search Trees grow downward
and also shrink from downward.
8) Like other balanced Binary Search Trees, time complexity to search, insert and delete is O(logn).
Traversal
Tree traversal (also known as tree search) is a form of graph traversal and refers to the process of
visiting (checking and/or updating) each node in a tree data structure, exactly once. Such traversals are
classified by the order in which the nodes are visited.
This is a very different approach for traversing the graph nodes. The aim of BFS algorithm is to traverse the
graph as close as possible to the root node. Queue is used in the implementation of the breadth first search.
Algorithmic Steps
Step 1: Push the root node into the Queue.
Step 2: Loop until the queue is empty.
Step 3: Remove the node from the Queue.
Step 4: If the removed node has unvisited child nodes, mark them as visited and insert the unvisited children in
the queue.
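A short Python sketch of these steps for an adjacency-list graph (an illustration; collections.deque is
used here as the queue, and the dictionary representation is an assumption):

from collections import deque

def bfs(graph, root):
    # graph: dict mapping each node to a list of its neighbours.
    visited = {root}
    order = []
    queue = deque([root])                 # Step 1: push the root node into the queue
    while queue:                          # Step 2: loop until the queue is empty
        node = queue.popleft()            # Step 3: remove a node from the queue
        order.append(node)
        for child in graph[node]:         # Step 4: enqueue its unvisited children
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return order

# Example: bfs({'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}, 'a')
# returns ['a', 'b', 'c', 'd'].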
NP-COMPLETENESS:
We have been writing about efficient algorithms to solve complex problems, like shortest path, Euler graph,
minimum spanning tree, etc. Those were all success stories of algorithm designers. In this post, failure stories of
computer science are discussed.
Can all computational problems be solved by a computer? There are computational problems that cannot be
solved by algorithms even with unlimited time. For example Turing Halting problem (Given a program and an
input, decide whether the program will eventually halt when run with that input, or will run forever). Alan Turing
proved that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist. A
key part of the proof is that a Turing machine was used as a mathematical definition of a computer and program
(Source: Halting Problem). The status of NP-Complete problems is another failure story: NP-Complete problems are
problems whose status is unknown. No polynomial time algorithm has yet been discovered for any NP-Complete
problem, nor has anybody yet been able to prove that no polynomial-time algorithm exists for any of them. The
interesting part is, if any one of the NP-Complete problems can be solved in polynomial time, then all of them
can be solved.
What are NP, P, NP-complete and NP-Hard problems?
P is set of problems that can be solved by a deterministic Turing machine in Polynomial time.
NP is set of decision problems that can be solved by a Non-deterministic Turing Machine in Polynomial time. P is
subset of NP (any problem that can be solved by deterministic machine in polynomial time can also be solved by
non-deterministic machine in polynomial time) figure 5.1.
Informally, NP is the set of decision problems which can be solved in polynomial time via a "Lucky Algorithm", a
magical algorithm that always makes a right guess among the given set of choices (Source Ref 1).
NP-complete problems are the hardest problems in NP set. A decision problem L is NP-complete if:
1) L is in NP (Any given solution for NP-complete problems can be verified quickly, but there is no efficient
known solution).
2) Every problem in NP is reducible to L in polynomial time (Reduction is defined below).
A problem is NP-Hard if it follows property 2 mentioned above, and it doesn't need to follow property 1. Therefore,
the NP-Complete set is also a subset of the NP-Hard set.
[Figure: relationship between the classes P, NP, NP-Complete and NP-Hard.]