MODULE—2
DIVIDE AND CONQUER
Topics: General method, Binary search, Recurrence equation for divide and conquer, Finding
the maximum and minimum, Merge sort, Quick sort, Strassen's matrix multiplication,
Advantages and Disadvantages of divide and conquer, Decrease and Conquer Approach,
Topological Sort.
1. DIVIDE: A problem's instance is divided into several subproblem instances of the same
problem, ideally of about the same size.
2. RECUR: Solve the subproblems recursively.
3. CONQUER: If necessary, the solutions obtained for the smaller instances are combined to
get a solution to the original instance.
Control abstraction of divide and conquer
"Control abstraction is a procedure whose flow of control is clear but whose primary
operations are specified by other procedures whose precise meanings are left undefined."
Consider the following algorithm DAndC, which is invoked on a problem P to be solved.
Small(P) is a Boolean-valued function that determines whether the input size is small enough that the
answer can be computed without splitting.
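The DAndC algorithm itself is not reproduced in these notes, so the following is a minimal runnable
Python sketch of the control abstraction, assuming the problem-specific operations are passed in as
functions (the names small, solve_small, divide, and combine are illustrative, not from the source):

def d_and_c(p, small, solve_small, divide, combine):
    # Control abstraction: the flow of control is fixed here, while the
    # primary operations are supplied by the caller.
    if small(p):                      # Small(P): solvable without splitting?
        return solve_small(p)         # S(P): compute the answer directly
    subproblems = divide(p)           # DIVIDE: split P into P1, ..., Pk
    solutions = [d_and_c(q, small, solve_small, divide, combine)
                 for q in subproblems]                 # RECUR on each piece
    return combine(solutions)         # CONQUER: combine the sub-solutions

# Example: summing a list of numbers by splitting it in half.
print(d_and_c([3, 1, 4, 1, 5, 9],
              small=lambda p: len(p) <= 1,
              solve_small=lambda p: p[0] if p else 0,
              divide=lambda p: [p[:len(p)//2], p[len(p)//2:]],
              combine=sum))           # prints 23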
If the size of P is n and the sizes of the k subproblems are n1, n2, ..., nk, then the computing time
of DAndC is described by the recurrence relation

T(n) = g(n) when n is small enough to be solved directly,
T(n) = T(n1) + T(n2) + ... + T(nk) + f(n) otherwise,

where g(n) is the time to compute the answer directly for small inputs and f(n) is the time for
dividing P and combining the solutions of the subproblems.
Divide-and-conquer recurrence
In the most typical case of divide-and-conquer a problem’s instance of size n is divided into two
instances of size n/2.
More generally, an instance of size n can be divided into b instances of size n/b, with a of
them needing to be solved. (Here, a and b are constants; a ≥ 1 and b > 1.)
Assuming that size n is a power of b to simplify our analysis, we get the following recurrence for the
running time T(n):

T(n) = aT(n/b) + f(n),

where f(n) is a function that accounts for the time spent on dividing an instance of size n into
instances of size n/b and combining their solutions.
For the problem of summing n numbers by divide and conquer (see the sketch above), a = b = 2 and f(n) = 1.
Obviously, the order of growth of its solution T (n) depends on the values of the constants a and b and
the order of growth of the function f (n).
Master Theorem
The efficiency analysis of many divide-and-conquer algorithms is simplified by the Master Theorem:
if f(n) ∈ Θ(n^d), where d ≥ 0, in the recurrence T(n) = aT(n/b) + f(n), then

T(n) ∈ Θ(n^d) if a < b^d,
T(n) ∈ Θ(n^d log n) if a = b^d,
T(n) ∈ Θ(n^(log_b a)) if a > b^d.

For example, for the summation problem above, a = 2, b = 2, and f(n) = 1 ∈ Θ(n^0), so a > b^d
(2 > 2^0) and T(n) ∈ Θ(n^(log2 2)) = Θ(n).
Binary Search
Let ai (1 ≤ i ≤ n) be a list of elements stored in nondecreasing order. The searching problem is to
determine whether a given element is present in the list. If key x is present, we have to determine
the index j such that aj = x. If x is not in the list, j is set to zero.
Let P = (n, ai, ..., al, x) denote an instance of the binary search problem. Divide and conquer
can be used to solve this problem. Let Small(P) be true if n = 1. In that case, S(P) takes the
value i if x = ai; otherwise it takes the value 0.
If P has more than one element, it can be divided into new subproblems as follows:
pick an index q in the range [i, l] and compare x with aq. There are three possibilities:
(1) x = aq: the problem is solved;
(2) x < aq: x need only be searched for in the sublist ai, ..., aq−1, so P reduces to (q − i, ai, ..., aq−1, x);
(3) x > aq: x need only be searched for in the sublist aq+1, ..., al, so P reduces to (l − q, aq+1, ..., al, x).
If q is always chosen such that aq is the middle element, i.e., q = (n + 1)/2, then the resulting
searching algorithm is known as the Binary Search algorithm.
Recursive Binary search Algorithm:
Algorithm BinarySearch(a, i, l, x)
// Implements the recursive binary search algorithm
// Input: An array a[i..l] sorted in ascending order and a search key x
// Output: If x is present, return j such that x = a[j]; else return 0
{
    if (i = l) then    // Small(P): the list has only one element
    {
        if (x = a[i]) then return i;
        else return 0;
    }
    else
    {   // divide P into two smaller subproblems
        mid = ⌊(i + l)/2⌋;
        if (x = a[mid]) then return mid;
        else if (x < a[mid]) then return BinarySearch(a, i, mid − 1, x);
        else return BinarySearch(a, mid + 1, l, x);
    }
}
An equivalent iterative version keeps two indices low and high bracketing the unsearched portion of
the array and computes mid = ⌊(low + high)/2⌋ on each iteration; the analysis below refers to these
variables.
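As a hedged, runnable illustration (the notes give only pseudocode), here is a Python sketch of both
versions; the function names are ours, indices are 0-based as is usual in Python, and −1 replaces 0 as
the "not found" value:

def binary_search_recursive(a, i, l, x):
    # Recursive binary search on a[i..l]; returns an index j with a[j] == x,
    # or -1 if x is absent.
    if i > l:                      # empty subarray: unsuccessful search
        return -1
    mid = (i + l) // 2
    if x == a[mid]:
        return mid
    elif x < a[mid]:
        return binary_search_recursive(a, i, mid - 1, x)
    else:
        return binary_search_recursive(a, mid + 1, l, x)

def binary_search_iterative(a, x):
    # Iterative binary search over the whole sorted list a.
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2    # middle of the current range
        if x == a[mid]:
            return mid
        elif x < a[mid]:
            high = mid - 1         # continue in the left half
        else:
            low = mid + 1          # continue in the right half
    return -1                      # x is not in the list

a = [-15, -6, 0, 7, 9, 23, 54, 82]
print(binary_search_recursive(a, 0, len(a) - 1, 54))  # 6 (0-based position of 54)
print(binary_search_iterative(a, 11))                 # -1 (unsuccessful search)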
Testing:-
To fully test binary search, we need not be concerned with the actual values stored in a[1:n]. By
varying x sufficiently, we can observe all possible computation sequences of BinarySearch without
taking different values for a. To test all successful searches, x must take on the n values in a. To test
all unsuccessful searches, x need only take on n + 1 different values (one for each gap before,
between, and after the stored values). So the complexity of testing BinarySearch is 2n + 1 searches
for each n.
Analysis:-
We analyze the algorithm in terms of its frequency count (key-operation count) and the space it
requires. Binary search needs space to store the n elements of the array and the variables low, high,
mid, and x, i.e., n + 4 locations.
To find the time complexity of the algorithm: since the comparison count depends on the
specifics of the input, we need to analyze the best-case, average-case, and worst-case efficiencies
separately. We assume only one comparison is needed to determine which of the three
possibilities of the if condition in the algorithm holds. As an example, take an array with 14 elements:
Index:         1    2    3    4    5    6    7    8    9   10   11   12   13   14
Element:     -10   -6   -3   -1    0    2    4    9   12   15   18   19   22   30
Comparisons:   3    4    2    4    3    4    1    4    3    4    2    4    3    4
From the above table we can conclude that no element requires more than 4 comparisons. The
average number of comparisons is the sum of the comparison counts divided by the number of
elements, i.e., 45/14 ≈ 3.21 comparisons per successful search.
There are 15 possible ways that an unsuccessful search may terminate, depending on the value of x.
If x < a[1], the algorithm requires 3 comparisons to determine that x is not present. For all the
remaining 14 possibilities the algorithm requires 4 element comparisons. Thus the average number
of comparisons for an unsuccessful search is (3 + 14 × 4)/15 = 59/15 ≈ 3.93.
These comparison counts can be read off a binary decision tree in which the value in each node is the
value of mid. The decision tree for n = 14 (figure omitted here) has 7 at the root; 3 and 11 on level 2;
1, 5, 9, and 13 on level 3; and 2, 4, 6, 8, 10, 12, and 14 on level 4.
Theorem 1:
If n is in the range [2^(k−1), 2^k), then BinarySearch makes at most k element comparisons for a
successful search and either k − 1 or k comparisons for an unsuccessful search. (That is, the time for
a successful search is O(log n) and for an unsuccessful search is Θ(log n).)
Proof:
Consider the binary decision tree that describes the action of BinarySearch on n elements. All
successful searches end at a circular node, whereas all unsuccessful searches end at a square node.
If 2^(k−1) ≤ n < 2^k, then all circular nodes are at levels 1, 2, ..., k, whereas all square nodes are at
levels k and k + 1 (the root node is at level 1). The number of element comparisons needed to
terminate at a circular node on level i is i, while the number needed to terminate at a square node at
level i is only i − 1. The theorem follows.
Analysis:
The input-size parameter is n; the basic operation is the key comparison.
Worst case: occurs when the array does not contain the key element or when the maximum number
of comparisons is needed. After one comparison the problem reduces to an instance of size ⌊n/2⌋, so

Cworst(n) = Cworst(⌊n/2⌋) + 1 for n > 1, Cworst(1) = 1.

Considering n = 2^k gives Cworst(2^k) = k + 1 = log2 n + 1.
Therefore Cworst(n) = ⌊log2 n⌋ + 1 ∈ Θ(log n).
Example 1:
Trace the recursive calls of BinarySearch(a, 1, 8, 7) for the recursive algorithm BinarySearch(a, i, l, x)
defined above, i.e., search for the key x = 7 in the 8-element array below.
Solution:

Index:      1    2    3    4    5    6    7    8
Element:  -15   -6    0    7    9   23   54   82

Here mid = ⌊(1 + 8)/2⌋ = 4 and a[4] = 7 = x, so the search succeeds at position 4 on the first call.
Example 2:
Consider the same array for binary search with key x = 54:
-15, -6, 0, 7, 9, 23, 54, 82 and key = 54.
Solution:

Index:      1    2    3    4    5    6    7    8
Element:  -15   -6    0    7    9   23   54   82

Compute mid = ⌊(1 + 8)/2⌋ = 4.
Compare the key x with the middle element: a[mid] = a[4] = 7, which is not equal to the key 54.
Is x < a[mid]? No. Since x > a[mid] (54 > 7), call BinarySearch(a, mid + 1, l, x):
Index:     5    6    7    8
Element:   9   23   54   82

Compute mid = ⌊(5 + 8)/2⌋ = 6. Since a[6] = 23 and x = 54 > 23, call BinarySearch(a, 7, 8, 54):

Index:    7    8
Element: 54   82

Compute mid = ⌊(7 + 8)/2⌋ = 7.
a[7] = 54 matches the key, so return success with index 7: key 54 is found at position 7 in the array.
Merge Sort
This problem is one of the best examples of divide and conquer. Given a sequence of n elements
a[1], a[2], ..., a[n], the merge sort algorithm splits it into two sets a[1], ..., a[⌊n/2⌋] and
a[⌊n/2⌋+1], ..., a[n]. Each set is individually sorted, and the resulting sorted sets are merged to get a
single sorted array of n elements.
Procedure:
Divide: Partition the array into two sublists.
Conquer: Sort each of the two sublists.
Combine: Merge the sorted sublists into one sorted array (a runnable sketch follows below).
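The notes do not reproduce the MergeSort pseudocode itself, so here is a minimal Python sketch of
the scheme just described; the function names merge_sort and merge are ours:

def merge_sort(a):
    # Sort list a by divide and conquer; returns a new sorted list.
    if len(a) <= 1:                      # a list of 0 or 1 elements is sorted
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])           # sort the first half
    right = merge_sort(a[mid:])          # sort the second half
    return merge(left, right)            # combine the two sorted halves

def merge(b, c):
    # Merge two sorted lists into one sorted list.
    result, i, j = [], 0, 0
    while i < len(b) and j < len(c):     # one key comparison per iteration
        if b[i] <= c[j]:
            result.append(b[i]); i += 1
        else:
            result.append(c[j]); j += 1
    return result + b[i:] + c[j:]        # append the leftover tail

print(merge_sort([8, 3, 2, 9, 7, 1, 5, 4]))  # [1, 2, 3, 4, 5, 7, 8, 9]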
Analysis
Assuming for simplicity that n is a power of 2, the recurrence relation for the number of key
comparisons C(n) is
C(n) = 2C(n/2) + Cmerge(n) for n > 1, C(1) = 0.
Let us analyze Cmerge(n), the number of key comparisons performed during the merging stage. At
each step, exactly one comparison is made, after which the total number of elements in the two arrays
still needing to be processed is reduced by 1.
In the worst case, neither of the two arrays becomes empty before the other one contains just
one element (e.g., smaller elements may come from the alternating arrays).
Therefore, for the worst case Cmerge(n) = n − 1, and we have the recurrence

Cworst(n) = 2Cworst(n/2) + (n − 1) for n > 1, Cworst(1) = 0.

Equivalently, the running time T(n) of merge sort satisfies

T(n) = a for n = 1 (a a constant),
T(n) = 2T(n/2) + cn for n > 1 (c a constant).

Solving by repeated substitution, with n = 2^k:

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = ...
     = 2^k T(n/2^k) + kcn
     = nT(1) + cn log2 n    // n = 2^k, so k = log2 n
     = an + cn log2 n.

If 2^k < n ≤ 2^(k+1), then T(n) ≤ T(2^(k+1)). So T(n) ∈ O(n log n); more precisely, T(n) ∈ Θ(n log n).
Quick sort
Quick sort is the other important sorting algorithm that is based on the divide-and-conquer
approach. Unlike merge sort, which divides its input elements according to their position in the
array, quick sort divides them according to their value. It rearranges the elements of A[l..r] to
achieve a partition: an arrangement in which some element A[s] is in its final position, all the
elements to its left are no larger than A[s], and all the elements to its right are no smaller than A[s].
After a partition is achieved, A[s] will be in its final position in the sorted array, and we can continue
sorting the two subarrays to the left and to the right of A[s] independently (e.g., by the same
method).
Note the difference with merge sort: there, the division of the problem into two sub problems is
immediate and the entire work happens in combining their solutions; here, the entire work happens
in the division stage, with no work required to combine the solutions to the sub problems.
Here is pseudo code of quick sort:
ALGORITHM Quicksort(A[l..r])
//Sorts a subarray by quicksort
//Input: Subarray of array A[0..n − 1], defined by its left and right
// indices l and r
//Output: Sub array A[l..r] sorted in non decreasing order
if l < r
s ←Partition(A[l..r]) //s is a split position
Quicksort(A[l..s − 1])
Quicksort(A[s + 1..r])
After selecting the first element as the pivot p = A[l], HoarePartition scans the subarray from both
ends: the left-to-right scan (index i) stops on the first element not smaller than p, and the
right-to-left scan (index j) stops on the first element not larger than p. Three situations can arise:
a) If i < j, we exchange A[i] and A[j] and resume the scans.
b) If i > j, the scanning indices have crossed over, and we partition the subarray by exchanging the
pivot with A[j]; j is returned as the split position.
c) Finally, if the scanning indices stop while pointing to the same element, i.e., i = j, the value they
are pointing to must be equal to p (why?). Thus, we have the subarray partitioned, with the split
position s = i = j.
We can combine the last case with the case of crossed-over indices (i > j) by exchanging the pivot
with A[j] whenever i ≥ j. Here j is the split position: to the left of this position the elements are no
larger than the pivot, and to the right they are no smaller. These two parts are the subarrays that
are then sorted separately.
ALGORITHM HoarePartition(A[l..r])
//Partitions a subarray by Hoare's algorithm, using the first element
// as a pivot
//Input: Subarray of array A[0..n − 1], defined by its left and right
// indices l and r (l < r)
//Output: Partition of A[l..r], with the split position returned as
// this function's value
p ← A[l]
i ← l; j ← r + 1
repeat
    repeat i ← i + 1 until A[i] ≥ p
    repeat j ← j − 1 until A[j] ≤ p
    swap(A[i], A[j])
until i ≥ j
swap(A[i], A[j]) //undo the last swap when i ≥ j
swap(A[l], A[j])
return j
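As a hedged, runnable companion to the two routines above, here is a Python version using 0-based
indices (the names are ours); the inner scans are bounds-checked instead of relying on a sentinel:

def hoare_partition(A, l, r):
    # Partition A[l..r] around the pivot A[l]; return the split position.
    p = A[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and A[i] < p:   # left-to-right scan, stops on A[i] >= p
            i += 1
        j -= 1
        while A[j] > p:              # right-to-left scan; stops at A[l] = p at worst
            j -= 1
        if i >= j:                   # indices crossed (or met): partition is done
            break
        A[i], A[j] = A[j], A[i]      # case a): exchange and resume the scans
    A[l], A[j] = A[j], A[l]          # put the pivot into its final position s = j
    return j

def quicksort(A, l, r):
    # Sort A[l..r] in place.
    if l < r:
        s = hoare_partition(A, l, r)  # A[s] is now in its final position
        quicksort(A, l, s - 1)
        quicksort(A, s + 1, r)

a = [5, 3, 1, 9, 8, 2, 4, 7]
quicksort(a, 0, len(a) - 1)
print(a)  # [1, 2, 3, 4, 5, 7, 8, 9]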
Analysis:
We start our discussion of quick sort's efficiency by noting that the number of key comparisons
made before a partition is achieved is n + 1 if the scanning indices cross over and n if they coincide.
Best case: If all the splits happen in the middle of the corresponding subarrays, we have the best
case. The number of key comparisons in the best case satisfies the recurrence
Cbest(n) = 2Cbest(n/2) + n for n > 1, Cbest(1) = 0.
According to the Master Theorem, Cbest(n) ∈ Θ(n log2 n); solving it exactly for n = 2^k yields
Cbest(n) = n log2 n.
Worst case: In the worst case, all the splits will be skewed to the extreme: one of the two subarrays
will be empty, and the size of the other will be just 1 less than the size of the subarray being
partitioned. This happens, in particular, for increasing arrays: after making n + 1 comparisons to get
to this partition and exchanging the pivot A[0] with itself, the algorithm will be left with the strictly
increasing array A[1..n − 1] to sort. The total number of key comparisons made will be equal to

Cworst(n) = (n + 1) + n + ... + 3 = (n + 1)(n + 2)/2 − 3 ∈ Θ(n^2).
Average case: Let Cavg(n) be the average number of key comparisons made by quicksort on a
randomly ordered array of size n. A partition can happen in any position s (0 ≤ s ≤ n−1) after
n + 1 comparisons are made to achieve the partition. After the partition, the left and right subarrays
will have s and n − 1− s elements, respectively. Assuming that the partition split can happen in each
position s with the same probability 1/n, we get the following recurrence relation:
Cavg(n) = (1/n) Σ_{s=0}^{n−1} [(n + 1) + Cavg(s) + Cavg(n − 1 − s)] for n > 1,
Cavg(0) = 0, Cavg(1) = 0.
Its solution, which is much trickier than the worst- and best-case analyses, turns out to be
Cavg(n) ≈ 2n ln n ≈ 1.39n log2 n.
Thus, on the average, quicksort makes only 39% more comparisons than in the best case. Moreover,
its innermost loop is so efficient that it usually runs faster than mergesort on randomly ordered
arrays of nontrivial sizes. This certainly justifies the name given to the algorithm by its inventor.
The Θ(n^2) worst-case bound can also be derived from a recurrence for the running time. When one
of the two subarrays is always empty,

T(n) = 0 if n = 1
∴ T(n) = T(0) + T(n−1) + Cn = T(n−1) + Cn // T(0) = 0; partitioning costs Cn

By the method of backward substitution:

T(n) = T(n−1) + Cn
     = T(n−2) + C(n−1) + Cn
     = T(n−3) + C(n−2) + C(n−1) + Cn
     = T(n−4) + C(n−3) + C(n−2) + C(n−1) + Cn
     = ...
     = C(2 + 3 + ... + n)

Hence T(n) ∈ Θ(n^2).
Average Case Analysis:- The average case of quick sort arises for typical (random) real-world
input, where the array is neither split into two exactly equal parts as in the best case nor skewed to
one side as in the worst case. Solving the average-case recurrence above gives
T(n) ≈ 2(n + 1) ln(n + 1) ∈ Θ(n log n), consistent with Cavg(n) ≈ 1.39n log2 n.
Strassen's Matrix Multiplication
Let A and B be two n x n matrices. Their product C = AB is the n x n matrix whose elements are
defined by the formula

C(i, j) = Σ_{k=1}^{n} A(i, k) · B(k, j).

To compute C(i, j) using this formula, we need n multiplications. As the matrix C has n^2 elements,
the time for the resulting matrix multiplication algorithm is Θ(n^3).
The divide and conquer strategy suggests another way to compute the product of two n x n
matrices. For simplicity we assume that n is a power of two, that is, that there exists a nonnegative
integer k such that n = 2^k. In case n is not a power of two, enough rows and columns of zeros
can be added to both A and B so that the resulting dimensions are a power of two. Let A and B
each be partitioned into four square submatrices, each submatrix having dimensions n/2 x n/2:

A = | A11 A12 |    B = | B11 B12 |
    | A21 A22 |        | B21 B22 |

Then the product AB can be computed by using the above formula for the product of two 2 x 2
matrices:

C11 = A11·B11 + A12·B21    C12 = A11·B12 + A12·B22
C21 = A21·B11 + A22·B21    C22 = A21·B12 + A22·B22

To compute AB using these formulas, we need to perform 8 multiplications of n/2 x n/2
matrices and 4 additions of n/2 x n/2 matrices. Since two n/2 x n/2 matrices can be added in time
cn^2 for a constant c, T(n) is given by the recurrence

T(n) = b for n ≤ 2,
T(n) = 8T(n/2) + cn^2 for n > 2,

where b and c are constants. Solving this recurrence gives T(n) ∈ O(n^3), no better than the
ordinary method. Strassen showed that the Cij can be computed using only 7 multiplications (and
18 additions or subtractions) of n/2 x n/2 matrices, which leads to T(n) = 7T(n/2) + an^2 and hence
T(n) ∈ O(n^(log2 7)) ≈ O(n^2.81).
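To make the idea concrete, here is a compact Python sketch of Strassen's method for n a power of
two; the helper names add, sub, and strassen are ours, and a practical implementation would fall
back to the classical algorithm below some cutoff size:

def add(X, Y): return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
def sub(X, Y): return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    n = len(A)
    if n == 1:                      # base case: 1 x 1 matrices
        return [[A[0][0] * B[0][0]]]
    h = n // 2                      # split each matrix into four h x h blocks
    A11 = [r[:h] for r in A[:h]]; A12 = [r[h:] for r in A[:h]]
    A21 = [r[:h] for r in A[h:]]; A22 = [r[h:] for r in A[h:]]
    B11 = [r[:h] for r in B[:h]]; B12 = [r[h:] for r in B[:h]]
    B21 = [r[:h] for r in B[h:]]; B22 = [r[h:] for r in B[h:]]
    # Strassen's seven products of h x h matrices
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)   # C11 = M1 + M4 - M5 + M7
    C12 = add(M3, M5)                     # C12 = M3 + M5
    C21 = add(M2, M4)                     # C21 = M2 + M4
    C22 = add(sub(add(M1, M3), M2), M6)   # C22 = M1 - M2 + M3 + M6
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]     # reassemble the blocks
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]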
The divide and conquer method is a top-down technique for designing algorithms that consists of
dividing the problem into smaller subproblems, hoping that the solutions of the subproblems are
easier to find. The solutions of all the smaller problems are then combined to get a solution to the
original problem.
Advantages:
A difficult problem is broken down into subproblems, and each subproblem is solved separately
and independently. This makes it easier to obtain solutions to difficult problems.
The technique facilitates the discovery of new efficient algorithms, for example quick sort and
merge sort.
The subproblems can be executed in parallel on multiple processors, which can reduce the
overall running time.
Disadvantages
A large number of sublists is created and must be processed.
The technique relies on recursive methods, and recursion adds overhead and complexity.
Deep recursion can cause difficulties for large input sizes.
DECREASE AND CONQUER APPROACH
General method
This technique is based on exploiting the relationship between a solution to a given instance of a
problem and a solution to a smaller instance of the same problem. Once such relationship is
established, it can be exploited either top down (recursively) or bottom up (without a recursion).
There are three major variations of decrease-and-conquer:
Decrease by a constant.
Decrease by a constant factor.
Variable size decrease.
1. Decrease by a constant: This technique suggests reducing a problem's instance by the same
constant (for example, one) on each iteration of the algorithm.
2. Decrease by a constant factor: This technique suggests reducing a problem's instance by the
same constant factor (typically two, i.e., by half) on each iteration of the algorithm.
3. Variable size decrease: Here the size-reduction pattern varies from one iteration to another. A
classic example is Euclid's algorithm for the greatest common divisor, gcd(m, n) = gcd(n, m mod n):
though the value of the second argument is always smaller on the right-hand side than on the left-hand
side, it decreases neither by a constant nor by a constant factor.
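A minimal Python sketch of this variable-size-decrease example, assuming the standard recurrence
gcd(m, n) = gcd(n, m mod n):

def gcd(m, n):
    # Euclid's algorithm: the instance size (the second argument) shrinks by a
    # varying amount each iteration -- neither by a constant nor by a constant
    # factor.
    while n != 0:
        m, n = n, m % n   # reduce (m, n) to the smaller instance (n, m mod n)
    return m

print(gcd(60, 24))  # 12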
TOPOLOGICAL SORTING
Depth-first search and breadth-first search are the principal traversal algorithms for digraphs as
well, but the structure of the corresponding forests can be more complex than for undirected graphs.
Even a simple example (figure omitted here) can exhibit all four types of edges possible in a DFS
forest of a directed graph: tree edges (ab, bc, de), back edges (ba) from vertices to their ancestors,
forward edges (ac) from vertices to their descendants in the tree other than their children, and cross
edges (dc), which are none of the aforementioned types.
Note that a back edge in a DFS forest of a directed graph can connect a vertex to its parent. Whether
or not that is the case, the presence of a back edge indicates that the digraph has a directed cycle.
A directed cycle in a digraph is a sequence of three or more of its vertices that starts and ends
with the same vertex and in which every vertex is connected to its immediate predecessor by an edge
directed from the predecessor to the successor.
If a DFS forest of a digraph has no back edges, the digraph is a dag, an acronym for
directed acyclic graph.
For a dag, we can list its vertices in such an order that for every edge in the graph, the vertex where
the edge starts is listed before the vertex where the edge ends. This problem is called topological
sorting.
There are two methods to implement this algorithm.
1. DFS Method
2. Source Removal Method.
1. Algorithm 1 (DFS method):
Procedure:
Perform a DFS traversal and note the order in which vertices become dead-ends (i.e., popped off the
traversal stack). Reversing this order yields a solution to the topological sorting problem, provided, of
course, no back edge has been encountered during the traversal. If a back edge has been encountered,
the digraph is not a dag, and topological sorting of its vertices is impossible.
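A hedged Python sketch of the DFS method just described (the graph representation, names, and
example vertices are ours): vertices are appended to a list as they become dead ends, the reversed
list is a topological order, and a back edge, detected via a "gray" state, means the digraph is not a dag.

def topological_sort(graph):
    # DFS-based topological sort of a digraph given as {vertex: [successors]}.
    # Returns the vertices in topological order, or raises ValueError if the
    # digraph has a directed cycle (a back edge is encountered).
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on stack / dead end
    color = {v: WHITE for v in graph}
    order = []                            # vertices in dead-end (pop-off) order

    def dfs(v):
        color[v] = GRAY
        for w in graph[v]:
            if color[w] == GRAY:          # back edge: a directed cycle exists
                raise ValueError("not a dag: back edge to %s" % w)
            if color[w] == WHITE:
                dfs(w)
        color[v] = BLACK                  # v became a dead end
        order.append(v)

    for v in graph:
        if color[v] == WHITE:
            dfs(v)
    return order[::-1]                    # reverse the dead-end order

# Example dag: c1 -> c3, c2 -> c3, c3 -> c4, c3 -> c5, c4 -> c5
g = {'c1': ['c3'], 'c2': ['c3'], 'c3': ['c4', 'c5'], 'c4': ['c5'], 'c5': []}
print(topological_sort(g))  # ['c2', 'c1', 'c3', 'c4', 'c5']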