DAA Merged
Given N items where each item has some weight and profit associated with it and also
given a bag with capacity W, i.e., the bag can hold at most W weight in it. The task is
to put the items into the bag such that the sum of profits associated with them is the
maximum possible.
Note: The constraint here is that we can either put an item completely into the bag or
not put it in at all. It is not possible to put only a part of an item into the bag.
Examples:
Input: N = 3, W = 4, profit[] = {1, 2, 3}, weight[] = {4, 5, 1}
Output: 3
Explanation: There are two items which have weight less than or equal to
4. If we select the item with weight 4, the possible profit is 1. And if we
select the item with weight 1, the possible profit is 3. So the maximum
possible profit is 3. Note that we cannot put both the items with weight 4
and 1 together as the capacity of the bag is 4.
Input: N = 3, W = 3, profit[] = {1, 2, 3}, weight[] = {4, 5, 6}
Output: 0
Recursion Approach for 0/1 Knapsack Problem:
To solve the problem follow the below idea:
A simple solution is to consider all subsets of items and calculate the total
weight and profit of all subsets. Consider the only subsets whose total
weight is smaller than W. From all such subsets, pick the subset with
maximum profit.
Optimal Substructure: To consider all subsets of items, there can be two
cases for every item.
Case 1 (include the Nth item): The profit of the Nth item plus the maximum value
obtained from the remaining N-1 items and the remaining weight, i.e. (W - weight of the
Nth item).
Case 2 (exclude the Nth item): Maximum value obtained by N-1 items and W
weight.
Recursion Tree for 0-1 Knapsack
If the weight of the Nth item is greater than W, then the Nth item cannot be
included, and Case 2 is the only option.
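A minimal recursive sketch of this case analysis in Python (the function and variable names below are illustrative, not taken from the original text):

# Recursive 0/1 knapsack: maximum profit using the first n items with remaining capacity W.
def knapsack(W, weight, profit, n):
    # Base case: no items left or no capacity left
    if n == 0 or W == 0:
        return 0
    # If the weight of the nth item exceeds W, it cannot be included (Case 2 only)
    if weight[n - 1] > W:
        return knapsack(W, weight, profit, n - 1)
    # Otherwise take the better of including or excluding the nth item
    include = profit[n - 1] + knapsack(W - weight[n - 1], weight, profit, n - 1)
    exclude = knapsack(W, weight, profit, n - 1)
    return max(include, exclude)

# Example from above: N = 3, W = 4, profit = [1, 2, 3], weight = [4, 5, 1] -> 3
print(knapsack(4, [4, 5, 1], [1, 2, 3], 3))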
Note: The above function using recursion computes the same subproblems again and
again. See the following recursion tree, K(1, 1) is being evaluated twice.
As there are repetitions of the same subproblem again and again we can implement
the following idea to solve the problem.
If we get a subproblem the first time, we can solve this problem by creating a 2-
D array that can store a particular state (n, w). Now if we come across the same
state (n, w) again instead of calculating it in exponential complexity we can
directly return its result stored in the table in constant time.
To solve the problem follow the below idea: Since subproblems are evaluated again,
this problem has Overlapping Sub-problems property. So the 0/1 Knapsack problem
has both properties of a dynamic programming problem. Like other typical Dynamic
Programming(DP) problems, re-computation of the same subproblems can be
avoided by constructing a temporary array K[][] in a bottom-up manner.
Illustration:
Below is the illustration of the above approach:
Let, weight[] = {1, 2, 3}, profit[] = {10, 15, 40}, Capacity = 6
If no element is filled (the row for item 0), the possible profit is 0 for every capacity.
Filling the items one by one, bottom-up, gives the table K[item][weight]:
weight⇢        0    1    2    3    4    5    6
item⇣
0              0    0    0    0    0    0    0
1 (w=1)        0   10   10   10   10   10   10
2 (w=2)        0   10   15   25   25   25   25
3 (w=3)        0   10   15   40   50   55   65
The answer is the last cell, K[3][6] = 65.
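The table above can be produced by a short bottom-up implementation. A sketch in Python (function and array names are illustrative, not from the original):

# Bottom-up 0/1 knapsack: K[i][w] = best profit using the first i items with capacity w.
def knapsack_dp(W, weight, profit):
    n = len(weight)
    K = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            K[i][w] = K[i - 1][w]                        # exclude item i
            if weight[i - 1] <= w:                       # include item i if it fits
                K[i][w] = max(K[i][w],
                              profit[i - 1] + K[i - 1][w - weight[i - 1]])
    return K

K = knapsack_dp(6, [1, 2, 3], [10, 15, 40])
print(K[3][6])   # 65, matching the last cell of the table above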
Dynamic Programming
Dynamic programming is an optimization technique used for solving problems that exhibit
the following two characteristics:
1. Optimal Substructure: Optimal substructure means that we combine the optimal results of
subproblems to achieve the optimal result of the bigger problem.
Example: Consider the problem of finding the minimum cost path in a weighted graph from a
source node to a destination node. We can break this problem down into smaller subproblems:
Find the minimum cost path from the source node to each intermediate node.
Find the minimum cost path from each intermediate node to the destination node.
The solution to the larger problem (finding the minimum cost path from the source node to the
destination node) can be constructed from the solutions to these smaller subproblems.
2. Overlapping Subproblems: The same subproblems are solved repeatedly in different parts of
the problem.
Example: Consider the problem of computing the Fibonacci series. To compute the Fibonacci
number at index n, we need to compute the Fibonacci numbers at indices n-1 and n-2. This
means that the subproblem of computing the Fibonacci number at index n-1 is used twice in the
solution to the larger problem of computing the Fibonacci number at index n.
Fibonacci Series using Dynamic Programming
Subproblems: F(0), F(1), F(2), F(3), …
Store Solutions: Create a table to store the values of F(n) as they are calculated.
Build Up Solutions: For F(n), look up F(n-1) and F(n-2) in the table and add them.
Avoid Redundancy: The table ensures that each subproblem (e.g., F(2)) is solved only
once.
By using DP, we can efficiently calculate the Fibonacci sequence without having to recompute
subproblems.
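A small sketch of this tabulation in Python (the function name fib_dp is illustrative):

# Bottom-up Fibonacci: each F(i) is computed once and stored in a table.
def fib_dp(n):
    if n < 2:
        return n
    table = [0] * (n + 1)
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]   # look up F(i-1) and F(i-2)
    return table[n]

print(fib_dp(10))   # 55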
Longest Common Subsequence (LCS): Finds the longest common subsequence between
two strings.
Shortest Path in a Graph: Finds the shortest path between two nodes in a graph.
Knapsack Problem: Determines the maximum value of items that can be placed in a
knapsack with a given capacity.
Matrix Chain Multiplication: Optimizes the order of matrix multiplication to minimize the
number of operations.
Fibonacci Sequence: Calculates the nth Fibonacci number.
Advantages of Dynamic Programming (DP)
Avoids recomputing the same subproblems multiple times, leading to significant time
savings.
Ensures that the optimal solution is found by considering all possible combinations.
Breaks down complex problems into smaller, more manageable subproblems.
Applications of Dynamic Programming (DP)
Matrix Chain Multiplication
Example of Matrix Chain Multiplication: We are given the dimension sequence {4, 10, 3, 12, 20, 7}.
The matrices have sizes 4 x 10, 10 x 3, 3 x 12, 12 x 20 and 20 x 7. We need to compute
M[i, j] for 1 ≤ i ≤ j ≤ 5. We know M[i, i] = 0 for all i.
Let us proceed by working away from the diagonal. First we compute the optimal cost for every
product of 2 consecutive matrices.
Here p0 to p5 are the dimensions and Mi is the matrix of size p(i-1) x p(i).
On the basis of this sequence we apply the recurrence: all possible splits are tried, and the
combination with the minimum cost is taken into consideration.
1. m(1, 2) = M1 x M2 : 4 x 10 x 3 = 120
2. m(2, 3) = M2 x M3 : 10 x 3 x 12 = 360
3. m(3, 4) = M3 x M4 : 3 x 12 x 20 = 720
4. m(4, 5) = M4 x M5 : 12 x 20 x 7 = 1680
1. We initialize the diagonal elements, where i = j, with 0.
2. After that the second diagonal (chains of length 2) is filled with the values computed above.
Now the third diagonal (chains of length 3) is solved in the same way.
M[1, 3] = M1 M2 M3
1. There are two cases by which we can solve this multiplication: (M1 x M2) x M3 and M1 x (M2 x M3).
2. After solving both cases we choose the case with the minimum cost.
M[1, 3] = 264
Comparing both outputs, 264 is the minimum, so we insert 264 in the table, and (M1 x
M2) x M3 is the parenthesization chosen for this product.
M[2, 4] = M2 M3 M4
1. There are two cases by which we can solve this multiplication: (M2 x M3) x M4 and M2 x (M3 x M4).
2. After solving both cases we choose the case with the minimum cost.
M[2, 4] = 1320
Comparing both outputs, 1320 is the minimum, so we insert 1320 in the table, and M2 x
(M3 x M4) is the parenthesization chosen for this product.
M[3, 5] = M3 M4 M5
1. There are two cases by which we can solve this multiplication: (M3 x M4) x M5 and M3 x (M4 x M5).
2. After solving both cases we choose the case with the minimum cost.
M[3, 5] = 1140
Comparing both outputs, 1140 is the minimum, so we insert 1140 in the table, and (M3 x
M4) x M5 is the parenthesization chosen for this product.
Now the products of 4 matrices:
M[1, 4] = M1 M2 M3 M4
The possible splits are M1 x (M2 M3 M4), (M1 M2) x (M3 M4) and (M1 M2 M3) x M4.
After solving these cases we choose the case with the minimum cost.
M[1, 4] = 1080
Comparing the outputs of the different cases, 1080 is the minimum, so we insert 1080 in
the table, and (M1 x M2) x (M3 x M4) is the parenthesization chosen for this product.
M[2, 5] = M2 M3 M4 M5
The possible splits are M2 x (M3 M4 M5), (M2 M3) x (M4 M5) and (M2 M3 M4) x M5.
M[2, 5] = 1350
Comparing the outputs of the different cases, 1350 is the minimum, so we insert 1350 in
the table, and M2 x (M3 x M4 x M5) is the parenthesization chosen for this product.
M[1, 5] = M1 M2 M3 M4 M5
1. M1 x (M2 M3 M4 M5)
2. (M1 M2) x (M3 M4 M5)
3. (M1 M2 M3) x (M4 M5)
4. (M1 M2 M3 M4) x M5
After solving these cases we choose the case with the minimum cost.
M[1, 5] = 1344
Comparing the outputs of the different cases, 1344 is the minimum, so we insert 1344 in
the table, and (M1 x M2) x (M3 x M4 x M5) is the parenthesization chosen for the final product.
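The same computation can be checked with a short bottom-up implementation. A sketch in Python (the function name matrix_chain_order is illustrative); for the dimension sequence {4, 10, 3, 12, 20, 7} it reproduces the value 1344 obtained above:

import sys

# m[i][j] = minimum scalar multiplications to compute Mi..Mj (1-indexed),
# s[i][j] = the split position k chosen for that product.
def matrix_chain_order(p):
    n = len(p) - 1                      # number of matrices
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):      # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = sys.maxsize
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return m, s

m, s = matrix_chain_order([4, 10, 3, 12, 20, 7])
print(m[1][5])   # 1344
print(s[1][5])   # 2, i.e. the optimal split is (M1 M2) x (M3 M4 M5)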
Longest Common Subsequence (LCS)
Given two strings, S1 and S2, the task is to find the length of the Longest Common
Subsequence. If there is no common subsequence, return 0. A subsequence is a string
generated from the original string by deleting 0 or more characters and without changing the
relative order of the remaining characters. For example, subsequences of “ABC” are “”, “A”,
“B”, “C”, “AB”, “AC”, “BC” and “ABC”. In general, a string of length n has 2^n subsequences.
LCS problem has great applications like diff utility (find the difference between two files) that
we use in our day to day software development.
Examples:
Input: S1 = “ABC”, S2 = “ACD”
Output: 2
Explanation: The longest subsequence which is present in both strings is “AC”.
The idea is to compare the last characters of the two strings S1 and S2. Two cases arise:
1. Match: Make the recursive call for the remaining strings (strings of lengths m-1 and n-1)
and add 1 to the result.
2. Do not Match: Make two recursive calls. First for lengths m-1 and n, and second for m
and n-1. Take the maximum of the two results.
Base case: If either of the strings becomes empty, we return 0.
LCS(“A”, “AC”) = max( LCS(“”, “AC”), LCS(“A”, “A”) ) = max( 0, 1 + LCS(“”, “”) ) = 1
LCS(“AB”, “A”) = max( LCS(“A”, “A”), LCS(“AB”, “”) ) = max( 1 + LCS(“”, “”), 0 ) = 1
So the overall result is 1 + 1 = 2.
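A direct recursive sketch of these two cases (Python; the function name lcs is illustrative):

# Plain recursive LCS on the prefixes S1[0..m-1] and S2[0..n-1].
def lcs(S1, S2, m, n):
    if m == 0 or n == 0:          # base case: an empty string
        return 0
    if S1[m - 1] == S2[n - 1]:    # match: count the character and recurse on both prefixes
        return 1 + lcs(S1, S2, m - 1, n - 1)
    # no match: drop the last character of one string or the other
    return max(lcs(S1, S2, m - 1, n), lcs(S1, S2, m, n - 1))

print(lcs("ABC", "ACD", 3, 3))    # 2 ("AC")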
Using Recursion Tree Method:
If we notice carefully, we can observe that the above recursive solution holds the following two
properties:
1. Optimal Substructure:
For solving the structure of L(S1[0, 1, . . ., m-1], S2[0, 1, . . . , n-1]) we are taking the
help of the substructures of S1[0, 1, …, m-2], S2[0, 1,…, n-2], depending on the
situation (i.e., using them optimally) to find the solution of the whole.
2. Overlapping Subproblems:
If we use the above recursive approach for strings “AXYT” and “AYZX“, we will
get a partial recursion tree as shown below. Here we can see that the subproblem
L(“AXY”, “AYZ”) is being calculated more than once. If the total tree is considered
there will be several such overlapping subproblems.
•There are two parameters that change in the recursive solution and these parameters go from
0 to m and 0 to n. So we create a 2D array of size (m+1) x (n+1).
•We initialize this array as -1 to indicate nothing is computed initially.
•Now we modify our recursive solution to first do a lookup in this table and if the value is -1,
then only make recursive calls. This way we avoid re-computations of the same subproblems.
Time Complexity: O(m * n), where m and n are the lengths of strings S1 and S2.
Auxiliary Space: O(m * n)
Time Complexity: O(m * n) which is much better than the worst-case time complexity of Naive
Recursive implementation.
Auxiliary Space: O(m * n) because the algorithm uses an array of size (m+1)*(n+1) to store the
length of the common subsequence.
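A bottom-up sketch consistent with these bounds (Python; the table name dp is illustrative):

# dp[i][j] = length of the LCS of S1[:i] and S2[:j]; the (m+1) x (n+1) table
# is filled row by row, so every subproblem is solved exactly once.
def lcs_dp(S1, S2):
    m, n = len(S1), len(S2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if S1[i - 1] == S2[j - 1]:
                dp[i][j] = 1 + dp[i - 1][j - 1]
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

print(lcs_dp("ABC", "ACD"))   # 2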
Master Theorem:
Practice Problems and Solutions
Master Theorem
The Master Theorem applies to recurrences of the following form:
T(n) = a T(n/b) + f(n),
where a ≥ 1 and b > 1 are constants and f(n) is an asymptotically positive function.
There are 3 cases:
1. If f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a) log^k n) with k ≥ 0, then T(n) = Θ(n^(log_b a) log^(k+1) n).
3. If f(n) = Ω(n^(log_b a + ε)) with ε > 0, and f(n) satisfies the regularity condition, then T(n) = Θ(f(n)).
Regularity condition: a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n.
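As a quick worked example (not one of the practice problems below): for T(n) = 2T(n/2) + n, the mergesort recurrence, we have a = 2, b = 2, so n^(log_b a) = n. Here f(n) = n = Θ(n^(log_b a) log^0 n), so Case 2 applies with k = 0 and T(n) = Θ(n log n).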
Practice Problems
For each of the following recurrences, give an expression for the runtime T (n) if the recurrence can be
solved with the Master Theorem. Otherwise, indicate that the Master Theorem does not apply.
1. T(n) = 3T(n/2) + n^2
2. T(n) = 4T(n/2) + n^2
3. T(n) = T(n/2) + 2^n
4. T(n) = 2^n T(n/2) + n^n
7. T(n) = 2T(n/2) + n / log n
11. T(n) = √2 T(n/2) + log n
13. T(n) = 3T(n/3) + √n
Solutions
1. T(n) = 3T(n/2) + n^2 =⇒ T(n) = Θ(n^2) (Case 3)
7. T(n) = 2T(n/2) + n / log n =⇒ Does not apply (non-polynomial difference between f(n) and n^(log_b a))
19. T(n) = 64T(n/8) − n^2 log n =⇒ Does not apply (f(n) is not positive)
22. T(n) = T(n/2) + n(2 − cos n) =⇒ Does not apply. We are in Case 3, but the regularity condition is
violated. (Consider n = 2πk, where k is odd and arbitrarily large. For any such choice of n, you can
show that c ≥ 3/2, thereby violating the regularity condition.)
Lecture 12: Chain Matrix Multiplication
CLRS Section 15.2
Recalling Matrix Multiplication
Matrix: An m x n matrix A is a two-
dimensional array with m rows and n columns; the entry in row i and column j is written A[i, j].
Recalling Matrix Multiplication
The product C = A B of a p x q matrix A and a q x r
matrix B is a p x r matrix given by
C[i, j] = sum for k = 1 to q of A[i, k] * B[k, j]
for 1 ≤ i ≤ p and 1 ≤ j ≤ r.
Remarks on Matrix Multiplication
If A B is defined, B A may not be defined.
It is also quite possible that A B ≠ B A.
Direct Matrix multiplication
Given a p x q matrix A and a q x r matrix B, the direct
way of multiplying is to compute each of the p*r entries C[i, j],
for 1 ≤ i ≤ p and 1 ≤ j ≤ r, using q scalar multiplications each,
i.e. p*q*r scalar multiplications in total.
Direct Matrix multiplication of A B C
Given a p x q matrix A, a q x r matrix B and an r x s
matrix C, the product A B C can be computed in two ways,
(A B) C and A (B C), costing p*q*r + p*r*s and q*r*s + p*q*s scalar
multiplications respectively. For instance, with dimensions 10 x 100, 100 x 5 and 5 x 50
these are 7,500 versus 75,000 multiplications.
A big difference!
The Chain Matrix Multiplication Problem
Given dimensions p0, p1, ..., pn
corresponding to the matrix sequence A1, A2, ..., An,
where Ai has dimension p(i-1) x p(i),
determine the “multiplication sequence” that minimizes
the number of scalar multiplications in computing
A1 A2 ... An. That is, determine how to parenthesize
the multiplications.
Exhaustive search over all parenthesizations takes exponential time.
Developing a Dynamic Programming Algorithm
Let A(i..j) denote the product Ai A(i+1) ... Aj. Clearly, A(i..j) is a p(i-1) x p(j) matrix.
Developing a Dynamic Programming Algorithm
High-Level Parenthesization for A(i..j)
For any optimal multiplication sequence, at the last
step you are multiplying two matrices A(i..k) and A(k+1..j)
for some k. That is,
A(i..j) = A(i..k) * A(k+1..j).
Developing a Dynamic Programming Algorithm
Step 1 – Continued:
If the parenthesization of A(i..k) was not optimal, we could
replace it by a better parenthesization and get a cheaper
final solution, leading to a contradiction.
Similarly, if the parenthesization of A(k+1..j) was not optimal,
we could replace it by a better one; so the problem has optimal substructure.
Developing a Dynamic Programming Algorithm
For 1 ≤ i ≤ j ≤ n, let m[i, j] denote the minimum
number of multiplications needed to compute A(i..j).
The optimum cost can be described by the following
recursive definition:
m[i, j] = 0 if i = j,
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + p(i-1) p(k) p(j) } if i < j.
Developing a Dynamic Programming Algorithm
Proof: Any optimal sequence of multiplications for A(i..j)
is equivalent to some choice of splitting A(i..j) = A(i..k) * A(k+1..j),
in which the two halves are themselves multiplied optimally.
Therefore the recursive definition above gives the optimal cost.
Developing a Dynamic Programming Algorithm
To calculate m[i, j]
we must have already evaluated m[i, k] and m[k+1, j].
For both cases, the
corresponding length of the
matrix-chain is less than j - i + 1. Hence, the algorithm
should fill the table in increasing order of the length of the matrix-
chain.
Example for the Bottom-Up Computation
Example: Given a chain of four matrices A1, A2, A3 and A4, with p0 = 5, p1 = 4, p2 = 6,
p3 = 2 and p4 = 7 (so A1 is 5 x 4, A2 is 4 x 6, A3 is 6 x 2 and A4 is 2 x 7). Find m[1, 4].
S0: Initialization. Set m[i, i] = 0 for i = 1, ..., 4.
Step 1: Computing m[1, 2]. By definition
m[1, 2] = m[1, 1] + m[2, 2] + p0 p1 p2 = 0 + 0 + 5*4*6 = 120.
Step 2: Computing m[2, 3]. By definition
m[2, 3] = m[2, 2] + m[3, 3] + p1 p2 p3 = 0 + 0 + 4*6*2 = 48.
Step 3: Computing m[3, 4]. By definition
m[3, 4] = m[3, 3] + m[4, 4] + p2 p3 p4 = 0 + 0 + 6*2*7 = 84.
Step 4: Computing m[1, 3]. By definition
m[1, 3] = min( m[1, 1] + m[2, 3] + p0 p1 p3 , m[1, 2] + m[3, 3] + p0 p2 p3 )
        = min( 0 + 48 + 40 , 120 + 0 + 60 ) = 88   (split at k = 1).
Step 5: Computing m[2, 4]. By definition
m[2, 4] = min( m[2, 2] + m[3, 4] + p1 p2 p4 , m[2, 3] + m[4, 4] + p1 p3 p4 )
        = min( 0 + 84 + 168 , 48 + 0 + 56 ) = 104   (split at k = 3).
Step 6: Computing m[1, 4]. By definition
m[1, 4] = min( m[1, 1] + m[2, 4] + p0 p1 p4 , m[1, 2] + m[3, 4] + p0 p2 p4 ,
               m[1, 3] + m[4, 4] + p0 p3 p4 )
        = min( 0 + 104 + 140 , 120 + 84 + 210 , 88 + 0 + 70 ) = 158   (split at k = 3).
The completed table of m[i, j] values:
        j=1   j=2   j=3   j=4
i=1      0    120    88   158
i=2            0     48   104
i=3                   0    84
i=4                         0
We are done!
Developing a Dynamic Programming Algorithm
Idea: Maintain an array s[1..n, 1..n], where s[i, j] denotes
the value of k used for the optimal splitting in computing
A(i..j) = A(i..k) * A(k+1..j). The array s[1..n, 1..n] can be used re-
cursively to recover the multiplication sequence.
The Dynamic Programming Algorithm
Matrix-Chain(p, n)
  for (i = 1 to n)
    m[i, i] = 0;
  for (L = 2 to n)
    for (i = 1 to n - L + 1)
      j = i + L - 1;
      m[i, j] = INFINITY;
      for (k = i to j - 1)
        q = m[i, k] + m[k+1, j] + p[i-1] p[k] p[j];
        if (q < m[i, j])
          m[i, j] = q;
          s[i, j] = k;
  return m and s; (Optimum in m[1, n])
Constructing an Optimal Solution: Computing A(1..n)
The actual multiplication code uses the s[i, j] value to
determine how to split the current sequence. Assume
that the matrices are stored in an array of matrices
A[1..n], and that s[i, j] is global to this recursive pro-
cedure. The procedure returns a matrix.
Mult(i, j)
  if (i < j)
    k = s[i, j];
    X = Mult(i, k);      X is now A(i..k)
    Y = Mult(k+1, j);    Y is now A(k+1..j)
    return X * Y;        multiply matrices X and Y
  else return A[i];
To compute A(1..n), call Mult(1, n).
Matrix-chain Multiplication
• Suppose we have a sequence or chain
A1, A2, …, An of n matrices to be
multiplied
– That is, we want to compute the product
A1A2…An
Algorithm to Multiply 2 Matrices
Input: Matrices Ap×q and Bq×r (with dimensions p×q and q×r)
Result: Matrix Cp×r resulting from the product A·B
MATRIX-MULTIPLY(Ap×q , Bq×r)
1. for i ← 1 to p
2. for j ← 1 to r
3. C[i, j] ← 0
4. for k ← 1 to q
5. C[i, j] ← C[i, j] + A[i, k] · B[k, j]
6. return C
Scalar multiplication in line 5 dominates the time to compute C.
Number of scalar multiplications = pqr
Dynamic Programming Approach
• The structure of an optimal solution
– Let us use the notation Ai..j for the matrix that
results from the product Ai Ai+1 … Aj
– An optimal parenthesization of the product
A1A2…An splits the product between Ak and
Ak+1 for some integer k where 1 ≤ k < n
– First compute matrices A1..k and Ak+1..n ; then
multiply them to get the final matrix A1..n
m[i, j] = 0                                                              if i = j
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + p(i-1) p(k) p(j) }   if i < j
Algorithm to Compute Optimal Cost
Input: Array p[0…n] containing matrix dimensions and n
Result: Minimum-cost table m and split table s
MATRIX-CHAIN-ORDER(p[ ], n)
for i ← 1 to n
  m[i, i] ← 0
for l ← 2 to n
  for i ← 1 to n-l+1
    j ← i+l-1
    m[i, j] ← ∞
    for k ← i to j-1
      q ← m[i, k] + m[k+1, j] + p[i-1] p[k] p[j]
      if q < m[i, j]
        m[i, j] ← q
        s[i, j] ← k
return m and s
Takes O(n^3) time and requires O(n^2) space.
Constructing Optimal Solution
• Our algorithm computes the minimum-
cost table m and the split table s
• The optimal solution can be constructed
from the split table s
– Each entry s[i, j ]=k shows where to split the
product Ai Ai+1 … Aj for the minimum cost
Example
• Show how to multiply this matrix chain optimally
Matrix   Dimension
A1       30 × 35
A2       35 × 15
A3       15 × 5
A4       5 × 10
A5       10 × 20
A6       20 × 25
• Solution on the board
– Minimum cost 15,125
– Optimal parenthesization ((A1(A2A3))((A4 A5)A6))
Multistage Graph
A Multistage graph is a directed, weighted graph in which the nodes can be divided into a set of
stages such that all edges are from a stage to next stage only.
In other words, there is no edge between vertices of the same stage, nor from a vertex of the current
stage to a previous stage.
The vertices of a multistage graph are divided into n disjoint subsets S = {S1, S2, S3, ..., Sn},
where S1 is the source and Sn is the sink or destination. The
cardinality of S1 and Sn is equal to 1, i.e., |S1| = |Sn| = 1.
We are given a multistage graph, a source and a destination, we need to find shortest
path from source to destination. By convention, we consider source at stage 1 and
destination as last stage.
Following is an example graph we will consider in this article :-
Brute force: find all possible paths between Source and
Destination and then take the minimum. That’s the WORST possible strategy.
Dijkstra’s Algorithm of Single Source shortest paths. This method will find
shortest paths from source to all other nodes which is not required in this case. So
it will take a lot of time and it doesn’t even use the SPECIAL feature that this
MULTI-STAGE graph has.
Simple Greedy Method – At each node, choose the shortest outgoing path. If we
apply this approach to the example graph given above we get the solution as 1 + 4
+ 18 = 23. But a quick look at the graph will show that much shorter paths are available
than 23. So the greedy method fails!
The best option is Dynamic Programming. So we need to find the optimal substructure and
overlapping sub-problems of this graph.
This means that our problem of 0 —> 7 is now sub-divided into 3 sub-problems, one for each
node of the next stage reachable from the source.
So if we have total 'n' stages and target as T, then the stopping condition will be:
M(n-1, i) = cost(i —> T) + M(n, T) = cost(i —> T), since M(n, T) = 0.
So, the hierarchy of M(x, y) evaluations will look something like this :-
So, here we have drawn a very small part of the Recursion Tree and we can already
see Overlapping Sub-Problems. We can largely reduce the number of M(x, y)
evaluations using Dynamic Programming.
Implementation details:
The below implementation assumes that nodes are numbered from 0 to N-1 from first
stage (source) to last stage (destination). We also assume that the input graph is
multistage.
We use a top-to-bottom approach, and use the dist[] array to store the values of the overlapping
sub-problems.
dist[i] will store the value of the minimum distance from node i to node n-1 (the target
node).
Therefore, dist[0] will store the minimum distance from the source node to the target
node.
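A minimal iterative sketch of this dist[] idea in Python. It fills dist[] from the target backwards rather than by recursion; the adjacency representation, the INF constant and the tiny driver graph are all illustrative and are not the graph from the figure above:

INF = float('inf')

# graph[i] maps a node i to {j: weight of edge i -> j}; nodes 0..N-1, target is N-1.
def shortest_dist(graph, N):
    dist = [INF] * N          # dist[i] = min distance from i to the target N-1
    dist[N - 1] = 0
    # Nodes are numbered stage by stage, so filling from N-2 down to 0
    # guarantees dist[j] is ready before any edge (i, j) with i < j is relaxed.
    for i in range(N - 2, -1, -1):
        for j, w in graph.get(i, {}).items():
            dist[i] = min(dist[i], w + dist[j])
    return dist[0]

# A tiny 3-stage example: 0 -> {1, 2} -> 3
graph = {0: {1: 1, 2: 2}, 1: {3: 5}, 2: {3: 1}}
print(shortest_dist(graph, 4))   # 3, via 0 -> 2 -> 3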
Question: Compute the shortest path from the above mentioned multistage
graph using dynamic programming.
NP and NP-Completeness
Outline
Decision and Optimization Problems
P and NP
Polynomial-Time Reducibility
NP-Hardness and NP-Completeness
Examples: TSP, Circuit-SAT,
Knapsack
Polynomial-Time Approximation
Schemes
Outline
Backtracking
Branch-and-Bound
Summary
References
Decision and Optimization
Problems
Decision Problem: computational
problem with intended output of
“yes” or “no”, 1 or 0
Optimization Problem: computational
problem where we try to maximize
or minimize some value
Introduce a parameter k and ask if the
optimal value for the problem is at
most or at least k. This turns optimization
into decision
Complexity Class P
Deterministic in nature
Solved by conventional computers in
polynomial time
• O(1) Constant
• O(log n) Sub-linear
• O(n) Linear
• O(n log n) Nearly Linear
• O(n^2) Quadratic
Polynomial upper and lower bounds
Complexity Class NP
Non-deterministic part as well
choose(b): choose a bit in a non-
deterministic way and assign to b
If someone tells us the solution to a
problem, we can verify it in polynomial
time
Two Properties: non-deterministic method
to generate possible solutions,
deterministic method to verify in
polynomial time that the solution is
correct.
Relation of P and NP
P is a subset of NP
“P = NP”?
A language L is in NP if and only if the
complement of L is in co-NP
Whether co-NP = NP, and whether P = co-NP,
are open questions
Polynomial-Time Reducibility
Language L is polynomial-time
reducible to language M if there is a
function computable in polynomial
time that takes an input x of L and
transforms it to an input f(x) of M,
such that x is a member of L if and
only if f(x) is a member of M.
Shorthand, L ≤poly M means L is
polynomial-time reducible to M
NP-Hard and NP-Complete
Language M is NP-hard if every other
language L in NP is polynomial-time
reducible to M
For every L that is a member of NP,
L ≤poly M
Logic Gates
NOT: 0 ↦ 1, 1 ↦ 0
OR: outputs 1 if at least one input is 1
AND: outputs 1 only if both inputs are 1
Why should we care?
• Knowing that they are hard lets you stop beating
your head against a wall trying to solve them…
– Use a heuristic: come up with a method for solving a
reasonable fraction of the common cases.
– Solve approximately: come up with a solution that
you can prove that is close to right.
– Use an exponential time solution: if you really have
to solve the problem exactly and stop worrying about
finding a better solution.
Optimization & Decision Problems
• Decision problems
– Given an input and a question regarding a problem,
determine if the answer is yes or no
• Optimization problems
– Find a solution with the “best” value
• Optimization problems can be cast as decision
problems that are easier to study
– E.g.: Shortest path: G = unweighted directed graph
• Find a path between u and v that uses the fewest
edges
• Does a path exist from u to v consisting of at most k edges?
Algorithmic vs Problem Complexity
• The algorithmic complexity of a computation is
some measure of how difficult it is to perform the
computation (i.e., specific to an algorithm)
• The complexity of a computational problem
or task is the complexity of the algorithm with the
lowest order of growth of complexity for solving
that problem or performing that task.
– e.g. the problem of searching an ordered list has at
most lg n time complexity.
• Computational Complexity: deals with
classifying problems by how hard they are.
Class of “P” Problems
• Class P consists of (decision) problems that are
solvable in polynomial time
• Polynomial-time algorithms
– Worst-case running time is O(n^k), for some constant k
• Examples of polynomial time:
– O(n^2), O(n^3), O(1), O(n lg n)
• Examples of non-polynomial time:
– O(2^n), O(n^n), O(n!)
Tractable/Intractable Problems
• Problems in P are also called tractable
• Problems not in P are intractable or unsolvable
– Can be solved in reasonable time only for small inputs
– Or, can not be solved at all
Example of Unsolvable Problem
• Turing discovered in the 1930’s that there are
problems unsolvable by any algorithm.
• The most famous of them is the halting
problem
– Given an arbitrary algorithm and its input, will that
algorithm eventually halt, or will it continue forever in
an “infinite loop?”
Intractable Problems
• Can be classified in various categories based on
their degree of difficulty, e.g.,
– NP
– NP-complete
– NP-hard
• Let’s define NP algorithms and NP problems …
Nondeterministic and NP Algorithms
E.g.: Hamiltonian Cycle
• Given: a directed graph G = (V, E), determine a
simple cycle that contains each vertex in V
– Each vertex can only be visited once
• Certificate:
– Sequence: v1, v2, v3, …, v|V|
Is P = NP?
• Any problem in P is also in NP: P ⊆ NP
• The open question is whether there are
problems in NP that are not in P
Reductions
• Reduction is a way of saying that one problem is
“easier” than another.
• We say that problem A is easier than problem B
(i.e., we write “A ≤ B”)
if we can solve A using the algorithm that solves B.
• Idea: transform the inputs of A to inputs of B with a function f, run the algorithm for B,
and return its yes/no answer.
NP-Completeness (formally)
• A problem B is NP-complete if:
(1) B ∈ NP
(2) A ≤p B for all A ∈ NP
Implications of Reduction
- If A ≤p B and B ∈ P, then A ∈ P
- If A ≤p B and A ∉ P, then B ∉ P
Proving Polynomial Time
• To decide A in polynomial time: apply the polynomial-time transformation f to the input,
then run the polynomial-time algorithm that decides B and return its yes/no answer.
Revisit “Is P = NP?”
• P and the NP-complete problems both sit inside NP; whether P = NP (equivalently, whether
any NP-complete problem is in P) remains open.
P & NP-Complete Problems
• Shortest simple path
– Given a graph G = (V, E) find a shortest path from a
source to all other vertices
– NP-complete
24
P & NP-Complete Problems
• Euler tour
– G = (V, E) a connected, directed graph find a cycle
that traverses each edge of G exactly once (may visit
a vertex multiple times)
– Polynomial solution O(E)
• Hamiltonian cycle
– G = (V, E) a connected, directed graph find a cycle
that visits each vertex of G exactly once
– NP-complete
Satisfiability Problem (SAT)
• Satisfiability problem: given a logical
expression φ, find an assignment of values
(F, T) to the variables xi that causes φ to evaluate
to T
– E.g. an expression φ over the variables x1, x2, x3, x4
CNF Satisfiability
• CNF-SAT is a special case of SAT
• φ is in “Conjunctive Normal Form” (CNF)
– “AND” of expressions (i.e., clauses)
– Each clause contains only “OR”s of the variables and
their complements
3-CNF Satisfiability
A special case of the CNF problem:
– Each clause contains exactly three literals
• E.g.:
φ = (x1 ∨ ¬x1 ∨ ¬x2) ∧ (x3 ∨ x2 ∨ x4) ∧ (¬x1 ∨ ¬x3 ∨ ¬x4)
• 3-CNF is NP-Complete
Clique
Clique Problem:
– Undirected graph G = (V, E)
– Clique: a subset of vertices in V all connected to each
other by edges in E (i.e., forming a complete graph)
– Size of a clique: number of vertices it contains
Optimization problem:
– Find a clique of maximum size
Decision problem:
– Does G have a clique of size k?
Clique Verifier
• Given: an undirected graph G = (V, E)
• Problem: Does G have a clique of size k?
• Certificate:
– A set of k nodes
• Verifier:
– Verify that for all pairs of vertices in this set there
exists an edge in E
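A sketch of such a verifier in Python (the names are illustrative); it checks a certificate of k vertices in time polynomial in the sizes of the graph and the certificate:

from itertools import combinations

# edges: set of frozensets {u, v}; certificate: the proposed set of k vertices.
def verify_clique(edges, certificate, k):
    if len(set(certificate)) != k:
        return False
    # Check that every pair of certificate vertices is joined by an edge.
    return all(frozenset((u, v)) in edges for u, v in combinations(certificate, 2))

edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}
print(verify_clique(edges, [1, 2, 3], 3))   # True: {1, 2, 3} is a clique of size 3
print(verify_clique(edges, [1, 2, 4], 3))   # False: edge {1, 4} is missing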
3-CNF ≤p Clique
• Idea: from a 3-CNF formula with k clauses, build a graph with one vertex per literal
occurrence; join two vertices iff they come from different clauses and are not negations of
each other. The formula is satisfiable iff the graph has a clique of size k.
NP-naming convention
• NP-complete - means problems that are
'complete' in NP, i.e. the most difficult to solve in
NP
• NP-hard - stands for 'at least' as hard as NP
(but not necessarily in NP);
• NP-easy - stands for 'at most' as hard as NP
(but not necessarily in NP);
• NP-equivalent - means equally difficult as NP,
(but not necessarily in NP);
1. Given the following message:
ABABCDEAABBEECADDBEC
How many bits will be required to encode the message using Huffman coding? Illustrate by
constructing the tree and tables as well.
2. Schedule the following jobs within their deadlines using the Greedy method so that total
profit is maximized (each job takes one unit of time):
Job J1 J2 J3 J4 J5 J6 J7
Profit 40 35 30 25 20 10 5
Deadline 4 3 3 4 2 1 2
3. Given the following Items and respective profits and weights associated with them:
Items 1 2 3 4 5 6 7
Profits 5 6 3 7 10 18 15
Weights 3 2 7 1 5 1 4
Considering that the knapsack can hold a maximum weight of 15 kg, solve this
optimization problem using the Greedy method.
5. Given 4 matrices A1, A2, A3 and A4 with sizes 2×3, 3×4, 4×4 and 4×3 respectively,
find the most efficient way to multiply these matrices using Dynamic Programming.
Example:
[Figure: a weighted graph on the vertices A, B, C, D, E.]
The problem lies in finding a minimal path passing through all the vertices once. For example the path
Path1 {A, B, C, D, E, A} and the path Path2 {A, B, C, E, D, A} pass all the vertices but Path1
has a total length of 24 and Path2 has a total length of 31.
Definition:
A Hamiltonian cycle is a cycle in a graph passing through all the vertices once.
Example:
[Figure: a Hamiltonian cycle through the vertices A, B, C, D, E.]
Theorem 10.1:
The traveling salesman problem is NP-complete.
Proof:
First, we have to prove that TSP belongs to NP. If we want to check a tour for credibility, we
check that the tour contains each vertex once. Then we sum the total cost of the edges and finally
we check if the cost is minimum. This can be completed in polynomial time thus TSP belongs to
NP.
Secondly we prove that TSP is NP-hard. One way to prove this is to show that Hamiltonian cycle
≤ P TSP (given that the Hamiltonian cycle problem is NP-complete). Assume G = (V, E) to be an
instance of Hamiltonian cycle. An instance of TSP is then constructed. We create the complete
graph G′ = (V, E′), where E′ = {(i, j) : i, j ∈ V and i ≠ j}. Thus, the cost function is defined as:
t(i, j) = 0 if (i, j) ∈ E,
t(i, j) = 1 if (i, j) ∉ E.    (10.1)
Now suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each edge in h is 0
in G ′ as each edge belongs to E. Therefore, h has a cost of 0 in G ′ . Thus, if graph G has a
Hamiltonian cycle then graph G ′ has a tour of 0 cost.
Conversely, we assume that G’ has a tour h’ of cost at most 0. The cost of edges in E’ are 0 and 1
by definition. So each edge must have a cost of 0 as the cost of h’ is 0. We conclude that h’
contains only edges in E.
So we have proven that G has a Hamiltonian cycle if and only if G’ has a tour of cost at most 0.
Thus TSP is NP-complete.
10.2 Methods to solve the traveling salesman problem
10.2.1 Using the triangle inequality to solve the traveling salesman problem
Definition:
If for the set of vertices a, b, c ∈ V, it is true that t (a, c) ≤ t(a, b) + t(b, c) where t is the cost
function, we say that t satisfies the triangle inequality.
First, we create a minimum spanning tree the weight of which is a lower bound on the cost of an
optimal traveling salesman tour. Using this minimum spanning tree we will create a tour the cost
of which is at most 2 times the weight of the spanning tree. We present the algorithm that
performs these computations using the MST-Prim algorithm.
Approximation-TSP
Input: A complete graph G (V, E)
Output: A Hamiltonian cycle
1. Select a vertex r of V to be the root.
2. Compute a minimum spanning tree T of G from root r using MST-Prim.
3. Let L be the list of vertices visited in a preorder walk of T.
4. Return the Hamiltonian cycle H that visits the vertices in the order L.
Figure 10.3 A set of cities and the resulting connection after the MST-Prim algorithm has been
applied.
In Figure 10.3(a) a set of vertices is shown. Part (b) illustrates the result of MST-Prim, i.e. the
minimum spanning tree it constructs. The vertices are visited in the order (A, B, C, D, E, A) by
a preorder walk. Part (c) shows the tour, which is returned by the complete algorithm.
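A compact sketch of this 2-approximation in Python, using Prim's algorithm and a preorder walk. All names are illustrative, and the distance matrix in the driver is a made-up metric instance, not the data of the figure:

import heapq

# 2-approximation for metric TSP: build an MST with Prim, then take a preorder walk.
def approx_tsp(dist):
    n = len(dist)
    parent = [0] * n
    in_tree = [False] * n
    key = [float('inf')] * n
    key[0] = 0
    pq = [(0, 0)]
    while pq:                                  # Prim's algorithm from vertex 0
        _, u = heapq.heappop(pq)
        if in_tree[u]:
            continue
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and dist[u][v] < key[v]:
                key[v], parent[v] = dist[u][v], u
                heapq.heappush(pq, (key[v], v))
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)
    tour = []
    def preorder(u):                           # preorder walk of the MST
        tour.append(u)
        for v in children[u]:
            preorder(v)
    preorder(0)
    return tour + [0]                          # return to the start to close the cycle

# 4 points on a line (a metric): the returned tour costs at most twice the optimum.
dist = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
print(approx_tsp(dist))   # [0, 1, 2, 3, 0]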
Theorem 10.2:
Approximation-TSP is a 2-approximation algorithm with polynomial cost for the traveling
salesman problem given the triangle inequality.
Proof:
Approximation-TSP costs polynomial time as was shown before.
Assume H* to be an optimal tour for a set of vertices. A spanning tree is constructed by deleting
edges from a tour. Thus, an optimal tour has more weight than the minimum-spanning tree, which
means that the weight of the minimum spanning tree forms a lower bound on the weight of an
optimal tour.
Let a full walk of T be the complete list of vertices when they are visited regardless if they are
visited for the first time or not. The full walk is W. In our example:
W = A, B, C, B, D, B, E, B, A.
The full walk crosses each edge exactly twice. Thus, we can write
c(W) = 2 c(T) ≤ 2 c(H*),
which means that the cost of the full walk is at most 2 times worse than the cost of an optimal tour.
The full walk visits some of the vertices twice, which means it is not a tour. We can now use the
triangle inequality to erase some visits without increasing the cost. The fact we are going to use is
that if a vertex a lying between two visits to b and c is deleted from the full walk, the result
corresponds to going from b to c directly.
In our example we are left with the tour: A, B, C, D, E, A. This tour is the same as the one we get
by a preorder walk. Considering this preorder walk let H be a cycle deriving from this walk. Each
vertex is visited once so it is a Hamiltonian cycle. We have derived H deleting edges from the full
walk so we can write:
c(H) ≤ c(W) ≤ 2 c(H*).    (10.5)
Definition:
If an NP-complete problem can be solved in polynomial time then P = NP, else P ≠ NP.
Definition:
An algorithm for a given problem has an approximation ratio of ρ(n) if the cost of the solution S
the algorithm provides is within a factor of ρ(n) of the cost of the optimal solution S*. We write:
max( c(S)/c(S*), c(S*)/c(S) ) ≤ ρ(n).
If the cost function t does not satisfy the triangle inequality then polynomial time is not enough to
find acceptable approximation tours unless P = NP.
Theorem 10.3:
If P≠NP then there is no approximation algorithm with polynomial cost and with approximation
ratio of ρ for any ρ≥1 for the traveling salesman problem.
Proof:
Let us suppose that there is an approximation algorithm A with polynomial cost for some number
ρ≥1 with approximation ratio ρ. Let us assume that ρ is an integer without loss of generality. We
are going to try to use A to solve Hamiltonian cycle problems. If we can solve such NP-complete
problems then P = NP.
Let us assume a Hamiltonian-cycle problem G = (V, E). We are going to use algorithm A to
determine whether A contains Hamiltonian cycles. Assume G ′ = (V, E ′ ) to be the complete
graph on V. Thus:
t(a, b) = 1 if (a, b) ∈ E,
t(a, b) = ρ|V| + 1 otherwise.    (10.9)
Consider the traveling salesman problem ( G ′ , t). Assume there is a Hamiltonian cycle H in the
graph G. Each edge of H is assigned a cost of 1 by the cost function t. Hence ( G ′ , t) has a tour of
cost |V|. If we had assumed that there is not a Hamiltonian cycle in the graph G, then a tour in G ′
must contain edges that do not exist in E. Any tour with edges not in E has a cost of at least
(ρ|V| + 1) + (|V| − 1) = ρ|V| + |V|.
Since edges that do not exist in G are assigned such a large cost, the cost of any tour other than a
Hamiltonian one exceeds ρ|V| by at least |V|.
Let us use the approximation algorithm described in 10.2.1 to solve the traveling salesman
problem ( G ′ , t).
A returns a tour of cost no more than ρ times the cost of an optimal tour. Thus, if G has a
Hamiltonian cycle then A must return it. If G does not have a Hamiltonian cycle, A returns a tour
whose cost is more than ρ|V|. It is implied that we can use A to solve the Hamiltonian-cycle
problem with a polynomial cost. Therefore, the theorem is proved by contradiction.
According to Karp, we can partition the problem to get an approximate solution using the divide
and conquer techniques. We form groups of the cities and find optimal tours within these groups.
Then we combine the groups to find the optimal tour of the original problem. Karp has given
probabilistic analyses of the performance of the algorithms to determine the average error and
thus the average performance of the solution compared to the optimal solution. Karp has
proposed two dividing schemes. According to the first, the cities are divided in terms of their
location only. According to the second, the cities are divided into cells that have the same
size. Karp has provided upper bounds of the worst-case error for the first dividing scheme, which
is also called Adaptive Dissection Method. The working of this method is explained below.
Let us assume that the n cities are distributed in a rectangle. This rectangle is divided into B = 2^k
sub-rectangles. Each sub-rectangle contains at most t cities, where k = log2[(N-1)/(t-1)]. The
algorithm computes an optimum tour for the cities within each sub-rectangle. These 2^k optimal tours
are combined to find an optimal tour for the N cities. Let us explain the working of the division
algorithm. Assume Y to be a rectangle with num being the number of cities. The rectangle is
divided into two rectangles at the [num/2]-th city counted from the shorter side of the
rectangle. This city then belongs to the common boundary of the two rectangles. The rest of
the division process is done recursively.
The results for this algorithm are presented below.
Assume a rectangle X containing N cities and t the maximum allowed number of cities in a sub-
rectangle. Assume W to be the length of the walk the algorithm provides and Lopt to be the length
of the optimal path. Then the worst-case error is defined as follows:
Assuming that log2(N-1)/(t-1) is an integer we can express this equation in terms of N and t:
If k even W-Lopt ≤ 3a2 ( N − 1) /(t − 1) 10.17
This result from equation 10.12 shows that the relative error between a spanning walk and an
optimal tour can be estimated as:
∀ε>0 Prob limNÆ∞(W- TN(X)/ TN(X) – S/ t )>ε = 0 10.20
Where S>0.
βx(t) - β ≤ 6(a+b)/ t .
We can say that the length of a tour through t<N cities tends to be almost the same as the length
of the optimal tour.
10.2.4 Trying to solve the traveling salesman problem using greedy algorithms
Assume the asymmetric traveling salesman problem. We use the symbol (Kn, c), where c is the
weight function and n is the number of vertices. We assume the symmetric traveling salesman
problem to be defined in the same way, but there Kn denotes a complete undirected graph.
find an approximate solution to an NP-hard problem using heuristics, we need to compare the
solutions using computational experiments. There is a number called domination number that
compares the performance of heuristics. A heuristic with higher domination number is a better
choice than a heuristic with a lower domination number.
Definition:
The domination number for the TSP of a heuristic A is an integer d(n) such that, for each instance I
of the TSP on n vertices, A produces a tour T that is not worse than at least d(n) tours in I,
including T itself.
If we evaluate the greedy algorithm and the nearest neighbor algorithm for the TSP, we find that
they give good results for the Euclidean TSP but they both give poor results for the asymmetric
and the symmetric TSP. We analyze below the performance of the greedy algorithm and the
nearest neighbor algorithm using the domination number.
Theorem 10.4:
The domination number of the greedy algorithm for the TSP is 1.
Proof:
We assume an instance of the ATSP for which the greedy algorithm provides the worst tour. Let
n·min{i, j} be the cost of each arc (i, j). We assume the following exceptions: c(i, i+1) = i·n, for i =
1, 2, .., n-1, c(i, 1) = n^2 − 1 for i = 3, 4, …, n-1, and c(n, 1) = n^3.
We observe that the cheapest arc is (1, 2). Thus the greedy algorithm returns the tour (1, 2, …, n, 1).
We can compute the cost of T as:
c(T) = Σ_{i=1}^{n-1} i·n + c(n, 1).    (10.21)
Assume a tour H in the graph such that c(H) ≥ c(T). The arc (n, 1) must be contained in the
tour H. It is implied that there is a Hamiltonian path P from 1 to n with a cost of
Σ_{i=1}^{n-1} i·n.
Let ei be an arc of P with tail i. It is true that c(ei) ≤ i·n + 1. P must have an arc ek that goes to a
vertex with an identification number smaller than that of the vertex it starts from. We can
now write:
Theorem 10.5:
Let n≥4. The domination number of the nearest neighbor algorithm for the asymmetric traveling
salesman problem is at most n-1 and at least n/2.
Proof:
Assume all arcs (i, i+1), 1 ≤ i < n, have a cost of iN, all arcs (i, i+2), 1 ≤ i ≤ n-2,
have a cost of iN+1, all the other arcs that start from a vertex with an identification number
smaller than that of the vertex they end at (forward arcs) have a cost of iN+2, and all the arcs that
start from a vertex with a greater identification number than that of the vertex they end at (backward
arcs) have a cost of (j-1)N.
If nearest neighbor starts at i, which is neither 1 nor n, it has a cost of
l = Σ_{k=1}^{n-1} kN − N + 1.    (10.25)
If the algorithm starts at 1, we have a cost of Σ_{k=1}^{n-1} kN > l.
Any tour has a cost of at least l. Let us define the length of a backward arc as i − j. Let F be the set
of tours described above and T1 a tour not in F. T1 is a tour, so the cost of T1 is at most
Σ_{k=1}^{n-1} kN + 2 − qN − |B|N,    (10.26)
where B is the set of backward arcs and q is the sum of the lengths of the arcs in B.
We conclude that the cost of T1 is less than l, which would mean that T1 belongs to F. So all
cycles that do not belong to F have a cost less than those who belong to F.
We assume that nearest neighbor does not have a domination number of at least n/2. Nearest
neighbor constructs n tours. By assumption the number of cities is at least 4, so we have at least 3
tours that may coincide. Let F = x1 x2 … xn x1 be a tour such that F = Fi = Fj = Fk. We could assume
that i = 1 and 2 < j ≤ 1 + (n/2). For every m with j < m ≤ n let Cm be the tour provided by deleting
consecutive arcs and adding backward arcs. We should note that
c(Cm) ≥ c(Cf), since c(xi, xi+1) ≤ c(xj, xj+1).
This is true as we used nearest neighbor from xj to construct Fj, and
c(xm, xm+1) ≤ c(xm, xi+1), since nearest neighbor chose the arc (xm, xm+1) on Fj when the arc (xm, xi+1) was
available.
We can now state that the domination number is at least
n − j + 1 ≥ n/2.
The theorem is proven by contradiction.
Definition:
A tour x1 x2 x3 … xn x1, x1 = 1, in a symmetric traveling salesman problem is called pyramidal if,
for some k, x1 < x2 < … < xk and xk > xk+1 > … > xn.
The number of pyramidal tours in a symmetric traveling salesman problem is:
2^(n-3).
Theorem 10.6:
Let n ≥ 4. The domination number of nearest neighbor for the symmetric traveling salesman
problem is at most 2^(n-3).
Proof:
We consider an instance of the symmetric traveling salesman problem which proves that nearest
neighbor has a domination number of at most 2^(n-3).
Let all edges {i, i+1}, 1 ≤ i < n, have a cost of iN.
Let all edges {i, i+2} have a cost of iN+1.
Let all the remaining edges {i, j}, i < j, cost iN+2.
Let us assume that CNN is the cost of the cheapest tour provided by the nearest neighbor
algorithm. It is then clear that
CNN = c(1 2 … n 1) = Σ_{i=1}^{n-1} iN + N + 2.    (10.27)
Assume a tour on the graph. Let it be x1 x2 … xn x1.
We assume a directed cycle T’ which is constructed by orienting all the edges on the tour. For a
backward arc e = (j, i) in the cycle, we define its length as q(e) = j − i.
We express the sum of the lengths of the backward arcs in the cycle as q(T’).
Assume the most expensive non-pyramidal tour T. Let Cmax be the cost of this tour.
We have to show that
Cmax < CNN.
The branch and bound algorithm converts the asymmetric traveling salesman problem into an
assignment problem. Consider a graph V that contains all the cities. Consider Π being the set of
all the permutations of the cities, thus covering all possible solutions. Consider a permutation
π ∈ Π of this set in which each city i is assigned a successor π(i). So a tour might be (1,
π(1), π(π(1)),…,1). If the number of the cities in the tour is n then the permutation is called a
cyclic permutation. The assignment problem tries to find such cyclic permutations and the
asymmetric traveling salesman problem seeks such permutations but with the constraint that they
have a minimal cost. The branch and bound algorithm firstly seeks a solution of the assignment
problem. The cost to find a solution to the assignment problem for n cities is quite large and is
asymptotically O(n3).
If this is a complete tour, then the algorithm has found the solution to the asymmetric traveling
salesman problem too. If not, then the problem is divided in several sub-problems. Each of these
sub-problems excludes some arcs of a sub-tour, thus excluding the sub-tour itself. The way the
algorithm chooses which arc to delete is called branching rules. It is very important that there are
no duplicate sub-problems generated so that the total number of the sub-problems is minimized.
Carpaneto and Toth have proposed a rule that guarantees that the sub-problems are independent.
They consider the included arc set and select a minimum number of arcs that do not belong to that
set. They divide the problem as follows. Symbolize as E the excluded arc set and as I the included
arc set. The I is to be decomposed. Let t arcs of the selected sub-tour x1x2 ...xn not to belong to I.
The problem is divided into t children so that the jth child has Ej excluded arc set and Ij included
arc set. We can now write:
Ej = E ∪ {xj},   j = 1, 2, .., t    (10.30)
Ij = I ∪ {x1, x2, ..., x(j-1)}
But xj is an excluded arc of the jth sub-problem and an included arc in the (j+1)st problem. This
means that a tour produced by the (j+1)st problem may have the xj arc but a tour produced by the
jth problem may not contain the arc. This means that the two problems cannot generate the same
tours, as they cannot contain the same arcs. This guarantees that there are no duplicate tours.
There has been a lot of controversy concerning the complexity of the branch and bound
algorithm. Bellmore and Malone have stated that the algorithm runs in polynomial time. They
have treated the problem as a statistical experiment assuming that the ith try of the algorithm is
successful if it finds a minimal tour for the ith sub-problem. They assumed that the probability of
the assignment problem to find the solution to the asymmetric traveling salesman problem is e/n
where n is the number of the cities. Under other assumptions, they concluded that the total
number of sub-problems is expected to be polynomially bounded.
Smith concluded that under some more assumptions the complexity of the algorithm is
O(n^3 ln(n)).
The assumptions made to reach this result are too optimistic. Below it will be proven that they do
not hold and that the complexity of the branch and bound algorithm is not polynomial.
Definition:
The assignment problem on a cost matrix with c(i, i) = ∞ for all i is called a modified assignment problem.
Lemma 10.1:
Assume an n × n random cost matrix. Consider the assignment problem that has s < n excluded arcs
and t included arcs. Let q(n,s,t) be the probability that the solution of the assignment problem is a
tour. Then q(n,s,t) is asymptotically
Proof:
See [7]
Lemma 10.2:
Consider two independent nodes selected by branch and bound. Assume that the probability
that a non-root node in the search tree is a leaf is p. Let po be the probability that the root
node is a leaf. There exists a constant 0 < δ < 1 − 1/e such that, for a non-root node, if t < δn then p < po, where n is
the number of cities.
Proof:
Assume the search tree and a node of it, say Y, which is constructed as in (10.33). Y has some
included and some excluded arcs. Suppose the number of included arcs is t and the number of
excluded arcs is s. Observe that s is the depth of the node in the search tree. Let the path from the
root to the node be Y0, Y1, Y2, …, Y. We can say that Yi has i excluded arcs. The probability of Y
creating a complete tour is the probability that none of its ancestors provides a solution to the assignment
problem (thus they do not provide a complete tour) but Y does. The probability that Y’s parent
does not provide a complete tour is (1 − q(n, s-1, t_{s-1})). Consequently, the probability p that Y
exists and is a leaf, taking into account the independence assumption, is:
p = q(n, s, t) · Π_{i=0}^{s-1} (1 − q(n, i, t_i)).    (10.34)
Using the asymptotic expression for q from Lemma 10.1, this becomes
p = (λ/(n−t) + o(1/n)) · Π_{i=0}^{s-1} (1 − λ_i/(n − t_i))
  = (λ/(n−t)) · (1 − Σ_{i=0}^{s-1} λ_i/(n − t_i)) + o(1/n),    (10.35)
with
Σ_{i=0}^{s-1} λ_i/(n − t_i) = λ′s/(n − t′),    (10.36)
where λ′ = (1/s) Σ_{i=0}^{s-1} λ_i and t′ = (Σ_{i=0}^{s-1} λ_i t_i) / (Σ_{i=0}^{s-1} λ_i), so that
p = (λ/(n − t)) (1 − λ′s/(n − t′)) + o(1/n).    (10.37)
Now we assume that the lemma does not hold, thus p ≥ po where po = e/n. Let δ = (e − λ)/e. We know
that 1 < λ < e and 0 < δ < 1 − 1/e. Now it can be shown that
t′ > n + λλ′sn / ((e − λ)n − et) > n,    (10.38)
but n ≥ t ≥ t′, so t′ > n is a contradiction. Thus, the lemma is proven.
Lemma 10.3:
Assume an n × n random matrix. Assume a solution of a modified assignment problem. The
expected number of sub-tours is asymptotically less than ln(n) and the expected number of arcs in
a sub-tour is greater than n/ln(n).
Proof:
See [7]
The number of children constructed when we choose a sub-tour with the minimum number of
arcs is O(n/ln(n)), as is proven in lemma 10.1.
The nodes at the first depth have t = O(n/ln(n)) included arcs. But, as is shown above, t < δn.
This means that all nodes at the first depth asymptotically follow the inequality t < δn.
Equally, all nodes at the ith depth follow the same inequality, except the ones whose depth i is greater than
O(ln(n)). Assume a node with no included arcs. Its ancestors do not have included arcs either.
Using lemma 10.1 we can state that the probability that the node or one of its ancestors being a
solution to the assignment problem is e/n. We can now generalize this and say that the probability
that a node with no included arcs exists at depth d and is a leaf node is:
p = (e/n)(1 − e/n)^d.    (10.39)
Observe that this probability is less than e/n, the probability that the root node is a leaf. It is
therefore true that if we consider nodes whose depth is no greater than O(ln(n)) the probability
that they are leaf nodes is less than the probability of the root being a leaf itself.
The probability that a sub-problem chosen by branch and bound will be solved by the assignment
problem and will produce an optimal tour is less than the probability that the search will become
a leaf node. Consider the node that generates the optimal tour. If its depth is greater than ln(n)
then we have to expand ln(n) nodes each one of which has a probability of producing the optimal
tour less than po. If the depth of the node is less than ln(n) and if we need to expand only a
polynomial number of nodes according to Bellmore and Malone, then the expanded nodes have a
probability less than po of creating the optimal tour. This statement contradicts the polynomial
assumption. Therefore we can state that the branch and bound algorithm expands more than ln(n)
nodes to find the optimal tour. This means that the algorithm cannot finish in polynomial time.
10.2.6 The k-best traveling salesman problem, its complexity and one solution using
the branch and bound algorithm
Consider a graph G = (V, E). The number of the n cities in the graph has to be at least 3. Consider
an edge e ∈ E. The length of this edge is described by l(e). Consider a vector that contains the
lengths of the edges of the initial graph. Let us call this vector l. We can now create a weighted
graph, which consists of the pairs (G, d). Consider a set S of edges. The length of this set is
described as Ll(S). Consider the set H of all the Hamiltonian tours in G. We assume that G has
at least one Hamiltonian tour.
Definition:
Let 1 ≤ k ≤ |H|. Any set H(k) = {H1, …, Hk} satisfying
Ll(H1) ≤ Ll(H2) ≤ … ≤ Ll(Hk) ≤ Ll(H) for all H ∈ H \ H(k)
is called a set of k-best tours.
In other words, the k-best tour problem is the problem of finding a set of k tours such that the
length of every tour outside the set is at least equal to the length of the longest tour in the set.
Complexity of the k-best traveling salesman problem
Theorem 10.7:
The k-best TSP is NP-hard for k≥1 regardless if k is fixed or variable.
Proof:
Consider the case that k is variable. This means that k is included in the input. We have to solve
TSP itself to find a solution to the k-best TSP. Since the TSP which is NP-hard is part of the k-
best TSP then the k-best TSP is NP-hard too.
Consider the case that k is fixed. This means that k is not included in the input. It is clear that a
shortest tour can be determined if we know the k-best tours. So we can conclude that k-best TSP is
NP-hard itself too.
Definition:
For I, O ⊆ E, the set {H : I ⊆ H ⊆ E \ O} is called a restricted set of tours.
To solve the problem using partitioning algorithms we use the following idea. We partition the
tours into sets such as that each set is a restricted one. We apply algorithms for solving the
traveling salesman problem for each one of these restricted sets. We combine the optimal
solutions for the sets to find the k-best solution.
There are two known partitioning algorithms for the k-best TSP. The one of them is the Lawler
algorithm and the other one is the Hamacher and Queyranne algorithm. The difference in these
two algorithms lies in the way they partition the solutions. They both call an algorithm to solve
the problem for a restricted set. Their complexity cannot be easily determined since the k-best
TSP is NP-hard. A clue to figure out which algorithm is the best of the two would be to check
how many times they call the algorithm to find a solution for the restricted set.
Using the branch and bound method to derive solutions for the k-best traveling salesman problem
Since the branch and bound method is used for solving the classic traveling salesman problem
(although in greater time than polynomial) it is worthy to modify it to solve the k-best TSP.
Initially the branch and bound tree contains only one node, the root node. As the algorithm
proceeds each node of the tree expands taking into computation edges from the graph. At a given
moment the branch and bound tree contains information about what are the best tours so far. As
an analogue to the original branch and bound, which contains information, what is the best
candidate for the optimal path. When the algorithm ends, the initially empty tree has information
about the set of k-best tours.
We considered a restricted set of tours as is defined above. Let us assume a node in the branch
and bound tree with this restricted set. First of all the algorithm determines a lower bound which
we will express as LB(I, O) : Ll(H) ≥ LB(I, O) for every tour H.
It is true that if LB(I, O) ≥ U then we should not take into account any tour that exists in the tree.
The algorithm continues until the above holds for all the tours; that is to say that we cannot take
into account any tour in the tree. At that moment we should say that the tree is equal to H(k).
If LB(I, O) < U then we have to distinguish two cases. If the graph contains k tours, then the
longest of them is removed. The tour from the tree is removed from the tree and added to the
graph. At that point, the information about the longest tour is updated. If the graph contains less
than k tours, then we do not have to remove any tour. The longest tour from the tree is added to
the graph and the information about the longest tour is updated.
Below is the formal statement of the algorithm. It uses a recursive procedure named EXPLORE
that runs through the nodes of the branch and bound tree and performs the computations explained
above, searching the tree in a depth-first way. It has three parameters: I, O and the graph, where I
is the set of edges of the partial tour. The algorithm starts with the empty sets I and O and the full
graph, and calls the procedure EXPLORE on them.
Modified branch and bound for finding k-best tours for the traveling salesman problem
Input: (G, d), the set of tours H and an integer k such that 1 ≤ k ≤ |H|
Output: A set H(k) of the k best tours in (G, d).
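The full pseudocode of EXPLORE is not reproduced here, so the following Python sketch only illustrates the bookkeeping described above. It is an assumption-laden outline: it branches on the next city of the partial tour rather than on individual edges, and it uses the length of the partial tour as the lower bound LB(I, O), which is valid but far weaker than the bounds used in practice.

import heapq

def k_best_tours_bnb(dist, k):
    # Modified branch and bound sketch: `best` plays the role of H(k), kept as a
    # max-heap of (-length, tour); U is the length of the longest tour kept so far.
    n = len(dist)
    best = []

    def upper_bound():
        return -best[0][0] if len(best) == k else float("inf")

    def explore(path, length):
        if length >= upper_bound():              # LB(I, O) >= U: prune this node
            return
        if len(path) == n:                       # a complete tour has been built
            total = length + dist[path[-1]][0]   # close the tour back to city 0
            if total < upper_bound():
                heapq.heappush(best, (-total, tuple(path)))
                if len(best) > k:                # drop the longest of the k+1 tours
                    heapq.heappop(best)
            return
        for city in range(1, n):                 # branch: choose the next city
            if city not in path:
                explore(path + [city], length + dist[path[-1]][city])

    explore([0], 0.0)
    return sorted((-neg, tour) for neg, tour in best)

# example: 4 cities, the 2 shortest tours starting from city 0
# (each undirected tour appears in both orientations in this simple sketch)
dist = [[0, 1, 4, 3], [1, 0, 2, 5], [4, 2, 0, 6], [3, 5, 6, 0]]
print(k_best_tours_bnb(dist, 2))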
It has been shown by experiments that the running time of the branch and bound algorithm
increases dramatically as k gets larger. This is a result we should expect, since the branch and
bound algorithm also gets much slower as the number of cities increases in the classic traveling
salesman problem.
10.3 Geometric solutions
10.3.1 A heuristic solution based on continuous approximation
We assume that the cities are distributed over a region that has the shape of a circular sector. The
starting point is at the vertex of the sector. This need not be the point at which the traveling
salesman actually starts the tour, but we require that the tour visits the vertex. We also assume
that the cities are distributed uniformly, so that the probability of finding a city does not depend
on its location. Let us assume that we have N points, C of which are the cities the salesman visits,
so that N = C + 1 counts the cities plus the starting point. The way the tour is constructed is
explained below.
We divide the circular sector into three parts: two inner circular sectors of radius R′ that share the
vertex, and the remaining ring sector between radii R′ and R. Figure 10.4 shows the division of
the circular sector.
Figure 10.4 The circular sector after it has been divided into three regions: the inner sectors A and C of radius R′ and the ring sector B of outer radius R.
We name the inner sectors A and C and the ring sector B. The division is described completely by
the ratio of the inner radius to the outer radius:
p = R′ / R    (10.40)
We visit each one of the regions and construct paths in these regions. In the final step, we
combine the paths. Figures 10.5 and 10.6 show the working of the algorithm.
Figure 10.5 The cities in the three sections have been connected.    Figure 10.6 All the cities have been connected.
In the preceding figures one can see only the tour, not the starting point; where the tour starts
depends entirely on the partition.
The average length of the tour is estimated below.
The distance between two points given in polar coordinates (r, u) is split into a radial and a ring
component. We estimate the length of the radial links and the length of the ring links separately
and express the total tour length as the sum of all the path lengths, radial or ring, over all sections:
l = 2(lA,r + lA,u) + lB,u + lB,r
The multiplier of two is there because sections A and C are identical. We approximate the length
of the radial links in A (or in C) as lA,r = p.
We can also approximate the length of the ring links in A (or in C) by the number of points in that
region times the average ring length between two points. By assumption the points are distributed
uniformly over the whole sector, so the number of points in A is:
nA = Cp²/2    (10.44)
The average ring length, averaged over all the points in A, can be approximated by:
dA,u = (1/p) ∫0..p (ru/6) dr = pu/12    (10.45)
Taking into consideration that the radial distribution over the sector has a probability density of
f(r) = 2r, so that within B the normalized density is fB(r) = 2r/(1 − p²), we conclude that the
average radial position of a point in sector B is:
pavg = ∫p..1 r·fB(r) dr = 2(p² + p + 1) / (3(1 + p))    (10.47)
Let us assume a uniform radial distribution so as to simplify the above expression. Then:
pavg = (1 + p) / 2    (10.48)
Now we can compute the length of the ring links in B. We find that:
lB,u = (1 + p)u / 2    (10.49)
So the expected radial distance between two points in B can be expressed as:
dB,r = 2 ∫p..1 ∫p..r (r − s) fB(r) fB(s) ds dr = 4(1 − p)(p² + 3p + 1) / (15(1 + p)²)    (10.50)
Once again, assuming a uniform radial distribution, we can write:
dB,r = (1 − p) / 3    (10.51)
We compute the length of the radial links in B as the number of points in B times the expected
radial distance between two points. Thus we can write:
lB,r = nB·dB,r = C(1 − p²)(1 − p)/3    (10.52)
From the above expressions we can find the average tour length:
l = 2p + Cp³u/12 + (1 + p)u/2 + C(1 − p²)(1 − p)/3    (10.53)
The above expression has a drawback: the results it produces are pessimistic when p is close to 1.
In that case the circular sector is in effect divided into two identical sections, each with an angle
of u/2; there is no longer a ring B, but there still has to be a ring link that connects the outermost
paths of A and C, and this is what makes the estimate pessimistic.
We can instead substitute the expression that gives the length of the ring links in B by the more
accurate:
lB,u = (1 − p/2)(1 + p)u / 2    (10.54)
With this expression we obtain a more precise estimate of the total length of the tour:
l = 2p + Cp³u/12 + (1 − p/2)(1 + p)u/2 + C(1 − p²)(1 − p)/3    (10.55)
This expression remains accurate even when p approaches 1 and gives a very good estimate of the
total tour length.
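Assuming the reconstructed form of equations (10.53)–(10.55) above, and taking the outer radius R as the unit of length, the estimate can be evaluated with a few lines of Python; the function name and example values below are only illustrative.

from math import pi

def estimated_tour_length(C, u, p):
    # Improved estimate (10.55) of the average tour length: C cities spread over a
    # circular sector of angle u (radians), p = R'/R, all lengths in units of R.
    radial_AC = 2 * p                              # radial links in A and C
    ring_AC   = C * p**3 * u / 12                  # ring links in A and C
    ring_B    = (1 - p / 2) * (1 + p) * u / 2      # ring links in B (corrected term 10.54)
    radial_B  = (C / 3) * (1 - p**2) * (1 - p)     # radial links in B
    return radial_AC + ring_AC + ring_B + radial_B

# example: 100 cities spread over a quarter circle, with p = 0.5
print(estimated_tour_length(C=100, u=pi / 2, p=0.5))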
Definition:
The 1-tree problem on n cities is the problem of finding a 1-tree: a tree connecting cities 2 through
n, together with two edges joining city 1 to that tree, so that city 1 has degree two.
When we use the 1-tree to bound the traveling salesman problem we look for a minimum 1-tree.
The connection is that a tour is exactly a 1-tree in which every vertex has degree two, so a
minimum 1-tree whose every vertex has degree two is also a minimum tour, and the length of a
minimum 1-tree is a lower bound on the length of the optimal tour.
Figure 10.7 shows a simple 1-tree.
Let us consider the geometric traveling salesman problem. We denote by eij the length of the edge
from city i to city j, and we assign to each city i a weight πi. Each edge then receives the cost
cij = eij + πi + πj.
We now compute a new minimum 1-tree taking these edge costs into account; the new 1-tree may
well be different from the original one. Let V be the set of tours and let U be the set of 1-trees.
Recall that a tour is a 1-tree with each vertex having a degree of 2, so the set of tours is included
in the set of 1-trees. Let us write T for a tour and L(cij, T) for its length when the edge costs cij
are used. Therefore, it is true that
L(cij, T) = L(eij, T) + 2 ∑i=1..n πi    (10.59)
Since every tour is a 1-tree, taking the minimum over the larger set gives:
minT∈U L(cij, T) ≤ minT'∈V L(cij, T')    (10.60)
Let us express the length of the optimal tour as c' = L(eij, T'). Then from equations 10.59 and
10.60 we get:
minT∈U { c + ∑i=1..n πi·dTi } ≤ c' + 2 ∑i=1..n πi    (10.61)
where c = L(eij, T) and dTi is the degree of city i in the 1-tree T. Equivalently,
minT∈U { c + ∑i=1..n πi·(dTi − 2) } ≤ c'    (10.62)
The left-hand side defines the bound
w(π) = minT∈U { c + ∑i=1..n πi·(dTi − 2) }    (10.63)
It has been shown that the Held-Karp bound, obtained by maximizing w(π) over the city weights
π, is a very good estimate of the minimum tour length, although it does not give the exact result.
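A minimal sketch of how w(π) can be evaluated for a given weight vector π is given below. It assumes the transformation cij = eij + πi + πj stated above and builds the minimum 1-tree as a minimum spanning tree on cities 1..n-1 plus the two cheapest edges of city 0; the function name is illustrative only.

def one_tree_bound(e, pi):
    # Sketch of the 1-tree bound w(pi) of equation 10.63 (n >= 3 cities assumed).
    # Edge costs follow the transformation c_ij = e_ij + pi_i + pi_j assumed above.
    n = len(e)
    c = [[e[i][j] + pi[i] + pi[j] for j in range(n)] for i in range(n)]

    # Prim's algorithm: minimum spanning tree on cities 1 .. n-1 under the costs c
    mst_cost = 0.0
    dist_to_tree = {v: c[1][v] for v in range(2, n)}
    while dist_to_tree:
        v = min(dist_to_tree, key=dist_to_tree.get)   # cheapest vertex to attach
        mst_cost += dist_to_tree.pop(v)
        for w in dist_to_tree:
            dist_to_tree[w] = min(dist_to_tree[w], c[v][w])

    # complete the 1-tree: attach city 0 with its two cheapest edges
    tree_cost = mst_cost + sum(sorted(c[0][v] for v in range(1, n))[:2])

    # w(pi) = min over 1-trees of { L(e,T) + sum_i pi_i * (d_Ti - 2) }
    #       = (cost of the minimum 1-tree under c) - 2 * sum(pi)
    return tree_cost - 2 * sum(pi)

# example: with pi = 0 this is just the plain 1-tree lower bound
e = [[0, 2, 9, 10], [2, 0, 6, 4], [9, 6, 0, 8], [10, 4, 8, 0]]
print(one_tree_bound(e, [0.0, 0.0, 0.0, 0.0]))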
References:
[1] Cormen Thomas H., Leiserson Charles E., Rivest Ronald L., Stein Clifford, "Introduction to
Algorithms", Second Edition, McGraw-Hill Book Company
[2] Cesari Giovanni, "Divide and Conquer Strategies for Parallel TSP Heuristics", Computers &
Operations Research, Vol. 23, No. 7, pp. 681-694, 1996
[3] del Castillo Jose M., "A heuristic for the traveling salesman problem based on a continuous
approximation", Transportation Research Part B 33 (1999) 123-152
[4] Gutin Gregory, Yeo Anders, Zverovich Alexey, "Traveling salesman should not be greedy:
domination analysis of greedy-type heuristics for the TSP", Discrete Applied Mathematics 117
(2002), 81-86
[5] van der Poort Edo S., Libura Marek, Sierksma Gerard, van der Veen Jack A. A., "Solving the
k-best traveling salesman problem", Computers & Operations Research 26 (1999) 409-425
[6] Valenzuela Christine I., Jones Antonia J., "Estimating the Held-Karp lower bound for the
geometric TSP", European Journal of Operational Research 102 (1997) 157-175
[7] Zhang Weixiong, "A note on the complexity of the Asymmetric Traveling Salesman Problem",
Operations Research Letters 20 (1997) 31-38
Unit-1
Advanced Algorithmic Analysis
By
Dr. Shafiul Alom Ahmed
VIT-Bhopal University
Analysing Algorithms: Recurrence relations, Substitution Method, Master Method,
Recursion Tree.
What is an Algorithm?
A finite set of instructions that specifies a sequence of operations to be carried out in
order to solve a specific problem or class of problems is called an algorithm.
Why study Algorithms?
● As the speed of processors increases, performance is frequently said to be less
central than other software quality characteristics (e.g. security, extensibility,
reusability etc.).
● However, in the area of computational science the problem sizes are relatively
huge, which makes performance a very important factor.
● Higher computation time leads to slower results, i.e. less throughput and longer
waiting times for results.
Design and Analysis of Algorithms
● Design and analysis of algorithms is a crucial subject of computer science that deals
with developing and studying efficient algorithms for solving computational problems.
● It entails several steps, including problem formulation, algorithm design, algorithm analysis, and
algorithm optimization.
● The problem formulation step entails identifying the computational problem to be solved as
precisely as possible.
● Algorithm complexity evaluates the order of the count of operations executed by an algorithm as a function
of the input data size.
● To assess the complexity, the order (approximation) of the count of operations is always considered instead
of counting the exact steps.
● The O(f) notation represents the complexity of an algorithm; it is also termed asymptotic notation or "Big O"
notation. Here f is a function of the size of the input data.
● The asymptotic complexity O(f) determines the order in which resources such as CPU time, memory, etc. are
consumed by the algorithm, expressed as a function of the size of the input data.
● The complexity can be of any form, such as constant, logarithmic, linear, n*log(n), quadratic, cubic,
exponential, etc. It is nothing but the order (constant, logarithmic, linear and so on) of the number of steps
encountered for the completion of a particular algorithm.
● To make it even more precise, we often refer to the complexity of an algorithm as its "running time".
Typical Complexities of an Algorithm
● Constant Complexity: It imposes a complexity of O(1). The algorithm executes a constant
number of steps, like 1, 5, 10, etc., for solving a given problem. The count of operations is
independent of the input data size.
● Logarithmic Complexity: It imposes a complexity of O(log(N)). The algorithm executes on the
order of log(N) steps. To perform operations on N elements, the logarithmic base is often taken
as 2.
● For N = 1,000,000, an algorithm with complexity O(log(N)) would take about 20 steps. Here,
the logarithmic base does not affect the order of the operation count, so it is usually omitted.
● Linear Complexity: It imposes a complexity of O(N). The algorithm executes on the order of the
same number of steps as the total number of elements when implementing an operation on N elements.
● For example, if there are 500 elements, it will take about 500 steps. Basically, in linear
complexity, the number of steps depends linearly on the number of elements. For example, the
number of steps for N elements can be N/2 or 3*N.
● Quadratic Complexity: It imposes a complexity of O(N²). For input data of size N, it executes on the order of
N² operations on the N elements to solve a given problem.
● If N = 100, it will take about 10,000 steps. In other words, whenever the order of operations has a quadratic
relation with the input data size, the result is quadratic complexity. For example, for N elements, the steps may
be on the order of 3*N²/2.
● Cubic Complexity: It imposes a complexity of O(N³). For input data of size N, it executes on the order of N³
steps on the N elements to solve a given problem.
● For example, if there are 100 elements, it is going to execute about 1,000,000 steps.
● Exponential Complexity: It imposes a complexity such as O(2^N) or O(N!). For N elements, it executes a
count of operations that depends exponentially on the input data size.
● For example, if N = 10, then the exponential function 2^N results in 1024. Similarly, if N = 20, it results in
1,048,576, and if N = 100, it results in a number with roughly 30 digits. The exponential function N! grows
even faster; for example, N = 5 results in 120, N = 10 results in 3,628,800, and so on.
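As a quick illustration of these classes, here are small Python functions whose step counts are, respectively, constant, logarithmic, linear and quadratic in the input size.

def constant_lookup(arr):            # O(1): one step regardless of len(arr)
    return arr[0]

def binary_search(arr, target):      # O(log N): halves the search range each step
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def linear_sum(arr):                 # O(N): touches every element once
    total = 0
    for x in arr:
        total += x
    return total

def count_pairs(arr):                # O(N^2): nested loops over all pairs
    pairs = 0
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            pairs += 1
    return pairs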
Algorithm Design Techniques
● Divide and Conquer Approach: It is a top-down approach. The algorithms which follow the
divide and conquer technique involve three steps (see the merge sort sketch after this list):
1. Divide the original problem into a set of subproblems.
2. Conquer: solve each subproblem recursively.
3. Combine the solutions of the subproblems (top level) into a solution of the whole original problem.
● Greedy Technique: The greedy method is used to solve optimization problems. An optimization
problem is one in which we are given a set of input values which are required either to be
maximized or minimized (known as the objective), subject to some constraints or conditions.
1. A greedy algorithm always makes the choice (the greedy criterion) that looks best at the moment, in order to
optimize a given objective.
2. The greedy algorithm doesn't always guarantee the optimal solution; however, it generally produces a solution
that is very close in value to the optimal.
● Dynamic Programming: Dynamic programming is a bottom-up approach; we solve all possible
small problems and then combine them to obtain solutions for bigger problems. This is particularly
helpful when the number of repeating subproblems is exponentially large.
● Branch and Bound: Branch and bound algorithms can be slow; in the worst case they require effort that grows
exponentially with problem size, but in some cases the method converges with much less effort.
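As an example of the divide and conquer approach mentioned at the top of this list, here is a short merge sort sketch in Python; the three steps (divide, conquer, combine) are marked in the comments.

def merge_sort(arr):
    if len(arr) <= 1:                      # base case: already sorted
        return arr
    mid = len(arr) // 2                    # divide the problem in two halves
    left = merge_sort(arr[:mid])           # conquer: sort each half recursively
    right = merge_sort(arr[mid:])
    merged = []                            # combine: merge the two sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 7]))         # prints [1, 2, 5, 7, 9]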
● The word "asymptotic" means approaching a value.
Why is Asymptotic Notation Important?
● They give simple characteristics of an algorithm's efficiency.
● They allow the comparisons of the performances of various algorithms.
● Types of Asymptotic Notations:
● Big-O Notation: Big-O notation is used to describe the performance or complexity of an
algorithm. Specifically, it describes the worst-case scenario in terms of time or space complexity.
1. It provides an upper limit on the time taken by an algorithm in terms of the size of the input.
2. It is denoted as O(f(n)), where f(n) is a function that represents the number of operations (steps)
that an algorithm performs to solve a problem of size n.
3. Big O notation only describes the asymptotic behavior of a function, not its exact value. Big O
notation can be used to compare the efficiency of different algorithms or data structures.
Definition of Big-O Notation:
● Given two functions f(n) and g(n), we say that f(n) is O(g(n)) if there exist constants c > 0 and
n0 >= 0 such that f(n) <= c*g(n) for all n >= n0.
● In simpler terms, f(n) is O(g(n)) if f(n) grows no faster than c*g(n) for all n >= n0, where c and n0
are constants.
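As a quick worked example of this definition: f(n) = 3n + 10 is O(n), since choosing c = 4 and n0 = 10 gives 3n + 10 <= 4*n for all n >= 10, so the condition above is satisfied.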
Big-Omega Ω Notation
● Big-Omega Ω Notation is a way to express the asymptotic lower bound of an
algorithm's time complexity, since it analyses the best-case behaviour of the algorithm.
●It provides a lower limit on the time taken by an algorithm in terms of the size of
the input. It’s denoted as Ω(f(n)), where f(n) is a function that represents the number
of operations (steps) that an algorithm performs to solve a problem of size n.
●Big-Omega Ω Notation is used when we need to find the asymptotic lower bound
of a function.
● In other words, we use Big-Omega Ω when we want to state that the algorithm takes at least a
certain amount of time for a given input size.
Theta Θ Notation gives a tight (two-sided) bound on a function.
● Mathematical Representation:
● Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1 * g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0}.
Recursive/Recurrence relation
● A recurrence relation is a mathematical expression that defines a sequence in terms of its previous terms. In the
context of algorithmic analysis, it is often used to model the time complexity of recursive algorithms.
● General form of a recurrence relation: a(n) = f(a(n-1), a(n-2), ..., a(n-k)), where f is the function that defines the
relationship between the current term and the previous terms.
● Recurrence relations play a significant role in analyzing and optimizing the complexity of algorithms, and a
strong understanding of them plays a great role in developing the problem-solving skills of an individual. Some of
the common types of recurrence relations are:
● 1. Linear Recurrence Relation: In a linear recurrence relation every term depends linearly on its
previous terms.
● 2. Divide and Conquer Recurrence Relation: It is the type of recurrence relation that is obtained from a
divide and conquer algorithm.
● 3. First Order Recurrence Relation: It is the type of recurrence relation in which every term depends only
on the immediately previous term.
● 4. Higher Order Recurrence Relation: It is the type of recurrence relation in which a term depends not on
just one previous term but on multiple previous terms. If it depends on two previous terms it is called second
order; similarly, for three previous terms it is called third order, and so on.
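As concrete examples of these types: a(n) = a(n-1) + a(n-2) (the Fibonacci relation) is a linear recurrence of second order; T(n) = 2*T(n/2) + n (the merge sort recurrence) is a divide and conquer recurrence; and a(n) = 2*a(n-1) is a first order recurrence.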
Methods of solving Recurrences
● There are four methods for solving recurrences:
● 1. Substitution Method
● 2. Iteration Method
● 3. Recursion Tree Method
● 4. Master Method
Substitution Method
● Advantage:
● It can solve all kinds of recurrence relations, including ones that are not possible to solve using
the Master Theorem.
● The substitution method always gives the correct answer.
● Disadvantage:
● It requires many mathematical calculations, which can make it tedious for complicated recurrences.
1. Let us consider the recurrence relation
T(n) = 1 if n = 1
n*T(n-1) if n > 1
T(n)=n*T(n-1).................(1)
T(n-1)=(n-1)*T(n-1-1)
=(n-1)*T(n-2)........(2)
T(n-2)=(n-2)*T(n-3).........(3)
T(n)=n*(n-1)*T(n-2)............(4)
T(n)=n*(n-1)*(n-2)*T(n-3)
Similarly,
T(n)=n*(n-1)*(n-2)*(n-3)*...*T(n-(n-1)) = n*(n-1)*(n-2)*...*2*T(1) = n!
So T(n) = O(n!).
2. Let us consider the recurrence relation
T(n) = 1 if n = 1
2*T(n/2) + n if n > 1
Substituting repeatedly,
T(n) = 2*T(n/2) + n = 2^2*T(n/2^2) + 2n
T(n) = 2^3*T(n/2^3) + 3n ..........(5)
T(n) = 2^k*T(n/2^k) + k*n .........(6)
Choosing k so that n/2^k = 1, i.e. k = log2(n),
T(n) = 2^k*T(1) + k*n
=> n + n*log2(n)
=> O(n log n)
3. Another example of the same kind is
T(n) = 1 if n = 1
T(n-1) + log n if n > 1
Repeated substitution gives T(n) = log n + log(n-1) + ... + log 2 + T(1) = O(n log n).
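A quick way to check the closed form obtained in example 2 is to evaluate the recurrence directly for powers of two and compare it with n + n*log2(n); the small Python check below does exactly that.

from math import log2

def T(n):
    # Evaluate T(n) = 1 if n = 1, 2*T(n/2) + n otherwise
    # (n restricted to powers of two so that n/2 stays an integer).
    return 1 if n == 1 else 2 * T(n // 2) + n

for n in [2**k for k in range(1, 11)]:
    print(n, T(n), n + n * log2(n))    # T(n) matches n + n*log2(n) exactly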
Recursive Tree Method
Master’s Theorem
● The Master Theorem is a tool used to solve recurrence relations that arise in the analysis of divide-and-
conquer algorithms. It provides a systematic way of solving recurrence relations of the form:
T(n) = a*T(n/b) + f(n)
● a − the number of subproblems in the recursion (a >= 1).
● n/b − the size of each subproblem, based on the assumption that all subproblems are of the same size (b > 1).
● f(n) − the cost of the work done outside the recursion, of the form Θ(n^k * log^p(n)), where k >= 0 and p is a
real number.
Example 1: T(n) = 3*T(n/2) + log^2(n), so a = 3, b = 2, k = 0, p = 2.
Since log_b(a) = log2(3) > k, T(n) = Θ(n^(log_b a)) = Θ(n^(log2 3)).
Example 2: T(n) = 2*T(n/2) + n*log^2(n), so a = 2, b = 2, k = 1, p = 2.
Since log_b(a) = 1 = k and p > -1, T(n) = Θ(n^k * log^(p+1)(n)) = Θ(n * log^3(n)).
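The case analysis of the extended Master Theorem for f(n) = Θ(n^k * log^p(n)) can be written down directly as a small Python function. This is only a sketch of the usual case analysis; the function name and output format are illustrative.

from math import log

def master_theorem(a, b, k, p):
    # Extended Master Theorem for T(n) = a*T(n/b) + Theta(n^k * log(n)^p).
    crit = log(a, b)                         # the critical exponent log_b(a)
    if abs(crit - k) < 1e-9:                 # case 2: recursion and outside work balance
        if p > -1:
            return f"Theta(n^{k} * log(n)^{p + 1})"
        if p == -1:
            return f"Theta(n^{k} * log(log(n)))"
        return f"Theta(n^{k})"
    if crit > k:                             # case 1: the recursion dominates
        return f"Theta(n^{crit:.3f})"
    # case 3: the work outside the recursion dominates
    return f"Theta(n^{k} * log(n)^{p})" if p >= 0 else f"Theta(n^{k})"

print(master_theorem(3, 2, 0, 2))   # example 1: a=3, b=2, k=0, p=2 -> Theta(n^1.585)
print(master_theorem(2, 2, 1, 2))   # example 2: a=2, b=2, k=1, p=2 -> Theta(n^1 * log(n)^3)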