CSC 308-2
ANALYSIS - Part I
Dr. S. A. Bakura
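The code fragment analyzed below is not reproduced in these notes. From the lines quoted in the analysis (var M = A[ 0 ];, the loop header, and the if test), it is presumably the classic find-maximum loop over an array A of n elements; a reconstruction:

var M = A[ 0 ];

for ( var i = 0; i < n; ++i ) {
    if ( A[ i ] >= M ) {
        M = A[ i ];
    }
}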
Now, the first thing we’ll do is count how many basic operations this piece of code executes. As we analyze it, we want to break it up into simple instructions: things that can be executed by the CPU directly, or close to that. We’ll assume our processor can execute each of the following operations as one instruction:

• Assigning a value to a variable
• Looking up the value of a particular element in an array
• Comparing two values
• Incrementing a value
• Basic arithmetic operations such as addition and multiplication
We’ll assume branching (the choice between if and else parts of code
after the if condition has been evaluated) occurs instantly and won’t count
these instructions. In the above code, the first line of code is:
var M = A[ 0 ];
This requires 2 instructions: one for looking up A[ 0 ] and one for assigning the value to M (we’re assuming that n is always at least 1). These two instructions are always required by the algorithm, regardless of the value of n. The for loop initialization code also always has to run. This gives us two more instructions: an assignment and a comparison:
i = 0;
i < n;
These will run before the first for loop iteration. After each for loop
iteration, we need two more instructions to run, an increment of i and a
comparison to check if we’ll stay in the loop:
++i;
i < n;
So, if we ignore the loop body, the number of instructions this algorithm needs is 4 + 2n: 4 instructions before the first iteration and 2 instructions at the end of each of the n iterations. We can now define a mathematical function f(n) that, given an n, gives us the number of instructions the algorithm needs. For an empty for body, we have f(n) = 4 + 2n.
Looking at the loop body, every iteration evaluates the condition

if ( A[ i ] >= M ) { ...

which always costs two instructions: an array lookup and a comparison. When the condition holds, the body M = A[ i ]; costs two more (a lookup and an assignment). In the worst case the if body runs on every iteration, giving f(n) = 4 + 2n + 2n + 2n = 6n + 4.
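As a sanity check (an illustration added here, not part of the original notes; the function name and the use of TypeScript are assumptions), we can instrument the fragment with a counter and confirm the worst-case count of 6n + 4 on an ascending array, where the if body runs on every iteration:

// Count the basic instructions of the find-maximum fragment, using the
// conventions above: array lookups, assignments, comparisons, and
// increments cost one instruction each; branching itself is free.
function countInstructions(A: number[]): number {
    const n = A.length;
    let count = 0;

    let M = A[0];        // one lookup + one assignment
    count += 2;

    let i = 0;           // loop initialization: one assignment...
    count += 2;          // ...plus the first comparison i < n
    while (i < n) {
        count += 2;      // lookup A[i] + comparison with M
        if (A[i] >= M) {
            M = A[i];    // lookup + assignment
            count += 2;
        }
        ++i;             // increment...
        count += 2;      // ...plus the comparison i < n
    }
    return count;
}

const n = 10;
const ascending = Array.from({ length: n }, (_, k) => k); // worst case
console.log(countInstructions(ascending) === 6 * n + 4);  // true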
To describe how fast f grows, we keep only the fastest-growing term and drop constant factors. For example:

• f( n ) = 8n + 34 gives f( n ) = n
• f( n ) = 300 gives f( n ) = 1
• f( n ) = n² + 6n + 42 gives f( n ) = n²
• f( n ) = nⁿ + n gives f( n ) = nⁿ
Mathematically speaking, what we’re saying here is that we’re interested in the limit of the function f as n tends to infinity. In a strict mathematical setting we could not simply drop the constants inside the limit, but for complexity-analysis purposes in computer science we want to do exactly that, for the reasons described above.
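For instance (a worked check added here for concreteness), for f(n) = 8n + 34 we have

$$\lim_{n \to \infty} \frac{8n + 34}{n} = 8,$$

a finite positive constant, so 8n + 34 grows at the same rate as n; the factor of 8 is precisely the kind of constant that complexity analysis ignores.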
Example 2: What is the running time for this code fragment?
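The fragment for this example has not survived in these notes. For illustration only (this is not the original fragment; the variable names and input size are assumptions), a typical fragment of the kind such examples analyze is:

const n = 100;              // hypothetical input size
let sum = 0;
for (let i = 0; i < n; i++)
    for (let j = 0; j < n; j++)
        sum++;              // executes n * n times
console.log(sum === n * n); // true

The innermost statement runs n · n times and everything else contributes lower-order terms, so the instruction count grows like n²; after dropping slower-growing terms and constants, f(n) = n².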
2.1 Upper Bound

Several notations are used to describe the running-time equation for an algorithm, depending on what aspect of the algorithm’s behavior is being described. One is the upper bound for the growth of the algorithm’s running time: it indicates the upper or highest growth rate that the algorithm can have, and is written using big-Oh notation.
Because the phrase “has an upper bound to its growth rate of f(n)” is
long and often used when discussing algorithms, we adopt a special nota-
tion, called big-Oh notation. If the upper bound for an algorithm’s growth
rate (for, say, the worst case) is f(n), then we would write that this algo-
rithm is “in the set O(f(n)) in the worst case” (or just “in O(f(n)) in the
worst case”). For example, if n² grows as fast as T(n) (the running time of our algorithm) for the worst-case input, we would say the algorithm is “in O(n²) in the worst case.” The following is a precise definition for an upper bound, in which T(n) represents the true running time of the algorithm and f(n) is some expression for the upper bound.

For T(n) a non-negatively valued function, T(n) is in set O(f(n)) if there exist two positive constants c and n0 such that T(n) ≤ cf(n) for all n > n0.
2.2 Lower Bound
Big-Oh notation describes an upper bound. In other words, big-Oh notation
states a claim about the greatest amount of some resource (usually time)
that is required by an algorithm for some class of inputs of size n (typically
the worst such input, the average of all possible inputs, or the best such
input).
Similar notation is used to describe the least amount of a resource that an algorithm needs for some class of inputs, i.e., the lower bound of an algorithm. The lower bound for an algorithm (or a problem, as explained later) is denoted by the symbol Ω, pronounced “big-Omega” or just “Omega.” The following definition for Ω is symmetric with the definition of big-Oh.

For T(n) a non-negatively valued function, T(n) is in set Ω(g(n)) if there exist two positive constants c and n0 such that T(n) ≥ cg(n) for all n > n0.
2.3 Θ Notation
The definitions for big-Oh and Ω give us ways to describe the upper bound
for an algorithm (if we can find an equation for the maximum cost of a
particular class of inputs of size n) and the lower bound for an algorithm (if
we can find an equation for the minimum cost for a particular class of inputs
of size n). When the upper and lower bounds are the same within a constant
factor, we indicate this by using Θ (big-Theta) notation. An algorithm is said to be Θ(h(n)) if it is in O(h(n)) and in Ω(h(n)). Note that we drop the word “in” for Θ notation, because there is a strict equality for two equations with the same Θ. In other words, if f(n) is Θ(g(n)), then g(n) is Θ(f(n)).
Because the sequential search algorithm is both in O(n) and in Ω(n) in
the average case, we say it is Θ(n) in the average case.
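As a worked illustration of these definitions (added here; the constants shown are one valid choice, not the only one), take T(n) = 3n² + 2n:

$$3n^2 + 2n \le 5n^2 \ \text{for all } n \ge 1, \ \text{so } T(n) \in O(n^2) \text{ with } c = 5,\ n_0 = 1,$$
$$3n^2 + 2n \ge 3n^2 \ \text{for all } n \ge 1, \ \text{so } T(n) \in \Omega(n^2) \text{ with } c = 3,\ n_0 = 1.$$

Since both bounds hold, T(n) is Θ(n²).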
Given an algebraic equation describing the time requirement for an al-
gorithm, the upper and lower bounds always meet. That is because in some
sense we have a perfect analysis for the algorithm, embodied by the running-
time equation. For many algorithms (or their instantiations as programs),
it is easy to come up with the equation that defines their runtime behavior.
2.4 Simplifying Rules

Once you have determined the running-time equation for an algorithm, deriving the big-Oh, Ω, and Θ expressions from it is straightforward. You do not need to resort to the formal definitions of asymptotic analysis. Instead, you can use the following rules to determine the simplest form.

1. If f(n) is in O(g(n)) and g(n) is in O(h(n)), then f(n) is in O(h(n)).
2. If f(n) is in O(kg(n)) for any constant k > 0, then f(n) is in O(g(n)).
3. If f1(n) is in O(g1(n)) and f2(n) is in O(g2(n)), then f1(n) + f2(n) is in O(max(g1(n), g2(n))).
4. If f1(n) is in O(g1(n)) and f2(n) is in O(g2(n)), then f1(n)f2(n) is in O(g1(n)g2(n)).
By analogy with comparisons between numbers, the asymptotic notations can be read as follows:

Our algorithm is O( something )   A number is ≤ something
Our algorithm is o( something )   A number is < something
Our algorithm is Θ( something )   A number is = something
Our algorithm is Ω( something )   A number is ≥ something
Our algorithm is ω( something )   A number is > something
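The fragment analyzed next is not reproduced in these notes; from the analysis that follows, it is presumably of this form (a reconstruction, with assumed variable names):

sum = 0;              // first line: constant time
for (i=1; i<=n; i++)  // second line: executed n times
    sum += n;         // third line: constant time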
The first line is Θ(1). The for loop is repeated n times. The third line
takes constant time so, by simplifying rule (4) of Section 2.4, the total cost
for executing the two lines making up the for loop is Θ(n). By rule (3), the
cost of the entire code fragment is also Θ(n).
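The next fragment is likewise missing; based on the analysis below, it presumably had this shape (a reconstruction, with assumed variable names):

sum = 0;                  // assignment: constant time c1
for (i=1; i<=n; i++)      // first for loop: a double loop
    for (j=1; j<=i; j++)  //   inner loop executes i times
        sum++;            //   constant time c3
for (k=0; k<n; k++)       // second for loop: Θ(n)
    A[k] = k;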
This code fragment has three separate statements: the first assignment statement and the two for loops. Again the assignment statement takes constant time; call it c1. The second for loop is just like the single loop analyzed above and takes c2n = Θ(n) time.

The first for loop is a double loop and requires a special technique. We work from the inside of the loop outward. The expression sum++ requires constant time; call it c3. Because the inner for loop is executed i times, by simplifying rule (4) it has cost c3i. The outer for loop is executed n times, but each time the cost of the inner loop is different because it costs c3i with i changing each time. You should see that for the first execution of the outer loop, i is 1. For the second execution of the outer loop, i is 2. Each time through the outer loop, i becomes one greater, until the last time through the loop when i = n. Thus, the total cost of the loop is c3 times the sum of the integers 1 through n.
From our mathematical preliminaries, we know that

$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}. \qquad (2)$$

Therefore, the cost of the double loop is c3 n(n+1)/2, which is Θ(n²). By simplifying rule (3), the cost of the whole fragment, Θ(c1 + c2 n + c3 n²), is simply Θ(n²).
Algorithm 5: Not all doubly nested for loops are Θ(n²). The following pair of nested loops illustrates this fact.

1:
sum1 = 0;
for (k=1; k<=n; k*=2)     // Do log n times
    for (j=1; j<=n; j++)  // Do n times
        sum1++;

2:
sum2 = 0;
for (k=1; k<=n; k*=2)     // Do log n times
    for (j=1; j<=k; j++)  // Do k times
        sum2++;

In fragment 1, the inner loop always executes n times, so the total cost is Θ(n log n). In fragment 2, the inner loop executes k times, and k doubles on each of the log n outer iterations; the total number of increments is 1 + 2 + 4 + ... + n ≈ 2n, so the cost is only Θ(n).
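To see these counts concretely (an illustration added here, not part of the original notes; the function name is an assumption), we can tally how often each increment executes and compare against n(log₂ n + 1) and 2n − 1:

// Tally the number of times sum1++ and sum2++ execute for a given n.
function tally(n: number): [number, number] {
    let sum1 = 0;
    for (let k = 1; k <= n; k *= 2)   // ~log n outer iterations
        for (let j = 1; j <= n; j++)  // n inner iterations each time
            sum1++;
    let sum2 = 0;
    for (let k = 1; k <= n; k *= 2)   // ~log n outer iterations
        for (let j = 1; j <= k; j++)  // k inner iterations each time
            sum2++;
    return [sum1, sum2];
}

const size = 1024; // a power of two, so log2(size) is exact
const [sum1, sum2] = tally(size);
console.log(sum1 === size * (Math.log2(size) + 1)); // true: Θ(n log n)
console.log(sum2 === 2 * size - 1);                 // true: Θ(n)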
3 Complexity Classes
Let’s start by reviewing the definitions of some concepts.
3.1 Decision problem
A decision problem is a computational problem to which the answer is either “yes” or “no.” In mathematical language, we can think of a decision problem as a function whose domain is the set of possible input strings (over the alphabet {0,1}, the ASCII characters, etc.) and whose range is {0, 1} (with 0 meaning “no” and 1 meaning “yes”).
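As a tiny illustration (added here, not part of the original notes; the problem and function name are assumptions), a decision problem can be written directly as a boolean-valued function on strings:

// A decision problem as a function from input strings to {0, 1}:
// "Is the input string a palindrome?"
function isPalindrome(x: string): 0 | 1 {
    const reversed = x.split("").reverse().join("");
    return x === reversed ? 1 : 0;
}

console.log(isPalindrome("10101")); // 1 ("yes")
console.log(isPalindrome("10"));    // 0 ("no")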
3.4 Problem
Given a graph G and an integer k, is there a spanning tree of size less than
k?
For most real-world applications, search problems are much more important than decision problems. So why do we restrict our attention to decision problems when defining complexity classes? One reason is robustness: the resulting definitions do not depend on the choice of reasonable, mainstream models of computation.

The class P consists of the decision problems that can be solved in polynomial time. Examples of problems solvable in polynomial time: sorting, matrix multiplication, matching. The class NP consists of the decision problems whose “yes” answers can be verified in polynomial time: a problem Q is in NP if there is a polynomial-time verifier A such that an input x is a “yes” instance of Q exactly when A(x, y) = 1 for some certificate y.
5.1 Proposition
P ⊆ NP.
5.2 Proof
Given a decision problem Q in P, view Q as a function whose domain is the set of input strings and whose range is {0, 1}. Since Q can be computed in polynomial time, we can just take the verifier to be A(x, y) = Q(x). In this case, the verifier simply re-solves the entire problem, ignoring the certificate y. The converse of the above proposition is a famous open problem:
5.3 Problem
(P vs. NP). Is it true that P = NP? The vast majority of computer scientists believe that P ≠ NP, and so the P vs. NP problem is sometimes called the P ≠ NP problem. If it were true that P = NP, then lots of problems that seem hard would actually be easy. The P vs. NP relationship is illustrated in Figure 1.
Figure 1: P vs. NP.
An algorithm runs in polynomial time if its running time is Θ(nᵏ) for some constant k: for example, Θ(n), Θ(n²), or Θ(n⁴). These are all examples of polynomial running time, because the exponents for all terms of these equations are constants.
5.6 NP-hard
A problem X is NP-hard if any problem in NP can be reduced to X in polynomial time. Thus, X is at least as hard as any problem in NP.
Figure 2: An illustration of the TRAVELING SALESMAN problem. Five
vertices are shown, with edges between each pair of cities. The problem is
to visit all of the cities exactly once, returning to the start city, with the
least total cost.
5.7 NP-Complete
These are the problems for which we know efficient non-deterministic algorithms, but we do not know whether there are efficient deterministic algorithms. At the same time, we have not been able to prove that any of these problems do not have efficient deterministic algorithms. This class of problems is called NP-complete. What is truly strange and fascinating about NP-complete problems is that if anybody ever finds a polynomial-time solution to any one of them on a regular computer, then, by a series of reductions, every other problem in NP can also be solved in polynomial time on a regular computer!
Definition of NP-Complete: A problem X is defined to be NP-complete if
1. X is in NP, and
2. X is NP-hard.
Figure 3: Our knowledge regarding the world of problems requiring expo-
nential time or less. Some of these problems are solvable in polynomial
time by a non-deterministic computer. Of these, some are known to be
NP-complete, and some are known to be solvable in polynomial time on a
regular computer.
Proving that a problem is NP-complete is, in effect, saying that the most brilliant computer scientists for the last 50 years have been trying and failing to find a polynomial-time algorithm for it.