Module 1
What Is an Algorithm?
An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for
obtaining a required output for any legitimate input in a finite amount of time.
As examples illustrating the notion of the algorithm, we consider in this section three
methods for solving the same problem: computing the greatest common divisor of two
integers.
The greatest common divisor of two nonnegative, not-both-zero integers m and n, denoted
gcd(m, n), is defined as the largest integer that divides both m and n evenly, i.e., with a
remainder of zero.
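One of these methods, Euclid's algorithm, repeatedly applies the equality gcd(m, n) = gcd(n, m mod n) until the second number becomes 0. Below is a minimal Python sketch of that idea (the function name euclid_gcd is an assumption of this sketch, not from the notes).

def euclid_gcd(m, n):
    # Euclid's algorithm: replace (m, n) by (n, m mod n) until n becomes 0.
    # Assumes m and n are nonnegative integers, not both zero.
    while n != 0:
        m, n = n, m % n
    return m

# Example: euclid_gcd(60, 24) returns 12.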
Note that unlike Euclid’s algorithm, the consecutive integer checking method, in the form presented, does not work
correctly when one of its input numbers is zero. This example illustrates why it is so
important to specify the set of an algorithm’s inputs explicitly and carefully.
The third procedure for finding the greatest common divisor should be
familiar to you from middle school.
The middle-school procedure does not qualify, in the form presented, as a legitimate
algorithm. Why? Because the prime factorization steps are not defined unambiguously: they
require a list of prime numbers, and the middle-school math teacher did not explain how to obtain
such a list.
So, let us introduce a simple algorithm for generating consecutive primes not exceeding any
given integer n > 1.
It was probably invented in ancient Greece and is known as the sieve of Eratosthenes.
The algorithm starts by initializing a list of prime candidates with consecutive integers
from 2 to n.
Then, on its first iteration, the algorithm eliminates from the list all multiples of 2, i.e.,
4, 6, and so on.
Then it moves to the next item on the list, which is 3, and eliminates its multiples.
No pass for number 4 is needed: since 4 itself and all its multiples are also multiples
of 2, they were already eliminated on a previous pass.
The next remaining number on the list, which is used on the third pass, is 5.
The algorithm continues in this fashion until no more numbers can be eliminated from
the list.
The remaining integers of the list are the primes needed.
What is the largest number p whose multiples can still remain on the list to make
further iterations of the algorithm necessary?
Let us first note that if p is a number whose multiples are being eliminated on the
current pass, then the first multiple we should consider is p · p, because all its smaller
multiples 2p, . . . , (p − 1)p have been eliminated on earlier passes through the list.
This observation helps to avoid eliminating the same number more than once.
Obviously, p · p should not be greater than n, and therefore p cannot exceed √n
rounded down (denoted ⌊√n⌋ using the so-called floor function).
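A minimal Python sketch of the sieve as described above (the function name sieve_of_eratosthenes is mine); it eliminates the multiples of each remaining p, starting from p · p, for all p not exceeding ⌊√n⌋.

def sieve_of_eratosthenes(n):
    # Generate all primes not exceeding n (assumes n > 1).
    is_prime = [True] * (n + 1)                      # is_prime[k]: is k still on the candidate list?
    p = 2
    while p * p <= n:                                # p need not exceed floor(sqrt(n))
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):  # start at p*p; smaller multiples were removed earlier
                is_prime[multiple] = False
        p += 1
    return [k for k in range(2, n + 1) if is_prime[k]]

# Example: sieve_of_eratosthenes(25) returns [2, 3, 5, 7, 11, 13, 17, 19, 23].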
Before designing an algorithm, you need to understand the given problem completely. Read the
problem’s description carefully and ask questions if you have any doubts about the problem,
do a few small examples by hand, think about special cases, and ask questions again if
needed.
There are a few types of problems that arise in computing applications quite often. If the
problem in question is one of them, you might be able to use a known algorithm for solving
it.
Once you completely understand a problem, you need to ascertain the capabilities of the
computational device the algorithm is intended for.
Most algorithms are designed to be executed on a machine modeled by the random-access machine (RAM). Its central assumption is that instructions are executed one
after another, one operation at a time. Accordingly, algorithms designed to be executed on
such machines are called sequential algorithms.
The central assumption of the RAM model does not hold for some newer computers that can
execute operations concurrently, i.e., in parallel. Algorithms that take advantage of this
capability are called parallel algorithms.
The next principal decision is to choose between solving the problem exactly or solving it
approximately.
In the former case, an algorithm is called an exact algorithm; in the latter case, an algorithm
is called an approximation algorithm.
First, there are important problems that simply cannot be solved exactly for most of their
instances; examples include extracting square roots, solving nonlinear equations, and
evaluating definite integrals.
Second, available algorithms for solving a problem exactly can be unacceptably slow because
of the problem’s intrinsic complexity. This happens, in particular, for many problems
involving a very large number of choices.
Algorithm design techniques distill a few key ideas that have proven to be useful in designing algorithms. Learning
these techniques is of utmost importance for the following reasons.
First, they provide guidance for designing algorithms for new problems, i.e., problems for
which there is no known satisfactory algorithm.
Second, algorithms are the cornerstone of computer science. Algorithm design techniques
make it possible to classify algorithms according to an underlying design idea; therefore, they
can serve as a natural way to both categorize and study algorithms.
While the algorithm design techniques do provide a powerful set of general approaches to
algorithmic problem solving, designing an algorithm for a particular problem may still be a
challenging task.
Sometimes, several techniques need to be combined, and there are algorithms that are hard to
pinpoint as applications of the known design techniques.
Even when a particular design technique is applicable, getting an algorithm often requires a
nontrivial ingenuity on the part of the algorithm designer.
With practice, both tasks—choosing among the general techniques and applying them—get
easier, but they are rarely easy.
Of course, one should pay close attention to choosing data structures appropriate for the
operations performed by the algorithm.
The fundamental importance of both algorithms and data structures for computer
programming is emphasized by the very title of Niklaus Wirth’s classic book, Algorithms + Data Structures = Programs.
In the new world of object-oriented programming, data structures remain crucially important
for both design and analysis of algorithms.
Once you have designed an algorithm, you need to specify it in some fashion.
An algorithm can be described in words (in free or step-by-step form) or in pseudocode;
these are the two options most widely used nowadays for specifying algorithms.
Using a natural language (words) has an obvious appeal; however, the inherent ambiguity of
any natural language makes a succinct and clear description of algorithms surprisingly
difficult.
In the earlier days of computing, the dominant vehicle for specifying algorithms was a
flowchart, a method of expressing an algorithm by a collection of connected geometric
shapes containing descriptions of the algorithm’s steps.
The state of the art of computing has not yet reached a point where an algorithm’s
description—be it in a natural language or pseudocode—can be fed into an electronic
computer directly. Instead, it needs to be converted into a computer program written in a
particular computer language.
Once an algorithm has been specified, you have to prove its correctness. That is, you have to
prove that the algorithm yields a required result for every legitimate input in a finite amount
of time.
For some algorithms, a proof of correctness is quite easy; for others, it can be quite complex.
A common technique for proving correctness is to use mathematical induction because an
algorithm’s iterations provide a natural sequence of steps needed for such proofs.
But in order to show that an algorithm is incorrect, you need just one instance of its input for
which the algorithm fails.
Analyzing an Algorithm
There are two kinds of algorithm efficiency: time efficiency, indicating how fast the
algorithm runs, and space efficiency, indicating how much extra memory it uses.
Yet another desirable characteristic of an algorithm is generality. There are, in fact, two
issues here: generality of the problem the algorithm solves and the set of inputs it accepts.
Coding an Algorithm
The term “analysis of algorithms” is usually used in a narrower, technical sense to mean an
investigation of an algorithm’s efficiency with respect to two resources: running time and
memory space.
Time efficiency, also called time complexity, indicates how fast the algorithm in question runs.
Space efficiency, also called space complexity, refers to the amount of memory units required
by the algorithm in addition to the space needed for its input and output.
An algorithm’s efficiency is investigated as a function of some parameter n indicating the size of the algorithm’s input: for example, it takes longer to sort larger arrays, multiply larger matrices, and so on.
For example, n will be the size of the list for problems of sorting, searching, finding
the list’s smallest element, and most other problems dealing with lists.
For the problem of evaluating a polynomial p(x) = a_n x^n + . . . + a_0 of degree n, it will
be the polynomial’s degree or the number of its coefficients, which is larger by 1 than
its degree.
The choice of an appropriate size metric can be influenced by operations of the algorithm in
question.
For example, how should we measure an input’s size for a spell-checking algorithm?
If the algorithm examines individual characters of its input, we should measure the
size by the number of characters; if it works by processing words, we should count
their number in the input.
We should make a special note about measuring input size for algorithms solving problems
such as checking primality of a positive integer n.
Here, the input is just one number, and it is this number’s magnitude that determines
the input size. In such situations, it is preferable to measure size by the number b of
bits in n’s binary representation: b = ⌊log2 n⌋ + 1.
To measure the running time of a program implementing the algorithm, we can simply use some standard unit of time measurement—a second, a millisecond, and
so on.
Since we are after a measure of the algorithm’s efficiency, however, we would like a metric that
does not depend on extraneous factors such as the speed of a particular computer or the quality of a program implementing the algorithm.
One possible approach is to count the number of times each of the algorithm’s operations is
executed. This approach is both excessively difficult and, as we shall see, usually
unnecessary.
The thing to do is to identify the most important operation of the algorithm, called the basic
operation, the operation contributing the most to the total running time, and compute the
number of times the basic operation is executed.
As a rule, it is not difficult to identify the basic operation of an algorithm: it is usually the
most time-consuming operation in the algorithm’s innermost loop.
Thus, the established framework for the analysis of an algorithm’s time efficiency suggests
measuring it by counting the number of times the algorithm’s basic operation is executed on
inputs of size n.
Let c_op be the execution time of an algorithm’s basic operation on a particular computer, and
let C(n) be the number of times this operation needs to be executed for this algorithm. Then
we can estimate the running time T(n) of a program implementing this algorithm on that
computer by the formula T(n) ≈ c_op C(n).
Orders of Growth
A difference in running times on small inputs is not what really distinguishes efficient
algorithms from inefficient ones.
Among the functions commonly used to describe growth, the one growing the slowest is the logarithmic function.
On the other end of the spectrum are the exponential function 2^n and the factorial function n!.
Both these functions grow so fast that their values become astronomically large even
for rather small values of n.
Algorithms that require an exponential number of operations are practical for solving
only problems of very small sizes.
But there are many algorithms for which running time depends not only on an input size but
also on the specifics of a particular input.
Consider sequential search as an example: clearly, the running time of this algorithm can be quite different for the same list size n.
In the worst case, when there are no matching elements or the first matching element
happens to be the last one on the list, the algorithm makes the largest number of key
comparisons among all possible inputs of size n:
C_worst(n) = n.
The worst-case efficiency of an algorithm is its efficiency for the worst-case input of size n,
which is an input (or inputs) of size n for which the algorithm runs the longest among all
possible inputs of that size.
The best-case efficiency of an algorithm is its efficiency for the best-case input of size n,
which is an input (or inputs) of size n for which the algorithm runs the fastest among all
possible inputs of that size.
For example, the best-case inputs for sequential search are lists of size n with their first
element equal to a search key; accordingly, C_best(n) = 1 for this algorithm.
Note, however, that neither the worst-case analysis nor its best-case counterpart yields the
necessary information about an algorithm’s behavior on a “typical” or “random” input. This
is the information that the average-case efficiency seeks to provide. To analyze the
algorithm’s average-case efficiency, we must make some assumptions about possible inputs
of size n.
For example, if p = 1 (the search must be successful), the average number of key
comparisons made by sequential search is (n + 1)/2; that is, the algorithm will inspect,
on average, about half of the list’s elements.
If p = 0 (the search must be unsuccessful), the average number of key comparisons
will be n because the algorithm will inspect all n elements on all such inputs.
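The two special cases above follow from the standard average-case count for sequential search, sketched here under the usual assumptions (p is the probability of a successful search, and a successful match is equally likely at each of the n positions):
C_avg(n) = p · [1 + 2 + . . . + n]/n + (1 − p) · n = p(n + 1)/2 + (1 − p) · n.
Setting p = 1 gives (n + 1)/2, and setting p = 0 gives n, as stated above.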
To compare and rank such orders of growth, computer scientists use three notations:
O (big oh),
Ω (big omega), and
Θ (big theta).
O-notation (big oh)
t(n) and g(n) can be any nonnegative functions defined on the set of natural numbers.
In the context we are interested in, t(n) will be an algorithm’s running time and g(n) will be
some simple function to compare the count with.
Informally, O(g(n)) is the set of all functions with a lower or the same order of growth as g(n) (to within a constant multiple, as n goes to infinity).
Examples: n ∈ O(n2), 100n + 5 ∈ O(n2), (1/2)n(n − 1) ∈ O(n2).
Indeed, the first two functions are linear and hence have a lower order of growth than g(n) =
n2, while the last one is quadratic and hence has the same order of growth as n2. On the other
hand, n3 ∉ O(n2), 0.00001n3 ∉ O(n2), n4 + n + 1 ∉ O(n2).
Indeed, the functions n3 and 0.00001n3 are both cubic and hence have a higher order of
growth than n2, and so has the fourth-degree polynomial n4 + n + 1.
Ω-notation
The second notation, Ω(g(n)), stands for the set of all functions with a higher or same order of
growth as g(n) (to within a constant multiple, as n goes to infinity).
For example, n3 ∈ Ω(n2), (1/2)n(n − 1) ∈ Ω(n2), but 100n + 5 ∉ Ω(n2).
Θ-notation
The third notation, Θ(g(n)), stands for the set of all functions with the same order of growth as g(n) (to within a constant multiple, as n goes to infinity).
Though the formal definitions of O, Ω, and Θ are indispensable for proving their abstract
properties, they are rarely used for comparing the orders of growth of two specific functions.
A much more convenient method for doing so is based on computing the limit of the ratio of the
two functions in question: the limit of t(n)/g(n) as n goes to infinity can be 0 (t(n) has a smaller order of growth than g(n)), a positive constant c (t(n) has the same order of growth as g(n)), or ∞ (t(n) has a larger order of growth than g(n)).
Note that the first two cases mean that t(n) ∈ O(g(n)), the last two mean that t(n) ∈ Ω(g(n)),
and the second case means that t(n) ∈ Θ(g(n)). The limit-based approach is often more
convenient than the one based on the definitions because it can take advantage of the
powerful calculus techniques developed for computing limits, such as L’Hospital’s rule.
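As an illustration (not taken from the notes), the limit-based approach with L’Hospital’s rule shows that log2 n grows slower than √n:
lim (n → ∞) log2 n / √n = lim (n → ∞) [1/(n ln 2)] / [1/(2√n)] = lim (n → ∞) 2/(√n ln 2) = 0,
hence log2 n ∈ O(√n), and its order of growth is strictly lower.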
It may come as a surprise that the time efficiencies of a large number of algorithms fall into
only a few classes. These classes are listed in Table in increasing order of their orders of
growth, along with their names and a few comments.
EXAMPLE 1 Consider the problem of finding the value of the largest element in a list of n
numbers. For simplicity, we assume that the list is implemented as an array.
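A minimal Python sketch of this brute-force scan (the function name max_element is mine); the basic operation is the comparison in the loop, executed n − 1 times, so the algorithm is Θ(n).

def max_element(a):
    # Return the value of the largest element in a nonempty list a.
    max_val = a[0]
    for i in range(1, len(a)):
        if a[i] > max_val:      # basic operation: one comparison per remaining element
            max_val = a[i]
    return max_val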
EXAMPLE 2 Consider the element uniqueness problem: check whether all the elements in
a given array of n elements are distinct.
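A minimal Python sketch of the brute-force approach (the function name unique_elements is mine): compare every pair of distinct elements, giving a worst-case count of n(n − 1)/2 comparisons, i.e., Θ(n2).

def unique_elements(a):
    # Return True if all elements of list a are distinct, False otherwise.
    n = len(a)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if a[i] == a[j]:    # basic operation: element comparison
                return False
    return True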
EXAMPLE 3 Given two n × n matrices A and B, find the time efficiency of the definition-
based algorithm for computing their product C = AB. By definition, C is an n × n matrix
whose elements are computed as the scalar (dot) products of the rows of matrix A and the
columns of matrix B: C[i, j] = A[i, 0]B[0, j] + . . . + A[i, n − 1]B[n − 1, j] for every pair of indices 0 ≤ i, j ≤ n − 1.
Basic Operation: There are two arithmetical operations in the innermost loop
here—multiplication and addition—that, in principle, can
compete for designation as the algorithm’s basic operation.
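A minimal Python sketch of the definition-based algorithm (the function name matrix_product is mine); with multiplication taken as the basic operation, the innermost statement runs n times for each of the n2 elements of C, so the count is n3, i.e., Θ(n3).

def matrix_product(A, B):
    # Definition-based multiplication of two n x n matrices given as lists of lists.
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]   # basic operation: one multiplication (plus one addition)
    return C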
EXAMPLE 4 The following algorithm finds the number of binary digits in the binary
representation of a positive decimal integer.
Input Size: Here, the input is just one number, and it is this number’s
magnitude that determines the input size.
Basic Operation: First, notice that the most frequently executed operation
here is not inside the while loop but rather the comparison n > 1.
Count of Basic Operation, C(n): A more significant feature of this example is the fact that
the loop variable takes on only a few values between its lower and upper limits.
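A minimal Python sketch of this algorithm (the function name binary_digit_count is mine); the comparison n > 1 is executed once more than the loop body, and the count is about ⌊log2 n⌋ + 1, i.e., Θ(log n).

def binary_digit_count(n):
    # Count the number of binary digits in the binary representation of a positive integer n.
    count = 1
    while n > 1:          # the comparison n > 1 is the most frequently executed operation
        count += 1
        n = n // 2
    return count

# Example: binary_digit_count(13) returns 4, since 13 is 1101 in binary.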
EXAMPLE 1 Compute the factorial function F(n) = n! for an arbitrary nonnegative integer
n. Since n! = 1 · . . . · (n − 1) · n = (n − 1)! · n for n ≥ 1
and 0! = 1 by definition, we can compute F(n) = F(n − 1) · n with the following recursive
algorithm.
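Since the notes refer to the recursive algorithm without listing it, here is a minimal Python sketch matching that description (the function name factorial is mine).

def factorial(n):
    # Compute F(n) = n! for a nonnegative integer n, using F(n) = F(n - 1) * n and F(0) = 1.
    if n == 0:
        return 1
    return factorial(n - 1) * n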
The number of multiplications M(n) needed to compute it must satisfy the equality
M(n) = M(n − 1) + 1 for n > 0, since M(n − 1) multiplications are spent to compute F(n − 1) and one more multiplication is needed to multiply the result by n.
Thus, we succeeded in setting up the recurrence relation and initial condition for the
algorithm’s number of multiplications M(n): M(n) = M(n − 1) + 1 for n > 0, with M(0) = 0.
The Tower of Hanoi puzzle. In this puzzle, we have n disks of different sizes that can slide
onto any of three pegs. Initially, all the disks are on the first peg in order of size, the largest
on the bottom and the smallest on top.
The goal is to move all the disks to the third peg, using the second one as an auxiliary, if
necessary. We can move only one disk at a time, and it is forbidden to place a larger disk on
top of a smaller one.
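A minimal Python sketch of the standard recursive solution (the function name hanoi is mine): move n − 1 disks to the auxiliary peg, move the largest disk to the destination, then move the n − 1 disks on top of it; the number of moves satisfies M(n) = 2M(n − 1) + 1 with M(1) = 1, i.e., M(n) = 2^n − 1.

def hanoi(n, source, destination, auxiliary):
    # Print the sequence of moves that transfers n disks from source to destination.
    if n == 1:
        print(f"move disk 1 from {source} to {destination}")
        return
    hanoi(n - 1, source, auxiliary, destination)   # move n - 1 smaller disks out of the way
    print(f"move disk {n} from {source} to {destination}")
    hanoi(n - 1, auxiliary, destination, source)   # move them onto the largest disk

# Example: hanoi(3, "peg 1", "peg 3", "peg 2") prints 7 moves.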
Brute force is a straightforward approach to solving a problem, usually directly based on the
problem statement and definitions of the concepts involved.
Selection Sort
Consider the application of the brute-force approach to the problem of sorting: given a list of
n orderable items (e.g., numbers, characters from some alphabet, character strings), rearrange
them in nondecreasing order.
We start selection sort by scanning the entire given list to find its smallest element and
exchange it with the first element, putting the smallest element in its final position in the
sorted list.
Then we scan the list, starting with the second element, to find the smallest among the last n
− 1 elements and exchange it with the second element, putting the second smallest element
in its final position.
Generally, on the ith pass through the list, which we number from 0 to n − 2, the algorithm
searches for the smallest item among the last n − i elements and swaps it with A[i].
As an example, the action of the algorithm on the list 89, 45, 68, 90, 29, 34, 17 is illustrated
in Figure
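A minimal Python sketch of selection sort as described above (the function name selection_sort is mine).

def selection_sort(a):
    # Sort list a in nondecreasing order by repeatedly selecting the smallest remaining element.
    n = len(a)
    for i in range(n - 1):                        # passes 0 .. n - 2
        min_index = i
        for j in range(i + 1, n):
            if a[j] < a[min_index]:               # basic operation: key comparison
                min_index = j
        a[i], a[min_index] = a[min_index], a[i]   # swap the smallest remaining element into place
    return a

# Example: selection_sort([89, 45, 68, 90, 29, 34, 17]) returns [17, 29, 34, 45, 68, 89, 90].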
Bubble Sort
Another brute-force application to the sorting problem is to compare adjacent elements of the
list and exchange them if they are out of order. By doing it repeatedly, we end up “bubbling
up” the largest element to the last position on the list.
The next pass bubbles up the second largest element, and so on, until after n − 1 passes the
list is sorted. Pass i (0 ≤ i ≤ n − 2) of bubble sort can be represented by the following
diagram:
The action of the algorithm on the list 89, 45, 68, 90, 29, 34, 17 is illustrated as an example in
Figure
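A minimal Python sketch of bubble sort as described above (the function name bubble_sort is mine).

def bubble_sort(a):
    # Sort list a in nondecreasing order by repeatedly exchanging out-of-order adjacent elements.
    n = len(a)
    for i in range(n - 1):                     # passes 0 .. n - 2
        for j in range(n - 1 - i):
            if a[j + 1] < a[j]:                # adjacent elements are out of order
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

# Example: bubble_sort([89, 45, 68, 90, 29, 34, 17]) returns [17, 29, 34, 45, 68, 89, 90].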
Sequential Search
The algorithm simply compares successive elements of a given list with a given search key
until either a match is encountered (successful search) or the list is exhausted without finding
a match (unsuccessful search).
A simple extra trick is often employed in implementing sequential search: if we append the
search key to the end of the list, the search for the key will have to be successful, and
therefore we can eliminate the end of list check altogether.
C_avg(n) ≈ n/2.
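A minimal Python sketch of sequential search with the sentinel trick described above (the function name sequential_search is mine); appending the key guarantees a match, so the loop needs no separate end-of-list check.

def sequential_search(a, key):
    # Return the index of the first occurrence of key in list a, or -1 if key is not in a.
    items = a + [key]          # append the search key as a sentinel
    i = 0
    while items[i] != key:     # no end-of-list check needed: the sentinel stops the loop
        i += 1
    return i if i < len(a) else -1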
Given a string of n characters called the text and a string of m characters (m ≤ n) called the
pattern, find a substring of the text that matches the pattern.
To put it more precisely, we want to find i—the index of the leftmost character of the first
matching substring in the text—such that t_i = p_0, . . . , t_{i+j} = p_j, . . . , t_{i+m−1} = p_{m−1}, where the t’s are the text’s characters and the p’s are the pattern’s characters.
If matches other than the first one need to be found, a string-matching algorithm can simply
continue working until the entire text is exhausted.
Align the pattern against the first m characters of the text and start matching the
corresponding pairs of characters from left to right until either all the m pairs of the
characters match (then the algorithm can stop) or a mismatching pair is encountered.
In the latter case, shift the pattern one position to the right and resume the character
comparisons, starting again with the first character of the pattern and its counterpart in the
text.
Note that the last position in the text that can still be a beginning of a matching substring is n
− m (provided the text positions are indexed from 0 to n − 1).
Beyond that position, there are not enough characters to match the entire pattern; hence, the
algorithm need not make any comparisons there.
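A minimal Python sketch of this brute-force string-matching algorithm (the function name brute_force_string_match is mine); it tries every alignment i from 0 to n − m.

def brute_force_string_match(text, pattern):
    # Return the index of the leftmost character of the first matching substring, or -1 if none.
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):                           # last feasible starting position is n - m
        j = 0
        while j < m and pattern[j] == text[i + j]:       # basic operation: character comparison
            j += 1
        if j == m:                                       # all m characters matched
            return i
    return -1

# Example: brute_force_string_match("NOBODY_NOTICED_HIM", "NOT") returns 7.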
Thus, in the worst case, the algorithm makes m (n − m + 1) character comparisons, which
puts it in the O(nm) class.
Therefore, the average-case efficiency should be considerably better than the worst-case
efficiency.
Indeed, it is: for searching in random texts, it has been shown to be linear, i.e., Θ(n).