ADA Module 1
MODULE 1
INTRODUCTION
Algorithm: It is a sequence of computational steps that transforms the input into the output.
A procedure for solving a mathematical problem in a finite number of steps that frequently involves
recursive operations.
Diagram:
Uses: Algorithms are used to analyze, process and extract insights from large amounts of data in fields
such as marketing, finance and healthcare.
All algorithms should satisfy the following criteria or properties:
1. INPUT: Zero or more quantities are externally supplied.
2. OUTPUT: At least one quantity is produced.
3. DEFINITENESS: Each instruction is clear and unambiguous.
4. FINITENESS: If we trace out the instructions of an algorithm, then for all cases, the
algorithm terminates after a finite number of steps.
5. EFFECTIVENESS: The instructions should be simple and should transform the given input
to the desired output.
Dept. of Data Science and AIML, A.I.T, CKM
Analysis and Design of Algorithms (BCS401) Module 1
GCD of two non-negative integers m and n (not both zero), denoted gcd(m,n), is defined as the largest
integer that divides both m and n evenly, i.e. with a remainder of zero.
Euclid’s algorithm is based on repeatedly applying the equality gcd(m,n) = gcd(n, m mod n) until
m mod n = 0, where m mod n is the remainder of the division of m by n.
Example: gcd(60,24) = gcd(24,12) = gcd(12,0) =12.
Algorithm:
Step 1: If n=0, return the value of m as the answer and stop, otherwise proceed to step 2.
Step 2: Divide m by n and assign the value of the remainder to r.
Step 3: Assign the value of n to m and the value of r to n. Go to step 1.
The algorithm terminates because the second number of the pair gets smaller with each iteration and
cannot become negative. Indeed, the new value of n on the next iteration is m mod n, which is always
smaller than n. Hence, the value of the second number in the pair eventually becomes zero, and the
algorithm stops.
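The three steps above can be sketched in Python as follows. This is a minimal illustration, assuming non-negative integer inputs that are not both zero; the function name gcd is chosen here for illustration only.

```python
def gcd(m, n):
    """Euclid's algorithm for non-negative integers m and n (not both zero)."""
    while n != 0:        # Step 1: when n is 0, m is the answer
        r = m % n        # Step 2: r is the remainder of m divided by n
        m, n = n, r      # Step 3: replace the pair (m, n) by (n, r)
    return m


print(gcd(60, 24))  # 12, matching gcd(60,24) = gcd(24,12) = gcd(12,0)
```

Because n strictly decreases on every pass through the loop, the loop cannot run forever.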
Drawback: Unlike Euclid’s algorithm, this algorithm does not work correctly when one of its input
numbers is zero.
So it’s important to specify the range of an algorithm’s inputs explicitly and carefully.
Sieve of Eratosthenes:
It is a simple algorithm for generating consecutive primes not exceeding any given integer n.
The algorithm starts by initializing a list of prime candidates with consecutive integers from 2 to n.
Then, on the first iteration of the algorithm, it eliminates from the list all multiples of 2, i.e. 4, 6 and
so on. Then it moves to the next item on the list, which is 3, and eliminates its multiples. No pass is
needed for the number 4: since 4 itself and all its multiples are also multiples of 2, they were already
eliminated on a previous pass. The next remaining number on the list, which is used on the 3rd pass, is 5.
The algorithm continues in this way until no more numbers can be eliminated from the list.
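The description above can be sketched in Python as follows. This is an illustrative version under two assumptions that are not stated in the text: each pass starts eliminating at p*p (smaller multiples of p were removed by earlier passes), and passes stop once p*p exceeds n. The name sieve is hypothetical.

```python
def sieve(n):
    """Return the list of primes not exceeding n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    is_candidate = [True] * (n + 1)      # index i stands for the integer i
    is_candidate[0] = is_candidate[1] = False
    p = 2
    while p * p <= n:                    # assumption: later passes remove nothing new
        if is_candidate[p]:              # p survived, so it is prime
            for multiple in range(p * p, n + 1, p):
                is_candidate[multiple] = False
        p += 1
    return [i for i in range(2, n + 1) if is_candidate[i]]


print(sieve(25))  # [2, 3, 5, 7, 11, 13, 17, 19, 23]
```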
Example: Consider the application of the algorithm for finding the list of primes not exceeding n=25.
After the passes for 2, 3 and 5, no more passes are needed because they would eliminate only numbers
that were already eliminated on previous iterations of the algorithm. The remaining numbers on the
list are the consecutive primes <=25.
NOTE:
So now we can incorporate the sieve of Eratosthenes into the middle-school procedure to get
a legitimate algorithm for computing the GCD of two positive integers.
1 is not considered a prime number.
Of the methods discussed here, Euclid’s algorithm is the most efficient for finding the GCD of two
whole numbers.
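One possible way to combine the two ideas is sketched below. It assumes the middle-school procedure means: find the prime factors of m and of n, then multiply the factors they have in common. That reading is an assumption, and all function names are hypothetical.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: primes not exceeding n."""
    flags = [True] * (n + 1)
    for p in range(2, n + 1):
        if flags[p]:
            for multiple in range(p * p, n + 1, p):
                flags[multiple] = False
    return [p for p in range(2, n + 1) if flags[p]]


def prime_factors(x, primes):
    """Prime factors of x, with repetition, in increasing order."""
    factors = []
    for p in primes:
        while x % p == 0:
            factors.append(p)
            x //= p
    return factors


def middle_school_gcd(m, n):
    """GCD of two positive integers as the product of common prime factors."""
    primes = primes_up_to(max(m, n))
    factors_m = prime_factors(m, primes)
    factors_n = prime_factors(n, primes)
    result = 1
    for p in factors_m:
        if p in factors_n:       # p is a common factor: use it once
            factors_n.remove(p)
            result *= p
    return result


print(middle_school_gcd(60, 24))  # 12
```

This sketch handles positive integers only, which illustrates why the range of valid inputs has to be stated.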
ANALYSIS FRAMEWORK:
The main purpose of algorithm analysis is to design the most efficient algorithms. The efficiency of an
algorithm depends on two factors:
1. Space efficiency
2. Time efficiency
The space efficiency of an algorithm is the amount of memory required to run the program
completely and efficiently.
The space complexity of an algorithm depends on following factors:
1. Program space: The space required for storing the machine program generated by the
compiler or assembler.
2. Data space: The space required to store the constants, variables, etc.
3. Stack space: The space required to store the return address along with parameters that are
passed to the function, local variables, etc.
Time efficiency is measured purely on how fast a given algorithm is executed.
Components that affect time efficiency are speed of the computer, choice of the programming
language, compiler used, choice of the algorithm, number of inputs/outputs and size of inputs/outputs.
The time efficiency of an algorithm depends on size of the input n and hence time efficiency is always
expressed in terms of n.
The time efficiency is normally computed by considering the basic operation.
Basic Operation: It is convenient to identify the most important operation of the algorithm, the one
that contributes most of the total running time. This basic operation is the most time-consuming
operation in the algorithm.
Examples: the operation in the innermost loop of the algorithm, the addition operation in matrix
addition and the multiplication operation in matrix multiplication.
To find the time efficiency, it is required to compute the number of times the basic operation is
executed.
Let c_op be the execution time of one basic operation of the algorithm, and let C(n) be the total
number of times the basic operation is executed. Then the running time T(n) is given approximately by
T(n) ≈ c_op * C(n)
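As a small illustration, the estimate "running time = time per basic operation multiplied by the number of basic operations" can be used to predict how the running time changes when the input size doubles. The count n(n-1)/2 and the value of c_op below are assumptions chosen only for this example.

```python
def basic_op_count(n):
    """Assumed count of basic operations: C(n) = n(n-1)/2."""
    return n * (n - 1) // 2


def estimated_time(n, c_op):
    """Estimated running time: c_op multiplied by C(n)."""
    return c_op * basic_op_count(n)


c_op = 1e-9  # hypothetical time of one basic operation, in seconds
ratio = estimated_time(2000, c_op) / estimated_time(1000, c_op)
print(ratio)  # close to 4: doubling n roughly quadruples the estimate
```

The value of c_op cancels out in the ratio, so the prediction depends only on how C(n) grows.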
Orders of growth: We would like algorithms to run fast for all values of n. Some algorithms execute
quickly for small values of n but slow down as the value of n increases. This change in behavior as
the value of n increases is called the order of growth.
The order of growth is normally determined for larger values of n for the following reasons-
1. The behavior of an algorithm changes as the value of n increases.
2. In real-world applications we normally encounter large values of n.
ASYMPTOTIC NOTATIONS: Asymptotic analysis studies how the value of a function varies for large
values of n (n is the size of the input).
The time complexity of an algorithm can be conveniently expressed using asymptotic notations.
The different types of asymptotic notations are shown below:
O - Big Oh
Ω - Big Omega
Θ - Big Theta
O (Big Oh): It is the formal method of expressing an upper bound on an algorithm’s running time.
It is a measure of the longest amount of time the algorithm could possibly take to complete.
Hence, it is commonly used to describe the worst case.
Let f(n) be the time efficiency of an algorithm.
The function f(n) is said to be big Oh of g(n), denoted by f(n) ϵ O(g(n)) or f(n) = O(g(n)), if
there exist a positive constant c and a positive integer n0 satisfying the constraint
f(n)<=c*g(n) for all n>= n0.
[Figure: graph of f(n) and c*g(n) versus n]
Here, c*g(n) is the upper bound. The upper bound indicates that f(n) will not consume more than the
specified time c*g(n) for n>=n0, i.e. the running time f(n) may not be equal to c*g(n), but it will
never be worse than this upper bound. So, we can say that f(n) grows no faster than g(n).
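The constraint in the definition of O can be checked numerically for a concrete pair of functions. The sketch below uses f(n) = 100n+5 and g(n) = n^2 with c = 101 and n0 = 5; these functions and constants are chosen here as an example. A check over a finite range is only a sanity test, not a proof.

```python
def upper_bound_holds(f, g, c, n0, n_max):
    """Check f(n) <= c*g(n) for every integer n with n0 <= n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


def f(n):
    return 100 * n + 5   # example running-time function


def g(n):
    return n * n         # candidate bounding function


print(upper_bound_holds(f, g, c=101, n0=5, n_max=10_000))  # True
print(upper_bound_holds(f, g, c=1, n0=5, n_max=10_000))    # False: c is too small
```

The second call shows that the choice of c and n0 matters: the definition only requires that some suitable pair exists.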
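Ω (Big Omega): Let f(n) be the time efficiency of an algorithm. The function f(n) is said to be big
Omega of g(n), denoted by f(n) ϵ Ω(g(n)), if there exist a positive constant c and a positive integer
n0 satisfying the constraint
f(n)>=c*g(n) for all n>=n0.
The lower bound on f(n) indicates that f(n) will consume at least the specified time c*g(n),
i.e. the algorithm has a running time that is never less than c*g(n) for n>=n0.
Hence, it is commonly used to describe the best case.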
Θ (Big Theta): Let f(n) be the time complexity of an algorithm. The function f(n) is said to be big
theta of g(n), denoted by f(n) ϵ Θ(g(n)) or f(n) = Θ(g(n)), if there exist positive constants
c1, c2 and a non-negative integer n0 satisfying the constraint
c1*g(n)<=f(n)<=c2*g(n) for all n>=n0.
[Figure: graph of f(n), c1*g(n) and c2*g(n) versus n]
The upper bound on f(n) indicates that f(n) will not consume more than the specified time
c2*g(n). The lower bound on f(n) indicates that f(n) will consume at least the specified time
c1*g(n).
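The two-sided constraint can be checked numerically in the same spirit. The sketch below uses f(n) = n(n-1)/2 and g(n) = n^2 with c1 = 1/4, c2 = 1/2 and n0 = 2; these constants are chosen here for illustration, and a finite check is not a proof.

```python
def sandwiched(f, g, c1, c2, n0, n_max):
    """Check c1*g(n) <= f(n) <= c2*g(n) for every integer n in [n0, n_max]."""
    return all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, n_max + 1))


def f(n):
    return n * (n - 1) / 2


def g(n):
    return n * n


print(sandwiched(f, g, c1=0.25, c2=0.5, n0=2, n_max=10_000))  # True
print(sandwiched(f, g, c1=0.25, c2=0.5, n0=1, n_max=10_000))  # False: fails at n = 1
```

The lower bound n^2/4 <= n(n-1)/2 holds exactly when n >= 2, which is why n0 = 2 works and n0 = 1 does not.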
Theorem: If f1(n) ϵ O(g1(n)) and f2(n) ϵ O(g2(n)) then f1(n)+f2(n) ϵ O(max{g1(n), g2(n)}). The
same assertion is true if we replace the O notation by the Ω or Θ notation.
The proof uses the following simple fact about four arbitrary numbers a1, a2, b1 and b2: if a1<=b1
and a2<=b2 then
a1+a2 <= 2*max{b1, b2}
Proof: By definition, f(n) ϵ O(g(n)) if there exist a positive constant c and a positive integer n0
such that f(n) <= c*g(n) for all n>=n0.
It is given that f1(n) ϵ O(g1(n)), so there exist c1 and n1 such that
f1(n) <= c1*g1(n) for all n>=n1 (Equation 1)
It is given that f2(n) ϵ O(g2(n)), so there exist c2 and n2 such that
f2(n) <= c2*g2(n) for all n>=n2 (Equation 2)
Let c3 = max{c1, c2} and n0 = max{n1, n2} (Equation 3)
Adding equations 1 and 2, for all n>=n0 we have
f1(n)+f2(n) <= c1*g1(n) + c2*g2(n)
<= c3*g1(n) + c3*g2(n)
= c3*[g1(n) + g2(n)]
<= 2*c3*max{g1(n), g2(n)} (from a1+a2 <= 2*max{b1, b2})
Since f1(n)+f2(n) <= 2*c3*max{g1(n), g2(n)} for all n>=n0, f1(n)+f2(n) ϵ O(max{g1(n), g2(n)}),
where c = 2*c3 = 2*max{c1, c2} and n0 = max{n1, n2}. Hence the proof.
It is clear from the above property that the overall efficiency of an algorithm is determined by the
part which has the larger order of growth.
Example: Find the time complexity of an algorithm which has two parts:
1. A sorting part which requires n(n-1)/2 comparisons
2. A part that checks consecutive elements, requiring no more than n-1 comparisons
Solution: Part 1: The sorting algorithm in the worst case uses n(n-1)/2 comparisons, so the time
complexity of the sorting part is O(n^2).
Part 2: The algorithm to check whether consecutive elements are present or not requires no more than
n-1 comparisons, so the time complexity of the second part is O(n).
We know from the property of asymptotic notations that, if f1(n) ϵ O(g1(n)) and f2(n) ϵ O(g2(n)) then
f1(n)+f2(n) ϵ O(max{g1(n), g2(n)}).
According to this property, the time complexity of the entire algorithm is given by
O(max{n^2, n}) = O(n^2).
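A short numeric sketch shows why the part with the larger order of growth dominates. It assumes the comparison counts from the example, n(n-1)/2 for the sorting part and n-1 for the checking part; the function names are hypothetical.

```python
def sort_comparisons(n):
    """Comparisons assumed for the sorting part: n(n-1)/2."""
    return n * (n - 1) // 2


def scan_comparisons(n):
    """Comparisons assumed for the checking part: n-1."""
    return n - 1


def total_comparisons(n):
    return sort_comparisons(n) + scan_comparisons(n)


for n in (10, 100, 1000):
    share = scan_comparisons(n) / total_comparisons(n)
    print(n, total_comparisons(n), round(share, 4))
```

As n grows, the share of the total contributed by the linear part shrinks towards zero, so the total behaves like the quadratic part.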
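The orders of growth of two functions can also be compared by computing the limit of f(n)/g(n) as n
tends to infinity. Three cases arise:
1. The limit is 0: f(n) has a smaller order of growth than g(n).
2. The limit is a positive constant c: f(n) has the same order of growth as g(n).
3. The limit is infinity: f(n) has a larger order of growth than g(n).
It is clear from the above cases and the formal definitions of O, Ω and Θ that-
Cases 1 and 2 mean f(n) ϵ O(g(n)) and the constraint to be satisfied is f(n) <=c*g(n)
Case 2 means f(n) ϵ Θ(g(n)) and the constraint to be satisfied is
c1*g(n)<=f(n)<=c2*g(n)
Cases 2 and 3 mean f(n) ϵ Ω(g(n)) and the constraint to be satisfied is f(n) >=c*g(n)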
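To compute the order of growth, the limit-based approach is often convenient because it can take
advantage of powerful calculus techniques such as L’Hôpital’s rule,
lim f(n)/g(n) = lim f'(n)/g'(n) (both limits taken as n tends to infinity)
and Stirling’s formula, which for very large values of n is given by
n! ≈ sqrt(2*pi*n) * (n/e)^n
Example: Compare the orders of growth of n(n-1)/2 and n^2.
lim [n(n-1)/2] / n^2 = (1/2) * lim (n^2 - n)/n^2 = (1/2) * lim (1 - 1/n) = 1/2
Since the limit is a positive constant, both functions have the same order of growth, which is
represented as n(n-1)/2 ϵ Θ(n^2).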
Analysis:
1. Input size: The number of elements = n (size of the array)
2. Two operations can be considered as the basic operation, i.e. the comparison A[i]>maxval and
the assignment maxval<-A[i]. Here, the comparison is considered to be the basic
operation of the algorithm.
3. There are no separate best, worst and average cases because the number of comparisons is the
same for all arrays of size n and does not depend on the type of input.
4. Let C(n) denote the number of comparisons. The algorithm makes one comparison on each
execution of the loop, which is repeated for each value of the loop’s variable i within the
bounds 1 and n-1 inclusive. Therefore, we get the following sum for C(n):
C(n) = sum of 1 for i = 1 to n-1
This is an easy sum to compute because it is nothing other than 1 repeated n-1 times. Thus,
C(n) = n-1 ϵ Θ(n)
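The analysis above can be checked with a short Python sketch that finds the largest element and counts the comparisons A[i]>maxval. The names maxval and A follow the analysis; the comparison counter is added here for illustration, and a non-empty array is assumed.

```python
def max_element(A):
    """Return (largest element, number of comparisons) for a non-empty list A."""
    maxval = A[0]
    comparisons = 0
    for i in range(1, len(A)):   # i runs from 1 to n-1 inclusive
        comparisons += 1         # basic operation: A[i] > maxval
        if A[i] > maxval:
            maxval = A[i]
    return maxval, comparisons


print(max_element([3, 9, 2, 7]))  # (9, 3): n = 4 elements, n-1 = 3 comparisons
```

The count is n-1 regardless of the order of the elements, since the comparison is made on every pass of the loop.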
Analysis:
1. Input size: number of elements = n (size of the array)
Note: Recurrence relation is an equation according to which the nth term of a sequence of
numbers is equal to some combination of the previous terms.
Analysis:
1. Input size: Given number=n
2. Basic operation: Multiplication
3. No best, worst and average cases
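4. Let M(n) denote the number of multiplications. Then
M(n) = M(n-1) + 1 for n > 0
where M(n-1) multiplications are needed to compute factorial(n-1) and one more multiplication
is needed to multiply the result by n.
Initial condition: M(0) = 0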
Backward substitution gives
M(n) = M(n-1) + 1 = M(n-2) + 2 = M(n-3) + 3 = ...
The general formula for the above pattern after i substitutions is M(n) = M(n-i)+i.
Taking advantage of the initial condition M(0)=0, we substitute i=n in the pattern’s formula to get
the final result of the backward substitutions:
M(n) = M(n-n)+n = M(0)+n = n
Therefore, M(n) ϵ Θ(n)
The number of multiplications needed to compute the factorial of n is n, so the time
complexity is linear.
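The result M(n) = n can be checked with a small recursive sketch that returns both n! and the number of multiplications performed. Returning the count alongside the value is a choice made here for illustration; n is assumed to be a non-negative integer.

```python
def factorial(n):
    """Return (n!, number of multiplications) for a non-negative integer n."""
    if n == 0:
        return 1, 0                      # initial condition: M(0) = 0
    value, mults = factorial(n - 1)      # M(n-1) multiplications
    return value * n, mults + 1          # one more multiplication by n


print(factorial(5))  # (120, 5)
```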
Diagram:
Analysis:
1. The number of disks n is the obvious choice for the input’s size indicator.
2. Moving one disk is the algorithm’s basic operation.
3. Clearly, the number of moves M(n) depends on n only, and we get the following
recurrence equation for it:
M(n) = 2M(n-1) + 1 for n > 1, M(1) = 1
Solving it by backward substitution gives M(n) = 2^n - 1.
Thus, we have an exponential algorithm, which will run for an unimaginably long time
even for moderate values of n.
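The move count can be observed directly with a recursive sketch of the Tower of Hanoi puzzle. It assumes the usual recursive strategy: move n-1 disks to the spare peg, move the largest disk, then move the n-1 disks on top of it. The peg labels and the moves list are hypothetical names.

```python
def hanoi(n, source, target, spare, moves):
    """Append to moves the (from, to) pairs that transfer n disks."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)   # clear the way
    moves.append((source, target))               # basic operation: move one disk
    hanoi(n - 1, spare, target, source, moves)   # rebuild on top


for n in (1, 2, 3, 10):
    moves = []
    hanoi(n, "A", "C", "B", moves)
    print(n, len(moves), 2**n - 1)   # the two counts agree
```

Each extra disk roughly doubles the number of moves, which is what makes the algorithm exponential.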
following sum:
Each line corresponds to one iteration of the algorithm, i.e. a pass through the list’s
tail to the right of the vertical bar; an element in bold indicates the smallest element
found.
Elements to the left of the vertical bar are in their final positions and are not
considered in this and subsequent iterations.
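The passes described above match a selection-sort style procedure: each pass finds the smallest element in the unsorted tail and swaps it into place. Since the algorithm's listing is not present in this part of the notes, the sketch below is an assumption about what is being illustrated, and the sample list is chosen here as an example.

```python
def selection_sort(items):
    """Return (sorted copy, number of key comparisons)."""
    a = list(items)
    n = len(a)
    comparisons = 0
    for i in range(n - 1):               # a[:i] is already in final position
        smallest = i
        for j in range(i + 1, n):        # scan the tail to the right of the bar
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a, comparisons


print(selection_sort([89, 45, 68, 90, 29, 34, 17]))
# ([17, 29, 34, 45, 68, 89, 90], 21)
```

For n = 7 the count is 7*6/2 = 21 comparisons, whatever the initial order.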
3rd pass: Place the remaining two elements at their correct positions.
The number of key comparisons for the bubble sort is the same for all arrays of size n.
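The statement about key comparisons can be checked with a sketch of the basic bubble sort, without any early-exit improvement. That it is this basic version is an assumption, since the listing is not present in this part of the notes.

```python
def bubble_sort(items):
    """Return (sorted copy, number of key comparisons) using basic bubble sort."""
    a = list(items)
    n = len(a)
    comparisons = 0
    for i in range(n - 1):               # after pass i, the last i+1 items are in place
        for j in range(n - 1 - i):
            comparisons += 1
            if a[j + 1] < a[j]:          # adjacent pair out of order: swap
                a[j], a[j + 1] = a[j + 1], a[j]
    return a, comparisons


print(bubble_sort([4, 3, 2, 1]))  # ([1, 2, 3, 4], 6)
print(bubble_sort([1, 2, 3, 4]))  # ([1, 2, 3, 4], 6)
```

Both a reversed and an already sorted array of size 4 need 4*3/2 = 6 comparisons; only the number of swaps differs.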
Sequential Search: In this method, the elements of the input array are traversed one by one and
compared with the key element to be found.
If a match is found in the array, the search is said to be successful. If no match is found, the
search is said to be unsuccessful; this gives the worst-case time complexity.
Best case time complexity is O(1) and worst case time complexity is O(n).
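A minimal sketch of sequential search that also counts comparisons makes the best and worst cases visible. Returning -1 for an unsuccessful search is a convention chosen here, and the sample array is hypothetical.

```python
def sequential_search(A, key):
    """Return (index of key or -1, number of comparisons made)."""
    comparisons = 0
    for i, item in enumerate(A):
        comparisons += 1
        if item == key:
            return i, comparisons    # successful search
    return -1, comparisons           # unsuccessful search


data = [7, 3, 9, 4]
print(sequential_search(data, 7))  # (0, 1): best case, one comparison
print(sequential_search(data, 5))  # (-1, 4): worst case, n comparisons
```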
Example:
Brute-Force String Matching: Given a string of n characters called the text and a string of m
characters (m<=n) called the pattern, find a substring of the text that matches the pattern.
Example:
Text: NOBODY_NOTICED_HIM
Pattern: NOT
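The example can be traced with a short sketch of brute-force matching: align the pattern with each position of the text in turn and compare character by character. Returning the index of the first match, or -1 if there is none, is a convention chosen here.

```python
def brute_force_match(text, pattern):
    """Return the index of the first match of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):           # every possible alignment
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:                       # all m characters matched
            return i
    return -1


print(brute_force_match("NOBODY_NOTICED_HIM", "NOT"))  # 7
```

In the example, the alignments at positions 0 to 6 fail, and the pattern is found starting at position 7 (counting from 0).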