0% found this document useful (0 votes)

36 views

DAA_unit_5

Uploaded by

Ajay Pathak

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

36 views

DAA_unit_5

Uploaded by

Ajay Pathak

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 22

Selected Topics:

• Algebraic Computation,
• Fast Fourier Transform,
• String Matching,
• Theory of NP-Completeness,
• Approximation Algorithms
• Randomized Algorithms
String Matching Introduction
String Matching Algorithm is also called "String Searching Algorithm." This is a vital class of string
algorithm is declared as "this is the method to find a place where one is several strings are found within the
larger string."
Given a text array, T [1.....n], of n character and a pattern array, P [1......m], of m characters. The problems
are to find an integer s, called valid shift where 0 ≤ s < n-m and T [s+1......s+m] = P [1......m]. In other words,
to find even if P in T, i.e., where P is a substring of T. The item of P and T are character drawn from some
finite alphabet such as {0, 1} or {A, B .....Z, a, b..... z}.
Given a string T [1......n], the substrings are represented as T [i......j] for some 0≤i ≤ j≤n-1, the string formed
by the characters in T from index i to index j, inclusive. This process that a string is a substring of itself (take
i = 0 and j =m).
The proper substring of string T [1......n] is T [1......j] for some 0<i ≤ j≤n-1. That is, we must have either
i>0 or j < m-1.
ADVERTISEMENT
Using these descriptions, we can say given any string T [1......n], the substrings are
1. T [i.....j] = T [i] T [i +1] T [i+2]......T [j] for some 0≤i ≤ j≤n-1.
And proper substrings are
1. T [i.....j] = T [i] T [i +1] T [i+2]......T [j] for some 0≤i ≤ j≤n-1.
Note: If i>j, then T [i.....j] is equal to the empty string or null, which has length zero.
Algorithms used for String Matching:
There are different types of method is used to finding the string
1. The Naive String Matching Algorithm
2. The Rabin-Karp-Algorithm
3. Finite Automata
4. The Knuth-Morris-Pratt Algorithm
5. The Boyer-Moore Algorithm
The Naive String Matching Algorithm
The naïve approach tests all the possible placement of Pattern P [1.......m] relative to text T [1......n]. We try
shift s = 0, 1.......n-m, successively and for each shift s. Compare T [s+1.......s+m] to P [1......m].
The naïve algorithm finds all valid shifts using a loop that checks the condition P [1.......m] = T [s+1.......s+m]
for each of the n - m +1 possible value of s.
NAIVE-STRING-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n -m
4. do if P [1.....m] = T [s + 1....s + m]
5. then print "Pattern occurs with shift" s
Analysis: This for loop from 3 to 5 executes for n-m + 1(we need at least m characters at the end) times and
in iteration we are doing m comparisons. So the total complexity is O (n-m+1).
Example:
1. Suppose T = 1011101110
2. P = 111
3. Find all the Valid Shift
Solution:
The Rabin-Karp-Algorithm
The Rabin-Karp string matching algorithm calculates a hash value for the pattern, as well as for each M-
character subsequences of text to be compared. If the hash values are unequal, the algorithm will determine
the hash value for next M-character sequence. If the hash values are equal, the algorithm will analyze the
pattern and the M-character sequence. In this way, there is only one comparison per text subsequence, and
character matching is only required when the hash values match.
RABIN-KARP-MATCHER (T, P, d, q)
1. n ← length [T]
2. m ← length [P]
3. h ← dm-1 mod q
4. p ← 0
5. t0 ← 0
6. for i ← 1 to m
7. do p ← (dp + P[i]) mod q
8. t0 ← (dt0+T [i]) mod q
9. for s ← 0 to n-m
10. do if p = ts
11. then if P [1.....m] = T [s+1.....s + m]
12. then "Pattern occurs with shift" s
13. If s < n-m
14. then ts+1 ← (d (ts-T [s+1]h)+T [s+m+1])mod q
Example: For string matching, working module q = 11, how many spurious hits does the Rabin-Karp
matcher encounters in Text T = 31415926535.......
1. T = 31415926535.......
2. P = 26
3. Here T.Length =11 so Q = 11
4. And P mod Q = 26 mod 11 = 4
5. Now find the exact match of P mod Q...
Solution:
Complexity:
The running time of RABIN-KARP-MATCHER in the worst case scenario O ((n-m+1) m but it has a good
average case running time. If the expected number of strong shifts is small O (1) and prime q is chosen to be
quite large, then the Rabin-Karp algorithm can be expected to run in time O (n+m) plus the time to require
to process spurious hits.
String Matching with Finite Automata
The string-matching automaton is a very useful tool which is used in string matching algorithm. It examines
every character in the text exactly once and reports all the valid shifts in O (n) time. The goal of string
matching is to find the location of specific text pattern within the larger body of text (a sentence, a paragraph,
a book, etc.)
Finite Automata:
A finite automaton M is a 5-tuple (Q, q0,A,∑δ), where
ADVERTISEMENT
ADVERTISEMENT
o Q is a finite set of states,
o q0 ∈ Q is the start state,
o A ⊆ Q is a notable set of accepting states,
o ∑ is a finite input alphabet,
o δ is a function from Q x ∑ into Q called the transition function of M.
The finite automaton starts in state q0 and reads the characters of its input string one at a time. If the
automaton is in state q and reads input character a, it moves from state q to state δ (q, a). Whenever its current
state q is a member of A, the machine M has accepted the string read so far. An input that is not allowed
is rejected.
A finite automaton M induces a function ∅ called the called the final-state function, from ∑* to Q such that
∅(w) is the state M ends up in after scanning the string w. Thus, M accepts a string w if and only if ∅(w) ∈
A.
The function f is defined as
∅ (∈)=q0
∅ (wa) = δ ((∅ (w), a) for w ∈ ∑*,a∈ ∑)
FINITE- AUTOMATON-MATCHER (T,δ, m),
1. n ← length [T]
2. q ← 0
3. for i ← 1 to n
4. do q ← δ (q, T[i])
5. If q =m
6. then s←i-m
7. print "Pattern occurs with shift s" s
The primary loop structure of FINITE- AUTOMATON-MATCHER implies that its running time on a text
string of length n is O (n).
Computing the Transition Function: The following procedure computes the transition function δ from
given pattern P [1......m]
COMPUTE-TRANSITION-FUNCTION (P, ∑)
1. m ← length [P]
2. for q ← 0 to m
3. do for each character a ∈ ∑*
4. do k ← min (m+1, q+2)
5. repeat k←k-1
6. Until
7. δ(q,a)←k
8. Return δ
Example: Suppose a finite automaton which accepts even number of a's where ∑ = {a, b, c}

Solution:
q0 is the initial state.
The Knuth-Morris-Pratt (KMP)Algorithm
Knuth-Morris and Pratt introduce a linear time algorithm for the string matching problem. A matching time
of O (n) is achieved by avoiding comparison with an element of 'S' that have previously been involved in
comparison with some element of the pattern 'p' to be matched. i.e., backtracking on the string 'S' never
occurs
Components of KMP Algorithm:
1. The Prefix Function (Π): The Prefix Function, Π for a pattern encapsulates knowledge about how the
pattern matches against the shift of itself. This information can be used to avoid a useless shift of the pattern
'p.' In other words, this enables avoiding backtracking of the string 'S.'
2. The KMP Matcher: With string 'S,' pattern 'p' and prefix function 'Π' as inputs, find the occurrence of 'p'
in 'S' and returns the number of shifts of 'p' after which occurrences are found.
The Prefix Function (Π)
Following pseudo code compute the prefix function, Π:
COMPUTE- PREFIX- FUNCTION (P)
1. m ←length [P] //'p' pattern to be matched
2. Π [1] ← 0
3. k ← 0
4. for q ← 2 to m
5. do while k > 0 and P [k + 1] ≠ P [q]
6. do k ← Π [k]
7. If P [k + 1] = P [q]
8. then k← k + 1
9. Π [q] ← k
10. Return Π
Running Time Analysis:
In the above pseudo code for calculating the prefix function, the for loop from step 4 to step 10 runs 'm'
times. Step1 to Step3 take constant time. Hence the running time of computing prefix function is O (m).
Example: Compute Π for the pattern 'p' below:

Solution:
Initially: m = length [p] = 7
Π [1] = 0
k=0
After iteration 6 times, the prefix function computation is complete:

The KMP Matcher:

The KMP Matcher with the pattern 'p,' the string 'S' and prefix function 'Π' as input, finds a match of p in S.
Following pseudo code compute the matching component of KMP algorithm:
KMP-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. Π← COMPUTE-PREFIX-FUNCTION (P)
4. q ← 0 // numbers of characters matched
5. for i ← 1 to n // scan S from left to right
6. do while q > 0 and P [q + 1] ≠ T [i]
7. do q ← Π [q] // next character does not match
8. If P [q + 1] = T [i]
9. then q ← q + 1 // next character matches
10. If q = m // is all of p matched?
11. then print "Pattern occurs with shift" i - m
12. q ← Π [q] // look for the next match
Running Time Analysis:
The for loop beginning in step 5 runs 'n' times, i.e., as long as the length of the string 'S.' Since step 1 to step
4 take constant times, the running time is dominated by this for the loop. Thus running time of the matching
function is O (n).
Example: Given a string 'T' and pattern 'P' as follows:

Let us execute the KMP Algorithm to find whether 'P' occurs in 'T.'
For 'p' the prefix function, ? was computed previously and is as follows:
Solution:
Initially: n = size of T = 15
m = size of P = 7
Pattern 'P' has been found to complexity occur in a string 'T.' The total number of shifts that took place for
the match to be found is i-m = 13 - 7 = 6 shifts.
The Boyer-Moore Algorithm
Robert Boyer and J Strother Moore established it in 1977. The B-M String search algorithm is a particularly
efficient algorithm and has served as a standard benchmark for string search algorithm ever since.
The B-M algorithm takes a 'backward' approach: the pattern string (P) is aligned with the start of the text
string (T), and then compares the characters of a pattern from right to left, beginning with rightmost character.
If a character is compared that is not within the pattern, no match can be found by analyzing any further
aspects at this position so the pattern can be changed entirely past the mismatching character.
For deciding the possible shifts, B-M algorithm uses two preprocessing strategies simultaneously. Whenever
a mismatch occurs, the algorithm calculates a variation using both approaches and selects the more significant
shift thus, if make use of the most effective strategy for each case.
The two strategies are called heuristics of B - M as they are used to reduce the search. They are:
1. Bad Character Heuristics
2. Good Suffix Heuristics
1. Bad Character Heuristics
This Heuristics has two implications:
o Suppose there is a character in a text in which does not occur in a pattern at all. When a mismatch
happens at this character (called as bad character), the whole pattern can be changed, begin matching
form substring next to this 'bad character.'
o On the other hand, it might be that a bad character is present in the pattern, in this case, align the
nature of the pattern with a bad character in the text.
Thus in any case shift may be higher than one.
Example1: Let Text T = <nyoo nyoo> and pattern P = <noyo>

Example2: If a bad character doesn't exist the pattern then.

Problem in Bad-Character Heuristics:
In some cases, Bad-Character Heuristics produces some negative shifts.
For Example:

This means that we need some extra information to produce a shift on encountering a bad character. This
information is about the last position of every aspect in the pattern and also the set of characters used in a
pattern (often called the alphabet ∑of a pattern).
COMPUTE-LAST-OCCURRENCE-FUNCTION (P, m, ∑ )
1. for each character a ∈ ∑
2. do λ [a] = 0
3. for j ← 1 to m
4. do λ [P [j]] ← j
5. Return λ
2. Good Suffix Heuristics:
A good suffix is a suffix that has matched successfully. After a mismatch which has a negative shift in bad
character heuristics, look if a substring of pattern matched till bad character has a good suffix in it, if it is so
then we have an onward jump equal to the length of suffix found.
Example:

COMPUTE-GOOD-SUFFIX-FUNCTION (P, m)
1. Π ← COMPUTE-PREFIX-FUNCTION (P)
2. P'← reverse (P)
3. Π'← COMPUTE-PREFIX-FUNCTION (P')
4. for j ← 0 to m
5. do ɣ [j] ← m - Π [m]
6. for l ← 1 to m
7. do j ← m - Π' [L]
8. If ɣ [j] > l - Π' [L]
9. then ɣ [j] ← 1 - Π'[L]
10. Return ɣ

BOYER-MOORE-MATCHER (T, P, ∑)
1. n ←length [T]
2. m ←length [P]
3. λ← COMPUTE-LAST-OCCURRENCE-FUNCTION (P, m, ∑ )
4. ɣ← COMPUTE-GOOD-SUFFIX-FUNCTION (P, m)
5. s ←0
6. While s ≤ n - m
7. do j ← m
8. While j > 0 and P [j] = T [s + j]
9. do j ←j-1
10. If j = 0
11. then print "Pattern occurs at shift" s
12. s ← s + ɣ[0]
13. else s ← s + max (ɣ [j], j - λ[T[s+j]])
Complexity Comparison of String Matching Algorithm:
Algorithm Preprocessing Time Matching Time

Naive O (O (n - m + 1)m)

Rabin-Karp O(m) (O (n - m + 1)m)

Finite Automata O(m|∑|) O (n)

Knuth-Morris-Pratt O(m) O (n)

Boyer-Moore O(|∑|) (O ((n - m + 1) + |∑|))

NP-Completeness
A decision problem L is NP-Hard if
L' ≤p L for all L' ϵ NP.
Definition: L is NP-complete if
1. L ϵ NP and
2. L' ≤ p L for some known NP-complete problem L.' Given this formal definition, the complexity
classes are:
P: is the set of decision problems that are solvable in polynomial time.
NP: is the set of decision problems that can be verified in polynomial time.
NP-Hard: L is NP-hard if for all L' ϵ NP, L' ≤p L. Thus if we can solve L in polynomial time, we can solve
all NP problems in polynomial time.
NP-Complete L is NP-complete if
1. L ϵ NP and
2. L is NP-hard
If any NP-complete problem is solvable in polynomial time, then every NP-Complete problem is also
solvable in polynomial time. Conversely, if we can prove that any NP-Complete problem cannot be solved
in polynomial time, every NP-Complete problem cannot be solvable in polynomial time.
Reductions
Concept: - If the solution of NPC problem does not exist then the conversion from one NPC problem to
another NPC problem within the polynomial time. For this, you need the concept of reduction. If a solution
of the one NPC problem exists within the polynomial time, then the rest of the problem can also give the
solution in polynomial time (but it's hard to believe). For this, you need the concept of reduction.
Example: - Suppose there are two problems, A and B. You know that it is impossible to solve problem A in
polynomial time. You want to prove that B cannot be solved in polynomial time. So you can convert the
problem A into problem B in polynomial time.
Example of NP-Complete problem
NP problem: - Suppose a DECISION-BASED problem is provided in which a set of inputs/high inputs you
can get high output.
Criteria to come either in NP-hard or NP-complete.
1. The point to be noted here, the output is already given, and you can verify the output/solution within
the polynomial time but can't produce an output/solution in polynomial time.
2. Here we need the concept of reduction because when you can't produce an output of the problem
according to the given input then in case you have to use an emphasis on the concept of reduction in
which you can convert one problem into another problem.
Note1:- If you satisfy both points then your problem comes into the category of NP-complete class
Note2:- If you satisfy the only 2nd points then your problem comes into the category of NP-hard class
So according to the given decision-based NP problem, you can decide in the form of yes or no. If, yes then
you have to do verify and convert into another problem via reduction concept. If you are being performed,
both then decision-based NP problems are in NP compete.
Here we will emphasize NPC.
Approximate Algorithms
Introduction:
An Approximate Algorithm is a way of approach NP-COMPLETENESS for the optimization problem.
This technique does not guarantee the best solution. The goal of an approximation algorithm is to come as
close as possible to the optimum value in a reasonable amount of time which is at the most polynomial time.
Such algorithms are called approximation algorithm or heuristic algorithm.
o For the traveling salesperson problem, the optimization problem is to find the shortest cycle, and the
approximation problem is to find a short cycle.
o For the vertex cover problem, the optimization problem is to find the vertex cover with fewest
vertices, and the approximation problem is to find the vertex cover with few vertices.
Performance Ratios
Suppose we work on an optimization problem where every solution carries a cost. An Approximate
Algorithm returns a legal solution, but the cost of that legal solution may not be optimal.
For Example, suppose we are considering for a minimum size vertex-cover (VC). An approximate
algorithm returns a VC for us, but the size (cost) may not be minimized.
Another Example is we are considering for a maximum size Independent set (IS). An approximate
Algorithm returns an IS for us, but the size (cost) may not be maximum. Let C be the cost of the solution
returned by an approximate algorithm, and C* is the cost of the optimal solution.
We say the approximate algorithm has an approximate ratio P (n) for an input size n, where

Intuitively, the approximation ratio measures how bad the approximate solution is distinguished with the
optimal solution. A large (small) approximation ratio measures the solution is much worse than (more or less
the same as) an optimal solution.
Observe that P (n) is always ≥ 1, if the ratio does not depend on n, we may write P. Therefore, a 1-
approximation algorithm gives an optimal solution. Some problems have polynomial-time approximation
algorithm with small constant approximate ratios, while others have best-known polynomial time
approximation algorithms whose approximate ratios grow with n
Vertex Cover
A Vertex Cover of a graph G is a set of vertices such that each edge in G is incident to at least one of these
vertices.
The decision vertex-cover problem was proven NPC. Now, we want to solve the optimal version of the vertex
cover problem, i.e., we want to find a minimum size vertex cover of a given graph. We call such vertex cover
an optimal vertex cover C*.

An approximate algorithm for vertex cover:

1. Approx-Vertex-Cover (G = (V, E))
2. {
3. C = empty-set;
4. E'= E;
5. While E' is not empty do
6. {
7. Let (u, v) be any edge in E': (*)
8. Add u and v to C;
9. Remove from E' all edges incident to
10. u or v;
11. }
12. Return C;
13. }

The idea is to take an edge (u, v) one by one, put both vertices to C, and remove all the edges incident to u
or v. We carry on until all edges have been removed. C is a VC. But how good is C?
VC = {b, c, d, e, f, g}
Randomized Algorithms | Set 0 (Mathematical Background)
Conditional Probability Conditional probability P(A | B) indicates the probability of even ‘A’ happening
given that the even B happened.
𝑃(𝐴∩𝐵)
P(A|B)= 𝑃(𝐵)
We can easily understand above formula using below diagram. Since B has already happened, the sample
space reduces to B. So the probability of A happening becomes P(A ? B) divided by P(B)

Below is Bayes’s formula for conditional probability.

𝑃(𝐵|𝐴)𝑃(𝐴)
P(A|B)= 𝑃(𝐵)

The formula provides relationship between P(A|B) and P(B|A). It is mainly derived from conditional
probability formula discussed in the previous post.
Consider the below formulas for conditional probabilities P(A|B) and P(B|A)
𝑃(𝐴∩𝐵)
P(A|B)= 𝑃(𝐵) -------------
𝑃(𝐵∩𝐴)
P(A|B)= -------------
𝑃(𝐴)

Since P(B ? A) = P(A ? B), we can replace P(A ? B) in first formula with P(B|A)P(A)
After replacing, we get the given formula. Refer this for examples of Bayes’s formula.
Random Variables:
A random variable is actually a function that maps outcome of a random event (like coin toss) to a real value.
Example :
Coin tossing game :
A player pays 50 bucks if result of coin
toss is "Head"
The person gets 50 bucks if the result is
Tail.
A random variable profit for person can
be defined as below :
Profit = +50 if Head
-50 if Tail
Generally gambling games are not fair for players,
the organizer takes a share of profit for all
arrangements. So expected profit is negative for
a player in gambling and positive for the organizer.
That is how organizers make money.
Expected Value of Random Variable :
Expected value of a random variable R can be defined as following
E[R] = r1*p1 + r2*p2 + ... rk*pk
ri ==> Value of R with probability pi
Expected value is basically sum of product of following two terms (for all possible events)
a) Probability of an event.
b) Value of R at that even
Example 1:
In above example of coin toss,
Expected value of profit = 50 * (1/2) +
(-50) * (1/2)
=0
Example 2:
Expected value of six faced dice throw is
= 1*(1/6) + 2*(1/6) + .... + 6*(1/6)
= 3.5
Linearity of Expectation:
Let R1 and R2 be two discrete random variables on some probability space, then
E[R1 + R2] = E[R1] + E[R2]
For example, expected value of sum for 3 dice throws is = 3 * 7/2 = 7

Randomized Algorithms | Set 1 (Introduction and Analysis)

What is a Randomized Algorithm?
An algorithm that uses random numbers to decide what to do next anywhere in its logic is called a
Randomized Algorithm. For example, in Randomized Quick Sort, we use a random number to pick the next
pivot (or we randomly shuffle the array). And in Karger’s algorithm, we randomly pick an edge.

How to analyse Randomized Algorithms?

Some randomized algorithms have deterministic time complexity. For example, this implementation of
Karger’s algorithm has time complexity is O(E). Such algorithms are called Monte Carlo Algorithms and are
easier to analyse for worst case.
On the other hand, time complexity of other randomized algorithms (other than Las Vegas) is dependent on
value of random variable. Such Randomized algorithms are called Las Vegas Algorithms. These algorithms
are typically analysed for expected worst case. To compute expected time taken in worst case, all possible
values of the used random variable needs to be considered in worst case and time taken by every possible
value needs to be evaluated. Average of all evaluated times is the expected worst case time complexity.
Below facts are generally helpful in analysis os such algorithms.
Linearity of Expectation
Expected Number of Trials until Success.
For example consider below a randomized version of QuickSort.
A Central Pivot is a pivot that divides the array in such a way that one side has at-least 1/4 elements.
// Sorts an array arr[low..high]
randQuickSort(arr[], low, high)

1. If low >= high, then EXIT.

2. While pivot 'x' is not a Central Pivot.
(i) Choose uniformly at random a number from [low..high].
Let the randomly picked number number be x.
(ii) Count elements in arr[low..high] that are smaller
than arr[x]. Let this count be sc.
(iii) Count elements in arr[low..high] that are greater
than arr[x]. Let this count be gc.
(iv) Let n = (high-low+1). If sc >= n/4 and
gc >= n/4, then x is a central pivot.
3. Partition arr[low..high] around the pivot x.
4. // Recur for smaller elements
randQuickSort(arr, low, sc-1)
5. // Recur for greater elements
randQuickSort(arr, high-gc+1, high)
The important thing in our analysis is, time taken by step 2 is O(n).
How many times while loop runs before finding a central pivot?
The probability that the randomly chosen element is central pivot is 1/n.
Therefore, expected number of times the while loop runs is n (See this for details)
Thus, the expected time complexity of step 2 is O(n).
What is overall Time Complexity in Worst Case?
In worst case, each partition divides array such that one side has n/4 elements and other side has 3n/4
elements. The worst case height of recursion tree is Log 3/4 n which is O(Log n).
T(n) < T(n/4) + T(3n/4) + O(n)
T(n) < 2T(3n/4) + O(n)

Solution of above recurrence is O(n Log n)

Note that the above randomized algorithm is not the best way to implement randomized Quick Sort. The idea
here is to simplify the analysis as it is simple to analyse.

Typically, randomized Quick Sort is implemented by randomly picking a pivot (no loop). Or by shuffling
array elements. Expected worst case time complexity of this algorithm is also O(n Log n), but analysis is
complex, the MIT prof himself mentions same in his lecture here.
Example :

import random
import time

def find_solution(n):
# seed the random number generator with the current time
random.seed(time.time())
# randomly select a number between 1 and n and return it as the solution
return random.randint(1, n)

def main():
n = 10 # the range of possible solutions is 1 to n
print("Solution:", find_solution(n))

if __name__ == '__main__':
main()

Output
Solution: 10

Randomized Algorithms | Set 2 (Classification and Applications)

Classification
Randomized algorithms are classified in two categories.
Las Vegas:
A Las Vegas algorithm were introduced by Laszlo Babai in 1979.
A Las Vegas algorithm is an algorithm which uses randomness, but gives guarantees that the solution
obtained for given problem is correct. It takes the risk with resources used. A quick-sort algorithm is a simple
example of Las-Vegas algorithm. To sort the given array of n numbers quickly we use the quick sort
algorithm. For that we find out central element which is also called as pivot element and each element is
compared with this pivot element. Sorting is done in less time or it requires more time is dependent on how
we select the pivot element. To pick the pivot element randomly we can use Las-Vegas algorithm.
Definition:
A randomized algorithm that always produce correct result with only variation from one aun to another being
its running time is known as Las-Vegas algorithm.
OR
A randomized algorithm which always produces a correct result or it informs about the failure is known as
Las-Vegas algorithm.
OR
A Las-Vegas algorithm take the risk with the resources used for computation but it does not take risk with
the result i.e. it gives correct and expected output for the given problem.
Let us consider the above example of quick sort algorithm. In this algorithm we choose the pivot element
randomly. But the result of this problem is always a sorted array. A Las-Vegas algorithm is having one
restriction i.e. the solution for the given problem can be found out in finite time. In this algorithm the numbers
of possible solutions arc limited. The actual solution is complex in nature or complicated to calculate but it
is easy to verify the correctness of candidate solution.
These algorithms always produce correct or optimum result. Time complexity of these algorithms is based
on a random value and time complexity is evaluated as expected value. For example, Randomized Quick
Sort always sorts an input array and expected worst case time complexity of Quick Sort is O(nLogn).
Relation with the Monte-Carlo Algorithms:
• The Las-Vegas algorithm can be differentiated with the Monte-carlo algorithms in which the
resources used to find out the solution are bounded but it does not give guarantee that the solution
obtained is accurate.
• In some applications by making early termination a Las-Vegas algorithm can be converted into
Monte-Carlo algorithm.
Complexity Analysis:
The complexity class of given problem which is solved by using a Las-Vegas algorithms is expect that the
given problem is solved with zero error probability and in polynomial time.
This zero error probability polynomial time is also called as ZPP which is obtained as follows,
ZPP = RP ? CO-RP
Where, RP = Randomized polynomial time.
Randomized polynomial time algorithm always provide correct output when the correct output is no, but
with a certain probability bounded away from one when the answer is yes. These kinds of decision problem
can be included in class RP i.e. randomized where polynomial time.
That is how we can solve given problem in expected polynomial time by using Las-Vegas algorithm.
Generally there is no upper bound for Las-vegas algorithm related to worst case run time.
Monte Carlo:
The computational algorithms which rely on repeated random sampling to compute their results such
algorithm are called as Monte-Carlo algorithms.
OR
The random algorithm is Monte-carlo algorithms if it can give the wrong answer sometimes.
Whenever the existing deterministic algorithm is fail or it is impossible to compute the solution for given
problem then Monte-Carlo algorithms or methods are used. Monte-carlo methods are best repeated
computation of the random numbers, and that’s why these algorithms are used for solving physical simulation
system and mathematical system.
This Monte-carlo algorithms are specially useful for disordered materials, fluids, cellular structures. In case
of mathematics these method are used to calculate the definite integrals, these integrals are provided with the
complicated boundary conditions for multidimensional integrals. This method is successive one with
consideration of risk analysis when compared to other methods.
There is no single Monte carlo methods other than the term describes a large and widely used class
approaches and these approach use the following pattern.
1. Possible inputs of domain is defined.
2. By using a certain specified probability distribution generate the inputs randomly from the domain.
3.By using these inputs perform a deterministic computation.
4.The final result can be computed by aggregating the results of the individual computation.
Produce correct or optimum result with some probability. These algorithms have deterministic running time
and it is generally easier to find out worst case time complexity. For example this implementation of Karger’s
Algorithm produces minimum cut with probability greater than or equal to 1/n2 (n is number of vertices) and
has worst case time complexity as O(E). Another example is Fermat Method for Primality Testing. Example
to Understand Classification:
Consider a binary array where exactly half elements are 0
and half are 1. The task is to find index of any 1.
A Las Vegas algorithm for this task is to keep picking a random element until we find a 1. A Monte Carlo
algorithm for the same is to keep picking a random element until we either find 1 or we have tried maximum
allowed times say k. The Las Vegas algorithm always finds an index of 1, but time complexity is determined
as expect value. The expected number of trials before success is 2, therefore expected time complexity is
O(1). The Monte Carlo Algorithm finds a 1 with probability [1 – (1/2)k]. Time complexity of Monte Carlo
is O(k) which is deterministic
Optimization of Monte-Carlo Algorithms:
• In general the Monte-carlo optimization techniques are dependent on the random walks. The program
for Monte carlo algorithms move in multidimensional space around the generated marker or handle. It
wanted to move to the lower function but sometimes moves against the gradient.
• In numerical optimization the numerical simulation is used which effective and efficient and popular
application for the random numbers. The travelling salesman problem is an optimization problem which
is one of the best examples of optimizations.
• There are various optimization techniques available for Monte-carlo algorithms such as Evolution
strategy, Genetic algorithms, parallel tempering etc.
Applications and Scope:
The Monte-carlo methods has wider range of applications. It uses in various areas like physical science,
Design and visuals, Finance and business, Telecommunication etc. In general Monte carlo methods are used
in mathematics. By generating random numbers we can solve the various problem. The problems which are
complex in nature or difficult to solve are solved by using Monte-carlo algorithms. Monte carlo integration
is the most common application of Monte-carlo algorithm.
The deterministic algorithm provides a correct solution but it takes long time or its runtime is large. This run-
time can be improved by using the Monte carlo integration algorithms. There are various methods used for
integration by using Monte-carlo methods such as,
i) Direct sampling methods which includes the stratified sampling, recursive
stratified sampling, importance sampling.
ii) Random walk Monte-carlo algorithm which is used to find out the integration for
given problem.
iii) Gibbs sampling.
• Consider a tool that basically does sorting. Let the tool be used by many users and there are few
users who always use tool for already sorted array. If the tool uses simple (not randomized)
QuickSort, then those few users are always going to face worst case situation. On the other hand
if the tool uses Randomized QuickSort, then there is no user that always gets worst case.
Everybody gets expected O(n Log n) time.
• Randomized algorithms have huge applications in Cryptography.
• Load Balancing.
• Number-Theoretic Applications: Primality Testing
• Data Structures: Hashing, Sorting, Searching, Order Statistics and Computational Geometry.
• Algebraic identities: Polynomial and matrix identity verification. Interactive proof systems.
• Mathematical programming: Faster algorithms for linear programming, Rounding linear program
solutions to integer program solutions
• Graph algorithms: Minimum spanning trees, shortest paths, minimum cuts.
• Counting and enumeration: Matrix permanent Counting combinatorial structures.
• Parallel and distributed computing: Deadlock avoidance distributed consensus.
• Probabilistic existence proofs: Show that a combinatorial object arises with non-zero probability
among objects drawn from a suitable probability space.
• Derandomization: First devise a randomized algorithm then argue that it can be derandomized to
yield a deterministic algorithm.
Randomized algorithms are algorithms that use randomness as a key component in their operation. They
can be used to solve a wide variety of problems, including optimization, search, and decision-making. Some
examples of applications of randomized algorithms include:
1. Monte Carlo methods: These are a class of randomized algorithms that use random sampling to
solve problems that may be deterministic in principle, but are too complex to solve exactly.
Examples include estimating pi, simulating physical systems, and solving optimization problems.
2. Randomized search algorithms: These are algorithms that use randomness to search for solutions
to problems. Examples include genetic algorithms and simulated annealing.
3. Randomized data structures: These are data structures that use randomness to improve their
performance. Examples include skip lists and hash tables.
4. Randomized load balancing: These are algorithms used to distribute load across a network of
computers, using randomness to avoid overloading any one computer.
5. Randomized encryption: These are algorithms used to encrypt and decrypt data, using
randomness to make it difficult for an attacker to decrypt the data without the correct key.
Example 1:

import random

# Generates a random permutation of the given array

def random_permutation(array):
# Shuffle the array using the random number generator
random.shuffle(array)

array = [1, 2, 3, 4, 5]
# Generate a random permutation of the array
random_permutation(array)

# Print the shuffled array

print(array)

Output
51423
Example 2 :

from random import shuffle

def find_median(numbers):
n = len(numbers)
if n == 0:
return None
if n == 1:
return numbers[0]

# Shuffle the list to ensure a random ordering

shuffle(numbers)

# Find the median by selecting the middle element

return numbers[n // 2]

# Example usage
print(find_median([1, 2, 3, 4, 5])) # Output: 3
print(find_median([1, 2, 3, 4, 5, 6])) # Output: 3 or 4 (randomly chosen)
print(find_median([])) # Output: None
print(find_median([7])) # Output: 7

Output
4
6
None
7
Randomized Algorithms | Set 3 (1/2 Approximate Median)
In this post, a Monte Carlo algorithm is discussed.
Problem Statement : Given an unsorted array A[] of n numbers and ε > 0, compute an element whose rank
(position in sorted A[]) is in the range [(1 – ε)n/2, (1 + ε)n/2].
For ½ Approximate Median Algorithm &epsilom; is 1/2 => rank should be in the range [n/4, 3n/4]
We can find k’th smallest element in O(n) expected time and O(n) worst case time.
What if we want in less than O(n) time with low probable error allowed?
Following steps represent an algorithm that is O((Log n) x (Log Log n)) time and produces incorrect result
with probability less than or equal to 2/n2.
1. Randomly choose k elements from the array where k=c log n (c is some constant)
2. Insert then into a set.
3. Sort elements of the set.
4. Return median of the set i.e. (k/2)th element from the set

/* C++ program to find Approximate Median using

1/2 Approximate Algorithm */
#include<bits/stdc++.h>
using namespace std;

// This function returns the Approximate Median

int randApproxMedian(int arr[],int n)
{
// Declaration for the random number generator
random_device rand_dev;
mt19937 generator(rand_dev());

// Random number generated will be in the range [0,n-1]

uniform_int_distribution<int> distribution(0, n-1);

if (n==0)
return 0;

int k = 10*log2(n); // Taking c as 10

// A set stores unique elements in sorted order

set<int> s;
for (int i=0; i<k; i++)
{
// Generating a random index
int index = distribution(generator);

//Inserting into the set

s.insert(arr[index]);
}

set<int> ::iterator itr = s.begin();

// Report the median of the set at k/2 position

// Move the itr to k/2th position
advance(itr, (s.size()/2) - 1);
// Return the median
return *itr;
}
// Driver method to test above method
int main()
{
int arr[] = {1, 3, 2, 4, 5, 6, 8, 7};
int n = sizeof(arr)/sizeof(int);
printf("Approximate Median is %d\n",randApproxMedian(arr,n));
return 0
}

Time Complexity:
We use a set provided by the STL in C++. In STL Set, insertion for each element takes O(log k). So
For k insertions, time taken is O (k log k).
Now replacing k with c log n
=>O(c log n (log (clog n))) =>O (log n (log log n))
How is probability of error less than 2/n2?
Algorithm makes an error if the set S has at least k/2 elements are from the Left Quarter or Right
Quarter.
It is quite easy to visualize this statement since the median which we report will be (k/2)th element
and if we take k/2 elements from the left quarter(or right quarter) the median will be from the left
quarter (or the right quarter).
An array can be divided into 4 quarters each of size n/4. So P(selecting left quarter) is 1/4. So what
is the probability that at least k/2 elements are from the Left Quarter or Right Quarter? This
probability problem is same as below :
Given a coin which gives HEADS with probability 1/4 and TAILS with 3/4. The coin is tossed k
times. What is the probability that we get at least k/2 HEADS is less than or equal to?

Explanation:
If we put k = c log n for c = 10, we get
P <= (1/2)2log n
P <= (1/2)log n2
P <= n-2
Probability of selecting at least k/2 elements from the left quarter) <= 1/n2
Probability of selecting at least k/2 elements from the left or right quarter) <= 2/n2
Therefore algorithm produces incorrect result with probability less than or equal to 2/n2.

Assignment5 Drw2 B
No ratings yet
Assignment5 Drw2 B
5 pages
Ada Notes Unit 4
No ratings yet
Ada Notes Unit 4
28 pages
Unit8 ADA SPPDF 2022 11 11 17 17 37pdf 2023 12 06 16 57 08
No ratings yet
Unit8 ADA SPPDF 2022 11 11 17 17 37pdf 2023 12 06 16 57 08
18 pages
String Matching
No ratings yet
String Matching
63 pages
String Matching
No ratings yet
String Matching
30 pages
String Matching Algorithms
No ratings yet
String Matching Algorithms
46 pages
String Matching
100% (1)
String Matching
27 pages
CH-8
No ratings yet
CH-8
26 pages
Unit II
No ratings yet
Unit II
94 pages
UNIT-5 DAA Complete Notes
No ratings yet
UNIT-5 DAA Complete Notes
52 pages
Lecture 56string Matching
No ratings yet
Lecture 56string Matching
43 pages
Unit 5 String Matching 2010
No ratings yet
Unit 5 String Matching 2010
5 pages
11 Data Structures and Algorithms - Narasimha Karumanchi
No ratings yet
11 Data Structures and Algorithms - Narasimha Karumanchi
12 pages
Module 6 AOA
No ratings yet
Module 6 AOA
19 pages
patternmatching
No ratings yet
patternmatching
29 pages
Unit 3-Pattern Matching
No ratings yet
Unit 3-Pattern Matching
42 pages
A357460420 - 22393 - 2 - 2018 - String Matching
No ratings yet
A357460420 - 22393 - 2 - 2018 - String Matching
27 pages
SOU Lecture Handout ADA Unit-8
No ratings yet
SOU Lecture Handout ADA Unit-8
17 pages
Lecture 34, 35 36 - String Matching Algorithms
No ratings yet
Lecture 34, 35 36 - String Matching Algorithms
42 pages
Abstract
No ratings yet
Abstract
12 pages
Unit 3-Pattern Matching.pptx
No ratings yet
Unit 3-Pattern Matching.pptx
43 pages
Lecture#8 - String Matching Algorithm
No ratings yet
Lecture#8 - String Matching Algorithm
38 pages
BNP Unit-5 Lecture 19
No ratings yet
BNP Unit-5 Lecture 19
13 pages
M3-string_matching
No ratings yet
M3-string_matching
74 pages
String Matching Algorithms
No ratings yet
String Matching Algorithms
25 pages
Ch-5 Numerical Daa
No ratings yet
Ch-5 Numerical Daa
11 pages
Naive and Rabin Karp
No ratings yet
Naive and Rabin Karp
47 pages
Unit-8 String Matching
No ratings yet
Unit-8 String Matching
31 pages
5CS4-AOA-Unit-3 @zammers
No ratings yet
5CS4-AOA-Unit-3 @zammers
7 pages
String Matching Chapter 12 Goodrich Nep
No ratings yet
String Matching Chapter 12 Goodrich Nep
43 pages
String Matching
No ratings yet
String Matching
27 pages
String Matching 2019
No ratings yet
String Matching 2019
50 pages
KMP Algorithm
No ratings yet
KMP Algorithm
21 pages
String Matching
No ratings yet
String Matching
35 pages
Trings and Attern Atching: - Brute Force, Rabin-Karp, Knuth-Morris-Pratt
No ratings yet
Trings and Attern Atching: - Brute Force, Rabin-Karp, Knuth-Morris-Pratt
49 pages
Lecture 37 String Matching
100% (1)
Lecture 37 String Matching
12 pages
Lecture 18 - String Matching-KMP
No ratings yet
Lecture 18 - String Matching-KMP
40 pages
Trings and Attern Atching: - Brute Force, Rabin-Karp, Knuth-Morris-Pratt - Regular Expressions
No ratings yet
Trings and Attern Atching: - Brute Force, Rabin-Karp, Knuth-Morris-Pratt - Regular Expressions
21 pages
4string Matching Kmprabin Karp and Naive
No ratings yet
4string Matching Kmprabin Karp and Naive
57 pages
Slides - Chapter 32 - String Matching
No ratings yet
Slides - Chapter 32 - String Matching
18 pages
String Matching - RYS - Lect - 1 - 2 - 3 - Update
No ratings yet
String Matching - RYS - Lect - 1 - 2 - 3 - Update
61 pages
AAD-String Matching
No ratings yet
AAD-String Matching
15 pages
AOA Module 6 - String of Algorithms - Aeraxia - in
No ratings yet
AOA Module 6 - String of Algorithms - Aeraxia - in
26 pages
KMP Algo
No ratings yet
KMP Algo
16 pages
String Matching
No ratings yet
String Matching
18 pages
String Matching Problem
No ratings yet
String Matching Problem
16 pages
Text Pattern Search Using Naïve Algorithm: Justine Estoesta, Patricia Mae Omana, Winci John Singh
No ratings yet
Text Pattern Search Using Naïve Algorithm: Justine Estoesta, Patricia Mae Omana, Winci John Singh
5 pages
Lecture 39 Knutt Morris Pratt
No ratings yet
Lecture 39 Knutt Morris Pratt
15 pages
Algorithms in Bioinformatics
No ratings yet
Algorithms in Bioinformatics
7 pages
Pattern Matching Algo
No ratings yet
Pattern Matching Algo
21 pages
BNP Unit-5 Lecture 20 KMP 5.2
No ratings yet
BNP Unit-5 Lecture 20 KMP 5.2
14 pages
String Matching Algorithms: International Journal of Engineering and Computer Science March 2018
No ratings yet
String Matching Algorithms: International Journal of Engineering and Computer Science March 2018
5 pages
KMP 2
No ratings yet
KMP 2
7 pages
String Matching
No ratings yet
String Matching
34 pages
The Knuth Morris Pratt Algorithm
No ratings yet
The Knuth Morris Pratt Algorithm
7 pages
Knuth-Morris-Pratt Algorithm KENT
No ratings yet
Knuth-Morris-Pratt Algorithm KENT
4 pages
Module 06. String Algorithms Lecture 1 - 2
No ratings yet
Module 06. String Algorithms Lecture 1 - 2
19 pages
15 String Matching
No ratings yet
15 String Matching
45 pages
UNIT-V String Matching
No ratings yet
UNIT-V String Matching
24 pages
w 9 Presentation
No ratings yet
w 9 Presentation
20 pages
Basic Exercises for Competitive Programming: Python
From Everand
Basic Exercises for Competitive Programming: Python
Jan Pol
No ratings yet
Proficiency Presentation: Design and Analysis of Algorithms
No ratings yet
Proficiency Presentation: Design and Analysis of Algorithms
7 pages
DAA MID2 (UandiStar - Org)
No ratings yet
DAA MID2 (UandiStar - Org)
27 pages
Daa QB Short
No ratings yet
Daa QB Short
11 pages
NR-2024-ME-CSE-SYLLABUS
No ratings yet
NR-2024-ME-CSE-SYLLABUS
97 pages
Mcta 102 Programming System Jun 2020
No ratings yet
Mcta 102 Programming System Jun 2020
2 pages
Automatic Time Table Management System: A Project Report ON
No ratings yet
Automatic Time Table Management System: A Project Report ON
64 pages
Buy ebook Analysis And Design Of Algorithms 2nd Edition Amrinder Arora cheap price
100% (2)
Buy ebook Analysis And Design Of Algorithms 2nd Edition Amrinder Arora cheap price
71 pages
Turning Machine: Unit-V
No ratings yet
Turning Machine: Unit-V
23 pages
CS502 Solved Subjective For Final Term
No ratings yet
CS502 Solved Subjective For Final Term
19 pages
DAA Endsem
No ratings yet
DAA Endsem
3 pages
M e Cse
No ratings yet
M e Cse
129 pages
DAA Syllabus 2022-23 As Per BoS
No ratings yet
DAA Syllabus 2022-23 As Per BoS
4 pages
P NP NPC and NPH
No ratings yet
P NP NPC and NPH
3 pages
UNIT IV - cyber security
No ratings yet
UNIT IV - cyber security
14 pages
Assignment of DSA
No ratings yet
Assignment of DSA
8 pages
M.Tech. IT
No ratings yet
M.Tech. IT
119 pages
DAA Unit Wise Importtant Questions
100% (4)
DAA Unit Wise Importtant Questions
2 pages
Daa Mid2 Objective Bits
No ratings yet
Daa Mid2 Objective Bits
3 pages
Introduction To The Theory of Computation
No ratings yet
Introduction To The Theory of Computation
4 pages
Algorithm Design & Data Structures
No ratings yet
Algorithm Design & Data Structures
13 pages
CSE408
No ratings yet
CSE408
2 pages
P NP NP Complete Examples
No ratings yet
P NP NP Complete Examples
19 pages
GATEOverflow Book PDF
100% (1)
GATEOverflow Book PDF
400 pages
Ads Unit - 5
No ratings yet
Ads Unit - 5
7 pages
DAA Marking Scheme
No ratings yet
DAA Marking Scheme
4 pages
NP-Completeness: Problems Proofs Approximations
No ratings yet
NP-Completeness: Problems Proofs Approximations
141 pages
Parallel Algorithms of Timetable Generation
100% (1)
Parallel Algorithms of Timetable Generation
60 pages
ME-CSE-2020-SYLLABUS
No ratings yet
ME-CSE-2020-SYLLABUS
74 pages
Bebras Solutions Guide 2021 R1 Primary
No ratings yet
Bebras Solutions Guide 2021 R1 Primary
48 pages

DAA_unit_5

Uploaded by

DAA_unit_5

Uploaded by

Selected Topics:

The KMP Matcher:

Example2: If a bad character doesn't exist the pattern then.

Rabin-Karp O(m) (O (n - m + 1)m)

Finite Automata O(m|∑|) O (n)

Knuth-Morris-Pratt O(m) O (n)

Boyer-Moore O(|∑|) (O ((n - m + 1) + |∑|))

An approximate algorithm for vertex cover:

Below is Bayes’s formula for conditional probability.

Randomized Algorithms | Set 1 (Introduction and Analysis)

How to analyse Randomized Algorithms?

1. If low >= high, then EXIT.

Solution of above recurrence is O(n Log n)

Randomized Algorithms | Set 2 (Classification and Applications)

# Generates a random permutation of the given array

# Print the shuffled array

from random import shuffle

# Shuffle the list to ensure a random ordering

# Find the median by selecting the middle element

/* C++ program to find Approximate Median using

// This function returns the Approximate Median

// Random number generated will be in the range [0,n-1]

int k = 10*log2(n); // Taking c as 10

// A set stores unique elements in sorted order

//Inserting into the set

set<int> ::iterator itr = s.begin();

// Report the median of the set at k/2 position

You might also like