
Daa Exp 09

The document discusses two string matching algorithms - KMP and Karp-Rabin. It provides pseudocode and implementation of the algorithms in Java. It also analyzes the time complexity of both algorithms and justifies that they have O(n+m) complexity on average.

Uploaded by

aniketbande777

Bharatiya Vidya Bhavan’s

SARDAR PATEL INSTITUTE OF TECHNOLOGY


(Autonomous Institute Affiliated to University of Mumbai)
Munshi Nagar, Andheri (W), Mumbai – 400 058.
Department of Computer Engineering

Experiment 9
Aim: To understand and implement string matching algorithms.

Objective:
1) Write pseudocode for any two string matching algorithms.
2) Implement the two string matching algorithms mentioned above.
3) Calculate the time complexity of both algorithms.
4) Solve string matching with both algorithms on pen and paper.

Name: Kunal Shantaram Bhoi
UCID: 2022600007
Class: CSE (AIML)
Batch: A
Date of Submission: 03/04/2024

Algorithm and Explanation of the technique used

(1) KMP:

Preprocess Pattern:
Create an array lps of length equal to the pattern's length. lps[i] stores the
length of the longest proper prefix of pattern[0..i] that is also a suffix of it.
Set lps[0] to 0.
Initialize two pointers: i to 1 and j to 0.
While i is less than the length of the pattern:
    If the characters at i and j are equal:
        Increment j, set lps[i] to j, and increment i.
    If the characters are not equal and j is not at the start:
        Update j to lps[j - 1] (i does not move).
    If the characters are not equal and j is at the start:
        Set lps[i] to 0 and increment i.
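
To make the preprocessing step concrete, the following is a minimal sketch that prints the lps array for the pattern used later in this experiment. The class name LpsDemo and the second test pattern "AAACAAAA" are assumptions chosen for illustration; the second pattern is included because it exercises the fallback step j = lps[j - 1].

```java
import java.util.Arrays;

class LpsDemo {
    // lps[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] computeLps(String pattern) {
        int[] lps = new int[pattern.length()];
        int j = 0; // length of the currently matched prefix
        int i = 1; // lps[0] is always 0, so start at index 1
        while (i < pattern.length()) {
            if (pattern.charAt(i) == pattern.charAt(j)) {
                lps[i++] = ++j;   // extend the matched prefix
            } else if (j != 0) {
                j = lps[j - 1];   // fall back to a shorter prefix; i stays put
            } else {
                lps[i++] = 0;     // no prefix ends at this position
            }
        }
        return lps;
    }

    public static void main(String[] args) {
        // prints [0, 0, 1, 2, 0, 1, 2, 3, 4]
        System.out.println(Arrays.toString(computeLps("ABABCABAB")));
        // prints [0, 1, 2, 0, 1, 2, 3, 3]
        System.out.println(Arrays.toString(computeLps("AAACAAAA")));
    }
}
```

In the second pattern, the last 'A' mismatches against 'C' at j = 3, so j falls back to lps[2] = 2 before matching again, which is why the final entry is 3 rather than 4.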

Search Pattern in Text:
Initialize two pointers, i (into the text) and j (into the pattern), to 0.
While i has not reached the end of the text:
    If the characters at text[i] and pattern[j] are equal:
        Increment both i and j.
        If j reaches the length of the pattern:
            Pattern found at index i - j.
            Update j to lps[j - 1].
    If the characters are not equal and j is not at the start:
        Update j to lps[j - 1].
    If the characters are not equal and j is at the start:
        Increment i.
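
The search steps can be sketched as a method that returns the match positions instead of printing them. This is an illustrative variant, not the program submitted below: the class name KmpSearchDemo, the method findAll, and the "AAAA"/"AA" input are assumptions. The second input shows why j is reset to lps[j - 1] after a hit: it lets overlapping occurrences be reported.

```java
import java.util.ArrayList;
import java.util.List;

class KmpSearchDemo {
    static int[] computeLps(String pattern) {
        int[] lps = new int[pattern.length()];
        int j = 0;
        int i = 1;
        while (i < pattern.length()) {
            if (pattern.charAt(i) == pattern.charAt(j)) {
                lps[i++] = ++j;
            } else if (j != 0) {
                j = lps[j - 1];
            } else {
                lps[i++] = 0;
            }
        }
        return lps;
    }

    // Returns the start index of every occurrence of pattern in text.
    static List<Integer> findAll(String text, String pattern) {
        List<Integer> hits = new ArrayList<>();
        if (pattern.isEmpty()) {
            return hits; // assumption: an empty pattern reports no matches
        }
        int[] lps = computeLps(pattern);
        int i = 0; // index into text
        int j = 0; // index into pattern
        while (i < text.length()) {
            if (text.charAt(i) == pattern.charAt(j)) {
                i++;
                j++;
                if (j == pattern.length()) {
                    hits.add(i - j);
                    j = lps[j - 1]; // keep the longest border to allow overlaps
                }
            } else if (j != 0) {
                j = lps[j - 1];     // shift the pattern, keep i where it is
            } else {
                i++;                // nothing matched, move on in the text
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findAll("ABABDABACDABABCABAB", "ABABCABAB")); // prints [10]
        System.out.println(findAll("AAAA", "AA"));                       // prints [0, 1, 2]
    }
}
```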
(2) Karp-Rabin:

Calculate Hash Function: Define a hash function that treats each character as
a numeric value and combines the characters of a string into a single number.
The pattern and every substring of the same length can then be compared
through their hash values.

Calculate Initial Hashes: Calculate the hash values for the pattern and
the first substring of the text with the same length as the pattern.

Search: Iterate through the text, comparing the hash value of the current
substring with the hash value of the pattern. If they match, perform a
character-by-character comparison to confirm the match, because two different
strings can share the same hash value.

Update Hashes: Before moving to the next position, update the hash value of
the current substring using a rolling hash function. This function
efficiently updates the hash value by removing the contribution of the
first character and adding the contribution of the next character.

Repeat: Continue the process until all substrings of the text have been
checked.

Output: Whenever a substring matches the pattern, output the index of
the starting position of the substring.
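
As a small illustration of the rolling update, the sketch below checks that sliding the window by one character gives the same hash as recomputing it from scratch. The class name RollingHashDemo, the radix BASE = 101, and the modulus MOD = 1,000,000,007 are assumptions made for this example; any polynomial hash with integer arithmetic would work the same way.

```java
class RollingHashDemo {
    static final long BASE = 101;            // assumed radix of the polynomial hash
    static final long MOD = 1_000_000_007L;  // assumed prime modulus

    // Hash of s[from .. from + len), most significant character first.
    static long hash(String s, int from, int len) {
        long h = 0;
        for (int k = 0; k < len; k++) {
            h = (h * BASE + s.charAt(from + k)) % MOD;
        }
        return h;
    }

    // BASE^exp mod MOD, computed by repeated multiplication.
    static long power(int exp) {
        long p = 1;
        for (int k = 0; k < exp; k++) {
            p = p * BASE % MOD;
        }
        return p;
    }

    // Slide the window one step: remove `out` (the leading character),
    // then append `in`. highPow must be BASE^(len - 1) mod MOD.
    static long roll(long prev, char out, char in, long highPow) {
        long withoutFirst = (prev - out * highPow % MOD + MOD) % MOD;
        return (withoutFirst * BASE + in) % MOD;
    }

    public static void main(String[] args) {
        String text = "ABABDABACDABABCABAB";
        int m = 9; // window length, equal to the pattern length
        long highPow = power(m - 1);
        long h = hash(text, 0, m);
        for (int i = 1; i + m <= text.length(); i++) {
            h = roll(h, text.charAt(i - 1), text.charAt(i + m - 1), highPow);
            System.out.println(i + ": rolled=" + h + " recomputed=" + hash(text, i, m));
        }
    }
}
```

Each printed line should show the same value twice. The rolled value costs a constant number of operations per step, while recomputing costs m operations.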
Program (Code)

//KMP
package DAA;

public class KMP {

    // Preprocess the pattern to calculate the longest prefix suffix array (lps)
    private static int[] computeLPSArray(String pattern) {
        int[] lps = new int[pattern.length()];
        int len = 0; // Length of the previous longest prefix suffix
        int i = 1;

        while (i < pattern.length()) {
            if (pattern.charAt(i) == pattern.charAt(len)) {
                len++;
                lps[i] = len;
                i++;
            } else {
                if (len != 0) {
                    len = lps[len - 1]; // fall back without advancing i
                } else {
                    lps[i] = 0;
                    i++;
                }
            }
        }
        return lps;
    }

    // Perform pattern searching using KMP algorithm
    public static void search(String text, String pattern) {
        int M = pattern.length();
        int N = text.length();
        if (M == 0) {
            return; // an empty pattern has no character to compare
        }
        int[] lps = computeLPSArray(pattern);
        int i = 0; // Index for text[]
        int j = 0; // Index for pattern[]

        while (i < N) {
            if (pattern.charAt(j) == text.charAt(i)) {
                i++;
                j++;
            }
            if (j == M) {
                System.out.println("Pattern found at index " + (i - j));
                j = lps[j - 1];
            } else if (i < N && pattern.charAt(j) != text.charAt(i)) {
                if (j != 0)
                    j = lps[j - 1];
                else
                    i = i + 1;
            }
        }
    }

    public static void main(String[] args) {
        String text = "ABABDABACDABABCABAB";
        String pattern = "ABABCABAB";
        search(text, pattern);
    }
}

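//Karp-Rabin
package DAA;

public class KarpRabin {
    private static final int PRIME = 101;            // base of the polynomial hash
    private static final long MOD = 1_000_000_007L;  // prime modulus

    // Hash of the first `length` characters of str, most significant character first.
    // Integer arithmetic modulo MOD keeps the hash exact for any pattern length.
    private long calculateHash(String str, int length) {
        long hash = 0;
        for (int i = 0; i < length; i++) {
            hash = (hash * PRIME + str.charAt(i)) % MOD;
        }
        return hash;
    }

    // Slide the window one position: drop oldChar, append newChar.
    // highPow is PRIME^(patternLength - 1) mod MOD.
    private long updateHash(long prevHash, char oldChar, char newChar, long highPow) {
        long newHash = (prevHash - oldChar * highPow % MOD + MOD) % MOD;
        return (newHash * PRIME + newChar) % MOD;
    }

    public void search(String text, String pattern) {
        int patternLength = pattern.length();
        if (patternLength == 0 || patternLength > text.length()) {
            return; // nothing to search for, or the pattern cannot fit in the text
        }
        long highPow = 1;
        for (int i = 1; i < patternLength; i++) {
            highPow = highPow * PRIME % MOD;
        }
        long patternHash = calculateHash(pattern, patternLength);
        long textHash = calculateHash(text, patternLength);

        for (int i = 0; i <= text.length() - patternLength; i++) {
            // Equal hashes only suggest a match; confirm character by character
            if (textHash == patternHash && text.startsWith(pattern, i)) {
                System.out.println("Pattern found at index " + i);
            }
            if (i < text.length() - patternLength) {
                textHash = updateHash(textHash, text.charAt(i),
                        text.charAt(i + patternLength), highPow);
            }
        }
    }

    public static void main(String[] args) {
        KarpRabin algo = new KarpRabin();
        algo.search("ABABDABACDABABCABAB", "ABABCABAB");
    }
}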
Output

Justification of the complexity calculated

KMP:
The time complexity of the Knuth-Morris-Pratt (KMP) algorithm for
pattern searching in a text of length n using a pattern of length m is
O(n+m) in the worst case.

Preprocessing (Building the lps Array):
This step takes O(m) time. The index i only moves forward, and every fallback
len = lps[len - 1] undoes an earlier increment of len, so the loop runs at
most 2m times.

Searching:
The searching step scans the text once and never moves the text index i
backwards.
A naive search may compare each character of the text against all
characters of the pattern, which costs O(nm) in the worst case.
KMP avoids this: on a mismatch, the lps array tells it how far to shift the
pattern while keeping the characters already matched, so no text character
is re-scanned.
Each comparison either advances i or decreases j, and j cannot decrease more
often than it has increased, so the search makes at most 2n comparisons.
Thus, the searching step takes O(n) time, and the total including
preprocessing is O(n+m).
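
The bound on the searching step can be checked empirically by counting comparisons. The sketch below is an assumption-laden illustration: the class KmpComparisonCount, the helper worstCaseText, and the input (n - 1 copies of 'A' followed by 'B', searched for "AAAB") are not part of the experiment. That input forces a fallback at almost every text position, yet the count should stay within 2n.

```java
class KmpComparisonCount {
    static int[] computeLps(String pattern) {
        int[] lps = new int[pattern.length()];
        int j = 0;
        int i = 1;
        while (i < pattern.length()) {
            if (pattern.charAt(i) == pattern.charAt(j)) {
                lps[i++] = ++j;
            } else if (j != 0) {
                j = lps[j - 1];
            } else {
                lps[i++] = 0;
            }
        }
        return lps;
    }

    // Counts text-versus-pattern character comparisons made by the KMP search.
    // Assumes a non-empty pattern.
    static long countComparisons(String text, String pattern) {
        int[] lps = computeLps(pattern);
        int i = 0;
        int j = 0;
        long comparisons = 0;
        while (i < text.length()) {
            comparisons++; // exactly one comparison per loop iteration
            if (text.charAt(i) == pattern.charAt(j)) {
                i++;
                j++;
                if (j == pattern.length()) {
                    j = lps[j - 1];
                }
            } else if (j != 0) {
                j = lps[j - 1];
            } else {
                i++;
            }
        }
        return comparisons;
    }

    // Hypothetical stress input: (n - 1) copies of 'A' followed by one 'B'.
    static String worstCaseText(int n) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < n - 1; k++) {
            sb.append('A');
        }
        return sb.append('B').toString();
    }

    public static void main(String[] args) {
        int n = 1000;
        long c = countComparisons(worstCaseText(n), "AAAB");
        System.out.println("comparisons=" + c + ", bound 2n=" + (2 * n));
    }
}
```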

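Karp-Rabin:

Preprocessing (Calculating Hashes):
Computing the hash value for the pattern: O(m), where m is the length
of the pattern.
Computing the hash value for the first m-length substring of the text: O(m).
The hashes of the remaining substrings are obtained one by one with the
rolling update, so all of them together cost O(n), where n is the length
of the text.

Search and Comparison:
Whenever the hash values are equal, the potential match must be confirmed by
comparing characters individually, which costs up to O(m).
In the worst case every position is a hash hit (for example, when the text
and the pattern consist of one repeated character, or when the hash function
produces many collisions), and the algorithm takes O(nm) time.
In the average case, with a good hash function, false hits are rare and the
algorithm performs O(n+m) operations.

Update Hashes:
When moving the window along the text, updating the hash value can
be done in constant time with a rolling hash function.

Therefore, the average-case time complexity of the Karp-Rabin algorithm
(also known as Rabin-Karp) is O(n+m), while its worst case is O(nm).
Unlike KMP, its efficiency depends on the quality of the hash function.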
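
The worst case for the verification step can be illustrated by counting only the character comparisons performed after a hash hit. This sketch makes several assumptions for illustration: the class name RabinKarpWorstCase, the helper repeat, the constants BASE and MOD, and the input of 1000 'A' characters searched for 100 'A' characters. On that input every window is a genuine match, so every window is fully verified.

```java
class RabinKarpWorstCase {
    static final long BASE = 101;            // assumed radix
    static final long MOD = 1_000_000_007L;  // assumed prime modulus

    // Counts character comparisons spent confirming hash hits.
    static long countVerifyComparisons(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0 || m > n) {
            return 0;
        }
        long highPow = 1; // BASE^(m - 1) mod MOD
        for (int k = 1; k < m; k++) {
            highPow = highPow * BASE % MOD;
        }
        long patternHash = 0;
        long textHash = 0;
        for (int k = 0; k < m; k++) {
            patternHash = (patternHash * BASE + pattern.charAt(k)) % MOD;
            textHash = (textHash * BASE + text.charAt(k)) % MOD;
        }
        long comparisons = 0;
        for (int i = 0; i + m <= n; i++) {
            if (patternHash == textHash) {
                // Hash hit: confirm character by character
                for (int k = 0; k < m; k++) {
                    comparisons++;
                    if (text.charAt(i + k) != pattern.charAt(k)) {
                        break;
                    }
                }
            }
            if (i + m < n) {
                // Rolling update: drop text[i], append text[i + m]
                textHash = (textHash - text.charAt(i) * highPow % MOD + MOD) % MOD;
                textHash = (textHash * BASE + text.charAt(i + m)) % MOD;
            }
        }
        return comparisons;
    }

    // Hypothetical helper: a string made of n copies of c.
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < n; k++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 901 windows, each verified with 100 comparisons: prints 90100
        System.out.println(countVerifyComparisons(repeat('A', 1000), repeat('A', 100)));
    }
}
```

The count equals (n - m + 1) * m, which grows as O(nm) even though the hash function itself is behaving perfectly here.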

Conclusion

From this experiment, I learned the significance of the
Knuth-Morris-Pratt (KMP) algorithm and the Karp-Rabin algorithm in
efficiently searching for a pattern within a text. KMP preprocesses the
pattern into the longest prefix suffix array so that no text character is
re-scanned, which keeps it time-efficient even for large texts. Karp-Rabin
does the same task using hashing: it compares hash values first and checks
characters only when the hashes agree. Understanding and implementing these
algorithms can lead to optimized string search operations, particularly in
scenarios where performance is critical.
