String


Sure, let's dive deeper into each algorithm and provide simple implementations in C++:

### 1. Knuth-Morris-Pratt (KMP) Algorithm:


- **Definition**: The KMP algorithm finds all occurrences of a "pattern" within a "text" in O(n + m) time by using a "partial match table" (also known as the "failure function" or LPS array).
- **Details**: For each position in the pattern, the table stores the length of the longest proper prefix that is also a suffix. After a mismatch, the search resumes from that prefix length instead of restarting the pattern, so no text character is re-examined.
- **Implementation**:
```cpp
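#include <iostream>
#include <string>
#include <vector>

// lps[i] = length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i].
std::vector<int> computeLPS(const std::string& pattern) {
    int n = pattern.size();
    std::vector<int> lps(n, 0);
    int len = 0; // length of the currently matched prefix
    for (int i = 1; i < n; ) {
        if (pattern[i] == pattern[len]) {
            lps[i++] = ++len;
        } else {
            if (len != 0) {
                len = lps[len - 1]; // fall back to a shorter prefix; i stays put
            } else {
                lps[i++] = 0;
            }
        }
    }
    return lps;
}

void KMP(const std::string& text, const std::string& pattern) {
    int n = text.size(), m = pattern.size();
    if (m == 0 || m > n)
        return; // nothing to find; also avoids reading lps[-1] for an empty pattern
    std::vector<int> lps = computeLPS(pattern);
    int i = 0, j = 0; // i indexes text, j indexes pattern
    while (i < n) {
        if (pattern[j] == text[i]) {
            i++;
            j++;
        }
        if (j == m) {
            std::cout << "Pattern found at index " << i - j << std::endl;
            j = lps[j - 1]; // keep going to report overlapping matches
        } else if (i < n && pattern[j] != text[i]) {
            if (j != 0)
                j = lps[j - 1];
            else
                i++;
        }
    }
}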
```
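The printing version above is awkward to reuse or check by hand. As a minimal sketch of the same idea, the variant below returns the match positions instead of printing them. The names `buildLps` and `kmpFindAll` are hypothetical and not part of the original code; the empty-pattern behaviour (no matches) is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: lps[i] is the length of the longest proper prefix of
// pattern[0..i] that is also a suffix of it.
std::vector<int> buildLps(const std::string& pattern) {
    std::vector<int> lps(pattern.size(), 0);
    int len = 0;
    for (std::size_t i = 1; i < pattern.size(); ++i) {
        // Fall back through shorter borders until one can be extended.
        while (len > 0 && pattern[i] != pattern[len])
            len = lps[len - 1];
        if (pattern[i] == pattern[len])
            ++len;
        lps[i] = len;
    }
    return lps;
}

// Hypothetical helper: start indices of all (possibly overlapping) matches.
// Assumption: an empty pattern yields no matches.
std::vector<int> kmpFindAll(const std::string& text, const std::string& pattern) {
    std::vector<int> matches;
    if (pattern.empty())
        return matches;
    const std::vector<int> lps = buildLps(pattern);
    const int m = static_cast<int>(pattern.size());
    int j = 0; // number of pattern characters currently matched
    for (std::size_t i = 0; i < text.size(); ++i) {
        while (j > 0 && text[i] != pattern[j])
            j = lps[j - 1];
        if (text[i] == pattern[j])
            ++j;
        if (j == m) {
            matches.push_back(static_cast<int>(i) - m + 1);
            j = lps[j - 1]; // allow overlapping matches
        }
    }
    return matches;
}
```

For the pattern `"aabaaab"`, `buildLps` gives `{0, 1, 0, 1, 2, 2, 3}`, and `kmpFindAll("abababa", "aba")` gives `{0, 2, 4}`, which shows that overlapping occurrences are reported.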

### 2. Rabin-Karp Algorithm:


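- **Definition**: The Rabin-Karp algorithm uses hashing to find all occurrences of a "pattern" within a "text".
- **Details**: It compares the hash of the pattern with the hash of each sliding window of the text and checks characters one by one only when the hashes agree, since different strings can share a hash. A rolling hash updates the window hash in constant time per shift, giving O(n + m) time on average and O(nm) in the worst case.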
- **Implementation**:
```cpp
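#include <iostream>
#include <string>

const int prime = 101; // modulus for the hash
const int base = 256;  // number of possible byte values

void RabinKarp(const std::string& text, const std::string& pattern) {
    int n = text.size(), m = pattern.size();
    if (m == 0 || m > n)
        return; // avoids reading past the end of text below
    int pattern_hash = 0, text_hash = 0;
    int h = 1; // base^(m-1) % prime, the weight of the window's leading character
    for (int i = 0; i < m - 1; i++)
        h = (h * base) % prime;
    // Cast to unsigned char so bytes >= 128 do not produce negative hashes.
    for (int i = 0; i < m; i++) {
        pattern_hash = (base * pattern_hash + (unsigned char)pattern[i]) % prime;
        text_hash = (base * text_hash + (unsigned char)text[i]) % prime;
    }
    for (int i = 0; i <= n - m; i++) {
        if (pattern_hash == text_hash) {
            // Hashes can collide, so confirm the match character by character.
            int j;
            for (j = 0; j < m; j++)
                if (text[i + j] != pattern[j])
                    break;
            if (j == m)
                std::cout << "Pattern found at index " << i << std::endl;
        }
        if (i < n - m) {
            // Roll the hash: drop text[i], append text[i + m].
            text_hash = (base * (text_hash - (unsigned char)text[i] * h)
                         + (unsigned char)text[i + m]) % prime;
            if (text_hash < 0)
                text_hash += prime;
        }
    }
}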
```
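The rolling update is the step that is easiest to get wrong, so it can help to check it in isolation. The sketch below, with the hypothetical names `directHash` and `rollingHashes` (not in the original), computes every window hash incrementally and lets you compare each one against a hash computed from scratch. It assumes the same base 256 and modulus 101 as the implementation above.

```cpp
#include <string>
#include <vector>

const int kBase = 256;   // assumed alphabet size, matching the code above
const int kPrime = 101;  // assumed modulus, matching the code above

// Hash of text[start .. start+len) computed from scratch with Horner's rule.
int directHash(const std::string& text, int start, int len) {
    int hash = 0;
    for (int i = 0; i < len; ++i)
        hash = (kBase * hash + static_cast<unsigned char>(text[start + i])) % kPrime;
    return hash;
}

// Hashes of every window of length len, each derived from the previous one
// in O(1). Returns an empty vector if no window of that length exists.
std::vector<int> rollingHashes(const std::string& text, int len) {
    std::vector<int> hashes;
    const int n = static_cast<int>(text.size());
    if (len <= 0 || len > n)
        return hashes;
    int h = 1; // kBase^(len-1) mod kPrime
    for (int i = 0; i < len - 1; ++i)
        h = (h * kBase) % kPrime;
    int hash = directHash(text, 0, len);
    hashes.push_back(hash);
    for (int i = 0; i + len < n; ++i) {
        const int outgoing = static_cast<unsigned char>(text[i]);
        const int incoming = static_cast<unsigned char>(text[i + len]);
        hash = (kBase * (hash - outgoing * h) + incoming) % kPrime;
        if (hash < 0)
            hash += kPrime; // C++ % can return a negative remainder
        hashes.push_back(hash);
    }
    return hashes;
}
```

For any text and window length, `rollingHashes(text, len)[i]` should equal `directHash(text, i, len)`; if the two ever differ, the rolling step is wrong.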

### 3. Boyer-Moore Algorithm:


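- **Definition**: The Boyer-Moore algorithm compares the pattern against the text from right to left and uses the "bad character" and "good suffix" rules to skip unnecessary comparisons.
- **Details**: It preprocesses the pattern so that, after a mismatch, the pattern can be shifted by more than one position. The simplified implementation below uses only the bad character rule: it records the last index of each character in the pattern and aligns that occurrence with the mismatched text character.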
- **Implementation**:
```cpp
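#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

const int CHAR_RANGE = 256;

// badChar[c] = index of the last occurrence of byte c in pattern, or -1.
void preProcess(const std::string& pattern, std::vector<int>& badChar) {
    int m = pattern.size();
    for (int i = 0; i < CHAR_RANGE; i++)
        badChar[i] = -1;
    // Index with unsigned char: a plain char may be negative for bytes >= 128.
    for (int i = 0; i < m; i++)
        badChar[(unsigned char)pattern[i]] = i;
}

void BoyerMoore(const std::string& text, const std::string& pattern) {
    int n = text.size(), m = pattern.size();
    if (m == 0)
        return; // nothing to search for
    std::vector<int> badChar(CHAR_RANGE);
    preProcess(pattern, badChar);
    int shift = 0; // current alignment of pattern against text
    while (shift <= (n - m)) {
        int j = m - 1;
        // Compare from the right end of the pattern.
        while (j >= 0 && pattern[j] == text[shift + j])
            j--;
        if (j < 0) {
            std::cout << "Pattern found at index " << shift << std::endl;
            // Align the character after the window with its last occurrence.
            shift += (shift + m < n) ? m - badChar[(unsigned char)text[shift + m]] : 1;
        } else {
            // Bad character rule; max(1, ...) guarantees forward progress.
            shift += std::max(1, j - badChar[(unsigned char)text[shift + j]]);
        }
    }
}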
```
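To see how the bad character rule picks a shift, the table and the shift formula can be isolated into two small functions. This is only a sketch: `lastOccurrence` and `badCharShift` are hypothetical names, not part of the original code, and the table assumes single-byte characters.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: index of the last occurrence of every byte value in
// the pattern, or -1 if the byte does not occur.
std::vector<int> lastOccurrence(const std::string& pattern) {
    std::vector<int> last(256, -1);
    for (int i = 0; i < static_cast<int>(pattern.size()); ++i)
        last[static_cast<unsigned char>(pattern[i])] = i;
    return last;
}

// Shift suggested by the bad character rule after a mismatch at pattern
// index j against text character c. The result is always at least 1.
int badCharShift(const std::vector<int>& last, int j, char c) {
    return std::max(1, j - last[static_cast<unsigned char>(c)]);
}
```

With the pattern `"EXAMPLE"`, a mismatch at index 6 against `'P'` (last seen at index 4) gives a shift of 2, a mismatch at index 6 against `'Z'` (absent from the pattern) gives a shift of 7, and a mismatch at index 2 against `'E'` (last seen at index 6, to the right of the mismatch) would give a negative value, so the shift falls back to 1.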

### 4. Edit Distance:


- **Definition**: The edit distance is the minimum number of operations (insertion,
deletion, substitution) required to convert one string into another.
- **Dynamic Programming Approach**:
```cpp
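#include <algorithm>
#include <string>
#include <vector>

// dp[i][j] = edit distance between the first i characters of str1
// and the first j characters of str2.
int editDistance(const std::string& str1, const std::string& str2) {
    int m = str1.size(), n = str2.size();
    std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));
    for (int i = 0; i <= m; i++) {
        for (int j = 0; j <= n; j++) {
            if (i == 0)
                dp[i][j] = j; // insert all j characters
            else if (j == 0)
                dp[i][j] = i; // delete all i characters
            else if (str1[i - 1] == str2[j - 1])
                dp[i][j] = dp[i - 1][j - 1]; // characters match, no cost
            else
                dp[i][j] = 1 + std::min({ dp[i - 1][j - 1],  // substitution
                                          dp[i][j - 1],      // insertion
                                          dp[i - 1][j] });   // deletion
        }
    }
    return dp[m][n];
}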
```
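Each row of the table depends only on the row above it, so the full (m + 1) × (n + 1) table is not strictly needed when only the distance is wanted. The sketch below is a two-row variant of the same recurrence; `editDistanceTwoRows` is a hypothetical name, and this is one possible memory optimization, not the original implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Same recurrence as the full-table version, keeping only two rows.
// prev holds row i-1 and curr holds row i of the DP table.
int editDistanceTwoRows(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = static_cast<int>(j); // row 0: insert j characters
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = static_cast<int>(i); // column 0: delete i characters
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1])
                curr[j] = prev[j - 1];
            else
                curr[j] = 1 + std::min({ prev[j - 1],   // substitution
                                         curr[j - 1],   // insertion
                                         prev[j] });    // deletion
        }
        std::swap(prev, curr); // the finished row becomes the previous row
    }
    return prev[b.size()];
}
```

As a check, `editDistanceTwoRows("kitten", "sitting")` is 3 (substitute `k` with `s`, substitute `e` with `i`, insert `g`). The trade-off is that the sequence of edits can no longer be recovered, because the earlier rows are discarded.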

### 5. Longest Common Subsequence (LCS):


- **Definition**: The longest common subsequence is the longest sequence of characters that appears in both strings in the same order, though not necessarily contiguously.
- **Dynamic Programming Approach**:
```cpp
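#include <algorithm>
#include <string>
#include <vector>

// dp[i][j] = length of the LCS of the first i characters of str1
// and the first j characters of str2.
int LCS(const std::string& str1, const std::string& str2) {
    int m = str1.size(), n = str2.size();
    std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (str1[i - 1] == str2[j - 1])
                dp[i][j] = 1 + dp[i - 1][j - 1]; // extend the common subsequence
            else
                dp[i][j] = std::max(dp[i - 1][j], dp[i][j - 1]); // drop one character
        }
    }
    return dp[m][n];
}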
```
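The function above returns only the length. If the subsequence itself is needed, one common approach is to walk the filled table backwards from the bottom-right corner. The sketch below does that; `lcsString` is a hypothetical name, and when several subsequences share the maximum length it returns just one of them, chosen by its tie-breaking rule.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Builds the same DP table as the length-only version, then backtracks
// from dp[m][n] to recover one longest common subsequence.
std::string lcsString(const std::string& a, const std::string& b) {
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            if (a[i - 1] == b[j - 1])
                dp[i][j] = 1 + dp[i - 1][j - 1];
            else
                dp[i][j] = std::max(dp[i - 1][j], dp[i][j - 1]);
        }
    }
    std::string result;
    int i = m, j = n;
    while (i > 0 && j > 0) {
        if (a[i - 1] == b[j - 1]) {
            result.push_back(a[i - 1]); // this character is part of the LCS
            --i;
            --j;
        } else if (dp[i - 1][j] >= dp[i][j - 1]) {
            --i; // the LCS length is preserved by dropping a[i-1]
        } else {
            --j; // otherwise drop b[j-1]
        }
    }
    std::reverse(result.begin(), result.end()); // characters were collected back to front
    return result;
}
```

For example, `lcsString("AGGTAB", "GXTXAYB")` returns `"GTAB"`.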

These are simplified implementations of the mentioned algorithms in C++. They are
meant for educational purposes and may not be as optimized as the standard
implementations. Understanding and implementing these algorithms will give you a
solid foundation for string processing tasks.
