Alg.20 Knutt-Morris-Pratt Algorithm

From algoexpert.io

Knutt-Morris-Pratt Algorithm, known as KMP, is a famous algorithm to solve substring problem. Given a string and a substring, we are supposed to determine whether the string contains the substring.

Usually, when we try to solve the substring problem, we may iterate the original string and check if it matches. In KMP algorithm, we can skip all the unnecessary compare and acclerate this process. For example, check if “aef aef aef aed aef aed aef aef” contains “aef aed aef aef”, we will encounter a match failure when we compare the second “f” in string and the first “d” in substring. Here, we may want start another round from second “aef” instead of the second character in the string. First, for each character of the substring, we need to find the nearest restart pointer when we meet a failure in matching:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

public static int[] generatePattern(String substring) {
int[] pattern = new int[substring.length()];
Arrays.fill(pattern, -1);
int j = 0;
for (int i = 1; i < substring.length();) {
if (substring.charAt(i) == substring.charAt(j)) {
pattern[i] = j;
i++;
j++;
} else if (j > 0) {
j = pattern[j - 1] + 1;
} else {
i++;
}
}

return pattern;
}

Now, we can start the match process.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

public static boolean doesMatch(String string, String substring, int[] pattern) {
int i = 0; // Main string idx
int j = 0; // Substring idx
while (i + substring.length() - j <= string.length()) {
if (string.charAt(i) == substring.charAt(j)) {
if (j == substring.length() - 1) {
return true;
}
i++;
j++;
} else if (j > 0) {
j = pattern[j - 1] + 1;
} else {
i++;
}
}

return false;
}

public static boolean knuthMorrisPrattAlgorithm(String string, String substring) {
int[] pattern = generatePattern(substring);
return doesMatch(string, substring, pattern);
}

Time complexity: O(n + m); Space complexity: O(m) for the extra special pattern.

Donate
  • Copyright: Copyright is owned by the author. For commercial reprints, please contact the author for authorization. For non-commercial reprints, please indicate the source.
  • Copyrights © 2019-2021 Zirun Lin

Thanks~

支付宝
微信