Name		Name	Last commit message	Last commit date
parent directory ..
README.md		README.md
code1.cpp		code1.cpp
code2.cpp		code2.cpp
input		input

README.md

KMP

Motivational Problem

You have 2 strings s and w, of length n and m respectively, and you want to know if string w is a substring of s.

For this task you develop a cool and intuitive algorithm with time efficiency O(nm) that just shifts the word w and compare to the substring of s in that position, this runs incredibly fast in your test cases, but when the strings get bigger your code starts being too slow and ineffective.

KMP is an algorithm that checks if w is a substring of s in O(n + m).

How does it works?

The KMP algorithm has basically two parts:

Prefix function
String Matching

"KMP’s algorithm is: whenever we detect a mismatch (after some matches), we already know some of the characters in the text (since they matched the pattern characters prior to the mismatch). We take advantage of this information to avoid matching the characters that we know will anyway match."

1. Prefix Function

So, in the Prefix function what we want to do is define an array f that stores the length of the maximum matching proper prefix which is also a suffix of the sub-pattern w[0..i].

Example:

w = “AAACAAAAAC”, f[] is [0, 1, 2, 0, 1, 2, 3, 3, 3, 4]. Because:

"A", is the first letter so f[0] = 0 (this is always true)
"AA", the maximum proper prefix which is also a suffix is "A" that has length 1, so f[1] = 1
"AAA", the maximum proper prefix which is also a suffix is "AA" that has length 2, so f[2] = 2
"AAAC", the maximum proper prefix which is also a suffix is "" that has length 0, so f[3] = 0
"AAACA", the maximum proper prefix which is also a suffix is "A" that has length 1, so f[4] = 1
"AAACAA", the maximum proper prefix which is also a suffix is "AA" that has length 2, so f[5] = 2
"AAACAAA", the maximum proper prefix which is also a suffix is "AAA" that has length 2, so f[6] = 3
"AAACAAAA", the maximum proper prefix which is also a suffix is "AAA" that has length 2, so f[7] = 3
"AAACAAAA", the maximum proper prefix which is also a suffix is "AAA" that has length 2, so f[8] = 3
"AAACAAAAC", the maximum proper prefix which is also a suffix is "AAAC" that has length 2, so f[9] = 4

2. String Matching

Then, the String Matching will consist of comparing s with w if a match happens (s[i] == w[j]) then continue comparing, if j == m it's a real match! if s[i] != w[j], then j = f[j], this means that if the current letters are not a match we don't need to start from the beginning of the string again we can instead start from f[j] (think a little about this).

Try to implement it! Doubts? see the code.cpp file to see the implementantion in C++ with comments 😄!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

KMP

KMP

README.md

KMP

Motivational Problem

How does it works?

1. Prefix Function

2. String Matching

Other problems that it can solve?

Files

KMP

Directory actions

More options

Directory actions

More options

Latest commit

History

KMP

Folders and files

parent directory

README.md

KMP

Motivational Problem

How does it works?

1. Prefix Function

2. String Matching

Other problems that it can solve?