| Section | Title |
| --- | --- |
| 32.1 | Naive String-matching |
| 32.2 | Rabin-Karp Algorithm |
| 32.3 | String-matching Automata |
| 32.4 | Knuth-Morris-Pratt Algorithm |
| 32.5 [ed4] | Suffix Arrays |

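String matching is the problem of finding all occurrences of a string pattern $P$ of length $m$ in a text $T$ of length $n$, where $m \leq n$ and both strings are drawn from a finite alphabet $\Sigma$.
We say that pattern $P$ occurs with shift $s$ in text $T$ if $0 \leq s \leq n - m$ and $T[s:s+m-1] = P[0:m-1]$; such an $s$ is called a valid shift.
In other words, the string-matching problem is to find all valid shifts for a given pattern $P$ in a given text $T$.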
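
To make the notion of a valid shift concrete, here is a minimal Python sketch. It assumes 0-based indexing, so a shift $s$ is a candidate only when $0 \leq s \leq n - m$. The function name `valid_shifts` is a hypothetical helper chosen for illustration, and the brute-force comparison is only one possible way to enumerate the shifts.

```python
def valid_shifts(text: str, pattern: str) -> list[int]:
    """Return every shift s at which pattern occurs in text (0-based)."""
    n, m = len(text), len(pattern)
    # The pattern must fit inside the text, so s ranges over 0..n-m.
    # When m > n the range is empty and no shift is reported.
    return [s for s in range(n - m + 1) if text[s:s + m] == pattern]


print(valid_shifts("abcabaabcabac", "abaa"))  # [3]
print(valid_shifts("aaaa", "aa"))             # [0, 1, 2]
```

The second call shows that occurrences may overlap: every shift is tested independently of the others.
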
- $\Sigma^*$: the set of all finite-length strings over $\Sigma$
- $\epsilon$: the empty string, i.e. the string of length $0$
- $|x|$: the length of string $x$
- $xy$: the concatenation of strings $x$ and $y$, with $|xy| = |x| + |y|$
- $T_q$: the $q$-character prefix of string $T$, i.e. $T_q = T[0:q-1]$
- $x \sqsubset y$: string $x$ is a $\color{orchid}{\text{prefix}}$ of string $y$, i.e. $y = xz$ for some string $z \in \Sigma^*$
- $x \sqsupset y$: string $x$ is a $\color{orchid}{\text{suffix}}$ of string $y$, i.e. $y = zx$ for some string $z \in \Sigma^*$
  - $x \sqsubset y \Rightarrow |x| \leq |y|$
  - $x \sqsupset y \Rightarrow |x| \leq |y|$
- $x \sqsubset y \land |x| < |y|$: $x$ is a proper prefix of $y$
- $x \sqsupset y \land |x| < |y|$: $x$ is a proper suffix of $y$
  - $x \in \Sigma^* \Rightarrow \epsilon \sqsubset x$
  - $x \in \Sigma^* \Rightarrow \epsilon \sqsupset x$
  - $\forall a \in \Sigma: x \sqsupset y \Leftrightarrow xa \sqsupset ya$
  - $\forall z \in \Sigma^*: x \sqsupset y \land y \sqsupset z \Rightarrow x \sqsupset z$
  - $\forall z \in \Sigma^*: x \sqsubset y \land y \sqsubset z \Rightarrow x \sqsubset z$
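
The prefix and suffix relations above can be checked on small examples. The following is a rough Python sketch that maps $x \sqsubset y$ and $x \sqsupset y$ onto the built-in `str.startswith` and `str.endswith` methods. The helper names `is_prefix`, `is_suffix` and `prefix` are hypothetical and used only for illustration, and the asserts spot-check the listed properties on sample strings rather than prove them.

```python
def is_prefix(x: str, y: str) -> bool:
    """x is a prefix of y: y = xz for some string z."""
    return y.startswith(x)


def is_suffix(x: str, y: str) -> bool:
    """x is a suffix of y: y = zx for some string z."""
    return y.endswith(x)


def prefix(t: str, q: int) -> str:
    """The q-character prefix T_q of t (0-based, characters 0..q-1)."""
    return t[:q]


# The empty string is both a prefix and a suffix of every string.
assert is_prefix("", "abcca") and is_suffix("", "abcca")

# A prefix or suffix is never longer than the string it belongs to.
assert is_prefix("ab", "abcca") and len("ab") <= len("abcca")
assert is_suffix("cca", "abcca") and len("cca") <= len("abcca")

# Appending the same character to both strings preserves the suffix relation.
assert is_suffix("ab", "cab") == is_suffix("ab" + "z", "cab" + "z")
assert is_suffix("b", "ca") == is_suffix("b" + "z", "ca" + "z")

# Both relations are transitive.
assert is_suffix("a", "ca") and is_suffix("ca", "cca") and is_suffix("a", "cca")
assert is_prefix("a", "ab") and is_prefix("ab", "abc") and is_prefix("a", "abc")

# T_q is always a prefix of T.
assert prefix("abcca", 2) == "ab" and is_prefix(prefix("abcca", 2), "abcca")
```
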