Regular expressions matching at compile time with TypeScript. Try it out in the TS Playground!
The syntax of regular expressions is defined as follows:
∅is a regular expression denoting the empty languageεis a regular expression denoting the language with just the empty string- Given a symbol
rof the alphabet, the atom regular expression is defined asr - Given two regular expressions
FandG, the union is defined asF+G - Given two regular expressions
FandG, the concatenation is defined asFG - Given a regular expression
F, the star is defined asF*
In the Parse.ts file there is an implementation of a parser that can derive the AST following the above syntax. It takes a raw string passed as a literal type and concretely produces a composition of the following types:
type Sigma = "a" | "b" | "c" | "d" | "e" | "f" | "g";
type Zero = "∅";
type One = "ε";
type RZero = { type: "zero" };
type ROne = { type: "one" };
type RAtom<C> = { type: "char"; char: C };
type RSum<F, G> = { type: "sum"; left: F; right: G };
type RSeq<F, G> = { type: "seq"; left: F; right: G };
type RStar<F> = { type: "star"; r: F };For instance, given a regular expression like (a)(b), the parser derives the following AST:
type r = Parse<"(a)(b)">
type parsed = RSeq<RAtom<"a">, RAtom<"b">>The matching is available in Match.ts and uses derivatives on regular expression to match the strings.
For example, given the regular expression (a)*, TypeScript infers that the regular expression matches the strings , a, aa, aaa, aaaa, but doesn't match the strings b, ba, c:
type t5 = Matches<"(a)*", ""> // true
type t6 = Matches<"(a)*", "a"> // true
type t7 = Matches<"(a)*", "aa"> // true
type t8 = Matches<"(a)*", "aaa"> // true
type t9 = Matches<"(a)*", "aaaa"> // true
type f5 = Matches<"(a)*", "b"> // false
type f6 = Matches<"(a)*", "ba"> // false
type f7 = Matches<"(a)*", "c"> // falseAnother example for (ab)* + c (the syntax is a bit heavy due to limitations of the current parser implementation):
type t1 = Matches<"((((a)(b))*)+(c))", "c"> // true
type t2 = Matches<"((((a)(b))*)+(c))", "ab"> // true
type t3 = Matches<"((((a)(b))*)+(c))", "abababab"> // true
type t4 = Matches<"((((a)(b))*)+(c))", "ababababab"> // true
type f1 = Matches<"((((a)(b))*)+(c))", "b"> // false
type f2 = Matches<"((((a)(b))*)+(c))", "ba"> // false
type f3 = Matches<"((((a)(b))*)+(c))", "ca"> // false
type f4 = Matches<"((((a)(b))*)+(c))", "ac"> // falseMIT