Regular expressions matching at compile time with TypeScript. Try it out in the TS Playground!
The syntax of regular expressions is defined as follows:
∅
is a regular expression denoting the empty languageε
is a regular expression denoting the language with just the empty string- Given a symbol
r
of the alphabet, the atom regular expression is defined asr
- Given two regular expressions
F
andG
, the union is defined asF+G
- Given two regular expressions
F
andG
, the concatenation is defined asFG
- Given a regular expression
F
, the star is defined asF*
In the Parse.ts file there is an implementation of a parser that can derive the AST following the above syntax. It takes a raw string passed as a literal type and concretely produces a composition of the following types:
type Sigma = "a" | "b" | "c" | "d" | "e" | "f" | "g";
type Zero = "∅";
type One = "ε";
type RZero = { type: "zero" };
type ROne = { type: "one" };
type RAtom<C> = { type: "char"; char: C };
type RSum<F, G> = { type: "sum"; left: F; right: G };
type RSeq<F, G> = { type: "seq"; left: F; right: G };
type RStar<F> = { type: "star"; r: F };
For instance, given a regular expression like (a)(b)
, the parser derives the following AST:
type r = Parse<"(a)(b)">
type parsed = RSeq<RAtom<"a">, RAtom<"b">>
The matching is available in Match.ts and uses derivatives on regular expression to match the strings.
For example, given the regular expression (a)*
, TypeScript infers that the regular expression matches the strings
, a
, aa
, aaa
, aaaa
, but doesn't match the strings b
, ba
, c
:
type t5 = Matches<"(a)*", ""> // true
type t6 = Matches<"(a)*", "a"> // true
type t7 = Matches<"(a)*", "aa"> // true
type t8 = Matches<"(a)*", "aaa"> // true
type t9 = Matches<"(a)*", "aaaa"> // true
type f5 = Matches<"(a)*", "b"> // false
type f6 = Matches<"(a)*", "ba"> // false
type f7 = Matches<"(a)*", "c"> // false
Another example for (ab)* + c
(the syntax is a bit heavy due to limitations of the current parser implementation):
type t1 = Matches<"((((a)(b))*)+(c))", "c"> // true
type t2 = Matches<"((((a)(b))*)+(c))", "ab"> // true
type t3 = Matches<"((((a)(b))*)+(c))", "abababab"> // true
type t4 = Matches<"((((a)(b))*)+(c))", "ababababab"> // true
type f1 = Matches<"((((a)(b))*)+(c))", "b"> // false
type f2 = Matches<"((((a)(b))*)+(c))", "ba"> // false
type f3 = Matches<"((((a)(b))*)+(c))", "ca"> // false
type f4 = Matches<"((((a)(b))*)+(c))", "ac"> // false
MIT