Skip to content

Type-Level Regular Expressions matching with TypeScript

License

Notifications You must be signed in to change notification settings

desi-ivanov/ts-regexp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ts-regexp

Regular expressions matching at compile time with TypeScript. Try it out in the TS Playground!

Syntax

The syntax of regular expressions is defined as follows:

  • is a regular expression denoting the empty language
  • ε is a regular expression denoting the language with just the empty string
  • Given a symbol r of the alphabet, the atom regular expression is defined as r
  • Given two regular expressions F and G, the union is defined as F+G
  • Given two regular expressions F and G, the concatenation is defined as FG
  • Given a regular expression F, the star is defined as F*

Parsing

In the Parse.ts file there is an implementation of a parser that can derive the AST following the above syntax. It takes a raw string passed as a literal type and concretely produces a composition of the following types:

type Sigma      = "a" | "b" | "c" | "d" | "e" | "f" | "g";
type Zero       = "∅";
type One        = "ε";

type RZero      = { type: "zero" };
type ROne       = { type: "one" };
type RAtom<C>   = { type: "char"; char: C };
type RSum<F, G> = { type: "sum"; left: F; right: G };
type RSeq<F, G> = { type: "seq"; left: F; right: G };
type RStar<F>   = { type: "star"; r: F };

For instance, given a regular expression like (a)(b), the parser derives the following AST:

type r = Parse<"(a)(b)">
type parsed = RSeq<RAtom<"a">, RAtom<"b">>

Matching

The matching is available in Match.ts and uses derivatives on regular expression to match the strings.

For example, given the regular expression (a)*, TypeScript infers that the regular expression matches the strings , a, aa, aaa, aaaa, but doesn't match the strings b, ba, c:

type t5 = Matches<"(a)*", ""> // true
type t6 = Matches<"(a)*", "a"> // true
type t7 = Matches<"(a)*", "aa"> // true
type t8 = Matches<"(a)*", "aaa"> // true
type t9 = Matches<"(a)*", "aaaa"> // true

type f5 = Matches<"(a)*", "b"> // false
type f6 = Matches<"(a)*", "ba"> // false
type f7 = Matches<"(a)*", "c"> // false

Another example for (ab)* + c (the syntax is a bit heavy due to limitations of the current parser implementation):

type t1 = Matches<"((((a)(b))*)+(c))", "c"> // true
type t2 = Matches<"((((a)(b))*)+(c))", "ab"> // true
type t3 = Matches<"((((a)(b))*)+(c))", "abababab"> // true
type t4 = Matches<"((((a)(b))*)+(c))", "ababababab"> // true

type f1 = Matches<"((((a)(b))*)+(c))", "b"> // false
type f2 = Matches<"((((a)(b))*)+(c))", "ba"> // false
type f3 = Matches<"((((a)(b))*)+(c))", "ca"> // false
type f4 = Matches<"((((a)(b))*)+(c))", "ac"> // false

License

MIT

About

Type-Level Regular Expressions matching with TypeScript

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published