Tiered compilation #6

Open
no-defun-allowed opened this issue Sep 25, 2021 · 0 comments
Labels
enhancement New feature or request question Further information is requested

Comments

no-defun-allowed (Collaborator) commented Sep 25, 2021

The compiler is a bit slow. It's faster than compilers for many programming languages, but it's weird for a funny-looking regular expression to take 150ms or so to compile. I've done my part in optimizing DFA generation, so most of the compile time is spent in the Common Lisp compiler. On the other hand, the compiler also generates really good code, and I wouldn't be surprised if most projects could go without full compilation. While we could move DFA compilation to compile time when the regular expression is known at compile time, we do provide a function which conceptually accepts a string at runtime, so it would be nice to do something useful when we have to compile at runtime.

So here is an idea: we do something more like JIT compilation and have a tiered compiler. The current compiler would be used for "hot" regexes, which are used to scan lots of vectors, and a chain of closures would be used for cold code. Some heuristic would decide when to switch to hot code, for example once we've matched more than some length of input with a cached chain of closures. The optimizing compiler could also run concurrently in another thread.
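A rough sketch of the tiering heuristic, written in Python only to keep it short; it is not the project's code. Everything here is hypothetical: the `TieredMatcher` name, the `make_cold` / `make_hot` factories, and the character-count threshold are assumptions made for illustration. The cold matcher is used until more than `threshold` characters have been scanned, then the hot matcher is built in a background thread and swapped in once it is ready.

```python
import threading


class TieredMatcher:
    """Use a cheap 'cold' matcher first; promote to a 'hot' one after
    enough input has been scanned. All names are illustrative."""

    def __init__(self, make_cold, make_hot, threshold=10_000):
        self._matcher = make_cold()   # cheap to build, slower to run
        self._make_hot = make_hot     # expensive to build, faster to run
        self._threshold = threshold   # assumed heuristic: characters scanned
        self._scanned = 0
        self._lock = threading.Lock()
        self._promotion = None        # background compile thread, if started
        self.tier = "cold"

    def _promote(self):
        hot = self._make_hot()        # the slow part runs outside the lock
        with self._lock:
            self._matcher = hot
            self.tier = "hot"

    def __call__(self, text):
        with self._lock:
            matcher = self._matcher
            self._scanned += len(text)
            if self._promotion is None and self._scanned > self._threshold:
                # Start the optimizing compile once; callers keep using
                # the cold matcher until the swap happens.
                self._promotion = threading.Thread(
                    target=self._promote, daemon=True)
                self._promotion.start()
        return matcher(text)

    def wait_for_promotion(self):
        """Block until a started promotion has finished (useful in tests)."""
        if self._promotion is not None:
            self._promotion.join()
```

Counting scanned characters rather than calls means a regex used once on a huge vector is promoted as readily as one used many times on small ones; whether that is the right heuristic is an open question.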

The chain of closures would keep some of the benefits of DFA generation: we know how many registers are needed (though without a compiler with copy propagation we overestimate substantially; oh well), and scanning still has linear complexity. We could also still use type splitting for chains of closures. All in all, I wouldn't be too surprised if the cold compiler were still faster than cl-ppcre and other bytecode/closure-based implementations.
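To make "chain of closures" concrete, here is a minimal sketch, again in Python and not taken from the project. It assumes the DFA is already available as a plain transition table; the `compile_to_closures` name and the table layout are hypothetical. It only decides whole-string acceptance, so registers (submatches) and type splitting are left out. Each state becomes one closure, and scanning calls exactly one closure per input character, which is where the linear bound comes from.

```python
def compile_to_closures(transitions, start, accepting):
    """Turn a DFA table {(state, char): next_state} into one closure per
    state and return a function that tests whole-string acceptance."""
    states = ({start} | set(accepting)
              | {src for src, _ in transitions}
              | set(transitions.values()))
    closures = {}

    def make_state(state):
        # Resolve this state's outgoing edges once, at "compile" time.
        edges = {ch: dst for (src, ch), dst in transitions.items()
                 if src == state}

        def step(ch):
            dst = edges.get(ch)
            return None if dst is None else closures[dst]

        step.accepting = state in accepting
        return step

    for state in states:
        closures[state] = make_state(state)

    def matches(text):
        current = closures[start]
        for ch in text:              # one closure call per character
            current = current(ch)
            if current is None:      # no transition: reject early
                return False
        return current.accepting

    return matches


# Hand-written DFA for the regular expression ab*c (illustrative only).
match_ab_star_c = compile_to_closures(
    {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}, start=0, accepting={2})
```

Building the closures is just a few dictionary constructions, so it is far cheaper than generating and compiling code, at the cost of an indirect call per character while scanning.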

@no-defun-allowed no-defun-allowed added enhancement New feature or request question Further information is requested labels Sep 25, 2021