You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The compiler is a bit slow. It's faster than compilers for many programming languages, but it's weird for a funny-looking regular expression to take 150ms or so to compile. I've done my part in optimizing DFA generation, so most compile-time is taken by the Common Lisp compiler. On the other hand, the compiler also generates really good code - I wouldn't be surprised if most projects could go without full compilation. While we could sink DFA compilation into compile time when the regular expression is known at compile time, we do provide a function which conceptually accepts a string at runtime, and so it would be nice to do something useful if we need to compile at runtime.
So here is an idea: we do something more like JIT compilation and have a tiered compiler. The current compiler is used for "hot" regexes which are used to scan lots of vectors, and we use a chain of closures for cold code. Some heuristic would be used to make the switch to hot code, such as if we've matched more than some length with a cached chain of closures, and the optimizing compiler could run concurrently in another thread, too.
The chain of closures would take some of the benefits of DFA generation: we know how many registers are needed (though, without a compiler with copy propagation we overestimate substantially; oh well) and we still have linear complexity while scanning. We also could still use type splitting for chains of closures. All in all I wouldn't be too surprised if the cold compiler was still faster than cl-ppcre and other bytecode/closure-based implementations.
The text was updated successfully, but these errors were encountered:
The compiler is a bit slow. It's faster than compilers for many programming languages, but it's weird for a funny-looking regular expression to take 150ms or so to compile. I've done my part in optimizing DFA generation, so most compile-time is taken by the Common Lisp compiler. On the other hand, the compiler also generates really good code - I wouldn't be surprised if most projects could go without full compilation. While we could sink DFA compilation into compile time when the regular expression is known at compile time, we do provide a function which conceptually accepts a string at runtime, and so it would be nice to do something useful if we need to compile at runtime.
So here is an idea: we do something more like JIT compilation and have a tiered compiler. The current compiler is used for "hot" regexes which are used to scan lots of vectors, and we use a chain of closures for cold code. Some heuristic would be used to make the switch to hot code, such as if we've matched more than some length with a cached chain of closures, and the optimizing compiler could run concurrently in another thread, too.
The chain of closures would take some of the benefits of DFA generation: we know how many registers are needed (though, without a compiler with copy propagation we overestimate substantially; oh well) and we still have linear complexity while scanning. We also could still use type splitting for chains of closures. All in all I wouldn't be too surprised if the cold compiler was still faster than cl-ppcre and other bytecode/closure-based implementations.
The text was updated successfully, but these errors were encountered: