Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime feedback for SIMD and repetition #10

Open
no-defun-allowed opened this issue Jan 22, 2022 · 1 comment
Open

Runtime feedback for SIMD and repetition #10

no-defun-allowed opened this issue Jan 22, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@no-defun-allowed
Copy link
Collaborator

Consider the regular expression "(¬")*", which will match some text enclosed in double quotes.

There are two ways to run the "tight loop" currently:

  • We may use the "scalar" DFA and repeatedly check each character, exiting the loop when we encounter a " character.
  • We may use SIMD and scan several characters at a time.

The latter is faster than the former for long texts, but not for shorter texts. (I suspect less than one SIMD vector wide? idk) So to achieve best performance, we should choose the more appropriate tight loop implementation. There is no indication of what lengths are expected in the regular expression, and we have no way to annotate such things, so we must rely on runtime feedback (which is probably for the better, in terms of performance/effort on behalf of the user).

According to Cliff Click (from a conversation in the coffee compiler club) it should suffice to maintain a count of how many times the tight loop is iterated. Upon entering the loop, we compare the last count to some crossover point. If we have a higher count, we take the SIMD route, else we use scalar code.

There is also a case for unrolling the scalar code according to Gilbert Baumann, but I haven't tried that yet. For very long loop counts, unrolling the SIMD code may also produce some benefit, but we may face more code bloat; I still intend to compile everything eagerly, because it is simpler, and the duplicate code is not too large anyway.

@no-defun-allowed no-defun-allowed added the enhancement New feature or request label Jan 25, 2022
@no-defun-allowed
Copy link
Collaborator Author

no-defun-allowed commented Jan 30, 2022

More complex tests, such as that for ab|ac, will end up using all the vector pipes of a processor and unrolling won't help there. We need pretty simple tests to benefit from unrolling SIMD code, e.g. just a single code point to test for, rather than a range.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant