You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I think we should add a rule that reports cases for exponential backtracking and polynomial backtracking.
However, there are two problems standing in our way:
Performance
Detecting exponential backtracking is computationally expensive.
Static analysis methods might only take a few milliseconds per regex but this will quickly add up for a large codebase (e.g. Prism has about 2k regexes). Fuzzers will usually take about one second or more per regex and are completely impractical for our purposes.
Solutions
Since the detection itself is deterministic, a simple cache might largely solve the performance problem.
However, I am not aware of ESLint providing such a feature.
Detection
I am not aware of any existing detector implementation in JavaScript (time of writing: 2021-04-19).
safe-regex isn't practical. It implements a detection method that simply doesn't work. Star height has little to do with exponential runtime. I.e. (a|a)*b has a star height of 1 and the (a|a)* will cause exponential backtracking, and (ab+)*c has a star height of 2 without any form of exponential backtracking.
Solutions
Right now, we could use my library scslre to detect at least some polynomial and exponential backtracking quickly. scslre even offers fixes for simple cases.
There is also this method I implemented for Prism. This method is currently used in PrismJS and Highlight.JS. The upside of this method is that it is quick. It can analyze the >2.5k regexes of Prism in less than 5 seconds. The downside of this method is that there are false negatives and it cannot offer fixes. However, the positives it does find are mostly true positives.
I should note that half of this method is already partially implemented by the no-dupe-disjunctions rule. If we find two alternatives that are not disjoint and both are children of a star quantifier, then we can reasonably assume that this will cause exponential backtracking (e.g. (a|a)*). Changing the no-dupe-disjunctions to add this information should be fairly easy.
Given that refa recently added ENFAs (not published yet), someone could probably implement a detection method like RXXR fairly easily.
I am currently writing my Bachelor thesis on a new static analysis method for exponential and polynomial backtracking. Part of this will be a JS library implementing my new detection method. We could also use this when it's ready (probably in 3-4 months or so).
I think we should add a rule that reports cases for exponential backtracking and polynomial backtracking.
However, there are two problems standing in our way:
Performance
Detecting exponential backtracking is computationally expensive.
Static analysis methods might only take a few milliseconds per regex but this will quickly add up for a large codebase (e.g. Prism has about 2k regexes). Fuzzers will usually take about one second or more per regex and are completely impractical for our purposes.
Solutions
Since the detection itself is deterministic, a simple cache might largely solve the performance problem.
However, I am not aware of ESLint providing such a feature.
Detection
I am not aware of any existing detector implementation in JavaScript (time of writing: 2021-04-19).
safe-regex
isn't practical. It implements a detection method that simply doesn't work. Star height has little to do with exponential runtime. I.e.(a|a)*b
has a star height of 1 and the(a|a)*
will cause exponential backtracking, and(ab+)*c
has a star height of 2 without any form of exponential backtracking.Solutions
Right now, we could use my library scslre to detect at least some polynomial and exponential backtracking quickly. scslre even offers fixes for simple cases.
There is also this method I implemented for Prism. This method is currently used in PrismJS and Highlight.JS. The upside of this method is that it is quick. It can analyze the >2.5k regexes of Prism in less than 5 seconds. The downside of this method is that there are false negatives and it cannot offer fixes. However, the positives it does find are mostly true positives.
I should note that half of this method is already partially implemented by the
no-dupe-disjunctions
rule. If we find two alternatives that are not disjoint and both are children of a star quantifier, then we can reasonably assume that this will cause exponential backtracking (e.g.(a|a)*
). Changing theno-dupe-disjunctions
to add this information should be fairly easy.Given that refa recently added ENFAs (not published yet), someone could probably implement a detection method like RXXR fairly easily.
I am currently writing my Bachelor thesis on a new static analysis method for exponential and polynomial backtracking. Part of this will be a JS library implementing my new detection method. We could also use this when it's ready (probably in 3-4 months or so).
Related:
no-unsafe-regex
rule is annoying sindresorhus/eslint-plugin-unicorn#153The text was updated successfully, but these errors were encountered: