The library builds a regex from the normalized first lines of license files. It later uses that regex to split files.
The problem is that the first line of the App-s2p.txt license normalizes to an empty string. This causes the regex to match at the beginning and end of every line, which is easy to confirm in a regex tester. You can spot the bug in the regex by searching for ||, which is where that license's first line would go.
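To illustrate the effect, here is a minimal sketch in Python rather than in the library's own code. It assumes the pattern is a plain alternation of first lines anchored at line starts; the real pattern may be built differently. The helper names (`build_first_line_pattern`, `drop_empty`), the sample first lines, and the sample text are all made up for the example, and dropping empty entries is only one possible mitigation, not necessarily what the patch does.

```python
import re

def build_first_line_pattern(first_lines):
    """Join normalized first lines into one alternation anchored at line starts.

    Hypothetical helper for illustration; the library's real pattern may differ.
    """
    alternation = "|".join(re.escape(line) for line in first_lines)
    return re.compile(r"(?m)^(?:" + alternation + r")")

def drop_empty(first_lines):
    """Assumed mitigation: skip first lines that normalize to an empty string."""
    return [line for line in first_lines if line]

# Made-up normalized first lines; the empty string stands in for App-s2p.txt.
first_lines = ["mit license", "", "apache license"]
text = "some text\nmit license\nmore text\nanother line"

buggy = build_first_line_pattern(first_lines)
fixed = build_first_line_pattern(drop_empty(first_lines))

# Every match is a candidate split point for later processing.
buggy_spans = [m.span() for m in buggy.finditer(text)]
fixed_spans = [m.span() for m in fixed.finditer(text)]

print(buggy.pattern)                       # contains "||" from the empty entry
print(len(buggy_spans), len(fixed_spans))  # one match per line vs. one real match
```

With the empty alternative present, the pattern matches once per line of input, so the number of split points grows with file size instead of with the number of license headers actually present.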
This causes a huge performance degradation in repositories with large files that match the license filename pattern. One example of such a repository is https://gitlab.com/tikiwiki/tiki, which contains a large file called copyright.txt. Detecting a license for the repository took 22s. With the below patch, detection takes 260ms:
We are also seeing slowness with large license files, and the degradation is much worse than what is described above: detection went from seconds to close to an hour. Downgrading from 4.3.1 to 4.3.0 resolves it.
I cannot share the license file, so I will try to debug this and update the issue once I find the bottleneck.
What would be the appropriate fix here?