<regex>: regex_traits<_Elem> uses an inadmissible value of type char_class_type to represent character class "w" #5242

Open
muellerj2 opened this issue Jan 17, 2025 · 1 comment
Labels
bug (Something isn't working), regex (Everyone's favorite header), vNext (Breaks binary compatibility)

Comments

@muellerj2
Contributor

muellerj2 commented Jan 17, 2025

regex_traits<_Elem> uses static_cast<ctype_base::mask>(-1) to represent the character class "w":

_REGEX_CHAR_CLASS_NAME("w", static_cast<ctype_base::mask>(-1)),

This is an inadmissible choice, because it violates [re.grammar]/9:

The results from multiple calls to traits_inst.lookup_classname can be bitwise or'ed together and subsequently passed to traits_inst.isctype.

Specifically, because static_cast<ctype_base::mask>(-1) has every bit set, or'ing the char_class_type for "w" with the char_class_type for any other character class always produces the value for "w" again, even if the combination should match more characters.
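
To make the absorption effect concrete, here is a minimal, self-contained sketch. It does not use the actual STL internals: `mask_t`, the bit values, `classify`, `combine`, and `is_ctype` are all hypothetical stand-ins, and the special-casing of the all-bits value inside `is_ctype` is an assumption about how such a sentinel would typically be handled, not a quote of the real implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ctype_base::mask; the real type and bit values
// are implementation details that this sketch does not try to reproduce.
using mask_t = std::uint16_t;

constexpr mask_t kAlpha = 0x01;
constexpr mask_t kDigit = 0x02;
constexpr mask_t kSpace = 0x04;

// Mirrors the choice quoted above: the all-bits-set value stands for "w".
constexpr mask_t kWord = static_cast<mask_t>(-1);

// Toy ASCII-only classifier (assumption: no locale support).
inline mask_t classify(char c) {
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) return kAlpha;
    if (c >= '0' && c <= '9') return kDigit;
    if (c == ' ' || c == '\t' || c == '\n') return kSpace;
    return 0;
}

// Bitwise-or of two class values, as [re.grammar]/9 permits callers to do.
inline mask_t combine(mask_t a, mask_t b) {
    return static_cast<mask_t>(a | b);
}

// Assumed isctype-like check: the sentinel is special-cased as
// "alphanumeric or underscore"; every other value is a plain bit test.
inline bool is_ctype(char c, mask_t m) {
    if (m == kWord) {
        return c == '_' || (classify(c) & (kAlpha | kDigit)) != 0;
    }
    return (classify(c) & m) != 0;
}
```

In this sketch, `combine(kWord, kSpace)` equals `kWord`, so `is_ctype(' ', combine(kWord, kSpace))` is false even though a space should match the union of "w" and "s". Combining two ordinary classes, such as `combine(kDigit, kSpace)`, behaves as expected.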

Additional remarks

I think resolving this issue will break ABI. However, it should be possible to mitigate the problems caused by this issue.

vNext note: Resolving this issue will require breaking binary compatibility. We won't be able to accept pull requests for this issue until the vNext branch is available. See #169 for more information.

StephanTLavavej added the bug, vNext, and regex labels on Jan 22, 2025
@StephanTLavavej
Member

Yes, this will break ABI because the value is stored in the NFA nodes. I don't believe we can get away with changing this value in v14, but if you can figure out binary-compatible ways to mitigate the damage, we can consider that.
