<locale>: std::collate<_Elem>::do_transform() should behave appropriately when _LStrxfrm() fails #5210

Open
muellerj2 opened this issue Dec 27, 2024 · 0 comments
Labels
bug Something isn't working

This is mainly an annoyance I noticed while preparing #5209, but there are two related problems here:

std::collate<char>

When _Strxfrm() fails, it returns -1 (SIZE_MAX) as an error code. std::collate<char>::do_transform() passes this return value to basic_string<char>::resize(), which throws a length_error("string too long").

This is misleading, since the problem isn't that the result can't be represented in memory, but that there is no result at all because no sort key could be generated by _Strxfrm(). We should check whether the return value of _Strxfrm() indicates an error and then handle this appropriately (either by throwing an appropriate exception or returning a substitute key).
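To illustrate, here is a minimal sketch of the unchecked and checked variants. It does not use the STL's internal functions: `failing_xfrm`, `transform_unchecked`, `transform_checked` and `unchecked_throws_length_error` are hypothetical names, and throwing `std::runtime_error` is just one assumed way of "handling this appropriately", not a decision about what the fix should be.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a transform routine that always fails and
// reports the failure as static_cast<size_t>(-1), i.e. SIZE_MAX.
std::size_t failing_xfrm(char*, char*, const char*, const char*) {
    return static_cast<std::size_t>(-1);
}

// Unchecked variant: the error code is forwarded to resize() as if it were a length.
std::string transform_unchecked(const char* first, const char* last) {
    std::string key;
    key.resize(static_cast<std::size_t>(last - first));
    const std::size_t n = failing_xfrm(&key[0], &key[0] + key.size(), first, last);
    key.resize(n); // n exceeds max_size(), so this throws std::length_error
    return key;
}

// Checked variant: the error code is recognized before it reaches resize().
// The choice of std::runtime_error is an assumption made for this sketch.
std::string transform_checked(const char* first, const char* last) {
    std::string key;
    key.resize(static_cast<std::size_t>(last - first));
    const std::size_t n = failing_xfrm(&key[0], &key[0] + key.size(), first, last);
    if (n == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("no sort key could be generated");
    }
    key.resize(n);
    return key;
}

// Helper for the usage note: reports whether the unchecked variant ends in length_error.
bool unchecked_throws_length_error(const char* first, const char* last) {
    try {
        transform_unchecked(first, last);
    } catch (const std::length_error&) {
        return true;
    }
    return false;
}
```

With `const char text[] = "abc";`, `unchecked_throws_length_error(text, text + 3)` returns true, while `transform_checked(text, text + 3)` throws the exception that names the real problem.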

std::collate<wchar_t>

When _Wcsxfrm() fails (due to a failure in LCMapStringW), it returns INT_MAX as an error code. std::collate<wchar_t>::do_transform() passes this return value to basic_string<wchar_t>::resize(), which is likely to succeed on x64. If so, it calls _Wcsxfrm() again, which returns INT_MAX again. Because the string size now equals INT_MAX, the call is considered successful and the contents of the string are returned. However, nothing guarantees what the string contains at that point, so the result could be garbage.
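The sequence can be reproduced with a simplified model of the grow-and-retry loop described above. This is a sketch, not the STL's actual code: `failing_wxfrm` and `transform_model` are hypothetical names, and the small constant `kErrorCode` stands in for INT_MAX so that the example does not allocate gigabytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption for this sketch: a small value plays the role of INT_MAX.
constexpr std::size_t kErrorCode = 64;

// Hypothetical transform that always fails: it writes nothing to the
// output buffer and returns the error code.
std::size_t failing_wxfrm(wchar_t*, wchar_t*, const wchar_t*, const wchar_t*) {
    return kErrorCode;
}

// Simplified model of a loop that grows the buffer until the reported
// size fits. It cannot tell an error code apart from a required length.
std::wstring transform_model(const wchar_t* first, const wchar_t* last) {
    std::wstring key;
    std::size_t count = static_cast<std::size_t>(last - first);
    while (count > 0) {
        key.resize(count);
        count = failing_wxfrm(&key[0], &key[0] + key.size(), first, last);
        if (count <= key.size()) {
            break; // reported size fits, so the call is treated as a success
        }
    }
    key.resize(count);
    return key;
}
```

For a three-character input, the first call reports `kErrorCode`, the buffer is grown to that size, and the second call then looks successful, so `transform_model` returns a "key" of `kErrorCode` characters that the transform never wrote. In this model the buffer happens to be zero-filled by `resize()`; the point is only that the failure goes unnoticed.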

Additional remarks

It seems that _Wcsxfrm() uses two error codes: SIZE_MAX when allocation fails and INT_MAX when LCMapStringW fails (e.g., because of encoding issues). I doubt that this is intentional; it should probably have always returned SIZE_MAX.

While _Strxfrm() always and _Wcsxfrm() sometimes return -1 (SIZE_MAX) on error, the comments above the functions actually claim that they return INT_MAX to designate failure:

// Non-standard: if OM/API error, return INT_MAX.
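As long as both values are in use, a caller that wants to detect failure has to check for both. A minimal sketch of such a check follows; `is_xfrm_failure` is a hypothetical helper, not something that exists in the STL. It assumes that a genuine key length of exactly INT_MAX never needs to be distinguished from an error.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: treats both error codes mentioned above as failure.
// Assumption: a real key length of exactly INT_MAX does not occur in practice.
bool is_xfrm_failure(std::size_t result) {
    return result == SIZE_MAX || result == static_cast<std::size_t>(INT_MAX);
}
```

If the functions were changed to always return SIZE_MAX, the second comparison would become unnecessary.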

@StephanTLavavej StephanTLavavej added the bug Something isn't working label Jan 8, 2025