-
Notifications
You must be signed in to change notification settings - Fork 561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix leak, broken collation of UTF-8 string in non-UTF8 locale #22963
Open
khwilliamson
wants to merge
10
commits into
Perl:blead
Choose a base branch
from
khwilliamson:khwilliamson-locale_utf8_to_bytes
base: blead
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Fix leak, broken collation of UTF-8 string in non-UTF8 locale #22963
khwilliamson
wants to merge
10
commits into
Perl:blead
from
khwilliamson:khwilliamson-locale_utf8_to_bytes
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
We no longer need to have declarations first thing in a block. This allows removing a block and outdenting. And it outdents further in preparation for a later commit Also adds/clarifies some comments
The minimal space potentially freed up by this isn't worth the extra libc call needed to accomplish it.
This function has a successful return branch and an unsuccessful return branch, and a macro common to both, named CLEANUP_STRXFRM. There is another code clause repeated in both branches. This reorders some of the statements in each branch so that that clause is positioned to just after the calls to CLEANUP_STRXFRM. The next commit will add it to the macro, so as to remove the redundant code clause.
This refactors a few clauses so that the case of having a UTF-8 string in a UTF-8 locale can potentially be handled as a distinct situation. This prepares for a potential future change to deal with above-Unicode code points, which strxfrm() vary well may not deal with properly.
This uses the new function forms for doing this translation that can avoid unnecessary mallocs
This commit creates a new variable, sans_highs, to store newly allocated memory that only happens sometimes. Just before returning, that is freed. This is the final step in solving the leak spotted by Tony Cook in https://github.com/Perl/perl5/pulls?q=sort:updated-desc%20is:pr%20is:closed#discussion_r1868633416 The problem was that there were potentially 0 to 3 mallocs in this function, and it was only freeing up to two of them. The solution is to have a separate variable for each malloc, and to free them all before returning. If the corresponding malloc did not happen, the variable will be NULL, and no free will occur. This makes a loop rather more complicated. The next commit will simplify it.
This loop has gotten rather complicated. This refactors it to simplify it. It uses the typical paradigm used elsewhere in the code of having a while loop with a source pointer, and a dest pointer, and copying and incrementing them independently as needed as we go along.
We were creating a string but failed to set its length; hence the one from the old string was used.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This supercedes #22811
It started out just trying to convert to using the new forms for converting between bytes and UTF-8, but more problems were found.