Skip to content

locale.c: Use new utf8_to_bytes_temp_pv() #22811

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

khwilliamson
Copy link
Contributor

  • This set of changes does not require a perldelta entry.

Comment on lines -9842 to +9843
s = (char *) bytes_from_utf8((const U8 *) s, &len, &utf8);

/* If the downgrade was successful we are done, but if the input
* contains things that require UTF-8 to represent, have to do
* damage control ... */
if (UNLIKELY(utf8)) {

/* What we do is construct a non-UTF-8 string with
if (UNLIKELY(! utf8_to_bytes_temp_pv((const U8 **) &s, &len))) {
/* If the downgrade was successful we are done, but if the
* input contains things that require UTF-8 to represent, have
* to do damage control:
*
* What we do is construct a non-UTF-8 string with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what originally freed the possible new buffer returned by bytes_from_utf8() here, this isn't needed for the replacement, but I don't see what would have been freeing that in the original.

I'll spend some more time looking at it, but it might be faster if you can point out what I missed :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me a while:

$ PERL_DESTRUCT_LEVEL=2 gdb --args ./perl -Ilib -Mlocale -MPOSIX=setlocale,LC_ALL  -le 'setlocale(LC_ALL, "en_AU.ISO8859-1") or die; $x = "\x{100}", $y = "A\xBD"; utf8::upgrade($y); print $x cmp $y'
...
(gdb) r
Starting program: /home/tony/dev/perl/git/perl6/perl -Ilib -Mlocale -MPOSIX=setlocale,LC_ALL -le setlocale\(LC_ALL,\ \"en_AU.ISO8859-1\"\)\ or\ die\;\ \$x\ =\ \"\\x\{100\}\",\ \$y\ =\ \"A\\xBD\"\;\ utf8::upgrade\(\$y\)\;\ print\ \$x\ cmp\ \$y
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
1
free(): double free detected in tcache 2

Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, 
    no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x00007ffff7d2df1f in __pthread_kill_internal (signo=6, 
    threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00007ffff7cdefb2 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/posix/raise.c:26
#3  0x00007ffff7cc9472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x00007ffff7d22430 in __libc_message (action=action@entry=do_abort, 
    fmt=fmt@entry=0x7ffff7e3c459 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#5  0x00007ffff7d3783a in malloc_printerr (
    str=str@entry=0x7ffff7e3f098 "free(): double free detected in tcache 2")
    at ./malloc/malloc.c:5660
#6  0x00007ffff7d39ac6 in _int_free (av=0x7ffff7e75c60 <main_arena>, 
    p=0x555555b142f0, have_lock=have_lock@entry=0) at ./malloc/malloc.c:4469
#7  0x00007ffff7d3bf1f in __GI___libc_free (mem=mem@entry=0x555555b14300)
    at ./malloc/malloc.c:3385
#8  0x00005555557d3cfc in Perl_safesysfree (where=where@entry=0x555555b14300)
    at util.c:433
#9  0x0000555555762450 in Perl_leave_scope (base=0) at scope.c:1391
#10 0x000055555568ba85 in Perl_pp_leave () at pp_ctl.c:2423
#11 0x00005555555edc10 in Perl_runops_debug () at dump.c:2993
#12 0x00005555555d173d in S_run_body (oldscope=1) at perl.c:2883
#13 perl_run (my_perl=<optimized out>) at perl.c:2798
#14 0x000055555559c0ed in main (argc=<optimized out>, argv=<optimized out>, 
    env=<optimized out>) at perlmain.c:127
(gdb) q
$ PERL_DESTRUCT_LEVEL=2 valgrind -q ./perl -Ilib -Mlocale -MPOSIX=setlocale,LC_ALL  -le 'setlocale(LC_ALL, "en_AU.ISO8859-1") or die; $x = "\x{100}", $y = "A\xBD"; utf8::upgrade($y); print $x cmp $y'
1
==2081366== Invalid free() / delete / delete[] / realloc()
==2081366==    at 0x484317B: free (vg_replace_malloc.c:872)
==2081366==    by 0x387CFB: Perl_safesysfree (util.c:433)
==2081366==    by 0x31644F: Perl_leave_scope (scope.c:1391)
==2081366==    by 0x23FA84: Perl_pp_leave (pp_ctl.c:2423)
==2081366==    by 0x1A1C0F: Perl_runops_debug (dump.c:2993)
==2081366==    by 0x18573C: S_run_body (perl.c:2883)
==2081366==    by 0x18573C: perl_run (perl.c:2798)
==2081366==    by 0x1500EC: main (perlmain.c:127)
==2081366==  Address 0x12867490 is 0 bytes inside a block of size 3 free'd
==2081366==    at 0x484317B: free (vg_replace_malloc.c:872)
==2081366==    by 0x387CFB: Perl_safesysfree (util.c:433)
==2081366==    by 0x1D0233: Perl_mem_collxfrm_ (locale.c:10158)
==2081366==    by 0x33C4E9: Perl_sv_collxfrm_flags (sv.c:8646)
==2081366==    by 0x33C665: Perl_sv_cmp_locale_flags (sv.c:8561)
==2081366==    by 0x20F53A: Perl_pp_scmp (pp.c:2489)
==2081366==    by 0x1A1C0F: Perl_runops_debug (dump.c:2993)
==2081366==    by 0x18573C: S_run_body (perl.c:2883)
==2081366==    by 0x18573C: perl_run (perl.c:2798)
==2081366==    by 0x1500EC: main (perlmain.c:127)
==2081366==  Block was alloc'd at
==2081366==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==2081366==    by 0x3899B8: Perl_safesysmalloc (util.c:176)
==2081366==    by 0x38402D: Perl_utf8_to_bytes_ (utf8.c:2857)
==2081366==    by 0x1CF681: Perl_mem_collxfrm_ (locale.c:9838)
==2081366==    by 0x33C4E9: Perl_sv_collxfrm_flags (sv.c:8646)
==2081366==    by 0x33C665: Perl_sv_cmp_locale_flags (sv.c:8561)
==2081366==    by 0x20F53A: Perl_pp_scmp (pp.c:2489)
==2081366==    by 0x1A1C0F: Perl_runops_debug (dump.c:2993)
==2081366==    by 0x18573C: S_run_body (perl.c:2883)
==2081366==    by 0x18573C: perl_run (perl.c:2798)
==2081366==    by 0x1500EC: main (perlmain.c:127)
==2081366== 

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this isn't covered by tests, considering none failed.

@khwilliamson khwilliamson deleted the locale_utf8_to_bytes branch January 30, 2025 01:36
@khwilliamson khwilliamson restored the locale_utf8_to_bytes branch January 30, 2025 23:05
@khwilliamson khwilliamson reopened this Jan 31, 2025
@khwilliamson
Copy link
Contributor Author

Superceded by #22963

@khwilliamson khwilliamson deleted the locale_utf8_to_bytes branch January 31, 2025 01:47
khwilliamson added a commit to khwilliamson/perl5 that referenced this pull request Feb 8, 2025
This commit creates a new variable, sans_highs, to store newly allocated
memory that only happens sometimes.  Just before returning, that is
freed.

This is the final step in solving the leak spotted by Tony Cook in
Perl#22811 (comment)

The problem was that there were potentially 0 to 3 mallocs in this
function, and it was only freeing up to two of them.  The solution is to
have a separate variable for each malloc, and to free them all before
returning.  If the corresponding malloc did not happen, the variable
will be NULL, and no free will occur.

This makes a loop rather more complicated.  The next commit will
simplify it.
khwilliamson added a commit to khwilliamson/perl5 that referenced this pull request Feb 8, 2025
khwilliamson added a commit that referenced this pull request Feb 9, 2025
This commit creates a new variable, sans_highs, to store newly allocated
memory that only happens sometimes.  Just before returning, that is
freed.

This is the final step in solving the leak spotted by Tony Cook in
#22811 (comment)

The problem was that there were potentially 0 to 3 mallocs in this
function, and it was only freeing up to two of them.  The solution is to
have a separate variable for each malloc, and to free them all before
returning.  If the corresponding malloc did not happen, the variable
will be NULL, and no free will occur.

This makes a loop rather more complicated.  The next commit will
simplify it.
khwilliamson added a commit that referenced this pull request Feb 9, 2025
scottchiefbaker pushed a commit to scottchiefbaker/perl5 that referenced this pull request Feb 11, 2025
This commit creates a new variable, sans_highs, to store newly allocated
memory that only happens sometimes.  Just before returning, that is
freed.

This is the final step in solving the leak spotted by Tony Cook in
Perl#22811 (comment)

The problem was that there were potentially 0 to 3 mallocs in this
function, and it was only freeing up to two of them.  The solution is to
have a separate variable for each malloc, and to free them all before
returning.  If the corresponding malloc did not happen, the variable
will be NULL, and no free will occur.

This makes a loop rather more complicated.  The next commit will
simplify it.
scottchiefbaker pushed a commit to scottchiefbaker/perl5 that referenced this pull request Feb 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants