-
Notifications
You must be signed in to change notification settings - Fork 330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PT-2377 - fixed pt-table-sync for JSON utf8 strings #861
base: 3.x
Are you sure you want to change the base?
Conversation
67e432c
to
e5f0690
Compare
I updated the PR such that the change is also included in |
4acec07
to
fc9987d
Compare
fc9987d
to
263404d
Compare
I adjusted the added code to also handle The unit test for |
263404d
to
ec7705d
Compare
Thanks so much for the massive effort to adjust the code base to MySQL 8.4! I've rebased the changes on the latest commit in 3.x. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test t/pt-table-sync/pt-2377.t does not pass with 5.7. Please add skip call into this test that will skip test if run on 5.7 or earlier version.
There is also a note at https://perldoc.perl.org/utf8:
$flag = utf8::is_utf8($string)
(Since Perl 5.8.1) Test whether $string is marked internally as encoded in UTF-8. Functionally the same as Encode::is_utf8($string).
Typically only necessary for debugging and testing, if you need to dump the internals of an SV, Devel::Peek's Dump() provides more detail in a compact form.
If you still think you need this outside of debugging, testing or dealing with filenames, you should probably read perlunitut and "What is "the UTF8 flag"?" in perlunifaq.
Don't use this flag as a marker to distinguish character and binary data: that should be decided for each variable when you write your code.
To force unicode semantics in code portable to perl 5.8 and 5.10, call utf8::upgrade($string) unconditionally.
Is there any other way to perform the same check?
The MySQL driver DBD::mysql does not decode JSON values as utf8 although MySQL uses utf8mb4 for all JSON strings. This change decodes JSON values as utf8 (when not already done) such that SQL statements are generated correctly.
ec7705d
to
16b06dc
Compare
I've removed the check, added a skip for versions below 8.0 in If I understand correctly, the check There certainly is a bug in |
The PR is intended to resolve this issue: https://perconadev.atlassian.net/browse/PT-2377
The tests added with
pt-2377.t
both fail with the current code base as they do not generate the expected DML statements with non-ASCII characters correctly. With the proposed change, they run successfully. They test that bothREPLACE
andUPDATE
statements are generated correctly bypt-table-sync
when applied to tables with JSON columns that contain non-ASCII characters.The code is my own creation and it can be distributed under the GPL2 licence.