perf: faster utf8<->utf16 conversion on Windows #4549
OIIO 2.3.13 with PR AcademySoftwareFoundation#3307 changed MultiByteToWideChar/WideCharToMultiByte usage to C++11 <codecvt> functionality, but that has two issues:

1) It is *way* slower, primarily due to locale object access (on the Visual C++ STL implementation in VS2022 at least). Since the primary use case of these conversions is on Windows, maybe it is better to use a fast code path.
2) The whole of the <codecvt> machinery is deprecated as of C++17 across the board, and will be removed in C++26.

I've kept the existing functions in there since otherwise it would have been an API break, but really maybe with OIIO 3.0 they should have been un-exposed. Too late now though :(

Performance numbers: doing ImageInput::create() on 1138 files that are not images at all (so OIIO in turn tries all the input plugins on them). Ryzen 5950X, VS2022, Windows:

- utf8_to_utf16 3851ms -> 21ms
- utf16_to_utf8 1055ms -> 4ms

Signed-off-by: Aras Pranckevicius
The one CI failure fails at cmake setup time with …
I have seen that from time to time recently... but not always, since we obviously do have passing tests most of the time. I'm not quite sure what's going on there, but it's obviously unrelated to your patch. I will poke it to make it run the CI again for that job, but even if that doesn't pass, I won't hold this up. OIIO itself doesn't use pystring, that's being pulled in by the OpenColorIO build.
LGTM, and thanks for the fix! It's shocking that there isn't a stable, portable, performant utf8<->utf16 conversion that's part of the C++ standard that we can rely on and that doesn't completely change every couple of standard revisions.
I have, frankly, wondered if we should just get rid of all the wchar stuff entirely from OIIO, and put the responsibility on Windows programmers to convert to utf8 before calling OIIO API calls. We added the wchar ones to try to make it convenient for Windows users, but sheesh, C++ sure isn't making it easy on us.
Yeah, just rerunning the failed test with no other modifications... and it succeeded this time. Something odd is going on with OpenColorIO or pystring itself, maybe? But I'm not sure why it's nondeterministic.
Description
OIIO 2.3.13 with PR #3307 changed MultiByteToWideChar/WideCharToMultiByte usage to C++11 <codecvt> functionality, but that has two issues:
1) It is way slower, primarily due to locale object access (on the Visual C++ STL implementation in VS2022 at least). Since the primary use case of these conversions is on Windows, maybe it is better to use a fast code path.
2) The whole of the <codecvt> machinery is deprecated as of C++17 across the board, and will be removed in C++26. I've kept the existing functions in there since otherwise it would have been an API break, but really maybe with OIIO 3.0 they should have been un-exposed. Too late now though :(
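To illustrate the kind of fast path described above, here is a minimal sketch, not the code from this PR. It assumes a hypothetical helper named `sketch_utf8_to_utf16`; on Windows it calls `MultiByteToWideChar` with `CP_UTF8`, and elsewhere it falls back to a simple hand-written decoder that exists only so the example runs on other platforms. The fallback replaces malformed input with U+FFFD and does not reject overlong encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

#ifdef _WIN32
#  define WIN32_LEAN_AND_MEAN
#  include <windows.h>
#endif

// Hypothetical helper (name is an assumption, not OIIO's API).
// Returns UTF-16 code units stored in a std::wstring.
std::wstring sketch_utf8_to_utf16(std::string_view in)
{
    if (in.empty())
        return {};
#ifdef _WIN32
    // First call asks for the required number of UTF-16 code units.
    const int len = static_cast<int>(in.size());
    const int n = MultiByteToWideChar(CP_UTF8, 0, in.data(), len, nullptr, 0);
    if (n <= 0)
        return {};
    std::wstring out(static_cast<std::size_t>(n), L'\0');
    // Second call performs the conversion into the preallocated buffer.
    MultiByteToWideChar(CP_UTF8, 0, in.data(), len, &out[0], n);
    return out;
#else
    // Illustrative fallback only: a basic UTF-8 decoder without overlong checks.
    std::wstring out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char c = static_cast<unsigned char>(in[i++]);
        std::uint32_t cp = 0xFFFD;  // default for invalid lead bytes
        int extra = 0;              // number of continuation bytes expected
        if (c < 0x80)                { cp = c; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        bool ok = true;
        for (int k = 0; k < extra; ++k) {
            if (i >= in.size()
                || (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;  // truncated or malformed sequence
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            cp = 0xFFFD;
        if (cp >= 0x10000) {
            // Encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
#endif
}
```

A caller would pass a UTF-8 path and hand the result to a wide-character Windows API. The two-call pattern (query the size, then convert) avoids any locale object, which is the cost the description attributes to the <codecvt> route.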
Tests
Performance numbers: doing ImageInput::create() on 1138 files that are not images at all (so OIIO in turn tries all the input plugins on them). Ryzen 5950X, VS2022, Windows:
- utf8_to_utf16 3851ms -> 21ms
- utf16_to_utf8 1055ms -> 4ms
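For anyone wanting to take a comparable measurement, a generic wall-clock harness might look like the sketch below. This is an assumption about how one could time a conversion routine in isolation, not the method used for the numbers above (those came from running ImageInput::create() over a directory of files). The helper name `sketch_time_ms` is hypothetical.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: calls `fn` `iterations` times and returns the
// total elapsed wall-clock time in milliseconds.
template <typename Fn>
double sketch_time_ms(Fn&& fn, std::size_t iterations)
{
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
        fn();
    const std::chrono::duration<double, std::milli> elapsed
        = clock::now() - start;
    return elapsed.count();
}
```

Passing a lambda that converts a representative file path, with a large iteration count, gives a per-implementation total that can be compared before and after a change. For reference, the figures reported above work out to speedups of roughly 180x for utf8_to_utf16 and roughly 260x for utf16_to_utf8.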