src: use CP_UTF8 for wide file names on win32 #60575
base: main
13d23c9 to
440a6be
Compare
cc @joyeecheung
cc @nodejs/cpp-reviewers
`src/node_modules.cc` needs to be consistent with `src/node_file.cc` in how it translates UTF-8 strings to `std::wstring`. Otherwise we might end up in a situation where we can read the source code of an imported package from disk but fail to recognize that it is ESM (or CJS), causing runtime errors. This type of error is possible on Windows when the path contains Unicode characters and "Language for non-Unicode programs" is set to "Chinese (Traditional, Taiwan)". See: nodejs#58768
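To illustrate why the two call sites have to agree, here is a minimal, portable sketch. It does not use the Windows API: `DecodeAsUtf8`, `DecodeAsSingleByte` and `SameWidePath` are hypothetical names, and the single-byte decoder is only a stand-in for "whatever legacy code page `GetACP()` reports". The point it tries to show is that ASCII-only paths decode identically either way, while a path with non-ASCII characters yields two different wide strings, so two code paths that disagree on the conversion end up looking for two different files.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode UTF-8 bytes into code points. Illustrative only: assumes
// well-formed input and performs no validation.
std::u32string DecodeAsUtf8(const std::string& bytes) {
  std::u32string out;
  for (std::size_t i = 0; i < bytes.size();) {
    const unsigned char lead = static_cast<unsigned char>(bytes[i]);
    // Number of continuation bytes implied by the lead byte.
    const int extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
    char32_t cp = extra == 0 ? lead : (lead & (0x3F >> extra));
    for (int k = 1; k <= extra && i + k < bytes.size(); ++k) {
      cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
    }
    out.push_back(cp);
    i += static_cast<std::size_t>(extra) + 1;
  }
  return out;
}

// Stand-in for a legacy code page conversion (an assumption for this sketch):
// every byte becomes one code point, so the bytes are not read as UTF-8.
std::u32string DecodeAsSingleByte(const std::string& bytes) {
  std::u32string out;
  for (char c : bytes) out.push_back(static_cast<unsigned char>(c));
  return out;
}

// True when both conversions would produce the same wide path.
bool SameWidePath(const std::string& utf8_path) {
  return DecodeAsUtf8(utf8_path) == DecodeAsSingleByte(utf8_path);
}

// Usage: only the path containing non-ASCII characters diverges.
void MismatchExample() {
  const std::string ascii_path = "C:/pkg/package.json";
  const std::string cjk_path =
      std::string("C:/") + "\xE9\xA0\x85\xE7\x9B\xAE" + "/package.json";
  assert(SameWidePath(ascii_path));
  assert(!SameWidePath(cjk_path));
}
```

This would also explain why the problem only shows up for paths with Unicode characters: for plain ASCII the choice of code page makes no difference.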
440a6be to
ec1fc03
Compare
Can we have a test for this?
@RafaelGSS it is a good question, and I'd like to answer yes, but I'm afraid this is complicated to reproduce, as it requires changing "Language for non-Unicode programs" to a non-English locale so that
I feel like this should be the safe change, though, since it moves to match
Unrelated to the above, a backport to 22-x-y would be appreciated! (Although the patch can't be a plain cherry-pick, since there are two conversion call sites in 22-x-y.)
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #60575      +/-   ##
==========================================
+ Coverage   88.54%   88.56%   +0.01%
==========================================
  Files         704      704
  Lines      208072   208092     +20
  Branches    40076    40084      +8
==========================================
+ Hits       184241   184293     +52
+ Misses      15884    15831     -53
- Partials     7947     7968     +21
src/node_modules.cc
Outdated
    return nullptr;
    if (is_permissions_enabled) {
      std::string generic_path;
    #ifdef _WIN32
I feel that the number of ifdefs sprinkled everywhere is a bit cluttering. It looks like all new usages of these are on a std::filesystem::path, so technically they are just repeating what PathToString() does. I think we can just put PathToString in util.h and use that instead?
I agree. I think usage of generic_string vs string complicates it a bit, but let me see what I can do.
How about db29a1d? I tried to unify things as much as I could.
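For readers following the thread, the idea of hiding the `#ifdef` behind one shared helper could look roughly like the sketch below. This is not the code from db29a1d: `PathToUtf8String` is a hypothetical name used for illustration, and the real `PathToString()` in the tree may differ. The sketch assumes C++17 and, on Windows, converts the path's native wide string to UTF-8 with `WideCharToMultiByte`.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: one place that knows how to turn a path into a
// UTF-8 std::string, so callers need no platform ifdefs of their own.
std::string PathToUtf8String(const std::filesystem::path& path) {
#ifdef _WIN32
  // On Windows the native representation is a wide (UTF-16) string.
  const std::wstring& wide = path.native();
  if (wide.empty()) return {};
  const int wide_len = static_cast<int>(wide.size());
  const int len = WideCharToMultiByte(
      CP_UTF8, 0, wide.data(), wide_len, nullptr, 0, nullptr, nullptr);
  if (len <= 0) return {};
  std::string out(static_cast<std::size_t>(len), '\0');
  WideCharToMultiByte(
      CP_UTF8, 0, wide.data(), wide_len, out.data(), len, nullptr, nullptr);
  return out;
#else
  // On POSIX the native representation is already a narrow string.
  return path.native();
#endif
}

// Usage: callers look the same on every platform.
void PathToStringExample() {
  const std::filesystem::path p("package.json");
  assert(PathToUtf8String(p) == "package.json");
}
```

Call sites then contain a single function call, and the choice of code page lives in exactly one place.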
src/node_modules.cc
Outdated
  #ifdef _WIN32
- std::wstring wide_path = ConvertToWideString(path_value_str, GetACP());
+ std::wstring wide_path = ConvertToWideString(path_value_str, CP_UTF8);
Ditto here - I think we can just create StringToPath and let BufferValueToPath use it, and here it can just use StringToPath
Addressed by db29a1d. I also dropped the code page argument from the util method to make sure we don't use GetACP() by accident.
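A rough sketch of what a converter without a code page parameter might look like, for illustration only. `Utf8StringToPath` is a hypothetical name and not the helper added in db29a1d. The design point is that `CP_UTF8` is hard-coded inside the function, so no caller can pass `GetACP()` by mistake. It assumes C++17 and uses `MultiByteToWideChar` on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: build a path from a UTF-8 string. There is
// deliberately no code page argument; UTF-8 is the only accepted input.
std::filesystem::path Utf8StringToPath(const std::string& utf8) {
#ifdef _WIN32
  if (utf8.empty()) return {};
  const int utf8_len = static_cast<int>(utf8.size());
  const int len =
      MultiByteToWideChar(CP_UTF8, 0, utf8.data(), utf8_len, nullptr, 0);
  if (len <= 0) return {};
  std::wstring wide(static_cast<std::size_t>(len), L'\0');
  MultiByteToWideChar(CP_UTF8, 0, utf8.data(), utf8_len, wide.data(), len);
  return std::filesystem::path(wide);
#else
  // POSIX paths are byte strings, so the UTF-8 bytes are used as-is.
  return std::filesystem::path(utf8);
#endif
}

// Usage: the caller never mentions a code page.
void StringToPathExample() {
  const std::filesystem::path p = Utf8StringToPath("pkg/package.json");
  assert(p.filename() == std::filesystem::path("package.json"));
}
```

With a signature like this, any remaining `GetACP()` usage would have to be written out explicitly and would stand out in review.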
FWIW it doesn't look like GHA supports locale setting https://github.com/orgs/community/discussions/68929 and I doubt that we can afford one more Windows setup on Jenkins. I don't have a Windows machine with me right now, but I can confirm #60575 (comment) is generally how you reproduce these types of issues on Windows (it's relatively common to see programs misbehave on CJK locales, but crashing is unusual).
LGTM with a suggestion
Co-authored-by: Joyee Cheung <[email protected]>
Let me know if anything else is needed before the merge! (It sounds like all the LGTMs that this PR got are stale now because of addressing PR feedback, so I can't merge it.) cc @joyeecheung
Looks like CI is green, whew 🥲