Support running stubtest in non-UTF8 terminals #19085
From a local run, the crash seems solved, but the display is incorrect:
Well, you have my local results above.
Yeah, that's the spirit: stdout on Windows is not UTF-8, and those UTF-8
chars are not representable in the codepage it uses. So we can't display
the original characters and have to replace them. Regarding the question mark
vs. the replacement character: does the latter exist in CP1252? I'm asking
because Python helpfully uses it to replace invalid Unicode code points, but
doesn't do the same here, making me suspect there's simply nothing better than
a plain question mark in that codepage.
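
To make the point above concrete, here is a minimal sketch (not code from the patch) of what Python's `cp1252` codec does with the two glyphs from the networkx output. The variable names are only illustrative. Strict encoding fails, which is the kind of error behind the crash, while `errors="replace"` substitutes a plain question mark:

```python
# Box-drawing glyphs reported in the stubtest output (U+257D and U+2502).
glyphs = "╽│"

# Strict encoding to CP1252 fails: neither glyph exists in that codepage.
try:
    glyphs.encode("cp1252")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# With errors="replace", the encoder substitutes "?" for each glyph.
replaced = glyphs.encode("cp1252", errors="replace")

print(strict_failed)  # True
print(replaced)       # b'??'
```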
…On Tue, May 13, 2025, 01:49 Avasam ***@***.***> wrote:
*Avasam* left a comment (python/mypy#19085)
<#19085 (comment)>
The crash seems solved, but the display is incorrect:
error: networkx.classes.reportviews.InMultiEdgeView.dataview variable differs from runtime type abc.ABCMeta
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\classes\reportviews.pyi:334
def [_Node <: typing.Hashable, _D] (viewer: Any, nbunch: Union[_Node`1, typing.Iterable[_Node`1], None] =, data: builtins.bool =, *, default: Union[Any, None] =) -> networkx.classes.reportviews.InMultiEdgeDataView[_Node`1, _D`2]
Runtime: in file C:\Users\Avasam\AppData\Local\Temp\stubtest-ind4bx6n\Lib\site-packages\networkx\classes\reportviews.py:1019
def (viewer, nbunch=None, data=False, *, default=None, keys=False)
error: networkx.classes.reportviews.MultiEdgeView.dataview variable differs from runtime type abc.ABCMeta
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\classes\reportviews.pyi:272
def [_Node <: typing.Hashable, _D] (viewer: Any, nbunch: Union[_Node`1, typing.Iterable[_Node`1], None] =, data: builtins.bool =, *, default: Union[Any, None] =) -> networkx.classes.reportviews.MultiEdgeDataView[_Node`1, _D`2]
Runtime: in file C:\Users\Avasam\AppData\Local\Temp\stubtest-ind4bx6n\Lib\site-packages\networkx\classes\reportviews.py:983
def (viewer, nbunch=None, data=False, *, default=None, keys=False)
error: networkx.readwrite.text.AsciiDirectedGlyphs.vertical_edge is not present in stub
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\readwrite\text.pyi
MISSING
Runtime:
'!'
error: networkx.readwrite.text.AsciiUndirectedGlyphs.vertical_edge is not present in stub
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\readwrite\text.pyi
MISSING
Runtime:
'|'
error: networkx.readwrite.text.UtfDirectedGlyphs.vertical_edge is not present in stub
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\readwrite\text.pyi
MISSING
Runtime:
'?'
error: networkx.readwrite.text.UtfUndirectedGlyphs.vertical_edge is not present in stub
Stub: in file E:\Users\Avasam\Documents\Git\typeshed\stubs\networkx\networkx\readwrite\text.pyi
MISSING
Runtime:
'?'
Found 6 errors (checked 565 modules)
? (I assume that's just meant to be the replacement char �) should be ╽ and │
VSCode saves it in "Western (Windows 1252)" as byte 0x9D/157.
Thanks! So this is intended behavior: there's really no defined replacement
character, and that fancy boxed character simply doesn't exist outside of
UTF-8, so Python is correctly using the question mark. The UNUSED codepoint
might be used for that, but that's simply not what Python decided to do. Is
this solution good enough for your use case? Since we can't display UTF-8 in
a terminal that uses some other encoding, we'll have to replace everything
unrecognized, and I suppose codec authors have put enough effort into
choosing a suitable replacement character :)
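
As a quick check of the claim above, this sketch (again, not part of the patch) asks Python's own `cp1252` codec about the byte 0x9D discussed in this thread and about U+FFFD. The variable names are only illustrative:

```python
# Byte 0x9D has no assigned character in Python's cp1252 codec.
try:
    b"\x9d".decode("cp1252")
    byte_defined = True
except UnicodeDecodeError:
    byte_defined = False

# When *decoding* with errors="replace", Python yields U+FFFD for such bytes.
decoded = b"\x9d".decode("cp1252", errors="replace")

# U+FFFD itself cannot be *encoded* to CP1252, so "?" is the encoder's fallback.
try:
    "\ufffd".encode("cp1252")
    fffd_encodable = True
except UnicodeEncodeError:
    fffd_encodable = False

print(byte_defined, decoded == "\ufffd", fffd_encodable)  # False True False
```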
…On Tue, May 13, 2025, 02:20 Avasam ***@***.***> wrote:
*Avasam* left a comment (python/mypy#19085)
<#19085 (comment)>
Regarding question mark vs replacement character, does it exist in CP1252?
VSCode saves it in "Western (Windows 1252)" as byte 0x9D/157. Which
https://www.fileformat.info/info/charset/windows-1252/list.htm doesn't
list and https://www.w3schools.com/charsets/ref_html_ansi.asp marks as
NOT USED.
image.png (view on web)
<https://github.com/user-attachments/assets/6a8d4304-c47a-4ba3-9474-f2f92a9ae229>
Yeah, it definitely is. It's rare enough. And an "unknown character" representation is much, much better than a crash. It's also not too hard to connect the dots and understand why the console is showing a question mark.
Fixes #19071. I looked through the `open()` calls in the codebase, and only `reports.py` raises some concerns. Stubtest crashes due to this `print` call with incompatible encoding.

I tested this on Linux with `LC_CTYPE=ru_RU.CP1251` (a random non-UTF-8 locale I found in `/usr/share/i18n/SUPPORTED`) and confirmed that `stubtest` crashes without the patch and passes with it.

Using a simple MRE (an empty stub file and `A = "╙"` in a file; this symbol is `$'\u2559'`), I got this:

Without the patch I get a crash, same as in the linked issue.

@Avasam could you check this patch on Windows, please? My nerves really aren't strong enough to configure a whole VirtualBox for a tiny encoding problem...
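
For readers who want to see the general idea without a CP1251 locale, here is a minimal sketch of encoding-tolerant printing. It is not the code from this patch: `safe_print` is a hypothetical helper, and the in-memory `io.TextIOWrapper` only stands in for a non-UTF-8 terminal.

```python
import io


def safe_print(text, stream):
    """Write text to stream, replacing characters its encoding cannot represent.

    Hypothetical helper for illustration only; not stubtest's implementation.
    """
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Round-trip through the target encoding so unsupported characters become "?".
    safe = text.encode(encoding, errors="replace").decode(encoding)
    stream.write(safe + "\n")


# Simulate a CP1251 terminal with an in-memory stream (newline="\n" disables
# platform-specific newline translation so the bytes are predictable).
fake_stdout = io.TextIOWrapper(io.BytesIO(), encoding="cp1251", newline="\n")
safe_print('A = "╙"', fake_stdout)
fake_stdout.flush()
written = fake_stdout.buffer.getvalue()
print(written)  # b'A = "?"\n'
```

Writing the original string directly to `fake_stdout` would raise `UnicodeEncodeError`, because the stream's default error handler is strict.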