In aliases and many other cases, CRuby's Ripper emits different lexer tokens depending on the symbol's name. For instance, a name starting with an uppercase letter emits :@const instead of :@ident.
TruffleRuby does this correctly for 7-bit constants like "A", but not for Unicode uppercase constants like "Ñ".
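A minimal sketch of how the difference could be checked, assuming only the standard `ripper` library. The helper `scanner_events` is a hypothetical name introduced here for illustration and is not part of Ripper; it just collects the scanner-event nodes (the ones whose tag starts with `@`) from the tree returned by `Ripper.sexp`. The non-ASCII name is written as the escape `\u00D1` (Ñ) so the snippet does not depend on the source file's encoding.

```ruby
require "ripper"

# Hypothetical helper (not part of Ripper): walk the nested arrays returned by
# Ripper.sexp and collect [event, text] for every scanner-event node, i.e.
# every array whose first element is a symbol such as :@const or :@ident.
def scanner_events(node, acc = [])
  return acc unless node.is_a?(Array)

  if node[0].is_a?(Symbol) && node[0].to_s.start_with?("@")
    acc << [node[0], node[1]]
  else
    node.each { |child| scanner_events(child, acc) }
  end
  acc
end

ascii   = scanner_events(Ripper.sexp("alias A b"))
unicode = scanner_events(Ripper.sexp("alias \u00D1 b"))

p ascii    # on CRuby: [[:@const, "A"], [:@ident, "b"]]
p unicode  # compare the first event with the ASCII case
```

Running the same script on both implementations and comparing the first event of `unicode` should show whether the non-ASCII uppercase name is classified as :@const, as it is for "A".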
noahgibbs changed the title from "Ripper: incompatibility for uppercase UTF-8 letters" to "Ripper: incompatibility for uppercase UTF-8 constant names in aliases" on Feb 15, 2024
Thanks for the report.
We use the same C code as CRuby for Ripper.
So this is probably a bug in id_type/rb_str_symname_type/rb_enc_symname_type, or in sym_type or similar.
Possibly related to #3407 which is also about identifier types, but probably not because rb_enc_symname_type seems implemented in C (code from CRuby).
In general the Ripper C extension uses way too many internals and is quite slow with tons of upcalls, so we'd like to get rid of it and replace it by Prism::RipperCompat :)
I think it's best not to use Ripper on TruffleRuby in the Prism test suite. If there is a difference from CRuby, it's almost surely a bug, and we'd want the same behavior as CRuby for Prism::RipperCompat.