asm: pretty-printing signed immediates #10200
Comments
Personally I feel like we should prioritize the representation of the types of the immediates to ensure it minimizes errors and is easy to use. Matching capstone exactly seems like something where we might want to instead engineer the test suite/fuzzing to remove that necessity. One possible option with that is to rework tests to (a) generate an arbitrary Inst, (b) convert Inst to binary, (c) print the Inst and use a different assembler to convert to binary (maybe |
As mentioned in bytecodealliance#10200, `capstone` has a peculiar way of pretty-printing immediates, especially signed immediates. It is simpler (and perhaps clearer) for us to just print immediates in one consistent format, e.g., `0xffff...`. This change parses `capstone`'s pretty-printed immediates and converts them to our simpler format if the first attempt to match this assembler's output with `capstone` fails.
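The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code from the change itself: the helper name `normalize_imm` and its `bits` parameter are hypothetical, and it assumes the immediate text has already been separated from any `$` prefix and from the rest of the disassembly line.

```rust
/// Hypothetical helper (not part of `cranelift-assembler-x64`): rewrite an
/// immediate such as `-0x2` into its two's-complement form, e.g.
/// `0xfffffffffffffffe`, truncated to `bits` bits.
/// Returns `None` if the text is not a hexadecimal immediate.
fn normalize_imm(text: &str, bits: u32) -> Option<String> {
    // Split off an optional leading minus sign.
    let (negative, digits) = match text.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, text),
    };
    // Require the `0x` prefix, then parse the magnitude as hexadecimal.
    let digits = digits.strip_prefix("0x")?;
    let magnitude = u64::from_str_radix(digits, 16).ok()?;
    // Wrapping negation yields the two's-complement bit pattern.
    let value = if negative { magnitude.wrapping_neg() } else { magnitude };
    // Keep only the low `bits` bits (assumes 1 <= bits <= 64).
    let mask = if bits >= 64 { u64::MAX } else { (1u64 << bits) - 1 };
    Some(format!("0x{:x}", value & mask))
}
```

With this sketch, `normalize_imm("-0x2", 64)` yields `0xfffffffffffffffe`, while an already-unsigned immediate such as `0x7f` passes through unchanged.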
As pointed out in bytecodealliance#10200, it could be confusing for users of `cranelift-assembler-x64` to pass unsigned integers to certain assembler instructions and have them unexpectedly sign-extended. Well, it can't be too surprising since these instructions have a `_sx*` suffix, but this change implements @alexcrichton's additional suggestion to create separate types for the immediates that may be sign-extended. These new types (`Simm8`, `Simm16`, `Simm32`) are quite similar to their vanilla counterparts (`Imm8`, `Imm16`, `Imm32`) but have additional sign-extension logic when pretty-printed. This means the vanilla versions can be simplified, and the pre-existing `Simm32` is renamed to the more appropriate `AmodeOffset`.
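To make the split between the two kinds of immediate concrete, here is a minimal sketch of an unsigned type and a signed type that differ only in how they print. The type names come from the description above, but the definitions are assumptions for illustration and not the crate's actual code. In particular, the sketch always extends to 64 bits, whereas the real width would depend on the instruction.

```rust
use std::fmt;

/// Sketch of an unsigned 8-bit immediate: printed exactly as given.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Imm8(u8);

/// Sketch of a signed 8-bit immediate: sign-extended when printed.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Simm8(i8);

impl fmt::Display for Imm8 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "$0x{:x}", self.0)
    }
}

impl fmt::Display for Simm8 {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `i8 -> i64` sign-extends; `i64 -> u64` reinterprets the bits,
        // so negative values print in the `0xffff...` form.
        let extended = i64::from(self.0) as u64;
        write!(f, "$0x{:x}", extended)
    }
}
```

Because the signedness lives in the type, a caller has to write `Simm8(-2)` rather than `Imm8(254)` to get the sign-extended behavior, which is the point of the suggestion.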
This issue outlines two problems I encountered adding new assembler instructions:

1. to match `capstone`'s pretty-printing, we must distinguish between signed and unsigned immediates, both of which can be sign-extended (!)
2. to avoid surprising users, we should differentiate the immediate types taken by instructions that sign-extend their immediates

Taken together, these two problems make it difficult to find a solution that satisfies both requirements. Let me explain:
`capstone` pretty-prints immediates differently per instruction. The x64 `add` and `and` groups both have instructions that sign-extend a 32-bit immediate into a 64-bit one before the operation. The `add` output prints like a signed integer, but the `and` output prints like an unsigned integer.

This is probably due to `capstone` understanding that `add` is arithmetic and `and` is logical; makes sense, right? One solution to properly match what `capstone` prints is to add a new `simm*` form to the DSL: for sign-extending instructions, `add` would get the `simm*` form and print the signed integer (`$-0x...`), `and` would get the current `imm*` form and print the unsigned integer (`$0xffff...`)... just extended to the right width. (There are other solutions here, like switching to XED, which prints both forms as unsigned integers, but we may not be ready for that just yet.)

But what about problem 2? @alexcrichton was concerned that if we don't differentiate the immediate type that the assembly instruction takes, we could try to pass in bit-equivalent values to these sign-extending instructions but then have unexpected effects when they are sign-extended; e.g., we pass in `254u8` to one of these instructions but it gets treated as `-2i8` and sign-extended to `-2i64`. We added this comment to track this: `wasmtime/cranelift/codegen/src/isa/x64/lower/isle.rs`, lines 965 to 978 in d943d57.
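The `254u8` case can be reproduced with plain Rust integer casts. The sketch below is only an illustration: all four helper names are hypothetical, and the two formatting functions merely mimic the `$-0x...` and `$0xffff...` styles mentioned above rather than calling `capstone`.

```rust
/// Sign-extend an 8-bit pattern to 64 bits, as a sign-extending
/// instruction would treat its immediate.
fn sign_extend_8_to_64(bits: u8) -> i64 {
    // `u8 -> i8` reinterprets the bits; `i8 -> i64` sign-extends.
    i64::from(bits as i8)
}

/// Zero-extend the same pattern, which is what a caller passing an
/// unsigned value might have expected instead.
fn zero_extend_8_to_64(bits: u8) -> u64 {
    u64::from(bits)
}

/// Signed rendering in the `$-0x...` style.
fn fmt_signed(value: i64) -> String {
    if value < 0 {
        format!("$-0x{:x}", value.unsigned_abs())
    } else {
        format!("$0x{:x}", value)
    }
}

/// Unsigned rendering in the `$0xffff...` style: same bits, no sign.
fn fmt_unsigned(value: i64) -> String {
    format!("$0x{:x}", value as u64)
}
```

Here `sign_extend_8_to_64(254)` gives `-2` while `zero_extend_8_to_64(254)` gives `254`, and the single sign-extended value `-2` renders as `$-0x2` in the signed style and as `$0xfffffffffffffffe` in the unsigned style.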
Problem 1 and problem 2 interfere: if we choose to represent the `add` operand with `simm*` as suggested above, the instruction can accept a new `Simm*` type at the CLIF level that makes it clear that we accept a signed integer and that this will be sign-extended; all is well. But the `and` operand would still be `imm*`, accepting an `Imm*` type, and still confusing the user at the CLIF level, as @alexcrichton was worried would happen. There are several solutions here, but none that I really like, so I'll just describe the problem for now.