Description
.clif Test Case
The Cranelift IR average example code from docs/ir.md. I've already added it to the filetests test directory in caeafe2.
Using clif-util:
../target/debug/clif-util bugpoint filetests/filetests/isa/x64/average.clif x86_64
After pass 0, remaining insts/blocks: 22/3 (will keep reducing)
After pass 1, remaining insts/blocks: 13/3 (will keep reducing)
After pass 2, remaining insts/blocks: 13/3 (stop reducing)
████████████████████████████████████████████████████████████ pass 2 phase merge blocks 2/ 2 Remove unused global values
Crash message: assertion `left == right` failed
left: types::I32
right: types::I64
function %average() -> f32 system_v {
ss0 = explicit_slot 8
block1:
v0 = iconst.i32 0
v1 = iconst.i32 0
v5 = iconst.i32 0
v6 = iadd v0, v5 ; v0 = 0, v5 = 0
v9 = f64const 0.0
v100 = f32const +NaN
brif v1, block2, block5 ; v1 = 0
block2:
v7 = load.f32 v6
v8 = fpromote.f64 v7
v10 = fadd v8, v9 ; v9 = 0.0
stack_store v10, ss0
trap user1
block5:
return v100 ; v100 = +NaN
}
5 blocks 22 insts -> 3 blocks 13 insts
Manually, I reduced it to the following:
function %average(i32, i32) -> f32 system_v {
ss0 = explicit_slot 8 ; Stack slot for `sum`
block1(v0: i32, v1: i32):
v20 = f64const 0x0.0 ; Create a f64 as 0.0, just to initialise ss0.
stack_store v20, ss0 ; Store f64 into the ss0 stack slot, just to initialise ss0.
v6 = iadd v0, v1 ; Adds the input together as integers, output is i32?
v7 = load.f32 v6 ; Converts v7 to f32 from i32.
v8 = fpromote.f64 v7 ; converts v7 to f64 into v8.
v9 = stack_load.f64 ss0 ; Loads ss0 from the stack into v9, v9 is now f64 that's 0.0
v10 = fadd.f64 v8, v9 ; Adds v8 and v9 together, both are f64's, so v10 is an f64.
stack_store v10, ss0 ; Write the 8 bytes back to ss0.
v100 = f32const +NaN ; Just creating a dummy to return.
return v100 ; Return the dummy.
}
Instructions themselves seem fine to me, but today is the first day I'm looking at Cranelift at all, so I absolutely may be missing something obvious.
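One way to sanity-check the reduced function is to track the type of each value by hand. The sketch below does that in plain Rust. It is an illustrative model, not Cranelift code: the `Ty` enum, the helper functions, and the `pointer_ty` parameter are hypothetical, and it assumes that the address operand of a load should have the target's pointer type. Under that assumption the only mismatch is `v6`, an i32 that `load.f32 v6` uses as the address to read an f32 from, which lines up with the `I32` and `I64` values in the crash message.

```rust
/// Value types used by the reduced function (a tiny subset, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ty {
    I32,
    I64,
    F32,
    F64,
}

/// Result type of `iadd`: both operands must share one integer type.
fn iadd(a: Ty, b: Ty) -> Result<Ty, String> {
    match (a, b) {
        (Ty::I32, Ty::I32) | (Ty::I64, Ty::I64) => Ok(a),
        _ => Err(format!("iadd: mismatched operands {:?} and {:?}", a, b)),
    }
}

/// Result type of `load.<result>`.
/// ASSUMPTION: the address operand must have the target's pointer type.
fn load(result: Ty, addr: Ty, pointer_ty: Ty) -> Result<Ty, String> {
    if addr == pointer_ty {
        Ok(result)
    } else {
        Err(format!("load: address is {:?}, expected {:?}", addr, pointer_ty))
    }
}

/// Result type of `fpromote.f64`: the input must be an f32.
fn fpromote_f64(x: Ty) -> Result<Ty, String> {
    if x == Ty::F32 {
        Ok(Ty::F64)
    } else {
        Err(format!("fpromote.f64: input is {:?}, expected F32", x))
    }
}

/// Walk the first three instructions of the reduced function:
/// v6 = iadd v0, v1; v7 = load.f32 v6; v8 = fpromote.f64 v7.
fn check_reduced(pointer_ty: Ty) -> Result<Ty, String> {
    let v6 = iadd(Ty::I32, Ty::I32)?;
    let v7 = load(Ty::F32, v6, pointer_ty)?;
    fpromote_f64(v7)
}
```

With `pointer_ty` set to `Ty::I64` the walk stops at the load; with `Ty::I32` it reaches the `fpromote`. This only shows where the I32 could come from; it does not explain why aarch64 accepts the same function.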
Steps to Reproduce
- Cherry-pick caeafe2
- cd cranelift
- cargo t (note: without --release; the test will pass with --release).
Note: the problem is specific to x86_64; this function compiles fine on aarch64.
Expected Results
I would expect the example function provided in the documentation to compile. It compiles on aarch64, which (to me at least) indicates the function is fine, so I would expect it to also compile on x86_64.
Actual Results
A debug assert triggers at cranelift/codegen/src/isa/x64/lower.rs:233:9, in lower_to_amode (see the backtrace below).
Backtrace
---- filetests stdout ----
thread 'worker #7' panicked at cranelift/codegen/src/isa/x64/lower.rs:233:9:
assertion `left == right` failed
left: types::I32
right: types::I64
stack backtrace:
0: rust_begin_unwind
at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/panicking.rs:665:5
1: core::panicking::panic_fmt
at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/panicking.rs:76:14
2: core::panicking::assert_failed_inner
3: core::panicking::assert_failed
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:373:5
4: cranelift_codegen::isa::x64::lower::lower_to_amode
at ./codegen/src/isa/x64/lower.rs:233:9
5: cranelift_codegen::isa::x64::lower::isle::<impl cranelift_codegen::isa::x64::lower::isle::generated_code::Context for cranelift_codegen::machinst::isle::IsleContext<cranelift_codegen::isa::x64::lower::isle::generated_code::MInst,cranelift_codegen::isa::x64::X64Backend>>::sink_load
at ./codegen/src/isa/x64/lower/isle.rs:313:20
6: cranelift_codegen::isa::x64::lower::isle::<impl cranelift_codegen::isa::x64::lower::isle::generated_code::Context for cranelift_codegen::machinst::isle::IsleContext<cranelift_codegen::isa::x64::lower::isle::generated_code::MInst,cranelift_codegen::isa::x64::X64Backend>>::put_in_reg_mem
at ./codegen/src/isa/x64/lower/isle.rs:148:23
7: cranelift_codegen::isa::x64::lower::isle::<impl cranelift_codegen::isa::x64::lower::isle::generated_code::Context for cranelift_codegen::machinst::isle::IsleContext<cranelift_codegen::isa::x64::lower::isle::generated_code::MInst,cranelift_codegen::isa::x64::X64Backend>>::put_in_xmm_mem
at ./codegen/src/isa/x64/lower/isle.rs:132:28
8: cranelift_codegen::isa::x64::lower::isle::generated_code::constructor_lower
at /home/ivor/Documents/Code/rust/wasmtime/wasmtime/target/debug/build/cranelift-codegen-8f3d42bba10fabf3/out/isle_x64.rs:21098:42
9: cranelift_codegen::isa::x64::lower::isle::lower
at ./codegen/src/isa/x64/lower/isle.rs:55:5
10: cranelift_codegen::isa::x64::lower::<impl cranelift_codegen::machinst::lower::LowerBackend for cranelift_codegen::isa::x64::X64Backend>::lower
at ./codegen/src/isa/x64/lower.rs:311:9
11: cranelift_codegen::machinst::lower::Lower<I>::lower_clif_block
at ./codegen/src/machinst/lower.rs:679:39
12: cranelift_codegen::machinst::lower::Lower<I>::lower
at ./codegen/src/machinst/lower.rs:1020:17
13: cranelift_codegen::machinst::compile::compile
at ./codegen/src/machinst/compile.rs:42:9
14: cranelift_codegen::isa::x64::X64Backend::compile_vcode
at ./codegen/src/isa/x64/mod.rs:62:9
15: <cranelift_codegen::isa::x64::X64Backend as cranelift_codegen::isa::TargetIsa>::compile_function
at ./codegen/src/isa/x64/mod.rs:74:40
16: cranelift_codegen::context::Context::compile_stencil
at ./codegen/src/context.rs:138:9
17: cranelift_codegen::context::Context::compile
at ./codegen/src/context.rs:204:23
18: <cranelift_filetests::test_compile::TestCompile as cranelift_filetests::subtest::SubTest>::run
at ./filetests/src/test_compile.rs:59:29
19: cranelift_filetests::subtest::SubTest::run_target
at ./filetests/src/subtest.rs:101:13
20: cranelift_filetests::runone::run
at ./filetests/src/runone.rs:97:9
21: cranelift_filetests::concurrent::worker_thread::{{closure}}::{{closure}}
at ./filetests/src/concurrent.rs:149:46
22: std::panicking::try::do_call
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:557:40
23: __rust_try
24: std::panicking::try
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:520:19
25: std::panic::catch_unwind
at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:358:14
26: cranelift_filetests::concurrent::worker_thread::{{closure}}
at ./filetests/src/concurrent.rs:149:30
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
The relevant section from isle_x64.rs:
&Opcode::Fpromote => {
    let v1 = C::first_result(ctx, arg0);
    if let Some(v2) = v1 {
        let v3 = C::value_type(ctx, v2);
        if v3 == F64 {
            let v1809 = constructor_xmm_zero(ctx, F64X2);
            let v1804 = &C::put_in_xmm_mem(ctx, v520); // <- Line isle_x64.rs:21098:42
            let v1819 = constructor_x64_cvtss2sd(ctx, v1809, v1804);
            let v1820 = constructor_output_xmm(ctx, v1819);
            let v1821 = Some(v1820);
            // Rule at src/isa/x64/lower.isle line 2661.
            return v1821;
        }
    }
}
This seems to indicate that the problem originates from Fpromote.
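The backtrace runs from the `Fpromote` rule through `put_in_xmm_mem`, `put_in_reg_mem`, and `sink_load` into `lower_to_amode`, and the release-build disassembly further down shows `cvtss2sd` taking a memory operand. That suggests the load feeding `fpromote` may be folded into the conversion, so its address is only inspected while `Fpromote` is being lowered. The sketch below is a simplified model of that idea. Every type and function in it is hypothetical (the names merely echo the backtrace) and it is not Cranelift's real API; the `debug_checks` flag stands in for the debug assert that is only active without --release.

```rust
/// Integer types that can appear as an address (illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ty {
    I32,
    I64,
}

/// How the consumed value was produced (hypothetical, minimal).
#[derive(Debug, Clone, Copy)]
enum Def {
    /// Result of a load whose address operand has type `addr_ty`.
    Load { addr_ty: Ty },
    /// Any other value, assumed to already live in a register.
    Other,
}

/// Operand handed to the consuming instruction:
/// a plain register, or a memory access folded into the consumer.
#[derive(Debug, PartialEq, Eq)]
enum RegMem {
    Reg,
    Mem,
}

/// Model of picking an operand for the consumer (for example a conversion).
/// When the value comes from a load, the load is folded into the consumer,
/// and the address type is checked only if `debug_checks` is enabled.
fn put_in_reg_mem(def: Def, pointer_ty: Ty, debug_checks: bool) -> Result<RegMem, String> {
    match def {
        Def::Load { addr_ty } => {
            if debug_checks && addr_ty != pointer_ty {
                return Err(format!("left: {:?}, right: {:?}", addr_ty, pointer_ty));
            }
            Ok(RegMem::Mem)
        }
        Def::Other => Ok(RegMem::Reg),
    }
}
```

In this model, `put_in_reg_mem(Def::Load { addr_ty: Ty::I32 }, Ty::I64, true)` fails while the same call with `debug_checks` set to `false` succeeds, which is consistent with the test only failing in debug builds. If that reading is right, the assert would surface under `Fpromote` even though the mismatched type belongs to the address of the load.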
Versions and Environment
Cranelift version or commit: current release, 0.116.1; also tested on current main, 5dfccc0.
Operating system: Ubuntu, stable rustc 1.84.0 (9fc6b4312 2025-01-07)
Architecture: x86_64
Extra Info
If I use a release build, target x86_64, and write the output to an object file using cranelift_object::ObjectModule, the resulting symbol does work and correctly calculates averages.
Disassembly of object and usage of it to calculate average
$ objdump -d average.o
average.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <average>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 10 sub $0x10,%rsp
8: 66 0f 57 e4 xorpd %xmm4,%xmm4
c: 48 8d 04 24 lea (%rsp),%rax
10: f2 0f 11 20 movsd %xmm4,(%rax)
14: 85 f6 test %esi,%esi
16: 0f 85 12 00 00 00 jne 2e <average+0x2e>
1c: b9 00 00 c0 7f mov $0x7fc00000,%ecx
21: 66 0f 6e c1 movd %ecx,%xmm0
25: 48 83 c4 10 add $0x10,%rsp
29: 48 89 ec mov %rbp,%rsp
2c: 5d pop %rbp
2d: c3 retq
2e: 31 c9 xor %ecx,%ecx
30: 44 6b c9 04 imul $0x4,%ecx,%r9d
34: 66 0f 57 ed xorpd %xmm5,%xmm5
38: f3 42 0f 5a 2c 0f cvtss2sd (%rdi,%r9,1),%xmm5
3e: 4c 8d 0c 24 lea (%rsp),%r9
42: f2 41 0f 58 29 addsd (%r9),%xmm5
47: 4c 8d 0c 24 lea (%rsp),%r9
4b: f2 41 0f 11 29 movsd %xmm5,(%r9)
50: 83 c1 01 add $0x1,%ecx
53: 39 f1 cmp %esi,%ecx
55: 0f 82 d5 ff ff ff jb 30 <average+0x30>
5b: 48 8d 3c 24 lea (%rsp),%rdi
5f: f2 0f 10 3f movsd (%rdi),%xmm7
63: 66 0f 57 e4 xorpd %xmm4,%xmm4
67: 8b fe mov %esi,%edi
69: f2 48 0f 2a e7 cvtsi2sd %rdi,%xmm4
6e: f2 0f 5e fc divsd %xmm4,%xmm7
72: 0f 57 c0 xorps %xmm0,%xmm0
75: f2 0f 5a c7 cvtsd2ss %xmm7,%xmm0
79: 48 83 c4 10 add $0x10,%rsp
7d: 48 89 ec mov %rbp,%rsp
80: 5d pop %rbp
81: c3 retq
$ cat main.cpp
#include <array>
#include <iostream>

extern "C" {
    float average(const float *array, size_t count);
}

int main(int, char**) {
    const auto values = std::array<float, 4>{1, 5555, 3, 4};
    const auto avg = average(values.data(), values.size());
    std::cout << "avg: " << avg << std::endl;
    return 0;
}
$ g++ main.cpp average.o && ./a.out
avg: 1390.75
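To cross-check that result independently of the generated code, here is a small reference model in plain Rust. It is a sketch based on my reading of the disassembly above (early return of the constant 0x7fc00000, a quiet NaN, when the count is zero; accumulation in double precision via cvtss2sd and addsd; division by the count; narrowing with cvtsd2ss). The function name `average_reference` is hypothetical and not part of Cranelift or the docs example.

```rust
/// Reference model of `average`: sum the f32 inputs in f64, divide by the
/// element count, and narrow the result back to f32.
/// Returns NaN for an empty slice, mirroring the early-return path.
fn average_reference(values: &[f32]) -> f32 {
    if values.is_empty() {
        return f32::NAN;
    }
    let mut sum = 0.0_f64;
    for &v in values {
        // Widen each element before adding, like cvtss2sd followed by addsd.
        sum += f64::from(v);
    }
    // Divide in f64, then narrow to f32, like divsd followed by cvtsd2ss.
    (sum / values.len() as f64) as f32
}
```

Calling `average_reference(&[1.0, 5555.0, 3.0, 4.0])` yields 1390.75, the same value the object file produced for the same inputs.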