Skip to content

Cranelift: IR Reference example triggers debug assert on x86_64 #10118

Closed
@iwanders

Description

@iwanders

.clif Test Case

The Cranelift IR average example code from the docs/ir.md. I've already added that to the filetests test directory in caeafe2.

Using clif-util;

../target/debug/clif-util  bugpoint  filetests/filetests/isa/x64/average.clif x86_64
After pass 0, remaining insts/blocks: 22/3 (will keep reducing)
After pass 1, remaining insts/blocks: 13/3 (will keep reducing)
After pass 2, remaining insts/blocks: 13/3 (stop reducing)
████████████████████████████████████████████████████████████ pass 2 phase merge blocks                   2/   2 Remove unused global values
Crash message: assertion `left == right` failed
  left: types::I32
 right: types::I64

function %average() -> f32 system_v {
    ss0 = explicit_slot 8

block1:
    v0 = iconst.i32 0
    v1 = iconst.i32 0
    v5 = iconst.i32 0
    v6 = iadd v0, v5  ; v0 = 0, v5 = 0
    v9 = f64const 0.0
    v100 = f32const +NaN
    brif v1, block2, block5  ; v1 = 0

block2:
    v7 = load.f32 v6
    v8 = fpromote.f64 v7
    v10 = fadd v8, v9  ; v9 = 0.0
    stack_store v10, ss0
    trap user1

block5:
    return v100  ; v100 = +NaN
}

5 blocks 22 insts -> 3 blocks 13 insts

Manually, I reduced it to the following:

function %average(i32, i32) -> f32 system_v {
    ss0 = explicit_slot 8        ; Stack slot for `sum`

block1(v0: i32, v1: i32):
    v20 = f64const 0x0.0         ; Create a f64 as 0.0, just to initialise ss0.
    stack_store v20, ss0         ; Store f64 into the ss0 stack slot, just to initialise ss0.
    v6 = iadd v0, v1             ; Adds the input together as integers, output is i32?
    v7 = load.f32 v6             ; Converts v7 to f32 from i32.
    v8 = fpromote.f64 v7         ; converts v7 to f64 into v8.
    v9 = stack_load.f64 ss0      ; Loads ss0 from the stack into v9, v9 is now f64 that's 0.0
    v10 = fadd.f64 v8, v9        ; Adds v8 and v9 together, both are f64's, so v10 is an f64.
    stack_store v10, ss0         ; Write the 8 bytes back to ss0.
    v100 = f32const +NaN         ; Just creating a dummy to return.
    return v100                  ; Return the dummy.
}

Instructions themselves seem fine to me, but today is the first day I'm looking at Cranelift at all, so I absolutely may be missing something obvious.

Steps to Reproduce

  • Cherrypick caeafe2
  • cd cranelift
  • cargo t (note without --release, test will pass with --release).

Note, the problem is with x86_64, this function compiles fine with aarch64.

Expected Results

I would expect the example function provided in the documentation to compile, it compiles on aarch64, which (to me at least) indicates the function is fine. I would expect this to also compile on x86_64.

Actual Results

A debug assert triggers on this line;

Backtrace
---- filetests stdout ----
thread 'worker #7' panicked at cranelift/codegen/src/isa/x64/lower.rs:233:9:
assertion `left == right` failed
  left: types::I32
 right: types::I64
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/9fc6b43126469e3858e2fe86cafb4f0fd5068869/library/core/src/panicking.rs:76:14
   2: core::panicking::assert_failed_inner
   3: core::panicking::assert_failed
             at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:373:5
   4: cranelift_codegen::isa::x64::lower::lower_to_amode
             at ./codegen/src/isa/x64/lower.rs:233:9
   5: cranelift_codegen::isa::x64::lower::isle::<impl cranelift_codegen::isa::x64::lower::isle::generated_code::Context for cranelift_codegen::machinst::isle::IsleContext<cranelift_codegen::isa::x64::lower::isle::generated_code::MInst,cranelift_codegen::isa::x64::X64Backend>>::sink_load
             at ./codegen/src/isa/x64/lower/isle.rs:313:20
   6: cranelift_codegen::isa::x64::lower::isle::<impl cranelift_codegen::isa::x64::lower::isle::generated_code::Context for cranelift_codegen::machinst::isle::IsleContext<cranelift_codegen::isa::x64::lower::isle::generated_code::MInst,cranelift_codegen::isa::x64::X64Backend>>::put_in_reg_mem
             at ./codegen/src/isa/x64/lower/isle.rs:148:23
   7: cranelift_codegen::isa::x64::lower::isle::<impl cranelift_codegen::isa::x64::lower::isle::generated_code::Context for cranelift_codegen::machinst::isle::IsleContext<cranelift_codegen::isa::x64::lower::isle::generated_code::MInst,cranelift_codegen::isa::x64::X64Backend>>::put_in_xmm_mem
             at ./codegen/src/isa/x64/lower/isle.rs:132:28
   8: cranelift_codegen::isa::x64::lower::isle::generated_code::constructor_lower
             at /home/ivor/Documents/Code/rust/wasmtime/wasmtime/target/debug/build/cranelift-codegen-8f3d42bba10fabf3/out/isle_x64.rs:21098:42
   9: cranelift_codegen::isa::x64::lower::isle::lower
             at ./codegen/src/isa/x64/lower/isle.rs:55:5
  10: cranelift_codegen::isa::x64::lower::<impl cranelift_codegen::machinst::lower::LowerBackend for cranelift_codegen::isa::x64::X64Backend>::lower
             at ./codegen/src/isa/x64/lower.rs:311:9
  11: cranelift_codegen::machinst::lower::Lower<I>::lower_clif_block
             at ./codegen/src/machinst/lower.rs:679:39
  12: cranelift_codegen::machinst::lower::Lower<I>::lower
             at ./codegen/src/machinst/lower.rs:1020:17
  13: cranelift_codegen::machinst::compile::compile
             at ./codegen/src/machinst/compile.rs:42:9
  14: cranelift_codegen::isa::x64::X64Backend::compile_vcode
             at ./codegen/src/isa/x64/mod.rs:62:9
  15: <cranelift_codegen::isa::x64::X64Backend as cranelift_codegen::isa::TargetIsa>::compile_function
             at ./codegen/src/isa/x64/mod.rs:74:40
  16: cranelift_codegen::context::Context::compile_stencil
             at ./codegen/src/context.rs:138:9
  17: cranelift_codegen::context::Context::compile
             at ./codegen/src/context.rs:204:23
  18: <cranelift_filetests::test_compile::TestCompile as cranelift_filetests::subtest::SubTest>::run
             at ./filetests/src/test_compile.rs:59:29
  19: cranelift_filetests::subtest::SubTest::run_target
             at ./filetests/src/subtest.rs:101:13
  20: cranelift_filetests::runone::run
             at ./filetests/src/runone.rs:97:9
  21: cranelift_filetests::concurrent::worker_thread::{{closure}}::{{closure}}
             at ./filetests/src/concurrent.rs:149:46
  22: std::panicking::try::do_call
             at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:557:40
  23: __rust_try
  24: std::panicking::try
             at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:520:19
  25: std::panic::catch_unwind
             at /home/ivor/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:358:14
  26: cranelift_filetests::concurrent::worker_thread::{{closure}}
             at ./filetests/src/concurrent.rs:149:30
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

The relevant section from isle_x64.rs;

&Opcode::Fpromote => {
    let v1 = C::first_result(ctx, arg0);
    if let Some(v2) = v1 {
        let v3 = C::value_type(ctx, v2);
        if v3 == F64 {
            let v1809 = constructor_xmm_zero(ctx, F64X2);
            let v1804 = &C::put_in_xmm_mem(ctx, v520);             // <- Line isle_x64.rs:21098:42
            let v1819 = constructor_x64_cvtss2sd(ctx, v1809, v1804);
            let v1820 = constructor_output_xmm(ctx, v1819);
            let v1821 = Some(v1820);
            // Rule at src/isa/x64/lower.isle line 2661.
            return v1821;
        }
    }
}

Which seems to indicate the problem is originating from Fpromote.

Versions and Environment

Cranelift version or commit: Current released; 0.116.1, also tested on current main; 5dfccc0.

Operating system: Ubuntu, stable rustc 1.84.0 (9fc6b4312 2025-01-07)

Architecture: x86_64

Extra Info

If I use a release build, target x86_64 and write the output to an object file using cranelift_object::ObjectModule. The provided symbol does work and correctly calculates averages.

Disassembly of object and usage of it to calculate average
$ objdump -d average.o

average.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <average>:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	48 83 ec 10          	sub    $0x10,%rsp
   8:	66 0f 57 e4          	xorpd  %xmm4,%xmm4
   c:	48 8d 04 24          	lea    (%rsp),%rax
  10:	f2 0f 11 20          	movsd  %xmm4,(%rax)
  14:	85 f6                	test   %esi,%esi
  16:	0f 85 12 00 00 00    	jne    2e <average+0x2e>
  1c:	b9 00 00 c0 7f       	mov    $0x7fc00000,%ecx
  21:	66 0f 6e c1          	movd   %ecx,%xmm0
  25:	48 83 c4 10          	add    $0x10,%rsp
  29:	48 89 ec             	mov    %rbp,%rsp
  2c:	5d                   	pop    %rbp
  2d:	c3                   	retq   
  2e:	31 c9                	xor    %ecx,%ecx
  30:	44 6b c9 04          	imul   $0x4,%ecx,%r9d
  34:	66 0f 57 ed          	xorpd  %xmm5,%xmm5
  38:	f3 42 0f 5a 2c 0f    	cvtss2sd (%rdi,%r9,1),%xmm5
  3e:	4c 8d 0c 24          	lea    (%rsp),%r9
  42:	f2 41 0f 58 29       	addsd  (%r9),%xmm5
  47:	4c 8d 0c 24          	lea    (%rsp),%r9
  4b:	f2 41 0f 11 29       	movsd  %xmm5,(%r9)
  50:	83 c1 01             	add    $0x1,%ecx
  53:	39 f1                	cmp    %esi,%ecx
  55:	0f 82 d5 ff ff ff    	jb     30 <average+0x30>
  5b:	48 8d 3c 24          	lea    (%rsp),%rdi
  5f:	f2 0f 10 3f          	movsd  (%rdi),%xmm7
  63:	66 0f 57 e4          	xorpd  %xmm4,%xmm4
  67:	8b fe                	mov    %esi,%edi
  69:	f2 48 0f 2a e7       	cvtsi2sd %rdi,%xmm4
  6e:	f2 0f 5e fc          	divsd  %xmm4,%xmm7
  72:	0f 57 c0             	xorps  %xmm0,%xmm0
  75:	f2 0f 5a c7          	cvtsd2ss %xmm7,%xmm0
  79:	48 83 c4 10          	add    $0x10,%rsp
  7d:	48 89 ec             	mov    %rbp,%rsp
  80:	5d                   	pop    %rbp
  81:	c3                   	retq  
$ cat main.cpp 
#include <array>
#include <iostream>
extern "C" {
  float average(const float *array, size_t count);
}
int main(int, char**) {
  const auto values = std::array<float, 4>{1, 5555, 3, 4};
  const auto avg = average(values.data(), values.size());
  std::cout << "avg: " << avg << std::endl;
  return 0;
}
$ g++ main.cpp  average.o && ./a.out 
avg: 1390.75

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugIncorrect behavior in the current implementation that needs fixingcraneliftIssues related to the Cranelift code generator

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions