Commit 79a1bdd

Auto merge of #118310 - scottmcm:three-way-compare, r=davidtwco
Add `Ord::cmp` for primitives as a `BinOp` in MIR

Update: most of this OP was written months ago. See rust-lang/rust#118310 (comment) below for where we got to recently that made it ready for review.

---

There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the Rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:

1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.

Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (llvm/llvm-project#73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.

As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with rust-lang/rust@8811efa#diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113 -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR changes it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)

---

r? `@ghost`

But first I should check that perf is ok with this ~~...and my true nemesis, tidy.~~
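To make the "dozens of reasonable ways" point concrete, here is a minimal sketch in plain Rust (not compiler code) of two of the forms mentioned above: the branchy `if` chain and the subtraction form. The function names `cmp_branchy`, `cmp_subtract`, and `all_forms_agree` are hypothetical and exist only for this illustration. The helper checks that both forms agree with `Ord::cmp`, and that testing the result against `Ordering::Equal` is the same as a plain `==`, which is the equivalence an InstCombine-style simplification would rely on.

```rust
use std::cmp::Ordering;

// Branchy form, as quoted in the commit message.
fn cmp_branchy(a: i32, b: i32) -> Ordering {
    if a < b {
        Ordering::Less
    } else if a == b {
        Ordering::Equal
    } else {
        Ordering::Greater
    }
}

// Subtraction form: (a > b) - (a < b) evaluates to -1, 0, or 1.
fn cmp_subtract(a: i32, b: i32) -> Ordering {
    match (a > b) as i8 - (a < b) as i8 {
        -1 => Ordering::Less,
        0 => Ordering::Equal,
        _ => Ordering::Greater,
    }
}

// Hypothetical helper: true if every pair drawn from `samples` gives the
// same answer under both forms and under `Ord::cmp`, and if
// "cmp == Equal" matches plain equality.
fn all_forms_agree(samples: &[i32]) -> bool {
    samples.iter().all(|&a| {
        samples.iter().all(|&b| {
            let reference = a.cmp(&b);
            cmp_branchy(a, b) == reference
                && cmp_subtract(a, b) == reference
                && ((reference == Ordering::Equal) == (a == b))
        })
    })
}
```

Since all of these compute the same function, which one is emitted can be treated as a backend decision, which is the argument the message makes for a single `BinOp::Cmp`.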
2 parents b2f6349 + 629a772 commit 79a1bdd

2 files changed

+24
-3
lines changed

src/codegen_i128.rs

+2-1
@@ -68,7 +68,7 @@ pub(crate) fn maybe_codegen<'tcx>(
                 Some(CValue::by_val(ret_val, lhs.layout()))
             }
         }
-        BinOp::Lt | BinOp::Le | BinOp::Eq | BinOp::Ge | BinOp::Gt | BinOp::Ne => None,
+        BinOp::Lt | BinOp::Le | BinOp::Eq | BinOp::Ge | BinOp::Gt | BinOp::Ne | BinOp::Cmp => None,
         BinOp::Shl | BinOp::ShlUnchecked | BinOp::Shr | BinOp::ShrUnchecked => None,
     }
 }
@@ -134,6 +134,7 @@ pub(crate) fn maybe_codegen_checked<'tcx>(
         BinOp::AddUnchecked | BinOp::SubUnchecked | BinOp::MulUnchecked => unreachable!(),
         BinOp::Offset => unreachable!("offset should only be used on pointers, not 128bit ints"),
         BinOp::Div | BinOp::Rem => unreachable!(),
+        BinOp::Cmp => unreachable!(),
         BinOp::Lt | BinOp::Le | BinOp::Eq | BinOp::Ge | BinOp::Gt | BinOp::Ne => unreachable!(),
         BinOp::Shl | BinOp::ShlUnchecked | BinOp::Shr | BinOp::ShrUnchecked => unreachable!(),
     }

src/num.rs

+22-2
@@ -40,13 +40,33 @@ pub(crate) fn bin_op_to_intcc(bin_op: BinOp, signed: bool) -> Option<IntCC> {
     })
 }
 
+fn codegen_three_way_compare<'tcx>(
+    fx: &mut FunctionCx<'_, '_, 'tcx>,
+    signed: bool,
+    lhs: Value,
+    rhs: Value,
+) -> CValue<'tcx> {
+    // This emits `(lhs > rhs) - (lhs < rhs)`, which is cranelift's preferred form per
+    // <https://github.com/bytecodealliance/wasmtime/blob/8052bb9e3b792503b225f2a5b2ba3bc023bff462/cranelift/codegen/src/prelude_opt.isle#L41-L47>
+    let gt_cc = crate::num::bin_op_to_intcc(BinOp::Gt, signed).unwrap();
+    let lt_cc = crate::num::bin_op_to_intcc(BinOp::Lt, signed).unwrap();
+    let gt = fx.bcx.ins().icmp(gt_cc, lhs, rhs);
+    let lt = fx.bcx.ins().icmp(lt_cc, lhs, rhs);
+    let val = fx.bcx.ins().isub(gt, lt);
+    CValue::by_val(val, fx.layout_of(fx.tcx.ty_ordering_enum(Some(fx.mir.span))))
+}
+
 fn codegen_compare_bin_op<'tcx>(
     fx: &mut FunctionCx<'_, '_, 'tcx>,
     bin_op: BinOp,
     signed: bool,
     lhs: Value,
     rhs: Value,
 ) -> CValue<'tcx> {
+    if bin_op == BinOp::Cmp {
+        return codegen_three_way_compare(fx, signed, lhs, rhs);
+    }
+
     let intcc = crate::num::bin_op_to_intcc(bin_op, signed).unwrap();
     let val = fx.bcx.ins().icmp(intcc, lhs, rhs);
     CValue::by_val(val, fx.layout_of(fx.tcx.types.bool))
@@ -59,7 +79,7 @@ pub(crate) fn codegen_binop<'tcx>(
     in_rhs: CValue<'tcx>,
 ) -> CValue<'tcx> {
     match bin_op {
-        BinOp::Eq | BinOp::Lt | BinOp::Le | BinOp::Ne | BinOp::Ge | BinOp::Gt => {
+        BinOp::Eq | BinOp::Lt | BinOp::Le | BinOp::Ne | BinOp::Ge | BinOp::Gt | BinOp::Cmp => {
             match in_lhs.layout().ty.kind() {
                 ty::Bool | ty::Uint(_) | ty::Int(_) | ty::Char => {
                     let signed = type_sign(in_lhs.layout().ty);
@@ -160,7 +180,7 @@ pub(crate) fn codegen_int_binop<'tcx>(
         }
         BinOp::Offset => unreachable!("Offset is not an integer operation"),
         // Compare binops handles by `codegen_binop`.
-        BinOp::Eq | BinOp::Ne | BinOp::Lt | BinOp::Le | BinOp::Gt | BinOp::Ge => {
+        BinOp::Eq | BinOp::Ne | BinOp::Lt | BinOp::Le | BinOp::Gt | BinOp::Ge | BinOp::Cmp => {
             unreachable!("{:?}({:?}, {:?})", bin_op, in_lhs.layout().ty, in_rhs.layout().ty);
         }
     };
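
The `(lhs > rhs) - (lhs < rhs)` form used in `codegen_three_way_compare` can be modelled in ordinary Rust. The sketch below is an illustration only, not the backend code: the function `three_way` is hypothetical, and it assumes each comparison produces a 0/1 byte that is then subtracted, which appears to be what the `icmp` and `isub` pair does. It also shows why the `signed` flag is threaded through: the same bit patterns can order differently under signed and unsigned comparison. The results line up with the discriminants of `std::cmp::Ordering` (`Less = -1`, `Equal = 0`, `Greater = 1`).

```rust
use std::cmp::Ordering;

// Hypothetical model of the emitted sequence on 8-bit operands:
// two comparisons yielding 0 or 1, then one subtraction.
fn three_way(signed: bool, lhs: u8, rhs: u8) -> i8 {
    let (gt, lt) = if signed {
        // Reinterpret the same bits as signed before comparing.
        ((lhs as i8) > (rhs as i8), (lhs as i8) < (rhs as i8))
    } else {
        (lhs > rhs, lhs < rhs)
    };
    // true/false become 1/0, so the difference is -1, 0, or 1.
    gt as i8 - lt as i8
}

// The three possible results match the `Ordering` discriminants.
fn matches_ordering_discriminants() -> bool {
    three_way(false, 1, 2) == Ordering::Less as i8
        && three_way(false, 2, 2) == Ordering::Equal as i8
        && three_way(false, 2, 1) == Ordering::Greater as i8
}
```

For example, the byte `0xFF` is greater than `0x01` when compared unsigned, but as a signed value it is `-1` and therefore less, so `three_way(false, 0xFF, 0x01)` and `three_way(true, 0xFF, 0x01)` give opposite results.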
