Skip to content

Commit

Permalink
Auto merge of #118310 - scottmcm:three-way-compare, r=davidtwco
Browse files Browse the repository at this point in the history
Add `Ord::cmp` for primitives as a `BinOp` in MIR

Update: most of this OP was written months ago.  See rust-lang/rust#118310 (comment) below for where we got to recently that made it ready for review.

---

There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches.  Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:

1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.

Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic.  Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (llvm/llvm-project#73417) to be practical.  Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)?  But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)?  And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers.  Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.

As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR.  The best way to see that is with rust-lang/rust@8811efa#diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113 -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks.  Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues.  (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)

---

r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
  • Loading branch information
bors committed Apr 2, 2024
2 parents b2f6349 + 629a772 commit 79a1bdd
Show file tree
Hide file tree
Showing 2 changed files with 24 additions and 3 deletions.
3 changes: 2 additions & 1 deletion src/codegen_i128.rs
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,7 @@ pub(crate) fn maybe_codegen<'tcx>(
Some(CValue::by_val(ret_val, lhs.layout()))
}
}
BinOp::Lt | BinOp::Le | BinOp::Eq | BinOp::Ge | BinOp::Gt | BinOp::Ne => None,
BinOp::Lt | BinOp::Le | BinOp::Eq | BinOp::Ge | BinOp::Gt | BinOp::Ne | BinOp::Cmp => None,
BinOp::Shl | BinOp::ShlUnchecked | BinOp::Shr | BinOp::ShrUnchecked => None,
}
}
Expand Down Expand Up @@ -134,6 +134,7 @@ pub(crate) fn maybe_codegen_checked<'tcx>(
BinOp::AddUnchecked | BinOp::SubUnchecked | BinOp::MulUnchecked => unreachable!(),
BinOp::Offset => unreachable!("offset should only be used on pointers, not 128bit ints"),
BinOp::Div | BinOp::Rem => unreachable!(),
BinOp::Cmp => unreachable!(),
BinOp::Lt | BinOp::Le | BinOp::Eq | BinOp::Ge | BinOp::Gt | BinOp::Ne => unreachable!(),
BinOp::Shl | BinOp::ShlUnchecked | BinOp::Shr | BinOp::ShrUnchecked => unreachable!(),
}
Expand Down
24 changes: 22 additions & 2 deletions src/num.rs
Original file line number Diff line number Diff line change
Expand Up @@ -40,13 +40,33 @@ pub(crate) fn bin_op_to_intcc(bin_op: BinOp, signed: bool) -> Option<IntCC> {
})
}

fn codegen_three_way_compare<'tcx>(
fx: &mut FunctionCx<'_, '_, 'tcx>,
signed: bool,
lhs: Value,
rhs: Value,
) -> CValue<'tcx> {
// This emits `(lhs > rhs) - (lhs < rhs)`, which is cranelift's preferred form per
// <https://github.com/bytecodealliance/wasmtime/blob/8052bb9e3b792503b225f2a5b2ba3bc023bff462/cranelift/codegen/src/prelude_opt.isle#L41-L47>
let gt_cc = crate::num::bin_op_to_intcc(BinOp::Gt, signed).unwrap();
let lt_cc = crate::num::bin_op_to_intcc(BinOp::Lt, signed).unwrap();
let gt = fx.bcx.ins().icmp(gt_cc, lhs, rhs);
let lt = fx.bcx.ins().icmp(lt_cc, lhs, rhs);
let val = fx.bcx.ins().isub(gt, lt);
CValue::by_val(val, fx.layout_of(fx.tcx.ty_ordering_enum(Some(fx.mir.span))))
}

fn codegen_compare_bin_op<'tcx>(
fx: &mut FunctionCx<'_, '_, 'tcx>,
bin_op: BinOp,
signed: bool,
lhs: Value,
rhs: Value,
) -> CValue<'tcx> {
if bin_op == BinOp::Cmp {
return codegen_three_way_compare(fx, signed, lhs, rhs);
}

let intcc = crate::num::bin_op_to_intcc(bin_op, signed).unwrap();
let val = fx.bcx.ins().icmp(intcc, lhs, rhs);
CValue::by_val(val, fx.layout_of(fx.tcx.types.bool))
Expand All @@ -59,7 +79,7 @@ pub(crate) fn codegen_binop<'tcx>(
in_rhs: CValue<'tcx>,
) -> CValue<'tcx> {
match bin_op {
BinOp::Eq | BinOp::Lt | BinOp::Le | BinOp::Ne | BinOp::Ge | BinOp::Gt => {
BinOp::Eq | BinOp::Lt | BinOp::Le | BinOp::Ne | BinOp::Ge | BinOp::Gt | BinOp::Cmp => {
match in_lhs.layout().ty.kind() {
ty::Bool | ty::Uint(_) | ty::Int(_) | ty::Char => {
let signed = type_sign(in_lhs.layout().ty);
Expand Down Expand Up @@ -160,7 +180,7 @@ pub(crate) fn codegen_int_binop<'tcx>(
}
BinOp::Offset => unreachable!("Offset is not an integer operation"),
// Compare binops handles by `codegen_binop`.
BinOp::Eq | BinOp::Ne | BinOp::Lt | BinOp::Le | BinOp::Gt | BinOp::Ge => {
BinOp::Eq | BinOp::Ne | BinOp::Lt | BinOp::Le | BinOp::Gt | BinOp::Ge | BinOp::Cmp => {
unreachable!("{:?}({:?}, {:?})", bin_op, in_lhs.layout().ty, in_rhs.layout().ty);
}
};
Expand Down

0 comments on commit 79a1bdd

Please sign in to comment.