
Description
The idiomatic way to sum a range of integers from a
up to b
is simply (a..=b).sum()
. Yet, this always emits suboptimal codegen, as LLVM cannot perfectly optimise this, yet slightly better mathematics can be employed to do the same, more efficiently.
I suggest that specialisation be applied to range inclusive for integer elements where T::BITS < 128
, to exchange a default internal implementation of sum
for a better optimised implementation.
(Note that when an intrinsic is created for u128::carrying_{op}
to yield the carry value, as opposed to just a boolean indicator, then this could be implemented for all types using carrying arithmetic; tag me then?)
given a
, b
, of integer type T
, and the immediately wider type Wider
, one can compute the sum via
// a: T, b: T
let a = Wider::from(a);
let b = Wider::from(b);
let sum_result = ((a + b) * (1 + b).saturating_sub(a) / 2);
// the intermittent values may exceed the bounds of {T}, but the result could still fit
if overflow_checks_enabled && !(Wider::from(T::MIN)..=Wider::from(T::MAX)).contains(&sum_result) {
panic!("integer underflow/overflow")
} else {
// if overflow checks are disabled, casting will work as expected, otherwise it fits as-is
return sum_result as T
}
If this wasn't the right place to have made the issue, please redirect me as appropriate.