[discussion][donotmerge]: Copy Python implementation for float::div_euclid
#133485
Conversation
Per https://doc.rust-lang.org/std/primitive.f32.html#method.div_euclid, the documentation clearly states that this method is precise to a rounded infinite-precision result, so its current implementation is just wrong and the new implementation is correct. libs-api can have a look, but this seems right to me. |
To inline the problem a bit further:

```rust
fn main() {
    let (lhs, rhs) = (11.0f64, 1.1f64);
    // From the docs of `div_euclid`:
    //
    // This computes the integer `n`...
    let n = lhs.div_euclid(rhs);
    let r = lhs.rem_euclid(rhs);
    dbg!(lhs, rhs, n, r);
    // lhs = 11.0
    // rhs = 1.1
    // n = 10.0
    // r = 1.0999999999999992
    //
    // ...such that `self = n * rhs + self.rem_euclid(rhs)`.
    assert_eq!(lhs, n * rhs + r);
    //~^ PANIC
    // assertion `left == right` failed
    //   left: 11.0
    //  right: 12.1
}
```

That is, the behavior contradicts what our documentation states (and what good mathematical sense would suggest) about the invariant that |
The documentation is not necessarily wrong because rounding from an infinite result can include rounding up. It's definitely bad that div/rem aren't in sync though. I'm worried about the license implications of blatantly copying Python too. If that's indeed a problem, we may need someone else to do a clean-room fix. |
Agreed. Over in... ...it was suggested that:
We could do something like that. |
On rounding:
|
This paper has an extensive analysis of this problem:
There is a companion paper: |
Are you going for "more correct" or actually correct in all cases? I believe that the proposed (Python) implementation only works correctly in some cases: if the integer result can be exactly represented (such as in the example of 9). But it doesn't necessarily round correctly when the result can't be exactly represented (very large integer results). More care (and a proof) is necessary to make sure of correct rounding in all cases. |
The documentation guarantees "the rounded infinite-precision result". Therefore the correct behavior is unambiguously 9f32:
|
Maybe we need an Plus, since it is documented, |
Python |
Having analyzed this a bit, and having looked at the Rust implementation in

```rust
pub fn rem_euclid(self, rhs: f64) -> f64 {
    self - self.div_euclid(rhs) * rhs
}
```

...should be accompanied by a (preferably machine-checked) proof of correctness. |
I suppose we should modify div_euclid/rem_euclid a little: If we want to ensure
I have another idea, mathematically, ensure What's more, since the calculation of It seems introducing such an impl might work:

```rust
pub fn div_euclid(self, div: Self) -> Self {
    self.divmod_euclid(div).0
}

pub fn divmod_euclid(self, div: Self) -> (Self, Self) {
    // keep rem_euclid as-is
    let rem = self.rem_euclid(div);
    // the `.round()` might not be necessary, but I cannot be sure.
    (((self - rem) / div).round(), rem)
}
```

|
Yes, in my analysis I'd walked down this same path, done this same rearrangement of terms, and then upon hitting... (((self - rem)/div).round(), rem) // the `.round()` might not be necessary, but I cannot be sure. ...decided that I wasn't sure either, but from having read that paper knew the question was subtle, and so that's why I say it feels like it calls for a proof. The point about losing bits is a real consideration, though, and would deserve its own analysis. |
The paper only addresses the easy case, where the inputs are positive and the result is small enough that it can be exactly represented in floating point. Summary of the paper: even in that case you don't always get correct results if you do floating point division (we already know this from the example here), unless you change the rounding mode first. But there is a relatively easy solution for this case: scale mantissas to be integers, and use integer division (u128 / u64 -> u64)! A somewhat harder case is when the resulting integer is larger than 2^53 (or larger than 2^64), in which case getting the rounding correct is a little bit more subtle, but this is still doable using an integer u128 / u64 -> u64 division and a possible adjustment by 1. There are additional cases for negative numbers. I can implement this (with proofs). |
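For illustration, the "scale mantissas to be integers" idea in the easy case might look like the following sketch (my own, unproven, and limited to positive finite inputs; the names `decompose` and `floor_div_exact` are made up here, not from the promised implementation):

```rust
// Write a finite f64 as m * 2^e with integer m < 2^53.
fn decompose(x: f64) -> (u64, i32) {
    let bits = x.to_bits();
    let exp = ((bits >> 52) & 0x7ff) as i32;
    let frac = bits & ((1u64 << 52) - 1);
    if exp == 0 {
        (frac, -1074) // subnormal
    } else {
        (frac | (1 << 52), exp - 1075)
    }
}

// Exact floor(a / b) for positive finite a, b, via integer division of the
// scaled mantissas; returns None when the needed shift would overflow u128.
fn floor_div_exact(a: f64, b: f64) -> Option<u128> {
    if !(a > 0.0) || !(b > 0.0) || !a.is_finite() || !b.is_finite() {
        return None;
    }
    let (ma, ea) = decompose(a);
    let (mb, eb) = decompose(b);
    // a / b = (ma / mb) * 2^(ea - eb), exactly.
    let shift = ea - eb;
    if shift >= 0 {
        if shift > 75 {
            return None; // ma << shift could exceed u128 (ma < 2^53)
        }
        Some(((ma as u128) << shift) / mb as u128)
    } else {
        let s = -shift;
        if s > 75 {
            return Some(0); // divisor >= 2^76 > 2^53 > ma, so the floor is 0
        }
        Some(ma as u128 / ((mb as u128) << s))
    }
}

fn main() {
    // The exact quotient 11.0 / 1.1 is slightly *below* 10, because the f64
    // nearest to 1.1 is slightly above 1.1 -- so the Euclidean quotient is 9.
    assert_eq!(floor_div_exact(11.0, 1.1), Some(9));
}
```

The harder cases the comment mentions (quotients past 2^53, negative inputs) would still need the adjustment-by-1 logic and proofs.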
Oh. 🤦 Yes, I neglected the On the other hand,
I'm sure there are other inputs that would have some rounding error, but I wonder if this is generally more accurate than So regarding this suggested change:
The current

```rust
pub fn div_euclid(self, rhs: f64) -> f64 {
    (self - self.rem_euclid(rhs)) / rhs
}
```

(might need to |
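For what it's worth, a quick sanity check (not a proof) of that one-liner on the motivating input, with a free function `div_euclid_via_rem` standing in for the suggested method body:

```rust
// Stand-in for the suggested implementation of `div_euclid`.
fn div_euclid_via_rem(lhs: f64, rhs: f64) -> f64 {
    (lhs - lhs.rem_euclid(rhs)) / rhs
}

fn main() {
    let (lhs, rhs) = (11.0f64, 1.1f64);
    let n = div_euclid_via_rem(lhs, rhs);
    let r = lhs.rem_euclid(rhs);
    assert_eq!(n, 9.0);
    // For this particular input the documented identity now holds exactly,
    // though as discussed elsewhere in the thread it is not guaranteed in
    // general: three separate roundings are involved.
    assert_eq!(lhs, n * rhs + r);
}
```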
@rustbot author |
cc @BartMassey for help with floating point precision and confusion. |
👍 for computing one in terms of the other to guarantee by construction that they always have the expected relationship. Big shrug for the change to the observed behavior of |
We discussed this in the libs-api meeting and concluded that we definitely need As such, we are happy with a breaking change here that makes the result "more correct". @rfcbot fcp merge |
Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
This doesn't actually work (or at least, I don't see why it would work, haven't looked for an explicit counterexample). The guarantee currently in the docs, as I interpret it, is this: Define round(x) = the closest representable number to x. For given real numbers a, b define real numbers q, r such that a = b * q + r, q is an integer, 0 <= r < |b|. That's the "infinite precision" part. Then:
Currently,
The proposed alternative implementation in terms of |
Incurring several roundings in |
I agree that it's an improvement. It could be replaced again later with a fully correct algorithm if this one still has bad cases (I give it maybe a 25% chance that it might be always correct). |
I played a bit with quickcheck and found cases that do need rounding to get an integer result.
This one is right on the edge:
And there are probably more errors once you get past the mantissa's integer accuracy. Not necessarily worse than the status quo though, since that's attempting +/-1.0 adjustments that are also bad past the mantissa size. Also, that subtraction can overflow, like with
I did not look at any non-normal inputs yet. |
I think that even after this is merged, there needs to be a separate issue that tracks that there is a discrepancy between what the documentation guarantees and what is implemented -- the documentation specifies perfect rounding in the "Precision" section for Consider loosening the guarantee in the docs for now. But I think it would be useful to try to eventually get that guarantee for this operation. It is feasible to implement without very complicated algorithms like what you'd need for perfect rounding in |
Is that impossible / hard to achieve? Or does it conflict with the other guarantee? |
No, there is no conflict, and it's possible. It's just that the proposed one-liner above doesn't do it because of a sequence of approximations. If nobody else solves it, I'm planning to give it a try later on. |
What should the result of div_euclid be if the real-number-result is an integer which cannot be represented exactly with an f32? |
Great question. Give me a couple of days to talk to my floating-point-expert friends and try to figure out what might be best. I think there's no big panic on this (?), so let's try to get it as right as we can on the first try rather than just merging a potential improvement, I think? |
The integer should get rounded, the same way a |
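The "rounded the same way a literal would be" behavior can be illustrated with a concrete value (a small aside, not from the thread):

```rust
fn main() {
    // 2^53 + 1 = 9007199254740993 is the first integer with no exact f64
    // representation; written as a literal it rounds (ties-to-even) to 2^53.
    assert_eq!(9007199254740993.0f64, 9007199254740992.0f64);
    // Converting the integer gives the same rounding.
    assert_eq!(9007199254740993u64 as f64, 9007199254740992.0f64);
}
```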
Just a note that I've been working on this for the last few days, but American Thanksgiving and a couple of other things have gotten in the way, and the project has turned into more of a thing than I anticipated. Will try to get something written up tomorrow. |
Sigh. I'm working on a writeup of my findings so far, but today just uncovered a bunch more rabbit holes. Very brief summary, probably full of errors:
More in a while. I wouldn't recommend doing anything until we have an implementation that likely does the right thing. It's sat this long; it can sit a while longer. |
This quandary goes even deeper. I found a 2001 paper on this subject by Daan Leijen (of Koka), "Division and Modulus for Computer Scientists". In it, he gives a proof of correctness for the algorithm for On that basis, I implemented the algorithm in Rust here: But then, much to my surprise, given that the proof looks reasonable to me, and given that my original encoding of it in Rust seemed correct and well-founded, it didn't actually solve the problem. Why? In my initial encoding of this algorithm in Rust, I relied on our documentation for
And yet, this is trivially false:

```rust
fn main() {
    let (x, y) = (11f64, 1.1f64);
    assert_eq!(x - (x / y).trunc() * y, x % y);
    //~^ PANIC
    // assertion `left == right` failed
    //   left: 0.0
    //  right: 1.0999999999999992
}
```

|
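The role of `fmod`-style exactness in this counterexample can be made concrete (a small illustration of my own, using only std):

```rust
fn main() {
    let (x, y) = (11f64, 1.1f64);
    // `%` on floats behaves like C's `fmod`: it is defined via the *exact*
    // truncated quotient (here 9), and its result is exactly representable.
    assert_eq!(x % y, 1.0999999999999992);
    // But the floating-point division `x / y`, whose exact value is just
    // below 10, rounds up to 10.0 before `trunc()` can do anything:
    assert_eq!((x / y).trunc(), 10.0);
    // And 10.0 * 1.1 rounds (tie-to-even) to exactly 11.0 in f64, so the
    // documented identity's left-hand side collapses to 0.0:
    assert_eq!(x - (x / y).trunc() * y, 0.0);
}
```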
Yes, that's misleading and leaves out the crucial qualification of
Compare to how the equivalent
And later, in the Notes section
|
I believe the current
I agree that this is the best way forward.
It might not be too bad For |
cuviper pointed out a licensing issue about copying Python code, which as I understand it is PSF-owned (GPL-compatible).
This PR cannot be merged as is. I'll leave this PR open in the meantime for more discussion.
This is a breaking change trying to fix https://internals.rust-lang.org/t/bug-rounding-error-that-break-the-ensurance-of-f32-div-euclid/21917.
In summary, Rust `float::div_euclid` and Python `divmod` disagree with each other. Personally I think Python's behavior is more correct: given real numbers `a` and `b`, where `q` is the Euclidean quotient of `a` and `b` and `r` is the Euclidean remainder, `q*b + r` should be equal to `a`. Take the first example, 11.0 and 1.1. Currently with Rust, `10.0*1.1 + 1.0999998 ≈ 11.0 + 1.1 > 11.0`.
FIXME: The Python implementation uses `fmod` to get a more exact remainder, but this PR only uses the raw `%` operator. Is this a problem in practice?
Reference:
* <https://github.com/python/cpython/blob/3.13/Objects/floatobject.c#L662>
r? @ghost