-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix bad values for high (abs) eta #172
Conversation
By the way, I'm looking at this. This would be the first |
Yeah, there were two errors, and you've fixed the non-Numba one. Removing |
An alternative implementation would look like this: And yield the following values:
Notable difference is that is turns NaN from several cases in quite unphysical zeros (when rho and z are infinities, or rho is NaN). It also turns some negative zeros positive, but that's not too imporant. I wonder why there is an issue with numba and @numba.njit
def eta(p, z):
return np.where(z, np.arcsinh(z / p), z) |
We can't leave the >>> import vector
>>> xyz = vector.obj(x=1.1, y=2.2, z=3.3)
>>> xyz.theta
0.6405223126794245
>>> xyz.eta
array(1.10358684) versus >>> import vector
>>> xyz = vector.obj(x=1.1, y=2.2, z=3.3)
>>> xyz.theta
0.6405223126794245
>>> xyz.eta
1.103586841560145 I'm thinking of a way around this with |
Oh, you beat me to it. The issue with Numba is that it's a different type, which is also an issue in the plain-object backend. Numba computes the |
This is a little disgusting, but what about >>> for z in (-np.inf, -1.1, 0.0, 1.1, np.inf, np.nan):
... print(z, np.nan_to_num((z != 0) * np.inf, posinf=np.nan))
...
-inf nan
-1.1 nan
0.0 0.0
1.1 nan
inf nan
nan nan |
Entirely without
The NaNs for rho=NaN are acceptable, but for rho=0=z it's an issue. |
So far, this one is looking good (the solution I called "disgusting" a moment ago): def xy_z(lib, x, y, z):
- return lib.where(z, lib.arcsinh(z / lib.sqrt(x**2 + y**2)), z)
+ return lib.nan_to_num(
+ lib.arcsinh(z / lib.sqrt(x**2 + y**2)),
+ nan=lib.nan_to_num((z != 0) * float("inf"), posinf=float("nan")),
+ )
def rhophi_z(lib, rho, phi, z):
- return lib.where(z, lib.arcsinh(z / rho), z)
+ return lib.nan_to_num(
+ lib.arcsinh(z / rho),
+ nan=lib.nan_to_num((z != 0) * float("inf"), posinf=float("nan")),
+ ) Looking at all the values with: >>> import vector
>>> values = [float("-inf"), -1.1, 0.0, 1.1, float("inf"), float("nan")]
>>> for x in values:
... for y in values:
... for z in values:
... xyz = vector.obj(x=x, y=y, z=z)
... print(f"{x:4} {y:4} {z:4} | {xyz.eta}") I don't think any of those |
The "disassemble check" is intended to constrain the kinds of functions that are used in compute functions, and if it had run long enough, it would have complained about |
With your proposal it looks like this:
I thinks this is ok - nobody will miss missing minus for z=-0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're happy with these results (the edge cases match your expectations), then I am, too.
inf
and nan
were only added to math
in Python 3.5, but we're only supporting Python 3.6 and above, so this is totally fine. Better, even, than constructing them from strings (though Python byte-compilation probably optimizes that sort of thing, or at least, it can).
I'm looking at the Awkward error. |
The problem is that the argument passed to |
This also seem to be the case for the numba "implementation" ... |
I was wrong about >>> array = np.array([1, 2, 3, np.nan, np.nan, 10])
>>> np.nan_to_num(array, nan=np.array([999, 999, 999, 123, 321, 999]))
array([ 1., 2., 3., 123., 321., 10.]) |
Fortunately, Numba has implemented >>> import numpy as np
>>> import numba as nb
>>> nb.__version__
'0.55.1'
>>> def f(array):
... return np.nan_to_num(array, nan=np.array([999, 999, 999, 123, 321, 999]))
...
>>> array = np.array([1, 2, 3, np.nan, np.nan, 10])
>>> f(array)
array([ 1., 2., 3., 123., 321., 10.]) Then we just need a working |
numba/numba#6857 and numba/numba#5977 I'm seeing if we can do without these workaround-implementations now. |
I was wrong—I was seeing Vector's implementation of I'm running out of time. I'll have to come back to this later. |
… (update ak.nan_to_num).
I've fixed the Numba implementation of Along the way, I learned that if you don't set Unfortunately, this PR is going to be blocked until I publish a non-release candidate version of Awkward Array on PyPI. That's overdue, anyway, so I should focus on that. |
#173. |
- should not break numba - should decay 0-rank array to scalar
Why are you reverting this to the Once Awkward 1.8.0 is released, this PR will simply pass tests with the |
I don't know what you did, but it has the right type: >>> import vector
>>> xyz = vector.obj(x=1.1, y=2.2, z=3.3)
>>> xyz.theta
0.6405223126794245
>>> xyz.eta
1.103586841560145 I'm checking more deeply now. (And I can loosen the disassemble-check to allow |
Multiplying by 1 turns a zero-dimensional array into a scalar. I had no idea! >>> np.where(0, 1, 1.1)
array(1.1)
>>> np.where(0, 1, 1.1) * 1
1.1
>>> type(np.where(0, 1, 1.1) * 1)
<class 'numpy.float64'> While that's brilliant, I'm a little uncomfortable with the solution because new backends might not be able to handle it. I've been hoping, for instance, that we'll be able to add SymPy as a backend. Do the |
I’d recommend checking against the new NumPy API. It’s a lot more strict, and it is supposed to be implemented by all the other libs eventually. I think it handles scalars very differently. |
Sympy has the Piecewise function that can express the
The extreme values do not chaange, merely the values for for FYI, the value matrix looks like follows (same as the
I'm not familiar with what you are talking about. Is there an |
I guess I have to say that I'm more uncomfortable with the solution of multiplying the result of It's a little ugly that vector/src/vector/_compute/spatial/eta.py Lines 31 to 62 in 12ae4d6
When the tests run again, they'll pick up Awkward 1.8.0 and it should pass. Then I'll accept the PR. Thanks! |
Scalars were dropped from the new API. See https://numpy.org/neps/nep-0047-array-api-standard.html. |
It looks like the Array API is now a fully drafted standard, NumPy's adoption of it (NEP 47) is under consideration, and so are the promotion rules for Python scalars (NEP 50), which is partly inspired by a NumPy users poll and also defined by the Array API. Interesting times! So NumPy will be dropping the distinction between zero-dimensional arrays and scalars (as CuPy already is; I guess they had some say in the Array API!). This would make the Thanks, @bfis, for reverting it. |
The Array API is implemented in experimental form in NumPy 1.22, use |
The current eta calculation gives very wrong values for higher (absolute) eta values. This effect is more prominent if the underlying values are
float32
. The issue lies in the used formula, which adds rho and z, which may be more orders of magnitude apart than the float numbers can represent. The solution is to use a different way (mathematically equivalent) which has greater numerical stability. The same method is used in ROOT.The behavior introduced in #139 is retained (for the most part). Otherwise, the implementation is closer to ROOT how it handles infinities. It differs from ROOT as follow:
sqrt
approximation (via Taylor series) for small (abs) etaTesting should be covered by #36.
Detailed difference between old, new, and ROOT behaviour.
The following tables were produces by this code:
Axes:
horizontal: z
vertical: rho
old
new
ROOT