FIX: handle dask ValueErrors in apply_ufunc (set allow_rechunk=True) #4392
As laid out in #4372, the behaviour before #4060 was to run [...]. If I understand the comments of @dcherian and @shoyer in #4372 correctly, [...]. This PR just checks for the dask ValueError that is raised for a non-core dimension chunk mismatch, issues a warning that the behaviour will change in the future (not sure if this is intended, please recommend otherwise), and reruns with allow_rechunk=True.
@dcherian I think this is almost good to go. Is any rework of the warning message needed?
xarray/core/computation.py
Outdated
try:
    res = gufunc(**dask_gufunc_kwargs)
except ValueError as exc:
    if "with different chunksize present" in str(exc):
This is checking for the exact wording of an error raised by dask for chunked non-core dimensions. That seems not very dependable. Can we instead check for chunked core dimensions and dask_gufunc_kwargs["allow_rechunk"] is True and raise an error on our own? I think we could copy the code and error over from before your previous PR, and modify it slightly.
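For readers following the thread, the pattern being questioned here can be sketched roughly as follows. This is a minimal, self-contained illustration under stated assumptions, not the code from the PR: fake_gufunc is a hypothetical stand-in for the dask call, its error text only imitates the matched substring, and the warning category and message are made up for the example.

```python
import warnings


def fake_gufunc(allow_rechunk=False):
    """Hypothetical stand-in for the dask call (not a real dask function)."""
    if not allow_rechunk:
        # Illustrative message; only the matched substring comes from the diff above.
        raise ValueError("Dimension with different chunksize present")
    return "result"


def call_with_fallback(gufunc, dask_gufunc_kwargs):
    """Call gufunc; on the chunk-mismatch error, warn and retry with rechunking."""
    try:
        return gufunc(**dask_gufunc_kwargs)
    except ValueError as exc:
        # Fragile: this depends on the exact wording of the upstream error.
        if "with different chunksize present" not in str(exc):
            raise
        warnings.warn(
            "chunk sizes differ; retrying with allow_rechunk=True. "
            "This behaviour will change in the future.",
            FutureWarning,  # assumed category, chosen for illustration
            stacklevel=2,
        )
        return gufunc(**{**dask_gufunc_kwargs, "allow_rechunk": True})
```

The weak point is the substring test: if the upstream wording changes, the fallback silently stops triggering and the original error propagates instead.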
Now you see me totally confused 😀
If we set allow_rechunk=True, the data will be rechunked regardless of whether the mismatch is in a core or a non-core dimension. If we then check for core dimension chunksize > 1 (as in the previous version) and raise, how should users get past this without rechunking themselves?
I'll try to come up with a proposal which doesn't rely on raised dask errors.
We could check for core dimension chunksize > 1 (as in the previous version) AND not dask_gufunc_kwargs.get("allow_rechunk")
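A rough sketch of what such a check might look like, written against plain tuples so it runs without dask. The function name check_core_dim_chunks, its arguments, and the error text are hypothetical and chosen for illustration. The chunks argument mirrors the layout of a dask array's .chunks attribute (one tuple of block sizes per dimension), and "chunksize > 1" is read here as "more than one chunk along the dimension", which is an assumption.

```python
def check_core_dim_chunks(dims, chunks, core_dims, dask_gufunc_kwargs):
    """Raise if a core dimension spans several chunks and rechunking is not allowed.

    dims:      dimension names of one input, e.g. ("x", "y")
    chunks:    block sizes per dimension, e.g. ((2, 2), (4,))
    core_dims: names of the core dimensions for this input
    """
    # If the caller opted in to rechunking, dask is allowed to merge the chunks.
    if dask_gufunc_kwargs.get("allow_rechunk"):
        return
    for dim, dim_chunks in zip(dims, chunks):
        # Only core dimensions must live in a single chunk.
        if dim in core_dims and len(dim_chunks) > 1:
            raise ValueError(
                f"dimension {dim!r} is a core dimension but consists of "
                f"{len(dim_chunks)} chunks; rechunk it into a single chunk "
                "or pass allow_rechunk=True in dask_gufunc_kwargs"
            )


# Non-core dimension "x" is chunked, core dimension "y" is a single chunk: passes.
check_core_dim_chunks(("x", "y"), ((2, 2), (4,)), {"y"}, {})
```

Because the check inspects the chunk layout directly, it does not depend on the wording of any error raised by dask.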
I think my latest changes, made according to @dcherian's comments, are now backwards compatible. Short explanation:
I made a small change to the error message.
Thanks @kmuehlbauer !
follow-up on #4060
allow_rechunk=True in apply_ufunc #4372
isort . && black . && mypy . && flake8