Replies: 1 comment
- Have you tried writing to zarr instead of netCDF?
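A minimal sketch of what that could look like, assuming the processed result is still a lazy, dask-backed dataset (`my_processing_function` and the store paths here are hypothetical stand-ins):

```python
import xarray as xr

ds = xr.open_zarr("input.zarr")          # lazily opens dask-backed arrays
result = my_processing_function(ds)      # hypothetical; assumed to stay lazy

# Each dask chunk is written as its own zarr chunk, so peak memory should
# stay proportional to a handful of chunks rather than the whole dataset.
result.to_zarr("output.zarr", mode="w")
```

Zarr stores every chunk as a separate object, so chunks can be written independently as soon as they are computed, whereas the netCDF4/HDF5 backend typically serializes writes through a single file lock.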
-
I'm processing time series of satellite images (x, y, time), but my processing function operates only along time (each pixel is independent).
The processing function is quite complex (many functions spread across different modules) but, as far as I can tell, involves only xarray and numpy functions. Neither the whole input dataset nor the output dataset fits in memory, but the data are read from a zarr store with a very small chunk size in x and y, and the output dataset is written with to_netcdf. (I'm not calling load, compute or persist, of course.)
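For concreteness, here is a minimal sketch of the shape of such a pipeline; the variable name, the per-pixel function and the paths are hypothetical stand-ins, not the actual code:

```python
import numpy as np
import xarray as xr

def per_pixel(ts: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for the complex per-pixel algorithm:
    # takes a 1-D time series and returns a 1-D time series.
    return ts - np.nanmean(ts)

ds = xr.open_zarr("input.zarr")   # small chunks in x and y

out = xr.apply_ufunc(
    per_pixel,
    ds["reflectance"],            # hypothetical variable name
    input_core_dims=[["time"]],   # the function consumes the time axis...
    output_core_dims=[["time"]],  # ...and produces a new time axis
    vectorize=True,               # loop per_pixel over every (x, y) pixel
    dask="parallelized",          # keep everything lazy, one chunk at a time
    output_dtypes=[ds["reflectance"].dtype],
)

out.to_dataset(name="processed").to_netcdf("output.nc")
```

Note that dask="parallelized" requires each core dimension (time here) to fit in a single chunk, so the time axis must not be split across chunks.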
I expect very low memory usage, since my processing function operates only along time and the chunk size in x and y is very small. In the extreme case, using chunks={'x': 1, 'y': 1} should consume a bare minimum of memory, even if it is not very efficient (which I don't mind).
However, my script consumes all available memory whatever the chunk size, and stops with a MemoryError. I've reworked the algorithm to simplify the operations where I suspected a problem, but the real issue is that I can't identify where the problem is.
Is there any way to debug this kind of situation (which happens to me often), for instance by checking whether the calculation operates on chunks only, or if, when and why the whole dataset gets loaded, or whether many intermediate results are kept in memory? Is there a way to inspect dask internals to understand what's happening?
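A few standard dask inspection tools can help answer these questions; below is a sketch under stated assumptions (the store path, the variable name and `my_processing_function` are hypothetical). Each numbered step can be tried independently:

```python
import dask
import xarray as xr
from dask.diagnostics import ProgressBar, ResourceProfiler, visualize

ds = xr.open_zarr("input.zarr")         # hypothetical input store
result = my_processing_function(ds)     # hypothetical; should still be lazy

# 1. Check that the result is still chunked as expected: a variable whose
#    .chunks is None has silently been loaded into memory as numpy.
print(result["processed"].chunks)       # hypothetical variable name

# 2. Inspect the task graph size: a graph with millions of tasks can
#    exhaust memory on its own, before any chunk is even computed.
print(len(result["processed"].data.__dask_graph__()))

# 3. Profile memory while the computation actually runs.
delayed = result.to_netcdf("output.nc", compute=False)
with ResourceProfiler(dt=0.5) as rprof, ProgressBar():
    delayed.compute()
visualize([rprof])                      # bokeh plot of memory/CPU over time

# 4. Alternatively, run single-threaded so print statements or a debugger
#    inside the processing function see tasks in a deterministic order.
with dask.config.set(scheduler="synchronous"):
    delayed.compute()
```

With the distributed scheduler, the same information is available live through the dashboard of a dask.distributed Client.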