Description
Currently, an operation such as ds.diff('x') returns a result whose differenced dimension is one element shorter, e.g.,
In [1]: import xarray as xr
In [2]: ds = xr.Dataset({'foo': (('x',), [1, 2, 3])}, {'x': [1, 2, 3]})
In [3]: ds
Out[3]:
<xarray.Dataset>
Dimensions:  (x: 3)
Coordinates:
  * x        (x) int64 1 2 3
Data variables:
    foo      (x) int64 1 2 3
In [4]: ds.diff('x')
Out[4]:
<xarray.Dataset>
Dimensions:  (x: 2)
Coordinates:
  * x        (x) int64 2 3
Data variables:
    foo      (x) int64 1 1
However, there are cases where it would be beneficial to keep the original size, so that you would get
In [1]: import xarray as xr
In [2]: ds = xr.Dataset({'foo': (('x',), [1, 2, 3])}, {'x': [1, 2, 3]})
In [3]: ds.diff('x', preserve_shape=True, empty_value=0)
Out[3]:
<xarray.Dataset>
Dimensions:  (x: 3)
Coordinates:
  * x        (x) int64 1 2 3
Data variables:
    foo      (x) int64 0 1 1
Is there interest in adding a preserve_shape=True keyword that produces this shape-preserving behavior? I'm proposing that it could be used with both label='upper' and label='lower'. empty_value could be a value, or empty_index could be an index for the fill value. If empty_value=None and empty_index=None, it would produce a nan.
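Until such a keyword exists, the proposed output can be approximated with the existing API by reindexing the shortened result back onto the original coordinate. The following is only a sketch: diff_preserve_shape is a hypothetical helper name, not part of xarray, and it assumes the differenced dimension has a coordinate with unique labels. It covers empty_value only, not empty_index.

```python
import numpy as np
import xarray as xr


def diff_preserve_shape(obj, dim, label="upper", empty_value=None):
    """Hypothetical helper (not an xarray API): first difference along
    `dim`, padded back to the original length of `dim`."""
    # The regular diff drops one label: the first for label='upper',
    # the last for label='lower'.
    shortened = obj.diff(dim, label=label)
    if empty_value is None:
        # Default reindex fill is NaN, matching the proposed default.
        return shortened.reindex({dim: obj[dim]})
    # Put the dropped label back, filled with the requested value.
    return shortened.reindex({dim: obj[dim]}, fill_value=empty_value)


ds = xr.Dataset({"foo": (("x",), [1, 2, 3])}, {"x": [1, 2, 3]})
upper = diff_preserve_shape(ds, "x", label="upper", empty_value=0)
lower = diff_preserve_shape(ds, "x", label="lower", empty_value=0)
padded_with_nan = diff_preserve_shape(ds, "x")
```

With label='upper' the fill value lands on the first label, and with label='lower' it lands on the last one.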
I'm asking the community because this is at least the second time I've encountered an application where this behavior would be helpful, e.g., computing ocean layer thicknesses from bottom depths. A previous application was computing a time step from time-slice output, with the aim of using the result in an approximate integral, e.g.,
y*diff(t, label='lower', preserve_shape=True)
where y and t are both of size n, which is effectively a left-sided Riemann sum.
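To make the Riemann-sum use case concrete, here is a NumPy-only sketch (not the proposed xarray keyword). It uses np.diff with its append argument to keep the time steps at length n, with a zero in the last slot, which plays the role of label='lower' with empty_value=0. The sample values of t and y are made up for illustration.

```python
import numpy as np

# Illustrative data: n = 4 time stamps and the integrand sampled at them.
t = np.array([0.0, 1.0, 3.0, 6.0])
y = np.array([2.0, 4.0, 1.0, 5.0])

# Appending the last time stamp keeps dt at length n; the final
# step is t[-1] - t[-1] = 0, so the last sample contributes nothing.
dt = np.diff(t, append=t[-1])

# Same-size arrays multiply elementwise without any slicing.
integral = np.sum(y * dt)

# Reference: the explicit left-sided Riemann sum over the n - 1 intervals.
reference = np.sum(y[:-1] * np.diff(t))
```

Both expressions give the same number; the shape-preserving form simply avoids trimming y to match the shorter difference.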