Closed
Description
Data variables appear to be lost when writing an xarray.DataTree to Icechunk and reading it back: in the example below, the variables in the child nodes set1 and set2 are missing after the round trip. I'd be glad to look into whether this relates to upstream issues (e.g., pydata/xarray#9960), but first wanted to check whether there's a known solution.
MVCE
import zarr
import icechunk
import xarray as xr

set1_data = xr.Dataset({"a": 0, "b": 1})
set2_data = xr.Dataset({"a": ("x", [2, 3]), "b": ("x", [0.1, 0.2])})
root_data = xr.Dataset({"a": ("y", [6, 7, 8]), "set0": ("x", [9, 10])})
root = xr.DataTree.from_dict(
    {
        "": root_data,
        "set1": set1_data,
        "set1/set1": None,
        "set1/set2": None,
        "set2": set2_data,
        "set2/set1": None,
        "set3": None,
    }
)

storage_config = icechunk.s3_storage(
    bucket="nasa-veda-scratch",
    prefix="icechunk-test/max/xr-datatree-roundtrip",
    region="us-west-2",
)
repo = icechunk.Repository.create(storage_config)
session = repo.writable_session("main")
root.to_zarr(session.store, zarr_format=3, consolidated=False)
session.commit("Commit datatree")

roundtripped = xr.open_datatree(session.store, engine="zarr")
xr.testing.assert_equal(root, roundtripped)
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[4], line 11
9 session.commit("Commit datatree")
10 roundtripped = xr.open_datatree(session.store, engine="zarr")
---> 11 xr.testing.assert_equal(root, roundtripped)
[... skipping hidden 1 frame]
File /opt/conda/lib/python3.11/site-packages/xarray/testing/assertions.py:138, in assert_equal(a, b, check_dim_order)
136 assert a.equals(b), formatting.diff_coords_repr(a, b, "equals")
137 elif isinstance(a, DataTree):
--> 138 assert a.equals(b), diff_datatree_repr(a, b, "equals")
139 else:
140 raise TypeError(f"{type(a)} not supported by assertion comparison")
AssertionError: Left and right DataTree objects are not equal

Data at node 'set1' does not match:
    Data variables only on the left object:
        a  int64 8B 0
        b  int64 8B 1

Data at node 'set2' does not match:
    Differing dimensions:
        (x: 2) != ()
    Data variables only on the left object:
        a  (x) int64 16B 2 3
        b  (x) float64 16B 0.1 0.2