(fix): structured dtype fill value consolidated metadata #3015

Open: wants to merge 5 commits into base: main

Conversation

ilan-gold
Contributor

@ilan-gold ilan-gold commented Apr 24, 2025

Closes #2998

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the "needs release notes" label (automatically applied to PRs which haven't added release notes) Apr 24, 2025
@ilan-gold ilan-gold force-pushed the ig/fix_structured_dtype_consolidated branch from 3f97416 to 8e23bf9 Compare April 24, 2025 15:21
@ilan-gold ilan-gold mentioned this pull request Apr 24, 2025
3 tasks
@tasansal
Contributor

@d-v-b is this pending on anything? We would like to see this in the next patch release as well, if possible :)

Comment on lines 321 to 331
def test_structured_dtype_fill_value_serialization(tmp_path):
    group_path = tmp_path / "test.zarr"
    root_group = zarr.open_group(group_path, mode="w", zarr_format=2)
    root_group.create_array(
        name="structured_headers",
        shape=(100, 100),
        chunks=(100, 100),
        dtype=np.dtype([("foo", "i4"), ("bar", "i4")]),
    )

    zarr.consolidate_metadata(root_group.store, zarr_format=2)
Contributor

This test is weak (it only tests that consolidate metadata doesn't error). Do you think it makes sense to make this test a bit stronger, e.g. by checking that the fill value was actually encoded the way we think it should have been?

Also, I think we need a check to ensure that if the dtype is void and the fill value is None, then there's no base64 encoding. (I know from the implementation in this PR that the test will pass, but it's good to have the test in any case.)
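
A minimal sketch of the two checks suggested above, for illustration only. It assumes the Zarr v2 convention that a non-null fill value for a structured dtype is stored in the JSON metadata as a base64 string of its raw bytes, while a null fill value stays `null`. The helper names `encode_structured_fill_value` and `decode_structured_fill_value` are hypothetical and are not part of zarr; `struct` with the format `"<ii"` stands in for the two-field `i4` dtype used in the test.

```python
import base64
import json
import struct


def encode_structured_fill_value(fill_value, fmt="<ii"):
    """Hypothetical helper: base64-encode a structured fill value.

    None is passed through unchanged, so it serializes to JSON null
    rather than to a base64 string.
    """
    if fill_value is None:
        return None
    raw = struct.pack(fmt, *fill_value)  # raw bytes of the two int32 fields
    return base64.standard_b64encode(raw).decode("ascii")


def decode_structured_fill_value(encoded, fmt="<ii"):
    """Hypothetical inverse of encode_structured_fill_value."""
    if encoded is None:
        return None
    return struct.unpack(fmt, base64.standard_b64decode(encoded))


# Check 1: a concrete fill value is encoded as base64 and survives JSON.
metadata = json.loads(
    json.dumps({"fill_value": encode_structured_fill_value((0, 0))})
)
assert isinstance(metadata["fill_value"], str)
assert decode_structured_fill_value(metadata["fill_value"]) == (0, 0)

# Check 2: a None fill value is not base64-encoded at all.
metadata_none = json.loads(
    json.dumps({"fill_value": encode_structured_fill_value(None)})
)
assert metadata_none["fill_value"] is None
```

In an actual test, the same two assertions would presumably be made against the `fill_value` entry read back from the consolidated metadata, rather than against a stand-in helper.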

Contributor Author

> This test is weak (it only tests that consolidate metadata doesn't error). Do you think it makes sense to make this test a bit stronger, e.g. by checking that the fill value was actually encoded the way we think it should have been?

Oh wow, I totally intended to do that (hence the name of the test).

> Also, I think we need a check to ensure that if the dtype is void and the fill value is None, then there's no base64 encoding

Great suggestion

Contributor Author

Thanks! Tried both!

@d-v-b
Contributor

d-v-b commented Apr 24, 2025

Thanks for the ping @tasansal. I left some comments about the test, but personally it's OK with me if those comments are ignored. Long term, it's not sustainable for us to have duplicated fill value encoding logic across the codebase, and I think some upcoming PRs will help by heavily consolidating this, but I think it's OK in the short term if this is pushed out quickly.

Labels
needs release notes Automatically applied to PRs which haven't added release notes
Development

Successfully merging this pull request may close these issues.

Structured dtype serialization with consolidated metadata fails
3 participants