@terraputix terraputix commented Aug 3, 2025

Closes #2900.

I did not add a unit test: #2900 already notes that all numcodecs codecs should be better tested in CI, but I guess that belongs in another PR.

You can verify that it works by running the following script:

#!/usr/bin/env -S uv run
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     # "zarr==3.0.2,<3.0.3", # WORKS
#     # "zarr==3.0.4", # FAILS
#     "zarr@git+https://github.com/terraputix/zarr-python.git@16bd1c7825b895d0247b08b255ffcfa214b6e150",  # WORKS
#     "numcodecs==0.15.0",
#     "zfpy==1.0.1",
#     "pcodec==0.3.2",
# ]
# ///
#


import numpy as np
import zarr
zarr.__version__ = "3.0.2"
from numcodecs.zarr3 import ZFPY, PCodec

for serializer in [
    ZFPY(mode=4, tolerance=0.01),
    PCodec(level=8, mode_spec="auto"),
]:
    array = zarr.create_array(
        store=zarr.storage.LocalStore("test"),
        shape=[2, 2],
        chunks=[2, 1],
        dtype=np.float32,
        serializer=serializer,
        compressors=None,
        overwrite=True,
    )
    array[...] = np.array([[0, 1], [2, 3]])

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the "needs release notes" label Aug 3, 2025
@terraputix terraputix force-pushed the fix-pcodec-compression branch from 83413cb to 16bd1c7 on August 6, 2025 16:01

codecov bot commented Aug 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.24%. Comparing base (62551c7) to head (b699024).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3326   +/-   ##
=======================================
  Coverage   61.24%   61.24%           
=======================================
  Files          83       83           
  Lines        9907     9907           
=======================================
  Hits         6068     6068           
  Misses       3839     3839           
Files with missing lines Coverage Δ
src/zarr/core/codec_pipeline.py 70.09% <100.00%> (ø)

@terraputix
Author

I have improved this fix so that it avoids the extra copy I introduced earlier.

Basically, this change reverts the codec pipeline modifications introduced in #2851 and provides an alternative fix for handling chunks at the end of the array when the chunk shape does not evenly divide the array shape.
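For illustration only (this is not the code from this PR or from zarr-python): a minimal sketch of what "the chunk shape does not evenly divide the array shape" means along one axis. The helper name `chunk_slices` and the sizes are hypothetical.

```python
import math


def chunk_slices(array_len: int, chunk_len: int) -> list[slice]:
    """Return the part of the array covered by each chunk along one axis."""
    n_chunks = math.ceil(array_len / chunk_len)
    return [
        # Clamp the end of the slice so the last chunk stops at the array edge.
        slice(i * chunk_len, min((i + 1) * chunk_len, array_len))
        for i in range(n_chunks)
    ]


# Hypothetical sizes: 5 elements in chunks of 2, so the last chunk is partial.
slices = chunk_slices(array_len=5, chunk_len=2)
valid_lengths = [s.stop - s.start for s in slices]
print(valid_lengths)  # [2, 2, 1]
```

The last chunk holds only one valid element even though the chunk length is 2, which is the edge case the pipeline has to handle when encoding.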

Would you mind taking a look at this one, @dcherian or @d-v-b?

@d-v-b
Contributor

d-v-b commented Aug 8, 2025

I had a look, but I don't know this part of the code well and that function has no comments or docstring(!), so I'm not sure how much my review is worth. If the tests pass and someone who knows the code a bit better gives it a thumbs-up (cc @dcherian) then I think we can merge.

but why are we fixing this in zarr python, instead of in the individual codecs?

@terraputix
Author

terraputix commented Aug 8, 2025

but why are we fixing this in zarr python, instead of in the individual codecs?

I actually tried the solution proposed in #2900 (comment), but it broke a handful of other tests in numcodecs.

This PR is basically an improved version of #2851 that also does not require copies in downstream codecs, so (I think) it should be preferred.
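As a hedged illustration of why a codec might need a copy at all: the sketch below assumes the chunk handed to a codec can be a strided view of the user's array. That is an assumption about the failure mode, not something stated in this thread. It uses only NumPy and the shapes from the verification script.

```python
import numpy as np

data = np.array([[0, 1], [2, 3]], dtype=np.float32)

# With shape (2, 2) and chunks (2, 1), each chunk corresponds to one column.
column = data[:, 0:1]

# The column is a view into `data`, and its rows are not adjacent in memory.
is_view = np.shares_memory(column, data)
is_contiguous = column.flags["C_CONTIGUOUS"]

# A codec that needs a contiguous buffer would have to copy the view first.
copied = np.ascontiguousarray(column)
copy_is_contiguous = copied.flags["C_CONTIGUOUS"]
copy_shares_memory = np.shares_memory(copied, data)

print(is_view, is_contiguous, copy_is_contiguous, copy_shares_memory)
```

If the pipeline always hands codecs a contiguous chunk buffer, no such copy is needed on the codec side.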

@d-v-b
Contributor

d-v-b commented Sep 18, 2025

@TomAugspurger / @dcherian could you have a look at this?

@d-v-b
Contributor

d-v-b commented Sep 18, 2025

@terraputix could you add your snippet as a test? Keep it as minimal as possible, but IMO it's fine to add dependencies to our test suite as needed to ensure that we catch this bug.

Development

Successfully merging this pull request may close these issues.

Some numcodecs.zarr3 codecs fail with 3.0.4+