codec not available: 'pcodec' - using numcodecs.zarr3 module #2806

Closed
jhamman opened this issue Feb 6, 2025 · 3 comments
Labels
bug Potential issues with the zarr-python library

Comments

@jhamman
Member

jhamman commented Feb 6, 2025

Zarr version

3.0.1

Numcodecs version

0.15.0

Python Version

3.12

Operating System

Linux

Installation

Conda

Description

I am able to create an array using numcodecs.zarr3.PCodec as a serializer (PCodec is an array->bytes codec), but subsequent writes to the array fail with ValueError: codec not available: 'pcodec'. Is there another way to specify this codec, or is this perhaps a bug in numcodecs?

Steps to reproduce

import zarr
import numcodecs.zarr3


store = {}
compressor = numcodecs.zarr3.PCodec()
arr = zarr.create_array(store=store, shape=(10, 10), dtype='f8', serializer=compressor)
arr[:] = 2

raises

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[43], line 1
----> 1 arr[:] = 2

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/array.py:2523, in Array.__setitem__(self, selection, value)
   2521     self.vindex[cast(CoordinateSelection | MaskSelection, selection)] = value
   2522 elif is_pure_orthogonal_indexing(pure_selection, self.ndim):
-> 2523     self.set_orthogonal_selection(pure_selection, value, fields=fields)
   2524 else:
   2525     self.set_basic_selection(cast(BasicSelection, pure_selection), value, fields=fields)

File /opt/coiled/env/lib/python3.12/site-packages/zarr/_compat.py:43, in _deprecate_positional_args.<locals>._inner_deprecate_positional_args.<locals>.inner_f(*args, **kwargs)
     41 extra_args = len(args) - len(all_args)
     42 if extra_args <= 0:
---> 43     return f(*args, **kwargs)
     45 # extra_args > 0
     46 args_msg = [
     47     f"{name}={arg}"
     48     for name, arg in zip(kwonly_args[:extra_args], args[-extra_args:], strict=False)
     49 ]

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/array.py:2979, in Array.set_orthogonal_selection(self, selection, value, fields, prototype)
   2977     prototype = default_buffer_prototype()
   2978 indexer = OrthogonalIndexer(selection, self.shape, self.metadata.chunk_grid)
-> 2979 return sync(
   2980     self._async_array._set_selection(indexer, value, fields=fields, prototype=prototype)
   2981 )

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/sync.py:142, in sync(coro, loop, timeout)
    139 return_result = next(iter(finished)).result()
    141 if isinstance(return_result, BaseException):
--> 142     raise return_result
    143 else:
    144     return return_result

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/sync.py:98, in _runner(coro)
     93 """
     94 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
     95 exception, the exception will be returned.
     96 """
     97 try:
---> 98     return await coro
     99 except Exception as ex:
    100     return ex

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/array.py:1413, in AsyncArray._set_selection(self, indexer, value, prototype, fields)
   1410     _config = replace(_config, order=self.metadata.order)
   1412 # merging with existing data and encoding chunks
-> 1413 await self.codec_pipeline.write(
   1414     [
   1415         (
   1416             self.store_path / self.metadata.encode_chunk_key(chunk_coords),
   1417             self.metadata.get_chunk_spec(chunk_coords, _config, prototype),
   1418             chunk_selection,
   1419             out_selection,
   1420         )
   1421         for chunk_coords, chunk_selection, out_selection in indexer
   1422     ],
   1423     value_buffer,
   1424     drop_axes=indexer.drop_axes,
   1425 )

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/codec_pipeline.py:468, in BatchedCodecPipeline.write(self, batch_info, value, drop_axes)
    462 async def write(
    463     self,
    464     batch_info: Iterable[tuple[ByteSetter, ArraySpec, SelectorTuple, SelectorTuple]],
    465     value: NDBuffer,
    466     drop_axes: tuple[int, ...] = (),
    467 ) -> None:
--> 468     await concurrent_map(
    469         [
    470             (single_batch_info, value, drop_axes)
    471             for single_batch_info in batched(batch_info, self.batch_size)
    472         ],
    473         self.write_batch,
    474         config.get("async.concurrency"),
    475     )

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/common.py:68, in concurrent_map(items, func, limit)
     65     async with sem:
     66         return await func(*item)
---> 68 return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/common.py:66, in concurrent_map.<locals>.run(item)
     64 async def run(item: tuple[Any]) -> V:
     65     async with sem:
---> 66         return await func(*item)

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/codec_pipeline.py:403, in BatchedCodecPipeline.write_batch(self, batch_info, value, drop_axes)
    400         else:
    401             chunk_array_batch.append(chunk_array)
--> 403 chunk_bytes_batch = await self.encode_batch(
    404     [
    405         (chunk_array, chunk_spec)
    406         for chunk_array, (_, chunk_spec, _, _) in zip(
    407             chunk_array_batch, batch_info, strict=False
    408         )
    409     ],
    410 )
    412 async def _write_key(byte_setter: ByteSetter, chunk_bytes: Buffer | None) -> None:
    413     if chunk_bytes is None:

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/codec_pipeline.py:210, in BatchedCodecPipeline.encode_batch(self, chunk_arrays_and_specs)
    205     chunk_array_batch = await aa_codec.encode(
    206         zip(chunk_array_batch, chunk_specs, strict=False)
    207     )
    208     chunk_specs = resolve_batched(aa_codec, chunk_specs)
--> 210 chunk_bytes_batch = await self.array_bytes_codec.encode(
    211     zip(chunk_array_batch, chunk_specs, strict=False)
    212 )
    213 chunk_specs = resolve_batched(self.array_bytes_codec, chunk_specs)
    215 for bb_codec in self.bytes_bytes_codecs:

File /opt/coiled/env/lib/python3.12/site-packages/zarr/abc/codec.py:152, in BaseCodec.encode(self, chunks_and_specs)
    136 async def encode(
    137     self,
    138     chunks_and_specs: Iterable[tuple[CodecInput | None, ArraySpec]],
    139 ) -> Iterable[CodecOutput | None]:
    140     """Encodes a batch of chunks.
    141     Chunks can be None in which case they are ignored by the codec.
    142 
   (...)
    150     Iterable[CodecOutput | None]
    151     """
--> 152     return await _batching_helper(self._encode_single, chunks_and_specs)

File /opt/coiled/env/lib/python3.12/site-packages/zarr/abc/codec.py:407, in _batching_helper(func, batch_info)
    403 async def _batching_helper(
    404     func: Callable[[CodecInput, ArraySpec], Awaitable[CodecOutput | None]],
    405     batch_info: Iterable[tuple[CodecInput | None, ArraySpec]],
    406 ) -> list[CodecOutput | None]:
--> 407     return await concurrent_map(
    408         list(batch_info),
    409         _noop_for_none(func),
    410         config.get("async.concurrency"),
    411     )

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/common.py:68, in concurrent_map(items, func, limit)
     65     async with sem:
     66         return await func(*item)
---> 68 return await asyncio.gather(*[asyncio.ensure_future(run(item)) for item in items])

File /opt/coiled/env/lib/python3.12/site-packages/zarr/core/common.py:66, in concurrent_map.<locals>.run(item)
     64 async def run(item: tuple[Any]) -> V:
     65     async with sem:
---> 66         return await func(*item)

File /opt/coiled/env/lib/python3.12/site-packages/zarr/abc/codec.py:420, in _noop_for_none.<locals>.wrap(chunk, chunk_spec)
    418 if chunk is None:
    419     return None
--> 420 return await func(chunk, chunk_spec)

File /opt/coiled/env/lib/python3.12/site-packages/numcodecs/zarr3.py:178, in _NumcodecsArrayBytesCodec._encode_single(self, chunk_ndbuffer, chunk_spec)
    176 async def _encode_single(self, chunk_ndbuffer: NDBuffer, chunk_spec: ArraySpec) -> Buffer:
    177     chunk_ndarray = chunk_ndbuffer.as_ndarray_like()
--> 178     out = await asyncio.to_thread(self._codec.encode, chunk_ndarray)
    179     return chunk_spec.prototype.buffer.from_bytes(out)

File /opt/coiled/env/lib/python3.12/functools.py:995, in cached_property.__get__(self, instance, owner)
    993 val = cache.get(self.attrname, _NOT_FOUND)
    994 if val is _NOT_FOUND:
--> 995     val = self.func(instance)
    996     try:
    997         cache[self.attrname] = val

File /opt/coiled/env/lib/python3.12/site-packages/numcodecs/zarr3.py:106, in _NumcodecsCodec._codec(self)
    104 @cached_property
    105 def _codec(self) -> numcodecs.abc.Codec:
--> 106     return numcodecs.get_codec(self.codec_config)

File /opt/coiled/env/lib/python3.12/site-packages/numcodecs/registry.py:53, in get_codec(config)
     51 if cls:
     52     return cls.from_config(config)
---> 53 raise ValueError(f'codec not available: {codec_id!r}')

ValueError: codec not available: 'pcodec'
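
The last frames of the traceback suggest why array creation succeeds while the first write fails: `_codec` is a `cached_property` that only calls `numcodecs.get_codec` when a chunk is first encoded. Below is a simplified, self-contained model of that lazy lookup. It is a sketch, not the numcodecs implementation; `REGISTRY` and `LazyCodecWrapper` are hypothetical names introduced for illustration.

```python
from functools import cached_property

# Hypothetical stand-in for a codec registry. It is left empty here to model
# an optional codec whose backing package is not installed.
REGISTRY: dict[str, type] = {}


def get_codec(config: dict):
    """Look up a codec class by id and raise the same kind of error as in the traceback."""
    codec_id = config["id"]
    cls = REGISTRY.get(codec_id)
    if cls:
        return cls()
    raise ValueError(f"codec not available: {codec_id!r}")


class LazyCodecWrapper:
    """Hypothetical wrapper that stores only a config until the codec is first used."""

    def __init__(self, codec_id: str) -> None:
        # No registry lookup happens at construction time.
        self.codec_config = {"id": codec_id}

    @cached_property
    def _codec(self):
        # The first attribute access triggers the lookup.
        return get_codec(self.codec_config)

    def encode(self, data: bytes) -> bytes:
        return self._codec.encode(data)


message = None
wrapper = LazyCodecWrapper("pcodec")  # succeeds, analogous to create_array
try:
    wrapper.encode(b"\x00")  # fails, analogous to arr[:] = 2
except ValueError as err:
    message = str(err)

print(message)
```

In this model the constructor never touches the registry, so a missing codec only surfaces on the first `encode` call.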

Additional output

No response

@jhamman added the bug label (Potential issues with the zarr-python library) on Feb 6, 2025
@jhamman changed the title from "codec not available: 'pcodec' using numcodecs.zarr3 module" to "codec not available: 'pcodec' - using numcodecs.zarr3 module" on Feb 6, 2025
@LDeakin
Contributor

LDeakin commented Feb 7, 2025

You need pcodec installed in your Python environment, or install numcodecs with its pcodec extra (numcodecs[pcodec]).
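
One way to catch this earlier is to check for the optional dependency before creating the array. The sketch below uses only the standard library; `missing_modules` is a hypothetical helper, and it assumes the package's import name is `pcodec`.

```python
import importlib.util


def missing_modules(*names: str) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: the optional dependency is importable as "pcodec".
missing = missing_modules("pcodec")
if missing:
    print("Missing optional dependencies:", missing)
    print("Try: pip install 'numcodecs[pcodec]'")
```

Running this check at startup turns a failure on the first write into an explicit message about the missing package.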

@jakirkham
Member

Also looks like this is not currently packaged in conda-forge

Have raised issue: conda-forge/staged-recipes#29051

It should be possible to create a recipe using Grayskull, which could then be added via a staged-recipes PR.

@jhamman
Member Author

jhamman commented Feb 7, 2025

@LDeakin - thanks so much. This is obvious now. I got myself tied into knots because of prior errors when using compressors=[PCodec()], which does not work for other reasons. In the end, I was just missing the dependency 🤦.

@jhamman closed this as completed on Feb 7, 2025