We need an ak.packed function #746

@jrueb

The guide at https://awkward-array.org/how-to-convert-buffers.html recommends using ak.to_buffers to write HDF5 files. However, the output files can easily become unnecessarily large.
Please consider the following example:

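import numpy as np
import awkward as ak
# Record array with a single field "x" holding 1000 random numbers
arr = ak.Array({"x": np.random.rand(1000)})
# Select entries 0 and 2 (an integer index list, despite the name "mask")
mask = [0, 2]
arr = arr[mask]
# Decompose the sliced array into a form, a length, and a dict of buffers
form, length, container = ak.to_buffers(arr)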

container, which is what gets saved to the file, contains an array of all 1000 numbers, even though we only want 2 of them. The 1000 is arbitrary; the amount of unused data can be much larger.
What I think would be very nice here is an option to restrict the container to only the data that is necessary. This could even be a separate function that condenses an Awkward Array so that it is compact in memory.
I know that flattening can have a similar effect, but it doesn't work on arrays with records. Surprisingly, doing something like ak.from_arrow(ak.to_arrow(arr)) has the desired effect on the array. However, this seems to be a very crude workaround.
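To make the request concrete, here is a toy pure-Python model of what I mean by condensing. It is only a sketch of the idea, not how Awkward Array is implemented: IndexedColumn and pack are hypothetical names, and plain lists stand in for the real buffers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexedColumn:
    """Toy stand-in for a lazily sliced column: `index` picks entries of `content`."""
    index: List[int]
    content: List[float]

    def to_list(self) -> List[float]:
        # The logical values are the content entries selected by the index
        return [self.content[i] for i in self.index]


def pack(column: IndexedColumn) -> IndexedColumn:
    """Return an equivalent column whose content holds only the referenced values."""
    compact = [column.content[i] for i in column.index]
    # After packing, the index is trivial: 0, 1, ..., len(compact) - 1
    return IndexedColumn(index=list(range(len(compact))), content=compact)


full = [float(i) for i in range(1000)]
lazy = IndexedColumn(index=[0, 2], content=full)   # still carries 1000 values
packed = pack(lazy)                                # carries only the 2 selected values

print(len(lazy.content), len(packed.content))
```

Both columns represent the same logical values, but only the packed one would be cheap to write to a file.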

Labels: feature (New feature or request)