Storing variable-length strings in HDF5 format #124

@WardLT

Description

Our current HDF5 writer saves string columns as fixed-length, setting the width to the longest string in the data being written. This only works when you have seen all possible values of the string, which is not true if you're appending to an existing HDF5 file.

We should explore ways of at least warning users about this behavior and, ideally, providing ways around it. Ideas include:

  • Warning users if new strings exceed the fixed length
  • Letting users set an expected maximum for lengths
  • Storing string columns in separate VLArray storage
  • Rebuilding a Table with a new, longer column if a new entry exceeds the current length
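
To make the first two ideas concrete, here is a rough sketch in plain Python. The helper names `check_string_lengths` and `choose_itemsize` are hypothetical and not part of our current writer; the sketch assumes the column width is measured in bytes of the UTF-8 encoded string.

```python
import warnings


def check_string_lengths(values, itemsize, encoding="utf-8"):
    """Return the values whose encoded length exceeds ``itemsize`` bytes.

    ``itemsize`` is assumed to be the fixed width (in bytes) of the
    existing string column. A warning is issued if any value is too long.
    """
    too_long = [v for v in values if len(v.encode(encoding)) > itemsize]
    if too_long:
        longest = max(len(v.encode(encoding)) for v in too_long)
        warnings.warn(
            f"{len(too_long)} string(s) exceed the fixed column width of "
            f"{itemsize} bytes (longest is {longest} bytes) and will not fit",
            UserWarning,
            stacklevel=2,
        )
    return too_long


def choose_itemsize(values, expected_max=None, encoding="utf-8"):
    """Pick a column width: the longest value seen so far, or the
    user-supplied expected maximum if that is larger."""
    observed = max((len(v.encode(encoding)) for v in values), default=0)
    return max(observed, expected_max or 0)


# Hypothetical usage: reserve room up front, then check later batches.
itemsize = choose_itemsize(["H2O", "NaCl"], expected_max=16)
overflow = check_string_lengths(["C6H12O6"], itemsize)  # fits, no warning
```

The check could run before each append, and a non-empty result could be the trigger for the rebuild or VLArray options above.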
