I know that passing metadata={"lance-encoding:compression": "zstd"} as field metadata enables some compression, but I wasn't sure whether that applies at the page level or the cell level.
I've been doing some work on compression as part of the 2.1 file format. For large values (e.g. > 4KB per value) I've been using Zstd-per-value: #3448
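To make the per-value idea concrete, here is a minimal sketch, not Lance's actual implementation. It compresses each value independently and keeps an offsets array so a single value can be decompressed without touching its neighbours. The standard-library `zlib` stands in for Zstd, and the function names `compress_per_value` and `read_value` are hypothetical.

```python
import zlib


def compress_per_value(values):
    """Compress each value on its own; return (buffer, offsets)."""
    offsets = [0]
    blobs = []
    for value in values:
        blob = zlib.compress(value)  # zlib is a stand-in for Zstd here
        blobs.append(blob)
        offsets.append(offsets[-1] + len(blob))
    return b"".join(blobs), offsets


def read_value(buffer, offsets, i):
    """Decompress only value i, located via the offsets array."""
    return zlib.decompress(buffer[offsets[i]:offsets[i + 1]])


# Four large (8 KiB) values, each compressed independently.
values = [bytes([i]) * 8192 for i in range(4)]
buffer, offsets = compress_per_value(values)
assert read_value(buffer, offsets, 2) == values[2]
```

The trade-off is that every value pays its own compression overhead, which is why this approach suits large values better than small ones.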
For smaller values I've been using FSST and/or dictionary.
That covered all my test cases but it's not an exhaustive set at the moment.
I think the main case I haven't thoroughly benchmarked is values between 128 bytes and 4KB. FSST might work fine, per-value zstd might work fine. If neither works well I have an idea that we can chunk a few values and still use an offsets array. I'm working on a paper that documents this all in more detail.
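One possible reading of the "chunk a few values" idea, sketched under assumptions: group a fixed number of values, compress each group as one block, and keep an offsets array for the values inside the decompressed block. The chunk size, the layout, and the function names are hypothetical, and `zlib` again stands in for Zstd; the design the comment has in mind may differ.

```python
import zlib

CHUNK_SIZE = 4  # hypothetical number of values per compressed chunk


def compress_chunked(values, chunk_size=CHUNK_SIZE):
    """Compress values in groups; each chunk keeps offsets into its raw bytes."""
    chunks = []
    for start in range(0, len(values), chunk_size):
        group = values[start:start + chunk_size]
        offsets = [0]
        for value in group:
            offsets.append(offsets[-1] + len(value))
        chunks.append((zlib.compress(b"".join(group)), offsets))
    return chunks


def read_chunked_value(chunks, i, chunk_size=CHUNK_SIZE):
    """Decompress the chunk holding value i, then slice it out by offset."""
    blob, offsets = chunks[i // chunk_size]
    raw = zlib.decompress(blob)
    j = i % chunk_size
    return raw[offsets[j]:offsets[j + 1]]


# Ten mid-sized values (a few hundred bytes each).
values = [f"record-{i};".encode() * 30 for i in range(10)]
chunks = compress_chunked(values)
assert read_chunked_value(chunks, 7) == values[7]
```

Reading one value now costs decompressing its whole chunk, so the chunk size trades compression ratio against random-access cost.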
We have some binary-type fields and want to compress them with lz4 or some other compression algorithm. Does Lance currently support this? If not, can you share an idea of how to implement it?