[EPIC] Support Puffin file format #744
Comments
Can someone assign this ticket to me to avoid confusion?
We have also implemented Puffin inside greptimedb (https://github.com/GreptimeTeam/greptimedb/tree/main/src/puffin) and would like to push this forward together 🍻
Great! Looking forward to seeing your contribution!
Part of #744

# Summary
- Add PuffinReader

# Context
- This is the fourth of a number of PRs to add support for the Iceberg Puffin file format.
- It might be helpful to refer to the overarching [PR](#714) from which these changes were split to better understand how they fit into the larger picture.
- It may also be helpful to refer to the Java reference implementation for PuffinReader [here](https://github.com/apache/iceberg/blob/8cd5b1985d3f9c55ab2ced174559a8416b6ca1b4/core/src/main/java/org/apache/iceberg/puffin/PuffinReader.java#L123).
Part of #744

# Summary
- Add PuffinWriter

# Context
- This is the fifth of a number of PRs to add support for the Iceberg Puffin file format.
- It might be helpful to refer to the overarching PR #714 from which these changes were split to better understand how they fit into the larger picture.
- It may also be helpful to refer to the Java reference implementation for PuffinWriter [here](https://github.com/apache/iceberg/blob/1d9fefeb9680d782dc128f242604903e71c32f97/core/src/main/java/org/apache/iceberg/puffin/PuffinWriter.java#L43).
## Which issue does this PR close?
Part of #744

## What changes are included in this PR?
- Make Puffin APIs public
- Turn dead-code warning on (disabled earlier to allow for private development)

## Are these changes tested?
N/A
Going to close this ticket as IMO the major pieces are complete (feel free to reopen if you disagree). @liurenjie1024 it would be great if you could add this to the 0.5.0 milestone (for better visibility).
Thanks @fqaiser94, sounds reasonable to me.
Why
There has been increased interest in Iceberg's Puffin file format recently. This is partially driven by the fact that the Iceberg V3 Spec added support for deletion vectors, which are expected to be stored in Puffin files.
However, as was recently noted on the dev-list, currently only the iceberg-java SDK supports reading or writing Puffin files. Iceberg-rust itself has zero support for the Puffin file format today. The purpose of this ticket is to change that by adding support for Puffin to iceberg-rust and iceberg-python (through exposed bindings).
How
I have already raised a PR for adding Puffin support to iceberg-rust here: #714. However, as that PR is quite large, I am splitting up the code and submitting it as multiple PRs in the following order:
- feat(puffin): Add Python bindings
- Optimize puffin file metadata parsing (see comment for more details)