refactor: consolidate snapshot expiration into MaintenanceTable #2143
…h a new Expired Snapshot class. updated tests.
ValueError: Cannot expire snapshot IDs {3051729675574597004} as they are currently referenced by table refs.
Moved expiration-related methods from `ExpireSnapshots` to `ManageSnapshots` for improved organization and clarity. Updated corresponding pytest tests to reflect these changes.
Re-ran the `poetry run pre-commit run --all-files` command on the project.
Moved: the functions for expiring snapshots to their own class.
…ng it in a separate issue. Fixed: unrelated changes caused by a fork/branch sync issues.
Co-authored-by: Fokko Driesprong <[email protected]>
Implemented logic to protect the HEAD branches or Tagged branches from being expired by the `expire_snapshot_by_id` method.
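For readers following the thread, the protection check described in these commits can be sketched roughly as below. This is a minimal illustration, not the PR's actual code: the function name `check_not_protected` and the plain `refs` dictionary (ref name to snapshot ID) are assumptions made for the example. Only the error text is taken from the message quoted above.

```python
from typing import Dict, Iterable, Set


def check_not_protected(snapshot_ids: Iterable[int], refs: Dict[str, int]) -> None:
    """Raise ValueError if any requested snapshot ID is referenced by a branch or tag.

    `refs` is a hypothetical mapping of ref name -> snapshot ID, used here
    only to illustrate the idea; it is not the library's data model.
    """
    protected: Set[int] = set(refs.values())
    conflicts = set(snapshot_ids) & protected
    if conflicts:
        raise ValueError(
            f"Cannot expire snapshot IDs {conflicts} as they are currently referenced by table refs."
        )


# Hypothetical refs: one branch head and one tag.
refs = {"main": 3051729675574597004, "v1-tag": 42}

# No ref points at 7 or 8, so this call passes silently.
check_not_protected([7, 8], refs)

# The branch head is protected, so this call raises.
try:
    check_not_protected([3051729675574597004], refs)
    message = ""
except ValueError as exc:
    message = str(exc)
```

Collecting all conflicting IDs before raising lets the caller see every protected snapshot in one error instead of discovering them one at a time.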
@Fokko @jayceslesar let me know if you guys prefer i stack this pr into the #1200 or if you both would rather i wait until the #1200 is merged into
Great seeing this PR @ForeverAngry, thanks again for working on this! I'm okay with first merging #1200, but we could also merge this first, and adapt the remove orphan files routine to use
@Fokko did you decide if you wanted me to stay stacked on the delete orphans pr, or go ahead and prepare the pr for this, to the main branch?
a6c3b63 to
9937894
Compare
- Keep the table.maintenance.expire_snapshots() API signature
- Return the existing ExpireSnapshots class that extends UpdateTableMetadata
- Enable transaction semantics with context manager support
- Focus this PR on API refactoring, move complex retention logic to separate PR
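As a rough picture of what "transaction semantics with context manager support" can mean for a builder-style update object, here is a self-contained sketch. The class `ExpireSketch` and its `log` argument are hypothetical stand-ins invented for this example; they are not pyiceberg classes. The choice to commit only on a clean exit is an assumption of the sketch, not a statement about how `UpdateTableMetadata` behaves.

```python
from typing import List, Set


class ExpireSketch:
    """Hypothetical pending-update object; not pyiceberg's ExpireSnapshots."""

    def __init__(self, committed: List[Set[int]]) -> None:
        self._pending: Set[int] = set()
        self._committed = committed  # records each commit, for illustration only

    def by_id(self, snapshot_id: int) -> "ExpireSketch":
        self._pending.add(snapshot_id)
        return self  # returning self is what enables method chaining

    def commit(self) -> None:
        # Apply everything staged so far as a single unit.
        self._committed.append(set(self._pending))
        self._pending.clear()

    def __enter__(self) -> "ExpireSketch":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Assumption of this sketch: commit only if the block raised nothing.
        if exc_type is None:
            self.commit()


log: List[Set[int]] = []

# Context-manager style: both IDs are staged, then committed together on exit.
with ExpireSketch(log) as expire:
    expire.by_id(1)
    expire.by_id(2)

# Method-chaining style: the caller commits explicitly.
ExpireSketch(log).by_id(3).commit()
```

Both styles end in exactly one commit per update object, which is the property the commit message is asking the refactor to preserve.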
@Fokko @kevinjqliu let me know if this commit looks like what you both were expecting.
…at original suggestions
kevinjqliu
left a comment
thanks @ForeverAngry i think this pr is very close. we need to get it to a good shape for merge and then we can release 0.10 :)
…ected and unprotected snapshots
…mp, with updated docstrings
@Fokko can you kick off the workflows for me?
kevinjqliu
left a comment
lgtm, lets revert the irrelevant docs and this thing is ready to merge!!
caught up with @ForeverAngry on slack, helped fix the merge issue with the documentation
kevinjqliu
left a comment
LGTM!! Thanks a bunch for the persistence to get this PR into a good state :)
| # Method chaining | ||
| table.maintenance.expire_snapshots().by_id(12345).commit() |
This is the same example as above?
| # Method chaining | |
| table.maintenance.expire_snapshots().by_id(12345).commit() |
| for snapshot_id in snapshot_ids: | ||
| expire.by_id(snapshot_id) |
Why not use `by_ids`? This makes the example a bit more concise:
| for snapshot_id in snapshot_ids: | |
| expire.by_id(snapshot_id) | |
| expire.by_ids(snapshot_ids) |
Fokko
left a comment
This looks good @ForeverAngry
Thanks for your work and all the patience 👍
…he#2143)

# Rationale for this change

- Consolidates snapshot expiration functionality from the standalone `ExpireSnapshots` class into the `MaintenanceTable` class for a unified maintenance API.
- Resolves planned work left over from apache#1880, and closes apache#2142
- Achieves feature and API parity with the Java implementation for snapshot retention and table maintenance.

# Features & Enhancements

- Introduces `table.maintenance.expire_snapshots()` as the unified entry point for snapshot expiration and future maintenance operations.
- Retains the existing `ExpireSnapshots` implementation internally. The `expire_snapshots()` method on `MaintenanceTable` now returns an `ExpireSnapshots` object, preserving transaction semantics and supporting context manager usage:

  ```python
  with table.maintenance.expire_snapshots() as expire_snapshots:
      expire_snapshots.by_id(1)
      expire_snapshots.by_id(2)
  ```

- Focuses this PR on refactoring and documentation improvements, while maintaining compatibility with the prior `ExpireSnapshots` interface.
- Sets a foundation for future expansion of the `MaintenanceTable` abstraction to encapsulate additional maintenance operations.

# Bug Fixes & Cleanups

- **ManageSnapshots Cleanup (apache#2151)**
  - Removes an unrelated instance variable from the `ManageSnapshots` class, aligning with the Java reference implementation.

# Testing & Documentation

- **Testing:**
  - Tested the new API interface including:
    - Expiration by ID
    - Protection of branch/tag snapshots
- **Documentation:**
  - Added and updated documentation to describe:
    - API usage examples

Preview:

<img width="1686" height="1015" alt="Screenshot 2025-08-11 at 1 37 04 PM" src="https://github.com/user-attachments/assets/f469f3fc-b4b1-4ec9-b1ca-b9185e22643e" />

# Are these changes tested?

Yes. All changes are tested.~, with this PR predicated on the final changes from apache#1200.~ This work builds on the framework introduced by @jayceslesar in apache#1200 for the `MaintenanceTable`.

# Are there any user-facing changes?

---

**Closes:**

- Closes apache#2151
- Closes apache#2142

---------

Co-authored-by: Fokko Driesprong <[email protected]>
Co-authored-by: Kevin Liu <[email protected]>