[ENH]: wire GC v2 to new cleanup modes & call FinishDatabaseDeletion from garbage collector #4671
Garbage Collector V2 Modes, Soft/Hard Delete for Databases, and Per-Tenant Cleanup Control

This PR introduces two new garbage collection modes.

Potential Impact:

- Functionality: Enables per-tenant and system-wide selection of the new GC logic, changes database deletion semantics (databases are no longer hard-deleted immediately), and ensures a database is only finally removed once all of its collections are gone and it is past the cutoff.
- Performance: No direct regression expected; the v2 orchestrator may have efficiency differences for complex fork graphs. The soft/hard delete pattern minimizes database contention during deletions.
- Security: No security issues introduced; hard deletes are only performed on eligible databases.
- Scalability: Prepares the system for larger, multi-tenant topologies by decoupling database cleanup from immediate collection drop actions.

Code Quality Assessment:

- all test files: Thorough unit, integration, and property-based tests added or extended; edge cases well covered.
- rust/garbage_collector/src/garbage_collector_component.rs: Well structured and extensively modified; handles orchestrator switching and the `FinishDatabaseDeletion` invocation.
- go/pkg/sysdb/metastore/db/dao/database.go: Soft/hard deletion implementation is clear; proper use of transactions and a new method for the eligibility check.
- idl/chromadb/proto/coordinator.proto: Proto extended in a backward-compatible manner; enums and new messages for the finish logic are correct.
- rust/garbage_collector/src/config.rs: Config upgrade is backward compatible, with the new field correctly defaulted.
- rust/garbage_collector/src/types.rs: `CleanupMode` extension and new response struct are properly scoped.

Potential Issues:

- If a tenant's per-tenant cleanup mode is misconfigured, collections may not be cleaned as intended.

This summary was automatically generated by @propel-code-bot
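To make the soft/hard delete rule in the summary concrete, here is a minimal in-memory sketch of the eligibility check it describes: a soft-deleted database is hard-deleted only when it has no remaining collections and was soft-deleted before the cutoff. The struct, its fields, and both function names are hypothetical; the actual check lives in the Go sysdb DAO and is not shown on this page, and reading "past cutoff" as "soft-deleted before the cutoff time" is an assumption.

```rust
use std::time::SystemTime;

/// Hypothetical in-memory view of a soft-deleted database.
struct SoftDeletedDatabase {
    name: String,
    /// When the database was soft-deleted.
    deleted_at: SystemTime,
    /// Collections that have not been cleaned up yet.
    remaining_collections: usize,
}

/// Assumed rule: eligible only if no collections remain and the
/// soft delete happened strictly before the cutoff.
fn eligible_for_hard_delete(db: &SoftDeletedDatabase, cutoff: SystemTime) -> bool {
    db.remaining_collections == 0 && db.deleted_at < cutoff
}

/// Splits soft-deleted databases into the names that would be hard-deleted
/// and the entries that stay soft-deleted for a later GC cycle.
fn finish_database_deletion_sketch(
    dbs: Vec<SoftDeletedDatabase>,
    cutoff: SystemTime,
) -> (Vec<String>, Vec<SoftDeletedDatabase>) {
    let (eligible, kept): (Vec<_>, Vec<_>) = dbs
        .into_iter()
        .partition(|db| eligible_for_hard_delete(db, cutoff));
    (eligible.into_iter().map(|db| db.name).collect(), kept)
}
```

Running this once per GC cycle matches the description's "at the end of each GC cycle": a database that still has collections is simply retried on a later cycle.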
…from garbage collector (chroma-core#4671)

## Description of changes

Changes:

- Adds two new cleanup modes, `DryRunV2` and `DeleteV2`. When the default cleanup mode is set to one of these, or when one is set per tenant, the v2 garbage collector orchestrator is used. Otherwise, the old orchestrator is used. The default remains the same (the old orchestrator in dry run mode).
- Adds a field to the garbage collector config for the root manager cache config. The field has a default.
- Calls `FinishDatabaseDeletion` at the end of each GC cycle. This transitions eligible soft-deleted databases to hard-deleted databases. I added a test for this.

## Test plan

_How are these changes tested?_

- [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust

## Documentation Changes

_Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs section](https://github.com/chroma-core/chroma/tree/main/docs/docs.trychroma.com)?_

n/a
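The orchestrator selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `garbage_collector_component.rs`: only `DryRunV2` and `DeleteV2` are names taken from this PR, while the two older variant names, `GcConfig`, its fields, and `Orchestrator` are hypothetical.

```rust
use std::collections::HashMap;

/// Cleanup modes. `DryRunV2` and `DeleteV2` come from the PR description;
/// `DryRun` and `Delete` are placeholder names for the pre-existing modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CleanupMode {
    DryRun,
    Delete,
    DryRunV2,
    DeleteV2,
}

impl CleanupMode {
    /// True for the modes that should route to the v2 orchestrator.
    fn is_v2(self) -> bool {
        matches!(self, CleanupMode::DryRunV2 | CleanupMode::DeleteV2)
    }
}

/// Hypothetical marker for which orchestrator handles a tenant.
#[derive(Debug, PartialEq, Eq)]
enum Orchestrator {
    V1,
    V2,
}

/// Hypothetical config shape: a default mode plus per-tenant overrides.
struct GcConfig {
    default_mode: CleanupMode,
    tenant_mode_overrides: HashMap<String, CleanupMode>,
}

impl GcConfig {
    /// A per-tenant override wins; otherwise fall back to the default mode.
    fn mode_for(&self, tenant: &str) -> CleanupMode {
        self.tenant_mode_overrides
            .get(tenant)
            .copied()
            .unwrap_or(self.default_mode)
    }

    /// Use the v2 orchestrator only when the effective mode is a v2 mode.
    fn orchestrator_for(&self, tenant: &str) -> Orchestrator {
        if self.mode_for(tenant).is_v2() {
            Orchestrator::V2
        } else {
            Orchestrator::V1
        }
    }
}
```

With a default of dry run on the old orchestrator, a tenant overridden to `DeleteV2` would be routed to v2 while every other tenant stays on the old path, which is the opt-in rollout the description implies.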