bumping max se versions and age #321
Closed
Summary
Problem:
The current CDC (Change Data Capture) benchmark process requires streaming data into OpenHouse for periods up to 24 hours. After the stream is manually stopped (sometime after the 24-hour mark), operators need to perform two key actions:
The primary challenge is that routine snapshot expiration can delete the historical data versions required for these rollbacks. If snapshots expire too soon, rolling back becomes impossible. This necessitates a complete re-ingestion of the data, which can delay testing by up to 24 hours.
Solution:
To ensure that necessary snapshots are retained for the duration of the CDC benchmark and subsequent testing, we are increasing a configuration related to snapshot retention (referred to as the "ceiling") to 900. This new value is derived as follows:
Calculation:
(24 hours/day * 60 minutes/hour / 5 minutes/commit) * 3 days = 288 commits/day * 3 days = 864 commits
Setting the ceiling to 900 provides a sufficient buffer above the calculated 864 commits. This prevents premature snapshot expiration, allowing operators to reliably perform rollbacks as required by the benchmark, thus avoiding lengthy data re-ingestion delays.
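The arithmetic above can be checked with a short sketch. The function and variable names here are illustrative assumptions, not identifiers from the OpenHouse codebase; the 5-minute commit interval and 3-day window are taken from the calculation above.

```python
def required_snapshot_ceiling(commit_interval_minutes: int, retention_days: int) -> int:
    """Number of commits produced over the retention window,
    assuming one commit per interval, around the clock."""
    commits_per_day = (24 * 60) // commit_interval_minutes
    return commits_per_day * retention_days


# Values from the PR description: one commit every 5 minutes, kept for 3 days.
required = required_snapshot_ceiling(commit_interval_minutes=5, retention_days=3)

# Hypothetical name for the new ceiling value proposed in this PR.
CEILING = 900
headroom = CEILING - required

print(required)  # 864
print(headroom)  # 36
```

Under these assumptions the ceiling of 900 leaves room for 36 extra commits, roughly three hours of additional streaming at the 5-minute cadence.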
In addition, max_age takes precedence over the maximum version count, so even with a ceiling of 900 versions, retention would still be capped at 3 days. We must therefore also bump max_age to 15 days.
As an aside, this change also makes these parameters static constants for clarity.
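To illustrate why both limits need to move together, here is a simplified model of the interaction. It assumes a snapshot survives only if it satisfies both the age limit and the version limit, with one commit every 5 minutes; it is a sketch of the reasoning in this description, not the actual expiration logic, and the function name is hypothetical.

```python
def effective_retained_snapshots(
    max_versions: int, max_age_days: int, commit_interval_minutes: int = 5
) -> int:
    """Snapshots that survive expiration under a simplified model:
    whichever of the two limits is tighter decides the outcome."""
    snapshots_within_age = (max_age_days * 24 * 60) // commit_interval_minutes
    return min(max_versions, snapshots_within_age)


# With a 3-day age limit, the age limit binds before the 900-version ceiling.
print(effective_retained_snapshots(max_versions=900, max_age_days=3))   # 864

# With a 15-day age limit, the 900-version ceiling becomes the binding limit.
print(effective_retained_snapshots(max_versions=900, max_age_days=15))  # 900
```

In this model, raising only the version ceiling has no effect while the age limit stays at 3 days, which is why the age limit is raised in the same change.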
Changes
Testing Done
Relying on existing unit tests.
Additional Information