fix(foundation): prevent integer overflow in RetryPolicy::get_delay #1326
Open
kshirajahere wants to merge 1 commit into mofa-org:main from
2u64.pow(retry_count) overflows for retry_count >= 64:
- debug builds: arithmetic-overflow panic -> crash
- release builds: wraps to 0 -> delay becomes 0 -> unthrottled
retry storm that DOSes both the agent and the upstream service
Switch to checked_pow + checked_mul and saturate at max_delay_ms when
either operation would overflow. This is the correct behaviour anyway:
if 2^retry_count * base_ms exceeds the configured cap, use the cap.
Adds regression test covering retry_count values 55, 63, 64, 65, 100,
1000, and u32::MAX, all of which overflowed with the old code.
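The two failure modes described above can be reproduced with standard-library integer methods alone. The sketch below is illustrative and not taken from the patch: `wrapping_delay` and `checked_delay` are hypothetical helper names, and the wrapping variant only models how a release build (overflow checks off) evaluates the old expression.

```rust
/// Models the old expression `base_ms * 2u64.pow(retry_count)` as a release
/// build evaluates it (overflow checks off), using explicit wrapping ops.
/// Hypothetical helper, not part of mofa-foundation.
fn wrapping_delay(base_ms: u64, retry_count: u32) -> u64 {
    base_ms.wrapping_mul(2u64.wrapping_pow(retry_count))
}

/// The same expression with checked arithmetic: `None` signals that either
/// the power or the multiplication would not fit in a u64.
/// Hypothetical helper, not part of mofa-foundation.
fn checked_delay(base_ms: u64, retry_count: u32) -> Option<u64> {
    2u64.checked_pow(retry_count)
        .and_then(|factor| base_ms.checked_mul(factor))
}
```

With a base of 1000 ms, `wrapping_delay(1000, 64)` evaluates to 0 (the zero-delay case), while `checked_delay(1000, 64)` returns `None`, which the caller can map to the configured cap.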
Please ignore my previous one-word review comment (ignore). It was an accidental tooling artifact.
Summary
RetryPolicy::get_delay in mofa-foundation used bare u64::pow() and * for exponential backoff, causing integer overflow when retry_count is large. Replace with checked_pow + checked_mul, saturating at max_delay_ms on overflow (which is the intentional cap anyway).
Related Issues
Closes #1325
Context
retry_count is a u32 parameter and max_retries in RetryPolicy is also u32. Any workflow config with max_retries >= 55 (combined with the default retry_delay_ms = 1000) will trigger the overflow in the exponential path. The existing unit test only exercised counts 0-10, leaving the overflow undetected.
Changes
crates/mofa-foundation/src/workflow/node.rs
- RetryPolicy::get_delay: replace self.retry_delay_ms * 2u64.pow(retry_count) with checked_pow + checked_mul, unwrapping to max_delay_ms on overflow
- test_retry_policy_large_count_does_not_overflow regression test covering counts 55, 63, 64, 65, 100, 1000, and u32::MAX
How you Tested
cargo test -p mofa-foundation test_retry_policy
Results:
test workflow::node::tests::test_retry_policy ... ok
test workflow::node::tests::test_retry_policy_large_count_does_not_overflow ... ok
Screenshots / Logs
Breaking Changes
The fix changes only the internal arithmetic. The public API, default values, and observable behaviour for normal retry counts (0-53 with default delay) are identical.
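The boundary between "normal" and overflowing retry counts can be checked mechanically. The following is a small sketch, assuming plain u64 arithmetic and ignoring the max_delay_ms cap; `first_overflowing_count` is a hypothetical helper introduced only for this illustration.

```rust
/// Smallest retry count for which `base_ms * 2^count` no longer fits in a
/// u64. The search always terminates by 64, because 2^64 overflows on its
/// own regardless of `base_ms`. Hypothetical helper for illustration.
fn first_overflowing_count(base_ms: u64) -> u32 {
    (0..=64u32)
        .find(|&count| {
            2u64.checked_pow(count)
                .and_then(|factor| base_ms.checked_mul(factor))
                .is_none()
        })
        .unwrap_or(64)
}
```

For the default base of 1000 ms this yields 55, which agrees with the threshold given under Context: 1000 * 2^54 still fits in a u64, while 1000 * 2^55 does not. The 0-53 range quoted above therefore sits entirely inside the non-overflowing region.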
Checklist
Code Quality
- cargo fmt run
- cargo clippy passes without warnings
Testing
- cargo test passes locally without any error
Documentation
PR Hygiene
main
Deployment Notes
No migration required. Single crate, single function, no API change.
Additional Notes for Reviewers
The unwrap_or(self.max_delay_ms) default is intentional: if the arithmetic would overflow, the delay is already beyond the configured maximum, so returning max_delay_ms is both mathematically correct and safe. The subsequent .min(self.max_delay_ms) is kept as a belt-and-suspenders guard for the normal (non-overflow) path.
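A minimal sketch of how the pieces described in this note might fit together is shown below. It is not the code from the diff: the struct here is a hypothetical stand-in that carries only the two fields named in this PR (retry_delay_ms and max_delay_ms), and the real RetryPolicy may have further fields and branches.

```rust
/// Hypothetical stand-in for the real type; only the two fields mentioned
/// in this PR are modelled.
struct RetryPolicy {
    retry_delay_ms: u64,
    max_delay_ms: u64,
}

impl RetryPolicy {
    /// Exponential backoff: retry_delay_ms * 2^retry_count, capped.
    fn get_delay(&self, retry_count: u32) -> u64 {
        2u64.checked_pow(retry_count)
            // Either step returning None means the true value exceeds u64.
            .and_then(|factor| self.retry_delay_ms.checked_mul(factor))
            // Overflow implies the delay is past any cap, so use the cap.
            .unwrap_or(self.max_delay_ms)
            // Non-overflow path: still clamp to the configured maximum.
            .min(self.max_delay_ms)
    }
}
```

Under the assumption of a 1000 ms base and a 30,000 ms cap (the cap value is chosen only for this example), small counts double as expected until the cap is reached, and every count listed for the regression test (55, 63, 64, 65, 100, 1000, u32::MAX) returns the cap instead of panicking or wrapping to 0.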