@baileympearson
Contributor

@baileympearson baileympearson commented Dec 2, 2025

DRIVERS-3239

Overview

This PR adds support for a new class of errors (SystemOverloadedError) to drivers' operation retry logic, as outlined in the design document.

Additionally, it includes a new argument to the MongoDB handshake (also defined in the design document).

Python will be the second implementer.
Node implementation: mongodb/node-mongodb-native#4806

Testing

The testing strategy is two-fold:

  • Building off of Ezra's work to generate unified tests for retryable handshake errors, this PR generates unified tests to confirm that:

    • operations are retried using the new SystemOverloadedError label
    • operations are retried no more than 5 times (the current MAX_ATTEMPTS, as defined in the spec)
  • Following Iris's work in DRIVERS-1934: withTransaction API retries too frequently #1851, this PR adds a prose test that ensures drivers apply exponential backoff in the retryability loop.

  • Update changelog.

  • Test changes in at least one language driver.

  • Test these changes against all server versions and topologies (including standalone, replica set, and sharded clusters).

@baileympearson baileympearson marked this pull request as ready for review December 2, 2025 18:59
@baileympearson baileympearson requested review from a team as code owners December 2, 2025 18:59
@baileympearson baileympearson requested review from jmikola and jyemin and removed request for a team December 2, 2025 18:59
@blink1073
Member

It looks like you also need to bump the schema version:

source/client-backpressure/tests/backpressure-retry-loop.yml invalid
[
  {
    instancePath: '/tests/0/operations/3/expectError',
    schemaPath: '#/definitions/expectedError/type',
    keyword: 'type',
    params: { type: 'object' },
    message: 'must be object'
  }
]
 using schema v1.3
source/client-backpressure/tests/backpressure-retry-max-attempts.yml invalid
[
  {
    instancePath: '/tests/0/operations/1/expectError',
    schemaPath: '#/definitions/expectedError/type',
    keyword: 'type',
    params: { type: 'object' },
    message: 'must be object'
  }
]
 using schema v1.3

@blink1073
Member

WIP Python implementation: mongodb/mongo-python-driver#2635

@blink1073
Member

blink1073 commented Dec 3, 2025

All unified and prose tests are passing in the Python implementation.

Edit: we're still failing one unified test, "client.clientBulkWrite retries using operation loop", investigating...

Edit 2: we're all good now

Contributor

@jyemin jyemin left a comment

I only reviewed the specification changes, not the pseudocode or tests. Those are best reviewed by implementers.

- This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the
timeout to avoid retry storms.
5. If the previous error includes the `SystemOverloadedError` label, the client MUST apply exponential backoff according
to according to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)`
Contributor

Suggested change
to according to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)`
to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)`

Contributor Author

done


This specification expands the driver's retry ability to all commands, including those not currently considered
retryable such as updateMany, create collection, getMore, and generic runCommand. The new command execution method obeys
the following rules:
Contributor

Since the rules include all the deposits into the token bucket, consider adding withdrawals as well.

Contributor Author

done


## Q&A

TODO
Contributor

Anything to add here, or just remove?

Contributor Author

Nothing yet. I'll remove it for now. It can always be added back if there are items worth mentioning here.


## Changelog

- 2025-XX-XX: Initial version.
Contributor

Not sure how we handle the date... Is there an automation for this?

Contributor Author

Not that I know of. Usually the spec author fills it out before merging.

I'll just leave this thread open to remind myself to add changelog dates before merging once all changes are completed.

to identify clients which do and do not support backpressure. Currently, this flag is unused but in the future the
server may offer different rate limiting behavior for clients that do not support backpressure.

##### Implementation notes
Contributor

Suggested change
##### Implementation notes
#### Implementation notes

@blink1073 blink1073 requested a review from a team as a code owner December 17, 2025 12:41
@blink1073 blink1073 requested review from durran and removed request for a team December 17, 2025 12:41

An error considered retryable by the [Retryable Writes Specification](../retryable-writes/retryable-writes.md).

#### Backpressure Error
Member

This spec should also get a changelog entry.

Member

Done

Contributor Author

@baileympearson baileympearson left a comment

submitting comments

operations:
-
object: *utilCollection
name: deleteMany
Contributor Author

Sorry, missed this one. I like your suggestion - done.

- This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the
timeout to avoid retry storms.
5. If the previous error includes the `SystemOverloadedError` label, the client MUST apply exponential backoff according
to according to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)`
Contributor Author

done


This specification expands the driver's retry ability to all commands, including those not currently considered
retryable such as updateMany, create collection, getMore, and generic runCommand. The new command execution method obeys
the following rules:
Contributor Author

done

The following pseudocode describes the overload retry policy:

```python
BASE_BACKOFF = 0.1
Contributor Author

I chose to leave it as-is, since the pseudocode is written in Python and follows Python conventions. Let me know if the clarified pseudocode works for you.


## Changelog

- 2025-XX-XX: Initial version.
Contributor Author

I removed the changelog

}
```

3. Execute the following command. Expect that the command errors. Measure the duration of the command execution.
Contributor Author

I've reworked the prose to be explicit about what the command is.

Contributor

@matthewdale matthewdale left a comment

Looks good! 👍

5. A retry attempt consumes 1 token from the token bucket.
6. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to
the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)`
- `i` is the retry attempt number (starting with 0 for the first retry).
Member

@sanych-sun sanych-sun Dec 19, 2025

Why did we decide to have the retry number start with 0? It makes the requirements confusing. Why don't we start with 1? Then we would in fact have the same formula as for withTransaction:

delayMS = j * min(maxBackoff, baseBackoff * 2^(i-1))

The only difference is that here the base of the power function is 2, whereas the convenient transaction API uses 1.5.

Member

@sanych-sun sanych-sun Dec 19, 2025

Here is the formula from withTransaction:
jitter * min(BACKOFF_INITIAL * 1.5 ** (transactionAttempt - 1), BACKOFF_MAX)

Here transactionAttempt starts at 0 and is incremented AFTER the delay but before the callback attempt executes, which is also confusing. In the C# implementation we wait AFTER the attempt, so it's more natural.

Contributor Author

The phrasing, including "Retries start at 0", was just taken from the design. There's no need to keep it this way if it causes confusion.

I can adjust the phrasing to more closely align with the transaction spec, if that's preferable?

Member

@sanych-sun sanych-sun Dec 19, 2025

Can we use the following formula?
delayMS = j * min(maxBackoff, baseBackoff * 2^(i-1))

or

we can keep the formula as is, but adjust the baseBackoff:
delayMS = j * min(maxBackoff, baseBackoff * 2^i)
where baseBackoff is 50 instead of 100.

It produces the same results, but starts i with 1.

Contributor

I think it's fine to use any of those three formulas in individual driver implementations if it increases readability so long as they result in the same outputs. As far as reducing confusion, I can say it didn't substantially change my understanding of the formula.

+1 to Bailey changing the phrasing, which I think would suffice.
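
The claim that the three formulas in this thread produce the same outputs can be checked with a short sketch. The function names are hypothetical, the jitter `j` is fixed at 1.0 so the comparison is deterministic, and the values are the 100ms base and 10000ms cap quoted in this conversation.

```python
MAX_BACKOFF_MS = 10_000  # cap quoted from the prose part of the spec


def delay_zero_based(i, j=1.0, base=100):
    """Formula as written in the PR; i is the retry number starting at 0."""
    return j * min(MAX_BACKOFF_MS, base * 2 ** i)


def delay_one_based_shifted(i, j=1.0, base=100):
    """First proposed alternative; i starts at 1 and the exponent is i - 1."""
    return j * min(MAX_BACKOFF_MS, base * 2 ** (i - 1))


def delay_one_based_halved(i, j=1.0, base=50):
    """Second proposed alternative; i starts at 1 and baseBackoff is 50."""
    return j * min(MAX_BACKOFF_MS, base * 2 ** i)


# Compare the three variants for the first ten retries.
for retry in range(10):
    expected = delay_zero_based(retry)
    assert delay_one_based_shifted(retry + 1) == expected
    assert delay_one_based_halved(retry + 1) == expected
```

All three agree once the cap is reached as well, because each variant applies `min` to the same uncapped value.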

# Note: the values below have been scaled down by a factor of 1000 because
# Python's sleep API takes a duration in seconds, not milliseconds.
BASE_BACKOFF = 0.1 # 100ms
MAX_BACKOFF = 10 # 10s
Member

@stIncMale stIncMale Dec 25, 2025

The 100ms comment is helpful, because one can use it to find the same value in the prose part of the specification (though absolutely nothing prevents us from simply writing BASE_BACKOFF = 100ms).

However, the 10s comment is not helpful, because the prose part says 10000ms, and a reader won't find 10000ms in the pseudocode.

Suggested change
MAX_BACKOFF = 10 # 10s
MAX_BACKOFF = 10 # 10000ms

10 participants