CSHARP-3550: CSOT: Server Selection #1705
@@ -47,7 +47,7 @@ private Exception CreateTimeoutException(Stopwatch stopwatch, string message)
var checkOutsForOtherCount = checkOutsCount - checkOutsForCursorCount - checkOutsForTransactionCount;

message =
- $"Timed out after {stopwatch.ElapsedMilliseconds}ms waiting for a connection from the connection pool. " +
+ $"Timed out after {operationContext.Elapsed.TotalMilliseconds}ms waiting for a connection from the connection pool. " +
minor: `Elapsed.TotalMilliseconds` looks like it is used frequently. Does that warrant a shortcut property such as `ElapsedMilliseconds`?
Reverted to use stopwatch here.
// TODO: this static field is temporary here and will be removed in a future PRs in scope of CSOT.
public static readonly OperationContext NoTimeout = new(System.Threading.Timeout.InfiniteTimeSpan, CancellationToken.None);

private readonly Stopwatch _stopwatch;
So we avoid creating multiple stopwatches on the operation execution path. It's minor, but still nice.
Yep, but as discussed, I had to revert the change =)
[Values(false, true)]
bool async)
{
- var subject = CreateSubject();
+ var subject = CreateSubject(serverSelectionTimeout: TimeSpan.FromMilliseconds(10));
This adjusted `serverSelectionTimeout` is an optimization of the test: we probably do not have to wait the whole 2 seconds here (the default server selection timeout for this test class, see line 49).
@@ -134,23 +134,13 @@ await Record.ExceptionAsync(() => subject.ExecuteWriteOperationAsync(null, opera
.Subject.ParamName.Should().Be("session");
}

private OperationExecutor CreateSubject(out Mock<IClusterInternal> clusterMock, out Mock<ICoreSessionHandle> implicitSessionMock)
Leftovers of the implicit session creation that was factored out in one of the latest commits related to the `OperationExecutor` refactoring.
Pull Request Overview
This PR refactors various server selection and connection pool APIs to use a new `OperationContext` abstraction instead of separate `CancellationToken` and timeout `TimeSpan` parameters. Key changes include:
- Replaced all `CancellationToken` (and internal `Stopwatch` timeout logic) with `OperationContext` throughout connection pool helpers, cluster selection, and binding classes.
- Updated `CreateTimeoutException` methods to accept a `TimeSpan elapsed` rather than a `Stopwatch`.
- Removed legacy overloads and helper classes associated with `CancellationToken`/`TimeSpan` and simplified server selection wait logic using `OperationContext`.
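For readers who have not opened the diff, the idea of folding a timeout and a cancellation token into one object can be sketched roughly as follows. This is a minimal, hypothetical stand-in: the class name `SketchOperationContext` and the members `RemainingTimeout` and `IsTimedOut` are assumptions made for illustration, and the driver's real `OperationContext` may be shaped differently. Only the constructor arguments, `Elapsed`, `CancellationToken`, and `WithTimeout` are mentioned in this conversation.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical, simplified stand-in for the PR's OperationContext (not the driver's class).
public sealed class SketchOperationContext
{
    // One stopwatch per context, started when the context is created.
    private readonly Stopwatch _stopwatch = Stopwatch.StartNew();

    public SketchOperationContext(TimeSpan timeout, CancellationToken cancellationToken)
    {
        Timeout = timeout;
        CancellationToken = cancellationToken;
    }

    public TimeSpan Timeout { get; }
    public CancellationToken CancellationToken { get; }
    public TimeSpan Elapsed => _stopwatch.Elapsed;

    // Assumed helper: an infinite timeout never expires; otherwise clamp the remainder at zero.
    public TimeSpan RemainingTimeout
    {
        get
        {
            if (Timeout == System.Threading.Timeout.InfiniteTimeSpan)
            {
                return Timeout;
            }
            var remaining = Timeout - Elapsed;
            return remaining < TimeSpan.Zero ? TimeSpan.Zero : remaining;
        }
    }

    // Assumed helper: true once a finite timeout has been used up.
    public bool IsTimedOut =>
        Timeout != System.Threading.Timeout.InfiniteTimeSpan && Elapsed >= Timeout;

    // Derives a child context whose timeout is the smaller of the requested
    // timeout and whatever is left of this context's own budget.
    public SketchOperationContext WithTimeout(TimeSpan timeout)
    {
        var infinite = System.Threading.Timeout.InfiniteTimeSpan;
        var remaining = RemainingTimeout;
        TimeSpan effective;
        if (timeout == infinite)
        {
            effective = remaining;
        }
        else if (remaining == infinite || timeout < remaining)
        {
            effective = timeout;
        }
        else
        {
            effective = remaining;
        }
        return new SketchOperationContext(effective, CancellationToken);
    }
}
```

Under these assumptions, a caller holding an unlimited context could derive a bounded one for server selection with something like `context.WithTimeout(TimeSpan.FromSeconds(2))`, and the child would carry the same cancellation token.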
Reviewed Changes
Copilot reviewed 141 out of 141 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
src/MongoDB.Driver/Core/ConnectionPools/ExclusiveConnectionPool.Helpers.cs | Swapped Stopwatch usage and token/timeouts for OperationContext |
src/MongoDB.Driver/Core/Clusters/LoadBalancedCluster.cs | Updated server selection to use OperationContext and WithTimeout |
src/MongoDB.Driver/Core/Clusters/IClusterExtensions.cs | Changed extension methods to accept OperationContext |
src/MongoDB.Driver/Core/Clusters/ICluster.cs | Updated interface signatures to use OperationContext |
src/MongoDB.Driver/Core/Clusters/Cluster.cs | Refactored internal select/wait loops to data in OperationContext |
src/MongoDB.Driver/Core/Bindings/**/*.cs | Replaced all CancellationToken parameters in binding APIs with OperationContext |
Comments suppressed due to low confidence (3)
src/MongoDB.Driver/Core/ConnectionPools/ExclusiveConnectionPool.Helpers.cs:50
- [nitpick] Using `TotalMilliseconds` can include fractional values (e.g., 12.345ms). Consider formatting with a fixed precision or rounding (e.g., `{elapsed.TotalMilliseconds:F0}`) to produce a cleaner integer-based message.
message = $"Timed out after {elapsed.TotalMilliseconds}ms waiting for a connection from the connection pool. " +
src/MongoDB.Driver/Core/Clusters/LoadBalancedCluster.cs:179
- The new `WithTimeout` logic and `WaitTask` behavior are significant changes to server selection. Please add unit tests that simulate both timeout and cancellation scenarios for `SelectServer`/`SelectServerAsync` using mocks/stubs for `_serverReadyTaskCompletionSource`.
var serverSelectionOperationContext = operationContext.WithTimeout(_settings.ServerSelectionTimeout);
src/MongoDB.Driver/Core/Clusters/ICluster.cs:64
- Removing the old `CancellationToken` overloads is a breaking API change. Consider retaining the old overloads (marked obsolete) to preserve backward compatibility, or provide clear upgrade guidance in release notes.
IServer SelectServer(IServerSelector selector, OperationContext operationContext);
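The rounding nitpick about the connection pool timeout message can be illustrated with a small sketch. The helper name `TimeoutMessageSketch.Build` is hypothetical and only shows the standard `F0` numeric format specifier applied to `TimeSpan.TotalMilliseconds`; it is not code from the PR.

```csharp
using System;
using System.Globalization;

public static class TimeoutMessageSketch
{
    // Hypothetical helper: renders the elapsed time as whole milliseconds.
    // "F0" is the fixed-point format specifier with zero decimal places.
    public static string Build(TimeSpan elapsed) =>
        string.Format(
            CultureInfo.InvariantCulture,
            "Timed out after {0:F0}ms waiting for a connection from the connection pool.",
            elapsed.TotalMilliseconds);
}
```

With an elapsed time of 12.345ms, for example `TimeSpan.FromTicks(123450)`, this would produce "Timed out after 12ms ..." rather than "Timed out after 12.345ms ...".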
if (index != 0)
try
{
serverSelectionOperationContext.WaitTask(_serverReadyTaskCompletionSource.Task);
The original code explicitly threw on cancellation (`cancellationToken.ThrowIfCancellationRequested()`), but this method only catches `TimeoutException`. Consider catching `OperationCanceledException` from `WaitTask` or explicitly checking `operationContext.CancellationToken` to ensure cancellations are surfaced correctly.
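The distinction this comment draws between a timeout and a cancellation can be sketched with plain BCL calls. The `WaitSketch` class and its method names are hypothetical, and this is not how the PR's `WaitTask` is implemented. The sketch relies only on `Task.Wait(int, CancellationToken)`, which returns `false` when the timeout elapses and throws `OperationCanceledException` when the token is canceled.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitSketch
{
    // Hypothetical helper: turns a timed-out wait into a TimeoutException and
    // lets OperationCanceledException from the token propagate unchanged.
    public static void WaitOrThrow(Task task, TimeSpan timeout, CancellationToken cancellationToken)
    {
        var milliseconds = timeout == Timeout.InfiniteTimeSpan
            ? Timeout.Infinite
            : (int)timeout.TotalMilliseconds;

        if (!task.Wait(milliseconds, cancellationToken))
        {
            throw new TimeoutException();
        }
    }

    // Shows that the two failure modes reach different catch blocks.
    public static string Describe(Task task, TimeSpan timeout, CancellationToken cancellationToken)
    {
        try
        {
            WaitOrThrow(task, timeout, cancellationToken);
            return "completed";
        }
        catch (TimeoutException)
        {
            return "timed out";
        }
        catch (OperationCanceledException)
        {
            return "canceled";
        }
    }
}
```

A caller that catches only `TimeoutException`, as in the hunk under review, would let the cancellation exception flow up the stack. Whether that is the desired behavior is the point the comment raises.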
// should be pinned.
- var server = cluster.SelectServer(selector, cancellationToken);
+ var server = cluster.SelectServer(selector, operationContext);
[nitpick] We added non-null checks for `selector` and `operationContext` in core cluster methods; consider adding `Ensure.IsNotNull(operationContext, nameof(operationContext));` here as well to maintain consistent validation.
For some reason, we do not validate other parameters in the `IClusterExtensions` methods. I will keep it as is for now.
{
- return GetReadChannelSource(null, cancellationToken);
+ return GetReadChannelSource(null, operationContext);
}

- public Task<IChannelSourceHandle> GetReadChannelSourceAsync(CancellationToken cancellationToken)
+ public Task<IChannelSourceHandle> GetReadChannelSourceAsync(OperationContext operationContext)
{
[nitpick] Most public API methods now guard against null `operationContext`. Add an `Ensure.IsNotNull(operationContext, nameof(operationContext));` at the top of this method to match the pattern in cluster selection.
}
}
catch (TimeoutException)
{
var message = BuildTimeoutExceptionMessage(_settings.ServerSelectionTimeout, selector, helper.Description);
`BuildTimeoutExceptionMessage` doesn't inform the user whether the operation timed out or the server selection timed out.
{
- var connection = connectionCreator.CreateOpened(cancellationToken);
+ var connection = connectionCreator.CreateOpened(operationContext);
The reason for the tiny 20ms timeout is just to try to take a turn in the connecting queue without blocking checkouts.
This timeout is applied to every step individually: MaxConnections and MaxConnecting.
It seems that this code changes that and allocates 20ms to:
- MaxConnections
- MaxConnecting
- The whole connection creation operation (excluding open)
If this reading is correct, it's really surprising that this was not caught by tests.
@sanych-sun was this addressed?
Fixed.
What about `_maxConnectingQueue` inside `ConnectionCreator`?
We don't want to block on that for the whole timeout, just make a quick try instead.
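One reading of the concern in this thread is the difference between giving each wait its own short timeout and letting several waits draw from one shared budget. The sketch below uses two plain `SemaphoreSlim` instances as hypothetical stand-ins for the MaxConnections and MaxConnecting limits. The class and method names are assumptions for illustration, and the pool's real queues are implemented differently.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class QuickTrySketch
{
    // Per-step budget: each wait gets the full short timeout on its own.
    public static bool TryEnterPerStep(
        SemaphoreSlim maxConnections,
        SemaphoreSlim maxConnecting,
        TimeSpan perStepTimeout,
        CancellationToken cancellationToken)
    {
        if (!maxConnections.Wait(perStepTimeout, cancellationToken))
        {
            return false;
        }
        if (!maxConnecting.Wait(perStepTimeout, cancellationToken))
        {
            maxConnections.Release(); // give back the first slot on failure
            return false;
        }
        return true;
    }

    // Shared budget: time spent in the first wait shrinks what is left for the second.
    public static bool TryEnterSharedBudget(
        SemaphoreSlim maxConnections,
        SemaphoreSlim maxConnecting,
        TimeSpan totalTimeout,
        CancellationToken cancellationToken)
    {
        var stopwatch = Stopwatch.StartNew();
        if (!maxConnections.Wait(totalTimeout, cancellationToken))
        {
            return false;
        }
        var remaining = totalTimeout - stopwatch.Elapsed;
        if (remaining < TimeSpan.Zero)
        {
            remaining = TimeSpan.Zero;
        }
        if (!maxConnecting.Wait(remaining, cancellationToken))
        {
            maxConnections.Release(); // give back the first slot on failure
            return false;
        }
        return true;
    }
}
```

With a 20ms value, the per-step variant can wait up to roughly 20ms at each limit, while the shared variant can wait roughly 20ms in total. Any later step that also consumes the same budget would be squeezed further.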
LGTM
@@ -50,9 +50,10 @@ public TResult ExecuteReadOperation<TResult>(
Ensure.IsNotNull(session, nameof(session));
ThrowIfDisposed();

var cancellationContext = options.ToOperationContext(cancellationToken);
rename this variable (and other instances) to operationContext?
Done
{
exception = Record.Exception(() => subject.SelectServer(Mock.Of<IServerSelector>(), cancellationTokenSource.Token));
}
var cancellationContext = new OperationContext(Timeout.InfiniteTimeSpan, cancellationTokenSource.Token);
rename variable to operationContext?
Done.
@@ -287,6 +279,15 @@ private void AcquireWaitQueueSlot()
_enteredWaitQueue = true;
}

private void EnsureTimeout(OperationContext operationContext, Stopwatch stopwatch)
Would `ThrowIfTimedOut` be a better name?
Normally we use `EnsureXyz` for argument validation.
Renamed
LGTM
LGTM
}

- return CreateOpenedInternal(cancellationToken);
+ var operationContext = new OperationContext(Timeout.InfiniteTimeSpan, cancellationToken);
+ return CreateOpenedInternal(operationContext);
minor: return CreateOpenedInternal(new(Timeout.InfiniteTimeSpan, cancellationToken));
No description provided.