[✨ Triage] dotnet/runtime#126183 by danmoseley - System.Net.Sockets.Tests tests fail with ObjectDisposedException #172

@MihuBot

Triage for dotnet/runtime#126183.
Repo filter: All networking issues.
MihuBot version: 246635.
Ping MihaZupan for any issues.

This is a test triage report generated by AI, aimed at helping the triage team quickly identify past issues/PRs that may be related.
Take any conclusions with a large grain of salt.

Tool logs
dotnet/runtime#126183: System.Net.Sockets.Tests tests fail with ObjectDisposedException by danmoseley
Extracted 5 search queries:
  • ObjectDisposedException SafeSocketHandle during SetIsNonBlocking / SafeHandleMarshaller.ManagedToUnmanagedIn
  • SocketAsyncContext.SetHandleBlocking or ConnectOperation throwing ObjectDisposedException SafeSocketHandle
  • SafeHandle.DangerousAddRef ObjectDisposedException in SafeHandle marshalling of SafeSocketHandle
  • System.Net.Sockets.Tests failing on net11.0 osx Mono_Minijit with ObjectDisposedException SafeSocketHandle
  • Interop.Sys.Fcntl.SetIsNonBlocking SafeSocketHandle error in SocketAsyncContext/WriteOperation on Unix
Found 25 candidate issues

Here are the potentially relevant prior issues/PRs I reviewed and a short summary of their discussions / conclusions, with notes on why each may relate to the new failure (tests failing with ObjectDisposedException on SafeSocketHandle on a Mono macOS Debug leg).

  • Issue #82308 (Feb 2023) - "SafeHandle marshalling broken with Mono"
    Summary: On a Mono Debug configuration, the managed SafeHandle layout (debug-only debugging fields) didn’t match Mono’s native marshalling expectations, leading to wrong handle values (fd 0) being passed to native calls and many Socket tests failing. The discussion concluded the easiest mitigation was to make the extra debug-only SafeHandle fields coreclr-only (i.e., avoid changing layout seen by Mono).
    Relevance: High. The new failure runs on a Mono_Debug leg (Mono_Minijit_Debug-OSX). Past Mono-specific SafeHandle layout/marshalling bugs produced socket test failures on macOS. Investigate whether Mono vs coreclr SafeHandle layout/interop differences (or a change in SafeHandle debug-only fields) could be contributing to the ObjectDisposedException or causing early disposal/invalid handles in this leg.

  • PR #124200 (Feb 2026 → merged Mar 17 2026) - "Restore blocking mode after successful ConnectAsync on Unix"
    Summary: Restores the fd to blocking mode after a successful ConnectAsync (if the user did not explicitly set Blocking=false) to optimize subsequent synchronous I/O. The change invokes SetHandleBlocking() / TryRestoreBlocking() at several connect-completion code paths. Reviewers called out threading/race concerns and suggested documenting safety and adding tests for concurrency. Some Copilot comments pointed out specific code paths where RestoreBlocking might be skipped or invoked on failures.
    Relevance: High. The stack trace in the failing run shows SetHandleBlocking / SetIsNonBlocking / SocketAsyncContext.ConnectOperation.InvokeCallback in the call stack. This PR added logic that touches the exact code path in which DangerousAddRef is called to change blocking mode; that could expose a race where the SafeSocketHandle is observed as disposed during the attempted DangerousAddRef (ObjectDisposedException). Investigate whether the PR introduced a timing/race that allows a connect completion callback to run concurrently with disposal or otherwise call SetBlocking when the handle is already closed.

  • Issue #31570 (Nov 2019) - "Deadlock in SocketAsyncContext.Unix.cs"
    Summary: macOS test hangs due to a deadlock when a socket handle was released (leading to Close/StopAndAbort/TryCancel) while an async operation (BufferMemoryReceiveOperation) was still executing; the operation effectively waits for itself and deadlocks. Workarounds and later fixes in 5.0 changed the async engine and the issue was considered fixed there.
    Relevance: Medium. This is the same source file (SocketAsyncContext.Unix.cs) and shows how handle release/dispose during ongoing async processing can create hard-to-diagnose failures on Unix/macOS. The stack frames are similar; use it to guide investigation of races between async operations and socket disposal.

  • Issue #40301 (Aug 2020) and PR #41508 (Aug 2020 → merged Sep 1 2020) - "SafeSocketHandle.CloseAsIs hanging in finalizer thread" / "avoid potential blocking of finalizer thread"
    Summary: Closing sockets could hang the finalizer thread because SafeHandle refcounts from other code paths (e.g., SafePipeHandle) prevented release; code added spinning retries to TryUnblockSocket and later was changed to skip TryUnblockSocket from the finalizer path to avoid finalizer hangs. PR #41508 updated SafeSocketHandle to avoid spinning when running on finalizer thread.
    Relevance: Medium. These show earlier tricky interactions between SafeHandle refcounts, DangerousAddRef/Release, and finalization when there are cross-component references (e.g., SafePipeHandle referencing SafeSocketHandle). The current failure is an ObjectDisposedException during DangerousAddRef; the prior work documents the pitfalls and mitigations around disposing/unblocking and finalization.

  • Issue #26219 (May 2018) - "SocketAsyncEventArgs.Complete may hang if used on disposed socket"
    Summary: A race between socket disposal and allocating NativeOverlapped could lead to ObjectDisposedException thrown outside of the proper try/finally and cause another part of code to spin indefinitely. This was a subtle race introduced when enabling Memory usage for Socket. It was fixed for that release.
    Relevance: Medium. Demonstrates an earlier case where ObjectDisposedException arising from a race between disposal and I/O caused hangs or test flakiness. The present failure (ObjectDisposedException from SafeHandle.DangerousAddRef inside SetIsNonBlocking) may be another manifestation of a dispose-vs-async-operation race.

  • Issue #47561 (Jan 2021) and PR #47575 (Jan 2021 → merged Feb 2021) - "ClosedDuringOperation_Throws_ObjectDisposedExceptionOrSocketException" / "Fix test failure: ClosedDuringOperation_Throws..."
    Summary: Tests exercising dispose-during-operation had timing races; depending on timing they could see ObjectDisposedException or SocketException. The test was made tolerant to either outcome, and some code paths were changed to translate certain ObjectDisposedExceptions to SocketExceptions to stabilize behavior.
    Relevance: Medium. This issue is directly about test flakiness when sockets are closed concurrently with operations, and it shows how tests and product code evolved to handle these races. The new failure is a test hitting ObjectDisposedException, so check whether the test expectations need to be broadened or whether a product regression occurred.

  • PR #72000 (Jul 2022 → merged Aug 2022) - "Socket: cancel on-going operations when Sockets with non-owning handle gets disposed."
    Summary: When a Socket is constructed with a non-owning SafeHandle, Dispose/Close must not attempt certain abortive operations that would affect other owners of the same raw handle. The PR cancels on-going ops for owning handles and leaves non-owning handles alone; it added tests about not closing non-owned handles.
    Relevance: Low/Contextual. Shows how dispose semantics for owning vs non-owning handles were tightened. Directly relevant only if the failing tests use non-owning handles, but the new failure's stack shows SafeSocketHandle.Release/DangerousAddRef, so ownership semantics may still bear on possible races.

  • Issue #73496 (Aug 2022 → closed Jun 2023) - "SafeHandle marshalling improvements in the source-generated marshalling"
    Summary: Source-generated marshalling for SafeHandle was improved; they considered disposing semantics and ensuring SafeHandle marshalling doesn’t leak handles if exceptions occur. The work completed with #85419.
    Relevance: Low/Contextual. Background on marshalling-generated code handling SafeHandles more safely; useful to check if generated marshalling (or IL stubs) on Mono vs coreclr differs.

  • Issue #64400 (Jan 2022 → closed Feb 2022) - "Closing the AsyncWaitHandle from the result of socket.BeginConnect crashes the application in .NET 6.0"
    Summary: Regression where closing AsyncWaitHandle could cause ObjectDisposedException originating from SafeHandle.DangerousAddRef in a Task completion path; discussion centered on APM semantics, whether Close should signal wait handle before returning, and migration guidance away from APM.
    Relevance: Low/Contextual. The stack involves DangerousAddRef and shows that races between close/dispose and async completion paths can cause ObjectDisposedException bubbling up. It is a helpful precedent.
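
The test tolerance described for #47561 / #47575 above can be sketched as a small assertion helper. This is a minimal illustration and not the actual test code from dotnet/runtime: `TolerantAssert` and `ThrowsDisposedOrSocketException` are hypothetical names, and the two lambdas in `Main` simulate the two race outcomes instead of racing a real socket.

```csharp
using System;
using System.Net.Sockets;

static class TolerantAssert
{
    // Hypothetical helper: succeeds when the action throws either exception
    // type that a dispose-vs-operation race may legitimately produce.
    // Any other exception type propagates unchanged.
    public static Exception ThrowsDisposedOrSocketException(Action action)
    {
        try
        {
            action();
        }
        catch (ObjectDisposedException ex)
        {
            return ex;
        }
        catch (SocketException ex)
        {
            return ex;
        }

        throw new InvalidOperationException(
            "Expected ObjectDisposedException or SocketException, but nothing was thrown.");
    }

    static void Main()
    {
        // Simulated outcome 1: the operation observed the disposed socket.
        Exception first = ThrowsDisposedOrSocketException(
            () => throw new ObjectDisposedException("Socket"));
        Console.WriteLine(first.GetType().Name); // ObjectDisposedException

        // Simulated outcome 2: the operation was aborted at the OS level.
        Exception second = ThrowsDisposedOrSocketException(
            () => throw new SocketException((int)SocketError.OperationAborted));
        Console.WriteLine(second.GetType().Name); // SocketException
    }
}
```

Widening a test this way hides a product regression just as easily as it hides a benign race, so it only makes sense once the race is confirmed to be expected behavior.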

Overall assessment and suggested next investigation steps

  • Mono-specific SafeHandle/marshalling problems are a top suspect given the leg (Mono_Minijit_Debug-OSX) and the history in #82308. Start by checking whether the failing leg uses a Mono runtime build whose SafeHandle layout differs from expectations (debug-only fields, etc.) or whether recent changes affected SafeHandle layout/marshalling for Mono. If layout/marshalling is suspect, reproducing locally on the same Mono config (Release vs Debug, librariesConfiguration) or running under strace to see the fd values (as in #82308) will help.
  • The call stack shows SetHandleBlocking / SocketAsyncContext.ConnectOperation.InvokeCallback. PR #124200 (restoring blocking mode after ConnectAsync) touched this code and was merged in March 2026. Investigate whether that PR’s new calls to SetBlocking/SetHandleBlocking can race with disposal in some test scenarios and cause DangerousAddRef to run on an already-disposed SafeSocketHandle. Review the PR call sites and, if necessary, add guards (or try/catch) around DangerousAddRef usage and/or checks that the socket isn’t disposed before attempting to transition blocking mode.
  • Reproduce on the same macOS / Mono_Debug configuration and collect:
    • full test logs and helix logs for the failing job,
    • strace/ktrace to see fd values if marshalling is suspected,
    • thread dumps when the exception occurs to see ordering of dispose vs the connect callback,
    • which test exactly triggers it and whether it's newly introduced by the PR referenced in the issue body (pull/126006).
  • Check test expectations: prior issues show tests were made tolerant to either ObjectDisposedException or SocketException for races. If this is a known/acceptable race, the test expectations might need to be widened; but because the issue carries the blocking-clean-ci label, err on the side of investigating a product regression before changing tests.
  • If reproducing the issue shows the handle truly being disposed before the SetBlocking call, consider hardening SetHandleBlocking / SocketAsyncContext.ConnectOperation.InvokeCallback to defensively handle ObjectDisposedException from DangerousAddRef (log and treat as operation-aborted) or to ensure the handle is referenced safely before changing OS flags.
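
The defensive handling suggested in the last bullet can be sketched as follows. This is an illustration under stated assumptions, not the SocketAsyncContext implementation: `DemoHandle` is a hypothetical stand-in for SafeSocketHandle that owns no OS resource, and `TryWithHandle` is a hypothetical helper. Only `SafeHandle.DangerousAddRef`, `DangerousGetHandle`, and `DangerousRelease` are real .NET APIs.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for SafeSocketHandle; wraps a fake handle value.
sealed class DemoHandle : SafeHandle
{
    public DemoHandle() : base(IntPtr.Zero, ownsHandle: true)
    {
        SetHandle(new IntPtr(42)); // arbitrary non-zero value for the demo
    }

    public override bool IsInvalid => handle == IntPtr.Zero;

    protected override bool ReleaseHandle() => true; // nothing to free here
}

static class BlockingModeDemo
{
    // Hypothetical helper: runs 'action' with the raw handle while holding a
    // reference, and returns false instead of throwing when the handle was
    // already closed (the caller can treat that as "operation aborted").
    public static bool TryWithHandle(SafeHandle handle, Action<IntPtr> action)
    {
        bool addedRef = false;
        try
        {
            try
            {
                // Throws ObjectDisposedException if the handle is closed.
                handle.DangerousAddRef(ref addedRef);
            }
            catch (ObjectDisposedException)
            {
                return false;
            }

            // In real code this is where the fcntl-style call would happen.
            action(handle.DangerousGetHandle());
            return true;
        }
        finally
        {
            if (addedRef)
            {
                handle.DangerousRelease();
            }
        }
    }

    static void Main()
    {
        var handle = new DemoHandle();
        Console.WriteLine(TryWithHandle(handle, _ => { })); // True

        handle.Dispose();
        Console.WriteLine(TryWithHandle(handle, _ => { })); // False
    }
}
```

Catching only around `DangerousAddRef` keeps an ObjectDisposedException thrown by the action itself visible. Whether swallowing the exception is appropriate in the connect-completion path is a product decision that the reproduction should inform.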

If you want, I can:

  • try to map the exact lines in the current source (post-PR 126006) where DangerousAddRef is hit and point to the minimal guard points to add, and/or
  • pick a few of the most actionable follow-ups (Mono layout check and review PR #124200 call sites) and propose specific diagnostic commands or quick patch ideas.
