[cDAC] Enumerate R2R hashmap pages into triage dumps for odd entry points #129931
Merged
max-charlamb merged 6 commits on Jun 28, 2026
Mirror the legacy DAC's fast path in
ReadyToRunInfo::GetMethodDescForEntryPointInNativeImage
(src/coreclr/vm/readytoruninfo.cpp:371-376): on AMD64 and x86 a normal
method entry point is always 2+ byte aligned, but a funclet can start at
an odd address. The legacy DAC bails without probing the
EntryPointToMethodDescMap PtrHashMap for these.
The cDAC was probing the hashmap for odd entry points. That hash lands
in bucket slots the producer-side DAC walker (which has the fast-path
bail) never touches and therefore never enumerates into a triage
minidump. The consumer-side cDAC probe then faults reading the
not-in-dump bucket page, throws VirtualReadException, and aborts the
whole stack walk a single frame past a SoftwareExceptionFrame.
Manifests as SOS '!ClrStack' producing only:
[SoftwareExceptionFrame: ...]
<failed>
Stack Walk failed. Reported stack incomplete.
on net11 macOS triage dumps where cDAC defaults on (dotnet/diagnostics
PR dotnet#5874). Diagnosed via dotnet/diagnostics issue #<TBD>.
Validated on the failing CI dump (SOS.ReflectionTest.Triage.dmp from
public build 1483142, macOS x64): pre-fix dies after one frame; with
fix produces the full eight-frame managed stack with file/line
numbers, matching the legacy DAC's output.
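The producer/consumer mismatch described in this commit message can be modeled with a toy program. This is a minimal sketch under stated assumptions, not runtime code: TriageDump, bucketAddress, producerLookup, consumerLookup, the page size, the bucket layout, and all addresses are hypothetical names and values invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <stdexcept>

// Hypothetical model: a "triage dump" is the set of pages the producer touched.
constexpr std::uint64_t kPageSize = 0x1000;            // assumed page size
constexpr std::uint64_t kBucketBase = 0x7f0000000000;  // made-up bucket array address
constexpr std::uint64_t kBucketCount = 1024;           // made-up bucket count
constexpr std::uint64_t kBucketSize = 64;              // made-up bucket size in bytes

struct TriageDump {
    std::set<std::uint64_t> pages;

    // Producer side: record the page containing addr.
    void capture(std::uint64_t addr) { pages.insert(addr / kPageSize); }

    // Consumer side: reading a page that was never captured fails,
    // standing in for the VirtualReadException mentioned above.
    void read(std::uint64_t addr) const {
        if (pages.count(addr / kPageSize) == 0)
            throw std::runtime_error("page not in dump");
    }
};

// Toy hash: the address of the bucket a key would land in.
std::uint64_t bucketAddress(std::uint64_t entryPoint) {
    return kBucketBase + (entryPoint % kBucketCount) * kBucketSize;
}

// Producer walk: captures the bucket page unless the odd-entry fast path bails.
void producerLookup(TriageDump& dump, std::uint64_t entryPoint, bool oddFastPath) {
    if (oddFastPath && (entryPoint & 0x1) != 0)
        return;  // bail: no probe, so nothing is captured
    dump.capture(bucketAddress(entryPoint));
}

// Consumer walk: always probes; returns false if the bucket page is missing.
bool consumerLookup(const TriageDump& dump, std::uint64_t entryPoint) {
    try {
        dump.read(bucketAddress(entryPoint));
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Usage: the same odd (funclet-like) address with and without the producer bail.
void runDumpDemo() {
    const std::uint64_t oddEntry = 0x1001;

    TriageDump withBail;
    producerLookup(withBail, oddEntry, /*oddFastPath=*/true);
    assert(!consumerLookup(withBail, oddEntry));  // page never captured

    TriageDump withoutBail;
    producerLookup(withoutBail, oddEntry, /*oddFastPath=*/false);
    assert(consumerLookup(withoutBail, oddEntry));  // page captured, read succeeds
}
```

In this model the consumer only fails when the producer skipped the probe, which is the asymmetry the commit message describes.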
Pull request overview
This PR updates the cDAC ReadyToRun MethodDesc lookup to short-circuit on x86/x64 when the computed entry point is odd (low bit set), returning TargetPointer.Null rather than probing EntryPointToMethodDescMap. This aligns the cDAC behavior with the legacy DAC’s fast-path and avoids doing an unnecessary hash map lookup for entry points that won’t be present as keys.
Changes:
- Add an architecture-gated odd-entry-point bail-out in GetMethodDescForRuntimeFunction (x86/x64 only).
- Minor formatting-only adjustment in the ReadyToRun major-version → GCInfo-version switch expression (no behavioral change).
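The bail-out described in this overview can be sketched as a runtime architecture check in front of the map lookup. This is an illustrative model only: Arch, EntryPointMap, LookupResult, and lookupWithBail are hypothetical names, std::nullopt stands in for TargetPointer.Null, and the real cDAC code is C# and may be structured differently.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical stand-ins for the target architecture and the entry point map.
enum class Arch { X86, X64, Arm64 };
using EntryPointMap = std::unordered_map<std::uint64_t, std::uint64_t>;

struct LookupResult {
    std::optional<std::uint64_t> methodDesc;  // empty models TargetPointer.Null
    bool probed;                              // whether the map was consulted
};

// On x86/x64 an odd entry point returns "null" without probing the map;
// every other case falls through to the normal lookup.
LookupResult lookupWithBail(Arch arch, std::uint64_t entryPoint,
                            const EntryPointMap& map) {
    const bool isXArch = arch == Arch::X86 || arch == Arch::X64;
    if (isXArch && (entryPoint & 0x1) != 0)
        return {std::nullopt, false};

    auto it = map.find(entryPoint);
    if (it == map.end())
        return {std::nullopt, true};
    return {it->second, true};
}

// Usage: an odd address bails on x64 but is probed on another architecture.
void runBailDemo() {
    const EntryPointMap map{{0x1000, 0xAAAA}};

    LookupResult odd = lookupWithBail(Arch::X64, 0x1001, map);
    assert(!odd.probed && !odd.methodDesc.has_value());

    LookupResult even = lookupWithBail(Arch::X64, 0x1000, map);
    assert(even.probed && even.methodDesc.value() == 0xAAAA);

    LookupResult other = lookupWithBail(Arch::Arm64, 0x1001, map);
    assert(other.probed && !other.methodDesc.has_value());
}
```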
jkotas
reviewed
Jun 28, 2026
The odd-entry-point bail added to GetMethodDescForRuntimeFunction calls
Target.Contracts.RuntimeInfo.GetTargetArchitecture(), but the test target
built by CreateTarget only registered IExecutionManager and a mock
IPlatformMetadata. Every R2R test then threw NotImplementedException
('Contract IRuntimeInfo is not supported by the target').
Register a mock IRuntimeInfo that reports X64/X86 based on the test
architecture's bitness, keeping the suite green while exercising the new
x86/x64 branch.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jkotas
reviewed
Jun 28, 2026
Per review feedback, fix the root cause instead of mirroring the bail in the cDAC.
The odd-entry-point check in ReadyToRunInfo::GetMethodDescForEntryPointInNativeImage is a perf optimization that skips a guaranteed-miss hashmap lookup for funclet addresses. In DAC builds it had the side effect of skipping the probe during createdump's memory enumeration, so the bucket pages were never captured into triage minidumps and the cDAC consumer faulted reading them.
Gate the optimization with !DACCESS_COMPILE so the DAC performs the lookup and enumerates those pages, and remove the cDAC bail entirely so it reads naturally. Also delete the inaccurate 'PtrHashMap can't handle odd pointers' comment: the map handles odd keys fine, they are simply never present.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The fix lives entirely in the runtime: gating the odd-entry-point optimization with !DACCESS_COMPILE makes the DAC enumerate the bucket pages into triage dumps, so the cDAC needs no change and reads them naturally.
Restore the file to its main state, including the unrelated switch-expression whitespace.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jkotas
approved these changes
Jun 28, 2026
Summary
Fix the cDAC's R2R stack walk aborting a frame past a SoftwareExceptionFrame when reading macOS triage minidumps. The root cause is an odd-entry-point optimization in the legacy DAC that skips a hashmap probe, so the bucket pages that probe would touch are never enumerated into the triage dump, and the cDAC consumer then faults reading those not-in-dump pages.
Tracking issue: dotnet/diagnostics#5910 (full repro, captured failure trace, and dump-file page-map evidence are there).
Root cause
ReadyToRunInfo::GetMethodDescForEntryPointInNativeImage looks up an entry point in the EntryPointToMethodDescMap hashmap. It has an x64/x86 fast path that returns early when the entry point is odd.
This is purely a performance optimization: the map only ever contains true method entry points (4-byte aligned on every architecture), so a lookup for an odd (funclet) address is always a miss. (PtrHashMap handles odd keys fine; they're just never present.)
The problem is the side effect in DAC builds. When createdump enumerates memory for a triage minidump (DOTNET_DbgMiniDumpType=3), it drives stack walking through this same path. The fast path makes it return without ever probing the hashmap for odd (funclet) entry points, so the bucket pages that probe would touch are never read and never captured into the dump.
The cDAC consumer, walking the same stack later, probes the hashmap for that odd entry point, lands in a bucket page absent from the dump, throws VirtualReadException, and aborts the walk. SOS reports <failed> one frame past the SoftwareExceptionFrame.
Fix
Make the producer enumerate the pages instead of mirroring the bail on the consumer:
- src/coreclr/vm/readytoruninfo.cpp: gate the odd-entry-point optimization with !defined(DACCESS_COMPILE). The live runtime keeps the perf win; DAC builds perform the lookup so the bucket pages are enumerated into triage minidumps. Also removed the inaccurate "PtrHashMap can't handle odd pointers" comment.
- ExecutionManagerCore.ReadyToRunJitManager: removed the bail entirely. With the DAC enumerating the pages, the consumer reads them naturally.
This is a root-cause fix (complete dump enumeration) rather than masking the symptom on the consumer side, and it benefits the legacy DAC consumer too, not just the cDAC.
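The shape of the gated optimization can be sketched as follows. This is a simplified stand-in, not the actual contents of readytoruninfo.cpp: MethodDesc, EntryPointMapModel, and getMethodDescForEntryPoint are hypothetical, and the architecture part of the #if condition (TARGET_AMD64 / TARGET_X86) is an assumption based on the "x64/x86" wording above; the exact condition in the runtime may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct MethodDesc { int token; };  // hypothetical stand-in for the runtime type

// Hypothetical model of the entry point map that counts probes, so the
// effect of the preprocessor gate is observable.
struct EntryPointMapModel {
    std::unordered_map<std::uintptr_t, MethodDesc*> entries;
    mutable int probes = 0;

    MethodDesc* lookup(std::uintptr_t key) const {
        ++probes;
        auto it = entries.find(key);
        return it == entries.end() ? nullptr : it->second;
    }
};

MethodDesc* getMethodDescForEntryPoint(const EntryPointMapModel& map,
                                       std::uintptr_t entryPoint) {
#if (defined(TARGET_AMD64) || defined(TARGET_X86)) && !defined(DACCESS_COMPILE)
    // Live runtime only: an odd address is never a key, so skip the lookup.
    if ((entryPoint & 0x1) != 0)
        return nullptr;
#endif
    // DAC builds (and other architectures) always probe, so the bucket
    // pages are read and can be enumerated into the dump.
    return map.lookup(entryPoint);
}

// Usage: compiled without the TARGET_* macros, this behaves like a build
// where the fast path is compiled out and every call probes the map.
void runGateDemo() {
    MethodDesc md{42};
    EntryPointMapModel map;
    map.entries[0x1000] = &md;

    assert(getMethodDescForEntryPoint(map, 0x1000) == &md);
    assert(getMethodDescForEntryPoint(map, 0x1001) == nullptr);
}
```

Both configurations return the same result for an odd address; the only difference is whether the map is touched, which is what matters for dump enumeration.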
Validation
- coreclr.dll (non-DAC) and mscordaccore.dll (DAC): both compilations of readytoruninfo.cpp succeed with 0 warnings / 0 errors, confirming the #if change is valid in both configurations.
- Microsoft.Diagnostics.DataContractReader.Tests: all green.
Note
This PR description was drafted with assistance from GitHub Copilot.