
JIT: only skip loop inversion when the IV test is the bottom test #129868

Open
AndyAyersMS wants to merge 2 commits into dotnet:main from AndyAyersMS:AndyAyersMS/loop-inversion-ivtest


@AndyAyersMS

Member

The bail-out added in #128459 fires for any BBJ_COND latch (or BBJ_COND predecessor of a canonical BBJ_ALWAYS latch) that exits the loop. That block may be an unrelated early-exit and not the loop's continuation test — for example, Dictionary<TKey,TValue>+Enumerator.MoveNext's body has an early-exit BBJ_COND (if (entry.next >= -1) return true;) while the actual loop continuation is the top-tested _index < _count check in the header. The current predicate falsely classifies that shape as "already bottom-tested" and skips a beneficial inversion.
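To make the misclassified shape concrete, here is a minimal C++ sketch of a top-tested loop whose body contains an unrelated early exit. It is an illustrative analogue of the MoveNext pattern described above, not the actual Dictionary source; `Entry`, `Enumerator`, and their fields are hypothetical simplifications.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a dictionary entry.
// Only `next` matters for the control-flow shape.
struct Entry {
    int next;
    int value;
};

// Hypothetical enumerator; models the loop shape only.
struct Enumerator {
    const std::vector<Entry>* entries = nullptr;
    std::size_t index = 0;
    int current = 0;

    bool MoveNext() {
        // Loop continuation test: evaluated at the top, in the header.
        while (index < entries->size()) {
            const Entry& entry = (*entries)[index++];
            // Early exit inside the body. This conditional either leaves
            // the loop or falls through to the back edge, so it sits in the
            // latch position even though it is not the continuation test.
            if (entry.next >= -1) {
                current = entry.value;
                return true;
            }
        }
        current = 0;
        return false;
    }
};
```

In this shape the conditional that feeds the back edge is the early exit, while the induction-variable test stays in the header, so the loop is still top-tested and can benefit from inversion.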

Restrict the bail-out to the case where the recognized IV test (from AnalyzeIteration) is at that latch / predecessor position. AnalyzeIteration is called lazily so only loops that match the structural shape pay the cost.
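The refined check can be sketched roughly as follows. This is a self-contained model under stated assumptions, not the JIT's code: `Block`, `Loop`, `JumpKind`, `BottomTestCandidate`, and `ShouldSkipInversion` are hypothetical names, and the mock analysis simply returns a preset block where the real AnalyzeIteration would do actual work.

```cpp
// All types here are hypothetical stand-ins, not the JIT's data structures.
enum class JumpKind { Always, Cond };

struct Block {
    JumpKind kind;
    bool exitsLoop;          // true if one successor leaves the loop
    const Block* uniquePred; // only consulted for an Always latch
};

struct Loop {
    const Block* latch = nullptr;
    const Block* ivTestBlock = nullptr; // what the mock analysis reports
    int analysisRuns = 0;               // counts how often analysis really ran
    bool analyzed = false;

    // Mock of a lazy, cached iteration analysis: runs at most once.
    const Block* AnalyzeIterationLazily() {
        if (!analyzed) {
            analyzed = true;
            ++analysisRuns;
        }
        return ivTestBlock;
    }
};

// Structural match: an exiting Cond latch, or an exiting Cond predecessor
// of an Always latch. Returns nullptr when the shape does not match.
const Block* BottomTestCandidate(const Loop& loop) {
    const Block* latch = loop.latch;
    if (latch->kind == JumpKind::Cond && latch->exitsLoop) {
        return latch;
    }
    const Block* pred = latch->uniquePred;
    if (latch->kind == JumpKind::Always && pred != nullptr &&
        pred->kind == JumpKind::Cond && pred->exitsLoop) {
        return pred;
    }
    return nullptr;
}

// Skip inversion only when the recognized IV test is the candidate block.
bool ShouldSkipInversion(Loop& loop) {
    const Block* candidate = BottomTestCandidate(loop);
    if (candidate == nullptr) {
        return false; // no structural match, so the analysis never runs
    }
    return loop.AnalyzeIterationLazily() == candidate;
}
```

Checking the cheap structural shape first is what keeps the analysis cost confined to candidate loops in this model.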

Fixes the Dictionary regression reported in #128910

System.Collections.Generic.Dictionary`2+Enumerator[int,int]:MoveNext codegen:

| Version | Code size | Instructions | PerfScore |
| --- | --- | --- | --- |
| Pre-#128459 (target) | 129 bytes | 43 | 105.38 |
| Post-#128459 (regressed) | 114 bytes | 38 | 114.88 |
| With this fix | 129 bytes | 43 | 105.38 |

BDN IterateForEach<Int32>.Dictionary(Size: 512) on AMD Zen 3 (3 interleaved rounds, mean of 3):

  • current main: 501.9 ns
  • with fix: 490.1 ns (−2.4%)
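For reference, the relative changes quoted here can be reproduced with a small helper. `PercentChange` is a hypothetical name used only for this arithmetic check; the inputs are the figures reported above.

```cpp
// Relative change from `before` to `after`, in percent.
constexpr double PercentChange(double before, double after) {
    return (after - before) / before * 100.0;
}

// BDN mean: 501.9 ns on current main vs 490.1 ns with the fix.
constexpr double kBdnChange = PercentChange(501.9, 490.1);

// PerfScore: 105.38 before #128459 vs 114.88 after it.
constexpr double kPerfScoreRegression = PercentChange(105.38, 114.88);
```

With these inputs the BDN change comes out at about −2.35%, consistent with the rounded −2.4% above, and the PerfScore regression that this fix removes is roughly 9.0%.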

The Viper lab is Zen 4 where the regression is +6%; the fix restores byte-identical pre-#128459 codegen, so the lab perf should fully recover.

SPMI impact

| Collection | Code size | PerfScore overall | PerfScore on diffs | TP (release) |
| --- | --- | --- | --- | --- |
| benchmarks.run_pgo | +0.18% | −0.04% | −1.07% | +0.46% |
| aspnet2.run | +0.09% | +0.07% | −0.77% | +0.28% |
| libraries.pmi | +0.07% | −0.02% | −0.79% | +0.11% |

~1,600 methods get inversion re-enabled. TP cost mostly comes from running the actual inversion phase on those methods; the lazy AnalyzeIteration call itself is a small fraction.

Original target of #128459

The Perf_Encoding regression that #128459 fixed (Utf8Utility.GetPointerToFirstInvalidByte on arm64) has a true bottom-tested IV-controlled loop where AnalyzeIteration recognizes the latch as the IV test, so this refined predicate still bails out there.
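For contrast, a loop that is already bottom-tested on its induction variable looks roughly like the sketch below. This is a hypothetical C++ helper written for illustration, not the Utf8Utility code; `CountLeadingAscii` and its parameters are assumptions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: counts the leading bytes below 0x80.
std::size_t CountLeadingAscii(const std::uint8_t* data, std::size_t length) {
    if (length == 0) {
        return 0; // zero-trip guard, the same role inversion's guard plays
    }
    std::size_t i = 0;
    do {
        if (data[i] >= 0x80) {
            break; // early exit in the middle of the body
        }
        ++i;
    } while (i < length); // IV test at the bottom, feeding the back edge
    return i;
}
```

Here the conditional at the latch is the induction-variable test itself, so there is nothing left for inversion to do and a bail-out is the right call.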

Note

This PR description was drafted with assistance from GitHub Copilot CLI.

The bail-out added in dotnet#128459 fires for any BBJ_COND latch (or
BBJ_COND predecessor of a canonical BBJ_ALWAYS latch) that exits
the loop, but that block may be an unrelated early-exit and not
the loop's continuation test. Restrict the bail-out to the case
where the recognized IV test (from AnalyzeIteration) is at that
position. AnalyzeIteration is called lazily so only loops that
match the structural shape pay the cost.

Fixes the Dictionary regression reported in dotnet#128910.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 25, 2026 20:17
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 25, 2026
@dotnet-policy-service

Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment

Contributor


Pull request overview

This PR refines Compiler::optTryInvertWhileLoop’s “already bottom-tested” bail-out so it only triggers when the back-edge test corresponds to the IV test recognized by FlowGraphNaturalLoop::AnalyzeIteration, avoiding false positives from unrelated early-exit conditionals that happen to feed the back-edge.

Changes:

  • Tighten the “skip inversion for bottom-tested loops” check to require the exiting BBJ_COND at the latch (or predecessor of a canonical BBJ_ALWAYS latch) to match NaturalLoopIterInfo::TestBlock.
  • Add a lazy, cached AnalyzeIteration invocation so only candidate loops pay the analysis cost.
  • Update dump messages/comments to reflect the refined criteria.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@AndyAyersMS

Member Author

@jakobbotsch PTAL
@dotnet/jit-contrib

A bit more fussing with loop inversion, trying to better refine the criteria for invertible multi-exit loops and fix some regressions.

@AndyAyersMS AndyAyersMS requested a review from jakobbotsch June 25, 2026 20:51