JIT: only skip loop inversion when the IV test is the bottom test#129868
Open
AndyAyersMS wants to merge 2 commits into
Open
JIT: only skip loop inversion when the IV test is the bottom test#129868AndyAyersMS wants to merge 2 commits into
AndyAyersMS wants to merge 2 commits into
Conversation
The bail-out added in dotnet#128459 fires for any BBJ_COND latch (or BBJ_COND predecessor of a canonical BBJ_ALWAYS latch) that exits the loop, but that block may be an unrelated early-exit and not the loop's continuation test. Restrict the bail-out to the case where the recognized IV test (from AnalyzeIteration) is at that position. AnalyzeIteration is called lazily so only loops that match the structural shape pay the cost. Fixes the Dictionary regression reported in dotnet#128910. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR refines Compiler::optTryInvertWhileLoop’s “already bottom-tested” bail-out so it only triggers when the back-edge test corresponds to the IV test recognized by FlowGraphNaturalLoop::AnalyzeIteration, avoiding false positives from unrelated early-exit conditionals that happen to feed the back-edge.
Changes:
- Tighten the “skip inversion for bottom-tested loops” check to require the exiting
BBJ_CONDat the latch (or predecessor of a canonicalBBJ_ALWAYSlatch) to matchNaturalLoopIterInfo::TestBlock. - Add a lazy, cached
AnalyzeIterationinvocation so only candidate loops pay the analysis cost. - Update dump messages/comments to reflect the refined criteria.
Member
Author
|
@jakobbotsch PTAL A bit more fussing with loop inversion, trying to refine the criteria for invertible multi-exit loops better, and fix some regressions. |
jakobbotsch
approved these changes
Jun 26, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The bail-out added in #128459 fires for any BBJ_COND latch (or BBJ_COND predecessor of a canonical BBJ_ALWAYS latch) that exits the loop. That block may be an unrelated early-exit and not the loop's continuation test — for example,
Dictionary<TKey,TValue>+Enumerator.MoveNext's body has an early-exit BBJ_COND (if (entry.next >= -1) return true;) while the actual loop continuation is the top-tested_index < _countcheck in the header. The current predicate falsely classifies that shape as "already bottom-tested" and skips a beneficial inversion.Restrict the bail-out to the case where the recognized IV test (from
AnalyzeIteration) is at that latch / predecessor position.AnalyzeIterationis called lazily so only loops that match the structural shape pay the cost.Fixes the Dictionary regression reported in #128910
System.Collections.Generic.Dictionary2+Enumerator[int,int]:MoveNext` codegen:BDN
IterateForEach<Int32>.Dictionary(Size: 512)on AMD Zen 3 (3 interleaved rounds, mean of 3):The Viper lab is Zen 4 where the regression is +6%; the fix restores byte-identical pre-#128459 codegen, so the lab perf should fully recover.
SPMI impact
~1,600 methods get inversion re-enabled. TP cost mostly comes from running the actual inversion phase on those methods; the lazy
AnalyzeIterationcall itself is a small fraction.Original target of #128459
The Perf_Encoding regression that #128459 fixed (
Utf8Utility.GetPointerToFirstInvalidByteon arm64) has a true bottom-tested IV-controlled loop whereAnalyzeIterationrecognizes the latch as the IV test, so this refined predicate still bails out there.Note
This PR description was drafted with assistance from GitHub Copilot CLI.