JIT: Try to retain entry weight during profile synthesis #111971
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Diffs show numerous methods where we can compute a more precise call count, rather than falling back to
There's kind of a chicken-and-egg problem here, because we're about to recompute the weights of all the blocks that are preds of the entry block, so relying on the existing weights of those blocks seems a bit odd. Can you post a simple(-ish) non-OSR example where this changes things? If the entry is a loop head (not sure that is possible anymore with the omnipresent scratch BB) there is a
I had #110693 where I tried to compute
Thanks for pointing this out; this seems better than relying on the old weights. One thing I notice with this approach is
Sure. For
Whereas if we derive the entry weight from the loop's cyclic probability:
I'm tempted to revive this, since we'd ideally compute this early when the profile is still consistent: I suspect some of the diffs you got on that PR had to do with OSR methods having nonsensical weights on
Sounds right. IMO it would be best to go that route since otherwise we may just end up churning things twice...
I think we still run into an ordering problem where we don't know how much flow makes it back into the entry block until profile synthesis has run, but perhaps we can make profile synthesis responsible for updating/setting
If you have a If the entry block has backedges I believe we'll always find a loop there, since there is no other possible entry (ignoring OSR for the time being).
So in the case where the entry block is also a loop header, we have to cache the block's original weight before running
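For the loop-header case discussed above, a minimal sketch of how a header's weight relates to the weight entering the loop through its cyclic probability may help. It assumes the usual relationship in which a loop whose backedges return flow to the header with likelihood p has cyclic probability 1 / (1 - p). The names `CyclicProbability` and `EntryWeightFromHeader` and the cap value are hypothetical and are not taken from the JIT sources.

```cpp
#include <algorithm>

// Hypothetical helper: cyclic probability of a loop whose backedges carry
// flow back to the header with the given likelihood. The cap is an assumed
// value that keeps the result finite for loops that appear never to exit.
double CyclicProbability(double backedgeLikelihood)
{
    const double cap = 0.999; // assumption, not the JIT's actual constant
    const double p   = std::min(std::max(backedgeLikelihood, 0.0), cap);
    return 1.0 / (1.0 - p);
}

// Hypothetical helper: if headerWeight == entryWeight * cyclicProbability,
// then the weight entering the loop from outside is recovered by division.
double EntryWeightFromHeader(double headerWeight, double cyclicProbability)
{
    return headerWeight / cyclicProbability;
}
```

Under these assumptions, a header with weight 200 and a backedge likelihood of 0.5 implies that 100 units of weight enter the loop from outside, which is the quantity that would need to be preserved across synthesis.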
Part of #107749. Prerequisite to #111915. Regardless of the profile synthesis option used, we ought to maintain the method's entry weight, which is computed by summing all non-flow weight into the entry block. Ideally, we'd use `fgCalledCount` here, but this isn't computed until after morph, and we need to tolerate the existence of multiple entry blocks for methods with OSR pre-morph.
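As a rough illustration of "non-flow weight into the entry block", the sketch below subtracts the weight arriving along incoming flow edges from the entry block's total weight, leaving the portion attributable to calls of the method. The `Block` and `Edge` types and the `NonFlowEntryWeight` function are simplified, hypothetical stand-ins rather than the JIT's actual data structures, and clamping at zero is an assumption for tolerating an inconsistent profile.

```cpp
#include <vector>

// Hypothetical, simplified stand-ins for flow graph types.
struct Block;

struct Edge
{
    const Block* source;     // predecessor block
    double       likelihood; // fraction of source's weight flowing along this edge
};

struct Block
{
    double            weight; // profile weight of the block
    std::vector<Edge> preds;  // incoming flow edges
};

// Entry weight = block weight minus weight arriving via flow edges.
// Clamped at zero (an assumption) in case the profile is inconsistent.
double NonFlowEntryWeight(const Block& entry)
{
    double flowIn = 0.0;
    for (const Edge& edge : entry.preds)
    {
        flowIn += edge.source->weight * edge.likelihood;
    }

    const double nonFlow = entry.weight - flowIn;
    return (nonFlow < 0.0) ? 0.0 : nonFlow;
}
```

With an entry block of weight 150 whose only predecessor has weight 100 and an edge likelihood of 0.5, this yields a non-flow entry weight of 100.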