-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Reduce allocations in AssemblySymbol.GetTopLevelTypeByMetadataName #79810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce allocations in AssemblySymbol.GetTopLevelTypeByMetadataName #79810
Conversation
This allocation shows up as over 5% of allocations during completion in the CompletionInCohosting razor speedometer test. The issue here is that the size of this array can exceed the the PooledArrayLengthLimitExclusive threshold specified in ArrayBuilder, thus causing items to not get placed back into the standard pool.
{ | ||
var referenceManager = GetBoundReferenceManager(); | ||
|
||
int length = referenceManager.ReferencedAssemblies.Length; | ||
|
||
assemblies.EnsureCapacity(assemblies.Count + length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove this? lists can have their capacity ensured as well, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it doesn't really provide any value for a pooled object. Letting the array grow organically shouldn't show any allocation differences, assuming it goes back in the pool.
@dotnet/roslyn-compiler -- ptal |
@@ -30,6 +30,9 @@ internal abstract class AssemblySymbol : Symbol, IAssemblySymbolInternal | |||
// to the VB version. | |||
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | |||
|
|||
// separate pool for assembly symbols as these collections commonly exceed ArrayBuilder's size threshold | |||
private static readonly ObjectPool<List<AssemblySymbol>> s_symbolPool = new ObjectPool<List<AssemblySymbol>>(() => new List<AssemblySymbol>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify a bit? The reason I did it this way:
-
the discardLargeInstances version of ArrayBuilder.GetInstance isn't available within the compiler directory.
-
I'm not seeing the advantage of using ObjectPool<ArrayBuilder> rather than using an ObjectPool<List> (there are definitely more hits of the latter in the roslyn codebase than the former). If you know of an advantage, I'm definitely willing to investigate and change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the clarification. The primary motivation for the question was the fact that keeping the same type would require less code changes at use sites.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the discardLargeInstances version of ArrayBuilder.GetInstance isn't available within the compiler directory.
Does ObjectPool<List<AssemblySymbol>>
behave better in this regard?
I'm not seeing the advantage of using
ObjectPool<ArrayBuilder>
rather than using anObjectPool<List>
What are the advantages of using ObjectPool<List>
vs. ObjectPool<ArrayBuilder>
in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure it can work either way, but for me, ObjectPool and ArrayBuilder each have their own pooling semantics, and mixing them up confuses things (for me). As mentioned, ObjectPool<List> has more usage, including in the compilers folder, so it seemed like an appropriate choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference would be to not switch from an ArrayBuilder
to List
just because we want to use different pooling strategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure we're on the same page:
The code before was using ArrayBuilder<AssemblySymbol> but that is a poor fit because of the size threshold ArrayBuilder uses and that the discardLargeInstances override isn't available in the compilers folder.
So, the code needed to be switched to another type, and specifically an ObjectPool of some collection type. To me, ObjectPool<ArrayBuilder> conflates multiple pooling technicques whereas ObjectPool<List> is commonly used in similar scenarios throughout the codebase.
If this were to use ObjectPool<ArrayBuilder>, the code would have to use new ArrayBuilder() in the pool to create it (not something code outside the ArrayBuilder class usually does) and it wouldn't call Free on it because the ArrayBuilder wouldn't be tied to the pool that that class knows about. I'm not seeing the benefit to wanting ObjectPool<ArrayBuilder> over ObjectPool<List>.
As I said, that concept is confusing to me, so I want to make sure I understand your proposal and it's value before trying something different from what is in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but that is a poor fit because of the size threshold ArrayBuilder uses and that the discardLargeInstances override isn't available in the compilers folder.
Is the issue with the collection behavior or the pooling strategy used by default? My current understanding is that the problem is with the default pooling strategy, not the collection behavior itself. Therefore, changing the collection type is not necessary and, in my opinion, is not justified.
If this were to use ObjectPool<ArrayBuilder>, the code would have to use new ArrayBuilder() in the pool to create it (not something code outside the ArrayBuilder class usually does) and it wouldn't call Free on it because the ArrayBuilder wouldn't be tied to the pool that that class knows about.
This doesn't scary me at all. If the problem is with the pooling strategy, that is what we should be fixing, and, if one place in code does that for ArrayBuilder, that is not a problem. We have plenty of places that use specialized (non-default) pools for various objects. Comments can help with explaining unusual logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone ahead and pushed a change to use an ObjectPool<ArrayBuilder> instead of ObjectPool<List>. I personally find this more confusing than the original change (and think there is a reason basically no other code uses ArrayBuilders like this), but I don't think I'm able to convince you otherwise, and it's I don't think it woud be worthwhile to continue to try to do so. Let me know if this matches your expectation. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and think there is a reason basically no other code uses ArrayBuilders like this
I think there is only one reason for that - the default pooling worked well. A lack of existing example on itself is not an indication that doing something is wrong, we are in the business of doing things that were never done before.
In general, decisions what collection to use and what pooling strategy to use are two distinct and independent decisions.
…er<AssemblySymbol>> per PR suggestion
… calling that as the ArrayBuilder implementation doesn't suffer from the same issues as the List implementation
@@ -1327,8 +1327,6 @@ internal void GetUnaliasedReferencedAssemblies(ArrayBuilder<AssemblySymbol> asse | |||
|
|||
int length = referenceManager.ReferencedAssemblies.Length; | |||
|
|||
assemblies.EnsureCapacity(assemblies.Count + length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not obvious why this statement is removed. Do we have data proving that the removal has a noticeable positive impact? I would think that this statement was allowing us to reduce amount of times the builder was reallocated. Letting it grow on demand inside the loop might result in multiple reallocations. Assembly references with aliases are not very common, therefore, the used capacity should be pretty accurate. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling EnsureCapacity on a pooled data structure has virtually no benefit and could have a bad performance impact. For example, if the data structure continually grows by a constant number of elements, you end up reallocating each time, instead of taking advantage of the exponential growth that the Add call allows.
If the data structure isn't pooled, then I totally agree that ensuring capacity before calling Add makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling EnsureCapacity on a pooled data structure has virtually no benefit
Well, I described one possible benefit above.
and could have a bad performance impact. For example, if the data structure continually grows by a constant number of elements, you end up reallocating each time, instead of taking advantage of the exponential growth that the Add call allows.
This is a good theory. Do you have data supporting it for this specific code path, i.e. supporting a need for this change? If not, I am leaning towards reverting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen such an allocation profile before, but I have no data on this particular call experiencing this. I've reverted the change. I cared enough to make what is almost certainly an improvement, but not enough to provide anything more than a simple explanation for why it's almost certainly a proper choice.
@@ -30,6 +30,9 @@ internal abstract class AssemblySymbol : Symbol, IAssemblySymbolInternal | |||
// to the VB version. | |||
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! | |||
|
|||
// separate pool for assembly symbols as these collections commonly exceed ArrayBuilder's size threshold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Done with review pass (commit 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (commit 5)
@ToddGrun Please squash commits while merging. |
This allocation shows up as a varying amount of allocations (from 0.1% to 10%) in the Roslyn CodeAnalysis process during completion in the CompletionInCohosting razor speedometer test.
The issue here is that the size of this array can exceed the the PooledArrayLengthLimitExclusive threshold specified in ArrayBuilder, thus causing items to not get placed back into the standard pool.
Test insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/659384
Profiles from the test insertion baseline exhibited large variations in AssemblySymbol[] allocations. Some had 15 MB allocated of this type, some had over a GB allocated. With this change, it was 1-2 MB.
*** From one of the bad before profiles ***

*** None of the after profiles had high allocation numbers for AssemblySymbol[], as evidenced by ***
