Skip to content

Conversation

ToddGrun
Copy link
Contributor

@ToddGrun ToddGrun commented Aug 6, 2025

This allocation shows up as a varying amount of allocations (from 0.1% to 10%) in the Roslyn CodeAnalysis process during completion in the CompletionInCohosting razor speedometer test.

The issue here is that the size of this array can exceed the the PooledArrayLengthLimitExclusive threshold specified in ArrayBuilder, thus causing items to not get placed back into the standard pool.

Test insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/659384

Profiles from the test insertion baseline exhibited large variations in AssemblySymbol[] allocations. Some had 15 MB allocated of this type, some had over a GB allocated. With this change, it was 1-2 MB.

*** From one of the bad before profiles ***
image

*** None of the after profiles had high allocation numbers for AssemblySymbol[], as evidenced by ***
image

This allocation shows up as over 5% of allocations during completion in the CompletionInCohosting razor speedometer test.

The issue here is that the size of this array can exceed the the PooledArrayLengthLimitExclusive threshold specified in ArrayBuilder, thus causing items to not get placed back into the standard pool.
{
var referenceManager = GetBoundReferenceManager();

int length = referenceManager.ReferencedAssemblies.Length;

assemblies.EnsureCapacity(assemblies.Count + length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove this? lists can have their capacity ensured as well, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it doesn't really provide any value for a pooled object. Letting the array grow organically shouldn't show any allocation differences, assuming it goes back in the pool.

@ToddGrun ToddGrun changed the title *** WIP: Reduce allocations in Assembly.GetTopLevelTypeByMetadataName Reduce allocations in AssemblySymbol.GetTopLevelTypeByMetadataName Aug 7, 2025
@ToddGrun ToddGrun marked this pull request as ready for review August 7, 2025 00:25
@ToddGrun ToddGrun requested a review from a team as a code owner August 7, 2025 00:25
@ToddGrun
Copy link
Contributor Author

@dotnet/roslyn-compiler -- ptal

@@ -30,6 +30,9 @@ internal abstract class AssemblySymbol : Symbol, IAssemblySymbolInternal
// to the VB version.
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

// separate pool for assembly symbols as these collections commonly exceed ArrayBuilder's size threshold
private static readonly ObjectPool<List<AssemblySymbol>> s_symbolPool = new ObjectPool<List<AssemblySymbol>>(() => new List<AssemblySymbol>());
Copy link
Contributor

@AlekseyTs AlekseyTs Aug 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List

Can we still use a pool of ArrayBuilders, just with significantly big size threshold? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify a bit? The reason I did it this way:

  1. the discardLargeInstances version of ArrayBuilder.GetInstance isn't available within the compiler directory.

  2. I'm not seeing the advantage of using ObjectPool<ArrayBuilder> rather than using an ObjectPool<List> (there are definitely more hits of the latter in the roslyn codebase than the former). If you know of an advantage, I'm definitely willing to investigate and change this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the clarification. The primary motivation for the question was the fact that keeping the same type would require less code changes at use sites.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the discardLargeInstances version of ArrayBuilder.GetInstance isn't available within the compiler directory.

Does ObjectPool<List<AssemblySymbol>> behave better in this regard?

I'm not seeing the advantage of using ObjectPool<ArrayBuilder> rather than using an ObjectPool<List>

What are the advantages of using ObjectPool<List> vs. ObjectPool<ArrayBuilder> in this case?

Copy link
Contributor Author

@ToddGrun ToddGrun Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it can work either way, but for me, ObjectPool and ArrayBuilder each have their own pooling semantics, and mixing them up confuses things (for me). As mentioned, ObjectPool<List> has more usage, including in the compilers folder, so it seemed like an appropriate choice.

Copy link
Contributor

@AlekseyTs AlekseyTs Aug 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be to not switch from an ArrayBuilder to List just because we want to use different pooling strategy.

Copy link
Contributor Author

@ToddGrun ToddGrun Aug 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure we're on the same page:

The code before was using ArrayBuilder<AssemblySymbol> but that is a poor fit because of the size threshold ArrayBuilder uses and that the discardLargeInstances override isn't available in the compilers folder.

So, the code needed to be switched to another type, and specifically an ObjectPool of some collection type. To me, ObjectPool<ArrayBuilder> conflates multiple pooling technicques whereas ObjectPool<List> is commonly used in similar scenarios throughout the codebase.

If this were to use ObjectPool<ArrayBuilder>, the code would have to use new ArrayBuilder() in the pool to create it (not something code outside the ArrayBuilder class usually does) and it wouldn't call Free on it because the ArrayBuilder wouldn't be tied to the pool that that class knows about. I'm not seeing the benefit to wanting ObjectPool<ArrayBuilder> over ObjectPool<List>.

As I said, that concept is confusing to me, so I want to make sure I understand your proposal and it's value before trying something different from what is in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but that is a poor fit because of the size threshold ArrayBuilder uses and that the discardLargeInstances override isn't available in the compilers folder.

Is the issue with the collection behavior or the pooling strategy used by default? My current understanding is that the problem is with the default pooling strategy, not the collection behavior itself. Therefore, changing the collection type is not necessary and, in my opinion, is not justified.

If this were to use ObjectPool<ArrayBuilder>, the code would have to use new ArrayBuilder() in the pool to create it (not something code outside the ArrayBuilder class usually does) and it wouldn't call Free on it because the ArrayBuilder wouldn't be tied to the pool that that class knows about.

This doesn't scary me at all. If the problem is with the pooling strategy, that is what we should be fixing, and, if one place in code does that for ArrayBuilder, that is not a problem. We have plenty of places that use specialized (non-default) pools for various objects. Comments can help with explaining unusual logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone ahead and pushed a change to use an ObjectPool<ArrayBuilder> instead of ObjectPool<List>. I personally find this more confusing than the original change (and think there is a reason basically no other code uses ArrayBuilders like this), but I don't think I'm able to convince you otherwise, and it's I don't think it woud be worthwhile to continue to try to do so. Let me know if this matches your expectation. Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and think there is a reason basically no other code uses ArrayBuilders like this

I think there is only one reason for that - the default pooling worked well. A lack of existing example on itself is not an indication that doing something is wrong, we are in the business of doing things that were never done before.

In general, decisions what collection to use and what pooling strategy to use are two distinct and independent decisions.

… calling that as the ArrayBuilder implementation doesn't suffer from the same issues as the List implementation
@@ -1327,8 +1327,6 @@ internal void GetUnaliasedReferencedAssemblies(ArrayBuilder<AssemblySymbol> asse

int length = referenceManager.ReferencedAssemblies.Length;

assemblies.EnsureCapacity(assemblies.Count + length);
Copy link
Contributor

@AlekseyTs AlekseyTs Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assemblies.EnsureCapacity(assemblies.Count + length);

It is not obvious why this statement is removed. Do we have data proving that the removal has a noticeable positive impact? I would think that this statement was allowing us to reduce amount of times the builder was reallocated. Letting it grow on demand inside the loop might result in multiple reallocations. Assembly references with aliases are not very common, therefore, the used capacity should be pretty accurate. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling EnsureCapacity on a pooled data structure has virtually no benefit and could have a bad performance impact. For example, if the data structure continually grows by a constant number of elements, you end up reallocating each time, instead of taking advantage of the exponential growth that the Add call allows.

If the data structure isn't pooled, then I totally agree that ensuring capacity before calling Add makes sense.

Copy link
Contributor

@AlekseyTs AlekseyTs Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling EnsureCapacity on a pooled data structure has virtually no benefit

Well, I described one possible benefit above.

and could have a bad performance impact. For example, if the data structure continually grows by a constant number of elements, you end up reallocating each time, instead of taking advantage of the exponential growth that the Add call allows.

This is a good theory. Do you have data supporting it for this specific code path, i.e. supporting a need for this change? If not, I am leaning towards reverting it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen such an allocation profile before, but I have no data on this particular call experiencing this. I've reverted the change. I cared enough to make what is almost certainly an improvement, but not enough to provide anything more than a simple explanation for why it's almost certainly a proper choice.

@@ -30,6 +30,9 @@ internal abstract class AssemblySymbol : Symbol, IAssemblySymbolInternal
// to the VB version.
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

// separate pool for assembly symbols as these collections commonly exceed ArrayBuilder's size threshold
Copy link
Contributor

@AlekseyTs AlekseyTs Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//

Consider using standard doc comment syntax #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@AlekseyTs
Copy link
Contributor

Done with review pass (commit 3)

Copy link
Contributor

@AlekseyTs AlekseyTs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (commit 5)

@AlekseyTs
Copy link
Contributor

@ToddGrun Please squash commits while merging.

@ToddGrun ToddGrun merged commit 3faf4ab into dotnet:main Sep 9, 2025
24 checks passed
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Sep 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants