Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Understand and improve ConfigurationBinder Source Generator incremental behavior #92914

Open
ericstj opened this issue Oct 2, 2023 · 5 comments
Labels
area-Extensions-Configuration source-generator Indicates an issue with a source generator feature tenet-build-performance Impacts build time: official, developer or CI
Milestone

Comments

@ericstj
Copy link
Member

ericstj commented Oct 2, 2023

I would expect a well-behaved generator to only execute when code is changed that activates it. I would expect that generator to demonstrate significantly different performance characteristics when editing code which doesn't activate it vs code that does activate it (or otherwise influence its outputs).

This is not what I observe with the ConfigurationBinder source generator. It seems to be running and consuming ~ the same amount of CPU regardless of if I make edits to unrelated files vs files which should influence the generator output.

I tested it with https://github.com/chsienki/GeneratorTracer and a project that gives the generator a lot of work. I also added explicit events to measure the bulk of the work in the Parse and Emit phases and they agree with this: ericstj@c10104c
And hacked @chsienki's tool to measure those: ericstj/GeneratorTracer@c2e33bf

After doing this I see lots of executions of the parse and emit phases even when making completely unrelated edits.

I'm not certain if what I'm observing here is normal and expected or not, but it does disagree with what I would expect from a functioning incremental build pipeline (only rerun when changes are needed).

I'd like to have a closer look at this with compiler folks and other generator authors to understand if what I see is normal or a problem.
@eiriktsarpalis @CyrusNajmabadi @chsienki @captainsafia

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Oct 2, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Oct 2, 2023
@ericstj ericstj added area-Extensions-Configuration source-generator Indicates an issue with a source generator feature labels Oct 2, 2023
@ghost
Copy link

ghost commented Oct 2, 2023

Tagging subscribers to this area: @dotnet/area-extensions-configuration
See info in area-owners.md if you want to be subscribed.

Issue Details

I would expect a well-behaved generator to only execute when code is changed that activates it. I would expect that generator to demonstrate significantly different performance characteristics when editing code which doesn't activate it vs code that does activate it (or otherwise influence its outputs).

This is not what I observe with the ConfigurationBinder source generator. It seems to be running and consuming ~ the same amount of CPU regardless of if I make edits to unrelated files vs files which should influence the generator output.

I tested it with https://github.com/chsienki/GeneratorTracer and a project that gives the generator a lot of work. I also added explicit events to measure the bulk of the work in the Parse and Emit phases and they agree with this: ericstj@c10104c
And hacked @chsienki's tool to measure those: ericstj/GeneratorTracer@c2e33bf

After doing this I see lots of executions of the parse and emit phases even when making completely unrelated edits.

I'm not certain if what I'm observing here is normal and expected or not, but it does disagree with what I would expect from a functioning incremental build pipeline (only rerun when changes are needed).

I'd like to have a closer look at this with compiler folks and other generator authors to understand if what I see is normal or a problem.
@eiriktsarpalis @CyrusNajmabadi @chsienki @captainsafia

Author: ericstj
Assignees: -
Labels:

untriaged, area-Extensions-Configuration, source-generator, needs-area-label

Milestone: -

@ericstj ericstj added tenet-build-performance Impacts build time: official, developer or CI and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Oct 2, 2023
@tarekgh tarekgh added this to the 8.0.0 milestone Oct 2, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Oct 2, 2023
@eiriktsarpalis
Copy link
Member

After doing this I see lots of executions of the parse and emit phases even when making completely unrelated edits.

Executions of the parse phase is expected in unrelated edits, given the current structure of the generator. What is more concerning is the emit phase, which suggests to me that there might be false negative equality comparisons in certain model instances that are not covered by unit testing.

FWIW I had experienced similar issues while working on STJ incremental values, and it took a bit of debugging until all equality-related bugs were squashed.

@captainsafia
Copy link
Member

Executions of the parse phase is expected in unrelated edits, given the current structure of the generator.

Yes, this meshes with similar things I've seen when profiling RDG with the GeneratorTrace tool as well.

RDG's "parse" phase is particularly expensive because our equality checks need to analyze multiple parts of the invocation syntax (the signature of the delegate parameter, the name of the Map action call being invoked, the name of the parameters being passed to the invocation).

We can certainly resolve this by making our pipeline more incremental, but it comes at the cost of the generator implementation itself being more complex.

I think it would be worthwhile to explore the possibility of having generators that don't run on every edit in the future...

@ericstj
Copy link
Member Author

ericstj commented Oct 10, 2023

I'm moving this issue out of 8.0 - I was able to measure simple usage of ConfigurationBinder and in that case the parse phase is the only phase running on unrelated edits.

I did find when testing our the ConfigurationBinder unit tests assembly (lots of Configuration usage) that we regress to running on keypress -- I need to narrow down what's causing that but given it's not in the simple case I don't think it's blocking 8.0.

This testing did reveal that both the Options and Logging generator run the entire pipeline on every edit. This is because they Combine with the compilation without any further filtering:

IncrementalValueProvider<(Compilation, ImmutableArray<(TypeDeclarationSyntax? TypeSyntax, SemanticModel SemanticModel)>)> compilationAndTypes =
context.CompilationProvider.Combine(typeDeclarations.Collect());

IncrementalValueProvider<(Compilation, ImmutableArray<ClassDeclarationSyntax>)> compilationAndClasses =
context.CompilationProvider.Combine(classDeclarations.Collect());

So whenever the compilation changes, their code-generation step will rerun. They should instead be trying to capture only the relevant portions of the compilation and exposing that in a Select step to better reduce the work from every edit. This doesn't appear to be a regression from previous release for 6.0/7.0 - so it's OK to move that out (while still fixing it in 9.0 with a possibility of backport).

We should fix both of these generators and add tests which leverage tracking names to ensure that they remain well-beahved for incremental compilation.

think it would be worthwhile to explore the possibility of having generators that don't run on every edit in the future...

Agreed. I've raised this question quite a few times during development of the config source generator - what benefit do we really get by regenerating with every edit? Is it just diagnostics that might result from examining updates to the type graph? If so - maybe it's an acceptable tradeoff to defer work until build.

I chatted with @chsienki about adding the ETW events to our generators and it looks like he has something even better in the works in roslyn itself (hooray!) so these changes don't need to go in.

@ericstj
Copy link
Member Author

ericstj commented Aug 6, 2024

We discussed improving the incremental characteristics of the runtime source generators and scoped it out of 9.0

@ericstj ericstj modified the milestones: 9.0.0, 10.0.0 Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Extensions-Configuration source-generator Indicates an issue with a source generator feature tenet-build-performance Impacts build time: official, developer or CI
Projects
None yet
Development

No branches or pull requests

4 participants