Async drop codegen #123948

Merged
merged 2 commits into from
Apr 28, 2025
azhogin
Contributor

@azhogin azhogin commented Apr 15, 2024

Async drop implementation that uses a templated coroutine to generate async drop glue.

Scope changes generate async_drop_in_place() awaits when async-droppable objects go out of scope in an async context.

Implementation details:
https://github.com/azhogin/posts/blob/main/async-drop-impl.md

New fields in the Drop terminator (drop and async_fut). Codegen and Miri must validate that these fields are empty (in the full version, an async Drop terminator is either expanded in the StateTransform pass or reverted to the sync version). Terminator visiting is changed to account for the possible new successor (the drop field).

ResumedAfterDrop panic messages for the case where a coroutine is resumed after its async drop has started.

Lang item for the coroutine generated for the async function async_drop_in_place: async fn async_drop_in_place<T>()::{{closure0}}.

Scope processing to prepare for async drops. An async drop is a hidden Yield, so potentially async drops require the same dropline preparation as Yield terminators.

Processing in StateTransform: async drops are expanded into yield points. Generation of the async drop of the coroutine itself is added.

Shims for AsyncDropGlueCtorShim, AsyncDropGlue and FutureDropPoll.

#[lang = "async_drop"]
pub trait AsyncDrop {
    #[allow(async_fn_in_trait)]
    async fn drop(self: Pin<&mut Self>);
}

impl Drop for Foo {
    fn drop(&mut self) {
        println!("Foo::drop({})", self.my_resource_handle);
    }
}

impl AsyncDrop for Foo {
    async fn drop(self: Pin<&mut Self>) {
        println!("Foo::async drop({})", self.my_resource_handle);
    }
}
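
As a rough mental model of "async drop is a hidden Yield", the hand-written future below finishes its body and then keeps returning Pending until a cleanup future completes. This is only a sketch in stable Rust with no compiler support: the names Task, State, Cleanup, and run are hypothetical and do not appear in the PR, and the real expansion operates on MIR inside StateTransform rather than on source-level futures.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical cleanup future: reports `Pending` once, then `Ready`.
struct Cleanup {
    polled_once: bool,
}

impl Future for Cleanup {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.polled_once {
            Poll::Ready(())
        } else {
            self.polled_once = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// States of a hand-written coroutine: run the body, then await the cleanup.
enum State {
    Body,
    Dropping(Cleanup),
    Done,
}

struct Task {
    state: State,
}

impl Future for Task {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut();
        loop {
            match &mut this.state {
                // The body has nothing to wait for; move on to the cleanup.
                State::Body => {
                    this.state = State::Dropping(Cleanup { polled_once: false });
                }
                // The hidden yield point: suspend while the cleanup is pending.
                State::Dropping(cleanup) => match Pin::new(cleanup).poll(cx) {
                    Poll::Pending => return Poll::Pending,
                    Poll::Ready(()) => {
                        this.state = State::Done;
                        return Poll::Ready(42);
                    }
                },
                State::Done => panic!("coroutine resumed after completion"),
            }
        }
    }
}

/// Waker that does nothing; enough for a busy-polling driver.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` until it is ready; returns the output and the number of polls.
fn run<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

Driving `Task { state: State::Body }` with `run` takes two polls: the first suspends inside the cleanup, the second completes it and yields the result.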

The first async drop glue implementation was reworked to use the same drop elaboration code as sync drop.
async_drop_in_place is now an async fn, so both the async_drop_in_place ctor and the coroutine it produces have their own lang items (AsyncDropInPlace/AsyncDropInPlacePoll) and shim instances (AsyncDropGlueCtorShim/AsyncDropGlue).

pub async unsafe fn async_drop_in_place<T: ?Sized>(_to_drop: *mut T) {
}

AsyncDropGlue shim generation uses elaborate_drops::elaborate_drop to produce the drop ladder (in the same way as for sync drop glue) and then coroutine::StateTransform to convert the function into a coroutine poll function.

The AsyncDropGlue coroutine's layout can't be calculated for a generic T; generating it (in StateTransform) requires the final dropee type to be known. For this reason a templated coroutine was introduced (templated_coroutine_layout(...) etc.).

This approach replaces the first implementation in #121801, which mixes language-level futures.

@rustbot
Collaborator

rustbot commented Apr 15, 2024

r? @oli-obk

rustbot has assigned @oli-obk.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot
Collaborator

rustbot commented Apr 15, 2024

rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.

cc @rust-lang/rust-analyzer

This PR changes MIR

cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @celinval, @vakaras

This PR changes Stable MIR

cc @oli-obk, @celinval, @ouz-a

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 15, 2024
@slanterns
Contributor

How does this relate to #121801?

@azhogin
Contributor Author

azhogin commented Apr 15, 2024

> How does this relate to #121801?

This is a proof of concept of an async drop implementation with only the final drop function (no glue for dropping child objects).
Async drop glue generation is implemented in #121801. We will integrate these two PRs.

@@ -170,7 +170,7 @@ impl<'mir, 'tcx: 'mir, M: Machine<'mir, 'tcx>> InterpCx<'mir, 'tcx, M> {
 }
 }

-Drop { place, target, unwind, replace: _ } => {
+Drop { place, target, unwind, replace: _, drop: _, async_fut: _ } => {
Member

@RalfJung RalfJung Apr 16, 2024

There's something strange going on that's not explained in the comments in syntax.rs -- why are these things entirely ignored by the interpreter and codegen?

Contributor Author

Yes, both the drop and async_fut fields are only used in compiler/rustc_mir_transform/src/coroutine.rs, in the StateTransform pass. In expand_async_drops, async drops are expanded into one or two yield points with a poll ready/pending switch.

Drop terminator comments updated.

Member

In that case codegen and the interpreter should bug! when those fields are present.

Though it seems to me it would probably be better to make this a different terminator. They have very different operational behavior, after all.

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Apr 17, 2024
…-obk

Add simple async drop glue generation

This is a prototype of the async drop glue generation for some simple types. Async drop glue is intended to behave very similarly to the regular drop glue except for being asynchronous. Currently it does not execute synchronous drops but only calls user implementations of the `AsyncDrop::async_drop` associated function and awaits the returned future. It is not complete, as it only recurses into arrays, slices, tuples, and structs and does not have the same sensible restrictions as the old `Drop` trait implementation, such as having the same bounds as the type definition, while the code assumes their existence (this requires future work).

This current design uses a workaround as it does not create any custom async destructor state machine types for ADTs, but instead uses types defined in the std library called future combinators (deferred_async_drop, chain, ready_unit).

Also I recommend reading my [explainer](https://zetanumbers.github.io/book/async-drop-design.html).

This is a part of the [MCP: Low level components for async drop](rust-lang/compiler-team#727) work.

Feature completeness:

 - [x] `AsyncDrop` trait
 - [ ] `async_drop_in_place_raw`/async drop glue generation support for
   - [x] Trivially destructible types (integers, bools, floats, string slices, pointers, references, etc.)
   - [x] Arrays and slices (array pointer is unsized into slice pointer)
   - [x] ADTs (enums, structs, unions)
   - [x] tuple-like types (tuples, closures)
   - [ ] Dynamic types (`dyn Trait`, see explainer's [proposed design](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#async-drop-glue-for-dyn-trait))
   - [ ] coroutines (rust-lang#123948)
 - [x] Async drop glue includes sync drop glue code
 - [x] Cleanup branch generation for `async_drop_in_place_raw`
 - [ ] Union rejects non-trivially async destructible fields
 - [ ] `AsyncDrop` implementation requires same bounds as type definition
 - [ ] Skip trivially destructible fields (optimization)
 - [ ] New [`TyKind::AdtAsyncDestructor`](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#adt-async-destructor-types) and get rid of combinators
 - [ ] [Synchronously undroppable types](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#exclusively-async-drop)
 - [ ] Automatic async drop at the end of the scope in async context
Comment on lines 60 to 86
fn block_on<F>(fut_unpin: F) -> F::Output
where
    F: Future,
{
    let mut fut_pin = pin!(ManuallyDrop::new(fut_unpin));
    let mut fut: Pin<&mut F> = unsafe {
        Pin::map_unchecked_mut(fut_pin.as_mut(), |x| &mut **x)
    };
    let (waker, rx) = simple_waker();
    let mut context = Context::from_waker(&waker);
    let rv = loop {
        match fut.as_mut().poll(&mut context) {
            Poll::Ready(out) => break out,
            // expect wake in polls
            Poll::Pending => rx.try_recv().unwrap(),
        }
    };
    loop {
        match future_drop_poll(fut.as_mut(), &mut context) {
            Poll::Ready(()) => break,
            // expect wake in polls
            Poll::Pending => rx.try_recv().unwrap(),
        }
    }
    rv
}
Contributor

I would say this is a bit overblown and a single poll would have been enough.

Contributor Author

It shows an example of a correct block_on function for an async drop coroutine. I don't see a reason to optimize the test here.
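
For readers without a nightly toolchain, the first half of that block_on can be approximated in stable Rust. The sketch below is an assumption-laden reconstruction, not the test's code: simple_waker is rebuilt here on top of a channel (the test's own helper is not shown in this thread), YieldOnce is a hypothetical future used only to exercise the Pending path, and the second loop over future_drop_poll is left out because that function is not available on stable.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Waker that records each wake-up by sending `()` on a channel.
struct ChannelWaker {
    tx: Mutex<Sender<()>>,
}

impl Wake for ChannelWaker {
    fn wake(self: Arc<Self>) {
        self.wake_by_ref();
    }
    fn wake_by_ref(self: &Arc<Self>) {
        // Ignore send errors: the receiver may already be gone.
        let _ = self.tx.lock().unwrap().send(());
    }
}

/// Assumed shape of the helper used in the test: a waker plus the receiver
/// on which its wake-ups can be observed.
fn simple_waker() -> (Waker, Receiver<()>) {
    let (tx, rx) = channel();
    (Waker::from(Arc::new(ChannelWaker { tx: Mutex::new(tx) })), rx)
}

/// Polls `fut` to completion, insisting that every `Pending` was preceded by
/// a wake-up. The async drop phase of the original test is omitted here.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let (waker, rx) = simple_waker();
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // expect wake in polls
            Poll::Pending => rx
                .try_recv()
                .expect("future returned Pending without waking"),
        }
    }
}

/// Hypothetical future that yields once (waking itself) before completing.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}
```

The try_recv check is what makes a missing wake-up fail loudly instead of spinning forever, which is presumably why the test keeps it in both loops.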

@zetanumbers
Contributor

To clarify: if I understood everyone correctly, this PR implements async drop with some features that a number of Async WG members consider undesirable (regular drop after async drop was suspended, etc.). This is fine as long as we are able to disable those (by restricting synchronous drop for such coroutines).

@azhogin azhogin force-pushed the azhogin/async-drop branch 2 times, most recently from e12d2c6 to cdf91ab Compare April 21, 2024 22:39
replace: _,
drop: _,
async_fut: _,
} => {
Contributor

This should probably match on TerminatorKind::Drop { async_fut: Some(_), drop: Some(_), .. } in the "shouldn't exist at codegen" branch, and put None in these fields in the old drop case.
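
A minimal sketch of what such a split could look like, using a heavily simplified, hypothetical TerminatorKind rather than rustc's real one (field types are stand-ins, and the panic stands in for bug!). Note one deliberate difference from the suggestion above: this version rejects the terminator when either field is set, not only when both are.

```rust
/// Hypothetical stand-in for a MIR basic block index.
type BasicBlock = usize;

/// Simplified model of a terminator; only the fields discussed here are kept.
#[derive(Debug)]
enum TerminatorKind {
    Drop {
        target: BasicBlock,
        /// Extra successor, only meaningful before async drops are expanded.
        drop: Option<BasicBlock>,
        /// Local holding the async drop future (modelled as a plain index).
        async_fut: Option<usize>,
    },
    Return,
}

/// Returns the block a backend would continue at after a synchronous drop,
/// or `None` for terminators that have no drop target.
fn codegen_drop_target(term: &TerminatorKind) -> Option<BasicBlock> {
    match term {
        // Async drops must have been expanded or reverted before this point.
        TerminatorKind::Drop { drop: Some(_), .. }
        | TerminatorKind::Drop { async_fut: Some(_), .. } => {
            panic!("async drop terminator should not exist at codegen: {term:?}")
        }
        // The old, purely synchronous drop case.
        TerminatorKind::Drop { target, drop: None, async_fut: None } => Some(*target),
        TerminatorKind::Return => None,
    }
}
```

Spelling out None in the synchronous arm means the compiler flags the match if another async-related field is added later.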

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 23, 2024
Add simple async drop glue generation

github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Apr 23, 2024
Add simple async drop glue generation

RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 27, 2024
Add simple async drop glue generation

ty::InstanceDef::DropGlue(_, None)
| ty::InstanceDef::AsyncDropGlueCtorShim(_, None)
| ty::InstanceDef::AsyncDropGlue(_, None),
) = def {
Member

Can you make this change to rustc_codegen_cranelift too at

if let ty::InstanceDef::DropGlue(_, None) | ty::InstanceDef::AsyncDropGlueCtorShim(_, None) =
and
InstanceDef::DropGlue(_, None) | ty::InstanceDef::AsyncDropGlueCtorShim(_, None) => {
?

bors added a commit to rust-lang-ci/rust that referenced this pull request May 31, 2024
Implement `needs_async_drop` in rustc and optimize async drop glue

This PR expands on rust-lang#121801 and implements `Ty::needs_async_drop` which works almost exactly the same as `Ty::needs_drop`, which is needed for rust-lang#123948.

Also made the compiler's async drop code look more like the compiler's regular drop code, which enabled me to write an optimization where types that do not use `AsyncDrop` can simply forward async drop glue to `drop_in_place`. This decreased the size of the async block from the [async_drop test](https://github.com/zetanumbers/rust/blob/67980dd6fb11917d23d01a19c2cf4cfc3978aac8/tests/ui/async-await/async-drop.rs) by 12%.
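
The idea behind that optimization can be illustrated with a toy model. Everything below is hypothetical: Ty, Glue, and select_glue are invented for this sketch and are far simpler than rustc's type representation, and needs_async_drop here is only a structural recursion in the spirit of the description, not the compiler's implementation.

```rust
/// Hypothetical, simplified type model; this is not rustc's `Ty`.
enum Ty {
    Int,
    Adt { implements_async_drop: bool, fields: Vec<Ty> },
    Tuple(Vec<Ty>),
    Array(Box<Ty>),
}

/// A type needs async drop if it, or anything it contains, implements
/// `AsyncDrop` (modelled here as a boolean flag on ADTs).
fn needs_async_drop(ty: &Ty) -> bool {
    match ty {
        Ty::Int => false,
        Ty::Adt { implements_async_drop, fields } => {
            *implements_async_drop || fields.iter().any(needs_async_drop)
        }
        Ty::Tuple(elems) => elems.iter().any(needs_async_drop),
        Ty::Array(elem) => needs_async_drop(elem),
    }
}

/// Which kind of glue the sketch would pick for a type.
#[derive(Debug, PartialEq)]
enum Glue {
    /// Nothing asynchronous inside: forward to the regular drop glue.
    ForwardToSyncDrop,
    /// At least one component has an async destructor.
    AsyncStateMachine,
}

fn select_glue(ty: &Ty) -> Glue {
    if needs_async_drop(ty) {
        Glue::AsyncStateMachine
    } else {
        Glue::ForwardToSyncDrop
    }
}
```

In this model a tuple of integers and integer arrays forwards to sync drop, while any ADT that contains an async-droppable field, however deeply nested, gets the async state machine.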
@petrochenkov petrochenkov added the F-async_drop Async drop label Jun 17, 2024
@oli-obk oli-obk added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 28, 2024
@azhogin azhogin force-pushed the azhogin/async-drop branch from c9aacfe to 025a613 Compare June 30, 2024 20:41
@oli-obk
Contributor

oli-obk commented Apr 28, 2025

@bors r+ rollup=never

@bors
Collaborator

bors commented Apr 28, 2025

📌 Commit c366756 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 28, 2025
@bors
Collaborator

bors commented Apr 28, 2025

⌛ Testing commit c366756 with merge 7d65abf...

@bors
Collaborator

bors commented Apr 28, 2025

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 7d65abf to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 28, 2025
@bors bors merged commit 7d65abf into rust-lang:master Apr 28, 2025
7 checks passed
@rustbot rustbot added this to the 1.88.0 milestone Apr 28, 2025

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing a932eb3 (parent) -> 7d65abf (this PR)

Test differences

Show 162 test diffs

Stage 1

  • [crashes] tests/crashes/128695.rs: pass -> [missing] (J1)
  • [crashes] tests/crashes/132103.rs: pass -> [missing] (J1)
  • [ui] tests/ui/async-await/async-drop.rs: pass -> [missing] (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop-future-from-future.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop-future-in-sync-context.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop-glue-array.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop-glue-generic.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop-glue.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop-initial.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop-middle-drop.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop-open.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/async-drop.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/ex-ice-132103.rs: [missing] -> pass (J1)
  • [ui] tests/ui/async-await/async-drop/ex-ice1.rs: [missing] -> pass (J1)
  • [ui] tests/ui/feature-gates/feature-gate-async-drop.rs: [missing] -> pass (J1)

Stage 2

  • [ui] tests/ui/async-await/async-drop.rs: pass -> [missing] (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop-future-from-future.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop-future-in-sync-context.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop-glue-array.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop-glue-generic.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop-glue.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop-initial.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop-middle-drop.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop-open.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/async-drop.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/ex-ice-132103.rs: [missing] -> pass (J0)
  • [ui] tests/ui/async-await/async-drop/ex-ice1.rs: [missing] -> pass (J0)
  • [ui] tests/ui/feature-gates/feature-gate-async-drop.rs: [missing] -> pass (J0)
  • [crashes] tests/crashes/128695.rs: pass -> [missing] (J2)
  • [crashes] tests/crashes/132103.rs: pass -> [missing] (J2)

Additionally, 132 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 7d65abfe80f9eee93296d1ce08f845c9bf7039f8 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-apple-various: 8167.9s -> 6409.8s (-21.5%)
  2. x86_64-apple-1: 6083.4s -> 7256.8s (19.3%)
  3. x86_64-apple-2: 4625.1s -> 5323.5s (15.1%)
  4. aarch64-apple: 4002.4s -> 4317.6s (7.9%)
  5. dist-x86_64-musl: 7277.6s -> 7725.1s (6.1%)
  6. x86_64-gnu-llvm-20-1: 5290.8s -> 5566.6s (5.2%)
  7. test-various: 4108.6s -> 4316.6s (5.1%)
  8. x86_64-msvc-2: 6645.3s -> 6981.1s (5.1%)
  9. dist-x86_64-netbsd: 4891.5s -> 5112.2s (4.5%)
  10. i686-gnu-2: 6393.6s -> 6669.2s (4.3%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (7d65abf): comparison URL.

Overall result: ❌ regressions - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.2%, 0.6%] | 12 |
| Regressions ❌ (secondary) | 1.0% | [0.2%, 2.4%] | 28 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.4% | [0.2%, 0.6%] | 12 |

Max RSS (memory usage)

Results (primary 0.3%, secondary 0.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.9% | [0.5%, 2.2%] | 7 |
| Regressions ❌ (secondary) | 2.4% | [0.7%, 4.2%] | 6 |
| Improvements ✅ (primary) | -0.9% | [-2.0%, -0.5%] | 4 |
| Improvements ✅ (secondary) | -3.2% | [-4.4%, -1.4%] | 3 |
| All ❌✅ (primary) | 0.3% | [-2.0%, 2.2%] | 11 |

Cycles

Results (primary 0.7%, secondary 2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.4%, 1.0%] | 6 |
| Regressions ❌ (secondary) | 2.2% | [0.8%, 3.0%] | 11 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.7% | [0.4%, 1.0%] | 6 |

Binary size

Results (primary 0.3%, secondary 2.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.0%, 1.0%] | 48 |
| Regressions ❌ (secondary) | 2.0% | [0.0%, 4.5%] | 18 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.3% | [0.0%, 1.0%] | 48 |

Bootstrap: 761.378s -> 765.314s (0.52%)
Artifact size: 365.18 MiB -> 365.17 MiB (-0.00%)

@rustbot rustbot added the perf-regression Performance regression. label Apr 28, 2025
@oli-obk
Contributor

oli-obk commented Apr 29, 2025


 -134,124,196  ???:<rustc_trait_selection::traits::select::SelectionContext>::poly_select::{closure
   99,231,099  ???:<rustc_trait_selection::traits::select::SelectionContext>::confirm_candidate
   57,767,214  ???:<rustc_trait_selection::traits::fulfill::FulfillProcessor as rustc_data_structures::obligation_forest::ObligationProcessor>::process_obligation
  -21,943,725  ???:<rustc_trait_selection::traits::select::SelectionContext>::match_impl::{closure
   15,129,826  ???:rustc_mir_transform::simplify::simplify_cfg
   12,888,519  ???:rustc_mir_transform::simplify::remove_dead_blocks
  -11,741,513  ???:<rustc_middle::ty::predicate::Clause as rustc_type_ir::fold::TypeFoldable<rustc_middle::ty::context::TyCtxt>>::fold_with::<rustc_type_ir::binder::ArgFolder<rustc_middle::ty::context::TyCtxt>>
   11,004,458  ???:rustc_middle::ty::util::fold_list::<rustc_type_ir::binder::ArgFolder<rustc_middle::ty::context::TyCtxt>, &rustc_middle::ty::list::RawList<rustc_middle::ty::list::TypeInfo, rustc_middle::ty::predicate::Clause>, rustc_middle::ty::predicate::Clause, <&rustc_middle::ty::list::RawList<rustc_middle::ty::list::TypeInfo, rustc_middle::ty::predicate::Clause> as rustc_type_ir::fold::TypeFoldable<rustc_middle::ty::context::TyCtxt>>::fold_with<rustc_type_ir::binder::ArgFolder<rustc_middle::ty::context::TyCtxt>>::{closure
    7,227,082  ???:<std::sync::poison::once::Once>::call_once_force::<<std::sync::once_lock::OnceLock<alloc::vec::Vec<rustc_middle::mir::BasicBlock>>>::initialize<<std::sync::once_lock::OnceLock<alloc::vec::Vec<rustc_middle::mir::BasicBlock>>>::get_or_init<<rustc_middle::mir::basic_blocks::BasicBlocks>::reverse_postorder::{closure
    4,215,304  ???:<std::sync::poison::once::Once>::call_once_force::<<std::sync::once_lock::OnceLock<rustc_data_structures::graph::dominators::Dominators<rustc_middle::mir::BasicBlock>>>::initialize<<std::sync::once_lock::OnceLock<rustc_data_structures::graph::dominators::Dominators<rustc_middle::mir::BasicBlock>>>::get_or_init<<rustc_middle::mir::basic_blocks::BasicBlocks>::dominators::{closure
    3,200,420  ???:<rustc_mir_transform::remove_noop_landing_pads::RemoveNoopLandingPads as rustc_mir_transform::pass_manager::MirPass>::run_pass
    3,103,429  ???:<rustc_trait_selection::traits::normalize::AssocTypeNormalizer>::fold::<rustc_type_ir::predicate::TraitRef<rustc_middle::ty::context::TyCtxt>>
    3,089,070  ???:<dyn rustc_hir_analysis::hir_ty_lowering::HirTyLowerer>::lower_generic_args_of_path::{closure
   -2,807,695  ???:<dyn rustc_hir_analysis::hir_ty_lowering::HirTyLowerer>::lower_path

some inliner noise, though within confirm_candidate and process_obligation that's likely gonna cause some sort of instruction counter change.

The actual perf regression is more simplify_cfg, remove_dead_blocks, RemoveNoopLandingPads, along with some dominator cache invalidations and recomputations.

@rylev
Member

rylev commented Apr 29, 2025

@azhogin @oli-obk is this perf regression worth thinking about addressing somehow especially considering this isn't a bug fix?

@oli-obk
Contributor

oli-obk commented Apr 29, 2025

Yes, but I have so far been unable to figure out the exact cause of the regression.

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 29, 2025
Use a closure instead of three chained iterators

Should fix the perf regression from rust-lang#123948

r? `@ghost`
VlaDexa added a commit to VlaDexa/rust that referenced this pull request May 2, 2025
…rochenkov

Use a closure instead of three chained iterators

Fixes the perf regression from rust-lang#123948

That PR had chained a third option to the iterator which apparently didn't optimize well
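
To make the shape of that change concrete, here is a hedged illustration of the two styles. It is not the code from either PR: DropTerminator, successors_chained, and for_each_successor are hypothetical names, and the sketch only contrasts visiting up to three successors through chained iterators with visiting them through a callback.

```rust
/// Hypothetical stand-in for a MIR basic block index.
type BasicBlock = u32;

/// Simplified drop terminator with one mandatory and two optional successors.
struct DropTerminator {
    target: BasicBlock,
    unwind: Option<BasicBlock>,
    drop: Option<BasicBlock>,
}

/// Iterator style: each `chain` adds another adapter layer that the
/// optimizer has to see through on every `next` call.
fn successors_chained(t: &DropTerminator) -> impl Iterator<Item = BasicBlock> {
    std::iter::once(t.target).chain(t.unwind).chain(t.drop)
}

/// Closure style: plain control flow, visiting the successors in the same
/// order as the chained version.
fn for_each_successor(t: &DropTerminator, mut f: impl FnMut(BasicBlock)) {
    f(t.target);
    if let Some(bb) = t.unwind {
        f(bb);
    }
    if let Some(bb) = t.drop {
        f(bb);
    }
}
```

Both functions visit the same blocks in the same order; whether the closure form is actually faster depends on how well the iterator chain is inlined, which is what the perf run above was measuring.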
Labels
F-async_drop Async drop merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. WG-trait-system-refactor The Rustc Trait System Refactor Initiative (-Znext-solver)