Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always inline functions signatures containing f16 or f128 #133050

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

tgross35
Copy link
Contributor

@tgross35 tgross35 commented Nov 14, 2024

There are a handful of tier 2 and tier 3 targets that cause a LLVM crash or linker error when generating code that contains f16 or f128. The cranelift backend also does not support these types. To work around this, every function in std or core that contains these types must be marked #[inline] in order to avoid sending any code to the backend unless specifically requested.

However, this is inconvenient and easy to forget. Introduce a check for these types in the frontend that automatically inlines any function signatures that take or return f16 or f128.

Note that this is not a perfect fix because it does not account for the types being passed by reference or as members of aggregate types, but this is sufficient for what is currently needed in the standard library.

Fixes: #133035
Closes: #133037

@rustbot
Copy link
Collaborator

rustbot commented Nov 14, 2024

r? @compiler-errors

rustbot has assigned @compiler-errors.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 14, 2024
@rustbot
Copy link
Collaborator

rustbot commented Nov 14, 2024

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@tgross35
Copy link
Contributor Author

I'm not sure what I am missing here - testing by setting #[inline(always)] the codegen tests seem to emit code (maybe a compiletest option). Aside from that, I don't think generics or traits are handled at this point.

r? @saethlin, I could use a bit of help here

@rustbot rustbot assigned saethlin and unassigned compiler-errors Nov 14, 2024
@tgross35
Copy link
Contributor Author

Oh right, no_mangle forces codegen

@saethlin
Copy link
Member

You're getting code emitted because the functions are #[no_mangle], which is the first thing that cross_crate_inlinable checks for. contains_extern_indicator is perhaps poorly named.

You might also run into

let generate_cgu_internal_copies = tcx
.sess
.opts
.unstable_opts
.inline_in_all_cgus
.unwrap_or_else(|| tcx.sess.opts.optimize != OptLevel::No)
&& !tcx.sess.link_dead_code();
after fixing that.

Comment on lines +53 to +57
let sig = tcx.fn_sig(def_id).instantiate_identity();
for ty in sig.inputs().skip_binder().iter().chain(std::iter::once(&sig.output().skip_binder()))
{
// FIXME(f16_f128): in order to avoid crashes building `core`, always inline to skip
// codegen if the function is not used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you move this down after let mir, you can just iterate over mir.args_iter() instead of having to call fn_sig

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would require moving let mir above the checks against opts.incremental, OptLevel::No, and threashold to avoid hitting a return false before we check for the types. Would this wind up with a perf impact since we don't generate mir here in these cases?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I thought having this check so far down would be fine. But maybe not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a (temporary, right?) hack, I'd rather the diff be small and localized than pretty. What you've written here is good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(temporary, right?)

Yes please, I don't think we will be able to stabilize these types until LLVM at least doesn't crash on everything T2+.


pub fn f16_arg(_a: f16) {
// CHECK-NOT: f16_arg
todo!()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the automatic inlining heuristics wouldn't anyway already mark these functions as inlineable? Might be worth setting cross_crate_inline_threshold to 0 to be sure that doesn't mask the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a revision with -Zcross-crate-inline-threshold=never -Zmir-opt-level=0 -Cno-prepopulate-passes, is that reasonable?

Copy link
Member

@saethlin saethlin Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also add -Copt-level=0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah of course. Done, tests still pass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-Cno-prepopulate-passes just disables LLVM optimizations, rustc changes its behavior in a few places based on the value of opt-level.

@rust-log-analyzer

This comment has been minimized.

@tgross35
Copy link
Contributor Author

Is there a good way to walk the signature and check for the types recursively at this point? I'm realizing that &f16/&f128 aren't matched, it seems better to not special case them.

You might also run into

let generate_cgu_internal_copies = tcx
.sess
.opts
.unstable_opts
.inline_in_all_cgus
.unwrap_or_else(|| tcx.sess.opts.optimize != OptLevel::No)
&& !tcx.sess.link_dead_code();

after fixing that.

Should this function be updated to emit a LocalCopy if generate_cgu_internal_copies || /* contains f16/f128 */? I think this relates to Ralf's comment.

@RalfJung
Copy link
Member

I'm realizing that &f16/&f128 aren't matched, it seems better to not special case them.

Please change the const_assert to not use references, that was just a work-around.

@tgross35
Copy link
Contributor Author

tgross35 commented Nov 14, 2024

I need to unjunkify my commit history (edit: done) but as-is, this resolves the problem at #133035.

@tgross35 tgross35 marked this pull request as ready for review November 14, 2024 21:57
@tgross35 tgross35 changed the title [wip] Always inline f16 and f128 Always inline f16 and f128 Nov 14, 2024
These types are currently passed by reference, which does not avoid the
backend crashes. Change these back to being passed by value, which makes
the types easier to detect for automatic inlining.
@tgross35 tgross35 changed the title Always inline f16 and f128 Always inline functions signatures containing f16 or f128 Nov 14, 2024
There are a handful of tier 2 and tier 3 targets that cause a LLVM crash
or linker error when generating code that contains `f16` or `f128`. The
cranelift backend also does not support these types. To work around
this, every function in `std` or `core` that contains these types must
be marked `#[inline]` in order to avoid sending any code to the backend
unless specifically requested.

However, this is inconvenient and easy to forget. Introduce a check for
these types in the frontend that automatically inlines any function
signatures that take or return `f16` or `f128`.

Note that this is not a perfect fix because it does not account for the
types being passed by reference or as members of aggregate types, but
this is sufficient for what is currently needed in the standard library.

Fixes: rust-lang#133035
Closes: rust-lang#133037
@tgross35 tgross35 added the F-f16_and_f128 `#![feature(f16)]`, `#![feature(f128)]` label Nov 14, 2024
@saethlin
Copy link
Member

Should this function be updated to emit a LocalCopy if generate_cgu_internal_copies || /* contains f16/f128 */? I think this relates to Ralf's comment.

After thinking about this I think I've convinced myself that no such change is required, because that code is only run after MirUsedCollector has determined that the Instance must be codegenned. The choice of LocalCopy or GloballyShared there only changes how we're going to codegen the instance, not whether we are going to.

@saethlin
Copy link
Member

I've manually verified that

cargo +stage1 build -Zbuild-std=core --target=powerpc64-ibm-aix
cargo +stage1 build -Zbuild-std=core --target=arm64ec-pc-windows-msvc

Work on this PR, and not on nightly.

@bors r+

@bors
Copy link
Contributor

bors commented Nov 14, 2024

📌 Commit 5d81891 has been approved by saethlin

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 14, 2024
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Nov 15, 2024
Always inline functions signatures containing `f16` or `f128`

There are a handful of tier 2 and tier 3 targets that cause a LLVM crash or linker error when generating code that contains `f16` or `f128`. The cranelift backend also does not support these types. To work around this, every function in `std` or `core` that contains these types must be marked `#[inline]` in order to avoid sending any code to the backend unless specifically requested.

However, this is inconvenient and easy to forget. Introduce a check for these types in the frontend that automatically inlines any function signatures that take or return `f16` or `f128`.

Note that this is not a perfect fix because it does not account for the types being passed by reference or as members of aggregate types, but this is sufficient for what is currently needed in the standard library.

Fixes: rust-lang#133035
Closes: rust-lang#133037
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
F-f16_and_f128 `#![feature(f16)]`, `#![feature(f128)]` S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LLVM ERROR about f128 on AIX and SPARC32 when building core without optimization
7 participants