Skip to content

Move scalar_to_backend to ssa #142960

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Jun 24, 2025

r? @GuillaumeGomez @antoyo

I think that should make it much more maintainable going forward

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 24, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jun 24, 2025

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@GuillaumeGomez
Copy link
Member

Thanks a lot, it will help making our life easier in the GCC backend!

Just waiting for @antoyo to confirm it's good for him as well then we can r+.

Comment on lines 70 to 75
Scalar::Int(int) => {
let data = int.to_bits(layout.size(self));
let llval = self.const_uint_big(self.type_ix(bitsize), data);
if matches!(layout.primitive(), Primitive::Pointer(_)) {
self.const_int_to_ptr(llval, llty)
} else {
self.const_bitcast(llval, llty)
}
}
Copy link
Contributor

@antoyo antoyo Jun 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very likely to break cg_gcc since the bitcast doesn't work in const context (see comment in the original implementation).
I can check if I could make it work in const context in libgccjit if needed.

Nice refactor, btw!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we cheat and do the bitsize > 1 && ty.is_integral() && bytesize as u32 == ty.get_size() check within the cg_gcc const_bitcast impl?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to thing about whether this could break other things: if that ever gets called with a value of a struct type, for instance, I assume this would break.
Would you have access to bytesize?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can very readily provide it, we're in control of this method and llvm will just ignore it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try locally since I'm concerned that this will break other things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can very readily provide it, we're in control of this method and llvm will just ignore it

So, your plan would be to change all other calls and possibly the API of the functions calling const_bitcast to add the missing parameters if needed?
Or will you change the other calls so that they call another function than const_bitcast?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no rush, we can let this PR sit for a while.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, your plan would be to change all other calls and possibly the API of the functions calling const_bitcast to add the missing parameters if needed?
Or will you change the other calls so that they call another function than const_bitcast?

ah yea we should probably just add a new function for this specific case and have llvm forward to const_bitcast and gcc do the checks and fallback logic

Copy link
Contributor

@antoyo antoyo Jun 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, good.
We're in the process of adding more tests for cg_gcc in the Rust CI (will try to open an MCP for this soon), so this should make this kind of refactor much easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean by this comment you'd prefer to block this PR on that landing?

/// Create a global constant.
///
/// The returned global variable is a pointer in the default address space for globals.
fn static_addr_of_impl(&self, cv: Self::Value, align: Align, kind: Option<&str>)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call this static_addr_of_const?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it looks like the kind argument is always None, so you can remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately vtables need this at present to achieve

// Make sure that vtables don't have the unnamed_addr attribute when debuginfo is enabled.

and since vtables are const, and that logic calls the mut logic before marking the alloc as const, we need to keep it around everywhere. There's likely things to improve here, but that should probably be its own PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah fun, we only do that in some vtable paths, but not in the one in scalar_to_backend

-> Self::Value;

/// Same as `static_addr_of_impl`, but does not mark the static as immutable
fn static_addr_of_mut(&self, cv: Self::Value, align: Align, kind: Option<&str>) -> Self::Value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. kind is always None.

@bors
Copy link
Collaborator

bors commented Jun 29, 2025

☔ The latest upstream changes (presumably #143161) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk oli-obk force-pushed the scalar-to-backend-dedup branch from 35ccb26 to 84c8bde Compare June 30, 2025 09:07
@bors
Copy link
Collaborator

bors commented Jul 1, 2025

☔ The latest upstream changes (presumably #143254) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk oli-obk force-pushed the scalar-to-backend-dedup branch from 84c8bde to c285206 Compare July 2, 2025 13:51
@rust-log-analyzer

This comment has been minimized.

@oli-obk oli-obk force-pushed the scalar-to-backend-dedup branch from c285206 to ed01d01 Compare July 2, 2025 14:21
@bors
Copy link
Collaborator

bors commented Jul 7, 2025

☔ The latest upstream changes (presumably #143182) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Collaborator

bors commented Jul 10, 2025

☔ The latest upstream changes (presumably #143696) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk oli-obk force-pushed the scalar-to-backend-dedup branch from 74f9659 to e02729a Compare July 10, 2025 14:37
Copy link
Member

@eddyb eddyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just happy to see the separation between static_addr_of and const_data_from_alloc disappear (after noticing myself that pattern and confirming that all calls to either contains the other), the comments are small things I noticed in passing (of varying subjectivity).

(Oh, and since I can't leave a comment on a line outside the diff: const_struct is unused in rustc_codegen_ssa, and AFAICT it's also only ever used to build the initializers for globals, not any SSA values)

Comment on lines +56 to +62
/// Create a global constant.
///
/// The returned global variable is a pointer in the default address space for globals.
fn static_addr_of_const(&self, alloc: ConstAllocation<'_>, kind: Option<&str>) -> Self::Value;

fn scalar_to_backend(&self, cv: Scalar, layout: abi::Scalar, llty: Self::Type) -> Self::Value;
/// Same as `static_addr_of_const`, but does not mark the static as immutable
fn static_addr_of_mut(&self, alloc: ConstAllocation<'_>, kind: Option<&str>) -> Self::Value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would these be different from static_addr_of? This feels like an accidental historical divergence within rustc_codegen_llvm, from what I could tell.

EDIT: looks like there used to be static_addr_of and static_addr_of_mut before e8da3c6

I think a single static_addr_of method (not here but in StaticCodegenMethods, where it is right now) that either takes Mutability, or even better inspects alloc.inner().mutability itself, would be an improvement over having 3 slightly different static_addr_of* methods.

Comment on lines +65 to +69
let bitsize = if layout.is_bool() { 1 } else { layout.size(self).bits() };
match cv {
Scalar::Int(int) => {
let data = int.to_bits(layout.size(self));
let val = self.const_uint_big(self.type_ix(bitsize), data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like spelling cx.type_bool() using cx.type_ix(1) is a LLVM-ism, and generally the only integer types a codegen backend is required to support, are the ones matching abi::Integer.

But I don't have a lot of good suggestions, other than e.g. spelling type_ix as type_int (and making it take Size, at least, so it's harder to misuse), which wouldn't be trivial because type_int is already taken (for c_int AFAICT?).

/// Same as `static_addr_of_const`, but does not mark the static as immutable
fn static_addr_of_mut(&self, alloc: ConstAllocation<'_>, kind: Option<&str>) -> Self::Value;

fn scalar_to_backend(&self, cv: Scalar, layout: abi::Scalar, ty: Self::Type) -> Self::Value {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Been looking at this method a lot and the name keeps feeling out of place, what do you think about naming it const_scalar?

fn static_addr_of(&self, cv: Self::Value, align: Align, kind: Option<&str>) -> Self::Value;
fn static_addr_of(&self, alloc: ConstAllocation<'_>, kind: Option<&str>) -> Self::Value;
fn get_value_name(&self, val: Self::Value) -> &[u8];
fn set_value_name(&self, val: Self::Value, name: &[u8]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these here? The non-constant one is treated as a debuginfo helper:

fn set_var_name(&mut self, value: Self::Value, name: &str);

fn codegen_static(&mut self, def_id: DefId);
fn get_static(&self, def_id: DefId) -> Self::Value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove the other copy of this method from StaticBuilderMethods (and that entire trait, since it would be empty).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I tried and there were some issues, but I'll try again

Comment on lines +52 to +55
fn const_bitcast(&self, val: Self::Value, ty: Self::Type) -> Self::Value;
fn const_pointercast(&self, val: Self::Value, ty: Self::Type) -> Self::Value;
fn const_int_to_ptr(&self, val: Self::Value, ty: Self::Type) -> Self::Value;
fn const_ptr_to_int(&self, val: Self::Value, ty: Self::Type) -> Self::Value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any consensus on whether these should be separate operations, or could all be subsumed by const_bitcast? (I'm wondering along similar lines to ptr as usize vs ptr->int transmute)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if they are all subsumed then some backends need to check their type, even if the static logic knows what it actually wants

let base_addr_space = global_alloc.address_space(self);

// Cast to the required address space if necessary
let val = self.const_pointercast(base_addr, self.type_ptr_ext(base_addr_space));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't base_addr_space already be the AS of base_addr?

I guess not, e.g. if the pointer starts off in some "AS of globals" and then needs to be cast to the default AS, which is what base_addr_space would refer to.

Wish I had any suggestions for what to name this to better express that it's an "intermediate" between base_addr (binning its AS into one of the two that Rust itself supports, at most - i.e. data-vs-code) and layout/ty (which can, in theory, be using a different AS, or not even be a pointer at all, etc.).

@@ -238,8 +238,7 @@ impl<'a, 'tcx, V: CodegenObject> OperandRef<'tcx, V> {
// Neither a scalar nor scalar pair. Load from a place
// FIXME: should we cache `const_data_from_alloc` to avoid repeating this for the
// same `ConstAllocation`?
let init = bx.const_data_from_alloc(alloc);
let base_addr = bx.static_addr_of(init, alloc_align, None);
let base_addr = bx.static_addr_of(alloc, None);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in from_const_alloc, which could be adjusted to call scalar_to_backend with a Scalar::Ptr.

(At the very least, the comment mentioning const_data_from_alloc should be updated)

Comment on lines 116 to +117
let vtable_allocation = tcx.global_alloc(vtable_alloc_id).unwrap_memory();
let vtable_const = cx.const_data_from_alloc(vtable_allocation);
let align = cx.data_layout().pointer_align().abi;
let vtable = cx.static_addr_of(vtable_const, align, Some("vtable"));
let vtable = cx.static_addr_of(vtable_allocation, Some("vtable"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a historical accident that this is separate from scalar_to_backend's own ability to handle vtables (and only one of them is cached?) - AFAICT resulting in different codegen for:

  • let x = (); &x as &dyn Trait
  • const { &() as &dyn Trait }

Comment on lines +157 to +158
// TODO(antoyo)
&[]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally added the missing feature in rust-lang/gcc#77 and rust-lang/gccjit.rs#54. So can be done in this PR or I can send a follow-up once merged to add:

Suggested change
// TODO(antoyo)
&[]
val.get_name().unwrap().as_slice()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would require updating the GCC submodule, so perhaps it is better in a follow-up PR.

&[]
}
fn set_value_name(&self, _val: Self::Value, _name: &[u8]) {
// TODO(antoyo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I need to do this one. ^^'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants