Skip to content

Conversation

@Flakebi
Copy link
Contributor

@Flakebi Flakebi commented Dec 8, 2025

Add support for inline assembly for the amdgpu backend (the amdgcn-amd-amdhsa target).
Add register classes for vgpr (vector general purpose register) and sgpr (scalar general purpose register).
The LLVM backend supports two more classes, reg, which is either VGPR or SGPR, up to the compiler to decide. As instructions often rely on a register being either a VGPR or SGPR for the assembly to be valid, reg doesn’t seem that useful (I struggled to write correct tests for it), so I didn’t end up adding it.
The fourth register class is AGPRs, which only exist on some hardware versions (not the consumer ones) and they have restricted ways to write and read from them, which makes it hard to write a Rust variable into them. They could be used inside assembly blocks, but I didn’t add them as Rust register class.

There is one change affecting general inline assembly code, that is InlineAsmReg::name() now returns a Cow instead of a &'static str. Because amdgpu has many registers, 256 VGPRs plus combinations of 2 or 4 VGPRs, and I didn’t want to list hundreds of static strings, the amdgpu reg stores the register number(s) and a non-static String is generated at runtime for the register name.

Tracking issue: #135024

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 8, 2025
@rustbot
Copy link
Collaborator

rustbot commented Dec 8, 2025

r? @eholk

rustbot has assigned @eholk.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@Flakebi Flakebi mentioned this pull request Dec 8, 2025
26 tasks
@rust-log-analyzer

This comment has been minimized.

@rustbot
Copy link
Collaborator

rustbot commented Dec 9, 2025

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@eholk
Copy link
Contributor

eholk commented Dec 9, 2025

This seems okay to me, but I'd rather someone more familiar with this part of the compiler give the final signoff.

@bors r?

@rustbot
Copy link
Collaborator

rustbot commented Dec 9, 2025

Error: Parsing assign command in comment failed: ...'' | error: specify user to assign to at >| ''...

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #triagebot on Zulip.

@eholk
Copy link
Contributor

eholk commented Dec 9, 2025

@bors r? compiler

@rustbot rustbot assigned fee1-dead and unassigned eholk Dec 9, 2025
@fee1-dead
Copy link
Member

@rustbot reroll

@rustbot rustbot assigned chenyukang and unassigned fee1-dead Dec 10, 2025
@Flakebi Flakebi force-pushed the inline-asm branch 2 times, most recently from bdb726b to 9db5dca Compare December 14, 2025 15:07
@Flakebi
Copy link
Contributor Author

Flakebi commented Dec 14, 2025

Removed return type from tests to fix conflict with #149991, which starts disallowing returns in gpu-kernel functions.

@chenyukang
Copy link
Member

The change seems Ok, i'd like people with more background to take a look.
@rustbot reroll

@rustbot rustbot assigned jdonszelmann and unassigned chenyukang Dec 19, 2025
@jdonszelmann
Copy link
Contributor

That's not me (sorry it took me a while because of holidays). But iirc that could be amanieu? r? @Amanieu

@rustbot rustbot assigned Amanieu and unassigned jdonszelmann Jan 6, 2026
@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 9, 2026

☔ The latest upstream changes (presumably #150866) made this pull request unmergeable. Please resolve the merge conflicts.

@Flakebi
Copy link
Contributor Author

Flakebi commented Jan 9, 2026

@taiki-e, I saw your mention in the asm tracking issue. amdgpu has named registers (>384 of them), so a clobber_abi would be possible. However, there’s currently no stable ABI, so we would need to follow what the LLVM backend does and I think there’s no tools to make sure the Rust and LLVM ABI understanding is in sync, so that would be asking for subtle breakages. Therefore I think it doesn’t make sense to support clobber_abi with amdgpu at the moment.

@rustbot
Copy link
Collaborator

rustbot commented Jan 14, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@Flakebi
Copy link
Contributor Author

Flakebi commented Jan 14, 2026

Rebased to fix conflict, no other changes

Edit: kdiff3 didn’t like 🦀 when fixing conflicts 😢, fixed now

@rust-log-analyzer

This comment has been minimized.

Add support for inline assembly for the amdgpu backend (the
amdgcn-amd-amdhsa target).
Add register classes for `vgpr` (vector general purpose register) and
`sgpr` (scalar general purpose register).
The LLVM backend supports two more classes, `reg`, which is either VGPR
or SGPR, up to the compiler to decide. As instructions often rely on a
register being either a VGPR or SGPR for the assembly to be valid, reg
doesn’t seem that useful (I struggled to write correct tests for it), so
I didn’t end up adding it.
The fourth register class is AGPRs, which only exist on some hardware
versions (not the consumer ones) and they have restricted ways to write
and read from them, which makes it hard to write a Rust variable into
them. They could be used inside assembly blocks, but I didn’t add them
as Rust register class.

There is one change affecting general inline assembly code, that is
`InlineAsmReg::name()` now returns a `Cow` instead of a `&'static str`.
Because amdgpu has many registers, 256 VGPRs plus combinations of 2 or 4
VGPRs, and I didn’t want to list hundreds of static strings, the amdgpu
reg stores the register number(s) and a non-static String is generated
at runtime for the register name.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants