gpu offload host code generation #142097

Open
wants to merge 5 commits into base: master

ZuseZ4
Member

@ZuseZ4 ZuseZ4 commented Jun 5, 2025

r? ghost

This generates most of the host-side code needed to use LLVM's offload feature.
This first PR only handles automatic memory transfers to and from the device.
So if a user calls a kernel, we copy the inputs to the device and back, but we don't launch the actual kernel yet.
Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what OpenMP offload generates in C++. LIBOMPTARGET_INFO=-1 ./my_rust_binary should print that a memcpy to, and later from, the device is happening.
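To make the intended data flow concrete, here is a minimal host-only sketch of "copy in, no launch, copy out". It is only a model: `MockDevice`, `copy_to_device`, `copy_from_device`, and `call_kernel_without_launch` are hypothetical names invented for illustration and are not part of rustc or the LLVM offload runtime.

```rust
// Purely illustrative host-only model. `MockDevice` and its methods are
// hypothetical; they do not exist in rustc or in the LLVM offload runtime.
#[derive(Default)]
struct MockDevice {
    memory: Vec<u8>,
    log: Vec<String>,
}

impl MockDevice {
    /// Pretend to copy a host buffer into device memory.
    fn copy_to_device(&mut self, host: &[u8]) {
        self.memory = host.to_vec();
        self.log.push(format!("host -> device: {} bytes", host.len()));
    }

    /// Pretend to copy device memory back into the host buffer.
    fn copy_from_device(&mut self, host: &mut [u8]) {
        host.copy_from_slice(&self.memory);
        self.log.push(format!("device -> host: {} bytes", host.len()));
    }
}

/// Models the behavior described for the first PR:
/// transfer to the device, no kernel launch, transfer back.
fn call_kernel_without_launch(dev: &mut MockDevice, data: &mut [u8]) {
    dev.copy_to_device(data);
    // A kernel launch would go here; the first PR deliberately omits it.
    dev.copy_from_device(data);
}

fn main() {
    let mut dev = MockDevice::default();
    let mut data = [1u8, 2, 3, 4];
    call_kernel_without_launch(&mut dev, &mut data);
    for line in &dev.log {
        println!("{line}");
    }
}
```

Since no kernel runs, the data comes back unchanged; the two log entries play the role of the memcpy messages that LIBOMPTARGET_INFO is expected to print.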

A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU.
A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to override our default handling for performance reasons.

I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will move this over to proper macros, as I already do for the autodiff work.
This work will also be compatible with std::autodiff, so one can differentiate GPU kernels.
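As a rough illustration of name-based recognition, the sketch below matches functions by prefix. The prefix `kernel_` and the helper `is_offload_candidate` are assumptions made up for this example; the description does not say which names the PR actually matches.

```rust
// Hypothetical prefix, chosen for illustration only. The PR description
// does not state which names are treated as "magic".
const MAGIC_PREFIX: &str = "kernel_";

/// Returns true if a function name would be treated as a GPU function
/// under this (assumed) naming convention.
fn is_offload_candidate(fn_name: &str) -> bool {
    fn_name.starts_with(MAGIC_PREFIX)
}

fn main() {
    for name in ["kernel_add", "main", "helper"] {
        println!("{name}: {}", is_offload_candidate(name));
    }
}
```

A macro-based frontend would replace this kind of string matching with an explicit annotation on the function.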

Tracking:

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 5, 2025
@ZuseZ4 ZuseZ4 added F-gpu_offload `#![feature(gpu_offload)]` and removed A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 5, 2025
@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 5, 2025
@rustbot rustbot added the F-autodiff `#![feature(autodiff)]` label Jun 9, 2025
@ZuseZ4
Member Author

ZuseZ4 commented Jun 10, 2025

@oli-obk Feature-wise, I am almost done. I'll add a few more lines to describe the layout of Rust types to the offload library, but in this PR I only intend to support one or two types (maybe arrays, raw pointers, or slices). I might even hardcode the length in the very first approach. In a follow-up PR I'll do some proper type parsing on a higher level, similar to what I did in the past with Rust TypeTrees. This work is much simpler and more reliable though, since offload doesn't care what type something has, just how many bytes it occupies and hence how many bytes need to be moved to/from the GPU.
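A small sketch of the "only the byte count matters" idea, using the standard library's `std::mem::size_of_val`. The helper name `bytes_to_transfer` is hypothetical and is not how the PR computes sizes inside the compiler; it only shows that arrays and slices carry enough information to derive a transfer size.

```rust
use std::mem::size_of_val;

/// Number of bytes a value occupies in host memory.
/// For a slice this is `len * size_of::<T>()`.
/// `bytes_to_transfer` is a hypothetical helper for illustration.
fn bytes_to_transfer<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}

fn main() {
    let array = [0.0f64; 256];
    let slice: &[f32] = &[1.0, 2.0, 3.0];
    println!("array: {} bytes", bytes_to_transfer(&array));
    println!("slice: {} bytes", bytes_to_transfer(slice));
}
```

A raw pointer carries no length, so its pointee size cannot be derived this way, which is presumably why a hardcoded length is considered for the first approach.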

I was able to just move a few of the builder methods I needed to the generic builder.
However, there are also around 7 that I had to duplicate. I guess at some point I'll need to do the proper work of enabling the trait implementations for both builders :/
Once I have everything working, I'll clean it up and add some tests and docs.

@ZuseZ4 ZuseZ4 mentioned this pull request Mar 4, 2025
5 tasks
@ZuseZ4
Member Author

ZuseZ4 commented Jun 12, 2025

Not fully ready yet; I apparently missed yet another global needed to initialize the offload runtime. But at least it compiles successfully to a binary if I emit the IR from Rust and then use clang for the rest. I'll add the global today; then I should be done and will clean it up.

@rustbot rustbot added the T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) label Jun 17, 2025
@bors
Collaborator

bors commented Jun 18, 2025

☔ The latest upstream changes (presumably #142644) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 18, 2025
@ZuseZ4 ZuseZ4 force-pushed the offload-host1 branch 2 times, most recently from 100f9f3 to 0fb93f0 Compare June 18, 2025 21:55
@ZuseZ4
Member Author

ZuseZ4 commented Jun 18, 2025

Yay, turns out the only issue in my test binary was a bug in LLVM, which was already fixed upstream in llvm/llvm-project#143638.
Once rustc syncs the LLVM submodule again (in a week or so), we should get the fix. This does not affect the LLVM IR we generate with rustc; it only affects how clang compiles the LLVM IR from rustc to a binary. Therefore we don't have to wait for it. I'll add an LLVM IR test to make sure we generate the right things and clean it up a bit more.

@rust-log-analyzer
Collaborator

The job mingw-check-tidy failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
fmt: checked 6074 files
tidy check
Running eslint on rustdoc JS files
tidy: Skipping binary file check, read-only filesystem
##[error]tidy error: /checkout/compiler/rustc_codegen_llvm/src/back/lto.rs:672: `dbg!` macro is intended as a debugging tool. It should not be in version control.
##[error]tidy error: /checkout/compiler/rustc_codegen_llvm/src/builder/gpu_offload.rs:121: TODO is used for tasks that should be done before merging a PR; If you want to leave a message in the codebase use FIXME
##[error]tidy error: /checkout/compiler/rustc_codegen_llvm/src/builder/gpu_offload.rs:157: `dbg!` macro is intended as a debugging tool. It should not be in version control.
##[error]tidy error: /checkout/compiler/rustc_codegen_llvm/src/builder/gpu_offload.rs:167: `dbg!` macro is intended as a debugging tool. It should not be in version control.
##[error]tidy error: /checkout/compiler/rustc_codegen_llvm/src/builder/gpu_offload.rs:250: `dbg!` macro is intended as a debugging tool. It should not be in version control.
##[error]tidy error: /checkout/compiler/rustc_codegen_llvm/src/builder/gpu_offload.rs:457: `dbg!` macro is intended as a debugging tool. It should not be in version control.
##[error]tidy error: /checkout/compiler/rustc_codegen_llvm/src/builder/gpu_offload.rs:467: `dbg!` macro is intended as a debugging tool. It should not be in version control.
removing old virtual environment
creating virtual environment at '/checkout/obj/build/venv' using 'python3.10' and 'venv'
creating virtual environment at '/checkout/obj/build/venv' using 'python3.10' and 'virtualenv'
Requirement already satisfied: pip in ./build/venv/lib/python3.10/site-packages (25.1.1)
linting python files
All checks passed!
checking python file formatting
28 files already formatted
checking C++ file formatting
some tidy checks failed
Command has failed. Rerun with -v to see more details.
Build completed unsuccessfully in 0:01:25
  local time: Wed Jun 18 23:54:41 UTC 2025
  network time: Wed, 18 Jun 2025 23:54:41 GMT
##[error]Process completed with exit code 1.
Post job cleanup.
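
For readers unfamiliar with these failures, the following is a simplified stand-in for the two checks reported in the log (leftover `dbg!` calls and TODO markers). It is not rustc's tidy implementation; `find_violations` and its message wording are assumptions for illustration.

```rust
/// Simplified stand-in for the two checks reported in the log.
/// This is NOT rustc's tidy implementation; the function name and the
/// message wording are hypothetical.
fn find_violations(path: &str, source: &str) -> Vec<String> {
    let mut errors = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        let line_no = idx + 1; // report 1-based line numbers
        if line.contains("dbg!") {
            errors.push(format!("{path}:{line_no}: `dbg!` should not be in version control"));
        }
        if line.contains("TODO") {
            errors.push(format!("{path}:{line_no}: use FIXME instead of TODO"));
        }
    }
    errors
}

fn main() {
    let source = "fn f(x: u32) -> u32 {\n    // TODO: handle overflow\n    dbg!(x);\n    x + 1\n}\n";
    for error in find_violations("example.rs", source) {
        println!("{error}");
    }
}
```

The fix on the PR side is to delete the `dbg!` calls and to turn any TODO that should survive the merge into a FIXME.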

@ZuseZ4 ZuseZ4 marked this pull request as ready for review June 19, 2025 00:38
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 19, 2025
@rustbot
Collaborator

rustbot commented Jun 19, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Member Author

@ZuseZ4 ZuseZ4 left a comment

I did a first round of review myself; I'll address the comments tomorrow.
I'll also clean up the code in gpu_builder more; it has a lot of duplication and IR comments from when I was trying to figure out what to generate.

@@ -117,6 +118,70 @@ impl<'a, 'll, CX: Borrow<SCx<'ll>>> GenericBuilder<'a, 'll, CX> {
}
bx
}

pub(crate) fn my_alloca2(&mut self, ty: &'ll Type, align: Align, name: &str) -> &'ll Value {
Member Author

I'll find a better name for it.

@@ -667,6 +668,13 @@ pub(crate) fn run_pass_manager(
write::llvm_optimize(cgcx, dcx, module, None, config, opt_level, opt_stage, stage)?;
}

if cfg!(llvm_enzyme) && enable_gpu && !thin {
dbg!(&enable_gpu);
Member Author

I'll remove the dbg statements.

@@ -215,7 +215,9 @@ impl<'ll, 'tcx> CodegenCx<'ll, 'tcx> {

llfn
}
}
Member Author

Unfortunately I forgot this one in the first PR, I can make another one?

@@ -1004,13 +1004,22 @@ unsafe extern "C" {
SLen: c_uint,
) -> MetadataKindId;

// Create modules.
// Create, print, and destroy modules.
pub(crate) fn LLVMPrintModuleToFile(
Member Author

Can go into the follow-up PR (I guess I am not getting all unused function warnings for some reason).

const llvm::Function *calledFunc = callInst->getCalledFunction();
if (calledFunc && calledFunc->getName() == targetName) {
// Found a call to the target function
llvm::errs() << "Found call: " << *callInst << "\n";
Member Author

Fixme

Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. F-autodiff `#![feature(autodiff)]` F-gpu_offload `#![feature(gpu_offload)]` S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
5 participants