[CIR][HIP|CUDA] Mirrors CUDARuntime skeleton of OG #1380

koparasy · 2025-02-21T05:33:36Z

As discussed in #1341 CIRGen does not match OG CG. This PR mirrors the skeleton of OG.

This PR adds LLVMIR lowering support for `cir.assume`, `cir.assume.aligned`, and `cir.assume.separate_storage`.

In OG CodeGen, string literals has `private` linkage as default (marked by `cir_private` in CIR assembly). But CIR uses `internal`, which is probably an ancient typo. This PR keeps align with it and thus modifies the existing test files.

…bits (llvm#1068) This PR adds a partial support for so-called indirect function arguments for struct types with size > 128 bits for aarch64. #### Couple words about the implementation The hard part is that it's not one-to-one copy from the original codegen, but the code is inspired by it of course. In the original codegen there is no much job is done for the indirect arguments inside the loop in the `EmitFunctionProlog`, and additional alloca is added in the end, in the call for `EmitParamDecl` function. In our case, for the indirect argument (which is a pointer) we replace the original alloca with a new one, and store the pointer in there. And replace all the uses of the old alloca with the load from the new one, i.e. in both cases users works with the pointer to a structure. Also, I added several missed features in the `constructAttributeList` for indirect arguments, but didn't preserve the original code structure, so let me know if I need to do it.

Note that we lack two pieces of support for aliases in LLVM IR dialect globals: the `alias` keyword and function types `void (ptr)`, this needs to be done before we can nail this for good, but it's outside the scope of this commit. The behavior is slightly different under -O1, which will be addressed next.

…m#1073) This PR changes the naming format of string literals from `.str1` to `.str.1`, making it easier to reuse the existing testcases of OG CodeGen.

It's currently polluting the `cir` namespace with very generic symbols like `Integer` and `Memory`, which is pretty confusing. `X86_64ABIInfo` already has `Class` alias for `X86ArgClass`, so we can use that alias to qualify all uses.

Note that there are still missing pieces, which will be incrementally addressed.

… test

Just verified this is actually done by some LLVM optimization, not by the frontend emitting directly, so this is a non-goal now, since CIR can also use LLVM opts to do the same once we have real global alias.

Also verified this does not apply anymore, we match -O0. The only remaing part is to lower to proper LLVM globals once LLVM IR dialect gets the global alias support.

Reverts llvm#1058

test case is from [Traditional Clang CodeGen test file](https://github.com/llvm/clangir/blob/723e78afb5ae4fbd000269a057410913ade3ef44/clang/test/CodeGen/2004-02-13-BuiltinFrameReturnAddress.c#L5)

This commit extends the support for floating point attributes parsing by using the new `AsmParser::parseFloat(fltSemnatics, APFloat&)` interface. As a drive-by, this commit also harmonizes the cir.fp print/parse namespace usage, and adds the constraint of supporting only "CIRFPType"s for cir.fp in tablegen instead of verifying it manually in the parsing logic. --- This commit is based on top of a to-be-upstreamed commit which extends the upstream MLIR float type parsing. Upstream parsing of float type has full capability only through parsing the Builtin Dialect's `FloatAttr`. Thos commit exposes the same capabilities to downstream users. --- This PR should resolve (at least) `GCC-C-execute-ieee-fp-cmp-2` and `GCC-C-execute-ieee-fp-cmp-4`, paving the way to other `GCC-C-execute-ieee-*` tests passing from the SingleSource suite. It resolves llvm#559 .

The test case is from [clang/test/CodeGen/neon-crypto.c](https://github.com/llvm/clangir/blob/dbf320e5c3db0410566ae561067c595308870bad/clang/test/CodeGen/neon-crypto.c#L28) Need a dedicated test file as RunOptions has crypto related target features.

llvm#1025 explains why we want to move the CIR dialect from the `mlir::cir` to the `cir` namespace. To avoid overloading the `cir` namespace too much afterwards, move all symbols whose equivalents live inside the `clang::CodeGen` namespace to a new `clang::CIRGen` namespace, so that we match the original CodeGen's structure more closely. There's some symbols that live under `clang/include/clang/CIR` whose equivalents live in `clang/lib/CodeGen` and are in the `clang::CodeGen` namespace. We have these symbols in a common location since they're also used by lowering, so I've also left them in the `cir` namespace. Those symbols are: - AArch64ABIKind - ABIArgInfo - FnInfoOpts - TypeEvaluationKind - X86AVXABILevel This is a pretty large PR out of necessity. To make it slightly more reviewable, I've split it out into three commits (which will be squashed together when the PR lands): - The first commit manually switches places to the `clang::CIRGen` namespace. This has to be manual because we only want to move things selectively. - The second commit adjusts namespace prefixes to make builds work. I ran https://gist.github.com/smeenai/f4dd441fb61c53e835c4e6057f8c322f to make this change. The script is idempotent, and I added substitutions one at a time and reviewed each one afterwards (using `git diff --color-words=.`) to ensure only intended changes were being made. - The third commit runs `git clang-format`. Because I went one-by-one with all my substitutions and checked each one afterwards, I'm pretty confident in the validity of all the changes (despite the size of the PR).

As the title says, this PR adds support for unions for AArch64 lowering. The idea is basically the same as the [original](https://github.com/llvm/clangir/blob/dbf320e5c3db0410566ae561067c595308870bad/clang/lib/AST/RecordLayoutBuilder.cpp#L2111) codegen, and I added a couple of tests.

llvm#1025 explains why we want to move the CIR dialect from the `mlir::cir` to the `cir` namespace. This is a large PR, and I've split it out into four commits (that'll be squashed when landing). The first commit changes `mlir::cir` to `cir` everywhere. This was originally done mechanically with: ``` find clang $ -name '*.h' -o -name '*.cpp' -o -name '*.td' $ -print0 | xargs -0 perl -pi -e 's/mlir::cir/cir/g' find clang $ -name '*.h' -o -name '*.cpp' $ -print0 | xargs -0 perl -pi -e 's/::cir/cir/g' find clang $ -name '*.h' -o -name '*.cpp' $ -print0 | xargs -0 perl -0777 -pi -e 's/namespace mlir \{\nnamespace cir \{/namespace cir {/g' find clang $ -name '*.h' -o -name '*.cpp' $ -print0 | xargs -0 perl -0777 -pi -e 's!\} // namespace cir\n\} // namespace mlir!} // namespace cir!g' ``` It then required some manual fixups to address edge cases. Code that lived under `mlir::cir` could refer to the `mlir` namespace without qualification, but after the namespace change, we need to explicitly qualify all our usages. This is done in the second commit via https://gist.github.com/smeenai/996200fd45ad123bbf22b412d59479b6, which is an idempotent script to add all qualifications. I added cases to the script one at a time and reviewed each change afterwards to ensure we were only making the intended modifications, so I feel pretty confident in the end result. I also removed `using namespace llvm` from some headers to avoid conflicts, which in turn required adding some `llvm::` qualifiers as well. The third commit fixes a test, since an error message now contains the mlir namespace. Similar tests in flang also have the namespace in their error messages, so this is an expected change. The fourth change runs `git clang-format`. Unfortunately, that doesn't work for TableGen files, so we'll have a few instances of undesirable formatting left there. I'll look into fixing that as a follow-up. I validated the end result by examining the symbols in the built Clang binary. There's nothing in the `mlir::cir` namespace anymore. https://gist.github.com/smeenai/8438fd01588109fcdbde5c8652781dc0 had the symbols which lived in `cir` and should have moved to `clang::CIRGen`, and I validated that all the symbols were moved, with the exceptions noted in llvm#1082 and the duplicated symbols noted in llvm#1025.

We have both clang/include/clang/CIRFrontendAction/CIRGenAction.h and clang/include/clang/cir/FrontendAction/CIRGenAction.h, which is a historical artifact. The latter is what's being upstreamed, so merge the former into it to avoid any confusion.

Fixed: llvm#265

…lvm#1079)

The buildX naming convention originated when the CIRGen implementation was planned to be substantially different from original CodeGen. CIRGen is now a much closer adaption of CodeGen, and the emitX to buildX renaming just makes things more confusing, since CodeGen also has some helper functions whose names start with build or Build, so it's not immediately clear which CodeGen function corresponds to a CIRGen buildX function. Rename the buildX functions back to emitX to fix this. This diff was generated mostly mechanically. I searched for all buildX functions in CIRGen and all emitX or buildX functions in CodeGen: ``` rg '\b[Bb]uild[A-Z][A-Za-z0-9_]*\b' clang/lib/CIR/CodeGen -Io | sort -u -o /tmp/buildfuncs rg '\b([Ee]mit|[Bb]uild)[A-Z][A-Za-z0-9_]*\b' clang/lib/CodeGen -Io | sort -u -o /tmp/emitfuncs ``` I used a simple Python script to find corresponding functions: https://gist.github.com/smeenai/02be7ced8564cef5518df72606ec7b19. https://gist.github.com/smeenai/6ffd67be4249c8cebdd7fa99cfa4f13c is the resulting list of correspondences. This isn't 100% accurate because it's not accounting for the files that the functions are present in, but that's pretty unlikely to matter here, so I kept it simple. The buildX functions in CIRGen which correspond to an emitX function in CodeGen should be changed, and the ones which correspond to a BuildX function in CodeGen should not be changed. That leaves some functions without any correspondences, which required a judgement call. I scanned through all those functions, and buildVirtualMethodAttr was the only one that seemed like it shouldn't be changed to emit. I performed the replacement as follows: ``` funcs="$(awk '(/-> [Ee]/ || !/->/) && !/buildVirtualMethodAttr/ { print substr($1, 6) }' /tmp/corrfuncs | paste -sd '|')" find clang/include/clang/CIR clang/lib/CIR/{CodeGen,FrontendAction} $ -name '*.h' -o -name '*.cpp' $ -print0 | \ xargs -0 perl -pi -e "s/\bbuild($funcs)\\b/emit\\1/g" ``` The mechanical changes are in the first commit of this PR. There was a manual fixup required for a token pasting macro in CIRGenExprScalar.cpp, which is the second commit. I then ran `git clang-format`, which is the third commit. (They'll be squashed together when the PR is committed.)

Move LP into CIRGen and give it a handle on the CIRGenModule. A lot of code has been duplicated from CIRGen into cir/Dialect/Transforms in order to let LP live there, but with more necessary CIRGen features (e.g. EH scope and cleanups) going to be used in LP it doesn't make sense to keep it separate. Add this patch that just refactors LoweringPrepare into the CIRGen directory and give it a handle on the CGM.

This PR supports string literals in OpenCL end to end, making it possible to use `printf`. This involves two changes: * In CIRGen, ensure we create the global symbol for string literals with correct `constant` address space. * In LowerToLLVM, make the lowering of `GlobalViewAttr` aware of the upstream address space. Other proper refactors are also applied. Two test cases from OG CodeGen are reused. `str_literals.cl` is the primary test, while `printf.cl` is the bonus one.

This reverts commit 4463352.

This patch adds support for the `__builtin_signbit` intrinsic. The intrinsic requires special handling for PowerPC; however, since ClangIR does not currently support PowerPC, this handling is omitted in this implementation.

The test code is from [OG's clang/test/CodeGen/builtins-memcpy-inline.c](https://github.com/llvm/clangir/blob/5f1afad625f1292ffcf02c36402d292c46213c86/clang/test/CodeGen/builtins-memcpy-inline.c#L7) Also, a little design choice when introducing MemCpyInlineOp, I chose to let it inherit CIR_MemCpyOp, so in future when we optimize MemCpy like Ops, we'd have cleaner and more unified code. However, the cost is that during LLVM lowering I'd have to convert the length from ConstOp into IntegerAttr as that's [what LLVM dialect is expecting](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrmemcpyinline-llvmmemcpyinlineop)

Notable change is to introduce helper function `buildBinaryAtomicPost` which models on [OG's `EmitBinaryAtomicPost`](https://github.com/llvm/clangir/blob/dbf320e5c3db0410566ae561067c595308870bad/clang/lib/CodeGen/CGBuiltin.cpp#L340C15-L340C35). Comparing to `EmitBinaryAtomicPost`, `buildBinaryAtomicPost` is more concise as OG's ``EmitBinaryAtomicPost`` duplicates quite a bit of code from [MakeBinaryAtomicValue](https://github.com/llvm/clangir/blob/dbf320e5c3db0410566ae561067c595308870bad/clang/lib/CodeGen/CGBuiltin.cpp#L340) Also, didn't implement invert as __sync_add_and_fetch does not need it, but will add it (which is a trivial work) when we implement a builtin that needs it. Test cases are from [OG](https://github.com/llvm/clangir/blob/dbf320e5c3db0410566ae561067c595308870bad/clang/test/CodeGen/Atomics.c#L134)

…lvm#1095) When emitting initializers for static declarations, it's essential to ensure that the `cir.global` operation aligns its type with that of the initializer. The original approach creates a new global op and copies every attribute from the old one. But just `setSymType` should work well. This also removes missing feature flags there.

Traditional Clang implementation: https://github.com/llvm/clangir/blob/a0091e38f1027e35d17819e02ee1ae257a12d296/clang/lib/CodeGen/CGAtomic.cpp#L545-L550

… bf16 (llvm#1360) Lower vget_lane_bf16, vduph_lane f16 and bf16

Traditional Clang implementation: https://github.com/llvm/clangir/blob/a0091e38f1027e35d17819e02ee1ae257a12d296/clang/lib/CodeGen/CGBuiltin.cpp#L4116-L4118 I use the first argument type as the return type. It is OK for `__builtin_elementwise_acos`, however, I'm not sure it is OK for other builtin functions. Resolves: llvm#1361

…ance (llvm#1363) This PR removes XFAIL test_vec3

CUDA shared variables are device-only, accessible from all threads in a block of some kernel. It's similar to `local` variables in OpenCL which all threads in a work-group can access. Hence they are realized as `static` variables in addrspace(local). On the other hand, the local variables inside a kernel (without special attributes) are just regular variables, typically emitted by `CreateTempAlloca`. They are in the default address space. OG checks if the expected address space, denoted by the type, is the same as the actual address space indicated by attributes. If they aren't the same, a `addrspacecast` is emitted when a global variable is accessed. In CIR however, `cir.get_global` alreadys carries that information in `!cir.ptr` type, so we don't need a cast.

Add an implementation for array new initialization of multidimension arrays. This is able to leverage existing array element initialization with just a few changes to update the pointer types.

…1341) On HIP when launching a kernel we pass as a first argument a global variable that points to the device stub function. We follow OG design by having a map that pairs globals to symbols. In CUDA this is effectively a nop, as CUDA passes the device stub as a first argument.

Lower `neon_vshrd_n_u64`

Lower `neon_vsrad_n_u64`

This PR fixes a case with global vars replacement when an array element is neither a`GlobalView` nor `ConstArray` but just a null pointer . Previously we just skip such element. Basically, the case in question is `static S* global[] = {0, &another_global}`

The final part of llvm#1273.

llvm#1372) Lower vgetq_lane_bf16, vduph f16 and fb16

This is the second part of [PR#1290](llvm#1290), adding initial support for `__cxa_rethrow`. So, the last PR did not support statements that come after the `UnreachableOp` from the `rethrow`, we just ignored them, e.g: ``` struct S { S() {} }; void foo() { int r = 1; try { throw; S s; } catch (...) { ++r; } } ``` This PR fixes this. A few changes: - `rethrow` statements split their block into multiple blocks. - Tests with statements that come after the `rethrow` were added and old ones were modified. Concern: - The `ScopeOp` workaround still exists which I guess was one of the main concerns when we tried in the last PR. As usual, I am open to discussions on this and how to approach it better -:)

This patch introduces `CIR_TBAAStructAttr`, which encodes the type and offset of each field in a struct, although it may lead to some duplication in `CIR`. If we manage `cir::TBAAStructAttr` by adding a new method to `ASTRecordDeclInterface`, it will also introduce duplication between `CIRGen` and LLVM lowering.

…vm#1370) Currently, trying to generate CIR for the following code snippet `yield.cpp` fails. Using `bin/clang++ yield.cpp -Xclang -fclangir -Xclang -emit-cir -S -o -`: ``` struct S { S() {}; int a; }; void foo() { try { S s; } catch (...) { foo(); } } ``` The error: ``` loc("yield.cpp":6:6): error: 'cir.yield' op must be the last operation in the parent block ``` There is an extra YieldOp! The CIR dump before verification looks something like: ``` "cir.scope"() ({ %0 = "cir.alloca"() <{alignment = 4 : i64, allocaType = !cir.struct<struct "S" {!cir.int<s, 32>} #cir.record.decl.ast>, ast = #cir.var.decl.ast, init, name = "s"}> : () -> !cir.ptr<!cir.struct<struct "S" {!cir.int<s, 32>} #cir.record.decl.ast>> "cir.try"() <{catch_types = [#cir.all]}> ({ "cir.call"(%0) <{callee = @_ZN1SC1Ev, calling_conv = 1 : i32, exception, extra_attrs = #cir<extra({})>, side_effect = 1 : i32}> ({ "cir.yield"() : () -> () }) : (!cir.ptr<!cir.struct<struct "S" {!cir.int<s, 32>} #cir.record.decl.ast>>) -> () "cir.yield"() : () -> () }, { %1 = "cir.catch_param"() : () -> !cir.ptr<!cir.void> "cir.call"() <{ast = #cir.call.expr.ast, callee = @_Z3foov, calling_conv = 1 : i32, exception, extra_attrs = #cir<extra({})>, side_effect = 1 : i32}> ({ "cir.yield"() : () -> () "cir.yield"() : () -> () <--- **DUPLICATE** }) : () -> () "cir.yield"() : () -> () }) : () -> () "cir.yield"() : () -> () }, { }) : () -> () ``` This PR adds a check for an already existing terminator before creating a YieldOp during the cleanup.

Original Clang's implementation: https://github.com/llvm/clangir/blob/2df2022c90e00c0eac542eba4078e79306155c7b/clang/lib/CodeGen/CGBuiltin.cpp#L4131-L4133 https://github.com/llvm/clangir/blob/2df2022c90e00c0eac542eba4078e79306155c7b/clang/lib/CodeGen/CGBuiltin.cpp#L762-L776 Compared with non-elementwise exp https://github.com/llvm/clangir/blob/2df2022c90e00c0eac542eba4078e79306155c7b/clang/lib/CodeGen/CGBuiltin.cpp#L3006-L3016 https://github.com/llvm/clangir/blob/2df2022c90e00c0eac542eba4078e79306155c7b/clang/lib/CodeGen/CGBuiltin.cpp#L669-L685 elementwise version doesn't handle constrained situation. I'm not sure whether it is intended. For renaming, it is to match original clang's implementation closely. Resolves: llvm#1375

Lower vrsrad_n u64 and s64

There were problems with the pointer type and element type of the Address class getting out of sync. In the traditional codegen the pointer has no type, so it was sufficient for the Address class to simply track the type it was supposed to be pointing to. Since ClangIR pointer values are typed, the Address::withType function wasn't really doing what it was supposed to. It returned an object with the same pointer that the original object had, but with a mismatched element type. This change updates the Address::withType function to perform a bitcast to get the expected pointer type before creating a new Address object. It also adds assertions in the Address class to verify that pointer type and element type are consistent and updates many places that were causing those assertions to fire. These code changes cause extra bitcasts to be emitted in a few places. Regression tests have been updated as needed to reflect the CIR that is now generated.

bcardosolopes · 2025-02-22T01:03:44Z

conflict should be resolved

This is a rebased version of the inactive PR #1380. --------- Co-authored-by: koparasy <[email protected]>

Lancern and others added 30 commits January 27, 2025 12:04

[CIR][Lowering] Add LLVM lowering support for cir.assume (llvm#1066)

cd152c2

This PR adds LLVMIR lowering support for `cir.assume`, `cir.assume.aligned`, and `cir.assume.separate_storage`.

[CIR][NFC] Fix unused warning

13be731

[CIR][CodeGen] Use the same SSA name as OG's for string literals (llv…

d8e1ad3

…m#1073) This PR changes the naming format of string literals from `.str1` to `.str.1`, making it easier to reuse the existing testcases of OG CodeGen.

[CIR] Make X86ArgClass an enum class (llvm#1080)

fe01d5a

It's currently polluting the `cir` namespace with very generic symbols like `Integer` and `Memory`, which is pretty confusing. `X86_64ABIInfo` already has `Class` alias for `X86ArgClass`, so we can use that alias to qualify all uses.

[CIR][CIRGen] Fix some alias issues under -O1 and above

b15d9ad

Note that there are still missing pieces, which will be incrementally addressed.

[CIR][NFC] Add more checks and notes for virtual-destructor-calls.cpp…

8fc5d4b

… test

[CIR][NFC] Update test comment

1a52c81

Just verified this is actually done by some LLVM optimization, not by the frontend emitting directly, so this is a non-goal now, since CIR can also use LLVM opts to do the same once we have real global alias.

[CIR][NFC] More comments removed for ir differences

5d1abbe

Also verified this does not apply anymore, we match -O0. The only remaing part is to lower to proper LLVM globals once LLVM IR dialect gets the global alias support.

Revert "[CIR] fix getTypeSizeInBits with fp80 and fp128" (llvm#1089)

e6753e4

Reverts llvm#1058

[CIR][CIRGen][Builtin] Support __builtin_return_address (llvm#1046)

baec4d1

test case is from [Traditional Clang CodeGen test file](https://github.com/llvm/clangir/blob/723e78afb5ae4fbd000269a057410913ade3ef44/clang/test/CodeGen/2004-02-13-BuiltinFrameReturnAddress.c#L5)

[CIR][CIRGen][Builtin][Neon] Lower neon_vqmovn_v (llvm#1071)

abbf847

[CIR][NFC] Expand doc on cir.scope terminators (llvm#1078)

25fb28b

Fixed: llvm#265

[CIR][CIRGen][Builtin][Neon] Lower neon_vqdmulhq_v, neon_vqdmulh_v (l…

adf1b7d

…lvm#1079)

Revert "[CIR][NFC] Move LoweringPrepare into CIRGen (llvm#1092)"

5ba213a

This reverts commit 4463352.

[CIR][CIRGen] support builtin signbit (llvm#1033)

3bef174

This patch adds support for the `__builtin_signbit` intrinsic. The intrinsic requires special handling for PowerPC; however, since ClangIR does not currently support PowerPC, this handling is omitted in this implementation.

FantasqueX and others added 22 commits February 19, 2025 11:24

[CIR][CIRGen] Atomics: handle atomic_compare_exchange_weak (llvm#1359)

19e5076

Traditional Clang implementation: https://github.com/llvm/clangir/blob/a0091e38f1027e35d17819e02ee1ae257a12d296/clang/lib/CodeGen/CGAtomic.cpp#L545-L550

[CIR][CIRGen][Builtin][Neon] Lower vget_lane_bf16, vduph_lane f16 and…

a99b65f

… bf16 (llvm#1360) Lower vget_lane_bf16, vduph_lane f16 and bf16

[CIR][CIRGen] handle vectors of size 3 like size 4 for better perform…

2b94285

…ance (llvm#1363) This PR removes XFAIL test_vec3

[CIR] Add handling for initializing multi-dimension arrays (llvm#1369)

74e4af3

Add an implementation for array new initialization of multidimension arrays. This is able to leverage existing array element initialization with just a few changes to update the pointer types.

[CIR][CIRGen][Builtin][Neon] Lower neon_vshrd_n_u64 (llvm#1353)

a4d50eb

Lower `neon_vshrd_n_u64`

[CIR][CIRGen][Builtin][Neon] Lower neon_vsrad_n_u64 (llvm#1356)

8dfd5ee

Lower `neon_vsrad_n_u64`

[CIR][CIRGen] Handle __sync_nand_and_fetch (llvm#1351)

27154ac

The final part of llvm#1273.

[CIR][CIRGen][Builtin][Neon] Lower vgetq_lane_bf16, vduph f16 and fb16 (

2df2022

llvm#1372) Lower vgetq_lane_bf16, vduph f16 and fb16

[CIR][CIRGen][Builtin][Neon] Lower vrsrad_n u64 and s64 (llvm#1379)

5f68f6c

Lower vrsrad_n u64 and s64

Rename file

f747b57

Modify files to match OG

ba263ca

Add new CUDA Runtime

f25305e

Reorder cmake

1812c76

koparasy marked this pull request as ready for review February 21, 2025 05:33

koparasy requested review from lanza and bcardosolopes as code owners February 21, 2025 05:33

lanza force-pushed the main branch from 0364dd2 to 79d0d74 Compare March 18, 2025 02:22

seven-mile mentioned this pull request Mar 25, 2025

[CIR][CIRGen][CUDA][NFC] Mirror CUDARuntime skeleton of OG #1527

Merged

seven-mile added a commit that referenced this pull request Mar 26, 2025

[CIR][CIRGen][CUDA][NFC] Mirror CUDARuntime skeleton of OG (#1527)

67579ff

This is a rebased version of the inactive PR #1380. --------- Co-authored-by: koparasy <[email protected]>

lanza force-pushed the main branch from d3fe299 to 98e8811 Compare April 9, 2025 22:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[CIR][HIP|CUDA] Mirrors CUDARuntime skeleton of OG #1380

[CIR][HIP|CUDA] Mirrors CUDARuntime skeleton of OG #1380

Uh oh!

koparasy commented Feb 21, 2025

Uh oh!

bcardosolopes commented Feb 22, 2025

Uh oh!

Uh oh!

[CIR][HIP|CUDA] Mirrors CUDARuntime skeleton of OG #1380

Are you sure you want to change the base?

[CIR][HIP|CUDA] Mirrors CUDARuntime skeleton of OG #1380

Uh oh!

Conversation

koparasy commented Feb 21, 2025

Uh oh!

bcardosolopes commented Feb 22, 2025

Uh oh!

Uh oh!