GPU matmul rules in Enzyme CUDA extension

Following up on [this comment](https://github.com/rsenne/ParallelMCMC.jl/pull/32#discussion_r...) from @wsmoses on rsenne/ParallelMCMC.jl#32 -- I'd like to upstream the GPU matmul EnzymeRules I wrote there.

Plan: a CUDA package extension registering `forward` / `augmented_primal` / `reverse` rules for `Base.*` on `CuArray` / `CuMatrix` / `CuVector` arguments (plus transpose/adjoint variants). The rules compute the primal and cotangents with plain `*`, so the cuBLAS call stays opaque to Enzyme, which sidesteps the gc-transition abort during LLVM lowering. Width-1 only to start.
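For reference, these are the standard identities such rules would evaluate for `C = A * B`. This is a sketch assuming real-valued entries and width 1 (one shadow per argument), not copied from the reference implementation. Dotted symbols are forward-mode tangents and barred symbols are reverse-mode cotangents; that notation is mine, not from the PR.

```math
\begin{aligned}
C &= A B \\
\dot{C} &= \dot{A}\, B + A\, \dot{B} \\
\bar{A} &\mathrel{+}= \bar{C}\, B^{\top} \\
\bar{B} &\mathrel{+}= A^{\top} \bar{C}
\end{aligned}
```

Every right-hand side is itself an ordinary matrix product, so each can be evaluated with plain `*` on the GPU arrays and Enzyme never has to differentiate through the cuBLAS call. For complex entries the transposes would become adjoints.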

Reference implementation: [ParallelMCMC.jl/ext/EnzymeExt.jl](https://github.com/rsenne/ParallelMCMC.jl/blob/GPU_Fixes_/ext/EnzymeExt.jl).

OK to open a PR?

GPU matmul rules in Enzyme CUDA extension #3122

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

GPU matmul rules in Enzyme CUDA extension #3122

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions