MLPoisson on AMD GPUs crashes - kernel not starting (?) #4336
Unanswered
ElloiseFangelLloyd asked this question in Q&A
Replies: 1 comment 6 replies
-
Could you try |
-
Hi all
I have been using the MLPoisson functionality to solve Poisson's equation, and it runs with no issues on CPUs and on GPUs with the CUDA backend. Having migrated to a new system (LUMI), I need to run the same code on AMD GPUs, and there it crashes at runtime (MRE attached).
AMReX was compiled and installed with cmake (the preferred approach at LUMI) using the amdclang compiler and the HIP backend, and has been working without problems so far. I have had no issues setting up MultiFabs or using ParallelFor and MFIter; however, the MLPoisson instantiation (line 96 in the MRE) fails.
I have attached the MRE; the error occurs on line 96:
MLPoisson Poisson_equation({geom}, {grids}, {dmap}, info).
This is a single-level solver, and the code is almost entirely taken from the ABecLaplacianC example. My understanding (I am not an expert in reading traces) is that the kernel was not created and the template instantiation did not work.
I usually compile AMReX code with amdclang for uniformity, but crayclang was also tried. I contacted support at LUMI, who also tried different versions of ROCm (6.0 and 6.2) and two AMReX versions (devel and v2024).
Any advice on this issue is massively appreciated! Thanks so much in advance for your time.
Backtrace:
amrex_poisson.zip
Edit: put line 96 into code format.