Currently, the explicit time advance (the core of the code) runs via calls into the kronmult library: https://github.com/project-asgard/kronmult. Some kernels in https://github.com/project-asgard/asgard/blob/develop/src/device/kronmult_cuda.cpp are used to set up for calls into the library.
Both the main kronmult code and the setup kernels are written as CUDA kernels, with a fallback to OpenMP. To enhance portability, we could try a number of higher-level approaches:
NVIDIA HPC SDK: https://developer.nvidia.com/hpc-sdk allows C++ parallel algorithms (https://en.cppreference.com/w/cpp/experimental/parallelism) to be run on the accelerator. Our code may not fit this paradigm, but it may be worth exploring.
HIPify the kernels for AMD GPUs: https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-porting-guide.html.
Others? Kokkos, OpenCL, etc.