
explicit time advance portability #349

@bmcdanie

Description


Currently, the explicit time advance (the core of the code) runs via calls into the kronmult library: https://github.com/project-asgard/kronmult. Some kernels in https://github.com/project-asgard/asgard/blob/develop/src/device/kronmult_cuda.cpp are used to set up for calls into the library.
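For context, the core operation such a library evaluates is the action of a Kronecker product on a vector without ever forming the product matrix, via the identity (A ⊗ B) vec(X) = vec(B X Aᵀ). A minimal dense, single-threaded sketch of that identity (illustrative only — this is not the kronmult library's actual interface):

```cpp
#include <cstddef>
#include <vector>

// Computes y = (A (x) B) x for square row-major matrices A (n x n) and
// B (m x m), using (A (x) B) vec(X) = vec(B X A^T). The vector x is
// vec(X) with column-stacking, i.e. x[i*m + j] = X(j, i). Cost is
// O(n*m*(n + m)) instead of O((n*m)^2) for the formed Kronecker product.
std::vector<double> kron_apply(const std::vector<double> &A, std::size_t n,
                               const std::vector<double> &B, std::size_t m,
                               const std::vector<double> &x) {
  // t[k*m + j] = sum_i A(k, i) * x[i*m + j]   (this is vec(X * A^T))
  std::vector<double> t(n * m, 0.0);
  for (std::size_t k = 0; k < n; ++k)
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < m; ++j)
        t[k * m + j] += A[k * n + i] * x[i * m + j];
  // y[k*m + l] = sum_j B(l, j) * t[k*m + j]   (this is vec(B * (X * A^T)))
  std::vector<double> y(n * m, 0.0);
  for (std::size_t k = 0; k < n; ++k)
    for (std::size_t l = 0; l < m; ++l)
      for (std::size_t j = 0; j < m; ++j)
        y[k * m + l] += B[l * m + j] * t[k * m + j];
  return y;
}
```

The two small matrix-matrix products are what the library maps onto batched GPU kernels; that structure is why the portability question below matters.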

Both the main kronmult code and the setup kernels are written as CUDA kernels, with an OpenMP fallback. To improve portability, we could try a number of higher-level approaches:

NVIDIA HPC SDK: https://developer.nvidia.com/hpc-sdk allows C++ parallel algorithms (https://en.cppreference.com/w/cpp/experimental/parallelism) to run on the accelerator. Our code may not fit this paradigm, but it may be worth exploring.
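To make the idea concrete, here is a hypothetical scale-and-add (the shape of much of the setup work) written with C++17 parallel algorithms. Compiled with `nvc++ -stdpar=gpu`, the `std::execution::par_unseq` policy offloads the transform to the GPU; for other compilers this sketch falls back to the sequential overload so it builds without a parallel-STL backend such as TBB. The function name is illustrative, not from the codebase.

```cpp
#include <algorithm>
#include <vector>

#ifdef __NVCOMPILER
#include <execution>
#define EXEC_POLICY std::execution::par_unseq,
#else
#define EXEC_POLICY
#endif

// out = alpha * x + y, element-wise. With nvc++ -stdpar=gpu the lambda
// captures by value and the loop is offloaded; otherwise it runs serially.
std::vector<double> axpy(double alpha, const std::vector<double> &x,
                         const std::vector<double> &y) {
  std::vector<double> out(x.size());
  std::transform(EXEC_POLICY x.begin(), x.end(), y.begin(), out.begin(),
                 [alpha](double xi, double yi) { return alpha * xi + yi; });
  return out;
}
```

The open question flagged above is whether the batched, irregular kronmult launches map onto this flat-iteration model as cleanly as an element-wise kernel does.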

Hipify the kernels: https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-porting-guide.html.
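For a sense of what this port involves: hipify is largely mechanical renaming of runtime API calls, since the kernel language itself is nearly identical. A hypothetical fragment (identifiers illustrative, not from the codebase):

```cuda
// Before (CUDA):
cudaMalloc(&d_x, bytes);
cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);
scale_kernel<<<blocks, threads>>>(d_x, n);

// After hipify (HIP):
hipMalloc(&d_x, bytes);
hipMemcpy(d_x, h_x, bytes, hipMemcpyHostToDevice);
hipLaunchKernelGGL(scale_kernel, blocks, threads, 0, 0, d_x, n);
```

The resulting HIP source can still target NVIDIA hardware (HIP compiles to CUDA there), so this route keeps a single source tree rather than forking the kernels.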

Others? Kokkos, OpenCL, etc.
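Under the Kokkos option, a setup kernel would be written once against its abstraction and compiled for the CUDA, HIP, OpenMP, or serial backend. A sketch, assuming a Kokkos build is available (function and label names are illustrative):

```cpp
#include <Kokkos_Core.hpp>

// Hypothetical Kokkos version of an element-wise setup kernel. The same
// source runs on whichever backend Kokkos was configured with.
void scale(Kokkos::View<double *> x, double alpha) {
  Kokkos::parallel_for(
      "scale", x.extent(0), KOKKOS_LAMBDA(const int i) { x(i) *= alpha; });
}
```

The trade-off relative to hipify is a new build dependency and data-structure idioms (Views) in exchange for one portable source.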
