kokkos-fft

Warning

EXPERIMENTAL FFT interfaces for Kokkos C++ Performance Portability Programming EcoSystem

kokkos-fft implements local interfaces between Kokkos and de facto standard FFT libraries, including fftw, cufft, hipfft (rocfft), and oneMKL. "Local" means that no MPI is used, or that the transform runs within a single MPI process without any knowledge of MPI. We aim to provide numpy.fft-like interfaces adapted for Kokkos. The key concept is "As easy as numpy, as fast as vendor libraries". Accordingly, our API follows the numpy.fft API with minor differences. The FFT library dedicated to the Kokkos device backend (e.g. cufft for the CUDA backend) is selected automatically. If runtime values (say, View extents) are invalid, a runtime error (C++ std::runtime_error) is raised. See the documentation for more information.

Here is an example of a 1D real-to-complex transform with rfft in kokkos-fft.

#include <Kokkos_Core.hpp>
#include <Kokkos_Complex.hpp>
#include <Kokkos_Random.hpp>
#include <KokkosFFT.hpp>
using execution_space = Kokkos::DefaultExecutionSpace;
template <typename T> using View1D = Kokkos::View<T*, execution_space>;
constexpr int n = 4;

View1D<double> x("x", n);
View1D<Kokkos::complex<double> > x_hat("x_hat", n/2+1);

Kokkos::Random_XorShift64_Pool<> random_pool(12345);
Kokkos::fill_random(x, random_pool, 1);
Kokkos::fence();

KokkosFFT::rfft(execution_space(), x, x_hat);

This is equivalent to the following Python code.

import numpy as np
x = np.random.rand(4)
x_hat = np.fft.rfft(x)

There are two major differences: the C++ call takes an execution_space argument, and the output (x_hat) is passed as an argument instead of being returned. kokkos-fft only accepts Kokkos Views as input data. Whether the Views are accessible from execution_space is checked statically (an inaccessible View causes a compilation error).

Depending on the View dimension, batched plans are used automatically, as follows.
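The output View in the C++ example holds n/2+1 complex values rather than n. For real input, the remaining frequency bins are complex conjugates of the stored ones, so rfft omits them, as numpy.fft.rfft does. The sketch below illustrates this layout with a naive O(n^2) transform in plain C++. The name `naive_rfft` is hypothetical and not part of kokkos-fft, which delegates the actual computation to the vendor libraries.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(n^2) real-to-complex DFT, for illustration only.
// This is NOT how kokkos-fft computes rfft; it only shows the output layout.
std::vector<std::complex<double>> naive_rfft(const std::vector<double>& x) {
  const std::size_t n = x.size();
  assert(n > 0);
  const double pi = std::acos(-1.0);

  // Only bins 0 .. n/2 are stored; the rest follow from Hermitian symmetry.
  std::vector<std::complex<double>> x_hat(n / 2 + 1);
  for (std::size_t k = 0; k < x_hat.size(); ++k) {
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
      const double angle =
          -2.0 * pi * static_cast<double>(k * j) / static_cast<double>(n);
      sum += x[j] * std::complex<double>(std::cos(angle), std::sin(angle));
    }
    x_hat[k] = sum;  // unscaled forward transform
  }
  return x_hat;
}
```

For x = {1, 2, 3, 4}, this yields three values, 10, -2+2i and -2, which matches np.fft.rfft([1, 2, 3, 4]).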

#include <Kokkos_Core.hpp>
#include <Kokkos_Complex.hpp>
#include <Kokkos_Random.hpp>
#include <KokkosFFT.hpp>
using execution_space = Kokkos::DefaultExecutionSpace;
template <typename T> using View2D = Kokkos::View<T**, execution_space>;
constexpr int n0 = 4, n1 = 8;

View2D<double> x("x", n0, n1);
View2D<Kokkos::complex<double> > x_hat("x_hat", n0, n1/2+1);

Kokkos::Random_XorShift64_Pool<> random_pool(12345);
Kokkos::fill_random(x, random_pool, 1);
Kokkos::fence();

int axis = -1;
KokkosFFT::rfft(execution_space(), x, x_hat, KokkosFFT::Normalization::backward, axis); // FFT along -1 axis and batched along 0th axis

This is equivalent to

import numpy as np
x = np.random.rand(4, 8)
x_hat = np.fft.rfft(x, axis=-1)

In this example, a 1D batched rfft is executed over the 2D View along axis -1. Some basic examples can be found in examples.
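The shape bookkeeping behind the batched example can be sketched in plain C++. A negative axis counts from the last dimension, as in numpy, and only the transformed axis shrinks to n/2+1 while the batch axes keep their extents. All three helpers below are hypothetical and not part of the kokkos-fft API. The last one only mimics the kind of extent check that ends in a std::runtime_error; the actual checks and messages in kokkos-fft may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: map a possibly negative axis to a non-negative
// index, following the numpy convention (-1 is the last axis).
std::size_t normalize_axis(int axis, std::size_t rank) {
  const int r = static_cast<int>(rank);
  if (axis < -r || axis >= r) {
    throw std::runtime_error("axis " + std::to_string(axis) +
                             " is out of range for rank " +
                             std::to_string(rank));
  }
  return static_cast<std::size_t>(axis < 0 ? axis + r : axis);
}

// Hypothetical helper: extents of the rfft output for given input extents.
// Only the transformed axis changes; the other axes are batch axes.
std::vector<std::size_t> rfft_output_extents(
    std::vector<std::size_t> in_extents, int axis) {
  assert(!in_extents.empty());
  const std::size_t a = normalize_axis(axis, in_extents.size());
  in_extents[a] = in_extents[a] / 2 + 1;
  return in_extents;
}

// Hypothetical check: throw if the output extents do not match the input.
void check_rfft_extents(const std::vector<std::size_t>& in_extents,
                        const std::vector<std::size_t>& out_extents,
                        int axis) {
  if (out_extents != rfft_output_extents(in_extents, axis)) {
    throw std::runtime_error("output extents do not match the rfft input");
  }
}
```

With input extents {4, 8} and axis -1, the expected output extents are {4, 5}, which is the (n0, n1/2+1) shape used for x_hat in the example.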

Disclaimer

kokkos-fft is under development and subject to change without warning. The authors do not guarantee that this code runs correctly in all environments.

Using kokkos-fft

For the moment, there are two ways to use kokkos-fft: including it as a subdirectory in a CMake project, or installing it as a library. First of all, you need to clone this repo.

git clone --recursive https://github.com/kokkos/kokkos-fft.git

Prerequisites

To use kokkos-fft, the following are required:

  • CMake 3.22+
  • Kokkos 4.4+
  • gcc 8.3.0+ (CPUs)
  • IntelLLVM 2023.0.0+ (CPUs, Intel GPUs)
  • nvcc 11.0.0+ (NVIDIA GPUs)
  • rocm 5.3.0+ (AMD GPUs)

CMake

Since kokkos-fft is a header-only library, it is enough to add it as a subdirectory. It is assumed that kokkos and kokkos-fft are placed under <project_directory>/tpls.

Here is an example of using kokkos-fft in a CMake project with the following layout.

---/
 |
 └──<project_directory>/
    |--tpls
    |    |--kokkos/
    |    └──kokkos-fft/
    |--CMakeLists.txt
    └──hello.cpp

The CMakeLists.txt would be

cmake_minimum_required(VERSION 3.22)
project(kokkos-fft-as-subdirectory LANGUAGES CXX)

add_subdirectory(tpls/kokkos)
add_subdirectory(tpls/kokkos-fft)

add_executable(hello-kokkos-fft hello.cpp)
target_link_libraries(hello-kokkos-fft PUBLIC Kokkos::kokkos KokkosFFT::fft)

For compilation, we basically rely on the CMake options for Kokkos. For example, the compile options for an A100 GPU are as follows.

cmake -B build \
      -DCMAKE_CXX_COMPILER=g++ \
      -DCMAKE_BUILD_TYPE=Release \
      -DKokkos_ENABLE_CUDA=ON \
      -DKokkos_ARCH_AMPERE80=ON
cmake --build build -j 8

With this configuration, all functionality is executed on A100 GPUs. Installation details are provided in the documentation.

LICENCE

kokkos-fft is distributed under either the MIT license or, at your option, the Apache-2.0 license with LLVM exception.