Warning
EXPERIMENTAL FFT interfaces for Kokkos C++ Performance Portability Programming EcoSystem
kokkos-fft implements local interfaces between Kokkos and de facto standard FFT libraries, including fftw, cufft, hipfft (rocfft), and oneMKL. "Local" means not using MPI, or running within a single MPI process without knowing about MPI. We aim to provide numpy.fft-like interfaces adapted for Kokkos.
The key concept is "As easy as numpy, as fast as vendor libraries". Accordingly, our API follows the numpy.fft API with minor differences. The FFT library dedicated to the Kokkos device backend (e.g. cufft for the CUDA backend) is used automatically. If runtime values (say View extents) are invalid, it raises runtime errors (C++ std::runtime_error). See the documentation for more information.
Here is an example of a 1D real-to-complex transform with rfft in kokkos-fft.
#include <Kokkos_Core.hpp>
#include <Kokkos_Complex.hpp>
#include <Kokkos_Random.hpp>
#include <KokkosFFT.hpp>
using execution_space = Kokkos::DefaultExecutionSpace;
template <typename T> using View1D = Kokkos::View<T*, execution_space>;
constexpr int n = 4;
View1D<double> x("x", n);
View1D<Kokkos::complex<double> > x_hat("x_hat", n/2+1);
Kokkos::Random_XorShift64_Pool<> random_pool(12345);
Kokkos::fill_random(x, random_pool, 1);
Kokkos::fence();
KokkosFFT::rfft(execution_space(), x, x_hat);
This is equivalent to the following Python code.
import numpy as np
x = np.random.rand(4)
x_hat = np.fft.rfft(x)
There are two major differences: the API takes an execution_space argument, and the output value (x_hat) is passed as an argument to the API rather than returned from it. kokkos-fft only accepts Kokkos Views as input data. Accessibility of the Views from execution_space is checked statically (inaccessible Views cause compilation errors).
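The extent n/2+1 of x_hat in the example comes from the Hermitian symmetry of the DFT of a real signal: the remaining output bins are complex conjugates of the stored ones, so they carry no new information. The following is a minimal reference sketch in plain standard C++, useful for checking small results by hand. It is not how kokkos-fft computes the transform (that is delegated to the backend FFT library), and the name naive_rfft is a hypothetical helper, not part of the kokkos-fft API.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical reference helper, NOT part of kokkos-fft: an O(n^2) DFT of a
// real signal that keeps only the n/2 + 1 non-redundant output bins.
std::vector<std::complex<double>> naive_rfft(const std::vector<double>& x) {
  const std::size_t n = x.size();
  if (n == 0) return {};
  const double pi = std::acos(-1.0);
  std::vector<std::complex<double>> x_hat(n / 2 + 1);
  for (std::size_t k = 0; k < x_hat.size(); ++k) {
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
      // Unscaled forward transform: x[j] * exp(-2*pi*i*j*k/n)
      const double angle =
          -2.0 * pi * static_cast<double>(j * k) / static_cast<double>(n);
      sum += x[j] * std::polar(1.0, angle);
    }
    x_hat[k] = sum;
  }
  return x_hat;
}
```

For the input {1, 2, 3, 4} this returns three values, 10, -2+2i and -2, which matches np.fft.rfft([1, 2, 3, 4]) with the default normalization.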
Depending on the View dimension, it automatically uses batched plans, as follows.
#include <Kokkos_Core.hpp>
#include <Kokkos_Complex.hpp>
#include <Kokkos_Random.hpp>
#include <KokkosFFT.hpp>
using execution_space = Kokkos::DefaultExecutionSpace;
template <typename T> using View2D = Kokkos::View<T**, execution_space>;
constexpr int n0 = 4, n1 = 8;
View2D<double> x("x", n0, n1);
View2D<Kokkos::complex<double> > x_hat("x_hat", n0, n1/2+1);
Kokkos::Random_XorShift64_Pool<> random_pool(12345);
Kokkos::fill_random(x, random_pool, 1);
Kokkos::fence();
int axis = -1;
KokkosFFT::rfft(execution_space(), x, x_hat, KokkosFFT::Normalization::backward, axis); // FFT along -1 axis and batched along 0th axis
This is equivalent to
import numpy as np
x = np.random.rand(4, 8)
x_hat = np.fft.rfft(x, axis=-1)
In this example, the 1D batched rfft over a 2D View along axis -1 is executed. Some basic examples are found in examples.
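To make the relation between the input extents, the axis argument, and the output extents concrete, here is a small sketch in plain standard C++. It assumes the numpy convention that a negative axis counts from the last dimension, and that only the transformed axis shrinks to n/2+1 while the batch axes keep their extents. The helpers normalize_axis and rfft_output_extents are hypothetical names for illustration, not kokkos-fft functions, and the choice to throw std::runtime_error on a bad axis is an assumption of this sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helpers for illustration, NOT part of the kokkos-fft API.

// Map a possibly negative axis (numpy style) to an index in [0, rank).
std::size_t normalize_axis(int axis, std::size_t rank) {
  const int r = static_cast<int>(rank);
  if (axis < -r || axis >= r) {
    throw std::runtime_error("axis is out of range for this rank");
  }
  return static_cast<std::size_t>(axis < 0 ? axis + r : axis);
}

// Extents of the complex output of a real-to-complex transform along `axis`.
// All other axes are batch dimensions and keep their extents.
std::vector<std::size_t> rfft_output_extents(std::vector<std::size_t> extents,
                                             int axis) {
  const std::size_t a = normalize_axis(axis, extents.size());
  extents[a] = extents[a] / 2 + 1;
  return extents;
}
```

For the 4 x 8 input above with axis = -1, rfft_output_extents({4, 8}, -1) gives {4, 5}, which matches the x_hat View allocated as (n0, n1/2+1).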
kokkos-fft is under development and subject to change without warning. The authors do not guarantee that this code runs correctly in all environments.
For the moment, there are two ways to use kokkos-fft: including it as a subdirectory in a CMake project, or installing it as a library. First of all, you need to clone this repo.
git clone --recursive https://github.com/kokkos/kokkos-fft.git
To use kokkos-fft, the following are required:
CMake 3.22+
Kokkos 4.4+
gcc 8.3.0+ (CPUs)
IntelLLVM 2023.0.0+ (CPUs, Intel GPUs)
nvcc 11.0.0+ (NVIDIA GPUs)
rocm 5.3.0+ (AMD GPUs)
Since kokkos-fft is a header-only library, it is enough to add it as a subdirectory. It is assumed that kokkos and kokkos-fft are placed under <project_directory>/tpls.
Here is an example to use kokkos-fft in the following CMake project.
---/
 |
 └──<project_directory>/
    |--tpls
    |    |--kokkos/
    |    └──kokkos-fft/
    |--CMakeLists.txt
    └──hello.cpp
The CMakeLists.txt would be
cmake_minimum_required(VERSION 3.23)
project(kokkos-fft-as-subdirectory LANGUAGES CXX)
add_subdirectory(tpls/kokkos)
add_subdirectory(tpls/kokkos-fft)
add_executable(hello-kokkos-fft hello.cpp)
target_link_libraries(hello-kokkos-fft PUBLIC Kokkos::kokkos KokkosFFT::fft)
For compilation, we basically rely on the CMake options for Kokkos. For example, the compile options for an A100 GPU are as follows.
cmake -B build \
-DCMAKE_CXX_COMPILER=g++ \
-DCMAKE_BUILD_TYPE=Release \
-DKokkos_ENABLE_CUDA=ON \
-DKokkos_ARCH_AMPERE80=ON
cmake --build build -j 8
This way, all the functionalities are executed on A100 GPUs. For installation, details are provided in the documentation.
kokkos-fft is distributed under either the MIT license or, at your option, the Apache-2.0 license with LLVM exception.