GPU-accelerated 3D image deconvolution & affine transforms using CUDA.
Python bindings are also available at pycudadecon
Precompiled binaries are available for Linux and Windows on conda-forge (see GPU driver requirements below):
conda install -c conda-forge cudadecon
# or... to also install the python bindings
conda install -c conda-forge pycudadecon
# check that GPU is discovered
cudaDecon -Q
# Basic Usage
# 1. create an OTF from a PSF with "radialft"
radialft /path/to/psf.tif /path/to/otf_output.tif --nocleanup --fixorigin 10
# 2. run decon on a folder of tiffs:
# 'filename_pattern' is a string that must appear in the filename to be processed
cudaDecon $OPTIONS /folder/of/images filename_pattern /path/to/otf_output.tif
# see manual for all of the available arguments
cudaDecon --help
This software requires a CUDA-compatible NVIDIA GPU. The libraries available on conda-forge have been compiled against different versions of the CUDA toolkit. The required CUDA libraries are bundled in the conda distributions, so you don't need to install the CUDA toolkit separately. If desired, you can pick the CUDA version that suits your needs, but note that each version of the CUDA toolkit has its own minimum GPU driver version, listed in the table below.
To install a specific cudatoolkit version (for instance, cudatoolkit=10.2), run:
conda install -c conda-forge cudadecon cudatoolkit=10.2
CUDA | Linux driver | Win driver |
---|---|---|
10.2 | ≥ 440.33 | ≥ 441.22 |
11.0 | ≥ 450.36.06 | ≥ 451.22 |
11.1 | ≥ 455.23 | ≥ 456.38 |
11.2 | ≥ 460.27.03 | ≥ 460.82 |
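To check a Linux driver against the table above, a small shell helper can compare version strings. This is a minimal sketch, not part of cudaDecon: the function names are hypothetical, it assumes GNU `sort -V` is available, and it reads the installed driver version with `nvidia-smi --query-gpu=driver_version`.

```shell
# Succeed (exit 0) if version $2 is >= minimum version $1.
# Assumes GNU coreutils, since it relies on `sort -V` (version sort).
version_at_least() {
    local minimum="$1" actual="$2"
    # The lower version sorts first; the check passes when that is the minimum.
    [ "$(printf '%s\n%s\n' "$minimum" "$actual" | sort -V | head -n 1)" = "$minimum" ]
}

# Minimum Linux driver for a CUDA version, copied from the table above.
min_linux_driver() {
    case "$1" in
        10.2) echo "440.33" ;;
        11.0) echo "450.36.06" ;;
        11.1) echo "455.23" ;;
        11.2) echo "460.27.03" ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: compare the installed driver with the table entry.
# Usage: check_driver 10.2
check_driver() {
    local cuda="$1" required installed
    required="$(min_linux_driver "$cuda")" || {
        echo "CUDA version not in table: $cuda" >&2
        return 2
    }
    # Use the first GPU's driver version; all GPUs share one driver.
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    version_at_least "$required" "$installed"
}
```

For example, `check_driver 11.2 && echo "driver is new enough"` prints the message only when the installed driver meets the 11.2 requirement.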
If you run into trouble, feel free to open an issue and describe your setup.
-
Compatible GPUs are specified in CMakeLists.txt (for example, "C:\cudaDecon\CMakeLists.txt"). This file also sets up all of the linking to dependent libraries. If you add other code libraries, change versions, or need to target a different GPU architecture, you will want to edit this file. The GPU targets are set on lines like: "-gencode=arch=compute_75,code=sm_75"
-
GPU-based resources have a d_ prefix in their name, for example: GPUBuffer & d_interpOTF
-
transferConstants() is a function to send small data values from host to GPU device.
-
The link between the function arguments of "transferConstants()" and globals like: __constant__ unsigned const_nzotf; is made in RLgpuImpl.cu with calls like: cutilSafeCall(cudaMemcpyToSymbol(const_nzotf, &nzotf, sizeof(int)));
-
The Richardson-Lucy (RL) deconvolution is based on the built-in MATLAB version: deconvlucy.m (see http://ecco2.jpl.nasa.gov/opendap/hyrax/matlab/images/images/deconvlucy.m)
-
The cudaDecon.exe main() function is in src/linearDecon.cpp
-
If the GPU does not have enough memory, the program will use the host PC's RAM.
-
If you are processing on the GPU that drives the display, Windows will terminate cudaDecon if an iteration takes too long. Set the Windows display driver timeout to something larger (like 10 seconds instead of the default 5 seconds): see http://stackoverflow.com/questions/17186638/modifying-registry-to-increase-gpu-timeout-windows-7. Running this command from an administrator command prompt should set the timeout to 10 seconds:
reg.exe ADD "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers" /v "TdrDelay" /t REG_DWORD /D "10" /f
-
Better yet, use a second GPU. The GPU you wish to use for computation only should use the TCC driver (it must be a Titan, Tesla, or other GPU that supports TCC). This card should be initialized after the display GPU, so put the compute card in a slot that comes after the display card's slot. To select the TCC driver, run nvidia-smi.exe -L from an administrator cmd window to list the GPUs, then nvidia-smi.exe -dm 1 -i 0 to set TCC on GPU 0. Then use set CUDA_VISIBLE_DEVICES to pick the GPU the deconvolution code should execute on.
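On Linux (or any POSIX shell), the same GPU selection can be done per command rather than for the whole session. The wrapper below is a minimal sketch; `run_on_gpu` is a hypothetical name, not something shipped with cudaDecon. Note that CUDA's device numbering may differ from the order printed by `nvidia-smi -L` unless `CUDA_DEVICE_ORDER=PCI_BUS_ID` is also set.

```shell
# Run a command with only one GPU visible to CUDA.
# Usage: run_on_gpu <gpu_index> <command> [args...]
run_on_gpu() {
    local gpu_index="$1"
    shift
    # The assignment applies only to the environment of the command being run.
    CUDA_VISIBLE_DEVICES="$gpu_index" "$@"
}
```

For example, `run_on_gpu 1 cudaDecon -Q` should report only the GPU with index 1. In a Windows cmd window the equivalent is `set CUDA_VISIBLE_DEVICES=1` before running cudaDecon.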
If you simply wish to use this package, it is best to install the precompiled binaries from conda as described above.
To build the source locally, you have two options:
With docker installed, use .scripts/run_docker_build.sh with one of the configs available in .ci_support, for instance:
CONFIG=linux_64_cuda_compiler_version10.2 .scripts/run_docker_build.sh
The second option is to create a dedicated conda environment with all of the build dependencies installed, and then use cmake directly. This method is faster and creates an immediately usable binary (i.e. it is better for iteration if you're changing the source code), but requires that you set up the build dependencies correctly.
-
install miniconda
-
install cudatoolkit (I haven't yet tried 10.2)
-
(Windows only) install the build tools for Visual Studio 2017. On Linux, all necessary build tools will be installed by conda.
-
create a new conda environment with all of the dependencies installed
conda config --add channels conda-forge
conda create -n build -y cmake boost-cpp libtiff fftw ninja
conda activate build
# you will need to reactivate the "build" environment each time you close the terminal
-
create a new build directory inside of the top-level cudaDecon folder
mkdir build  # inside the cudaDecon folder
cd build
-
(windows only) Activate your build tools:
"C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools\VC\Auxiliary\Build\vcvars64.bat"
-
Run cmake and compile with ninja on Windows or make on Linux.
# windows
cmake ../src -DCMAKE_BUILD_TYPE=Release -G "Ninja"
ninja
# linux
cmake ../src -DCMAKE_BUILD_TYPE=Release
make -j4
Note that you can specify the CUDA version to use with the -DCUDA_TOOLKIT_ROOT_DIR flag.
The binary will be written to cudaDecon\build\<platform>-<compiler>-release.
If you change the source code, you can just rerun ninja or make and the binary will be updated.
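As an illustration of the -DCUDA_TOOLKIT_ROOT_DIR flag on Linux, the sketch below wraps the configure and build steps in one function. The function name `build_with_toolkit` is hypothetical, and the toolkit path you pass in depends entirely on where CUDA is installed on your machine. It assumes you are inside the build directory with the "build" conda environment active.

```shell
# Configure and build against a specific CUDA toolkit (Linux).
# Run from inside the build directory created in the steps above.
# Usage: build_with_toolkit /path/to/cuda/toolkit
build_with_toolkit() {
    local toolkit_dir="$1"
    # Fail early with a clear message instead of a long cmake error.
    if [ ! -d "$toolkit_dir" ]; then
        echo "CUDA toolkit directory not found: $toolkit_dir" >&2
        return 1
    fi
    cmake ../src -DCMAKE_BUILD_TYPE=Release \
        -DCUDA_TOOLKIT_ROOT_DIR="$toolkit_dir" && make -j4
}
```

If you switch toolkits, it is usually safest to start from an empty build directory so that cmake does not reuse cached paths from the previous configuration.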