Thanks for all the links and information in this thread. Based partly on it, I’ve written a proof-of-concept Fiji Jython script that calls 3D FFT-based deconvolution directly on a CLBuffer.
(BTW @haesleinhuepf, any further thoughts on wrapping clFFT or re-implementing a complete CLIJ FFT library? I personally don’t mind writing low-level code in C, because then I can easily wrap it as Matlab MEX interfaces and C-Python, in addition to Java.)
Anyway, implementing clFFT-based Richardson Lucy (RL) required a few parts, some of which are briefly (and very roughly) outlined below.
I needed kernels and a storage format for complex math. In this case I used interleaved float arrays because that is the format returned by clFFT. Example Complex Kernels are here. Similar kernels are available elsewhere.
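To make the interleaved layout concrete, here is a rough CPU-side sketch in plain Java of the kind of element-wise complex math those kernels perform. It is not the actual OpenCL kernel code; the class and method names are made up for illustration. Values are stored as [re0, im0, re1, im1, ...].

```java
// Sketch only: complex multiplication on interleaved float arrays,
// laid out as [re0, im0, re1, im1, ...]. All names are hypothetical.
final class InterleavedComplex {

    // Element-wise complex product: out[k] = a[k] * b[k].
    static float[] multiply(float[] a, float[] b) {
        checkLengths(a, b);
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i += 2) {
            float ar = a[i], ai = a[i + 1];
            float br = b[i], bi = b[i + 1];
            out[i] = ar * br - ai * bi;     // real part
            out[i + 1] = ar * bi + ai * br; // imaginary part
        }
        return out;
    }

    // Element-wise product with the conjugate: out[k] = a[k] * conj(b[k]).
    // This is the form used for correlation in the frequency domain.
    static float[] multiplyConjugate(float[] a, float[] b) {
        checkLengths(a, b);
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i += 2) {
            float ar = a[i], ai = a[i + 1];
            float br = b[i], bi = b[i + 1];
            out[i] = ar * br + ai * bi;     // real part
            out[i + 1] = ai * br - ar * bi; // imaginary part
        }
        return out;
    }

    private static void checkLengths(float[] a, float[] b) {
        if (a.length != b.length || a.length % 2 != 0) {
            throw new IllegalArgumentException("expected equal, even lengths");
        }
    }

    public static void main(String[] args) {
        float[] p = multiply(new float[] {1f, 2f}, new float[] {3f, 4f});
        System.out.println(p[0] + " + " + p[1] + "i");
    }
}
```

On the GPU the same arithmetic runs once per work item instead of in a loop, but the indexing into the interleaved buffer is the same.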
Get and install clFFT
Get clFFT binaries then unpack to /opt/clFFT/ on Linux.
Windows instructions still “TODO”.
Real to Complex 2D FFT using clFFT
It may be feasible to wrap all of clFFT with Java, and perhaps this is the direction CLIJ will go in.
However, in many cases algorithm developers will want to work on native C code anyway (to make it easier to use and wrap in other frameworks). Also, clFFT may be too obtuse for routine use. Thus I worked out a hack to get the long pointer to a CLBuffer so I could pass it to C code. @haesleinhuepf currently has a pull request that exposes the long pointer, so in the future we can avoid the hack.
An example 2D Real to Complex FFT function which works on long pointers to existing GPU memory, context and queue
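One detail worth spelling out when allocating the output buffer for a real-to-complex transform: the result is Hermitian-symmetric, so clFFT only stores about half of it. Along the fastest-varying dimension there are N/2 + 1 complex values instead of N. The helper below is a small sketch of that size calculation for the interleaved layout; the class and method names are hypothetical and not part of clFFT or CLIJ.

```java
// Sketch only: output size of a 2D real-to-complex FFT stored as
// interleaved Hermitian data. All names are hypothetical.
final class HermitianSize {

    // Number of complex values for a width x height real input, where
    // width is the fastest-varying dimension.
    static long complexCount(long width, long height) {
        return (width / 2 + 1) * height;
    }

    // Number of floats needed for the interleaved (re, im) output buffer.
    static long floatCount(long width, long height) {
        return 2 * complexCount(width, height);
    }

    public static void main(String[] args) {
        System.out.println(floatCount(256, 256));
    }
}
```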
3D Richardson Lucy Deconvolution using clFFT
A clFFT-based Richardson Lucy implementation that works on long pointers to existing GPU memory, context and queue
Native build scripts here and here.
The JavaCPP wrapper is here.
Native Build and Wrapper Build are invoked from the POM.
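For anyone who wants to see the update that the Richardson Lucy code iterates, here is a tiny 1D CPU sketch in plain Java. It uses direct circular convolution instead of FFTs, so it is only useful for checking the math on small arrays, and it is not the native implementation linked above. It assumes the PSF is already the same length as the image, normalized to sum to 1, and shifted so its center is at index 0. All names are hypothetical.

```java
// Sketch only: 1D Richardson Lucy with direct circular convolution.
// Assumes psf.length == image.length, psf sums to 1, psf centre at index 0.
final class RichardsonLucy1D {

    // Circular convolution: out[i] = sum_j a[j] * k[(i - j) mod n].
    static float[] convolve(float[] a, float[] k) {
        int n = a.length;
        float[] out = new float[n];
        for (int i = 0; i < n; i++) {
            float sum = 0f;
            for (int j = 0; j < n; j++) {
                sum += a[j] * k[Math.floorMod(i - j, n)];
            }
            out[i] = sum;
        }
        return out;
    }

    // Circular correlation: out[i] = sum_j a[j] * k[(j - i) mod n].
    static float[] correlate(float[] a, float[] k) {
        int n = a.length;
        float[] out = new float[n];
        for (int i = 0; i < n; i++) {
            float sum = 0f;
            for (int j = 0; j < n; j++) {
                sum += a[j] * k[Math.floorMod(j - i, n)];
            }
            out[i] = sum;
        }
        return out;
    }

    // estimate <- estimate * correlate(image / convolve(estimate, psf), psf)
    static float[] deconvolve(float[] image, float[] psf, int iterations) {
        int n = image.length;
        float[] estimate = image.clone(); // start from the observed image
        for (int it = 0; it < iterations; it++) {
            float[] reblurred = convolve(estimate, psf);
            float[] ratio = new float[n];
            for (int i = 0; i < n; i++) {
                // Guard against division by zero where the reblurred value vanishes.
                ratio[i] = reblurred[i] > 1e-12f ? image[i] / reblurred[i] : 0f;
            }
            float[] correction = correlate(ratio, psf);
            for (int i = 0; i < n; i++) {
                estimate[i] *= correction[i];
            }
        }
        return estimate;
    }

    public static void main(String[] args) {
        float[] psf = {0.5f, 0.25f, 0f, 0f, 0f, 0f, 0f, 0.25f};
        float[] truth = new float[8];
        truth[4] = 1f;
        float[] blurred = convolve(truth, psf);
        float[] estimate = deconvolve(blurred, psf, 20);
        System.out.println(java.util.Arrays.toString(estimate));
    }
}
```

In the FFT-based version, each convolve and correlate call becomes a forward FFT, a complex multiply (or conjugate multiply) with the transformed PSF, and an inverse FFT, which is where the complex kernels come in.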
How to call from Java
An example showing how to call the 2D Real to Complex FFT
Note: clFFT does not support all sizes, so the image needs to be extended to a smooth size first.
img = (RandomAccessibleInterval<FloatType>) ops.run(
DefaultPadInputFFT.class, img, img, false);
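In case "smooth number" is unfamiliar: it means a size whose only prime factors are small. The sketch below finds the next such size for a single dimension. The set of allowed factors (2, 3, 5, 7) is my assumption here, so check the clFFT documentation for the radices your version supports. This is only an illustration of the idea, not what DefaultPadInputFFT does internally, and the names are hypothetical.

```java
// Sketch only: find the next "smooth" size for one dimension.
// The allowed prime factors are an assumption; adjust to your clFFT version.
final class SmoothSize {

    private static final int[] FACTORS = {2, 3, 5, 7};

    // True if n has no prime factors other than those in FACTORS.
    static boolean isSmooth(long n) {
        if (n < 1) {
            return false;
        }
        for (int p : FACTORS) {
            while (n % p == 0) {
                n /= p;
            }
        }
        return n == 1;
    }

    // Smallest smooth size that is greater than or equal to n.
    static long nextSmooth(long n) {
        if (n < 1) {
            throw new IllegalArgumentException("size must be positive");
        }
        long m = n;
        while (!isSmooth(m)) {
            m++;
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(nextSmooth(97));
    }
}
```

Each dimension of the image would be extended independently to its next smooth size.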
An example showing how to call 3D deconvolution directly on GPU memory (CLBuffer)
Note: the PSF is passed in as a Java RAI, because additional conditioning needs to be done first: the PSF needs to be extended to the image size and shifted so its center is at 0,0.
// extend and shift the PSF
RandomAccessibleInterval<FloatType> extendedPSF = Views.zeroMin(ops.filter().padShiftFFTKernel(psf, new FinalDimensions(gpuImg.getDimensions())));
// transfer PSF to the GPU
ClearCLBuffer gpuPSF = clij.push(extendedPSF);
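To show what "extended and shifted" means in practice, here is a rough plain-Java sketch on a flat row-major 2D array: the PSF is zero-padded to the image size and circularly shifted so that its center pixel lands at index (0,0), with the rest wrapping around the edges. This only illustrates the idea and is not how padShiftFFTKernel is implemented; the names are hypothetical.

```java
// Sketch only: zero-pad a small 2D PSF to the image size and circularly
// shift it so the PSF centre lands at (0,0). All names are hypothetical.
final class PsfShift {

    // psf is pw x ph, row-major; the result is w x h, row-major.
    // Requires pw <= w and ph <= h.
    static float[] padAndShift(float[] psf, int pw, int ph, int w, int h) {
        if (pw > w || ph > h || psf.length != pw * ph) {
            throw new IllegalArgumentException("PSF must fit inside the image");
        }
        float[] out = new float[w * h]; // zero-initialised padding
        int cx = pw / 2;
        int cy = ph / 2;
        for (int y = 0; y < ph; y++) {
            for (int x = 0; x < pw; x++) {
                // Offsets left of / above the centre wrap to the far edges.
                int tx = Math.floorMod(x - cx, w);
                int ty = Math.floorMod(y - cy, h);
                out[ty * w + tx] = psf[y * pw + x];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[] psf = {1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f, 9f};
        float[] shifted = padAndShift(psf, 3, 3, 4, 4);
        System.out.println(java.util.Arrays.toString(shifted));
    }
}
```

With the center at the origin, convolving with the padded PSF does not translate the image, which is why the shift is needed before the FFT.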