Roadmap towards CLIJ 2.5 (Poll)

Hi clij enthusiasts,

Following a tradition, I’d like to start a discussion about the feature list of the next clij release. To that end, I would like to know which commands the community uses, so I made a poll (see below).

Background: I’m retracting an earlier announcement: clEsperanto will not have its first stable release in 2021. We just postponed it to 2022. We want to release a really great library with a well-designed API. We can make a better clEsperanto by summer 2022, so let’s not rush. :relieved:

This gives us the opportunity to release a fully backwards-compatible successor of clij2 in summer 2021: clij2.5 :slight_smile: I’m reaching out to hear your opinion on which experimental clijx commands the community would like to see in a stable clij2.5 release.

Here comes a list of commands which IMHO could relatively easily be deployed as part of clij2.5, because they have been used in a couple of projects and proved useful:

Furthermore, there are some commands which might be interesting as well but weren’t tested in much detail yet. To make them part of the stable clij 2.5 release, support is welcome:

General feedback is welcome as well. For example, we’re maintaining a wish list as GitHub issues here. Add things there or reply in this post if you have a good idea for extending clij. :slight_smile:

Thanks for your opinion!



At least a bicubic interpolation scheme for geometric transformations would be wonderful. (I don’t see it in the impressive lists.)


Great idea! I just added it to the wish list. However, while linear interpolation is built into OpenCL, bicubic is not. Thus, the effort to make this work while keeping high performance might be substantial.
Thanks for the suggestion! :slightly_smiling_face:
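
To illustrate where the extra effort comes from, here is a minimal CPU sketch in plain Python, not clij or OpenCL code. It assumes a Catmull-Rom kernel as one possible cubic scheme and clamp-to-edge border handling; the function names are made up for this example. Linear interpolation reads 2 samples per axis, while the cubic variant reads 4, so a 2D bicubic lookup touches 4 x 4 = 16 pixels instead of 4.

```python
import math


def _at(samples, i):
    # Clamp-to-edge border handling (an assumption made for this sketch).
    return samples[min(max(i, 0), len(samples) - 1)]


def linear(samples, x):
    """Linear interpolation: 2 neighbouring samples per output value."""
    i = math.floor(x)
    t = x - i
    return (1 - t) * _at(samples, i) + t * _at(samples, i + 1)


def cubic(samples, x):
    """Catmull-Rom cubic interpolation: 4 neighbouring samples per output value."""
    i = math.floor(x)
    t = x - i
    p0, p1, p2, p3 = (_at(samples, i + k) for k in (-1, 0, 1, 2))
    return p1 + 0.5 * t * (
        (p2 - p0)
        + t * ((2 * p0 - 5 * p1 + 4 * p2 - p3) + t * (3 * (p1 - p2) + p3 - p0))
    )


def bicubic(image, x, y):
    """Separable bicubic lookup: cubic along x on 4 rows, then cubic along y."""
    j = math.floor(y)
    rows = [cubic(_at(image, j + k), x) for k in (-1, 0, 1, 2)]
    # The 4 row results sit at positions 0..3; the target lies between 1 and 2.
    return cubic(rows, 1 + (y - j))
```

On a GPU, the 16 reads and the weight computation would have to be written by hand in the kernel, whereas the linear case can be delegated to the image sampler.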

Fulfilling this basic request may require substantial investment, but isn’t this what science is about? Just using what’s there is …

TransformJ provides various flavours of cubic interpolation and it even provides a quintic scheme.

Yep, but dev time is limited, so something being worked on might be several other somethings not being worked on!


A while ago we had a short discussion about a temporal median filter, regarding an implementation I did on the GPU using clij and some custom OpenCL code at the Hohlbein lab.

I think that a proper GPU implementation will still be orders of magnitude faster, and I was wondering whether one of the features mentioned in the poll relates to an expanded medianZProjection.

If there is some interest (I would be willing to help with an implementation), I can post a detailed workflow in an issue on GitHub (based on my deprecated implementation).

Since I also work with rather large datasets, I was wondering if integration with GPUDirect Storage is already present/planned?

Hey @DrNjitram ,

amazing initiative, thanks for reaching out!

I quickly went through your code and only found a median-z-projection. Is there a reason why you are not using the median-filter? Maybe I’m misunderstanding your project though. You do background-subtraction using a median-filter, right?

Regarding potential speedups: We recently did a project with two students (thanks Felix and Hannes!) about benchmarking some operations in CLIc, CLIJ’s C++ sibling, made by @StRigaud. It turned out that mean-x-projections are faster than mean-z-projections, at least above a given stack size. That’s likely related to how pixels are stored in memory.

The question would now be if a Median-X-projection is also faster. We also would need to investigate how much time an axis transposition would take in the context of realistic image sizes. A similar question arises for median-filters in X and Z.
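
As a small illustration of the memory-layout point, here is a plain Python sketch. It assumes a flat buffer in which x varies fastest, then y, then z; the function names are made up for this example and are not clij API. It only shows the index arithmetic: stepping along x moves 1 element, stepping along z moves width * height elements. It does not by itself predict GPU timings, which also depend on how neighbouring work-items access memory.

```python
def flat_index(x, y, z, width, height):
    """Position of voxel (x, y, z) in a flat buffer, assuming x varies fastest."""
    return x + width * (y + height * z)


def mean_x_projection(buffer, width, height, depth):
    """Average along x; consecutive reads are 1 element apart."""
    return [
        sum(buffer[flat_index(x, y, z, width, height)] for x in range(width)) / width
        for z in range(depth)
        for y in range(height)
    ]


def mean_z_projection(buffer, width, height, depth):
    """Average along z; consecutive reads are width * height elements apart."""
    return [
        sum(buffer[flat_index(x, y, z, width, height)] for z in range(depth)) / depth
        for y in range(height)
        for x in range(width)
    ]


# Tiny example stack: 4 x 3 x 2 voxels filled with their own flat index.
width, height, depth = 4, 3, 2
buffer = list(range(width * height * depth))
stride_x = flat_index(1, 0, 0, width, height) - flat_index(0, 0, 0, width, height)
stride_z = flat_index(0, 0, 1, width, height) - flat_index(0, 0, 0, width, height)
```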

If you could spend some time on this, I owe you a drink! It should be pretty straightforward to take the Java implementation of the MedianZProjection and its corresponding OpenCL part, put them into a CLIJ2-plugin-template and turn it into a Median-X-Projection. Taking a look at the Mean-X-Projection and its OpenCL counterpart might help as well.

From this median-x-projection it would then again be pretty straightforward to program a sliding window that moves along x and computes the median of sub-stacks. I would recommend doing this on the GPU, using OpenCL, to avoid repetitive data transfer.
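
A CPU reference for such a sliding-window median could look like the sketch below. It is plain Python rather than OpenCL, the name `sliding_median` and the `radius` parameter are made up for this example, and the window is simply truncated at both ends of the stack, which is only one of several possible border policies. Something like this might be handy for checking a GPU version on small test data.

```python
from statistics import median


def sliding_median(frames, radius):
    """Per-pixel median over a window of up to 2 * radius + 1 frames.

    `frames` is a list of equally sized flat pixel lists, ordered along the
    axis the window slides over. Near both ends the window is truncated.
    """
    n = len(frames)
    result = []
    for t in range(n):
        lo, hi = max(0, t - radius), min(n, t + radius + 1)
        window = frames[lo:hi]
        # zip(*window) groups the values of the same pixel across the window.
        result.append([median(values) for values in zip(*window)])
    return result
```

Subtracting the result from the input frames would then give a median-based background subtraction.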

Regarding GPUDirect Storage: I’m afraid this special memory access is restricted to NVIDIA’s CUDA technology and recent GPUs. However, I think if you can wrangle your data before pushing it to the GPU, transfer time can also be optimized without exploiting vendor-specific stuff. How large are your images? Do they fit in computer RAM?

We can also have a virtual meeting and discuss details.

Let me know if I can help you!



Hey Robert,

I don’t recall the exact reason why I did certain things; it has been a few months and I have been busy with some other projects.

I’m not sure why I am not using the median filter; it might simply have been overlooked. The crux was, IIRC, mostly loading data into and out of the GPU in batches. This also answers the other question regarding data size: datasets could reach >20 GB, more than the RAM of some computers (~16 GB).
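
To make the batching idea concrete, here is a minimal sketch in plain Python. It is not the original implementation; `batch_ranges`, `batch_size` and `radius` are hypothetical names for this example. It assumes a temporal window of `radius` frames on each side, so each batch is loaded with that many extra frames before and after, and only the results for the inner frames are kept.

```python
def batch_ranges(n_frames, batch_size, radius):
    """Yield (load_start, load_stop, keep_start, keep_stop) for each batch.

    Frames load_start..load_stop-1 are transferred; only the results for
    keep_start..keep_stop-1 are kept, so every kept frame sees the same
    neighbours it would see if the whole stack were processed at once.
    """
    if batch_size < 1 or radius < 0:
        raise ValueError("batch_size must be >= 1 and radius must be >= 0")
    for keep_start in range(0, n_frames, batch_size):
        keep_stop = min(keep_start + batch_size, n_frames)
        load_start = max(0, keep_start - radius)
        load_stop = min(n_frames, keep_stop + radius)
        yield load_start, load_stop, keep_start, keep_stop
```

The overlap costs some repeated transfer (2 * radius frames per batch), which is the price for never holding the full dataset in memory.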

It was a rather complex application so I would love to have a quick discussion about the subject!

Hi @haesleinhuepf,

I didn’t know about this “Wish List”!
I opened a few new “issues”!