Rolling Ball Background Subtraction Challenge

@haesleinhuepf posed this challenge in response to @Wilson_Adams to reduce the rolling ball background subtraction algorithm to more fundamental morphological operations, and I thought I would break it off into its own thread.

This is in reference to the following code:

@lmurphy suggested this was similar to top-hat and mean filtering.

I looked at the original reference (SR Sternberg. Biomedical Image Processing. IEEE Computer 1983) and this seems quite tractable: “Rolling a ball is equivalent to eroding and dilating by a spherical structuring element” making it seem related to morphological opening.
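Ignoring the ball’s non-flat profile for a moment, the opening-then-subtract idea is exactly the white top-hat. A minimal Python/SciPy sketch of that flat-disk approximation (the radius of 5 and the test image are arbitrary choices of mine, not from the thread):

```python
import numpy as np
from scipy import ndimage as ndi

# Flat-disk approximation (not the exact rolling ball): opening with a disk
# estimates the background, and subtracting it is the white top-hat.
def disk(radius):
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    return yy**2 + xx**2 <= radius**2

img = np.zeros((32, 32), dtype=np.uint8)
img[8:24, 8:24] = 50      # broad feature, treated as background
img[15, 15] = 200         # small bright spot that should survive subtraction

background = ndi.grey_opening(img, footprint=disk(5))
subtracted = img - background   # keeps the small spot, flattens the rest
```

The opening removes structures smaller than the disk (the bright spot) from the background estimate, so the subtraction preserves them while leveling everything larger.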



Apparently this is a standing issue for implementation as an ImageJ Op:


I tried to implement this some time ago, but ran into a definition problem.
Rolling-ball BG subtraction indeed looks like a form of top-hat filtering, extended to grayscale structuring elements. Instead of being ‘flat’, the structuring element has intensities that vary: in the case of the rolling-ball background subtraction, the intensity goes to 0 on the border and takes the shape of a ball.

The issue is that you have to define what the maximal intensity in the middle of the element should be. For 8-bit images, IJ1 simply takes the max intensity, 255. But if we have to do this with ImgLib2, what would we do?
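For reference, SciPy’s grayscale morphology already supports such non-flat elements through its `structure` argument (additive offsets), playing the same role as MATLAB’s offsetstrel. A sketch, where the peak height is simply taken to be the geometric radius (my assumption, not IJ1’s fixed 255):

```python
import numpy as np
from scipy import ndimage as ndi

# A non-flat ("grayscale") ball element: heights are 0 on the border and
# peak in the middle. The peak height is exactly the free parameter in
# question; here it is the geometric radius. Heights are rounded down to
# keep the integer arithmetic exact.
def ball_heights(radius):
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    h = np.maximum(radius**2 - yy**2 - xx**2, 0)
    return np.floor(np.sqrt(h))   # hemispherical cap: 0 at rim, `radius` at center

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64)).astype(float)

# `structure` holds the additive offsets applied inside each neighborhood.
background = ndi.grey_opening(img, structure=ball_heights(5))
```

Since opening is anti-extensive regardless of the element’s profile, the background estimate never exceeds the original image.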


Interestingly, the original paper introduces the algorithm in terms of a 3D structuring element acting upon a sparse binary 3D volume. The sparse binary 3D volume is an interpretation of the image as a heightmap, with the intensity specifying the z-coordinate of the top of a stack of binary voxels. I wonder if a modern implementation might attempt that approach more directly. Certainly, handling morphological operations on sparse 3D volumes seems a more generally applicable task than creating specialized infrastructure as below.
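That heightmap interpretation is easy to sketch with NumPy (assuming 256 levels for 8-bit data); the sum projection along z recovers the original image:

```python
import numpy as np

# Each pixel of intensity v becomes a column of v binary voxels:
# voxel (z, y, x) is on exactly when z < img[y, x].
def binary_volume(img, levels=256):
    z = np.arange(levels).reshape(-1, 1, 1)
    return z < img[np.newaxis]

rng = np.random.default_rng(1)
img = rng.integers(0, 256, (16, 16)).astype(np.uint8)

vol = binary_volume(img)       # shape (256, 16, 16)
restored = vol.sum(axis=0)     # sum projection gives back the image
```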

Interpreted in 2D, the ball is a non-flat structure element as @tinevez pointed out. Specifically, the “height” values of the ball act as additive offsets which are added to the pixels in the neighborhood before an operation, such as max or min, is performed. At minimum this route would need support for additive offset non-flat structure elements and their use in morphological operations. MATLAB’s documentation for this can be seen here:
Since I don’t know of any applications using offsetstrel other than rolling ball, I’m honestly not sure whether it is worth building this up in ImgLib2.

To optimize the rolling ball calculation, it would be best to decompose the structuring element. The current ImageJ1 implementation computes the ball structuring element by brute force, making it very slow. The same reference already used to decompose the disk structuring element applies here (Rolf Adams. “Radial Decomposition of Discs and Spheres”. CVGIP: Graphical Models and Image Processing, vol. 55, no. 5, September 1993, pp. 325-332):

Here’s the “Radial Decomposition of Discs and Spheres” reference itself:

After exploring this, I now wonder how easily this could be implemented via MorphoLibJ or CLIJ as a 3D algorithm.

I’m having a hard time finding where 255 is used in the code as the height of the ball when generating the background. Looking at the creation of the rolling ball, it looks like the maximum intensity in the middle of the element is just the radius, or the radius squared?

The closest thing I see to your description is in rollBall, where Float.MAX_VALUE is used to initialize the minimization routine.

The other fixed uses of 255 or other constants occur when the top-hat operation is applied and the background is subtracted. Did I miss something?


I’ve created two proof of concept implementations of rolling ball background subtraction based on the 3D binary concept using 1) MATLAB and 2) ImageJ1 Macro with CLIJ.

a) Building the 3D binary volume is still pretty slow in ImageJ/CLIJ; it takes up about half of the processing time. Anyone have ideas on how to accelerate that step?
b) Why does the builtin rolling ball look so square?

  1. MATLAB:

  2. ImageJ1 Macro:

    Top row: ImageJ1 Macro with CLIJ. From left to right: Original blobs.gif, 8-bit background, background subtracted blobs.gif
    Bottom row: Process > Subtract Background. Left: background. Right: Background subtracted blobs.gif
    ImageJ1 Macro Processing time: 1417 milliseconds
    Rolling Ball Radius: 10.0 pixels


You have a valid point here!

The step “Binary Volume” takes about 700ms - according to the code this is thresholding in a loop. Just out of curiosity, why do you threshold the image in a loop with increasing thresholds? Also the subsequent 3D erosion is not so clear to me…



The first objective is to turn an 8-bit image into a binary volume. For each XY position, I need a stack of voxels with a height equivalent to the 8-bit integer value.

In MATLAB, I did this by first setting the voxel at the top of the stack to true and then using a reverse cumulative sum (cumulative or) to place all the true voxels underneath. This is analogous to building a structure of columns by pouring cement in from the top of each column until the column reaches the desired height.
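That cement-pouring trick translates directly to NumPy (a sketch; the actual MATLAB code isn’t shown in the thread): mark only the top voxel of each non-empty column, then a reverse cumulative OR fills everything underneath.

```python
import numpy as np

levels = 8
img = np.array([[0, 3],
                [5, 1]], dtype=np.uint8)   # tiny example heightmap

# mark the top voxel of each non-empty column (the top sits at z = v - 1)
top = np.zeros((levels,) + img.shape, dtype=bool)
ys, xs = np.nonzero(img)
top[img[ys, xs] - 1, ys, xs] = True

# "pour cement" downward: reverse cumulative OR along z
vol = np.flip(np.logical_or.accumulate(np.flip(top, axis=0)), axis=0)
```

Summing the result along z recovers the heightmap, confirming each column was filled to the right height.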

For the ImageJ1 Macro I instead used a loop to build up each layer at a time using thresholding. This is slow. The analogy here is building a structure up from the ground one layer of blocks at a time, like a 3D printer.

I have been thinking of another strategy to vectorize the operation.

  1. Replicate the original image 256 times, once for each z position, to create an 8-bit volume. This would be like repmat in MATLAB.
  2. Create a column with values 0 to 255. If necessary, replicate the column over xy positions unless we have broadcasting.
  3. Create the binary volume by comparing the volume to the column using greater than.
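The three steps above map directly onto NumPy (a sketch; `np.tile` plays the role of repmat, and the z column is broadcast rather than resampled):

```python
import numpy as np

rng = np.random.default_rng(2)
img = rng.integers(0, 256, (8, 8)).astype(np.uint8)

vol8 = np.tile(img, (256, 1, 1))               # 1. replicate the image 256 times
zcol = np.arange(256).reshape(-1, 1, 1)        # 2. column with values 0..255,
zvol = np.broadcast_to(zcol, vol8.shape)       #    replicated over xy positions
binary = vol8 > zvol                           # 3. compare with greater-than
```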

I’m unclear how to do that efficiently in ImageJ1 or CLIJ(x) yet.

Could you elaborate what is unclear about the erosion?

Once I have the binary volume, all we need to do is a morphological opening in 3D. To approximate a ball I used a combination of the sphere and the box. Applied repeatedly, the sphere alone looks like a diamond in cross section; the box produces a square. Together they produce an octagon.
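The octagon effect is easy to reproduce in 2D with SciPy (a sketch with radii I picked arbitrarily): a 3×3 diamond grown alone gives a diamond, a 3×3 box alone gives a square, and alternating the two grows an octagon.

```python
import numpy as np
from scipy import ndimage as ndi

diamond = ndi.generate_binary_structure(2, 1)  # 3x3 cross, diamond-like slice
box = np.ones((3, 3), dtype=bool)              # 3x3 box

seed = np.zeros((21, 21), dtype=bool)
seed[10, 10] = True

# alternate four dilations by each element: the result is the Minkowski sum
# of a radius-4 diamond and a radius-4 square, i.e. an octagon
octagon = seed
for _ in range(4):
    octagon = ndi.binary_dilation(octagon, structure=diamond)
    octagon = ndi.binary_dilation(octagon, structure=box)
```

The octagon keeps the square’s edge midpoints but cuts off the square’s corners, giving a closer disk approximation than either element alone.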

Let me know what is unclear.

Ok, I see. By thresholding with all possible gray values and stacking the resulting binary images, you get a binary stack. The sum projection of this stack is the original image, right?

Applying a 3D erosion/dilation/opening of this stack before calculating the sum-projection should be the same as a minimum/maximum/top-hat-like filter of the original image.

Am I missing something? That brings us back to @lmurphy’s point: The background is basically top-hat. I really would like to understand this - that’s why the t-shirt challenge :wink: I bet there is a way of doing this efficiently.

Thanks for your efforts elaborating on this!



Yes, that is correct. The sum projection of this stack is the original image if Gaussian smoothing is not applied first. If Gaussian smoothing is applied, then the sum projection would be the blurred image.

No, this is not correct. The sum-projection of the 3D morphological opening of this binary stack is not equivalent to a 2D morphological opening of the 8-bit intensity image. To make it equivalent, one would have to add an offset to each neighborhood before performing the morphological opening operation. The additive offset approach is both what the current code does and what MATLAB has implemented via offsetstrel:

They are similar in that both operations involve the calculation of a background estimate and subtracting that from the original.

The difference is that the z-dimension of the ball matters for the background generation in the rolling ball case and modifies how the intensity is adjusted. Refer to Figure 10 above. During the erosion step the intensity will be reduced by at least the radius of the ball. Basically, we are going to shear off from the alien landscape (the binary stacks) anything that is within the radius of the ball from the surface. Not only is the thin obelisk-like peak in Figure 10 going away because it is thinner than the ball, but the peaks of those smooth hills and plateaus are going to be reduced as well.

Let’s imagine a mesa, a table mountain like the Tafelberg, that is flat on top, has sheer sides, and whose cross-sectional area is larger than the ball/disk radius. Under 2D erosion, the height near the center of the mesa would be unaffected, though the cross-sectional area might be reduced. After the subsequent 2D dilation, the cross-sectional profile is expanded again and “smoothed”, but the topography of the center of the mesa is unaffected.

In contrast, under 3D erosion the height in the center gets reduced along with any small bumps on top of the mesa. The subsequent 3D dilation creates not only a smooth cross sectional area but also a smooth topography on the top of the mesa as well.
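The mesa example can be checked numerically (a sketch; the radius and image size are my choices): a flat 2D minimum filter leaves the mesa’s central height untouched, while a 3D erosion of the binary stack by a ball lowers it by the ball radius.

```python
import numpy as np
from scipy import ndimage as ndi

def binary_volume(img, levels=256):
    z = np.arange(levels).reshape(-1, 1, 1)
    return z < img[np.newaxis]       # voxel (z, y, x) on when z < img[y, x]

def ball(radius):
    zz, yy, xx = np.mgrid[-radius:radius + 1,
                          -radius:radius + 1,
                          -radius:radius + 1]
    return zz**2 + yy**2 + xx**2 <= radius**2

img = np.zeros((21, 21), dtype=np.uint8)
img[5:16, 5:16] = 100                # a wide, flat-topped mesa

# 2D flat erosion (minimum filter) with the ball's equatorial disk
flat = ndi.grey_erosion(img, footprint=ball(3)[3])

# 3D binary erosion of the stack; border_value=1 treats the volume as
# solid below z = 0 so the ground plane does not erode away
vol = binary_volume(img)
eroded = ndi.binary_erosion(vol, structure=ball(3), border_value=1)
h3d = eroded.sum(axis=0)
```

At the mesa center the flat filter keeps the full height of 100, while the 3D erosion lowers it to 97, i.e. by the ball radius of 3.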

Restricted to 2D operations, one could probably produce something similar through some nonlinear combination of 2D opening followed by Gaussian smoothing within each level set. Your challenge, though, did not specify that the operations had to be restricted to 2D. I’ve taken advantage of that and of the fact that the original conception of the algorithm is in 3D. Our counterparts in the 1980s were far more restricted in terms of memory and computational power, which I suspect is why they created a 2D algorithm for this.

For us today in 2019, we have significantly more memory and computational power. Rather than spending time creating a specialized facility like MATLAB’s offsetstrel, my approach here is to implement the original 3D algorithm directly using standard facilities.

In 2019 this makes me think of an algorithm a kid would have designed in Minecraft. In 1980, maybe kids would have come up with it using Legos.

This algorithm is a bit strange because it combines pixel size and bit depth into a single discrete 3D space. Intensity quantization has nothing to do with pixel size, so applying an isotropic structuring element in that 3D space is a bit weird. That’s where I suppose the paraboloid comes in.


I feel like this could be accomplished using multiplyImageAndCoordinate, but I’m not quite sure how to use this yet.

It looks like the signature is multiplyImageAndCoordinate(source, destination, dimension), from peeking at

I created a new rolling ball background subtraction now with vectorized binary volume creation. Execution time: 444 milliseconds (NB: Different hardware, Windows / NVidia)

  1. I replicated the original image 256 times using CLIJx_resample
  2. I created a stack with each plane having the value of the z-coordinate (0 to 255) using CLIJx_resample with interpolation. Some cropping was needed.
  3. I compared the two stacks using Ext.CLIJx_greaterOrEqual

While I’m having fun playing with CLIJ using this approach and this has been a great academic exercise to understand the concept, I’m somewhat convinced that this will end up being slower than the original code.

I’m considering an approach that focuses on this statement that I made earlier:

One option may be to look for the unaffected mesas under 2D opening, reduce their intensity, and smooth them. That could be accomplished by selectively averaging the unaffected areas with a blurred image. I’ll probably return to this after ASCB.

@Wilson_Adams, do you have a real world example available for testing where the Process > Subtract background algorithm code is slow?


b) Why does the builtin rolling ball look so square?

The reason lies in the RollingBall class, variable ‘arcTrimPer’: To speed up calculations, the kernel size is less than the actual ball radius.
There are also a few other problems with the original rolling ball code: Depending on the ball size, the image is downsampled and a smaller ball is used (again, to speed it up). In principle, this would be fine, but I have the impression that this is not done such that the z values remain the same. So there is a discontinuity of behavior at radius 10, 30, and 100; the higher the radius, the more it becomes top-hat-like.
In addition, for 16-bit data, in any case, the rolling ball is not much different from a simple top-hat filter, because the z-extension of the ball is typically much less than the range of image data.
For these reasons, a long time ago I added the ‘sliding paraboloid’ method, which does not have these restrictions. To keep it reasonably fast, it is only an approximation: not really a paraboloid, but sliding parabolas with four different orientations. Since the “radius” of the paraboloid is the radius of curvature at the apex, you may need very small radii such as 0.01 if you have large data values.
Just try it with a cross section plot in “live” mode while previewing; you will see that the sliding paraboloid comes much closer to what one would expect from a rolling ball. For my work, the “Ball” method is more like a legacy method, to keep compatibility with old code, but usually I use either the sliding paraboloid or a top-hat filter.

– Michael


I solved this for my needs a long time ago by applying a modified top-hat built from reconstruction filters, in the following sequence (for bright signal on dark background): original minus the closing by reconstruction of the opening by reconstruction of the closing by reconstruction.
The full sequence is:

  1. Maximum filter of original
  2. Reconstruct by erosion: marker = 1, mask = Original
  3. Minimum filter of 2
  4. Reconstruct by dilation: marker = 3, mask = 2
  5. Maximum filter of 4
  6. Reconstruct by erosion: marker = 5, mask = 4
  7. Subtract 6 from Original (Original minus 6)

The closing operations remove dark holes that exist in the original image (the first closing) or that are introduced by the opening (the second closing), which could otherwise lead to the erroneous extraction of nonexistent bright spots. They also keep the overall brightness of the background estimate close to the original.
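A Python/SciPy sketch of the seven steps (the 3×3 geodesic step and the 5×5 rank-filter size are my assumptions; the original kernel sizes aren’t given):

```python
import numpy as np
from scipy import ndimage as ndi

def reconstruct(seed, mask, method):
    """Iterative grayscale reconstruction with a 3x3 geodesic step."""
    while True:
        if method == 'dilation':
            nxt = np.minimum(ndi.grey_dilation(seed, size=3), mask)
        else:  # 'erosion'
            nxt = np.maximum(ndi.grey_erosion(seed, size=3), mask)
        if np.array_equal(nxt, seed):
            return nxt
        seed = nxt

def tophat_by_reconstruction(img, size=5):
    f = img.astype(float)
    # 1-2: closing by reconstruction (fills dark holes in the original)
    s2 = reconstruct(ndi.maximum_filter(f, size), f, 'erosion')
    # 3-4: opening by reconstruction (removes bright spots from the background)
    s4 = reconstruct(ndi.minimum_filter(s2, size), s2, 'dilation')
    # 5-6: closing by reconstruction (fills holes the opening introduced)
    s6 = reconstruct(ndi.maximum_filter(s4, size), s4, 'erosion')
    # 7: subtract the background estimate from the original
    return f - s6

img = np.full((21, 21), 10, dtype=np.uint8)
img[9:12, 9:12] = 100            # small bright spot on a flat background
out = tophat_by_reconstruction(img)
```

On this toy image the background estimate converges to the flat level of 10, so the spot is extracted at its full contrast of 90 while the background maps to 0.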


Very interesting discussion.
@Stoyan_Pavlov, I think your algorithm is somewhat related to alternating sequential filtering by reconstruction:


I hadn’t seen this, @Nicolas. Thanks for the reference! I will check it.
I use this sequence for preprocessing of wide-field epifluorescence images. Of course, as a consequence of the series of morphological operations, it is a very accurate but very slow algorithm.


The original intent of this thread is using CLIJ to accelerate these operations on the GPU.