Learning ImgLib2 development

To learn ImgLib2 plugin development I started on a plugin that converts an IntegerType image into a dense-ranked version of itself. It can be found here: https://github.com/rharkes/Dense-Rank-plugin/blob/master/src/main/java/nl/nki/imagej/Dense_Rank_plugin.java
This is only my second attempt at working with ImgLib2. I like it a lot, but I am not confident that I found the best way of building a plugin. Would someone have a quick look at my code and see if there are things I could have done better? Any advice would be appreciated! For example, would it have been better to write an ImageJ Op instead of what I did here?
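
For readers unfamiliar with the term: a dense rank replaces every pixel value by its position among the distinct values that actually occur, so the ranks run from 0 upwards without gaps. Below is a minimal, library-free sketch on a plain `short[]` holding unsigned 16-bit values. It is not the code of the linked plugin, and the class and method names are made up for illustration.

```java
import java.util.Arrays;

public class DenseRankSketch {

    /** Returns the 0-based dense rank of every unsigned 16-bit value in data. */
    public static int[] denseRank(short[] data) {
        // Mark which of the 65536 possible values occur at least once.
        boolean[] present = new boolean[1 << 16];
        for (short s : data) {
            present[s & 0xFFFF] = true; // mask to read the short as unsigned
        }
        // Hand out consecutive ranks to the occurring values in ascending order.
        int[] rankOf = new int[1 << 16];
        int next = 0;
        for (int v = 0; v < present.length; v++) {
            if (present[v]) {
                rankOf[v] = next++;
            }
        }
        // Translate every pixel into its rank.
        int[] ranked = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            ranked[i] = rankOf[data[i] & 0xFFFF];
        }
        return ranked;
    }

    public static void main(String[] args) {
        short[] pixels = {100, 7, 100, 3000, 7};
        System.out.println(Arrays.toString(denseRank(pixels))); // prints [1, 0, 1, 2, 0]
    }
}
```

The benefit for later steps is that any histogram over the ranked data only needs as many bins as there are distinct values.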

As a next step I would like to use this ranked image to calculate and subtract a temporal sliding median, preferably running in parallel over the pixels. With ImageJ1 I made a thread array sized by Runtime.getRuntime().availableProcessors() and let the threads draw the index of the next pixel to work on from an AtomicInteger. Can I still do that with ImgLib2, or is this method obsolete now and does ImgLib2 have smarter ways of doing parallel computing?
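
To make the question concrete, here is a minimal sketch of that ImageJ1-era scheme in plain Java. It assumes one shared counter for all threads, and doubling each value is only a hypothetical stand-in for the real per-pixel work; the class name is made up.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterSketch {

    /** Doubles every element; each thread claims the next index from a shared counter. */
    public static void processAll(final int[] pixels) {
        final AtomicInteger next = new AtomicInteger(0);
        final Thread[] threads = new Thread[Runtime.getRuntime().availableProcessors()];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                // getAndIncrement hands out every index exactly once across all threads.
                for (int i = next.getAndIncrement(); i < pixels.length; i = next.getAndIncrement()) {
                    pixels[i] = 2 * pixels[i]; // hypothetical per-pixel work
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
    }

    public static void main(String[] args) {
        int[] pixels = {1, 2, 3, 4, 5};
        processAll(pixels);
        System.out.println(Arrays.toString(pixels)); // prints [2, 4, 6, 8, 10]
    }
}
```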

I use Eclipse as IDE.

So I tried, but it takes 2.5x as long as my old plugin. The code is referenced below, but this is what the old version does:

  1. Copy everything frame by frame from the ImageStack to a Java short array in the order {t-x-y}
    a) Uses ImageStack.getImageArray()
    b) Checks which pixel values exist, for use in the next step
  2. Dense rank the array
  3. Subtract the temporal median background in parallel
  4. Copy everything back frame by frame from the short array to the ImageStack
    a) Uses ImageStack.setPixels

The new version:

  1. Dense rank (as in the post above)
  2. Subtract temporal median
    a) Same parallel system as in the old version
    b) Two RandomAccess pointers per pixel for the start and end of the window that move in time.
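
For reference, the sliding median over one pixel's time series can be sketched without ImgLib2 as below. This is a simplified illustration, not the code from either branch: it assumes the values are already dense ranks in the range 0 to nRanks-1 and that the window size is odd, it returns one median per window position instead of subtracting anything, and the two indices `start` and `end` play the role of the two RandomAccess pointers. All names are hypothetical.

```java
import java.util.Arrays;

public class SlidingMedianSketch {

    /** Median of every window of odd size 'window' over ranks in [0, nRanks). */
    public static int[] slidingMedian(int[] series, int window, int nRanks) {
        if (window < 1 || window % 2 == 0 || window > series.length) {
            throw new IllegalArgumentException("window must be odd and fit in the series");
        }
        final int half = window / 2;
        final int[] hist = new int[nRanks];
        for (int i = 0; i < window; i++) {
            hist[series[i]]++;
        }
        // Invariant: 'below' counts window elements smaller than 'median',
        // with below <= half < below + hist[median].
        int median = 0;
        int below = 0;
        while (below + hist[median] <= half) {
            below += hist[median++];
        }
        final int[] result = new int[series.length - window + 1];
        result[0] = median;
        int start = 0;    // oldest frame in the window
        int end = window; // first frame after the window
        while (end < series.length) {
            final int leaving = series[start++];
            final int entering = series[end++];
            hist[leaving]--;
            if (leaving < median) below--;
            hist[entering]++;
            if (entering < median) below++;
            // Restore the invariant by walking the median down or up.
            while (below > half) {
                median--;
                below -= hist[median];
            }
            while (below + hist[median] <= half) {
                below += hist[median++];
            }
            result[start] = median;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ranks = {0, 2, 1, 3, 1, 0, 4};
        System.out.println(Arrays.toString(slidingMedian(ranks, 3, 5))); // prints [1, 2, 1, 1, 1]
    }
}
```

Each step touches only the leaving and the entering frame, so the cost per frame depends on how far the median moves rather than on the window size.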

By using ImgLib2 everything looks much more elegant. But what could be going on that makes it slower?

Old: https://github.com/rharkes/Temporal-Median-Background-Subtraction/tree/master/src/main/java
New: https://github.com/rharkes/Temporal-Median-Background-Subtraction/tree/imglib2_develop/src/main/java/

Hey @rharkes,

welcome to the ImageJ2 world. Your example code looks pretty professional :slight_smile:

The last link you posted is not functional. It’s this file, right?
https://github.com/rharkes/Temporal-Median-Background-Subtraction/blob/imglib2_develop/src/main/java/TemporalMedian.java

So, some general thoughts:

  • You are using an iterator to go through pixels. Using a cursor is more popular and makes the code a bit slimmer.
  • I would recommend creating Java packages. For example, a package named something like nl.nkiavl.rharkes might make sense. If you don’t create packages, you might run into issues later when building more complex software or when shipping the plugins to colleagues. Furthermore, it is recommended to make the group ID in the pom.xml identical to the first part of the package name. However, package names cannot contain hyphens, so I would also recommend renaming the group ID.
  • I didn’t know about the statusService. Thus, I learned something. Thanks for this! :slight_smile:
  • Just out of curiosity: Why do you have your own pow() method? Is it different from Math.pow()?
  • Regarding the parallelisation, some people are working on that at the moment: There is some ImageJ2/SciJava magic on the way. You find example code here. Maybe @maarzt can tell us if this is already part of the Fiji distribution?

I hope that helps!

Cheers,
Robert

Hi @haesleinhuepf,
Thank you for the compliment and the helpful advice.

  • I updated the pom.xml and put my code in packages.
  • The custom pow() method was because I was hesitant to go to an external library for something so simple, but I will use Math.pow() in the future.
  • Cursor indeed makes the loop very slim. Love it!

The new version of the plugin has the advantage of not duplicating the entire dataset in memory. Unfortunately it takes 2.5x as long. I think most of that time is spent moving the RandomAccess around.

I would try to forward this question to @tpietzsch - maybe he can tell us what makes that code slower than the ImageJ1 counterpart. I bet there is a way of speeding things up. :wink:

Cheers,
Robert

I hope @tpietzsch will have some time to look at it, because indeed, I bet there is a way to get it faster. I duplicated the image data to a Java array both the ImageJ1 way and with a cursor. The ImageJ1 way was about 3x faster. But I expect that duplicating all the data is not something one should do in general. Working on the data directly should be the fastest method.

@rharkes In your imglib copy example, you treat Z (dim[2]) as the fastest-moving dimension:


That’s probably a mistake because the ImageJ1 version doesn’t do that?

Anyway, in this case, extracting the position out of the Cursor to compute an index is what is slow. Instead, you should just iterate in flat order and write to the output array consecutively. Something like this would be more idiomatic:

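		// Requires net.imglib2.util.Intervals, net.imglib2.view.Views and
		// net.imglib2.type.numeric.integer.UnsignedShortType.
		final int pixels = (int) Intervals.numElements(img);
		final short[] data = new short[pixels];
		statusService.showStatus("loading data");
		int p = 0;
		// Flat iteration order visits x fastest, then y, then z, so the
		// output index simply counts up and no position lookup is needed.
		for (final UnsignedShortType s : Views.flatIterable(img)) {
			data[p++] = s.getShort();
			if ((p % 2000) == 0) {
				statusService.showProgress(p, pixels);
			}
		}
		statusService.showStatus(1, 1, "FINISHED");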

(not tested.)

Some ways to improve the imglib2 version:
Do not create a new RandomAccess for every pixel.


Do not do complicated index computations and position-setting for every pixel:

Instead, try to rewrite subtractMedian such that you pass in a RandomAccess (instead of i) that is already correctly positioned in x, y and only move it around in z. (as a first approximation to making it nicer, then maybe go on and try to highlight the 1D nature of the operation, e.g., abstract the dimension along which the 1D operation goes.)

Instead of randA1.move(1, 2) you can use randA1.fwd(2), which might be slightly faster.


I would split the output range between threads beforehand instead of competing for AtomicInteger every pixel. (But the same goes for the IJ1 version)
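
A minimal sketch of what splitting the range beforehand could look like, in plain Java without ImgLib2. Doubling each value is again only a hypothetical stand-in for the real per-pixel work, and the class name is made up.

```java
public class ChunkedRangeSketch {

    /** Doubles every element; each thread gets one fixed, contiguous index range. */
    public static void processAll(final int[] pixels) {
        final int nThreads = Runtime.getRuntime().availableProcessors();
        final Thread[] threads = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            // Ranges are decided up front, so the threads never compete for a counter.
            // The long arithmetic avoids int overflow for large arrays.
            final int from = (int) ((long) pixels.length * t / nThreads);
            final int to = (int) ((long) pixels.length * (t + 1) / nThreads);
            threads[t] = new Thread(() -> {
                for (int i = from; i < to; i++) {
                    pixels[i] = 2 * pixels[i]; // hypothetical per-pixel work
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
    }

    public static void main(String[] args) {
        int[] pixels = new int[1000];
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] = i;
        }
        processAll(pixels);
        System.out.println(pixels[999]); // prints 1998
    }
}
```

The trade-off is that fixed ranges do not balance the load if some pixels take much longer than others, which is the case a shared counter handles automatically.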

I would also not create new arrays tempres, hist for every pixel but try to reuse them.
This is the same as with creating RandomAccesses above.
Maybe easiest to make the subtractMedian into a functor object with reusable RA and arrays as members. If you want to go all-out imglibby, I would make this object implement Positionable so that you can just position it at x,y coordinates and then run() to do the z line from there, etc. (but this is not related to performance, just “nice”…)
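
As a rough, library-free illustration of that idea: the object below owns its scratch array, is positioned at an (x, y) coordinate, and then processes the time line from there in run(). It works on a flat array in x-y-t order instead of a RandomAccess, and it subtracts the median of the whole time line as a simplified stand-in for the sliding window. All names are hypothetical, and this is not ImgLib2's Positionable interface.

```java
import java.util.Arrays;

public class LineMedianSubtractor implements Runnable {
    private final int[] data;    // flat stack, x fastest, then y, then t
    private final int width;
    private final int height;
    private final int frames;
    private final int[] scratch; // allocated once, reused for every pixel
    private int x;
    private int y;

    public LineMedianSubtractor(int[] data, int width, int height, int frames) {
        this.data = data;
        this.width = width;
        this.height = height;
        this.frames = frames;
        this.scratch = new int[frames];
    }

    /** Selects the pixel whose time line the next run() will process. */
    public void setPosition(int x, int y) {
        this.x = x;
        this.y = y;
    }

    /** Subtracts the median over time from the selected pixel, in place. */
    @Override
    public void run() {
        final int stride = width * height; // distance between consecutive frames
        final int first = x + width * y;
        for (int t = 0; t < frames; t++) {
            scratch[t] = data[first + t * stride];
        }
        Arrays.sort(scratch);
        final int median = scratch[frames / 2]; // upper middle if frames is even
        for (int t = 0; t < frames; t++) {
            data[first + t * stride] -= median;
        }
    }

    public static void main(String[] args) {
        // Two pixels in x, one in y, three frames.
        int[] stack = {5, 10, 1, 20, 3, 30};
        LineMedianSubtractor worker = new LineMedianSubtractor(stack, 2, 1, 3);
        for (int px = 0; px < 2; px++) {
            worker.setPosition(px, 0);
            worker.run();
        }
        System.out.println(Arrays.toString(stack)); // prints [2, -10, -2, 0, 0, 10]
    }
}
```

Because the scratch array is mutable state, every thread would need its own instance.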

Thank you @tpietzsch! This makes the copy example even a little (10%) faster than the old method. I will try to rewrite subtractMedian to accept a RandomAccess that contains only the z component of the pixel. Should I use Views.offsetInterval to create the RandomAccess?

I first tried to use the new copy method in the old ImageJ1 plugin, but I missed the fact that the array created by the new method is ordered xyt instead of txy as in the old method. This slows down the algorithm (change from this to this) and makes the total runtime 50% longer.
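
To spell out the difference between the two orderings, here is a small plain-Java sketch that reorders a flat xyt array (x fastest) into txy (t fastest). The class and method names are made up. With t fastest, all frames of one pixel are adjacent in memory, which is probably why the temporal median runs faster on that layout.

```java
import java.util.Arrays;

public class LayoutSketch {

    /** Reorders a stack from x-y-t order (x fastest) to t-x-y order (t fastest). */
    public static short[] xytToTxy(short[] xyt, int width, int height, int frames) {
        final short[] txy = new short[xyt.length];
        int src = 0; // walks the source consecutively: x + width * (y + height * t)
        for (int t = 0; t < frames; t++) {
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    txy[t + frames * (x + width * y)] = xyt[src++];
                }
            }
        }
        return txy;
    }

    public static void main(String[] args) {
        // Two pixels in x, one in y, three frames: frame 0 is {5, 10}, frame 1 is {1, 20}, ...
        short[] xyt = {5, 10, 1, 20, 3, 30};
        System.out.println(Arrays.toString(xytToTxy(xyt, 2, 1, 3))); // prints [5, 1, 3, 10, 20, 30]
    }
}
```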

@tpietzsch: Could you maybe have a look at the new version? I wanted to go all-out imglibby of course. :blush: So I think I implemented Positionable [1], and now have only one RandomAccess. However, without any parallel computing it is still very slow. Something must be changed here for sure. You suggested splitting up the task between the threads of the processor?

@rharkes looks good! That is more or less what I had in mind.

Did you try to parallelize it the same way as before (like this https://github.com/rharkes/Temporal-Median-Background-Subtraction/blob/e1ea1184770798aeed3c667405f8c51399b961c1/src/main/java/nl/nkiavl/rharkes/TemporalMedian.java#L88-L106)?

Give every new Thread() its own subtractMedian (I would rename this class to SubtractMedian, it’s very uncommon to start a class name in lower case) and reuse it within the thread. How does this compare to before?

@tpietzsch Thank you. I parallelized it. However, on my test-file (256x256x28746, uint16) with a window of 501 frames and an offset of 1000 it takes 2m40. The previous version takes 1m00.

@rharkes Hmm, I don’t see any big saving opportunities immediately. I would expect some overhead, but more than factor 2 seems a bit too much. But I’m not sure… Could you provide an easy way to run the old vs new version from an IDE? Then I could play around a bit myself…

I tried, but found it difficult because I need both scijava 16.1 and 26, but must choose only one in my pom.xml

Could you give me instructions on how to set up a benchmark then? (Which branches of which project(s) and what to run?)

Hi @tpietzsch. I made two releases with corresponding source code and .jar files: a 2.2 version in the old environment and a 3.2 version using ImgLib2. I run them using a big file (256x256x26746 uint16), but you can generate any dataset that is about 30k frames in time with random numbers between 0 and, for example, 3000.
I run the median subtraction using a window of 501 frames and an offset of 1000.
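
If it helps, a dataset like that could be generated along these lines. This is only a sketch with made-up names: it fills a flat array with uniformly distributed values and leaves out wrapping the array into an image stack. Note that 256x256x30000 shorts take about 3.9 GB, so the main method below uses far fewer frames.

```java
import java.util.Random;

public class BenchmarkDataSketch {

    /** Flat stack of width*height*frames values drawn uniformly from 0..maxValue. */
    public static short[] randomStack(int width, int height, int frames, int maxValue, long seed) {
        final Random rng = new Random(seed); // fixed seed makes the benchmark repeatable
        // multiplyExact throws instead of silently overflowing the array size.
        final int size = Math.multiplyExact(Math.multiplyExact(width, height), frames);
        final short[] data = new short[size];
        for (int i = 0; i < size; i++) {
            data[i] = (short) rng.nextInt(maxValue + 1);
        }
        return data;
    }

    public static void main(String[] args) {
        short[] stack = randomStack(256, 256, 100, 3000, 42L);
        System.out.println(stack.length); // prints 6553600
    }
}
```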