Otsu's threshold method delivers different results in scikit-image and ImageJ

Dear friends of ImageJ and scikit-image,

I was just playing with threshold methods in both worlds and discovered that Otsu’s method apparently delivers slightly different results when called from scikit-image and from ImageJ.

image

Does anybody know why? I realised that in Python the threshold is determined as a floating-point number even though the image is saved as an 8-bit integer image. Thus I assume it’s converted while loading? Might that be the issue?

I’m new to the python/scikit-image world. Is there anything I’m doing wrong?

I put my example code online:

Any hints and suggestions are welcome! :slight_smile:

Cheers,
Robert

2 Likes

The blobs image is a bad candidate for testing thresholds because it is heavily quantised (have a look at its histogram in IJ).
However, I am willing to bet that the image above (Figure 1) is not exactly the same as blobs.tif.
At coordinates (84, 114) there is a tiny blob in the IJ version which does not exist in the Figure 1 version.
One could assume that this is because the thresholds differ, but if you increase the threshold one by one manually, by the time that small blob disappears (at t=153) there is a “hole” in another blob at (78, 41) which did not show up in Figure 1.
So I can only conclude that the two images are different, as it is difficult to imagine how that hole could appear in one image but not in the other.
Actually, I suspect that Figure 1 may have been smoothed (internally?) with a kernel of radius 2 or thereabouts.

3 Likes

I cannot reproduce what you are seeing there. For me, the tiny dot near the centre shows up with both Fiji and scikit-image. However, I do see minor differences.

One thing that strikes me is that skimage opens this as data type uint16 and has very different gray values, and therefore thresholds. Not sure why that is. @jni?
skimage version 0.14.0

Edited to add: also, this appears to be a grayscale image that is opened as an RGB image by skimage. If I provide the as_gray=True option to io.imread, it is converted to float64 (this behaviour is documented for io.imread, but odd). The float64 image has a maximum value of 0.96, so some rescaling is happening somewhere.

Edited to fix: uint8 to uint16 further up.

1 Like

Ok, it turns out this 8-bit image has a 16-bit LUT.
Try Apply LUT in Fiji and you will see something interesting.

The threshold in Fiji after that is 30583, which differs from the skimage result.

1 Like

Hey @VolkerH and @gabriel,

thanks for your thoughts! I’m not sure what you mean by LUT here, but according to what I learned about thresholding algorithms in general, it should not matter which lookup table is used to visualize the image. IMHO Otsu’s method analyses the histogram of the image, and it should not matter that there are empty bins in the histogram, as for blobs.tif.
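Just to make sure we mean the same thing, here is a minimal sketch of the method as I understand it, in plain NumPy (assuming an 8-bit single-channel image; this is not scikit-image’s actual implementation):

import numpy as np

def otsu_threshold(image):
    # 256-bin histogram of a uint8 image; empty bins get probability 0
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                # weight of the "background" class
    mu = np.cumsum(p * np.arange(256))  # cumulative class mean
    mu_t = mu[-1]                       # global mean
    # between-class variance for every possible cut point
    with np.errstate(divide='ignore', invalid='ignore'):
        sigma_b2 = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.nanargmax(sigma_b2))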

As you mentioned, there might be something wrong with the image file, so let’s try another one: a crop from the famous hela-cells image:
image

The differences after thresholding are tiny, but they are there:

And it may indeed have something to do with 8-bit versus 16-bit: what puzzles me is that, depending on how I save the image (as 16-bit or as 8-bit), I have to change my procedure for importing it into Python:

The 8-bit version of otsu_scikit-image-hela8.py is

from skimage import data

# load image
image = data.imread("C:/structure/code/otsu/hela-cells-crop8.tif")
# keep only the first channel
image = image[:, :, 0]

And the 16-bit version of otsu_scikit-image-hela16.py is

from skimage import data

# load image
image = data.imread("C:/structure/code/otsu/hela-cells-crop16.tif")

If I mix them up, it crashes. If I don’t split the channels of the 8-bit image, I get:

C:\ProgramData\Anaconda3\lib\site-packages\skimage\filters\thresholding.py:278: UserWarning: threshold_otsu is expected to work correctly only for grayscale images; image shape (56, 76, 3) looks like an RGB image
  warn(msg.format(image.shape))
.....
ValueError: 3-dimensional arrays must be of dtype unsigned byte, unsigned short, float32 or float64

Or if I try to split the 16-bit image:

File "C:/structure/code/pyproj/tests/otsu.py", line 7, in <module>
    image = image[:,:,0]
IndexError: too many indices for array

Why do I have to split channels in the 8-bit case? Might either Fiji’s TIFF saving or scikit-image’s TIFF loading be misinterpreting something? I put all scripts and images on GitHub. If anybody could help me understand this better, that would be much appreciated.

How do Python people usually open TIFF image files? It’s hard to imagine that I’m the first to hit this issue…

Thanks again!

Cheers,
Robert

1 Like

Hi Robert,

well, one would think the LUT should only be for display. But what happens here is that the LUT gets applied to the data when you read it: it seems to map
[uint8] -> [uint16, uint16, uint16], and it is applied automatically when you read the image the way you do, with skimage.data.imread. This is probably because imread thinks this is some sort of paletted indexed-colour (https://en.wikipedia.org/wiki/Indexed_color) image, similar to a GIF, rather than intensity values.

Having said that: I mostly use tifffile.imread, and for this particular image tifffile gives the behaviour you would expect, i.e. the correct shape and data type. I thought skimage.io used tifffile under the hood, so I’m not exactly sure why it shows this behaviour.

With this, you also get the same threshold value of 120 that Fiji produces.

image
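A minimal sketch (assuming the 8-bit crop from Robert’s repository is in the working directory):

import tifffile
from skimage.filters import threshold_otsu

image = tifffile.imread("hela-cells-crop8.tif")
print(image.shape, image.dtype)  # a plain 2D uint8 array, no spurious RGB axis
print(threshold_otsu(image))     # 120 for me, the value Fiji reports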

Without looking into the sources I can’t tell why skimage behaves differently, but maybe @jni can jump in here. I use @cgohlke 's tifffile :slight_smile:

2 Likes

In python, I generally use

from imageio import imread, volread

with imread for 2D images and volread for more dimensions (or indeed color channels if not simply RGB). Then it behaves more as expected.
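For example (a sketch with hypothetical filenames):

from imageio import imread, volread

image = imread("blobs.tif")    # a plain 2D image
volume = volread("stack.tif")  # data with more dimensions or channels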

In my limited tests, the ImageJ threshold is just one more than the one I get with scikit-image. For blobs, it is really ‘one bin higher’ (128 vs 120). If that holds in general, I’d expect to get the same thresholding in Python with > as I’d get in ImageJ with >=
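In other words, for integer images (a sketch with made-up threshold values):

import numpy as np

image = np.arange(256, dtype=np.uint8)
t_py = 120       # hypothetical scikit-image threshold
t_ij = t_py + 1  # the corresponding ImageJ threshold
assert np.array_equal(image > t_py, image >= t_ij)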

2 Likes

tifffile no longer applies color maps as of version 2017.9.29. Scikit-image vendors tifffile 2017.1.12.

3 Likes

But I think empty bins do matter for several other histogram-based thresholding methods, e.g. when searching for a local minimum or with the triangle method: where there is a gap in the quantised histogram, you get a local minimum (most likely not present in the original data). Histogram-equalised images also tend to get histogram gaps of varying sizes.
Caveat emptor!

2 Likes

Another issue with this image: it is a 16-bit image, and it probably matters whether you compute the histogram in 16 bits (as in the Auto Threshold plugin) or with the binned-to-8-bit histogram of the built-in thresholding applet. So, again, it is probably not the same image that is being thresholded.
See “important note 2” here: https://imagej.net/Auto_Threshold
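The two histograms are easy to compare from Python, by the way (a sketch, assuming image already holds the 16-bit pixel data):

import numpy as np

# full 16-bit histogram: one bin per grey value (Auto Threshold plugin style)
hist_full = np.bincount(image.ravel())
# the same data binned into 256 bins (built-in applet style)
hist_256, edges = np.histogram(image, bins=256)
# an Otsu cut computed on one of these can land in a different place
# than an Otsu cut computed on the other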

1 Like

Whoa, this thread grew rather quickly. :joy:

:point_up: This is the crux of the matter — thanks @cgohlke for chiming in, as I was indeed confused that we were not doing the same thing as tifffile in scikit-image.

I actually have a pull request open to use the installed tifffile if it is available in preference to our vendored copy. I hope that makes it in soon.

Responding to some of the other topics raised in this thread that relate to scikit-image:

That’s right, the 8-bit image was being colormapped for the reasons explained by @cgohlke, while the 16-bit image was not (and thus was read in as grayscale).

As for how to open TIFF files from Python, the usual options are:

  1. skimage.io.imread
  2. imageio.imread
  3. tifffile.imread

Actually all three of those should end up in the third one, but which one you use determines some of the convenience wrapping around it. The first one gets rid of all the metadata and just gives me a NumPy array, which is what I want most of the time. When I need the metadata I drop down to the lower levels.

scikit-image has no concept of LUT — what you are working with is a bare NumPy array of values. These values were (incorrectly) colormapped upon reading the image (ie the LUT was applied), giving you a three-channel image that just happened to look grayscale. (Making one wonder why there was a colormap at all to begin with!)

Note that the confusion is bidirectional. I’ve often been confused in Fiji because there was an inverted LUT inserted between me and my image. :wink:

This is because as_gray works for RGB images and weights each channel by the correct luminance coefficient. This is best done in float space for accuracy.
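Roughly like this (a sketch, assuming rgb2gray-style luminance weights):

import numpy as np

def to_gray(rgb):
    # rgb as float in [0, 1]; the weighted sum stays float64
    return rgb @ np.array([0.2125, 0.7154, 0.0721])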

Note that 0.15 was released a few days ago. =D

Finally, as regards Otsu: it’s entirely possible that there is an off-by-one bug in there. I don’t have time to investigate in such detail but the code is available for all to see here and contributions are always welcome! :smile:

Needless to say, one hopes that your results don’t depend on being on the left or right of a bin in a thresholding algorithm. But indeed checking software for consistency between packages is very important.

5 Likes

Hey all,

thanks for the hints. Using tifffile resolves my importing-issues!

In the 8-bit case the results are now identical: the threshold in Python is 1 smaller than in ImageJ and, as @petebankhead pointed out, applying it with the > operator does the job.

However, in the 16-bit case, the determined thresholds are quite different: it’s 1223 versus 1236, and thus the results again differ quite a bit:

Code: Otsu_ImageJ_hela16.ijm, otsu_scikit-image-hela16.py

Thus, I dug deeper. Both ImageJ and scikit-image use a histogram with 256 bins. However, looking at the minimum and maximum grey values of the image, 321 and 3855 respectively, you can see that the bin width is about 13.8. Thus, the difference in threshold can again be explained as @petebankhead said above: scikit-image and ImageJ almost agree; they differ by one bin of the histogram.
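Spelled out:

lo, hi, nbins = 321, 3855, 256
bin_width = (hi - lo) / nbins      # about 13.8 grey values per bin
print((1236 - 1223) / bin_width)   # about 0.94, i.e. roughly one bin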

Is this something that could be considered a bug? If yes, in which of the two libraries?

Thank you all, btw, for the nice discussion. I know it’s partly academic, but it’s also very interesting. Also thanks @jni for all the directions you gave me; I’m slowly picking up speed in Python :partying_face: Thanks for your support! :slight_smile:

Cheers,
Robert

3 Likes

Addendum: the actual threshold calculations in scikit-image and ImageJ are quite different, so it’s not obvious which one is right:

Opinions? :slight_smile:

I think both of those lines do approximately the same thing — return the lower bin edge (Fiji) or the bin center (scikit-image). I couldn’t find from your link where the Otsu algorithm is actually run in Java, since all the thresholding algorithms use that same scale method?
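In NumPy terms, the two conventions would look like this (a sketch; i is a made-up winning bin index):

import numpy as np

counts, edges = np.histogram(image, bins=256)
i = 42                                     # hypothetical winning bin index
t_fiji = edges[i]                          # lower bin edge
t_skimage = (edges[i] + edges[i + 1]) / 2  # bin center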

Yes, all thresholding algorithms determine an index in the 256-bin histogram and their result index is piped to the image processor. The Otsu method itself lives here: https://github.com/imagej/imagej1/blob/master/ij/process/AutoThresholder.java#L713

Applying the colormap for files such as hela-cells-crop8.tif is correct according to the TIFF specification. The files produced by ImageJ are mis-tagged as palette-color images (PhotometricInterpretation 3), where the component values are indices into RGB lookup tables, which in TIFF are always uint16 RGB arrays. Instead, such files should be tagged as grayscale images (PhotometricInterpretation 0 or 1). Newer versions of tifffile don’t apply color maps at all, because LSM, some OME-TIFF, STK, etc. also get this wrong.
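You can check the tagging yourself, e.g. (a sketch; exact attribute names depend on the tifffile version):

import tifffile

with tifffile.TiffFile("hela-cells-crop8.tif") as tif:
    page = tif.pages[0]
    print(page.photometric)                   # PALETTE, although the data is grayscale
    print(page.tags["ColorMap"].value.shape)  # the uint16 RGB lookup table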

The tifffile docstring lists other libraries for reading scientific TIFF files from Python, e.g.:

4 Likes

Are you sure the built-in applet computes only a 256-bin histogram? From this commit and the source code of ShortStatistics and ImageProcessor, it seems that it is actually calculated on the full 16-bit histogram, and I remember discussions here on the forum about the confusion caused by the fact that it is calculated on 16 bits but displayed with a temporary 8-bit lookup table…


To complete the picture, I looked into how #imagej-ops computes the threshold, and it produces 128 for the blobs image, where ImageJ 1.x produces 127.

Here’s a Groovy script to compute the “ImageJ 1.x” way:

#@output result

import ij.IJ
import ij.process.AutoThresholder

imp = IJ.openImage("https://github.com/haesleinhuepf/otsu/raw/master/blobs.tif")
histogram = imp.getProcessor().getStatistics().getHistogram()
at = new AutoThresholder()
result = at.getThreshold("Otsu", histogram as int[]) // 127

… and here’s a script to get the threshold computed via Ops:

#@ IOService io
#@ OpService ops
#@output result

dataset = io.open("https://github.com/haesleinhuepf/otsu/raw/master/blobs.tif")
histogram = ops.run("histogram", dataset)
threshold = ops.run("threshold.otsu", histogram)
result = threshold.getRealDouble() // 128.0

So it seems ImageJ 1.x is in disagreement with most other imaging libraries, even with ImageJ Ops.

Since the ImageJ Ops implementation of Otsu’s threshold is essentially copied from the other one in ImageJ:

… the difference might actually be in the way the histogram and bins are computed.

Best to check with Wayne, but see the AutoThresholder.java of IJ1:

I just noted that #imagej-ops seems to be using a version of the Otsu algorithm that in certain circumstances can cause an overflow. The Otsu code in the Auto Threshold plugin I maintain has, since June 2017, used Emre Celebi’s implementation, which is less prone to overflow when running on 16-bit histograms.

Thanks for the heads-up. Issue filed:

:astonished: Sorry, I presumed when you changed the behaviour it was to fix a bug, not introduce one just because everyone else is buggy! That’s a sad state of affairs.

Is it possible to use tifffile to recover the LUT in addition to the “index” image? It is trivial to apply it with NumPy, so someone might have a use for it:

In [1]: colors = np.array([[0, 0, 1], [1, 1, 1]])
In [2]: image = np.array([[0, 0], [1, 1]])
In [3]: colors[image]
Out[3]: 
array([[[0, 0, 1],
        [0, 0, 1]],

       [[1, 1, 1],
        [1, 1, 1]]])
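
For what it’s worth, something along these lines might work (an untested sketch, assuming a tifffile version that no longer applies the color map):

import tifffile

with tifffile.TiffFile("hela-cells-crop8.tif") as tif:
    indices = tif.pages[0].asarray()           # raw uint8 index image
    lut = tif.pages[0].tags["ColorMap"].value  # uint16 LUT, shape (3, 256)
rgb = lut[:, indices].transpose(1, 2, 0)       # same fancy-indexing trick as above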