Acquire order of dimensions during loading

Hello,
is it possible to find out the order of dimensions of a loaded image, or of an image in a collection?
Depending on the image file the results may be:

  • 2D grayscale (row, col)
  • 2D multichannel (e.g. RGB) (row, col, ch)
  • 3D grayscale (pln, row, col)
  • 3D multichannel (pln, row, col, ch)

Naturally, 2D multichannel and 3D grayscale images have the same number of dimensions, and I wonder whether there is a way to distinguish between these types.
The only way I can think of is to check the third dimension and, if it is larger than, let's say, 5, assume that it is a 3D grayscale image.
Is there a cleaner way?
Thank you!


For scikit-image proper, no: we don’t have access to the image metadata so there is no way to know (other than visual inspection) what the axes are. Depending on the file format, there is probably a way to access the image metadata that would tell you the identity of the axes.

The only way I can think of is to check the third dimension and, if it is larger than, let's say, 5, assume that it is a 3D grayscale image.

We used to do this, and some parts of the code still do, but it turned out to be very brittle and led to bugs, so we are leaving this to the users in the future; see this issue on the repo.

So, if you have access to the original files, your best bet is to try to figure out how to get that information from there. If it is a mishmash of files… Well then, guessing as above might be your only option…
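For reference, the shape-based guess discussed above could look roughly like the sketch below. The function name `guess_axes` and the `max_channels` threshold are hypothetical and not part of scikit-image; the example at the end shows one way the guess goes wrong.

```python
def guess_axes(shape, max_channels=4):
    """Guess axis names from an array shape alone (a brittle heuristic)."""
    ndim = len(shape)
    if ndim == 2:
        return ("row", "col")
    if ndim == 3:
        # Ambiguous case: a small last axis is *assumed* to hold channels.
        if shape[-1] <= max_channels:
            return ("row", "col", "ch")
        return ("pln", "row", "col")
    if ndim == 4:
        return ("pln", "row", "col", "ch")
    raise ValueError(f"cannot guess axes for a {ndim}-dimensional image")


print(guess_axes((512, 512, 3)))    # ('row', 'col', 'ch')
print(guess_axes((60, 512, 512)))   # ('pln', 'row', 'col')
# A 31-band multichannel image is misread as a 3D grayscale stack:
print(guess_axes((512, 512, 31)))   # ('pln', 'row', 'col')
```

Whatever threshold is chosen, some legitimate images land on the wrong side of it, which is why metadata from the file is the more reliable source.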


Thank you for the clarification. As you pointed out, this relates to the possibility of extracting and providing metadata from the original images.
I’ve noticed a topic on this and wonder whether there has been any progress in this regard. Personally, I am missing the possibility of getting all relevant metadata before loading images.


Hey @Ilya_Belevich ,

awesome, thanks for pushing this forward! I’m also very interested in how to solve such issues. I’m just linking a related issue where @jni suggested diving into the xarray library. Unfortunately, I wasn’t able to install it on Windows back in the day and thus dropped the idea. I just checked again, and it installed successfully on my machine just now. It apparently supports reading a couple of file formats and the corresponding metadata, such as some TIFF, Zarr and some HDF5 files.
Would you mind checking if it works for your use case? What file format do you work with? I’m curious! :slight_smile:

Cheers,
Robert

Hei Robert,

What file format do you work with? I’m curious!

let's put it this way: I am interested in all of them :slight_smile:
In Matlab, for standard formats I use the imfinfo function, which returns a structure containing information about an image in a graphics file. For some custom formats (such as mrc, amiramesh or nrrd) I use custom readers and thus extract the metadata myself. More generally, for bioimaging formats I use the BioFormats library (still in Matlab), which gives me all possible metadata.
I still need to explore BioFormats in Python, but I hope that it is possible (python-bioformats · PyPI), possibly with some performance limitations.

Would you mind checking if it works for your use case?

I can check, no problem with that, but it looks to me like it is limited to these few formats, so it is not very universal. I was thinking that the libraries/readers that scikit-image uses should make it possible to acquire metadata, and that it is just a question of passing it further… or am I completely wrong?


We have never tackled it, and within the year we are planning on deprecating skimage.io in favour of imageio, which has some plans for universal metadata handling here:

(and see at the bottom for linked issues)

But none of these plans have gotten off the ground. We certainly would love some help! :grimacing:

In other words: I am not aware of anything equivalent to imfinfo in Python. You might try imageio.imread(filename).meta, but that is often (always?) incomplete. Plus it requires reading the file.


Well, it actually looks quite fine, and it may be enough for many purposes.
I am reading single images or collections of images via skimage.io.ImageCollection. Is there a way to propagate that metadata as one of the variables returned by the ic = io.ImageCollection(filenames) call?

Plus it requires reading the file.

Ohh, that is a real pity.
I measured the performance of reading 16k x 9k files, and these are the results:

  • JPG:
    • Matlab imfinfo: 0.0038 sec vs imageio.imread(filename[0]).meta: 1.644 sec
    • Matlab imread: 1.2123 sec vs imageio.imread(filename[0]): 1.617 sec
  • Uncompressed TIF
    • Matlab imfinfo: 0.0035 sec vs imageio.imread(filename[0]).meta: 0.1795 sec
    • Matlab imread: 1.8190 sec vs imageio.imread(filename[0]): 0.2685 sec
  • LZW compressed TIF
    • Matlab imfinfo: 0.0035 sec vs imageio.imread(filename[0]).meta: 0.3784 sec
    • Matlab imread: 3.1322 sec vs imageio.imread(filename[0]): 0.3521 sec

In good image formats the metadata is written in the header, so there is no need to read the whole image. Matlab's imfinfo benefits from that and extracts metadata very efficiently (except for PNGs as far as I remember; I have not tested this time). Matlab's imread is surprisingly slow at reading TIFs; perhaps the function has not been updated for many years…
Anyway, adding metadata to imageio.imread does not cause a large performance drop, and it would be great to have it extracted upon loading images with skimage: ic = io.ImageCollection(filenames).

Note: for tiffs, you can use tifffile directly, which does let you read metadata without loading the full file.
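A minimal sketch of that, assuming tifffile is installed. The helper name `tiff_axes_and_shape` is hypothetical; `TiffFile`, `series`, `axes` and `shape` are tifffile's own names. How meaningful the axes string is depends on what metadata the writer stored in the file.

```python
import tifffile


def tiff_axes_and_shape(path):
    """Return (axes, shape) of the first series without decoding pixel data."""
    with tifffile.TiffFile(path) as tif:
        series = tif.series[0]
        # `axes` is a string of one-letter codes, e.g. 'YX', 'YXS' or 'ZYX'.
        return series.axes, tuple(series.shape)
```

Calling `tiff_axes_and_shape("stack.tif")` (with a file name of your own) only parses the TIFF structure, so it stays cheap even for large files. An axis that tifffile cannot identify is reported with a placeholder letter rather than a guess.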

For ImageCollection, again, it’s something that skimage will probably deprecate, so I would focus efforts on the imageio metadata reading capability. I think that is 100% in scope and probably feasible.

I guess the path towards deprecating skimage.io is still unclear; e.g., I don’t think we want to get rid of ImageCollection until we have a reasonable alternative.

I just tested imageio's metadata reading, and it looks like whether the pixel data is loaded along with the metadata depends on the type of image being read. For me, TIFF is fast, but JPEG is not.

import imageio
r = imageio.get_reader(image_fn)
print(r.get_meta_data())

Of the options I tried, Pillow was fastest at extracting the JPEG metadata:

from PIL import Image

# Image.open reads the header lazily; pixel data is not decoded here
img = Image.open(filename)
print(img.info)
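Beyond `img.info`, the opened image also exposes its size and band layout, which is what the original question about axes needs. A rough sketch follows; the helper name `pillow_header_info` and the dictionary keys are hypothetical, while `mode`, `getbands()` and `n_frames` are Pillow's own.

```python
from PIL import Image


def pillow_header_info(path):
    """Collect size, mode and band names; Image.open defers pixel decoding."""
    with Image.open(path) as img:
        return {
            "width": img.width,
            "height": img.height,
            "mode": img.mode,          # e.g. 'L', 'RGB', 'RGBA'
            "bands": img.getbands(),   # e.g. ('R', 'G', 'B')
            # Not every format plugin defines n_frames; assume 1 otherwise.
            "n_frames": getattr(img, "n_frames", 1),
        }
```

A single-band mode such as 'L' corresponds to a (row, col) array, while 'RGB' corresponds to (row, col, ch) once converted with NumPy, so the band count resolves the 2D multichannel case without guessing from the shape.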

Thanks Stéfan! :tada: Didn’t think of Pillow.

Re ImageCollection, I think it belongs in imageio, which Almar has agreed to bequeath to us. :wink: Agreed that we should port it there first before deprecating it in skimage so that such a thing is available.

Thank you for the comments!

Are there any specific problems with ImageCollection? I just started to use it, and it gives a very nice way to work with multiple files in various formats; the only thing I am missing is metadata import :slight_smile:
What would be an alternative if I need to be able to combine files in various formats?

No problem with ImageCollection as far as I know; I use it quite a bit :slight_smile:

You can override the way that ImageCollection loads images to also grab their metadata:

import os

from skimage import io
import imageio


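def read_with_meta(fn):
    # Return the first frame together with the reader's metadata;
    # the context manager closes the file once both have been read.
    with imageio.get_reader(fn) as r:
        return (r.get_data(0), r.get_meta_data())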


ic = io.ImageCollection(
    ['*.jpg', '*.png', '*.tif'],
    load_func=read_with_meta
)

image, meta = ic[0]
print(image.shape)
print(meta)

Fantastic! I will try that!

I was able to install xarray in a clean Python 3.8 environment, and at least the following command works:

import xarray as xr

remote_data = xr.open_dataset(
    "http://iridl.ldeo.columbia.edu/SOURCES/.OSU/.PRISM/.monthly/dods",
    decode_times=False)

But I was not able to add it to my current working environment on Python 3.9 due to multiple incompatibilities. Possibly, if I add it to the environment YAML file, I may get it installed.
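For the dimension-order question itself, the relevant xarray feature is named dimensions. The sketch below builds a labelled array by hand rather than reading a file; the dimension names `pln`, `row` and `col` are taken from the first post, and the array contents are placeholders.

```python
import numpy as np
import xarray as xr

# A 3D grayscale stack whose axes are labelled explicitly.
stack = xr.DataArray(
    np.zeros((4, 32, 48), dtype=np.uint8),
    dims=("pln", "row", "col"),
)

print(stack.dims)                # ('pln', 'row', 'col')
print("ch" in stack.dims)        # False, so this is not a multichannel image
print(stack.isel(pln=0).dims)    # ('row', 'col')
```

Once the labels are attached, 2D multichannel and 3D grayscale images can be told apart by name instead of by shape. Where the labels come from still depends on the file format and reader.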