Ilastik best file format for pixel classification

Hi folks,
I have large TIFF files from stitched SEM images that I want to evaluate coherently. After multiple segmentation attempts in FIJI, I came across Ilastik here on the forum, and I think it can solve my segmentation problems more precisely. However, my individual TIFF files are about 25 megabytes each. I followed the Ilastik manuals, created an .h5 file for each image, and opened them in a new Ilastik project. While I’m training with my .h5 files (6-10 files, each about 45 MB), the program runs slowly or partially crashes. I am a beginner in image editing and wonder whether compressed TIFF files could distort the results in the end?

Hi @hofmanpa,

I assume your data is 2D? What is roughly the size in pixels along each dimension?

You also ask whether compressed TIFFs distort the analysis. That depends on the compression method, since TIFFs can be compressed in different ways (something like gzip, LZW, JPEG…). If the compression is lossless (all data can be recovered exactly), then there is no problem at all. Lossy compression, on the other hand, should in general never be used when you’re planning to analyse the data.
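To make the lossless/lossy distinction concrete, here is a minimal sketch using only Python's standard library. The pixel values are made up for illustration (not real SEM data), and the `quantize` helper is a hypothetical toy stand-in for a lossy step; it is not how JPEG actually works.

```python
import zlib

# Stand-in for raw 8-bit pixel data (made-up values, not real SEM data).
pixels = bytes((x * 7 + y * 13) % 256 for y in range(64) for x in range(64))

# DEFLATE (the algorithm behind gzip) is lossless: decompressing
# gives back exactly the bytes that went in.
compressed = zlib.compress(pixels, 9)
restored = zlib.decompress(compressed)


def quantize(data: bytes, step: int = 16) -> bytes:
    """Toy lossy step: round every value down to a multiple of `step`."""
    return bytes((v // step) * step for v in data)


lossy = quantize(pixels)

print(restored == pixels)  # True: every pixel value is recovered
print(lossy == pixels)     # False: pixel values have been altered
```

The second comparison is the reason lossy formats are a problem for analysis: the classifier would be trained on values that were never measured.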

The HDF5 file that you (I assume) generated with the ilastik import/export plugin for Fiji is saved completely uncompressed by default (compression level = 0). If you raise the level, the compression algorithm we use is gzip, so it’s lossless.

Cheers
Dominik

@k-dominik yes, it’s a 2D data set. Every image is 6868 x 6868 pixels. As you mentioned, I used the import/export plugin in Fiji to generate the HDF5 files. I was just wondering about the large size of 45 MB for each image.

@k-dominik would you recommend preprocessing images with different background noise in Fiji before training the pixel classifier in Ilastik?
Some examples:

thanks for your help!

Hi @hofmanpa,

thank you for providing some data. I had a look.

I think in principle ilastik should learn away the noise. Having said that, if you have some physically justifiable method to get rid of it, it wouldn’t hurt…

The size of 47 MB is exactly what you’d expect for an image of that size if saved uncompressed. The compressed size depends on the data. This type of noisy data does not compress well (one of the examples you sent me as TIFF is also around 45 MB). But if you crank up the compression to, let’s say, 9 in the export, you’ll get a smaller file size again, very similar to the one in TIFF (again, no data loss here, just a different way to encode the data on disk).
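The numbers above can be checked with a bit of arithmetic. This is a rough sketch that assumes 8-bit grayscale data (one byte per pixel), which is what the 47 MB figure suggests. The second half uses synthetic data to show why noise compresses poorly; uniform random bytes are the extreme case, and real SEM noise will sit somewhere in between.

```python
import random
import zlib

width = height = 6868   # pixels, as stated in the thread
bytes_per_pixel = 1     # assumption: 8-bit grayscale

raw_bytes = width * height * bytes_per_pixel
print(raw_bytes)                    # 47169424
print(round(raw_bytes / 1e6, 1))    # 47.2 (MB, decimal)
print(round(raw_bytes / 2**20, 1))  # 45.0 (MiB, binary)

# Synthetic 256 x 256 test images (not real data).
rng = random.Random(0)
n = 256 * 256
noisy = bytes(rng.randrange(256) for _ in range(n))  # pure noise
smooth = bytes(i // 256 for i in range(n))           # each row is one constant value


def ratio(data: bytes, level: int) -> float:
    """Compressed size divided by original size at the given zlib level."""
    return len(zlib.compress(data, level)) / len(data)


print(ratio(noisy, 9))   # close to 1: noise barely compresses
print(ratio(smooth, 9))  # far below 1: regular data compresses well
```

The 47 MB versus 45 MB difference in this thread is most likely just decimal megabytes versus binary mebibytes for the same file.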

Now, about your performance issues: are the objects you are after rather large or small? Do you find yourself using a lot of big brush strokes?
ilastik works best with rather small (even size 1) brush strokes. Adding training data in places where the prediction is already doing well does not really help, but it slows down the training significantly.

If the structures you’re after are rather large, then you could also consider downscaling your data.
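For illustration, downscaling by block averaging can be sketched in plain Python as below. The function name `downscale` and the tiny example image are hypothetical; in practice you would do this in Fiji or with an image library, and the right factor depends on how large your structures are.

```python
def downscale(image, factor=2):
    """Average non-overlapping factor x factor blocks of a 2D list of numbers.

    Rows and columns that do not fill a complete block are dropped.
    """
    h = len(image) // factor * factor
    w = len(image[0]) // factor * factor
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [
                image[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


image = [
    [0, 2, 10, 10],
    [2, 4, 10, 10],
    [8, 8, 1, 3],
    [8, 8, 5, 7],
]
small = downscale(image, 2)
print(small)  # [[2.0, 10.0], [8.0, 4.0]]
```

A factor of 2 along each axis leaves a quarter of the pixels, so feature computation and prediction have correspondingly less work to do.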

Also, when training, try not to zoom out too much while in live update. ilastik will only request the part of the image that you are looking at at any given moment, so updates are faster when zoomed in.

I’ve also added some performance-related information in another thread: Which upgrade for my laptop would be most beneficial for working with "Ilastik"? (post #2 by k-dominik). Maybe this is of help.

Cheers

Thanks again for the quick help.

@k-dominik I have now read through all of Ilastik’s instructions again, but the last part of the Pixel Classification workflow, ‘Prediction Export’ and ‘Batch Processing’, is still not clear to me.
The problem is the following: I train the pixel classifier with several images in ‘Training’ (compressed HDF5 files, as you recommended). Afterwards I set the export settings in ‘Prediction Export’ (Source: Simple Segmentation; default Export Image Settings).
In ‘Batch Processing’ I can select the images to be segmented and then process them. However, not all images are processed, only the last selected one. I don’t know if I should set something else in the ‘Prediction Export’ settings…
Might there be an online seminar on Ilastik in the future?

Greetings
Paul

Hi @hofmanpa,

  • The Export Applet has two purposes: configure the export (what source, file type, all those settings), and export the data you have trained on.
  • The Batch Processing Applet is there to easily apply the trained classifier to unseen data (data you have not used in training). The idea is that you train on some representative data and can then apply the classifier to the rest of the, supposedly bigger, dataset. (Why not add everything as input data in the beginning? First of all, ilastik unfortunately gets slow if you add more than 10 to 20 images for training. Also, adding a lot of similar training data does not improve the classification; it just makes the training slower.)

If you want to use the Batch Processing applet for unseen data, it is necessary to set the file name in the export settings correctly (e.g. like the default, with magic values in curly braces). This ensures that a new file is created for every input.
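To see why the curly-brace values matter, here is a small sketch of the idea in Python. It only mimics placeholder expansion with `str.format` and is not ilastik's actual code. The placeholder names (`dataset_dir`, `nickname`, `result_type`) follow the default pattern as I recall it from the export dialog, so treat them as assumptions and check what your own dialog shows. The input paths and the `export_path` helper are hypothetical.

```python
from pathlib import PurePosixPath


def export_path(template: str, input_file: str, result_type: str) -> str:
    """Fill a file name template with values derived from one input file."""
    p = PurePosixPath(input_file)
    return template.format(
        dataset_dir=str(p.parent),  # folder containing the input file
        nickname=p.stem,            # input file name without extension
        result_type=result_type,    # e.g. the selected export source
    )


inputs = ["/data/sem/tile_01.h5", "/data/sem/tile_02.h5"]  # hypothetical paths

with_placeholders = "{dataset_dir}/{nickname}_{result_type}.h5"
fixed_name = "/data/sem/segmentation.h5"

good_outputs = [export_path(with_placeholders, f, "Simple Segmentation") for f in inputs]
fixed_outputs = [export_path(fixed_name, f, "Simple Segmentation") for f in inputs]

print(good_outputs)   # one distinct output path per input
print(fixed_outputs)  # the same path twice
```

With a fixed name every input maps to the same output path, so later results would replace earlier ones. That would match the symptom of only the last selected image seeming to be processed, although I have not verified that this is what happened in this case.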

Cheers
Dominik

Automatic feature suggestion

After I have trained the classifier with some representative images in the ‘Training’ applet, Ilastik offers the option to ‘Suggest Features’ for the currently selected image. Is the suggested feature set also applied to all the other images that I selected in ‘Input Data’ and annotated in the ‘Training’ applet?
Thanks for all the tips,
Paul

Hi @hofmanpa,

The automatic feature suggestion selects a subset of the features that you had originally selected (I suppose you chose all, which is usually a sensible choice) in order to speed up processing a bit. This should only be done once you already have a lot of annotations. Feature selection uses the annotations you have provided (on all images) to come up with a feature set that performs similarly on that training data. You’ll also get an estimate of the speedup of the feature computation with the reduced set.
The set of features you select there then becomes the selected feature set for the whole project (you can verify this by going back to the Feature Selection applet, where you will see that only a few features are selected).

Cheers
