Object classification import labels

Hello everyone,
Is it possible to import labels for the object classification workflow?
I am doing a cross-validation for my bachelor thesis, for which I have ten segmented images. My idea was to train the classifier on 8 of them and test it on the remaining 2, in every combination. If I could import object classification labels, I would only have to label each of the 10 images once.
It would be great if anyone could help me!
Cheers Justus

Hi @Justusschl,

There is an import/export option for labels in debug mode, but I would not recommend using it.

What you are trying to do should involve as little manual labor as possible (source of errors). I thought about it a bit and found something that could work. Did you know the .ilp project file is just another h5 file? That means that you can manipulate it. So for starters you should create a project and annotate all 10 datasets.
Then you’d write a script (in Python?!) where you would:

  1. Populate a list of image file / probability file pairs (in the same order as in your project file)
  2. Use sklearn.model_selection.KFold to generate your splits (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html#sklearn.model_selection.KFold)
  3. Iterate over the splits and, for each one, delete the trained random forest and the annotations belonging to the test data
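A note on step 2: KFold with 5 splits on 10 images gives 5 disjoint test pairs, not every possible pair. If "every combination" from the question is meant literally, there are 45 ways to pick 2 test images out of 10. A minimal sketch of how those could be enumerated with the standard library (the helper name `leave_two_out_splits` is made up for this example, it is not part of ilastik or scikit-learn):

```python
from itertools import combinations


def leave_two_out_splits(n_images, n_test=2):
    """Return (train_indices, test_indices) for every choice of n_test test images."""
    all_indices = range(n_images)
    splits = []
    for test in combinations(all_indices, n_test):
        # everything that is not a test image is used for training
        train = [i for i in all_indices if i not in test]
        splits.append((train, list(test)))
    return splits


splits = leave_two_out_splits(10)
print(len(splits))  # 45
print(splits[0])    # ([2, 3, 4, 5, 6, 7, 8, 9], [0, 1])
```

The resulting index pairs could be used in place of kf.split(...), at the cost of 45 project files instead of 5.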

Some Python pseudocode (I haven’t tested it):

import os
import shutil

import h5py
import numpy
from sklearn.model_selection import KFold

# project with annotations in all images
master_project_file = "/path/to/your/project.ilp"
out_folder = "/some/out/folder/"

# image file / probability file pairs, in the same order as in the project file
file_list = [
    ("/path/to/image01.h5", "/path/to/probs01.h5"),
    ("/path/to/image02.h5", "/path/to/probs02.h5"),
    # ... (remaining pairs go here)
    ("/path/to/image10.h5", "/path/to/probs10.h5"),
]

# leave two out: with 10 images, 5 splits give 5 disjoint test pairs (8 vs. 2)
kf = KFold(n_splits=5)

# generate a new project file for every split, with the annotations of the
# test images deleted
for train_indices, test_indices in kf.split(file_list):
    # make a copy of the master project, named after the test indices
    suffix = "".join(map(str, test_indices))
    fout = os.path.join(out_folder, f"test_{suffix}.ilp")
    shutil.copy(master_project_file, fout)
    # open the copied file
    with h5py.File(fout, "r+") as f:
        # very important: remove the trained random forest
        del f["ObjectClassification"]["ClassifierForests"]
        for ind in test_indices:
            # replace the labels of this test image with an empty placeholder
            del f["ObjectClassification"]["LabelInputs"][f"{ind:04d}"]["0"]
            f["ObjectClassification"]["LabelInputs"][f"{ind:04d}"]["0"] = numpy.array(
                [0.0, 0.0], dtype="<f8"
            )
    # bonus: construct a bash/shell script that calls ilastik in headless mode with the
    # appropriate new project file and the right input files from the file list
    ...

If you make sure that the original project file was created with all data copied into it (in the dataset properties of the data selection applet), then you can open the generated files in the GUI without having to re-link the input files via a dialog.
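For the "bonus" step in the script, here is a rough sketch of how one headless call per test image could be assembled. The flag names (--headless, --project, --raw_data, --prediction_maps) and the launcher name run_ilastik.sh are how I remember the headless documentation, so treat them as assumptions and check them against your ilastik version. The function name `headless_command` is made up for this example.

```python
import shlex


def headless_command(project_file, image_file, probability_file,
                     launcher="./run_ilastik.sh"):
    """Build one shell command line for a headless run (flags are assumptions)."""
    args = [
        launcher,
        "--headless",
        f"--project={project_file}",
        "--raw_data", image_file,
        # for h5 inputs, ilastik may also need the internal dataset path
        # appended to the file name; that is not handled here
        "--prediction_maps", probability_file,
    ]
    # quote every argument so that paths with spaces survive the shell
    return " ".join(shlex.quote(arg) for arg in args)


print(headless_command("/some/out/folder/test_01.ilp",
                       "/path/to/image01.h5",
                       "/path/to/probs01.h5"))
```

Writing one such line per test image of a split into a text file would give the shell script mentioned in the comment.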

Thanks a lot!
It works well, except that when I change the feature selection in my master_project, the Python code gives me the error:
KeyError: "Couldn't delete link (callback link pointer is NULL (specified link may be '.' or not exist))"
and it points to the part ["ClassifierForests"].
I changed the feature selection back, but it still doesn’t work. Luckily I made a copy of the master_project, which still works. But why does the code stop working as soon as I change the feature selection of the master file?
To be precise, I changed the “neighborhood size in pixels” under “object feature selection”.

Cheers Justus

Awesome that it worked :slight_smile:

But it is really interesting that you ran into problems after changing the feature set in the master file… Did you switch to the object classification applet after changing the selection and before saving?

It appears there is some redundant information in the project file… Maybe also delete f["ObjectClassification"]["SelectedFeatures"].
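Before deleting more groups, it can help to look at what the saved project actually contains. A minimal sketch, assuming only that h5py groups behave like nested dictionaries (they support keys() and indexing). The helper names `list_entries` and `print_object_classification` are made up for this example, and the demo dictionary is a hypothetical stand-in, not the real project layout.

```python
def list_entries(group, prefix=""):
    """Recursively collect the paths of everything below a group (or nested dict)."""
    paths = []
    for name in group.keys():
        path = f"{prefix}/{name}"
        paths.append(path)
        child = group[name]
        if hasattr(child, "keys"):  # groups have keys(), datasets do not
            paths.extend(list_entries(child, path))
    return paths


def print_object_classification(project_file):
    """Print every entry below the ObjectClassification group of a project file."""
    import h5py  # imported here so list_entries can be tried without h5py

    with h5py.File(project_file, "r") as f:
        for path in list_entries(f["ObjectClassification"], "/ObjectClassification"):
            print(path)


# hypothetical stand-in for a project file, just to show the output format
demo = {"ClassifierForests": {}, "LabelInputs": {"0000": {"0": [0.0, 0.0]}}}
print(list_entries(demo))
```

Running print_object_classification on both the working copy and the broken master file should show directly which entries differ.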

Cheers
Dominik

I tried switching back to the object classification applet after that, but it still doesn’t work with the file.
Deleting f["ObjectClassification"]["SelectedFeatures"]["ClassifierForests"] doesn’t work either.

Ah, sorry, I think I get it now. What went wrong is that, after you changed the features, the classifier was invalidated and is therefore no longer in the project after saving. Just wrap the code (but only the one line that deletes the classifier) like this:

if "ClassifierForests" in f["ObjectClassification"]:
    del f["ObjectClassification"]["ClassifierForests"]
else:
    print("Warning, no classifier found")