Hi ilastik developers,
I am working with the object classifier in a workflow that first does a pixel classification and then uses its output as input to an object classification. These are two separate ilastik project files.
Ilastik Object classifications feature table
I am writing here in the hope of getting some clarification on object classification, and specifically on a PCA analysis of the object features. We have a trained classifier that works quite well, and we were interested in which features are actually important for the classification. We therefore turned to the feature export to run a PCA on those features, and that is what my questions revolve around. I am still newish to PCA, so forgive me if some of the questions are stupid or don’t make any sense.
The built-in PCA features
The feature export includes a series of PCA components (called Principal component of the object_x with x = [0,3]). However, when plotting those I get a very weird relationship: PCA_0 and PCA_1 lie on a circle, and PCA_1 and PCA_2 are (almost) equal. See the images below, where the color represents the different classes.
Trying my own PCA analysis using sklearn
So I tried running my own PCA analysis using the Python sklearn package, but it didn’t bring any revelation with it. See for instance the initial output below. I think one of the things I am struggling with is the intensity histogram and the neighborhood intensity histogram. Together they take up 128 of my roughly 180 features, which I guess makes it easy for them to dominate the PCA.
This is the result of the sklearn PCA with all features normalized (excluding position and labels). The first component accounts for 24% of the variance, which is not a lot compared with what you see in textbooks and online tutorials. But maybe that is as high as it gets in a real-life application?!
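For reference, here is a minimal sketch of the kind of analysis I mean. It uses synthetic random numbers as a stand-in for the exported table (not my real data), and the assumption that the 128 histogram columns come first is purely hypothetical. Besides the explained variance, it sums the squared loadings of the histogram columns to check how much of each component they account for.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the exported table (NOT my real data):
# 300 objects, 128 histogram-like columns followed by 52 other columns.
n_objects, n_hist, n_other = 300, 128, 52
features = rng.normal(size=(n_objects, n_hist + n_other))
hist_columns = np.arange(n_hist)  # hypothetical column positions

# Standardize every feature to zero mean and unit variance, then fit the PCA.
scaled = StandardScaler().fit_transform(features)
pca = PCA(n_components=4).fit(scaled)

# Fraction of the total variance captured by each component.
print(pca.explained_variance_ratio_)

# Each row of components_ is a unit vector, so its squared entries sum to 1
# and can be read as the share each feature contributes to that component.
hist_share = (pca.components_[:, hist_columns] ** 2).sum(axis=1)
print(hist_share)
```

If the histogram share is close to 1 for the leading components, those components mostly describe the histogram bins rather than the other features.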
This brings me to my questions:
- I am not 100% sure about the algorithm behind the object classifier. Is it possible to have a good classifier and still see no pattern in a PCA?
- I experimented with doing the PCA both with and without normalization (using sklearn.preprocessing.StandardScaler). I got a better PCA without normalization, but the resulting components had a somewhat weird range. Are the values in the feature export already normalized?
- Is it correctly understood that the intensity histogram enters the classification as 64 individual features? Or are they combined into a smaller feature space before being combined with the others?
- Or do I simply have too high expectations of a clear-cut separation in a 2D PCA?
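To make the normalization question concrete, here is a small synthetic sketch (again not my real data) comparing the first component with and without StandardScaler. The three large-range columns are an assumption I made up for illustration. If something similar is going on in my export, the "better" unnormalized PCA may just reflect a few features with a large numeric range.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in (NOT my real data): 180 independent features,
# three of which have a much larger numeric range than the rest.
features = rng.normal(size=(300, 180))
features[:, :3] *= 1000.0


def first_component_ratio(X, standardize):
    """Return the variance fraction explained by the first principal component."""
    if standardize:
        X = StandardScaler().fit_transform(X)
    return PCA(n_components=1).fit(X).explained_variance_ratio_[0]


ratio_raw = first_component_ratio(features, standardize=False)
ratio_scaled = first_component_ratio(features, standardize=True)
print(f"raw: {ratio_raw:.2f}, standardized: {ratio_scaled:.2f}")
```

In this toy case the unnormalized PCA explains far more variance in its first component, even though the features are independent by construction.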
Any help would be much appreciated.
Best
Jes