QuPath m9: classification differs between building and applying

Hi all,

I noticed something in QuPath m9.
I trained a classifier on my data (I created a combined training image with various annotations from my images, set the stain vectors, detected the cells, and then used the counting tool to label cells from the different classes until I was happy with the "train object classifier" result). I then saved the classifier and applied it.
Then I copied the .json classifier into another project containing the same images. I applied the same preprocessing steps (using the same script) and applied the classifier to the whole image. But the classifications differ between my training project and my "whole image" project: most of the cell classifications are the same, but a few are different.
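To quantify the difference, per-class counts can be printed in each project with a quick sketch like this (standard QuPath scripting calls):

// Count detections per classification in the current image
def counts = getDetectionObjects().countBy { it.getPathClass() }
counts.each { pathClass, n -> println "${pathClass}: ${n}" }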

Did I miss copying something else into my "whole image" project?

Thanks

Nico

Hi @VirtualSlide,

Did you calculate additional features (shape features/intensity features/…) during the training? If so, did you make sure that you re-calculated them for the same image in the new project?
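A quick way to compare is to list the measurement names on a detection in each project (a small sketch using standard scripting calls):

// Print the measurement names available on the first detection found
def detection = getDetectionObjects().find()
if (detection != null)
    println detection.getMeasurementList().getMeasurementNames()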

This was allll wrong :slight_smile:

Not that time :slight_smile: I have already made that mistake before :slight_smile:

Where could that smoothing occur?

Calculate features -> Add smoothed features
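For scripting, the recorded equivalent should look roughly like this (I'm not certain the JSON keys are identical in every version):

// Script equivalent of Calculate features -> Add smoothed features
selectAnnotations();
runPlugin('qupath.lib.plugins.objects.SmoothFeaturesPlugin', '{"fwhmMicrons": 25.0,  "smoothWithinClasses": false}');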

Never mind! I just tested it, and the smoothing does not cross between images. It must be something else.

If you give us your script for setting up your measurements, that might help.

@VirtualSlide the cell detection isn't guaranteed to give exactly the same results for the same cells if they are detected as part of a different region. The reason is that the background subtraction can differ. Often this isn't noticeable, but in extreme cases you can really see the boundaries of individual tiles where the detection was applied: https://github.com/qupath/qupath/issues/80

My guess is that these small variations could produce the effect you describe. It’s a limitation of the existing cell detection method (at least when background estimation is part of it), and can’t be easily fixed - but improved cell detection methods can be added in the future.
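If it helps to pin down exactly which cells change, a small sketch like this prints each detection's centroid and classification, so the output from the two projects can be compared externally:

// Print centroid coordinates and classification for every detection
getDetectionObjects().each { d ->
    def roi = d.getROI()
    println String.format('%.1f\t%.1f\t%s', roi.getCentroidX(), roi.getCentroidY(), d.getPathClass())
}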

I think Pete gave the answer (unfortunately).

Here is my code:

// Set the image type and the custom stain vectors
setImageType('BRIGHTFIELD_H_DAB');
setColorDeconvolutionStains('{"Name" : "CD8-FoxP3-Icos", "Stain 1" : "Hematoxylin", "Values 1" : "0.65198 0.58679 0.48021 ", "Stain 2" : "DAB", "Values 2" : "0.37555 0.76338 0.52556 ", "Background" : " 221 222 222 "}');
// Detect tissue and select the resulting annotation
runPlugin('qupath.imagej.detect.tissue.SimpleTissueDetection2', '{"threshold": 215,  "requestedPixelSizeMicrons": 10.0,  "minAreaMicrons": 10000.0,  "maxHoleAreaMicrons": 10000.0,  "darkBackground": false,  "smoothImage": false,  "medianCleanup": true,  "dilateBoundaries": false,  "smoothCoordinates": true,  "excludeOnBoundary": false,  "singleAnnotation": true}');
selectAnnotations();
// Detect cells within the selected annotation (note the backgroundRadiusMicrons setting)
runPlugin('qupath.imagej.detect.cells.WatershedCellDetection', '{"detectionImageBrightfield": "Optical density sum",  "requestedPixelSizeMicrons": 0.25,  "backgroundRadiusMicrons": 8.0,  "medianRadiusMicrons": 1.5,  "sigmaMicrons": 1.5,  "minAreaMicrons": 10.0,  "maxAreaMicrons": 400.0,  "threshold": 0.1,  "maxBackground": 2.0,  "watershedPostProcess": true,  "excludeDAB": false,  "cellExpansionMicrons": 0.0,  "includeNuclei": true,  "smoothBoundaries": true,  "makeMeasurements": true}');

I have a more philosophical question: since I train a classifier on the data, is it still necessary to use stain vector estimation? Couldn't the RF extract the relevant information directly from the raw data, without color deconvolution?

Nico

If your classifier is trained on measurements like Nucleus: DAB OD mean, those values are based on the stain vectors.
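As a quick check, the stain vectors currently set for the image (which were used when the OD measurements were made) can be printed like this:

// Show the stain vectors currently associated with the image
println getCurrentImageData().getColorDeconvolutionStains()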

I am not sure what RF is.

RF stands for Random Forest.

Side note: in general, when creating my training images, I take a larger region than I need and only create objects or training annotations a "safe" distance away from the edge of the extracted image. "Safe" would be determined by things like the size of your background radius, or other measurements like that.

In the smoothing example, I would need to make sure any cell I picked for training had the full range of the smoothing circle around it. Anything on the edge of the artificial training annotation should not be used, as the cells it would pull information from would be missing.
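A rough sketch of how such a margin could be applied by script; the 25 µm value is only an illustration, and this assumes a QuPath version with the Geometry-based ROI tools:

import qupath.lib.objects.PathObjects
import qupath.lib.roi.GeometryTools

// Shrink the selected annotation so training objects keep a "safe" margin
// from the edge of the extracted training image (assumes an annotation is selected)
double marginMicrons = 25.0
double marginPx = marginMicrons / getCurrentServer().getPixelCalibration().getAveragedPixelSizeMicrons()
def roi = getSelectedObject().getROI()
def shrunkGeometry = roi.getGeometry().buffer(-marginPx)
addObject(PathObjects.createAnnotationObject(GeometryTools.geometryToROI(shrunkGeometry, roi.getImagePlane())))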


Ah, I was thinking RT, as in random trees, but right. Still, the object classifier isn't accessing any pixel information directly. Change "All measurements" to "Selected measurements" and see what you are working with.

This is a good idea in general, but it won't overcome the issue: the background estimation uses opening by reconstruction, which basically allows pixel values to have influence beyond the defined radius. The payoff is that the background estimate is often much better than would be achieved with a simpler approach (e.g. morphological opening). Because large regions will be tiled, the exact location of the tiles will matter (as shown for an extreme example in the link in my last post), although for small regions/TMA cores it won't.

The only way to really get away from the impact of this is to set the background radius to 0, i.e. to not subtract background at all during detection. But then you lose its benefits…
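For completeness, that would just mean changing one value in the detection call from Nico's script, with everything else left as before:

// Same detection as above, but with background subtraction disabled (backgroundRadiusMicrons: 0)
selectAnnotations();
runPlugin('qupath.imagej.detect.cells.WatershedCellDetection', '{"detectionImageBrightfield": "Optical density sum",  "requestedPixelSizeMicrons": 0.25,  "backgroundRadiusMicrons": 0.0,  "medianRadiusMicrons": 1.5,  "sigmaMicrons": 1.5,  "minAreaMicrons": 10.0,  "maxAreaMicrons": 400.0,  "threshold": 0.1,  "maxBackground": 2.0,  "watershedPostProcess": true,  "excludeDAB": false,  "cellExpansionMicrons": 0.0,  "includeNuclei": true,  "smoothBoundaries": true,  "makeMeasurements": true}');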