Improving the pixel classifier for tumour-stroma classification: the impact of spatial features on the classifier

Hello fellow QuPath users.

I have been trying to train a pixel classifier to differentiate epithelium/tumour from stroma on H&E slides of adenomas, which are pre-malignant polyps of the colon and rectum.

After reading through the info on GitHub/ReadTheDocs and watching tutorials, and with a lot of help from similar posts and comments in this forum (thanks, btw!), I believe the classifier works well enough on at least some of my slides (or some parts of them); see the two pictures below with the classifier off and on.
To help the classifier, I have added a third annotation class, “other”, which in this case covers a cell type (goblet cells) that produces and accumulates mucin, and therefore looks very similar to lumen on the slide, at least in color.

However, on other slides, where the color of the epithelial cells and the stroma is much more similar, the classifier has a hard time separating the two (see pictures below; same slide, but a different part of it). In terms of color I can understand why the program struggles, but the spatial/geometric differences are quite distinct.

In this specific case, the classifier wrongly classifies areas of goblet cells (inside the circular epithelial/tumour patterns) as stroma, and in other cases (not shown) it assigns areas of stroma with many dark-colored cells to epithelium/tumour.

So my question is this: is there a way for me to better enable QuPath to learn shapes and geometric patterns, or to weight these more heavily than it currently does? Is there a way for me to teach it specific spatial rules?
For instance:

  1. That tumour does not occur as single cells or small isolated islands within the stroma lying between two tumour areas
  2. That stroma does not occur between tumour and goblet cells or lumen areas?

Simply adding more training areas does not seem to help the classifier.
Furthermore, I have been testing the different features in combination with the Gaussian and/or gradient feature, previewing them in the classifier drop-down menu to evaluate whether they would help. By far the greatest improvement came from adding the gradient feature to the Gaussian, and from adding a scale of 8.0.

I am aware that some slides might simply not be suited to automated image analysis, due to homogeneity in color or boundaries that are hard to define, but I still feel I should be able to improve the classifier with all the features and options already at hand.

Any help will be greatly appreciated!
Thank you.

Best regards,
Jacob Bech

Assuming you have gone through: Pixel classification — QuPath 0.2.3 documentation
The Gaussian is pretty much your “pixel intensity” feature. Including it alone gives essentially no spatial information, only the context provided by downsampled pixels (intensity as well). If you want things like local texture, you need the other features described in the link.
In cases like scratch assays (finding the area with no cells), I remove the Gaussian, as I do not want the results affected by things like brightness or shading; I want only the variation in pixel values, since edge effects from the borders of the wells dominate any broad intensity changes.

This is not deep learning; it is not detecting objects, and there are no specific spatial rules or anything like that. I have not found it to be all that great at picking out objects unless there is some sort of stain variation (intensity or gradient) that is consistent across all objects of a type. For something like that, you may want to look into deep learning algorithms; QuPath can be used to generate the training data and to import the results back in for visualization.
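
That said, something like your rule 1 can be approximated as a post-processing step once the classifier output has been converted into annotation objects: the Create objects dialog already has minimum object size / minimum hole size settings, and a few lines of Groovy in the script editor can do something similar. A minimal sketch, assuming the tumour class is simply named “Tumour” and using an arbitrary 5000 µm² cut-off (both are assumptions to adapt to your project):

```groovy
// Hypothetical clean-up sketch: delete small, isolated "Tumour" annotations
// left by the pixel classifier (an approximation of spatial rule 1 above).
// Class name and area threshold are example values - adapt them to your project.
double minAreaUm2 = 5000
def cal = getCurrentServer().getPixelCalibration()
def toRemove = getAnnotationObjects().findAll { ann ->
    ann.getPathClass() != null &&
        ann.getPathClass().toString() == "Tumour" &&
        ann.getROI().getScaledArea(cal.getPixelWidthMicrons(), cal.getPixelHeightMicrons()) < minAreaUm2
}
removeObjects(toRemove, true)
println "Removed ${toRemove.size()} small tumour annotations"
```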

Alternatively, you may be able to train two pixel classifiers and split your images into two groups based on the type of staining. At some point, though, if the staining is not consistent enough, even deep learning won’t fix broken inputs :stuck_out_tongue:

Thank you for the swift and thorough answer!
And sorry for my reply not being so swift :smiley:

Yes, I have been through the QuPath documentation for pixel classification; it has helped me a lot!
I have tried all the other features, but found that only the gradient feature helps for this specific task.
I also tried removing the Gaussian and running solely with some of the other features, in particular the gradient feature, but as color is still an important part of it, I found it didn't help the classifier.

My coding/scripting abilities are very limited, and as this is only a small part of a large project, it is not something I am looking to invest oceans of time in. Which means the deep learning aspect is perhaps something I will leave to someone else :slight_smile:

Training two (or more) classifiers is definitely something I am considering. When you say split them based on the type of staining, do you mean based on the slide-to-slide variation in staining intensity/color?
We actually also have Ki67 (DAB)-stained slides for all of our samples. I have tried building the classifier on these, but found that the results were very similar to H&E, which again makes sense, as the staining of actively dividing cells does not particularly differentiate between cells in the stroma and cells in the epithelium.
Also, I am not interested in any cell counts or features of the areas I find with the classifier; I am solely interested in the relative proportion between stroma and tumour areas.
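
For completeness: once the classifier has created class annotations, that proportion can be read out with a short script (the classifier's Measure option should give per-class areas as well). A minimal sketch, assuming the classes are named “Tumour” and “Stroma” (adjust to whatever class names you actually use):

```groovy
// Minimal sketch: relative proportion of tumour vs stroma area, computed
// from annotations created by the pixel classifier. The class names "Tumour"
// and "Stroma" are assumptions - substitute your own.
double tumourArea = 0
double stromaArea = 0
for (ann in getAnnotationObjects()) {
    def cls = ann.getPathClass()?.toString()
    double area = ann.getROI().getArea()   // pixel units cancel out in the ratio
    if (cls == "Tumour")
        tumourArea += area
    else if (cls == "Stroma")
        stromaArea += area
}
double total = tumourArea + stromaArea
if (total > 0)
    println String.format("Tumour fraction: %.1f%% / Stroma fraction: %.1f%%",
            100 * tumourArea / total, 100 * stromaArea / total)
```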

Maybe I need to accept that the pixel classifier is good for some slides and not good enough for others, and then try to assess how many “groups” of slides with varying staining I have. Then I can create a training image with sections from these different groups, or train separate classifiers for them.

Thank you again for your help!

Best regards,
Jacob