QuPath: pixel classification and CPU usage

Hi @petebankhead and hi all,
I’m trying to do pixel classification on H&E staining (whole slide) for GI cancer patients (~11 different slides), and I’m struggling with 2 problems.

  1. There is a big variance in the quality of the staining, from too much purple (hematoxylin) to too much pink (eosin). It is hospital staining, so I can’t fix the staining itself.
  • I tried 2 things to improve it, but there is still a big variance. I changed the color by (a) double-clicking the ‘should be eosin’ area and defining it as ‘eosin’ (under the Image tab), and (b) Estimate stain vectors > Auto.
    Do you have an idea of how to improve it more?
  2. I’m trying to do pixel classification, and each time I add another annotation the computer gets stuck and QuPath uses all the CPU; it finishes the process and then gets stuck again with the next annotation I add (a similar case to this issue: QuPath ver.5 using up 100% CPU when doing simple annotations on slides).
    Briefly, since it’s a ‘whole slide’ case with different areas (from tumor and stroma to normal structures…) and a big variance, I take 10–13 regions (500 µm, which are ~1/9 500 px), which ends with a training image of ~140 regions, 1.4 M in size, with ~900 annotations.
    In the training parameters I have ANN; Resolution: Moderate; 2 channels checked, 3 scales, 5 selected features. BUT I don’t think this is related since, as I said before, it gets stuck on the next annotation without the training being activated.

I’m working with versions 0.2.2 and 0.2.3.
The computer specs are: 2× Intel Xeon E5-2690 CPU @ 2.6 GHz, 256 GB of memory, and an NVIDIA Quadro P5000 graphics card with 16 GB.

Any suggestions for what I should do?

Thank you

P.S. QuPath is amazing! This program has changed and helped the way we work a lot. Many thanks!


Hi @Oshrat, glad you like QuPath :smiley:

I’m afraid I’ve no particular advice on the staining beyond setting stain vectors and creating a training image. If this isn’t enough, I’d need to see examples of images to better understand the problem – but it’s quite possible it is something beyond what the pixel classifier can currently handle.
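For intuition, the stain vectors are what color deconvolution uses to unmix RGB pixels into separate hematoxylin and eosin signals, which is why a poor estimate hurts everything downstream. Here is a minimal sketch of the idea in NumPy, using the textbook Ruifrok & Johnston default H&E vectors; this is an illustration of the math, not QuPath’s actual code:

```python
import numpy as np

# Default H&E stain vectors (Ruifrok & Johnston values); QuPath's
# 'Estimate stain vectors' replaces these with values measured from the image.
hematoxylin = np.array([0.65, 0.70, 0.29])
eosin = np.array([0.07, 0.99, 0.11])
residual = np.cross(hematoxylin, eosin)  # third, orthogonal component

stains = np.stack([hematoxylin, eosin, residual])
stains /= np.linalg.norm(stains, axis=1, keepdims=True)  # unit rows

def deconvolve(rgb, stain_matrix, background=255.0):
    """Convert RGB pixels (one per row) to per-stain optical densities."""
    od = -np.log10(np.maximum(rgb, 1.0) / background)  # Beer-Lambert law
    return od @ np.linalg.inv(stain_matrix)

# A strongly purple (hematoxylin-rich) pixel separates mostly into channel 0:
pixel = np.array([[80.0, 60.0, 130.0]])
print(deconvolve(pixel, stains))  # hematoxylin component dominates
```

If the stain vectors don’t match the dyes actually on the slide, the unmixed channels bleed into each other, which is exactly the problem that inconsistent staining causes.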

Note that if your training image is large/complex, saving it as an OME-TIFF (File → Export images) might help performance a lot.

Other things that would impact performance include whether ‘Live prediction’ is turned on or off.

You can also toggle the classification overlay (the ‘C’ button on the toolbar, or press the ‘C’ key); normally I turn this off when annotating, and only back on when I want the prediction to be updated (as an alternative to pressing the ‘Live prediction’ button).

Finally, I’d recommend avoiding ‘Structure tensor’ features unless you find they are really helpful; they are likely to be the slowest to calculate. ‘Hessian’ features are also quite slow – but depending upon the application they can be very useful. In any case, I would typically use Structure tensor or Hessian, but not both (because there is some overlap in the information they provide, and together they are especially slow).


Hi, thank you for your quick response! :slight_smile:

  1. The ‘Live prediction’ button is indeed “off” when I’m drawing a new annotation.

  2. For the OME-TIFF saving option, what parameters do you recommend I use? Isn’t the default downsample of 4 a lot?

  3. I’m not using ‘Hessian’ features since they don’t really help in my case, but I do use the first 3 ‘Structure tensor’ features.

  4. My purple staining is like this: (the top 3 are the correct colors, the bottom 3 are the incorrect colors, i.e. the purple example; the green annotation is supposed to be pink in both). (Do you think working in ‘Hematoxylin’/‘Eosin’ mode is better than ‘Red’/‘Green’/‘Blue’ in this case?)

and thank you again!

The default export looks ok – the main thing is to check the image looks all right at the end. The image should be exported at full resolution; the downsample refers only to the extra pyramid levels that are generated.

If you have a lot of annotations, polylines and points can often be a better choice than areas drawn with the brush or wand.

I’m afraid I don’t really know whether H&E or RGB is likely to be best. The images on the bottom row do look like they will be extremely challenging for machine learning using the kinds of features QuPath can give to the pixel classifier.

@petebankhead I just want to verify, but from what I recall, the stain vectors are saved with the classifier, so adjusting the stain vectors per image might not have any effect.

I could be wrong, as I have mostly tested this for the Simple Thresholder, where I am fairly confident it is the case.

Thank you!!
Another stupid question, but is it better to split the project?
(or actually, does QuPath use the CPU to process just the open image and not the whole project?)

I’m sorry, I’m not sure I understand you correctly.
I adjusted the vectors and then created the training image. Do the vectors not change this way? I think I saw the values in the Image tab, next to Hematoxylin/Eosin, change a bit. Is there a different/better way to do it?
thank you

Sorry, that was more of a check, not a question for you. My worry is that, regardless of how you set the color vectors per image, the pixel classifier will still use the color vectors chosen during the classifier creation. That means it will not even try to adapt to the variation in your staining.

Yes, that’s right. Setting different stain vectors for individual images won’t help.


Now I’m not sure that I changed the vectors in the right way.
I applied the vector changes to each image separately and then created the tiled training image. Should I do it in a different way?
thank you

@Oshrat I’m afraid it won’t make any difference – you can have only one set of stain vectors for every image (including a training image that has been formed from different images), and this set of stain vectors is then stored as part of the classifier.

This means that when you apply the classifier, the stain vectors set for an image won’t matter – the ones in the classifier will be used.


I get it now, thanks.
So maybe I can correct the training image (my guess is the ‘good’ color images will still be kind of good), or maybe split my cohort and create 2 different classifiers? :thinking:

That might be your best and only option. Maybe in the future there will be a classifier option that tries to adjust for this type of staining variation by taking each image’s stain vectors… but that could also be very dangerous.

Imagine a new user setting everything up in their first image, and then running it on a project where they had not edited the rest of their stain vectors to match the first image!

Sorry for the delay…
I saved the image as @petebankhead suggested (OME-TIFF), and it works very well now!

I still struggle with finding the correct stain vectors for my training.

thank you both for your good advice and help :pray:


If there are two main types of staining, two projects with two classifiers might be the “simplest” way to go. At some point, though, image analysis can’t make up for bad inputs.

The most complicated way to go might be to set the stain vectors individually for each image, export the color deconvolved channels into a new ome.tiff image, and then perform the analysis on the “quasi fluorescent” images that result (each channel is a stain). There may be some caveats to that method that I have not thought through, though. Danger, danger.
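To make the “quasi fluorescent” idea a bit more concrete, here is a hedged NumPy sketch of turning an RGB tile into a channel-per-stain stack; the stain vectors are generic defaults and the residual row is a placeholder, not values from any real image, and the OME-TIFF writing step is only mentioned in a comment:

```python
import numpy as np

# Placeholder stain matrix; in practice you would use the vectors
# estimated per image in QuPath.
stains = np.array([[0.65, 0.70, 0.29],   # hematoxylin
                   [0.07, 0.99, 0.11],   # eosin
                   [0.27, 0.57, 0.78]])  # residual (placeholder row)
stains /= np.linalg.norm(stains, axis=1, keepdims=True)

def to_stain_channels(rgb, background=255.0):
    """(H, W, 3) RGB tile -> (3, H, W) stain-concentration stack."""
    od = -np.log10(np.maximum(rgb.astype(float), 1.0) / background)
    conc = od.reshape(-1, 3) @ np.linalg.inv(stains)
    return conc.reshape(rgb.shape[:2] + (3,)).transpose(2, 0, 1)

tile = np.full((4, 4, 3), 200, dtype=np.uint8)  # dummy pale tile
channels = to_stain_channels(tile)
# channels[0] is the hematoxylin "channel" and channels[1] eosin; a stack
# like this could then be written out as a multichannel (C, Y, X) image,
# e.g. with a TIFF-writing library, and analyzed as if it were fluorescence.
```

The caveats mentioned above still apply, of course; this only shows the reshaping, not a validated pipeline.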

Thank you @Research_Associate,

It kind of works nicely with Estimate stain vectors without setting a specific rectangle (i.e. on the whole ‘training image’) for all of the pink–purple scale I have. But there are still too many areas that classify incorrectly.
I’m training for pathology classes on H&E staining, so I’m a bit less worried about the deconvolution impact (for me it’s easy to tell which area is which), but 2 danger alerts and my lack of knowledge on this make this solution a task for future research :smiley:

I’m wondering, is there an easy way to create a new training image with fewer/more areas (‘annotated regions’) and keep my annotations? :thinking: :pray: :pray:

Also, do you think there is value in creating a classifier for each patient? These are slides with big areas of tissue (~2×3 cm).

If you created a separate project for your training (recommended), or still have the annotations stored somewhere, all you need is those annotations. You can go through and delete or add more annotations at any time, and build a new training image. If you have already deleted the annotations you used, then I don’t know of any way to restore them.

Unsure. In the end, the purpose is to get an accurate result, and as has been discussed elsewhere, there can be a lot of natural variation in human samples for fluorescence: different thresholds are required per patient due to antibody binding affinity across different versions of the same protein, and other genetic variation leading to off-target effects. So my answer would be “maybe.”

The more human input you add (creating multiple classifiers), the more you need to be able to defend your decisions by posting exactly what your training regions were and hosting examples of your results. It is too easy to get any result you want if you make one classifier per patient, and the way to “protect” those results is to be transparent with them, allowing everyone to inspect the work.

Just my 2 cents though.

Again, thank you :smiley:

I will try to explain myself better. I have the ‘training image’ with my annotations (not as a separate project, since I wasn’t sure about the benefit… but that can be done easily). However, let’s say I understand that I need to add more areas to sample my cohort better for building a better classifier.

  • Is it possible to add more ‘regions’ to the training image I already have and trained? Can I do it if I no longer have the original marked ‘regions’ in the images in the project?
    Alternatively, can I transfer the annotations (all of them, not just the last one)? Or maybe can I unite 2 training images with all the annotations I made? What is the best way to add more images to a training image?

I totally agree. I would never do it in IF, but in H&E the main meaning is the shape (at least in my case), not the color (I recognize the cells and the areas according to that). Anyway, I agree that 1 classifier for one study is better, but since I have 3 different types of purple–pink scale… and all the fiddling with creating the training image… :tired_face:

Not that I know of, though @petebankhead may know of something tricky with scripts.

You can recreate the training image from scratch.

Alternatively, if this is just about training objects, they don’t have to all be in the same image. You can import training objects from multiple images, so create a second training image with additional training areas. Use a different class to define the new set of objects so you don’t import old regions.
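In case it helps: more recent QuPath versions can export and re-import objects as GeoJSON, and assuming you have such exports, pooling the annotation sets outside QuPath is just concatenating their features. A minimal sketch (file names here are hypothetical, and the demo fabricates two tiny exports rather than using real QuPath output):

```python
import json
import os
import tempfile

def merge_annotation_exports(paths, out_path):
    """Concatenate features from several GeoJSON exports into one file."""
    features = []
    for p in paths:
        with open(p) as f:
            data = json.load(f)
        # Exports may be a FeatureCollection or a bare list of features.
        features.extend(data["features"] if isinstance(data, dict) else data)
    with open(out_path, "w") as f:
        json.dump({"type": "FeatureCollection", "features": features}, f)
    return len(features)

# Tiny demo with two fake exports (2 + 3 features):
tmp = tempfile.mkdtemp()
for name, n in [("a.geojson", 2), ("b.geojson", 3)]:
    feats = [{"type": "Feature", "geometry": None, "properties": {}}] * n
    with open(os.path.join(tmp, name), "w") as f:
        json.dump({"type": "FeatureCollection", "features": feats}, f)
print(merge_annotation_exports(
    [os.path.join(tmp, "a.geojson"), os.path.join(tmp, "b.geojson")],
    os.path.join(tmp, "merged.geojson")))  # 5
```

The merged file could then be imported into a fresh training image, though you would still want to check the classes survive the round trip.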
