Using StarDist as a start for further classification

Hi Everyone, I have had great success training a StarDist model on my own data. Now I’d like to use the nuclear segmentation for further classification.

My images are like this:

This is a fluorescent multi-channel image of brain stained with DAPI and other markers. I used StarDist to find the nuclei and then I annotated those regions in the other channels using ImageJ’s ROI Group tools. Now I have an annotated data set I could use to train classifiers to the other channels. I want to classify only the regions found using StarDist, not start with the other channels and go from there.

Does anyone have any suggestions as to a tool I could use to do that?

I see all the stuff at ZeroCostDL4Mic, but I don’t see the workflow I am proposing.

Thanks in advance for your time and suggestions! John

Dear John,

Why do you need a second round of training?

From the image it seems you can just use the intensity of the marker channel to classify, right?

E.g. in Python using skimage.measure.regionprops.
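For example, a minimal sketch (the file names and the cutoff value are placeholders):

    import numpy as np
    from tifffile import imread
    from skimage.measure import regionprops_table

    labels = imread("stardist_labels.tif")   # StarDist label image (placeholder file name)
    marker = imread("marker_channel.tif")    # marker channel, same XY shape

    # Mean marker intensity inside each StarDist object
    props = regionprops_table(labels, intensity_image=marker,
                              properties=("label", "mean_intensity"))

    cutoff = 50  # intensity threshold chosen for your data
    positive = props["label"][props["mean_intensity"] > cutoff]
    print(f"{positive.size} of {props['label'].size} nuclei called positive")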

Kind regards

Tobias

Yep, unless you want something more complex than straight per-object measurements, you should be able to approach that by masking in FIJI, measuring anywhere you can use the ROIs, or trying QuPath, where many of the measurements are made automatically:


You do need a specific build of QuPath to use StarDist, but the automatic measurements and options for classification (as long as you don’t need texture-based classification), ease of use, and ability to export measurement tables might make it an option worth exploring.


And to give an example of what can be built off of that:

Hi Tobias and MicroscopyRA,

Thanks for your replies. I agree that for this mini example intensity would clearly be good enough. I find, though, that over many tissue samples in large studies a simple threshold often doesn’t work as well as hoped, even when the staining is relatively consistent. I’m looking for something a bit smarter that could use the context in and around the objects to identify positives.

Here’s an example where I’ve pointed out a few cells I think are positive with these two different markers. In the green image on the left there is a wide intensity difference, and in the right image I circled a low-intensity object I don’t think is real. In these images I’ve left out the nuclei outlines:

I’ve gotten some decent results using ilastik, but with that software I have to pick the training examples. It’s not hard to do that, but I’ve created this big data set of hand-annotated data and I’d like to leverage it instead of repeating that process within another tool.

I had such good success training my own model in StarDist for nuclear segmentation that I was hoping to make a similar improvement with the other channels in my images.

I haven’t tried to work this out in QuPath yet, that’s next on the list…

-John

Ah, I have been dealing with similar issues, and QuPath does not really help too much in that regard. StarDist is also dangerous as it does a great job of picking up very faint, round, blobby objects, sometimes ones I can barely even see at the normal brightness and contrast I use for what I consider “real” cells.

I have used the logic shown here to check whether two ROIs (cells) contain each other, and keep the larger one. You might be able to do something similar in FIJI.

In that case I was using StarDist to detect first nuclei and then macrophages (different runs on different channels), and deleting any nuclei that were contained within a macrophage so that I only ended up with a single macrophage object. It ended up being only partially successful, as there were plenty of instances where the cells were touching and StarDist did not work so well there (or on non-round objects).
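Roughly, in Python (an untested sketch; it assumes you have both results as label images of the same shape):

    import numpy as np
    from skimage.measure import regionprops

    def drop_contained_nuclei(nuclei_labels, macro_labels, min_overlap=0.9):
        """Remove nuclei whose pixels fall (mostly) inside a macrophage object."""
        kept = nuclei_labels.copy()
        for nuc in regionprops(nuclei_labels):
            # fraction of this nucleus' pixels that lie on any macrophage label
            inside = macro_labels[tuple(nuc.coords.T)] > 0
            if inside.mean() >= min_overlap:
                kept[kept == nuc.label] = 0  # contained -> keep only the macrophage
        return kept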

If I were using StarDist for classifications, I would alter the class of the parent object whenever I deleted another channel’s child object.

So you’re using StarDist to find objects in both the nuclei and the macrophage images separately and then finding objects that overlap? Hmm, I didn’t think of doing that. That might work for the basically round objects I’m trying to find.

I do staining in tissue for both macrophages and microglia and yes, both are quite different than the examples I’m showing here. It’s hard to imagine how the StarDist approach would work in those cases. In those stains I’ve focused on staining near the nucleus and use that to count glia.

I have another marker for somatostatin which often only stains part of the cytoplasm, so it doesn’t have a rounded morphology, but it doesn’t look like glia either. Seems like there’s a need for more machine learning here…

QuPath doesn’t have an option for importing training annotation data does it?

Depends on where the annotations were made. FIJI ROIs work well since QuPath is well integrated with IJ1. I think there are posts about importing FIJI ROIs, but I have never tried to do this from zip files, only active ROIs. I prefer QuPath for ROI handling since the ROIs save with, but are not embedded into, the images.
You may have trouble with other sources of ROIs, though some NDPI viewer ROIs can be imported with a script. Or you could use masks.

As far as using StarDist though, people have usually used QuPath to generate the training data, then retrained a StarDist model off of that, followed by running the model back within QuPath. It would always be nice to have more models available for the community if you get them worked out nicely!


I wasn’t too careful about the time point I picked, but that’s a good general area.

I’m doing the annotation for this second step of analysis, after the nuclear segmentation, in ImageJ with the ROI Group tool to mark ROIs. I can save these as a mask, as ROIs in a .zip, or as active ROIs if there’s some kind of direct connection.

Do you think QuPath could be helpful with this type of imaging problem? Sounds like you’ve already tried and didn’t have great success…

I hear you regarding releasing my model into the wild, but there are IP issues I have to address. If I were able to, where would I do it?

Hmm, no idea, really. I would defer to @uschmidt83

As for the ROIs, if you do end up digging into it, I think the zip files were mentioned elsewhere in posts, but I do not remember exactly how it is done. If the annotations are generated by (or loaded into) QuPath’s version of ImageJ, then there is a command to transfer them over rather easily.
https://qupath.readthedocs.io/en/latest/docs/advanced/imagej.html#qupath-objects-and-imagej-rois

Hi John, irrespective of the tool you use, I would go for a random-forest-based pixel classifier to give you probabilities for the right class (per pixel). I would then use this probability image (as suggested above for intensities) to quantify per StarDist object. If important information lies outside the StarDist object you might dilate the objects a bit. Hope that helps!
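A rough sketch of that idea (assuming a per-pixel annotation mask where 0 = unlabeled, 1 = negative, 2 = positive; the feature stack here is deliberately minimal):

    import numpy as np
    from skimage.filters import gaussian
    from skimage.measure import regionprops_table
    from sklearn.ensemble import RandomForestClassifier

    def pixel_features(img):
        # very simple feature stack: raw intensity plus two Gaussian scales
        return np.stack([img, gaussian(img, 1), gaussian(img, 4)], axis=-1)

    feats = pixel_features(marker)            # marker: the marker channel image
    train = annot > 0                         # annot: per-pixel class annotations
    rf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
    rf.fit(feats[train], annot[train])

    # probability of the "positive" class (= 2) for every pixel
    prob = rf.predict_proba(feats.reshape(-1, feats.shape[-1]))[:, 1].reshape(marker.shape)

    # quantify the probability image per StarDist object
    props = regionprops_table(labels, intensity_image=prob,
                              properties=("label", "mean_intensity"))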

Kind regards Tobias

Hi Tobias,

That sounds like reasonable advice, thanks for your reply. I am familiar with some of the supporting evidence for the random forest approach: https://jmlr.org/papers/volume15/delgado14a/delgado14a.pdf

I was assuming that if the per-pixel classification is done with a neighborhood size of, say, 30 pixels, then pixels outside the StarDist object would contribute to the classification of the ones inside the object that are near the edge. Is that not correct?

Even if that is correct, dilating the nuclear segmentation objects before using them to look at the other markers is a good idea; I haven’t tried that yet.
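From what I can see, skimage’s expand_labels could handle that dilation without letting neighbouring nuclei merge (a quick sketch; the distance is just a guess):

    from skimage.segmentation import expand_labels
    from skimage.measure import regionprops_table

    # grow each StarDist nucleus by a few pixels before measuring the marker
    # (or probability) image inside it; neighbouring labels do not merge
    dilated = expand_labels(labels, distance=5)
    props = regionprops_table(dilated, intensity_image=marker,
                              properties=("label", "mean_intensity"))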


Yep, this is built into StarDist as described above. Though the blind expansion does have its own problems if you have background and cell overlap. In widefield data I have often found that nuclear signal for cytoplasmic markers was a much better predictor, while in confocal data this does not hold as well.

Interesting. If I have time I’ll try both. But this gets back to my original workflow problem. I’ve annotated my data as positives and negatives, and I can easily compare classifier output to this “ground truth” and calculate performance metrics. Now, if I could use the annotations to train a classifier, then it would be easy to try dilated or non-dilated objects and see which one is better.

But as it stands, I have to manually pick the training set each time I use ilastik, which is tedious and ambiguous, because one picks only a relatively few examples across one’s data set and it’s not clear which examples, or how many, give the best results. So what I’m looking for is a system like StarDist where I input the training set automatically (as a mask or whatever) and get back a classifier that’s optimized to that training set. Then I run a performance-metric script and, if necessary, turn some knobs or add more training data and try again. Adding training data manually puts a roadblock in the iteration loop I’m trying to execute.
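For the metrics side, a sketch of what I mean (assuming the hand annotations are a per-pixel mask with 0 = unlabeled, 1 = negative, 2 = positive, and y_pred holds the classifier’s per-object calls in label order):

    import numpy as np
    from scipy import ndimage as ndi
    from sklearn.metrics import classification_report

    ids = np.unique(labels)[1:]  # StarDist object ids, skipping background (0)

    # majority annotated class inside each object as the per-object ground truth
    y_true = ndi.labeled_comprehension(
        annot, labels, ids,
        lambda v: np.bincount(v[v > 0]).argmax() if (v > 0).any() else 0,
        int, 0)

    annotated = y_true > 0  # only score objects that were actually annotated
    print(classification_report(y_true[annotated], np.asarray(y_pred)[annotated]))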

I haven’t tried to figure this workflow out in QuPath or any other tool yet besides the ImageJ, StarDist, ilastik combo I’m using at the moment. I guess I’m trying to figure out if there’s a better way.

Hi @johnmc,

Just FYI, the upcoming release of stardist will include a classification option, where the prediction of a stardist instance can optionally also contain a class prediction from a set of user-defined object classes (e.g. phenotypes etc.). The code currently lives in its own branch (multi) and is still pretty experimental, but if it fits your problem description well you may have a go and in that way become a beta tester :slight_smile:

To do that:

  1. Install the multi branch

pip install git+https://github.com/mpicbg-csbd/stardist@multi

  2. For each stardist ground-truth label image, provide a class dictionary which maps label instance ids to class ids (this can be provided sparsely, i.e. not all instances need to be classified in the ground truth); see the sketch after this list.

  3. Adapt the example notebook for multi-class predictions (a toy example with 2 classes), which you can find here.
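For step 2, a hypothetical helper that builds such a dictionary from a label image plus a per-pixel annotation mask (0 = unlabeled, 1/2 = the two classes) might look like this:

    import numpy as np

    def class_dict(labels, annot):
        # map each annotated instance id to the majority class id inside it;
        # instances without annotated pixels are simply left out (sparse labels)
        d = {}
        for lab in np.unique(labels)[1:]:  # skip background
            vals = annot[labels == lab]
            if (vals > 0).any():
                d[int(lab)] = int(np.bincount(vals[vals > 0]).argmax())
        return d

    # e.g. {3: 1, 7: 2, 12: 1, ...}  (instance id -> class id)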

Let me know how that goes!

Martin


Hi Martin,

Wow, thanks for your suggestion, I will check this out. Just to be clear, I have further annotations for the StarDist objects found, but those annotations are based on the staining in the other images in the set. For example, I use StarDist to find the nuclei in the DAPI channel and then I go to the other channels, say FITC, to see if the cell is positive for GFP (Green Fluorescent Protein). I haven’t looked at the example you cite yet… Is this how ‘multi’ works? Using those other channels along with the annotations I’ve created?

-John


I also have not looked, but I had initially assumed it would use information from other channels within the StarDist label/ROI… but now I am wondering if it might only be based on the nuclear/segmentation channel.

It will classify each detected stardist object based on the given input, which can be single- or multi-channel. You mentioned that you trained your own model anyway, so you could just use the 3 channels as input with the masks you already have, plus the additional classification annotations for each nucleus belonging to either of the two classes…
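Roughly (an untested sketch; it assumes the multi branch exposes an n_classes option in Config2D and a classes argument to train, as in the multi-class example notebook, so please check that notebook for the exact call):

    from stardist.models import Config2D, StarDist2D

    conf = Config2D(n_rays=32, n_channel_in=3, n_classes=2)   # 3 input channels, 2 object classes
    model = StarDist2D(conf, name="multiclass_3ch", basedir="models")
    # model.train(X, Y, classes=C, validation_data=(Xv, Yv, Cv), augmenter=augmenter)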


Ah, so we can’t use one or several channels for the segmentation and then additional channels for the classification. It might be a bit weird to use certain cytoplasmic channels for segmentation (macrophages, epithelial cells in general), but it would still be interesting to classify those stains as present within the nuclear segmentation (or a slight expansion of it).

Ah, I see. So you want the segmentation to be based solely on one channel and the classification on a different one? That is indeed a different workflow than the one I was describing…


Why is this important? To not bias the result somehow?