3D Stardist Segmentation Model Comparison

Hello everyone :smiley:

As part of my Ph.D. project, I have used StarDist 3D to train models and predict segmentations of cell nuclei from a 3D image stack of a cell organoid. I have run several tests using the Jupyter notebooks available on GitHub with different image data sets for training and testing, obtaining quite good results. However, I have had some problems with the model comparison, because the plots that StarDist 3D produces sometimes do not show all the required information, and I would like to know why this could be happening.

Here you can see the two sets of plots I obtained. The first set was obtained using just my own data: two fluorescent microscopy image stacks of 85 frames each, with a size of 340x310 pixels. In these plots we can observe the different metrics together with the tp, fn, and fp counts. For the second set of plots we used the same fluorescent microscopy images, but in this case we added to our training data two stacks of the synthetic data set that StarDist 3D provides to train the quick-demo model.

I have checked the predictions made by each model, and the second model gives better results, but I am still curious about why the Jupyter notebook gives me the second set of plots. Why doesn't it compute the fp? And why are the plots in general not as smooth as the other model's plots?
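On the smoothness question, one plausible explanation (a sketch under assumptions, not StarDist's actual plotting code) is that the matching metrics are step functions of the IoU threshold: each time the threshold crosses the IoU of one matched nucleus, that whole object flips from true positive to false positive/negative. With only a handful of ground-truth objects, each step is large and the curves look jagged; with many objects the steps are tiny and the curves look smooth. The helper function below is hypothetical, but the formulas for precision and recall are the standard ones:

```python
# Sketch: how threshold-dependent matching metrics behave with few objects.
# `ious` are the IoU values of the one-to-one matched (true, pred) pairs;
# with only 5 matched nuclei, every threshold crossing moves the curves by
# a large step, which is why plots from small test sets look jagged.

def metrics_at_threshold(ious, n_true, n_pred, tau):
    """Count matches whose IoU reaches tau and derive the usual scores."""
    tp = sum(iou >= tau for iou in ious)
    fp = n_pred - tp          # predicted objects without a good enough match
    fn = n_true - tp          # true objects without a good enough match
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall    = tp / (tp + fn) if tp + fn else 0.0
    return tp, fp, fn, precision, recall

# Toy example: 6 true nuclei, 5 predictions, 5 matched pairs.
ious = [0.92, 0.81, 0.66, 0.55, 0.48]
for tau in (0.3, 0.5, 0.7, 0.9):
    tp, fp, fn, p, r = metrics_at_threshold(ious, n_true=6, n_pred=5, tau=tau)
    print(f"tau={tau}: tp={tp} fp={fp} fn={fn} precision={p:.2f} recall={r:.2f}")
```

Running this shows precision dropping in steps of 0.2 per lost object; with hundreds of nuclei the same plot would look nearly continuous.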

I hope you can help me and thanks for reading my post

I’m confused, since you already asked this question before and I answered it here:

I’m not sure what you mean or what you expect. The second row of plots shows much better results, with all scores essentially being perfect up until IOU threshold 0.5. Only then do you see a gradual decline in performance.


Dear @uschmidt83, thank you for your quick answer. In this case my question was more about why the plots are different when I use different data sets. Last time I asked, you told me that the reason I get this kind of plot is that my fp and fn were the same, which is why I could not see the fp curve. But now I would like to know what the reasons for this could be. I find it very odd that the precision, mean_true_score, and recall do not appear in the plot when they have different formulas. How can these metrics end up with the same values as the others?
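For the precision/recall part of this, the overlap actually follows directly from the formulas (a minimal arithmetic sketch, assuming the standard definitions used by matching-based evaluation; mean_true_score is computed differently and is not covered here):

```python
# Why several metric curves can collapse onto a single line.
# The matching metrics are all derived from the counts tp, fp, fn:
#   precision = tp / (tp + fp)
#   recall    = tp / (tp + fn)
#   f1        = 2*tp / (2*tp + fp + fn)
# If fp == fn at a given IoU threshold, all three are numerically equal,
# so the plot draws the curves in exactly the same place and the ones
# drawn later hide the ones drawn earlier.

def precision(tp, fp): return tp / (tp + fp)
def recall(tp, fn):    return tp / (tp + fn)
def f1(tp, fp, fn):    return 2 * tp / (2 * tp + fp + fn)

# Hypothetical counts: 3 spurious and 3 missed nuclei (fp == fn).
tp, fp, fn = 40, 3, 3
print(precision(tp, fp), recall(tp, fn), f1(tp, fp, fn))
```

All three printed values coincide (40/43), so in the plot only the top-most curve is visible. Having fp == fn at every threshold is not a bug; it just means the model produces as many spurious objects as it misses.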

I am asking because the addition of the synthetic data really affects the performance of my model, but it is not very clear to me how or why this is happening.

I hope this explanation can tell you more about why I decided to open a new issue. :smiley:

Sorry, but I don’t understand what you’re trying to tell me. Can you be much more specific, ideally with code? For example, add comments to your Jupyter notebook at the positions where something unexpected happens in your opinion, then share the notebook with me.

However, to be completely honest, I don’t really have much time to provide such an in-depth level of support. I’m doing all of this in my spare time.