It took a while, but we’ve finally managed to group and answer the questions that came up during the StarDist webinar last month.
You can watch it on YouTube in case you missed it.
Please start a new topic if you want to discuss anything further.
We thank @superresolusian, @daniel.sage, @romainGuiet, @Ofra_Golani, @Julien_Colombelli, and especially @oburri for helping us plan and run the webinar.
We also thank the whole NEUBIAS team for making these webinars happen.
Will it work for my data/application?
- How do I know if my objects of interest are (sufficiently) star-convex, i.e. is StarDist a good choice for my data?
- Other stains/markers with different appearance, quality, or inhomogeneity?
- Can other objects besides round nuclei be segmented (e.g. multi-lobe nuclei, granules, bacteria)?
- With multiple nucleus types, is it possible to only segment some or classify in addition to segmentation?
- Use for cell counting or centroid localization?
- Data format/pre-processing
How to label
- Should I annotate a few entire raw images/stacks, or is it better to annotate several smaller image crops?
- Which size should the training images be?
- Is there an upper size limit for objects to be well segmented?
- Do I have to annotate all nuclei (objects) in a training image? What about those that are only partially visible? What about other objects not of interest?
- Is it better to annotate images from scratch or to bootstrap/curate imperfect annotations (e.g. from another method)? Is training sensitive to annotation mistakes?
- How many images or nucleus (object) instances do I have to annotate for good results?
- Software/format for labeling
- Using pretrained models
- What are the probability and overlap/NMS thresholds? How do I select good values?
- How does it work under the hood? I want to know the technical details.
- Is a trained model sensitive to changes in image intensity or object size (as compared to the training images)?
- Do you support or recommend “transfer learning”?
In a nutshell, most blob-like object shapes are star-convex (see Wikipedia article). If you have labeled images, you can load your data in our example notebooks and see how well it can be reconstructed with a star-convex polygon/polyhedron representation. An average reconstruction IoU score (mean intersection over union) of 0.8 or higher can generally be considered good enough.
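To give a feeling for what such a reconstruction score measures, here is a minimal NumPy-only sketch. It is not the code from the example notebooks (which use StarDist's own helpers); it is a simplification that casts rays from the object centroid only and interpolates the boundary between rays. The function name `star_convex_iou` and the toy shapes are assumptions made for illustration.

```python
import numpy as np

def star_convex_iou(mask, n_rays=32, step=0.5):
    """IoU between a binary object mask and a star-convex reconstruction of it.

    Simplification: rays are cast from the object centroid only.
    """
    mask = np.asarray(mask, dtype=bool)
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    angles = np.linspace(0.0, 2.0 * np.pi, n_rays, endpoint=False)

    def inside(y, x):
        yi, xi = int(round(y)), int(round(x))
        return (0 <= yi < mask.shape[0] and 0 <= xi < mask.shape[1]
                and bool(mask[yi, xi]))

    # Ray length = last sampled distance along the ray that is still inside.
    dists = np.zeros(n_rays)
    for i, a in enumerate(angles):
        r = 0.0
        while inside(cy + r * np.sin(a), cx + r * np.cos(a)):
            dists[i] = r
            r += step

    # A pixel belongs to the reconstruction if it is no farther from the
    # centroid than the (angularly interpolated) ray length in its direction.
    yy, xx = np.indices(mask.shape)
    rho = np.hypot(yy - cy, xx - cx)
    phi = np.mod(np.arctan2(yy - cy, xx - cx), 2.0 * np.pi)
    limit = np.interp(phi, np.append(angles, 2.0 * np.pi),
                      np.append(dists, dists[0]))
    recon = rho <= limit

    inter = np.logical_and(mask, recon).sum()
    union = np.logical_or(mask, recon).sum()
    return float(inter) / float(union)

# Toy shapes: a disk (star-convex) and a ring (not star-convex).
yy, xx = np.indices((65, 65))
d2 = (yy - 32) ** 2 + (xx - 32) ** 2
disk = d2 <= 20 ** 2
ring = disk & (d2 >= 10 ** 2)
print("disk:", star_convex_iou(disk))
print("ring:", star_convex_iou(ring))
```

Averaging such a score over all annotated objects, and repeating it for different values of `n_rays`, gives a rough idea of how many rays are needed for a given dataset.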
Please first verify that the shapes of your objects are star-convex, i.e. blob-like. Examples of objects (segmentable by StarDist) include cells in brightfield images and stained structures in fluorescence or histology images. Where stains are used, an object can have its whole area stained, just its boundary stained, or be negatively stained (i.e. it is dark compared to other regions of the image). Next, please check if one of the pretrained models works for your data.
If your data is suitable for StarDist, but there is no pretrained model available, you need to train your own model. To that end, you need labeled images before you can train your model (you can use the provided example notebooks where you replace the example data with your own).
The short answer is that StarDist should work well for segmenting all kinds of blob-like objects with a star-convex shape. However, it typically performs quite a bit better for roundish shapes compared to strongly elongated ones. For the latter, you often need to increase the number of rays to get decent results.
If there are multiple object/cell types in your image and you only want to segment some of them, you have several options. First, you can annotate only the object type(s) of interest in your training data, implicitly telling StarDist to consider everything else as background. While this can work, it might make it more difficult for StarDist to reliably distinguish between objects and background, especially if the visual differences between object types are subtle. Alternatively, you can annotate all objects in the training data, such that StarDist will learn to segment objects of all types. In a second step, you would have to filter out all objects of those types you are not interested in. This can either be done manually or with a different classification model.
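The second step (discarding objects of unwanted types) can be sketched as follows with NumPy. The helper `keep_labels`, the toy label image, and the `object_class` dictionary (standing in for the output of a manual or automatic classification) are all hypothetical.

```python
import numpy as np

def keep_labels(label_image, keep_ids):
    """Set every object whose id is not in keep_ids to background (0)."""
    label_image = np.asarray(label_image)
    return np.where(np.isin(label_image, list(keep_ids)), label_image, 0)

# Toy label image with three segmented objects (ids 1, 2, 3).
labels = np.array([[1, 1, 0, 2],
                   [0, 0, 0, 2],
                   [3, 3, 0, 0]])

# Hypothetical result of a separate classification step: object id -> type.
object_class = {1: "tumor", 2: "lymphocyte", 3: "tumor"}

wanted = [i for i, c in object_class.items() if c == "tumor"]
filtered = keep_labels(labels, wanted)
print(filtered)
```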
Ideally, StarDist could additionally classify all objects while segmenting them. Although this is currently not possible, we might add this feature in a future version.
If you just want to count or localize the centroids of cells, it might be a bit overkill to use StarDist (although trying one of the pretrained models is always a good idea). Dedicated cell counting and centroid localization approaches do exist, and they often need weaker forms of labeling, such as cell counts per training image or point annotations for cell centroids. However, if such centroid localization methods yield suboptimal results (e.g. in the case of very densely packed cells/nuclei), it might be worth spending the extra annotation effort to train a dedicated StarDist model.
In general, special pre-processing of images (such as background subtraction, denoising, etc.) is not necessary. However, it is reasonable to scale your input images such that the overall size of objects (in pixels) is similar to the size of objects used during training. If you have trained your own model, that means always ensuring that new images have roughly the same pixel size as the training images. This will make it much easier for StarDist to learn and might also avoid erroneous predictions of objects that are either too small or too large.
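As a rough illustration of matching pixel sizes, the following NumPy sketch rescales an image by the ratio of the two pixel sizes. The pixel sizes are made-up values, and `rescale_nearest` is a hypothetical helper that uses plain nearest-neighbour sampling; in practice an interpolating resize from an image processing library is usually preferable.

```python
import numpy as np

def rescale_nearest(image, factor):
    """Rescale an image by `factor` along all axes (nearest-neighbour sampling)."""
    image = np.asarray(image)
    new_shape = [max(1, int(round(s * factor))) for s in image.shape]
    # Map the centre of each output pixel back to the nearest input pixel.
    idx = [np.minimum(((np.arange(n) + 0.5) / factor).astype(int), s - 1)
           for n, s in zip(new_shape, image.shape)]
    return image[np.ix_(*idx)]

train_pixel_size = 0.5   # micrometers per pixel in the training images (assumed)
new_pixel_size = 0.25    # micrometers per pixel in the new image (assumed)

# Objects appear twice as large (in pixels) in the new image, so shrink it by 0.5.
factor = new_pixel_size / train_pixel_size

image = np.random.rand(200, 300)
rescaled = rescale_nearest(image, factor)
print(image.shape, "->", rescaled.shape)
```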
If you are using a pre-trained model, it is important to know what kind of images it was trained with to understand if your image data is similar enough. In some cases, you can pre-process your images to make them suitable for a pre-trained model (e.g. up/downscaling of the image).
If your images contain only very few (<10) axial planes, you might consider doing a 2D segmentation from a maximum intensity projection (MIP) of the 3D stack. However, the projected volume should only be one or two cell layers thick. If you can’t tell individual cells apart by eye in the MIP, there is little hope that StarDist will get it right.
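A maximum intensity projection is a one-liner in NumPy. The sketch below assumes the stack is stored with the axial dimension first, i.e. as (Z, Y, X); the random array is only a stand-in for real data.

```python
import numpy as np

# Stand-in for a thin 3D stack with 8 axial planes, axis order (Z, Y, X).
stack = np.random.rand(8, 64, 64)

# Maximum intensity projection along Z gives a 2D image of shape (Y, X).
mip = stack.max(axis=0)
print(stack.shape, "->", mip.shape)
```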
If you need 3D segmentations, StarDist 3D does support anisotropic data (e.g. a 5x larger axial vs lateral pixel size should not be a problem). However, we sometimes found it advantageous to upscale the axial resolution to make objects appear more isotropic in the images. Hence, first try it directly with the anisotropic data, and only if that doesn’t lead to good results should you upscale the data to make it isotropic. Note that, since one is not interested in restoring the image intensity signal but only in segmenting the objects, it most likely does not make sense to use Isotropic CARE.
StarDist is in general not limited to images of specific formats, bit-depths, or sizes. Any input image however needs to be normalized to floating point values roughly in the range 0…1 before network prediction. Our example notebooks demonstrate how this normalization is done in Python, and our Fiji plugin does this by default.
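For illustration, a percentile-based normalization of this kind can be sketched in plain NumPy as below. The percentiles 1 and 99.8 and the function name `normalize_percentile` are assumptions for this sketch; please refer to the example notebooks for the normalization actually used there.

```python
import numpy as np

def normalize_percentile(x, pmin=1.0, pmax=99.8, eps=1e-20):
    """Map the pmin/pmax intensity percentiles of x to 0 and 1 (no clipping)."""
    x = np.asarray(x, dtype=np.float32)
    lo, hi = np.percentile(x, [pmin, pmax])
    return (x - lo) / (hi - lo + eps)

# Stand-in for a 16-bit raw image.
rng = np.random.default_rng(0)
raw = rng.integers(100, 4000, size=(64, 64)).astype(np.uint16)

norm = normalize_percentile(raw)
print(norm.dtype, float(norm.min()), float(norm.max()))
```

Because values are not clipped, a few pixels end up slightly below 0 or above 1, which is why the range is only "roughly" 0…1.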
StarDist can be trained on, and predict for, images with arbitrary spatial dimensions, but once a model is trained it is limited to its specific number of input channels (e.g. one cannot use a model trained for 2D RGB images on 2D single-channel images).
StarDist does not put any constraints on the specific size of the input image: all padding and cropping necessary for the actual neural network is automatically handled for you. Also note that StarDist can do tiled prediction of large images in case of limited GPU memory.
A StarDist model is always trained to work for images with specific input channels in a given order. On the one hand, that means you can train your own model with any number of input channels that you think might be helpful to accurately segment your images. On the other hand, these channels always have to be present in images that you want to segment using this model. Note that this also applies to the pretrained models that we provide, which expect images with specific input channels.
If images have additional channels or channels in a different order than expected by a trained StarDist model, you first need to re-arrange them. For example, you may need to split the image channels and select the appropriate channel image (e.g. DAPI) before you can apply our pretrained model for fluorescent nuclei in Fiji. You can then use the resulting segmentation to perform measurements in the other channels.
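In Python, the same workflow might look like the following sketch. The channel layout (Y, X, C) with DAPI in channel 0 is an assumption, and the label image is a hand-made stand-in for what a segmentation of the DAPI channel would return; `mean_intensity_per_label` is a hypothetical helper.

```python
import numpy as np

# Assumed layout: (Y, X, C) with channel 0 = DAPI, channel 1 = marker of interest.
image = np.zeros((4, 4, 2), dtype=np.float32)
image[..., 1] = np.arange(16).reshape(4, 4)

# Single-channel image to pass to a model that expects one input channel.
dapi = image[..., 0]

# Stand-in for the label image obtained by segmenting `dapi`.
labels = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 2, 2],
                   [0, 0, 2, 2]])

def mean_intensity_per_label(labels, channel):
    """Mean intensity of `channel` inside each object of `labels` (0 = background)."""
    ids = np.unique(labels)
    ids = ids[ids != 0]
    return {int(i): float(channel[labels == i].mean()) for i in ids}

means = mean_intensity_per_label(labels, image[..., 1])
print(means)  # {1: 2.5, 2: 12.5}
```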
In general, it is better to annotate several image crops instead of entire (big) images or stacks. It is important that the content within annotated training images is representative of the content within images that you want to predict on later, after the model has been trained. In other words, the training data should cover the full range of variability that you expect in your (future) data.
As mentioned earlier, it is generally better to annotate a variety of image crops as your training data. However, those crops must be big enough to contain entire fully visible objects and provide some context around them. Also make sure that not too many of the annotated objects are touching the border (it’s fine if some do, but it should not be the majority). Example: if you have small cells with a diameter of 20 pixels, it might be sufficient to have annotated images of size 160x160, whereas if your objects have a diameter of 80 pixels, you would need to use larger annotated images e.g. of size 512x512.
The “patch size” is an important parameter for training StarDist, and the size of images used for training affects what an appropriate value for the patch size should be (to maintain compatibility with the neural network architecture). For example, the patch size used for training StarDist must be smaller than or equal to the size of the smallest annotated training image. To be on the safe side, ensure that the patch size is divisible by 16 along all dimensions. For example, you can annotate image crops of 300x300 pixels and then use a patch size of 256x256 pixels for training.
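The two constraints above (no larger than the smallest training image, divisible by 16) can be checked with a few lines of Python. The helper `max_patch_size` is hypothetical and simply returns the largest admissible patch size; any smaller multiple of 16, such as 256x256 in the example above, is equally valid.

```python
def max_patch_size(image_shapes, multiple=16):
    """Largest patch size that fits into every image and is divisible by `multiple`."""
    ndim = len(image_shapes[0])
    smallest = [min(shape[d] for shape in image_shapes) for d in range(ndim)]
    patch = tuple((s // multiple) * multiple for s in smallest)
    if min(patch) == 0:
        raise ValueError(f"at least one image is smaller than {multiple} pixels")
    return patch

# Shapes of the annotated training crops (toy values).
shapes = [(300, 300), (512, 400)]
print(max_patch_size(shapes))  # (288, 288)
```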
The maximal size of objects that can be well segmented depends on the receptive field of the neural network used inside a StarDist model.
For the default StarDist 2D network configuration, this is roughly 90 pixels. If your objects are larger than this and the segmentation results indicate over-segmentation, you can either a) downscale your input images such that the object size becomes smaller, or b) increase the receptive field of a StarDist model by changing the grid parameter in the model configuration (e.g. setting grid=(2,2) will roughly double the receptive field). Grid values of 4 and even 8 do make sense for images with a large minimum object size, e.g. 5x the size of the grid value.
This is similar for StarDist 3D, although the receptive field for the default network configuration is only roughly 35 pixels. Besides downscaling your input images, you can also change the grid parameter as mentioned above, but do not increase it for Z if you have strongly anisotropic images with relatively few axial planes, e.g. use grid=(1,2,2). Furthermore, you can also slightly increase the receptive field by changing the backbone in the configuration to a U-Net.
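For the downscaling option, a simple rule of thumb is to shrink the image until the largest objects fit into the receptive field. The helper below is only a sketch of that heuristic: `suggested_downscale` is a hypothetical name, and the defaults are the approximate receptive field sizes quoted above (about 90 pixels in 2D, about 35 pixels in 3D), not values read from a model.

```python
def suggested_downscale(object_diameter_px, receptive_field_px=90.0):
    """Scale factor (<= 1) that makes objects no larger than the receptive field."""
    if object_diameter_px <= 0:
        raise ValueError("object diameter must be positive")
    return min(1.0, receptive_field_px / object_diameter_px)

print(suggested_downscale(180))        # 2D, objects twice the receptive field
print(suggested_downscale(40))         # 2D, objects already small enough
print(suggested_downscale(70, 35.0))   # 3D default network
```

The resulting factor is a starting point only; whether over-segmentation disappears should be checked on the actual predictions.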
Sparse labelling is not supported at this point, i.e. you must label all the objects in your chosen training images, even if they are only partially visible. If you don’t do this, the trained model can be confused as to which pixels belong to objects and which belong to the background. As a consequence, this might result in many objects being missed during prediction.
If there are any other objects or structures present in the image that are not of interest, there are two options. First, annotate them too for training and then filter them out later from the predicted objects. (In the future, we might add an option to additionally classify different object types, making this easier.) We would recommend this in many cases, especially when the objects are also star-convex or look very similar to the objects of interest. Second, leave unwanted objects or structures out of the annotation if they can easily be distinguished from the objects of interest. If in doubt, try both strategies.
In practice, you probably would like to use the labeling approach that requires the least amount of manual annotation/curation work. It depends on your data whether this is labeling from scratch or curating an imperfect automatic labeling.
Annotating images from scratch is often easier because it doesn’t involve obtaining predictions and curating them. It can be a good strategy if the task is not too difficult.
If you already have an instance segmentation method with decent results, you can try training StarDist by using its predictions as ground truth. As long as there are no systematic mistakes in the ground truth, we have observed that training can still be successful. Especially when the segmentation task is more difficult (e.g. noisy images and/or strong appearance variations), it often makes sense to train an initial model, curate its predictions and add them to the training data.
This is very difficult to answer in general, since it really depends on your specific data. The more variability there is in your data (object shapes and packing, background, noise, signal variation, aberrations), the more training data (in the form of a wide range of examples) is necessary so that the network can learn to perform accurate predictions.
We have often seen good results from as few as 5-10 image crops each having 10-20 annotated objects (in 2D), but your mileage may vary substantially. You can always start with a small training dataset, inspect/curate the results and iterate.
Furthermore, one can/should always use data augmentation to artificially inflate the training data by adding training images with plausible appearance variations. What plausible means depends on the data at hand, but some operations (random flips and rotations, intensity shifts) can be used in most cases and are demonstrated e.g. in the training example notebook.
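A minimal NumPy sketch of such an augmentation step is shown below. Flips and 90-degree rotations are applied identically to the image and its label mask, whereas the intensity change is applied to the image only. The function name `augment` and the intensity ranges are illustrative assumptions, not prescribed values.

```python
import numpy as np

def augment(image, labels, rng):
    """Random 90-degree rotation and flips (image + labels), then intensity jitter."""
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    image, labels = np.rot90(image, k), np.rot90(labels, k)
    for axis in (0, 1):
        if rng.random() < 0.5:
            image, labels = np.flip(image, axis=axis), np.flip(labels, axis=axis)
    # Intensity scaling and shift affect the image only, never the labels.
    image = image * rng.uniform(0.6, 2.0) + rng.uniform(-0.2, 0.2)
    return np.ascontiguousarray(image), np.ascontiguousarray(labels)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
lbl = np.zeros((32, 32), dtype=np.uint16)
lbl[2:10, 3:9] = 1
lbl[20:30, 15:25] = 2

aug_img, aug_lbl = augment(img, lbl, rng)
print(aug_img.shape, np.unique(aug_lbl))
```

Note that a 90-degree rotation swaps the image dimensions, so this sketch keeps the shape unchanged only for square images.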
The image annotations (also known as label images or label masks) should be integer-valued (e.g. 8-bit, 16-bit, 32-bit) TIFF files where all background pixels have value 0 and each object instance is represented by an area/volume filled with a unique integer value. It does not matter what the values are and they do not need to be consecutive. Please note that a foreground/background segmentation mask, where all object instances are denoted by the same value, is not sufficient for StarDist training.
Note that for visualization purposes, label images are often displayed with each object instance in a different color (to tell them apart) on a black background; this is the result of applying a look-up table (e.g. Glasbey on dark in Fiji). As mentioned above, the label masks for StarDist must be integer-valued TIFF files and not RGB files, i.e. the specific color does not matter.
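The following sketch checks a label image against the requirements described above. `check_label_image` is a hypothetical helper; in particular, the "only one foreground value" test is just a heuristic for spotting binary masks, since an image containing a single object is of course legitimate.

```python
import numpy as np

def check_label_image(labels):
    """Return a list of likely problems with a label image (empty list = looks fine)."""
    labels = np.asarray(labels)
    if not np.issubdtype(labels.dtype, np.integer):
        return ["pixel values are not integers"]
    problems = []
    if labels.min() < 0:
        problems.append("negative label values")
    n_objects = np.count_nonzero(np.unique(labels))
    if n_objects == 0:
        problems.append("no objects (all pixels are background)")
    elif n_objects == 1:
        problems.append("only one foreground value (binary mask instead of instance labels?)")
    return problems

good = np.array([[0, 1, 1], [0, 0, 7]], dtype=np.uint16)     # ids need not be consecutive
binary = np.array([[0, 255, 255], [0, 0, 255]], dtype=np.uint8)
print(check_label_image(good))
print(check_label_image(binary))
```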
In 2D, there are several options, among them Fiji, QuPath, and Labkit. Although each of these provides decent annotation tools, we currently recommend using Labkit for its easy label export. Please read here for more detailed instructions on how to use Labkit to generate annotations.
Here is some advice for exporting annotations to a label image from different tools (see here for a list of recommended tools).
- Fiji: Use this script to convert annotations to label images.
- Labkit: Please read this.
- QuPath: See this post to get started.
First, you can take a look at the existing pretrained models and inspect the images they were trained on, to get an idea if one of them might be suitable for your data. At the moment, you can find an overview of pretrained models here and here, including links to the training datasets. Furthermore, our example notebooks also demonstrate how to show a list of the available pretrained models.
If you found a promising pretrained model for your data, it is probably easiest to quickly try it out with our Fiji plugin and manually inspect if the results are plausible. If that’s the case, you may also want to quantitatively evaluate the results.
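A common way to evaluate quantitatively is to match predicted objects to ground-truth objects by their intersection over union (IoU) and then count true positives, false positives, and false negatives. The sketch below is a simplified NumPy version of this idea, assuming you have a ground-truth label image for comparison; StarDist ships its own matching utilities, which should be preferred for real evaluations. The function name `match_at_iou` is hypothetical, and the threshold is assumed to be at least 0.5 so that each object can match at most once.

```python
import numpy as np

def match_at_iou(y_true, y_pred, thresh=0.5):
    """Count matches between two label images; a match needs IoU > thresh (>= 0.5)."""
    true_ids = np.unique(y_true)
    true_ids = true_ids[true_ids != 0]
    pred_ids = np.unique(y_pred)
    pred_ids = pred_ids[pred_ids != 0]

    tp = 0
    for t in true_ids:
        t_mask = y_true == t
        # Only predicted objects that overlap this ground-truth object can match.
        for p in np.unique(y_pred[t_mask]):
            if p == 0:
                continue
            p_mask = y_pred == p
            iou = (t_mask & p_mask).sum() / (t_mask | p_mask).sum()
            if iou > thresh:
                tp += 1
                break

    fp, fn = len(pred_ids) - tp, len(true_ids) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}

# Toy example: one object found perfectly, one missed, one spurious prediction.
y_true = np.zeros((6, 6), dtype=int)
y_true[0:2, 0:2] = 1
y_true[4:6, 4:6] = 2
y_pred = np.zeros((6, 6), dtype=int)
y_pred[0:2, 0:2] = 5
y_pred[0:2, 4:6] = 7

stats = match_at_iou(y_true, y_pred)
print(stats)
```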
Besides being rather robust to intensity changes, our pretrained models are able to segment objects with a fair range of sizes. Please take a look at the respective training datasets to get an idea of the object size variations that the model should be able to handle. In the future, we might provide additional metadata for each pretrained model to help you with that. Also please have a look at this related question.
If your images contain relatively large objects and you observe lots of over-segmentation mistakes (i.e. several smaller objects predicted instead of an expected large one), you should try to reduce the pixel resolution of the image before applying StarDist.
Unfortunately not, but we would like to provide one at some point. A major issue is the lack of available training data.
There are no immediate plans at the moment, but we can relatively easily be persuaded to add new ones given a common use case and the availability of suitable training data.