I have been using a well-known image-analysis software package for a long time, for work on biomarker quantification and cancer outcome.
Along the way I ran into some serious computational and scientific challenges in how the software handles the problem, and I would like to open a discussion in this forum about some of them. The intended purpose of the discussion is 100 % scientific: to learn and to share experiences.
Most images come as pixels.
For illustration, let us assume we have a DAPI image:
a dark pixel is background, and any other pixel is part of a nucleus, if one assumes that the acquisition process is totally free of noise (not too realistic, I know).
Biologists (pathologists, to be more precise) like to see the image transformed into quasi-pathological objects that make sense for biological interpretation: nuclei, cytoplasm …
The most common approach is segmentation: group nearby, similar pixels into one object, then classify the objects into a nucleus class, a background class …
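As a minimal sketch of this pipeline (with made-up intensity values and an arbitrary threshold of 3, both purely illustrative), threshold-then-label is the simplest form of such a segmentation:

```python
import numpy as np
from scipy import ndimage

# Toy DAPI-like image (hypothetical values): dark background ~0, two bright blobs.
img = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0, 0],
    [0, 9, 9, 0, 7, 8],
    [0, 0, 0, 0, 8, 9],
    [0, 0, 0, 0, 0, 0],
], dtype=float)

# Step 1: threshold -- under the idealised "noise-free" assumption,
# any non-dark pixel belongs to a nucleus.
binary = img > 3

# Step 2: group connected foreground pixels into objects.
labels, n_objects = ndimage.label(binary)

print(n_objects)  # 2 candidate nuclei
```

In real images the threshold itself (Otsu, adaptive, …) already injects a choice that propagates into every downstream statistic.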
This transformation changes the biomarker distribution (histogram):
the mean and the variance computed at the pixel level are not at all equal to the ones computed at the object level …
So we end up in this situation: the objects have a biological meaning, but the biomarker quantification may be meaningless, because segmentation introduces variation that can destroy the real biological variation related to the outcome.
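A tiny numerical illustration of the pixel-level vs object-level discrepancy (all numbers are made up): when objects have unequal sizes, averaging per object re-weights the pixels, so the two means disagree.

```python
import numpy as np

# A 1-D strip of pixels with two "nuclei" of different sizes,
# labelled by a segmentation mask (0 = background).
intensity = np.array([5, 5, 5, 100, 110, 90, 100, 10, 10, 10, 10, 10], dtype=float)
labels    = np.array([0, 0, 0,   1,   1,  1,   1,  2,  2,  2,  2,  2])

# Pixel-level statistics over all foreground pixels.
fg = labels > 0
pixel_mean = intensity[fg].mean()   # every pixel weighted equally

# Object-level statistics: one summary value (the mean) per object,
# then the mean across objects -- every object weighted equally.
object_means = np.array([intensity[labels == k].mean() for k in (1, 2)])
object_mean = object_means.mean()

print(pixel_mean, object_mean)  # 50.0 vs 55.0 -- same pixels, different answer
```

The small object contributes 5/9 of the pixels but only 1/2 of the objects, which is exactly why the histogram shifts after segmentation.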
Think about Ki-67 and proliferation: how can one find the optimal segmentation to generate a surrogate value for proliferation quantification …
I have done many computations of this kind, for instance computing the correlation between the DAPI layer and the Ki-67 layer.
The correlation at the pixel level is not at all close to the one at the object level.
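The effect is easy to reproduce on synthetic data (everything below is simulated, not real DAPI/Ki-67 measurements): aggregating pixels into per-object means before correlating changes the Pearson coefficient, an instance of the well-known ecological-fallacy / aggregation problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-channel data: per-pixel "DAPI" and "Ki-67" intensities
# for 5 objects ("nuclei") of very different sizes.
sizes = [200, 150, 100, 5, 5]
labels = np.repeat(np.arange(1, 6), sizes)
dapi = rng.normal(100, 20, size=labels.size)
ki67 = 0.5 * dapi + rng.normal(0, 40, size=labels.size)

# Pixel-level Pearson correlation.
r_pixel = np.corrcoef(dapi, ki67)[0, 1]

# Object-level correlation: average each channel per object first,
# then correlate the 5 pairs of object means.
d_means = np.array([dapi[labels == k].mean() for k in range(1, 6)])
k_means = np.array([ki67[labels == k].mean() for k in range(1, 6)])
r_object = np.corrcoef(d_means, k_means)[0, 1]

print(r_pixel, r_object)  # the two coefficients are generally far apart
```

Averaging suppresses within-object noise, so the object-level correlation is driven by a handful of points and by object size, not by the per-pixel co-localisation.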
Some people will argue that the prediction model will decide which feature is the correct one. This idea is simply wrong: the model can show an unscientific mercy for the wrong feature.
The problem becomes even more amusing when one is dealing with multiplex (mPlex) imaging, with 3 biomarkers or more.
Even if we simplify the problem by ignoring the network aspect between the proteins (the biomarkers are used to track these proteins), one single segmentation cannot work for all these images …
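To make the "one segmentation cannot serve all channels" point concrete, here is a deliberately tiny toy (all channel values and the threshold are invented): each marker, thresholded on its own, produces a different foreground mask, and no single mask agrees with all three.

```python
import numpy as np

# Hypothetical 3-channel multiplex values for the same 6 pixels.
dapi = np.array([9, 9, 0, 0, 8, 0])
cd3  = np.array([0, 7, 7, 0, 0, 0])
ki67 = np.array([0, 0, 0, 6, 6, 6])

# A per-channel threshold "segmentation" gives three different masks.
masks = {name: ch > 3 for name, ch in
         [("DAPI", dapi), ("CD3", cd3), ("Ki67", ki67)]}

# Pixels that every channel would call foreground:
agreement = masks["DAPI"] & masks["CD3"] & masks["Ki67"]
print(agreement.any())  # False -- no pixel is foreground in all three channels
```

Whichever channel is segmented first (or last) imposes its object boundaries on the others, which is where the ordering bias asked about below comes from.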
Does anyone have an idea about the bias introduced by the first or the last segmentation?