Math Analysis assistance requested for determining the amount of expression on image

In the last two months we’ve been working with a mostly volunteer team on helping out a grand endeavour, completely repairing the human heart. The part our team has been working on has now hit a roadblock. We’d like to ask for some help from a willing mathematician with image analysis experience to get us back on track. Can you help?

We’ve been trying to analyze the tissue from heart valves to determine the quality of this heart valve. What we have is thin slices of heart tissue that have been coloured with 2 different stains. The tissue is so thin that it is transparent under the microscope, so we have coloured it. One colour measures the protein that we are looking for and the other colours all tissue so that we can see the contours of the tissue. We think we can determine the quality of the tissue by verifying how widespread this protein has gone. More of it means a further deterioration of the valve.

As a team we have used a few existing analyses, and transported those into Knime so we can reuse them. Unfortunately we are not getting exactly what we need and now need to understand the mathematics of it all in order to tweak and tune our processes. We have 2 software developers on board and a tissue engineer, but are lacking the mathematical knowledge to get this analysis tuned. Below you will find as brief as possible an explanation of our process and a whole load of questions. Would be very helpful if someone can take the time to help us through this!

The chemical process
Normally, a slide is washed in an antibody (the Primary Antibody) that specifically binds to the protein in which we are interested. After the excess liquid is removed, the slide is then washed with a different antibody (the Secondary Antibody) which is tagged with a chemical to provide another color (in this case, Alkaline Phosphatase, in purple), which will attach to the Primary Antibody and make the protein visible. After both antibody stains, the slide is then washed with the counterstain (in this case, Nuclear Fast Red, the magenta color, which is used to see cell nuclei, but colors all the biological tissue). The slide is then washed of the excess stain, dehydrated, and sealed with a slide cover. The first quadrant below is the result of this.

Our analysis
We ran a grayscale histogram of the full image and then did a colour deconvolution, ported from Gabriel Landini’s awesome work. The result of that is in the second, third and fourth quadrants. We got to that by selecting two small regions from the full colour image in order to get reference values. These regions, however, always have both colours present in them. We got a decent separation from that, so we ran the histograms on that, but they look confusing to us:

Why is the histogram curve for expression spread out wider over all of the intensities than the original histogram curve? Why did the peak of counterstain go to the right of the original curve? Is this to do with the method we chose for the grayscaling?

Improvement expected
We thought we could improve the process by taking deconvolution values from single stained images. Our tissue engineer took a sample with the expression stain and a sample with the counterstain.

We then ran that through the Colour Deconvolution algorithm. Suddenly we get a far worse colour deconvolution, completely opposite from what we expected. The third image, the remainder, contains many more pixels. What are we doing wrong here? Choosing the wrong location in the image?

The histogram for this colour deconvolution also confirms that we have something worse instead of better.

Basically we are playing with forces we don’t completely comprehend and would love for someone to educate us, or even better, help us out with this. Are you that person?

Look forward to hearing from you!

Martin van Dijken
M +31 (0)6 26 144 223

P.S. I am also crossposting this on the Knime forum as this concerns that tool as well:

Is there any chance you could share an original image or subsection of an original image? Preferably 3, one of each individual stain and part of the combined. There are various hosting options, including Google Drive that can handle fairly large files (~14GB).

I understand these things are not always sharable, but in case it is, there are quite a few people on the forum interested in stain separation:
Crazy long post, but to give you an idea:

Extra discussion if you are interested in deconvolution.


Thanks for the links, I will definitely read up on those! I have asked our Tissue engineer to check with his Uni whether the shots can be shared in a higher resolution than we have done so far. Hope to get back with positive news on this!

A couple of comments:

  1. Sadly, it is not possible to quantify the expression of the protein with the method described. Standard immunohistochemistry indeed uses a primary antibody, then a secondary antibody with an enzyme that precipitates a chromogen. That is not a stoichiometric reaction by any means. There are several amplification steps and reactions going on between the primary Ab binding and the chromogen precipitating that are virtually impossible to control. Sorry if this sounds like “told you so” but there is a reason for that warning in the Colour Deconvolution plugin dialog!
    Most likely the counterstain is also non-stoichiometric (like most dyes). Take IHC as a “yes/no” test: a “can I detect antigen x, and if so, where is it?” type of question. Very useful for some things, like tumour typing, but trying to quantify intensities and ratios of intensities from IHC is not a good idea. One would have a very hard time trying to convince a clued-up referee or colleague that the approach is sound and reproducible, regardless of how much imaging and statistical machinery is thrown at it.

  2. Can I ask what are those histograms plotting? Is it the greyscale intensity of each channel? If so that will not tell where the dye is, or if it is co-localising, but only the distribution of intensities in the image. I wonder what were you expecting to see in the histograms? (maybe I am not understanding exactly what is shown, apologies if that is the case).

  3. I think another problem might be that the test images to define the dyes seem OK, but the image you are analysing does not: check the background of the pink single-dye image versus the lumen of the blood vessel in the third panel (looks grey to me, not white). I think the CD algorithm in that image “finds” a bit of all dyes contributing to the darkness of the background because the background is expected to be white. Also look at the tear near the bottom of the colour image, where you can see through to the background: it is not white but some bluish colour (some residual stain?). Doing proper background correction at image acquisition time might help, e.g. the “a priori” method using transmittance as described here:
    Hope it is useful.
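
To illustrate the background correction idea in point 3, here is a minimal sketch of a transmittance-based correction. It assumes an empty-field (brightfield) reference image and, optionally, a darkfield image taken with the same settings. The function name `correct_background` and the sample values are made up for illustration, and the formula is the usual flat-field form, which may differ in detail from the linked method.

```python
def correct_background(specimen, brightfield, darkfield=None, white=255):
    """Rescale one channel so that an empty field maps to `white`.

    All images are equally sized 2D lists of intensities (hypothetical
    inputs: one colour channel at a time).
    """
    corrected = []
    for y, row in enumerate(specimen):
        out_row = []
        for x, value in enumerate(row):
            dark = darkfield[y][x] if darkfield is not None else 0
            span = brightfield[y][x] - dark
            # Transmittance: fraction of the illumination that got through.
            t = (value - dark) / span if span > 0 else 1.0
            t = min(max(t, 0.0), 1.0)  # clamp noise outside [0, 1]
            out_row.append(round(t * white))
        corrected.append(out_row)
    return corrected


# A greyish, unevenly lit background (200 and 180) becomes white (255).
specimen = [[200, 50], [180, 45]]
brightfield = [[200, 200], [180, 180]]
print(correct_background(specimen, brightfield))  # [[255, 64], [255, 64]]
```

After such a correction the empty background carries no optical density, so colour deconvolution has nothing to attribute to the dyes there.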


That is what I was assuming. What it seems to show is that there are fewer pixels around the peak in the pink stain with the second separation, and more white pixels. That matches up with the quad of images above, since the stain separation removed the entire purple/pink overlap area in the second set.

These parts make sense, since the original total grayscale intensity (black curve?) is being distributed among both purple and pink color vectors. Each of the darkest pixels has contributions from both vectors, so I would think the intensities of each stain would be less.

I am a bit more surprised that the purple curve goes farther to the left (darker) than the original grayscale curve. Probably my lack of experience with the math of color decon though.

Besides the important comments of @Research_Associate and @gabriel, I would like to add some notes, especially regarding the histogram questions.

In both of your histogram plots you compare the histogram of an RGB color image with the grayscale histograms of the channels resulting from CD (color deconvolution).

What is the idea of such a comparison? Are you trying to estimate the quality of the CD by this comparison?

  1. The displayed histogram of the RGB image is a histogram of grayscale values calculated from the local RGB pixel values which is equivalent to the image intensity histogram.
  2. This intensity histogram has no linear connection to the histograms of the CD channels (because the intensity is a local average of the RGB information). Therefore it makes no sense to ask something like ‘Why is the distribution A different from distribution B?’
  3. In addition, the CD approach calculates ‘stain concentrations’ in log-space. The resulting information is usually computed as a 32-bit number and, in your case, transformed back using exp() and rescaled to the 8-bit range.
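
To make the log-space step in point 3 concrete, here is a minimal single-pixel sketch of colour deconvolution. It is not the plugin's code: the stain vectors are hypothetical, the third vector is simply taken as the cross product of the first two, and base-10 logarithms are used (the base is a convention and only changes the scale).

```python
import math


def optical_density(rgb, white=255.0):
    # Beer-Lambert: absorbance is the negative log of transmittance.
    return [-math.log10(max(c, 1e-6) / white) for c in rgb]


def normalise(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def det3(m):
    # Determinant of a 3x3 matrix as a scalar triple product of its rows.
    return sum(m[0][i] * c for i, c in enumerate(cross(m[1], m[2])))


def unmix(rgb, stains):
    """Solve od = c1*s1 + c2*s2 + c3*s3 for the stain amounts (Cramer's rule)."""
    od = optical_density(rgb)
    d = det3(stains)
    return [det3([od if j == i else stains[j] for j in range(3)]) / d
            for i in range(3)]


def to_8bit(amount):
    # Back-transform a stain amount to a grey value: more stain = darker.
    return round(255 * 10 ** (-amount))


# Hypothetical OD vectors (R, G, B absorbance) for the two stains.
s1 = normalise([0.2, 0.8, 0.4])
s2 = normalise([0.6, 0.7, 0.2])
stains = [s1, s2, normalise(cross(s1, s2))]

# Build a pixel containing 0.5 of stain 1 and 0.3 of stain 2, then unmix it.
od = [0.5 * a + 0.3 * b for a, b in zip(s1, s2)]
pixel = [255 * 10 ** (-v) for v in od]
amounts = unmix(pixel, stains)
print([round(a, 3) for a in amounts])
print([to_8bit(a) for a in amounts])
```

The stain amounts add up in optical density, not in grey values, which is one reason the grey values of the deconvolved channels are not a simple function of the image intensity.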

Comparing the histograms is like assuming a simple mathematical transformation between the CD channels and the average RGB-intensity. But there isn’t such a connection. The distribution of the local RGB ratios strongly influences the intensity histogram.

E.g. if a sequence of pixels has the following pixel values in channel I
{ 10, 12, 12, 10, 11, 11, 11, 9, 10, 11, 13, 13, 12, 12, 13, 14, 12, 13, 12, 14, 14, 15, 13 }
And in the second channel the same pixel locations have the pixel values in channel II
{ 10, 8, 8, 10, 9, 9, 9, 11, 10, 9, 7, 7, 8, 8, 7, 6, 8, 7, 8, 6, 6, 5, 7 }

The histograms of channel I and II are quite similar but shifted locally.
The resulting intensity histogram of the average channel values is totally different from the channel histograms. It is a single peak at intensity 10 !

=> The channel histograms cannot be derived from the intensity histogram.
=> Quite different channel histograms can lead to identical intensity histograms. The connection between intensity and channel histograms is not unique.

Besides that, I am a bit confused by
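
The example above can be checked with a few lines of Python. The two channel sequences are the ones listed; the flat channels at the end are an extra, made-up pair added to show a second set of channels with the same intensity histogram.

```python
from collections import Counter

channel_1 = [10, 12, 12, 10, 11, 11, 11, 9, 10, 11, 13, 13,
             12, 12, 13, 14, 12, 13, 12, 14, 14, 15, 13]
channel_2 = [10, 8, 8, 10, 9, 9, 9, 11, 10, 9, 7, 7,
             8, 8, 7, 6, 8, 7, 8, 6, 6, 5, 7]

# Intensity as the per-pixel average of the two channels.
intensity = [(a + b) / 2 for a, b in zip(channel_1, channel_2)]

print(sorted(Counter(channel_1).items()))   # spread over 9..15
print(sorted(Counter(channel_2).items()))   # spread over 5..11
print(sorted(Counter(intensity).items()))   # one bin: [(10.0, 23)]

# Two completely flat channels give exactly the same intensity histogram.
flat_1 = [10] * len(channel_1)
flat_2 = [10] * len(channel_2)
flat_intensity = [(a + b) / 2 for a, b in zip(flat_1, flat_2)]
print(Counter(flat_intensity) == Counter(intensity))  # True
```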

and by the channel naming ‘Fibers KNIME’ in the histogram plots.
I think there is only 1 antibody stain and the counterstain, correct?


Coming back to the main question:

Do you need to quantify the antibody stain, or is a Yes/No, Present/Not Present, 👎/👍 answer (as Gabriel pointed out) sufficient?

Do you need the counterstain to classify and select/reject tissue regions or is it used just as a rough estimation of tissue localization and tissue area?

Hello everyone and thank you for the feedback you have provided so far.

I am the tissue engineer with whom sunsear is working. He is lending his expertise to develop our method and hopefully expedite much of our analysis.

To answer a main question of several commenters, we are not trying to fully quantify stain or antigen concentration. We are trying to automate the “Yes/No” process to give us the percentage of “Yes” pixels in a given image, and want to use color deconvolution to help us determine where the threshold between visually distinct expression and counterstain (“yes” and “no”, respectively) should be.
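
As a rough illustration of that goal, the percentage of “Yes” pixels could be computed along these lines. This is only a sketch: it assumes two 8-bit deconvolved channels in which darker means more stain, and the function name, thresholds and sample values are hypothetical. Choosing the thresholds is exactly the open question.

```python
def percent_positive(expression, counterstain, expr_threshold, tissue_threshold):
    """Percentage of tissue pixels that count as "Yes" for expression.

    Both inputs are equally sized 2D lists of 8-bit values from the
    deconvolved channels (lower value = more stain). A pixel is tissue
    if it is "Yes" for expression or dark enough in the counterstain.
    """
    tissue = positive = 0
    for e_row, c_row in zip(expression, counterstain):
        for e, c in zip(e_row, c_row):
            is_positive = e < expr_threshold
            if is_positive or c < tissue_threshold:
                tissue += 1
                positive += is_positive
    return 100.0 * positive / tissue if tissue else 0.0


expression = [[250, 120], [90, 240]]
counterstain = [[250, 100], [200, 110]]
# One background pixel, two "Yes" pixels, one counterstain-only pixel.
result = percent_positive(expression, counterstain, 150, 180)
print(f"{result:.1f}% of tissue pixels are 'Yes'")  # 66.7%
```

Dividing by tissue pixels rather than by all pixels keeps the empty background from diluting the percentage.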

For my purposes, we are using the counterstain to visualize the tissue and provide contrast to our immunostain.

Thanks again to everyone for your interest and assistance.


Which of these did you choose?

If there is any chance you can share an actual image (or a tiff export of a subsection, now that that is easy in QuPath), I’m sure people might offer more help. Without that…

We chose the second to last option (From ROI).

You might try something like this.

Use Color Decon

Make Binary

Analyze Particles

Comparing the areas may give you an idea of the relative proportions of your stains.
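
For anyone wanting to reproduce the idea outside ImageJ, here is a small stand-in for the “Make Binary” and “Analyze Particles” steps, applied to one deconvolved channel. It is a sketch, not ImageJ's implementation: it assumes an 8-bit channel where dark means stained, uses a fixed hypothetical threshold, and treats particles as 4-connected for simplicity.

```python
def particle_areas(channel, threshold, min_area=1):
    """Threshold a stain channel and return the pixel area of each particle."""
    h, w = len(channel), len(channel[0])
    seen = [[False] * w for _ in range(h)]
    areas = []
    for y in range(h):
        for x in range(w):
            if seen[y][x] or channel[y][x] >= threshold:
                continue
            # Flood fill from this stained pixel to collect one particle.
            stack, area = [(y, x)], 0
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                area += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and channel[ny][nx] < threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if area >= min_area:
                areas.append(area)
    return areas


stain = [[50, 50, 255, 255],
         [50, 255, 255, 40],
         [255, 255, 255, 40]]
areas = particle_areas(stain, threshold=128)
print(areas, sum(areas))  # [3, 2] 5
```

Running the same function on each stain channel and comparing the summed areas gives the relative proportions mentioned above.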

P.S. I get much better results segmenting and measuring stains using trainable weka segmentation.

Hello there everyone,

Thanks for all of your help and input. It’s been a bit slower this week as sunsear is on holiday. I am working on getting some higher-quality image subsections and will upload them when I get the all clear.

Good afternoon everyone,

I am able to provide some higher-resolution .tif files, as these are not being used in studies. Attached are the higher-resolution images of the tissues stained with only nuclear fast red or only AP, and an image from a control stained with both. Hopefully this gives everyone more to work with.

Deconvolution Test Counterstain.tif (8.2 MB) Deconvolution Test Stain.tif (8.2 MB) Deconvolution Test Image.tif (7.7 MB)