IdentifyPrimaryObjects: different count for cropped version of image


I’m trying to make a pipeline to count cells that are HNA (a nuclear stain for human cells) positive in a graft where cells were injected into rat tissue. Each image of these grafts is made of many tiles stitched together, as the total magnification is 200X. As a result, there are often large areas outside of the graft that make the image files unnecessarily large. We cropped them so that CellProfiler would take less time to process all the images.

We noticed that the number of objects, however, changes between the original image and the cropped image. For example, for one image, the un-cropped image had about 300 objects identified while the cropped image had only about 200 objects identified. There shouldn’t be that much of a difference as the sections cropped out did not have more than 2 or 3 potential objects in them.

The resolution between cropped and un-cropped images is the same, so it is only the size of the image that is different. The pipeline consists of EnhanceOrSuppressFeatures, ApplyThreshold, and IdentifyPrimaryObjects. Do you know why this might be occurring? Any help would be appreciated! Please let me know if you need any additional information.


I’m somewhat confused about what you are saying your workflow is. Are you saying you are 1) taking the images, then 2) stitching them together, then 3) cropping the stitched section, 4) exporting a new, somewhat smaller stitched image, and 5) feeding the whole stitched image into CP?

If so, I have two guesses: either A) whatever program you’re doing the cropping/resaving in (what is it, btw?) is doing something not-nice to your file in steps 3 and 4, or B) the “empty regions” were changing the overall threshold selected in your IdentifyPrimaryObjects module.

I suspect it’s likely to be B: when you had the empty areas, the intensity threshold between “stuff” and “not stuff” was lower, because there were large areas of very low intensity. This could be either an advantage or a disadvantage depending on what your images look like; you just say that the number is different, not which one is a more accurate count of your desired cells.

I’m questioning why you’re stitching them at all before feeding them into CellProfiler though? It’s likely to run much faster on the unstitched images, and then you could do things like illumination correction that are hard-to-impossible on the stitched image.

When we take the images on the scope, you can see one field of view if you were to take a single image. But because the grafts are much larger than one single image, you have to estimate the length and width (in terms of fields of view or “tiles”) and then tell the program to take an image over that area. So the program/microscope goes and takes a bunch of single images, stitching them along the way to end up with one cohesive image at a certain magnification. So then these images are fed into CP. When we crop the stitched images, it’s just in Preview or Paint (we work on both Macs and PCs).

How does the intensity threshold affect the count? I’m a little confused on how that affects how many objects are found.

We considered having them as smaller images but there could be an issue with over counting or under counting. Like if there are two adjacent images and they are processed separately, and if we choose to include objects on the edges of images, then cells that are split between the edges of two adjacent tiles could be counted twice. And if we choose not to include objects on the edge, then we lose those objects. That’s the main reason why we hope that we can process the full, stitched image.

OK, first things first- it’s a very bad idea to do anything remotely quantitative in those kinds of programs, particularly Paint! I EXTREMELY strongly suggest that you use something designed for scientific images like FIJI; it’s free and it runs on both Windows and Mac and it will treat your data much more nicely. This probably explains why you’re seeing differences before and after cropping in your particular case.

The reason it might hypothetically also be because you’ve cropped out the empty regions is that the automatic threshold decision is trying to find the difference between regions where staining is present and regions where it is not. If you have 3 categories of intensity (call them 0, 0.5, and 1: no cells at all, cells with background fluorescence, cells positively stained) vs 2 categories (now you’ve lost 0 and only have 0.5 and 1: cells with background fluorescence, cells positively stained), you can see where it might draw a different dividing line in those two cases (say 0.5 in the first one and 0.75 in the second one).
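To make the three-category example concrete, here is a minimal Python sketch. It assumes a plain two-class Otsu-style split (pick the cut that maximizes between-class variance) applied to a made-up list of pixel values. The pixel counts and the helper names `otsu_cut` and `make_pixels` are invented for illustration, and CellProfiler's own thresholding methods differ in detail.

```python
from collections import Counter


def otsu_cut(pixels):
    """Two-class Otsu-style split: return the cut between two adjacent
    intensity values that maximizes the between-class variance."""
    counts = sorted(Counter(pixels).items())  # (intensity, count), ascending
    total = sum(n for _, n in counts)
    total_sum = sum(v * n for v, n in counts)
    best_cut, best_var = None, -1.0
    below_n, below_sum = 0, 0.0
    # Try a cut between every pair of adjacent distinct intensities.
    for (value, n), (next_value, _) in zip(counts, counts[1:]):
        below_n += n
        below_sum += value * n
        w0 = below_n / total            # fraction of pixels below the cut
        w1 = 1.0 - w0                   # fraction of pixels above the cut
        mu0 = below_sum / below_n       # mean intensity below the cut
        mu1 = (total_sum - below_sum) / (total - below_n)  # mean above
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_var = between_var
            best_cut = (value + next_value) / 2
    return best_cut


def make_pixels(n_empty, n_background, n_positive):
    """Hypothetical image: 0 = empty, 0.5 = background, 1 = stained."""
    return [0.0] * n_empty + [0.5] * n_background + [1.0] * n_positive


# Made-up proportions: cropping removes most of the empty (0) pixels only.
uncropped = make_pixels(6000, 3000, 1000)
cropped = make_pixels(500, 3000, 1000)

for name, pixels in [("uncropped", uncropped), ("cropped", cropped)]:
    cut = otsu_cut(pixels)
    foreground = sum(p > cut for p in pixels)
    print(name, "cut:", cut, "foreground pixels:", foreground)
```

With these made-up proportions the cut lands at 0.25 while the empty pixels dominate (so background-fluorescence pixels count as foreground) and at 0.75 once most of the empty pixels are cropped away. The exact numbers differ from the rough 0.5 and 0.75 quoted above, but the shift is the same idea.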

I definitely get what you’re saying re: missing or overcounting cells if you split them into the individual tiles, but splitting them allows you to do things like illumination correction (which is important for stitched images), it will probably run much faster, and if you do lose/gain some cells it should be in a way that’s consistent from tissue slice to tissue slice (I’m presuming you’re comparing between slices to see things like treatments that make grafts grow more/less well, etc). If you care more about cell count than cell area (which I presume since you don’t have a “Measure Area” step in your pipeline), you could even allow it to keep cells touching the edge but set a size filter at something like 60% the size of a “normal” cell, which should mean that for split cells one and only one side will usually be counted. Personally that would be how I would proceed, but without knowing more about exactly what you’re doing and the rationale behind it I can’t really make that call.
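The reasoning behind the ~60% size filter can be sketched in a few lines of Python. This is only an illustration, assuming a cell of "normal" area is cut in two by a tile edge. `fragments_counted` and its parameters are hypothetical names, not CellProfiler settings.

```python
def fragments_counted(split_fraction, normal_area=100.0, min_fraction=0.6):
    """How many pieces of one split cell survive a minimum-size filter.

    split_fraction: share of the cell that falls in the first tile (0..1).
    min_fraction:   size cutoff as a share of a normal cell's area.
    """
    first_piece = normal_area * split_fraction
    second_piece = normal_area - first_piece
    min_area = normal_area * min_fraction
    return sum(area >= min_area for area in (first_piece, second_piece))


# An 80/20 split: only the larger piece passes the 60% filter.
print(fragments_counted(0.8))                     # 1
# A 50/50 split: neither piece reaches 60%, so the cell is missed.
print(fragments_counted(0.5))                     # 0
# With a cutoff below 50%, the same 50/50 split is counted twice.
print(fragments_counted(0.5, min_fraction=0.4))   # 2
```

Any cutoff above 50% makes double counting impossible, since both pieces cannot each be more than half of the cell. The cost is that cells split close to the middle are dropped from both tiles, which is why one side is counted "usually" rather than always.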

I hope that makes sense; let me know if it doesn’t. Good luck!

I originally had cropped them in FIJI but I noticed that the resolution would change for the cropped image. Uncropped would be about 244 dpi but cropped would change to 96 dpi. Originally, I thought this was the reason why the counts might change, but once I used Preview to crop the image that had the ~100 cell difference, and the resolution was now unchanged, there was still a ~100 cell difference, with the count of the cropped image changing only slightly.

Okay, now I see what you mean by the threshold. But when we crop them, there are still areas that have ~0 intensity as we just do a simple rectangular crop. The amount of area with ~0 intensity definitely changes, though. Would this still have somewhat of a similar effect to what you described?

Yes, the cell count is more important for this set of images. The filter you suggested sounds like it could work, and perhaps if there are extra cells counted, it won’t matter too much in the long run as there are hundreds of cells. Thank you for the suggestion!

Resolution for a digital image really only matters when you’re trying to print it (and totally depends on the size you’re going to print), and there’s reason to believe ImageJ doesn’t actually faithfully report DPI (see this link). It’s not really a useful metric while your image is purely digital. Unless you were doing something other than just opening the file, drawing a box, hitting crop, then hitting “Save As TIFF”, I wouldn’t really worry about it. Far fewer quasi-untraceable errors are going to creep into your data if you use something designed for scientific images. You’ve presumably put a great deal of time and effort into your experiment at that point between caring for the animals, making your grafts, getting the tissue, imaging it, etc.; there’s no reason not to keep your data in its best possible form and make sure you get the most accurate results from all that hard work!

The threshold issue I described before could definitely still apply if there’s a little black border still remaining. For simplicity’s sake I was describing a case where there were only 3 intensity values (0, 0.5, and 1), but of course your image will have millions of pixels, and changing the balance of how many are in the “roughly 0” class vs the other classes could definitely have an effect on the threshold selection. This may or may not have actually been to your advantage: you say that the cell count dropped by 100, but not whether this is due to the loss of “real” cells or whether in the uncropped images you were picking up too many “cells” and they’re now gone. Did you output an image of what the program called as cells, and have you scored it to see which you think was more accurate?

Okay, and yeah we did just as you said with the cropping. We have done that and the count we think it is supposed to be is around 240 cells (though the uncropped image visually looks more accurate), so sort of in between uncropped and cropped counts. What we’re more concerned with is why the counts even change between uncropped and cropped images. We’re going to try cropping inside of the program that we take the images in, so maybe that will prevent any random changes in the data (if they are even there!).

I don’t think the changes are random; I’m almost certain they’re based on threshold selection. You can verify this by looking at the IdentifyPrimaryObjects result window (which reports the threshold used) or, if you exported the data to spreadsheets/database, by looking at what’s reported under Image_Threshold_FinalThreshold_(something). If that is in fact the case, where you do the cropping won’t matter at all.
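If you exported to spreadsheets, a short script can pull out the thresholds for each image. This is a minimal sketch, assuming the per-image CSV has one row per image and that the threshold column name contains `Threshold_FinalThreshold` (the exact prefix and object-name suffix depend on your pipeline). The file name in the comment, the `Nuclei` suffix, and the demo values are placeholders, not real output.

```python
import csv
import io


def final_thresholds(csv_file, marker="Threshold_FinalThreshold"):
    """Return, for each row (image), every column whose name contains
    `marker`, converted to float."""
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append({
            name: float(value)
            for name, value in row.items()
            if name and marker in name
        })
    return rows


# With a real export you would open the file instead, e.g. (hypothetical path):
#     with open("Image.csv", newline="") as f:
#         thresholds = final_thresholds(f)
# Here an in-memory CSV with invented numbers stands in for it.
demo = io.StringIO(
    "ImageNumber,Threshold_FinalThreshold_Nuclei\n"
    "1,0.12\n"
    "2,0.31\n"
)
for image_number, columns in enumerate(final_thresholds(demo), start=1):
    print(image_number, columns)
```

If the cropped and uncropped versions of the same image show clearly different values in that column, the count difference is coming from threshold selection rather than from the cropping program.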

I haven’t seen your pipeline so I don’t know what method you’re using for threshold selection, but it seems like whichever one you’ve chosen is not behaving robustly when there’s a change in the relative amount of empty space- which depending on how much space you’re cropping (is the cropped image 10% smaller than the original? 50%?) may or may not be surprising.

You can try playing with different thresholding methods to see if one behaves more robustly on your data; you can also try adding a minimum or maximum allowed threshold to limit the range of thresholds to choose between. You can do this iteratively in test mode, but since you say these take a long time to analyze, you could also follow a strategy I’ve used myself in the past:

-Create a pipeline that’s just whatever preprocessing you may do before your IdentifyPrimaryObjects (IPO) steps (if any), followed by 5 or 6 IPO modules, each creating an object with a slightly different name (e.g. “Cells_Otsu2Class”, “Cells_Otsu3Class”, etc.) and using a different thresholding method or variation.
-For each method add an “OverlayOutlines” and a “SaveImages” step so that you can save the overlay of the objects you’ve found by that method, and preferably also a single final “ExportToSpreadsheet” step so that you can extract the actual thresholds used.
-Give it the cropped and uncropped images to analyze, then walk away and let it run overnight or over a weekend. Then you can come back at your leisure and compare how accurate each method was on both the cropped and uncropped images.
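As for the minimum/maximum allowed threshold mentioned above, this is roughly what such bounds do to an automatically chosen value. It is a minimal sketch with made-up numbers; `clamp_threshold` is a hypothetical helper, not CellProfiler's code.

```python
def clamp_threshold(auto_threshold, lower=0.0, upper=1.0):
    """Force an automatically chosen threshold into [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return min(max(auto_threshold, lower), upper)


# Made-up automatic thresholds for the same field, before and after cropping.
for label, auto in [("uncropped", 0.25), ("cropped", 0.75)]:
    bounded = clamp_threshold(auto, lower=0.6, upper=0.9)
    print(label, auto, "->", bounded)
```

In this made-up case the uncropped threshold is pulled up from 0.25 to 0.6 while the cropped one stays at 0.75, so the two versions of the image end up segmented much more similarly. The bounds themselves still have to be chosen by looking at your own images.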

Hopefully you find a method that works well for cropped and uncropped images, but even if not you’ll be able to make an informed choice about what does the best thresholding job for your experiment. You can then run your larger image set using that thresholding method, plus then whatever other downstream steps you’d like.