Analyzing large images takes long

Hi there,
I am analyzing relatively large images (16bit, 2160x2160, 9132KB - generated by ImageXpress/Mol Dev) and it takes forever for CellProfiler to run a pipeline (>5min for 1 set of 3 images). Does it make sense to you?
Thanks for your advice,

Hi Noga,

Unfortunately, we can’t answer your question without also seeing the pipeline that you are using, plus the 3 images (I assume there are other wavelengths involved?). Could you upload them to this forum thread? If you need to make additional posts in this thread to upload large images, then go ahead and do so.


Hi Mark,
This is a half-baked pipeline, I am just learning…
Indeed, the 3 images are DAPI, GFP and TRITC. The goal of the pipeline is to record the intensity of the GFP in the areas defined by the TRITC channel.
I uploaded the CP file and 2 more images for your review.
Thanks for your help,

Elazar pipeline_1.cp (14.4 KB)

Hi Noga,

Apologies for the slow reply. I am attaching an updated pipeline for you to try. You had what I believe were extraneous IdentifyPrimaryObjects modules, which are computationally expensive and slowed CP down. Take a look at the pipeline and see if it does what you want. There appear to be GFP positive cells which don’t have obvious nuclear counterparts, which seems odd to me, but maybe that makes sense for this assay.

As for the speed issue, my pipeline takes about 1 minute to run for this single cycle. This is typical for CellProfiler. For high-throughput use, we utilize a computing cluster here at Broad and thus can parallelize the computations by 1000-fold or more. A newer version of CellProfiler, not released yet, can run multiple parallel threads on a single machine utilizing multiple CPU cores. That does speed up processing significantly, but only on the order of a factor of 2-3 depending on your setup, I think. For HT screening, a cluster is the best solution (though often not trivial to set up).
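For a rough feel of what those speed-ups mean in wall-clock time, here is a back-of-the-envelope sketch in Python. The 1 minute per cycle and the 2-3x and 1000x factors are taken from the paragraph above; the batch of 384 image sets is purely a hypothetical example, the function name is made up for illustration, and perfect scaling is assumed, which real setups rarely achieve.

```python
def estimated_hours(n_image_sets, minutes_per_set=1.0, speedup=1.0):
    """Rough wall-clock estimate in hours, assuming perfect (ideal) scaling."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return n_image_sets * minutes_per_set / speedup / 60.0


n_sets = 384  # hypothetical batch size, not from the thread

scenarios = [
    ("single process", 1),
    ("multi-core, ~2x", 2),
    ("multi-core, ~3x", 3),
    ("cluster, ~1000x", 1000),
]
for label, factor in scenarios:
    hours = estimated_hours(n_sets, speedup=factor)
    print(f"{label:>16}: {hours:.2f} h")
```

The point is only the order of magnitude: a 2-3x gain on one machine helps, but it does not change a multi-hour batch into a multi-minute one the way a cluster can.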

Please also see this FAQ for some speed-up tips.

Elazar pipeline_2_DLogan.cp (9.65 KB)

P.S. The TRITC channel is very dim, although if you brighten it up (in CP, right-click TRITC image > Image Contrast > Log Normalized) a lot more information about the cytoplasm outline appears. I tried some tricks in CP to do this (without much success, and it’s not in the pipeline I uploaded), but if you can brighten this up biologically or via imaging, that would make the IdentifySecondaryObjects step much easier.

Dear David – thanks for your reply and useful advice.
Please see attached a more advanced pipeline I generated. Its description is as follows:

Load images:
Identify primary objects:
Identify secondary objects:
Low cytoplasm regions
High cytoplasm regions (LC3-TRITC channel)
Identify tertiary objects:
Record GFP intensity in the identified tertiary objects:
GFP intensity in Cytoplasm
GFP intensity in LC3inCytoplasm
Calculate ratio between

A couple of questions:

  1. The intensity values I get for “IntegratedIntensityEdge” (tens) or “IntegratedIntensity” (hundreds) are not related to the real image analysed (thousands). Can you explain?
  2. I wish to get an average value for all the objects in an image – how can this be achieved?
  3. Is there a way to calculate the ratio “LC3inCytoplasm/Cytoplasm” in the pipeline, rather than post pipeline using excel?

Thanks so much,
Elazar pipeline_tertiary objects_nucleiChildre.cp (13.4 KB)

I assume what you are seeing is the fact that the raw image intensity reflects the image bit-depth (e.g., 0-255 for 8-bit, 0-65535 for 16-bit, etc.), whereas the intensities in CellProfiler are different (and lower). If so, the reason is that CellProfiler scales the pixel values of all images to the range 0 to 1 by dividing all pixels in the image by the maximum possible intensity value, so this should be kept in mind when making intensity measurements.
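A minimal sketch of that rescaling in Python, assuming the divisor is simply the largest value the bit-depth can hold (2^bits - 1). The function name and the sample value of 3000 are illustrative only, and the exact scaling CellProfiler applies can depend on the file metadata and load settings.

```python
def rescale_to_unit(pixel_value, bit_depth):
    """Map a raw integer pixel value onto the 0-1 range for a given bit-depth."""
    max_value = 2 ** bit_depth - 1  # 255 for 8-bit, 65535 for 16-bit
    if not 0 <= pixel_value <= max_value:
        raise ValueError("pixel value outside the range of this bit-depth")
    return pixel_value / max_value


print(rescale_to_unit(255, 8))     # brightest 8-bit value
print(rescale_to_unit(65535, 16))  # brightest 16-bit value
print(rescale_to_unit(3000, 16))   # a raw value "in the thousands" (hypothetical)
```

Under this assumption a raw 16-bit value in the thousands ends up as a few hundredths on the 0-1 scale, which is why per-pixel numbers reported by CellProfiler look so much smaller than the raw image values.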

You can use the MeasureImageIntensity module to obtain per-image intensity statistics for only the pixels within an object of choice.

You can use the CalculateMath module to compute a ratio of two per-object measurements. You just need to choose which intensity measurements you’re interested in getting the ratio of, e.g., mean, median, etc.


Hi Noga,

I’m not quite sure what you mean by “real image analysed (thousands)” (is that a measure from CellProfiler or some other software?), but in any case the confusion may be that CellProfiler rescales all images, regardless of bit-depth, to pixel values from 0-1. This standardizes images from a variety of sources: your camera might report a maximum value of 255 or 65535, depending on whether it is 8-bit or 16-bit, respectively, but these values represent the same intensity, i.e. the maximum value.

And according to the Help for MeasureObjectIntensity,

[quote]IntegratedIntensityEdge: The sum of the edge pixel intensities of an object.
IntegratedIntensity: The sum of the pixel intensities within an object.[/quote]

So given what I said above, each pixel has an internal intensity in CP from 0-1. IntegratedIntensityEdge then sums up these values for the perimeter pixels (say 200 pixels for a particular object), and so the value might be in the 10s. IntegratedIntensity sums up these values for all the pixels in the object (say 5000 pixels for a particular object), and so the value might be in the 100s.
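To make the summing concrete, here is a toy sketch in pure Python. The 5x5 image, the uniform intensity of 0.5 and the 3x3 object are invented for illustration, and the edge rule used here (an object pixel with at least one 4-connected neighbour outside the object) is an assumption, not necessarily the exact rule CellProfiler applies.

```python
def is_edge(mask, r, c):
    """True if object pixel (r, c) touches the image border or a non-object pixel."""
    rows, cols = len(mask), len(mask[0])
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if not (0 <= rr < rows and 0 <= cc < cols) or not mask[rr][cc]:
            return True
    return False


def integrated_intensity(image, mask):
    """Sum of rescaled (0-1) intensities over every pixel of the object."""
    return sum(
        image[r][c]
        for r in range(len(mask))
        for c in range(len(mask[0]))
        if mask[r][c]
    )


def integrated_intensity_edge(image, mask):
    """Sum of rescaled (0-1) intensities over the object's edge pixels only."""
    return sum(
        image[r][c]
        for r in range(len(mask))
        for c in range(len(mask[0]))
        if mask[r][c] and is_edge(mask, r, c)
    )


# Toy data: every pixel has intensity 0.5; the object is the central 3x3 block.
image = [[0.5] * 5 for _ in range(5)]
mask = [[1 <= r <= 3 and 1 <= c <= 3 for c in range(5)] for r in range(5)]

print(integrated_intensity(image, mask))       # 9 object pixels
print(integrated_intensity_edge(image, mask))  # 8 edge pixels (centre excluded)
```

Both measurements grow with the number of pixels summed, so they scale with object size as well as brightness; a mean intensity is the size-independent alternative.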

Does that make sense?

In ExportToSpreadsheet, click the “Calculate the per-image mean values…” button (median is there, too). Then in the XX_Image.csv (by default, “DefaultOUT_Image.csv”), you will get values beginning with “Mean_…”, like “Mean_LC3inCytoplasm_Intensity_IntegratedIntensityEdge_GFP”, which are averaged across all the objects in each image.
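If you ever want to check those per-image means by hand from the per-object output, the calculation is just a group-by-image average. A minimal pure-Python sketch, where the (image number, value) pairs are made-up numbers and the function name is hypothetical:

```python
from collections import defaultdict
from statistics import mean


def per_image_means(records):
    """Average a per-object measurement over the objects of each image.

    `records` is an iterable of (image_number, value) pairs.
    """
    grouped = defaultdict(list)
    for image_number, value in records:
        grouped[image_number].append(value)
    return {image_number: mean(values) for image_number, values in grouped.items()}


# Hypothetical per-object measurements: three objects in image 1, two in image 2.
records = [(1, 10.0), (1, 20.0), (1, 30.0), (2, 4.0), (2, 6.0)]
print(per_image_means(records))
```

Each image gets one value, computed only from the objects found in that image.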

Yes, you can add a “CalculateMath” module after the measurements you want to use, choose “Divide”, then choose the type as “Object” (since they are per-object measurements, not per-image measurements), and then input the two measures you desire.
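Conceptually, the division is done object by object. The sketch below shows that idea in plain Python; it is not CellProfiler's implementation, the numbers are invented, and returning NaN for a zero denominator is a choice made here for illustration.

```python
def per_object_ratio(numerators, denominators):
    """Divide two per-object measurement lists element by element."""
    if len(numerators) != len(denominators):
        raise ValueError("both measurements must cover the same objects")
    ratios = []
    for num, den in zip(numerators, denominators):
        # Avoid a ZeroDivisionError; NaN marks objects with no usable ratio.
        ratios.append(num / den if den != 0 else float("nan"))
    return ratios


# Hypothetical mean GFP intensities for three cells.
lc3_in_cytoplasm = [0.50, 0.10, 0.30]
cytoplasm = [0.25, 0.20, 0.00]
print(per_object_ratio(lc3_in_cytoplasm, cytoplasm))
```

This only works if both lists describe the same objects in the same order, which is why the two measurements should come from regions that share a parent object.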

Hope that helps!

Oops! Mark and I were writing at the same time - sorry Mark!

Hi again,
Back to this stubborn assay I work on…
This is a set of images showing the phenotype I am trying to quantify. I record the intensity of LC3 in two regions in the cell – ‘Cytoplasm’ and ‘LAMP1incytoplasm’. The intensity in ‘LAMP1incytoplasm’ should be much higher than that of ‘Cytoplasm’, but the pipeline fails to show this.
Please advise
Attached is the CP pipeline and images.
Thank you.

20131009_Elazar pipeline_LC3.cp (13.4 KB)

Hi Noga,

In your second IdentifySecondaryObjects module, you comment “Identify the whole cytoplasm by LC3 signal” however the Input Image is set to TRITC (and not LC3)? Just a guess, but this would make sense if the idea was to grow the nuclear seeds into the bright TRITC regions, and then grow from there into
If I change the Input Image to LC3 it gives (to my eye) more reasonable “whole cells”.

The other issue I see is that the two IdentifyTertiary regions are not mutually exclusive. You could make it mutually exclusive if you change the second IdentifyTertiaryObjects module’s “smaller region” to “LAMP1 regions” to match the larger region of the previous IDTertiary module.
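The overlap can be illustrated with a toy example that treats each region as a set of pixel coordinates and a tertiary object as "larger region minus smaller region". The 6x6 cell, the region shapes and the assumption that both tertiary modules currently subtract the nuclei are all guesses made for illustration; the real module also has options that this sketch ignores.

```python
def tertiary(larger, smaller):
    """Simplified tertiary object: pixels of the larger region not in the smaller."""
    return larger - smaller


# Toy regions as sets of (row, col) pixels; all shapes are hypothetical.
cell = {(r, c) for r in range(6) for c in range(6)}
nucleus = {(r, c) for r in range(2, 4) for c in range(2, 4)}
lamp1_region = nucleus | {(2, 4), (3, 4), (2, 5), (3, 5)}  # grown out from the nucleus

lamp1_in_cytoplasm = tertiary(lamp1_region, nucleus)

# Assumed current setup: the cytoplasm still contains the LAMP1 pixels.
cytoplasm_overlapping = tertiary(cell, nucleus)
# Suggested setup: subtract the whole LAMP1 region instead of the nucleus.
cytoplasm_exclusive = tertiary(cell, lamp1_region)

print(len(lamp1_in_cytoplasm & cytoplasm_overlapping))  # shared pixels before the change
print(len(lamp1_in_cytoplasm & cytoplasm_exclusive))    # shared pixels after the change
```

When the two regions share pixels, the bright LAMP1 pixels are counted in both measurements, which pulls the two intensities toward each other and may hide the difference you expect.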

If I am guessing wrong, perhaps you can sketch out which compartments you want measured in a mock or example cell?