Speed etc

My data set is: 60 wells, 4 frames, 2 colors (480 16-bit images, 11MB each). The CP pipeline is very simple: identify the nuclei after smoothing, then measure the 2 channels and export the data.

I am running this on a 2.7GHz 12-core Xeon E5 MacPro, 1TB SSD, 64GB RAM (1866MHz). CP takes full advantage of the available cores (24 workers maxing out the CPUs)

The two issues I have are:

  1. Overheating memory chips: two of the memory chips are overheating rapidly (>100C). The same happens on a 2nd machine (identical configuration), but it does not happen when I use other CPU- and memory-intense programs such as ImageJ. I realize that this is basically a hardware issue, but I was wondering if there was a way to alter how memory is managed by CP without sacrificing speed?

  2. Slow speed relative to CPU usage: even when running 24 workers (with no displays showing), processing still seems slow: about 4 minutes with CP using 100% of the CPU, compared to 46 seconds with ImageJ (equivalent ImageJ script, running 6 instances in parallel and using ~90% of the CPU; running just 1 instance of ImageJ takes ~3 min and uses <25% of the CPU). Due to the overheating issue, the most workers I can use for continuous processing is about 6, resulting in a ~12 minute processing time.

I am aware that reducing the image size would make things easier, but ImageJ can deal with the large images very efficiently, so it seems to be a software issue as well. Any suggestions?

test.cppipe (10.2 KB)

Hi, thanks for the analysis.

I think you’re right about it being a software issue. We’ve tried to make our algorithms take time and memory proportional to the image size, or at worst O(N log N), but we could pay more attention to memory usage and there are many opportunities for optimization. CellProfiler represents pixels using 64-bit floating-point values, whereas ImageJ often operates with a much more limited memory footprint, sacrificing generality for efficiency. We do have time allocated in the current phase of our grant to address some of this, and the upcoming 2.1.2 release (which you can try out from the trunk-builds page: cellprofiler.org/cgi_bin/trunk_build.cgi) has a caching strategy that flushes images and objects from memory after each module execution, which might help.
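To put rough numbers on the 64-bit point, here is a back-of-the-envelope sketch in Python. It assumes the 11MB figure is the uncompressed size of a 16-bit image in decimal megabytes, and it counts only the two input channels each worker holds at once. The function name and constants are illustrative and not part of CellProfiler; intermediate images (smoothed copies, label matrices, watershed buffers) would come on top of this.

```python
MB = 1_000_000  # decimal megabytes; an assumption about how "11MB" was measured


def float64_footprint_bytes(file_size_bytes, source_bits=16):
    """Estimate the in-memory size of an image stored as 64-bit floats.

    Assumes the file is uncompressed, so its size divided by the bytes
    per source pixel gives the pixel count.
    """
    n_pixels = file_size_bytes // (source_bits // 8)
    return n_pixels * 8  # 8 bytes per 64-bit float


per_image = float64_footprint_bytes(11 * MB)

channels_per_image_set = 2   # two colors per well/frame, from the question
workers = 24                 # worker count reported in the question

# Input pixels only: every worker holds both channels of one image set.
per_worker_inputs = per_image * channels_per_image_set
all_workers_inputs = per_worker_inputs * workers

print(f"one image as float64: {per_image / MB:.0f} MB")
print(f"inputs per worker:    {per_worker_inputs / MB:.0f} MB")
print(f"inputs, all workers:  {all_workers_inputs / MB:.0f} MB")
```

Under these assumptions each image grows from 11MB to 44MB in memory, and 24 workers hold about 2.1GB of input pixels between them. The inputs alone are therefore small next to 64GB, which suggests that most of the footprint comes from the intermediates each module creates rather than from the raw images.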

With regard to segmentation of nuclei - CellProfiler may be doing more work than ImageJ. IdentifyPrimaryObjects runs a watershed algorithm that is expensive in both time and memory consumption if you separate touching objects. ImageJ may just be identifying connected components (also done by IdentifyPrimaryObjects). Turning off hole filling and object splitting will save about 33% in processing time. We may be computing more measurements as well.

Pragmatically, there is a balance between memory and cores used, although 6 cores / 10GB per core is worse than I’d expect for a simple pipeline and any reasonably-sized image. Hopefully this will improve in the future.
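As a rough illustration of that balance, the sketch below picks a worker count from the available RAM and an estimated per-worker footprint. The helper `max_workers` and the per-worker estimates are hypothetical; CellProfiler does not expose such a function, and you would need to measure the real per-worker usage on your own pipeline. Note also that the 6-worker limit in the question came from overheating, not from running out of RAM, so this is only a sizing heuristic.

```python
def max_workers(total_ram_gb, per_worker_gb, logical_cores):
    """Return the largest worker count that fits in RAM, capped by core count.

    per_worker_gb is an estimate supplied by the caller (hypothetical here).
    Always returns at least 1 so that processing can still proceed.
    """
    if per_worker_gb <= 0:
        raise ValueError("per_worker_gb must be positive")
    fits_in_ram = int(total_ram_gb // per_worker_gb)
    return max(1, min(logical_cores, fits_in_ram))


# Machine from the question: 64GB RAM, 24 workers across 12 cores.
ram_gb = 64
cores = 24

# RAM available per worker if only 6 workers are run, as in the question.
print(f"RAM per worker at 6 workers: {ram_gb / 6:.1f} GB")

# Worker counts for a few assumed per-worker footprints (illustrative values).
for estimate_gb in (2, 5, 10):
    print(f"{estimate_gb} GB per worker -> {max_workers(ram_gb, estimate_gb, cores)} workers")
```

With 6 workers, each one has roughly 10.7GB of the 64GB to itself, which is where the figure of about 10GB per core comes from.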