Problem running CPCluster on large numbers of images

cellprofiler

#1

Dear CPCluster users/developers,

I’m attempting to process a large number of pairs of tif files (90,000 pairs) using CPCluster.
I have successfully processed batches of up to 6,924 files (3,462 pairs) with no problem. However, larger batches (I’ve tried batches as small as 16,128 files, as well as ~25,000 and ~140,000) result in an error in the CellProfiler GUI before the batch files are created. In particular, the error occurs during step 3 of my pipeline, “RescaleIntensity”, after CellProfiler has been working on that step for nearly an hour. The gist of the errors displayed in the dialog is “Image processing was cancelled because the module could not load the image …”
and
“… no space to read tiff directory”

I’ve done a bit of sanity checking: the tif file in question does exist, there is plenty of disk space, and there is plenty of RAM free (CellProfiler stays at roughly 500 MB of virtual memory during the run). Running with different sets of files produces the error on a different file each time, which leads me to believe the problem isn’t tied to any specific file.

I’ve taken a screenshot of the actual error pane and made it available at this URL:
maguro.cs.yale.edu/docs/CPerror.jpg

Some details about my context: I’m running on a RHEL WSR4 cluster, using the compiled (non-MATLAB) version of CellProfiler, version 1.0.4553.
My images are 2.9 MB each; as far as I know, they are all very similar in size and content.

I’m guessing that some internal data structure has been exhausted… but the documentation indicates that CPCluster has been used on much larger sets of files, so this error seems odd.

Given the size of my input set, processing it in small batches would be a great deal more work, so I’d be very grateful for any help that would let me run the job as one large batch.

Thanks,

Rob Bjornson


#2

Hi Robert,

It appears you are using the E method with the AA options for both the min and max values. In that setup, every image in your analysis is loaded one by one to determine the min and max values to use. With 90k image sets, you are most likely getting severe memory fragmentation: while it may look like plenty of memory is available, there is likely no contiguous block large enough to load the image. My suggestion is to figure out a decent value manually and skip the AA options. This will also make your first image set process much faster.
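
(A minimal sketch of that manual step, not part of CellProfiler itself: it assumes Python with the numpy and tifffile packages, and the image directory, file pattern, and sample size are placeholders. The idea is to estimate fixed min/max values from a few hundred sampled images instead of scanning all 90,000 pairs.)

```python
import glob
import random

import numpy as np
import tifffile

# Hypothetical location of the tif files; adjust to your own layout.
paths = sorted(glob.glob("/data/images/*.tif"))

# A few hundred randomly chosen images is usually enough to estimate bounds.
sample = random.sample(paths, min(200, len(paths)))

lows, highs = [], []
for p in sample:
    img = tifffile.imread(p)
    # Robust percentiles instead of absolute min/max, so a few hot or
    # dead pixels do not skew the rescaling bounds.
    lows.append(np.percentile(img, 0.5))
    highs.append(np.percentile(img, 99.5))

print("suggested min:", min(lows))
print("suggested max:", max(highs))
```

The resulting values can then be entered as fixed min/max in RescaleIntensity, so the module never has to open every image up front.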

Regards,
Mike


#3

Dear Mike,

Thanks for your message. We had independently come to the same conclusion: it was probably a mistake to examine so many images for this purpose. We changed the options and the problem was solved, with the side benefit you mentioned: the first image set went much faster.

Thanks again for your help.

Rob Bjornson