Dear CPCluster users/developers,
I’m attempting to process a large number of pairs of tif files (90,000 pairs) using CPCluster.
I have successfully processed batches of up to 6924 files (3462 pairs) with no problem. However, larger batches of files (I’ve tried batches as small as 16128, as well as ~25000 and ~140000) result in an error while the CellProfiler GUI is running, before the batch files are created. Specifically, the error occurs during step 3 of my pipeline, which is “RescaleIntensity”, after CellProfiler has been working on that step for nearly an hour. The gist of the errors displayed in the dialog is “Image processing was cancelled because the module could not load the image …” and “… no space to read tiff directory”.
I’ve done a bit of sanity checking: the tif file in question does exist, there is plenty of disk space, and there is plenty of RAM left (CellProfiler remains at roughly 500MB of virtual memory during the run). Running with different sets of files results in a different file mentioned as the file that couldn’t be loaded, leading me to believe that the problem isn’t related to the specific file.
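For anyone who wants to repeat these checks across the whole input set, here is a minimal sketch in Python. It assumes Python 3 is available on the cluster. The function names (`check_tiff`, `report`, `free_gib`) are made up for illustration and are not part of CellProfiler or CPCluster. It only looks at the classic TIFF header bytes, so it cannot tell whether CellProfiler itself would load the file.

```python
import os
import shutil

# Classic TIFF files start with one of these 4-byte headers
# (little-endian "II" or big-endian "MM", followed by the number 42).
TIFF_MAGIC = (b"II*\x00", b"MM\x00*")


def check_tiff(path):
    """Return a short problem description, or None if the file looks sane."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as fh:
        if fh.read(4) not in TIFF_MAGIC:
            return "not a classic TIFF header"
    return None


def report(paths):
    """Map each problematic path to its problem; sane files are omitted."""
    problems = {}
    for path in paths:
        problem = check_tiff(path)
        if problem is not None:
            problems[path] = problem
    return problems


def free_gib(directory):
    """Free disk space, in GiB, on the filesystem holding `directory`."""
    return shutil.disk_usage(directory).free / 2**30
```

An empty result from `report(...)` over the full file list would support the conclusion that no individual file is at fault.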
I’ve made a screenshot of the actual error pane and made it available via this url:
Some details about my context: I’m running on a RHEL WSR4 cluster, using the compiled (non-MATLAB) version of CellProfiler, version 1.0.4553.
My images are 2.9MB each; they are all very similar in size and content as far as I know.
I’m guessing that some internal data structure has been exhausted, but the documentation indicates that CPCluster has been used for much larger sets of files, so this error seems odd.
Given my large set of inputs, it will be much more work to process them in small batches, so I’d be very thankful for any help you could give me that would allow me to run the job in a large batch.
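To make the small-batch fallback concrete, here is a rough Python sketch of the splitting involved. It assumes the image pairs are already held in a list. The helper name `batches` is hypothetical, and the default batch size is simply the largest size that has worked for me so far (3462 pairs).

```python
def batches(pairs, max_pairs=3462):
    """Yield consecutive slices of `pairs`, each holding at most `max_pairs` items."""
    if max_pairs < 1:
        raise ValueError("max_pairs must be at least 1")
    for start in range(0, len(pairs), max_pairs):
        yield pairs[start:start + max_pairs]


# Placeholder pair identifiers; real code would hold (file_a, file_b) tuples.
all_pairs = list(range(90000))
chunks = list(batches(all_pairs))
print(len(chunks))  # 26 separate runs for 90,000 pairs
```

Each of those 26 chunks would need its own run through the GUI to create its batch files, which is the extra work I’m hoping to avoid.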