Running Illumination Correction - Calculate module across parallel jobs on a computing cluster

Hi CellProfiler Team,

Our lab is currently working on getting CellProfiler 2.2.0 and 3.0 running on our computing cluster, and we have a couple of questions regarding illumination correction pipelines. We routinely image multiple 384-well plates in 4 channels (25 fields per well) and so have hundreds of thousands of images to run in any one experiment.

Using desktop mode, we’re able to easily load up all of the plates from a single experiment, calculate across all images (grouped by plateID) and then output a single image per plate for each channel (around 7 hours per plate). Is there a way to run CorrectIlluminationCalculate in the same way, but across multiple parallel jobs on a cluster? We currently run 90 simultaneous jobs to export the data to a MySQL database, but weren’t sure how splitting the images into 90 jobs would impact the illumination correction calculation.

Your help would be very much appreciated!

Thank you,

Karla


Hi @Karla, one way to make your workflow more parallel is to group your images by both plate and channel, so that each plate/channel combination runs as its own job. If your plate has 4 channels, that alone should cut a roughly 8-hour run per plate to about 2 hours.

Additionally, you could try sub-sampling your plate to reduce the number of images required to calculate your correction image.

When you submit your jobs, use the cellprofiler -g flag on the command line to specify which group of images each node should process.
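As a rough sketch of what the submission side could look like, the helper below builds one headless command per plate/channel group. The metadata keys (`Metadata_Plate`, `Metadata_Channel`), the pipeline file name, and the directories are placeholders I made up for illustration; the keys have to match whatever your pipeline's grouping actually uses, and you would wrap each command in your scheduler's own submit call.

```python
from itertools import product


def build_commands(plates, channels, pipeline, input_dir, output_dir):
    """Return one cellprofiler command (as an argument list) per group.

    Assumes the pipeline groups images by the hypothetical metadata keys
    Metadata_Plate and Metadata_Channel; change these to match your pipeline.
    """
    commands = []
    for plate, channel in product(plates, channels):
        # -g takes comma-separated key=value pairs identifying one group
        group = "Metadata_Plate={},Metadata_Channel={}".format(plate, channel)
        commands.append([
            "cellprofiler", "-c", "-r",   # headless, run the pipeline
            "-p", pipeline,               # pipeline file
            "-i", input_dir,              # default input folder
            "-o", output_dir,             # default output folder
            "-g", group,                  # restrict this job to one group
        ])
    return commands


if __name__ == "__main__":
    # Placeholder plate and channel names, for illustration only
    for cmd in build_commands(["Plate1", "Plate2"], ["DAPI", "GFP"],
                              "illum.cppipe", "/data/images", "/data/illum"):
        print(" ".join(cmd))
```

Each printed line is one job, so 10 plates with 4 channels would give you 40 independent jobs.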

Another alternative to consider: divide each plate across your 90 jobs, so that every job produces its own illumination correction image for that plate. Assuming each job is given an equal number of images, averaging these 90 correction images should give you an image similar to the one you'd get by processing them all at once.
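The averaging step could be sketched with NumPy as below. This is only an illustration: the function name is mine, and how you load the partial images into arrays depends on the format you saved them in. Note also that the mean of the partial images matches the all-at-once result exactly only if each partial image is a plain per-pixel average; any smoothing or rescaling applied inside each job makes the combined image an approximation. If the jobs did not get equal numbers of images, weighting by image count restores the correct overall mean.

```python
import numpy as np


def combine_partial_illum(partials, counts=None):
    """Combine per-job illumination images into one per-pixel average.

    partials: sequence of 2-D arrays with identical shape, one per job.
    counts:   optional number of source images behind each partial image;
              if given, the average is weighted by these counts.
    """
    stack = np.stack([np.asarray(p, dtype=np.float64) for p in partials])
    if counts is None:
        # Equal-sized jobs: a plain mean over the job axis
        return stack.mean(axis=0)
    # Unequal jobs: weight each partial image by how many images it used
    weights = np.asarray(counts, dtype=np.float64)
    return np.average(stack, axis=0, weights=weights)


if __name__ == "__main__":
    # Two tiny synthetic "partial" images standing in for real job outputs
    a = np.full((2, 2), 1.0)
    b = np.full((2, 2), 3.0)
    print(combine_partial_illum([a, b]))
    print(combine_partial_illum([a, b], counts=[1, 3]))
```

If you smooth the illumination function, one option is to turn smoothing off in the per-job pipelines and apply it once to the combined image instead, so the nonlinear step happens only after averaging.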


Thanks very much! That really helps. I’ll try both options and see which is better!

Hi Karla,

I found your message on the forum and was just wondering which of the methods that were suggested (if any) you ended up using? We are facing the same issue of trying to generate a single illumination correction function while running parallel jobs on a cluster and your feedback would be very helpful. Thank you!