Issue with Illumination Correction Calculate Module


I am currently working with the pipeline published in Bray et al., Nature Protocols (2016), and trying to apply the illumination pipeline to almost 3500 images from a single plate. With a filter size of 200, this step takes almost 1 h 30 min per image, which is a lot of computing time. As far as I understand, the CorrectIlluminationCalculate module has to be applied to all the images of a plate in a single batch, so combining it with the CreateBatchFiles module and splitting the task across the 250 computing nodes available on our cluster is not an option.

I have tried decreasing the filter size, which shortens the computing time but lowers the quality of the illumination correction, and the resulting images are less bright.

I have also tried running CellProfiler both headless from the command line and from the GUI. Headless, CellProfiler seems to run on a single thread, and when I run this pipeline from the GUI only a single thread is launched as well, even though the maximum number of workers is set to 12 in my preferences.

My questions are:

  • Is it incorrect to divide the CorrectIlluminationCalculate task of a single plate into several batches to decrease the required computing time, as suggested in this post? If it is correct, should I store the illumination functions computed for each batch or only the last one?

  • Is there a way to multithread this task?

  • Am I missing something or doing something wrong? This is the first time I have used this software, and the computing time I am getting for CorrectIlluminationCalculate seems too high.

I’m using CellProfiler 4, and here you can find the pipeline I am currently using:
illum_CP4.cppipe (16.5 KB)

Please let me know if you need any other information to help me.

Thank you

Hi @Ipediez,

I’ve looked into this, and there does seem to be a bug in the module. In the case of your pipeline, it arises from using the median filter smoothing function with a large window size. I’ve created this PR to address the problem in the next release. In the meantime, it might be worth trying the Gaussian filter or using a slightly older version (4.0.5?) for the illumination step.

Regarding your questions, if you need to calculate an illum function across the entire plate then you really can’t divide the work up into multiple batches. The same goes for multithreading. If you have multiple plates then you could calculate each plate as a distinct batch.
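To illustrate the "one batch per plate" idea, here is a minimal Python sketch that groups image files by plate so each group can be processed as its own job. The file naming scheme (`<plate>_<well>_...`) and the function name `group_by_plate` are assumptions made for this example only. In CellProfiler itself the grouping would normally be set up through the pipeline's metadata and grouping settings, not through code like this.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme: "<plate>_<well>_<anything>", e.g. "P1_A01_s1.tif".
PLATE_PATTERN = re.compile(r"^(?P<plate>[^_]+)_(?P<well>[A-P]\d{2})_")


def group_by_plate(filenames):
    """Return {plate_id: [filenames]}; names that do not match are skipped."""
    groups = defaultdict(list)
    for name in filenames:
        match = PLATE_PATTERN.match(name)
        if match is None:
            continue  # not an image that follows the assumed naming scheme
        groups[match.group("plate")].append(name)
    return dict(groups)


if __name__ == "__main__":
    files = ["P1_A01_s1.tif", "P1_B02_s1.tif", "P2_A01_s1.tif"]
    for plate, members in group_by_plate(files).items():
        # Each plate keeps all of its images together in a single job.
        print(plate, len(members))
```

Every image of a plate stays in the same group, so the illumination function for that plate is still computed from all of its images.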


We have also had very good luck getting good illumination correction functions by downsampling 4-10x, doing the illumination correction step on the much smaller image (which requires a much smaller, and therefore much faster, smoothing filter), then upsampling back to full size before saving. That’s actually our typical illumination correction workflow.
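The downsample, smooth, upsample idea can be sketched outside CellProfiler with NumPy. This is only an illustration under stated assumptions: the function names are made up for this example, the input is assumed to be a single 2D averaged image, and a simple mean (box) filter stands in for the median or Gaussian smoothing that CorrectIlluminationCalculate offers. In a real pipeline the same effect would presumably come from placing resize steps around the illumination module.

```python
import numpy as np


def downsample(image, factor):
    """Shrink by averaging factor x factor tiles (edges cropped to fit)."""
    h2, w2 = image.shape[0] // factor, image.shape[1] // factor
    cropped = image[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))


def box_smooth(image, size):
    """Separable mean filter; a stand-in for median/Gaussian smoothing."""
    size = max(1, size) | 1  # force an odd window width
    pad = size // 2
    kernel = np.ones(size) / size
    padded = np.pad(image, pad, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def upsample(image, factor, shape):
    """Nearest-neighbour enlargement, edge-padded back to the original shape."""
    big = np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)
    pad_h, pad_w = shape[0] - big.shape[0], shape[1] - big.shape[1]
    return np.pad(big, ((0, pad_h), (0, pad_w)), mode="edge")


def illum_function_downsampled(mean_image, factor=4, full_res_filter=200):
    """Smooth at reduced resolution, then return a full-size function."""
    small = downsample(mean_image, factor)
    # The filter shrinks with the image: 200 px at full size is 50 px at 4x.
    small_filter = max(1, full_res_filter // factor)
    smooth = box_smooth(small, small_filter)
    return upsample(smooth, factor, mean_image.shape)


if __name__ == "__main__":
    # Synthetic image with a gentle left-to-right brightness gradient.
    image = np.fromfunction(lambda y, x: 1.0 + 0.001 * x, (512, 512))
    illum = illum_function_downsampled(image, factor=4, full_res_filter=200)
    print(illum.shape)
```

The saving comes from two places at once: a downsampling factor of f leaves f² fewer pixels to filter, and the filter window covering the same physical area is also f times narrower.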


Indeed, the Gaussian filter reduces the computing time to 6 s per image. Thanks a lot for your answers.