Executing multicore CP analysis pipelines without a cluster

Hi CP developers,

I have been testing the “CreateBatchFile” module, and it works great for farming out the image processing to multicore CPUs. I have noticed that others have asked for this functionality in the past, but it appears that some users may be confused about how to set it up properly. Could you add a “create bat file” checkbox for users like me who have multicore CPUs but do not have access to a cluster of computers? Running large CP jobs headless, with the image processing load spread over a 4 or 8 core workstation, significantly shortens pipeline execution time. The checkbox could ask for the number of processors so that the image batch can be split equally between cores. I was envisioning a bat file script like the one listed below. Users could then just click on the bat file in the output folder and the multicore CP analysis would run automatically.

cd \Program Files\CellProfiler

START CellProfiler -p J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray\output\Batch_data.mat -i J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray\ -c -r -f 1 -l 4

START CellProfiler -p J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray\output\Batch_data.mat -i J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray\ -c -r -f 5 -l 8

START CellProfiler -p J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray\output\Batch_data.mat -i J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray\ -c -r -f 9 -l 12

START CellProfiler -p J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray\output\Batch_data.mat -i J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray\ -c -r -f 13 -l 16
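For anyone who wants to generate such a file for a different number of cores or image sets, here is a rough Python sketch of the splitting idea. It only builds the text of the bat file; it does not call CellProfiler itself. The function names (split_ranges, make_bat_lines) are made up for illustration, the flags are simply copied from the script above, and it assumes the paths contain no spaces that would need quoting.

```python
def split_ranges(n_image_sets, n_cores):
    """Split image sets 1..n_image_sets into contiguous (first, last) ranges.

    Returns at most n_cores ranges whose sizes differ by at most one.
    """
    if n_image_sets < 1 or n_cores < 1:
        raise ValueError("n_image_sets and n_cores must both be at least 1")
    base, extra = divmod(n_image_sets, n_cores)
    ranges = []
    first = 1
    for core in range(n_cores):
        # The first `extra` cores each take one additional image set.
        size = base + (1 if core < extra else 0)
        if size == 0:
            break  # more cores than image sets: leave the rest idle
        ranges.append((first, first + size - 1))
        first += size
    return ranges


def make_bat_lines(install_dir, batch_file, image_dir, n_image_sets, n_cores):
    """Build the lines of a bat file with one START command per range.

    The command layout mirrors the hand-written script above; the paths
    are inserted unquoted (assumption: no spaces in batch_file/image_dir).
    """
    lines = ["cd " + install_dir]
    for first, last in split_ranges(n_image_sets, n_cores):
        lines.append(
            "START CellProfiler -p {} -i {} -c -r -f {} -l {}".format(
                batch_file, image_dir, first, last
            )
        )
    return lines


if __name__ == "__main__":
    image_dir = r"J:\Users\nikon\Desktop\cellprofiler_image\PR\split\gray"
    batch_file = image_dir + r"\output\Batch_data.mat"
    for line in make_bat_lines(
        r"\Program Files\CellProfiler", batch_file, image_dir, 16, 4
    ):
        print(line)
```

With 16 image sets and 4 cores this reproduces the 1-4, 5-8, 9-12, 13-16 split used above. When the count does not divide evenly, the first few instances each get one extra image set.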

Hi Derek,

We’re actually in the process of adding multiprocessor support to CellProfiler. It’s taking a bit of work to adjust the CP internal architecture to support it, but we hope to have it ready in the next few months.

Ray Jones

Thanks for the update, Thouis. I can’t wait to try it out. Is there any chance of an OpenCL implementation of CellProfiler? Who needs a compute cluster when a single graphics card has 1600 stream processors? It seems like a perfect match for a GPU, since GPUs are quite efficient at low-dependency, embarrassingly parallel workloads. Just dreaming out loud.

It’s something we’ve looked at a bit, but there are other, more glaring inefficiencies in CP to take care of first. Also, there are some projects (notably scikit-image) working on a common interface to multiple backends (NumPy, OpenCL, etc.) for common image processing algorithms. We will be able to take advantage of these fairly easily when they are available.

Great news.

Hi guys,

Any news on the multicore implementation? We are opting for a very powerful workstation and would love to speed up our CellProfiler pipelines using it.

Also, would it be possible to start CellProfiler multiple times on a multicore PC in order to use all cores? If so, we could just run several instances on one machine.

Our team has been working on multi-processing functionality for some time now and it’s actually finished but not released. Part of the reason for this is that we have incorporated our UI revisions re: image file loading into it as well. To your second point, opening multiple instances will not work as expected, but we’re working on fixing this.

However, if you’re daring, you can actually try it out here: cellprofiler.org/cgi-bin/trunk_build.cgi; see the files below the horizontal line.

For the benefit of others who may see this page: At this point, this CP build is experimental and still in development. There is absolutely no guarantee that pipelines made with this build will not break on later versions as we continue to make revisions prior to release.

That said, feel free to try it out :smiley:. Also, we’d love feedback on the image loading UI. If you have comments, you can post on this thread: “ATTENTION USERS: Help us with usability improvements!”