R-Script vs CellProfiler API

Dear Community,

I just took my first steps with CP3 and have now devised a very robust pipeline which we’re going to use a lot in the near future. Next, I am going to write an R script that extracts the values from the output spreadsheets of my pipeline and performs some statistical analyses (plus plotting) on them. Ideally, everything would run automatically from the point of selecting the pictures of interest. Is there a way to a) call a CellProfiler pipeline from within R, or b) forward the output from within CellProfiler and execute a given script?

Many thanks in advance!
With kind regards,
H2AX

Hi! Sounds like a great automation you will be setting up.
Maybe this could help? https://github.com/CellProfiler/notebooks
(if not, there are other CellProfiler + Jupyter things in existence if you google them, but I’m unfamiliar with them all myself!)

It looks like a job for a workflow management system.

Maybe have a look at KNIME (https://www.knime.com/), specifically KNIME Image Processing: https://www.knime.com/community/image-processing.

You can run whole CellProfiler pipelines within KNIME and do data processing and analysis directly there. I think it is also possible to run R scripts from KNIME.

The KNIME platform does indeed look very well suited for this task. However, I will have to spend quite some time learning how to use it. Initially I was hoping for some API plugin in combination with the frameworks that I already know. The Jupyter notebooks are way over my head, though, so this hits closer to home. I’ll let you know in a couple of months :stuck_out_tongue:

P.S. Is there a way to execute a CellProfiler pipeline (while passing the image file names/paths) from within a Unix shell, e.g. bash?

Edit: Never mind, I figured it out!

Great! For others, the answer is YES, you can run CellProfiler headless. Take a look here: https://github.com/CellProfiler/CellProfiler/wiki/Adapting-CellProfiler-to-a-LIMS-environment#head
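For anyone landing here later, this is a minimal sketch of what a headless call can look like when assembled from a script. It is written in Python only for illustration; the same argument list can be passed to R's system2() or typed into bash. The flags `-c` (no GUI), `-r` (run the pipeline on startup), `-p`, `-i` and `-o` are CellProfiler command-line options. The helper name `build_headless_command`, the file names, and the assumption that the executable is called `cellprofiler` and is on your PATH are all placeholders of mine, not something from this thread.

```python
import shlex
from pathlib import Path


def build_headless_command(pipeline, input_dir, output_dir,
                           executable="cellprofiler"):
    """Return the argument list for one headless CellProfiler run.

    `executable` is an assumption: adjust it to wherever CellProfiler
    lives on your machine.
    """
    return [
        executable,
        "-c",                        # run without the GUI (headless)
        "-r",                        # start the pipeline right away
        "-p", str(Path(pipeline)),   # pipeline file (.cppipe / .cpproj)
        "-i", str(Path(input_dir)),  # default input folder
        "-o", str(Path(output_dir)), # default output folder
    ]


if __name__ == "__main__":
    # Hypothetical file and folder names, for illustration only.
    cmd = build_headless_command("analysis.cppipe", "images", "results")
    # Print the command in a form that can be pasted into a shell.
    print(shlex.join(cmd))
```

To actually launch the run from Python you could hand the list to `subprocess.run(cmd, check=True)`. Building the command as a list rather than one long string avoids quoting problems when paths contain spaces.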

So, I’m now able to call the CP3 pipeline from within an R script using the system() command and proceed with statistics and plotting therein.
However, it is noticeably slower than when I run the analysis from the UI. I believe this is because only one instance of CP3 is launched and the pictures are analyzed one after another. How do I parallelise the runs in headless mode?

Edit: It looks like analysis accelerates over time. The first 10 images took as long as the subsequent 100. I now got > 500 images processed in ~ 1 h on a simple iMac.
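On the parallelisation question, one possible approach (a sketch, not a tested recipe for this pipeline) is to start several headless instances, each working on its own range of image sets. CellProfiler's command line has `-f` (first image set) and `-l` (last image set) for this. The Python below only illustrates the idea. The helper names, the `sets_<first>_<last>` folder naming and the `cellprofiler` executable name are my own assumptions, and you need to know the total number of image sets beforehand. Each chunk gets a separate output folder so the exported spreadsheets do not overwrite each other; they have to be merged afterwards.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def chunk_ranges(n_image_sets, n_workers):
    """Split image sets 1..n into contiguous (first, last) ranges."""
    if n_image_sets < 1 or n_workers < 1:
        raise ValueError("need at least one image set and one worker")
    n_workers = min(n_workers, n_image_sets)
    base, extra = divmod(n_image_sets, n_workers)
    ranges, first = [], 1
    for i in range(n_workers):
        # The first `extra` chunks take one additional image set.
        size = base + (1 if i < extra else 0)
        ranges.append((first, first + size - 1))
        first += size
    return ranges


def chunk_command(pipeline, input_dir, output_root, first, last,
                  executable="cellprofiler"):
    """Build the headless command for one range of image sets."""
    # Hypothetical naming scheme: one output folder per chunk.
    out_dir = Path(output_root) / f"sets_{first}_{last}"
    return [
        executable, "-c", "-r",
        "-p", str(pipeline),
        "-i", str(input_dir),
        "-o", str(out_dir),
        "-f", str(first),  # first image set of this chunk
        "-l", str(last),   # last image set of this chunk
    ]


def run_parallel(commands, max_workers):
    """Run the commands concurrently and return their exit codes.

    Threads are enough here: each one only waits for its own
    CellProfiler subprocess, which does the actual work.
    """
    def run_one(cmd):
        return subprocess.run(cmd, check=False).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))
```

For example, `chunk_ranges(500, 4)` gives four ranges of 125 image sets each; turning each range into a command with `chunk_command` and passing the list to `run_parallel(commands, max_workers=4)` would start four instances at once. Whether this is actually faster depends on the number of cores and available memory, and you may need to create the output folders beforehand if your CellProfiler version does not do so itself.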
