Run from command line

Hi Team,

I would like to know if there is an option to run CellProfiler from the Linux command line.
The problem is that we have a lot of movies (hundreds), so we would like to run the same pipeline on all of them automatically, without the CellProfiler GUI.
If we run it from the command line, we will need to change parameters within the pipeline (for example, give it a list of all the movies that it has to load). Please let me know if this is possible.

Thanks in advance,

Maya

Hi Maya,

It sounds like you want to run CellProfiler ‘headless’:

% *** RUNNING CELLPROFILER WITHOUT THE GRAPHICAL USER INTERFACE ***
%
% In order to run CellProfiler modules without the GUI you must have the
% following variables:
%
% handles.Settings.ModuleNames (for all modules in pipeline)
% handles.Settings.VariableValues (for all modules in pipeline)
% handles.Current.CurrentModuleNumber (must be consistent with pipeline)
% handles.Current.SetBeingAnalyzed (must be consistent with pipeline)
% handles.Current.FigureNumberForModuleXX (for all modules in pipeline)
% handles.Current.NumberOfImageSets (set by LoadImages, so if it is run
% first, you do not need to set it)
% handles.Current.DefaultOutputDirectory
% handles.Current.DefaultImageDirectory
% handles.Current.NumberOfModules
% handles.Preferences.IntensityColorMap (only used for display purposes)
% handles.Preferences.LabelColorMap (only used for display purposes)
% handles.Preferences.FontSize (only used for display purposes)
%
% You will also need to have the CPsubfunctions folder, since our Modules
% call CP subfunctions for many tasks. The CurrentModuleNumber needs to be
% set correctly for each module in the pipeline since this is how the
% variable values are called. In order to see what all of these variables
% look like, run a sample analysis and then go to File -> Tech Diagnosis.
% This will let you manipulate the handles variable in MatLab.

So yes, you are correct: you will need to set some parameters. This option requires a Matlab license on the Linux machine that you’ll be using for analysis. You’ll launch Matlab interactively, set up the handles, and run the pipeline from the Matlab command line.

Hi,

My name is Barak and I am working with Maya.
First, thanks a lot for CellProfiler and this forum.

As Maya said, we want to run CP on many image sets (which are different experiments) on our cluster (a single image set for each node in the cluster).

I have some questions regarding your answer:

  1. Is there a significant run time difference between the compiled version of CP and the Developer’s version?
  2. Does running CP headless require running Matlab and launching CP through it? Does this require the Developer’s version?
  3. When we want to change the handles, is it correct to run the CP GUI, load a pipeline, run Tech Diagnosis, and save the handles to a .mat file? Later we would load that file and, to run on different images for example, simply change the relevant entries in handles.Settings.VariableValues.
  4. After loading and creating the handles struct and adding CPsubfunctions to the path, how do we run the pipeline from the Matlab command line?

Thanks in advance,
Barak

Hi,

  1. No, there is no difference. However, on a cluster you would need a Matlab license for every node on which you want to run the Developer’s version, so the compiled version has the advantage of not needing Matlab.
  2. Yes, running CP without the GUI requires Matlab (and the Developer’s version of CP).
  3. That definitely sounds like the easiest way to do it, though I’ve never tried it. Note that the GUI version sets a bunch of parameters in handles that are irrelevant when running headless (i.e., settings about the GUI). The answer I gave Maya above outlines the bare minimum you need to set in handles for a pipeline to run.
  4. It sounds like you just want to run the same pipeline on a bunch of files without starting up CellProfiler. Once the handles are set in Matlab, I would just write an .m file to execute all the modules, passing handles back and forth, like you would in a pipeline.

Running headless gives no significant processing speedup that I can think of, other than not having the GUI on. If you’re really after speed, you may want to look into batch processing, i.e., sending every movie off to a different computer.
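
As a rough illustration of that kind of batch split, here is a minimal Python sketch that assigns movies to cluster nodes round-robin. It is not part of CellProfiler; the function name, movie names, and node names are all hypothetical, and how each node actually launches its share is left open.

```python
def assign_round_robin(movies, nodes):
    """Spread movies over nodes as evenly as possible (hypothetical helper)."""
    if not nodes:
        raise ValueError("at least one node is required")
    # One list of movies per node, in the order the nodes were given.
    plan = {node: [] for node in nodes}
    # Sort so the assignment is reproducible between runs.
    for index, movie in enumerate(sorted(movies)):
        plan[nodes[index % len(nodes)]].append(movie)
    return plan


if __name__ == "__main__":
    # Placeholder names only; substitute your own movies and hosts.
    plan = assign_round_robin(["movie_a", "movie_b", "movie_c"], ["node1", "node2"])
    for node, assigned in plan.items():
        print(node, assigned)
```

With one node per movie this reduces to a one-to-one mapping; with fewer nodes than movies, each node simply gets several movies to process in turn.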

-Kate

Thanks for the quick reply.
One more question regarding the handles: where can I find in the handles the Output Filename?

Barak

Hi Barak,

The output filename is not a regular setting in the handles structure; it is obtained from the GUI itself and saved in handles.Current.OutputFileNameEditBox. Since the GUI won’t be active when running headless, the OutputFileNameEditBox field will not be usable.

Regards,
-Mark

Do the Matlab comments still apply 7 years later? I would like to work directly in Python.

Hi Lee,

No, that’s definitely no longer true. You can check out the wiki on our GitHub for more up-to-date information about running CP from source, headless, directly as a Python package, etc.
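
For the original use case (the same pipeline over many movie folders), a minimal Python sketch might look like the following. It only builds the command lines and prints them; nothing is executed. It assumes a `cellprofiler` executable on the PATH and the headless flags `-c` (no GUI), `-r` (run the pipeline), `-p` (pipeline file), `-i` (input folder), and `-o` (output folder); please check these against the wiki for your installed version. The helper name, pipeline file name, and paths are hypothetical.

```python
import shlex
from pathlib import Path


def build_headless_commands(pipeline, movie_dirs, output_root, executable="cellprofiler"):
    """Return one argument list per movie folder (hypothetical helper)."""
    commands = []
    for movie_dir in movie_dirs:
        movie_dir = Path(movie_dir)
        # Give each movie its own output folder, named after the input folder.
        out_dir = Path(output_root) / movie_dir.name
        commands.append([
            executable,
            "-c",                 # assumed flag: run without the GUI
            "-r",                 # assumed flag: run the pipeline on startup
            "-p", str(pipeline),  # pipeline file
            "-i", str(movie_dir), # input folder for this movie
            "-o", str(out_dir),   # output folder for this movie
        ])
    return commands


if __name__ == "__main__":
    # Placeholder pipeline and folders; substitute your own.
    commands = build_headless_commands(
        "analysis.cppipe",
        ["/data/movies/exp01", "/data/movies/exp02"],
        "/data/results",
    )
    for command in commands:
        print(shlex.join(command))
```

Each argument list can be passed to `subprocess.run` or written into a cluster job script, one job per movie.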