Batch mode in CellProfiler 2.1.0

I’m entirely new to using CellProfiler, so you’ll have to forgive my ignorance. However, I’m trying to set up a process to analyze multiple sets of images (some of which are yet to be collected).

First, I followed the instructions at cellprofiler.org/CPmanual/He … ssing.html to set up the batch files, but those instructions seem to be outdated, in that instead of getting the expected Batch_data.mat file, all I got was a Batch_data.h5 file. According to the help for the CreateBatchFiles module

Has the future already arrived? Do both the web page and the in-module text need to be updated? Note that currently the OutputFilename and Output file format fields of the CreateBatchFiles UI don’t seem to do anything.

Second, I assume I can still use the recommended invocation command of

but substituting Batch_data.h5 as the argument for the -p option. I’m not entirely sure what qualifies as an “Image Set”. What is that?

Finally, I’m unclear on the image specification vs the pipeline specification. I’d like to be able to set up (or have an expert user set up) a pipeline and invoke it against any directory of images, breaking the set of images in the directory up into chunks and feeding chunks to different nodes in our cluster. When next week’s images come in, I’d like to use the same pipeline, but point to a different directory of images. However, it looks like you have to specify the image directory as part of creating the .h5 file, and that requires running the GUI again. Is there a way to re-use a pipeline like this?

Thanks,
Jon

Yes, indeed they do! Thanks for pointing this out :smiley:

These fields have to do with the “View output settings” button and not the CreateBatchFiles module (a little confusing, I know).

The output files are produced (if requested) at the end of an analysis run. In batch file creation mode, they are not; only a batch file is made.

[quote=“jchrist”] Second, I assume I can still use the recommended invocation command of

but substituting Batch_data.h5 as the argument for the -p option. I’m not entirely sure what qualifies as an “Image Set”. What is that?[/quote]

One point of clarification first: are you running from Windows, Mac or Unix? The reason I ask is that batch file processing is most often done on a computing cluster (i.e., a Unix-based system), so the instructions assume that you are working with the source code. Is this what you are doing?

If not, then the instructions are a bit different (I’m going to assume you are using Windows for the sake of argument). The command would be this:
CellProfiler.exe -p <Default_Output_Folder_path>/Batch_data.h5 -c -r -b -f <first_image_set_number> -l <last_image_set_number>
An image set is the collection of channels that represent (in the usual case) a single field of view. It is set up using the NamesAndTypes module; see the help for that module for more details.
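
To split a run across cluster nodes, each node gets its own -f/-l range. Here is a rough Python sketch of how those ranges could be generated. The function name batch_commands and the chunk size are made up for illustration, and I am assuming image set numbers start at 1 and that -l is inclusive; the flags themselves are just the ones from the command above.

```python
def batch_commands(batch_file, n_image_sets, chunk_size, exe="CellProfiler.exe"):
    """Build one argument list per chunk of image sets.

    Hypothetical helper: assumes image sets are numbered from 1 and that
    the -f/-l range is inclusive at both ends.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    commands = []
    for first in range(1, n_image_sets + 1, chunk_size):
        # The final chunk may be shorter than chunk_size.
        last = min(first + chunk_size - 1, n_image_sets)
        commands.append(
            [exe, "-p", batch_file, "-c", "-r", "-b",
             "-f", str(first), "-l", str(last)]
        )
    return commands


if __name__ == "__main__":
    # Example: 10 image sets in chunks of 4 gives ranges 1-4, 5-8 and 9-10.
    for args in batch_commands("Batch_data.h5", 10, 4):
        print(" ".join(args))
```

Each printed line can then be handed to whatever job scheduler your cluster uses.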

Yes. Keep in mind that the batch file contains (among other things) the locations of the images to be analyzed so you will always have to specify the input somehow. There are a couple of ways to do this:

  • You can always open CP, load the pipeline, drag and drop the new set of images into the Images module, and produce a new batch file without any further changes.
  • The other way is to use the LoadData module as part of the pipeline, which takes a CSV of filenames, one row per image set, and runs the analysis; this can be done headless (i.e., without the UI), but it is the responsibility of the user to produce the CSV for each new set of images. However, if you are using a LIMS environment, this might be straightforward for you. More info on this (which is up to date) can be found here: github.com/CellProfiler/CellPro … nvironment

Both of these approaches assume that the image nomenclature remains the same from week to week.
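
For the LoadData route, producing the CSV each week can be scripted. The sketch below is only an illustration under several assumptions: files are named <field>_<channel>.tif, the channel names DNA and GFP are placeholders, and the column headers follow the Image_FileName_<name> / Image_PathName_<name> pattern as I recall it from the LoadData help, so please check that help for the exact headers your version expects. The function name write_loaddata_csv is hypothetical.

```python
import csv
import os
from collections import defaultdict


def write_loaddata_csv(image_dir, csv_path, channels=("DNA", "GFP")):
    """Write one CSV row per complete image set; return the number of rows.

    Assumes files are named <field>_<channel>.tif (an illustrative naming
    scheme, not something CellProfiler requires).
    """
    sets = defaultdict(dict)
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in (".tif", ".tiff") or "_" not in stem:
            continue
        field, channel = stem.rsplit("_", 1)
        if channel in channels:
            sets[field][channel] = name

    # Assumed header convention: a file name and a path column per channel.
    header = []
    for ch in channels:
        header += [f"Image_FileName_{ch}", f"Image_PathName_{ch}"]

    rows = 0
    with open(csv_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        for field in sorted(sets):
            files = sets[field]
            if len(files) != len(channels):
                continue  # skip image sets that are missing a channel
            row = []
            for ch in channels:
                row += [files[ch], os.path.abspath(image_dir)]
            writer.writerow(row)
            rows += 1
    return rows
```

Pointing this at next week's directory yields a fresh CSV while the pipeline itself stays untouched.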

Regards,
-Mark