Problem creating batch files in version 1.0.5811

cellprofiler

#1

Hi,

I am attempting to use version 1.0.5811 of CellProfiler, after using version 1.0.4553 successfully for more than a year. I’m having a problem generating batch files.

I am attempting to compute a small, 6 cycle test case in preparation for a large run. This test case appears to run fine with my pipeline so long as I don’t create batch files.

However, when I add CreateBatchFiles to the end of the pipeline, the first cycle appears to run fine and the CreateBatchFiles module claims to have output the batch files. But when I look in the output directory, I see only these two files:
Batch_data.mat DefaultOUT.mat

The individual batch file(s) don’t appear. I would expect to see one, named Batch_2_6.mat
(As an aside, in the older versions of CreateBatchFiles, there was an option to specify the number of cycles per batch file. This option doesn’t appear in the 1.0.5811 version. How do I specify this?)

For background, I’m running this on a 64-bit Red Hat system, using the developer’s version of CellProfiler inside MATLAB 7.4.0 (since a compiled version for Linux isn’t yet available).

[rdb9@c076 TEST092408]$ uname -a
Linux c076 2.6.9-55.0.9.EL_lustre.1.6.4.2smp #1 SMP Wed Jan 16 19:52:57 EST 2008 x86_64 x86_64 x86_64 GNU/Linux


#2

Hi,

The behavior you are seeing is correct, albeit different from before. There is now only a single Batch_data.mat file created, i.e. no Batch_2_to_X.mat, etc. The batch size is now handled in your CPCluster scripts, e.g. in BatchRunner.py as found in the CPCluster package. In versions before 5811, if the batch sizes in CreateBatchFiles and BatchRunner.py (or whatever script you run) were mismatched, the CreateBatchFiles setting was ignored anyway. So we thought it would be more straightforward to require the batch size in only one script or module.
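
To illustrate what "handling the batch size in the script" amounts to, here is a minimal sketch of splitting the remaining cycles into batches. It is not the code in BatchRunner.py: the function name `batch_ranges` and its parameters are hypothetical, and it assumes that cycle 1 has already been processed when Batch_data.mat was created, so batches start at cycle 2.

```python
def batch_ranges(num_cycles, batch_size, first_cycle=2):
    """Return inclusive (start, end) cycle ranges covering first_cycle..num_cycles.

    Hypothetical helper for illustration only; not part of CPCluster.
    Assumes cycle 1 was already run when the batch data file was created.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ranges = []
    start = first_cycle
    while start <= num_cycles:
        # The last batch may be shorter than batch_size.
        end = min(start + batch_size - 1, num_cycles)
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    # A 6-cycle test case with a batch size of 2.
    for start, end in batch_ranges(6, 2):
        print("submit cycles %d to %d" % (start, end))
```

With a 6-cycle test and a batch size of 5, this yields a single range, cycles 2 to 6, which corresponds to the one batch file the older version would have written.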

Please refer to BatchRunner.py and CPCluster.py, both of which have changed, for more help. Notably, BatchRunner.py now assumes the “Batch_” prefix and is a bit smarter in the way it checks and submits jobs. We also now use Python 2.5.2, which I believe is necessary for some of the imported Python modules.

These changes are alluded to in the Help for CreateBatchFiles, but I see that there ought to be more explanation somewhere. Sorry for the confusion. We also have a new DataTool, “SubmitBatch”, which submits batches made by the CreateBatchFiles module to the cluster via a webserver. Of course, you (or your IT department) would need to set up this webserver yourself. This is intended for users who are not as comfortable as you are with command-line job submission.

David