Adjusting/preparing In/Out folder while creating the pipeline for batch-processing

I suspect numpy 1.20.2 is fine (we’ve been using 1.20.1). What is your h5py version? That’s the other package that is often tricky, in that it doesn’t “show up” as the underlying cause when it actually is, and I can’t quite tell what version you have from that tmp/build string.

Hello @bcimini,

  1. My h5py version is:
>>> import h5py
>>> print(h5py.__version__)
2.10.0
  2. h5cc -showconfig shows my HDF5 version to be 1.10.6
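
For anyone checking the same thing on a cluster node, here is a minimal sketch that reports the installed numpy and h5py versions in one go, without importing either package. The helper name installed_versions is made up for illustration; it only uses the standard-library importlib.metadata module.

```python
from importlib import metadata


def installed_versions(packages=("numpy", "h5py")):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not visible to this Python environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Note that h5cc reports the HDF5 library found on the system path, which is not necessarily the one h5py was built against; h5py.version.hdf5_version reports the latter.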

I notice you’re running the .cpproj again. When you run on your cluster, can you make sure you run the .cppipe instead? The .cpproj could possibly be the cause of the issue if it’s trying to make groups based on the file locations stored in it.

Otherwise, I’ve tried and failed to replicate this on my end, so it’s either a) something in your install or b) something in your image set or c) something in your run parameters, and your install looks clean AFAICT.

If switching to the cppipe DOESN’T fix it:

  1. To test “a” for sure, can you put this image set and cppipe somewhere on your cluster and try to run it headless? I’ve verified it works for me with both numpy 1.20.1 and 1.20.2 (it was unlikely that THAT was the difference, but it was worth ruling out). If it fails with the same ExportToSpreadsheet error, there’s something funny in your install; if not, it has to do with your data set or call, and you should proceed to step 2.
    TestETS.zip (3.8 MB)

  2. To try to get some insight into “b”, can you put the following print statements between lines 727 and 728 in ExportToSpreadsheet (i.e. between the line starting group_numbers and the line starting max_image_set_len) in your cloned copy of CellProfiler, and see what the output looks like on the sample set I just sent you for step 1 and on your data set? The statements are below, followed by what the output should look like for the sample set.

            print(group_numbers)
            print(type(group_numbers))
            print(type(group_numbers[0]))
            print(numpy.bincount(group_numbers))
            print(type(numpy.bincount(group_numbers)))
            print(type(numpy.bincount(group_numbers)[0]))

[1 2 3]
<class 'numpy.ndarray'>
<class 'numpy.int64'>
[0 1 1 1]
<class 'numpy.ndarray'>
<class 'numpy.int64'>
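
To make the expected output above easier to read, here is a pure-Python sketch of what numpy.bincount computes for non-negative integers: entry i of the result is the number of times i occurs in the input. This is an illustration only, not CellProfiler's code, and the local bincount function is a stand-in for the numpy call.

```python
def bincount(values):
    """Count occurrences of each non-negative integer, as numpy.bincount does."""
    if not values:
        return []
    counts = [0] * (max(values) + 1)
    for v in values:
        if v < 0:
            raise ValueError("bincount only accepts non-negative integers")
        counts[v] += 1
    return counts


group_numbers = [1, 2, 3]        # the sample set's group numbers
print(bincount(group_numbers))   # [0, 1, 1, 1]
```

Index 0 is empty because the group numbers start at 1, and each of the groups 1, 2 and 3 appears exactly once.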

Hello @bcimini, the problem was with the pipeline extension. Once I ran the .cppipe, I stopped getting the Excel-related error, and I can generate the Excel files as well. Thanks a lot for the excellent forum support.

I want to accomplish another task: I would like to screenshot the whole IdentifyPrimaryObjects window, exporting both the images and the table into one PNG/PDF as part of the pipeline, and save it. Is this possible, so that at the end of the analysis each original image has its own summary image? I saw a solution in the post “How to automatically save output window from Identify Primary Objects function”. But when I change parameters in the SaveImages module, I don’t see an Objects option under ‘Select the type of image to save’. What might be the reason for that?

So in general, that option would be ONLY to save the color-coded picture of the objects, not the whole window with the table; if you look at the link in the post directly below the one you linked, there’s a link as to why we can’t do what you’re requesting.

The option to save just pictures of the color-coded objects directly from SaveImages was removed in CellProfiler 3. You can still do it, but you have to make the picture first with ConvertObjectsToImage and then save THAT picture in the SaveImages module.

Thanks, @bcimini, for the great help. Another little question: since I’m writing a job submission script for my batch task, I will be providing a .h5 file as the pipeline, created following the description here.

So, what are the differences between using a .h5 file as a pipeline vs. a .cppipe file as a pipeline? Can I use the .cppipe pipeline I created for the job submission task instead of a .h5 file? I have a basic understanding of how HDF5 files work.

You can absolutely use the cppipe instead. The instructions you linked to are for using the .h5 file that is the output of the CreateBatchFiles module; if you do that, it serves as your -p and you don’t need any of the input options (-i, --file-list, --data-file). It is also perfectly permissible, though, to use -p somefile.cppipe along with one of the 3 input options instead!
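
For a job submission script, the two options in the reply above could be sketched as follows. This is a rough sketch under assumptions, not an official recipe: build_command is a hypothetical helper, the file and folder names are placeholders, and the -c (headless), -r (run the pipeline) and -o (output folder) flags are not mentioned in this thread, so check them against cellprofiler --help for your version.

```python
import shlex


def build_command(pipeline, output_dir=None, input_dir=None,
                  file_list=None, data_file=None):
    """Build a headless CellProfiler argument list (hypothetical helper)."""
    inputs = [("-i", input_dir), ("--file-list", file_list),
              ("--data-file", data_file)]
    given = [(flag, value) for flag, value in inputs if value is not None]
    # Following the advice above: a .cppipe needs one of the input options,
    # while the .h5 written by CreateBatchFiles needs none of them.
    if pipeline.endswith(".cppipe") and not given:
        raise ValueError("a .cppipe needs -i, --file-list, or --data-file")
    cmd = ["cellprofiler", "-c", "-r", "-p", pipeline]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    for flag, value in given:
        cmd += [flag, value]
    return cmd


# Placeholder names; substitute your own files and folders.
print(shlex.join(build_command("Batch_data.h5")))
print(shlex.join(build_command("somefile.cppipe", output_dir="out",
                               input_dir="images")))
```

The returned list can be passed directly to subprocess.run, or joined into a single line for a scheduler script as shown.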
