CellProfiler CreateBatchFiles module

Hi,
I want to analyze multiple sets of images using the Windows version of CellProfiler 2.1.1. I noticed that the CreateBatchFiles module may help with large jobs, but I do not know how to configure it. I have some basic questions:

  1. I cannot find the corresponding Linux version on the developers’ webpage (https://cellprofiler.org/previous_releases/). I did find some topics about installing CellProfiler 2 on Linux (https://github.com/CellProfiler/CellProfiler/wiki/Source-installation-(Linux)), but it is unclear to me how to fetch and install CellProfiler 2.1.1 or later. Do I need to install the same version on the server? And do the dependencies differ between versions?
  2. File paths on the cluster are written in the format ‘/server_name/your_name/your_data/images’, and I do not understand the ‘server_name’ part. Is there a Linux command that gives me the right format, or does it contain the IP address of the server?
  3. The images are stored on my local computer, but they would be processed by the remote server, and the output should end up back on my local computer. I am confused about how to set the file paths for this and about how the job is processed remotely. My server manages jobs with the qsub system, so I need to submit the job in a PBS script.
  4. In the ClassifyPixels step, the images are classified using a .h5 file (~90 MB in size) produced by ilastik. I do not know how to write the command line with this input.

What do you think? Thanks in advance for a detailed answer :smile:
Sen

A partial answer to the question about Linux and CP is here for CP 2: “CellProfiler install from source and run headless to solve batch processing memory errors”

and here for CP 3: “Cellprofiler Linux setup”

Why use version 2 when 3 is available?

Your question (#3) about inputs and outputs is confusing. CP expects a directory path, e.g. “C:\Users\you\image_directory” or Unix-style “/image_directory/some_folder”. No IP addresses; it’s not going to ssh to your server.
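To make the path point concrete: CreateBatchFiles asks for a local root path and the matching cluster root path, and image locations are rewritten from one to the other. Below is a minimal Python sketch of that kind of prefix mapping. The function name and both example roots are hypothetical, and this is an illustration of the idea rather than CellProfiler's own code.

```python
from pathlib import PureWindowsPath, PurePosixPath


def to_cluster_path(local_path, local_root, cluster_root):
    """Map a Windows path under local_root to a POSIX path under cluster_root."""
    # relative_to raises ValueError if local_path is not under local_root.
    relative = PureWindowsPath(local_path).relative_to(PureWindowsPath(local_root))
    return str(PurePosixPath(cluster_root).joinpath(*relative.parts))


# Hypothetical roots: replace them with your own local and cluster folders.
print(to_cluster_path(r"C:\Users\you\image_directory\plate1\A01.tif",
                      r"C:\Users\you\image_directory",
                      "/image_directory/some_folder"))
```

The pure path classes do not touch the file system, so the mapping can be checked on either machine before any files are moved.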

I’m not sure which version of CP, if any, can import .h5 files, but what I do is import them into ImageJ using the “ilastik” plugin and convert them into a binary mask, which can then be imported into CP. Good luck!

Thanks, johnmc. I have installed CellProfiler 2.2.0 on my cluster; it took a lot of effort to resolve the dependencies. This version supports the ClassifyPixels module using a .h5 file produced by ilastik 0.5, but I still do not know how to supply this input to the pipeline.

Cheers for the detailed answer
Sen

I have had quite a lot of experience setting CellProfiler up on my local cluster, so I can probably help if you have some specific questions. My cluster also uses the Son of Grid Engine (SGE) for submitting jobs, which is what I assume you mean by the “qsub system”.

Our cluster sadly runs CentOS 6, which means I cannot mount my Windows file system (where the images are stored) on the cluster. If your cluster runs Ubuntu, then I recommend you ask IT to help you mount your local drive.

If you cannot do this, you will need to move files between the two systems. I do this with psftp. Most of my output gets sent to an online MySQL database, so I don’t have to move many files back.

So to clarify: my workaround is to psftp the image files and the batch file (.h5 file) to the cluster, run the batch file on the cluster, and have the output go to the MySQL database.
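For the "run it on the cluster" step, each submitted job typically runs headless CellProfiler on a slice of the image sets. Here is a rough Python sketch that only builds those command lines. The flags -c, -r, -p, -f and -l are CellProfiler's headless options as far as I know, so please check them against `cellprofiler -h` for your version. The file path, the image-set count and the chunk size are made-up values, and wrapping each command in a qsub script is left to your scheduler.

```python
def batch_commands(batch_file, n_image_sets, chunk_size):
    """Return one headless CellProfiler command (as an argv list) per chunk."""
    commands = []
    for first in range(1, n_image_sets + 1, chunk_size):
        last = min(first + chunk_size - 1, n_image_sets)
        commands.append([
            "cellprofiler",
            "-c",               # run without the GUI
            "-r",               # run the pipeline on startup
            "-p", batch_file,   # the file written by CreateBatchFiles
            "-f", str(first),   # first image set in this chunk
            "-l", str(last),    # last image set in this chunk
        ])
    return commands


# Hypothetical values: 25 image sets split into chunks of 10.
for cmd in batch_commands("/home/you/output/Batch_data.h5", 25, 10):
    print(" ".join(cmd))
```

Keeping each command as a list makes it easy to pass to `subprocess` or to join into a line of a job script.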

Let me know if you need further help.

Thanks for your kindness, edward. I need the ilastik-based classification to help classify the image objects; ilastik produces a .h5 file that can be imported into the ClassifyPixels module of CP 2.2.0. I installed CP 2.2.0 on our cluster, which runs CentOS 7.5. Sending images to the cluster is easy for us. Since CP 2.2.0 is not distributed as a compiled Linux build (https://github.com/CellProfiler/CellProfiler/releases/tag/2.2.0), I installed the software from source.

  1. I get an error during the installation step. It seems to come from an incompatibility between Python 2 and Python 3. The CP2 scripts are written in Python 2, and I do not understand why the error tells me to run the software with Python 3. We eventually tried converting the scripts to Python 3 style with the 2to3 tool that ships with Python 3, but this did not help: the error says that some modules cannot be imported when the software runs ilastik. I am now stuck on this.
  2. I tried another installation method using conda (https://github.com/CellProfiler/CellProfiler/wiki/Conda-Installation). It gives the correct output when I run the command ‘cellprofiler -h’. But when I tried to run the pipeline following this topic (Installing CP on amazon ec2 ubuntu instance), the same error appeared, and I am stuck on it as well.
  3. I also looked at the source installation method provided by the developers (https://github.com/CellProfiler/CellProfiler/wiki/Source-installation-(Linux)). It supplies a makefile, but our cluster cannot reach the URL (https://svn.broadinstitute.org/CellProfiler/trunk/CellProfiler).

Do you have any idea what the problem is? Thanks again.
Sen

Hi Sen,

Re: 1 - My understanding is that CP2 is all Python 2. I’m not sure why Python 3 would be involved at all. Could you please send a screenshot or upload a text file with the full output, including the error?
Re: 2 - I used conda to install CP 3.1.8 (and my own forked version) on CentOS 6. It worked after a little fiddling. Again, could you please upload the full output, including the error?
Re: 3 - That link is broken. I would just follow the instructions for installing from source, but for CentOS 6. Are you using “git checkout” to get the correct version of CellProfiler from GitHub?

If you could upload as much output as you think is relevant, that would be very helpful to us in fixing your problems.

Cheers!

Ed

Thanks, edward. Here are my answers:

  1. I installed CP 2.2.0 from the source code on GitHub (https://github.com/CellProfiler/CellProfiler/releases/tag/2.2.0): I unzipped the package and ran the command ‘pip install --editable .’ in that directory. The error output is in the attached error.txt [1]. I then abandoned this method.
  2. I tried the Anaconda method described here (https://github.com/CellProfiler/CellProfiler/wiki/Conda-Installation). This method seems to install the latest version rather than a specified version such as CP 2.2.0. I eventually resolved the dependencies with ‘conda install xxx’, guided by the error messages. Then I tried to run the pipeline following this topic (Installing CP on amazon ec2 ubuntu instance) using the example input. The error it returned is shown here:
    error

I am now stuck on this and waiting for advice. Thanks again.
Sen
[1] error.txt (11.1 KB)

Hi Sen,

Seems like CellProfiler can’t find the plugins folder. I assume you used the --plugins-directory switch to point it to the correct folder?

If not, it could be an error with your installation, as it says it can’t find ilastik. This looks like a conda install. Could you please provide your environment file (.yml file)?

Cheers,

Ed

Thanks, edward.
Here are the environment file and the conda list file.
Please note that I need the functionality supplied by ilastik, so I need to install CP v 2.1.1 - 2.3 (https://github.com/CellProfiler/CellProfiler/wiki/How-to-use-Pixel-Classification-in-CellProfiler).

Sen
[1] environment.yml (1023 Bytes)
[2] conda_list_table.txt (13.1 KB)

Hi Sen,

First thing I’d try is your environment file. Looks like you’re currently cloning the master branch. Have you tried changing the git command from “…@master” to “…@v2.3.1” or whatever the particular version you need is?
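As a small illustration of that change, here is a hedged Python sketch that swaps the ref at the end of a pip-style git requirement. The helper name and the example.invalid URL are placeholders (I don't know the exact line in your environment file), and it assumes the ref itself contains no “/”.

```python
def pin_git_ref(requirement, ref):
    """Return the requirement with its trailing @ref replaced by (or set to) ref."""
    # Keep any "#egg=..." fragment aside so it is preserved.
    base, sep, fragment = requirement.partition("#")
    head, at, tail = base.rpartition("@")
    # Only treat the text after the last "@" as a ref if it is not part of
    # the URL itself (assumption: refs contain no "/").
    if at and "/" not in tail:
        base = head
    return base + "@" + ref + sep + fragment


# Placeholder URL, not the real repository line.
print(pin_git_ref("git+https://example.invalid/some/repo.git@master", "v2.3.1"))
```

After editing the environment file, the environment has to be recreated or updated for the pinned version to take effect.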

Ed

Thanks, edward.
Here is my CP version from the conda installation, which shows v2.2.0 [1].
I have not tried changing the git reference as you recommended. How can I confirm whether the required modules are installed? I also noticed that the specified version does not contain some modules. This topic seems to describe the same situation as mine (Cell profiler vigra startup error on linux), but the problems are still unfixed :pensive:.
Do you have any idea about this?

Sen
[1] clipboard
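One way to check whether the required modules are installed is to try importing them with the same Python interpreter that runs CellProfiler. The sketch below does that. The module names in the example are assumptions (vigra comes from the linked topic, h5py and ilastik are guesses), so substitute the names from your own error output.

```python
import importlib


def check_modules(names):
    """Try to import each module and report whether it succeeded."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = "ok"
        except ImportError as err:
            status[name] = "missing: %s" % err
    return status


# Assumed module names; take the real ones from your own error output.
for name, result in sorted(check_modules(["vigra", "h5py", "ilastik"]).items()):
    print("%-10s %s" % (name, result))
```

The code avoids version-specific syntax, so it should run under both Python 2.7 and Python 3. Inside a conda environment, activate the environment first so the right interpreter is used.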

Hi,

I hate to be the bearer of bad news, but I’m excerpting the page you linked about the old ilastik integration, and adding emphasis:

Unfortunately, maintaining the necessary connection between CellProfiler and ilastik became impossible given our team’s resources, so this workflow only functioned with a very old version of ilastik (v 0.5) that was bundled with CellProfiler v 2.1.1 - 2.3, and even then only on Windows.

It does not appear this is possible to do on Linux, even with these old versions.

Again, though, why not upgrade your workflow to modern versions of these packages? You can run modern ilastik and CP 3+ both on a cluster, and in a batch mode…

Thank you, bcimini. I am trying CP3, and many modules have changed. Which module can I use to load a .h5 file produced by ilastik? It looks like many parts of the pipeline will need to be adapted.

I’m unclear which kind of h5 file you mean. You will no longer be able to load an h5 CLASSIFIER file; that function does not work anymore (as it says in the link you provided). Hopefully you can either transfer the ‘.h5’ classifier directly into ilastik 1.3+, or else export your annotations from 0.5 and re-import them in 1.3; I’ve added an ilastik tag to this post to help. From there, you can use the “batch mode” to classify your images.

If you mean an ilastik h5 IMAGE file, I am not certain you can read those in CellProfiler, but ilastik can export many other file formats, including tif.
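For the ilastik batch step, here is a hedged Python sketch that assembles a headless ilastik command exporting probability maps as tif. The flags follow ilastik's headless-mode documentation as I understand it, so verify them against your ilastik version. The launcher path, project name and image names are hypothetical.

```python
def ilastik_headless_command(launcher, project, images):
    """Build an argv list that runs an ilastik project on images without the GUI."""
    return [
        launcher,
        "--headless",
        "--project=" + project,
        "--export_source=Probabilities",
        "--output_format=tiff",
        # {dataset_dir} and {nickname} are placeholders that ilastik fills in.
        "--output_filename_format={dataset_dir}/{nickname}_probabilities.tiff",
    ] + list(images)


# Hypothetical launcher, project and images.
cmd = ilastik_headless_command("./run_ilastik.sh", "MyProject.ilp",
                               ["plate1/A01.tif", "plate1/A02.tif"])
print(" ".join(cmd))
```

The exported tif files can then be loaded into a CP3 pipeline like any other image.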
