How to install CellProfiler 2.0 on a Linux cluster

So…
Where can I find the protocol to install CellProfiler 2.0 on a Linux cluster?
Is there a CPcluster 2.0?

I’ve seen this: Guide to Batch Processing, but it seems to be for CellProfiler 1.0.

Thanks a lot!

How to install on a Linux cluster depends on your version of Linux and on which job management system you are using.

There isn’t a separate CPCluster executable any more. CellProfiler.py can be invoked in a standalone mode from the command line. Run python CellProfiler.py --help for a list of arguments, many of which are related to batch mode. A typical batch execution might look like:

python CellProfiler.py  -c -r -b -p BatchData.mat -o output_dir -i image_dir -f 10 -l 20 -d status/done10_20.txt
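
On a cluster, a command like this is usually repeated once per range of image sets. Here is a minimal Python sketch that generates those command lines. It reuses only the flags from the example above. Reading -f and -l as the first and last image set of a chunk (1-based, inclusive) and -d as a marker file written on completion is an assumption, and batch_commands and its parameters are hypothetical names, so check python CellProfiler.py --help before relying on it.

```python
# Sketch: split a batch run into chunks and build one command line per chunk.
# Flags are copied from the example command above. Treating -f/-l as the first
# and last image set (1-based, inclusive) and -d as a "done" marker file is an
# assumption, not something confirmed by the CellProfiler documentation.

def batch_commands(n_sets, chunk, batch_file="BatchData.mat",
                   output_dir="output_dir", image_dir="image_dir",
                   status_dir="status"):
    """Return one argument list per chunk of image sets."""
    commands = []
    for first in range(1, n_sets + 1, chunk):
        last = min(first + chunk - 1, n_sets)
        done = "%s/done%d_%d.txt" % (status_dir, first, last)
        commands.append([
            "python", "CellProfiler.py", "-c", "-r", "-b",
            "-p", batch_file, "-o", output_dir, "-i", image_dir,
            "-f", str(first), "-l", str(last), "-d", done,
        ])
    return commands

# Print the command lines for 25 image sets in chunks of 10.
for cmd in batch_commands(n_sets=25, chunk=10):
    print(" ".join(cmd))
```

Each printed line can then be handed to whatever job management system the cluster uses.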

The general flow for getting set up is:

  • Check out the source code
  • Install CP’s prerequisites (similar to those for the Mac: cellprofiler.org/wiki/index.php/ … p_on_a_Mac)
  • Write a job management script (see RunOne_2_0() in BatchProfiler/RunBatch.py in the source code, for example)

I’ve also started a page on the wiki to collect information on Linux & Batch processing: http://cellprofiler.org/wiki/index.php/CP2.0_on_Linux.

Hi!
The IT guy from the cluster installed CellProfiler 2.0 on the master node of the cluster.
I think he did a great job finding the correct packages, but I’m getting an error when running a simple pipeline that creates max projections. The error says: “No module named tifffile”
Did you guys run into this error before?

The cluster runs red hat linux.
If you wish to take a look at the architecture of the cluster: cms.uniovi.es/ (it runs on the one called cmi)
This was his installation routine:

  1. Install Python 2.5.5
  2. Install wxwidgets 2.8.9.1
  3. Install matplotlib 0.99.1.1
    zlib 1.2.3
  4. Install Cython 0.12.1
  5. Install PIL 1.1.7
  6. Install MySQL-python 1.2.3c1
    Requires setuptools (0.6c11-py2.5)
  7. Install numpy 1.4.1
    a) Compile BLAS
    b) Compile LAPACK 3.2.1
    c) Compile CBLAS
    d) Compile ATLAS 3.8.0
    e) Install numpy 1.4.1
  8. Install Scipy-0.7.2
  9. Install Java JDK 1.6.0_20 (64 bits)

Thanks for all the help!

tifffile.py lives in CellProfiler/contrib/tifffile.py in the source code. Perhaps this directory didn’t get checked out all the way?

The file is there, but for some reason CP doesn’t find it :frowning:

Had you heard about this issue before?

Thanks!

This is not something we’ve seen before. Was that the complete error message?

At the top of cellprofiler/modules/loadimages.py, there is a line like:
version="Revision: 10066 "
Can you tell me what that line says for you?

Also, can you post the pipeline here?

Thanks,
Thouis Jones

Version is:

version="Revision: 9976 "

I think I know the problem. You can probably fix it by adding:
export PYTHONPATH=/path/to/CellProfiler:$PYTHONPATH
to the script you use to run CellProfiler.

I thought CellProfiler.py did this automatically, but apparently not.

Another fix would be to make sure CellProfiler is started in the /path/to/cellprofiler directory.
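
To check whether either fix can help, it may be useful to see where tifffile.py actually sits and whether that directory is on Python's search path. The following is only a diagnostic sketch: locate_module and report are hypothetical helper names, /path/to/CellProfiler is the same placeholder as above, and how CellProfiler itself imports tifffile is not covered here.

```python
import os
import sys

def locate_module(module_name, root):
    """Return directories under root that contain <module_name>.py."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if module_name + ".py" in filenames:
            hits.append(dirpath)
    return hits

def report(module_name, root):
    """Print where the file is and whether that directory is on sys.path."""
    search_path = [os.path.abspath(p or os.curdir) for p in sys.path]
    for directory in locate_module(module_name, root):
        on_path = os.path.abspath(directory) in search_path
        print("%s.py found in %s (on sys.path: %s)"
              % (module_name, directory, on_path))

# Placeholder path: replace it with the real checkout directory.
report("tifffile", "/path/to/CellProfiler")
```

Run it with the same Python interpreter and the same environment as the failing job, since PYTHONPATH only affects sys.path in the process that inherits it.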

Hi!
Unfortunately it didn’t work. The behaviour remains the same.
It keeps giving the “No module named tifffile” error at the same point :frowning:

I’m out of easy ideas. Perhaps we should move to phone or Skype with you or your IT person to try to work this out faster. I would also be willing to try to debug it myself, if you are willing to give me a temporary account on the cluster.

Hi thouis!

That would be great. Thanks a lot for all the effort.
I’ll ask the IT guy if they’d mind if I lend you my account (nothing sensitive on my side). The IT guy, as you probably guessed, is clustermc.
I’ll get back to you when I have an answer :smile:

Hi, I’m trying to install CP 2.0 on Ubuntu 10.04.

It seems to work, but when I run it, it says “WARNING: Java and JVM is not installed”, even though both are definitely installed. How do I make CP see Java? Thanks.

Is “javac” on your path?

Our own script to run CP on Linux sets JAVA_HOME and adds $JAVA_HOME/jre/lib/amd64/server to LD_LIBRARY_PATH.

You should be able to find which directory you might need to add with:
% find $JAVA_HOME -name libjvm.so
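
The same search can be sketched in Python, printing a ready-made export line for each directory that holds libjvm.so. find_libjvm_dirs is a hypothetical helper name, and which of the reported directories is the right one for your JDK is something you still have to decide yourself.

```python
import os

def find_libjvm_dirs(java_home):
    """Return every directory under java_home that contains libjvm.so."""
    return sorted(
        dirpath
        for dirpath, _dirnames, filenames in os.walk(java_home)
        if "libjvm.so" in filenames
    )

java_home = os.environ.get("JAVA_HOME", "")
if not java_home:
    print("JAVA_HOME is not set")
else:
    # One candidate line per directory; add the one you need to your run script.
    for directory in find_libjvm_dirs(java_home):
        print('export LD_LIBRARY_PATH="%s:$LD_LIBRARY_PATH"' % directory)
```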

Hi,

Download from: http://data.marssoft.de/cp/Makefile.CP2

I would like to propose a build system for CellProfiler 2 prerequisites. It is currently in early alpha status, but can already be quite useful. I have been able to use it on a fairly old cluster environment based on CentOS Linux 5.2. It works equally well on a more recent Ubuntu 10.04 (Lucid). Essentially, it is a single Makefile for all CP2 prerequisites that works fully automatically. It will download, unpack, compile and install all prerequisites, namely the following:

PYTHONVERSION				= Python-2.5.5
PYPIVERSION				= setuptools-0.6c11-py2.5.egg
PILVERSION				= Imaging-1.1.7
MYSQLPYTHONVERSION			= 1.2.3
PYTHONNOSEVERSION			= nose-0.11.3
DECORATORVERSION			= decorator-3.2.0
CYTHONVERSION				= Cython-0.12.1
ATLASVERSION				= atlas3.9.25
LAPACKVERSION				= lapack-3.1.1
NUMPYVERSION				= numpy-1.4.1
FFTWVERSION				= fftw-3.2.2
SCIPYVERSION				= scipy-0.8.0b1
UMFPACKVERSION				= UMFPACK-5.4.0
UFCONFIGVERSION				= UFconfig-3.5.0
AMDVERSION				= AMD-2.2.1
MATPLOTLIBVERSION			= matplotlib-0.99.3
WXGTKVERSION				= wxGTK-2.8.11
WXPYTHONVERSION				= 2.8.11.0
JPEGVERSION				= 8b
ZLIBVERSION				= zlib-1.2.5
TIFFVERSION				= tiff-3.9.4

It is also quite convenient to use. First, download the Makefile from here: http://data.marssoft.de/cp/Makefile.CP2 to an empty directory. Then open it with a text editor, and edit the parameter TGTDIR to where you want to install. It will create a directory python25 as a subdirectory of TGTDIR, so the following setting would install to $(HOME)/usr/python25/:

TGTDIR = $(HOME)/usr

Then set the environment in your .bashrc (or whatever shell you are using), i.e. if you have TGTDIR = $(HOME)/usr then the following should work:

# python-2.5 and other CP2 prerequisites
export  PATH="$HOME/usr/python25/bin:${PATH}"
export  LD_LIBRARY_PATH="$HOME/usr/python25/lib:$HOME/usr/python25/lib64:${LD_LIBRARY_PATH}"
export  PYTHONPATH="$HOME/usr/python25/lib/python2.5/site-packages:${PYTHONPATH}"
# java-1.6 for 64bit
export  JAVA_HOME="$HOME/usr/jdk-6u20-linux-x86_64"
export  PATH="$JAVA_HOME/bin:${PATH}"
export  LD_LIBRARY_PATH="${JAVA_HOME}/jre/lib/amd64:${JAVA_HOME}/jre/lib/amd64/server:${LD_LIBRARY_PATH}"

Finally, open a shell, change to the Makefile directory, and invoke make:

make -f Makefile.CP2

That should finish the installation of all prerequisites, except for CellProfiler 2 itself. You should then be able to run CellProfiler 2 by downloading it, changing to its source directory, and invoking:

python CellProfiler.py
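
If CellProfiler does not start at this point, one thing worth checking is whether every directory named in the environment variables above really exists. The sketch below does only that; missing_entries is a hypothetical helper name, and the variable names are the ones from the .bashrc snippet above.

```python
import os

def missing_entries(value):
    """Return entries of an os.pathsep-separated list (':' on Linux)
    that are not existing directories."""
    entries = [e for e in value.split(os.pathsep) if e]
    return [e for e in entries if not os.path.isdir(e)]

for name in ("PATH", "LD_LIBRARY_PATH", "PYTHONPATH"):
    missing = missing_entries(os.environ.get(name, ""))
    print("%s: %d missing entries" % (name, len(missing)))
    for entry in missing:
        # Note: PYTHONPATH may legitimately list .egg or .zip files,
        # which are reported here because they are not directories.
        print("  not a directory: " + entry)
```

Run it in the same shell in which you start CellProfiler, so that it sees the same environment.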

The Makefile is still a preliminary version that could use some fine-tuning, but it should provide good performance in SciPy and NumPy by using ATLAS. Below is a list of known quirks.

Known Quirks:

  • You have to take care to disable CPU frequency scaling before compilation, or else ATLAS cannot tune its algorithms for your CPU.

  • Some paths in the Makefile are absolute paths into my home directory. If you get error messages about the path to CellProfiler 2, change the path in the Makefile.

  • You need a fairly recent version of gcc. I have used gcc-4.4 by compiling it myself. If you require a Makefile for gcc as well, please contact me.

  • Java is not downloaded or installed automatically. Please download and unpack it yourself, and adjust your JAVA_HOME variable accordingly.

  • MySQL is not downloaded or installed automatically. Please download and install it yourself, and adjust your PATH variable accordingly.

I am more than happy about feedback and reports. I would like to improve the Makefile over time, to have a stable and reliable foundation for CP2 installation on a Linux cluster - any help is highly appreciated.

All the best,

Mario Emmenlauer, Biozentrum Basel

Hello Mario,

Thanks very much for this. It will come in very handy for new Linux users trying to get CP running on their cluster.

My hope is that eventually, CP’s dependencies will be mature enough that we can use a .deb or .rpm solution, as well. This would be even easier for Linux users.

Hi Thouis,

I would very much like a deb or rpm solution! However, clusters are often hand-crafted by a few admins, so I assume there will always be a need for customization: users installing in their home directory, admins installing in a software tree directory, very old or very new compiler dependencies, etc. Also, some packages like ATLAS can only show optimal performance when compiled on the machine they are used on, since they optimize during compile-time.

So I hope we can have both in the long run - deb and rpm for desktop users, and a well-crafted compilation for clusters or HPC users.

Hi Mario,

I was wondering if you had tried the Makefile on openSUSE?

Cheers,
Amos

Hi Amos,
I have not tried it on SuSE or openSUSE. But it is quite a generic Makefile, so it should work if you have a recent gcc (4.2 or newer preferred) and mysql-development installed. If you try it, please open it with a text editor and check the paths for your system, mainly the one to the output directory (TGTDIR).
Let me know if you have problems, all the best,
Mario

Hi All!

I’m trying to use this Makefile, but I’m having some trouble:

File "<string>", line 1, in <module>
zipimport.ZipImportError: can't decompress data; zlib not available
make: ** [/home/cglienke/usr/python25/lib/python2.5/site-packages/easy-install.pth] Erro 1

Sorry if my question is stupid, but I’m a little new to the Linux world. :blush:

Must Java and MySQL be installed before or after running the Makefile?

Thanks!