Using CellProfiler 4.0.0rc9 as a Python Package for Batch Image Segmentation

To preface this post: I realize that CellProfiler 4.0.0 is still under development, but I need to use Python 3 with CellProfiler for a software package that I am developing. I therefore downloaded the latest CellProfiler code from GitHub, installed the dependencies with apt-get, built CellProfiler “from source”, and packaged it all as a Docker container. I got this working, and I can even open the CellProfiler GUI from the Linux Docker container running on my MacBook Pro (although I had to manually patch a few installation bugs; I’m happy to share those if anyone is interested).

For the next step, I created a CellProfiler cell segmentation pipeline using the GUI, and then I exported it as a .cppipe file. Then, I tried to run the pipeline using CellProfiler as a Python package. I followed the guide at https://github.com/CellProfiler/CellProfiler/wiki/CellProfiler-as-a-Python-package; below is the script:

from cellprofiler_core.pipeline import Pipeline

pipeline = Pipeline()

pipeline.load("output/segment.cppipe")

measurements = pipeline.run()

Unfortunately, this gives me the following Javabridge error:

Traceback (most recent call last):
  File "/home/jupyter-user/miniconda/envs/preprocessing/lib/python3.8/site-packages/cellprofiler_core/pipeline/_pipeline.py", line 1402, in prepare_run
    not module.prepare_run(workspace)
  File "/home/jupyter-user/miniconda/envs/preprocessing/lib/python3.8/site-packages/cellprofiler_core/modules/images.py", line 335, in prepare_run
    ifcls = javabridge.class_for_name("org.cellprofiler.imageset.ImageFile")
  File "/home/jupyter-user/miniconda/envs/preprocessing/lib/python3.8/site-packages/javabridge/jutil.py", line 1743, in class_for_name
    ldr = static_call('java/lang/ClassLoader', 'getSystemClassLoader',
  File "/home/jupyter-user/miniconda/envs/preprocessing/lib/python3.8/site-packages/javabridge/jutil.py", line 939, in static_call
    fn = make_static_call(class_name, method_name, sig)
  File "/home/jupyter-user/miniconda/envs/preprocessing/lib/python3.8/site-packages/javabridge/jutil.py", line 910, in make_static_call
    klass = env.find_class(class_name)
AttributeError: 'NoneType' object has no attribute 'find_class'

However, if I run the pipeline in the command line headless mode, it works fine. Here is a GitHub issue that seems to express the exact same problem (albeit with CellProfiler v3.1.8):
https://github.com/CellProfiler/CellProfiler/issues/3863

If there is any way (whether in beta development or not) to run a CellProfiler pipeline from a Python script without hitting this error, I would love to know about it.

Thanks!

Hi @alam-shahul,

Thanks for providing so much detail. The .cppipe format won’t contain any image sets, so I think that might be what’s wrong here: the pipeline is trying to run without any images. A .cpproj file usually contains the bundled image set list to avoid this problem; otherwise, I expect you’ll need to specify some image sets before trying to run.

I’ll try to get that tutorial updated to reflect this, since the example pipeline is a special case.

Hi @DStirling,

Thanks for the swift response! A few follow-up questions:

  1. How can I specify the input image set? Can I do it using the Python package?

  2. How can I import the .cpproj file in Python? I tried to replace segment.cppipe with my saved segment.cpproj, but that gave me the following error:

Traceback (most recent call last):
  File "cellprofiler_test.py", line 5, in <module>
    pipeline.load("output/segment.cpproj")
  File "/home/jupyter-user/miniconda/envs/preprocessing/lib/python3.8/site-packages/cellprofiler_core/pipeline/_pipeline.py", line 282, in load
    if Pipeline.is_pipeline_txt_fd(fd):
  File "/home/jupyter-user/miniconda/envs/preprocessing/lib/python3.8/site-packages/cellprofiler_core/pipeline/_pipeline.py", line 236, in is_pipeline_txt_fd
    header = fd.read(1024)
  File "/home/jupyter-user/miniconda/envs/preprocessing/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)

I’m assuming that this is because the Pipeline class doesn’t take .cpproj files as input. Which function/class should I use?
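One plausible reading of the truncated traceback above: a .cpproj project is, as far as I know, an HDF5 binary file, whereas a .cppipe is plain text, so reading the project as text fails in the codec layer. Below is a minimal sketch for telling the two apart before calling Pipeline.load. It only checks the standard 8-byte HDF5 signature; the helper name looks_like_hdf5 is hypothetical and is not part of CellProfiler.

```python
from pathlib import Path

# Signature that HDF5 files without a user block start with.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature.

    Hypothetical helper, not part of CellProfiler. Assumption: a .cpproj
    project is an HDF5 container, while a .cppipe pipeline is plain text.
    """
    with Path(path).open("rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

If this returns True for a file, passing it to a text-based loader is unlikely to work, and the file probably needs to be opened as a project rather than as a pipeline.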

Hi @alam-shahul,

We’re now putting together some proper instructions for this. For the time being, there are some more details available here for use with Jupyter.

Hope that helps!

@DStirling Hmm, I just tried it out, and I’m still getting the same AttributeError: 'NoneType' object has no attribute 'find_class' error.

Below is the gist of my code:

from cellprofiler_core import image, object, pipeline, preferences, workspace, measurement
preferences.set_headless()

...

def segment(reference_image_filepath, merged_composite_filepath):
    """
    """
    segmentation_pipeline = pipeline.Pipeline()
    segmentation_pipeline.load("segment.cppipe")

    image_set_list = image.ImageSetList()
    image_set = image_set_list.get_image_set(0)

    reference_image = imageio.imread(reference_image_filepath)
    reference_handle = image.Image(reference_image)
    merged_composite = imageio.imread(merged_composite_filepath)
    merged_composite_handle = image.Image(merged_composite)

    image_set.add("DAPI", reference_handle)
    image_set.add("RNA", merged_composite_handle)

    object_set = object.ObjectSet()

    objects = object.Objects()

    object_set.add_objects(objects, "example")
    measurements = measurement.Measurements()

    segmentation_workspace = workspace.Workspace(
        segmentation_pipeline,
        None,
        image_set,
        object_set,
        measurements,
        image_set_list,
    )
    output_measurements = segmentation_pipeline.run(None)

Note: I replaced the module parameter in the Workspace constructor with None because I’m not sure what that’s supposed to be.

Also, I’m confused, because my .cppipe includes regex rules in the NamesAndTypes module to find and name the appropriate files. Why do I need to create an ImageSetList in that case?

Edit: there is also this error at the top of the traceback: ERROR:root:Failed to prepare run for module Images

I think this means that it’s trying to run the Images module, which is good because that should be the first one in the pipeline.

Another point: when testing locally using Docker, I can run it from the command line with the following command: cellprofiler -r -c -p output/segment.cppipe -i input/

However, this only works if I forward the container’s X display to my host machine, even though no GUI window opens when I run the pipeline this way.

But when I try to use the same command on a remote computer (for which I can’t use X-forwarding, unfortunately), I get the following error:

Unable to access the X Display, is $DISPLAY set properly?

How can I disable the use of the display entirely?
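Until the display requirement is fixed, one possible workaround is to launch the same command from Python and, where it is installed, wrap it in xvfb-run so that a virtual X display is available. This is only a sketch: the function names are hypothetical, the cellprofiler flags are copied from the command above, and the availability of xvfb-run on the remote machine is an assumption.

```python
import shutil
import subprocess


def build_headless_command(pipeline_path, input_dir, use_xvfb=False):
    """Build the argv list for a headless CellProfiler run.

    The -r, -c, -p, and -i flags are the ones used in the command above.
    use_xvfb is an assumption: it wraps the call in xvfb-run, which supplies
    a virtual X display on machines that have Xvfb installed.
    """
    command = ["cellprofiler", "-r", "-c", "-p", str(pipeline_path), "-i", str(input_dir)]
    if use_xvfb:
        # -a asks xvfb-run to pick a free display number automatically.
        command = ["xvfb-run", "-a"] + command
    return command


def run_headless(pipeline_path, input_dir, use_xvfb=False):
    """Run the pipeline in a subprocess and raise if it exits non-zero."""
    command = build_headless_command(pipeline_path, input_dir, use_xvfb)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    return subprocess.run(command, check=True)
```

Because the command is passed as a list, no shell is involved, so something like run_headless("output/segment.cppipe", "input/", use_xvfb=True) can be called directly from a script.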

Hi, @alam-shahul! Thank you for the feedback and questions. They are very helpful.

The Images, Groups, Metadata, and NamesAndTypes modules are useful, but their respective implementations are garbage. You can use them now, but doing so is extremely cumbersome, so I want to discourage their use outside of the CellProfiler application. Nevertheless, I agree that CellProfiler lacks a good way to programmatically construct a non-trivial ImageSetList.

I am going to work on a solution to make that easier. For example, @DStirling and I brainstormed a method on workspace for constructing an ImageSetList from globbed pathnames. The signature might look like this:

from os import PathLike
from typing import List

class Workspace:
    def add_images(self, pathnames: List[PathLike]) -> None:
        pass

What do you think? Would this make it easier?

I suppose we could also consider implementing a method that takes a schema-defined file (CSV, JSON, YAML, etc.). This could replace some of the functionality provided by the Groups, Metadata, and NamesAndTypes modules.
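To make the idea more concrete, here is a rough sketch of the kind of grouping such a helper might perform on globbed pathnames before anything is handed to an ImageSetList. Everything in it is hypothetical: the function name, the assumed <site>_<channel>.tif naming convention, and the returned structure. It uses only the standard library and does not touch the CellProfiler API.

```python
import re
from collections import defaultdict
from pathlib import Path


def group_image_sets(directory, pattern=r"(?P<site>.+)_(?P<channel>[^_]+)\.tif$"):
    """Group image files into image sets keyed by site, then by channel.

    Hypothetical helper, not CellProfiler API. Assumes files are named
    <site>_<channel>.tif, for example well1_DAPI.tif and well1_RNA.tif.
    """
    regex = re.compile(pattern)
    image_sets = defaultdict(dict)
    for path in sorted(Path(directory).glob("*.tif")):
        match = regex.match(path.name)
        if match is None:
            continue  # Skip files that do not follow the naming convention.
        image_sets[match["site"]][match["channel"]] = path
    return dict(image_sets)
```

Each entry of the result maps channel names such as "DAPI" and "RNA" to a path, which is roughly the information NamesAndTypes derives from its regex rules in the GUI.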

This is a bug. We’ll fix ASAP. :smile:

Hi @agoodman ! Thanks for the response.

I would like to note that I am by no means a CellProfiler power user - indeed, this is my first time ever using CellProfiler. So perhaps the best perspective I can offer is that of a new user?

First of all, I don’t use the Metadata or Groups modules for my application, so they’re useless to me; as you mentioned, I only have them because I started with a GUI pipeline (which requires them) and then exported that pipeline to a .cppipe file.

I agree that the add_images signature that you proposed looks easier to use. Perhaps it could take either a list of pathnames or a config file, if someone wants to incorporate the other modules’ functionality?

I just want to be able to run CellProfiler in headless mode and without using a bash shell…

Also, how long would it take to fix this? If there’s any chance that someone could fix this today, that would be of great benefit to me… :sweat_smile:

Hello,

Have you managed to resolve the problem with AttributeError: 'NoneType' object has no attribute 'find_class'? I’m getting the same error…

Cheers!

Moved past the initial issue with:

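from cellprofiler_core.pipeline import Pipeline
from cellprofiler_core.utilities.java import start_java, stop_java

# Start the JVM before anything touches javabridge.
start_java()

pipeline = Pipeline()
...

# Shut the JVM down once the pipeline work is done.
stop_java()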
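One caveat with the start/stop pairing above is that an exception in the middle leaves the JVM running. A small context manager can guarantee the stop call. This is a generic sketch, not something CellProfiler provides: jvm_session is a hypothetical name, and the start and stop callables are passed in so the block runs without cellprofiler_core installed.

```python
from contextlib import contextmanager


@contextmanager
def jvm_session(start, stop):
    """Call start() on entry and stop() on exit, even if the body raises.

    start and stop are passed in so the sketch has no hard dependency; the
    intended arguments are start_java and stop_java from the snippet above.
    """
    start()
    try:
        yield
    finally:
        stop()


# Intended use (requires cellprofiler_core, so it is left commented out):
# from cellprofiler_core.utilities.java import start_java, stop_java
# with jvm_session(start_java, stop_java):
#     ...  # load and run the pipeline here
```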