IdentifySecondary - MeasureObjectAreaShape bug or problem

cellprofiler

#1

Hi,

I am having a problem where the IdentifySecondary module finds a cell near the edge of an image but propagates a long straight area that isn’t part of the cell. This may be a problem with the image (though I doubt it), but my main problem is that the MeasureObjectAreaShape module throws an out-of-memory error when trying to measure the secondary regions. If it didn’t error out I could filter the object out, though I suspect it would be better if the propagation method didn’t behave like this.

It would be easiest if I could send the image and the pipeline, but I find that they are too big to attach. I will try to email a zip to someone after I post this. One image is a few nuclei stained with DAPI and the other is the actin for those cells. Running the pipeline gives a cell outline that stretches along the bottom of the image as an extension of one of the objects.

Where would be the best place to catch this problem?

Thanks for taking a look at this! -John


#2

Hi John,

Interesting problem! After running your pipeline, I took a screenshot of the IdentifySecondary output so that we can all see what you’re talking about (see “Picture 3.png”).
[attachment=0]Picture 3.png[/attachment]

Indeed, it did look very strange at first. But I brightened up the image a bit and saw that you do have what seems to be a horizontal image artifact along the bottom of your image (see “Picture 2.png”).

[attachment=1]Picture 2.png[/attachment]

So I don’t think that’s a problem with the Propagate function. To limit the extent of the secondary object, try the “Distance - B” option instead of Propagate. It still follows the background intensity, but you can also limit the extent manually (see the “For DISTANCE…” option). You can also try the Crop module to get rid of the boundary artifacts.
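
As a rough illustration of the cropping idea outside of CellProfiler, here is a minimal Python sketch that trims a fixed margin from an image before segmentation. It is not the Crop module's implementation; the function name `crop_margin`, its parameters, and the representation of the image as a list of pixel rows are all assumptions made for this example.

```python
def crop_margin(image, top=0, bottom=0, left=0, right=0):
    """Return a copy of a 2-D image (a list of equal-length rows)
    with the given number of pixels removed from each side.

    Hypothetical helper for illustration; not part of CellProfiler.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if top + bottom >= height or left + right >= width:
        raise ValueError("margins would remove the whole image")
    # Slice the rows first, then slice the columns within each kept row.
    return [row[left:width - right] for row in image[top:height - bottom]]


# Toy image whose last row plays the role of a bright horizontal artifact.
image = [
    [0, 0, 0],
    [0, 5, 0],
    [0, 0, 0],
    [9, 9, 9],
]
cropped = crop_margin(image, bottom=1)
```

How many rows to trim would have to be judged from the brightened image; the single row used here is only a placeholder.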

Re: the Out of Memory errors within MeasureObjectAreaShape, this is usually caused by the computationally intensive Zernike calculations (last option). I set that to “No” and the module went from 35 seconds of processing time to 1 second, and likely used much less RAM. In addition, I ran your pipeline without an error (though my machine has 7 GB of RAM), even with the Zernike features, so it’s not a bug. You can also try turning off the display of windows, which saves RAM, via File -> Set Preferences -> Display mode.

On a separate note, I saw that you were using LoadSingleImage, though we prefer users to implement LoadImages for most purposes. LoadImages does not require you to specify the full image name (in fact, your pipeline didn’t run for me at first, since the images you sent were different from those specified in LoadSingleImage), and thus is more general.

Cheers,
David





#3

Hi David,

Thanks for taking a look at this so quickly. As soon as I sent this I thought I might eat my words regarding my image not having artifacts. Good catch, I should have been more thorough. The fields just before and just after the image I sent did not have the same artifact, so I’m a bit clueless about where it came from.

I did test out the Distance - B setting for that image and noted that it did a better job. On the other hand, the Propagate setting works much better on an image with lots of cells/cytoplasm. See attached jpg example 1: [attachment=0]idsecondary_eg.jpg[/attachment]

Most of my images have quite a few cells and I like the robust way that the propagate algorithm works (good job there for sure!) so I think I’ll stick with that method.

Aside from images with (seemingly random) artifacts I have had a problem with the IdSecondary module on images with few or no cells. See attached jpg example 2:

I can find these images by running the IdSecondary module and finding the global image background median. With very few (or zero) cells, the median of the background is significantly less than the average median of a bunch of untreated cell images. I was dealing with this problem by setting a lower limit for the IdSecondary algorithm in these cases, but the distance method works in the above case as well as in this one, so I think I will put a catch in my script to run the propagate algorithm except in these outlier cases.
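
A catch like the one described above could look roughly like the following Python sketch. It is only an illustration under assumptions: the function name `choose_secondary_method`, the `ratio_cutoff` parameter, and its default of 0.5 are hypothetical, and the actual cutoff would have to be tuned against real plates.

```python
from statistics import mean, median


def choose_secondary_method(pixels, reference_median, ratio_cutoff=0.5):
    """Pick a secondary-object method from an image's median intensity.

    pixels           -- flat sequence of pixel intensities for one image
    reference_median -- average of the medians of untreated-cell images
    ratio_cutoff     -- hypothetical threshold; images whose median falls
                        below ratio_cutoff * reference_median are treated
                        as having few or no cells
    """
    if median(pixels) < ratio_cutoff * reference_median:
        return "Distance - B"   # outlier image: few or no cells
    return "Propagate"          # typical image with plenty of cells


# Reference value built from the medians of (made-up) untreated images.
untreated = [[8, 9, 10, 11], [9, 10, 11, 12]]
reference = mean(median(img) for img in untreated)

sparse_choice = choose_secondary_method([1, 1, 1, 2], reference)
dense_choice = choose_secondary_method([8, 9, 10, 12], reference)
```

Comparing against a ratio of the reference rather than a fixed intensity keeps the check usable when overall staining brightness differs between plates, though that choice is also an assumption of this sketch.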

Regarding the Zernike calculations, I’m doing pattern recognition on the results from these images, so I want to gather these features. I’m using a PC with 4 GB of RAM, so the Out of Memory errors must really be big.

Also, regarding LoadSingleImage: I’m using CP in a rather unusual manner. I create a pipeline in the normal fashion and then run CP, but stop it with a breakpoint in the CellProfiler script right before the first module is run. Currently that’s line 3830, and I’m using the Developer version (in case someone is wondering what I’m talking about). I save the ‘handle’, and then my own script opens the handle and passes it to modules as necessary. This allows me to interleave all kinds of manipulations of the handle and the results as I process each image in a batch. I have an entire databasing system, built up over many years, to store results from screening data for thousands of plates. I used to use ImagePro for image processing, but eventually found it to be not nearly as modern as CP in terms of measurements and segmentation. Anyway, I’m not running CP through your GUI when I do batch processing, and LoadSingleImage allows me to change the name of the current image in the handle with each cycle. I’m not sure if you want to encourage this behavior, but it’s a nice way to use CP within a larger system. I never actually change your modules, so as you improve them I don’t have to update my code.
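
For readers trying to picture this workflow, the general pattern (save the pipeline state once, then swap in a new image name each cycle and pass the state through the modules) can be sketched in Python as below. This is not CellProfiler code and does not use its API: `run_batch`, the `current_image` key, and the toy modules are all hypothetical stand-ins.

```python
def run_batch(saved_state, image_names, modules):
    """Run every module on every image, starting each cycle from a copy
    of the saved pipeline state.

    saved_state -- dict standing in for the saved 'handle' (hypothetical)
    image_names -- file names to process, one per cycle
    modules     -- callables that take a state dict and return a state dict
    """
    results = []
    for name in image_names:
        state = dict(saved_state)        # shallow copy, so the saved state's
        state["current_image"] = name    # top-level keys are left untouched
        for module in modules:
            state = module(state)        # custom steps can be interleaved here
        results.append(state)
    return results


# Toy modules that only record that they ran on the current image.
def toy_identify(state):
    return {**state, "identified": state["current_image"]}


def toy_measure(state):
    return {**state, "measured": True}


saved = {"pipeline": "example"}
out = run_batch(saved, ["a.tif", "b.tif"], [toy_identify, toy_measure])
```

Because each cycle starts from a fresh copy, results from one image do not leak into the next, which is one way to keep per-image bookkeeping simple in a larger system.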

Keep up the Awesome work, John



#4

John,

Sorry for the long response time, but it’s great to hear from you about one of the many ways that people are using CP! Yours is certainly a unique setup, but in any case we’re glad it’s working for you.

It may not benefit you at all, since you seem to have a good scheme going, but others might also be interested in running a command-line (non-GUI) version of CP, and they should check out CPCluster on our website. It’s optimized for running large image sets in batch mode on a computing cluster.

Re: Out of Memory errors: Matlab has a habit of eating up its allocated RAM and not releasing it, so it is possible that after many cycles even 4GB could be used up.

Cheers,
David