Worm Toolbox .xml worm model manipulation

Sample image and/or code

Raw image:
Find the raw image here: CP_issues/20210129-toxin16A-p02-m2X_A09.TIF at 6f3b16c2ac71f11682891b17f28d941f4a769b07 · tcrombie/CP_issues · GitHub

Working example:

  1. Clone the CP_issues repo found here: GitHub - tcrombie/CP_issues: Reproducible CP issues
  2. Open CP_issues/short_worms/short_worm_example.cpproj in CellProfiler-4.0.7
  3. Configure the input and output paths for your machine and run the analysis.

Background

  • The example image is from an experiment testing the effect of toxins on C. elegans. The nematodes in the well image are sick due to toxin exposure and some are very small. I made a worm model from small sick worms called MDHD.xml (CP_issues/short_worms/input/MDHD.xml) that can identify the smaller worms in this well. However, occasionally the model identifies worms that are extremely small. For example, the output from the pipeline contains worms with Worm_Length of 2 (CP_issues/short_worms/output/output_data/OverlappingWorms_MDHD.csv).

Analysis goals

I want to measure the lengths of nematodes in the well. Note, I am not concerned about larger worms being separated into two worms by the MDHD.xml model. I am more interested in removing the very short, fat worms that are clearly not worms but debris in the well. You can see the worm outlines I’m referring to in the output image here in blue: CP_issues/20210129-toxin16A-p02-m2X_A09_overlay.png at 6f3b16c2ac71f11682891b17f28d941f4a769b07 · tcrombie/CP_issues · GitHub

Challenges

  • I do not understand how the MDHD.xml worm model finds worms of length 2 when the min-path-length parameter is set to 31.77 (see MDHD.xml for details).

Questions

  1. How should I interpret the worm model parameters in the MDHD.xml file? Specifically, how do these parameters influence the minimum length of a worm? The values below are taken directly from my MDHD.xml file.
  <min-area>143.49699999999999</min-area>
  <max-area>705.5</max-area>
  <cost-threshold>66.58667600021171</cost-threshold>
  <num-control-points>21</num-control-points>
  <max-skel-length>127.1088136726586</max-skel-length>
  <min-path-length>31.772433103515787</min-path-length>
  <max-path-length>139.81969503992448</max-path-length>
  <median-worm-area>428.5</median-worm-area>
  <max-radius>4.0</max-radius>
  <overlap-weight>5.0</overlap-weight>
  <leftover-weight>10.0</leftover-weight>
  <training-set-size>100</training-set-size>
  2. Does the fact that I made the model with CP3.1.9 affect how it is run with CP4.0.7?
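For anyone wanting to inspect these values programmatically, here is a minimal Python sketch that reads the scalar parameters from a worm model. The `<training-data>` root element and the helper name `read_scalar_params` are assumptions made for illustration; the real MDHD.xml may use a different root, XML namespaces, and extra non-scalar elements, which is why the code strips namespaces and skips anything that is not a single number.

```python
import xml.etree.ElementTree as ET

# Fragment assembled from the values quoted in the post. The <training-data>
# root is an assumption; the real file may differ in root name and namespace.
MODEL_XML = """<training-data>
  <min-area>143.49699999999999</min-area>
  <max-area>705.5</max-area>
  <cost-threshold>66.58667600021171</cost-threshold>
  <num-control-points>21</num-control-points>
  <max-skel-length>127.1088136726586</max-skel-length>
  <min-path-length>31.772433103515787</min-path-length>
  <max-path-length>139.81969503992448</max-path-length>
  <median-worm-area>428.5</median-worm-area>
  <max-radius>4.0</max-radius>
  <overlap-weight>5.0</overlap-weight>
  <leftover-weight>10.0</leftover-weight>
  <training-set-size>100</training-set-size>
</training-data>"""


def read_scalar_params(xml_text):
    """Return {tag: float} for every leaf element that holds a single number."""
    root = ET.fromstring(xml_text)
    params = {}
    for elem in root.iter():
        # Skip container elements and empty elements.
        if len(elem) > 0 or elem.text is None:
            continue
        tag = elem.tag.split("}")[-1]  # drop a "{namespace}" prefix if present
        try:
            params[tag] = float(elem.text)
        except ValueError:
            pass  # not a plain number (e.g. a list of values); ignore it
    return params


params = read_scalar_params(MODEL_XML)
for name, value in params.items():
    print(f"{name}: {value:g}")
```

To try this on the actual model, pass the contents of CP_issues/short_worms/input/MDHD.xml to `read_scalar_params` instead of the inline fragment.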

Hmmm, I’m not certain how to interpret the parameters; naively, I would think you are correct that the minimum path length and/or area would preclude such a small object, but I don’t recall the underlying code well enough to be certain.

  1. My first couple of ideas for fixes (none of them mutually exclusive) would just be to a) use an Opening module on your binary image before you run UntangleWorms to remove small spots b) use FilterObjects to remove small crud like that you don’t like after Untangling c) if it isn’t too painful, redo your training and make sure you don’t accidentally have some small worms in your training set that are somehow leading to them ending up being selected downstream.
  2. It shouldn’t affect the Untangling per se, we did not make any major changes to that module in my recollection (or that are recorded in the release post), but it might affect your thresholding.

Thanks @bcimini!

  • Q1 Fix idea A: I’ve tried tinkering with the diameter range for objects added to the binary image. Trouble is, I have small thin worms with an equivalent diameter similar to the crud in the wells.

  • Q1 Fix idea B: I like it! I might use ClassifyObjects to retain the small crud-like worms too so I can step through a size threshold that works best.

  • Q1 Fix idea C: I don’t like it! lol. Making worm models is painful. Although you’re right, it should help.

  • Q2: OK, thanks. I did notice that thresholding is slightly different between 3 and 4, but I don’t think it will affect the model performance. I suppose I could test by running the same model in CP3 and CP4.
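The idea of stepping through size thresholds (fix idea B) can also be prototyped outside CellProfiler. Below is a small Python sketch that counts how many objects would be kept or discarded at each candidate minimum length. The function name, the example lengths, and the candidate thresholds are all made up for illustration; real values would come from the Worm_Length column of the pipeline output.

```python
def count_by_threshold(lengths, thresholds):
    """Map each candidate minimum length to a (kept, discarded) pair."""
    result = {}
    for t in thresholds:
        kept = sum(1 for x in lengths if x >= t)
        result[t] = (kept, len(lengths) - kept)
    return result


# Made-up worm lengths, including a few debris-like short objects.
example_lengths = [2, 5, 18, 33, 40, 57, 61, 90]

sweep = count_by_threshold(example_lengths, thresholds=[10, 20, 30])
for t, (kept, dropped) in sweep.items():
    print(f"min length {t}: keep {kept}, discard {dropped}")
```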

Anyone else have experience with how the parameters in the worm models can be manipulated to prevent super short worms?

Ha, don’t blame you for not wanting to retrain, just is TECHNICALLY an option. If you were lucky and had a model that had trained easily in half an hour or so, it might have been worth it; not so much otherwise.

I’ll see if I can dig into the code more later this week.

@bcimini Any chance you’ve had time to look into the worm toolbox model code? Still not certain how these small worms are showing up.

Sorry, this had slipped my filter!

It appears in the source code that the min_area and min_length parameters are ONLY used in training; that’s why they aren’t being used to throw out any resulting worms here. I’ve confirmed that’s true even in versions of the code going back 5+ years, so it isn’t just that we accidentally deleted it in the code during an upgrade, it’s how the module was designed.

If you increase your minimum object size in IdentifyPrimaryObjects (IPO) to 15, and/or filter the objects from that before you make your binary image, that seems to be the most reliable way to keep non-worm objects from being called as worms. I tried playing with the weights in UntangleWorms a bit, but it doesn’t seem to help much in this particular case. You can of course filter them afterward as well.

Sorry to not provide more answers!

Hi,
I am using the Worm Toolbox to “untangle” Drosophila pupae. In the past I have definitely edited the model to make it work, but that was long ago and the model has worked for years.

I do recall I got help from Carolina Wählby (who wrote and published the toolbox, and whom you may want to contact). She adjusted the ‘cost-threshold’ of my model as a way to make it more or less strict. Could this be how it gets paths well below the minimums set?

P.S. I have to make a new model next week as we have a new imaging machine, so I may have to edit a new model again.

Thanks again @bcimini. It’s good to know that some of the model parameters are just used in training. I decided to apply length filters post analysis. This option lets the noise come through in the output and I can adjust the size filter as necessary for different conditions in my experiment. We’re working to implement these filters in R. See AndersenLab/easyXpress for our progress if interested. The easyXpress package is written to clean, process, and visualize CellProfiler data output from our high-throughput phenotyping assays, for those interested in an R solution rather than CellProfiler Analyst.
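As a rough illustration of the post-analysis approach (this is not the easyXpress implementation, which is in R), here is a minimal Python sketch that flags short worms in a CellProfiler-style CSV instead of deleting them, so the cutoff can be changed per condition. The 10-pixel cutoff, the `too_short` flag, the `ObjectNumber` column, and the sample rows are assumptions made for illustration; only the `Worm_Length` column name comes from the output described above.

```python
import csv
import io

MIN_LENGTH = 10.0  # hypothetical cutoff; tune per experiment or condition


def flag_short_worms(rows, min_length=MIN_LENGTH, column="Worm_Length"):
    """Return copies of the rows with a boolean 'too_short' flag added."""
    flagged = []
    for row in rows:
        out = dict(row)
        out["too_short"] = float(row[column]) < min_length
        flagged.append(out)
    return flagged


# Tiny made-up table standing in for OverlappingWorms_MDHD.csv.
SAMPLE_CSV = "ObjectNumber,Worm_Length\n1,2\n2,45.5\n3,9.9\n4,80\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
flagged = flag_short_worms(rows)
kept = [r for r in flagged if not r["too_short"]]
print(f"kept {len(kept)} of {len(flagged)} objects")
```

Keeping the flag as a column rather than dropping rows makes it easy to review what was excluded and to rerun with a different cutoff.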