CellPainting - Update from Aarhus with questions



Hi @bcimini, @cells2numbers (or @Tim_1), @shntnu and @AnneCarpenter (and whoever else is interested)

We have now worked with CellPainting for some time and overcome some obstacles. I am happy to report that the data looks good from our perspective. Here is an example heatmap of control compounds:

Heatmap with dendrogram, Aarhus

As mentioned earlier I have attempted to compare our profiles with the profiles of the same compound from the 2017 GigaDatabase submission from your group.
However, when I analyze the uploaded per-well profiles the same way as my own per-well profiles, I see very weird profiles. Here is an example:

Heatmap from Broad GigaDB data

There seems to be very poor correlation between related compounds.
Have you profiled some (or all?) of the data from the GigaDB dataset to verify that related compounds correlate nicely?
I have not attempted to download the images and go through the whole CellProfiler analysis, as I had hoped this would not be necessary.
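
A minimal sketch of the kind of check described above, written in Python purely for illustration (the analysis in this post is done in R). It collapses per-well profiles to one median profile per compound and computes the Pearson correlation between compounds. The compound names, feature values, and function names are all hypothetical, and the features are assumed to be already normalized.

```python
from collections import defaultdict
from math import sqrt
from statistics import median


def median_profiles(wells):
    """Collapse per-well profiles into one median profile per compound.

    `wells` is a list of (compound, feature_vector) pairs.
    """
    grouped = defaultdict(list)
    for compound, features in wells:
        grouped[compound].append(features)
    # zip(*rows) transposes the wells so each feature is aggregated separately.
    return {
        compound: [median(column) for column in zip(*rows)]
        for compound, rows in grouped.items()
    }


def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return cov / norm if norm else float("nan")


def correlation_matrix(profiles):
    """Pairwise Pearson correlation, keyed by (compound, compound)."""
    names = sorted(profiles)
    return {(a, b): pearson(profiles[a], profiles[b]) for a in names for b in names}


# Made-up, already-normalized feature values purely for illustration.
example_wells = [
    ("compound_A", [1.0, 2.0, 3.0, 4.0]),
    ("compound_A", [1.2, 2.1, 2.9, 4.3]),
    ("compound_B", [1.1, 1.9, 3.2, 3.9]),
    ("compound_C", [4.0, 3.0, 2.0, 1.0]),
]

profiles = median_profiles(example_wells)
corr = correlation_matrix(profiles)
```

In this toy data, compound_A and compound_B are constructed to have similar profiles and compound_C an opposite one, so related compounds should show a high positive correlation. If they do not on real data, normalization and feature selection are reasonable first places to look.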

There is also no correlation between our profiles and yours (as was perhaps expected…). We are using some different excitation sources and filters on our CellDiscoverer 7 compared to the ones you have reported, and thus see differences in bleed-through (AGP, SYTO14, and ConA are especially prone to bleed-through on our setup). We are hoping to optimize this with some extra single-band-pass filters that should be closer to the ones in the CellPainting literature.
Furthermore, we have made some changes to the analysis protocol, and I have one question about this. When identifying cells in IdentifySecondaryObjects, your previous work uses the SYTO14 / RNA channel. When we do this, our segmentation is very poor. We achieve significantly better segmentation when using the AGP channel. My question is: Why do you use the RNA channel instead of the AGP channel, when the AGP channel contains the plasma membrane? Here are two examples of segmentations of the same image:

RNA channel
AGP channel

For now we still export our measurements to a MySQL database, and then read from that in R. I think we will change this to export to .csv and then import the .csv files into R. I have still not been able to make CellProfiler export one .csv per well and site, as Shantanu recommended earlier, but we do not produce hundreds of plates (yet…?), so for now the other method works fine.
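
As a rough illustration of the .csv route, here is a minimal Python sketch (the same idea carries over to R) that reads every .csv in a folder into one table and records which file each row came from. The file names, column names, and the `load_csv_folder` helper are hypothetical and do not reflect the actual CellProfiler output layout.

```python
import csv
import tempfile
from pathlib import Path


def load_csv_folder(folder, pattern="*.csv"):
    """Read every matching CSV in `folder` into one list of row dicts.

    Each row gets a 'source_file' key so the originating file (for
    example one well and site) can be traced after concatenation.
    """
    rows = []
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = path.name
                rows.append(row)
    return rows


# Self-contained demo with two tiny made-up files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "A01_s1.csv").write_text("ObjectNumber,Area\n1,120\n2,98\n")
    Path(tmp, "A01_s2.csv").write_text("ObjectNumber,Area\n1,143\n")
    measurements = load_csv_folder(tmp)
```

Values come back as strings, so numeric columns still need converting before any statistics are computed.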

Thank you for all your assistance until now!

With kind regards
Esben Svenningsen, Ph.D. Student
Aarhus University, Denmark
Chemical Biology - Thomas B. Poulsen


Hi Esben,

Not sure on the profiling question, I’ll tag in someone else for that on my end.

We have at different times used both actin and RNA for segmentation. If actin works better for you, there’s nothing wrong with doing so (though I’d be careful if you think there are likely any actin-perturbing things in your compound and/or gene set!). That being said, you can probably use the RNA channel with more tweaking of the pipeline.