"Toy" dataset for fluorescence microscopy measurements?

Hi all,

I’ve been assigned to teach an exploratory data analysis workshop in R. Since the participants are biologists working with fluorescence microscopy experiments, I want to use a dataset related to their field.

Is there any “toy dataset” available for this? I’m hoping for something like the iris dataset, but for fluorescence microscopy. Nothing too fancy, just something that you would routinely obtain from, say, a CellProfiler pipeline: a table of cells at different time points or in different wells, containing some morphological and fluorescence intensity measurements. Ideally, the dataset would include a grouping variable that separates cells well in a PCA or t-SNE plot, e.g. control vs treatment, wild-type vs mutant, different cell lines, etc.
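
To make that concrete, here is a minimal sketch in base R of the kind of table and plot I have in mind. Everything in it is simulated: the column names (`condition`, `well`, `area`, `eccentricity`, `mean_intensity`) and all of the numbers are placeholders I made up, not the output of a real CellProfiler pipeline.

```r
# Simulated stand-in for a per-cell measurement table.
# All column names and values are invented for illustration only.
set.seed(42)
n_per_group <- 100

simulate_cells <- function(condition, area_mean, intensity_mean) {
  data.frame(
    condition      = condition,
    well           = sample(c("A1", "A2", "A3"), n_per_group, replace = TRUE),
    area           = rnorm(n_per_group, mean = area_mean, sd = 20),
    eccentricity   = runif(n_per_group, min = 0.1, max = 0.9),
    mean_intensity = rnorm(n_per_group, mean = intensity_mean, sd = 0.05)
  )
}

cells <- rbind(
  simulate_cells("control",   area_mean = 300, intensity_mean = 0.30),
  simulate_cells("treatment", area_mean = 380, intensity_mean = 0.55)
)

# PCA on the numeric measurement columns, centred and scaled to unit variance
features <- cells[, c("area", "eccentricity", "mean_intensity")]
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# Plot the first two principal components, coloured by condition
plot(pca$x[, 1], pca$x[, 2],
     col = ifelse(cells$condition == "control", "grey40", "firebrick"),
     pch = 19, xlab = "PC1", ylab = "PC2")
legend("topright", legend = c("control", "treatment"),
       col = c("grey40", "firebrick"), pch = 19)
```

A real dataset with roughly this shape, where the grouping variable actually reflects biology, is what I am looking for.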

Many thanks in advance, cheers!

Hi @jfaccioni,

A couple of good places to look are the IDR (Image Data Resource) and the Broad Bioimage Benchmark Collection. The latter is great if you need lots of real or synthetic nuclei to segment, so it may not be the best fit for PCA-type analysis.

There are helpful instructions on how to download data from the IDR here.

Hope that helps!

PS. I really like HTML-based slides for data analysis, image analysis, and coding workshops. You might want to have a quick read here on the benefits if you’ve not considered them before.

If you’re going to use R, you can also check the cellHTS2 Bioconductor package. I believe it contains a small toy dataset. There are also toy datasets in the Bioconductor MSMB package, which accompanies the book ‘Modern Statistics for Modern Biology’ by Susan Holmes and Wolfgang Huber. Look for the section on imaging.

Thanks! At a glance this seems to be exactly what I need. Thanks as well for the heads-up for the presentation - I’ve been meaning to try reveal.js for a while now, and this might be the perfect opportunity for it :slight_smile:

Thanks for the resources! I’ll definitely take a look at them. The book itself also seems very interesting, so thanks for that as well :slight_smile:

You can find cytogenetic datasets here:
MFISH (8-bit TIFF)
QFISH (12-bit raw TIFF)