Looking for a good "best practices" resource as our University wades deeper into big data

I’ve been doing microscopy for a long time (confocal, SIM, WF, TEM, etc), but we are about to deploy a slide scanner with an image analysis pipeline and a massive file server. I’m doing my due diligence and trying to read up on data management. I am a little familiar with the Open Data Foundation, I have certainly heard of OMERO (and others), and I am somewhat familiar with the FAIR principles (they’re on my “to read” list).

Is there an article or other source that does a particularly outstanding job of outlining some best practices for naming, annotating, and storing image data? Thanks!

Hi Doug-

Sorry for not replying earlier-- we only just saw this.

It’s an interesting question. I don’t know of a specific, single guide that will cover everything you need. One useful rule is to decide who will use your data and make the “best practices” serve whoever will consume the data. That will go a long way toward deciding what you should do.

More general pointers:

Ten Simple Rules for Digital Data Storage

Ten Simple Rules for a Good Data Management Plan

There are now several public data repositories-- Systems Science of Biological Dynamics Database, Image Data Resource, BioImage Archive, The CELL Image Library-- all of which have metadata specifications that might be useful to review and assess for your own uses.

Hope that helps! Again, sorry for missing your question for so long.

Cheers,

Jason


I’d like to second Jason’s point on focusing on the consumers’ needs. The data should be made available ready to be processed. Two points that are often overlooked are the need for a suitable IT infrastructure (e.g. networking, tape archiving, compute resources) and management of the life cycle of the data (i.e. when and how do you get and remove data from the system?).
Another challenge is to collect the experimental metadata (e.g. what’s been imaged in which channel, what are the treatments/controls…) that are required to make the data FAIR. Ideally these are captured at acquisition time and permanently associated with the data. Tying the reporting of metadata to the instrument booking system (i.e. you can’t book your next session until the metadata for the previous ones have been provided) seems to be a good way of ensuring metadata capture. For light microscopy metadata collection, I try to get people to use this template: image_metadata.template.yml (855 Bytes), which is roughly aligned with the IDR metadata.
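
To make the booking-gate idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the required field names (`sample`, `treatment`, `channels`), the `Session` class and the `can_book` function are hypothetical, and they are not taken from the attached template or from any real booking system.

```python
from dataclasses import dataclass, field

# Hypothetical required fields; the actual template's keys may differ.
REQUIRED_FIELDS = ("sample", "treatment", "channels")


@dataclass
class Session:
    """One past instrument session and the metadata reported for it."""
    session_id: str
    metadata: dict = field(default_factory=dict)


def missing_fields(metadata):
    """Return the required fields that are absent or empty."""
    return [key for key in REQUIRED_FIELDS if not metadata.get(key)]


def can_book(previous_sessions):
    """Allow a new booking only if every previous session is fully described.

    Returns (allowed, problems), where problems maps a session id to the
    list of fields still missing for that session.
    """
    problems = {}
    for session in previous_sessions:
        missing = missing_fields(session.metadata)
        if missing:
            problems[session.session_id] = missing
    return (not problems, problems)


# Example: the second session lacks a treatment and has no channels listed.
sessions = [
    Session("session-1", {"sample": "HeLa", "treatment": "control",
                          "channels": {"DAPI": "nuclei"}}),
    Session("session-2", {"sample": "HeLa", "channels": {}}),
]
allowed, problems = can_book(sessions)
print(allowed)   # False
print(problems)  # {'session-2': ['treatment', 'channels']}
```

Returning the list of missing fields, rather than a bare yes/no, lets the booking page tell the user exactly what still needs to be filled in.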
