Create a minimal YAML metadata file using Bio-Formats

Dear .*,

To keep data at our institute organized, I need to write a simple Fiji plugin that creates (as well as possible) the YAML text file below from an image file.

I am, of course, thinking about using Bio-Formats to extract the information.

I do not actually have a concrete question, but I thought it would be good to post this here in case it overlaps with other activities. And, of course, I am very happy to hear any suggestions regarding this!

Best, Christian.

%YAML 1.2
---
# This file is in the YAML format. See http://yaml.org/spec/1.2/spec.html
# Check validity at https://codebeautify.org/yaml-validator
Image dimensions: # as XxYxZxCxT
Total data size: # specify unit in MB, GB or TB
Channels:
- channel 1:
  - entity:
  - label:
- channel 2:
  - entity:
  - label:
- channel 3:
  - entity:
  - label:
Time point: 
Position: [] # comma-separated list of plate-well-field
Imaging Method:
Species:
- name: #
- taxon: # ID from the NCBI taxonomy database
Developmental stage:
Cell line:
Genes:
- symbols: [] # comma-separated list in square brackets
- identifiers: [] # comma-separated list, same order as in symbols
- reference database:
Experiment description: > # Enter text after the > sign
Protocol description: > # Enter text after the > sign
Associated data files: [] # comma-separated list in square brackets
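
For what it's worth, the two fields that can be derived purely from pixel metadata (image dimensions and total data size) could be filled in roughly as sketched below. This is only a sketch in Python, not the planned Fiji plugin: the function names are hypothetical, the sizes and pixel type are assumed to have been read already (e.g. via Bio-Formats), and units are taken to be decimal megabytes, gigabytes or terabytes.

```python
# Sketch only: fill the template fields that follow from pixel metadata.
# All names are hypothetical; in a Fiji plugin the sizes and pixel type
# would come from Bio-Formats rather than being passed in by hand.

BYTES_PER_PIXEL = {
    "int8": 1, "uint8": 1,
    "int16": 2, "uint16": 2,
    "int32": 4, "uint32": 4,
    "float": 4, "double": 8,
}


def format_size(n_bytes):
    """Return a size string in decimal units (assumption: 1 MB = 10**6 bytes)."""
    for unit, factor in (("TB", 10**12), ("GB", 10**9)):
        if n_bytes >= factor:
            return f"{n_bytes / factor:.2f} {unit}"
    return f"{n_bytes / 10**6:.2f} MB"


def metadata_yaml(size_x, size_y, size_z, size_c, size_t, pixel_type):
    """Build the start of the YAML template from the five dimension sizes."""
    n_pixels = size_x * size_y * size_z * size_c * size_t
    n_bytes = n_pixels * BYTES_PER_PIXEL[pixel_type]
    lines = [
        "%YAML 1.2",
        "---",
        f"Image dimensions: {size_x}x{size_y}x{size_z}x{size_c}x{size_t} # as XxYxZxCxT",
        f"Total data size: {format_size(n_bytes)}",
        "Channels:",
    ]
    # One empty entry per channel; entity and label are left for a human to fill in.
    for channel in range(1, size_c + 1):
        lines += [f"- channel {channel}:", "  - entity:", "  - label:"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(metadata_yaml(512, 512, 10, 3, 1, "uint16"))
```

The remaining fields (species, genes, protocol description and so on) are not pixel metadata and would still have to be entered by hand.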

Are you talking regex or rather wildcards here? :wink:


I’m curious: what would be the benefit of your custom YAML metadata format over e.g. OME-XML? Do you think OME is too complicated for your use case? Are you missing the possibility of adding custom experiment metadata?

Do you envision your YAML format to be parsed by solutions like an OMERO server? Or do you plan creating your custom user interface, or targeting HTM Explorer?

This was not my choice but that of a colleague of mine. I think the answers are:

  • Use case: upload of data to public repositories (as required for publication)
  • YAML: easier for humans to modify and read.

That’s interesting. Which public repositories are targeted? Do you know if they have specific guidelines regarding the upload format, or do they impose that specific YAML format?

One of the use cases concerns data deposition at the Image Data Resource (IDR). They don’t require the data in YAML format; they have their own template files. More generally, the problem on our side is to get people to collect metadata so that collaborators can deal with the data efficiently. The requirement is for a format that both humans and computers can read and write, hence the selection of YAML.

Hi all. A few related but vacation-delayed thoughts on this:

  • The current version of IDR metadata uses TSV files (i.e. spreadsheets) since it’s assumed to be the most user friendly. It’s likely that a future version of the metadata would be in YAML and/or JSON, though I would assume a tabular representation would still be worthwhile.

  • This strategy has turned out to be fairly close to that of the Human Cell Atlas, for which we are currently looking to integrate Bio-Formats. They are taking submissions either in spreadsheets or in JSON (conforming to a JSON Schema). This metadata, as it develops, will likely influence the IDR metadata. (See https://github.com/HumanCellAtlas/metadata-schema/issues/368)

  • That, however, is metadata that is not currently in the OME model. The current best approximation of Christian’s example would have the biological aspects stored as annotations.

  • The idea of having a serialization of OME metadata in YAML/JSON has been toyed with several times. It would be interesting to hear of concrete drivers or see examples of versions that others have produced.

All the best,
~Josh
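
To make the last point a bit more concrete, here is a rough sketch of what a JSON serialization of a tiny part of the OME metadata could look like. It is not an existing OME or Bio-Formats feature: the function name and the output layout (an "Images" list with a "PixelsID" key) are invented for illustration, and the OME-XML fragment is hand-written and not schema-validated. Only the Python standard library is used.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the 2016-06 OME schema.
OME_NS = "http://www.openmicroscopy.org/Schemas/OME/2016-06"

# Hand-written fragment for illustration only (not validated against the schema).
EXAMPLE_OME_XML = f"""<OME xmlns="{OME_NS}">
  <Image ID="Image:0" Name="example">
    <Pixels ID="Pixels:0" DimensionOrder="XYZCT" Type="uint16"
            SizeX="512" SizeY="512" SizeZ="10" SizeC="3" SizeT="1"/>
  </Image>
</OME>"""


def pixels_to_dict(ome_xml):
    """Collect the Pixels attributes of every Image into a plain dict.

    The output layout is a hypothetical one chosen for this sketch.
    """
    root = ET.fromstring(ome_xml)
    images = []
    for image in root.findall(f"{{{OME_NS}}}Image"):
        pixels = image.find(f"{{{OME_NS}}}Pixels")
        if pixels is None:
            continue  # skip images without a Pixels element
        entry = {"ID": image.get("ID"), "Name": image.get("Name")}
        for key, value in pixels.attrib.items():
            # Avoid clashing with the Image ID; store numeric sizes as integers.
            out_key = "PixelsID" if key == "ID" else key
            entry[out_key] = int(value) if value.isdigit() else value
        images.append(entry)
    return {"Images": images}


if __name__ == "__main__":
    print(json.dumps(pixels_to_dict(EXAMPLE_OME_XML), indent=2))
```

The same dict could be dumped as YAML with a YAML library instead of `json`; the biological fields from Christian's template would still need a home, for example as annotations.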