Harmonization of image metadata for different file formats. omero.mde?

Dear omero-community,

I am quite new to OMERO and am currently looking into harmonizing image metadata across different file formats (dcm, czi, tif, ometif).

Bio-Formats apparently recognizes metadata keys really well and stores most of them in the Acquisition tab under “original metadata”. However, the tags and values in the original metadata depend on the imported file format. Is there a way to harmonize the metadata for the aforementioned file formats in an easy, automated way (without manually adjusting every field)? Can this be done using the omero.mde XML templates? If I understand it correctly, these templates describe what is shown in the “General” tab, but not really how the original metadata is parsed into standard key-value pairs in the General tab.

Lastly, I looked a bit into the Python API. Can metadata be queried in a targeted fashion from both the original metadata and the OME metadata in the General tab?

Thanks in advance for your efforts guys.

Best regards & have a nice day.
Ali

Hi @AliDurmaz,

OMERO.mde uses Bio-Formats to parse the original metadata contained in the file format into the OME Data Model and offers it as an editable template. Additional tags and values can be added to this template (either from the OME Data Model or self-defined), and existing ones can be hidden. By saving a specific template as a setup, you can offer different templates for different data. The automatic mapping of metadata from the file is limited to the fields of the OME Data Model, but if you know your generating systems, you can specify their configurations as predefinitions for the respective setup, and these will then be mapped to your harmonised tags. In this way, you could create a setup with harmonised data for each of your file formats. This does mean that you have to adjust these fields manually once per data format, but you can then apply the setup to all data of that format again and again. All entries in the MDE are then assigned to the respective data as key-value pairs (K/V) during import.
However, OMERO.mde does not affect the data shown under the “General” and “Acquisition” tabs. This is still the data that Bio-Formats originally extracts from the data format and maps into the OME Data Model.

I hope this clarifies part of your question.

All the best,
Susanne

Dear Susanne @sukunis

I really appreciate your efforts. Alright, so the XML templates used in MDE can facilitate parsing of specific original metadata into the OME Data Model. I assume this is an OME-XML file that is created additionally in the course of the image “upload”. Probably one would need to link the tags in the original metadata that Bio-Formats recognizes with specific keys in the OME-XML format by using such an XML template, right? Would one need to enter the original metadata tags recognized by Bio-Formats e.g. here?:

If I am not missing something, “Customize user input form of OMERO.mde” (OMERO guide 0.2.0 documentation, omero-guides.readthedocs.io) shows how values are added to specific OME Data Model (OME-XML) tags, but not really how a mapping between the original metadata tags and the OME Data Model can be done using these templates. Do you by any chance have an example XML template where something along these lines is done?

Thanks again!

Best regards & have a nice day
Ali

Hi Ali,

To complete Susanne’s answer from the Bio-Formats side: the metadata processing tries to capture as much information as possible from the proprietary format into a table of key/value pairs (the original metadata), and a subset of this information is further transformed into OME metadata. Broadly, there are two main limitations to a faithful and complete metadata translation:

  • on the writing front, the existence of equivalent concepts in the target representation, in this case the OME Data Model
  • on the reading front, the metadata parsing itself, in the worst case across multiple variants of opaque proprietary file formats

A few topics have been started on this forum to discuss similar metadata questions across file formats, focusing on certain domains; see “Identifying microscopy instruments” and “Improve Bio-Formats Image Position Metadata”. Additionally, the aicsimageio library (GitHub - AllenCellModeling/aicsimageio: Delayed Parallel Image Reading for Microscopy Images in Python) is working extensively on metadata translations between its supported formats, and its developers might have valuable thoughts to add to this conversation.

Also adding #industry as a tag

Regarding the last part of your question, about the Python API:
It’s not possible to query the original metadata (although you can access this data for a given image).
It is possible to query other ‘ome metadata’. Do you have an example of the types of metadata (Key-Value pairs?) and queries you want to make?

Regards,
Will
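
A minimal sketch of what "accessing this data for a given image" could look like when a server-side query is not available: load the original metadata of one image and filter it on the client. It assumes `image` is an ImageWrapper obtained from an open BlitzGateway connection and that `loadOriginalMetadata()` returns a 3-tuple whose last two items are lists of (key, value) pairs (global metadata, then series metadata). The helper name `search_original_metadata` is hypothetical.

```python
def search_original_metadata(image, needle):
    """Return (scope, key, value) tuples whose key contains `needle`.

    The match is case-insensitive. `image` is assumed to be an OMERO
    ImageWrapper, e.g. conn.getObject("Image", image_id).
    """
    # Assumption: the tuple is (file annotation or None, global list, series list).
    _, global_md, series_md = image.loadOriginalMetadata()
    needle = needle.lower()
    hits = []
    for scope, pairs in (("global", global_md), ("series", series_md)):
        for key, value in pairs:
            if needle in key.lower():
                hits.append((scope, key, value))
    return hits


# Usage, assuming an open connection `conn` and an image ID `image_id`:
# image = conn.getObject("Image", image_id)
# for scope, key, value in search_original_metadata(image, "objective"):
#     print(scope, key, value)
```

This only inspects one image at a time, so searching many images this way means one round trip per image.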

Hi Will,

thank you for your feedback. Also thanks for clarifying, @s.besson.

Regarding the former, accessing the original metadata in the table can be done like this, according to “Open Microscopy Environment • View topic - Extracting OME XML metadata” from 2011:

# conn is an open BlitzGateway connection, num is the image ID
img = conn.getObject('Image', num)
meta = img.loadOriginalMetadata()

This is also interesting, but querying would be the objective.

As an example query for the pixel type (get all images that have pixel type == uint8):

[screenshot]

According to the page mentioned by Sébastien, this pixel type belongs to the core metadata. Can the other OME metadata be queried accordingly? Can you provide a minimal example of how this query would work?

Your answer also implies that harmonizing to OME metadata is indeed necessary, if one wants to do a programmatic query of the images.

Thanks in advance.

Best regards
Ali

Hi @AliDurmaz,

the XML file of OMERO.mde is used to configure the templates (input forms) and setups, as well as to specify predefined values, but it does not affect how the original metadata is read from the file into the OME Data Model. OMERO.mde only visualises the result of Bio-Formats at this point. All objects preceded by “OME:” are the standard objects of the OME Data Model and are read from the original data in the way implemented by Bio-Formats.
Since no data is loaded for the OME:Objective in your example, Bio-Formats could not assign any of the original metadata of the image to this objective; it cannot clearly identify which of the original metadata corresponds to the model of the objective, etc. However, the name of the image suggests that the image is a conversion from CZI format. Have you loaded the original CZI image into OMERO.mde? Maybe the original metadata got lost during the conversion?

All the best,
Susanne

Hi Susanne, @sukunis

in our case we often have different imaging settings, which is why we need something like a mapping between original metadata tags and OME metadata keys. Assigning default values in such an XML template would not be appropriate for our large variety of imaging settings.

Are there any plans to extend OMERO.mde to support such a mapping? Bio-Formats already recognizes many metadata tags and values in the mentioned file formats (according to the original metadata tab). Enabling users to define such mappings might be a good thing, as the mappings could indirectly help to improve Bio-Formats and extend the subset that is mapped/transformed to OME metadata correctly.

Some kind of user defined mapping:

Key ---------------------- Value
MyCustomNode ----- Value(Aperture Align X)

could be interesting in my opinion and would also increase the flexibility with respect to other modalities (such as scanning electron microscopy). Are such custom nodes considered part of the ‘ome metadata’, and can they be queried as well? @will-moore

You are right, the screenshot shows a file which I converted from czi to dicom and the metadata parsing was not complete. When I try to load the corresponding .czi image I can see entries in “Model” and many more keys. I am aware of this, sorry, I should have chosen another image previously to not distract from the main point.

Thanks a lot for the (very insightful) discussion so far.

Best regards
Ali

Hi @AliDurmaz,

sounds like a good idea, I’ll put it on my list for OMERO.mde.

All the best,
Susanne

Hi Ali,

Here is an example of how to query for Images by pixelType and by Key-Value pairs (Map annotations).

These examples use HQL (Hibernate Query Language) which is similar to SQL.

To create your own metadata queries you’ll need to have an idea of the OMERO data model.
See “Working with OMERO” (OMERO 5.6.3 documentation) for getting started. Browse all the OMERO model objects at “OMERO API Index - omero::model”.

I tend to use the various query examples in the BlitzGateway source code https://github.com/ome/omero-py/blob/94b4be90f85f8ccaef02ceebc75320cf1b70db38/src/omero/gateway/__init__.py when creating new queries.
The graph traversal and loading can be a bit tricky, but feel free to ask for help with any queries if needed.

Regards,
Will.
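
A minimal sketch of what such a pixel-type query could look like with HQL and the Python API (this is not the original example, which is not reproduced here). It assumes `conn` is an open BlitzGateway connection and that the model path from Image to its pixel type is `Image.pixels` then `Pixels.pixelsType.value`; that join path is an assumption to check against the model documentation. The function name and its parameters are hypothetical.

```python
PIXEL_TYPE_QUERY = """
    select image.id, image.name
    from Image image
    join image.pixels pixels
    join pixels.pixelsType ptype
    where ptype.value = :ptype
"""


def find_images_by_pixel_type(conn, pixel_type, offset=0, limit=100):
    """Return (id, name) tuples of images with the given pixel type, e.g. 'uint8'."""
    # Imported here so the module can be loaded without omero-py installed.
    from omero.sys import ParametersI

    params = ParametersI()
    params.addString("ptype", pixel_type)
    params.page(offset, limit)
    # A projection query returns rows of wrapped values; .val unwraps each one.
    rows = conn.getQueryService().projection(
        PIXEL_TYPE_QUERY, params, conn.SERVICE_OPTS)
    return [(row[0].val, row[1].val) for row in rows]


# Usage, assuming an open connection `conn`:
# for image_id, name in find_images_by_pixel_type(conn, "uint8"):
#     print("Img ID: %s Name: %s" % (image_id, name))
```

Passing the pixel type as a named parameter rather than formatting it into the query string keeps the query reusable for other pixel types.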

Hi Will and Susanne,

the query example for pixel type worked seamlessly. Thanks, Will!
To test the second type of query that you provided (key-value pairs), I tried the following:

  1. Adding key and value pairs manually inside omero.insight
  2. Adding a key and value with MDE

When I queried the images with your key-value query, option 1 worked while option 2 did not.

Here is how the key-value pair that did not work was shown in OMERO.insight:


import omero.sys  # provides ParametersI; conn is an open BlitzGateway connection

params = omero.sys.ParametersI()
params.addString('key', '[OME-Model]{0}#[OME:Image]{0}#[MyCustomObject]{0} | ExampleKey_1')
params.addString('value', 'bla')
#params.addString('key', 'test')
#params.addString('value', 'LOM')
offset = 0
limit = 100
params.page(offset, limit)

# Here we use 'projection' query to load specific values rather than whole objects
# This can be more performant, but here I'm using it based on Map-Annotation
# query examples in omero-mapr
query = """
    select image.id, image.name from 
    ImageAnnotationLink ial
    join ial.child a
    join a.mapValue mv
    join ial.parent image
    where mv.name = :key and mv.value = :value"""
result = conn.getQueryService().projection(query, params, conn.SERVICE_OPTS)
for row in result:
    print("Img ID: %s Name: %s" % (row[0].val, row[1].val))

Are there any thoughts of extending the queries to the original metadata?

@sukunis: Is there a way to change the key names to be something other than “ExampleKey_1”?

Apart from that, I saw that the command line importer can create custom annotations as well:
Import data using the Command Line Interface (CLI) — OMERO guide 0.2.0 documentation (omero-guides.readthedocs.io)

Thanks again.

Best regards
Ali

Maybe you can compare the values of the Map Annotations between manually-added and added with MDE, to see what the differences are:

image = conn.getObject("Image", image_id)
for ann in image.listAnnotations():
    print(ann.getValue())

Will
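
When comparing annotations like this, differences that are invisible in a plain printout or in the UI (for example stray whitespace) are easier to spot with `repr()`. Below is a minimal sketch, assuming the value of a Map Annotation is a list of (key, value) string pairs as returned by `ann.getValue()`; the helper name `find_untrimmed_pairs` and the sample pairs are hypothetical.

```python
def find_untrimmed_pairs(pairs):
    """Return the (key, value) pairs whose key or value has leading or
    trailing whitespace."""
    return [
        (key, value)
        for key, value in pairs
        if key != key.strip() or value != value.strip()
    ]


# Made-up sample; with OMERO this would be ann.getValue() of a Map Annotation.
pairs = [("test", "LOM"), ("ExampleKey_1", "bla ")]
for key, value in find_untrimmed_pairs(pairs):
    # repr() shows the quotes, so a trailing space becomes visible
    print(repr(key), repr(value))  # prints 'ExampleKey_1' 'bla '
```

Annotations other than Map Annotations return other value types from `getValue()`, so they would need to be skipped before calling the helper.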

Hi Ali,

Sorry for the previous mistaken post of mine, suggesting double quoting and extra variable as a solution.
I have deleted it as this was a red herring, the problem is somewhere else, see below please.

We have investigated once more with Will Moore and can report:

In the setup you are using, the MDE adds an extra trailing whitespace to the value.
This means you have to query for "bla " (with a trailing space), not "bla".
The extra whitespace is not visible in OMERO.insight, which is why you missed it.
The code that works for us with MDE is therefore:

...
	params.addString('key', '[OME-Model]{0}#[OME:Image]{0}#[MyCustomObject]{0} | ExampleKey_1')
	params.addString('value', 'bla ')
...

Possibly @sukunis might be able to comment on the whitespace addition by MDE?

Hope this helps.

Thank you

Petr and Will
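
One possible way to make the query robust against such whitespace, sketched under assumptions: query by key only, project the stored value as well, and compare the stripped strings on the client. It reuses the join structure of the key-value query in this thread and assumes `conn` is an open BlitzGateway connection; the function name and parameters are hypothetical, and this variant has not been tested against a server.

```python
KEY_ONLY_QUERY = """
    select image.id, image.name, mv.value
    from ImageAnnotationLink ial
    join ial.child a
    join a.mapValue mv
    join ial.parent image
    where mv.name = :key
"""


def find_images_by_key_value(conn, key, value, offset=0, limit=100):
    """Return (id, name) tuples of images that have `key` set to `value`,
    ignoring leading and trailing whitespace in the stored value."""
    # Imported here so the module can be loaded without omero-py installed.
    from omero.sys import ParametersI

    params = ParametersI()
    params.addString("key", key)
    params.page(offset, limit)
    rows = conn.getQueryService().projection(
        KEY_ONLY_QUERY, params, conn.SERVICE_OPTS)
    wanted = value.strip()
    matches = []
    for image_id, image_name, stored in rows:
        # Skip rows without a value; compare the rest after stripping
        if stored is not None and stored.val.strip() == wanted:
            matches.append((image_id.val, image_name.val))
    return matches
```

Note that paging is applied on the server before the client-side comparison, so a page can contain fewer matches than `limit`.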

Dear Petr and Will,

adding a whitespace at the end worked for me as well.

Thanks a lot!

If there is a way to do a user-defined mapping between some original metadata elements and custom MDE elements (or key-value pairs in general), I would consider both original questions as resolved.

I will look a bit into the Command line importer and the JSON API to achieve the same.

Best regards
Ali

Dear Ali

That is great. Thank you for that.
Please note that the “Import Image data” page of the OMERO guide 0.2.0 documentation which you mentioned (the CLI importer) basically just consumes CSV files in which the metadata are predefined. Unlike in MDE, there is no provision for reading and exposing original metadata from the imported files (except, of course, the implicit Bio-Formats usage at import, which will not give you any output at the import stage).

You also asked

Are there any thoughts of extending the queries to the original metadata?

Once the data is in OMERO (after import), there is no way to query the original Bio-Formats metadata, but part of that metadata was written into the database during import and can be queried. If you want to know how and expand on this, please just let us know; we can work on some queries (using HQL).

Is there anything more we can help you with ?

All the best

Petr and Will

Dear Petr, dear Will,

thank you again.

Would the following workflow be feasible?

  1. Use the command line importer to import some images
  2. Retrieve their original metadata
    img = conn.getObject('Image', num)
    meta = img.loadOriginalMetadata()
  3. Programmatically create a yml and a csv file for annotation that link individual elements from the original metadata to user-defined keys.
  4. Do the post-import steps for annotation

This might not be the most efficient procedure but could be (if feasible) a workaround until the mapping is implemented in MDE.

What do you think?

Best regards
Ali

Hi,

You can go directly from original metadata to key-value pairs, allowing you to query the images for these values:

import omero.gateway  # conn is an open BlitzGateway connection, iid an image ID

# Original-metadata keys to copy into key-value pairs
keys = [
    "Wavelength 1 mean intensity",
    "Z axis angle",
    "Extended header Z9 W2 T0:exWavelen"
]
image = conn.getObject("Image", iid)
# Returns a 3-tuple: (file annotation or None, global metadata, series metadata)
_, global_metadata, series_metadata = image.loadOriginalMetadata()
key_value_data = []
for metadata in (global_metadata, series_metadata):
    for key, value in metadata:
        if key in keys:
            key_value_data.append([key, str(value)])
if len(key_value_data) > 0:
    # Save the selected pairs as a Map Annotation and link it to the image
    map_ann = omero.gateway.MapAnnotationWrapper(conn)
    namespace = "custom.from.original_metadata"
    map_ann.setNs(namespace)
    map_ann.setValue(key_value_data)
    map_ann.save()
    image.linkAnnotation(map_ann)

Will
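
The user-defined mapping discussed in this thread could be layered on top of this approach by renaming the selected keys before they are stored. Below is a minimal sketch in plain Python, assuming the original metadata is available as (key, value) pairs; the function name `harmonize`, the dictionary `KEY_MAPPING` and the harmonized key names are all hypothetical.

```python
# Hypothetical mapping: format-specific original-metadata key -> harmonized key.
# The left-hand keys come from the example above; the right-hand names are made up.
KEY_MAPPING = {
    "Wavelength 1 mean intensity": "MeanIntensity_Ch1",
    "Z axis angle": "ZAxisAngle",
    "Extended header Z9 W2 T0:exWavelen": "ExcitationWavelength",
}


def harmonize(original_pairs, mapping):
    """Rename selected original-metadata keys to harmonized keys.

    original_pairs: iterable of (key, value) pairs, e.g. one of the lists
                    returned by image.loadOriginalMetadata().
    mapping:        dict of {original key: harmonized key}.
    Returns a list of [harmonized_key, str(value)] pairs; keys that are not
    in the mapping are dropped.
    """
    return [
        [mapping[key], str(value)]
        for key, value in original_pairs
        if key in mapping
    ]


# Example with made-up values; the result has the shape expected by
# map_ann.setValue(...).
pairs = [("Z axis angle", 0.0), ("Unrelated key", "x")]
print(harmonize(pairs, KEY_MAPPING))  # prints [['ZAxisAngle', '0.0']]
```

Keeping one such dictionary per file format (czi, dcm, tif, ...) would let images from different formats end up with the same key names, which is what makes a single key-value query work across them.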