Parse output from omero import

Hi all,
I’m looking for a good automated way to parse the output of the omero import CLI tool. For a new project we’re setting up an OMERO server for the images, but all ingress and metadata management is handled by a front-end DMS that also manages many other data types, such as genomics data and clinical attribute data. I would like the DMS to accept all the data, extract what it needs, and then initiate the import into OMERO for the image data. The DMS needs to keep track of the image IDs, dataset IDs, etc. in OMERO, so I would like to parse the output of omero import to populate the DMS record that tracks each data entity. For example, for a single image file that contains several images I get:

...
2021-05-24 11:07:19,811 81632      [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_END
2021-05-24 11:07:19,974 81795      [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_STARTED Logfile: 287
2021-05-24 11:07:20,157 81978      [l.Client-1] INFO   ormats.importer.cli.LoggingImportMonitor - METADATA_IMPORTED Step: 1 of 5  Logfile: 287
2021-05-24 11:08:50,508 172329     [l.Client-2] INFO   ormats.importer.cli.LoggingImportMonitor - PIXELDATA_PROCESSED Step: 2 of 5  Logfile: 287
2021-05-24 11:08:57,871 179692     [l.Client-3] INFO   ormats.importer.cli.LoggingImportMonitor - THUMBNAILS_GENERATED Step: 3 of 5  Logfile: 287
2021-05-24 11:08:57,885 179706     [l.Client-2] INFO   ormats.importer.cli.LoggingImportMonitor - METADATA_PROCESSED Step: 4 of 5  Logfile: 287
2021-05-24 11:08:57,896 179717     [l.Client-3] INFO   ormats.importer.cli.LoggingImportMonitor - OBJECTS_RETURNED Step: 5 of 5  Logfile: 287
2021-05-24 11:08:58,031 179852     [l.Client-2] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_DONE Imported file: /data/share/sudard/LuCa-7color_Scan1.ome.tiff
Image:25,26,27,28
Other imported objects:
Fileset:16

==> Summary
1 file uploaded, 1 fileset created, 4 images imported, 0 errors in 0:02:52.264

Of course, for more complex data sets, there can be a lot more “OMERO” structure. Has anyone cobbled together a reliable parser for this?
Thanks,
Damir

Hi Damir,

By default, you can capture stdout, which lists the imported objects in obj-notation:

$ DEFAULT=$(omero import a.fake)
$ echo "$DEFAULT"
Image:9404

If you want the other object types, you’ll need to grep for ^Plate, etc.

DropBox uses this to know what it imported: see fsDropBoxMonitorClient.py in ome/omero-dropbox on GitHub (commit 530e3440ea7cb40982977c055b2653645b39b1d3).
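
For a DMS that needs more than a shell one-liner, the same grep idea can be sketched in Python. This is only a minimal sketch, not the DropBox code: the function name parse_import_stdout and the regular expression are assumptions of mine, and the sample text is copied from the output in the first post. It assumes every object line has the form Type:id,id,... and skips everything else (log lines, the "Other imported objects:" header, the summary).

```python
import re
from collections import defaultdict

# Matches object lines such as "Image:25,26,27,28" or "Fileset:16".
# The pattern is an assumption based on the output shown in this thread.
OBJ_LINE = re.compile(r"^([A-Za-z]+):(\d+(?:,\d+)*)$")


def parse_import_stdout(text):
    """Collect the imported object IDs, grouped by object type."""
    ids = defaultdict(list)
    for line in text.splitlines():
        match = OBJ_LINE.match(line.strip())
        if match:
            obj_type, id_list = match.groups()
            ids[obj_type].extend(int(i) for i in id_list.split(","))
    return dict(ids)


sample = """Image:25,26,27,28
Other imported objects:
Fileset:16
"""
print(parse_import_stdout(sample))
# {'Image': [25, 26, 27, 28], 'Fileset': [16]}
```

Which of these lines end up on stdout and which on stderr may depend on how the import is invoked, so it is probably safest to check what your capture actually contains before relying on it.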

Alternatively, you can pass the --output argument to get YAML:

$ YAML=$(omero import a.fake --output=yaml)
$ echo "$YAML"
---
- Fileset: 1084
  Image: [9405]
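
The YAML form can then be loaded directly from a script. A minimal sketch, assuming the third-party PyYAML package is installed and that the output is a list of mappings with Fileset and Image keys, as in the sample above. The function names are hypothetical, and run_import additionally assumes an active OMERO login session (it is defined here but not called).

```python
import subprocess

import yaml  # PyYAML (third-party): pip install pyyaml


def run_import(path):
    """Run `omero import` with YAML output and return its stdout.

    Assumes the omero CLI is on PATH and a login session already exists.
    """
    result = subprocess.run(
        ["omero", "import", path, "--output=yaml"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def images_by_fileset(yaml_text):
    """Map each Fileset ID to the list of Image IDs imported with it."""
    records = yaml.safe_load(yaml_text) or []
    return {rec["Fileset"]: list(rec.get("Image", [])) for rec in records}


sample = """---
- Fileset: 1084
  Image: [9405]
"""
print(images_by_fileset(sample))  # {1084: [9405]}
```

Keying on the Fileset ID keeps the images from one upload together, which seems like a natural fit for a record per data entity; other object types would need their own keys.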

~Josh

Thanks Josh. The yaml output looks very promising. Plenty of tools exist to parse that.
Damir

Ooh, the --output=yaml option had somehow escaped my attention; that looks great indeed!

Many many thanks for bringing this up @dsudar and for this helpful answer @joshmoore !