Transfer a project / dataset between OMERO instances

Hi,

Here is a hypothetical problem. Suppose there are two OMERO servers:

  • A private one, restricted to authenticated users, where data is actively worked on (possibly behind an institutional firewall)
  • A public one which is largely read-only, like the IDR (maybe at another institution)

Once a project on the private server has been cleaned up, annotated and is fit for release, how could it be transferred to the public server?

Cheers,

Chris

Hi Chris,

There’s a bunch of ways to approach this.
One is to use a script to connect to the 2 servers, traverse a Project -> Dataset -> Images and Annotations on the private server and duplicate the whole graph onto the public server.
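
A minimal sketch of the traversal half of such a script, assuming the omero-py BlitzGateway client is installed. The helper names `walk_project` and `list_private_project` are hypothetical, and the host, credentials and Project ID are placeholders you would supply. It only lists what would need copying; it does not copy anything.

```python
def walk_project(project):
    """Yield (dataset_name, image_id, image_name) for every image in a Project.

    Works with BlitzGateway wrappers, which expose listChildren(),
    getId() and getName().
    """
    for dataset in project.listChildren():
        for image in dataset.listChildren():
            yield dataset.getName(), image.getId(), image.getName()


def list_private_project(host, user, password, project_id):
    """Connect to one server and list the images under a Project."""
    # Imported here so walk_project() can be used without omero-py installed.
    from omero.gateway import BlitzGateway

    conn = BlitzGateway(user, password, host=host, port=4064)
    if not conn.connect():
        raise RuntimeError("Could not connect to %s" % host)
    try:
        project = conn.getObject("Project", project_id)
        if project is None:
            raise ValueError("Project %s not found" % project_id)
        return list(walk_project(project))
    finally:
        conn.close()
```

Running the same listing against the public server gives the second half of the picture, so the two lists can be compared before anything is written.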

For example, for setting up training data, we use https://github.com/ome/training-scripts/blob/master/maintenance/scripts/idr_copy_plate.py to copy a plate from the IDR server onto a training server. This creates new images with the same pixel data by copying the data via a numpy array (and only takes the first time-point of timelapse images). You could also do this by downloading the images and re-importing if you want the same original files.

For annotations you could do the same: This script traverses P>D>I on the origin server (IDR) and copies Map Annotations to the corresponding images on the target server (matching images by name). https://github.com/ome/training-scripts/blob/master/maintenance/scripts/idr_get_map_annotations.py
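
As a rough illustration of the matching-by-name idea (not the code from the linked script), the sketch below pairs images on (Dataset name, Image name) and reports names that are ambiguous. `match_images_by_name` and `copy_map_annotations` are hypothetical helpers; the second one assumes the BlitzGateway `MapAnnotationWrapper` API and has not been run against a real server.

```python
from collections import defaultdict


def match_images_by_name(source_images, target_images):
    """Pair source and target image IDs that share (dataset, name).

    Each argument is an iterable of (dataset_name, image_id, image_name).
    Returns (pairs, ambiguous): pairs maps source ID -> target ID,
    ambiguous lists keys seen more than once on either server.
    Names that are missing on the target are simply left unpaired.
    """
    def index(images):
        by_key = defaultdict(list)
        for dataset_name, image_id, image_name in images:
            by_key[(dataset_name, image_name)].append(image_id)
        return by_key

    source, target = index(source_images), index(target_images)
    pairs, ambiguous = {}, []
    for key, source_ids in source.items():
        target_ids = target.get(key, [])
        if len(source_ids) == 1 and len(target_ids) == 1:
            pairs[source_ids[0]] = target_ids[0]
        elif len(source_ids) > 1 or len(target_ids) > 1:
            ambiguous.append(key)
    return pairs, sorted(ambiguous)


def copy_map_annotations(source_image, target_image, target_conn):
    """Re-create the Key-Value pairs of one image on the target server."""
    # Assumption: omero-py is installed; imported lazily so the matching
    # function above stays usable without it.
    from omero.gateway import MapAnnotationWrapper

    for ann in source_image.listAnnotations():
        if isinstance(ann, MapAnnotationWrapper):
            new_ann = MapAnnotationWrapper(target_conn)
            new_ann.setNs(ann.getNs())
            new_ann.setValue(ann.getValue())
            new_ann.save()
            target_image.linkAnnotation(new_ann)
```

Anything that ends up in `ambiguous` would need to be resolved by hand, or by some extra key, before annotations are copied.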

I seem to remember one user experimenting with a server-side OMERO.script on the ‘private’ server which any user could run to export their data to another server, but I don’t know where this got to.

A different approach might be to export with https://github.com/ome/omero-downloader and re-import. The downloader includes some annotations in the exported OME-XML which should be re-created in the public server when re-imported, but I haven’t tried this.

If you want to try a script, we’d be happy to help as it’s certainly something that has been asked about before.

Cheers,

Will

Hi Will,

Thanks for this.

We will need the original files, so it sounds like they will need to be downloaded from one server and then uploaded to the other.

Given that all the annotations need to be transferred too, it sounds like a script to transfer a project would cover the bulk export that is discussed in this thread.

Do you know offhand whether OMERO.downloader brings down the tags, key-values and tables?

Cheers,

Chris

That thread is more about giving individual users the ability to export a Project (not re-import and not using a script since it’s not very user-friendly).

I assume that OMERO.downloader downloads Tags and Key-Values (and I’m pretty sure it doesn’t support tables).
And they should be re-created when the OME-TIFF is re-imported, but I haven’t tried this.

You could also teach a script to download the original files and re-import them. I don’t know which approach would be easiest.

Transferring tables would be a bit more work. Probably the easiest would be to export as CSV with the Image Names (instead of IDs) and then upload the CSV to the public server and run the populate_metadata script there. If you used a CSV to create the table in this way originally then you don’t even need the export step!
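
A small sketch of the "export as CSV with Image Names" step, using only the standard library. The row layout (dicts with an `image_id` key) and the function name `table_rows_to_csv` are assumptions for illustration, and the `Image Name` header is my assumption about what populate_metadata expects, so check it against the omero-metadata documentation before relying on it.

```python
import csv
import io


def table_rows_to_csv(rows, id_to_name, columns):
    """Return CSV text for table rows, keyed by image name instead of ID.

    rows: iterable of dicts that each contain an "image_id" key
          (hypothetical layout) plus the columns named in `columns`.
    id_to_name: dict mapping image ID -> image name on the private server.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Image Name"] + list(columns))
    for row in rows:
        name = id_to_name[row["image_id"]]
        writer.writerow([name] + [row[c] for c in columns])
    return out.getvalue()
```

The resulting text could be saved to a file, attached to the target Dataset on the public server and used as input for populate_metadata there.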

Cheers,

Will.

Hi,

I was thinking that one way to do a transfer would be to export then import the components so you’d have the export step done.

Using image names as a key is a bit risky, as OMERO lets you have images with the same name, but file systems and tables won’t like it. Maybe appending the ImageID to the filename is enough?
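
One possible way to do the ImageID suffix, as a sketch only; `unique_export_name` and the `_id<N>` pattern are made up for illustration and are not an OMERO convention.

```python
import os


def unique_export_name(image_name, image_id):
    """Return a file name made unique by inserting the OMERO Image ID.

    The ID goes before the extension so the file type stays recognisable.
    """
    # Image names may contain path separators; keep only the last component.
    base = os.path.basename(image_name.replace("\\", "/"))
    stem, ext = os.path.splitext(base)
    return "%s_id%d%s" % (stem, image_id, ext)
```

Since Image IDs are unique within one server, two images that share a name still get different file names, and the ID can be parsed back out of the name when matching annotations later.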

Another complication is original files that contain multiple locations and get split out into series.
Each member of the series has a unique ImageID in OMERO, so when the annotations are exported they will need to be mapped onto the original file and the series member.
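
A sketch of what that mapping could look like, assuming each exported record already carries the original file name and the series index; the record layout and the name `key_annotations_by_series` are hypothetical. It also assumes the series order is the same when the file is re-imported, which I have not verified.

```python
def key_annotations_by_series(records):
    """Group key-value pairs under an (original_file, series_index) key.

    records: iterable of dicts with "original_file", "series" and
    "key_values" entries (a hypothetical export layout).
    """
    keyed = {}
    for rec in records:
        key = (rec["original_file"], rec["series"])
        keyed.setdefault(key, []).extend(rec["key_values"])
    return keyed
```

After re-import, each new image would be looked up by the same (file, series) pair rather than by its old ImageID, which does not survive the transfer.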

Cheers,

Chris

Hi,

The idr_get_map_annotations.py script above matches images by name within each Dataset, so it’s not foolproof, but it works if you don’t have duplicate image names within a Dataset. And multiple images in a Series should get different names when imported to OMERO?

Each image is part of a Fileset that consists of potentially multiple files. A Fileset may contain more than one image in OMERO.
So, if you download all the original files for a bunch of images, then re-import them to another server you may find you have additional images there (that you didn’t choose to export).

For example a Fileset may have the following Files and Images (in OMERO).

Fileset 1:

  • Files:
    • main.xml
    • plane1.tiff
    • plane2.tiff
    • plane3.tiff
  • Images:
    • imageA
    • imageB

In OMERO, imageA and imageB are linked to Fileset1.
If you export imageA from OMERO, you’ll download all the Files in the Fileset. Then re-import and you’ll have imageA AND imageB in the ‘public’ server.
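
To make the side effect concrete, here is a small sketch that works on plain dictionaries rather than a live server; `images_after_reimport` is a hypothetical helper. On a real server the Fileset-to-Images mapping would have to be read from OMERO first, which is not shown here.

```python
def images_after_reimport(selected, fileset_images):
    """Return (imported, extra) image names for a selection of images.

    selected: names of the images chosen for export.
    fileset_images: dict mapping fileset name -> list of image names in it.
    """
    selected = set(selected)
    imported = set()
    for images in fileset_images.values():
        # Exporting any image pulls in every file of its fileset, so
        # re-import recreates every image in that fileset.
        if selected & set(images):
            imported.update(images)
    return sorted(imported), sorted(imported - selected)


filesets = {"Fileset 1": ["imageA", "imageB"]}
imported, extra = images_after_reimport(["imageA"], filesets)
```

With the example above, `imported` holds both imageA and imageB, and `extra` holds imageB, the image that was never chosen for export. Checking `extra` before a transfer would show which unreleased images might leak onto the public server.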

Cheers,

Will

Hi @evenhuis,

One year has passed and we are in a pretty similar situation. How did you finally implement the data transfer between the private and public instances? Are you willing to share your efforts?

Cheers,
Anna
