Batch export entire projects in OMERO

Hi,
We have been using the batch image export script, which is limited to selecting multiple datasets (but not projects). One issue we have faced is that zipping large (100+ GB) datasets takes a long time. I see ways to use the CLI or MATLAB to do it, but my end users would probably have a hard time dealing with scripts.

Is there a way to download a list of projects in OMERO (using insight or the webclient) directly to client storage?

Hi @memphizz,

Maybe not completely related, but for general information: I developed a framework for automated processing based on OMERO (it also works for images on disk) called TAPAS.
I also developed some Java functions to read/write images from OMERO; I think @ctrueden developed similar functions too. The code is available on GitHub: https://github.com/mcib3d/tapas-core

Hope this helps

Thomas

Hi Thomas,
Thanks for mentioning TAPAS. We are almost ready to go live with our OMERO service. I am looking for ways to enhance OMERO compute capabilities. Ideally I want to keep the heavy compute on the server side.
By the way, I routinely use the 3D ImageJ Suite and teach it to my end users. It is a very useful set of tools.

Best,
Abbas

Hi @memphizz,

Not out-of-the-box with the OMERO clients at the moment. The export script could be updated to take a Project and create multiple file annotations. Additionally, the newer project OMERO.downloader could be integrated into insight to perform a similar function.

But, of course, always happy to hear about community projects like TAPAS! :+1:

~Josh

Hi Josh,
Thanks for the response. I tried OMERO.downloader on a test dataset that has 144 images (160MB each). It takes ~90 seconds to download an image using the downloader. When I use insight to download the same dataset, it takes ~3 seconds per image (downloading at 50MBps). It looks like the downloader is compressing the images (client side?). Is there a way to disable the compression (if you think that’s the rate-limiting factor)?
The server and client are both on 10G networks. We typically get near-10G transfer rates between the server, the storage and the clients. When I start another download on the same client, I see the total transfer rate go to ~100MBps. Is there a parameter in insight or the server that limits the transfer rate in insight to 50MBps per session/download?
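
For anyone wanting to put the two timings side by side, here is a rough back-of-the-envelope sketch in Python. It only uses the figures quoted above (144 images, 160 MB each, ~90 s versus ~3 s per image) and assumes the images are fetched one after another; the function names are made up for illustration.

```python
# Figures reported in the post above.
IMAGE_SIZE_MB = 160   # size of each image in the test dataset
IMAGE_COUNT = 144     # number of images in the test dataset


def throughput_mb_per_s(size_mb, seconds):
    """Effective transfer rate for a single image, in MB/s."""
    return size_mb / seconds


def total_minutes(count, seconds_per_image):
    """Wall-clock time for a sequential download of all images, in minutes."""
    return count * seconds_per_image / 60


downloader_rate = throughput_mb_per_s(IMAGE_SIZE_MB, 90)
insight_rate = throughput_mb_per_s(IMAGE_SIZE_MB, 3)

print(f"downloader: {downloader_rate:.1f} MB/s, "
      f"{total_minutes(IMAGE_COUNT, 90):.0f} min for the dataset")
print(f"insight:    {insight_rate:.1f} MB/s, "
      f"{total_minutes(IMAGE_COUNT, 3):.1f} min for the dataset")
```

So the whole dataset would take roughly 3.6 hours with the downloader versus about 7 minutes with insight, which is the gap being asked about.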

Hi @memphizz,

Thanks for the numbers. They certainly sound concerning. To clarify: you want all the exported images in TIFF, correct? What options did you pass to the downloader? Also, which download method did you use from insight?

~Josh

P.S. just in case responses take a while, a number of people are away over the next two weeks.

The images are in ome.tiff in OMERO. I passed -f ome-tiff to avoid conversion.

./download.sh -b /media/tmp/ -s omero_IP -u usrname -w pass -f ome-tiff Dataset:3521

In insight I selected all the images in the dataset and used “Download”, not “Export as”.

Best,
Abbas
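
Since the original question was about exporting more than one dataset at a time, one possible workaround is to call the same command once per dataset from a small wrapper. The Python sketch below reuses only the flags shown in the command above (-b, -s, -u, -w, -f and a Dataset:ID target). The function names are hypothetical, and the script path and credentials are the placeholders from that command.

```python
import subprocess


def build_download_command(base_dir, server, user, password, dataset_id,
                           fmt="ome-tiff", script="./download.sh"):
    """Build the argument list for one dataset, mirroring the command above."""
    return [script, "-b", base_dir, "-s", server, "-u", user, "-w", password,
            "-f", fmt, f"Dataset:{dataset_id}"]


def download_datasets(dataset_ids, base_dir, server, user, password,
                      fmt="ome-tiff"):
    """Run the downloader once per dataset, stopping at the first failure."""
    for dataset_id in dataset_ids:
        cmd = build_download_command(base_dir, server, user, password,
                                     dataset_id, fmt=fmt)
        subprocess.run(cmd, check=True)


# Example (not executed here); any ID other than 3521 would be your own:
# download_datasets([3521], "/media/tmp/", "omero_IP", "usrname", "pass")
```

Passing the arguments as a list avoids shell quoting problems with unusual passwords or paths.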

On the topic of OMERO.downloader: I see that version 0.2.0 targets OMERO 5.5. Does that mean it doesn’t work against a 5.4.10 server? Should I check out version 0.1.5 instead? Since there are still lots of 5.4.x servers out there, would it be asking too much for an up-to-date 5.4.x-compatible version?
Thanks,
Damir

I actually didn’t pay attention to the versions. I am running 5.4.9 and tried OMERO.downloader 0.2.0. It was able to complete the download for the test I was running.

Great, thanks for letting me know. So no need for concern :slight_smile:
Damir

Hi @memphizz
I was just looking at your timing numbers and reading the OMERO.downloader documentation. Would it be faster if you specified -f binary instead of -f ome-tiff? Maybe it’s doing a conversion from ome-tiff to ome-tiff and eating up precious tens of seconds?
Just a thought,
Damir

When I use -f binary it crashes:

finding target images... done
mapping filesets of images... done
(1/144) determining files used by image 43011...-! 10/10/19 10:04:52:734 warning: Ice.ThreadPool.Client-0: dispatch exception:
   identity: 9M,3a6^0R(HM@(a\'\\\"4*/ed000184-f8ba-4d69-8112-5a5b88c95c8e
   facet: 
   operation: finished
   remote host: XXXXXXX remote port: 4064
   Ice.UnmarshalOutOfBoundsException
       reason = ""
   	at IceInternal.BasicStream.skip(BasicStream.java:2456)
   	at IceInternal.BasicStream.skipOpt(BasicStream.java:2411)
   	at IceInternal.BasicStream.skipOpts(BasicStream.java:2447)
   	at IceInternal.BasicStream.endReadEncaps(BasicStream.java:379)
   	at IceInternal.Incoming.endReadParams(Incoming.java:385)
   	at omero.cmd._CmdCallbackDisp.___finished(_CmdCallbackDisp.java:117)
   	at omero.cmd._CmdCallbackDisp.__dispatch(_CmdCallbackDisp.java:145)
   	at IceInternal.Incoming.invoke(Incoming.java:221)
   	at Ice.ConnectionI.invokeAll(ConnectionI.java:2536)
   	at Ice.ConnectionI.dispatch(ConnectionI.java:1145)
   	at Ice.ConnectionI.message(ConnectionI.java:1056)
   	at IceInternal.ThreadPool.run(ThreadPool.java:395)
   	at IceInternal.ThreadPool.access$300(ThreadPool.java:12)
   	at IceInternal.ThreadPool$EventHandlerThread.run(ThreadPool.java:832)
   	at java.lang.Thread.run(Thread.java:745)

Oy, that sounds like a bug. @mtbc?
Sorry, that’s all I could think of.

I just installed and tried:

  • when connecting to OMERO 5.5.1, the download with the -f binary option works fine.
  • when connecting to OMERO 5.4.10, the same command fails with the error @memphizz also reported.

Damir

Hi,
Thanks @dsudar for identifying the -f option to avoid unnecessary generation of OME-TIFFs.

The Ice.UnmarshalOutOfBoundsException is definitely the result of different versions of OMERO on the server vs. the client. You may not hit it for some operations (e.g. if you don’t use API methods that have changed), which is why it seemed to work OK in some cases, but you really need compatible versions.

@mtbc is away just now, but I created https://github.com/ome/omero-downloader/issues/29 for support of 5.4.x servers.

Will

Hi @joshmoore ,

Just to second that a bulk export function is important for us too. We have had a couple of users leave the institution, and extracting their images from OMERO has been a real issue.

I have been using a Python script to walk the projects they have, but as @memphizz points out, this can’t be easily deployed to users.
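
For readers who have not written such a script, here is a minimal sketch of what a project walk could look like with the OMERO Python gateway (omero-py). It is not the script mentioned above, just an illustration. The function names are made up, and the port 4064 is taken from the log earlier in this thread. Note that getObjects only returns what the logged-in user can see in the current group, so collecting the data of a departed user would need extra admin options that are not shown here.

```python
def walk_projects(projects):
    """Yield one (project, dataset, image_id, image_name) tuple per image."""
    for project in projects:
        for dataset in project.listChildren():
            for image in dataset.listChildren():
                yield (project.getName(), dataset.getName(),
                       image.getId(), image.getName())


def list_all_images(host, user, password, port=4064):
    """Connect to OMERO and list every image reachable through a project."""
    # Imported lazily so walk_projects can be tried without omero-py installed.
    from omero.gateway import BlitzGateway

    conn = BlitzGateway(user, password, host=host, port=port)
    if not conn.connect():
        raise RuntimeError("could not connect to OMERO")
    try:
        return list(walk_projects(conn.getObjects("Project")))
    finally:
        conn.close()
```

The resulting list could then be fed to whatever export step is used, one image or dataset at a time.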

How does OMERO.downloader treat annotations such as tags, Key-Value annotations and Tables? I had a look in the code but had trouble finding where this might be done. Exporting these as CSV files would probably be best.
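
To make the CSV idea concrete, here is a small sketch of one possible layout: one row per image and key-value pair. It is only an illustration and not something OMERO.downloader does. The function name and the sample keys and values are invented, and it assumes the pairs have already been read from OMERO by some other means.

```python
import csv
import io


def write_key_values_csv(records, out):
    """Write (image_id, [(key, value), ...]) records as flat CSV rows."""
    writer = csv.writer(out)
    writer.writerow(["image_id", "key", "value"])
    for image_id, pairs in records:
        for key, value in pairs:
            writer.writerow([image_id, key, value])


# Made-up sample data, written to memory instead of a file.
buffer = io.StringIO()
write_key_values_csv(
    [(43011, [("stain", "DAPI"), ("objective", "40x")])],
    buffer,
)
print(buffer.getvalue())
```

A flat layout like this opens directly in a spreadsheet, at the cost of repeating the image ID on every row.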

It would be really cool if the exporter could generate .html files that replicate the dataset thumbnail view, so that when you click on a thumbnail the info from the right-hand pane is shown. This way, when a project is exported to disk, the user could navigate the directories in a similar way as in OMERO.

Cheers,

Chris

Please see my comment on https://github.com/ome/omero-downloader/issues/29#issuecomment-541111743 regarding a downloader release which is 5.4-compatible. There are hardly any new features between the 5.4-compatible version and the newest release which you are using.

All the best

Petr

Hi @pwalczysko,
Thanks for mentioning that. I tested it and indeed, the 0.1.5 release works fine with my 5.4.10 server. So there were no significant improvements between 0.1.5 and 0.2.0 except for switching support to OMERO 5.5?
Cheers,
Damir
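
To summarise the version pairing reported in this thread (0.1.5 worked against a 5.4.10 server, 0.2.0 targets OMERO 5.5), here is a tiny Python sketch. The function name is hypothetical, and the table only covers the combinations mentioned above, so anything else is rejected rather than guessed.

```python
# Pairings reported in this thread: (server major, minor) -> downloader release.
KNOWN_RELEASES = {
    (5, 4): "0.1.5",
    (5, 5): "0.2.0",
}


def pick_downloader_release(server_version):
    """Return the downloader release reported to match an OMERO server version."""
    major, minor = (int(part) for part in server_version.split(".")[:2])
    try:
        return KNOWN_RELEASES[(major, minor)]
    except KeyError:
        raise ValueError(
            f"no pairing reported for OMERO {major}.{minor}"
        ) from None


print(pick_downloader_release("5.4.10"))
print(pick_downloader_release("5.5.1"))
```

Matching on major.minor only reflects the advice above that client and server need compatible versions, even if a mismatch happens to work for some operations.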

Hey Chris,

We certainly agree, and this has even been the topic of a recent funding proposal. But for the moment, downloader is a project which gets attention when we can manage. Nonetheless, it’s great to have the community stating the need. Thanks.

At the moment, I’m pretty sure everything would get downloaded to XML. CSV is an interesting option. However, there’s already a JSON format supported by the CLI as well so that might be a better compromise. Ultimately, we need to be careful not to create a large number of semi-specified formats that we need to continue supporting moving forward.

:smile: Also, a very interesting idea but the complexity issue raises its head. Let’s keep gathering all the ideas out there and see if we can balance it all out.

Thanks again!
~Josh

Hi Josh,
It would be very useful to have an option for parallel download (and import). In my case, insight could download 20 images simultaneously at 50MBps each.

Best,
Abbas
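
Until something like that exists, parallel downloads could in principle be driven from the client side. The sketch below shows the general pattern with a thread pool in Python. The download_one callable is a placeholder for whatever actually fetches a single image (it is not part of OMERO or the downloader), and the worker count is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_in_parallel(image_ids, download_one, max_workers=4):
    """Run download_one(image_id) for every ID, several at a time.

    Returns a dict mapping each image ID to whatever download_one returned.
    An exception raised by any download is re-raised here.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download_one, image_id): image_id
                   for image_id in image_ids}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


# Stand-in for a real download: just report the file name it would write.
def fake_download(image_id):
    return f"{image_id}.ome.tiff"


print(download_in_parallel([43011, 43012, 43013], fake_download))
```

Threads suit this case because the work is dominated by waiting on the network rather than by computation.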