Omero compatibility with S3

Dear all
Our university is installing an S3 server as we speak and they are promising us the moon! The question is: is OMERO compatible with an S3 server and, if not, is there a plan to make it compatible?
Sylvie

Hi Sylvie,
we were discussing this topic with our IT team and wanted to ask the forum about it!
Thank you for starting this thread!
Cheers,
Romain

Cool! :blush:
We are doing a pilot project with the university IT architecture guys to test the new infrastructure. I already spoke to them about setting up OMERO. I can test stuff for you on the new server if you want. We just got a light sheet so I certainly have the datasets for it! :scream::scream:

Med vänlig hälsning / Best regards

Sylvie

@@@@@@@@@@@@@@@@

Sylvie Le Guyader, PhD

Live Cell Imaging Facility Manager

Karolinska Institutet- Bionut Dpt

Hälsovägen 7C,

Room 7362 (lab)/7840 (office)

14157 Huddinge, Sweden

mobile: +46 (0) 73 733 5008

LCI website

Follow our microscopy blog!

At present OMERO only supports POSIX file systems, but Bio-Formats 6+ contains preliminary support for reading off S3. For instance if you download the Bio-Formats command line tools you can use the showinf command to open an S3 image:

./bftools/showinf s3://s3.amazonaws.com/bucket-name/path/to/image.jpg
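
To make the URL layout in that command explicit: judging from the example, the host (endpoint) comes first, then the bucket, then the object key. Here is a minimal Python sketch that splits such a URL. The function name `split_s3_url` is hypothetical and not part of Bio-Formats, and the endpoint/bucket/key layout is an assumption read off the example above.

```python
from urllib.parse import urlsplit


def split_s3_url(url):
    """Split s3://endpoint/bucket/key into (endpoint, bucket, key).

    Assumption: the layout follows the showinf example, i.e. the host part
    is the S3 endpoint and the first path segment is the bucket.
    """
    parts = urlsplit(url)
    if parts.scheme != "s3":
        raise ValueError(f"not an s3:// URL: {url}")
    # The path looks like "/bucket-name/path/to/image.jpg".
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not parts.netloc or not bucket or not key:
        raise ValueError(f"expected s3://endpoint/bucket/key, got: {url}")
    return parts.netloc, bucket, key


endpoint, bucket, key = split_s3_url(
    "s3://s3.amazonaws.com/bucket-name/path/to/image.jpg"
)
print(endpoint, bucket, key)
```

With an in-house S3-compatible server, the endpoint part would presumably be that server's host name instead of `s3.amazonaws.com`.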

If you try the showinf command, one obvious issue is the latency in opening and viewing images. There’s an optional cache included in the S3 reader, but that’s not a full solution: for example, it doesn’t refresh automatically, nor does it have a size limit.
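
For illustration only, here is a rough Python sketch of what a cache with the two missing properties (expiry and a size limit) might look like. This is not the cache in the Bio-Formats S3 reader; the class name, its parameters and the `fetch` callback are all hypothetical.

```python
import time
from collections import OrderedDict


class BoundedTTLCache:
    """Hypothetical cache: entries expire after ttl_seconds, and the least
    recently used entries are evicted once max_bytes is exceeded."""

    def __init__(self, max_bytes, ttl_seconds, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = OrderedDict()  # key -> (data, fetched_at)
        self._size = 0

    def get(self, key, fetch):
        """Return cached bytes for key, calling fetch(key) on a miss or expiry."""
        now = self.clock()
        entry = self._items.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            self._items.move_to_end(key)  # mark as most recently used
            return entry[0]
        if entry is not None:
            self._drop(key)  # stale entry: refresh it from the source
        data = fetch(key)
        self._items[key] = (data, now)
        self._size += len(data)
        # Evict least recently used entries until under the limit. The newest
        # entry is always kept, even if it alone exceeds max_bytes.
        while self._size > self.max_bytes and len(self._items) > 1:
            self._drop(next(iter(self._items)))
        return data

    def _drop(self, key):
        data, _ = self._items.pop(key)
        self._size -= len(data)
```

Here `fetch` stands in for whatever actually downloads the object (or a byte range of it) from S3. A time-based expiry is only one possible refresh strategy; checking the object's ETag or modification time would be another.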

OMERO 5.5 will include Bio-Formats 6.1, but to enable importing and viewing of S3 images an external Bio-Formats reader is required: https://gitlab.com/openmicroscopy/incubator/bioformats-urlreader/
In the current architecture OMERO always requires a local file on disk. This reader provides one in the form of a wrapper file that stores the URL and some options (including credentials if the bucket is private). The S3 file itself is then streamed remotely.

I had a demo S3 OMERO 5.4 server. I haven’t looked into testing with 5.5 yet, but I don’t anticipate too many issues.

However, getting this to production quality is still a lot of work. It would be interesting for everyone to outline their minimum requirements for it to be useful.

Sorry but what do you mean by ‘minimum requirements’?

Supporting object stores like S3 in Bio-Formats and OMERO is a lot of work. If you require full support for S3, meaning that as far as OMERO is concerned it behaves exactly like any other file system, that’s likely to be a long way off. On the other hand, if you wanted S3 for storing public read-only data in OMERO, that’s much easier. The current prototype has most of that; the main limitation is performance, which could be improved by caching.

Between those extremes there are a lot of other use-cases, so having an idea of what you want will help us prioritise the work.

what do you mean by ‘minimum requirements’?

In addition to @manics's "to read-only or not to read-only" question, your comment makes me wonder:

  • What object storage will you be using? AWS S3 or some other S3-API provider?
  • Are you looking to have all your storage S3 or a mix?
  • What file formats supported by Bio-Formats are you looking to store in S3?
  • What are the typical dimensions of the files you are looking to start with? More time series? Large 3D volumes (a la light sheet)?
  • Are you ok with converting the data to a different format?
  • If so, would you then feel comfortable getting rid of the originals?

And those are just the beginning! :wink:
~J.

We managed to get OMERO to store its entire Managed Repository in S3 using the AWS File Gateway system. We currently have ~8 terabytes of images stored in OMERO via S3 and everything shows up normally in the viewer (including tiled images). We haven’t run into any issues with file locks or anything. Once in a while we’ll have to restart the server, but so far it’s been working great.

This is great to hear. Thanks for letting us know.

Do you have any details about what’s going wrong to require a restart? If it happens again, we’d be interested in logs as well as outputs of jstack.

~J.

I don’t think we’ve had to restart it in a few weeks, actually. It seems to be working without a hitch.

Hi again

Thanks everyone for the answers! It is nice to read that there is hope! :wink:
Just to make sure we are talking about the same thing: what is being set up at our university is an in-house S3 server, not Amazon S3. Is that what you did, @pmann?

That’s certainly one of the questions, and it means that @pmann’s solution (described earlier in this thread) won’t work for you. With your own server, there might be a similar option (or you could try a third-party one like https://github.com/kahing/goofys), at least until OMERO fully supports S3.

~Josh

Do you mean that @pmann uses Amazon S3?

Med vänlig hälsning / Best regards

Sylvie


Exactly. The AWS File Gateway is a specific add-on product that only exists in Amazon’s AWS. ~J