Is there a guide for how to get large files into OMERO? I have figured out how to convert files with bioformats2raw, but I don’t quite understand where to go from here or how to use the omero-cli-zarr plugin.
There is a blog post about the conversion process https://www.glencoesoftware.com/blog/2019/12/09/converting-whole-slide-images-to-OME-TIFF.html
and also a useful notebook (prepared for our community meeting in May) explaining the various conversion steps
To import the data into OMERO, the files will need to be either TIFF or OME-TIFF.
I hope it helps
Jumping in a bit late, but it’d be good to know what you’d like to do with the Zarr. Currently the ome-zarr format supported by the plugin is not an official format and so is not supported by OMERO. If you are trying to do something large in production, then the best path is to stick with the complete conversion to OME-TIFF using raw2ometiff. If you are looking to do something more exploratory, we’ll be happy to help.
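For the production path described above (full conversion to OME-TIFF), here is a minimal Python sketch that chains the two command-line tools. It assumes `bioformats2raw` and `raw2ometiff` are installed and on the PATH, and that each takes an input and an output path as positional arguments. The helper names and the output naming scheme are hypothetical, and no tuning options are passed.

```python
import subprocess
from pathlib import Path


def conversion_commands(source, workdir):
    """Build the two command lines: source -> Zarr -> OME-TIFF.

    The output names below are an arbitrary (hypothetical) convention.
    """
    source = Path(source)
    workdir = Path(workdir)
    zarr_dir = workdir / (source.stem + ".zarr")      # intermediate output
    ome_tiff = workdir / (source.stem + ".ome.tiff")  # file to import into OMERO
    return [
        ["bioformats2raw", str(source), str(zarr_dir)],
        ["raw2ometiff", str(zarr_dir), str(ome_tiff)],
    ]


def convert(source, workdir):
    """Run both steps in order; stop on the first failure."""
    for cmd in conversion_commands(source, workdir):
        subprocess.run(cmd, check=True)
```

Calling `convert("/data/slide.nd2", "/scratch")` would leave `/scratch/slide.ome.tiff` ready for a normal OMERO import. Building the commands separately from running them makes it easy to print and check them before committing to a long conversion.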
I am a bit confused with the workflow of importing large images. I understand that the new workflow published here
Therefore, those large images should first be converted to ome-zarr and then to OME-TIFF, and the OME-TIFF is then imported into OMERO.
But how would one associate the original file within OMERO? Can I still have the original-format file (which was imported with ln_s) linked to the new OME-TIFF thumbnail, pixel data, etc.? Is there a way to do that internally?
Or, for the more adventurous, I guess it is possible to use the omero-ms-zarr microservice. Presumably one needs to create an object that links to the .zarr output of bioformats2raw, which should allow the OMERO API to access the pixel-level data, presumably like the implementation example here: Next-generation file formats (NGFF)
One can then use a different viewer to view the Zarr image. Is my interpretation of how OMERO will deal with Zarr images correct? If so, will there be a way to link back to the original file?
The reason for linking back to the original file is Research Data Management, plus other software and requirements that need access to the original image file.
Looking forward to the discussion at the OME Meeting next week.
Hi @ken.ho san
Not at the moment. The workflows as they exist are geared more towards dealing with issues where the original data has not, and especially cannot, be imported into OMERO. In that case, I could see having a reference to the storage location for the original data as an annotation, but nothing is prescribed.
If the data is already in OMERO, then I guess the question is what you are trying to achieve by converting to OME-TIFF? Are these pyramidal images? Are the pyramids in OMERO not performing?
If the goal is to have the Zarr, then at the moment, yes, our best offering is generating Zarr on the fly. (Below)
That isn’t implemented at the moment. Currently, omero-ms-zarr generates a Zarr for you using bioformats directly. i.e. you don’t need to cache the Zarr representation. What you outline though would be feasible. In general, it would be great to have more people defining (and implementing) use cases in the microservice!
There are a number of tools that can now access Zarrs remotely, yes. In general, it’s a more generic API and frees clients from needing to speak OMERO-ish.
Thanks for the reply Josh.
The files are large images, more than 3000x3000 pixels, in formats like JPEG, ND2 and BigDataViewer. They were imported into OMERO, but OMERO didn’t generate pyramids for them. Users can’t view the images, and I guess they won’t be able to access them through the API either. Hence the need to convert them to OME-TIFF and re-import them.
I think I am going to remove the original images (which can’t be rendered or accessed) from OMERO (they were imported with ln_s anyway), then re-import them as OME-TIFF (without ln_s) and change the OriginalFilePath of the OME-TIFF to point to the original-format files.
The use case for this is that the Machine Learning team(s) need access to the raw proprietary image files.
Thanks, I shall play around with it later and will report this back if there is something worth posting.
Interesting. Which field would you change? The other alternative is of course an annotation.
My naive thinking is to update OriginalFile.name and OriginalFile.path.
In a way the OME-TIFF formatted file is internalised … I think…
Maybe you are right that an annotation is the safer option, as the manipulation above may be confusing for other system admins to follow up on in the future.
No. Those fields are used to look up the file in the managed repository. Changing those would make the data unparseable.
clientPath is an option. It will change what shows up with the little /../.. symbol in the UI, but there may not be a one-to-one mapping between how many files are in the OME-TIFF and how many are in the source dataset.
Many thanks for the advice.
I shall do an annotation instead then. To be on the safe side.
My naive thinking was that OriginalFile.name and .path are for reference in the UI only, and in my use case they would give other teams direct access to the original files. I thought that when importing the OME-TIFF without in-place import, OMERO would just use the files in the ManagedRepository. Sounds like there is more to this than meets the eye. I shall play around with FilesetEntry.clientPath and see what it looks like on my dev server.
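For anyone following along, the annotation route settled on above could look roughly like this with the omero-py BlitzGateway. This is only a sketch, not a prescribed workflow: the key name `original_file` and both helper names are made up for illustration, and `conn` is assumed to be an already connected `BlitzGateway` with write access to the image.

```python
def original_file_pairs(paths, key="original_file"):
    """Build key-value pairs, one per original file (key name is hypothetical)."""
    return [[key, str(p)] for p in paths]


def annotate_with_originals(conn, image_id, paths):
    """Attach the original file locations to an image as a map annotation."""
    # Imported here so the helper above stays usable without omero-py installed.
    from omero.constants.metadata import NSCLIENTMAPANNOTATION
    from omero.gateway import MapAnnotationWrapper

    image = conn.getObject("Image", image_id)
    if image is None:
        raise ValueError("Image %s not found" % image_id)

    ann = MapAnnotationWrapper(conn)
    ann.setNs(NSCLIENTMAPANNOTATION)  # client namespace: pairs stay editable in the web UI
    ann.setValue(original_file_pairs(paths))
    ann.save()
    image.linkAnnotation(ann)
    return ann
```

Because each source file gets its own pair, this also copes with the case where one OME-TIFF stands in for several original files, and it leaves OriginalFile.name and OriginalFile.path untouched.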