Hi OME team,
A few scientific projects at OHSU want to use the exact same images, each for its own purposes, but they do NOT want to share annotations or anything else that gets attached to those images (due to PHI concerns). Ideally, they would also like NOT to have to create multiple copies of the large image files. So is there a way to create separate Image objects (each in its own Group with its own annotations, etc.) but have those Image objects connected to the same file in the ManagedRepository? I’m imagining some kind of hard link situation in the ManagedRepository, so that if one of the Image objects is deleted, the file remains on disk without trouble.
I am not a part of the OME team, but I think I know the answer. You want to use “in-place” importing.
I recommend the soft-linking strategy, which we use a ton. But hard linking would work just fine if you aren’t linking across file systems.
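To make the difference between the two link types concrete, here is a minimal sketch on throwaway temp files. It does not touch OMERO or the ManagedRepository at all; the file names are made up for illustration. As far as I know, the two strategies correspond to the `--transfer=ln` (hard link) and `--transfer=ln_s` (soft link) options of `omero import`, but check the in-place import docs for your version.

```shell
#!/bin/sh
# Illustration only: plain files in a temp directory, no OMERO involved.
set -eu

workdir=$(mktemp -d)
printf 'pixel data\n' > "$workdir/original.tiff"

# Hard link: a second name for the same data on disk (same file system only).
ln "$workdir/original.tiff" "$workdir/hard.tiff"
# Soft link: a pointer to the original path.
ln -s "$workdir/original.tiff" "$workdir/soft.tiff"

# Delete the original name, as if one of the "copies" were removed.
rm "$workdir/original.tiff"

# The hard link still reads the data; the soft link now points at nothing.
hard_content=$(cat "$workdir/hard.tiff")
if [ -e "$workdir/soft.tiff" ]; then
    soft_state=ok
else
    soft_state=dangling
fi

rm -rf "$workdir"

echo "hard link content: $hard_content"
echo "soft link state: $soft_state"
```

With hard links the data stays on disk until the last name is removed, which is the behaviour asked about in the original question. With soft links the original file has to be kept in place.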
Thanks, yes, I had been thinking about that option but wasn’t entirely sure how to implement the entire workflow. So I guess I would do a regular import first so the image is in the ManagedRepository, and then do an in-place import with hard links from that original import to get the second copy into the second Group, and so on. The specifics of that second step aren’t obvious.
I did meanwhile find the omero-cli-duplicate plugin and will see if I can subvert that to do what I need.
In general, I think the “reimport in-place” workflow has some merit, and I think I’ve suggested it elsewhere on image.sc. There aren’t however any formalizations of that yet.
@dsudar: In addition to what @joshmoore wrote: omero duplicate lets you duplicate the images with or without annotations, as you wish. As far as the images are concerned, omero duplicate also attempts to create a hard link in the ManagedRepository instead of doubling the necessary storage, whenever possible. Of course, after the duplication, you still have to move the images into the desired group.
Hi Josh and Petr,
Yes, it indeed looks like omero-cli-duplicate will already do exactly what I need. I just found out about it last night. One quick new feature request after reading the docs in omero-guides: can there be an option to provide the destination Group for the duplicate as part of the duplicate process?
“PRs welcome”? More seriously, at the CLI level, I’m hesitant to have each command learn how to run other tasks (chgrp, chmod) since they can be chained together fairly easily. That being said, the duplicate command’s output is currently not ideal:
$ omero duplicate Fileset:123
omero.cmd.Duplicate Fileset:123 ok
You can pass --report and capture the value you’re interested in:
$ DUPE=$(omero duplicate Fileset:123 --report | grep " Fileset:")
but it’s not ideal. I opened https://github.com/ome/omero-cli-duplicate/pull/18 to allow the likes of:
$ DUPE=$(omero duplicate Fileset:123)
$ omero chgrp "New Group" $DUPE
when --report is not passed. We will need to review all of the CLI commands to make sure that “Class:ID on stdout” is a standard contract, as it is with omero import, etc.
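Until that contract exists, the capture step can be wrapped in a small helper. The sketch below is an assumption-heavy illustration: `first_id` is a hypothetical helper name, and `sample_report` is made-up text standing in for the `--report` output, whose real layout may differ. It assumes the new objects appear on indented lines and that the unindented status line should be skipped.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the first Class:ID token of the given class
# found on an indented line of the report text read from stdin.
# Note: grep -o is not strictly POSIX, but GNU and BSD grep both support it.
first_id() {
    grep -E "^[[:space:]]+$1:[0-9]+" | grep -oE "$1:[0-9]+" | head -n 1
}

# Made-up sample standing in for `omero duplicate Fileset:123 --report`.
# The real report format is an assumption here.
sample_report=$(printf '%s\n' \
    'omero.cmd.Duplicate Fileset:123 ok' \
    '  Fileset:124' \
    '  Image:457')

DUPE=$(printf '%s\n' "$sample_report" | first_id Fileset)
echo "$DUPE"

# Against a real server the chain would look like this (not run here):
#   DUPE=$(omero duplicate Fileset:123 --report | first_id Fileset)
#   omero chgrp "New Group" "$DUPE"
```

Extracting only the `Class:ID` token, rather than the whole matching line, avoids carrying leading whitespace or trailing text into the next command.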
In the web, it’s a different story since piping isn’t possible. We began scoping this work, but there were some concerns about the overall scaling of these long-running tasks in the web, which is something we need to consider first.
All the best,
Yes, coming from a Unix background, I completely agree with that sentiment.
But the temporary workaround until your PR makes it through is perfectly fine for us.
And yes, I see that providing this functionality on web is quite a non-trivial thing.