My current definition of an “image dataset” would be: a set of images that can be meaningfully displayed in the same physical coordinate system, combined with sufficient metadata to know where in this common coordinate system they should be positioned. I guess that’s already the first point to be discussed: whether this definition would be useful for other people.
Another question would be how to technically “create” such a dataset.
Along those lines: @NicoKiaru, for inspiration, could you post here the text of a bdv.xml that combines several images into one dataset? I think you are by now a known expert on this topic.
Just to link the similar discussion, which focuses on which viewers to use as well as on how to create the dataset by mapping/transforming one image onto the other: Goolge-maps type browser - #5 by Christian_Tischer
That makes a lot of sense (at least to me). But I maybe wouldn’t call it an image dataset. Reason: “dataset” is often used to refer to the object that contains the nd data, for example the scale datasets s0, s1, … in the case of OME-Zarr. Maybe multi-image-container or image-collection?
To support multiple images, the multiscales dictionary in the attributes could (1) be nested further, or (2) instead of a single multiscales entry one could allow for multiple keys, one per image.
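A rough sketch to put the two options next to what the attributes of a single multiscale image look like today (the multi-image key names are hypothetical, not part of any spec):

```python
# Sketch only; the multi-image layouts below are hypothetical, not a spec.

# Today (OME-Zarr v0.1, simplified): one "multiscales" entry per group.
single_image_attrs = {
    "multiscales": [
        {"version": "0.1", "datasets": [{"path": "0"}, {"path": "1"}]}
    ]
}

# Option (1): nest further, so one key describes several images.
nested_attrs = {
    "images": {  # hypothetical key
        "image1": {"multiscales": [{"version": "0.1", "datasets": [{"path": "image1/0"}]}]},
        "image2": {"multiscales": [{"version": "0.1", "datasets": [{"path": "image2/0"}]}]},
    }
}

# Option (2): allow multiple keys side by side, one per image.
multi_key_attrs = {
    "image1": {"multiscales": [{"version": "0.1", "datasets": [{"path": "image1/0"}]}]},
    "image2": {"multiscales": [{"version": "0.1", "datasets": [{"path": "image2/0"}]}]},
}
```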
This definition sounds meaningful to me. And it is more or less exactly the philosophy we follow with CZI + ZEN Connect.
We call this concept sample-centric data storage. I am not sure if that helps your discussions…
I did not invent anything; I just reused the specifications used in the XML files of BigDataViewer. AFAIK, the original use case is positioning a lot of “sources” (XYZT) in a single global 3D coordinate system (typical use case: light sheet, but it is also good for correlative or multimodal imaging). BigStitcher by @StephanPreibisch uses these metadata to perform several rounds of finer and finer registration steps.
So the info you have in a bdv.xml file is:
- for each source (XYZ)
  - for each timepoint (T)
    - a chain of affine transforms in 3D
Keeping the chain in memory makes it possible to keep track of, and if needed easily cancel, a failed registration step. The first affine transform usually contains the voxel size information; the second affine transform can contain the position in 3D, or how to unskew the data if needed.
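To make that concrete, here is a minimal, hand-written sketch of the registration part of a bdv.xml plus a few lines of Python that walk the transform chains. The tag names are written down from memory and are meant to illustrate the structure, not to be copied as an authoritative example.

```python
import xml.etree.ElementTree as ET

# Hand-written illustration of the <ViewRegistrations> section of a bdv.xml;
# tag names are from memory and may differ slightly from real files.
BDV_XML = """
<SpimData version="0.2">
  <ViewRegistrations>
    <ViewRegistration timepoint="0" setup="0">
      <ViewTransform type="affine">
        <Name>Stitching / drift correction</Name>
        <affine>1 0 0 12.5 0 1 0 -3.0 0 0 1 0</affine>
      </ViewTransform>
      <ViewTransform type="affine">
        <Name>calibration (voxel size)</Name>
        <affine>0.1 0 0 0 0 0.1 0 0 0 0 0.4 0</affine>
      </ViewTransform>
    </ViewRegistration>
  </ViewRegistrations>
</SpimData>
"""

root = ET.fromstring(BDV_XML)
for reg in root.iter("ViewRegistration"):
    print(f"setup {reg.get('setup')}, timepoint {reg.get('timepoint')}:")
    # The transforms form a chain: concatenated they give the full
    # pixel-to-world mapping, and dropping one entry "cancels" that
    # registration step without touching the voxel data.
    for vt in reg.findall("ViewTransform"):
        name = vt.findtext("Name")
        values = vt.findtext("affine").split()  # 12 numbers = 3x4 affine
        print(f"  {name}: {len(values)}-element affine")
```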
Maybe looking at what’s done in other software could be a nice source of inspiration:
look at Blender? How do they define object coordinates in 3D space? (ping @frauzufall)
That looks very useful to me, as it would, e.g., allow applying (and documenting) both channel and drift corrections without having to re-save the voxel data.
However, could one go further? Let’s say I have a huge XYZ volume, e.g. from volume EM, and would like to apply different transformations to the individual XY slices (a very common use case afaik, isn’t it @schorb?). @NicoKiaru @bogovicj, do you know if that could be represented with the current bdv.xml specifications, or does it only allow for one transformation per volume?
Then, as far as I know, you need to consider your slices as independent sources.
It might be possible to do a custom ‘source’ which allows for flexible 2D transforms while keeping a simple ‘z stacking’, but that would need to be outside the bdv.xml specifications.
Outside bdv, TrakEM2 does that (registration of big 2D planes).
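Just to make concrete what per-slice 2D transforms amount to computationally, a toy numpy/scipy sketch (illustration only, independent of bdv, TrakEM2 or any file-format spec):

```python
import numpy as np
from scipy.ndimage import affine_transform

# Toy (Z, Y, X) volume with one 2D affine per XY slice, e.g. from a
# per-section registration.
volume = np.random.rand(5, 64, 64)

# One 3x3 homogeneous 2D affine per slice; scipy's affine_transform expects
# the mapping from output coordinates to input coordinates.
affines = [np.eye(3) for _ in range(volume.shape[0])]
affines[2][0, 2] = 4.0  # sample slice 2 from 4 pixels further along y

aligned = np.empty_like(volume)
for z in range(volume.shape[0]):
    A = affines[z]
    aligned[z] = affine_transform(volume[z], A[:2, :2], offset=A[:2, 2], order=1)
```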
On not using “datasets”, I’d concur, since it shows up in various places already. I’ve been using “containers” colloquially, with the only one currently implemented being for high-content screening plates. But “collection” or anything suitably unused would work.
This would certainly work as things currently stand, but I’m beginning to think I made some mistakes in the v0.1 spec. For example, we’re starting to look into storing multiple different downsamplings (e.g. one for 2D and one for 3D access). Currently the only place we could put that information is also in the multiscales object, i.e. we likely need more, which means we’re probably looking at breaking the v0.1 spec anyway. My thinking would be:
rename “multiscales” to “image” (or similar), making it clear that it turns a group (or HDF5 dataset ← that word again!) into an image. In that case, supporting multiple images, as @Christian_Tischer is rightly requesting here, would mean doing so via a containing group:
- images    # group with new "collection" or "container" metadata
  - image1  # group with "multiscales" now called "image"
  - image2  # the same
Note: you could do this today to represent multiple images in a single OME-Zarr fileset, but you would either need to explicitly pass the URL to your clients or have the client search the hierarchy, since there is no metadata to say, "please find images at path image1/ and image2/".
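To illustrate the missing piece, a hypothetical sketch; the "collection" key and its content are invented here and are not part of the released spec:

```python
import zarr

root = zarr.open_group("images.zarr", mode="w")
for name in ("image1", "image2"):
    img = root.create_group(name)
    # Today's per-image metadata (simplified v0.1 shape).
    img.attrs["multiscales"] = [{"version": "0.1", "datasets": [{"path": "0"}]}]

# The piece that is currently missing: group-level metadata telling a client
# where to find the images without crawling the hierarchy (hypothetical key).
root.attrs["collection"] = {"images": [{"path": "image1"}, {"path": "image2"}]}
```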
Agreed. But I think this is going to happen several times as we work through all these concepts as a community, so I think we need to be ready for upgrades.
@NicoKiaru brought up this idea recently on a call. I find it intriguing, but knowing what I do about the SVG spec, I’d tend to put this on the “will take time to implement” end of the spectrum. For what it’s worth, I do think that there need to be multiple collection/container implementations. If we can describe the most immediately needed one, then we can build from there. Ditto on:
> - images    # group with new "collection" or "container" metadata
>   - image1  # group with "multiscales" now called "image"
>   - image2  # the same
I like this!
Maybe we could have the discussion on specifying the transforms that map the pixels into the physical world in a different thread. However, it could be related to the image-collection discussion, because: is it (a) a property of each individual image where it lives in the physical world, or (b) a property that is (only) defined at the level of the image collection? I think I remember several people making the point that (b) is better, as the same image could potentially be “reused” in different spatial contexts. Is that right?
If we think along the lines of (b), a certain image (the raw data) would need to be able to be part of multiple image-collections. Does that mean that we should allow for several images groups? Probably yes, right? Sorry for the elaborate argument; I am starting to feel that this was clear anyway?!
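To make option (b) a bit more concrete, a purely hypothetical sketch of collection-level placement metadata (none of these keys exist in any spec):

```python
# Two collections referencing the same raw image, each with its own placement;
# all key names are invented for illustration.
collection_a = {
    "collection": {
        "images": [
            {"path": "raw/image1",
             "transform": {"type": "affine",
                           "parameters": [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0]}},
            {"path": "raw/image2",
             "transform": {"type": "translation", "parameters": [120.0, 0.0, 0.0]}},
        ]
    }
}
collection_b = {
    "collection": {
        "images": [
            # Same raw data, placed differently, without re-saving any voxels.
            {"path": "raw/image1",
             "transform": {"type": "translation", "parameters": [0.0, 50.0, 0.0]}},
        ]
    }
}
```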
In order to efficiently browse through an image set, I think we would need an “image feature table”. Again, I am not sure whether this is something that should be defined at the image-collection level or scraped together from the image metadata that exists anyway.
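For the “scraped together” variant, a rough sketch of what a client could do today (illustrative only; the per-image "features" key is hypothetical):

```python
import pandas as pd
import zarr

root = zarr.open_group("images.zarr", mode="r")
rows = []
for name, group in root.groups():
    attrs = dict(group.attrs)
    scales = attrs.get("multiscales", [{}])[0].get("datasets", [])
    rows.append({"path": name,
                 "n_scales": len(scales),
                 **attrs.get("features", {})})  # hypothetical per-image features
feature_table = pd.DataFrame(rows)
```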
@joshmoore, is your vision (a) that the specifications for HTM data would, in future versions of the file format, be “absorbed” (not sure what’s the right word here) into the image-collection data model, or (b) that they would remain stand-alone specs? I think I would favour (a).
Understood. And going back to your previous comment, I also am not sure at which level to do it. If there’s either not a lot of metadata or not a lot of images, encoding it at the image layer is fairly straight-forward. But when you start to have GB (or TB!) of tabular data along with a study, I assume that needs to be its own data source.
Certainly an option. Alternatively (or perhaps additionally) the top level group could have metadata of the form:
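(Rough sketch only; the key names below are invented for illustration and are not part of any spec.)

```python
# Hypothetical top-level group metadata relating the images of a collection
# to an associated feature table stored elsewhere in the hierarchy.
top_level_attrs = {
    "collection": {
        "images": [{"path": "image1"}, {"path": "image2"}],
        "tables": [
            {"path": "tables/image_features",  # invented path
             "columns": ["image", "mean_intensity", "n_nuclei"]}
        ],
    }
}
```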
It seems to me that as the structure becomes more complicated and the collections larger, efficient data extraction will require some form of indexing.
Definitely agree, Jean-Karim. Hopefully that can be an extension to the specs as they are built, rather than needing a full reworking. Certainly having bidirectional metadata will help in many situations. At some scales, though, the metadata will likely become more like data and need to be stored in binary, at which point we may have to deal with both in the same application.
I like RO-Crate a lot, but it hasn’t become clear to me yet how to merge that with the separate Zarr hierarchy. I’ve tended to try to keep things simpler starting out rather than introducing another technology/library. I failed to meet up with Stian at the biohackathon, but I imagine that a few conversations could show us a better path forward.