Multi-scale image labels v0.1


This specification defines a convention for storing multiscale integer-valued labels, commonly used for storing a segmentation. Multiple such labeled images can be associated with a single image:

├── 0-4                # Resolution levels of the primary image
└── labels             # Group container for the labeled images.
    ├── original       # One or more groups which each represent
    ├── ..             # a multi-scale pyramid of integer values
    └── recalculated   # representing e.g. detected objects.

Use of this draft for the specification of IDR images in S3 is available at

“labels” group

In order to enable discovery, the well-known group “labels” within each image directory functions as a registry of all known image labels.

See color-key below

-    "labels": [
!        "original",
!        "recalculated"
-    ]
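As a minimal sketch, discovery amounts to reading the "labels" key from the group attributes. Here the parsed `labels/.zattrs` JSON is modeled as a plain dict, and `registered_labels` is a hypothetical helper, not an official API:

```python
import json

# Contents of labels/.zattrs, as in the example above
labels_zattrs = json.loads('{"labels": ["original", "recalculated"]}')

def registered_labels(zattrs):
    """Return the names of all registered labeled images (may be empty)."""
    return list(zattrs.get("labels", []))

print(registered_labels(labels_zattrs))  # ['original', 'recalculated']
```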

“image-label” group

Each image label group is itself a multiscale image and should contain the “multiscales” metadata key. However, it is the presence of the “image-label” key that identifies a group as a labeled image. To enable discovery, each such group should be registered with the “labels” group above. Additionally, labeled images should list their source image in order to enable a bidirectional link.

The primary additional metadata currently specified is the “colors” key, an array in which each label value can be registered together with a display color.

See color-key below

-    "image-label": {
!        "version": "0.1",
!        "source": {
!            "image": "../.."
!        },
+        "colors": [
!            {
-             "label-value": 1,
-             "rgba": [128, 128, 128, 128]
!            }          
-        ]
-    },
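As a sketch, the value of the “image-label” key could be checked structurally along the lines below. `check_image_label` is a hypothetical helper, not an official validator; the MUST/SHOULD levels follow the color key at the bottom of this post:

```python
# The value of the "image-label" key, mirroring the example above
image_label = {
    "version": "0.1",
    "source": {"image": "../.."},
    "colors": [{"label-value": 1, "rgba": [128, 128, 128, 128]}],
}

def check_image_label(meta):
    """Collect structural problems with "image-label" metadata."""
    problems = []
    if "version" not in meta:
        problems.append("SHOULD: missing version")
    if "image" not in meta.get("source", {}):
        problems.append("SHOULD: missing source image")
    for color in meta.get("colors", []):  # "colors" itself is MAY
        if "label-value" not in color:
            problems.append("MUST: color entry lacks label-value")
        rgba = color.get("rgba")
        if rgba is not None and (
            len(rgba) != 4 or not all(0 <= v <= 255 for v in rgba)
        ):
            problems.append("bad rgba: %r" % (rgba,))
    return problems

print(check_image_label(image_label))  # []
```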

Example workflow

If you would like to experiment with this specification, you can install the ome-zarr library via:

pip install ome-zarr==0.0.13

The library provides a napari plugin, which can optionally be activated via:

pip install ome-zarr[napari]==0.0.13

Sample data is available under the test-data subdirectory of the S3 bucket:

$ ome_zarr info [zgroup]
 - metadata
   - Multiscales
   - OMERO
 - data
   - (1, 2, 257, 210, 253)
[zgroup] (hidden)
 - metadata
   - Labels
 - data
[zgroup] (hidden)
 - metadata
   - Label
   - Multiscales
 - data
   - (1, 1, 257, 210, 253)
   - (1, 1, 257, 126, 105)
   - (1, 1, 257, 52, 63)
   - (1, 1, 257, 31, 26)
   - (1, 1, 257, 13, 15)

If you have existing masks in OMERO, you can export your image and masks using omero-cli-zarr:

$ pip install omero-cli-zarr
$ omero zarr export Image:6001240
$ omero zarr masks Image:6001240
$ ome_zarr info 6001240.zarr/
/tmp/6001240.zarr [zgroup]
 - metadata
   - Multiscales
   - OMERO
 - data
   - (1, 2, 236, 275, 271)
/tmp/6001240.zarr/labels [zgroup] (hidden)
 - metadata
   - Labels
 - data
/tmp/6001240.zarr/labels/0 [zgroup] (hidden)
 - metadata
   - Label
   - Multiscales
 - data
   - (1, 1, 236, 275, 271)
   - (1, 1, 236, 135, 137)
   - (1, 1, 236, 68, 67)

Design trade-offs

Two additional layouts were considered for the labeled data itself. The first was a split representation in which each label is a separate bitmask; this representation is still possible by using multiple labeled images. The other was a 6-dimensional bitmask structure. The benefit of both was support for overlapping labels; the downside was that many implementations do not natively support a compact representation of bit arrays.

For the metadata, a number of different configurations were also considered for describing each label value. The primary choice was between an array representation and two sparse representations: a dictionary, with the downside of requiring string keys, and a list of dictionaries, with the downside of possible redundancy. More details on this discussion are available under “Revamp color metadata” (#62).

Current limitations


  • Multi-channel labeled images are currently not supported. The colors metadata specification would need to be updated to do so.
  • The current assumption is that for every multiscale level in the image data a layer of equal size will be present in the labeled image.
  • Currently missing metadata:
    • label value for overlaps
    • source array (as opposed to group) of the segmentation
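The equal-size assumption in the second bullet can be checked mechanically. A plain-Python sketch with illustrative pyramid shapes (`levels_match` is a hypothetical helper, not part of the spec):

```python
# Check the assumption that every multiscale level of the label image has
# a level of equal (z, y, x) size in the primary image. Channel counts may
# differ (the image here has 2 channels, the label image has 1).
image_levels = [(1, 2, 257, 210, 253), (1, 2, 257, 126, 105)]
label_levels = [(1, 1, 257, 210, 253), (1, 1, 257, 126, 105)]

def levels_match(image, labels):
    """Spatial sizes must agree level by level; channel counts may differ."""
    return len(image) == len(labels) and all(
        img[-3:] == lab[-3:] for img, lab in zip(image, labels)
    )

print(levels_match(image_levels, label_levels))  # True
```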


Color key


- MUST     : If these values are not present, the multiscale series will not be detected.
! SHOULD   : Missing values may cause issues in future versions.
+ MAY      : Optional values which can be readily omitted.
# UNPARSED : When updating between versions, no transformation will be performed on these values.
Revision   Source       Date         Description
0.1.1      @joshmoore   2020.10.01   Migration to
0.1.0      @joshmoore   2020.09.16   Initial version on GitHub


Thank you very much for sharing!
I have a few questions and thoughts.

  1. Question about the label image dimensions and data type: You write “The other was a 6-dimensional bitmask structure.” Could you elaborate on what that means and why there are 6 dimensions?
  2. Regarding label attributes: As discussed in the last call, some sort of table where each row corresponds to one label and each column to a label attribute would be great to have. If we go for this, I would like to raise the question of how we could later add attributes/columns (e.g. newly computed label features) to the table. For example, would it be an idea to require/recommend column-wise chunking in the storage model, such that it would be easy and efficient to add new columns? However, for fast loading of specific label attributes, row-wise chunking would be more performant. Afaik this is a classic issue with no easy answer (first google hit). Thus, maybe we should not require anything here but support all chunking options?
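To make the chunking trade-off concrete, a plain-Python sketch with dicts standing in for chunked storage (names and attributes are purely illustrative):

```python
# Columnar layout: adding a new attribute ("column") is a single append;
# no existing data needs rewriting.
columns = {
    "label": [1, 2, 3],
    "area":  [10.0, 42.5, 7.3],
}
columns["circularity"] = [0.9, 0.4, 0.8]  # cheap column addition

# Row-wise layout: reading all attributes of one label is one lookup,
# but adding a column touches every row.
rows = {1: {"area": 10.0}, 2: {"area": 42.5}, 3: {"area": 7.3}}
for label, props in rows.items():
    props["circularity"] = {1: 0.9, 2: 0.4, 3: 0.8}[label]

print(columns["circularity"])   # [0.9, 0.4, 0.8]
print(rows[2]["circularity"])   # 0.4
```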

Sure. Drawings would help, but in pseudo-code, the proposals we looked at were:

CURRENT = 3  # The label class that we're currently trying to assign

#  Each mask is a separate 5D binary array in a group
mask_group[str(CURRENT)][0, 0, 0, 0, 0] = True

# All masks are in one 6D binary array:
mask_6D_shape[0, 0, 0, 0, 0, CURRENT] = True

# Non-overlapping masks are collected into a labeled array
labeled_array[0, 0, 0, 0, 0] = CURRENT

# Note: In the first two examples, the mask index wouldn't necessarily need to be at the end.


I’d agree. If we can keep the API unaware of the storage specifics, then code should continue to work, even if not at optimal performance. This is the same issue we’ll have with the pixel data if it is accessed in a suboptimal dimension order.

In the meantime, you might take a look at @DragaDoncila’s proposal for an intermediate way to store extra properties in the JSON metadata.

Maybe drawing really would help. I don’t think I get it :slight_smile:

mask_group[str(CURRENT)][0, 0, 0, 0, 0] = True

Are you trying to say that mask_group[str(CURRENT)] is a 5D boolean array with the same dimensions as the image and the values are set to True for all pixels where the label CURRENT lives? In other words, mask_group in Java would be a Map< String, boolean[][][][][]>?! And then you rely on compression for storing all the boolean[][][][][], is it?

Would labeled_array[][][][][] be of data type String?

mask_6D_shape[0, 0, 0, 0, 0, CURRENT] = True
I guess this implies that CURRENT is a positive integer, right? Which in Java would mean that we would limit ourselves to 2147483647 labels, which may be enough for all the use cases I can imagine :slight_smile:

And is there a favourite already?

I think most applications would work with this one: labeled_array[0, 0, 0, 0, 0] = CURRENT, with CURRENT being, e.g., of data type Unsigned Long, but as you said, this implementation has the disadvantage that you need additional logic for overlapping labels…

Good point. Sorry, I should have put on my Java hat:

// Initialized to -1 (unset); actual label values start at 0.
long CURRENT = -1;

//  Each mask is a separate 5D binary array in a group
Map<String,  boolean[][][][][]> maskGroup;

// All masks are in one 6D binary array:
boolean[][][][][][] mask6DShape;

// Non-overlapping masks are collected into a labeled array
long[][][][][] labeledArray;

is what it comes to. And yes, in the first two proposals there would have been a strong reliance on compression to make the storage of the boolean arrays convenient. Unfortunately, overall support for bitmasks felt poor, so the favorite at the moment, and what we have implemented in examples, is long[][][][][]. I think there won’t be an issue with the Java array size limit because you will be loading by chunk, but that’s a question for the Java implementation that needs working through. The choices for overlapping labels at the moment are:

  • creating a second long[][][][][] labeledArray2 in the zarr group
  • setting a marker value in the first labeledArray to express the conflict.


Unfortunately, overall support for bitmasks felt poor, so the favorite at the moment and what we have implemented in examples is long[][][][][].

Makes sense. Although I conceptually like the Map< String, boolean[][][][][]>, all the software I know currently implements a long[][]... internally, so in most cases there would need to be a conversion step somewhere, which, maybe, is too annoying.

Could you please elaborate, maybe with a minimal example? (Also, I guess, at one location in space there could be more than two overlapping labels, e.g. three.)

Sure, though I may need to come back to this next week. The example at the top of this topic shows:

└── labels             # Group container for the labeled images.
    ├── original       # One or more groups which each represent
    ├── ..             # a multi-scale pyramid of integer values
    └── recalculated   # representing e.g. detected objects.

so two sets of labels that could of course be overlapping. Here the example is intended to show them coming from different users, but the idea of having a collection of labeled images that the user can organize also means that if there are two types of overlapping segmentations from the same analysis (e.g. “cell” and “nucleus”), they could be stored separately:

└── labels
    └── tischi
        ├── cells  <-- int[][][][][]
        └── nuclei <-- int[][][][][]

I see, if those are entirely different structures, different label images make sense, of course.

I was more thinking of use cases such as segmentation of nuclei in 2D images. Since 2D images are often some sort of projection, nuclei can overlap in those images (even though in the real 3D world they don’t). So there could be a region of nucleus with label 2, a region of nucleus with label 3, and a region in between where they are both (2 & 3). I think one way to handle this is by assigning a new label 4 to the region where they overlap and then store something like a Map< Integer, List< Integer > > indexInLabelImageToObjectLabelIndicies where you could see that 4 -> { 2, 3 }.

So the storage model could be something like:
int[][][][][]; Map< Integer, List< Integer > >
What do you think? Maybe the Map could be optional so only people that care about such overlaps would need to store it.
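A plain-Python sketch of this model, with nested lists standing in for the labeled array and a dict for the optional overlap map (names and values are illustrative, not part of any spec):

```python
# Labeled array plus an optional mapping from synthetic "overlap" labels
# to the real labels they combine (cf. indexInLabelImageToObjectLabelIndicies).
labeled = [
    [0, 2, 2],
    [0, 4, 3],   # 4 marks the region where nuclei 2 and 3 overlap
    [0, 0, 3],
]
overlaps = {4: [2, 3]}

def labels_at(y, x):
    """Resolve the list of object labels present at a pixel."""
    v = labeled[y][x]
    if v == 0:          # background
        return []
    return overlaps.get(v, [v])

print(labels_at(1, 1))  # [2, 3]
```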

As a side note, I think in the current model this might mean there needs to be a 2D projection of the image that the labels are “on”. Might need to consider this…

Agreed, but certainly that’s not been addressed yet.

I wonder to what degree this is metadata on the labels themselves along the lines of

cc: @DragaDoncila

Yes, I was wondering this, too :wink:


This is a related thread: How to save labelings with metadata in ImageJ / Fiji

@frauzufall, did you make any progress regarding saving label images, and does any of the above discussion resonate with your current ideas?
