Convert annotation / label in Python from GeoJSON, XML to labeled or binary numpy arrays

Hi All,

@constantinpape
@VolkerH
@mweigert

I am looking for some Python code to convert annotations / labels from GeoJSON
(from a QuPath export) or XML (from MoNuSeg) to labeled or binary numpy arrays.

Or, even better, a:

Universal_Label_Converter in Python.
Surely there must be something out there, I just did not find it…

Thanks a lot for your help & Kind regards

Tobias


Hello Tobias,

I haven’t looked in detail at the polygon annotations in QuPath. If you say they are GeoJSON, I assume they are polygons?

I would read the JSON file into a string, then use geojson.loads(), iterate over the items to convert each polygon to a numpy array and finally use skimage.draw.polygon to assign labels or skimage.draw.polygon2mask to create masks.

I would also expect having to swap coordinates of the numpy array (not sure what order x and y are in GeoJSON) and maybe round to the nearest integer (in case the polygons are sub-pixel).
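A minimal sketch of the steps above, in case it helps. It uses the standard-library json module instead of geojson.loads() to keep dependencies small, and the function name geojson_to_labels is made up for illustration. It assumes the export is a FeatureCollection, a list of features, or a single feature, and it only handles the outer ring of Polygon geometries. GeoJSON positions are stored as (x, y), so they are swapped to (row, col) for numpy. skimage.draw.polygon accepts floating-point vertices, so explicit rounding is not required here.

```python
import json

import numpy as np
from skimage.draw import polygon


def geojson_to_labels(geojson_text, shape):
    """Rasterize GeoJSON Polygon features into a labeled array (0 = background)."""
    data = json.loads(geojson_text)
    # Accept a FeatureCollection, a bare list of features, or a single feature.
    if isinstance(data, dict) and data.get("type") == "FeatureCollection":
        features = data["features"]
    elif isinstance(data, list):
        features = data
    else:
        features = [data]

    labels = np.zeros(shape, dtype=np.int32)
    next_label = 1
    for feature in features:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # other geometry types are skipped in this sketch
        # The first ring is the exterior; holes (further rings) are ignored here.
        ring = np.asarray(geometry["coordinates"][0], dtype=float)
        xs, ys = ring[:, 0], ring[:, 1]
        # GeoJSON stores (x, y); numpy indexes (row, col) = (y, x).
        rr, cc = polygon(ys, xs, shape=shape)
        labels[rr, cc] = next_label
        next_label += 1
    return labels
```

A binary mask is then just `labels > 0`.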

EDIT:
Obviously, if the polygons overlap, the labels you assign later will overwrite the ones created earlier. But that is a general problem with label images.
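To make the overlap behaviour concrete, here is a small numpy-only sketch. It assumes you already have one boolean mask per object (for example from skimage.draw.polygon2mask); the helper paint_masks and its keep_first flag are hypothetical names, and "first object wins" is just one possible policy, not a recommendation.

```python
import numpy as np


def paint_masks(masks, keep_first=False):
    """Combine boolean masks into one label image; labels start at 1."""
    labels = np.zeros(masks[0].shape, dtype=np.int32)
    for label, mask in enumerate(masks, start=1):
        if keep_first:
            # Only write where nothing has been labeled yet.
            mask = mask & (labels == 0)
        labels[mask] = label
    return labels


# Two rectangles that overlap in columns 2 and 3.
a = np.zeros((4, 6), dtype=bool)
a[1:3, 0:4] = True
b = np.zeros((4, 6), dtype=bool)
b[1:3, 2:6] = True

last_wins = paint_masks([a, b])                    # overlap ends up with label 2
first_wins = paint_masks([a, b], keep_first=True)  # overlap keeps label 1
```

Either way the overlap region can only carry one label, so if overlaps matter it may be safer to keep one mask per object instead of a single label image.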

EDIT2:
Maybe you can attach a small toy example.

2 Likes

Hi @Tobias,

I am not aware of a general library that would handle all the different possible geometry formats equally well (although it would be nice to have).
Similar to what @VolkerH wrote, I typically import the geometry via geopandas or xml.etree.ElementTree and then use skimage.draw.polygon to create the label mask.
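For the XML route, a rough sketch with xml.etree.ElementTree could look like the following. Note the assumptions: it expects each object to be a Region element containing Vertex elements with X and Y attributes, which is how I understand the MoNuSeg annotation files to be laid out, so please check this against your own files. The function name xml_to_labels is made up for illustration.

```python
import xml.etree.ElementTree as ET

import numpy as np
from skimage.draw import polygon


def xml_to_labels(xml_text, shape):
    """Rasterize every <Region> into a labeled array (0 = background)."""
    root = ET.fromstring(xml_text)
    labels = np.zeros(shape, dtype=np.int32)
    # iter("Region") finds regions at any nesting depth (assumed tag name).
    for label, region in enumerate(root.iter("Region"), start=1):
        vertices = np.array(
            [(float(v.get("X")), float(v.get("Y"))) for v in region.iter("Vertex")]
        )
        if len(vertices) < 3:
            continue  # not a polygon; this label value stays unused
        xs, ys = vertices[:, 0], vertices[:, 1]
        # X is the column and Y is the row in numpy indexing.
        rr, cc = polygon(ys, xs, shape=shape)
        labels[rr, cc] = label
    return labels
```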


I also don’t know any general purpose library for this; when I needed to do this in the past I also wrote a custom parser similar to what @VolkerH proposed.

Just as a side note: label exports are a fairly complex topic in general, see also the recent discussion I had about this with @petebankhead:


Dear Volker,

thanks a lot for the fast response.
Yes, objects are (until now) polygons.

By toy example, do you mean demo data?
The smallest I have is this dataset (7MB).

The annotation file alternates:
Filename (linebreak)
GeoJSON (linebreak)
Filename …
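A file with that alternating layout could be split up roughly like this. This is only a sketch: it assumes each GeoJSON document sits on a single line directly after its filename, and read_alternating_annotations is a hypothetical helper name.

```python
import json


def read_alternating_annotations(text):
    """Map each filename to the parsed GeoJSON found on the following line."""
    # Drop empty lines so a trailing newline does not break the pairing.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected filename/GeoJSON line pairs")
    # Even positions are filenames, odd positions are the GeoJSON payloads.
    return {
        name: json.loads(payload)
        for name, payload in zip(lines[0::2], lines[1::2])
    }
```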

Or did you mean sample code?
I have not started yet, as I had hoped that one of you would tell me that the problem is already solved.

Kind regards

Tobias

Hi @mweigert,

thanks a lot! Is the way you do it already part of CSBDeep, or available elsewhere, so that I could have a look? Do you already solve the overlapping annotation issue (in a smart way)?

Kind regards

Tobias

Hi @constantinpape thanks a lot for sharing the link.
I agree that it would be great to settle on a few formats and have a common collection of converters for them.

At the same time I agree with Pete that scripts are fine for now, as long as they are available and shared.

All the Best

Tobias


Unfortunately we didn’t have the time for both yet :frowning:

Update!

Dear All,

brief update: I just made a notebook to convert the XML from MoNuSeg.

If no common collection of converters emerges soon, I will include it in the next stable version of OpSeF.

Kind regards

Tobias

I had a little play with your test set. This notebook (unfinished and possibly buggy) is a quick attempt at exploring the data structure. There is no iterating over the labels yet, but you should be able to see how you can extract the data and manipulate pixels within the labelled regions. Something to get you started …


Hi Volker,

thanks a lot for your help.
That looks great, the rest seems easy and fast.

Best regards
Tobias
