Finding bounding box of a scene in a CZI file

Hi.

I’m currently trying to process CZI files that each contain 2 scenes, using Python, NumPy and the czifile library.

As far as I can tell, the only way to access the data is to load all of it into one gigantic NumPy array.

>>> import czifile
>>> czif = czifile.CziFile("2019_10_03__11_28__0329.czi")
>>> czif.axes
'SCYX0'
>>> czif.shape
(2, 3, 25935, 64539, 1)
>>> arr = czif.asarray()

In this example, I’ve got two scenes, i.e. two RGB images of shape 25935x64539.
With some transposition and dropping useless dimensions, I’m able to retrieve an image in the format I need.
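
For reference, the transposition step can be sketched as follows. This is a minimal illustration on a small stand-in array rather than my real data, and the helper name `scene_to_rgb` is made up for this post; it assumes the axes are ordered 'SCYX0' as shown above.

```python
import numpy as np

def scene_to_rgb(arr, scene):
    """Return one scene as a (Y, X, C) image from an array with axes 'SCYX0'."""
    # Select the scene and drop the trailing length-1 '0' axis -> (C, Y, X).
    cyx = arr[scene, ..., 0]
    # Move the channel axis last -> (Y, X, C); this is a view, not a copy.
    return np.moveaxis(cyx, 0, -1)

# Tiny stand-in for the real (2, 3, 25935, 64539, 1) array.
demo = np.zeros((2, 3, 4, 6, 1), dtype=np.uint8)
rgb = scene_to_rgb(demo, scene=0)
print(rgb.shape)  # (4, 6, 3)
```

Because only basic indexing and an axis move are involved, no pixel data is copied at this stage.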

My problem is related to the X axis dimension (64539). On further inspection, I noticed that only about half of that dimension contains useful data. It seems that czifile creates this dimension by adding up the shapes of the two scenes, causing the array size to be twice as large as necessary.
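
A minimal sketch of this kind of inspection, again on a small stand-in array: it looks for the rows and columns of a scene that contain any non-zero value. The helper name `nonzero_bbox` is made up for illustration, and the approach assumes that padding is exactly zero and that the scene itself has no all-black border. It also needs the full array in memory, so it is only a fallback and not the header-based solution I am looking for.

```python
import numpy as np

def nonzero_bbox(scene_arr):
    """Return (y0, y1, x0, x1) bounding the non-zero pixels of a (C, Y, X) array.

    End indices are exclusive. Returns None if the scene is all zeros.
    """
    mask = scene_arr.any(axis=0)             # (Y, X): True where any channel is non-zero
    rows = np.flatnonzero(mask.any(axis=1))  # indices of rows that contain data
    cols = np.flatnonzero(mask.any(axis=0))  # indices of columns that contain data
    if rows.size == 0:
        return None
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1

# Stand-in data: scene 0 fills the left part of X, scene 1 the right part.
demo = np.zeros((2, 3, 4, 10, 1), dtype=np.uint8)
demo[0, :, 1:3, 0:5, 0] = 255
demo[1, :, 0:4, 6:10, 0] = 255
boxes = [nonzero_bbox(demo[s, ..., 0]) for s in range(demo.shape[0])]
print(boxes)
```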

Is there a way to find in the header the bounding box of each scene, so I can drop the useless zero values as early as possible in my processing flow?

I have been able to find an origin and a size for each scene in the header, but they are obviously not expressed in pixels. If there were similar information for the entire image, I could use it to convert the header units into pixels. But I haven’t found it in the header yet.
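
To make the intended conversion concrete, here is a sketch of what I would do if the origin and size of the entire image were available in the same units. Everything here is an assumption: the function name `to_pixels`, its parameters and the numbers are made up, and it presumes a plain linear, axis-aligned mapping between header units and pixels.

```python
def to_pixels(origin, size, full_origin, full_size, full_shape_px):
    """Map a box given in header units onto pixel indices by linear scaling.

    All arguments are (x, y) pairs. Returns (x0, y0, x1, y1), end-exclusive.
    """
    box = []
    for o, s, fo, fs, n in zip(origin, size, full_origin, full_size, full_shape_px):
        scale = n / fs                       # pixels per header unit along this axis
        start = round((o - fo) * scale)
        stop = round((o + s - fo) * scale)
        box.append((max(start, 0), min(stop, n)))  # clamp to the image extent
    (x0, x1), (y0, y1) = box
    return x0, y0, x1, y1

# Made-up numbers: the full image spans 1000 x 400 units and is 2000 x 800 px.
bbox = to_pixels(origin=(500.0, 0.0), size=(500.0, 400.0),
                 full_origin=(0.0, 0.0), full_size=(1000.0, 400.0),
                 full_shape_px=(2000, 800))
print(bbox)  # (1000, 0, 2000, 800)
```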

This is how I read the header:
>>> metadata_dict = czif.metadata(raw=False)

Then I dump it into a YAML file to make it more readable:

>>> import yaml
>>> with open("czi_header.yml", "w") as f:
...     yaml.dump(metadata_dict, f)

Note that the extra data causes real problems in my flow: when I display the image in an OpenGL widget, there is too much data to fit in GPU memory.

The reason I need to use czifile rather than another Python CZI library is that all the other libraries seem to have licenses that are too restrictive for my application.