Standard workflow and format to handle huge slidescanner images

I think primarily since dedicated tools for 2D exist.
I am for example currently exploring OpenSeadragon.

I will chat with Constantin once he is back.
But I am afraid that his code uses dependencies that might not work with some large array dimensions.

Hi,

I use ND2 files in QuPath pretty regularly. However, if they are big stitched images, I first use BFtools to convert them to pyramidal OME-TIFFs.

Thanks a lot!!
Bio-Formats supports some ND2 variants, but not the one from our slide scanner.

But I might still export a huge OME-TIFF using NIS-Elements and then make it pyramidal using Python or BFtools.

What kind of pyramid exactly do you make?
A resolution pyramid?
Or do you split the image into tiles?
If so, of what size?

Ah I see. Well QuPath uses Bio-Formats so that explains why it won’t open your ND2 as well.

I make a resolution pyramid because QuPath can handle them so nicely compared to most options where you have to load in the whole stitched image at full res.

The command I used would be something like below for a 50K by 50K image:

bfconvert -tilex 512 -tiley 512 -noflat -pyramid-resolutions 4 -pyramid-scale 4 "D:\bigImage.nd2" "D:\bigImage.ome.tiff"

To be honest I can’t remember how much I initially played around with the number of levels and scaling and the tile size. I think I remember @petebankhead telling me to try downsampling by a factor of 4 until the lowest resolution is around 500-1000 pixels wide or high. The tiling is needed as BFtools can’t load the full stitched image.
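As a rough sketch of that rule of thumb (not taken from QuPath or Bio-Formats): the hypothetical helper `pyramid_resolutions` below counts how many levels result from downsampling by a fixed factor until the longest side is at most about 1000 pixels. The function name, its parameters and the exact stopping threshold are assumptions made for illustration.

```python
def pyramid_resolutions(width, height, scale=4, max_smallest=1000):
    """Count pyramid levels, including the full-resolution level.

    Hypothetical helper: keeps dividing the longest side by `scale`
    until it is no larger than `max_smallest` pixels.
    """
    levels = 1  # level 0 is the full-resolution image
    longest = max(width, height)
    while longest > max_smallest:
        longest //= scale
        levels += 1
    return levels


# A 50K x 50K image: 50000 -> 12500 -> 3125 -> 781 pixels on the longest side
print(pyramid_resolutions(50000, 50000))  # 4
```

For the 50K by 50K case this gives 4, which is consistent with `-pyramid-resolutions 4 -pyramid-scale 4` in the command above, assuming the resolution count includes the full-resolution level.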

I think you’ll want a pyramidal OME-TIFF, which I believe makes both QuPath and Orbit an option (does/could BDV support this directly too?).

However, in general QuPath doesn’t really need Bio-Formats to support the format; it just needs a reader that implements its ImageServer interface. There are implementations for ImageJ, OpenSlide and, recently, also OMERO. If you can read your ND2 files somehow, then you may be able to add QuPath support with your own reader. This could involve extra steps like dynamically stitching fields of view if it really had to.

Which version of bfconvert do you use?
In version 5.6.0, -noflat, -pyramid-resolutions and -pyramid-scale are not known.

If I run:
bfconvert -tilex 512 -tiley 512 E:\export3\test002_1.tif E:\export3\bigImage.ome.tiff

I get an array too large error:

E:\export3\test002_1.tif
TiffDelegateReader initializing E:\export3\test002_1.tif
Reading IFDs
Populating metadata
Checking comment style
Populating OME metadata
[Tagged Image File Format] -> E:\export3\bigImage.ome.tiff [OME-TIFF]
Switching to BigTIFF (by file size)
Exception in thread "main" java.lang.IllegalArgumentException: Array size too large: 57856 x 64512
at loci.common.DataTools.safeMultiply32(DataTools.java:949)
at loci.formats.in.MinimalTiffReader.getOptimalTileHeight(MinimalTider.java:416)
at loci.formats.DelegateReader.getOptimalTileHeight(DelegateReader.256)
at loci.formats.ImageReader.getOptimalTileHeight(ImageReader.java:7
at loci.formats.tools.ImageConverter.convertTilePlane(ImageConvertea:663)
at loci.formats.tools.ImageConverter.convertPlane(ImageConverter.ja4)
at loci.formats.tools.ImageConverter.testConvert(ImageConverter.jav)
at loci.formats.tools.ImageConverter.main(ImageConverter.java:884)

I think it should be on the same version as BioFormats in general, so you might want to try 6.2.0
https://docs.openmicroscopy.org/bio-formats/6.2.0/users/comlinetools/conversion.html
More usefully, this link has a link to what I think is the current download.
https://docs.openmicroscopy.org/bio-formats/6.2.0/users/comlinetools/index.html
It was a bit of a maze to find, since a google search for bfconvert does come up with 5.7.1 links first.

Thanks a lot!
Now at least all arguments are known.
If I start with a 4x4 binned input, the entire pipeline works and I can import the result into QuPath.

However, with the full-size image I still get the same error:
Array size too large: 57856 x 64512

My system has plenty of memory left, but it might still be a Java heap space issue.

@s.besson
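A quick arithmetic check on the numbers in the error message may help narrow this down. The name `safeMultiply32` in the stack trace suggests a check that a product fits into a signed 32-bit integer; that reading is an assumption and is not confirmed in this thread. If it is right, the plane would be too large for a single Java array regardless of how much heap is available:

```python
INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer

width, height = 57856, 64512  # dimensions from the error message
pixels = width * height

print(pixels)              # 3732406272
print(pixels > INT32_MAX)  # True
```

The 4x4 binned input has 16 times fewer pixels per plane, which stays well below this limit.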

The array size too large message is in bftools or later in QuPath? If in bftools, it sounds like maybe your tiling isn’t working since it is still trying to load the whole image? Or I might be missing something, I haven’t tried this on a large image or anything :slight_smile:

Sorry for the confusion.

The error is in bftools. That is why I added @s.besson to the discussion.

I was hoping that bftools can convert a huge (non-tiled) TIFF file into a pyramid OME.TIFF.
Can it?

If no:
I have the entire file as a numpy array in Python.
Obviously I can easily write it to disk as tiles (one tile per standard TIFF file).
But I cannot import that into QuPath, can I? (That would be great too.)

And I would not know how to instruct bftools to interpret my files properly in order to convert them into an OME-TIFF.
Is that easy?

As a last resort, I might write a pseudo OME-TIFF from Python,

similar to what is described in:


My hope was just that such a solution would already exist.

@sebi06

Are you setting the max heap size to be allocated using set BF_MAX_MEM? The default is 512m so def needed otherwise there will be memory problems.

In QuPath, that would usually generate Java heap space errors, not Array size too large errors. *My QuPath experience may or may not be applicable here :slight_smile:

@Tobias Are you using the exact line @lmurphy listed above?

And @lmurphy did you use that command on single huge tiffs or was your comment on tiling above regarding some formatting on the command line?

I don’t know about importing individual tiles into QuPath. I don’t think that is something that would work well right now.

Yes, I do. The only difference is that I start from standard TIFF not nd2.

I did: set BF_MAX_MEM=26g

I suspect that this might be the problem that causes the error.
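For anyone scripting this step, here is a minimal sketch of how the same call could be assembled from Python with `BF_MAX_MEM` set only for that process. The helper name `build_bfconvert_command` is hypothetical, and the snippet only builds and prints the command; actually running it assumes `bfconvert` is installed and on the PATH.

```python
import os


def build_bfconvert_command(src, dst, tile=512, resolutions=4, scale=4):
    """Assemble the bfconvert argument list used earlier in this thread."""
    return [
        "bfconvert",
        "-tilex", str(tile),
        "-tiley", str(tile),
        "-noflat",
        "-pyramid-resolutions", str(resolutions),
        "-pyramid-scale", str(scale),
        src,
        dst,
    ]


# Copy the current environment and add the heap setting for this call only.
env = dict(os.environ, BF_MAX_MEM="26g")
cmd = build_bfconvert_command(r"E:\export3\test002_1.tif",
                              r"E:\export3\bigImage.ome.tiff")
print(" ".join(cmd))

# To actually run it (assumes bfconvert is on the PATH; on Windows the
# launcher is a .bat file, so shell=True may be required):
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```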

tifffile can write pyramidal TIFF files with a variety of options. For example, the uncompressed BigTIFF file produced by the following code works for me with QuPath-0.2.0-m3 and Bio-Formats but not with OpenSlide on Windows:

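import tifffile
import cv2  # OpenCV for fast resizing

# read the full-resolution image (first page) as a numpy array
image = tifffile.imread('CMU-1.tiff', key=0)
h, w, s = image.shape  # height, width, samples per pixel

with tifffile.TiffWriter('pyramid.tif', bigtiff=True) as tif:
    level = 0
    while True:
        # write the current level as a tiled page
        # (newer tifffile releases name this method write())
        tif.save(
            image,
            software='Glencoe/Faas pyramid',
            metadata=None,
            tile=(256, 256),
            resolution=(1000/2**level, 1000/2**level, 'CENTIMETER'),
            # compress=1,  # low level deflate
            # compress=('jpeg', 95),  # requires imagecodecs
            # subfiletype=1 if level else 0,
        )
        # stop once the level just written is smaller than one tile
        if max(w, h) < 256:
            break
        # halve both dimensions for the next level
        level += 1
        w //= 2
        h //= 2
        image = cv2.resize(image, dsize=(w, h), interpolation=cv2.INTER_LINEAR)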
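
To see which levels the loop above produces without writing any file, the level sizes can be computed on their own. This is only an illustrative sketch; the helper name `pyramid_level_sizes` is hypothetical, and the example dimensions are the 64512 x 57856 pixels reported earlier in this thread.

```python
def pyramid_level_sizes(width, height, min_size=256):
    """Return the (width, height) of each level the loop above would write."""
    sizes = []
    while True:
        sizes.append((width, height))  # this level gets written
        if max(width, height) < min_size:
            break  # smallest level reached, stop after writing it
        width //= 2
        height //= 2
    return sizes


sizes = pyramid_level_sizes(64512, 57856)
print(len(sizes))  # 9
print(sizes[-1])   # (252, 226)
```

Because the size check happens after the write, the last level is the first one whose longest side is below 256 pixels.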

Dear Christoph,

This solution is exactly what I needed. It is very fast and simple and seems to work with arbitrarily large input. Thank you very much!! @cgohlke

I will summarize the entire workflow again for anyone who might run into the same problem in the future.

  1. I export the nd2 files as a single (large) standard TIFF (using NIS-Elements).
  2. I open this file with Python / tifffile as a numpy array (the test array was (57639, 64462, 3)).
  3. Then I enlarge it such that x and y are multiples of 512 (I need that for other purposes)…
  4. Then I export a BigTIFF pyramid as described above.
  5. I open this pyramid with QuPath 0.2.0-m3.
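
Step 3 could be sketched as follows. The thread does not say how the array was enlarged, so zero-padding at the bottom and right edges is an assumption here, and the helper names `padded_size` and `pad_to_multiple` are hypothetical.

```python
import numpy as np


def padded_size(n, multiple=512):
    """Round n up to the next multiple of `multiple`."""
    return -(-n // multiple) * multiple


def pad_to_multiple(image, multiple=512):
    """Zero-pad the bottom and right edges of an (h, w, channels) array."""
    h, w = image.shape[:2]
    pad_h = padded_size(h, multiple) - h
    pad_w = padded_size(w, multiple) - w
    return np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")


# Small stand-in array; the real test array was (57639, 64462, 3)
small = np.ones((5, 7, 3), dtype=np.uint8)
print(pad_to_multiple(small, multiple=4).shape)  # (8, 8, 3)
print(padded_size(57639), padded_size(64462))    # 57856 64512
```

Rounding the test array's dimensions up to multiples of 512 gives 57856 x 64512, the same numbers that appear in the "Array size too large" message earlier in the thread.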

Thanks a lot to all of you who contributed to this solution.

Kind regards

Tobias

Hi @Tobias et al,

Re: Standard workflow and format to handle huge slidescanner images: the command you pasted is exactly the one I would have suggested. I am unclear about the reason for the exception you are seeing, as the code tries to read the tile size. This might point towards a bug that can be fixed in the bfconvert utility. Is it possible for you to upload a sample file reproducing the issue?

Re: Standard workflow and format to handle huge slidescanner images: thanks to Christoph for sharing his code for generating TIFF files. I would like to add one word of caution about the specific implementation that is used, and especially the use of Glencoe/Faas in the software tag. This format was used to store a large multi-resolution zebrafish embryo dataset originally published in JCB Data Viewer and now available in IDR. A Bio-Formats reader was developed specifically for this dataset, and this is probably what QuPath uses under the hood.

Although this layout was used as one of the inspirations while designing the pyramidal OME-TIFF extension, it is effectively a custom file format with all the associated caveats in terms of long-term support, interoperability…

:+1: for having a solution that allows you to make progress immediately, but for the long term we definitely want to improve the tooling so that multi-resolution data can easily be structured into an open, exchangeable format (such as OME-TIFF, BDV, …).

Hello Tobias,

I’m glad the code worked for you. However, I did not mean to promote it as the “standard workflow and format to handle huge slidescanner images”. It would be better to avoid the intermediate TIFF file and directly read the nd2 file if possible. The pyramid file produced is not really following any particular “standard”. It just happens to work with QuPath. When enabling compression the file cannot be read by Bio-Formats without corruption. Probably a bug. As mentioned, OpenSlide (winbuild-20171122) does not recognize it as pyramidal, which excludes tools like vips.

I am not aware of any specific BigTIFF+TIFF6 compliant slide format that currently “just works” with OpenSlide, Bio-Formats, and GIMP or Photoshop.

Dear @s.besson

Thanks a lot for following up!
I think it might be something related to memory management in Java (and I have currently only one workstation to test this on).
I just sent you exactly the file I used (which I cannot make publicly available), but I can create similar files if it turns out to be a fundamental problem.

Dear @cgohlke

Yes, I agree. It turned out that “Towards a standard workflow…” or “At least something that works…” would have been the better title :slight_smile:

The problem with ND2 is that Nikon keeps changing it. Thus, I would hope (but not expect) that Bio-Formats always has the correct reader for the latest ND2 format.

But NIS-Elements will always be able to export standard TIFFs.
And Python/TIFFFILE will always be able to open these.

Thus, I think (with small variations) the proposed “interim” solution might become a long-term solution…

Thanks @Tobias,

The sample file was very useful for identifying the source of the issue in Bio-Formats. I opened a Pull Request proposing a fix for the upcoming patch release, which we will review in the coming days.

Best,
Sebastien
