Bfconvert: JPEG-2000 output compression and dealing with very large 2D images

Hi Bio-Formats developers,

When you run “bfconvert -h” (I use Bio-Formats 6.2.0) it appears to be possible to create output ome.tif files with internal JP-2000 or JP-2000 Lossy compression. However, any attempt to use that gives:
Exception in thread “main” loci.formats.FormatException: Invalid compression type: JP-2000

Is there a way to enable that or is it a mismatch between implementation and documentation?

I’m asking because we have extremely large (>100k by 100k pixels) images that I’m trying to convert to pyramid OME-TIFFs so they can be imported into OMERO 5.5.1. bfconvert can kinda* do this, but it generates a 13GB file from a 5.5GB input. So I was thinking a lossy codec might be sufficient for this purpose and reduce the file size.

*The kinda refers to this:

BF_MAX_MEM=4096M bfconvert -noflat -tilex 1024 -tiley 1024 -compression LZW -pyramid-resolutions 5 -pyramid-scale 2 SMMART_101b-1.tif SMMART_101b-1.ome.tif

works
but:

BF_MAX_MEM=4096M bfconvert -noflat -tilex 1024 -tiley 1024 -compression LZW -pyramid-resolutions 6 -pyramid-scale 2 SMMART_101b-1.tif SMMART_101b-1.ome.tif

fails with:

SMMART_101b-1.tif
TiffDelegateReader initializing SMMART_101b-1.tif
Reading IFDs
Populating metadata
Checking comment style
Populating OME metadata
[Tagged Image File Format] -> SMMART_101b-2.ome.tif [OME-TIFF]
Switching to BigTIFF (by file size)
Tile size = 1024 x 1024
Converted 1/1 planes (100%)
Tile size = 1024 x 1024
Converted 1/1 planes (100%)
Tile size = 1024 x 1024
Converted 1/1 planes (100%)
Tile size = 1024 x 1024
Converted 1/1 planes (100%)
Tile size = 1024 x 1024
Converted 1/1 planes (100%)
Tile size = 1024 x 1024
Exception in thread “main” loci.formats.FormatException: Image plane too large. Only 2GB of data can be extracted at one time. You can work around the problem by opening the plane in tiles; for further details, see: https://docs.openmicroscopy.org/bio-formats/6.2.0/about/bug-reporting.html#common-issues-to-check
at loci.formats.FormatReader.openBytes(FormatReader.java:872)
at loci.formats.ImageReader.openBytes(ImageReader.java:445)
at loci.formats.tools.ImageConverter.getTile(ImageConverter.java:1038)
at loci.formats.tools.ImageConverter.convertTilePlane(ImageConverter.java:826)
at loci.formats.tools.ImageConverter.convertPlane(ImageConverter.java:763)
at loci.formats.tools.ImageConverter.testConvert(ImageConverter.java:691)
at loci.formats.tools.ImageConverter.main(ImageConverter.java:1051)
Caused by: java.lang.IllegalArgumentException: Array size too large: 32768 x 32768 x 3 x 1
at loci.common.DataTools.safeMultiply32(DataTools.java:1286)
at loci.common.DataTools.allocate(DataTools.java:1259)
at loci.formats.FormatReader.openBytes(FormatReader.java:869)
… 6 more

The file “SMMART_101b-1.tif” is in your repo. See: https://trello.com/c/fFBOR0rY/282-tiffsaver-oom-on-big-image-import

Thanks,
Damir

Hi Damir,

For the compression, I think it’s just the wrong string being used: JPEG-2000 rather than JP-2000 should work fine. I will follow up with testing on the SMMART_101b-1.tif file to see if I can get it running as you wish.
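For reference, here is a minimal sketch of the earlier command with the corrected codec string. All bfconvert flags are copied from the commands in this thread; the wrapper name `convert_pyramid` and the `DRY_RUN` switch are hypothetical conveniences, not part of Bio-Formats.

```shell
#!/bin/sh
# convert_pyramid COMPRESSION INPUT OUTPUT
# Hypothetical wrapper (not part of Bio-Formats) around the bfconvert call
# used in this thread. With DRY_RUN=1 it prints the command instead of
# running it (the printed line is not re-quoted, so it is for reading only).
convert_pyramid() {
  set -- bfconvert -noflat -tilex 1024 -tiley 1024 \
    -compression "$1" -pyramid-resolutions 5 -pyramid-scale 2 "$2" "$3"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    BF_MAX_MEM=4096M "$@"
  fi
}

# Usage (codec names as confirmed in this thread):
#   convert_pyramid "JPEG-2000"       SMMART_101b-1.tif SMMART_101b-1.ome.tif
#   convert_pyramid "JPEG-2000 Lossy" SMMART_101b-1.tif SMMART_101b-1.ome.tif
```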

As a follow-up on the FormatException: Image plane too large error, which you are seeing at the smallest resolution: this happens because the tile size remains constant for every resolution, so in order to generate a 1024 x 1024 tile at the lowest resolution, a larger tile of (32 x 1024) x (32 x 1024) must first be read and then down-sampled. It is this larger tile that throws the exception: at 32768 x 32768 x 3 bytes (about 3 GiB) it exceeds the 2GB limit. Using a smaller tile size, such as 512 x 512, should resolve the issue.
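To make the arithmetic concrete, here is a rough sketch that computes how many bytes have to be read to build one tile at the smallest resolution. It assumes the source region is tile x scale^(resolutions-1) pixels on each side, as described above, and 1 byte per sample with 3 channels, as in the "32768 x 32768 x 3 x 1" line of the stack trace. The function names are made up for illustration.

```shell
#!/bin/sh
# Largest byte[] a single Java array can hold (Integer.MAX_VALUE).
MAX_BYTES=2147483647

# source_side TILE SCALE RESOLUTIONS
# Side length (in full-resolution pixels) of the region read to build one
# tile at the smallest resolution: TILE * SCALE^(RESOLUTIONS-1).
source_side() {
  side=$1
  level=1
  while [ "$level" -lt "$3" ]; do
    side=$((side * $2))
    level=$((level + 1))
  done
  echo "$side"
}

# read_bytes TILE SCALE RESOLUTIONS CHANNELS
# Bytes in that region, assuming 1 byte per sample (8-bit data).
read_bytes() {
  side=$(source_side "$1" "$2" "$3")
  echo $((side * side * $4))
}

for tile in 1024 512 256; do
  bytes=$(read_bytes "$tile" 2 6 3)
  if [ "$bytes" -le "$MAX_BYTES" ]; then verdict="fits"; else verdict="too large"; fi
  echo "tile=$tile resolutions=6: $bytes bytes ($verdict)"
done
```

Under these assumptions, a 1024 tile with 6 resolutions needs 3221225472 bytes, which is over the limit, while the same tile with 5 resolutions, or a 512 tile with 6 resolutions, needs 805306368 bytes and fits.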

Hi David,

Thanks for explaining how bfconvert works internally; I understand the issue now. However, it points to a systemic problem: for smooth visualization across the resolution range, it is typically best to have a pyramid all the way up to a ~256px by ~256px image, and thus one will have to use progressively smaller tiles as the original image size gets bigger. And large 2D images are becoming quite common (see e.g.: Standard workflow and format to handle huge slidescanner images). I presume this is a maximum Java array size issue? Any ideas for a work-around, or is it time to switch to a different language?
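A back-of-the-envelope sketch of how tight this gets for a 100k pixel wide image, assuming the width is halved (integer division) at each resolution, 3 channels at 1 byte per sample, and the same "read the full-resolution region for one tile" behaviour described above. The function names are hypothetical, and the exact rounding bfconvert uses may differ.

```shell
#!/bin/sh
MAX_BYTES=2147483647   # Integer.MAX_VALUE, the Java array length limit

# resolutions_needed WIDTH TARGET SCALE
# Number of pyramid resolutions (including full size) until width <= TARGET.
resolutions_needed() {
  width=$1
  count=1
  while [ "$width" -gt "$2" ]; do
    width=$((width / $3))
    count=$((count + 1))
  done
  echo "$count"
}

# max_tile RESOLUTIONS SCALE CHANNELS
# Largest tile side whose full-resolution source region still fits in
# MAX_BYTES, assuming 1 byte per sample.
max_tile() {
  factor=1
  level=1
  while [ "$level" -lt "$1" ]; do
    factor=$((factor * $2))
    level=$((level + 1))
  done
  tile=0
  while :; do
    side=$(((tile + 1) * factor))
    [ $((side * side * $3)) -le "$MAX_BYTES" ] || break
    tile=$((tile + 1))
  done
  echo "$tile"
}

levels=$(resolutions_needed 100000 256 2)
echo "resolutions: $levels, max tile: $(max_tile "$levels" 2 3)"
```

With these assumptions the script reports 10 resolutions and a maximum tile side of 52 pixels, which illustrates why constant-size tiles become impractical for deep pyramids.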

The JPEG-2000 (and JPEG-2000 Lossy) compression settings do indeed work fine. They don’t give as much reduction in file size as I had hoped, and running this compression in bfconvert is quite slow, but it helps. The resulting files also import smoothly into OMERO 5.5.1, which is exactly what I hoped to achieve.

Thanks,
Damir

This is indeed the Java array size limit being reached. Bfconvert could likely be a bit smarter in how it handles this and try to downsample the plane in sections if the array size gets too large, but I haven’t yet properly tested this. I’ve opened a GitHub issue to see if I can get a fix in place: https://github.com/ome/bioformats/issues/3415
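As a rough illustration of what "downsample in sections" could mean, the sketch below splits the oversized source region into horizontal strips that each fit in one array and whose heights are multiples of the downsample factor, so every strip maps to whole output rows. This is only an assumption about one possible approach, not how Bio-Formats implements or will implement it; `plan_strips` is a made-up name, and 1 byte per sample is assumed.

```shell
#!/bin/sh
MAX_BYTES=2147483647   # Integer.MAX_VALUE

# plan_strips SIDE FACTOR CHANNELS
# Prints "first_row row_count" for each horizontal strip of a SIDE x SIDE
# source region, assuming 1 byte per sample.
plan_strips() {
  rows=$((MAX_BYTES / ($1 * $3)))   # full-width rows that fit in one array
  rows=$((rows / $2 * $2))          # align strips to whole output rows
  if [ "$rows" -lt "$2" ]; then
    echo "a single output row does not fit" >&2
    return 1
  fi
  start=0
  while [ "$start" -lt "$1" ]; do
    count=$(($1 - start))
    [ "$count" -le "$rows" ] || count=$rows
    echo "$start $count"
    start=$((start + count))
  done
}

# The failing case from the stack trace: 32768 x 32768 x 3, downsampled 32x.
plan_strips 32768 32 3
```

For the failing case this yields two strips, of 21824 and 10944 rows, each of which stays under the array limit.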