Problems merging and exporting large pyramidal images

I have been attempting to use this script with some larger images, but there seems to be some sort of problem related to size, though I am not sure it is exactly file/image size. If I set the downsample for a given set of images (I had 3 sets of images, 9 in this first test set of MICSSS images) high enough that the pyramid only has 3 layers, it writes out the aligned deconvolved images perfectly!
With one sample, I was able to export with downsample 4, so I figured that would be good enough and tried the other two.
The second and third samples were slightly larger, and I found that any time the script reached the 4th layer, it would stop at “Writing plane 1”. I do not think the problem was inherent to the image files themselves, as I could change the downsample to 6 and both would write out successfully, though only with 3 pyramid layers.

There is no error message in either the script or the log file; the script simply stops running. This happens on different computers, including one with 200 GB of RAM. The files are approximately 7 GB when the process fails.
There are 9 channels in each final image; I am not sure if that matters.
I plan on trying a smaller number of channels with one of the larger images, to see whether the channel count or file size matters more than the absolute size (in pixels) of the original image.

*I did try once at full resolution, and the file size ended up over 200 GB, but that export also stopped when writing out the 4th pyramid level.

Writing out one ome.tif with 3 channels, no alignment, at full resolution also resulted in the script stopping at resolution 4, though this time that was out of 5 total, so the problem is not associated with the “last” level but with the fourth level specifically.

Script output:
INFO: 6264-02
INFO: Channels: 2
INFO: Writing ********************.ndpi (Color deconvolution stains: Hematoxylin: 0.717 0.594 0.366 , DAB: 0.373 0.684 0.627 , Residual: 0.284 -0.727 0.625 ) to G:\Cluster analysis of islets\output\IF one image.ome.tif (series 1/1)
INFO: Writing resolution 1 of 5 (downsample=1.0, 27048 tiles)
INFO: Writing plane 1/2
INFO: Written 5% tiles
INFO: Written 10% tiles
INFO: Written 15% tiles
INFO: Written 20% tiles
INFO: Written 25% tiles
INFO: Written 30% tiles
INFO: Written 35% tiles
INFO: Written 40% tiles
INFO: Written 45% tiles
INFO: Written 50% tiles
INFO: Written 55% tiles
INFO: Written 60% tiles
INFO: Written 65% tiles
INFO: Written 70% tiles
INFO: Written 75% tiles
INFO: Written 80% tiles
INFO: Written 85% tiles
INFO: Written 90% tiles
INFO: Written 95% tiles
INFO: Written 100% tiles
INFO: Plane written in 884630 ms
INFO: Writing plane 2/2
INFO: Plane written in 1729244 ms
INFO: Writing resolution 2 of 5 (downsample=4.0, 1722 tiles)
INFO: Writing plane 1/2
INFO: Written 5% tiles
INFO: Written 10% tiles
INFO: Written 15% tiles
INFO: Written 20% tiles
INFO: Written 25% tiles
INFO: Written 30% tiles
INFO: Written 35% tiles
INFO: Written 40% tiles
INFO: Written 45% tiles
INFO: Written 50% tiles
INFO: Written 55% tiles
INFO: Written 60% tiles
INFO: Written 65% tiles
INFO: Written 70% tiles
INFO: Written 75% tiles
INFO: Written 80% tiles
INFO: Written 85% tiles
INFO: Written 90% tiles
INFO: Written 95% tiles
INFO: Written 100% tiles
INFO: Plane written in 380509 ms
INFO: Writing plane 2/2
INFO: Plane written in 757466 ms
INFO: Writing resolution 3 of 5 (downsample=16.0, 121 tiles)
INFO: Writing plane 1/2
INFO: Written 10% tiles
INFO: Written 20% tiles
INFO: Written 30% tiles
INFO: Written 40% tiles
INFO: Written 50% tiles
INFO: Written 60% tiles
INFO: Written 70% tiles
INFO: Written 80% tiles
INFO: Written 90% tiles
INFO: Written 99% tiles
INFO: Plane written in 262013 ms
INFO: Writing plane 2/2
INFO: Plane written in 523613 ms
INFO: Writing resolution 4 of 5 (downsample=64.0, 9 tiles)
INFO: Writing plane 1/2

Coming back to this, @smcardle pointed out that:
https://docs.openmicroscopy.org/ome-model/5.5.7/ome-tiff/file-structure.html#file-size
It might actually have been the basic OME-TIFF format’s size limits I was running into, though I am not sure why the file size got so large before stopping.
I do see some mentions of the BigTIFF file format in the code, but I am not sure how to request that writeImage use it, or even whether it is intended to be an option rather than something that is called automatically when needed. If it is supposed to happen automatically, is there any way to verify that it does?
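One way to verify this after the fact (a minimal sketch of my own, not part of the script): the TIFF version field in the file header is 42 for classic TIFF and 43 for BigTIFF, so reading the first four bytes of the output file is enough to tell which was written. The path below is just the output file from the log above.

```groovy
// Check whether an exported file is a BigTIFF: bytes 0-1 give the byte order
// ("II" or "MM") and bytes 2-3 the version (42 = classic TIFF, 43 = BigTIFF).
import java.nio.ByteBuffer
import java.nio.ByteOrder
import java.nio.file.Files
import java.nio.file.Paths

def path = Paths.get('G:/Cluster analysis of islets/output/IF one image.ome.tif') // adjust to your file
byte[] header = new byte[4]
Files.newInputStream(path).withCloseable { it.read(header) }
def order = header[0] == 0x49 ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN  // 0x49 = 'I' ("II" = little-endian)
int version = ByteBuffer.wrap(header, 2, 2).order(order).getShort() & 0xFFFF
println(version == 43 ? 'BigTIFF' : (version == 42 ? 'Classic TIFF' : "Unknown TIFF version: $version"))
```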

What are the width, height and bit-depth of the images that fail? And is any compression used?
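(If it helps, something along these lines in the QuPath script editor should report those values for the current image; the method names assume a recent 0.2.x-style API.)

```groovy
// Print basic properties of the current image from the QuPath script editor
// (assumes a QuPath 0.2.x-style ImageServer API)
def server = getCurrentServer()
println "Width: ${server.getWidth()} px"
println "Height: ${server.getHeight()} px"
println "Pixel type: ${server.getPixelType()}"
println "Channels: ${server.nChannels()}"
```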

Also, to avoid any confusion, could you post the script or link to the specific one you used?

I’d need to be able to replicate it, ideally for the smallest/simplest case that fails.

Well, that took quite a bit longer than I thought it would, but when I went back to your original script, everything just worked! That was when I started to pin down when it would break.

I will just include the results from using your exact script above, with the only changes being the file name(s) and the downsample (I ran a bunch of other tests on mine, and found it was failing in the same ways).
Pete’s script:

| Channels | Base images | Downsample | Result |
| --- | --- | --- | --- |
| 3 | 1 | 1x | good (the first test that worked, which surprised me) |
| 3 | 1 | 2x | fails |
| 3 | 1 | 3x | fails |
| 3 | 1 | 4x | good |
| 6 | 2 | 1x | good |
| 6 | 2 | 2x | fails |
| 6 | 2 | 3x | fails |
| 2 | 2 | 2x | good |
| 4 | 2 | 2x | fails |
| 4 | 2 | 1x | good |

Shown are the number of channels extracted, the number of base images, and the downsample, and then whether it fails on writing the 4th pyramid level.

It looks like a downsample of 1 is usually good (I had been avoiding that in my scripts since the file ends up well over 200 GB and eventually failed anyway), which is why the script suddenly worked when I tried yours; as written it only handled 2 images (I have added more lines for later tests), so I only used 2 images and went ahead with downsample 1 since it was a small data set.

Swapping to downsample 2 is when the problems started. It looks like, on large enough images, if:

  1. you use a downsample > 1, and
  2. the final image will have 4 or more “layers” in the pyramid (as counted in the output text),

then it will fail as it starts to write the 4th resolution. If your downsample is large enough that the final image will not have 4 resolutions, it writes out fine (a rough sketch of how the level count works out is below).
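To make that concrete, here is a rough sketch (my own, not from either script) of how I am estimating the number of levels: it mirrors the factor-of-4 spacing visible in the log (downsamples 1, 4, 16, 64, …) and assumes no level is written once the smallest dimension would drop below roughly 128 px; the 80000 × 85000 dimensions are just the test image mentioned below.

```groovy
// Rough estimate of how many pyramid levels an export will have, assuming
// levels are spaced a factor of 4 apart (as in the log: 1, 4, 16, 64, ...)
// and that no level is written once the smallest dimension falls below ~128 px.
int countLevels(int width, int height, double outputDownsample) {
    int levels = 1
    double d = outputDownsample * 4
    while (Math.min(width / d, height / d) >= 128) {
        levels++
        d *= 4
    }
    return levels
}

println countLevels(80000, 85000, 1)   // 5 levels for the ~80k x 85k test image at downsample 1
println countLevels(80000, 85000, 2)   // 5 levels at downsample 2 - still enough to reach the failing 4th resolution
```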

It just so happened that I wanted some downsample so as to avoid 200 GB files, but not so much downsample that I would be unable to interpret the data; 3-4x was a sweet spot for interpreting the whole slide images. A downsample of 1 eventually breaks at very high file sizes, but the multi-day exports have prevented me from tying up enough computers to really pin down when that happens.

No compression was used. The image for the tests listed above is ~80000 by 85000 pixels, though there is a lot of variability in size between images.
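As a back-of-the-envelope check (assuming 8-bit pixels, no compression, and the 2-channel case from the log above), the full-resolution level alone is already well past the ~4 GB offset limit of classic TIFF:

```groovy
// Rough uncompressed size of one full-resolution level
// (assumes 8-bit pixels; classic TIFF uses 32-bit offsets, so ~4 GB maximum)
long width = 80000
long height = 85000
long channels = 2        // as reported in the log above
long bytesPerPixel = 1   // assuming 8-bit pixels
double gb = width * height * channels * bytesPerPixel / 1e9
printf('Full-resolution pixel data: ~%.1f GB%n', gb)   // prints ~13.6 GB
```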

Is there any other information I can provide here to reproduce the problem? I can resend the GLOBUS link (it is for the MICSSS project) or host two or three of the images on a dummy GDrive account if you don’t have any brightfield images with a large enough pixel count to test.

Editing the above for a little more visual clarity.

It would help if you could give one specific failing example – ideally failing as quickly as possible, without room for misinterpretation. Preferably, it would use the original OS-2/OS-3 images to avoid extra downloads; alternatively, please link me to the specific failing image(s) with a script that contains the correct names / transforms / channels (or just give the script if I already have the images…).

Sorry, but I don’t know precisely what that means, and I’m really swamped at the minute – and since it takes quite a long time to write each image, I really need to be absolutely sure I’m investigating the right thing on the right images.


Alternatively/additionally, you can use the sampler in VisualVM: https://visualvm.github.io
This is how I’d be investigating it myself (and I’m running it now with OS-2/3). Currently I see:

The numbers for the highlighted line are increasing. This shows me that, although QuPath appears to be hanging on my system, it is actually calling the (sometimes horrendously slow) built-in Java method setRect().

Indeed, since writing that it has grudgingly moved on to plane 3/6.

You can also use VisualVM to do a thread dump if it really has stopped, which can report any issues like deadlock.
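If the script editor still lets you run a second script while the first one hangs, a quick programmatic alternative (just a sketch, not something I have needed here) is to ask the JVM directly whether any threads are deadlocked:

```groovy
// Ask the JVM for deadlocked threads (findDeadlockedThreads() returns null if none)
import java.lang.management.ManagementFactory

def threadBean = ManagementFactory.getThreadMXBean()
long[] deadlocked = threadBean.findDeadlockedThreads()
if (deadlocked == null)
    println 'No deadlocked threads detected'
else
    deadlocked.each { println threadBean.getThreadInfo(it) }
```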


Sounds good, I will set something up and message you with the GDrive link once I can get something hosted. OS-2/3 were too small to reach the 4th pyramid level, so they should work fine in most reasonable cases (I am not sure about 100 copies of the channels, but that would take a long time to test).

The script stops running at the 4th “resolution”, as indicated by the script no longer showing as (Running) and by my being able to once again interact with other parts of the interface. The tiff stops growing in size, but it also cannot be opened. I have not had a problem in cases where the script remains (Running).

I think I could replicate the error with OS-2/3 – curiously, VisualVM told me that QuPath was still valiantly trying to allocate a raster, even though it didn’t appear to be (and the script stopped running… which I suppose might be a time-out rather than actual completion).

In any case, I haven’t had time to test this to completion, but you can try replacing the block at the end with this more fine-grained version:

```groovy
// Write the image or open it in the viewer
if (pathOutput != null) {
    def downsamples = server.getPreferredDownsamples()
    if (outputDownsample > 1) {
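        // Start from the requested downsample and add further levels, a factor of 4 apart,
        // until the smallest remaining dimension would drop below 128 px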
        downsamples = [outputDownsample]
        double d = outputDownsample * 4
        while (Math.min(server.getWidth()/d, server.getHeight()/d) >= 128) {
            downsamples << d
            d *= 4
        }
        downsamples = downsamples as double[]
        server = ImageServers.pyramidalize(server, downsamples)
    }
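    // Build the writer explicitly so the tile size, BigTIFF and channel interleaving can be controlled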
    new OMEPyramidWriter.Builder(server)
        .parallelize()
        .downsamples(downsamples as double[])
        .tileSize(1024)
        .bigTiff()
        .channelsInterleaved()
        .build()
        .writePyramid(pathOutput)
//    writeImage(server, pathOutput)
} else {
    // Create the new image & add to the project
    def imageData = new ImageData<BufferedImage>(server)
    setChannels(imageData, channels as ImageChannel[])
    Platform.runLater {
        getCurrentViewer().setImageData(imageData)
    }
}
```

Basically, even if it doesn’t work, it will give you more things you can adjust when writing.

The original formulation had two issues that could be causing trouble in some cases:

  • It generated a pyramidal image server with just a single downsample value… you might need to give more values to make the writing reasonably efficient
  • It wrote channels as separate planes. That seems to be more ‘normal’ for OME-TIFF in general, but is pretty inefficient here – since the image might be too big to cache, meaning pixels need to be requested for every single channel. For that reason, I’ve switched to try interleaving channel pixels instead.

Thanks, I will take a look at this. I was trying to set up a demo project, but I have been running into connectivity issues between work blocking Google Drive and the internet being out at home due to the fires.

Maybe no need to share files – I’ve just written an OS-2/3 pyramid with 6 channels starting at 3x downsample (7.7 GB) in about 40 minutes on my ageing computer.

Previously, I’d reproduced the bug by starting it running this morning and then forgetting about it while it failed to finish over about 8 hours… after which it was still trying to create that raster.

I think adding more pyramid levels to the pyramidal server should fix the problem, and interleaving the channels should make it substantially faster.


This has worked for me so far, thanks for the script adjustment!
