Batch processing unusually slow?

Hello all,

Is it normal that running a script for a project is substantially slower than running it individually for each image?

I have a pretty basic script that I’ve compiled from forum posts and the workflow panel that, very briefly:

sets image type
sets pixel size
sets de-convolution vectors
selects the entire image
counts cells
smooths
applies an existing classifier
exports the rendered image
exports annotation data
exports detection data

When I run it on a newly imported image, it takes ~80 sec to run (images are between 600-800MB on average). When I run it for my project (9 images at the moment) it takes close to 10 minutes/image.

Is there something I can do to allow the script to run for project more efficiently? Either on the QuPath/scripting side or the actual computer hardware side.

Thanks in advance!

Posting my computer specs in case they are relevant:

iMac Pro (2017)
processor: 2.5 GHz 14-Core Intel Xeon W
memory: 128 GB 2666 MHz DDR4
graphics: Radeon Pro Vega 56 8 GB

and here is the actual script:

setImageType('BRIGHTFIELD_H_E');
setPixelSizeMicrons(0.511904,0.511904);
setColorDeconvolutionStains('{"Name" : "H&E modified", "Stain 1" : "Hematoxylin", "Values 1" : "0.84285 0.49591 0.20896 ", "Stain 2" : "Eosin", "Values 2" : "0.51405 0.67906 0.52405 ", "Background" : " 255 255 255 "}');
createSelectAllObject(true);
runPlugin('qupath.imagej.detect.cells.WatershedCellDetection', '{"detectionImageBrightfield": "Optical density sum",  "requestedPixelSizeMicrons": 0.0,  "backgroundRadiusMicrons": 8.0,  "medianRadiusMicrons": 0.0,  "sigmaMicrons": 1.2,  "minAreaMicrons": 5.0,  "maxAreaMicrons": 200.0,  "threshold": 0.1,  "maxBackground": 2.0,  "watershedPostProcess": true,  "cellExpansionMicrons": 5.0,  "includeNuclei": true,  "smoothBoundaries": true,  "makeMeasurements": true}');
runPlugin('qupath.lib.plugins.objects.SmoothFeaturesPlugin', '{"fwhmMicrons": 50.0,  "smoothWithinClasses": false}');
runObjectClassifier("h1n1 c;lassifer (tri) 2");

import qupath.imagej.tools.IJTools
import qupath.lib.gui.images.servers.RenderedImageServer
import qupath.lib.gui.viewer.overlays.HierarchyOverlay
import qupath.lib.regions.RegionRequest

import static qupath.lib.gui.scripting.QPEx.*


double downsample = 2

String path = buildFilePath(PROJECT_BASE_DIR, 'rendered', getProjectEntry().getImageName() + '.png')

def viewer = getCurrentViewer()
def imageData = getCurrentImageData()

def server = new RenderedImageServer.Builder(imageData)
    .downsamples(downsample)
    .layers(new HierarchyOverlay(viewer.getImageRegionStore(), viewer.getOverlayOptions(), imageData))
    .build()

if (path != null) {
    mkdirs(new File(path).getParent())
    writeImage(server, path)
    print 'Image exported to ' + path
} else {
    // No output path: show the rendered image in ImageJ instead
    IJTools.convertToImagePlus(server, RegionRequest.createInstance(server)).getImage().show()
}

// Export annotation measurements to a per-image text file
def name2 = getProjectEntry().getImageName() + '.txt'
def path2 = buildFilePath(PROJECT_BASE_DIR, 'annotation measurements')
mkdirs(path2)
path2 = buildFilePath(path2, name2)
saveAnnotationMeasurements(path2)
print 'Results exported to ' + path2

// Export detection measurements to a per-image text file
def name1 = getProjectEntry().getImageName() + '.txt'
def path1 = buildFilePath(PROJECT_BASE_DIR, 'detection measurements')
mkdirs(path1)
path1 = buildFilePath(path1, name1)
saveDetectionMeasurements(path1)
print 'Results exported to ' + path1

Edited the script formatting to make it code.

Any chance you are writing out large images over a network?

Thanks, I didn’t know how to do that :slight_smile:

If I understand your question correctly, no, the images are being written to a folder on the computer. Does that answer it?

Yep, that’s about all I have other than the usual “check the memory monitor and see if there are problems there.” If you are capping out on memory, maybe increase the max memory. Or there are a few places around talking about releasing memory during the script - would have to search for them though.

Not sure why anything else in your script would cause problems, though.
View → Show memory monitor

Come to think of it, this process can continue after the script finishes, IIRC. But I might be misremembering.
Maybe Pete has better ideas though.

I’d suggest trying it with shorter versions of the script / lines commented out to try to identify where the bottleneck is.
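As an alternative to commenting lines out, each step could be wrapped in a small timer so that a single run reports where the time goes. This is only a sketch: `timed` is a hypothetical helper written here for illustration, not a QuPath function, and the placeholder step would be swapped for one real line of the script (the cell detection call, the image export, and so on).

```groovy
// Hypothetical helper (not part of QuPath): runs one step and prints how long it took.
def timed(String label, Closure step) {
    long start = System.currentTimeMillis()
    def result = step()
    long elapsedMs = System.currentTimeMillis() - start
    println "${label}: ${elapsedMs} ms"
    return result
}

// Stand-in step so the sketch runs anywhere; in a QuPath script the closure
// would wrap one real step of the analysis instead of a sleep.
timed('placeholder step') { Thread.sleep(50) }
```

Comparing the printed times between Run and Run for project should show whether one particular step is responsible for the slowdown or whether everything is slower.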

If your detection measurement files are huge, writing those could be slow.

The image files may be 600-800 MB, but I’m guessing they are compressed and the actual images are much larger?

As I am running a batch right now, I can share a quick screen grab:

I have it set to allocate 40% memory to tile caching, and it’s set to the default 50% RAM allocation since I’m on Mac, so I guess I’m a little confused why the total memory says it’s around 24.

Can I afford to increase either of these memory allocations? Do you think they may help?
This computer was purchased for the lab specifically to run histopathology analysis through QuPath, so I don’t think it’s the end of the world to allocate it more RAM, but, I defer to your expertise as I am not very computer savvy…

Hi Pete, thanks for chiming in.

The detection measurement files are between 200-300MB, and I honestly don’t think I need them for what I’m doing currently - the command made it over through copy/pasting an old script. I’m currently interested in just the annotation measurements, so I can take it out next time around. That being said, I’m still confused about why there is a bottleneck when scaling up to a batch.

Also, the images are exported from the microscope as uncompressed TIFs, so I don’t think they are any larger than stated unless there’s some sort of compression going on I haven’t accounted for or noticed.

I think that 24 GB is what is currently allocated for use - note that the scale goes all the way up to 64, which would be 50% of your 128.
That means you aren’t pushing anything in terms of memory, and it is not likely the problem. Memory should only be allocated as needed - QuPath/Java does not hold the whole 64GB hostage.

Right, so is something going on such that it only uses 24 GB out of the available 64 GB?

I’m confused as well, but I’d definitely skip writing the detection measurements as a start. I’d also skip writing the rendered image if you don’t definitely need it; this is also likely to be slow if it is very large.

Not necessarily, the maximum is 64 GB but it won’t all be allocated at the very beginning if it isn’t needed.

(Java uses two flags for this at startup, -Xmx and -Xms… the first is for the maximum memory, the second for the initial memory. QuPath only sets the flag for the maximum memory, on the assumption that you won’t always necessarily want to use the full amount immediately when you start the application. It will grow as more memory is required.)
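To see the difference between the maximum and what has been allocated so far, here is a minimal sketch using only the standard Java `Runtime` API (nothing QuPath-specific; the variable names are just for illustration). It should run from any Groovy script, and the 'total' figure is presumably the one that corresponds to the total memory shown in the monitor.

```groovy
// Read the JVM's memory figures once so the three values are consistent.
def runtime = Runtime.getRuntime()
long maxBytes = runtime.maxMemory()      // ceiling set by -Xmx
long totalBytes = runtime.totalMemory()  // heap reserved by the JVM so far
long usedBytes = totalBytes - runtime.freeMemory()

double gb = 1024d * 1024d * 1024d
println String.format('Max (the -Xmx ceiling): %.1f GB', maxBytes / gb)
println String.format('Total allocated so far: %.1f GB', totalBytes / gb)
println String.format('Currently in use:       %.1f GB', usedBytes / gb)
```

If the total stays well below the max for the whole batch, raising the maximum memory is unlikely to make the script faster.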

Thanks Pete! Sounds like it isn’t necessary to change the overall allocated RAM then… How about the % allocated for tile caching? Would it be worth bumping up from 40%?

Unfortunately, the rendered image is something I need for the time being…

My goal with this script was to create a one-click analysis that other people in my lab can use for their own projects (switching out for their own detection parameters and classifier, of course), without the need to invest a ton of time into learning QuPath (which I’ve been more than happy to do :slight_smile: )… I guess it’s not actually that much more effort to just separate and run as several different scripts in series, but it would’ve been nice not to have to.

Happy to hear from you or anyone else if there’s any other thoughts on the matter!

Depending on what is necessary for a given project, it might help to place different parts within if statements, and then put well-described booleans at the start of the script.

If you do not want to split the script, that gives the current user control without much code editing.

exportRenderedPicture = true
//......
//.........
if(exportRenderedPicture){
 //export code
}

And maybe increase the downsample on the rendered image for the moment if speed is an issue.

That’s a good idea, thank you!

And, in case anyone is curious, this is what the memory wrt time was when the script finally finished:


Keep in mind, I started tracking memory about halfway through the batch, so this was about 51 minutes to run 4 or 5 images.

When your image is open, QuPath should tell you the uncompressed size under the ‘Image’ tab. I’d be interested to know the width/height; if these are large, PNG may not be the best export format (at least with downsample 2; you may benefit from increasing the downsample to export a lower-resolution image).

Things also tend to be a lot slower if the image isn’t ‘pyramidal’. If your image isn’t a pyramid from the start, QuPath lets you choose on import.

Also, just to clarify, I presume you’re running the script from the script editor with either Run or Run for project in the menu. The second option will also save the data file, which will take some time (but shouldn’t be ~10 minutes…)

My main advice is to

  • try repeating the script with parts removed (to find the bottleneck)
    • alternatively, use https://visualvm.github.io (more technically involved, but the CPU sampling option helps identify bottlenecks)
  • confirm that the same image is much faster to process using Run than it is to process using Run for project.

To check this, you can apply Run for project to a single image that isn’t currently open. For a fair comparison, you should restart QuPath in between and run your script as the first thing you do (so that one method isn’t advantaged by having cached part of the image already).

I’m not really able to investigate more without the results of the two tests suggested above, since any problem may well depend upon the specifics of your image and I don’t know enough about them to replicate the issue myself.

Thanks Pete, I’m going to try this today to see if I can find the bottleneck. If you’re interested, I’m going to attach one of the images. I checked the uncompressed size in QuPath (830.9MB), which is weirdly smaller than the image size when I “Get Info” on the original image in its folder (872.4MB). Not sure what to make of that. The dimensions are quite large as it’s a mosaic (~6.7mm x 11.4mm).
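For a rough idea of the pixel dimensions, the numbers above can be turned into a quick estimate. This is a back-of-the-envelope sketch that assumes the pixel size from the script (0.511904 µm) applies to this image and that it is stored as 8-bit RGB (3 bytes per pixel); neither is confirmed here, and the physical dimensions are approximate.

```groovy
// Assumptions (not confirmed in the thread): pixel size taken from the script,
// approximate physical size from the post above, 8-bit RGB = 3 bytes per pixel.
double pixelSizeMicrons = 0.511904
double widthMicrons = 6.7 * 1000
double heightMicrons = 11.4 * 1000
double downsample = 2

long widthPx = Math.round(widthMicrons / pixelSizeMicrons)
long heightPx = Math.round(heightMicrons / pixelSizeMicrons)
long fullBytes = widthPx * heightPx * 3L

// Size of the rendered image that the export step would write
long exportWidthPx = Math.round(widthPx / downsample)
long exportHeightPx = Math.round(heightPx / downsample)

println "Full resolution: ${widthPx} x ${heightPx} px"
println String.format('Uncompressed RGB: %.1f MB (10^6 bytes) or %.1f MiB (2^20 bytes)',
        fullBytes / 1e6d, fullBytes / (1024d * 1024d))
println "Rendered export at downsample ${downsample}: ${exportWidthPx} x ${exportHeightPx} px"
```

Under those assumptions the uncompressed size lands in the same ballpark as the file sizes quoted above, which fits with the TIFs being uncompressed. The gap between 830.9MB and 872.4MB might simply be a units difference (2^20 vs 10^6 bytes per MB), but that is a guess.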

Link to image:
https://pitt.box.com/s/5e4rab5swy656nun7rzbu8v45imajpxc

Thanks @davidnascari, the image was very helpful – I can confirm there’s something wrong on the QuPath side… although I haven’t been able to figure out exactly what.

As far as I can tell, it traces back to the fact that the image is not stored in a tiled, pyramidal way. If you open the image in QuPath and choose File → Export images… → OME TIFF to create a new pyramidal TIFF image (with/without compression) you should find performance for everything is better if you use the new image.

But after spending a couple of hours investigating, I’ve only succeeded in proving most of my hypotheses regarding the cause to be wrong. We’ll try to figure it out before the next release.
