How to export tiles to use for batch processing

Hi All,

I am working on a project that uses tissue segmentation followed by the positive cell detection tools to classify areas of a tumor as either positive or negative for a specific IHC. Unfortunately, my computer cannot manage the vast number of annotations created, and it crashes before I am able to export the images and annotations.

My solution to this problem was to break the images in the project down into smaller 512x512 tiles and complete the pre-trained positive cell detections and then to export these smaller patches to be stitched back together when necessary.

The aim of the project is to create cell centroid data with X&Y coordinates to perform spatial analysis of clone sizes.

I have attempted to use the guides below, without success.


The current script I am running is this:

/**
 * Script to export image tiles (can be customized in various ways).
 */

// Get the current image (supports 'Run for project')
def imageData = getCurrentImageData()

// Define output path (here, relative to project)
def name = GeneralTools.getNameWithoutExtension(imageData.getServer().getMetadata().getName())
def pathOutput = buildFilePath(PROJECT_BASE_DIR, 'tiles', name)
mkdirs(pathOutput)

// Define output resolution in calibrated units (e.g. µm if available)
double requestedPixelSize = 1.0

// Convert output resolution to a downsample factor
double pixelSize = imageData.getServer().getPixelCalibration().getAveragedPixelSize()
double downsample = requestedPixelSize / pixelSize

// Create an exporter that requests corresponding tiles from the original & labelled image servers
new TileExporter(imageData)
    .downsample(downsample)   // Define export resolution
    .imageExtension('.jpg')   // Define file extension for original pixels (often .tif, .jpg, '.png' or '.ome.tif')
    .tileSize(512)            // Define size of each tile, in pixels
    .annotatedTilesOnly(false) // If true, only export tiles if there is a (classified) annotation present
    .overlap(64)              // Define overlap, in pixel units at the export resolution
    .writeTiles(pathOutput)   // Write tiles to the specified directory

print 'Done!'

This executes and produces the files I need. However, the output includes a large number of empty tiles.

My question is how do I edit this script to only create tiles of the annotated area produced by the tissue detection tool?

I thought using selectAnnotations() would work and have tried it in several places through the script to no avail.

Sorry for the long post.

I thought this was the solution but it did not work.

/**
 * Script to export image tiles (can be customized in various ways).
 */

// Get the current image (supports 'Run for project')
def imageData = getCurrentImageData()

// Define output path (here, relative to project)
def name = GeneralTools.getNameWithoutExtension(imageData.getServer().getMetadata().getName())
def pathOutput = buildFilePath(PROJECT_BASE_DIR, 'tiles', name)
mkdirs(pathOutput)

// Define output resolution in calibrated units (e.g. µm if available)
double requestedPixelSize = 1.0

// Convert output resolution to a downsample factor
double pixelSize = imageData.getServer().getPixelCalibration().getAveragedPixelSize()
double downsample = requestedPixelSize / pixelSize

// Create an exporter that requests corresponding tiles from the original & labelled image servers
selectAnnotations()
new TileExporter(imageData)
    .downsample(downsample)   // Define export resolution
    .imageExtension('.jpg')   // Define file extension for original pixels (often .tif, .jpg, '.png' or '.ome.tif')
    .tileSize(512)            // Define size of each tile, in pixels
    .annotatedTilesOnly(false) // If true, only export tiles if there is a (classified) annotation present
    .overlap(64)              // Define overlap, in pixel units at the export resolution
    .writeTiles(pathOutput)   // Write tiles to the specified directory

print 'Done!'

Hi @Chris_Ross,

Have you tried changing

.annotatedTilesOnly(false)

with

.annotatedTilesOnly(true)

?

Out of curiosity, are you using 0.2.0+? It can handle a lot more annotations than older versions.

Have you tried using detections instead of annotations?

If you have many annotations to run cell detection across, you could also select each annotation in turn, run cell detection on it, and then save the summary measurements to the annotation.
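
A minimal sketch of the one-annotation-at-a-time idea, for the QuPath script editor. It assumes the detection command is copied from the Workflow tab of an image where it was already run interactively; the `runPlugin` line is left as a commented placeholder because its parameters are specific to each project.

```groovy
// Sketch only: process annotations one by one instead of all at once.
def hierarchy = getCurrentHierarchy()
def annotations = getAnnotationObjects()

annotations.eachWithIndex { annotation, i ->
    // Make this annotation the only selected object
    hierarchy.getSelectionModel().setSelectedObject(annotation)

    // Placeholder: paste the runPlugin(...) line recorded in the Workflow tab here,
    // e.g. the one for 'qupath.imagej.detect.cells.PositiveCellDetection'
    // with the parameters used in your own project.

    print "Processed annotation ${i + 1} of ${annotations.size()}"
}
```

Running detection this way does not reduce the total number of cells held in memory at the end; it only limits how much is processed at any one time.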

In some cases the crashing is due to too much CPU and too little RAM. You might also consider lowering the number of CPU threads available in the Preferences.

One last thing, you have your requested pixel size set to 1, which is not the same as downsample set to 1, and the images will not be full resolution (unless your pixel size is, in fact, 1.0).

double requestedPixelSize = 1.0

You may want to set downsample to 1.0 directly instead (double downsample = 1.0).
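
To make the difference concrete, here is a small worked example in plain Groovy (no QuPath needed). The pixel size of 0.25 µm is an assumed value for illustration only; the real value comes from the image's pixel calibration.

```groovy
// Assumed pixel size for illustration; read the real one from your image
double pixelSize = 0.25            // µm per pixel at full resolution (hypothetical)
double requestedPixelSize = 1.0    // µm per pixel requested in the export script

// Same formula as in the export script
double downsample = requestedPixelSize / pixelSize

// A 512-pixel tile at this downsample spans this many full-resolution pixels per side
int tileSize = 512
double fullResPixelsPerTile = tileSize * downsample

println "downsample = ${downsample}"
println "full-resolution pixels per tile side = ${fullResPixelsPerTile}"
```

With these assumed numbers the export is downsampled 4x, so each tile holds a 2048 x 2048 pixel region of the original squeezed into 512 x 512 pixels. Setting downsample to 1.0 keeps every original pixel.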

Hi thanks for the advice.
The build version and info are

Version: 0.2.3
Build time: 2020-09-11, 12:59

I have allocated 6 of the 8GB of RAM available in my machine.
I will have a look at the CPU, it is a 6600k.

In terms of the downsampling I thought it was a ratio. So I will change that in the next one.

But do you know how I can run the tile script only on annotations?

Tried that. As I understand it, that option only tells the script whether to run or not; it does not restrict the export to the annotations.

You might need to set the classification for the annotations – note the comment to the right explaining the purpose of the option:

    .annotatedTilesOnly(false) // If true, only export tiles if there is a (classified) annotation present
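
A minimal sketch of that suggestion, for the QuPath script editor. It assumes every annotation in the image came from tissue detection, and the class name 'Tissue' is made up for illustration. The rest is the export script from the original post with the option switched to true.

```groovy
// Assumption: all annotations in this image are tissue regions.
// 'Tissue' is an illustrative class name, not one taken from this thread.
def tissueClass = getPathClass('Tissue')
getAnnotationObjects().each { it.setPathClass(tissueClass) }
fireHierarchyUpdate()

def imageData = getCurrentImageData()
def name = GeneralTools.getNameWithoutExtension(imageData.getServer().getMetadata().getName())
def pathOutput = buildFilePath(PROJECT_BASE_DIR, 'tiles', name)
mkdirs(pathOutput)

// Same resolution logic as the original script
double requestedPixelSize = 1.0
double pixelSize = imageData.getServer().getPixelCalibration().getAveragedPixelSize()
double downsample = requestedPixelSize / pixelSize

new TileExporter(imageData)
    .downsample(downsample)
    .imageExtension('.jpg')
    .tileSize(512)
    .annotatedTilesOnly(true)  // Skip tiles without a (classified) annotation
    .overlap(64)
    .writeTiles(pathOutput)

print 'Done!'
```

Tiles that only partly overlap an annotation would presumably still be exported in full, so some background will remain at the tissue edges.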

If you can give more details (e.g. exact error message, info from View → Show log), there might be other workarounds available.

Note that if memory is the issue, you can track this via https://qupath.readthedocs.io/en/latest/docs/reference/commands.html#show-memory-monitor

Also, if you have other applications open on your computer then QuPath may be unable to access the full 6 GB.

Yep, I am still not 100% clear on at what step it is crashing. The actual cell detection, the export, the creation of annotations? And why are there so many annotations/could this be done a different way?

It seems like it would be far cleaner to keep the whole project within the single image if possible.

Thanks for all the responses, sorry if I wasn’t clear.

Firstly, thank you Pete for the memory monitor; I didn’t know that existed. Memory seems to be the big issue. I am using the full 6 GB with over 2,000,000 detections, and I get this error: ERROR: Error running plugin: java.lang.OutOfMemoryError: Java heap space

Secondly, the main question of this post was to ask how I could edit the tile creation script to produce tiles of only the annotated area of a slide. The following script appears to work now, thanks to your suggestions.

/**
 * Script to export image tiles (can be customized in various ways).
 */

// Get the current image (supports 'Run for project')
def imageData = getCurrentImageData()

// Define output path (here, relative to project)
def name = GeneralTools.getNameWithoutExtension(imageData.getServer().getMetadata().getName())
def pathOutput = buildFilePath(PROJECT_BASE_DIR, 'tiles', name)
mkdirs(pathOutput)

// Define output resolution in calibrated units (e.g. µm if available)
double requestedPixelSize = 0.5

// Convert output resolution to a downsample factor
double pixelSize = imageData.getServer().getPixelCalibration().getAveragedPixelSize()
double downsample = requestedPixelSize / pixelSize

// Create an exporter that requests corresponding tiles from the original & labeled image servers
selectAnnotations()
new TileExporter(imageData)
    .downsample(downsample)   // Define export resolution
    .imageExtension('.jpg')   // Define file extension for original pixels (often .tif, .jpg, '.png' or '.ome.tif')
    .tileSize(512)            // Define size of each tile, in pixels
    .annotatedTilesOnly(false) // If true, only export tiles if there is a (classified) annotation present
    .overlap(64)              // Define overlap, in pixel units at the export resolution
    .writeTiles(pathOutput)   // Write tiles to the specified directory

print 'Done!'

Finally, there is the wider problem of my project, and whether there is a more streamlined way of working without tiling the images.

I have attached a couple of screen shots to show what I am working on.


For context, I am aiming to classify individual tumour cells as positive or negative for this IHC to allow me to perform a spatial analysis of clone sizes.

So currently, other than getting more RAM or decreasing the image sizes, I am not sure what to do next.

Thank you for all your help.

It’s usually fine if the memory fills up – anything that isn’t needed will be automatically cleared before an OutOfMemoryError happens. But it sounds like this is pushing things beyond its limits.

2,000,000 cells with measurements will require a lot of memory to store, but I think it should be (just about) manageable. However, during detection there are spikes in memory use required for image processing. Reducing the number of parallel threads should reduce these spikes – have you tried changing that in the preferences?

See this discussion for more details:

Personally, I’d turn to tiling only as a last resort. The size of the image itself is less important (assuming it’s pyramidal), since QuPath doesn’t have to hold all of it in memory at any one time. In fact, creating tiles can make things worse if they aren’t saved as efficient image pyramids… in addition to potentially causing a headache with any later spatial analysis.
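
To illustrate the spatial-analysis headache, here is a plain Groovy sketch of the bookkeeping needed to map a centroid measured inside a tile back to whole-slide coordinates. The function name and the example numbers are hypothetical; it assumes the tile's top-left corner (in full-resolution pixels) and the export downsample were recorded for each tile.

```groovy
// Convert a point in tile pixels to full-resolution slide pixels.
// tileX, tileY: top-left corner of the tile in full-resolution pixels (assumed known)
// downsample:   the downsample factor used when the tile was exported
double[] tileToSlide(double localX, double localY, double tileX, double tileY, double downsample) {
    return [tileX + localX * downsample, tileY + localY * downsample] as double[]
}

// Hypothetical example: a centroid at (100, 50) in a tile whose corner is at (2048, 4096)
def p = tileToSlide(100d, 50d, 2048d, 4096d, 4.0d)
println "slide x = ${p[0]}, slide y = ${p[1]}"
```

With overlapping tiles, a cell that falls in the overlap region may be detected in two tiles, so duplicates would also need to be resolved after converting coordinates.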

A compromise could be to create rectangular annotations for each quadrant of the image and then duplicate the image 4 times (right-click under the ‘Project’ tab). Then you can run cell detection in each quadrant separately in different images. This retains the benefits of tiling in terms of memory use, while also preserving all the coordinates.
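
The quadrant bounds can be worked out as in the sketch below, written in plain Groovy so it runs outside QuPath. The helper name and the image size are hypothetical; the commented lines at the end are an untested suggestion for turning each rectangle into an annotation inside QuPath.

```groovy
// Split an image of the given size into four quadrant rectangles.
// Odd widths or heights are handled by giving the extra pixel to the right/bottom quadrants.
List<Map> quadrants(int width, int height) {
    int halfW = width.intdiv(2)
    int halfH = height.intdiv(2)
    return [
        [x: 0,     y: 0,     w: halfW,         h: halfH],
        [x: halfW, y: 0,     w: width - halfW, h: halfH],
        [x: 0,     y: halfH, w: halfW,         h: height - halfH],
        [x: halfW, y: halfH, w: width - halfW, h: height - halfH]
    ]
}

def quads = quadrants(1001, 800)  // hypothetical image size in pixels
quads.each { println it }

// Inside QuPath (untested sketch; ImagePlane may need an import), for each q in quads:
//   def roi = ROIs.createRectangleROI(q.x, q.y, q.w, q.h, ImagePlane.getDefaultPlane())
//   addObject(PathObjects.createAnnotationObject(roi))
```

Because the rectangles are defined in the coordinates of the original image, any cells detected inside them keep their whole-slide positions.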

Thanks Pete,

Fortunately, the original images are pyramidal .ndpi files.
I am currently only using 4 threads, I will try running this with only two threads.
If that fails I will try your suggestion for turning the image into quadrants. Although I am just trying to understand the need to duplicate the image 4 times?

Thanks for all the help

I agree tiling is not optimal, and if I can find a better workaround that would be fantastic.

I was thinking that you’d process one quadrant per duplicate image – that way each image would only contain ~1/4 the number of cells. But this may well not be necessary, since memory requirements are much lower when QuPath is ‘resting’ (not detecting anything), even if a lot of cells are there.

You may also want to set a minimum size threshold when creating your annotation. Especially in the lower image (and the lower right), it looks like there are many small areas that maybe could/should be excluded. I do not think that is the source of your current issue, but it is likely to make QuPath run more smoothly.

I understand what you mean now Pete with the 4 quadrants.
In terms of the threshold, I have set the threshold as rather high because I feel it is more important I don’t lose areas of the tumor over removing stroma as I am using this for a research project.
Otherwise, is there a way of exporting the tiles into pyramidal file formats?
Thank you again

You can use .ome.tif as the extension, but I don’t know if that will be enough, since the TileExporter isn’t really intended for writing pyramids. It would certainly be possible, but it might need a longer custom script to export in exactly the format you want, so really a last resort thing. And I don’t really see any advantages of that over the quadrant approach, given that your images are pyramidal already.