Batch processing across multiple input folders

Hi,
I’m using Pete’s script below to merge image tiles (dozens per tissue sample). This works well, but I’m wondering how best to adapt it for batch processing. Running it as a project doesn’t seem like the way forward, since the tiles belonging to different tissue samples are stored in separate folders rather than all being part of the same project.

Would it be possible - from within QuPath - to loop over all folders within a directory, execute the script on the tiles within each folder and save a merged output file for each folder (where output file name = folder ID)?
Thanks for any suggestions!

/**
 * Convert TIFF fields of view to a pyramidal OME-TIFF.
 *
 * Locations are parsed from the baseline TIFF tags, therefore these need to be set.
 *
 * One application of this script is to combine spectrally-unmixed images.
 * Be sure to read the script and see where default settings could be changed, e.g.
 *   - Prompting the user to select files (or using the one currently open in the viewer)
 *   - Using lossy or lossless compression
 *
 * @author Pete Bankhead
 */

import qupath.lib.common.GeneralTools
import qupath.lib.images.servers.ImageServerProvider
import qupath.lib.images.servers.ImageServers
import qupath.lib.images.servers.SparseImageServer
import qupath.lib.images.writers.ome.OMEPyramidWriter
import qupath.lib.regions.ImageRegion

import javax.imageio.ImageIO
import javax.imageio.plugins.tiff.BaselineTIFFTagSet
import javax.imageio.plugins.tiff.TIFFDirectory
import java.awt.image.BufferedImage

import static qupath.lib.gui.scripting.QPEx.*

boolean promptForFiles = true

// DEFINE INPUT FILES ==========================================================

File dir
List<File> files
String baseName = 'Merged image'
if (promptForFiles) {
    def qupath = getQuPath()
    files = qupath.getDialogHelper().promptForMultipleFiles("Choose input files", null, "TIFF files", ".tif", ".tiff")
} else {
    // Try to get the URI of the current image that is open
    def currentFile = new File(getCurrentServer().getURIs()[0])
    dir = currentFile.getParentFile()
    // This naming scheme works for me...
    String name = currentFile.getName()
    int ind = name.indexOf("_[")
    if (ind < 0)
        ind = name.toLowerCase().lastIndexOf('.tif')
    if (ind >= 0)
        baseName = currentFile.getName().substring(0, ind)
    // Get all the non-OME TIFF files in the same directory
    files = dir.listFiles().findAll {
        return it.isFile() &&
                !it.getName().endsWith('.ome.tif') &&
                (baseName == null || it.getName().startsWith(baseName)) &&
                (it.getName().endsWith('.tiff') || it.getName().endsWith('.tif') || checkTIFF(it))
    }
}
if (!files) {
    print 'No TIFF files selected'
    return
}

// DEFINE OUTPUT FILE ==========================================================

File fileOutput
if (promptForFiles) {
    def qupath = getQuPath()
    fileOutput = qupath.getDialogHelper().promptToSaveFile("Output file", null, null, "OME-TIFF", ".ome.tif")
} else {
    // Ensure we have a unique output name
    fileOutput = new File(dir, baseName+'.ome.tif')
    int count = 1
    while (fileOutput.exists()) {
        fileOutput = new File(dir, baseName+'-'+count+'.ome.tif')
        count++ // Try the next suffix, otherwise the loop never ends
    }
}
if (fileOutput == null)
    return

// Parse image regions & create a sparse server =================================

print 'Parsing regions from ' + files.size() + ' files...'
def builder = new SparseImageServer.Builder()
files.parallelStream().forEach { f ->
    def region = parseRegion(f)
    if (region == null) {
        print 'WARN: Could not parse region for ' + f
        return
    }
    def serverBuilder = ImageServerProvider.getPreferredUriImageSupport(BufferedImage.class, f.toURI().toString()).getBuilders().get(0)
    builder.jsonRegion(region, 1.0, serverBuilder)
}
print 'Building server...'
def server = builder.build()
server = ImageServers.pyramidalize(server)

long startTime = System.currentTimeMillis()
String pathOutput = fileOutput.getAbsolutePath()
new OMEPyramidWriter.Builder(server)
    .downsamples(server.getPreferredDownsamples()) // Use pyramid levels calculated in the ImageServers.pyramidalize(server) method
    .tileSize(512)      // Requested tile size
    .channelsInterleaved()      // Because SparseImageServer returns all channels in a BufferedImage, it's more efficient to write them interleaved
    .parallelize()              // Attempt to parallelize requesting tiles (need to write sequentially)
    .losslessCompression()      // Use lossless compression (often best for fluorescence, but lossy compression may be ok for brightfield)
    .build()
    .writePyramid(pathOutput)
long endTime = System.currentTimeMillis()
print('Image written to ' + pathOutput + ' in ' + GeneralTools.formatNumber((endTime - startTime)/1000.0, 1) + ' s')
server.close()


static ImageRegion parseRegion(File file, int z = 0, int t = 0) {
    if (checkTIFF(file)) {
        try {
            return parseRegionFromTIFF(file, z, t)
        } catch (Exception e) {
            print e.getLocalizedMessage()
        }
    }
}

/**
 * Check for TIFF 'magic number'.
 * @param file
 * @return
 */
static boolean checkTIFF(File file) {
    file.withInputStream {
        def bytes = it.readNBytes(4)
        short byteOrder = toShort(bytes[0], bytes[1])
        int val
        if (byteOrder == 0x4949) {
            // Little-endian
            val = toShort(bytes[3], bytes[2])
        } else if (byteOrder == 0x4d4d) {
            val = toShort(bytes[2], bytes[3])
        } else
            return false
        return val == 42 || val == 43
    }
}

/**
 * Combine two bytes to create a short, in the given order
 * @param b1
 * @param b2
 * @return
 */
static short toShort(byte b1, byte b2) {
    return (b1 << 8) + (b2 << 0)
}

/**
 * Parse an ImageRegion from a TIFF image, using the metadata.
 * @param file image file
 * @param z index of z plane
 * @param t index of timepoint
 * @return
 */
static ImageRegion parseRegionFromTIFF(File file, int z = 0, int t = 0) {
    int x, y, width, height
    file.withInputStream {
        def reader = ImageIO.getImageReadersByFormatName("TIFF").next()
        reader.setInput(ImageIO.createImageInputStream(it))
        def metadata = reader.getImageMetadata(0)
        def tiffDir = TIFFDirectory.createFromMetadata(metadata)

        double xRes = getRational(tiffDir, BaselineTIFFTagSet.TAG_X_RESOLUTION)
        double yRes = getRational(tiffDir, BaselineTIFFTagSet.TAG_Y_RESOLUTION)

        double xPos = getRational(tiffDir, BaselineTIFFTagSet.TAG_X_POSITION)
        double yPos = getRational(tiffDir, BaselineTIFFTagSet.TAG_Y_POSITION)

        width = tiffDir.getTIFFField(BaselineTIFFTagSet.TAG_IMAGE_WIDTH).getAsLong(0) as int
        height = tiffDir.getTIFFField(BaselineTIFFTagSet.TAG_IMAGE_LENGTH).getAsLong(0) as int

        x = xRes * xPos * 0.9986616702 // incl. corr factor; was: x = Math.round(xRes * xPos) as int
        y = yRes * yPos * 0.9982142857 // incl. corr factor; was: y = Math.round(yRes * yPos) as int
    }
    return ImageRegion.createInstance(x, y, width, height, z, t)
}

/**
 * Helper for parsing rational from TIFF metadata.
 * @param tiffDir
 * @param tag
 * @return
 */
static double getRational(TIFFDirectory tiffDir, int tag) {
    long[] rational = tiffDir.getTIFFField(tag).getAsRational(0);
    return rational[0] / (double)rational[1];
}

You could do it all in Groovy, wrapping all the top bit of the code in a loop. It doesn’t require anything special in QuPath.
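
As a rough illustration of the loop idea, here is a minimal sketch in plain Groovy that lists the immediate subfolders of a root directory, collects the non-OME TIFF tiles in each, and names the output after the folder. The names listTiles, planBatch and mergeTiles are hypothetical, as is the choice to write <folder>.ome.tif inside each folder; the parse/build/write part of the script would still need to be wrapped in a function to be called once per folder.

```groovy
// Sketch only: plan one merge job per subfolder. No QuPath calls are used here.

/** List the non-OME TIFF files directly inside a folder, sorted by name. */
List<File> listTiles(File folder) {
    def found = folder.listFiles()?.findAll { f ->
        String name = f.getName().toLowerCase()
        f.isFile() &&
                (name.endsWith('.tif') || name.endsWith('.tiff')) &&
                !(name.endsWith('.ome.tif') || name.endsWith('.ome.tiff'))
    } ?: []
    return found.sort { it.getName() }
}

/**
 * Build one job per subfolder: the tiles to merge and the output file.
 * Assumption: the output is written inside the folder as <folder name>.ome.tif
 */
List<Map> planBatch(File rootDir) {
    def jobs = []
    def folders = (rootDir.listFiles()?.findAll { it.isDirectory() } ?: []).sort { it.getName() }
    for (File folder in folders) {
        def tiles = listTiles(folder)
        if (tiles.isEmpty())
            continue // Skip folders with nothing to merge
        jobs << [tiles: tiles, output: new File(folder, folder.getName() + '.ome.tif')]
    }
    return jobs
}

// Usage (hypothetical path; mergeTiles stands in for the parse/build/write steps of the script)
// planBatch(new File('/path/to/samples')).each { job ->
//     mergeTiles(job.tiles, job.output)
// }
```

Planning the jobs separately from the merging makes it easy to print and check the file lists before anything is written.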

However, as I foggily remember my own code, there might be an easier way.

If you create a project that contains one image from each of your directories, you could then try Run for project. You should set promptForFiles = false and then QuPath will try to automatically determine the files to merge from within the same directory as the ‘current’ image.

There’s a bit of logic to try to identify only the right files to merge within that directory (so it doesn’t fail by trying to add some other file)… depending upon your naming scheme, this might need to be adjusted.

Thanks Pete. So when I run this as a project with promptForFiles = false on a few directories (dir1: images 1-1…1-4; dir2: images 2-1…2-4 etc.), giving it only the first file for each directory, it creates a .ome output file in each directory that is identical to the input file…

This is the (somewhat involved) bit that filters out the files to exclude from the directory:

    // This naming scheme works for me...
    String name = currentFile.getName()
    int ind = name.indexOf("_[")
    if (ind < 0)
        ind = name.toLowerCase().lastIndexOf('.tif')
    if (ind >= 0)
        baseName = currentFile.getName().substring(0, ind)
    // Get all the non-OME TIFF files in the same directory
    files = dir.listFiles().findAll {
        return it.isFile() &&
                !it.getName().endsWith('.ome.tif') &&
                (baseName == null || it.getName().startsWith(baseName)) &&
                (it.getName().endsWith('.tiff') || it.getName().endsWith('.tif') || checkTIFF(it))
    }

When I wrote the script, I needed this to be quite restrictive because there were other files in the directory that should be ignored.
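
To see why a file like 1-1.tif ends up merged only with itself, here is a small sketch that reproduces the base-name logic on plain strings. The helper names deriveBaseName and matchingNames are made up for illustration, and the Sample_[...] names are hypothetical examples of the "_[" scheme the filter was written for.

```groovy
// Sketch only: the same string logic as the script, without touching the file system

/** Derive the base name the same way the script does for the 'current' file. */
String deriveBaseName(String fileName, String fallback = 'Merged image') {
    int ind = fileName.indexOf('_[')
    if (ind < 0)
        ind = fileName.toLowerCase().lastIndexOf('.tif')
    return ind >= 0 ? fileName.substring(0, ind) : fallback
}

/** Return the names that the startsWith(baseName) test would keep. */
List<String> matchingNames(String currentName, List<String> namesInDir) {
    String baseName = deriveBaseName(currentName)
    return namesInDir.findAll { it.startsWith(baseName) }
}

def names = ['1-1.tif', '1-2.tif', '1-3.tif', '1-4.tif']
println deriveBaseName('1-1.tif')        // prints 1-1
println matchingNames('1-1.tif', names)  // prints [1-1.tif]

// Hypothetical names following the "_[" scheme: both tiles share the base name 'Sample'
println matchingNames('Sample_[100,200].tif', ['Sample_[100,200].tif', 'Sample_[300,200].tif'])
```

With names like 1-1.tif there is no "_[" marker, so the whole name before the extension becomes the base name and the other tiles in the folder no longer match.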

You probably need to change this, but I have no idea how your files are named and so I don’t know how. I haven’t tested it, but this is an attempt at a much more tolerant file filter (i.e. everything that ends with .tif, but not .ome.tif).

files = dir.listFiles().findAll {it.isFile && it.getName().endsWith('.tif') && !it.getName().endsWith('.ome.tif')}

You could try replacing the chunk of code above with that one line instead.

Thanks - so with input files in the test dataset simply named 1-1.tif…1-4.tif, 2-1.tif…2-4.tif, etc. and replacing this chunk as suggested, I got a (v helpful!) error message:

ERROR: I cannot find 'isFile'!
ERROR: MissingPropertyException at line 42: No such property: isFile for class: java.io.File
Possible solutions: file

Replacing it.isFile with it.file, i.e.

files = dir.listFiles().findAll {it.file && it.getName().endsWith('.tif') && !it.getName().endsWith('.ome.tif')}

seems to do the job.

Thanks a lot for your help!

Ah, I was aiming for it.isFile(), with the parentheses. In Groovy, it.file is the property form of the same call, so the two should behave the same. Glad it works!
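
For anyone hitting the same error, here is a minimal sketch of the difference, using a temporary file purely as an example. Groovy maps the property file onto the boolean method isFile(), whereas isFile without parentheses is looked up as a property of that name, which java.io.File does not have.

```groovy
// Sketch only: Groovy property syntax vs. the explicit method call on java.io.File
File tmp = File.createTempFile('tile', '.tif')
tmp.deleteOnExit()

boolean viaMethod = tmp.isFile()   // explicit Java method call
boolean viaProperty = tmp.file     // Groovy property syntax, resolved to isFile()
assert viaMethod == viaProperty

// Without parentheses, Groovy looks for a property called 'isFile' and fails
boolean threw = false
try {
    def unused = tmp.isFile
} catch (MissingPropertyException e) {
    threw = true
}
assert threw
```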