Bio-formats `read_region` equivalent

I have a WSI (whole-slide image) that I would like to tile.

I wrote a Python function based on the openslide-python read_region function; however, I would like to port it to Bio-Formats.

Is there any equivalent to this function in the Bio-Formats library? I found some documentation that I think does something similar, but I am not sure if/how this could be done using python-bioformats. Some guidance would be much appreciated :slight_smile:

The Python script would be very similar to the Java example for OverlappedTiledWriter, which is in the docs you linked above.

I had also put together the following as an ImageJ macro example for converting to tiles with multiple sub resolutions. If you ignore the sub resolution parts then it should provide an example of how to achieve tiled conversion in python: https://github.com/dgault/bio-formats-examples/blob/6cdb11e8c64566611b18f384b3a257dab5037e90/src/main/macros/jython/OverlappedTiledPyramidConversion.py

Thanks for the feedback.

Is it necessary that the images are in OME-TIFF? Most of my images are in Ventana bif and it would take some time to convert them all. Also, I work with Leica svs. It would be great if it were a "generic" tiling tool.

Ah, perhaps I misunderstood. The Ventana bif files are likely already tiled, so if you simply want to read the image in tiles, you can use the code below.

import math
from loci.formats import ImageReader, MetadataTools

file = "/path/to/inputFile.tiff"

# setup reader
reader = ImageReader()
omeMeta = MetadataTools.createOMEXMLMetadata()
reader.setMetadataStore(omeMeta)
reader.setId(file)

# set the tile sizes to be used (can be replaced with your own hardcoded values if needs be)
tileSizeX = reader.getOptimalTileWidth()
tileSizeY = reader.getOptimalTileHeight()
pixelType = reader.getPixelType()  # renamed to avoid shadowing the built-in type()


# read the tiles
for series in range(reader.getSeriesCount()):
	reader.setSeries(series)

	for image in range(reader.getImageCount()):
		width = reader.getSizeX()
		height = reader.getSizeY()

		# Determine the number of tiles needed to cover the image,
		# adding one extra tile per axis for a partial tile at the edge
		nXTiles = int(math.floor(width / tileSizeX))
		nYTiles = int(math.floor(height / tileSizeY))
		if nXTiles * tileSizeX != width:
			nXTiles = nXTiles + 1
		if nYTiles * tileSizeY != height:
			nYTiles = nYTiles + 1

		for y in range(nYTiles):
			for x in range(nXTiles):
				# The x and y coordinates of the top-left corner of the current tile
				tileX = x * tileSizeX
				tileY = y * tileSizeY
				# Clip the tile size at the right and bottom edges of the image
				effTileSizeX = tileSizeX
				if (tileX + tileSizeX) >= width:
					effTileSizeX = width - tileX
				effTileSizeY = tileSizeY
				if (tileY + tileSizeY) >= height:
					effTileSizeY = height - tileY
				# Read the tile from the input file as a byte array
				buf = reader.openBytes(image, tileX, tileY, effTileSizeX, effTileSizeY)
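
The tile arithmetic in the loop above can also be factored into a small generator that yields each tile rectangle, clipped at the right and bottom edges. This is only a sketch: `iter_tiles` is a hypothetical helper name, not part of Bio-Formats, and each rectangle it yields would be passed on to `reader.openBytes(image, x, y, w, h)` as in the code above.

```python
def iter_tiles(width, height, tile_w, tile_h):
    """Yield (x, y, w, h) rectangles that cover a width x height image.

    Tiles on the right and bottom edges are clipped so that they never
    extend past the image bounds.
    """
    for y in range(0, height, tile_h):
        h = min(tile_h, height - y)
        for x in range(0, width, tile_w):
            w = min(tile_w, width - x)
            yield (x, y, w, h)


# Example: a 1000 x 600 image cut into 512 x 512 tiles gives a 2 x 2 grid
tiles = list(iter_tiles(1000, 600, 512, 512))
```

Keeping the grid computation separate from the reader makes it easy to check the edge handling without opening a slide.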

Sorry, I think I am not explaining myself correctly.

The goal I want to accomplish is to tile a WSI in patches of a given size.

In the code I am using now, I call the read_region function from openslide-python and save the tile with its .save method.

I would like to do this using Bio-Formats, since it handles Ventana bif files much better than openslide does. I add here a code snippet of the function (adapted from here) to make things clearer:

def wsi2mosaic(image, size, overlap, level, drop_last=False, return_coords=False, only_list=False, check_tissue=True, prefix='', suffix='.png'):
    assert isinstance(image, openslide.OpenSlide), "input image should be an openslide wsi"
    assert level < len(image.level_dimensions), f"this image has only {len(image.level_dimensions)} levels"
    
    if type(size) is list:
        assert len(size) == 2, "size should be integer or [size_h, size_w]"
        s_h = size[0]
        s_w = size[1]
    else:
        assert isinstance(size, int), "size should be integer or [size_h, size_w]"
        s_h = size
        s_w = size
    
    if type(overlap) is list:
        assert len(overlap) == 2, "overlap should be integer or [overlap_h, overlap_w]"
        o_h = overlap[0]
        o_w = overlap[1]
    else:
        assert isinstance(overlap, int), "overlap should be integer or [overlap_h, overlap_w]"
        o_h = overlap
        o_w = overlap
    
    w_wsi, h_wsi = image.dimensions #! openslide image dimensions: WxH
    w_lvl, h_lvl = image.level_dimensions[level]
    
    box_coords_wsi = [0,0, h_wsi, w_wsi] #This way you avoid keeping only the biggest part of the tissue
    
    box_coords_wsi = [[box_coords_wsi[0], box_coords_wsi[1]],[box_coords_wsi[2], box_coords_wsi[3]]]
    box_coords_lvl = getScaledCoordinates(box_coords_wsi, [h_wsi,w_wsi], [h_lvl,w_lvl])
    h_box_lvl = box_coords_lvl[1][0] - box_coords_lvl[0][0]
    w_box_lvl = box_coords_lvl[1][1] - box_coords_lvl[0][1]
    assert h_box_lvl>s_h, f"tile height ({s_h}) should be less than box level height ({h_box_lvl})"
    assert w_box_lvl>s_w, f"tile width ({s_w}) should be less than box level width ({w_box_lvl})"
    
    x_ = np.arange(box_coords_lvl[0][0], box_coords_lvl[1][0]-s_h+1, s_h-o_h)
    y_ = np.arange(box_coords_lvl[0][1], box_coords_lvl[1][1]-s_w+1, s_w-o_w)
    
    if not drop_last:
        x_ = np.hstack([x_, [box_coords_lvl[1][0]-s_h]])
        y_ = np.hstack([y_, [box_coords_lvl[1][1]-s_w]])
    
    coords_ul = [(x,y) for x in x_ for y in y_]
    coords_br = [(x+s_h,y+s_w) for x in x_ for y in y_]
    coord_wsi_ul = getScaledCoordinates(coords_ul, [h_lvl, w_lvl], [h_wsi,w_wsi])
    coord_wsi_br = getScaledCoordinates(coords_br, [h_lvl, w_lvl], [h_wsi,w_wsi])

    coord_wsi = [(ul[0], ul[1], br[0], br[1]) for ul,br in zip(coord_wsi_ul, coord_wsi_br)]    
    
    if return_coords:
        return(coord_wsi)
    
    img_list = []
    f = open(f'{prefix}_coordinates.csv', 'w')
    f.write('coordinates\n')
    f.close()
        
    for COORD in coord_wsi:
        x_ul = COORD[0]
        y_ul = COORD[1]
        x_br = COORD[2]
        y_br = COORD[3]
        tile = image.read_region((y_ul, x_ul), level, (s_w, s_h))
        
        if check_tissue:
            tile_np = np.array(tile)
            if only_list:
                if hasEnoughTissue(tile_np):
                    f = open(f'{prefix}_coordinates.csv', 'a')
                    f.write('[{},{}]\n'.format(y_ul, x_ul))
            else:
                if hasEnoughTissue(tile_np):
                    tile.save(f'{prefix}_{level}_{x_ul}-{y_ul}-{x_br}-{y_br}_{suffix}')
                    f = open(f'{prefix}_coordinates.csv', 'a')
                    f.write('[{},{}]\n'.format(y_ul, x_ul))
                
        else:
            tile.save(f'{prefix}_{level}_{x_ul}-{y_ul}-{x_br}-{y_br}_{suffix}')
            f = open(f'{prefix}_coordinates.csv', 'a')
            f.write('[{},{}]\n'.format(y_ul, x_ul))

Ignoring the helper functions that are not pasted, the idea of the script is to tile an image at a given level. Ideally, I would like to replace the tile = image.read_region((y_ul, x_ul), level, (s_w, s_h)) call inside the for loop with something equivalent in python-bioformats.
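
For reference, the overlapped grid that the function builds with np.arange can be written per axis as in the sketch below. `tile_origins` is a hypothetical name used only for illustration. Unlike the np.hstack step in the function above, it appends the final edge-aligned origin only when the regular grid does not already end there, so no tile is produced twice.

```python
def tile_origins(length, size, overlap=0, drop_last=False):
    """Return the start offsets of tiles of `size` pixels along one axis.

    Consecutive tiles share `overlap` pixels. Unless `drop_last` is set,
    one extra tile aligned with the far edge is added when the regular
    grid does not reach it.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if size > length:
        raise ValueError("tile size exceeds axis length")
    step = size - overlap
    origins = list(range(0, length - size + 1, step))
    last = length - size
    if not drop_last and origins[-1] != last:
        origins.append(last)
    return origins


# Tiles of 4 pixels with 1 pixel of overlap along an 11-pixel axis
starts = tile_origins(11, 4, overlap=1)
```

Calling it once for rows and once for columns gives the same upper-left corners as the coords_ul list, without the duplicate last entry.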

That should be fairly straightforward. You will still need to set up the image reader at the start, but after that it should be a single call to openBytes to retrieve the tile:

# setup reader
reader = ImageReader()
omeMeta = MetadataTools.createOMEXMLMetadata()
reader.setMetadataStore(omeMeta)
reader.setId(file)

# rest of your code


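# select the resolution to read from; by default Bio-Formats exposes each
# pyramid level as a separate series (the mapping depends on the file)
reader.setSeries(level)

# read a specific region: openBytes(planeIndex, x, y, width, height),
# with x, y, width and height in pixels of the selected series
tile = reader.openBytes(0, x_ul, y_ul, s_w, s_h)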
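
One difference to keep in mind when porting: OpenSlide's read_region takes its location in level-0 coordinates and its size in pixels of the requested level, whereas openBytes takes x, y, width and height all in pixels of the currently selected series. A minimal sketch of the conversion follows. `level0_to_level` is a hypothetical helper, and the two sizes are assumed to come from reader.getSizeX() and reader.getSizeY() after selecting the full-resolution series and the target series respectively.

```python
def level0_to_level(x0, y0, level0_size, level_size):
    """Map a level-0 (x, y) position to pixel coordinates at another level.

    level0_size and level_size are (width, height) tuples. Integer
    arithmetic is used so the result can be passed directly as a
    pixel offset.
    """
    w0, h0 = level0_size
    w, h = level_size
    return (x0 * w) // w0, (y0 * h) // h0


# A point at (4096, 2048) in a 40000 x 30000 slide, seen at a 4x downsampled level
x, y = level0_to_level(4096, 2048, (40000, 30000), (10000, 7500))
```

Note also that the function above passes (y_ul, x_ul) to read_region, so in that code x_ul is a row offset and y_ul a column offset; openBytes expects the column (x) first.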

Sorry if the question is naive, but I am not able to run these commands.

In fact, the loci.formats imports do not work. What does loci mean? I am trying to perform the imports in a Jupyter notebook, in case that matters, using python-bioformats (version 1.5.2).

The loci.formats etc are simply the package names for the particular classes being used. If you are using python-bioformats then you will instead need (from the python-bioformats docs: https://pythonhosted.org/python-bioformats/):

import javabridge
import bioformats
javabridge.start_vm(class_path=bioformats.JARS)

# your program goes here

javabridge.kill_vm()

And then import the different classes from the bioformats library right? Something like:

import javabridge
import bioformats
javabridge.start_vm(class_path=bioformats.JARS)

# setup reader
reader = bioformats.ImageReader()
omeMeta = bioformats.metadatatools.createOMEXMLMetadata()
reader.setMetadataStore(omeMeta)
reader.setId(file)

If this is the case, I've got an error related to the ImageReader class, which needs a path:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-f2349f40edb9> in <module>()
      1 # setup reader
----> 2 reader = bioformats.ImageReader()
      3 omeMeta = bioformats.metadatatools.createOMEXMLMetadata()
      4 reader.setMetadataStore(omeMeta)
      5 reader.setId(file)

~/anaconda3/envs/dlhisto_BioFormats/lib/python3.6/site-packages/bioformats/formatreader.py in __init__(self, path, url, perform_init)
    571         self.path = path
    572         if path is None:
--> 573             if url.lower().startswith("omero:"):
    574                 while True:
    575                     #

AttributeError: 'NoneType' object has no attribute 'lower'

I am working with Aperio (svs) and Roche (bif) files, but I am not sure how to specify the format.

That error looks to be due to the missing path, so:

filename = "path/to/myFile.svs"
reader = bioformats.ImageReader(filename)
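
Putting the pieces together, a read_region-like call with python-bioformats might look like the sketch below. It assumes that the Python-level ImageReader.read method accepts the series, rescale and XYWH keyword arguments, and that the series index matching a given pyramid level has been checked for the file at hand. `read_tile` and `open_and_read` are hypothetical helper names, not part of the library.

```python
def read_tile(reader, series, x, y, w, h):
    """Read one region as an array, with x, y, w, h in pixels of `series`.

    Assumes `reader` is a python-bioformats ImageReader (or any object
    with a compatible read method). rescale=False keeps the raw pixel
    values instead of scaling them to the 0..1 range.
    """
    return reader.read(series=series, rescale=False, XYWH=(x, y, w, h))


def open_and_read(path, series, x, y, w, h):
    """Start the JVM, read a single tile from `path`, then shut down."""
    # Imported here so read_tile can be exercised without a JVM installed
    import javabridge
    import bioformats

    javabridge.start_vm(class_path=bioformats.JARS)
    try:
        reader = bioformats.ImageReader(path)
        try:
            return read_tile(reader, series, x, y, w, h)
        finally:
            reader.close()
    finally:
        # The JVM cannot be restarted in the same process once killed
        javabridge.kill_vm()
```

In a tiling loop, the JVM and the reader would be opened once and read_tile called for every coordinate, rather than calling open_and_read per tile. Bio-Formats detects the file format itself, so the same code should apply to both svs and bif files.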