Appending the bytes of every tile takes many days

Hi there,
I have a pyramidal OME-TIFF file (ome.tif) of size 247 GB with dimensions (1054300, 1107060). I want to read every 512×512 tile of that file and append its bytes to a list. I do get my output, but appending the bytes of all tiles takes many days. I would like to get the output faster.
Currently, I am running the code below:

from io import BytesIO
from PIL import Image

def torg(reader, img, width, height):
    XT = 2163  # number of tile columns
    YT = 2060  # number of tile rows
    t_SizeX_org = 512
    t_SizeY_org = 512
    ed = []  # JPEG bytes of every tile, in row-major order

    for y in range(YT):
        for x in range(XT):
            X_org = x * t_SizeX_org
            Y_org = y * t_SizeY_org
            # Clip the tile size at the right and bottom edges of the image
            SizeX_org = t_SizeX_org
            if (X_org + t_SizeX_org) >= width:
                SizeX_org = width - X_org
            SizeY_org = t_SizeY_org
            if (Y_org + t_SizeY_org) >= height:
                SizeY_org = height - Y_org

            # Read the tile, re-encode it as JPEG in memory, and keep the bytes
            b_org = reader.read(img, rescale=False, XYWH=(X_org, Y_org, SizeX_org, SizeY_org))
            img_org = Image.fromarray(b_org)
            instance_byte = BytesIO()
            img_org.save(instance_byte, "JPEG", quality=80)
            t_org = instance_byte.getvalue()
            ed.append(t_org)

    return ed

Is there any way to get the output faster?
Any suggestions are welcome.

It looks like you are decompressing and re-compressing terabytes of image data and appending the results to a Python list. Apart from the question of whether this is the right approach, you could try:

  1. In case reader.read(img, ...) opens and parses a file on each call, keep a file reader object open if possible.

  2. Replace Image.fromarray(b_org) ... BytesIO() ... byte.getvalue() with a single function call imagecodecs.jpeg8_encode(b_org, level=80).

  3. If you are using conda-forge, note that its Pillow and imagecodecs packages are built with the libjpeg library. The packages on PyPI are built with the faster libjpeg-turbo library.

  4. Assuming the decoder and encoder functions release the GIL, parallelize the outer loop (for y in range(YT)), for example using ThreadPoolExecutor. Alternatively consider dask.
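
Point 4 could look roughly like the sketch below. It is only an illustration: read_and_encode_tile is a hypothetical stand-in for the reader.read(...) plus JPEG-encode step, and it compresses dummy bytes with zlib so that the example runs without Bio-Formats. Threads only help if the real decoder and encoder release the GIL.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def tile_grid(width, height, tile=512):
    """Number of tile columns and rows, counting partial edge tiles."""
    return -(-width // tile), -(-height // tile)


def tile_box(x, y, width, height, tile=512):
    """(X, Y, W, H) of tile (x, y), clipped at the right and bottom edges."""
    x0, y0 = x * tile, y * tile
    return x0, y0, min(tile, width - x0), min(tile, height - y0)


def read_and_encode_tile(box):
    """Hypothetical stand-in for reader.read(...) followed by JPEG encoding."""
    x0, y0, w, h = box
    pixels = bytes(w * h)  # dummy 8-bit pixels instead of real image data
    return zlib.compress(pixels)  # zlib releases the GIL while compressing


def encode_row(y, n_cols, width, height, tile=512):
    """Encode every tile of row y, left to right."""
    return [read_and_encode_tile(tile_box(x, y, width, height, tile))
            for x in range(n_cols)]


def encode_image(width, height, tile=512, workers=4):
    """Encode all rows in parallel; map() yields rows in their original order."""
    n_cols, n_rows = tile_grid(width, height, tile)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = pool.map(lambda y: encode_row(y, n_cols, width, height, tile),
                        range(n_rows))
        return [t for row in rows for t in row]
```

Assuming the image is 1107060 pixels wide and 1054300 pixels high, tile_grid(1107060, 1054300) gives (2163, 2060), which matches XT and YT in the question. Because map() preserves the order of the rows, the resulting list has the same tile order as the nested loops.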


Hello everyone,
Thank you @cgohlke for your suggestion.
@dgault (Sorry to tag you here, but I would like your advice, because I believe the reader or the javabridge has something to do with what happens when each process calls the row function.)

Code:

from multiprocessing import Process
import math
import time
import multiprocessing
from io import BytesIO
from PIL import Image
import bioformats
import javabridge
import javabridge as jutil

num_cores = multiprocessing.cpu_count()


def num_of_tiles_to_read(width, height, tile_SizeX_org, tile_SizeY_org):
    # Determined the number of tiles to read and write original image
    nXTiles = int(math.floor(width / tile_SizeX_org))
    nYTiles = int(math.floor(height / tile_SizeY_org))
    if nXTiles * tile_SizeX_org != width:
        nXTiles = nXTiles + 1
    if nYTiles * tile_SizeY_org != height:
        nYTiles = nYTiles + 1

    return nXTiles, nYTiles


def tiles_coordinates(x, y, tile_SizeX_org, tile_SizeY_org, width, height):
    # The x and y coordinates for the current tile
    tileX_org = x * tile_SizeX_org
    tileY_org = y * tile_SizeY_org
    effTileSizeX_org = tile_SizeX_org
    if (tileX_org + tile_SizeX_org) >= width:
        effTileSizeX_org = width - tileX_org
    effTileSizeY_org = tile_SizeY_org
    if (tileY_org + tile_SizeY_org) >= height:
        effTileSizeY_org = height - tileY_org

    return tileX_org, tileY_org, effTileSizeX_org, effTileSizeY_org


def img_read(reader, zstack, tileX_org, tileY_org, effTileSizeX_org, effTileSizeY_org):
    # Read and write tiles
    bf_tiles_org = reader.read(zstack, rescale=False,
                               XYWH=(tileX_org, tileY_org, effTileSizeX_org, effTileSizeY_org))
    img_tiles_org = Image.fromarray(bf_tiles_org).convert("RGB")
    instance_byte_str_buffer_org = BytesIO()
    img_tiles_org.save(instance_byte_str_buffer_org, "JPEG", quality=80,
                       icc_profile=img_tiles_org.info.get('icc_profile'), progressive=False)
    t_org = instance_byte_str_buffer_org.getvalue()

    return t_org


def row(q, X_TILES, y, tile_SizeX_org, tile_SizeY_org, width, height, reader, zstack):
    encoded_framed_items_org = []
    for x in X_TILES:
        tileX_org, tileY_org, effTileSizeX_org, effTileSizeY_org = tiles_coordinates(x, y, tile_SizeX_org,
                                                                                     tile_SizeY_org, width,
                                                                                     height)
        t_org = img_read(reader, zstack, tileX_org, tileY_org, effTileSizeX_org, effTileSizeY_org)
        encoded_framed_items_org.append(t_org)
        print(encoded_framed_items_org)
    q.put(encoded_framed_items_org)


def tile_size_org(tile_SizeX_org, tile_SizeY_org, width, height, reader, zstack):
    nXTiles, nYTiles = num_of_tiles_to_read(width, height, tile_SizeX_org, tile_SizeY_org)
    print(nYTiles, nXTiles)
    X_TILES = range(nXTiles)
    Y_TILES = range(nYTiles)
    encoded_framed_items_org = []
    encoded_items_or = []

    q = multiprocessing.Queue()

    for y in Y_TILES:
        p = Process(target=row, args=(q, X_TILES, y, tile_SizeX_org, tile_SizeY_org, width, height, reader, zstack))
        encoded_items_or.append(p)
        p.start()

    for _ in Y_TILES:
        encoded_framed_items_org += q.get()

    for p in encoded_items_or:
        p.join()

    # Return the collected tile bytes, not the list of Process objects
    return encoded_framed_items_org


if __name__ == "__main__":
    javabridge.start_vm(class_path=bioformats.JARS, max_heap_size="9G")
    env = jutil.attach()

    input_path = "/home/yuvi/Desktop/DATASETS_&_OUTPUTS/Internal_dataset_output/Datasets/1_channel_sample/_earthworm.ome.tif"
    tile_SizeX_org, tile_SizeY_org = 512, 512

    reader = bioformats.ImageReader(input_path)
    reader.rdr.setId(input_path)

    metadata = javabridge.jdictionary_to_string_dictionary(reader.rdr.getMetadata())

    for series in range(reader.rdr.getSeriesCount()):
        series += 4
        reader.rdr.setSeries(series)
        width = reader.rdr.getSizeX()
        height = reader.rdr.getSizeY()
        for zstack in range(reader.rdr.getImageCount()):
            width = reader.rdr.getSizeX()
            height = reader.rdr.getSizeY()
            encoded_items_or = tile_size_org(tile_SizeX_org, tile_SizeY_org, width, height, reader, zstack)
            print(encoded_items_or)

    jutil.detach()
    javabridge.kill_vm()

I tried imagecodecs.jpeg8_encode(b_org, level=80) as you suggested, but I didn’t get the value as bytes. It shows me something else, or I may not have understood how to get the bytes.

At first, I tried ThreadPoolExecutor, but it ran on only a single processor. I want to run the above code on multiple processors, which is why I used multiprocessing.Queue.
When I run the above code, it sometimes prints the encoded_framed_items_org lists, and sometimes it just runs without doing anything.
I want to ask you: is the above code the right way to parallelize this?
It doesn’t give me any error, but it doesn’t generate any output either.
I really need help to solve this. :slightly_smiling_face:

I think I have a solution to what I think your actual problem is.

If you’re willing to get QuPath (open source and free), it’s capable of exporting OME-TIFFs in a variety of formats (including JPEG) and at a designated downsample factor or resolution.


Thanks for your comment, but I don’t have a problem reading an image or its tiles by calling reader.read or reader.Openbytes (when I don’t use the multiprocessing pipeline). My problem is with reading the image when I run the multiprocessing code above.

I’ve only taken a quick look at the code here, but my first impression is that appending writes to a single file is not likely to see much benefit from multithreading. If it is an option, then writing each tile to a separate file may let you speed up the writing (there are some multi-file OME-TIFF samples at https://docs.openmicroscopy.org/ome-model/6.6.0/ome-tiff/data.html#multifile-samples).
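
As a rough illustration of the one-file-per-tile idea (not of multi-file OME-TIFF itself): each worker writes its encoded tile to its own file, so no writer has to wait for another. The file-name pattern and the write_tile helper are hypothetical, and the tile bytes are placeholders rather than real JPEG data.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_tile(out_dir, x, y, data):
    """Write one encoded tile to its own file and return the path."""
    path = Path(out_dir) / f"tile_y{y:05d}_x{x:05d}.jpg"
    path.write_bytes(data)
    return path


def write_tiles(out_dir, tiles, workers=4):
    """tiles maps (x, y) -> encoded bytes; files are written concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(write_tile, out_dir, x, y, data)
                   for (x, y), data in tiles.items()]
        return sorted(f.result() for f in futures)


def demo():
    """Write four placeholder tiles to a temporary directory."""
    tiles = {(x, y): b"placeholder-%d-%d" % (x, y)
             for y in range(2) for x in range(2)}
    with tempfile.TemporaryDirectory() as out_dir:
        paths = write_tiles(out_dir, tiles)
        return [p.name for p in paths], paths[0].read_bytes()
```

Zero-padded row and column indices in the file name keep the files sorted in row-major order, so the tiles can be reassembled later without a separate index.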