Reading only a 2D ROI from a large image file

Hi @ctrueden, I recall you mentioned a while back that, perhaps BioFormats, can open a ROI of a large 2D image file, so that only the ROI field of view is loaded. What library call would that be?

While I am quite sure there is a built-in way to open only a 2D ROI from a (larger) 2D file, here is a quick jython script to do just that. The function is called “read2DImageROI” and can read subregions of an uncompressed 2D image (8-bit, 16-bit and floating-point only), as an ImgLib2 image.
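For readers without the script at hand, here is a minimal sketch of the idea in plain Python 3. It is not the Jython/ImgLib2 function mentioned above: it just seeks to the start of each ROI row and reads only w pixels. The name read_2d_roi and its parameters are illustrative assumptions, and it presumes an uncompressed, row-major, single-channel file with a fixed-size header.

```python
import array
import sys


def read_2d_roi(path, header_size, image_width, typecode, x, y, w, h,
                big_endian=False):
    """Read a w x h region from an uncompressed, row-major 2D image.

    typecode is an array-module code, e.g. "B" (8-bit), "H" (16-bit)
    or "f" (32-bit float). Returns a flat array of w * h pixels.
    """
    pixels = array.array(typecode)
    bytes_per_pixel = pixels.itemsize
    row_stride = image_width * bytes_per_pixel
    with open(path, "rb") as f:
        for row in range(y, y + h):
            # Jump straight to the first ROI pixel of this row.
            f.seek(header_size + row * row_stride + x * bytes_per_pixel)
            pixels.frombytes(f.read(w * bytes_per_pixel))
    # Swap bytes only if the file's byte order differs from the machine's.
    if big_endian != (sys.byteorder == "big"):
        pixels.byteswap()
    return pixels
```

Only h short reads are issued, so the cost scales with the ROI size rather than with the size of the whole image.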

You’ll need to find out the image dimensions and the header size, which can be done with ImageJ’s libraries, e.g. via FileInfo.
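If ImageJ is not at hand, the same numbers can be pulled out of a simple TIFF directly. The sketch below is an illustration under stated assumptions, not the FileInfo approach: it reads only the first IFD of a classic (non-BigTIFF) file and only reports tags stored as a single SHORT or LONG value, so the pixel offset is found only when the image is stored as one strip. read_tiff_layout is a hypothetical name.

```python
import struct

# Baseline TIFF tag numbers, as defined in the TIFF 6.0 specification.
TAGS = {256: "width", 257: "height", 258: "bits_per_sample",
        259: "compression", 273: "strip_offset"}


def read_tiff_layout(path):
    """Return a dict with the single-valued baseline tags of the first IFD."""
    with open(path, "rb") as f:
        order = f.read(2)
        if order == b"II":
            endian = "<"   # little-endian file
        elif order == b"MM":
            endian = ">"   # big-endian file
        else:
            raise ValueError("not a TIFF file")
        magic, ifd_offset = struct.unpack(endian + "HI", f.read(6))
        if magic != 42:
            raise ValueError("not a classic TIFF file")
        f.seek(ifd_offset)
        (n_entries,) = struct.unpack(endian + "H", f.read(2))
        layout = {}
        for _ in range(n_entries):
            entry = f.read(12)  # tag, type, count, value-or-offset
            tag, field_type, count = struct.unpack(endian + "HHI", entry[:8])
            if tag not in TAGS or count != 1:
                continue  # e.g. multi-strip files are not handled here
            if field_type == 3:    # SHORT, stored in the first 2 value bytes
                (value,) = struct.unpack(endian + "H", entry[8:10])
            elif field_type == 4:  # LONG
                (value,) = struct.unpack(endian + "I", entry[8:12])
            else:
                continue
            layout[TAGS[tag]] = value
    return layout
```

For an uncompressed file (compression == 1) with a single strip, strip_offset plays the role of the header size.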

Hi @albertcardona, and sorry for the delayed reply.

This gist by @ctrueden has an example of how to read a cropped region with Bio-Formats:

Thanks. I tried it out to read a 16-bit image and it threw an Exception:

imps = BF.openImagePlus(options)
	at loci.formats.tiff.TiffParser.getNextOffset(TiffParser.java:1310)
	at loci.formats.tiff.TiffParser.getFirstOffset(TiffParser.java:417)
	at loci.formats.tiff.TiffParser.getFirstIFD(TiffParser.java:371)
	at loci.formats.in.SEQReader.isThisType(SEQReader.java:73)
	at loci.formats.FormatReader.isThisType(FormatReader.java:613)
	at loci.formats.ImageReader.getReader(ImageReader.java:188)
	at loci.plugins.in.ImportProcess.createBaseReader(ImportProcess.java:620)
	at loci.plugins.in.ImportProcess.initializeReader(ImportProcess.java:485)
	at loci.plugins.in.ImportProcess.execute(ImportProcess.java:138)
	at loci.plugins.BF.openImagePlus(BF.java:92)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
java.lang.NoSuchMethodError: java.lang.NoSuchMethodError: loci.common.RandomAccessInputStream.readUnsignedInt()J

The code:

from loci.common import Region
from loci.plugins import BF
from loci.plugins.in import ImporterOptions

path = "/home/albert/Desktop/t2/189/section189-images/08apr22a_gb27932_D4b_12x12_1_00005gr_01755ex.mrc.tif"

# ROI
x, y, w, h = 396, 698, 552, 514

# Now with BioFormats
def bf_open2DImageROI():
  options = ImporterOptions()
  options.setColorMode(ImporterOptions.COLOR_MODE_GRAYSCALE)
  options.setCrop(True)
  options.setCropRegion(0, Region(x, y, w, h))
  options.setId(path)
  imps = BF.openImagePlus(options)
  imps[0].show()

bf_open2DImageROI()

@albertcardona I cannot reproduce. The error you’re seeing suggests dependency version skew. Is your Fiji fully up to date? Which update sites do you have enabled? Try with a clean Fiji.

Here is another way to open a 2D region, using the SCIFIO API (Bio-Formats will still be used as needed under the hood):

#@ DatasetIOService dio
#@ File imageFile
#@ long x
#@ long y
#@ long width
#@ long height
#@output Dataset dataset

from io.scif.config import SCIFIOConfig
from io.scif.img import ImageRegion, Range
from net.imagej.axis import Axes

config = SCIFIOConfig()

xRange = Range(x, x + width - 1)
yRange = Range(y, y + height - 1)
region = {Axes.X: xRange, Axes.Y: yRange}
config.imgOpenerSetRegion(ImageRegion(region))

dataset = dio.open(imageFile.getAbsolutePath(), config)

The Dataset object implements Img so can be passed to ImgLib2 routines as well.

Cross-posted as a gist here:

Thank you @ctrueden, updating Fiji fixed the error.

What I saw surprised me a bit though:

Via jython function:
min: 2.48 ms, max: 4.73 ms, mean: 3.00 ms

By calling BioFormats:
min: 35.46 ms, max: 767.70 ms, mean: 70.26 ms

It is surprising that BioFormats is ~20x slower than a jython function that even has a loop in one dimension (height), with jython loops being very slow.

(Perhaps some of the slowness is related to the System.out.println statements from BioFormats, which print:

Reading IFDs
Populating metadata
Checking comment style
Populating OME metadata

)
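For anyone wanting to repeat the comparison, numbers like the min/max/mean above can be collected with a small helper along these lines. This is a sketch in plain Python 3, not the code used for the measurements above; time_calls and n_iterations are made-up names.

```python
import time


def time_calls(fn, n_iterations=20):
    """Call fn repeatedly and return (min, max, mean) elapsed time in ms."""
    elapsed_ms = []
    for _ in range(n_iterations):
        t0 = time.perf_counter()
        fn()
        elapsed_ms.append((time.perf_counter() - t0) * 1000.0)
    return min(elapsed_ms), max(elapsed_ms), sum(elapsed_ms) / len(elapsed_ms)
```

Reporting min and max next to the mean helps, since a single slow call (for instance the very first one) can pull the mean up noticeably.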

Obviously, BioFormats is doing a lot more work, by parsing the entire TIFF header, whereas the jython function is skipping it altogether.

Seems like comparing apples to oranges? The Jython function does a raw byte block read out of a file, whereas the Bio-Formats code does all kinds of stuff including parsing metadata and then transforming it into OME-XML. There are ways to disable some of the things Bio-Formats is doing, but I doubt you can fully get it down to the 3-4ms time of a raw byte block read.

Thanks. I wonder if BioFormats is reading out the entire file, and only then cropping the region? Will have to read the source code.

I believe it does that as a fallback for certain formats where reading sub-planes is hard, but for regular TIFFs, no, it reads only the region requested:

For TIFF, Bio-Formats reads out only the requested subsection. The time in this case is almost certainly dominated by parsing the metadata beforehand.

If this is an operation you will be doing frequently and you wish to speed up the initialisation time, then it is possible to cache the state of the initialised reader by taking the reader (as in https://gist.github.com/ctrueden/6282856#file-bio-formats-py-L23) and wrapping it with:

reader = new Memoizer(reader, 0);

Thanks for the hint to use the Memoizer. From the docs I understood that the Memoizer will only work for a given file path, not for additional file paths with the same header characteristics (i.e. files with identical headers, therefore dimensions etc., but different data). Is there perhaps a way to do the latter?

I’m afraid not at the moment. Even with the dimensions and main metadata values being the same, there will always be some values (such as timestamps) which differ from file to file.

Thanks for the info. Our current use case consists of loading thousands of images of the same dimensions and type. Given that our data is rather constant in its properties, I’ll go with the direct byte parsing: the 20x speedup is worthwhile.