Ongoing support for python-bioformats

I wanted to get a sense of community demand for, and reliance on, python-bioformats. I think there has been a disconnect between how we on our team have thought about it for the last few years (as one “support beam” making CellProfiler work for users with proprietary formats) and how some other folks in the community have wished it were (a broadly, enthusiastically supported package for the whole community). I feel like those who want more from it have ended up frustrated, so first of all I just wanted to apologize personally for that. We’ve tried to at least make our current stance clearer in our documentation. If I miss a post about it on here, please don’t hesitate to tag me in; I’m still getting up to speed with a lot of its functionality, so I can’t promise I’ll always be able to solve stuff right away, but we definitely want to have fewer frustrated folks in the community.

Second of all, I would love to get a sense of how broadly it is still being used. We’re slowly modifying CellProfiler’s Java dependencies in collaboration with the ImageJ folks, such that we’re starting to imagine a world without CellProfiler needing python-bioformats (this world would be, at minimum, several months off, but perhaps not several years off), but we don’t have a good sense of what our abandoning that project entirely would do. Obviously, in a perfect world, every useful-to-a-single-person tool would be supported indefinitely, but if we need to make decisions about prioritization of resources, it would be helpful to know who else is using it. Is it the case that everyone now uses pyimagej, so most folks don’t really care? Is there a die-hard core of users who love it (and whom we could maybe lean on for some assistance with keeping it up to date)? Are we somewhere in the middle? I want to emphasize that nothing is changing right now, but we do want to start thinking about plans for the future.

Open to the floor here!

Hi Beth, I hope the data treats you well.

I use python-bioformats and it works really well except that I can’t release the memory from the javabridge.

I didn’t know about pyimagej (or maybe got it mixed up with the other Python-Java bindings).

Would you recommend pyimagej? Does it have the same Java memory constraint as python-bioformats?

Thanks
Neil

I can’t speak to the java memory stuff on pyimagej, but maybe @hinerm can?

@nranthony I’m not that familiar with python-bioformats - do you mean the memory used to run the JVM itself? Are you looking for the ability to start/stop the JVM?

pyimagej uses jpype instead of javabridge, which still requires a running JVM. I think the main reason to use pyimagej would be if you wanted to use other ImageJ plugins beyond Bio-Formats.

Thanks @bcimini , @hinerm ,

I have a multiprocessed app with each process using its own bridge, but it retains the memory and eventually fills up and rips the virtual memory drive and gets stuck. I can’t stop and restart, and have tried creating subprocesses that each have their own bridge, but that’s super slow.

I’ll look to pyimagej and report back.
Thanks
Neil

@nranthony this might not be a problem specifically with python-bioformats - it could just be a memory leak from files not being closed. It might be worth checking that all your bioformats.IFormatReader instances are either close()'d manually or, better, used in with statements.

If you aren’t sure and haven’t already tried, a file leak detector or in-depth heap analysis could help narrow down the culprit.

Sorry if this is just a known issue in python-bioformats!
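
For reference, a minimal sketch of the two closing patterns mentioned above. It assumes only that the reader object exposes read() and close(); open_reader is a hypothetical stand-in for whatever call creates your reader, passed in so the pattern can be tried without a running JVM.

```python
from contextlib import closing


def read_with_manual_close(open_reader, path):
    """Open a reader, read one plane, and always close it via try/finally."""
    reader = open_reader(path)  # open_reader is a hypothetical factory
    try:
        return reader.read()
    finally:
        reader.close()  # runs even if read() raises


def read_with_closing(open_reader, path):
    """Same guarantee, using contextlib.closing to call close() on exit."""
    with closing(open_reader(path)) as reader:
        return reader.read()
```

Either form should keep a reader from outliving the code that uses it; the with form is harder to get wrong when there are several exit paths.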

pyimagej won’t allow that either. I think one JVM per process is a hard limit.

I use python-bioformats for reading LIF files into Python.
If there is any alternative, I would be happy to switch, but I have not been able to find any so far.

heh, it’s something I’d like to perform too

Hi @hinerm,
I have three calls to bioformats and don’t know if I should use ‘with’ for all of them or just the reader?

omexml = bioformats.get_omexml_metadata(fpath)
omemeta = bioformats.OMEXML(omexml)
reader = bioformats.formatreader.get_image_reader(0,fullpath)

Not sure if this differs from the bioformats.IFormatReader you note above.

Thanks for your time.
Neil
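
On the three calls: as far as I can tell, get_omexml_metadata hands back the OME-XML as a plain string and OMEXML only wraps that string on the Python side, so the reader is the one object that keeps a file open. That is an assumption worth checking against your installed version. As a rough illustration that the metadata is just text once it has been returned, here is a sketch that pulls the pixel dimensions out of such a string with the standard library alone; pixel_dimensions is a hypothetical helper, not part of python-bioformats.

```python
import xml.etree.ElementTree as ET


def pixel_dimensions(omexml):
    """Return the SizeC/SizeZ/SizeT/SizeX/SizeY values of each image in an OME-XML document."""
    if isinstance(omexml, str):
        omexml = omexml.encode("utf-8")
    root = ET.fromstring(omexml)
    sizes = []
    for element in root.iter():
        # Compare the local tag name so any OME schema version's namespace works.
        if element.tag.rsplit("}", 1)[-1] == "Pixels":
            sizes.append({
                key: int(element.attrib[key])
                for key in ("SizeC", "SizeZ", "SizeT", "SizeX", "SizeY")
            })
    return sizes
```

Nothing here touches Java, so there is nothing to close once the string is in hand.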

Actually, @hinerm, I think I fixed it.

I wanted to post the steps here for reference to others.

I had
reader = bioformats.formatreader.get_image_reader(0,fullpath)
which I changed to
with bioformats.formatreader.get_image_reader(0,fullpath) as reader:
This gave me an error of no attribute rdr and I was stuck.

But then on mouse over Spyder recommended I use:
with bioformats.ImageReader(fullpath) as reader:
which appears to be churning away a lot faster and without RAM ramping up 🙂

Thanks
Neil
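
For anyone landing here later, a sketch of the fix above applied to a batch of files. It assumes the JVM has already been started in the current process (for example with javabridge.start_vm(class_path=bioformats.JARS)) and that bioformats.ImageReader works as a context manager, as described in the post. read_planes and its reader_cls parameter are hypothetical, added so the loop can be tried with a stand-in class.

```python
def read_planes(paths, c=0, z=0, t=0, reader_cls=None):
    """Read one plane from each file, closing every reader before opening the next."""
    if reader_cls is None:
        # Imported lazily so the function can be exercised without a JVM.
        import bioformats
        reader_cls = bioformats.ImageReader

    planes = []
    for path in paths:
        # Leaving the with block closes the reader, even if read() raises.
        with reader_cls(path) as reader:
            planes.append(reader.read(c=c, z=z, t=t, rescale=False))
    return planes
```

In a multiprocess setup, each worker would presumably start its own JVM once at startup and call javabridge.kill_vm() once on exit, since the JVM cannot be restarted within the same process.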

Just wanted to chime in on the Python LIF reading.

Both aicsimageio and readlif are available.

Bias: I maintain aicsimageio and we depend on readlif but add a few features and standardize the API along with other supported formats.

I.E.

  • We can stitch tiles back together into one large dask array.
  • You can use xarray, dask, or numpy for pulling and retrieving data
  • We pull out key pieces of metadata to simple properties (img.channel_names, img.dims, etc.)
  • Pure Python, no javabridge, just pip!

A big extra note: We are right in the middle of releasing our next major version (4.0.0), so our documentation is for 4.0: AICSImageIO — AICSImageIO 4.0.0.dev4 documentation

4.0 dev releases are pretty stable at this point so don’t worry about that, we just haven’t released 4.0 because we are waiting on one last thing (CZI support) so feel free to give it a whirl with pip install --pre aicsimageio[lif].
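
A minimal sketch of what that looks like in practice, assuming the aicsimageio 4.x API (AICSImage, img.dims.order, img.shape, img.channel_names). summarize_image and its image_cls parameter are hypothetical, added only so the pattern can be tried without a LIF file on hand.

```python
def summarize_image(path, image_cls=None):
    """Collect a few of the metadata properties mentioned above into a dict."""
    if image_cls is None:
        # Imported lazily; needs `pip install --pre aicsimageio[lif]` for LIF files.
        from aicsimageio import AICSImage
        image_cls = AICSImage

    img = image_cls(path)
    return {
        "dims": img.dims.order,               # dimension order string
        "shape": tuple(img.shape),            # one size per dimension
        "channels": list(img.channel_names),  # channel labels from metadata
    }
```

Calling summarize_image("my_file.lif") (file name illustrative) should not need a JVM; pixel data can then be pulled separately as numpy, dask, or xarray.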

Hi @bcimini

I’ve been using python-bioformats for a few months to open up specific channels and z positions of OME-XML and TIFF files. It was a bit clunky to get started but it seems to be working great now.

I don’t have a lot invested in python-bioformats, so I would be happy to switch to pyimagej. Is it recommended that new Python projects that need Bio-Formats access it through pyimagej instead of python-bioformats?

Brian

@bnorthan We really don’t have any recommendations yet; we’re still very much in the “information gathering” stage. If and when we make a hypothetical future official decision that we won’t be supporting python-bioformats anymore, we would definitely make sure to define successor package(s) and provide transition guides. Thanks for your input!

Hi Beth,
I am using Bio-Formats with MATLAB quite extensively. As I have some experience with it, I am planning to implement reading routines using python-bioformats for a new project in Python. In this sense, it would be nice if it stayed alive.

Ilya
