Pyimagej OutOfMemoryError: Possibility to re-initialize?

I am using pyimagej to stitch 3x3 images via the Grid/Collection stitching plugin (and hopefully at some point via lower level APIs), see here for some of the details and the extract of the code that does work in pyimagej: Using imageJ functions like type conversion and setting pixel size via pyimagej

I need to run hundreds of those stitching jobs sequentially. I initialize like this at the moment:

ij = imagej.init('C:\\Program Files\\Fiji')

When I initialize at the beginning of the script and then run my jobs in a loop, I run into an OutOfMemoryError after 45 images have been processed (reproducibly at the same image), apparently during saving image 46:
[Screenshot of the OutOfMemoryError message]

It has nothing to do with the specific images in question. If I start at image 46 where it would normally run into the OutOfMemoryError, it passes it fine.

I have observed that memory usage increases continuously, so I call ij.window().clear() after saving each image. If I understand it correctly, this closes the ImageJ windows created by the Grid/Collection stitching plugin. But I also convert each stitched image to an ImageJ2 dataset with stitched_img_dataset = ij.py.to_dataset(stitched_img). Is there a way to close these as well? Even with the ImageJ1 images closed, memory usage grows from 8 GB of RAM at the start to 32 GB, stays around 32 GB for a while (for about 15 images), and then the script fails on image #46, without the task manager showing any further increase in RAM usage.

Some of that may be solved when using lower level APIs, but I think there are some things that just accumulate over each round of the run (not the actual full images, otherwise my memory usage would have exploded much faster, after 4-5 images or so). Therefore, I wondered if there is a way to “re-initialize” or to shut down the ImageJ gateway. All my sequential runs are completely independent, so restarting ImageJ each time wouldn’t matter from that perspective. But when I try to initialize every round by having ij = imagej.init('C:\\Program Files\\Fiji') in the loop, I get the following error message:
[Screenshot of the error message]

Is there a supported way to close a session, something like imagej.close()? Or to restart a session fresh? Or just clear all its memory?

I have a similar problem. I’m trying to open a large number of images, convert them to numpy arrays and then do some analysis on them.
I use the following code and open the files one by one.

import glob

files = glob.glob(some_files)  # some_files: a glob pattern for the input images
for file_path in files:
    img = ij.io().open(file_path)
    as_np = ij.py.from_java(img)  # convert to a numpy array
    some_function(as_np)

As the program runs, memory usage increases and the program eventually stalls. I believe the problem is that the images remain open in the ImageJ instance.

I agree with @jluethi that a way to close the VM and create a new one would solve the issue.
Have you found any answer to your question @jluethi?

@omidalam
I actually found the solution to this thanks to the help of @ctrueden

In short:
If you only need the image as a numpy array and it’s a standard format, consider loading it with something like OpenCV, which loads the image directly as a numpy array.

If it’s some special file format or you have other reasons to stay with pyimagej, then closing the Java images seems to do the job for me. Calling img.close() closes an ImagePlus in ImageJ, thus allowing the JVM to reclaim the memory.

So your loop would be:

files=glob.glob(some_files)
for file_path in files:
    img=ij.io().open(file_path)
    as_np=ij.py.from_java(img)
    img.close()
    some_function(as_np)

EDIT: Not sure whether close works with the way you open the images.

I don’t know how the internals of pyimagej work, but the img object should be cleaned up by the garbage collector, which should also trigger closing whatever Java object is wrapped.
It may be that the numpy view holds a reference to the Java object, but at least in your code snippet that reference should also be dropped on each iteration of the for loop.

In my experience, I need to explicitly close the img if it’s an ImagePlus type image. I’m actually not sure what data type ij.io().open(file_path) returns.
I’m using:

from jnius import autoclass
IJ = autoclass('ij.IJ')

img = IJ.openImage(str(img_path))

# Do something with image, e.g. send to numpy

img.close()

If ij.io().open(file_path) returns something other than an ImagePlus object, the close function will probably not work.
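
One way to make sure close() runs even when the conversion step raises is contextlib.closing from the standard library, which calls .close() on the wrapped object when the with block exits. The following is only a minimal sketch and not part of pyimagej: convert_and_close, open_image and to_numpy are hypothetical names, and FakeImage merely stands in for an ImagePlus so the pattern can be run without a JVM. It assumes the opened object has a close() method, which may not hold for whatever ij.io().open(file_path) returns.

```python
from contextlib import closing


def convert_and_close(open_image, to_numpy, path):
    """Open an image, convert it, and always call close() on it afterwards."""
    # closing() calls img.close() on exit, even if to_numpy raises.
    with closing(open_image(path)) as img:
        return to_numpy(img)


class FakeImage:
    """Stand-in for an ImagePlus (assumption: the real object has close())."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True


opened = []  # keeps the fake images so we can inspect them afterwards


def fake_open(path):
    img = FakeImage(path)
    opened.append(img)
    return img


result = convert_and_close(fake_open, lambda img: img.path.upper(), "tile_01.tif")
```

With pyimagej, open_image would presumably be something like IJ.openImage and to_numpy something like ij.py.from_java, as in the snippets in this thread.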

Hmmm, if the memory really isn’t freed I would consider that a bug. If there is no more reference to the python object it should eventually be destroyed by the python garbage collector.

Again, I’m not that familiar with the implementation details of pyimagej and the underlying scyjava and pyjnius stuff but whichever python object wraps the underlying Java object should make sure that the Java object is closed and freed when garbage collected by providing a suitable __del__ method.
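
To make the idea concrete, here is a minimal pure-Python sketch of a wrapper that releases its underlying resource when the wrapper itself is garbage collected. It says nothing about how pyjnius or pyimagej actually work; Resource and ClosingWrapper are hypothetical names used only for illustration. It also relies on CPython's reference counting, which destroys the wrapper as soon as the last reference goes away; other interpreters may run __del__ later.

```python
import gc


class Resource:
    """Hypothetical stand-in for a Java-side object that must be closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ClosingWrapper:
    """Python-side wrapper that closes the wrapped resource when destroyed."""

    def __init__(self, resource):
        self._resource = resource

    def __del__(self):
        # Runs when the wrapper object is garbage collected.
        self._resource.close()


resource = Resource()
wrapper = ClosingWrapper(resource)
closed_before = resource.closed  # the wrapper is still alive here

del wrapper   # drop the last reference to the wrapper
gc.collect()  # not required in CPython for this case, but harmless
closed_after = resource.closed
```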

I guess @ctrueden or @hanslovsky will know more about this.

That function can return anything, depending on what’s in the file path given. It is a general open command that is plugin-driven.

In practice, for images, the SCIFIO-based I/O plugin will handle it and return net.imagej.Dataset. You do not need to call close on these (nor is there even a close() method defined); when there are no more references, the Dataset will be garbage collected as needed.

However, if you do ij.ui().show(myImage), then a display is created and then a hard reference is kept until the display is closed. But for most PyImageJ-based use cases, it would be rare to do that.

It is more complicated than that. PyImageJ spins up a JVM, so there are two garbage collectors: the Python one on the Python side, and the Java one on the JVM side. If you open an image on the Java side (e.g. via ij.io().open(...), ij.scifio().datasetIO().open(...) or IJ.openImage(...)), then those objects are on the Java heap. Python knows nothing about them. The objects are wrapped as Python objects on the Python side by pyjnius when you make a function call that returns such a Java object, in which case the Python garbage collector will keep track of the Python-side object and garbage collect it as usual. But on the Java side, that object gets garbage collected when there are no more Java-side references to it.

Furthermore, when the Java garbage collector runs, it frees memory for use by the JVM again, but might not return that memory back to the operating system, depending on the JVM implementation. See “Why does ImageJ not release any memory back to the system?” sidebar on the following page for further details:

Again, I don’t think this is correct. Although I am also no expert on pyjnius, so I acknowledge it is possible I am wrong. What makes you believe this? Just intuition? Or documentation somewhere?

To respond to the original question:

There is no implemented way to start over cleanly. I do not know how easy it would be to shut down the JVM and then spin another one, which would be the most complete way of doing that.

However, @hadim added a feature recently making it possible to destroy your ImageJ context and then make another one, as follows:

ij.getContext().dispose()
ij = imagej.init('...', new_instance=True)

Where '...' is whatever endpoint you want to use, as normal.

But no new pyimagej release has been made since then. If you urgently need it, I can look at making another release.
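
Once a release with that feature is available, the two lines above could be wrapped in a loop that starts a fresh context every few jobs. This is only a sketch of the control flow: run_jobs, make_gateway, process and reinit_every are hypothetical names, and the Fake classes exist solely so the example runs without ImageJ. In real use, make_gateway would presumably be something like lambda: imagej.init('...', new_instance=True).

```python
def run_jobs(jobs, make_gateway, process, reinit_every=10):
    """Run jobs sequentially, replacing the gateway every `reinit_every` jobs."""
    ij = make_gateway()
    results = []
    for index, job in enumerate(jobs):
        if index > 0 and index % reinit_every == 0:
            # Dispose of the old context before creating a new one.
            ij.getContext().dispose()
            ij = make_gateway()
        results.append(process(ij, job))
    ij.getContext().dispose()  # clean up the last context as well
    return results


class FakeContext:
    """Stand-in for an ImageJ context; only records dispose() calls."""

    def __init__(self, log):
        self._log = log

    def dispose(self):
        self._log.append("dispose")


class FakeGateway:
    """Stand-in for the object returned by imagej.init(...)."""

    def __init__(self, log):
        self._context = FakeContext(log)

    def getContext(self):
        return self._context


log = []


def make_fake_gateway():
    log.append("init")
    return FakeGateway(log)


results = run_jobs(range(5), make_fake_gateway, lambda ij, job: job * 2, reinit_every=2)
```

Whether disposing the context frees enough memory for a given workload would have to be checked by watching the memory usage over a longer run.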

This is not possible with PyJNIus. Maybe a new JVM can be started by reloading jnius, but the previous JVM will probably persist or weird stuff might happen. This sounds like a lot of implicit side effects to me. Maybe explicit start and shutdown of the JVM could be requested as a feature for PyJNIus, but I can imagine that possible problems with Java classes and objects associated with a previously shut down JVM may be prohibitive.

Edit: maybe Python multiprocessing could be a solution, if you import jnius exclusively in subprocesses.
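
A minimal sketch of that idea, using the standard-library subprocess module rather than multiprocessing (a substitution made here only to keep the example simple): each job runs in a fresh Python interpreter, so any JVM started by that job goes away when the process exits. run_job_in_fresh_process and CHILD_SCRIPT are hypothetical names, and the child script below only echoes its argument; in real use it would import imagej (and with it jnius) and do the actual work.

```python
import subprocess
import sys

# Code executed in the child interpreter. Placeholder: a real job would
# import imagej here, so the JVM lives only inside the child process.
CHILD_SCRIPT = """
import sys
job = sys.argv[1]
print("processed " + job)
"""


def run_job_in_fresh_process(job):
    """Run one job in a new Python process and return its printed output."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT, job],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    return completed.stdout.strip()


outputs = [run_job_in_fresh_process(name) for name in ["tile_set_01", "tile_set_02"]]
```

The parent process never imports jnius in this arrangement, so it does not hold a JVM of its own; the cost is the start-up time of a new interpreter (and JVM) per job.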

I think I may not have expressed that clearly: I don’t know whether pyjnius has a mechanism like that. The point I was trying to make is that a mechanism that keeps the reference counts in Python and the JVM in sync (e.g. removes a reference in the JVM when the Python object is garbage collected) is a prerequisite if one wants to be able to write idiomatic Python code using pyimagej without running into memory leaks. I thought one way of achieving this could be implementing an appropriate __del__ in the Python wrapper objects, but maybe it is more complicated than that or not even possible.

If such a mechanism is already in place (again, I don’t know) it is also conceivable that the python garbage collector doesn’t trigger early enough.

Anyway, my take-away message is that it may be necessary to explicitly call .close() to avoid such issues.