Use shared memory to avoid copying data

Hi,

From a C++ program on Windows, I have to launch ImageJ (Fiji) in headless mode to execute a script. I have been looking for ways to improve performance and reduce run time, and for now I use RAM disk software. The data I have to send are already loaded in RAM when I need to send them. I copy them to the RAM disk, and then ImageJ copies them into its own memory (I guess). So there are lots of useless copies.

Are there solutions that avoid these copies, such as using shared memory between the two processes?

Thank you in advance

Hi @Loic_Balleydier,

Welcome to the forum.

Yes, it’s possible using the Java Native Interface (JNI), but it might be difficult.

For tips, I recommend looking into pyimagej, which might make your life easier if you write Python wrappers for your C++ code. Just a thought.

@hanslovsky, @ctrueden will know details.

Good luck,
John

Thank you for the quick answer, I will look at the links you mentioned.

It will depend on the data structures you are using: ImageJ1 data structures (ImagePlus, ImageProcessor) do not support native memory, whereas ImageJ2/ImgLib2 data structures can read and write native memory. But even without shared memory, you would save a few copies and I/O.

Do you have a Python wrapper around your C++ code? Then it would be fairly easy to just call the imagey or pyimagej API to wrap your C++ data as an ImgLib2 data structure.

No, I have no Python wrapper today. I use the Qt framework to launch ImageJ directly in headless mode. I have never had the chance to use Python to wrap my C++ code until now.

So if I understood correctly, the shared memory would be between the C++ program and the Python wrapper. The Python wrapper would build an ImageJ2/ImgLib2 structure around my data and use functions from pyimagej or imagey to execute the ImageJ script.
Is that right?

Thanks for your answers.

Let me explain in a little more detail:

Java data structures live in the JVM, and its memory layer is an abstraction that is different from native memory. As a consequence, Java programs cannot directly read from or write into native memory, in contrast to programs in other languages such as C, C++, or Python. There are workarounds, e.g. through the Java Native Interface or through sun.misc.Unsafe. None of these options is trivial and all require careful consideration, but it is definitely possible. In principle, you will need three major components:

  1. A way to start a Java Virtual Machine from within a native process (you will need JNI for that)
  2. Make Java methods and classes available in the native programming language (JNI)
  3. Add support for reading native memory addresses in your Java data structures (Unsafe).

PyJNIus takes care of (1) and (2) for Python through the JNI with Cython. To address (3), I created

  • imglib2-unsafe to expose native memory to ImgLib2 data structures
  • imglib2-imglyb for data structures that can understand numpy metadata, such as strides or offsets into native memory, and
  • imglyb, the Python library that wraps numpy.ndarray into ImgLib2 data structures with shared memory access.

In order to have shared memory between your C++ code and ImgLib2 data structures, there are two options:

  1. Write your own wrapper for (1) and (2) from above and then use imglib2-unsafe to wrap around your C++ native memory
  2. Write a shared memory Python/numpy.ndarray wrapper for your C++ data structures and then use imglyb to handle shared memory between Python and ImgLib2. To fully use the capabilities of imglyb, your C++ data structures need to be wrapped as numpy.ndarray.

From my perspective, option (2) seems a lot easier, and I think it’s generally a good idea to expose C++ code and data structures to Python, although I have not written a Python wrapper in a long time. If you are looking for a good starting point, I would recommend pybind11 to create the Python wrapper.
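To make option (2) more concrete, here is a minimal sketch of what "wrapping native memory as numpy.ndarray without copying" means. It assumes a ctypes buffer as a stand-in for the pixel memory owned by the C++ application; the names native_buffer and wrap_native are hypothetical, and a real wrapper (for example one written with pybind11) would hand numpy the C++ pointer instead.

```python
import ctypes

import numpy as np

HEIGHT, WIDTH = 3, 4

# Assumption: this ctypes array stands in for a 16-bit pixel buffer
# that the C++ application already holds in RAM.
native_buffer = (ctypes.c_uint16 * (HEIGHT * WIDTH))()


def wrap_native(buffer, height, width):
    """Return a numpy view on `buffer`; no pixel data is copied."""
    flat = np.frombuffer(buffer, dtype=np.uint16)
    return flat.reshape(height, width)


image = wrap_native(native_buffer, HEIGHT, WIDTH)

# Writes through the numpy view land in the native buffer ...
image[1, 2] = 500
# ... and writes to the native buffer are visible through the view.
native_buffer[0] = 7

# Both objects point at the same address, so the memory is shared.
same_address = image.ctypes.data == ctypes.addressof(native_buffer)
```

An array obtained this way is the kind of object that imglyb is described above as wrapping into ImgLib2 data structures; that second step needs a running JVM and is not shown here.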

Thank you @hanslovsky for all these details.
I will follow your advice and choose the second option.

Hi @hanslovsky,

Since my last message, I haven’t had much time to build a proof of concept of my tool with shared memory, but I did some research on all the components needed to set it up. However, I’m facing some problems related to my tool’s functionality that I can’t solve for now. I reread our conversation, and you said that even without shared memory I would save a few copies and I/O.

I will have to improve my first version without shared memory, and I’m wondering how to save copies and I/O in that case.

I already looked into using a virtual stack in ImageJ, but it is read-only and I need to modify images. I can’t use batch processing because I have to support advanced ImageJ scripts with many loops, which batch processing does not allow (I guess).
I also looked into sending images one by one, but I have the same problem, and the time needed would explode with this approach.

Do you have any ideas or details on how to save copies and I/O given these constraints?

Thanks!

PS: I talked a little about my tool, but a complete view of it may be helpful:
From a C++/Qt application, I have to send TIFF images to ImageJ to apply a script to them. Then I retrieve the modified images and reload them in the C++/Qt application. For now I use the RAM disk to speed up the transfer between the C++/Qt application and ImageJ. I launch ImageJ from the command line.

You will have to make one copy eventually (from native memory into Java arrays), but you won’t need to go through I/O.

The answer is: It depends. It is extremely hard to say without a minimal working example. That being said, I would probably

  1. wrap the numpy array as an ImgLib2 data structure with imglyb.to_imglib
  2. create an ArrayImg of appropriate size (this will fail if you are trying to load images > 2GB, you would need to use different data structures)
  3. copy the contents of the wrapped image into the ArrayImg.
  4. expose the ArrayImg as ImagePlus

You would need to copy the contents back into your native numpy image if you need to access the result in Python/C++, but there is no need to go through I/O.

Note that this is only necessary if the plugin you are using expects ImageJ1 data structures.
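The data flow of these steps (one copy in, processing on the copy, one copy back, no files) can be modelled in plain numpy. This is only a sketch under stated assumptions: a second numpy array stands in for the ArrayImg, the in-place function invert stands in for whatever the ImageJ script does, and the element limit is derived from Java arrays being indexed with a 32-bit int rather than from the "2GB" figure above. All function names are hypothetical, and no ImageJ API is called.

```python
import numpy as np

# Assumption: a Java array is indexed with a signed 32-bit int, so one
# array (and an image backed by one array) holds at most 2**31 - 1 elements.
JAVA_MAX_ARRAY_LENGTH = 2**31 - 1


def fits_in_single_java_array(shape):
    """Return True if an image of this shape fits into one Java array."""
    n_elements = 1
    for extent in shape:
        n_elements *= int(extent)
    return n_elements <= JAVA_MAX_ARRAY_LENGTH


def round_trip(native_image, process):
    """Copy in, run `process` in place on the copy, copy the result back."""
    if not fits_in_single_java_array(native_image.shape):
        raise ValueError("image too large for a single array-backed copy")
    # Stand-in for the ArrayImg: separate memory, same shape and type.
    java_side = np.empty_like(native_image)
    np.copyto(java_side, native_image)  # the one copy in
    process(java_side)                  # stand-in for the ImageJ1 plugin
    np.copyto(native_image, java_side)  # copy back for Python/C++ access
    return native_image


def invert(image):
    """Hypothetical in-place operation standing in for the user script."""
    np.subtract(np.iinfo(image.dtype).max, image, out=image)


pixels = np.arange(12, dtype=np.uint16).reshape(3, 4)
round_trip(pixels, invert)
```

Compared with the RAM disk approach, the two copies stay in memory and no TIFF encoding or decoding is involved.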

I’m a little confused: I didn’t understand that I would have to write Java code between Python and ImageJ. If I understood correctly, the last three points need to be in Java, right?

What is not necessary if I just use ImageJ2 data structures? All the points previously mentioned? Just the last three?

The problem is that I don’t know what the script will contain. The user of the C++/Qt application will choose a script (or create one), and I have to apply it to images that they also choose.
One thing I’m sure of is that I use Fiji (or at least ImageJ2) to apply the script.

Yes, but the API is available in Python through pyimagej/imglyb/pyjnius.

The last three, i.e. everything that needs a copy.

OK, it’s becoming clearer!

I will be away from my company for a long while (I’m an apprentice), but I will look into this problem when I am back.

Thank you for all the details!
