Compile the bioformats Java code to a binary shared lib

Dear Bioformats community,

Just a thought.

As you know, integrating Bio-Formats into Python projects can be challenging
because of the annoying Java-Python impedance mismatch. Bindings exist
(python-bioformats, etc.) but unfortunately they are somewhat clunky.
And of course, a rewrite is out of the question.

There might be a radical solution, and I wanted to check with the
community here to see whether anyone has tried it or considered it, and what the
consensus and thoughts are about it:

Compile the bioformats Java code to a binary shared lib using either:

i) https://www.excelsiorjet.com/

ii) Better: use GraalVM from Oracle:
https://chrisseaton.com/truffleruby/tenthings/
(See section 8: 'Java code as a native library')

If we could compile Bio-Formats to a native library,
this would be incredible in terms of making integration
of Bio-Formats super easy for all sorts of languages (Python, Julia, etc.)
without having to ship a JVM or have a ‘good’ one already installed.
In fact, microscope manufacturers would have little excuse
not to start using it too. Speed of native code is another benefit.
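
To make the idea concrete, here is a rough sketch of what the Python side might look like with ctypes. The two isolate functions are the ones GraalVM native-image generates for shared libraries; the library name `libbioformats.so` and any Bio-Formats entry points are purely hypothetical, since no such build exists yet.

```python
import ctypes


def load_native_library(path):
    """Load a shared library built by GraalVM native-image (path is hypothetical)."""
    return ctypes.CDLL(path)  # raises OSError if the file cannot be loaded


def create_isolate(lib):
    """Create a GraalVM isolate and return its (isolate, thread) handles."""
    isolate = ctypes.c_void_p()
    thread = ctypes.c_void_p()
    # C signature: int graal_create_isolate(params, isolate**, thread**)
    status = lib.graal_create_isolate(
        None, ctypes.byref(isolate), ctypes.byref(thread)
    )
    if status != 0:
        raise RuntimeError(f"graal_create_isolate failed with status {status}")
    return isolate, thread


def tear_down_isolate(lib, thread):
    """Release the isolate that owns the given thread handle."""
    lib.graal_tear_down_isolate(thread)


if __name__ == "__main__":
    # "libbioformats.so" is an assumed name for a library nobody has built yet.
    try:
        lib = load_native_library("./libbioformats.so")
    except OSError as err:
        print(f"native library not available: {err}")
    else:
        isolate, thread = create_isolate(lib)
        # A real build would expose exported entry points here; each one
        # takes the isolate thread handle as its first argument.
        tear_down_isolate(lib, thread)
```

No JVM is started anywhere in this sketch, which is the whole appeal; the hard part is everything that has to happen on the Java side before such a library exists.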

What do you think?

Hi @royerloic,

There have been a number of attempts to bridge that particular Java/native gap over the years, and @ctrueden has an extensive write-up of many of the methods attempted at one time or another at https://loci.wisc.edu/software/interfacing-non-java-code . Additionally, back in 2013 Johannes Schindelin looked into a DLL-like solution with Avian (available in the now-inactive https://github.com/ReadyTalk/avian repository). His investigation pointed to a likely manageable set of changes that would have been needed to use Avian at that time:

https://lists.openmicroscopy.org.uk/pipermail/ome-devel/2013-October/002534.html

Of course, things have come along substantially since then, so in general, we’d love to see Bio-Formats made more readily available to languages outside the Java ecosystem.

From our side, it’s largely been a matter of having no one with time dedicated to achieving and then maintaining such a solution. If anyone is inclined to try out a similar solution, a good starting point would be to catalog any blockers to integration. Assuming there’s a list similar to Johannes’ with changes that are needed to Bio-Formats core, we’d be inclined to get those into the mainline as quickly as possible.

~Josh & the OME team

I agree that a good way forward here would be to try compiling Bio-Formats with Graal’s Substrate VM (i.e. native-image). There will undoubtedly be many problems. Someone would have to go through layer by layer and get the various supporting libraries of Bio-Formats to compile and work one by one.

The code cannot have any AWT dependency. There will also likely be issues with usage of other javax APIs, maybe javax.xml and related packages. The only way to know for sure is to try.
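
As a starting point for cataloging blockers, a small script could group the classes Bio-Formats depends on by suspect package. This is only a sketch: the prefix list reflects the packages mentioned above and is an assumption to be refined by actually trying the build, and the input is assumed to be a plain list of fully qualified class names (gathered, for instance, with the JDK's jdeps tool).

```python
from collections import defaultdict

# Package prefixes named in this thread as likely trouble for native-image.
# More specific prefixes come first so that they win over the generic "javax.".
SUSPECT_PREFIXES = ("java.awt.", "javax.xml.", "javax.")


def catalog_blockers(class_names, prefixes=SUSPECT_PREFIXES):
    """Group class names by the first suspect prefix they match."""
    blockers = defaultdict(list)
    for name in class_names:
        for prefix in prefixes:
            if name.startswith(prefix):
                blockers[prefix].append(name)
                break  # count each class once, under its most specific prefix
    return dict(blockers)


if __name__ == "__main__":
    referenced = [
        "java.awt.image.BufferedImage",
        "javax.xml.transform.Transformer",
        "java.lang.String",
    ]
    for prefix, names in catalog_blockers(referenced).items():
        print(prefix, len(names), names)
```

A match here is only a hint that something may need attention, not proof that it breaks the build.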

@royerloic If your team has resources to do this, I think it would be an extremely impactful endeavor. I am also willing to help hack on it personally at a future hackathon.

All of that said: a much easier way forward would be to call Bio-Formats via pyimagej, as I mentioned to you in person recently. Yes, you have to ship a JVM for that to work, but it should make the actual implementation part in Python pretty painless: just call Bio-Formats and then ij.py.from_java(myImage) as needed. Yes, that also makes a copy, but I think it would be much less effort to get working quickly.
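
A minimal sketch of that route, assuming pyimagej and a JVM are installed. The `sc.fiji:fiji` endpoint is my assumption for getting Bio-Formats onto the classpath, and whether a given file is actually read through Bio-Formats depends on the endpoint and the format; the function name is made up for illustration.

```python
def open_with_pyimagej(path, endpoint="sc.fiji:fiji"):
    """Open an image on the Java side and return it as a Python array-like."""
    import imagej  # imported lazily: requires pyimagej and a working JVM

    ij = imagej.init(endpoint)       # starts the JVM and builds the ImageJ2 gateway
    dataset = ij.io().open(path)     # the image is read on the Java side
    return ij.py.from_java(dataset)  # converted (copied) into a Python structure


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        image = open_with_pyimagej(sys.argv[1])
        print(type(image), getattr(image, "shape", None))
```

In a real application the gateway would be created once and reused, since starting the JVM is the expensive step.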

I’m doing some testing producing a native executable rather than a library, to avoid a lot of hand-written @CEntryPoint methods. I’m still working through what works and what doesn’t, but it largely seems to compile. It’s mainly a question of avoiding unsupported Java reflection methods and making sure all needed classes and resources are included. I’m not sure whether native libraries are getting included in the file, are found on the library path, or are still extracted from the classpath jar (I think it’s the latter).
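
For the "all needed classes and resources are included" part, native-image accepts JSON configuration files that list reflectively accessed classes and resource patterns. The sketch below generates both; the class name and resource pattern are illustrative placeholders rather than a vetted list for Bio-Formats, and the exact JSON layout may differ between GraalVM versions.

```python
import json


def reflect_config(class_names):
    """Build reflection config entries for classes looked up by name at run time."""
    return [
        {"name": name, "allDeclaredConstructors": True, "allPublicMethods": True}
        for name in class_names
    ]


def resource_config(patterns):
    """Build resource config entries; each pattern is a Java regex on resource paths."""
    return {"resources": {"includes": [{"pattern": p} for p in patterns]}}


if __name__ == "__main__":
    # Illustrative inputs only; a real list would come from trial and error.
    classes = ["loci.formats.in.TiffReader"]
    resources = [r".*readers\.txt"]

    with open("reflect-config.json", "w") as fh:
        json.dump(reflect_config(classes), fh, indent=2)
    with open("resource-config.json", "w") as fh:
        json.dump(resource_config(resources), fh, indent=2)
```

Configuration like this only covers reflective lookups and bundled resources. It would not help with code that defines new classes at run time.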

Avoiding unsupported sources of reflection isn’t the easiest either, as some of them are Java-internal. For example: javax.xml.transform.Transformer -> com.sun.org.apache.xalan.internal.xsltc.trax.TemplatesImpl -> unsupported method java.lang.ClassLoader.defineClass(String, byte[], int, int).

Cross-linking a twitter thread from @AnneCarpenter

cc: @Erich_Bremer @agoodman @sofroniewn @adamltyson
