Bypassing a JDK1.8 bug: "Comparison method violates its general contract!" - TimSort

Under certain conditions, SNT usage can trigger a bug in the JDK 1.8 currently bundled with Fiji.
It is a bug in java.util.TimSort, in this case triggered by a call from a com.jogamp.opengl class. In the context of SNT usage, the bug can be merely ‘annoying’ (e.g., Linux with jdk1.8.0_172) or catastrophic: with earlier JDK 1.8 versions, the JVM segfaults and Fiji crashes. It is that nasty.

Exception in thread "SciJava-4b2a01d4-Thread-11-AWTAnimator#00" com.jogamp.opengl.util.AnimatorBase$UncaughtAnimatorException: com.jogamp.opengl.GLException: Caught IllegalArgumentException: Comparison method violates its general contract! on thread SciJava-4b2a01d4-Thread-11-AWTAnimator#00
	at com.jogamp.opengl.util.AWTAnimatorImpl.display(AWTAnimatorImpl.java:92)
	at com.jogamp.opengl.util.AnimatorBase.display(AnimatorBase.java:452)
	at com.jogamp.opengl.util.Animator$MainLoop.run(Animator.java:204)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.jogamp.opengl.GLException: Caught IllegalArgumentException: Comparison method violates its general contract! on thread SciJava-4b2a01d4-Thread-11-AWTAnimator#00
	at com.jogamp.opengl.GLException.newGLException(GLException.java:76)
	at jogamp.opengl.GLDrawableHelper.invokeGLImpl(GLDrawableHelper.java:1327)
	at jogamp.opengl.GLDrawableHelper.invokeGL(GLDrawableHelper.java:1147)
	at com.jogamp.opengl.awt.GLCanvas$12.run(GLCanvas.java:1438)
	at com.jogamp.opengl.Threading.invoke(Threading.java:223)
	at com.jogamp.opengl.awt.GLCanvas.display(GLCanvas.java:505)
	at com.jogamp.opengl.util.AWTAnimatorImpl.display(AWTAnimatorImpl.java:81)
	... 3 more
Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract!
	at java.util.TimSort.mergeHi(TimSort.java:899)
	at java.util.TimSort.mergeAt(TimSort.java:516)
	at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
	at java.util.TimSort.sort(TimSort.java:254)
	at java.util.Arrays.sort(Arrays.java:1512)
	at java.util.ArrayList.sort(ArrayList.java:1462)
	at java.util.Collections.sort(Collections.java:175)

The issue is discussed, e.g., here (there it is triggered by Swing code on Java 1.7, but the consequences are the same). The solutions proposed in that SO thread work, namely:

  1. calling System.setProperty("java.util.Arrays.useLegacyMergeSort", "true")

  2. setting -Djava.util.Arrays.useLegacyMergeSort=true at startup

I don’t think it is possible to set #2 on our end, but we could certainly implement #1. There are two issues with that:

    • It will only work when users start SNT right after starting Fiji. If users run any Java code that sorts through java.util.Arrays before starting SNT, setting the property will do nothing.
    • Setting the property will disable TimSort, which is more efficient than legacy sorting. If we do disable it, won’t we degrade the performance of Fiji for SNT users?
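
For reference, option #1 could look roughly like the sketch below. It is only an illustration: the class and method names are made up, and it assumes the call happens before the JVM has sorted any object array, since (as far as I can tell from the JDK 8 sources) the property is read only once. It leaves the property alone if the user already set it, e.g., with -D at startup.

```java
// Hypothetical helper, not part of SNT or Fiji.
class LegacySortSwitch {

    // Property name as given in the SO thread.
    static final String KEY = "java.util.Arrays.useLegacyMergeSort";

    /**
     * Requests the legacy merge sort unless the property was already set
     * (e.g., on the command line). Returns the value that is in effect
     * afterwards.
     */
    static boolean requestLegacyMergeSort() {
        if (System.getProperty(KEY) == null) {
            // Assumption: this runs before any object array has been sorted.
            System.setProperty(KEY, "true");
        }
        return Boolean.parseBoolean(System.getProperty(KEY));
    }

    public static void main(String[] args) {
        System.out.println("Legacy merge sort requested: " + requestLegacyMergeSort());
    }
}
```

Calling something like this as early as possible (ideally from code that runs when Fiji starts rather than when SNT starts) would narrow the window described in the first bullet, but it would not close it.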

Presumably, this issue is moot on the latest JDK 1.8 release (241): I have yet to formally test it, but it appears to be fixed. Updating the JDK would therefore also be an option (albeit users would have to download a new Fiji.app). I know the future of the bundled JDK is discussed here, but we need an immediate/short-term solution.

I guess another option would be to nag users (e.g., at startup) to update their JDK, or at least to download a newer Fiji.app with jdk1.8.0_172, in which the bug is not as severe. Not sure if that is an effective workaround.

I am actually surprised that nobody came across this in the past. I guess everybody is developing with newer JDK versions? What do you guys advise?
(Pinging @imagejan, @hinerm, @ctrueden, but please do chime in if you have any insights)

Do you have an ‘easy’ and reproducible way to trigger this error?

You can trigger it by opening Reconstruction Viewer and loading a mesh after loading a reconstruction in a visible scene (it needs to be in this order). Also, I find that larger files trigger it more consistently. Here is a way to trigger it:

  1. Subscribe to the Neuroanatomy update site
  2. Open the Script Editor and run this Python snippet. But beware: it may bring your Fiji to a crawl, so please don’t run it while you have unsaved data open. On Ubuntu I get a freeze, but, e.g., @arshadic gets a segfault with the Zulu JDK:
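#@SNTService snt
from sc.fiji.snt.io import MouseLightLoader

# Open Reconstruction Viewer (it helps to have the viewer visible)
viewer = snt.newRecViewer(True)
viewer.show()
viewer.setSplitDendritesFromAxons(False)

# 1) Load a reconstruction first ...
loader = MouseLightLoader("AA0001")
viewer.add(loader.getTree())

# 2) ... and only then load a mesh: this ordering is what triggers the error
viewer.loadRefBrain("zebrafish")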

We have been looking at this for a while, and it is really low-level stuff. I don’t think there is anything we can do in our code to address it, but do let me know if you think otherwise!

@tferr I still get this bug using Zulu 1.8.0_265

But, do you know where in SNT the sort is being called? What type of object is being sorted? It sounds like this could be a legitimate error in the Comparable being sorted.
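
To illustrate what such a legitimate error can look like, here is a small, self-contained Java sketch. It is not taken from SNT or jzy3d, and all names in it are made up. A hand-written comparison of floating-point values treats NaN as equal to everything, which breaks transitivity, whereas Double.compare defines a total order. The helper simply brute-forces the contract over a handful of values.

```java
import java.util.Comparator;

// Hypothetical example class, for illustration only.
class ComparatorContractCheck {

    // Naive comparison: every test against NaN is false, so NaN "equals" everything.
    static final Comparator<Double> NAIVE = (a, b) -> a < b ? -1 : (a > b ? 1 : 0);

    // Double.compare imposes a total order (NaN is ordered after all other values).
    static final Comparator<Double> SAFE = Double::compare;

    /** Returns true if cmp is antisymmetric and transitive over the given values. */
    static <T> boolean isConsistent(Comparator<T> cmp, T[] values) {
        for (T a : values) {
            for (T b : values) {
                // sgn(compare(a, b)) must equal -sgn(compare(b, a))
                if (Integer.signum(cmp.compare(a, b)) != -Integer.signum(cmp.compare(b, a))) {
                    return false;
                }
                for (T c : values) {
                    // a <= b and b <= c must imply a <= c
                    if (cmp.compare(a, b) <= 0 && cmp.compare(b, c) <= 0 && cmp.compare(a, c) > 0) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Double[] values = {1.0, Double.NaN, 2.0};
        System.out.println("naive comparator consistent: " + isConsistent(NAIVE, values));
        System.out.println("safe comparator consistent:  " + isConsistent(SAFE, values));
    }
}
```

With the naive comparator, 2.0 "equals" NaN and NaN "equals" 1.0, yet 2.0 is greater than 1.0. That is the kind of inconsistency TimSort can detect and report with this exception.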

@hinerm, thanks for looking into it. That is a way newer JDK, so you may be right.
The problem comes from sorting “Drawables”, which are jzy3d primitives. It is this call that causes the issue. But the developers are aware of it, and even mention it in the code. I will follow up with them, thanks!
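
In case it is useful for that follow-up: one generic way to protect a sort whose key may change while the sort is running (e.g., a distance that depends on a moving camera, which is only my assumption about what could be happening here) is to compute every key once and sort on those snapshots. The sketch below is plain Java with hypothetical names; it is not jzy3d API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, for illustration only.
class SnapshotSort {

    /** Pairs an item with a sort key that was computed exactly once. */
    private static final class Keyed<T> {
        final T item;
        final double key;

        Keyed(T item, double key) {
            this.item = item;
            this.key = key;
        }
    }

    /** Returns a new list sorted by keyFn, evaluating keyFn once per item. */
    static <T> List<T> sortBySnapshot(List<T> items, ToDoubleFunction<T> keyFn) {
        List<Keyed<T>> keyed = new ArrayList<>(items.size());
        for (T item : items) {
            keyed.add(new Keyed<>(item, keyFn.applyAsDouble(item)));
        }
        // comparingDouble relies on Double.compare, so NaN keys are ordered consistently too.
        keyed.sort(Comparator.comparingDouble(k -> k.key));
        List<T> sorted = new ArrayList<>(items.size());
        for (Keyed<T> k : keyed) {
            sorted.add(k.item);
        }
        return sorted;
    }
}
```

The trade-off is one extra pass and a temporary list per sort, in exchange for a comparator that cannot give contradictory answers within a single sort.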