imglyb can now wrap arbitrary ndarray-like objects as ImgLib2 cached cell images

I added support to imglyb for wrapping arbitrary ndarray-like objects as cached ImgLib2 cell images. This is a direct result of the hackathon in Dresden in December. Along the way, I encountered the following issues:

  1. References to local Python methods would be garbage collected by the Python interpreter even though they were still in use inside the JVM.
  2. References to native memory (primarily numpy.ndarray) would be garbage collected by the Python interpreter even though they were still in use inside the JVM.
  3. Copying native memory into ArrayDataAccess with Python functions seemed to work only with synchronization on the Python side; without synchronization, I observed very strange exceptions on the Java side.

I addressed (1) and (2) with a reference store that is returned together with the wrapped image. Additionally, I overrode finalize for all unsafe accesses so that they can optionally run code on garbage collection; here, that code removes references that are no longer needed. To address (3), I added more helper code in imglib2-imglyb.
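
To illustrate why a reference store helps with (1) and (2), here is a minimal pure-Python sketch. It is not the imglyb implementation: `ReferenceStore`, `Chunk`, and `make_chunk` are hypothetical names, and a weak reference stands in for a JVM-side user that the Python garbage collector cannot see.

```python
import gc
import weakref


class ReferenceStore:
    """Hypothetical stand-in for a reference store: it holds strong references."""

    def __init__(self):
        self._refs = {}

    def add(self, obj):
        # Keep the object alive for as long as it stays in the store.
        self._refs[id(obj)] = obj

    def remove(self, obj):
        # Drop the strong reference once the object is no longer needed.
        self._refs.pop(id(obj), None)


class Chunk:
    """Placeholder for a chunk of native memory, e.g. a numpy.ndarray."""


def make_chunk(store=None):
    chunk = Chunk()
    if store is not None:
        store.add(chunk)
    # The weak reference mimics a foreign user that Python's GC does not count.
    return weakref.ref(chunk)


# Without a store, the chunk is collected although the "foreign" side still wants it.
unprotected = make_chunk()
gc.collect()
unprotected_alive = unprotected() is not None

# With a store, the chunk survives until the reference is removed explicitly.
store = ReferenceStore()
protected = make_chunk(store)
gc.collect()
protected_alive = protected() is not None

# Removing the reference (as a finalizer-triggered callback might) frees the chunk.
store.remove(protected())
gc.collect()
released = protected() is None
```

The practical consequence is the same as in the sketch: if the store goes out of scope while the image is still used in the JVM, the memory behind the image may be freed.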

The function of interest is:

imglyb.as_cell_img(array, chunk_shape, cache, *, access_type='native', chunk_as_array=identity, **kwargs)
    Wrap an arbitrary ndarray-like object as an ImgLib2 cached cell img.
    
    :param array: The arbitrary ndarray-like object to be wrapped
    :param chunk_shape: The shape of `array`. In many cases, this is just `array.shape`.
    :param cache: Can be `int` or an ImgLib2 `LoaderCache`. If `int` (recommended), use a
                :py:data:`imglyb.caches.BoundedSoftRefLoaderCache` that is bounded to `cache` elements.
                `LoaderCache`s are available in :py:mod:`imglyb.caches`.
    :param access_type: Can be either `'native'` or `'array'`. If `'native'`, use the native memory of the contiguous
                ndarray of a chunk directly. If `'array'`, copy the native memory into a Java array and use the Java
                array as access.
    :param chunk_as_array: Defines conversion of a chunk created by slicing into a :py:class:`numpy.ndarray`.
    :param kwargs: Optional arguments that may depend on the value passed for `access_type`, e.g. `use_volatile_access`
                is relevant only for `access_type == 'array'`.
    :return: A tuple that holds the wrapped image at `0` and a reference store at `1` to ensure that Python references
                are not being garbage collected while still in use in the JVM. The reference store should stay in scope
                as long as the wrapped image is intended to be used.
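
To make the role of a chunk shape more concrete, the following sketch shows how a position on a cell grid could map to a slice of the wrapped object. This is an illustration only, not code from imglyb: `grid_shape` and `chunk_slices` are hypothetical helpers, and the sketch assumes a regular grid whose border cells are truncated at the array boundary.

```python
def grid_shape(shape, chunk_shape):
    """Number of chunks along each dimension (ceiling division)."""
    return tuple((s + c - 1) // c for s, c in zip(shape, chunk_shape))


def chunk_slices(shape, chunk_shape, grid_position):
    """Slices selecting the chunk at `grid_position`, truncated at the border."""
    return tuple(
        slice(p * c, min((p + 1) * c, s))
        for s, c, p in zip(shape, chunk_shape, grid_position)
    )


# A 10 x 7 array split into 4 x 4 chunks.
grid = grid_shape((10, 7), (4, 4))

# The last chunk in both dimensions is smaller than the nominal chunk shape.
border = chunk_slices((10, 7), (4, 4), (2, 1))
```

Indexing an ndarray-like object with such a tuple of slices yields one chunk, which is the kind of object that `chunk_as_array` would then convert into a `numpy.ndarray`.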

There are now examples for using this functionality in imglyb, e.g. wrapping both an h5py.File and a dask.array.

The new feature is available in release 0.4.0 via pip. The conda version will be released as soon as the autotick bot notices the new release.
