Big 3D Stitching

Dear all,

we (with @Christian_Tischer, @constantinpape, @Kimberly_Meechan, …) are trying to align rather big tiled serial-section EM datasets. One specific example would be images generated on a serial block-face SEM with @btitze's SBEMimage, but the data could also come from other instruments (FIB-SEM, serial TEM).

The data would be n (up to 10,000) slices in z, each with a specific thickness. Each slice is a regular-grid mosaic composed of rectangular tiles (there can be several hundred). Each tile would be about 1k to 4k pixels on a side. There are empty tiles, and tiles can exist independently of their neighbors. Initial positional metadata (physical stage coordinates) is available for all tiles.
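For concreteness, the mapping from stage metadata to an initial pixel-space layout is trivial; here is a toy sketch with made-up numbers (the pixel size is a placeholder for whatever the microscope writes):

```python
# Made-up example: convert physical stage coordinates (in µm) into
# nominal pixel-space tile positions used as the initial layout.
PIXEL_SIZE_UM = 0.01  # 10 nm per pixel, assumed

def initial_tile_position(stage_x_um, stage_y_um):
    """Nominal top-left corner of a tile in global pixel coordinates."""
    return stage_x_um / PIXEL_SIZE_UM, stage_y_um / PIXEL_SIZE_UM

# Two neighboring 4096 px tiles acquired with ~10% overlap:
print(initial_tile_position(0.0, 0.0))     # (0.0, 0.0)
print(initial_tile_position(36.864, 0.0))  # (3686.4, 0.0) -> ~410 px overlap
```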

The core concept of the approach would be to specify the necessary transformations for each tile based on its relations to neighboring tiles, then compute a globally optimal solution for all tiles and apply the resulting transformations. Eventually, you would merge some or all of the data together to make it available in a volume-based storage format. (This could include partitioning the resulting volumes and displaying the sub-volumes in positional relation.)
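To make "globally optimal solution" concrete: in the simplest (translation-only) case this is a sparse linear least-squares problem over the measured pairwise offsets, along the lines of this toy sketch (not any particular tool's implementation):

```python
import numpy as np

# Toy sketch of the global solve (translation-only, 2D), independent of
# any specific tool: given measured pairwise offsets between overlapping
# tiles, find tile positions minimizing the summed squared residuals.
def solve_translations(n_tiles, pairs):
    """pairs: list of (i, j, dx, dy), the measured offset of tile j
    relative to tile i. Tile 0 is pinned to remove the global shift."""
    A = np.zeros((len(pairs) + 1, n_tiles))
    bx = np.zeros(len(pairs) + 1)
    by = np.zeros(len(pairs) + 1)
    for row, (i, j, dx, dy) in enumerate(pairs):
        A[row, i], A[row, j] = -1.0, 1.0
        bx[row], by[row] = dx, dy
    A[-1, 0] = 1.0  # pin tile 0 at (0, 0)
    x = np.linalg.lstsq(A, bx, rcond=None)[0]
    y = np.linalg.lstsq(A, by, rcond=None)[0]
    return np.stack([x, y], axis=1)  # (n_tiles, 2) optimized positions

# Three tiles in a row with slightly inconsistent pairwise measurements:
print(solve_translations(3, [(0, 1, 3686, 2), (1, 2, 3690, -1), (0, 2, 7370, 0)]))
```

Real implementations (mpicbg, BigStitcher) of course support richer transformation models and robust, iterative optimization, but the structure of the problem is the same.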

We would love to get comments from developers or experienced users of the following tools, which we are considering for implementing an efficient pipeline.

@axtimwalde, @StephanPreibisch, @albertcardona, @thewtex, @hoerldavid

  • Obviously, there is Render, which is designed for massive EM data. However, since we get data from different microscopes in different formats and in much smaller quantities, we are a bit hesitant to set up such a big and rather complicated infrastructure and to create compatible data.

  • We then considered TrakEM2, which should essentially provide the same underlying stitching functionality. What we are not sure of: how it integrates with the BigDataViewer universe, which is a must for us. Also, we would definitely aim for HPC-type parallelization and/or GPU utilization, and neither seems trivial with TrakEM2. Please correct us if we are wrong!

  • BigStitcher: We could already visualize and stitch small example datasets with BigStitcher very nicely. Our questions would again concern HPC-type parallelization and/or GPU utilization, given the size of the datasets. Also, we are not entirely sure if and how neighboring tiles along the z axis (they don’t overlap!) are considered for the stitching.

  • TeraStitcher/Parastitcher: this sounds appealing in terms of parallelization. Does anyone have experience with it, and is there support from the developers? (I could not find them on this forum, unfortunately.) Again, we are not entirely sure if and how neighboring tiles along the z axis (they don’t overlap!) are considered for the stitching. It is also not clear whether single 2D tiles can be used as input or whether we need to assemble small volume chunks first.

  • ITKMontage: this looks like a promising generic package. Would it be capable of handling this type of data?

Any other suggestions are much appreciated as well!

So long,
Thanks for any feedback!


There is a TrakEM2 tool to browse the current project in BDV: https://github.com/saalfeldlab/bigdataviewer-trakem2. It is probably outdated, but it would be much easier to write today, i.e. it shouldn’t be a big deal. However, @StephanPreibisch and I are currently improving the render-based alignment tools targeting SEM, and I think that if you want to go HPC (not GPU in this case), that could be the best way to go. Are you interested in sharing a test dataset? Maybe we can find a way to work on this together?


Dear Stephan,

wow, this is a great suggestion.
Let me dig and see what we can provide. I will get back to you!

Hi @schorb,
Cool project. Just to add that, for large projects, we use @axtimwalde’s mpicbg transformation and alignment library that TrakEM2 uses, but without TrakEM2. So far, we’ve been aligning FIB-SEM volumes with a single image per section, so the problem is rather straightforward. A working script to run on a multi-core machine is here: https://github.com/acardona/scripts/blob/0193ea99389006c7e4da6b6e7e1d610021c0ba26/python/imagej/FIBSEM/serial_section_registration.py and our lab member Andrew Champion wrote an HPC version of it that runs on a cluster.
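Conceptually, the single-image-per-section case boils down to chaining pairwise 2D registrations, something like this toy sketch (scikit-image here just for illustration; the actual script uses the mpicbg library and is more robust):

```python
import numpy as np
from skimage.registration import phase_cross_correlation
from scipy.ndimage import shift as nd_shift

# Toy illustration only: register each section to its predecessor and
# accumulate the shifts so every section ends up in section 0's frame.
def align_serial_sections(sections):
    """sections: list of 2D numpy arrays, one image per section."""
    aligned = [sections[0]]
    total = np.zeros(2)
    for prev, cur in zip(sections, sections[1:]):
        offset, _, _ = phase_cross_correlation(prev, cur)
        total += offset
        aligned.append(nd_shift(cur, total))
    return aligned
```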
I’ve always wanted to rewrite a version of TrakEM2, so to speak, based on ImgLib2 and BDV, for the alignment of large multi-tile, multi-section datasets, but I haven’t; I am hoping that what @axtimwalde and @StephanPreibisch are working on will do it.
On that note, the group of Sebastian Seung at Princeton wrote a cluster-ready multi-tile serial-section alignment tool by rewriting parts of the mpicbg library in Julia. And at the Allen Institute for Brain Science, they also have their own alignment software, spearheaded by Forrest Collman (if I remember correctly), for the alignment of their >1 mm^3 volume of mouse cortex imaged with TEMCA, which they released on GitHub somewhere.


Hi,

I have installed a test instance of render-ws and got everything up and running. I think I have familiarized myself with the data model and the API calls it uses for managing the data.
Next in line would be to identify a good approach for interfacing this data model with the processing platform.
Is this what the Dynamic Render Host is used for?
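For reference, this is roughly how I have been poking at the REST API so far; a minimal sketch assuming the endpoint layout my instance exposes (owner/project/stack names are placeholders):

```python
import requests

# Minimal sketch of querying a render-ws instance for the tile specs
# of one section. Paths are those my installed version exposes;
# adjust BASE, OWNER, PROJECT, STACK to your setup.
BASE = "http://localhost:8080/render-ws/v1"
OWNER, PROJECT, STACK = "demo", "test_project", "raw_tiles"

def tile_specs_for_section(z):
    url = f"{BASE}/owner/{OWNER}/project/{PROJECT}/stack/{STACK}/z/{z}/tile-specs"
    r = requests.get(url)
    r.raise_for_status()
    return r.json()  # list of tile specs: image URL, bounds, transform list

for spec in tile_specs_for_section(0):
    print(spec["tileId"], spec.get("transforms"))
```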


My main task at the moment is to get users and microscopy facility staff used to the idea of having raw (untransformed) image data plus the positional and transformation metadata as separate entities.
The ideal scenario for me would thus be to aim directly at interfacing the Render data model with the BDV strategy of positional metadata (transformed sources). Is there already something like a converter between the BDV/BigStitcher/… XML format and the Render JSON data model?
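In case it helps the discussion, here is the core of what I imagine such a converter would do. Note that I am assuming the Render leaf transform is an mpicbg AffineModel2D serialized as "m00 m10 m01 m11 tx ty"; please double-check that against the actual tile specs:

```python
import numpy as np

# Hypothetical converter core: turn a Render tile-spec 2D affine into
# the 3x4 row-major affine that a BDV XML <ViewRegistration> expects.
# The dataString layout "m00 m10 m01 m11 tx ty" is my assumption.
def render_affine_to_bdv(data_string, z, z_scale=1.0):
    m00, m10, m01, m11, tx, ty = map(float, data_string.split())
    return np.array([
        [m00, m01, 0.0, tx],
        [m10, m11, 0.0, ty],
        [0.0, 0.0, z_scale, z * z_scale],  # place the section at its z
    ])

print(render_affine_to_bdv("1 0 0 1 3686.4 0", z=5))
```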
The BDV route is what I already have at hand, and it is also what we find to be the best approach for browsing/exploring and sharing large volume data.
Ideally, the registration/alignment would happen automatically through Render (with an associated data-management module), and the user interaction would consist simply of a viewer (CATMAID or perhaps something BDV-based) plus some interface to trigger data-management or processing procedures.


For what it’s worth, I invited Alessandro Bria, the author of TeraStitcher, to this forum topic (using the :link: Share button at the bottom of the page).


There is the bigdataviewer-render-app, which has gotten a bit old. It renders locally and uses the render service only to query metadata and transformations. I haven’t used it for a while, as you can see from the outdated dependencies. It should be straightforward to bump it to more recent versions, with some bug-fixing required. I also have a more up-to-date Spark-based exporter to N5 that we use more regularly and that should be current.

@StephanPreibisch and Eric Trautman are currently running the render-based alignment jobs using Spark and a variety of solvers. Together with @StephanPreibisch, we recently went back to using the mpicbg solver on overlapping chunks to align very large series in parallel, which we then stitch with simple interpolation.
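Schematically, that stitching step amounts to cross-fading the per-section transforms across the shared overlap; a toy numpy sketch of the idea (not the actual solver code):

```python
import numpy as np

# Toy sketch of the overlapping-chunk scheme: two chunks are solved
# independently, share `overlap` sections in z, and their per-section
# transforms are linearly cross-faded across that overlap.
def blend_chunk_transforms(transforms_a, transforms_b, overlap):
    """transforms_*: lists of 3x4 affine arrays, one per section; the
    last `overlap` sections of chunk A are the first of chunk B."""
    blended = list(transforms_a[:-overlap])
    for k in range(overlap):
        w = (k + 1) / (overlap + 1)  # ramps from chunk A to chunk B
        blended.append((1 - w) * transforms_a[-overlap + k] + w * transforms_b[k])
    blended.extend(transforms_b[overlap:])
    return blended
```

Interpolating affine parameters like this is only safe when the two solutions are already close within the overlap, which the independent solves over shared sections should ensure.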

So I bumped the bigdataviewer-render-app to pom-scijava-29.2.1 and added some things that were still sitting idle in my local repo. It works for a local (i.e. Janelia render server) test case. Expect some issues, because I had almost forgotten about the project; we mostly visualize data through local exports these days…

Cool,

so after doing some more screening of the tools that are around, here would be my draft for a generic volume EM registration/stitching strategy:

General task:
I really like @khaled’s nomenclature definition and will try to stick to it from now on. His diagram also explains the workflow nicely.

In the case of FIB-SEM we skip the initial montaging of the tiles, but the rest is comparable.

We would then certainly also need to incorporate exporters to chunked volume formats (N5, Zarr, …), ideally compatible with remote access as well as with submission to public repositories.
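To make that concrete, here is a small illustration of such a chunked export from the Python side, assuming zarr-python's N5 backend and the bdv-n5 group-naming convention (this is a toy, not a finished exporter):

```python
import numpy as np
import zarr
from zarr.n5 import N5Store  # zarr-python 2.x N5 backend

# Toy illustration of the chunked target layout: write a small test
# volume into an N5 container using BDV-style group names.
store = N5Store("/tmp/export.n5")
root = zarr.group(store=store)
ds = root.create_dataset("setup0/timepoint0/s0",
                         shape=(16, 512, 512),   # z, y, x
                         chunks=(16, 128, 128),
                         dtype="uint8")
ds[:] = np.random.randint(0, 255, size=ds.shape, dtype="uint8")
print(zarr.open(store)["setup0/timepoint0/s0"].shape)
```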

Strategy

I think it makes sense to build things around a Render service as a central information hub (data storage and transformation metadata) with a standardized API.

The two main areas where there is the need for development work are:

  1. Frontend(s) for populating Render and initiating client processes. So far, I have not found tools operating at the target scale (HPC-compatible) that are accessible to a shell-anaphylactic user. Ideally, we would follow parallel strategies, enabling access through both:
  • Fiji plugin(s) (based on either TrakEM2 or BDV/BigStitcher). This is what is typically used in the community already and what users are familiar with. Generating the positional metadata in (BDV/TrakEM2-)compatible formats from the acquisition devices is pretty straightforward and often already implemented.
  • Web-based access. I have installed and played a bit with Neuroglancer and find it promising for user interaction. I am not entirely sure how its performance when visualizing raw data in section/tile format from a Render source compares to final chunked volumes, but I will test this further. CATMAID might also be an option, perhaps stripped of the neuron-tracing specifics.
  2. Processing clients: Here is where modularity will be key. I don’t think the exact choice of client library matters too much, as long as it can be “easily” accessed and triggered by the frontend. Ideally, one could compile a collection of client processing modules for each type of input dataset (FIB-SEM, ss-TEM, SEM array tomography, SBF-SEM) and offer these workflows as packages (see the sketch after this list). I like the idea of preparing the modules in pairs for local computing and HPC. My first try would be to get Janelia’s Spark-based client scripts up and running on EMBL’s HPC infrastructure. With the already existing Python apps and other implementations, integrating modules across programming languages should also be feasible.
    In the end, it will be necessary to include dedicated processing modules that are specific to a certain technique and come from various developers across labs. A unified platform (Render) with standardized metadata handling is key to making this possible.
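To illustrate the kind of module packaging I have in mind, an entirely hypothetical sketch (all names invented):

```python
# Hypothetical sketch of the module packaging: each acquisition
# technique maps to an ordered pipeline, and every step would come in
# a local and an HPC flavour behind the same interface.
PIPELINES = {
    "FIB-SEM": ["serial_section_align", "export_n5"],
    "ss-TEM":  ["montage", "serial_section_align", "export_n5"],
    "SBF-SEM": ["montage", "serial_section_align", "export_n5"],
}

def run_pipeline(technique, stack, mode="local"):
    """mode 'local' runs in-process; 'hpc' would submit cluster jobs."""
    for step in PIPELINES[technique]:
        # A real implementation would resolve (step, mode) to a module
        # that reads and writes tile specs via the Render API.
        print(f"[{mode}] running {step} on stack '{stack}'")

run_pipeline("ss-TEM", "my_stack", mode="hpc")
```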

I will try to get things up and running in our environment, initially without too much development on the frontend side. But eventually, I have the feeling that this will be the key component for making the whole alignment/registration environment accessible to the community.

Let me know if you see things entirely differently, and whether I have perhaps missed a crucial component that would make everyone’s life easier.
