Great question. It might be nice to have a wiki page on imagej.net discussing all the approaches people are using, highlighting the cost/benefit of each. But as far as I know, that documentation does not exist yet. So for now I’ll just list the ways I know about, off the top of my head:
Headless scripting from the CLI. You can run the scripts headless as described here:
On a compute cluster, each node executes one or more scripts in this fashion.
- Pro: Easy to implement as bash scripts or whatnot; standard well-established infrastructure.
- Con: Each startup of Fiji pings for updates by default. If you launch 10000 Fijis, you’ll hit sites.imagej.net 10000 times, and soft DoS the server. This has been happening more and more lately as people are running more and bigger cluster jobs like this. When it gets really bad I end up blocking the IP(s), and those machines can’t talk to imagej.net anymore.
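To make the CLI approach concrete, here is a minimal Python sketch that assembles one headless invocation per image, following the documented `--ij2 --headless --run` form. The Fiji path, script name, and parameter name are hypothetical; adapt them to your setup.

```python
def fiji_headless_command(fiji_executable, script, **params):
    """Build one headless Fiji invocation.

    Follows the documented form:
        ImageJ-linux64 --ij2 --headless --run script.py 'key="value"'
    Multiple parameters are joined with commas.
    """
    cmd = [fiji_executable, "--ij2", "--headless", "--run", script]
    args = ",".join('{}="{}"'.format(k, v) for k, v in params.items())
    if args:
        cmd.append(args)
    return cmd

# One command per input image; the Fiji path here is an assumption.
cmd = fiji_headless_command(
    "/opt/fiji/ImageJ-linux64", "measure.py", input="/data/img0001.tif")
# A cluster job script would then execute it, e.g.:
# subprocess.run(cmd, check=True)
```

From there, a scheduler array job (or a plain bash loop) can run one such command per node.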
KNIME Server. Workflows using the KNIME Image Processing nodes can be run via KNIME Server. I am not sure exactly what the characteristics are as processing scales up; I defer to @gab1one or @dietzc or another KNIME team member to comment on that.
- Pro: Some people really like developing workflows graphically. And KNIME has tons of advantages like easy inspection of intermediate results, reproducible execution, etc.
- Con: It is a different paradigm from Fiji Jython scripts, so it would require adapting or redoing the workflow; probably not a solution for you here @bcimini, but I wanted to mention it as a way forward more generally.
- Con: While KNIME Analytics Platform is FOSS, KNIME Server is a paid feature.
SciJava parallel. As @bnorthan mentioned, a team at IT4I in Ostrava is developing SciJava parallel, a layer on top of SciJava that offers a generic way of distributed programming on top of clusters, without any assumptions about the specific cluster hardware or architecture. If I understand it correctly: support for each kind of cluster is offered as a SciJava “paradigm” plugin, with user code not needing to care about which paradigm plugin is used. I believe the main goal is to make it easy to execute ImageJ/Fiji functionality from within the UI itself. Just click some button like “Run in parallel” and the framework does the rest. I am not sure how far the current development reality is from that ideal, though. I defer to Jan Kožusznik and others on that team regarding technical details.
Container-based execution. Using a tool like Docker or Singularity, run Fiji headless in a container. If you have some container-based workflow tool or orchestration mechanism, these units become more convenient to work with. ImJoy (a partner on this forum) works that way, as does Galaxy, as does Zeiss’s APEER platform. I am not sure how easy it is to scale up ImJoy-based computations; I defer to @oeway on that. And I defer to @sebi06 and @rkirmse regarding APEER. In general, I personally have not dived into these approaches yet, so am ill-equipped to comment on how practical they are to use from a pragmatic or scientific perspective.
ImageJ Server. You can run ImageJ as a server using the ImageJ Server project:
Aside: Funnily enough, we originally developed this approach for CellProfiler after visiting with @agoodman at Broad in 2016, but at this point I think that PyImageJ will be a better way forward for CellProfiler-ImageJ integration—at least if you want the tools running on the same machine.
The ImageJ Server is a pretty general thing, but one use case would be to run one server instance on each cluster node, and then farm out jobs to them from a controller node. IIUC, this is one approach the scijava-parallel project has used in practice to run ImageJ operations in parallel on a cluster.
- Pro: Run ImageJ on a separate machine. Offers scalability and process isolation.
- Con: ImageJ Server’s security is not currently robust or up-to-date. It is exploitable remotely if the server machine accepts connections indiscriminately.
- Con: ImageJ Server’s REST API is pretty minimal right now. This is both good and bad—one downside is that client-side Python code to run commands is not super elegant. But it’s powerful.
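To give a flavor of the client side, here is a minimal Python sketch against a local server. The endpoint shape (POST the inputs as JSON to `/modules/{id}` on port 8080) is from my memory of the imagej-server README, so treat it as an assumption and check the project docs for the current routes.

```python
import json
from urllib import request

SERVER = "http://localhost:8080"  # imagej-server's default port (assumption)

def module_url(module_id, server=SERVER):
    """URL for executing one module on an ImageJ Server instance."""
    return "{}/modules/{}".format(server, module_id)

def run_module(module_id, inputs, server=SERVER):
    """POST a module's inputs as JSON and return the decoded response."""
    req = request.Request(
        module_url(module_id, server),
        data=json.dumps(inputs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Usage against a running server (the module ID is illustrative):
# run_module("command:net.imagej.ops.math.PrimitiveMath$IntegerAdd",
#            {"a": 1, "b": 2})
```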
The ImageJ Server is also useful for same-machine interprocess communication. For example, I know @kephale has used it to execute ImageJ scripts from Atom, without needing to launch a new ImageJ instance every time.
I’m probably forgetting other approaches people have used. In practice, launching Fiji headless from the command line is probably your best bet here, since you already have a Jython script. When combined with non-ImageJ-specific cluster job infrastructure, e.g. the kind @schmiedc talks about, you should be able to get “embarrassingly parallel” things going.
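The fan-out itself can stay dead simple. A local sketch (on a real cluster you would submit one job per image or per chunk instead, via your scheduler):

```python
from concurrent.futures import ThreadPoolExecutor

def run_batch(items, launch, workers=4):
    """Apply `launch` (e.g. a function that spawns one headless Fiji
    process via subprocess.run) to each item, a few at a time, and
    return the results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(launch, items))

# With a real launcher you might pass something like (paths assumed):
#   lambda img: subprocess.run(
#       ["/opt/fiji/ImageJ-linux64", "--ij2", "--headless",
#        "--run", "measure.py", 'input="{}"'.format(img)], check=True)
```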