Automated batch processing

scijava
ij2command
batch-processing

#1

@imagejan @haesleinhuepf

I am trying to get automated batch analysis in IJ2 commands working.

This is my command: https://github.com/tischi/fiji-plugin-illuminationCorrection/blob/master/src/main/java/de/embl/cba/illuminationcorrection/basic/BaSiCCommand.java#L38

And this is my pom.xml: https://github.com/tischi/fiji-plugin-illuminationCorrection/blob/master/pom.xml#L116

When running this command I do not see a [Batch] button in the command UI. Should there be one or am I missing something?


#2

When I compile a jar from your repository (using mvn clean install) and put it into my up-to-date Fiji installation, I do see the Batch button:

I didn’t test starting it from an IDE via a main class. Maybe you’re missing a scijava-search dependency in addition to batch-processor? Do you see the search bar at all?


Correcting brightfield illumination artifacts
#3

Ok! I thought the [Batch] was supposed to appear in the actual Command UI…


#4

That would be a nice addition, indeed, but it would require quite some rewriting of the UI framework, for which I lack the necessary knowledge.

But it should be feasible in principle. One would have to think of ways to define which of the parameters should be “batched” (e.g. radio buttons next to the “batchable” entries), and maybe have a state-dependent “batch mode” that can be enabled or disabled, as a permanent presence of these UI elements could potentially be confusing for new users, etc.


#5

In case you’re interested in some (minor) code review:

You should not rely on the UI to display your image… if you really need to open an image from file, use:

imp = IJ.openImage( inputFile.getAbsolutePath() );

But in general, by adding a runtime dependency on imagej-plugins-batch, you can write your command to take any ImagePlus and let ImageJ supply it for you, keeping full flexibility and a good separation of concerns:

<dependency>
  <groupId>net.imagej</groupId>
  <artifactId>imagej-plugins-batch</artifactId>
</dependency>

This will make Dataset, ImgPlus and other image parameters batchable and will (hopefully) be included in the next round of component uploads by @ctrueden (the imagej-plugins-batch version has been managed by pom-scijava for a while now).


#6

I tested it and it does run in batch!

There are two things that I do not understand:

  1. Can one do something with the “Output table” that is generated? I clicked on the files and nothing happened…

  2. Can one configure the batch such that it automatically writes output files to disk? Or is that something that should be part of the code of the command itself?


#7

@imagejan Am I right assuming that this is the job of the command itself? I would then go ahead and implement this accordingly.


#8

The output table is an ImageJ2 Table (currently net.imagej.table.Table which will be org.scijava.table.Table as soon as all necessary API is available in Fiji, see the following PRs: scijava-ui-swing#42, imagej-legacy#196, imagej-ui-swing#80).

There are plans to make Table functionally equivalent to an ImageJ1 ResultsTable, but currently, there’s no way yet to save the results from the UI. I hope this will be possible in the very near future (see discussions in imagej-common#37 and scijava-ui-swing#37).

Currently, yes. The batch processor only handles mapping from a list of files to a batchable input, but outputs are either displayed (if an image) or collected in the output table (if string or numeric).

Extensible output processing is on the wishlist:

My plan was to offer an option to save to disk if a given module contains an image parameter output (such as #@OUTPUT Img resultImg in a script). There are a few questions to be discussed though:

  • Should we offer to save in source directory, or always ask for a target directory?
  • Should images be saved with the name of the input image, or optionally with a suffix?
  • What to do if a command produces more than one image output? Ask for suffixes for each?

I’d very much appreciate feedback on these questions as well as help in implementing these, as I think the batch processor targets a very common and recurring usage pattern of many plugins, and any work on making it as generic as possible will pay off in future projects.


#9

My votes (based on several years of experience) would be:

  • Always ask for a target directory!
  • With a suffix!
  • Ask for suffixes for each!

Related to that:

  • Does it currently handle batching files from a nested input folder structure?

#10

Yes, it can handle any list of files from any location. You can either drag and drop files onto the input field, or use the Add folder content… button to add files from all subfolders. I noticed that the fact that this list of files can be arbitrary will complicate things when trying to write into a single output directory: we’ll have to deduce (or ask for) the common parent folder of the input files in order to create a suitable subfolder structure in the target directory (unless we save in flat hierarchy, i.e. without creating subfolders).


#11

Yes, I thought about that, too. Probably recreating the input folder structure in a new user-defined output root folder is the natural thing to do.
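To make the idea concrete, here is a minimal sketch of deducing the common parent folder and mirroring the input structure under an output root. It uses only java.nio.file and is not part of the batch processor; the class and method names (MirroredOutput, commonParent, target) are made up for illustration, and it assumes a non-empty list of regular file paths.

```java
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not an existing batch-processor API.
final class MirroredOutput {

    /**
     * Returns the deepest directory that contains all inputs,
     * or null if the inputs do not share a root (e.g. different drives).
     */
    static Path commonParent(List<Path> inputs) {
        Path common = inputs.get(0).toAbsolutePath().normalize().getParent();
        for (Path input : inputs) {
            Path p = input.toAbsolutePath().normalize();
            // Walk up until the candidate is an ancestor of this input.
            while (common != null && !p.startsWith(common)) {
                common = common.getParent();
            }
        }
        return common;
    }

    /** Maps an input file to the same relative location below outputRoot. */
    static Path target(Path input, Path commonParent, Path outputRoot) {
        Path relative = commonParent.relativize(input.toAbsolutePath().normalize());
        return outputRoot.resolve(relative);
    }
}
```

A null result from commonParent would be the point where the UI has to ask the user, or fall back to a flat hierarchy.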


#12

I wonder whether removal of files like “.DS_Store” should be done automatically by the batch-processor?


#13

I agree. The difficulty I was alluding to is if the files in the list come from entirely different locations:

C:\Users\Bob\Image1.tif
D:\UserData\John\Image2.tiff
E:\UserData\John\Image2.tiff

Which subfolders of the output directory should be created in this (admittedly contrived) case?


The batch processor should respect any restrictions you define on the File input:

#@ File (style="extensions:tif/tiff") inputFile

and only accept files with matching extensions. But it’s currently not yet doing that, see:

Similarly, for #@ Img inputs and the like, it should only accept image files that can be opened by SCIFIO and/or Bio-Formats (via DatasetIOService).
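For the File case, the check could look roughly like the sketch below. This is plain Java and not the batch processor’s actual code: the ExtensionFilter class is hypothetical, and I am assuming the style attribute is a comma-separated list in which one entry has the form extensions:tif/tiff. A side effect is that dot-files such as “.DS_Store” are rejected as soon as any extension restriction is present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; the parsing rules are an assumption, not the SciJava implementation.
final class ExtensionFilter {

    private static final String PREFIX = "extensions:";

    /** Extracts lower-cased extensions from a style such as "extensions:tif/tiff". */
    static List<String> parseExtensions(String style) {
        List<String> result = new ArrayList<>();
        for (String token : style.split(",")) {
            token = token.trim();
            if (token.startsWith(PREFIX)) {
                for (String ext : token.substring(PREFIX.length()).split("/")) {
                    if (!ext.isEmpty()) {
                        result.add(ext.toLowerCase(Locale.ROOT));
                    }
                }
            }
        }
        return result;
    }

    /** Returns true if fileName is allowed; an empty list means "no restriction". */
    static boolean accepts(String fileName, List<String> extensions) {
        if (extensions.isEmpty()) {
            return true;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0) {
            return false; // no extension, or a dot-file such as ".DS_Store"
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return extensions.contains(ext);
    }
}
```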


#14

As an alternative to target folder nesting to match the input, I have in some cases (such as the one you mention) converted the path separator(s) to underscores (or double underscores) so that you still keep some of the naming convention. (“C:/Users/user/test.tif” -> “C__Users__user__test.tif”)

Another option is to do the above and keep a table “batch_log.csv” that records, for each file:

Timestamp, OriginalAbsolutePath, OutputFilename

2018/10/10, “C:/Users/user/test.tif”, “C__Users__user__test.tif”
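The flattening and the log line above could be sketched like this. It is only an illustration in plain Java: FlatNames and its methods are hypothetical names, the drive colon is simply dropped, and the CSV fields are quoted with plain double quotes rather than the typographic ones shown above.

```java
// Hypothetical helper illustrating the "flatten and log" idea.
final class FlatNames {

    /** Turns "C:/Users/user/test.tif" into "C__Users__user__test.tif". */
    static String flatten(String absolutePath) {
        // Accept both Windows and Unix separators and drop the drive colon.
        String p = absolutePath.replace('\\', '/').replace(":", "");
        while (p.startsWith("/")) {
            p = p.substring(1);
        }
        return p.replace("/", "__");
    }

    /** Quotes a CSV field, doubling any embedded double quotes. */
    static String csvField(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    /** Builds one row: Timestamp, OriginalAbsolutePath, OutputFilename. */
    static String logLine(String timestamp, String originalPath, String outputName) {
        return timestamp + "," + csvField(originalPath) + "," + csvField(outputName);
    }
}
```

One caveat: flattening is not guaranteed to be collision-free (a folder name that already contains “__” can produce the same output name as a nested path), which is another reason to keep the log.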


#15

Gosh, that’s a tough one, indeed.
Maybe the batch-processor should throw an error:

ERROR: Input folders with different roots are not supported.

I think this case would still be ok:

C:\Users\Bob\Image1.czi
C:\UserData\John\Image2.tiff
C:\UserData\John\Image2.tiff

with output folder: D:\Output one would get:

D:\Output\Users\Bob\Image1.czi-processed.tif
D:\Output\UserData\John\Image2.tiff-processed.tif
D:\Output\UserData\John\Image2.tiff-processed.tif

Or alternatively

D:\Output\Users\Bob\Image1-processed.tif
D:\Output\UserData\John\Image2-processed.tif
D:\Output\UserData\John\Image2-processed.tif

In fact, I kind of like keeping the original file extension (first version) because it makes it super obvious what the input file really was.

[EDIT]

Now that I look at it, your original example could be handled like this:

D:\Output\C\Users\Bob\Image1.czi-processed.tif
D:\Output\D\UserData\John\Image2.tiff-processed.tif
D:\Output\E\UserData\John\Image2.tiff-processed.tif
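A rough sketch of that mapping, with the drive letter becoming the top-level folder and a suffix appended. Again this is just string handling in plain Java, not batch-processor code; DriveAwareOutput and its parameters are hypothetical, and the result uses forward slashes (which Java on Windows accepts as well).

```java
// Hypothetical helper: maps an input path below an output root,
// using the drive letter (if any) as the first subfolder.
final class DriveAwareOutput {

    static String target(String input, String outputRoot, String suffix, boolean keepExtension) {
        String p = input.replace('\\', '/');
        // "C:/Users/Bob/x.czi" -> "C/Users/Bob/x.czi"
        if (p.length() >= 2 && Character.isLetter(p.charAt(0)) && p.charAt(1) == ':') {
            p = p.charAt(0) + p.substring(2);
        }
        while (p.startsWith("/")) {
            p = p.substring(1);
        }
        if (!keepExtension) {
            int slash = p.lastIndexOf('/');
            int dot = p.lastIndexOf('.');
            // Only strip a real extension, not the dot of a dot-file.
            if (dot > slash + 1) {
                p = p.substring(0, dot);
            }
        }
        String root = outputRoot.replace('\\', '/');
        if (!root.endsWith("/")) {
            root += "/";
        }
        return root + p + suffix;
    }
}
```

Keeping the extension also avoids a name clash when Image2.tif and Image2.tiff sit in the same folder, since without it both would map to Image2-processed.tif.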