I have been playing with DeepImageJ, a tool that makes image-centric TensorFlow models executable by end users from within the ImageJ user interface. This has led me to a question for the community:
Is there a metadata standard for documenting ML models which makes it more feasible to execute them on images in a general way?
Since these TensorFlow models are graph-based, there is a lot of flexibility in the structure of the inputs and outputs. This makes it hard, in general, to "just execute the model" on an image without some prior knowledge of the particular model to be executed: its requirements, assumptions, and structure. For example, many models only operate on images with particular dimensions.
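To make the dimension constraint concrete: a model that requires each spatial dimension to be a multiple of some minimum size forces the caller to pad arbitrary input images before execution. A minimal NumPy sketch (the function name and the multiple of 64 are illustrative assumptions, not part of any standard):

```python
import numpy as np

def pad_to_multiple(image, multiple):
    """Zero-pad a 2-D image so each dimension is a multiple of `multiple`."""
    pad = [(0, (-s) % multiple) for s in image.shape]
    return np.pad(image, pad, mode="constant")

img = np.zeros((300, 250))          # an input image of arbitrary size
padded = pad_to_multiple(img, 64)   # hypothetical model needs multiples of 64
print(padded.shape)                 # (320, 256)
```

A general-purpose tool would need to know this minimum size (and how to undo the padding on the output) without hard-coding it per model, which is exactly what the metadata file below is for.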
By documenting these requirements and assumptions in a metainformation file accompanying the model itself (e.g., how to treat each input and output of the graph, and constraints on which sorts of images the model is intended for), it becomes easier for general-purpose frameworks (like DeepImageJ) to make such models executable from user-friendly tools (like ImageJ or napari).
DeepImageJ has an implicit XML schema documenting some of this metainformation. In particular, it specifies required input image dimensions, supported tile sizes, and supported tile overlaps, as well as some social metadata such as who authored the model and where to look for more information about it.
Here is an example XML metadata file for StarDist executed via DeepImageJ:
```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Model>
  <ModelInformation>
    <Name>Stardist nuclei detection</Name>
    <Author>Martin Weigert</Author>
    <URL>http://csbdeep.bioimagecomputing.com/index.html</URL>
    <Credit>Max Planck Institute of Molecular Cell Biology and Genetics, and Center for Systems Biology Dresden, Germany</Credit>
    <Version>n/a</Version>
    <Date>2018</Date>
    <Reference>Uwe Schmidt, Martin Weigert, Coleman Broaddus and Gene Myers, Cell Detection with Star-Convex Polygons, Medical Image Computing and Computer Assisted Intervention (MICCAI) 2018</Reference>
  </ModelInformation>
  <ModelTest>
    <InputSize>320x256</InputSize>
    <OutputSize>320x256</OutputSize>
    <MemoryPeak>578.1 Mb</MemoryPeak>
    <Runtime>2.4 s</Runtime>
    <PixelSize>1.00pixel x 1.00pixel</PixelSize>
  </ModelTest>
  <ModelCharacteristics>
    <ModelTag>tf.saved_model.tag_constants.SERVING</ModelTag>
    <SignatureDefinition>tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY</SignatureDefinition>
    <InputTensorDimensions>,-1,-1,-1,1,</InputTensorDimensions>
    <NumberOfInputs>1</NumberOfInputs>
    <InputNames0>input</InputNames0>
    <InputOrganization0>NHWC</InputOrganization0>
    <NumberOfOutputs>1</NumberOfOutputs>
    <OutputNames0>output</OutputNames0>
    <OutputOrganization0>NHWC</OutputOrganization0>
    <Channels>1</Channels>
    <FixedPatch>false</FixedPatch>
    <MinimumSize>64</MinimumSize>
    <PatchSize>128</PatchSize>
    <FixedPadding>true</FixedPadding>
    <Padding>22</Padding>
    <PreprocessingFile>preprocessing.txt</PreprocessingFile>
    <PostprocessingFile>postprocessing.txt</PostprocessingFile>
    <slices>1</slices>
  </ModelCharacteristics>
</Model>
```
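To illustrate how a general-purpose executor could consume such a file, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The tag names match the StarDist example above; the inline string stands in for reading the model's actual metadata file, and the dictionary keys are my own naming, not part of any schema:

```python
import xml.etree.ElementTree as ET

# Inline stand-in for the model's metadata file; a real tool would use
# ET.parse() on the file shipped alongside the model.
XML = """<Model><ModelCharacteristics>
  <InputNames0>input</InputNames0>
  <InputOrganization0>NHWC</InputOrganization0>
  <Channels>1</Channels>
  <MinimumSize>64</MinimumSize>
  <PatchSize>128</PatchSize>
  <Padding>22</Padding>
</ModelCharacteristics></Model>"""

chars = ET.fromstring(XML).find("ModelCharacteristics")
meta = {
    "input_name": chars.findtext("InputNames0"),
    "layout": chars.findtext("InputOrganization0"),   # dimension order, e.g. NHWC
    "channels": int(chars.findtext("Channels")),
    "min_size": int(chars.findtext("MinimumSize")),   # dims must be multiples of this
    "patch_size": int(chars.findtext("PatchSize")),   # tile size for tiled execution
    "overlap": int(chars.findtext("Padding")),        # tile overlap in pixels
}
print(meta)
```

With only these few fields, a framework can validate an input image, tile it, and feed the tiles to the graph's named input tensor without knowing anything model-specific in advance.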
It strikes me that this sort of definition is hardly unique to DeepImageJ, and could be utilized effectively in any sort of general-purpose model execution framework for images. Does anyone active in this community know whether work is being done to define a community standard along these lines? @agoodman @jni @bcimini @AnneCarpenter @fjug @mweigert @uschmidt83 @iarganda @joshmoore
P.S. There are often also further restrictions regarding which sorts of data are valid for input given the scope of a pretrained model’s training data, such as only images from a particular microscope, within certain intensity ranges, etc. Not all of these restrictions can be easily documented in a technical way. While I am interested in how to deal with that thorny issue, in this topic I am merely wondering whether there is any standard whatsoever for those elements which can be documented.