Validate that an image is an image

Hello,

Is there a way to check that the file in question is actually an image and not a movie? The basic way would be to look at the file header, but what if it has been altered? Does Bio-Formats provide this kind of validation?

Also, even if the file opens fine and is actually an image, is there a way to tell automatically that it is the image one might be looking for, e.g., that it is a grayscale medical scan rather than a colorful landscape picture?

Thanks!

If you are assuming that the header might have been tampered with, I don't see how this is possible. As far as I know, the only way you can tell what a file is supposed to be is from the header; the rest is just data that could be interpreted however you want.

Hi @yaric,

to make sure we answer your question properly, let me check that we are talking about the same concepts. OME defines an Image as a multi-dimensional collection of planes. From your original post, I assume "image" refers to a single 2D plane and "movie" refers to a time-lapse collection of planes.

Bio-Formats will not tell you whether a file header, a metadata file or any file has been modified from its original form. Doing so likely involves generating checksums of the original files at creation time. These checksums can be re-used later on to detect corrupted or missing files.
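To illustrate the checksum idea, here is a minimal sketch using only the Java standard library. It is not part of Bio-Formats; the class name `FileChecksum` and its methods are hypothetical, and SHA-256 is just one reasonable choice of digest. The checksum would be recorded when the file is created and compared again later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not a Bio-Formats class.
final class FileChecksum {

    // Streams the file through SHA-256 and returns the digest as lowercase hex.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True if the file's current checksum equals the one recorded earlier.
    static boolean matches(Path file, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return sha256Hex(file).equalsIgnoreCase(expectedHex);
    }
}
```

This only detects changes made after the checksum was recorded; it says nothing about whether the original file was a valid image in the first place.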

Bio-Formats will detect whether a file or a collection of files belongs to one of its supported imaging file formats. Once you have established a fileset as being of imaging type, the way to answer the second question you are asking is to read and introspect the metadata. This means using the API to retrieve, e.g., the size of the image along the time dimension, or whether it is RGB.
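In Bio-Formats this introspection goes through its reader API (the reader interface has calls such as `getSizeT()` and `isRGB()`). As a stand-in that runs without the library, the sketch below shows the same "decode, then inspect" idea with the JDK's own `javax.imageio`, which covers only a few common formats. The class `ImageInspector` and its methods are hypothetical names chosen for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper using the JDK's ImageIO, not Bio-Formats.
final class ImageInspector {

    // Returns the decoded image, or null if no installed ImageIO reader
    // recognizes the file contents as an image.
    static BufferedImage decode(File file) throws IOException {
        return ImageIO.read(file);
    }

    // True if the color model has a single color component (grayscale).
    // Note: an RGB image whose pixels all happen to be gray reports false here.
    static boolean isGrayscale(BufferedImage image) {
        return image.getColorModel().getNumColorComponents() == 1;
    }
}
```

A `null` result from `decode` means none of the installed readers could interpret the data, which is a stronger signal than the file extension alone. Telling a medical scan from any other grayscale picture would need more than this, for example format-specific metadata.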

Hope these pointers help. If you have more specifics about what your use cases are trying to detect, sample files or scenarios might help drive the discussion.

Thank you for the suggestions. The RGB link you provided looks promising.

By movies I meant .mpg files, for example. It would be interesting to be able to automatically determine that a file is an actual JPEG and not an MPEG whose header has been altered to look like a JPEG.
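For reference, a first-pass check on the leading bytes could look like the sketch below. It assumes the usual signatures: JPEG files start with `FF D8 FF`, MPEG program streams with `00 00 01 BA`, and MPEG video elementary streams with `00 00 01 B3`. The class `MagicBytes` is a hypothetical name, not a Bio-Formats API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper that classifies a file by its first few bytes.
final class MagicBytes {

    enum Kind { JPEG, MPEG, UNKNOWN }

    // Classifies a buffer holding the start of a file.
    static Kind sniff(byte[] head) {
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return Kind.JPEG;
        }
        if (startsWith(head, 0x00, 0x00, 0x01, 0xBA)
                || startsWith(head, 0x00, 0x00, 0x01, 0xB3)) {
            return Kind.MPEG;
        }
        return Kind.UNKNOWN;
    }

    // Reads up to four bytes from the file and classifies them.
    static Kind sniff(Path file) throws IOException {
        byte[] head = new byte[4];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (filled < head.length
                    && (n = in.read(head, filled, head.length - filled)) != -1) {
                filled += n;
            }
        }
        return sniff(Arrays.copyOf(head, filled));
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

As noted earlier in the thread, these bytes are part of the header, so a file whose header was deliberately rewritten would pass this check. Attempting a full decode of the file as a JPEG is the more robust test, since MPEG data after a forged JPEG signature is very unlikely to decode cleanly.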