About lossless compression of images for shuffling data


I have to transmit large quantities of image data for a collaborative project. As people on the other end will do processing and analysis, the data has to be raw or losslessly compressed. I’d prefer lossless compression, as fluorescence images often shrink a lot (they are mostly black!). I was wondering what the best way is to do this in a bandwidth- and time-efficient way:

  • I can use lossless PNG compression. Do lossless PNG stacks exist? What about hyperstacks: is there a way to save them with lossless compression?
  • Alternatively, I can save everything as TIFF and compress the resulting folder into a ZIP file. I often see quite a gain with this, and I was wondering what compression ZIP is using on the images. I assume it is lossless?

Thanks for your help!

I always use TIFF, because as far as I know that format applies the least compression. However, ImageJ can read most file formats with the Bio-Formats importer, so you have to ask yourself whether compression is necessary at all.

In this document they explain some different types of compression with their pros and cons. I learned a lot of basic principles from this document.

Unfortunately, I don’t know a lot about ZIP compression of images, so I can’t help you out with that.

IIRC, Bio-Formats can save as a zlib-compressed TIFF. This will be lossless. JPEG compression will not be lossless.
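
To illustrate why zlib (DEFLATE) compression is lossless and does well on mostly black images, here is a minimal Python sketch. It works on a synthetic byte buffer rather than a real TIFF written by Bio-Formats, and the image size and spot positions are made up for the example.

```python
import zlib

# Synthetic 8-bit "fluorescence" image: black background with a few bright
# square spots. Size and spot positions are arbitrary assumptions.
width, height = 256, 256
pixels = bytearray(width * height)
for cx, cy in [(40, 50), (128, 128), (200, 90)]:
    for y in range(cy - 5, cy + 5):
        for x in range(cx - 5, cx + 5):
            pixels[y * width + x] = 200

raw = bytes(pixels)
compressed = zlib.compress(raw, 9)      # DEFLATE at the highest compression level
restored = zlib.decompress(compressed)

print(len(raw), "bytes raw ->", len(compressed), "bytes compressed")
print("bit-exact after round trip:", restored == raw)
```

Real acquisitions contain noise, so the gain will be smaller than on this idealized image.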

Have you looked into lossless JPEG 2000? It gives very good compression using wavelets.

Some thoughts.
Usually, depending on the amount of noise, lossless compression might not compress your data much, but if you discuss with your people on the other end what kind of pre-filtering they will perform on your data, you could do that processing on your side before sending them the images.

Similarly, make sure that they absolutely NEED the resolution at which you acquired the images. If they can afford to downsample the data by a factor of 2, then you should do this before sending as well.
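
As a rough idea of what downsampling by 2 could look like, here is a small Python sketch. It assumes 2x2 averaging (binning) on an image with even dimensions; plain decimation or a proper resampling filter are equally valid choices, and `bin2x2` is just a name picked for this example.

```python
def bin2x2(image):
    """Average each 2x2 block of a 2D list of ints (even dimensions assumed)."""
    out = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[0]), 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(total // 4)      # integer mean of the four pixels
        out.append(row)
    return out

image = [[0, 0, 10, 10],
         [0, 0, 10, 10],
         [4, 8, 0, 0],
         [4, 8, 0, 0]]
small = bin2x2(image)
print(small)    # 2x2 result: a quarter of the original pixel count
```

This step is not reversible, so it only makes sense if the people doing the analysis agree to it.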

As a simple example, I tried this on a 512x512, 3-channel image. Here are my results:

  • Raw as TIFF: 772 KB
  • Raw as LZW TIFF: 211 KB
  • Raw as JPEG 2000: 166 KB

After a median filter of radius 2:

  • as TIFF: 772 KB
  • as LZW TIFF: 113 KB
  • as JPEG 2000: 59 KB

Note: make sure that you select lossless JPEG 2000, as the format also has a lossy mode.
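
The effect of noise on lossless compression can be reproduced with a small Python sketch. It uses synthetic uniform noise instead of a real image, a 3x3 median (radius 1, not the radius 2 used above) and zlib instead of LZW or JPEG 2000, so the numbers will not match the ones above; only the trend is the point.

```python
import random
import zlib

def median3x3(img, w, h):
    """3x3 median filter on a flat list of pixel values; edges are clamped."""
    out = []
    for y in range(h):
        for x in range(w):
            window = [img[min(max(y + dy, 0), h - 1) * w + min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out.append(sorted(window)[4])   # middle of the 9 sorted values
    return out

random.seed(0)                              # fixed seed so runs are repeatable
w = h = 128
noisy = [random.randint(0, 15) for _ in range(w * h)]   # pure background noise
filtered = median3x3(noisy, w, h)

size_noisy = len(zlib.compress(bytes(noisy), 9))
size_filtered = len(zlib.compress(bytes(filtered), 9))
print("noisy:", size_noisy, "bytes; median-filtered:", size_filtered, "bytes")
```

The filtered version should come out smaller, because the median removes much of the pixel-to-pixel randomness that a lossless compressor cannot predict.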

Finally, this paper on lossy image compression for scientific data is interesting. In short:

  • Depending on what will be analyzed, you could compress your data in a lossy way, once.
  • If bandwidth is truly an issue, I would recommend that your people on the other side run a few analyses on both lossy and lossless data and compare the results, then draw a conclusion for this particular use case.

On your point of saving everything uncompressed and THEN making a ZIP file: because each image has metadata associated with it, which also gets compressed, you will gain slightly more than if you just compress the pixel data image by image. ZIP compression is indeed lossless and usually uses the DEFLATE algorithm, which is a general-purpose compressor rather than an image-specific one.
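
If you want to convince yourself that the ZIP route is lossless, a minimal Python sketch with the standard `zipfile` module could look like this. The file names and contents are made-up stand-ins for real TIFF files, and the archive is kept in memory just to keep the example self-contained.

```python
import io
import zipfile

# Stand-ins for uncompressed image files; names and contents are made up.
files = {
    "img_001.tif": bytes(50_000),                      # all-black frame
    "img_002.tif": bytes(49_000) + b"\xff" * 1_000,    # black with a bright strip
}

buffer = io.BytesIO()
# ZIP_DEFLATED must be requested explicitly; the module's default stores
# files without compression.
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    for name, data in files.items():
        archive.writestr(name, data)

with zipfile.ZipFile(buffer) as archive:
    for info in archive.infolist():
        print(info.filename, info.file_size, "->", info.compress_size, "bytes")
        # Extracted bytes must match the originals exactly.
        assert archive.read(info.filename) == files[info.filename]
```

For real data you would pass a file path instead of `buffer` and add the files with `archive.write(path)`.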

Good luck!