"Finishing Files" takes long time to finish in Micro-manager

I am having issues when closing freshly acquired multidimensional data in MM.

Depending on the size of the dataset (typically a few TBs), the closing time could be as long as tens of minutes.

This is a bit annoying to me as I am acquiring multiple datasets of the same sample sequentially.

Does anyone have an idea how to reduce this time? For now, I just kill MM and restart it…

Note:
The “Create metadata.txt file with image Stack Files” checkbox is unchecked in Tools > Options.

Thank you,
Bin

Hi Bin,

Are you saving as “image stack files” or as “separate image files” (I suspect the first, but better be sure)?

When you kill MM and restart, can you read the data back in, and are all the metadata and display settings correct?


I’m almost certain this is caused by the writing of a bunch of extra stuff at the end of acquisition: OME-TIFF metadata, ImageJ metadata, display settings, etc. There’s an unfortunate tradeoff here between portability and performance.

NDTiffStorage has these features removed and thus cuts the closing time to the bare minimum. You can’t save with it through the Micro-Manager MDA yet, but it is the default for Pycro-Manager/Micro-Magellan.
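In case it’s useful, here is a minimal sketch of what saving through Pycro-Manager looks like; the directory, dataset name, and dimensions are placeholders, so adjust them to your acquisition. Because Pycro-Manager’s Acquisition writes NDTiff by default, there is no OME-XML finalization step when the dataset closes:

```python
# Minimal Pycro-Manager acquisition sketch; data are written as NDTiff by default,
# so there is no lengthy OME-XML "finishing" step when the dataset is closed.
# The directory, name, and dimensions below are placeholders.
from pycromanager import Acquisition, multi_d_acquisition_events

events = multi_d_acquisition_events(
    num_time_points=100,             # hypothetical: 100 time points
    time_interval_s=1,               # hypothetical: 1 s between time points
    z_start=0, z_end=10, z_step=1,   # hypothetical z-stack
)

with Acquisition(directory=r"D:\data", name="my_dataset") as acq:
    acq.acquire(events)
```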


Hi Nico,

Yes, I am saving as “image stack files”.

After I kill MM and restart, I can load the data (n=2) in MM and the display is correct.

This makes sense. I was also suspecting that MM writes some metadata when closing the data. But why does writing the metadata take so long? It can sometimes take tens of minutes.

Moreover, on rare occasions, closing the data is super fast, like a few seconds. This has happened to me twice (including once this morning), so maybe it has been fixed?

I don’t know, that’s weird. It shouldn’t be particularly variable as to what it’s writing. Maybe it is something specific to your hard drive and how it’s writing to different locations.

I think @henrypinkard is right about the cause: writing out the OME XML can be very time-consuming, and it is entirely done when closing the file (in part because some information that appears early in the XML is only finalized when the dataset is complete). The size of the XML scales with the number of images in the dataset, of course.
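If you want a rough sense of how much OME-XML a finished dataset actually carries, you can inspect the first stack file with the tifffile Python package; the filename below just follows Micro-Manager’s usual naming pattern, so substitute your own:

```python
# Rough check of how large the OME-XML block in a finished MM dataset is.
# The filename is a placeholder following Micro-Manager's usual naming scheme.
import tifffile

with tifffile.TiffFile("my_dataset_MMStack_Pos0.ome.tif") as tif:
    ome_xml = tif.ome_metadata  # full OME-XML stored in the first page's ImageDescription
    if ome_xml is not None:
        print(f"OME-XML size: {len(ome_xml) / 1e6:.1f} MB")
```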

Finishing large (hundreds of GB) acquisitions on our diSPIM usually takes a decently long time, but I feel like it’s on the order of tens of seconds rather than minutes. Do you have different physical drives you can test writing to? I wouldn’t be surprised to find that writing metadata to a large HDD is significantly slower than to an SSD.
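If it helps to compare drives, here is a crude sequential-write check in Python; the path and sizes are placeholders, and since finalization also involves many small metadata writes and seeks, treat the number as a rough indicator only:

```python
# Crude sequential-write benchmark to compare drives; path and sizes are placeholders.
# Dataset finalization also does many small writes/seeks, so this is only a rough proxy.
import os, time

path = r"E:\bench\testfile.bin"      # hypothetical target on the drive to test
chunk = b"\0" * (64 * 1024 * 1024)   # 64 MB chunks
total_gb = 4                         # write 4 GB in total

t0 = time.time()
with open(path, "wb") as f:
    for _ in range(total_gb * 1024 // 64):
        f.write(chunk)
    f.flush()
    os.fsync(f.fileno())             # make sure data actually hit the disk
elapsed = time.time() - t0
print(f"{total_gb * 1024 / elapsed:.0f} MB/s sequential write")
os.remove(path)
```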

If this is really a big issue for such large datasets (there are very few people actually writing many TBs in a single dataset), then we may want an option to switch this “finalizing” off (I doubt that Bin will ever want to open the data with ImageJ or OME, so nothing of value to him will be lost). I guess that another part of the reason for the long duration is the enormous redundancy in metadata (each image duplicates the metadata almost verbatim).

Might be easier to just build in an option for saving to NDTiffStorage. I branched it off and removed the OME support in part for this reason: I wanted to make it easier to collect TB-scale datasets without the overhead of additional features.

That sounds like a good idea. @marktsuchida what do you think?

We use an SSD as the system drive and an HDD array (RAID0 of 8 HDDs) for data storage. The two drives have similar write speeds.
I just tried saving datasets of the same size (~160 GB) to each of them separately. The closing time was ~1 minute in both cases.
This is proportional to the closing time of tens of minutes that I experienced when closing datasets of a few TBs.

Indeed, as Nico said, not many people are writing many TBs in a single dataset, though this could change as more people need to do so in the future.

If one can save to NDTiffStorage in MM, that would be great.