OMERO: Optimizing upload speed

Dear OME team,

We noticed that the import of a 1Gb file using OMERO.insight or using the CLI takes approximately 16s (without the metadata reading part), corresponding to an upload speed of 62Mb/s.

We have been testing our OMERO upload speed on a Windows computer connected to internet with a 10Gb copper cable.
The network bandwidth between omero-server.epfl.ch and the computer is good:

image

With multiple parallel sessions, we are almost reaching the 10Gb:

We then measured the speed at which our OMERO server is able to write a 1Gb file on our NFS storage server (synology), and the result is satisfying (497 Mb/s):

So, we are wondering what is going wrong here and why OMERO.insight and the CLI are really not reaching the maximal transfer capacities of our installation?
Is there something to tune on the application to benefit from higher upload speeds here?

All the best,
Claire

6 Likes

Hi Claire,

Thanks for all of the background measurements. Definitely good to eliminate all of that first.

A few questions: What options did you use during import? Do you know if import was happening over an encrypted connection? Can you tell us more about what you measured as being “16s”? Can you send us the output of omero fs importtime for one of the filesets?

e.g.

omero fs importtime 3135
...
   upload time of 194.21s for 153 files (1.269s/file)
    setId time of  51.13s
 metadata time of   0.43s
   pixels time of 5627.71s for 21736 planes (0.259s/plane)
    rdefs time of  13.51s for 1 rendering setting (13.512s/rdef)
thumbnail time of   1.00s for 1 thumbnail (1.002s/thumbnail)

All the best,
~Josh

3 Likes

Hi Josh,

I tested to upload a 1GB file stored in /tmp of Omero server which is on local SSD drive, I used omero CLI on port 4064 which is encrypted.

(venv3) omero@xxx{Production}:tmp$ DID=$(omero obj new Dataset name=import_speed_test2)
Using session for nbarrier@localhost:4064. Idle timeout: 10 min. Current group: sv-it
(venv3) omero@xxx{Production}:tmp$ omero import -d $DID 1GB_black.tif
Using session for nbarrier@localhost:4064. Idle timeout: 10 min. Current group: sv-it
2021-04-30 14:20:50,506 234        [      main] INFO          ome.formats.importer.ImportConfig - OMERO.blitz Version: 5.5.8
2021-04-30 14:20:50,520 248        [      main] INFO          ome.formats.importer.ImportConfig - Bioformats version: 6.5.1 revision: 6f50e4d52c9d96112635fd8b2dde737f31041cf0 date: 7 July 2020
2021-04-30 14:20:50,559 287        [      main] INFO   formats.importer.cli.CommandLineImporter - Log levels -- Bio-Formats: ERROR OMERO.importer: INFO
2021-04-30 14:20:50,825 553        [      main] INFO      ome.formats.importer.ImportCandidates - Depth: 4 Metadata Level: MINIMUM
2021-04-30 14:20:50,854 582        [      main] ERROR                  loci.formats.ImageReader -  *** One or more readers is misbehaving. See the debug output for more information. e.g.:
     loci.formats.in.APLReader@75d2da2d -> java.lang.NullPointerException('null') ***
2021-04-30 14:20:51,004 732        [      main] INFO      ome.formats.importer.ImportCandidates - 1 file(s) parsed into 1 group(s) with 1 call(s) to setId in 174ms. (179ms total) [0 unknowns]
2021-04-30 14:20:51,034 762        [      main] INFO       ome.formats.OMEROMetadataStoreClient - Attempting initial SSL connection to localhost:4064
2021-04-30 14:20:51,397 1125       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Insecure connection requested, falling back
2021-04-30 14:20:51,733 1461       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Pinging session every 300s.
2021-04-30 14:20:51,749 1477       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Server: 5.6.3
2021-04-30 14:20:51,749 1477       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Client: 5.5.8
2021-04-30 14:20:51,749 1477       [      main] INFO       ome.formats.OMEROMetadataStoreClient - Java Version: 11.0.10
2021-04-30 14:20:51,749 1477       [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Name: Linux
2021-04-30 14:20:51,749 1477       [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Arch: amd64
2021-04-30 14:20:51,749 1477       [      main] INFO       ome.formats.OMEROMetadataStoreClient - OS Version: 4.18.0-240.22.1.el8_3.x86_64
2021-04-30 14:20:52,004 1732       [      main] WARN                    ome.system.UpgradeCheck - UPGRADE AVAILABLE:Please upgrade to 5.6.1. See http://downloads.openmicroscopy.org/latest/omero for the latest version.
2021-04-30 14:20:52,222 1950       [      main] INFO          ome.formats.importer.ImportConfig - Using import target: Dataset:75
2021-04-30 14:20:52,257 1985       [2-thread-1] INFO         ome.formats.importer.ImportLibrary - Import target specifies container: Dataset:75
2021-04-30 14:20:52,263 1991       [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_PREPARATION
2021-04-30 14:20:52,584 2312       [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_START
2021-04-30 14:20:52,608 2336       [3-thread-1] INFO   ts.importer.transfers.UploadFileTransfer - Transferring /tmp/1GB_black.tif...
2021-04-30 14:20:52,638 2366       [3-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_STARTED: /tmp/1GB_black.tif
2021-04-30 14:21:10,441 20169      [3-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_COMPLETE: /tmp/1GB_black.tif
2021-04-30 14:21:10,508 20236      [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_END
2021-04-30 14:21:10,581 20309      [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_STARTED Logfile: 1259
2021-04-30 14:21:10,797 20525      [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - METADATA_IMPORTED Step: 1 of 5  Logfile: 1259
2021-04-30 14:21:19,054 28782      [l.Client-1] INFO   ormats.importer.cli.LoggingImportMonitor - PIXELDATA_PROCESSED Step: 2 of 5  Logfile: 1259
2021-04-30 14:21:19,373 29101      [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - THUMBNAILS_GENERATED Step: 3 of 5  Logfile: 1259
2021-04-30 14:21:19,396 29124      [l.Client-1] INFO   ormats.importer.cli.LoggingImportMonitor - METADATA_PROCESSED Step: 4 of 5  Logfile: 1259
2021-04-30 14:21:19,419 29147      [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - OBJECTS_RETURNED Step: 5 of 5  Logfile: 1259
2021-04-30 14:21:19,515 29243      [l.Client-1] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_DONE Imported file: /tmp/1GB_black.tif
Image:274
Other imported objects:
Fileset:267

==> Summary
1 file uploaded, 1 fileset created, 1 image imported, 0 errors in 0:00:27.518

The time to upload (FILESET_UPLOAD_START - FILESET_UPLOAD_END) is around 18 seconds.

Here is the result of fs command:

(venv3) omero@xxx{Production}:tmp$  omero fs importtime 75
Using session for root@localhost:4064. Idle timeout: 10 min. Current group: system
   upload time of   0.68s for 1 file (0.676s/file)
    setId time of   0.20s
 metadata time of   0.03s
   pixels time of   0.18s for 2 planes (0.089s/plane)
    rdefs time of   0.06s for 1 rendering setting (0.060s/rdef)
thumbnail time of   0.09s for 1 thumbnail (0.092s/thumbnail)

Best,
Nicolas

3 Likes

Hi Nicolas,

thanks for the information. It’s weird that you get 18s upload time from the log output, but omero fs importtime reports only 0.68s. I couldn’t replicate that unfortunately. Tried the same with a 1Gb tiff file:

omero import -d $DID /tmp/test_file.tiff 
...
2021-05-03 11:51:11,395 2861       [3-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_STARTED: /tmp/test_file.tiff
2021-05-03 11:51:27,651 19117      [3-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_COMPLETE: /tmp/test_file.tiff
2021-05-03 11:51:27,740 19206      [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_END
...
omero fs importtime 252
   upload time of  16.45s for 1 file (16.449s/file)

But I noticed that half of the upload time is actually due to checksum being calculated/compared. When I skipped that omero import -d $DID --skip=checksum /tmp/test_file.tiff it reduced the upload time to upload time of 7.01s for 1 file (7.009s/file) . There’s still quite some overhead, as a simple copy on the filesystem is only 0.02s. But it would definitely be an improvement, if you want to give that a try.

Kind Regards,
Dominik

3 Likes

Hi Dominik,

I did the same test with skipping checksum and it also improved the upload time by half (18sec → 9sec):

2021-05-04 08:37:01,928 2439       [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_START
2021-05-04 08:37:01,947 2458       [3-thread-1] INFO   ts.importer.transfers.UploadFileTransfer - Transferring /tmp/1GB_black.tif...
2021-05-04 08:37:01,981 2492       [3-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_STARTED: /tmp/1GB_black.tif
2021-05-04 08:37:10,918 11429      [3-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILE_UPLOAD_COMPLETE: /tmp/1GB_black.tif
2021-05-04 08:37:10,980 11491      [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - FILESET_UPLOAD_END

It seems that we have close upload time for a 1GB tiff file.
I would like to be sure the performance is normal/optimal on our production server which has the following configuration:
2 x Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
128 GB RAM
480GB RAID1 SSD drives
2 x 10Gb fiber LACP

Best,
Nicolas

2 Likes

Ok. Though there’s clearly room for improvement, good to hear we’re now seeing roughly the same thing. Certainly your hardware specs seem fine.

If anyone is interested in having a go at optimizing the method, e.g. using a reader/writer strategy or using parallel streams, you can see the calling code in

this could even be added as a new/experimental transfer option just by adding it to the classpath:

omero import --transfer=my.pkg.FasterTransfer ...

See AbstractFileTransfer for the details.

~Josh

1 Like