Temporary files storage and Batch processing on Orbit

Hi all,
While using Orbit to process very large files (whole scanned slides), the software managed to “eat” >150 GB of hard drive space in the space of 3 h while processing 7 files.
I found the place where the temporary files were stored and removed them. Is there a setting that needs changing to avoid this problem, or a way to direct those temporary files somewhere else if they are unavoidable?
Moreover, the loaded model does not start processing the slides when the Batch command is given.
Any help is welcome, thanks in advance
Ema

Hi Ema,

Can you share a bit of info on the image size (e.g. x and y pixel size, number of channels)?

How are you processing the files?

Also, how much memory does your machine have available? Are you using it all (see here for details: Orbit memory settings)?

Thanks,

Jon

Dear @Emasan,

thanks for using Orbit.
As Jon suggested, can you please provide more details, e.g. what kind of model are you running (classification, segmentation… ?).

Are you using the Batch processing or are you processing the files within the UI?

That huge amount of temporary storage should not be needed, and should not be required at all for batch processing.
In case you really want to switch the temp folder:

Start Orbit from command line via

java -XX:MaxPermSize=150m -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -cp "build/libs/orbit-image-analysis.jar;build/libs/lib/*" com.actelion.research.orbit.imageAnalysis.components.OrbitImageAnalysis

(you might have to change build/libs/lib to /lib depending on which distribution you have; the ; classpath separator shown is the Windows one, on Linux or macOS use : instead)
and add
-Djava.io.tmpdir=/path/to/tmpdir
to set a different temp dir.
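
To confirm that the option is picked up, a small standalone check can help. This is only a sketch: `TempDirCheck` is a hypothetical helper and not part of Orbit, and it assumes Orbit writes its temporary tiles to the JVM's default temp directory, as the suggestion above implies. It prints the directory the JVM resolves from `java.io.tmpdir` and the usable space on that volume.

```java
import java.io.File;

// Hypothetical helper, not part of Orbit: reports where this JVM puts temp files.
class TempDirCheck {

    // Directory the JVM uses for temporary files (overridden by -Djava.io.tmpdir).
    static File tempDir() {
        return new File(System.getProperty("java.io.tmpdir"));
    }

    // Usable space on the volume holding dir, in GB (0 if the path does not exist).
    static double usableGb(File dir) {
        return dir.getUsableSpace() / 1e9;
    }

    public static void main(String[] args) {
        File dir = tempDir();
        System.out.printf("Temp dir: %s%n", dir.getAbsolutePath());
        System.out.printf("Usable space: %.1f GB%n", usableGb(dir));
    }
}
```

With Java 11 or newer it can be run directly from source using the same option as for Orbit, e.g. `java -Djava.io.tmpdir=/path/to/tmpdir TempDirCheck.java`. If the printed directory is not the one you passed, the option is probably in the wrong place on the command line (it has to come before the class name).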

Regards,
Manuel

Dear Manuel and Jon,
thanks for your reply.
In brief:
-I am running both classification and segmentation
-I am analysing whole scanned slides of 2 GB each, scanned at 40x
-I am analysing DAB IHC with hematoxylin nuclear counterstain, but I am still using the default setting (in the right-hand column, under Adjustments): color deconvolution H&E. Does this parameter need changing for each individually processed scanned slide? Would changing it accelerate or improve the computation?

  • Analysis of individual files is not problematic, but requesting a Batch measurement produces no progress or action.
  • The sudden drop in hard drive space due to storage of tile files only happens if I ask Orbit to analyse 4 whole scanned slides at the same time. I will try your approach.
  • I have an AMD Ryzen 9 12-core CPU, 32 GB RAM, an NVIDIA GeForce 1660 with 6 GB, and a 250 GB SSD as hard drive
    -Generation of a model for both classification and segmentation: “a walk in the park”
    -Classification: ~40 min for an entire scanned slide (with an exclusion model as well) using 4 classes
    -Segmentation: not possible on an entire slide, which is my concern. To work around the issue, as I write, I am drawing annotations of a set size and using them as ROIs (10 ROIs of 5000 px per scanned slide, in specific anatomical areas)
    -RAM usage spikes to 70% of the total available

I hope this answers some of your queries
Emasan
PS: Thanks for the assistance and for the lovely Software :slight_smile:

Dear @Emasan,

thanks for all the details, that really helps.
Actually the GUI-based analysis (classification and/or segmentation) is not meant for analyzing several slides. In this mode many temporary results (e.g. classification overlay tiles, segmentation outlines, …) are stored, which is why it uses that much space. Usually you would select a small ROI (using the yellow temporary ROI tool) and test if your analysis method works within that small part of the image.
Then, when you’re convinced, you use the batch mode to analyze a series of images. In that mode no temporary data is stored and depending on your analysis method you simply get class ratios or cell counts as output.

In batch mode you see a progress bar on the right side. But this only updates per image, e.g. if you analyze four images it will jump to 25% after the first image has finished. And if you assume one image takes 40 min, it will stay at 0% for 40 min…
Can you please check the files orbit.out.log and orbit.err.log in your user-home directory? These files are updated while processing and you should see if there’s an error. Also very useful for debugging in general…
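
If it is easier than opening the files by hand, the last lines of both logs can be printed with a few lines of Java. This is just a sketch: `OrbitLogTail` is a hypothetical helper and not part of Orbit, the file names are the ones mentioned above, and it assumes the logs are UTF-8 text that fits comfortably in memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of Orbit: shows the end of the two Orbit log files.
class OrbitLogTail {

    // Returns the last n entries of lines, or all of them if there are fewer than n.
    static List<String> tail(List<String> lines, int n) {
        return lines.subList(Math.max(0, lines.size() - n), lines.size());
    }

    public static void main(String[] args) throws IOException {
        Path home = Paths.get(System.getProperty("user.home"));
        for (String name : new String[] {"orbit.out.log", "orbit.err.log"}) {
            Path log = home.resolve(name);
            System.out.println("== " + log + " ==");
            if (!Files.isRegularFile(log)) {
                System.out.println("(not found)");
                continue;
            }
            // Assumption: UTF-8 content, small enough to read in one go.
            List<String> lines = Files.readAllLines(log, StandardCharsets.UTF_8);
            tail(lines, 20).forEach(System.out::println);
        }
    }
}
```

Running it again a few minutes into a batch run should show whether new lines are being written, which helps tell a slow job apart from one that never started.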

Btw: you can also open your image at a lower resolution (Image -> Open special resolution) and see if your segmentation works at a lower resolution, which speeds up your analysis.
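
As a rough illustration of why this helps: halving the resolution along both axes leaves a quarter of the pixels. The sketch below uses the 5000 px ROI size from the post above and assumes a square ROI. `ResolutionEstimate` and the downsample factors are hypothetical and do not refer to Orbit's actual resolution levels, and the real speed-up depends on the analysis, not only on the pixel count.

```java
// Hypothetical helper, not part of Orbit: pixel counts at reduced resolution.
class ResolutionEstimate {

    // Number of pixels left after shrinking both axes by the given factor.
    static long pixelsAt(long width, long height, int downsample) {
        return (width / downsample) * (height / downsample);
    }

    public static void main(String[] args) {
        long side = 5000; // ROI edge length from the thread, assumed square
        for (int factor : new int[] {1, 2, 4}) {
            System.out.printf("1/%d resolution: %d pixels%n",
                    factor, pixelsAt(side, side, factor));
        }
    }
}
```

For the 5000 px ROI this goes from 25 million pixels at full resolution to 6.25 million at half resolution.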

So to conclude - you have to get the batch mode working!
And as an outlook: Orbit has scale-out possibilities (e.g. on a Spark cluster) to perform distributed computation on several machines.

Cheers,
Manuel