Autocontext workflow crashes on Windows

Hi,
I’m trying to use the Autocontext workflow to segment some EM data. I encountered 2 issues:

  1. The Autocontext workflow crashes on Windows machines during the second training step, after “Live Update” is pressed. The program hangs in “not responding” mode, and the green indicator bar moves back and forth forever (see attached image).
    On some Windows machines this causes the display driver to fail, after which I need to restart the system. I’m using the latest version of ilastik. The same workflow works fine with the beta version that is on your website, and on Linux it also works fine. But I’m interested in using the Windows version, as most of the users of our microscopy infrastructure are not familiar with Linux.
  2. The second problem is the speed of the Autocontext procedure (and, to some extent, Pixel Classification). This is especially evident in the Windows version. I’m working with a 750 MB dataset (xyz, single-channel EM data). The step that takes forever is waiting for the display to update after one scrolls to the next z section or toggles some of the “visibility” categories. For example, after doing the labeling and pressing the “Live Update” button, one typically waits around 1 hour until the display is updated with the default visibility options. On Linux this is reduced to ca. 10 min for the same dataset, but that is still rather slow. The calculations seem to finish quickly (there is a green progress bar that goes rapidly to 100%), but then one waits for an hour until the pixels are colored yellow or blue… It seems like a bug to me; I’m not sure what your typical datasets are, but 750 MB is not on the large side. It does not seem to be an issue with hardware resources, as I have tested this on several systems with different CPU/GPU/RAM combinations (some with 256 GB of RAM or more). Although Autocontext takes a lot of RAM, the RAM/CPU resources have never been used at more than 20–30%.
    Thanks!

Hi @metavibor,

first of all, 750 MB is not a particularly large dataset. May I ask which file format you are using? You’ll get the best performance using HDF5.
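If your data is still in TIFF, a small script can get it into a chunked HDF5 file. This is a minimal sketch, not an official ilastik tool — the file names and the dataset name "data" are placeholders, and the synthetic volume stands in for a stack you would load with e.g. `tifffile.imread`:

```python
# Sketch: write a 3D volume to a chunked, compressed HDF5 file.
# Chunked storage lets the viewer read small blocks instead of whole planes.
import numpy as np
import h5py

# In practice: vol = tifffile.imread("stack.tif")  # shape (z, y, x)
vol = np.random.randint(0, 255, size=(128, 128, 128), dtype=np.uint8)

with h5py.File("em_volume.h5", "w") as f:
    f.create_dataset(
        "data",
        data=vol,
        chunks=(64, 64, 64),      # roughly cubic blocks suit 3D access
        compression="gzip",
        compression_opts=1,       # light compression, fast to decode
    )

with h5py.File("em_volume.h5", "r") as f:
    print(f["data"].shape, f["data"].chunks)  # (128, 128, 128) (64, 64, 64)
```

The chunk size is a tuning knob; roughly cubic chunks tend to work better for 3D browsing than per-slice chunks.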

I have summarized a few hints on getting better performance in ilastik here.
In addition, there is a setting available via the menu View -> Set tile width. For 3D data you can set this to 256 to make updates a bit snappier.

We haven’t noticed this speed difference between Linux and Windows before. Having said that, almost all developers here use Linux, so we’ll have to look into this.

On what kind of machine are you running ilastik (the Windows one)?

Hi k-dominik,
I did some more testing, implementing your recommendations. I’m using the latest version of ilastik (1.3.3post2). I was initially using TIFFs, which I have now converted to HDF5 using the Fiji plugin. I changed the “Set tile width” setting from 512 to 256, and I copied the data file into the project (so there is no relative link).
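As an aside for anyone checking a converted file: the chunk layout the converter wrote can matter for 3D browsing, since per-slice chunks force a full-plane read for every block. A minimal sketch (file and dataset names are hypothetical, and the slice-chunked input is simulated here) that inspects and rewrites the layout into roughly cubic chunks:

```python
# Sketch: re-chunk an HDF5 volume from per-slice chunks to cubic chunks.
import numpy as np
import h5py

# Stand-in for a converter's output: a z-stack chunked slice by slice.
with h5py.File("export.h5", "w") as f:
    vol = np.zeros((128, 128, 128), dtype=np.uint8)
    f.create_dataset("data", data=vol, chunks=(1, 128, 128))

# Rewrite the same data with roughly cubic chunks.
with h5py.File("export.h5", "r") as src, \
     h5py.File("rechunked.h5", "w") as dst:
    ds = src["data"]
    print("original chunks:", ds.chunks)          # (1, 128, 128)
    dst.create_dataset("data", data=ds[...],
                       chunks=(64, 64, 64), compression="gzip")

with h5py.File("rechunked.h5", "r") as f:
    print("new chunks:", f["data"].chunks)        # (64, 64, 64)
```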
Even with these measures, none of my Windows workstations were able to complete the Autocontext workflow. They would either crash (ilastik would close itself), or I would stop ilastik via the Task Manager after ca. 1 hour of no changes (see the screenshot of how the interface looks just before I stopped ilastik).

ilastik would crash after clicking “Live Update” during the second training step. I think the Autocontext workflow is broken on Windows. The 3 workstations where I did the testing are:
1. Win10, CPU: Intel i9-9940X @ 3.3 GHz (14 cores), RAM: 128 GB, GPU: Titan RTX
2. Windows Server 2012R, CPU: Intel Xeon E5-2683 v4 @ 2.1 GHz (16 cores), RAM: 256 GB, GPU: Quadro M5000
3. Win10, CPU: Intel i7-3770 @ 3.4 GHz (4 cores), RAM: 32 GB, GPU: GeForce GTX 680

On workstation 1 I have installed Linux (dual boot with Win10), so I can test the Linux version of ilastik on the same hardware. There, at least, the Autocontext workflow completes without crashes, and it takes ca. 10 min to get the first results after clicking “Live Update” during the second training step. However, if I label a few more pixels, or scroll to a nearby z slice to see how the program performed, I have to wait a variable amount of time, in the range of 10 min or so. So I was not getting any significant speed improvement compared to before, when I was using TIFFs.
The situation got worse when I wanted to add a few more datasets for training, in order to be able to predict segmentation on new datasets. After adding the second training set, the Linux version also stopped responding during “Live Update” in the second training step (and I did not do any additional labeling; I just wanted to see how the segmentation looks on a new dataset)… Is this the typical speed one can expect from the Autocontext workflow on a 750 MB 3D dataset?
Thanks!