Problem with RAID10?

Our collaborators have bought a new desktop PC (2 processors, 12 cores, 24 logical processors). Performance when running a CellProfiler pipeline is remarkably bad: it takes about twice as long as on the old (4-core) laptop, using the same CP version 2.1.0. All workers are used, but CPU usage is very low during almost the entire pipeline (see attached snapshot). The PC's disks are configured as RAID10. Do you think that could cause the problem?


Hi Petter,
One thing that could slow things down is CP’s use of temporary files. Measurements are written to a temporary HDF file (since the release prior to 2.1.0). I’m not flushing the HDF file very often, but perhaps something underneath is insisting that the disk record every change to the HDF file as soon as it happens. The result would be a disaster for RAID: every minor value change has to be written to each mirrored copy, and on parity-based RAID levels the hardware would also have to recompute parity for a whole block.
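
To check whether forced syncing is what hurts on a given disk, something like the rough sketch below might help. It does not use HDF5 or CellProfiler at all; it only times many small writes to a scratch file, with and without forcing each write to disk via os.fsync. The function name time_small_writes and its parameters are made up for illustration, and the timings will vary a lot with the OS and the RAID controller's cache settings.

```python
import os
import tempfile
import time


def time_small_writes(directory, n_writes=200, sync_each=False):
    """Time n_writes 8-byte writes to a scratch file in `directory`.

    If sync_each is True, every write is flushed and forced to disk
    with os.fsync, which mimics a layer that insists on recording
    each small change immediately.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for i in range(n_writes):
                f.write(i.to_bytes(8, "little"))
                if sync_each:
                    f.flush()             # push Python's buffer to the OS
                    os.fsync(f.fileno())  # ask the OS to push it to the disk
        return time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the scratch file


if __name__ == "__main__":
    target = tempfile.gettempdir()  # or a directory on the disk under test
    print("buffered:", time_small_writes(target))
    print("synced:  ", time_small_writes(target, sync_each=True))
```

Running it once against a directory on the RAID10 volume and once against a non-RAID disk should show whether synced small writes are unusually slow on the array.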

If possible, they should set their temporary directory to point to a disk that’s not RAID (via $TMPDIR or /tmp, or in 2.1.1 with the “-t” switch to set the temporary directory). I think it would be good for the machine’s health overall to create a partition for temp files that is fast and treats the data as not mission-critical.
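
As a quick sanity check that a TMPDIR setting is actually picked up, a small Python snippet like the one below could be run on the machine. It assumes CP's temporary files go wherever Python's standard tempfile module points, which I have not confirmed for every code path; the helper name resolved_temp_dir is made up for illustration.

```python
import os
import tempfile


def resolved_temp_dir(tmpdir=None):
    """Return the temp directory Python's tempfile module would use.

    If tmpdir is given, TMPDIR is set to it first (for this process only).
    tempfile silently falls back to other locations when the directory
    does not exist or is not writable, so compare the result with what
    you asked for.
    """
    if tmpdir is not None:
        os.environ["TMPDIR"] = tmpdir
    tempfile.tempdir = None  # drop the cached value so the environment is re-read
    return tempfile.gettempdir()


if __name__ == "__main__":
    print("temp files will go to:", resolved_temp_dir())
```

If the printed path is still on the RAID volume after setting TMPDIR, the variable is probably not reaching the process or the target directory is not writable.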

HDF5 has some configuration options that might help performance if we set them - I’ll file an issue for it. I might be able to give them a small test program that they could run with different HDF5 configurations to see if setting the options helps.

Hope this helps. Let me know if they try changing their temp directory and things still run slowly.

–Lee

Hi Lee,
Thanks for the reply. I have reported this to the collaborators. Hopefully they will be able to try this soon.
/Petter