Running CellProfiler on Condor without file sharing

Hello again!

I am currently trying to distribute a large image analysis across Condor, a distributed computing system that uses downtime on public computers within the university. However, the system is limited: it does not give access to any shared filespace. Individual files can be transferred to each computer, but directory structures cannot. This means that to each computer I have to send the standalone command-line ZIP utility 7za.exe, a ZIP file containing the 32-bit compiled CellProfiler (as created by the installer), and the images for analysis. All the computers are running Windows XP.

This has worked pretty well, but a significant minority of the computers return the following on stderr: “The system cannot execute the specified program.”

I presume this has something to do with the Visual C++ runtime libraries being unavailable. I got the CAB file from the redistributable installer, managed to extract lots of nosxs_*.dll files from it, and tried distributing those in the CellProfiler directory, to no avail.

Do you have any other suggestions?

Thanks,

Andrew

Just to note that this is now resolved. It transpired that some of the computers had insufficient disk space, so CellProfiler could not be fully decompressed!

Andrew
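
For anyone who hits the same thing, here is a minimal sketch of a pre-flight check that compares the uncompressed size of a ZIP archive with the free space in the working directory before extracting. It is not part of CellProfiler or Condor. The function names, the safety_factor parameter and its default of 1.2 are all made up for illustration, and it assumes a Python interpreter (3.3 or later, for shutil.disk_usage) is available on the worker, which nothing in this thread confirms.

```python
import shutil
import zipfile


def uncompressed_size(zip_path):
    """Return the total size in bytes of all archive members once extracted."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def has_room_for(zip_path, target_dir=".", safety_factor=1.2):
    """Return True if target_dir has enough free space to extract zip_path.

    safety_factor is a hypothetical margin for temporary files and output;
    it is a guess, not a measured requirement.
    """
    needed = uncompressed_size(zip_path) * safety_factor
    free = shutil.disk_usage(target_dir).free
    return free >= needed
```

A wrapper script could call has_room_for on the CellProfiler ZIP before invoking 7za.exe and exit with a distinctive error message when it returns False, so that a full disk is not mistaken for a missing runtime library. The uncompressed size could also be computed once on the submitting machine, since it depends only on the archive.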

I’m glad to hear this is resolved. I’m currently working on revising CellProfiler to use multiprocessing, and will hopefully be able to build on this work to make distributed computing possible as well. In your case, it might simplify some of the distribution of work, because in the new model the main computer (i.e., your desktop running CellProfiler) acts as an image server for the distributed machines. All reads and writes of image data go through the main system, so there should be no more need to distribute the images.

It will be a while before this is ready for testing, but when it is, if you’re interested, it would be great to have someone knowledgeable who can help test it. There are always system-specific bugs hiding in code like this, and the wider the set of systems we can test it on, the more we can squash before releasing it generally.