Error when grouping images

cellprofiler
loadimages

#1

Hi,

I have been setting up various tracking pipelines and always run into the problem that grouping does not work when using the LoadImages module. It works like a charm with the newer integrated modules, but the same logic does not work in the old LoadImages module. The reason I use the old module is that I want to run it headless on a cluster, and with very long or fast movies, loading takes an insane amount of time. This issue is several years old; it would be nice to fix it.
Thanks


#2

Hi,
Can you explain a bit more about this issue? Is it this problem? Could you perhaps take advantage of batch files instead?
Can you use LoadData to load your movies?

I’d have to check with our software team to be sure, but given everything on their plate I have my doubts whether they’ll have time to add new functionality to a legacy module anytime soon. My guess is that LoadImages may not even be available in CP 3.0, though again I’m not sure about that.


#3

Hi,
I hope LoadImages is not dumped; that would make CellProfiler incompatible with cluster computing and essentially make it useless for HCS. It’s not a legacy module if the software itself suggests using it for large numbers of images.
It seems that the error was fixed and it was my specification of the image sets that was wrong. I apologise for this. Because it didn’t group images correctly in the past, I assumed the error was the same. Here it seems to work. I have only one channel though, so I’ll try with multiple channels and let you know.

Bye
Marc


#4

I hope LoadImages is not dumped; that would make CellProfiler incompatible with cluster computing and essentially make it useless for HCS.

It’ll certainly still be possible to run CP in a cluster environment even if LoadImages goes away; you can use

  • LoadData + a CSV file (that CP itself will generate for you using the first four image modules if you like)
  • The first four image modules with CreateBatchFiles
  • The first four image modules plus --file-list
  • The first four image modules plus simply passing CP the name of the folder where all the images are (new in 3.0, and one of the big advantages LoadImages had)
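For the LoadData option, here is a rough sketch of building such a CSV yourself on the cluster side. As far as I know, LoadData expects columns named `Image_FileName_<name>` and `Image_PathName_<name>`, plus optional `Metadata_<key>` columns that can be used for grouping. The file-name pattern, the channel name `DNA`, and the folder below are hypothetical and only there for illustration; adapt them to your own movies.

```python
import csv
import io
import re

# Hypothetical file-name pattern: <well>_t<frame>.tif, e.g. "A01_t003.tif".
PATTERN = re.compile(r"(?P<well>[A-P]\d{2})_t(?P<frame>\d+)\.tif$")


def loaddata_rows(folder, filenames, channel="DNA"):
    """Build one row per image, skipping files that do not match PATTERN."""
    rows = []
    for name in filenames:
        match = PATTERN.match(name)
        if match is None:
            continue
        rows.append({
            f"Image_FileName_{channel}": name,
            f"Image_PathName_{channel}": folder,
            "Metadata_Well": match.group("well"),
            "Metadata_Frame": int(match.group("frame")),
        })
    # Sort numerically by frame within each well so that t2 comes before t10.
    rows.sort(key=lambda r: (r["Metadata_Well"], r["Metadata_Frame"]))
    return rows


def write_csv(rows, handle):
    """Write the rows as a CSV with a header line."""
    if not rows:
        return
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    names = ["A01_t10.tif", "A01_t2.tif", "B02_t1.tif", "notes.txt"]
    buffer = io.StringIO()
    write_csv(loaddata_rows("/data/movies", names), buffer)
    print(buffer.getvalue())
```

With a CSV like this, grouping by `Metadata_Well` would keep each movie together; whether that fits your setup depends on how your files are named.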

I’m not aware of any guidance that specifically recommends LoadImages for large image sets, and I can’t find it easily, but I’d definitely be interested in learning more about your particular use case and why you think it’s important the module be kept. I don’t think it’s been decided yet whether or not LoadImages will stay in 3.0, but the more compelling use cases we have for any given functionality in the software, the more likely it’ll survive the upgrade.


#5

Hi Beth,

Not sure that anything you mention will work in our cluster environment. The head node handles the dispatching to the nodes, and there is no reach-through from the work directory where the images are stored. This is a pretty standard way to set up a cluster. CreateBatchFiles is meant for Windows, which is not an environment for clusters. I’ll discuss with our HPC guys to see what will work and what will not.
Thank you in any case for looking into this. It’s never easy; I know of a few places that are struggling to put CP on their clusters.
Bye,
Marc

Marc Bickle, PhD
Head HT-Technology Development Studio
Max Planck Institute of Molecular Cell Biology and Genetics
Pfotenhauerstr. 108
D-01307 Dresden
Germany

phone +49 351 210 2595
fax +49 351 210 1689
mail bickle@mpi-cbg.de
web: http://www.mpi-cbg.de/facilities/profiles/ht-tds.html

DRESDEN-concept Technology Platform:
https://tp.dresden-concept.de


#6

Hi Marc,
A late follow up, but I couldn’t let this go…

CreateBatchFiles is actually meant for exactly the sort of case you mentioned: converting a pipeline created in a local Windows/Mac environment to run on a Linux cluster. CreateBatchFiles simply needs the root paths in each environment, and it will do the rest: convert backslashes to forward slashes, and replace the root paths.
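To make the idea concrete, here is a minimal sketch of that kind of path translation. This is not CellProfiler's actual code, just an illustration of the two steps (slash conversion and root replacement); the function name and the example roots are made up, and it ignores details such as Windows case-insensitivity.

```python
def to_cluster_path(local_path, local_root, cluster_root):
    """Map a local (possibly Windows) path onto the cluster file system."""
    # Step 1: convert backslashes to forward slashes.
    norm_path = local_path.replace("\\", "/")
    norm_root = local_root.replace("\\", "/").rstrip("/")

    # Only paths underneath the local root can be translated.
    if norm_path != norm_root and not norm_path.startswith(norm_root + "/"):
        raise ValueError(f"{local_path!r} is not under {local_root!r}")

    # Step 2: replace the local root with the cluster root.
    return cluster_root.rstrip("/") + norm_path[len(norm_root):]


if __name__ == "__main__":
    # Hypothetical roots: a Windows share locally, a mount point on the cluster.
    print(to_cluster_path(r"C:\data\plate1\A01_t001.tif", r"C:\data", "/mnt/imaging"))
```

So the only things you would need to supply are the two root paths, one as seen from the workstation and one as seen from the compute nodes.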

Cheers,
David


#7

Hi David,

Maybe I should go through this with our cluster expert and see what he thinks. I’ll come back to you with his answers.

As a note, we are running CP now on our GPU nodes which is pretty cool.

Bye,

Marc



#8

GPUs, cool! Let us (I say ‘us’ though I am no longer an official CP team member!) know how well it improves your efficiency. I didn’t think that the code was optimized for GPUs yet, but I hadn’t been following that aspect.

Cheers,
David


#9

Hi David,

Yes, the guy looking after the cluster asked for a comparison too. I guess I’ll have to knuckle down and do it. I’ll try to prepare it for the weekend, when the queues are empty.

bye

Marc
