Distributed CellProfiler not initiating on AWS

Hi,
I have been trying to get Distributed CellProfiler up and running on AWS and I’ve reached an issue that I haven’t been able to resolve after a week of googling and tweaking.

I’ve gone through all of the steps on the wiki page on setup and configuration and can create a spot fleet, but it never runs. After starting the cluster, when I monitor the spot fleet using “xxxxSpotFleetRequestId.json”, it returns “In process: 0 Pending 1” and stays like this. Occasionally it changes to “In process: 1 Pending 0” for about 30 seconds, but then goes back to “In process: 0 Pending 1”. It will do this for hours if I let it, and it never outputs any of the files that the pipeline should (it is a very simple, small pipeline, just for the purpose of getting CP up and running).

In the EC2 console, a number of instances are created and then quickly terminated.
In the ECS console, a cluster is created, but under Services->Events I get the following message:
" service [TrialCPService] was unable to place a task because no container instance met all of its requirements. Reason: No Container Instances were found in your cluster. For more information, see the [Troubleshooting section]"

In my S3 bucket, an “ecsconfigs” file and folder are created but nothing else.
Obviously, I’ve tried to troubleshoot this but I can’t figure it out. I can’t find any other error messages on AWS and I do not get any errors in the PuTTY terminal.
I have been using the settings recommended in the Distributed CellProfiler wiki for machine size and computing resources.

Please help.
Joe
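
For anyone who hits the same “No Container Instances were found in your cluster” event: one thing that can be checked from a script is whether any EC2 instance ever registers with the ECS cluster. Below is a minimal sketch using boto3's `list_container_instances` call. The cluster name `my-cluster` is a placeholder (use whatever cluster your config points at), and the sketch assumes AWS credentials and a default region are already configured. It only reports the count; it does not explain why instances fail to register.

```python
def count_container_instances(ecs_client, cluster_name):
    """Count the container instances currently registered to an ECS cluster.

    `ecs_client` is expected to be a boto3 ECS client (boto3.client("ecs")).
    """
    total = 0
    kwargs = {"cluster": cluster_name}
    while True:
        page = ecs_client.list_container_instances(**kwargs)
        total += len(page.get("containerInstanceArns", []))
        token = page.get("nextToken")
        if not token:
            return total
        # More results are available; ask for the next page.
        kwargs["nextToken"] = token


# Example usage (hypothetical cluster name, needs boto3 and AWS credentials):
#   import boto3
#   n = count_container_instances(boto3.client("ecs"), "my-cluster")
#   print("registered container instances:", n)
```

If this stays at 0 while the spot fleet is launching machines, the instances are starting but never joining the cluster, which would match the ECS event quoted above.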

Hi Joe,

I stumbled onto your post while facing a similar issue - not quite 100% as I’m not getting the ECS error - my EC2s persist - but the lack of any progress or useful error messages seems spot on. I can’t even connect to the spot fleet EC2s via ssh to see what’s happening there. Mind if I ask if you made any progress in identifying your issue - any way to debug it?

Thanks,
Bolek

Hi Bolek,
I was never able to get Distributed CellProfiler to run on an AWS cluster, and I never figured out exactly what the problem was.

What I ended up doing was installing (regular) CellProfiler on an EC2 instance. On a 48-vCPU (c5d.12xlarge) instance I could run ~30 copies of CP in parallel.

Joe
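
As an illustration of the single-instance approach described above (not necessarily how Joe set it up), here is a rough sketch that splits the image sets into contiguous ranges and launches one headless CellProfiler process per range. It assumes `cellprofiler` is on the PATH and uses its command-line options `-c` (headless), `-r` (run the pipeline), `-p`, `-i`, `-o`, `-f` (first image set) and `-l` (last image set). The function names, file paths and counts are placeholders. Whether concurrent runs can safely share one output folder depends on the export modules in your pipeline.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_commands(pipeline, input_dir, output_dir, n_image_sets, n_workers):
    """Build one headless CellProfiler command per contiguous range of image sets."""
    # Ceiling division so every image set lands in some chunk.
    chunk = max(1, -(-n_image_sets // n_workers))
    commands = []
    for first in range(1, n_image_sets + 1, chunk):
        last = min(first + chunk - 1, n_image_sets)
        commands.append([
            "cellprofiler", "-c", "-r",
            "-p", pipeline,
            "-i", input_dir,
            "-o", output_dir,
            "-f", str(first),
            "-l", str(last),
        ])
    return commands


def run_all(commands, n_workers):
    """Run the commands concurrently and return each process's exit code."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))


# Example usage (hypothetical paths and counts):
#   cmds = build_commands("analysis.cppipe", "/data/in", "/data/out", 960, 30)
#   exit_codes = run_all(cmds, 30)
```

Threads are enough here because each worker just waits on an external process; the number of workers is capped by memory as much as by CPU count.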

Hi guys,

Joe, sorry for missing your message in December; not sure how that happened! What I CAN say is that if your message was pinging back and forth continually between in process and pending, there was probably something wrong in your job file. Did you check CloudWatch Logs at the time to see what error message(s) you were receiving? (They are set to auto-sunset after 60 days, so they won’t be there anymore.)

Bolek, I would definitely recommend checking yours!
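
For checking the logs as suggested above, the CloudWatch console works fine, but they can also be pulled with a script. The sketch below uses boto3's `filter_log_events` call to collect messages matching a filter pattern. The log group name `my-log-group` is a placeholder (use the one set in your own configuration), the default pattern is only a guess at what an error line might contain, and AWS credentials and a default region are assumed to be configured.

```python
def find_log_messages(logs_client, log_group_name, pattern="?ERROR ?Error ?error"):
    """Collect log messages in one log group that match a CloudWatch filter pattern.

    `logs_client` is expected to be a boto3 CloudWatch Logs client
    (boto3.client("logs")). Filter-pattern terms are case sensitive, hence
    the three spellings in the default pattern; "?" terms are OR-ed together.
    """
    messages = []
    kwargs = {"logGroupName": log_group_name, "filterPattern": pattern}
    while True:
        page = logs_client.filter_log_events(**kwargs)
        messages.extend(event["message"] for event in page.get("events", []))
        token = page.get("nextToken")
        if not token:
            return messages
        # More results are available; ask for the next page.
        kwargs["nextToken"] = token


# Example usage (hypothetical log group name, needs boto3 and AWS credentials):
#   import boto3
#   for line in find_log_messages(boto3.client("logs"), "my-log-group"):
#       print(line)
```

If nothing matches, it may be worth rerunning with an empty pattern (`""`), which returns every event, to confirm that the job is writing any logs at all.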