CellProfiler 2.1.1 can't access Amazon S3 object URLs

Greetings to our CP friends,

I’m trying to set up a CellProfiler workflow on the AWS cloud. As an initial step, I wanted to evaluate whether CellProfiler can access images and data stored in Amazon S3.
These are accessed via S3 object URLs. Given that CellProfiler works great with URLs from our corporate network, I really did not expect any surprises.

So, I’ve created an S3 ‘bucket’ and uploaded some example images. The images were made public, but when I created an image list with them, CellProfiler was unable to open and process them.
Attached are the error and the pipeline file.

You can access the image list that goes with the LoadData module here: https://s3-us-west-2.amazonaws.com/ikm.hcs.cellprofiler/images/ExampleHumanImages/S3_v2_ImageSetExampleHumanImages.csv

Then I realized that CP seems unable to read anything (including the image list above) from an Amazon S3 URL (perhaps because of the https prefix?), even if it’s open to the public.

Can you please comment on whether it is possible to read data and images through this type of URL, or whether this is a bug?

Thanks and best regards
Ioannis
S3_ExampleHuman.cppipe (13 KB)


Hi Ioannis,

Odd, it seems to work on my end (see screenshot), whether loading the CSV via URL in LoadData as in the pipeline you attached, or locally as shown in your screenshot. Can you think of any authentication issues on your side that might be the culprit?

Regards,
-Mark


That’s great Mark!

Thanks for checking!
It may then be an issue with the company firewall or proxy server. I’ll have to do some additional investigation on this.

The weird thing is that the URLs to the images work perfectly fine in the browser.

Very helpful that you were able to test outside our environment!

Best regards
Ioannis

Not sure, but this could also be related to this GitHub issue: github.com/CellProfiler/CellProfiler/issues/803
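
One way to narrow down the firewall/proxy theory is to fetch the same URL from a plain Python script on the affected machine, outside both the browser and CellProfiler. The snippet below is only a rough sketch under a few assumptions: it uses Python 3's standard urllib and does not run inside CellProfiler, and the helper names proxy_settings and try_fetch are made up for this illustration. If the browser loads the URL but this script fails, the browser is probably using proxy or credential settings that other programs on the machine do not pick up.

```python
import urllib.request


def proxy_settings():
    """Return the proxy mapping urllib detects (environment or OS settings)."""
    return urllib.request.getproxies()


def try_fetch(url, timeout=10.0):
    """Try to read the first bytes of `url`; return (ok, detail).

    Hypothetical helper for this thread, not part of CellProfiler.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            first_bytes = response.read(64)
    except (OSError, ValueError) as exc:
        # OSError covers URLError, HTTPError and timeouts;
        # ValueError covers malformed URLs.
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"read {len(first_bytes)} bytes"
```

To use it, print proxy_settings() and then call try_fetch with the image list URL from the first post. A failure that mentions a proxy, a certificate, or a timeout would point at the network rather than at CellProfiler itself.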