CellProfiler (and CPA) on ARM-based Apple silicon?

Hi everyone,

I have used CellProfiler and CellProfiler Analyst for some time on an old MacBook and have generally been very happy with them. Last week I bought a new Mac with the new ARM-based architecture and ran into major compatibility issues (perhaps unsurprisingly…). I cannot get CPA running at all, and CellProfiler runs at a crawl (around 10 times slower than a dual-core MBP from 2012 when running the same images through the same pipeline).

Regarding CPA, I installed the appropriate JDK (8) from Azul (https://www.azul.com/downloads/zulu-community/?package=jdk), which is developed for cross-compatibility between x86 and the ARM-based Apple silicon architecture. The program does not start at all, and according to the error messages in the terminal the issue seems to be localised to the Java-Python bridge. I unfortunately do not have the programming skills to address this. Regarding CellProfiler (v4.0.7) proper, everything seems functionally intact, but processing is unbearably slow. I really don’t know what could be done to address that.

Has anyone here experienced or solved these issues?

Many thanks!

Hi @August_Lundquist,

Welcome to the forum!

Regarding CellProfiler Analyst – we’ve created some detailed instructions for how to install CPA on a Mac. Does following these steps allow CellProfiler Analyst to open?

Regarding the slow CellProfiler speeds, I’m reaching out to the engineering team to see if they have any insight and we’ll try to get back to you soon!

Thank you!

No. I have already followed those steps, just as I did successfully when installing CPA on my old computer, but with the Zulu JDK instead of AdoptOpenJDK (since AdoptOpenJDK doesn’t seem to have an ARM version available).

It should also be noted that the test runs I did with CellProfiler didn’t seem to utilise the processor too well, no heat build up and so on. If it would be of interest I could send you the full activity monitor logs.

Hi @August_Lundquist,

Thanks for reporting this. We do expect some performance issues at this point, but things should improve over time as package dependencies are updated to work more efficiently with the new architecture.

In terms of CellProfiler, it’d be helpful to know which particular areas of the program feel unresponsive. For example, is the interface sluggish, or does opening images feel slow, or is it the analysis itself that isn’t meeting expectations? If we can identify particular modules which are problematic, then this would give us things to prioritise. If you’re able to launch CellProfiler from the terminal, running an analysis with 1 worker should then provide a log of the time taken for each module to complete. If you could try the example pipeline, that’d be super helpful. Otherwise, module execution times are also reported by the ExportToSpreadsheet module.
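For anyone wanting to rank modules from an exported spreadsheet, here is a minimal Python sketch. It assumes the per-image CSV contains timing columns named like ExecutionTime_13MeasureColocalization (module number followed by module name). That column naming, the function name rank_modules_by_time, and the sample numbers are assumptions for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io
import re
from collections import defaultdict

# Assumption: per-module timings appear as columns named like
# "ExecutionTime_13MeasureColocalization" in the per-image CSV.
TIMING_COLUMN = re.compile(r"^ExecutionTime_(\d+)(.+)$")


def rank_modules_by_time(csv_text):
    """Return (module_number, module_name, mean_seconds) tuples, slowest first."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            match = TIMING_COLUMN.match(column or "")
            if not match or not value:
                continue  # not a timing column, or an empty cell
            key = (int(match.group(1)), match.group(2))
            totals[key] += float(value)
            counts[key] += 1
    means = [(num, name, totals[(num, name)] / counts[(num, name)])
             for (num, name) in totals]
    return sorted(means, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = (
        "ImageNumber,ExecutionTime_01Images,"
        "ExecutionTime_13MeasureColocalization\n"
        "1,0.5,10.0\n"
        "2,0.7,14.0\n"
    )
    for number, name, seconds in rank_modules_by_time(sample):
        print(f"module {number:02d} {name}: {seconds:.2f} s per image set")
```

Running the same function on the exports from both machines and comparing the means module by module should show where the gap is concentrated.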

With regards to CPA, this program is still running on Python 2 and as far as I’m aware Apple don’t intend to help with patching Python 2 for the new architecture. As a result any performance issues there could be difficult to resolve. With regards to Java though, would you be able to post the error messages you mentioned seeing in the terminal?

Thanks again

Error log CPA 30:11.txt (25.7 KB)

Regarding CPA, I have attached the full log. It doesn’t make much sense to me, and I could not find what I saw earlier regarding the error with the Java-Python bridge, but maybe you can understand something from it that I don’t. There are certainly a lot of error messages regarding Python though. Is an update to port CPA to Python 3 in the works, or could you point me to some software that fulfils a similar role (I mainly use the machine learning classifier trainer) but runs better on recent hardware?

Regarding CellProfiler, everything apart from the analysis run time works fine! I will provide you with some numbers on execution times for specific modules tomorrow!

Thank you!

Much appreciated. That looks like the system log rather than the actual application log. Are you able to start Analyst by using a terminal window and entering open "/Applications/CellProfiler Analyst.app"?

In terms of a Python 3 upgrade for CPA, I don’t think that this is on the immediate radar right now. Such an upgrade would take a fair bit of development time and I’m not sure there’s specific funding for such work at the moment. There are alternative classifying software packages in the works, but I don’t think any of these export the types of rules lists which are compatible with CellProfiler. Hopefully we can get the Python 2 version running on Apple Silicon.

Execution times will be very helpful, thanks for doing that!

I don’t have much to share other than I’ll try to get an M1 machine ASAP so we can start debugging these issues. CellProfiler should be running in the x86 compatibility mode (Rosetta 2), so I wouldn’t expect any major issues in that environment yet …
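If it helps to confirm whether a given Python process is actually being translated, a rough sketch follows. It relies on two things: platform.machine() reports the architecture the interpreter runs as, and macOS exposes a sysctl key, sysctl.proc_translated, that reads 1 for a translated process and 0 for a native one on Apple silicon (the key is absent on Intel Macs). The function names are made up for illustration, and the result describes only the interpreter that runs the script, so it is informative only when run with the same Python that the application uses.

```python
import platform
import subprocess


def classify_process(machine, translated_flag):
    """Describe how the current process is running on macOS.

    machine: value of platform.machine(), e.g. "arm64" or "x86_64".
    translated_flag: output of `sysctl -n sysctl.proc_translated`
        ("1" under Rosetta, "0" when native), or None if unavailable.
    """
    if machine == "arm64":
        return "native arm64"
    if machine == "x86_64" and translated_flag == "1":
        return "x86_64 translated by Rosetta 2"
    if machine == "x86_64":
        return "native x86_64"
    return f"unknown ({machine})"


def read_translated_flag():
    """Query sysctl.proc_translated; returns None where the key is missing."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return None  # sysctl itself is not available
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    print(classify_process(platform.machine(), read_translated_flag()))
```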

Regarding CellProfiler, I updated Java itself to the latest version and achieved much better results, though still far from parity between my old and new machines. See the execution times and experiment spreadsheet attached (I would rather not upload the image I analysed, out of respect for our donors; we work with human tissue). MeasureColocalization seems to be the main time sink.

Execution times.zip (22.7 KB)

Regarding CPA, I launched it through the terminal as described, but could not produce any additional error logs from the console application. Where else could I find the application log?

Thank you all!

Fantastic, thanks for that!

Looking at the execution times, it seems clear that MeasureColocalization is the primary cause of the slowdown. The rest of the modules look like they’re performing pretty well.

I should ask whether your Intel execution times were also measured with version 4.0.7? There were some major changes to the colocalization module in 4.0, so its timings won’t be directly comparable with 3.1.9.

Either way, I expect that you’re using 16-bit images in this experiment. That tends to make the MeasureColocalization module run very slowly with the pipeline settings you’ve supplied. Could I suggest that you try switching the ‘Method for Costes thresholding’ in the Colocalization modules from ‘Accurate’ to ‘Faster’? In practice this should give you the same results in much less time. If you could retry the pipeline with that change it’d be very helpful.

As for CPA, could I check that after running the open command there’s no window displayed at all? It just returns straight to a new line in the terminal?

Thanks

CP: see the new execution times attached. It is about 50% faster when choosing ‘Faster’, but the time difference between the machines persists (around five times slower on the M1) in the second MeasureColocalization module (module 34). Curiously, the first one we use (module 13) seems to be unaffected by changing this setting; it could be that we measure only within objects in that one. Both machines run v4.0.7.

Execution times 2.zip (26.1 KB)

CPA: Yes, the application opens just like when I click the icon (first asking to choose properties file, then crashing after a proper one is selected) but no additional log is available, as far as I can see, apart from the one in the console that I uploaded yesterday.

Interesting, thanks so much for doing that. This might mean that there’s another metric causing problems.

What I’d suggest we try is running the pipeline in Test Mode up to the Colocalization module. If you switch “Run All Metrics” to “No”, you’ll be given the option to enable or disable individual metrics. If you start with all metrics disabled and execute the module using the ‘play’ button next to it on the modules list, would you be able to try turning different metrics on in turn to figure out which one is causing the slowdown? The module should execute almost instantly, so you’ll probably feel it freeze up when you’ve turned on the problem measurement.

The CPA issue you describe is usually caused by a problem with the .properties file. If you open that up in a text editor could you check that the db_sql_file parameter is pointed to the proper location of the database file?
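A quick way to run that check without reading the file by eye is sketched below. It assumes the .properties file uses plain `key = value` lines with `#` comments, and that relative paths are meant relative to the folder holding the file. Both are assumptions, as is the second key name db_sqlite_file; only db_sql_file comes from the post above. The function names are hypothetical.

```python
import os
import tempfile


def read_properties(path):
    """Parse 'key = value' lines, skipping blanks and '#' comments."""
    settings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def check_database_paths(path, keys=("db_sql_file", "db_sqlite_file")):
    """Map each of the given keys that is set to whether its file exists."""
    settings = read_properties(path)
    base = os.path.dirname(os.path.abspath(path))
    report = {}
    for key in keys:
        value = settings.get(key)
        if value:
            # Assumption: relative paths resolve against the properties folder.
            candidate = os.path.join(base, os.path.expanduser(value))
            report[key] = os.path.isfile(candidate)
    return report


if __name__ == "__main__":
    # Self-contained demo with throwaway files; names are illustrative.
    with tempfile.TemporaryDirectory() as folder:
        open(os.path.join(folder, "DefaultDB.db"), "w").close()
        props = os.path.join(folder, "example.properties")
        with open(props, "w", encoding="utf-8") as handle:
            handle.write("# demo file\ndb_sql_file = DefaultDB.db\n")
        print(check_database_paths(props))
```

A key reported as False points at a path that does not resolve to a file on this machine, which is worth ruling out after moving a project between computers.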

Ok, did what you suggested, good idea! It seems like it is the last metric (Manders coefficients using Costes auto threshold) that creates significant slowdown. I can’t really turn it off for my analysis since I have trained all classifiers with it but at least we know the core of the problem.

The database path in the .properties file is pointing to the right place (I solved all these issues when I set up CPA on my old machine). Thank you for the suggestion though!

Thanks for doing that, it looks like the performance was suboptimal even with Intel, so perhaps we should take a closer look at this. Would you be able to upload a sample image set so that we can try to replicate this locally?

Regarding your classifiers - are you using a set of rules exported from CPA? If those rules don’t feature the problem measurement then it may be safe to disable Costes for the time being.

Okay, this is still likely to be some kind of database connection error. I’m not sure whether folder permissions might be an issue here.

Ok, great news, I got CPA up and running too! Performance is better than on the previous machine (faster disk or something?). The native ARM Zulu JDK was the problem. Doing everything exactly as before with AdoptOpenJDK, despite it technically not being built for this architecture, and then letting Rosetta translate it seems to be the way to go. Thank you all so much!
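For others hitting the same thing, one way to see which architecture an installed JDK targets is to ask the JVM for its os.arch property. The sketch below runs `java -XshowSettings:properties -version` and pulls that value out of the output; a native ARM build would typically report aarch64 and an Intel build x86_64. The function names are illustrative, and which java binary gets picked up depends on your PATH, so treat this as a rough check rather than a definitive diagnosis.

```python
import re
import subprocess


def parse_os_arch(settings_text):
    """Extract the os.arch value from `java -XshowSettings:properties` output."""
    match = re.search(r"^\s*os\.arch\s*=\s*(\S+)", settings_text, re.MULTILINE)
    return match.group(1) if match else None


def java_arch(java_binary="java"):
    """Run the JVM and report its os.arch, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [java_binary, "-XshowSettings:properties", "-version"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return None  # no such binary on this machine
    # The settings are normally written to stderr; stdout is searched too.
    return parse_os_arch(result.stderr + "\n" + result.stdout)


if __name__ == "__main__":
    print(java_arch())
```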

That’s brilliant! Glad to hear it.

Just to check - was Rosetta not enabled by default?

No, it was enabled. The fault was entirely mine in assuming that the Zulu JDK would work better (or at all) and that AdoptOpenJDK would not work on the new architecture. I am quite impressed by the translation capabilities of Rosetta!

Thanks again!

Great, thanks!

Would you still be able to upload a sample image set so that we could take a look at the performance of those measurement modules?