Issue with saving per object classification

cellprofiler
classifier
cellprofiler-analyst

#1

Hello,

I’m trying to exclude cells from the analysis using the Classifier, as discussed in this thread.

The issue is that the per-object classification isn’t saved to the DB when I score all images. An empty Error window appears. (In one of my first attempts the Error window wasn’t empty and said that the database is locked, similar to this thread; I tried to reproduce that non-empty error, but gave up after several dozen attempts.)
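One way to check, outside CPA, whether something else is holding the SQLite file is to try to take a write lock on it. This is a minimal sketch using Python's standard `sqlite3` module; the function name `can_acquire_write_lock` is hypothetical, and this is not how CPA itself talks to the database:

```python
import sqlite3


def can_acquire_write_lock(db_path, timeout=1.0):
    """Return True if a write lock on the SQLite file can be taken right now.

    Returns False if SQLite reports the database as locked after waiting
    `timeout` seconds. Any other error is re-raised.
    """
    # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves.
    conn = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # asks for the write lock up front
        conn.execute("ROLLBACK")         # release it again, nothing is changed
        return True
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc):
            return False
        raise
    finally:
        conn.close()
```

If this returns False while CPA is closed, some other program (for example a database browser) probably still has the file open for writing.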

Please find the screenshots attached. Here is the debug log:

[MainThread] SELECT name FROM sqlite_master WHERE type='table' and name='ExpAnt93_AC_individual'
[MainThread] SELECT name FROM sqlite_temp_master WHERE type='table' and name='ExpAnt93_AC_individual'
Skipping table checking step for sqlite
[MainThread] SELECT ImageNumber FROM ExpAnt93_AC_Per_Image GROUP BY ImageNumber
[MainThread] SELECT ExpAnt93_AC_Per_FilteredNuclei.ImageNumber, COUNT(ExpAnt93_AC_Per_FilteredNuclei.FilteredNuclei_Number_Object_Number) FROM ExpAnt93_AC_Per_FilteredNuclei GROUP BY ExpAnt93_AC_Per_FilteredNuclei.ImageNumber
[MainThread] SELECT ExpAnt93_AC_Per_Image.ImageNumber FROM ExpAnt93_AC_Per_Image
[MainThread] SELECT ExpAnt93_AC_Per_FilteredNuclei.ImageNumber, FilteredNuclei_AreaShape_Area,FilteredNuclei_AreaShape_Center_X,FilteredN… FROM ExpAnt93_AC_Per_FilteredNuclei WHERE (ExpAnt93_AC_Per_FilteredNuclei.ImageNumber <= 20) AND 1 = 1
Any values that cannot be converted to float are set to 0
[MainThread] SELECT ExpAnt93_AC_Per_FilteredNuclei.ImageNumber, FilteredNuclei_AreaShape_Area,FilteredNuclei_AreaShape_Center_X,FilteredNuclei_AreaS… FROM ExpAnt93_AC_Per_FilteredNuclei WHERE (ExpAnt93_AC_Per_FilteredNuclei.ImageNumber > 20) AND (ExpAnt93_AC_Per_FilteredNuclei.ImageNumber <= 70) AND 1 = 1
Any values that cannot be converted to float are set to 0
[MainThread] SELECT ExpAnt93_AC_Per_FilteredNuclei.ImageNumber, FilteredNuclei_AreaShape_Area,FilteredNuclei_AreaShape_Center_X,FilteredNuclei_AreaShape FROM ExpAnt93_AC_Per_FilteredNuclei WHERE (ExpAnt93_AC_Per_FilteredNuclei.ImageNumber > 70) AND (ExpAnt93_AC_Per_FilteredNuclei.ImageNumber <= 120) AND 1 = 1
Any values that cannot be converted to float are set to 0
Saving cell classes to database…
[MainThread] DROP TABLE IF EXISTS ExpAnt93_AC_individual
[MainThread] CREATE TABLE ExpAnt93_AC_individual (ImageNumber INT, FilteredNuclei_Number_Object_Number INT, class VARCHAR (3), class_number INT)
[MainThread] CREATE INDEX idx_ExpAnt93_AC_individual ON ExpAnt93_AC_individual (ImageNumber,FilteredNuclei_Number_Object_Number)
[MainThread] SELECT ExpAnt93_AC_Per_FilteredNuclei.ImageNumber,ExpAnt93_AC_Per_FilteredNuclei.FilteredNuclei_Number_Object_Number, FilteredNuclei_AreaShape_Area,FilteredNuclei_AreaShape_Center_X,FilteredNuclei_AreaShape_Center_Y,FilteredNuclei_AreaShape_Center_Z,…FROM ExpAnt93_AC_Per_FilteredNuclei WHERE (ExpAnt93_AC_Per_FilteredNuclei.ImageNumber <= 20)
Any values that cannot be converted to float are set to 0
[Thread-30] Connecting to the database…
[Thread-30] SQLite file: K:\project_Ant\ExpAnt93_CellProfiler\AC\ExpAnt93_AC_DB.db
[Thread-30] SELECT ImageNumber FROM ExpAnt93_AC_Per_Image GROUP BY ImageNumber
[Thread-30] SELECT FilteredNuclei_Location_Center_X, FilteredNuclei_Location_Center_Y FROM ExpAnt93_AC_Per_FilteredNuclei WHERE (ImageNumber=57 AND FilteredNuclei_Number_Object_Number=207)
[Thread-30] SELECT FilteredNuclei_Location_Center_X, FilteredNuclei_Location_Center_Y FROM ExpAnt93_AC_Per_FilteredNuclei WHERE (ImageNumber=10 AND FilteredNuclei_Number_Object_Number=468)
[Thread-30] SELECT FilteredNuclei_Location_Center_X, FilteredNuclei_Location_Center_Y FROM ExpAnt93_AC_Per_FilteredNuclei WHERE (ImageNumber=46 AND FilteredNuclei_Number_Object_Number=1029)

[Thread-30] SELECT FilteredNuclei_Location_Center_X, FilteredNuclei_Location_Center_Y FROM ExpAnt93_AC_Per_FilteredNuclei WHERE (ImageNumber=46 AND FilteredNuclei_Number_Object_Number=117)

Where “…” appears above there are many more column names, and the omitted lines contain many more ImageNumber/ObjectNumber pairs. The “Thread-30” part appears after the Error window (i.e. it fails to save the per-object data but does calculate the per-image data, which I can access after clicking Score All once again).

Does anybody know how to fix or work around this issue?
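To confirm from outside CPA whether the class table was created and whether any rows were written, the database file can be inspected directly. This is a minimal sketch with Python's standard `sqlite3` module; the helper name `class_table_status` is hypothetical, and the table name would be the one from the log above (`ExpAnt93_AC_individual`):

```python
import sqlite3


def class_table_status(db_path, table):
    """Return None if `table` does not exist, otherwise its row count."""
    conn = sqlite3.connect(db_path)
    try:
        # Same existence check that appears at the top of the debug log.
        exists = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
            (table,),
        ).fetchone()
        if exists is None:
            return None
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        return conn.execute("SELECT COUNT(*) FROM " + quoted).fetchone()[0]
    finally:
        conn.close()
```

A result of 0 would match what is described here: the `CREATE TABLE` step ran, but the rows were never inserted.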

Screen shots here


#2

Hmmm, are you running the built version of CPA or a source installation? Can you try the latter and see if that fixes the problem?


#3

I was running the built version of CPA. I’ll try installing from source and let you know the results.
Thank you for the suggestion.


#4

Hi Beth,

I’ve installed CPA from source and tried to run the classification on the same dataset and on another, smaller one. Unfortunately I can’t reproduce the analysis, because CPA installed from source runs much, much slower than the built version. Score All took a couple of minutes in the built version, while the source version estimates the remaining time as 70 000 hr, and I cancelled it after half an hour. Is that expected? Or did I make a mistake during the installation from source?


#5

No, it should not be like that, and there’s no comparable issue in the CPA Github.

What OS are you using? Can you upload your database, properties file, and training set somewhere so we can try to test it on our end?


#6

Here is a link to a Google Drive folder with the DB, properties file, and training set. The pipeline used to generate this DB is also there.
https://drive.google.com/drive/folders/14vW_PLxXHJdBpK6jDMS-AAhjVrxAJrMR?usp=sharing

I’m using Windows 10 (64-bit).


#7

Ah, someone else’s post just reminded me: can you try setting check_tables to ‘no’ and see if you still get the same issue?
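The setting can be changed by editing the properties file in a text editor. For anyone who prefers to script it, here is a minimal sketch; it assumes the properties file consists of `key = value` lines with `#` comments, and the helper name `set_property` is hypothetical, not part of CPA:

```python
def set_property(text, key, value):
    """Return `text` with `key = value` set, replacing an existing line if any."""
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        # Only touch uncommented "key = value" lines whose key matches.
        if not stripped.startswith("#") and "=" in stripped:
            name = stripped.split("=", 1)[0].strip()
            if name == key:
                out.append("%s = %s" % (key, value))
                found = True
                continue
        out.append(line)
    if not found:
        # Key was not present at all: append it at the end.
        out.append("%s = %s" % (key, value))
    return "\n".join(out) + "\n"
```

Usage would be to read the properties file into a string, call `set_property(text, "check_tables", "no")`, and write the result back to a copy of the file.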


#8

I’ve tried, but nothing changed.

Additionally, I noticed that every time, the Classifier tells me that the class table already exists and asks permission to overwrite it. I tried manually deleting this table from the DB, after which the Classifier created a new one with the following statement:

CREATE TABLE ExpAnt102_class (ImageNumber INT, Nuclei_Number_Object_Number INT, class VARCHAR (3), class_number INT)

But the table remains empty.

I know nothing about SQL, but isn’t VARCHAR (3) too short? I tried changing the class names to 2-3 characters, but it didn’t help.
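On the VARCHAR (3) question: SQLite does not enforce the declared length of a VARCHAR column, so a longer class name should be stored in full (other database engines, such as MySQL, do treat the length as a limit). This is a minimal sketch that checks this with Python's standard `sqlite3` module and an in-memory database; the table name `demo_class` and the function name are made up for illustration:

```python
import sqlite3


def stored_class_name(name):
    """Insert `name` into a VARCHAR (3) column in SQLite and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        # Same column layout as the CREATE TABLE statement quoted above.
        conn.execute(
            "CREATE TABLE demo_class (ImageNumber INT, ObjectNumber INT, "
            "class VARCHAR (3), class_number INT)"
        )
        conn.execute(
            "INSERT INTO demo_class VALUES (?, ?, ?, ?)", (1, 1, name, 1)
        )
        return conn.execute("SELECT class FROM demo_class").fetchone()[0]
    finally:
        conn.close()
```

Calling `stored_class_name("positive")` returns the full string, which suggests the column width is not why the table stays empty with an SQLite database.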

Additionally, I was wondering why CPA from source runs so much slower than the built version. For everyday work I use Python 3.7 (64-bit), but when I tried to install the packages for CPA I failed to install javabridge for Python 3.7, because it didn’t see the JDK (which I have installed); somehow I was able to set it up with Python 2.7.13 (64-bit). So now I’m thinking, maybe something is wrong with the JDK?
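When comparing source installs across machines, it can help to record which interpreter is actually running and whether a JDK location is advertised to it. This is a minimal sketch; it assumes the JDK is located through the `JAVA_HOME` environment variable, which may not be the only mechanism javabridge uses, and the function name `describe_environment` is hypothetical:

```python
import os
import sys


def describe_environment(environ=None):
    """Summarise the Python interpreter and the JAVA_HOME setting."""
    environ = os.environ if environ is None else environ
    return {
        "python_version": "%d.%d.%d" % tuple(sys.version_info[:3]),
        # True for a 64-bit interpreter, False for a 32-bit one.
        "is_64bit": sys.maxsize > 2 ** 32,
        # None means JAVA_HOME is not set for this process.
        "java_home": environ.get("JAVA_HOME"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("%s: %s" % (key, value))
```

Running this with the same interpreter that launches CPA shows whether that interpreter is the Python 2.7 install and whether it can see a JDK path at all.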


#9

Hmmm, I don’t have access to a PC right now, but I can say that on your data, while Score All definitely seems a bit slower on my machine from source than in the built version (the whole thing took about 10 minutes), it doesn’t seem to freeze the way you’ve mentioned, and the table does indeed get written (whether I was making the table fresh or dropping an old table). I did set check_tables to “no”; I didn’t check what happens if I keep it set to “yes”.

It’s possible there’s a Windows vs Mac issue, or a source install environment issue (CPA is Python2 and Java8 only, I’ll put my pip freeze that generated the working table below). It might also be that you’re trying to write to an external drive, which is slower than writing to your main drive (you could keep the images where they are and just move the database/properties file onto your system drive). It might help diagnose the problem to see how far it gets before it seems to stop; I’ve posted the commands my terminal generates when training and writing to the database.

alabaster==0.7.11
altgraph==0.15
appnope==0.1.0
atomicwrites==1.1.5
attrs==18.1.0
awscli==1.16.66
Babel==2.6.0
backports-abc==0.5
backports.functools-lru-cache==1.5
backports.shutil-get-terminal-size==1.0.0
bleach==2.1.3
boto3==1.9.60
botocore==1.12.62
-e git+git@github.com:bethac07/CellProfiler.git@2f7d95dde835a7e51a819452b5f814dfd94ed795#egg=CellProfiler
centrosome==1.1.6
certifi==2018.4.16
chardet==3.0.4
Click==7.0
cloudpickle==0.5.3
colorama==0.3.9
configparser==3.5.0
contextlib2==0.5.5
cycler==0.10.0
Cython==0.28.3
dask==0.17.5
decorator==4.3.0
deprecation==2.0.6
dis3==0.1.2
docutils==0.14
entrypoints==0.2.3
enum34==1.1.6
funcsigs==1.0.2
functools32==3.2.3.post2
future==0.16.0
futures==3.2.0
h5py==2.8.0
html5lib==1.0.1
idna==2.6
imagesize==1.0.0
inflect==2.1.0
ipykernel==4.8.2
ipython==5.7.0
ipython-genutils==0.2.0
ipywidgets==7.2.1
-e git+git@github.com:bethac07/python-javabridge.git@faf93cbd5bbd169e1ca340c8b47097712d0193fe#egg=javabridge
Jinja2==2.10
jmespath==0.9.3
joblib==0.13.0
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.2.3
jupyter-console==5.2.0
jupyter-core==4.4.0
Keras==2.2.4
Keras-Applications==1.0.6
Keras-Preprocessing==1.0.5
-e git+https://github.com/broadinstitute/keras-resnet.git@aa6012a741f79fdd22d45c73403d6f080e95cdcb#egg=keras_resnet
kiwisolver==1.0.1
macholib==1.9
mahotas==1.4.5
MarkupSafe==1.0
matplotlib==2.2.3
mistune==0.8.3
more-itertools==4.2.0
MySQL-python==1.2.5
mysqlclient==1.3.13
nbconvert==5.3.1
nbformat==4.4.0
networkx==2.1
notebook==5.5.0
numexpr==2.6.8
numpy==1.14.5
packaging==17.1
pandas==0.23.0
pandocfilters==1.4.2
pathlib2==2.3.2
pefile==2017.11.5
pexpect==4.6.0
pickleshare==0.7.4
Pillow==5.1.0
pkginfo==1.4.2
pluggy==0.6.0
prokaryote==2.4.0
prompt-toolkit==1.0.15
psycopg2==2.7.6.1
ptyprocess==0.5.2
py==1.5.3
pyasn1==0.4.4
Pygments==2.2.0
PyInstaller==3.3.1
pyparsing==2.2.0
Pypubsub==4.0.0
pytest==3.6.0
python-bioformats==1.4.0
python-dateutil==2.7.3
pytz==2018.4
PyWavelets==0.5.2
PyYAML==3.13
pyzmq==15.3.0
qtconsole==4.3.1
raven==6.9.0
requests==2.20.1
requests-toolbelt==0.8.0
rsa==3.4.2
s3transfer==0.1.13
scandir==1.7
scikit-image==0.14.0
scikit-learn==0.20.1
scipy==1.1.0
seaborn==0.9.0
Send2Trash==1.5.0
simplegeneric==0.8.1
singledispatch==3.4.0.3
six==1.11.0
snowballstemmer==1.2.1
Sphinx==1.7.5
sphinx-rtd-theme==0.4.2
sphinxcontrib-websupport==1.1.0
subprocess32==3.5.1
tables==3.4.4
terminado==0.8.1
testpath==0.3.1
toolz==0.9.0
tornado==5.0.2
tqdm==4.23.4
traitlets==4.3.2
twine==1.11.0
typing==3.6.4
urllib3==1.22
verlib==0.1
wcwidth==0.1.7
webencodings==0.5.1
widgetsnbextension==3.2.1
wxPython==3.0.2.0
wxPython-common==3.0.2.0
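
To compare another environment against the list above, the two `pip freeze` outputs can be diffed. This is a minimal sketch that works on the text of both outputs; the function names are hypothetical, and editable installs (the `-e git+...` lines) are skipped rather than compared:

```python
def parse_freeze(text):
    """Map lower-cased package name -> pinned version from `pip freeze` text."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and editable (-e) installs.
        if not line or line.startswith("#") or line.startswith("-e "):
            continue
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def diff_freeze(reference_text, local_text):
    """Return (missing package names, {name: (reference, local)} mismatches)."""
    reference = parse_freeze(reference_text)
    local = parse_freeze(local_text)
    missing = sorted(set(reference) - set(local))
    different = {
        name: (reference[name], local[name])
        for name in reference
        if name in local and reference[name] != local[name]
    }
    return missing, different
```

Some entries in the list, such as appnope and macholib, are macOS-specific, so not every package reported as missing on Windows is a problem.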

#10

I tried spinning up a Windows VM, and it did freeze on your data; could be a Windows vs Mac thing, or a “VM has only a fraction of the power of my main machine” thing.

Interestingly, when I deleted the pre-existing class table from your database and tried again, it worked…
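
For anyone who wants to delete the pre-existing class table without a database browser, here is a minimal sketch using Python's standard `sqlite3` module. The function name `drop_class_table` is hypothetical; the table name would be the class table from the logs earlier in the thread, and CPA should be closed while this runs:

```python
import shutil
import sqlite3


def drop_class_table(db_path, table, backup_path=None):
    """Drop `table` from the SQLite file, optionally copying the file first."""
    if backup_path is not None:
        # Keep a copy of the whole database before changing anything.
        shutil.copyfile(db_path, backup_path)
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("DROP TABLE IF EXISTS " + quoted)
        conn.commit()
    finally:
        conn.close()
```

This is the same `DROP TABLE IF EXISTS` statement that appears in the debug log in the first post, just issued by hand before starting CPA.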


#11

I’ve tried installing CPA on another PC (I have no access to a Mac), and in general everything was the same (i.e. an empty Error window in the built version and tens of thousands of hours of “remaining time” in the source version), with one exception: after 15-30 sec the thousands of hours dropped to a few minutes, and the classification then completed successfully (no error appeared, and the classification was saved in the DB) in a reasonable time.

The other PC I tested is more powerful than my own (i7-4790 + 16 GB RAM vs i7-2670QM + 8 GB RAM). So I ran it once again on my own PC and left it for a couple of hours, and it worked.
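
A remaining-time figure that starts in the thousands of hours and then collapses to minutes is what a simple linear extrapolation would produce if the first item is slow. How CPA actually computes its estimate is not known here; the sketch below only illustrates that assumed arithmetic, and the function name is made up:

```python
def naive_eta_seconds(elapsed_seconds, items_done, items_total):
    """Extrapolate remaining time from the average time per finished item."""
    if items_done <= 0:
        return None  # nothing finished yet, so no estimate is possible
    per_item = elapsed_seconds / float(items_done)
    return per_item * (items_total - items_done)
```

With this formula, one item finished after 30 seconds out of 1000 gives 29970 seconds remaining, while 500 items finished after 40 seconds gives 40 seconds remaining. An early estimate says little about whether the run is actually stuck.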

So the source installation works okay, and the freezing that you observed on the Windows VM seems to be the “VM has only a fraction of the power of my main machine” thing, or maybe something else.

Anyway, thank you for your help!