CPA classifier counting individual cells multiple times

Hi there, I am trying to count cells using the Classifier tool in CPA, but for some reason it keeps counting individual cells multiple times. After training it to count only the cells I want and then clicking “Score Image”, it gives me a result with about 4 hits per cell rather than 1, which makes my cell counts far higher than they should be. I can’t figure out whether this is an issue with the pipeline I am using in CellProfiler to process my images initially (it’s the published pipeline by Dordea et al.), or something within CellProfiler Analyst or my properties file. Does anybody know how I can tune the program to count each cell only once? Thanks!

Welcome to the forum!
Do you mind describing the issue a bit more, perhaps taking a screenshot? What is it that you are seeing that indicates the cells are being counted multiple times?

Hi Anne,

Thanks for your reply! I have attached a screenshot of my image after I trained the set and clicked “Score Image”. I am also including the contents of the properties file that I use when I start up CPA (the forum won’t allow me to add it as an attachment). I was thinking it might also be something wrong with the pipeline I am using, specifically the IdentifyPrimaryObjects step. I attached my results from that step as well, because in the “cell outlines” figure there appear to be multiple labels per cell, which might be causing the classifier to think there are multiple objects per cell as well (?). Let me know if you need any more information, and thank you in advance for your help.



#Wed Mar 27 13:31:21 2019
# ==============================================
# CellProfiler Analyst 2.0 properties file
# ==============================================

# ==== Database Info ====
db_type         = sqlite
db_sqlite_file  = C:\Users\elissa\DefaultDB.db

# ==== Database Tables ====
image_table   = Per_Image
object_table  = Per_Object

# ==== Database Columns ====
# Specify the database column names that contain unique IDs for images and
# objects (and optionally tables).
# table_id (OPTIONAL): This field lets Classifier handle multiple tables if
#          you merge them into one and add a table_number column as a foreign
#          key to your per-image and per-object tables.
# image_id: must be a foreign key column between your per-image and per-object
#           tables
# object_id: the object key column from your per-object table

image_id      = ImageNumber
object_id     = ObjectNumber
plate_id      = 
well_id       = 

# Also specify the column names that contain X and Y coordinates for each
# object within an image.
cell_x_loc    = Nuclei_Location_Center_X
cell_y_loc    = Nuclei_Location_Center_Y

# ==== Image Path and File Name Columns ====
# Classifier needs to know where to find the images from your experiment.
# Specify the column names from your per-image table that contain the image
# paths and file names here.
# Individual image files are expected to be monochromatic and represent a single
# channel. However, any number of images may be combined by adding a new channel
# path and filename column to the per-image table of your database and then
# adding those column names here.
# NOTE: These lists must have equal length!
image_path_cols = Image_PathName_OrigRed,Image_PathName_OrigBlue
image_file_cols = Image_FileName_OrigRed,Image_FileName_OrigBlue

# CPA will now read image thumbnails directly from the database, if chosen in ExportToDatabase.
image_thumbnail_cols = 

# Give short names for each of the channels (respectively)...
image_names = b3tubulin,DAPI

# Specify a default color for each of the channels (respectively)
# Valid colors are: [red, green, blue, magenta, cyan, yellow, gray, none]
image_channel_colors = red,blue

# ==== Image Access Info ====
image_url_prepend = 

# ==== Dynamic Groups ====
# Here you can define groupings to choose from when classifier scores your experiment.  (eg: per-well)
# This is OPTIONAL, you may leave "groups = ".
#   group_XXX  =  MySQL select statement that returns image-keys and group-keys.  This will be associated with the group name "XXX" from above.
#   groups               =  Well, Gene, Well+Gene,
#   group_SQL_Well       =  SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Per_Image_Table.well FROM Per_Image_Table
#   group_SQL_Gene       =  SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well
#   group_SQL_Well+Gene  =  SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.well, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well

# ==== Image Filters ====
# Here you can define image filters to let you select objects from a subset of your experiment when training the classifier.
#   filter_SQL_XXX  =  MySQL select statement that returns image keys you wish to filter out.  This will be associated with the filter name "XXX" from above.
#   filters           =  EMPTY, CDKs,
#   filter_SQL_EMPTY  =  SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene="EMPTY"
#   filter_SQL_CDKs   =  SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene REGEXP 'CDK.*'

# ==== Meta data ====
# What are your objects called?
#   object_name  =  singular object name, plural object name,
object_name  =  cell, cells,

# What size plates were used?  96, 384 or 5600?  This is for use in the PlateViewer. Leave blank if none
plate_type  = 

# ==== Excluded Columns ====
# Classifier uses columns in your per_object table to find rules. It will
# automatically ignore ID columns defined in table_id, image_id, and object_id
# as well as any columns that contain non-numeric data.
# Here you may list other columns in your per_object table that you wish the
# classifier to ignore when finding rules.
# You may also use regular expressions here to match more general column names.
# Example: classifier_ignore_columns = WellID, Meta_.*, .*_Position
#   This will ignore any column named "WellID", any columns that start with
#   "Meta_", and any columns that end in "_Position".
# A more restrictive example:
# classifier_ignore_columns = ImageNumber, ObjectNumber, .*Parent.*, .*Children.*, .*_Location_Center_.*,.*_Metadata_.*

classifier_ignore_columns  =  table_number_key_column, image_number_key_column, object_number_key_column

# ==== Other ====
# Specify the approximate diameter of your objects in pixels here.
image_tile_size   =  90

# ======== Auto Load Training Set ========
# You may enter the full path to a training set that you would like Classifier
# to automatically load when started.

training_set  = 

# ======== Area Based Scoring ========
# You may specify a column in your per-object table which will be summed and
# reported in place of object-counts when scoring.  The typical use for this
# is to report the areas of objects on a per-image or per-group basis.

area_scoring_column =

# ======== Output Per-Object Classes ========
# Here you can specify a MySQL table in your Database where you would like
# Classifier to write out class information for each object in the
# object_table

class_table  = 

# ======== Check Tables ========
# [yes/no]  You can ask classifier to check your tables for anomalies such
# as orphaned objects or missing column indices.  Default is on.
# This check is run when Classifier starts and may take up to a minute if
# your object_table is extremely large.

check_tables = yes

Aha! Indeed your intuition is correct that something is wonky in the Identify module, given it shows 732 objects, and you are seeing those tiny dots on each nucleus. Instead, this module should identify each nucleus as a separate object, neatly outlined. Then CPA should work just fine!

Take a look at the help for Identify, and there’s a CellProfiler blog post about adjusting its settings, too.
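If it helps to confirm the object count outside of CPA, here is a minimal sketch in Python that tallies objects per image straight from the exported SQLite database. It assumes the table and column names from the properties file above (`Per_Object`, `ImageNumber`); the helper name `objects_per_image` and the toy in-memory data are made up for illustration, so swap in a connection to your own .db file to try it on real output.

```python
import sqlite3


def objects_per_image(conn, object_table="Per_Object", image_id="ImageNumber"):
    """Return {image_number: object_count} from a per-object table."""
    # SQLite cannot bind identifiers as parameters, so the table and
    # column names are interpolated (and quoted) into the query text.
    query = (
        f'SELECT "{image_id}", COUNT(*) FROM "{object_table}" '
        f'GROUP BY "{image_id}" ORDER BY "{image_id}"'
    )
    return dict(conn.execute(query).fetchall())


# Toy stand-in for the exported database (hypothetical data):
# image 1 has 4 objects, image 2 has 2 objects.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Per_Object (ImageNumber INTEGER, ObjectNumber INTEGER)")
rows = [(1, n) for n in range(1, 5)] + [(2, n) for n in range(1, 3)]
conn.executemany("INSERT INTO Per_Object VALUES (?, ?)", rows)

counts = objects_per_image(conn)
print(counts)
```

If the count for an image is several times the number of cells you can see by eye, the over-segmentation is happening in CellProfiler, before CPA ever reads the data.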

Hi Anne,

I see what you mean. I took a look at the blog post, and I now realize that the issue may be that I am trying to identify cells as my primary objects, when I should be identifying them as secondary objects. However, my images do not contain nuclei, so I can’t identify any nuclei as primary objects, and without primary objects, I can’t identify my secondary objects (cells). Is there some way to either count the cells as primary objects or as secondary objects without the input of primary objects? Thanks again for all your help!

Yes, absolutely cells can be primary. Primary just means “the first thing identified in the image”. If you’re having trouble, post images and a pipeline you’ve attempted and we will see what we can do!

Hi Anne,

So I adjusted my pipeline to make sure that each cell is counted/outlined only once by IdentifyPrimaryObjects in CellProfiler, but for some reason CPA is still labelling the cells multiple times. Now I suspect there might be a problem with my properties file for CPA or some setting in the Classifier tool. Since I am a “new user”, the forum won’t allow me to upload my pipeline, but I have attached screenshots of the results of my IdentifyPrimaryObjects step, the CellProfiler window (so you can see the modules), and the output of the CPA Classifier after training.

Wow, nice job configuring IdentifyPrimaryObjects!

You are so close now… My guess is that you are loading an old properties file (or the one you have loaded links back to the data from your first run that had multiple identifications per cell). Can you check that everything has been refreshed to your new CellProfiler analysis?
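One rough way to spot a stale properties file is to compare the columns it names against what is actually in the database. The sketch below is only an illustration under assumptions: the helper names `read_properties` and `missing_columns` are made up, the object name `Cells` in the toy database is hypothetical, and the check only catches column mismatches (it will not notice a properties file that points at an older database with the same column names).

```python
import sqlite3


def read_properties(text):
    """Parse 'key = value' lines, skipping comments and blank lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_columns(conn, props):
    """Return columns named in the properties that the object table lacks."""
    table = props["object_table"]
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
    wanted = [props[k] for k in ("image_id", "object_id", "cell_x_loc", "cell_y_loc")]
    return [col for col in wanted if col not in existing]


# Excerpt of an old properties file that still refers to "Nuclei".
old_properties = """
# ==== Database Tables ====
object_table  = Per_Object
image_id      = ImageNumber
object_id     = ObjectNumber
cell_x_loc    = Nuclei_Location_Center_X
cell_y_loc    = Nuclei_Location_Center_Y
"""

# Toy database from a re-run pipeline whose objects are named "Cells" (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Per_Object (ImageNumber INTEGER, ObjectNumber INTEGER, "
    "Cells_Location_Center_X REAL, Cells_Location_Center_Y REAL)"
)

missing = missing_columns(conn, read_properties(old_properties))
print(missing)
```

An empty list means the column names line up; anything listed suggests the properties file was written for a different run.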

Hi Anne,

It works! You were right, I wasn’t making a new properties file for the edited pipeline and it appears that the old one was still trying to count nuclei. Thanks so very much for your help with this, problem solved!


Super, thanks for reporting back and have fun!
