Classification of infected cells

Hello everybody!

I’m currently working with the protozoan Trypanosoma cruzi, which acts as an intracellular parasite at a certain stage of its development, and I’m running invasion assays.
This kind of assay consists of quantifying the number of infected cells under a given condition. Since I have too many images and too many conditions, I’d like to automate the process using CP and/or CPA, but I’m not getting the results I expected.

I’ve attached an example image from my experiment. It’s a DAPI-stained picture. As you probably know, the large circles are the mammalian cell nuclei and the small circles are the parasite nuclei. I would like to classify cells as infected or non-infected.

Since I couldn’t find a way to do this using only CP, I’ve tried this pipeline combining CP and CPA:

- IdentifyPrimaryObjects: nucleus; Typical diameter of objects, in pixel units (Min,Max): 35,100
- IdentifySecondaryObjects: membrane; Distance - N = 40 px

(Below is the .properties file used in CPA.)

I tried using CPA, but apparently it doesn’t converge and cannot correctly classify my cells. I don’t know if there is some other kind of object I need to identify or some other tool I need to use.
Could you please help with this?

Thanks in advance!

#Mon Feb  3 14:15:02 2014
# ==============================================
# CellProfiler Analyst 2.0 properties file
# ==============================================

# ==== Database Info ====
db_type      = mysql
db_port      = 3306
db_host      = localhost
db_name      = cpa
db_user      = aarteixeira
db_passwd    = 

# ==== Database Tables ====
image_table   = a140203_Per_Image
object_table  = a140203_Per_Object

# ==== Database Columns ====
# Specify the database column names that contain unique IDs for images and
# objects (and optionally tables).
# table_id (OPTIONAL): This field lets Classifier handle multiple tables if
#          you merge them into one and add a table_number column as a foreign
#          key to your per-image and per-object tables.
# image_id: must be a foreign key column between your per-image and per-object
#           tables
# object_id: the object key column from your per-object table

image_id      = ImageNumber
object_id     = ObjectNumber
plate_id      = 
well_id       = 

# Also specify the column names that contain X and Y coordinates for each
# object within an image.
cell_x_loc    = nucleus_Location_Center_X
cell_y_loc    = nucleus_Location_Center_Y

# ==== Image Path and File Name Columns ====
# Classifier needs to know where to find the images from your experiment.
# Specify the column names from your per-image table that contain the image
# paths and file names here.
# Individual image files are expected to be monochromatic and represent a single
# channel. However, any number of images may be combined by adding a new channel
# path and filename column to the per-image table of your database and then
# adding those column names here.
# NOTE: These lists must have equal length!
image_path_cols = Image_PathName_image
image_file_cols = Image_FileName_image

# CPA will now read image thumbnails directly from the database, if chosen in ExportToDatabase.

image_thumbnail_cols = Image_Thumbnail_image

# Give short names for each of the channels (respectively)...
image_names = image

# Specify a default color for each of the channels (respectively)
# Valid colors are: [red, green, blue, magenta, cyan, yellow, gray, none]

image_channel_colors = gray,

# ==== Image Accesss Info ====
image_url_prepend = 

# ==== Dynamic Groups ====
# Here you can define groupings to choose from when classifier scores your experiment.  (eg: per-well)
# This is OPTIONAL, you may leave "groups = ".
#   group_XXX  =  MySQL select statement that returns image-keys and group-keys.  This will be associated with the group name "XXX" from above.
#   groups               =  Well, Gene, Well+Gene,
#   group_SQL_Well       =  SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Per_Image_Table.well FROM Per_Image_Table
#   group_SQL_Gene       =  SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well
#   group_SQL_Well+Gene  =  SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.well, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well

# ==== Image Filters ====
# Here you can define image filters to let you select objects from a subset of your experiment when training the classifier.
# This is OPTIONAL, you may leave "filters = ".
#   filter_SQL_XXX  =  MySQL select statement that returns image keys you wish to filter out.  This will be associated with the filter name "XXX" from above.
#   filters           =  EMPTY, CDKs,
#   filter_SQL_EMPTY  =  SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene="EMPTY"
#   filter_SQL_CDKs   =  SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene REGEXP 'CDK.*'

# ==== Meta data ====
# What are your objects called?
#   object_name  =  singular object name, plural object name,
object_name  =  nucleus, nuclei, membrane, membranes, cytoplasm, cytoplasms

# What size plates were used?  96, 384 or 5600?  This is for use in the PlateViewer. Leave blank if none
plate_type  =

# ==== Excluded Columns ====
# Classifier uses columns in your per_object table to find rules. It will
# automatically ignore ID columns defined in table_id, image_id, and object_id
# as well as any columns that contain non-numeric data.
# Here you may list other columns in your per_object table that you wish the
# classifier to ignore when finding rules.
# You may also use regular expressions here to match more general column names.
# Example: classifier_ignore_columns = WellID, Meta_.*, .*_Position
#   This will ignore any column named "WellID", any columns that start with
#   "Meta_", and any columns that end in "_Position".
# A more restrictive example:
# classifier_ignore_columns = ImageNumber, ObjectNumber, .*Parent.*, .*Children.*, .*_Location_Center_.*,.*_Metadata_.*

classifier_ignore_columns  =  table_number_key_column, image_number_key_column, object_number_key_column, .*Location.*

# ==== Other ====
# Specify the approximate diameter of your objects in pixels here.
image_tile_size   =  150

# ======== Auto Load Training Set ========
# You may enter the full path to a training set that you would like Classifier
# to automatically load when started.

training_set  = 

# ======== Area Based Scoring ========
# You may specify a column in your per-object table which will be summed and
# reported in place of object-counts when scoring.  The typical use for this
# is to report the areas of objects on a per-image or per-group basis.

area_scoring_column =

# ======== Output Per-Object Classes ========
# Here you can specify a MySQL table in your Database where you would like
# Classifier to write out class information for each object in the
# object_table

class_table  =

# ======== Check Tables ========
# [yes/no]  You can ask classifier to check your tables for anomalies such
# as orphaned objects or missing column indices.  Default is on.
# This check is run when Classifier starts and may take up to a minute if
# your object_table is extremely large.

check_tables = yes

The machine learning tool in CPA requires per-object measurements in order to work. As it stands, your pipeline identifies the objects but doesn’t measure anything, hence the failure in classification. If you believe the nuclei are being identified reasonably well, then after the identification modules you can insert modules such as MeasureObjectSizeShape, MeasureObjectIntensity, MeasureRadialDistribution and MeasureTexture. Set them to measure every object type your pipeline identifies (primary, secondary and, if you add one, tertiary), with the DAPI image as input for the modules in which that’s appropriate.

However, I wonder whether you can identify just the parasites with one IdentifyPrimary module, the nuclei with another, use IdentifySecondary to define the cell body (using Distance-N since you have no cell stain), and then use RelateObjects to assign the parasites as “children” to the enclosing “parent” cell. That way, you can define an infected cell as one that has one or more child parasites.
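
Once RelateObjects is in place, the counting step could look roughly like the Python sketch below. It is only an illustration under some assumptions: the per-cell measurements have been exported to a CSV file, the parasite objects are named "Parasites" (so the child count lands in a column called Children_Parasites_Count), and each row carries an ImageNumber. Adjust the column name to whatever object names your pipeline actually uses; in a per-object database table the column may also carry the parent object's name as a prefix. The small table in the example is made up purely to show the logic.

```python
import csv
import io
from collections import defaultdict


def infection_rate_per_image(rows, count_col="Children_Parasites_Count"):
    """Return {image_number: (infected_cells, total_cells, infected_fraction)}.

    `rows` is any iterable of dicts, e.g. a csv.DictReader over the per-cell
    export. `count_col` is an assumed column name; change it to match the
    object names used in your own pipeline.
    """
    totals = defaultdict(int)
    infected = defaultdict(int)
    for row in rows:
        image = int(row["ImageNumber"])
        totals[image] += 1
        # A cell counts as infected if it has at least one child parasite.
        if float(row[count_col]) >= 1:
            infected[image] += 1
    return {img: (infected[img], n, infected[img] / n) for img, n in totals.items()}


# Made-up data standing in for an exported per-cell CSV file.
sample = io.StringIO(
    "ImageNumber,ObjectNumber,Children_Parasites_Count\n"
    "1,1,0\n"
    "1,2,3\n"
    "1,3,1\n"
    "2,1,0\n"
)

rates = infection_rate_per_image(csv.DictReader(sample))
for image, (n_infected, n_total, fraction) in sorted(rates.items()):
    print(image, n_infected, n_total, round(fraction, 2))
```

To run it on a real export, replace the `sample` object with an opened file, for example `open("Cells.csv", newline="")`, where the file name is again just a placeholder.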