Sqlite and CPA

Hello there,
I am hardly into software but am trying to use CP and CPA for WBC differential counts and also to detect leukemic cells.I have downloaded the CP v.9777.After a successful run of analysis ( all WBCs detected and counted ) ,with Export to database having SQLITE as the database type, an examination of the DefaultDB.db in the CP Output folder has a very large number of blackened out NULL,EOT, SOH, EXT,STX entries. After making the necessary adjustment to the properties file , classifier does accept the properties file but when asked to fetch it says that the Default DB.db is empty without any entries.
I have a number of very pertinent questions ( AN ABSOLUTE IDIOTS GUIDE TO CP/CPA , is a good idea !), the answers to which I haven’t been able to find on the FAQs and the Forum Topics, in the past month and a half.

  1. Do we have do download some kind of SQLITE program to create an SQLITE Database ?
    2.Where exactly do I find the name of the image and object tables ? CPA says there aren’t tables called Per_Image or Per _Object.
  2. What or where is the “foreign key column between your per-image and per-object tables” ?
  3. What or where is the “the object key column from your per-object table” ?
  4. While specifying the column names that contain X and Y coordinates , do we use the letters the columns are named by for example E and F or do we number them 5 and 6 etc. ?
  5. Where are the image channel paths ?
    7.What is Empty randrange() ( 1,1 ,0)?
    You guys are are doing wonderful work.
    Jeremy
    DefaultOUT_Cells.csv (1.27 KB)
    DefaultOUT_Image.csv (3.19 KB)

Thanks for the feedback Jeremy, I should be able to help you. First of all, I want to mention that I recently posted an updated CPA example dataset that you should definitely download and take a look at (available here). The one that was previously posted was incorrect and causing a lot of people confusion. This one should work right out of the box so you can see how CPA should work and then look at the props file and data to understand what’s going on.

Getting to your questions:

  1. Currently the best and most reliable way to use SQLite as your database is to use ExportToDatabase with the “MySQL / CSV” option. The SQLite feature is still new and may still have some kinks that need to be ironed out. Here’s how it works with MySQL/CSV: CP will output per-image and per-object csv files along with a setup.sql file. You can then set your db_ fields in CPA as follows:
    db_type = sqlite
    db_sql_file =
    CPA will then use the setup.sql file to find the csv files and build the SQLite database for you. My apologies if this is unintuitive, it’s an artifact of the way we used to do things, but we’re working to improve it.

  2. Using the above method, the names will be Per_Image and Per_Object by default unless you specify a prefix in the CP ExportToDatabase module config panel. You can check this by looking at the csv file names that were output or by looking in the setup.sql file with wordpad, emacs, or textedit. (this file defines the DB schema, that is, what tables will be written and what columns and column types they have)… The above method should set these in your props file for you if you check off “create a CPAnalyst properties file”, so you shouldn’t have to sweat it.

  3. This would be the ImageNumber column. – this is automatically set in your props file

  4. ObjectNumber – also automatically set in props

  5. These should be set automatically if you do step 1, the columns are the names though, you could find them in the setup.sql file, they’ll end with X and Y, but you shouldn’t have to go hunting for this.

  6. Also automatically set for you though you can find them in setup.sql as well.

  7. Sounds like an error that’s an artifact of an incomplete or improperly setup database.

Let me know if you need any more help – hopefully the above will get you off the ground. In the meantime I’m going to look more closely at the SQLite mode on ExportToDatabase because I’m almost certain it doesn’t work the way it should.

Thanks!
Adam

Hi Adam, and thanks for the speedy reply.
The downloaded CPA example file works like a charm, but not the output from CP.
I did all that you suggested, but on loading the Props file, CPA says that it cant find the SQL_SETUP.SQL file , which is exactly the name of the setup file from the CP output.
I have enclosed a typical image from my lab , and the pipeline I’m using on CP.
Thanks
Jeremy
Smear.cp (8.74 KB)


It would actually be more helpful for me to see the CPA properties file that you have and the error output you get in the CPA console window.

At any rate, make sure that you set the db_sql_file to the full path to your file.
eg:

or if you’re using windows:

If you set it to SQL_SETUP.SQL, CPA will look for it in the same directory as your properties file.

-Adam

Hi Adam,
I have copied the Props file here,

#Thu Apr 15 05:25:08 2010

==============================================

Classifier 2.0 properties file

==============================================

==== Database Info ====

db_type = sqlite
db_sql_file = C:\Programs\CellProfiler\CPOutput\SQL_SETUP.SQL

==== Database Tables ====

image_table = Per_Image
object_table = Per_Object

==== Database Columns ====

Specify the database column names that contain unique IDs for images and

objects (and optionally tables).

table_id (OPTIONAL): This field lets Classifier handle multiple tables if

you merge them into one and add a table_number column as a foreign

key to your per-image and per-object tables.

image_id: must be a foreign key column between your per-image and per-object

tables

object_id: the object key column from your per-object table

image_id = ImageNumber
object_id = ObjectNumber

Also specify the column names that contain X and Y coordinates for each

object within an image.

cell_x_loc = Nuclei_Location_Center_X
cell_y_loc = Nuclei_Location_Center_Y

==== Image Path and File Name Columns ====

Classifier needs to know where to find the images from your experiment.

Specify the column names from your per-image table that contain the image

paths and file names here.

Individual image files are expected to be monochromatic and represent a single

channel. However, any number of images may be combined by adding a new channel

path and filename column to the per-image table of your database and then

adding those column names here.

NOTE: These lists must have equal length!

image_channel_paths = Image_PathName_Smear,
image_channel_files = Image_FileName_Smear,

Give short names for each of the channels (respectively)…

image_channel_names = Smear,

Specify a default color for each of the channels (respectively)

Valid colors are: [red, green, blue, magenta, cyan, yellow, gray, none]

image_channel_colors = gray,

==== Image Accesss Info ====

image_url_prepend = imageweb/images/CPALinks

==== Dynamic Groups ====

Here you can define groupings to choose from when classifier scores your experiment. (eg: per-well)

This is OPTIONAL, you may leave "groups = ".

FORMAT:

groups = comma separated list of group names (MUST END IN A COMMA IF THERE IS ONLY ONE GROUP)

group_XXX = MySQL select statement that returns image-keys and group-keys. This will be associated with the group name “XXX” from above.

EXAMPLE GROUPS:

groups = Well, Gene, Well+Gene,

group_SQL_Well = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Per_Image_Table.well FROM Per_Image_Table

group_SQL_Gene = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well

group_SQL_Well+Gene = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.well, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well

groups =

==== Image Filters ====

Here you can define image filters to let you select objects from a subset of your experiment when training the classifier.

This is OPTIONAL, you may leave "filters = ".

FORMAT:

filters = comma separated list of filter names (MUST END IN A COMMA IF THERE IS ONLY ONE FILTER)

filter_SQL_XXX = MySQL select statement that returns image keys you wish to filter out. This will be associated with the filter name “XXX” from above.

EXAMPLE FILTERS:

filters = EMPTY, CDKs,

filter_SQL_EMPTY = SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene=“EMPTY”

filter_SQL_CDKs = SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene REGEXP ‘CDK.*’

filters =

==== Meta data ====

What are your objects called?

FORMAT:

object_name = singular object name, plural object name,

object_name = cell, cells,

What size plates were used? 384 or 96? This is for use in the PlateViewer

==== Excluded Columns ====

OPTIONAL

Classifier uses columns in your per_object table to find rules. It will

automatically ignore ID columns defined in table_id, image_id, and object_id

as well as any columns that contain non-numeric data.

Here you may list other columns in your per_object table that you wish the

classifier to ignore when finding rules.

You may also use regular expressions here to match more general column names.

Example: classifier_ignore_columns = WellID, Meta_., ._Position

This will ignore any column named “WellID”, any columns that start with

“Meta_”, and any columns that end in “_Position”.

A more restrictive example:

classifier_ignore_columns = ImageNumber, ObjectNumber, .Parent., .Children., .Location_Center.,.Metadata.

classifier_ignore_columns = table_number_key_column, image_number_key_column, object_number_key_column

==== Other ====

Specify the approximate diameter of your objects in pixels here.

image_tile_size = 50

======== Auto Load Training Set ========

OPTIONAL

You may enter the full path to a training set that you would like Classifier

to automatically load when started.

training_set =

======== Area Based Scoring ========

OPTIONAL

You may specify a column in your per-object table which will be summed and

reported in place of object-counts when scoring. The typical use for this

is to report the areas of objects on a per-image or per-group basis.

area_scoring_column =

======== Output Per-Object Classes ========

OPTIONAL

Here you can specify a MySQL table in your Database where you would like

Classifier to write out class information for each object in the

object_table

class_table =

======== Check Tables ========

OPTIONAL

[yes/no] You can ask classifier to check your tables for anomalies such

as orphaned objects or missing column indices. Default is on.

This check is run when Classifier starts and may take up to a minute if

your object_table is extremely large.

check_tables = yes

#Thu Apr 15 05:25:08 2010

==============================================

Classifier 2.0 properties file

==============================================

==== Database Info ====

db_type = sqlite
db_sql_file = C:\Programs\CellProfiler\CPOutput\SQL_SETUP.SQL

==== Database Tables ====

image_table = Per_Image
object_table = Per_Object

==== Database Columns ====

Specify the database column names that contain unique IDs for images and

objects (and optionally tables).

table_id (OPTIONAL): This field lets Classifier handle multiple tables if

you merge them into one and add a table_number column as a foreign

key to your per-image and per-object tables.

image_id: must be a foreign key column between your per-image and per-object

tables

object_id: the object key column from your per-object table

image_id = ImageNumber
object_id = ObjectNumber

Also specify the column names that contain X and Y coordinates for each

object within an image.

cell_x_loc = Nuclei_Location_Center_X
cell_y_loc = Nuclei_Location_Center_Y

==== Image Path and File Name Columns ====

Classifier needs to know where to find the images from your experiment.

Specify the column names from your per-image table that contain the image

paths and file names here.

Individual image files are expected to be monochromatic and represent a single

channel. However, any number of images may be combined by adding a new channel

path and filename column to the per-image table of your database and then

adding those column names here.

NOTE: These lists must have equal length!

image_channel_paths = Image_PathName_Smear,
image_channel_files = Image_FileName_Smear,

Give short names for each of the channels (respectively)…

image_channel_names = Smear,

Specify a default color for each of the channels (respectively)

Valid colors are: [red, green, blue, magenta, cyan, yellow, gray, none]

image_channel_colors = gray,

==== Image Accesss Info ====

image_url_prepend = imageweb/images/CPALinks

==== Dynamic Groups ====

Here you can define groupings to choose from when classifier scores your experiment. (eg: per-well)

This is OPTIONAL, you may leave "groups = ".

FORMAT:

groups = comma separated list of group names (MUST END IN A COMMA IF THERE IS ONLY ONE GROUP)

group_XXX = MySQL select statement that returns image-keys and group-keys. This will be associated with the group name “XXX” from above.

EXAMPLE GROUPS:

groups = Well, Gene, Well+Gene,

group_SQL_Well = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Per_Image_Table.well FROM Per_Image_Table

group_SQL_Gene = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well

group_SQL_Well+Gene = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.well, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well

groups =

==== Image Filters ====

Here you can define image filters to let you select objects from a subset of your experiment when training the classifier.

This is OPTIONAL, you may leave "filters = ".

FORMAT:

filters = comma separated list of filter names (MUST END IN A COMMA IF THERE IS ONLY ONE FILTER)

filter_SQL_XXX = MySQL select statement that returns image keys you wish to filter out. This will be associated with the filter name “XXX” from above.

EXAMPLE FILTERS:

filters = EMPTY, CDKs,

filter_SQL_EMPTY = SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene=“EMPTY”

filter_SQL_CDKs = SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene REGEXP ‘CDK.*’

filters =

==== Meta data ====

What are your objects called?

FORMAT:

object_name = singular object name, plural object name,

object_name = cell, cells,

What size plates were used? 384 or 96? This is for use in the PlateViewer

==== Excluded Columns ====

OPTIONAL

Classifier uses columns in your per_object table to find rules. It will

automatically ignore ID columns defined in table_id, image_id, and object_id

as well as any columns that contain non-numeric data.

Here you may list other columns in your per_object table that you wish the

classifier to ignore when finding rules.

You may also use regular expressions here to match more general column names.

Example: classifier_ignore_columns = WellID, Meta_., ._Position

This will ignore any column named “WellID”, any columns that start with

“Meta_”, and any columns that end in “_Position”.

A more restrictive example:

classifier_ignore_columns = ImageNumber, ObjectNumber, .Parent., .Children., .Location_Center.,.Metadata.

classifier_ignore_columns = table_number_key_column, image_number_key_column, object_number_key_column

==== Other ====

Specify the approximate diameter of your objects in pixels here.

image_tile_size = 50

======== Auto Load Training Set ========

OPTIONAL

You may enter the full path to a training set that you would like Classifier

to automatically load when started.

training_set =

======== Area Based Scoring ========

OPTIONAL

You may specify a column in your per-object table which will be summed and

reported in place of object-counts when scoring. The typical use for this

is to report the areas of objects on a per-image or per-group basis.

area_scoring_column =

======== Output Per-Object Classes ========

OPTIONAL

Here you can specify a MySQL table in your Database where you would like

Classifier to write out class information for each object in the

object_table

class_table =

======== Check Tables ========

OPTIONAL

[yes/no] You can ask classifier to check your tables for anomalies such

as orphaned objects or missing column indices. Default is on.

This check is run when Classifier starts and may take up to a minute if

your object_table is extremely large.

check_tables = yes

This is the text of the error output from CPA

An error occurred in the program:
Exception: PROPERTIES ERROR (db_sql_file): File “C:\Programs\CellProfiler\CPOutput\SQL_SETUP.SQL” could not be found.

Traceback (most recent call last):
File “cpa.py”, line 351, in
File “Properties.pyc”, line 80, in show_load_dialog
File “Properties.pyc”, line 156, in LoadFile
File “Properties.pyc”, line 252, in Validate

Thanks,
Jeremy
jEREMY

Alright, I made corrections and posted what I have below with comments… this should be all you need to get up and running assuming the values specified are correct – which I can’t be sure of from what you’ve sent so far.

One thing I want to make sure of is that you are using ExportToDatabase in CP with the MySQL/CSV option because the CSV files that you sent me previously looked like they were written with ExportToSpreadsheet.

Double check that the file “C:\Programs\CellProfiler\CPOutput\SQL_SETUP.SQL” exists, and I’ll check with a windows user to make sure you don’t have to do anything special to the path:
eg: C:\Programs\CellProfiler\CPOutput\SQL_SETUP.SQL or C:/Programs/CellProfiler/CPOutput/SQL_SETUP.SQL
I don’t think this is the case, but I’ll check for you.

-Adam

[code]# ==============================================

CPAnalyst 2.0 Properties file

==============================================

==== Database Info ====

db_type = sqlite
db_sql_file = C:\Programs\CellProfiler\CPOutput\SQL_SETUP.SQL

==== Database Tables ====

image_table = Per_Image
object_table = Per_Object

==== Database Columns ====

image_id = ImageNumber
object_id = ObjectNumber

cell_x_loc = Nuclei_Location_Center_X
cell_y_loc = Nuclei_Location_Center_Y

==== Image Path and File Name Columns ====

image_channel_paths = Image_PathName_Smear,
image_channel_files = Image_FileName_Smear,
image_channel_names = Smear,
image_channel_colors = gray,

==== Image Accesss Info ====

afraser: you only need this if you’re accessing your images via http://

(CP writes it out by default, so it’s our fault you had it uncommented)

image_url_prepend =

==== Meta data ====

object_name = cell, cells,

==== Other ====

Specify the approximate diameter of your objects in pixels here.

image_tile_size = 50

You still had the “fill-me” values in here. I plugged in the key columns, but these columns

as well as all categorical columns are automatically ignored by classifier. If there are other

columns that you feel like classifier should ignore, you can list them here… some people

include the X and Y location columns since these features usually have nothing to do with

phenotype

classifier_ignore_columns = ImageNumber, ObjectNumber

check_tables = yes[/code]

Thanks Adam,
I’ll give it a try, and like you say I have an Export to Spreadsheet module in my pipeline prior to the Export to Database module.
I’ll remove it and give it a go.
Jeremy

Something else you might try is to put your properties file in the CP output directory along with the SQL_SETUP.sql file… then change the properties field db_sql_file to:

db_sql_file = SQL_SETUP.sql

Hi Adam,
I tried both options. The Props file you sent me and the shortened db_sql_file with both in the same directory. The first option led to the same error of the CPA not being to find the file, and the second led to the CPA interface freezing (had to be closed with the task manager).
I could hand you my desktop on Team Viewer ( a 2 MB download if you don’t already have it ), so as you could take a first hand look at my setup.
My Team Viewer ID 692753495
My Team Viewer Password wonderfulworld
I am online now and ready to roll. In case you aren’t just let me know when you can access my desktop and I’ll keep my system on and online at that time.
Thanks a million,
Jeremy