SqLite and CPA

Hi,

sorry for opening another of these Topics,

i just wanted to know whether the SqLite Export is supposed to work already…
using the SqLite Export i can look at the individual images, but my problem is the following:

  • i extract Metadata from the Filename, giving me 2 columns in the Image Table with names
    Metadata_Plate
    Metadata_Well

but i do not get CPA to find/recognise/use these fields.

The error message as a response to the command:
“[MainThread] SELECT Metadata_Well FROM Per_Image WHERE Metadata_Well!=”" GROUP BY Metadata_Well" "

is:
"An error occurred in the program:
OperationalError: no such column: Metadata_Well

Traceback (most recent call last):
File “cpa.py”, line 175, in launch_plate_map_browser
File “PlateMapBrowser.pyc”, line 206, in init
File “PlateMapBrowser.pyc”, line 240, in AddPlateMap
File “DBConnect.pyc”, line 389, in execute"

can you maybe help me?

Best, Christian.

here is my properties file:


#Fri Apr 16 13:14:59 2010

==============================================

Classifier 2.0 properties file

==============================================

==== Database Info ====

db_type = sqlite
db_sqlite_file = C:\Documents and Settings\Christian Tischer\My Documents\Projects\Gyula_Timinszky_GroupLadurner\results\TrackNuclei/DefaultDB.db

==== Database Tables ====

image_table = Per_Image
object_table = Per_Object

==== Database Columns ====

Specify the database column names that contain unique IDs for images and

objects (and optionally tables).

table_id (OPTIONAL): This field lets Classifier handle multiple tables if

you merge them into one and add a table_number column as a foreign

key to your per-image and per-object tables.

image_id: must be a foreign key column between your per-image and per-object

tables

object_id: the object key column from your per-object table

image_id = ImageNumber
object_id = ObjectNumber
plate_id = Metadata_Well
well_id = Metadata_Well

Also specify the column names that contain X and Y coordinates for each

object within an image.

cell_x_loc = Nuclei_Location_Center_X
cell_y_loc = Nuclei_Location_Center_Y

==== Image Path and File Name Columns ====

Classifier needs to know where to find the images from your experiment.

Specify the column names from your per-image table that contain the image

paths and file names here.

Individual image files are expected to be monochromatic and represent a single

channel. However, any number of images may be combined by adding a new channel

path and filename column to the per-image table of your database and then

adding those column names here.

NOTE: These lists must have equal length!

image_channel_paths = Image_PathName_OrigGFP,Image_PathName_OrigNuc,
image_channel_files = Image_FileName_OrigGFP,Image_FileName_OrigNuc,

Give short names for each of the channels (respectively)…

image_channel_names = OrigGFP,OrigNuc,

Specify a default color for each of the channels (respectively)

Valid colors are: [red, green, blue, magenta, cyan, yellow, gray, none]

image_channel_colors = red,green,blue,cyan,magenta,yellow,gray,none,none,none,

==== Image Accesss Info ====

image_url_prepend =

==== Dynamic Groups ====

Here you can define groupings to choose from when classifier scores your experiment. (eg: per-well)

This is OPTIONAL, you may leave "groups = ".

FORMAT:

groups = comma separated list of group names (MUST END IN A COMMA IF THERE IS ONLY ONE GROUP)

group_XXX = MySQL select statement that returns image-keys and group-keys. This will be associated with the group name “XXX” from above.

EXAMPLE GROUPS:

groups = Well, Gene, Well+Gene,

group_SQL_Well = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Per_Image_Table.well FROM Per_Image_Table

group_SQL_Gene = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well

group_SQL_Well+Gene = SELECT Per_Image_Table.TableNumber, Per_Image_Table.ImageNumber, Well_ID_Table.well, Well_ID_Table.gene FROM Per_Image_Table, Well_ID_Table WHERE Per_Image_Table.well=Well_ID_Table.well

groups =

==== Image Filters ====

Here you can define image filters to let you select objects from a subset of your experiment when training the classifier.

This is OPTIONAL, you may leave "filters = ".

FORMAT:

filters = comma separated list of filter names (MUST END IN A COMMA IF THERE IS ONLY ONE FILTER)

filter_SQL_XXX = MySQL select statement that returns image keys you wish to filter out. This will be associated with the filter name “XXX” from above.

EXAMPLE FILTERS:

filters = EMPTY, CDKs,

filter_SQL_EMPTY = SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene=“EMPTY”

filter_SQL_CDKs = SELECT TableNumber, ImageNumber FROM CPA_per_image, Well_ID_Table WHERE CPA_per_image.well=Well_ID_Table.well AND Well_ID_Table.Gene REGEXP ‘CDK.*’

filters =

==== Meta data ====

What are your objects called?

FORMAT:

object_name = singular object name, plural object name,

object_name = cell, cells,

What size plates were used? 384 or 96? This is for use in the PlateViewer

plate_type = 96

==== Excluded Columns ====

OPTIONAL

Classifier uses columns in your per_object table to find rules. It will

automatically ignore ID columns defined in table_id, image_id, and object_id

as well as any columns that contain non-numeric data.

Here you may list other columns in your per_object table that you wish the

classifier to ignore when finding rules.

You may also use regular expressions here to match more general column names.

Example: classifier_ignore_columns = WellID, Meta_., ._Position

This will ignore any column named “WellID”, any columns that start with

“Meta_”, and any columns that end in “_Position”.

A more restrictive example:

classifier_ignore_columns = ImageNumber, ObjectNumber, .Parent., .Children., .Location_Center.,.Metadata.

classifier_ignore_columns = table_number_key_column, image_number_key_column, object_number_key_column

==== Other ====

Specify the approximate diameter of your objects in pixels here.

image_tile_size = 50

======== Auto Load Training Set ========

OPTIONAL

You may enter the full path to a training set that you would like Classifier

to automatically load when started.

training_set =

======== Area Based Scoring ========

OPTIONAL

You may specify a column in your per-object table which will be summed and

reported in place of object-counts when scoring. The typical use for this

is to report the areas of objects on a per-image or per-group basis.

area_scoring_column =

======== Output Per-Object Classes ========

OPTIONAL

Here you can specify a MySQL table in your Database where you would like

Classifier to write out class information for each object in the

object_table

class_table =

======== Check Tables ========

OPTIONAL

[yes/no] You can ask classifier to check your tables for anomalies such

as orphaned objects or missing column indices. Default is on.

This check is run when Classifier starts and may take up to a minute if

your object_table is extremely large.

check_tables = yes

Hi Christian,

Could you post the pipeline and a few images as a .zip? I’ll try to reproduce and debug this.

Hi Ray,

here are images and pipeline.

and, b.t.w., there are no “Per_Image sql files” exported if i choose the SqLite export as far as i can see…maybe that is the problem?
probably i am simply doing a very stupid mistake somewhere :smile:

Thanks!
Tischi
Gyula–TrackNuclei.zip (1.24 MB)

Hi Tischi,

A couple of things:

  • I’ve noticed that you’ve specified the plate_id in the properties file as “Metadata_Well.” Did you mean “Metadata_Plate”?

  • Rather than sending images of the saved output, would you be able to send some of the raw images instead? That way, we can run the pipeline and re-generate the SQLite database?

Regards,
-Mark

Hi Mark,

thanks for having a look at this!!

(i) i currently have:

image_id = ImageNumber
object_id = ObjectNumber
plate_id = Metadata_Plate
well_id = Metadata_Well

resulting in the error that CPA could not find Metadata_Plate in “Per_Image…”
btw: the only output file that i get is “DefaultDB.db”, is this supposed to hold the Per_Image table?!

(ii) sorry, i probably uploaded the wrong data set. i gave it another try (attached).

Regards,
Tischi
data.zip (2.99 MB)

Hi Tishi,

The header names for your measurement columns in the SQLite file are slightly different that those for the MySQL options in ExportToDatabase and the format used in ExportToSpreadsheet. So in this case, the way to specify the plate and well ID is as “**Image_**Metadata_Plate” and “**Image_**Metadata_Well” rather than “Metadata_Plate” and “Metadata_Well” as you had before. Once you do that, it will work.

(To answer your other question: Yes, the SQLite database is the only file created; it holds both the per-image and per-object tables).

As an aside, when you view the images in CPA, you’ll probably want to use the linear or log scaling under the Image Controls; you don’t see much of anything under the default display options.

Regards,
-Mark

Hi Mark,

great! it works! thanks a lot!

may i add another question?!

in the example data set that i sent you, there are two positions with 4 time-points each…

the question would be whether there is any way to use the Analyst to display the data in some way that reflects the time-axis.

ideas would be:
(i) see all images in one well as a “browse-able movie”
(ii) somehow have an option to do ScatterPlots, where the x-axis reflects “time”

maybe, regarding (ii), is there some “trick” as to how to save some object- or image-measurements that reflect “time” such that one could select those to plot other numbers against?!

Regards,
Tischi

This is possible, but you would need to use LoadData rather than LoadImages. If you can create a comma-delimited file (.csv) that contains columns with the filenames and paths of all your images, along with the plate and well information, in addition to the time point, you can use it in your pipeline. See the help for LoadData to find out more information.

Of course, this means your .csv will need to be created before you run your pipeline. But you can create one without the time info by running an additional simple pipeline with LoadImages and ExportToSpreadsheet as the only modules. Point the default input folder at your directory containing your images, and it will generate the filelist .csv to which you can then add the time information.

All this information will be written to the SQLite database, and should be assessable using the plotting tools in CPA. You can then use the column representing the time point as your x-axis.

Regards,
-Mark

Hi Mark,

thanks for the answer i will look into this!

Regards, Tischi

Hi Mark,

i had a look at it but i am a little confused now :wink:

in fact, my original pipepline contains already a variable reflecting time, namely “Image_Metadata_Frame”, which i extract from the file-name.
this variable is also stored in the database and i can see it in CPA, the problem however is that i cannot use this as an x-axis (it simply does not appear as a choice).

my feeling would be that i would somehow have to convert the “Image_Metadata_Frame”, which is probably stored as something like a “metadata-string-variable” into a “measurement-number-variable” that qualifies for plotting.

what do you think?

Regards,
Tischi

When I looked at your pipeline, I was going to suggest using this approach originally, but as it turns out, CPA automatically excludes these CP-generated metadata columns because they are always stored as text. Hence, my LoadText suggestion as a workaround; I believe if you define the metadata numerically beforehand, it stays that way and it will show up as a plotting option.

But your approach is exactly right. Having better control over the type of metadata stored would be an improvement. Also, depending on how SQL-savvy you are, you could try modifying the SQLite database yourself accordingly, but I have not tried this myself.

Regards,
-Mark

…in fact what i am trying right now is to modify the “trackobjects” module such that it would store the “number of the image in the group” as a measurement, because i think this may be generally useful…
i am trying to dig into the source code to see whether there is a variable that holds this information (do you maybe know?).

if this does not work out i would implement in trackobjects that it converts a Metadata information (of choice) into a numeric number and store this as a numeric measurement.

maybe it would generally be a nice idea to have a module that converts “Metadata” into a numeric (plot-able) measurement, or?
if you think this would make more sense, then i would try to develop something like this…

in case some of this works out, would you guys be interested in adding such code to your “trunk build”?
more generally put: how would be the official way of contributing to the code of the new Cellprofiler?

Hi Tischi,

At this point, we are working on making numeric metadata captured from the path/filename recognized as such. When it is committed to the svn repository, I’ll post here.

Re: image number in group - You’re right, that would be useful. We’ll add that as a suggested improvement.

Re: contributions - We don’t have a formalized means for users to contribute (yet). But that is one of our large-scale objectives, so stay tuned.

Regards,
-Mark

Hi Mark,

I’ve just posted something similar on this Well Metadata Capture .

The documentation could do with pointing this out as it isn’t obvious that you can’t at present use the CP captured metadata in CPA.

I’ve already generated my image filenames that include a character row reference (e.g. A06) but if I was this again doing it again could I have given a numeric well plate reference in the filename e.g. 0409 row=04 col=09 then I could convert the output of this from VARCHAR to INT with a simple mysql command.

ALTER TABLE per_image CHANGE Image_Metadata_Well Image_Metadata_Well INT(20);

This would preserve the reference as an integer value.

unfortunately there is no easy mapping of the alphanumeric reference.

Any heads-up on the module you are developing for metadata capture?

Cheers,
Amos

Hi Amos,

This one is perplexing since metadata capture is already implemented, and hence shouldn’t require additional table modification.

I’m moving this thread over to the CPA forum to be addressed.
-Mark