Trouble loading image.CSV

cellprofiler-analyst

#1

CP Team,

I am a new user, just getting started. I was able to load the data from the example database into MySQL and get CP Analyst to work fine. I think I have CellProfiler working with my pipeline and a trial image set, but I’m having trouble loading the image.CSV file. I get “ERROR 1265 <01000>: Data truncated for column ‘StDev_Cytoplasm_Intensity_MassDisplacement_OrigPurple’ at row 1”. I tried manually adjusting file paths and names to remove anything that might be causing a problem, but I’m stuck. I tried attaching the MySQL output, the .CSV file and the .SQL file but I get an error message that those extensions are not allowed - maybe something is misset? Any suggestions would be greatly appreciated.


#2

I have attached a .zip file with the .SQL, .CSV and My SQL outputs.
CPAnalyst Files.zip (297 KB)


#3

Hi,

The problems were MySQL syntax related. I changed a a couple things and got the Per_Image table to upload:
(1) I commented out the end of your *_SETP.SQL file and made some changes relating to line termination. You have to be careful of the differences in line termination on PCs (dos) and other file systems (you are on a PC, yes?)

#LOAD DATA LOCAL INFILE 'C:/cp/20090224_EJG_Test/SQL_1_10_image.CSV' REPLACE INTO TABLE Per_Image FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY ''; #SHOW WARNINGS; #LOAD DATA LOCAL INFILE 'C:/cp/20090224_EJG_Test/SQL_1_10_object.CSV' REPLACE INTO TABLE Per_Object FIELDS TERMINATED BY ','; #SHOW WARNINGS; LOAD DATA LOCAL INFILE 'SQL_1_10_image.CSV' REPLACE INTO TABLE Per_Image FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n'; SHOW WARNINGS; LOAD DATA LOCAL INFILE 'SQL_1_10_object.CSV' REPLACE INTO TABLE Per_Object FIELDS TERMINATED BY ','; SHOW WARNINGS;

Note that I don’t have your Per_Object data to test, so you’d likely have to modify its LOAD DATA line likewise.

(2) There was an extra blank line at the end of your SQL_1_10_image.CSV file which caused a line of NULLs at the end of the table, but worked fine once I took it out.

Were these files unaltered from the way they were output from CP? I ask since our most of our CP developers work on the Mac platform and we want to smooth out any PC-related database issues.

Thanks,
David


#4

David,

Yes, I am on a PC running Windows XP professional. Tanks for the help, that solved this problem. These files were slightly altered to correct the paths and I added “DROP DATABASE IF EXISTS 20090224EJGDB;” to the beginning but otherwise unchanged. The extra line at the end of the .CSV was definitely generated by CellProfiler.

This is how the text from CellProfiler looked:

LOAD DATA LOCAL INFILE ‘SQL_1_10_image.CSV’ REPLACE INTO TABLE Per_Image FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘"’;
SHOW WARNINGS;
LOAD DATA LOCAL INFILE ‘SQL_1_10_object.CSV’ REPLACE INTO TABLE Per_Object FIELDS TERMINATED BY ‘,’;
SHOW WARNINGS;

This worked correctly:

LOAD DATA LOCAL INFILE ‘C:/cp/20090224_EJG_Test/SQL_1_10_image.CSV’ REPLACE INTO TABLE Per_Image FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘"’ Lines TERMINATED BY ‘\r\n’;
SHOW WARNINGS;
LOAD DATA LOCAL INFILE ‘C:/cp/20090224_EJG_Test/SQL_1_10_object.CSV’ REPLACE INTO TABLE Per_Object FIELDS TERMINATED BY ‘,’;
SHOW WARNINGS;

Shortening the LOAD statement to this also worked:

LOAD DATA LOCAL INFILE ‘C:/cp/20090224_EJG_Test/SQL_1_10_image.CSV’ REPLACE INTO TABLE Per_Image FIELDS TERMINATED BY ‘,’ Lines TERMINATED BY ‘\r\n’;

Why the image file didn’t load and the object file did, when they seem to be the same, I don’t understand. I have attached the object file if you care to look at it.

I have another issue now but I will make a separate post on that. Thanks again for the help.
EJGTestObjectFile.zip (668 KB)


#5

Glad that helped, to start with at least. We will test uploading with PCs here soon, and see if we need to modify ExportToDatabase for any PC-specific tweaks.

This is fine since you apparently don’t have any odd characters that need escaping with: OPTIONALLY ENCLOSED BY '"';

I’m not sure what you mean by this:

[quote=“egubbins”]Why the image file didn’t load and the object file did, when they seem to be the same, I don’t understand. I have attached the object file if you care to look at it.
[/quote]

Do you mean that you can’t upload the Per_Object data, or did you mean that I couldn’t? I just tried uploading it, now that I have the object data, and it works fine.

David


#6

To clarify, loading the object.CSV file always worked, without the “Lines TERMINATED BY ‘\r\n’” qualifier. If the problem was the PC-specific line termination, why did the image.CSV file need the “Lines TERMINATED BY…” and the object.CSV file not, when as far as I can tell they have the same kind of termination? Why did one work and the other not?

Thanks again.


#7

That is a good question. I just looked at some *_image.CSV and *_object.CSV files that were generated from a PC here and they were not dos-encoded (i.e. no need for the TERMINATED BY ‘\r\n’). Is it possible that you opened the image.CSV file on the PC, and then resaved it again? Even if you made no edits, and possibly depending on the editor, the PC might encode the file with the \r\n line terminators.

On a unix machine you can run the dos2unix utility on the file to re-encode it in the proper format. I think I’ve used a PC utility for this purpose, too, but I don’t remember what it was called. But it shouldn’t happen in the first place due to CP - please let us know if this behavior strikes again.

Cheers,
David


#8

I guess that makes sense. Since I needed to edit the image.CSV file to update links, it went through a text editor which probably changed the line encoding. I had no need to edit the object.CSV so that one retained the original encoding.

Thanks again for the help.