Why do some objects regularly lack texture feature data?

Hi, CellProfiler Team and everyone
When I use CP 3.1.9 to analyze images and export the data, the objects with the larger Object Numbers in each image consistently lack texture feature data.
Here are my data and some details:


analysis_Per_CytoplasmMito.csv (2.8 MB)
The same thing happens whether I export the data with the ExportToSpreadsheet or the ExportToDatabase module, on either CP 3.1.9 or CP 4.0.7.
Does anyone know whether this is normal?
Thank you very much!

Here are my pipeline and example images:
analysis.cppipe (86.6 KB)
test_B02_s1_w2B4C4C30F-29F2-424E-9D51-E3BB1FA67D3E.tif (8.0 MB)
test_B02_s1_w16427D4EF-2310-4F54-A27F-7EA8FD322DCB.tif (8.0 MB)

I found that if I choose to discard secondary objects that touch the image border but do not discard the associated primary objects, the exported texture feature data are no longer aligned with their objects, although the rows are still sorted by object number. The affected data include Location_Center_X, Location_Center_Y, and the texture features. @DStirling, because I have spent a lot of time acquiring these data, could you check the source code for me to see whether I can manually reorder the data in the table to realign them?
If you could help me, I would really appreciate it!
Thank you very much!

Hi @Gavyn,

Could you describe exactly what data is missing in your example (including which objects, spreadsheet name, column names, etc.)? I wasn’t able to reproduce any errors using your pipeline. CellProfiler seems to be exporting the expected measurements when I run it on my computer, which likely means that I’m not understanding your error.

Thanks!
Pearl

Hi @pearl-ryder,
Thank you very much for your reply. Sorry, my wording may have been ambiguous.
First, the pipeline runs without any errors, but when I double-checked the exported data I found that they may be misaligned. Here are the details:

  1. If I choose to discard secondary objects that touch the image border but not the associated primary objects (analysis_1.cppipe (86.8 KB) ), I get the following data (1-test.zip (2.5 MB) )

  2. If I choose to discard secondary objects that touch the image border and also discard the associated primary objects (analysis_2.cppipe (87.0 KB) ), I get the following data (2-test.zip (2.4 MB) )
    Each zipped package contains 3 SQLite database files and 6 csv files exported from Nuclei.db

  3. As shown in the 1-analysis_Per_CytoplasmMito.csv file in the 1-test zipped package, CytoplasmMito_Number_Object_Number 1 in ImageNumber 1 (row 2), whose CytoplasmMito_Parent_Nuclei is 0, should be discarded and should not have texture feature data (such as CytoplasmMito_Texture_AngularSecondMoment_OrigMito_8_00), yet it does.
    However, CytoplasmMito_Number_Object_Number 51-57 in ImageNumber 1 (rows 52-58), whose CytoplasmMito_Parent_Nuclei are not 0, should have texture feature data, but these data are actually missing (BQ52:DP58 in the table).

  4. Comparing this file with the 2-analysis_Per_CytoplasmMito.csv file in the 2-test zipped package, and matching objects by CytoplasmMito_AreaShape_Center_X and CytoplasmMito_AreaShape_Center_Y, CytoplasmMito_Number_Object_Number 5 in ImageNumber 1 of 1-analysis_Per_CytoplasmMito.csv (row 6) corresponds to CytoplasmMito_Number_Object_Number 1 in ImageNumber 1 of 2-analysis_Per_CytoplasmMito.csv (row 2).

So I guess that the texture values for discarded objects are not being filled in with NaN. Because the data exported from the first pipeline (analysis_1.cppipe) are still ordered by object number, maybe I can manually reorder the data in the table to realign them. I hope your team can verify my guess from the source code so that I can repair the data I have already exported.

Similar problems also appear in the Location_Center_X and Location_Center_Y data, such as CellsMito_Location_Center_X (1-analysis_Per_CellsMito.csv).

Sorry, my explanation may be hard to follow.
Thanks.
Best,
Gavyn

Hi @Gavyn,

Thanks for your detailed reply! I think you have found a bug within CellProfiler. I’ve submitted a bug report: https://github.com/CellProfiler/CellProfiler/issues/4322 and we’ll try to work on a solution in the new year.

In the meantime, I would not include any objects whose AreaShape_Area is 0 or NaN. From my tests, this measurement does not seem to be made on objects that should be discarded.
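
As a rough illustration of that filter, here is a minimal pandas sketch. It assumes the per-object table has been exported to CSV and that the area column is named CytoplasmMito_AreaShape_Area (by analogy with the CytoplasmMito_AreaShape_Center_X column in the exported file); the function name is made up for this example.

```python
import pandas as pd


def drop_discarded_objects(df, area_col="CytoplasmMito_AreaShape_Area"):
    """Return only the rows whose area measurement is present and non-zero.

    `area_col` is an assumed column name; adjust it to match your table.
    """
    # Coerce to numeric so that blank or text cells become NaN.
    area = pd.to_numeric(df[area_col], errors="coerce")
    # Keep rows where the area is neither NaN nor 0.
    return df[area.notna() & (area != 0)].copy()


# Example use (file name taken from the 1-test package):
# table = pd.read_csv("1-analysis_Per_CytoplasmMito.csv")
# kept = drop_discarded_objects(table)
```

Note that this only removes the rows for discarded objects; it does not move any values between rows.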

Good luck with your experiments and thanks again for letting us know about this bug!
Best,
Pearl

Hi @pearl-ryder,
Thanks for helping me submit the bug report. From your test, it seems that even more of the data obtained on macOS are misaligned. I agree with you that the measurement does not seem to be made on objects that should be discarded.

My original intention was to discard secondary objects that touch the image border and to exclude any objects whose AreaShape_Area is 0 or NaN. But because of my mistake, I ended up with a lot of misaligned data like those in the 1-test zipped package (1-test.zip (2.5 MB) ), which took me weeks to acquire.

From my test, I think the data (2-test.zip (2.4 MB)) obtained by discarding both the secondary objects that touch the image border and their associated primary objects are aligned as expected. I find that, for the same ImageNumber, cells BQ2:DP51 in Table 1-analysis_Per_CytoplasmMito (1-test.zip) correspond to cells BQ2:DP51 in Table 2-analysis_Per_CytoplasmMito (2-test.zip).

Therefore, taking the BQ column (CytoplasmMito_Texture_AngularSecondMoment_OrigMito_8_00) in Table 1-analysis_Per_CytoplasmMito as an example, I think the value in cell BQ2 was actually measured on CytoplasmMito_Number_Object_Number 5 in ImageNumber 1, judging by the CytoplasmMito_Parent_Nuclei number.

By analogy, I may be able to work out the actual order of the misaligned data and write a simple script to realign them. But I first need to confirm my hypothesis against the source code. I am not familiar with the CellProfiler source code and I am not good at programming, so checking it myself is too difficult for me. I hope your team can help me check the source code in the new year so that I can repair my data.

Merry Christmas in advance
Best,
Gavyn

Hi @Gavyn,

Yes, your description of how to align the data seems accurate to me. Note that if you remove all rows where AreaShape_Area = 0, you will get the same result. Once we dig into the bug we can describe more about exactly what’s happening, but in your data none of the object numbers have been renamed, so you can use those to relate objects to each other.
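
For illustration only, the realignment hypothesis from the previous post could be sketched in pandas roughly as follows. This is not confirmed against the CellProfiler source. It assumes that, within each ImageNumber, the texture values were written in object-number order for the kept objects only (those whose CytoplasmMito_Parent_Nuclei is not 0), starting from the first row of that image, which leaves the last rows empty. The function name and its parameters are hypothetical.

```python
import numpy as np
import pandas as pd


def realign_texture(df,
                    prefix="CytoplasmMito_Texture_",
                    image_col="ImageNumber",
                    object_col="CytoplasmMito_Number_Object_Number",
                    parent_col="CytoplasmMito_Parent_Nuclei"):
    """Move texture values onto the kept objects, image by image.

    Hypothesis (unverified): the first n texture rows of each image belong,
    in order, to the n objects whose parent number is not 0.
    """
    tex_cols = [c for c in df.columns if c.startswith(prefix)]
    out = df.sort_values([image_col, object_col]).reset_index(drop=True)

    for image, idx in out.groupby(image_col).groups.items():
        block = out.loc[idx, tex_cols].to_numpy(dtype=float)
        kept = (out.loc[idx, parent_col] != 0).to_numpy()
        n = int(kept.sum())

        # Refuse to guess if the data do not match the assumed pattern:
        # exactly n filled rows, all of them at the top of the image.
        filled = ~np.isnan(block).all(axis=1)
        if filled.sum() != n or not filled[:n].all():
            raise ValueError(f"ImageNumber {image} does not fit the hypothesis")

        realigned = np.full(block.shape, np.nan)
        realigned[kept] = block[:n]  # i-th filled row -> i-th kept object
        out.loc[idx, tex_cols] = realigned

    return out
```

The check inside the loop is deliberate: any image that does not show the assumed pattern raises an error instead of being silently rearranged. The same idea could be tried on the Location_Center_X and Location_Center_Y columns by passing a different prefix, with the same caveat.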

What is your goal in aligning the rows? If you need to relate the CytoplasmMito objects to the CellsMito objects, the most reliable way to do that would be to use the “CytoplasmMito_Parent_CellsMito” column in the CytoplasmMito data. That column gives the Cells object number that corresponds to the Cytoplasm object (the output of the IdentifyTertiaryObjects module).
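
A minimal pandas sketch of that join, in case it is useful. The “CytoplasmMito_Parent_CellsMito” and ImageNumber columns come from your tables; “CellsMito_Number_Object_Number” is an assumed name, chosen by analogy with CytoplasmMito_Number_Object_Number, and the function name is made up for the example.

```python
import pandas as pd


def join_cytoplasm_to_cells(cyto, cells):
    """Attach each CytoplasmMito row to its parent CellsMito row.

    Objects are matched within the same image, using the parent column
    rather than the row position in either table.
    """
    return cyto.merge(
        cells,
        how="left",  # keep every cytoplasm row, matched or not
        left_on=["ImageNumber", "CytoplasmMito_Parent_CellsMito"],
        right_on=["ImageNumber", "CellsMito_Number_Object_Number"],  # assumed name
        validate="many_to_one",  # fail if a cell number repeats within an image
    )
```

Because the match is made on object numbers, it does not depend on the two tables having their rows in the same order.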

Good luck!
Pearl

Hi @pearl-ryder,
As shown in Table 1-analysis_Per_CytoplasmMito, the texture feature data of the CytoplasmMito objects do not correspond to their CytoplasmMito_Number_Object_Number, so if I do not realign these data I may end up with wrong measurements. The purpose of aligning the rows is therefore to match the texture feature data with the correct CytoplasmMito objects and so correct the data I have obtained. The ultimate goal is to align all the data for every Nuclei, CellsMito, and CytoplasmMito object that has not been discarded.

Using the “CytoplasmMito_Parent_CellsMito” column to associate CytoplasmMito with CellsMito, and then with Nuclei, is indeed a more reliable method. But what I care more about is whether it is reliable to match the misaligned CytoplasmMito texture data to the corresponding CytoplasmMito objects using the method described in my previous post.

Best,
Gavyn