CellProfiler ExportToSpreadsheet module generates inconsistent object measurements

I want to feed instance segmentation masks into CellProfiler (4.0.7) to measure and export some features. I’m using the “select image type as objects” option in the NamesAndTypes module to obtain the objects from the masks. In the measurements exported by ExportToSpreadsheet, I found that the ObjectNumber field follows the increasing order of pixel values in the input mask.

For one particular mask in my dataset, the exported csv ends with 2 objects whose features are all NaN. The mask image contains 136 objects with pixel labels 1, 2, …, 138 (labels 98 and 108 are absent). It seems CellProfiler created 2 “empty” objects in their place. Moreover, the Location_Center_X and Location_Center_Y features for the ObjectNumbers corresponding to the missing pixel values (98 and 108) are NaN, and the “empty” objects at the end seem to hold location values instead! I tried to recreate this issue with a dummy example (labels 1, 2, 3 and 5). However, I did not get any NaN object corresponding to the missing pixel label (4) in this case.
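A quick way to check whether a mask has gaps in its labels is sketched below. This is only a minimal illustration: the function name `missing_labels` and the toy array `demo` are made up here, and it assumes the mask is already loaded as a NumPy integer array with 0 as background.

```python
import numpy as np

def missing_labels(mask):
    """Return the label values absent from 1..max(mask), ignoring background 0."""
    labels = np.unique(mask)
    labels = labels[labels > 0]            # drop the background label
    if labels.size == 0:
        return np.array([], dtype=int)     # no objects, so nothing can be missing
    expected = np.arange(1, labels.max() + 1)
    return np.setdiff1d(expected, labels)  # expected labels that never occur

# Toy mask (hypothetical data) with labels 1, 2, 3 and 5; label 4 is absent.
demo = np.array([[0, 1, 1, 0],
                 [2, 2, 0, 3],
                 [0, 5, 5, 3]])
print(missing_labels(demo))
```

For the mask described above, a check like this would be expected to report 98 and 108.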

Is this a bug and/or is there an issue with my mask image that I am missing?

nuclei_features.zip (54.6 KB)

So it looks to me like what’s happening is this: for every measurement except ObjectNumber, FeretDiameters, Radius, and Location_Centers, your objects are being renumbered to 1-136 (instead of 1-138), but due to some legacy code, those four categories are not. So you end up with those columns being 0s or NaNs in the rows for the labels that are missing, and every later measurement in those categories is shifted by one row per missing label encountered so far. Concretely, those columns are off by 1 for rows 99-107 and off by 2 from row 108 to the end, with the last two objects’ measurements landing in the final two rows. I’ve checked that this is still an issue in 4.1.3.

CellProfiler used to allow missing object labels but no longer does, which is why we didn’t catch this before. I’ll open a bug report today and we can see how feasible it is to fix this before 4.2. We are also working towards no longer using that same legacy code in CellProfiler 5. In the meantime, it would be best not to have missing labels if you can avoid it (i.e. relabel before moving to CellProfiler); otherwise, as an absolute worst case, the columns in question can be re-aligned after the fact.
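As a rough illustration of the after-the-fact re-alignment, the sketch below shifts one affected column back into the renumbered row order. It assumes the layout described above (the affected column is indexed by the original label, with NaN in the rows of missing labels) and that you know which labels are present in the mask. The function name `realign_column` and the numbers are invented for the example, not taken from the actual export.

```python
import numpy as np

def realign_column(values, present_labels):
    """Move values stored at original-label rows into consecutive rows.

    values:         one affected column, where row i (0-based) holds the
                    measurement for original label i + 1 (assumed layout)
    present_labels: sorted original labels that actually occur in the mask
    """
    values = np.asarray(values, dtype=float)
    present = np.asarray(present_labels, dtype=int)
    out = np.full(values.shape, np.nan)       # trailing rows stay NaN
    out[:present.size] = values[present - 1]  # pick rows of existing labels
    return out

# Hypothetical column for a mask with labels 1, 2 and 4 (label 3 missing).
center_x = [10.0, 20.0, np.nan, 40.0]
print(realign_column(center_x, [1, 2, 4]))
```

After this step the column lines up with the renumbered measurements, and the NaN rows fall at the end like the other "empty" rows.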

Thanks for letting us know!

Thanks for the explanation. So for now, I will relabel masks before moving to CellProfiler as you suggested.
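For reference, one possible way to relabel a mask so that its labels run consecutively from 1 is sketched below. It assumes a non-empty NumPy array of non-negative integer labels with 0 as background; the function name `relabel_consecutive` and the toy array are made up for illustration. scikit-image's `skimage.segmentation.relabel_sequential` offers similar functionality if that library is available.

```python
import numpy as np

def relabel_consecutive(mask):
    """Renumber labels to 1..N in increasing order of their original value.

    Returns the relabeled mask and a dict mapping old label -> new label.
    Background (0) is left unchanged.
    """
    mask = np.asarray(mask)
    labels = np.unique(mask)
    labels = labels[labels > 0]                    # ignore background
    new_ids = np.arange(1, labels.size + 1)
    # Lookup table indexed by old label; entries for unused labels stay 0.
    lookup = np.zeros(int(mask.max()) + 1, dtype=mask.dtype)
    lookup[labels] = new_ids
    mapping = dict(zip(labels.tolist(), new_ids.tolist()))
    return lookup[mask], mapping

# Toy mask (hypothetical data) with labels 1, 2, 3 and 5.
demo_mask = np.array([[0, 1, 1, 0],
                      [2, 2, 0, 3],
                      [0, 5, 5, 3]])
relabeled, mapping = relabel_consecutive(demo_mask)
print(relabeled)
print(mapping)
```

Keeping the old-to-new mapping makes it possible to trace exported ObjectNumbers back to the original pixel labels.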

Also, it seems that the AreaShape_Zernike and Intensity features are off too due to the legacy code. You might want to add this to the bug report.

Nuclei.csv (2.4 MB)

Great, the bug report was already in before I saw this, but thanks for letting us know!