Hi all! Is there a way to visualize cell composition (as either the number or the percentage of cells) in cell-centered neighborhoods without creating regions? Let’s say I have cells of phenotypes X, Y, and Z. I ask CytoMAP to create cell-centered neighborhoods around X and Y cells. Then I would like to compare the number of Z cells present in X vs Y neighborhoods. If I cluster neighborhoods into regions in order to compare Z composition, I need exactly two regions that perfectly separate the two cell types (i.e., region 1 = X, region 2 = Y). Since I have multiple images with many neighborhood/cell types (X, Y, A, B,…), the regions end up containing mixed cell types. This is why I would like to avoid creating regions and get the numbers directly from the neighborhoods if possible. Thanks!
If you were looking at X neighborhoods vs Y neighborhoods, you are fine with double counting (or more) cells that are near both an X and a Y cell, correct? And would a list of the contents of each cell-centered neighborhood be OK (one row per neighborhood, with 3 columns of data: X cells, Y cells, and Z cells)?
Only attempting to clarify the problem here.
Hello @Gurkan,
Currently I don’t keep track of what cell type each neighborhood is centered on (adding this option is on my to-do list). This makes it difficult to gate on neighborhoods around cell type X or Y. However, since you are asking whether there are more Z cells around X cells or around Y cells, you should be able to use the nearest_neighbor_analysis.m extension I have been working on. This is still a work in progress, so not all of the options work. (I am using the example dataset found here for the examples below.)
This is found under the Extensions drop-down menu at the top of the main CytoMAP window. To run it you must first have defined some regions. (You don’t have to use the regions, but I originally programmed this so I could look at nearest neighbors in specific regions of tissues.) Next, you choose which cells you want to find the nearest neighbors of by selecting cell types in the left columns. In the bottom right of this window you can choose to look at the N nearest neighbors, or at all neighbors within some distance, where the distance is in whatever units your X, Y, Z positions are in. If you choose distance, this is basically the same calculation as the cell-centered neighborhoods, except that I don’t store the table of neighborhoods for later clustering and region statistics. When you click OK you will get a few heatmaps out.
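To make the difference between the two options concrete, here is a minimal Python sketch of both neighbor definitions (N nearest vs. everything within a distance). It is not the code of nearest_neighbor_analysis.m; the `(x, y, cell_type)` tuple layout, the function names, and the choice to exclude the center cell from its own neighbors are all assumptions made for illustration.

```python
import math

def neighbors(cells, i, k=None, radius=None):
    """Return indices of the neighbors of cell i (cell i itself is excluded).

    cells  : list of (x, y, cell_type) tuples (hypothetical layout)
    k      : if given, return the k nearest cells
    radius : if given, return every cell within this distance
    """
    if (k is None) == (radius is None):
        raise ValueError("pass exactly one of k or radius")
    xi, yi, _ = cells[i]
    # Brute-force distances to every other cell, nearest first.
    dists = sorted(
        (math.hypot(x - xi, y - yi), j)
        for j, (x, y, _) in enumerate(cells)
        if j != i
    )
    if k is not None:
        return [j for _, j in dists[:k]]
    return [j for d, j in dists if d <= radius]

def composition(cells, idx):
    """Fraction of the cells listed in idx that belong to each cell type."""
    counts = {}
    for j in idx:
        t = cells[j][2]
        counts[t] = counts.get(t, 0) + 1
    return {t: c / len(idx) for t, c in counts.items()} if idx else {}

# Toy data: one X cell with two Z cells nearby, and a distant Y/Z pair.
cells = [(0, 0, "X"), (1, 0, "Z"), (2, 0, "Z"), (10, 0, "Y"), (11, 0, "Z")]
print(composition(cells, neighbors(cells, 0, radius=3)))  # distance option
print(composition(cells, neighbors(cells, 0, k=3)))       # N-nearest option
```

With the distance option the X cell only sees the two nearby Z cells, while the 3-nearest option also pulls in the distant Y cell, so the two options can give different compositions for the same cell.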
Assuming I did my math right, these heatmaps should show you the composition of the neighbors around each cell type. In the top row you can see, for CD301b+ cells, the average fraction of neighbors that belong to each cell type. In the bottom row you can see the same for sub-capsular sinus macrophages (SCS Macs). Since SCS Macs live in a separate region from the other myeloid cells I selected in these lymph nodes, I expect most of their neighbors to be of the same cell type, so an average of 0.74 (74%) of all neighbors within 30 µm of an SCS Mac being SCS Macs seems reasonable.
In the Interaction plot on the left, I am calculating the expected number of neighbors, then finding:
(Average N Neighbors - Expected Neighbors) / (Expected Neighbors)
Where the expected number of neighbors is the number of cells of cell type X divided by the total number of cells, so that you would expect there to be N cells of cell type X around each cell in the dataset.
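As a worked example of the formula above, here is a small Python sketch. It reflects one reading of the description, not the extension's actual code: the function name and arguments are hypothetical, and it assumes both the observed and the expected values are expressed as fractions of neighbors.

```python
def interaction_score(avg_neighbor_fraction, n_cells_of_type, n_cells_total):
    """(observed - expected) / expected for one pair of cell types.

    ASSUMPTION: "expected" is the dataset-wide share of the neighbor cell
    type, and the observed value is the average fraction of neighbors of
    that type. The extension may normalize differently.
    """
    expected = n_cells_of_type / n_cells_total
    if expected == 0:
        raise ValueError("cell type is absent, score is undefined")
    return (avg_neighbor_fraction - expected) / expected

# Hypothetical numbers: cell type Z is 250 of 1000 cells (expected 0.25).
enriched = interaction_score(0.5, 250, 1000)    # Z is half of the neighbors
depleted = interaction_score(0.125, 250, 1000)  # Z is an eighth of the neighbors
print(enriched, depleted)
```

Under these assumptions a positive score means the cell type shows up among the neighbors more often than its overall abundance would suggest, a negative score means less often, and 0 means exactly as expected.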
For reference here are the physical locations of the cells I used above:
Yes, it is fine for cells to be counted in several overlapping neighborhoods. And yes, such a list with the number of cells of each cell type would be great, as long as it is clear which cell type is at the center of each neighborhood.
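For reference, the kind of list discussed here (one row per neighborhood, labeled with the center cell type, with overlapping neighborhoods allowed) can be sketched in Python as follows. This is an illustration computed outside CytoMAP, not an existing CytoMAP export; the `(x, y, cell_type)` layout, the function names, and the exclusion of the center cell from its own counts are assumptions.

```python
import math
from collections import Counter

def neighborhood_table(cells, center_types, radius):
    """One row per neighborhood: (center_type, Counter of cell types inside)."""
    rows = []
    for i, (xi, yi, ti) in enumerate(cells):
        if ti not in center_types:
            continue
        # Count every other cell within the radius; a cell may be counted
        # in several neighborhoods (double counting is intended here).
        inside = Counter(
            t for j, (x, y, t) in enumerate(cells)
            if j != i and math.hypot(x - xi, y - yi) <= radius
        )
        rows.append((ti, inside))
    return rows

def mean_count(rows, center_type, target_type):
    """Average number of target_type cells in neighborhoods of center_type."""
    counts = [c[target_type] for t, c in rows if t == center_type]
    return sum(counts) / len(counts) if counts else float("nan")

cells = [
    (0, 0, "X"), (1, 0, "Z"), (0, 1, "Z"),
    (5, 0, "Y"), (6, 0, "Z"),
    (3, 0, "Z"),  # within radius 3 of both the X cell and the Y cell
]
rows = neighborhood_table(cells, {"X", "Y"}, radius=3)
for center, counts in rows:
    print(center, counts["X"], counts["Y"], counts["Z"])
print(mean_count(rows, "X", "Z"), mean_count(rows, "Y", "Z"))
```

In this toy data the Z cell at (3, 0) is counted in both the X and the Y neighborhood, which gives 3 Z cells around X and 2 around Y.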
Thank you for the quick and detailed response! I will try the nearest neighbor analysis and will let you know here whether it worked for me. From your description above, it sounds like it should be able to address my question.
I have tried the nearest neighbor analysis with distance but I think I have an issue: Should the raw percentages across a row sum up to 1? Looking at your image, that seems to be the case. However, my numbers exceed 1 if summed. Please see the screenshot below. I must be doing something wrong.
If the cell types are not exclusive, you can get % of neighbors above 100%. For example, if one of the selected cell types is T cell and another is CD8+ T cell, the rows will sum to more than 100%, since cells can be both T cells and CD8+ T cells. You could have something like 90% of your neighbors being T cells and 50% of your neighbors being CD8+ T cells. The rows should only sum to 1 if each cell can only be one of the selected cell types.
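A tiny Python example of this effect, using made-up neighbors rather than real data (the set-of-gates representation and the function name are assumptions for illustration):

```python
# Hypothetical neighbors of one cell; each neighbor carries the set of
# cell-type definitions (gates) it satisfies.
neighbors = [
    {"T cell", "CD8+ T cell"},
    {"T cell", "CD8+ T cell"},
    {"T cell"},
    {"B cell"},
]

def fraction_per_type(neighbors, cell_types):
    """Fraction of neighbors that satisfy each selected cell-type definition."""
    n = len(neighbors)
    return {t: sum(t in nb for nb in neighbors) / n for t in cell_types}

# Overlapping selection: CD8+ T cells are also T cells.
overlapping = fraction_per_type(neighbors, ["T cell", "CD8+ T cell", "B cell"])
# Exclusive selection: every neighbor matches exactly one selected type.
exclusive = fraction_per_type(neighbors, ["T cell", "B cell"])

print(overlapping, sum(overlapping.values()))
print(exclusive, sum(exclusive.values()))
```

The overlapping selection sums to 1.5 because the two CD8+ T cells are counted under both definitions, while the exclusive selection sums to 1.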
Hi @cstoltzfus! Thank you for the explanation! I tried the analysis with fewer cell types that definitely do not overlap, and I noticed the sum can still exceed 1 when the analysis is based on “distance”. It was fixed, though, when I switched to the “number” of cells option instead of “distance”. Maybe I am doing something wrong. The results based on “number” are quite useful, but it would be nice to also have the comparison based on “distance”.
I have tried a few of my datasets but I can’t reproduce this error. The only way I can make the rows sum to anything other than 1 is if the cell definitions have some double positive cells. I couldn’t find anything in the code that would make this possible. I will take another look at the code next week, maybe I missed something.
If you want, you can double-check that your cells can’t be double positive by plotting all cells and changing the X and Y axes to the two cell types that should have no overlap. If there is no overlap there will not be any points at 1,1. If any cells are positive for both cell definitions there will be a point at 1,1.
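If the 0/1 membership of each cell in each cell-type definition is exported to a table, the same check can be done for every pair of cell types at once. The Python sketch below assumes a hypothetical layout (a dict mapping each cell-type name to a list of 0/1 flags, one per cell); it is not a CytoMAP function.

```python
from itertools import combinations

def double_positive_counts(membership):
    """Count cells that are positive for both types, for every pair of types.

    membership: dict mapping cell-type name -> list of 0/1 flags, one per
    cell, with all lists in the same cell order (hypothetical layout).
    """
    return {
        (a, b): sum(1 for fa, fb in zip(membership[a], membership[b]) if fa and fb)
        for a, b in combinations(membership, 2)
    }

# Toy data: the fourth cell is positive for both X and Y.
membership = {
    "X": [1, 0, 0, 1],
    "Y": [0, 1, 0, 1],
    "Z": [0, 0, 1, 0],
}
overlaps = double_positive_counts(membership)
for pair, n in overlaps.items():
    print(pair, n)
```

Any pair with a non-zero count corresponds to points at 1,1 in the plot described above.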
I have tried again with 3 cell types in one sample and I still have the same issue. It is likely that I am doing something wrong. So, I am attaching some screenshots below hoping that you may identify my mistake.
So, this is to check that the 3 cell types are not overlapping, as you recommended (I tried all combinations of X/Y axes and they all look like this one here):
And these are the steps:
The raw percentage maps when I choose “distance” vs “number” for the nearest neighbor analysis:
All of the steps look correct. This might take some digging on my end to see if I can figure out what is happening.