Hi again, everyone interested in cluster analysis and tissue neighborhoods!
This rather quick guide is mostly focused on using the neighborhood analysis tools in CytoMAP to analyze multiplex data, though they could be used for pretty much any CSV that includes XY (and possibly Z) coordinates. The data shown is largely based on the CSV exports used in the last guide, so you can download those (second post) as a test set if you are having problems getting things working.
WHY?
Well, sometimes just classifying cells is not enough to get a good grasp of what is going on in an image, especially when you have dozens of markers and many different phenotypes. Groups of cells form environments, whether that is organ/tissue related, tumor related, or sometimes borders between two tissue types. Region/neighborhood analysis allows you to cluster groups of cells into “environments” and look at what types of cells are interacting most frequently, and how those areas connect with one another. In some of the images below, I ask CytoMAP to divide a field of view including dense cytokeratin annotations into four regions. It ends up finding solid tumor, tumor borders, inflamed stroma, and normal stroma.
How?
Using the same sort of scripts as last time, you can also get the regions that you define back into QuPath, at least in terms of which cells are in which regions. I have not worked on getting the actual masks of the regions back into QuPath.
For reference, the source image is in the upper left (same as for the previous guide) showing Cytokeratin/tumor areas in teal, and assorted immune markers which show up more frequently on the top-left part of the image.
As always, the official documentation can be found here - Guide is only accurate as of 1.4.13:
There will be two versions of the guide below:
- Taking classes defined in QuPath to form the regions
- Using tSNE or other dimensionality reduction to isolate clusters of cells, manually gate those, and use those clusters for your neighborhood analysis.
1. Using classes defined in QuPath
To start with, load in your CSV file as before, through the File menu in the main CytoMAP interface.
The CSV should include at least XY Coordinates and a Class column. I usually keep the class column as the first column in the data sheet. I used the Multiplex analysis guide (using the same image referenced there) to generate one base class per measurement channel, and a composite classifier to generate the many complex classes you can see below.
The beginning of my imported data set looks like the following image. I included the base measurements so that I could inspect those in later steps.
It's just a CSV file
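If the import misbehaves, a quick check outside CytoMAP that the header really contains a class column and coordinates can save some head scratching. This is only a minimal sketch: the column names `Class`, `X` and `Y` and the example rows are placeholders I made up, so swap in whatever your own QuPath export calls them.

```python
import csv
import io

def check_columns(csv_text, required=("Class", "X", "Y")):
    """Return the required column names that are missing from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]

# Hypothetical two-cell export; real column names and values will differ.
example = "Class,X,Y,CD8_mean\nCD3:CD8,10.5,20.0,312.4\nTumor,55.1,18.2,12.0\n"
missing = check_columns(example)
print("Missing columns:", missing)
```

For a real file, read its text with `open(path, newline="").read()` and pass that in, along with your own list of required names.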
Point CytoMAP to the QuPath defined classes
First interesting step! CytoMAP needs to know what to base its neighborhoods on, and for this workflow, we need to point it to the Class column in the CSV file. Select Annotate Clusters from the main interface, choose the channel with your class names (in this case Ch_Class), and, with “Use Classifier” selected, Save annotations.
Great! Now CytoMAP knows what to use to determine which cells are which. You could also use Cluster Cells from the main interface to create classes within CytoMAP, and then you would select the name of your saved clustering model instead of “Use Classifier.”
Decide how large the neighborhoods should be, and have CytoMAP make a list
Next, select Define Neighborhoods from the main interface. For simplicity, I chose “Raster Scanned Neighborhood” with a radius of 50, and the “Fast way” to make this demo relatively quick and easy. Other options can take considerably more time, even with small data sets, so I would start here, and then choose other options later if you need more granular results. Raster Scanned Neighborhood will create circular (or spherical, or cylindrical if your Z-stack height is smaller than your radius) tiles for your neighborhoods, with about half overlap between neighborhoods. Each circle here would be a neighborhood of cells, which ends up being represented as a square tile (pixel) in the final display.
“The distance between neighborhoods is r/2, where r is the user-defined radius of the neighborhoods (field 2).”
How my neighborhood calculation looked. The Neighborhood Radius will be in whatever units your X-Y coordinates were in, in the CSV file. So if your image was in microns in QuPath, the units will be microns. If you had a TIFF file in QuPath with no pixel sizes, the units will be in pixels.
Ok, now we have defined our neighborhoods - at least in terms of “what classes of cells are in each neighborhood.” CytoMAP now has a collection of these neighborhoods, each entry including a list of the cells (their classes) within that neighborhood. It has NOT attempted to classify those neighborhoods as a particular Type yet, though. That comes next.
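To make the idea concrete, here is a rough sketch of a raster scan as described above: circle centers spaced r/2 apart, with each neighborhood recording how many cells of each class fall inside the circle. This is a brute-force illustration under my own assumptions (2D only, made-up cells and function name), not CytoMAP's actual implementation or its “Fast way.”

```python
import math
from collections import Counter

def raster_neighborhoods(cells, radius):
    """Tile the XY extent with circles of the given radius, spaced radius/2 apart.

    cells: list of (x, y, class_name) tuples.
    Returns a list of (center_x, center_y, Counter of class names).
    """
    step = radius / 2.0
    xs = [x for x, _, _ in cells]
    ys = [y for _, y, _ in cells]
    x0, y0 = min(xs), min(ys)
    nx = int((max(xs) - x0) // step) + 1
    ny = int((max(ys) - y0) // step) + 1
    neighborhoods = []
    for i in range(nx):
        for j in range(ny):
            cx, cy = x0 + i * step, y0 + j * step
            # Count the classes of every cell within `radius` of this center.
            counts = Counter(
                cls for x, y, cls in cells
                if math.hypot(x - cx, y - cy) <= radius
            )
            neighborhoods.append((cx, cy, counts))
    return neighborhoods

# Three hypothetical cells on a line, coordinates in the same units as the radius.
cells = [(0.0, 0.0, "Tumor"), (10.0, 0.0, "Tumor"), (100.0, 0.0, "CD8")]
hoods = raster_neighborhoods(cells, radius=50)
for cx, cy, counts in hoods:
    print(cx, cy, dict(counts))
```

Because the circles overlap, the same cell is counted in several neighborhoods, which is why border regions tend to show a mix of classes.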
While this step was quite quick on a small field-of-view data set, @Colt_Egelston mentioned that it took about 90 minutes for a whole slide image of over half a million cells.
Clustering the neighborhoods
The clustering step for neighborhoods will look very much like the clustering step for cells. Select Cluster Neighborhoods into Regions from the main interface, and you will get
Mine ends up looking like this, since I want the clustering to depend only on the *classes* of cells involved.
More detailed explanation
Important points: I want this to be based on classes of cells, so none of the MFI (mean fluorescence intensity) information is used. I chose a manual number of regions to increase the speed of my run. Generally, you may want to play around with this a bit, but automatic determination of the number of regions involves doing the whole run MANY times at EACH possible value… very slow.
I used a Raster Scanned Neighborhood for the previous step, so I keep that selected now. Algorithm is up to you, I have not had especially good or bad results with any of the options with the data sets I have tested, but every data set is different. You may want to experiment there.
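As a rough picture of what “cluster neighborhoods with a manually chosen number of regions” means, here is a tiny k-means sketch over neighborhood compositions. K-means is only a familiar stand-in here, and the function, the fixed starting centers and the toy data are my own assumptions, not a description of what CytoMAP runs internally.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm, started deterministically from the first k points."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each neighborhood to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of the neighborhoods assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical neighborhoods described as (fraction tumor, fraction immune).
compositions = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
labels, centers = kmeans(compositions, k=2)
print(labels)
```

With a manual `k` the clustering runs once. An automatic search has to repeat runs like this for every candidate `k` and compare the results, which is where the extra time goes.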
The Composition (top left of lower panel) I kept to the percentage of cells of each class within a region. Important note for QuPath users: It will not take into account sub-classifications when determining regions. So CD3:CD8 is just as different from CD3 as it is from CD3:CD8:PD1. They are just different classes. Period. It may help to group classes prior to exporting from QuPath depending on what you want your output to mean, or use other measurements. One example of using other measurements would be to create a variable per channel for your cells (within QuPath) and set that to 0 or 1 if the cell is positive for that class. Then CytoMAP could use those measurements instead of classes for neighborhood clustering. Alternatively, you may choose to not use classes at all, but use the MFI instead, so that each channel is considered independently.
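The per-channel 0/1 idea can be sketched like this. In practice you would set these measurements inside QuPath before export; the Python below is just an illustration of the logic on class-name strings, and the marker list and helper names are made up for the example.

```python
def class_to_indicators(class_name, markers):
    """Split a composite class such as 'CD3:CD8' into one 0/1 value per marker."""
    parts = {part.strip() for part in class_name.split(":")}
    return {marker: int(marker in parts) for marker in markers}

def marker_differences(class_a, class_b, markers):
    """Count the markers on which two composite classes disagree."""
    a = class_to_indicators(class_a, markers)
    b = class_to_indicators(class_b, markers)
    return sum(a[m] != b[m] for m in markers)

markers = ["CD3", "CD8", "PD1"]  # hypothetical panel
print(class_to_indicators("CD3:CD8", markers))
print(marker_differences("CD3:CD8", "CD3", markers))
print(marker_differences("CD3", "CD3:CD8:PD1", markers))
```

Treated as whole class names, all three of these classes are simply "different." Treated as indicators, CD3:CD8 sits one marker away from both CD3 and CD3:CD8:PD1, while CD3 and CD3:CD8:PD1 are two markers apart, so related phenotypes end up closer together.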
I usually name the model after my settings, so I named this model “Number_manual8_Raster_NN”
This step seems to be much faster than the last, as the same 500k cell data set took less than a minute. Finally, you may need to invert the Y axis (Options button in the lower left of the resulting plot, not shown below) if you are comparing the results to the QuPath image.
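If you plot the exported coordinates yourself rather than in CytoMAP, the same flip is a one-liner. A minimal sketch, assuming image-style coordinates where Y increases downward; the function name is my own.

```python
def flip_y(ys):
    """Mirror Y values within their own range so the top of the image plots at the top."""
    top, bottom = max(ys), min(ys)
    return [top + bottom - y for y in ys]

flipped = flip_y([0.0, 10.0, 40.0])
print(flipped)
```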
This plot shows the results of a 50 micron raster scan with 4 regions chosen manually.
Heatmaps: What is in each of these regions?
Further analysis options include generating a Region Heatmap (main CytoMAP interface) to see what is in each Neighborhood/Region you just defined.
Settings
In this case I had dozens of classes, so I decided to use the intensity of the markers within each neighborhood instead.
With a linear scale, I can easily see that the difference between my two stromal regions, 2 and 4 (green and orange), is the frequency of high intensity immune markers. Meanwhile, region 1 (teal) is the tumor area and region 3 (yellow) is tumor adjacent, since it has intermediate levels of cytokeratin. In this case, since I used Raster scanning neighborhoods, it is likely that the yellow region results from “boxes/neighborhoods” that partially overlapped tumor tissue.
** As @tinevez has noted, I am not the greatest with color selection. Shrug. #i2k2020 !!
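Conceptually, the intensity version of the region heatmap boils down to averaging each marker over everything assigned to a region. Here is a minimal sketch of that averaging, with made-up marker names and values; CytoMAP's own heatmap has more options (scaling, normalization) than this shows.

```python
from collections import defaultdict

def region_means(rows, markers):
    """Average each marker over all rows that share a region label.

    rows: list of (region_label, {marker: intensity}) pairs.
    """
    sums = defaultdict(lambda: {m: 0.0 for m in markers})
    counts = defaultdict(int)
    for region, values in rows:
        counts[region] += 1
        for m in markers:
            sums[region][m] += values[m]
    return {r: {m: sums[r][m] / counts[r] for m in markers} for r in sums}

# Hypothetical rows: region 1 is cytokeratin-rich, region 2 is immune-rich.
rows = [
    (1, {"CK": 100.0, "CD8": 10.0}),
    (1, {"CK": 80.0, "CD8": 20.0}),
    (2, {"CK": 5.0, "CD8": 60.0}),
]
means = region_means(rows, ["CK", "CD8"])
print(means)
```

Each region becomes one row of the heatmap and each marker one column, which is how an intermediate cytokeratin value flags a tumor-adjacent region.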
Back to QuPath
If you want to import these results back into QuPath, please refer to the previous guide linked above, except include the Region Model as one of the output columns when you export your CSV file from CytoMAP.
If you have any questions, @cstoltzfus is probably the one to ask. And @cstoltzfus, if you have any comments, feel free to edit me