Won't you be my neighbor? CytoMAP for multiplex tissue microenvironment analysis

Hi again, everyone interested in cluster analysis and tissue neighborhoods!
This rather quick guide is mostly focused on using the neighborhood analysis tools in CytoMAP to analyze multiplex data, though they could be used for pretty much any CSV that includes XY (and possibly Z) coordinates. The data shown will be largely based off of the CSV exports used in the last guide, so you can download (second post) as a test set if you are having problems getting things working.


Well, sometimes just classifying cells is not enough to get a good grasp of what is going on in an image, especially when you have dozens of markers and many different phenotypes. Groups of cells form environments, whether that is organ/tissue related, tumor related, or sometimes borders between two tissue types. Region/neighborhood analysis allows you to cluster groups of cells into “environments” and look at what types of cells are interacting most frequently, and how those areas connect with one another. In some of the images below, I ask CytoMAP to divide a field of view including dense cytokeratin annotations into four regions. It ends up finding solid tumor, tumor borders, inflamed stroma, and normal stroma.


Using the same sort of scripts as last time, you can also get the regions that you define back into QuPath, at least in terms of which cells are in which regions. I have not worked on getting the actual masks of the regions back into QuPath.

For reference, the source image is in the upper left (same as for the previous guide) showing Cytokeratin/tumor areas in teal, and assorted immune markers which show up more frequently on the top-left part of the image.

As always, the official documentation can be found here - Guide is only accurate as of 1.4.13:

There will be two versions of the guide below:

  1. Taking classes defined in QuPath to form the regions
  2. Using tSNE or other dimensionality reduction to isolate clusters of cells, manually gate those, and use those clusters for your neighborhood analysis.

1. Using classes defined in QuPath

To start with, load in your CSV file as before, through the File menu in the main CytoMAP interface.

The CSV should include at least XY Coordinates and a Class column. I usually keep the class column as the first column in the data sheet. I used the Multiplex analysis guide (using the same image referenced there) to generate one base class per measurement channel, and a composite classifier to generate the many complex classes you can see below.
The beginning of my imported data set looks like the following image. I included the base measurements so that I could inspect those in later steps.

It's just a CSV file


Point CytoMAP to the QuPath defined classes

First interesting step! CytoMAP needs to know what to use to base it’s neighborhoods off of, and for this workflow, we need to point it to the Class column in the CSV file. Select Annotate Clusters from the main interface, choose the channel with your class names (in this case Ch_Class) and with “Use Classifier” selected, Save annotations.

Annotate Cells dialog

Great! Now CytoMAP knows what to use to determine which cells are which. You could also also use Cluster Cells from the main interface to create classes within CytoMAP, and then you would select the name of your saved clustering model instead of “Use Classifier.”

Decide how large the neighborhoods should be, and have CytoMAP make a list

Next, select Define Neighborhoods from the main interface. For simplicity, I chose “Raster Scanned Neighborhood” with a radius of 50, and the “Fast way” to make this demo relatively quick and easy. Other options can take considerably more time, even with small data sets, so I would start here, and then choose other options later if you need a more granular results. Raster Scanned Neighborhood will create circular (or spherical, or cylindrical if your Z-stack height is smaller than your radius) tiles for your neighborhoods with about half overlap between neighborhoods. Each circle here would be a neighborhood of cells, which ends up being represented as a square tile (pixel) in the final display.
“The distance between neighborhoods is r/2, where r is the user-defined radius of the neighborhoods (field 2).”.

Neighborhood Calculation Options settings

How my neighborhood calculation looked. The Neighborhood Radius will be in whatever units your X-Y coordinates were in, in the CSV file. So if your image was in microns in QuPath, the units will be microns. If you had a TIFF file in QuPath with no pixel sizes, the units will be in pixels.

Ok, now we have defined our neighborhoods - at least in terms of “what classes of cells are in each neighborhood.” CytoMAP now has a collection of these neighborhoods, each entry including a list of the cells (their classes) within that neighborhood. It has NOT attempted to classify those neighborhoods as a particular Type yet, though. That comes next.

While this step was quite quick on a small, field of view data set, @Colt_Egelston mentioned that it took about 90 minutes for a whole slide image of over half a million cells.

Clustering the neighborhoods

The clustering step for neighborhoods will look very much like the clustering step for cells. Select Cluster Neighborhoods into Regions from the main interface, and you will get

a screen that looks starts out looking like this:

And ends up looking like this, since I want the clustering to be dependent on only the *classes* of cells involved

More detailed explanation

Important points: I want this to be based off of classes of cells, so none of the MFI (mean fluorescent intensity) information is used. I chose a manual number of regions to increase the speed of my run. Generally, you may want to play around with this a bit, but automatic determination of the number of regions involves doing the whole run MANY times at EACH possible value… very slow.

I used a Raster Scanned Neighborhood for the previous step, so I keep that selected now. Algorithm is up to you, I have not had especially good or bad results with any of the options with the data sets I have tested, but every data set is different. You may want to experiment there.

The Composition (top left of lower panel) I kept to the percentage of cells of each class within a region. Important note for QuPath users: It will not take into account sub-classifications when determining regions. So CD3:CD8 is just as different from CD3 as it is from CD3:CD8:PD1. They are just different classes. Period. It may help to group classes prior to exporting from QuPath depending on what you want your output to mean, or use other measurements. One example of using other measurements would be to create a variable per channel for your cells (within QuPath) and set that to 0 or 1 if the cell is positive for that class. Then CytoMAP could use those measurements instead of classes for neighborhood clustering. Alternatively, you may choose to not use classes at all, but use the MFI instead, so that each channel is considered independently.

I usually name the model after my settings, so I named this model “Number_manual8_Raster_NN”

It seems this step should be relatively faster than the last, as the same 500k cell data set took less than a minute. Finally, you may need to invert the Y axis (Options button in the lower left of the resulting plot, not shown below) if you are comparing the results to the QuPath image.
This plot shows the results of a 50 micron raster scan with 4 regions chosen manually.

Heatmaps: What is in each of these regions?

Further analysis options include generating a Region Heatmap (main CytoMAP interface) to see what is in each Neighborhood/Region you just defined.



In this case I had dozens of classes, so I decided to use the intensity of the markers within each neighborhood instead.

With a linear scale, I can easily see that the difference between my two stromal regions, 2 and 4 (green and orange), is the frequency of high intensity immune markers. Meanwhile, region 1 (teal) is the tumor area and region 3 is tumor adjacent since it has intermediate levels of cytokeratin. In this case, since I used Raster scanning neighborhoods, it is likely that the yellow results from “boxes/neighborhoods” that partially overlapped tumor tissue.

** As @tinevez has noted, I am not the greatest with color selection. Shrug. :slight_smile: #i2k2020 !!

Back to QuPath

If you want to import these results back into QuPath, please refer to the previous guide linked above, except include the Region Model as one of the output columns when you export your CSV file from CytoMAP.

Export settings

If you have any questions, @cstoltzfus is probably the one to ask. And @cstoltzfus, if you have any comments, feel free to edit me :smiling_imp:


This version is a little lighter on some details, please check the equivalent part of the guide above if anything is confusing

2. Manual gating after dimensionality reduction.

Or, when classes just won’t cut it. Sometimes, you will simply have too many markers, and too much sample variation (based on tissue slice, cell orientation, staining, etc.). In those cases, it might be more useful to create a set of classes within CytoMAP’s tSNE plots (or other options like PHATE).

As with the previous example, load in your CSV file. For this guide, I am using the full set of measurements from QuPath (less autofluorescence). Most of that information will “go away” after the first step with the tSNE plot, but you might want to reduce the number of measurements depending on your use case (only include measurements you “know” are important to your clustering).

Dimensionality reduction settings

Here I have made sure to choose tSNE (as I am not expecting any relationships between cell types and simply want to separate them into clusters), the data type is Individual cells as I have not yet run any neighborhood analysis, and I “Standardize”-d the measurements across my single sample so that brighter channels are not “more important.”
I will leave the settings for the tSNE plot at their defaults, but there is more information in the links.
From the MATLAB implementation: https://www.mathworks.com/help/stats/tsne.html#bvh3rti-2
Fun reading about tSNE variables (thanks @cstoltzfus !): https://distill.pub/2016/misread-tsne/

That’s a tSNE plot

Even with the exact same settings, tSNE plots will not come out the exact same every time. They should not be too different from each other, though, as long as your settings are within fairly “normal” limits. You can read more about that in some of the links above. Also, I changed the settings to a Perplexity of 60 and a Theta of 0.1 in the right most image. This takes a bit longer to run, and does not have a huge impact on the plot, though there is some slightly better separation.

Next, we will draw gates

Within the tSNE plot, I can see a few different clusters, and I can use those clusters to either figure out what markers define that cluster, or define neighborhoods and regions using those clusters. First, though, CytoMAP needs to know which clusters we are interested in.

For example, here I have a cluster that, when cycling through the “C-Axis” measurements in the bottom center, shows up as relatively positive for FoxP3. I will draw a gate around that and call it FoxP3.

After drawing the Gate, I make sure to Save that gate position.

Oops. It gets stuck and never completes. What did I do wrong?


It turns out that you need to click Show Table in the upper left, I was running the plot off of All/All, but the gate saving only works with All Cells is selected.

Once All Cells is selected, the Save works as expected.

After repeating those steps (find a cluster, find a marker/markers that represent that cluster, draw and save a gate) to create (AND SAVE) a few gates, I end up with the following:

Secret tip

I recommend not not not starting your gates with a point near the edge of the plot. If anything, start slightly “inward” of where you want your gate to end up, shape it correctly, and then once it is complete drag it to where you want it to end up (and pick up some of those pesky points at the edge of the plot).

If you ever lose visibility of your gates, you can go to the Plot-Plot gates menu to turn them back on. You can also grab the border points and adjust each gate once you have all gates visible, just make sure to save again!
With those gates saved (note that they are only saved for this instance of CytoMAP), I can hopefully use them to perform some neighborhood analysis. For the moment, I will export the gated data through the File menu in the original CytoMAP interface, “Export full data table as .CSV.”

Settings for export

Here I chose to export individual cells, and include the Gate Logicals, which is a column for each gate with a 1 or a 0 indicating that a cell is or is not contained within that gate. I also really only want the XY coordinates from the central column, which let me place the gates back within Qupath if I want to visualize the gated cells there.
The file will be called “CytoMAP_Sample_yourfilename”.csv and you only get to choose the output folder, not the file name.

If you want to save a more complete setup of your project, use the File->Save workspace from the main menu. I have not fully explored this, so I am not 100% sure of what it does or does not keep.

Define neighborhoods

Next I will define my neighborhoods, once again choosing Raster scanned and a radius of 50 the Fast Way for this quick test.

Neighborhood calculation settings

Note that it is including all my gates in the left column.

Cluster neighborhoods into regions

Now I want to perform the actual neighborhood analysis, and include my gates ONLY.

Clustering neighborhoods into regions - settings

Note that if you scroll ALLLL the way to the bottom of the list, there is a “select all” for the “Use for sorting” column for selecting the “Phenotypes.”
I chose 4 regions yet again, though I decided to use kmeans as the algorithm just to mix things up.

After naming the model and watching the bar fill I get a plot, and once again using the Options in the lower left I have inverted the Y axis. This time the regions are slightly different, though they follow mostly the same pattern.

Heatmap settings

Similar results are obtained, with a highly CK positive tumor area in teal, tumor adjacent areas in yellow, PDL1 enhanced areas more specifically in group 4 this time, and all of the rest of the cells with an enhancement of negative cells in group 2. Also, the cell density can be seen to be higher in group 4 as I included NCells, or the number of cells within a given neighborhood.
Final note: If you rerun the analysis, your cluster numbers may not be in the same order.

For example, I ran the same gates through a smaller Raster size (20 instead of 50) got this result.
Green is now the tumor area with the high CK signal. Also, with the smaller raster size it is better picking up the thinner and smaller tumor areas. Play around with your settings, you never know what you may find!

Please ask if you have any questions or comments about this workflow! There are many types of input data, and this setup will not work for everyone. It certainly makes some assumptions about your data being cell-centric, with every cell having a single value for each measurement.

On the other hand, this same exact analysis can be used for tiles, entirely avoiding cell segmentation based flaws. While it might be painfully slow for a whole slide, I was able to get the same sort of results by tiling the entire image and using the fluorescent intensities per tile to generate neighborhoods - no cell segmentation needed at all! This might be a better option in cases where the analysis concerns cells that are not easily segmented, like macrophages.

1 Like

Bonus post: not using the same models or regions as the previous posts.
Region Interactions” shows which regions, calculated in the previous posts, are bordering each other.
R1 = Teal
R2 = Green
R3 = Yellow - cytokeratin tumor
R4 = Orange - immune inflamed

Columns sum to 100% and show that most regions are fairly contiguous (high values on diagonal).

@cstoltzfus Not sure if there is a way to move the text in the final image?

1 Like

Crosslinking the update here since the major impact is on neighborhood analysis.

Essentially, I do not think CytoMAP differentiates between QuPath data for multiple images when those multiple images are imported as a single CSV file. However, the default export from QuPath is a single CSV file across an entire project.
This is an example of a script I used to export cellular data from QuPath in a way that could be used for CytoMAP neighborhood analysis:

Export one CSV file of detections **per image**
// https://forum.image.sc/t/qupath-saving-filtered-detectiontable-with-measurementexporter-for-current-image-only/41328/2
// ======== Save Results =============
import qupath.lib.gui.tools.MeasurementExporter
import qupath.lib.objects.PathCellObject

// Get the list of all images in the current project
def project = getProject()
def entry = getProjectEntry()
entryList = []
entryList << getProjectEntry()

def outputPath = buildFilePath(PROJECT_BASE_DIR, "results")
def imageData = entry.readImageData()
// Separate each measurement value in the output file with a tab ("\t")
def separator = ","

// Choose the columns that will be included in the export
def columnsToInclude = new String[]{"Image","Centroid X µm","Centroid Y µm", "Nucleus: Area µm^2", "Nucleus: Length µm", "Nucleus: Circularity", "Nucleus: Solidity", "Nucleus: Max diameter µm", "Nucleus: Min diameter µm", "Cell: Area µm^2","DAPI: Nucleus: Median", "DAPI: Cytoplasm: Median","DAPI: Cell: Median","CD8a: Nucleus: Median", "CD8a: Cytoplasm: Median","CD8a: Cell: Median","CD68: Nucleus: Median", "CD68: Cytoplasm: Median","CD68: Cell: Median","PDL1: Nucleus: Median", "PDL1: Cytoplasm: Median","PDL1: Cell: Median","CK: Nucleus: Median", "CK: Cytoplasm: Median","CK: Cell: Median"}

def exportType = PathCellObject.class

def name1 = entry.getImageName() +'.csv'
//need a file here
def outputFile = new File( buildFilePath(outputPath, name1))

// Create the measurementExporter and start the export
def exporter  = new MeasurementExporter()
                  .imageList(entryList)            // Images from which measurements will be exported
                  .separator(separator)                 // Character that separates values
                  .includeOnlyColumns(columnsToInclude) // Columns are case-sensitive
                  .exportType(exportType)               // Type of objects to export
                  .exportMeasurements(outputFile)        // Start the export process
print "Done"

Running the above script “For Project” gave me one CSV file per image, which I could then use in CytoMAP for neighborhood analysis across multiple images.
The downside to all this is that the measurements need to be typed out if you want to include only some columns - which is nice for working in CytoMAP.

Repeat of code from other thread:

Script to import a folder of CytoMAP CSV files into their correct QuPath images
//Select one particular folder that *only has CSV files exported from CytoMAP in it,* and import them into their matching QuPath images.
// Version 3, dramatic speed increase and error checking.
// Michael Nelson February 2021.
project = getProject()

def folder = Dialogs.promptForDirectory(null)
    // Create BufferedReader
    def csvReader = new BufferedReader(new FileReader(file));
    //The rest of the script assumes the X coordinate is in column1, Y in column2, and all other columns are to be imported.
    row = csvReader.readLine() // first row (header)
    measurementNames = row.split(',')
    length = row.split(',').size()
    //measurementNames -= 'X'
    //measurementNames -= 'Y'
    print measurementNames
    print "Adding results from " + file
    print "This may take some time, please be patient"
    csv = []
    Set imageList = []
    while ((row = csvReader.readLine()) != null) {
        toAdd = row.split(',')
        csv << toAdd
    int t = 0
    int z = 0
    print imageList
    imageList.each{ image->
        entry = project.getImageList().find {it.getImageName() == image}
        if (entry == null){print "NO ENTRIES FOR IMAGE "+ image;return;}
        imageData = entry.readImageData()
        hierarchy = imageData.getHierarchy()
        pixelSize = imageData.getServer().getPixelCalibration().getAveragedPixelSizeMicrons()
        csvSubset = csv.findAll{it[2] == image}
        //println("csv subset "+csvSubset)
        objects = hierarchy.getDetectionObjects()//.findAll{it.getPathClass() == getPathClass("Islet")}
        ob = new ObservableMeasurementTableData();
        ob.setImageData(imageData,  objects);
            x = line[0] as double
            y = line[1] as double
            object = PathObjectTools.getObjectsForLocation(hierarchy, x/pixelSize, y/pixelSize,  t, z,-1).find{it.isDetection()}
            if (object == null){print "ERROR, OBJECT NOT FOUND AT "+x+","+y;return}
                    if (round(ob.getNumericValue(object, "Centroid X µm")) != x || round(ob.getNumericValue(object, "Centroid Y µm")) != y){
            object = objects.find{round(ob.getNumericValue(it, "Centroid X µm")) == x && round(ob.getNumericValue(it, "Centroid Y µm")) == y}
            i=3 //skip the X Y and Image entries
            while (i<length){
                //toAdd = row.split(',')[i] as double
                object.getMeasurementList().putMeasurement(measurementNames[i], line[i] as double)

print "Done with all images!"

def round(double number){
    BigDecimal bd = new BigDecimal(number)
    def result
    if (number < 100){
        bd = bd.round(new MathContext(4))
        result = bd.doubleValue()
    }else if (number < 1000){
        result = number.round(2)
    }else {result = number.round(1) }

    return result
import qupath.lib.gui.measure.ObservableMeasurementTableData
import java.math.MathContext