There and back again, QuPath<==>CytoMAP cluster analysis

Hi all!
“Graphical abstract”…

New post on cluster/neighborhood analysis, though I will mostly be focusing on the clustering aspect. The focus here is on CytoMAP, a great new MATLAB-based tool (it does not require a paid MATLAB license, so no fear!) that can be used in conjunction with QuPath to perform some powerful analysis while still being fairly straightforward to use. I think it is particularly amazing how powerful the analyses can be with entirely open source solutions. There may be other options: all of this can likely be done in R, or possibly HistoCAT, though I am less familiar with those. Someone else can create those guides!

That said, there is one script involved this time (QP 0.2.2+), but it will not be too bad, I promise! The basic steps are these:

  1. Generate some objects in QuPath that have measurements. These measurements have to be meaningful, or else the cluster analysis will not generate good results. Garbage in, garbage out. Or in more concrete terms, if you use H-DAB color measurements on an H&E image, do not expect your cluster results to be any good.
Image: Well, some of these cell borders certainly do not look right

The data from the cytoplasmic and cell measurements will certainly not be reflective of the true cells in all cases.

  2. Export these measurements through the handy-dandy Measure->Export measurements interface, or write a script.
  3. Run CytoMAP 1.4.9 or later and import those measurements right on in!
  4. Create clusters, tSNE plots, a little bit of neighborhood analysis, whatever it is you want to do.
  5. Export the results from CytoMAP.
  6. Use a script to import the results from CytoMAP back into the correct images and objects in QuPath.
  7. Use the Measurement Maps or perhaps some other tools to inspect the results of your analysis within the original images.

The fact that you still have to use CSV files as a go-between for the two programs might be somewhat off-putting, but it also gives you a “hard copy” of the results that you could use in other analysis software.


Alright, let’s begin! I will be using a demo project similar to the last one that I hosted, which can be found here:
It includes:
The LuCa image from
Data file with some cells and measurements
Several scripts including the Import from CytoMAP script, and a visualization script.
Several csv files representing results at different stages of the guide.
A folder of colormaps that can be used by adding them to your User directory/colormaps folder (future post)

You will want QuPath 0.2.2 or later installed. I specifically used a version with Tensorflow so that I could use StarDist to generate my cell outlines, but that is not necessary. You will also need to have CytoMAP 1.4.9 or later installed, which can be found here.

1. Generate some data containing objects to analyze. GIGO warning.

Well, the objects have already been generated in the demo project, and there is only one image, but I will be treating this demo as if there are multiple images that are part of a project, and my recommendations about what to export will be based on the assumption that everything is run for a multi-image project. It works just fine on a single image, though.

Remember that the measurements generated must be meaningful; while we are essentially treating these values as flow cytometry data, they very much are not flow cytometry data! Mean intensities will be inaccurate due to not tracing the outline of the cell exactly, and will be impacted by how exactly the cell was sliced in the tissue section. Take this into consideration with your conclusions! You may have better luck generating intensity sums per cell than means if you need a large cytoplasmic expansion.

2. Export the measurements

I will show the Measure->Export measurements dialog here, but if you are script savvy, you can also script this.

Make sure all of the files you want to export are on the right side in the Selected window.
Output file: Name your csv file.
Export type: Choose Cells or Detections, usually.
Separator: I have only used CSV.
Columns to include (Optional): While this says optional, I would STRONGLY recommend treating it as mandatory in this case. You really do not want to scroll through dozens of measurements you will never use every time you want to do something in CytoMAP. Worse, many of those measurements would be harmful to your analysis if you simply “include everything.”
Click Populate, and then include at least these base measurements.

Columns necessary to include


After those, you can include any additional measurements you want your clustering to be based off of. If you want to do neighborhood analysis based off of Classes you created in QuPath, include “Class” as well.
For the exported CSV you will find in the demo folder, you can see that I included some Nucleus size/shape measurements, and one measurement per signal channel, based on the localization of that signal (either nuclear or cytoplasmic).


At this step, I would always check the csv file for any missing values. CytoMAP does not handle those well at all, especially when you try to Standardize your data: it will hang and leave you wondering what has gone wrong. I just ran into the issue again, so I wanted to put an extra warning here. My steps to resolve it were to open the .csv file in Excel, use CTRL+SHIFT+arrow keys to select the entire data set, then CTRL+F to open Find. Switch to “Replace” and replace all instances of nothing (leave the “Find what” field empty) with 0.
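If Excel is not handy, the same cleanup can be sketched in a few lines of Python. This is only a sketch under my own assumptions: a field counts as missing when it is empty or only whitespace, it gets filled with 0 exactly as in the Excel steps, and the function names are made up for illustration.

```python
import csv
import io


def fill_missing(rows, fill="0"):
    """Return data rows with empty or whitespace-only fields replaced by `fill`."""
    return [[cell if cell.strip() else fill for cell in row] for row in rows]


def fill_missing_in_file(src_path, dst_path, fill="0"):
    """Copy a CSV file, keeping the header row and filling gaps in the data rows."""
    with open(src_path, newline="", encoding="utf-8") as src:
        rows = list(csv.reader(src))
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(rows[0])
        writer.writerows(fill_missing(rows[1:], fill))


# Small in-memory demonstration with made-up values
sample = "Image,Centroid X µm,Centroid Y µm,CK\nLuCa,10.5,20.1,\nLuCa,11.0,,3.2\n"
rows = list(csv.reader(io.StringIO(sample)))
cleaned = [rows[0]] + fill_missing(rows[1:])
print(cleaned)
```

For a real export you would call something like `fill_missing_in_file("CellData.csv", "CellData_filled.csv")` (the output name is hypothetical) and import the filled copy into CytoMAP. Whether 0 is a sensible stand-in depends on the measurement, so look at which columns had gaps first.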

3. Run CytoMAP

(Or install it first. The first time you install, you may need to download the MATLAB Runtime (~3GB) if you do not already have a MATLAB installation. Further updates to CytoMAP are incredibly fast and only take a few seconds.)
The interface should look something like this, showing the import option.

So far I have only tested the following on Windows 10 and 7 systems, but MATLAB seems to be an option for Windows, Mac and Linux.
Once you complete the Import (or Load .csv or .mat in this case) nothing happens as it gets stuck at:
Oops. Apparently CytoMAP doesn’t like the name of my demo file! If CytoMAP does not handle certain characters or combinations of characters, it will simply hang, so check the CPU usage if you run into loading bars that are not moving to make sure CytoMAP is working. I am not exactly sure what broke in this case, but replacing the name on the right with the name on the left solves the problem. If you do not have weirdly named files, this likely will not be an issue for you, as it has not been a problem for me in previous projects.

Image: Fixing the oopsie


There are now two copies of CellData.csv in the demo folder; I will be using CellData2.csv for the rest of this guide as it has the adjusted image name.
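If you would rather not edit the image name by hand, here is a rough Python sketch of the same fix. Assumptions to check against your own file: the image name lives in a column called "Image", and the file names in the example are invented. I do not know which characters CytoMAP objects to, so the sketch only swaps one exact name for another.

```python
import csv
import io


def rename_image(rows, old_name, new_name, column="Image"):
    """Return a copy of `rows` with `old_name` replaced by `new_name` in one column."""
    header = rows[0]
    idx = header.index(column)  # raises ValueError if the column is absent
    renamed = [list(header)]
    for row in rows[1:]:
        row = list(row)
        if row[idx] == old_name:
            row[idx] = new_name
        renamed.append(row)
    return renamed


# Made-up example: one oddly named image and one ordinary one
sample = 'Image,Centroid X µm,Centroid Y µm\n"odd [name],v1.tif",10.5,20.1\nother.tif,1.0,2.0\n'
rows = list(csv.reader(io.StringIO(sample)))
fixed = rename_image(rows, "odd [name],v1.tif", "plain_name.tif")
print(fixed)
```

Remember that the name has to match the image name in the QuPath project again when the results come back, so whatever you rename here needs the same rename in QuPath.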

Next step, select the X,Y and Z coordinates for your objects. Selecting these is not terribly important unless you want to view your data in CytoMAP “as if it were in an image.” Since the main purpose of this for me has been viewing the data in QuPath, I do not worry about it much. Do choose values with numerical entries though. For Z there is a default “Do not use Z” option you can leave selected.

Image: CytoMAP X and Y selection on import

In the next popup, select Load in the lower left.
Now CytoMAP returns to its default look, though with no indication the data is loaded. If you have not gotten stuck on any green partial loading bars, you should be good to proceed.

4. Now, on to analysis in CytoMAP.

I recommend going through the documentation here and looking at the options on the right side of the screen. While I intended this to be a few quick steps, it got a bit out of control.

If you classified your cells in QuPath, you can use the “Annotate Clusters” button to tell CytoMAP that it should treat that “Class” entry in your CSV as a list of clusters. One of the great recent changes is that even if you have 30+ classes, it will autofill them all for you!

I will jump straight into the two options I use most “Cluster Cells” and “Dimensionality Reduction.”

Cluster Cells

When you first select Cluster Cells, you will get an interface like this.

Image+description:Cluster cells interface

Select All/All in Use for Sorting in the upper left. If you do not, CytoMAP will try to cluster zero cells. It does not work well.
Next select the measurements you want to cluster based on, and adjust any Weights (half hidden column “Wei…”) for measurements you think are more important. Be careful with this. Great power, great responsibility, all that. I would not include any positional data like the XYZ coordinates, or any strings in this case (maybe string data should not show up here?).

At the bottom, starting from the left, I usually select “Standardize” rather than Raw MFI (mean fluorescent intensity), as I do not want to weight the results based on my brightest channels.
Normalize as is appropriate for your project; if you are not sure, the forum should be a good place to ask! I have never altered Color scheme.
Number of Regions is very important, and you can either use one of the built in auto-detection algorithms, or select your own. I generally start with the built in options, and then increase the number of base clusters to study what happens, and where the algorithm finds differences in the data. There is no single right answer; there are many right and wrong answers, however. Depending on how you normalized, you may find that clusters form “per sample” due to variations in staining, for example. You may want to merge those clusters for downstream analysis (overclustering).
Input Data Type: For the moment, use individual cells. Explore the other options once you are comfortable generating neighborhood data.
Select Algorithm: I have not explored the variations in these quite enough to really say when one in particular should be used. Try them all!
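To show why I prefer Standardize over raw values, here is a small sketch of z-scoring. I am assuming that Standardize does something along these lines (subtract the mean, divide by the standard deviation, per measurement); I have not checked CytoMAP's code, so treat it as an illustration of the idea rather than its implementation.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score a list of numbers: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant channel carries no information
    return [(v - mu) / sigma for v in values]


# Two made-up channels with the same pattern but very different brightness
bright_channel = [1000.0, 2000.0, 3000.0]
dim_channel = [1.0, 2.0, 3.0]
z_bright = standardize(bright_channel)
z_dim = standardize(dim_channel)
print(z_bright, z_dim)
```

After standardizing, the bright and dim channels end up on the same scale, so neither dominates the distance calculations purely because of its intensity range.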

Huge note that I missed earlier, there is a menu at the top where you can File->Save or Load model.

Example first run.

Image+description:Cluster cells first run

I usually name the model (the results) after the settings I used. The model above, for example, was called “Standardize DB NN 1”. If you do not include information about the settings you use, I do not believe they are stored anywhere else, so it can be hard to replicate an analysis on another or larger data set.
Once you run it, this may take some time, so check your CPU utilization if you are worried that the program has stopped.
In Windows:
If you chose to allow an automatic detection of the number of clusters, you should get both some sort of plot and a Figure on completion of the analysis.
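For a feel of what an automatic detection has to compute, here is a sketch of the Davies-Bouldin index, which I am assuming is what the “DB” option refers to. It is the textbook formula, not CytoMAP's code: for each cluster, find the other cluster it is most easily confused with (largest combined spread divided by the distance between centers), then average those worst cases. Lower is better.

```python
from math import dist


def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    Needs at least two clusters with distinct centers.
    """
    centers = [centroid(c) for c in clusters]
    # Average distance of each cluster's points to its own center
    scatter = [sum(dist(p, ctr) for p in c) / len(c) for c, ctr in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Made-up examples: two well separated clusters, then two that touch
separated = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 0.0), (12.0, 0.0)]]
touching = [[(0.0, 0.0), (2.0, 0.0)], [(2.0, 0.0), (4.0, 0.0)]]
print(davies_bouldin(separated), davies_bouldin(touching))
```

If the automatic option works the way I assume, it has to cluster the data once per candidate number of regions and score each result like this, which would explain why it takes so much longer than picking the number yourself.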
There is way too much to go over here, so I recommend the official documentation, but I want to point out that the Y axis is inverted as QuPath starts Y values from 0 at the top of the image.

Image: Cluster analysis results


In the Options in the lower left of the Figure, you can Invert the Y axis. The “C-Axis” options are your color map (which can also be changed in the Options).
The cluster analysis we just ran results in:

Image: Cluster analysis results colormap


Ok. So. That is a thing. I feel like an optometrist is about to ask me which number is hidden in the picture. But, we can at least see that some of the clusters align with structures from the original image. What does it mean though? Well, thankfully, there are heatmaps! Go back to the original CytoMAP window and select Extensions->cell_heatmaps.m
Since this is going a bit long, I will simply show the options I selected, and feel free to ask if you have any questions!

Image: Heatmap options

And the result is in!

Right off, I can identify two clusters that are heavy in CK and represent the majority of the tumor area (clusters 6 and 9, shown in green and red in the previous XY map). The primary difference between the two seems to be the size of the nucleus. Is that useful? I am not sure, ask the biologists! However, we could include other measurements from QuPath at this point, like nearest CD8+ cell, or similar distance based measurements, which might tease out more differences between the clusters. You can include data on the heatmap that was not used in the generation of the clusters!
A couple of other things that jumped out: The CD8 and FOXP3 clusters are relatively rare compared to most other clusters and PDL1 seems mostly associated with the CD68 positive cells, both of which might tell us something about the tumor microenvironment in this case.

Dimensionality Reduction, tSNE plots and manual gating!

I chose similar settings for the quick Dimensionality Reduction example as for the cluster analysis. There are also other options like PHATE, but I only included tSNE here for, hah, “brevity.”

Note that I did exclude the cluster analysis results, as I did not want them to bias the tSNE plot.
Settings choices for tSNE:

Image: tSNE options

From the MATLAB implementation:
Fun reading about tSNE variables (thanks @cstoltzfus !):

I chose the default settings and, after a short wait, obtained the following plot.

A tSNE plot of QuPath data! Dreams do come true!

The one change I did make was changing the color axis to the previously discovered clusters, so I could see how they matched up. I can see clusters 9 and 6 on the left, which would be the tumor region, and a fairly well separated cluster 5 on the right. Cluster 5 was my FOXP3 cells, but why are there cluster 2, 3 and 4 cells mixed in? I wonder where and what those are?

That is where Gating comes in. There are two ways I could go about this, either selecting the entire cluster on the right, or trying to grab only the non-cluster 5 cells in that tSNE cluster. I will simply create a gate or two and bring that data back into QuPath.

First, click Show Table and change All/All to All Cells, as shown. Then Exit Table and use the annotation options to draw some gates.


The useful buttons, from left to right:

Clear plot.
Refresh image: This is useful if you run a new cluster analysis in another window, and want to view those results in your tSNE plot.
Square gate: Draw a square gate.
Polygon gate: Draw a polygon gate, close it by clicking on the first point a second time.
Save Last Chosen Gate: This is very important. It does not save ALL gates, it only saves the last gate selected. But if you never save the gates, you cannot do anything with the gates, like import them back into QuPath.

When you click on a button to create a gate, it will first ask you to name it.

This is my gate. There are many like it, but this one is mine


The cursor changes to a crosshairs, and you can start clicking to draw your gate.
Edges can be tricky and I recommend setting your first point far away from an edge. After completing your gate, you can always drag it around to capture points near the edge of the image. Dragging the gate will “push” the edge.

Final gates; I also created a CK gate.


A quick check of the heatmaps for those gates.

How do I create a heatmap for gates?

By selecting the gates I want to analyze and the Phenotypes option within “Select What To Compare” I can see that some of the tSNE FOXP3 cluster now includes CD68 and PD1 positive cells.

There currently seems to be a bug when choosing Select Heatmap Type, if you attempt to choose Combined heatmap for all samples. Individual will work.

Remembering that this is not flow data, what we may be seeing here is interactions between macrophages and FOXP3 positive cells.

5. Export the data from CytoMAP

Fairly straightforward: File->Export full data tables as csv

To import back into QuPath, the export settings should look like this.

Include all cells, and include the gates we created. Include the XY coordinates so that you can match up the clusters with the correct objects in QuPath. Include the Image name so that if you have multiple images, you assign the objects to the correct XY coordinates in the correct images!
Finally, include any models you want, and check the Include Gate Logicals next to the Export button. Uncheck Individual .csv for each cell.
When you click Export, it will have you select a folder, NOT a file name. This can be a little disconcerting if you export multiple times, as you will need to either choose another folder or rename the original file. It will attempt to overwrite your original export otherwise, and it may be locked for editing! I chose the QuPath project folder.

What the project folder looks like now.


6. Time to import all that data back into QuPath!

This is where the first script comes in.
The script is also included within the demo download.

**Script:** Import a targeted CSV file exported from CytoMAP
The image name in the CSV file must match the image name in the project. If you rename one, you must rename the other.

/**
 * Script to import a CSV file exported from CytoMAP 1.4.9 or later.
 * Expected input:
 *   CSV file
 *   X,Y coordinates in first two columns
 *   Image name in third column
 *   All other columns are cluster or gate data
 *
 * Michael Nelson, September 2020
 */
project = getProject()

def file = Dialogs.promptForFile(null)

// Create BufferedReader
def csvReader = new BufferedReader(new FileReader(file));

//The rest of the script assumes the X coordinate is in column1, Y in column2, and all other columns are to be imported.
row = csvReader.readLine() // first row (header)
measurementNames = row.split(',')
length = measurementNames.size()

print measurementNames
print "Adding results from " + file
print "This may take some time, please be patient"
csv = []
Set imageList = []
while ((row = csvReader.readLine()) != null) {
    toAdd = row.split(',')
    imageList << toAdd[2] // collect each distinct image name (third column)
    csv << toAdd
}
csvReader.close()

print imageList
imageList.each{ image->
    entry = project.getImageList().find {it.getImageName() == image}
    // Skip names that do not match any image in the project
    if (entry == null){
        print "No project entry named " + image + ", skipping"
        return
    }

    imageData = entry.readImageData()
    hierarchy = imageData.getHierarchy()

    csvSubset = csv.findAll{it[2] == image}
    //println("csv subset "+csvSubset)
    objects = hierarchy.getCellObjects();
    ob = new ObservableMeasurementTableData();
    ob.setImageData(imageData,  objects);
    csvSubset.each{ line->
        //print line
        x = line[0] as double
        y = line[1] as double
        // Match the CSV row to a cell by its rounded centroid coordinates
        object = objects.find{round(ob.getNumericValue(it, "Centroid X µm")) == x && round(ob.getNumericValue(it, "Centroid Y µm")) == y}
        //print object
        i=3 //skip the X Y and Image entries
        if (object){
            while (i<length){
                //toAdd = row.split(',')[i] as double
                object.getMeasurementList().putMeasurement(measurementNames[i], line[i] as double)
                i++
            }
        }
    }
    // Write the new measurements to this image's data file
    entry.saveImageData(imageData)
}
print "Done with all images!"

// Round to the precision used for the coordinate comparison above
def round(double number){
    BigDecimal bd = new BigDecimal(number)
    def result
    if (number < 100){
        bd = bd.round(new MathContext(4))
        result = bd.doubleValue()
    }else if (number < 1000){
        result = number.round(2)
    }else {result = number.round(1) }

    return result
}
import qupath.lib.gui.measure.ObservableMeasurementTableData
import java.math.MathContext

Feel free to take a look at the CytoMAP_Sample_”Original CSV File Name” to see what you are working with, if you want.
Then, run the script, and target the appropriate CSV file. It SHOULD be as easy as that. If you run into any errors, please let me know, as I have only been able to test this on my own data (and this example). The only problem I had with this example is that I also had to rename the LuCa image within QuPath. The file names must match up exactly, or else the script does not know where to put the CytoMAP values.
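Since the matching step is where things can go wrong, here is a stripped-down Python sketch of the idea, outside of QuPath. Everything in it is illustrative: the cell names and coordinates are invented, `script_round` is my reading of the rounding rule in the import script (four significant digits below 100, two decimals below 1000, one decimal above), and I swapped the script's linear search for a dictionary lookup to keep the example short.

```python
from decimal import ROUND_HALF_UP, Context, Decimal
from math import floor


def script_round(number):
    """Mirror the import script's rounding rule, as I read it."""
    if number < 100:
        # Four significant digits
        ctx = Context(prec=4, rounding=ROUND_HALF_UP)
        return float(ctx.create_decimal(Decimal(number)))
    if number < 1000:
        return floor(number * 100 + 0.5) / 100  # two decimals
    return floor(number * 10 + 0.5) / 10  # one decimal


def build_index(cells):
    """Map (rounded x, rounded y) to a cell name for the cells of one image."""
    return {(script_round(x), script_round(y)): name for name, x, y in cells}


# Hypothetical cells as QuPath might hold them: (name, centroid x, centroid y)
cells = [("cell A", 12.34567, 250.126), ("cell B", 1500.04, 99.5)]
index = build_index(cells)

# Hypothetical rows as they might come back from CytoMAP: x, y, cluster
csv_rows = [("12.35", "250.13", "3"), ("1500.0", "99.5", "5"), ("7.0", "7.0", "1")]
matches = [(index.get((float(x), float(y))), cluster) for x, y, cluster in csv_rows]
print(matches)
```

The last row finds no cell and comes back as `None`. In the real script that is a silent miss, which is what happens if the coordinates in the CSV were rounded differently than the script expects.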

Sample script output.


Ok, it worked, but where are the measurements?

Well, we wrote them to the data file, but QuPath is working off of the temporary data file. Couple of options here: “CTRL+R” with the main window selected will reload the data, and give you the measurements. If you have more than one image, switching to another image will load the measurements in the new image. You could also run the import script with no images open in the first place and everything will be fine.


7. Great, but what can we do with this?

Cluster analysis results.

FoxP3 gate from the tSNE plot.

Since I have no classifications to maintain, I decided to quickly create a single measurement classifier using the gate.

At this point, if you do not have any QuPath classifications that you want to keep, you could create a classifier based on the cluster measurement. Once classified, it would be much easier to turn various clusters on and off through the Annotations tab. Even if you have classifications, you might consider Duplicating an image (Project tab, right click on an image), and then reclassifying only the duplicate.

With all of the unclassified cells turned off in the Annotations tab
it is much easier to inspect the remaining cells for CD68 and PD1.
Looking at the white and yellow cells (most strongly positive for CD68), I can see that many of these FOXP3 cells do seem to have extensions of CD68 touching them, though not necessarily completely surrounding them. Looking at which cluster those cells are in, it turns out they are largely in Cluster 3, which was one of the more minor components of that gate. Clusters 2 and 4 end up looking fairly normal based on the measurements included, but who knows what further analysis might find!
The cytokeratin gate, as a sanity check:
Yep, that gate was definitely the tumor area.

I hope this is useful to someone out there! I appreciate any feedback, and blasted this out in a couple of hours, so please point out if anything is unclear. I plan to update it a bit further with the measurement map script (First image in this section that color codes the clusters, which seems easier than visualizing them by heatmap) in the next few days.

Huge thanks to both @petebankhead and @cstoltzfus for making these two programs available for everyone to use! @cstoltzfus made 3-4 updates in the last couple of weeks that made this all happen (sometimes within hours), so if anyone has questions for him about the software, I highly recommend asking. His response and update time has been phenomenal!

Thank you for putting this together! This is really a cool use of CytoMAP and QuPath together.

A second script can be used to fix the colors on a particular type of colormap to certain values, allowing users to go back and forth between the heatmaps and clusters in CytoMAP and the resulting cells or other objects in QuPath.

Script: Force display colors to match color legend
//STEP 1. Acquire a .TSV file that represents a color map of distinct colors
//STEP 2. Make sure that the User directory is set in Preferences/Extensions (gear icon, upper right)
//STEP 3. Place the .TSV file into a "colormaps" folder within the directory
//STEP 4. Change the file path below to match the colormap you wish to use.
// Influenced by

/**
 * Michael Nelson, September 2020
 * With assistance from Pete Bankhead:
 */

// Change this to the correct location on your computer!!
file = "C:/Users/mnelson/QuPath/colormaps/Rand_CL20.tsv"

// Read the colormap: one color per line, tab-separated R, G, B values between 0 and 1
def csvReader = new BufferedReader(new FileReader(file));
randMap = []
while ((row = csvReader.readLine()) != null) {
    toAdd = row.split("\t")
    toAddInts = []
    //print toAdd
    toAdd.each{
        toAddInts << it.toDouble()
    }
    randMap << toAddInts
}
csvReader.close()
numberOfColors = randMap.size()

int col = 0
int row = 0
//int textFieldWidth = 120
int labelWidth = 20
def gridPane = new GridPane()
gridPane.setPadding(new Insets(10, 10, 10, 10));
ScrollPane scrollPane = new ScrollPane(gridPane)
BorderPane border = new BorderPane(scrollPane)
border.setPadding(new Insets(15));
//Put a button here to update the Measurement Maps window
Button setMapThresholds = new Button()
setMapThresholds.setText("Correct Colors")
gridPane.add( setMapThresholds, 2, row++, 1,1)

// One legend row per color: the cluster number and a swatch of its color
for (i=0; i<numberOfColors;i++){
    clusterNumber = new Label((i+1).toString())
    gridPane.add( clusterNumber, col, row, 1,1)
    rect = new Rectangle(25,(i+1)*10, 60,10)
    int R = 255*randMap[i][0]
    int G = 255*randMap[i][1]
    int B = 255*randMap[i][2]

    color = Color.rgb(R,G ,B )
    rect.setFill(color)
    gridPane.add( rect, 2, row++, 1,1)
}

setMapThresholds.setOnAction {

    String userPath = PathPrefs.getUserPath();
    Path dirUser = Paths.get(userPath, "colormaps");
    colorMappers = []
    // NOTE: the line that fills colorMappers with the colormaps found in dirUser is
    // missing from this copy of the script; use the version in the demo scripts folder.
    //Adjust the name of the color map to whichever map you are interested in.
    def colorMapper = colorMappers.find {it.getName() == 'Rand_CL20'}
    //Great, we have the color map at this point, if it exists.
    def viewer = getCurrentViewer()
    def options = viewer.getOverlayOptions()
    def detections = getQuPath().getImageData().getHierarchy().getDetectionObjects()

    //Two options here, the one that works but requires "index" or a specific measurement, and the one that tries to find out what the currently selected measurement in the dialog is. The latter fails.
    name = options.getMeasurementMapper().measurement
    def mapper = new MeasurementMapper(colorMapper,name, detections)
    // Assumed reconstruction of the final step, which is also missing from this copy:
    // pin the display range to the number of colors and apply the mapper to the viewer
    mapper.setDisplayMinValue(1)
    mapper.setDisplayMaxValue(numberOfColors)
    options.setMeasurementMapper(mapper)
}

Platform.runLater {

    def stage = new Stage()
    stage.initOwner(QuPathGUI.getInstance().getStage())
    stage.setScene(new Scene( border))
    stage.setTitle("Cluster colormap")
    stage.show()
}

import java.nio.file.Paths;
import java.nio.file.Path;
import qupath.lib.gui.prefs.PathPrefs;
// Assumed package for MeasurementMapper in QuPath 0.2.x
import qupath.lib.gui.tools.MeasurementMapper
import javafx.application.Platform
import javafx.geometry.Insets
import javafx.scene.Scene
import javafx.geometry.Pos
import javafx.scene.control.Button
import javafx.scene.control.Label
import javafx.scene.control.ColorPicker
import javafx.scene.layout.BorderPane
import javafx.scene.layout.GridPane
import javafx.scene.control.ScrollPane
import javafx.stage.Stage
import javafx.scene.input.MouseEvent
import javafx.beans.value.ChangeListener
import qupath.lib.gui.QuPathGUI
import javafx.scene.shape.Rectangle
import javafx.scene.paint.Color

When you run this script, you should get a dialog that looks like what is shown below, with a single button.

If you have a Measure->Measurement map open and selected, you will be able to click the “Correct Colors” button to be able to see the cluster colors in a way that matches up with the list under the Correct Colors button. If you do not have a “Measurement map” open and selected, the button will cause an error!!

Now we can see that clusters 9 (white) and 6 (gray) are dominant in the tumor (cytokeratin positive) areas.

Now as a GIF!

Without using the correct colors button, or using other colormaps, you get images like these

In the first image, the Rand_CL20 colormap was chosen, but the colors used are stretched out between the dark green at the end of the color map and the teal at the beginning, with no easy way to see what all of the colors in between ended up as.

Using one of the standard color maps, it can be very difficult to tell clusters apart, much less tell exactly which cluster is which.

Jet (legacy) is slightly better in terms of cluster color separation, but still does not help much with identifying exactly which cluster is which.

Regardless of what colormap you choose, assuming you have adjusted the script and set up your QuPath user directory correctly, the Correct Colors button will use the colormap chosen (I recommend Rand_CL20 to start!) in the script to overwrite the normal Measurement map display.

To make the whole setup work, first set your User directory, if you have not already.

Open Preferences from the gear in the upper right of QuPath (also in the upper right of the image above), and copy and paste a path into the provided space. Alternatively, double click in the empty space and navigate to the desired folder.

Once you have a user directory, create a colormaps folder within that directory, or copy in the colormaps folder from the Demo folder included in the second post. There should be several sample colormaps within that folder, which you can delete or move to another folder if you like. QuPath will show all of the added maps, so if it gets messy, delete away!
In the end, you should end up with something like this:

Edit the script above to point to the desired file. You can also get the script from the scripts folder in the Demo.

That is pretty much it! Please ask if you have any problems or questions. Note that this will only be useful “as is” if you have 20 or fewer clusters. If you have more clusters, you may need to add more lines to the provided colormap, or create your own colormap by manipulating the provided .tsv files.
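If you need more than 20 colors, one way to build a longer colormap is sketched below in Python. The file layout is an assumption based on how the script reads Rand_CL20.tsv: one color per line, red, green and blue separated by tabs, each between 0 and 1, with no header. Stepping the hue by the golden ratio is my own choice to keep neighboring cluster numbers visually distinct; it is not how the provided maps were made.

```python
import colorsys

GOLDEN_RATIO_CONJUGATE = 0.618033988749895


def distinct_colors(n, saturation=0.85, value=0.95):
    """Return n RGB triples (each component 0-1) with well separated hues."""
    return [
        colorsys.hsv_to_rgb((i * GOLDEN_RATIO_CONJUGATE) % 1.0, saturation, value)
        for i in range(n)
    ]


def to_tsv(colors):
    """One color per line, red/green/blue separated by tabs, no header."""
    return "".join("\t".join(f"{c:.4f}" for c in rgb) + "\n" for rgb in colors)


colors = distinct_colors(30)
tsv_text = to_tsv(colors)
print(tsv_text)
```

Save the text to a .tsv file in your colormaps folder (for example a hypothetical `Custom_CL30.tsv`), then point both the file path and the colormap name in the script at it.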

P.S. You technically do not need to set up the User folder, as you can get away with simply editing the script to point to the desired .tsv file anywhere. If you like the results, though, you may want to explore creating and sharing your own colormaps which will be easier with this setup already in place.

Minor update to the import script to avoid “misses” that I could not pin down and a warning about missing fields in the QuPath CSV export.

Wanted to quickly point out that there have been a number of improvements if you are working with multiple samples/files.

First, I would use a script to export "per sample" csv files, something like this.
// ======== Save Results =============
import qupath.lib.gui.tools.MeasurementExporter
import qupath.lib.objects.PathCellObject
import qupath.lib.objects.PathDetectionObject
//New User Defined Entry- only for exporting certain classes of objects
//exportClass = getPathClass("PD1 (Opal 650)")

// Export only the image this script is running on, so that
// "Run for project" produces one CSV file per image
def project = getProject()
def entry = getProjectEntry()
entryList = []
entryList << entry

// Write to a "results" folder inside the project, creating it if needed
def outputPath = buildFilePath(PROJECT_BASE_DIR, "results")
mkdirs(outputPath)

//NEW CODE - only needed if you want to exclude certain classes
//removedObjects = getCellObjects().findAll{it.getPathClass() != exportClass}
//removeObjects(removedObjects, true)
imageData = entry.readImageData()
// Separate each measurement value in the output file with a comma
def separator = ","

// Choose the columns that will be left out of the export
def columnsToExclude = new String[]{"Name", "Class","Parent", "ROI" }

def exportType = PathDetectionObject.class

// One output file per image, named after the image
def name1 = entry.getImageName() +'.csv'
def outputFile = new File( buildFilePath(outputPath, name1))

// Create the measurementExporter and start the export
def exporter  = new MeasurementExporter()
                  .imageList(entryList)            // Images from which measurements will be exported
                  .separator(separator)                 // Character that separates values
                  .excludeColumns(columnsToExclude) // Columns are case-sensitive
                  .exportType(exportType)               // Type of objects to export
                  .exportMeasurements(outputFile)        // Start the export process

print "Done"

See here for the original.

Once you have your pile of CSV files (and have checked them all for empty entries in your measurement lists! Ack!), you can use CytoMAP’s Import Multiple Samples option.
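Checking a whole pile of files by hand gets old quickly, so here is a small Python sketch that counts the empty fields in every CSV in a folder. The folder layout and file names are invented for the demonstration, and "empty" here means blank or whitespace only.

```python
import csv
import tempfile
from pathlib import Path


def count_empty_fields(csv_path):
    """Count blank fields in the data rows of one CSV file (header skipped)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row
        return sum(1 for row in reader for cell in row if not cell.strip())


def scan_folder(folder):
    """Return {file name: number of empty fields} for every .csv in `folder`."""
    return {
        path.name: count_empty_fields(path)
        for path in sorted(Path(folder).glob("*.csv"))
    }


# Demonstration on two throwaway files
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "sample1.csv").write_text("X,Y,CK\n1.0,2.0,3.0\n", encoding="utf-8")
    Path(tmp, "sample2.csv").write_text("X,Y,CK\n1.0,,3.0\n4.0,5.0,\n", encoding="utf-8")
    report = scan_folder(tmp)
print(report)
```

Pointing `scan_folder` at the results folder from the export script gives you a quick list of which files need attention before you start Import Multiple Samples.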

For neighborhood analysis, the most recent update, 1.4.11, allows you to pull classifications from QuPath from all images, and use those to define your neighborhoods. Or simply use Clusters from within CytoMAP.

Mike - This looks like an awesome tool. Thanks for putting together such a great post. Do you know if CytoMAP will implement dimensionality reduction techniques like UMAP and PCA, or clustering solutions like DBSCAN? I work in genomics, where these are the state of the art, and I am wondering if they can be applied to imaging data.


It already has PCA and UMAP. I have a rather slow implementation of DBSCAN for QuPath in another post on the forum, but CytoMAP has much better visualization options if you can use the methods it has available.
Current techniques are available on the wiki:

I never did get around to implementing HDBSCAN, and I do not know if that is something that can be expected from @cstoltzfus.


MATLAB has dbscan built in so I can easily add it as an option for clustering in CytoMAP. I’ll put it in the next compiled version.

@rdbell3 I added DBSCAN as a model option when clustering cells or neighborhoods in version 1.4.13, available as a MATLAB app here or a standalone version here. When you run DBSCAN it asks for epsilon and the minimum number of points within epsilon. Since DBSCAN determines the number of clusters internally, CytoMAP will ignore the number of regions option selected by the user. I am using the base MATLAB implementation of DBSCAN described here. I haven’t tried it out to see how well it defines actual regions in tissues yet.
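To make the two DBSCAN inputs concrete, here is a small pure-Python sketch of the textbook algorithm. It is not the MATLAB implementation CytoMAP calls. The names `dbscan`, `epsilon`, `min_points` and `NOISE` are illustrative, and counting a point as its own neighbour is an assumption about convention. Note that the number of clusters is never passed in; it falls out of the density of the data.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, epsilon, min_points):
    """Label each point with a cluster index (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force search, O(n) per query; the result includes i itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= epsilon]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_points:
            labels[i] = NOISE  # may become a border point of a later cluster
            continue
        cluster += 1  # i is a core point, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: join, but do not expand
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_points:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


# Two tight groups of four points plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, epsilon=1.5, min_points=3))
# [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The brute-force neighbour search keeps the sketch short but scales quadratically, so it is only meant for reading, not for hundreds of thousands of cells.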


First of all, a huge thanks for this free software and these tutorials, they are amazing!

I am working with a Vectra 3.0, analyzing mIF (7 colours) in WSI at 20x.
After merging all the MSI into one omero.tif file in QuPath with the available script, I can generate the CSV file with 650K detected cells (60 MB). But when I try to cluster the cells in CytoMAP, using standardize and NN SOM as the algorithm, with only the 7 markers (no other parameters), “determining number of cluster” never finishes. How long should it take? Is that too many cells? (I am using a 3.3 GHz i5 with 24 GB of RAM.) I have tried MSI regions with 30K cells and that works. Should I change the algorithm? Or should I try clustering 4–5 MSI regions together in CytoMAP and exporting to QuPath in the WSI omero.tif?

Many thanks again for this exemplary community.


Have you tried manually setting the number of clusters and seeing if the cluster analysis works? I have not yet had any truly large data sets to play with, but I did see that the completion times were much longer when I tiled my images (which I think ended up being only about 60,000 objects). They still completed as long as there were no problems with the CSV file, but the time increases significantly. I don’t recall exactly how much.


Hi Mike,

Thanks a lot for your rapid reply. With a manual number of regions it is a super fast process; it took seconds (5-10, more or less).

Finally, I restarted my computer and tried the Davies-Bouldin method again, and it worked… in about 4 hours. It gave me 12 clusters. I suppose it is possible to merge some of them with the annotate clusters option and bring them back to QuPath.

I don’t know if this time-consuming method is worth it, or whether it is better to try different manual numbers of clusters and choose the one that makes biological sense. I need to explore the data analysis more deeply.

Thanks again


If you want things to run a bit faster you can try using K-Means instead of NN. However, with any clustering algorithm, automatically determining the number of regions will be 24ish times slower than manually defining the number of regions in CytoMAP. If you have no idea how many cell types or regions to expect, then this can be worth the extra time. In many cases, though, you can guess about how many cell types or regions you expect to find based on the markers you stained with. If this is the case, then manually choosing an overestimate of the number of clusters and then grouping clusters together using the annotate regions/clusters option has worked well for me.

Using an algorithm to determine the number of clusters is very slow because the algorithm clusters your data 24 times, with 24 different numbers of clusters, then compares the different clustering runs to see which one best “fits” the data. How “best fit” is determined depends on the algorithm used. I think for a lot of biological questions this is overkill, since the absolute number of different cell types or region types is often irrelevant to the biological question being asked.
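The "cluster once per candidate number, then compare" idea can be sketched in a few lines of Python. This is only an illustration, not CytoMAP's code: it uses a plain k-means with a deterministic farthest-point start, scores each run with the Davies-Bouldin index (lower is better), and tries four candidate counts instead of 24. All function names here are made up for the example.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means with farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next start: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an emptied cluster keeps its old centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


def davies_bouldin(points, centroids, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / separation_ij."""
    groups = [[p for p, l in zip(points, labels) if l == c] for c in range(len(centroids))]
    used = [c for c, g in enumerate(groups) if g]  # ignore empty clusters
    if len(used) < 2:
        return float("inf")
    scatter = {c: sum(dist(p, centroids[c]) for p in groups[c]) / len(groups[c]) for c in used}
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in used if j != i) for i in used]
    return sum(worst) / len(worst)


def pick_cluster_count(points, candidates):
    """Cluster once per candidate k and return (best_k, scores)."""
    scores = {}
    for k in candidates:  # this loop is why the automatic option is slow
        centroids, labels = kmeans(points, k)
        scores[k] = davies_bouldin(points, centroids, labels)
    return min(scores, key=scores.get), scores


# Three well-separated groups of four points each.
corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
points = [(x + ox, y + oy) for ox, oy in [(0, 0), (20, 0), (0, 20)] for x, y in corners]
best, scores = pick_cluster_count(points, range(2, 6))
print(best)  # 3
```

Each candidate costs a full clustering run, so the total time grows roughly with the number of candidates, which matches the "24ish times slower" estimate above.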


Thanks a lot Caleb for this comprehensive explanation!

On another note, there may be a little bug in the app: when I try to open Edit>Edit Neighborhoods, it shows me Edit Channels.

Thanks again, amazing tool


Nice catch, it looks like I didn’t change the window name for edit channels when making edit neighborhoods. I will fix this for the next compiled version. It looks like this interface is still functioning as I intended it to, despite the incorrect window name. If you select a sample to edit it should have channels like Neigh_Volume, NCells which are unique to neighborhoods. When you make a new channel you can then use it for clustering.


Hi Mike,
Thank you for putting it all together. Great explanations, I really enjoy the combination of QuPath and CytoMAP.
I’ve tested it on a subset of the data and it worked smoothly. With a full dataset (6 GB of images + 260,000 ROIs), I did not get the final message after the script completed, but refreshing the project did the job and I can view the neighbourhoods calculated in CytoMAP.