Responses to the CytoMAP "There and back again" thread

First of all, a huge thanks for this free software and these tutorials; they are amazing!

I am working with Vectra 3.0, analyzing mIF (7 colours) in WSI at 20x.
After merging all the MSI into a single omero.tif file in QuPath with the available script, I can generate the CSV file with 650K cell detections (60 MB). But when I try to cluster the cells in CytoMAP, using standardize and NN SOM (algorithm) on only the 7 markers (no other parameters), “determining number of clusters” never finishes. How long should it take? Are there too many cells? (I am using an i5 3.3 GHz with 24 GB RAM.) I have tried with MSI regions of 30K cells and it works. Should I change the algorithm? Or should I try clustering 4–5 MSI regions together in CytoMAP and exporting to QuPath in the WSI omero.tif?

Many thanks again for this exemplary community.

Have you tried manually setting the number of clusters and seeing if the cluster analysis works? I have not yet had any truly large data sets to play with, but I did see that the completion times were much longer when I tiled my images (which I think ended up being only about 60,000 objects). They still completed as long as there were no problems with the CSV file, but the time increases significantly. I don’t recall exactly how much.

Hi Mike,

Thanks a lot for your rapid reply. With a manually set number of regions it is a super fast process; it took seconds (5-10, more or less).

Finally, I restarted my computer and tried the Davies-Bouldin method again, and it finished… in about 4 hours. It gave me 12 clusters. I suppose it is possible to merge some of them with the annotate clusters option and bring them back into QuPath.

I don't know whether this time-consuming method is worth it, or whether it is better to try several manually chosen numbers of clusters and pick the one that makes biological sense. I need to explore the data analysis more deeply.

Thanks again

If you want things to run a bit faster, you can try using K-Means instead of NN. However, with any clustering algorithm, automatically determining the number of regions will be 24ish times slower than manually defining the number of regions in CytoMAP. If you have no idea how many cell types or regions to expect, this can be worth the extra time. In many cases, though, you can guess roughly how many cell types or regions you expect to find based on the markers you stained with. If so, manually choosing an overestimate of the number of clusters and then grouping clusters together using annotate regions/clusters has worked well for me.

Using an algorithm to determine the number of clusters is very slow because the algorithm clusters your data 24 times, with 24 different numbers of clusters, then compares the different clustering runs to see which one best “fits” the data. How “best fit” is determined depends on the algorithm used. I think for a lot of biological questions this is overkill, since the absolute number of different cell types or region types is often irrelevant to the biological question being asked.
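
As a rough illustration of why the automatic option costs so much more, the sketch below reruns a clustering once per candidate number of clusters and keeps the candidate with the lowest Davies-Bouldin index. It is not CytoMAP's code: the tiny k-means, the farthest-point initialization, and the names `kmeans`, `davies_bouldin`, and `choose_k` are all assumptions made for this example.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means on tuples of floats; returns one label per point."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def davies_bouldin(points, labels):
    """Davies-Bouldin index; lower means tighter, better separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centroids = {
        lab: tuple(sum(col) / len(m) for col in zip(*m)) for lab, m in clusters.items()
    }
    scatter = {
        lab: sum(math.dist(p, centroids[lab]) for p in m) / len(m)
        for lab, m in clusters.items()
    }
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst_ratios) / len(worst_ratios)


def choose_k(points, candidates):
    """One full clustering run per candidate k, which is where the time goes."""
    scores = {k: davies_bouldin(points, kmeans(points, k)) for k in candidates}
    return min(scores, key=scores.get), scores


# Toy data: three tight 3x3 grids of points, far apart from each other.
offsets = (-0.5, 0.0, 0.5)
points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0, 0), (10, 0), (0, 10)]
    for dx in offsets
    for dy in offsets
]
best_k, scores = choose_k(points, range(2, 7))
```

With 24 candidates instead of the 5 used here, the loop in `choose_k` does roughly 24 times the work of a single run with a fixed number of clusters, which matches the slowdown described above.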

Thanks a lot Caleb for this comprehensive explanation!

On another note, there may be a small bug in the app: when I open Edit > Edit Neighborhoods, the window that appears is titled Edit Channels.

Thanks again, amazing tool

Nice catch, it looks like I didn’t change the window name for edit channels when making edit neighborhoods. I will fix this for the next compiled version. It looks like this interface is still functioning as I intended it to, despite the incorrect window name. If you select a sample to edit it should have channels like Neigh_Volume, NCells which are unique to neighborhoods. When you make a new channel you can then use it for clustering.

Hi Mike,
Thank you for putting it all together. Great explanations, I really enjoy the combination of QuPath and CytoMAP.
I’ve tested it on a subset of the data and it worked smoothly. With the full dataset (6 GB of images + 260,000 ROIs), I did not get the final message after the script completed, but refreshing the project did the job and I can view the neighbourhoods calculated in CytoMAP.

Hi again Mike,

I am trying to correct colors as in your example.
I added the extension and followed all the steps. I have the measurement maps open with a correct model, and I ran your script to get a correct colormap, but when I press the ‘Correct Colors’ button I get a long list of errors (please see the first few lines below). Am I missing a step? I thought it was related to the “Measurement map”, but I have it open as in your example.

Thank you,
Ewelina

13:13:34.959 [JavaFX Application Thread] [ERROR] qupath.lib.gui.dialogs.Dialogs - QuPath exception
java.lang.reflect.UndeclaredThrowableException: null
    at com.sun.proxy.$Proxy28.handle(Unknown Source)
    at com.sun.javafx.event.CompositeEventHandler.dispatchBubblingEvent(CompositeEventHandler.java:86)
    at com.sun.javafx.event.EventHandlerManager.dispatchBubblingEvent(EventHandlerManager.java:234)
    ……

Hi! Not sure off the top of my head, but can you confirm that you have added the colormap to the correct folder so that it is accessible to QuPath?

From the script:

//STEP 1. Acquire a .TSV file that represents a color map of distinct colors
//STEP 2. Make sure that the User  directory is set in Preferences/Extensions (gear icon, upper right)
//STEP 3. Place the .TSV file into a "colormap" folder within the directory
//STEP 4. Change the file path below to match the colormap you wish to use.

The TSV file used in the demo images was included in the sample project - I am not sure of an easier way to host it.

Hi,
Yes, I downloaded your demo folder and placed the .TSV file in the correct directory. When I run the script, a new window opens with the new colormap; I swapped colors in the TSV file and they are read correctly. Only when I press “Correct Colors” do I get an error. Please see below.

Hmm, I am a bit stumped here. I was able to adjust my TSV file to 9 colors for a 9 cluster dataset.

I did catch that there is an extra step if you make your own colormap though, I will need to fix that somehow.
The name of the colormap also has to be changed down near line 68, otherwise it still tries to use the default Rand_CL20. That still did not cause me to get the error you are seeing though.

Does the Rand_CL20 that came with the demo still work or does it give you the same error?

I am not sure it will help, but could you post the entire error? Also, maybe we can set up a screenshare sometime to take a look at your overall process.

Yes, I saw that line.
I kept the name as Rand_CL20. I tested it with both the modified file and the one from your demo folder and got the same error.
I will test it tomorrow on another computer where I built QuPath from source; maybe Java is the issue.
I will update you. And thank you, it would be great to check it together; that is very kind :smile:

Hi @Mike_Nelson,
Thanks for putting this together. This looks like a really interesting and powerful tool. I am looking forward to working more with it.
I have a question concerning annotations. I was able to import the detections and start playing with the different tools. Then, in QuPath, I made some annotations of the different structures visible in the tissue. I am able to import the centroids into CytoMAP (thanks to the .csv files) but not the polygons created in QuPath. So, if I want to calculate clusters, those will be around the centroid (the middle of a vein, for example) but not around the edge of the structure (the endothelium).
Do you have any idea how to export the full polygons/shapes from QuPath in order to import them into CytoMAP? The .csv file does not seem to be informative enough for that purpose.

Also, like @ewelina.bartoszek, I had some issues with the script that changes colors in QuPath only, but I’ll probably have enough to play with using CytoMAP.

Thanks a lot in advance.

Hi @Maxime_Jacquet,

All of the real credit goes to @cstoltzfus and @petebankhead, I am just happy to make use of such great software!

I think we have narrowed down the issue to the file path. The script, somewhat unfortunately, accesses the TSV file two different ways: through the direct path, and through the path built from the Preferences->User directory. If either one of those is not correct, the script can fail in a couple of different ways.

I am not sure what your goal is here. CytoMAP is intended to cluster objects with lists of measurements. The XY coordinates (generally of the centroid) are the positional measurement, and the rest are… other things. The centroids can then be used to find the same object back inside QuPath and apply the class/clusters there. Annotations are frequently a single object, so I am not sure what kind of results you would expect from CytoMAP by sending the, I assume, list of points that make up the annotation. CytoMAP would have no way to interpret things like whether the points were inside or outside (for holes) etc, though you may be able to take the points defining the annotation in QuPath and make a massive list of measurements within each annotation. Unfortunately, that list would be different per annotation (unless they were all squares and had 4 points, for example) and so would not be compatible with CytoMAP analysis.

Maybe @cstoltzfus is aware of some way of using the points list defining the boundary of an object, but I am not.
The closest I can think of to using the shape of an object is to Add Shape Measurements - things like the major/minor axis length, circularity, or any other single measurements you might calculate through scripting.

Another option, if you want to see how close annotations are to each other at the border, is to look at a recent script here:

That script generates measurements for the distance from the edge of one annotation to the nearest edge of each other class as a single measurement. Those single measurements could be used in CytoMAP.

Hi Mike,
I think here the issue is with Mac reading the tsv file incorrectly.
With ‘\t’ the file was read as empty. When we changed to ‘/n’ it could read the first line and showed an error that it cannot go to the next line.
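
One thing that may be worth ruling out in a case like this (an assumption, not a confirmed diagnosis of the problem above) is the line-ending style or delimiter of the file itself. The sketch below is a small standalone Python check, separate from the QuPath script; `inspect_delimited_text` is a hypothetical helper name, and nothing is assumed about what the colormap columns contain.

```python
def inspect_delimited_text(text):
    """Summarise line endings and tab-separated field counts of a text blob."""
    if "\r\n" in text:
        ending = "CRLF"  # Windows style
    elif "\r" in text:
        ending = "CR"    # bare carriage returns, as written by some older Mac tools
    elif "\n" in text:
        ending = "LF"    # Unix and current macOS style
    else:
        ending = "none"
    # str.splitlines() treats \n, \r\n and a bare \r as line breaks alike.
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    return {
        "line_ending": ending,
        "rows": len(rows),
        "fields_per_row": sorted({len(row) for row in rows}),
    }


# Same two rows of three values, once with bare CR endings and once with
# spaces where the tabs should be.
cr_report = inspect_delimited_text("1\t2\t3\r4\t5\t6\r")
space_report = inspect_delimited_text("1 2 3\n4 5 6\n")
```

To check a real file, read it with `open(path, newline="")` so Python does not translate the line endings before the check sees them. A report with `fields_per_row` equal to `[1]` suggests the file is not actually tab-delimited.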

Here, @Maxime_Jacquet would like to reproduce figure 6 from the CytoMAP paper (the data is in 2D, though). We are looking for a way to import user-defined surfaces. The first step would be to export the polygons (annotations) from QuPath, but as far as I could find this is only possible as a JSON file or through the ImageJ ROI manager. The next step would be to import those into CytoMAP. @cstoltzfus, would you be able to tell us how you imported those blood vessels in your paper?
The paper says: “To provide positional information on CD31+ blood vessels, we generated segmented surface objects on the CD31 channel and imported these objects’ data into CytoMAP.” Did you create those with Imaris?

Thank you!

Hmm, interesting. Since you were accessing the colors correctly (the list of colors in the dialog box, which are pulled in using the direct path), it seemed like the TSV file was being read correctly. If you are able to see the correct number of colors, and the correct colors in the list, that indicates no problems with the TSV. I do not have access to a Mac, though, so hopefully you can resolve the issue! And if you find any code changes that need to be made to make it more Mac-friendly, I can edit the script above.

I took a quick look at the paper, and as mentioned above, you can get the distances already applied to the cells from within QuPath (which I think is the end purpose of Fig. 6). You could also use tiling or SLICs inside of the blood vessel annotations to generate similar distances if you wanted to import such objects into CytoMAP. I suppose it comes down to what type of segmentation @cstoltzfus meant in “we generated segmented surface objects on the CD31 channel and imported these objects’ data into CytoMAP.”

Right now you can only import points into CytoMAP, not polygons etc. Clustering will treat each point as an individual “object”. However, you can get very creative with what points you import.

For the blood vessels in the CytoMAP paper I made surfaces around blood vessels in Imaris, then used the “split” function with a split seed diameter of 4um. This split the single polygon surface around the whole vessel into a bunch of ~4um surfaces. The white dots in Figure 7B are the centroids of these small surfaces. I then imported these into CytoMAP just like you would any other cell. Since my question in this figure was whether there are areas along a blood vessel that are associated with specific myeloid cell types, breaking up the blood vessel into small chunks was a way to answer this.

You could in principle recreate a whole blood vessel surface in CytoMAP using user defined surfaces. I don’t have a way to load surfaces directly. You would have to export the polygon vertices, load those into CytoMAP as a .csv, then build a user defined surface in CytoMAP using the Define Surface function. This would let you calculate distance to the surface, although I imagine this would yield similar results to calculating distance to a surface that was broken into small segments like I did in the paper. You wouldn’t be able to cluster this, since I don’t have any way to cluster surface objects at the moment. You could also use this surface to gate on cells inside blood vessels, which might be useful. I have done this before to see if cells are intravascular, although you could probably do that directly in Imaris or QuPath.

If you can export the polygon vertices as a .csv, you could build neighborhoods around the vertices (similar to what I did in the paper), cluster those neighborhoods, and use that to define vessel types. You would have to be sure there were enough vertices along your vessel to sample the right length scale. For example, if your region along the vessel is in chunks of around 50um in length, I would want the maximum distance between polygon vertices to be at most 10um. Personally I like to over-sample, so I would probably shoot for a max distance between vertices of below 5um just to be safe.
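
If the exported polygon has long straight edges, the vertex spacing suggested above can be enforced by inserting extra points along each edge before import. The following is a minimal Python sketch of that idea, assuming the vertices are already available as (x, y) pairs in microns and that the polygon is a closed ring; `densify_polygon` is a hypothetical helper, not part of QuPath or CytoMAP.

```python
import math


def densify_polygon(vertices, max_spacing):
    """Return a vertex list in which consecutive points, including the closing
    edge back to the first vertex, are at most max_spacing apart."""
    if max_spacing <= 0:
        raise ValueError("max_spacing must be positive")
    dense = []
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]  # wrap to close the ring
        edge_length = math.dist((x0, y0), (x1, y1))
        # Number of equal pieces this edge must be cut into.
        steps = max(1, math.ceil(edge_length / max_spacing))
        for s in range(steps):
            t = s / steps
            dense.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return dense


# A 20 x 20 um square sampled so that no gap exceeds 5 um.
square = [(0.0, 0.0), (20.0, 0.0), (20.0, 20.0), (0.0, 20.0)]
dense_square = densify_polygon(square, max_spacing=5.0)
```

The resulting points could then be written to a .csv with one row per point and loaded like any other set of objects; the column names CytoMAP expects are not shown here and would need to match your other imports.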

Yep, I suspect it would be easiest to use the built in “Distance to annotations” function in QuPath to directly add that measurement to each cell, prior to importing into CytoMAP.
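
For readers unfamiliar with this kind of measurement, the sketch below shows the underlying geometry in plain Python: the distance from a cell centroid to the nearest edge of a polygon. It is only an illustration under stated assumptions, not QuPath's implementation; the function names are hypothetical, and unlike a region-aware measurement it returns the distance to the boundary even for points that lie inside the polygon.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.dist(p, a)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def distance_to_polygon_edge(p, vertices):
    """Distance from p to the nearest edge of a closed polygon."""
    n = len(vertices)
    return min(
        point_segment_distance(p, vertices[i], vertices[(i + 1) % n])
        for i in range(n)
    )


vessel = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
below = distance_to_polygon_edge((5.0, -3.0), vessel)    # nearest to the bottom edge
corner = distance_to_polygon_edge((13.0, 14.0), vessel)  # nearest to the corner (10, 10)
```

A single number per cell like this is the kind of measurement that fits CytoMAP's table-of-measurements input, which is why adding it in QuPath before export is the simpler route.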

I have updated the Colormap script above to “V2”, which now has only one input: the file name of the TSV you want to use. If the file is not in the correct location, either because the name or the path is wrong, it should throw an error.
It should also print out the location it tried to open and failed, to help with troubleshooting.

In case anyone here is still using the old script, I also wanted to mention in the split thread that the import code was updated to make it dramatically faster for large numbers of cells!
