How to get the location of each RNA spot?

Hi, I am using PixelSpotDecode to detect the spots. My collaborator told me if Tert gene lie on nucleus then my pipeline may be correct. My pipeline has many gene and out of them Tert gene count 8 times. I am wondering how to show the all the spots of Tert gene on the nuclei mask to verify. I am trying with following snippet of code.

def make_expression_matrix(masks, decoded):
    al = AssignTargets.Label()
    labeled = al.run(masks, decoded[decoded.target != 'nan'])
    labeled_filtered = labeled[labeled.cell_id != 'nan']
    cg=labeled_filtered.to_expression_matrix()
    return labeled_filtered,cg

labeled_filtered,mat = make_expression_matrix(masks, decoded)

genes=mat.coords['genes'].values
X=mat.coords['x'].values
Y=mat.coords['y'].values
for i in range(len(genes)):
    print(X[i],Y[i],genes[i])

mat.dims
mat.values.shape (7, 28, 28)
display(stack=filtered_imgs, spots=labeled_filtered, masks=masks,viewer=viewer)

Hi Ankit,

What you’re doing makes sense. Do you get an error message?

Is it possible you didn’t instantiate a napari viewer before using it in display()? If that’s the case either drop the viewer parameter because it’s not needed or first create a viewer with viewer = napari.Viewer().

Hi Matt,

I don’t get error message. I am seeing the spots in napari. But I am seeing all the genes. I want to see only specific genes. How should I select some specific gene?
Thanks.

Ah, I think what you’re looking for is labeled_filtered_tert = labeled_filtered[labeled_filtered.target == 'Tert'], which will remove all spots that are not decoded to “Tert”.

I’m not sure if this applies to your data but you may also be interested in using project_axes.

Many thank Matt. This is what I was looking for. Also many thanks for introuducing to me project_axes. It is quite helpful in visualization.

Hi Matt, I got another problem and am not sure whether to put in new thread or not. The pixel based spot finding scheme working fine but when I am trying with Blob based spot the segmentation part showing the error.

Traceback (most recent call last):
markers_and_stain = Merge.SimpleMerge().run([thresholded_stain, watershed_markers])
File “/Users/agrawal/miniconda/envs/starfish/lib/python3.7/site-packages/starfish/core/pipeline/algorithmbase.py”, line 23, in helper
result = func(*args, **kwargs)
File “/Users/agrawal/miniconda/envs/starfish/lib/python3.7/site-packages/starfish/core/morphology/Merge/simple.py”, line 43, in run
raise ValueError(“not all masks have the same physical ticks”)
ValueError: not all masks have the same physical ticks

I saw another thread related to similar question but when I check my nuclei and primary fov json file both looks similar.

Any Idea why I am getting that error here.

Hi Ankit,

It’s interesting that this error only occurs when doing blob based spot finding since the error doesn’t have anything to do with SpotFindingResults or DecodedIntensityTable.

That error message means your physical ticks from thresholded_stain doesn’t match watershed_markers. My guess is this is due to your primary images being 3D (13 z-sections) and your nuclei being 2D (only 1 z-section). Therefore the Coordinates.Z don’t match. You can verify this by doing stack.xarray.coords and examining the ‘zc’ coordinate of both.

One possible solution (that I have not tested) would be to do a maximum projection on your primary image stack with stack.reduce({'z'}, func="max"} before segmentation. And then see if the zc coordinate matches your nuclei stack’s zc. They should both be a single float, and if they don’t match you can alter the json file of nuclei until they do.

Hi Matt,

Many thanks for the nice suggestion. Right now,I am doing the Z projection
of nuclei and 6 RNA channels in Fiji and then giving them
as 2d images in starfish and then I don’t see any error. I am bit confused
about which pipeline I should finalize for this seqFISH data. In this dataset
3 combination of channel decode for some gene and I have total 15 genes in codebook.

When I used BlobDetector, PerRoundMaxChannel, and Sequential build strategy
I found four gene spot Kitl, Selplg, Sgk3 and Tert.
When I used BlobDetector, MetricDistance, and Sequential I found again four gene
but different than previous one Alcam, Baiap2, Ccnb1, col1a1.

When I used BlobDetector, MetricDistance(codebook=codebook,max_distance=0.2,min_intensity=0.1,
trace_building_strategy=TraceBuildingStrategies.ExactMatch) got following error.

~/miniconda/envs/starfish/lib/python3.7/site-packages/xarray/core/dataarray.py in setitem(self, key, value)
716 for k, v in self._item_key_to_dict(key).items()
717 }
→ 718 self.variable[key] = value
719
720 def delitem(self, key: Any) → None:

~/miniconda/envs/starfish/lib/python3.7/site-packages/xarray/core/variable.py in setitem(self, key, value)
845
846 indexable = as_indexable(self._data)
→ 847 indexable[index_tuple] = value
848
849 @property

~/miniconda/envs/starfish/lib/python3.7/site-packages/xarray/core/indexing.py in setitem(self, key, value)
1305 array, key = self._indexing_array_and_key(key)
1306 try:
→ 1307 array[key] = value
1308 except ValueError:
1309 # More informative exception if read-only view

ValueError: could not broadcast input array from shape (23) into shape (22)

When I used BlobDetector, MetricDistance(codebook=codebook,max_distance=0.2,min_intensity=0.1,
trace_building_strategy=TraceBuildingStrategies.NEAREST_NEIGHBOR) got following error.

~/miniconda/envs/starfish/lib/python3.7/site-packages/starfish/core/spots/DecodeSpots/metric_decoder.py in run(self, spots, *args)
87 intensities = self.trace_builder(spot_results=spots,
88 anchor_round=self.anchor_round,
—> 89 search_radius=self.search_radius)
90 transfer_physical_coords_to_intensity_table(intensity_table=intensities, spots=spots)
91 return self.codebook.decode_metric(

~/miniconda/envs/starfish/lib/python3.7/site-packages/starfish/core/spots/DecodeSpots/trace_builders.py in build_traces_nearest_neighbors(spot_results, anchor_round, search_radius)
97 distances, indices = _match_spots(
98 per_round_spot_results,
—> 99 anchor_round=anchor_round
100 )
101 intensity_table = _build_intensity_table(

~/miniconda/envs/starfish/lib/python3.7/site-packages/starfish/core/spots/DecodeSpots/util.py in _match_spots(round_dataframes, anchor_round)
29 no spot is detected within search radius
30 “”"
—> 31 reference_df = round_dataframes[anchor_round]
32 reference_coordinates = reference_df[[Axes.ZPLANE, Axes.Y, Axes.X]]
33

KeyError: 1

I am unable to understand the error and confused with which pipeline is best for this dataset. If you have any suggestion then please let me know. It would be very useful to understand in detail.
Thanks a lot.

Hi Ankit,

Definitely don’t use the TraceBuildingStrategy.SEQUENTIAL. That is meant for sequential smFISH methods (where every gene is only detected in exactly one cycle of imaging). It does not try to piece multiple cycles of information together into a barcode. Between EXACT_MATCH and NEAREST_NEIGHBOR I would try to use EXACT_MATCH when possible (with good image registration) because NEAREST_NEIGHBOR's performance varies depending on density of spots.

If your seqFISH barcode is the kind where every gene is detected in every cycle (like sequencing) then you can use either PerRoundMaxChannel or MetricDistance. PerRoundMaxChannel will choose the brightest channel from every cycle to decode the spot to a barcode, so it only works if the spot has signal in every cycle.

If your barcode is like seqFISH+ or MERFISH, then you must use MetricDistance. It creates a vector of intensities from your images and then finds the nearest barcode.

Is this the full error message? Can you share it so I can see where it starts in starfish?

Hi Matt, Many many thanks for explaning very well. I think my colloborator gave me seqfish+ data but we both are confused about the exact difference between these two protocol. We need to read these two in details to know the exact clear difference but most probably it is seqfish+ only.

when I used following strategy.

decoder=DecodeSpots.PerRoundMaxChannel(codebook=codebook,anchor_round=0,search_radius=100,trace_building_strategy=TraceBuildingStrategies.NEAREST_NEIGHBOR)

I get counts in only four genes out of 20 genes.

decoder = DecodeSpots.MetricDistance(codebook=codebook,max_distance=0.2,min_intensity=0.1,
trace_building_strategy=TraceBuildingStrategies.NEAREST_NEIGHBOR)

For above strategy I do not get any count on any gene.

decoder =DecodeSpots.MetricDistance(codebook=codebook,max_distance=0.2, min_intensity=0.1, metric=‘euclidean’,norm_order=2,trace_building_strategy=TraceBuildingStrategies.EXACT_MATCH)

Basically for exact match strategy I get error. Is it possible that my registration is not perfect and due to that it showed. Also not sure why no single gene get detected via Metric distance and nearest neighbor method.

Many Thanks.


ValueError Traceback (most recent call last)
in
----> 1 decoded = decode_spots(e.codebook, spots)
2 labeled_filtered,mat = make_expression_matrix(masks, decoded)
3 genes, counts = np.unique(decoded.loc[decoded[Features.PASSES_THRESHOLDS]][Features.TARGET], return_counts=True)
4 print(‘genes and counts’, len(genes),len(counts))
5 for i in range(len(genes)):

in decode_spots(codebook, spots)
17 #decoded_filtered = decoded[decoded.target != ‘nan’]
18 #decoder = DecodeSpots.SimpleLookupDecoder(codebook=codebook)
—> 19 return decoder.run(spots=spots)
20
21 def make_expression_matrix(masks, decoded):

~/miniconda/envs/starfish/lib/python3.7/site-packages/starfish/core/pipeline/algorithmbase.py in helper(*args, **kwargs)
21 @functools.wraps(func)
22 def helper(*args, **kwargs):
—> 23 result = func(*args, **kwargs)
24 if result is not None:
25 method_class_str = str(args[0].class)

~/miniconda/envs/starfish/lib/python3.7/site-packages/starfish/core/spots/DecodeSpots/metric_decoder.py in run(self, spots, *args)
87 intensities = self.trace_builder(spot_results=spots,
88 anchor_round=self.anchor_round,
—> 89 search_radius=self.search_radius)
90 transfer_physical_coords_to_intensity_table(intensity_table=intensities, spots=spots)
91 return self.codebook.decode_metric(

~/miniconda/envs/starfish/lib/python3.7/site-packages/starfish/core/spots/DecodeSpots/trace_builders.py in build_spot_traces_exact_match(spot_results, **kwargs)
35 # if no exact match set value to 0
36 value = 0 if value.empty else value
—> 37 intensity_table.loc[dict(c=c, r=r)] = value
38 return intensity_table
39

~/miniconda/envs/starfish/lib/python3.7/site-packages/xarray/core/dataarray.py in setitem(self, key, value)
206
207 pos_indexers, _ = remap_label_indexers(self.data_array, key)
→ 208 self.data_array[pos_indexers] = value
209
210

~/miniconda/envs/starfish/lib/python3.7/site-packages/xarray/core/dataarray.py in setitem(self, key, value)
716 for k, v in self._item_key_to_dict(key).items()
717 }
→ 718 self.variable[key] = value
719
720 def delitem(self, key: Any) → None:

~/miniconda/envs/starfish/lib/python3.7/site-packages/xarray/core/variable.py in setitem(self, key, value)
845
846 indexable = as_indexable(self._data)
→ 847 indexable[index_tuple] = value
848
849 @property

~/miniconda/envs/starfish/lib/python3.7/site-packages/xarray/core/indexing.py in setitem(self, key, value)
1305 array, key = self._indexing_array_and_key(key)
1306 try:
→ 1307 array[key] = value
1308 except ValueError:
1309 # More informative exception if read-only view

ValueError: could not broadcast input array from shape (18) into shape (16)

I would look at the barcode for those four genes and see if there’s a pattern. Is it possible that those four genes have a barcode where every round has an ‘ON’ channel and the other 16 genes do not? Do they contain only one channel (ie. a barcode of all Cy3)?

This may be because the max_distance or min_intensity thresholds are filtering out any detected spots (see 4th paragraph of MetricDistance tutorial). To test this theory you can relax the thresholds and see if that reveals some genes.

When you used blob detector did you give it a reference image? When you pass a reference image to a FindSpots algorithm it will find the location of spots in that reference and then measure intensities of that location in your other rounds of images. That way every round has the same spots.

If you don’t pass it a reference, then it finds spots for every round so each round will have a different number of spots. EXACT_MATCH needs spots to have info from every round so if you forget the reference it won’t be able to build a trace.

Sometimes the reference can be another round of images that labeled every RNA, but if you don’t have that you can create an image that has every RNA by doing a max projection of the appropriate images. Don’t forget to register before projecting if you know images might be shifted.

Hi Matt, Many thanks for detailed explanation. I uploaded the data and pipeline in this link.
github link
The main code is final_blob_based_find.py and terminal output is shown in TerminalOutput.txt file.
One thing I did not understand how to make the reference image of dots in my case (2 rounds and 3 channel)?
I am registering based on nuclei markers because I don’t see any markers in the spot.
The current method finds only 8 genes [Col1a1(8),Dlk1(7), Kitl(2), Ly6a(9), Mki67(6),Pdgfra(5),Sgk3(9)
Tert(12)]. I tried it for 3 different FOV and all it give the same genes, same error but different counts. The current error is appearing on make_expression_matrix.
I have total 20 genes but there was some error during the design of probes in the experiment so 5 genes got same readout probes. You can ignore Alb, Arg, Bglap, Ly6a and S100a6. Basically remaining 15 genes should show in the starfish at least in some FOV. Many thanks for your help.

With add to previous comment. I managed to make reference dots.

dots1=clipped_both_scaled.reduce({Axes.CH}, func=“max”)
dots=dots1.reduce({Axes.ROUND}, func=“max”)

Now Metric distance and exact match gives me following [‘Alb’, ‘Arg1’, ‘Ccnb1’, ‘Col1a1’, ‘Kitl’, ‘Ly6a’, ‘Mcam’, ‘Tert’] genes and PerRoundMaxChannel and nearest neighbors gives me following [Col1a1,Dlk1, Kitl, Ly6a, Mki67,Pdgfra,Sgk, Tert] gens. I am confused why these two strategies gives me so different genes.

Hi Ankit, nice that you figured out the reference dots. By the way, you should be able to do it in one command with .reduce({Axes.CH, Axes.ROUND}, func="max").

I would caution against using nearest neighbors if you can avoid it. I think it can be finicky and lead to spurious results compared to the straight forward exact match approach. Regardless of method, you should validate your results with display(), which will visualize your images and spots in the napari viewer.

For example, if you’re running in a jupyter notebook, you can view the Col1a1 spots by doing something like:

%gui qt
from starfish import display

col1a1_spots = decoded[decoded[Features.TARGET] == 'Col1a1']

display(stack=registered_imgs, spots=col1a1_spots)

Then you can use napari to follow the spots through the various rounds and channels to see if their signal matches the expected barcode. It might reveal which approach is more accurate for your images.