A MATLAB importer for Mastodon files

Hi all,

We just released a first version of a MATLAB importer that can import Mastodon project files.

You can find it here for now: https://github.com/tinevez/matlab-mastodon-importer

Please note that Mastodon is still not released, not published, not supported, probably full of bugs. This sub-project is part of the current development of Mastodon.

We took note of pain points raised by several courageous people, who reported that importing large MaMuT or TrackMate XML files in MATLAB was very slow. So this importer directly reads the binary file format of Mastodon, using the low-level API of MATLAB. It should be much faster and more convenient, though I have not tested it on very large datasets yet.

I copy/paste below the README.md of the project, so that you can see if it can be useful to you.

On a personal note it was fun to make. It felt like reverse-engineering.

A MATLAB importer for Mastodon files.

This repository contains several MATLAB functions used to import Mastodon files (https://github.com/fiji/TrackMate3). The import procedure directly deserialises the binary files using the MATLAB low-level API, and therefore has no dependencies.

Running a quick demo.

Run the demo script demo_import_and_plot.m in the demo/ folder.

Installation.

Simply add all the files of the src/ folder to your MATLAB path.
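
For instance, assuming the repository was cloned into the current folder (the location below is hypothetical, adjust it to your setup), the src/ folder could be added to the path along these lines:

```matlab
% Hypothetical location of the cloned repository; adjust to your setup.
repo_dir = fullfile( pwd, 'matlab-mastodon-importer' );
src_dir = fullfile( repo_dir, 'src' );

if isfolder( src_dir )
    % Make the importer functions visible for this MATLAB session.
    addpath( src_dir );
else
    warning( 'Folder not found: %s', src_dir );
end
```

Call savepath afterwards if you want the change to persist across sessions.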

Usage.

The main function is import_mastodon( path/to/your/mastodon/file.mastodon ).

[ G, metadata, tss ] = import_mastodon( source_file );

The data graph.

The data is returned as a MATLAB directed graph already:

>> G

G = 

  digraph with properties:

    Edges: [2981×6 table]
    Nodes: [3087×22 table]

The spots are listed in the Nodes table of the graph. The links are listed in the Edges table.

Everything is imported: the model, the numerical features and the tags:

>> head(G.Nodes)

ans =

  8×22 table

    id      x         y         z       t     c_11       c_12      c_13     c_22     c_23
    __    ______    ______    ______    _    ______    ________    ____    ______    ____

    0     61.362    76.349    55.142    0    22.502     0.91281     0      21.628     0  
    1     113.15    3.2331     67.34    0    35.087      6.6417     0      20.584     0  
    2     35.741    20.075     57.84    0      24.6     -6.5254     0      28.333     0  
    3     79.251    35.248    56.869    0    19.096     0.19479     0      34.124     0  
    4     143.31    95.934    77.957    0    31.404     -1.9327     0      18.281     0  
    5     114.93    97.665    65.717    0    25.358    -0.42869     0      18.027     0  
    6     129.54    98.146    70.754    0    23.942    -0.66575     0      18.076     0  
    7      36.41    50.149    57.431    0    26.004      7.9379     0      25.876     0  
...
 
  c_33      bsrs     label      Fruits          Names       SpotNLinks    SpotTrackID 
 ______    ______    _____    ___________    ___________    __________    ___________ 

 181.48    725.93     '0'     Apple          Mike               1               0     
 82.768    331.07     '1'     Apple          <undefined>        1             105     
 146.87    587.46     ''      Banana         Robert             1             104     
  170.8    683.21     ''      Kiwi           Myriam             1             103     
 122.11    488.43     ''      Kiwi           Assaf              1             102     
 115.41    461.63     ''      <undefined>    <undefined>        1             101     
 86.567    346.27     ''      <undefined>    <undefined>        1             100     
    158    631.99     ''      <undefined>    <undefined>        1              99     

...

The id column corresponds to the Mastodon internal object id. However, the link table follows MATLAB convention. It has a variable called EndNodes made of two columns containing the source and target spots of each link. But the EndNodes values refer to row indices in the spots table, not to the spot ids.
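
As a minimal sketch of how to go from EndNodes row indices back to Mastodon ids, here is a toy graph standing in for the imported G (the ids are made up for illustration):

```matlab
% Toy graph with 3 spots and 2 links; ids are made-up values.
G = digraph( [ 1 2 ], [ 2 3 ] );
G.Nodes.id = [ 10; 11; 12 ];

% EndNodes holds row indices into G.Nodes, so we use them to look up
% the Mastodon id of the source and target spot of each link.
source_ids = G.Nodes.id( G.Edges.EndNodes( :, 1 ) );
target_ids = G.Nodes.id( G.Edges.EndNodes( :, 2 ) );
```

The same indexing works for any other column of G.Nodes, for instance to fetch the x position of the source spot of each link.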

>> head(G.Edges)

ans =

  8×6 table

    EndNodes    id      Fruits          Names       LinkDisplacement    LinkVelocity
    ________    __    ___________    ___________    ________________    ____________

    1     95    0     <undefined>    Chris               1.3555            1.3555   
    2     96    1     Apple          Roy                0.41863           0.41863   
    3     97    2     Banana         <undefined>        0.65284           0.65284   
    4     98    3     Kiwi           <undefined>        0.95254           0.95254   
    5     99    4     <undefined>    <undefined>        0.52254           0.52254   
    6    100    5     <undefined>    <undefined>        0.77146           0.77146   
    7    101    6     <undefined>    <undefined>        0.83214           0.83214   
    8    102    7     <undefined>    Joe                0.71399           0.71399   

The tables also store the physical units of their variables:

>> head(G.Edges, 1)

ans =

  1×6 table

    EndNodes    id      Fruits       Names    LinkDisplacement    LinkVelocity
    ________    __    ___________    _____    ________________    ____________

    1    95     0     <undefined>    Chris         1.3555            1.3555   

>> G.Edges.Properties.VariableUnits

ans =

  1×6 cell array

    {0×0 char}    {0×0 char}    {0×0 char}    {0×0 char}    {'um'}    {'um/frame'}

And their description when available:

>> G.Edges.Properties.VariableDescriptions'

ans =

  6×1 cell array

    {0×0 char}
    {0×0 char}
    {0×0 char}
    {0×0 char}
    {'Computes the link displacement in physical units as the distance between the source spot and the target spot.'                                              }
    {'Computes the link velocity as the distance between the source and target spots divided by their frame difference. Units are in physical distance per frame.'}
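
Since the column order may vary with the features present in a file, it can be safer to look up a unit by variable name rather than by position. A small sketch on a toy table (the values are made up; the variable names follow the Edges table above):

```matlab
% Toy table mimicking two columns of G.Edges.
T = table( [ 0; 1 ], [ 1.3555; 0.41863 ], ...
    'VariableNames', { 'id', 'LinkVelocity' } );
T.Properties.VariableUnits = { '', 'um/frame' };

% Find the column by name, then read its unit.
idx = strcmp( T.Properties.VariableNames, 'LinkVelocity' );
unit = T.Properties.VariableUnits{ idx };
```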

The ellipsoid and the covariance matrix.

The spot ellipsoid shape is represented through a covariance matrix. The covariance matrix itself is stored in the variables c_11, c_12, c_13, c_22, c_23, c_33, so that this symmetric real 3x3 matrix can be expressed as:

C = [ c_11, c_12, c_13
      c_12, c_22, c_23
      c_13, c_23, c_33 ];

The bsrs variable contains the bounding-sphere radius squared. It is the radius of the smallest sphere that includes the spot ellipsoid fully, squared.
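
To make the link between the two concrete, here is a small sketch using the values of the first spot in the G.Nodes excerpt above. In that row bsrs is about four times the largest eigenvalue of C; whether this factor holds for every spot is an assumption I have not checked against the Mastodon source.

```matlab
% Values copied from the first row of the G.Nodes excerpt above.
C = [ 22.502    0.91281   0
      0.91281   21.628    0
      0         0         181.48 ];
bsrs = 725.93;

% Eigen-decomposition of the symmetric covariance matrix: the columns
% of V are the principal directions, the eigenvalues the variances.
[ V, D ] = eig( C );
lambda = diag( D );

% Radius of the bounding sphere, in physical units.
bounding_radius = sqrt( bsrs );

% Ratio between bsrs and the largest eigenvalue (close to 4 here).
ratio = bsrs / max( lambda );
```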

In the demo_import_and_plot.m file there is a function that can plot the spot ellipsoid. You would use it for instance like this:

spots = G.Nodes;
i = 1;
spot = spots( i, : );

M = [ spot.x; spot.y; spot.z ];
C = [
  spot.c_11, spot.c_12, spot.c_13
  spot.c_12, spot.c_22, spot.c_23
  spot.c_13, spot.c_23, spot.c_33
];

h(i) = plot_ellipsoid( M, C );
set( h(i), ...
  'EdgeColor', 'None', ...
  'FaceColor', 'b', ...
  'FaceLighting', 'Flat' );
light()
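
If you are curious how such a surface can be built from M and C, here is one possible sketch. It is not the code of plot_ellipsoid; it assumes the ellipsoid is the set of points p such that (p-M)' * inv(C) * (p-M) = 1, which may differ from the scaling used in the demo.

```matlab
% Center and covariance of the first spot in the excerpt above.
M = [ 61.362; 76.349; 55.142 ];
C = [ 22.502    0.91281   0
      0.91281   21.628    0
      0         0         181.48 ];

% Principal directions and radii (assumption: radius = sqrt(eigenvalue)).
[ V, D ] = eig( C );
radii = sqrt( diag( D ) );

% Sample the unit sphere, then stretch and rotate it into the ellipsoid.
[ xs, ys, zs ] = sphere( 20 );
unit_pts = [ xs( : ).'; ys( : ).'; zs( : ).' ];
pts = V * diag( radii ) * unit_pts;

% Translate to the spot center and reshape for surf().
X = reshape( pts( 1, : ), size( xs ) ) + M( 1 );
Y = reshape( pts( 2, : ), size( xs ) ) + M( 2 );
Z = reshape( pts( 3, : ), size( xs ) ) + M( 3 );

% To display it:
% surf( X, Y, Z, 'EdgeColor', 'none', 'FaceColor', 'b' ); axis equal
```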

The metadata.

We also retrieve the metadata, made mainly of the physical units, and the absolute path to the XML/H5 BDV file:

>> metadata

metadata = 

  struct with fields:

                     version: '0.3'
         spim_data_file_path: '/Users/tinevez/Development/Mastodon/TrackMate3/samples/mamutproject/datasethdf5.xml'
    spim_data_file_path_type: 'absolute'
                 space_units: 'um'
                  time_units: 'frame'

The tag-set structure.

The last variable returned is the tag-set structure. For each tag-set, it contains its id, its name and the list of its tags.

>> tss

tss = 

  2×1 struct array with fields:

    id
    name
    tags

>> tss(1)

ans = 

  struct with fields:

      id: 0
    name: 'Fruits'
    tags: [3×1 struct]

The tags themselves are a struct with an id, a label and a color encoded as an integer.

>> tss(1).tags(1)

ans = 

  struct with fields:

    label: 'Apple'
       id: 0
    color: -52480
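
As a small sketch of how this structure can be queried, the snippet below rebuilds a toy tss by hand (the ids and the colors other than the one for Apple are made up) and looks up a tag-set and a tag by name:

```matlab
% Toy tag-set structure mimicking the layout shown above.
tags = struct( ...
    'label', { 'Apple'; 'Banana'; 'Kiwi' }, ...
    'id',    { 0; 1; 2 }, ...
    'color', { -52480; -1; -16777216 } );
tss = struct( 'id', 0, 'name', 'Fruits', 'tags', tags );
tss( 2, 1 ) = struct( 'id', 1, 'name', 'Names', 'tags', tags( 1 ) );

% Find the tag-set called 'Fruits' and list the labels of its tags.
ts = tss( strcmp( { tss.name }, 'Fruits' ) );
labels = { ts.tags.label };

% Retrieve the id of the tag labelled 'Kiwi'.
kiwi_id = ts.tags( strcmp( labels, 'Kiwi' ) ).id;
```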

In the demo_import_and_plot.m file there is a function to convert the integer color into an RGB triplet. I might as well give it here:

function rgb = to_hex_color( val )
    hex_code = dec2hex( typecast( int32( val ), 'uint32'  ) );   
    rgb = reshape( sscanf( hex_code( 1 : 6 ).', '%2x' ), 3, []).' / 255;    
end

>> val = tss(1).tags(1).color;
>> rgb = to_hex_color( val );
>> rgb

rgb =

    1.0000    1.0000    0.2000
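
A possible alternative, offered as a sketch only: if the integer follows the Java packed ARGB convention (an assumption, the README does not state the byte order), the channels could be unpacked with bit operations. Note that under this assumption the first byte is the alpha channel, so the result differs from the one printed above.

```matlab
val = -52480;

% Reinterpret the signed 32-bit integer as unsigned: 0xFFFF3300.
u = typecast( int32( val ), 'uint32' );

% Assumed layout: bits 24-31 alpha, 16-23 red, 8-15 green, 0-7 blue.
r = bitand( bitshift( u, -16 ), uint32( 255 ) );
g = bitand( bitshift( u, -8 ), uint32( 255 ) );
b = bitand( u, uint32( 255 ) );

% Normalise to the [0, 1] range expected by MATLAB graphics.
rgb_argb = double( [ r, g, b ] ) / 255;
```

Unlike the dec2hex approach, this does not depend on the length of the hexadecimal string, so it also works when the leading byte is zero.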

Performance.

On my Mac Pro, importing a dataset made of about 30k objects (spots and links) took less than 1 s.
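
If you want to time the import on your own data, a simple sketch with tic/toc could look like this. The file path is a placeholder, and the guard only skips the import when the importer or the file cannot be found:

```matlab
% Placeholder path; replace with your own Mastodon file.
source_file = 'path/to/your/mastodon/file.mastodon';

elapsed = NaN;
if exist( 'import_mastodon', 'file' ) == 2 && exist( source_file, 'file' ) == 2
    tic;
    [ G, metadata, tss ] = import_mastodon( source_file );
    elapsed = toc;
    fprintf( 'Imported %d spots and %d links in %.2f s.\n', ...
        numnodes( G ), numedges( G ), elapsed );
else
    fprintf( 'Importer or file not found, nothing was imported.\n' );
end
```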

Limitation.

The importer strongly depends on how the Mastodon file format is written. Any changes made to the serialisation procedure in the Java Mastodon project will likely break the importer. Right now the importer echoes a warning if the Mastodon file version is not 0.3.
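
On the user side, the same kind of check can be done on the returned metadata. This is only a sketch of the idea, not the code of the importer; the metadata struct below is built by hand with the version value shown earlier:

```matlab
% Hand-made metadata struct, with the field shown in the section above.
metadata = struct( 'version', '0.3' );

% Version the importer is known to handle, according to this README.
supported_version = '0.3';

is_supported = strcmp( metadata.version, supported_version );
if ~is_supported
    warning( 'Untested Mastodon file version: %s', metadata.version );
end
```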

Examples.

The screenshots below mostly exemplify what can be done from the imported data structure, with the MATLAB visualisation tools.

Import the full track graph and the spot ellipsoids.

Import the tags and use them to color spots and links.

The MaMuT dataset imported as ellipsoids.

These are the results of the detection of cells using the TGMM framework of Amat et al., 2014.

Colouring individual tracks.

A smaller dataset.

Inspecting the data numerical features.

figure;
s = scatter( G.Nodes.SpotGaussianFilteredIntensityMeanCh1, G.Nodes.z, 75, sqrt(G.Nodes.bsrs), 'filled' );
s.MarkerEdgeColor = [ 0.3 0.3 0.3 ];

xlabel( 'Mean intensity' );
ylabel( sprintf( 'Z position (%s)', G.Nodes.Properties.VariableUnits{4} ) )
colormap jet
c = colorbar;
c.Label.String = sprintf( 'Approx size (%s)', G.Nodes.Properties.VariableUnits{4} );
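
The plot above assumes that the feature column SpotGaussianFilteredIntensityMeanCh1 exists in G.Nodes. A defensive check before plotting could be sketched as follows, here on a toy table with made-up values:

```matlab
% Toy node table that lacks the intensity feature.
nodes = table( [ 0; 1 ], [ 55.1; 67.3 ], 'VariableNames', { 'id', 'z' } );

% Columns the plot relies on.
wanted = { 'z', 'SpotGaussianFilteredIntensityMeanCh1' };

% Report the ones that are not in the table.
present = ismember( wanted, nodes.Properties.VariableNames );
missing = wanted( ~present );
if ~isempty( missing )
    fprintf( 'Missing feature column: %s\n', missing{ : } );
end
```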


Thank you very much for developing Mastodon, a very nice Plugin.
It has been a great help for my research.

I used Mastodon plugin in Fiji to analyze the 3D tracking data of the cells.
To handle this analysis data in MATLAB, I used matlab-mastodon-importer to import the .mastodon file. When I looked at the Nodes information using the “>>head(G.Nodes)” command, it only contained basic information such as the cell id and the X, Y, Z coordinates (id, x, y, z, t, c_11 … c_33, bsrs and label). The information I wanted, such as Spot radius, Spot gaussian-filtered intensity and Track N spots, was not included. Similarly, when I looked at the Edges information using the “>>head(G.Edges)” command, there was only the EndNodes and id information, not the Link displacement and Link velocity information that I wanted. I did calculate these values with the feature computation of the Mastodon plugin in Fiji and saved them in the Mastodon file.

How can I get these values? Please let me know.

By the way, when I looked at the Nodes and Edges information in the mamutproject.mastodon file in the demo folder of the matlab-mastodon-importer, it also did not contain the information that appears in the “README.md” such as SpotNLinks, SpotTrackID, LinkDisplacement and LinkVelocity.

Hello @SakuLab

I suspect you did not compute the features in Mastodon before saving the mastodon file?

Just to confirm, can you share the mastodon file?


Thanks for the reply, Tinevez.

I want to upload my .mastodon file, but the file size seems to be too big (>100MB). Even after compressing it, I get an error and can’t upload it. The image is 150 frames long, which is probably why the file is so large. To reduce the file size, I’m going to reduce the number of frames in the image and try again. I’ll let you know as soon as I can upload it.

I haven’t uploaded the .mastodon file yet, but I have verified the possibility you pointed out that I did not save the calculation result.

First, in Fiji’s Mastodon plugin, I pressed the “compute features” button and calculated statistics such as “spot radius” and “Track N spots”.

After that calculation, I pressed the “table” button and confirmed that statistics such as “spot radius” and “Track N spots” were actually calculated (Fig. 1 and 2). Fig 1.tif (5.9 MB) Fig 2.tif (5.9 MB)

After confirmation, I pressed the “save” button to save the .mastodon file.

I imported it into matlab, but it did not contain information such as Spot radius or Track N spots, as shown in Fig. 3. Fig 3.tif (5.9 MB)

Curiously, I also tried importing the demo file “mamutproject.mastodon” that you created, but it also did not contain information such as Spot radius and Track N spots (Fig. 5). Fig 5.tif (5.9 MB)
Your “mamutproject.mastodon” file is small in file size, so I will upload it first (mamutproject.zip). mamutproject.zip (343.3 KB)

Ok. If the features are present in the Mastodon tables and not in the MATLAB import, then the MATLAB importer is wrong.

It is likely to be the case because we have to hardcode the feature names in the MATLAB importer. We probably changed some of the feature names in Mastodon and I forgot to mirror the changes in the importer.

I will look into this and report, thanks for the feedback.


Thank you for your quick response, Tinevez.

I’ll upload my .mastodon file as soon as I can too, although I can’t do it right away as I’m using my PC for another job.

Hello @SakuLab

I think I fixed one part of the problem.
Can you git pull the importer again and retry?


Thanks for the quick fix, Tinevez.

I downloaded the importer again and tried again.
I was able to get the following new features: SpotRadius, DetectionQuality, SpotTrackID, SpotGaussianFilteredIntensityMeanCh1, SpotGaussianFilteredIntensityStdCh1 and TrackNSpots, which I had not been able to get before (Fig. 6). Fig 6.tif (7.9 MB)

This fix will help me a lot with my research.
Thank you!

However, as you may have noticed, there are the following two problems:

  1. A problem related to matlab-mastodon-importer

I also calculated the “Spot N links” and “Spot median intensity” features with Fiji’s Mastodon plugin. However, I have not been able to read these two features in Matlab (Fig. 6). Fig 6.tif (7.9 MB)

  2. Problems related to Mastodon in Fiji

I think there is something wrong with the behavior of the “Feature and tag table” window that displays the calculation results.

I have calculated all the features that can be selected in Fiji’s Mastodon plugin. However, the “Feature and tag table” window shows only some of the features (Fig. 1). Fig 1.tif (5.9 MB) For example, features such as “x”, “y”, “z”, “t”, and “Spot median intensity” are not displayed. Features such as “x”, “y”, “z”, and “t” can be read in Matlab, so I think they exist somewhere. Oddly enough, when I calculate only the “Spot position”, the features of “x”, “y” and “z” are displayed normally in the “Feature and tag table” window (Fig. 7). Fig 7.tif (5.9 MB) The feature “spot median intensity” is not readable in Matlab, but it has been calculated over a long period of time in Mastodon, so I think it exists somewhere.

Also, in the current version of Mastodon, I cannot calculate the feature “Spot sum intensity” that appears in MastodonManual.pdf. It would be really helpful if this feature could be calculated.

These issues are not urgent, as your fix will do my job well for the time being.
It would be helpful if you could fix these problems when you are free.

Thanks to you, I can have a nice weekend.

Have a nice weekend!

You have already solved most of the problems, but I am uploading my .mastodon file anyway, with the number of frames cut down so that it can be uploaded: 20f_210407_dataset-f3.zip (3.7 MB).

I computed all the features that can be computed with Fiji’s Mastodon plugin (Fig. 8) Fig 8.tif (5.9 MB). In MATLAB, most of the features are now readable, but features like “Spot N links” and “Spot median intensity” are still not readable (Fig. 9) Fig 9.tif (6.4 MB).

On a different note, when I analyzed only 20 frames, Fiji’s Mastodon plugin showed me the results of all the features (Fig. 8) Fig 8.tif (5.9 MB). On the other hand, when I analyzed 150 frames, Fiji’s Mastodon plugin showed me only some of the features (Fig. 1) Fig 1.tif (5.9 MB). If there are too many calculation results, it may cause trouble in displaying them.

It would be really helpful if you could solve these troubles when you are free.

Best regards.

Hello @SakuLab

Actually I found out that the median is not saved in the file, which needs to be fixed.
The N links feature is not saved either, but this is on purpose: it is a feature that is computed ‘on the fly’ and does not need an explicit computation.

In any case I am not satisfied with how the median and all the intensity features are computed. @tpietzsch just made a new iterator over the ellipsoid that should fix my bogus ones. What I will do is rewrite the intensity-related feature computers on top of Tobias's iterator. I will keep you posted here on the forum.
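
In the meantime, since the N links feature is not stored in the file, an equivalent count can probably be derived in MATLAB from the graph itself. A minimal sketch on a toy graph, assuming the number of links of a spot is the sum of its incoming and outgoing edges:

```matlab
% Toy track: spot 1 -> spot 2, which then divides into spots 3 and 4.
G = digraph( [ 1 2 2 ], [ 2 3 4 ] );

% Number of links per spot = incoming edges + outgoing edges.
n_links = indegree( G ) + outdegree( G );

% Store it next to the other node variables.
G.Nodes.SpotNLinks = n_links;
```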


Thank you @tinevez and @tpietzsch for your prompt and perfect response.

I’m looking forward to your new code.

I think it’s great that you are helping so many biologists with their research.

Best regards.
