t-SNE implementation for ImageJ

Hi All,

I spent this weekend porting Leif Jonsson’s pure-Java, Barnes-Hut t-SNE library (a dimensionality-reduction and visualisation method for multidimensional data) as an ImageJ plugin.
It turned out quite well, so I thought some of you may want to use or modify my implementation.
If a path is not passed as an argument, the plugin asks for a directory of images to process. Other t-SNE and plugin parameters can also be passed to the plugin: ‘perplexity’, ‘initial dimensions’ (PCA preprocessing), ‘maximum iterations’ for the optimisation, and a toggle for whether to write the low-dimensional result as a .csv file to the user’s desktop.
The lower-dimensional data is finally presented as a typical 2D scatter plot. For example, this is the outcome of running the plugin on the full MNIST dataset of 60,000 grayscale images (28x28 pixels):


The current implementation doesn’t automatically colour the scatter plot by labelled groups in ImageJ, but adding the colours in Excel makes the accurate groupings stand out:

I’ve put both my plugin build (T_SNE-0.0.1.jar) and Leif Jonsson’s tsne library (tsne-2.5.0.jar) in an accessible Google Drive folder for those wishing to try it out. I’ve also included the full MNIST dataset as .png files and a labels index for post-hoc colouring :slightly_smiling_face:.
Or try making your own synthetic training and test sets, for example by drawing with the mouse over slices of a giant stack.

Note1: Remove the ‘.remove’ portion of each file name, then move the ‘T_SNE-0.0.1.jar’ file into the ImageJ plugins folder and the ‘tsne-2.5.0.jar’ file into the ImageJ jars folder.

Note2: The plugin is ‘very new’ so it is certainly lacking polish and some basic exception handling.

Note3: This clearly isn’t a proper distribution (via the updater) and I already know of a .jar conflict. To get this plugin to work in its current form you will have to temporarily remove the ejml-0.24.jar file from the ImageJ jars folder.

To run the plugin from a macro:

run("T SNE");

To pass arguments (note space delimiter):

run("T SNE", "perplexity=20 initial_dims=10 output_csv max_iterations=1000 input_path=[/Users/antinos/Desktop/]");
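
For anyone curious how such an option string breaks down, here is a minimal Java sketch of splitting it into keys and values. This is not the plugin's actual code: the class name MacroOptions and the choice to map bare flags such as output_csv to "true" are assumptions for illustration. Inside a real plugin, ImageJ's own ij.Macro.getValue(options, key, defaultValue) would normally do this job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the plugin or of ImageJ.
class MacroOptions {

    /** Splits "key=value flag key2=[value with spaces]" into a map; bare flags map to "true". */
    static Map<String, String> parse(String options) {
        Map<String, String> result = new LinkedHashMap<>();
        int i = 0;
        int n = options.length();
        while (i < n) {
            // Skip the space delimiter between tokens.
            while (i < n && options.charAt(i) == ' ') i++;
            if (i >= n) break;
            // Read the key up to '=' or the next space.
            int keyStart = i;
            while (i < n && options.charAt(i) != '=' && options.charAt(i) != ' ') i++;
            String key = options.substring(keyStart, i);
            if (i >= n || options.charAt(i) == ' ') {
                result.put(key, "true"); // a bare flag such as output_csv
                continue;
            }
            i++; // step over '='
            String value;
            if (i < n && options.charAt(i) == '[') {
                // Bracketed values may contain spaces, e.g. file paths.
                int close = options.indexOf(']', i);
                if (close < 0) close = n; // tolerate a missing closing bracket
                value = options.substring(i + 1, close);
                i = Math.min(close + 1, n);
            } else {
                int valueStart = i;
                while (i < n && options.charAt(i) != ' ') i++;
                value = options.substring(valueStart, i);
            }
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        String options = "perplexity=20 initial_dims=10 output_csv "
                + "max_iterations=1000 input_path=[/Users/antinos/Desktop/]";
        System.out.println(parse(options));
    }
}
```

The square brackets are what let a path containing spaces survive the space delimiter, which is why input_path is wrapped in them.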

I haven’t fully explored Leif’s t-SNE library, but I already know the final output can be presented in more than two dimensions. Spitting out a 3D output at the end might be cool.

Anyway, apologies if it is bad practice to share a very-early plugin without using the updater and also if I have totally missed somebody else’s extant and awesome implementation of this visualisation and dimensionality reduction method.

Kind regards.


I will be very interested in giving this a shot once there are no .jar conflicts :slight_smile:

Hi,

I think I’ve resolved the .jar conflict. It took a bit of refactoring and project restructuring. For instance, the t-SNE library classes are now built from source on the build path, which makes the plugin a little bigger. An added advantage is that all of the t-SNE source code is now available in my build. This meant I could also start adding IJ API calls to some of the previously hidden t-SNE classes (mostly log calls, but it was also useful to replace one or two ‘import’ statements to make them more ImageJ friendly).

For this version (install instructions and relevant files below: see Note1), I also took the opportunity to add a few convenience and functional features to the plugin. These include:

  • The plugin will now first check to see if an image stack is available, in which case it can process the stack in lieu of a folder of images.
  • Using a macro argument, the user can now also specify a .csv label index, which the plugin will attempt to use to colour the final 2D plot automatically. NOTE: the plugin expects a single column of data in the .csv, with a single heading cell, like this:
    image
    If the number of entries in the label file does not match the number of processed images, the plugin defaults to a non-coloured plot.
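
The label file rule above (one heading cell, one label per image, fall back when the counts differ) can be sketched in Java roughly as follows. This is only an illustration of the described behaviour, not the plugin's code: the class name LabelIndex, its method names, and the heading and label values in the example are all made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating the label-file rule; not taken from the plugin.
class LabelIndex {

    /** Treats the first line as the heading cell and every later non-blank line as one label. */
    static List<String> parse(List<String> csvLines) {
        List<String> labels = new ArrayList<>();
        for (int i = 1; i < csvLines.size(); i++) { // index 0 is the heading
            String label = csvLines.get(i).trim();
            if (!label.isEmpty()) {
                labels.add(label);
            }
        }
        return labels;
    }

    /** Labels can colour the plot only if there is exactly one per processed image. */
    static boolean matchesImageCount(List<String> labels, int imageCount) {
        return labels.size() == imageCount;
    }

    public static void main(String[] args) {
        // Made-up file content: a heading followed by three labels.
        List<String> lines = Arrays.asList("label", "shirt", "boot", "shirt");
        List<String> labels = parse(lines);
        int processedImages = 3;
        if (matchesImageCount(labels, processedImages)) {
            System.out.println("Colouring plot with labels: " + labels);
        } else {
            System.out.println("Label count mismatch, falling back to a non-coloured plot");
        }
    }
}
```

In practice the lines could come from java.nio.file.Files.readAllLines on the path given as label_path.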

For example, processing a Fashion-MNIST dataset of 60,000 (28x28 pixel) images of this kind:
image etc…
can now result in this ImageJ plot:


Here, colours have been automatically assigned and a legend added.

To run with the new label file option, for example:

run("T SNE", "label_path=[C:/Users/asinadin/Desktop/FashionIndex.csv]");

Note1: As the plugin is now built with refactored t-SNE source code, it no longer requires the ‘tsne-2.5.0.jar’ library to be added to the jars folder as a dependency. However, it does now require another third-party library, ‘jblas-1.2.3.jar’, which I have included in an updated Google Drive folder.
TO INSTALL the plugin: add the ‘T_SNE-0.0.4.jar’ file to the ImageJ plugins folder and ‘jblas-1.2.3.jar’ to the ImageJ jars folder. (Sorry… I may offer the Updater release this weekend.)

Note2:
The automatic colouring feature turned out to be more work than the plugin re-build, and the methods I ended up using may need future optimisation for efficiency and elegance. Also, I ended up assigning colours to the labels randomly, to allow for an unlimited number of them. However, I didn’t incorporate a similarity check, so there is a chance that any two groups could be assigned the same (unlikely) or apparently similar colours.
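
One possible way around the similar-colour problem, sketched here as an alternative and not as what the plugin currently does, is to step the hue by the golden ratio for each new label instead of drawing colours at random. Successive hues then land far apart on the colour wheel and never repeat exactly, while still supporting any number of labels. The class name LabelColours and the saturation and brightness values are assumptions; with many labels, neighbouring hues will still end up looking close.

```java
import java.awt.Color;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical alternative to random colour assignment; not the plugin's method.
class LabelColours {

    // Fractional part of the golden ratio, used as the hue step.
    private static final float HUE_STEP = 0.61803399f;

    /** Gives each distinct label one colour, in order of first appearance. */
    static Map<String, Color> assign(List<String> labels) {
        Map<String, Color> colours = new LinkedHashMap<>();
        float hue = 0f;
        for (String label : labels) {
            if (colours.containsKey(label)) {
                continue; // this label already has a colour
            }
            // Fixed saturation and brightness (assumed values); only the hue changes.
            colours.put(label, Color.getHSBColor(hue, 0.85f, 0.9f));
            hue = (hue + HUE_STEP) % 1f; // wrap around the colour wheel
        }
        return colours;
    }

    public static void main(String[] args) {
        Map<String, Color> colours = assign(List.of("0", "1", "2", "1", "0"));
        colours.forEach((label, colour) -> System.out.println(label + " -> " + colour));
    }
}
```

Because the mapping is deterministic, the same label file also produces the same colours on every run, which makes plots easier to compare.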

Note3:
So far, I have only tested this plugin in my Fiji build. There is a chance that I am relying on a dependency library in my build which is not present in vanilla ImageJ. I will test this at the weekend.

Kind regards.
