I spent this weekend porting Leif Jonsson’s pure-Java, Barnes-Hut t-SNE implementation (a clustering and multidimensional visualisation tool) to an ImageJ plugin.
It turned out quite well, so I thought some of you may want to use or modify my implementation.
If no path is passed to it as an argument, the plugin asks for a directory of images to process. Other t-SNE and plugin parameters can also be passed: ‘perplexity’, ‘initial dimensions’ (PCA preprocessing), ‘maximum iterations for optimisation’, and a toggle for whether to write the low-dimensional result as a .csv file to the user’s desktop.
The lower-dimensional data is finally presented as a typical 2D scatter plot. For example, this is the outcome of running the plugin on the full MNIST dataset of 60,000 grayscale images (28x28 pixels each):
The current implementation doesn’t automatically colour the scatter plot by labelled groups in ImageJ, but adding the colours in Excel shows how accurate the groupings are:
I’ve put both my plugin build (T_SNE-0.0.1.jar) and Leif Jonsson’s tsne library (tsne-2.5.0.jar) in an accessible Google Drive folder for those wishing to try it out. I’ve also included the full MNIST dataset as .png files and a labels index for post-hoc colouring.
Or try making your own synthetic training and test sets, e.g. by drawing with the mouse over the slices of a giant stack.
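For anyone who would rather not do the post-hoc colouring in Excel, here is a rough Python sketch of the same idea: pair each row of the .csv output with its label and group the points per label. The file layouts are assumptions on my part (a headerless two-column x,y .csv with one row per image, and a labels file with one label per line in the same order), and the function name is made up for illustration, so adjust it to whatever the real files look like.

```python
import csv
import io
from collections import defaultdict


def group_points_by_label(embedding_csv, labels_file):
    """Return {label: [(x, y), ...]}.

    ASSUMED layouts (not confirmed by the plugin):
    - embedding_csv: headerless rows of "x,y", one per image
    - labels_file: one label per line, in the same order as the rows
    """
    points = [(float(row[0]), float(row[1]))
              for row in csv.reader(embedding_csv) if row]
    labels = [line.strip() for line in labels_file if line.strip()]
    if len(points) != len(labels):
        raise ValueError("number of points and labels differ")

    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return dict(groups)


# Tiny made-up stand-ins for the real .csv output and labels index.
embedding = io.StringIO("1.5,-2.0\n0.25,3.0\n1.75,-1.5\n")
labels = io.StringIO("7\n2\n7\n")

for label, pts in sorted(group_points_by_label(embedding, labels).items()):
    print(label, pts)
```

Each group can then be drawn as its own series in whatever plotting tool you like (for example one scatter call per label), which gives one colour per digit.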
Note1: For both downloaded files, first remove the ‘.remove’ portion of the file name. Then move the ‘T_SNE-0.0.1.jar’ file into the ImageJ plugins folder and the ‘tsne-2.5.0.jar’ file into the ImageJ jars folder.
Note2: The plugin is ‘very new’ so it is certainly lacking polish and some basic exception handling.
Note3: This clearly isn’t a proper distribution (via the updater) and I already know of one .jar conflict. To get this plugin to work in its current form you will have to temporarily remove the ejml-0.24.jar file from the ImageJ jars folder.
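If you would rather script the manual steps in Note1 and Note3, something like the Python sketch below should do it. It is only a sketch under assumptions: the function name, the ‘jars_parked’ holding folder, and the exact position of ‘.remove’ in the downloaded file names are all hypothetical (the code simply strips ‘.remove’ wherever it appears). The demo runs against a throwaway temporary directory rather than a real ImageJ install.

```python
import shutil
import tempfile
from pathlib import Path


def install_tsne_plugin(download_dir, imagej_dir):
    """Copy the two jars into place and park the conflicting ejml jar.

    Returns the list of installed file paths.
    """
    download_dir, imagej_dir = Path(download_dir), Path(imagej_dir)
    plugins, jars = imagej_dir / "plugins", imagej_dir / "jars"
    parked = imagej_dir / "jars_parked"  # HYPOTHETICAL holding folder

    installed = []
    for src in sorted(download_dir.iterdir()):
        clean = src.name.replace(".remove", "")  # strip the '.remove' part
        if clean == "T_SNE-0.0.1.jar":
            dest = plugins / clean
        elif clean == "tsne-2.5.0.jar":
            dest = jars / clean
        else:
            continue  # ignore anything else in the download folder
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        installed.append(dest)

    # Temporarily move the conflicting jar out of the way (Note3).
    conflict = jars / "ejml-0.24.jar"
    if conflict.exists():
        parked.mkdir(exist_ok=True)
        shutil.move(str(conflict), str(parked / conflict.name))
    return installed


# Demo on a throwaway directory tree; the '.jar.remove' names are assumed.
with tempfile.TemporaryDirectory() as tmp:
    downloads, imagej = Path(tmp) / "downloads", Path(tmp) / "ImageJ"
    downloads.mkdir()
    (imagej / "jars").mkdir(parents=True)
    (downloads / "T_SNE-0.0.1.jar.remove").touch()
    (downloads / "tsne-2.5.0.jar.remove").touch()
    (imagej / "jars" / "ejml-0.24.jar").touch()
    for path in install_tsne_plugin(downloads, imagej):
        print(path.relative_to(imagej))
```

To undo the change later, move ejml-0.24.jar back from the holding folder into the jars folder.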
To run the plugin from a macro:
run("T SNE");
To pass arguments (note the space delimiter):
run("T SNE", "perplexity=20 initial_dims=10 output_csv max_iterations=1000 input_path=[/Users/antinos/Desktop/");
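To make the option-string format a bit more concrete, here is a small Python sketch of how a space-delimited string like this breaks down into key=value pairs and bare flags (such as output_csv). It is not ImageJ’s actual parser, just an illustration of the convention; the function name and the example path are hypothetical, and it assumes that values containing spaces are wrapped in a matched pair of square brackets.

```python
import re


def parse_options(options):
    """Split 'key=value flag key=[value with spaces]' into a dict.

    Bare flags map to True; all other values are kept as strings.
    Illustration only, NOT ImageJ's own option parser.
    """
    # A token is a key, optionally followed by =[bracketed] or =plain.
    pattern = re.compile(r"(\w+)(?:=(?:\[([^\]]*)\]|(\S+)))?")
    parsed = {}
    for match in pattern.finditer(options):
        key, bracketed, plain = match.groups()
        if bracketed is not None:
            parsed[key] = bracketed      # brackets protect embedded spaces
        elif plain is not None:
            parsed[key] = plain
        else:
            parsed[key] = True           # bare flag, e.g. output_csv
    return parsed


# Hypothetical example path, chosen to contain a space.
example = ("perplexity=20 initial_dims=10 output_csv "
           "max_iterations=1000 input_path=[/hypothetical/My Images/]")
for key, value in parse_options(example).items():
    print(key, "->", value)
```

Leaving a flag such as output_csv out of the string simply means it never appears in the result, which is how a toggle ends up off.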
I haven’t fully explored Leif’s t-SNE library, but I already know that the final output can have more than two dimensions. Spitting out a 3D result at the end might be cool.
Anyway, apologies if it is bad practice to share a very early plugin without using the updater, and also if I have totally missed somebody else’s extant and awesome implementation of this visualisation and dimensionality reduction method.