PCA for set of 3D objects

Hello!

I am interested in using PCA to find the PCs of a group of fly brain z-stacks. Each brain is represented by a single z-stack.

Is there a plugin for ImageJ or another program that will take a group of 3D stacks as input and find the 3D PCs (also stacks representing the “principal brains”) of that group?

Thanks!

Hi!

Do you mean something like “fitting a 3D ellipsoid” (sometimes just another word for PCA :wink: ) to each brain to find its principal directions? If that is the case both the 3DImageSuite (@ThomasBoudier) and MorpholibJ (@dlegland) have tools for that.

Or did you have something else in mind?

Hi @meib and @Christian_Tischer

I think that @meib is rather trying to do some dimensionality reduction taking each entire z-stack as if it were a vector and basically representing each of those vectors as a linear combination of the principal components. @meib, did I understand that correctly?

For that to work, it’d be easiest if all volumes were the same size -> each vector in your space would then be of the same dimensionality (w * h * d). Then, you can use a matrix library like JAMA (I haven’t looked at the Java matrix library landscape recently…) or maybe a stats library to do PCA.

One aspect to note is that PCA might give you PCs that contain negative elements; if you want to avoid that, you might consider something like (blind) non-negative matrix factorization (NMF) instead.
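
For illustration, here is a minimal NMF sketch using scikit-learn; this is just one way to do it, and the variable names (volumes, etc.) are made up, assuming all stacks are registered, of identical shape, and have non-negative intensities:

import numpy as np
from sklearn.decomposition import NMF

# volumes: list of N registered z-stacks, each of shape (d, h, w),
# with non-negative intensities; flatten each into a row of the data matrix
X = np.stack([v.ravel() for v in volumes])   # shape (N, d*h*w)

nmf = NMF(n_components=5, init='nndsvd', max_iter=500)
W = nmf.fit_transform(X)    # (N, 5): weight of each brain on each component
H = nmf.components_         # (5, d*h*w): non-negative "component brains"

# reshape a component back into a stack for viewing
component_brain = H[0].reshape(volumes[0].shape)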

I don’t think there is anything out of the box though… Should be pretty straightforward in MATLAB (Octave?) though.

Thomas

I’m making the same assumption as Thomas here. @meib, if you are not already familiar with it, you may want to look at Eigenfaces. This is a well-known algorithm, and if you search for the term you will find many tutorials, YouTube lectures, and sample code in pretty much any language (the original Turk & Pentland paper is worth a read too). The only thing you will have to change from the Eigenfaces tutorials is that you turn each 3D volume (rather than a 2D image) into a column vector to build the matrix.
As @Thomas_Pengo mentioned, this is pretty straightforward in MATLAB or, alternatively, in Python/numpy using SVD (singular value decomposition). (You will have to use the 'econ' option in MATLAB or full_matrices=False in numpy to avoid running out of memory, in particular when applying this to volumes.)
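
For concreteness, a minimal numpy sketch of this (variable names like volumes are illustrative, and it assumes N registered stacks of identical shape):

import numpy as np

# volumes: list of N registered z-stacks of identical shape (d, h, w)
X = np.stack([v.ravel().astype(np.float64) for v in volumes])  # (N, d*h*w)
mean_brain = X.mean(axis=0)
Xc = X - mean_brain                    # center the data

# thin SVD: full_matrices=False keeps Vt at (N, d*h*w) instead of
# the impossibly large (d*h*w, d*h*w)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# rows of Vt are the principal components ("eigenbrains");
# reshape one back into a stack to visualize it
pc0 = Vt[0].reshape(volumes[0].shape)

scores = U * s    # (N, N): coordinates of each brain in PC space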

Hi all,

Thanks for your responses! @Thomas_Pengo and @VolkerH, you’ve assumed correctly about my problem. Thank you for your coding tips!

I will start by writing this in Python, but since I am new to programming I would still prefer to use an existing plugin for ImageJ.

Does anyone know of a plugin that would work for this?

Thanks for the help!

Mei

If it’s in 3D then you want a tensor decomposition. Check TensorFaces. I’ve applied this idea to EM data here.

Hi @jkh1

Thank you for sharing your work! It sounds like HOSVD is what I’m looking for.

If each of my brains is a tensor, then my dataset is a collection of tensors. Would factorizing an average tensor of that dataset (tensor A) result in basis matrices that represent the principal component brains of the dataset?

Let me know if you need me to clarify more.

Best,
Mei

As I understand it, your whole data set is a tensor of order 4: if each brain is a 3D image and you have N of these, the collection has dimensions X, Y, Z and N. When you factorize this tensor, you’ll get 4 factor matrices, and the last one (assuming the dimensions are in this order X, Y, Z, N) can indeed be interpreted as holding the coordinates of the brains along some principal components. Note that your images need to be registered. If the data is too big, you can factorize only a subset and then project the rest of the data onto the resulting basis, or you can downsample your images. It depends on what you’re trying to get at.
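
If it helps, here is a sketch of that factorization using the tensorly library (my choice of library and of the ranks below is just an assumption; any HOSVD/Tucker implementation would do):

import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

# volumes: list of N registered stacks of identical shape (X, Y, Z);
# stack them into an order-4 tensor of shape (X, Y, Z, N)
T = tl.tensor(np.stack(volumes, axis=-1).astype(np.float64))

# Tucker decomposition with reduced ranks (the rank values are arbitrary)
core, factors = tucker(T, rank=[32, 32, 16, 5])

# factors[3] has shape (N, 5): coordinates of each brain along the
# principal components of the sample mode
brain_coords = factors[3]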

Hi, here are some Python demos for PCA.

1. How to run PCA on a set of coordinates:

import numpy as np
from numpy.linalg import eig

# xyz: array of shape (3, n_points) holding an object’s voxel coordinates;
# e holds the eigenvalues, the columns of v are the principal axes
e,v = eig(np.cov(xyz))
2. How to collect a coordinate set from a volume:

import numpy as np
from scipy.ndimage import label

lab, n = label(imgs)   # label connected objects in the binary volume

# to get the x-th object’s voxel coordinates:
xyzs = np.vstack(np.where(lab == x))   # shape (3, n_points)
# then do PCA as above ...

But the code above is not practical when the volume contains many objects; that’s a limitation of Python: iterating over the pixels with a for loop would be far too slow. Thankfully, there are packages such as skimage that handle this for us.

3. How to compute many properties for every object in a volume:

from skimage.measure import label, regionprops

buf, n = label(imgs, return_num=True)   # labelled volume and object count
ls = regionprops(buf)
for obj in ls:
    # obj exposes many properties (including PCA-related ones such as
    # inertia_tensor_eigvals); see the documentation
    print(obj.inertia_tensor_eigvals)

https://scikit-image.org/docs/dev/api/skimage.measure.html#regionprops

4. Finally, I recommend ImagePy, a UI tool like ImageJ but built on Python data structures: https://github.com/Image-Py/imagepy. It also includes a 3D object analysis plugin: https://github.com/Image-Py/imagepy/blob/master/imagepy/menus/Kit3D/Analysis%203D/regionprops3d_plgs.py. You can use it from the UI, or use its code as a reference.

Maybe I did not understand your question well, but I hope the code is helpful.

@yxdragon This was very generous, thank you! I will look into ImagePy. Pieces of the code look useful.

@jkh1 Thanks for your response. That clarifies some things for me.

Because my main goal is to visualize the PCs as “principal brains” (translate the eigenvectors of pixel values back into 3D images) rather than to represent my original brains as linear combos of the PCs, I’m now thinking that HOSVD is not the way for me to go. Thanks for your suggestions though.

Just warning you that

e,v = eig(np.cov(xyz))

is probably not going to work. This is actually explained quite well on the Eigenface Wikipedia page:

Computing the eigenvectors

Performing PCA directly on the covariance matrix of the images is often computationally infeasible. If small images are used, say 100 × 100 pixels, each image is a point in a 10,000-dimensional space and the covariance matrix S is a matrix of 10,000 × 10,000 = 10^8 elements. However the rank of the covariance matrix is limited by the number of training examples: if there are N training examples, there will be at most N − 1 eigenvectors with non-zero eigenvalues.

In your case, you’re working on volumes, so assuming you have 100x100x100 voxel volumes, the covariance matrix would have 10^12 elements. That’s why you need to use one of the tricks, e.g. going with the SVD.
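
For completeness, here is a sketch of the standard Eigenfaces trick of working with the small N x N Gram matrix instead of the huge covariance matrix (names are illustrative; Xc is a centered data matrix as in the SVD sketch above):

import numpy as np
from numpy.linalg import eigh

# Xc: centered data matrix of shape (N, D), with D = number of voxels, N << D
G = Xc @ Xc.T                       # (N, N) Gram matrix, cheap to build
evals, evecs = eigh(G)              # eigh, since G is symmetric
order = np.argsort(evals)[::-1]     # sort eigenpairs, largest first
evals, evecs = evals[order], evecs[:, order]

# map the small eigenvectors back to voxel space and normalize:
# if G u = l u, then Xc.T @ u is an eigenvector of the covariance matrix
pcs = Xc.T @ evecs                  # (D, N): columns are the eigenbrains
pcs /= np.linalg.norm(pcs, axis=0)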

@VolkerH

Oh my. Thanks for the heads up. e,v = eig(np.cov(xyz)) is exactly the route I was planning to take. I should do more reading on HOSVD.

Is there a way to extract the eigenvectors other than by direct PCA?

Yes, as explained in the references provided:

Using SVD on data matrix X, it is unnecessary to calculate the actual covariance matrix to get eigenfaces.

The Wikipedia page gives a cookbook recipe under Connection with SVD. You only need to know the operators for matrix multiplication, matrix transpose, and singular value decomposition in whatever programming language you choose to implement it. This might involve looking up the manual page for svd; you can do it.

Wonderful, thank you so much! This has been extremely helpful.

Best,
Mei