dimensionality_reduction
Class PCA

java.lang.Object
  extended by dimensionality_reduction.PCA

public class PCA
extends java.lang.Object

The following is a simple example of how to perform basic principal component analysis in EJML.

Principal Component Analysis (PCA) is typically used to develop a linear model for a set of data (e.g. face images) which can then be used to test for membership. PCA works by projecting the data onto a new basis that spans a subspace of the original space. The subspace is selected to retain as much of the data's variance as possible.

PCA is typically derived as an eigenvalue problem. In this implementation, however, SVD is used because it can produce a more numerically stable solution. Computation using an eigenvalue decomposition (EVD) requires explicitly computing the variance of each sample set. The variance is computed by squaring the residual, which can cause loss of precision.
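
The precision argument above can be illustrated with a small sketch. It does not use EJML and is not taken from this class: it uses the classic Läuchli matrix A = [[1, 1], [eps, 0], [0, eps]] as an assumed example, whose exact smaller singular value is eps. Forming A^T A squares eps, and for a small eps that term is rounded away in double precision. The class and method names are hypothetical.

```java
final class GramPrecisionDemo {

    /** Smaller eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]. */
    static double smallerEigenvalue(double a, double b, double c) {
        double half = (a + c) / 2.0;
        double diff = (a - c) / 2.0;
        return half - Math.sqrt(diff * diff + b * b);
    }

    /**
     * Smaller singular value of A = [[1, 1], [eps, 0], [0, eps]],
     * computed the EVD way: eigenvalues of A^T A, then a square root.
     */
    static double smallerSingularValueViaGram(double eps) {
        double a = 1.0 + eps * eps; // column 1 dotted with itself
        double b = 1.0;             // column 1 dotted with column 2
        double c = 1.0 + eps * eps; // column 2 dotted with itself
        // Guard against a tiny negative value caused by rounding.
        return Math.sqrt(Math.max(0.0, smallerEigenvalue(a, b, c)));
    }
}
```

With eps = 1e-9 the term eps * eps is 1e-18, which is below the resolution of a double near 1.0, so smallerSingularValueViaGram(1e-9) returns 0.0 although the exact answer is 1e-9. An SVD applied to A directly never forms the squared quantity.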

Author:
Peter Abeles, Elefterios Spyromitros-Xioufis

Field Summary
private  org.ejml.data.DenseMatrix64F A
          where the data is stored
private  boolean compact
           
private  boolean doWhitening
          Whether to perform whitening.
(package private)  double[] mean
          mean values of each element across all the samples
private  int numComponents
          how many principal components are used
private  int numSamples
          number of samples that will be used for learning the PCA
private  int sampleIndex
          counts the number of currently loaded samples
private  int sampleSize
          number of elements in each sample
private  org.ejml.data.DenseMatrix64F V_t
          principal component subspace is stored in the rows
private  org.ejml.data.DenseMatrix64F W
          a diagonal matrix with the singular values
 
Constructor Summary
PCA(int numComponents, int numSamples, int sampleSize)
           
 
Method Summary
 void addSample(double[] sampleData)
          Adds a new sample of the raw data to the internal data structure for later processing.
 void computeBasis()
          Computes a basis (the principal components) from the most dominant eigenvectors.
 double[] getBasisVector(int which)
          Returns a vector from the PCA's basis.
 double[] getEigenValues()
          Returns a vector with the eigenvalues in descending order.
 double[] getMean()
           
 double[] sampleToEigenSpace(double[] sampleData)
          Converts a vector from sample space into eigen space.
 void setBasisMatrix(double[][] basis)
          Sets the PCA basis matrix.
 void setCompact(boolean compact)
           
 void setDoWhitening(boolean doWhitening)
           
 void setEigenvalues(double[] eigenvalues)
          Initializes the diagonal eigenvalue matrix W by using the supplied vector and then whitens the projection matrix V_t.
 void setMean(double[] mean)
           
 void setPCAFromFile(java.lang.String PCAFileName)
          Initializes the PCA matrix, means vector and optionally eigenvalues matrix from the given file.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

numComponents

private int numComponents
how many principal components are used


sampleSize

private int sampleSize
number of elements in each sample


numSamples

private int numSamples
number of samples that will be used for learning the PCA


sampleIndex

private int sampleIndex
counts the number of currently loaded samples


A

private org.ejml.data.DenseMatrix64F A
where the data is stored


mean

double[] mean
mean values of each element across all the samples


V_t

private org.ejml.data.DenseMatrix64F V_t
principal component subspace is stored in the rows


W

private org.ejml.data.DenseMatrix64F W
a diagonal matrix with the singular values


doWhitening

private boolean doWhitening
Whether to perform whitening.


compact

private boolean compact
Constructor Detail

PCA

public PCA(int numComponents,
           int numSamples,
           int sampleSize)
Parameters:
numComponents - Number of vectors it will use to describe the data. Typically much smaller than the number of elements in the input vector.
numSamples - Number of samples that will be processed (only important at learning).
sampleSize - Number of elements in each sample.
Method Detail

setCompact

public void setCompact(boolean compact)

addSample

public void addSample(double[] sampleData)
Adds a new sample of the raw data to the internal data structure for later processing. All the samples must be added before computeBasis is called.

Parameters:
sampleData - Sample from original raw data.
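
The bookkeeping described by the numSamples, sampleSize and sampleIndex fields might look roughly like the sketch below. It is an assumption-laden illustration, not the class's actual code: it stores samples in a plain row-major double[] instead of a DenseMatrix64F, and the class name, accessors and exception types are hypothetical.

```java
final class SampleBuffer {
    private final double[] data;  // row-major: numSamples rows of sampleSize elements
    private final int numSamples; // capacity fixed at construction
    private final int sampleSize; // number of elements in each sample
    private int sampleIndex = 0;  // counts the samples loaded so far

    SampleBuffer(int numSamples, int sampleSize) {
        this.numSamples = numSamples;
        this.sampleSize = sampleSize;
        this.data = new double[numSamples * sampleSize];
    }

    /** Copies one raw sample into the next free row. */
    void addSample(double[] sampleData) {
        if (sampleData.length != sampleSize) {
            throw new IllegalArgumentException("Unexpected sample size");
        }
        if (sampleIndex >= numSamples) {
            throw new IllegalStateException("Too many samples");
        }
        System.arraycopy(sampleData, 0, data, sampleIndex * sampleSize, sampleSize);
        sampleIndex++;
    }

    /** Number of samples added so far. */
    int count() {
        return sampleIndex;
    }

    /** Element col of sample row. */
    double get(int row, int col) {
        return data[row * sampleSize + col];
    }
}
```

Fixing the capacity up front is what makes the numSamples constructor argument matter only while learning: the buffer is sized once and filled row by row.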

computeBasis

public void computeBasis()
Computes a basis (the principal components) from the most dominant eigenvectors.
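
As a rough illustration of what computing a basis involves, the sketch below centers the samples on their mean and then estimates only the single most dominant component. Note the swapped-in technique: it uses power iteration instead of the SVD this class relies on, purely to stay dependency-free. All names are hypothetical, and the fixed starting vector is an assumption that fails if it happens to be orthogonal to the dominant direction.

```java
final class BasisSketch {

    /** Mean of each element (column) across all samples (rows). */
    static double[] columnMeans(double[][] samples) {
        int size = samples[0].length;
        double[] mean = new double[size];
        for (double[] sample : samples) {
            for (int j = 0; j < size; j++) {
                mean[j] += sample[j];
            }
        }
        for (int j = 0; j < size; j++) {
            mean[j] /= samples.length;
        }
        return mean;
    }

    /** Copy of the samples with the mean subtracted from every row. */
    static double[][] center(double[][] samples, double[] mean) {
        double[][] centered = new double[samples.length][mean.length];
        for (int i = 0; i < samples.length; i++) {
            for (int j = 0; j < mean.length; j++) {
                centered[i][j] = samples[i][j] - mean[j];
            }
        }
        return centered;
    }

    /** Unit vector approximating the dominant principal component. */
    static double[] dominantDirection(double[][] centered, int iterations) {
        int size = centered[0].length;
        double[] v = new double[size];
        for (int j = 0; j < size; j++) {
            v[j] = 1.0 / Math.sqrt(size); // fixed, normalized starting guess
        }
        for (int it = 0; it < iterations; it++) {
            // next = A^T (A v), computed row by row without forming A^T A.
            double[] next = new double[size];
            for (double[] row : centered) {
                double dot = 0.0;
                for (int j = 0; j < size; j++) {
                    dot += row[j] * v[j];
                }
                for (int j = 0; j < size; j++) {
                    next[j] += dot * row[j];
                }
            }
            double norm = 0.0;
            for (double x : next) {
                norm += x * x;
            }
            norm = Math.sqrt(norm);
            if (norm == 0.0) {
                return v; // degenerate data or an unlucky starting vector
            }
            for (int j = 0; j < size; j++) {
                v[j] = next[j] / norm;
            }
        }
        return v;
    }
}
```

For samples lying on the line y = 2x, such as (1, 2), (2, 4) and (3, 6), the returned direction is proportional to (1, 2).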


sampleToEigenSpace

public double[] sampleToEigenSpace(double[] sampleData)
Converts a vector from sample space into eigen space. If setEigenvalues(double[]) has been called, then the projected vector is also whitened.

Parameters:
sampleData - Sample space data.
Returns:
Eigen space projection.
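
The projection itself can be sketched in a few lines, assuming the usual PCA convention that the mean is subtracted before the sample is multiplied by the basis stored in the rows of V_t. Plain arrays stand in for the EJML matrices, and the class and method names are hypothetical.

```java
final class ProjectionSketch {

    /**
     * Projects a sample onto the basis vectors stored in the rows of basisRows.
     * Output element i is the dot product of row i with (sample - mean).
     */
    static double[] project(double[][] basisRows, double[] mean, double[] sample) {
        double[] projected = new double[basisRows.length];
        for (int i = 0; i < basisRows.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < sample.length; j++) {
                sum += basisRows[i][j] * (sample[j] - mean[j]);
            }
            projected[i] = sum;
        }
        return projected;
    }
}
```

If the rows of the basis have already been whitened, the same dot products yield a whitened projection without any extra step at projection time.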

setDoWhitening

public void setDoWhitening(boolean doWhitening)

setMean

public void setMean(double[] mean)

getMean

public double[] getMean()

setBasisMatrix

public void setBasisMatrix(double[][] basis)
Sets the PCA basis matrix.

Parameters:
basis - the basis matrix

getBasisVector

public double[] getBasisVector(int which)
Returns a vector from the PCA's basis.

Parameters:
which - Which component's vector is to be returned.
Returns:
Vector from the PCA basis.

setEigenvalues

public void setEigenvalues(double[] eigenvalues)
                    throws java.lang.Exception
Initializes the diagonal eigenvalue matrix W by using the supplied vector and then whitens the projection matrix V_t.

Parameters:
eigenvalues - the vector of eigenvalues used to fill the diagonal of W
Throws:
java.lang.Exception
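
One common way to whiten a projection matrix is to scale each basis row by the inverse square root of its eigenvalue, so that every projected component ends up with unit variance. The sketch below assumes that convention; the page does not state the exact scaling this class applies, nor the condition under which it throws. The names and the exception type are hypothetical.

```java
final class WhiteningSketch {

    /** Returns a copy of basisRows with row i scaled by 1 / sqrt(eigenvalues[i]). */
    static double[][] whitenRows(double[][] basisRows, double[] eigenvalues) {
        if (eigenvalues.length < basisRows.length) {
            throw new IllegalArgumentException("Need one eigenvalue per basis row");
        }
        double[][] whitened = new double[basisRows.length][];
        for (int i = 0; i < basisRows.length; i++) {
            if (eigenvalues[i] <= 0.0) {
                // A zero or negative eigenvalue cannot be inverted safely.
                throw new IllegalArgumentException("Eigenvalue " + i + " is not positive");
            }
            double scale = 1.0 / Math.sqrt(eigenvalues[i]);
            whitened[i] = new double[basisRows[i].length];
            for (int j = 0; j < basisRows[i].length; j++) {
                whitened[i][j] = basisRows[i][j] * scale;
            }
        }
        return whitened;
    }
}
```

Scaling the rows once, up front, means every later projection is whitened at no additional cost.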

getEigenValues

public double[] getEigenValues()
Returns a vector with the eigenvalues in descending order.

Returns:
Vector of eigenvalues in descending order.
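
Because W holds singular values while this method returns eigenvalues, a conversion is implied. For a mean-centered data matrix with n samples, the eigenvalues of the sample covariance matrix equal the squared singular values divided by n - 1. The sketch below applies that identity; whether this class divides by n - 1, by n, or not at all is not stated on this page, so the normalization is an assumption and the names are hypothetical.

```java
final class EigenvalueSketch {

    /**
     * Converts singular values of a centered data matrix into eigenvalues of
     * its sample covariance matrix: lambda_i = s_i^2 / (numSamples - 1).
     */
    static double[] fromSingularValues(double[] singularValues, int numSamples) {
        if (numSamples < 2) {
            throw new IllegalArgumentException("Need at least two samples");
        }
        double[] eigenvalues = new double[singularValues.length];
        for (int i = 0; i < singularValues.length; i++) {
            eigenvalues[i] = singularValues[i] * singularValues[i] / (numSamples - 1);
        }
        return eigenvalues;
    }
}
```

Squaring preserves the order of non-negative values, so singular values sorted in descending order give eigenvalues in descending order as well.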

setPCAFromFile

public void setPCAFromFile(java.lang.String PCAFileName)
                    throws java.lang.Exception
Initializes the PCA matrix, means vector and optionally eigenvalues matrix from the given file.

Parameters:
PCAFileName - the learning file
Throws:
java.lang.Exception