Principal Component Analysis (PCA)

A technique in machine learning and statistics for dimensionality reduction and feature extraction.

Explanation

Key Concepts of PCA

1. Dimensionality Reduction:

  • PCA reduces the number of features in the dataset while retaining as much variance (information) as possible.
  • This is useful for visualizing high-dimensional data and reducing computational complexity.

2. Principal Components:

  • These are new features that are linear combinations of the original features.
  • The first principal component captures the maximum variance in the data.
  • Each subsequent component captures the remaining variance under the constraint that it is orthogonal to the previous components.

3. Orthogonality:

  • Principal components are mutually orthogonal, and the data projected onto different components is uncorrelated.
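
A quick sketch of this property on synthetic data, assuming NumPy and scikit-learn are available (all names below are illustrative):

  import numpy as np
  from sklearn.decomposition import PCA

  # Synthetic data: 100 observations x 5 correlated features (illustrative only).
  rng = np.random.default_rng(0)
  latent = rng.normal(size=(100, 2))
  X = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(100, 5))

  pca = PCA(n_components=3)
  scores = pca.fit_transform(X)

  # Components are orthogonal: the off-diagonal dot products are ~0.
  print(np.round(pca.components_ @ pca.components_.T, 6))

  # Projected scores are uncorrelated: the off-diagonal correlations are ~0.
  print(np.round(np.corrcoef(scores, rowvar=False), 6))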

Examples

E.g. We have a hyperspectral dataset of 100 observations (pixels), where each observation consists of 150 spectral features (bands).

This dataset has more features than observations, i.e. very high dimensionality. With more features than observations, supervised learning runs a risk of overfitting and its performance drops; the observations also become harder to cluster.

PCA can be a solution to this dimensionality problem: it projects the data along the directions of largest variation. In this case, reducing the dimensionality to 10 gives a good balance between the number of observations and the number of features: 100/10 = 10.
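
A minimal sketch of this workflow, assuming scikit-learn and using a random stand-in for the hyperspectral matrix (the array names are illustrative):

  import numpy as np
  from sklearn.decomposition import PCA
  from sklearn.preprocessing import StandardScaler

  # Stand-in for the hyperspectral data: 100 pixels x 150 spectral bands.
  rng = np.random.default_rng(42)
  X = rng.normal(size=(100, 150))

  # Standardize the bands, then keep the 10 leading principal components.
  X_std = StandardScaler().fit_transform(X)
  pca = PCA(n_components=10)
  X_reduced = pca.fit_transform(X_std)

  print(X_reduced.shape)                      # (100, 10)
  print(pca.explained_variance_ratio_.sum())  # fraction of variance retained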

How to

1. Standardization:

  • Scale the data so that each feature has a mean of 0 and a standard deviation of 1.
  • This step is crucial if the features are measured on different scales.
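
A minimal NumPy sketch of this step on a hypothetical 100 x 150 data matrix (the snippets in the later steps continue from these variables):

  import numpy as np

  # Hypothetical raw data: 100 observations x 150 features, arbitrary scale.
  rng = np.random.default_rng(0)
  X = rng.normal(loc=5.0, scale=3.0, size=(100, 150))

  # Standardize each feature to mean 0 and standard deviation 1.
  X_std = (X - X.mean(axis=0)) / X.std(axis=0)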

2. Covariance Matrix Computation:

  • Compute the covariance matrix of the standardized data to understand how the features vary with respect to each other.
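
Continuing the sketch from the previous step:

  # 150 x 150 covariance matrix; rowvar=False treats columns as features.
  cov = np.cov(X_std, rowvar=False)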

3. Eigen Decomposition:

  • Compute the eigenvectors and eigenvalues of the covariance matrix.
  • Eigenvectors give the directions (principal components) and eigenvalues give the amount of variance along those directions.
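
Continuing the sketch; np.linalg.eigh suits the symmetric covariance matrix:

  # Eigenvalues (variances) and eigenvectors (directions) of the covariance matrix.
  # eigh returns the eigenvalues in ascending order.
  eigenvalues, eigenvectors = np.linalg.eigh(cov)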

4. Sort and Select:

  • Sort the eigenvectors by decreasing eigenvalues to rank the principal components by the amount of variance they explain.
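
Continuing the sketch, with k = 10 as an illustrative choice:

  # Reorder from largest to smallest eigenvalue.
  order = np.argsort(eigenvalues)[::-1]
  eigenvalues = eigenvalues[order]
  eigenvectors = eigenvectors[:, order]

  # Keep the k components that explain the most variance.
  k = 10
  components = eigenvectors[:, :k]
  explained_ratio = eigenvalues[:k].sum() / eigenvalues.sum()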

5. Transform Data:

  • Project the original data onto the selected principal components to get the transformed data.
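
Continuing the sketch, the projection itself:

  # Project: (100 x 150) @ (150 x k) -> (100 x k) matrix of principal component scores.
  X_transformed = X_std @ components
  print(X_transformed.shape)  # (100, 10)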

Outgoing relations