Principal Component Analysis

Principal component analysis (PCA) is a dimensionality reduction technique that projects a set of observations into a lower-dimensional space whose coordinates are linearly uncorrelated variables called principal components.

Introduction

PCA identifies the directions of maximum variance in high-dimensional data and projects the data onto a lower-dimensional subspace spanned by those directions, retaining as much of the original variance as possible.
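
One common way to make "direction of maximum variance" precise is sketched below. The symbols are assumptions introduced for illustration and do not appear in the text above: x_i are the n observations, \bar{x} is their mean, S is the sample covariance matrix, and w_1 is the first principal direction.

```latex
S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top}

w_1 = \arg\max_{\lVert w \rVert = 1} \; w^{\top} S \, w

S \, w_1 = \lambda_1 w_1
```

Under these assumptions, the maximizer w_1 is the eigenvector of S with the largest eigenvalue \lambda_1, and \lambda_1 is the variance of the data along w_1. Each later direction solves the same problem restricted to vectors orthogonal to the earlier ones.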

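PCA is built on the concept of eigenvectors and eigenvalues, typically those of the data's covariance matrix. The k eigenvectors with the largest eigenvalues are selected and assembled into a projection matrix. Transforming the original dataset X with this projection matrix yields Y, the representation of the data in a k-dimensional feature subspace.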
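
The steps described above can be sketched with NumPy as follows. This is a minimal illustration rather than a reference implementation: the function name pca_project, the projection matrix name W, the use of the covariance matrix, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def pca_project(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components.

    Hypothetical helper for illustration; returns (Y, W, eigvals).
    """
    # Center each feature so the covariance is computed about the mean.
    X_centered = X - X.mean(axis=0)

    # Sample covariance matrix of the features (n_features x n_features).
    cov = np.cov(X_centered, rowvar=False)

    # eigh is meant for symmetric matrices and returns eigenvalues
    # in ascending order, so reorder them to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Projection matrix W: the k eigenvectors with the largest eigenvalues.
    W = eigvecs[:, :k]

    # Transform the centered data into the k-dimensional subspace.
    Y = X_centered @ W
    return Y, W, eigvals


# Synthetic, correlated 3-dimensional data (an assumption for the demo).
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

Y, W, eigvals = pca_project(X, k=2)

# Fraction of the total variance retained by the first k components.
retained = eigvals[:2].sum() / eigvals.sum()
print(Y.shape, retained)
```

Here Y holds the k-dimensional representation, and the columns of Y are uncorrelated because W diagonalizes the covariance matrix. The retained-variance ratio is one way to choose k; in practice, library implementations often compute the same result through a singular value decomposition for numerical stability.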