Decorrelation and whitening


Theory

Decorrelation

Let $\mathbf{X}$ be our matrix of input data, with zero mean (each column an observation). Its covariance matrix $\mathbf{\Sigma}$ can be written: \begin{equation} \tag{1} \mathbf{\Sigma} = \operatorname{E} \left[ \mathbf{X} \mathbf{X}^\mathrm{T} \right] \end{equation} When the data in $\mathbf{X}$ are correlated (as they are in general), $\mathbf{\Sigma}$ will not be diagonal.
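As a concrete illustration, a minimal sketch of Eq. (1) using Eigen (the function name and column-per-observation layout are our assumptions, not part of analytics++):

#include <Eigen/Dense>

// Empirical covariance of zero-mean data (Eq. 1).
// Assumes each column of X is one observation.
Eigen::MatrixXd covariance(const Eigen::MatrixXd& X)
{
    const double m = static_cast<double>(X.cols());
    // Divide by m (or m - 1 for the unbiased estimate).
    return (X * X.transpose()) / m;   // sample estimate of E[X X^T]
}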

We therefore seek a transformation (decorrelation) matrix $\mathbf{W}$ such that the transformed data $\mathbf{W} \mathbf{X}$ is decorrelated: \begin{equation} \tag{2} \mathbf{\Sigma}' = \operatorname{E} \left[ \mathbf{W} \mathbf{X} \left( \mathbf{W} \mathbf{X} \right)^\mathrm{T} \right] = \mathbf{\Lambda} \end{equation} where $\mathbf{\Lambda}$ is a diagonal matrix (in fact, containing the eigenvalues of $\mathbf{\Sigma}$; see below). Using the matrix identity $\left( \mathbf{W} \mathbf{X} \right)^\mathrm{T} = \mathbf{X}^\mathrm{T} \mathbf{W}^\mathrm{T}$, and noting that the deterministic $\mathbf{W}$ can be moved outside the expectation: \begin{equation} \tag{3} \mathbf{W} \operatorname{E} \left[ \mathbf{X} \mathbf{X}^\mathrm{T} \right] \mathbf{W}^\mathrm{T} = \mathbf{W} \mathbf{\Sigma} \mathbf{W}^\mathrm{T} = \mathbf{\Lambda} \end{equation} where $\operatorname{E} \left[ \mathbf{X} \mathbf{X}^\mathrm{T} \right]$ is recognized as our original covariance matrix.

Noting that the covariance matrix is always diagonalizable (it is real and symmetric), an eigendecomposition can be performed: \begin{equation} \tag{4} \mathbf{\Sigma} = \mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^{-1} \end{equation} where $\mathbf{Q}$ is a square ($N \times N$) matrix whose $i$th column is the eigenvector $q_i$ of $\mathbf{\Sigma}$, and $\mathbf{\Lambda}$ is the diagonal matrix whose diagonal elements $\mathbf{\Lambda}_{ii} = \lambda_i$ are the corresponding eigenvalues. Inserting Eq. (4) into (3) gives: \begin{equation} \tag{5} \mathbf{W} \mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^{-1} \mathbf{W}^\mathrm{T} = \mathbf{\Lambda} \end{equation} Because $\mathbf{\Sigma}$ is symmetric, $\mathbf{Q}$ can be chosen orthogonal, so that $\mathbf{Q}^{-1} = \mathbf{Q}^\mathrm{T}$, and the decorrelation matrix is obvious: \begin{equation} \tag{6} \mathbf{W} = \mathbf{Q}^\mathrm{T} \end{equation} since then $\mathbf{Q}^\mathrm{T} \mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^\mathrm{T} \mathbf{Q} = \mathbf{\Lambda}$.
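A sketch of Eqs. (4)-(6) in Eigen (again, the function name is ours; this is not the analytics++ implementation):

#include <Eigen/Dense>

// Decorrelation matrix W = Q^T (Eq. 6).
Eigen::MatrixXd decorrelation_matrix(const Eigen::MatrixXd& Sigma)
{
    // Sigma is symmetric, so SelfAdjointEigenSolver applies and
    // returns an orthogonal Q (columns are the eigenvectors q_i).
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(Sigma);
    return es.eigenvectors().transpose();
}

// Usage: Y = W * X then has diagonal covariance Lambda.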

$\mathbf{Q}$ can be obtained using principal component analysis: with observations stored as the columns of $\mathbf{X}$, its columns are the left singular vectors of $\mathbf{X}$, obtained via singular value decomposition. (This is also known as the discrete Karhunen–Loève transform (KLT).)
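The SVD route, as a sketch in Eigen: for $\mathbf{X} = \mathbf{U} \mathbf{S} \mathbf{V}^\mathrm{T}$, we have $\mathbf{X} \mathbf{X}^\mathrm{T} = \mathbf{U} \mathbf{S}^2 \mathbf{U}^\mathrm{T}$, so $\mathbf{Q} = \mathbf{U}$ and the eigenvalues are $\lambda_i = s_i^2 / m$ for $m$ observations:

#include <Eigen/Dense>

// Obtain Q from the SVD of the (zero-mean) data matrix itself,
// avoiding explicit formation of Sigma.
Eigen::MatrixXd q_from_svd(const Eigen::MatrixXd& X)
{
    Eigen::BDCSVD<Eigen::MatrixXd> svd(X, Eigen::ComputeThinU);
    return svd.matrixU();   // columns are the eigenvectors of Sigma
}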

Whitening

The diagonal elements of $\mathbf{\Lambda}$ are the variances of the data along the associated eigenvectors of $\mathbf{\Sigma}$. If they differ, the variance along each direction will be different (i.e., surfaces of constant probability density will be elliptical). Making them all equal to unity is called whitening the data. In other words, we seek: \begin{equation} \tag{7} \mathbf{\Sigma}' = \operatorname{E} \left[ \mathbf{W} \mathbf{X} \left( \mathbf{W} \mathbf{X} \right)^\mathrm{T} \right] = \mathbf{I} \end{equation} Examination of Eq. (5) suggests how to choose the matrix $\mathbf{W}$: \begin{equation} \tag{8} \mathbf{W} = \mathbf{\Lambda}^{-1/2} \mathbf{Q}^\mathrm{T} \end{equation} since then $\mathbf{W} \mathbf{\Sigma} \mathbf{W}^\mathrm{T} = \mathbf{\Lambda}^{-1/2} \mathbf{\Lambda} \mathbf{\Lambda}^{-1/2} = \mathbf{I}$.
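A sketch of Eq. (8) in Eigen; the epsilon guard against (near-)zero eigenvalues is our addition, not part of the derivation:

#include <Eigen/Dense>

// Whitening matrix W = Lambda^{-1/2} Q^T (Eq. 8).
Eigen::MatrixXd whitening_matrix(const Eigen::MatrixXd& Sigma)
{
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(Sigma);
    const double eps = 1e-12;  // avoid dividing by ~0 variances
    Eigen::VectorXd inv_sqrt =
        es.eigenvalues().array().max(eps).rsqrt().matrix();
    return inv_sqrt.asDiagonal() * es.eigenvectors().transpose();
}

// Usage: Y = W * X then has identity covariance (Eq. 7).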

analytics++

It is up to the user to preprocess the data, via the preprocess_data() function:

std::pair<DataSet, Matrix> preprocess_data(DataSet)

By default, xxx.

For advanced use, the following subroutines exist:

std::pair<DataSet, Matrix> decorrelate_data(DataSet)
std::pair<DataSet, Matrix> whiten_data(DataSet)
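A hypothetical usage sketch: only the three signatures above come from analytics++; how a DataSet is constructed, and the use of C++17 structured bindings, are our assumptions.

// Assumed: data is a zero-mean DataSet obtained elsewhere.
// Each call returns the transformed data together with the
// matrix W that produced it (Q^T for decorrelation,
// Lambda^{-1/2} Q^T for whitening), so W can be reapplied
// to new observations.
auto [decorrelated, W_dec]   = decorrelate_data(data);
auto [whitened,     W_white] = whiten_data(data);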