Standardization

From stats++ wiki

Because the independent variables that describe a data set are often measured in different units and span different ranges, it is important to transform them to a common scale. Standardization is the process of transforming data to have zero mean and unit standard deviation. Given a collection of data points $x$, the equation that accomplishes this is: \begin{equation} \tag{1} x' = \frac{x - \bar{x}}{s} \end{equation} where $x'$ is the transformed data, $\bar{x}$ is the sample mean, and $s$ is the sample standard deviation.
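As a small worked example of Eq. (1), take the three data points $x = (2, 4, 6)$. The numbers are illustrative only, and $s$ is assumed here to use the $n - 1$ denominator, which this page does not state explicitly: \begin{equation*} \bar{x} = \frac{2 + 4 + 6}{3} = 4, \qquad s = \sqrt{\frac{(2-4)^2 + (4-4)^2 + (6-4)^2}{3 - 1}} = 2, \qquad x' = \left( \frac{2-4}{2}, \; \frac{4-4}{2}, \; \frac{6-4}{2} \right) = (-1, 0, 1) \end{equation*} The transformed values have mean $0$ and sample standard deviation $\sqrt{(1 + 0 + 1)/2} = 1$, as intended.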

Destandardization is the reverse transformation: \begin{equation} \tag{2} x = x' s + \bar{x} \end{equation}
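Equations (1) and (2) can be sketched in plain C++ as follows. This is a minimal illustration on a single variable stored in a std::vector<double>, not the stats++ implementation: the names Scale, fit_scale, standardize, and destandardize are hypothetical, and the sample standard deviation is assumed to use the $n - 1$ denominator.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper type (not part of stats++): the two statistics
// needed both to apply and to undo the transformation.
struct Scale {
    double mean;  // sample mean, x-bar
    double sd;    // sample standard deviation, s
};

// Compute the sample mean and the sample standard deviation.
// Assumption: s uses the n - 1 denominator.
Scale fit_scale(const std::vector<double>& x) {
    if (x.size() < 2) {
        throw std::invalid_argument("need at least two data points");
    }
    double sum = 0.0;
    for (double v : x) sum += v;
    const double mean = sum / static_cast<double>(x.size());

    double ss = 0.0;  // sum of squared deviations from the mean
    for (double v : x) ss += (v - mean) * (v - mean);
    const double sd = std::sqrt(ss / static_cast<double>(x.size() - 1));

    if (sd == 0.0) {
        // Eq. (1) would divide by zero for constant data.
        throw std::invalid_argument("constant data cannot be standardized");
    }
    return Scale{mean, sd};
}

// Eq. (1): x' = (x - mean) / s
std::vector<double> standardize(const std::vector<double>& x, const Scale& sc) {
    std::vector<double> out;
    out.reserve(x.size());
    for (double v : x) out.push_back((v - sc.mean) / sc.sd);
    return out;
}

// Eq. (2): x = x' * s + mean
std::vector<double> destandardize(const std::vector<double>& xp, const Scale& sc) {
    std::vector<double> out;
    out.reserve(xp.size());
    for (double v : xp) out.push_back(v * sc.sd + sc.mean);
    return out;
}
```

The mean and standard deviation are kept in a separate struct because destandardization needs exactly the values that were used for standardization. Applying destandardize to the output of standardize with the same Scale returns the original data up to floating-point rounding.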

stats++

In stats++, once a Preprocessor object has been created, its standardize_data() function can be used to standardize data:

Matrix<double> Preprocessor::standardize_data(Matrix<double> X)

and its destandardize_data() function applies the reverse transformation:

Matrix<double> Preprocessor::destandardize_data(Matrix<double> X)