Normalization

From stats++ wiki

Normalization (here, min-max scaling) rescales data to lie in the range $[0,1]$. Given a collection of data points $x$, the equation that accomplishes this is: \begin{equation} \tag{1} x' = \frac{x - x_\text{min}}{x_\text{max} - x_\text{min}} \end{equation} where $x'$ is the transformed data, and $x_\text{min}$ and $x_\text{max}$ are the minimum and maximum of $x$. Eq. (1) is undefined when $x_\text{max} = x_\text{min}$, that is, when all data points are equal.
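
As a small worked example of Eq. (1) (the numbers are purely illustrative and are not taken from stats++), consider $x = \{2, 4, 10\}$. Here $x_\text{min} = 2$ and $x_\text{max} = 10$, so: \begin{equation*} x' = \left\{ \frac{2 - 2}{10 - 2},\ \frac{4 - 2}{10 - 2},\ \frac{10 - 2}{10 - 2} \right\} = \{0,\ 0.25,\ 1\} \end{equation*} The minimum maps to 0, the maximum maps to 1, and every other point falls in between.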

Denormalization is the reverse transformation: \begin{equation} \tag{2} x = x' (x_\text{max} - x_\text{min}) + x_\text{min} \end{equation}
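
Eqs. (1) and (2) can be sketched in standalone C++ as follows. This is a minimal illustration on a one-dimensional std::vector<double>, not the stats++ implementation: the MinMaxScaler type and its member names are hypothetical, and mapping constant data (where $x_\text{max} = x_\text{min}$) to 0 is an assumption made here to avoid division by zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration of Eqs. (1) and (2); NOT the stats++ implementation.
struct MinMaxScaler
{
    double x_min;
    double x_max;

    // Record the minimum and maximum of the (non-empty) reference data.
    explicit MinMaxScaler(const std::vector<double>& x)
    {
        assert(!x.empty());
        const auto mm = std::minmax_element(x.begin(), x.end());
        x_min = *mm.first;
        x_max = *mm.second;
    }

    // Eq. (1): x' = (x - x_min) / (x_max - x_min)
    std::vector<double> normalize(const std::vector<double>& x) const
    {
        const double range = x_max - x_min;
        std::vector<double> x_norm(x.size(), 0.0);
        if (range == 0.0)
            return x_norm; // assumption: constant data maps to 0
        for (std::size_t i = 0; i < x.size(); ++i)
            x_norm[i] = (x[i] - x_min) / range;
        return x_norm;
    }

    // Eq. (2): x = x' (x_max - x_min) + x_min
    std::vector<double> denormalize(const std::vector<double>& x_norm) const
    {
        std::vector<double> x(x_norm.size());
        for (std::size_t i = 0; i < x_norm.size(); ++i)
            x[i] = x_norm[i] * (x_max - x_min) + x_min;
        return x;
    }
};
```

With this sketch, constructing MinMaxScaler from {3, -1, 7} and calling normalize() on the same data gives {0.5, 0, 1}; passing that result to denormalize() recovers {3, -1, 7}. The scaler stores $x_\text{min}$ and $x_\text{max}$ because Eq. (2) cannot be evaluated from the normalized data alone.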

stats++ (header-only)

Normalization and denormalization are handled in stats++ through the statsxx::data::Preprocessor object, declared in statsxx/data.hpp.

Normalization

The normalize_data() function can be used to normalize data:

Matrix<double> statsxx::data::Preproccessor::normalize_data(Matrix<double> X)

Denormalization

denormalize_data() gives the reverse transformation:

Matrix<double> statsxx::data::Preproccessor::denormalize_data(Matrix<double> X)

Example code

// jScience
#include "jScience/linalg.hpp" // Matrix<>
 
// stats++
#include "statsxx/data.hpp"    // statsxx::data::Preproccessor
 
 
int main(int argc, char* argv[])
{
    Matrix<double> X;
 
    // populate X with data
 
    statsxx::data::Preproccessor pp(X);
 
    Matrix<double> X_pp = pp.normalize_data(X);
 
    Matrix<double> X_dpp = pp.denormalize_data(X_pp); // recovers X (up to floating-point rounding)
 
    return 0;
}

stats++ (executable)

Normalization and denormalization are handled in stats++ through the statsxx::data::Preprocessor object, declared in statsxx/data.hpp.