Gaussian processes in stats++

From stats++ wiki
Jump to: navigation, search

The following describes the implementation of Gaussian processes in stats++. Familiarity with the theory behind Gaussian processes is assumed.

Introduction

In stats++, all Gaussian process functionality can included in a project via the following header file:

#include "statsxx/machine_learning/Gaussian_process.hpp"

It is important to note that all subroutines and classes that provide this functionality are encapsulated in the Gaussian_process namespace:

namespace Gaussian_process
{
    ...
}

Model selection

The first step in constructing a Gaussian process is to specify its underlying model. Recall that a Gaussian process is fully specified by a mean function $\mu(\mathbf{x})$ and covariance function $k(\mathbf{x},\mathbf{x}')$. Model selection is concerned with choosing their functional forms and values for any hyperparameters (including those inherent to the Gaussian process).

The hyperparameters of optimize_hyperparameters():

std::pair<std::vector<double>,
          double> Gaussian_process::optimize_hyperparameters(std::unique_ptr<CovarianceFunction> const &k,
                                                             std::vector<double> theta,
                                                             const Matrix<double> &X,
                                                             const Vector<double> &y,
                                                             double sigma_n)

As an example code to optimize the hyperparameters of the squared exponential covariance function and noise in a dataset is as follows:

Matrix<double> X = ...;
Vector<double> y = ...;
double sigma_n = ...;
 
std::vector<double> theta = ...;
std::unique_ptr<CovarianceFunction> k = std::unique_ptr<SquaredExponential>(new SquaredExponential(theta));
 
std::vector<double> theta_opt;
double sigma_n_opt;
std::tie(theta_opt, sigma_n_opt) = Gaussian_process::optimize_hyperparameters(k, theta, X, y, sigma_n);
 
k->assign_hyperparameters(theta_opt);

Constructing a Gaussian process

Once a model has been selected, it can be used to construct a Gaussian process. In stats++, this is accomplished via the GaussianProcess class:

class GaussianProcess
{
    ...
}

The GaussianProcess constructor reflects the fact that it is fully specified by $\mu(\mathbf{x})$ and $k(\mathbf{x},\mathbf{x}')$ (the former assumed 0):

Gaussian_process::GaussianProcess::GaussianProcess(std::unique_ptr<CovarianceFunction> const &k_arg)

As an example code to construct a Gaussian process with the squared exponential covariance function is as follows:

std::vector<double> theta = ...;
std::unique_ptr<CovarianceFunction> cf = std::unique_ptr<SquaredExponential>(new SquaredExponential(theta));
Gaussian_process::GaussianProcess gp(cf);

Adding training data

After the Gaussian process has been constructed, the next step is to add training data; this is accomplished via the following function:

void Gaussian_process::GaussianProcess::add_data(const Matrix<double> &X_arg, 
                                                 const Vector<double> &y_arg, 
                                                 const double sigma_n_arg)

After data has been added, the likelihood of the data (and its derivatives with respect to the hyperparameters) can be calculated using the functions:

double GaussianProcess::likelihood()
std::vector<double> GaussianProcess::dlikelihood()

Prediction

After training data has been added to a Gaussian process, it can be used to make predictions about new observations.

void Gaussian_process::GaussianProcess::predict(const Vector<double> &x)