# Gaussian processes in stats++

The following describes the implementation of Gaussian processes in stats++. Familiarity with the theory behind Gaussian processes is assumed.

## Contents

- [Introduction](#introduction)
- [Model selection](#model-selection)
- [Constructing a Gaussian process](#constructing-a-gaussian-process)
- [Adding training data](#adding-training-data)
- [Prediction](#prediction)

## Introduction

In stats++, all **Gaussian process** functionality can be included in a project via the following header file:

```cpp
#include "statsxx/machine_learning/Gaussian_process.hpp"
```

It is important to note that all subroutines and classes that provide this functionality are encapsulated in the `Gaussian_process` namespace:

```cpp
namespace Gaussian_process { ... }
```

## Model selection

The first step in constructing a Gaussian process is to specify its underlying model. Recall that a Gaussian process is fully specified by a mean function $\mu(\mathbf{x})$ and covariance function $k(\mathbf{x},\mathbf{x}')$. **Model selection** is concerned with choosing their functional forms and values for any hyperparameters (including those inherent to the Gaussian process).
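Concretely, hyperparameter optimization in Gaussian process regression is conventionally framed as maximizing the log marginal likelihood of the training targets $\mathbf{y}$ at inputs $X$ (a standard result, not specific to stats++):

$$
\log p(\mathbf{y} \mid X, \boldsymbol{\theta}) = -\frac{1}{2}\,\mathbf{y}^{\top}\left(K + \sigma_n^2 I\right)^{-1}\mathbf{y} - \frac{1}{2}\log\left|K + \sigma_n^2 I\right| - \frac{n}{2}\log 2\pi,
$$

where $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$ depends on the hyperparameters $\boldsymbol{\theta}$ and $\sigma_n^2$ is the noise variance.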

The hyperparameters of the covariance function, together with the noise level $\sigma_n$, can be optimized via `optimize_hyperparameters()`:

```cpp
std::pair<std::vector<double>, double>
Gaussian_process::optimize_hyperparameters(std::unique_ptr<CovarianceFunction> const &k,
                                           std::vector<double> theta,
                                           const Matrix<double> &X,
                                           const Vector<double> &y,
                                           double sigma_n);
```

As an example, code to optimize the hyperparameters of the squared exponential covariance function, together with the noise level of a dataset, is as follows:

```cpp
// Training data and initial values.
Matrix<double> X = ...;
Vector<double> y = ...;
double sigma_n = ...;
std::vector<double> theta = ...;

std::unique_ptr<CovarianceFunction> k =
    std::unique_ptr<SquaredExponential>(new SquaredExponential(theta));

// Optimize the hyperparameters and the noise level.
std::vector<double> theta_opt;
double sigma_n_opt;
std::tie(theta_opt, sigma_n_opt) =
    Gaussian_process::optimize_hyperparameters(k, theta, X, y, sigma_n);

// Store the optimized hyperparameters in the covariance function.
k->assign_hyperparameters(theta_opt);
```

## Constructing a Gaussian process

Once a model has been selected, it can be used to construct a Gaussian process. In stats++, this is accomplished via the `GaussianProcess` class:

```cpp
class GaussianProcess { ... };
```

The `GaussianProcess` constructor reflects the fact that a Gaussian process is fully specified by $\mu(\mathbf{x})$ and $k(\mathbf{x},\mathbf{x}')$; the mean function is assumed to be identically zero, so only the covariance function is passed:

```cpp
Gaussian_process::GaussianProcess::GaussianProcess(std::unique_ptr<CovarianceFunction> const &k_arg);
```

As an example, code to construct a Gaussian process with the squared exponential covariance function is as follows:

```cpp
std::vector<double> theta = ...;
std::unique_ptr<CovarianceFunction> cf =
    std::unique_ptr<SquaredExponential>(new SquaredExponential(theta));

Gaussian_process::GaussianProcess gp(cf);
```
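For reference, the squared exponential covariance function itself is straightforward to compute. The standalone sketch below assumes a hypothetical parameterization `theta = {sigma_f, ell}` (signal standard deviation and length scale); the parameterization used by `SquaredExponential` in stats++ may differ.

```cpp
#include <cmath>
#include <vector>

// k(x, x') = sigma_f^2 * exp(-||x - x'||^2 / (2 * ell^2))
// Illustrative sketch only; not the stats++ implementation.
double squared_exponential(const std::vector<double> &x,
                           const std::vector<double> &xp,
                           double sigma_f, double ell) {
    // Squared Euclidean distance ||x - x'||^2.
    double r2 = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - xp[i];
        r2 += d * d;
    }
    return sigma_f * sigma_f * std::exp(-r2 / (2.0 * ell * ell));
}
```

Note that the covariance is maximal ($\sigma_f^2$) when $\mathbf{x} = \mathbf{x}'$ and decays smoothly with distance, at a rate controlled by the length scale.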

## Adding training data

After the Gaussian process has been constructed, the next step is to add training data; this is accomplished via the following function:

```cpp
void Gaussian_process::GaussianProcess::add_data(const Matrix<double> &X_arg,
                                                 const Vector<double> &y_arg,
                                                 const double sigma_n_arg);
```

After data has been added, the likelihood of the data (and its derivatives with respect to the hyperparameters) can be calculated using the functions:

```cpp
double GaussianProcess::likelihood();
std::vector<double> GaussianProcess::dlikelihood();
```
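For intuition, the log marginal likelihood of a zero-mean Gaussian process can be written out by hand when there is a single training point, since the matrix algebra reduces to scalars. The standalone sketch below is illustrative only and is not the stats++ implementation:

```cpp
#include <cmath>

// Log marginal likelihood of a zero-mean GP for one training
// observation y, with prior variance k_xx = k(x, x) and noise
// standard deviation sigma_n. Illustrative sketch only.
double log_marginal_likelihood_1pt(double y, double k_xx, double sigma_n) {
    const double pi = 3.14159265358979323846;
    // (K + sigma_n^2 I) is the scalar v when n = 1.
    const double v = k_xx + sigma_n * sigma_n;
    // -1/2 y^T v^{-1} y - 1/2 log|v| - n/2 log(2 pi), with n = 1.
    return -0.5 * y * y / v - 0.5 * std::log(v) - 0.5 * std::log(2.0 * pi);
}
```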

## Prediction

After training data has been added to a Gaussian process, it can be used to make predictions about new observations; this is accomplished via the following function:

```cpp
void Gaussian_process::GaussianProcess::predict(const Vector<double> &x);
```