`gptoolsStan`

is a minimal package to publish Stan code
for efficient Gaussian process inference. The package can be used with
the `cmdstanr`

interface for Stan in R. Unfortunately, `Rstan`

is not
supported because it does
not provide an option to specify include paths. If you’re already
familiar with `cmdstanr`

, dive in below. If not, have a look
at the getting
started guide for the `cmdstanr`

interface.

This vignette demonstrates the package by sampling from a simple Gaussian process using Fourier methods (see the accompanying publication “Scalable Gaussian Process Inference with Stan” for background on the approach). Here is the model definition in Stan.

```
functions {
// Include utility functions, such as real fast Fourier transforms.
#include gptools/util.stan
// Include functions to evaluate GP likelihoods with Fourier methods.
#include gptools/fft.stan
}
data {
// The number of sample points.
int<lower=1> n;
// Real fast Fourier transform of the covariance kernel.
vector[n %/% 2 + 1] cov_rfft;
}
parameters {
// GP value at the `n` sampling points.
vector[n] f;
}
model {
// Sampling statement to indicate that `f` is a GP.
f ~ gp_rfft(zeros_vector(n), cov_rfft);
}
```

Here, we assume that `cmdstanr`

is installed and that the
`cmdstan`

compiler is available. See here if you need
help getting set up with `cmdstanr`

. Let’s compile the
model.

```
library(cmdstanr)
#> This is cmdstanr version 0.7.0
#> - CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
#> - CmdStan path: /Users/till/.cmdstan/cmdstan-2.33.1
#> - CmdStan version: 2.33.1
library(gptoolsStan)
model <- cmdstan_model(
stan_file="getting_started.stan",
include_paths=gptools_include_path(),
)
```

As indicated in the `data`

section of the Stan program, we
need to define the number of elements `n`

of the Gaussian
process and the real
fast Fourier transform (RFFT) of the covariance kernel
`cov_rfft`

. We’ll use 100 elements and a squared
exponential covariance kernel which allows us to evaluate the RFFT
directly.

```
n <- 100
length_scale <- n / 10
freq <- 1:(n %/% 2 + 1) - 1
# See appendix B of https://arxiv.org/abs/2301.08836 for details on the expression.
cov_rfft <- exp(- 2 * (pi * freq * length_scale / n) ^ 2) + 1e-9
```

Let’s obtain prior realization by sampling from the model.

```
fit <- model$sample(
data=list(n=n, cov_rfft=cov_rfft),
seed=123,
chains=1,
iter_warmup=1000,
iter_sampling=5,
)
#> Running MCMC with 1 chain...
#>
#> Chain 1 Iteration: 1 / 1005 [ 0%] (Warmup)
#> Chain 1 Iteration: 100 / 1005 [ 9%] (Warmup)
#> Chain 1 Iteration: 200 / 1005 [ 19%] (Warmup)
#> Chain 1 Iteration: 300 / 1005 [ 29%] (Warmup)
#> Chain 1 Iteration: 400 / 1005 [ 39%] (Warmup)
#> Chain 1 Iteration: 500 / 1005 [ 49%] (Warmup)
#> Chain 1 Iteration: 600 / 1005 [ 59%] (Warmup)
#> Chain 1 Iteration: 700 / 1005 [ 69%] (Warmup)
#> Chain 1 Iteration: 800 / 1005 [ 79%] (Warmup)
#> Chain 1 Iteration: 900 / 1005 [ 89%] (Warmup)
#> Chain 1 Iteration: 1000 / 1005 [ 99%] (Warmup)
#> Chain 1 Iteration: 1001 / 1005 [ 99%] (Sampling)
#> Chain 1 Iteration: 1005 / 1005 [100%] (Sampling)
#> Chain 1 finished in 6.7 seconds.
#> Warning: 5 of 5 (100.0%) transitions hit the maximum treedepth limit of 10.
#> See https://mc-stan.org/misc/warnings for details.
```

We extract the draws from the `fit`

object and plot a
realization of the process.

```
f <- fit$draws("f", format="draws_matrix")
plot(1:n, f[1,], type="l", xlab="covariate x", ylab="Gaussian process f(x)")
```

This vignette illustrates the use of gptools with an elementary
example. Further details can be found in the more comprehensive
documentation although using the `cmdstanpy`

interface.