How can I choose the best kernel for a Gaussian process regression, possibly using "bayesopt" function? How can I predict unseen data with the resulting model?

How can I choose the best kernel for a Gaussian process regression, possibly using "bayesopt" function?
For example, by setting this variable:
kernel=optimizableVariable('KernelFunction',{'exponential','squaredexponential','matern32','matern52',...
'rationalquadratic','ardexponential','ardsquaredexponential','ardmatern32','ardmatern52','ardrationalquadratic'},'Type','categorical')
How can I predict unseen data with the resulting model?

Accepted Answer

MathWorks Support Team on 9 Mar 2021
Edited: MathWorks Support Team on 9 Mar 2021
There are two ways you could do this; one of them uses the "bayesopt" function.
 
(1) The easiest way is to use the "fitrgp" function. The "fitrgp" documentation page includes an example of optimizing one of its hyperparameters (Sigma).
 
To optimize 'KernelFunction', using default values for all other hyperparameters, you could do the following (using data from the documentation example):
load(fullfile(matlabroot,'examples','stats','gprdata2.mat'))
model1 = fitrgp(x,y,'OptimizeHyperparameters',{'KernelFunction'});
Because fitrgp tends to be sensitive to the Sigma parameter, you will probably get a better result if you optimize KernelFunction and Sigma simultaneously. Also, searching over a categorical variable inherently requires more function evaluations, so allow a larger budget:
model2 = fitrgp(x,y,'OptimizeHyperparameters',{'KernelFunction','Sigma'},...
    'HyperparameterOptimizationOptions', struct('MaxObjectiveEvaluations',100));
The "fitrgp" documentation page lists the hyperparameters that are eligible for optimization and which values are searched.
You can use this model to make predictions on unseen data directly with the "predict" function:
% Predict on unseen data
xNew = 0.2;
ypred = predict(model2, xNew)
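If you also want uncertainty estimates, "predict" for a Gaussian process regression model can return the predictive standard deviations and prediction intervals as additional outputs (a short sketch, reusing model2 and xNew from above; the intervals default to 95% coverage):
```matlab
% Request predictive standard deviations and 95% prediction
% intervals along with the point predictions.
[ypred, ysd, yint] = predict(model2, xNew);
% yint(1) and yint(2) bound the new response with ~95% probability;
% ysd is the standard deviation of the predictive distribution.
```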
(2) A second approach is to use the "bayesopt" function directly, defining your own objective function. Here, the objective function is the 5-fold cross-validation loss of the model.
load(fullfile(matlabroot,'examples','stats','gprdata2.mat')) 
% Define kernel variable
kernel = optimizableVariable('KernelFunction',{'exponential','squaredexponential','matern32','matern52',...
'rationalquadratic','ardexponential','ardsquaredexponential','ardmatern32','ardmatern52','ardrationalquadratic'},...
'Type','categorical')
% Call bayesopt, capturing x and y in the objective function
bo = bayesopt(@(T)objFcn(T,x,y), kernel)
% Define an objective function
function Loss = objFcn(Vars, x, y)
m = fitrgp(x, y, 'KernelFunction', char(Vars.KernelFunction), 'KFold', 5);
Loss = kfoldLoss(m);
end
(2.1) To optimize both KernelFunction and Sigma with the "bayesopt" function, you can do the following:
% Define 2 variables
kernel = optimizableVariable('KernelFunction',{'exponential','squaredexponential','matern32','matern52',...
'rationalquadratic','ardexponential','ardsquaredexponential','ardmatern32','ardmatern52','ardrationalquadratic'},...
'Type','categorical')
sigma = optimizableVariable('Sigma',[1e-4,10],'Transform','log')
 
% Call bayesopt, capturing x and y in the objective function
bo = bayesopt(@(T)objFcn(T,x,y), [sigma, kernel], 'MaxObjectiveEvaluations', 100)
 
function Loss = objFcn(Vars, x, y)
m = fitrgp(x, y, 'KernelFunction', char(Vars.KernelFunction), ...
                 'Sigma', Vars.Sigma, 'ConstantSigma', true,...
                 'KFold', 5);
Loss = kfoldLoss(m);
end
To get a model that can make predictions, refit without the 'KFold' argument, keeping the optimal hyperparameters found by "bayesopt":
kernel = bo.XAtMinObjective.KernelFunction;
sigma = bo.XAtMinObjective.Sigma;
model = fitrgp(x, y, 'KernelFunction', char(kernel), 'Sigma', sigma, 'ConstantSigma', true);
% Predict on unseen data
xNew = 0.2;
ypred = predict(model, xNew)
  2 Comments
rakbar on 7 Jan 2019
Thank you for this great tutorial. The kernel optimization step only looks for the best single kernel type, e.g., ardsquaredexponential.
How can this method be modified to search over kernel compositions?
For example: kernel = ardsquaredexponential * ardexponential + rationalquadratic.
Often, a kernel composed of several kernel types can describe the signal much better.
Sterling Baird on 7 Dec 2020
Nothing built-in, but I imagine you can specify a custom kernel function that is, e.g., ardsquaredexponential * ardexponential + rationalquadratic.
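As a sketch of that suggestion: "fitrgp" accepts a custom kernel as a function handle of the form kfcn(XN, XM, theta), passed via 'KernelFunction' together with initial 'KernelParameters'. The composite kernel below (a sum of a squared-exponential term and an exponential term, with parameters on the log scale) is an illustrative assumption, not the exact composition from the comment:
```matlab
% Custom composite kernel: squared-exponential + exponential.
% theta holds log(lengthscale1), log(sigma1), log(lengthscale2), log(sigma2).
% fitrgp expects kfcn(XN, XM, theta) to return the n-by-m kernel matrix.
kfcn = @(XN, XM, theta) ...
    exp(theta(2))^2 * exp(-pdist2(XN, XM).^2 / (2*exp(theta(1))^2)) + ...
    exp(theta(4))^2 * exp(-pdist2(XN, XM) / exp(theta(3)));

theta0 = [0; 0; 0; 0];   % initial values on the log scale
mdlCustom = fitrgp(x, y, 'KernelFunction', kfcn, 'KernelParameters', theta0);
ypredCustom = predict(mdlCustom, 0.2);
```
The initial parameter values are just a starting point; fitrgp optimizes theta during fitting.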



Release

R2016b
