Predict responses for Gaussian kernel regression model



YFit = predict(Mdl,X) returns a predicted response for each observation in the predictor data X based on the Gaussian kernel regression model Mdl.


Predict the test set responses using a Gaussian kernel regression model for the carbig data set.

Load the carbig data set.

load carbig

Specify the predictor variables (X) and the response variable (Y).

X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;

Delete rows of X and Y where either array has NaN values. Removing rows with NaN values before passing data to fitrkernel can speed up training and reduce memory usage.

R = rmmissing([X Y]); 
X = R(:,1:4); 
Y = R(:,end); 
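
As a small illustration of the row removal above, the following sketch applies rmmissing to a made-up matrix. The values in A are hypothetical and are not taken from carbig. Any row that contains at least one NaN is dropped.

```matlab
% Hypothetical 4-by-3 matrix; rows 2 and 3 each contain a NaN.
A = [1   2   10;
     NaN 3   20;
     4   5   NaN;
     6   7   30];

R = rmmissing(A);   % Remove every row that has at least one NaN

disp(R)             % Only rows 1 and 4 of A remain
```

Concatenating X and Y before calling rmmissing, as the example does, keeps the predictor rows and response values aligned after removal.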

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

rng(10)  % For reproducibility 
N = length(Y); 
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp);    % Test set indices
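
The index vectors returned by training and test are logical masks over the observations. A minimal sketch, assuming a hypothetical sample size of 100, that checks that the two masks split the data without overlap:

```matlab
rng(10)                               % For reproducibility
N = 100;                              % Hypothetical number of observations
cvp = cvpartition(N,'Holdout',0.1);   % Hold out about 10% for testing
idxTrn = training(cvp);               % Logical N-by-1 vector, true for training rows
idxTest = test(cvp);                  % Logical N-by-1 vector, true for test rows

% Every observation belongs to exactly one of the two sets.
assert(all(xor(idxTrn,idxTest)))
fprintf('Training: %d, test: %d\n',nnz(idxTrn),nnz(idxTest))
```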

Standardize the training data and train the kernel regression model.

Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
[Ztrain,tr_mu,tr_sigma] = zscore(Xtrain); % Standardize the training data
tr_sigma(tr_sigma==0) = 1; % Avoid division by zero for constant predictors
Mdl = fitrkernel(Ztrain,Ytrain)
Mdl = 
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617

  Properties, Methods

Mdl is a RegressionKernel model.

Standardize the test data using the column means and standard deviations computed from the training data. Predict responses for the test set.

Xtest = X(idxTest,:);
Ztest = (Xtest-tr_mu)./tr_sigma; % Standardize the test data
Ytest = Y(idxTest);

YFit = predict(Mdl,Ztest);
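
The standardization of new data can be checked on a small example. This is a sketch with hypothetical matrices, not the carbig data. It reuses the variable names tr_mu and tr_sigma from the example, and the test row is chosen to sit exactly at the training column means so that its standardized values are all zero.

```matlab
% Hypothetical training and test matrices with three predictors.
Xtrain = [1 10 5; 2 20 5; 3 30 5; 4 40 5];   % Third column is constant
Xtest  = [2.5 25 5];                         % Equals the training column means

[Ztrain,tr_mu,tr_sigma] = zscore(Xtrain);    % Column means and standard deviations
tr_sigma(tr_sigma==0) = 1;                   % Constant column: avoid dividing by zero

Ztest = (Xtest - tr_mu)./tr_sigma;           % Reuse the training statistics
disp(Ztest)                                  % All zeros for this test row
```

Computing the statistics from the training rows only keeps information from the test set out of the fitted model.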

Create a table containing the first 10 observed response values and predicted response values.
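
The command that builds this table does not appear in this text. A minimal sketch that yields a two-column table with the variable names shown in the output, assuming Ytest and YFit from the previous step are still in the workspace:

```matlab
% Assumes Ytest and YFit exist in the workspace from the previous step.
% Omitting the semicolon displays the result as "ans".
table(Ytest(1:10),YFit(1:10), ...
    'VariableNames',{'ObservedValue','PredictedValue'})
```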

ans=10×2 table
    ObservedValue    PredictedValue
    _____________    ______________

         18              17.616    
         14              25.799    
         24              24.141    
         25              25.018    
         14              13.637    
         14              14.557    
         18              18.584    
         27              26.096    
         21              25.031    
         13              13.324    

Estimate the test set regression loss using the mean squared error loss function.

L = loss(Mdl,Ztest,Ytest)
L = 9.2664
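
With the default loss function and no observation weights, the value returned by loss should agree with the mean squared error computed directly from the predictions, up to floating-point round-off. The following sketch checks this on synthetic data. The predictors and response are hypothetical and are generated only for illustration.

```matlab
rng(1)                                        % For reproducibility
X = randn(200,3);                             % Hypothetical predictors
Y = X(:,1) - 2*X(:,2) + 0.1*randn(200,1);     % Hypothetical response

Mdl = fitrkernel(X,Y);                        % Gaussian kernel regression model
YFit = predict(Mdl,X);                        % 200-by-1 vector of predicted responses

L = loss(Mdl,X,Y);                            % Default loss: mean squared error
manualMSE = mean((Y - YFit).^2);              % Same quantity computed directly

fprintf('loss: %.4f, manual MSE: %.4f\n',L,manualMSE)
```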

Input Arguments

Mdl: Kernel regression model, specified as a RegressionKernel model object. You can create a RegressionKernel model object using fitrkernel.

X: Predictor data, specified as an n-by-p numeric matrix, where n is the number of observations and p is the number of predictors. p must equal the number of predictors used to train Mdl.

Data Types: single | double

Output Arguments

YFit: Predicted responses, returned as a numeric vector.

YFit is an n-by-1 vector of the same data type as the response data (Y) used to train Mdl, where n is the number of observations in X.

Introduced in R2018a