Train a Gaussian kernel regression model for a tall array, then calculate the resubstitution mean squared error and epsilon-insensitive error.
When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the mapreducer function.
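As a minimal sketch of the step described above, passing 0 to mapreducer selects the local MATLAB session as the execution environment for the tall array calculations that follow.

```matlab
% Run tall array calculations in the local MATLAB session
% instead of a parallel pool.
mapreducer(0)
```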
Create a datastore that references the folder location with the data. The data can be contained in a single file, a collection of files, or an entire folder. Treat 'NA' values as missing data so that datastore replaces them with NaN values. Select a subset of the variables to use. Create a tall table on top of the datastore.
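The following sketch shows one way to carry out these steps. The file name airlinesmall.csv is an assumption (the text does not name the data source; this sample file ships with MATLAB and contains the variables used here), and the variable names ds and tt are chosen for illustration.

```matlab
% Assumption: the data is the airlinesmall.csv sample file on the MATLAB path.
ds = datastore('airlinesmall.csv');

% Treat 'NA' entries as missing so that they are read as NaN.
ds.TreatAsMissing = 'NA';

% Read only the variables needed for this example.
ds.SelectedVariableNames = {'DepTime','ArrTime','ActualElapsedTime'};

% Create a tall table on top of the datastore.
tt = tall(ds);
```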
Specify DepTime and ArrTime as the predictor variables (X) and ActualElapsedTime as the response variable (Y). Select the observations for which ArrTime is later than DepTime.
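A possible implementation of this selection is sketched below. It assumes that the tall table created from the datastore is named tt, and the logical index name daytime is hypothetical.

```matlab
% Assumes tt is the tall table created on top of the datastore.
% Logical index of observations whose arrival time is later than departure time.
daytime = tt.ArrTime > tt.DepTime;

Y = tt.ActualElapsedTime(daytime);        % Response data
X = tt{daytime, {'DepTime','ArrTime'}};   % Predictor data
```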
Standardize the predictor variables.
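One way to standardize the predictors is the zscore function, which centers each column to mean 0 and scales it to standard deviation 1. The sketch assumes the predictor data is stored in X, and the output name Z is chosen for illustration.

```matlab
% Assumes X holds the tall predictor data (DepTime and ArrTime).
Z = zscore(X);   % Standardize each predictor column
```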
Train a default Gaussian kernel regression model with the standardized predictors. Set 'Verbose',0 to suppress diagnostic messages.
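The training call can be sketched as follows, assuming the standardized predictors are stored in Z and the response in Y. The semicolon is omitted so that MATLAB displays the returned model and fit information.

```matlab
% Assumes Z (standardized predictors) and Y (response) are tall arrays.
% 'Verbose',0 suppresses diagnostic messages during training.
[Mdl,FitInfo] = fitrkernel(Z,Y,'Verbose',0)
```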
Mdl =
  RegressionKernel
            PredictorNames: {'x1'  'x2'}
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 64
               KernelScale: 1
                    Lambda: 8.5385e-06
             BoxConstraint: 1
                   Epsilon: 5.9303

  Properties, Methods
FitInfo = struct with fields:
                  Solver: 'LBFGS-tall'
            LossFunction: 'epsiloninsensitive'
                  Lambda: 8.5385e-06
           BetaTolerance: 1.0000e-03
       GradientTolerance: 1.0000e-05
          ObjectiveValue: 30.7814
       GradientMagnitude: 0.0191
    RelativeChangeInBeta: 0.0228
                 FitTime: 52.5822
                 History: []
Mdl is a trained RegressionKernel model, and the structure array FitInfo contains optimization details.
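Determine how well the trained model fits the training data by estimating the resubstitution mean squared error and epsilon-insensitive error. Because both losses are computed on the same observations used for training, they tend to be optimistic estimates of the error on new predictor values.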
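The two resubstitution losses can be computed as sketched below, assuming the model Mdl was trained on Z and Y. By default, loss returns the mean squared error; the 'LossFun' name-value argument selects the epsilon-insensitive loss. Because the inputs are tall arrays, the results are unevaluated tall arrays.

```matlab
% Assumes Mdl is the trained model and Z, Y are the training data.
lossMSE = loss(Mdl,Z,Y)                                  % Resubstitution mean squared error
lossEI = loss(Mdl,Z,Y,'LossFun','epsiloninsensitive')    % Resubstitution epsilon-insensitive error
```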
lossMSE =
  MxNx... tall array
    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :

lossEI =
  MxNx... tall array
    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
Evaluate the tall arrays and bring the results into memory by using gather.
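A minimal sketch of this step, assuming the unevaluated losses are stored in lossMSE and lossEI. Gathering both in a single call lets MATLAB evaluate them together in one pass over the data.

```matlab
% Assumes lossMSE and lossEI are unevaluated tall arrays.
% Evaluate both and return in-memory results.
[lossMSE,lossEI] = gather(lossMSE,lossEI)
```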
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 1.2 sec
Evaluation completed in 1.5 sec