Regression loss for observations not used in training

Description

`L = kfoldLoss(CVMdl)` returns the cross-validated mean squared error (MSE) obtained by the cross-validated, linear regression model `CVMdl`. That is, for every fold, `kfoldLoss` estimates the regression loss for observations that it holds out when it trains using all other observations.

`L` contains a regression loss for each regularization strength in the linear regression models that compose `CVMdl`.

`L = kfoldLoss(CVMdl,Name,Value)` uses additional options specified by one or more `Name,Value` pair arguments. For example, indicate which folds to use for the loss calculation or specify the regression-loss function.

`CVMdl`: Cross-validated, linear regression model
`RegressionPartitionedLinear` model object

Cross-validated, linear regression model, specified as a `RegressionPartitionedLinear` model object. You can create a `RegressionPartitionedLinear` model using `fitrlinear` and specifying any one of the cross-validation name-value pair arguments, for example, `CrossVal`.

To obtain estimates, `kfoldLoss` applies the same data used to cross-validate the linear regression model (`X` and `Y`).
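The held-out evaluation described above can be sketched by hand. The following is a minimal illustration, assuming small synthetic data (the sizes, coefficients, and noise level are made up for this sketch) and uniform observation weights. It evaluates each trained fold model on the observations that the fold held out and places the result next to the per-fold output of `kfoldLoss`; under these assumptions the two columns should agree up to floating-point error.

```matlab
rng(0)                                        % for reproducibility
X = randn(200,5);                             % assumed toy predictors
Y = X*[1; 0; -2; 0; 0.5] + 0.1*randn(200,1);  % assumed toy response

CVMdl = fitrlinear(X,Y,'KFold',5,'Learner','leastsquares');
Lfold = kfoldLoss(CVMdl,'Mode','individual'); % one MSE per fold

manual = zeros(CVMdl.KFold,1);
for k = 1:CVMdl.KFold
    heldOut = test(CVMdl.Partition,k);        % logical index of fold k
    Yhat = predict(CVMdl.Trained{k},X(heldOut,:));
    manual(k) = mean((Y(heldOut) - Yhat).^2); % MSE on held-out rows only
end
disp([Lfold manual])                          % compare side by side
```

Each row of the displayed matrix corresponds to one fold.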

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

`'Folds'`: Fold indices to use for response prediction
`1:CVMdl.KFold` (default) | numeric vector of positive integers

Fold indices to use for response prediction, specified as the comma-separated pair consisting of `'Folds'` and a numeric vector of positive integers. The elements of `Folds` must range from `1` through `CVMdl.KFold`.

**Example:** `'Folds',[1 4 10]`

**Data Types:** `single` | `double`
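As a brief sketch of the `'Folds'` argument, the snippet below restricts the loss calculation to two of five folds. The synthetic data and the choice of folds 1 and 3 are assumptions made only for illustration.

```matlab
rng(0)                                        % for reproducibility
X = randn(200,5);                             % assumed toy predictors
Y = X*[1; 0; -2; 0; 0.5] + 0.1*randn(200,1);  % assumed toy response

CVMdl = fitrlinear(X,Y,'KFold',5);

Lall = kfoldLoss(CVMdl);                 % uses folds 1:CVMdl.KFold (default)
Lsub = kfoldLoss(CVMdl,'Folds',[1 3]);   % uses folds 1 and 3 only
fprintf('All folds: %.4f, folds 1 and 3: %.4f\n',Lall,Lsub)
```

Both calls return a scalar here because the model uses a single regularization strength.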

`'LossFun'`: Loss function
`'mse'` (default) | `'epsiloninsensitive'` | function handle

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss-function name or function handle.

The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar. Also, in the table, $$f\left(x\right)=x\beta +b.$$

*β* is a vector of *p* coefficients. *x* is an observation from *p* predictor variables. *b* is the scalar bias.

Value | Description |
---|---|
`'epsiloninsensitive'` | Epsilon-insensitive loss: $$\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,\left|y-f\left(x\right)\right|-\epsilon \right]$$ |
`'mse'` | MSE: $$\ell \left[y,f\left(x\right)\right]={\left[y-f\left(x\right)\right]}^{2}$$ |

`'epsiloninsensitive'` is appropriate for SVM learners only.

Specify your own function using function handle notation.

Let `n` be the number of observations in `X`. Your function must have this signature:

`lossvalue = lossfun(Y,Yhat,W)`

where:

- The output argument `lossvalue` is a scalar.
- You choose the function name (`lossfun`).
- `Y` is an `n`-dimensional vector of observed responses. `kfoldLoss` passes the input argument `Y` in for `Y`.
- `Yhat` is an `n`-dimensional vector of predicted responses, which is similar to the output of `predict`.
- `W` is an `n`-by-1 numeric vector of observation weights.

Specify your function using `'LossFun',@lossfun`.

**Data Types:** `char` | `string` | `function_handle`
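The signature rules above can be illustrated with a small custom loss. This sketch defines a weighted mean absolute error (the name `maeloss` and the toy vectors are hypothetical, chosen only for this example) and evaluates it directly, so the scalar output can be checked by hand before the handle is passed to `kfoldLoss`.

```matlab
% Weighted mean absolute error with the required (Y,Yhat,W) signature.
maeloss = @(Y,Yhat,W) sum(W.*abs(Y - Yhat))/sum(W);

Y    = [1; 2; 3];      % observed responses (toy values)
Yhat = [1.5; 2; 2];    % predicted responses (toy values)
W    = [1; 1; 2];      % observation weights (toy values)

% (1*0.5 + 1*0 + 2*1)/(1 + 1 + 2) = 0.625
lossvalue = maeloss(Y,Yhat,W)

% With a cross-validated model CVMdl, the handle would be passed as:
%   L = kfoldLoss(CVMdl,'LossFun',maeloss);
```

Dividing by `sum(W)` keeps the result a weighted average whether or not the weights are already normalized.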

`'Mode'`: Loss aggregation level
`'average'` (default) | `'individual'`

Loss aggregation level, specified as the comma-separated pair consisting of `'Mode'` and `'average'` or `'individual'`.

Value | Description |
---|---|
`'average'` | Returns losses averaged over all folds |
`'individual'` | Returns losses for each fold |

**Example:** `'Mode','individual'`

`L`: Cross-validated regression losses
numeric scalar | numeric vector | numeric matrix

Cross-validated regression losses, returned as a numeric scalar, vector, or matrix. The interpretation of `L` depends on `LossFun`.

Let *R* be the number of regularization strengths in the cross-validated models (stored in `numel(CVMdl.Trained{1}.Lambda)`) and *F* be the number of folds (stored in `CVMdl.KFold`).

- If `Mode` is `'average'`, then `L` is a 1-by-*R* vector. `L(j)` is the average regression loss over all folds of the cross-validated model that uses regularization strength *j*.
- Otherwise, `L` is an *F*-by-*R* matrix. `L(i,j)` is the regression loss for fold *i* of the cross-validated model that uses regularization strength *j*.

To estimate `L`, `kfoldLoss` uses the data that created `CVMdl` (see `X` and `Y`).
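The output dimensions described above can be checked with a short sketch. The data, the three regularization strengths, and the four folds are assumptions chosen only to make *R* = 3 and *F* = 4 easy to see.

```matlab
rng(0)                                        % for reproducibility
X = randn(300,6);                             % assumed toy predictors
Y = X(:,1) - 2*X(:,4) + 0.1*randn(300,1);     % assumed toy response

Lambda = [1e-4 1e-3 1e-2];                    % R = 3 regularization strengths
CVMdl = fitrlinear(X,Y,'KFold',4,'Lambda',Lambda,'Learner','leastsquares');

R = numel(CVMdl.Trained{1}.Lambda);           % number of regularization strengths
F = CVMdl.KFold;                              % number of folds

Lavg = kfoldLoss(CVMdl);                      % 1-by-R
Lind = kfoldLoss(CVMdl,'Mode','individual');  % F-by-R
disp(size(Lavg))
disp(size(Lind))
```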

Simulate 10000 observations from this model

$$y={x}_{100}+2{x}_{200}+e.$$

$$X=\{{x}_{1},...,{x}_{1000}\}$$ is a 10000-by-1000 sparse matrix with 10% nonzero standard normal elements.

*e* is random normal error with mean 0 and standard deviation 0.3.

```
rng(1) % For reproducibility
n = 1e4;
d = 1e3;
nz = 0.1;
X = sprandn(n,d,nz);
Y = X(:,100) + 2*X(:,200) + 0.3*randn(n,1);
```

Cross-validate a linear regression model using SVM learners.

```
rng(1); % For reproducibility
CVMdl = fitrlinear(X,Y,'CrossVal','on');
```

`CVMdl` is a `RegressionPartitionedLinear` model. By default, the software implements 10-fold cross-validation. You can alter the number of folds using the `'KFold'` name-value pair argument.

Estimate the average of the test-sample MSEs.

mse = kfoldLoss(CVMdl)

mse = 0.1735

Alternatively, you can obtain the per-fold MSEs by specifying the name-value pair `'Mode','individual'` in `kfoldLoss`.

Simulate data as in Estimate k-Fold Mean Squared Error.

```
rng(1) % For reproducibility
n = 1e4;
d = 1e3;
nz = 0.1;
X = sprandn(n,d,nz);
Y = X(:,100) + 2*X(:,200) + 0.3*randn(n,1);
X = X'; % Put observations in columns for faster training
```

Cross-validate a linear regression model using 10-fold cross-validation. Optimize the objective function using SpaRSA.

```
CVMdl = fitrlinear(X,Y,'CrossVal','on','ObservationsIn','columns',...
    'Solver','sparsa');
```

`CVMdl` is a `RegressionPartitionedLinear` model. It contains the property `Trained`, which is a 10-by-1 cell array holding `RegressionLinear` models that the software trained using the training set.

Create an anonymous function that measures Huber loss ($$\delta = 1$$), that is,

$$L=\frac{1}{\sum {w}_{j}}\sum _{j=1}^{n}{w}_{j}{\ell}_{j},$$

where

$$\ell_{j}=\begin{cases}0.5\,\hat{e}_{j}^{2}; & \left|\hat{e}_{j}\right|\le 1\\ \left|\hat{e}_{j}\right|-0.5; & \left|\hat{e}_{j}\right|>1\end{cases}$$

$\hat{e}_{j}$ is the residual for observation *j*. Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the `'LossFun'` name-value pair argument.

```
huberloss = @(Y,Yhat,W)sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
    ((abs(Y-Yhat)>1).*(abs(Y-Yhat)-0.5))))/sum(W);
```

Estimate the average Huber loss over the folds. Also, obtain the Huber loss for each fold.

`mseAve = kfoldLoss(CVMdl,'LossFun',huberloss)`

`mseFold = kfoldLoss(CVMdl,'LossFun',huberloss,'Mode','individual')`

`mseAve` is a scalar, and `mseFold` is a 10-by-1 vector that contains one Huber loss per fold. Because every term of the Huber loss is nonnegative, both outputs are nonnegative.
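A quick way to sanity-check a Huber-loss handle is to evaluate it on residuals whose losses are easy to compute by hand. The sketch below uses three made-up residuals (one in the quadratic branch, one in the linear branch, one on the boundary) and unit weights; the handle is written so that 0.5 is subtracted only where the absolute residual exceeds 1.

```matlab
huberloss = @(Y,Yhat,W) sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
    ((abs(Y-Yhat)>1).*(abs(Y-Yhat)-0.5))))/sum(W);

Y    = [0; 0; 0];       % toy observed responses
Yhat = [0.5; -2; 1];    % residuals Y - Yhat are -0.5, 2, and -1
W    = ones(3,1);       % unit weights

% Per-observation losses:
%   |e| = 0.5 -> 0.5*0.5^2 = 0.125
%   |e| = 2   -> 2 - 0.5   = 1.5
%   |e| = 1   -> 0.5*1^2   = 0.5
% Weighted average: (0.125 + 1.5 + 0.5)/3
L = huberloss(Y,Yhat,W)
```

The result is nonnegative, as it must be for any input, which makes a negative cross-validated Huber loss a sign of a mistake in the handle.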

To determine a good lasso-penalty strength for a linear regression model that uses least squares, implement 5-fold cross-validation.

Simulate 10000 observations from this model

$$y={x}_{100}+2{x}_{200}+e.$$

$$X=\{{x}_{1},...,{x}_{1000}\}$$ is a 10000-by-1000 sparse matrix with 10% nonzero standard normal elements.

*e* is random normal error with mean 0 and standard deviation 0.3.

```
rng(1) % For reproducibility
n = 1e4;
d = 1e3;
nz = 0.1;
X = sprandn(n,d,nz);
Y = X(:,100) + 2*X(:,200) + 0.3*randn(n,1);
```

Create a set of 15 logarithmically spaced regularization strengths from $$10^{-5}$$ through $$10^{-1}$$.

Lambda = logspace(-5,-1,15);
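As a small check on what `logspace` produces here, the following sketch confirms the endpoints and shows that consecutive strengths differ by a constant factor of $$10^{4/14}$$ (four decades split into 14 steps).

```matlab
Lambda = logspace(-5,-1,15);                  % 15 points from 1e-5 to 1e-1

ratios = Lambda(2:end)./Lambda(1:end-1);      % factor between neighbors
fprintf('First: %g, last: %g, step factor: %.4f\n', ...
    Lambda(1),Lambda(end),ratios(1))
```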

Cross-validate the models. To increase execution speed, transpose the predictor data and specify that the observations are in columns. Optimize the objective function using SpaRSA.

```
X = X';
CVMdl = fitrlinear(X,Y,'ObservationsIn','columns','KFold',5,'Lambda',Lambda,...
    'Learner','leastsquares','Solver','sparsa','Regularization','lasso');
numCLModels = numel(CVMdl.Trained)
```

numCLModels = 5

`CVMdl` is a `RegressionPartitionedLinear` model. Because `fitrlinear` implements 5-fold cross-validation, `CVMdl` contains 5 `RegressionLinear` models that the software trains on each fold.

Display the first trained linear regression model.

Mdl1 = CVMdl.Trained{1}

```
Mdl1 = 
  RegressionLinear
         ResponseName: 'Y'
    ResponseTransform: 'none'
                 Beta: [1000x15 double]
                 Bias: [1x15 double]
               Lambda: [1x15 double]
              Learner: 'leastsquares'

  Properties, Methods
```

`Mdl1` is a `RegressionLinear` model object. `fitrlinear` constructed `Mdl1` by training on the first four folds. Because `Lambda` is a sequence of regularization strengths, you can think of `Mdl1` as 15 models, one for each regularization strength in `Lambda`.

Estimate the cross-validated MSE.

mse = kfoldLoss(CVMdl);

Higher values of `Lambda` lead to predictor variable sparsity, which is a good quality of a regression model. For each regularization strength, train a linear regression model using the entire data set and the same options as when you cross-validated the models. Determine the number of nonzero coefficients per model.

```
Mdl = fitrlinear(X,Y,'ObservationsIn','columns','Lambda',Lambda,...
    'Learner','leastsquares','Solver','sparsa','Regularization','lasso');
numNZCoeff = sum(Mdl.Beta~=0);
```
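The counting step relies on `sum` operating down the columns of a logical matrix, so each column (one per regularization strength) yields one count. A toy coefficient matrix, made up for this sketch, shows the pattern.

```matlab
% Rows are predictors, columns are regularization strengths (toy values).
Beta = [ 0    0    0
         1.2  0    0
        -0.7  0.4  0
         0    0    0 ];

numNZCoeff = sum(Beta~=0)   % counts per column: [2 1 0]
```

Sparsity increases from left to right, which mirrors what larger lasso penalties are expected to do.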

In the same figure, plot the cross-validated MSE and frequency of nonzero coefficients for each regularization strength. Plot all variables on the log scale.

```
figure
[h,hL1,hL2] = plotyy(log10(Lambda),log10(mse),...
    log10(Lambda),log10(numNZCoeff));
hL1.Marker = 'o';
hL2.Marker = 'o';
ylabel(h(1),'log_{10} MSE')
ylabel(h(2),'log_{10} nonzero-coefficient frequency')
xlabel('log_{10} Lambda')
hold off
```

Choose the index of the regularization strength that balances predictor variable sparsity and low MSE (for example, `Lambda(10)`).

idxFinal = 10;
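The index can also be picked programmatically. The sketch below uses one possible heuristic, which is an assumption of this example and not the method used above: take the largest regularization strength whose MSE stays within a 5% tolerance of the minimum. The `mse` vector here is a made-up curve for illustration only; in practice it would be the output of `kfoldLoss`.

```matlab
Lambda = logspace(-5,-1,15);
mse = 0.09 + Lambda;        % made-up MSE curve, NOT cross-validation output

tol = 0.05;                 % assumed tolerance: within 5% of the best MSE
idxFinal = find(mse <= (1 + tol)*min(mse),1,'last')
```

For this made-up curve the rule returns index 10. Preferring the largest qualifying index favors sparser models, at the cost of a slightly higher MSE.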

Extract the model corresponding to the chosen regularization strength.

MdlFinal = selectModels(Mdl,idxFinal)

```
MdlFinal = 
  RegressionLinear
         ResponseName: 'Y'
    ResponseTransform: 'none'
                 Beta: [1000x1 double]
                 Bias: -0.0050
               Lambda: 0.0037
              Learner: 'leastsquares'

  Properties, Methods
```

idxNZCoeff = find(MdlFinal.Beta~=0)

`idxNZCoeff = `*2×1*
100
200

EstCoeff = Mdl.Beta(idxNZCoeff)

`EstCoeff = `*2×1*
1.0051
1.9965

`MdlFinal` is a `RegressionLinear` model with one regularization strength. The nonzero coefficients `EstCoeff` are close to the coefficients that simulated the data.

See Also

`RegressionLinear` | `RegressionPartitionedLinear` | `kfoldPredict` | `loss`
