**Class:** RegressionLinear

Predict response of linear regression model

`YHat = predict(Mdl,X,Name,Value)` returns predicted responses with additional options specified by one or more `Name,Value` pair arguments. For example, specify that columns in the predictor data correspond to observations.

`Mdl` — Linear regression model

`RegressionLinear` model object

Linear regression model, specified as a `RegressionLinear` model object. You can create a `RegressionLinear` model object using `fitrlinear`.

`X` — Predictor data

full matrix | sparse matrix

Predictor data, specified as an *n*-by-*p* full or sparse matrix. This orientation of `X` indicates that rows correspond to individual observations, and columns correspond to individual predictor variables.

If you orient your predictor matrix so that observations correspond to columns and specify `'ObservationsIn','columns'`, then you might experience a significant reduction in computation time.

The length of `Y` and the number of observations in `X` must be equal.

**Data Types:** `single` | `double`

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

`'ObservationsIn'` — Predictor data observation dimension

`'rows'` (default) | `'columns'`

Predictor data observation dimension, specified as the comma-separated pair consisting of `'ObservationsIn'` and `'columns'` or `'rows'`.

If you orient your predictor matrix so that observations correspond to columns and specify `'ObservationsIn','columns'`, then you might experience a significant reduction in optimization-execution time.
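As a minimal sketch of how the two orientations relate, the following trains a small model and predicts from the same data in both layouts. The data, sizes, and variable names (`yRows`, `yCols`, `maxDiff`) are made up for illustration and are not part of the examples on this page.

```matlab
rng(1)                                  % for reproducibility
X = randn(200,5);                       % hypothetical data: 200 observations in rows, 5 predictors
Y = X(:,1) + 2*X(:,2) + 0.1*randn(200,1);
Mdl = fitrlinear(X,Y);                  % default options; Mdl has one regularization strength

yRows = predict(Mdl,X);                              % default: observations in rows
yCols = predict(Mdl,X','ObservationsIn','columns');  % same data, transposed

% Both calls return one response per observation, so the results
% should agree up to rounding error.
maxDiff = max(abs(yRows - yCols));
```

Only the layout of `X` changes between the two calls; the returned responses keep one row per observation in both cases.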

`YHat` — Predicted responses

numeric matrix

Predicted responses, returned as an *n*-by-*L* numeric matrix. *n* is the number of observations in `X` and *L* is the number of regularization strengths in `Mdl.Lambda`. `YHat(i,j)` is the response for observation *i* using the model with regularization strength `Mdl.Lambda(j)`.

The predicted response using the model with regularization strength *j* is

$$\widehat{y}_{j} = x\beta_{j} + b_{j}.$$

*x* is an observation from the predictor data matrix `X`, and is a row vector.

$$\beta_{j}$$ is the estimated column vector of coefficients. The software stores this vector in `Mdl.Beta(:,j)`.

$$b_{j}$$ is the estimated, scalar bias, which the software stores in `Mdl.Bias(j)`.
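The formula above can be written out directly as a matrix product. The sketch below uses small hand-made arrays as stand-ins for `Mdl.Beta` and `Mdl.Bias` (the values are hypothetical), and it assumes that no response transformation is applied.

```matlab
% Hypothetical stand-ins for Mdl.Beta (p-by-L) and Mdl.Bias (1-by-L)
Beta = [1 0.5; 2 1.5; 0 0];   % p = 3 predictors, L = 2 regularization strengths
Bias = [0.1 0.2];

X = [1 0 2; 0 1 3];           % n = 2 observations in rows

% Column j holds x*Beta(:,j) + Bias(j) for every observation x.
YHatManual = bsxfun(@plus, X*Beta, Bias);   % n-by-L, here [1.1 0.7; 2.1 1.7]
```

Each column of `YHatManual` corresponds to one regularization strength, which matches the *n*-by-*L* layout described for `YHat`.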

Simulate 10000 observations from this model

$$y={x}_{100}+2{x}_{200}+e.$$

$$X=\{x_{1},...,x_{1000}\}$$ is a 10000-by-1000 sparse matrix with 10% nonzero standard normal elements.

*e* is random normal error with mean 0 and standard deviation 0.3.

```
rng(1) % For reproducibility
n = 1e4;
d = 1e3;
nz = 0.1;
X = sprandn(n,d,nz);
Y = X(:,100) + 2*X(:,200) + 0.3*randn(n,1);
```

Train a linear regression model. Reserve 30% of the observations as a holdout sample.

```
CVMdl = fitrlinear(X,Y,'Holdout',0.3);
Mdl = CVMdl.Trained{1}
```

    Mdl = 
      RegressionLinear
             ResponseName: 'Y'
        ResponseTransform: 'none'
                     Beta: [1000x1 double]
                     Bias: -0.0066
                   Lambda: 1.4286e-04
                  Learner: 'svm'
      Properties, Methods

`CVMdl` is a `RegressionPartitionedLinear` model. It contains the property `Trained`, which is a 1-by-1 cell array holding a `RegressionLinear` model that the software trained using the training set.

Extract the training and test data from the partition definition.

    trainIdx = training(CVMdl.Partition);
    testIdx = test(CVMdl.Partition);

Predict the training- and test-sample responses.

    yHatTrain = predict(Mdl,X(trainIdx,:));
    yHatTest = predict(Mdl,X(testIdx,:));

Because there is one regularization strength in `Mdl`, `yHatTrain` and `yHatTest` are numeric vectors.
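One way to compare the two sets of predictions is mean squared error. The sketch below repeats the setup from this example so that it runs on its own; `mseTrain` and `mseTest` are illustrative names that do not appear in the original example.

```matlab
rng(1) % For reproducibility
n = 1e4; d = 1e3; nz = 0.1;
X = sprandn(n,d,nz);
Y = X(:,100) + 2*X(:,200) + 0.3*randn(n,1);

CVMdl = fitrlinear(X,Y,'Holdout',0.3);
Mdl = CVMdl.Trained{1};
trainIdx = training(CVMdl.Partition);   % logical index of training observations
testIdx = test(CVMdl.Partition);        % logical index of holdout observations

yHatTrain = predict(Mdl,X(trainIdx,:));
yHatTest = predict(Mdl,X(testIdx,:));

% Mean squared error on each subset (illustrative addition)
mseTrain = mean((Y(trainIdx) - yHatTrain).^2);
mseTest = mean((Y(testIdx) - yHatTest).^2);
```

A test-sample error that is much larger than the training-sample error would suggest overfitting.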

Predict responses from the best-performing linear regression model that uses a lasso penalty and least squares.

Simulate 10000 observations as in Predict Test-Sample Responses.

```
rng(1) % For reproducibility
n = 1e4;
d = 1e3;
nz = 0.1;
X = sprandn(n,d,nz);
Y = X(:,100) + 2*X(:,200) + 0.3*randn(n,1);
```

Create a set of 15 logarithmically spaced regularization strengths from $$10^{-5}$$ through $$10^{-1}$$.

Lambda = logspace(-5,-1,15);

Cross-validate the models. To increase execution speed, transpose the predictor data and specify that the observations are in columns. Optimize the objective function using SpaRSA.

    X = X';
    CVMdl = fitrlinear(X,Y,'ObservationsIn','columns','KFold',5,'Lambda',Lambda,...
        'Learner','leastsquares','Solver','sparsa','Regularization','lasso');
    numCLModels = numel(CVMdl.Trained)

numCLModels = 5

`CVMdl` is a `RegressionPartitionedLinear` model. Because `fitrlinear` implements 5-fold cross-validation, `CVMdl` contains 5 `RegressionLinear` models that the software trains on each fold.

Display the first trained linear regression model.

Mdl1 = CVMdl.Trained{1}

    Mdl1 = 
      RegressionLinear
             ResponseName: 'Y'
        ResponseTransform: 'none'
                     Beta: [1000x15 double]
                     Bias: [1x15 double]
                   Lambda: [1x15 double]
                  Learner: 'leastsquares'
      Properties, Methods

`Mdl1` is a `RegressionLinear` model object. `fitrlinear` constructed `Mdl1` by training on the first four folds. Because `Lambda` is a sequence of regularization strengths, you can think of `Mdl1` as 15 models, one for each regularization strength in `Lambda`.

Estimate the cross-validated MSE.

mse = kfoldLoss(CVMdl);

Higher values of `Lambda` lead to predictor variable sparsity, which is a good quality of a regression model. For each regularization strength, train a linear regression model using the entire data set and the same options as when you cross-validated the models. Determine the number of nonzero coefficients per model.

    Mdl = fitrlinear(X,Y,'ObservationsIn','columns','Lambda',Lambda,...
        'Learner','leastsquares','Solver','sparsa','Regularization','lasso');
    numNZCoeff = sum(Mdl.Beta~=0);

In the same figure, plot the cross-validated MSE and frequency of nonzero coefficients for each regularization strength. Plot all variables on the log scale.

    figure;
    [h,hL1,hL2] = plotyy(log10(Lambda),log10(mse),...
        log10(Lambda),log10(numNZCoeff));
    hL1.Marker = 'o';
    hL2.Marker = 'o';
    ylabel(h(1),'log_{10} MSE')
    ylabel(h(2),'log_{10} nonzero-coefficient frequency')
    xlabel('log_{10} Lambda')
    hold off

Choose the index of the regularization strength that balances predictor variable sparsity and low MSE (for example, `Lambda(10)`).

idxFinal = 10;

Extract the model corresponding to the chosen regularization strength.

MdlFinal = selectModels(Mdl,idxFinal)

    MdlFinal = 
      RegressionLinear
             ResponseName: 'Y'
        ResponseTransform: 'none'
                     Beta: [1000x1 double]
                     Bias: -0.0050
                   Lambda: 0.0037
                  Learner: 'leastsquares'
      Properties, Methods

idxNZCoeff = find(MdlFinal.Beta~=0)

    idxNZCoeff = 2×1
       100
       200

EstCoeff = Mdl.Beta(idxNZCoeff)

    EstCoeff = 2×1
        1.0051
        1.9965

`MdlFinal` is a `RegressionLinear` model with one regularization strength. The nonzero coefficients `EstCoeff` are close to the coefficients that simulated the data.

Simulate 10 new observations, and predict corresponding responses using the best-performing model.

    XNew = sprandn(d,10,nz);
    YHat = predict(MdlFinal,XNew,'ObservationsIn','columns');

Calculate with arrays that have more rows than fit in memory.

This function fully supports tall arrays. For more information, see Tall Arrays (MATLAB).
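As a rough sketch of tall-array usage, the following converts a small in-memory matrix to a tall array purely for illustration; in practice the tall array would typically come from a datastore. The data and variable names are hypothetical, and `mapreducer(0)` is used here on the assumption that you want evaluation to run in the local MATLAB session.

```matlab
rng(1)                               % for reproducibility
X = randn(500,4);                    % hypothetical in-memory data, observations in rows
Y = X(:,1) - X(:,3) + 0.1*randn(500,1);
Mdl = fitrlinear(X,Y);

mapreducer(0);                       % evaluate tall arrays in the local session
tX = tall(X);                        % tall version of the predictor data
tYHat = predict(Mdl,tX);             % tall result; evaluation is deferred
YHat = gather(tYHat);                % bring the predictions into memory
```

After `gather`, `YHat` is an ordinary numeric array with one row per observation.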

Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

You can generate C/C++ code for both `predict` and `update` by using a coder configurer. Or, generate code only for `predict` by using `saveLearnerForCoder`, `loadLearnerForCoder`, and `codegen`.

- Code generation for `predict` and `update`: Create a coder configurer by using `learnerCoderConfigurer` and then generate code by using `generateCode`. Then you can update model parameters in the generated code without having to regenerate the code.
- Code generation for `predict`: Save a trained model by using `saveLearnerForCoder`. Define an entry-point function that loads the saved model by using `loadLearnerForCoder` and calls the `predict` function. Then use `codegen` to generate code for the entry-point function.

This table contains notes about the arguments of `predict`. Arguments not included in this table are fully supported.

| Argument | Notes and Limitations |
| --- | --- |
| `Mdl` | For the usage notes and limitations of the model object, see Code Generation of the `RegressionLinear` object. |
| `X` | Must be a single-precision or double-precision matrix and can be variable-size. If you specify `'ObservationsIn','rows'` (default), then the number of columns in `X` must be `numel(Mdl.PredictorNames)`. Rows and columns must correspond to observations and predictors, respectively. If you specify `'ObservationsIn','columns'`, then the number of rows in `X` must be `numel(Mdl.PredictorNames)`. Rows and columns must correspond to predictors and observations, respectively. |
| Name-value pair arguments | Names in name-value pair arguments must be compile-time constants. The value for the `'ObservationsIn'` name-value pair argument must be a compile-time constant. For example, to use the `'ObservationsIn','columns'` name-value pair argument in the generated code, include `{coder.Constant('ObservationsIn'),coder.Constant('columns')}` in the `-args` value of `codegen`. |

For more information, see Introduction to Code Generation.
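The `saveLearnerForCoder` and `loadLearnerForCoder` workflow described above might look like the following entry-point function. The function name `predictLinearEntry`, the file name `'LinearMdl'`, and the 1000-predictor input size are assumptions chosen for illustration, so adjust them to your own model.

```matlab
% File predictLinearEntry.m (hypothetical entry-point name)
function YHat = predictLinearEntry(X) %#codegen
% Load a model that was saved earlier with
%   saveLearnerForCoder(Mdl,'LinearMdl')
% where 'LinearMdl' is a hypothetical file name.
Mdl = loadLearnerForCoder('LinearMdl');
% X has observations in rows (the default orientation).
YHat = predict(Mdl,X);
end

% At the command line, assuming Mdl was trained on 1000 predictors and
% the number of observations varies at run time:
%   saveLearnerForCoder(Mdl,'LinearMdl');
%   codegen predictLinearEntry -args {coder.typeof(0,[Inf 1000],[1 0])}
```

Here `coder.typeof(0,[Inf 1000],[1 0])` declares a double-precision input with a variable number of rows and a fixed 1000 columns, which is consistent with the note that `X` can be variable-size.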
