# ClassificationLinear class

Linear model for binary classification of high-dimensional data

## Description

`ClassificationLinear`

is a trained linear model object for binary classification; the linear model is a support vector machine (SVM) or logistic regression model. `fitclinear`

fits a `ClassificationLinear`

model by minimizing the objective function using techniques that reduce computation time for high-dimensional data sets (e.g., stochastic gradient descent). The classification loss plus the regularization term compose the objective function.

Unlike other classification models, and for economical memory usage, `ClassificationLinear`

model objects do not store the training data. However, they do store, for example, the estimated linear model coefficients, prior-class probabilities, and the regularization strength.

You can use trained `ClassificationLinear`

models to predict labels or classification scores for new data. For details, see `predict`

.

## Construction

Create a `ClassificationLinear`

object by using `fitclinear`

.

## Properties

**Linear Classification Properties**

`Lambda`

— Regularization term strength

nonnegative scalar | vector of nonnegative values

Regularization term strength, specified as a nonnegative scalar or vector of nonnegative values.

**Data Types: **`double`

| `single`

`Learner`

— Linear classification model type

`'logistic'`

| `'svm'`

Linear classification model type, specified as `'logistic'`

or `'svm'`

.

In this table, $$f\left(x\right)=x\beta +b.$$

*β*is a vector of*p*coefficients.*x*is an observation from*p*predictor variables.*b*is the scalar bias.

Value | Algorithm | Loss Function | `FittedLoss` Value |
---|---|---|---|

`'svm'` | Support vector machine | Hinge: $$\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,1-yf\left(x\right)\right]$$ | `'hinge'` |

`'logistic'` | Logistic regression | Deviance (logistic): $$\ell \left[y,f\left(x\right)\right]=\mathrm{log}\left\{1+\mathrm{exp}\left[-yf\left(x\right)\right]\right\}$$ | `'logit'` |

`Beta`

— Linear coefficient estimates

numeric vector

Linear coefficient estimates, specified as a numeric vector with length equal to the number of predictors.

**Data Types: **`double`

`Bias`

— Estimated bias term

numeric scalar

Estimated bias term or model intercept, specified as a numeric scalar.

**Data Types: **`double`

`FittedLoss`

— Loss function used to fit linear model

`'hinge'`

| `'logit'`

This property is read-only.

Loss function used to fit the linear model, specified as `'hinge'`

or `'logit'`

.

Value | Algorithm | Loss Function | `Learner` Value |
---|---|---|---|

`'hinge'` | Support vector machine | Hinge: $$\ell \left[y,f\left(x\right)\right]=\mathrm{max}\left[0,1-yf\left(x\right)\right]$$ | `'svm'` |

`'logit'` | Logistic regression | Deviance (logistic): $$\ell \left[y,f\left(x\right)\right]=\mathrm{log}\left\{1+\mathrm{exp}\left[-yf\left(x\right)\right]\right\}$$ | `'logistic'` |

`Regularization`

— Complexity penalty type

`'lasso (L1)'`

| `'ridge (L2)'`

Complexity penalty type, specified as `'lasso (L1)'`

or ```
'ridge
(L2)'
```

.

The software composes the objective function for minimization from the sum of the average loss
function (see `FittedLoss`

) and a regularization value from this
table.

Value | Description |
---|---|

`'lasso (L1)'` | Lasso (L_{1}) penalty: $$\lambda {\displaystyle \sum _{j=1}^{p}\left|{\beta}_{j}\right|}$$ |

`'ridge (L2)'` | Ridge (L_{2}) penalty: $$\frac{\lambda}{2}{\displaystyle \sum _{j=1}^{p}{\beta}_{j}^{2}}$$ |

*λ* specifies the regularization term
strength (see `Lambda`

).

The software excludes the bias term (*β*_{0})
from the regularization penalty.

**Other Classification Properties**

`CategoricalPredictors`

— Categorical predictor indices

vector of positive integers | `[]`

Categorical predictor
indices, specified as a vector of positive integers. `CategoricalPredictors`

contains index values indicating that the corresponding predictors are categorical. The index
values are between 1 and `p`

, where `p`

is the number of
predictors used to train the model. If none of the predictors are categorical, then this
property is empty (`[]`

).

**Data Types: **`single`

| `double`

`ClassNames`

— Unique class labels

categorical array | character array | logical vector | numeric vector | cell array of character vectors

Unique class labels used in training, specified as a categorical or
character array, logical or numeric vector, or cell array of
character vectors. `ClassNames`

has the same
data type as the class labels `Y`

.
(The software treats string arrays as cell arrays of character
vectors.)
`ClassNames`

also determines the class
order.

**Data Types: **`categorical`

| `char`

| `logical`

| `single`

| `double`

| `cell`

`Cost`

— Misclassification costs

square numeric matrix

This property is read-only.

Misclassification costs, specified as a square numeric matrix. `Cost`

has *K* rows
and columns, where *K* is the number of classes.

`Cost(`

is
the cost of classifying a point into class * i*,

*)*

`j`

*if its true class is*

`j`

*. The order of the rows and columns of*

`i`

`Cost`

corresponds to the order of
the classes in `ClassNames`

.**Data Types: **`double`

`ModelParameters`

— Parameters used for training model

structure

Parameters used for training the `ClassificationLinear`

model, specified as a structure.

Access fields of `ModelParameters`

using dot notation. For example, access
the relative tolerance on the linear coefficients and the bias term by using
`Mdl.ModelParameters.BetaTolerance`

.

**Data Types: **`struct`

`PredictorNames`

— Predictor names

cell array of character vectors

Predictor names in order of their appearance in the predictor data, specified as a
cell array of character vectors. The length of `PredictorNames`

is
equal to the number of variables in the training data `X`

or
`Tbl`

used as predictor variables.

**Data Types: **`cell`

`ExpandedPredictorNames`

— Expanded predictor names

cell array of character vectors

Expanded predictor names, specified as a cell array of character vectors.

If the model uses encoding for categorical variables, then
`ExpandedPredictorNames`

includes the names that describe the
expanded variables. Otherwise, `ExpandedPredictorNames`

is the same as
`PredictorNames`

.

**Data Types: **`cell`

`Prior`

— Prior class probabilities

numeric vector

This property is read-only.

Prior class probabilities, specified as a numeric vector.
`Prior`

has as many elements as
classes in `ClassNames`

, and the order of the
elements corresponds to the elements of
`ClassNames`

.

**Data Types: **`double`

`ResponseName`

— Response variable name

character vector

Response variable name, specified as a character vector.

**Data Types: **`char`

`ScoreTransform`

— Score transformation function

`'doublelogit'`

| `'invlogit'`

| `'ismax'`

| `'logit'`

| `'none'`

| function handle | ...

Score transformation function to apply to predicted scores, specified as a function name or function handle.

For linear classification models and before transformation, the predicted
classification score for the observation *x* (row vector) is *f*(*x*) =
*x**β* + *b*, where *β* and *b* correspond to
`Mdl.Beta`

and `Mdl.Bias`

, respectively.

To change the score transformation function to, for example,
* function*, use dot notation.

For a built-in function, enter this code and replace

with a value in the table.`function`

Mdl.ScoreTransform = '

*function*';Value Description `"doublelogit"`

1/(1 + *e*^{–2x})`"invlogit"`

log( *x*/ (1 –*x*))`"ismax"`

Sets the score for the class with the largest score to 1, and sets the scores for all other classes to 0 `"logit"`

1/(1 + *e*^{–x})`"none"`

or`"identity"`

*x*(no transformation)`"sign"`

–1 for *x*< 0

0 for*x*= 0

1 for*x*> 0`"symmetric"`

2 *x*– 1`"symmetricismax"`

Sets the score for the class with the largest score to 1, and sets the scores for all other classes to –1 `"symmetriclogit"`

2/(1 + *e*^{–x}) – 1For a MATLAB

^{®}function, or a function that you define, enter its function handle.Mdl.ScoreTransform = @

*function*;must accept a matrix of the original scores for each class, and then return a matrix of the same size representing the transformed scores for each class.`function`

**Data Types: **`char`

| `function_handle`

## Object Functions

`edge` | Classification edge for linear classification models |

`incrementalLearner` | Convert linear model for binary classification to incremental learner |

`lime` | Local interpretable model-agnostic explanations (LIME) |

`loss` | Classification loss for linear classification models |

`margin` | Classification margins for linear classification models |

`partialDependence` | Compute partial dependence |

`plotPartialDependence` | Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots |

`predict` | Predict labels for linear classification models |

`shapley` | Shapley values |

`selectModels` | Choose subset of regularized, binary linear classification models |

`update` | Update model parameters for code generation |

## Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects.

## Examples

### Train Linear Classification Model

Train a binary, linear classification model using support vector machines, dual SGD, and ridge regularization.

Load the NLP data set.

`load nlpdata`

`X`

is a sparse matrix of predictor data, and `Y`

is a categorical vector of class labels. There are more than two classes in the data.

Identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.

`Ystats = Y == 'stats';`

Train a binary, linear classification model that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. Train the model using the entire data set. Determine how well the optimization algorithm fit the model to the data by extracting a fit summary.

```
rng(1); % For reproducibility
[Mdl,FitInfo] = fitclinear(X,Ystats)
```

Mdl = ClassificationLinear ResponseName: 'Y' ClassNames: [0 1] ScoreTransform: 'none' Beta: [34023x1 double] Bias: -1.0059 Lambda: 3.1674e-05 Learner: 'svm' Properties, Methods

`FitInfo = `*struct with fields:*
Lambda: 3.1674e-05
Objective: 5.3783e-04
PassLimit: 10
NumPasses: 10
BatchLimit: []
NumIterations: 238561
GradientNorm: NaN
GradientTolerance: 0
RelativeChangeInBeta: 0.0562
BetaTolerance: 1.0000e-04
DeltaGradient: 1.4582
DeltaGradientTolerance: 1
TerminationCode: 0
TerminationStatus: {'Iteration limit exceeded.'}
Alpha: [31572x1 double]
History: []
FitTime: 0.8858
Solver: {'dual'}

`Mdl`

is a `ClassificationLinear`

model. You can pass `Mdl`

and the training or new data to `loss`

to inspect the in-sample classification error. Or, you can pass `Mdl`

and new predictor data to `predict`

to predict class labels for new observations.

`FitInfo`

is a structure array containing, among other things, the termination status (`TerminationStatus`

) and how long the solver took to fit the model to the data (`FitTime`

). It is good practice to use `FitInfo`

to determine whether optimization-termination measurements are satisfactory. Because training time is small, you can try to retrain the model, but increase the number of passes through the data. This can improve measures like `DeltaGradient`

.

### Predict Class Labels Using Linear Classification Model

Load the NLP data set.

load nlpdata n = size(X,1); % Number of observations

Identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.

`Ystats = Y == 'stats';`

Hold out 5% of the data.

rng(1); % For reproducibility cvp = cvpartition(n,'Holdout',0.05)

cvp = Hold-out cross validation partition NumObservations: 31572 NumTestSets: 1 TrainSize: 29994 TestSize: 1578

`cvp`

is a `CVPartition`

object that defines the random partition of *n* data into training and test sets.

Train a binary, linear classification model using the training set that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. For faster training time, orient the predictor data matrix so that the observations are in columns.

idxTrain = training(cvp); % Extract training set indices X = X'; Mdl = fitclinear(X(:,idxTrain),Ystats(idxTrain),'ObservationsIn','columns');

Predict observations and classification error for the hold out sample.

idxTest = test(cvp); % Extract test set indices labels = predict(Mdl,X(:,idxTest),'ObservationsIn','columns'); L = loss(Mdl,X(:,idxTest),Ystats(idxTest),'ObservationsIn','columns')

L = 7.1753e-04

`Mdl`

misclassifies fewer than 1% of the out-of-sample observations.

## Extended Capabilities

### C/C++ Code Generation

Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

When you train a linear classification model by using

`fitclinear`

, the following restrictions apply.If the predictor data input argument value is a matrix, it must be a full, numeric matrix. Code generation does not support sparse data.

You can specify only one regularization strength, either

`'auto'`

or a nonnegative scalar for the`'Lambda'`

name-value pair argument.The value of the

`'ScoreTransform'`

name-value pair argument cannot be an anonymous function.For code generation with a coder configurer, the following additional restrictions apply.

Categorical predictors (

`logical`

,`categorical`

,`char`

,`string`

, or`cell`

) are not supported. You cannot use the`CategoricalPredictors`

name-value argument.To include categorical predictors in a model, preprocess them by using`dummyvar`

before fitting the model.Class labels with the

`categorical`

data type are not supported. Both the class label value in training data (`Tbl`

or`Y`

) and the value of the`ClassNames`

name-value argument cannot be an array with the`categorical`

data type.

For more information, see Introduction to Code Generation.

## Version History

**Introduced in R2016a**

### R2022a: `Cost`

property stores the user-specified cost matrix

*Behavior changed in R2022a*

Starting in R2022a, the `Cost`

property stores the user-specified cost
matrix, so that you can compute the observed misclassification cost using the specified cost
value. The software stores the normalized prior probabilities (`Prior`

)
that do not reflect the penalties described in the cost matrix. For details about computing
the observed misclassification cost, see the R2022a release note loss Functions of Classification Model Objects: Compute observed misclassification cost.

Note that model training has not changed and, therefore, the decision boundaries between classes have not changed.

For training, the fitting function updates the specified prior probabilities by
incorporating the penalties described in the specified cost matrix, and then normalizes the
prior probabilities and observation weights. This behavior has not changed. In previous
releases, the software stored the default cost matrix in the `Cost`

property and stored the prior probabilities used for training in the
`Prior`

property. Starting in R2022a, the software stores the
user-specified cost matrix as is, and stores the normalized prior probabilities that do
not reflect the cost penalties. For more details, see Misclassification Cost Matrix, Prior Probabilities, and Observation Weights.

Some object functions use the `Cost`

and `Prior`

properties:

The

`loss`

function uses the cost matrix stored in the`Cost`

property if you specify the`LossFun`

name-value argument as`"classifcost"`

or`"mincost"`

.The

`loss`

and`edge`

functions use the prior probabilities stored in the`Prior`

property to normalize the observation weights of the input data.

If you specify a nondefault cost matrix when you train a classification model, the object functions return a different value compared to previous releases.

If you want the software to handle the cost matrix, prior
probabilities, and observation weights as in previous releases, adjust the prior probabilities
and observation weights for the nondefault cost matrix, as described in Adjust Prior Probabilities and Observation Weights for Misclassification Cost Matrix. Then, when you train a
classification model, specify the adjusted prior probabilities and observation weights by using
the `Prior`

and `Weights`

name-value arguments, respectively,
and use the default cost matrix.

## Open Example

You have a modified version of this example. Do you want to open this example with your edits?

## MATLAB Command

You clicked a link that corresponds to this MATLAB command:

Run the command by entering it in the MATLAB Command Window. Web browsers do not support MATLAB commands.

# Select a Web Site

Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .

You can also select a web site from the following list:

## How to Get Best Site Performance

Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.

### Americas

- América Latina (Español)
- Canada (English)
- United States (English)

### Europe

- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)

- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)