# fitPosterior

Fit posterior probabilities for compact support vector machine (SVM) classifier

## Syntax

```
ScoreSVMModel = fitPosterior(SVMModel,TBL,Y)
ScoreSVMModel = fitPosterior(SVMModel,X,Y)
[ScoreSVMModel,ScoreTransform] = fitPosterior(___)
```

## Description

`ScoreSVMModel = fitPosterior(SVMModel,TBL,Y)` returns a trained support vector machine (SVM) classifier `ScoreSVMModel` containing the optimal score-to-posterior-probability transformation function for two-class learning. For more details, see Algorithms. If you train `SVMModel` using a table, then you must use a table as input for `fitPosterior`.


`ScoreSVMModel = fitPosterior(SVMModel,X,Y)` returns a trained SVM classifier `ScoreSVMModel` containing the optimal score-to-posterior-probability transformation function for two-class learning. If you train `SVMModel` using a matrix, then you must use a matrix as input for `fitPosterior`.


`[ScoreSVMModel,ScoreTransform] = fitPosterior(___)` additionally returns the optimal score-to-posterior-probability transformation function parameters (`ScoreTransform`) for any of the input argument combinations in the previous syntaxes.

## Examples


### Posterior Probabilities for Inseparable Classes

Load the `ionosphere` data set. Reserve 20 random observations of the data, and consider this set new data.

```
load ionosphere
n = size(X,1);
rng(1); % For reproducibility
indx = ~ismember([1:n],randsample(n,20)); % Indices for the training data
```

The classes of this data set are inseparable.

Train an SVM classifier using the training data. Standardize the data and specify that `'g'` is the positive class.

```
SVMModel = fitcsvm(X(indx,:),Y(indx),'ClassNames',{'b','g'},...
    'Standardize',true);
```

`SVMModel` is a `ClassificationSVM` classifier.

Use the new data set to estimate the optimal score-to-posterior-probability transformation function for mapping scores to the posterior probability of an observation being classified as `g`. For efficiency, make a compact version of `SVMModel`, and pass it and the new data to `fitPosterior`.

```
CompactSVMModel = compact(SVMModel);
[ScoreCSVMModel,ScoreParameters] = fitPosterior(CompactSVMModel,...
    X(~indx,:),Y(~indx));
ScoreTransform = ScoreCSVMModel.ScoreTransform
```
```ScoreTransform = '@(S)sigmoid(S,-1.098976e+00,4.520314e-01)' ```
`ScoreParameters`
```
ScoreParameters = struct with fields:
         Type: 'sigmoid'
        Slope: -1.0990
    Intercept: 0.4520
```

`ScoreTransform` is the optimal score transformation function. `ScoreParameters` is a structure array with three fields: the score transformation function name (`Type`), the sigmoid slope (`Slope`), and the sigmoid intercept estimates (`Intercept`).

Alternatively, you can pass `SVMModel` and the new data to `fitSVMPosterior`, but this process is not as efficient.
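As a minimal sketch of that less efficient alternative (using the full classifier `SVMModel` and the same held-out data from this example):

```
% Sketch of the less efficient alternative: pass the full (non-compact)
% classifier and the new data to fitSVMPosterior.
[ScoreSVMModel,ScoreParameters] = fitSVMPosterior(SVMModel,...
    X(~indx,:),Y(~indx));
```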

Estimate the posterior probabilities that the observations in the new data are in class `g`.

```
[labels,postProbs] = predict(ScoreCSVMModel,X(~indx,:));
table(Y(~indx),labels,postProbs(:,2),...
    'VariableNames',{'TrueLabel','PredictedLabel','PosteriorProbability'})
```
```
ans=20×3 table
    TrueLabel    PredictedLabel    PosteriorProbability
    _________    ______________    ____________________

      {'g'}          {'g'}                  0.7844
      {'b'}          {'b'}                0.024584
      {'g'}          {'g'}                 0.82402
      {'b'}          {'b'}               0.0061632
      {'b'}          {'b'}              3.6064e-06
      {'b'}          {'b'}                 0.15688
      {'b'}          {'g'}                 0.96219
      {'b'}          {'b'}              6.1343e-09
      {'b'}          {'b'}                0.001964
      {'g'}          {'g'}                 0.72509
      {'g'}          {'g'}                 0.70261
      {'b'}          {'b'}                0.075298
      {'g'}          {'g'}                 0.90692
      {'g'}          {'g'}                 0.82848
      {'b'}          {'b'}                0.051175
      {'g'}          {'g'}                 0.95332
    ⋮
```

### Posterior Probabilities for Separable Classes

Load Fisher's iris data set. Use the petal lengths and widths as the predictor data, and remove the virginica species from the data. Reserve 10 random observations of the data, and consider this set new data.

```
load fisheriris
classKeep = ~strcmp(species,'virginica');
X = meas(classKeep,3:4);
Y = species(classKeep);
rng(1); % For reproducibility
indx1 = 1:numel(species);
indx2 = indx1(classKeep);
indx = ~ismember(indx2,randsample(indx2,10)); % Indices for the training data

gscatter(X(indx,1),X(indx,2),Y(indx));
title('Scatter Diagram of Iris Measurements')
xlabel('Petal length')
ylabel('Petal width')
legend('Setosa','Versicolor')
```

The classes are perfectly separable. Therefore, the score-to-posterior-probability transformation function is a step function.

Train an SVM classifier. Standardize the data and specify that `versicolor` is the positive class.

```
SVMModel = fitcsvm(X(indx,:),Y(indx),...
    'ClassNames',{'setosa','versicolor'},'Standardize',true);
```

`SVMModel` is a `ClassificationSVM` classifier.

Use the new data set to estimate the optimal score-to-posterior-probability transformation function for mapping scores to the posterior probability of an observation being classified as `versicolor`. For efficiency, make a compact version of `SVMModel`, and pass it and the new data to `fitPosterior`.

```
CompactSVMModel = compact(SVMModel);
[ScoreCSVMModel,ScoreParameters] = fitPosterior(CompactSVMModel,...
    X(~indx,:),Y(~indx));
```
```Warning: Classes are perfectly separated. The optimal score-to-posterior transformation is a step function. ```
`ScoreTransform = ScoreCSVMModel.ScoreTransform`
```ScoreTransform = '@(S)step(S,-1.338450e+00,2.012495e+00,5.333333e-01)' ```

`fitPosterior` displays a warning whenever the classes are separable, and stores the step function in `ScoreSVMModel.ScoreTransform`.

Display the score function type and its estimated values.

`ScoreParameters`
```
ScoreParameters = struct with fields:
                        Type: 'step'
                  LowerBound: -1.3385
                  UpperBound: 2.0125
    PositiveClassProbability: 0.5333
```

`ScoreParameters` is a structure array with four fields:

• Score transformation function type (`Type`)

• Score corresponding to the negative class boundary (`LowerBound`)

• Score corresponding to the positive class boundary (`UpperBound`)

• Positive class probability (`PositiveClassProbability`)

Alternatively, you can pass `SVMModel` and the new data to `fitSVMPosterior`, but this process is not as efficient.

Estimate the posterior probabilities that the observations in the new data are versicolor irises.

```
[labels,postProbs] = predict(ScoreCSVMModel,X(~indx,:));
table(Y(~indx),labels,postProbs(:,2),...
    'VariableNames',{'TrueLabel','PredictedLabel','PosteriorProbability'})
```
```
ans=10×3 table
      TrueLabel       PredictedLabel    PosteriorProbability
    ______________    ______________    ____________________

    {'setosa'    }    {'setosa'    }             0
    {'setosa'    }    {'setosa'    }             0
    {'setosa'    }    {'setosa'    }             0
    {'setosa'    }    {'setosa'    }             0
    {'setosa'    }    {'setosa'    }             0
    {'setosa'    }    {'setosa'    }             0
    {'setosa'    }    {'setosa'    }             0
    {'setosa'    }    {'setosa'    }             0
    {'versicolor'}    {'versicolor'}             1
    {'versicolor'}    {'versicolor'}             1
```

Because the classes are separable, the step function transforms the positive-class score to:

• `0` if the score is less than `ScoreParameters.LowerBound`

• `1` if the score is greater than `ScoreParameters.UpperBound`

• `ScoreParameters.PositiveClassProbability` if the score is in the interval [`ScoreParameters.LowerBound`, `ScoreParameters.UpperBound`]
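As a minimal sketch (not part of the original example), you can reproduce these posterior probabilities by applying the fitted step function by hand to the raw scores from the untransformed compact model:

```
% Sketch: apply the fitted step function to the raw positive-class scores.
[~,rawScores] = predict(CompactSVMModel,X(~indx,:)); % raw scores (no transform)
s = rawScores(:,2);
manualPosterior = zeros(size(s));
manualPosterior(s > ScoreParameters.UpperBound) = 1;
inRange = s >= ScoreParameters.LowerBound & s <= ScoreParameters.UpperBound;
manualPosterior(inRange) = ScoreParameters.PositiveClassProbability;
% manualPosterior matches postProbs(:,2) from the table above.
```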

## Input Arguments


### `SVMModel`

Trained, compact SVM classifier, specified as a `CompactClassificationSVM` model returned by `compact`.

### `TBL`

Sample data, specified as a table. Each row of `TBL` corresponds to one observation, and each column corresponds to one predictor variable. `TBL` must contain all of the predictors used to train `SVMModel`. Optionally, `TBL` can contain an additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If `TBL` contains the response variable used to train `SVMModel`, then you do not need to specify `Y`. If `TBL` does not include the response variable, then the length of `Y` must be equal to the number of rows in `TBL`.

If the sample data used to train `SVMModel` is a `table`, then you must specify the input data for `fitPosterior` as a table.

If you set `'Standardize',true` in `fitcsvm` when training `SVMModel`, then the software fits the transformation function parameter estimates using standardized data.

Data Types: `table`
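For illustration, here is a hedged sketch of the table-based workflow. The variable names `Tbl` and `testIdx` are assumptions for this sketch and do not appear elsewhere on this page.

```
% Sketch of the table-based workflow (Tbl and testIdx are illustrative names).
load fisheriris
Tbl = table(meas(:,3),meas(:,4),species,...
    'VariableNames',{'PetalLength','PetalWidth','Species'});
Tbl = Tbl(~strcmp(Tbl.Species,'virginica'),:);  % keep two classes
rng(1)                                          % For reproducibility
testIdx = ismember(1:height(Tbl),randsample(height(Tbl),10))';
SVMModel = fitcsvm(Tbl(~testIdx,:),'Species','Standardize',true);
CompactSVMModel = compact(SVMModel);
% Because training used a table, fitPosterior also requires a table.
ScoreSVMModel = fitPosterior(CompactSVMModel,Tbl(testIdx,:),Tbl.Species(testIdx));
```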

### `X`

Predictor data used to estimate the score-to-posterior-probability transformation function, specified as a matrix.

Each row of `X` corresponds to one observation (also known as an instance or example), and each column corresponds to one variable (also known as a feature).

The length of `Y` and the number of rows in `X` must be equal.

If you set `'Standardize',true` in `fitcsvm` when training `SVMModel`, then the software fits the transformation function parameter estimates using standardized data.

Data Types: `double` | `single`

### `Y`

Class labels used to estimate the score-to-posterior-probability transformation function, specified as a categorical, character, or string array, a logical or numeric vector, or a cell array of character vectors.

If `Y` is a character array, then each element must correspond to one class label.

The length of `Y` and the number of rows in `X` must be equal.

Data Types: `categorical` | `char` | `string` | `logical` | `single` | `double` | `cell`

## Output Arguments


### `ScoreSVMModel`

Trained, compact SVM classifier containing the estimated score-to-posterior-probability transformation function, returned as a `CompactClassificationSVM` classifier.

To estimate posterior probabilities for new observations, pass `ScoreSVMModel` and the new observations to `predict`.

### `ScoreTransform`

Optimal score-to-posterior-probability transformation function parameters, returned as a structure array.

• If the value of the `Type` field of `ScoreTransform` is `sigmoid`, then `ScoreTransform` also has these fields:

• `Slope`: The value of A in the sigmoid function

• `Intercept`: The value of B in the sigmoid function

• If the value of the `Type` field of `ScoreTransform` is `step`, then `ScoreTransform` also has these fields:

• `PositiveClassProbability`: The value of π in the step function. This value represents the probability that an observation is in the positive class or the posterior probability that an observation is in the positive class given that its score is in the interval (`LowerBound`,`UpperBound`).

• `LowerBound`: The value $\underset{{y}_{n}=-1}{\mathrm{max}}{s}_{n}$ in the step function. This value represents the lower bound of the score interval that assigns observations with scores in the interval the posterior probability of being in the positive class `PositiveClassProbability`. Any observation with a score less than `LowerBound` has the posterior probability of being in the positive class equal to `0`.

• `UpperBound`: The value $\underset{{y}_{n}=+1}{\mathrm{min}}{s}_{n}$ in the step function. This value represents the upper bound of the score interval that assigns observations with scores in the interval the posterior probability of being in the positive class `PositiveClassProbability`. Any observation with a score greater than `UpperBound` has the posterior probability of being in the positive class equal to `1`.

• If the value of the `Type` field of `ScoreTransform` is `constant`, then `ScoreTransform.PredictedClass` contains the name of the class prediction.

This result is the same as `SVMModel.ClassNames`. The posterior probability of an observation being in `ScoreTransform.PredictedClass` is always `1`.

## More About

### Sigmoid Function

The sigmoid function that maps the score $s_j$ corresponding to observation j to the positive class posterior probability is

$P(s_j)=\frac{1}{1+\exp(As_j+B)}.$

If the value of the `Type` field of `ScoreTransform` is `sigmoid`, then parameters A and B correspond to the fields `Slope` and `Intercept` of `ScoreTransform`, respectively.
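As a hedged sketch, you can evaluate this formula directly with the fitted `Slope` and `Intercept` values; the workspace variables here (`CompactSVMModel`, `ScoreParameters`, `indx`) refer to the first (ionosphere) example above, where the transformation is a sigmoid.

```
% Sketch: evaluate the fitted sigmoid by hand (sigmoid case from the first example).
[~,rawScores] = predict(CompactSVMModel,X(~indx,:)); % raw scores (no transform)
A = ScoreParameters.Slope;                           % A in the formula
B = ScoreParameters.Intercept;                       % B in the formula
manualPosterior = 1./(1 + exp(A.*rawScores(:,2) + B));
% manualPosterior matches postProbs(:,2) returned by predict(ScoreCSVMModel,...).
```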

### Step Function

The step function that maps the score $s_j$ corresponding to observation j to the positive class posterior probability is

$P(s_j)=\begin{cases}0; & s_j<\underset{y_k=-1}{\max}\,s_k\\ \pi; & \underset{y_k=-1}{\max}\,s_k\le s_j\le \underset{y_k=+1}{\min}\,s_k\\ 1; & s_j>\underset{y_k=+1}{\min}\,s_k\end{cases},$

where:

• $s_j$ is the score of observation j.

• +1 and –1 denote the positive and negative classes, respectively.

• π is the prior probability that an observation is in the positive class.

If the value of the `Type` field of `ScoreTransform` is `step`, then the quantities $\underset{{y}_{k}=-1}{\mathrm{max}}{s}_{k}$ and $\underset{{y}_{k}=+1}{\mathrm{min}}{s}_{k}$ correspond to the fields `LowerBound` and `UpperBound` of `ScoreTransform`, respectively.

### Constant Function

The constant function maps all scores in a sample to posterior probabilities 1 or 0.

If all observations have posterior probability 1, then they are expected to come from the positive class.

If all observations have posterior probability 0, then they are not expected to come from the positive class.

## Tips

• This process describes one way to predict positive class posterior probabilities.

1. Train an SVM classifier by passing the data to `fitcsvm`. The result is a trained SVM classifier, such as `SVMModel`, that stores the data. The software sets the score transformation function property (`SVMModel.ScoreTransform`) to `'none'`.

2. Pass the trained SVM classifier `SVMModel` to `fitSVMPosterior` or `fitPosterior`. The result, such as `ScoreSVMModel`, is the same trained SVM classifier as `SVMModel`, except that the software sets `ScoreSVMModel.ScoreTransform` to the optimal score transformation function.

3. Pass the predictor data matrix and the trained SVM classifier containing the optimal score transformation function (`ScoreSVMModel`) to `predict`. The second column in the second output argument of `predict` stores the positive class posterior probabilities corresponding to each row of the predictor data matrix.

If you skip step 2, then `predict` returns the positive class score rather than the positive class posterior probability. A minimal sketch of this three-step workflow appears after these tips.

• After fitting posterior probabilities, you can generate C/C++ code that predicts labels for new data. Generating C/C++ code requires MATLAB® Coder™. For details, see Introduction to Code Generation.
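A minimal sketch of the three-step workflow in the first tip, using the `ionosphere` data from the examples above:

```
load ionosphere                                                     % X, Y
SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'},'Standardize',true);  % Step 1
ScoreSVMModel = fitPosterior(SVMModel);                             % Step 2
[labels,postProbs] = predict(ScoreSVMModel,X);                      % Step 3
% postProbs(:,2) contains the positive class ('g') posterior probabilities.
```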

## Algorithms

The software fits the appropriate score-to-posterior-probability transformation function by using the SVM classifier `SVMModel` and by conducting 10-fold cross-validation using the stored predictor data (`SVMModel.X`) and the class labels (`SVMModel.Y`), as outlined in [1]. The transformation function computes the posterior probability that an observation is classified into the positive class (`SVMModel.ClassNames(2)`).

• If the classes are inseparable, then the transformation function is the sigmoid function.

• If the classes are perfectly separable, then the transformation function is the step function.

• In two-class learning, if one of the two classes has a relative frequency of 0, then the transformation function is the constant function. The `fitPosterior` function is not appropriate for one-class learning.

• The software stores the optimal score-to-posterior-probability transformation function in `ScoreSVMModel.ScoreTransform`.

If you re-estimate the score-to-posterior-probability transformation function, that is, if you pass an SVM classifier to `fitPosterior` or `fitSVMPosterior` and its `ScoreTransform` property is not `none`, then the software:

• Displays a warning

• Resets the original transformation function to `'none'` before estimating the new one

## Alternative Functionality

You can also fit the optimal score-to-posterior-probability transformation function by using `fitSVMPosterior`. This function is similar to `fitPosterior`, except that it is broader because it accepts a wider range of SVM classifier types.

## References

[1] Platt, J. “Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods.” Advances in Large Margin Classifiers. Cambridge, MA: The MIT Press, 2000, pp. 61–74.

## Version History

Introduced in R2014a