discardSupportVectors

Discard support vectors for linear support vector machine (SVM) classifier

Description

Mdl = discardSupportVectors(MdlSV) returns the trained, linear support vector machine (SVM) model Mdl. Mdl and the trained, linear SVM model MdlSV are the same type of object: both are either ClassificationSVM objects or CompactClassificationSVM objects. However, Mdl and MdlSV differ in the following ways:

  • The Alpha, SupportVectors, and SupportVectorLabels properties are empty ([]) in Mdl.

  • When you display Mdl, the software lists the Beta property instead of the Alpha property.

Examples

Create a linear SVM model that is more memory-efficient by discarding support vectors and other related parameters.

Load the ionosphere data set.

load ionosphere

Train a linear SVM model using the entire data set.

MdlSV = fitcsvm(X,Y)
MdlSV = 
  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
          NumObservations: 351
                    Alpha: [103x1 double]
                     Bias: -3.8827
         KernelParameters: [1x1 struct]
           BoxConstraints: [351x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [351x1 logical]
                   Solver: 'SMO'


  Properties, Methods

Display the number of support vectors in MdlSV.

numSV = size(MdlSV.SupportVectors,1)
numSV = 103

Display the number of predictor variables in X.

p = size(X,2)
p = 34

By default, fitcsvm trains a linear SVM model for two-class learning. The software lists Alpha in the display. The model includes 103 support vectors and 34 predictors. If you discard the support vectors, the resulting model consumes less memory.

Discard the support vectors and other related parameters.

Mdl = discardSupportVectors(MdlSV)
Mdl = 
  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
          NumObservations: 351
                     Beta: [34x1 double]
                     Bias: -3.8827
         KernelParameters: [1x1 struct]
           BoxConstraints: [351x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [351x1 logical]
                   Solver: 'SMO'


  Properties, Methods

Display the coefficients in Mdl.

Mdl.Alpha
ans =

     []

Display the support vectors in Mdl.

Mdl.SupportVectors
ans =

     []

Display the support vector class labels in Mdl.

Mdl.SupportVectorLabels
ans =

     []

The software lists Beta in the display instead of Alpha. The Alpha, SupportVectors, and SupportVectorLabels properties are empty.
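As a sanity check, you can confirm that discarding the support vectors does not change the predictions of a linear model. The following is a minimal sketch that retrains the model so it runs on its own; the variable names sameLabels and maxScoreDiff are illustrative and not part of the original example. The score difference is expected to be zero or at the level of floating-point round-off, but the exact value is not guaranteed here.

```matlab
load ionosphere                        % provides predictors X and labels Y
MdlSV = fitcsvm(X,Y);                  % linear kernel by default
Mdl = discardSupportVectors(MdlSV);    % Alpha, SupportVectors, SupportVectorLabels become []

% Predict the training data with both models.
[labelSV,scoreSV] = predict(MdlSV,X);
[label,score]     = predict(Mdl,X);

% Compare predicted labels and scores (illustrative variable names).
sameLabels   = isequal(labelSV,label)
maxScoreDiff = max(abs(scoreSV(:) - score(:)))
```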

Compare the sizes of the models.

vars = whos('MdlSV','Mdl');
% whos lists variables alphabetically: vars(1) is Mdl, vars(2) is MdlSV
100*(1 - vars(1).bytes/vars(2).bytes)
ans = 20.5140

Mdl is about 20% smaller than MdlSV.

Remove MdlSV from the workspace.

clear MdlSV

Compact an SVM model by discarding the stored support vectors and other related estimates. Predict the label for a row of the training data by using the compacted model.

Load the ionosphere data set.

load ionosphere
rng(1); % For reproducibility

Train an SVM model using the default options.

MdlSV = fitcsvm(X,Y);

MdlSV is a ClassificationSVM model containing nonempty values for its Alpha, SupportVectors, and SupportVectorLabels properties.

Reduce the size of the SVM model by discarding the training data, support vectors, and related estimates.

CMdlSV = compact(MdlSV);               % Discard training data
CMdl = discardSupportVectors(CMdlSV);  % Discard support vectors

CMdl is a CompactClassificationSVM model.

Compare the sizes of the SVM models MdlSV and CMdl.

vars = whos('MdlSV','CMdl');
% whos lists variables alphabetically: vars(1) is CMdl, vars(2) is MdlSV
100*(1 - vars(1).bytes/vars(2).bytes)
ans = 96.8120

The compacted model CMdl consumes about 97% less memory than the full model MdlSV.

Predict the label for a random row of the training data by using CMdl. The predict function accepts compacted SVM models, and, for linear SVM models, does not require the Alpha, SupportVectors, and SupportVectorLabels properties to predict labels for new observations.

idx = randsample(size(X,1),1)
idx = 147
predictedLabel = predict(CMdl,X(idx,:))
predictedLabel = 1x1 cell array
    {'b'}

trueLabel = Y(idx)
trueLabel = 1x1 cell array
    {'b'}

Input Arguments

MdlSV: Trained, linear SVM model, specified as a ClassificationSVM or CompactClassificationSVM model.

If the field MdlSV.KernelParameters.Function is not 'linear' (that is, MdlSV is not a linear SVM model), the software returns an error.
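To avoid this error in a script that handles models with different kernels, you can test the kernel name before calling discardSupportVectors. The following is a minimal sketch; the variable names MdlRBF, kernelName, and MdlOut are hypothetical, and the nonlinear model is trained here only to exercise the fallback branch.

```matlab
load ionosphere
% Train a nonlinear (RBF kernel) SVM model purely for illustration.
MdlRBF = fitcsvm(X,Y,'KernelFunction','rbf');

kernelName = MdlRBF.KernelParameters.Function;
if strcmp(kernelName,'linear')
    % Safe to call: the model is a linear SVM.
    MdlOut = discardSupportVectors(MdlRBF);
else
    % Nonlinear models need their support vectors for prediction,
    % so keep the model unchanged.
    MdlOut = MdlRBF;
    fprintf('Kernel is ''%s''; support vectors retained.\n',kernelName);
end
```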

Tips

  • For a trained, linear SVM model, the SupportVectors property is an nsv-by-p matrix. nsv is the number of support vectors (at most the training sample size) and p is the number of predictors, or features. The Alpha and SupportVectorLabels properties are vectors with nsv elements. These properties can be large for complex data sets containing many observations or examples. The Beta property is a vector with p elements.

  • If the trained SVM model has many support vectors, use discardSupportVectors to reduce the amount of space consumed by the trained, linear SVM model. You can display the size of the support vector matrix by entering size(MdlSV.SupportVectors).
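The dimensions described above can be inspected directly, and they suggest why a linear model can do without its support vectors: the p-element Beta vector summarizes them. The sketch below assumes the default fitcsvm settings (no standardization, kernel scale of 1), under which Beta should match the weighted sum of support vectors computed from Alpha and SupportVectorLabels. The names betaFromSV and maxDiff are illustrative.

```matlab
load ionosphere
MdlSV = fitcsvm(X,Y);                      % default linear kernel

% Sizes of the properties discussed in the tips.
size(MdlSV.SupportVectors)                 % nsv-by-p
size(MdlSV.Alpha)                          % nsv-by-1
size(MdlSV.SupportVectorLabels)            % nsv-by-1
size(MdlSV.Beta)                           % p-by-1

% Assumption: default settings (no standardization, kernel scale 1).
% Rebuild the linear coefficients from the support vectors.
betaFromSV = MdlSV.SupportVectors' * (MdlSV.Alpha .* MdlSV.SupportVectorLabels);
maxDiff = max(abs(betaFromSV - MdlSV.Beta))
```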

Algorithms

predict and resubPredict compute SVM scores f(x), and then derive labels and posterior probability estimates from those scores, using

f(x)=xβ+b.

β is Mdl.Beta and b is Mdl.Bias, that is, the Beta and Bias properties of Mdl, respectively. For more details, see Support Vector Machines for Binary Classification.
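The score formula can be reproduced by hand from the Beta and Bias properties. This is a minimal sketch, assuming the default fitcsvm settings (no standardization, kernel scale of 1) and that the second column of the score matrix returned by predict holds the positive-class scores. The names f and maxDiff are illustrative.

```matlab
load ionosphere
Mdl = discardSupportVectors(fitcsvm(X,Y));   % linear model without support vectors

% Manual scores: each row x of X gives f(x) = x*Beta + Bias.
f = X*Mdl.Beta + Mdl.Bias;

% Scores from predict; column 2 corresponds to the positive class.
[~,score] = predict(Mdl,X);
maxDiff = max(abs(f - score(:,2)))
```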

Introduced in R2015a