discardSupportVectors

Discard support vectors for linear support vector machine (SVM) classifier

Syntax

Mdl = discardSupportVectors(MdlSV)

Description

Mdl = discardSupportVectors(MdlSV) returns the trained, linear support vector machine (SVM) model Mdl. Mdl and the trained, linear SVM model MdlSV are the same type of object: both are either ClassificationSVM objects or CompactClassificationSVM objects. However, Mdl and MdlSV differ in the following ways:

The Alpha, SupportVectors, and SupportVectorLabels properties are empty ([]) in Mdl.
If you display Mdl, the software lists the Beta property instead of Alpha.
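
One way to see why Beta can stand in for Alpha in a linear model is the standard dual-to-primal identity for linear SVMs, sketched below. The notation is an assumption made for illustration: α_j are the entries of Alpha, y_j ∈ {−1, 1} are the entries of SupportVectorLabels, x_j are the rows of SupportVectors, and the kernel scale is taken to be 1.

```latex
f(x) = \sum_{j=1}^{n_{sv}} \alpha_j y_j \, x_j' x + b
     = x' \Big( \sum_{j=1}^{n_{sv}} \alpha_j y_j x_j \Big) + b
     = x' \beta + b,
\qquad
\beta = \sum_{j=1}^{n_{sv}} \alpha_j y_j x_j .
```

Under these assumptions, the p-element vector β carries everything the score needs, so the n_sv-by-p support vector matrix no longer has to be stored.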

Examples

Discard Support Vectors

Create a linear SVM model that is more memory-efficient by discarding support vectors and other related parameters.

Load the ionosphere data set.

load ionosphere

Train a linear SVM model using the entire data set.

MdlSV = fitcsvm(X,Y)

MdlSV = 
  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
          NumObservations: 351
                    Alpha: [103x1 double]
                     Bias: -3.8829
         KernelParameters: [1x1 struct]
           BoxConstraints: [351x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [351x1 logical]
                   Solver: 'SMO'

Display the number of support vectors in MdlSV.

numSV = size(MdlSV.SupportVectors,1)

numSV = 
103

Display the number of predictor variables in X.

p = size(X,2)

p = 
34

By default, fitcsvm trains a linear SVM model for two-class learning. The software lists Alpha in the display. The model includes 103 support vectors and 34 predictors. If you discard the support vectors, the resulting model consumes less memory.

Discard the support vectors and other related parameters.

Mdl = discardSupportVectors(MdlSV)

Mdl = 
  ClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
          NumObservations: 351
                     Beta: [34x1 double]
                     Bias: -3.8829
         KernelParameters: [1x1 struct]
           BoxConstraints: [351x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [351x1 logical]
                   Solver: 'SMO'

Display the Alpha coefficients in Mdl.

Mdl.Alpha

ans =

     []

Display the support vectors in Mdl.

Mdl.SupportVectors

ans =

     []

Display the support vector class labels in Mdl.

Mdl.SupportVectorLabels

ans =

     []

The software lists Beta in the display instead of Alpha. The Alpha, SupportVectors, and SupportVectorLabels properties are empty.

Compare the sizes of the models.

vars = whos('Mdl','MdlSV');            % vars(1) is Mdl, vars(2) is MdlSV
100*(1 - vars(1).bytes/vars(2).bytes)  % Percent reduction in size

ans = 
20.6748

Mdl is about 21% smaller than MdlSV.
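
As a quick sanity check, you might confirm that the reduced model labels the training data the same way as the original. The following is a minimal sketch that retrains the model so that it runs on its own; the variable names labelsSV, labels, and sameLabels are chosen here for illustration.

```matlab
% Sketch: compare labels predicted by the full and reduced linear SVM models.
load ionosphere                        % Provides X (predictors) and Y (labels)
MdlSV = fitcsvm(X,Y);                  % Linear SVM trained with default options
Mdl = discardSupportVectors(MdlSV);    % Same model without support vectors

labelsSV = predict(MdlSV,X);           % Labels from the original model
labels = predict(Mdl,X);               % Labels from the reduced model

% True if both models assign the same label to every observation
sameLabels = isequal(labelsSV,labels)
```

If sameLabels is false for a model of your own, check that the model is linear before relying on the reduced version.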

Remove MdlSV from the workspace.

clear MdlSV

Reduce Memory Consumption of SVM Models

Compact an SVM model by discarding the stored support vectors and other related estimates. Predict the label for a row of the training data by using the compacted model.

Load the ionosphere data set.

load ionosphere
rng(1); % For reproducibility

Train an SVM model using the default options.

MdlSV = fitcsvm(X,Y);

MdlSV is a ClassificationSVM model containing nonempty values for its Alpha, SupportVectors, and SupportVectorLabels properties.

Reduce the size of the SVM model by discarding the training data, support vectors, and related estimates.

CMdlSV = compact(MdlSV);               % Discard training data
CMdl = discardSupportVectors(CMdlSV);  % Discard support vectors

CMdl is a CompactClassificationSVM model.

Compare the sizes of the SVM models MdlSV and CMdl.

vars = whos('CMdl','MdlSV');           % vars(1) is CMdl, vars(2) is MdlSV
100*(1 - vars(1).bytes/vars(2).bytes)  % Percent reduction in size

ans = 
97.0135

The compacted model CMdl consumes much less memory than the full model.

Predict the label for a random row of the training data by using CMdl. The predict function accepts compact SVM models and, for linear SVM models, does not require the Alpha, SupportVectors, and SupportVectorLabels properties to predict labels for new observations.

idx = randsample(size(X,1),1)

idx = 
147

predictedLabel = predict(CMdl,X(idx,:))

predictedLabel = 1x1 cell array
    {'b'}

trueLabel = Y(idx)

trueLabel = 1x1 cell array
    {'b'}

Input Arguments

`MdlSV` — Trained, linear SVM model
`ClassificationSVM` model | `CompactClassificationSVM` model

Trained, linear SVM model, specified as a ClassificationSVM or CompactClassificationSVM model.

If the field MdlSV.KernelParameters.Function is not 'linear' (that is, MdlSV is not a linear SVM model), the software returns an error.
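
To avoid that error, you can test the kernel function before calling discardSupportVectors. The sketch below trains a Gaussian-kernel model purely for illustration; the names MdlRBF and isLinear are hypothetical and not part of the function's interface.

```matlab
% Sketch: guard the call to discardSupportVectors with a kernel check.
load ionosphere
MdlRBF = fitcsvm(X,Y,'KernelFunction','rbf');   % Nonlinear model for illustration

% discardSupportVectors applies only when the kernel function is 'linear'
isLinear = strcmp(MdlRBF.KernelParameters.Function,'linear');

if isLinear
    Mdl = discardSupportVectors(MdlRBF);
else
    disp('Model is not linear; keeping the support vectors.')
end
```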

Tips

For a trained, linear SVM model, the SupportVectors property is an n_sv-by-p matrix. n_sv is the number of support vectors (at most the training sample size) and p is the number of predictors, or features. The Alpha and SupportVectorLabels properties are vectors with n_sv elements. These properties can be large for complex data sets containing many observations or examples. The Beta property is a vector with p elements.
If the trained SVM model has many support vectors, use discardSupportVectors to reduce the amount of space consumed by the trained, linear SVM model. You can display the size of the support vector matrix by entering size(MdlSV.SupportVectors).
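
The dimensions above give a rough estimate of how much storage the discarded properties occupy. The following sketch assumes double-precision storage (8 bytes per element) and ignores object overhead, so treat the numbers as approximate; the variable names are chosen here for illustration.

```matlab
% Sketch: estimate the storage taken by the support vector properties.
load ionosphere
MdlSV = fitcsvm(X,Y);

[nSV,p] = size(MdlSV.SupportVectors);   % nSV support vectors, p predictors

bytesSV     = 8*nSV*p;   % SupportVectors: nSV-by-p matrix
bytesAlpha  = 8*nSV;     % Alpha: nSV-by-1 vector
bytesLabels = 8*nSV;     % SupportVectorLabels: nSV-by-1 vector
bytesBeta   = 8*p;       % Beta: p-by-1 vector, kept after discarding

fprintf('Discarded: about %d bytes; Beta: %d bytes\n', ...
    bytesSV + bytesAlpha + bytesLabels, bytesBeta);
```

The potential saving grows with the number of support vectors, whereas Beta depends only on the number of predictors.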

Algorithms

predict and resubPredict estimate SVM scores f(x), and subsequently assign labels and estimate posterior probabilities, using

$f(x) = x'\beta + b.$

β is Mdl.Beta and b is Mdl.Bias, that is, the Beta and Bias properties of Mdl, respectively. For more details, see Support Vector Machines for Binary Classification.
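
The score formula can be checked by hand against the output of predict. This is a minimal sketch that assumes a model trained with default options (no standardization and a kernel scale of 1); with other settings, the predictors would need the same scaling the model applies. The names f and maxDiff are chosen here for illustration.

```matlab
% Sketch: reproduce linear SVM scores from Beta and Bias.
load ionosphere
MdlSV = fitcsvm(X,Y);                  % Default options: linear kernel
Mdl = discardSupportVectors(MdlSV);

[~,score] = predict(Mdl,X);            % Second column: positive-class scores

% f(x) = x'*beta + b, evaluated for every row of X at once
f = X*Mdl.Beta + Mdl.Bias;

% Largest absolute difference between the manual and predicted scores
maxDiff = max(abs(f - score(:,2)))
```

A value of maxDiff near zero indicates that Beta and Bias are enough to reproduce the scores.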

Extended Capabilities

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

This function fully supports GPU arrays. For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

Version History

Introduced in R2015a