# Optimize a Cross-Validated SVM Classifier Using `bayesopt`

This example shows how to optimize an SVM classifier using the `bayesopt` function. The classifier works on the locations of points generated from a Gaussian mixture model. Hastie, Tibshirani, and Friedman describe the model in The Elements of Statistical Learning (2009), page 17. The model begins with 10 base points for a "green" class, distributed as 2-D independent normals with mean (1,0) and unit variance, and 10 base points for a "red" class, distributed as 2-D independent normals with mean (0,1) and unit variance. For each class (green and red), generate 100 random points as follows:

1. Choose a base point m of the appropriate color uniformly at random.

2. Generate an independent random point with a 2-D normal distribution with mean m and variance I/5, where I is the 2-by-2 identity matrix. In this example, use variance I/50 to show the advantage of optimization more clearly.

After generating 100 green and 100 red points, classify them using `fitcsvm`. Then use `bayesopt` to optimize the parameters of the resulting SVM model with respect to cross validation.

### Generate the Points and Classifier

Generate the 10 base points for each class.

```
rng default
grnpop = mvnrnd([1,0],eye(2),10);
redpop = mvnrnd([0,1],eye(2),10);
```

View the base points.

```
plot(grnpop(:,1),grnpop(:,2),'go')
hold on
plot(redpop(:,1),redpop(:,2),'ro')
hold off
```

Since some red base points are close to green base points, it can be difficult to classify the data points based on location alone.

Generate the 100 data points of each class.

```
redpts = zeros(100,2);
grnpts = redpts;
for i = 1:100
    grnpts(i,:) = mvnrnd(grnpop(randi(10),:),eye(2)*0.02);
    redpts(i,:) = mvnrnd(redpop(randi(10),:),eye(2)*0.02);
end
```

View the data points.

```
figure
plot(grnpts(:,1),grnpts(:,2),'go')
hold on
plot(redpts(:,1),redpts(:,2),'ro')
hold off
```

### Prepare Data for Classification

Put the data into one matrix, and make a vector `grp` that labels the class of each point.

```
cdata = [grnpts;redpts];
grp = ones(200,1); % Green label 1, red label -1
grp(101:200) = -1;
```

### Prepare Cross-Validation

Set up a partition for cross-validation. This step fixes the training and test sets that the optimization uses at each step.

```
c = cvpartition(200,'KFold',10);
```
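As a quick sanity check (an addition beyond the original example), you can inspect the partition object to confirm that each of the 10 folds holds out 20 of the 200 points:

```
c.NumTestSets       % Number of folds: 10
sum(test(c,1))      % Points in the first test fold: 20
sum(training(c,1))  % Points in the first training fold: 180
```

The `test` and `training` methods return logical index vectors for a given fold, which is also how `fitcsvm` uses the partition internally during cross-validation.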

### Prepare Variables for Bayesian Optimization

Set up variables for an objective function that takes an input `z = [sigma,box]` and returns the cross-validation loss value at `z`. Treat both components of `z` as positive, log-transformed variables between `1e-5` and `1e5`. Choose a wide range, because you don't know in advance which values are likely to be good.

```
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');
box = optimizableVariable('box',[1e-5,1e5],'Transform','log');
```

### Objective Function

The objective function handle computes the cross-validation loss at the parameters `[sigma,box]`. For details, see `kfoldLoss`.

`bayesopt` passes the variable `z` to the objective function as a one-row table.

```
minfn = @(z)kfoldLoss(fitcsvm(cdata,grp,'CVPartition',c,...
    'KernelFunction','rbf','BoxConstraint',z.box,...
    'KernelScale',z.sigma));
```

### Optimize Classifier

Search for the best parameters `[sigma,box]` using `bayesopt`. For reproducibility, choose the `'expected-improvement-plus'` acquisition function. The default acquisition function depends on run time and so can give varying results.

```
results = bayesopt(minfn,[sigma,box],'IsObjectiveDeterministic',true,...
    'AcquisitionFunctionName','expected-improvement-plus')
```
```
|====================================================================================================|
| Iter | Eval   | Objective | Objective | BestSoFar  | BestSoFar | sigma      | box        |
|      | result |           | runtime   | (observed) | (estim.)  |            |            |
|====================================================================================================|
|    1 | Best   |      0.61 |   0.49723 |       0.61 |      0.61 | 0.00013375 |      13929 |
|    2 | Best   |     0.345 |   0.23801 |      0.345 |     0.345 |      24526 |      1.936 |
|    3 | Accept |      0.61 |   0.27614 |      0.345 |     0.345 |  0.0026459 | 0.00084929 |
|    4 | Accept |     0.345 |   0.37522 |      0.345 |     0.345 |     3506.3 | 6.7427e-05 |
|    5 | Accept |     0.345 |   0.23259 |      0.345 |     0.345 |     9135.2 |     571.87 |
|    6 | Accept |     0.345 |   0.18231 |      0.345 |     0.345 |      99701 |      10223 |
|    7 | Best   |     0.295 |   0.21768 |      0.295 |     0.295 |     455.88 |     9957.4 |
|    8 | Best   |      0.24 |     1.493 |       0.24 |      0.24 |      31.56 |      99389 |
|    9 | Accept |      0.24 |    1.8306 |       0.24 |      0.24 |     10.451 |      64429 |
|   10 | Accept |      0.35 |   0.29761 |       0.24 |      0.24 |     17.331 | 1.0264e-05 |
|   11 | Best   |      0.23 |    1.1237 |       0.23 |      0.23 |     16.005 |      90155 |
|   12 | Best   |       0.1 |   0.28838 |        0.1 |       0.1 |    0.36562 |      80878 |
|   13 | Accept |     0.115 |   0.15901 |        0.1 |       0.1 |     0.1793 |      68459 |
|   14 | Accept |     0.105 |   0.25403 |        0.1 |       0.1 |     0.2267 |      95421 |
|   15 | Best   |     0.095 |   0.16714 |      0.095 |     0.095 |    0.28999 |  0.0058227 |
|   16 | Best   |     0.075 |   0.21326 |      0.075 |     0.075 |    0.30554 |     8.9017 |
|   17 | Accept |     0.085 |   0.23925 |      0.075 |     0.075 |    0.41122 |     4.4476 |
|   18 | Accept |     0.085 |   0.31168 |      0.075 |     0.075 |    0.25565 |     7.8038 |
|   19 | Accept |     0.075 |   0.26502 |      0.075 |     0.075 |    0.32869 |     18.076 |
|   20 | Accept |     0.085 |   0.20048 |      0.075 |     0.075 |    0.32442 |     5.2118 |
|====================================================================================================|
| Iter | Eval   | Objective | Objective | BestSoFar  | BestSoFar | sigma      | box        |
|      | result |           | runtime   | (observed) | (estim.)  |            |            |
|====================================================================================================|
|   21 | Accept |       0.3 |   0.13136 |      0.075 |     0.075 |     1.3592 |  0.0098067 |
|   22 | Accept |      0.12 |   0.23338 |      0.075 |     0.075 |    0.17515 | 0.00070913 |
|   23 | Accept |     0.175 |   0.24124 |      0.075 |     0.075 |     0.1252 |   0.010749 |
|   24 | Accept |     0.105 |   0.18186 |      0.075 |     0.075 |     1.1664 |      31.13 |
|   25 | Accept |       0.1 |   0.17713 |      0.075 |     0.075 |    0.57465 |     2013.8 |
|   26 | Accept |      0.12 |   0.13144 |      0.075 |     0.075 |    0.42922 | 1.1602e-05 |
|   27 | Accept |      0.12 |    0.1642 |      0.075 |     0.075 |    0.42956 | 0.00027218 |
|   28 | Accept |     0.095 |   0.18907 |      0.075 |     0.075 |     0.4806 |     13.452 |
|   29 | Accept |     0.105 |   0.18856 |      0.075 |     0.075 |    0.19755 |     943.87 |
|   30 | Accept |     0.205 |   0.19236 |      0.075 |     0.075 |     3.5051 |     93.492 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 62.3158 seconds
Total objective function evaluation time: 10.693

Best observed feasible point:
     sigma      box  
    _______    ______
    0.30554    8.9017

Observed objective function value = 0.075
Estimated objective function value = 0.075
Function evaluation time = 0.21326

Best estimated feasible point (according to models):
     sigma      box  
    _______    ______
    0.32869    18.076

Estimated objective function value = 0.075
Estimated function evaluation time = 0.23015
```
```
results = 

  BayesianOptimization with properties:

                      ObjectiveFcn: [function_handle]
              VariableDescriptions: [1x2 optimizableVariable]
                           Options: [1x1 struct]
                      MinObjective: 0.0750
                   XAtMinObjective: [1x2 table]
             MinEstimatedObjective: 0.0750
          XAtMinEstimatedObjective: [1x2 table]
           NumObjectiveEvaluations: 30
                  TotalElapsedTime: 62.3158
                         NextPoint: [1x2 table]
                            XTrace: [30x2 table]
                    ObjectiveTrace: [30x1 double]
                  ConstraintsTrace: []
                     UserDataTrace: {30x1 cell}
      ObjectiveEvaluationTimeTrace: [30x1 double]
                IterationTimeTrace: [30x1 double]
                        ErrorTrace: [30x1 double]
                  FeasibilityTrace: [30x1 logical]
       FeasibilityProbabilityTrace: [30x1 double]
               IndexOfMinimumTrace: [30x1 double]
             ObjectiveMinimumTrace: [30x1 double]
    EstimatedObjectiveMinimumTrace: [30x1 double]
```

Use the results to train a new, optimized SVM classifier.

```
z(1) = results.XAtMinObjective.sigma;
z(2) = results.XAtMinObjective.box;
SVMModel = fitcsvm(cdata,grp,'KernelFunction','rbf',...
    'KernelScale',z(1),'BoxConstraint',z(2));
```

Plot the classification boundaries. To visualize the support vector classifier, predict scores over a grid.

```
d = 0.02;
[x1Grid,x2Grid] = meshgrid(min(cdata(:,1)):d:max(cdata(:,1)),...
    min(cdata(:,2)):d:max(cdata(:,2)));
xGrid = [x1Grid(:),x2Grid(:)];
[~,scores] = predict(SVMModel,xGrid);

h = nan(3,1); % Preallocation
figure;
h(1:2) = gscatter(cdata(:,1),cdata(:,2),grp,'rg','+*');
hold on
h(3) = plot(cdata(SVMModel.IsSupportVector,1),...
    cdata(SVMModel.IsSupportVector,2),'ko');
contour(x1Grid,x2Grid,reshape(scores(:,2),size(x1Grid)),[0 0],'k');
legend(h,{'-1','+1','Support Vectors'},'Location','Southeast');
axis equal
hold off
```

### Evaluate Accuracy on New Data

Generate and classify some new data points.

```
grnobj = gmdistribution(grnpop,.2*eye(2));
redobj = gmdistribution(redpop,.2*eye(2));

newData = random(grnobj,10);
newData = [newData;random(redobj,10)];
grpData = ones(20,1); % green = 1
grpData(11:20) = -1;  % red = -1

v = predict(SVMModel,newData);

h = nan(7,1); % Preallocation
figure;
h(1:2) = gscatter(cdata(:,1),cdata(:,2),grp,'rg','+*');
hold on
h(3:4) = gscatter(newData(:,1),newData(:,2),v,'mc','**');
h(5) = plot(cdata(SVMModel.IsSupportVector,1),...
    cdata(SVMModel.IsSupportVector,2),'ko');
contour(x1Grid,x2Grid,reshape(scores(:,2),size(x1Grid)),[0 0],'k');
legend(h(1:5),{'-1 (training)','+1 (training)','-1 (classified)',...
    '+1 (classified)','Support Vectors'},'Location','Southeast');
axis equal
hold off
```

See which new data points are correctly classified. Mark the correctly classified points with red squares and the misclassified points with black squares.

```
mydiff = (v == grpData); % Classified correctly

figure;
h(1:2) = gscatter(cdata(:,1),cdata(:,2),grp,'rg','+*');
hold on
h(3:4) = gscatter(newData(:,1),newData(:,2),v,'mc','**');
h(5) = plot(cdata(SVMModel.IsSupportVector,1),...
    cdata(SVMModel.IsSupportVector,2),'ko');
contour(x1Grid,x2Grid,reshape(scores(:,2),size(x1Grid)),[0 0],'k');

for ii = mydiff % Plot red squares around correct pts
    h(6) = plot(newData(ii,1),newData(ii,2),'rs','MarkerSize',12);
end

for ii = not(mydiff) % Plot black squares around incorrect pts
    h(7) = plot(newData(ii,1),newData(ii,2),'ks','MarkerSize',12);
end
legend(h,{'-1 (training)','+1 (training)','-1 (classified)',...
    '+1 (classified)','Support Vectors','Correctly Classified',...
    'Misclassified'},'Location','Southeast');
hold off
```
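As a final numeric check (an addition beyond the original example), you can summarize the accuracy on the new data directly from `mydiff`. The exact value depends on the random points drawn above, so no specific number is shown here:

```
% Fraction of the 20 new points that the optimized SVM classifies correctly
accuracy = sum(mydiff)/numel(mydiff);
fprintf('Accuracy on new data: %.0f%%\n',100*accuracy)
```

Because `mydiff` is a logical vector, `sum(mydiff)` counts the correctly classified points and `numel(mydiff)` is the total number of new points.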