# fitcauto

Automatically select classification model with optimized hyperparameters

## Syntax

``Mdl = fitcauto(Tbl,ResponseVarName)``
``Mdl = fitcauto(Tbl,formula)``
``Mdl = fitcauto(Tbl,Y)``
``Mdl = fitcauto(X,Y)``
``Mdl = fitcauto(___,Name,Value)``
``[Mdl,OptimizationResults] = fitcauto(___)``

## Description

Given predictor and response data, `fitcauto` automatically tries a selection of classification model types with different hyperparameter values. By default, the function uses Bayesian optimization to select models and their hyperparameter values, and computes the cross-validation classification error for each model. After the optimization is complete, `fitcauto` returns the model, trained on the entire data set, that is expected to best classify new data. You can use the `predict` and `loss` object functions of the returned model to classify new data and compute the test set classification error, respectively.
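For example, a minimal sketch of this workflow (the variable names here are illustrative):

```
Mdl = fitcauto(XTrain,YTrain);     % automatic model and hyperparameter selection
labels = predict(Mdl,XTest);       % classify new observations
testError = loss(Mdl,XTest,YTest); % test set classification error
```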

Use `fitcauto` when you are uncertain which classifier types best suit your data. For information on alternative methods for tuning hyperparameters of classification models, see Alternative Functionality.

If your data contains over 10,000 observations, consider using an asynchronous successive halving algorithm (ASHA) instead of Bayesian optimization when you run `fitcauto`. ASHA optimization often finds good solutions faster than Bayesian optimization for data sets with many observations.

`Mdl = fitcauto(Tbl,ResponseVarName)` returns a classification model `Mdl` with tuned hyperparameters. The table `Tbl` contains the predictor variables and the response variable, where `ResponseVarName` is the name of the response variable.

`Mdl = fitcauto(Tbl,formula)` uses `formula` to specify the response variable and the predictor variables to consider among the variables in `Tbl`.

`Mdl = fitcauto(Tbl,Y)` uses the predictor variables in table `Tbl` and the class labels in vector `Y`.

`Mdl = fitcauto(X,Y)` uses the predictor variables in matrix `X` and the class labels in vector `Y`.

`Mdl = fitcauto(___,Name,Value)` specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, use the `HyperparameterOptimizationOptions` name-value argument to specify whether to use Bayesian optimization (default) or an asynchronous successive halving algorithm (ASHA). To use ASHA optimization, specify `"HyperparameterOptimizationOptions",struct("Optimizer","asha")`. You can include additional fields in the structure to control other aspects of the optimization.
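For example, a minimal sketch (hedged; the iteration cap of 300 is an illustrative choice, and `MaxObjectiveEvaluations` corresponds to the "Total iterations" entry in the verbose display shown later):

```
opts = struct("Optimizer","asha","MaxObjectiveEvaluations",300);
Mdl = fitcauto(X,Y,"HyperparameterOptimizationOptions",opts);
```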

`[Mdl,OptimizationResults] = fitcauto(___)` also returns `OptimizationResults`, which contains the results of the model selection and hyperparameter tuning process. This output is a `BayesianOptimization` object when you use Bayesian optimization, and a table when you use ASHA optimization.
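A minimal sketch of capturing the second output (with the default Bayesian optimizer, so the class check returns `'BayesianOptimization'`):

```
[Mdl,OptimizationResults] = fitcauto(X,Y);
class(OptimizationResults) % 'BayesianOptimization' (a table for ASHA)
```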

## Examples

Use `fitcauto` to automatically select a classification model with optimized hyperparameters, given predictor and response data stored in a table.

Load the `carbig` data set, which contains measurements of cars made in the 1970s and early 1980s.

`load carbig`

Categorize the cars based on whether they were made in the USA.

```
Origin = categorical(cellstr(Origin));
Origin = mergecats(Origin,["France","Japan","Germany", ...
    "Sweden","Italy","England"],"NotUSA");
```

Create a table containing the predictor variables `Acceleration`, `Displacement`, and so on, as well as the response variable `Origin`.

```
cars = table(Acceleration,Displacement,Horsepower, ...
    Model_Year,MPG,Weight,Origin);
```

### Partition Data

Partition the data into training and test sets. Use approximately 80% of the observations for the model selection and hyperparameter tuning process, and 20% of the observations to test the performance of the final model returned by `fitcauto`. Use `cvpartition` to partition the data.

```rng("default") % For reproducibility of the data partition c = cvpartition(Origin,"Holdout",0.2); trainingIdx = training(c); % Training set indices carsTrain = cars(trainingIdx,:); testIdx = test(c); % Test set indices carsTest = cars(testIdx,:);```

### Run `fitcauto`

Pass the training data to `fitcauto`. By default, `fitcauto` determines appropriate model types to try, uses Bayesian optimization to find good hyperparameter values, and returns a trained model `Mdl` with the best expected performance. Additionally, `fitcauto` provides a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.

Expect this process to take some time. If you have a Parallel Computing Toolbox™ license, you can speed up the optimization by running it in parallel. To do so, pass `"HyperparameterOptimizationOptions",struct("UseParallel",true)` to `fitcauto` as a name-value argument.
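For reference, a sketch of the parallel version of the call below (this example runs serially):

```
Mdl = fitcauto(carsTrain,"Origin", ...
    "HyperparameterOptimizationOptions",struct("UseParallel",true));
```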

`Mdl = fitcauto(carsTrain,"Origin");`
```
Warning: It is recommended that you first standardize all numeric predictors when optimizing the Naive Bayes 'Width' parameter. Ignore this warning if you have done that.
```
```
Learner types to explore: ensemble, knn, nb, net, svm, tree
Total iterations (MaxObjectiveEvaluations): 180
Total time (MaxTime): Inf

|============================================================================================================================|
| Iter | Eval   | Validation | Time for training  | Observed min    | Estimated min   | Learner  | Hyperparameter: Value     |
|      | result | loss       | & validation (sec) | validation loss | validation loss |          |                           |
|============================================================================================================================|
|    1 | Best   |    0.37179 |            0.62903 |         0.37179 |         0.37179 |      svm | BoxConstraint: 0.11704    |
|      |        |            |                    |                 |                 |          | KernelScale: 0.004903     |
|    2 | Best   |    0.22769 |            0.42586 |         0.22769 |         0.22769 |       nb | DistributionNames: normal |
|      |        |            |                    |                 |                 |          | Width: NaN                |
|    3 | Best   |    0.19231 |            0.42729 |         0.19231 |         0.19231 |      knn | NumNeighbors: 3           |
|    4 | Accept |    0.22769 |             0.1005 |         0.19231 |         0.19231 |       nb | DistributionNames: normal |
|      |        |            |                    |                 |                 |          | Width: NaN                |
|    5 | Best   |     0.1891 |            0.13361 |          0.1891 |         0.19096 |      knn | NumNeighbors: 12          |
|    6 | Best   |    0.10154 |            0.28324 |         0.10154 |         0.10154 |     tree | MinLeafSize: 5            |
|    7 | Accept |    0.16026 |             7.2743 |         0.10154 |         0.10154 |      net | Activations: tanh         |
|      |        |            |                    |                 |                 |          | Standardize: true         |
|      |        |            |                    |                 |                 |          | Lambda: 0.025856          |
|      |        |            |                    |                 |                 |          | LayerSizes: [ 286 51 3 ]  |
|    8 | Accept |    0.37179 |           0.096727 |         0.10154 |         0.10154 |      svm | BoxConstraint: 1.2607     |
|      |        |            |                    |                 |                 |          | KernelScale: 97.75        |
|    9 | Accept |    0.37179 |            0.25096 |         0.10154 |         0.10154 |      net | Activations: relu         |
|      |        |            |                    |                 |                 |          | Standardize: true         |
|      |        |            |                    |                 |                 |          | Lambda: 211.47            |
|      |        |            |                    |                 |                 |          | LayerSizes: [ 102 222 ]   |
|   10 | Accept |    0.19231 |           0.070249 |         0.10154 |         0.10154 |      knn | NumNeighbors: 15          |
...
|   37 | Best   |   0.098462 |             5.7806 |        0.098462 |         0.11727 | ensemble | Method: LogitBoost        |
|      |        |            |                    |                 |                 |          | NumLearningCycles: 218    |
|      |        |            |                    |                 |                 |          | MinLeafSize: 48           |
...
|   97 | Best   |   0.089231 |             5.7135 |        0.089231 |         0.10371 | ensemble | Method: LogitBoost        |
|      |        |            |                    |                 |                 |          | NumLearningCycles: 242    |
|      |        |            |                    |                 |                 |          | MinLeafSize: 13           |
...
|  103 | Best   |   0.086154 |             4.9401 |        0.086154 |        0.095252 | ensemble | Method: LogitBoost        |
|      |        |            |                    |                 |                 |          | NumLearningCycles: 208    |
|      |        |            |                    |                 |                 |          | MinLeafSize: 15           |
...
```

```
__________________________________________________________
Optimization completed.
Total iterations: 180
Total elapsed time: 699.614 seconds
Total time for training and validation: 493.3351 seconds

Best observed learner is an ensemble model with:
    Learner:             ensemble
    Method:              LogitBoost
    NumLearningCycles:   208
    MinLeafSize:         15
Observed validation loss: 0.086154
Time for training and validation: 4.9401 seconds

Best estimated learner (returned model) is an ensemble model with:
    Learner:             ensemble
    Method:              LogitBoost
    NumLearningCycles:   209
    MinLeafSize:         15
Estimated validation loss: 0.089192
Estimated time for training and validation: 5.0288 seconds
```

The final model returned by `fitcauto` corresponds to the best estimated learner. Before returning the model, the function retrains it using the entire training data (`carsTrain`), the listed `Learner` (or model) type, and the displayed hyperparameter values.
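As a quick check, you can confirm that the retrained model matches this display. The sketch below assumes `Mdl` is a `ClassificationEnsemble` object, as in the run above:

```
Mdl.Method     % 'LogitBoost'
Mdl.NumTrained % 209, matching NumLearningCycles in the display
```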

### Evaluate Test Set Performance

Evaluate the performance of the model on the test set.

`testAccuracy = 1 - loss(Mdl,carsTest,"Origin")`
```
testAccuracy = 0.9263
```
`confusionchart(carsTest.Origin,predict(Mdl,carsTest))`

Use `fitcauto` to automatically select a classification model with optimized hyperparameters, given predictor and response data stored in separate variables.

Load the `humanactivity` data set. This data set contains 24,075 observations of five physical human activities: Sitting (1), Standing (2), Walking (3), Running (4), and Dancing (5). Each observation has 60 features extracted from acceleration data measured by smartphone accelerometer sensors. The variable `feat` contains the predictor data matrix of the 60 features for the 24,075 observations, and the response variable `actid` contains the activity IDs for the observations as integers.

`load humanactivity`

### Partition Data

Partition the data into training and test sets. Use 90% of the observations to select a model, and 10% of the observations to validate the final model returned by `fitcauto`. Use `cvpartition` to reserve 10% of the observations for testing.

```rng("default") % For reproducibility of the partition c = cvpartition(actid,"Holdout",0.10); trainingIndices = training(c); % Indices for the training set XTrain = feat(trainingIndices,:); YTrain = actid(trainingIndices); testIndices = test(c); % Indices for the test set XTest = feat(testIndices,:); YTest = actid(testIndices);```

### Run `fitcauto`

Pass the training data to `fitcauto`. Because the training data `XTrain` has more than 10,000 observations, use ASHA optimization rather than Bayesian optimization. The `fitcauto` function randomly selects appropriate model (or learner) types with different hyperparameter values, trains the models on a small subset of the training data, promotes the models that perform well, and retrains the promoted models on progressively larger sets of training data. The function returns the model with the best cross-validation performance, retrained on all the training data, and a table that contains the details of the optimization. Specify to run the optimization in parallel (requires Parallel Computing Toolbox™).

By default, `fitcauto` provides a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.

```
options = struct("Optimizer","asha","UseParallel",true);
[Mdl,OptimizationResults] = fitcauto(XTrain,YTrain, ...
    "HyperparameterOptimizationOptions",options);
```
```
Warning: It is recommended that you first standardize all numeric predictors when optimizing the Naive Bayes 'Width' parameter. Ignore this warning if you have done that.
```
```
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 8).
Copying objective function to workers...
Done copying objective function to workers.
Learner types to explore: ensemble, knn, nb, net, svm, tree
Total iterations (MaxObjectiveEvaluations): 595
Total time (MaxTime): Inf

|=====================================================================================================================================|
| Iter | Active  | Eval   | Validation | Time for training  | Observed min    | Training set | Learner  | Hyperparameter: Value     |
|      | workers | result | loss       | & validation (sec) | validation loss | size         |          |                           |
|=====================================================================================================================================|
|    1 |       8 | Best   |    0.74165 |             2.2322 |         0.74165 |          271 |     tree | MinLeafSize: 945          |
|    2 |       7 | Accept |    0.74165 |             9.0692 |        0.049289 |          271 |      knn | NumNeighbors: 1726        |
|    3 |       7 | Best   |   0.049289 |             3.3616 |        0.049289 |          271 |       nb | DistributionNames: normal |
|      |         |        |            |                    |                 |              |          | Width: NaN                |
|    4 |       7 | Accept |    0.74165 |             9.2877 |        0.049289 |          271 |      knn | NumNeighbors: 3072        |
|    5 |       8 | Best   |   0.046566 |            0.81486 |        0.046566 |         1084 |       nb | DistributionNames: normal |
|      |         |        |            |                    |                 |              |          | Width: NaN                |
|    6 |       8 | Accept |    0.13379 |             3.0947 |        0.046566 |          271 |      knn | NumNeighbors: 46          |
|    7 |       8 | Accept |   0.066457 |             13.692 |        0.046566 |          271 |      net | Activations: sigmoid      |
|      |         |        |            |                    |                 |              |          | Standardize: false        |
|      |         |        |            |                    |                 |              |          | Lambda: 1.927e-08         |
|      |         |        |            |                    |                 |              |          | LayerSizes: [ 10 56 28 ]  |
|    8 |       8 | Accept |   0.096225 |              3.801 |        0.046566 |          271 |      svm | Coding: onevsone          |
|      |         |        |            |                    |                 |              |          | BoxConstraint: 8.8825     |
|      |         |        |            |                    |                 |              |          | KernelScale: 73.89        |
...
|   12 |       8 | Best   |    0.04472 |             3.1496 |         0.04472 |          271 |      knn | NumNeighbors: 2           |
...
|   14 |       8 | Best   |   0.041536 |             2.2078 |        0.041536 |          271 |      knn | NumNeighbors: 3           |
...
|   18 |       8 | Best   |   0.029629 |             6.9518 |        0.029629 |         1084 |      knn | NumNeighbors: 2           |
...
|   33 |       8 | Best   |   0.019522 |             19.771 |        0.019522 |         4334 |      knn | NumNeighbors: 2           |
...
|   57 |       8 | Best   |   0.010338 |             22.501 |        0.010338 |         4334 |      net | Activations: tanh         |
|      |         |        |            |                    |                 |              |          | Standardize: true         |
|      |         |        |            |                    |                 |              |          | Lambda: 2.7559e-07        |
|      |         |        |            |                    |                 |              |          | LayerSizes: [ 32 93 ]     |
...
|   77 |       8 | Best   |   0.010061 |             42.264 |        0.010061 |         4334 | ensemble | Method: AdaBoostM2        |
|      |         |        |            |                    |                 |              |          | NumLearningCycles: 223    |
|      |         |        |            |                    |                 |              |          | MinLeafSize: 1            |
|      |         |        |            |                    |                 |              |          | MaxNumSplits: 75          |
...
|  146 |       8 | Best   |  0.0038767 |             74.175 |       0.0038767 |        17335 | ensemble | Method: AdaBoostM2        |
|      |         |        |            |                    |                 |              |          | NumLearningCycles: 223    |
|      |         |        |            |                    |                 |              |          | MinLeafSize: 1            |
|      |         |        |            |                    |                 |              |          | MaxNumSplits: 75          |
...
```

```
__________________________________________________________
Optimization completed.
Total iterations: 595
Total elapsed time: 1276.4375 seconds
Total time for training and validation: 9777.0453 seconds

Best observed learner is an ensemble model with:
    Learner:              ensemble
    Method:               AdaBoostM2
    NumLearningCycles:    223
    MinLeafSize:          1
    MaxNumSplits:         75
Observed validation loss: 0.0038767
Time for training and validation: 74.175 seconds

Documentation for fitcauto display
```

The final model returned by `fitcauto` corresponds to the best observed learner. Before returning the model, the function retrains it using all the training data (`XTrain` and `YTrain`), the listed `Learner` (or model) type, and the displayed hyperparameter values.

Evaluate Test Set Performance

Evaluate the final model performance on the test data set.

`testAccuracy = 1 - loss(Mdl,XTest,YTest)`
```testAccuracy = 0.9958 ```

The final model correctly classifies over 99% of the observations.

Use `fitcauto` to automatically select a classification model with optimized hyperparameters, given predictor and response data stored in a table. Before passing data to `fitcauto`, perform feature selection to remove unimportant predictors from the data set.

Read the sample file `CreditRating_Historical.dat` into a table. The predictor data consists of financial ratios and industry sector information for a list of corporate customers. The response variable consists of credit ratings assigned by a rating agency. Preview the first few rows of the data set.

```
creditrating = readtable("CreditRating_Historical.dat");
head(creditrating)
```
```
ans=8×8 table
     ID      WC_TA     RE_TA     EBIT_TA    MVE_BVTD    S_TA     Industry    Rating 
    _____    ______    ______    _______    ________    _____    ________    _______

    62394     0.013     0.104     0.036       0.447     0.142        3       {'BB' }
    48608     0.232     0.335     0.062       1.969     0.281        8       {'A'  }
    42444     0.311     0.367     0.074       1.935     0.366        1       {'A'  }
    48631     0.194     0.263     0.062       1.017     0.228        4       {'BBB'}
    43768     0.121     0.413     0.057       3.647     0.466       12       {'AAA'}
    39255    -0.117    -0.799     0.01        0.179     0.082        4       {'CCC'}
    62236     0.087     0.158     0.049       0.816     0.324        2       {'BBB'}
    39354     0.005     0.181     0.034       2.597     0.388        7       {'AA' }
```

Because each value in the `ID` variable is a unique customer ID, that is, `length(unique(creditrating.ID))` is equal to the number of observations in `creditrating`, the `ID` variable is a poor predictor. Remove the `ID` variable from the table, and convert the `Industry` variable to a `categorical` variable.

```
creditrating = removevars(creditrating,"ID");
creditrating.Industry = categorical(creditrating.Industry);
```

Partition the data into training and test sets. Use approximately 85% of the observations for the model selection and hyperparameter tuning process, and 15% of the observations to test the performance of the final model returned by `fitcauto` on new data. Use `cvpartition` to partition the data.

```
rng("default") % For reproducibility of the partition
c = cvpartition(creditrating.Rating,"Holdout",0.15);
trainingIndices = training(c); % Indices for the training set
testIndices = test(c); % Indices for the test set
creditTrain = creditrating(trainingIndices,:);
creditTest = creditrating(testIndices,:);
```

Perform Feature Selection

Before passing the training data to `fitcauto`, find the important predictors by using the `fscchi2` function. Visualize the predictor scores by using the `bar` function. Because some scores can be `Inf`, and `bar` discards `Inf` values, plot the finite scores first and then plot a finite representation of the `Inf` scores in a different color.

```
[idx,scores] = fscchi2(creditTrain,"Rating");
bar(scores(idx)) % Represents finite scores
hold on
veryImportant = isinf(scores);
finiteMax = max(scores(~veryImportant));
bar(finiteMax*veryImportant(idx)) % Represents Inf scores
hold off
xticklabels(strrep(creditTrain.Properties.VariableNames(idx),"_","\_"))
xtickangle(45)
legend(["Finite Scores","Inf Scores"])
```

Notice that the `Industry` predictor has a low score corresponding to a p-value that is greater than 0.05, which indicates that `Industry` might not be an important feature. Remove the `Industry` feature from the training and test data sets.

```
creditTrain = removevars(creditTrain,"Industry");
creditTest = removevars(creditTest,"Industry");
```

Run `fitcauto`

Pass the training data to `fitcauto`. The function uses Bayesian optimization to select models and their hyperparameter values, and returns a trained model `Mdl` with the best expected performance. Specify to try all available learner types and run the optimization in parallel (requires Parallel Computing Toolbox™). Return a second output `Results` that contains the details of the Bayesian optimization.

Expect this process to take some time. By default, `fitcauto` provides a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.

```
options = struct("UseParallel",true);
[Mdl,Results] = fitcauto(creditTrain,"Rating", ...
    "Learners","all","HyperparameterOptimizationOptions",options);
```
```Warning: It is recommended that you first standardize all numeric predictors when optimizing the Naive Bayes 'Width' parameter. Ignore this warning if you have done that. ```
```Copying objective function to workers... ```
```Warning: Files that have already been attached are being ignored. To see which files are attached see the 'AttachedFiles' property of the parallel pool. ```
```
Done copying objective function to workers.
Learner types to explore: discr, ensemble, kernel, knn, linear, nb, net, svm, tree
Total iterations (MaxObjectiveEvaluations): 270
Total time (MaxTime): Inf

|=======================================================================================================================================================|
| Iter | Active  | Eval   | Validation | Time for training | Observed min    | Estimated min   | Learner  | Hyperparameter: Value                      |
|      | workers | result | loss       | & validation (sec)| validation loss | validation loss |          |                                            |
|=======================================================================================================================================================|
|   1 | 7 | Best   |  0.31798 |  0.78014 | 0.31798 | 0.31798 | tree     | MinLeafSize: 1 |
|   2 | 7 | Accept |  0.47622 |  0.76358 | 0.31798 | 0.31798 | knn      | NumNeighbors: 63 |
|     |   |        |          |          |         |         |          | Distance: correlation |
|   3 | 3 | Accept |  0.42896 |  0.95281 | 0.31798 | 0.31798 | discr    | Delta: 1.5003e-06 |
|     |   |        |          |          |         |         |          | Gamma: 0.47392 |
|   4 | 3 | Accept |  0.74185 |  0.99139 | 0.31798 | 0.31798 | discr    | Delta: 389.85 |
|     |   |        |          |          |         |         |          | Gamma: 0.29596 |
|   5 | 3 | Accept |   0.7236 |   1.0177 | 0.31798 | 0.31798 | knn      | NumNeighbors: 599 |
|     |   |        |          |          |         |         |          | Distance: hamming |
|   6 | 3 | Accept |   0.5848 |    2.195 | 0.31798 | 0.31798 | svm      | Coding: onevsone |
|     |   |        |          |          |         |         |          | BoxConstraint: 0.84619 |
|     |   |        |          |          |         |         |          | KernelScale: 19.346 |
|   7 | 3 | Accept |  0.74185 |   2.3486 | 0.31798 | 0.31798 | svm      | Coding: onevsone |
|     |   |        |          |          |         |         |          | BoxConstraint: 0.0052084 |
|     |   |        |          |          |         |         |          | KernelScale: 128.61 |
|   8 | 8 | Best   |  0.28059 |  0.30573 | 0.28059 | 0.28059 | nb       | DistributionNames: normal |
|     |   |        |          |          |         |         |          | Width: NaN |
  ...
|  18 | 4 | Best   |  0.24978 |   2.7256 | 0.24978 | 0.31798 | linear   | Coding: onevsone |
|     |   |        |          |          |         |         |          | Lambda: 1.3809e-08 |
|     |   |        |          |          |         |         |          | Learner: logistic |
  ...
|  21 | 8 | Best   |  0.24319 |   2.5177 | 0.24319 | 0.31798 | linear   | Coding: onevsone |
|     |   |        |          |          |         |         |          | Lambda: 7.4269e-05 |
|     |   |        |          |          |         |         |          | Learner: svm |
  ...
|  43 | 8 | Best   |  0.23961 |   1.8068 | 0.23961 | 0.27532 | svm      | Coding: onevsone |
|     |   |        |          |          |         |         |          | BoxConstraint: 1.6804 |
|     |   |        |          |          |         |         |          | KernelScale: 0.69915 |
  ...
| 137 | 6 | Accept |  0.25007 |   4.8241 | 0.23961 | 0.25148 | kernel   | Coding: onevsone |
|     |   |        |          |          |         |         |          | KernelScale: 0.54593 |
|     |   |        |          |          |         |         |          | Lambda: 0.0017131 |
| 138 | 6 | Accept |  0.74185 |  0.75998 | 0.23961 | 0.25148 | net      | Activations: sigmoid |
  ...
```

```
__________________________________________________________
Optimization completed.
Total iterations: 271
Total elapsed time: 907.7117 seconds
Total time for training and validation: 5400.9893 seconds

Best observed learner is a net model with:
    Learner:          net
    Activations:      relu
    Standardize:      false
    Lambda:           0.0004658
    LayerSizes:       [1 31 23]
Observed validation loss: 0.23811
Time for training and validation: 21.2958 seconds

Best estimated learner (returned model) is a net model with:
    Learner:          net
    Activations:      none
    Standardize:      false
    Lambda:           0.00036647
    LayerSizes:       [1 6 10]
Estimated validation loss: 0.24112
Estimated time for training and validation: 5.894 seconds

Documentation for fitcauto display
```

The final model returned by `fitcauto` corresponds to the best estimated learner. Before returning the model, the function retrains it using the entire training data (`creditTrain`), the listed `Learner` (or model) type, and the displayed hyperparameter values.

Evaluate Test Set Performance

The model `Mdl` corresponds to the best point in the Bayesian optimization according to the `"min-visited-mean"` criterion. To gauge how the model will perform on new data, look at the observed cross-validation accuracy of the model (`cvAccuracy`) and its general estimated performance based on the Bayesian optimization (`estimatedAccuracy`).

```
[x,~,iteration] = bestPoint(Results,"Criterion","min-visited-mean");
cvError = Results.ObjectiveTrace(iteration);
cvAccuracy = 1 - cvError
```
```cvAccuracy = 0.7595 ```
```
estimatedError = predictObjective(Results,x);
estimatedAccuracy = 1 - estimatedError
```
```estimatedAccuracy = 0.7589 ```

Evaluate the performance of the model on the test set. Create a confusion matrix from the results, and specify the order of the classes in the confusion matrix.

`testAccuracy = 1 - loss(Mdl,creditTest,"Rating")`
```testAccuracy = 0.7437 ```
```
cm = confusionchart(creditTest.Rating,predict(Mdl,creditTest));
sortClasses(cm,["AAA","AA","A","BBB","BB","B","CCC"])
```

## Input Arguments


Sample data, specified as a table. Each row of `Tbl` corresponds to one observation, and each column corresponds to one predictor. Optionally, `Tbl` can contain one additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not accepted.

If `Tbl` contains the response variable, and you want to use all remaining variables in `Tbl` as predictors, specify the response variable using `ResponseVarName`.

If `Tbl` contains the response variable, and you want to use only a subset of the remaining variables in `Tbl` as predictors, specify a formula using `formula`.

If `Tbl` does not contain the response variable, specify a response variable using `Y`. The length of the response variable and the number of rows in `Tbl` must be equal.

Data Types: `table`

Response variable name, specified as the name of a variable in `Tbl`.

You must specify `ResponseVarName` as a character vector or string scalar. For example, if the response variable `Y` is stored as `Tbl.Y`, then specify it as `"Y"`. Otherwise, the software treats all columns of `Tbl`, including `Y`, as predictors when training the model.

The response variable must be a categorical, character, or string array; a logical or numeric vector; or a cell array of character vectors. If `Y` is a character array, then each element of the response variable must correspond to one row of the array.

A good practice is to specify the order of the classes by using the `ClassNames` name-value argument.

Data Types: `char` | `string`

Explanatory model of the response variable and a subset of the predictor variables, specified as a character vector or string scalar in the form `"Y~x1+x2+x3"`. In this form, `Y` represents the response variable, and `x1`, `x2`, and `x3` represent the predictor variables.

To specify a subset of variables in `Tbl` as predictors for training the model, use a formula. If you specify a formula, then the software does not use any variables in `Tbl` that do not appear in `formula`.

The variable names in the formula must be both variable names in `Tbl` (`Tbl.Properties.VariableNames`) and valid MATLAB® identifiers. You can verify the variable names in `Tbl` by using the `isvarname` function. If the variable names are not valid, then you can convert them by using the `matlab.lang.makeValidName` function.

Data Types: `char` | `string`
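A short sketch of this name check, using a hypothetical table `T` whose headers stand in for names read from a file:

```matlab
% A minimal sketch, assuming a table T with headers that may not be valid
% MATLAB identifiers (the names below are hypothetical placeholders).
T = array2table(rand(5,3),"VariableNames",["Rating","WC TA","RE-TA"]);

valid = cellfun(@isvarname,T.Properties.VariableNames); % check each name
if ~all(valid)
    T.Properties.VariableNames = ...
        matlab.lang.makeValidName(T.Properties.VariableNames);
end
% The names are now valid identifiers ("Rating", "WCTA", "RE_TA"), so a
% formula such as "Rating~WCTA+RE_TA" can reference them.
```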

Class labels, specified as a numeric, categorical, or logical vector, a character or string array, or a cell array of character vectors.

• If `Y` is a character array, then each element of the class labels must correspond to one row of the array.

• The length of `Y` must be equal to the number of rows in `Tbl` or `X`.

• A good practice is to specify the class order by using the `ClassNames` name-value argument.

Data Types: `single` | `double` | `categorical` | `logical` | `char` | `string` | `cell`

Predictor data, specified as a numeric matrix.

Each row of `X` corresponds to one observation, and each column corresponds to one predictor.

The length of `Y` and the number of rows in `X` must be equal.

To specify the names of the predictors in the order of their appearance in `X`, use the `PredictorNames` name-value argument.

Data Types: `single` | `double`

Note

The software treats `NaN`, empty character vector (`''`), empty string (`""`), `<missing>`, and `<undefined>` elements as missing data. The software removes rows of data corresponding to missing values in the response variable. However, the treatment of missing values in the predictor data `X` or `Tbl` varies among models (or learners).

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `"HyperparameterOptimizationOptions",struct("MaxObjectiveEvaluations",200,"Verbose",2)` specifies to run 200 iterations of the optimization process (that is, try 200 model hyperparameter combinations), and to display information in the Command Window about the next model hyperparameter combination to be evaluated.

Optimization Options


Types of classification models to try during the optimization, specified as one of the values described below or as one or more learner names from the table that follows. Specify multiple learner names as a string array or cell array.

`"auto"`

`fitcauto` automatically selects a subset of learners, suitable for the given predictor and response data. The learners can have model hyperparameter values that differ from the default. For more information, see Automatic Selection of Learners.

Note

To provide the best hyperparameter optimization experience, the automatic selection of learners behavior is subject to frequent changes. For a more consistent selection of learners across software releases, explicitly specify the models you want to include.

`"all"`

`fitcauto` selects all possible learners.

`"all-linear"`

`fitcauto` selects linear learners: `"discr"` (with a linear discriminant type) and `"linear"`.

`"all-nonlinear"`

`fitcauto` selects all nonlinear learners: `"discr"` (with a quadratic discriminant type), `"ensemble"`, `"kernel"`, `"knn"`, `"nb"`, `"net"`, `"svm"` (with a Gaussian or polynomial kernel), and `"tree"`.

Note

For greater efficiency, `fitcauto` does not select the following combinations of models when you specify one of the previous values.

• `"kernel"` and `"svm"` (with a Gaussian kernel) — `fitcauto` chooses the first when the predictor data has more than 11,000 observations, and the second otherwise.

• `"linear"` and `"svm"` (with a linear kernel) — `fitcauto` chooses the first.

| Learner Name | Description |
| --- | --- |
| `"discr"` | Discriminant analysis classifier |
| `"ensemble"` | Ensemble classification model |
| `"kernel"` | Kernel classification model |
| `"knn"` | k-nearest neighbor model |
| `"linear"` | Linear classification model |
| `"nb"` | Naive Bayes classifier |
| `"net"` | Neural network classifier |
| `"svm"` | Support vector machine classifier |
| `"tree"` | Binary decision classification tree |

Example: `"Learners","all"`

Example: `"Learners","ensemble"`

Example: `"Learners",["svm","tree"]`

Data Types: `char` | `string` | `cell`

Hyperparameters to optimize, specified as `"auto"` or `"all"`. The optimizable hyperparameters depend on the model (or learner), as described below.

`"discr"`

• Hyperparameters for `"auto"`: `Delta`, `Gamma`

• Additional hyperparameters for `"all"`: `DiscrimType`

• Notes: When the `Learners` value is `"all-linear"`, the `fitcauto` function chooses among the `DiscrimType` values of `"linear"`, `"diaglinear"`, and `"pseudolinear"`, regardless of the `OptimizeHyperparameters` value. When the `Learners` value is `"all-nonlinear"`, the function chooses among the `DiscrimType` values of `"quadratic"`, `"diagquadratic"`, and `"pseudoquadratic"`, regardless of the `OptimizeHyperparameters` value.

`"ensemble"`

• Hyperparameters for `"auto"`: `Method`, `NumLearningCycles`, `LearnRate`, `MinLeafSize`

• Additional hyperparameters for `"all"`: `MaxNumSplits`, `NumVariablesToSample`, `SplitCriterion`

• Notes: When the ensemble `Method` value is a boosting method, the ensemble `NumBins` value is `50`.

`"kernel"`

• Hyperparameters for `"auto"`: `KernelScale`, `Lambda`, `Coding` (for three or more classes only)

• Additional hyperparameters for `"all"`: `Learner`, `NumExpansionDimensions`

`"knn"`

• Hyperparameters for `"auto"`: `Distance`, `NumNeighbors`

• Additional hyperparameters for `"all"`: `DistanceWeight`, `Exponent`, `Standardize`

`"linear"`

• Hyperparameters for `"auto"`: `Lambda`, `Learner`, `Coding` (for three or more classes only)

• Additional hyperparameters for `"all"`: `Regularization`

`"nb"`

• Hyperparameters for `"auto"`: `DistributionNames`, `Width`

• Additional hyperparameters for `"all"`: `Kernel`

`"net"`

• Hyperparameters for `"auto"`: `Activations`, `Lambda`, `LayerSizes`, `Standardize`

• Additional hyperparameters for `"all"`: `LayerBiasesInitializer`, `LayerWeightsInitializer`

`"svm"`

• Hyperparameters for `"auto"`: `BoxConstraint`, `KernelScale`, `Coding` (for three or more classes only)

• Additional hyperparameters for `"all"`: `KernelFunction`, `PolynomialOrder`, `Standardize`

• Notes: When the `Learners` value is `"all-linear"`, the `fitcauto` function does not optimize the `KernelFunction` or `PolynomialOrder` hyperparameters when the `OptimizeHyperparameters` value is `"all"`. When the `Learners` value is `"all-nonlinear"`, the function chooses among the `KernelFunction` values of `"gaussian"` and `"polynomial"`, regardless of the `OptimizeHyperparameters` value.

`"tree"`

• Hyperparameters for `"auto"`: `MinLeafSize`

• Additional hyperparameters for `"all"`: `MaxNumSplits`, `SplitCriterion`

For more information, including hyperparameter search ranges, see `OptimizeHyperparameters` and, for `"kernel"`, `"linear"`, and `"svm"` learners with three or more classes, the corresponding `OptimizeHyperparameters` description (for three or more classes only). Note that you cannot change hyperparameter search ranges when you use `fitcauto`.

Note

When `Learners` is set to a value other than `"auto"`, the default values for the model hyperparameters not being optimized match the default fit function values, unless otherwise indicated in the table notes. When `Learners` is set to `"auto"`, the optimized hyperparameter search ranges and nonoptimized hyperparameter values can vary, depending on the characteristics of the training data. For more information, see Automatic Selection of Learners.

Example: `"OptimizeHyperparameters","all"`
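For instance, the following sketch (using the `ionosphere` sample data shipped with Statistics and Machine Learning Toolbox) restricts the search to two learner types while tuning their full hyperparameter sets:

```matlab
% A minimal sketch: consider only tree and SVM learners, and tune the
% full ("all") hyperparameter set for each instead of the "auto" subset.
load ionosphere   % X: numeric matrix, Y: cell array of class labels
Mdl = fitcauto(X,Y, ...
    "Learners",["tree","svm"], ...
    "OptimizeHyperparameters","all");
```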

Options for the optimization, specified as a structure. All fields in the structure are optional.

`Optimizer`

Optimization algorithm:

• `"bayesopt"` — Uses Bayesian optimization. For more details, see Bayesian Optimization.

• `"asha"` — Uses ASHA optimization. For more details, see ASHA Optimization.

Default: `"bayesopt"`

`MaxObjectiveEvaluations`

Maximum number of iterations (objective function evaluations), specified as a positive integer.

Default: `30*L`, where `L` is the number of learners (see `Learners`).

• This value is the default when the `Optimizer` field is set to `"bayesopt"`.

• For the default value when the `Optimizer` field is set to `"asha"`, see Number of ASHA Iterations.

`MaxTime`

Time limit, specified as a positive real number. The time limit is in seconds, as measured by `tic` and `toc`. Run time can exceed `MaxTime` because `MaxTime` does not interrupt function evaluations.

Default: `Inf`

`ShowPlots`

Logical value indicating whether to show a plot of the optimization progress. If `true`, this field plots the observed minimum validation loss against the iteration number. When you use Bayesian optimization, the plot also shows the estimated minimum validation loss.

Default: `true`

`SaveIntermediateResults`

Logical value indicating whether to save results. If `true`, this field overwrites a workspace variable at each iteration. The variable is a `BayesianOptimization` object named `BayesoptResults` if you use Bayesian optimization, and a table named `ASHAResults` if you use ASHA optimization.

Default: `false`

`Verbose`

Display at the command line:

• `0` — No iterative display

• `1` — Iterative display

• `2` — Iterative display with additional information about the next point to be evaluated

Default: `1`

`UseParallel`

Logical value indicating whether to run the optimization in parallel, which requires Parallel Computing Toolbox™. Due to the nonreproducibility of parallel timing, parallel optimization does not necessarily yield reproducible results.

Default: `false`

`Repartition`

Logical value indicating whether to repartition the cross-validation at every iteration. If `false`, the optimizer uses a single partition for the optimization. `true` usually gives the most robust results because this setting takes partitioning noise into account. However, for good results, `true` requires at least twice as many function evaluations.

Default: `false`

`MaxTrainingSetSize`

Maximum number of observations in each training set, specified as a positive integer. This value matches the largest training set size.

Note: If you want to specify this value, the `Optimizer` field must be set to `"asha"`.

Default: Largest available training partition size

• When the optimization uses `k`-fold cross-validation, this value is `(k – 1)*n/k`, where `n` is the total number of observations.

• When the optimization uses a `cvpartition` object `cvp`, this value is `max(cvp.TrainSize)`.

• When the optimization uses a holdout fraction `p`, this value is `(1 – p)*n`, where `n` is the total number of observations.

`MinTrainingSetSize`

Minimum number of observations in each training set, specified as a positive integer. This value is a lower bound for the smallest training set size.

Note: If you want to specify this value, the `Optimizer` field must be set to `"asha"`.

Default: `100`

Specify only one of the following three cross-validation options.

`CVPartition`

`cvpartition` object, created by `cvpartition`.

`Holdout`

Scalar in the range `(0,1)` representing the holdout fraction.

`Kfold`

Integer greater than 1.

Default (if you do not specify any cross-validation field): `"Kfold",5`

Example: `"HyperparameterOptimizationOptions",struct("UseParallel",true)`

Data Types: `struct`
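As a concrete sketch, the following options structure requests an ASHA run with a time budget; the `humanactivity` sample data set (over 24,000 observations) is assumed here only as an example of data large enough for ASHA to pay off:

```matlab
% A minimal sketch of an ASHA run. All structure fields are documented
% HyperparameterOptimizationOptions fields.
load humanactivity   % feat: predictor matrix, actid: activity labels
opts = struct("Optimizer","asha", ...   % ASHA instead of Bayesian optimization
    "MaxTime",300, ...                  % stop starting new evaluations after ~5 min
    "MinTrainingSetSize",500);          % lower bound on ASHA training subsets
[Mdl,Results] = fitcauto(feat,actid, ...
    "HyperparameterOptimizationOptions",opts);
```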

Classification Options


Categorical predictors list, specified as one of the following values.

Vector of positive integers

Each entry in the vector is an index value indicating that the corresponding predictor is categorical. The index values are between 1 and `p`, where `p` is the number of predictors used to train the model.

If `fitcauto` uses a subset of input variables as predictors, then the function indexes the predictors using only the subset. The `CategoricalPredictors` values do not count the response variable, observation weight variable, or any other variables that the function does not use.

Logical vector

A `true` entry means that the corresponding predictor is categorical. The length of the vector is `p`.

Character matrix

Each row of the matrix is the name of a predictor variable. The names must match the entries in `PredictorNames`. Pad the names with extra blanks so each row of the character matrix has the same length.

String array or cell array of character vectors

Each element in the array is the name of a predictor variable. The names must match the entries in `PredictorNames`.

`"all"`

All predictors are categorical.

By default, if the predictor data is in a table (`Tbl`), `fitcauto` assumes that a variable is categorical if it is a logical vector, categorical vector, character array, string array, or cell array of character vectors. However, learners that use decision trees assume that mathematically ordered categorical vectors are continuous variables. If the predictor data is a matrix (`X`), `fitcauto` assumes that all predictors are continuous. To identify any other predictors as categorical predictors, specify them by using the `CategoricalPredictors` name-value argument.

For more information on how fitting functions treat categorical predictors, see Automatic Creation of Dummy Variables.

Note

• `fitcauto` does not support categorical predictors for discriminant analysis classifiers. That is, if you want `Learners` to include `"discr"` models, you cannot specify the `CategoricalPredictors` name-value argument or use a table of sample data (`Tbl`) containing categorical predictors.

• `fitcauto` does not support a mix of numeric and categorical predictors for k-nearest neighbor models. That is, if you want `Learners` to include `"knn"` models, you must specify the `CategoricalPredictors` value as `"all"` or `[]`.

Example: `"CategoricalPredictors","all"`

Data Types: `single` | `double` | `logical` | `char` | `string` | `cell`

Names of classes to use for training, specified as a categorical, character, or string array; a logical or numeric vector; or a cell array of character vectors. `ClassNames` must have the same data type as the response variable in `Tbl` or `Y`.

If `ClassNames` is a character array, then each element must correspond to one row of the array.

Use `ClassNames` to:

• Specify the order of the classes during training.

• Specify the order of any input or output argument dimension that corresponds to the class order. For example, use `ClassNames` to specify the order of the dimensions of `Cost` or the column order of classification scores returned by `predict`.

• Select a subset of classes for training. For example, suppose that the set of all distinct class names in `Y` is `["a","b","c"]`. To train the model using observations from classes `"a"` and `"c"` only, specify `"ClassNames",["a","c"]`.

The default value for `ClassNames` is the set of all distinct class names in the response variable in `Tbl` or `Y`.

Example: `"ClassNames",["b","g"]`

Data Types: `categorical` | `char` | `string` | `logical` | `single` | `double` | `cell`
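The following sketch shows the subset-of-classes use, assuming the Fisher iris sample data:

```matlab
% A minimal sketch: train on observations from two of the three species
% only, fixing that class order for costs, priors, and prediction scores.
load fisheriris   % meas: measurements, species: cell array of labels
Mdl = fitcauto(meas,species, ...
    "ClassNames",{'setosa','virginica'}, ...
    "HyperparameterOptimizationOptions",struct("MaxObjectiveEvaluations",10));
```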

Misclassification cost, specified as a square matrix or structure array.

• If you specify a square matrix `Cost` and the true class of an observation is `i`, then `Cost(i,j)` is the cost of classifying a point into class `j`. That is, rows correspond to the true classes and columns correspond to the predicted classes. To specify the class order for the corresponding rows and columns of `Cost`, also specify the `ClassNames` name-value argument.

• If you specify a structure `S`, then it must have two fields:

• `S.ClassNames`, which contains the class names as a variable of the same data type as `Y`

• `S.ClassificationCosts`, which contains the cost matrix with rows and columns ordered as in `S.ClassNames`

Misclassification costs are used differently by the various models in `Learners`. However, `fitcauto` computes the same mean misclassification cost to compare the models during the optimization process. For more information, see Mean Misclassification Cost.

`fitcauto` does not support misclassification costs for neural network classifiers. That is, if you want `Learners` to include `"net"` models, then you cannot specify the `Cost` name-value argument.

The default value for `Cost` is `ones(K) - eye(K)`, where `K` is the number of distinct classes.

Example: `"Cost",[0 1; 2 0]`

Data Types: `single` | `double` | `struct`
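The structure form can be sketched as follows, assuming the `ionosphere` sample data; the learner restriction avoids `"net"` models, which do not support `Cost`:

```matlab
% A minimal sketch of the structure form of Cost: misclassifying a true
% 'g' observation as 'b' costs twice as much as the reverse.
load ionosphere   % X: numeric matrix, Y: cell array of 'b'/'g' labels
S.ClassNames = {'b','g'};
S.ClassificationCosts = [0 1; 2 0];   % rows: true class, columns: predicted class
Mdl = fitcauto(X,Y,"Cost",S,"Learners",["tree","knn"], ...
    "HyperparameterOptimizationOptions",struct("MaxObjectiveEvaluations",10));
```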

Predictor variable names, specified as a string array of unique names or cell array of unique character vectors. The functionality of `PredictorNames` depends on the way you supply the training data.

• If you supply `X` and `Y`, then you can use `PredictorNames` to assign names to the predictor variables in `X`.

• The order of the names in `PredictorNames` must correspond to the column order of `X`. That is, `PredictorNames{1}` is the name of `X(:,1)`, `PredictorNames{2}` is the name of `X(:,2)`, and so on. Also, `size(X,2)` and `numel(PredictorNames)` must be equal.

• By default, `PredictorNames` is `{'x1','x2',...}`.

• If you supply `Tbl`, then you can use `PredictorNames` to choose which predictor variables to use in training. That is, `fitcauto` uses only the predictor variables in `PredictorNames` and the response variable during training.

• `PredictorNames` must be a subset of `Tbl.Properties.VariableNames` and cannot include the name of the response variable.

• By default, `PredictorNames` contains the names of all predictor variables.

• A good practice is to specify the predictors for training using either `PredictorNames` or `formula`, but not both.

Example: `"PredictorNames",["SepalLength","SepalWidth","PetalLength","PetalWidth"]`

Data Types: `string` | `cell`
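For the matrix syntax, a sketch with the Fisher iris sample data:

```matlab
% A minimal sketch: name the columns of the predictor matrix so that
% later displays and plots use readable predictor names.
load fisheriris   % meas: 150-by-4 matrix, species: cell array of labels
Mdl = fitcauto(meas,species, ...
    "PredictorNames",["SepalLength","SepalWidth","PetalLength","PetalWidth"], ...
    "HyperparameterOptimizationOptions",struct("MaxObjectiveEvaluations",10));
```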

Prior probabilities for each class, specified as one of the following values.

`"empirical"`

The class prior probabilities are the class relative frequencies in `Y`.

`"uniform"`

All class prior probabilities are equal to 1/K, where K is the number of classes.

numeric vector

Each element is a class prior probability. Order the elements according to `Mdl.ClassNames` or specify the order using the `ClassNames` name-value argument. The software normalizes the elements to sum to `1`.

structure

A structure `S` with two fields:

• `S.ClassNames` contains the class names as a variable of the same type as `Y`.

• `S.ClassProbs` contains a vector of corresponding prior probabilities. The software normalizes the elements to sum to `1`.

`fitcauto` does not support prior probabilities for neural network classifiers. That is, if you want `Learners` to include `"net"` models, then you cannot specify the `Prior` name-value argument.

Example: `"Prior",struct("ClassNames",["b","g"],"ClassProbs",1:2)`

Data Types: `single` | `double` | `char` | `string` | `struct`

Response variable name, specified as a character vector or string scalar.

Example: `"ResponseName","response"`

Data Types: `char` | `string`

Score transformation, specified as a character vector, string scalar, or function handle.

This table summarizes the available character vectors and string scalars.

| Value | Description |
| --- | --- |
| `"doublelogit"` | 1/(1 + e^(-2x)) |
| `"invlogit"` | log(x / (1 - x)) |
| `"ismax"` | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to 0 |
| `"logit"` | 1/(1 + e^(-x)) |
| `"none"` or `"identity"` | x (no transformation) |
| `"sign"` | -1 for x < 0, 0 for x = 0, and 1 for x > 0 |
| `"symmetric"` | 2x - 1 |
| `"symmetricismax"` | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to -1 |
| `"symmetriclogit"` | 2/(1 + e^(-x)) - 1 |

For a MATLAB function or a function you define, use its function handle for the score transform. The function handle must accept a matrix (the original scores) and return a matrix of the same size (the transformed scores).

Example: `"ScoreTransform","logit"`

Data Types: `char` | `string` | `function_handle`
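A function handle transform can be sketched as follows, assuming the `ionosphere` sample data; the handle name is arbitrary:

```matlab
% A minimal sketch of a custom score transform: the handle accepts the
% matrix of raw scores and returns a same-size matrix, here rescaled by
% a softmax-style normalization across each row.
softmaxRows = @(s) exp(s)./sum(exp(s),2);
load ionosphere   % X: numeric matrix, Y: cell array of class labels
Mdl = fitcauto(X,Y,"ScoreTransform",softmaxRows, ...
    "HyperparameterOptimizationOptions",struct("MaxObjectiveEvaluations",10));
```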

Observation weights, specified as a positive numeric vector or the name of a variable in `Tbl`. The software weights each observation in `X` or `Tbl` with the corresponding value in `Weights`. The length of `Weights` must equal the number of rows in `X` or `Tbl`.

If you specify the input data as a table `Tbl`, then `Weights` can be the name of a variable in `Tbl` that contains a numeric vector. In this case, you must specify `Weights` as a character vector or string scalar. For example, if the weights vector `W` is stored as `Tbl.W`, then specify it as `"W"`. Otherwise, the software treats all columns of `Tbl`, including `W`, as predictors or the response variable when training the model.

By default, `Weights` is `ones(n,1)`, where `n` is the number of observations in `X` or `Tbl`.

The software normalizes `Weights` to sum to the value of the prior probability in the respective class.

Data Types: `single` | `double` | `char` | `string`

## Output Arguments


Trained classification model, returned as one of the classification model objects in this table.

| Learner Name | Returned Model Object |
| --- | --- |
| `"discr"` | `CompactClassificationDiscriminant` |
| `"ensemble"` | `CompactClassificationEnsemble` |
| `"kernel"` | `ClassificationKernel` for binary classification; `CompactClassificationECOC` for three or more classes |
| `"knn"` | `ClassificationKNN` |
| `"linear"` | `ClassificationLinear` for binary classification; `CompactClassificationECOC` for three or more classes |
| `"nb"` | `CompactClassificationNaiveBayes` |
| `"net"` | `CompactClassificationNeuralNetwork` |
| `"svm"` | `CompactClassificationSVM` for binary classification; `CompactClassificationECOC` for three or more classes |
| `"tree"` | `CompactClassificationTree` |

Optimization results, returned as a `BayesianOptimization` object if you use Bayesian optimization or a table if you use ASHA optimization. For more information, see Bayesian Optimization and ASHA Optimization.
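A minimal sketch of examining this output with the default Bayesian optimizer, assuming the `ionosphere` sample data:

```matlab
% Results is a BayesianOptimization object here; its ObjectiveTrace
% property holds the validation loss observed at each iteration.
load ionosphere
[Mdl,Results] = fitcauto(X,Y);
plot(Results.ObjectiveTrace)
xlabel("Iteration")
ylabel("Observed validation loss")
```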


### Verbose Display

When you set the `Verbose` field of the `HyperparameterOptimizationOptions` name-value argument to `1` or `2`, the `fitcauto` function provides an iterative display of the optimization results.

The following list describes the columns in the display and their entries.

• `Iter` — Iteration number. You can limit the number of iterations by using the `MaxObjectiveEvaluations` field of the `HyperparameterOptimizationOptions` name-value argument.

• `Active workers` — Number of active parallel workers. This column appears only when you run the optimization in parallel by setting the `UseParallel` field of the `HyperparameterOptimizationOptions` name-value argument to `true`.

• `Eval result` — One of the following evaluation results:

  • `Best` — The learner and hyperparameter values at this iteration give the minimum observed validation loss computed so far. That is, the `Validation loss` value is the smallest computed so far.

  • `Accept` — The learner and hyperparameter values at this iteration give meaningful (for example, non-`NaN`) validation loss values.

  • `Error` — The learner and hyperparameter values at this iteration result in an error (for example, a `Validation loss` value of `NaN`).

• `Validation loss` — Validation loss computed for the learner and hyperparameter values at this iteration. By default, `fitcauto` computes the cross-validation classification error. If you specify misclassification costs by using the `Cost` name-value argument, `fitcauto` computes the mean misclassification cost instead. For more information, see Mean Misclassification Cost. You can change the validation scheme by using the `CVPartition`, `Holdout`, or `KFold` field of the `HyperparameterOptimizationOptions` name-value argument.

• `Time for training & validation (sec)` — Time, in seconds, taken to train and compute the validation loss for the model with the learner and hyperparameter values at this iteration. When you use Bayesian optimization, this value excludes the time required to update the objective function model maintained by the Bayesian optimization process. For more details, see Bayesian Optimization.

• `Observed min validation loss` — Observed minimum validation loss computed so far. This value corresponds to the smallest `Validation loss` value computed so far in the optimization process. By default, `fitcauto` returns a plot of the optimization that displays the observed minimum validation loss values as dark blue points. This plot does not appear when the `ShowPlots` field of the `HyperparameterOptimizationOptions` name-value argument is set to `false`.

• `Estimated min validation loss` — Estimated minimum validation loss. When you use Bayesian optimization, `fitcauto` updates, at each iteration, an objective function model maintained by the Bayesian optimization process, and uses this model to estimate the minimum validation loss. By default, `fitcauto` returns a plot of the optimization that displays the estimated minimum validation loss values as light blue points. This plot does not appear when the `ShowPlots` field of the `HyperparameterOptimizationOptions` name-value argument is set to `false`. This column appears only when you use Bayesian optimization, that is, when the `Optimizer` field of the `HyperparameterOptimizationOptions` name-value argument is set to `"bayesopt"`. For more details, see Bayesian Optimization.

• `Training set size` — Number of observations used in each training set at this iteration. Use the `MaxTrainingSetSize` and `MinTrainingSetSize` fields of the `HyperparameterOptimizationOptions` name-value argument to specify bounds for the training set size. This column appears only when you use ASHA optimization, that is, when the `Optimizer` field of the `HyperparameterOptimizationOptions` name-value argument is set to `"asha"`. For more details, see ASHA Optimization.

• `Learner` — Model type evaluated at this iteration. Specify the learners used in the optimization by using the `Learners` name-value argument.

• `Hyperparameter: Value` — Hyperparameter values at this iteration. Specify the hyperparameters used in the optimization by using the `OptimizeHyperparameters` name-value argument.

The display also includes these model descriptions:

• `Best observed learner` — This model, with the listed learner type and hyperparameter values, yields the final observed minimum validation loss. When you use ASHA optimization, `fitcauto` retrains the model on the entire training data set and returns it as the `Mdl` output.

• `Best estimated learner` — This model, with the listed learner type and hyperparameter values, yields the final estimated minimum validation loss when you use Bayesian optimization. In this case, `fitcauto` retrains the model on the entire training data set and returns it as the `Mdl` output.

Note

The `Best estimated learner` model appears only when you use Bayesian optimization, that is, when the `Optimizer` field of the `HyperparameterOptimizationOptions` name-value argument is set to `"bayesopt"`.
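For example, a minimal sketch, with placeholder data `X` and `Y`, that turns on the iterative display:

```
% Verbose = 1 prints one row per iteration with the columns described
% above; Verbose = 2 adds further detail.
Mdl = fitcauto(X,Y, ...
    "HyperparameterOptimizationOptions",struct("Verbose",1));
```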

## Tips

• Depending on the size of your data set, the number of learners you specify, and the optimization method you choose, `fitcauto` can take some time to run.

• If you have a Parallel Computing Toolbox license, you can speed up computations by running the optimization in parallel. To do so, specify `"HyperparameterOptimizationOptions",struct("UseParallel",true)`. You can include additional fields in the structure to control other aspects of the optimization. See `HyperparameterOptimizationOptions`.

• If `fitcauto` with Bayesian optimization takes a long time to run because of the number of observations in your training set (for example, over 10,000), consider using `fitcauto` with ASHA optimization instead. ASHA optimization often finds good solutions faster than Bayesian optimization for data sets with many observations. To use ASHA optimization, specify `"HyperparameterOptimizationOptions",struct("Optimizer","asha")`. You can include additional fields in the structure to control other aspects of the optimization. In particular, if you have a time constraint, specify the `MaxTime` field of the `HyperparameterOptimizationOptions` structure to limit the number of seconds `fitcauto` runs.
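For example, a minimal sketch, with placeholder data `X` and `Y`, that combines these options:

```
% Run ASHA optimization in parallel and cap the run at one hour.
opts = struct("Optimizer","asha","UseParallel",true,"MaxTime",3600);
Mdl = fitcauto(X,Y,"HyperparameterOptimizationOptions",opts);
```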

## Algorithms


### Automatic Selection of Learners

When you specify `"Learners","auto"`, the `fitcauto` function analyzes the predictor and response data in order to choose appropriate learners. The function considers whether the data set has any of these characteristics:

• Categorical predictors

• Missing values for more than 5% of the data

• Imbalanced data, where the ratio of the number of observations in the largest class to the number of observations in the smallest class is greater than 5

• More than 100 observations in the smallest class

• Wide data, where the number of predictors is greater than or equal to the number of observations

• High-dimensional data, where the number of predictors is greater than 100

• Large data, where the number of observations is greater than 50,000

• Binary response variable

• Ordinal response variable

The selected learners are always a subset of those listed in the `Learners` table. However, the associated models tried during the optimization process can have different default values for hyperparameters not being optimized, as well as different search ranges for hyperparameters being optimized.
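For example, a minimal sketch, assuming a hypothetical table `Tbl` with response variable `Y`, that delegates the choice of learners:

```
% fitcauto inspects the predictors and response for the characteristics
% listed above, then restricts the search to suitable learners.
Mdl = fitcauto(Tbl,"Y","Learners","auto");
```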

### Bayesian Optimization

The goal of Bayesian optimization, and optimization in general, is to find a point that minimizes an objective function. In the context of `fitcauto`, a point is a learner type together with a set of hyperparameter values for the learner (see `Learners` and `OptimizeHyperparameters`), and the objective function is, by default, the cross-validation classification error.

The Bayesian optimization implemented in `fitcauto` internally maintains a multi-`TreeBagger` model of the objective function. That is, the objective function model splits along the learner type and, for a given learner, the model is a `TreeBagger` ensemble for regression. (This underlying model differs from the Gaussian process model employed by other Statistics and Machine Learning Toolbox™ functions that use Bayesian optimization.) Bayesian optimization trains the underlying model by using objective function evaluations, and determines the next point to evaluate by using an acquisition function (`"expected-improvement"`). For more information, see Expected Improvement. The acquisition function balances sampling at points with low modeled objective function values against exploring areas that the model does not yet fit well. At the end of the optimization, `fitcauto` chooses the point with the minimum objective function model value, among the points evaluated during the optimization. For more information, see the `"Criterion","min-visited-mean"` name-value argument of `bestPoint`.
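For example, a minimal sketch, with placeholder data `X` and `Y`, that reproduces this final choice from the returned optimization results:

```
% bestPoint with "min-visited-mean" returns the evaluated point with
% the minimum objective function model value, the same criterion that
% fitcauto uses to select Mdl.
[Mdl,OptimizationResults] = fitcauto(X,Y);
best = bestPoint(OptimizationResults,"Criterion","min-visited-mean")
```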

### ASHA Optimization

The asynchronous successive halving algorithm (ASHA) [1] in `fitcauto` randomly chooses several models with different hyperparameter values (see `Learners` and `OptimizeHyperparameters`) and trains them on a small subset of the training data. If the performance of a particular model is promising, the model is promoted and trained on a larger amount of the training data. This process repeats, and successful models are trained on progressively larger amounts of data. By default, at the end of the optimization, `fitcauto` chooses the model that has the lowest cross-validation classification error.

At each iteration, ASHA either chooses a previously trained model and promotes it (that is, retrains the model using more training data), or selects a new model (learner type and hyperparameter values) using random search. ASHA promotes models as follows (see the sketch after this list):

• The algorithm searches for the group of models with the largest training set size for which this condition does not hold: `floor(g/4)` of the models have been promoted, where `g` is the number of models in the group.

• Among the group of models, ASHA chooses the model with the lowest cross-validation classification error and retrains that model with `4*(Training Set Size)` observations.

• If no such group of models exists, then ASHA selects a new model instead of promoting an old one, and trains the new model using the smallest training set size.
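As a toy illustration of the promotion rule, with hypothetical numbers rather than `fitcauto` internals:

```
% A group of g = 10 models shares a training set size of 2000
% observations, and 1 model has already been promoted from the group.
g = 10; promoted = 1; trainSize = 2000;
eligible = promoted < floor(g/4)   % true: fewer than 2 promotions so far
newTrainSize = 4*trainSize         % a promoted model retrains on 8000 obs
```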

When a model is trained on a subset of the training data, ASHA computes the cross-validation classification error as follows:

• For each training fold, the algorithm selects a random sample of the observations (of size `Training set size`) using stratified sampling, and then trains a model on that subset of data.

• The algorithm then tests the fitted model on the test fold (that is, the observations not in the training fold) and computes the classification error.

• Finally, the algorithm averages the results across all folds.

### Number of ASHA Iterations

When you use ASHA optimization, the default number of iterations depends on the number of observations in the data, the number of learner types, the use of parallel processing, and the type of cross-validation. The algorithm selects the number of iterations such that, for L learner types (see `Learners`), `fitcauto` trains L models on the largest training set size.

This table describes the default number of iterations based on the given specifications when you use 5-fold cross-validation. Note that n represents the number of observations and L represents the number of learner types.

| Number of Observations n | Default Number of Iterations (run in serial) | Default Number of Iterations (run in parallel) |
| --- | --- | --- |
| n < 500 | `30*L` | `30*L` |
| 500 ≤ n < 2000 | `5*L` | `5*(L + 1)` |
| 2000 ≤ n < 8000 | `21*L` | `21*(L + 1)` |
| 8000 ≤ n < 32,000 | `85*L` | `85*(L + 1)` |
| 32,000 ≤ n | `341*L` | `341*(L + 1)` |

When n < 500, n is too small to implement ASHA optimization, and `fitcauto` instead implements random search to find and assess `30*L` models.
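As a quick check of the table, consider a hypothetical data set:

```
% With n = 10,000 observations and L = 5 learner types, serial ASHA runs
% 85*5 = 425 iterations by default, and parallel ASHA runs 85*6 = 510.
L = 5;
serialIters   = 85*L        % 425
parallelIters = 85*(L + 1)  % 510
```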

### Mean Misclassification Cost

If you specify the `Cost` name-value argument, then `fitcauto` minimizes the mean misclassification cost rather than the misclassification error as part of the optimization process. The mean misclassification cost is defined as

$L = \dfrac{\sum_{j=1}^{n} C\left(k_j, \hat{k}_j\right) \cdot I\left(y_j \ne \hat{y}_j\right)}{n}$

where

• $C$ is the misclassification cost matrix as specified by the `Cost` name-value argument, and $I$ is the indicator function.

• $y_j$ is the true class label for observation $j$, and $y_j$ belongs to class $k_j$.

• $\hat{y}_j$ is the class label with the maximal predicted score for observation $j$, and $\hat{y}_j$ belongs to class $\hat{k}_j$.

• $n$ is the number of observations in the validation set.
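A minimal sketch of this formula, assuming hypothetical numeric class indices `y` (true) and `yhat` (predicted) and a cost matrix `C`:

```
% Sum the cost of each misclassified observation, then divide by the
% number of observations n.
n = numel(y);
meanCost = sum(C(sub2ind(size(C),y,yhat)) .* (y ~= yhat)) / n;
```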

## Alternative Functionality

• If you are unsure which models work best for your data set, you can alternatively use the Classification Learner app. Using the app, you can perform hyperparameter tuning for different models, and choose the optimized model that performs best. Although you must select a specific model before you can tune the model hyperparameters, Classification Learner provides greater flexibility for selecting optimizable hyperparameters and setting hyperparameter values. However, you cannot optimize in parallel, optimize `"linear"` or `"kernel"` learners, specify observation weights, specify prior probabilities, or use ASHA optimization in the app. For more information, see Hyperparameter Optimization in Classification Learner App.

• If you know which models might suit your data, you can alternatively use the corresponding model fit functions and specify the `OptimizeHyperparameters` name-value argument to tune hyperparameters. You can compare the results across the models to select the best classifier. For an example of this process, see Moving Towards Automating Model Selection Using Bayesian Optimization.
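For example, a minimal sketch, with placeholder data `X` and `Y`, that tunes a single model type directly:

```
% Tune an SVM with its own fit function; compare the resulting
% cross-validation loss against other tuned models.
MdlSVM = fitcsvm(X,Y,"OptimizeHyperparameters","auto");
```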

## References

[1] Li, Liam, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, and Ameet Talwalkar. “A System for Massively Parallel Hyperparameter Tuning.” ArXiv:1810.05934v5 [Cs], March 16, 2020. https://arxiv.org/abs/1810.05934v5.

## Version History

Introduced in R2020a

Behavior changed in R2021a and R2022a