Classify observations using ensemble of classification models
[labels,scores] = predict(___) also returns a matrix of classification scores (scores), indicating the likelihood that a label comes from a particular class, using any of the input arguments in the previous syntaxes. For each observation in X, the predicted class label corresponds to the maximum score among all classes.
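The relationship between labels and scores can be checked with a short sketch. It assumes the Fisher iris data that ships with the toolbox and a bagged ensemble chosen only for illustration; the variable names idx and labelsFromScores are hypothetical. The comparison assumes the default misclassification cost and no exact ties between class scores.

```matlab
load fisheriris
rng(1); % For reproducibility

% Train a bagged ensemble (illustrative choice of method).
Mdl = fitcensemble(meas,species,'Method','Bag');

% Request both outputs: predicted labels and the score matrix.
[labels,scores] = predict(Mdl,meas);

% scores has one row per observation and one column per class,
% with columns ordered as in Mdl.ClassNames.
[~,idx] = max(scores,[],2);
labelsFromScores = Mdl.ClassNames(idx);

% Count the observations where the label matches the top-scoring class.
numAgree = sum(strcmp(labels,labelsFromScores));
```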
Predictor data to be classified, specified as a numeric matrix or table.
Each row of X corresponds to one observation, and each column corresponds to one variable.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Indices of weak learners
A logical matrix of size
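As an illustration of the name-value syntax, the sketch below restricts prediction to a subset of the weak learners. It assumes the argument name is 'Learners' and that it accepts a numeric vector of learner indices, which is my understanding of the toolbox interface rather than something stated in the text above; the data set and the variable names labelsAll, labelsFirst10, and numDiffer are hypothetical choices.

```matlab
load fisheriris
rng(1); % For reproducibility

% Train a boosted ensemble of tree stumps (100 learners by default).
Mdl = fitcensemble(meas,species,'Method','AdaBoostM2', ...
    'Learners',templateTree('MaxNumSplits',1));

% Predict with every trained learner.
labelsAll = predict(Mdl,meas);

% Predict with only the first 10 weak learners (assumed 'Learners' argument).
labelsFirst10 = predict(Mdl,meas,'Learners',1:10);

% Count how many predictions change when fewer learners are used.
numDiffer = sum(~strcmp(labelsAll,labelsFirst10));
```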
Vector of classification labels.
A matrix with one row per observation and one column per class. For each observation and each class, the score represents the confidence that the observation originates from that class. A higher score indicates a higher confidence. For more information, see Score (ensemble).
Load Fisher's iris data set. Determine the sample size.
load fisheriris
N = size(meas,1);
Partition the data into training and test sets. Hold out 10% of the data for testing.
rng(1); % For reproducibility
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp); % Test set indices
Store the training data in a table.
tblTrn = array2table(meas(idxTrn,:));
tblTrn.Y = species(idxTrn);
Train a classification ensemble using AdaBoostM2 and the training set. Specify tree stumps as the weak learners.
t = templateTree('MaxNumSplits',1);
Mdl = fitcensemble(tblTrn,'Y','Method','AdaBoostM2','Learners',t);
Predict labels for the test set. You trained the model using a table of data, but you can predict labels using a matrix.
labels = predict(Mdl,meas(idxTest,:));
Construct a confusion matrix for the test set.
Mdl misclassifies one versicolor iris as virginica in the test set.
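The code for the confusion matrix step is not shown here. One way to compute it, as a sketch rather than the original example's code, is with confusionmat. The block repeats the setup so that it runs on its own; the names trueLabels, C, and order are hypothetical.

```matlab
load fisheriris
N = size(meas,1);

rng(1); % For reproducibility
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp);    % Test set indices

% Train on a table, as in the example.
tblTrn = array2table(meas(idxTrn,:));
tblTrn.Y = species(idxTrn);
t = templateTree('MaxNumSplits',1);
Mdl = fitcensemble(tblTrn,'Y','Method','AdaBoostM2','Learners',t);

% Predict on the held-out rows and tabulate true versus predicted classes.
labels = predict(Mdl,meas(idxTest,:));
trueLabels = species(idxTest);
[C,order] = confusionmat(trueLabels,labels);

% Rows of C are true classes and columns are predicted classes,
% both in the order given by the second output.
disp(order)
disp(C)
```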
For ensembles, a classification score represents the confidence that an observation originates from a specific class. The higher the score, the higher the confidence.
Different ensemble algorithms have different definitions for their scores. Furthermore, the range of scores depends on ensemble type. For example:
Bag scores range from 0 to 1. You can interpret these scores as probabilities
averaged over all the trees in the ensemble.
LogitBoost scores range from –∞ to ∞.
You can convert these scores to probabilities by setting the ScoreTransform property of Mdl to 'doublelogit' before passing Mdl to predict:
Mdl.ScoreTransform = 'doublelogit';
[labels,scores] = predict(Mdl,X);
Alternatively, specify 'ScoreTransform','doublelogit' in the call to fitcensemble when you create Mdl.
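The effect of the score transformation can be seen in a small sketch. It assumes the ionosphere sample data set (a two-class problem, since LogitBoost applies to binary classification); the variable names rawScores and probScores are hypothetical.

```matlab
load ionosphere % Provides predictor matrix X and class labels Y
rng(1); % For reproducibility

% Train a LogitBoost ensemble on the binary problem.
Mdl = fitcensemble(X,Y,'Method','LogitBoost');

% Raw scores are unbounded.
[~,rawScores] = predict(Mdl,X);

% Apply the double-logit transform, then predict again.
Mdl.ScoreTransform = 'doublelogit';
[labels,probScores] = predict(Mdl,X);

% Compare the ranges of the raw and transformed scores.
rawRange = [min(rawScores(:)) max(rawScores(:))];
probRange = [min(probScores(:)) max(probScores(:))];
```

After the transform, every entry of probScores should fall between 0 and 1, whereas rawRange is not constrained to that interval.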
For more information on the different ensemble algorithms and how they compute scores, see Ensemble Algorithms.
This function fully supports tall arrays. For more information, see Tall Arrays.
Usage notes and limitations:
Use codegen (MATLAB Coder) to generate code for the predict function. Save a trained model by using saveLearnerForCoder. Define an entry-point function that loads the saved model by using loadLearnerForCoder and calls the predict function. Then use codegen to generate code for the entry-point function.
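The workflow above can be sketched as follows. The file name EnsembleModel and the entry-point name predictLabels are hypothetical, and the training setup is only an illustration. The entry-point function appears as a local function so that the script runs on its own in MATLAB; for code generation it would need to live in its own file.

```matlab
% Train a small ensemble and save it in a form that generated code can load.
load fisheriris
rng(1); % For reproducibility
Mdl = fitcensemble(meas,species,'Method','AdaBoostM2', ...
    'Learners',templateTree('MaxNumSplits',1));
saveLearnerForCoder(Mdl,'EnsembleModel'); % Writes EnsembleModel.mat

% Check the entry-point function in MATLAB before generating code.
labels = predictLabels(meas(1:5,:));

% To generate code (requires MATLAB Coder), save predictLabels in its own
% file, predictLabels.m, and then run, for example:
%   codegen predictLabels -args {coder.typeof(0,[Inf 4],[1 0])}

function labels = predictLabels(X) %#codegen
% Hypothetical entry-point function: load the saved model and classify X.
Mdl = loadLearnerForCoder('EnsembleModel');
labels = predict(Mdl,X);
end
```

The coder.typeof call in the comment declares X as a double matrix with four columns and a variable number of rows, which matches the iris predictors; adjust it to the shape of your own data.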
To generate single-precision C/C++ code for predict, specify the name-value argument
'DataType','single' when you call the loadLearnerForCoder function.
You can also generate fixed-point C/C++ code for
predict. Fixed-point code generation requires an additional step that
defines the fixed-point data types of the variables required for prediction. Create a
fixed-point data type structure by using the data type function
generateLearnerDataTypeFcn, and use the structure as an input argument of
loadLearnerForCoder in an entry-point function. Generating fixed-point
C/C++ code requires MATLAB
Coder™ and Fixed-Point Designer™.
Generating fixed-point code for predict requires propagating data types for individual learners and, therefore, can be time consuming.
This table contains
notes about the arguments of
predict. Arguments not included in this
table are fully supported.
|Argument||Notes and Limitations|
For the usage notes and limitations of the model object,
Code Generation of the
|Name-value pair arguments||
Names in name-value pair arguments must be compile-time constants. For example, to allow user-defined indices up to 5 weak learners in the generated
For fixed-point code generation, the
For more information, see Introduction to Code Generation.