resume

Resume training of cross-validated classification ensemble model

Description

ens1 = resume(ens,nlearn) trains ens in every fold for nlearn more cycles and returns the result as a new cross-validated classification ensemble model ens1. resume uses the same options that were used to train ens, except for parallel training options and printout frequency.

ens1 = resume(ens,nlearn,Name=Value) specifies additional options using one or more name-value arguments. For example, you can specify the printout frequency, and set options for computing in parallel.

ens1 = resume(ens,nlearn,Name,Value) trains ens with additional options specified by one or more Name,Value pair arguments.

Examples

Train a partitioned classification ensemble for 10 cycles, and compare the classification loss obtained after training the ensemble for more cycles.

Load the ionosphere data set.

load ionosphere

Train a partitioned classification ensemble for 10 cycles and examine the error.

t = templateTree('MaxNumSplits',1); % Weak learner template tree object
cvens = fitcensemble(X,Y,'Method','GentleBoost','NumLearningCycles',10,'Learners',t,'crossval','on');
rng(10,'twister') % For reproducibility
L = kfoldLoss(cvens)
L = 0.0940

Train for 10 more cycles and examine the new error.

cvens = resume(cvens,10);
L = kfoldLoss(cvens)
L = 0.0712

The cross-validation error is lower in the ensemble after training for 10 more cycles.
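
To see how the loss changes with each added learner, rather than only at the two end points, one option is to request the cumulative loss from kfoldLoss. The following is a minimal sketch, not part of the original example: the 5-fold setting, the seed, and the variable name cumLoss are illustrative assumptions, and the seed is set before training so that the fold assignment is reproducible.

```matlab
load ionosphere                      % provides predictors X and labels Y
rng(10,'twister')                    % seed set before training (assumed choice)

t = templateTree('MaxNumSplits',1);  % decision stump as the weak learner
cvens = fitcensemble(X,Y,'Method','GentleBoost', ...
    'NumLearningCycles',10,'Learners',t,'KFold',5);   % 5 folds is an arbitrary choice

cvens = resume(cvens,10);            % train 10 more cycles in every fold

% Element k of cumLoss is the cross-validation loss using learners 1 through k
cumLoss = kfoldLoss(cvens,'Mode','cumulative');

plot(cumLoss)
xlabel('Number of learning cycles')
ylabel('Cross-validation loss')
```

A flat or rising tail in this curve suggests that further calls to resume are unlikely to help.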

Input Arguments

ens: Cross-validated classification ensemble, specified as a ClassificationPartitionedEnsemble model object created with either:

  • The fitcensemble function with one of the cross-validation name-value arguments CrossVal, KFold, Holdout, Leaveout, or CVPartition.

  • The crossval method applied to a classification ensemble.
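
The two routes above can be sketched as follows. This is an illustrative example under assumed settings (the ionosphere data, GentleBoost, 10 cycles, and 5 folds are arbitrary choices, and cvens1, cvens2, and fullEns are hypothetical variable names); either resulting object can be passed to resume.

```matlab
load ionosphere                      % provides predictors X and labels Y
rng(1)                               % for reproducible fold assignment (assumed seed)

% Route 1: cross-validate while training
cvens1 = fitcensemble(X,Y,'Method','GentleBoost', ...
    'NumLearningCycles',10,'KFold',5);

% Route 2: train an ensemble on all the data, then cross-validate it
fullEns = fitcensemble(X,Y,'Method','GentleBoost','NumLearningCycles',10);
cvens2 = crossval(fullEns,'KFold',5);

% Both objects are cross-validated classification ensembles
class(cvens1)
class(cvens2)

% Either one is a valid first input to resume
cvens2 = resume(cvens2,5);
```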

nlearn: Number of additional training cycles for ens, specified as a positive integer.

Data Types: double | single

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: resume(ens,10,NPrint=5,Options=statset(UseParallel=true)) specifies to train ens for an additional 10 cycles, to display a message to the command line every time resume finishes training 5 folds, and to perform computations in parallel.

NPrint: Printout frequency, specified as a positive integer m or "off". resume displays a message to the command line every time it finishes training m folds. If you specify "off", resume does not display a message when it completes training folds.

Tip

For fastest training of some boosted decision trees, set NPrint to the default value "off". This tip holds when the classification Method is "AdaBoostM1", "AdaBoostM2", "GentleBoost", or "LogitBoost", or when the regression Method is "LSBoost".

Example: NPrint=5

Data Types: single | double | char | string
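
As a brief illustration of NPrint, the sketch below resumes training and requests a progress message after every fold. The data set, the boosting method, the 5 folds, and the seed are assumptions made for the example; the Name=Value form requires R2021a or later, and the commented line shows the equivalent older form.

```matlab
load ionosphere                      % provides predictors X and labels Y
rng(1)                               % for reproducible fold assignment (assumed seed)

cvens = fitcensemble(X,Y,'Method','GentleBoost', ...
    'NumLearningCycles',10,'KFold',5);

% Display a message each time one fold finishes its 10 extra cycles
cvens = resume(cvens,10,NPrint=1);

% Equivalent call for releases before R2021a:
% cvens = resume(cvens,10,'NPrint',1);

L = kfoldLoss(cvens)
```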

Options: Options for computing in parallel and setting random number streams, specified as a structure. Create the Options structure using statset.

Note

You need Parallel Computing Toolbox™ to run computations in parallel.

You can use the same parallel options for resume as you used for the original training. Use the Options argument to change the parallel options, as needed. This table describes the option fields and their values.

Field Name: UseParallel
Value: Set this value to true to compute in parallel. Parallel ensemble training requires you to set the Method name-value argument to "Bag". Parallel training is available only for tree learners, the default type for Method="Bag".
Default: false

Field Name: UseSubstreams
Value: Set this value to true to perform computations in a reproducible manner. To compute reproducibly, set Streams to a type that allows substreams: "mlfg6331_64" or "mrg32k3a".
Default: false

Field Name: Streams
Value: Specify this value as a RandStream object or cell array of such objects. Use a single object except when the UseParallel value is true and the UseSubstreams value is false. In that case, use a cell array that has the same size as the parallel pool.
Default: If you do not specify Streams, resume uses the default stream or streams.

For dual-core systems and above, resume parallelizes training using Intel® Threading Building Blocks (TBB). Therefore, setting UseParallel to true might not provide a significant increase in speed on a single computer. For details on Intel TBB, see https://www.intel.com/content/www/us/en/developer/tools/oneapi/onetbb.html.

Example: Options=statset(UseParallel=true)

Data Types: struct
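
The following sketch shows one way to combine the Options fields for reproducible parallel training. It assumes a bagged ensemble of trees (the case the field descriptions above say supports parallel training), an available Parallel Computing Toolbox license, and arbitrary choices for the data set, fold count, and cycle counts; the variable names s and opts are illustrative.

```matlab
load ionosphere                      % provides predictors X and labels Y
rng(1)                               % for reproducible fold assignment (assumed seed)

% Bagged trees, cross-validated with 5 folds (arbitrary choice)
cvens = fitcensemble(X,Y,'Method','Bag', ...
    'NumLearningCycles',10,'KFold',5);

% A generator type that supports substreams, paired with UseSubstreams=true
s = RandStream('mlfg6331_64');
opts = statset('UseParallel',true,'UseSubstreams',true,'Streams',s);

% Resume training for 10 more cycles using the parallel options
cvens = resume(cvens,10,'Options',opts);
L = kfoldLoss(cvens)
```

A single RandStream object is used here because UseSubstreams is true; with UseSubstreams set to false and UseParallel set to true, Streams would instead be a cell array with one stream per worker in the pool.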

Version History

Introduced in R2012b