resume

Resume training of cross-validated classification ensemble model

Syntax

ens1 = resume(ens,nlearn)

ens1 = resume(ens,nlearn,Name=Value)

Description

ens1 = resume(ens,nlearn) trains every fold of ens for nlearn more training cycles, using the same options that were used to train ens, except for the parallel training options and the printout frequency. The function returns a new cross-validated classification ensemble model ens1.

ens1 = resume(ens,nlearn,Name=Value) specifies additional options using one or more name-value arguments. For example, you can specify the printout frequency, and set options for computing in parallel.
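As a minimal sketch of the first syntax, the following code checks that resume returns a new model object rather than modifying its input. It assumes the ionosphere data set used in the example on this page and relies on the NumTrainedPerFold property of ClassificationPartitionedEnsemble objects. The seed value, the number of folds, and the variable names are illustrative choices, not part of the original page.

```matlab
load ionosphere
rng(1,'twister') % For reproducibility (arbitrary seed)
t = templateTree('MaxNumSplits',1); % Decision stump as the weak learner
ens = fitcensemble(X,Y,'Method','GentleBoost','NumLearningCycles',10, ...
    'Learners',t,'KFold',5);

ens1 = resume(ens,10); % Train each fold for 10 more cycles

ens.NumTrainedPerFold  % Learner counts per fold in the input model
ens1.NumTrainedPerFold % Learner counts per fold in the returned model
```

Because resume returns a new object, assign the output to a variable. Reusing the input variable name, as in cvens = resume(cvens,10), overwrites the earlier model.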

Examples

Train Partitioned Classification Ensemble for More Cycles

Train a partitioned classification ensemble for 10 cycles, and compare the classification loss obtained after training the ensemble for more cycles.

Load the ionosphere data set.

load ionosphere

Train a partitioned classification ensemble for 10 cycles and examine the error.

rng(10,'twister') % For reproducibility; set the seed before the folds are partitioned
t = templateTree('MaxNumSplits',1); % Weak learner template tree object
cvens = fitcensemble(X,Y,'Method','GentleBoost','NumLearningCycles',10,'Learners',t,'crossval','on');
L = kfoldLoss(cvens)

L = 
0.0940

Train for 10 more cycles and examine the new error.

cvens = resume(cvens,10);
L = kfoldLoss(cvens)

L = 
0.0712

The cross-validation error is lower in the ensemble after training for 10 more cycles.
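To see how the error changes from one cycle to the next instead of comparing two totals, one option is the cumulative mode of kfoldLoss. The following sketch repeats the training from this example and plots the cross-validated loss against the number of training cycles. The variable name Lcum and the axis labels are illustrative.

```matlab
load ionosphere
rng(10,'twister') % For reproducibility
t = templateTree('MaxNumSplits',1);
cvens = fitcensemble(X,Y,'Method','GentleBoost','NumLearningCycles',10, ...
    'Learners',t,'CrossVal','on');
cvens = resume(cvens,10); % 10 more cycles in every fold

% Element j of Lcum is the loss computed with weak learners 1 through j
Lcum = kfoldLoss(cvens,'Mode','cumulative');

plot(Lcum)
xlabel('Number of training cycles')
ylabel('Cross-validated classification error')
```

A curve that has flattened out suggests that further calls to resume are unlikely to reduce the error much.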

Input Arguments

`ens` — Cross-validated classification ensemble model
`ClassificationPartitionedEnsemble` model object

Cross-validated classification ensemble model, specified as a ClassificationPartitionedEnsemble model object created with one of these functions:

fitcensemble with one of these five cross-validation name-value arguments specified: CrossVal, KFold, Holdout, Leaveout, or CVPartition
crossval applied to a classification ensemble model object
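The second route, cross-validating an ensemble that is already trained, can be sketched as follows. This is an illustration only: the data set, the seed, the number of folds, and the variable names mdl, cvmdl, and cvmdl1 are assumptions rather than part of the original page.

```matlab
load ionosphere
rng(1,'twister') % For reproducibility (arbitrary seed)
t = templateTree('MaxNumSplits',1);

% Train an ensemble on all the data, then cross-validate it
mdl = fitcensemble(X,Y,'Method','GentleBoost','NumLearningCycles',10, ...
    'Learners',t);
cvmdl = crossval(mdl,'KFold',5);

% The partitioned model is a valid input to resume
cvmdl1 = resume(cvmdl,10);
class(cvmdl1)
```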

`nlearn` — Number of additional training cycles
positive integer

Number of additional training cycles for ens, specified as a positive integer.

Data Types: double | single

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: resume(ens,10,NPrint=5,Options=statset(UseParallel=true)) specifies to train ens for an additional 10 cycles, display a message to the command line every time resume finishes training 5 folds, and perform computations in parallel.

`NPrint` — Printout frequency
`"off"` (default) | positive integer

Printout frequency, specified as a positive integer m or "off". resume displays a message to the command line every time it finishes training m folds. If you specify "off", resume does not display a message when it completes training folds.

Tip

For the fastest training of some boosted decision trees, when the classification Method is "AdaBoostM1", "AdaBoostM2", "GentleBoost", or "LogitBoost", set NPrint to "off" (the default value).

Example: NPrint=5

Data Types: single | double | char | string
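A short sketch of NPrint in use, assuming the ionosphere data and a 5-fold boosted ensemble (both illustrative choices). The exact text of the progress messages is not shown here because it is not specified on this page.

```matlab
load ionosphere
rng(1,'twister') % For reproducibility (arbitrary seed)
t = templateTree('MaxNumSplits',1);
cvens = fitcensemble(X,Y,'Method','GentleBoost','NumLearningCycles',10, ...
    'Learners',t,'KFold',5);

% Request a progress message each time one fold finishes training
cvens1 = resume(cvens,10,NPrint=1);

% Equivalent call for releases before R2021a:
% cvens1 = resume(cvens,10,'NPrint',1);
```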

`Options` — Options for computing in parallel and setting random number streams
structure

Options for computing in parallel and setting random number streams, specified as a structure. Create the Options structure using statset.

Note

You need Parallel Computing Toolbox™ to run computations in parallel.

You can use the same parallel options for resume as you used for the original training. Use the Options argument to change the parallel options, as needed. This table describes the option fields and their values.

Field Name	Value	Default
`UseParallel`	Set this value to `true` to compute in parallel. Parallel ensemble training requires you to set the `Method` name-value argument to `"Bag"`. Parallel training is available only for tree learners, the default type for `Method="Bag"`.	`false`
`UseSubstreams`	Set this value to `true` to perform computations in a reproducible manner. To compute reproducibly, set `Streams` to a type that allows substreams: `"mlfg6331_64"` or `"mrg32k3a"`.	`false`
`Streams`	Specify this value as a `RandStream` object or cell array of such objects. Use a single object except when the `UseParallel` value is `true` and the `UseSubstreams` value is `false`. In that case, use a cell array that has the same size as the parallel pool.	If you do not specify `Streams`, `resume` uses the default stream or streams.

For dual-core systems and above, resume parallelizes training using Intel® Threading Building Blocks (TBB). Therefore, setting UseParallel to true might not provide a significant increase in speed on a single computer. For details on Intel TBB, see https://www.intel.com/content/www/us/en/developer/tools/oneapi/onetbb.html.

Example: Options=statset(UseParallel=true)

Data Types: struct
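The following sketch combines the three fields from the table to request reproducible parallel training. It assumes that Parallel Computing Toolbox is installed, that a bagged ensemble of trees is appropriate for the data, and that the same options structure is acceptable to both fitcensemble and resume. The data set, fold count, cycle counts, and variable names are illustrative.

```matlab
load ionosphere

% Use a stream type that supports substreams, as UseSubstreams requires
s = RandStream('mlfg6331_64');
opts = statset('UseParallel',true,'UseSubstreams',true,'Streams',s);

% Parallel training requires Method="Bag" with tree learners (the default)
cvens = fitcensemble(X,Y,'Method','Bag','NumLearningCycles',20, ...
    'KFold',5,'Options',opts);

% resume does not carry over the parallel options, so pass them again
cvens1 = resume(cvens,20,'Options',opts);
L = kfoldLoss(cvens1)
```

A single RandStream object is enough here because UseSubstreams is true. With UseParallel set to true and UseSubstreams set to false, Streams would instead need to be a cell array with one stream per worker in the parallel pool.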

Extended Capabilities

Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.

resume supports parallel training using the 'Options' name-value argument. Create options using statset, such as options = statset('UseParallel',true). Parallel ensemble training requires you to set the 'Method' name-value argument to 'Bag'. Parallel training is available only for tree learners, the default type for 'Bag'.

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

This function fully supports GPU arrays. For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

Version History

Introduced in R2012b
