Divide Data for Optimal Neural Network Training
This topic presents part of a typical multilayer network workflow. For more information and other steps, see Multilayer Shallow Neural Networks and Backpropagation Training.
When training multilayer networks, the general practice is to first divide the data into three subsets. The first subset is the training set, which is used for computing the gradient and updating the network weights and biases. The second subset is the validation set. The error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error. However, when the network begins to overfit the data, the error on the validation set typically begins to rise. The network weights and biases are saved at the minimum of the validation set error. This technique is discussed in more detail in Improve Shallow Neural Network Generalization and Avoid Overfitting.
The test set error is not used during training, but it is used to compare different models. It is also useful to plot the test set error during the training process. If the error on the test set reaches a minimum at a significantly different iteration number than the validation set error, this might indicate a poor division of the data set.
There are four functions provided for dividing data into training, validation, and test sets: dividerand (the default), divideblock, divideint, and divideind. The data division is normally performed automatically when you train the network.
Function | Algorithm |
---|---|
dividerand | Divide the data randomly (default) |
divideblock | Divide the data into contiguous blocks |
divideint | Divide the data using an interleaved selection |
divideind | Divide the data by index |
You can access or change the division function for your network with this property:
net.divideFcn
Each of the division functions takes parameters that customize its behavior. These values are stored and can be changed with the following network property:
net.divideParam
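For example, you can inspect these settings on a network object. The feedforwardnet call below, with 10 hidden neurons, is just one way to create a network and is used here only for illustration:

```matlab
% Create a feedforward network (the 10 hidden neurons are an arbitrary choice)
net = feedforwardnet(10);

% Inspect the current division function and its parameters
net.divideFcn    % 'dividerand' by default
net.divideParam  % structure of parameters for the chosen division function
```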
The divide function is accessed automatically whenever the network is trained, and is used to divide the data into training, validation, and test subsets. If net.divideFcn is set to 'dividerand' (the default), then the data is randomly divided into the three subsets using the division parameters net.divideParam.trainRatio, net.divideParam.valRatio, and net.divideParam.testRatio. The fraction of data that is placed in the training set is trainRatio/(trainRatio + valRatio + testRatio), with a similar formula for the other two sets. The default ratios for training, validation, and testing are 0.7, 0.15, and 0.15, respectively.
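As a minimal sketch, assuming x and t are your input and target matrices, you could change the default ratios before training (the 80/10/10 split here is only an example):

```matlab
net = feedforwardnet(10);               % illustrative network
net.divideFcn = 'dividerand';           % random division (the default)
net.divideParam.trainRatio = 0.80;      % fraction of samples used for training
net.divideParam.valRatio   = 0.10;      % fraction used for validation
net.divideParam.testRatio  = 0.10;      % fraction used for testing

[net, tr] = train(net, x, t);
```

The training record tr returned by train stores the indices that were assigned to each subset in tr.trainInd, tr.valInd, and tr.testInd.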
If net.divideFcn is set to 'divideblock', then the data is divided into three subsets using three contiguous blocks of the original data set (training taking the first block, validation the second, and testing the third). The fraction of the original data that goes into each subset is determined by the same three division parameters used for dividerand.
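A sketch of contiguous-block division, which can be useful when the samples have a natural order such as a time series (the ratios shown are the defaults):

```matlab
net = feedforwardnet(10);
net.divideFcn = 'divideblock';          % first block trains, second validates, third tests
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
```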
If net.divideFcn is set to 'divideint', then the data is divided by an interleaved method, as in dealing a deck of cards, so that different percentages of data go into the three subsets. The fraction of the original data that goes into each subset is determined by the same three division parameters used for dividerand.
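A sketch of interleaved division (the ratios shown are the defaults):

```matlab
net = feedforwardnet(10);
net.divideFcn = 'divideint';            % deal samples to the three subsets in turn
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
```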
When net.divideFcn is set to 'divideind', the data is divided by index. The indices for the three subsets are defined by the division parameters net.divideParam.trainInd, net.divideParam.valInd, and net.divideParam.testInd. The default assignment for these indices is the null array, so you must set the indices when using this option.
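A sketch of index-based division; the index vectors below assume a data set of 1000 samples and are purely illustrative:

```matlab
net = feedforwardnet(10);
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:700;       % samples 1-700 for training
net.divideParam.valInd   = 701:850;     % samples 701-850 for validation
net.divideParam.testInd  = 851:1000;    % samples 851-1000 for testing
```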