Insight on spikes in Accuracy(%) graph while training a CNN in Deep Learning Toolbox.

I'm using Deep Network Designer and following the example "Get Started With Transfer Learning" to train a CNN based on the ResNet-101 architecture. The data set has 2353 images of size 600x600, split 70/30 into training and validation sets. I'm using the following training options:
InitialLearnRate: 0.0001
ValidationFrequency: 183
MaxEpochs: 30
MiniBatchSize: 9
The rest of the training options are left at their defaults. While training, I'm getting abnormal spikes in the training accuracy graph: the smoothed curve has some noticeable spikes, and the unsmoothed curve is all over the place. I've searched the internet and have not found a graph or post with similar spikes. Is this a bad thing? Should I avoid it, and if so, how?

Answers (1)

Debraj Maji on 18 Oct 2023
Hi @Diego,
I understand that you are training the network with mini-batch gradient descent and want to avoid the abnormal spikes in the accuracy graph.
The spikes are an expected consequence of mini-batch gradient descent in general. Some mini-batches contain outliers, which temporarily lowers the training accuracy for those iterations. You can reduce or avoid the spikes with the following techniques:
  • Full-batch gradient descent: The batch size equals the size of the training set, so the entire training data is used for every gradient and accuracy computation.
  • Larger batch sizes: This should reduce the depth of the spikes but will not eliminate them completely. It is usually preferable to full-batch gradient descent because training takes much less time.
  • Outlier removal: This is not recommended. The resulting model is likely to be less robust, and removing outliers is tedious given the amount of data you are using.
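To get a feel for how much of the jaggedness could be plain sampling noise, here is a minimal Python sketch (not MATLAB, and not part of Deep Learning Toolbox). It assumes that the training curve is computed per mini-batch and that each image in a batch is classified correctly with the same independent probability p, which is a simplification. The value p = 0.9, the batch sizes 32 and 128, and both function names are hypothetical choices made only for illustration. The 183 batches correspond roughly to one epoch in the question's setup (about 1647 training images at 9 per batch).

```python
import math
import random


def batch_accuracy_std(p, batch_size):
    """Theoretical standard deviation (as a fraction) of per-batch accuracy,
    assuming each image is correct independently with probability p."""
    return math.sqrt(p * (1.0 - p) / batch_size)


def simulate_batch_accuracies(p, batch_size, num_batches, seed=0):
    """Simulate per-batch accuracies in percent under the same assumption."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(num_batches):
        # Count how many of the batch_size images come out "correct".
        correct = sum(rng.random() < p for _ in range(batch_size))
        accuracies.append(100.0 * correct / batch_size)
    return accuracies


if __name__ == "__main__":
    p = 0.9  # assumed underlying accuracy, illustrative only
    for n in (9, 32, 128):  # 9 is from the question; 32 and 128 are examples
        accs = simulate_batch_accuracies(p, n, num_batches=183)
        print(
            f"batch size {n:3d}: "
            f"one image = {100.0 / n:5.2f} points, "
            f"theoretical std = {100.0 * batch_accuracy_std(p, n):5.2f} points, "
            f"lowest simulated batch = {min(accs):6.2f}%"
        )
```

With a batch of 9, a single misclassified image moves the reported accuracy by about 11.1 percentage points, so a jagged unsmoothed curve is expected even when the network is learning well. Under this model the spread shrinks in proportion to the square root of the batch size.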
For further information on how mini-batch gradient descent works, see the following article: "A Gentle Introduction to Mini-Batch Gradient Descent and How to Configure Batch Size".



