What is validation data in deep learning?

Hi, I am new to deep learning and I am struggling to grasp the concept of validation and how to interpret it in training progress plots.
As far as I understand after some Googling, it is a check on how the model is behaving during training, before deploying it on test data.
My question is: what is validation data, what is validation accuracy, and how should I interpret it? I would appreciate it if someone could help me understand in simple words.
Best,
Jabbar

Accepted Answer

Philip Brown on 21 Jun 2021
The "validation data" is a set of data held separate from your training data. It's used during the training process to see how the network would perform on data it hasn't been directly trained on.
The "training data" is what's used in the process of updating the layer weights via backpropagation. Training data is fed through the network every iteration, the loss is computed, and the layer weights are updated via backpropagation to reduce the loss for that iteration. If the training is going well, every iteration updates the weights of the network so it becomes better and better at predicting on the training data.
However, there's a danger the network becomes too good at predicting on the training data. It can learn very specific features of the training set, rather than generally useful features which would be helpful for predicting on new data. This is called "overfitting". To check for that, we can use "validation data". The validation data is not used directly to train the network. It's instead used to see how the network is performing. In your training plot, these validation checks happen every 50th iteration. Results from the validation data are not used to update the network weights.
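As a rough illustration of the loop described above, here is a minimal sketch in plain Python rather than MATLAB. Everything in it is an assumption made for illustration: the toy data, the one-weight logistic model, and the names `make_data`, `train`, and `val_every` are hypothetical and are not taken from the original poster's network. The point it tries to show is that only the training data drives the weight updates, while the validation data is only measured, here on every 50th iteration.

```python
import math
import random

def make_data(n, rng):
    """Hypothetical toy data: the label is 1 when x > 0, otherwise 0."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 1 if x > 0 else 0) for x in xs]

def predict(w, b, x):
    """Logistic model with one weight and one bias."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def accuracy(w, b, data):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(1 for x, y in data if (predict(w, b, x) >= 0.5) == (y == 1))
    return correct / len(data)

def train(train_data, val_data, iterations=200, val_every=50, lr=0.5):
    w, b = 0.0, 0.0
    history = []
    for it in range(1, iterations + 1):
        # Gradient of the mean cross-entropy loss, computed on TRAINING data only.
        grad_w = grad_b = 0.0
        for x, y in train_data:
            err = predict(w, b, x) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(train_data)
        b -= lr * grad_b / len(train_data)

        # Validation check: measure only, never update the weights from it.
        if it % val_every == 0:
            history.append((it, accuracy(w, b, train_data), accuracy(w, b, val_data)))
    return w, b, history

rng = random.Random(0)
train_data = make_data(200, rng)
val_data = make_data(50, rng)   # held out: never used in the weight update
w, b, history = train(train_data, val_data)
for it, train_acc, val_acc in history:
    print(f"iteration {it}: training accuracy {train_acc:.2f}, validation accuracy {val_acc:.2f}")
```

The validation points in a training progress plot correspond to the entries collected in `history`: periodic measurements that sit beside the training curve without influencing it.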
As you're seeing in the training plot, the validation accuracy is a little bit below the training accuracy - the network is better at predicting on the data it's been directly trained on. This is quite common. If you saw the validation accuracy start to drop substantially as you train further, that would be more evidence for overfitting: your network would be learning specific features of the training set, rather than general features also useful for the validation set.
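One way to make that interpretation concrete is a small check over the recorded accuracies, sketched below in plain Python. This is only an assumed rule of thumb, not a standard function: the name `first_overfitting_check`, the 0.05 tolerance, and the accuracy values are all made up for illustration.

```python
def first_overfitting_check(train_acc, val_acc, tolerance=0.05):
    """Return the index of the first validation check where validation accuracy
    has fallen more than `tolerance` below its best earlier value while training
    accuracy has not decreased since the previous check. Return None otherwise."""
    best_val = val_acc[0]
    for i in range(1, len(val_acc)):
        training_still_improving = train_acc[i] >= train_acc[i - 1]
        validation_dropped = val_acc[i] < best_val - tolerance
        if training_still_improving and validation_dropped:
            return i
        best_val = max(best_val, val_acc[i])
    return None

# Made-up accuracies recorded at successive validation checks.
train_acc = [0.70, 0.80, 0.88, 0.93, 0.97, 0.99]
healthy_val = [0.68, 0.77, 0.83, 0.86, 0.88, 0.89]      # small, stable gap
overfit_val = [0.68, 0.77, 0.83, 0.84, 0.80, 0.74]      # validation starts to fall

print(first_overfitting_check(train_acc, healthy_val))
print(first_overfitting_check(train_acc, overfit_val))
```

In the first run the validation accuracy stays slightly below the training accuracy, which is the common case described above. In the second run it peaks and then falls while the training accuracy keeps rising, which is the pattern that suggests overfitting.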
  2 Comments
nika mentges on 11 Apr 2024
May I ask what the difference between the validation data and the testing data is?
Cris LaPierre on 11 Apr 2024
Edited: Cris LaPierre on 12 Apr 2024
Validation data is used during the training process to evaluate the model. From the doc:
  • "Validation estimates model performance on new data compared to the training data, and helps you choose the best model. Validation protects against overfitting."
Test data is used to evaluate the final trained model. From the doc:
  • "You can use the test set to evaluate the performance of a trained model. In particular, you can check whether the validation metrics provide good estimates for the model performance on new data."
Of note, the model has been trained without ever seeing the test data, so it can help highlight issues with your trained model that the validation data does not.
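To show how the three sets relate, here is a minimal sketch of a three-way split in plain Python. The function name `split_dataset` and the 70/15/15 proportions are assumptions chosen for illustration, not values taken from the documentation quoted above.

```python
import random

def split_dataset(samples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then cut into disjoint training, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                 # set aside until training is finished
    val = shuffled[n_test:n_test + n_val]    # checked periodically during training
    train = shuffled[n_test + n_val:]        # the only data used to update weights
    return train, val, test

train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

Because the validation set is typically used to pick between models or decide when to stop training, it influences the final model indirectly. The test set is not used in any of those decisions, which is why it can give a more independent estimate.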


More Answers (2)

Lei Liu on 29 May 2021
I'm also a newcomer to neural network learning. Perhaps some of this page will help you:
https://www.mathworks.com/help/deeplearning/ug/setting-up-parameters-and-training-of-a-convnet.html?searchHighlight=overfitting&s_tid=doc_srchtitle

Klara Husonuk on 26 Sep 2023
Validation accuracy is a metric that tells you how well your model is doing on the validation data, which it has not been trained on. It's like a grade on a practice test. Higher validation accuracy indicates that your model is generalizing and making accurate predictions on new, unseen data, rather than only memorizing the training set. Monitoring validation accuracy during training helps you fine-tune your model for better results.
