Understanding the training, validation and test sets

Can someone tell me how I should divide my data into training, validation and test sets correctly?
The default settings are training 70%, validation 15% and test 15%.
As I understand it, the training set is used to train the model, the validation set to adjust it, and the test set to measure how well it performs on unseen data. Is that right?
In my case, all I am trying to do is get results as close as possible to my "Output". What's the point of the validation and test sets?

Accepted Answer

Askic V on 9 Nov 2022
Hello Bob,
I don't think anyone could explain this better than Andrew Ng himself. Please have a look at his YT video on this topic; it answers your questions.
As a quick answer: the training data is used to train (fit) the model. The validation set is used to evaluate the model's performance and to do additional tweaking and tuning, or even to change the structure and choose a better model.
So, depending on the validation error (the cost function evaluated on the validation set), the structure of the model can change (for example, adding a new feature, changing the order of a polynomial, etc.).
After that, the test dataset is used to answer the following question: "How well does the model generalize, i.e. how does it perform on data it has never seen before?" Because the validation set has already influenced those choices, its error tends to be optimistic, which is why a separate test set is kept aside.
But do have a look at that YT video; it will answer your questions much better.
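
To make the three roles concrete, here is a minimal sketch in plain Python (not Deep Learning Toolbox code). It assumes a random 70/15/15 split like the default mentioned in the question, uses made-up synthetic data, and takes the polynomial order as the thing being tuned. The function names (split_indices, fit_poly, mse) are hypothetical helpers written for this illustration only.

```python
import random

def split_indices(n, train=0.70, val=0.15, seed=0):
    """Shuffle 0..n-1 and cut it into train/validation/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (fine for low degrees)."""
    m = degree + 1
    # Build A^T A and A^T y for the Vandermonde matrix A[i][j] = x_i ** j.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, m):
            f = ata[r][col] / ata[col][col]
            for c in range(col, m):
                ata[r][c] -= f * ata[col][c]
            aty[r] -= f * aty[col]
    # Back substitution; coef[k] multiplies x ** k.
    coef = [0.0] * m
    for i in reversed(range(m)):
        s = aty[i] - sum(ata[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = s / ata[i][i]
    return coef

def mse(coef, xs, ys):
    """Mean squared error of the polynomial on the given points."""
    preds = [sum(c * x ** k for k, c in enumerate(coef)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)

# Synthetic data (an assumption for this sketch): a noisy quadratic.
rng = random.Random(1)
xs = [rng.uniform(-1, 1) for _ in range(200)]
ys = [1 + 2 * x - 3 * x ** 2 + rng.gauss(0, 0.1) for x in xs]

train_idx, val_idx, test_idx = split_indices(len(xs))

def pick(idx):
    return [xs[i] for i in idx], [ys[i] for i in idx]

x_tr, y_tr = pick(train_idx)
x_va, y_va = pick(val_idx)
x_te, y_te = pick(test_idx)

# Training set: fit each candidate. Validation set: compare the candidates.
candidates = {}
for degree in range(1, 7):
    coef = fit_poly(x_tr, y_tr, degree)
    candidates[degree] = (mse(coef, x_va, y_va), coef)

best_degree = min(candidates, key=lambda d: candidates[d][0])
best_val_error, best_coef = candidates[best_degree]

# Test set: touched once, after the model has been chosen.
test_error = mse(best_coef, x_te, y_te)
print("chosen polynomial order:", best_degree)
print("validation MSE:", best_val_error)
print("test MSE:", test_error)
```

The point of the sketch is the order of operations: every candidate is fitted on the training data only, the validation error decides which candidate to keep, and the test error is computed once at the very end as the estimate of performance on unseen data.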
