Convolutional neural network regression - low RMSE but poor prediction performance?
Dear community,
I'm using a convolutional neural network to predict a numerical output from one-hot encoded matrices. After a long time, I finally managed to configure a network that doesn't seem to overfit my data, judging by the learning curves:
[Image: training and validation learning curves]
Happy with this result, I used the trained network to predict the test set. However, the performance is extremely poor:
[Image: test-set prediction results]
Now I'm totally puzzled. I expected that similarly good RMSE values would give me nearly perfect predictions, but it seems my network just memorizes the training data. But shouldn't I then see typical overfitting learning curves? Could the problem be related to the quality of the data?
As supplementary information, here's my code:
% network configuration
filtersize = [3,3];
layer = [
    imageInputLayer(inputsize,'Normalization','none')
    convolution2dLayer(filtersize,8,'Stride',1,'Padding','same')
    reluLayer
    maxPooling2dLayer([2,2],'Stride',2)
    convolution2dLayer(filtersize,8,'Stride',1,'Padding','same')
    reluLayer
    maxPooling2dLayer([2,2],'Stride',2)
    convolution2dLayer(filtersize,16,'Stride',1,'Padding','same')
    reluLayer
    maxPooling2dLayer([2,2],'Stride',1)
    convolution2dLayer(filtersize,32,'Stride',1,'Padding','same')
    reluLayer
    maxPooling2dLayer([2,2])
    convolution2dLayer(filtersize,64,'Stride',1,'Padding','same')
    reluLayer
    maxPooling2dLayer([2,2])
    flatLayer % this is a flatten layer imported from Keras
    dropoutLayer(.65)
    fullyConnectedLayer(2048)
    dropoutLayer(.65)
    fullyConnectedLayer(128)
    fullyConnectedLayer(32)
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];
lgraph = layerGraph(layer);

% training options
miniBatchSize = 8;
options = trainingOptions('adam', ...
    'MaxEpochs',4096, ...
    'MiniBatchSize',miniBatchSize, ...
    'Shuffle','every-epoch', ...
    'OutputNetwork','best-validation-loss', ...
    'Plots','training-progress', ...
    'Verbose',0, ...
    'VerboseFrequency',200, ...
    'ValidationData',{Xval,Yval}, ...
    'ValidationFrequency',32, ...
    'ValidationPatience',128, ...
    'InitialLearnRate',1e-4, ...
    'LearnRateSchedule','none', ...
    'GradientDecayFactor',.7, ...
    'L2Regularization',10^(-4), ...
    'ExecutionEnvironment','auto');
Thanks for any kind of help.
Answers (1)
Sai Pavan
on 27 Sep 2023
Hi Daniel,
I understand that you want to know the possible reason for the poor performance of your CNN regression model on the test set despite low RMSE values on the training set.
As you have rightly hypothesized, the most likely reason for this behaviour, where the model does not overfit the training and validation datasets but performs poorly on the test set, is a mismatch between the training and test data distributions. It is crucial to have a balanced and representative test dataset that follows a distribution similar to that of the training dataset and captures the true variability of the real-world scenarios you want your model to perform well on.
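One way to check for such a mismatch is to compare the target distributions of the two splits and to put the RMSE in context with a trivial baseline. The following is only a minimal sketch, not a definitive diagnostic: the data are synthetic stand-ins, and the names Ytrain, Ytest, and rmseFcn are hypothetical, so substitute your own response vectors.

```matlab
% Synthetic stand-in targets (assumption: replace with your own response vectors).
rng(0);                                  % fixed seed so the example is reproducible
Ytrain = 5 + 1.0*randn(500,1);           % hypothetical training targets
Ytest  = 8 + 2.0*randn(200,1);           % hypothetical test targets, deliberately shifted

% Compare basic statistics of the two target distributions.
fprintf('train: mean %.2f, std %.2f, range [%.2f, %.2f]\n', ...
    mean(Ytrain), std(Ytrain), min(Ytrain), max(Ytrain));
fprintf('test:  mean %.2f, std %.2f, range [%.2f, %.2f]\n', ...
    mean(Ytest), std(Ytest), min(Ytest), max(Ytest));

% Fraction of test targets that lie outside the range seen during training.
fracOutside = mean(Ytest < min(Ytrain) | Ytest > max(Ytrain));
fprintf('test targets outside training range: %.1f%%\n', 100*fracOutside);

% Baseline: always predict the mean of the training targets.
% A model RMSE is only "low" if it is clearly below this value on the same split.
rmseFcn = @(y, yhat) sqrt(mean((y - yhat).^2));
baselinePred      = mean(Ytrain);
rmseBaselineTrain = rmseFcn(Ytrain, baselinePred);
rmseBaselineTest  = rmseFcn(Ytest,  baselinePred);
fprintf('baseline RMSE: train %.2f, test %.2f\n', rmseBaselineTrain, rmseBaselineTest);
```

With your own data, histogram(Ytrain) and histogram(Ytest) give a visual version of the same comparison. If the network's test RMSE is not clearly below rmseBaselineTest, the network is doing no better on that split than predicting a constant.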
Hope it helps.
Regards,
Sai Pavan