SSD Object Detector training results in NaN loss and RMSE
Hello
I've created an SSD with MobileNet-v2 using the example from "Create SSD Object Detection Network", but changed the class count to just 1.
For training I've used the sample from "Object Detection Using SSD Deep Learning".
|=======================================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |     Loss     |   Accuracy   |     RMSE     |      Rate       |
|=======================================================================================================|
|       1 |           1 |       00:00:01 |       3.1220 |       50.98% |         3.39 |      1.0000e-05 |
|       4 |          50 |       00:00:27 |          NaN |        0.00% |          NaN |      1.0000e-05 |
|       8 |         100 |       00:00:53 |          NaN |        0.00% |          NaN |      1.0000e-05 |
|      11 |         150 |       00:01:19 |          NaN |        0.00% |          NaN |      1.0000e-05 |
Is there something I'm missing? Is the SSD model created in the first sample not an actual working model?
Best regards
Link Sample 1: https://ch.mathworks.com/help/vision/examples/create-ssd-object-detection-network.html
Link Sample 2: https://ch.mathworks.com/help/deeplearning/ug/object-detection-using-ssd-deep-learning.html
Edit: I've tried decreasing the learning rate with no success.
Answers (1)
Ryan Comeau
on 10 May 2020
Hello,
I don't know exactly what is causing this, but if I had to bet on it, I would check all of the bounding boxes in your data set and make sure they correctly label the objects. If you removed a class but left its bounding boxes in place, that could be how the NaN values arise. Superimpose the bounding boxes on the images and confirm that the labels and locations of the objects are correct.
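Before inspecting the boxes visually, a quick numeric check can flag the obviously broken ones. The following is only a minimal sketch in base MATLAB, assuming boxes are stored as [x y width height] rows in 1-based pixel coordinates; `imageSize` and the `boxes` values are made-up illustration data, not taken from the question.

```matlab
% Hypothetical example data (assumed, not from the original post).
% Each row is one box: [x y width height] in pixels.
% Row 1 is valid, row 2 starts outside the image, row 3 has zero width,
% row 4 extends past the image edge, row 5 contains NaN.
imageSize = [300 300];   % [height width] of the assumed training image
boxes = [ 20  30  50  40
          -5  10  30  30
         100 120   0  25
         280 280  40  40
         NaN  10  20  20];

% A box counts as valid if it is finite, has positive size,
% and lies completely inside the image.
isFiniteBox = all(isfinite(boxes), 2);
hasPosSize  = boxes(:,3) > 0 & boxes(:,4) > 0;
insideImage = boxes(:,1) >= 1 & boxes(:,2) >= 1 & ...
              boxes(:,1) + boxes(:,3) - 1 <= imageSize(2) & ...
              boxes(:,2) + boxes(:,4) - 1 <= imageSize(1);
isValid = isFiniteBox & hasPosSize & insideImage;

% Report the rows that need a closer look.
badRows = find(~isValid);
fprintf('Invalid boxes at rows: %s\n', mat2str(badRows'));
```

Boxes that pass a check like this can still be mislabelled, so drawing them on the images remains worthwhile.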
Second, the lowest recommended learning rate I've seen in the literature (I don't have a specific paper to link here, unfortunately) is about 1e-6. Learning rates that low can keep your network from converging at all, since the weights are never updated enough. I recommend using the piecewise learning rate drop schedule. Here is a sample of what I've used to achieve some satisfactory results.
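options = trainingOptions('sgdm',...
'InitialLearnRate',18.0e-4,... % starting learning rate (1.8e-3)
'LearnRateSchedule','piecewise', ... % drop the rate in steps during training
'LearnRateDropFactor',0.7, ... % multiply the rate by 0.7 at each drop
'LearnRateDropPeriod',1, ... % drop after every epoch
'Verbose',true,...
'MiniBatchSize',24,...
'MaxEpochs',8,...
'Shuffle','every-epoch',... % reshuffle the training data each epoch
'VerboseFrequency',1); % print progress every iteration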
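To see what these settings do to the learning rate, the schedule can be computed by hand. This is a rough sketch, assuming the piecewise schedule multiplies the rate by the drop factor once every `LearnRateDropPeriod` epochs; the variable names are chosen here for illustration.

```matlab
% Values copied from the trainingOptions call above.
initialLearnRate = 18.0e-4;
dropFactor       = 0.7;
dropPeriod       = 1;
maxEpochs        = 8;

% Assumed piecewise rule: one multiplication by dropFactor
% for every completed block of dropPeriod epochs.
epochs     = 1:maxEpochs;
learnRates = initialLearnRate * dropFactor .^ floor((epochs - 1) / dropPeriod);

for k = epochs
    fprintf('Epoch %d: learning rate %.4e\n', k, learnRates(k));
end
```

Under that assumption the rate starts at 1.8e-3 and falls to roughly 1.5e-4 by the eighth epoch, so it stays well above the 1e-5 used in the question.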

Third, to gain further performance, tune the strides and sizes of your convolution kernels. You'll need to adjust these to your specific task, so I can't help there.
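As a starting point for that tuning, it helps to know how stride and kernel size change the feature map size. The sketch below uses the standard formula for a convolution without dilation; the helper `convOutSize` and the input size of 300 are assumptions made for illustration, not values from the question.

```matlab
% Output size along one spatial dimension (no dilation):
%   out = floor((in + 2*padding - kernelSize) / stride) + 1
convOutSize = @(in, kernelSize, stride, padding) ...
    floor((in + 2*padding - kernelSize) ./ stride) + 1;

inputSize = 300;   % assumed input width/height in pixels

sameSize  = convOutSize(inputSize, 3, 1, 1);   % 3x3 kernel, stride 1, padding 1
halfSize  = convOutSize(inputSize, 3, 2, 1);   % 3x3 kernel, stride 2, padding 1
noPadding = convOutSize(inputSize, 3, 2, 0);   % 3x3 kernel, stride 2, no padding

fprintf('stride 1, pad 1: %d\n', sameSize);
fprintf('stride 2, pad 1: %d\n', halfSize);
fprintf('stride 2, pad 0: %d\n', noPadding);
```

Larger strides shrink the feature maps faster, which in turn changes how large an image region each location in a detection layer covers.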
Hope this helps,
RC