NaN problem (validation loss and mini-batch loss) in transfer learning with GoogLeNet
I am trying to use GoogLeNet for transfer learning on an image data set with three output classes. I am using the same script that worked successfully with ResNet-50 before. Since both networks share the same input size (224 x 224 x 3), I thought it would be straightforward to run another test on the same data set with GoogLeNet. However, neither using Deep Network Designer nor editing the original GoogLeNet in a script (substituting the last layers) has helped me avoid the same error: after the first iteration, both the mini-batch loss and the validation loss go to NaN.
I think I am missing something here, because it doesn't make sense to me. I have attached the script plus a screenshot of the output that shows the NaN values.
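For reference, here is a minimal sketch (assuming the pretrained GoogLeNet support package for Deep Learning Toolbox is installed) of how to check which layers sit at the end of the pretrained network, i.e. the three layers that the script below removes and replaces:
% Minimal check of GoogLeNet's final layers (sketch, separate from the training script)
net = googlenet;
net.Layers(end-2:end)   % shows 'loss3-classifier', 'prob' and 'output'
analyzeNetwork(net)     % optional: interactive view of the whole layer graph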
augmenter = imageDataAugmenter(...
    'RandRotation',[0 15]); % random rotations of up to 15 degrees
path = fullfile('C:\','Big J','My Exp','training');
imds = imageDatastore(path,'IncludeSubfolders',true,'LabelSource','foldernames');
inputSize = [224 224];
imds.ReadFcn = @(loc) imresize(imread(loc),inputSize); % resize each image on read to match the 224x224 network input
[TrainDataStore,ValDataStore] = splitEachLabel(imds,0.8,'randomize');
Traindatasource = augmentedImageDatastore([224 224],TrainDataStore,'DataAugmentation',augmenter); % apply the rotation augmentation to the training images
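% Optional sanity check: confirm that the three class labels were picked up
% from the folder names and that both splits contain all three classes.
countEachLabel(TrainDataStore)
countEachLabel(ValDataStore)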
%%
net = googlenet;
lgraph = layerGraph(net);
lgraph = removeLayers(lgraph, {'loss3-classifier','prob','output'}); % remove the original 1000-class head
figure('Units','normalized','Position',[0.1 0.1 0.8 0.8]);plot(lgraph);
newLayers = [
    fullyConnectedLayer(3,'Name','fc-3','WeightLearnRateFactor',20,'BiasLearnRateFactor',20) % higher learn-rate factors so the new layer learns faster
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classoutput')];
lgraph = addLayers(lgraph,newLayers);
figure('Units','normalized','Position',[0.1 0.1 0.8 0.8]);plot(lgraph);
lgraph = connectLayers(lgraph,'pool5-drop_7x7_s1','fc-3');
figure('Units','normalized','Position',[0.1 0.1 0.8 0.8]);plot(lgraph);
%%
options = trainingOptions('sgdm', ...
    'ExecutionEnvironment','gpu', ...
    'InitialLearnRate',0.01, ...
    'L2Regularization',0.0001, ...
    'MaxEpochs',15, ...
    'MiniBatchSize',32, ...
    'Momentum',0.9, ...
    'Shuffle','once', ...
    'Verbose',1, ...
    'VerboseFrequency',50, ...
    'ValidationData',ValDataStore, ...
    'ValidationFrequency',50, ...
    'ValidationPatience',Inf, ...
    'Plots','training-progress');
%%
net = trainNetwork(Traindatasource,lgraph,options);
%%
G_net_test =net;
save('G_net_test.mat', 'G_net_test');
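For completeness, this is a minimal sketch of how the saved network could be evaluated afterwards (assuming training finishes without the NaN issue and the validation datastore from above is still in the workspace):
% Usage sketch: load the saved network and score the validation images
loaded = load('G_net_test.mat');
trainedNet = loaded.G_net_test;
predLabels = classify(trainedNet,ValDataStore);        % predicted class per validation image
valAccuracy = mean(predLabels == ValDataStore.Labels)  % fraction of correct predictions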