Two pieces of code that look almost identical to me, yet give very different results

辽辽 程 on 21 Jun 2024
Answered: Aneela on 24 Jul 2024
Code one:
% normalize the inputs and outputs
[XTrain,input_str] = mapminmax(p);
[YTrain,output_str] = mapminmax(q);
layers = [ ...
    sequenceInputLayer(numFeatures)
    bilstmLayer(numHiddenUnits,'OutputMode','sequence')
    % lstmLayer(numHiddenUnits)
    fullyConnectedLayer(300)
    % lstmLayer(numHiddenUnits)
    dropoutLayer(0.5) % makes the training results fluctuate
    fullyConnectedLayer(numResponses)
    ];
net = dlnetwork(layers);
% net = gpuArray(net);
% net = dlupdate(@gpuArray,net);
numEpochs = 150;
miniBatchSize = 100;
ds = arrayDatastore([XTrain;YTrain]);
numObservationsTrain = size(YTrain,2);
numIterationsPerEpoch = floor(numObservationsTrain / miniBatchSize);
% Adam optimizer state
averageGrad = [];
averageSqGrad = [];
numIterations = numEpochs * numIterationsPerEpoch;
iteration = 0;
epoch = 0;
mbq = minibatchqueue(ds, ...
    MiniBatchSize=miniBatchSize, ...
    MiniBatchFormat="CBT");
monitor = trainingProgressMonitor( ...
    Metrics="Loss", ...
    Info=["Epoch" "LearnRate"], ...
    XLabel="Iteration");
gradThreshold = 1.0;
while epoch < numEpochs && ~monitor.Stop
    epoch = epoch + 1;
    reset(mbq);
    % idx = 1:1:numel(YTrain);
    idx = randperm(size(YTrain,2));
    XTrain = XTrain(:,idx); % reshuffle the observations every epoch
    YTrain = YTrain(:,idx);
    k = 0;
    % while k < numIterationsPerEpoch && ~monitor.Stop
    for i = 1:numel(net.Layers)
        if isa(net.Layers(i), 'nnet.internal.cnn.layer.learnable.LearnableParameter')
            grad = net.Layers(i).dLdW;  % get the parameter's gradient
            gradNorm = norm(grad);
            if gradNorm > gradThreshold
                grad = grad * (gradThreshold / gradNorm);  % clip the gradient
            end
            net.Layers(i).dLdW = grad;  % store the clipped gradient
        end
    end
    while hasdata(mbq) && ~monitor.Stop
        k = k + 1;
        iteration = iteration + 1;
        XY = next(mbq);
        X = XY(1:13,:);
        Y = XY(14:16,:);
        X = dlarray(X, 'CBT');
        Y = dlarray(Y, 'CBT');
        [loss, gradients] = dlfeval(@modelLoss2, net, X, Y);
        [net, averageGrad, averageSqGrad] = adamupdate(net, gradients, averageGrad, averageSqGrad, iteration); % adaptive adjustment of the learning rate
        recordMetrics(monitor, iteration, Loss=loss);
        updateInfo(monitor, Epoch=epoch + " of " + numEpochs);
        monitor.Progress = 100 * iteration / numIterations;
    end
end
Code two:
[XTrain,input_str] = mapminmax(p);
[YTrain,output_str] = mapminmax(q);
layers = [ ...
    sequenceInputLayer(numFeatures)
    bilstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(300)
    dropoutLayer(0.5)
    fullyConnectedLayer(numResponses)
    regressionLayer];
maxEpochs = 150;
miniBatchSize = 200;
options = trainingOptions('adam', ...
    'MaxEpochs',maxEpochs, ...
    'MiniBatchSize',miniBatchSize, ...
    'InitialLearnRate',0.005, ...
    'GradientThreshold',1, ...
    'Shuffle','never', ...
    'Plots','training-progress', ...
    'Verbose',false);
net = trainNetwork(XTrain,YTrain,layers,options);
My XTrain and YTrain are identical in both cases, so I do not understand why the regression error on the test set is large with code one but small with code two. XTrain is 13*14000 and YTrain is 3*14000.

Answers (1)

Aneela on 24 Jul 2024
Hello,
The first code provided is a custom training loop while the second code uses MATLAB’s built-in “trainNetwork” function.
The potential reasons for different performance can be:
  • In the custom training loop, the data is shuffled at the beginning of each epoch, whereas the "trainNetwork" call sets 'Shuffle' to 'never'.
  • The mini-batch size is 100 in the custom training loop but 200 in the "trainNetwork" call.
  • The initial learning rate is set to 0.005 for "trainNetwork", but none is specified in the custom training loop, so "adamupdate" falls back to its default of 0.001 (see the sketch after this list).
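For the learning rate and gradient clipping specifically, the custom loop can be brought in line with the "trainingOptions" settings: "adamupdate" accepts the learning rate as its sixth argument, and clipping can be applied to the gradients returned by "dlfeval" before the update. The following is only a minimal sketch, reusing the variable names from the question; "thresholdL2Norm" is a helper name chosen here for illustration.

learnRate = 0.005;   % mirrors 'InitialLearnRate',0.005
gradThreshold = 1;   % mirrors 'GradientThreshold',1

% inside the mini-batch loop, after computing the gradients:
[loss, gradients] = dlfeval(@modelLoss2, net, X, Y);

% clip each learnable parameter's gradient by its L2 norm
gradients = dlupdate(@(g) thresholdL2Norm(g, gradThreshold), gradients);

% pass the learning rate explicitly; adamupdate otherwise uses its default of 0.001
[net, averageGrad, averageSqGrad] = adamupdate(net, gradients, ...
    averageGrad, averageSqGrad, iteration, learnRate);

% local function (place at the end of the script or in its own file)
function g = thresholdL2Norm(g, threshold)
    % scale the gradient down if its L2 norm exceeds the threshold
    gradNorm = sqrt(sum(g(:).^2));
    if gradNorm > threshold
        g = g * (threshold / gradNorm);
    end
end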
You can consider the following to reduce the difference in performance:
  • Ensure that data normalization and preprocessing steps are consistent between the two approaches (see the sketch after this list).
  • Use a validation set to monitor performance during training.
  • Keep track of intermediate results such as the loss and gradients to understand the training process in both approaches.
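As a starting point for the first suggestion, the test data can be normalized with the same "mapminmax" settings and both networks evaluated in the same space. A sketch under these assumptions: "ptest" and "qtest" are hypothetical test matrices laid out like "p" (13-by-N) and "q" (3-by-N), and "netCustom"/"netBuiltin" stand for the networks produced by the first and second scripts.

XTest = mapminmax('apply', ptest, input_str);    % reuse the training normalization
YTest = mapminmax('apply', qtest, output_str);

% network from the second script (trainNetwork)
YPredBuiltin = predict(netBuiltin, XTest);

% dlnetwork from the first script expects a formatted dlarray
YPredCustom = extractdata(predict(netCustom, dlarray(XTest, "CBT")));

% compare errors in the normalized space
% (use mapminmax('reverse', ..., output_str) to report them in the original units)
rmseBuiltin = sqrt(mean((YPredBuiltin - YTest).^2, 'all'));
rmseCustom  = sqrt(mean((YPredCustom  - YTest).^2, 'all'));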

Release

R2023b
