How do I improve my neural network performance?

Prabhakar on 18 Jan 2011
Commented: Saeed Magsi on 24 Sep 2022
My neural network either does not reach its error goal or takes too long and uses too much memory to train.

Answers (3)

Akshay on 18 Jan 2011
Edited: John Kelly on 27 May 2014
Poor neural network performance generally falls into two situations:
1. The error does not reach the goal. If this occurs, you can try the following:
a. Raise the error goal of the training function. While it may seem that the lowest possible error goal is always best, an unreachably low goal can prolong training needlessly and hinder network generalization. For example, if you are using the TRAINLM function, the default goal is 0. You may want to set the goal to 1e-6 so that the network can actually reach it:
net = newff([0 10],[5 1],{'tansig' 'purelin'});  % one input in [0 10], 5 hidden and 1 output neuron
net.trainParam.goal = 1e-6;                      % stop training when MSE falls below 1e-6
b. You may want to use a different performance function. The default performance function is MSE (the mean squared error). MSEREG adds a regularization term and is often recommended to improve the generalization of the network. To set the performance function to MSEREG:
net.performFcn = 'msereg';
net.performParam.ratio = 0.5;  % weight of the error term relative to the regularization term
However, the more heavily a network is regularized toward generalization, the harder it is to reach a very low error. You may therefore want to sacrifice some generalization and improve error performance by raising the performance ratio, which lies in the range [0 1]. For more information on improving network generalization, see the Neural Network Toolbox documentation.
c. You may want to increase the number of epochs for training in some situations. Training will take longer but may yield more accurate results. To set the number of epochs, extend the example above:
net.trainParam.epochs = 1000;  % allow up to 1000 training epochs
2. The next issue that arises in neural network training is the speed and memory usage of training a network to reach the goal. Some suggestions for improving these:
a. You may want to preprocess your data to make network training more efficient. Preprocessing scales the inputs so that they fall into the range [-1 1]. If preprocessing is done before training, then postprocessing must be done afterward to map the results back to the original units. For more information on preprocessing and postprocessing, please refer to Chapter 5 of the Neural Network Toolbox User's Guide.
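For example, a minimal sketch using MAPMINMAX, continuing the NEWFF example above (the names p and t for the raw input and target matrices are assumptions):
[pn, ps] = mapminmax(p);             % scale each input row to [-1 1]
[tn, ts] = mapminmax(t);             % scale each target row to [-1 1]
net = train(net, pn, tn);            % train on the normalized data
an = sim(net, pn);                   % network output in normalized units
a = mapminmax('reverse', an, ts);    % postprocess back to the original units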
b. You may also want to try different training functions. TRAINLM is the most widely used because it is the fastest, but it requires significantly more memory than other training functions. If you want to keep using it while lowering memory use, raise the "mem_reduc" parameter:
net.trainParam.mem_reduc = 2;  % compute the Jacobian in 2 pieces
This divides the Jacobian computation into two pieces, roughly halving its memory footprint at the cost of longer training.
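If memory is still the bottleneck, an alternative worth trying (my suggestion, not part of the original answer) is a memory-light training function such as scaled conjugate gradient; p and t are again assumed input/target matrices:
net.trainFcn = 'trainscg';   % far less memory than TRAINLM, though usually more epochs
net = train(net, p, t);      % retrain with the new training function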
c. Reduce the number of neurons. As a rule of thumb, you need far fewer hidden neurons than input parameters; it is not necessary to match them one for one, and the number of output neurons should likewise be much smaller than the number of input parameters. For example, if your input P has 200 input parameters, it is not necessarily beneficial to have 200 hidden neurons; you may want to try 20 instead. It is very difficult to give an exact ratio of input parameters to hidden neurons because each application calls for a specific network architecture. This answer is intended as a general guideline for improving neural network performance. For more information on any of these topics, please refer to the Neural Network Toolbox User's Guide.
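As a hypothetical illustration (pr, a 200x2 matrix of input ranges, and the layer sizes are assumptions, not values from the original answer):
net = newff(pr, [20 1], {'tansig' 'purelin'});  % 20 hidden neurons for 200 input parameters, 1 output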

Matt McDonnell on 20 Jan 2011
Coffee.

Greg Heath on 18 Jan 2013
You have not given enough information.
Regression or classification?
What sizes are the input and target matrices?
Are they standardized, normalized and/or transformed?
What network creation function?
How many hidden nodes?
What creation and training defaults have you overwritten?
Sample code would help enormously.
Greg
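For reference, a quick way to gather the size and scaling information Greg asks about (the names x and t for the input and target matrices are assumptions):
[I, N] = size(x)                        % input dimension and number of samples
[O, N] = size(t)                        % output dimension and number of samples
xrange = [min(x,[],2) max(x,[],2)]      % per-row input ranges, to check standardization/normalization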
  2 Comments
Behzad Fotovvati on 5 Dec 2019
Edited: Behzad Fotovvati on 5 Dec 2019
Hi Greg,
I have the same kind of regression problem and would appreciate your help with it. I used the NN fitting app, which builds a two-layer feed-forward network. I have four inputs and three outputs. They are neither standardized, normalized, nor transformed. I have only 45 observations (training: 35, validation: 5, testing: 5), so I chose five neurons for the hidden layer. I used the Levenberg-Marquardt training algorithm.
First of all, because my three target value sets have different ranges (one between 5 and 20, one around 100, and one between 350 and 380), I believe the high R-squared that I get (99.99%) is not real, and the "predicted vs. actual" plot shows three distinct regions (figure attached). So, should I standardize/normalize/transform my response values before feeding them to the network?
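A minimal sketch of per-row target normalization with MAPMINMAX (treating t as the 3xN target matrix is an assumption):
[tn, ts] = mapminmax(t);             % each of the 3 target rows is scaled to [-1 1] independently
% ... train the network on the normalized targets tn, then simulate to get outputs yn ...
y = mapminmax('reverse', yn, ts);    % map normalized outputs yn back to the original ranges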
Then, how can I improve the performance of this network? The following function was generated by the app; I have also attached the input and target MAT-files.
Thank you in advance,
Behzad
function [Y,Xf,Af] = NN_function(X,~,~)
%MYNEURALNETWORKFUNCTION neural network simulation function.
%
% Auto-generated by MATLAB, 22-Nov-2019 17:00:32.
%
% [Y] = myNeuralNetworkFunction(X,~,~) takes these arguments:
%
% X = 1xTS cell, 1 inputs over TS timesteps
% Each X{1,ts} = Qx4 matrix, input #1 at timestep ts.
%
% and returns:
% Y = 1xTS cell of 1 outputs over TS timesteps.
% Each Y{1,ts} = Qx3 matrix, output #1 at timestep ts.
%
% where Q is number of samples (or series) and TS is the number of timesteps.
%#ok<*RPMT0>
if nargin == 0
    load('input.mat', 'input');
    X = input;
end
% ===== NEURAL NETWORK CONSTANTS =====
% Input 1
x1_step1.xoffset = [170;900;100;20];
x1_step1.gain = [0.0125;0.00333333333333333;0.025;0.05];
x1_step1.ymin = -1;
% Layer 1
b1 = [3.1578500830052504966;4.4952071974983596192;-0.88364164245700427269;-1.1702983217913538461;-3.5182431864255532261];
IW1_1 = [-1.0819794624949294892 2.6692769892964860468 -7.6209503796131876641 3.312060992159889139;-4.7788086572971595345 4.1485007161535563114 0.83416670793134528594 3.8627684555966799174;-0.14205550613852135911 0.33459277147998212065 0.28679737241542646586 0.34985936950154811198;-2.0561896360862452759 1.0965988493366791712 1.5442399301331737327 -0.11614043990841006748;-0.89033979322261624922 2.5551531766568640336 1.9479683094053272807 -6.1523826340108769273];
% Layer 2
b2 = [0.34606223263192292805;-0.0033838724860873262146;0.51080999686766304091];
LW2_1 = [-0.027811241786895000982 0.028835288395292278663 -0.065274177726248938658 -0.44062348614705032501 0.0024791607847241547979;-0.26034435162446623035 0.23517629033311457376 0.13229333876255239266 -0.77292615570020328786 0.090965199728923057387;0.033716941955813012344 -0.032639694615536805899 1.4554221825175619465 0.12942922726259944999 0.0085852049359890665603];
% Output 1
y1_step1.ymin = -1;
y1_step1.gain = [0.79554494828958;0.0577867668303958;0.0582241630276565];
y1_step1.xoffset = [97.379;337.83;4.49];
% ===== SIMULATION ========
% Format Input Arguments
isCellX = iscell(X);
if ~isCellX
    X = {X};
end
% Dimensions
TS = size(X,2); % timesteps
if ~isempty(X)
    Q = size(X{1},1); % samples/series
else
    Q = 0;
end
% Allocate Outputs
Y = cell(1,TS);
% Time loop
for ts = 1:TS
    % Input 1
    X{1,ts} = X{1,ts}';
    Xp1 = mapminmax_apply(X{1,ts},x1_step1);
    % Layer 1
    a1 = tansig_apply(repmat(b1,1,Q) + IW1_1*Xp1);
    % Layer 2
    a2 = repmat(b2,1,Q) + LW2_1*a1;
    % Output 1
    Y{1,ts} = mapminmax_reverse(a2,y1_step1);
    Y{1,ts} = Y{1,ts}';
end
% Final Delay States
Xf = cell(1,0);
Af = cell(2,0);
% Format Output Arguments
if ~isCellX
    Y = cell2mat(Y);
end
end
% ===== MODULE FUNCTIONS ========
% Map Minimum and Maximum Input Processing Function
function y = mapminmax_apply(x,settings)
    y = bsxfun(@minus,x,settings.xoffset);
    y = bsxfun(@times,y,settings.gain);
    y = bsxfun(@plus,y,settings.ymin);
end
% Sigmoid Symmetric Transfer Function
function a = tansig_apply(n,~)
    a = 2 ./ (1 + exp(-2*n)) - 1;
end
% Map Minimum and Maximum Output Reverse-Processing Function
function x = mapminmax_reverse(y,settings)
    x = bsxfun(@minus,y,settings.ymin);
    x = bsxfun(@rdivide,x,settings.gain);
    x = bsxfun(@plus,x,settings.xoffset);
end
Saeed Magsi on 24 Sep 2022
Dear @Behzad Fotovvati, did you get an answer to your question? I have the same problem: my actual vs. predicted plot has two distinct regions. What should I do? Please help.
