How to implement cross validation in neural network for time series prediction
I am using k-fold cross-validation to train a neural network to predict a time series. I have an input time series and I am using the Nonlinear Autoregressive tool for time series. I am using 10-fold cross-validation and dividing the data set into 70% training, 15% validation and 15% testing, but I really don't know how to write the code.
To be honest, this is the first time that I am using neural networks, so please keep your explanation simple!
This is what I wrote:
k=10;
Indices=crossvalind('Kfold', length(X), 10);
X = tonndata(densig,true,false);
T = tonndata(densig,true,false);
trainFcn = 'trainlm';
inputDelays = 1:2;
feedbackDelays = 1:2;
hiddenLayerSize = [50 20 20];
X1=cell2mat(X);
T1=cell2mat(T);
for i=1:k
  net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize,'open',trainFcn);  
  X1(i)=find(X1(Indices(i)));
  T1(i)=find(T1(Indices(i)));
  [x,xi,ai,t] = preparets(net,X1,{},T1);
  net.divideParam.trainRatio = 70/100;
  net.divideParam.valRatio = 15/100;
  net.divideParam.testRatio = 15/100;
  net.trainParam.epochs = 5000;
  [net,tr] = train(net,x,t,xi,ai);
end
  y = net(x,xi,ai);
  e = gsubtract(t,y);
  performance = perform(net,t,y);
Please help
Thanks Baqar
1 Comment
  Greg Heath on 23 Aug 2017
The quickest way to get NN help is to run your program on one or more of the MATLAB examples from
 doc nndatasets
and/or
 help nndatasets
after initializing using
 rng('default') % same as rng(0).
Hope this helps.
Greg
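A minimal sketch of what such a reproducible run could look like, assuming the Deep Learning Toolbox (formerly Neural Network Toolbox) is installed. The dataset simplenarx_dataset, the delays 1:2, and the 10 hidden nodes are illustrative choices, not something prescribed in the comment above.

```matlab
rng('default')                          % reset the random generator so results can be compared
[X,T] = simplenarx_dataset;             % example input/target series shipped with the toolbox
net = narxnet(1:2,1:2,10);              % open-loop NARX net; delays and size are arbitrary here
net.trainParam.showWindow = false;      % suppress the training GUI
[x,xi,ai,t] = preparets(net,X,{},T);    % shift the data to fill the initial delay states
[net,tr] = train(net,x,t,xi,ai);        % default random train/val/test division
y = net(x,xi,ai);                       % open-loop (one-step-ahead) outputs
performance = perform(net,t,y)          % mean squared error over all time steps
```

Posting the performance value together with the rng call lets other readers rerun the same lines and compare numbers.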
Answers (4)
  Greg Heath on 23 Aug 2017
  Edited: Greg Heath on 24 Aug 2017
UH-OH! I do not have crossvalind.
CROSSVALIND IS NOT IN THE NN TOOLBOX!!!
However, I have posted crossvalidation results in both the NEWSGROUP and ANSWERS.
Your problem is doubly troubling because there are very few references that use cross-validation with EITHER NNs OR TIMESERIES!!!
My search yields the following number of hits:
                    NEWSGROUP       ANSWERS
 NEURAL                4319             5130
 TIMESERIES             604             1696
 NEURAL TIMESERIES       87              344
 CROSSVAL                51              119
 CROSSVAL NEURAL          9               19
 CROSSVAL TIMESERIES      0                3
The main reasons for so few examples are that
 1. It IS VERY MUCH EASIER AND NO LESS VALID to design NNs with 
    multiple random data divisions.
 2. TIMESERIES REQUIRE CONSTANT TIMESTEPS. However, the number 
    of relevant arrangements is severely limited.
 3. The best way to get many design variations is merely to use 
    many trials with random initial weights.
Hope this helps.
Thank you for formally accepting my answer
Greg
1 Comment
  Greg Heath on 25 Aug 2017
I don't think you understand:
It is YOUR job to test YOUR code.
Use a MATLAB example dataset and initialize the rng to the zero state so that we can compare our results with yours.
Greg
  Greg Heath on 31 Dec 2017
If this is the 1st time you are using neural networks:
1. BOTH TIMESERIES AND CROSSVALIDATION ARE ADVANCED TOPICS. IF YOU HAVE A CHOICE, START WITH ELEMENTARY TOPICS
a. Regression/Function-Fitting
     help fitnet
     doc fitnet
b. Classification/Target-Identification
     help patternnet
     doc patternnet
c. Non-feedback Timeseries
     help timedelaynet
     doc timedelaynet
d. Feedback Timeseries
     help narxnet
     doc narxnet
2. I don't recommend crossvalidation for neural networks.
a. Multiple random weight initializations for each of a specified number of hidden nodes in a single-hidden-layer net tend to be sufficient and are orders of magnitude faster.
b. The goal is to minimize the number of hidden nodes subject to an upper limit on mean squared error (or crossentropy for classification).
Hope this helps.
Greg
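The search described in point 2 could be sketched roughly as follows, assuming the Deep Learning Toolbox is available. The example dataset, the candidate sizes in Hvec, the number of trials Ntrials, and the goal of 0.01 on the normalized MSE are all illustrative assumptions, not values given in the answer.

```matlab
rng('default')                           % reproducible sequence of random initializations
[X,T] = simplenarx_dataset;              % toolbox example series (stand-in for your own data)
Hvec = 1:2:9;                            % candidate hidden-node counts (assumption)
Ntrials = 5;                             % random initializations per candidate (assumption)
goal = 0.01;                             % upper limit on normalized MSE (assumption)
NMSE = zeros(Ntrials, numel(Hvec));
for j = 1:numel(Hvec)
    for i = 1:Ntrials
        net = narxnet(1:2,1:2,Hvec(j));  % single hidden layer with Hvec(j) nodes
        net.trainParam.showWindow = false;
        [x,xi,ai,t] = preparets(net,X,{},T);
        net = train(net,x,t,xi,ai);      % each call draws new initial weights and a new data division
        y = net(x,xi,ai);
        MSE00 = mean(var(cell2mat(t)',1));      % MSE of a constant (mean) predictor
        NMSE(i,j) = perform(net,t,y)/MSE00;
    end
end
minNMSE = min(NMSE,[],1);                % best trial for each candidate size
jOK = find(minNMSE <= goal, 1);          % smallest size that meets the goal
if isempty(jOK)
    disp('No candidate met the goal; try more hidden nodes or more trials.')
else
    fprintf('Smallest H meeting the goal: %d\n', Hvec(jOK))
end
```

This sketch scores every design on all of the data for brevity; the training record returned as the second output of train holds separate training, validation and test performance if those are preferred for ranking.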
  orlem lima dos santos on 19 Jan 2018
Hi again, I do not recommend using standard cross-validation (the crossval function) for time series prediction. For this type of problem there is a technique known as "time series cross-validation" (https://robjhyndman.com/hyndsight/tscv/).

Unfortunately, there is no such function implemented in MATLAB, but there is one in Python's scikit-learn (http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html) that can help.
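The index bookkeeping for this kind of split is short enough to write by hand. Below is a minimal sketch in base MATLAB of an expanding-window split, loosely modeled on the idea in the links above: every test block lies strictly after its training block. The names N, nSplits, testSize and splits are hypothetical and chosen only for illustration.

```matlab
N = 100;                                  % number of time steps (hypothetical)
nSplits = 5;                              % number of train/test splits (hypothetical)
testSize = floor(N/(nSplits+1));          % length of each test block
splits = struct('trainInd', cell(1,nSplits), 'testInd', []);
for s = 1:nSplits
    trainEnd = N - (nSplits - s + 1)*testSize;         % last training step for this split
    splits(s).trainInd = 1:trainEnd;                   % training window grows with s
    splits(s).testInd  = trainEnd+1 : trainEnd+testSize;  % test block follows immediately
    fprintf('split %d: train 1..%d, test %d..%d\n', s, trainEnd, ...
            splits(s).testInd(1), splits(s).testInd(end));
end
```

Each pair of index vectors can then be used to select the time steps for training and testing, so that no model is ever evaluated on data that precedes its training data.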
  Greg Heath on 19 Jan 2018
  Edited: Greg Heath on 19 Jan 2018
If you have to maintain the original spacing, one way to use f-fold XVAL in time series is illustrated below for f = 10.
 1. Divide the data into 10 blocks [ B1 B2 ... B10 ] 
 2. for i= 1: 10, test on Bi, train on the rest.
 3. For example, if i =5, 
    a. Train on B1 to B4 using B1 for initial conditions
    b. Continue training on B6 to B10 using B6 (NOT B4 !) for initial conditions
    c. Compute separate SSEs for B5 and ~B5
 4. Combine the i=1:10 SSEs for 2 separate results MSEtrn and MSEtst 
 5. To obtain a production series, you can test each on all of the data 
    and combine them any way you choose (e.g., best, weighted average, ...)
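The block bookkeeping in steps 1 to 3 can be sketched in base MATLAB as follows. This is only an illustration of the indexing, not a full training loop; the series length N and the names edges, folds, before and after are hypothetical. The two training segments are stored separately because, as step 3 notes, each one needs its own initial conditions.

```matlab
N = 100;                                  % number of time steps (hypothetical)
f = 10;                                   % number of blocks
edges = round(linspace(0, N, f+1));       % block i covers edges(i)+1 : edges(i+1)
folds = struct('testInd', cell(1,f), 'before', [], 'after', []);
for i = 1:f
    folds(i).testInd = edges(i)+1 : edges(i+1);   % block Bi, held out for testing
    folds(i).before  = 1 : edges(i);              % B1..B(i-1), first training segment
    folds(i).after   = edges(i+1)+1 : N;          % B(i+1)..Bf, second training segment
end
% For i = 5 the test block is 41:50, and training uses 1:40 and 51:100 as two
% separate contiguous segments, each started from its own leading time steps.
```

For i = 1 the first segment is empty and for i = f the second one is, so those folds train on a single contiguous segment.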
Hope this helps.
Thank you for formally accepting my answer
Greg
P.S. I favor the normalized MSE,
              NMSE = MSE/mean(var(target',1))
which is normally in the range 0 <= NMSE <= 1 and related to the statistical Rsquare (See Wikipedia)
               Rsquare = 1 - NMSE
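As a small worked example of the two formulas above in base MATLAB, with made-up target and output values chosen only for illustration:

```matlab
target = [1 2 3 4 5];                     % made-up targets (one row per output variable)
output = [1.1 1.9 3.2 3.8 5.0];           % made-up network outputs
err = target - output;
MSE = mean(err(:).^2);                    % mean squared error
MSE00 = mean(var(target',1));             % MSE of the constant predictor mean(target)
NMSE = MSE/MSE00;                         % normalized MSE
Rsquare = 1 - NMSE;                       % fraction of target variance explained
fprintf('NMSE = %.4f, Rsquare = %.4f\n', NMSE, Rsquare)
```

Here the MSE is 0.02 and the target variance is 2, so NMSE is 0.01 and Rsquare is 0.99. A value of NMSE near 1 means the net does no better than always predicting the target mean.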