NARXnet with Multi input
Show older comments
Dear all, I am trying to find ID, FD and hidden nodes for my Narx network with multiple input. In this case I am using the PH Data set which has 2 inputs and one output. I have search pretty much the entire ANSWER and NEWSROOM to see how this can be done but it appears that the questions and answers are around single input data set such as SIMPLENARX data set. I have modified the code posted earlier at: https://au.mathworks.com/matlabcentral/answers/296646-finding-optimal-id-fd-and-hidden-nodes-for-narxnet
my codes are as follow:
close all, clear all, clc,
tic
plt=0;
%[X,T] = PH_dataset;
[X,T] = ph_dataset;
x = cell2mat(X);
t = cell2mat(T);
[ I N ] = size(x); % [ 2 2001 ]
[ O N ] = size(t); % [ 1 2001 ]
% Define data for training
Ntrn = N-2*round(0.15*N) % 0.7/0.15/0.15 trn/val/tst ratios
trnind = 1:Ntrn;
Ttrn = T(trnind);
Ntrneq = prod(size(Ttrn)) % 1401
MSE00 = var(t',1) % 6.86
% Calculate Z-Score for input (x) and target (t), split input series and
% calculate z-Score separately
x1 = x(1,:);
x2 = x(2,:);
zx1 = zscore(x1,1);
zx2 = zscore(x2,1);
zt = zscore(t, 1);
zx1trn = zscore(x1(trnind), 1);
zx2trn = zscore(x2(trnind), 1);
zttrn = zscore(t(trnind), 1);
% Plot Input & Output for both original and transformed (Z-scored)
plt = plt+1,figure(plt);
subplot(221)
plot(x1)
hold on
plot(x2)
title('PH DATASET INPUT SERIES')
subplot(222)
plot(zx1)
hold on
plot(zx2)
title('STANDARDIZED INPUT SERIES')
subplot(223)
plot(t)
title('PH DATASET OUTPUT SERIES')
subplot(224)
plot(zt)
title('STANDARDIZED OUTPUT SERIES')
rng('default')
L = floor(0.95*(2*N-1)) % 189
for i = 1:1000 % Number of repetations to use for estimating summary statistics
% This is for Target (T) Autocorrelation
n = zscore(randn(1,N),1);
autocorrn = nncorr( n,n, N-1, 'biased');
sortabsautocorrn = sort(abs(autocorrn));
thresh95T(i) = sortabsautocorrn(L);
% This is for Input-Target (IT) Cross-correlation
nx = zscore(randn(1,N),1);
nt = zscore(randn(1,N),1);
autocorrnIT = nncorr( nx,nt, N-1, 'biased');
sortabsautocorrnIT = sort(abs(autocorrnIT));
thresh95IT(i) = sortabsautocorrnIT(L);
end
% For Target Autocorrelation
sigTthresh95 = median(thresh95T)% 0.0327
meanTthresh95 = mean(thresh95T) % 0.0327
minTthresh95 = min(thresh95T) % 0.0291
medTthresh95 = median(thresh95T) % 0.0327
stdTthresh95 = std(thresh95T) % 0.0011
maxTthresh95 = max(thresh95T) % 0.0358
% For Input-Target Autocorrelation
sigITthresh95 = median(thresh95IT)% 0.0326
meanITthresh95 = mean(thresh95IT) % 0.0327
mintIThresh95 = min(thresh95IT) % 0.0286
medtIThresh95 = median(thresh95IT) % 0.0326
stdtIThresh95 = std(thresh95IT) % 9.1674e-4
maxtIThresh95 = max(thresh95IT) % 0.0351
%%CORRELATIONS
%%%%%TARGET AUTOCORRELATION %%%%%%%
%
autocorrt = nncorr(zttrn,zttrn,Ntrn-1,'biased');
sigflag95 = -1+ find(abs(autocorrt(Ntrn:2*Ntrn-1))>=sigTthresh95);
sigflag95(sigflag95==0)=[]; % Remove 0 from FD
%
plt = plt+1, figure(plt);
hold on
plot(0:Ntrn-1, -sigTthresh95*ones(1,Ntrn),'b--')
plot(0:Ntrn-1, zeros(1,Ntrn),'k')
plot(0:Ntrn-1, sigTthresh95*ones(1,Ntrn),'b--')
plot(0:Ntrn-1, autocorrt(Ntrn:2*Ntrn-1))
plot(sigflag95,autocorrt(Ntrn+sigflag95),'ro')
title('SIGNIFICANT TARGET AUTOCORRELATIONS (FD)')
%
%%%%%%INPUT-TARGET CROSSCORRELATION %%%%%%
% Data Row 1
crosscorrxt_zx1 = nncorr(zx1trn,zttrn,Ntrn-1,'biased');
sigilag95_x1 = -1 + find(abs(crosscorrxt_zx1(Ntrn:2*Ntrn-1))>=sigITthresh95)
%
plt = plt+1, figure(plt);
hold on
plot(0:Ntrn-1, -sigITthresh95*ones(1,Ntrn),'b--')
plot(0:Ntrn-1, zeros(1,Ntrn),'k')
plot(0:Ntrn-1, sigITthresh95*ones(1,Ntrn),'b--')
plot(0:Ntrn-1, crosscorrxt_zx1(Ntrn:2*Ntrn-1))
plot(sigilag95_x1,crosscorrxt_zx1(Ntrn+sigilag95_x1),'ro')
title('SIGNIFICANT INPUT(X1)-TARGET CROSSCORRELATIONS (ID)')
% Data Row 2
crosscorrxt_zx2 = nncorr(zx2trn,zttrn,Ntrn-1,'biased');
sigilag95_x2 = -1 + find(abs(crosscorrxt_zx2(Ntrn:2*Ntrn-1))>=sigITthresh95)
%
plt = plt+1, figure(plt);
hold on
plot(0:Ntrn-1, -sigITthresh95*ones(1,Ntrn),'b--')
plot(0:Ntrn-1, zeros(1,Ntrn),'k')
plot(0:Ntrn-1, sigITthresh95*ones(1,Ntrn),'b--')
plot(0:Ntrn-1, crosscorrxt_zx2(Ntrn:2*Ntrn-1))
plot(sigilag95_x2,crosscorrxt_zx2(Ntrn+sigilag95_x2),'ro')
title('SIGNIFICANT INPUT(X2)-TARGET CROSSCORRELATIONS (ID)')
My first question is, am I do it right in finding the ID and FD for multi-input? IF I have 10 input then I will have to do the similar things for 10 times? Are there a better way of doing this? Would greatly appreciate any comments!
Secondly, there are so many statistically significant for the data, how do I select what to include?There are a few hundreds of them (for e.g X1 and X2 has more than 500 significant)! I don't quite understand how to do, any advice would be much appreciated.
After finding the statistically significant, I then select the minimal subset of delay and use it to calculate the size of the hidden nodes. I remember Greg said that if more than 1 input, some equations need to be modified and I am pretty sure this is the reason why my results are poor but I am not sure which equation need to be modified, even if so, I may not understand why I need to change it since the numbers of data point for both inputs are the same and that the NID is fixed for both input, so are NFD, O, I etc. The subsequent codes as below:
ID = 1:26; % I have chosen 26 and 20 based on the smallest subset of ID/FD calculated above.
FD = 1:20;
NID = length(ID); % 26
NFD = length(FD); % 21
LDB = max([ID,FD]) % Length of the delay buffer = 26
Hub = (Ntrneq-O)/(NID*I+NFD*O+O+1) % 18.67
Hmax = floor(Hub); % 18
dH =2;
Hmin = 0;
Ntrials = 10;
rng('default')
trainFcn = 'trainlm';
j=0;
for h = Hmin:dH:Hmax
j=j+1
if h==0
neto = narxnet(ID,FD,[],'open',trainFcn);
Nw = (NID*I+NFD*O+1)*O;
else
neto = narxnet(ID,FD,h);
Nw = (NID*I+NFD*O+1)*h+(h+1)*O;
end
Ndof = Ntrneq-Nw
neto.divideFcn = 'divideblock';
neto.performFcn = 'mse'; % This is default since 'msereg' is obsolete.
[Xo Xoi Aoi To] = preparets(neto,X,{},T);
to = cell2mat(To);
MSE00o = var(to,1)
MSE00oa = var(to,0)
MSEgoal = 0.005*max(Ndof,0)*MSE00oa/Ntrneq
MinGrad = MSEgoal/100
neto.trainParam.goal = MSEgoal;
neto.trainParam.min_grad = MinGrad;
for i= 1:Ntrials
% Save state of RNG for duplication
s(i) = rng;
neto = configure(neto,Xo,To);
[neto tro Yo Eo Xof Aof ] = train(neto,Xo, To, Xoi, Aoi);
% Eo = gsubtract(To,Yo);
stopcrit{i,j} = tro.stop;
R2o(i,j) = 1 - mse(Eo)/MSE00o;
end
end
result = [ (Hmin:dH:Hmax); R2o ]
% stopcrit = stopcrit
elapsedtime = toc %
Thank you very much!
5 Comments
Greg Heath
on 15 Sep 2016
I do not see any responses. I will check to see if I have started one. Either way, I'll post one.
Greg
Greg Heath
on 15 Sep 2016
Edited: Greg Heath
on 16 Sep 2016
Yes it looks like I did begin one. Found several corrections and am in the process of responding in detail.
Meanwhile, I will try to answer your questions:
I moved the remainder to an ANSWER BOX
More later,
Greg
Greg Heath
on 15 Sep 2016
ZIKES!!!
We failed to do something that I've emphasized to students and staff for the last 35 years:
LOOK AT THE PLOTS !!!
Clearly, the data is not stationary. There appear to be 3 contiguous regions of approximate length [1000 500 500 ] which have different summary stats.
Training over 1400 points certainly doesn't seen to be adequate for estimating the remaining 600.
Greg
Greg Heath
on 15 Sep 2016
> After finding the statistically significant, I then select the minimal subset of delay and use it to calculate the size of the hidden nodes. I remember Greg said that if more than 1 input, some equations need to be modified and I am pretty sure this is the reason why my results are poor but I am not sure which equation need to be modified,
The ones that quickly come to mind are VAR and ZSCORE. Maybe NNCORR ...
The reason is these functions operate on columns whereas timeseries are rows. However, if the row is one-dimensional these functions give the same result as if it were a column.
Hope this helps.
Greg
Greg Heath
on 16 Sep 2016
1. Ntrneq = prod(size(Ttrn))
is only valid for 1-D. In general, not the same as
ttrn = cell2mat(Ttrn);
Ntrneq = prod(size(ttrn)))
2. Only use training data to determine siglags, ID and FD.
3. Nrep = 1000 seems excessive. Start with 100 & compare with 200.
4. I'm still baffled by closeloop causing NMSEc >> NMSEo and best way to deal with it.
Greg
Accepted Answer
More Answers (0)
Categories
Find more on Linear Least Squares in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!