Customized loss function for cross-validation

Af on 14 Jun 2019
Commented: Af on 27 Jun 2019
I trained a decision tree regression model with the following code:
MdlDeep = fitrtree(X,Y,'KFold',SbjNm,'MergeLeaves','off', 'MinParentSize',1,'Surrogate','on');
and used a customized loss function to assess the model's accuracy:
LossEst(OutCnt)=kfoldLoss(CllTr{OutCnt},'LossFun',@TstLossFunIn);
The customized loss function was:
function lossvalue = TstLossFunIn(C,S,W)
% C: observed responses, S: predicted responses, W: observation weights
DffTtl = (C-S).^2;               % squared errors
DffTtl = DffTtl.*W;              % weighted squared errors
SSE  = sum(DffTtl);              % weighted sum of squared errors
SSTM = mean((C-mean(C)).^2);     % variance of the observed responses
lossvalue = SSE/SSTM;            % normalized error
end
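For reference, here is a minimal, self-contained sketch of this first approach on synthetic data; the data, fold count, and variable names (N, CvMdl, lossFun, Yhat, LossPooled) are placeholders I introduce for illustration, and lossFun simply mirrors TstLossFunIn above:
rng(1);                                  % reproducibility
N = 100;                                 % number of observations
X = rand(N,4);                           % predictor matrix
Y = X*[1;-2;0.5;0] + 0.1*randn(N,1);     % response with some noise
CvMdl = fitrtree(X,Y,'KFold',10,'MergeLeaves','off','MinParentSize',1,'Surrogate','on');
lossFun = @(C,S,W) sum(W.*(C-S).^2)/mean((C-mean(C)).^2);   % same definition as TstLossFunIn
LossEst = kfoldLoss(CvMdl,'LossFun',lossFun);
% The same loss definition applied directly to the pooled out-of-fold
% predictions (may differ from kfoldLoss, depending on how losses are
% aggregated across folds):
Yhat = kfoldPredict(CvMdl);
LossPooled = sum((ones(N,1)/N).*(Y-Yhat).^2)/mean((Y-mean(Y)).^2);
Comparing LossEst with LossPooled on your own data is a quick way to see how the aggregation across folds affects the reported number.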
The first approach results in a reasonable loss for my problem. However, I wanted to control the cross-validation procedure myself, so I modified the code to split the training and test data manually and see how the model performs:
for SbjCnt = 1:SbjNm
    TrnDt = X;
    TrnDt(SbjCnt,:) = [];                % drop the held-out subject's row
    TrnOut = Y;
    TrnOut(SbjCnt) = [];
    MdlDeep = fitrtree(TrnDt,TrnOut,'MergeLeaves','off','MinParentSize',1,'Surrogate','on');
    TstDt = X(SbjCnt,:);                 % the held-out subject
    EstY(SbjCnt) = predict(MdlDeep,TstDt);   % store the prediction for this subject
end
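For comparison with the first scenario, the same loss definition can then be applied to the held-out predictions collected by this loop. A rough sketch, assuming EstY holds one prediction per subject and uniform observation weights (both assumptions on my side):
W = ones(SbjNm,1)/SbjNm;                 % assumed uniform weights
SSE  = sum(W.*(Y(:)-EstY(:)).^2);        % weighted sum of squared hold-out errors
SSTM = mean((Y-mean(Y)).^2);             % variance of the observed responses
LossManual = SSE/SSTM;                   % same normalization as TstLossFunIn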
When I calculate the loss for this manual split, it is very different from the loss in the first scenario, and the model does not seem to be accurate at all.
Any hint as to why this happens?
Best regards,
Afshin
  1 Comment
Af on 27 Jun 2019
MATLAB's k-fold cross-validation shuffles the samples into the training and test sets, so apparently it cannot be guaranteed that one specific subject is excluded from the training set. There is a loss function that takes an input argument called "usenfort", which indicates which observations in each partition should be used for testing; there one can see that the samples included in the test sets are shuffled.
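If the intent is to guarantee that each fold holds out exactly one specific subject, one option is to pass a deterministic partition instead of 'KFold'. A sketch, assuming one row per subject and the variable names from the post:
c = cvpartition(SbjNm,'LeaveOut');       % each fold holds out exactly one observation
MdlDeep = fitrtree(X,Y,'CVPartition',c,'MergeLeaves','off','MinParentSize',1,'Surrogate','on');
LossEst = kfoldLoss(MdlDeep,'LossFun',@TstLossFunIn);
Equivalently, fitrtree accepts 'Leaveout','on' to create one fold per observation.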


Answers (0)
