Measuring similarity between two individual trees in a model created by fitcensemble

I have been looking for parameters that measure the similarity between individual trees in an ensemble. The most suitable ones seem to be NodeSize, NodeProbability and NodeError. However, when I look at the results, the mean of these values for the individual trees does not increase or decrease in any regular way, so I am not able to draw a conclusion. Could you suggest anything else for this? My code snippet for one model is below:

t = templateTree('MaxNumSplits',cell2mat(treeDepth)); % Weak-learner template tree object
C1 = fitcensemble(X_train,Y_train,'Method','RUSBoost','Learners',t);
[labels,scores] = predict(C1,X_test);
meanSuccessRate = calculateMeanSuccessRate(scores,labels,Y_test);
sonuclar(i,j) = meanSuccessRate;
. . .
nodeSize(d,z) = mean(C1.Trained{z}.NodeSize);
nodeProbability(d,z) = mean(C1.Trained{z}.NodeProbability);
nodeError(d,z) = mean(C1.Trained{z}.NodeError);

Answers (1)

Aditya Patil on 23 Dec 2020
Comparing decision trees is not straightforward, as they can have different structures, different variables at each node, and different conditions on those variables.
Generally, you should evaluate the properties of the ensemble as a whole rather than those of individual trees.
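
As a rough illustration of that advice, the sketch below looks at the ensemble as a whole through its cumulative test error, and, if a tree-to-tree number is still wanted, compares two trees by how often their predictions agree rather than by their node statistics. This is only one possible approach, not something stated in the question or the answer. It assumes the ionosphere sample data that ships with Statistics and Machine Learning Toolbox as a stand-in for the poster's X_train, Y_train, X_test and Y_test, an arbitrary 'MaxNumSplits' of 5, and an arbitrary pair of tree indices a and b.

```matlab
% Sketch under stated assumptions: the ionosphere sample data replaces the
% poster's own X_train/Y_train/X_test/Y_test, which are not available here.
rng(0);                                  % make the hold-out split reproducible
load ionosphere                          % X: numeric predictors, Y: cellstr labels 'b'/'g'
cv = cvpartition(Y,'HoldOut',0.3);       % stratified 70/30 split
X_train = X(training(cv),:);  Y_train = Y(training(cv));
X_test  = X(test(cv),:);      Y_test  = Y(test(cv));

t  = templateTree('MaxNumSplits',5);     % 5 is an arbitrary illustrative value
C1 = fitcensemble(X_train,Y_train,'Method','RUSBoost','Learners',t);

% Ensemble-level view: test error using the first 1, 2, ..., NumTrained trees.
cumErr = loss(C1,X_test,Y_test,'Mode','cumulative');

% Tree-level view: fraction of test points on which trees a and b predict
% the same label (a and b are hypothetical indices chosen for illustration).
a = 1; b = 2;
labelsA = predict(C1.Trained{a},X_test);
labelsB = predict(C1.Trained{b},X_test);
agreement = mean(strcmp(labelsA,labelsB));   % strcmp because labels are cellstr here
```

Plotting cumErr against the number of trees shows whether adding learners still helps, and an agreement value close to 1 suggests the two trees behave similarly on the test set even when their structures differ. If the class labels are numeric rather than a cell array of character vectors, labelsA == labelsB would replace the strcmp call.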
