How can I classify loads with bagged decision trees and compare the predictions with actual data?
Hello all, I need some help, please. I cannot get the classification results to come out correctly, and I want to compare them with the actual data. I am attaching the code; the data set is too large to attach, so I am sharing it via Google Drive: Data
clear all
Traindata = readtable('training_a.csv');
X = [Traindata.I Traindata.THD Traindata.H3 Traindata.H5 Traindata.H7 Traindata.H9];
Y = ordinal(Traindata.Loads);
%% Build bagged decision tree classifier
leaf = 1;
nTrees = 50;
rng(9876,'twister');
savedRng = rng; % Save the current RNG settings
color = 'bgr';
for ii = 1:length(leaf)
    % Reinitialize the random number generator, so that the
    % random samples are the same for each leaf size
    rng(savedRng)
    % Create a bagged decision tree for each leaf size and plot out-of-bag
    % error 'oobError'
    b = TreeBagger(nTrees,X,Y,'OOBPrediction','on',...
        'CategoricalPredictors',6,...
        'MinLeafSize',leaf(ii));
    plot(oobError(b),color(ii))
    hold on
end
xlabel('Number of grown trees')
ylabel('Out-of-bag classification error')
legend({'1'},'Location','NorthEast')
title('Classification Error for Different Leaf Sizes')
hold off
%% -Features importance results-
nTrees = 50;
leaf = 1;
rng(savedRng);
b = TreeBagger(nTrees,X,Y,'OOBPredictorImportance','on', ...
'CategoricalPredictors',6, ...
'MinLeafSize',leaf);
bar(b.OOBPermutedPredictorDeltaError)
xlabel('Feature number')
ylabel('Out-of-bag feature importance')
title('Feature importance results')
b = compact(b);
%% Load Classification
Testdata = readtable('testing_a.csv');
[predClass,classifScore] = predict(b,[Testdata.I Testdata.THD Testdata.H3 Testdata.H5 Testdata.H7 Testdata.H9]);
for i = 1:6
    fprintf(' Current = %5.2f\n',Testdata.I(i));
    fprintf(' Total harmonic distortion = %5.2f\n',Testdata.THD(i));
    fprintf(' Third harmonic = %2d\n',Testdata.H3(i));
    fprintf(' Fifth harmonic = %5.2f\n',Testdata.H5(i));
    fprintf(' Seventh harmonic = %5.2f\n',Testdata.H7(i));
    fprintf(' Ninth harmonic = %2d\n',Testdata.H9(i));
    fprintf(' Predicted Rating : %s\n',predClass{i});
    fprintf(' Classification score : \n');
    for j = 1:length(b.ClassNames)
        if (classifScore(i,j)>0)
            fprintf(' %s : %5.4f \n',b.ClassNames{j},classifScore(i,j));
        end
    end
end
classnames = b.ClassNames;
Preddata = [table(Testdata.Loads,predClass),array2table(classifScore)];
Preddata.Properties.VariableNames = [{'I'},{'Loads'},classnames'];
Actualdata = readtable('testing_aa.csv');
C = confusionchart(Actualdata.Loads,Preddata.Loads);
sortClasses(C,{'1' '2' '3' '4'})
1 Comment
Image Analyst
on 25 Aug 2022
Can you at least post a screenshot of your confusion matrix? What accuracy do you get, and what is the minimum accuracy acceptable to you?
Accepted Answer
the cyclist
on 26 Aug 2022
Edited: the cyclist
on 26 Aug 2022
I have not gone through your whole code, but I loaded your MAT file and then ran
C = confusionchart(Actualdata.Loads,Preddata.Loads);
One problem is that Preddata.Loads is a cell array, not a double, so I converted it:
C = confusionchart(Actualdata.Loads,[Preddata.Loads{:}]-'0'); % This is a TERRIBLE obfuscated way to convert
which gives the confusion matrix
The way I did that conversion is terrible, but it works. I would not do that in real code, but I was lazy. You should instead figure out why some of your labels are doubles and some are characters, and fix that upstream.
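A more readable conversion might look like the sketch below. It assumes the predicted labels are a cell array of char vectors holding numeric class names such as '1' to '4', as returned by predict on a TreeBagger model, and that the actual labels are doubles. The variable names and sample values here are made up for illustration and are not taken from the shared data.

```matlab
% Hypothetical sample labels standing in for Preddata.Loads and Actualdata.Loads
predClass   = {'1'; '2'; '3'; '4'; '2'};   % cell array of char, as predict returns
actualLoads = [1; 2; 3; 4; 1];             % numeric labels read from the CSV file

% Convert the char labels to doubles so both label vectors share one type
predNum = str2double(predClass);

% Overall accuracy: fraction of predictions that match the actual labels
accuracy = mean(predNum == actualLoads);

% Confusion matrix as a plain array (rows = actual class, columns = predicted class)
C = confusionmat(actualLoads, predNum);

fprintf('Accuracy = %.2f\n', accuracy);
disp(C)
```

With both vectors numeric, confusionchart(actualLoads, predNum) can be called directly to draw the chart, and the accuracy value answers the question raised in the comment above.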