- 1st col: 1 = Blue, 2 = Red (notice there is no observation of 3 in the 1st column)
- 2nd col: 1 = Democrat, 2 = Republican, 3 = Libertarian
- 3rd col: 1 = Ford, 2 = BMW, 3 = Honda
fitcecoc SVM with categorical predictors not predicting the correct label for multiclass problem.
A simple multiclass SVM model built with fitcecoc in MATLAB does not seem to predict the correct label when the predictors are categorical.
The sample code is as follows:
% first model, train and test data are categorical
% the test data is closest to label 20
trainData = [1 1 1; 2 2 2; 2 3 3];
trainLabel = [10; 20; 30];
testData = [1 2 2];
model = fitcecoc(trainData,trainLabel,'CategoricalPredictors','all');
predictLabel = predict(model,testData);
disp(['predictLabel: ',num2str(predictLabel)]);
% second model, train and test data are same as above but represented as:
% 1 = 1 0 0, 2 = 0 1 0, 3 = 0 0 1
trainData2 = [1 0 0 1 0 0 1 0 0; 0 1 0 0 1 0 0 1 0; 0 1 0 0 0 1 0 0 1];
testData2 = [1 0 0 0 1 0 0 1 0];
model2 = fitcecoc(trainData2,trainLabel);
predictLabel2 = predict(model2,testData2);
disp(['predictLabel2: ',num2str(predictLabel2)]);
The first model should predict label 20, but it chooses label 30 instead. Based on my understanding of how SVM works, it should have chosen label 20. When I transform the first model, per this link, and reduce it to its binary representation as in model2, it predicts the correct label 20. As far as I'm aware, and per the previous link, the two models are logically identical. So either I am using incorrect syntax for the first model, or my understanding of how SVM works under the covers is incorrect (but then the two models above should still give the same result), or there is a bug in multiclass ECOC models with categorical predictors.
Any help is greatly appreciated - thanks!
Answers (1)
the cyclist
on 13 Feb 2020
I'm pretty sure you've got your dummy encoding wrong.
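You are treating 1, 2, and 3 as if they were the same categories in all three columns. But those are different explanatory variables, so the codes can mean something different in each column: for example, colors in the first column, political parties in the second, and car makes in the third. The first column also has only two observed categories, so it needs two dummy columns rather than three.
Therefore, the correct dummy encoding is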
trainData2 = dummyvar({categorical([1;2;2]),categorical([1;2;3]),categorical([1;2;3])});
trainData2 =
1 0 1 0 0 1 0 0
0 1 0 1 0 0 1 0
0 1 0 0 1 0 0 1
where the first two columns indicate Blue/Red, the next three columns indicate Dem/Rep/Lib, and the last three columns indicate Ford/BMW/Honda.
The correct test data for the dummy-encoded version is then
testData2 = [1 0 0 1 0 0 1 0]; % Because the test is Blue / Rep / BMW
Those inputs give me the same prediction for the dummy-encoded version as the categorical version.
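As a rough sketch, the same per-column encoding can also be built by hand for the training and test rows together, which avoids typing the test row manually. This assumes the category levels are simply the values observed in each training column; the names trainEnc, testEnc, and levels are made up for illustration, and the comparison relies on implicit expansion (R2016b or later).

```matlab
% Hand-rolled dummy encoding, one block of columns per predictor.
% Assumption: the levels of each predictor are the values seen in training.
trainData = [1 1 1; 2 2 2; 2 3 3];
testData  = [1 2 2];

trainEnc = [];
testEnc  = [];
for k = 1:size(trainData,2)
    levels   = unique(trainData(:,k)).';                       % row vector of levels for column k
    trainEnc = [trainEnc, double(trainData(:,k) == levels)];   % one indicator column per level
    testEnc  = [testEnc,  double(testData(k) == levels)];      % same levels, same column order
end

disp(trainEnc);
disp(testEnc);
```

Because the test row is encoded against the same levels as the training data, the column layout stays consistent even when a predictor (like the first one here) has fewer observed categories than the others.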
3 Comments
the cyclist
on 16 Feb 2020
So, let's call my dummy encoding the third model. Then,
% first model, train and test data are categorical
% the test data is closest to label 20
trainData = [1 1 1;
2 2 2;
2 3 3];
trainLabel = [10;
20;
30];
testData = [1 2 2];
model = fitcecoc(trainData,trainLabel,'CategoricalPredictors','all');
predictLabel = predict(model,testData);
disp(['predictLabel: ',num2str(predictLabel)]);
% second model, train and test data are same as above but represented as:
% 1 = 1 0 0, 2 = 0 1 0, 3 = 0 0 1
trainData2 = [1 0 0 1 0 0 1 0 0;
0 1 0 0 1 0 0 1 0;
0 1 0 0 0 1 0 0 1];
testData2 = [1 0 0 0 1 0 0 1 0];
model2 = fitcecoc(trainData2,trainLabel);
predictLabel2 = predict(model2,testData2);
disp(['predictLabel2: ',num2str(predictLabel2)]);
% third model
trainData3 = dummyvar({categorical([1;2;2]),categorical([1;2;3]),categorical([1;2;3])})
testData3 = [1 0 0 1 0 0 1 0]; % Because the test is Blue / Rep / BMW
model3 = fitcecoc(trainData3,trainLabel);
predictLabel3 = predict(model3,testData3);
disp(['predictLabel3: ',num2str(predictLabel3)]);
The weird thing is that I could have sworn that models #1 and #3 were the ones that gave the same result. I think the reason may be that I was also playing around with the name-value pair 'CategoricalPredictors','all' for the dummy-encoded models. When I do that, all of the models give the same answer.
I'm frankly not sure at the moment if it makes sense to use that for the dummy-encoded models. I'm not able to spend time right now thinking about it, but thought I would toss that idea out there.
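One way to look into this, offered only as a sketch: fit the dummy-encoded data with and without the name-value pair, then compare how many expanded predictors each model ends up with and how close the per-class scores are. It assumes Statistics and Machine Learning Toolbox is available; the variable names modelNumeric and modelCateg are made up for illustration, and no particular output is claimed here.

```matlab
% Sketch only: compare the dummy-encoded model with and without
% 'CategoricalPredictors','all'. Variable names are illustrative.
trainLabel = [10; 20; 30];
trainData3 = [1 0 1 0 0 1 0 0;
              0 1 0 1 0 0 1 0;
              0 1 0 0 1 0 0 1];
testData3  = [1 0 0 1 0 0 1 0];   % Blue / Rep / BMW

modelNumeric = fitcecoc(trainData3,trainLabel);
modelCateg   = fitcecoc(trainData3,trainLabel,'CategoricalPredictors','all');

% Number of predictors each model works with after any internal expansion.
disp(numel(modelNumeric.ExpandedPredictorNames));
disp(numel(modelCateg.ExpandedPredictorNames));

% Second output of predict: negated average binary loss per class,
% with columns in the order of ClassNames. The largest value wins.
[labelNumeric,negLossNumeric] = predict(modelNumeric,testData3);
[labelCateg,negLossCateg]     = predict(modelCateg,testData3);
disp(modelNumeric.ClassNames.');
disp(negLossNumeric);
disp(negLossCateg);
```

With only one training row per class, the per-class values may be very close to each other, in which case a small change in the encoding could be enough to flip the predicted label.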