hyperparameter tuning with fitclinear
Hello Matlab community,
I would like to run an SVM classification on my high-dimensional data. I decided to use fitclinear to do so. I would like to tune lambda.
What I don't understand is the cross-validation that takes place in the HyperparameterOptimizationOptions field.
The 'MaxObjectiveEvaluations' field is set to 30 by default and 'Kfold' is set to 5 by default. In my script, I choose to tune lambda, and the result is a table of 30 ranked lambda values. I do not understand where exactly the cross-validation happens.
Here is a simplified example of my code:
% 1. load data
x = data.data;
y = labels;
% 2. CV partition
CV = cvpartition(data.sex, 'KFold', 5);
for i = 1:5
    x_train = x(CV.training(i), :);
    y_train = y(CV.training(i));
    x_test = x(CV.test(i), :);
    y_test = y(CV.test(i));
    % 3. normalization
    [x_train_norm, C, S] = normalize(x_train);
    x_test_norm = normalize(x_test, 'center', C, 'scale', S);
    % 4. Hyperparameter (lambda) tuning
    VariableDescriptions = hyperparameters('fitclinear', x_train_norm, y_train);
    [mdl, ~, HyperparameterOptimizationResults] = fitclinear(x_train_norm', y_train,...
        'ObservationsIn','columns', 'OptimizeHyperparameters', VariableDescriptions(1,1),...
        'HyperparameterOptimizationOptions', struct('Optimizer', 'randomsearch', 'AcquisitionFunctionName', ...
        'expected-improvement-plus', 'Verbose', 0));
    % I am choosing 'OptimizeHyperparameters', VariableDescriptions(1,1)
    % here because I only want to tune Lambda
    % 5. Find best lambda out of the 30 MaxObjectiveEvaluations
    idx = find(HyperparameterOptimizationResults.Rank == 1);
    lambda = HyperparameterOptimizationResults.Lambda(idx);
    % 6. Train final SVM model
    finalModel = fitclinear(x_train_norm', y_train, 'ObservationsIn', 'columns', ...
        'Lambda', lambda);
    % 7. Predict labels for test data
    [predictionsY, scores] = predict(finalModel, x_test_norm);
end
In this example, when the hyperparameter tuning happens in Step 4, is x_train_norm further split into 5 training/test groups? Are the 30 lambdas then evaluated using these 5 training/test groups of x_train_norm? Is this process equivalent to a nested cross-validation?
I appreciate the help!
Best,
Nasia
Answers (1)
Drew
on 23 Aug 2023
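The short answer is yes. The code you shared performs nested cross-validation: the outer loop is your 5-fold cvpartition, and within each outer fold the hyperparameter optimization inside fitclinear runs its own cross-validation on x_train_norm, using 'Kfold',5 by default as part of the HyperparameterOptimizationOptions. Each of the 30 objective evaluations ('MaxObjectiveEvaluations') is one candidate lambda scored by its 5-fold cross-validation loss on x_train_norm, and the Rank column orders the candidates by that loss. The model returned as the first output (mdl) is already trained on all of x_train_norm with the best lambda found, so the refit in Step 6 should not be necessary. This is documented at https://www.mathworks.com/help/stats/fitclinear.html.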
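To make the inner cross-validation visible, here is a minimal sketch on synthetic data (the data, the lambda grid, and all variable names are assumptions for illustration, not taken from your script). Part A scores a fixed set of 30 candidate lambdas by 5-fold cross-validation by hand. Part B asks fitclinear to do the equivalent search itself, with the two defaults written out explicitly. The two parts will not give identical numbers, because random search draws its own lambdas and uses its own fold assignment.

```matlab
% Synthetic stand-in for one outer training fold (assumption, not real data)
rng(0);
n = 200; p = 50;
X = randn(n, p);
y = double(X(:,1) + 0.5*randn(n,1) > 0);    % binary labels driven by feature 1

% --- Part A: inner 5-fold CV done by hand --------------------------------
lambdas = logspace(-5, 0, 30);               % 30 hypothetical candidates

% One cross-validated model; each fold is trained for every lambda
cvMdl = fitclinear(X', y, 'ObservationsIn', 'columns', ...
    'Learner', 'svm', 'Lambda', lambdas, 'KFold', 5);

cvLoss = kfoldLoss(cvMdl);                   % one classification error per lambda
usedLambdas = cvMdl.Trained{1}.Lambda;       % lambdas in the order the model stores them
[~, best] = min(cvLoss);
bestLambdaManual = usedLambdas(best);

% --- Part B: the same idea via the built-in optimizer --------------------
% 'Kfold' and 'MaxObjectiveEvaluations' are set to their default values
% here only to show where the inner cross-validation is configured.
[mdl, ~, results] = fitclinear(X', y, 'ObservationsIn', 'columns', ...
    'OptimizeHyperparameters', {'Lambda'}, ...
    'HyperparameterOptimizationOptions', struct( ...
        'Optimizer', 'randomsearch', ...
        'Kfold', 5, ...
        'MaxObjectiveEvaluations', 30, ...
        'ShowPlots', false, ...
        'Verbose', 0));

% results holds one row per evaluated lambda, with its CV loss and rank;
% mdl is the model fitclinear returns after the search.
disp(results(results.Rank == 1, :));
disp(mdl.Lambda);
```

In your loop, X and y would correspond to x_train_norm and y_train, so this inner 5-fold split happens once per outer fold.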
If this answer is helpful for you, please remember to accept the answer.