Using Bayesopt for TreeBagger Classification

Anthony on 6 Jul 2017
Commented: Alan Weiss on 7 Jul 2017
I am running a random forest classification using TreeBagger, and I am confused about how to run bayesopt on it, as I'm new to programming in MATLAB. I want to minimize the loss function. I've been trying the code below; I know it's probably far off, but I'm lost on this one. Any help would be great.
minLS=optimizableVariable('minLS',[1,20],'Type','integer');
numPTS=optimizableVariable('numPTS',[1,113],'Type','integer');
hyperparametersRF=[minLS;numPTS];
rng(1);
opts=statset('UseParallel',true);
numTrees=1000;
A=TreeBagger(numTrees,Xtrain,Ytrain,'method','classification','Options',opts,...
'MinLeafSize',hyperparametersRF.minLS,'NumPredictorstoSample',hyperparametersRF.numPTS);
classA=@(Xtrain,Ytrain,Xtest)(predict(A,Xtest));
cvMCR=crossval('mcr',X,y_cat,'predfun',classA,'partition',cvp);
fun=@(hparams)cvMCR(hyperparametersRF,X);
results=bayesopt(fun,hyperparametersRF);
besthyperparameters=bestPoint(results);

Answers (3)

Don Mathis on 6 Jul 2017
How about something like this? Instead of crossval, it uses TreeBagger's ability to use the "out of bag" observations as validation data. It also uses a local function to build the objective function that you pass to bayesopt. You should be able to run this by pasting it into a MATLAB editor and hitting the Run button.
load fisheriris
X = meas;
Y = species;
minLS = optimizableVariable('minLS',[1,20],'Type','integer');
% fisheriris has only 4 predictors, so bound numPTS by size(X,2)
% (113 was the upper bound used in the question).
numPTS = optimizableVariable('numPTS',[1,size(X,2)],'Type','integer');
hyperparametersRF = [minLS;numPTS];
rng(1);
fun = makeFun(X, Y);
results = bayesopt(fun,hyperparametersRF);
besthyperparameters = bestPoint(results);

function fun = makeFun(X, Y)
    % Make the objective function to pass to bayesopt
    fun = @f;
    % A nested function that uses X and Y
    function oobMCR = f(hparams)
        opts = statset('UseParallel',true);
        numTrees = 1000;
        A = TreeBagger(numTrees,X,Y,'Method','classification','OOBPrediction','on','Options',opts,...
            'MinLeafSize',hparams.minLS,'NumPredictorsToSample',hparams.numPTS);
        % Out-of-bag misclassification rate of the whole ensemble (a scalar)
        oobMCR = oobError(A, 'Mode','ensemble');
    end
end
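For anyone wondering why the objective uses 'Mode','ensemble': as far as I understand oobError, its default mode returns the cumulative error as trees are added (one value per tree), whereas bayesopt needs a single scalar back from the objective. A minimal sketch on fisheriris, assuming a small forest of 50 trees just to keep it quick (the variable names errCurve and errFinal are only for illustration):

```matlab
load fisheriris
rng(1);                                   % reproducible bootstrap samples

% Small forest with out-of-bag bookkeeping switched on
A = TreeBagger(50,meas,species,'Method','classification','OOBPrediction','on');

errCurve = oobError(A);                   % default mode: one value per tree
errFinal = oobError(A,'Mode','ensemble'); % single value for the full forest

fprintf('values in errCurve: %d, ensemble OOB error: %.4f\n', ...
    numel(errCurve), errFinal);
```

The scalar form is the one that can be returned directly from a bayesopt objective.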
  2 Comments
Anthony on 6 Jul 2017
Similar to one of the answers below, this is the error message I get when running the code:
function fun = makeFun(X,y_cat)
↑ Error: Function definitions are not permitted in this context.
Anthony on 6 Jul 2017
Note: my feature labels are y_cat (this is a team project we're working on, so everybody is using y_cat, otherwise I would just run the code the way that you put it.)
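If function definitions are not accepted where the code is being run (for example when pasting into the Command Window, or in a script on a release older than R2016b), one possible workaround is to write the whole objective as an anonymous function, which needs no function block at all. This is only a sketch under assumptions: it uses fisheriris in place of the real X and y_cat, 100 trees to keep it fast, and the names oobObjective and hpTest are made up for illustration.

```matlab
load fisheriris
X = meas;
y_cat = species;      % stand-in for the team's label variable
rng(1);

% Objective as a single anonymous function: takes the one-row table
% of hyperparameters and returns the ensemble out-of-bag error.
oobObjective = @(hp) oobError( ...
    TreeBagger(100,X,y_cat,'Method','classification','OOBPrediction','on', ...
        'MinLeafSize',hp.minLS,'NumPredictorsToSample',hp.numPTS), ...
    'Mode','ensemble');

% Quick manual check with a hand-built hyperparameter table
hpTest = table(5,2,'VariableNames',{'minLS','numPTS'});
lossTest = oobObjective(hpTest);
disp(lossTest)
```

The handle oobObjective can then be passed to bayesopt in place of fun.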


Alan Weiss on 6 Jul 2017
I think that you are pretty close. You just have to define what your objective function is as a function of the parameters you use. Bayesian Optimization Objective Functions has the basics for what bayesopt passes to your objective function: a table of values. This example shows using bayesopt to minimize cross-validated loss.
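To make the "table of values" point concrete, here is a small sketch with a toy objective, not the random forest one. The variables x and n and the objective itself are invented for illustration; the point is only that each optimizableVariable name becomes a column you read with dot notation.

```matlab
xvar = optimizableVariable('x',[-5,5]);
nvar = optimizableVariable('n',[1,10],'Type','integer');

% The objective receives a one-row table with columns x and n
objective = @(t) (t.x - 1)^2 + t.n;

% Calling it by hand with the same kind of input bayesopt supplies
t = table(0.5,3,'VariableNames',{'x','n'});
val = objective(t);                 % (0.5 - 1)^2 + 3 = 3.25

% A short, quiet optimization run
rng(1);
results = bayesopt(objective,[xvar,nvar], ...
    'MaxObjectiveEvaluations',15,'Verbose',0,'PlotFcn',[]);
best = bestPoint(results);          % also a one-row table
disp(best)
```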
I believe that you are going to want to write a function that takes the table of hyperparameters x along with your data X and y_cat and some other arguments. Note that crossval calls predfun with three inputs (Xtrain,Ytrain,Xtest), so the forest has to be trained inside predfun on each fold's training data. Something like this:
function loss = myCVlossfcn(x,X,y_cat,cvp,opts)
numTrees = 1000;
% Train on each fold's training set and predict its test set.
% (yfit should come back in the same type as y_cat; convert if needed.)
classA = @(Xtrain,Ytrain,Xtest) predict(TreeBagger(numTrees,Xtrain,Ytrain,...
    'Method','classification','Options',opts,...
    'MinLeafSize',x.minLS,'NumPredictorsToSample',x.numPTS),Xtest);
loss = crossval('mcr',X,y_cat,'predfun',classA,'partition',cvp);
end
Then call bayesopt like this:
results = bayesopt(@(x)myCVlossfcn(x,X,y_cat,cvp,opts),hyperparametersRF)
If you don't understand about passing extra arguments this way, see Passing Extra Parameters.
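In case it helps, here is the extra-parameters idea in miniature, with nothing toolbox-specific in it. The names g, h, and scale are made up for illustration. The anonymous function h takes one input, as bayesopt expects, and carries the extra value along with it:

```matlab
g = @(x,s) s*x.^2;      % function that needs an extra parameter s

scale = 3;
h = @(x) g(x,scale);    % one-input wrapper; scale is captured here

y1 = h(2);              % 3*2^2 = 12

scale = 10;             % changing the variable afterwards...
y2 = h(2);              % ...does not affect h: still 12
```

The captured value is fixed when the handle is created, so the handle has to be rebuilt if the data or options change.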
I might have some typos, I didn't try running this, but is it clearer?
Alan Weiss
MATLAB mathematical toolbox documentation

Don Mathis on 6 Jul 2017
Here's a variation that doesn't use a function-within-a-function. Maybe it's easier to understand:
load fisheriris
X = meas;
Y = species;
minLS = optimizableVariable('minLS',[1,20],'Type','integer');
% fisheriris has only 4 predictors, so bound numPTS by size(X,2)
% (113 was the upper bound used in the question).
numPTS = optimizableVariable('numPTS',[1,size(X,2)],'Type','integer');
hyperparametersRF = [minLS;numPTS];
rng(1);
fun = @(hyp)f(hyp,X,Y);
results = bayesopt(fun,hyperparametersRF);
besthyperparameters = bestPoint(results);

function oobMCR = f(hparams, X, Y)
    opts = statset('UseParallel',true);
    numTrees = 1000;
    A = TreeBagger(numTrees,X,Y,'Method','classification','OOBPrediction','on','Options',opts,...
        'MinLeafSize',hparams.minLS,'NumPredictorsToSample',hparams.numPTS);
    % Out-of-bag misclassification rate of the whole ensemble (a scalar)
    oobMCR = oobError(A, 'Mode','ensemble');
end
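Once bayesopt has finished, the table returned by bestPoint can be fed straight back into TreeBagger to train a final model. A rough sketch, assuming a shortened search (100 trees per evaluation, 10 evaluations, no plots) so that it runs quickly; finalModel and finalOOB are illustrative names, and the settings are not tuned recommendations:

```matlab
load fisheriris
X = meas;
Y = species;
minLS  = optimizableVariable('minLS',[1,20],'Type','integer');
numPTS = optimizableVariable('numPTS',[1,size(X,2)],'Type','integer');
rng(1);

% Cheap objective for the search: ensemble out-of-bag error
fun = @(hp) oobError( ...
    TreeBagger(100,X,Y,'Method','classification','OOBPrediction','on', ...
        'MinLeafSize',hp.minLS,'NumPredictorsToSample',hp.numPTS), ...
    'Mode','ensemble');

results = bayesopt(fun,[minLS;numPTS], ...
    'MaxObjectiveEvaluations',10,'Verbose',0,'PlotFcn',[]);
best = bestPoint(results);          % one-row table: minLS, numPTS

% Retrain with the selected hyperparameters and more trees
finalModel = TreeBagger(1000,X,Y,'Method','classification','OOBPrediction','on', ...
    'MinLeafSize',best.minLS,'NumPredictorsToSample',best.numPTS);
finalOOB = oobError(finalModel,'Mode','ensemble');

label = predict(finalModel,X(1,:)); % cell array holding the predicted class
fprintf('minLS = %d, numPTS = %d, OOB error = %.4f\n', ...
    best.minLS, best.numPTS, finalOOB);
```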
  5 Comments
Anthony on 7 Jul 2017
Update: I fixed this. The original .m file that my group uses to import our data, which I never look at, had me running in a completely different directory from everything else. I just moved everything into that directory for the time being, and it's working just fine now.
Alan Weiss on 7 Jul 2017
Glad to hear it, thanks for letting us know.
Maybe you should accept Don's answer if that is what helped you.
Alan Weiss
MATLAB mathematical toolbox documentation
