Can I have checkpoints in Bayesian optimization for tuning hyperparameters of a neural network?

Hi there,
I'm trying to implement Bayesian Optimization on a BiLSTM network.
I'm planning to run this code on a university cluster, but jobs are limited to a maximum of 2 days (48 hours). If a job runs longer than that, it is killed automatically, which would probably waste time and resources for me and for the other students waiting in the queue.
I was wondering whether it would be possible to implement some kind of checkpoint for bayesopt() so that it can continue from where the job left off.
Basically, what I'm asking is whether I can save my previous runs (the points bayesopt() has observed), load them in my next run, and continue from where it stopped.
I have not seen any documentation related to this (I may have missed it).
My understanding of bayesopt() is that the more points it observes, the more accurate its answers become. Is this right? If so, I might want to run it for more than 2 days. The number of cores I can request is limited (the more I request, the longer I wait in the queue), and I estimate that the most complex combination of variables can take between 40 minutes and 1 hour to train and give me a result (obviously, not every combination will take this long).
Any help is appreciated.
Thank you.

Accepted Answer

Aditya Patil on 16 Nov 2020
Edited: Aditya Patil on 16 Nov 2020
Currently, there is no checkpointing argument. However, you can use the 'OutputFcn' argument together with the 'SaveFileName' argument to save the results to a file, and the resume function to restart the process, as follows:
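x1 = optimizableVariable('x1',[-5,5]);
x2 = optimizableVariable('x2',[-5,5]);
fun = @rosenbrocks;

if exist('BayesoptResults.mat','file')
    % A checkpoint exists: load the saved BayesoptResults object and continue.
    load('BayesoptResults.mat');
    results = resume(BayesoptResults,...
        'SaveFileName', 'BayesoptResults.mat', ...
        'OutputFcn',{@saveToFile});
else
    % First run: start fresh and save the results to file as it goes.
    results = bayesopt(fun, [x1, x2],'AcquisitionFunctionName',...
        'expected-improvement-plus', ...
        'SaveFileName', 'BayesoptResults.mat', ...
        'OutputFcn',{@saveToFile});
end

function f = rosenbrocks(x)
    f = 100*(x.x2 - x.x1^2)^2 + (1 - x.x1)^2;
end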
Note that this saves to file on every iteration, so you might want to replace saveToFile with a custom function that saves only occasionally, for performance reasons.
The relevant documentation is on the reference pages for resume and bayesopt.
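One way to save only occasionally is a small custom output function. The following is a minimal sketch and not part of the original answer: the function name saveEveryN and the interval N = 10 are hypothetical choices, and it assumes the output function signature stop = fcn(results, state) and the NumObjectiveEvaluations property described in the bayesopt documentation.

```matlab
function stop = saveEveryN(results, state)
% saveEveryN  Hypothetical bayesopt output function that checkpoints
% every N objective evaluations instead of on every iteration.
    stop = false;   % never request an early stop from this function
    N = 10;         % assumed save interval; tune to your evaluation cost

    switch state
        case 'iteration'
            % Save only when the evaluation count is a multiple of N.
            if mod(results.NumObjectiveEvaluations, N) == 0
                writeCheckpoint(results);
            end
        case 'done'
            % Always save the final state when the run finishes.
            writeCheckpoint(results);
    end
end

function writeCheckpoint(results)
    % Store the object under the variable name BayesoptResults so that
    % load('BayesoptResults.mat') followed by resume(BayesoptResults, ...)
    % finds it under the expected name.
    BayesoptResults = results;
    save('BayesoptResults.mat', 'BayesoptResults');
end
```

To use it, pass 'OutputFcn', @saveEveryN in place of {@saveToFile}; the if/else logic above can then load and resume from the same file. If the job is killed between saves, at most the last N - 1 evaluations are lost. bayesopt also accepts a 'MaxTime' limit in seconds, which could be set below the cluster's 48-hour limit; it does not interrupt an evaluation that is in progress, so leave some margin.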
Yildirim Kocoglu on 7 Dec 2020
That's fine.
Your suggested method still works in general (automatic) and that's what I need for the time being.
I really appreciate you answering my question. It was a lifesaver, and I'm very grateful for it.
I also really appreciate you trying to reproduce the issue, even though the attempt was not successful. There may be other reasons (not code related) for the issue that I can't pinpoint right now, but at least it's good to know it worked fine for you. This knowledge might help me in the future.
Have a great day.

More Answers (1)

Alan Weiss on 17 Nov 2020
There is one other possible solution to your problem. surrogateopt from Global Optimization Toolbox™ can checkpoint automatically. It does not perform Bayesian optimization, but surrogate optimization is closely related and might be similar enough for your purposes.
Alan Weiss
MATLAB mathematical toolbox documentation
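As an illustration of the surrogateopt route, here is a minimal sketch, not taken from the answer above. It assumes a Global Optimization Toolbox release that supports the 'CheckpointFile' option; the file name SurrogateCheckpoint.mat and the evaluation budget of 200 are hypothetical. Unlike bayesopt, the objective here takes a plain numeric vector rather than a table of optimizable variables.

```matlab
% Rosenbrock's function on a numeric vector x = [x1, x2].
objfun = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
lb = [-5, -5];   % lower bounds, matching the bayesopt example
ub = [5, 5];     % upper bounds

checkpointFile = 'SurrogateCheckpoint.mat';   % hypothetical file name

if exist(checkpointFile, 'file')
    % A checkpoint exists: continue from the saved state.
    [x, fval] = surrogateopt(checkpointFile);
else
    % First run: ask surrogateopt to write checkpoints as it goes.
    opts = optimoptions('surrogateopt', ...
        'CheckpointFile', checkpointFile, ...
        'MaxFunctionEvaluations', 200);       % assumed budget
    [x, fval] = surrogateopt(objfun, lb, ub, opts);
end
```

Submitting the same script again after the job is killed should pick up from the checkpoint, because the branch taken depends only on whether the file exists.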

