What happens if I use parfor and the fmincon/fminunc option "UseParallel" together?

I want to minimize the same objective function using n different starting points. I have k cores available. I have .
Consider the following (toy) code:
opts = optimoptions('fminunc', 'UseParallel', true);  % fminunc takes options as an options object
parfor ii = 1:n
    x_opt(ii) = fminunc(obj_fun, starting_point(ii), opts);
end
Is MATLAB smart enough to allocate cores to each optimization?
  7 Comments
Michael Stollenwerk on 20 Nov 2020
Ok, thanks. That's unfortunate, since I suspect that in my case 'UseParallel' with the available cores would improve speed by a lot: obj_fun has a high-dimensional input array, so many function evaluations, which can run in parallel, are needed to estimate the gradient in each step.
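To illustrate why a high-dimensional input means many independent function evaluations per step, here is a minimal sketch of a forward-difference gradient estimate. It is not fminunc's internal code; the point x, the step size h, and the toy objective are assumptions chosen for illustration.

```matlab
% Forward-difference gradient: one extra objective evaluation per input
% dimension, and each evaluation is independent of the others.
objfun = @(x) sum(x.^2);   % toy objective (assumption for illustration)
x  = rand(50, 1);          % hypothetical point at which to estimate the gradient
h  = 1e-6;                 % hypothetical finite-difference step
f0 = objfun(x);            % baseline value, shared by all differences
n  = numel(x);
g  = zeros(n, 1);
parfor j = 1:n             % falls back to a serial loop if no pool is available
    xj = x;
    xj(j) = xj(j) + h;     % perturb only the j-th coordinate
    g(j) = (objfun(xj) - f0) / h;
end
% For this objective the exact gradient is 2*x, so the error is roughly h.
disp(max(abs(g - 2*x)));
```

With n inputs, each gradient estimate costs n evaluations on top of the baseline, which is the work that could be spread over the cores.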
Mario Malic on 22 Nov 2020
I think it's worth trying a serial outer loop together with the UseParallel option, especially if each evaluation of your function takes some time.
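A minimal sketch of that suggestion, assuming the Optimization Toolbox is installed and, for any actual speed-up, that a parallel pool is available. The objective, the number of starting points, and the variable names are placeholders, not the asker's real problem.

```matlab
% Serial outer loop over starting points; parallelism is left to fminunc's
% gradient estimation via the UseParallel option.
objfun = @(x) sum(x.^2);             % placeholder objective
n = 4;                               % hypothetical number of starting points
starting_points = rand(200, n);      % one starting point per column
opts = optimoptions('fminunc', 'Display', 'none', 'UseParallel', true);
x_opt = zeros(size(starting_points));
for ii = 1:n                         % plain for, not parfor
    x_opt(:, ii) = fminunc(objfun, starting_points(:, ii), opts);
end
```

Timing this against the parfor version with 'UseParallel' set to false would show which layout suits a given objective.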


Answers (1)

Matt J on 18 Nov 2020
I don't know the answer to that directly, but it is easy to show by example that turning off 'UseParallel' can be beneficial:
objfun = @(x) sum(x.^2);
x0 = rand(2000,1);
opts = optimoptions('fminunc','Display','none','UseParallel',true);
tic;
parfor i = 1:4
    fminunc(objfun,x0,opts);
end
toc % Elapsed time is 0.369885 seconds.
opts.UseParallel = false;
tic;
parfor i = 1:4
    fminunc(objfun,x0,opts);
end
toc % Elapsed time is 0.274740 seconds.
  9 Comments
Raymond Norris on 19 Nov 2020
I'm puzzled by how Matt's example worked with UseParallel set to true. Inside a parfor loop, I would expect the optimization not to use the parallel pool and hence to run serially. That might partly explain the performance.
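One way to probe this, as a sketch: getCurrentTask (Parallel Computing Toolbox) returns an empty array on the client and a task object on a worker of a process-based pool, so it can report where each iteration ran. MATLAB documents that a parfor nested inside another parfor runs serially; that UseParallel behaves the same way inside the loop is the expectation stated above, not something this snippet proves.

```matlab
% Record, for each parfor iteration, whether it executed on a pool worker.
onWorker = false(1, 4);
parfor i = 1:4
    % Empty on the client, non-empty on a worker of a process-based pool.
    onWorker(i) = ~isempty(getCurrentTask());
end
disp(onWorker);   % all false means the loop ran on the client (no pool in use)
```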
Secondly, 2000x1 most likely just isn't big enough. Take a look at the following with 20000. For starters, when I benchmark, I set maxNumCompThreads to 1 for a baseline and then to 4 later. This tells me what implicit multi-threading MATLAB gives me.
objfun = @(x) sum(x.^2);
x0 = rand(20000,1);
opts = optimoptions('fminunc','Display','none','UseParallel',false);
%% Baseline
maxNumCompThreads(1);
tic
for i = 1:4
    fminunc(objfun,x0,opts);
end
toc
% Elapsed time is 197.351202 seconds.
%% 4 threads, no additional parallelism
maxNumCompThreads(4);
tic
for i = 1:4
    fminunc(objfun,x0,opts);
end
toc
% Elapsed time is 75.309341 seconds.
%% 4 workers, but parallelism is minimal
opts = optimoptions('fminunc','Display','none','UseParallel',true);
tic
for i = 1:4
    fminunc(objfun,x0,opts);
end
toc
% Elapsed time is 64.789117 seconds.
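Because maxNumCompThreads changes a session-wide setting, a benchmark like the one above leaves it modified afterwards. A small sketch of saving and restoring it, relying on the documented behavior that maxNumCompThreads(N) returns the previous value; the timed workload here is a placeholder.

```matlab
% Limit MATLAB to one computational thread for a baseline, then restore.
oldN = maxNumCompThreads(1);   % returns the previous maximum
tic
A = rand(500);                 % placeholder workload (assumption)
B = A * A;                     % implicitly multi-threaded operation
tSingle = toc;
maxNumCompThreads(oldN);       % restore the original setting
fprintf('Single-thread time: %.4f s\n', tSingle);
```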
Matt J on 19 Nov 2020
Edited: Matt J on 19 Nov 2020
I'm not sure what the manipulation of maxNumCompThreads is supposed to tell us about the interaction between parfor and UseParallel. I would think that reducing maxNumCompThreads will add even more bottlenecks to the computation that wouldn't otherwise be there, because now fminunc cannot maximally exploit the multi-core resources even for basic linear algebra steps. With more bottlenecks, the parallelization or non-parallelization of the gradient calculation step (controlled by UseParallel) will have a diminished impact.
