Why use `UseVectorized` and `parsim` instead of `UseParallel` and `sim` with `ga`?

I have found multiple questions here about leveraging Parallel Computing Toolbox with ga when the fitness function involves a simulation. UseVectorized together with parsim in the fitness function is often recommended instead of UseParallel and sim.
My code is written to use UseVectorized and parsim. However, it wastes time when ga sometimes calls the fitness function with fewer than nWorkers rows of the population: nWorkers - nRows workers then sit idle and wait for the fitness function to complete.
Does this also happen with UseParallel and sim in place of UseVectorized and parsim? One problem I can think of with the former is that the simulation cache would either have to be built for every simulation or be copied from somewhere. Is there a performance benefit to using the latter?

Accepted Answer

Harald on 8 Jan 2024
Hi,
with UseVectorized, the fitness function should only be called once for an entire generation. I would thus only expect the scenario you describe if your population size is less than nWorkers, or towards the end of a generation when only a few simulations are left. For example, consider a population size of 12, nWorkers = 8, and simulations of exactly the same duration. Each worker first performs one simulation, leaving 4 to be completed. The first four workers to finish each perform another simulation, while the other four remain idle. With a sufficiently large population size, the effect of this should become negligible.
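To put rough numbers on this effect, here is a small sketch under idealized assumptions: every simulation takes the same time, and a worker picks up the next simulation as soon as it is free. The helper name workerUtilization is made up for illustration and is not a toolbox function. The second and third calls use the pool size and row counts mentioned in the comments further down.

```matlab
% Idealized model: nRows equal-length simulations in one call to the
% vectorized fitness function, run on a pool of nWorkers workers.
% workerUtilization is a hypothetical helper, not part of any toolbox.
workerUtilization = @(nRows, nWorkers) ...
    nRows ./ (ceil(nRows ./ nWorkers) .* nWorkers);

u1 = workerUtilization(12, 8);    % 2 rounds of 8 slots, 12 used
u2 = workerUtilization(800, 48);  % 17 rounds of 48 slots, 800 used
u3 = workerUtilization(10, 48);   % 1 round, only 10 of 48 slots used

fprintf('%.2f  %.2f  %.2f\n', u1, u2, u3);   % prints 0.75  0.98  0.21
```

Real simulations differ in duration, so actual utilization will deviate from these figures, but the trend is the same: large calls keep the pool busy, small calls do not.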
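I expect the same to happen with UseParallel and sim. An advantage of UseVectorized and parsim is that you can use fast restart (the UseFastRestart option of parsim), so the model does not have to be recompiled between simulations on the same worker.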
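A minimal sketch of what the UseVectorized and parsim pattern can look like. The model name myModel, the workspace variable p, and the logged variable cost (assumed to be a numeric array) are placeholders, and the penalty value for failed simulations is an arbitrary assumption, so please adapt all of them to your model.

```matlab
% Ask ga to pass the whole population (one row per individual) in one call.
opts = optimoptions('ga', 'UseVectorized', true, 'PopulationSize', 800);

function f = vectorizedFitness(pop)
    mdl   = 'myModel';                 % placeholder model name
    nRows = size(pop, 1);              % number of individuals in this call

    % One SimulationInput object per individual.
    in(1:nRows) = Simulink.SimulationInput(mdl);
    for i = 1:nRows
        % 'p' is a placeholder for the variable your model reads.
        in(i) = setVariable(in(i), 'p', pop(i, :));
    end

    % Run all simulations on the pool, reusing the compiled model per worker.
    out = parsim(in, 'UseFastRestart', 'on', 'ShowProgress', 'off');

    f = zeros(nRows, 1);
    for i = 1:nRows
        if isempty(out(i).ErrorMessage)
            c = out(i).get('cost');    % placeholder logged variable
            f(i) = c(end);
        else
            f(i) = 1e6;                % arbitrary penalty for a failed run
        end
    end
end
```

The fitness function would then be passed to ga as @vectorizedFitness together with opts. If you use a nonlinear constraint function, it has to be vectorized as well.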
Another aspect to consider is that ga may not be the best solver - check out https://www.mathworks.com/help/gads/improving-optimization-by-choosing-another-solver.html. If your problem is sufficiently smooth and you are looking for a single global solution, ga is pretty far down that list. If you decide to try GlobalSearch or MultiStart, consider the following for running the local solver: https://www.mathworks.com/help/optim/ug/optimizing-a-simulation-or-ordinary-differential-equation.html.
Best wishes,
Harald
  2 Comments
Jack on 8 Jan 2024
I also expected workers to stay idle only in the situation you described. However, when I set InitialPopulationRange, the first few calls to the fitness function contain fewer than 10 rows, even though I have PopulationSize set to 800 (I use display(size(x)); in the fitness function to see the size of x). With one simulation lasting around 5 minutes and 48 workers, a lot of core-hours are wasted (idle cores are still billed the same).
Regarding the algorithm choice: because my problem is a global optimization with integer variables and nonlinear constraints, I think my choice is only between ga and surrogateopt. Initially I needed multi-objective optimization, so gamultiobj was the only option. I have since converted the problem into a single-objective one, so I just changed to ga because the setup is mostly the same between ga and gamultiobj. Is it worth the effort to switch to surrogateopt?
Harald on 9 Jan 2024
That's interesting. I have tried a little example of setting InitialPopulationRange and am not observing the same, so I suspect this to be due to an interaction with other settings. I also suspect that you would experience the same if you were using UseParallel and sim.
There may be some initialization going on before the first real population. It might be worth a shot to create the initial population yourself and pass it to ga via the InitialPopulationMatrix option. Also, if you have good ideas on what will likely result in an improved solution, you may want to specify custom mutation and crossover functions.
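A sketch of supplying the initial population explicitly. The bounds, the integer indices, and the problem size are placeholders. I have not verified that this makes the first fitness call receive all rows in your setup, so it is worth checking size(x) again afterwards.

```matlab
nvars   = 4;              % placeholder number of variables
lb      = [0 0 1 1];      % placeholder lower bounds
ub      = [10 10 5 5];    % placeholder upper bounds
intcon  = [3 4];          % placeholder indices of integer variables
popSize = 800;

rng(0);                                        % reproducible draw
P0 = lb + rand(popSize, nvars) .* (ub - lb);   % uniform samples within bounds
P0(:, intcon) = round(P0(:, intcon));          % integer variables stay integer

opts = optimoptions('ga', ...
    'PopulationSize', popSize, ...
    'InitialPopulationMatrix', P0, ...
    'UseVectorized', true);
```

Rounding keeps the integer columns inside their bounds here because those bounds are themselves integers. The rows are not checked against the nonlinear constraints.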
According to https://www.mathworks.com/help/gads/improving-optimization-by-choosing-another-solver.html (sorry, a dot was added to the link in my previous post, causing a redirect to a different page), "surrogateopt attempts to find a global solution using the fewest objective function evaluations" which should be advantageous with costly objective functions. Since the information that needs to be provided to ga and surrogateopt is comparable, I hope that it won't be too much effort to rewrite.
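For a rough idea of the rewrite, here is a sketch of a surrogateopt call with integer variables and one nonlinear inequality. The bounds, the evaluation budget, and the body of objconstr are placeholders standing in for your simulation. surrogateopt expects nonlinear constraints to be returned together with the objective, in a structure with the fields Fval and Ineq.

```matlab
lb     = [0 0 1 1];       % placeholder lower bounds
ub     = [10 10 5 5];     % placeholder upper bounds
intcon = [3 4];           % placeholder indices of integer variables

opts = optimoptions('surrogateopt', ...
    'UseParallel', true, ...              % evaluate points on the pool
    'MaxFunctionEvaluations', 200);       % placeholder budget

[x, fval] = surrogateopt(@objconstr, lb, ub, intcon, opts);

function s = objconstr(x)
    % x is a single point (row vector). In the real problem, this is where
    % the simulation would be run with sim.
    s.Fval = sum(x.^2);            % placeholder objective
    s.Ineq = x(1) + x(2) - 12;     % placeholder constraint, feasible if <= 0
end
```

With UseParallel, each worker evaluates one point at a time, so the fitness function uses sim rather than parsim in this setup.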
A final idea, of course: if there is any way of accelerating a simulation of the model (e.g., by choosing a different simulation mode), that could help tremendously.
