Calling parpool with SpmdEnabled = False

Hello,
I am having an issue with my university cluster: when one worker crashes, for a reason such as "communication with the worker is lost probably due to a network problem", all the remaining workers crash too, and this can happen again and again.
Because I am using parfor, which, unlike spmd, does not require communication among the workers, I would like all the other workers to finish their jobs.
I read online that you can solve this by opening the pool on your HPC cluster with the parameter "SpmdEnabled" set to false. I did that, but MATLAB ignored my request, as you can see in the warning message at the end.
My question is, how can I solve this issue?
---------------------------------------------------------------------------------------------------------
parpool('HPCServerProfile1', 160, 'SpmdEnabled', false)
Starting parallel pool (parpool) using the 'HPCServerProfile1' profile ...
Warning: Disabling SPMD on parallel pools is not supported on this cluster type. Set 'SpmdEnabled' value to true.
Connected to the parallel pool (number of workers: 160).

Accepted Answer

Edric Ellis on 30 Jan 2020
Unfortunately, only the MJS (MATLAB Job Scheduler) and Local cluster types support SpmdEnabled = false. You might be able to use the "cluster parfor" approach instead; see the parforOptions documentation. Basically, you would transform your main parfor loop like so:
% Important: do *not* create a parallel pool prior to running this!
% In fact, you may wish to call "delete(gcp('nocreate'))" to be sure.
% Get the cluster object
c = parcluster('HPCServerProfile1');
% Create the parforOptions structure
opts = parforOptions(c);
% Run the parfor loop directly on the cluster with no parallel pool
parfor (idx = 1:10000, opts)
    % Body of the parfor loop
end
This should be more robust; however, it does incur more overhead than the interactive parpool approach.
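If the extra overhead becomes noticeable, one thing that may be worth trying is controlling how parfor divides the iterations into subranges, using the 'RangePartitionMethod' and 'SubrangeSize' options of parforOptions. The following is only a sketch: the profile name is taken from the question, while the subrange size of 50 and the loop body are placeholder assumptions. Whether this helps will depend on the cluster and on how long each iteration takes.

```matlab
% Sketch only: 'HPCServerProfile1' is the profile name from the question;
% the subrange size (50) and the loop body are illustrative assumptions.

% Make sure no interactive parallel pool is open.
delete(gcp('nocreate'));

% Get the cluster object.
c = parcluster('HPCServerProfile1');

% Ask parfor to split the loop into fixed-size subranges of at most
% 50 iterations each, instead of letting it choose the sizes itself.
opts = parforOptions(c, ...
    'RangePartitionMethod', 'fixed', ...
    'SubrangeSize', 50);

n = 10000;
results = zeros(1, n);

% Run the loop directly on the cluster with no parallel pool.
parfor (idx = 1:n, opts)
    results(idx) = idx^2;   % placeholder for the real loop body
end
```

Larger subranges mean fewer, bigger chunks of work; smaller subranges mean more chunks, with less work lost if a worker fails. The right value is something to experiment with.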
