Long transmission times of parfeval with large parallel.pool.Constant objects

Hi,
I am currently experiencing very long transmission times when I use parfeval with large parallel.pool.Constant objects. I create three such pool constants:
poolA = parallel.pool.Constant(A); % A requires about 600 MB of memory
poolB = parallel.pool.Constant(B); % B requires about 200 MB of memory
poolC = parallel.pool.Constant(C); % C requires about 100 MB of memory
In total, these three matrices occupy close to 1 GB of memory. After defining them as pool constants, I pass them to parfeval as follows:
t = tic;
futures(idx) = parfeval(currentPool, @solveInstance, 1, instance, model, params, t, poolA, poolB, poolC);
The input t passes the submission time to the worker, so that 'solveInstance' can record how long it took for the data to reach the worker and for function evaluation to begin. Consequently, the first line of code in 'solveInstance' is:
endTrans = toc(t);
This measurement tells me that it takes about 130 seconds (sometimes even longer) until function evaluation actually starts. I know that 1 GB of data is a lot, but it is already optimized: the arrays are purely sparse and have been trimmed down as much as possible. That is exactly why I defined A, B, and C as parallel.pool.Constant objects, so that I avoid transferring the data to the workers on every parfeval call. So my question is: why does it still take that long? (The size of the other function arguments is negligible.) Shouldn't the data already be available on the workers? My server also has more than enough memory, so that can't be the problem either. I would appreciate your help on this matter.
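For reference, here is a minimal sketch of how the entry of 'solveInstance' looks; only the timing line is from my actual code, the rest of the body is simplified and the variable names are illustrative:

```matlab
function result = solveInstance(instance, model, params, t, poolA, poolB, poolC)
    % Record how long it took from parfeval submission until execution
    % actually starts on the worker.
    endTrans = toc(t);

    % The large arrays are read from the per-worker copies held by the
    % parallel.pool.Constant objects; no data should be transferred here.
    A = poolA.Value;
    B = poolB.Value;
    C = poolC.Value;

    % ... actual solve logic omitted ...
    result = endTrans;
end
```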
Thanks for any help,
Florian
  1 Comment
Matt J on 12 Mar 2019
Does it take that long every time you call parfeval, or just the first time? Maybe you're just seeing the time of the initial transmission.
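One way to check, e.g. with the constants from your own code (a rough sketch; `delays` is just an illustrative variable):

```matlab
% Submit the same trivial task several times and compare the measured
% delays. If only delays(1) is large, you were timing the initial setup.
delays = zeros(1, 3);
for k = 1:3
    fut = parfeval(gcp, @(a, b, c, t) toc(t), 1, poolA, poolB, poolC, tic);
    delays(k) = fetchOutputs(fut);
end
delays
```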


Answers (1)

Edric Ellis on 13 Mar 2019
You are correct that parallel.pool.Constant data is already available on the workers after construction. It's not clear to me why you're seeing this problem. Here's what I tried (with a pool of size 1, in R2018b, on GLNXA64):
t = tic;
poolA = parallel.pool.Constant(rand(600, 1e6/8)); % 600MB
poolB = parallel.pool.Constant(rand(200, 1e6/8)); % 200MB
poolC = parallel.pool.Constant(rand(100, 1e6/8)); % 100MB
timeToCreateConstants = fetchOutputs(parfeval(@toc, 1, t))
resulting in
timeToCreateConstants =
3.6487
Using the constants is (for me) quick, as expected. Here's an example:
f = parfeval(gcp, @(a,b,c,t) [toc(t), numel(a.Value) + numel(b.Value) + numel(c.Value)], ...
1, poolA, poolB, poolC, tic);
out = fetchOutputs(f);
timeToUseConstants = out(1)
resulting in
timeToUseConstants =
0.0684
So something else must be going on in your case...
  3 Comments
Edric Ellis on 13 Mar 2019
Actually, the parallel.pool.Constant construction transmits all the data to the workers. (The constructor of parallel.pool.Constant essentially uses parfevalOnAll to send the input argument to the workers, where it is stashed away in persistent storage. When you then pass the parallel.pool.Constant instance into parfeval, the only thing that goes across the wire is the key into that persistent storage.) So, once the constant has been created, each subsequent transmission to the workers is very cheap.
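To illustrate the idea, here is a rough conceptual sketch; this is not the actual parallel.pool.Constant implementation, and the function names are made up:

```matlab
function key = sendConstant(data)
    % Conceptually like the Constant constructor: copy the data to every
    % worker exactly once, keeping only a small key on the client.
    key = tempname;  % any unique string works as a key
    wait(parfevalOnAll(gcp, @storeOnWorker, 0, key, data));
end

function value = storeOnWorker(key, data)
    % Each worker stashes the data in persistent storage, keyed by id.
    persistent store
    if isempty(store)
        store = containers.Map;
    end
    if nargin == 2
        store(key) = data;  % initial transmission (happens once)
    end
    value = store(key);     % later lookups: only 'key' travels the wire
end
```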
martinflorian888 on 13 Mar 2019
OK, then it actually works the way I originally understood it. A customer support employee from MathWorks told me the data would be transmitted on the first parfeval call (which I found odd anyway, because what would happen when you create the pool constant then?). Your explanation sounds more reasonable.
I actually think my issue has nothing to do with MATLAB and is caused by different processes interfering on the same physical cores. When MATLAB identifies a worker as idle, that just means it has finished its last parfeval future task, not that a physical core is actually available. So when I submit a new job, the MATLAB worker (which is just another process) might have to wait until a physical core becomes free. That's one explanation I can think of. Why it has to wait that long, however, is a little beyond me (even if all cores are busy, the interference from other threads/programs should be very small, I think).
Anyway, I think I can handle this issue; it doesn't happen often, and only for the largest instances that I run. So it's not a major problem for me.

