Best Practice to Distribute Data to Workers?
Hi,
I wonder if there is a known best practice for distributing data from the client to the workers that is efficient in time (and space). Suppose we have a large matrix A on the client and want to distribute it to the workers along its columns. Suppose further that:
- A is the result of some complicated operations, so we can't generate the columns (or rows) of A in parallel on each worker
- A fits in memory (no datastore is needed)
What would be the best practice for distributing A to the workers?
I made the following comparison:
n = 512;
n_workers = 25; % must match the number of workers in the open parallel pool
A = rand(n^2, n); % generate synthesized data A
% method 1: distributed
tic;
A_dist = distributed(A);
t1=toc;
fprintf("t1 = %7.4e\n", t1)
clear A_dist
% method 2: Composite -> distributed
tic;
A_dist = Composite();
chunk_size = ceil(n/n_workers);
for i = 1 : n_workers-1
    A_dist{i} = A(:,chunk_size*(i-1)+1:chunk_size*i);
end
A_dist{n_workers} = A(:,chunk_size*(n_workers-1)+1:end);
A_dist = distributed(A_dist, 2);
t2=toc;
fprintf("t2 = %7.4e\n", t2)
clear A_dist
% method 3: spmd + codistributed
tic;
spmd
    A_dist = codistributed(A, codistributor('1d', 2));
end
t3=toc;
fprintf("t3 = %7.4e\n", t3)
clear A_dist
I observe that method 2 is always faster than method 1, and both are significantly faster than method 3. Typical output (the ranking and the gaps are quite robust):
t1 = 3.0949e+00
t2 = 2.2290e+00
t3 = 1.7517e+01
Is there any better way than my method 2?
Besides, I am wondering about the mirror question: what would be the best practice for gathering data from the workers to the client? Basically, it should be the inverse of my code: recovering a (large) matrix A on the client from the distributed array A_dist.
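For reference, the two obvious candidates for the gather direction can be sketched as follows. This is only a minimal sketch, not a benchmark result: it assumes Parallel Computing Toolbox with an open pool, uses a small stand-in matrix, and relies on the default distribution of distributed(A) being along the last dimension (columns here). The names A_back, A_back2, local_part and pieces are made up for illustration.

```matlab
% Sketch: bring a column-distributed matrix back to the client.
% Assumes Parallel Computing Toolbox and an open parallel pool.
A = rand(2000, 200);          % small stand-in for the large matrix
A_dist = distributed(A);      % assumed to be split along columns (default)

% Option 1: gather collects the whole array on the client in one call.
tic;
A_back = gather(A_dist);
t_gather = toc;
fprintf("t_gather = %7.4e\n", t_gather)

% Option 2 (inverse of method 2): expose each worker's local part as a
% Composite, then concatenate the pieces on the client in worker order.
tic;
spmd
    local_part = getLocalPart(A_dist);   % this worker's block of columns
end
pieces = cell(1, length(local_part));
for i = 1:length(local_part)
    pieces{i} = local_part{i};           % transfer worker i's block
end
A_back2 = [pieces{:}];
t_manual = toc;
fprintf("t_manual = %7.4e\n", t_manual)
```

Both variants should reproduce A exactly; which one is faster is something to measure on the actual cluster rather than assume.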
4 Comments
Edric Ellis
on 8 Dec 2021
parfor is probably fastest since it can send slices of data to multiple workers simultaneously. Unfortunately, using parfor is not useful for creating a distributed array since you don't have control over where the data ends up. (Ideally the distributed constructor would do this too, but I think the current implementation doesn't).
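To illustrate the slicing behaviour mentioned in the comment above, here is a minimal sketch, assuming Parallel Computing Toolbox. Because A is indexed as A(:, j) with the loop variable, parfor treats it as a sliced variable and sends each worker only the columns it needs. The per-column computation (a column norm) and the name colNorms are placeholders chosen for illustration.

```matlab
% Sketch: parfor slices A by column, so no worker receives the full matrix.
A = rand(1000, 64);                  % small stand-in for the large matrix
colNorms = zeros(1, size(A, 2));     % sliced output, one entry per column
parfor j = 1:size(A, 2)
    colNorms(j) = norm(A(:, j));     % placeholder per-column work
end
```

The result comes back to the client as an ordinary array, which is why this pattern does not help when the goal is to leave the data on specific workers as a distributed array.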
Answers (0)