decomposition in parfor failure

Shumao Zhang on 23 Nov 2021
Edited: Edric Ellis on 19 Aug 2022
Hi there,
I need to solve multiple linear systems multiple times: A_i x_i = b_i for i = 1, ..., n. Due to this problem structure, I realize that the decomposition routine will be very efficient, and since all linear systems are independent of each other, I use parfor to further speed up the computation.
However, decomposition + parfor does not work in MATLAB. I attach a piece of toy code where I synthesize the matrices and showcase my idea:
d = 1000; % size of matrix A
n = 10;   % number of linear systems
% get two random sparse PSD matrices in sparse form: (I, J, VA1) and (I, J, VA2)
I = horzcat(1:d, randi(d, 1, 2*d));
J = horzcat(1:d, randi(d, 1, 2*d));
V1 = rand(3*d, 1);
V2 = rand(3*d, 1);
T1 = sparse(I, J, V1);
T2 = sparse(I, J, V2);
[~, ~, VA1] = find(T1*T1');
[I, J, VA2] = find(T2*T2');
A = cell(1, n);
pool = parpool(2);
% construct A_i and decompose
parfor i = 1 : n
    A{i} = decomposition(sparse(I, J, VA1 + i^2 * VA2));
end
b = rand(d, n);
x = rand(d, n);
% solve all linear systems for one RHS
parfor i = 1 : n
    x(:, i) = A{i} \ b(:, i);
end
delete(pool);
The problem is that MATLAB gives an error at the second parfor loop:
Error using \ (line 391)
Matrix dimensions must agree.
It seems that the A{i} are still empty. I found a previous post https://www.mathworks.com/matlabcentral/answers/401212-decomposition-object-saveobj-method-not-supported that points me to https://merzba.ch/dw/blg:matlab_decomposition_parfor, where the author explains that it is the serialization/deserialization of the decomposition object in parfor that causes the trouble.
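As I understand the linked posts, the same limitation should show up without any parfor at all if you save and load a decomposition directly, e.g. (standalone snippet, unrelated to the matrices above; the exact warning text may differ by release):
R = sprand(200, 200, 0.05);
S = R*R' + 200*speye(200);     % small sparse SPD test matrix
dS = decomposition(S, 'chol'); % sparse Cholesky (CHOLMOD) factorization
save('dS.mat', 'dS');          % warns that the object cannot be saved
tmp = load('dS.mat');          % the loaded object is not usable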
I followed the suggestions in that post to modify decomposition.m. However, MATLAB still reports warnings and an error. The warnings are
Warning: Decomposition built-in error: CHOLMODWrapper object does not support saving to file.
Loaded file will not be usable.
> In parallel.internal.pool.serialize (line 10)
In distcomp/remoteparfor/serialize (line 279)
In distcomp/remoteparfor/addInterval (line 329)
In parallel_function>distributed_execution (line 704)
In parallel_function (line 573)
In untitled (line 24)
Warning: Decomposition built-in error: CHOLMODWrapper object not initialized correctly. Save and load are not supported for this object.
> In parallel.internal.pool.deserialize (line 33)
In remoteParallelFunction (line 66)
and the error reads
Error using matlab.internal.decomposition.SparseCholesky/solve (line 34)
Decomposition built-in error: Error: Cholesky decomposition is in invalid state.
Error in \ (line 394)
x = solve(f.Underlying, b, transp);
Error in untitled (line 24)
parfor i = 1 : m
Any idea on how to fix this? I would appreciate it a lot!

Accepted Answer

Edric Ellis on 23 Nov 2021
Edited: Edric Ellis on 19 Aug 2022
Unfortunately, I think you're hitting the sort of problem that was mentioned in this answer to the question you linked to - the internal representation of decomposition cannot (always) be saved correctly.
In your code above, you use each decomposition only once, so you ought to be able to simply combine the two parfor loops, and not transfer the decomposition objects at all.
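For the toy code in the question, that might look like this (same I, J, VA1, VA2, b, x, n as in your script):
% Build, decompose, and solve inside a single parfor iteration, so the
% decomposition object is created and used on the same worker and never
% has to be serialized back to the client.
parfor i = 1:n
    dA = decomposition(sparse(I, J, VA1 + i^2 * VA2));
    x(:, i) = dA \ b(:, i);
end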
If your full code is more complex, and you need to re-use the decomposition objects several times, you will need a different approach. It's not clear what would be best, but options include:
  1. Use parallel.pool.Constant to store decomposition on the workers without transferring it
  2. Use spmd to structure your parallelism to group computations that share the same decomposition.
Note that for case (1) above, it is crucial that you use the function_handle constructor for the parallel.pool.Constant, a bit like this (for a single matrix A):
A = someMatrixToBeDecomposed();
cA = parallel.pool.Constant(@() decomposition(A));
parfor i = 1:N
    % use cA.Value
end
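If you really do need all n decompositions available for repeated solves, one possible variation is to let the Constant hold a cell array of them (a sketch only; buildAllDecompositions is a hypothetical helper, and note that every worker redundantly rebuilds all n decompositions):
% Each worker evaluates the function handle once and keeps its own local
% copy of all n decompositions; nothing is serialized between processes.
cA = parallel.pool.Constant(@() buildAllDecompositions(I, J, VA1, VA2, n));
parfor i = 1:n
    x(:, i) = cA.Value{i} \ b(:, i);
end

function C = buildAllDecompositions(I, J, VA1, VA2, n)
% Hypothetical helper: factorize every A_i on the worker that calls it.
C = cell(1, n);
for i = 1:n
    C{i} = decomposition(sparse(I, J, VA1 + i^2 * VA2));
end
end
The duplicated setup work is the main drawback of this pattern, which is why option (2) above tends to be a better fit when there are many matrices.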
  4 Comments
Shumao Zhang on 23 Nov 2021
Thanks for your reply!
In fact I have a lot of decomposition objects (~1000), and need to use them in the following way:
  1. Get the decomposition objects of A_i, i = 1, ..., n
  2. Solve the linear systems A_i x_i = b_i using the decomposition objects of A_i
  3. Generate new b_i based on x; each new b_i depends on all of the x_j
  4. Go to step 2 and repeat
A nested loop is not really possible because of step 2.
My original plan was to use parfor with decomposition in step 1 to speed up the computation (which failed, as described in the question).
Suppose that I have 32 workers (physical CPUs). I wonder if it is possible to, for example, compute on worker i the decomposition objects for matrices A_{32(i-1)+1}, ..., A_{32i} and store them only on that worker in step 1, and then in step 2 have worker i solve only the systems with those same indices. When we reach step 4 and go back to step 2, we can reuse the precomputed decomposition objects. Pseudocode like this:
parpool(32);
% step 1
parfor i = 1 : 32
    % compute the decompositions bound to that worker
end
for k = 1 : M % step 4
    % step 2
    parfor i = 1 : 32
        for j = 1 : 32
            x(:, 32*i-j+1) = A{32*i-j+1} \ b(:, 32*i-j+1);
        end
    end
    % step 3
    b = f(x);
end
Conceptually this approach avoids the serialization/deserialization of the decomposition objects, so it should work. But I don't know how to implement step 1 and how to access A{32*i-j+1} in step 2 (apparently the current form is not right).
Thanks for your help!
Edric Ellis on 24 Nov 2021
I think you might be better off with an spmd block in this case, perhaps together with a for-drange loop and codistributed arrays. The spmd block makes it simpler to leave data on the workers using Composite or even distributed and the for-drange loop is a bit like a parfor loop that runs inside spmd. for-drange has the advantage (for this case) that it is deterministic, so you know in advance which iterations will end up on which worker. codistributed arrays are spread across the memory of the workers.
A sketch:
% Step 1 - set up the decompositions
spmd
    % A codistributed cell array
    A = codistributed.cell(1, n);
    for i = drange(1:n)
        % Each worker gets a different set of values of `i`
        A{i} = computeA(i);
    end
end
% Here, A is a distributed cell array. The elements remain on the workers.
spmd
    % Initial value of `b`, as `codistributed`.
    b = codistributed.zeros(n);
    for k = 1:M
        % Make codistributed x to store results
        x = codistributed.zeros(n);
        for i = drange(1:n)
            % Here, we are relying on the fact that each worker has
            % the right pieces of x, A, and b
            x(:,i) = A{i} \ b(:,i);
        end
        % Now we need to perform the operation on `x` to
        % get the next `b`. Maybe this works, maybe it doesn't:
        b = f(x);
    end
end
The slightly tricky part is what you had as b = f(x). As I've written it, x is a codistributed array. If the operation inside f is in the right form, it might just work anyway on the codistributed array using the existing overloads for that data type. If not, you can always gather the data back onto one worker, perform the operation there, and then redistribute, like this:
spmd
    % Gather all of x onto worker 1
    x1 = gather(x, 1);
    if labindex == 1
        % operate on local data
        b1 = f(x1);
    else
        % Dummy value not needed on other workers.
        b1 = [];
    end
    % Re-create codistributed b from the value b1 from worker 1.
    b = codistributed(b1, 1, getCodistributor(x));
end
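In terms of the toy data from the question, the computeA in the first sketch could be something like this hypothetical helper (I, J, VA1, and VA2 are broadcast into the spmd block from the client, so the call would become A{i} = computeA(i, I, J, VA1, VA2)):
function dA = computeA(i, I, J, VA1, VA2)
% Build A_i from the shared sparsity pattern and factorize it; the
% resulting decomposition object stays on the worker that built it.
dA = decomposition(sparse(I, J, VA1 + i^2 * VA2));
end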


More Answers (0)

Release: R2020b
