I have a 2*N-by-2*N matrix whose 2x2 tiles are defined by L*PSI*R, where L is a 2x3 array, PSI is a 3x3 array, and R is a 3x2 array.
It's a bit trickier, since L,R, and PSI depend on which 2x2 block is being generated. Fortunately, L and R come from the same array, and the basic structure smells parallelizable.
I've developed a few different approaches for generating this matrix, but I have an intellectual curiosity in finding better ways of doing it. I'd greatly appreciate any hints or full solutions.
In the code below, I never explicitly declare PSI. Its generation is quite intricate and distracts from the issue considerably (it would be a few dozen lines of garbage). The one thing that may help is that PSI is symmetric in its arguments, up to transpose (i.e., PSI(X(i,:),X(j,:)) == PSI(X(i,:),X(j,:))'). Feel free to use something like PSI = @ (x,y) rand(3).
Basic Setup:
matlabpool(12);
N=10;
X = rand(3,N)
timesMatVec = @(A,b) bsxfun(@times,A,b(:));
d = @(x) timesMatVec([(-x(:,3).*x(:,1)) (-x(:,3).*x(:,2)) ...
(1-x(:,3).^2)],(1./sqrt(1-x(:,3).^2)));
e = @(x) timesMatVec([(-x(:,2)) x(:,1) ...
0*x(:,1)],(1./sqrt(1-x(:,3).^2)));
dmat = d(X(:,:));
emat = e(X(:,:));
function to compute--a few hundred FLOP per call.
PSI = @(x,y) [p11 p12 p13; p21 p22 p23; p31 p32 p33];
And now, two different methods for evaluating.
First, the for-loop version to highlight what's going on:
for i = 1:size(X,1)
for j = 1:size(X,1)
Asbf( (2*i-1):(2*i),(2*j-1):(2*i) ) =
[dmat(i,:)' emat(i,:)']*...
PSI(X(j,:),X(i,:))*...
[dmat(j,:)' emat(j,:)']'
end
end
This is the arrayfun + parfor version I'm currently using. This is decently speedy, but at the expense of putting hugely redundant data in RAM. There is so much waste that my 32GB workstation is saturated when N=3249.
The basic strategy here is to compute 2-by-2*N strips at a time, since X(:,j) and R remain fixed through a strip. Parfor then offloads the strip generation to individual workers.
Asbf = num2cell(Asbf, [2 3]);
parfor i=1:N
Asbf{i}=reshape(Asbf{i},2,[]);
end
parfor i =1:size(X,1)
Asbf{i} = cell2mat(...
arrayfun(@(j) leftmat{j}*PSI(X(j,:), X(i,:))*leftmat{i}',...
(1:size(X,1))','UniformOutput', 0)...
);
end
Asbf=cell2mat(Asbf);
Asbf_buff=reshape(Asbf,2*N,[]);
Asbf=zeros(2*N,2*N);
Asbf(:,1:2:end)=Asbf_buff(:,1:N);
Asbf(:,2:2:end)=Asbf_buff(:,(N+1):end);
Any insight or suggestions are greatly appreciated, be they better ways of generating the array, some advice on configuring the cell arrays so parfor makes better use of memory, or otherwise. Even an incremental improvement will carry me some way.