Efficient algorithm for a duplication matrix

Question

Youngkyu Kim on 27 Jul 2019

Direct link to this question

https://nl.mathworks.com/matlabcentral/answers/473737-efficient-algorithm-for-a-duplication-matrix

Commented: Jan on 5 Aug 2021

Accepted Answer: Jan

Can anybody help me write a MATLAB function that creates a duplication matrix D?

Thanks in advance.

My code is very slow...

Any ideas to speed it up?

n=1000;
% Duplication matrix: vec(P)=Dvech(P)
tic
m=1/2*n*(n+1);
nsq=n^2;
DT=sparse(m,nsq);
for j=1:n
    for i=j:n
        ijth=(j-1)*n+i;
        jith=(i-1)*n+j;
        vecTij=sparse(ijth,1,1,nsq,1);
        vecTij(jith,1)=1;
        k=(j-1)*n+i-1/2*j*(j-1);
        uij=sparse(k,1,1,m,1);
        DT=DT+uij*vecTij';
    end
end
D=DT';
toc
% test duplication matrix
C=rand(n,n);
P=1/2*(C+C');
vechP=nonzeros(tril(P));
vecP=P(:);
err_D=vecP-D*vechP;
max(err_D(:))
min(err_D(:))

2 Comments

Walter Roberson on 27 Jul 2019

What are vec and vech in this context?

Stephan on 27 Jul 2019

The question text is copied entirely from Wikipedia, so we can assume this is what is meant: https://en.m.wikipedia.org/wiki/Vectorization_%28mathematics%29?wprov=sfla1
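
For readers unfamiliar with the notation, here is a small sketch for n = 3. The matrix values and variable names are illustrative and not taken from the question. vec(P) stacks all columns of P into an n^2-by-1 vector, vech(P) stacks only the lower triangle (including the diagonal) column by column into an n(n+1)/2-by-1 vector, and for a symmetric P the duplication matrix D satisfies vec(P) = D*vech(P).

```matlab
% Illustrative symmetric 3-by-3 matrix (values chosen arbitrarily)
P = [10 20 30;
     20 40 50;
     30 50 60];
n = size(P, 1);

vecP  = P(:);                 % vec(P):  all n^2 entries, column by column
vechP = P(tril(true(n)));     % vech(P): lower triangle incl. diagonal, column by column

% Duplication matrix for n = 3, written out by hand:
% entry k of cols is the position in vech(P) that holds the k-th entry of vec(P).
cols = [1 2 3  2 4 5  3 5 6];
D    = sparse(1:n^2, cols, 1, n^2, n*(n+1)/2);

disp(isequal(D * vechP, vecP))   % displays 1 if vec(P) == D*vech(P)
```

Extracting vech(P) with the logical mask tril(true(n)) keeps entries that happen to be exactly zero, whereas nonzeros(tril(P)) would drop them and shorten the vector.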


Answer 1

Jan on 28 Jul 2019

Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/473737-efficient-algorithm-for-a-duplication-matrix#answer_385153

Edited: Jan on 4 Aug 2021

For n=300 this needs 1.3 sec instead of 27.5 sec:

tic
m   = n * (n + 1) / 2;
nsq = n^2;
D   = spalloc(nsq, m, nsq);
row = 1;
a   = 1;
for i = 1:n
   b = i;
   for j = 0:i-2
      D(row + j, b) = 1;
      b = b + n - j - 1;
   end
   row = row + i - 1;
   
   for j = 0:n-i
      D(row + j, a + j) = 1;
   end
   row = row + n - i + 1;
   a   = a + n - i + 1;
end
toc

But it is much faster to build the index vector first instead of writing into the sparse matrix repeatedly:

tic
m   = n * (n + 1) / 2;
nsq = n^2;
r   = 1;
a   = 1;
v   = zeros(1, nsq);
for i = 1:n
   b = i;
   for j = 0:i-2
      v(r) = b;
      b    = b + n - j - 1;
      r    = r + 1;
   end
   
   for j = 0:n-i
     v(r) = a + j;
     r    = r + 1;
   end
   % BUGFIX: Omit "r = r + n - i + 1;"  Thanks Trisha Phippard
   a = a + n - i + 1;
end
D2 = sparse(1:nsq, v, 1, nsq, m);
toc

Now I get 0.013 sec for n=300. Finally, vectorize the two inner loops:

tic
m   = n * (n + 1) / 2;
nsq = n^2;
r   = 1;
a   = 1;
v   = zeros(1, nsq);
cn  = cumsum(n:-1:2);   % [EDITED, 2021-08-04], 10% faster
for i = 1:n
   % v(r:r + i - 2) = i - n + cumsum(n - (0:i-2));
   v(r:r + i - 2) = i - n + cn(1:i - 1);   % [EDITED, 2021-08-04]
   r = r + i - 1;
   
   v(r:r + n - i) = a:a + n - i;
   r = r + n - i + 1;
   a = a + n - i + 1;
end
D2 = sparse(1:nsq, v, 1, nsq, m);
toc

0.011 sec. That is a speedup by a factor of 2500 for n=300, and 0.12 sec for n=1000. Nice! :-)
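
As a possible alternative, not part of the answer above and not timed against it, the column indices can also be built without any loop by numbering the lower triangle of an n-by-n helper matrix in vech order and mirroring that numbering into the upper triangle. The names K and D3 are made up for this sketch, and it assumes a dense n-by-n temporary is acceptable.

```matlab
n = 5;                        % illustrative size
m = n * (n + 1) / 2;

K = zeros(n);
K(tril(true(n))) = 1:m;       % number the lower triangle in vech order
K = K + tril(K, -1).';        % mirror the numbering into the upper triangle

% Row k of D3 has a single 1 in column K(k): vec entry k equals vech entry K(k)
D3 = sparse(1:n^2, K(:), 1, n^2, m);

% Check the defining property vec(P) = D*vech(P) on a random symmetric matrix
C = rand(n);
P = (C + C.') / 2;
disp(norm(P(:) - D3 * P(tril(true(n))), Inf))   % should be 0
```

Because every row of the matrix holds exactly one 1, the product only copies entries of vech(P), so the check should come out exactly zero rather than merely small.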

7 Comments

Jan on 4 Aug 2021

Edited: Jan on 5 Aug 2021

@Michael Stollenwerk: Yes, of course. Simply store the calculated matrices in a persistent cell array:

function D = duplication(n)
persistent C
if isempty(C)
   nCache = 1000;  % Set according to your needs
   C      = cell(1, nCache);
end
if n <= numel(C) && ~isempty(C{n})
   D = C{n};
else
   m   = n * (n + 1) / 2;
   nsq = n^2;
   r   = 1;
   a   = 1;
   v   = zeros(1, nsq);
   cn  = cumsum(n:-1:2);
   for i = 1:n
      v(r:r + i - 2) = i - n + cn(1:i - 1);
      r = r + i - 1;
      
      v(r:r + n - i) = a:a + n - i;
      r = r + n - i + 1;
      a = a + n - i + 1;
   end
   D = sparse(1:nsq, v, 1, nsq, m);
   if n <= numel(C)
      C{n} = D;
   end
end
end

When this function is called the first time for a specific n, D is calculated. On further calls D is taken from the persistently stored list C.

% Timings for i7, Win 10 64, Matlab R2018b:
tic; for k = 1:1000; D = duplication(k); end; toc
tic; for k = 1:1000; D = duplication(k); end; toc
% First run with calculations:
% Elapsed time is 35.022836 seconds.
% Second run with copies from the list:
% Elapsed time is 0.006062 seconds.

Michael Stollenwerk on 5 Aug 2021

@Jan Perfect! Thanks!

Jan on 5 Aug 2021

The list C of my code needs 6 GB of RAM if all 1000 matrices are created. Writing it as a -v7.3 MAT file creates a 1 GB file. Reading it takes 15 seconds on an HDD, compared with 35 seconds for creating all matrices dynamically.
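
The order of magnitude of that RAM figure can be checked with a rough estimate. The sketch below assumes the usual storage model for a sparse double matrix on 64-bit MATLAB: 16 bytes per nonzero (value plus row index) and 8 bytes per column pointer. That model is an assumption here, not something measured in this thread.

```matlab
n     = 1:1000;                          % all cached sizes
nnzD  = n.^2;                            % D has exactly one nonzero per row
ncols = n .* (n + 1) / 2;                % number of columns of D
bytes = 16 * nnzD + 8 * (ncols + 1);     % assumed sparse storage model
fprintf('Estimated total: %.1f GB\n', sum(bytes) / 1e9);
```

Under this model the total comes out at a little under 7 GB, which is in line with the figure quoted above.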
