Efficient way of storing a triangular matrix

29 views (last 30 days)
Hi all,
I have some pretty large upper triangular matrices to store and to be used like this:
As you see, these matrices are full in upper triangle and zeros in lower triangle. However, Matlab is not optimised to store such matrices, using sparse only makes the memory cost slightly larger.
So shall I vectorise these matrices to vectors, such that Matlab only stores the real numbers, and then after the operations I change them back to upper triangular matrices? If I should, is there any efficient functions to achieve this?
Many thanks!

Accepted Answer

John D'Errico
John D'Errico on 17 Feb 2018
Edited: John D'Errico on 17 Feb 2018
If all of your matrices are the same size, it is trivial to store them, extracting the upper triangle.
Just create one vector. It will be a boolean vector, so cheap to store. For example, given a large upper triangular matrix...
clear
N = 5000;
UT = triu(rand(N,N)); % Our sample test matrix
% Now, create a boolean vector.
Boolind = triu(repmat(true,N,N));
Boolind = Boolind(:);
UTvec = UT(Boolind);
whos
Name Size Bytes Class Attributes
Boolind 25000000x1 25000000 logical
N 1x1 8 double
UT 5000x5000 200000000 double
UTvec 12502500x1 100020000 double
As you can see, UTvec is roughly half the size.
You can save UTvec as a vector now. If you have several matrices, no problem. Just save them all as vectors. You can also save the boolean vector, so it need not be recreated each time. But since it is of class logical, it is relatively small potatoes.
Now, can I recreate a new full matrix that is upper triangular and contains that upper triangle?
UTcopy = zeros(N);
UTcopy(Boolind) = UTvec;
max(max(abs(UT - UTcopy)))
ans =
0
It will take some small amount of time to stuff the data into a new array. But if you need to work with it as an array, you do what you need.
If you are creating several such arrays in turn, you can even save the time for the zeros allocation, just overwrite that upper triangle as I did in a simple assignment.
One other point. I don't know what you are doing with the arrays, since you have given us no clue at all. But suppose you wanted to just add two such arrays, when stored in vector form? Just add the vectors! As long as you will do only element-wise operations on only those elements, you can do it in purely vector form. So there might be many things you can do without ever needing to create an upper triangular matrix at all.
  3 Comments
John D'Errico
John D'Errico on 19 Feb 2018
Even if you have multiple matrices, you only need to store ONE boolean vector to locate those elements. So the additional storage required is pretty small, especially since you said that you have multiple such matrices to store.
Of course, you can simply create it on the fly. It is more costly that way in terms of time. More importantly, if you will be creating the vector on the fly, you STILL need to have RAM available to create it in temporary memory.
Xiaohan Du
Xiaohan Du on 19 Feb 2018
Edited: Xiaohan Du on 19 Feb 2018
This is true. I did a test:
clear; clc;
N = 10000;
m1 = triu(rand(N, N)); % Our sample test matrix
m2 = triu(rand(N, N));
%%method 1: store vectors
% transform upper trniangular matrix into Boolean vectors.
Boolind = triu(true(N, N));
Boolind = Boolind(:);
m1vec = m1(Boolind);
m2vec = m2(Boolind);
>> whos
Name Size Bytes Class Attributes
Boolind 100000000x1 100000000 logical
N 1x1 8 double
m1 10000x10000 800000000 double
m1vec 50005000x1 400040000 double
m2 10000x10000 800000000 double
m2vec 50005000x1 400040000 double
Total memory cost is 400040000 * 2 + 100000000 = 900080000 bytes (need to store Boolind, m1vec, m2vec), cost more than David's method due to the need of storing vector Boolind, but slightly faster.

Sign in to comment.

More Answers (1)

David Goodmanson
David Goodmanson on 17 Feb 2018
Edited: David Goodmanson on 17 Feb 2018
Hi Xiaohan,
Since you have several matrices, one option is to store them in pairs as a square matrix (plus one extra column). For upper triangular M1,M2:
n = size(M2,1);
d2 = diag(M2);
M2(1:n+1:n^2) = 0;
M12 = [M1 + M2.' d2]; % n x (n+1) to store
% other direction
d2 = M12(:,end);
M12(:,end) = [];
M1 = triu(M12);
M2 = tril(M12).';
M2(1:n+1:n^2) = d2;
I don't claim that this process is necessarily speedy, but if storage is an issue it does not waste space.
  3 Comments
David Goodmanson
David Goodmanson on 26 Feb 2018
Hi Xiaohan,
I like the visual appeal of storing two matrices as a matrix, but aside from that I think John's method has most of the advantages. Besides its being slightly faster, you can deal with one matrix at a time, not two. Creating Boolind itself is fast enough that I don't think you need to store it, and for several matrices of the same size you only need to create it once. And since you are storing matrices as vectors, within memory limitations you can even stack a few of the storage vectors and make a matrix of that if you wish.
Xiaohan Du
Xiaohan Du on 28 Feb 2018
Hi David,
I agree, John's method is the most straight forward, and for your method, I'll need to invent a way to transfer the square matrices back to triangular matrices, which can be tricky. But still, your method works and thanks a lot for your help!

Sign in to comment.

Categories

Find more on Sparse Matrices in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!