simple copy task much slower with high memory use, workaround possible?
13 views (last 30 days)
All memory usage is as reported by the function "memory".
I found that my algorithm gets much slower after running for some time, and the reason is that Matlab takes longer for a quite simple copy task once it is already using a lot of memory. In my case Matlab uses 9.9 GB of 16 GB RAM. When copying matrices like
tic; for i = 1:10000; t = a(1:n); end; toc
and plotting the resulting time over n, the result is:
[plot: time for 10000 copies over n, 9.9 GB instance]
On a Matlab instance with 5.5 GB RAM usage, the same test gives this instead:
[plot: time for 10000 copies over n, 5.5 GB instance]
So some kind of very time-intensive memory handling sets in with the 9.9 GB instance when copying more than 2000 entries. Why? What can I do to work around that? Using less memory is an option but adds time for file handling, so I'd like to keep using ~10 GB.
Edit: I tried generating enough random data from scratch to reach 9.9 GB usage, and the problem doesn't arise in that case! This seems to be a reproducible bug in memory handling with my specific 9.9 GB of data!
2nd edit: Saved to disk, the "real" 9.9 GB of data make up a 15.2 GB file. The "fake" 9.9 GB of data are only 7.5 GB on disk. Maybe that is important somehow?
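For reference, the timing plots come from a loop roughly like the following. This is only a sketch of the measurement; the source vector here is a small placeholder, and the slowdown itself only shows up with my real 9.9 GB workspace in memory.
a = rand(1, 1e5);            % placeholder; in the real test 'a' is part of the 9.9 GB workspace
nvals = 100:100:6000;        % copy lengths to test (illustrative values)
tcopy = zeros(size(nvals));
for k = 1:numel(nvals)
    n = nvals(k);
    tic;
    for i = 1:10000
        t = a(1:n);          % the simple copy under test
    end
    tcopy(k) = toc;
end
plot(nvals, tcopy);
xlabel('entries copied per call (n)');
ylabel('time for 10000 copies [s]');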
10 Comments
Jan
on 10 Mar 2021
The diagrams are far too small to recognize anything. What are the units?
How do you determine the memory usage of Matlab? Remember that "used memory" is not uniquely defined. When Matlab allocates 1 GB of RAM and the corresponding variable is cleared, the OS does not release the corresponding memory immediately. As long as this memory is available for Matlab, it appears in Matlab's memory usage, although it is not currently used.
I guess in the 2nd diagram you see that Matlab allocates new memory instead of reusing the formerly reserved memory. This can happen when the OS has not yet cleaned the old memory by overwriting it with zeros; then providing "new" RAM is more efficient.
The solution is to avoid unnecessary memory copies.
Jochen Schuettler
on 10 Mar 2021
Edited: Jochen Schuettler
on 10 Mar 2021
Here are both graphs bigger. In the 9.9 GB version, copying more than 2000 entries 10000 times takes more than 7 s, while slightly fewer take almost no time. In the 5.5 GB version, all copying takes less than 50 ms!
I determine memory usage with the Matlab function "memory". (The Windows task manager shows a lot less memory usage.) In any case, 16 GB is far away, so no page file should be involved. Of course, it looks like SSD access instead of RAM access, but in the case of page file use both the Windows task manager and Matlab's "memory" would be far from correct. Also, it doesn't depend on other open processes and their memory use, and the step is always at 2000 entries. The problem also happens in the same way when loading the data from disk into a freshly started Matlab, with no old assigned memory in the background.
I cannot avoid those copies; the example shown here is simplified to narrow down the problem. I could parcel them into smaller units, but a real solution would be better - I mean, there must be a reason, something you could find if I gave you my 9.9 GB of data, right? Because a random 9.9 GB of data doesn't have the same effect.
[enlarged versions of the two timing plots]
Jan
on 11 Mar 2021
In the 9.9 GB version, copying more than 2000 entries 10000 times takes more than 7 s, while slightly fewer take almost no time. In the 5.5 GB version, all copying takes less than 50 ms!
I still do not understand what the "9.9" and "5.5 GB versions" are. What does "9.9GB-Matlab" mean?
This is also not clear to me: "saved to disk, the 'real' 9.9 GB data make up 15.2 GB". How do you save the data, and why are they larger?
copying more than 2000 entries 10000 times takes more than 7s, while slightly less takes almost nothing
This sounds like an effect of the cache size: the CPU cache is very fast compared to the RAM. If the data are already available there, the time for reloading from the slower RAM is saved completely. Of course this then takes almost no time.
Jochen Schuettler
on 11 Mar 2021
I meant that my program ran up to the point where I checked the reason for the slow speed and found it in the copying line. It had amassed 9.9 GB in RAM by then. I compared it to the same program stopped earlier, when execution efficiency was still good; by then it had amassed 5.5 GB in RAM. Still later I created 9.9 GB of "fake" random data to check against, which also leads to fast copying, and named my program data "real" in comparison.
In all cases I use the Matlab function "memory" to find out the memory usage. I saved the whole workspace to disk with Matlab's "save <filename>". On disk, my "real" 9.9 GB lead to a file of 15.2 GB, while the "fake" data lead to only 7.5 GB in file size. I don't know why that is, but I guessed it could be related to the different effect on the copying speed.
Concerning the cache: yes, that would explain the effect, but why can the cache hold many more entries when the RAM holds the "fake" data or the 5.5 GB data?!?
Jan
on 11 Mar 2021
On disk, my "real" 9.9 GB lead to file of 15.2 GB size, while the "fake" leads to only 7.5 GB in file size.
This is so confusing that I have problems concentrating on the rest. If Matlab stores 9.9 GB of RAM in a 15.2 GB MAT file, there is a fundamental problem. With the -v7.3 format, the save command compresses the data; I would be surprised if this increases the size.
I do not understand what "fake" data are. For Matlab, all data are numbers.
Jochen Schuettler
on 12 Mar 2021
Is there a possibility to speak in person? I have the feeling that would be so much more productive with both of our time!
Jochen Schuettler
on 15 Mar 2021
Ok, I'll try once more and skip confusing parts:
- My RAM data is 9.9 GB as shown by Matlab's 'memory', on a 16 GB system. Saved to disk with 'save workspace_slow -v7.3', the file is 15.2 GB big. I don't know why. Perhaps Matlab's 'memory' is wrong somehow? How can I find out?
- The Windows task manager shows a lot less RAM use than Matlab's 'memory'. Why?
- Having that data in RAM leads to very slow copying of more than 2000 entries at once. Copying smaller packages with for-loops is a lot quicker, but the one-line copy was even quicker before my memory was so full.
So, having less data in RAM, or random numbers of 9.9 GB size (again measured with Matlab's 'memory'), is quick and takes up less space on disk. But having more than 10 GB in memory is intentional. So I use for-loops as my workaround now and still need a real solution.
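The for-loop workaround looks roughly like this (the chunk size is arbitrary here, just kept below the ~2000-entry threshold; a and n are the source vector and copy length from above):
chunk = 1000;                     % placeholder chunk size, below the ~2000-entry step
t = zeros(1, n);                  % preallocate the target once
for s = 1:chunk:n
    e = min(s + chunk - 1, n);
    t(s:e) = a(s:e);              % copy one small package at a time
end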
Jan
on 15 Mar 2021
If the 9.9 GB of RAM are occupied by one variable, it is stored as a contiguous block and the rest of the RAM is free (except for the RAM used by the OS and other programs). If you create thousands of variables in Matlab with a sum of 9.9 GB, the RAM can be fragmented, e.g. by having free blocks of 1 MB between the variables. Then the sum of the free RAM can be large, but there is no space to store a variable of 2 MB anywhere.
Having fragmented memory is a serious problem, and there is no easy way to solve it. Therefore it is good programming practice to avoid it, e.g. the iterative growing of arrays has to be avoided.
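As a small illustration of that advice (the size is arbitrary), compare growing an array element by element with preallocating it:
N = 1e5;
x = [];                   % grown on every iteration - forces repeated reallocation
for k = 1:N
    x(end+1) = k;
end
y = zeros(1, N);          % preallocated once and filled in place
for k = 1:N
    y(k) = k;
end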
The 15.2 GB MAT file is really strange. Maybe it contains a lot of figure handles? Note that WHOS claims that a figure uses only 8 bytes, because this is the memory of the handle only. Storing this handle on disk writes the contents of the figure as well, which takes much more space.
It would be useful if you provide a minimal working example. You have posted some code, which does not produce the shown diagrams. My tests with this code did not show any suspicious behaviour in Matlab R2018b:
figure;
axes('NextPlot', 'add', 'XScale', 'log');
a = rand(1, 1e6);
for n = 1:0.5:6
    len = round(10^n);
    tic
    for i = 1:10000
        t = a(1:len);
    end
    b = toc;
    plot(len, b/len, 'o');
    drawnow;
end
Walter Roberson
on 15 Mar 2021
-v7.3 files are stored in HDF5, and there is a notable amount of overhead for container datatypes such as cell arrays or struct; storing pure numeric arrays is not nearly as bad.
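A quick, purely illustrative way to see this overhead (test size and file names are arbitrary): save the same numbers once as a numeric array and once wrapped in cells, and compare the file sizes.
x = rand(1, 1e5);                                 % plain numeric data
c = num2cell(x);                                  % same numbers, one cell per element
save('plain_test.mat', 'x', '-v7.3');
save('cells_test.mat', 'c', '-v7.3');
dx = dir('plain_test.mat');
dc = dir('cells_test.mat');
fprintf('numeric: %.1f MB, cell: %.1f MB\n', dx.bytes/2^20, dc.bytes/2^20);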
Jochen Schuettler
on 15 Mar 2021
Edited: Jochen Schuettler
on 16 Mar 2021
There is just one small figure involved; Walter's comment is correct concerning the workspace file size. Jan's comment is likely correct concerning memory fragmentation as the reason for my problem while the program runs. More on that below. Big BUT: shouldn't saving the workspace and loading it from disk into a fresh instance lead to unfragmented memory? At least growing is not a problem then, but the structure of my variables is the same, of course.
I cannot show it with minimal code; the problem is not there without my big memory usage. I could give you the 15.2 GB for download, but as the upload is a problem, I'd only do it if you are willing to use it. I could also give you my full code, but it needs weeks to build up to this size, which I guess is not possible. [Edit: see my newest answer, I have a short code example there!]
The full project including all saved files is about 18 GB and growing; that's why I need to save parts to the SSD and load them again later when I need them. I have heuristics to guess which entries will not be needed again soon, to mitigate the loss from file access speed. If that leads to memory fragmentation, I don't know what to do to change it.
I can describe in more detail. In the workspace there are 2 big variables involved, let's call them A and B:
A is a 1x4 cell array. A{1} is a large x 1 sparse matrix, which seldom gets additional entries. A{2} is a 1 x 660+ cell array, grown from 1 x 10 and growing seldom. Each cell is a large x 1 sparse matrix, which seldom gets additional entries. A{3} is a 1 x verylarge+ cell array, grown from 1 x 100 and growing quickly. Each cell is an 83195x1 (almost) empty sparse matrix, which very seldom gets additional entries. A{4} is small and of constant size.
Both A{2} and A{3} have a structure like "1 x large" times "other_large x 1 sparse matrix", so you could view each as an "other_large x large sparse matrix". In the beginning, they were exactly that. But growing (adding columns), saving columns to disk, or loading columns back from disk was very time-consuming. Encapsulating the columns into cells made it quick, so that is what I did.
B is a 62107+ x 1 cell array, grown from 10 x 1. Many of these cells are empty (saved to disk); the others are 1 x 4 cell arrays, each containing a 10000x1 uint32, a large x 6 uint8, a large x 2 int16 and a 2x1 double. So again I used cell arrays to make saving/loading of parts quicker, only the parts are not sparse columns but cell arrays of quite full matrices of different types, to better utilise RAM.
Whenever A or B hits 5 GB, as measured with 'whos', I save 2.2 GB of cells (only from A{3} and B) to files and clear the space afterwards.
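Roughly, that save-and-clear step looks like the following sketch, here shown for B only (the threshold check, file name and index selection are simplified placeholders, not my real heuristics):
s = whos('A', 'B');
if any([s.bytes] > 5 * 2^30)                  % one of them above ~5 GB
    idx = 1:100;                              % in reality chosen by the heuristic
    part = B(idx);
    save('B_part_001.mat', 'part', '-v7.3');  % park the cells on the SSD
    B(idx) = {[]};                            % clear the saved cells in RAM
end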
Instead of growing columns (now cells) onto A{2}, A{3} and B, I could directly make them larger. But I don't know how large. I could still use bigger chunks for growing. But as I save and load middle cells again and again, and also add entries in the sparse matrices, I guess that is the more important problem.
What could I do to
- still save and load parts of A{3} and B to disk, as I simply cannot keep all of it in memory.
- still grow columns (cells now) to them, while I can't know how many I'll need in the end.
- still add entries into the sparse parts, while I can't know how many I'll need in the end.
- do it in a time-efficient way
- and without strong memory fragmentation
Answers (2)
Jochen Schuettler
on 11 Mar 2021
Edited: Jochen Schuettler
on 11 Mar 2021
Parcelling the larger equation into a nested for loop, which is effectively called 48 times and uses 384 entries, worked to make the calculation quick!
I realize your comment about the memory allocation gave the correct reason: Matlab calls for additional memory for these 48x384 entries, and the memory is freed again afterwards. So the freeing is the problem, and it might be part of Windows' strategy to either keep the memory with the process or not.
But: Why does the random 9.9 GB of data behave differently from my real data?
And: Why does the task manager show so much less memory use than Matlab's "memory"?
And: Would it help to declare the intermediate data as global/persistent, so we don't need for-loops?
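What I have in mind is roughly this sketch (function and variable names are just placeholders): the intermediate is declared persistent, allocated once and reused on every call.
function y = accumulateWithBuffer(a, n)
% Sketch of the persistent-intermediate idea (names are placeholders):
% 'buf' is allocated once and reused on every call, instead of being
% created and freed again inside the loop each time.
persistent buf
if isempty(buf) || numel(buf) < n
    buf = zeros(1, n);      % one-time allocation
end
buf(1:n) = a(1:n);          % fill the reused intermediate
y = sum(buf(1:n));          % use it for whatever the calculation needs
end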
4 Comments
Jochen Schuettler
on 11 Mar 2021
I can already state that a global variable makes the calculation faster than the for-loop. But to validate that the efficiency continues, the program needs to run for a day again.
Jochen Schuettler
on 11 Mar 2021
Using the saved data I could check - the for-loop version is the fastest in the end.
Jan
on 15 Mar 2021
Why does the task manager show so much less memory use than Matlab's "memory"?
The term "memory usage" is not uniquely defined.
x = rand(1, 1e6);
x = 5
Now the large memory block reserved by the first command is freed, but maybe the memory manager of the OS has not overwritten it with zeros yet, and therefore it is not available for other programs. Now it is a question of taste whether this memory belongs to Matlab or not. In theory the OS can decide on its own when to clear the contents; usually it does this only on demand or when idle.
In addition the OS stores a file cache for each program. Does this belong to the application or to the operating system?
Would it help to declare the intermediate data as global/persistent, so we don't need for-loops?
Maybe. You did not post a section of your code which would clarify exactly what you are doing, so I can only speculate. It is more reliable if you try this. Remember that the behavior can change with the Matlab version, the OS and the available free RAM.
Jochen Schuettler
on 15 Mar 2021
Thank you for comment-answering my answer here, Jan. Global variables did not help in my case, sadly.
Concerning the memory management part: I'd understand if Matlab were showing somewhat less memory than Windows, but MORE?
Windows Task Manager: 4.something GB - Matlab 'memory': 9.9 GB
Jochen Schuettler
on 16 Mar 2021
Edited: Jochen Schuettler
on 16 Mar 2021
Jan, you asked twice for a minimal code example. I offered to upload 15.2 GB or let you run my code for weeks. Now I wrote a short piece of code that puts variables of the same size and structure into memory as my real problem. At first I tried to fill in semi-meaningful random values as well, but I guess that's not necessary and replaced them by ones. I left them in as comments, so you could put them back in.
In the end, after all the memory building and filling, there is a short loop producing the above figures. At least it should happen that way; the code is still running... Is there any way to let randperm and mat2cell produce a progress message?
My machine is x64, 16GB, OS is Windows 10 20H2, Matlab is R2019a.
A = cell(1,4);
% A{1}: one large sparse column with few nonzeros
l(1) = 24043282;
l(2) = 654;
A{1} = sparse(l(1),1);
A{1}(randperm(l(1),l(2))) = 1; %[1 randperm(l(2)-2,l(2)-2)+1 l(2)];
% A{2}: 660 large sparse columns, wrapped into cells
l(3) = 660;
l(4) = 23580752;
temp = sparse(l(1),l(3));
temp(randperm(l(2)*l(1),l(4))) = 1; %[1 randperm(l(4)-2,l(4)-2) l(4)];
A{2} = mat2cell(temp,l(1),ones(l(3),1));
% A{3}: millions of almost-empty sparse columns, wrapped into cells
l(5) = 83195;
l(6) = 7368740;
l(7) = 109051176;
l(8) = 621077181;
temp = sparse(l(5),l(6));
temp(randperm(l(5)*l(6),l(7))) = 1; %[1 randperm(l(8)-2,l(7)-2)+1 l(8)]/10000;
A{3} = mat2cell(temp,l(5),ones(l(6),1));
A{4} = ones(1,4);
clear temp;
% B: large cell array, most cells empty, the rest 1x4 cells of full matrices
l(9) = fix(l(8)/10000);
l(10) = 10000;
l(11) = 7272;
std_l2 = 3.2744e4;
mean_l2 = 2.693e5;
l2 = round(randn(l(11),1)*std_l2+mean_l2);
I2 = [randperm(l(9)-1,l(11)-1) l(9)];
B = cell(l(9),1);
for i = 1:l(11)
    I2i = I2(i);
    B{I2i} = cell(1,4);
    B{I2i}{1} = ones(l(10),1,'uint32'); %uint32([1 randperm(l2(i)-1,l(10)-1)+1]');
    B{I2i}{2} = ones(l2(i),6,'uint8');
    B{I2i}{3} = ones(l2(i),2,'int16');
    B{I2i}{4} = ones(1,2);
end
% C and D: further workspace variables
l(12) = 233209050;
l(13) = 6957970;
C = zeros(l(12),1,'int32');
C(randperm(l(12),l(13))) = 1; %[randperm(l(13)-1,l(13)-1) l(13)];
D = ones(384,17);
% timing loop producing the figures above
tv = [];
for n = 1:400:6000
    tic; for i = 1:10000; t = D(1:n); end; tv(end+1) = toc;
end
plot(1:400:6000,tv);
7 Comments
Jochen Schuettler
on 17 Mar 2021
After letting this code run for a day, I'm having doubts that it will finish soon. There seem to be efficiency problems with randperm and/or mat2cell at such high numbers. I don't know what else I could do to let you experience the problem yourself. I offer once more to meet virtually, so you could test it on my computer.
Generelly the "memory fragmentation" explanation sounds right and is likely caused by the specific structure of the main variables, as I tried to recreate here. Perhaps you can glean information from the code and/or my explanation above (15 Mar 2021 at 14:00) and from that tell me a better variable structure, holding the same information, with the possibility to insert data, and to save, clear and reload parts of the data time-efficiently, without (strong) memory fragmentation.
Jan
on 17 Mar 2021
Edited: Jan
on 17 Mar 2021
I have a hard time reading e.g. such lines:
I2 = [randperm(l(9)-1,l(11)-1) l(9)];
for i = 1:l(11)
To avoid mistakes I recommend not mixing "1"s and "l"s.
But Matlab is not impeded by this. I'm not sure there is a better strategy to represent your data. I try to avoid nested cells, but sometimes they are the most efficient way.
Therefore I have a very cheap idea only:
If memory is scarce, install more RAM. If you work with 10 GB of data, having 16 GB of RAM is tight. If Matlab starts to use virtual memory, this slows down the speed massively. 16 GB of additional DDR4 RAM costs 72€ currently. With more RAM the problem of fragmentation is less severe.
Jochen Schuettler
on 18 Mar 2021
Edited: Jochen Schuettler
on 18 Mar 2021
This runs in 1-2 hours:
A = cell(1,4);
n(1) = 24043282;
n(2) = 654;
A{1} = sparse(n(1),1);
A{1}(randperm(n(1),n(2))) = 1;%[1 randperm(n(2)-2,n(2)-2)+1 n(2)];
n(3) = 660;
n(4) = 23580752;
n(5) = 6358246;
I = randperm(n(5)*n(2),n(4));
[r,c] = ind2sub([n(5),n(2)],I);
I = randperm(n(1),n(5));
r = I(r);
temp = sparse(r,c,ones(size(r)),n(1),n(3));
A{2} = mat2cell(temp,n(1),ones(n(3),1));
n(6) = 83195;
n(7) = 7368740;
n(8) = 109051176;
n(9) = 621077181;
I = randperm(n(6)*n(7),n(8));
[r,c] = ind2sub([n(6),n(7)],I);
temp = sparse(r,c,ones(size(r)),n(6),n(7));
A{3} = mat2cell(temp,n(6),ones(n(7),1));
A{4} = ones(1,4);
clear temp;
n(9) = 62107;
n(10) = 10000;
n(11) = 7272;
std_n2 = 3.2744e4;
mu_n2 = 2.693e5;
n2 = round(randn(n(11),1)*std_n2 + mu_n2);
I2 = [randperm(n(9)-1,n(11)-1) n(9)];
B = cell(n(9),1);
for i = 1:n(11)
    I2i = I2(i);
    B{I2i} = cell(1,4);
    B{I2i}{1} = ones(n(10),1,'uint32');
    B{I2i}{2} = ones(n2(i),6,'uint8');
    B{I2i}{3} = ones(n2(i),2,'int16');
    B{I2i}{4} = ones(1,2);
end
n(12) = 233209050;
n(13) = 6957970;
C = zeros(n(12),1,'int32');
C(randperm(n(12),n(13))) = 1;
D = ones(384,17);
tv = [];
for k = 1:400:6000   % loop variable k instead of n, so the n(...) vector is not overwritten
    tic; for i = 1:10000; t = D(1:k); end; tv(end+1) = toc;
end
plot(1:400:6000,tv);
But I don't see the effect I'm experiencing with my real problem, so it's not caused by the variable structure alone. Also, 'memory' reports 28 GB with this instead of 9.9 GB. I don't know why the structurally similar data take so much more space in this example.
Concerning your idea about 32 GB: yes, I'll buy 2 SO-DIMMs of 16 GB each, which costs 140€ including shipping. More is not possible in my laptop, and my tower is maxed out at 4x4 GB.
Jan
on 19 Mar 2021
"I don't know, why the structurally similar data takes so much more space with this example."
The OS provides the available memory to an application when this is possible.
If a problem requires a lot of RAM, there are some tricks, but nothing is as efficient as running the code on a machine with enough resources.
Jochen Schuettler
on 19 Mar 2021
Edited: Jochen Schuettler
on 19 Mar 2021
Also, I included a pragmatic workaround, now that I know the one-line copy is fastest with low memory use and the for-loop copy is fastest with high memory use: I simply do 1000 copies with both methods, measure the time and set a flag. That takes << 1 s with low memory and ~2 s in the high-memory case, in practice developing from one to the other. For roughly the next 2 minutes the method proven faster is used. Repeat.
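As a sketch (the trial count, chunk size and variable names are placeholders; a and n stand for the source vector and copy length of the actual copy):
trial = 1000;
chunk = 1000;
tic; for i = 1:trial; t = a(1:n); end; tOneLine = toc;     % time the one-line copy
tic
for i = 1:trial
    for s = 1:chunk:n
        e = min(s + chunk - 1, n);
        t(s:e) = a(s:e);                                   % time the for-loop copy
    end
end
tForLoop = toc;
useForLoop = tForLoop < tOneLine;   % flag: use the faster method for the next ~2 minutes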
Jan
on 19 Mar 2021
In this forum "closing" means, that a question is removed soon. So we close questions only, if they contain too few information to be answered.