simple copy task much slower with high memory use, workaround possible?
13 views (last 30 days)
All memory usage is as reported by the function "memory".
I found that my algorithm gets much slower after running for some time, and the reason is that Matlab takes longer for a quite simple copy task once it is already using a lot of memory. In my case Matlab uses 9.9 GB of 16 GB RAM. When copying matrices like
tic; for i = 1:10000; t = a(1:n); end; toc
and plotting the resulting time over n, the result is:
[plot: time for 10000 copies over n, 9.9 GB instance]
On a Matlab instance with 5.5 GB RAM usage, the same test gives this instead:
[plot: time for 10000 copies over n, 5.5 GB instance]
So some kind of very time-intensive memory handling sets in with the 9.9 GB instance when copying more than 2000 entries. Why? What can I do to work around that? Using less memory is an option but adds time for file handling, so I'd like to keep using ~10 GB.
Edit: I tried generating enough random data from scratch to reach 9.9 GB usage, and the problem doesn't arise in that case! This seems to be a reproducible bug in memory handling with my specific 9.9 GB of data!
2nd edit: Saved to disk, the "real" 9.9 GB of data make up a 15.2 GB file. The "fake" 9.9 GB of data are only 7.5 GB on disk. Maybe that is important somehow?
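For reference, the timing plots come from a loop roughly like the following. This is only a sketch of the measurement; the source vector here is a small placeholder, and the slowdown itself only shows up with my real 9.9 GB workspace in memory.
a = rand(1, 1e5);            % placeholder; in the real test 'a' is part of the 9.9 GB workspace
nvals = 100:100:6000;        % copy lengths to test (illustrative values)
tcopy = zeros(size(nvals));
for k = 1:numel(nvals)
    n = nvals(k);
    tic;
    for i = 1:10000
        t = a(1:n);          % the simple copy under test
    end
    tcopy(k) = toc;
end
plot(nvals, tcopy);
xlabel('entries copied per call (n)');
ylabel('time for 10000 copies [s]');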
10 Comments
Jan
on 10 Mar 2021
The diagrams are far too small to recognize anything. What are the units?
How do you determine the memory usage of Matlab? Remember that "used memory" is not uniquely defined. When Matlab allocates 1 GB of RAM and the corresponding variable is cleared, the OS does not release the corresponding memory immediately. As long as this memory is available for Matlab, it appears in Matlab's memory usage, although it is not currently used.
I guess in the 2nd diagram you see that Matlab allocates new memory instead of reusing the formerly reserved memory. This can happen when the OS has not yet cleaned the old memory by overwriting it with zeros; then providing "new" RAM is more efficient.
The solution is to avoid unnecessary memory copies.
Jochen Schuettler
on 10 Mar 2021
Edited: Jochen Schuettler
on 10 Mar 2021
Here are both graphs bigger. In the 9.9 GB version, copying more than 2000 entries 10000 times takes more than 7 s, while slightly fewer take almost no time. In the 5.5 GB version, all copying takes less than 50 ms!
I determine memory usage with the Matlab function "memory". (The Windows task manager shows a lot less memory usage.) In any case, 16 GB is far away, so no page file should be involved. Of course, it looks like SSD access instead of RAM access, but in the case of page file use both the Windows task manager and Matlab's "memory" would be far from correct. Also, it doesn't depend on other open processes and their memory use, and the step is always at 2000 entries. The problem also happens in the same way when loading the data from disk into a freshly started Matlab, with no old assigned memory in the background.
I cannot avoid those copies; the example shown here is simplified to narrow down the problem. I could parcel them into smaller units, but a real solution would be better - I mean, there must be a reason, something you could find if I gave you my 9.9 GB of data, right? Because a random 9.9 GB of data doesn't have the same effect.
[enlarged versions of the two timing plots]
Jan
on 11 Mar 2021
In the 9.9 GB version, copying more than 2000 entries 10000 times takes more than 7 s, while slightly fewer take almost no time. In the 5.5 GB version, all copying takes less than 50 ms!
I still do not understand what the "9.9" and "5.5 GB versions" are. What does "9.9GB-Matlab" mean?
This is also not clear to me: "saved to disk, the 'real' 9.9 GB data make up 15.2 GB". How do you save the data, and why are they larger?
copying more than 2000 entries 10000 times takes more than 7s, while slightly less takes almost nothing
This sounds like an effect of the cache size: the CPU cache is very fast compared to the RAM. If the data are already available there, the time for reloading from the slower RAM is saved completely. Of course this then takes almost no time.
Jochen Schuettler
on 11 Mar 2021
I meant that my program ran up to the point where I checked the reason for the slow speed and found it in the copying line. It had amassed 9.9 GB in RAM by then. I compared it to the same program stopped earlier, when execution efficiency was still good; by then it had amassed 5.5 GB in RAM. Still later I created 9.9 GB of "fake" random data to check against, which also leads to fast copying, and named my program data "real" in comparison.
In all cases I use the Matlab function "memory" to find out the memory usage. I saved the whole workspace to disk with Matlab's "save <filename>". On disk, my "real" 9.9 GB lead to a file of 15.2 GB, while the "fake" data lead to only 7.5 GB in file size. I don't know why that is, but I guessed it could be related to the different effect on the copying speed.
Concerning the cache: yes, that would explain the effect, but why can the cache hold many more entries when the RAM holds the "fake" data or the 5.5 GB data?!?
Jan
on 11 Mar 2021
On disk, my "real" 9.9 GB lead to file of 15.2 GB size, while the "fake" leads to only 7.5 GB in file size.
This is so confusing that I have problems concentrating on the rest. If Matlab stores 9.9 GB of RAM in a 15.2 GB MAT file, there is a fundamental problem. With the -v7.3 format, the save command compresses the data; I would be surprised if this increases the size.
I do not understand what "fake" data are. For Matlab, all data are numbers.
Jochen Schuettler
on 12 Mar 2021
Is there a possibility to speak in person? I have the feeling that would be so much more productive with both of our time!
Jochen Schuettler
on 15 Mar 2021
Ok, I'll try once more and skip confusing parts:
- My RAM data is 9.9 GB as shown by Matlab's 'memory', on a 16 GB system. Saved to disk with 'save workspace_slow -v7.3', the file is 15.2 GB big. I don't know why. Perhaps Matlab's 'memory' is wrong somehow? How can I find out?
- The Windows task manager shows a lot less RAM use than Matlab's 'memory'. Why?
- Having that data in RAM leads to very slow copying of more than 2000 entries at once. Copying smaller packages with for-loops is a lot quicker, but the one-line copy was even quicker before my memory was so full.
So, having less data in RAM, or random numbers of 9.9 GB size (again measured with Matlab's 'memory'), is quick and takes up less space on disk. But having more than 10 GB in memory is intentional. So I use for-loops as my workaround now and still need a real solution.
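The for-loop workaround looks roughly like this (the chunk size is arbitrary here, just kept below the ~2000-entry threshold; a and n are the source vector and copy length from above):
chunk = 1000;                     % placeholder chunk size, below the ~2000-entry step
t = zeros(1, n);                  % preallocate the target once
for s = 1:chunk:n
    e = min(s + chunk - 1, n);
    t(s:e) = a(s:e);              % copy one small package at a time
end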
Jan
on 15 Mar 2021
If the 9.9 GB of RAM are occupied by one variable, it is stored as a contiguous block and the rest of the RAM is free (except for the RAM used by the OS and other programs). If you create thousands of variables in Matlab with a sum of 9.9 GB, the RAM can be fragmented, e.g. by having free blocks of 1 MB between the variables. Then the sum of the free RAM can be large, but there is no space to store a variable of 2 MB anywhere.
Having fragmented memory is a serious problem, and there is no easy way to solve it. Therefore it is good programming practice to avoid it, e.g. the iterative growing of arrays has to be avoided.
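As a small illustration of that advice (the size is arbitrary), compare growing an array element by element with preallocating it:
N = 1e5;
x = [];                   % grown on every iteration - forces repeated reallocation
for k = 1:N
    x(end+1) = k;
end
y = zeros(1, N);          % preallocated once and filled in place
for k = 1:N
    y(k) = k;
end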
The 15.2 GB MAT file is really strange. Maybe it contains a lot of figure handles? Note that WHOS claims that a figure uses only 8 bytes, because this is the memory of the handle only. Storing this handle on disk writes the contents of the figure as well, which takes much more space.
It would be useful if you provide a minimal working example. You have posted some code, which does not produce the shown diagrams. My tests with this code did not show any suspicious behaviour in Matlab R2018b:
figure;
axes('NextPlot', 'add', 'XScale', 'log');
a = rand(1, 1e6);
for n = 1:0.5:6
    len = round(10^n);
    tic
    for i = 1:10000
        t = a(1:len);
    end
    b = toc;
    plot(len, b/len, 'o');
    drawnow;
end
Walter Roberson
on 15 Mar 2021
-v7.3 files are stored in HDF5, and there is a notable amount of overhead for container datatypes such as cell arrays or struct; storing pure numeric arrays is not nearly as bad.
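A quick, purely illustrative way to see this overhead (test size and file names are arbitrary): save the same numbers once as a numeric array and once wrapped in cells, and compare the file sizes.
x = rand(1, 1e5);                                 % plain numeric data
c = num2cell(x);                                  % same numbers, one cell per element
save('plain_test.mat', 'x', '-v7.3');
save('cells_test.mat', 'c', '-v7.3');
dx = dir('plain_test.mat');
dc = dir('cells_test.mat');
fprintf('numeric: %.1f MB, cell: %.1f MB\n', dx.bytes/2^20, dc.bytes/2^20);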
Jochen Schuettler
on 15 Mar 2021
Edited: Jochen Schuettler
on 16 Mar 2021
There is just one small figure involved; Walter's comment is correct concerning the workspace file size. Jan's comment is likely correct concerning memory fragmentation as the reason for my problem while the program runs. More on that below. Big BUT: shouldn't saving the workspace and loading it from disk into a fresh instance lead to unfragmented memory? At least growing is not a problem then, but the structure of my variables is the same, of course.
I cannot show it with minimal code; the problem is not there without my big memory usage. I could give you the 15.2 GB for download, but as the upload is a problem, I'd only do it if you are willing to use it. I could also give you my full code, but it needs weeks to build up to this size, which I guess is not possible. [Edit: see my newest answer, I have a short code example there!]
The full project including all saved files is about 18 GB and growing; that's why I need to save parts to the SSD and load them again later when I need them. I have heuristics to guess which entries will not be needed again soon, to mitigate the loss from file access speed. If that leads to memory fragmentation, I don't know what to do to change it.
I can describe in more detail. In the workspace there are 2 big variables involved, let's call them A and B:
A is a 1x4 cell array. A{1} is a large x 1 sparse matrix, which seldom gets additional entries. A{2} is a 1 x 660+ cell array, grown from 1 x 10 and growing seldom. Each cell is a large x 1 sparse matrix, which seldom gets additional entries. A{3} is a 1 x verylarge+ cell array, grown from 1 x 100 and growing quickly. Each cell is an 83195x1 (almost) empty sparse matrix, which very seldom gets additional entries. A{4} is small and of constant size.
Both A{2} and A{3} have a structure like "1 x large" times "other_large x 1 sparse matrix", so you could view each as an "other_large x large sparse matrix". In the beginning, they were exactly that. But growing (adding columns), saving columns to disk, or loading columns back from disk was very time-consuming. Encapsulating the columns into cells made it quick, so that is what I did.
B is a 62107+ x 1 cell array, grown from 10 x 1. Many of these cells are empty (saved to disk); the others are 1 x 4 cell arrays, each containing a 10000x1 uint32, a large x 6 uint8, a large x 2 int16 and a 2x1 double. So again I used cell arrays to make saving/loading of parts quicker, only the parts are not sparse columns but cell arrays of quite full matrices of different types, to better utilise RAM.
Whenever A or B hits 5 GB, as measured with 'whos', I save 2.2 GB of cells (only from A{3} and B) to files and clear the space afterwards.
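Roughly, that save-and-clear step looks like the following sketch, here shown for B only (the threshold check, file name and index selection are simplified placeholders, not my real heuristics):
s = whos('A', 'B');
if any([s.bytes] > 5 * 2^30)                  % one of them above ~5 GB
    idx = 1:100;                              % in reality chosen by the heuristic
    part = B(idx);
    save('B_part_001.mat', 'part', '-v7.3');  % park the cells on the SSD
    B(idx) = {[]};                            % clear the saved cells in RAM
end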
Instead of growing columns (now cells) onto A{2}, A{3} and B, I could directly make them larger. But I don't know how large. I could still use bigger chunks for growing. But as I save and load middle cells again and again, and also add entries in the sparse matrices, I guess that is the more important problem.
What could I do to
- still save and load parts of A{3} and B to disk, as I simply cannot keep all of it in memory.
- still grow columns (cells now) to them, while I can't know how many I'll need in the end.
- still add entries into the sparse parts, while I can't know how many I'll need in the end.
- do it in a time-efficient way
- and without strong memory fragmentation
Answers (2)
Jochen Schuettler
on 11 Mar 2021
Edited: Jochen Schuettler
on 11 Mar 2021
Parcelling the larger equation into a nested for loop, which is effectively called 48 times and uses 384 entries, worked to make the calculation quick!
I realize your comment about the memory allocation gave the correct reason: Matlab calls for additional memory for these 48x384 entries, and the memory is freed again afterwards. So the freeing is the problem, and it might be part of Windows' strategy to either keep the memory with the process or not.
But: Why does the random 9.9 GB of data behave differently from my real data?
And: Why does the task manager show so much less memory use than Matlab's "memory"?
And: Would it help to declare the intermediate data as global/persistent, so we don't need for-loops?
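What I have in mind is roughly this sketch (function and variable names are just placeholders): the intermediate is declared persistent, allocated once and reused on every call.
function y = accumulateWithBuffer(a, n)
% Sketch of the persistent-intermediate idea (names are placeholders):
% 'buf' is allocated once and reused on every call, instead of being
% created and freed again inside the loop each time.
persistent buf
if isempty(buf) || numel(buf) < n
    buf = zeros(1, n);      % one-time allocation
end
buf(1:n) = a(1:n);          % fill the reused intermediate
y = sum(buf(1:n));          % use it for whatever the calculation needs
end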
4 Comments
Jochen Schuettler
on 11 Mar 2021
I can already state that a global variable makes the calculation faster than the for-loop. But to validate that the efficiency continues, the program needs to run for a day again.
Jochen Schuettler
on 11 Mar 2021
Using the saved data I could check - the for-loop version is the fastest in the end.
Jan
on 15 Mar 2021
Why does the task manager show so much less memory use than Matlab's "memory"?
The term "memory usage" is not uniquely defined.
x = rand(1, 1e6);
x = 5
Now the large memory block reserved by the first command is freed, but maybe the memory manager of the OS has not overwritten it with zeros yet, and therefore it is not available for other programs. Now it is a question of taste whether this memory belongs to Matlab or not. In theory the OS can decide on its own when to clear the contents; usually it does this only on demand or when idle.
In addition the OS stores a file cache for each program. Does this belong to the application or to the operating system?
Would it help to declare the intermediate data as global/persistent, so we don't need for-loops?
Maybe. You did not post a section of your code which would clarify exactly what you are doing, so I can only speculate. It is more reliable if you try this. Remember that the behavior can change with the Matlab version, the OS and the available free RAM.
Jochen Schuettler
on 15 Mar 2021
Thank you for comment-answering my answer here, Jan. Global variables did not help in my case, sadly.
Concerning the memory management part: I'd understand if Matlab were showing somewhat less memory than Windows, but MORE?
Windows Task Manager: 4.something GB - Matlab 'memory': 9.9 GB
Jochen Schuettler
on 16 Mar 2021
Edited: Jochen Schuettler
on 16 Mar 2021
Jan, you asked twice for a minimal code example. I offered to upload 15.2 GB or let you run my code for weeks. Now I wrote a short piece of code that puts variables of the same size and structure into memory as my real problem. At first I tried to fill in semi-meaningful random values as well, but I guess that's not necessary and replaced them by ones. I left them in as comments, so you could put them back in.
In the end, after all the memory building and filling, there is a short loop producing the above figures. At least it should happen that way; the code is still running... Is there any way to let randperm and mat2cell produce a progress message?
My machine is x64, 16GB, OS is Windows 10 20H2, Matlab is R2019a.
A = cell(1,4);
% A{1}: one large sparse column with few nonzeros
l(1) = 24043282;
l(2) = 654;
A{1} = sparse(l(1),1);
A{1}(randperm(l(1),l(2))) = 1; %[1 randperm(l(2)-2,l(2)-2)+1 l(2)];
% A{2}: 660 large sparse columns, wrapped into cells
l(3) = 660;
l(4) = 23580752;
temp = sparse(l(1),l(3));
temp(randperm(l(2)*l(1),l(4))) = 1; %[1 randperm(l(4)-2,l(4)-2) l(4)];
A{2} = mat2cell(temp,l(1),ones(l(3),1));
% A{3}: millions of almost-empty sparse columns, wrapped into cells
l(5) = 83195;
l(6) = 7368740;
l(7) = 109051176;
l(8) = 621077181;
temp = sparse(l(5),l(6));
temp(randperm(l(5)*l(6),l(7))) = 1; %[1 randperm(l(8)-2,l(7)-2)+1 l(8)]/10000;
A{3} = mat2cell(temp,l(5),ones(l(6),1));
A{4} = ones(1,4);
clear temp;
% B: large cell array, most cells empty, the rest 1x4 cells of full matrices
l(9) = fix(l(8)/10000);
l(10) = 10000;
l(11) = 7272;
std_l2 = 3.2744e4;
mean_l2 = 2.693e5;
l2 = round(randn(l(11),1)*std_l2+mean_l2);
I2 = [randperm(l(9)-1,l(11)-1) l(9)];
B = cell(l(9),1);
for i = 1:l(11)
    I2i = I2(i);
    B{I2i} = cell(1,4);
    B{I2i}{1} = ones(l(10),1,'uint32'); %uint32([1 randperm(l2(i)-1,l(10)-1)+1]');
    B{I2i}{2} = ones(l2(i),6,'uint8');
    B{I2i}{3} = ones(l2(i),2,'int16');
    B{I2i}{4} = ones(1,2);
end
% C and D: further workspace variables
l(12) = 233209050;
l(13) = 6957970;
C = zeros(l(12),1,'int32');
C(randperm(l(12),l(13))) = 1; %[randperm(l(13)-1,l(13)-1) l(13)];
D = ones(384,17);
% timing loop producing the figures above
tv = [];
for n = 1:400:6000
    tic; for i = 1:10000; t = D(1:n); end; tv(end+1) = toc;
end
plot(1:400:6000,tv);
7 Comments
Jochen Schuettler
on 17 Mar 2021
After letting this code run for a day, I'm having doubts that it will finish soon. There seem to be efficiency problems with randperm and/or mat2cell at such high numbers. I don't know what else I could do to let you experience the problem yourself. I offer once more to meet virtually, so you could test it on my computer.
Generelly the "memory fragmentation" explanation sounds right and is likely caused by the specific structure of the main variables, as I tried to recreate here. Perhaps you can glean information from the code and/or my explanation above (15 Mar 2021 at 14:00) and from that tell me a better variable structure, holding the same information, with the possibility to insert data, and to save, clear and reload parts of the data time-efficiently, without (strong) memory fragmentation.
Jan
on 17 Mar 2021
Edited: Jan
on 17 Mar 2021
I have a hard time reading e.g. such lines:
I2 = [randperm(l(9)-1,l(11)-1) l(9)];
for i = 1:l(11)
To avoid mistakes I recommend not mixing "1"s and "l"s.
But Matlab is not impeded by this. I'm not sure there is a better strategy to represent your data. I try to avoid nested cells, but sometimes they are the most efficient way.
Therefore I have a very cheap idea only:
If memory is scarce, install more RAM. If you work with 10 GB of data, having 16 GB of RAM is tight. If Matlab starts to use virtual memory, this slows down the speed massively. 16 GB of additional DDR4 RAM costs 72€ currently. With more RAM the problem of fragmentation is less severe.
Jochen Schuettler
on 18 Mar 2021
Edited: Jochen Schuettler
on 18 Mar 2021
This runs in 1-2 hours:
A = cell(1,4);
n(1) = 24043282;
n(2) = 654;
A{1} = sparse(n(1),1);
A{1}(randperm(n(1),n(2))) = 1;%[1 randperm(n(2)-2,n(2)-2)+1 n(2)];
n(3) = 660;
n(4) = 23580752;
n(5) = 6358246;
I = randperm(n(5)*n(2),n(4));
[r,c] = ind2sub([n(5),n(2)],I);
I = randperm(n(1),n(5));
r = I(r);
temp = sparse(r,c,ones(size(r)),n(1),n(3));
A{2} = mat2cell(temp,n(1),ones(n(3),1));
n(6) = 83195;
n(7) = 7368740;
n(8) = 109051176;
n(9) = 621077181;
I = randperm(n(6)*n(7),n(8));
[r,c] = ind2sub([n(6),n(7)],I);
temp = sparse(r,c,ones(size(r)),n(6),n(7));
A{3} = mat2cell(temp,n(6),ones(n(7),1));
A{4} = ones(1,4);
clear temp;
n(9) = 62107;
n(10) = 10000;
n(11) = 7272;
std_n2 = 3.2744e4;
mu_n2 = 2.693e5;
n2 = round(randn(n(11),1)*std_n2 + mu_n2);
I2 = [randperm(n(9)-1,n(11)-1) n(9)];
B = cell(n(9),1);
for i = 1:n(11)
    I2i = I2(i);
    B{I2i} = cell(1,4);
    B{I2i}{1} = ones(n(10),1,'uint32');
    B{I2i}{2} = ones(n2(i),6,'uint8');
    B{I2i}{3} = ones(n2(i),2,'int16');
    B{I2i}{4} = ones(1,2);
end
n(12) = 233209050;
n(13) = 6957970;
C = zeros(n(12),1,'int32');
C(randperm(n(12),n(13))) = 1;
D = ones(384,17);
tv = [];
for k = 1:400:6000   % loop variable k instead of n, so the n(...) vector is not overwritten
    tic; for i = 1:10000; t = D(1:k); end; tv(end+1) = toc;
end
plot(1:400:6000,tv);
But I don't see the effect I'm experiencing with my real problem, so it's not caused by the variable structure alone. Also, 'memory' reports 28 GB with this instead of 9.9 GB. I don't know why the structurally similar data take so much more space in this example.
Concerning your idea about 32 GB: yes, I'll buy 2 SO-DIMMs of 16 GB each, which costs 140€ including shipping. More is not possible in my laptop, and my tower is maxed out at 4x4 GB.
Jan
on 19 Mar 2021
"I don't know, why the structurally similar data takes so much more space with this example."
The OS provides the available memory to an application when this is possible.
If a problem requires a lot of RAM, there are some tricks, but nothing is as efficient as running the code on a machine with enough resources.
Jochen Schuettler
on 19 Mar 2021
Edited: Jochen Schuettler
on 19 Mar 2021
Also, I included a pragmatic workaround, now that I know the one-line copy is fastest with low memory use and the for-loop copy is fastest with high memory use: I simply do 1000 copies with both methods, measure the time and set a flag. That takes << 1 s with low memory and ~2 s in the high-memory case, in practice developing from one to the other. For roughly the next 2 minutes the method proven faster is used. Repeat.
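As a sketch (the trial count, chunk size and variable names are placeholders; a and n stand for the source vector and copy length of the actual copy):
trial = 1000;
chunk = 1000;
tic; for i = 1:trial; t = a(1:n); end; tOneLine = toc;     % time the one-line copy
tic
for i = 1:trial
    for s = 1:chunk:n
        e = min(s + chunk - 1, n);
        t(s:e) = a(s:e);                                   % time the for-loop copy
    end
end
tForLoop = toc;
useForLoop = tForLoop < tOneLine;   % flag: use the faster method for the next ~2 minutes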
Jan
on 19 Mar 2021
In this forum "closing" means, that a question is removed soon. So we close questions only, if they contain too few information to be answered.