Memory not being cleared from iteration to iteration?

Hello all,
I'm having a memory leak somewhere in my code that I can't find. I have MATLAB code that uses objects to read different files, process them, and add the results to a database. In pseudocode:
for ii = 1:numfiles
    reader.readfile(ii);
    processor.process(reader);
    database.addresults(processor);
end
The "processor" object uses parfor internally. The parallel pool is created and deleted inside the "processor" object on each iteration. I'm running this loop over the same file several times and monitoring the RAM usage (using the "memory" command on Windows and watching the "MemAvailableAllArrays" field), as well as the size of each object (using the "GetSize" function). The database is a huge matrix where I accumulate the results.
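As a rough sketch of this test setup (the reader, processor, and database objects stand in for the real classes, which are not shown), the per-iteration logging could look like:

    memLog = zeros(numfiles, 1);
    for ii = 1:numfiles
        reader.readfile(ii);
        processor.process(reader);       % creates and deletes the parpool internally
        database.addresults(processor);
        m = memory;                      % "memory" is Windows-only
        memLog(ii) = m.MemAvailableAllArrays;
    end
    plot(memLog)                         % bytes available for arrays, per iteration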
MemAvailableAllArrays decreases from iteration to iteration. I thought that maybe one of the objects was concatenating some vectors by mistake, thus growing in size, but all object sizes remain constant from iteration to iteration.
Any advice on how to find where this memory "leak" is and/or how to fix it?
Edit: I tried with MATLAB R2016b and R2020b, and the issue is the same.

4 Comments

"The "processor" object uses parfor inside."
My gut instinct tells me that this parfor might be involved... is there any chance that you could convert the parfor into an ordinary for, and try testing it again?
I ran a test processing the same file 20 times, once with parfor and once with for, and monitored the memory usage with the MemAvailableAllArrays field of the memory command. The x tick lines correspond to the moment right after a file was read. In the code, I just changed the "parfor" to a "for".
The objects do not change their size between iterations, neither with parfor nor with for:
Why is parfor leaking memory? Any hint/guess?
"I monitored the memory usage with the MemAvailableAllArrays field of the memory command."
The MemAvailableAllArrays field gives the "Total memory available to hold data"; it is not a measure of "memory usage" as you write. To measure current usage, use the MemUsedMATLAB field.
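For reference, both fields come from the same (Windows-only) memory call, so they can be logged side by side; a minimal sketch:

    m = memory;
    fprintf('MemUsedMATLAB:         %.2f GB\n', m.MemUsedMATLAB / 2^30);
    fprintf('MemAvailableAllArrays: %.2f GB\n', m.MemAvailableAllArrays / 2^30);

MemUsedMATLAB should stay roughly flat if nothing accumulates, while MemAvailableAllArrays also reflects what the rest of the system is doing.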
The first graph in your previous comment is confusing: did MemAvailableAllArrays contain negative values?
Just out of curiosity, how much memory does your PC have installed?
Sorry for the delay in answering. The machine has 64 GB of RAM. MemAvailableAllArrays has negative values because I subtracted the first value from the whole series.
I let the computer analyze files the whole night and monitored MemUsedMATLAB:
I am not sure whether what is explained here would be useful, but I added these lines at the end of the loop in case they could trigger the Java garbage collector:
java.lang.System.gc();
java.lang.Runtime.getRuntime().gc();
pause(1)
Since the GC has the lowest priority, I thought that a pause might push it to start running. It doesn't seem to be an improvement compared with the same experiment (the time series is much shorter without the garbage collector):
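If the suspicion is that the Java heap is what grows, it can also be queried directly from MATLAB rather than inferred from OS counters; a small sketch:

    rt = java.lang.Runtime.getRuntime();
    usedJavaHeap = double(rt.totalMemory() - rt.freeMemory());   % bytes in use on the Java heap
    fprintf('Java heap in use: %.1f MB\n', usedJavaHeap / 2^20);

Logging this once per iteration, before and after the gc() calls, would show whether the Java side is involved at all.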
I also tried deleting the three objects and creating them again in each iteration, with the following result (the GC lines still active):
which does not seem to increase with time. I'll let the computer analyze files all night to see what happens in this case. The MemAvailableAllArrays behaves in exactly the same way as in the previous situations, though:
The 68 GB of RAM correspond to the "committed" memory in the Task Manager, I guess.
I still don't know what to do to solve this issue.


Answers (1)

As I understand it, your code is using more memory than expected.
This can happen if you create many temporary variables, or if you grow a vector by appending to it inside a loop. You can look at the following resources for general tips on efficient memory usage in MATLAB.

8 Comments

It's not using more memory than expected.
From one iteration to the next, some used memory seems to accumulate and is not being cleared. The objects I am using are not changing their size, and this only happens when I use "parfor"; it does not happen with "for" (see the graphs here).
Can you provide sample code so that I can reproduce the issue? Also provide the output of the "version" command.
I usually work with version 9.1.0.441655 (R2016b) (Windows and Linux, the latter in an Oracle VirtualBox virtual machine), but I carried out the experiment from the first post also with 9.9.0.1570001 (R2020b) Update 4 (Windows).
I don't think I can provide a sample of the code; each object has more than a thousand lines of code. I'm not asking for the solution, but for a procedure to find out what is going on with this "lost" memory.
The memory usage in the plots looks like expected behaviour to me. It seems constant over time, with peaks at each iteration. Note that parfor uses more memory than for; it trades memory usage for computation speed.
If you face any memory issues in the future, you can share the code with the support team (they can work over email in case you don't want to share code publicly). Otherwise, you can iteratively remove code and use breakpoints to detect where the memory leak happens.
Regarding "MemAvailableAllArrays", it is the memory available to store arrays, right? In the very first plot, "MemAvailableAllArrays" decreases over time, which means that at some point MATLAB won't be able to allocate the data loaded from the files, am I right? Is this behaviour normal?
Also, why does it happen when I execute the code with parfor but not with for?
@Raul Onrubia: it seems a bit odd to me too that MemAvailableAllArrays would decrease as your figures show.
This new graph shows two more tests compared to the previous ones.
The "for" and "parfor" tests correspond to the code from the first post running with for and parfor, respectively. There, the parpool was created at the beginning of the code and deleted after all iterations.
Test 1 creates the reader, processor, database, and parpool inside the loop, right after starting the iteration, and deletes all of them right before finishing the iteration.
Test 2 does the same as Test 1, but one out of every 5 iterations does not create the parallel pool and runs the parfor with 0 workers. This way, MemAvailableAllArrays seems to stay under control.
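For anyone trying to reproduce the "0 workers" variant: parfor accepts an optional maximum-worker argument, and passing 0 forces the body to run serially in the client session even when a pool exists. A minimal, self-contained example:

    n = 10;
    out = zeros(1, n);
    parfor (ii = 1:n, 0)    % second argument M = 0: run serially on the client
        out(ii) = ii^2;
    end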
I saw an old post where someone had the same issue and tried to follow the advice, but I am not using any MEX function that allocates data, nor creating graphics objects, so I cannot find where the memory leak comes from...
Using MemUsedMATLAB would be more appropriate than MemAvailableAllArrays. Without the code, it's difficult for me to say what causes the issue.


Asked: on 29 Jan 2021
Commented: on 11 Feb 2021
