MATLAB Answers


inmem - list files loaded inside parfor

Asked by Alex R. on 25 May 2015
Latest activity: commented on by Walter Roberson on 1 Jun 2015
It looks like inmem does not list functions loaded inside parfor loops. At least that is what I'm experiencing in R2014b and I have not found this mentioned in the docs. Is this expected behaviour?
I understand it's the workers who execute the file, but the master process could be made aware of which files are loaded by the workers ... it already communicates with all workers back and forth, and waits for all workers to finish anyway.
This would be extremely useful for reproducibility in research. I could save/zip all files that were loaded for the execution chain and also save the inputs as .mat and the last history entry, and then be able to completely reproduce the same results later. As it is, not all files are saved and that's a shame.
I know I could use matlab.codetools.requiredFilesAndProducts but that saves all dependencies (not just those needed for the particular set of inputs I used), which in my case are more than 100 MB worth of files, as opposed to 100 KB with inmem.
Specifically:
parfor k=1:100
external_file(k);
end
[files,mexs] = inmem('-completenames');
The above does not list 'external_file.m'. Replacing parfor with for will list it, but then my code is slow ...
Best, Alex.


1 Answer

Answer by Walter Roberson
on 25 May 2015

inmem is documented specifically as listing the functions that are currently loaded, not a history of what has ever been loaded during the session. If a function has been cleared, it is no longer loaded. The workers logically unload after a parfor, so the loaded file is logically gone.
If workers do not unload after parfor, then you could run a second parfor right after the first, with "enough" iterations to reach every worker:
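% Maybe 50 iterations will be enough to hit each worker at least once. Maybe not.
was_inmem = cell(50,1);
parfor K = 1 : 50
    was_inmem{K} = inmem('-completenames');   % one list per iteration
end
% Each entry is itself a cell array of paths, so flatten before de-duplicating.
was_inmem = unique(vertcat(was_inmem{:}));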
My prediction is that it won't show anything.
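A safer way is to incorporate the above strategy right into your original parfor:
was_inmem = cell(100,1);
parfor k=1:100
    external_file(k);
    was_inmem{k} = inmem('-completenames');   % list as seen by whichever worker ran iteration k
end
% Each entry is itself a cell array of paths, so flatten before de-duplicating.
was_inmem = unique(vertcat(was_inmem{:}));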
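Once you have a flat list of loaded files, the archiving step described in the question could look roughly like the sketch below. It is only an illustration: the client's own inmem output stands in for the merged worker list, the filter on matlabroot is my assumption about what counts as "your" files, and run_snapshot.zip is a made-up name. Note also that zip may store only relative names, so files with the same name in different folders could collide in the archive.

```matlab
% Stand-in input: in practice, use the merged list gathered from the workers.
loaded = inmem('-completenames');
loaded = unique(loaded(:));            % force a column and drop duplicates

% Assumption: anything under the MATLAB installation is not worth archiving.
in_matlabroot = strncmp(loaded, matlabroot, numel(matlabroot));
user_files = loaded(~in_matlabroot);

% Archive only when there is something to archive; zip errors on an empty list.
zip_name = 'run_snapshot.zip';         % illustrative file name
if ~isempty(user_files)
    zip(zip_name, user_files);
end
```

Filtering before zipping keeps the snapshot small, which is the advantage over a full dependency scan that the question was after.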

  4 Comments

clear functions is not the only way of clearing a function from memory; various other clear() commands can do it as well, and I think perhaps rehash() might too.
A function that has been executed in a worker is definitely gone from memory if the worker is gone. If you have an actual Distributed Computing Pool that you are opening and closing, then the worker is gone for sure when the pool is closed. If you do not have such a pool (and are operating on local workers), or if you opened a pool but have not closed it yet, then the state retained by the workers as "hot spares" is not really documented, as far as I have seen. The effect for variables is as if each worker lost the state of all variables and was reinitialized, as if parfor started a new copy of a nested workspace that used shared variables to refer to all variables that are not local to the body of the parfor, with the local state of that workspace being destroyed when the iteration returned. But that does not address the question of when the pre-parsed code for functions invoked in a parfor body is cleared, that is, when exactly the "loaded in memory" state ends. I don't know the answer to that. But it seems plausible to me that the worker state might be destroyed when the workspace that calls parfor ends.
I am getting some interesting results on a system without PCT installed, where parfor runs the loop body in the client session rather than on pool workers, when I make assignments to anonymous functions that use local variables. All of those workspaces continue to exist after the end of the parfor loop. I suspect that the same thing might work with a real pool, provided that the referenced objects are serializable. With the real PCT, whether one is operating on a worker or on the client can be checked with getCurrentTask(), which returns an empty result on the client.
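A minimal sketch of that getCurrentTask() check, assuming Parallel Computing Toolbox is installed; the variable names and the loop length of 4 are just for illustration:

```matlab
% Record, per iteration, whether the loop body ran on a worker.
ran_on_worker = false(1, 4);
parfor k = 1:4
    ran_on_worker(k) = ~isempty(getCurrentTask());
end

% The same test on the client: getCurrentTask() is empty outside a worker.
client_is_worker = ~isempty(getCurrentTask());
```

If no pool is open and none gets created automatically, the parfor body runs in the client session and every entry of ran_on_worker stays false.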
Anyhow, if you are using a real parfor pool, try using
pctRunOnAll('inmem()')
An alternative to pctRunOnAll in R2013b and later is:
f = parfevalOnAll(@inmem, 1)
which runs only on the workers and returns a future from which you can extract the individual results from each worker.
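For completeness, here is a sketch of how the future might be consumed. It assumes Parallel Computing Toolbox with a pool that gcp can open or reuse, and it assumes fetchOutputs accepts 'UniformOutput', false to return one cell per worker; I could not run it here, so treat it as untested.

```matlab
pool = gcp();                                    % start or reuse the current pool

% Ask every worker for its list of loaded files, with full paths.
f = parfevalOnAll(pool, @inmem, 1, '-completenames');

% One cell per worker, each holding that worker's cell array of paths.
per_worker = fetchOutputs(f, 'UniformOutput', false);

% Force each list into a column, then merge and de-duplicate.
per_worker = cellfun(@(c) c(:), per_worker, 'UniformOutput', false);
loaded_on_workers = unique(vertcat(per_worker{:}));
```

Unlike the parfor approach, this asks every worker exactly once, so there is no guessing about how many iterations are "enough".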
Some day when I win the lottery, I'm gonna buy me a copy of PCT to play with...
