Parallel computing: occasional exception message "Message Catalog MATLAB:load was not loaded from the file"

I am running two jobs on a cluster: job1 on node1 and job2 on node2. Job1 starts slightly earlier than job2.
Job1 always runs fine. For job2, however, I sometimes get this exception message on the command line: "Caught "std::exception" Exception message is:
Message Catalog MATLAB:load was not loaded from the file. Please check file location, format or contents".
When this happens, the job does not stop, but it no longer does any calculation, i.e. it hangs.
I suspect one of the following reasons:
  1. It is related to the load() function. I do use load() in my parfor-loop. However, I thought load() is different from fopen(), which must be followed by fclose(). Do I need to take any extra action when using load() in a parfor-loop?
  2. It is related to the Linux system. This may occur when there are too many open files. However, I did not open any file in my parfor-loop.
  3. It is related to the Linux system and I am using too many resources. When I run only one job, this exception message never appears.
Has anyone run into this?

Accepted Answer

Edric Ellis on 11 Dec 2020
Edited: Edric Ellis on 11 Dec 2020
The probable cause of this is the file handle limit. This page: https://www.mathworks.com/help/parallel-computing/recommended-system-limits-for-machintosh-and-linux.html has some instructions. Basically, I think you need to raise the ulimit values on the system.
(The other thing to check is that you aren't opening lots of file handles using fopen and not subsequently closing them with fclose.)
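As a rough sketch of the check suggested above, the current limits can be printed with the `ulimit` builtin. This assumes a bash shell on the Linux node, and the function name `show_limits` is made up for illustration. Limits seen by workers started through a scheduler may differ from those of an interactive login, so it is worth running this in the same context as the job.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print soft and hard limits for open file
# descriptors (-n) and user processes (-u) in the current shell.
show_limits() {
    echo "open files (soft):     $(ulimit -Sn)"
    echo "open files (hard):     $(ulimit -Hn)"
    echo "user processes (soft): $(ulimit -Su)"
    echo "user processes (hard): $(ulimit -Hu)"
}

show_limits
```

The soft limit is the one actually enforced; a user can usually raise it only up to the hard limit, which is why raising the values further tends to need an administrator.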
  1 Comment
Xingwang Yong on 11 Dec 2020
Thanks, Edric. I checked the maximum number of user processes on my node: it is 4096, far smaller than the recommended 23741. The maximum number of open file descriptors on my node, however, is greater than the recommended value.
I did not open any files using fopen(). I used load() in my parfor-loop.
I'll ask my admin to increase the maximum number of user processes and see whether the problem still occurs.
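A minimal sketch of that comparison, assuming bash on the node: the helper `limit_ok` is a hypothetical name, and 23741 is simply the recommended user-process value quoted in this thread, not something the script looks up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a limit value meets a required minimum.
# A value of "unlimited" always passes.
limit_ok() {
    local current="$1" required="$2"
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$required" ]
}

# Compare the soft user-process limit with the value quoted in the thread.
if limit_ok "$(ulimit -Su)" 23741; then
    echo "user process limit looks sufficient"
else
    echo "user process limit is below 23741: $(ulimit -Su)"
fi
```

Running the same check before and after the admin changes the limit shows whether the new value is actually in effect for the session that launches the job.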

Release

R2018a
