Resolve Memory Errors on Linux Clusters
Issue
When MATLAB® Parallel Server™ services create more processes on a Linux® node than the limits of the operating system allow, the services fail and generate an out-of-memory error.
If you try to run a job on the affected cluster, the job fails with an error like the following:
Error using parallel.Job/submit (line 304)
Java exception occurred:
java.lang.OutOfMemoryError: unable to create new native thread

Possible Solutions
On many Linux distributions, the max user processes limit (ulimit) for a shell defaults to 1024. User limits control the resources available to the shell and its processes. To check the current max user processes limit for the shell, run the following command at the command line of one of the cluster nodes. The ulimit command might require root access.

ulimit -u

Set the max user processes limit to "unlimited" or at least to the recommended minimum value of 23741. For example, to set the max user processes limit to the recommended minimum value, run this command.
ulimit -u 23741

Changing a limit within a shell affects only that shell and any MATLAB Parallel Server services you subsequently start there. To make this setting persistent system-wide, you must modify the /etc/security/limits.conf file. For MATLAB Job Scheduler cluster nodes, modify the max user processes limit for the root user or for the user that the mjs service runs as.
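If the mjs service runs as a dedicated non-root account, the corresponding limits.conf entries might look like the following sketch. The username mjsuser is a placeholder; substitute the account your cluster actually uses.

```
# /etc/security/limits.conf
# Raise the max user processes (nproc) limit for the service account
# that runs mjs. "mjsuser" is a placeholder username.
mjsuser  soft  nproc  23741
mjsuser  hard  nproc  23741
```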
For example, to remove the max user processes limit for the root user, add this line to the limits.conf file. Editing the limits.conf file might require root access.

root hard nproc unlimited

For more information on ulimit or limits.conf, see their manual pages.
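Changes to limits.conf take effect in new login sessions. As a quick check, assuming standard Linux tooling (whoami, procps ps), you can compare the effective limit with the number of tasks the user currently owns. Note that threads count toward the nproc limit as well, which is why a Java thread-creation failure can occur even with few visible processes.

```shell
# Show the shell's effective max user processes limit.
ulimit -u

# Count the tasks (processes plus their threads, via -L) owned by the
# current user; both count toward the nproc limit.
ps --no-headers -u "$(whoami)" -L | wc -l
```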