Resolve Memory Errors on Linux Clusters

Issue

When MATLAB® Parallel Server™ services create more processes on a Linux® node than the limits of the operating system allow, the services fail and generate an out-of-memory error.

If you try to run a job on the affected cluster, the job fails with an error like the following:

Error using parallel.Job/submit (line 304)
Java exception occurred:
java.lang.OutOfMemoryError: unable to create new native thread

Possible Solutions

On many Linux distributions, the max user processes limit (ulimit) for a shell defaults to 1024. User limits control the resources available to the shell and its processes. To check the current max user processes limit for the shell, run the following command at the command line of one of the cluster nodes. The ulimit command might require root access.

ulimit -u

Ensure that the max user processes limit is set to "unlimited" or at least to the recommended minimum value of 23741.
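The check above can be scripted so each node reports whether it meets the recommended minimum. A minimal sketch, assuming a POSIX shell; the function name check_nproc_limit and the OK/LOW messages are illustrative, not part of MATLAB Parallel Server:

```shell
# Report whether this shell's max user processes limit meets the
# recommended minimum of 23741 from the text above.
check_nproc_limit() {
    recommended=23741
    current=$(ulimit -u)
    # "unlimited" always passes; otherwise compare numerically.
    if [ "$current" = "unlimited" ] || [ "$current" -ge "$recommended" ]; then
        echo "OK: max user processes = $current"
    else
        echo "LOW: max user processes = $current (recommended minimum is $recommended)"
    fi
}

check_nproc_limit
```

You could run this on each cluster node to find which nodes need the limit raised.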

For example, to set the max user processes limit to the recommended minimum value, run this command.

ulimit -u 23741

Check the max user processes limit again to confirm the new setting.

Changing a limit within a shell affects only that shell and any subsequent MATLAB Parallel Server services you start there. To make this setting persistent system-wide, you must modify the /etc/security/limits.conf file. For MATLAB Job Scheduler cluster nodes, modify the user processes limit for the root user or the user the mjs service runs as.

For example, to remove the user process limit for the root user, add this line to the limits.conf file. Editing the limits.conf file might require root access.

root        hard        nproc        unlimited

To resolve out-of-memory errors for your entire cluster, modify the number of user processes for each node in your cluster.
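If the mjs service runs as a dedicated account rather than as root, the same limits.conf syntax applies to that account. A sketch, assuming a hypothetical service account named mjsuser; raising both the soft and hard limits avoids having to call ulimit in the service's startup scripts:

```
# /etc/security/limits.conf entries for a hypothetical "mjsuser" account
# that runs the mjs service. Raise both the soft and hard nproc limits
# to the recommended minimum.
mjsuser        soft        nproc        23741
mjsuser        hard        nproc        23741
```

Log out and back in (or restart the service) for entries in limits.conf to take effect, because the limits are applied at session start.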

For more information on ulimit or limits.conf, see their manual pages.
