batch

Run MATLAB script or function on worker

Description

j = batch(script) runs the script file script on a worker in the cluster specified by the default cluster profile. (Note: Do not include the .m file extension with the script name.) The function returns j, a handle to the job object that runs the script. The script file script is copied to the worker.

j = batch(myCluster,script) is identical to batch(script) except that the script runs on a worker in the cluster specified by the cluster object myCluster.

j = batch(fcn,N,{x1,...,xn}) runs the function fcn on a worker in the cluster specified by the default cluster profile. The function returns j, a handle to the job object that runs the function. The function is evaluated with the given arguments, x1,...,xn, and returns N output arguments. The function file for fcn is copied to the worker. (Note: Do not include the .m file extension with the function name argument.)
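
As a minimal sketch of this syntax, the following runs the built-in function magic on a worker with N = 1 and a single input. The choice of magic is only an illustration; it assumes a default cluster profile (for example, a local one) is available.

```matlab
% Run the built-in function magic on a worker, requesting one output.
j = batch(@magic,1,{4});
wait(j);                 % block until the job finishes
out = fetchOutputs(j);   % cell array with one element per requested output
M = out{1};              % the 4-by-4 matrix returned by magic(4)
delete(j);               % remove the job data from the cluster storage
clear j
```

The inputs are always passed as a cell array, even when the function takes a single argument.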

j = batch(myCluster,fcn,N,{x1,...,xn}) is identical to batch(fcn,N,{x1,...,xn}) except that the function runs on a worker in the cluster specified by the cluster object myCluster.

j = batch(___,Name,Value) specifies options that modify the behavior of a job using one or more name-value pair arguments. These options support batch for functions and scripts, unless otherwise indicated. Use this syntax in addition to any of the input argument combinations in previous syntaxes.

Examples

Use batch to offload work to a MATLAB worker session that runs in the background. You can continue using MATLAB while computations take place.

Run a script as a batch job by using the batch function. By default, batch uses your default cluster profile. Check your default cluster profile on the MATLAB Home tab, in the Environment section, in Parallel > Select a Default Cluster. Alternatively, you can specify a cluster profile with the 'Profile' name-value pair argument.

job = batch('myScript');

batch does not block MATLAB, so you can continue working while computations take place.

If you want to block MATLAB until the job finishes, use the wait function on the job object.

wait(job);
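
If blocking indefinitely is undesirable, wait also accepts a target state and a timeout in seconds, and returns a logical value that indicates whether the state was reached. The sketch below uses the built-in pause function as a stand-in workload so that it is self-contained; the variable name pauseJob and the 60-second timeout are arbitrary choices.

```matlab
% Submit a short stand-in workload that returns no outputs.
pauseJob = batch(@pause,0,{2});

% Wait at most 60 seconds for the job to reach the 'finished' state.
finished = wait(pauseJob,'finished',60);
if ~finished
    % The timeout elapsed first; report the current state instead.
    fprintf('Job %d is still %s.\n',pauseJob.ID,pauseJob.State);
end
```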

By default, MATLAB saves the Command Window output from the batch job to the diary of the job. To retrieve it, use the diary function.

diary(job)
--- Start Diary ---
n = 100

--- End Diary ---

After the job finishes, fetch the results by using the load function.

load(job,'x');
plot(x)

If you want to load all the variables in the batch job, call the load function with only the job object, as in load(job).

When you have fetched all the required variables, delete the job object to clean up its data and avoid consuming resources unnecessarily.

delete(job);
clear job

Note that if you send a script file using batch, MATLAB transfers all the workspace variables to the cluster, even if your script does not use them. The data transfer time for a large workspace can be substantial. As a best practice, convert your script to a function file to avoid this communication overhead. For an example that uses a function, see Run Batch Job and Access Files from Workers.

For more advanced options with batch, see Run Batch Job and Access Files from Workers.

Run Batch Job and Access Files from Workers

You can offload your computations to run in the background by using batch. If your code needs access to files, you can use additional options, such as 'AttachedFiles' or 'AdditionalPaths', to make the data accessible. You can close or continue working in MATLAB while computations take place and recover the results later.

Prepare Example

Prepare and copy the supporting files for this example by using the following command.

prepareSupportingFiles;

Run Batch Job

Create a cluster object using parcluster. By default, parcluster uses your default cluster profile. Check your default cluster profile on the MATLAB Home tab, in the Environment section, in Parallel > Select a Default Cluster.

c = parcluster();

Place your code inside a function and submit it as a batch job by using batch. For an example of a custom function, see the supporting function myFunction. Specify the expected number of output arguments and a cell array with inputs to the function.

If your code uses a parallel pool, use the 'Pool' name-value pair argument to create a parallel pool with the number of workers that you specify. batch uses an additional worker to run the function itself.

By default, batch changes the initial working directory of the workers to the current folder of the MATLAB client. It can be useful to control the initial working directory in the workers. For example, you might want to control it if your cluster uses a different filesystem, and therefore the paths are different, such as when you submit from a Windows client machine to a Linux cluster.

  • To keep the initial working directory of the workers and use their default, set 'CurrentFolder' to '.'.

  • To change the initial working directory, set 'CurrentFolder' to a folder of your choice.

This example uses a parallel pool with three workers and chooses a temporary location for the initial working directory.

job = batch(c,@myFunction,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir);

batch offloads the computations in your function to a parallel worker, so you can continue working in MATLAB while computations take place.

If you want to block MATLAB until the job completes, use the wait function on the job object.

wait(job);

To retrieve the results, use fetchOutputs on the job object.

If your code has an error, then fetchOutputs throws an error. You can access error information by checking the Error property of Task objects in the job. In this example, the code depends on a file that the workers cannot find.

getReport(job.Tasks(1).Error)
ans = 
    'Error using myFunction (line 4)
     Unable to read file 'mydata2.dat'. No such file or directory.'
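
One way to avoid the error from fetchOutputs is to inspect the Error property of the task first. The following is a sketch rather than a prescribed pattern: it provokes a similar failure by asking the built-in load function to read a file that is assumed not to exist (the file name is hypothetical), and the variable name badJob is arbitrary.

```matlab
% Submit a task that is expected to fail on the worker.
badJob = batch(@load,1,{'fileThatDoesNotExist.dat'});
wait(badJob);

% The Error property is empty when the task ran without error.
taskErr = badJob.Tasks(1).Error;
if isempty(taskErr)
    out = fetchOutputs(badJob);
else
    disp(getReport(taskErr))   % show the error report from the worker
end
delete(badJob);
clear badJob
```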

Access Files from Workers

By default, batch automatically analyzes your code and transfers required files to the workers. In some cases, you must explicitly transfer those files--for example, when you determine the name of a file at runtime.

In this example, myFunction accesses the supporting file mydata.dat, which batch automatically detects and transfers. The function also accesses mydata1.dat, mydata2.dat, and mydata3.dat, but it builds those file names at runtime, so the automatic dependency analysis does not detect them.

type myFunction.m 
function X = myFunction()    
    A = load("mydata.dat"); 
    X = zeros(flip(size(A)));
    parfor i = 1:3
       B = load("mydata"+i+".dat");
       X = X + A\B;
    end
end

If the data is in a location that the workers can access, you can use the name-value pair argument 'AdditionalPaths' to specify the location. 'AdditionalPaths' adds this path to the MATLAB search path of the workers and makes the data visible to them.

pathToData = pwd;
job(2) = batch(c,@myFunction,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir, ...
    'AdditionalPaths',pathToData);
wait(job(2));

If the data is in a location that the workers cannot access, you can transfer files to the workers by using the 'AttachedFiles' name-value pair argument.

job(3) = batch(c,@myFunction,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir, ...
    'AttachedFiles',"mydata"+string(1:3)+".dat");

Find Existing Job

You can close MATLAB after job submission and retrieve the results later. Before you close MATLAB, make a note of the job ID.

job3ID = job(3).ID
job3ID = 19

When you open MATLAB again, you can find the job by using the findJob function.

job(3) = findJob(c,'ID',job3ID);
wait(job(3));
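
If you did not note the job ID, findJob can also search by other job properties. The sketch below lists the jobs that the cluster reports as finished; it assumes the default cluster profile is the one the jobs were submitted to, and the variable name finishedJobs is arbitrary.

```matlab
c = parcluster();

% Find every job on this cluster whose State property is 'finished'.
finishedJobs = findJob(c,'State','finished');
for k = 1:numel(finishedJobs)
    fprintf('Job %d: %s\n',finishedJobs(k).ID,finishedJobs(k).State);
end
```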

Alternatively, you can use the Job Monitor to track your job. You can open it from the MATLAB Home tab, in the Environment section, in Parallel > Monitor Jobs.

Retrieve Results and Clean Up Data

To retrieve the results of a batch job, use the fetchOutputs function. fetchOutputs returns a cell array with the outputs of the function run with batch.

X = fetchOutputs(job(3))
X = 1×1 cell array
    {420×1680 double}
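
Because fetchOutputs returns a cell array, index into it with braces to get the underlying values. As a self-contained sketch, the following requests two outputs from the built-in size function (a stand-in chosen only for illustration) and unpacks them; the variable names are arbitrary.

```matlab
% Request two outputs: the number of rows and the number of columns.
sizeJob = batch(@size,2,{zeros(3,5)});
wait(sizeJob);

out = fetchOutputs(sizeJob);   % cell array with one column per output
[nRows,nCols] = out{:};        % unpack the cell contents into variables
delete(sizeJob);
clear sizeJob
```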

When you have retrieved all the required outputs and do not need the job object anymore, delete it to clean up its data and avoid consuming resources unnecessarily.

delete(job)
clear job

Input Arguments

script: MATLAB script to be evaluated by the worker, specified as a character vector or string.

Example: batch('aScript');

Data Types: char | string

myCluster: Cluster, specified as a parallel.Cluster object that represents cluster compute resources. To create the object, use the parcluster function.

Example: cluster = parcluster; batch(cluster,'aScript');

Data Types: parallel.Cluster

fcn: Function to be evaluated by the worker, specified as a function handle or function name.

Example: batch(@myFunction,1,{x,y});

Data Types: char | string | function_handle

N: Number of outputs expected from the evaluated function fcn, specified as a nonnegative integer.

Example: batch(@myFunction,1,{x,y});

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

{x1,...,xn}: Input arguments to the function fcn, specified as a cell array.

Example: batch(@myFunction,1,{x,y});

Data Types: cell

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: j = batch(@myFunction,1,{x,y},'Pool',3);

Workspace on the worker just before the script or function is called, specified as the comma-separated pair consisting of 'Workspace' and a 1-by-1 struct. The field names of the struct define the names of the variables, and the field values are assigned to the workspace variables. By default, this parameter has a field for every variable in the current workspace where batch is executed. This parameter supports only the running of scripts.

Example: workspace.myVar = 5; j = batch('aScript','Workspace',workspace);

Data Types: struct
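
The following sketch shows how 'Workspace' can limit what is sent to the worker. To stay self-contained, it first writes a one-line script to a temporary folder; the script name demoScript, the variable names, and the values are all hypothetical.

```matlab
% Create a one-line script in a temporary folder (hypothetical name demoScript).
scriptDir = tempname;
mkdir(scriptDir);
fid = fopen(fullfile(scriptDir,'demoScript.m'),'w');
fprintf(fid,'y = 2*n;\n');
fclose(fid);
addpath(scriptDir);            % batch must be able to find the script

% Send only the variable n, rather than the whole client workspace.
ws = struct('n',21);
j = batch('demoScript','Workspace',ws);
wait(j);
load(j,'y');                   % retrieve y from the job workspace
delete(j);
clear j
rmpath(scriptDir);
```

Passing a small struct this way is one option for avoiding the transfer of a large client workspace when the code remains a script.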

Cluster profile used to identify the cluster, specified as the comma-separated pair consisting of 'Profile' and a character vector or string. If this option is omitted, the default profile is used to identify the cluster and is applied to the job and task properties.

Example: j = batch('aScript','Profile','local');

Data Types: char | string

Paths to add to the MATLAB search path of the workers before the script or function executes, specified as the comma-separated pair consisting of 'AdditionalPaths' and a character vector, string array, or cell array of character vectors.

The default search path might not be the same on the workers as it is on the client; the path difference could be the result of different current working folders (cwd), platforms, or network file system access. Specifying the 'AdditionalPaths' name-value pair argument helps ensure that workers look for files, such as code files, data files, or model files, in the correct locations.

You can use 'AdditionalPaths' to access files in a shared file system. Note that path representations can vary depending on the target machines. 'AdditionalPaths' must be the paths as seen by the machines in the cluster. For example, if Z:\data on your local Windows® machine is /network/data to your Linux® cluster, then add the latter to 'AdditionalPaths'. If you use a datastore, use 'AlternateFileSystemRoots' instead to deal with other representations. For more information, see Set Up Datastore for Processing on Different Machines or Clusters (MATLAB).

Note that 'AdditionalPaths' helps workers find files only when you refer to them by a relative path or file name, not by an absolute path.

Example: j = batch(@myFunction,1,{x,y},'AdditionalPaths','/network/data/');

Data Types: char | string | cell

Files or folders to transfer to the workers, specified as the comma-separated pair consisting of 'AttachedFiles' and a character vector, string array, or cell array of character vectors.

Example: j = batch(@myFunction,1,{x,y},'AttachedFiles','myData.dat');

Data Types: char | string | cell

Flag to add user-added entries on the client path to worker paths, specified as the comma-separated pair consisting of 'AutoAddClientPath' and a logical value.

Example: j = batch(@myFunction,1,{x,y},'AutoAddClientPath',false);

Data Types: logical

Flag to enable dependency analysis and automatically attach code files to the job, specified as the comma-separated pair consisting of 'AutoAttachFiles' and a logical value. If you set the value to true, the batch script or function is analyzed and the code files that it depends on are automatically transferred to the workers.

Example: j = batch(@myFunction,1,{x,y},'AutoAttachFiles',true);

Data Types: logical

Folder in which the script or function executes, specified as the comma-separated pair consisting of 'CurrentFolder' and a character vector or string. There is no guarantee that this folder exists on the worker. The default value for this property is the current directory of MATLAB when the batch command is executed. If the argument is '.', there is no change in folder before batch execution.

Example: j = batch(@myFunction,1,{x,y},'CurrentFolder','.');

Data Types: char | string

Flag to collect the diary from the function call, specified as the comma-separated pair consisting of 'CaptureDiary' and a logical value. For information on the collected data, see diary.

Example: j = batch('aScript','CaptureDiary',false);

Data Types: logical

Environment variables to copy from the client session to the workers, specified as the comma-separated pair consisting of 'EnvironmentVariables' and a character vector, string array, or cell array of character vectors. The names specified here are appended to the EnvironmentVariables property specified in the applicable parallel profile to form the complete list of environment variables. Listed variables that are not set are not copied to the workers. These environment variables are set on the workers for the duration of the batch job.

Example: j = batch('aScript','EnvironmentVariables',"MY_ENV_VAR");

Data Types: char | string | cell
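
As an illustrative sketch, the following sets an environment variable on the client and reads it back on the worker with the built-in getenv function. The variable name MY_ENV_VAR comes from the example above; the value 'hello' is arbitrary. On a local cluster the workers might see the variable anyway, so this is mainly relevant for remote clusters.

```matlab
% Set the variable in the client session.
setenv('MY_ENV_VAR','hello');

% Ask the worker to read the variable, copying it from the client.
j = batch(@getenv,1,{'MY_ENV_VAR'}, ...
    'EnvironmentVariables',"MY_ENV_VAR");
wait(j);
out = fetchOutputs(j);
workerValue = out{1};          % value of MY_ENV_VAR as seen by the worker
delete(j);
clear j
```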

Number of workers to make into a parallel pool, specified as the comma-separated pair consisting of 'Pool' and either:

  • A nonnegative integer.

  • A 2-element vector of nonnegative integers, which is interpreted as a range. The size of the resulting parallel pool is as large as possible in the range requested.

In addition, note that batch uses another worker to run the batch job itself.

The script or function uses this pool to execute statements such as parfor and spmd that are inside the batch code. Because a pool of size N requires N workers in addition to the worker running the batch job, the cluster must have at least N+1 workers available. You do not need a parallel pool already running to execute batch, and the new pool that batch creates is not related to a pool you might already have open. For more information, see Run a Batch Job with a Parallel Pool.

If you use the default value, 0, the script or function runs on only a single worker and not on a parallel pool.

Example: j = batch(@myFunction,1,{x,y},'Pool',4);

Example: j = batch(@myFunction,1,{x,y},'Pool',[2,6]);

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
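
One possible way to respect the N+1 requirement is to derive the pool size from the NumWorkers property of the cluster object before submitting. This is only a sketch: NumWorkers reports the total number of workers rather than how many are currently idle, and the built-in rand function is used as a stand-in that does not actually need a pool.

```matlab
c = parcluster();
requested = 3;                 % desired pool size

% One worker runs the batch function itself, so at most NumWorkers-1
% workers remain for the pool. A value of 0 means no pool is created.
poolSize = max(0,min(requested,c.NumWorkers - 1));

j = batch(c,@rand,1,{4},'Pool',poolSize);
wait(j);
out = fetchOutputs(j);
delete(j);
clear j
```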

Output Arguments

j: Job that runs the script or function, returned as a parallel.Job object.

Example: j = batch('aScript');

Data Types: parallel.Job

Tips

To view the status or track the progress of a batch job, use the Job Monitor, as described in Job Monitor. You can also use the Job Monitor to retrieve a job object for a batch job that was created in a different session, or for a batch job that was created without returning a job object from the batch call.

Delete any batch jobs you no longer need to avoid consuming cluster storage resources unnecessarily.

Introduced in R2008a