Run Batch Job and Access Files from Workers

You can offload your computations to run in the background by using batch. If your code needs access to files, you can use additional options, such as 'AttachedFiles' or 'AdditionalPaths', to make the data accessible to the workers. You can keep working in MATLAB, or close it, while the computations run, and retrieve the results later.

Prepare Example

Prepare and copy the supporting files for this example by using the following command.

prepareSupportingFiles;

Run Batch Job

Create a cluster object using parcluster. By default, parcluster uses your default cluster profile. Check your default cluster profile on the MATLAB Home tab, in the Environment section, in Parallel > Select a Default Cluster.

c = parcluster();

Place your code inside a function and submit it as a batch job by using batch. For an example of a custom function, see the supporting function myFunction. Specify the expected number of output arguments and a cell array with inputs to the function.

If your code uses a parallel pool, use the 'Pool' name-value pair argument to create a parallel pool with the number of workers that you specify. batch uses an additional worker to run the function itself.
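As a rough sketch of the worker accounting described above, the following checks whether the cluster reports enough workers for a pool of a given size plus the one extra worker that runs the function. The variable names poolSize and workersNeeded are illustrative and not part of the example files; the sketch assumes the cluster object exposes the Profile and NumWorkers properties.

```matlab
% Create the cluster object from the default profile and inspect it.
c = parcluster();
fprintf("Profile: %s\n", c.Profile);
fprintf("Workers available: %d\n", c.NumWorkers);

% A batch job with 'Pool',N needs N pool workers plus one worker
% that runs the function itself.
poolSize = 3;                    % illustrative value, matches 'Pool',3
workersNeeded = poolSize + 1;
if c.NumWorkers < workersNeeded
    warning("Cluster reports %d workers, but the job needs %d.", ...
        c.NumWorkers, workersNeeded);
end
```

If the cluster reports fewer workers than the job needs, the job might stay queued or fail to start, depending on the cluster configuration.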

By default, batch changes the initial working directory of the workers to the current folder of the MATLAB client. Controlling the initial working directory is useful when the cluster uses a different file system from the client, so that client paths are not valid on the workers, such as when you submit from a Windows client machine to a Linux cluster.

  • To leave the workers in their own default initial working directory, set 'CurrentFolder' to '.'.

  • To change the initial working directory, set 'CurrentFolder' to a folder of your choice.

This example uses a parallel pool with three workers and chooses a temporary location for the initial working directory.

job = batch(c,@myFunction,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir);
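For comparison, a variation of the call above that leaves the workers in their own default folder might look like the following. The variable name jobWorkerDefault is illustrative; the sketch assumes myFunction is on the client path as in this example.

```matlab
% Variation: do not change the workers' initial working directory.
c = parcluster();
jobWorkerDefault = batch(c,@myFunction,1,{}, ...
    'Pool',3, ...
    'CurrentFolder','.');   % '.' keeps the workers' default folder
```

This form can be convenient when the client folder does not exist on the cluster, because the workers never attempt to change to it.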

batch offloads the computations in your function to a parallel worker, so you can continue working in MATLAB while computations take place.

If you want to block MATLAB until the job completes, use the wait function on the job object.

wait(job);
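If blocking indefinitely is not desirable, one option is to pass a state and a timeout to wait and then inspect the State property of the job. This is a minimal sketch that assumes job is the batch job submitted above; the 60-second timeout is an arbitrary illustrative value.

```matlab
% Show the current state of the job, for example 'queued' or 'running'.
disp(job.State)

% Wait up to 60 seconds for the job to finish.
% wait returns false if the timeout elapses first.
finished = wait(job,'finished',60);
if ~finished
    fprintf("Job %d is still %s after 60 seconds.\n", job.ID, job.State);
end
```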

To retrieve the results, use fetchOutputs on the job object.

If your code has an error, then fetchOutputs throws an error. You can access error information by checking the Error property of Task objects in the job. In this example, the code depends on a file that the workers cannot find.

getReport(job.Tasks(1).Error)
ans = 
    'Error using myFunction (line 4)
     Unable to read file 'mydata2.dat'. No such file or directory.'
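The error check described above can be sketched as a try/catch around fetchOutputs that prints the report of every task that recorded an error. This assumes job(1) is the first batch job submitted above and that the Error property of a task without an error is empty.

```matlab
try
    out = fetchOutputs(job(1));
catch err
    fprintf("fetchOutputs failed: %s\n", err.message);
    % A job that uses a pool can contain several tasks; check each one.
    tasks = job(1).Tasks;
    for k = 1:numel(tasks)
        if ~isempty(tasks(k).Error)
            fprintf("Task %d:\n%s\n", tasks(k).ID, getReport(tasks(k).Error));
        end
    end
end
```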

Access Files from Workers

By default, batch automatically analyzes your code and transfers required files to the workers. In some cases, you must explicitly transfer those files, for example when you determine the name of a file at runtime.

In this example, myFunction accesses the supporting file mydata.dat, which batch automatically detects and transfers. The function also accesses mydata1.dat, mydata2.dat, and mydata3.dat, but it builds those file names at runtime, so the automatic dependency analysis does not detect them.

type myFunction.m 
function X = myFunction()    
    A = load("mydata.dat"); 
    X = zeros(flip(size(A)));
    parfor i = 1:3
       B = load("mydata"+i+".dat");
       X = X + A\B;
    end
end
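To see which files the dependency analysis picked up for a submitted job, one option is listAutoAttachedFiles. This sketch assumes job(1) is the first batch job submitted above; the exact list depends on your setup, so no output is shown here.

```matlab
% List the files that batch attached automatically to the first job.
% Files whose names are built at runtime, such as "mydata"+i+".dat",
% are not expected to appear in this list.
listAutoAttachedFiles(job(1))
```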

If the data is in a location that the workers can access, you can use the name-value pair argument 'AdditionalPaths' to specify the location. 'AdditionalPaths' adds this path to the MATLAB search path of the workers and makes the data visible to them.

pathToData = pwd;
job(2) = batch(c,@myFunction,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir, ...
    'AdditionalPaths',pathToData);
wait(job(2));

If the data is in a location that the workers cannot access, you can transfer files to the workers by using the 'AttachedFiles' name-value pair argument.

job(3) = batch(c,@myFunction,1,{}, ...
    'Pool',3, ...
    'CurrentFolder',tempdir, ...
    'AttachedFiles',"mydata"+string(1:3)+".dat");
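Because 'AttachedFiles' refers to files on the client, a quick check that they exist can run before the batch call. The following is a minimal sketch; the variable names dataFiles and missing are illustrative.

```matlab
% Build the same file names that myFunction builds at runtime.
dataFiles = "mydata" + string(1:3) + ".dat";

% Report any file that is not present in the current folder or on a
% path that isfile can resolve.
missing = dataFiles(~isfile(dataFiles));
if isempty(missing)
    disp("All data files found.");
else
    fprintf("Missing data files: %s\n", strjoin(missing, ", "));
end
```

If the check passes, dataFiles can be passed directly as the value of 'AttachedFiles'.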

Find Existing Job

You can close MATLAB after job submission and retrieve the results later. Before you close MATLAB, make a note of the job ID.

job3ID = job(3).ID
job3ID = 19

When you open MATLAB again, you can find the job by using the findJob function.

job(3) = findJob(c,'ID',job3ID);
wait(job(3));
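If you did not note the job ID, findJob can also filter on other job properties. The sketch below lists the finished jobs that the cluster object knows about; it assumes the default profile is the one used for submission.

```matlab
c = parcluster();

% Find all jobs on this cluster that have finished.
finishedJobs = findJob(c,'State','finished');
for k = 1:numel(finishedJobs)
    fprintf("Job %d: %s (%s)\n", ...
        finishedJobs(k).ID, finishedJobs(k).Name, finishedJobs(k).State);
end
```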

Alternatively, you can use the Job Monitor to track your job. You can open it from the MATLAB Home tab, in the Environment section, in Parallel > Monitor Jobs.

Retrieve Results and Clean Up Data

To retrieve the results of a batch job, use the fetchOutputs function. fetchOutputs returns a cell array with the outputs of the function run with batch.

X = fetchOutputs(job(3))
X = 1×1 cell array
    {420×1680 double}
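Because fetchOutputs returns a cell array, you typically index into it to get the numeric result. The following sketch unpacks the single output and saves it to a MAT-file so that it remains available after the job is deleted. The file name batchResult.mat and the variable names are illustrative.

```matlab
outputs = fetchOutputs(job(3));
result = outputs{1};            % the matrix X returned by myFunction
disp(size(result))

% Keep a local copy before deleting the job (hypothetical file name).
save("batchResult.mat","result");
```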

When you have retrieved all the required outputs and do not need the job object anymore, delete it to clean up its data and avoid consuming resources unnecessarily.

delete(job)
clear job
