
Choose How to Manage Data in Parallel Computing

To perform parallel computations, you need to manage data access and transfer between your MATLAB® client and the parallel workers. Use this page to decide how to transfer data between the client and the workers. You can manage data such as files, MATLAB variables, and handle-type resources.

Determine Your Data Management Approach

The best techniques for managing data depend on your parallel application. Use the following tables to find your goal and discover the appropriate data management functions and their key features. In some cases, more than one type of object or function might meet your requirements. You can choose the type of object or function based on your preferences.

Transfer Data from Client to Workers

Use this table to identify some goals for transferring data from the client to workers and discover recommended workflows.

Goal | Recommended Workflow

Use variables in your MATLAB workspace in an interactive parallel pool.

The parfor and spmd functions automatically transfer variables in the client workspace to workers. To send variables to workers in a parfeval computation, you must specify variables as input arguments in the parfeval function call.
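For example, this minimal sketch passes client workspace data to a parfeval computation as input arguments, assuming an open parallel pool; the variable names are arbitrary.

    A = rand(1000);
    n = 3;

    % parfeval copies the input arguments A and n to the worker
    f = parfeval(@(x,k) x^k, 1, A, n);

    % Retrieve the result when the computation finishes
    B = fetchOutputs(f);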

Transfer variables in your MATLAB workspace to workers on a cluster in a batch workflow.

Pass variables as inputs to the batch function.
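A minimal sketch, assuming a default cluster profile and a hypothetical function processData that takes two inputs and returns one output:

    X = rand(1e4,1);
    windowSize = 25;

    % Each variable in the cell array is transferred to the worker as an input
    job = batch(@processData, 1, {X, windowSize});

    wait(job);
    results = fetchOutputs(job);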

Give workers access to large data stored on your desktop.

  • To give workers in a parallel pool access to large data, store the data in a parallel.pool.Constant object (see the sketch after this list).

  • To give workers in a batch job created with the batch function access to large data, pass the data as an argument to the batch function.

  • To give workers in a batch job created with the createJob function access to large data, you can store the data in the job's ValueStore object before you submit the job tasks.
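A minimal sketch of the parallel.pool.Constant approach, assuming an open parallel pool; the array and loop bounds are placeholders.

    bigData = rand(5000);                    % example large array
    C = parallel.pool.Constant(bigData);     % transferred to each worker once

    blockMeans = zeros(1,10);
    parfor k = 1:10
        cols = (k-1)*500 + (1:500);
        % C.Value refers to the copy of the data held on the worker
        blockMeans(k) = mean(C.Value(:,cols), "all");
    end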

Access large amounts of data or large files stored in the cloud and process them in an onsite or cloud cluster.

Use datastore with tall and distributed arrays to access and process data that does not fit in memory.
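A minimal sketch, assuming a folder of CSV files in a hypothetical Amazon S3 bucket with a column named FareAmount; both names are placeholders.

    ds = datastore("s3://mybucket/taxidata/*.csv");   % or a path on cluster storage
    tt = tall(ds);                                    % tall table backed by the datastore

    % Operations on tall arrays are deferred until you call gather
    avgFare = gather(mean(tt.FareAmount, "omitnan"));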

Give workers access to files stored on the client computer.

For workers in a parallel pool:

  • If the files are small or contain live data, you can specify files to send to workers using the addAttachedFiles function.

  • If the files are large or contain static data, you can reduce data transfer overheads by moving the files to the cluster storage. Use the addpath function to add their location to the workers' search paths.

For workers running batch jobs:

  • If the files are small or are frequently modified, you can let MATLAB determine which files to send to workers by setting the AutoAttachFiles property of the job to true. You can check if AutoAttachFiles has picked up all the file dependencies by running the listAutoAttachedFiles function.

  • You can also specify files to send to workers using the AttachedFiles property of the job.

  • If the files are large or are not frequently modified, you can reduce data transfer overheads by moving the files to the cluster storage and using the AdditionalPaths property of the job to specify their location. You must ensure that the workers have access to the cluster storage location. For a sketch of a batch job that combines these options, see the example after this list.
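A minimal sketch of a batch job that attaches a small helper file from the client and points the workers at a data folder already on cluster storage; the script, file, and folder names are placeholders.

    c = parcluster;
    job = batch(c, "analysisScript", ...
        "AttachedFiles", {"helperFunction.m"}, ...               % copied to each worker
        "AdditionalPaths", {"/cluster/shared/projectData"}, ...  % already on cluster storage
        "AutoAttachFiles", false);                               % skip dependency analysis
    wait(job);
    load(job);   % load the script's workspace variables at the client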

Access custom MATLAB functions or libraries that are stored on the cluster.

Specify paths to the libraries or functions using the AdditionalPaths property of a parallel job.

Allow workers in a parallel pool to access non-copyable resources such as database connections or file handles.

Use parallel.pool.Constant objects to manage handle-type resources such as database connections or file handles across pool workers.
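A minimal sketch that gives each pool worker its own file handle by building the parallel.pool.Constant from a function handle; the log file location is a placeholder.

    % Each worker runs the build function once and keeps the resulting handle;
    % fclose runs when the Constant is cleared
    C = parallel.pool.Constant(@() fopen(tempname + ".log", "wt"), @fclose);

    parfor k = 1:20
        % Each worker writes to its own log file through its own handle
        fprintf(C.Value, "Finished iteration %d\n", k);
    end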

Send a message to a worker running a function in an interactive pool.

Create a parallel.pool.PollableDataQueue object at the worker, and send this object back to the client. Then you can use the PollableDataQueue object to send a message to the worker. For an example of this workflow, see Receive Communication on Workers.
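A minimal sketch of this workflow, assuming an open parallel pool; workerTask is a hypothetical function saved as a local function at the end of the script.

    q = parallel.pool.PollableDataQueue;   % used to return the worker's queue

    % Start the function on a worker; it sends its own queue back, then waits
    f = parfeval(@workerTask, 1, q);

    workerQueue = poll(q, 30);             % receive the worker's queue at the client
    send(workerQueue, "start");            % send a message to the worker

    disp(fetchOutputs(f))

    function out = workerTask(clientQueue)
        myQueue = parallel.pool.PollableDataQueue;   % created on the worker
        send(clientQueue, myQueue);                  % hand the queue to the client
        [msg, ok] = poll(myQueue, 60);               % wait for the client's message
        if ok
            out = "Worker received: " + msg;
        else
            out = "Timed out waiting for a message";
        end
    end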

Transfer Data Between Workers

Use this table to identify some goals for transferring data between workers and discover recommended workflows.

Goal | Recommended Workflow

  • Coordinate data transfer between workers as part of a parallel pipeline application.

  • Communicate between workers with Message Passing Interface (MPI).

Use the spmdSend, spmdReceive, spmdSendReceive, and spmdBarrier functions to communicate between workers in an spmd block. These functions use the Message Passing Interface (MPI) to send and receive data between workers.
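A minimal sketch, assuming a pool with at least two workers:

    spmd
        if spmdIndex == 1
            % Worker 1 sends a chunk of data to worker 2
            spmdSend(rand(1,5), 2);
        elseif spmdIndex == 2
            % Worker 2 receives the data sent by worker 1
            chunk = spmdReceive(1);
            disp(chunk)
        end
        spmdBarrier;   % wait until every worker reaches this point
    end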

Offload results from a worker so that another worker can process them.

Store the data in the ValueStore object of the job or parallel pool. Multiple workers can read and write to the ValueStore object, which is stored on a shared file system accessible by the client and all workers.
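A minimal sketch of a two-stage pipeline on a parallel pool; stageOne and stageTwo are hypothetical local functions and the key name is arbitrary.

    f1 = parfeval(@stageOne, 0);
    wait(f1);                        % make sure the intermediate result exists
    f2 = parfeval(@stageTwo, 1);
    total = fetchOutputs(f2);

    function stageOne()
        store = getCurrentValueStore;        % the pool ValueStore, on the worker
        put(store, "stage1", rand(1000));    % write an intermediate result
    end

    function total = stageTwo()
        store = getCurrentValueStore;
        data = get(store, "stage1");         % read the other worker's result
        total = sum(data, "all");
    end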

Transfer Data from Workers to Client

Use this table to identify some goals for transferring data from a worker to a client and discover recommended workflows.

Goal | Recommended Workflow

Retrieve results from a parfeval calculation.

Apply the fetchOutputs (parfeval) function to the parfeval Future object.

Retrieve large results at the client.

Store the data in the ValueStore object of the job or parallel pool. Multiple workers can read and write to the ValueStore object, which is stored on a shared file system accessible by the client and all workers.

  • Transfer a large file to the client.

  • Transfer files created during a batch execution back to the client.

Use the FileStore object of the parallel pool or job to store the files. Workers can read and write to the FileStore object, which is stored on a shared file system accessible by the client and all workers.
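A minimal sketch on a parallel pool; the file names and key are placeholders.

    f = parfeval(@writeResultFile, 0);
    wait(f);

    % At the client, copy the stored file out of the FileStore
    pool = gcp;
    copyFileFromStore(pool.FileStore, "resultFile", fullfile(pwd, "result.mat"));

    function writeResultFile()
        resultFile = fullfile(tempdir, "result.mat");
        data = rand(100);                          % example result
        save(resultFile, "data");
        store = getCurrentFileStore;               % the pool FileStore, on the worker
        copyFileToStore(store, resultFile, "resultFile");
    end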

Fetch the results from a parallel job.

Apply the fetchOutputs (Job) function to the job object to retrieve all the output arguments from all tasks in a job.
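A minimal sketch, assuming a default cluster profile; processChunk is a hypothetical task function.

    c = parcluster;
    job = createJob(c);
    for k = 1:4
        createTask(job, @processChunk, 1, {k});
    end
    submit(job);
    wait(job);

    outputs = fetchOutputs(job);   % one row of outputs per task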

Load the workspace variables from a batch job running a script or expression.

Apply the load function to the job object to load the workspace variables from the job into the client workspace.
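A minimal sketch; myAnalysisScript is a hypothetical script.

    job = batch("myAnalysisScript");
    wait(job);

    load(job)                   % load all workspace variables from the job
    % load(job, "results")      % or load only a specific variable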

Transfer Data from Workers to Client During Execution

Use this table to identify some goals for transferring data from a worker during execution and discover recommended workflows.

Goal | Recommended Workflow

Inspect results from parfor or parfeval calculations in an interactive parallel pool.

Use a PollableDataQueue to send results to the client during execution.

Update a plot, progress bar or other user interface with data from a function running in an interactive parallel pool.

Send the data to the client with a parallel.pool.DataQueue and use afterEach to run a function that updates the user interface when new data is received.

Visualizing results in this way can be very useful, but with very large computations that make thousands of calls to the afterEach update function, you might observe some performance degradation. In that case, consider turning off the visualization when you scale up.
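A minimal sketch of this pattern: a function that updates a wait bar from a parfor loop through a DataQueue. The work inside the loop is a placeholder.

    function parforWaitbarDemo
        q = parallel.pool.DataQueue;
        N = 200;
        h = waitbar(0, "Processing...");
        completed = 0;

        afterEach(q, @updateWaitbar);   % runs on the client for every message

        parfor k = 1:N
            pause(0.01);                % placeholder for real work
            send(q, k);                 % tell the client this iteration is done
        end
        close(h)

        function updateWaitbar(~)
            completed = completed + 1;
            waitbar(completed/N, h);
        end
    end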

Collect data asynchronously to update a plot, progress bar or other user interface with data from a parfeval calculation.

Use afterEach to schedule a callback function that updates the user interface after a Future object finishes.
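A minimal sketch, assuming an open parallel pool; simulateTrial is a hypothetical function that returns a vector of results.

    ax = axes;
    hold(ax, "on")

    for k = 1:10
        f(k) = parfeval(@simulateTrial, 1, k);
    end

    % Run the plotting callback at the client after each future finishes
    afterEach(f, @(y) plot(ax, y), 0);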

  • Track the progress of a job.

  • Retrieve some intermediate results while a job is running.

Store the data in the ValueStore object of the job. Use the KeyUpdatedFcn or the KeyRemovedFcn properties of the ValueStore object to run a callback function that updates a user interface at the client when data is added to or removed from the ValueStore.
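A minimal sketch; trainModels is a hypothetical function that calls getCurrentValueStore on the workers and stores intermediate results under descriptive keys.

    job = batch(@trainModels, 0, {});

    % Run a callback at the client whenever the job adds or updates a key
    store = job.ValueStore;
    store.KeyUpdatedFcn = @(store, key) fprintf("New result available: %s\n", key);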

  • Send a large file to the client.

  • Transfer files created during a batch execution back to the client.

Store the files in the FileStore object of the job. Use the KeyUpdatedFcn or the KeyRemovedFcn properties of the FileStore object to run a callback function that transfers files to the client when files are added to or removed from the FileStore.

Compare Data Management Functions and Objects

Some parallel computing objects and functions that manage data have similar features. This section provides comparisons of the functions and objects that have similar features for managing data.

DataQueue vs. ValueStore

DataQueue and ValueStore are two objects in Parallel Computing Toolbox™ that you can use to transfer data between the client and workers. The DataQueue object passes data from workers to the client in first-in, first-out (FIFO) order, while ValueStore stores data that multiple workers, as well as the client, can access and update. You can use both objects for asynchronous data transfer to the client. However, DataQueue is supported only on interactive parallel pools.

The choice between DataQueue and ValueStore depends on the data access pattern your parallel application requires. If you have many independent tasks that workers can execute in any order, and you want to pass data to the client in a streaming fashion, use a DataQueue object. However, if you want to store values, share them with multiple workers, and access or update them at any time, use ValueStore instead.

fetchOutputs (parfeval) vs. ValueStore

Use the fetchOutputs function to retrieve the output arguments of a Future object, which the software returns when you run a parfeval or parfevalOnAll computation. fetchOutputs blocks the client until the computation is complete, then sends the results of the parfeval or parfevalOnAll computation to the client. In contrast, you can use ValueStore to store and retrieve values from any parallel computation and also retrieve intermediate results as they are produced without blocking the program. Additionally, the ValueStore object is not held in system memory, so you can store large results in the ValueStore. However, be careful when storing large amounts of data to avoid filling up the disk space on the cluster.

If you only need to retrieve the output of a parfeval or parfevalOnAll computation, then fetchOutputs is the simpler option. However, if you want to store and access the results of multiple independent parallel computations, then use ValueStore. In cases where you have multiple parfeval computations generating large amounts of data, using the pool ValueStore object can help avoid memory issues on the client. You can temporarily save the results in the ValueStore and retrieve them when you need them.

load and fetchOutputs (Jobs) vs. ValueStore

load, fetchOutputs (Jobs), and ValueStore provide different ways of transferring data from jobs back to the client.

load retrieves the variables related to a job you create when you use the batch function to run a script or an expression. This includes any input arguments you provide and temporary variables the workers create during the computation. load does not retrieve the variables from batch jobs that run a function, and you cannot retrieve the variables while the job is running. fetchOutputs (Jobs) retrieves the output arguments contained in the tasks of a finished job you create using the batch, createJob, or createCommunicatingJob functions. If the job is still running when you call fetchOutputs (Jobs), the function returns an error.

When you create a job on a cluster, the software automatically creates a ValueStore object for the job, and you can use it to store data generated during job execution. Unlike the load and fetchOutputs functions, the ValueStore object does not automatically store data. Instead, you must manually add data as key-value pairs to the ValueStore object. Workers can store data in the ValueStore object that the MATLAB client can retrieve during the job execution. Additionally, the ValueStore object is not held in system memory, so you can store large results in the store.

To retrieve the results of a job after the job has finished, use the load or fetchOutputs (Jobs) function. To access the results or track the progress of a job while it is still running, or to store potentially high-memory results, use the ValueStore object.

AdditionalPaths vs. AttachedFiles vs. AutoAttachFiles

AdditionalPaths, AttachedFiles, and AutoAttachFiles are all parallel job properties that you can use to specify additional files and directories that are required to run parallel code on workers.

AdditionalPaths is a property you can use to add cluster file locations to the MATLAB path on all workers running your job. This can be useful if you have large data files, functions, or libraries that are stored on the cluster storage and required by the workers, but are not on the MATLAB path by default.

The AttachedFiles property allows you to specify files or directories that are required by the workers but are not stored on the cluster storage. These files are copied to a temporary directory on each worker before the parallel code runs. The files can be scripts, functions, or data files, and must be located within the directory structure of the client.

Use the AutoAttachFiles property to allow files needed by the workers to be automatically attached to the job. When you submit a job or task, MATLAB performs dependency analysis on all the task functions, or on the batch job script or function, and then automatically adds the required files to the job or task object so they are transferred to the workers. Set the AutoAttachFiles property to false only if you know that you do not need the software to identify the files for you, for example, if the files your job uses are already present on the cluster, perhaps inside one of the AdditionalPaths locations.

Use AdditionalPaths when you have functions and libraries stored on the cluster that are required on all workers. Use AttachedFiles when you have small files that are required to run your code. To let MATLAB automatically determine whether a job requires additional files to run, set the AutoAttachFiles property to true.
