Use Amazon S3 Buckets with MATLAB Deep Learning Container

Train your deep learning model with training data stored in an Amazon S3™ bucket and save the trained model to the cloud.

You can scale up an existing deep learning workflow by moving data and training to the cloud, where you can rent high-performance GPUs and store large data files. One way to do this is to use Amazon S3 buckets. You can read from and write to S3 buckets directly from MATLAB®. You can use this workflow to access data in an S3 bucket from a MATLAB Deep Learning Container, and to get variables into and out of the container. For example:

  • If you have data locally, you can use the workflow on this page to upload that data to an S3 bucket and access it from your MATLAB Deep Learning Container to train in the cloud on high-performance GPUs.

  • After training in the container, you can save variables to the S3 bucket and access them from anywhere, even after you stop running the container.

Create an Amazon S3 Bucket

To upload a model from your local installation of MATLAB to the MATLAB session running in the deep learning container on a GPU-enabled Amazon EC2 instance, you can use an S3 bucket. Use the save function to save a model (and other workspace variables) as MAT-files from your local installation of MATLAB to an S3 bucket, then use the load function to load the model in the deep learning container. Similarly, you can save a trained model from the deep learning container to an S3 bucket and load it into your local MATLAB session.

To get started using S3 buckets with MATLAB:

  1. Download and install the AWS® Command Line Interface tool on your local machine.

  2. Create AWS access keys and set the keys as environment variables on your local machine.

  3. Create an S3 bucket for your data.

For detailed step-by-step instructions for steps 1-3, including how to create AWS access keys, export the keys, and set up an S3 bucket, see the Transfer Data To Amazon S3 Buckets and Access Data Using MATLAB Datastore documentation page.
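As a rough sketch, the bucket-creation step can also be run from the MATLAB command window by shelling out to the AWS CLI. This assumes the CLI is installed and your credentials are configured; the bucket name mynewbucket is a placeholder.

```matlab
% Hypothetical example: create an S3 bucket from MATLAB by calling the
% AWS CLI via system (assumes the CLI is installed and configured).
% "mynewbucket" is a placeholder; bucket names must be globally unique.
[status, output] = system('aws s3 mb s3://mynewbucket');
if status ~= 0
    error('Bucket creation failed: %s', output);
end
disp(output)
```

You can run the same command directly in a terminal instead; calling it from MATLAB simply keeps the whole workflow in one session.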

Save and Load MATLAB Workspace Variables with Amazon S3

From your local installation of MATLAB, you can save an untrained neural network, for example untrainedNetwork, directly from your workspace to your S3 bucket, mynewbucket. You must set your AWS Access Key ID, Secret Access Key, and (if you are using an AWS temporary token) Session Token as environment variables in your local MATLAB installation.

setenv('AWS_ACCESS_KEY_ID', 'YOUR_AWS_ACCESS_KEY_ID'); 
setenv('AWS_SECRET_ACCESS_KEY', 'YOUR_AWS_SECRET_ACCESS_KEY');
setenv('AWS_SESSION_TOKEN', 'YOUR_AWS_SESSION_TOKEN'); % optional
setenv('AWS_DEFAULT_REGION', 'YOUR_AWS_DEFAULT_REGION'); % optional

save('s3://mynewbucket/untrainedNetwork.mat','untrainedNetwork','-v7.3');

Load this untrained network from the S3 bucket into the MATLAB session running in the deep learning container on AWS. Again, you must set your AWS Access Key ID, Secret Access Key, and (if you are using an AWS temporary token) Session Token as environment variables in your container MATLAB session.

setenv('AWS_ACCESS_KEY_ID', 'YOUR_AWS_ACCESS_KEY_ID'); 
setenv('AWS_SECRET_ACCESS_KEY', 'YOUR_AWS_SECRET_ACCESS_KEY');
setenv('AWS_SESSION_TOKEN', 'YOUR_AWS_SESSION_TOKEN'); % optional
setenv('AWS_DEFAULT_REGION', 'YOUR_AWS_DEFAULT_REGION'); % optional

load('s3://mynewbucket/untrainedNetwork.mat')

Note: Saving and loading MAT-files to and from remote file systems using the save and load functions is supported in MATLAB R2021a and later, provided the MAT-files are version 7.3. Ensure that you are running MATLAB R2021a or later both on your local machine and in the deep learning container.

Save your training, testing, and validation data from your local MATLAB workspace to an S3 bucket and load it into the deep learning container by following the same steps as above. You can then train your model, save the trained network to the S3 bucket, and load the trained network back into your local MATLAB installation.
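For example, the round trip for the trained network might look like the following sketch. The variable name trainedNetwork and the bucket name mynewbucket are placeholders, and the AWS environment variables must be set in each session as shown earlier.

```matlab
% In the container MATLAB session: save the trained network to the bucket.
% Use -v7.3, which is required when saving MAT-files to remote file systems.
save('s3://mynewbucket/trainedNetwork.mat','trainedNetwork','-v7.3');

% Later, in your local MATLAB session: load the trained network back.
load('s3://mynewbucket/trainedNetwork.mat')
```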

Save and Access Training Data with Amazon S3

You can train your network using data hosted in an S3 bucket from either your local installation of MATLAB or your MATLAB session running in the deep learning container. This approach is useful if you already have data in S3, or if your data sets are too large to download onto your local machine or into the container.

For an example showing how to upload the CIFAR-10 data set from your local machine to an S3 bucket, see steps 1-4 on the Upload Deep Learning Data to the Cloud documentation page.

After you store your data in Amazon S3, you can use datastores to access the data from your MATLAB session, either on your local machine or in the deep learning container (ensure that the appropriate AWS access keys are set as environment variables). Create a datastore that points to the URL of the S3 bucket. The following sample code shows how to use an imageDatastore to access an S3 bucket. Replace 's3://MyExampleCloudData/cifar10/train' with the URL of your S3 bucket.

setenv('AWS_ACCESS_KEY_ID', 'YOUR_AWS_ACCESS_KEY_ID'); 
setenv('AWS_SECRET_ACCESS_KEY', 'YOUR_AWS_SECRET_ACCESS_KEY');
setenv('AWS_SESSION_TOKEN', 'YOUR_AWS_SESSION_TOKEN'); % optional
setenv('AWS_DEFAULT_REGION', 'YOUR_AWS_DEFAULT_REGION'); % optional

imds = imageDatastore('s3://MyExampleCloudData/cifar10/train', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

With the CIFAR-10 data set now stored in Amazon S3, you can try any of the examples in Deep Learning in Parallel and in the Cloud that show how to use CIFAR-10 in different use cases.
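As an illustrative sketch, once the datastore exists you can pass it to a training function in the usual way. The layer array layers below is a placeholder that you must define yourself, and setting ExecutionEnvironment to 'gpu' assumes a GPU is available, as it is on a GPU-enabled EC2 instance.

```matlab
% Sketch only: "layers" is a placeholder network architecture that you
% must define (for example, a small CNN for 32-by-32 CIFAR-10 images).
opts = trainingOptions('sgdm', ...
    'MaxEpochs',10, ...
    'MiniBatchSize',128, ...
    'ExecutionEnvironment','gpu');   % assumes a GPU is available

% Train directly on the data in the S3 bucket through the datastore.
net = trainNetwork(imds, layers, opts);
```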

Note: Training is always faster with locally hosted training data. Reading remote data incurs overhead, especially when the data consists of many small files, as in the digits classification example. Training time also depends on network speed and on the proximity of the S3 bucket to the machine running the MATLAB container. Larger data files (greater than 200 KB per file) make more efficient use of bandwidth in EC2. If your data set is small enough to store locally, copy it to the training machine for the best training speed.

Related Topics