Unable to read file on AWS S3
I am getting an error when trying to access files in our S3 bucket to train a neural network for image classification. I have set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY correctly, and I can retrieve an image and view it in MATLAB with imshow. However, the trainNetwork call fails with an "Unable to read file..." message. Here is the relevant code up to the line where it fails:
%% set up net
setenv('AWS_ACCESS_KEY_ID','XXXXXXXXXXXXXXXX');
setenv('AWS_SECRET_ACCESS_KEY','XXXXXXXXXXXXXXXXXXX')
% start cluster
c=parcluster
start(c)
numberOfWorkers = 1;
% set network architecture
net=resnet50;
lgraph = layerGraph(net);
[learnableLayer,classLayer] = findLayersToReplace(lgraph);
[learnableLayer,classLayer]
newLearnableLayer = fullyConnectedLayer(2, ...
'Name','new_fc', ...
'WeightLearnRateFactor',10, ...
'BiasLearnRateFactor',10);
lgraph = replaceLayer(lgraph,learnableLayer.Name,newLearnableLayer);
newClassLayer = classificationLayer('Name','new_classoutput');
lgraph = replaceLayer(lgraph,classLayer.Name,newClassLayer);
figure('Units','normalized','Position',[0.3 0.3 0.4 0.4]);
plot(lgraph)
ylim([0,20])
layers = lgraph.Layers;
connections = lgraph.Connections;
layers(1:10) = freezeWeights(layers(1:10));
lgraph = createLgraphUsingConnections(layers,connections);
pixelRange = [-30 30];
scaleRange = [0.9 1.1];
imageAugmenter = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandYReflection',true, ...
'RandRotation',[-180 180], ...
'RandXTranslation',pixelRange, ...
'RandYTranslation',pixelRange, ...
'RandXShear',[0 20], ...
'RandYShear',[0 20],...
'RandScale',scaleRange ...
);
%%
pool = parpool('EnvironmentVariables',["AWS_ACCESS_KEY_ID","AWS_SECRET_ACCESS_KEY"]);
%%
baseFd='E:\AI Training images\Results\';
baseFds3='s3://rumitestbucket/';
cd([baseFd])
TrainFd=[baseFds3,'train/']
imds = imageDatastore(TrainFd, ...
'IncludeSubfolders',true, ...
'FileExtensions','.tif', ...
'LabelSource','foldernames');
[imdsTrain,imdsValidation] = splitEachLabel(imds,0.8,0.2);
augimdsTrain = augmentedImageDatastore(net.Layers(1).InputSize(1:2),imdsTrain, ...
'DataAugmentation',imageAugmenter);
augimdsValidation = augmentedImageDatastore(net.Layers(1).InputSize(1:2),imdsValidation);
miniBatchSize = 12;
valFrequency = floor(numel(augimdsTrain.Files)/miniBatchSize);
options = trainingOptions('sgdm', ...
'ExecutionEnvironment','parallel', ...
'MiniBatchSize',miniBatchSize, ...
'MaxEpochs',40, ...
'InitialLearnRate',3e-4, ...
'Shuffle','every-epoch', ...
'ValidationData',augimdsValidation, ...
'ValidationFrequency',valFrequency, ...
'Verbose',false, ...
'Plots','training-progress');
[TrainedNet,info] = trainNetwork(augimdsTrain,lgraph,options);
And this is the error message:
Error using trainNetwork (line 184)
Unable to read file: 's3://rumitestbucket/train/RUES2_training/E03_col19.tif'.
Caused by:
Error using nnet.internal.cnn.DistributedDispatcher/computeInParallel (line 193)
Error detected on worker 1.
Error using matlab.io.datastore.ImageDatastore/read (line 77)
Unable to read file: 's3://rumitestbucket/train/RUES2_training/E03_col19.tif'.
Unable to use a value of type string as an index.
-------------------------------
Any help would be very appreciated!
2 Comments
Joss Knight
on 8 Jul 2021
So can you read your dataset on your client MATLAB? For example:
reset(augimdsTrain);
while hasdata(augimdsTrain)
read(augimdsTrain);
end
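If that loop succeeds on the client, a natural follow-up is to run a similar check on the pool workers, since the error is reported on worker 1. The following is only a minimal sketch under stated assumptions: a parallel pool is already open (for example, from the parpool call in the question), and imread accepts s3:// locations in the installed release. The file path is the one quoted in the error message; the variable names isSet, testFile, and imageSizes are illustrative.

```matlab
% Sketch only: assumes a parallel pool is already open,
% e.g. from the parpool call in the question.
pool = gcp('nocreate');
if isempty(pool)
    error('No parallel pool is open; start one with parpool first.');
end

% 1) Check whether each credential variable is non-empty on every worker.
varNames = {'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'};
for k = 1:numel(varNames)
    f = parfevalOnAll(pool, @(name) ~isempty(getenv(name)), 1, varNames{k});
    isSet = fetchOutputs(f);  % one logical value per worker
    fprintf('%s set on all workers: %d\n', varNames{k}, all(isSet));
end

% 2) Try to read the file named in the error message on every worker.
%    Assumption: imread supports s3:// locations in this release.
testFile = 's3://rumitestbucket/train/RUES2_training/E03_col19.tif';
f = parfevalOnAll(pool, @(p) size(imread(p)), 1, testFile);
try
    imageSizes = fetchOutputs(f);  % one row of image dimensions per worker
    disp(imageSizes);
catch err
    fprintf('Read failed on a worker: %s\n', err.message);
end
```

If the first check reports 0 for either variable, the credentials did not reach the workers. If both are set but the read still fails, the problem is more likely in how the workers access the bucket than in the datastore itself.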
Answers (0)