Is it possible to use variable length arrays with SequenceInputLayer in a custom training loop, i.e. as a dlarray?

I would like to train an LSTM network on sequences of differing lengths. I have followed the example "nnet/SequenceClassificationUsing1DConvolutionsExample", but for my own application I need to implement such an LSTM network as a dlnetwork and use a custom training loop. (It will be part of a generative model.)
% Edited example:
[XTrain,YTrain] = japaneseVowelsTrainData;
[XValidation,TValidation] = japaneseVowelsTestData;
numFeatures = size(XTrain{1},1);
inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;
filterSize = 3;
numFilters = 32;
layers = [ ...
    sequenceInputLayer(numFeatures)
    convolution1dLayer(filterSize,numFilters,Padding="causal")
    reluLayer
    layerNormalizationLayer
    convolution1dLayer(filterSize,2*numFilters,Padding="causal")
    reluLayer
    layerNormalizationLayer
    globalAveragePooling1dLayer
    fullyConnectedLayer(numClasses)
    softmaxLayer ]; % classification layer removed for dlnetwork
% my code: create the dlnetwork
dlnetJap = dlnetwork( layers );
As I understand it from the documentation, dlarrays must have fixed dimensions. This would not seem to be a problem, as the example code stores the variable-length arrays in a cell array, e.g.
net = trainNetwork(XTrain,YTrain,layers,options);
However, a dlnetwork does not permit a cell array as input. (I use analyzeNetwork to demonstrate this rather than showing my whole code. I get the same error when I use forward() inside dlfeval.)
analyzeNetwork( dlnetJap, XTrain );
Error using analyzeNetwork (line 56)
Invalid argument at position 2. Example network inputs must be formatted dlarray objects.
Nor can I create the cell array as a dlarray directly.
analyzeNetwork( dlnetJap, dlarray( XTrain, 'CB' ) ); % or any other format
Error using dlarray (line 151)
dlarray is supported only for full arrays of data type double, single, or logical, or for full gpuArrays of these data types.
I also tried converting the array within each cell to a dlarray, but to no avail.
XTrainDl = cell( size(XTrain) );
for i = 1:length(XTrainDl)
XTrainDl{i} = dlarray( XTrain{i}, 'CB' );
end
analyzeNetwork( dlnetJap, XTrainDl );
Error using analyzeNetwork (line 56)
Invalid argument at position 2. Example network inputs must be formatted dlarray objects.
I can't see a way around this problem. I have already created the generative model based on fully connected layers rather than LSTM. I suppose I could use an LSTM with a fixed-length input, but my time series data differs in length. I would not like to time-normalise to a standard length, as that distorts the data. Nor can I pad it at one end, because it comes from cyclical data. LSTM seems ideal for its ability to deal with sequences of variable length, but how do I get that to work in a custom training loop?

 Accepted Answer

dlnetwork objects do not take your entire dataset as input; they expect to receive a single batch at a time. You need to loop over your dataset and access the data. You can use iterator objects such as minibatchqueue and arrayDatastore to help you achieve this.
The purpose of dlnetwork is to give you much greater control over how to iterate over your dataset and flexibility over how your network is built, but as a result you need to write more of the code yourself.
dlnetworks are not restricted to fixed-size inputs. They do sometimes need to be given example inputs in order to initialize, but that doesn't mean the input size cannot change once the network is initialized. In your code, example input data is not needed for dlnetwork or analyzeNetwork, because your sequenceInputLayer already provides that information.
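To illustrate, here is a minimal sketch (assuming dlnetJap and numFeatures from the question's code): once the network exists, you can forward batches whose time dimension differs from call to call.

```matlab
% Sketch: the same dlnetwork accepts inputs with different time dimensions.
% (Assumes dlnetJap and numFeatures from the code above.)
xShort = dlarray( rand(numFeatures,10,5), 'CTB' );  % 10 time steps, 5 observations
xLong  = dlarray( rand(numFeatures,26,5), 'CTB' );  % 26 time steps, 5 observations
yShort = predict( dlnetJap, xShort );  % C-by-B output
yLong  = predict( dlnetJap, xLong );   % works too; no re-initialization needed
```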

3 Comments

Thank you. Yes, I am familiar with minibatchqueue and have been using it for a few months now with fully connected and convolutional networks. I have a working custom loop for those networks. This is the first time I have tried using an LSTM network. This is the simple code I first tried:
miniBatchSize = 27;
fmt = 'CTB';
dsXTrain = arrayDatastore( XTrain, 'IterationDimension', 1 );
dsYTrain = arrayDatastore( YTrain, 'IterationDimension', 1 );
dsTrain = combine( dsXTrain, dsYTrain );
% setup the batches
mbqTrain = minibatchqueue( dsTrain, ...
    'MiniBatchSize', miniBatchSize, ...
    'PartialMiniBatch', 'discard', ...
    'MiniBatchFormat', fmt );
nIter = floor( size(XTrain,1)/miniBatchSize );
j = 0;
for epoch = 1:maxEpochs
    % Shuffle the data
    shuffle( mbqTrain );
    % Loop over mini-batches
    for i = 1:nIter
        % Read mini-batch of data
        [dlXTrn, dlYTrn] = next( mbqTrain );
        j = j + 1;
        % Evaluate the model gradients
        [ grad, state, loss ] = dlfeval( @modelGradients, ...
            dlnetJap, dlXTrn, dlYTrn );
        dlnetJap.State = state;
        % Update the network parameters
        [ dlnetJap, avgG, avgGS ] = adamupdate( dlnetJap, grad, ...
            avgG, avgGS, j, learnRate, beta1, beta2 );
    end
end
function [ grad, state, loss ] = modelGradients( dlnetJap, dlX, dlY )
    % --- reconstruction phase ---
    % generate latent encodings and capture the updated network state
    [logits, state] = forward( dlnetJap, dlX );
    loss = crossentropy( logits, dlY, 'TargetCategories', 'Independent' );
    grad = dlgradient( loss, dlnetJap.Learnables, 'RetainData', true );
end
However I got this error:
Error using minibatchqueue (line 319)
Unable to convert mini-batch variable 1 from class "cell" to class "single".
Error in lstmTest (line 37)
mbqTrain = minibatchqueue( dsTrain,...
Caused by:
Error using cast
Conversion to single from cell is not possible.
This seems to be a fundamental problem: variable-length arrays have to be packaged up into cell arrays, but minibatchqueue will not accept that format. Am I doing something wrong?
To get around this I tried writing my own code to shuffle and partition the data instead.
k = 1;
...
% Shuffle the data
idx = randperm( numObservations );
XTrain = XTrain( idx );
YTrain = YTrain( idx );
...
% Read mini-batch of data
batchIdx = k:k+miniBatchSize-1;
k = k + miniBatchSize;
dlXTrn = preprocess( XTrain( batchIdx ), fmt );
dlYTrn = dlarray( single(YTrain( batchIdx )), fmt );
...
However, I get this error inside modelGradients:
Error using dlnetwork/validateForwardInputs (line 911)
Input data must be a formatted dlarray.
Error in dlnetwork/forward (line 553)
[x, doForwardExampleInputs] = validateForwardInputs(net, x, "forward");
Error in lstmTest>modelGradients (line 96)
logits = forward( dlnetJap, dlX );
Error in deep.internal.dlfeval (line 17)
[varargout{1:nargout}] = fun(x{:});
Error in dlfeval (line 40)
[varargout{1:nargout}] = deep.internal.dlfeval(fun,varargin{:});
Error in lstmTest (line 57)
dlfeval( @modelGradients, ...
I suspected that I need the batch processing function to transform the data in some way, but I couldn't find something that worked. What should I do? Thanks.
You need to specify 'OutputType', 'same' for the arrayDatastore, otherwise it'll wrap your existing cell elements in another cell. Then you need to write a 'MiniBatchFcn' for minibatchqueue, because the sequences all have different lengths: to concatenate them you either need to keep them as cells, or you need to use padsequences to pad them all to the same length. For them to be consumed by a network in a batch they need to be a numeric array, so you need to do the latter.
The other point is that minibatchqueue is designed to return data that can be put into a gpuArray and a dlarray, which means numeric data: it can't handle categorical, just as it can't handle cell. So use onehotencode to convert from categorical into numeric data that your network can consume.
I've given some example code here. Of course, you don't have to use minibatchqueue, you can iterate over the data yourself. But many of the same principles will apply.
[XTrain,YTrain] = japaneseVowelsTrainData;
miniBatchSize = 27;
fmt = 'CTB';
dsXTrain = arrayDatastore( XTrain, 'IterationDimension', 1, 'OutputType', 'same' );
dsYTrain = arrayDatastore( YTrain, 'IterationDimension', 1 );
dsTrain = combine( dsXTrain, dsYTrain );
% setup the batches
mbqTrain = minibatchqueue( dsTrain, ...
    'MiniBatchSize', miniBatchSize, ...
    'PartialMiniBatch', 'discard', ...
    'MiniBatchFcn', @concatSequenceData, ...
    'MiniBatchFormat', {fmt, 'CB'} );
function [x, y] = concatSequenceData(x, y)
    x = padsequences(x, 2);
    y = onehotencode(cat(2, y{:}), 1);
end
Thank you. That makes sense. I'd overlooked the OutputType setting.
Having looked at the examples, I see that I will have to pad the data, but sorting the series by length mitigates my concern.
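For anyone following along, a minimal sketch of the length-sorting idea (assuming XTrain and YTrain as loaded earlier; the sorting step is mine, not from the answer's code):

```matlab
% Sort sequences by length so each mini-batch only pads up to the
% length of its longest member, keeping the padding overhead small.
seqLengths = cellfun( @(x) size(x,2), XTrain );
[~, idx] = sort( seqLengths );
XTrain = XTrain( idx );
YTrain = YTrain( idx );
% With sorted data, shuffle whole mini-batches rather than individual
% observations so similar-length sequences stay grouped together.
```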



Release

R2021b
