sequenceInputLayer
Sequence input layer
Description
A sequence input layer inputs sequence data to a neural network.
Creation
Description
layer = sequenceInputLayer(inputSize) creates a sequence input layer and sets the InputSize property.
layer = sequenceInputLayer(inputSize,Name,Value) sets the optional MinLength, Normalization, Mean, and Name properties using name-value pairs. You can specify multiple name-value pairs. Enclose each property name in single quotes.
Properties
Sequence Input
InputSize — Size of input
positive integer | vector of positive integers
Size of the input, specified as a positive integer or a vector of positive integers.
- For vector sequence input, InputSize is a scalar corresponding to the number of features.
- For 1-D image sequence input, InputSize is a vector of two elements [h c], where h is the image height and c is the number of channels of the image.
- For 2-D image sequence input, InputSize is a vector of three elements [h w c], where h is the image height, w is the image width, and c is the number of channels of the image.
- For 3-D image sequence input, InputSize is a vector of four elements [h w d c], where h is the image height, w is the image width, d is the image depth, and c is the number of channels of the image.
To specify the minimum sequence length of the input data, use the MinLength property.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
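As a minimal sketch of the InputSize conventions listed above, the following creates one layer for each kind of input. The specific sizes are made-up values chosen only for illustration.

```matlab
% Vector sequences: 12 features per time step.
layerVec = sequenceInputLayer(12);

% 1-D image sequences: [h c] = height 64, 3 channels.
layer1d = sequenceInputLayer([64 3]);

% 2-D image sequences: [h w c] = 28-by-28 grayscale frames.
layer2d = sequenceInputLayer([28 28 1]);

% 3-D image sequences: [h w d c] = 32-by-32-by-16 single-channel volumes.
layer3d = sequenceInputLayer([32 32 16 1]);

% The number of elements in InputSize tells you which case applies.
disp(numel(layer3d.InputSize))
```

The sequence length is not part of InputSize in any of these cases.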
MinLength — Minimum sequence length of input data
1 (default) | positive integer
Minimum sequence length of input data, specified as a positive integer. When training or making predictions with the network, if the input data has fewer than MinLength time steps, then the software throws an error.
When you create a network that downsamples data in the time dimension, you must take care that the network supports your training data and any data for prediction. Some deep learning layers require that the input has a minimum sequence length. For example, a 1-D convolution layer requires that the input has at least as many time steps as the filter size.
As time series or sequence data propagates through a network, the sequence length can change. For example, downsampling operations such as 1-D convolutions can output data with fewer time steps than their input. As a result, downsampling operations can cause later layers in the network to throw an error because the data has a shorter sequence length than the minimum length that the layer requires.
When you train or assemble a network, the software automatically checks that sequences of length 1 can propagate through the network. Some networks might not support sequences of length 1, but can successfully propagate sequences of longer lengths. To check that a network supports propagating your training and expected prediction data, set the MinLength property to a value less than or equal to the minimum length of your data and the expected minimum length of your prediction data.
Tip
To prevent convolution and pooling layers from changing the size of the data, set the Padding option of the layer to "same" or "causal".
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
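The effect of downsampling on sequence length can be sketched with simple arithmetic. The example below assumes two stacked 1-D convolutions with stride 1, no padding, and no dilation, for which the output length is the input length minus the filter size plus 1. The filter sizes and the input length are hypothetical values, not taken from any particular network.

```matlab
% Hypothetical filter sizes of two stacked, unpadded 1-D convolutions.
filterSizes = [5 3];

% Track how a sequence of 10 time steps shrinks (stride 1, no padding).
seqLen = 10;
for k = 1:numel(filterSizes)
    seqLen = seqLen - filterSizes(k) + 1;
end
disp(seqLen)

% Shortest input that still leaves at least one time step at the output.
minLength = sum(filterSizes - 1) + 1;

% Declare that minimum on the input layer so that shorter data is rejected early.
layer = sequenceInputLayer(12,'MinLength',minLength);
```

Under these assumptions, a sequence shorter than minLength would reach the second convolution with fewer time steps than its filter size.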
Normalization — Data normalization
'none' (default) | 'zerocenter' | 'zscore' | 'rescale-symmetric' | 'rescale-zero-one' | function handle
Data normalization to apply every time data is forward propagated through the input layer, specified as one of the following:
- 'zerocenter' — Subtract the mean specified by Mean.
- 'zscore' — Subtract the mean specified by Mean and divide by StandardDeviation.
- 'rescale-symmetric' — Rescale the input to be in the range [-1, 1] using the minimum and maximum values specified by Min and Max, respectively.
- 'rescale-zero-one' — Rescale the input to be in the range [0, 1] using the minimum and maximum values specified by Min and Max, respectively.
- 'none' — Do not normalize the input data.
- function handle — Normalize the data using the specified function. The function must be of the form Y = func(X), where X is the input data and the output Y is the normalized data.
Tip
The software, by default, automatically calculates the normalization statistics when using the trainNetwork function. To save time when training, specify the required statistics for normalization and set the ResetInputNormalization option in trainingOptions to 0 (false).
The software applies normalization to all input elements, including padding values.
Data Types: char | string | function_handle
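The normalization options above can be illustrated with plain array arithmetic. This is a sketch of the conventional formulas on a small made-up matrix with per-channel statistics. It is not the layer's internal implementation, and the data and variable names are assumptions.

```matlab
% Made-up vector sequence: 3 features (rows) by 4 time steps (columns).
X = [1 2 3 4; 10 20 30 40; -1 0 1 2];

% Per-channel statistics, computed across time.
mu    = mean(X,2);
sigma = std(X,0,2);
mn    = min(X,[],2);
mx    = max(X,[],2);

Xzerocenter = X - mu;                 % 'zerocenter'
Xzscore     = (X - mu)./sigma;        % 'zscore'
Xzeroone    = (X - mn)./(mx - mn);    % 'rescale-zero-one', range [0, 1]
Xsymmetric  = 2*Xzeroone - 1;         % 'rescale-symmetric', range [-1, 1]

% A function handle of the form Y = func(X), here a hypothetical fixed scaling.
func = @(X) X/40;
Xcustom = func(X);
```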
NormalizationDimension — Normalization dimension
'auto' (default) | 'channel' | 'element' | 'all'
Normalization dimension, specified as one of the following:
- 'auto' – If the ResetInputNormalization training option is false and you specify any of the normalization statistics (Mean, StandardDeviation, Min, or Max), then normalize over the dimensions matching the statistics. Otherwise, recalculate the statistics at training time and apply channel-wise normalization.
- 'channel' – Channel-wise normalization.
- 'element' – Element-wise normalization.
- 'all' – Normalize all values using scalar statistics.
Data Types: char | string
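One way to picture the three explicit options is by the shape of the statistic each one uses. The sketch below computes a mean for a made-up 2-D image sequence stored as an h-by-w-by-c-by-t array. The array layout and the variable names are assumptions made for illustration, and the toolbox may compute its statistics differently.

```matlab
% Made-up 2-D image sequence: 28-by-28 pixels, 3 channels, 10 time steps.
X = rand(28,28,3,10);

% 'channel': one value per channel, pooled over space and time.
meanChannel = mean(X,[1 2 4]);

% 'element': one value per pixel and channel, pooled over time only.
meanElement = mean(X,4);

% 'all': a single scalar pooled over every value.
meanAll = mean(X,'all');

disp(size(meanChannel))
disp(size(meanElement))
```

These shapes correspond to the 1-by-1-by-InputSize(3) array, the array of the same size as InputSize, and the numeric scalar that the Mean property accepts for 2-D image sequence input.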
Mean — Mean for zero-center and z-score normalization
[] (default) | numeric array | numeric scalar
Mean for zero-center and z-score normalization, specified as a numeric array, a numeric scalar, or empty.
- For vector sequence input, Mean must be an InputSize-by-1 vector of means per channel, a numeric scalar, or [].
- For 2-D image sequence input, Mean must be a numeric array of the same size as InputSize, a 1-by-1-by-InputSize(3) array of means per channel, a numeric scalar, or [].
- For 3-D image sequence input, Mean must be a numeric array of the same size as InputSize, a 1-by-1-by-1-by-InputSize(4) array of means per channel, a numeric scalar, or [].
If you specify the Mean property, then Normalization must be 'zerocenter' or 'zscore'. If Mean is [], then the trainNetwork function calculates the mean and ignores padding values. To train a dlnetwork object using a custom training loop or assemble a network without training it using the assembleNetwork function, you must set the Mean property to a numeric scalar or a numeric array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
StandardDeviation — Standard deviation
[] (default) | numeric array | numeric scalar
Standard deviation used for z-score normalization, specified as a numeric array, a numeric scalar, or empty.
- For vector sequence input, StandardDeviation must be an InputSize-by-1 vector of standard deviations per channel, a numeric scalar, or [].
- For 2-D image sequence input, StandardDeviation must be a numeric array of the same size as InputSize, a 1-by-1-by-InputSize(3) array of standard deviations per channel, a numeric scalar, or [].
- For 3-D image sequence input, StandardDeviation must be a numeric array of the same size as InputSize, a 1-by-1-by-1-by-InputSize(4) array of standard deviations per channel, or a numeric scalar.
If you specify the StandardDeviation property, then Normalization must be 'zscore'. If StandardDeviation is [], then the trainNetwork function calculates the standard deviation and ignores padding values. To train a dlnetwork object using a custom training loop or assemble a network without training it using the assembleNetwork function, you must set the StandardDeviation property to a numeric scalar or a numeric array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
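When you must supply the statistics yourself, for example for a custom training loop, one possible approach is to compute them from the unpadded training sequences. The sketch below assumes vector sequences stored in a cell array of numFeatures-by-numTimeSteps matrices. The data is random and the variable names are hypothetical.

```matlab
% Made-up training data: three sequences with 3 features and varying length.
XTrain = {rand(3,5); rand(3,8); rand(3,2)};

% Concatenate along time so that no padding enters the statistics.
XAll = cat(2,XTrain{:});

% Per-channel statistics as InputSize-by-1 vectors.
mu    = mean(XAll,2);
sigma = std(XAll,0,2);

% Pass the statistics to the layer explicitly.
layer = sequenceInputLayer(3, ...
    'Normalization','zscore', ...
    'Mean',mu, ...
    'StandardDeviation',sigma);
```

Because the statistics are fixed on the layer, they do not have to be recalculated from the data later.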
Min — Minimum value for rescaling
[] (default) | numeric array | numeric scalar
Minimum value for rescaling, specified as a numeric array, a numeric scalar, or empty.
- For vector sequence input, Min must be an InputSize-by-1 vector of minima per channel or a numeric scalar.
- For 2-D image sequence input, Min must be a numeric array of the same size as InputSize, a 1-by-1-by-InputSize(3) array of minima per channel, or a numeric scalar.
- For 3-D image sequence input, Min must be a numeric array of the same size as InputSize, a 1-by-1-by-1-by-InputSize(4) array of minima per channel, or a numeric scalar.
If you specify the Min property, then Normalization must be 'rescale-symmetric' or 'rescale-zero-one'. If Min is [], then the trainNetwork function calculates the minima and ignores padding values. To train a dlnetwork object using a custom training loop or assemble a network without training it using the assembleNetwork function, you must set the Min property to a numeric scalar or a numeric array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
Max — Maximum value for rescaling
[] (default) | numeric array | numeric scalar
Maximum value for rescaling, specified as a numeric array, a numeric scalar, or empty.
- For vector sequence input, Max must be an InputSize-by-1 vector of maxima per channel or a numeric scalar.
- For 2-D image sequence input, Max must be a numeric array of the same size as InputSize, a 1-by-1-by-InputSize(3) array of maxima per channel, a numeric scalar, or [].
- For 3-D image sequence input, Max must be a numeric array of the same size as InputSize, a 1-by-1-by-1-by-InputSize(4) array of maxima per channel, a numeric scalar, or [].
If you specify the Max property, then Normalization must be 'rescale-symmetric' or 'rescale-zero-one'. If Max is [], then the trainNetwork function calculates the maxima and ignores padding values. To train a dlnetwork object using a custom training loop or assemble a network without training it using the assembleNetwork function, you must set the Max property to a numeric scalar or a numeric array.
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
SplitComplexInputs — Flag to split input data into real and imaginary components
0 (false) (default) | 1 (true)
This property is read-only.
Flag to split input data into real and imaginary components, specified as one of these values:
- 0 (false) – Do not split input data.
- 1 (true) – Split data into real and imaginary components.
When SplitComplexInputs is 1, the layer outputs twice as many channels as the input data. For example, if the input data is complex-valued with numChannels channels, then the layer outputs data with 2*numChannels channels, where channels 1 through numChannels contain the real components of the input data and channels numChannels+1 through 2*numChannels contain the imaginary components of the input data. If the input data is real, then channels numChannels+1 through 2*numChannels are all zero.
To input complex-valued data into a neural network, the SplitComplexInputs option of the input layer must be 1.
For an example showing how to train a network with complex-valued data, see Train Network with Complex-Valued Data.
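The channel layout described above can be reproduced by hand on a small array. The sketch below uses made-up complex data with two channels and three time steps. The manual stacking only mirrors the documented ordering and is not the layer's implementation, and setting SplitComplexInputs as a name-value argument at creation is assumed here.

```matlab
numChannels = 2;

% Made-up complex-valued sequence: 2 channels (rows) by 3 time steps (columns).
X = [1+2i 3-1i 0+4i; 5 6i -2-2i];

% Real parts fill channels 1..numChannels, imaginary parts fill
% channels numChannels+1..2*numChannels.
Y = [real(X); imag(X)];
disp(size(Y,1))

% Input layer configured to split complex inputs (assumed creation syntax).
layer = sequenceInputLayer(numChannels,'SplitComplexInputs',true);
```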
Layer
Name — Layer name
'' (default) | character vector | string scalar
Layer name, specified as a character vector or a string scalar. For Layer array input, the trainNetwork, assembleNetwork, layerGraph, and dlnetwork functions automatically assign names to layers with the name ''.
Data Types: char | string
NumInputs — Number of inputs
0 (default)
This property is read-only.
Number of inputs of the layer. The layer has no inputs.
Data Types: double
InputNames — Input names
{} (default)
This property is read-only.
Input names of the layer. The layer has no inputs.
Data Types: cell
NumOutputs — Number of outputs
1 (default)
This property is read-only.
Number of outputs of the layer. This layer has a single output only.
Data Types: double
OutputNames — Output names
{'out'} (default)
This property is read-only.
Output names of the layer. This layer has a single output only.
Data Types: cell
Examples
Create Sequence Input Layer
Create a sequence input layer with the name 'seq1' and an input size of 12.
layer = sequenceInputLayer(12,'Name','seq1')
layer = 
  SequenceInputLayer with properties:
                      Name: 'seq1'
                 InputSize: 12
                 MinLength: 1
        SplitComplexInputs: 0
   Hyperparameters
             Normalization: 'none'
    NormalizationDimension: 'auto'
Include a sequence input layer in a Layer array.
inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(inputSize)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]
layers = 
  5x1 Layer array with layers:
     1   ''   Sequence Input          Sequence input with 12 dimensions
     2   ''   LSTM                    LSTM with 100 hidden units
     3   ''   Fully Connected         9 fully connected layer
     4   ''   Softmax                 softmax
     5   ''   Classification Output   crossentropyex
Create Sequence Input Layer for Image Sequences
Create a sequence input layer for sequences of 224-by-224 RGB images with the name 'seq1'.
layer = sequenceInputLayer([224 224 3], 'Name', 'seq1')
layer = 
  SequenceInputLayer with properties:
                      Name: 'seq1'
                 InputSize: [224 224 3]
                 MinLength: 1
        SplitComplexInputs: 0
   Hyperparameters
             Normalization: 'none'
    NormalizationDimension: 'auto'
Train Network for Sequence Classification
Train a deep learning LSTM network for sequence-to-label classification.
Load the Japanese Vowels data set as described in [1] and [2]. XTrain is a cell array containing 270 sequences of varying length with 12 features corresponding to LPC cepstrum coefficients. YTrain is a categorical vector of labels 1,2,...,9. The entries in XTrain are matrices with 12 rows (one row for each feature) and a varying number of columns (one column for each time step).
[XTrain,YTrain] = japaneseVowelsTrainData;
Visualize the first time series in a plot. Each line corresponds to a feature.
figure
plot(XTrain{1}')
title("Training Observation 1")
numFeatures = size(XTrain{1},1);
legend("Feature " + string(1:numFeatures),'Location','northeastoutside')
Define the LSTM network architecture. Specify the input size as 12 (the number of features of the input data). Specify an LSTM layer to have 100 hidden units and to output the last element of the sequence. Finally, specify nine classes by including a fully connected layer of size 9, followed by a softmax layer and a classification layer.
inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(inputSize)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]
layers = 
  5x1 Layer array with layers:
     1   ''   Sequence Input          Sequence input with 12 dimensions
     2   ''   LSTM                    LSTM with 100 hidden units
     3   ''   Fully Connected         9 fully connected layer
     4   ''   Softmax                 softmax
     5   ''   Classification Output   crossentropyex
Specify the training options. Specify the solver as 'adam' and 'GradientThreshold' as 1. Set the mini-batch size to 27 and set the maximum number of epochs to 70.
Because the mini-batches are small with short sequences, the CPU is better suited for training. Set 'ExecutionEnvironment' to 'cpu'. To train on a GPU, if available, set 'ExecutionEnvironment' to 'auto' (the default value).
maxEpochs = 70;
miniBatchSize = 27;
options = trainingOptions('adam', ...
    'ExecutionEnvironment','cpu', ...
    'MaxEpochs',maxEpochs, ...
    'MiniBatchSize',miniBatchSize, ...
    'GradientThreshold',1, ...
    'Verbose',false, ...
    'Plots','training-progress');
Train the LSTM network with the specified training options.
net = trainNetwork(XTrain,YTrain,layers,options);
Load the test set and classify the sequences into speakers.
[XTest,YTest] = japaneseVowelsTestData;
Classify the test data. Specify the same mini-batch size used for training.
YPred = classify(net,XTest,'MiniBatchSize',miniBatchSize);
Calculate the classification accuracy of the predictions.
acc = sum(YPred == YTest)./numel(YTest)
acc = 0.9486
Classification LSTM Networks
To create an LSTM network for sequence-to-label classification, create a layer array containing a sequence input layer, an LSTM layer, a fully connected layer, a softmax layer, and a classification output layer.
Set the size of the sequence input layer to the number of features of the input data. Set the size of the fully connected layer to the number of classes. You do not need to specify the sequence length.
For the LSTM layer, specify the number of hidden units and the output mode 'last'.
numFeatures = 12;
numHiddenUnits = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
For an example showing how to train an LSTM network for sequence-to-label classification and classify new data, see Sequence Classification Using Deep Learning.
To create an LSTM network for sequence-to-sequence classification, use the same architecture as for sequence-to-label classification, but set the output mode of the LSTM layer to 'sequence'.
numFeatures = 12;
numHiddenUnits = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
Regression LSTM Networks
To create an LSTM network for sequence-to-one regression, create a layer array containing a sequence input layer, an LSTM layer, a fully connected layer, and a regression output layer.
Set the size of the sequence input layer to the number of features of the input data. Set the size of the fully connected layer to the number of responses. You do not need to specify the sequence length.
For the LSTM layer, specify the number of hidden units and the output mode 'last'.
numFeatures = 12;
numHiddenUnits = 125;
numResponses = 1;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numResponses)
    regressionLayer];
To create an LSTM network for sequence-to-sequence regression, use the same architecture as for sequence-to-one regression, but set the output mode of the LSTM layer to 'sequence'.
numFeatures = 12;
numHiddenUnits = 125;
numResponses = 1;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(numResponses)
    regressionLayer];
For an example showing how to train an LSTM network for sequence-to-sequence regression and predict on new data, see Sequence-to-Sequence Regression Using Deep Learning.
Deeper LSTM Networks
You can make LSTM networks deeper by inserting extra LSTM layers with the output mode 'sequence' before the LSTM layer. To prevent overfitting, you can insert dropout layers after the LSTM layers.
For sequence-to-label classification networks, the output mode of the last LSTM layer must be 'last'.
numFeatures = 12;
numHiddenUnits1 = 125;
numHiddenUnits2 = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits1,'OutputMode','sequence')
    dropoutLayer(0.2)
    lstmLayer(numHiddenUnits2,'OutputMode','last')
    dropoutLayer(0.2)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
For sequence-to-sequence classification networks, the output mode of the last LSTM layer must be 'sequence'.
numFeatures = 12;
numHiddenUnits1 = 125;
numHiddenUnits2 = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits1,'OutputMode','sequence')
    dropoutLayer(0.2)
    lstmLayer(numHiddenUnits2,'OutputMode','sequence')
    dropoutLayer(0.2)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
Create Network for Video Classification
Create a deep learning network for data containing sequences of images, such as video and medical image data.
To input sequences of images into a network, use a sequence input layer.
To apply convolutional operations independently to each time step, first convert the sequences of images to an array of images using a sequence folding layer.
To restore the sequence structure after performing these operations, convert this array of images back to image sequences using a sequence unfolding layer.
To convert images to feature vectors, use a flatten layer.
You can then input vector sequences into LSTM and BiLSTM layers.
Define Network Architecture
Create a classification LSTM network that classifies sequences of 28-by-28 grayscale images into 10 classes.
Define the following network architecture:
- A sequence input layer with an input size of [28 28 1].
- A convolution, batch normalization, and ReLU layer block with 20 5-by-5 filters.
- An LSTM layer with 200 hidden units that outputs the last time step only.
- A fully connected layer of size 10 (the number of classes) followed by a softmax layer and a classification layer.
To perform the convolutional operations on each time step independently, include a sequence folding layer before the convolutional layers. LSTM layers expect vector sequence input. To restore the sequence structure and reshape the output of the convolutional layers to sequences of feature vectors, insert a sequence unfolding layer and a flatten layer between the convolutional layers and the LSTM layer.
inputSize = [28 28 1];
filterSize = 5;
numFilters = 20;
numHiddenUnits = 200;
numClasses = 10;
layers = [ ...
    sequenceInputLayer(inputSize,'Name','input')
    sequenceFoldingLayer('Name','fold')
    convolution2dLayer(filterSize,numFilters,'Name','conv')
    batchNormalizationLayer('Name','bn')
    reluLayer('Name','relu')
    sequenceUnfoldingLayer('Name','unfold')
    flattenLayer('Name','flatten')
    lstmLayer(numHiddenUnits,'OutputMode','last','Name','lstm')
    fullyConnectedLayer(numClasses,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classification')];
Convert the layers to a layer graph and connect the miniBatchSize output of the sequence folding layer to the corresponding input of the sequence unfolding layer.
lgraph = layerGraph(layers);
lgraph = connectLayers(lgraph,'fold/miniBatchSize','unfold/miniBatchSize');
View the final network architecture using the plot function.
figure
plot(lgraph)
References
[1] M. Kudo, J. Toyama, and M. Shimbo. "Multidimensional Curve Classification Using Passing-Through Regions." Pattern Recognition Letters. Vol. 20, No. 11–13, pages 1103–1111.
[2] UCI Machine Learning Repository: Japanese Vowels Dataset. https://archive.ics.uci.edu/ml/datasets/Japanese+Vowels
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
- For vector sequence inputs, the number of features must be a constant during code generation.
- For code generation, the input data must contain either zero or two spatial dimensions.
- Code generation does not support 'Normalization' specified using a function handle.
- Code generation does not support complex input and does not support the 'SplitComplexInputs' option.
GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.
Usage notes and limitations:
To generate CUDA® or C++ code by using GPU Coder™, you must first construct and train a deep neural network. Once the network is trained and evaluated, you can configure the code generator to generate code and deploy the convolutional neural network on platforms that use NVIDIA® or ARM® GPU processors. For more information, see Deep Learning with GPU Coder (GPU Coder).
For this layer, you can generate code that takes advantage of the NVIDIA CUDA deep neural network library (cuDNN), or the NVIDIA TensorRT™ high performance inference library.
The cuDNN library supports vector and 2-D image sequences. The TensorRT library supports only vector input sequences.
For vector sequence inputs, the number of features must be a constant during code generation.
For image sequence inputs, the height, width, and the number of channels must be a constant during code generation.
Code generation does not support 'Normalization' specified using a function handle.
Code generation does not support complex input and does not support the 'SplitComplexInputs' option.
Version History
Introduced in R2017b
R2020a: trainNetwork ignores padding values when calculating normalization statistics
Starting in R2020a, trainNetwork ignores padding values when calculating normalization statistics. This means that the Normalization option in the sequenceInputLayer now makes training invariant to data operations, for example, 'zerocenter' normalization now implies that the training results are invariant to the mean of the data.
If you train on padded sequences, then the calculated normalization factors may be different in earlier versions and can produce different results.
R2019b: sequenceInputLayer, by default, uses channel-wise normalization for zero-center normalization
Starting in R2019b, sequenceInputLayer, by default, uses channel-wise normalization for zero-center normalization. In previous versions, this layer uses element-wise normalization. To reproduce this behavior, set the NormalizationDimension option of this layer to 'element'.
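As a brief sketch of the setting described above, the following creates a zero-center layer that normalizes element-wise instead of channel-wise. The input size of 12 is an arbitrary value chosen for illustration.

```matlab
% Zero-center normalization with element-wise statistics,
% matching the default behavior before R2019b.
layer = sequenceInputLayer(12, ...
    'Normalization','zerocenter', ...
    'NormalizationDimension','element');

disp(layer.NormalizationDimension)
```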
See Also
trainNetwork | lstmLayer | bilstmLayer | gruLayer | classifyAndUpdateState | predictAndUpdateState | resetState | sequenceFoldingLayer | flattenLayer | sequenceUnfoldingLayer | Deep Network Designer | featureInputLayer
Topics
- Sequence Classification Using Deep Learning
- Time Series Forecasting Using Deep Learning
- Sequence-to-Sequence Classification Using Deep Learning
- Classify Videos Using Deep Learning
- Visualize Activations of LSTM Network
- Long Short-Term Memory Neural Networks
- Specify Layers of Convolutional Neural Network
- Set Up Parameters and Train Convolutional Neural Network
- Deep Learning in MATLAB
- List of Deep Learning Layers