How to feed "MachineData.mat" raw data from "anomaly detection" into biLSTMAutoencoder?

Hi all
Since I have no access to the Diagnostic Feature Designer App from the Predictive Maintenance Toolbox, and as suggested in "Part1_DataPrepFeatureExtraction", I'm trying to "train a model on raw data", i.e. train the biLSTM with "MachineData.mat" instead of "FeatureEntire.mat".
To make "MachineData.mat" compatible with "Part2_LSTMAutoencoder.mlx", I've modified the "extractLabeledData.m" file to create an [18x1] cell array of [70000x3 double] matrices: 18 is the number of sequences, 70000 the number of samples, and 3 the number of channels.
Training fails with "The training sequences are of feature dimension 70000 but the input layer expects sequences of feature dimension 1." => Clearly not expected...
Does anybody know how to adapt the shape of "MachineData.mat"?
Instead of trying to feed all 18 sequences at once, should I proceed sequence by sequence and retrain the network each time?
BR
Juliette
  1 Comment
juliette soula on 13 Nov 2021
As a test, I've set "featureDimension = 70000;" in "Part2_LSTMAutoencoder.mlx" => training runs, but with very poor performance. I think this is not the right thing to do, because the samples of a time series should not be treated as dimensions.
Is there a vocabulary problem?
How should these "nbTimeSerie" time series, each "nbSamplee" samples long and coming from 3 sensors, be presented to the biLSTM?

Answers (1)

Hornett on 19 Sep 2024
To shape your data for a biLSTM network in MATLAB, pass the sequences to trainNetwork as an N-by-1 cell array with one cell per observation, where each cell is a [numFeatures x sequenceLength] matrix: features along the rows, time steps along the columns. Given your scenario with 18 sequences, 70,000 samples per sequence, and 3 sensors:
  • sequenceLength: 70,000 (number of samples)
  • numFeatures: 3 (number of sensors)
  • numObservations: 18 (number of sequences)
Your [18x1] cell array is therefore fine, but each cell should be [3x70000] rather than [70000x3], so transpose each cell. With [70000x3] cells the network reads 70,000 features over 3 time steps, which is exactly what the error message reports.
For the LSTM network, set the inputSize in sequenceInputLayer to the number of features (3 for three sensors). Do not treat the sequence length as the feature dimension. Train on all 18 sequences together rather than one by one, so the network sees the full variety of the data in every epoch.
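layers = [ ...
    sequenceInputLayer(3)                     % 3 features: one per sensor channel
    bilstmLayer(100,'OutputMode','sequence')  % 100 hidden units, one output per time step
    fullyConnectedLayer(3)                    % map back to the 3 sensor channels
    regressionLayer];                         % regression (squared-error) loss
options = trainingOptions('adam', ...
    'MaxEpochs',100, ...
    'MiniBatchSize',18, ...                   % all 18 sequences in one mini-batch
    'Shuffle','never', ...
    'Verbose',0, ...
    'Plots','training-progress');
net = trainNetwork(XTrain, YTrain, layers, options);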
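Ensure XTrain is an [18x1] cell array where each cell is [3x70000] (sensors in rows, samples in columns). For an autoencoder the network reconstructs its own input, so YTrain should have the same shape as XTrain and would normally be XTrain itself.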
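As a rough illustration of the reshaping step, here is a minimal sketch that turns an [18x1] cell array of [70000x3] matrices into the channels-by-time-steps layout that trainNetwork expects for sequence input. The variable name rawData and the random placeholder values are assumptions made for the example; the post does not say what the variable inside "MachineData.mat" is called, so substitute whatever your modified "extractLabeledData.m" actually returns.

```matlab
% Hypothetical stand-in for the cell array produced by the modified
% extractLabeledData.m: 18 sequences, each [70000x3] (samples x channels).
numSequences = 18;
numSamples   = 70000;
numChannels  = 3;
rawData = cell(numSequences, 1);
for k = 1:numSequences
    rawData{k} = randn(numSamples, numChannels);   % placeholder values only
end

% Transpose every cell so that rows are channels (features) and
% columns are time steps: [70000x3] -> [3x70000].
XTrain = cellfun(@(x) x.', rawData, 'UniformOutput', false);

% An autoencoder is trained to reconstruct its input, so in this sketch
% the targets are simply the inputs.
YTrain = XTrain;

% Sanity check: the row count of every sequence must match the value
% passed to sequenceInputLayer (3 here).
featureDims = cellfun(@(x) size(x, 1), XTrain);
assert(all(featureDims == numChannels), ...
    'Each sequence must be [numChannels x numSamples].');
```

XTrain and YTrain built this way can be passed to the trainNetwork call above. If the sanity check fails, the cells are still in samples-by-channels orientation.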
Hope it helps!

Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!