Sequence Classification Using Deep Learning

Leon on 24 Mar 2021
Answered: Aniket on 23 Sep 2024
I am currently working on sequence classification using deep learning.
I have also looked at the MATLAB example "Sequence Classification Using Deep Learning", which uses japaneseVowelsTrainData.
However, I have some problems getting the input right with my own data.
I have x sequences, all of the same length: A(x, 1:30), of type double. For each sequence A(x,:) I have a class label stored as text, either B(x) = {'ClassA'} or B(x) = ('ClassA'); I am not sure which one to use. There are 6 different classes in total.
Question 1: How should I create the correct sequence type to use as input?
Question 2: For the training input I assume that inputSize = 30 should be based on the fixed length of the sequence. Is this correct?
Question 3: How do I create the correct format for the class labels to train the model? How should I convert the classification training variable?
Question 4: What should I set miniBatchSize to, given that all sequences are the same length and in principle form a single data set?
Thank you very much in advance for your feedback.

Answers (1)

Aniket on 23 Sep 2024
Hi @Leon,
I understand that you want to apply sequence classification with deep learning to your own input data and need some clarification.
1. For your first query, you should structure your input data as a cell array where each cell contains one sequence. Given your data structure A(x, 1:30), you can convert it into a cell array like this:
sequenceData = num2cell(A, 2);
2. Regarding inputSize, this specifies the number of features (or channels) per time step in your sequence data, not the sequence length. From your description, A(x, 1:30) is a sequence of 30 data points, not 30 features. If each time step has only one feature (for example, a single measurement such as temperature or a stock price), then inputSize should be 1.
However, if the 30 values in A(x, 1:30) are separate features recorded at a single time step (e.g., different sensor readings at the same time), then inputSize should be 30. In that case each cell must hold a 30-by-1 column rather than the 1-by-30 row produced above.
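As a rough illustration of the two interpretations, the sketch below builds both layouts from a synthetic matrix and reads the feature count off the first dimension of a cell. The variable names and the random data are assumptions made for the example, not taken from the question. The resulting value is what you would pass to sequenceInputLayer.

```matlab
% Synthetic stand-in for the poster's data: 5 sequences, 30 values each.
A = rand(5, 30);

% Interpretation 1: one feature observed over 30 time steps.
% Each cell is 1-by-30 (numFeatures-by-numTimeSteps).
seqOverTime = num2cell(A, 2);
inputSizeOverTime = size(seqOverTime{1}, 1);

% Interpretation 2: 30 features observed at a single time step.
% Each cell must be 30-by-1, so split the transposed matrix by column
% and transpose the resulting cell array back to N-by-1.
seqAsFeatures = num2cell(A', 1)';
inputSizeAsFeatures = size(seqAsFeatures{1}, 1);
```

In both cases the cell array is N-by-1 with one cell per sequence; only the orientation inside each cell changes.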
3. For the third query, your class labels should be categorical for training. You can convert your labels B into a categorical array like this:
labels = categorical(B);
Ensure that B is a cell array of character vectors, with one label per sequence. If B is a character array, you can convert it to a cell array first using:
B = cellstr(B);
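A small sketch of the label conversion, assuming the labels start out as a padded character array with one row per sequence. The array Bchar and its contents are hypothetical and only serve the example.

```matlab
% Hypothetical labels stored as a character array (one row per sequence).
Bchar = ['ClassA'; 'ClassB'; 'ClassA'; 'ClassC'];

% Convert to a 4-by-1 cell array of character vectors, then to categorical.
B = cellstr(Bchar);
labels = categorical(B);

% The number of distinct classes is what the final fully connected
% layer of the network needs to output.
classNames = categories(labels);
numClasses = numel(classNames);
```

The labels vector should have the same number of elements as the cell array of sequences, in the same order.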
4. Lastly, miniBatchSize is a hyperparameter that you can experiment with. A good value depends on the size of the training data, the model complexity, the available resources, and so on. You may start with a typical value such as 32 or 64 and tune it based on the validation scores.
Hope this answers your queries!
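Putting the four points together, a minimal end-to-end sketch could look like the following. It assumes the one-feature-per-time-step interpretation and requires Deep Learning Toolbox. The synthetic data, the 50 hidden units, the 5 epochs, and the mini-batch size of 32 are placeholder assumptions to be replaced with your own data and tuned values.

```matlab
% Synthetic stand-in data: 60 sequences of length 30, 6 classes, 10 each.
numObs = 60;
numClasses = 6;
A = rand(numObs, 30);
classNames = {'ClassA', 'ClassB', 'ClassC', 'ClassD', 'ClassE', 'ClassF'};
B = classNames(repmat(1:numClasses, 1, numObs / numClasses));
B = B(:);                                  % numObs-by-1 cell array

% Input and label formats as described in points 1 and 3.
sequenceData = num2cell(A, 2);             % each cell is 1-by-30
labels = categorical(B, classNames);

% One feature per time step, so the input size is 1 (point 2).
layers = [
    sequenceInputLayer(1)
    lstmLayer(50, 'OutputMode', 'last')    % 50 hidden units is a guess
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

% Mini-batch size is a starting value to tune (point 4).
options = trainingOptions('adam', ...
    'MiniBatchSize', 32, ...
    'MaxEpochs', 5, ...
    'Shuffle', 'every-epoch', ...
    'Verbose', false);

net = trainNetwork(sequenceData, labels, layers, options);
predicted = classify(net, sequenceData, 'MiniBatchSize', 32);
```

Because the data here is random, the predictions carry no meaning; the sketch only shows that the shapes and types fit together.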

Release

R2020b
