How to select the corresponding MFCC frame (relative time stamp) from the target vector using deep learning?

I have 4 input signals: a 4-D array storing the video frames, a 4-D array storing the MFCC frames, a 1-D array of MFCC (audio) time stamps, and a 1-D array of speaker identity. The target is '1', i.e. digit 1. For experimentation I have considered only digit 1 with 41 frames; all 41 frames make the sound of digit 1. In this experiment the test signal is the same as the training signal, and I get the correct target output '1' for all 41 frames. From this target vector I now have to select the corresponding MFCC frame, e.g. for the first '1' in the target vector I need to select the first MFCC frame, and then generate sound from it. I am unable to do this. Please provide guidance.
The code is shown below.
clear all; close all; clc;

% Load the pre-processed visual (video-frame) and audio (MFCC) data
[visual_frames, visual_lbl, vid_fr_cnt, new_vl] = fun_visual_data_processing_30april_23();
[audio_frames,  audio_lbl,  aud_fr_cnt, new_al] = fun_audio_data_processing_30april_23();
[v1, v2, v3, v4] = size(visual_frames)
[a1, a2, a3, a4] = size(audio_frames)

% Datastores for the four inputs and the target labels
dsX1Train = arrayDatastore(visual_frames, IterationDimension=4);  % video frames
dsX2Train = arrayDatastore(audio_frames,  IterationDimension=4);  % MFCC frames
dsTTrain  = arrayDatastore(audio_lbl);                            % target labels
%dsTTTrain = arrayDatastore(new_vl);
SI = ones(41,1);                                                  % speaker identity (single speaker)
dsX3Train = arrayDatastore(SI);
dsX4Train = arrayDatastore(new_al');                              % MFCC time stamps

[h, w, numChannels, numObservations] = size(visual_frames);
numClasses = numel(categories(visual_lbl));
imageInputSize = [h w numChannels];
filterSize = 5;
numFilters = 16;

% Branch 1: video frames (also carries the concatenation and output layers)
layers1 = [
    imageInputLayer(imageInputSize, Normalization="none")
    convolution2dLayer(filterSize, numFilters)
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(50)
    flattenLayer(Name="FL1")
    concatenationLayer(1, 4, Name="cat")
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
lgraph = layerGraph(layers1);

% Branch 2: MFCC frames (uses the same input size as the video frames)
layers2 = [
    imageInputLayer(imageInputSize, Normalization="none")
    convolution2dLayer(filterSize, numFilters)
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(50)
    flattenLayer(Name="FL2")];
lgraph = addLayers(lgraph, layers2);
lgraph = connectLayers(lgraph, "FL2", "cat/in2");

% Branch 3: speaker identity (scalar feature)
numFeatures = 1;
featInput = featureInputLayer(numFeatures, Name="features");
lgraph = addLayers(lgraph, featInput);
lgraph = connectLayers(lgraph, "features", "cat/in3");

% Branch 4: MFCC time stamp (scalar feature)
numFeatures = 1;
featInput2 = featureInputLayer(numFeatures, Name="features2");
lgraph = addLayers(lgraph, featInput2);
lgraph = connectLayers(lgraph, "features2", "cat/in4");

figure
plot(lgraph)

options = trainingOptions("sgdm", ...
    MaxEpochs=15, ...
    InitialLearnRate=0.01, ...
    Plots="training-progress", ...
    Verbose=0);

% Combine the four input datastores with the targets and train
dsTrain = combine(dsX1Train, dsX2Train, dsX3Train, dsX4Train, dsTTrain);
net = trainNetwork(dsTrain, lgraph, options);

% Classify (the test signal is the same as the training signal)
dsTest = dsTrain;
[ytest] = classify(net, dsTest);
OUTPUT
>> ytest'
ans =
1×41 categorical array
Columns 1 through 16
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Columns 17 through 32
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Columns 33 through 41
1 1 1 1 1 1 1 1 1

Accepted Answer

Shilpa Sonawane on 6 Sep 2023
Thank you so much. I will try it definitely.

More Answers (1)

Karan Singh on 5 Sep 2023
Hi Shilpa,
Based on the provided code and description, I am assuming that you have a dataset containing video frames and corresponding MFCC frames.
In your experiment, you are focusing on a specific digit, '1', and you have considered 41 frames that represent the sound of the digit '1'. The goal is to train a neural network model to recognize and classify these frames as '1'.
Now, you want to select the corresponding MFCC frame for each '1' in the target vector and generate sound from it. In other words, you want to extract the MFCC frame that corresponds to each predicted '1' in the target vector and use it to generate sound.
To achieve this, you can follow these steps:
  1. Extract the indices of the '1' predictions from the target vector using the find function (note that ytest is categorical, so compare against the category name '1' rather than the number 1):
indices = find(ytest == '1');
2. Iterate over the obtained indices and select the corresponding MFCC frame using indexing:
for i = 1:length(indices)
mfccFrame = audio_frames(:, :, :, indices(i));
% Generate sound from the selected MFCC frame
% ...
% Your sound generation code here
end
In the above code, “audio_frames” is the 4D array representing the MFCC frames, and indices contains the indices of the '1' values in the target vector. 
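If you want all of the matching MFCC frames at once rather than selecting them one per loop iteration, you can also index them in a single step (a small addition using the same variables as above):
selectedFrames = audio_frames(:, :, :, indices); % all MFCC frames predicted as '1'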
Since the sound generation code was not included in your snippet, replace the comment % Generate sound from the selected MFCC frame with your own sound generation code. Here is a simple demo that uses the MATLAB soundsc function to play something from each selected MFCC frame:
% Assuming you have extracted the indices and stored them in the variable 'indices'
for i = 1:length(indices)
    % Select the corresponding MFCC frame
    mfccFrame = audio_frames(:, :, :, indices(i));

    % Convert the MFCC frame to an audio signal (dummy code)
    audioSignal = mfccFrame(:); % Replace with your actual conversion code

    % Normalize the audio signal
    audioSignal = audioSignal / max(abs(audioSignal));

    % Set the sampling rate and play the sound
    fs = 44100; % Replace with your desired sampling rate
    soundsc(audioSignal, fs);

    % Pause for a moment to hear the sound
    pause(1); % Adjust the duration as needed
end
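Note that raw MFCC coefficients are not an audio waveform, so flattening a frame and passing it to soundsc only produces a rough diagnostic sound. Since you also have the MFCC time stamps (new_al), a more meaningful option is to play back the slice of the original recording that each selected frame came from. Below is a minimal sketch, not part of your original code, which assumes the original waveform is available in a variable speech with sampling rate fs, that new_al holds the start time of each MFCC frame in seconds, and a hypothetical frameDuration equal to the MFCC analysis window length you used:
% Sketch: play the original audio segment corresponding to each selected MFCC frame
% Assumptions: 'speech' = original waveform, 'fs' = its sampling rate,
% 'new_al' = start time (seconds) of each MFCC frame,
% 'frameDuration' = your MFCC analysis window length
frameDuration = 0.025;                        % assumed 25 ms analysis window
for i = 1:length(indices)
    tStart = new_al(indices(i));              % start time of the selected frame
    s1 = max(1, round(tStart*fs) + 1);        % first sample of the segment
    s2 = min(numel(speech), s1 + round(frameDuration*fs) - 1);  % last sample
    soundsc(speech(s1:s2), fs);               % play the original audio segment
    pause(frameDuration + 0.5);               % brief gap between segments
end
Adjust frameDuration (and any frame overlap) to match the settings you used when computing the MFCCs so the played segments line up with the frames.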
Hope this helps!
