How can I read .ogg audio datasets for training and applying LSTM in Matlab according to the following code?

Question

Pooyan Mobtahej on 3 Oct 2020


https://nl.mathworks.com/matlabcentral/answers/604501-how-can-i-read-ogg-audio-datasets-for-training-and-applying-lstm-in-matlab-according-to-the-followi

Answered: Kiran Felix Robert on 8 Oct 2020

I have MATLAB code that performs the following steps for deep learning with an LSTM. I need to change the first three steps so that the model is trained on our own dataset, which consists of .ogg audio files. Please create and use some sample .ogg audio files and give me the code.

The following steps are for your information:

1. Three classes of audio signals are generated and labeled 'white', 'brown', and 'pink'. Each class has 1000 samples.
2. 800 samples from each class are used as training samples to train the deep neural network, for a total of 800*3 = 2400 samples in the training dataset. Their labels are their class names 'white', 'brown', and 'pink'. (Lines 29 and 30)
3. 200 samples from each class are used as validation samples to test the performance of the deep neural network, for a total of 600 samples in the validation dataset. Their labels are their class names 'white', 'brown', and 'pink'. (Lines 32 and 33)
4. Extract features from the training dataset and the validation dataset.
5. Define the structure of the neural network model (LSTM).
6. Set the training options.
7. Train the model iteratively using the training dataset, and test the model using the validation dataset every iteration.
8. Finish training and get the trained model.
9. Generate a test dataset and use the trained model to classify it into the three classes 'white', 'brown', and 'pink'.

Our dataset has 2 classes, 'normal' and 'anomaly', instead of the three classes 'white', 'brown', and 'pink' used in this example. We know which signals are normal and which are anomalous, so the class of each signal is known and no labeling is needed. You can separate our data into three parts: for example, 80% of all normal and anomaly signals for training (2 classes), 10% for validation, and 10% for testing.
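One possible way to realise the 80/10/10 split described above is to shuffle the file indices within each class and cut them at the 80% and 90% marks. The sketch below only builds the label vector and the index sets; the count of 100 files per class and all variable names are assumptions for illustration, not part of the original code.

```matlab
% Minimal sketch: per-class (stratified) 80/10/10 split of signal indices.
% numPerClass and every variable name here are hypothetical.
rng(0);                                   % reproducible shuffling
numPerClass = 100;                        % assumed number of files per class
classNames  = ["normal","anomaly"];

% Labels are known in advance, so they can be built directly
labels = [repelem(categorical("normal"),numPerClass,1); ...
          repelem(categorical("anomaly"),numPerClass,1)];

idxTrain = []; idxVal = []; idxTest = [];
for c = classNames
    idx = find(labels == c);              % indices of this class
    idx = idx(randperm(numel(idx)));      % shuffle within the class
    nTrain = round(0.8*numel(idx));
    nVal   = round(0.1*numel(idx));
    idxTrain = [idxTrain; idx(1:nTrain)];
    idxVal   = [idxVal;   idx(nTrain+1:nTrain+nVal)];
    idxTest  = [idxTest;  idx(nTrain+nVal+1:end)];
end

labelsTrain      = labels(idxTrain);
labelsValidation = labels(idxVal);
labelsTest       = labels(idxTest);
```

Splitting per class keeps the normal/anomaly ratio the same in all three parts, which matters if one class is much smaller than the other.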

Code:

% Generate 1000 signals per class, each 0.5 s long at 44.1 kHz
fs = 44.1e3;
duration = 0.5;
N = duration*fs;
wNoise = 2*rand([N,1000]) - 1;
wLabels = repelem(categorical("white"),1000,1);
bNoise = filter(1,[1,-0.999],wNoise);
bNoise = bNoise./max(abs(bNoise),[],'all');
bLabels = repelem(categorical("brown"),1000,1);
pNoise = pinknoise([N,1000]);
pLabels = repelem(categorical("pink"),1000,1);

% Listen to and plot the first signal of each class
sound(wNoise(:,1),fs)
melSpectrogram(wNoise(:,1),fs)
title('White Noise')
sound(bNoise(:,1),fs)
melSpectrogram(bNoise(:,1),fs)
title('Brown Noise')
sound(pNoise(:,1),fs)
melSpectrogram(pNoise(:,1),fs)
title('Pink Noise')

% Split into training (800 per class) and validation (200 per class) sets
audioTrain = [wNoise(:,1:800),bNoise(:,1:800),pNoise(:,1:800)];
labelsTrain = [wLabels(1:800);bLabels(1:800);pLabels(1:800)];
audioValidation = [wNoise(:,801:end),bNoise(:,801:end),pNoise(:,801:end)];
labelsValidation = [wLabels(801:end);bLabels(801:end);pLabels(801:end)];

% Create the feature extractor, then extract the training features
aFE = audioFeatureExtractor("SampleRate",fs, ...
    "SpectralDescriptorInput","melSpectrum", ...
    "spectralCentroid",true, ...
    "spectralSlope",true);
featuresTrain = extract(aFE,audioTrain);
[numHopsPerSequence,numFeatures,numSignals] = size(featuresTrain)

% Rearrange into a cell array with one numFeatures-by-numHops sequence per signal
featuresTrain = permute(featuresTrain,[2,1,3]);
featuresTrain = squeeze(num2cell(featuresTrain,[1,2]));
numSignals = numel(featuresTrain)
[numFeatures,numHopsPerSequence] = size(featuresTrain{1})
featuresValidation = extract(aFE,audioValidation);
featuresValidation = permute(featuresValidation,[2,1,3]);
featuresValidation = squeeze(num2cell(featuresValidation,[1,2]));

% Define the LSTM network
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(50,"OutputMode","last")
    fullyConnectedLayer(numel(unique(labelsTrain)))
    softmaxLayer
    classificationLayer];

% Set the training options and train
options = trainingOptions("adam", ...
    "Shuffle","every-epoch", ...
    "ValidationData",{featuresValidation,labelsValidation}, ...
    "Plots","training-progress", ...
    "Verbose",false);
net = trainNetwork(featuresTrain,labelsTrain,layers,options);

% Classify one newly generated test signal of each class
wNoiseTest = 2*rand([N,1]) - 1;
classify(net,extract(aFE,wNoiseTest)')
bNoiseTest = filter(1,[1,-0.999],wNoiseTest);
bNoiseTest = bNoiseTest./max(abs(bNoiseTest),[],'all');
classify(net,extract(aFE,bNoiseTest)')
pNoiseTest = pinknoise(N);
classify(net,extract(aFE,pNoiseTest)')


Answer 1

Kiran Felix Robert on 8 Oct 2020


https://nl.mathworks.com/matlabcentral/answers/604501-how-can-i-read-ogg-audio-datasets-for-training-and-applying-lstm-in-matlab-according-to-the-followi#answer_508786

Hi Pooyan,

The audioread function can be used to read .ogg files.
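As a quick illustration of reading an .ogg file, the sketch below first writes a short test tone to a temporary .ogg file with audiowrite and then reads it back with audioread. The file name, tone frequency, and variable names are made up for the example; with real data only the audioread call is needed.

```matlab
% Minimal sketch: write a short tone to a temporary .ogg file, read it back.
% The file name and signal are hypothetical sample data.
Fs = 44100;                               % sample rate in Hz
t  = (0:Fs/2-1).'/Fs;                     % 0.5 s time axis (column vector)
x  = 0.5*sin(2*pi*440*t);                 % 440 Hz test tone

fname = fullfile(tempdir,'example_tone.ogg');
audiowrite(fname,x,Fs);                   % container chosen from the .ogg extension

% audioread returns the samples and the file's sample rate
[y,FsRead] = audioread(fname);

% If the file is stereo, average the channels to get one mono column
if size(y,2) > 1
    y = mean(y,2);
end

delete(fname);                            % clean up the temporary file
```

Note that the sample rate is an output of audioread, not an input, so the value returned here is the one to pass on to the feature extractor.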

The code uses 3 classes, each containing 1000 signals, and each signal has N = duration*fs = 22050 samples.

Similarly, assuming you have 100 audio files of 1000 samples each for both the normal and anomaly classes, you can use a loop to read the files and then split the data for training, validation, and testing.

The following code shows an example, assuming the files are named normal_1.ogg, normal_2.ogg, …, normal_100.ogg and anomaly_1.ogg, anomaly_2.ogg, …, anomaly_100.ogg:

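normal = zeros(100,1000);    % one row per file, 1000 samples each
anomaly = zeros(100,1000);
for i = 1:100
    normal_name = strcat('normal_',num2str(i),'.ogg');
    anomaly_name = strcat('anomaly_',num2str(i),'.ogg');
    % audioread returns the samples and the sample rate Fs;
    % [1,1000] limits the read to the first 1000 samples
    [y,Fs] = audioread(normal_name,[1,1000]);
    normal(i,:) = y(:,1).';  % keep the first channel only
    [y,Fs] = audioread(anomaly_name,[1,1000]);
    anomaly(i,:) = y(:,1).';
end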

The above arrays can then be split into training, validation, and test sets as required.
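As a rough sketch of that last step, the arrays can be rearranged into the layout the question's code expects (one signal per column) and sliced with fixed index ranges, mirroring the 1:800 / 801:end slicing in the original example. Random data stands in for the arrays filled by the reading loop, and the names audioAll, labelsAll, audioTest, and labelsTest are assumptions for illustration.

```matlab
% Minimal sketch: arrange the two arrays for the training code in the question.
% Random stand-ins replace the arrays produced by the reading loop.
normal  = randn(100,1000);                % one row per file (hypothetical data)
anomaly = randn(100,1000);

% The question's code stores one signal per column, so transpose and join
audioAll  = [normal.', anomaly.'];        % 1000-by-200
labelsAll = [repelem(categorical("normal"),100,1); ...
             repelem(categorical("anomaly"),100,1)];

% Fixed 80/10/10 split within each class (columns 1:100 normal, 101:200 anomaly)
trainIdx = [1:80,   101:180];
valIdx   = [81:90,  181:190];
testIdx  = [91:100, 191:200];

audioTrain       = audioAll(:,trainIdx);
labelsTrain      = labelsAll(trainIdx);
audioValidation  = audioAll(:,valIdx);
labelsValidation = labelsAll(valIdx);
audioTest        = audioAll(:,testIdx);
labelsTest       = labelsAll(testIdx);
```

With these variables in place, the feature extraction and training part of the question's code should apply largely unchanged, using the Fs returned by audioread as the "SampleRate". One caveat: audioFeatureExtractor analyses the signal in windows (1024 samples by default, as far as I know), so signals of only 1000 samples would probably need a shorter "Window" setting or longer recordings.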

Kiran Felix Robert
