Main Content

Acoustics-Based Machine Fault Recognition

In this example, you develop a deep learning model to detect faults in an air compressor using acoustic measurements. After developing the model, you package the system so that you can recognize faults based on streaming input data.

Data Preparation

Download and unzip the air compressor data set [1]. This data set consists of recordings from air compressors in a healthy state or one of seven faulty states.

loc = matlab.internal.examples.downloadSupportFile('audio','AirCompressorDataset/AirCompressorDataset.zip');
unzip(loc,pwd)

Create an audioDatastore (Audio Toolbox) object to manage the data and split it into training and validation sets. Call countEachLabel (Audio Toolbox) to inspect the distribution of labels in the train and validation sets.

ads = audioDatastore(pwd,'IncludeSubfolders',true,'LabelSource','foldernames');

[adsTrain,adsValidation] = splitEachLabel(ads,0.9,0.1);

countEachLabel(adsTrain)
ans=8×2 table
      Label      Count
    _________    _____

    Bearing       203 
    Flywheel      203 
    Healthy       203 
    LIV           203 
    LOV           203 
    NRV           203 
    Piston        203 
    Riderbelt     203 

countEachLabel(adsValidation)
ans=8×2 table
      Label      Count
    _________    _____

    Bearing       22  
    Flywheel      22  
    Healthy       22  
    LIV           22  
    LOV           22  
    NRV           22  
    Piston        22  
    Riderbelt     22  

adsTrain = shuffle(adsTrain);
adsValidation = shuffle(adsValidation);

You can reduce the training data set used in this example to speed up the runtime at the cost of performance. In general, reducing the data set is a good practice for development and debugging.

reduceDataset = false;
if reduceDataset
    adsTrain = splitEachLabel(adsTrain,20);
end

The data consists of time-series recordings of acoustics from faulty or healthy air compressors. As such, there are strong relationships between samples in time. Listen to a recording and plot the waveform.

[sampleData,sampleDataInfo] = read(adsTrain);
fs = sampleDataInfo.SampleRate;

soundsc(sampleData,fs)
plot(sampleData)
xlabel("Sample")
ylabel("Amplitude")
title("State: " + string(sampleDataInfo.Label))
axis tight

Because the samples are related in time, you can use a recurrent neural network (RNN) to model the data. A long short-term memory (LSTM) network is a popular choice of RNN because it is designed to avoid vanishing and exploding gradients. Before you can train the network, it's important to prepare the data adequately. Often, it is best to transform or extract features from 1-dimensional signal data in order to provide a richer set of features for the model to learn from.

Feature Engineering

The next step is to extract a set of acoustic features used as inputs to the network. Audio Toolbox™ enables you to extract spectral descriptors that are commonly used as inputs in machine learning tasks. You can extract the features using individual functions, or you can use audioFeatureExtractor (Audio Toolbox) to simplify the workflow and do it all at once.

trainFeatures = cell(1,numel(adsTrain.Files));
windowLength = 512;
overlapLength = 0;

aFE = audioFeatureExtractor('SampleRate',fs, ...
    'Window',hamming(windowLength,'periodic'),...
    'OverlapLength',overlapLength,...
    'spectralCentroid',true, ...
    'spectralCrest',true, ...
    'spectralDecrease',true, ...
    'spectralEntropy',true, ...
    'spectralFlatness',true, ...
    'spectralFlux',false, ...
    'spectralKurtosis',true, ...
    'spectralRolloffPoint',true, ...
    'spectralSkewness',true, ...
    'spectralSlope',true, ...
    'spectralSpread',true);

reset(adsTrain)
tic
for index = 1:numel(adsTrain.Files)
    data = read(adsTrain);
    trainFeatures{index} = (extract(aFE,data))';
end
fprintf('Feature extraction of training set took %f seconds.\n',toc);
Feature extraction of training set took 13.648354 seconds.

Data Augmentation

The training set contains a relatively small number of acoustic recordings for training a deep learning model. A popular method to enlarge the dataset is to use mixup. In mixup, you augment your dataset by mixing the features and labels from two different class instances. Mixup was reformulated by [2] as labels drawn from a probability distribution instead of mixed labels. The supporting function, mixup, takes the training features, associated labels, and the number of mixes per observation and then outputs the mixes and associated labels.

trainLabels = adsTrain.Labels;
numMixesPerInstance = 2;
tic
[augData,augLabels] = mixup(trainFeatures,trainLabels,numMixesPerInstance);

trainLabels = cat(1,trainLabels,augLabels);
trainFeatures = cat(2,trainFeatures,augData);
fprintf('Feature augmentation of train set took %f seconds.\n',toc);
Feature augmentation of train set took 0.250037 seconds.

Generate Validation Features

Repeat the feature extraction for the validation features.

validationFeatures = cell(1,numel(adsValidation.Files));

reset(adsValidation)
tic
for index = 1:numel(adsValidation.Files)
    data = read(adsValidation);
    validationFeatures{index} = (extract(aFE,data))';
end
fprintf('Feature extraction of validation set took %f seconds.\n',toc);
Feature extraction of validation set took 1.388853 seconds.

Train Model

Next, you define and train a network. To skip training the network, set downloadPretrainedSystem to true, then continue to the next section.

downloadPretrainedSystem = false;
if downloadPretrainedSystem
    loc = matlab.internal.examples.downloadSupportFile('audio','AcousticsBasedMachineFaultRecognition/AcousticsBasedMachineFaultRecognition.zip');
    unzip(loc,pwd)
    addpath(fullfile(pwd,'AcousticsBasedMachineFaultRecognition'))
end

Define Network

An LSTM layer learns long-term dependencies between time steps of time series or sequence data. The first lstmLayer has 100 hidden units and outputs sequence data. Then a dropout layer is used to reduce overfitting. The second lstmLayer outputs the last step of the time sequence.

numHiddenUnits = 100;
dropProb = 0.2;
layers = [ ...
    sequenceInputLayer(aFE.FeatureVectorLength,'Normalization','zscore')
    lstmLayer(numHiddenUnits,"OutputMode","sequence")
    dropoutLayer(dropProb)
    lstmLayer(numHiddenUnits,"OutputMode","last")
    fullyConnectedLayer(numel(unique(adsTrain.Labels)))
    softmaxLayer
    classificationLayer];

Define Network Hyperparameters

To define hyperparameters for the network, use trainingOptions.

miniBatchSize = 128;
validationFrequency = floor(numel(trainFeatures)/miniBatchSize);
options = trainingOptions("adam", ...
    "MiniBatchSize",miniBatchSize, ...
    "MaxEpochs",35, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropPeriod",30, ...
    "LearnRateDropFactor",0.1, ...
    "ValidationData",{validationFeatures,adsValidation.Labels}, ...
    "ValidationFrequency",validationFrequency);

Train Network

To train the network, use trainNetwork.

airCompNet = trainNetwork(trainFeatures,trainLabels,layers,options);

Evaluate Network

View the confusion chart for the validation data.

validationResults = classify(airCompNet,validationFeatures);
cm = confusionchart(validationResults,adsValidation.Labels);
accuracy = mean(validationResults == adsValidation.Labels)*100;
cm.title("Accuracy: " + accuracy + " (%)")

Model Streaming Detection

Create Functions to Process Data in a Streaming Loop

Once you have a trained network with satisfactory performance, you can apply the network to test data in a streaming fashion.

There are many additional considerations to take into account to make the system work in a real-world embedded system.

For example,

  • The rate or interval at which classification can be performed with accurate results

  • The size of the network in terms of generated code (program memory) and weights (data memory)

  • The efficiency of the network in terms of computation speed

In MATLAB, you can mimic how the network is deployed and used in hardware on a real embedded system and begin to answer these important questions.

Create MATLAB Function Compatible with C/C++ Code Generation

Once you train your deep learning model, you will deploy it to an embedded target. That means you also need to deploy the code used to perform the feature extraction. Use the generateMATLABFunction method of audioFeatureExtractor to generate a MATLAB function compatible with C/C++ code generation. Specify IsStreaming as true so that the generated function is optimized for stream processing.

filename = fullfile(pwd,"extractAudioFeatures");
generateMATLABFunction(aFE,filename,'IsStreaming',true);

Combine Streaming Feature Extraction and Classification

Save the trained network as a MAT file.

save('AirCompressorFaultRecognitionModel.mat','airCompNet')

Create a function that combines the feature extraction and deep learning classification.

type recognizeAirCompressorFault.m
function scores = recognizeAirCompressorFault(audioIn,rs)
% This is a streaming classifier function 

persistent airCompNet

if isempty(airCompNet)
    airCompNet = coder.loadDeepLearningNetwork('AirCompressorFaultRecognitionModel.mat');
end
if rs
    airCompNet = resetState(airCompNet);
end

% Extract features using function
features = extractAudioFeatures(audioIn);

% Classify
[airCompNet,scores] = predictAndUpdateState(airCompNet,features);

end

Test Streaming Loop

Next, you test the streaming classifier in MATLAB. Stream audio one frame at a time to represent a system as it would be deployed in a real-time embedded system. This enables you to measure and visualize the timing and accuracy of the streaming implementation.

Stream in several audio files and plot the output classification results for each frame of data. At a time interval equal to the length of each file, evaluate the output of the classifier.

reset(adsValidation)

N = 10;
labels = categories(ads.Labels);
numLabels = numel(labels);

% Create a dsp.AsyncBuffer to read audio in a streaming fashion
audioSource = dsp.AsyncBuffer;

% Create a dsp.AsyncBuffer to accumulate scores
scoreBuffer = dsp.AsyncBuffer;

% Create a dsp.AsyncBuffer to record execution time.
timingBuffer = dsp.AsyncBuffer;

% Pre-allocate array to store results
streamingResults = categorical(zeros(N,1));

% Loop over files
for fileIdx = 1:N

    % Read one audio file and put it in the source buffer
    [data,dataInfo] = read(adsValidation);
    write(audioSource,data);

    % Inner loop over frames
    rs = true;
    while audioSource.NumUnreadSamples >= windowLength

        % Get a frame of audio data
        x = read(audioSource,windowLength);

        % Apply streaming classifier function
        tic
        score = recognizeAirCompressorFault(x,rs);
        write(timingBuffer,toc);

        % Store score for analysis
        write(scoreBuffer,score);

        rs = false;
    end
    reset(audioSource)

    % Store class result for that file
    scores = read(scoreBuffer);
    [~,result] = max(scores(end,:),[],2);
    streamingResults(fileIdx) = categorical(labels(result));

    % Plot scores to compare over time
    figure
    plot(scores) %#ok<*NASGU>
    legend(string(airCompNet.Layers(end).Classes),'Location','northwest')
    xlabel("Time Step")
    ylabel("Score")
    str = sprintf('Known label = %s\nPredicted label = %s',string(dataInfo.Label),string(streamingResults(fileIdx)));
    title(str)
end

Compare the test results for the streaming version of the classifier and the non-streaming.

testError = mean(validationResults(1:N) ~= streamingResults);
disp("Error between streaming classifier and non-streaming: " + testError*100 + " (%)")
Error between streaming classifier and non-streaming: 0 (%)

Analyze the execution time. The execution time when state is reset is often above the 32 ms budget. However, in a real, deployed system, that initialization time will only be incurred once. The execution time of the main loop is around 10 ms, which is well below the 32 ms budget for real-time performance.

executionTime = read(timingBuffer)*1000;
budget = (windowLength/aFE.SampleRate)*1000;
plot(executionTime,'o')
title('Execution Time Per Frame')
xlabel('Frame Number')
ylabel('Time (ms)')
yline(budget,'',{'Budget'},'LineWidth',2)

Supporting Functions

function [augData,augLabels] = mixup(data,labels,numMixesPerInstance)
augData = cell(1,numel(data)*numMixesPerInstance);
augLabels = repelem(labels,numMixesPerInstance);

kk = 1;
for ii = 1:numel(data)
    for jj = 1:numMixesPerInstance
        lambda = max(min((randn./10)+0.5,1),0);
        
        % Find all available data with different labels.
        availableData = find(labels~=labels(ii));

        % Randomly choose one of the available data with a different label.
        numAvailableData = numel(availableData);
        idx = randi([1,numAvailableData]);

        % Mix.
        augData{kk} = lambda*data{ii} + (1-lambda)*data{availableData(idx)};

        % Specify the label as randomly set by lambda.
        if lambda < rand
            augLabels(kk) = labels(availableData(idx));
        else
            augLabels(kk) = labels(ii);
        end
        kk = kk + 1;
    end
end

end

References

[1] Verma, Nishchal K., et al. "Intelligent Condition Based Monitoring Using Acoustic Signals for Air Compressors." IEEE Transactions on Reliability, vol. 65, no. 1, Mar. 2016, pp. 291–309. DOI.org (Crossref), doi:10.1109/TR.2015.2459684.

[2] Huszar, Ferenc. "Mixup: Data-Dependent Data Augmentation." InFERENCe. November 03, 2017. Accessed January 15, 2019. https://www.inference.vc/mixup-data-dependent-data-augmentation/.