Interactively extract, visualize, and rank features from measured or simulated data for machine diagnostics and prognostics
The Diagnostic Feature Designer app allows you to accomplish the feature design portion of the predictive maintenance workflow using a multifunction graphical interface. You design and compare features interactively, and then determine which features are best at discriminating between data from nominal systems and from faulty systems. The most effective features ultimately become your condition indicators for fault diagnosis and prognostics.
Using this app, you can:
Import measured or simulated data from individual files, an ensemble file, or an ensemble datastore that references files external to the app.
Interactively plot the ensemble variables that you import or compute within the app. Group data by condition label in plots so that you can clearly see whether member data comes from nominal or faulty systems.
Derive new variables such as time-synchronous averaged signals or order spectra. The app executes processing on all ensemble members with one command.
Generate features from your variables, and visualize their effectiveness using histograms. Features include signal statistics, nonlinear metrics, rotating machinery metrics, and spectral metrics.
Rank your features so that you can determine numerically which ones are likely to best discriminate between nominal and faulty behavior.
Export your most effective features directly to Classification Learner for more insight into feature effectiveness and for algorithm training.
To get started with the app, you must have data from your systems available in your MATLAB® workspace. For information about organizing your data for import into the app, see Organize System Data for Diagnostic Feature Designer.
MATLAB toolstrip: On the Apps tab, under Control System Design and Analysis, click the app icon.
MATLAB command prompt: Enter
Import Data— Import datasets from the MATLAB workspace into the app
Import Single-Member Datasets | Import Multi-Member Ensemble
Import single-member datasets when your source data in the MATLAB workspace consists of an individual workspace variable for each machine member.
The app displays a selectable list of all the datasets in your MATLAB workspace. Select the datasets that correspond to your ensemble members. Upon completion of the import, the app incorporates the datasets into a single ensemble. For more information, use the Help button in the Import dialog box.
Import a multimember ensemble when your source data is combined into one collective dataset that includes data for all members. This collective dataset can be any of the following:
An ensemble table containing table arrays or matrices. Table rows represent individual members.
An ensemble cell array containing tables or matrices. Cell array rows represent individual members.
An ensemble datastore object that contains the information necessary to interact with files stored externally to the app. Use an ensemble datastore object especially when you have too much data to fit into app memory.
The app displays a menu that allows you to choose one dataset or ensemble datastore object from the MATLAB workspace. Select the item that corresponds to your ensemble. Upon completion of the import, the app initializes its internal ensemble using the imported item. For more information, use the Help button in the Import dialog box.
When you select either of the import methods, the app selects the variables to display based on format, not content. The list of candidate datasets is therefore similar for both methods. The app bases its interpretation of the dataset on the import method you select.
For more information about terms related to data ensembles, see Definitions.
For more information about organizing your data for import into the app, see Organize System Data for Diagnostic Feature Designer.
Group Signals, Group Variables— Plot multiple variables together in separate plots or in one plot
Specify how to plot multiple variables together.
Select to create separate plots displayed vertically, each with a unique y-axis scaling.
Clear to create a single plot that overlays all traces and uses a single y-axis scale.
Show Signal Information— Display highlighted variable member name and condition label
In a signal or spectrum plot, you highlight an individual member by positioning your cursor on the member trace. Select Show Signal Information to display both the variable member that you highlight and the condition label for that member in the lower right corner.
Data Cursors— Display x and y values of points, distances between two points
Select Data Cursors to display values of key points in your signal. Data Cursors are horizontal and vertical bars that you position over a point of interest, such as a peak value. The cursors display the x and y positions. To display the distance between the cursors, select Show Signal Information. To lock the bars so that they move together, select one of the Lock Spacing options.
Select Features— Choose the features to plot
Click Select Features to open a selectable list of features to plot. Use Select Features, for example, when you have generated many features but you want to focus on a subset in a single plot panel.
Group By— Select the condition variable for grouping data
Select the condition variable to base feature histograms on. The feature histograms use color to visualize the separation of data groups with different labels for that condition variable.
Bin Settings— Specify the histogram resolution
auto (default) | numeric | binning method name
Specify histogram resolution, as driven by your selection of Bin Width, Bin Method, Number of Bins, and Bin Limits. The bin settings apply to all the histograms for the feature table.
The bin settings are not independent. The algorithm uses an order of precedence to determine what to use:
The Binning Method is the default driver for the bin width.
A Bin Width specification overrides the Binning Method.
The bin width and the independent Bin Limits drive the number of bins. A Number of Bins specification has an effect only when the value of Group By is
For more information on interpreting and customizing histograms, see Generate and Customize Feature Histograms.
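The precedence among the bin settings can be sketched in a few lines of Python. This is an illustration only, not the app's algorithm: the function name `resolve_bins` is hypothetical, Sturges' rule is an assumed stand-in for whichever binning method is selected, and the Number of Bins setting is left out because its conditions are not fully stated here.

```python
import math

def resolve_bins(values, bin_width=None, bin_limits=None):
    """Return (bin_width, number_of_bins) using the precedence described above."""
    # Bin limits are independent of the other settings; default to the data range.
    lo, hi = bin_limits if bin_limits is not None else (min(values), max(values))
    span = hi - lo
    if span <= 0:
        return (bin_width or 1.0), 1  # degenerate data: a single bin
    if bin_width is None:
        # No explicit width, so the binning method drives the width.
        # Assumption: Sturges' rule stands in for the selected method.
        method_bins = math.ceil(math.log2(len(values)) + 1)
        bin_width = span / method_bins
    # The bin width and the bin limits together drive the number of bins.
    n_bins = math.ceil(span / bin_width)
    return bin_width, n_bins

values = list(range(16))
print(resolve_bins(values))                                      # method drives the width
print(resolve_bins(values, bin_width=2.0))                       # explicit width overrides
print(resolve_bins(values, bin_width=2.0, bin_limits=(0, 20)))   # limits change the count
```

Passing an explicit `bin_width` bypasses the method branch entirely, which mirrors the statement that a Bin Width specification overrides the Binning Method.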
Ranking Techniques— Select a ranking algorithm to apply
One-way ANOVA (default for multiclass CV)|
Select a ranking technique to assess how effectively each feature separates data with different condition labels. If you have already ranked your features, you can rank with a different technique and display the resulting rankings together. Each technique uses a different statistical method.
The menu differentiates between two-class and multiclass ranking methods.
Two-Class Methods — Use when your condition variable (CV) has only two labels,
Multiclass methods — use when your condition variable has more than two
labels, such as
The default ranking technique for two-class condition variables, T-Test, is the simplest technique, as it uses only the means of the two labeled groups and not their distributions. T-Test is primarily useful for identifying ineffective features to discard.
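As a rough illustration of two-class ranking, the following Python sketch scores each feature with the absolute Welch two-sample t statistic, which compares the group means scaled by their sample variances. Whether the app uses this exact statistic is an assumption. The feature names, values, and the labels `nominal` and `faulty` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def t_score(group_a, group_b):
    """Absolute Welch two-sample t statistic for one feature."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return abs(mean(group_a) - mean(group_b)) / standard_error

# Hypothetical feature values, one per ensemble member, split by condition label.
features = {
    "rms":      {"nominal": [1.0, 1.1, 0.9, 1.0], "faulty": [2.0, 2.1, 1.9, 2.0]},
    "kurtosis": {"nominal": [3.0, 3.4, 2.6, 3.0], "faulty": [3.1, 3.5, 2.7, 3.1]},
}

scores = {name: t_score(g["nominal"], g["faulty"]) for name, g in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this made-up data, `kurtosis` has nearly identical group means, so it receives a low score and is a candidate for discarding.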
The following figure illustrates the influence of specific criteria on ranking-method selection.
Selecting a technique activates a new tab with a name that matches the ranking technique. For more information on this technique-activated tab, see Ranking Technique Tab.
Rank By— Specify condition variable for ranking algorithm to use
Select the condition variable that the ranking algorithm bases its separation assessment on.
Sort By— Specify ranking technique to sort results by when displaying results from multiple techniques
T-Test (default) | ranking technique | ...
Specify the ranking technique to sort by when you are comparing the results of different ranking methods. When you use a single ranking technique, the app displays the results in order of importance, as indicated by the ranking score for that technique. When you are comparing the results for multiple methods, change Sort By to change the technique that drives the sorting order.
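The effect of Sort By can be pictured as reordering one table of scores by a chosen column. The sketch below is a minimal illustration with hypothetical feature names and made-up scores. It is not the app's internal representation.

```python
# Hypothetical ranking scores for three features under two techniques.
scores = {
    "rms":      {"T-Test": 17.3, "One-way ANOVA": 250.0},
    "kurtosis": {"T-Test": 0.4,  "One-way ANOVA": 300.0},
    "peak":     {"T-Test": 5.2,  "One-way ANOVA": 12.0},
}

def sort_features(scores, sort_by):
    """Order feature names by descending score for the chosen technique."""
    return sorted(scores, key=lambda name: scores[name][sort_by], reverse=True)

print(sort_features(scores, "T-Test"))
print(sort_features(scores, "One-way ANOVA"))
```

The scores themselves do not change. Only the technique that drives the display order changes.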
Delete Scores— Delete ranking scores from display
Specify this parameter to eliminate ranking scores for a specific technique. Use this parameter, for example, when you are comparing the results of multiple rankings, and you want to simplify the display by eliminating rankings that do not influence your feature selection.
Export— Export features from the app
Export features to the MATLAB workspace | Export features to the Classification Learner
Export features to use them or share them outside of the app. Both options open a ranking-sorted selectable list to choose from. When you export to the MATLAB workspace, you can use command-line techniques with the features. When you export to the Classification Learner, you open a Classification Learner session that uses your selected features as input.
Correlation Importance— Reduce the ranking of redundant features
The correlation importance setting allows you to screen out features that convey similar information to higher-ranked features. This screening provides a more diverse feature set in the upper ranks.
The criterion for the screening is the set of cross-correlation coefficients a feature has with higher-ranked features. High cross-correlation between two features implies that both features are separating condition groups similarly and provide redundant information. With the default value of 0, the app does not incorporate feature redundancy into ranking scores. As you increase the correlation importance value, the app increases the influence of feature cross-correlation on the feature ranking score. This increasing influence progressively lowers the score of redundant features.
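The following Python sketch shows one plausible way that a correlation importance value could lower the scores of redundant features. The penalty formula, the function names, and the data are all assumptions made for illustration. The formula the app actually uses is not stated here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def adjust_scores(values, scores, importance):
    """Scale each score by its strongest correlation with higher-ranked features.

    Hypothetical rule: adjusted = score * (1 - importance * max |correlation|).
    """
    order = sorted(scores, key=scores.get, reverse=True)
    adjusted = {}
    for i, name in enumerate(order):
        higher_ranked = order[:i]
        redundancy = max((abs(pearson(values[name], values[h])) for h in higher_ranked),
                         default=0.0)
        adjusted[name] = scores[name] * (1.0 - importance * redundancy)
    return adjusted

# Hypothetical per-member feature values; "b" duplicates the information in "a".
values = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, -1, -1, 1]}
scores = {"a": 10, "b": 9, "c": 5}

unchanged = adjust_scores(values, scores, importance=0.0)
screened = adjust_scores(values, scores, importance=1.0)
print(sorted(screened, key=screened.get, reverse=True))
```

With an importance of 0 the scores are untouched. With an importance of 1 the redundant feature `b` drops below the uncorrelated feature `c`.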
Normalization Scheme— Apply normalization across members
The normalization scheme normalizes each feature independently across the ensemble members. Normalization allows more direct comparisons among features. The app displays the defining equation for the scheme you select directly beneath your selection.
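As an illustration of per-feature normalization across members, the Python sketch below applies two common schemes, z-score and min-max, to each feature column. These two schemes are assumed examples. The schemes that the app offers, and their defining equations, are shown in the app itself. The feature names and values are hypothetical.

```python
from statistics import mean, stdev

def zscore(column):
    """Center a feature on its mean and scale by its sample standard deviation."""
    m, s = mean(column), stdev(column)
    return [(v - m) / s if s else 0.0 for v in column]

def minmax(column):
    """Rescale a feature to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in column]

# Hypothetical feature table: one list per feature, one entry per ensemble member.
feature_table = {"rms": [1.0, 2.0, 3.0], "peak": [10.0, 30.0, 50.0]}

normalized = {name: zscore(column) for name, column in feature_table.items()}
print(normalized)
```

After normalization, the two features share a common scale even though their raw values differ by an order of magnitude, which is what makes a direct comparison possible.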
Apply— Apply parameter settings to new ranking computation
Click Apply to calculate ranking with the specified parameters. The Feature Ranking tab in the plotting area displays the results both graphically and tabularly. This display also includes the results for the default ranking algorithm, and for any other ranking techniques you calculated previously.
Once you calculate a ranking, the app disables Apply until you change a parameter. You can calculate ranking within a tab multiple times. Each time you modify the parameters and calculate ranking, the new results overwrite the previous results in the plotting-area tab.
Close— Close the tab and return control to the feature ranking tab
Once you have completed your ranking within the ranking technique tab, close that tab to return control to the Feature Ranking tab. The Feature Ranking tab is disabled while any ranking technique tab is active.
A data ensemble is a collection of datasets, created by measuring or simulating a system under varying conditions. An ensemble can be implemented using independent datasets such as matrices or tables, or in a single collective dataset such as an ensemble table.
For more information on data ensembles and variables, see Data Ensembles for Condition Monitoring and Predictive Maintenance.
Each dataset within an ensemble is a member. Members of an ensemble all contain the same variables. For example, if your ensemble contains data from a set of similar machines, the dataset corresponding to one of those machines is a member.
An ensemble table is an ensemble dataset formatted as a table. Each column of the table represents one variable. Each row of the table represents one ensemble member. For information on converting member matrices to an ensemble table, see Prepare Matrix Data for Diagnostic Feature Designer.
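The row-per-member, column-per-variable layout can be mimicked in plain Python to make the structure concrete. This is only a language-neutral sketch, not a MATLAB table. The member names, signal values, and the `Nominal` label are hypothetical placeholders.

```python
from collections import defaultdict

# Each dictionary is one row (one ensemble member); each key is one column (one variable).
ensemble = [
    {"Member": "machine1", "Vibration": [0.1, 0.3, 0.2], "Health": "Nominal"},
    {"Member": "machine2", "Vibration": [0.9, 1.4, 1.1], "Health": "Degraded"},
    {"Member": "machine3", "Vibration": [0.2, 0.2, 0.3], "Health": "Nominal"},
]

# Members of an ensemble all contain the same variables.
variables = set(ensemble[0])
same_variables = all(set(member) == variables for member in ensemble)

# Group member names by the label in the Health column.
groups = defaultdict(list)
for member in ensemble:
    groups[member["Health"]].append(member["Member"])

print(same_variables, dict(groups))
```

Grouping rows by a label column in this way is the same idea the app uses when it colors plots and histograms by condition label.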
Large ensembles can be implemented using an ensemble datastore object. These objects contain a list of the member files and information for interacting with them. For more information on ensemble datastore objects, see Data Ensembles for Condition Monitoring and Predictive Maintenance.
Data variables make up the main content of the ensemble members, including measured data and derived data that you use for analysis and development of predictive maintenance algorithms. For example, you might represent accelerometer data as the data variable Vibration. Data variables can also include derived values, such as the mean value of a signal, or the frequency of the peak magnitude in a
Independent variables (IV) are the variables that identify or order the members in an ensemble, such as timestamps, number of operating hours, or machine identifiers. For example, Time is a common independent variable.
Condition variables (CV) are variables that describe the fault condition or operating condition of the ensemble member. Condition variables can record the presence or absence of a fault state, or other operating conditions such as ambient temperature. Frequently, condition variables have specific possible values described by labels. For example, a condition variable named Health might have two states described by labels Degraded. Condition variables can also be derived values, such as a single scalar value that encodes multiple fault and operating