Detecting Anomalies in Time Series Using Distance Methods
Anomaly detection is the process of identifying signal abnormalities by detecting deviations from normal behavior.
One approach to anomaly detection is to use distance methods, which detect anomalies by matching patterns within the time series data. The pattern matching relies on the z-normalized Euclidean distances among subsequences.
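As a minimal sketch of the z-normalized Euclidean distance, the following Python code rescales two equal-length subsequences to zero mean and unit standard deviation and then measures the ordinary Euclidean distance between them. Python is used here only as a toolbox-independent illustration. The names z_normalize and znorm_distance are hypothetical, and both the use of the population standard deviation and the handling of constant subsequences are assumptions.

```python
import math

def z_normalize(values):
    """Rescale a sequence to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # Assumed convention: a constant subsequence maps to all zeros.
        return [0.0] * n
    return [(v - mean) / std for v in values]

def znorm_distance(a, b):
    """Euclidean distance between the z-normalized forms of a and b."""
    if len(a) != len(b):
        raise ValueError("subsequences must have the same length")
    za, zb = z_normalize(a), z_normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(za, zb)))

# Same shape at a different scale: the distance is close to zero.
same_shape = znorm_distance([1, 2, 3, 2, 1], [10, 20, 30, 20, 10])
# Inverted shape: the distance is large.
inverted = znorm_distance([1, 2, 3, 2, 1], [3, 2, 1, 2, 3])
print(same_shape, inverted)
```

Because of the normalization, the distance compares the shapes of two subsequences rather than their offsets or amplitudes.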
Distance methods allow you to identify common recurring subsequences (motifs) that typically indicate normal operation and unique subsequences (discords) that indicate possible anomalies.
Distance methods are especially useful because they are relatively simple. The methods require no trained models or labeled data. Because they work directly on the data, they can flag deviations of any kind, whether or not that behavior has been characterized previously. The algorithms for these methods are optimized for fast computation.
Creating a Profile
Distance methods use a query sequence that includes a target motif or, alternatively, a specific discord. In general, a distance algorithm performs the following steps.
Compute the distance between the query and every subsequence of the time series that has the same length as the query. The distance between the query and a subsequence characterizes the likely behavior of that subsequence.
If the query contains a motif, then
A small distance indicates a good match between the query and the subsequence and most likely indicates normal behavior.
A large distance indicates a poor match and possible anomalous behavior.
Conversely, if the query contains a discord, then a large distance indicates likely normal behavior and a small distance indicates probable anomalous behavior.
Return the distances, known as a profile, along with an index vector that orders the distances from smallest (best fit) to largest (worst fit).
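The two steps above can be sketched in a few lines of Python. This is an illustrative brute-force version under stated assumptions, not the optimized algorithm of any toolbox: the function name distance_profile, the helper _znorm, and the sample data are all hypothetical, and the z-normalization uses the population standard deviation.

```python
import math

def _znorm(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return [0.0] * n  # assumed convention for constant subsequences
    return [(v - mean) / std for v in values]

def distance_profile(series, query):
    """Return (profile, order) for a query against every same-length subsequence."""
    m = len(query)
    if m == 0 or m > len(series):
        raise ValueError("query must be non-empty and no longer than the series")
    zq = _znorm(query)
    profile = []
    # Step 1: distance between the query and each subsequence of length m.
    for start in range(len(series) - m + 1):
        zs = _znorm(series[start:start + m])
        profile.append(math.sqrt(sum((x - y) ** 2 for x, y in zip(zq, zs))))
    # Step 2: index vector ordering the distances from best fit to worst fit.
    order = sorted(range(len(profile)), key=lambda i: profile[i])
    return profile, order

# Hypothetical data: a repeating pattern with one distorted cycle.
series = [0, 1, 0, -1] * 3 + [0, 3, 3, -1] + [0, 1, 0, -1] * 2
query = [0, 1, 0, -1]  # the query contains the motif
profile, order = distance_profile(series, query)
print("best match starts at index", order[0])
print("worst match starts at index", order[-1])
```

Here the query holds the motif, so subsequences near the front of the ordering resemble normal behavior and those near the end are candidates for anomalies.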
Distance Method Functions in Predictive Maintenance
Predictive Maintenance Toolbox™ provides the following functions for creating profiles of motifs and discords in a time series.
similarityDistance: Returns the distances between a separately specified query subsequence and a target time series.
distanceProfile: Returns the distance vector between one query subsequence within the target time series and all remaining subsequences within that time series that have the same length. You can use this function with single-variable or multivariable time series.
matrixProfile: Returns the matrix profiles containing the best-matching neighbors for each same-length subsequence within the target time series. You can use this function with single-variable or multivariable time series. For single-variable time series, two additional functions help you analyze the output of matrixProfile:
findDiscord: Identifies discords, indicating possible anomalies.
findMotif: Identifies motifs, indicating examples of normal behavior.
For detailed information on each of these functions, see the corresponding function reference pages.
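To illustrate the idea behind a matrix profile, the following Python sketch computes, for every subsequence, the distance to its best-matching non-overlapping neighbor, and then picks the subsequence with the largest such distance as a discord candidate and the one with the smallest as a motif candidate. It is a brute-force illustration only. It does not reproduce the signatures, outputs, or optimized algorithms of matrixProfile, findDiscord, or findMotif. The function name brute_force_matrix_profile, the exclusion zone of half the subsequence length, and the sample data are assumptions.

```python
import math

def _znorm(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return [0.0] * n  # assumed convention for constant subsequences
    return [(v - mean) / std for v in values]

def brute_force_matrix_profile(series, m):
    """For each length-m subsequence, find its nearest non-trivial neighbor."""
    n_sub = len(series) - m + 1
    if n_sub < 2:
        raise ValueError("series is too short for this subsequence length")
    exclusion = max(1, m // 2)  # assumed zone that skips near-overlapping matches
    subs = [_znorm(series[i:i + m]) for i in range(n_sub)]
    profile = [math.inf] * n_sub
    neighbor = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) < exclusion:
                continue  # ignore the subsequence itself and close overlaps
            d = math.sqrt(sum((x - y) ** 2 for x, y in zip(subs[i], subs[j])))
            if d < profile[i]:
                profile[i], neighbor[i] = d, j
    return profile, neighbor

# Hypothetical data: a repeating pattern with one distorted cycle.
series = [0, 1, 0, -1] * 3 + [0, 3, 3, -1] + [0, 1, 0, -1] * 3
profile, neighbor = brute_force_matrix_profile(series, 4)
candidates = [i for i, d in enumerate(profile) if math.isfinite(d)]
discord = max(candidates, key=lambda i: profile[i])  # most isolated subsequence
motif = min(candidates, key=lambda i: profile[i])    # best-repeated subsequence
print("discord candidate starts at index", discord)
print("motif candidate starts at index", motif, "with neighbor at", neighbor[motif])
```

The nested loop makes this sketch quadratic in the number of subsequences, so it is suitable only for small examples.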
See Also
similarityDistance | distanceProfile | matrixProfile | findDiscord | findMotif