Select Subsets of Data

Why Select Subsets of Data?

You can use data selection to create independent data sets for estimation and validation.

You can also use data selection as a way to clean the data and exclude parts with noisy or missing information. For example, when your data contains missing values, outliers, level changes, and disturbances, you can select one or more portions of the data that are suitable for identification and exclude the rest.

If you have only one data set and you want to estimate linear models, split the data into two portions to create independent data sets for estimation and validation. Splitting the data means selecting parts of the data set and saving each part as a separate data set.
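As a minimal sketch of such a split at the command line, the following example builds a synthetic single-input, single-output data set and divides it into two halves. The signal model, the sample count `N`, the sample time `Ts`, and the variable names `ze` and `zv` are assumptions made for illustration; only `iddata` and sample subreferencing come from the toolbox.

```matlab
% Synthetic data (illustrative assumption, not from the original topic)
N  = 400;                                   % number of samples
Ts = 0.1;                                   % sample time in seconds
u  = sign(randn(N,1));                      % random binary input
y  = filter(0.5,[1 -0.8],u) + 0.05*randn(N,1);   % first-order response plus noise

data = iddata(y,u,Ts);                      % package output, input, and sample time

% Split into two non-overlapping portions by sample index
ze = data(1:200);                           % estimation data
zv = data(201:400);                         % validation data
```

An even split is only one possible choice. Each portion still needs enough samples and input excitation to represent the system.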

You can merge several data segments into a single multiexperiment data set and identify an average model. For more information, see Create Data Sets from a Subset of Signal Channels or Representing Time- and Frequency-Domain Data Using iddata Objects.
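The following sketch shows one way to combine two selected segments into a multiexperiment data set with `merge`. The synthetic signals and the segment boundaries are assumptions chosen for illustration; in practice you would pick the ranges that exclude the unusable parts of your own record.

```matlab
% Synthetic data (illustrative assumption)
N  = 400;
Ts = 0.1;
u  = sign(randn(N,1));
y  = filter(0.5,[1 -0.8],u) + 0.05*randn(N,1);
data = iddata(y,u,Ts);

% Keep two clean segments and skip samples 151 to 250
seg1 = data(1:150);
seg2 = data(251:400);

% Merge the segments; each one becomes a separate experiment
zm = merge(seg1,seg2);
```

Passing `zm` to an estimation command then fits a single model to both experiments.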

Note

Subsets of the data set must contain enough samples to adequately represent the system, and the inputs must provide suitable excitation to the system.

Selecting portions of frequency-domain data is equivalent to filtering the data. For more information about filtering, see Filtering Data.

Extract Subsets of Data Using the App

Ways to Select Data in the App

You can use the System Identification app to select ranges of data on a time-domain or frequency-domain plot. Selecting data in the frequency domain is equivalent to passband-filtering the data.

After you select portions of the data, you can designate one data segment for estimating models and another data segment for validating models. For more information, see Specify Estimation and Validation Data in the App.

Note

Selecting <--Preprocess > Quick start performs the following actions simultaneously:

  • Remove the mean value from each channel.

  • Split the data into two parts.

  • Specify the first part as estimation data (or Working Data).

  • Specify the second part as Validation Data.
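A rough command-line counterpart of these actions might look like the sketch below. It assumes the split is made at the midpoint of the record, which the list above does not state, and it uses synthetic data with a deliberate output offset so that the effect of `detrend` is visible.

```matlab
% Synthetic data with a constant offset on the output (illustrative assumption)
N  = 400;
Ts = 0.1;
u  = sign(randn(N,1));
y  = 2 + filter(0.5,[1 -0.8],u) + 0.05*randn(N,1);
data = iddata(y,u,Ts);

% Remove the mean value from each channel
data_d = detrend(data);

% Split into two parts (midpoint split is an assumption)
half = floor(N/2);
ze = data_d(1:half);        % first part: estimation (working) data
zv = data_d(half+1:N);      % second part: validation data
```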

Selecting a Range for Time-Domain Data

You can select a range of data values on a time plot and save it as a new data set in the System Identification app.

Note

Selecting data does not extract experiments from a data set containing multiple experiments. For more information about multiexperiment data, see Create Multiexperiment Data Sets in the App.

To extract a subset of time-domain data and save it as a new data set:

  1. Import time-domain data into the System Identification app, as described in Create Data Sets from a Subset of Signal Channels.

  2. Drag the data set you want to subset to the Working Data area.

  3. If your data contains multiple I/O channels, in the Channel menu, select the channel pair you want to view. The upper plot corresponds to the input signal, and the lower plot corresponds to the output signal.

    Although you view only one I/O channel pair at a time, your data selection is applied to all channels in this data set.

  4. Select the data of interest in either of the following ways:

    • Graphically — Draw a rectangle on either the input-signal or the output-signal plot with the mouse to select the desired time interval. Your selection appears on both plots regardless of the plot on which you draw the rectangle. The Time span and Samples fields are updated to match the selected region.

    • By specifying the Time span — Edit the beginning and the end times in seconds. The Samples field is updated to match the selected region. For example:

      28.5 56.8

    • By specifying the Samples range — Edit the beginning and the end indices of the sample range. The Time span field is updated to match the selected region. For example:

      342 654

    Note

    To clear your selection, click Revert.

  5. In the Data name field, enter a name for the new data set that will contain the selected data.

  6. Click Insert. This action saves the selection as a new data set and adds it to the Data Board.

  7. To select another range, repeat steps 4 to 6.
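The relationship between a time span and a sample range can also be sketched at the command line. The example below converts a time interval into sample indices and extracts that range. The data, the sample time, the start time of 0, and the interval limits are all assumptions for illustration; the time vector is built by hand from `Ts` and the chosen start time.

```matlab
% Synthetic data (illustrative assumption)
N  = 400;
Ts = 0.25;                                  % sample time in seconds
u  = sign(randn(N,1));
y  = filter(0.5,[1 -0.8],u) + 0.05*randn(N,1);
data = iddata(y,u,Ts,'Tstart',0);           % first sample at t = 0 (assumption)

% Time stamp of every sample, consistent with Tstart = 0
t = (0:N-1)'*Ts;

% Time span to keep, in seconds
tspan = [28.5 56.8];

% Indices of the samples that fall inside the time span
idx = find(t >= tspan(1) & t <= tspan(2));

% Extract the selected range as a new data set
sub = data(idx);
```

Here `idx(1)` and `idx(end)` play the role of the Samples range, and `tspan` plays the role of the Time span.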

Selecting a Range of Frequency-Domain Data

Selecting a range of values in frequency domain is equivalent to filtering the data. For more information about data filtering, see Filtering Frequency-Domain or Frequency-Response Data in the App.
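As a hedged command-line illustration of the same idea, the sketch below transforms synthetic time-domain data to the frequency domain with `fft` and keeps only the samples inside a chosen band. The band limits `wlow` and `whigh` are arbitrary assumptions, and the frequency values are taken to be in the default unit of the data object.

```matlab
% Synthetic time-domain data (illustrative assumption)
N  = 400;
Ts = 0.1;
u  = sign(randn(N,1));
y  = filter(0.5,[1 -0.8],u) + 0.05*randn(N,1);
data = iddata(y,u,Ts);

% Transform to a frequency-domain iddata object
dataf = fft(data);

% Frequency vector of the transformed data
w = dataf.Frequency;

% Keep only the frequencies inside the band of interest (assumed limits)
wlow  = 1;
whigh = 10;
idx = find(w >= wlow & w <= whigh);
dataf_band = dataf(idx);
```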

Extract Subsets of Data at the Command Line

Selecting ranges of data values is equivalent to subreferencing the data.

For more information about subreferencing time-domain and frequency-domain data, see Select Data Channels, I/O Data and Experiments in iddata Objects.
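The sketch below illustrates the general subreferencing pattern on an `iddata` object: samples first, then output channels, then input channels, with experiments retrieved through `getexp`. The two-input synthetic system and the index ranges are assumptions made for illustration.

```matlab
% Synthetic data with two inputs and one output (illustrative assumption)
N  = 300;
Ts = 0.1;
u  = sign(randn(N,2));                      % two input channels
y  = filter(0.5,[1 -0.8],u(:,1)) + filter(0.2,[1 -0.5],u(:,2));
data = iddata(y,u,Ts);

% Samples 51 to 250, output channel 1, input channel 2
sub = data(51:250,1,2);

% Build a two-experiment data set, then pull out the second experiment
zm   = merge(data(1:100),data(201:300));
exp2 = getexp(zm,2);
```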

For more information about subreferencing frequency-response data, see Select I/O Channels and Data in idfrd Objects.