# Data Types in System Identification Toolbox

This example describes different data types and formats supported by System Identification Toolbox™.

The toolbox supports model identification using time-domain signals, frequency-domain signals and frequency-response data (also called *Frequency Response Function*, or FRF). Typically there is no restriction on the number of input and output signals; most estimation commands support an arbitrary number of I/O signals. In particular, there is full support for time-series modeling, or signal modeling, where the data is typically described as having an arbitrary number of outputs, but no inputs.

### How Data of Different Types are Generated

The origin of signals is in time-domain where the system or process to be modeled is excited by an external stimulus (the "input" **u(t)**) and the corresponding response of the system (the "output" **y(t)**) recorded. These signals are recorded over a finite span of time **t** resulting in vectors or matrices storing the measured values. The triplet of recorded values $$\{y,u,t\}$$ denoted by $${D}_{1}$$** **in the schematic diagram below forms the **time-domain data**. Since the data can only be recorded at discrete time instants, sometimes an extra piece of information is also required for modeling - the intersample behavior.

The time-domain signals can be converted into their frequency-domain counterparts by using Discrete Fourier Transform, typically carried out using the FFT operation. This operation requires the time domain signals to be collected on (or interpolated to) a uniform time-grid, that is, made available at a constant sampling interval $${T}_{s}$$. Unless zero-padding is used, the resulting signals correspond to a uniform frequency grid of as many samples as in the original (time-domain) signals and spaced over 0 to Nyquist frequency ($$=\pi /{T}_{s}$$, where $${T}_{s}$$ is the time-domain data sampling interval). The triplet$$\{Y,U,\omega \}$$is called **frequency-domain data**. It is shown by ${\mathit{D}}_{2}$ in the diagram above. To make the frequency-domain data independent of any correspondence to another time-domain dataset, the sample time $${T}_{s}$$ is also stored.

The dynamic behavior of systems is often described by its frequency response, which is a collection of the gain and relative phase of the output of the system to a sinusoidal input, over a range of input frequencies. Typically spectrum analyzers compute these values. They can also be computed by applying spectral analysis techniques to the time- or frequency-domain data. The result is a complex frequency response vector $$H(\omega )$$ computed over a range of $$\omega $$ values. The pair $$\{H,\omega \}$$is called the** frequency response data** (FRD) or **frequency response function** (FRF), shown as ${\mathit{D}}_{3}$ in the diagram above.

To summarize, System Identification Toolbox supports these 3 data types:

Time-domain data: composed of input, output signal values for a given time-vector. The input signal is not aways present, such as in case of time-series modeling. Both linear and nonlinear models can be identified using time-domain data.

Frequency-domain data: FFT of time-domain signals. Typically used for linear model identification. They offer the benefit of data compression and fast estimation algorithms.

Frequency response data (FRD): gain and phase of the system output relative to its input, recorded over a range sinusoidal inputs of varying frequencies. Like frequency-domain data, the FRDs provide the benefit of data compression and fast estimation algorithms. They are used for identifying linear models.

### Data Formats: Representation of Various Data Types for Identification Purposes

System Identification Toolbox supports 3 ways of representing time-domain data - timetables, numeric matrices, and iddata objects.

### 1. Timetables

Timetables are a built-in MATLAB datatype represented by the timetable objects. They are used to represent observations that are ordered by time and hence are suitable for time-domain identification. You can create timetables by using the timetable constructor, or by using the `array2timetable`

command. Timetables can be created automatically when importing data into MATLAB from data files (spreadsheets, csv files etc).

The time vector associated with the data in a timetable `TT`

can be accessed using its `TT.Properties.RowTimes`

property. Identification typically requires uniformly sampled data (has a finite value for its "`TT.Properties.SampleRate`

" and "`TT.Properties.TimeStep`

" properties).Timetables are supported for model identification, release R2022b onwards. For earlier releases, you must use iddata object to represent data. For the identification of Neural State-Space models (represented by the `idNeuralStateSpace`

object) in the continuous-time form, the timetable sample points need not be uniformly separated.

Timetables can store non-numeric data, as well as samples of arbitrary sizes for each time value. However, only the numeric, scalar-valued variables (that is, the variables whose values are scalars for a given time sample) can be used for identification of models. Suppose you have data for two signals `u1`

and `u2`

that share a common time vector `T`

. There are two ways of storing these signals in a timetable:

As a single variable whose value is a 2-element vector

`[u1(t), u2(t)]`

for a given time`t`

.As two distinct variables, one for each signal.

In System Identification Toolbox, the second option - using distinct variables for each signal, is the only supported representation. Here is an example to see the difference more clearly.

% Make up some data to illustrate the difference between the two ways of % creating the timetable T = seconds(1:5)'; % 5 samples u1 = rand(5,1); u2 = rand(5,1); % Option 1: Joint representation of u1, u2 into a single variable TT1 = timetable(T, [u1, u2], 'VariableNames', {'U'}); head(TT1)

T U _____ __________________ 1 sec 0.81472 0.09754 2 sec 0.90579 0.2785 3 sec 0.12699 0.54688 4 sec 0.91338 0.95751 5 sec 0.63236 0.96489

% Option 2: Separate variable for each signal TT2 = timetable(T, u1, u2, 'VariableNames', {'u1','u2'}); head(TT2)

T u1 u2 _____ _______ _______ 1 sec 0.81472 0.09754 2 sec 0.90579 0.2785 3 sec 0.12699 0.54688 4 sec 0.91338 0.95751 5 sec 0.63236 0.96489

The timetable `TT1`

cannot be used for identification/analysis purposes. `TT2`

is the only valid representation. Let us now see how a timetable can be used for identifying a model.

% Example: Estimate a continuous-time transfer function % using data represented by a timetable load sdata8 tt8 head(tt8) % view the first few rows of the variables in timetable tt8

t u1 u2 u3 y _____ __ __ _________ ________ 1 sec 1 1 -0.62559 1.3563 2 sec -1 -1 0.38628 1.2895 3 sec 1 -1 -0.61745 -0.61873 4 sec 1 1 -0.72216 -0.10768 5 sec 1 -1 -0.41638 2.9138 6 sec 1 -1 -0.002612 0.58144 7 sec -1 -1 0.38949 0.20034 8 sec -1 -1 -1.1337 -1.0462

np = 2; % number of poles nz = 2; % number of zeros % Estimate a model using only one input ('u2') and output ('y') % (this is just to illustrate a calling syntax; the model quality is not good) sysTF_a = tfest(tt8, np, nz, 'InputName','u2','OutputName','y'); % Estimate a model with 3 inputs ('u1', 'u2', 'u3') and 1 output ('y') sysTF_b = tfest(tt8, np, nz, 'InputName',{'u1','u2','u3'},'OutputName','y')

sysTF_b = From input "u1" to output "y": -0.06172 s^2 + 0.1762 s + 1.836 ------------------------------- s^2 + 0.3634 s + 0.2453 From input "u2" to output "y": 1.025 s^2 + 1.679 s + 1.211 --------------------------- s^2 + 0.3707 s + 0.2436 From input "u3" to output "y": -0.06868 s^2 - 0.3401 s + 0.5967 -------------------------------- s^2 + 0.3603 s + 0.2496 Continuous-time identified transfer function. Parameterization: Number of poles: [2 2 2] Number of zeros: [2 2 2] Number of free coefficients: 15 Use "tfdata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using TFEST on time domain data. Fit to estimation data: 80.91% FPE: 1.068, MSE: 0.9817 Model Properties

Timetables can similarly be in all the model validation and analysis operations that require input/output data.

```
% Compare the responses of models sysTF_a and sysTF_b to the measured output in tt8
compare(tt8, sysTF_a, sysTF_b)
```

Note that the input intersample behavior has no effect on discrete-time estimations.

```
% Perform residual analysis using the timetable tt8 to represent the data.
resid(tt8, sysTF_b)
```

% Simulate the model sysTF_a using the input signal variables in timetable tt8 InputData = tt8(:,{'u2'}); % separate out the variables that are inputs for the model head(InputData)

t u2 _____ __ 1 sec 1 2 sec -1 3 sec -1 4 sec 1 5 sec -1 6 sec -1 7 sec -1 8 sec -1

`sim(sysTF_a, InputData(1:20,:)) % simulate the model response for 20 seconds`

It is not necessary to separate out the variables to use. In the simulation example above, the same result would be obtained if the original timetable tt8 was used in the `sim`

command.

sim(sysTF_a, tt8(1:20,:))

#### Timetables: Representing Multi-Experiment Data

You can use multiple datasets together for model identification. If the data is in timetables, you pass a cell array of the timetables to all the estimation and analysis commands. The individual timetables in the cell array are referred to as "experiments". Thus, the variable `X = {TT1, TT2, TT3}`

represents a multi-experiment dataset consisting of 3 experiments `TT1, TT2`

, and `TT3`

. Each experiment must have the same datatype (timetable here), the same active numeric variables, and the same sample time and time units. The start time, and the number of observations in the individual experiments may be different.

% Load two timetables - tt1 and tt2. % For the purpose of this example only, assume that they both correspond to the same system % The datasets have different lengths, but are sample at the same rate (10 Hz) load sdata1 tt1 load sdata2 tt2 % Create a multi-experiment dataset MultiExpData = {tt1, tt2}; % Estimate a nonlinear ARX model using MultiExpData Orders = [5 5 1]; NL = idGaussianProcess; nlsys = nlarx(MultiExpData, Orders, NL)

nlsys = Nonlinear ARX model with 1 output and 1 input Inputs: u Outputs: y Regressors: Linear regressors in variables y, u List of all regressors Output function: Gaussian process function using a SquaredExponential kernel Sample time: 0.1 seconds Status: Estimated using NLARX on time domain data "MultiExpData". Fit to estimation data: [79.04 87.04]% (prediction focus) FPE: 0.9208, MSE: [0.8527 0.8901] More information in model's "Report" property. Model Properties

```
% Compare the model response to the estimation data
compare(MultiExpData, nlsys)
```

The plot shows the fit to the first dataset. Right-click on the plot and use the context menu to pick the second data experiment to view.

Similarly, you can perform model simulation, response prediction, and residual analysis using multi-experiment data. Here is an example of using multi-experiment timetable data for model simulation.

sim(nlsys,MultiExpData)

Here is another example of using multi-experiment data for spectral analysis, that is, generating a non-parametric estimate of the model frequency response.

% Example: Using SPA for spectral analysis. G = spa(MultiExpData); % view result bode(G)

#### Relationship of Timetables to the IDDATA object

The timetable object is a basic datatype available in MATLAB and is the recommended way of representing time-domain signals. The iddata object is specific to System Identification Toolbox. It can be used to store both time-domain signals as well frequency-domain signals. Given a timetable with variables `'u1'`

, `'u2'`

, `'u3'`

for inputs, and the variables `'y1', 'y2'`

for the outputs, you can convert the data into an equivalent `iddata `

object as follows:

Extract the time vector

`T`

from the timetable and convert it into the numeric form.Extract all the input variables (

`'u1'`

,`'u2'`

,`'u3'`

) and combine them into a single input matrix`U`

.Extract all the output variables (

`'y1', 'y2'`

) and combine them into a single output matrix`Y`

.Optionally, also fetch the names and units of these variables from the timetable's "Properties".

Call the iddata constructor, using

`Y, U`

and`T`

as the primary input arguments. You can optionally specify the names and/or units of the I/O channels using name-value pairs in the call to the constructor (or by dot-assignment post construction)

Here is an example.

% Generate some data u1 = rand(100,1); u2 = cumsum(randn(100,1)); u3 = -rand(100,1); y1 = sin(u1+u2); y2 = rand(100,1); T = hours(0:99)'; % Pack the data into a timetable TT = timetable(T,u1,u2,u3,y1,y2,'VariableNames',{'u1','u2','u3','y1','y2'}); TT.Properties.VariableUnits = {'a','b','c','d','e'}; % Fetch the inputs from the timetable TT U = TT{:,1:3}; % fetch by index here (could also fetch by names) Y = TT{:,4:5}; T = TT.Properties.RowTimes; % T is a duration vector T_numeric = hours(T); % convert the duration vector into doubles % Compose the iddata object z = iddata(Y, U, 'SamplingInstants', T_numeric, 'TimeUnit','hours'); z.InputName = TT.Properties.VariableNames(1:3)'; z.OutputName = TT.Properties.VariableNames(4:5)'; % do this only if TT has units available (non-empty) z.InputUnit = TT.Properties.VariableUnits(1:3)'; z.OutputUnit = TT.Properties.VariableUnits(4:5)'; % optional, if you want to carry over any UserData content z.UserData = TT.Properties.UserData; display(z)

z = Time domain data set with 100 samples. Sample time: 1 hours Outputs Unit (if specified) y1 d y2 e Inputs Unit (if specified) u1 a u2 b u3 c

plot(z)

Note that for most identification and analysis purposes, the most important properties of the iddata object to assign are the input/output data and the sampling (time vector) information. The other properties are optional. However, it does help to assign the input and output names for book-keeping and for interpreting any time plots (such as plot, compare, resid, or sim).

### 2. Numeric Matrices

The model training, validation, and analysis commands accept numeric matrices for the input and output data. The observations are along the rows and the channels are along the columns. This is the simplest data format and is often sufficient for identifying discrete-time models. Note that this format does not provide any information about the time vector. **Hence this format is not recommended for identifying continuous-time models.**

When using the numeric (double) matrices, the input and output matrices must be specified as separate input arguments in the estimation command.

% load double matrices % umat1: input vector, ymat1: output vector load sdata1 umat1 ymat1 % Estimate a discrete-time state-space model of 4th order using the N4SID command % Input and output data are specified using the first and second input arguments respectively sysd = n4sid(umat1, ymat1, 4)

sysd = Discrete-time identified state-space model: x(t+Ts) = A x(t) + B u(t) + K e(t) y(t) = C x(t) + D u(t) + e(t) A = x1 x2 x3 x4 x1 0.8392 -0.3129 0.02105 0.03743 x2 0.4768 0.6671 0.1428 0.003757 x3 -0.01951 0.08374 -0.09761 1.046 x4 -0.003885 -0.02914 -0.8796 -0.03171 B = u1 x1 0.02635 x2 -0.03301 x3 7.256e-05 x4 0.0005861 C = x1 x2 x3 x4 y1 69.08 26.64 -2.237 -0.5601 D = u1 y1 0 K = y1 x1 0.003282 x2 0.009339 x3 -0.003232 x4 0.003809 Sample time: 1 seconds Parameterization: FREE form (all coefficients in A, B, C free). Feedthrough: none Disturbance component: estimate Number of free coefficients: 28 Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using N4SID on time domain data "umat1". Fit to estimation data: 76.33% (prediction focus) FPE: 1.21, MSE: 1.087 Model Properties

sysd.Ts

ans = 1

Since the data does not provide the sample time, it is assumed to be 1 second. Consequently, the model sample time, `sysd.Ts`

, is also 1 second (note that n4sid estimates a discrete-time model by default). Since the data does not provide the sample time information, you can choose a different sample time for the model by using the`'Ts'/<value>`

name-value pair.

% Estimate a discrete-time state-space model of sample time 0.1 seconds sysd = n4sid(umat1, ymat1, 4, 'Ts', 0.1);

In the above call, the sample time (`Ts`

) of the model is assigned a value of 0.1 seconds. In order to achieve this, the data is first assigned a sample time of 0.1 seconds. This simple assignment of the sample time does not work if a continuous-time model is desired.

% Estimate a continuous-time state-space model sysc = n4sid(umat1, ymat1, 4, 'Ts', 0);

Warning: Data sample time is assumed to be 1 seconds. To specify a different value for the data sample time consider providing data using a timetable or an iddata object.

In this case, there is no direct way of specifying the data sample time; it is assumed to be 1 second. The warning issued is pointing to this fact. If you have the data in numeric format, convert it into a timetable or an iddata object wherein you can add the information regarding the sampling (such as the value of sample time and the time unit). Suppose the true sample time of the data above is 0.1 hours. In order to pass this formation to `N4SID`

, you can proceed as follows.

N = size(umat1,1); % (assume a start time of 1 hr) t = hours(0.1*(1:N)'); dataTT = timetable(umat1,ymat1,'RowTimes',t); sysc = n4sid(dataTT, 4, 'Ts', 0); sysc.Ts

ans = 0

sysc.TimeUnit

ans = 'hours'

You can, alternatively, use an iddata object to represent the data.

dataIDDATA = iddata(ymat1, umat1, 0.1, 'TimeUnit', 'hours'); sysc = n4sid(dataIDDATA, 4, 'Ts', 0); sysc.Ts

ans = 0

sysc.TimeUnit

ans = 'hours'

#### Numeric Data - Multi-experiment case

As in the case of timetables, multi-experiment data is represented by a cell array of matrices, where each cell element represents one experiment.

### 3. The IDDATA Object

The iddata object provides a convenient means of keeping all the data attributes together. You can store the measured signals, their sampling information, names and units in a single object. Furthermore, the iddata object can be used to represent both time- and frequency-domain signals.

So why not use this object all the time, as opposed to using double matrices or timetables? Here are the main reasons:

The main reason is that the iddata object is specific to the toolbox and may not be familiar to the users who have not used that this toolbox before. Requiring the data to be packaged into a specialized object can sometimes appear burdensome. This is unlike doubles and timetables which are built-in datatypes in MATLAB and hence available for use in any toolbox.

Representing the data in frequency domain requires the use of iddata objects. The conversion of the time-domain data into the frequency domain is achieved by the FFT operation. Frequency domain representation can sometimes be beneficial for system identification. You can compress the data, sometimes significantly, by storing only the selected frequency samples (such as those closer to the dynamics of interest). The data can be used in dedicated frequency-domain specific identification algorithms. These algorithms can sometimes yield faster and more accurate results for linear systems.

Packaging all the input and output signals into an iddata object requires that all signals be measured at the same sampling instants, that is, they all share a common time vector. In use cases where the input and output signals use different time vectors, these signals need to be specified as separate input arguments in the estimation and analysis commands. Currently, on the Neural State Space models support such data.

#### IDDATA Object: Time-Domain Configuration

% Create some data. u = sign(randn(200,2)); % 2 inputs y = randn(200,1); % 1 output ts = 0.1; % The sample time % Create an iddata object z = iddata(y,u,ts)

z = Time domain data set with 200 samples. Sample time: 0.1 seconds Outputs Unit (if specified) y1 Inputs Unit (if specified) u1 u2

By default, the object represents time-domain data. You can verify this by checking the `Domain`

property of the object.

z.Domain

ans = 'Time'

The data is plotted as iddata by the plot command, as in:

```
% plot all the data
plot(z)
```

```
% Plot only the first 100 samples of the output signal and the second input signal
plot(z(1:100,:,2))
```

To retrieve the outputs and inputs, use

u = z.u; % or, equivalently u = get(z,'u'), or u = z.InputData y = z.y; % or, equivalently y = get(z,'y'), or y = z.OutputData

To select a portion of the data:

`zp = z(48:79); % select only the samples 48 through 79`

To select the first output and the second input:

`zs = z(:,1,2); % The ':' refers to all the data time points.`

The channels are given default names 'y1', 'u2', etc. This can be changed to any values by

z = set(z,'InputName',{'Voltage';'Current'},'OutputName','Speed');

Equivalently you can use:

z.InputName = {'Voltage';'Current'}; z.OutputName = 'Speed';

For bookkeeping and plots, also units can be set:

z.InputUnit = {'Volt';'Ampere'}; z.OutputUnit = 'm/s'; z

z = Time domain data set with 200 samples. Sample time: 0.1 seconds Outputs Unit (if specified) Speed m/s Inputs Unit (if specified) Voltage Volt Current Ampere

All current properties are obtained by `get`

:

get(z)

`ans = `*struct with fields:*
Domain: 'Time'
Name: ''
OutputData: [200×1 double]
y: 'Same as OutputData'
OutputName: {'Speed'}
OutputUnit: {'m/s'}
InputData: [200×2 double]
u: 'Same as InputData'
InputName: {2×1 cell}
InputUnit: {2×1 cell}
Period: [2×1 double]
InterSample: {2×1 cell}
Ts: 0.1000
Tstart: 0.1000
SamplingInstants: [200×1 double]
TimeUnit: 'seconds'
ExperimentName: 'Exp1'
Notes: {}
UserData: []

You can add channels (both input and output) by "horizontal concatenation", i.e. `z = [z1 z2]`

:

z2 = iddata(rand(200,1),ones(200,1),0.1,'OutputName','New Output',... 'InputName','New Input'); z3 = [z,z2]

z3 = Time domain data set with 200 samples. Sample time: 0.1 seconds Outputs Unit (if specified) Speed m/s New Output Inputs Unit (if specified) Voltage Volt Current Ampere New Input

#### IDDATA Object: Frequency-Domain Configuration

Frequency domain signals are represented by a set of complex input/output vectors (which represent the FFT of their time-domain counterparts) and their corresponding frequency points. Frequency-domain signals can also be the continuous-time Fourier transforms of their time-domain counterparts leading to a dataset with a sample time of zero.

Load some frequency domain signals obtained by discrete Fourier transform of certain uniformly sampled time domain signals with sample time of 0.1 seconds.

load('demofr','U','Y') W = (0:500)'/100; % frequency vector in Hz tiledlayout(2,1) nexttile loglog(W,abs(Y),W,abs(U)) ylabel('Amplitude') nexttile semilogx(W,unwrap(angle(Y))*180/pi,W,unwrap(angle(U))*180/pi) ylabel('Phase (degrees)') xlabel('Frequency (Hz)')

The signals `U, Y`

, and frequency `W`

can be put in an iddata object as follows:

Ts = .1; % sample time zf = iddata(Y,U,Ts,'Frequency',W,'FrequencyUnit','Hz')

zf = Frequency domain data set with responses at 501 frequencies. Frequency range: 0 to 5 Hz Sample time: 0.1 seconds Outputs Unit (if specified) y1 Inputs Unit (if specified) u1

get(zf)

`ans = `*struct with fields:*
Domain: 'Frequency'
Name: ''
OutputData: [501×1 double]
y: 'Same as OutputData'
OutputName: {'y1'}
OutputUnit: {''}
InputData: [501×1 double]
u: 'Same as InputData'
InputName: {'u1'}
InputUnit: {''}
Period: Inf
InterSample: 'zoh'
Ts: 0.1000
FrequencyUnit: 'Hz'
Frequency: [501×1 double]
TimeUnit: 'seconds'
ExperimentName: 'Exp1'
Notes: {}
UserData: []

zf.Domain

ans = 'Frequency'

The `Domain`

property value shows that the iddata object zf represents a frequency-domain dataset. The property `Frequency`

stores the frequency vector in the chosen frequency units. Plot the data.

clf plot(zf)

The plots shows the magnitude and the phase of the input and output frequency domain signals on separate axes.

The syntaxes for selecting a subset of samples or channels, and those for concatenation and subreferencing are same as in the time-domain case. If the data represents continuous-time Fourier transforms, use a sample time of zero, as in `zf = iddata(Y,U,0,'Frequency',W,'FrequencyUnit','Hz')`

If you are starting with time-domain signals, a convenient way of obtaining the frequency-domain representation is by using the `fft`

method of the iddata object.

u = sign(randn(200,2)); % 2 inputs y = randn(200,1); % 1 output ts = 0.1; % The sample time % Create time-domain iddata object z_time = iddata(y,u,ts); z_frequency = fft(z_time) % frequency domain representation of the dataset z

z_frequency = Frequency domain data set with responses at 101 frequencies. Frequency range: 0 to 31.416 rad/seconds Sample time: 0.1 seconds Outputs Unit (if specified) y1 Inputs Unit (if specified) u1 u2

You can convert frequency-domain dataset back into its time-domain representation using the `ifft`

method, provided that the frequencies are uniformly spaced over the 0 to Nyquist frequency.

z_time = ifft(z_frequency)

z_time = Time domain data set with 200 samples. Sample time: 0.1 seconds Outputs Unit (if specified) y1 Inputs Unit (if specified) u1 u2

#### IDDATA Object: Representing Multi-Experiment Data

You can use an iddata object to store data corresponding to multiple experiments. To do so, use a cell array of signal values in the object constructor. The individual data experiments need not use the same sampling but they must all share the same input/output names and time/frequency units.

% Load two different load iddata1 z1 load iddata2 z2 plot(z1,z1)

y1 = z1.y; u1 = z1.u; Ts1 = z1.Ts; y2 = z2.y; u2 = z2.u; Ts2 = z2.Ts; ycell = {y1,y2}; ucell = {u1,u2}; Tscell = {Ts1,Ts2}; zm = iddata(ycell,ucell,Tscell)

zm = Time domain data set containing 2 experiments. Experiment Samples Sample Time Exp1 300 0.1 Exp2 400 0.1 Outputs Unit (if specified) y1 Inputs Unit (if specified) u1

Often, an easier way is to first create a set of single-experiment iddata objects and then combine them into a single, multi-experiment object using the `merge`

command.

Both the datasets have the same input/output signal names and time units. Merge them into a single dataset:

zm = merge(z1,z2); plot(zm)

For more information, see the example *Dealing with Multi-Experiment Data and Merging Models*.

### 4. IDFRD Object

The IDFRD object is used to store measured frequency response data. To use it as a source of data in the various identification and analysis commands, the object must contain non-empty values in its properties that store the measured response:

`ResponseData`

, and`Frequency`

properties for input/output models.SpectrumData and

`Frequency`

properties for time series models.

If you measured the frequency response data directly from hardware, such as by using a spectrum analyzer, set the data sample time to zero (`G = idfrd(response, frequency, 0)`

). If the frequency response is obtained by converting from a time-domain dataset, the sample time of the idfrd object matches that of the original data. You can set the idfrd sample time to zero only if the original data used inputs that were band-limited. Note that setting the idfrd sample time to zero is desirable since it makes it possible to identify continuous-time parametric models with high speed and accuracy.

#### Convert Data Between Time and Frequency Domains

As described earlier, the time-domain signals can be converted into their frequency-domain counterparts by the discrete Fourier transform operation, implemented as an FFT operation. Conversion of the signal data (either time- or frequency-domain) into frequency response (FRF) is actually an exercise in estimation. There are two ways of doing so:

Use a non-parametric fitting method, such as etfe, spa, or spafdr. These commands produce an idfrd object as output with sample time matching that of the input data.

Identify an linear parametric model (e.g., using tfest, ssest, arx) and fetch the frequency response by calling the idfrd command on the result.

In either of these approaches, the input data must be uniformly sampled. See the example *Frequency Domain Identification: Estimating Models Using Frequency Domain Data* for more information on how to identify parametric models from frequency response data.

### Specifying Input Intersample Behavior for Continuous-Time Modeling

An attribute of the input data that is important for continuous-time modeling is its intersample behavior, whether it is held constant between the samples (zero-order-hold), is linear (first-order-hold), or is bandlimited. Specification of this data attribute allows the estimation algorithm to account for potential aliasing effects in the data and deliver the correct result. Use the "`InputInterSample`

" option in the estimation option set to assign the correct value.

opt = tfestOptions; opt.InputInterSample = 'foh'; sysTF_foh = tfest(tt8, np, nz, 'InputName',{'u1','u2','u3'},'OutputName','y',opt); opt.InputInterSample = 'zoh'; sysTF_zoh = tfest(tt8, np, nz, 'InputName',{'u1','u2','u3'},'OutputName','y',opt); opt.InputInterSample = 'bl'; sysTF_bl = tfest(tt8, np, nz, 'InputName',{'u1','u2','u3'},'OutputName','y',opt); bodemag(sysTF_foh, sysTF_zoh, sysTF_bl)

Note that the inter-sample behavior considerations only apply to models with inputs.

### Time-Series and Signal Modeling

Time series, or autonomous system, models are models with no measured input signals. `size(model,2)`

returns `0`

. Such models can be identified, often using the same commands as those used for identifying the input-output models (such as `ssest, arx, nlarx, greyest`

etc). The toolbox also provides dedicated routines such as `ar, ivar`

for time-series modeling.

When you are fitting a time-series model, you only provide the output signal to the estimation command. So if the data is in the form of a timetable containing several variables, you may need to instruct the estimation command that none of those variables should be treated as measured inputs. Here is how different commands handle data specified in various formats (iddata object, double matrix, or timetable). For all model types, except idNeuralStateSpace, the data must be uniformly sampled.

#### AR, IVAR Commands

These commands identify univariate time series models - they need only a single data vector.

#### Double matrix: data must be a column vector

% Example: fit a 5th order AR model to a moving average process data = movmean(randn(100,1),5); model = ar(data(1:10), 5); % forecast response 20 steps ahead forecast(model, data(1:10), 20)

#### Iddata object: data must contain a single output channel and no inputs

t = (0:100)'/100; data = iddata(exp(-0.2*t).*sin(2*pi*3*t), [], 0.01); plot(data)

```
% fit the model using the first 50 samples of data
model = ar(data(1:50), 3)
```

model = Discrete-time AR model: A(z)y(t) = e(t) A(z) = 1 - 2.945 z^-1 + 2.926 z^-2 - 0.9801 z^-3 Sample time: 0.01 seconds Parameterization: Polynomial orders: na=3 Number of free coefficients: 3 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using AR ('fb/now') on time domain data. Fit to estimation data: 99.99% FPE: 1.005e-08, MSE: 8.915e-09 Model Properties

```
% forecast response 100 steps ahead, given the measurements of 50 samples.
forecast(model, data(1:50), 100)
```

#### Timetable: the first variable that contains numeric, scalar value (i.e. scalar value in one row) is used. All others are ignored.

t = (0:100)'/100; x1 = randn(length(t),2); % some other vector valued data x2 = rand(101,1); x2 = discretize(x2,[0 .25 .75 1],'categorical',{'small', 'medium', 'large'}); x3 = exp(-0.2*t).*sin(2*pi*3*t); % signal of interest x4 = exp(-0.4*t).*sin(2*pi*6*t+1); % some other valid signal T = timetable(minutes(t),x1,x2,x3,x4); head(T)

Time x1 x2 x3 x4 ________ ______________________ ______ _______ ________ 0 min 0.75911 -1.029 medium 0 0.84147 0.01 min -0.081664 0.20651 medium 0.18701 0.97736 0.02 min -0.15847 1.3411 large 0.36665 0.97543 0.03 min 0.006498 1.3327 medium 0.53262 0.83706 0.04 min -1.0646 -1.2849 medium 0.67909 0.58267 0.05 min -1.6439 1.6184 medium 0.80097 0.2488 0.06 min 1.5815 0.66164 large 0.89403 -0.11722 0.07 min 0.29261 0.22735 medium 0.95512 -0.46392

The timetable `T`

contains 3 variables - `x1, x2, x3 `

and` x4`

. `x1`

is not a valid signal for identification since its value at a given time instant is a 1-by-2 vector. SImilarly, x2 is not valid since its value is categorical - we need doubles.` x3`

and `x4`

are valid signals. What happens when you use T as data in the `ar`

or the `ivar`

command?

```
% fit the 3rd order auto-regressive model using the first 50 rows of T
model1 = ar(T(1:50,:), 3)
```

Warning: The timetable variable "x1" cannot be used as a signal because it contains multi-column data and thus will be ignored.

Warning: The timetable variable "x2" cannot be used as a signal because it contains categorical data and thus will be ignored.

model1 = Discrete-time AR model: A(z)y(t) = e(t) A(z) = 1 - 2.945 z^-1 + 2.926 z^-2 - 0.9801 z^-3 Sample time: 0.6 seconds Parameterization: Polynomial orders: na=3 Number of free coefficients: 3 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using AR ('fb/now') on time domain data. Fit to estimation data: 99.99% FPE: 1.005e-08, MSE: 8.915e-09 Model Properties

model1.OutputName

`ans = `*1×1 cell array*
{'x3'}

The model used the variable `x3`

as the data to fit. This is because it is the first available valid signal in the timetable. What if you wanted to fit the model to the signal `x4`

instead? To make that happen, you will need to specify 'OutputName' as name-value pair.

model2 = ar(T(1:50,:), 5, 'OutputName', 'x4')

model2 = Discrete-time AR model: A(z)y(t) = e(t) A(z) = 1 - 2.911 z^-1 + 2.453 z^-2 + 0.6907 z^-3 - 2.005 z^-4 + 0.808 z^-5 Sample time: 0.6 seconds Parameterization: Polynomial orders: na=5 Number of free coefficients: 5 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using AR ('fb/now') on time domain data. Fit to estimation data: 100% FPE: 4.61e-30, MSE: 3.772e-30 Model Properties

model2.OutputName

`ans = `*1×1 cell array*
{'x4'}

#### Validate results and forecast

```
% how well can the two models predict their respective signals in the timetable with 100-step prediction horizon?
compare(T, model1, model2, 100)
```

```
% evaluate model residuals
resid(T, model1, model2)
```

The models are able to predict outcomes 100 steps ahead but the model residues (with 1-step horizon) are not uncorrelated. This hints at using a higher model order or tweaking estimation algorithm choices. Finally, forecast the respective signals 100 steps ahead.

forecast(model1, model2, T, 100)

#### State Space Time Series Models

State space models are syntactically the simplest ones to estimate since, in the simplest case, they require just the data and a rough guess about the model order. The commands `ssest, n4sid, era,`

and `ssregest`

can be used to identify a time series model in the state space form. These commands can identify multivariate models.

#### Using double matrices: leave the first input argument empty

When using double data, the input and the output data must be specified using separate input arguments, as in `ssest(input, output, model_order)`

. For time series modeling, you must use `input = []. `

t = (0:100)'/100; x1 = exp(-0.2*t).*sin(2*pi*3*t); x2 = exp(-0.4*t).*sin(2*pi*6*t+1); output = [x1,x2]; % fit a 6th order disrete-time state space model model_dt = ssest([],output,'best','Ts',0.01) % automatically pick the best order in the 1:10 range

model_dt = Discrete-time identified state-space model: x(t+Ts) = A x(t) + K e(t) y(t) = C x(t) + e(t) A = x1 x2 x3 x4 x1 0.9497 0.2879 0.07526 0.03245 x2 -0.3003 0.9329 -0.02131 0.2194 x3 -0.08673 0.01416 0.9771 -0.162 x4 0.03299 -0.2092 0.1872 0.953 C = x1 x2 x3 x4 y1 2.95 0.409 2.241 3.735 y2 -3.553 -0.9476 -0.3008 3.071 K = y1 y2 x1 0.05216 -0.06347 x2 0.002149 -0.03596 x3 0.04893 -0.01924 x4 0.02691 0.0441 Sample time: 0.01 seconds Parameterization: FREE form (all coefficients in A, B, C free). Disturbance component: estimate Number of free coefficients: 32 Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using SSEST on time domain data. Fit to estimation data: [100;100]% (prediction focus) FPE: 2.437e-60, MSE: 2.616e-30 Model Properties

`forecast(model_dt,output,100) % 100-step ahead response forecast`

As noted before, the use of double data for identifying continuous-time models is not recommended. You must use iddata, or timetable format of the data.

#### Using iddata object: Use an object with nu=0 input channels and ny>=1 output channels.

The model is fit to all the available output channels in the data object.

dataobj = iddata(output, [], 0.01, 'Tstart', 0); % identify a continuous-time model model_ct = ssest(dataobj,'best'); % automatically pick the best order in the 1:10 range forecast(model_ct,output,100) % 100-step ahead response forecast

#### Using timetables: must use `'InputName','OutputName'`

name-value pair if the number of variables is greater than 1

If the timetable contains `N>1`

valid variables, the ssest command (similarly the n4sid or ssregest command) treats the first N-1 variables as inputs and the last one as the output resulting in an input-output model (models with 1 or more inputs) rather than a multivariate time series model. In order to treat all the valid variables as time series data, you must designate them as "outputs" using the 'OutputName'/Value name-value pair, where the Value is a cell array containing the names of all the valid variables in the timetable. Alternatively, use 'InputName'/[] to indicate that all the valid variables are outputs.

% supress the warning about the table containing invalid variables Warn = warning; warning('off','Ident:utility:timetableDataMultiCol'); warning('off','Ident:utility:timetableDataTypeNotAcceptable'); Restore = onCleanup(@()warning(Warn)); % create some tabular data t = (0:100)'/100; x1 = randn(length(t),2); % some other vector valued data x2 = cumsum(t); x3 = rand(101,1); x3 = discretize(x3,[0 .25 .75 1],'categorical',{'small', 'medium', 'large'}); x4 = exp(-0.2*t).*sin(2*pi*3*t); % signal of interest x5 = exp(-0.4*t).*sin(2*pi*6*t+1); % some other valid signal T = timetable(minutes(t),x1,x2,x3,x4,x5); head(T)

Time x1 x2 x3 x4 x5 ________ ______________________ ____ ______ _______ ________ 0 min -0.83494 -0.75297 0 medium 0 0.84147 0.01 min 0.17968 -1.0331 0.01 medium 0.18701 0.97736 0.02 min 1.1663 1.1638 0.03 medium 0.36665 0.97543 0.03 min 0.055986 -0.58011 0.06 large 0.53262 0.83706 0.04 min -2.1 0.4173 0.1 medium 0.67909 0.58267 0.05 min 1.2353 -1.6481 0.15 small 0.80097 0.2488 0.06 min 0.0026715 -0.57268 0.21 medium 0.89403 -0.11722 0.07 min -0.45954 0.65923 0.28 medium 0.95512 -0.46392

% x2, x4 and x5 are valid signals % Identify a multivariate time series models for {'x2','x4','x5'} signals sys1 = n4sid(T,7,'OutputName',{'x2';'x4';'x5'})

sys1 = Discrete-time identified state-space model: x(t+Ts) = A x(t) + K e(t) y(t) = C x(t) + e(t) A = x1 x2 x3 x4 x5 x6 x7 x1 120.1 -145.7 4.411 0.1017 -23.21 63.04 -166 x2 -59.81 74.15 -2.315 -0.03415 11.63 -31.68 83.35 x3 1.253 -1.449 1.012 0.03292 -0.1766 0.8812 -1.753 x4 -4.391 5.384 -0.2207 0.9264 1.148 -2.222 6.122 x5 -10.4 12.76 -0.4461 -0.3291 2.968 -5.469 14.49 x6 32.03 -39.11 0.9799 -0.08308 -6.33 17.91 -44.6 x7 151.6 -185.4 5.633 0.1183 -29.55 80.25 -210.3 C = x1 x2 x3 x4 x5 x6 x7 x2 -124.3 11.61 0.63 -0.45 -0.5675 2.689 124.9 x4 -0.1134 1.798 1.282 0.09119 -2.941 4.024 -0.2603 x5 -0.1976 0.9301 -0.5758 -1.33 3.912 2.63 0.2018 K = x2 x4 x5 x1 0.0668 -6.652 -1.338 x2 -0.03393 3.275 0.6335 x3 0.0006712 -0.05902 -0.006004 x4 -0.00248 0.2269 0.04379 x5 -0.005874 0.5324 0.1649 x6 0.01811 -1.677 -0.3007 x7 0.08523 -8.41 -1.708 Sample time: 0.6 seconds Parameterization: FREE form (all coefficients in A, B, C free). Disturbance component: estimate Number of free coefficients: 91 Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using N4SID on time domain data "T". Fit to estimation data: [100;100;100]% (prediction focus) FPE: 1.043e-40, MSE: 1.528e-10 Model Properties

size(sys1)

Identified state-space model with 3 outputs, 0 inputs, 7 states and 91 free parameters.

The outputs of model sys1 are `'x2', 'x4', 'x5'. There are no inputs since the timetable T does not contain any other valid variables. Alternatively:`

`sys2 = n4sid(T,7,'InputName',[])`

sys2 = Discrete-time identified state-space model: x(t+Ts) = A x(t) + K e(t) y(t) = C x(t) + e(t) A = x1 x2 x3 x4 x5 x6 x7 x1 120.1 -145.7 4.411 0.1017 -23.21 63.04 -166 x2 -59.81 74.15 -2.315 -0.03415 11.63 -31.68 83.35 x3 1.253 -1.449 1.012 0.03292 -0.1766 0.8812 -1.753 x4 -4.391 5.384 -0.2207 0.9264 1.148 -2.222 6.122 x5 -10.4 12.76 -0.4461 -0.3291 2.968 -5.469 14.49 x6 32.03 -39.11 0.9799 -0.08308 -6.33 17.91 -44.6 x7 151.6 -185.4 5.633 0.1183 -29.55 80.25 -210.3 C = x1 x2 x3 x4 x5 x6 x7 x2 -124.3 11.61 0.63 -0.45 -0.5675 2.689 124.9 x4 -0.1134 1.798 1.282 0.09119 -2.941 4.024 -0.2603 x5 -0.1976 0.9301 -0.5758 -1.33 3.912 2.63 0.2018 K = x2 x4 x5 x1 0.0668 -6.652 -1.338 x2 -0.03393 3.275 0.6335 x3 0.0006712 -0.05902 -0.006004 x4 -0.00248 0.2269 0.04379 x5 -0.005874 0.5324 0.1649 x6 0.01811 -1.677 -0.3007 x7 0.08523 -8.41 -1.708 Sample time: 0.6 seconds Parameterization: FREE form (all coefficients in A, B, C free). Disturbance component: estimate Number of free coefficients: 91 Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using N4SID on time domain data "T". Fit to estimation data: [100;100;100]% (prediction focus) FPE: 1.043e-40, MSE: 1.528e-10 Model Properties

size(sys2)

Identified state-space model with 3 outputs, 0 inputs, 7 states and 91 free parameters.

What if you wanted to fit only the signals `'x4' `

and `'x5'`

, ignoring the rest? You need to specify both 'InputName' and 'OutputName' using name-value pairs.

% fit a bivariate model to x4, x5 signals where the model order is determined automatically sys = n4sid(T,'best','InputName',[],'OutputName',{'x4';'x5'})

sys = Discrete-time identified state-space model: x(t+Ts) = A x(t) + K e(t) y(t) = C x(t) + e(t) A = x1 x2 x3 x4 x1 0.9497 0.2879 0.07526 0.03245 x2 -0.3003 0.9329 -0.02131 0.2194 x3 -0.08673 0.01416 0.9771 -0.162 x4 0.03299 -0.2092 0.1872 0.953 C = x1 x2 x3 x4 x4 2.95 0.409 2.241 3.735 x5 -3.553 -0.9476 -0.3008 3.071 K = x4 x5 x1 0.05216 -0.06347 x2 0.002149 -0.03596 x3 0.04893 -0.01924 x4 0.02691 0.0441 Sample time: 0.6 seconds Parameterization: FREE form (all coefficients in A, B, C free). Disturbance component: estimate Number of free coefficients: 32 Use "idssdata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using N4SID on time domain data "T". Fit to estimation data: [100;100]% (prediction focus) FPE: 2.437e-60, MSE: 2.616e-30 Model Properties

`forecast(sys,T,100) % 100 step ahead forecast`

#### Polynomial Time Series Models

The polynomial time series models take the form `A(q)y(t) = C(q)/D(q)e(t)`

, where `y(t)`

is the measured signal (can be multivariate), `e(t)`

is unmeasured disturbance, and `A(q), C(q), D(q) `

are polynomials expressed in the lag operator` q^-1`

. They can be identified using commands such as `ar, ivar, arx, bj, armax, polyest`

.In the nonlinear case, the equation takes the form `y(t) = f(y(t-1), y(t-2), ..., y(t-n)) + e(t)`

, where `f(.)`

is a nonlinear fucntion. Nonlinear time series models can be identified using the `nlarx`

command. The data use in the `ar`

and `ivar`

commands was described above. For arx, armax, bj, and polyest commands, the number of inputs and the outputs used by the model are uniquely determined by the order matrix `NN`

. The number of rows of `NN`

equals the number of outputs (`ny = size(NN,1)`

). The number of inputs nu can be determined by the number of columns in the order variable `nb (nu = size(nb,2)).`

For example,` NN = [na nb nc nk] `

for the `armax `

command`. ny = size(NN,1), nu = size(nb,2).`

In the timeseries case, `nu = 0`

.

ARX:

`NN = na`

(ny-by-ny matrix)ARMAX:

`NN = [na nc]`

(ny-by-(ny+1) matrix)BJ:

`NN = [nc nd]`

(ny-by-2) matrixPOLYEST:

`NN = [na nc nd]`

(ny-by-(ny+2)) matrix, either na or nd must be zero

#### Using double matrices

Specify the input and output matrices as separate input arguments to the estimation command. Input data matrix must be empty.

t = (0:100)'/100; x1 = exp(-0.2*t).*sin(2*pi*3*t); x2 = exp(-0.4*t).*sin(2*pi*6*t+1); output = [x1,x2]; % fit a 6th order disrete-time state space model na = [3 1; 0 3]; % (the off-diagonal values could have been chosen to be zero since x1 and x2 are independent) model_arx = arx([],output,na); present(model_arx)

model_arx = Discrete-time AR model: Model for output "y1": A(z)y_1(t) = - A_i(z)y_i(t) + e_1(t) A(z) = 1 + 0 (+/- 0.1951) z^-1 - 2.848 (+/- 0.3825) z^-2 + 1.953 ( +/- 0.1943) z^-3 A_2(z) = 2.166e-16 (+/- 2.843e-16) z^-1 Model for output "y2": A(z)y_2(t) = e_2(t) A(z) = 1 - 1.317 (+/- 0.08134) z^-1 + 0 (+/- 0.1507) z^-2 + 0.5313 ( +/- 0.08069) z^-3 Sample time: 1 seconds Parameterization: Polynomial orders: na=[3 1;0 3] Number of free coefficients: 7 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using ARX on time domain data. Fit to estimation data: [100;100]% (prediction focus) FPE: 4.269e-60, MSE: 3.851e-30 More information in model's "Report" property. Model Properties

#### Using Timetables

Unless name-value pairs are used, the first `ny`

valid variables in the timetable are used as outputs.

t = (0:100)'/100; x1 = randn(length(t),2); % some other vector valued data x2 = cumsum(t); x3 = rand(101,1); x3 = discretize(x3,[0 .25 .75 1],'categorical',{'small', 'medium', 'large'}); x4 = exp(-0.2*t).*sin(2*pi*3*t); % signal of interest x5 = exp(-0.4*t).*sin(2*pi*6*t+1); % some other valid signal T = timetable(minutes(t),x1,x2,x3,x4,x5); head(T)

Time x1 x2 x3 x4 x5 ________ ______________________ ____ ______ _______ ________ 0 min -1.238 -0.39275 0 large 0 0.84147 0.01 min 0.24413 -0.62204 0.01 medium 0.18701 0.97736 0.02 min 1.3983 -1.1905 0.03 medium 0.36665 0.97543 0.03 min -0.095473 -1.8785 0.06 small 0.53262 0.83706 0.04 min 0.38762 -0.42401 0.1 small 0.67909 0.58267 0.05 min -0.96631 0.77724 0.15 small 0.80097 0.2488 0.06 min 1.5092 -0.71393 0.21 medium 0.89403 -0.11722 0.07 min 0.40383 1.5846 0.28 small 0.95512 -0.46392

% fit an armax model to x2 which is the first valid signal in T NN = [1 1]; % na = nc = 1 model_armax1 = armax(T,NN,'OutputName','x2'); model_armax1.OutputName

`ans = `*1×1 cell array*
{'x2'}

% fit an armax model to x4 which is the second valid signal in T NN = [3 2]; % na = 3, nc = 2 model_armax2 = armax(T,NN,'OutputName','x4'); model_armax2.OutputName

`ans = `*1×1 cell array*
{'x4'}

% fit a multi-variate BJ model to x5 , x2 in that order nc = [3; 2]; nd = [3; 3]; NN = [nc, nd]; model_bj = bj(T,NN,'OutputName',{'x5';'x2'}); model_bj.OutputName

`ans = `*2×1 cell*
{'x5'}
{'x2'}

In the case of Nonlinear ARX models, if the regressors are specified using model order matrix, or by using a linear model, then the I/O size can be determined uniquely as in the case of ARX models. Otherwise, name-value pairs `'InputName'`

and `'OutputName'`

must be used. In particular, use `'InputName'/[]`

to indicate the absence of inputs.

% Identify a Nonlinear ARX time series model % Prepare some data t = (0:100)'/100; x1 = randn(length(t),2); % some other vector valued data x2 = cumsum(t).^2; x3 = rand(101,1); x3 = discretize(x3,[0 .25 .75 1],'categorical',{'small', 'medium', 'large'}); x4 = exp(-0.2*t).*sin(2*pi*3*t).*(t+1)./(t+5); % some signal x5 = exp(-0.4*t).*sin(2*pi*6*t+1); % some other signal T = timetable(minutes(t),x1,x2,x3,x4,x5); clf stackedplot(T)

% x2, x4 and x5 are valid signals % Identify a multivariate time series models for {'x2','x4','x5'} signals L = linearRegressor({'x2','x4','x5'},1:10); % linear terms in x2, x4, x5 P1 = polynomialRegressor({'x2','x4'},1:2,2); % second order polynomial terms in x2, x4 P2 = polynomialRegressor({'x5'},1:2,3); % third order polynomial terms in x5 H = periodicRegressor({'x2','x4','x5'},5:2:15,.20,10); % Stack all the regressors into one variable Regressors = [L;P1;P2;H]; % Regressors is a 4-by-1 object % view all the regressors contained in these specifications getreg(Regressors)'

`ans = `*1×396 string*
Columns 1 through 6
"x2(t-1)" "x2(t-2)" "x2(t-3)" "x2(t-4)" "x2(t-5)" "x2(t-6)"
Columns 7 through 12
"x2(t-7)" "x2(t-8)" "x2(t-9)" "x2(t-10)" "x4(t-1)" "x4(t-2)"
Columns 13 through 18
"x4(t-3)" "x4(t-4)" "x4(t-5)" "x4(t-6)" "x4(t-7)" "x4(t-8)"
Columns 19 through 24
"x4(t-9)" "x4(t-10)" "x5(t-1)" "x5(t-2)" "x5(t-3)" "x5(t-4)"
Columns 25 through 30
"x5(t-5)" "x5(t-6)" "x5(t-7)" "x5(t-8)" "x5(t-9)" "x5(t-10)"
Columns 31 through 35
"x2(t-1)^2" "x2(t-2)^2" "x4(t-1)^2" "x4(t-2)^2" "x5(t-1)^3"
Columns 36 through 38
"x5(t-2)^3" "sin(0.2*x2(t-5))" "sin(2*0.2*x2(t-5))"
Columns 39 through 41
"sin(3*0.2*x2(t-5))" "sin(4*0.2*x2(t-5))" "sin(5*0.2*x2(t-5))"
Columns 42 through 44
"sin(6*0.2*x2(t-5))" "sin(7*0.2*x2(t-5))" "sin(8*0.2*x2(t-5))"
Columns 45 through 47
"sin(9*0.2*x2(t-5))" "sin(10*0.2*x2(t-5))" "cos(0.2*x2(t-5))"
Columns 48 through 50
"cos(2*0.2*x2(t-5))" "cos(3*0.2*x2(t-5))" "cos(4*0.2*x2(t-5))"
Columns 51 through 53
"cos(5*0.2*x2(t-5))" "cos(6*0.2*x2(t-5))" "cos(7*0.2*x2(t-5))"
Columns 54 through 56
"cos(8*0.2*x2(t-5))" "cos(9*0.2*x2(t-5))" "cos(10*0.2*x2(t-5))"
Columns 57 through 59
"sin(0.2*x2(t-7))" "sin(2*0.2*x2(t-7))" "sin(3*0.2*x2(t-7))"
Columns 60 through 62
"sin(4*0.2*x2(t-7))" "sin(5*0.2*x2(t-7))" "sin(6*0.2*x2(t-7))"
Columns 63 through 65
"sin(7*0.2*x2(t-7))" "sin(8*0.2*x2(t-7))" "sin(9*0.2*x2(t-7))"
Columns 66 through 68
"sin(10*0.2*x2(t-7))" "cos(0.2*x2(t-7))" "cos(2*0.2*x2(t-7))"
Columns 69 through 71
"cos(3*0.2*x2(t-7))" "cos(4*0.2*x2(t-7))" "cos(5*0.2*x2(t-7))"
Columns 72 through 74
"cos(6*0.2*x2(t-7))" "cos(7*0.2*x2(t-7))" "cos(8*0.2*x2(t-7))"
Columns 75 through 77
"cos(9*0.2*x2(t-7))" "cos(10*0.2*x2(t-7))" "sin(0.2*x2(t-9))"
Columns 78 through 80
"sin(2*0.2*x2(t-9))" "sin(3*0.2*x2(t-9))" "sin(4*0.2*x2(t-9))"
Columns 81 through 83
"sin(5*0.2*x2(t-9))" "sin(6*0.2*x2(t-9))" "sin(7*0.2*x2(t-9))"
Columns 84 through 86
"sin(8*0.2*x2(t-9))" "sin(9*0.2*x2(t-9))" "sin(10*0.2*x2(t-9))"
Columns 87 through 89
"cos(0.2*x2(t-9))" "cos(2*0.2*x2(t-9))" "cos(3*0.2*x2(t-9))"
Columns 90 through 92
"cos(4*0.2*x2(t-9))" "cos(5*0.2*x2(t-9))" "cos(6*0.2*x2(t-9))"
Columns 93 through 95
"cos(7*0.2*x2(t-9))" "cos(8*0.2*x2(t-9))" "cos(9*0.2*x2(t-9))"
Columns 96 through 98
"cos(10*0.2*x2(t-9))" "sin(0.2*x2(t-11))" "sin(2*0.2*x2(t-11))"
Columns 99 through 101
"sin(3*0.2*x2(t-11))" "sin(4*0.2*x2(t-11))" "sin(5*0.2*x2(t-11))"
Columns 102 through 104
"sin(6*0.2*x2(t-11))" "sin(7*0.2*x2(t-11))" "sin(8*0.2*x2(t-11))"
Columns 105 through 107
"sin(9*0.2*x2(t-11))" "sin(10*0.2*x2(t-11…" "cos(0.2*x2(t-11))"
Columns 108 through 110
"cos(2*0.2*x2(t-11))" "cos(3*0.2*x2(t-11))" "cos(4*0.2*x2(t-11))"
Columns 111 through 113
"cos(5*0.2*x2(t-11))" "cos(6*0.2*x2(t-11))" "cos(7*0.2*x2(t-11))"
Columns 114 through 116
"cos(8*0.2*x2(t-11))" "cos(9*0.2*x2(t-11))" "cos(10*0.2*x2(t-11…"
Columns 117 through 119
"sin(0.2*x2(t-13))" "sin(2*0.2*x2(t-13))" "sin(3*0.2*x2(t-13))"
Columns 120 through 122
"sin(4*0.2*x2(t-13))" "sin(5*0.2*x2(t-13))" "sin(6*0.2*x2(t-13))"
Columns 123 through 125
"sin(7*0.2*x2(t-13))" "sin(8*0.2*x2(t-13))" "sin(9*0.2*x2(t-13))"
Columns 126 through 128
"sin(10*0.2*x2(t-13…" "cos(0.2*x2(t-13))" "cos(2*0.2*x2(t-13))"
Columns 129 through 131
"cos(3*0.2*x2(t-13))" "cos(4*0.2*x2(t-13))" "cos(5*0.2*x2(t-13))"
Columns 132 through 134
"cos(6*0.2*x2(t-13))" "cos(7*0.2*x2(t-13))" "cos(8*0.2*x2(t-13))"
Columns 135 through 137
"cos(9*0.2*x2(t-13))" "cos(10*0.2*x2(t-13…" "sin(0.2*x2(t-15))"
Columns 138 through 140
"sin(2*0.2*x2(t-15))" "sin(3*0.2*x2(t-15))" "sin(4*0.2*x2(t-15))"
Columns 141 through 143
"sin(5*0.2*x2(t-15))" "sin(6*0.2*x2(t-15))" "sin(7*0.2*x2(t-15))"
Columns 144 through 146
"sin(8*0.2*x2(t-15))" "sin(9*0.2*x2(t-15))" "sin(10*0.2*x2(t-15…"
Columns 147 through 149
"cos(0.2*x2(t-15))" "cos(2*0.2*x2(t-15))" "cos(3*0.2*x2(t-15))"
Columns 150 through 152
"cos(4*0.2*x2(t-15))" "cos(5*0.2*x2(t-15))" "cos(6*0.2*x2(t-15))"
Columns 153 through 155
"cos(7*0.2*x2(t-15))" "cos(8*0.2*x2(t-15))" "cos(9*0.2*x2(t-15))"
Columns 156 through 158
"cos(10*0.2*x2(t-15…" "sin(0.2*x4(t-5))" "sin(2*0.2*x4(t-5))"
Columns 159 through 161
"sin(3*0.2*x4(t-5))" "sin(4*0.2*x4(t-5))" "sin(5*0.2*x4(t-5))"
Columns 162 through 164
"sin(6*0.2*x4(t-5))" "sin(7*0.2*x4(t-5))" "sin(8*0.2*x4(t-5))"
Columns 165 through 167
"sin(9*0.2*x4(t-5))" "sin(10*0.2*x4(t-5))" "cos(0.2*x4(t-5))"
Columns 168 through 170
"cos(2*0.2*x4(t-5))" "cos(3*0.2*x4(t-5))" "cos(4*0.2*x4(t-5))"
Columns 171 through 173
"cos(5*0.2*x4(t-5))" "cos(6*0.2*x4(t-5))" "cos(7*0.2*x4(t-5))"
Columns 174 through 176
"cos(8*0.2*x4(t-5))" "cos(9*0.2*x4(t-5))" "cos(10*0.2*x4(t-5))"
Columns 177 through 179
"sin(0.2*x4(t-7))" "sin(2*0.2*x4(t-7))" "sin(3*0.2*x4(t-7))"
Columns 180 through 182
"sin(4*0.2*x4(t-7))" "sin(5*0.2*x4(t-7))" "sin(6*0.2*x4(t-7))"
Columns 183 through 185
"sin(7*0.2*x4(t-7))" "sin(8*0.2*x4(t-7))" "sin(9*0.2*x4(t-7))"
Columns 186 through 188
"sin(10*0.2*x4(t-7))" "cos(0.2*x4(t-7))" "cos(2*0.2*x4(t-7))"
Columns 189 through 191
"cos(3*0.2*x4(t-7))" "cos(4*0.2*x4(t-7))" "cos(5*0.2*x4(t-7))"
Columns 192 through 194
"cos(6*0.2*x4(t-7))" "cos(7*0.2*x4(t-7))" "cos(8*0.2*x4(t-7))"
Columns 195 through 197
"cos(9*0.2*x4(t-7))" "cos(10*0.2*x4(t-7))" "sin(0.2*x4(t-9))"
Columns 198 through 200
"sin(2*0.2*x4(t-9))" "sin(3*0.2*x4(t-9))" "sin(4*0.2*x4(t-9))"
Columns 201 through 203
"sin(5*0.2*x4(t-9))" "sin(6*0.2*x4(t-9))" "sin(7*0.2*x4(t-9))"
Columns 204 through 206
"sin(8*0.2*x4(t-9))" "sin(9*0.2*x4(t-9))" "sin(10*0.2*x4(t-9))"
Columns 207 through 209
"cos(0.2*x4(t-9))" "cos(2*0.2*x4(t-9))" "cos(3*0.2*x4(t-9))"
Columns 210 through 212
"cos(4*0.2*x4(t-9))" "cos(5*0.2*x4(t-9))" "cos(6*0.2*x4(t-9))"
Columns 213 through 215
"cos(7*0.2*x4(t-9))" "cos(8*0.2*x4(t-9))" "cos(9*0.2*x4(t-9))"
Columns 216 through 218
"cos(10*0.2*x4(t-9))" "sin(0.2*x4(t-11))" "sin(2*0.2*x4(t-11))"
Columns 219 through 221
"sin(3*0.2*x4(t-11))" "sin(4*0.2*x4(t-11))" "sin(5*0.2*x4(t-11))"
Columns 222 through 224
"sin(6*0.2*x4(t-11))" "sin(7*0.2*x4(t-11))" "sin(8*0.2*x4(t-11))"
Columns 225 through 227
"sin(9*0.2*x4(t-11))" "sin(10*0.2*x4(t-11…" "cos(0.2*x4(t-11))"
Columns 228 through 230
"cos(2*0.2*x4(t-11))" "cos(3*0.2*x4(t-11))" "cos(4*0.2*x4(t-11))"
Columns 231 through 233
"cos(5*0.2*x4(t-11))" "cos(6*0.2*x4(t-11))" "cos(7*0.2*x4(t-11))"
Columns 234 through 236
"cos(8*0.2*x4(t-11))" "cos(9*0.2*x4(t-11))" "cos(10*0.2*x4(t-11…"
Columns 237 through 239
"sin(0.2*x4(t-13))" "sin(2*0.2*x4(t-13))" "sin(3*0.2*x4(t-13))"
Columns 240 through 242
"sin(4*0.2*x4(t-13))" "sin(5*0.2*x4(t-13))" "sin(6*0.2*x4(t-13))"
Columns 243 through 245
"sin(7*0.2*x4(t-13))" "sin(8*0.2*x4(t-13))" "sin(9*0.2*x4(t-13))"
Columns 246 through 248
"sin(10*0.2*x4(t-13…" "cos(0.2*x4(t-13))" "cos(2*0.2*x4(t-13))"
Columns 249 through 251
"cos(3*0.2*x4(t-13))" "cos(4*0.2*x4(t-13))" "cos(5*0.2*x4(t-13))"
Columns 252 through 254
"cos(6*0.2*x4(t-13))" "cos(7*0.2*x4(t-13))" "cos(8*0.2*x4(t-13))"
Columns 255 through 257
"cos(9*0.2*x4(t-13))" "cos(10*0.2*x4(t-13…" "sin(0.2*x4(t-15))"
Columns 258 through 260
"sin(2*0.2*x4(t-15))" "sin(3*0.2*x4(t-15))" "sin(4*0.2*x4(t-15))"
Columns 261 through 263
"sin(5*0.2*x4(t-15))" "sin(6*0.2*x4(t-15))" "sin(7*0.2*x4(t-15))"
Columns 264 through 266
"sin(8*0.2*x4(t-15))" "sin(9*0.2*x4(t-15))" "sin(10*0.2*x4(t-15…"
Columns 267 through 269
"cos(0.2*x4(t-15))" "cos(2*0.2*x4(t-15))" "cos(3*0.2*x4(t-15))"
Columns 270 through 272
"cos(4*0.2*x4(t-15))" "cos(5*0.2*x4(t-15))" "cos(6*0.2*x4(t-15))"
Columns 273 through 275
"cos(7*0.2*x4(t-15))" "cos(8*0.2*x4(t-15))" "cos(9*0.2*x4(t-15))"
Columns 276 through 278
"cos(10*0.2*x4(t-15…" "sin(0.2*x5(t-5))" "sin(2*0.2*x5(t-5))"
Columns 279 through 281
"sin(3*0.2*x5(t-5))" "sin(4*0.2*x5(t-5))" "sin(5*0.2*x5(t-5))"
Columns 282 through 284
"sin(6*0.2*x5(t-5))" "sin(7*0.2*x5(t-5))" "sin(8*0.2*x5(t-5))"
Columns 285 through 287
"sin(9*0.2*x5(t-5))" "sin(10*0.2*x5(t-5))" "cos(0.2*x5(t-5))"
Columns 288 through 290
"cos(2*0.2*x5(t-5))" "cos(3*0.2*x5(t-5))" "cos(4*0.2*x5(t-5))"
Columns 291 through 293
"cos(5*0.2*x5(t-5))" "cos(6*0.2*x5(t-5))" "cos(7*0.2*x5(t-5))"
Columns 294 through 296
"cos(8*0.2*x5(t-5))" "cos(9*0.2*x5(t-5))" "cos(10*0.2*x5(t-5))"
Columns 297 through 299
"sin(0.2*x5(t-7))" "sin(2*0.2*x5(t-7))" "sin(3*0.2*x5(t-7))"
Columns 300 through 302
"sin(4*0.2*x5(t-7))" "sin(5*0.2*x5(t-7))" "sin(6*0.2*x5(t-7))"
Columns 303 through 305
"sin(7*0.2*x5(t-7))" "sin(8*0.2*x5(t-7))" "sin(9*0.2*x5(t-7))"
Columns 306 through 308
"sin(10*0.2*x5(t-7))" "cos(0.2*x5(t-7))" "cos(2*0.2*x5(t-7))"
Columns 309 through 311
"cos(3*0.2*x5(t-7))" "cos(4*0.2*x5(t-7))" "cos(5*0.2*x5(t-7))"
Columns 312 through 314
"cos(6*0.2*x5(t-7))" "cos(7*0.2*x5(t-7))" "cos(8*0.2*x5(t-7))"
Columns 315 through 317
"cos(9*0.2*x5(t-7))" "cos(10*0.2*x5(t-7))" "sin(0.2*x5(t-9))"
Columns 318 through 320
"sin(2*0.2*x5(t-9))" "sin(3*0.2*x5(t-9))" "sin(4*0.2*x5(t-9))"
Columns 321 through 323
"sin(5*0.2*x5(t-9))" "sin(6*0.2*x5(t-9))" "sin(7*0.2*x5(t-9))"
Columns 324 through 326
"sin(8*0.2*x5(t-9))" "sin(9*0.2*x5(t-9))" "sin(10*0.2*x5(t-9))"
Columns 327 through 329
"cos(0.2*x5(t-9))" "cos(2*0.2*x5(t-9))" "cos(3*0.2*x5(t-9))"
Columns 330 through 332
"cos(4*0.2*x5(t-9))" "cos(5*0.2*x5(t-9))" "cos(6*0.2*x5(t-9))"
Columns 333 through 335
"cos(7*0.2*x5(t-9))" "cos(8*0.2*x5(t-9))" "cos(9*0.2*x5(t-9))"
Columns 336 through 338
"cos(10*0.2*x5(t-9))" "sin(0.2*x5(t-11))" "sin(2*0.2*x5(t-11))"
Columns 339 through 341
"sin(3*0.2*x5(t-11))" "sin(4*0.2*x5(t-11))" "sin(5*0.2*x5(t-11))"
Columns 342 through 344
"sin(6*0.2*x5(t-11))" "sin(7*0.2*x5(t-11))" "sin(8*0.2*x5(t-11))"
Columns 345 through 347
"sin(9*0.2*x5(t-11))" "sin(10*0.2*x5(t-11…" "cos(0.2*x5(t-11))"
Columns 348 through 350
"cos(2*0.2*x5(t-11))" "cos(3*0.2*x5(t-11))" "cos(4*0.2*x5(t-11))"
Columns 351 through 353
"cos(5*0.2*x5(t-11))" "cos(6*0.2*x5(t-11))" "cos(7*0.2*x5(t-11))"
Columns 354 through 356
"cos(8*0.2*x5(t-11))" "cos(9*0.2*x5(t-11))" "cos(10*0.2*x5(t-11…"
Columns 357 through 359
"sin(0.2*x5(t-13))" "sin(2*0.2*x5(t-13))" "sin(3*0.2*x5(t-13))"
Columns 360 through 362
"sin(4*0.2*x5(t-13))" "sin(5*0.2*x5(t-13))" "sin(6*0.2*x5(t-13))"
Columns 363 through 365
"sin(7*0.2*x5(t-13))" "sin(8*0.2*x5(t-13))" "sin(9*0.2*x5(t-13))"
Columns 366 through 368
"sin(10*0.2*x5(t-13…" "cos(0.2*x5(t-13))" "cos(2*0.2*x5(t-13))"
Columns 369 through 371
"cos(3*0.2*x5(t-13))" "cos(4*0.2*x5(t-13))" "cos(5*0.2*x5(t-13))"
Columns 372 through 374
"cos(6*0.2*x5(t-13))" "cos(7*0.2*x5(t-13))" "cos(8*0.2*x5(t-13))"
Columns 375 through 377
"cos(9*0.2*x5(t-13))" "cos(10*0.2*x5(t-13…" "sin(0.2*x5(t-15))"
Columns 378 through 380
"sin(2*0.2*x5(t-15))" "sin(3*0.2*x5(t-15))" "sin(4*0.2*x5(t-15))"
Columns 381 through 383
"sin(5*0.2*x5(t-15))" "sin(6*0.2*x5(t-15))" "sin(7*0.2*x5(t-15))"
Columns 384 through 386
"sin(8*0.2*x5(t-15))" "sin(9*0.2*x5(t-15))" "sin(10*0.2*x5(t-15…"
Columns 387 through 389
"cos(0.2*x5(t-15))" "cos(2*0.2*x5(t-15))" "cos(3*0.2*x5(t-15))"
Columns 390 through 392
"cos(4*0.2*x5(t-15))" "cos(5*0.2*x5(t-15))" "cos(6*0.2*x5(t-15))"
Columns 393 through 395
"cos(7*0.2*x5(t-15))" "cos(8*0.2*x5(t-15))" "cos(9*0.2*x5(t-15))"
Column 396
"cos(10*0.2*x5(t-15…"

% Fit an SVM based Nonlinear ARX model NonlinearFcn = idSupportVectorMachine; % use SVM for all the 3 variables x2, x4, x5 warning('off','Ident:estimation:NparGTNsamp'); sysNL = nlarx(T, Regressors, NonlinearFcn, 'InputName', [])

sysNL = Nonlinear multivariate time series model with 3 outputs Outputs: x2, x4, x5 Regressors: 1. Linear regressors in variables x2, x4, x5 2. Order 2 regressors in variables x2, x4 3. Order 3 regressors in variables x5 4. Periodic regressors in variables x2, x4, x5 with W = 0.2, and 10 Fourier terms List of all regressors Output functions: Output 1: Support Vector Machine function using a Gaussian kernel Output 2: Support Vector Machine function using a Gaussian kernel Output 3: Support Vector Machine function using a Gaussian kernel Sample time: 0.6 seconds Status: Estimated using NLARX on time domain data "T". Fit to estimation data: [75.08;70.92;44.11]% (prediction focus) MSE: 2.986e+04 More information in model's "Report" property. Model Properties

```
% forecast the response 100 steps ahead, using T as past data
clf
forecast(sysNL, T(1:50,:), 10)
xlim([0.4 0.6])
```

#### Using the iddata object

When using the iddata object, the input/output size of the data must match those of the model. Thus for bivariate time series model, create an iddata object with 2 outputs and 0 inputs.

z = iddata([x5 x2],[],0.01,'Tstart',0,'OutputName',{'x5';'x2'}); nc = [3; 2]; nd = [3; 3]; NN = [nc, nd]; model_bj = bj(z,NN)

model_bj = Discrete-time Polynomial model: Model for output "x5": y_1(t) = [C(z)/D(z)]e_1(t) C(z) = 1 + 3.989e-15 z^-1 - 6.315e-15 z^-2 + 3.158e-15 z^-3 D(z) = 1 - 1.171 z^-1 - 0.27 z^-2 + 0.676 z^-3 Model for output "x2": y_2(t) = [C(z)/D(z)]e_2(t) C(z) = 1 + 1.882 z^-1 + 0.976 z^-2 D(z) = 1 - 3.105 z^-1 + 3.214 z^-2 - 1.11 z^-3 Sample time: 0.01 seconds Parameterization: Polynomial orders: nc=[3;2] nd=[3;3] Number of free coefficients: 11 Use "polydata", "getpvec", "getcov" for parameters and their uncertainties. Status: Estimated using BJ on time domain data "z". Fit to estimation data: [100;100]% (prediction focus) FPE: 5.691e-37, MSE: 2.319e-07 Model Properties

compare(z,model_bj)

```
% restore warning state
delete(Restore)
```

### Summary

System Identification Toolbox lets you work with data in either time- or frequency-domain for identifying dynamic systems. There is no restriction on the number of input/output channels; you can wok with datasets that represent one or more output channels, and zero or more input channels. Key takeaways:

There are many ways of specifying time-domain signals for identification - using double matrices, timetables, or iddata objects.

Frequency-domain input/output signals must be represented using iddata objects.

For multi-experiment data, use a cell array of matrices or timetables, or a multi-experiment iddata object. A multi-experiment iddata representation is typically obtained by combining a set of single-experiment iddata objects together using the

`merge`

command.Frequency response data (FRF) must be represented using idfrd objects. Set the data sample time to zero if the FRF is obtained by direct measurements such as by using a spectrum analyzer.

For continuous-time model identification, it is important to specify the correct sample time and input intersample behavior.

## See Also

`timetable`

| `array2timetable`

| `iddata`

| `idfrd`

| `compare`

| `sim`

| `fft`

| `merge (iddata)`