# simulate

Monte Carlo simulation of vector error-correction (VEC) model

## Description

`Y = simulate(Mdl,numobs,Name,Value)` uses additional options specified by one or more name-value arguments. For example, `'NumPaths',1000,'X',X` specifies simulating 1000 paths and `X` as exogenous predictor data for the regression component.

## Examples

### Simulate Response Series from VEC Model

Consider a VEC model for the following seven macroeconomic series, and then fit the model to the data.

- Gross domestic product (GDP)
- GDP implicit price deflator
- Paid compensation of employees
- Nonfarm business sector hours of all persons
- Effective federal funds rate
- Personal consumption expenditures
- Gross private domestic investment

Suppose that a cointegrating rank of 4 and one short-run term are appropriate, that is, consider a VEC(1) model.

Load the `Data_USEconVECModel` data set.

`load Data_USEconVECModel`

For more information on the data set and variables, enter `Description` at the command line.

Determine whether the data needs to be preprocessed by plotting the series on separate plots.

```
figure;
subplot(2,2,1)
plot(FRED.Time,FRED.GDP);
title('Gross Domestic Product');
ylabel('Index');
xlabel('Date');
subplot(2,2,2)
plot(FRED.Time,FRED.GDPDEF);
title('GDP Deflator');
ylabel('Index');
xlabel('Date');
subplot(2,2,3)
plot(FRED.Time,FRED.COE);
title('Paid Compensation of Employees');
ylabel('Billions of $');
xlabel('Date');
subplot(2,2,4)
plot(FRED.Time,FRED.HOANBS);
title('Nonfarm Business Sector Hours');
ylabel('Index');
xlabel('Date');

figure;
subplot(2,2,1)
plot(FRED.Time,FRED.FEDFUNDS);
title('Federal Funds Rate');
ylabel('Percent');
xlabel('Date');
subplot(2,2,2)
plot(FRED.Time,FRED.PCEC);
title('Consumption Expenditures');
ylabel('Billions of $');
xlabel('Date');
subplot(2,2,3)
plot(FRED.Time,FRED.GPDI);
title('Gross Private Domestic Investment');
ylabel('Billions of $');
xlabel('Date');
```

Stabilize all series, except the federal funds rate, by applying the log transform. Scale the resulting series by 100 so that all series are on the same scale.

```
FRED.GDP = 100*log(FRED.GDP);
FRED.GDPDEF = 100*log(FRED.GDPDEF);
FRED.COE = 100*log(FRED.COE);
FRED.HOANBS = 100*log(FRED.HOANBS);
FRED.PCEC = 100*log(FRED.PCEC);
FRED.GPDI = 100*log(FRED.GPDI);
```

Create a VEC(1) model using the shorthand syntax. Specify the variable names.

```
Mdl = vecm(7,4,1);
Mdl.SeriesNames = FRED.Properties.VariableNames
```

```
Mdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model with Linear Time Trend"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [7×1 vector of NaNs]
              Adjustment: [7×4 matrix of NaNs]
           Cointegration: [7×4 matrix of NaNs]
                  Impact: [7×7 matrix of NaNs]
   CointegrationConstant: [4×1 vector of NaNs]
      CointegrationTrend: [4×1 vector of NaNs]
                ShortRun: {7×7 matrix of NaNs} at lag [1]
                   Trend: [7×1 vector of NaNs]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix of NaNs]
```

`Mdl` is a `vecm` model object. All properties containing `NaN` values correspond to parameters to be estimated given data.

Estimate the model using the entire data set and the default options.

```
EstMdl = estimate(Mdl,FRED.Variables)
```

```
EstMdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [14.1329 8.77841 -7.20359 ... and 4 more]'
              Adjustment: [7×4 matrix]
           Cointegration: [7×4 matrix]
                  Impact: [7×7 matrix]
   CointegrationConstant: [-28.6082 109.555 -77.0912 ... and 1 more]'
      CointegrationTrend: [4×1 vector of zeros]
                ShortRun: {7×7 matrix} at lag [1]
                   Trend: [7×1 vector of zeros]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix]
```

`EstMdl` is an estimated `vecm` model object. It is fully specified because all parameters have known values. By default, `estimate` imposes the constraints of the H1 Johansen VEC model form by removing the cointegrating trend and linear trend terms from the model. Parameter exclusion from estimation is equivalent to imposing equality constraints to zero.

Simulate a response series path from the estimated model with length equal to the path in the data.

```
rng(1); % For reproducibility
numobs = size(FRED,1);
Y = simulate(EstMdl,numobs);
```

`Y` is a 240-by-7 matrix of simulated responses. Columns correspond to the variable names in `EstMdl.SeriesNames`.

### Simulate Responses Using filter

Illustrate the relationship between `simulate` and `filter` by estimating a 4-D VEC(1) model of the four response series in Johansen's Danish data set. Simulate a single path of responses using the fitted model and the historical data as initial values, and then filter a random set of Gaussian disturbances through the estimated model using the same presample responses.

Load Johansen's Danish economic data.

`load Data_JDanish`

For details on the variables, enter `Description`.

Create a default 4-D VEC(1) model. Assume that a cointegrating rank of 1 is appropriate.

```
Mdl = vecm(4,1,1);
Mdl.SeriesNames = DataTable.Properties.VariableNames
```

```
Mdl = 
  vecm with properties:

             Description: "4-Dimensional Rank = 1 VEC(1) Model with Linear Time Trend"
             SeriesNames: "M2"  "Y"  "IB"  ... and 1 more
               NumSeries: 4
                    Rank: 1
                       P: 2
                Constant: [4×1 vector of NaNs]
              Adjustment: [4×1 matrix of NaNs]
           Cointegration: [4×1 matrix of NaNs]
                  Impact: [4×4 matrix of NaNs]
   CointegrationConstant: NaN
      CointegrationTrend: NaN
                ShortRun: {4×4 matrix of NaNs} at lag [1]
                   Trend: [4×1 vector of NaNs]
                    Beta: [4×0 matrix]
              Covariance: [4×4 matrix of NaNs]
```

Estimate the VEC(1) model using the entire data set. Specify the H1* Johansen model form.

```
EstMdl = estimate(Mdl,Data,'Model','H1*');
```

When reproducing the results of `simulate` and `filter`, it is important to take these actions.

- Set the same random number seed using `rng`.
- Specify the same presample response data using the `'Y0'` name-value pair argument.

Set the default random seed. Simulate 100 observations by passing the estimated model to `simulate`. Specify the entire data set as the presample.

```
rng default;
YSim = simulate(EstMdl,100,'Y0',Data);
```

`YSim` is a 100-by-4 matrix of simulated responses. Columns correspond to the variable names in `EstMdl.SeriesNames`.

Set the default random seed. Simulate 4 series of 100 observations from the standard Gaussian distribution.

```
rng default;
Z = randn(100,4);
```

Filter the Gaussian values through the estimated model. Specify the entire data set as the presample.

`YFilter = filter(EstMdl,Z,'Y0',Data);`

`YFilter` is a 100-by-4 matrix of simulated responses. Columns correspond to the variable names in `EstMdl.SeriesNames`. Before filtering the disturbances, `filter` scales `Z` by the lower triangular Cholesky factor of the model covariance in `EstMdl.Covariance`.
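The scaling step can be sketched without any toolbox functions. The block below is an illustration only: it uses a stand-in covariance matrix `Sigma` (the one from the hypothetical three-series example on this page) in place of `EstMdl.Covariance`, and it assumes the row-wise convention `E = Z*L'`, which is one standard way to apply a lower triangular factor to disturbances stored as rows. How `filter` arranges the computation internally is not stated here.

```matlab
% Stand-in covariance; in the example above you would use EstMdl.Covariance
Sigma = [1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5];

rng default;
Z = randn(100,3);          % rows are time points, columns are series

L = chol(Sigma,'lower');   % lower triangular factor with L*L' equal to Sigma
E = Z*L';                  % each row of E is a draw with covariance Sigma

% The factor reproduces Sigma up to rounding error
reconstructionError = norm(L*L' - Sigma);
```

With only 100 rows, the sample covariance `cov(E)` approximates `Sigma` loosely; it approaches `Sigma` as the number of rows grows.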

Compare the resulting responses between `filter` and `simulate`.

```
(YSim - YFilter)'*(YSim - YFilter)
```

```
ans = 4×4

     0     0     0     0
     0     0     0     0
     0     0     0     0
     0     0     0     0
```

The results are identical.

### Simulate Multiple Response Paths

Consider this VEC(1) model for three hypothetical response series.

$$\begin{array}{rcl}\Delta {y}_{t}& =& c+A{B}^{\prime}{y}_{t-1}+{\Phi}_{1}\Delta {y}_{t-1}+{\epsilon}_{t}\\ & & \\ & =& \left[\begin{array}{c}-1\\ -3\\ -30\end{array}\right]+\left[\begin{array}{cc}-0.3& 0.3\\ -0.2& 0.1\\ -1& 0\end{array}\right]\left[\begin{array}{ccc}0.1& -0.2& 0.2\\ -0.7& 0.5& 0.2\end{array}\right]{y}_{t-1}+\left[\begin{array}{ccc}0& 0.1& 0.2\\ 0.2& -0.2& 0\\ 0.7& -0.2& 0.3\end{array}\right]\Delta {y}_{t-1}+{\epsilon}_{t}.\end{array}$$

The innovations are multivariate Gaussian with a mean of 0 and the covariance matrix

$$\Sigma =\left[\begin{array}{ccc}1.3& 0.4& 1.6\\ 0.4& 0.6& 0.7\\ 1.6& 0.7& 5\end{array}\right].$$

Create variables for the parameter values.

```
Adjustment = [-0.3 0.3; -0.2 0.1; -1 0];
Cointegration = [0.1 -0.7; -0.2 0.5; 0.2 0.2];
ShortRun = {[0. 0.1 0.2; 0.2 -0.2 0; 0.7 -0.2 0.3]};
Constant = [-1; -3; -30];
Trend = [0; 0; 0];
Covariance = [1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5];
```

Create a `vecm` model object representing the VEC(1) model using the appropriate name-value pair arguments.

```
Mdl = vecm('Adjustment',Adjustment,'Cointegration',Cointegration,...
    'Constant',Constant,'ShortRun',ShortRun,'Trend',Trend,...
    'Covariance',Covariance);
```

`Mdl` is effectively a fully specified `vecm` model object. That is, the cointegration constant and linear trend are unknown, but are not needed for simulating observations or forecasting given that the overall constant and trend parameters are known.

Simulate 1000 paths of 100 observations. Return the innovations (scaled disturbances).

```
numpaths = 1000;
numobs = 100;
rng(1); % For reproducibility
[Y,E] = simulate(Mdl,numobs,'NumPaths',numpaths);
```

`Y` is a 100-by-3-by-1000 array of simulated responses. `E` is an array whose dimensions correspond to the dimensions of `Y`, but represents the simulated, scaled disturbances. Columns correspond to the response variable names in `Mdl.SeriesNames`.

For each time point, compute the mean vector of the simulated responses among all paths.

```
MeanSim = mean(Y,3);
```

`MeanSim` is a 100-by-3 matrix containing the average of the simulated responses at each time point.

Plot the simulated responses and their averages.

```
figure;
for j = 1:Mdl.NumSeries
    subplot(2,2,j)
    plot(squeeze(Y(:,j,:)),'Color',[0.8,0.8,0.8])
    title(Mdl.SeriesNames{j});
    hold on
    plot(MeanSim(:,j));
    xlabel('Time index')
    hold off
end
```

## Input Arguments

### Name-Value Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

**Example:** `'Y0',Y0,'X',X` uses the matrix `Y0` as presample responses and the matrix `X` as predictor data in the regression component.

`Y0` - Presample responses

numeric matrix | numeric array

Presample responses providing initial values for the model, specified as the comma-separated pair consisting of `'Y0'` and a `numpreobs`-by-`numseries` numeric matrix or a `numpreobs`-by-`numseries`-by-`numprepaths` numeric array.

`numpreobs` is the number of presample observations. `numseries` is the number of response series (`Mdl.NumSeries`). `numprepaths` is the number of presample response paths.

Rows correspond to presample observations, and the last row contains the latest presample observation. `Y0` must have at least `Mdl.P` rows. If you supply more rows than necessary, `simulate` uses the latest `Mdl.P` observations only.

Columns must correspond to the response series names in `Mdl.SeriesNames`.

Pages correspond to separate, independent paths.

- If `Y0` is a matrix, then `simulate` applies it to simulate each sample path (page). Therefore, all paths in the output argument `Y` derive from common initial conditions.
- Otherwise, `simulate` applies `Y0(:,:,j)` to initialize simulating path `j`. `Y0` must have at least `numpaths` pages (see `NumPaths`), and `simulate` uses only the first `numpaths` pages.
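As a minimal sketch of the two cases, the block below rebuilds the hypothetical three-series VEC(1) model from the examples and simulates it once with a separate presample page per path and once with a single presample matrix shared by all paths. The presample values are random placeholders, not meaningful data, and the path counts are arbitrary.

```matlab
% Hypothetical VEC(1) model; parameter values are taken from the
% "Simulate Multiple Response Paths" example
Mdl = vecm('Adjustment',[-0.3 0.3; -0.2 0.1; -1 0], ...
    'Cointegration',[0.1 -0.7; -0.2 0.5; 0.2 0.2], ...
    'ShortRun',{[0 0.1 0.2; 0.2 -0.2 0; 0.7 -0.2 0.3]}, ...
    'Constant',[-1; -3; -30],'Trend',[0; 0; 0], ...
    'Covariance',[1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5]);

numpaths = 5;
numobs = 20;
rng(1); % For reproducibility

% Placeholder presample: Mdl.P rows, one page per simulated path
Y0 = randn(Mdl.P,Mdl.NumSeries,numpaths);

% Each path j starts from its own page Y0(:,:,j)
YPerPath = simulate(Mdl,numobs,'NumPaths',numpaths,'Y0',Y0);

% A matrix presample is applied to every path
YCommon = simulate(Mdl,numobs,'NumPaths',numpaths,'Y0',Y0(:,:,1));
```

Both outputs are `numobs`-by-`numseries`-by-`numpaths` arrays; they differ only in the initial conditions each path starts from.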

By default, `simulate` sets any necessary presample observations.

- For stationary VAR processes without regression components, `simulate` sets presample observations to the unconditional mean $$\mu ={\Phi}^{-1}(L)c.$$
- For nonstationary processes or models that contain a regression component, `simulate` sets presample observations to zero.

**Data Types:** `double`

`X` - Predictor data

numeric matrix

Predictor data for the regression component in the model, specified as the comma-separated pair consisting of `'X'` and a numeric matrix containing `numpreds` columns.

`numpreds` is the number of predictor variables (`size(Mdl.Beta,2)`).

Rows correspond to observations, and the last row contains the latest observation. `X` must have at least `numobs` rows. If you supply more rows than necessary, `simulate` uses only the latest `numobs` observations. `simulate` does not use the regression component in the presample period.

Columns correspond to individual predictor variables. All predictor variables are present in the regression component of each response equation.

`simulate` applies `X` to each path (page); that is, `X` represents one path of observed predictors.

By default, `simulate` excludes the regression component, regardless of its presence in `Mdl`.
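A short sketch of supplying predictor data follows. It assumes a hypothetical three-series VEC(1) model with a single predictor; the `Beta` values and the predictor series are made up for illustration, and the remaining parameters are borrowed from the examples on this page.

```matlab
% Hypothetical model with one predictor (Beta is 3-by-1, so numpreds = 1)
Mdl = vecm('Adjustment',[-0.3 0.3; -0.2 0.1; -1 0], ...
    'Cointegration',[0.1 -0.7; -0.2 0.5; 0.2 0.2], ...
    'ShortRun',{[0 0.1 0.2; 0.2 -0.2 0; 0.7 -0.2 0.3]}, ...
    'Constant',[-1; -3; -30],'Trend',[0; 0; 0], ...
    'Covariance',[1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5], ...
    'Beta',[0.5; -0.2; 1]);

numobs = 50;
numpaths = 10;
rng(1); % For reproducibility

% Placeholder predictor path with numobs rows and numpreds columns
X = randn(numobs,size(Mdl.Beta,2));

% The same predictor path X is applied to every simulated path
YWithX = simulate(Mdl,numobs,'NumPaths',numpaths,'X',X);

% Omitting 'X' excludes the regression component even though Beta is set
YNoX = simulate(Mdl,numobs,'NumPaths',numpaths);
```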

**Data Types:** `double`

`YF` - Future multivariate response series

numeric matrix | numeric array

Future multivariate response series for conditional simulation, specified as the comma-separated pair consisting of `'YF'` and a numeric matrix or array containing `numseries` columns.

Rows correspond to observations in the simulation horizon, and the first row is the earliest observation. Specifically, row `j` in sample path `k` (`YF(j,:,k)`) contains the responses `j` periods into the future. `YF` must have at least `numobs` rows to cover the simulation horizon. If you supply more rows than necessary, `simulate` uses only the first `numobs` rows.

Columns must correspond to the response variable names in `Mdl.SeriesNames`.

Pages correspond to sample paths. Specifically, path `k` (`YF(:,:,k)`) captures the state, or knowledge, of the response series as they evolve from the presample past (`Y0`) into the future.

- If `YF` is a matrix, then `simulate` applies `YF` to each of the `numpaths` output paths (see `NumPaths`).
- Otherwise, `YF` must have at least `numpaths` pages. If you supply more pages than necessary, `simulate` uses only the first `numpaths` pages.

Elements of `YF` can be numeric scalars or missing values (indicated by `NaN` values). `simulate` treats numeric scalars as deterministic future responses that are known in advance, for example, set by policy. `simulate` simulates responses for corresponding `NaN` values conditional on the known values.

By default, `YF` is an array composed of `NaN` values indicating a complete lack of knowledge of the future state of all simulated responses. Therefore, `simulate` obtains the output responses `Y` from a conventional, unconditional Monte Carlo simulation.

For more details, see Algorithms.

**Example:** Consider simulating one path of a VEC model composed of four response series three periods into the future. Suppose that you have prior knowledge about some of the future values of the responses, and you want to simulate the unknown responses conditional on your knowledge. Specify `YF` as a matrix containing the values that you know, and use `NaN` for values you do not know but want to simulate. For example,

```
'YF',[NaN 2 5 NaN; NaN NaN 0.1 NaN; NaN NaN NaN NaN]
```

specifies that you have no knowledge of the future values of the first and fourth response series; you know the value for period 1 in the second response series, but no other value; and you know the values for periods 1 and 2 in the third response series, but not the value for period 3.
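To see the `YF` matrix from this example in a complete call, the sketch below fits a four-series VEC(1) model to Johansen's Danish data, as in the `filter` example, and then simulates three periods conditional on the known values. The numbers in `YF` are the illustrative ones from the text, not plausible values on the scale of the Danish series, so treat the result as a demonstration of mechanics only.

```matlab
load Data_JDanish

% Four-series VEC(1) model with cointegrating rank 1
Mdl = vecm(4,1,1);
EstMdl = estimate(Mdl,Data);

% Known future values; NaN marks responses to simulate.
% Rows are periods 1 to 3, columns are the four response series.
YF = [NaN 2   5   NaN;
      NaN NaN 0.1 NaN;
      NaN NaN NaN NaN];

rng(1); % For reproducibility
numobs = size(YF,1);
Y = simulate(EstMdl,numobs,'Y0',Data,'YF',YF);

% Known entries of YF should reappear in the same positions of Y
knownMask = ~isnan(YF);
maxDiff = max(abs(Y(knownMask) - YF(knownMask)));
```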

**Data Types:** `double`

**Note**

`NaN` values in `Y0` and `X` indicate missing values. `simulate` removes missing values from the data by list-wise deletion. If `Y0` is a 3-D array, then `simulate` performs these steps.

- Horizontally concatenate pages to form a `numpreobs`-by-`numpaths*numseries` matrix.
- Remove any row that contains at least one `NaN` from the concatenated data.

In the case of missing observations, the results obtained from multiple paths of `Y0` can differ from the results obtained from each path individually.

For conditional simulation (see `YF`), if `X` contains any missing values in the latest `numobs` observations, then `simulate` throws an error.
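The list-wise deletion steps in the note can be imitated with base MATLAB functions. This is a sketch of the described behavior on a small made-up array, not the function's internal code; the variable names are hypothetical.

```matlab
rng(1); % For reproducibility

% Made-up presample: 5 observations, 2 series, 3 paths
Y0 = rand(5,2,3);
Y0(2,1,3) = NaN;   % one missing value, in path 3 only

[numpreobs,numseries,numpaths] = size(Y0);

% Horizontally concatenate pages into a numpreobs-by-(numpaths*numseries) matrix
Y0Wide = reshape(Y0,numpreobs,numseries*numpaths);

% Drop every row that contains at least one NaN
keepRows = ~any(isnan(Y0Wide),2);
Y0Clean = reshape(Y0Wide(keepRows,:),[],numseries,numpaths);
```

Because row 2 is removed from every page, paths 1 and 2 lose an observation even though their own data is complete, which is why results from multiple paths can differ from results obtained path by path.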

## Output Arguments

`Y` - Simulated multivariate response series

numeric matrix | numeric array

Simulated multivariate response series, returned as a `numobs`-by-`numseries` numeric matrix or a `numobs`-by-`numseries`-by-`numpaths` numeric array. `Y` represents the continuation of the presample responses in `Y0`.

If you specify future responses for conditional simulation using the `YF` name-value pair argument, then the known values in `YF` appear in the same positions in `Y`. However, `Y` contains simulated values for the missing observations in `YF`.

`E` - Simulated multivariate model innovations series

numeric matrix | numeric array

Simulated multivariate model innovations series, returned as a `numobs`-by-`numseries` numeric matrix or a `numobs`-by-`numseries`-by-`numpaths` numeric array.

If you specify future responses for conditional simulation (see the `YF` name-value pair argument), then `simulate` infers the innovations from the known values in `YF` and places the inferred innovations in the corresponding positions in `E`. For the missing observations in `YF`, `simulate` draws from the Gaussian distribution conditional on any known values, and places the draws in the corresponding positions in `E`.

## Algorithms

`simulate` performs conditional simulation using this process for all pages `k` = 1,...,`numpaths` and for each time `t` = 1,...,`numobs`.

- `simulate` infers (or inverse filters) the innovations `E(t,:,k)` from the known future responses `YF(t,:,k)`. For `E(t,:,k)`, `simulate` mimics the pattern of `NaN` values that appears in `YF(t,:,k)`.
- For the missing elements of `E(t,:,k)`, `simulate` performs these steps.
  - Draw `Z1`, the random, standard Gaussian distribution disturbances conditional on the known elements of `E(t,:,k)`.
  - Scale `Z1` by the lower triangular Cholesky factor of the conditional covariance matrix. That is, `Z2` = `L*Z1`, where `L` = `chol(C,'lower')` and `C` is the covariance of the conditional Gaussian distribution.
  - Impute `Z2` in place of the corresponding missing values in `E(t,:,k)`.
- For the missing values in `YF(t,:,k)`, `simulate` filters the corresponding random innovations through the model `Mdl`.
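The draw for the missing innovations can be sketched with the textbook formulas for a conditional Gaussian distribution. The block below is an assumption-laden illustration for a single time point: the covariance `Sigma` is borrowed from the three-series example, the known innovation value is made up, and adding the conditional mean `condMean` to the scaled draw is the standard formula rather than something this page spells out.

```matlab
% Innovation covariance (from the hypothetical three-series example)
Sigma = [1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5];

% Innovations at one time point; NaN marks elements to draw.
% The known value 0.5 is made up for illustration.
e = [NaN; 0.5; NaN];
known = ~isnan(e);

% Partition the covariance into missing (m) and known (k) blocks
Smm = Sigma(~known,~known);
Smk = Sigma(~known,known);
Skk = Sigma(known,known);

% Mean and covariance of the missing elements given the known ones
condMean = Smk*(Skk\e(known));
C = Smm - Smk*(Skk\Smk');

rng(1); % For reproducibility
Z1 = randn(nnz(~known),1);   % standard Gaussian disturbances
L = chol(C,'lower');         % lower triangular factor of C
Z2 = condMean + L*Z1;        % scaled draw, shifted by the conditional mean

% Impute the draws in place of the missing innovations
e(~known) = Z2;
```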

`simulate` uses this process to determine the time origin $t_0$ of models that include linear time trends.

- If you do not specify `Y0`, then $t_0$ = 0.
- Otherwise, `simulate` sets $t_0$ to `size(Y0,1)` – `Mdl.P`. Therefore, the times in the trend component are $t$ = $t_0$ + 1, $t_0$ + 2,..., $t_0$ + `numobs`. This convention is consistent with the default behavior of model estimation in which `estimate` removes the first `Mdl.P` responses, reducing the effective sample size. Although `simulate` explicitly uses the first `Mdl.P` presample responses in `Y0` to initialize the model, the total number of observations in `Y0` (excluding any missing values) determines $t_0$.
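As a small worked instance of the time-origin rule, the sketch below uses made-up sizes: a presample with 12 rows, a model with `P` equal to 2, and a horizon of 5 observations. The variable names are hypothetical stand-ins for `size(Y0,1)`, `Mdl.P`, and `numobs`.

```matlab
numpreobs = 12;  % stands in for size(Y0,1), with no missing rows
P = 2;           % stands in for Mdl.P
numobs = 5;      % simulation horizon

% Time origin when Y0 is supplied
t0 = numpreobs - P;

% Times at which the linear trend term is evaluated
trendTimes = t0 + (1:numobs);   % 11 12 13 14 15

% Without Y0, the origin is 0 and the trend times are 1,...,numobs
trendTimesNoY0 = 0 + (1:numobs);
```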

## References

[1] Hamilton, James D. *Time Series Analysis*. Princeton, NJ: Princeton University Press, 1994.

[2] Johansen, S. *Likelihood-Based Inference in Cointegrated Vector Autoregressive Models*. Oxford: Oxford University Press, 1995.

[3] Juselius, K. *The Cointegrated VAR Model*. Oxford: Oxford University Press, 2006.

[4] Lütkepohl, H. *New Introduction to Multiple Time Series Analysis*. Berlin: Springer, 2005.


**Introduced in R2017b**
