forecast
Forecast univariate ARIMA or ARIMAX model responses or conditional variances
Syntax
Description
[Y,YMSE] = forecast(Mdl,numperiods,Y0) returns the numperiods-by-1 numeric vector of consecutive forecasted responses Y and the corresponding numeric vector of forecast mean square errors (MSE) YMSE of the fully specified, univariate ARIMA model Mdl. The presample response data in the numeric vector Y0 initializes the model to generate forecasts.
Tbl2 = forecast(Mdl,numperiods,Tbl1) returns the table or timetable Tbl2 containing a variable for each of the paths of response, forecast MSE, and conditional variance series resulting from forecasting the ARIMA model Mdl over a numperiods forecast horizon. Tbl1 is a table or timetable containing a variable for required presample response data to initialize the model for forecasting. Tbl1 can optionally contain variables of presample data for innovations, conditional variances, and predictors. (since R2023b)
forecast selects the response variable named in Mdl.SeriesName or the sole variable in Tbl1. To select a different response variable in Tbl1 to initialize the model, use the PresampleResponseVariable name-value argument.
[___] = forecast(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. forecast returns the output argument combination for the corresponding input arguments. For example, forecast(Mdl,10,Y0,X0=Exo0,XF=Exo) specifies the presample and forecast sample exogenous predictor data as Exo0 and Exo, respectively, to forecast a model with a regression component (an ARIMAX model).
Examples
Forecast Conditional Mean Response Vector
Forecast the conditional mean response of simulated data over a 30-period horizon. Supply a vector of presample response data and return a vector of forecasts.
Simulate 130 observations from a multiplicative seasonal moving average (MA) model with known parameter values.
Mdl = arima(MA={0.5 -0.3},SMA=0.4,SMALags=12,Constant=0.04, ...
    Variance=0.2);
rng(200,"twister")
Y = simulate(Mdl,130);
Fit a seasonal MA model to the first 100 observations, and reserve the remaining 30 observations to evaluate forecast performance.
MdlTemplate = arima(MALags=1:2,SMALags=12);
EstMdl = estimate(MdlTemplate,Y(1:100));

    ARIMA(0,0,2) Model with Seasonal MA(12) (Gaussian Distribution):

                 Value      StandardError    TStatistic      PValue
                ________    _____________    __________    __________
    Constant     0.20403      0.069064         2.9542       0.0031344
    MA{1}        0.50212      0.097298         5.1606      2.4619e-07
    MA{2}       -0.20174       0.10447        -1.9312        0.053464
    SMA{12}      0.27028       0.10907          2.478        0.013211
    Variance     0.18681      0.032732         5.7073       1.148e-08
EstMdl
is a new arima
model that contains estimated parameters (that is, a fully specified model).
Forecast the fitted model into a 30-period horizon. Specify the estimation period data as a presample.
[YF,YMSE] = forecast(EstMdl,30,Y(1:100));
YF(15)
ans = 0.2040
YMSE(15)
ans = 0.2592
YF
is a 30-by-1 vector of forecasted responses, and YMSE
is a 30-by-1 vector of corresponding MSEs. The 15-period-ahead forecast is 0.2040 and its MSE is 0.2592.
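As a rough cross-check of these two numbers, note that the fitted multiplicative model has moving average terms only up to lag 14. Beyond that lag, the forecast of a pure MA model reverts to the constant, and the forecast MSE equals the unconditional variance of the process. The following sketch uses only base MATLAB and the rounded estimates from the estimation display; the variable names c, theta, Theta, sigma2, psi, yf15, and mse15 are illustrative and do not appear elsewhere on this page.

```matlab
% Rounded estimates copied from the estimation display above
c      = 0.20403;             % Constant
theta  = [0.50212 -0.20174];  % MA{1} and MA{2}
Theta  = 0.27028;             % SMA{12}
sigma2 = 0.18681;             % Variance

% Expand (1 + theta1*L + theta2*L^2)(1 + Theta*L^12) into coefficients at lags 0..14
psi = conv([1 theta],[1 zeros(1,11) Theta]);

% For horizons beyond lag 14, the forecast is the constant and the MSE is
% the innovation variance times the sum of squared MA coefficients
yf15  = c;
mse15 = sigma2*sum(psi.^2);
fprintf("%.4f  %.4f\n",yf15,mse15)
```

Under these assumptions, the printed values should agree with YF(15) and YMSE(15) up to the rounding of the displayed estimates.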
Visually compare the forecasts to the holdout data.
figure
h1 = plot(Y,Color=[.7,.7,.7]);
hold on
h2 = plot(101:130,YF,"b",LineWidth=2);
h3 = plot(101:130,YF + 1.96*sqrt(YMSE),"r:",LineWidth=2);
plot(101:130,YF - 1.96*sqrt(YMSE),"r:",LineWidth=2);
legend([h1 h2 h3],"Observed","Forecast","95% confidence interval", ...
    Location="NorthWest")
title("30-Period Forecasts and 95% Confidence Intervals")
hold off
Forecast NYSE Composite Index
Since R2023b
Forecast the weekly average NYSE closing prices over a 15-week horizon. Supply presample data in a timetable and return a timetable of forecasts.
Load Data
Load the US equity index data set Data_EquityIdx
.
load Data_EquityIdx
T = height(DataTimeTable)
T = 3028
The timetable DataTimeTable
includes the time series variable NYSE
, which contains daily NYSE composite closing prices from January 1990 through December 2001.
Plot the daily NYSE price series.
figure
plot(DataTimeTable.Time,DataTimeTable.NYSE)
title("NYSE Daily Closing Prices: 1990 - 2001")
Prepare Timetable for Estimation
When you plan to supply a timetable, you must ensure it has all the following characteristics:
The selected response variable is numeric and does not contain any missing values.
The timestamps in the
Time
variable are regular, and they are ascending or descending.
Remove all missing values from the timetable, relative to the NYSE price series.
DTT = rmmissing(DataTimeTable,DataVariables="NYSE");
T_DTT = height(DTT)
T_DTT = 3028
Because all sample times have observed NYSE prices, rmmissing
does not remove any observations.
Determine whether the sampling timestamps have a regular frequency and are sorted.
areTimestampsRegular = isregular(DTT,"days")
areTimestampsRegular = logical
0
areTimestampsSorted = issorted(DTT.Time)
areTimestampsSorted = logical
1
areTimestampsRegular = 0
indicates that the timestamps of DTT
are irregular. areTimestampsSorted = 1
indicates that the timestamps are sorted. Business day rules make daily macroeconomic measurements irregular.
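To see why business-day sampling fails the regularity check, consider a small, made-up timetable. This is a minimal sketch using base MATLAB only; the timestamps, the values, and the variable names t, TT, isDailyRegular, and isSorted are hypothetical and unrelated to Data_EquityIdx.

```matlab
% Hypothetical business-day timestamps: Monday through Friday,
% followed by the next Monday and Tuesday (weekend skipped)
t  = datetime(2001,1,1) + caldays([0 1 2 3 4 7 8])';
TT = timetable((1:7)',RowTimes=t);

% The three-day weekend gap breaks the one-day step, so the timetable is
% not regular with respect to days, even though the timestamps are sorted
isDailyRegular = isregular(TT,"days")
isSorted       = issorted(TT.Properties.RowTimes)
```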
Remedy the time irregularity by computing the weekly average closing price series of all timetable variables.
DTTW = convert2weekly(DTT,Aggregation="mean");
areTimestampsRegular = isregular(DTTW,"weeks")
areTimestampsRegular = logical
1
T_DTTW = height(DTTW)
T_DTTW = 627
DTTW
is regular.
figure
plot(DTTW.Time,DTTW.NYSE)
title("NYSE Weekly Average Closing Prices: 1990 - 2001")
Create Model Template for Estimation
Suppose that an ARIMA(1,1,1) model is appropriate to model NYSE composite series during the sample period.
Create an ARIMA(1,1,1) model template for estimation. Set the response series name to NYSE
.
Mdl = arima(1,1,1);
Mdl.SeriesName = "NYSE";
Mdl
is a partially specified arima
model object.
Partition Data
estimate and forecast require Mdl.P presample observations to initialize the model for estimation and forecasting.
Partition the data into three sets:
A presample set for estimation
An in-sample set, to which you fit the model and initialize the model for forecasting
A holdout sample of length 15 to measure the model's predictive performance
numpreobs = Mdl.P;                               % Required presample length
numperiods = 15;                                 % Forecast horizon
DTTW0 = DTTW(1:numpreobs,:);                     % Estimation presample
DTTW1 = DTTW((numpreobs+1):(end-numperiods),:);  % In-sample for estimation and presample for forecasting
DTTW2 = DTTW((end-numperiods+1):end,:);          % Holdout sample
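The index arithmetic of this partition can be checked in isolation. The sketch below uses plain index vectors instead of the timetable, takes T = 627 from the weekly series height reported earlier, and assumes Mdl.P is 2 for the ARIMA(1,1,1) template (one AR lag plus one degree of integration). The names idx0, idx1, and idx2 are illustrative.

```matlab
T          = 627;  % number of weekly observations reported above
numpreobs  = 2;    % assumed value of Mdl.P for an ARIMA(1,1,1) model
numperiods = 15;   % forecast horizon

idx0 = 1:numpreobs;                   % estimation presample rows
idx1 = (numpreobs+1):(T-numperiods);  % in-sample rows
idx2 = (T-numperiods+1):T;            % holdout rows

% The three sets must cover every row exactly once, in order
assert(isequal([idx0 idx1 idx2],1:T))
fprintf("%d presample, %d in-sample, %d holdout\n", ...
    numel(idx0),numel(idx1),numel(idx2))
```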
Fit Model to Data
Fit an ARIMA(1,1,1) model to the in-sample weekly average NYSE closing prices. Specify the presample timetable and the presample response variable name.
EstMdl = estimate(Mdl,DTTW1,Presample=DTTW0,PresampleResponseVariable="NYSE");
    ARIMA(1,1,1) Model (Gaussian Distribution):

                 Value      StandardError    TStatistic      PValue
                ________    _____________    __________    ___________
    Constant     0.31873       0.23754         1.3418          0.17965
    AR{1}        0.41132        0.2371         1.7348         0.082779
    MA{1}       -0.31232       0.24486        -1.2755          0.20212
    Variance      55.472        1.8496         29.992      1.2638e-197
EstMdl
is a fully specified, estimated arima
model object.
Forecast Conditional Mean
Forecast the weekly average NYSE closing prices 15 weeks beyond the estimation sample using the fitted model. Use the estimation sample data as a presample to initialize the forecast. Because EstMdl.SeriesName is NYSE, forecast selects the NYSE variable in the presample data by default.
Tbl2 = forecast(EstMdl,numperiods,DTTW1)
Tbl2=15×3 timetable
Time NYSE_Response NYSE_MSE NYSE_Variance
___________ _____________ ________ _____________
28-Sep-2001 521.34 55.472 55.472
05-Oct-2001 519.89 122.47 55.472
12-Oct-2001 519.62 194.53 55.472
19-Oct-2001 519.82 268.72 55.472
26-Oct-2001 520.23 343.8 55.472
02-Nov-2001 520.71 419.24 55.472
09-Nov-2001 521.23 494.83 55.472
16-Nov-2001 521.76 570.49 55.472
23-Nov-2001 522.3 646.17 55.472
30-Nov-2001 522.84 721.86 55.472
07-Dec-2001 523.38 797.56 55.472
14-Dec-2001 523.92 873.26 55.472
21-Dec-2001 524.46 948.96 55.472
28-Dec-2001 525 1024.7 55.472
04-Jan-2002 525.55 1100.4 55.472
Tbl2 is a 15-by-3 timetable containing the forecasted weekly average closing prices NYSE_Response, corresponding forecast MSEs NYSE_MSE, and the model's constant variance NYSE_Variance (EstMdl.Variance = 55.472).
Plot the forecasts and approximate 95% forecast intervals.
Tbl2.NYSE_Lower = Tbl2.NYSE_Response - 1.96*sqrt(Tbl2.NYSE_MSE);
Tbl2.NYSE_Upper = Tbl2.NYSE_Response + 1.96*sqrt(Tbl2.NYSE_MSE);

figure
h1 = plot([DTTW1.Time((end-75):end); DTTW2.Time], ...
    [DTTW1.NYSE((end-75):end); DTTW2.NYSE],Color=[.7,.7,.7]);
hold on
h2 = plot(Tbl2.Time,Tbl2.NYSE_Response,"k",LineWidth=2);
h3 = plot(Tbl2.Time,Tbl2{:,["NYSE_Lower" "NYSE_Upper"]},"r:",LineWidth=2);
legend([h1 h2 h3(1)],"Observations","Forecasts","95% forecast intervals", ...
    Location="NorthWest")
title("NYSE Weekly Average Closing Price")
hold off
The process is nonstationary, so the width of each forecast interval grows with time. The model tends to underestimate the weekly average closing prices.
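The growth of the forecast MSE can be reproduced approximately from the rounded estimates. The sketch below assumes the standard result that the h-step-ahead forecast error variance equals the innovation variance times the cumulative sum of squared impulse response (psi) weights; it uses base MATLAB only, and the names phi, theta, sigma2, arPoly, psi, and mse are illustrative.

```matlab
% Rounded estimates copied from the estimation display above
phi    = 0.41132;   % AR{1}
theta  = -0.31232;  % MA{1}
sigma2 = 55.472;    % Variance
h      = 15;        % forecast horizon

% Impulse response of (1 - phi*L)(1 - L)y_t = (1 + theta*L)e_t
arPoly = conv([1 -phi],[1 -1]);
psi    = filter([1 theta],arPoly,[1 zeros(1,h-1)]);

% h-step-ahead forecast MSE: sigma2 times the running sum of squared psi weights
mse = sigma2*cumsum(psi.^2);
disp(mse(1:3))
```

The first three values should be close to the first three entries of NYSE_MSE. Because the differencing factor (1 - L) keeps the psi weights from decaying to zero, the cumulative sum, and with it the interval width, keeps increasing with the horizon.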
Forecast ARX Model
Forecast the following known autoregressive model with one lag and an exogenous predictor (an ARX(1) model) into a 10-period forecast horizon:
$$y_t = 1 + 0.3y_{t-1} + 2x_t + \varepsilon_t,$$
where $\varepsilon_t$ is a standard Gaussian random variable, and $x_t$ is an exogenous Gaussian random variable with a mean of 1 and a standard deviation of 0.5.
Create an arima
model object that represents the ARX(1) model.
Mdl = arima(Constant=1,AR=0.3,Beta=2,Variance=1);
To forecast responses from the ARX(1) model, the forecast
function requires:
One presample response to initialize the autoregressive term
Future exogenous data to include the effects of the exogenous variable on the forecasted responses
Set the presample response to the unconditional mean of the stationary process:
$$E(y_t) = \frac{1 + 2(1)}{1 - 0.3}.$$
For the future exogenous data, draw 10 values from the distribution of the exogenous variable.
rng(1,"twister");
y0 = (1 + 2)/(1 - 0.3);
xf = 1 + 0.5*randn(10,1);
Forecast the ARX(1) model into a 10-period forecast horizon. Specify the presample response and future exogenous data.
fh = 10;
yf = forecast(Mdl,fh,y0,XF=xf)
yf = 10×1
3.6367
5.2722
3.8232
3.0373
3.0657
3.3470
3.4454
4.2120
4.0667
4.8065
yf(3)
= 3.8232
is the 3-period-ahead forecast of the ARX(1) model.
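The forecasts can also be traced by hand. The following sketch, which needs only base MATLAB, iterates the ARX(1) equation with each future innovation replaced by its mean of zero. It repeats the seed and draws from the example so that xf matches; yfManual and yPrev are illustrative names.

```matlab
rng(1,"twister");
y0 = (1 + 2)/(1 - 0.3);      % presample response (unconditional mean)
xf = 1 + 0.5*randn(10,1);    % future exogenous data, drawn as in the example

yfManual = zeros(10,1);
yPrev = y0;
for t = 1:10
    % y_t = 1 + 0.3*y_{t-1} + 2*x_t, with the future innovation set to zero
    yfManual(t) = 1 + 0.3*yPrev + 2*xf(t);
    yPrev = yfManual(t);
end
yfManual(1:3)
```

If the recursion matches what forecast computes for this model, the values agree with the first entries of yf listed above.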
Forecast Composite Conditional Mean and Variance Model
Since R2023b
Consider the following AR(1) conditional mean model with a GARCH(1,1) conditional variance model for the weekly average NASDAQ rate series (as a percent) from January 2, 1990 through December 31, 2001:
$$y_t = 0.073 + 0.138y_{t-1} + \varepsilon_t$$
$$\sigma_t^2 = 0.022 + 0.873\sigma_{t-1}^2 + 0.119\varepsilon_{t-1}^2,$$
where $\varepsilon_t$ is a series of independent random Gaussian variables with a mean of 0.
Create the model. Name the response series NASDAQ
.
CondVarMdl = garch(Constant=0.022,GARCH=0.873,ARCH=0.119);
Mdl = arima(Constant=0.073,AR=0.138,Variance=CondVarMdl);
Mdl.SeriesName = "NASDAQ";
Load the equity index data set. Remedy the time irregularity by computing the weekly average closing price series of all timetable variables.
load Data_EquityIdx
DTTW = convert2weekly(DataTimeTable,Aggregation="mean");
Convert the weekly average NASDAQ closing price series to a percent return series.
RetTT = price2ret(DTTW);
RetTT.NASDAQ = RetTT.NASDAQ*100;
Infer residuals and conditional variances from the model.
RetTT2 = infer(Mdl,RetTT);
T = numel(RetTT);
Forecast the model over a 25-week horizon. Supply the entire data set as a presample (forecast
uses only the latest required observations to initialize the conditional mean and variance models). Supply variable names for the presample innovations and conditional variances. By default, forecast
uses the variable name Mdl.SeriesName
as the presample response variable.
fh = 25;
ForecastTT = forecast(Mdl,fh,RetTT2,PresampleInnovationVariable="NASDAQ_Residual", ...
    PresampleVarianceVariable="NASDAQ_Variance");
Plot the forecasted responses and conditional variances with the observed series from June 2000.
pdates = RetTT2.Time > datetime(2000,6,1);

figure
plot(RetTT2.Time(pdates),RetTT2.NASDAQ(pdates))
hold on
plot([RetTT2.Time(end); ForecastTT.Time], ...
    [RetTT2.NASDAQ(end); ForecastTT.NASDAQ_Response])
title("NASDAQ Weekly Average Percent Return Series")
legend("Observed","Forecasted")
axis tight
grid on
hold off

figure
plot(RetTT2.Time(pdates),RetTT2.NASDAQ_Variance(pdates))
hold on
plot([RetTT2.Time(end); ForecastTT.Time], ...
    [RetTT2.NASDAQ_Variance(end); ForecastTT.NASDAQ_Variance])
title("Conditional Variance Series")
legend("Observed","Forecasted")
axis tight
grid on
hold off
Forecast Multiple Paths
Forecast multiple response and conditional variance paths from a known composite conditional mean and variance model: a SAR conditional mean model with an ARCH(1) conditional variance model. Specify multiple presample response paths.
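Create a garch model object that represents this ARCH(1) model:
$$\sigma_t^2 = 0.1 + 0.2\varepsilon_{t-1}^2.$$
Create an arima model object that represents this quarterly SAR model:
$$(1 - 0.5L)(1 - 0.2L^4)(1 - L^4)y_t = 1 + \varepsilon_t,$$
where $\varepsilon_t = \sigma_t z_t$ and $z_t$ is a standard Gaussian random variable.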
CVMdl = garch(ARCH=0.2,Constant=0.1)
CVMdl =
  garch with properties:

     Description: "GARCH(0,1) Conditional Variance Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 0
               Q: 1
        Constant: 0.1
           GARCH: {}
            ARCH: {0.2} at lag [1]
          Offset: 0
Mdl = arima(Constant=1,AR=0.5,Variance=CVMdl,Seasonality=4, ...
SARLags=4,SAR=0.2)
Mdl =
  arima with properties:

     Description: "ARIMA(1,0,0) Model Seasonally Integrated with Seasonal AR(4) (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 9
               D: 0
               Q: 0
        Constant: 1
              AR: {0.5} at lag [1]
             SAR: {0.2} at lag [4]
              MA: {}
             SMA: {}
     Seasonality: 4
            Beta: [1×0]
        Variance: [GARCH(0,1) Model]
Because the compound autoregressive polynomial of Mdl has degree 9 and CVMdl contains 1 ARCH term, forecast requires Mdl.P = 9 responses and CVMdl.Q = 1 conditional variance to generate each t-period-ahead forecast.
Generate 10 random paths of length 9 from the model.
rng(1,"twister")
numpreobs = Mdl.P;
numpaths = 10;
[Y0,~,V0] = simulate(Mdl,numpreobs,NumPaths=numpaths);
Forecast 10 paths of responses and conditional variances from the model into a 12-quarter forecast horizon. Specify the presample response paths Y0
and conditional variance paths V0.
fh = 12;
[YF,~,VF] = forecast(Mdl,fh,Y0,V0=V0);
YF
and VF
are 12-by-10 matrices of independent forecasted response and conditional variance paths, respectively. YF(j,k)
is the j
-period-ahead forecast of path k
. Path YF(:,k)
represents the continuation of the presample path Y0(:,k)
. forecast
structures VF
similarly.
Plot the presample and forecasted responses.
Y = [Y0; YF];

figure
plot(Y)
hold on
h = gca;
px = [numpreobs+0.5 h.XLim([2 2]) numpreobs+0.5];
py = h.YLim([1 1 2 2]);
hp = patch(px,py,[0.9 0.9 0.9]);
uistack(hp,"bottom");
axis tight
legend("Forecast period")
xlabel("Time (quarters)")
title("Response paths")
hold off

V = [V0; VF];

figure
plot(V)
hold on
h = gca;
px = [numpreobs+0.5 h.XLim([2 2]) numpreobs+0.5];
py = h.YLim([1 1 2 2]);
hp = patch(px,py,[0.9 0.9 0.9]);
uistack(hp,"bottom");
legend("Forecast period")
axis tight
xlabel("Time (quarters)")
title("Conditional Variance Paths")
hold off
Input Arguments
numperiods
— Forecast horizon
positive integer
Forecast horizon, or the number of time points in the forecast period, specified as a positive integer.
Data Types: double
Y0
— Presample response data yt
numeric column vector | numeric matrix
Presample response data yt used to
initialize the model for forecasting, specified as a numpreobs
-by-1
numeric column vector or a numpreobs
-by-numpaths
numeric matrix. When you supply Y0
, supply all optional data as
numeric arrays, and forecast
returns results in numeric
arrays.
numpreobs
is the number of presample observations.
numpaths
is the number of independent presample paths, from which
forecast
initializes the resulting numpaths
forecasts (see Algorithms).
Each row is a presample observation, and measurements in each row occur
simultaneously. The last row contains the latest presample observation.
numpreobs
must be at least Mdl.P
to initialize
the model. If numpreobs
> Mdl.P
,
forecast
uses only the latest Mdl.P
rows.
For more details, see Time Base Partitions for Forecasting.
Columns of Y0
correspond to separate, independent presample
paths.
If Y0 is a column vector, it represents a single path of the response series. forecast applies it to each forecasted path. In this case, all forecast paths Y derive from the same initial responses.
If Y0 is a matrix, each column represents a presample path of the response series. numpaths is the maximum among the second dimensions of the specified presample observation matrices Y0, E0, and V0.
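The row and column conventions for Y0 can be pictured with a small array manipulation. This is a conceptual sketch in base MATLAB, not the internal implementation of forecast; P and numpaths stand in for Mdl.P and the number of forecast paths, and Y0used is an illustrative name.

```matlab
P        = 2;       % stand-in for Mdl.P
numpaths = 3;       % stand-in for the number of forecast paths
Y0       = (1:5)';  % single presample path with numpreobs = 5

% Only the latest P rows are needed to initialize the model
Y0used = Y0(end-P+1:end,:);

% A single presample path serves as the starting point of every forecast path
if size(Y0used,2) == 1
    Y0used = repmat(Y0used,1,numpaths);
end
size(Y0used)
```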
Data Types: double
Tbl1
— Presample data
table | timetable
Since R2023b
Presample data containing required presample responses
yt, and, optionally, innovations
εt, conditional variances
σt2, or
predictors xt, to initialize the model,
specified as a table or timetable with numprevars
variables and
numpreobs
rows. You can select a response, innovation, conditional
variance, or multiple predictor variables from Tbl1
by using the
PresampleResponseVariable
,
PresampleInnovationVariable
,
PresampleVarianceVariable
, or
PresamplePredictorVariables
name-value argument,
respectively.
numpreobs
is the number of presample observations.
numpaths
is the number of independent presample paths, from which
forecast
initializes the resulting numpaths
forecasts (see Algorithms).
For all selected variables except predictor variables, each variable contains a
single path (numpreobs
-by-1 vector) or multiple paths
(numpreobs
-by-numpaths
matrix) of presample
response, innovations, or conditional variance data.
Each selected predictor variable contains a single path of observations.
forecast
applies all selected predictor variables to each
forecasted path. When you do not specify presample innovation data for forecasting an
ARIMAX model, forecast
uses the presample predictor data to
infer presample innovations.
Each row is a presample observation, and measurements in each row occur
simultaneously. numpreobs
must be one of the following values:
At least Mdl.P when Tbl1 provides only presample responses
At least max([Mdl.P Mdl.Q]) otherwise
When Mdl.Variance
is a conditional variance model,
forecast
can require more than the minimum required number of
presample values. If numpreobs
exceeds the minimum number,
forecast
uses the latest required number of observations
only.
If Tbl1
is a timetable, all the following conditions must be true:
Tbl1 must represent a sample with a regular datetime time step (see isregular).
The datetime vector of sample timestamps Tbl1.Time must be ascending or descending.
If Tbl1
is a table, the last row contains the latest presample
observation.
Although forecast
requires presample response data,
forecast
sets default presample innovation and conditional
variance data as follows:
To infer necessary presample innovations from presample responses, numpreobs must be at least Mdl.P + Mdl.Q (see infer). Additionally, for ARIMAX models, forecast requires enough presample predictor data. If numpreobs is less than Mdl.P + Mdl.Q or you do not specify presample predictor data for ARIMAX forecasting, forecast sets all necessary presample innovations to zero.
To infer necessary presample variances from presample innovations, forecast requires a sufficient number of presample innovations to initialize the specified conditional variance model (see infer). If you do not specify enough presample innovations to initialize the conditional variance model, forecast sets the necessary presample variances to the unconditional variance of the specified variance process.
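For a sense of scale, the unconditional variance of a GARCH(1,1) process with constant κ, GARCH coefficient γ1, and ARCH coefficient α1 is κ/(1 − γ1 − α1), provided γ1 + α1 < 1. The sketch below evaluates this textbook formula for the GARCH(1,1) coefficients used in the NASDAQ example on this page; it illustrates the fallback value and is not a call into forecast. The names kappa, g1, a1, and uncondVar are illustrative.

```matlab
kappa = 0.022;  % Constant of the conditional variance model
g1    = 0.873;  % GARCH coefficient at lag 1
a1    = 0.119;  % ARCH coefficient at lag 1

% The formula applies only to a covariance-stationary variance process
assert(g1 + a1 < 1)
uncondVar = kappa/(1 - g1 - a1)
```

Under these assumptions, the value is approximately 2.75 (in squared percent).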
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN
, where Name
is
the argument name and Value
is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name
in quotes.
Example: forecast(Mdl,10,Y0,X0=Exo0,XF=Exo)
specifies the presample
and forecast sample exogenous predictor data to Exo0
and
Exo
, respectively, to forecast a model with a regression
component.
E0
— Presample innovations εt
numeric column vector | numeric matrix
Presample innovations εt used to
initialize either the moving average (MA) component of the ARIMA model or the
conditional variance model, specified as a numpreobs
-by-1 column
vector or numpreobs
-by-numpaths
numeric matrix.
Use E0
only when you supply the numeric array of presample response
data Y0
. forecast
assumes that the
presample innovations have a mean of zero.
Each row is a presample observation, and measurements in each row occur
simultaneously. The last row contains the latest presample observation.
numpreobs
must be at least Mdl.Q
to initialize
the model. If Mdl.Variance
is a conditional variance model (for
example, a garch
model object), E0
might require more than Mdl.Q
rows. If numpreobs
is greater than required, forecast
uses only the latest
required rows.
Columns of E0
correspond to separate, independent presample
paths.
If E0 is a column vector, it represents a single path of the innovation series. forecast applies it to each forecasted path. In this case, all forecast paths Y derive from the same initial innovations.
If E0 is a matrix, each column represents a presample path of the innovation series. numpaths is the maximum among the second dimensions of the specified presample observation matrices Y0, E0, and V0.
By default:
If you provide enough presample responses and, for ARIMAX models, presample predictor data (X0), forecast infers necessary presample innovations from the presample data. In this case, numpreobs must be at least Mdl.P + Mdl.Q (see infer).
Otherwise, forecast sets all necessary presample innovations to zero.
Data Types: double
V0
— Presample conditional variances σt2
positive numeric column vector | positive numeric matrix
Presample conditional variances
σt2 used to
initialize the conditional variance model, specified as a
numpreobs
-by-1 positive column vector or
numpreobs
-by-numpaths
positive matrix. Use
V0
only when you supply the numeric array of presample response
data Y0
. If the model variance Mdl.Variance
is
constant, forecast
ignores V0
.
Rows of V0
correspond to periods in the presample, and the last
row contains the latest presample conditional variance. numpreobs
must be enough to initialize the conditional variance model (see forecast
). If numpreobs
exceeds the minimum number,
forecast
uses only the latest observations.
Columns of V0
correspond to separate, independent paths.
If V0 is a column vector, forecast applies it to each forecasted path. In this case, the conditional variance model of all forecast paths Y derives from the same initial conditional variances.
If V0 is a matrix, each column represents a presample path of the conditional variance series. numpaths is the maximum among the second dimensions of the specified presample observation matrices Y0, E0, and V0.
By default:
If you specify enough presample innovations E0 to initialize the conditional variance model Mdl.Variance, forecast infers any necessary presample conditional variances by passing the conditional variance model and E0 to the infer function.
If you do not specify E0, but you specify enough presample responses Y0 and, for ARIMAX models, presample predictor data to infer enough presample innovations, forecast infers any necessary presample conditional variances from the inferred presample innovations.
If you do not specify enough presample data, forecast sets all necessary presample conditional variances to the unconditional variance of the variance process.
Data Types: double
PresampleResponseVariable
— Response variable yt to select from Tbl1
string scalar | character vector | integer | logical vector
Since R2023b
Response variable yt to select from
Tbl1
containing the presample response data, specified as one
of the following data types:
String scalar or character vector containing a variable name in
Tbl1.Properties.VariableNames
Variable index (positive integer) to select from
Tbl1.Properties.VariableNames
A logical vector, where PresampleResponseVariable(j) = true selects variable j from Tbl1.Properties.VariableNames
The selected variable must be a numeric vector and cannot contain missing values
(NaN
s).
If Tbl1
has one variable, the default specifies that
variable. Otherwise, the default matches the variable to names in
Mdl.SeriesName
.
Example: PresampleResponseVariable="StockRate"
Example: PresampleResponseVariable=[false false true false]
or
PresampleResponseVariable=3
selects the third table variable as
the response variable.
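The three selection styles are interchangeable. The sketch below demonstrates the equivalence with ordinary table indexing in base MATLAB rather than with forecast itself; the table contents and the variable names A, B, and D are made up for illustration.

```matlab
% Hypothetical four-variable table; StockRate is the third variable
Tbl1 = table((1:3)',(4:6)',(7:9)',(10:12)', ...
    VariableNames=["A" "B" "StockRate" "D"]);

byName    = Tbl1{:,"StockRate"};                % variable name
byIndex   = Tbl1{:,3};                          % variable index
byLogical = Tbl1{:,[false false true false]};   % logical vector

% All three forms pick out the same column
sameSelection = isequal(byName,byIndex,byLogical)
```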
Data Types: double | logical | char | cell | string
PresampleInnovationVariable
— Presample innovation variable of εt to select from Tbl1
string scalar | character vector | integer | logical vector
Since R2023b
Presample innovation variable of εt to
select from Tbl1
containing presample innovation data, specified as
one of the following data types:
String scalar or character vector containing a variable name in
Tbl1.Properties.VariableNames
Variable index (positive integer) to select from
Tbl1.Properties.VariableNames
A logical vector, where PresampleInnovationVariable(j) = true selects variable j from Tbl1.Properties.VariableNames
The selected variable must be a numeric matrix and cannot contain missing values
(NaN
s).
If you specify presample innovation data in Tbl1
, you must
specify PresampleInnovationVariable
.
Example: PresampleInnovationVariable="StockRateDist0"
Example: PresampleInnovationVariable=[false false true false]
or
PresampleInnovationVariable=3
selects the third table variable as
the presample innovation variable.
Data Types: double | logical | char | cell | string
PresampleVarianceVariable
— Presample conditional variance variable σt2 to select
from Tbl1
string scalar | character vector | integer | logical vector
Presample conditional variance variable
σt2 to select
from Tbl1
containing presample conditional variance data, specified
as one of the following data types:
String scalar or character vector containing a variable name in
Tbl1.Properties.VariableNames
Variable index (positive integer) to select from
Tbl1.Properties.VariableNames
A logical vector, where PresampleVarianceVariable(j) = true selects variable j from Tbl1.Properties.VariableNames
The selected variable must be a numeric vector and cannot contain missing values
(NaN
s).
If you specify presample conditional variance data in Tbl1
,
you must specify PresampleVarianceVariable
.
Example: PresampleVarianceVariable="StockRateVar0"
Example: PresampleVarianceVariable=[false false true false]
or
PresampleVarianceVariable=3
selects the third table variable as
the presample conditional variance variable.
Data Types: double | logical | char | cell | string
X0
— Presample predictor data
numeric matrix
Presample predictor data used to infer the presample innovations
E0
, specified as a
numpreobs
-by-numpreds
numeric matrix. Use
X0
only when you supply the numeric array of presample response
data Y0
and your model contains a regression component.
numpreds
= numel(Mdl.Beta)
.
Rows of X0
correspond to periods in the presample, and the last
row contains the latest set of presample predictor observations. Columns of
X0
represent separate time series variables, and they correspond
to the columns of XF
and Mdl.Beta
.
If you do not specify E0
, X0
must have at
least numpreobs
– Mdl.P
rows so that
forecast
can infer presample innovations. If the number of
rows exceeds the minimum number required to infer presample innovations,
forecast
uses only the latest required presample predictor
observations. A best practice is to set X0
to the same predictor
data matrix used in the estimation, simulation, or inference of
Mdl
. This setting ensures that forecast
infers presample innovations E0
correctly.
If you specify E0
, forecast
ignores
X0
.
If you specify X0
but you do not specify forecasted predictor
data XF
, forecast
issues an
error.
By default, forecast
drops the regression component from the model when it infers presample innovations, regardless of the value of the regression coefficient Mdl.Beta
.
Data Types: double
PresamplePredictorVariables
— Presample exogenous predictor variables xt to select from Tbl1
string vector | cell vector of character vectors | vector of integers | logical vector
Since R2023b
Presample exogenous predictor variables
xt to select from Tbl1
containing presample exogenous predictor data, specified as one of the following data types:
String vector or cell vector of character vectors containing
numpreds
variable names inTbl1.Properties.VariableNames
A vector of unique indices (positive integers) of variables to select from
Tbl1.Properties.VariableNames
A logical vector, where PresamplePredictorVariables(j) = true selects variable j from Tbl1.Properties.VariableNames
The selected variables must be numeric vectors and cannot contain missing values
(NaN
s).
If you specify presample predictor data, you must also specify in-sample predictor
data by using the InSample
and
PredictorVariables
name-value arguments.
By default, forecast
excludes the regression component,
regardless of its presence in Mdl
.
Example: PresamplePredictorVariables=["M1SL" "TB3MS" "UNRATE"]
Example: PresamplePredictorVariables=[true false true false] or PresamplePredictorVariables=[1 3] selects the first and third table variables to supply the predictor data.
Data Types: double | logical | char | cell | string
XF
— Forecasted (or future) predictor data
numeric matrix
Forecasted (or future) predictor data, specified as a numeric matrix with
numpreds
columns. XF
represents the evolution
of specified presample predictor data X0
forecasted into the
future (the forecast period). Use XF
only when you supply the
numeric array of presample response data Y0
.
Rows of XF correspond to time points in the future; XF(t,:) contains the t-period-ahead predictor forecasts. XF
must have at least numperiods
rows. If the number of rows exceeds
numperiods
, forecast
uses only the first
(earliest) numperiods
forecasts. For more details, see Time Base Partitions for Forecasting.
Columns of XF
are separate time series variables, and they
correspond to the columns of X0
and
Mdl.Beta
.
By default, the forecast
function generates forecasts from Mdl
without a regression component, regardless of the value of the regression coefficient Mdl.Beta
.
InSample
— Forecasted (future) predictor data
table | timetable
Since R2023b
Forecasted (future) predictor data for the exogenous regression component of the
model, specified as a table or timetable. InSample
contains
numvars
variables, including numpreds
predictor variables xt.
forecast
returns the forecasted variables in the output
table or timetable Tbl2
, which is commensurate with
InSample
.
Each row corresponds to an observation in the forecast horizon, the first row is
the earliest observation, and measurements in each row, among all paths, occur
simultaneously. InSample
must have at least
numperiods
rows to cover the forecast horizon. If you supply
more rows than necessary, forecast
uses only the first
numperiods
rows.
Each selected predictor variable is a numeric vector without missing values
(NaN
s). forecast
applies the specified
predictor variables to all forecasted paths.
If InSample
is a timetable, the following conditions apply:
If InSample
is a table, the last row contains the latest
observation.
By default, forecast
does not include the regression
component in the model, regardless of the value of Mdl.Beta
.
PredictorVariables
— Exogenous predictor variables xt to select from InSample
string vector | cell vector of character vectors | vector of integers | logical vector
Since R2023b
Exogenous predictor variables xt to
select from InSample
containing exogenous predictor data in the
forecast horizon, specified as one of the following data types:
String vector or cell vector of character vectors containing
numpreds
variable names inInSample.Properties.VariableNames
A vector of unique indices (positive integers) of variables to select from
InSample.Properties.VariableNames
A logical vector, where PredictorVariables(j) = true selects variable j from InSample.Properties.VariableNames
The selected variables must be numeric vectors and cannot contain missing values
(NaN
s).
By default, forecast
excludes the regression component,
regardless of its presence in Mdl
.
Example: PredictorVariables=["M1SL" "TB3MS" "UNRATE"]
Example: PredictorVariables=[true false true false] or PredictorVariables=[1 3] selects the first and third table variables to supply the predictor data.
Data Types: double
| logical
| char
| cell
| string
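The following sketch illustrates selecting predictor variables by name when forecasting from tabular data. The model coefficients, the variable names Y, X1, and X2, and the random data are assumptions made for illustration only; the name-value arguments are the ones described on this page (R2023b or later).

```matlab
% Hypothetical ARX(1) model with two exogenous predictors (all values assumed).
Mdl = arima(AR=0.5,Beta=[1.5 -2],Constant=1,Variance=0.1);

rng(1,"twister")            % for reproducibility of the illustrative data
numperiods = 5;

% Presample table: response Y and predictors X1, X2 (names are assumptions).
Tbl1 = table(randn(10,1),randn(10,1),randn(10,1), ...
    VariableNames=["Y" "X1" "X2"]);

% Future predictor data, one row per period in the forecast horizon.
InSample = table(randn(numperiods,1),randn(numperiods,1), ...
    VariableNames=["X1" "X2"]);

% Select the presample and future predictor variables by name.
Tbl2 = forecast(Mdl,numperiods,Tbl1, ...
    PresampleResponseVariable="Y", ...
    PresamplePredictorVariables=["X1" "X2"], ...
    InSample=InSample,PredictorVariables=["X1" "X2"]);

disp(Tbl2.Properties.VariableNames)
```

Because Mdl.SeriesName is left at its default in this sketch, the forecast variables in Tbl2 are expected to carry that name as a prefix.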
Note
For numeric array inputs, forecast
assumes that you
synchronize all specified presample data sets so that the latest observation of each
presample series occurs simultaneously. Similarly, forecast
assumes that the first observation in the forecasted predictor data
XF
occurs in the time point immediately after the last observation
in the presample predictor data X0
.
Output Arguments
Y
— Minimum mean square error (MMSE) conditional mean forecasts
numeric column vector | numeric matrix
Minimum mean square error (MMSE) conditional mean forecasts
yt, returned as a
numperiods
-by-1 column vector or a
numperiods
-by-numpaths
numeric matrix.
Y
represents a continuation of Y0
(Y(1,:)
occurs in the time point immediately after
Y0(end,:)
). forecast
returns
Y
only when you supply numeric presample data
Y0
.
Y(t,:) contains the t-period-ahead forecasts, or the conditional mean forecast of all paths for time point t in the forecast period.
forecast
determines numpaths
from the
number of columns in the presample data sets Y0
,
E0
, and V0
. For details, see Algorithms. If each
presample data set has one column, Y
is a column vector.
Data Types: double
YMSE
— MSE of forecasted responses
numeric column vector | numeric matrix
MSE of the forecasted responses Y
(forecast error variances),
returned as a numperiods
-by-1 column vector or a
numperiods
-by-numpaths
numeric matrix.
forecast
returns YMSE
only when you
supply numeric presample data Y0
.
YMSE(t,:) contains the forecast error variances of all paths for time point t in the forecast period.
forecast
determines numpaths
from the
number of columns in the presample data sets Y0
,
E0
, and V0
. For details, see Algorithms. If you do
not specify any presample data sets, or if each data set is a column vector,
YMSE
is a column vector.
The square roots of YMSE
are the standard errors of the forecasts Y
.
Data Types: double
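As a minimal sketch of how the standard errors can be used, the following code builds approximate 95% forecast intervals from Y and YMSE. The AR(1) coefficients and presample values are assumed for illustration, and the 1.96 multiplier presumes Gaussian innovations.

```matlab
% Hypothetical AR(1) model and presample responses (values assumed).
Mdl = arima(AR=0.6,Constant=0.2,Variance=0.5);
Y0  = [0.4; 0.7; 0.5];

[Y,YMSE] = forecast(Mdl,5,Y0);

se    = sqrt(YMSE);          % standard errors of the forecasts
lower = Y - 1.96*se;         % approximate 95% lower bound
upper = Y + 1.96*se;         % approximate 95% upper bound

disp(table((1:5)',lower,Y,upper, ...
    VariableNames=["Horizon" "Lower" "Forecast" "Upper"]))
```

For this model the 1-period-ahead MSE equals the innovation variance, and the intervals widen as the horizon grows.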
V
— MMSE forecasts of conditional variances of future model innovations
numeric column vector | numeric matrix
MMSE forecasts of the conditional variances of future model innovations, returned as
a numperiods
-by-1 numeric column vector or a
numperiods
-by-numpaths
numeric matrix.
forecast
returns V
only when you supply
numeric presample data Y0
.
When Mdl.Variance is a conditional variance model, row j contains the conditional variance forecasts of period j. Otherwise, V is a matrix composed of the constant Mdl.Variance.
forecast
determines numpaths
from the
number of columns in the presample data sets Y0
,
E0
, and V0
. For details, see Algorithms. If you do
not specify any presample data sets, or if each data set is a column vector,
V
is a column vector.
Data Types: double
Tbl2
— Paths of MMSE forecasts of responses yt, corresponding forecast MSEs, and MMSE forecasts of conditional variances
σt2 of future
model innovations εt
table | timetable
Since R2023b
Paths of MMSE forecasts of responses yt,
corresponding forecast MSEs, and MMSE forecasts of conditional variances
σt2 of future
model innovations εt, returned as a table or
timetable, the same data type as Tbl1
.
forecast
returns Tbl2
only when you
supply the input Tbl1
.
Tbl2
contains the following variables:
The forecasted response paths, which are in a numperiods-by-numpaths numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths, each corresponding to the input presample response paths in Tbl1. forecast names the forecasted response variable responseName_Response, where responseName is Mdl.SeriesName. For example, if Mdl.SeriesName is GDP, Tbl2 contains a variable for the corresponding forecasted response paths with the name GDP_Response.
Each path in Tbl2.responseName_Response represents the continuation of the corresponding presample response path in Tbl1 (Tbl2.responseName_Response(1,:) occurs in the next time point, with respect to the periodicity of Tbl1, after the last presample response). Tbl2.responseName_Response(j,k) contains the j-period-ahead forecasted response of path k.
The forecast MSE paths, which are in a numperiods-by-numpaths numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths, each corresponding to the forecasted responses in Tbl2.responseName_Response. forecast names the forecast MSEs responseName_MSE, where responseName is Mdl.SeriesName. For example, if Mdl.SeriesName is GDP, Tbl2 contains a variable for the corresponding forecast MSE with the name GDP_MSE.
The forecasted conditional variance paths, which are in a numperiods-by-numpaths numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths. forecast names the forecasted conditional variance variable responseName_Variance, where responseName is Mdl.SeriesName. For example, if Mdl.SeriesName is StockReturns, Tbl2 contains a variable for the corresponding forecasted conditional variance paths with the name StockReturns_Variance.
Each path in Tbl2.responseName_Variance represents a continuation of the presample conditional variance process, either supplied by Tbl1 or set by default (Tbl2.responseName_Variance(1,:) occurs in the next time point, with respect to the periodicity of Tbl1, after the last presample conditional variance). Tbl2.responseName_Variance(j,k) contains the j-period-ahead forecasted conditional variance of path k.
When you supply InSample, Tbl2 contains all variables in InSample.
If Tbl1
is a timetable, the following conditions hold:
The row order of Tbl2, either ascending or descending, matches the row order of Tbl1.
Tbl2.Time(1) is the next time after Tbl1.Time(end) relative to the sampling frequency, and Tbl2.Time(2:numobs) are the following times relative to the sampling frequency.
More About
Time Base Partitions for Forecasting
Time base partitions for forecasting are two
disjoint, contiguous intervals of the time base; each interval contains time series data for
forecasting a dynamic model. The forecast period (forecast horizon)
is a numperiods
length partition at the end of the time base during
which the forecast
function generates the forecasts
Y
from the dynamic model Mdl
. The
presample period is the entire partition occurring before the
forecast period. The forecast
function can require observed
responses, innovations, or conditional variances in the presample period
(Y0
, E0
, and V0
, or
Tbl1
) to initialize the dynamic model for forecasting. The model
structure determines the types and amounts of required presample observations.
A common practice is to fit a dynamic model to a portion of the data set, and then validate the predictability of the model by comparing its forecasts to observed responses. During forecasting, the presample period contains the data to which the model is fit, and the forecast period contains the holdout sample for validation. Suppose that yt is an observed response series; x1,t, x2,t, and x3,t are observed exogenous series; and time t = 1,…,T. Consider forecasting responses from a dynamic model of yt containing a regression component with numperiods
= K periods. Suppose that the dynamic model is fit to the data in the interval [1,T – K] (for more details, see estimate
). This figure shows the time base partitions for forecasting.
For example, to generate the forecasts Y
from an ARX(2) model, forecast
requires:
Presample responses Y0 = $[y_{T-K-1},\ y_{T-K}]'$ to initialize the model. The 1-period-ahead forecast requires both observations, whereas the 2-periods-ahead forecast requires $y_{T-K}$ and the 1-period-ahead forecast Y(1). The forecast function generates all other forecasts by substituting previous forecasts for lagged responses in the model.
Future exogenous data XF = $[x_{1,(T-K+1):T},\ x_{2,(T-K+1):T},\ x_{3,(T-K+1):T}]$ for the model regression component. Without specified future exogenous data, the forecast function ignores the model regression component, which can yield unrealistic forecasts.
Dynamic models containing either a moving average component or a conditional variance model can require presample innovations or conditional variances. Given enough presample responses, forecast
infers the required presample innovations and conditional variances. If such a model also contains a regression component, then forecast
must have enough presample responses and exogenous data to infer the required presample innovations and conditional variances. This figure shows the arrays of required observations for this case, with corresponding input and output arguments.
Algorithms
The forecast function sets the number of sample paths (numpaths) to the maximum number of columns among the specified presample data sets:
For input numeric arrays of presample data, numpaths is the maximum width among E0, V0, and Y0.
For an input table or timetable of presample data, numpaths is the maximum width among the variables representing the presample responses PresampleResponseVariable, innovations PresampleInnovationVariable, and conditional variances PresampleVarianceVariable.
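A minimal base MATLAB sketch of this rule for numeric inputs follows. The data are illustrative, and the expansion of a one-column set with repmat is an assumed way to picture the one-column rule described in this section, not the function's actual code.

```matlab
% Illustrative presample data sets (values are arbitrary).
Y0 = randn(10,5);   % five presample response paths
E0 = randn(4,1);    % a single presample innovation path
V0 = [];            % not supplied in this sketch

% numpaths is the maximum width among the supplied data sets.
widths   = [size(Y0,2) size(E0,2) size(V0,2)];
numpaths = max(widths);

% Each supplied set must have one column or numpaths columns.
isValid = all(widths == 0 | widths == 1 | widths == numpaths);

% Picture a one-column set as being applied to every path.
if size(E0,2) == 1
    E0 = repmat(E0,1,numpaths);
end
```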
All specified presample data sets must have either one column or numpaths > 1 columns. Otherwise, forecast issues an error. For example, if you supply Y0 and E0, and Y0 has five columns representing five paths, then E0 can have one column or five columns. If E0 has one column, forecast applies E0 to each path.
NaN values in presample and future data sets indicate missing data. For input numeric arrays, forecast removes missing data from the presample data sets following this procedure:
1. forecast horizontally concatenates the specified presample data sets Y0, E0, V0, and X0 so that the latest observations occur simultaneously. The result can be a jagged array because the presample data sets can have a different number of rows. In this case, forecast prepads variables with an appropriate number of zeros to form a matrix.
2. forecast applies listwise deletion to the combined presample matrix by removing all rows containing at least one NaN.
3. forecast extracts the processed presample data sets from the result of step 2, and removes all prepadded zeros.
forecast applies a similar procedure to the forecasted predictor data XF. After forecast applies listwise deletion to XF, the result must have at least numperiods rows. Otherwise, forecast issues an error.
Listwise deletion reduces the sample size and can create irregular time series.
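The three-step procedure can be imitated in base MATLAB as follows. This is a sketch for two single-column data sets with made-up values; the padding mask is an assumed bookkeeping device for step 3 and does not reflect how forecast is implemented.

```matlab
% Illustrative presample data with missing values (NaN).
Y0 = [1; 2; NaN; 4; 5];     % five presample responses
E0 = [0.1; NaN; 0.3];       % three presample innovations

maxRows = max(size(Y0,1),size(E0,1));

% Step 1: prepad with zeros so that the latest observations share the last row.
pad      = @(s) [zeros(maxRows - size(s,1),1); s];
combined = [pad(Y0) pad(E0)];
rows     = (1:maxRows)';
padMask  = [rows <= maxRows - size(Y0,1), rows <= maxRows - size(E0,1)];

% Step 2: listwise deletion of rows containing at least one NaN.
keep     = ~any(isnan(combined),2);
combined = combined(keep,:);
padMask  = padMask(keep,:);

% Step 3: extract each data set and drop the prepadded zeros.
Y0clean = combined(~padMask(:,1),1);
E0clean = combined(~padMask(:,2),2);
```

In this example a NaN in either series removes the whole row, so the observations that were aligned with the missing values are discarded as well.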
forecast issues an error when any table or timetable input contains missing values.
When forecast computes the MSEs YMSE of the conditional mean forecasts Y, the function treats the specified predictor data sets as exogenous, nonstochastic, and statistically independent of the model innovations. Therefore, YMSE reflects only the variance associated with the ARIMA component of the input model Mdl.
References
[1] Baillie, Richard T., and Tim Bollerslev. “Prediction in Dynamic Models with Time-Dependent Conditional Variances.” Journal of Econometrics 52, (April 1992): 91–113. https://doi.org/10.1016/0304-4076(92)90066-Z.
[2] Bollerslev, Tim. “Generalized Autoregressive Conditional Heteroskedasticity.” Journal of Econometrics 31 (April 1986): 307–27. https://doi.org/10.1016/0304-4076(86)90063-1.
[3] Bollerslev, Tim. “A Conditionally Heteroskedastic Time Series Model for Speculative Prices and Rates of Return.” The Review of Economics and Statistics 69 (August 1987): 542–47. https://doi.org/10.2307/1925546.
[4] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.
[5] Enders, Walter. Applied Econometric Time Series. Hoboken, NJ: John Wiley & Sons, Inc., 1995.
[6] Engle, Robert. F. “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation.” Econometrica 50 (July 1982): 987–1007. https://doi.org/10.2307/1912773.
[7] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.
Version History
Introduced in R2012a
R2023b: forecast accepts input data in tables and timetables, and returns results in tables and timetables
In addition to accepting input presample and in-sample data in numeric arrays,
forecast
accepts input data in tables or regular timetables. Use
Tbl1
to supply presample data and InSample
to
provide in-sample (future) predictor data for the forecast horizon.
When you supply data in a table or timetable, the following conditions apply:
forecast chooses the default presample response series on which to operate, but you can use the optional PresampleResponseVariable name-value argument to select a different variable.
forecast returns results in a table or timetable.
Name-value arguments to support tabular workflows include:
PresampleResponseVariable specifies the variable name of the presample response paths in the input presample data Tbl1 to initialize the response series for the forecast.
PresampleInnovationVariable specifies the variable name of the innovation paths in the input presample data Tbl1 to initialize the model for the forecast.
PresampleVarianceVariable specifies the variable name of the conditional variance paths in the input presample data Tbl1 to initialize the conditional variance series for the forecast.
PresamplePredictorVariables specifies the variable names of the predictor data in the input presample data Tbl1 for the model exogenous regression component.
PredictorVariables specifies the variable names of the predictor data in the input in-sample data InSample for the model exogenous regression component in the forecast horizon.
R2019a: Univariate time series models require specification of presample response data to forecast responses
The forecast
function now has a third input argument for you to
supply presample response data.
forecast(Mdl,numperiods,Y0)
forecast(Mdl,numperiods,Y0,Name,Value)
Before R2019a, the syntaxes were:
forecast(Mdl,numperiods)
forecast(Mdl,numperiods,Name,Value)
You could optionally supply presample responses by using the 'Y0' name-value argument.
There are no plans to remove the previous syntaxes or the 'Y0'
name-value argument at this time. However, you are encouraged to supply presample responses
because, to forecast responses from a dynamic model, forecast
must
initialize models containing lagged responses. Without specified presample responses,
forecast
initializes models by using reasonable default values,
but these values might not support all workflows.
For stationary models without a regression component, all presample responses are the unconditional mean of the process, by default.
For nonstationary models or models containing a regression component, all presample responses are
0
, by default.
Update your code by specifying presample responses in the third input argument.
If you do not supply presample responses, then forecast
provides
default presample values that might not support all workflows.
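As a hedged illustration of the recommended syntax, the following sketch passes presample responses as the third input argument. The AR(1) model and simulated data are assumptions; for this model, which has no moving average component, the last Mdl.P observations are enough to initialize the forecasts.

```matlab
% Hypothetical AR(1) model and simulated data (values assumed).
Mdl = arima(AR=0.5,Constant=1,Variance=0.2);
rng(0,"twister")
y = simulate(Mdl,50);
numperiods = 10;

% Recommended: supply presample responses as the third positional input.
Ya = forecast(Mdl,numperiods,y);

% Only the latest Mdl.P responses are needed to initialize this AR model.
Yb = forecast(Mdl,numperiods,y(end-Mdl.P+1:end));

disp(max(abs(Ya - Yb)))   % difference between the two forecast paths
```

Code written before R2019a that calls forecast(Mdl,numperiods) can be updated by adding the observed responses in this position.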