infer

Infer univariate ARIMA or ARIMAX model residuals or conditional variances

Description

E = infer(Mdl,Y) returns the numeric array of one or more residual series E inferred from the fully specified, univariate ARIMA model Mdl and the numeric array of one or more response series Y.

[E,V] = infer(Mdl,Y) also returns the numeric array of one or more conditional variance V series when Mdl represents a composite conditional mean and variance model.

Tbl2 = infer(Mdl,Tbl1) returns the table or timetable Tbl2 containing paths of residuals and conditional variances inferred from the model Mdl and the response data in the input table or timetable Tbl1. (since R2023b)

infer selects the response variable named in Mdl.SeriesName or the sole variable in Tbl1. To select a different response variable in Tbl1 to infer residuals and conditional variances, use the ResponseVariable name-value argument.

[___] = infer(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. infer returns the output argument combination for the corresponding input arguments. For example, infer(Mdl,Y,Y0=PS,X=Pred) infers residuals from the numeric vector of responses Y with respect to the ARIMAX Mdl, and specifies the numeric vector of presample response data PS to initialize the model and the exogenous predictor data Pred for the regression component.

[___,logL] = infer(___) also returns a numeric vector containing the loglikelihood objective function values logL associated with each specified path of response data.

Examples

Infer residuals from an AR model by supplying a hypothetical response series in a vector.

Specify an AR(2) model using known parameters.

Mdl = arima(AR={0.5 -0.8},Constant=0.002, ...
	Variance=0.8);

Simulate response data with 100 observations.

rng(1,"twister");
Y = simulate(Mdl,100);

Y is a 100-by-1 vector containing a random response path drawn from Mdl.

Infer residuals for all corresponding responses.

E = infer(Mdl,Y);

E is a 100-by-1 vector containing the residuals corresponding to Y, with respect to Mdl. By default, infer backcasts for required presample observations.

Plot the residuals.

figure
plot(E)
title("Inferred Residuals")

Figure contains an axes object. The axes object with title Inferred Residuals contains an object of type line.
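As a rough cross-check of what infer computes for this model, the following sketch rebuilds the residuals by hand from the AR(2) recursion e(t) = y(t) - c - phi1*y(t-1) - phi2*y(t-2). It relies only on the fact that, for a pure AR model, residuals at times t > Mdl.P depend on in-sample responses alone. The names EManual and maxDiff are illustrative and are not part of the example above.

```matlab
% Recreate the model and data from the example above.
Mdl = arima(AR={0.5 -0.8},Constant=0.002,Variance=0.8);
rng(1,"twister");
Y = simulate(Mdl,100);
E = infer(Mdl,Y);

% Rebuild residuals by hand for t = 3,...,100 (no presample needed there).
c    = Mdl.Constant;
phi1 = Mdl.AR{1};
phi2 = Mdl.AR{2};
t = (Mdl.P + 1):numel(Y);
EManual = Y(t) - c - phi1*Y(t-1) - phi2*Y(t-2);

% Largest absolute discrepancy; expected to be at rounding-error level.
maxDiff = max(abs(E(t) - EManual))
```

The first Mdl.P residuals are excluded from the comparison because they depend on the backcasted presample.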

Infer the conditional variances from an AR(2) and GARCH(1,1) composite model. Return the loglikelihood value.

Specify an AR(2) model using known parameters. Set the variance equal to a garch model.

Mdl = arima(AR={0.8 -0.3},Constant=0);
MdlVar = garch(Constant=0.0002,GARCH=0.6,ARCH=0.2);
Mdl.Variance = MdlVar;

Simulate response data with 100 observations.

rng(1,"twister")
Y = simulate(Mdl,100);

Infer residuals and conditional variances for the entire response series. Compute the loglikelihood at the simulated data.

[E,V,logL] = infer(Mdl,Y);
logL
logL = 
209.6405

E and V are 100-by-1 vectors of inferred residuals and conditional variances, given the response data and model.

Plot the conditional variances.

figure
plot(V)
title("Inferred Conditional Variances")

Figure contains an axes object. The axes object with title Inferred Conditional Variances contains an object of type line.
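To see how the three outputs relate, the following sketch recomputes the loglikelihood from E and V by using the standard Gaussian formula, logL = -(1/2)*sum(log(2*pi*V) + E.^2./V). It assumes that infer evaluates the Gaussian loglikelihood over all in-sample observations; logLManual and relDiff are illustrative names.

```matlab
% Recreate the composite model and data from the example above.
Mdl = arima(AR={0.8 -0.3},Constant=0);
Mdl.Variance = garch(Constant=0.0002,GARCH=0.6,ARCH=0.2);
rng(1,"twister")
Y = simulate(Mdl,100);
[E,V,logL] = infer(Mdl,Y);

% Gaussian loglikelihood built from the inferred residuals and variances
% (assumption: this matches the objective that infer reports).
logLManual = -0.5*sum(log(2*pi*V) + (E.^2)./V);

% Relative discrepancy between the reported and hand-computed values.
relDiff = abs(logL - logLManual)/abs(logL)
```

If the assumption holds, relDiff is at rounding-error level. The same residuals and variances also yield standardized residuals, E./sqrt(V), which are often used for diagnostic checks.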

Infer residuals from an AR model by supplying a hypothetical response series in a vector. Supply presample responses to initialize the model.

Specify an AR(2) model using known parameters.

Mdl = arima(AR={0.5 -0.8},Constant=0.002, ...
	Variance=0.8)
Mdl = 
  arima with properties:

     Description: "ARIMA(2,0,0) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 0
        Constant: 0.002
              AR: {0.5 -0.8} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 0.8

Consider inferring residuals from a response series of length T = 100. Because the model requires Mdl.P responses to initialize the model, simulate T + Mdl.P = 102 responses from the model.

rng(1,"twister");
T = 100;
TSim = T + Mdl.P;
y = simulate(Mdl,TSim);

y is a 102-by-1 vector representing a random response path drawn from the model.

Infer residuals from the last T responses and use the first Mdl.P observations as a presample to initialize the model.

E = infer(Mdl,y((Mdl.P+1):end),Y0=y(1:Mdl.P));
size(E)
ans = 1×2

   100     1

E is a 100-by-1 vector containing the residuals corresponding to the last 100 observations of y, with respect to Mdl.

Plot the residuals.

figure
plot(E)
title("Inferred Residuals")

Figure contains an axes object. The axes object with title Inferred Residuals contains an object of type line.
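The effect of the presample can be illustrated by comparing the default backcast initialization with a supplied presample. This sketch assumes a pure AR model, for which only the first Mdl.P residuals can depend on the initialization. The names yIn, y0, EBackcast, EPresample, and maxDiffLater are illustrative.

```matlab
% Recreate the model and data from the example above.
Mdl = arima(AR={0.5 -0.8},Constant=0.002,Variance=0.8);
rng(1,"twister");
T = 100;
y = simulate(Mdl,T + Mdl.P);
y0  = y(1:Mdl.P);          % presample responses
yIn = y((Mdl.P+1):end);    % in-sample responses

% Infer residuals with backcasting (default) and with the supplied presample.
EBackcast  = infer(Mdl,yIn);
EPresample = infer(Mdl,yIn,Y0=y0);

% Residuals after the first Mdl.P observations use in-sample lags only,
% so both initializations should agree there.
later = (Mdl.P + 1):T;
maxDiffLater = max(abs(EBackcast(later) - EPresample(later)))
```

For models with MA terms, the initialization can influence later residuals as well, so this agreement is specific to pure AR models.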

Since R2023b

Fit an ARIMA(1,1,1) model to the weekly average NYSE closing prices. Supply timetables of in-sample and presample data for the fit. Then, infer the residuals from the fit.

Load Data

Load the US equity index data set Data_EquityIdx.

load Data_EquityIdx
T = height(DataTimeTable)
T = 
3028

The timetable DataTimeTable includes the time series variable NYSE, which contains daily NYSE composite closing prices from January 1990 through December 2001.

Plot the daily NYSE price series.

figure
plot(DataTimeTable.Time,DataTimeTable.NYSE)
title("NYSE Daily Closing Prices: 1990 - 2001")

Figure contains an axes object. The axes object with title NYSE Daily Closing Prices: 1990 - 2001 contains an object of type line.

Prepare Timetable for Estimation

When you plan to supply a timetable, you must ensure it has all the following characteristics:

  • The selected response variable is numeric and does not contain any missing values.

  • The timestamps in the Time variable are regular, and they are ascending or descending.

Remove all missing values from the timetable, relative to the NYSE price series.

DTT = rmmissing(DataTimeTable,DataVariables="NYSE");
T_DTT = height(DTT)
T_DTT = 
3028

Because all sample times have observed NYSE prices, rmmissing does not remove any observations.

Determine whether the sampling timestamps have a regular frequency and are sorted.

areTimestampsRegular = isregular(DTT,"days")
areTimestampsRegular = logical
   0

areTimestampsSorted = issorted(DTT.Time)
areTimestampsSorted = logical
   1

areTimestampsRegular = 0 indicates that the timestamps of DTT are irregular. areTimestampsSorted = 1 indicates that the timestamps are sorted. Business day rules make daily macroeconomic measurements irregular.

Remedy the time irregularity by computing the weekly average closing price series of all timetable variables.

DTTW = convert2weekly(DTT,Aggregation="mean");
areTimestampsRegular = isregular(DTTW,"weeks")
areTimestampsRegular = logical
   1

T_DTTW = height(DTTW)
T_DTTW = 
627

DTTW is regular.

figure
plot(DTTW.Time,DTTW.NYSE)
title("NYSE Weekly Average Closing Prices: 1990 - 2001")

Figure contains an axes object. The axes object with title NYSE Weekly Average Closing Prices: 1990 - 2001 contains an object of type line.

Create Model Template for Estimation

Suppose that an ARIMA(1,1,1) model is appropriate to model the NYSE composite series during the sample period.

Create an ARIMA(1,1,1) model template for estimation.

Mdl = arima(1,1,1);

Mdl is a partially specified arima model object.

Fit Model to Data

infer requires Mdl.P presample observations to initialize the model. infer backcasts for necessary presample responses, but you can provide a presample.

Partition the data into presample and in-sample, or estimation sample, observations.

T0 = Mdl.P;
DTTW0 = DTTW(1:T0,:);
DTTW1 = DTTW((T0+1):end,:);

Fit an ARIMA(1,1,1) model to the in-sample weekly average NYSE closing prices. Specify the response variable name, presample timetable, and the presample response variable name.

EstMdl = estimate(Mdl,DTTW1,ResponseVariable="NYSE", ...
    Presample=DTTW0,PresampleResponseVariable="NYSE");
 
    ARIMA(1,1,1) Model (Gaussian Distribution):
 
                 Value      StandardError    TStatistic      PValue   
                ________    _____________    __________    ___________

    Constant     0.83624         0.453          1.846         0.064891
    AR{1}       -0.32862       0.23526        -1.3968          0.16246
    MA{1}        0.42703       0.22613         1.8885         0.058965
    Variance      56.065        1.8433         30.416      3.3809e-203

EstMdl is a fully specified, estimated arima model object.

Infer Residuals

Infer the residuals from the fitted model and in-sample observations. Specify the response variable name, presample timetable, and the presample response variable name.

Tbl2 = infer(EstMdl,DTTW1,ResponseVariable="NYSE", ...
    Presample=DTTW0,PresampleResponseVariable="NYSE");
tail(Tbl2)
       Time         NYSE     NASDAQ    Y_Residual    Y_Variance
    ___________    ______    ______    __________    __________

    16-Nov-2001    577.11    1886.9      5.8649        56.065  
    23-Nov-2001       583    1898.3      5.3303        56.065  
    30-Nov-2001    581.41    1925.8     -2.7678        56.065  
    07-Dec-2001    584.96    1998.1      3.3787        56.065  
    14-Dec-2001    574.03      1981     -12.038        56.065  
    21-Dec-2001     582.1    1967.9      8.7774        56.065  
    28-Dec-2001    590.28    1967.2      6.2526        56.065  
    04-Jan-2002     589.8    1950.4     -1.3008        56.065  
size(Tbl2)
ans = 1×2

   625     4

Tbl2 is a 625-by-4 timetable containing all variables in DTTW1, the inferred residuals from the fit Y_Residual, and the constant variance path Y_Variance (EstMdl.Variance = 56.065).

Since R2023b

Fit an ARIMA(1,1,1) model to the weekly average NYSE closing prices. Supply a timetable of data and specify the series for the fit. Then, compute fitted responses.

Load the US equity index data set Data_EquityIdx.

load Data_EquityIdx
T = height(DataTimeTable)
T = 
3028

Remedy the time irregularity by computing the weekly average closing price series of all timetable variables.

DTTW = convert2weekly(DataTimeTable,Aggregation="mean");
T_DTTW = height(DTTW)
T_DTTW = 
627

Create an ARIMA(1,1,1) model template for estimation. Set the response series name to NYSE.

Mdl = arima(1,1,1);
Mdl.SeriesName = "NYSE";

Partition the data into presample and in-sample, or estimation sample, observations.

T0 = Mdl.P;
DTTW0 = DTTW(1:T0,:);
DTTW1 = DTTW((T0+1):end,:);

Fit an ARIMA(1,1,1) model to the in-sample weekly average NYSE closing prices. Specify the presample timetable, and the presample response variable name.

EstMdl = estimate(Mdl,DTTW1,Presample=DTTW0, ...
    PresampleResponseVariable="NYSE");
 
    ARIMA(1,1,1) Model (Gaussian Distribution):
 
                 Value      StandardError    TStatistic      PValue   
                ________    _____________    __________    ___________

    Constant     0.83624         0.453          1.846         0.064891
    AR{1}       -0.32862       0.23526        -1.3968          0.16246
    MA{1}        0.42703       0.22613         1.8885         0.058965
    Variance      56.065        1.8433         30.416      3.3809e-203

Infer the residuals from the fitted model and in-sample observations. Specify the presample timetable, and the presample response variable name.

Tbl2 = infer(EstMdl,DTTW1,Presample=DTTW0, ...
    PresampleResponseVariable="NYSE");

Compute fitted response values by subtracting the residuals from the observed response series.

Tbl2.YHat = Tbl2.NYSE - Tbl2.NYSE_Residual;

Plot the observed responses and the fitted values.

figure
plot(Tbl2.Time,[Tbl2.NYSE Tbl2.YHat])
legend("Observations","Fitted values")
title("NYSE Weekly Average Price Series")

Figure contains an axes object. The axes object with title NYSE Weekly Average Price Series contains 2 objects of type line. These objects represent Observations, Fitted values.

The fitted values closely track the observations.

Plot the residuals versus the fitted values.

figure
plot(Tbl2.YHat,Tbl2.NYSE_Residual,".",MarkerSize=15)
ylabel("Residuals")
xlabel("Fitted Values")
title("Residual Plot")

Figure contains an axes object. The axes object with title Residual Plot, xlabel Fitted Values, ylabel Residuals contains a line object which displays its values using only markers.

Residual variance appears larger for larger fitted values. One remedy for this behavior is to apply the log transform to the data.
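The log-transform remedy can be sketched as follows. The variable name LogNYSE and the partition names DTTW0 and DTTW1 are illustrative choices, and the sketch makes no claim about how much the transform improves the residual plot for this data set.

```matlab
% Load the data and aggregate to a regular weekly series, as in the example.
load Data_EquityIdx
DTTW = convert2weekly(DataTimeTable,Aggregation="mean");

% Add the log price series as a new timetable variable (hypothetical name).
DTTW.LogNYSE = log(DTTW.NYSE);

% Model template that selects the log series by name.
Mdl = arima(1,1,1);
Mdl.SeriesName = "LogNYSE";

% Partition into presample and in-sample observations.
T0 = Mdl.P;
DTTW0 = DTTW(1:T0,:);
DTTW1 = DTTW((T0+1):end,:);

% Fit the model and infer residuals on the log scale.
EstMdl = estimate(Mdl,DTTW1,Presample=DTTW0, ...
    PresampleResponseVariable="LogNYSE",Display="off");
Tbl2 = infer(EstMdl,DTTW1,Presample=DTTW0, ...
    PresampleResponseVariable="LogNYSE");

% Residuals versus fitted values on the log scale.
Tbl2.YHat = Tbl2.LogNYSE - Tbl2.LogNYSE_Residual;
figure
plot(Tbl2.YHat,Tbl2.LogNYSE_Residual,".",MarkerSize=15)
ylabel("Residuals")
xlabel("Fitted Values")
title("Residual Plot (Log Scale)")
```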

Infer residuals from an ARMAX model.

Specify an ARMA(1,2) model using known parameters for the response (MdlY) and an AR(1) model for the predictor data (MdlX).

MdlY = arima(AR=0.2,MA={-0.1,0.6},Constant=1, ...
    Variance=2,Beta=3)
MdlY = 
  arima with properties:

     Description: "ARIMAX(1,0,2) Model (Gaussian Distribution)"
      SeriesName: "Y"
    Distribution: Name = "Gaussian"
               P: 1
               D: 0
               Q: 2
        Constant: 1
              AR: {0.2} at lag [1]
             SAR: {}
              MA: {-0.1 0.6} at lags [1 2]
             SMA: {}
     Seasonality: 0
            Beta: [3]
        Variance: 2
MdlX = arima(AR=0.3,Constant=0,Variance=1);

If you do not specify presample responses, infer requires at least T + MdlY.P predictor observations to infer residuals from a response series of length T.

Consider simulating a response series of length 100. Simulate a predictor series of length 101, and then simulate the response series. Provide the predictor data to simulate for the exogenous regression component.

rng(1,"twister") % For reproducibility
T = 100;
Pred = simulate(MdlX,T + MdlY.P);
Y = simulate(MdlY,T,X=Pred);

Infer residuals using the entire series.

E = infer(MdlY,Y,X=Pred);
figure
plot(E)
title("Inferred Residuals") 

Figure contains an axes object. The axes object with title Inferred Residuals contains an object of type line.

Input Arguments

Fully specified ARIMA model, specified as an arima model object created by arima or estimate.

The properties of Mdl cannot contain NaN values.

Response data yt, specified as a numobs-by-1 numeric column vector or numobs-by-numpaths numeric matrix. numobs is the length of the time series (sample size). numpaths is the number of separate, independent paths of response series.

infer infers the residuals and conditional variances of columns of Y, which are time series characterized by Mdl. Y is the continuation of the presample series Y0.

Each row corresponds to a sampling time. The last row contains the latest set of observations.

Each column corresponds to a separate, independent path of response data. infer assumes that responses across any row occur simultaneously.

Data Types: double

Since R2023b

Time series data containing the observed response variable yt and, optionally, predictor variables xt for the exogenous regression component, specified as a table or timetable with numvars variables and numobs rows. You can optionally select the response variable or numpreds predictor variables by using the ResponseVariable or PredictorVariables name-value arguments, respectively.

Each row is an observation, and measurements in each row occur simultaneously. The selected response variable is a single path (numobs-by-1 vector) or multiple paths (numobs-by-numpaths matrix) of numobs observations of response data.

Each path (column) of the selected response variable is independent of the other paths, but path j of all presample and in-sample variables correspond, for j = 1,…,numpaths. Each selected predictor variable is a numobs-by-1 numeric vector representing one path. The infer function includes all predictor variables in the model when it infers residuals and conditional variances. Variables in Tbl1 represent the continuation of corresponding variables in Presample.

If Tbl1 is a timetable, it must represent a sample with a regular datetime time step (see isregular), and the datetime vector Tbl1.Time must be strictly ascending or descending.

If Tbl1 is a table, the last row contains the latest observation.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: infer(Mdl,Y,Y0=PS,X=Pred) infers residuals from the numeric vector of responses Y through the ARIMAX Mdl, and specifies the numeric vector of presample response data PS to initialize the model and the exogenous predictor data Pred for the regression component.

Since R2023b

Response variable yt to select from Tbl1 containing the response data, specified as one of the following data types:

  • String scalar or character vector containing a variable name in Tbl1.Properties.VariableNames

  • Variable index (positive integer) to select from Tbl1.Properties.VariableNames

  • A logical vector, where ResponseVariable(j) = true selects variable j from Tbl1.Properties.VariableNames

The selected variable must be a numeric vector and cannot contain missing values (NaNs).

If Tbl1 has one variable, the default specifies that variable. Otherwise, the default matches the variable to names in Mdl.SeriesName.

Example: ResponseVariable="StockRate"

Example: ResponseVariable=[false false true false] or ResponseVariable=3 selects the third table variable as the response variable.

Data Types: double | logical | char | cell | string

Presample response data yt to initialize the model, specified as a numpreobs-by-1 numeric column vector or a numpreobs-by-numprepaths numeric matrix. Use Y0 only when you supply the numeric array of response data Y.

numpreobs is the number of presample observations. numprepaths is the number of presample response paths.

Each row is a presample observation (sampling time), and measurements in each row occur simultaneously. The last row contains the latest presample observation. numpreobs must be at least Mdl.P to initialize the AR model component. If numpreobs > Mdl.P, infer uses the latest required number of observations only.

Columns of Y0 are separate, independent presample paths. The following conditions apply:

  • If Y0 is a column vector, it represents a single response path. infer applies it to each output path.

  • If Y0 is a matrix, each column represents a presample response path. infer applies Y0(:,j) to initialize path j. numprepaths must be at least numpaths. If numprepaths > numpaths, infer uses the first size(Y,2) columns only.

By default, infer backcasts to obtain the necessary observations.

Data Types: double
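The two ways of supplying Y0 for multiple paths can be sketched as follows, assuming a simple AR(2) model with known parameters. The names numPaths, Y0Matrix, Y0Vector, EMatrix, and EVector are illustrative.

```matlab
% AR(2) model with known parameters.
Mdl = arima(AR={0.5 -0.8},Constant=0.002,Variance=0.8);

% Simulate three independent paths, including Mdl.P extra rows for the presample.
rng(1,"twister")
numPaths = 3;
T = 100;
Y = simulate(Mdl,T + Mdl.P,NumPaths=numPaths);
Y0Matrix = Y(1:Mdl.P,:);        % one presample path per response path
YIn      = Y((Mdl.P+1):end,:);  % in-sample responses

% Matrix presample: column j initializes path j.
[EMatrix,~,logL] = infer(Mdl,YIn,Y0=Y0Matrix);

% Column-vector presample: the same presample initializes every path.
Y0Vector = Y0Matrix(:,1);
EVector = infer(Mdl,YIn,Y0=Y0Vector);

size(EMatrix)   % one residual column per response path
numel(logL)     % one loglikelihood value per path
```

Path 1 receives the same presample in both calls, so EMatrix(:,1) and EVector(:,1) coincide. The other paths can differ in their first Mdl.P residuals.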

Presample residual data et to initialize the model, specified as a numpreobs-by-1 numeric column vector or a numpreobs-by-numprepaths numeric matrix. Use E0 only when you supply the numeric array of response data Y.

Each row is a presample observation (sampling time), and measurements in each row occur simultaneously. The last row contains the latest presample observation. numpreobs must be at least Mdl.Q to initialize the MA model component. If Mdl.Variance is a conditional variance model (for example, a garch model object), infer can require more rows than Mdl.Q. If numpreobs is larger than required, infer uses the latest required number of observations only.

Columns of E0 are separate, independent presample paths. The following conditions apply:

  • If E0 is a column vector, it represents a single residual path. infer applies it to each output path.

  • If E0 is a matrix, each column represents a presample residual path. infer applies E0(:,j) to initialize path j. numprepaths must be at least numpaths. If numprepaths > numpaths, infer uses the first size(Y,2) columns only.

  • infer assumes each column of E0 has a mean of zero.

By default, infer sets the necessary presample disturbances to zero.

Data Types: double

Presample conditional variances σt² to initialize the conditional variance model, specified as a numpreobs-by-1 positive numeric column vector or a numpreobs-by-numprepaths positive numeric matrix. If the conditional variance Mdl.Variance is constant, infer ignores V0. Use V0 only when you supply the numeric array of response data Y.

Each row is a presample observation (sampling time), and measurements in each row occur simultaneously. The last row contains the latest presample observation. numpreobs must be at least Mdl.Q to initialize the conditional variance model in Mdl.Variance. For details, see the infer function of conditional variance models. If numpreobs is larger than required, infer uses the latest required number of observations only.

Columns of V0 are separate, independent presample paths. The following conditions apply:

  • If V0 is a column vector, it represents a single path of conditional variances. infer applies it to each output path.

  • If V0 is a matrix, each column represents a presample path of conditional variances. infer applies V0(:,j) to initialize path j. numprepaths must be at least numpaths. If numprepaths > numpaths, infer uses the first size(Y,2) columns only.

By default, infer sets all necessary presample conditional variances to the average squared value of inferred residuals.

Data Types: double
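A minimal sketch of supplying all three numeric presamples to a composite model follows. It assumes an AR(2) mean model with a GARCH(1,1) variance model, so that two presample responses, one presample residual, and one presample conditional variance suffice. Setting V0 to the unconditional GARCH(1,1) variance, Constant/(1 - GARCH - ARCH), is an illustrative choice, not the default that infer uses.

```matlab
% Composite AR(2) and GARCH(1,1) model with known parameters.
Mdl = arima(AR={0.8 -0.3},Constant=0);
Mdl.Variance = garch(Constant=0.0002,GARCH=0.6,ARCH=0.2);

% Simulate T in-sample responses plus Mdl.P presample responses.
rng(1,"twister")
T = 100;
y  = simulate(Mdl,T + Mdl.P);
Y0 = y(1:Mdl.P);
Y  = y((Mdl.P+1):end);

% Presample residual and conditional variance (illustrative values).
E0 = 0;                          % zero presample residual
V0 = 0.0002/(1 - 0.6 - 0.2);     % unconditional variance of the GARCH(1,1) process

% The presample arrays have different lengths; infer aligns them on their last rows.
[E,V] = infer(Mdl,Y,Y0=Y0,E0=E0,V0=V0);
size(V)
```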

Since R2023b

Presample data containing paths of response yt, residual et, or conditional variance σt² series to initialize the model, specified as a table or timetable, the same type as Tbl1, with numprevars variables and numpreobs rows. Use Presample only when you supply a table or timetable of data Tbl1.

Each selected variable is a single path (numpreobs-by-1 vector) or multiple paths (numpreobs-by-numprepaths matrix) of numpreobs observations representing the presample of the response, residual, or conditional variance series for ResponseVariable, the selected response variable in Tbl1.

Each row is a presample observation, and measurements in each row occur simultaneously. numpreobs must be one of the following values:

  • At least Mdl.P when Presample provides only presample responses

  • At least Mdl.Q when Presample provides only presample disturbances or conditional variances

  • At least max([Mdl.P Mdl.Q]) otherwise

When Mdl.Variance is a conditional variance model, infer can require more than the minimum required number of presample values.

If you supply more rows than necessary, infer uses the latest required number of observations only.

When Presample provides presample residuals, infer assumes each presample residual path has a mean of zero.

If Presample is a timetable, all the following conditions must be true:

  • Presample must represent a sample with a regular datetime time step (see isregular).

  • The inputs Tbl1 and Presample must be consistent in time such that Presample immediately precedes Tbl1 with respect to the sampling frequency and order.

  • The datetime vector of sample timestamps Presample.Time must be ascending or descending.

If Presample is a table, the last row contains the latest presample observation.

By default:

  • When Mdl is a model without an exogenous linear regression component (ARIMAX), infer backcasts for necessary presample responses, sets necessary presample residuals to 0, and sets necessary presample variances to the average squared value of inferred residuals.

  • When Mdl is an ARIMAX model (you specify the PredictorVariables name-value argument), you must specify presample response data, but infer sets necessary presample residuals to 0 and sets necessary presample variances to the average squared value of inferred residuals.

If you specify Presample, you must specify the presample response, residual, or conditional variance variable name by using the PresampleResponseVariable, PresampleInnovationVariable, or PresampleVarianceVariable name-value argument.

Since R2023b

Response variable yt to select from Presample containing presample response data, specified as one of the following data types:

  • String scalar or character vector containing a variable name in Presample.Properties.VariableNames

  • Variable index (positive integer) to select from Presample.Properties.VariableNames

  • A logical vector, where PresampleResponseVariable(j) = true selects variable j from Presample.Properties.VariableNames

The selected variable must be a numeric matrix and cannot contain missing values (NaNs).

If you specify presample response data by using the Presample name-value argument, you must specify PresampleResponseVariable.

Example: PresampleResponseVariable="Stock0"

Example: PresampleResponseVariable=[false false true false] or PresampleResponseVariable=3 selects the third table variable as the presample response variable.

Data Types: double | logical | char | cell | string

Since R2023b

Presample residual variable et to select from Presample containing presample residual data, specified as one of the following data types:

  • String scalar or character vector containing a variable name in Presample.Properties.VariableNames

  • Variable index (positive integer) to select from Presample.Properties.VariableNames

  • A logical vector, where PresampleInnovationVariable(j) = true selects variable j from Presample.Properties.VariableNames

The selected variable must be a numeric matrix and cannot contain missing values (NaNs).

If you specify presample residual data by using the Presample name-value argument, you must specify PresampleInnovationVariable.

Example: PresampleInnovationVariable="StockRateDist0"

Example: PresampleInnovationVariable=[false false true false] or PresampleInnovationVariable=3 selects the third table variable as the presample innovation variable.

Data Types: double | logical | char | cell | string

Since R2023b

Conditional variance variable σt² to select from Presample containing presample conditional variance data, specified as one of the following data types:

  • String scalar or character vector containing a variable name in Presample.Properties.VariableNames

  • Variable index (positive integer) to select from Presample.Properties.VariableNames

  • A logical vector, where PresampleVarianceVariable(j) = true selects variable j from Presample.Properties.VariableNames

The selected variable must be a numeric vector and cannot contain missing values (NaNs).

If you specify presample conditional variance data by using the Presample name-value argument, you must specify PresampleVarianceVariable.

Example: PresampleVarianceVariable="StockRateVar0"

Example: PresampleVarianceVariable=[false false true false] or PresampleVarianceVariable=3 selects the third table variable as the presample conditional variance variable.

Data Types: double | logical | char | cell | string

Exogenous predictor data for the model regression component, specified as a numeric matrix with numpreds columns. numpreds is the number of predictor variables (numel(Mdl.Beta)). Use X only when you supply the numeric array of response data Y.

If you do not specify Y0, the number of rows of X must be at least numobs + Mdl.P. Otherwise, the number of rows of X must be at least numobs. If the number of rows of X exceeds the number necessary, infer uses only the latest observations. infer does not use the regression component in the presample period.

Columns of X are separate predictor variables.

infer applies X to each path; that is, X represents one path of observed predictors.

By default, infer excludes the regression component, regardless of its presence in Mdl.

Data Types: double
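The row requirements for X can be sketched with an ARMAX model that has known parameters. Pred, XIn, E1, and E2 are illustrative names, and the predictor is drawn as Gaussian white noise purely for convenience.

```matlab
% ARMAX(1,2) model with one predictor and known parameters.
MdlY = arima(AR=0.2,MA={-0.1,0.6},Constant=1,Variance=2,Beta=3);

% Simulate T responses; supply T + MdlY.P predictor rows.
rng(1,"twister")
T = 100;
P = MdlY.P;
Pred = randn(T + P,1);
Y = simulate(MdlY,T,X=Pred);

% Case 1: no Y0. infer backcasts, so X needs at least T + P rows.
E1 = infer(MdlY,Y,X=Pred);

% Case 2: Y0 supplied. X needs only as many rows as in-sample responses.
Y0  = Y(1:P);
YIn = Y((P+1):end);
XIn = Pred((end - numel(YIn) + 1):end,:);   % latest rows, aligned with YIn
E2 = infer(MdlY,YIn,Y0=Y0,X=XIn);

size(E1)
size(E2)
```

In both cases infer aligns X with the responses from the last row backward, which is why trimming from the top of Pred keeps the series synchronized.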

Since R2023b

Exogenous predictor variables xt to select from Tbl1 containing the predictor data for the model regression component, specified as one of the following data types:

  • String vector or cell vector of character vectors containing numpreds variable names in Tbl1.Properties.VariableNames

  • A vector of unique indices (positive integers) of variables to select from Tbl1.Properties.VariableNames

  • A logical vector, where PredictorVariables(j) = true selects variable j from Tbl1.Properties.VariableNames

The selected variables must be numeric vectors and cannot contain missing values (NaNs).

If you specify PredictorVariables, you must also specify presample response data by using the Presample and PresampleResponseVariable name-value arguments. For more details, see Algorithms.

By default, infer excludes the regression component, regardless of its presence in Mdl.

Example: PredictorVariables=["M1SL" "TB3MS" "UNRATE"]

Example: PredictorVariables=[true false true false] or PredictorVariables=[1 3] selects the first and third table variables to supply the predictor data.

Data Types: double | logical | char | cell | string

Note

  • NaN values in Y, X, Y0, E0, and V0 indicate missing values. infer removes missing values from specified data by list-wise deletion.

    • For the presample, infer horizontally concatenates the possibly jagged arrays Y0, E0, and V0 with respect to the last rows, and then it removes any row of the concatenated matrix containing at least one NaN.

    • For in-sample data, infer horizontally concatenates the possibly jagged arrays Y and X, and then it removes any row of the concatenated matrix containing at least one NaN.

    This type of data reduction reduces the effective sample size and can create an irregular time series.

  • For numeric data inputs, infer assumes that you synchronize the presample data such that the latest observations occur simultaneously.

  • infer issues an error when any table or timetable input contains missing values.

Output Arguments

Inferred residual paths et, returned as a numobs-by-numpaths numeric matrix. infer returns E only when you supply the input Y.

E(j,k) is the path k residual of time j; it is the residual associated with response Y(j,k).

Inferred conditional variance paths σt², returned as a numobs-by-numpaths numeric matrix. infer returns V only when you supply the input Y.

V(j,k) is the path k conditional variance of time j; it is the conditional variance associated with response Y(j,k).

Since R2023b

Inferred residual et and conditional variance σt² paths, returned as a table or timetable, the same data type as Tbl1. infer returns Tbl2 only when you supply the input Tbl1.

Tbl2 contains the following variables:

  • The inferred residual paths, which are in a numobs-by-numpaths numeric matrix, with rows representing observations and columns representing independent paths. Each path corresponds to the input response path in Tbl1 and represents the continuation of the corresponding presample residual path in Presample. infer names the inferred residual variable in Tbl2 responseName_Residual, where responseName is Mdl.SeriesName. For example, if Mdl.SeriesName is StockReturns, Tbl2 contains a variable for the corresponding inferred innovations paths with the name StockReturns_Residual.

  • The inferred conditional variance paths, which are in a numobs-by-numpaths numeric matrix, with rows representing observations and columns representing independent paths. Each path represents the continuation of the corresponding path of presample conditional variances in Presample. infer names the inferred conditional variance variable in Tbl2 responseName_Variance, where responseName is Mdl.SeriesName. For example, if Mdl.SeriesName is StockReturns, Tbl2 contains a variable for the corresponding inferred conditional variance paths with the name StockReturns_Variance.

  • All variables in Tbl1.

If Tbl1 is a timetable, row times of Tbl1 and Tbl2 are equal.
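The naming rule for the output variables can be sketched with a small table input (table and timetable inputs require R2023b or later). The series name StockReturns mirrors the illustration above, and the simulated data is hypothetical.

```matlab
% AR(2) model with known parameters and a custom series name.
Mdl = arima(AR={0.5 -0.8},Constant=0.002,Variance=0.8);
Mdl.SeriesName = "StockReturns";

% Simulate a response path and store it in a one-variable table.
rng(1,"twister")
Y = simulate(Mdl,100);
Tbl1 = table(Y,VariableNames="StockReturns");

% Infer residuals and conditional variances from the table.
Tbl2 = infer(Mdl,Tbl1);

% List the variables in the output table.
names = string(Tbl2.Properties.VariableNames)
```

Under the rule described above, names should include StockReturns_Residual and StockReturns_Variance in addition to the original StockReturns variable.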

Loglikelihood objective function values associated with the model Mdl, returned as a numeric scalar or vector of length numpaths.

If Y is a vector, then logL is a scalar. Otherwise, logL is a vector of length size(Y,2), and each element is the loglikelihood of the corresponding column (or path) in Y.

Algorithms

If you supply data in the table or timetable Tbl1 to infer residuals from an ARIMAX model, infer cannot backcast for presample responses. Therefore, if you specify PredictorVariables, you must also specify presample response data by using the Presample and PresampleResponseVariable name-value arguments.
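A minimal sketch of this requirement, assuming plain tables (not timetables) and an ARMAX model with known parameters, follows. The table variable names Y and X and the names Tbl0 and Pred are illustrative.

```matlab
% ARMAX(1,2) model with one predictor and known parameters.
MdlY = arima(AR=0.2,MA={-0.1,0.6},Constant=1,Variance=2,Beta=3);

% Simulate T responses driven by a hypothetical predictor series.
rng(1,"twister")
T = 100;
P = MdlY.P;
Pred = randn(T + P,1);
Y = simulate(MdlY,T,X=Pred);
X = Pred((end - T + 1):end);     % predictor rows aligned with Y

% The first P responses form the presample; the rest are in-sample data.
Tbl0 = table(Y(1:P),VariableNames="Y");
Tbl1 = table(Y((P+1):end),X((P+1):end),VariableNames=["Y" "X"]);

% Because PredictorVariables is specified, presample responses are required.
Tbl2 = infer(MdlY,Tbl1,ResponseVariable="Y",PredictorVariables="X", ...
    Presample=Tbl0,PresampleResponseVariable="Y");
head(Tbl2)
```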

References

[1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Enders, W. Applied Econometric Time Series. Hoboken, NJ: John Wiley & Sons, 1995.

[3] Hamilton, J. D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.

Version History

Introduced in R2012a
