Remove terms from generalized linear model
mdl1 = removeTerms(mdl,terms)
Terms to remove from the model mdl.
Generalized linear model, the same as mdl but with the specified terms removed.
This example makes a model using two predictors, then removes one.
Generate artificial data for the model: Poisson random numbers with two underlying predictors.
rng('default') % for reproducibility
rndvars = randn(100,2);
X = [2+rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);
Create a generalized linear regression model of Poisson data.
mdl = fitglm(X,y,'y ~ x1 + x2','distr','poisson')
mdl = 
Generalized linear regression model:
    log(y) ~ 1 + x1 + x2
    Distribution = Poisson

Estimated Coefficients:
                   Estimate       SE        tStat     pValue
                   ________    _________    ______    ______
    (Intercept)    1.0405      0.022122     47.034    0
    x1             0.9968      0.003362     296.49    0
    x2             1.987       0.0063433    313.24    0

100 observations, 97 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 2.95e+05, p-value = 0
Remove the second predictor from the model.
mdl1 = removeTerms(mdl,'x2')
mdl1 = 
Generalized linear regression model:
    log(y) ~ 1 + x1
    Distribution = Poisson

Estimated Coefficients:
                   Estimate       SE        tStat     pValue
                   ________    _________    ______    ______
    (Intercept)    2.7784      0.014043     197.85    0
    x1             1.1732      0.0033653    348.6     0

100 observations, 98 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 1.25e+05, p-value = 0
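One way to confirm what the removal did is to compare the two models programmatically. The following is a minimal sketch that repeats the data generation from this example and then inspects the CoefficientNames, DFE, and Deviance properties of both models. It spells out the 'Distribution' name in full, and the choice of properties to compare is only an illustration.

```matlab
% Recreate the data and models from the example above.
rng('default') % for reproducibility
rndvars = randn(100,2);
X = [2+rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);

mdl  = fitglm(X,y,'y ~ x1 + x2','Distribution','poisson');
mdl1 = removeTerms(mdl,'x2');

% Coefficient names before and after removing x2.
disp(mdl.CoefficientNames)
disp(mdl1.CoefficientNames)

% The reduced model gains one error degree of freedom, and its
% deviance cannot be smaller than that of the full model.
fprintf('DFE:      full = %d, reduced = %d\n',mdl.DFE,mdl1.DFE)
fprintf('Deviance: full = %g, reduced = %g\n',mdl.Deviance,mdl1.Deviance)
```

The original model mdl is left unchanged; removeTerms returns a new model object.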
Wilkinson notation describes the terms present in a model. The notation relates to the terms present in a model, not to the multipliers (coefficients) of those terms.
Wilkinson notation uses these symbols:
+ means include the next variable.
- means do not include the next variable.
: defines an interaction, which is a product of terms.
* defines an interaction and all lower-order terms.
^ raises the predictor to a power, exactly as in * repeated, so ^ includes lower-order terms as well.
() groups terms.
This table shows typical examples of Wilkinson notation.
|Wilkinson Notation||Term in Standard Notation|
|1||Constant (intercept) term|
|-x2||Do not include x2|
Statistics and Machine Learning Toolbox™ notation always includes a constant term unless you explicitly remove the term by using -1.
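The effect of these symbols can be checked by fitting a few formulas and listing the resulting coefficient names. The sketch below reuses the artificial Poisson data from the example on this page; the particular formulas are chosen only for illustration.

```matlab
% Artificial Poisson data with two predictors, as in the example.
rng('default') % for reproducibility
rndvars = randn(100,2);
X = [2+rndvars(:,1),rndvars(:,2)];
y = poissrnd(exp(1 + X*[1;2]));

% x1*x2 expands to x1 + x2 + x1:x2, plus the implicit constant term.
mdlInt = fitglm(X,y,'y ~ x1*x2','Distribution','poisson');
disp(mdlInt.CoefficientNames)

% Remove only the interaction term; the main effects remain.
mdlMain = removeTerms(mdlInt,'x1:x2');
disp(mdlMain.CoefficientNames)

% Adding -1 to the formula drops the constant (intercept) term.
mdlNoConst = fitglm(X,y,'y ~ x1 + x2 - 1','Distribution','poisson');
disp(mdlNoConst.CoefficientNames)
```

The formula names terms only; the fit estimates the coefficient attached to each term.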
For more details, see Wilkinson Notation.
For details, see Wilkinson and Rogers (1973).
removeTerms treats a categorical predictor as follows:
A model with a categorical predictor that has L levels (categories) includes L – 1 indicator variables. The model uses the first category as a reference level, so it does not include the indicator variable for the reference level. If the data type of the categorical predictor is categorical, then you can check the order of categories by using categories and reorder the categories by using reordercats to customize the reference level.
removeTerms treats the group of L – 1 indicator variables as a single variable. If you want to treat the indicator variables as distinct predictor variables, create indicator variables manually by using dummyvar. Then use the indicator variables, except the one corresponding to the reference level of the categorical variable, when you fit a model. For the categorical predictor X, if you specify all columns of dummyvar(X) and an intercept term as predictors, then the design matrix becomes rank deficient.
Interaction terms between a continuous predictor and a categorical predictor with L levels consist of the element-wise product of the L – 1 indicator variables with the continuous predictor.
Interaction terms between two categorical predictors with L and M levels consist of the (L – 1)*(M – 1) indicator variables to include all possible combinations of the two categorical predictor levels.
You cannot specify higher-order terms for a categorical predictor because the square of an indicator is equal to itself.
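The points above about reference levels, grouped indicator variables, and rank deficiency can be illustrated with a small sketch. All data and variable names here (x, g, y, tbl) are hypothetical and invented for the illustration, and the model uses the default normal distribution to keep the example short.

```matlab
rng('default') % for reproducibility

% Hypothetical data: one continuous predictor x and one categorical
% predictor g with L = 3 levels.
n = 60;
x = randn(n,1);
g = categorical(repmat({'a';'b';'c'},n/3,1));
y = 1 + 2*x + (g=='b') - (g=='c') + 0.1*randn(n,1);
tbl = table(x,g,y);

% The first category is the reference level. reordercats changes the
% order, and therefore the reference level.
disp(categories(g))
g2 = reordercats(g,{'b','a','c'});
disp(categories(g2))

% The fitted model has L - 1 = 2 indicator coefficients for g.
mdl = fitglm(tbl,'y ~ x + g');
disp(mdl.CoefficientNames)

% removeTerms drops both indicator variables together.
mdl1 = removeTerms(mdl,'g');
disp(mdl1.CoefficientNames)

% dummyvar returns one column per level. All L columns plus an
% intercept column are linearly dependent.
D = dummyvar(g);
Xfull = [ones(n,1) D];
fprintf('rank %d with %d columns\n',rank(Xfull),size(Xfull,2))

% Omitting the reference-level column restores full column rank.
Xref = [ones(n,1) D(:,2:end)];
fprintf('rank %d with %d columns\n',rank(Xref),size(Xref,2))
```

To remove a single indicator rather than the whole group, the indicator columns from dummyvar would need to be passed as separate predictors when the model is fit.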
step adds or removes terms from a model using a greedy one-step algorithm.
Wilkinson, G. N., and C. E. Rogers. Symbolic description of factorial models for analysis of variance. J. Royal Statistical Society 22, pp. 392–399, 1973.