Understanding the stepwiselm PRemove
Daria Zhuravleva on 12 Sep 2018
I am using the following code:
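The code listing itself is not preserved here. A call consistent with the description that follows might look like this sketch; X and y are synthetic stand-ins (hypothetical data, not the poster's), and the exact arguments of the original call are an assumption. It requires the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in data (hypothetical): 200 observations, 22 features.
rng(0);
X = randn(200, 22);
y = 2*X(:,12) - X(:,2) + 0.5*X(:,20) + randn(200, 1);

% Start from a constant model, allow at most linear terms,
% use adjusted R-squared as the criterion, and print every step.
% PEnter and PRemove are not passed, so their defaults apply.
mdl1 = stepwiselm(X, y, 'constant', ...
    'Upper', 'linear', ...
    'Criterion', 'adjrsquared', ...
    'Verbose', 2);
```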
My X has 22 features. I would like to start with a constant model and track the AdjRsquared criterion values. My final model will include only linear terms, if any. Setting Verbose to 2 lets me monitor all the steps. PEnter and PRemove are left at their defaults. This is what the output looks like (short version, excluding the checks for all the features):
Change in AdjRsquared for adding x12 is 0.1313
1. Adding x12, AdjRsquared = 0.1313
Change in AdjRsquared for adding x2 is 0.048833
2. Adding x2, AdjRsquared = 0.18014
Change in AdjRsquared for adding x20 is 0.037826
3. Adding x20, AdjRsquared = 0.21796
Change in AdjRsquared for adding x21 is 0.011027
4. Adding x21, AdjRsquared = 0.22899
Change in AdjRsquared for adding x22 is 0.00093592
5. Adding x22, AdjRsquared = 0.22993
Change in AdjRsquared for removing x2 is -0.10048
Change in AdjRsquared for removing x12 is -0.043955
Change in AdjRsquared for removing x20 is -0.019522
Change in AdjRsquared for removing x21 is -0.023
mdl1 = Linear regression model: y ~ 1 + x2 + x12 + x20 + x21 + x22
Why is x2 not removed? It is said here that for the 'AdjRsquared' criterion:
- PEnter = 0, If the increase in the adjusted R-squared of the model is larger than PEnter, add the term to the model.
- PRemove = -0.05, If the increase in the adjusted R-squared value of the model is smaller than PRemove, remove the term from the model.
Since -0.10048 < -0.05, why does this not trigger the removal of x2?
Tom Lane on 13 Sep 2018
Daria, thanks for providing the data, allowing me to reproduce your results.
It looks like the documentation is confusing or just wrong.
The best model has a high adjusted R-squared. You might remove a term if doing so increases the adjusted R-squared, or decreases it just a little. So from your output, either x20 (the smallest decrease among those shown) or x22 (just added with a change of 0.00093592, so not shown) would be the possible variables to remove, rather than x2. The variable x22 does get removed if I do this next:
mdl2 = step(mdl1,'premove',.0010,'penter',1,'ver',2)
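So it looks like the PRemove value is being compared to the negative of the change shown when Verbose is 2. In other words, a term seems to be removed when the decrease in adjusted R-squared from dropping it is smaller than PRemove.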
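As a rough check of that reading, the comparison can be replayed on the numbers from the output above. This is only a sketch of the inferred rule, not the actual stepwiselm implementation. The x22 entry is not displayed in the output and is taken here as the negative of its entry gain.

```matlab
% Displayed "Change in AdjRsquared for removing ..." values (Verbose = 2).
terms = {'x2', 'x12', 'x20', 'x21', 'x22'};
% The x22 value is inferred, not displayed: minus its gain on entry.
shown = [-0.10048, -0.043955, -0.019522, -0.023, -0.00093592];

% Assumed rule: a term is a removal candidate when
% -(displayed change) < PRemove.
isCandidate = @(pRemove) (-shown) < pRemove;

defaultCandidates = terms(isCandidate(-0.05));   % documented default PRemove
customCandidates  = terms(isCandidate(0.0010));  % value used in the step call

fprintf('PRemove = -0.05  : %d candidate(s)\n', numel(defaultCandidates));
fprintf('PRemove = 0.0010 : %s\n', strjoin(customCandidates, ', '));
```

Under this reading, the default of -0.05 yields no removal candidates, and 0.0010 yields only x22. Both agree with what was observed.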
I'll try to have the documentation changed to make this clearer, or have the verbosity display changed to show the negative value.