Note: This page has been translated by MathWorks. Click here to see

To view all translated materials including this page, select Country from the country navigator on the bottom of this page.

To view all translated materials including this page, select Country from the country navigator on the bottom of this page.

Robust regression

`b = robustfit(X,y)`

b = robustfit(X,y,* wfun*,tune)

b = robustfit(X,y,

`wfun`

`const`

[b,stats] = robustfit(...)

`b = robustfit(X,y)`

returns
a (*p* + 1)-by-1 vector `b`

of coefficient
estimates for a robust multilinear regression of the responses in `y`

on
the predictors in `X`

. `X`

is an *n*-by-*p* matrix
of *p* predictors at each of *n* observations. `y`

is
an *n*-by-1 vector of observed responses. By default,
the algorithm uses iteratively reweighted least squares with a bisquare
weighting function.

By default, `robustfit`

adds a first column
of 1s to `X`

, corresponding to a constant term in
the model. Do not enter a column of 1s directly into `X`

.
You can change the default behavior of `robustfit`

using
the input * const*, below.

`robustfit`

treats `NaN`

s
in `X`

or `y`

as missing values,
and removes them.

`b = robustfit(X,y,`

specifies
a weighting function * wfun*,tune)

`wfun`

`tune`

is
a tuning constant that is divided into the residual vector before
computing weights.The weighting function * wfun* is one of the values described in this
table. The default value is

`'bisquare'`

.Weight Function | Description | Default Tuning Constant |
---|---|---|

`'andrews'` | `w = (abs(r)<pi) .* sin(r) ./ r` | 1.339 |

`'bisquare'` | `w = (abs(r)<1) .* (1 - r.^2).^2` (also called biweight) | 4.685 |

`'cauchy'` | `w = 1 ./ (1 + r.^2)` | 2.385 |

`'fair'` | `w = 1 ./ (1 + abs(r))` | 1.400 |

`'huber'` | `w = 1 ./ max(1, abs(r))` | 1.345 |

`'logistic'` | `w = tanh(r) ./ r` | 1.205 |

`'ols'` | Ordinary least squares (no weighting function) | None |

`'talwar'` | `w = 1 * (abs(r)<1)` | 2.795 |

`'welsch'` | `w = exp(-(r.^2))` | 2.985 |

function handle | Custom weight function that accepts a vector `r` of scaled
residuals, and returns a vector of weights the same size as
`r` | 1 |

If `tune`

is unspecified, `robustfit`

uses the default
tuning value shown in the table. The default tuning constants of
built-in weight functions give coefficient estimates that are
approximately 95% as statistically efficient as the ordinary
least-squares estimates, provided the response has a normal
distribution with no outliers. Decreasing the tuning constant
increases the downweight assigned to large residuals; increasing
the tuning constant decreases the downweight assigned to large
residuals.

The value `r`

in the weight functions is

r = resid/(tune*s*sqrt(1-h))

where `resid`

is the vector of residuals from
the previous iteration, `h`

is the vector of leverage
values from a least-squares fit, and `s`

is an estimate
of the standard deviation of the error term given by

s = MAD/0.6745

Here `MAD`

is the median absolute deviation
of the residuals from their median. The constant 0.6745 makes the
estimate unbiased for the normal distribution. If there are *p* columns
in `X`

, the smallest *p* absolute
deviations are excluded when computing the median.

`b = robustfit(X,y,`

controls
whether or not the model will include a constant term. * wfun*,tune,

`const`

`const`

`'on'`

to
include the constant term (the default), or `'off'`

to
omit it. When `const`

`'on'`

, `robustfit`

adds
a first column of 1s to `X`

and `b`

becomes
a (`const`

`'off'`

, `robustfit`

does
not alter `X`

, then `b`

is a `[b,stats] = robustfit(...)`

returns
the structure `stats`

, whose fields contain diagnostic
statistics from the regression. The fields of `stats`

are:

`ols_s`

— Sigma estimate (RMSE) from ordinary least squares`robust_s`

— Robust estimate of sigma`mad_s`

— Estimate of sigma computed using the median absolute deviation of the residuals from their median; used for scaling residuals during iterative fitting`s`

— Final estimate of sigma, the larger of`robust_s`

and a weighted average of`ols_s`

and`robust_s`

`resid`

— Residual`rstud`

— Studentized residual (see`regress`

for more information)`se`

— Standard error of coefficient estimates`covb`

— Estimated covariance matrix for coefficient estimates`coeffcorr`

— Estimated correlation of coefficient estimates`t`

— Ratio of`b`

to`se`

`p`

—*p*-values for`t`

`w`

— Vector of weights for robust fit`R`

—*R*factor in*QR*decomposition of`X`

`dfe`

— Degrees of freedom for error`h`

— Vector of leverage values for least-squares fit

The `robustfit`

function estimates the variance-covariance
matrix of the coefficient estimates using `inv(X'*X)*stats.s^2`

.
Standard errors and correlations are derived from this estimate.

[1] DuMouchel, W. H., and F. L. O'Brien. “Integrating
a Robust Option into a Multiple Regression Computing Environment.” *Computer
Science and Statistics*:* Proceedings of the
21st Symposium on the Interface*. Alexandria, VA: American
Statistical Association, 1989.

[2] Holland, P. W., and R. E. Welsch. “Robust
Regression Using Iteratively Reweighted Least-Squares.” *Communications
in Statistics: Theory and Methods*, *A6*,
1977, pp. 813–827.

[3] Huber, P. J. *Robust Statistics*.
Hoboken, NJ: John Wiley & Sons, Inc., 1981.

[4] Street, J. O., R. J. Carroll, and D. Ruppert.
“A Note on Computing Robust Regression Estimates via Iteratively
Reweighted Least Squares.” *The American Statistician*.
Vol. 42, 1988, pp. 152–154.