version (3.36 KB) by Antonio Trujillo-Ortiz
Prediction error sum of squares.


Updated 03 Sep 2013


This m-file returns a useful residual scaling, the prediction error sum of squares (PRESS). To calculate PRESS, select an observation i. Fit the regression model to the remaining n-1 observations and use the fitted equation to predict the withheld observation y_i. Denoting this predicted value by ye_(i), the prediction error for point i is e_(i) = y_i - ye_(i). This prediction error is often called the ith PRESS residual. The procedure is repeated for each observation i = 1,2,...,n, producing a set of n PRESS residuals e_(1),e_(2),...,e_(n). The PRESS statistic is then defined as the sum of squares of the n PRESS residuals:

PRESS = Sum_{i=1}^{n} e_(i)^2 = Sum_{i=1}^{n} [y_i - ye_(i)]^2

Thus PRESS uses each possible subset of n-1 observations as an estimation data set, and every observation in turn is used to form a prediction data set. This m-file is built on this statistical approach.
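The leave-one-out procedure described above can be sketched in a few lines of MATLAB. This is a minimal illustration, not the code of the m-file itself: the data values are made up, the variable names (A, keep, ePress) are hypothetical, and adding an intercept column to the design matrix is an assumption.

```matlab
% Illustrative data only (made-up values), laid out as D = [X Y].
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5];     % independent variables
Y = [3.1; 2.9; 7.2; 6.8; 11.1; 10.9];   % dependent variable
D = [X Y];

n = size(D, 1);
A = [ones(n, 1) D(:, 1:end-1)];         % design matrix (intercept column is an assumption)
y = D(:, end);

ePress = zeros(n, 1);                   % PRESS residuals e_(i)
for i = 1:n
    keep = true(n, 1);
    keep(i) = false;                    % withhold observation i
    b = A(keep, :) \ y(keep);           % least squares fit to the other n-1 observations
    ePress(i) = y(i) - A(i, :) * b;     % e_(i) = y_i - ye_(i)
end

PRESS = sum(ePress .^ 2);               % sum of squared PRESS residuals
```

Each pass through the loop refits the model, so the cost grows with n; this is the direct, definition-based route to the statistic.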

Although calculating PRESS as described above requires fitting n different regressions, it is also possible to calculate it from the results of a single least squares fit to all n observations. It turns out that the ith PRESS residual is

e_(i) = e_i/(1 - h_ii)

Thus, because PRESS is just the sum of the squares of the PRESS residuals, a simple computing formula is

PRESS = Sum_{i=1}^{n} [e_i/(1 - h_ii)]^2

It is easy to see that the PRESS residual is just the ordinary residual e_i weighted according to h_ii, the ith diagonal element of the hat matrix. For interested readers, this second approach is also indicated in the m-file, in an inactive form.
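As a rough sketch of the single-fit formula, the snippet below computes the ordinary residuals and the hat matrix diagonal once and rescales. It is an illustration under stated assumptions rather than the author's code: the data are made up, the names A, H, h and ePress are hypothetical, and an intercept column is assumed.

```matlab
% Illustrative data only (made-up values).
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5];     % independent variables
Y = [3.1; 2.9; 7.2; 6.8; 11.1; 10.9];   % dependent variable

n = size(X, 1);
A = [ones(n, 1) X];                     % design matrix (intercept column is an assumption)

b = A \ Y;                              % single least squares fit to all n observations
e = Y - A * b;                          % ordinary residuals e_i
H = A * ((A' * A) \ A');                % hat matrix, formed explicitly for clarity
h = diag(H);                            % leverages h_ii

ePress = e ./ (1 - h);                  % PRESS residuals e_(i) = e_i/(1 - h_ii)
PRESS = sum(ePress .^ 2);
```

Forming the full n-by-n hat matrix is wasteful for large n; it is done here only so that each symbol in the formula has a visible counterpart.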

Data points for which h_ii is large will have large PRESS residuals. These observations will generally be high influence points. In general, a large difference between the ordinary residual and the PRESS residual indicates a point where the model fits the data well, but a model built without that point predicts poorly.
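To make the effect of a large h_ii concrete, the following sketch fits a straight line to made-up data in which the last point lies far from the others in x. The data, the variable names, and the 2p/n leverage cutoff (a common rule of thumb, not something stated in this description) are all assumptions for illustration.

```matlab
% Illustrative data only: the last x value is far from the rest.
x = [1; 2; 3; 4; 5; 20];
y = [2.1; 3.9; 6.2; 7.8; 10.1; 30.0];

n = numel(x);
A = [ones(n, 1) x];                     % straight-line model with intercept (an assumption)
p = size(A, 2);                         % number of fitted parameters

b = A \ y;
e = y - A * b;                          % ordinary residuals e_i
h = diag(A * ((A' * A) \ A'));          % leverages h_ii
ePress = e ./ (1 - h);                  % PRESS residuals e_(i)

highLeverage = find(h > 2 * p / n);     % rule-of-thumb cutoff, assumed here
inflation = 1 ./ (1 - h);               % factor by which each residual is scaled

disp([h e ePress inflation]);           % compare the columns row by row
disp(highLeverage);
```

Because 0 < h_ii < 1 here, every PRESS residual is at least as large in magnitude as its ordinary residual, and the gap widens sharply as h_ii approaches 1.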

This is also known as leave-one-out cross-validation (LOOCV), used in linear models as a measure of predictive accuracy. [an anon's suggestion]

To improve the matrix script, the regression parameters are estimated using a pivoted QR factorization of X, which avoids squaring the condition number as happens when X'X is formed.
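One way such a QR-based computation might look is sketched below. It is not taken from the m-file: the data are made up, the variable names and the rank tolerance are assumptions, and an intercept column is assumed. It relies on the fact that, for an economy-size factorization A(:,p) = Q*R, the hat matrix equals Q*Q', so h_ii is the squared norm of row i of Q.

```matlab
% Illustrative data only (made-up values).
X = [1 2; 2 1; 3 4; 4 3; 5 6; 6 5];     % independent variables
Y = [3.1; 2.9; 7.2; 6.8; 11.1; 10.9];   % dependent variable

n = size(X, 1);
A = [ones(n, 1) X];                     % design matrix (intercept column is an assumption)

[Q, R, p] = qr(A, 0);                   % economy-size pivoted QR: A(:, p) = Q * R

% Numerical rank from the diagonal of R (the tolerance is an assumption).
d = abs(diag(R));
tol = max(size(A)) * eps * d(1);
r = sum(d > tol);
Q1 = Q(:, 1:r);                         % orthonormal basis for the column space of A

b = zeros(size(A, 2), 1);
b(p(1:r)) = R(1:r, 1:r) \ (Q1' * Y);    % coefficients without forming A' * A

e = Y - Q1 * (Q1' * Y);                 % ordinary residuals e_i
h = sum(Q1 .^ 2, 2);                    % leverages h_ii = diagonal of Q1 * Q1'
PRESS = sum((e ./ (1 - h)) .^ 2);
```

Working from Q and R keeps the conditioning of the problem at that of A itself, whereas solving the normal equations works with A'*A, whose condition number is the square of that of A.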

Syntax: function x = press(D)

D - data matrix (=[X Y]): the X columns are the independent variables and the last column must be Y, the dependent variable.

x - prediction error sum of squares (PRESS).

Cite As

Antonio Trujillo-Ortiz (2021). press (, MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R14
Compatible with any release
Platform Compatibility
Windows macOS Linux
