
Pseudo R-squared measure for Poisson regression models

version 1.1.3 (10.3 KB) by Valentina Unakafova
Computes pseudo R-squared goodness-of-fit measure for Poisson regression models from real and estimated data

4 Downloads

Updated 22 Aug 2018


function pR2 = pseudoR2( realData, estimatedData, lambda )

Computes the pseudo R-squared (pR2) goodness-of-fit measure for Poisson regression models from real and estimated data, following [1, page 255, first equation].
The pseudo R-squared measure was introduced in [3] to evaluate goodness of fit for Poisson regression models; see also [1,2], where an adjusted pR2 measure was introduced for Poisson regression models with over- or under-dispersion. Poisson regression models are often used to model count data [1] and, in particular, spike data [4,5,6,8]. Pseudo R-squared values can be interpreted as the relative reduction in deviance due to the covariates added to the model [5]. The measure was used to assess goodness of fit when predicting spike counts in [4,5,6,8].
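
For orientation, the deviance-based pseudo R-squared for Poisson models is commonly written as below. This is a sketch that assumes the equation cited from [1] has this standard form. The symbols are chosen here for illustration: y_i are the observed values (realData), \hat{\mu}_i the estimated values (estimatedData), and \bar{y} the mean of the observed values (lambda).

```latex
R^2_D = 1 - \frac{\sum_{i=1}^{N} \left[ y_i \log\left( y_i / \hat{\mu}_i \right) - \left( y_i - \hat{\mu}_i \right) \right]}
                 {\sum_{i=1}^{N} y_i \log\left( y_i / \bar{y} \right)}
```

Terms with y_i = 0 use the convention y_i log(y_i) = 0. The numerator is half the Poisson deviance of the fitted model and the denominator is half the deviance of the intercept-only model, so the ratio is the share of deviance left unexplained. With out-of-sample estimates, such as cross-validated predictions, the value can become negative.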

INPUT
- realData - observed values of the dependent variable (1xN values);
- estimatedData - estimated values (1xN values);
- lambda - mean value over realData (1x1 value).

OUTPUT
- pR2 - value of the pseudo R-squared measure (1x1 value).
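
The following is a minimal sketch of how a function with this interface might compute the measure. It assumes the standard Poisson deviance form and strictly positive estimated values. The names poissonDeviance and pR2sketch and the toy data are hypothetical, and this is not the author's implementation.

```matlab
% Hypothetical sketch, not the File Exchange implementation.
% Assumes estimatedData and lambda are strictly positive.

% Poisson deviance up to the factor 2, which cancels in the ratio below;
% max( y, realmin ) makes the y .* log( y ) term evaluate to 0 when y == 0.
poissonDeviance = @( y, mu ) sum( y .* log( max( y, realmin ) ./ mu ) - ( y - mu ) );

% pseudo R-squared: 1 - (model deviance) / (deviance of the constant model lambda)
pR2sketch = @( realData, estimatedData, lambda ) ...
    1 - poissonDeviance( realData, estimatedData ) ./ ...
        poissonDeviance( realData, lambda * ones( size( realData ) ) );

realData      = [ 2 1 3 5 ];            % toy counts, made up for illustration
estimatedData = [ 1.8 1.4 3.1 4.7 ];    % toy estimates, made up for illustration
pR2 = pR2sketch( realData, estimatedData, mean( realData ) );
```

In this sketch a perfect fit (estimatedData equal to realData) gives 1, and predicting the mean for every point gives 0.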

EXAMPLE OF USE
% 'arsdata_1950_2010.xls' is from
% http://www.maths.lth.se/matstat/kurser/fmsf60/_Labfiles/arsdata_1950_2010.xls

data = xlsread( 'arsdata_1950_2010.xls' ); % read data
startPoint = 26;
traffic = struct( 'year', data( startPoint:end, 1 ), ...
    'killed', data( startPoint:end, 2 ), ...
    'cars', data( startPoint:end, 5 ), ...
    'petrol', data( startPoint:end, 6 ) );
y = traffic.killed;
x = cell( 1, 3 ); % covariates or predictors
estCoeff = cell( 1, 3 ); % estimated coefficients of model fit
yEstimated = cell( 1, 3 );
pR2value = zeros( 1, 3 );
x{ 1 } = traffic.year - mean( traffic.year );
x{ 2 } = [ x{ 1 }, traffic.cars - mean( traffic.cars ) ];
x{ 3 } = [ x{ 2 }, traffic.petrol - mean( traffic.petrol ) ];
for iCovariate = 1:3
    % leave-one-out cross-validation
    for iPoint = 1:length( y )
        % train on all points except iPoint
        trainingSet = [ 1:iPoint-1, iPoint+1:length( y ) ];
        estCoeff{ iCovariate } = glmfit( x{ iCovariate }( trainingSet, : ), ...
            y( trainingSet ), 'poisson', 'link', 'log' );
        % predict the held-out point
        yEstimated{ iCovariate }( iPoint ) = glmval( estCoeff{ iCovariate }, ...
            x{ iCovariate }( iPoint, : ), 'log' );
    end
    pR2value( iCovariate ) = pseudoR2( y', yEstimated{ iCovariate }, mean( y ) );
end

% plot results
fontSize = 20;

figure;
plot( traffic.year, traffic.killed, '-', 'LineWidth', 4 ); hold on;
for iCovariate = 1:3
    plot( traffic.year, yEstimated{ iCovariate }, 'o', 'LineWidth', 4, ...
        'MarkerSize', 8 );
end
xlabel( 'Year', 'FontSize', fontSize );
ylabel( 'Number of people killed in accidents', 'FontSize', fontSize );
legendHandle = legend( 'Real', ...
    [ 'Estimated from year, $pR^2$ = ' num2str( pR2value( 1 ), '%.3f' ) ], ...
    [ 'Estimated from year and cars, $pR^2$ = ' num2str( pR2value( 2 ), '%.3f' ) ], ...
    [ 'Estimated from year, cars and petrol, $pR^2$ = ' num2str( pR2value( 3 ), '%.3f' ) ] );
legendHandle.FontSize = fontSize;
set( legendHandle, 'Interpreter', 'Latex' );
set( gca, 'FontSize', fontSize );

REFERENCES

[1] Heinzl, H. and Mittlböck, M., 2003. Pseudo R-squared measures for Poisson regression models with over- or underdispersion. Computational Statistics & Data Analysis, 44(1-2), pp.253-271.
[2] Mittlböck, M., 2002. Calculating adjusted R2 measures for Poisson regression models. Computer Methods and Programs in Biomedicine, 68(3), pp.205-214.
[3] Cameron, A.C. and Windmeijer, F.A., 1996. R-squared measures for count data regression models with applications to health-care utilization. Journal of Business & Economic Statistics, 14(2), pp.209-220.
[4] Benjamin, A.S., Fernandes, H.L., Tomlinson, T., Ramkumar, P., VerSteeg, C., Miller, L. and Kording, K.P., 2017. Modern machine learning far outperforms GLMs at predicting spikes. bioRxiv, p.111450.
[5] Fernandes, H.L., Stevenson, I.H., Phillips, A.N., Segraves, M.A. and Kording, K.P., 2013. Saliency and saccade encoding in the frontal eye field during natural scene search. Cerebral Cortex, 24(12), pp.3232-3245.
[6] http://www.math.chalmers.se/Stat/Grundutb/CTH/mve300/1112/files/lab4/lab4.pdf
[7] Glaser, J.I., Perich, M.G., Ramkumar, P., Miller, L.E. and Kording, K.P., 2018. Population coding of conditional probability distributions in dorsal premotor cortex. Nature Communications, 9(1), p.1788.
[8] Ramkumar, P., Lawlor, P.N., Glaser, J.I., Wood, D.K., Phillips, A.N., Segraves, M.A. and Kording, K.P., 2016. Feature-based attention and spatial selection in frontal eye fields during natural scene search. Journal of neurophysiology, 116(3), pp.1328-1343.

Cite As

Valentina Unakafova (2020). Pseudo R-squared measure for Poisson regression models (https://www.mathworks.com/matlabcentral/fileexchange/67041-pseudo-r-squared-measure-for-poisson-regression-models), MATLAB Central File Exchange. Retrieved .


MATLAB Release Compatibility
Created with R2016a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Acknowledgements

Inspired: Importance of cross-validation
