How the CI of GLM parameters are calculated?

Question

TingTing on 8 Aug 2014

0
Link

Direct link to this question

https://nl.mathworks.com/matlabcentral/answers/146900-how-the-ci-of-glm-parameters-are-calculated

Answered: Tom Lane on 12 Aug 2014

I need to have the distributions of the coefficients estimated by glm model. I tried to find them and tested if the ones I found are the right ones by calculating the 95% CI from them and comparing those to those given by “predict” command. And the result turned out disappointing.

For example, if I use glm with gamma as distribution for response variable. The model is called mdl and the dataset is called dataset. After fitting the model, I wrote:

shape=1/mdl.Dispersion; [newf newc] = predict(mdl,dataset); scale=newf*mdl.Dispersion;

I tried to see if this is correct by calculating the 95% CI bounds for the first data point.

>> gamcdf(newc(1,2),shape,scale(1))

ans =

0.6329

>> gamcdf(newc(1,1),shape,scale(1))

ans =

0.4629

I think the different between them should be 0.95, but here 0.6329-0.4629 is obviously not.

So my understanding is wrong?

I also tried with normal distribution

>> mu=mdl.Fitted; >> variance=mdl.Dispersion; >> [newf newc] = predict(mdl,dataset); >> sigma=sqrt(variance);

>> normcdf(newc(1,2),newf(1),sigma)

ans =

0.5910

>> normcdf(newc(1,1),newf(1),sigma)

ans =

0.4152

Shouldn't the difference between the last two be 0.95? Is my understanding wrong or...?

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

Tom Lane on 12 Aug 2014

0
Link

Direct link to this answer

https://nl.mathworks.com/matlabcentral/answers/146900-how-the-ci-of-glm-parameters-are-calculated#answer_148324

Open in MATLAB Online

I'm not exactly following what you are trying to do. It looks like you are using the estimated standard deviation of the observed response values as a measure of uncertainty in the predicted values. That's not exactly right. Imagine you had huge amounts of data. There would still be residual standard deviation, but the predicted values would be known much more accurately than that.

Here's an illustration of this using a regular linear model:

>> load fisheriris
>> x = meas(:,1:3);
>> y = meas(:,4);
>> lm = fitlm(x,y,'linear');
>> [newf,newc] = predict(lm,meas(1,1:3))
newf =
    0.2163
newc =
    0.1613    0.2712

Now here's how I can calculate those values myself:

>> newx = [1;meas(1,1:3)'];
>> C = lm.CoefficientCovariance;
>> b = lm.Coefficients.Estimate;
>> b'*newx
ans =
    0.2163
>> newf + [1 -1] * tinv(.025,lm.DFE) * sqrt(newx'*C*newx)
ans =
    0.1613    0.2712

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

How the CI of GLM parameters are calculated?

0 Comments
Show -2 older commentsHide -2 older comments

Answers (1)

0 Comments
Show -2 older commentsHide -2 older comments

See Also

Categories

Tags

Products

Community Treasure Hunt

How the CI of GLM parameters are calculated?

0 Comments Show -2 older commentsHide -2 older comments

Answers (1)

0 Comments Show -2 older commentsHide -2 older comments

See Also

Categories

Tags

Products

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments