The calculated R squared is not equal to the square of the correlation coefficient from the MATLAB function corr

Question

Yuzhen Lu on 23 Apr 2020

https://nl.mathworks.com/matlabcentral/answers/520241-the-calculated-r-squared-is-not-equal-to-the-squared-of-correlation-coefficient-by-matlab-functions

Edited: John D'Errico on 24 Apr 2020

Given model predictions and true values, R2 (the coefficient of determination) can be readily calculated using the standard formula:

Rsq = 1 - sum((ytrue - ypred).^2)/sum((ytrue - mean(ytrue)).^2)

Alternatively, R squared can be obtained by squaring the correlation coefficient, using built-in functions such as corr or corrcoef:

Rsq = (corr(ytrue,ypred))^2

However, I find that the latter value is slightly larger than the former. Why does the built-in function give a higher value?

3 Comments

Yuzhen Lu on 23 Apr 2020

I attach my data files so you can double-check.

dpb on 24 Apr 2020

Although they're not the same calculation

Answer 1

Ameer Hamza on 23 Apr 2020

https://nl.mathworks.com/matlabcentral/answers/520241-the-calculated-r-squared-is-not-equal-to-the-squared-of-correlation-coefficient-by-matlab-functions#answer_427982

You are trying to compute the coefficient of determination (R-squared), whereas corr(), as its documentation shows (https://www.mathworks.com/help/releases/R2020a/stats/corr.html#d120e195813), calculates Pearson's linear correlation coefficient. I am not sure whether any MATLAB built-in function computes R-squared directly, but I found this submission on FEX: https://www.mathworks.com/matlabcentral/fileexchange/34492-r-square-the-coefficient-of-determination. Internally, it implements the same formula you are using now.
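For reference, the formula from the question can be wrapped in a one-line helper. This is only a minimal sketch: the name rsq and the sample vectors are made up for illustration and are not a MATLAB built-in or the File Exchange code.

```matlab
% Hypothetical helper: coefficient of determination from true and predicted values.
% The name "rsq" is an assumption for this sketch, not a MATLAB built-in.
rsq = @(ytrue, ypred) 1 - sum((ytrue - ypred).^2) / sum((ytrue - mean(ytrue)).^2);

% Made-up sample data, column vectors of equal length
ytrue = [1; 2; 3; 4; 5];
ypred = [1.1; 1.9; 3.2; 3.8; 5.1];

R2 = rsq(ytrue, ypred);

% A prediction no better than the mean gives exactly 0
R2_mean = rsq(ytrue, repmat(mean(ytrue), size(ytrue)));

% A poor prediction can give a negative value, which a squared correlation never can
R2_bad = rsq(ytrue, flipud(ytrue));
```

Because this quantity can fall below zero while corr(ytrue,ypred)^2 cannot, the two are clearly not the same calculation in general.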

Answer 2

John D'Errico on 24 Apr 2020

https://nl.mathworks.com/matlabcentral/answers/520241-the-calculated-r-squared-is-not-equal-to-the-squared-of-correlation-coefficient-by-matlab-functions#answer_428029

Edited: John D'Errico on 24 Apr 2020

What I do not see is the actual model you used. Did you use a linear model? Was there a constant term in the model? The claim that R^2 equals the squared correlation coefficient is valid only for specific models, so the answer depends on which model you fit.

>> x = rand(10,1);
>> y = rand(10,1);
>> p2 = polyfit(x,y,2);
>> pred = polyval(p2,x);
>> Rsq = 1 - sum((y - pred).^2)/sum((y - mean(y)).^2)
Rsq =
         0.140274350649466
>> corr(y,pred).^2
ans =
         0.140274350649466

So, the square of the correlation coefficient is the same as the value your formula computes. It matches down to the last digit, which is my expectation.
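A short derivation suggests why the match is exact for a least-squares fit that includes a constant term, as polyfit does. The notation is introduced here for illustration only: \hat{y} stands for pred, e = y - \hat{y} for the residuals, and r for corr(y,pred). The two facts used are \sum_i e_i = 0 and \sum_i e_i \hat{y}_i = 0, which follow from the normal equations when a constant column is in the model.

```latex
\begin{aligned}
\bar{\hat{y}} &= \bar{y},
  \qquad \sum_i (\hat{y}_i - \bar{y})\, e_i = 0 \\
\sum_i (y_i - \bar{y})(\hat{y}_i - \bar{y})
  &= \sum_i (\hat{y}_i - \bar{y})^2 \\
r^2 &= \frac{\left[\sum_i (y_i - \bar{y})(\hat{y}_i - \bar{y})\right]^2}
            {\sum_i (y_i - \bar{y})^2 \, \sum_i (\hat{y}_i - \bar{y})^2}
     = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2} \\
\sum_i (y_i - \bar{y})^2
  &= \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i e_i^2 \\
r^2 &= 1 - \frac{\sum_i e_i^2}{\sum_i (y_i - \bar{y})^2}
\end{aligned}
```

Without a constant term, \sum_i e_i = 0 is no longer guaranteed, so the first line fails and the chain of equalities breaks.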

However, now try the same thing, but using a model that has no constant term in it. In this case, I'll use a cubic polynomial fit, but one that has no constant term. We can do that using backslash, though I could have done the fit using any number of tools.

>> mdl = [x,x.^2,x.^3]\y
mdl =
         0.552026949387604
           3.2235169295382
         -3.50451900695301
>> pred = [x,x.^2,x.^3]*mdl;
>> Rsq = 1 - sum((y - pred).^2)/sum((y - mean(y)).^2)
Rsq =
         0.195980323024559
>> corr(y,pred).^2
ans =
         0.200698709640219

What went wrong? The error is the assumption that the two formulas compute the same thing when the model has no estimated constant term.

There are adjusted R^2 computations that can be more accurate in these cases, but even so, there is no expectation the formulas will give the same result any longer, when the model lacks a constant term.
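One way to see why corr(y,pred).^2 comes out as the larger of the two: it equals the R^2 you would get after regressing y on pred with an intercept, and that refit can only lower the residual sum of squares relative to using pred as is. The sketch below checks this numerically. The seed and the variable names (Rsq_formula, Rsq_corr, Rsq_recal) are assumptions for illustration, and corrcoef is used in place of corr so that only base MATLAB is needed.

```matlab
rng(0);                              % arbitrary fixed seed, for repeatability only
x = rand(10,1);
y = rand(10,1);

% Cubic fit with no constant term, as in the example above
A    = [x, x.^2, x.^3];
pred = A * (A \ y);

% R^2 from the standard formula, using pred as is
Rsq_formula = 1 - sum((y - pred).^2) / sum((y - mean(y)).^2);

% Squared Pearson correlation between y and pred
C        = corrcoef(y, pred);
Rsq_corr = C(1,2)^2;

% Regress y on pred WITH an intercept, then apply the standard formula again
B         = [ones(size(pred)), pred];
recal     = B * (B \ y);
Rsq_recal = 1 - sum((y - recal).^2) / sum((y - mean(y)).^2);

fprintf('formula: %.6f   corr^2: %.6f   refit with intercept: %.6f\n', ...
        Rsq_formula, Rsq_corr, Rsq_recal);
```

The last two numbers should agree to rounding error, and the first should not exceed them, which is consistent with the squared correlation being slightly larger in the question.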
