R2 with a loglog plot
Hi everyone.
I am quite new to MATLAB and I'd like to add an R2 to my loglog plot. I've seen some solutions in a few other posts, but none really does the job. My code is really simple:
bx= figure;
set(bx,'visible', 'on');
f = fit(x,y,'power1');
loglog(x,y,'O');
hold on;
plot(f);
[...]
The result looks like this:

So far I haven't found any way of determining the R2. The approach in this post (https://uk.mathworks.com/matlabcentral/answers/182998-r-squared-value-for-fitted-line) overestimates my R2, as does this bit of code, also found somewhere on the MATLAB forum:
ba = [x,ones(size(x))]\y;
ypred = ba(1)*x + ba(2);
SSE=sum((y-ypred).^2);
SST=sum((y-mean(y)).^2);
Rsq = 1 - (SSE/SST);
I also tried it this way, but I think it works only for linear relationships:
X = [ones(length(x),1) x];
b = X\y;
yCalc = X*b;
r = 1 - sum((y - yCalc).^2)/sum((y - mean(y)).^2);
Thank you very much for your help :)
Flo
PS: the R2 in Excel is equal to 0.9597
Answers (1)
dpb
on 8 Aug 2016
I see nothing wrong with Star's Answer nor follow-up comment. To compute SSE from such a model requires evaluating the residual of the fit in original metric, not in log space.
But, fit returns a cfit for curves (your case) and some additional optional outputs, the second of which is a goodness-of-fit structure, gof in the documentation. Fields in gof include
sse - Sum of squares due to error
rsquare - Coefficient of determination ("raw" R-square)
dfe - Degrees of freedom in the error
adjrsquare - DOF-adjusted R-square
rmse - Root mean squared error (or "standard" error)
so to save yourself some effort, use it...
[f,gof] = fit(x,y,'power1');
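To put the value on the loglog plot itself, something along these lines should work. This is only a sketch: the data are made up for illustration (not the OP's), the label position is an arbitrary choice, and the field name rsquare is taken from the fit output shown further down this thread.

```matlab
% Requires the Curve Fitting Toolbox for fit().
% Illustrative data only (not the OP's data): a noisy power law.
rng(0);
x = logspace(-3, 0, 15).';                      % column vector, as fit() expects
y = 1.03 * x.^1.01 .* exp(0.1 * randn(size(x)));

[f, gof] = fit(x, y, 'power1');                 % model: f(x) = a*x^b

figure;
loglog(x, y, 'o');
hold on;
xx = logspace(log10(min(x)), log10(max(x)), 100).';
loglog(xx, f(xx), '-r');                        % evaluate the cfit on a fine grid
hold off;

% Place the label near the top-left corner, in data coordinates
text(min(x), max(y), sprintf('R^2 = %.4f', gof.rsquare), ...
    'VerticalAlignment', 'top');
```

Evaluating the cfit on a log-spaced grid keeps the fitted line smooth on the log axes.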
7 Comments
Flo
on 9 Aug 2016
dpb
on 9 Aug 2016
W/o the data, no, not really. Hmmm...I wonder. Check the SSE and SST: does the routine work with the model values in the original metric or with the underlying values from log()? If the sums of squares are computed over the log-transformed values instead of the real values, the R-sq estimate can be inflated. Wouldn't think that'd be so, but w/o data to check I can only hypothesize.
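One way to check that hypothesis is to compute R-sq for the same model in both metrics and see how far apart they are. A rough sketch, using made-up data (not the OP's) and a straight-line fit in log space via polyfit rather than fit:

```matlab
% Illustrative data only (not the OP's data): a noisy power law.
rng(0);
x = logspace(-3, 0, 15).';
y = 1.03 * x.^1.01 .* exp(0.2 * randn(size(x)));

% Straight line in log space: log(y) = p(1)*log(x) + p(2)
p = polyfit(log(x), log(y), 1);
ylogpred = polyval(p, log(x));

% R-sq with residuals taken in log space
RsqLog = 1 - sum((log(y) - ylogpred).^2) / sum((log(y) - mean(log(y))).^2);

% R-sq of the same model with residuals taken in the original metric
ypred = exp(ylogpred);
RsqOrig = 1 - sum((y - ypred).^2) / sum((y - mean(y)).^2);

fprintf('R^2 (log space) = %.4f, R^2 (original) = %.4f\n', RsqLog, RsqOrig);
```

The two numbers generally differ, so which one a tool reports matters when comparing results across programs.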
Flo
on 12 Aug 2016
Star Strider
on 12 Aug 2016
The ‘problem’, if there is one, appears to be in your data, which are very close to linear and have a very wide range. In the ‘Rsq’ calculation, particularly the ‘SST’ term, note that the mean is very sensitive to extreme values, and your data have extreme values. (To experiment with this, compare the mean and median of your data.) The result is that ‘SSE’ is relatively low (with a good fit) and ‘SST’ is relatively high, leading to a very high ‘Rsq’ value.
x = [0.737543298694378
0.110045297095657
0.0434319211297629
0.0239808153477218
0.0189181987743139
0.0165201172395417
0.0101252331468159
0.00746069810818012
0.00452970956568079
0.00346389555022649
0.00319744204636291
0.00479616306954436
0.00186517452704503
0.00213162803090861
0.00133226751931788];
y = [0.752928647497338
0.116879659211928
0.0388711395101172
0.0194355697550586
0.0133120340788072
0.0103833865814696
0.00692225772097977
0.00878594249201278
0.00532481363152290
0.00399361022364217
0.00505857294994675
0.00159744408945687
0.00292864749733759
0.00159744408945687
0.00159744408945687];
yfit = @(b,x) exp(b(1)) .* x.^b(2); % Power Function
SSECF = @(b) sum((yfit(b,x) - y).^2); % Sum-Squared-Error Cost Function
B = fminsearch(SSECF, [1; 1]);
ypred = yfit(B,x);
SSE=sum((y-ypred).^2);
SST=sum((y-mean(y)).^2);
Rsq = 1 - (SSE/SST);
xplot = linspace(min(x), max(x));
figure(1)
plot(x, y, 'bp')
hold on
plot(xplot, yfit(B,xplot), '-r')
hold off
grid
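The mean-versus-median experiment suggested above can be sketched as follows, using the y values posted in this comment. The "largest single term" share is just one hypothetical way to show how much one point dominates SST.

```matlab
% y values as posted above
y = [0.752928647497338; 0.116879659211928; 0.0388711395101172; ...
     0.0194355697550586; 0.0133120340788072; 0.0103833865814696; ...
     0.00692225772097977; 0.00878594249201278; 0.00532481363152290; ...
     0.00399361022364217; 0.00505857294994675; 0.00159744408945687; ...
     0.00292864749733759; 0.00159744408945687; 0.00159744408945687];

fprintf('mean(y) = %.4g, median(y) = %.4g\n', mean(y), median(y));

% Squared deviations from the mean; their sum is SST
dev2 = (y - mean(y)).^2;
SST  = sum(dev2);

% Share of SST contributed by the largest single term
fprintf('Largest single term is %.1f%% of SST\n', 100 * max(dev2) / SST);
```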
Very good observations, IA...I wondered if perhaps the "problem" was the curve fit in Excel excluded the one real outlier, the first observation, so just deleted it and reran -- Rsq = 0.996, a fair drop but still far from the 0.95 reported from Excel. I have no way to explain that other than it doesn't seem to match the data...OP should compare the results of the fitting from each.
Well, I took your plot and changed to loglog which looks like

Clearly, this isn't the same data set as in OP's figure -- similar but not the same. For the most obvious case, the max point therein is ~[0.25 0.25], not ~[0.75 0.75], and while I didn't do a detailed examination and the patterns are similar, it doesn't look to me like any of the data points are exactly the same. So, it seems to be an "apples to oranges" comparison problem on the numerical value.
Flo
on 12 Aug 2016
Well, you need to look at the results obtained from the models (for the same, not disparate datasets) from Excel and Matlab to uncover where there's a difference. Clearly the results for the data you posted appear correct; if you got wildly different results from Excel, the most likely cause given the plot you posted is that it isn't the same dataset you're actually comparing to.
ADDENDUM
I attempted to read values off the above plot to see what kind of fit it actually provided; the Rsq was somewhat lower, but about 0.998 as opposed to 0.95, altho I could see my guesstimated points didn't get terribly close to the plot.
Again, I can only suggest that if you can reproduce the results you first quoted in Excel, attach that set of data and the model coefficients and results.
>> [f,gof]=fit(x,y,'power1')
f =
General model Power1:
f(x) = a*x^b
Coefficients (with 95% confidence bounds):
a = 1.026 (1.011, 1.04)
b = 1.013 (0.9852, 1.042)
gof =
sse: 1.6864e-04
rsquare: 0.9997
dfe: 13
adjrsquare: 0.9996
rmse: 0.0036
>> B % results from S-S:
B =
0.0254
1.0134
>> Rsq
Rsq =
0.9997
>>
BTW, I used fit in comparison to the fminsearch solution--results are essentially identical--
ADDENDUM 2
"*fit* in comparison to the fminsearch"
Actually, I just noticed there are two different solutions reached; the first coefficient (the multiplier, not the power) from fminsearch is about that of the fit solution less 1.0 -- 0.0254 vis a vis 1.026. How can that be???
Oh,
yfit = @(b,x) exp(b(1)) .* x.^b(2); % Power Function
has a definition problem; it's estimating log(a) instead of a directly, where a is the multiplier in the a*x^b form that fit uses...let's see what happens if redefine in same model space as fit uses--
>> yfit = @(b,x) b(1).*x.^b(2); % model A*x^B; B(1),B(2)-->A,B
>> SSECF = @(b) sum((yfit(b,x) - y).^2);
>> B = fminsearch(SSECF, [1; 1]);
>> B
B =
1.0256
1.0134
>>
Ah! As expected, now we agree...whew! :) Was worried there for a minute...
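For the record, the two parameterizations can also be reconciled without refitting, since exp(b(1)) plays the role of a. A small sketch using the rounded coefficients printed above (the variable names B_exp and a are just for illustration):

```matlab
% fminsearch result reported above for the model exp(b(1)) .* x.^b(2)
B_exp = [0.0254; 1.0134];

% Convert to the a*x^b form used by fit's 'power1' model
a = exp(B_exp(1));      % multiplier
b = B_exp(2);           % exponent is the same in both forms

fprintf('a = %.4f, b = %.4f\n', a, b);
```

This gives a of about 1.026, in line with the fit result, so both routines found the same curve all along.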