Confusion about the representation of Root Mean Square, R Squared ...

How are errors represented in MATLAB? For example, I obtained the following after training on a dataset using LinearModel.fit(). I am confused about the Root Mean Squared Error: is the error 0.243% or 24.3%? I want to know the value of RMSE as a percentage, and whether it is reported here as a percentage or in some other form. Can somebody please clarify? The same goes for the value of R-squared: is it 0.106% or 10.6%? Thanks.
Number of observations: 48, Error degrees of freedom: 46
Root Mean Squared Error: 0.243
R-squared: 0.106, Adjusted R-Squared 0.0861
F-statistic vs. constant model: 5.43, p-value = 0.0242

Accepted Answer

Star Strider on 26 May 2014
Residuals and measures related to them are not a percentage. In the context of a one-dimensional situation, residuals are analogous to deviations from the mean, and measures derived from them are roughly analogous to the variance or standard deviation. (With heavy emphasis on ‘roughly’.)
The Coefficient of Determination (R-Squared) value could be thought of as a decimal fraction (though not a percentage), in a very loose sense. From the documentation:
  • Coefficient of determination (R-squared) indicates the proportionate amount of variation in the response variable y explained by the independent variables X in the linear regression model. The larger the R-squared is, the more variability is explained by the linear regression model.
So the higher the R-Squared value, the better the fit of the model to the data.
  4 Comments
Motiur on 26 May 2014
Saw that after I commented; sorry about that. One more thing: SSE and RMSE are similar quantities; one has been averaged and square-rooted and the other has not. Is there an RMSE for a GLM? Thanks.
Star Strider on 26 May 2014
My pleasure!
No worries!
Yes, there is. I saw your other post and responded to your GLM RMSE question there.


More Answers (2)

Kelly Kearney on 26 May 2014
Root mean squared error is
sqrt(mean((xobs - xpre).^2))
where xobs is the input dataset, and xpre are the values predicted by the model for each corresponding observation. The value is absolute, not relative. Not quite sure what you mean by RMSE in terms of percentage... maybe percent error? Check the properties of the LinearModel object; it includes fitted values as well as several different measures of error that will help you perform this calculation.
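As a rough sketch of the formula above, using made-up data and base MATLAB only (the data and variable names are illustrative, not the poster's). One caveat worth checking against the LinearModel documentation: the RMSE property is defined there as the square root of the mean squared error SSE/DFE, so it divides by the error degrees of freedom rather than the number of observations and comes out slightly larger than the plain sqrt(mean(...)) value.

```matlab
% Made-up data for illustration only (not the poster's dataset)
x    = (1:10)';
xobs = [2.1 2.9 4.2 4.8 6.1 7.2 7.8 9.1 9.9 11.2]';   % observed response

% Ordinary least-squares straight line (intercept + slope)
X    = [ones(size(x)) x];
b    = X \ xobs;
xpre = X * b;                        % values predicted by the model

n   = numel(xobs);                   % number of observations
dfe = n - size(X, 2);                % error degrees of freedom (n minus number of coefficients)

rmse_n   = sqrt(mean((xobs - xpre).^2));         % divides by n
rmse_dfe = sqrt(sum((xobs - xpre).^2) / dfe);    % divides by the error degrees of freedom

fprintf('RMSE (divide by n):   %.4f\n', rmse_n);
fprintf('RMSE (divide by DFE): %.4f\n', rmse_dfe);
```

Either way, the result is in the same units as xobs, not a percentage.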
  2 Comments
Motiur on 26 May 2014
I know about the calculation procedure of RMSE, however, I only wanted to know whether the value is represented as a percentage or not. I asked this because I 'think' that the R-Squared is expressed as a percentage. So is it following some sort of trend?
Kelly Kearney on 26 May 2014
If you check the doc page for LinearModel, it defines all of these values for you, under properties.
No, RMSE is not a percentage, so your RMSE is 0.243 whatever-the-input-units-were, not 0.243% or 24.3%.
R^2 is the coefficient of determination, i.e. a measure of how well the model fits the data.
Most of the terms are standard statistics terms, so if the docs aren't clear, a statistics textbook (or Wikipedia) should be able to clarify further.
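To illustrate the point about units, here is a small sketch with made-up data in base MATLAB (the helper names fitted, rmse and r2 are hypothetical, not part of any toolbox). Re-expressing the response in different units rescales RMSE by the same factor, while R^2 stays the same:

```matlab
% Made-up data for illustration only; pretend y is measured in metres
x = (1:10)';
y = [2.1 2.9 4.2 4.8 6.1 7.2 7.8 9.1 9.9 11.2]';

X = [ones(size(x)) x];                       % straight-line model with intercept

% Hypothetical helpers: least-squares fit, RMSE (using error degrees of freedom), R^2
fitted = @(y) X * (X \ y);
rmse   = @(y) sqrt(sum((y - fitted(y)).^2) / (numel(y) - size(X, 2)));
r2     = @(y) 1 - sum((y - fitted(y)).^2) / sum((y - mean(y)).^2);

y_cm = 100 * y;                              % same data expressed in centimetres

fprintf('RMSE: %.4f (m)  vs  %.4f (cm)\n', rmse(y), rmse(y_cm));
fprintf('R^2:  %.4f (m)  vs  %.4f (cm)\n', r2(y), r2(y_cm));
```

So an RMSE of 0.243 only means something relative to the scale of the response variable.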


John D'Errico on 26 May 2014
RMSE is never expressed as a percentage that I have ever seen. Why would it be? As a percentage of what? A percentage for RMSE is meaningless. My point is for a percentage to make sense, we need to have some value A as a relative fraction of B, so then 100*A/B can be interpreted as a percentage.
(If you DO think that you need RMSE to be in the form of a percentage, I think you are mistaken.)
Likewise, R^2 is also never expressed as a percentage that I know of, although in the context I mentioned above, one can view R^2 as a ratio of the sum of squares explained divided by the total sum of squares. In that context, when one multiplies by 100, it could have a % sign attached and make sense.
Regardless, NEITHER of these parameters is expressed as a percentage in the tool provided by MATLAB. Were that so, the help would say so, and do so explicitly, as that would be non-standard.
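The ratio view of R^2 described above can be sketched in base MATLAB as follows. The data are made up for illustration, and the identity SSR + SSE = SST used here assumes an ordinary least-squares fit that includes an intercept.

```matlab
% Made-up data for illustration only (not the poster's dataset)
x = (1:10)';
y = [2.3 2.1 4.9 3.8 6.5 5.9 8.4 7.2 10.1 9.6]';

X    = [ones(size(x)) x];            % straight-line model with intercept
yhat = X * (X \ y);                  % least-squares fitted values

sst = sum((y - mean(y)).^2);         % total sum of squares
ssr = sum((yhat - mean(y)).^2);      % sum of squares explained by the model
sse = sum((y - yhat).^2);            % residual (unexplained) sum of squares

r2 = ssr / sst;                      % a proportion; equals 1 - sse/sst for this kind of fit

fprintf('R^2 = %.3f, i.e. %.1f%% of the variation explained\n', r2, 100 * r2);

% The value reported in the question, read the same way (0.106 is a proportion):
fprintf('R-squared 0.106 corresponds to %.1f%%\n', 100 * 0.106);
```

The displayed R-squared is the proportion itself; the multiplication by 100 is something you do yourself if you want to quote it with a % sign.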
  1 Comment
Elizabeth Drybrugh on 9 Feb 2018
I can understand the confusion of thinking R2 should be a percentage, as some websites state this. However, if this is incorrect, thank you for mentioning it. I was wondering more about what R2 range is considered a good fit vs. a bad fit. Of course, once you plot, you can see the difference visually. However, in mathematical terms, does anyone know any good links or journals that explain this?
NOVICE at stats. Cheers, Elizabeth
