- https://www.mathworks.com/help/stats/regression-learner-app.html
- https://www.mathworks.com/help/stats/residuals.html
Residuals of the model show linearity. How can I solve this?
I trained all models in the Regression Learner app on a dataset with 6 variables.
All of the models produce a residual plot like this, with different variance:
I don't know why the residuals are plotted like this.
What does it mean and how could I solve this?
My idea was to fit another linear regression model to the residuals to compensate for this. Would that be all right to do?
Answers (1)
sai charan sampara
on 25 Oct 2023
Hello M.A.,
I understand that you want to know how to interpret the residual plots produced for the different models.
A residual is the difference between the true (actual) value and the value predicted by the model. Residual plots are drawn on validation data to check the accuracy of the model.
In the question there are two residual plots.
The first one is “Residuals” vs “True response”. A residual is calculated here as the true value minus the predicted value. In the ideal case it is always zero, which is indicated by the black line. Each orange data point is the residual of one model prediction.
The second plot is “Predicted response” vs “True response”. In the ideal case all points lie on the line y=x (as both values should be equal), which is indicated by the black line. The blue data points show the values predicted by the model against the true response from the data.
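As a rough illustration of how these two plots are built, here is a minimal sketch in base MATLAB. The data, the coefficient vector beta, and all variable names are made up for the example and are not taken from the question. It also computes residuals on the training data for brevity, whereas the app uses validation data.

```matlab
% Synthetic data for illustration only (not the dataset from the question).
rng(0);                                % reproducible random numbers
n = 200;
X = randn(n, 6);                       % 6 predictor variables
beta = [2; -1; 0.5; 0; 1.5; -0.5];     % hypothetical true coefficients
yTrue = X*beta + randn(n, 1);          % true response with added noise

% Least-squares linear fit with an intercept column.
A = [ones(n, 1), X];
coef = A \ yTrue;
yPred = A*coef;                        % predicted response

% Residual convention assumed here: true minus predicted.
res = yTrue - yPred;

% Plot 1: residuals vs. true response, ideal behaviour is the line at zero.
figure;
scatter(yTrue, res, 'filled');
yline(0, 'k');
xlabel('True response'); ylabel('Residuals');

% Plot 2: predicted vs. true response, ideal behaviour is the line y = x.
figure;
scatter(yTrue, yPred, 'filled');
hold on;
lims = [min(yTrue), max(yTrue)];
plot(lims, lims, 'k');
hold off;
xlabel('True response'); ylabel('Predicted response');
```

With your own model you would replace yTrue and yPred with the validation responses and the corresponding predictions.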
These plots are used to evaluate the goodness of fit of a model. A smaller average residual magnitude suggests a better fit, while larger residuals indicate a poorer fit. Positive residuals indicate that the observed values are higher than the predicted values, while negative residuals indicate that the observed values are lower than the predicted values. Together, the plots help to evaluate how well the model captures the underlying patterns and variability in the data.
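The sign convention and the "average residual magnitude" mentioned above can be checked numerically. The sketch below uses four made-up value pairs. The names yTrue, yPred, maeVal and rmseVal are only illustrative.

```matlab
% Hypothetical true and predicted responses, for illustration only.
yTrue = [3.0; 5.0; 7.5; 10.0];
yPred = [2.5; 5.5; 7.0; 10.0];

res = yTrue - yPred;               % residuals: true minus predicted

maeVal  = mean(abs(res));          % mean absolute residual
rmseVal = sqrt(mean(res.^2));      % root mean squared error

underPredicted = res > 0;          % true value is above the prediction
overPredicted  = res < 0;          % true value is below the prediction

fprintf('Mean absolute residual = %.4f, RMSE = %.4f\n', maeVal, rmseVal);
```

Lower values of both summaries point to a better fit, and the two logical vectors show where the model predicts too low or too high.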
Residual plots that show a lot of variance and deviate far from the ideal behaviour imply that the model is not very accurate. To improve this, try data preprocessing techniques such as normalizing the data or feature selection. You can also try different model types to identify the best possible fit.
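A minimal sketch of these two suggestions follows: standardizing the predictors and comparing candidate models on held-out data. It assumes synthetic data with a made-up nonlinear relation, and the two candidates (linear terms only, and linear plus squared terms) are chosen purely for illustration. For a plain least-squares fit, scaling does not change the predictions. It matters more for scale-sensitive models such as SVM or Gaussian process regression.

```matlab
% Synthetic data for illustration only: predictors on very different scales.
rng(1);
n = 1000;
X = randn(n, 6) .* [1 10 100 1 10 100];
y = 2*X(:,1).^2 + 0.1*X(:,2) + randn(n, 1);   % hypothetical nonlinear relation

% Hold out 25% of the rows for validation.
idx = randperm(n);
nVal = round(0.25*n);
valIdx = idx(1:nVal);
trainIdx = idx(nVal+1:end);

% Standardize every column using statistics of the training rows only.
mu = mean(X(trainIdx, :));
sigma = std(X(trainIdx, :));
Z = (X - mu) ./ sigma;

% Candidate 1: linear terms. Candidate 2: linear plus squared terms.
designLinear = @(M) [ones(size(M, 1), 1), M];
designQuad   = @(M) [ones(size(M, 1), 1), M, M.^2];
rmseOf       = @(r) sqrt(mean(r.^2));

coefLinear = designLinear(Z(trainIdx, :)) \ y(trainIdx);
coefQuad   = designQuad(Z(trainIdx, :)) \ y(trainIdx);

% Compare the candidates on the validation rows.
rmseLinear = rmseOf(y(valIdx) - designLinear(Z(valIdx, :))*coefLinear);
rmseQuad   = rmseOf(y(valIdx) - designQuad(Z(valIdx, :))*coefQuad);

fprintf('Validation RMSE: linear = %.3f, quadratic = %.3f\n', rmseLinear, rmseQuad);
```

The model with the lower validation RMSE is the better candidate. This is the same comparison that the app makes when you train several model types.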
You can refer to the Regression Learner app and residuals documentation linked at the top of this page to learn more about regression.
Hope this helps.
Thanks,
Charan.