fit tool with errors in x and y

Tal Shavit on 26 Apr 2022
Edited: the cyclist on 20 Mar 2023
I need to use the fit tool (fit(x,y)) when I have errors in both my x and y values. How can I do so?
Thanks!
  2 Comments
Sam Chak on 26 Apr 2022
@Tal Shavit, What are the error messages shown in x and y? Please display.
John D'Errico on 25 May 2022
@Sam Chak - um, no. @Tal Shavit is not saying there are error messages, but that the DATA has noise in both the x and y variables.


Answers (2)

John D'Errico on 25 May 2022
Edited: John D'Errico on 25 May 2022
You can't. Period. You cannot use fit to solve a problem with errors in both x and y. That does not say the problem is never solvable, only that fit cannot be used.
This is a classic problem, where data has noise in both variables. The curve fitting toolbox only allows for noise in the y variable. So it fits a model that minimizes the residual errors ONLY in y. And that suffices for most, or at least many problems. But when you have noise in both variables, if you were to assume the noise lies only in y, then you will get biased results for the parameters, and so incorrect predictions. In the case of a linear model, the slope tends to be underestimated in magnitude, that is, biased toward zero. (The explanation for this is interesting in a sense, but it would take too long for me to want to put here.)
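To make the bias concrete, here is a small simulation sketch. Everything in it is made up for illustration (the true intercept and slope, the noise level, the sample size), and it uses polyfit rather than fit so that it runs in base MATLAB; both minimize residuals in y only. For a straight line with independent noise, the least-squares slope is expected to shrink by roughly var(xTrue)/(var(xTrue) + sigma^2).

```matlab
% Simulated illustration (made-up numbers): noise in x biases the
% ordinary least-squares slope toward zero.
rng(0);                          % fixed seed for repeatability
n     = 5000;
aTrue = 1;                       % hypothetical true intercept
bTrue = 2;                       % hypothetical true slope
xTrue = 10 * rand(n, 1);         % noise-free x, uniform on [0, 10]
sigma = 2;                       % assumed noise std, same in x and y

x = xTrue + sigma * randn(n, 1);                  % observed x
y = aTrue + bTrue * xTrue + sigma * randn(n, 1);  % observed y

p    = polyfit(x, y, 1);         % minimizes residuals in y only
bOLS = p(1);                     % fitted slope

% Approximate shrink factor expected from the noise in x
shrink = var(xTrue) / (var(xTrue) + sigma^2);
fprintf('true slope %.3f, OLS slope %.3f, predicted %.3f\n', ...
        bTrue, bOLS, bTrue * shrink);
```

The fitted slope should come out well below the true slope and close to the predicted value, which is the bias described above.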
The classic name for this problem is the errors in variables problem, or some call it the total least squares problem. They both refer to the same general case.
That problem, when your model is the SIMPLE linear case in two variables, thus something of this form:
y + noise = a + b*(x + noise)
IS solvable, but the curve fitting toolbox cannot handle even that simple model. Again, the CFTB cannot solve the errors in variables problem. It is just not designed to solve that class of problems. Instead, you can actually use something as simple as Principal Components Analysis (PCA) to solve it, as long as the noise variances in both x and y are the same. (You can also use an SVD, but again, this requires that both variances be the same.) If the x and y variances are not the same, then again, there is a technical problem. If the variance ratio is known, then the problem can be resolved by rescaling one variable so that both noise variances match, in which case it reduces to the case above with equal variances.
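As a rough sketch of the SVD approach for the straight-line case, assuming the two noise standard deviations (sx and sy below) are known: the data, parameter values, and variable names are all hypothetical. The y values are rescaled so both coordinates carry the same noise variance, the line is fit orthogonally in the scaled coordinates, and the scaling is then undone.

```matlab
% Total least squares line via SVD (sketch; assumes known noise stds).
rng(1);
n     = 2000;
aTrue = 1;  bTrue = 2;           % hypothetical true parameters
xTrue = 10 * rand(n, 1);
sx    = 1.0;                     % assumed noise std in x
sy    = 2.0;                     % assumed noise std in y
x = xTrue + sx * randn(n, 1);
y = aTrue + bTrue * xTrue + sy * randn(n, 1);

% Rescale y so both coordinates have the same noise variance.
k  = sx / sy;
ys = k * y;

% Center the data; the fitted line passes through the centroid.
xm = mean(x);
ym = mean(ys);
[~, ~, V] = svd([x - xm, ys - ym], 'econ');

% The last right singular vector is the direction of least variance,
% i.e. the normal to the best-fit line in the scaled coordinates.
nrm     = V(:, 2);
bScaled = -nrm(1) / nrm(2);      % slope in scaled coordinates
bTLS    = bScaled / k;           % undo the scaling of y
aTLS    = mean(y) - bTLS * xm;   % intercept in original units

fprintf('TLS fit: a = %.3f, b = %.3f\n', aTLS, bTLS);
```

This breaks down if the fitted line is nearly vertical (nrm(2) close to zero), and it is only as good as the assumed variance ratio.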
In the case of a nonlinear model, things get far more difficult, and there is no simple solution. Sorry about that. Depending upon the specific model, I'd probably want to write custom code with a maximum likelihood estimation, where the ratio of the two noise variances was one of the parameters that needs to be estimated. So it would be doable, but not even remotely trivial to write.

the cyclist on 20 Mar 2023
Edited: the cyclist on 20 Mar 2023
I am late to seeing this question, but FYI there is a submission to the File Exchange that does linear, univariate Deming regression (which is a special case of an errors-in-variables model). Obviously this might be too specific for your case. I have used this function, and it seems to work as advertised.
There is also a function for total least squares regression, which is more general. I have not used it.
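For reference, the straight-line Deming fit mentioned above has a standard closed-form solution, sketched below. This is not the File Exchange code, the data are made up, and delta (the ratio of the noise variance in y to the noise variance in x) is assumed to be known; delta = 1 gives orthogonal regression.

```matlab
% Closed-form Deming regression for a straight line (sketch only).
% delta = (noise variance in y) / (noise variance in x), assumed known.
x = [1.0 2.1 2.9 4.2 5.1 5.8 7.1 8.0]';       % made-up data
y = [3.1 4.8 7.2 9.1 10.8 13.2 14.9 17.1]';
delta = 1;                                    % equal noise variances

xm  = mean(x);
ym  = mean(y);
sxx = mean((x - xm).^2);                      % sample variance of x
syy = mean((y - ym).^2);                      % sample variance of y
sxy = mean((x - xm) .* (y - ym));             % sample covariance

% Deming slope and intercept (requires sxy ~= 0)
b = (syy - delta*sxx + sqrt((syy - delta*sxx)^2 + 4*delta*sxy^2)) / (2*sxy);
a = ym - b * xm;

fprintf('Deming fit: y = %.4f + %.4f * x\n', a, b);
```

With positively correlated data, the resulting slope lies between the ordinary least-squares slope of y on x (sxy/sxx) and the slope implied by regressing x on y (syy/sxy).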
