# What is the best fit for a set of data?

Deva Narayanan on 18 Jan 2019
Commented: James Phillips on 5 Feb 2019
I have data that represents the dynamic viscosity of a fluid as a function of temperature. I need a best-fit equation of the form: Dynamic viscosity = (a*Temperature^b) - c, where a, b and c are constants. Can anyone give the best-fit equation?

John D'Errico on 18 Jan 2019
Edited: John D'Errico on 18 Jan 2019
I think you need to spend some time learning about modeling. Do some reading. Lots of stuff online to be found. You would want to consider at some point if a regression style model is really appropriate, or if a simple spline interpolation would be satisfactory. Do you just want a pretty plot? Do you want something you can interpolate with, just to predict viscosity for different temperatures? So you should spend some time deciding what you want, what you need, what are your long term goals in the task at hand.
Anyway, a lot of modeling ends up being common sense. So first, plot your data. Always plot everything. A simple rule that will serve you greatly in the future.
plot(T,V,'.'), grid on
Here, I would first recognize that your data varies by several orders of magnitude in viscosity, but temperature is pretty well-behaved. I would ask questions like: is this measured data? What do you expect about the noise structure in your data? Note that any traditional fit using least squares on the residuals of viscosity will be STRONGLY influenced by only those first few points. Again, this is something you need to learn about modeling. But my point is that IF we look at the data,
[T,V]
ans =
273 0.143
283 0.0738
293 0.0416
303 0.0252
313 0.0163
323 0.0111
333 0.00793
343 0.00589
353 0.00452
363 0.00356
373 0.00288
383 0.00238
393 0.002
403 0.00171
413 0.00148
423 0.00129
433 0.00114
443 0.00102
453 0.000913
463 0.000825
473 0.000749
483 0.000683
493 0.000625
503 0.000574
513 0.000528
523 0.000488
533 0.000451
543 0.000418
553 0.000387
563 0.00036
573 0.000334
583 0.000311
593 0.000289
I would want my curve to fit well down in that tail. Essentially, you want to consider whether proportional (relative) errors are appropriate. Most of the time, when your data varies by many orders of magnitude, that is what you will want to see. If so, then it makes complete sense to log the y variable, thus here we would work in terms of log(viscosity). Plot that. Look at it. Think about the curve you see.
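To put a rough number on how much the first few points dominate, here is a minimal sketch using only base MATLAB. It assumes the data are retyped by hand from the listing above, and the variable names `spread`, `share_first3`, and `log_err` are mine. It asks: if every point were off by the same relative amount, what share of the sum of squared residuals in viscosity would come from the first three points?

```matlab
% Data retyped from the listing above (T assumed to be in kelvin).
T = (273:10:593)';
V = [0.143 0.0738 0.0416 0.0252 0.0163 0.0111 0.00793 0.00589 0.00452 ...
     0.00356 0.00288 0.00238 0.002 0.00171 0.00148 0.00129 0.00114 ...
     0.00102 0.000913 0.000825 0.000749 0.000683 0.000625 0.000574 ...
     0.000528 0.000488 0.000451 0.000418 0.000387 0.00036 0.000334 ...
     0.000311 0.000289]';

% Spread of the response: ratio of largest to smallest viscosity.
spread = max(V) / min(V);

% If every point were off by the same relative amount r, the squared
% absolute residual at point i would be (r*V(i))^2, so the share of the
% sum of squares owed to the first three points does not depend on r.
share_first3 = sum(V(1:3).^2) / sum(V.^2);

fprintf('max(V)/min(V) = %.0f\n', spread);
fprintf('share of squared error from first 3 points = %.3f\n', share_first3);

% In log space the same relative error is the same absolute error at
% every point, because log((1+r)*V) - log(V) = log(1+r).
r = 0.10;
log_err = log((1 + r) * V) - log(V);
fprintf('log-space error range: %.4f to %.4f\n', min(log_err), max(log_err));
```

The last lines are the reason for working with log(viscosity): a fit on the log scale weights a 10% miss in the tail the same as a 10% miss at the first point.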
semilogy(T,V,'.')
grid on
So here we see a relationship that is far more well-behaved. I don't see much noise in the data. So a simple interpolation would do well here. In fact, you need to consider if it may well be the best choice, since any simple model will have considerable lack of fit! So again, what do you need out of this in the end? A pretty picture? A function that you can write down, even if it misses passing through the data very well? What do you need?
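If interpolation is all that is needed, a sketch along these lines would do it with base MATLAB. The choice of `'pchip'` is mine (it is shape-preserving; `'spline'` is the other obvious option), as are the helper names `logV_of` and `V_of`. Interpolating log(viscosity) rather than viscosity itself follows the reasoning above.

```matlab
% Data retyped from the listing above (T assumed to be in kelvin).
T = (273:10:593)';
V = [0.143 0.0738 0.0416 0.0252 0.0163 0.0111 0.00793 0.00589 0.00452 ...
     0.00356 0.00288 0.00238 0.002 0.00171 0.00148 0.00129 0.00114 ...
     0.00102 0.000913 0.000825 0.000749 0.000683 0.000625 0.000574 ...
     0.000528 0.000488 0.000451 0.000418 0.000387 0.00036 0.000334 ...
     0.000311 0.000289]';

% Interpolate on the log scale, then transform back to viscosity.
logV_of = @(Tq) interp1(T, log(V), Tq, 'pchip');
V_of    = @(Tq) exp(logV_of(Tq));

% Predict viscosity at a temperature between two data points.
Vq = V_of(300);
fprintf('interpolated viscosity at T = 300: %.4g\n', Vq);

% Overlay the interpolant on the data, on a log y axis.
Tfine = linspace(T(1), T(end), 500)';
semilogy(T, V, 'o', Tfine, V_of(Tfine), '-'), grid on
```

This passes through every data point by construction, but it is only valid inside the measured temperature range and gives no parameters to interpret.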
Next, IF you would consider a simple model like
viscosity = A*temp^B
then a log-log plot would give a straight line fit. (Note that I ignored for the moment the constant offset term C that you posed.)
loglog(T,V,'.'), grid on
So it gave us something closer to a straight line. In fact, a little play would convince you that no value of C will make this curve a straight line. That is, there is no value of C such that
loglog(T,V - C ,'.')
would yield something close to a straight line. This is something you can prove rather easily. (Think about it.)
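One way to see the remaining curvature without judging the plot by eye is to fit a straight line in log-log space and look at the residuals. This is a sketch with base MATLAB `polyfit`; the names `p` and `resid` are mine. A systematic pattern in the residuals (same sign at both ends, opposite sign in the middle) would indicate curvature that the pure power law cannot capture.

```matlab
% Data retyped from the listing above (T assumed to be in kelvin).
T = (273:10:593)';
V = [0.143 0.0738 0.0416 0.0252 0.0163 0.0111 0.00793 0.00589 0.00452 ...
     0.00356 0.00288 0.00238 0.002 0.00171 0.00148 0.00129 0.00114 ...
     0.00102 0.000913 0.000825 0.000749 0.000683 0.000625 0.000574 ...
     0.000528 0.000488 0.000451 0.000418 0.000387 0.00036 0.000334 ...
     0.000311 0.000289]';

% viscosity = A*temp^B becomes log(V) = log(A) + B*log(T),
% which is a straight line in x = log(T), y = log(V).
x = log(T);
y = log(V);
p = polyfit(x, y, 1);          % p(1) = B, p(2) = log(A)
resid = y - polyval(p, x);     % residuals on the log scale

fprintf('B = %.3f, A = %.4g\n', p(1), exp(p(2)));
fprintf('residual at first, middle, last point: %.3f  %.3f  %.3f\n', ...
        resid(1), resid(17), resid(end));

% Residuals against temperature: a trend here means lack of fit.
plot(T, resid, 'o-'), grid on
xlabel('Temp'), ylabel('residual in log(viscosity)')
```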
So the model you posed, of
viscosity = A*temp^B + C
is simply not a good choice. (You actually had it where you subtracted C, but that is simply a sign change on the value of C, so completely irrelevant.) Instead, try shifting the temperature before taking logs:
loglog(T - 250,V,'.'), grid on
I just picked the offset of 250 out of thin air, assuming these temperatures are in kelvin.
Note how the relationship is now very near a straight line. That suggests a good choice of model might be something like
viscosity = A*(temp - t0)^B
where t0 is on the order of 250, so around -20 degrees C.
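Rather than guessing the shift, one could scan a range of candidate values and measure how straight the shifted log-log plot is for each. The sketch below is my own illustration, not part of the original analysis: the grid `0:1:270` and the use of RMS residual from a `polyfit` line as the straightness measure are assumptions.

```matlab
% Data retyped from the listing above (T assumed to be in kelvin).
T = (273:10:593)';
V = [0.143 0.0738 0.0416 0.0252 0.0163 0.0111 0.00793 0.00589 0.00452 ...
     0.00356 0.00288 0.00238 0.002 0.00171 0.00148 0.00129 0.00114 ...
     0.00102 0.000913 0.000825 0.000749 0.000683 0.000625 0.000574 ...
     0.000528 0.000488 0.000451 0.000418 0.000387 0.00036 0.000334 ...
     0.000311 0.000289]';

% Candidate shifts; each must stay below min(T) so log(T - t0) is real.
t0_grid = 0:1:270;
rmsres  = zeros(size(t0_grid));

for k = 1:numel(t0_grid)
    x = log(T - t0_grid(k));             % shifted log-temperature
    p = polyfit(x, log(V), 1);           % best straight line for this shift
    rmsres(k) = sqrt(mean((log(V) - polyval(p, x)).^2));
end

% The shift with the smallest RMS residual gives the straightest plot.
[rms_best, idx] = min(rmsres);
t0_best = t0_grid(idx);
fprintf('straightest log-log plot at t0 = %g (RMS residual %.4g)\n', ...
        t0_best, rms_best);

plot(t0_grid, rmsres, '-'), grid on
xlabel('t0'), ylabel('RMS residual of straight-line fit')
```

For a fixed shift the model is linear in log(A) and B, so this scan is a coarse version of the same least-squares problem that the nonlinear fit solves.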
So we might decide to fit a model of that form instead. Again, you very much want to do that fit on the log of viscosity. If we log that model, we will have:
log(viscosity) = log(A) + log(temp - t0)*B
I'll assume the curve fitting toolbox here, and with some not unreasonable guesses for the parameters, we see this:
ft = fittype('log(A) + log(temp - t0)*B','independent','temp','coefficients',{'A','B','t0'})
General model:
ft(A,B,t0,temp) = log(A) + log(temp - t0)*B
mdl = fit(T,log(V),ft,'start',[100,-1,250])
mdl =
General model:
mdl(temp) = log(A) + log(temp - t0)*B
Coefficients (with 95% confidence bounds):
A = 234.9 (123.2, 346.5)
B = -2.339 (-2.423, -2.255)
t0 = 250.6 (247.8, 253.5)
plot(mdl)
hold on
plot(T,log(V),'o')
grid on
xlabel 'Temp, degrees K'
ylabel 'log(viscosity)'
So not terrible. Still some lack of fit, just a wee bit. A spline interpolant will nail the shape of the curve exactly.
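For readers without the Curve Fitting Toolbox, the same log-space model can be fit with base MATLAB `fminsearch`. This is a sketch under my own assumptions, not the method used above: I optimize over `p = [log(A), B, t0]` instead of `A` directly, guard the logarithm with `max(..., eps)`, and add a large penalty so the search stays at `t0 < min(T)`. Nelder-Mead gives no confidence bounds and may stop at slightly different values than `fit` reports.

```matlab
% Data retyped from the listing above (T assumed to be in kelvin).
T = (273:10:593)';
V = [0.143 0.0738 0.0416 0.0252 0.0163 0.0111 0.00793 0.00589 0.00452 ...
     0.00356 0.00288 0.00238 0.002 0.00171 0.00148 0.00129 0.00114 ...
     0.00102 0.000913 0.000825 0.000749 0.000683 0.000625 0.000574 ...
     0.000528 0.000488 0.000451 0.000418 0.000387 0.00036 0.000334 ...
     0.000311 0.000289]';

% Model on the log scale, p = [log(A), B, t0]:
%   log(viscosity) = log(A) + B*log(temp - t0)
% max(..., eps) keeps the logarithm real if the search tries t0 >= min(T).
logmodel = @(p, T) p(1) + p(2) * log(max(T - p(3), eps));

% Sum of squared log residuals, with a large penalty when t0 >= min(T).
sse = @(p) sum((log(V) - logmodel(p, T)).^2) + 1e6 * (p(3) >= min(T));

p0   = [log(100), -1, 250];   % same starting guesses as the fit call above
opts = optimset('MaxFunEvals', 5000, 'MaxIter', 5000, ...
                'TolX', 1e-8, 'TolFun', 1e-10);
p_hat = fminsearch(sse, p0, opts);

fprintf('A = %.4g, B = %.4g, t0 = %.4g\n', exp(p_hat(1)), p_hat(2), p_hat(3));
fprintf('RMS residual in log(viscosity) = %.4g\n', sqrt(sse(p_hat) / numel(V)));

% Compare fitted curve and data on the log scale.
plot(T, log(V), 'o', T, logmodel(p_hat, T), '-'), grid on
xlabel('Temp'), ylabel('log(viscosity)')
```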
But only you know why you need to do this curve fit. Does a model offer you some understanding of the process? With some thought, you could probably do a little better job yet. What you should understand here is the model I derived came from simply looking at the data, then applying a little common sense to the modeling. And plot everything.

#### 1 Comment

James Phillips on 5 Feb 2019
I got a good fit to the raw data with a generalized negative exponential equation, "y = a * pow(1.0 - exp(-1.0 * b * x), c)" with parameters a = 5.5192239096993052E-04, b = 1.2323292110168107E-02, and c = -1.5783979972152702E+02 yielding R-squared = 0.99994 and RMSE = 0.00021
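A quick way to check how a fit to the raw data behaves in the tail is to evaluate it and look at relative as well as absolute errors. The sketch below translates the `pow(...)` expression from the comment into MATLAB element-wise operators and uses the parameter values quoted there; the variable names and the choice of diagnostics are mine.

```matlab
% Data retyped from the listing above (T assumed to be in kelvin).
T = (273:10:593)';
V = [0.143 0.0738 0.0416 0.0252 0.0163 0.0111 0.00793 0.00589 0.00452 ...
     0.00356 0.00288 0.00238 0.002 0.00171 0.00148 0.00129 0.00114 ...
     0.00102 0.000913 0.000825 0.000749 0.000683 0.000625 0.000574 ...
     0.000528 0.000488 0.000451 0.000418 0.000387 0.00036 0.000334 ...
     0.000311 0.000289]';

% Parameters as quoted in the comment.
a = 5.5192239096993052E-04;
b = 1.2323292110168107E-02;
c = -1.5783979972152702E+02;

% y = a * pow(1.0 - exp(-1.0 * b * x), c), written element-wise.
Vhat = a * (1 - exp(-b * T)).^c;

abs_err = Vhat - V;                 % error in viscosity units
rel_err = abs_err ./ V;             % error as a fraction of each value
rmse    = sqrt(mean(abs_err.^2));   % the statistic quoted in the comment

fprintf('RMSE = %.2g\n', rmse);
fprintf('relative error at T = %g: %+.1f%%\n', T(1),   100 * rel_err(1));
fprintf('relative error at T = %g: %+.1f%%\n', T(end), 100 * rel_err(end));

% Relative error across the whole temperature range.
plot(T, 100 * rel_err, 'o-'), grid on
xlabel('Temp'), ylabel('relative error, %')
```

Printing the relative error at both ends makes it possible to check directly whether a small RMSE on the raw data also means a close fit where the viscosities are smallest.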

KSSV on 18 Jan 2019

Andrei Bobrov on 18 Jan 2019
>> x = T.Temperature;
>> y = T.DynamicViscosity;
>> ft = fittype(@(a,b,x) a*(x.^b));
>> f1 = fit(x,y,ft ,'StartPoint', [1e40, -15])
f1 =
General model:
f1(x) = a*(x.^b)
Coefficients (with 95% confidence bounds):
a = 1e+40 (-Inf, Inf)
b = -16.77 (-16.77, -16.76)
>> x1 = linspace(x(1),x(end),1000);
>> plot(x1,f1(x1),x,y,'+');
Or use the Curve Fitting Tool:
>> cftool