lognpdf drawn with parameters from lognfit doesn't fit data at all

Hey guys,
I have a problem fitting my data. The data should be describable by a lognormal distribution. Somehow the curve I plot with lognpdf(X,parmhat(1),parmhat(2)) fails totally, and I can't figure out why. I get way better results by roughly playing with mu and sigma in lognpdf than by hoping for lognfit to fit my data.
The picture below shows my data points plotted with bar. I also tried to get the lognormal to fit the data by trying different binnings (second one), but that didn't help either.
The data are given in the attachment. Column 1 holds the X positions of the data points, column 2 the absolute values of my data, and column 3 the relative values (as plotted). The lognpdf I get with:
[parmhat,parmci] = lognfit(Bindistd(:,3),0.01); % alternatively Bindistd(:,2)
X=Bindistd(:,1);
Y = lognpdf(X,parmhat(1),parmhat(2));
plot(X,Y)
is totally wrong. As I said, I tried different binnings, since I thought the misfit comes from the first local maximum in the data at about 2, but that didn't work. I know that 3 for parmhat(1) and 0.6 for parmhat(2) look fine to the naked eye; I just wonder how it is possible to get roughly these values from the lognfit function.
Maybe my mistake is a very simple one; either way, I can't find it. Maybe one of you can help me.
Thanks in advance.
Regards

Accepted Answer

John D'Errico on 18 Jan 2019
Edited: John D'Errico on 18 Jan 2019
Look carefully at your data! Always do this when you have a problem. Then think about what you see.
Bindistd(:,3)
ans =
0.00327868852459016
0.0229508196721311
0.019672131147541
0.00983606557377049
0.00655737704918033
0.00655737704918033
0.0131147540983607
0.0163934426229508
0.00327868852459016
0.0131147540983607
0.0295081967213115
0.0327868852459016
0.0262295081967213
0.039344262295082
0.0295081967213115
0.0262295081967213
0.0229508196721311
0.0295081967213115
0.00983606557377049
0.0229508196721311
0.0229508196721311
0.0360655737704918
0.0360655737704918
0.039344262295082
0.019672131147541
0.0426229508196721
0.0131147540983607
0.0327868852459016
0.0295081967213115
0.0426229508196721
0.019672131147541
0.019672131147541
0.00655737704918033
0.00983606557377049
0.019672131147541
0.0229508196721311
0.0131147540983607
0.019672131147541
0.00655737704918033
0.019672131147541
0.00983606557377049
0.00655737704918033
0.00983606557377049
0.0131147540983607
0.0163934426229508
0.00655737704918033
0.00983606557377049
3.27868852459016e-23
0.00327868852459016
0.00655737704918033
0.00327868852459016
3.27868852459016e-23
0.00327868852459016
0.00983606557377049
0.00655737704918033
3.27868852459016e-23
0.00327868852459016
0.00983606557377049
0.00327868852459016
3.27868852459016e-23
3.27868852459016e-23
0.00327868852459016
0.00327868852459016
3.27868852459016e-23
0.00327868852459016
3.27868852459016e-23
3.27868852459016e-23
0.00655737704918033
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
0.00327868852459016
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
3.27868852459016e-23
0.00327868852459016
Do you see all of those replicate values at 3.27868852459016e-23?
In fact, out of 85 data points,
sum(Bindistd(:,3) == Bindistd(end-1,3))
ans =
23
23 of them are the same garbage value.
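A minimal sketch of the same check on a short excerpt of the values listed above. The excerpt, the tolerance, and the variable names are illustrative assumptions, not part of the original post. Exact == works here because the repeated entries are bit-identical; a tolerance is a little more robust for other data sets. The excerpt also shows that the remaining values are integer multiples of 1/305 to printed precision, which suggests that column 3 holds counts divided by 305.

```matlab
% Excerpt of Bindistd(:,3) copied from the listing above (illustrative subset).
B = [0.00327868852459016; 0.0229508196721311; 0.019672131147541; ...
     3.27868852459016e-23; 0.00655737704918033; 3.27868852459016e-23];

garbage = 3.27868852459016e-23;          % placeholder value seen in the listing
k = abs(B - garbage) <= eps(garbage);    % tolerance-based match instead of ==
nGarbage = sum(k);                       % number of placeholder entries

% Express the remaining values in units of 1/305 (the smallest real value).
multiples = B(~k) * 305;
isIntegerMultiple = abs(multiples - round(multiples)) < 1e-9;

fprintf('%d of %d excerpt values are the placeholder\n', nGarbage, numel(B));
disp(round(multiples).');
```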
I'm sorry, but this set of data does NOT follow a lognormal distribution. You can want it to do so. Hey, I want a lot of things, things that simply won't happen.
  3 Comments
John D'Errico on 19 Jan 2019
If they are truly zeros, then this is not a lognormal distribution.
If they are all constant values, then this is not a lognormal distribution.
Your distribution is simply not lognormal. Trying to jam a square plug into a round hole tends not to work very well. PERIOD.
So let's consider what distribution your data MIGHT be from. It AIN'T lognormal. (If I say it three times, then it must be so.)
B = Bindistd(:,3);
hist(B,50)
[Figure: histogram of B with 50 bins]
Does that look even remotely like a lognormal distribution? (No.) In fact, it looks more like a quantized exponential distribution.
Here, k will tell us where the garbage is, so we can remove it.
k = B == B(end-1);
Now, if your data does follow a lognormal distribution, then if I take the log of your data, it will look roughly normal.
hist(log(B(~k)),10)
[Figure: histogram of log(B(~k)) with 10 bins]
Sorry, but not even close. A uniform histogram of the logs would point to a log-uniform distribution (density proportional to 1/x), and that is what this looks like more than anything else; the log of an exponentially distributed variable would instead be skewed to the left. The bins on the left are higher due to some quantization that went on in your data collection step, I might claim.
But no matter how hard you try, trying to stuff this into a lognormal is a waste of CPU cycles.
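The log-transform check described above can be tried on synthetic data where the answer is known. This is only a sketch under stated assumptions: the sample size, the seed, and the parameters mu = 3 and sigma = 0.6 (the values the question says look right by eye) are chosen purely for illustration. lognrnd, skewness, and kurtosis come from the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                          % fixed seed so the run is repeatable
mu = 3; sigma = 0.6;             % illustrative parameters, not fitted values
x = lognrnd(mu, sigma, 1e4, 1);  % synthetic lognormal sample

y = log(x);                      % should be normal with mean mu, std sigma
fprintf('mean(log x) = %.3f, std(log x) = %.3f\n', mean(y), std(y));
fprintf('skewness = %.3f, kurtosis = %.3f\n', skewness(y), kurtosis(y));

histogram(y, 30);                % roughly bell-shaped if x is lognormal
xlabel('log(x)'); ylabel('count');
```

For a genuinely lognormal sample, the skewness of the logs should be near 0 and the kurtosis near 3. A log-histogram that is flat or strongly skewed argues against a lognormal model.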
Marc Laub on 19 Jan 2019
You are right that if the values are true zeros, or all the same value, it is not a lognormal distribution. In fact, the problem with these measured data is that they lack statistics in my case. With more or bigger measurements, the zero values should follow a lognormal distribution. That's why I tried to fix that part with the right-censoring option of the lognfit function.
In fact, my data come from a measurement of a particle size distribution, where the lognormal distribution is the one most commonly used to describe it. That's why my first intent was to try it too, also because a manually adjusted lognormal function seemed to describe the data pretty well, even if the current data are not lognormally distributed, as you said.
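For binned particle-size data of this kind, one possible approach is to fit the bin positions weighted by their counts. It is offered here as a sketch and is not something confirmed in this thread. The reasoning is that lognfit estimates its parameters from samples of the measured quantity, not from histogram heights. The example uses synthetic data: the bin edges, sample size, and variable names are assumptions. It relies on the documented lognfit(x,alpha,censoring,freq) form with an empty censoring argument.

```matlab
rng(1);                                      % repeatable synthetic example
sizes = lognrnd(3, 0.6, 5000, 1);            % hypothetical particle sizes

edges   = (0:2:200).';                       % assumed bin edges
counts  = histcounts(sizes, edges).';        % counts per bin (like column 2)
centers = edges(1:end-1) + diff(edges)/2;    % bin positions (like column 1)

keep = counts > 0;                           % drop empty bins before fitting
parmhat = lognfit(centers(keep), 0.01, [], counts(keep));

% Overlay the fitted density on the histogram normalized to unit area.
binWidth = edges(2) - edges(1);
bar(centers, counts / (sum(counts) * binWidth), 1); hold on
plot(centers, lognpdf(centers, parmhat(1), parmhat(2)), 'LineWidth', 1.5);
hold off
```

Under these assumptions parmhat should land close to the generating values 3 and 0.6. The fit still only makes sense if the underlying sizes are plausibly lognormal.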



Release

R2018a
