lognpdf drawn by parameters from lognfit doesn't fit data at all

Question

Marc Laub on 18 Jan 2019

https://nl.mathworks.com/matlabcentral/answers/440442-lognpdf-drawn-by-parameters-from-lognfit-doenst-fit-data-at-all

Commented: Marc Laub on 19 Jan 2019

Data.mat

Hey guys,

I have a problem fitting my data. The data should be describable by a lognormal distribution, but the curve I plot with lognpdf(X,parmhat(1),parmhat(2)) fails completely and I can't figure out why. I get way better results by roughly playing with mu and sigma in lognpdf than by relying on lognfit to fit my data.

In the picture below my data points are plotted with bar. I also tried to get the lognormal to fit the data by trying different binnings (second one), but that didn't help either.

The data are given in the attachment. Column 1 holds the X positions of the data points, column 2 the absolute values of my data, and column 3 the relative values (as plotted). The lognpdf I get with:

[parmhat,parmci] = lognfit(Bindistd(:,3),0.01); % alternatively Bindistd(:,2)

X=Bindistd(:,1);

Y = lognpdf(X,parmhat(1),parmhat(2));

plot(X,Y)

is totally wrong. As I said, I tried different binnings since I thought the misfit comes from the first local maximum in the data at about 2, but that didn't work. I know that 3 for parmhat(1) and 0.6 for parmhat(2) looks fine to the naked eye; I just wonder how it is possible to get roughly these values from the lognfit function.

Maybe my mistake is a very simple one; either way I can't find it. Maybe one of you can help me with it.

Thanks in advance.

Regards


Answer 1

John D'Errico on 18 Jan 2019

https://nl.mathworks.com/matlabcentral/answers/440442-lognpdf-drawn-by-parameters-from-lognfit-doenst-fit-data-at-all#answer_357052

Edited: John D'Errico on 18 Jan 2019

Look carefully at your data! Always do this when you have a problem. Then think about what you see.

Bindistd(:,3)
ans =
       0.00327868852459016
        0.0229508196721311
         0.019672131147541
       0.00983606557377049
       0.00655737704918033
       0.00655737704918033
        0.0131147540983607
        0.0163934426229508
       0.00327868852459016
        0.0131147540983607
        0.0295081967213115
        0.0327868852459016
        0.0262295081967213
         0.039344262295082
        0.0295081967213115
        0.0262295081967213
        0.0229508196721311
        0.0295081967213115
       0.00983606557377049
        0.0229508196721311
        0.0229508196721311
        0.0360655737704918
        0.0360655737704918
         0.039344262295082
         0.019672131147541
        0.0426229508196721
        0.0131147540983607
        0.0327868852459016
        0.0295081967213115
        0.0426229508196721
         0.019672131147541
         0.019672131147541
       0.00655737704918033
       0.00983606557377049
         0.019672131147541
        0.0229508196721311
        0.0131147540983607
         0.019672131147541
       0.00655737704918033
         0.019672131147541
       0.00983606557377049
       0.00655737704918033
       0.00983606557377049
        0.0131147540983607
        0.0163934426229508
       0.00655737704918033
       0.00983606557377049
      3.27868852459016e-23
       0.00327868852459016
       0.00655737704918033
       0.00327868852459016
      3.27868852459016e-23
       0.00327868852459016
       0.00983606557377049
       0.00655737704918033
      3.27868852459016e-23
       0.00327868852459016
       0.00983606557377049
       0.00327868852459016
      3.27868852459016e-23
      3.27868852459016e-23
       0.00327868852459016
       0.00327868852459016
      3.27868852459016e-23
       0.00327868852459016
      3.27868852459016e-23
      3.27868852459016e-23
       0.00655737704918033
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
       0.00327868852459016
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
      3.27868852459016e-23
       0.00327868852459016

Do you see all of those replicate values at 3.27868852459016e-23?

In fact, out of 85 data points,

sum(Bindistd(:,3) == Bindistd(end-1,3))
ans =
    23

23 of them are the same garbage value.

I'm sorry, but this set of data does NOT follow a lognormal distribution. You can want it to do so. Hey, I want a lot of things, things that simply won't happen.

3 Comments

John D'Errico on 19 Jan 2019

If they are truly zeros, then this is not a lognormal distribution.

If they are all constant values, then this is not a lognormal distribution.

Your distribution is simply not lognormal. Trying to jam a square plug into a round hole tends not to work very well. PERIOD.

So let's consider what distribution your data MIGHT be from. It AIN'T Lognormal. (If I say it three times, then it must be so.)

B = Bindistd(:,3);
hist(B,50)

Does that look even remotely like a lognormal distribution? (No.) In fact, it looks more like a quantized exponential distribution.

Here, k will tell us where the garbage is, so we can remove it.

k = B == B(end-1);

Now, if your data does follow a lognormal distribution, then if I take the log of your data, it will look roughly normal.

hist(log(B(~k)),10)

Sorry. But not even close. Although if it was exponentially distributed, that histogram would have looked uniform. In fact, that is exactly what it looks like, more than anything else. The bins on the left are higher due to some quantization that went on in your data collection step, I might claim.

But no matter how hard you try, trying to stuff this into a lognormal is a waste of CPU cycles.
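For comparison, the log-transform check above can be tried on a sample that is known to be lognormal. This is only a sketch on synthetic data, not the poster's measurements: the sample size of 1000 and the parameters 3 and 0.6 (the values the question reports as looking right by eye) are assumptions picked for illustration. It needs the Statistics Toolbox.

```matlab
% Synthetic illustration (not the poster's data): if y is lognormal,
% then log(y) is normal, so its histogram should look bell-shaped.
rng(1);                            % fixed seed so the run is repeatable
y = lognrnd(3, 0.6, 1000, 1);      % assumed parameters mu = 3, sigma = 0.6
z = log(y);                        % log-transform, as in hist(log(B(~k)),10)

hist(z, 20)                        % roughly symmetric for a lognormal sample
[muhat, sigmahat] = normfit(z);    % estimates of mu and sigma on the log scale
s = skewness(z);                   % close to 0 for a normal sample
```

If the histogram of the logged data is clearly lopsided or flat instead, as in the plot above, that is a hint that a lognormal model is a poor match for the values that were passed in.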

Marc Laub on 19 Jan 2019

You are right that if the values are true zeros, or all the same value, it is not a lognormal distribution. In fact, the measured data suffer from a lack of statistics in my case. With more or bigger measurements, the zero values should follow a lognormal distribution. That's why I tried to fix that part with the right-censoring option in the lognfit function.

In fact, my data come from a measurement of a particle size distribution, where the lognormal distribution is the one most commonly used to describe it. That's why my first intent was to try it as well, also because a manually adjusted lognormal function seemed to describe the data pretty well, even if the current data are not lognormally distributed, as you said.
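One possible reading, which the thread itself does not confirm: if column 1 holds bin centers (particle sizes) and column 2 holds counts per bin, then the quantity that might be lognormal is the size, not the bin height. In that case the sizes could be passed to lognfit with the counts as the freq argument, lognfit(x,alpha,censoring,freq). The sketch below uses synthetic data because Data.mat is not reproduced here; the names sizes, edges, counts and rel, the bin width of 2, and the parameters 3 and 0.6 are all assumptions for illustration.

```matlab
% Hypothetical stand-in for Data.mat: synthetic particle sizes, then binned.
rng(0);                                        % fixed seed so the run is repeatable
sizes  = lognrnd(3, 0.6, 305, 1);              % assumed mu = 3, sigma = 0.6
edges  = 0:2:ceil(max(sizes)/2)*2;             % bins of width 2 (arbitrary choice)
counts = histcounts(sizes, edges).';           % counts per bin (role of column 2)
X      = (edges(1:end-1) + diff(edges)/2).';   % bin centers (role of column 1)
rel    = counts / sum(counts);                 % relative values (role of column 3)

% Fit the bin centers, weighted by their counts; skip empty bins.
nz = counts > 0;
parmhat = lognfit(X(nz), 0.01, [], counts(nz));

% lognpdf returns a density, so scale by the bin width before comparing
% it with relative frequencies.
binWidth = edges(2) - edges(1);
Y = lognpdf(X, parmhat(1), parmhat(2)) * binWidth;

bar(X, rel); hold on
plot(X, Y, 'r-', 'LineWidth', 1.5); hold off
```

With real data, placeholder entries such as the repeated 3.27868852459016e-23 would have to be turned into proper zero counts first, since freq is expected to hold nonnegative integer counts. Whether the measured sizes then turn out to be lognormal is a separate question that the fit alone does not answer.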
