get parameters of gaussian distributions from ksdensity function

Question

Marc Laub on 29 Sep 2019


https://nl.mathworks.com/matlabcentral/answers/482688-get-parameters-of-gaussian-distributions-from-ksdensity-function

Edited: Thiago Henrique Gomes Lobato on 29 Sep 2019

Accepted Answer: Thiago Henrique Gomes Lobato

Hey there,

I am looking for a way to get an analytical expression for the distribution of my numbers. Since my numbers are generated by a simulation, I can't say for sure which distribution would describe them best at any given time.

The best description of my data so far comes from the ksdensity function, but ksdensity only returns the x and y points of a curve that fits the data.

Is it possible to get the parameters of the Gaussian distributions from the ksdensity function? For instance, in the first example here: https://de.mathworks.com/help/stats/ksdensity.html

you can clearly see the bimodality of the data. Would it be possible to get the parameters of the 2 Gaussian distributions that are superimposed there?


Answer 1

Thiago Henrique Gomes Lobato on 29 Sep 2019


https://nl.mathworks.com/matlabcentral/answers/482688-get-parameters-of-gaussian-distributions-from-ksdensity-function#answer_393989

Edited: Thiago Henrique Gomes Lobato on 29 Sep 2019

ksdensity uses a nonparametric representation to estimate the density, so there are no parameters to get from the function itself. If, however, you know which distribution may underlie the data (or can make a good visual estimate), you can run a parametric optimization afterwards to get the parameters. An example based on the two Gaussians that you mentioned:

rng('default')  % For reproducibility
x = [randn(30,1); 5+randn(30,1)];
[f,xi] = ksdensity(x);
% Cost function: RMS error between the ksdensity curve y and a weighted
% sum of two Gaussians evaluated at t.
% Parameter vector p = [sigma1, mu1, sigma2, mu2, weight1, weight2]
fun = @(p,t,y)rms(y-(p(5)*1./sqrt(p(1)^2*2*pi).*exp(-(t-p(2)).^2/(2*p(1)^2))+...
             p(6)*1./sqrt(p(3)^2*2*pi).*exp(-(t-p(4)).^2/(2*p(3)^2))   )  );

% Find the parameters with the minimum error. To improve convergence, choose reasonable initial values.
% The result is stored in p so that the data in x is not overwritten.
[p,fval] = fminsearch(@(r)fun(r,xi,f),[2,0.5,2,4,0.5,0.5]);
% The cost only depends on sigma^2, so a negative sigma may be returned; make the sigmas positive
p([1,3]) = abs(p([1,3]));
% Generate the parametric distributions
pd1 = makedist('Normal','mu',p(2),'sigma',p(1));
pd2 = makedist('Normal','mu',p(4),'sigma',p(3));
% Get the weighted probability density values
y1 = pdf(pd1,xi)*p(5); % p(5) is the participation factor (weight) of pd1
y2 = pdf(pd2,xi)*p(6); % p(6) is the participation factor (weight) of pd2
% Plot
figure
plot(xi,f);
hold on;
plot(xi,y1);
plot(xi,y2);
plot(xi,y2+y1);
legend({'ksdensity',['\mu : ',num2str(p(2)),'. \sigma :',num2str(p(1))],...
    ['\mu : ',num2str(p(4)),'. \sigma :',num2str(p(3))],'pdf1+pdf2'})

You can find a list of the distributions available in MATLAB under the 'name' parameter here: https://de.mathworks.com/help/stats/prob.normaldistribution.pdf.html .

A good approach for you might then be:

1. Plot the distribution of the data with ksdensity
2. Check what it looks like and search for the distribution that most resembles it
3. Do a parametric fit to the data as shown above

If you want to fully automate it, you can write optimization functions for multiple distributions and then choose the one with the lowest fit error. I hope this helps; if something is not clear, feel free to ask.
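As a rough sketch of that automated selection, the script below fits several candidate models to the ksdensity curve and keeps the one with the lowest RMS error. The candidate set (one Gaussian, one logistic, a sum of two Gaussians), the struct layout, the names `models` and `rmsErr`, and the initial values are all assumptions made for illustration, not part of the answer above.

```matlab
rng('default')                            % for reproducibility
data = [randn(30,1); 5+randn(30,1)];      % same synthetic data as above
[f,xi] = ksdensity(data);

% Hypothetical candidate models, written out in closed form.
% Each entry holds a name, a pdf handle pdf(p,t) and an initial guess p0.
gauss = @(t,mu,s) exp(-(t-mu).^2/(2*s^2)) / sqrt(2*pi*s^2);
models = struct( ...
    'name', {'Normal', 'Logistic', 'Two Gaussians'}, ...
    'pdf',  {@(p,t) gauss(t,p(1),p(2)), ...
             @(p,t) sech((t-p(1))/(2*p(2))).^2 / (4*abs(p(2))), ...
             @(p,t) p(5)*gauss(t,p(1),p(2)) + p(6)*gauss(t,p(3),p(4))}, ...
    'p0',   {[2.5 3], [2.5 2], [0 1 5 1 0.5 0.5]});

rmsErr = @(y,yhat) sqrt(mean((y - yhat).^2));   % fit error to minimize

fitErr = zeros(1,numel(models));
pFit   = cell(1,numel(models));
for k = 1:numel(models)
    obj = @(p) rmsErr(f, models(k).pdf(p,xi));
    [pFit{k}, fitErr(k)] = fminsearch(obj, models(k).p0);
end

% Keep the candidate with the lowest fit error
[~,best] = min(fitErr);
fprintf('Best model: %s (RMS error %.4g)\n', models(best).name, fitErr(best));
```

Note that a model with more parameters can usually reach a lower error, so the lowest RMS value alone tends to favor the more flexible candidates.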

2 Comments

Marc Laub on 29 Sep 2019

Ok, thank you very much so far.

If anything, I would have to do it fully automated, but from my trials I would guess that in most cases at most 4 superimposed Gaussian distributions fit my data quite well.

Thiago Henrique Gomes Lobato on 29 Sep 2019

Edited: Thiago Henrique Gomes Lobato on 29 Sep 2019

You can adjust your optimization as you get new data. If you know from experience that the data will always be 2-4 superimposed Gaussians, you don't need to test Poisson distributions, for example. One idea would be to use findpeaks to get candidate values for the means and then perform the optimization based on the number of peaks you find:

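% Peak locations of the ksdensity curve serve as initial guesses for the means
[PeakValues,peakIdx] = findpeaks(f);
NOfGaussians = length(peakIdx);
meanGuesses = xi(peakIdx);
% fun must be extended to accept the number of Gaussians; this initial vector assumes two peaks
[p,fval] = fminsearch(@(r)fun(r,xi,f,NOfGaussians),[2,meanGuesses(1),2,meanGuesses(2),0.5,0.5]);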

Then just make sure the function handles the number-of-Gaussians parameter correctly, and adjust the initial values vector to match the number of peaks.
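One way such a function could look is sketched below: a cost function for an arbitrary number N of weighted Gaussians, with the initial vector built from the peaks that findpeaks returns. The names `gaussSum`, `cost` and `col`, the parameter layout, and the starting values (unit sigmas, equal weights) are assumptions of this sketch rather than anything prescribed above. It relies on implicit expansion, so it needs R2016b or later.

```matlab
rng('default')                              % for reproducibility
data = [randn(30,1); 5+randn(30,1)];        % same synthetic data as in the answer
[f,xi] = ksdensity(data);

col = @(v) v(:);                            % helper: force a column vector
% Assumed parameter layout:
% p = [sigma_1..sigma_N, mu_1..mu_N, w_1..w_N]
% Each row of the N-by-M intermediate matrix is one weighted Gaussian;
% summing over rows gives the model curve at the M points in t.
gaussSum = @(p,t,N) sum( col(p(2*N+1:3*N)) ./ (sqrt(2*pi)*abs(col(p(1:N)))) ...
    .* exp(-(t(:).' - col(p(N+1:2*N))).^2 ./ (2*col(p(1:N)).^2)), 1);
cost = @(p,t,y,N) sqrt(mean((y(:).' - gaussSum(p,t,N)).^2));   % RMS error

% One Gaussian per detected peak; peak locations are the initial means
[~,peakIdx] = findpeaks(f);
N  = numel(peakIdx);
p0 = [ones(1,N), xi(peakIdx), ones(1,N)/N]; % unit sigmas, equal weights

[pFit,fval] = fminsearch(@(p) cost(p,xi,f,N), p0);
sigmaFit  = abs(pFit(1:N));                 % only sigma^2 enters the model
muFit     = pFit(N+1:2*N);
weightFit = pFit(2*N+1:3*N);
```

The fitted parameters can then be passed to makedist one component at a time, as in the two-Gaussian example.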
