Inverse of a mixture distribution of two lognormal functions
5 views (last 30 days)
Show older comments
Marco Gaetani d'Aragona
on 10 Apr 2024
Edited: David Goodmanson
on 19 Apr 2024
%I'm trying to derive the inverse of cdf function composed of a mixture of two lognormal functions. These have different median and lognormal %standard deviations, and give a final function that in terms of probability is ptot=p1*c1+p2*c2, where p is the probability, and c1+c2=1.
nVars = 1;
nVar = 1;
sampling_number = 100;
pvalue = rand(sampling_number, nVars); % Assuming pvalue is a matrix of probabilities
e_mu1(3)=26/100; e_beta1(3)=189/100; Pp1(3)=86/100;
e_mu2(3)=91/100; e_beta2(3)=112/100; Pp2(3)=1-Pp1(3);
% plot the mixture cdf
figure
Cr=[0:0.001:1.40];
for DS=3
cdf_plot1= normcdf((log(Cr/e_mu1(DS)))/log(e_beta1(DS)));
cdf_plot2= normcdf((log(Cr/e_mu2(DS)))/log(e_beta2(DS)));
cdf_plot=Pp1(DS)*cdf_plot1+Pp2(DS)*cdf_plot2;
plot(Cr,cdf_plot,'LineStyle',':','LineWidth',2);
hold on
end
% now i want to sample the corresponding Cr values starting from the extracted probabilities
for sym=1:sampling_number
DS=3;
hold on
if pvalue(sym,nVar)<=Pp1(DS)
invCDFcost(sym,DS)=icdf('LogNormal',pvalue(sym,nVar)/Pp1(DS),log(e_mu1(DS)),log(e_beta1(DS)));
elseif pvalue(sym,nVar)<=Pp1(DS)+Pp2(DS)
invCDFcost(sym,DS)=icdf('LogNormal',(pvalue(sym,nVar)-Pp1(DS))/Pp2(DS),log(e_mu2(DS)),log(e_beta2(DS)));
end
plot(invCDFcost(sym,DS),pvalue(sym,nVar),'o');
end
As it can be seen, there is a problem when the second of the two lognormal function activates, since the dots should follow the dashed line.
Can anyone help me with this issue?
Thanks you in advance.
Marco
1 Comment
Accepted Answer
Jeff Miller
on 11 Apr 2024
Unfortunately, your method below won't retrieve the Cr values correctly except in the special case where the two distributions in the mixture have zero overlap (and lognormals do have some overlap):
if pvalue(sym,nVar)<=Pp1(DS)
invCDFcost(sym,DS)=icdf('LogNormal',pvalue(sym,nVar)/Pp1(DS),log(e_mu1(DS)),log(e_beta1(DS)));
elseif pvalue(sym,nVar)<=Pp1(DS)+Pp2(DS)
invCDFcost(sym,DS)=icdf('LogNormal',(pvalue(sym,nVar)-Pp1(DS))/Pp2(DS),log(e_mu2(DS)),log(e_beta2(DS)));
end
In the region of overlap, you can't tell which distribution the Cr value came from, so the p values in that region don't cleanly tell you which icdf to use, as you are assuming they do.
As far as I know, the only general solution is to do a numerical search (e.g., with fzero) to find Cr such that
0 = Pp1*cdf1(Cr)+Pp2*cdf2(Cr) - pvalue
If you are happy with a ready-made solution rather than rolling your own, you might have a look at Cupid. The following code using its distribution objects seems to do about what you want, making the figure below:
Pp1 = 86/100;
Pp2 = 1 - Pp1;
% I don't really understand your lognormal parameters,
% but these produce something close to your plot.
LN1 = LognormalMS(0.3,0.15);
LN2 = LognormalMS(0.8,0.08);
M = Mixture(Pp1,LN1,Pp2,LN2);
Cr = 0:0.001:1.40;
cdf_plot = M.CDF(Cr);
plot(Cr,cdf_plot);
hold on
pvalue = rand(100, 1); % Assuming pvalue is a matrix of probabilities
invCDFcost = M.InverseCDF(pvalue);
hold on
plot(invCDFcost,pvalue,'o');
1 Comment
David Goodmanson
on 17 Apr 2024
Edited: David Goodmanson
on 19 Apr 2024
Hi Marco/Jeff
Since the function is monotonic you can stay with Matlab and use interpolation on the cdf_plot to obtain the inverse function
Cr = [0:1e-6:1.40]; % use lots of points, they're inexpensive
% next three lines are the same
cdf_plot1= normcdf((log(Cr/e_mu1(DS)))/log(e_beta1(DS)));
cdf_plot2= normcdf((log(Cr/e_mu2(DS)))/log(e_beta2(DS)));
cdf_plot=Pp1(DS)*cdf_plot1+Pp2(DS)*cdf_plot2;
% find Cr values
Crnew = interp1(cdf_plot,Cr,pvalue,'spline');
figure(1); grid on
plot(Cr,cdf_plot,'b:',Crnew,pvalue,'or');
This produces the same plot.
More Answers (0)
See Also
Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!