Sum of two random variables with different distributions

Hi everyone,
I have a kernel cdf for variable x1 (bandwidth = 0.22) and an exponential cdf for variable x2 (lambda = 1/14). Is there a command in MATLAB that allows me to get the cdf for a new variable that is the sum of x1 and x2 (y = x1 + x2)? The range of values for x2 is [0; 100] and for x1 is ]0; +∞[.
Thanks in advance!
Best,
Elizabeth
  1 Comment
John D'Errico
John D'Errico on 11 Feb 2016
Um, I see that you state that the exponential variable is limited to [0,100]. That is simply not true. An exponential CDF has support [0,inf). Given your (unknown to me) rate parameter, it may well be true that PRACTICALLY, your exponential random variable is almost always no larger than 100.


Accepted Answer

John D'Errico
John D'Errico on 11 Feb 2016
I'm sorry. There is no command in MATLAB that will give you the CDF of the sum of two general random variables. This is for good reason: there is NO simple way to write the CDF of the sum of two general, unrelated random variables with arbitrary distributions.
There are many things we might wish to do that have no simple solutions. That is not to say this is impossible. In fact, there are several classical solution approaches. The general idea is called statistical tolerancing. I'll suggest a couple of ideas that you can use.
1. You can do a Monte Carlo simulation. Generate random samples from each component, then form the sum. You can then compute a sample CDF from the data points. Lots and lots of points here will yield a decent approximation to the CDF.
2. Compute the mean, variance, skewness, kurtosis, etc., of the sum. There are many ways this can be done, using Taylor series approximations or various Taguchi-style methods. For example, as Walter points out, the mean of a sum is simple to compute. However, the higher order moments will take a little more work. Given those moments, you can choose a member of a distribution family that matches those moments as well as possible. The families usually used for this are the Pearson and the Johnson families.
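As a rough illustration of option 2, here is how the low-order moments of the sum combine. This assumes x1 and x2 are independent with finite fourth moments, which the question does not state; the symbols μ3 and μ4 (third and fourth central moments) and σ² (variance) are notation introduced here, not taken from the thread.

```latex
% Assumption: X_1 and X_2 are independent, and Y = X_1 + X_2.
\begin{aligned}
\mathrm{E}[Y]   &= \mathrm{E}[X_1] + \mathrm{E}[X_2] \\
\sigma_Y^2      &= \sigma_{X_1}^2 + \sigma_{X_2}^2 \\
\mu_3(Y)        &= \mu_3(X_1) + \mu_3(X_2) \\
\mu_4(Y)        &= \mu_4(X_1) + \mu_4(X_2) + 6\,\sigma_{X_1}^2\,\sigma_{X_2}^2
\end{aligned}
% Skewness and kurtosis of Y then follow as
% \mu_3(Y)/\sigma_Y^3 and \mu_4(Y)/\sigma_Y^4.
```

The skewness and kurtosis obtained this way are what you would then try to match with a Pearson or Johnson family member.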
The above scheme (#2) has its flaws for two distributions that are so different from each other, especially when one of them is bounded. That suggests that no simple Pearson or Johnson member will be a good approximant. So I would suggest #1 as the best approach, to form a completely empirical CDF.
If you do choose option #2, then I would still strongly suggest using a Monte Carlo scheme to validate those results. This is especially true because a Monte Carlo simulation is so easy to do.
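To make option 1 concrete, here is a minimal Monte Carlo sketch. It is written in Python with only the standard library so that it is self-contained; in MATLAB the same steps would likely use Statistics and Machine Learning Toolbox functions such as ksdensity and ecdf. Several things here are assumptions and not from the question: the kernel is taken to be Gaussian, x1 and x2 are taken to be independent, the [0, 100] range on x2 is treated as a truncation (renormalized exponential), and the data values in the usage example are made up. The function names are hypothetical.

```python
import bisect
import math
import random


def sample_kernel(data, bandwidth, rng):
    """Draw one value from a Gaussian kernel density estimate over `data`.

    Assumption: Gaussian kernel with no boundary correction at 0.
    """
    return rng.choice(data) + rng.gauss(0.0, bandwidth)


def sample_truncated_exponential(rate, upper, rng):
    """Draw one exponential(rate) value conditioned on [0, upper] by inverse CDF."""
    u = rng.random()
    return -math.log(1.0 - u * (1.0 - math.exp(-rate * upper))) / rate


def empirical_cdf_of_sum(data, bandwidth, rate, upper, n_samples, seed=0):
    """Return a function y -> estimated P(x1 + x2 <= y), assuming independence."""
    rng = random.Random(seed)
    sums = sorted(
        sample_kernel(data, bandwidth, rng)
        + sample_truncated_exponential(rate, upper, rng)
        for _ in range(n_samples)
    )

    def cdf(y):
        # Fraction of simulated sums that are <= y.
        return bisect.bisect_right(sums, y) / len(sums)

    return cdf


if __name__ == "__main__":
    x1_data = [1.0, 2.0, 3.5]  # placeholder observations, not real data
    cdf = empirical_cdf_of_sum(x1_data, 0.22, 1.0 / 14.0, 100.0, 100_000)
    for y in (5.0, 12.0, 30.0, 60.0):
        print(y, cdf(y))
```

More samples give a smoother estimate; the error of an empirical CDF shrinks roughly like one over the square root of the number of samples.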

More Answers (1)

Walter Roberson
Walter Roberson on 10 Feb 2016
For any two distributions with finite means, the mean of the sum of a random variable from each of the two distributions is the same as the sum of the means of the individual distributions. This holds whether or not the two variables are independent.
The question becomes more interesting if you are clipping based upon the sum of the two rather than clipping each individually. Your (0,infinity) for x1 does not appear to be a truncated range (unless 0 would normally be part of the range), but your [0,100] for x2 is truncated. Since you do not appear to be truncating based upon the two together, the "sum of the means" still applies.
(Among the commonly used continuous distributions, only a few, such as the Uniform distribution and the Beta distribution, have a finite domain; most others, including the exponential distribution, have an infinite domain, so you need to do special analysis on your truncated x2 distribution to find its mean.)
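As a sketch of that special analysis, the mean of an exponential variable restricted to [0, b] has a closed form if "truncated" is taken to mean conditioning on x2 ≤ b (an assumption; clipping values above b to b would give a different formula). Integrating by parts gives 1/λ − b·exp(−λb) / (1 − exp(−λb)). The function name below is hypothetical.

```python
import math


def truncated_exponential_mean(rate, upper):
    """Mean of an exponential(rate) variable conditioned to lie in [0, upper]."""
    tail = math.exp(-rate * upper)  # P(X > upper) before truncation
    return 1.0 / rate - upper * tail / (1.0 - tail)


if __name__ == "__main__":
    # Values from the question: lambda = 1/14, range [0, 100].
    print(truncated_exponential_mean(1.0 / 14.0, 100.0))
```

With lambda = 1/14 and an upper limit of 100, this comes out to roughly 13.92 rather than 14, so the truncation has only a small effect on the mean in this case.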
