MATLAB Answers


Generate beta random number without statistics toolbox

Asked by Rémi BERNARD on 18 Jun 2019
Latest activity Commented on by John D'Errico
on 26 Jun 2019
Hello everyone,
I am currently working on a project involving the PDF of beta distribution, and generation of random numbers with according distribution. My script is supposed to be converted into an executable so I am limited in use of toolboxes and some commands.
Does someone know how to generate such numbers (a few millions of them, so manually is excluded) without the Statistics toolbox?
Many thanks!


Sign in to comment.




2 Answers

Answer by John D'Errico
on 18 Jun 2019
Edited by John D'Errico
on 18 Jun 2019
 Accepted Answer

As a followup, I note that the answer by dpb seems to be incorrect.
There are several ways to generate a beta random variable. Simplest is the direct use of betaincinv. (The hardest part of that is spelling betaincinv.) What I do not know is if you can use that utility, in an executable. It should be possible.
For example, let me pick alpha = 2, beta = 3, with a sample size of 1e6.
alpha = 2;
beta = 3;
N = 1e6;
U = rand(N,1);
tic,Z = betaincinv(U,alpha,beta);toc
Elapsed time is 0.432232 seconds.
hold on
H = fplot(@(x) betapdf(x,alpha,beta),[0,1]);
H.Color = 'r';
That is clearly reasonable. So if you can use betaincinv, then you are done. As I show in my comment on the answer by dpb, you can also use the ratio from a couple of gamma random variables. But betaincinv seems easiest. It seems reasonably fast given what it does, at less than .5 seconds for a million such events.
And if you cannot use betaincinv, then gammaincinv would also fail. So as long as you can use betaincinv, use that.


Show 1 older comment
As I said, what dpb told you to do is not technically correct. In fact, it does not generate the desired distribution, as I proved that it cannot. But you will need to make that decision for yourself.
So I took a closer look on your reply, what I don't get is how to generate the variables according a beta-distribution (for now I don't use the betapdf, which I reconstructed in a function to use it later, the beta function is available without TBs so that is not complicated). As I said my knowledge in statistics is quite shady, but you use uniformly distributed variates in the betaincinv function, can it make a big difference with variates generated from beta distribution such as betarnd?
EDIT: After being finally on my script, using betaincinv is much easier and faster. I tried something similar to what dpb explained (acceptance-rejection method), but it's longer to compute and it is necessary to create several cases function of the shape factors (integers, greater, less or equal to one ,...).
In fact, it is the same distribution as what betarnd would yield (unless alpha and beta are very near zero. That is itself an issue that needs discussion, later.)
Why do I use rand in there, and then call betaincinv? That gets into the theory of how one generates random variables besed on other distributions. (There are books to found on the subject.) There are multiple ways to do so, but as long as the inverse function for the distribution CDF is available and computable, then the method I showed is correct. And the nice thing is, the beta CDF is intimately related to the incomplete beta function itself, and betaincinv is exactly what is needed to compute the inverse of the beta distribution CDF.
In the case of very small alpha/beta, you will have a problem with any method explained here. So unless that is an issue, I would just use the betaincinv method, as simple.
An alternative, of computing the ratio X/(X+Y), where X and Y are GAMMA random variables, is an option. I recall dicussing that somewhere in a comment. TWO calls to gammaincinv will then be required, but they may be more efficient than one call to betaincinv. That would take some testing to resolve.
Finally, is the case where alpha and beta are both fairly small numbers. In that event, the resulting beta distribution becomes fairly nasty, and strongly bimodal. So one could then be forced to use a binomial distribution, as a special case. Only you might know if that is a case to worry about.

Sign in to comment.

Answer by dpb
on 18 Jun 2019

Given Ui are uniform random variates
Y1=U1/α1 and Y2=U1/β1,
such that Y1+Y2<=1,
follows the beta distribution with parameters α and β


ARGH. For some reason, my lengthy explanation of where it fails got deleted. Somehow, I got logged off in the middle of writing it, and then my comment just disappeared.
But there is a clear error in your claim.
Y1=U1/α1 and Y2=U1/β1,
such that Y1+Y2<=1,
You use U1 for BOTH Y1 and Y2. That would clearly resut in Y1/(Y1+Y2) a constant value, independent of U1. So you must have intended to write:
Y1=U1/α1 and Y2=U2/β1,
such that Y1+Y2<=1,
No problem there. But you don't tell us the required bounds on U1 and U2. Are they [0,1] both? I'll assume that at first.
First, I note that when alpha=beta=1, it seems true. A simple test would confirm that to be reasonable.
X = rand(1e6,1);
Y = rand(1e6,1);
k = X + Y <= 1;
Z = X(k)./(X(k) + Y(k));
Ok, but then what happens if alpha=beta, when they are both greater than 2?
Z = (U1/alpha) / (U1/alpha + U2/beta)
Now assume that alpha = beta. Then the ratio reduces to
Z = U1 / U1 + U2)
This tends to imply that the beta distribution does not change shape when alpha==beta.
fplot(@(x) betapdf(x,3,3),[0,1])
>> hold on
>> fplot(@(x) betapdf(x,2,2),[0,1])
>> fplot(@(x) betapdf(x,1,1),[0,1])
>> fplot(@(x) betapdf(x,1/2,1/2),[0,1])
>> fplot(@(x) betapdf(x,10,10),[0,1])
These are all beta distributions with alpha==beta. Clearly they are not the same. And as long as both parameters are always greater than 2 and eual, then the resulting ditribution will seem to vary, but still be symmetric. And the sum Y1+Y2 will ALWAYS be less than 1.
So the idea that U1 and U2 live in the unit square [0,1]X[0,1] must have been wrong.
Instead, perhaps the idea is that U1 and U2 live in a triangular region in the plane, and are uniform on that triangle. So might we have U1 and U2 live in the triangle bounded by vertices {[0,0], [alpha,0], [0,beta]} ?
So if U1 and U2 are uniformly distributed in that planar region, then it would be true that
U1/alpha + U2/beta <= 1
will always hold true. However, now if alpha = beta, then the ratio reduces to one that is NOT a function of alpha or beta. The point is, then we would have a case where the resulting distribution will not change as we vary alpha and beta.
Finally, I think where you got that result comes from something that DOES look like what you claim.
In the wiki page for the beta distribution,
I find the claim that when X and Y are gamma distributed, such that
X ~ gamma(alpha,theta)
Y ~ gamma(beta,theta)
Then the ratio
Z = X/(X+Y)
follows a beta distribution, beta(alpha,beta).
Let me test this for a simple example. Here, alpha = 2, beta = 3.
X = gamrnd(2,1,1,1e6);
Y = gamrnd(3,1,1,1e6);
Z = X./(X+Y);
hold on
H = fplot(@(x) betapdf(x,2,3),[0,1]);
H.Color = 'r';
As you see, the histogram closely follows the predicted beta distribution. I might conjecture that this is what you remember learning in the past.
Ok so my variates are bounded between zero and my upper limit is variable too (for most cases it will be less than one), but I can always work on the [0 1] case and do my maths later it doesn't look like a big issue.
I never went so far with statistics and I'm learning about distributions like gamma and beta on the spot while working, and still in the "on paper" phase before going head first coding something I don't fully understand.
So my plan, once I write the script, is to compare what you and John told me, if both work I should see it directly as it is going to serve for a geometrical distribution of the discs in multi-discs cultches and have an idea of what my results should look like. But from what I read these past few days your solution seems to do the job. I try to think of keeping you updated and thanks again, always nice to read from people as interested as you seem to be.
As I said, what dpb told you to do is not technically correct. In fact, it does not generate the desired distribution, as I proved that it cannot. But you will need to make that decision for yourself.

Sign in to comment.