select random number from an array with probabilities

Margherita Premoli on 20 Feb 2020
Answered: Steven Lord on 4 Oct 2023
I have an array of three elements, S=[4 3.9 3.8], and I want to randomly select one of those three numbers. The probability of selecting 4 is 0.5, the probability of selecting 3.9 is 0.4, and the probability of selecting 3.8 is 0.1.
Can anyone help me please?
Adam on 20 Feb 2020
Off the top of my head and unverified because my Matlab is busy and I can't be bothered to start another one:
cumulativeProbs = cumsum( [0.5 0.4 0.1] );
% Pick the first element whose cumulative probability exceeds the random draw.
S( find( rand < cumulativeProbs, 1 ) )

Sky Sartorius on 20 Feb 2020
You can query the cumulative probabilities:
S = [4, 3.9, 3.8];
w = [0.5, 0.4, 0.1];
w = w/sum(w); % Make sure probabilities add up to 1.
cp = [0, cumsum(w)];
r = rand;
ind = find(r>cp, 1, 'last');
result = S(ind)
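If many draws are needed at once, the same cumulative-probability idea can be vectorized. The following is only a sketch: the variable names n, samples and the clamping step are my own additions, and it relies on implicit expansion, which needs R2016b or later.

```matlab
S = [4, 3.9, 3.8];
w = [0.5, 0.4, 0.1];
w = w/sum(w);                 % normalize so the weights sum to 1
cp = [0, cumsum(w)];          % cumulative edges: 0, 0.5, 0.9, 1

n = 1e5;                      % number of draws (arbitrary choice)
r = rand(n, 1);               % one uniform draw per row

% For each draw, count how many edges lie strictly below it.
% r is n-by-1 and cp is 1-by-4, so r > cp expands to an n-by-4 logical array.
ind = sum(r > cp, 2);

% Guard against roundoff leaving the last edge slightly below 1.
ind = min(ind, numel(S));

samples = S(ind);             % each element is 4, 3.9, or 3.8
```

Counting the edges below each draw gives the same index as find(r>cp, 1, 'last') does for a single draw, just for all n draws in one step.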
Margherita Premoli on 21 Feb 2020
Okay great, now it is clear! Thanks again :)
Very helpful, thanks! <3

Steven Lord on 4 Oct 2023
Another way to do this is to use the discretize function.
values = [4, 3.9, 3.8];
probabilities = [0.5, 0.4, 0.1];
Let's create the cumulative probability vector and, to account for roundoff, set the right-most edge to exactly 1.
probabilityEdges = cumsum([0 probabilities])
probabilityEdges = 1×4
0 0.5000 0.9000 1.0000
probabilityEdges(end) = 1
probabilityEdges = 1×4
0 0.5000 0.9000 1.0000
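For these particular probabilities the last edge already displays as 1, so pinning it changes nothing visible. As a rough illustration of why the step can still matter, here is a sketch with a hypothetical probability vector (ten outcomes of 0.1 each, not from the question). Sums like this can end up slightly off from 1 in floating point, and since discretize returns NaN for data outside the edges, a draw above a too-small last edge would not be assigned a value.

```matlab
p = repmat(0.1, 1, 10);          % hypothetical: ten equally likely outcomes
edges = cumsum([0 p]);           % cumulative edges, accumulated in floating point

gap = 1 - edges(end);            % how far the last edge is from exactly 1
fprintf('last edge = %.17g, gap = %g\n', edges(end), gap);

edges(end) = 1;                  % pin the right-most edge to exactly 1
```

After pinning, the edges cover the whole range of values that rand can return.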
Now generate random numbers between 0 and 1 and discretize those random numbers using the probability edges. Specify that you want the output of discretize to be elements from the values array rather than which probability bin they belong to by passing values into discretize as the third input argument.
x = rand(1, 1e5);
v = discretize(x, probabilityEdges, values);
% Elements in v are 4, 3.9, or 3.8 rather than 1, 2, or 3 respectively
Now let's show that we received roughly the probability distribution given in the probabilities vector by plotting a histogram of v. I use the sorted elements of the values variable to create the bin edges, with one additional edge at 4.1 so that the last bin contains only those elements of v that are exactly 4. If I didn't include 4.1, the last bin would have counted both the elements of v equal to 4 and those equal to 3.9. I also subtracted 0.05 from every edge so that each bin is centered on an element of values rather than using those elements as the leftmost bin edges.
Let's also draw lines at the probabilities so we can see how close each bin is to the theoretical probability we requested. I'll increase the upper limit on the Y axis to make it easier to see the top of the tallest bin.
histogram(v, 'BinEdges', [sort(values) 4.1]-0.05, 'Normalization', 'probability')
yline(probabilities, ':')
ylim([0 0.55])
xticks(sort(values))
Those bars are in pretty good agreement with the probabilities from the probabilities variable.
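The agreement can also be checked numerically rather than by eye. This is a minimal sketch that regenerates v so that it runs on its own; the names observed and maxAbsError are mine, and the exact numbers will vary from run to run.

```matlab
values = [4, 3.9, 3.8];
probabilities = [0.5, 0.4, 0.1];
probabilityEdges = cumsum([0 probabilities]);
probabilityEdges(end) = 1;       % account for roundoff, as above

v = discretize(rand(1, 1e5), probabilityEdges, values);

% Fraction of draws equal to each value, in the same order as values.
% Exact comparison is safe because v holds copies of the elements of values.
observed = arrayfun(@(val) mean(v == val), values);

disp([probabilities; observed])  % row 1: requested, row 2: observed
maxAbsError = max(abs(observed - probabilities))
```

With 1e5 draws, the largest difference should be small compared with the probabilities themselves.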