Randomly dividing an integer (non-uniform distribution)

Question

0 votes

I am trying to write a code that randomly divides a number N into M parts (in other words, I want to generate M random numbers whose sum adds up to N). However, I want to allow the generated numbers to vary considerably in magnitude.

My current code generates numbers that are all in the same ballpark:

N = 1;
n = rand(1,M);
TOT = sum(n);
numbers = (n/TOT)*N

For instance, for

, it yields:

    0.0716    0.0184    0.0498    0.0204    0.0748    0.0528    0.0120    0.0194    0.0650    0.0721    0.0459    0.0744    0.0785    0.0248
    0.0414    0.0326    0.0786    0.0778    0.0608    0.0290

As one can see, the number

is so evenly distributed that all generated numbers have the same (or close) orders of magnitude.

Is there a simple way to tweak this code so that it can generate numbers that are widely different in magnitude (e.g. 0.7 and 0.01)?

Any suggestions would be greatly appreciated.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Jeff Miller on 12 May 2019

Open in MATLAB Online

1 vote

There are any number of ways to do that, depending on exactly what you mean by "considerably". Here are a few:

%% Transform the numbers so they aren't uniform to begin with.
M=20;
N = 1;
n = rand(1,M).^10;  % Increase the exponent to increase the size variation.  A lot of numbers will be near zero.
TOT = sum(n);
numbers = (n/TOT)*N
%%  Generate progressively smaller numbers by generating each new one as a uniform(0,1) fraction of
% what is left of your total.
M=20;
n = rand(1,M);
TOT = sum(n);
cumtot = cumsum(n(1:end-1));
n(1:end-1) = n(1:end-1) .* (TOT - cumtot);
TOT = sum(n);
numbers = (n/TOT)*N
% permute(numbers) here if you don't want larger numbers to be systematically first.

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 2

John D'Errico on 12 May 2019

Edited: John D'Errico on 12 May 2019

Open in MATLAB Online

2 votes

The problem with the alternative answers given, is they tend to have a flaw, in that they do not sample the space of interest well. It is the same reason why those schemes fail, when you try to sample uniformly, but with a fixed sum. The problem is though, the goal is a difficult one.

First, rather than saying we want to just generically say we want to randomly sample, first, we need to indicate a distribution. My logical initial choice would be an exponential distribution. Thus, can we sample points from an exponential random variable, such that the sum of the set is some fixed value? In fact, even this seems difficult, because an exponential random variable can yield samples that are infinitely large. But, suppose we used the classic (and arguably problematic) approach?

For example, I want to generate a set of M values, such that the sum is N, but the distribution is something based on an exponential random variable?

This means we want to find a random set that lies within a planar region in M dimensions, but is also exponentially distributed? Ugh. So let me try an obvious solution, and see what happens.

% number of samples of this process
numsamples = 10000;
% numdim == M.
% I've picked M=2 because I will want to plot things.
% I'll redo it in a higher number of dimensions afterwards.
numdim = 2;
% it seems not to matter what the exponential rate parameter
% is, because an exponential generates samples that go out to
% infinity. So I'll arbitrarily pick lambda == 1. I'm going
% to rescale them anyway. (lambda is the inverse of the mean
% of the exponential.)
lambda = 1;
% we can just use exprnd to generate the samples. But
% some people do not have the stats toolbox, and an
% exponential is trivial to work with.
% https://en.wikipedia.org/wiki/Exponential_distribution
% On that page, we see that an exponential with rate
% parameter lambda has mean 1/lambda. (Note that the
% stats TB function exprnd uses parameter mu, the mean
% as the exponential parameter.)
X = -log(rand(numsamples,numdim))/lambda;
% So, initially, look at the distribution of the columns of X.
% done BEFORE re-scaling to have a unit sum.
hist(X(:,1),100)

As you should see, the columns of X are (initially) exponentially distributed.

% Now just rescale them to have the desired sum.
% Note that this next line uses an ability introduced
% into MATLAB with R2016b.
X = X./sum(X,2);
% plot a histogram of X(:,1), post-rescaling.
hist(X(:,1),100)

Which looks surprsingly uniformly distributed! (Something I would need to think about.)

My point in all of this, is you need to carefully think about the implications of the re-scaling operation, and what it does to the distribution of the numbers as generated.

Perhaps a better choice might be to use a beta random variable initially. So, in 3-dimensions, with M==3...

X = betarnd(.25,.25,[numsamples,3]);
X = X./sum(X,2);
hist(X(:,1),100)

So there are many ssamples with very small values of each variable, as well as a spike near the top end of the histogram too.

Plotting now in 3-dimensions, we see:

plot3(X(:,1),X(:,2),X(:,3),'.')
box on
grid on

Which is an interesting plot in itself. Again though, the re-sclaing operation does strange, and sometimes nasty things to what you should expect.

2 Comments
Show None Hide None

Erivelton Gualter on 13 May 2019

Very interesting this verification.

pedro on 18 May 2019

Edited: pedro on 18 May 2019

The joys of broken sticks...

John wrote (abbreviated)

X = X./sum(X,2);

hist(X(:,1),100)

Which looks surprisingly uniformly distributed! (Something I would need to think about.)

I agree. It does. (I did.)

How about

>> Y = X./sum(X,1);

>> hist(Y(:,1),100)

John, if you're reading this (and even if you're not), please let me take this opportunity to thank you very very much indeed for your very many contributions to these discussions, which have educated and entertained me beyond measure.

cheers,

pedro

Sign in to comment.

Answer 3

Erivelton Gualter on 11 May 2019

1 vote

How about you generate n variable from a desired range. For example:

n = (b-a)*rand(1,M)+a;

So you have,

% Range
a = 0.01;
b = .7;
N = 1;
M = 20;
n = (b-a)*rand(1,M)+a; % 1xM random numbers from: [a, b]
TOT = sum(n);
numbers = (n/TOT)*N;

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Answer 4

Kaan Çamur on 23 Nov 2020

0 votes

In case of anyone needs an online tool for a quick solution, try this -> https://functioncube.com/Functions/Divide-Number

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Randomly dividing an integer (non-uniform distribution)

0 Comments
Show -2 older comments Hide -2 older comments

Accepted Answer

0 Comments
Show -2 older comments Hide -2 older comments

More Answers (3)

2 Comments
Show None Hide None

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

Categories

Products

Release

Tags

Community Treasure Hunt

Randomly dividing an integer (non-uniform distribution)

0 Comments Show -2 older comments Hide -2 older comments

Accepted Answer

0 Comments Show -2 older comments Hide -2 older comments

More Answers (3)

2 Comments Show None Hide None

0 Comments Show -2 older comments Hide -2 older comments

0 Comments Show -2 older comments Hide -2 older comments

Categories

Products

Release

Tags

See Also

Community Treasure Hunt

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments

2 Comments
Show None Hide None

0 Comments
Show -2 older comments Hide -2 older comments

0 Comments
Show -2 older comments Hide -2 older comments