Zipf Based Number generation

24 views (last 30 days)
Tamoor Hassan
Tamoor Hassan on 31 Mar 2015
Answered: Wasseem Suhabuth on 31 Aug 2016
I am trying to generate numbers based on Zipf distribution.
There is a function named RANDDRAW that generates number for the zipf exponent greater than 1. However, i need to generate Zipf numbers which can be less than 1.
Can anybody help me to generate random numbers based on zipf distribution with any value of zipf exponent?

Answers (4)

John D'Errico
John D'Errico on 31 Mar 2015
Edited: John D'Errico on 31 Mar 2015
Whenever someone says something like "for ANY exponent", I can bet that they won't be going to throw an easy problem at any offered code.
So, looking at the zipf distribution, it turns out to be an easy one to deal with, sort of. The zipf pdf is simply
p_zipf(x,p) = -x^(p+1)/zeta(p+1)
This is a discrete distribution in x, over the positive integers.
If we look at the pdf, we recognize that zeta(p+1) is the limit of the sum of powers of x, x^-(p+1), where x ranges from 1 to inf. So zeta(p+1) merely normalizes that pdf to have unit area.
Here the zipf exponent that was referred to would be p, I assume, but there are unresolved questions hidden in the question. I'll quote the line you wrote:
"However, i need to generate Zipf numbers which can be less than 1."
So, IF your question is to generate zipf samples that are non-integer, this makes no sense, since the zipf distribution is a discrete one, that samples over the positive integers.
If you are calling the zipf exponent (p+1), and you wish to sample such that p+1 is less than 1, so p is less than zero, then you will have a great deal of trouble.
If your question is to sample for non-integer values of p that are less than one, but greater than zero, then this is possible, but a serious bit of an effort for many values of p less than 1.
By the way, since the zipf pdf involves zeta(p+1), this would employ the symbolic toolbox, because for small p, the zeta function converges pretty slowly.
Sadly, I cannot proceed further without some resolution of the above questions.
Edit: Though I've not yet gotten a response to my questions, I took a quick look at the randraw code. Randraw defines the zipf pdf such that
p_zipf(x,a) = -x^(-a)/zeta(a)
and then requires only that a>1.
It would appear that you want to generate zipf numbers for a<=1, as randraw defines that distribution. If this is so, then you cannot do so. The sum...
sum(x^-a)
where x is the set of positive integers, is inf for a<=1.
This is why randraw requires that a>1, as otherwise, the zeta function has a problem. Even for a near 1, I would predict that randraw becomes slow.

Roger Stafford
Roger Stafford on 31 Mar 2015
If you want exponent values less than or equal to one, it is necessary to confine yourself to a specific finite number of terms, since the infinite sum of terms in the zipf distribution would be infinity. See
http://en.wikipedia.org/wiki/Zipf's_law
In the following, N is the number of possible integer values of the distribution, s is the exponent, and the desired size of the generated random matrix is m-by-n.
V = cumsum([0,1./(1:N).^s]');
[~,r] = histc(rand(m*n,1),V/V(end));
r = reshape(r,m,n);
The variable 'r' assumes random integer values from 1 to N in accordance with the zipf distribution.
  10 Comments
Tamoor Hassan
Tamoor Hassan on 4 Apr 2015
In each row, no values along columns should be repeated.
Roger Stafford
Roger Stafford on 4 Apr 2015
Your statement "all the columns contain unique values" made me think you wanted no repetitions to occur along any column. If this is not the case, then things are a little easier.
V = cumsum([0,1./(1:N).^s]'); V = V/V(end);
t = true;
while t
[c,r] = histc(rand(n,m),V);
t = any(c(:)>1);
end
r = r';

Sign in to comment.


Tuyen Tran
Tuyen Tran on 17 Oct 2015

Wasseem Suhabuth
Wasseem Suhabuth on 31 Aug 2016
Can ihave the coding on matlab to generate zipf distribution please

Categories

Find more on Creating and Concatenating Matrices in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!