Why is the rand function not uniform in large intervals?
pankaj singh
on 23 Jun 2016
Answered: pankaj singh
on 27 Jun 2016
I am using the rand function to generate uniformly distributed random numbers in the interval [10e-6, 1]. But the function generates numbers that are close to 1 (RATHER THAN BEING UNIFORM IN THE ENTIRE INTERVAL). I have tried with 10 numbers and with 100 numbers, and I found that most of the numbers generated are close to 1. How, then, can this be a uniform distribution?
3 Comments
Adam
on 23 Jun 2016
rand always generates numbers between 0 and 1. What you then do with those to get them into a range you are interested in is entirely up to you.
John D'Errico
on 23 Jun 2016
Rand IS uniform, and it generates numbers in the range from 0 to 1. If you are mis-using the results of rand in some way, then expect strange results. So show what you wrote.
Accepted Answer
John D'Errico
on 26 Jun 2016
Edited: John D'Errico
on 26 Jun 2016
Sigh. I think this is a misunderstanding of what uniformly random means. A fairly common one too. Since you seem not interested in showing your code, one can only guess the problem though. There can be no certain answer if I do not see your code. (addendum: The OP has since added a comment that indicates my conjecture is exactly on target.)
You are generating uniformly random numbers on a target interval [1e-6,1].
A uniform distribution implies that for ANY sub-interval of fixed width that is contained in the global window [1.e-6,1] (so assume a sub-interval [a,b]) where we have
1e-6 <= a <= b <=1
then the expected fraction of events we will observe in that sub-interval should be:
(b - a)/(1 - 1e-6)
If you will generate N samples, then the expected number of events in the sub-interval is:
N*(b - a)/(1 - 1e-6)
So expect to see a number of events that is proportional to the sub-interval width. You won't see exactly that many, since this is a random sampling.
So a uniform random sampling on the interval [0,1] would have roughly 10% of the samples in each bin [0,0.1], [0.1,0.2], [0.2,0.3], etc.
Now in your case, you are sampling on the interval [1e-6,1]. You find that very few samples occur right down at the bottom end, say between [1e-6,1e-5].
Let's use the rule above to see what fraction of the samples SHOULD occur in that interval. Let's say that we generate a sample size of 1000 values in the overall interval. Seems pretty big to me.
1000*(1e-5 - 1e-6)/(1 - 1e-6)
ans =
0.009
Hmm. I expect to see only 0.009 samples in that sub-interval, so usually none at all. Compare that with a sub-interval of width 0.1 at the top end, where I would expect to see
1000*(1 - 0.9)/(1 - 1e-6)
ans =
100
So 100 events in the subinterval [0.9,1].
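The two expected counts above can also be checked by simulation. The following is only a minimal sketch: the helper names countIn and expectIn, the fixed seed, and the larger sample size N = 1e6 are assumptions chosen for illustration, not part of the original answer.

```matlab
rng(0);                                % fixed seed so the run is repeatable (assumption)
N  = 1e6;                              % larger than 1000 so the tiny interval is not empty
lo = 1e-6;  hi = 1;
x  = lo + (hi - lo) .* rand(1, N);     % uniform samples on [lo,hi]

countIn  = @(a, b) sum(x >= a & x <= b);       % observed count in [a,b]
expectIn = @(a, b) N * (b - a) / (hi - lo);    % expected count from the rule above

fprintf('[1e-6,1e-5]: observed %d, expected %.3g\n', ...
        countIn(1e-6, 1e-5), expectIn(1e-6, 1e-5));
fprintf('[0.9,1]:     observed %d, expected %.6g\n', ...
        countIn(0.9, 1), expectIn(0.9, 1));
```

With a million samples the bottom sub-interval should hold only a handful of points (about 9 expected), while the top one holds roughly 100000. Both counts are consistent with the same uniform density.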
Is this truly uniform sampling? YES!!!!!!!!! Of course it is! You need to understand that the first interval I showed is a terribly tiny interval.
If you asked to generate a sampling that is uniformly probable over that region, but what you REALLY wanted was some sort of sampling that is uniform in a log space, then you needed to use a proper random sampling scheme!
For example, try this:
R = 10.^(rand(1,1000)*6 - 6);
Look at some percentiles of this sampling scheme:
Min 1.005e-06
1.0% 1.144e-06
5.0% 2.036e-06
10.0% 4.529e-06
25.0% 4.117e-05
50.0% 0.001265
75.0% 0.03185
90.0% 0.2554
95.0% 0.5309
99.0% 0.8607
Max 0.9928
It is NOT uniform, at least not in the domain [1e-6,1]. But the log10 of those numbers WILL be uniformly distributed. So, we will expect roughly 50% of the log10 values to be less than -3.
Min -5.998
1.0% -5.941
5.0% -5.691
10.0% -5.344
25.0% -4.385
50.0% -2.898
75.0% -1.497
90.0% -0.5928
95.0% -0.275
99.0% -0.06513
Max -0.003133
Again, it won't be perfect. A sample size of 1000 is not really that huge, and the observed percentiles only approach these predictions in the limit as N grows to a really large number.
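One way to see the uniformity in log space directly is to count samples per decade. This is a rough sketch under a few assumptions: the larger N, the fixed seed, and the variable names are illustrative, and histcounts requires R2014b or later.

```matlab
rng(0);                                % fixed seed, for repeatability only
N = 1e5;
R = 10.^(rand(1, N)*6 - 6);            % same scheme as above, with a larger N

edges  = 10.^(-6:0);                   % decade boundaries 1e-6, 1e-5, ..., 1
counts = histcounts(R, edges);         % number of samples in each decade
fracPerDecade = counts / N             % each entry should be near 1/6

fracBelow1e3 = mean(log10(R) < -3)     % should be near 0.5
```

Each of the six decades should receive about one sixth of the samples, which is what "uniform in log10" means here.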
Again, it is just a wild guess.
3 Comments
Stephen23
on 26 Jun 2016
Edited: Stephen23
on 26 Jun 2016
@pankaj singh: Instead of ten samples (which is too few for showing any kind of probability distribution trend), here is your code with one million samples:
N = 1e6;
dmin = 10e-6;
dmax = 1 ;
d = dmin + (dmax-dmin).*rand(1,N);
hist(d)
and a histogram of those random values:
Does this look like a uniform distribution to you? If not, what would you expect it to look like?
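The same check can be done numerically rather than visually. This sketch reuses the code above and adds explicit bin edges; the names edges and binFraction and the fixed seed are assumptions for illustration, and histcounts requires R2014b or later.

```matlab
rng(0);                                  % fixed seed, for repeatability only
N = 1e6;
dmin = 10e-6;
dmax = 1;
d = dmin + (dmax - dmin) .* rand(1, N);

edges  = linspace(dmin, dmax, 11);       % 10 equal-width bins over [dmin,dmax]
counts = histcounts(d, edges);
binFraction = counts / N                 % each entry should be close to 0.1
```

If the sampling is uniform, every one of the ten bins should hold roughly 10% of the samples, including the lowest bin that starts at dmin.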
John D'Errico
on 26 Jun 2016
Edited: John D'Errico
on 26 Jun 2016
Did you read my response? If not, then why not? I spent, what, an hour writing that response to you. Then you asked exactly the same question that I just answered.
READ MY ANSWER. In that answer, I explained why you are confused, why you are not getting the kind of sampling that you want to see. I then show how to achieve the sampling that you want. But if you will ask a question and won't bother to read the answers you get and think about what they say, how can I do more? (Sorry if I seem frustrated.)
More Answers (2)
Roger Stafford
on 26 Jun 2016
Edited: Roger Stafford
on 26 Jun 2016
It seems clear from his most recent comment that where Pankaj says "uniform" he actually means a "logarithmic" distribution where there would be as many samples in the interval [10^(-6),10^(-5)] as in the interval [10^(-1),10^(0)], and indeed in any interval [10^(k),10^(k+1)], -6<=k<=-1. If that is the case, the proper code would be:
r = 10.^(-6*rand(1,n));
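The same idea extends to other endpoints. The following is a sketch only: the variables lo and hi and the general formula are an illustrative extension, not something stated in the answer above, and it assumes 0 < lo < hi.

```matlab
lo = 1e-6;  hi = 1;  n = 1000;           % illustrative values

% Draw the exponent uniformly between log10(lo) and log10(hi),
% then exponentiate to get values that are uniform in log space.
r = 10.^(log10(lo) + (log10(hi) - log10(lo)) .* rand(1, n));
```

For lo = 1e-6 and hi = 1 the exponent is uniform on [-6,0], so this has the same distribution as 10.^(-6*rand(1,n)).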
1 Comment