Random numbers with Zero mean (not the basics)
Hello, I have been searching for a solution to my problem but haven't been able to find a convenient one.
Issue with randn:
Although randn draws from a zero-mean distribution, it doesn't actually produce an array with zero mean. Even if I generate 1 million random values from the standard normal, the mean is sometimes "far" from zero (e.g. 0.003). I really need zero-mean draws. I can't simply subtract the mean, because I need to use the generated numbers to simulate new stock prices, which brings me to my specific issue.
Issue with my case:
To simplify it to the maximum, please consider the following algorithm:
for i = 1:j
    % Step 1: StockPrice is given
    % Step 2: some calculations
    StockPrice = StockPrice*randn;   % Step 3: generate new stock price
    % Step 4: some calculations
    StockPrice = StockPrice*randn;   % Step 5: generate new stock price
end
I really don't want to complicate my code with if statements to achieve the zero mean. Note also that generating one vector of random numbers would be much faster than my current method.
My problem would be solved if there were a way to step through a vector, taking the next element in every iteration.
Thanks in advance
Walter Roberson
on 1 Sep 2013
If you generate one random number at a time, and your random values are required to have a mean of 0 and not just statistically, then your one random number would be forced to be 0. Likewise, if you generate 2 at a time, the two would be forced to be negatives of each other. This does not sound realistic.
I would suggest that stock price movements are not uniformly randomly distributed: there is too much of a tendency toward No Change for a significant number of stocks (they might not be market movers, but the stock exchange includes them anyhow).
Accepted Answer
Peter Perkins
on 1 Sep 2013
AND, no offense intended, but this doesn't make a lot of sense to me.
I suspect you know the following: randn draws from a standard normal distribution, which has zero mean. Any finite set of independent values drawn from that distribution will, with probability 1, have a non-zero sample mean. But that's their sample mean, not the mean of the distribution that they are drawn from. This is exactly what you should expect when drawing independent values from a normal distribution. It is not in any sense a "strange behavior".
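To put a number on this: the sample mean of n independent standard normal draws is itself normally distributed with mean 0 and standard deviation 1/sqrt(n). The rough sketch below (the variable names and the choice of sample sizes are mine, not from the question) computes that standard error and shows how many standard errors a sample mean of 0.003 represents.

```matlab
% Standard error of the sample mean of n independent N(0,1) draws is 1/sqrt(n).
n = [1e4 1e5 1e6];            % illustrative sample sizes (assumed)
se = 1 ./ sqrt(n);            % standard error of the sample mean for each n
observed = 0.003;             % sample mean quoted in the question
zscore = observed ./ se;      % distance from zero, in standard errors
for k = 1:numel(n)
    fprintf('n = %d: standard error = %.4g, 0.003 is %.2f standard errors from 0\n', ...
        n(k), se(k), zscore(k));
end
```

For n = 1e6 the standard error is 0.001, so 0.003 sits about three standard errors from zero; for the smaller sizes it is well inside the ordinary range.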
You probably also know most of the following: If you want a finite set of "normal-like" values that has zero mean, then the simplest thing is to generate however many values you need, and subtract their mean. Another possibility is what's called antithetic sampling, where for each value that you draw from randn, you also use its negative. You can get that using the Antithetic property of the global RandStream object. Both of these create what are most definitely not independent draws from a normal distribution. They are constrained dependent draws.
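A minimal sketch of the mean-subtraction approach, with a hypothetical sample size n. It also illustrates why the result is a set of dependent values: the centered values sum to zero, so any one of them is determined by the rest.

```matlab
n = 1000;                     % hypothetical number of values needed
x = randn(n,1);               % independent standard normal draws
z = x - mean(x);              % centered values: sample mean is zero up to round-off
fprintf('sample mean before: %g, after: %g\n', mean(x), mean(z));

% Dependence: the last value equals minus the sum of all the others.
lastFromOthers = -sum(z(1:end-1));
fprintf('z(end) = %g, recovered from the others = %g\n', z(end), lastFromOthers);
```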
You say you cannot do that, and that you need "a method that allows me to use them efficiently in the loop". I'm going to have to guess that the reason is one of two things:
1) You are generating your values one at a time. I won't get into the efficiency issues that that raises, but unless you're generating hundreds of millions of values, you should be able to call randn once to generate a vector of however many values you need, subtract off its mean, and draw from that vector in your loop. Equivalently, you could read and save the generator state, generate a large vector and save its mean, then reset the generator state and draw numbers one at a time, subtracting off the mean that you previously saved.
2) You don't know in advance how many loop iterations you will have. If that's the case, then I think you are asking for the impossible. The only way that you can expect to draw random values and maintain a sample mean of zero at all steps is to draw only zeros.
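The two variants described in item 1 could look roughly like this. This is only a sketch: nDraws, the shock variables, and the running totals are placeholder names standing in for the real stock-price calculations, and it assumes the total number of draws is known up front.

```matlab
nDraws = 1000;                % total number of values the loop will use (assumed known)

% Variant A: generate once, center, then index into the vector inside the loop.
z = randn(nDraws,1);
z = z - mean(z);
k = 0;                        % position of the last value used
totalA = 0;
for i = 1:nDraws/2
    k = k + 1;  shock1 = z(k);           % first draw of this iteration
    k = k + 1;  shock2 = z(k);           % second draw of this iteration
    totalA = totalA + shock1 + shock2;   % stand-in for the real calculations
end

% Variant B: save the generator state, compute the mean of the upcoming
% sequence, then rewind and draw one value at a time, subtracting that mean.
state = rng;                  % current generator settings
zbar = mean(randn(nDraws,1)); % mean of the next nDraws values
rng(state);                   % rewind so the same values are drawn again
totalB = 0;
for i = 1:nDraws
    shock = randn - zbar;     % one centered value per call
    totalB = totalB + shock;
end
fprintf('sum of draws: variant A %g, variant B %g\n', totalA, totalB);
```

Both sums come out at zero up to round-off. Variant B relies on randn returning the same sequence whether it is called one value at a time or once for a whole vector, and it avoids storing the vector at the cost of generating every value twice.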
You have not said why you care about a zero mean. Perhaps that would make things more clear.
Peter Perkins
on 3 Sep 2013
Edited: Peter Perkins
on 3 Sep 2013
AND, I remain puzzled. This code
ntrials = 100000;
N = [10 100 1000 10000 100000 1000000 10000000];
for n = N
    xbar = zeros(ntrials,1);
    for i = 1:ntrials
        xbar(i) = mean(randn(n,1));
    end
    z05 = 1.96/sqrt(n);
    sprintf('n = %d, z05 = %f: %f',n,z05,sum(abs(xbar) > z05)/ntrials)
end
results in this output:
ans =
n = 10, z05 = 0.619806: 0.050480
ans =
n = 100, z05 = 0.196000: 0.050930
ans =
n = 1000, z05 = 0.061981: 0.049540
ans =
n = 10000, z05 = 0.019600: 0.051760
ans =
n = 100000, z05 = 0.006198: 0.049380
ans =
n = 1000000, z05 = 0.001960: 0.050130
ans =
n = 10000000, z05 = 0.000620: 0.050250
which demonstrates that randn creates vectors with "large" sample means (outside the 95% probability limits) the correct proportion of times. A sample mean of 0.003 is not terribly unusual unless you have a very large vector of random values. Perhaps you do. But still, the above demonstrates that the sample mean of vectors from randn converges in probability to 0 at just the rate you'd expect from independent draws from a standard normal.
Perhaps you have some other reason for wanting your normals to have zero mean. Unless you can figure out in advance the total number you need, I think you are done for.
More Answers (3)
Peter Perkins
on 20 Sep 2013
AND, the answer to your latest question (if I understand it correctly) is yes: it doesn't matter whether you generate one value at a time, or one million all at once, or one thousand values a thousand times. You will get the same sequence of values:
>> rng default
>> randn, randn, randn, randn
ans =
0.53767
ans =
1.8339
ans =
-2.2588
ans =
0.86217
>> rng default
>> randn(2,1), randn(2,1)
ans =
0.53767
1.8339
ans =
-2.2588
0.86217
>> rng default
>> randn(4,1)
ans =
0.53767
1.8339
-2.2588
0.86217
A Jenkins
on 1 Sep 2013
n=100; %number of random numbers you need
x=randn(n/2,1); %half random numbers
y=[x; -x]; %array of 'random' numbers with mean 0
mean(y)
the cyclist
on 1 Sep 2013
Edited: the cyclist
on 1 Sep 2013
I believe this File Exchange submission will do close to what you need:
However, these numbers are uniformly distributed, not normally.
I'm not sure how hard it is to modify it.