How can I improve the performance of my code? Specifically with the randn function for large arrays in a Monte Carlo simulation.

I have a script that runs a Monte Carlo simulation to price call options on theoretical stocks. I am having serious performance issues on large simulations (1e7+ iterations) of ~8k elements each. I have tried three different approaches, and each one has had its shortcomings. I am wondering what changes I could make to improve performance.
As this is an assignment, I am required to use the old v4 rng generator, which from a quick test is about 6-10x slower than the default generator. :(
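For reference, the speed gap between the legacy v4 generator and the default Mersenne Twister can be checked directly with timeit (a sketch; the exact ratio will vary by machine and release):

```matlab
n = 1e6;
randn('seed', 8128);                  % switch the global stream to the legacy v4 generator
tLegacy = timeit(@() randn(n, 1));
rng(8128, 'twister');                 % switch back to the default Mersenne Twister
tDefault = timeit(@() randn(n, 1));
fprintf("legacy v4 is %.1fx slower than the default generator\n", tLegacy/tDefault);
```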
My first approach was to use a simple loop for each simulation:
disp('starting')
tic % start time
S0 = 100;
K = 100;
sig = 0.2;
r = 0.05;
T = 1/4;
delta = 0;
N = 8640;
DELTAt = T/N;
term1 = (r - sig^2/2)*DELTAt;
term2 = sig*sqrt(DELTAt);
logS0 = log(S0);
NbSim = 10e4; % 10e6, 10e8
randn('seed', 8128) % uses old v4 rng generator
%rng(8128);
ST = zeros(NbSim,1);
compStr = sprintf("Computing %.2E simulations... This may take a while.\nThis message will close when complete.", NbSim);
box = msgbox(compStr);
fprintf("Starting %.2E simulations on %G elements. Please wait...", NbSim, N);
for i = 1:NbSim
    increments = term1 + term2*randn(N,1);  % N log-returns for one path
    LogPaths = sum([logS0; increments]);    % terminal log-price of this path
    ST(i) = exp(LogPaths);                  % terminal stock price
end
disp("Simulations done");
close(box)
%a bit of calculus
It took my computer hours to compute 10e6 simulations, so I tried to make it faster by using arrays only. With my test simulation count this method proved much faster; however, I forgot how much memory large arrays take.
%same as above
increments = term1 + term2*randn(N, NbSim);  % all increments for all paths at once
logS0mat = logS0*ones(1, NbSim);
LogPaths = sum([logS0mat; increments], 1);   % terminal log-price of each path
ST = exp(LogPaths);                          % terminal stock prices (1 x NbSim)
%calculus
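For scale, the single randn(N, NbSim) call above must hold every increment in memory at once. A quick estimate (assuming double precision, 8 bytes per value) shows why this breaks down at the full simulation counts:

```matlab
N = 8640;            % time steps per path
NbSim = 10e6;        % target number of simulations (1e7)
bytesPerDouble = 8;  % randn returns doubles
gb = N * NbSim * bytesPerDouble / 2^30;
fprintf("increments matrix would need ~%.0f GB\n", gb);  % roughly 644 GB
```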
Once I tried running the actual simulation numbers, I realized this approach would not work. That led me to my third approach, which was to "chunk" the data into smaller pieces and then do the array math on each chunk.
%same as above
NbSim = 10e6; % 10e8
chunkSize = 1e4; % depends on NbSim and N; pick a size your machine can manage (bigger = faster, but more memory)
simChunk = NbSim/chunkSize; % make sure this divides evenly
ST = zeros(1, NbSim);
logS0mat = logS0*ones(1, chunkSize);
%start chunked simulation
compStr = sprintf("Computing %.2E simulations in chunks of %.2E.\nThis may take a while (%d iterations).\nThis message will close when complete.", NbSim, chunkSize, simChunk);
box = msgbox(compStr);
for i = 1:simChunk
    increments = term1 + term2*randn(N, chunkSize);  % increments for one chunk of paths
    LogPaths = sum([logS0mat; increments], 1);       % terminal log-prices for this chunk
    SPaths = exp(LogPaths);
    x = (i-1)*chunkSize + 1;                         % first index of this chunk in ST
    ST(x:x+chunkSize-1) = SPaths;
end
disp("Simulations done");
close(box);
%calculus
This was slower than my first approach. From a little testing, it seemed that the randn call was eating most of the time (about 6 s for that line alone). I was hoping to run the simulation for NbSim = 10e8, but as it stands it will take far too much time.
Why am I not seeing the same performance gains I saw between my loop and my array methodology (tested for smaller N)? Is there a better way?
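To confirm where the time goes, one way (a sketch, using timeit on the pieces of one chunk iteration) is to time the legacy-seeded randn draw separately from the surrounding arithmetic:

```matlab
sig = 0.2; r = 0.05; T = 1/4; N = 8640; chunkSize = 1e4;
DELTAt = T/N;
term1 = (r - sig^2/2)*DELTAt;
term2 = sig*sqrt(DELTAt);

randn('seed', 8128)  % legacy v4 generator, as required by the assignment
tRand = timeit(@() randn(N, chunkSize));                       % the random draws alone
tMath = timeit(@() sum(term1 + term2*ones(N, chunkSize), 1));  % the surrounding arithmetic, with ones() standing in for the draws
fprintf("randn: %.3f s per chunk, arithmetic: %.3f s per chunk\n", tRand, tMath);
```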
Many thanks :)
MATLAB Version: 9.8.0.1323502 (R2020a)

Accepted Answer

the cyclist on 4 Apr 2020
I did not try to run your code, but did some independent testing.
I ran in chunks of 1e6, which empirically seemed about optimal. (Running without chunks was about 20% slower and, as you say, hits memory limitations.)
I find that MATLAB consistently generates about 1e6 values in 0.03 seconds (when seeded as you are required to).
You are hoping to generate roughly 1e4 * 1e8 = 1e12 values. At that rate, that would take about 3e4 seconds, or roughly 8 hours.
I don't think there's any way around the fact that you are trying to generate a boatload of random numbers.
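The back-of-the-envelope estimate above can be reproduced directly (a sketch; the 0.03 s per 1e6 values is a measured rate from my machine and will vary):

```matlab
valuesPerChunk = 1e6;
secondsPerChunk = 0.03;     % measured rate for the seeded v4 generator
totalValues = 1e4 * 1e8;    % N x NbSim at the largest target size
totalSeconds = totalValues / valuesPerChunk * secondsPerChunk;
fprintf("estimated time: %.0f s (~%.1f hours)\n", totalSeconds, totalSeconds/3600);  % ~30000 s, ~8.3 hours
```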
