Smarter Loop with big matrices

Hello everyone, I've got the following problem. I want to create 25 portfolios sorted by two variables: each variable is sorted into quintiles, and combining the two sorts gives 25 portfolios. I have a solution, but my loops take too long to run (over 2 hours). The matrices are 594 x 15726. Here are the first and second of my 6 loops.
Loop 1 for a quintile sort for the first variable
% 5 portfolios sorted for size
s1 = NaN(size(NSI));
s2 = NaN(size(NSI));
s3 = NaN(size(NSI));
s4 = NaN(size(NSI));
s5 = NaN(size(NSI));
u = 1;
for m=1:12:589;
    n=min(594, m + 12 - 1);
    for i=1:size(NSI,2)
        if me(u,i) <= prctile(me(u,:),20,2);
            s1(m:n,i) = 1;
        elseif me(u,i) > prctile(me(u,:),20,2) & me(u,i) <= prctile(me(u,:),40,2);
            s2(m:n,i) = 1;
        elseif me(u,i) > prctile(me(u,:),40,2) & me(u,i) <= prctile(me(u,:),60,2);
            s3(m:n,i) = 1;
        elseif me(u,i) > prctile(me(u,:),60,2) & me(u,i) <= prctile(me(u,:),80,2);
            s4(m:n,i) = 1;
        elseif me(u,i) > prctile(me(u,:),80,2);
            s5(m:n,i) = 1;
        end
    end
    u = u + 12;
end
Loop 2 where I take the first portfolio of the first variable
% 5 portfolios sorted for book-to-market for the smallest size portfolio
s1b1 = NaN(size(NSI));
s1b2 = NaN(size(NSI));
s1b3 = NaN(size(NSI));
s1b4 = NaN(size(NSI));
s1b5 = NaN(size(NSI));
u = 1;
for m=1:12:589;
    n=min(594, m + 12 - 1);
    for i=1:size(NSI,2)
        if beme(u,i) <= prctile(beme(u,:),20,2);
            s1b1(m:n,i) = s1(m:n,i);
        elseif beme(u,i) > prctile(beme(u,:),20,2) & beme(u,i) <= prctile(beme(u,:),40,2);
            s1b2(m:n,i) = s1(m:n,i);
        elseif beme(u,i) > prctile(beme(u,:),40,2) & beme(u,i) <= prctile(beme(u,:),60,2);
            s1b3(m:n,i) = s1(m:n,i);
        elseif beme(u,i) > prctile(beme(u,:),60,2) & beme(u,i) <= prctile(beme(u,:),80,2);
            s1b4(m:n,i) = s1(m:n,i);
        elseif beme(u,i) > prctile(beme(u,:),80,2);
            s1b5(m:n,i) = s1(m:n,i);
        end
    end
    u = u + 12;
end
The same pattern is repeated for the other 4 size portfolios.
Is there a smarter way to get the 25 portfolios? Thanks in advance.

5 Comments

Hi Sascha, as a first step you should be able to speed things up just by getting rid of half of the calls to prctile.
In the first example you can reduce the 'if' checks to
if me(u,i) <= prctile(me(u,:),20,2);
    s1(m:n,i) = 1;
elseif me(u,i) <= prctile(me(u,:),40,2);
    s2(m:n,i) = 1;
elseif me(u,i) <= prctile(me(u,:),60,2);
    s3(m:n,i) = 1;
elseif me(u,i) <= prctile(me(u,:),80,2);
    s4(m:n,i) = 1;
else
    s5(m:n,i) = 1;
end
After all, if me(u,i) fails the first 'if' check for me(u,i) <= prctile(...,20) and gets to the next elseif, then you already know that me(u,i) > prctile(...,20), correct? No need to check for that. And so forth.
This is just a first step on modifying this code but it should improve the run time.
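A further step in the same direction, sketched here under some assumptions: the four breakpoints depend only on the row me(u,:), not on the column i, so they can be computed once per formation month instead of once per stock. In the original loop u always equals m (both start at 1 and advance by 12), so the sketch uses m directly. The small random matrix and the names nMonths, nStocks, q and x are hypothetical stand-ins for the real 594 x 15726 data. The last branch is kept as an explicit "x > q(4)" check so that NaN entries stay unassigned, as in the original code; a plain "else" would put them into s5.

```matlab
% Hypothetical small input standing in for the real 594 x 15726 matrix.
rng(0);
nMonths = 24;
nStocks = 50;
me = rand(nMonths, nStocks);

s1 = NaN(nMonths, nStocks);
s2 = s1; s3 = s1; s4 = s1; s5 = s1;

for m = 1:12:nMonths
    n = min(nMonths, m + 12 - 1);
    % All four breakpoints for this formation month, computed once
    % (prctile needs the Statistics and Machine Learning Toolbox).
    q = prctile(me(m,:), [20 40 60 80]);
    for i = 1:nStocks
        x = me(m,i);
        if x <= q(1)
            s1(m:n,i) = 1;
        elseif x <= q(2)
            s2(m:n,i) = 1;
        elseif x <= q(3)
            s3(m:n,i) = 1;
        elseif x <= q(4)
            s4(m:n,i) = 1;
        elseif x > q(4)      % false for NaN, so NaN stays unassigned
            s5(m:n,i) = 1;
        end
    end
end
```

With this change prctile runs once per formation month rather than several times per stock, which is where most of the original run time is likely to go.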
Get rid of the numbered variables and use indexing instead. Using either a cell array or an ND array will make your code simpler, neater, and easier to maintain. Whenever you find yourself writing numbered variables, you are doing something wrong: you should be using indexing instead.
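One possible reading of the indexing suggestion, as a minimal sketch rather than the commenter's own code: the five matrices s1 to s5 become pages of a single 3-D array S, so that S(:,:,k) plays the role of s<k> and the quintile number k is computed instead of being spelled out in five branches. The random test matrix and the names S, q and k are assumptions made for illustration.

```matlab
% Hypothetical small input standing in for the real 594 x 15726 matrix.
rng(0);
nMonths = 24;
nStocks = 50;
me = rand(nMonths, nStocks);

% One ND array instead of s1..s5: page k holds the k-th size quintile.
S = NaN(nMonths, nStocks, 5);

for m = 1:12:nMonths
    n = min(nMonths, m + 12 - 1);
    q = prctile(me(m,:), [20 40 60 80]);   % breakpoints for this month
    for i = 1:nStocks
        % Index of the first breakpoint that me(m,i) does not exceed;
        % the trailing Inf catches the top quintile. Empty for NaN.
        k = find(me(m,i) <= [q Inf], 1);
        if ~isempty(k)
            S(m:n, i, k) = 1;
        end
    end
end

% The old s3 is now simply the third page.
s3 = S(:,:,3);
```

The same idea extends to the second sort: a 594 x 15726 x 5 x 5 array (or a 5 x 5 cell array) could hold all 25 portfolios without 25 separately named variables.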
@ David Goodmanson, yeah, you're absolutely right. I don't know why I did it like this. Thanks.
@ Stephen Cobeldick, can you give me an example of how I can use it in this way, please? Thanks in advance.
@ Stephen Cobeldick, ok thanks, I tried to get a bit of an idea and indeed I now have a solution which is much faster, although the code is far from being perfect.


Answers (2)

Steven Lord on 18 Apr 2017
If I understand what you're doing correctly, I think the discretize and/or histcounts functions will be of interest to you.
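As a rough sketch of how discretize might be applied here (an illustration under assumptions, not necessarily what the answer had in mind): for each formation month, bin the whole row of me and the whole row of beme against their own quintile breakpoints, then combine the two bin numbers into a single portfolio id from 1 to 25. This mirrors the posted loops, where the book-to-market breakpoints are taken over all stocks rather than within each size group. The 'IncludedEdge','right' option reproduces the original "<=" comparisons. The random inputs and the names pct, sizeBin, bmBin, id and port are hypothetical, and the sketch assumes every formation row has enough non-NaN values for prctile to return increasing breakpoints.

```matlab
% Hypothetical small inputs standing in for the real 594 x 15726 matrices.
rng(0);
nMonths = 24;
nStocks = 50;
me   = rand(nMonths, nStocks);
beme = rand(nMonths, nStocks);
pct  = [20 40 60 80];

% port(t,i) is the portfolio id (1..25) of stock i in month t, NaN if none.
port = NaN(nMonths, nStocks);

for m = 1:12:nMonths
    n = min(nMonths, m + 12 - 1);
    % Quintile number (1..5) of every stock at once; NaN inputs give NaN.
    sizeBin = discretize(me(m,:),   [-Inf, prctile(me(m,:),   pct), Inf], ...
                         'IncludedEdge', 'right');
    bmBin   = discretize(beme(m,:), [-Inf, prctile(beme(m,:), pct), Inf], ...
                         'IncludedEdge', 'right');
    % Combine the two sorts: size quintile a, book-to-market quintile b.
    id = (sizeBin - 1) * 5 + bmBin;
    % Hold the assignment for the 12 months m..n.
    port(m:n, :) = repmat(id, n - m + 1, 1);
end

% The old s1b1 (smallest size, lowest book-to-market) as a 1/NaN mask.
s1b1 = NaN(size(port));
s1b1(port == 1) = 1;
```

This removes the inner loop over stocks entirely, so the work per formation month is two prctile calls and two discretize calls on a single row.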

1 Comment

I think it may be possible, but at the moment I don't see how. Can you give me an idea of how to do it?


Hi Sascha.
Would it be possible to view your dataset?
I'm trying to do a similar exercise in Matlab.
Best,
Christoffer

Asked on 18 Apr 2017
