Replace nested loops?
Is it possible to replace four nested for loops of the form for / if / for / for / if / for / if ... with something more efficient?
Some of the data sets I run have millions of variables, and I have many data sets to process, so a full run takes a few days. Anything that would lower the run time of this section would be greatly appreciated. Thanks.
[EDITED: Code copied from the comments, Jan Simon]
for i = 1:length(starts)
    counter = 0;
    if isempty(starts{i}) == 0
        for j = 1:length(starts{i})
            for k = 1:length(starts)
                if isempty(starts{k}) == 0
                    for m = 1:length(starts{k})
                        if stops{i}(j) >= starts{k}(m) && stops{i}(j) < stops{k}(m) && isempty(peak_loc3{k}) == 0 && peak_loc3{i}(j) ~= peak_loc3{k}(m)
                            counter = counter + 1;
                            overlap{1,i}(counter) = peak_loc3{k}(m);
                            overlap{2,i}(counter) = peak_loc3{i}(j);
                        end
                    end
                end
            end
        end
    end
end
1 Comment
Sean de Wolski
on 7 Dec 2011
We really need to see the operations to figure out if it's possible. A well orchestrated for-loop should be fairly fast in newer versions.
Accepted Answer
on 7 Dec 2011
A small time-drain will be the fact that inside the loop, the overlap variable gets constantly resized. In the MATLAB editor, these variables will have a little orange line under them. If you hover over that line, it will warn you about this potential problem.
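To make the resizing cost concrete, here is a minimal sketch that fills the same vector twice: once by growing it inside the loop and once after preallocating it. The size `n` and the variable names `grown` and `prealloc` are made up for illustration and are not part of the original code. The measured times depend on the MATLAB version and machine, so treat them as indicative only.

```matlab
% Hypothetical size and variable names, for illustration only.
n = 1e5;

% Growing: the array is resized on every iteration.
tic
grown = [];
for idx = 1:n
    grown(idx) = idx^2; %#ok<SAGROW>
end
tGrow = toc;

% Preallocated: the array is created once at full size, then filled in.
tic
prealloc = zeros(1, n);
for idx = 1:n
    prealloc(idx) = idx^2;
end
tPre = toc;

% Both loops produce the same values; only the allocation pattern differs.
fprintf('growing: %.4f s, preallocated: %.4f s\n', tGrow, tPre);
```

Preallocation needs the final size, or at least an upper bound, to be known before the loop starts. That is not directly the case for overlap, where the number of matches is only known after the comparisons have run.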
Here's a first attempt that will reduce the time needed. Note I've also replaced "isempty(x)==0" with "~isempty(x)" (for simplicity) and replaced some of the nested if statements with continue statements, just to have less nesting (which can get confusing).
overlap = cell(2, length(starts));
nonEmptyStarts = find(~cellfun(@isempty,starts));
for i = nonEmptyStarts
    counter = 0;
    thisStart = starts{i};
    thisStop = stops{i};
    for j = 1:length(thisStart)
        for k = nonEmptyStarts
            if isempty(peak_loc3{k}), continue; end
            thatStart = starts{k};
            thatStop = stops{k};
            % Note: the transpose below implies peak_loc3{k} has the opposite orientation to starts{k}
            thisMask = thisStop(j)>=thatStart & thisStop(j)<thatStop & peak_loc3{i}(j)~=peak_loc3{k}(1:length(thatStart))';
            for m = find(thisMask)
                counter = counter + 1;
                overlap{1,i}(counter) = peak_loc3{k}(m);
                overlap{2,i}(counter) = peak_loc3{i}(j);
            end
        end
    end
end
Unfortunately there is still a big culprit of "variable size adjustment" sitting inside a loop, which will really slow down the code. If you see the line starting with overlap{1,i}(counter) =, you'll notice that every time this line is run, the variable sitting in the cell at overlap{1,i} grows by one. If this happens a lot, MATLAB has to work really hard to find new space in memory fitting this new size.
The updated code below currently runs approximately 10 times faster than the original.
overlap = cell(2, length(starts));
nonEmptyStarts = find(~cellfun(@isempty,starts));
for i = nonEmptyStarts
    counter = 0;
    % Get column vectors of the first start/stop pairs
    startA = starts{i}'; stopA = stops{i}';
    for k = nonEmptyStarts
        % Get row vectors of the second start/stop pairs
        startB = starts{k}; stopB = stops{k};
        % Get a mask of all A-B pairs that match requirements
        ABMask = bsxfun(@ge,stopA,startB) & ...
                 bsxfun(@lt,stopA,stopB) & ...
                 bsxfun(@ne,peak_loc3{i}(1:numel(startA)), peak_loc3{k}(1:numel(startB))');
        [j,m] = find(ABMask);
        numToAdd = length(m);
        if ~numToAdd, continue; end
        % Append them to "overlap"
        indsToInsert = (1:numToAdd) + counter;
        counter = counter + numToAdd;
        overlap{1,i}(indsToInsert) = peak_loc3{k}(m);
        overlap{2,i}(indsToInsert) = peak_loc3{i}(j);
    end
end
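To see what the bsxfun mask in the code above contains, here is a small worked sketch on made-up intervals. The numbers are invented for illustration and the peak_loc3 inequality test is left out to keep it short. A column vector compared against a row vector yields a full matrix, with one entry for every A-B pair.

```matlab
% Made-up intervals, for illustration only (not the asker's data).
stopA  = [5; 12; 20];        % column vector: stop points of set A
startB = [1 4 18];           % row vector: start points of set B
stopB  = [6 15 25];          % row vector: stop points of set B

% Entry (j,m) is true when stopA(j) lies in [startB(m), stopB(m)).
ABMask = bsxfun(@ge, stopA, startB) & bsxfun(@lt, stopA, stopB);

% Row (j) and column (m) subscripts of every true entry, in column-major order
[j, m] = find(ABMask);

disp(ABMask)   % rows: [1 1 0], [0 1 0], [0 0 1]
disp([j m])    % pairs: (1,1), (1,2), (2,2), (3,3)
```

In MATLAB R2016b and later, implicit expansion allows the same mask to be written as `(stopA >= startB) & (stopA < stopB)` without bsxfun.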
The bsxfun-based update should make significant improvements on a large dataset. There is still room for improvement, depending on the type and sizes of data you have. You can get a good view of which parts of the code take the most time by replacing tic and toc with profile on and profile viewer.
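A minimal sketch of that profiling workflow follows. The loop is a hypothetical stand-in workload, not the overlap code. Replace it with the section being tuned.

```matlab
profile on                      % start collecting timing data

% Stand-in workload (hypothetical); put the loops being tuned here.
total = 0;
for idx = 1:1e4
    total = total + sqrt(idx);
end

profile off                     % stop collecting
stats = profile('info');        % results as a struct, for inspection in code
% profile viewer                % uncomment to open the interactive report
```

The interactive report lists time per function and per line, which makes it easier to confirm whether the assignments into overlap or the mask computation dominate.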
I have a feeling that the assignment into overlap will still be the biggest area for possible improvement.
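One possible way to reduce that cost, offered as an untested sketch rather than as part of the answer's code, is to store each batch of matches in its own cell and concatenate once after the loop. The names `values`, `masks`, `pieces` and `combined` are hypothetical, and the data is made up.

```matlab
% Made-up data standing in for peak_loc3{k} and the per-k masks.
values = [10 20 30 40 50];
masks  = {logical([1 0 1 0 0]), ...
          logical([0 0 0 0 0]), ...
          logical([0 1 0 0 1])};

% Store each batch in its own preallocated cell instead of growing one vector.
pieces = cell(1, numel(masks));
for k = 1:numel(masks)
    pieces{k} = values(masks{k});   % may be empty when nothing matches
end

% Concatenate once, after the loop.
combined = [pieces{:}];
disp(combined)   % 10 30 20 50
```

This assumes every batch is a row vector so that horizontal concatenation works. Whether it beats the indexed assignment on the real data is something the profiler would have to confirm.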