Why are loops the devil?

Natalie on 1 Nov 2013
Commented: Natalie on 4 Nov 2013
I know everybody talks about how you should avoid nested loops because they are slow but I still find it hard to avoid them. I know there are some situations where they just can't be avoided but I think it is mostly down to my lack of imagination. In some cases I avoid one loop by using find, e.g.
for n = 1:size(Data,1)
    index = find(ismember(UniqueTitles.mass, Titles(n+1,Columns.mass)));
    Cell.(num2str(UniqueTitles.mass{index,1})) = Data(n,Columns.mass);
end
but then there are other places where I have three or four nested loops. For example...
for m = 1:M
    for i = 1:N
        for p = 1:P
            Temp{m,i}.(num2str(Index{p,1})) = 0;
        end
    end
end
Does anyone have some general suggestions for avoiding nested loops?
Thanks in advance.

Answers (4)

Image Analyst on 1 Nov 2013
They aren't inherently bad, and in some cases can be faster than vectorized code. Generally for loops of a few tens of thousands of iterations or less, you won't notice any slowness. Once you start getting up into hundreds of thousands or millions of iterations, you can see a noticeable difference. Looking at your code I doubt you'd notice any difference unless you actually timed it and looked at the elapsed time numbers.

Doug Hull on 1 Nov 2013
Looking at your first snippet, I expect it is the FIND that is slowing down the code. In the second, it is likely the num2str.
It is not necessarily the for loop that slows down MATLAB code, especially since we introduced the JIT. For questions on speed, I would run your code through the profiler. In this case, the profiler will just report the lines in the middle of the loops as the slow ones, so you should break those lines down into smaller statements so that you can see which part of the code is slowest.
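A minimal sketch of that profiling workflow, applied to the triple loop from the question. The contents of Index and the sizes M, N, P are hypothetical stand-ins, and splitting the inner line into two statements is only one possible way to separate the costs:

```matlab
% Hypothetical stand-in data; the real Index, M, N and P come from the original code.
Index = {'wind'; 'solar'; 'hydro'};
M = 20; N = 50; P = numel(Index);
Temp = cell(M, N);

profile on
for m = 1:M
    for i = 1:N
        for p = 1:P
            name = num2str(Index{p,1});   % part 1: build the field name
            Temp{m,i}.(name) = 0;         % part 2: dynamic field assignment
        end
    end
end
profile off

stats = profile('info');   % raw results as a struct
% profile viewer           % uncomment to open the interactive per-line report
```

With the line split in two, the report shows the time spent in num2str separately from the time spent on the dynamic field assignment.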
As near as I can tell, you are using dynamic field names. It looks like the field names are going to be coming from Index. It also looks like you are always storing scalars in there. Maybe a three-dimensional matrix would be better instead of a multi-dimensional cell/structure array.
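A rough sketch of the three-dimensional layout, assuming every stored value is a numeric scalar. The technology names and sizes are hypothetical, and Values and solarSlice are illustrative names that do not appear in the original code:

```matlab
Index = {'wind'; 'solar'; 'hydro'};   % hypothetical technology names
M = 4; N = 3; P = numel(Index);

% One call replaces the triple loop that set every field to 0.
Values = zeros(M, N, P);

% Look a technology up by name instead of using a dynamic field name.
p = find(strcmp(Index, 'solar'));
Values(2, 1, p) = 7.5;

% M-by-N matrix holding all values for one technology.
solarSlice = Values(:, :, p);
```

Index still records which page of the array belongs to which technology, so the names stay available to the user without being baked into field names.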
  1 Comment
Natalie on 4 Nov 2013
Thanks Doug. My reason for using cells was that I essentially have values associated with different technologies (p) for N simulations and M years. Index contains the names of these technologies (with all unwanted characters removed). I thought this would be a nice way of storing the values so that the user can automatically tell which technology they relate to. Although I have found cells quite difficult to work with at times. I guess maybe it's a trade off between speed and producing variable names that the user will immediately recognise?
In relation to using num2str, I'm not really sure why this is necessary because the values in Index are already strings, but my code doesn't seem to work when I omit this (even if I use round brackets).
Can you suggest an alternative to using 'find'? Perhaps it would be faster to just use another loop now that you have introduced JIT.
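One possible alternative to calling find inside the loop is the second output of ISMEMBER, which returns the matching positions for all rows in a single call. The sketch below uses hypothetical data and assumes Titles is a cell array of strings with a header in its first row, as the Titles(n+1,...) indexing in the question suggests:

```matlab
% Hypothetical stand-in data.
UniqueTitles.mass = {'coal'; 'gas'; 'wind'};
Titles = {'header'; 'wind'; 'coal'; 'gas'; 'wind'};   % first row is a header
Columns.mass = 1;

% loc(n) is the position of Titles{n+1} within UniqueTitles.mass (0 if absent).
[found, loc] = ismember(Titles(2:end, Columns.mass), UniqueTitles.mass);
```

Inside the loop, loc(n) can then take the place of the find(ismember(...)) expression, so the search runs once rather than once per row.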


Cedric on 1 Nov 2013
Edited: Cedric on 1 Nov 2013
The archetypal situation where people say it is awful to use nested loops is the following: we need to count the number of elements of some array A which are above 0.8.
A = rand(2000, 2000) ;
% Nested loops approach.
tic ;
count = 0 ;
for r = 1:size(A,1)
    for c = 1:size(A,2)
        if A(r,c) > 0.8
            count = count + 1 ;
        end
    end
end
toc
% Vector approach.
tic ;
count = sum(A(:) > 0.8) ;
toc
The first approach is timed at
Elapsed time is 1.687180 seconds.
when the second leads to
Elapsed time is 0.021290 seconds.
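As a quick sanity check, the sketch below confirms that both approaches produce the same count on a smaller array. The fixed seed, the array size, and the NNZ variant are additions for illustration and are not part of the timing test above:

```matlab
rng(0);                 % fixed seed so the check is repeatable
A = rand(200, 200);

% Nested loops approach.
countLoop = 0;
for r = 1:size(A,1)
    for c = 1:size(A,2)
        if A(r,c) > 0.8
            countLoop = countLoop + 1;
        end
    end
end

% Vector approaches: sum over a logical mask, or count its nonzero entries.
countSum = sum(A(:) > 0.8);
countNnz = nnz(A > 0.8);
```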
We are actually facing the following situations (not exhaustive):
  • Sometimes thinking "matrix/vector" instead of "element by element using loops" brings a significant increase in performance (the vector approach is about 80 times faster in the example above).
  • Sometimes loops cannot be avoided.
  • Sometimes people propose using ARRAYFUN or CELLFUN to avoid using explicit loops. While this makes the code more concise, these functions are actually hidden/implicit loops which don't increase the overall performance (generally).
  • Sometimes a simple FOR loop is efficient enough and keeps the code clear and simple.
  • Sometimes a simple FOR loop over a short range is more efficient than building all the objects needed for implementing a vector approach.
So there is no absolute truth about loops!
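To illustrate the point about CELLFUN being an implicit loop, here is a small sketch with hypothetical data. Both versions call numel once per cell, so the CELLFUN form is shorter to write but does the same per-element work:

```matlab
C = {'coal', 'gas', 'wind', 'solar'};   % hypothetical data

% Explicit loop with a preallocated result.
lenLoop = zeros(size(C));
for k = 1:numel(C)
    lenLoop(k) = numel(C{k});
end

% CELLFUN with a function handle: more concise, still one call per element.
lenFun = cellfun(@numel, C);
```

Which of the two runs faster depends on the data and the MATLAB release, so timing both on the real workload is the safest way to choose.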
About your code: as mentioned by others, NUM2STR, ISMEMBER, and dynamic field naming are going to be slow, and that cost is amplified by the fact that they are inside loops. So the general approach would have to change if you wanted or needed to increase the performance.

Natalie on 4 Nov 2013
Thanks for all your help guys. You've given me lots to think about.
