You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
extremely inefficient piece of code ... bug?
3 views (last 30 days)
Show older comments
I have two codes that implement exactly the same task:
idx(idx<=0) = 1;
idx(idx>=n) = n-1;
and
if idx <= 0
idx = 1;
elseif idx >= n
idx = n-1;
end
where idx and n are a scalar integer values.
This piece of code is evaluating multiple times (N = 1e6, for example). During profiling of the code, I found very strange inefficiency of the first code, which is about ~ 100-1000x slower than the 2nd one. I use the 1st type of code to get more readable code by avoiding if-else construct, but the overall performance is on R2023a really terrible. See attached test code (test.m).
Is this behavior normal, or is it a BUG?
Please, could you verify this code on other (older) MATLAB releases to see, if the problem is only R2023a related or not?
24 Comments
Bruno Luong
on 31 Aug 2023
Edited: Bruno Luong
on 31 Aug 2023
Why it's a bug? The first code is not efficient since it calls array comparison + logical array + array indexing + array sub assigment. each function have to create intermediate MATLAB array result, decide which branch code it goesn, and perhaps doing some deep duplication of data (?), garbage collector, etc...
Whereas the second code the JIT just do the work as simple as possible.
It is not a mistery or unusual that the first code is inefficient.
To me the first code doesn't look readable at all, it is very missleading for scalar input.
Michal
on 31 Aug 2023
Edited: Michal
on 31 Aug 2023
@Bruno Luong I did not say, that it is a BUG. I just asking for any opinions.
idx <= 0 or idx >= n is not an array comparison, due to the fact, that idx and n are scalar values.
If you were right, the code
if idx <= 0
idx(true) = 1;
elseif idx >= n
idx(true) = n-1;
end
would be as fast as the original 1st code. But it's not, it's still significantly faster!
Bruno Luong
on 31 Aug 2023
Edited: Bruno Luong
on 31 Aug 2023
BTW you repeat the uppercase BUG in two places.
Yes idx <= 0 if expressed inside the idexing would call function le that accepts arrays and returns a logical array, even though you have a scalar 1x1 array. It is NOT inside the if conditional test staement. Function le supposes to work on array so doing all kind of fancy stuffs that you don't see behind MATLAB compact syntax.
John D'Errico
on 31 Aug 2023
Edited: John D'Errico
on 31 Aug 2023
So only after I wrote my answer explaining why the two are different, do I learn that idx is a scalar. I give up. I cannot read your mind. I've deleted my answer, since it presumed you were using vectors, as the indexing lines only made sense if idx was a vector.
This is not a bug. You are using indexing to do something that is designed to be efficient for vectors, but no vectors were involved.
On a scalar, the simple if test will of course be more efficient. It does not even perform the second test and branch at all if the first branch is executed. AND it has always been that way, although the if test has become more efficient over the years, since they have improved the speed of the parser to write better code internally. Those changes in the parser for efficiency were made mainly almost certainly long before you ever learned MATLAB though.
Bruno Luong
on 31 Aug 2023
Edited: Bruno Luong
on 31 Aug 2023
When I'm lazy I do this
idx = max(min(idx,n-1),1);
or
idx = min(max(idx,1),n-1);
rather than if-else. Never cross my mind to do array logical indexing for scalar.
Strictly speaking this is NOT equivalent to either since your methods let NaN untouched, not mine (and in this case the two above min/max methods return different results for NaN).
When I need efficient code, I use if/else.
Michal
on 31 Aug 2023
@Bruno Luong Nice ... but still about 10x slower than if-else construct.
So, in the summary, I learned a few things:
- probably due to the continious JIT improvements, the low-level (not-vectorised) approaches (like if-else construct) will be more and more efficient
- using array oriented operations on scalar values is not a good idea at all
- this is definitely not a BUG, but frankly speaking, looks like a potencial BUG :)
Rik
on 31 Aug 2023
I was going to suggest something like Bruno posted (but was interupted before I hit submit).
The only thing to watch out for is that this code may not work in edge cases (like NaN, but also with non-integers).
idx=0.5;n=0;
Bruno(idx,n),Michal(idx,n)
1
-1
idx=1;n=1;
Bruno(idx,n),Michal(idx,n)
1
0
function Bruno(idx,n)
idx = max(min(idx,n-1),1);
disp(idx)
end
function Michal(idx,n)
if idx <= 0
idx(true) = 1;
elseif idx >= n
idx(true) = n-1;
else
error('undefined') % always include an else branch
end
disp(idx)
end
Bruno Luong
on 31 Aug 2023
This is not a bug it's a feature.
Rik
on 31 Aug 2023
Why do you insist on the word bug? A bug is unexpected/undefined/undocumented behavior. Perfomance isn't a bug unless the difference is catastrophic. If the indexing-style took 5 seconds to complete (instead of miliseconds), that would be a bug. Right now it is just behavior you did not expect.
I think the most valuable lesson here is this: write code the way Mathworks engineers expect you to. That way the improvements in the JIT (or Execution Engine, same difference) will benefit your code as well. Your coding style may or may not be intrinsically better or clearer. If you care about optimal performance on a broad range of releases, use the style Matlab expects.
Bruno Luong
on 31 Aug 2023
Edited: Bruno Luong
on 31 Aug 2023
To get it straight the time needed for
idx(idx<=0) = 1;
idx(idx>=n) = n-1;
is about 0.5 microsecond.
Michal
on 31 Aug 2023
Edited: Michal
on 31 Aug 2023
@Rik OK ... you are right regarding bug definition.
Is there available any comprehensive and clear description of what the "MATLAB Expects" programming style is? From my point of view, there is no generally recommended MATLAB programming style at all. There are only a myriad of recommendations spread over documentation and other MATLAB community sources. And these recommendations are sometimes contradictory or at least a bit confusing.
Michal
on 31 Aug 2023
@Bruno Luong Yes and time needed for
if idx <= 0
idx = 1;
elseif idx >= n
idx = n-1;
end
is about 0.5 nano seconds, I did not see you point...
Bruno Luong
on 31 Aug 2023
Edited: Bruno Luong
on 31 Aug 2023
I don't want to make any point beside giving a more precise timing for those who read with Rik statement 'If the indexing-style took 5 seconds to complete (instead of miliseconds)". It's submicrosecond range.
Michal
on 31 Aug 2023
The situation looks completely different, when you are using this piece of code many times. My MonteCarlo code use this piece code about 1e8 times for one simulation run, so the overall difference is 1min vs 0.3 sec. This is a huge difference when I need to perform about 1e3 separate MC runs.
Rik
on 31 Aug 2023
For the purposes of my comment microsecond and nanoseconds are the same. My point was that both methods are so fast that it doesn't matter. For one to be slow enough to be classified as a bug, they would have to be unexpectedly slow.
There are actually some recommended methods gathered in some places. There are cheat-sheet pdfs (although I can never find them, nor do I need them anymore) provided by Mathworks.
More interestingly: besides the scattered (and sometimes contradictory) advice in the documentation, there is this thread.
I don't know where in the documentation I would mention that indexing of a scalar is not recommend. Where would you do so?
Bruno Luong
on 31 Aug 2023
Edited: Bruno Luong
on 31 Aug 2023
Don't tell me that you mainly do scalar clipping on your montecarlo simulation.
Michal
on 31 Aug 2023
Edited: Michal
on 31 Aug 2023
@Bruno Luong Yes, I do :) I am simulating behaviour of random signals for different types of linear interpolations between sampling time points, and "scalar clipping" plays here relatively crucial role at fast linear single-point query interpolation, which takes about 30-50% of the whole processing time.
Single point query linear interpolation leeds to the scalar "idx"!
Bruno Luong
on 31 Aug 2023
Noise interpolation has no sense to start with....
Michal
on 31 Aug 2023
@Bruno Luong Well OK... You are asking and I am giving you an answer. This is the perfect closing statement, Bruno. Especially if you don't know anything about the topic at hand.
Bruno Luong
on 31 Aug 2023
Cheese: For the record: My timing figure is directed to Rik's comment, then you continue to draw me in with your justification about your montecarlo simulation, that I initially never ask for. I'm not really interested in knowing the topic to be honest. That's your work not mine. You don't need to attck me with "Especially if you don't know anything about the topic at hand."
Michal
on 31 Aug 2023
Edited: Michal
on 31 Aug 2023
@Bruno Luong I just react on your statement: "Don't tell me that you mainly do scalar clipping on your montecarlo simulation."
I tried to explain to you why I use "scalar clipping" in my MC simulation. You finally comment my explanation by: "Noise interpolation has no sense to start with...."
So, what ...?
Accepted Answer
Matt J
on 31 Aug 2023
Edited: Matt J
on 31 Aug 2023
Well, I think the bottom line is just that the JIT does not have any optimization for indexing expressions like idx(idx<=0). That might be because such an optimization would need to know in advance that idx is a scalar, and remains a scalar throughout the loop.
Coversely, parsing an if-statement,
if idx>threshold
end
doesn't require nearly as much work, even when idx is a vector. The if test is done by looping through the vector elements idx(i)>threshold, and as soon as one of the elements is false, it bails out.
5 Comments
Michal
on 31 Aug 2023
Edited: Michal
on 31 Aug 2023
OK ... but still, is the absence of any JIT optimization the only reason why is
idx(idx<=0)
so much slower (~ 1000x) than
if idx>threshold
end
Is using of idx(idx<=0) for scalar idx so insane as others suggest?
Rik
on 31 Aug 2023
In short: yes, it is extremely likely all due to the JIT.
And to your second question: no, it isn't insane, just wasteful and a subjectively odd way to write the code.
I honestly don't understand why that isn't obvious when looking at the code. You produce a logical array and use it to index a variable (possibly with an all-false array). Compare that to the evaluation of an if-statement. While 1000x is a bit surprising to me, anything below 20x would have made me doubt the timing test.
Matt J
on 31 Aug 2023
Edited: Matt J
on 31 Aug 2023
No, I'm saying that to achieve the same performance you're seeing with if-else, the scalar indexing case would have to receive its own special treatment from the JIT, separate from what is normally done for general logical indexing.
More Answers (0)
See Also
Categories
Find more on Arithmetic Operations in Help Center and File Exchange
Tags
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom(English)
Asia Pacific
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)