Why is the vectorized version of this simple local maxima detection code significantly slower (~2-3 times) than its for-loop version?

% test data
X = rand(100000,1000);
% finding local maxima over columns of X
% for-loop version
tic;
[I,J] = size(X);
Ind = false(I,J);
for j = 1:J
    Ind(:,j) = diff( sign( diff([0; X(:,j); 0]) ) ) < 0;
end
toc
% vectorized version (~3 times slower than for-loop)
tic;
Ind_ = diff(sign(diff([zeros(1,J);X;zeros(1,J)],1,1)),1,1) < 0;
toc
% result identity test
isequal(Ind,Ind_)

vectorised code is terribly slower

Bruno Luong on 9 Sep 2019


I guess it is because of

[zeros(1,J);X;zeros(1,J)]

for which MATLAB needs to allocate a big chunk of memory (and copy it segment by segment, although that also happens in the for-loop version).
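One way to probe the concatenation hypothesis is to compute the same result without building the padded array at all. The following is only a sketch: the variable names `S`, `Ind2`, and `Ref` and the small test size are assumptions made for illustration, and no claim is made that it runs faster, since the sliced comparisons create temporaries of their own.

```matlab
% Sketch: local maxima per column without forming [zeros(1,J); X; zeros(1,J)].
% Assumes X has at least 2 rows; the test size is arbitrary.
X = rand(2000, 50);
[I, J] = size(X);

% Reference: zero-padded vectorized version from the question.
Ref = diff(sign(diff([zeros(1,J); X; zeros(1,J)], 1, 1)), 1, 1) < 0;

% Signs of the slopes between consecutive rows, size (I-1)-by-J.
S = sign(diff(X, 1, 1));

Ind2 = false(I, J);
% Interior rows: the slope sign drops across the point.
Ind2(2:I-1, :) = S(2:end, :) < S(1:end-1, :);
% First row: the slope coming from the virtual zero pad is sign(X(1,:) - 0).
Ind2(1, :) = S(1, :) < sign(X(1, :));
% Last row: the slope going to the virtual zero pad is sign(0 - X(I,:)).
Ind2(I, :) = sign(-X(I, :)) < S(end, :);

same = isequal(Ref, Ind2);
```

Timing this variant against the original vectorized line on the full-size X would show how much of the gap, if any, is due to the vertical concatenation.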

Michal on 9 Sep 2019

Edited: Michal on 9 Sep 2019


@Bruno I think the problem could be in the built-in diff function, which may not be implemented efficiently for the dim = 1 option. See the timing of the following code:

%% test data
X = rand(100000,1000);
%% finding local maxima over columns of X
[I,J] = size(X);
array = [zeros(1,J);X;zeros(1,J)];
% for-loop version
tic;
Ind = false(I,J);
for j = 1:J
    Ind(:,j) = diff( sign( diff(array(:,j)) ) ) < 0;
end
toc
% vectorized version (~2 times slower than for-loop)
tic;
Ind_ = diff(sign(diff(array,1,1)),1,1) < 0;
toc
%% result identity test
isequal(Ind,Ind_)

Bruno Luong on 9 Sep 2019

Edited: Bruno Luong on 9 Sep 2019


Not entirely convinced. I still stick with a memory-related cause, because not only the vertical CAT but also DIFF, SIGN, and DIFF create three big (hidden) temporary arrays.
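To put numbers on these temporaries, here is a rough size estimate for the test data in this thread. It assumes 8-byte doubles and ignores any in-place reuse MATLAB might perform internally, which is not documented here; the variable names are illustrative.

```matlab
% Rough size estimate of the arrays involved (assumes 8-byte doubles).
I = 100000; J = 1000;
bytesPad   = (I + 2) * J * 8;   % [zeros(1,J); X; zeros(1,J)]
bytesDiff1 = (I + 1) * J * 8;   % result of the inner diff
bytesSign  = (I + 1) * J * 8;   % result of sign
bytesDiff2 = I * J * 8;         % result of the outer diff
bytesCol   = (I + 1) * 8;       % one column-sized temporary in the for-loop

fprintf('padded array:         %.1f MB\n', bytesPad   / 1e6);
fprintf('inner diff temporary: %.1f MB\n', bytesDiff1 / 1e6);
fprintf('sign temporary:       %.1f MB\n', bytesSign  / 1e6);
fprintf('outer diff temporary: %.1f MB\n', bytesDiff2 / 1e6);
fprintf('per-column temporary: %.1f MB\n', bytesCol   / 1e6);
```

Each full-size temporary is about 800 MB, while a per-column temporary is under 1 MB, so the for-loop works on a much smaller set of data at any one time.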

If you add the 1,1 parameters in the for-loop

tic;
[I,J] = size(X);
Ind = false(I,J);
for j = 1:J
    Ind(:,j) = diff( sign( diff(array(:,j),1,1) ),1,1) < 0;
end
toc

it's still fast. How do you explain that?

You will also note that the relative difference in CPU time is smaller if you reduce the first dimension of X.
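The dependence on the first dimension can be checked with a small sweep. This is a minimal sketch: the column count, the row counts, and the names `rowCounts` and `ratio` are arbitrary choices, and single tic/toc measurements are noisy, so the printed ratios should be read as rough indications only.

```matlab
% Sketch: ratio of vectorized time to for-loop time for several row counts.
J = 100;
rowCounts = [1e2, 1e3, 1e4, 5e4];
ratio = zeros(size(rowCounts));
for k = 1:numel(rowCounts)
    I = rowCounts(k);
    array = [zeros(1,J); rand(I,J); zeros(1,J)];

    % for-loop version
    tic;
    Ind = false(I,J);
    for j = 1:J
        Ind(:,j) = diff(sign(diff(array(:,j)))) < 0;
    end
    tLoop = toc;

    % vectorized version
    tic;
    Ind_ = diff(sign(diff(array,1,1)),1,1) < 0;
    tVec = toc;

    assert(isequal(Ind, Ind_));
    ratio(k) = tVec / tLoop;
    fprintf('I = %6d: loop %.4f s, vectorized %.4f s, ratio %.2f\n', ...
            I, tLoop, tVec, ratio(k));
end
```

Repeating each measurement several times and taking the median would give more stable ratios.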

Michal on 9 Sep 2019

Edited: Michal on 9 Sep 2019

I guess that in this case I call diff(array(:,j),1,1), where array(:,j) is a vector, not a matrix, so diff does not have to compute over separate columns of the array. Maybe the built-in diff function does not use multithreading properly in this case? But you are right, memory allocation in the vectorized code could really be one (!) of the causes of the slowness.

Bruno Luong on 9 Sep 2019

It is possible that the DIFF implementation does not access memory sequentially in the case of 2D array data, but works row by row through the array, which might slow it down.
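A middle ground between the two versions can help separate these effects: process X in blocks of a few columns. This sketch says nothing about how DIFF is implemented internally. The block width `B` is a hypothetical tuning parameter, and the test size is arbitrary.

```matlab
% Sketch: local maxima computed in blocks of B columns.
% B = 1 corresponds to the for-loop version, B = J to the fully vectorized one.
X = rand(5000, 64);
[I, J] = size(X);
B = 8;                                  % hypothetical block width

IndB = false(I, J);
for j0 = 1:B:J
    cols = j0:min(j0 + B - 1, J);       % columns in the current block
    n = numel(cols);
    block = [zeros(1,n); X(:,cols); zeros(1,n)];
    IndB(:,cols) = diff(sign(diff(block,1,1)),1,1) < 0;
end

% Check against the fully vectorized version.
Ref = diff(sign(diff([zeros(1,J); X; zeros(1,J)],1,1)),1,1) < 0;
sameB = isequal(IndB, Ref);
```

If the run time on the full-size data grows noticeably once a block no longer fits in cache, that would point toward a memory-related cause rather than a problem specific to the dim argument.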

I don't think the multi-threading is wrongly implemented.

Michal on 9 Sep 2019

The main problem is that the continuous development of the JIT engine keeps changing MATLAB's performance characteristics for vectorized code. In general, standard for-loop code becomes faster and faster.

I have plenty of highly vectorized MATLAB codes written over the last 10 years which, during the last few years, have become slower than their for-loop counterparts. So there is no code performance stability.
