Is vectorized code always faster than loops? Any exceptions?
[EDIT: 20110727 09:35 CDT - reformat - WDR]
I have a critical chunk of code with six nested for-loops. I replaced the innermost three with vectorized code, and the vectorized version (with exactly the same configuration otherwise, on the same computer) takes more than twice the run time. I ran each version a few times; the results are below. Any light on this behaviour is appreciated. Thanks.
% fem_nought is the file with loops. fem_optimized is the one with the vectorized equivalent of the innermost 3 loops.
>> fem_optimized
Elapsed time is 10.073242 seconds.
>> fem_optimized
Elapsed time is 9.588474 seconds.
>> fem_optimized
Elapsed time is 9.872822 seconds.
>> fem_nought
Elapsed time is 4.047568 seconds.
>> fem_nought
Elapsed time is 3.678311 seconds.
>> fem_nought
Elapsed time is 3.672811 seconds.
Trimmed versions of both codes are below (the declarations of many variables have been removed):
LOOPS version:
for k=1:nel
  for ri=1:8
    for si=1:8
      for mn=1:4
        for nm=1:4
          for km=1:4
            r=.5*(a*p(mn)+r1+r2);
            s=.5*(b*p(nm)+s3+s2);
            t=.5*(c*p(km)+t1+t5);
            a1=-.02*s+0.5*r*(1-r^2)+.05*t;
            a2=-.05*t-.5*s;
            %...............SHAPE FUNCTION..........................
            N(1)=((r-r2)/(r1-r2))*((s-s4)/(s1-s4))*((t-t5)/(t1-t5));
            N(2)=((r-r1)/(r2-r1))*((s-s3)/(s2-s3))*((t-t6)/(t2-t6));
            N(3)=((r-r4)/(r3-r4))*((s-s2)/(s3-s2))*((t-t7)/(t3-t7));
            N(4)=((r-r3)/(r4-r3))*((s-s1)/(s4-s1))*((t-t8)/(t4-t8));
            N(5)=((r-r6)/(r5-r6))*((s-s8)/(s5-s8))*((t-t1)/(t5-t1));
            N(6)=((r-r5)/(r6-r5))*((s-s7)/(s6-s7))*((t-t2)/(t6-t2));
            N(7)=((r-r8)/(r7-r8))*((s-s6)/(s7-s6))*((t-t3)/(t7-t3));
            N(8)=((r-r7)/(r8-r7))*((s-s5)/(s8-s5))*((t-t4)/(t8-t4));
            Nr(1)=(1/(r1-r2))*((s-s4)/(s1-s4))*((t-t5)/(t1-t5));
            Nr(2)=(1/(r2-r1))*((s-s3)/(s2-s3))*((t-t6)/(t2-t6));
            Nr(3)=(1/(r3-r4))*((s-s2)/(s3-s2))*((t-t7)/(t3-t7));
            Nr(4)=(1/(r4-r3))*((s-s1)/(s4-s1))*((t-t8)/(t4-t8));
            Nr(5)=(1/(r5-r6))*((s-s8)/(s5-s8))*((t-t1)/(t5-t1));
            Nr(6)=(1/(r6-r5))*((s-s7)/(s6-s7))*((t-t2)/(t6-t2));
            Nr(7)=(1/(r7-r8))*((s-s6)/(s7-s6))*((t-t3)/(t7-t3));
            Nr(8)=(1/(r8-r7))*((s-s5)/(s8-s5))*((t-t4)/(t8-t4));
            Ns(1)=((r-r2)/(r1-r2))*(1/(s1-s4))*((t-t5)/(t1-t5));
            Ns(2)=((r-r1)/(r2-r1))*(1/(s2-s3))*((t-t6)/(t2-t6));
            Ns(3)=((r-r4)/(r3-r4))*(1/(s3-s2))*((t-t7)/(t3-t7));
            Ns(4)=((r-r3)/(r4-r3))*(1/(s4-s1))*((t-t8)/(t4-t8));
            Ns(5)=((r-r6)/(r5-r6))*(1/(s5-s8))*((t-t1)/(t5-t1));
            Ns(6)=((r-r5)/(r6-r5))*(1/(s6-s7))*((t-t2)/(t6-t2));
            Ns(7)=((r-r8)/(r7-r8))*(1/(s7-s6))*((t-t3)/(t7-t3));
            Ns(8)=((r-r7)/(r8-r7))*(1/(s8-s5))*((t-t4)/(t8-t4));
            Nt(1)=((r-r2)/(r1-r2))*((s-s4)/(s1-s4))*(1/(t1-t5));
            Nt(2)=((r-r1)/(r2-r1))*((s-s3)/(s2-s3))*(1/(t2-t6));
            Nt(3)=((r-r4)/(r3-r4))*((s-s2)/(s3-s2))*(1/(t3-t7));
            Nt(4)=((r-r3)/(r4-r3))*((s-s1)/(s4-s1))*(1/(t4-t8));
            Nt(5)=((r-r6)/(r5-r6))*((s-s8)/(s5-s8))*(1/(t5-t1));
            Nt(6)=((r-r5)/(r6-r5))*((s-s7)/(s6-s7))*(1/(t6-t2));
            Nt(7)=((r-r8)/(r7-r8))*((s-s6)/(s7-s6))*(1/(t7-t3));
            Nt(8)=((r-r7)/(r8-r7))*((s-s5)/(s8-s5))*(1/(t8-t4));
            p1(ri,si,k)=a1*N(ri)*Ns(si)*w(mn)*w(nm)*w(km)*.125*a*b*c;
            p2(ri,si,k)=a2*N(ri)*Nt(si)*w(mn)*w(nm)*w(km)*.125*a*b*c;
            %Elemental Stiffness Matrix......................
            ke(ri,si,k) = ke(ri,si,k) + p1(ri,si,k) + p2(ri,si,k);
          end
        end
      end
    end
  end
end
VECTORIZED VERSION
for k=1:nel
  r=.5*(a*p(mn)+r1+r2);
  s=.5*(b*p(nm)+s3+s2);
  t=.5*(c*p(km)+t1+t5);
  Nr = zeros(4,4,4,8);
  N = zeros(4,4,4,8);
  Ns = zeros(4,4,4,8);
  Nt = zeros(4,4,4,8);
  for ri=1:8
    for si=1:8
      %...............SHAPE FUNCTION..........................
      Nr(:,:,:,1)=(1/(r1-r2))*((s-s4)/(s1-s4)).*((t-t5)/(t1-t5));
      Nr(:,:,:,2)=(1/(r2-r1))*((s-s3)/(s2-s3)).*((t-t6)/(t2-t6));
      Nr(:,:,:,3)=(1/(r3-r4))*((s-s2)/(s3-s2)).*((t-t7)/(t3-t7));
      Nr(:,:,:,4)=(1/(r4-r3))*((s-s1)/(s4-s1)).*((t-t8)/(t4-t8));
      Nr(:,:,:,5)=(1/(r5-r6))*((s-s8)/(s5-s8)).*((t-t1)/(t5-t1));
      Nr(:,:,:,6)=(1/(r6-r5))*((s-s7)/(s6-s7)).*((t-t2)/(t6-t2));
      Nr(:,:,:,7)=(1/(r7-r8))*((s-s6)/(s7-s6)).*((t-t3)/(t7-t3));
      Nr(:,:,:,8)=(1/(r8-r7))*((s-s5)/(s8-s5)).*((t-t4)/(t8-t4));
      N(:,:,:,1) = (r-r2).*Nr(:,:,:,1);
      N(:,:,:,2) = (r-r1).*Nr(:,:,:,2);
      N(:,:,:,3) = (r-r4).*Nr(:,:,:,3);
      N(:,:,:,4) = (r-r3).*Nr(:,:,:,4);
      N(:,:,:,5) = (r-r6).*Nr(:,:,:,5);
      N(:,:,:,6) = (r-r5).*Nr(:,:,:,6);
      N(:,:,:,7) = (r-r8).*Nr(:,:,:,7);
      N(:,:,:,8) = (r-r7).*Nr(:,:,:,8);
      Ns(:,:,:,1) = N(:,:,:,1)./(s-s4);
      Ns(:,:,:,2) = N(:,:,:,2)./(s-s3);
      Ns(:,:,:,3) = N(:,:,:,3)./(s-s2);
      Ns(:,:,:,4) = N(:,:,:,4)./(s-s1);
      Ns(:,:,:,5) = N(:,:,:,5)./(s-s8);
      Ns(:,:,:,6) = N(:,:,:,6)./(s-s7);
      Ns(:,:,:,7) = N(:,:,:,7)./(s-s6);
      Ns(:,:,:,8) = N(:,:,:,8)./(s-s5);
      Nt(:,:,:,1) = N(:,:,:,1)./(t-t5);
      Nt(:,:,:,2) = N(:,:,:,2)./(t-t6);
      Nt(:,:,:,3) = N(:,:,:,3)./(t-t7);
      Nt(:,:,:,4) = N(:,:,:,4)./(t-t8);
      Nt(:,:,:,5) = N(:,:,:,5)./(t-t1);
      Nt(:,:,:,6) = N(:,:,:,6)./(t-t2);
      Nt(:,:,:,7) = N(:,:,:,7)./(t-t3);
      Nt(:,:,:,8) = N(:,:,:,8)./(t-t4);
      kem = .125*a*b*c * N(:,:,:,ri).*w(mn).*w(nm).*w(km) ...
          .* ( (-.02*s+0.5*r.*(1-r.^2)+.05*t).*Ns(:,:,:,si) ...
          + (-.05*t-.5*s).*Nt(:,:,:,si));
      ke(ri,si,k) = sum(kem(:));
      %
    end
  end
end
Accepted Answer
Jan
on 27 Jul 2011
No, vectorized code is not always faster. If the vectorization requires creating large temporary arrays, loops are often faster. Allocating memory is very expensive, because it can trigger a garbage collection or even disk swapping.
An example: http://www.mathworks.com/matlabcentral/answers/8461-double-summation-with-vectorized-loops
BTW: Nr, N, Ns and Nt are completely overwritten in each iteration, so it is enough, and more efficient, to allocate them once before the loops.
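The same reasoning can be taken one step further. In the vectorized code from the question, the expressions that fill Nr, N, Ns and Nt do not reference ri or si, so the values themselves, not only the allocation, could be computed once per element instead of on every (ri, si) pass. The following is a minimal sketch of that idea with made-up stand-in data: nq, nn, R, S, T and the placeholder formula are assumptions for illustration, not the asker's actual shape functions.

```matlab
% Hypothetical sizes and sample points, for illustration only.
nq = 4;                          % quadrature points per direction
nn = 8;                          % nodes per element
x = linspace(-1, 1, nq);
[R, S, T] = ndgrid(x, x, x);     % stand-ins for the r, s, t grids

% Version A: N is rebuilt on every (ri, si) pass, i.e. nn*nn times.
keA = zeros(nn, nn);
tic
for ri = 1:nn
    for si = 1:nn
        N = zeros(nq, nq, nq, nn);
        for n = 1:nn
            % Placeholder formula, NOT the real shape functions.
            N(:,:,:,n) = (R - n) .* (S + n) .* (T - 2*n);
        end
        kem = N(:,:,:,ri) .* N(:,:,:,si);
        keA(ri, si) = sum(kem(:));
    end
end
tA = toc;

% Version B: N does not depend on ri or si, so it is built once.
keB = zeros(nn, nn);
tic
N = zeros(nq, nq, nq, nn);
for n = 1:nn
    N(:,:,:,n) = (R - n) .* (S + n) .* (T - 2*n);
end
for ri = 1:nn
    for si = 1:nn
        kem = N(:,:,:,ri) .* N(:,:,:,si);
        keB(ri, si) = sum(kem(:));
    end
end
tB = toc;

fprintf('rebuilt each pass: %.4f s, built once: %.4f s\n', tA, tB);
```

Both versions perform the same arithmetic on the same data, so keA and keB should agree; only the amount of repeated work differs. The timings from tic/toc on such a small problem are noisy, so repeating the measurement a few times is advisable.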
More Answers (2)
Daniel Shub
on 27 Jul 2011
I am not sure if vectorization is always faster, but loops are not as expensive as they used to be, thanks to the JIT accelerator. I would guess there might be examples where loops are faster, but I cannot think of one off the top of my head.
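One way to look for such a case is to time both forms of the same computation. The sketch below is only an illustration with hypothetical data (n, x and y are made up). Which version wins depends on the MATLAB release, the array size and the machine, so no particular outcome is claimed here.

```matlab
% Hypothetical test data, for illustration only.
n = 200000;
x = rand(n, 1);
y = rand(n, 1);

% Vectorized form: the intermediate results (x - y) and (x - y).^2
% may each be stored as a temporary n-by-1 array.
tic
sVec = sum((x - y).^2);
tVec = toc;

% Loop form: a single scalar accumulator, no temporary arrays.
tic
sLoop = 0;
for i = 1:n
    d = x(i) - y(i);
    sLoop = sLoop + d*d;
end
tLoop = toc;

fprintf('vectorized: %.4f s, loop: %.4f s\n', tVec, tLoop);
```

The two sums are accumulated in a different order, so they can differ by floating-point rounding; comparing them with a small relative tolerance rather than with == is the safer check.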
2 Comments
Daniel Shub
on 27 Jul 2011
I am not the best person to answer that. I would suggest asking it as a new question to get a good answer.