Suggestions for vectorizing double/triple for loops in Matlab
9 views (last 30 days)
Show older comments
My Matlab code has multiple double/triple for loops which I believe cause a bottleneck when the index summation variables reach large values. I've tried to implement vectorization instead of double/triple for loops with some small performance gains. I'm interested in understanding best practices for changing double/triple for loops into vectorized Matlab code.
My real code includes 10-15 functions that implement variations of double/triple for loops. I'm curious on how I could improve or rewrite these for loops as vectorized code. I've included two examples below of a double for loop and a triple for loop. I'm interested in seeing if these could be improved. I plan to rewrite all of the for loops as vectorized code (if possible).
First example: This is a double for loop. Implementating the builtin vectorization in Matlab appears to increase the computational speed. I'm not sure if this could be done more efficiently.
close all;
clear all;
clc;
% Test 1: Small M,N
M = 50;
N = 50;
c = rand(N,M)+i*rand(N,M);
eps = .3;
delt = .4;
tStart = tic;
polyv = 0;
for n = 1:N
for m = 1:M
polyv = polyv + c(n,m)*eps^n*delt^m;
end
end
tEnd = toc(tStart);
fprintf('Testing against a double for loop \n');
fprintf('First test with small N,M\n');
fprintf('Total time by double for loop is %f seconds \n', tEnd);
% This may be a sloppy way of vectorizing the code. I'm not sure if there
% is a more efficient method
tStart = tic;
[nval,mval] = ndgrid(1:N,1:M);
vander = eps.^nval.*delt.^mval;
polyv1 = sum(sum(c.*vander));
tEnd = toc(tStart);
fprintf('Total time by vectorization is %f seconds \n', tEnd);
diff = norm(polyv1 - polyv,inf); % should be tiny
% Test 2: "Large" M,N
M = 1000;
N = 1000;
c = rand(N,M)+i*rand(N,M);
eps = .3;
delt = .4;
tStart = tic;
polyv = 0;
for n = 1:N
for m = 1:M
polyv = polyv + c(n,m)*eps^n*delt^m;
end
end
tEnd = toc(tStart);
fprintf('Second test with large N,M\n');
fprintf('Total time by double for loop is %f seconds \n', tEnd);
tStart = tic;
[nval,mval] = ndgrid(1:N,1:M);
vander = eps.^nval.*delt.^mval;
polyv1 = sum(sum(c.*vander));
tEnd = toc(tStart);
fprintf('Total time by vectorization is %f seconds \n', tEnd);
diff2 = norm(polyv1 - polyv,inf); % should be tiny
These two test return
Testing against a double for loop
First test with small N,M
Total time by double for loop is 0.008831 seconds
Total time by vectorization is 0.007447 seconds
Second test with large N,M
Total time by double for loop is 0.393363 seconds
Total time by vectorization is 0.163277 seconds
So there seems to be a small performance gain with the second method.
Second example: This is a triple for loop. I have created test data (my actual Matlab code code is far more complicated but I'm really only interested in improving performance).
close all;
clear all;
clc;
% Test data
N = 20;
M = 20;
N_Eps = 100;
N_delta = 100;
Nx = 32;
ru = zeros(N_Eps,N_delta);
rl = zeros(N_Eps,N_delta);
alpha = 0;
B = zeros(Nx,1);
C = zeros(Nx,1);
alpha_p = ones(Nx,1);
k_u = 2;
k_w = 1;
tStart = tic;
for ell=1:N_delta
for j=1:N_Eps
for p=1:N
% Real code is similar - this is just test data
if(alpha_p(p)^2 < k_u^2)
B(p) = 1;
end
if(alpha_p(p)^2 < k_w^2)
C(p) = 0;
end
end
ru(j,ell) = sum(B);
rl(j,ell) = sum(C);
end
end
tEnd = toc(tStart);
fprintf('Total time by triple for loop is %f seconds \n', tEnd);
% This has more meaning in my real Matlab code. I'm only inserting in test
% data to improve performance.
ee = 1.0 - ru - rl;
This returns
Total time by triple for loop is 0.013758 seconds
I'm not sure how to vectorize this code through a method similar to the first example.
Do you recommend (in general) vectorization over writing double/triple for loops? Should you always try to use vectorization and avoid writing any double/triple for loops? I'm curious what more experienced Matlab users do.
0 Comments
Accepted Answer
Jan
on 13 Jul 2021
Edited: Jan
on 13 Jul 2021
The power operation is very expensive, so avoid it whenever it is possible:
M = 50;
N = 50;
c = rand(N,M) + i*rand(N,M);
epsilon = 0.3; % Do not shadow the importan function "eps"
delt = 0.4;
tic;
for rep = 1:1000
polyv = 0;
for n = 1:N
for m = 1:M
polyv = polyv + c(n,m) * epsilon^n * delt^m;
end
end
end
toc
tic
for rep = 1:1000
polyv = 0;
c1 = 1;
for n = 1:N
c1 = c1 * epsilon;
c2 = 1;
for m = 1:M
c2 = c2 * delt;
polyv = polyv + c(n,m) * c1 * c2;
end
end
end
toc
3 times faster already.
tic
for rep = 1:1000
[nval,mval] = ndgrid(1:N,1:M);
vander = epsilon.^nval .* delt.^mval;
polyv1 = sum(sum(c .* vander));
end
toc
It has some severe drawbacks: It contains a lot of power operations, but they are called for the same inputs repeatedly. Avoid ndgrid, but calculate the powers of the vectors and create the matrix afterwards:
for rep = 1:1000
vander = (epsilon .^ (1:N).') .* (delt .^ (1:M));
polyv1 = sum(sum(c .* vander));
end
toc
But you can avoid the power operation by using cumprod():
tic
for rep = 1:1000
c1 = cumprod(repmat(epsilon, N, 1));
c2 = cumprod(repmat(delt, 1, M));
polyv = sum(c .* c1 .* c2, 'all');
end
toc
Please run this locally again to check the timings.
In your 2nd example, the repeated squaring of alpha_p, k_u and k_u is a waste of time. Do this once before the loops.
For real values, a^2 < b^2 is equivalent to abs(a) < abs(b).
A vectorized version of:
% Original:
for p=1:N
if(alpha_p(p)^2 < k_u^2)
B(p) = 1;
end
if(alpha_p(p)^2 < k_w^2)
C(p) = 0;
end
end
% Vectorized, alpha_p_2 = alpha_p .^ 2 etc. defined before the loops
B(alpha_p_2 < k_u_2) = 1;
C(alpha_p_2 < k_w_2) = 0;
0 Comments
More Answers (0)
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!