
Parallelization is not working in Matlab

Chang seok Ma on 30 Mar 2021
Answered: Nipun on 14 May 2024 at 6:06
Hello,
I am trying to parallelize my code in MATLAB. Below is a part of my code.
Basically, I am finding the minimizer and the minimum value for each combination of 4 state variables.
I want to speed up my code by using parallelization, so I changed 'for i_a = 1:Na' to 'parfor i_a = 1:Na'. However, MATLAB does not seem to parallelize the code: when I look at the output at the end of each iteration (disp([i_a i_d i_y i_t toc])), it looks like MATLAB is computing the minimum values one by one. (The code also doesn't seem to run any faster.)
Am I doing something wrong? (I don't think there is a false-sharing issue here, because the only thing I update is Vnew, and Vnew is not used during the calculation.)
for i_a = 1:Na %Loop over state variable a
for i_d = 1:Nd %Loop over state variable d
for i_y = 1:Ny %Loop over state variable y
for i_t = 1:Nt %Loop over state variable t
tic;
utilityadj = @(adjf) -(((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R + (1-delta)*D(i_d) - adjf(1) - adjf(2) - fixed*(1-delta)*D(i_d))>0)*((1/(1-elasticity)*( ((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R + (1-delta)*D(i_d) - adjf(1) - adjf(2) - fixed*(1-delta)*D(i_d))^relutility) * (adjf(2)^(1-relutility)) )^(1-elasticity)) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(1),T(1))*transition(i_y,1)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(2),T(1))*transition(i_y,2)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(3),T(1))*transition(i_y,3)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(4),T(1))*transition(i_y,4)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(5),T(1))*transition(i_y,5)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(6),T(1))*transition(i_y,6)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(7),T(1))*transition(i_y,7)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(1),T(2))*transition(i_y,1)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(2),T(2))*transition(i_y,2)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(3),T(2))*transition(i_y,3)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(4),T(2))*transition(i_y,4)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(5),T(2))*transition(i_y,5)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(6),T(2))*transition(i_y,6)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,adjf(1),adjf(2),YS(7),T(2))*transition(i_y,7)*prob(2)) ...
+ ((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R + (1-delta)*D(i_d) - adjf(1) - adjf(2) - fixed*(1-delta)*D(i_d))<=0)*(-1e10));
noadj_damount = (1-delta)*D(i_d);
if i_d == 1
noadj_damount = d_min;
end
utilitynoadj = @(noadjf) -(((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R - noadjf)>0)*((1/(1-elasticity)*( ((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R - noadjf)^relutility) * (((1-delta)*D(i_d))^(1-relutility)) )^(1-elasticity)) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(1),T(1))*transition(i_y,1)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(2),T(1))*transition(i_y,2)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(3),T(1))*transition(i_y,3)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(4),T(1))*transition(i_y,4)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(5),T(1))*transition(i_y,5)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(6),T(1))*transition(i_y,6)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(7),T(1))*transition(i_y,7)*prob(1) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(1),T(2))*transition(i_y,1)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(2),T(2))*transition(i_y,2)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(3),T(2))*transition(i_y,3)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(4),T(2))*transition(i_y,4)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(5),T(2))*transition(i_y,5)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(6),T(2))*transition(i_y,6)*prob(2) ...
+ beta * interpn(A,D,YS,T,V,noadjf,noadj_damount,YS(7),T(2))*transition(i_y,7)*prob(2)) ...
+ ((wage*hour*YS(i_y) + T(i_t) + A(i_a)*R - noadjf)<=0)*(-1e10));
lb = [a_min-10,d_min-10];
ub = [a_max+10,d_max+10];
a = [];
b = [];
aeq = [];
beq = [];
x0 = (lb + ub) / 2;
options = optimoptions('fmincon','Display','off');
[adjchoice,adjval] = fmincon(utilityadj,x0,a,b,aeq,beq,lb,ub,a,options);
[noadjchoice,noadjval] = fmincon(utilitynoadj,a_min,a,b,aeq,beq,a_min-10,a_max+10,a,options);
Vnew(i_a,i_d,i_y,i_t) = -min(adjval, noadjval);
if Vnew(i_a,i_d,i_y,i_t) == -adjval
indpol_ap(i_a,i_d,i_y,i_t) = adjchoice(1);
indpol_dp(i_a,i_d,i_y,i_t) = adjchoice(2);
indadj(i_a,i_d,i_y,i_t) = 1;
else
indpol_ap(i_a,i_d,i_y,i_t) = noadjchoice;
indpol_dp(i_a,i_d,i_y,i_t) = noadj_damount;
indadj(i_a,i_d,i_y,i_t) = 0;
end
disp([i_a i_d i_y i_t toc])
end
end
end
end

Answers (1)

Nipun on 14 May 2024 at 6:06
Hi Chang Seok,
I understand that you want to speed up your MATLAB code by using `parfor` over one of the state variables in your nested loops, but the loop does not appear to run in parallel: the iterations are displayed one after another and there is no noticeable speedup. Here are a few things to check.
1. Parallel Pool
Make sure a parallel pool is running before the `parfor` loop executes. If no pool is open, MATLAB tries to start one automatically, which adds startup overhead that is especially noticeable when the loop itself is short. You can start a pool manually with `parpool`.
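A minimal sketch of starting the pool up front, assuming Parallel Computing Toolbox is installed and the default cluster profile works on your machine:

```matlab
% Reuse an existing pool if there is one; otherwise start one with the
% default profile. gcp('nocreate') returns empty when no pool is open.
pool = gcp('nocreate');
if isempty(pool)
    pool = parpool;
end
fprintf('Pool has %d workers\n', pool.NumWorkers);
```

Running this once before you time the loop keeps the pool startup cost out of your `tic`/`toc` measurements.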
2. Overhead vs. Computation Time
Parallelization pays off most when each iteration takes a significant amount of time. If each iteration is quick, the overhead of distributing work to the workers can outweigh the gain. The two `fmincon` calls in each iteration look computationally intensive, so in theory your loop should benefit from parallelization.
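To check whether the overhead dominates, you can time the same work with `for` and with `parfor`. The sketch below uses a hypothetical toy workload (one small `fminsearch` problem per iteration, with made-up names `targets`, `xSerial`, `xPar`), not your model:

```matlab
% Toy workload (hypothetical): one small optimization per iteration.
N = 64;
targets = linspace(-2, 2, N);
xSerial = zeros(1, N);
xPar    = zeros(1, N);

% Serial reference run.
tSerial = tic;
for k = 1:N
    t = targets(k);
    xSerial(k) = fminsearch(@(x) (x - t)^2, 0);
end
serialTime = toc(tSerial);

% Same work with parfor; xPar is a sliced output variable.
tPar = tic;
parfor k = 1:N
    t = targets(k);
    xPar(k) = fminsearch(@(x) (x - t)^2, 0);
end
parTime = toc(tPar);

fprintf('serial: %.3f s, parfor: %.3f s\n', serialTime, parTime);
```

With iterations this cheap, the `parfor` version may well be slower. Swapping in one of your real `fmincon` calls would show how the comparison looks for your problem.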
3. Data Transfer and Dependencies
`parfor` loops work best when iterations are independent of one another, which minimizes data transfer between workers. Your code appears to satisfy this, since each iteration writes to a unique index of `Vnew`, `indpol_ap`, `indpol_dp`, and `indadj`. Still, check that every variable used inside the loop is initialized before the loop and that there are no hidden dependencies between iterations.
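One possible way to make the independence explicit, and to give `parfor` more iterations to distribute, is to flatten the four loops into a single linear index. This is only a sketch: the grid sizes, `A`, `D`, and the `fminsearch` objective are placeholders I made up, and I am not claiming this is the cause of what you observe.

```matlab
% Hypothetical small grid and a stand-in objective; replace the loop body
% with the real per-state optimization.
Na = 4; Nd = 3; Ny = 2; Nt = 2;
A  = linspace(0, 1, Na);
D  = linspace(0, 1, Nd);
sz = [Na Nd Ny Nt];
N  = prod(sz);

Vflat = zeros(N, 1);   % sliced output: iteration k writes only Vflat(k)
parfor k = 1:N
    [i_a, i_d, i_y, i_t] = ind2sub(sz, k);   % recover the four indices
    target = A(i_a) + D(i_d) + i_y + i_t;    % placeholder state payoff
    [~, fval] = fminsearch(@(x) (x - target)^2 + target, 0);
    Vflat(k) = -fval;
end
Vnew = reshape(Vflat, sz);   % same layout as Vnew(i_a,i_d,i_y,i_t)
```

Because `ind2sub` and `reshape` both use column-major order, `Vnew(i_a,i_d,i_y,i_t)` ends up holding the value computed for that state. The policy arrays could be filled the same way, each through its own flat vector.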
4. Profiling and Optimization
Consider using MATLAB's Profiler to identify bottlenecks in your code. The Profiler can help you understand where the most time is spent and guide optimization efforts:
profile on
% Your code here
profile off
profile viewer
If you still see no improvement after checking these points, look more closely at the computations inside the loop, or consult MATLAB's parallel computing documentation for more advanced techniques such as chunking iterations or reducing data transfer between workers.
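To see how much data moves between the client and the workers, `ticBytes` and `tocBytes` can wrap the loop. The sketch below assumes Parallel Computing Toolbox and uses a made-up array `V` and a trivial loop body purely for illustration:

```matlab
pool = gcp;                    % current pool (starts one if auto-create is enabled)
V = rand(50, 50, 7, 2);        % placeholder for a large array used in the loop
N = 200;
out = zeros(N, 1);

startState = ticBytes(pool);   % begin counting transferred bytes
parfor k = 1:N
    out(k) = sum(V(:)) + k;    % V is broadcast to every worker
end
bytes = tocBytes(pool, startState);   % one row per worker: [sent, received]
disp(bytes)
```

If the byte counts are large relative to the work done per iteration, reducing what gets broadcast is a reasonable next thing to try.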
Hope this helps.
Regards,
Nipun
