- Are you using "clear" or "clear all" at the top of your script? Try using "clearvars" instead.
- Are you loading the same files on every loop iteration? If so, you could try loading them once before the loop instead.
When loading .mat files in a parfor, the first time is way slower than the second time.
Hi all,
I've encountered a weird behavior I wasn't able to understand or find a possible explanation of.
I wrote a function for loading some files (data structures whose size ranges from 40 to 100 MB) from a dataset in a parfor, and do some operations.
I've noticed that the first time I launch the script, the execution is much slower than subsequent executions (38 seconds vs 1.8 seconds).
I've tried removing the parfor and using a simple for loop, but there is still a difference between the first and subsequent runs, even though it is smaller (17 seconds vs 11 seconds).
I've also tried different datasets, and there is the same behavior. When I restart Matlab and I launch the same call the first time, same thing. If I stop and restart parpool, same thing.
I am wondering why it is like this and if I can do something to avoid this behavior.
Matlab 2019a Update 4, Unix (64-bit)
PS: parpool was already started.
PPS: the successive executions are faster even after calling clear all/clearvars.
PPPS: to remove all possible other influences, I've cleaned the code so that now it just loads files. Same behavior.
2 Comments
Daniel M
on 15 Nov 2019
I think this is an issue with either caching or the just-in-time compiler organizing itself on the first run of the loop, or an issue with broadcasting in the parfor.
Question:
Answers (1)
Daniel M
on 15 Nov 2019
Edited: Daniel M
on 15 Nov 2019
So you're doing something like this?
for k = 1:10
mydata = load('myfile.mat');
output = someFunction(mydata);
end
That's pretty inefficient. If the file is the same on every iteration, you should load the data once outside the loop. Reusing the copy that is already in memory is faster than reading the file again each time, because memory access is typically much faster than disk I/O.
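As a minimal sketch of the load-once pattern, assuming the same file is needed on every iteration: the file name, the placeholder data, and the stand-in someFunction below are all hypothetical and only illustrate the structure.

```matlab
% Hypothetical temporary file and placeholder data, for illustration only.
fname = [tempname '.mat'];
data = rand(1, 1e5);
save(fname, 'data');
clear data

% Stand-in for the real processing step (an assumption, not the poster's code).
someFunction = @(s, k) k * sum(s.data);

% Read the file from disk once, before the loop.
mydata = load(fname);

output = zeros(1, 10);
for k = 1:10
    % Each iteration reuses the copy already in memory.
    output(k) = someFunction(mydata, k);
end

delete(fname);   % remove the temporary file
```

If each iteration needs a different file, this pattern does not apply and the load has to stay inside the loop.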
As for why the first iteration is slower, I believe that is due to the JIT compiler doing its magic. This is also referred to as 'warm-up time'. Hopefully someone with a deeper understanding can weigh in here.
Try running this script to test for warm up time. Note: run this in a script, not the command window (because the JIT effects may not take place in the command window).
clearvars
close all
clc
% Create some data, but only once
if ~exist('data.mat','file')
data = rand(1,1e8,'single');
save('data.mat','data');
clear data
end
fname = 'data.mat';
fprintf('loading\n')
tic
mydata = load(fname);
data = mydata.data;
loadtime = toc;
% display the loading time
fprintf('It took %f s to load the file.\n',loadtime)
% Run some stuff in a loop and time it.
iters = 20;
t2 = zeros(1,iters);
for k = 1:iters
t1 = tic;
% do some arbitrary computations on data
tmp1 = data.^2;
tmp2 = sin(tmp1);
t2(k) = toc(t1);
end
figure
stem(t2)
xlabel('Iteration')
ylabel('Time (s)')
% first couple iterations take longer
% get the warm up time (from first few iterations)
warmtime = max(t2(1:3))/mean(t2(end-3:end)) - 1;
fprintf('First few iterations were %.0f %% slower than last\n',warmtime*100)
fprintf('done!\n')
And the output:
loading
It took 2.576633 s to load the file.
First few iterations were 58 % slower than last
done!
% see attached figure
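The script above times the computation, not the load itself. To check whether the load call also gets faster on repetition (for example because the operating system keeps recently read files in memory), one could time several loads of the same file. This is only a sketch: the temporary file and its size are assumptions. Because the file has just been written, it may already be cached, so a true first-read effect is more likely to show up with an existing file that has not been touched recently.

```matlab
% Hypothetical test file in the temp folder; the size is an assumption.
fname = [tempname '.mat'];
data = rand(1, 1e6, 'single');
save(fname, 'data');
clear data

% Time several loads of the same file.
nReps = 5;
tLoad = zeros(1, nReps);
for k = 1:nReps
    t0 = tic;
    s = load(fname);
    tLoad(k) = toc(t0);
    clear s
end

% Print one line per repetition.
fprintf('load #%d: %.4f s\n', [1:nReps; tLoad]);

delete(fname);   % remove the temporary file
```

If the first load is consistently the slowest even in a plain for loop, that points toward file caching rather than anything specific to parfor.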