Problem using parfor for reading variable sized chunks of data into a larger pre-allocated container

Question

Hi,

I have a problem: I have pre-allocated a large matrix or vector, a, and I want to read data blocks from a large number of files and insert each block at the right indices in a. The files, and therefore the resulting blocks, typically have different sizes.

A simple example:

a = zeros(15,1); % pre-allocated vector
b = [1,10;11,15]; % each row contains the start to stop index for each block
parfor i = 1:size(b,1)
    a(b(i,1):b(i,2),:) = i*ones((b(i,2)-b(i,1))+1,1);
end

With 'for' instead of 'parfor' it works as intended.

I have been reading several sites like https://www.mathworks.com/help/parallel-computing/troubleshoot-variables-in-parfor-loops.html and https://www.mathworks.com/matlabcentral/answers/126515-usage-of-sturctures-inside-parfor but I cannot wrap my head around my problem above.

Any tips or solutions which don't reduce the performance I am trying to obtain by using parfor in the first place?

Thanks,

Oyvind


Answer 1

Edric Ellis on 21 Oct 2020

Edited: Edric Ellis on 21 Oct 2020


There's no simple way to do this without at least some duplication of data. With some duplication of data, you could do something simple like this:

aCell = cell(1, size(b,1));
parfor i = 1:size(b,1)
    aCell{i} = <stuff>; % return each block in its entirety
end
a = vertcat(aCell{:}); % concatenate all cell entries into the final result
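
For reference, here is a minimal runnable sketch of the cell-array approach applied to the example from the question. As far as I understand the parfor rules, the original loop is rejected because `a` is indexed with a range computed from `b` rather than with the loop variable `i` itself, so it cannot be classified as a sliced variable; `aCell{i}` can. The `i*ones(...)` expression is only a stand-in for the real file read, and `nBlocks` and `len` are names introduced here for illustration. The serial copy loop at the end is used instead of `vertcat` on the assumption that the blocks might not be contiguous or in order.

```matlab
% Sketch only; uses the Parallel Computing Toolbox for the parfor loop.
b = [1,10; 11,15];               % start/stop index of each block (from the question)
nBlocks = size(b,1);             % hypothetical helper name
aCell = cell(nBlocks, 1);        % one cell per block; sliced by the loop variable

parfor i = 1:nBlocks
    len = b(i,2) - b(i,1) + 1;   % length of block i
    aCell{i} = i*ones(len, 1);   % stand-in for reading block i from a file
end

% Serial copy into the pre-allocated container. This is the
% "duplication of data": each block exists in aCell and in a.
a = zeros(15,1);
for i = 1:nBlocks
    a(b(i,1):b(i,2), :) = aCell{i};
end
```

The copy loop runs on the client and only moves memory, so it is usually cheap compared with the file reads done inside the parfor loop.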

If that is not sufficiently performant, you could consider using parfeval to give you a little more control, but this is more difficult to code, and may not actually save you much. Here's an untested sketch though:

a = zeros(15,1);
for i = 1:size(b,1)
    fut(i) = parfeval(@doStuff, 1, b(i,1), b(i,2)); % invoke doStuff(b(i,1),b(i,2))
end
for i = 1:size(b,1)
    [idx, result] = fetchNext(fut); % collect the next result
    % (note that 'idx' tells you the index into 'fut' that just
    % completed)
    a(b(idx,1):b(idx,2),:) = result; % push the result into 'a'
end
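
A self-contained variant of the sketch above, in case it helps to try it out. Here `doStuff` is a hypothetical stand-in (an anonymous function that returns the indices `lo:hi` as a column) rather than a real file reader, and `nBlocks` is a name introduced for illustration. The future array is preallocated with `parallel.FevalFuture`. This assumes a parallel pool is available or can be started automatically.

```matlab
% Sketch only; uses the Parallel Computing Toolbox for parfeval/fetchNext.
b = [1,10; 11,15];                    % start/stop index of each block
a = zeros(15,1);                      % pre-allocated container
doStuff = @(lo, hi) (lo:hi).';        % hypothetical stand-in for the file read
nBlocks = size(b,1);                  % hypothetical helper name

fut(1:nBlocks) = parallel.FevalFuture;   % preallocate the array of futures
for i = 1:nBlocks
    % Queue doStuff(b(i,1), b(i,2)) on a worker, requesting 1 output.
    fut(i) = parfeval(doStuff, 1, b(i,1), b(i,2));
end

for i = 1:nBlocks
    % Blocks arrive in completion order; idx identifies which future finished.
    [idx, result] = fetchNext(fut);
    a(b(idx,1):b(idx,2), :) = result;    % write the block into its slot
end
```

Because each result is written into `a` as soon as it arrives, only one block at a time is held outside `a` on the client, in contrast to the cell-array version.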


Oyvind Heg on 23 Oct 2020

Thank you for the answer.

I've done a quick profiler example:

parfor:

1 result_cell = cell(N,1);
2 parfor i = 1:N
3     result_cell{i} = readData(...);
4 end

parfeval:

5 fut(1:N) = parallel.FevalFuture;
6 for i = 1:N
7     fut(i) = parfeval(@readData,...)
8 end
9 for i = 1:N
10     [idx,result] = fetchNext(fut);
11 end

My observations when running the profiler are:

The 'parfor' loop on line 2 and the 'for' loop on line 9 take about the same amount of time (about 12 seconds in my example).

However line 7 takes as long as 8 seconds. Is that to be expected? Seems like a lot of overhead when the actual work takes 12 seconds.

Thanks,

Oyvind

Oyvind Heg on 27 Oct 2020

Hi,

Any thoughts on the overhead on line 7? Is it as expected, or am I doing something wrong?

Thanks,

Oyvind
