Cheat for preallocation requirement

1 view (last 30 days)
Arda Aksu
Arda Aksu on 10 Jun 2021
Commented: dpb on 11 Jun 2021
Hi, I almost always use preallocation to make the code run faster. But recently I was faced with a problem where I couldn't use preallocation. Even creating a big array and then shrinking at the end was not possible to my use. So I needed a cheat. While trying some workaround solutions I realized preallocation is not such a problem for cell arrays as it is for numeric arrays. Below is an example code. I am wondering the reason behind this. How does Matlab handle memory management for cell arrays with different than numeric arrays?
tic
x=[];
for i=1:1e5
x(end+1,:)=1:10;
end
toc
tic
c={};
for i=1:1e5
c{end+1,1}=1:10;
end
x=zeros(numel(c),10);
for i=1:numel(c)
x(i,:)=c{i};
end
toc
% Elapsed time is 10.064584 seconds.
% Elapsed time is 0.132362 seconds.
  1 Comment
dpb
dpb on 10 Jun 2021
Cell arrays don't have to be contiguous as are native numeric data types.

Sign in to comment.

Answers (1)

Jan
Jan on 10 Jun 2021
x = [];
for i = 1:1e5
x(end+1, :) = 1:10;
end
Here you allocate sum(1:1e5) * 10 * 8 bytes: 400 GB.
c = {};
for i=1:1e5
c{end+1,1} = 1:10;
end
This allocates sum(1:1e5) * 8 bytes, because you need only one additional pointer per iteration. In addition, the elements of c point all to the same variable, which is shared:
format debug
c{1}
% Structure address = ebf72da0
% m = 1
% n = 10
% pr = f6ba8aa0
c{2}
% Structure address = ebf72da0
% m = 1
% n = 10
% pr = f6ba8aa0 <- the same address!
In the first code, the contents of the array is copied in each iteration, while in the cell method, all cells use the same data pointer, such that much less data have to be copied and less memery is occupied. On the other hand, each cell element has a header of about 100 bytes.
The JIT acceleration can cope with an omitted pre-allocation. This can reduce the dramatic slow down under some circumstances.
By the way, I cannot imagine a situation, where a pre-allocation is not possible. Could you post a minimal working example? At least allocating in steps of 1000 should be possible.
  9 Comments
James Tursa
James Tursa on 11 Jun 2021
"Do you have an explanation for the timings measured in the original question?"
Yes. As Jan pointed out, the first method copies and re-copies the data over and over and over again. The second method doesn't.
dpb
dpb on 11 Jun 2021
Well, yes...the Q? I interpreted as was about insight re: Jan's expectation of 10:1 vis a vis the OP's observed 100:1 ratio. Why on that, I've no klew, starting with I don't know why/how Jan derived his "expected" factor of 10.

Sign in to comment.

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!