# How efficient is it to use (end+1) to add a value to an array?

Katia Anatska on 7 Sep 2021
Edited: John D'Errico on 7 Sep 2021
I am going to work with a huge number of entries in an array (thousands or hundreds of thousands). What would be more efficient: to create an array of size n full of zeros and with each iteration replace a zero with a value I need?
array = zeros(n,1)
OR:
to create an empty array and add a value to the end of the array with each iteration?
array = []
with each iteration:
array(end+1) = some value
I just need to know which algorithm would be more efficient to use.
Thank you.

Dave B on 7 Sep 2021
Edited: Dave B on 7 Sep 2021
Preallocating an array is far more efficient. Why?
Here's a critical bit: MATLAB generally stores arrays in contiguous blocks of memory. As the MATLAB documentation on preallocation puts it, "Repeatedly resizing arrays often requires MATLAB® to spend extra time looking for larger contiguous blocks of memory, and then moving the array into those blocks."
Let's see it in action:
tic
array = zeros(1e7,1);
for i = 1:1e7
array(i)=rand;
end
toc
Elapsed time is 0.469052 seconds.
tic
other=[];
for i = 1:1e7
other(end+1)=rand;
end
toc
Elapsed time is 1.524720 seconds.
Dave B on 7 Sep 2021
If your data are double, consider preallocating with NaN (using the nan function) which has the advantage of potentially catching edge cases where you expected to assign a value and didn't. If your data are integer, consider specifying that in the call to zeros:
a = nan(13,2);
for i = 1:12
a(i,1)=rand;
a(i,2)=randn;
end
bar(mean(a)); % bar is empty because we forgot to assign the 13th value
b = zeros(10,1,'uint8'); % a uint8 vector of zeros consumes less resources than a double vector of zeros
for i = 1:10
b(i) = randi(255);
end
class(b)
ans = 'uint8'

John D'Errico on 7 Sep 2021
Edited: John D'Errico on 7 Sep 2021
Increasing the size of an array incrementally is a TERRIBLE thing to do. WHY?
Every time you add a new element to the end of the array, the vector becomes one element longer. So what does MATLAB need to do? It allocates a NEW array or vector of the new size. Then it copies the entire set of elements over. So every time, the array becomes larger and larger, and it takes more and more time to add just one more element. For vectors with many thousands of elements, this will become the largest source of time wasted in your code. A simple loop to create that vector might take days or weeks to run. Do NOT do this.
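To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the simple model just described, where every append copies all existing elements. That model is an assumption for illustration, not a measurement: the timings posted earlier in this thread suggest that real MATLAB behaves less severely.

```matlab
% Naive cost model (an assumption, not measured MATLAB behavior):
% appending to a vector that already holds k elements copies all k of them.
n = 1e5;                      % number of appends
copiesGrow = n*(n-1)/2;       % 0 + 1 + ... + (n-1) element copies in total
writesPrealloc = n;           % one write per element when preallocated
fprintf('growing: %.3g copies, preallocated: %d writes\n', ...
    copiesGrow, writesPrealloc);
```

Under this model the work grows with the square of the vector length, while the preallocated version grows linearly.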
Instead, just preallocate the array to the final size. Then index into it. This will be incredibly fast and efficient. You will use a tool like zeros to perform the preallocation. I actually prefer to preallocate with the NaN function, since then it shows me where the unused elements are. But the difference is trivial.
The only issue is if you don't know the final size. Then the best solution will arguably be to just create a vector that is longer than you know it will need. When all is done, delete those last unused elements.
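A minimal sketch of that over-allocate-and-trim idea might look like the following. The upper bound nMax and the rule for keeping a value are hypothetical, chosen only so the example runs on its own.

```matlab
nMax = 1000;                 % hypothetical upper bound on the number of values
values = nan(nMax,1);        % preallocate more than we expect to need
count = 0;                   % number of slots actually filled
for k = 1:nMax
    x = rand;
    if x > 0.5               % hypothetical condition for keeping a value
        count = count + 1;
        values(count) = x;   % index into the preallocated vector
    end
end
values(count+1:end) = [];    % delete the unused tail once the loop is done
```

Keeping a separate counter means the vector is resized only once, at the very end.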
When you really have no clue how long your vector must be, and you cannot allocate something longer than you will need, there are tools I have provided on the file exchange to do this operation, to make it more efficient. They use tricks with cell arrays to avoid having to continuously reallocate new blocks of memory.
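As a rough illustration of the cell array idea, the sketch below fills fixed-size blocks and concatenates them once at the end. It is not the code from the File Exchange tools mentioned above; blockSize and the stopping rule are hypothetical. The cell array itself still grows by one entry per block, on the assumption that adding a cell costs far less than recopying all of the data.

```matlab
blockSize = 1000;                 % hypothetical block length
blocks = {};                      % finished blocks; only this small cell array grows
current = nan(blockSize,1);       % block currently being filled
count = 0;                        % filled slots in the current block
total = 0;                        % values stored overall
while true
    x = rand;
    if x > 0.9999, break; end     % hypothetical stopping rule, unknown in advance
    if count == blockSize         % current block is full: store it, start a new one
        blocks{end+1} = current;
        current = nan(blockSize,1);
        count = 0;
    end
    count = count + 1;
    current(count) = x;
    total = total + 1;
end
if count > 0
    blocks{end+1} = current(1:count);   % keep only the filled part of the last block
end
result = vertcat(blocks{:});      % one concatenation at the end
```

With this layout the data is copied once, in the final vertcat, rather than on every append.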

R2021a
