memory space for cell array

Yingke on 8 Jun 2012
Edited: per isakson on 6 Nov 2014
Dear All
I am a bit curious about the memory space required by an empty cell element. In the following code, a and b are equal, but they have different sizes in memory. It seems that '[]' occupies 60 B after explicit assignment, which seems quite expensive. I think a NULL pointer (4 B) would be much more reasonable.
====CODE====
a = {[],[]};
b{2} = [];
isequal(a{1},b{1}) % they are the same
whos
a 1x2 120 cell
b 1x2 64 cell

Accepted Answer

James Tursa on 9 Jun 2012
A MATLAB variable consists of:
Structure (about 60 bytes on a 32-bit system, a bit more on 64-bit systems) containing information such as:
- Class
- Type (temporary, normal, sub-element, etc.)
- Dimensions
- Pointers to data (real & imaginary)
- Pointers to sparse indexing (if applicable)
- Pointers to field names (for structs)
- A few other things
A double variable is all of the above, with data pointer(s) pointing to an array of double data. A single variable is all of the above, with data pointer(s) pointing to an array of single data. A char variable is all of the above, with data pointer pointing to an array of char data (2-bytes per char, btw). Etc.
A cell array is all of the above, with data pointer pointing to an array of other MATLAB variable pointers. E.g., in a double class variable, the "data" of the variable is 64-bit IEEE double precision values. For a cell array, the "data" of the variable is 32-bit pointers to the 60-byte variable structures of other variables (64-bit pointers on 64-bit systems).
When you create a cell array with the cell function, e.g.,
A = cell(1,2);
The variable A is all of the above, but the "data" of the cell array variable is an array of all NULL pointers (0). I.e., there is really nothing there at all. When you de-reference such an array like this:
A{1,1}
MATLAB actually creates a brand new empty double matrix on the fly.
When you fill a cell array element with an explicit empty double matrix like this:
B{2} = [];
MATLAB creates a brand new cell array B (assuming B did not already exist) with two data elements. The first element is actually NULL (0) and nothing is there. The second element actually has a variable pointer in it that is not NULL ... it points to another MATLAB variable that happens to be an empty double matrix. The B elements look a lot like variable A elements above (empty double matrices), but behind the scenes they are stored differently as outlined above. But even though they are stored differently behind the scenes, the isequal function still considers the elements equal since the NULL pointers get virtually equated to physically present empty matrices.
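The difference between unset (NULL) elements and explicitly assigned empties can be observed with a small script along the following lines. This is only a sketch: the variable names A, B, C and the info* structs are chosen here for illustration, and the byte counts that whos reports depend on the MATLAB version and on whether the system is 32-bit or 64-bit, so no specific numbers are assumed.

```matlab
% Sketch only: byte counts reported by whos vary by MATLAB version and platform.
A = cell(1,2);         % both elements left unset (NULL pointers behind the scenes)
B = {[], []};          % both elements explicitly assigned an empty double
C = {};  C{2} = [];    % first element unset, second explicitly assigned

% whos returns a struct with a 'bytes' field when called with an output
infoA = whos('A');
infoB = whos('B');
infoC = whos('C');
fprintf('A: %d bytes, B: %d bytes, C: %d bytes\n', ...
        infoA.bytes, infoB.bytes, infoC.bytes);

% Dereferencing an unset element still yields an empty double matrix
disp(class(A{1}));
disp(isempty(A{1}));

% isequal treats unset elements and explicit empties as equal
disp(isequal(A, B) && isequal(B, C));
```

If the description above holds on your system, A should report the fewest bytes and B the most, even though all three compare equal.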
If you fill a cell element with something like this:
B{1} = [1 2 3];
MATLAB creates a shared data copy of the right-hand-side and puts the pointer to this shared data copy into the first data spot of B. When you subsequently de-reference the cell array element like this:
B{1}
MATLAB considers this an expression and evaluates it as a shared data copy of the variable that is actually in the B{1} spot.
A struct array is very similar to a cell array in that the "data" of a struct array is pointers to the 60-byte structures of other MATLAB variables. The difference is the extra complication of indexing into the data elements via the field names.
Also, you need to be very careful using the whos function when comparing variable memory footprints, since the whos function does not take into account variable sharing. E.g., consider the following formulations for A and B and the subsequent whos command:
>> A = cell(1,1000);
>> A(:) = {1:1000};
>> B = cell(1,1000);
>> for k=1:1000; B(k) = {1:1000}; end
>> whos
Name Size Bytes Class
A 1x1000 8060000 cell array
B 1x1000 8060000 cell array
k 1x1 8 double array
Grand total is 2002001 elements using 16120008 bytes
>> isequal(A,B)
ans =
1
It sure looks like A and B take the same amount of memory, and in fact their data is equal according to the isequal function. But in fact the actual memory footprint for A is much smaller than the memory footprint for B. That is because the data elements of A, the pointers themselves, are in fact exactly equal to each other ... i.e. all 1000 elements of A point to exactly the same MATLAB variable (sub-element sharing aka reference copy). But in variable B, each element was created new each time through the loop and thus it is a brand new variable that takes more storage. The memory footprint for B really is very large as indicated. To see this:
>> feature memstats
Physical Memory (RAM):
In Use: 3545 MB (dd990000)
>> clear A
>> feature memstats
Physical Memory (RAM):
In Use: 3545 MB (dd964000)
>> clear B
>> feature memstats
Physical Memory (RAM):
In Use: 3537 MB (dd1e5000)
Clearing A didn't do much of anything to the actual used memory, but clearing B sure did!
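The numbers in the whos and feature memstats output above can be checked with a little arithmetic. The sketch below assumes the 60-byte header of a 32-bit system described earlier; the variable names are only for illustration. Note also that "In Use" is system-wide physical memory, so the differences are approximate and include some noise from other activity.

```matlab
% whos total for A (and for B): 1000 elements, each counted as
% 8000 bytes of double data (1000 values * 8 bytes) plus a 60-byte header
bytesPerElement = 1000*8 + 60;
whosTotal = 1000 * bytesPerElement;     % 8060000, matching the whos output

% "In Use" values (hex) copied from the feature memstats output above
before = hex2dec('dd990000');
afterA = hex2dec('dd964000');           % after clear A
afterB = hex2dec('dd1e5000');           % after clear B

freedByA = before - afterA;             % 180224 bytes
freedByB = afterA - afterB;             % 7860224 bytes

fprintf('whos total per variable: %d bytes\n', whosTotal);
fprintf('freed by clear A: %d bytes\n', freedByA);
fprintf('freed by clear B: %d bytes\n', freedByB);
```

Clearing B released roughly 7.9 MB, close to the 8060000 bytes that whos reported, while clearing A released under 0.2 MB even though whos reported the same total for it.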
