How do I average all the values for each column in a cell array?
Hi,
I have a cell array called new_mat. I would like to compute the mean of all the values in each column and save the result in a new array called averages. I would then have a numerical array with one row and five columns, so five values in total.
How would I do that?
I tried this
avg_cols = cellfun(@(x) mean(x, 1), new_mat, 'UniformOutput', false);
But I still get this
avg_cols =
5×5 cell array
{[ 2.9473]} {[ 0.7736]} {[24.7335]} {[-32.1028]} {[ 5.4609]}
{[ 7.9357]} {[15.6115]} {[28.3915]} {[ 51.8624]} {[ 1]}
{[38.3376]} {[62.5463]} {[35.4955]} {[ 17.6059]} {[ 35.9168]}
{[15.0732]} {[24.9668]} {[ 3.2505]} {[-21.6557]} {1×0 double}
{[57.9756]} {[49.9486]} {[53.4301]} {[ 45.9361]} {[-17.1092]}
Any ideas why the columns are not averaged?
Accepted Answer
Voss
on 13 Dec 2022
Edited: Voss
on 13 Dec 2022
cellfun operates on the contents of each cell independently, performing the specified function (in this case the function is mean(x,1)). If the outputs of those function calls are all scalars of the same class, then cellfun is able to combine all the results into an array (which it will do by default). Otherwise, you need to use 'UniformOutput',false to have cellfun return a cell array.
Examples:
C1 = {[1 1] [2]} % cell array with two cells: 1st contains a 1x2 vector, 2nd contains a scalar
try
result = cellfun(@(x)x,C1) % the function @(x)x just returns the contents of the cell
catch e % error: non-scalar output at 1st cell (which is [1 1] - obviously non-scalar).
% That is, cellfun can't combine [1 1] and [2] into a 1-by-2 matrix (the size of C1)
disp(e.message);
end
C2 = {single(1) double(2)} % cell array with two cells: both containing scalars but different classes
try
result = cellfun(@(x)x,C2) % the function @(x)x just returns the contents of the cell (again)
catch e % error: mismatch in type of outputs (single vs double)
% That is, cellfun doesn't know what class the result should be
disp(e.message);
end
In both of those examples, you must use 'UniformOutput',false to have cellfun return a cell array instead of trying to construct a numeric matrix and throwing an error. Of course, since the function is @(x)x, the resulting cell array from cellfun will be the same as what you gave it.
result = cellfun(@(x)x,C1,'UniformOutput',false)
isequal(result,C1) % the same as what you started with
result = cellfun(@(x)x,C2,'UniformOutput',false)
isequal(result,C2) % the same as what you started with
Now, to turn to your cell array:
load new_mat
new_mat % notice the cell in the 4th row, 5th column contains an empty array
new_mat consists of 24 cells that contain a scalar and one cell that contains an empty array. The function you want to run is @(x)mean(x,1), which will return a scalar on each of the 24 cells containing scalars and will return an empty array on the cell that contains an empty array. Since not all results will be scalars, you must use 'UniformOutput',false. Of course, the mean of a scalar is the scalar itself and the mean of an empty array is an empty array, so the result you get is essentially what you started with (the difference is that the 0x0 empty array becomes 1x0 when passed through mean(x,1)).
result = cellfun(@(x)mean(x,1),new_mat,'UniformOutput',false)
isequal(result,new_mat) % not the same
isequal(result([1:23 25]),new_mat([1:23 25])) % but the only difference is in the 24th cell (row 4, column 5)
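The size change described above can be checked without the data file. This is only a minimal sketch; the variable names (e, m1, m2, m3) are made up for illustration and are not part of the original code:

```matlab
% How mean behaves on an empty array and on a scalar.
e  = [];              % 0x0 double, like the contents of new_mat{4,5}
m1 = mean(e, 1);      % averaging along dimension 1 gives a 1x0 empty array
sz1 = size(m1);       % size vector of that result

m2 = mean(e);         % with no dimension argument, mean of [] is NaN instead

m3 = mean(7, 1);      % the mean of a scalar is the scalar itself

disp(sz1)
disp(m2)
disp(m3)
```

Note that m2 is NaN rather than empty, which matters if a whole column of cells ever turns out to be empty.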
OK, so that's a cellfun primer. As I said, cellfun operates on each cell independently. But you want to operate on columns of cells together, so that makes cellfun ill-suited to the task. It's straightforward to write a loop to do what you want:
N = size(new_mat,2);
result = zeros(1,N);
for ii = 1:N
result(ii) = mean([new_mat{:,ii}]);
end
disp(result)
which could also be written:
N = size(new_mat,2);
result = zeros(1,N);
for ii = 1:N
result(ii) = mean(vertcat(new_mat{:,ii}));
end
disp(result)
The difference is that the first loop horizontally concatenates the contents of the cells in a given column, and the second loop vertically concatenates them. In either case the result of that concatenation is a vector, so no dimension argument is required for mean (that is, it's mean(x), not mean(x,1)), but you could include one (it would be 2 for the horizontal concatenation case and 1 for the vertical).
Note that the dimension argument sent to mean has nothing to do with how the cells are arranged in new_mat! You want to use @(x)mean(x,1) because you are thinking of averaging a column of cells, but the function @(x)mean(x,1) when used in cellfun doesn't operate on a column of cells - it operates on one cell at a time. Each cell contains a scalar (or empty array), so the dimension argument passed in to mean() is irrelevant.
In order to take the mean of several elements at a time, you've got to concatenate them together somehow - that's what the code inside the loops does ([new_mat{:,ii}] to horizontally concatenate the contents of the cells in the ii-th column of new_mat, or vertcat(new_mat{:,ii}) to concatenate the same things vertically).
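For what it's worth, the same concatenate-then-average idea can be written without an explicit loop by using arrayfun over the column indices. This is just a sketch of an alternative, not something from the loops above; the cell array C is a hypothetical stand-in for new_mat so the snippet runs on its own:

```matlab
% Hypothetical stand-in for new_mat: 3x2 cell array with one empty cell.
C = {1, 10; 2, []; 3, 30};

% For each column index k, horizontally concatenate the contents of the
% cells in column k and average them. The empty cell contributes nothing
% to the concatenation, so column 2 is averaged over two values.
col_means = arrayfun(@(k) mean([C{:,k}]), 1:size(C,2));

disp(col_means)
```

arrayfun is iterating over column numbers here, not over cells, so each call sees a whole column at once. If every cell in a column were empty, the concatenation would be [] and the mean for that column would be NaN.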
3 Comments
Voss
on 13 Dec 2022
@lil brain: You're welcome! I'm glad it's useful.
"would it make sense to convert the cells that contain scalars to something else?"
I don't know what you'd convert them to.
If you didn't have that one cell that contains an empty array, then you could convert the entire 5-by-5 cell array to a numeric matrix. Let's say you replace that empty array in that one cell with a scalar NaN:
load new_mat
new_mat{4,5} = NaN
Now all the cells contain scalars, so you can put all those scalars together into a numeric matrix the same size as your original cell array:
% using cell2mat:
M = cell2mat(new_mat)
% or, concatenating and reshaping:
M = reshape([new_mat{:}],size(new_mat))
Now, I don't know, in your application, whether using a NaN instead of an empty array is a good idea (maybe you still need to distinguish NaN from empty, in which case you don't want to replace one with the other), but if it makes sense to do that (or use some other scalar place-holder value like Inf), then it's convenient to use a numeric matrix like above instead of a cell array where all the cells contain a scalar.
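If the NaN place-holder is acceptable, the column averages can then be taken directly on the numeric matrix, telling mean to skip the NaN. A minimal sketch, assuming a small made-up cell array C in place of new_mat (the names C, M, averages and plain are illustrative only):

```matlab
% Hypothetical stand-in for new_mat: 3x2 cell array with one empty cell.
C = {1, 10; 2, []; 3, 30};

C{2,2} = NaN;                       % replace the empty array with a scalar NaN
M = cell2mat(C);                    % 3x2 numeric matrix

averages = mean(M, 1, 'omitnan');   % column means, ignoring the NaN
plain    = mean(M, 1);              % for comparison: NaN propagates

disp(averages)
disp(plain)
```

With 'omitnan', the second column is averaged over the two remaining values; without it, that column's mean is NaN.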
More Answers (1)
Walter Roberson
on 12 Dec 2022
avg_cols = cellfun(@(x) mean(x, 1), new_mat);
You only need non-uniform output if some of the outputs can be a different size or datatype than the others, or if the output datatype is one that cannot be concatenated into an array (for example, function handles).
2 Comments
Walter Roberson
on 13 Dec 2022
Ah, you have an empty cell. mean of empty is empty. That prevents you from creating a numeric array of results.
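If you were to replace the empty entries with NaN first (note the braces: assigning into a cell array with () indexing requires a cell on the right-hand side),
mask = cellfun(@isempty, new_mat);
new_mat(mask) = {nan};
then the cellfun would return nan for those entries.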
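A self-contained sketch of that mask approach, assuming a small made-up cell array C instead of new_mat (C and vals are illustrative names):

```matlab
% Hypothetical stand-in for new_mat: 3x2 cell array with one empty cell.
C = {1, 10; 2, []; 3, 30};

mask = cellfun(@isempty, C);   % logical array, true where a cell is empty
C(mask) = {NaN};               % () indexing on a cell array needs a cell on the right

% Every cell now holds a scalar double, so cellfun can return a plain
% numeric array and 'UniformOutput',false is not needed.
vals = cellfun(@(x) mean(x, 1), C);

disp(vals)
```

The result has the same size as C and is still one value per cell, not one value per column; averaging down the columns is a separate step.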