# Indexing arrays of binned data

Dear all,

I have a 2745x1 cell array expData. For every cell in this array I define the same range (i.e. the same bin edges), and then I discretize the data in that cell based on the defined range.

Based on the discretized data in expData, I want to find the corresponding values in the cell array velData, which is illustrated in the picture below, with cell 14 taken as an example. Once the values are found, I want to take their mean for every bin.

I tried this using accumarray but with no luck:

for i = 1:length(files)

% Define the range of the bins

rng_x{i} = -0.3:0.06:0.3;

% Assign the data of x-coordinate to a predefined range

disc_x{i} = discretize(expData{i,1}(:,1),rng_x{i});

% Calculate mean of every bin

x_mean{i} = accumarray(disc_x{1,i}(:,1), expData{i,1}(:,1),[11 1], @mean);

% Define the range of the bins

rng_z{i} = 0:0.06:0.78;

% Assign the data of z-coordinate to a predefined range

disc_z{i} = discretize(expData{i,1}(:,3),rng_z{i});

% Calculate mean of every bin

z_mean{i} = accumarray(disc_z{1,i}(:,1), expData{i,1}(:,3),[13 1], @mean);

vx_disc{i} = accumarray(disc_x{1,i}(:,1), velData{i,1}(:,1),[11 1], @mean); %Did not work

end

splitapply does not work in this case, because some bins end up empty in some of the cells. If you use splitapply here, you get the following error:

"For N groups, every integer between 1 and N must occur at least once in the vector of group numbers."
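One possible way around this error, sketched here under assumptions rather than taken from the thread, is to renumber the occupied bins with findgroups before calling splitapply and then scatter the results back into a full-length vector. The data in xs and vals are hypothetical stand-ins for one column of expData{i} and velData{i}.

```matlab
% Hypothetical sample data: positions and the values to average per bin
edges = -0.3:0.06:0.3;               % 11 edges define 10 bins
nBins = numel(edges) - 1;
xs    = [-0.29; -0.28; 0.01; 0.02; 0.29; 0.5];
vals  = [1; 3; 10; 20; 7; 99];

bin   = discretize(xs, edges);       % NaN for values outside the edges (0.5 here)
valid = ~isnan(bin);                 % keep only rows that landed in a bin

% findgroups renumbers the occupied bins as 1..K without gaps,
% which is what splitapply requires
[G, binID] = findgroups(bin(valid));
m = splitapply(@mean, vals(valid), G);

% Put the means back at their original bin numbers; empty bins stay NaN
binMean = NaN(nBins, 1);
binMean(binID) = m;
```

Empty bins show up as NaN in binMean, so every cell yields a vector of the same length regardless of which bins happen to be occupied.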

##### 2 Comments

Dana
on 9 Sep 2020

### Answers (3)

Steven Lord
on 9 Sep 2020

Take a look at the groupsummary function.

% Include rng default so you generate the exact same random numbers I did

rng default

x = randn(10, 1);

y = -2:0.25:2;

d = discretize(x, y);

[values, groups] = groupsummary(x, d, @sum);

% Show the results in tabular form

xAndD = table(x, d, 'VariableNames', {'x_value', 'group'})

vAndG = table(values, groups, 'VariableNames', {'summed_value', 'corresponding_group'})

The value of summed_value in the row of vAndG whose corresponding_group entry is 10 represents the sum of the elements in the x variable in xAndD whose rows have 10 in the group variable.

group10_v1 = vAndG{vAndG.corresponding_group == 10, 1}

group10_v2 = sum(xAndD{xAndD.group == 10, 1})

group10_v1 == group10_v2 % True

Because of the rng default call I know that d has 10 in positions 5 and 8.

group10_v1 == x(5)+x(8) % True
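The same pattern can summarize a different variable than the one that was binned, which is closer to the original question. The following is a minimal sketch, assuming a hypothetical second column v with one row per row of x; rows that fall outside the bin edges are dropped first so that NaN group numbers do not form a group of their own.

```matlab
rng default
x = randn(10, 1);               % values that define the bins (same as above)
v = (1:10)';                    % hypothetical second variable, row-aligned with x
d = discretize(x, -2:0.25:2);   % NaN where x lies outside [-2, 2]

inRange = ~isnan(d);            % discard rows that did not land in any bin

% Mean of v within each occupied bin; binID lists the bin numbers that occur
[vMean, binID] = groupsummary(v(inRange), d(inRange), 'mean');

% Show the results in tabular form
table(binID, vMean)
```

Since d has 10 in positions 5 and 8, the row with binID equal to 10 holds the mean of v(5) and v(8).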

Dana
on 9 Sep 2020

Index in position 1 exceeds array bounds (must not exceed 1).

This error is an indexing error, which suggests to me that one or more of your indices in that line of code are wrong. Further, it's not reporting the error from inside the function accumarray, which means the error is happening before anything is actually passed to that function. Based on that, we conclude that the error arises in the arguments you're passing to accumarray.

Since it's indicating that an index in position 1 is wrong, and the only part of that line of code with an index in position 1 that can potentially exceed 1 is velData{i,1} (the index in position 1 exceeds 1 if i>1), that's the obvious candidate. If you do size(velData,1), do you get something greater than 1? If not, that's your problem right there.
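To illustrate the check, here is a small sketch with a hypothetical 1x3 cell array standing in for velData. It is only an assumption that velData is stored as a row; the point is to show how that shape would produce this error and how linear indexing sidesteps it.

```matlab
% Hypothetical stand-in: a 1x3 cell array (one row, three columns)
velData = {rand(5,3), rand(4,3), rand(6,3)};

nRows = size(velData, 1);   % 1 here, so velData{i,1} only works for i == 1

try
    v = velData{2,1};       % asks for row 2, which does not exist
catch err
    disp(err.message)       % reports the out-of-bounds index in position 1
end

% Linear indexing works whether the cell array is a row or a column
v = velData{2};
```

If size(velData,1) does turn out to be 1, replacing velData{i,1} with velData{i} (or velData{1,i}) in the loop would be one way to fix the indexing.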

Based on my understanding of what you're trying to do, I would think you should get what you're after if you fix that problem. However, you said, "But even if I solve the above error, I doubt whether I will get the intended results. Because accumarray is saying that the data from velData will be divided into bins specified by disc_x." Isn't that what you want? I don't understand why that's a problem.

Essentially, using your strategy here, for file i, each row of expData{i,1} is associated with the same row of velData{i,1} (ignoring the indexing error, anyway). You're then binning the rows of expData{i,1} and velData{i,1} according to the values in the first column of expData{i,1}, with the index of the corresponding bin stored in the vector disc_x{i}. Next, you want to compute the means of the first column of velData{i,1} by bin. If that's what you're after, then your code should do that (again, as long as you fix the above indexing issue first).
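As a rough sketch of that strategy for a single file, with randomly generated pos and vel as hypothetical stand-ins for expData{i} and velData{i}: the mask for NaN bin numbers and the NaN fill value are additions that the original code does not have, included because accumarray does not accept NaN subscripts and otherwise fills empty bins with 0.

```matlab
% Hypothetical stand-in data for one file: positions (x,y,z) and velocities (u,v,w)
rng default
pos = [0.6*rand(200,1) - 0.3, rand(200,1), 0.78*rand(200,1)];
vel = randn(200, 3);

edges_x = -0.3:0.06:0.3;             % 11 edges define 10 bins
nBins   = numel(edges_x) - 1;

% Bin index of every row, based on its x-coordinate
bin_x = discretize(pos(:,1), edges_x);

% discretize returns NaN outside the edges; accumarray needs positive integers
valid = ~isnan(bin_x);

% Mean x-velocity per x-bin; bins without data get NaN instead of 0
vx_mean = accumarray(bin_x(valid), vel(valid,1), [nBins 1], @mean, NaN);
```

Note that the size argument is the number of bins, which is one less than the number of edges.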

J. Alex Lee
on 10 Sep 2020

From what I can gather, it should be possible to reorganize your experimental data into an Nx6 matrix called Data, where N is the number of (coordinate, velocity) pairs and the 6 columns are organized as

| x | y | z | u | v | w |
|---|---|---|---|---|---|
|   |   |   |   |   |   |

To bin just on the (x,z) coordinates, you can use histcounts2

[~,Xedges,Zedges,binX,binZ] = histcounts2(Data(:,1),Data(:,3),nBins);

where nBins is the number of bins you want in each direction x and z.

You can use Xedges and Zedges to compute the bin centers, and binX and binZ are the assignments of each data point (row in Data) to the 1D bins along each direction.

From there you just need to use binX and binZ to determine which 2D bin a data point (row in Data) belongs to. I would then just loop through those indices to find the average velocities, but perhaps you can use "groupsummary" as suggested above, if you are allowed to define your own groups manually.
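Instead of looping, the two 1D bin indices can also be passed to accumarray as a two-column subscript matrix. This is only a sketch under assumptions: Data is filled with random numbers here, the bin counts 10 and 13 are taken from the ranges in the question, and accumarray is a substitute for the loop, not something tested on the real data.

```matlab
% Hypothetical Nx6 matrix with columns x, y, z, u, v, w
rng default
N = 500;
Data = [0.6*rand(N,1) - 0.3, rand(N,1), 0.78*rand(N,1), randn(N,3)];

nBins = [10 13];   % number of bins along x and along z (assumed values)
[~, Xedges, Zedges, binX, binZ] = histcounts2(Data(:,1), Data(:,3), nBins);

% histcounts2 reports bin 0 for points that fall outside the edges
inRange = binX > 0 & binZ > 0;

% Mean of u in every (x,z) bin; rows index x-bins, columns index z-bins
subs  = [binX(inRange), binZ(inRange)];
uMean = accumarray(subs, Data(inRange,4), nBins, @mean, NaN);

% Bin centers computed from the edges
Xcenters = (Xedges(1:end-1) + Xedges(2:end)) / 2;
Zcenters = (Zedges(1:end-1) + Zedges(2:end)) / 2;
```

The same call with Data(inRange,5) or Data(inRange,6) would give the binned means of v and w.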

##### 2 Comments

J. Alex Lee
on 10 Sep 2020

It would be nice if there were a "discretize2" function; this doesn't seem like such a niche need...
