Extract numbers from a column

I have a two-column data set, something like this:
Q= [0 15
0 18
0 12
1 12
2 15
0 17
0 12
2 7
0 12
0 11]
I need to get the mean value of each run of values in column 2 where column 1 is zero. Or at least extract those runs into different variables; they need to be kept separate. Can anybody help me, please?
Thank you

3 Comments

Sargondjani
Sargondjani on 17 Jan 2021
Edited: Sargondjani on 17 Jan 2021
Try something like:
ind0 = Q(:,1) == 0          % select the entries where Q(:,1) is 0
Mean_Q0 = mean(Q(ind0,2))   % take the mean of the entries of Q(:,2) where ind0 is true
This might not be the fastest way, but it's simple :-)
Thanks, but this gives me the mean of all the values in column 2 whose column 1 entry is zero. I need them to be separated. The nonzero values in column 1 should not just be thrown away; they also act as separators between new variables. Each run of zeros in column 1 should yield one variable from column 2.
See below...


 Accepted Answer

dpb
dpb on 17 Jan 2021
Edited: dpb on 17 Jan 2021
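% Encode "column 1 is zero" as a string of 0/1 characters; the NaN padding bounds runs at either end
v=sprintf('%d',double([nan;Q(:,1);nan].'==0));
iStrt=strfind(v,'01').';       % first row of each run of zeros
iEnd=(strfind(v,'10')-1).';    % last row of each run of zeros
grpsums=arrayfun(@(i1,i2)sum(Q(i1:i2,2)),iStrt,iEnd).';   % one value per run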
This is the same idea, except that it uses the string form to find the '01' and '10' transitions in the vector. It first converts the numeric values to a logical T|F test against zero so that each comparison is an exact transition.
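To make the transitions concrete, here is a minimal sketch that walks the sample Q from the question through each step. It uses mean rather than sum, since the mean is what was asked for; the variable names isZero and grpmeans are only illustrative.

```matlab
% Sample data from the question
Q = [0 15; 0 18; 0 12; 1 12; 2 15; 0 17; 0 12; 2 7; 0 12; 0 11];

% NaN padding compares false against zero, so every run of zeros is
% bracketed by a '0' character on both sides, even at the ends of Q
isZero = [nan; Q(:,1); nan].' == 0;
v = sprintf('%d', double(isZero));    % '011100110110'

% '01' sits one position before a run starts in the padded string,
% which is exactly the start row in the unpadded Q
iStrt = strfind(v, '01').';           % [1; 6; 9]

% '10' starts on the last '1' of a run in the padded string,
% so subtract 1 to get the last row in the unpadded Q
iEnd = (strfind(v, '10') - 1).';      % [3; 7; 10]

grpmeans = arrayfun(@(i1,i2) mean(Q(i1:i2,2)), iStrt, iEnd);   % [15; 14.5; 11.5]
```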
If you use the arrayfun solution, remember that the anonymous function captures the values of any variables not passed as arguments at the time it is created, and that those are NOT updated if the variable changes later. Hence, if Q changes, executing the function above with just a new set of start/stop indices would still be summing over those positions in the old Q. Of course, when the whole line is executed, the anonymous function is redefined with the Q at the time; the above really only "bites" if you define the anonymous function as a function handle and then reuse that handle.
As noted, it still seems to me there ought to be an easy way to build a group subs index by section and then use accumarray directly, but how to increment the groups more efficiently than by again using the above indices just never came to me...
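A small sketch of that capture behavior, using a made-up three-row Q and an illustrative handle name fnSum:

```matlab
Q = [0 15; 0 18; 0 12];

% The handle stores a copy of Q as it exists on this line
fnSum = @(i1,i2) sum(Q(i1:i2,2));
before = fnSum(1,3);      % 15 + 18 + 12 = 45

% Changing Q afterwards does not touch the copy held by the handle
Q(:,2) = 0;
after = fnSum(1,3);       % still 45, computed from the old Q

% Redefining the handle picks up the current Q
fnSum = @(i1,i2) sum(Q(i1:i2,2));
refreshed = fnSum(1,3);   % 0
```

Passing Q as an explicit argument, e.g. @(Q,i1,i2) sum(Q(i1:i2,2)), avoids the issue altogether if the handle is going to be reused.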
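For what it's worth, one possible way to build such a subs vector is a cumsum over the run starts. This is only a sketch of that idea and was not part of the solution posted above; the names isZero, startsRun and grp are illustrative.

```matlab
Q = [0 15; 0 18; 0 12; 1 12; 2 15; 0 17; 0 12; 2 7; 0 12; 0 11];

isZero = Q(:,1) == 0;

% A run starts wherever a zero row is not preceded by another zero row
startsRun = isZero & [true; ~isZero(1:end-1)];

% Counting the starts so far labels every row with its run number
grp = cumsum(startsRun);              % [1 1 1 1 1 2 2 2 3 3].'

% Keep only the zero rows and let accumarray apply mean per run
grpmeans = accumarray(grp(isZero), Q(isZero,2), [], @mean);   % [15; 14.5; 11.5]
```

The separator rows carry the label of the preceding run, which is why they are masked out with isZero before the call to accumarray.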

6 Comments

Thanks. The reshape function won't work, as my zero runs are not equal in size... Do you have an idea how to overcome this?
Thanks :)
dpb
dpb on 17 Jan 2021
Edited: dpb on 17 Jan 2021
The reshape only turns the vector of start/stop locations into two columns; there are two entries for each group of ones in the original vector, so you always end up with an even number of elements. I haven't checked for certain, but the above may require that the first set be zeros; I just threw it together quickly for the specific data set.
It is not trying to reshape the vector itself based on assuming the same number of elements in each group; notice there is a group of three and two groups of two in the sample for which it is demonstrated to work.
Can you give a dataset it doesn't work for?
(Actually, I like the idea of looking for a somewhat more generic and perhaps more elegant solution; let me see about that...)
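The reshape-based code itself is not shown in this thread, so the following is only a guess at its general shape: locate the transitions with diff and find, then reshape the locations into start/stop pairs. Padding the logical vector with false at both ends is an assumption added here; it guarantees that every run has both a start and a stop, so the number of transitions is always even regardless of how the data begins or ends.

```matlab
Q = [0 15; 0 18; 0 12; 1 12; 2 15; 0 17; 0 12; 2 7; 0 12; 0 11];

isZero = Q(:,1) == 0;

% +1 marks the first row of a run, -1 marks the row just after its end
d = diff([false; isZero; false]);

% Transitions alternate start, stop, start, stop, ...
idx = find(d ~= 0);                % [1; 4; 6; 8; 9; 11]

% Two entries per run, so reshape into one [start stop] pair per row
idx = reshape(idx, 2, []).';
iStrt = idx(:,1);                  % [1; 6; 9]
iEnd  = idx(:,2) - 1;              % [3; 7; 10]
```

The runs can have any lengths; only the list of transition locations is reshaped, not the data.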
ShonyE
ShonyE on 17 Jan 2021
Edited: ShonyE on 17 Jan 2021
Thanks for the explanation. It says: Error using reshape
Product of known dimensions, 2, not divisible into total number of elements, 53
that's why I thought the reshape isn't working.
Please find attached one of the real data samples.
Thanks
I tried to make the pieces neater on the fly in the edit window instead of testing first... there was a glitch above; see the corrected version, which seems to work...
The only other options I could come up with right now are essentially the same thing, except that they use a char() vector for the search operation; they still need the arrayfun over the start/stop index arrays.
My thought to get a subs vector built for accumarray() didn't pan out easily; I'm sure I'm just overlooking the obvious.
ShonyE
ShonyE on 18 Jan 2021
Edited: ShonyE on 18 Jan 2021
Great, thanks. It works! The only thing I've changed is the "sum" operator to "mean" in the last row. Also, by duplicating it with the "std" operator, I'm calculating the standard deviation of those mean values.
Thanks a lot again!
Oh, yeah, I dunno how I got 'sum'; I do see it was mean that was requested.
If you want multiple statistics, you can optimize a little by writing a function that returns multiple outputs instead of using the anonymous function, at the cost of having to write the function itself.
Alternatively, if you have the Statistics Toolbox, you could use the grpstats function with an empty set ("[]") for the grouping variable and the list of desired statistics; TMW has already written that function for you in the additional toolbox.
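A minimal sketch of the multiple-output idea, assuming the iStrt and iEnd vectors computed earlier and a hypothetical helper named meanAndStd. Each segment is extracted once and both statistics are returned from the same call. In a script, the local function goes at the end of the file (supported since R2016b).

```matlab
Q = [0 15; 0 18; 0 12; 1 12; 2 15; 0 17; 0 12; 2 7; 0 12; 0 11];
iStrt = [1; 6; 9];     % start rows of the zero runs in the sample data
iEnd  = [3; 7; 10];    % end rows of the zero runs in the sample data

% arrayfun requests two outputs per call and collects each into its own vector
[grpmeans, grpstds] = arrayfun(@(i1,i2) meanAndStd(Q(i1:i2,2)), iStrt, iEnd);

function [m, s] = meanAndStd(x)
% Hypothetical helper: mean and sample standard deviation of one segment
m = mean(x);
s = std(x);
end
```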


More Answers (0)

Asked:

on 17 Jan 2021

Commented:

dpb
on 18 Jan 2021
