Method to count number of observations by category

What would be the best method to count the observations for each category number listed in the last column?
>> s
s =
22.9924 46.8888 1.4597 4.8547 1.3379 3.8795 24.8686 3.0000
19.7965 40.1072 1.4644 5.2458 1.3262 3.0253 19.7573 3.0000
12.0315 70.3394 1.8392 3.6708 1.2664 2.2959 13.2066 3.0000
0.0854 4.4000 1.3399 1.6076 1.2214 1.9499 1.7144 2.0000
-10.7796 26.7116 1.0871 1.0044 1.2722 1.6683 -12.4818 1.0000
1.3617 15.7695 1.0896 1.4550 1.2444 1.9098 1.2284 2.0000
14.2101 11.4473 1.2464 4.0444 1.2413 2.6900 16.6996 3.0000
12.9187 60.9303 1.5128 3.4454 1.2593 2.2976 15.4214 3.0000
4.8807 11.4097 1.4971 2.0412 1.2105 1.7832 5.7704 2.0000
5.7536 22.7122 1.1432 1.8351 1.2305 2.1097 9.1543 3.0000
Another example:
s =
-3.8660 3.8000 1.8212 1.0659 0.9359 1.1195 -4.2488 4.0000
5.6911 1.6000 1.8920 1.0000 1.0772 1.1631 7.5085 6.0000
1.6582 31.9250 1.1889 1.0221 1.1091 1.1899 2.0074 5.0000
-3.0589 1.1000 1.6842 1.0184 0.9804 1.1165 -4.5364 4.0000
I would like to be alerted that I have only one observation of, say, 1.00 in the first piece of data, but only one observation each of 5 and 6 in the second piece.
Thanks, Kate

 Accepted Answer

Cedric on 27 Jun 2014
Edited: Cedric on 27 Jun 2014
How about
counts = accumarray( s(:,end), 1 ) ;
In the first case, it gives
counts =
1
3
6
In the second case
counts =
0
0
0
2
1
1
As you can see, the i-th element of the vector counts is the number of observations in category i. You can then process counts as needed, e.g.
is1 = counts == 1 ;
if any( is1 )
fprintf( 'Warning, the following data blabla..\n' ) ;
disp( find( is1 )) ;
end
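Note that accumarray needs positive integer subscripts, so the one-liner above assumes the last column holds positive integers, and it reports a count of 0 for every category value that never occurs. A minimal sketch, assuming the categories might be arbitrary numbers, maps them to indices with unique first. The variable names cats, idx and onlyOnce are only illustrative:

```matlab
% Example data: last column of the second matrix from the question.
category = [4; 6; 5; 4];

% Map each distinct value to an index 1..numel(cats).
[cats, ~, idx] = unique(category);

% counts(k) is the number of rows whose category equals cats(k).
counts = accumarray(idx(:), 1);

% Categories that are observed exactly once.
onlyOnce = cats(counts == 1);

if ~isempty(onlyOnce)
    fprintf('Categories seen only once: %s\n', mat2str(onlyOnce.'));
end
```

For this data, counts pairs with cats = [4; 5; 6], so onlyOnce holds 5 and 6 and no zero entries appear for unused category numbers.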

More Answers (2)

I believe this is what you want:
s =[...
22.9924 46.8888 1.4597 4.8547 1.3379 3.8795 24.8686 3.0000
19.7965 40.1072 1.4644 5.2458 1.3262 3.0253 19.7573 3.0000
12.0315 70.3394 1.8392 3.6708 1.2664 2.2959 13.2066 3.0000
0.0854 4.4000 1.3399 1.6076 1.2214 1.9499 1.7144 2.0000
-10.7796 26.7116 1.0871 1.0044 1.2722 1.6683 -12.4818 1.0000
1.3617 15.7695 1.0896 1.4550 1.2444 1.9098 1.2284 2.0000
14.2101 11.4473 1.2464 4.0444 1.2413 2.6900 16.6996 3.0000
12.9187 60.9303 1.5128 3.4454 1.2593 2.2976 15.4214 18.0000
4.8807 11.4097 1.4971 2.0412 1.2105 1.7832 5.7704 2.0000
5.7536 22.7122 1.1432 1.8351 1.2305 2.1097 9.1543 3.0000]
category = s(:, end); % Extract last column
% Get histogram of it.
edges = unique(category); % Edges of histogram
counts = histc(category, edges)
% Find indexes for which the count is 1.
singleCounts = find(counts == 1)
for k = 1 : length(singleCounts)
thisIndex = singleCounts(k);
numberAtIndex = edges(thisIndex);
message = sprintf('The category %d shows up only once!', numberAtIndex);
uiwait(helpdlg(message));
end
It takes the histogram of the last column. Any number that shows up only once in that column will pop up a message to the user. I changed your s slightly so that the number 1 and the number 18 each show up only once in the last column.
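histc is marked as not recommended in newer MATLAB releases. As a hedged sketch, assuming a release that provides histcounts, the same counts can be obtained by appending Inf to the edges so that the last category gets its own bin. The name singles is only illustrative, and fprintf is used here instead of a dialog box:

```matlab
% Last column of the modified matrix above.
category = [3; 3; 3; 2; 1; 2; 3; 18; 2; 3];

edges = unique(category);                  % sorted distinct categories

% Bins are [e1,e2), [e2,e3), ..., [eN,Inf], so each bin holds exactly
% one category value. histcounts returns a row vector, hence the transpose.
counts = histcounts(category, [edges; Inf]).';

singles = edges(counts == 1);              % categories seen exactly once
for k = 1 : numel(singles)
    fprintf('The category %d shows up only once!\n', singles(k));
end
```

With this data the loop reports categories 1 and 18, matching the two messages produced by the histc version.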

4 Comments

Thanks, but I'm actually looking for ANY value which is only seen one time, not just observations of s==1. See above discussion.
Did you actually run my code? Maybe you ran it before I added the edit to have 18 also show up only once. Run it all the way through. You will find that it pops up two messages. The first one says that 1 shows up only once, and the second one says that 18 shows up only once. Isn't that what you got?
Kate, my code runs fine on your new, additional example with 5 and 6. Here, look at this:
s =[...
-3.8660 3.8000 1.8212 1.0659 0.9359 1.1195 -4.2488 4.0000
5.6911 1.6000 1.8920 1.0000 1.0772 1.1631 7.5085 6.0000
1.6582 31.9250 1.1889 1.0221 1.1091 1.1899 2.0074 5.0000
-3.0589 1.1000 1.6842 1.0184 0.9804 1.1165 -4.5364 4.0000]
category = s(:, end); % Extract last column
% Get histogram of it.
edges = unique(category); % Edges of histogram
counts = histc(category, edges )
% Find indexes for which the count is 1
singleCounts = find(counts == 1)
for k = 1 : length(singleCounts)
thisIndex = singleCounts(k);
numberAtIndex = edges(thisIndex);
message = sprintf('The category %d shows up only once!', numberAtIndex);
fprintf('%s\n', message);
uiwait(helpdlg(message));
end
In addition to the popup message box, you'll see this in the command window:
s =
-3.8660 3.8000 1.8212 1.0659 0.9359 1.1195 -4.2488 4.0000
5.6911 1.6000 1.8920 1.0000 1.0772 1.1631 7.5085 6.0000
1.6582 31.9250 1.1889 1.0221 1.1091 1.1899 2.0074 5.0000
-3.0589 1.1000 1.6842 1.0184 0.9804 1.1165 -4.5364 4.0000
counts =
2
1
1
singleCounts =
2
3
The category 5 shows up only once!
The category 6 shows up only once!
Tell me if this doesn't do exactly what you asked for.
Thanks Image Analyst, the new code works. I had run your previous code when I replied.

What about:
size(s(s == 1.00), 1)
which would give you the number of occurrences in s meeting the criterion.
Or maybe something like this:
if any(s(:) == 1.00)
msgbox('we have a winner')
end
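Note that s == 1.00 is tested against every element of s, not just the category column. A small sketch, assuming only the last column is of interest, restricts the test to that column and counts the matches with nnz. The names target, nInColumn and nInMatrix are hypothetical:

```matlab
% Second example matrix from the question.
s = [-3.8660  3.8000 1.8212 1.0659 0.9359 1.1195 -4.2488 4.0000
      5.6911  1.6000 1.8920 1.0000 1.0772 1.1631  7.5085 6.0000
      1.6582 31.9250 1.1889 1.0221 1.1091 1.1899  2.0074 5.0000
     -3.0589  1.1000 1.6842 1.0184 0.9804 1.1165 -4.5364 4.0000];

target = 4;                              % category value to look for
nInColumn = nnz(s(:, end) == target);    % matches in the last column only
nInMatrix = nnz(s == 1.00);              % matches anywhere in s

fprintf('%d rows have category %d\n', nInColumn, target);
```

Here nInMatrix is nonzero even though no row has category 1, because the value 1.0000 appears in the fourth column.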

3 Comments

Is there a method where you can avoid specifying what s equals (e.g. s(s==1.00))? In some instances I might have only one observation of 3 or 18, etc.
OK, I'm not quite sure I understand, sorry. If we take your matrix above for example, would you like to know how many times the value in the last column is present in the corresponding row? Or in the whole matrix? Or would you like to know that there are six 3's, three 2's and one 1?
To illustrate my question, here is a different piece of data. Here I would want to know that 5 and 6 are only seen once:
s =
-3.8660 3.8000 1.8212 1.0659 0.9359 1.1195 -4.2488 4.0000
5.6911 1.6000 1.8920 1.0000 1.0772 1.1631 7.5085 6.0000
1.6582 31.9250 1.1889 1.0221 1.1091 1.1899 2.0074 5.0000
-3.0589 1.1000 1.6842 1.0184 0.9804 1.1165 -4.5364 4.0000

Asked: 26 Jun 2014
Commented: 27 Jun 2014
