Putting similar numbers into groups within an array

I have an array of numbers that looks something like the one below. I want to split the array into subgroups in which the numbers are all within 2 of each other. In this case, there would be 3 groups. Is there an easy way to do this? I need the method to be automated so that it works for arrays 100 entries long.
7340.1
7340.3
7340.6
7349.0
7349.4
7358.0
7358.1
7358.2
7358.7
% The new groups would look as follows:
% Group 1
7340.1
7340.3
7340.6
% Group 2
7349.0
7349.4
% Group 3
7358.0
7358.1
7358.2
7358.7

Accepted Answer

Image Analyst on 19 Jun 2017
If you have the Statistics and Machine Learning Toolbox (for pdist2) and the Image Processing Toolbox (for bwlabel), you can do this:
m = [...
7340.1
7340.3
7340.6
7349.0
7349.4
7358.0
7358.1
7358.2
7358.7]
% First sort m so that nearby values have adjacent indexes.
m = sort(m, 'ascend')
% Get distance of every element to every other element.
distances = pdist2(m, m)
% Find out which pairs are within 2 of each other.
within2 = distances > 0 & distances < 2
% Erase upper triangle to get rid of redundancy
numElements = numel(m);
t = logical(triu(ones(numElements, numElements), 0))
within2(t) = 0
% Label each group with an ID number.
[labeledGroups, numGroups] = bwlabel(within2)
% Put each group into a cell array
for k = 1 : numGroups
    [rows, columns] = find(labeledGroups == k);
    indexes = unique([rows, columns]);
    groups{k} = m(indexes);
end
celldisp(groups); % Display the results in the command window.
It gives you your desired result. Should work for other arrays also, though I didn't test it with any others.
  6 Comments
Mr.Alb on 22 Mar 2022
Edited: Mr.Alb on 22 Mar 2022
Hi,
I'm still facing @Ayca Altay's issue when groups hold just one element. How can I modify the previous script? I have an array like this:
a = [4.17, 8.33, 12.5, 16.67, 20.83, 25, 2.085, 6.245, 10.415, 14.585, 18.745, 22.915, 0.005, 4.165, 8.335, 12.505, 16.665, 20.835, 2.08, 2.08, 6.25, 10.42, 14.58, 18.75];
Here there are 13 groups (the number may vary depending on how a is made up; this is just an example), but I find only 10 groups with the method above. In particular, I lose the first group (1 occurrence at about 0) and the last two groups (1 occurrence each at about 23 and 25).
How can I solve this?
Many thanks
Image Analyst on 22 Mar 2022
You could try kmeans() and have it automatically check a variety of k values for the "best" k (number of groups).
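For one-dimensional data, the single-element groups can also be kept without any toolbox by sorting and splitting wherever the gap between neighbors is too large. This is a minimal sketch, not the method from the accepted answer. It assumes that "within 2 of each other" means consecutive sorted values differ by less than 2, so a long chain of close values may span more than 2 overall; the names tolerance, startsNewGroup, and groupIds are made up for illustration.

```matlab
% Mr.Alb's data, written with square brackets so that it is a valid row vector.
a = [4.17, 8.33, 12.5, 16.67, 20.83, 25, 2.085, 6.245, 10.415, 14.585, ...
     18.745, 22.915, 0.005, 4.165, 8.335, 12.505, 16.665, 20.835, ...
     2.08, 2.08, 6.25, 10.42, 14.58, 18.75];
tolerance = 2;  % Assumed threshold: a gap of this size or more starts a new group.

sorted = sort(a(:));                         % Column vector in ascending order.
startsNewGroup = diff(sorted) >= tolerance;  % True where the gap to the previous value is too large.
groupIds = cumsum([true; startsNewGroup]);   % Running count of group starts gives a group ID per element.
numGroups = groupIds(end);

% Collect each group into its own cell. Single-element groups are kept.
groups = accumarray(groupIds, sorted, [], @(v) {v});
celldisp(groups);
```

With this array the sketch reports 13 groups, including the single values near 0, 23, and 25. Duplicate values such as the two 2.08 entries stay in the same group because their gap is 0.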


More Answers (1)

Image Analyst on 19 Jun 2017
Looks like it could be a job for dbscan https://en.wikipedia.org/wiki/DBSCAN
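A rough sketch of what the dbscan route might look like on the data from the question. It assumes the dbscan function from the Statistics and Machine Learning Toolbox (available from R2019a), and the choices epsilon = 2 and minpts = 1 are assumptions made here, not values given in the thread. With minpts = 1 every point counts as a core point, so nothing is labeled as noise and isolated values end up in their own clusters.

```matlab
m = [7340.1; 7340.3; 7340.6; 7349.0; 7349.4; 7358.0; 7358.1; 7358.2; 7358.7];

epsilon = 2;  % Assumed neighborhood radius, taken from the "within 2" requirement.
minpts = 1;   % Assumed: lets a single isolated value form its own cluster.

% idx holds one cluster label per row of m.
idx = dbscan(m, epsilon, minpts);

% Put each cluster into a cell array.
ids = unique(idx);
numGroups = numel(ids);
groups = cell(numGroups, 1);
for k = 1 : numGroups
    groups{k} = m(idx == ids(k));
end
celldisp(groups);
```

Like connected-component labeling, this chains neighbors together, so a cluster can be wider than epsilon if its members are linked through intermediate values.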
  2 Comments
Alamgir M S M on 10 Mar 2019
This is my data:
[705.7142857 705.7142857 173.4285714 84.71428571 232.5714286 232.5714286 114.2857143 55.14285714 25.57142857 74.85714286 35.42857143 15.71428571 5.857142857 5.857142857].
I want to put identical values into the same group. This is a 14*1 matrix that contains 3 pairs of identical values (705.7142857 705.7142857, 232.5714286 232.5714286, and 5.857142857 5.857142857). I want these 3 pairs in 3 groups and the remaining values in 8 other groups.
Can anybody help me code this?
Image Analyst
Image Analyst on 10 Mar 2019
Use unique() and setdiff(). On the remainder, use kmeans() to group into 8 groups. Should be easy. If you can't figure it out, let me know.
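One possible way to sketch this uses the third output of unique() together with accumarray(), in place of the setdiff() and kmeans() steps. That swap works here only because the 8 remaining values are all distinct, so each one is its own group. The sketch assumes the repeated values are exactly equal; values that differ by rounding error would need a tolerance instead. The variable names are made up for illustration.

```matlab
data = [705.7142857 705.7142857 173.4285714 84.71428571 232.5714286 232.5714286 ...
        114.2857143 55.14285714 25.57142857 74.85714286 35.42857143 15.71428571 ...
        5.857142857 5.857142857]';   % 14-by-1 column vector.

% values: distinct entries in order of first appearance.
% groupIds: for every element of data, the index of its entry in values.
[values, ~, groupIds] = unique(data, 'stable');

counts = accumarray(groupIds, 1);    % How many times each distinct value occurs.
pairedValues = values(counts > 1);   % Values that occur more than once.
singleValues = values(counts == 1);  % Values that occur exactly once.

% One cell per group: repeated values share a cell, the rest get a cell each.
groups = accumarray(groupIds, data, [], @(v) {v});
celldisp(groups);
```

For this data the sketch gives 11 groups in total: 3 that hold the repeated values and 8 that hold one value each.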
