A set of data has two large intervals. Can it be divided into two sets of data? Thanks for your answer.
John D'Errico on 20 Mar 2022
Are you asking how to divide this into two separate curves, the upper half, and the lower half? I can only assume that is your goal.
There could be many ways to solve this problem. For example, I might use knnsearch to find the set of points that are nearest to each point, perhaps the 3 nearest neighbors, and then use a graph theoretic scheme to cluster the two groups. But perhaps the easiest way is to build a Delaunay triangulation of the entire set. For example:
T = delaunayn(dataB1);
% now, list the set of all edges in that triangulation.
edges = [T(:,[1 2]);T(:,[1 3]);T(:,[2 3])];
% sort the endpoints within each edge, so that edges listed twice become identical rows that unique can remove.
edges = sort(edges,2);
edges = unique(edges,'rows');
% compute the length of each edge.
edgeLen = sqrt(sum((dataB1(edges(:,1),:) - dataB1(edges(:,2),:)).^2,2));
% discard any edges with a length of more than 5.
k = edgeLen > 5;
edges(k,:) = [];
% see how we did so far, by plotting the edges we have identified
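The plotting commands for this step are not shown here, so the following is only a minimal sketch of one way to draw the surviving edges. The array xy is a hypothetical synthetic stand-in for dataB1 (two noisy curves about 20 units apart), used so that the snippet runs on its own. With the real data, replace xy by dataB1.

```matlab
% Hypothetical stand-in for dataB1: two noisy curves separated by a large gap.
rng(0);                                  % reproducible synthetic data
x     = linspace(0,100,150)';
yLow  = 2*sin(x/10) + 0.2*randn(150,1);
yHigh = 2*sin(x/10) + 20 + 0.2*randn(150,1);
xy    = [x yLow; x yHigh];               % 300-by-2, plays the role of dataB1

% Same steps as above: triangulate, list unique edges, drop the long ones.
T = delaunayn(xy);
edges = unique(sort([T(:,[1 2]); T(:,[1 3]); T(:,[2 3])],2),'rows');
edgeLen = sqrt(sum((xy(edges(:,1),:) - xy(edges(:,2),:)).^2,2));
edges(edgeLen > 5,:) = [];

% Plot the points, then each remaining edge as its own line segment.
% Each column of the 2-by-m matrices below holds the two endpoints of one edge.
plot(xy(:,1),xy(:,2),'k.')
hold on
plot([xy(edges(:,1),1) xy(edges(:,2),1)]', ...
     [xy(edges(:,1),2) xy(edges(:,2),2)]','b-')
hold off
```

If the threshold is reasonable, no blue segment should span the gap between the two curves.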
That looks pretty good. I've managed to segregate the two curves into two cohesive groups by that scheme. At least, I have done so visually. And yes, I know there were many other ways I could have done this much. But now, can we break the curves into two segments? Again, the best way seems graph theoretic in nature. Take a look.
G = graph(edges(:,1),edges(:,2));
Now we can clearly see two disjoint segments. Split them easily now, as...
segmentId = conncomp(G);
C1 = find(segmentId == 1);
C2 = find(segmentId == 2);
So not that difficult. The only thing I had to do was to choose a distance threshold for the edges. With some more thought, I could probably have done that automatically too.
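As for choosing the threshold automatically, here is one possible approach, offered as a sketch and not as the method used above. Weight each Delaunay edge by its length, build a minimum spanning tree, and delete the single longest edge of that tree. This assumes there are exactly two curves, and that the gap between them is larger than any gap between neighboring points along one curve. Again, xy is a hypothetical synthetic stand-in for dataB1.

```matlab
% Hypothetical stand-in for dataB1: two noisy curves separated by a large gap.
rng(0);                                  % reproducible synthetic data
x     = linspace(0,100,150)';
yLow  = 2*sin(x/10) + 0.2*randn(150,1);
yHigh = 2*sin(x/10) + 20 + 0.2*randn(150,1);
xy    = [x yLow; x yHigh];               % rows 1:150 lower curve, 151:300 upper

% Unique Delaunay edges and their lengths, as before, but keep ALL edges.
T = delaunayn(xy);
edges = unique(sort([T(:,[1 2]); T(:,[1 3]); T(:,[2 3])],2),'rows');
edgeLen = sqrt(sum((xy(edges(:,1),:) - xy(edges(:,2),:)).^2,2));

% Weighted graph, then its minimum spanning tree.
G = graph(edges(:,1),edges(:,2),edgeLen);
M = minspantree(G);

% The longest tree edge is the one bridging the two curves
% (under the assumption stated above). Removing it splits the tree in two.
[~,longest] = max(M.Edges.Weight);
M = rmedge(M,longest);

segmentId = conncomp(M);
C1 = find(segmentId == 1);
C2 = find(segmentId == 2);
```

The coordinates of each curve are then xy(C1,:) and xy(C2,:). If the data had more than two curves, or outliers far from both curves, this simple cut of one edge would not be enough.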