Core points of clusters
    sreelekshmi ms
 on 25 Feb 2020
  
    
    
    
    
    Commented: Ameer Hamza
      
      
 on 8 Mar 2020
I need to find the center points of clusters. I used dbscan for clustering, and now I need to find the core points of these clusters. I used corepts, but it gives a logical array. How can I find the core points of those clusters, or at least one point contained in each cluster? Any help would be appreciated.
[idx, corepts] = dbscan(asc,epsilon,minpts);
Accepted Answer
  Ameer Hamza
      
      
 on 7 Mar 2020
As discussed here, https://stackoverflow.com/questions/52364959/how-to-find-center-points-of-dbscan-clusrering-in-sklearn and here, https://www.quora.com/Is-there-anything-equivalent-to-a-centroid-in-DBSCAN, DBSCAN has no notion of a cluster center. However, it does identify core points. The corepts output of dbscan is a logical vector that flags which rows of the data are core points, so you can extract the core points by indexing the data with it (use asc in place of data in your code):
core = data(corepts, :);
This gives you all rows containing core points. Similarly, you can get the cluster number of each of these core points:
corr_idx = idx(corepts, :);
As an example, try this
data=xlsread('glass.xlsx');
minpts=6;
epsilon=4;
[idx, corepts] = dbscan(data,epsilon,minpts);
fig1 = figure();
gscatter(data(:,1),data(:,2),idx);
fig2 = figure();
core=data(corepts, :);
corr_idx = idx(corepts, :);
gscatter(core(:,1),core(:,2),corr_idx);
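If glass.xlsx is not at hand, the same indexing can be tried on synthetic data. The following is only a minimal sketch: the two-blob data set and the epsilon and minpts values are assumptions chosen for illustration, not values from the question. It also picks one core point per cluster, which covers the "at least one point contained in each cluster" part of the question.

```matlab
% Synthetic stand-in for the real data (assumption: two well-separated 2-D blobs).
rng(0);                                   % make the random numbers reproducible
data = [randn(50,2); randn(50,2) + 10];   % 100-by-2 matrix, one row per point
epsilon = 1;                              % illustrative neighborhood radius
minpts  = 5;                              % illustrative minimum neighbor count

[idx, corepts] = dbscan(data, epsilon, minpts);

core      = data(corepts, :);   % coordinates of the core points
core_idx  = idx(corepts);       % cluster label of each core point
core_rows = find(corepts);      % row numbers of the core points within data

% One representative per cluster: the first core point carrying each label.
[labels, first] = unique(core_idx);
representatives = core(first, :);
```

Here find(corepts) converts the logical array into ordinary row numbers, which is useful if you need to refer back to the original rows. Core points are never labeled as noise, so core_idx contains only positive cluster numbers.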
4 Comments
  Ameer Hamza
      
      
 on 8 Mar 2020
I think you misunderstood the meaning of core points. All the points shown in the image in my last comment are core points of that cluster. A core point in DBSCAN is not the center of a cluster. If you want to find the five points closest to the center of each cluster (with the center calculated as in my last comment, by averaging the core points of the cluster), you can try the following code
clc;
clear;
data=xlsread('glass.xlsx');
minpts=6;
epsilon=4;
[idx, corepts] = dbscan(data,epsilon,minpts);
fig1 = figure();
gscatter(data(:,1),data(:,2),idx);
fig2 = figure();
ax = axes();
hold on;
core=data(corepts, :);
core_idx = idx(corepts, :);
gscatter(core(:,1),core(:,2),core_idx);
centers = splitapply(@(x) mean(x, 1), core, core_idx);
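nClusters = size(centers, 1);   % number of clusters found (6 for this data)
gscatter(centers(:,1), centers(:,2), (1:nClusters)');
% Newly added lines come first in ax.Children, so the first nClusters
% children are the center markers.
for i = 1:nClusters
    ax.Children(i).Marker = 'x';
    ax.Children(i).MarkerSize = 30;
    ax.Children(i).LineWidth = 10;
end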
clusters = splitapply(@(x) {x}, core, core_idx);
closest_points = cell(1, numel(clusters));   % one cell per cluster
closest_idx = cell(1, numel(clusters));
for i = 1:length(clusters)
    [~, index] = mink(sum((clusters{i}-centers(i,:)).^2,2), 5, 1);
    closest_points{i} = clusters{i}(index,:);
    closest_idx{i} = i*ones(size(closest_points{i},1),1);
end
closest_points = cell2mat(closest_points');
closest_idx = cell2mat(closest_idx');
g = gscatter(closest_points(:,1), closest_points(:,2), closest_idx);
[g.MarkerSize] = deal(30);
[g.Color] = deal([0 0 0]);
In the resulting plot, the closest points are shown in black. Note that the distance is calculated in all 11 dimensions, so the points may not look closest to the center in the 2-D plot, even though they are the closest once all 11 dimensions are considered.
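To see why a point can look close in the plot yet be far away overall, here is a small sketch with made-up numbers. The three 3-D points and the center are hypothetical and not taken from glass.xlsx. Only the first two columns would appear in a gscatter plot, but the distance uses every column.

```matlab
% Hypothetical 3-D points; only the first two columns would be plotted.
points = [0.1 0.1 9;    % very close to the center in 2-D, far in column 3
          1   1   1;    % moderately close in every dimension
          3   3   0];   % far in 2-D
center = [0 0 0];

% Squared Euclidean distance using all columns (as in the code above).
d_full = sum((points - center).^2, 2);

% Squared Euclidean distance using only the two plotted columns.
d_2d = sum((points(:,1:2) - center(1:2)).^2, 2);

[~, nearest_full] = min(d_full);   % row 2 is nearest over all columns
[~, nearest_2d]   = min(d_2d);     % row 1 is nearest in the plotted columns
```

The nearest point differs between the two calculations, which is the effect described above for the 11-dimensional data.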
