How can I cluster a data set with k-means (k = 2) using all of its columns? The data set is here: https://archive.ics.uci.edu/ml/machine-learning-databases/hepatitis/

question about k means clustering

KALYAN ACHARJYA on 3 Jan 2021

Edited: KALYAN ACHARJYA on 3 Jan 2021

What is the problem? Issue with dataset or k-means?

Note: if you want help, then you need to make it easy to be helped.

Eeengineer on 3 Jan 2021

When I use two columns of the dataset it works, but when I try to use all columns it doesn't work.

load hepatitis;
X=hepatitis(:,16:17);
figure;
plot(X,'k*');
title 'Hepatitis Data';
hold on;
opts = statset('Display','final');
[idx,C] = kmeans(X,2,'Distance','sqeuclidean',...
    'Replicates',5,'Options',opts);

Image Analyst on 3 Jan 2021

I saved "hepatitis.data" from that web site and the code didn't work:

load('hepatitis.data')
X=hepatitis(:,16:17);
figure;
plot(X,'k*');
title 'Hepatitis Data';
hold on;
opts = statset('Display','final');
[idx,C] = kmeans(X,2,'Distance','sqeuclidean',...
    'Replicates',5,'Options',opts);

Please post the actual data file and code that actually works with it.
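
For what it's worth, load() probably fails here because the downloaded file is plain text rather than a MAT-file or a purely numeric table. Below is a minimal sketch of one way to read such a file, assuming it is comma-separated and marks missing values with '?' (check this against your copy) and assuming R2019a or later for readmatrix(). The three sample rows written to a temporary file are made up for illustration only; point fname at your saved hepatitis.data instead.

```matlab
% Made-up sample rows (NOT real data) in the assumed layout:
% comma-separated values with '?' marking a missing entry.
fname = [tempname '.data'];
fid = fopen(fname, 'w');
fprintf(fid, '2,30,2,1,?,85\n');
fprintf(fid, '1,45,1,2,1.2,?\n');
fprintf(fid, '2,52,1,1,0.9,70\n');
fclose(fid);

% To read the real file, set fname = 'hepatitis.data' instead.
% 'FileType','text' is needed because .data is not a recognized extension.
hepatitis = readmatrix(fname, 'FileType', 'text', ...
    'Delimiter', ',', 'NumHeaderLines', 0, 'TreatAsMissing', '?');

disp(size(hepatitis));           % rows x columns actually read
disp(sum(isnan(hepatitis), 1));  % number of missing values per column
delete(fname);                   % remove the temporary sample file
```

Checking size(hepatitis) before indexing makes it obvious whether a column such as 17 actually exists in the matrix that was read.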

TUTORIAL: How to ask a question (on Answers) and get a fast answer

Image Analyst on 3 Jan 2021

Doesn't run. load doesn't work. You're not making it easy for us, are you? I'll try to fix it. In the meantime, edit your post and format your code as code by highlighting it and clicking the code icon.

Image Analyst on 3 Jan 2021

hepatitis.xlsx

Come on Eeengineer. Please don't waste my time when I try to help you. I used xlsread() instead of load() and that got the data in, but there is no 17th column. Please fix or post your actual code. I'm going to do other stuff now and I'll check back later.

clear all; 
close all; 
clc;
format long g;
format compact;
fontSize = 15;
fprintf('Beginning to run %s.m ...\n', mfilename);
hepatitis = xlsread('hepatitis.xlsx')
X = hepatitis(:,16:17)
plot(X,'k*');
title 'Hepatitis Data';
hold on;
idx=kmeans(X,2);
opts = statset('Display','final');
[idx,C] = kmeans(X,2,'Distance','sqeuclidean',...
	'Replicates',5,'Options',opts);
figure;
plot(X(idx==1,1),X(idx==1,2),'r.','MarkerSize',12)
hold on
plot(X(idx==2,1),X(idx==2,2),'b.','MarkerSize',12)
plot(C(:,1),C(:,2),'kx',...
	'MarkerSize',15,'LineWidth',3)
legend('Cluster 1','Cluster 2','Centroids',...
	'Location','NW')
title 'Cluster Assignments and Centroids'
hold off

Image Analyst on 3 Jan 2021

Only columns 2 and 15 look like there is any real data in them. The rest of the columns just have 1, 2, or nan in them. Which columns do you want to take as "observations"? Are all of them observations, or just the columns 2 and 15?

If I scatter columns 1 and 2 and 15, I see this:

hepatitis = xlsread('hepatitis.xlsx')
x = hepatitis(:,1);
y = hepatitis(:, 2);
z = hepatitis(:, 15);
scatter3(x, y, z, 'Filled');
title('Hepatitis Data', 'FontSize', 20);
xlabel('Column 1', 'FontSize', 20);
ylabel('Column 2', 'FontSize', 20);
zlabel('Column 15', 'FontSize', 20);

So where are the clusters? If you're going to include columns 1 and 3-14, and 16 in the observations, then the clusters might be dominated by what's in those columns since they're very discrete - either 1 or 2. Looking at just columns 2 and 15, it doesn't look like there are any meaningful clusters.
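
If all columns are to be used anyway, here is a minimal sketch of one common approach, not necessarily what you intended: drop rows containing NaN, put every column on the same scale with zscore() so that columns with larger numeric ranges do not dominate the squared Euclidean distance, run kmeans() with k = 2 on all columns, and view the result on the first two principal components, since more than three columns cannot be scattered directly. The matrix below is a synthetic stand-in (random values, not the hepatitis data); replace it with your own numeric matrix. It assumes the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                                   % reproducible stand-in data
n = 60;
data = [randi(2, n, 4), ...               % four discrete columns (1 or 2)
        30 + 10*randn(n, 1), ...          % one continuous column
        1 + 0.5*randn(n, 1)];             % another continuous column
data(5, 6) = NaN;                         % one missing value, to mimic NaN entries

X = rmmissing(data);                      % drop rows that contain NaN
Z = zscore(X);                            % zero mean, unit variance per column

opts = statset('Display', 'final');
[idx, C] = kmeans(Z, 2, 'Distance', 'sqeuclidean', ...
    'Replicates', 5, 'Options', opts);    % clustering uses ALL columns of Z

% Project onto the first two principal components for a 2-D view.
[coeff, score] = pca(Z);
Cproj = (C - mean(Z, 1)) * coeff(:, 1:2); % centroids in the same coordinates

figure;
gscatter(score(:, 1), score(:, 2), idx, 'rb', '.', 12);
hold on;
plot(Cproj(:, 1), Cproj(:, 2), 'kx', 'MarkerSize', 15, 'LineWidth', 3);
hold off;
xlabel('Principal component 1');
ylabel('Principal component 2');
legend('Cluster 1', 'Cluster 2', 'Centroids', 'Location', 'NW');
title('k-means (k = 2) on all columns, shown in PCA space');
```

Whether the two groups found this way mean anything is a separate question: kmeans() always returns k clusters, even when the data has no real cluster structure.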

Eeengineer on 3 Jan 2021

I tried to carry the data from the first plot into the second plot using all the values. As you said, the issue is with the column data (1 and 2). I want to use all the values from the first plot in the second plot. I will follow your advice, thank you.
