kNN classifier

version 1.1.0.2 (2.78 KB) by Mahmoud Afifi
k-NN classifier

57 Downloads

Updated 04 Jan 2019

-k-NN classifier: classification using the k-nearest neighbors algorithm. The nearest-neighbor search method is the Euclidean distance.
-Usage:
[predicted_labels,nn_index,accuracy] = KNN_(3,training,training_labels,testing,testing_labels)
predicted_labels = KNN_(3,training,training_labels,testing)
-Input:
- k: number of nearest neighbors
- data: (NxD) training data; N is the number of samples and D is the dimensionality of each data point
- labels: training labels
- t_data: (MxD) testing data; M is the number of data points and D is the dimensionality of each data point
- t_labels: testing labels (default = [])
-Output:
- predicted_labels: the predicted labels based on the k-NN algorithm
- nn_index: the indices of the nearest training data points (Mx1)
- accuracy: if the testing labels are provided, the accuracy of the classification is returned; otherwise it is zero.
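For illustration, a minimal end-to-end call (the synthetic 2-D data and variable names here are chosen only for this sketch and are not part of the submission) might look like:

% minimal usage sketch with synthetic data
training = [randn(50,2); randn(50,2)+3];        % 100x2 training points, two clusters
training_labels = [ones(50,1); 2*ones(50,1)];   % numeric class labels (1 and 2)
testing = [randn(10,2); randn(10,2)+3];         % 20x2 testing points
testing_labels = [ones(10,1); 2*ones(10,1)];
[predicted_labels,nn_index,accuracy] = KNN_(3,training,training_labels,testing,testing_labels);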

Cite As

Mahmoud Afifi (2019). kNN classifier (https://www.mathworks.com/matlabcentral/fileexchange/63621-knn-classifier), MATLAB Central File Exchange. Retrieved .

Comments and Ratings (31)

Hello! This is good code, thank you for it. But now let me ask for your help: I don't clearly understand how, if my data is in a .mat file, I can apply my dataset to your code.
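A minimal sketch of one way to do this, assuming the .mat file stores the matrices under the hypothetical names training, training_labels, and testing (adapt the names to your own file):

% hypothetical file and variable names; adapt them to your own .mat file
S = load('my_dataset.mat');   % S is a struct holding the saved variables
predicted_labels = KNN_(3, S.training, S.training_labels, S.testing);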

@Fahad Ibrahim: Asalam Alaykum, Mr. Fahad. Actually, the code here is an example of how you can implement the kNN algorithm.

Asalam alaykum, Mr. Afifi... I am currently working on my final-year project, and its final part is to implement the kNN algorithm on the data set I have obtained. Can you please show me how to implement it? This is my e-mail address: fahad.agus@gmail.com

@Balaji M. Sontakke: we do not have a threshold in this implementation. The only hyperparameter we have is the number of nearest neighbors, k.

What is the role of a threshold in the kNN classifier?

Hello Sir,
My mail ID is aloormariena@gmai.com

@Geetha, please check your email

Geetha V
Dear Dr. Mahmoud Afifi.
Thank you, Sir. I checked the code on my dataset, but I am not getting the correct result.
I sent my data to your mail. Please help me.

Mar Key

@Cheol please check the comments written at the top of the source code.

Cheol Shin

Dear Dr. Mahmoud Afifi,
I really appreciate your work.
I want to learn the kNN classifier code. I am happy!
But I don't know how to use this code.
Can you help me? I really want to use this code!

??? Error using ==> KNN_ at 30
Too few input arguments.

:(

Alaa Hadi

Please, Dr. Mahmoud Afifi,
Can you share your email so I can send you my data set?

@Vishnu Priya can you send me your training data with labels and testing data, so I could help.

Afroz Ahmad

Sir, can you run this code on my dataset?

Hello, sir. I'm trying to use this function on the fisheriris dataset, which has three labels: setosa, virginica, and versicolor. But I am unable to get a proper output, because of the unique function, I believe. I'm getting 118 and 115, i.e. 'v' and 's', as output, but both labels virginica and versicolor start with 'v', so I am unable to get a proper classification. Could you please help me with the necessary changes? And what if I want the complete label name as output?
Thank you, sir.
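A possible workaround, assuming KNN_ expects numeric label vectors: map the species names to integer indices with grp2idx (Statistics and Machine Learning Toolbox) before the call, then map the predictions back to the full names. This is only an illustrative sketch:

% sketch: numeric labels for fisheriris (assumes KNN_ expects numeric labels)
load fisheriris                               % meas (150x4) and species (150x1 cell)
[num_labels, label_names] = grp2idx(species); % setosa/versicolor/virginica -> 1/2/3
idx = randperm(150);
train_idx = idx(1:100);  test_idx = idx(101:150);
pred = KNN_(3, meas(train_idx,:), num_labels(train_idx), meas(test_idx,:));
predicted_species = label_names(pred);        % recover the full label names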

mafifi[at]yorku[dot]ca

@afifi Sir, how do I send you my data? Can you please provide your email address?

@Maanvi can you send me your data to test?

I am not getting any accuracy. What should I do?

@Mohamed
Good question. It was an assignment.

Hello, sir. Thank you for your contribution, but I have a question which may be basic: why did you create your own kNN code if MATLAB already has it?
Thank you, sir.

@Dipjyoti Chakraborty
The kNN classifier is a non-parametric classifier: it does not learn any parameters (there is no training process). Instead, the idea is to keep all training samples at hand, and when you receive a new data point (represented as a vector), the classifier measures the distance between the new data point and all the training data it has. Note that you have to provide the algorithm with the label of each training data point. The classifier then finds the nearest k data points among the training samples and assumes the label of the given data point is the majority label of those k nearest training data points.
The dimensionality of the training data and the testing data must be the same; for example, if your training data represent the x,y-coordinates of points, each testing data point should be a vector with two values (x and y). Based on the data, people pick a suitable distance metric; for example, if your data are 2D points in Cartesian coordinates, the Euclidean distance is the best distance measure to use.

In my code, I use the Euclidean distance, and you have to give the function the training data and its labels along with your testing data points:
predicted_labels = KNN_(k,training_data,training_labels,testing_data)
predicted_labels will contain the predicted labels based on the k nearest data points in training_data, and each label is estimated via majority voting over the labels of these nearest points.

Suppose you have:
* d_t -> 10x5 matrix that represents the training data, such that you have 10 data points, each with 5 features (5 dimensions).
* L -> 10x1 vector that represents the label of each data point in d_t.
* d -> Nx5 matrix that represents the N testing data points whose labels you need to estimate.
Run est_L = KNN_(3,d_t,L,d); where est_L contains the estimated labels of d based on the 3 nearest neighbor points in d_t (see the sketch below).
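A runnable sketch of that toy example (random values are used purely for illustration):

d_t = rand(10,5);               % 10 training points, 5 features each
L = [ones(5,1); 2*ones(5,1)];   % one label per training point
d = rand(3,5);                  % 3 new points whose labels we want to estimate
est_L = KNN_(3, d_t, L, d);     % label = majority vote of the 3 nearest training points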

Would you help me to implement this? Please give me a full example so that I can understand. If you do so, it will be a pleasure for me.

@Dipjyoti Chakraborty
Did you test the current code? I could not find a variable called train_labels in the current source code!

After your last code update, when I try to run it, it shows:
Undefined function or variable 'train_labels'.
Error in classify (line 61)
options=unique(train_labels(k_nn(i,:)'));
What should I do?

@RK Ghadai

The bug is fixed; that part has been replaced with the following code:

%find the nearest k for each data point of the testing data
k_nn=ind(:,1:k);
nn_index=k_nn(:,1);
%get the majority vote
for i=1:size(k_nn,1)
    options=unique(labels(k_nn(i,:))); %candidate labels among the k neighbors
    max_count=0;
    max_label=0;
    for j=1:length(options)
        count=length(find(labels(k_nn(i,:))==options(j))); %votes for this label
        if count>max_count
            max_label=options(j);
            max_count=count;
        end
    end
    predicted_labels(i)=max_label;
end

RK Ghadai

Mistake in the "%get the majority vote" section: "options=unique(k_nn(i,:));" should be changed to "options=unique(train_labels(k_nn(i,:)));". Similarly, the if condition there should be corrected.

LEE ZISHENG

Updates

1.1.0.2

..

1.1.0.1

.

1.1.0.0

Bug fixed

MATLAB Release Compatibility
Created with R2015a
Compatible with any release
Platform Compatibility
Windows macOS Linux

KNN_