Summary of K-Nearest Neighbours Algorithms

The document provides an overview of K-Nearest Neighbors (KNN) machine learning algorithm. It describes KNN as a supervised learning technique that classifies new data based on similarity to existing data. It explains that KNN finds the K closest training examples to make predictions, where closeness is defined using distance metrics like Euclidean distance. The document outlines the key steps of KNN classification which include computing distances, finding the K nearest neighbors, and predicting the class based on a majority vote. It also provides sample R code for performing KNN classification.


A Complete Guide to K-Nearest-Neighbors with Applications in Python and R
https://kevinzakka.github.io/2016/07/13/k-nearest-neighbor/

Characteristics

- Supervised learning: the training dataset contains pairs (x, y); the goal is to infer y for a
new sample containing only x
- Non-parametric: no explicit assumption about the functional form of h(x), the estimator of y

Mechanics

Classify an observation by a majority vote among the K training instances that are closest to it,
where closeness is measured with a distance metric.

Euclidean distance: d(x1, x2) = sqrt( (x11 - x21)^2 + ... + (x1p - x2p)^2 )
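As a minimal sketch, the Euclidean distance between two feature vectors can be computed in Python like this (the function name is illustrative):

```python
import math

def euclidean_distance(x1, x2):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```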

Other distance metrics (e.g. Manhattan, Minkowski) are also possible.

Steps to classify an observation:

1. Run through the whole training set, computing the distance d between x and each training observation
2. Find the K observations with the smallest d (choose K odd to prevent ties)
3. Compute the conditional probability of each class j among those K neighbours:
   P(Y = j | X = x) = (1/K) * sum over i in N0 of 1{y_i = j}, where N0 is the set of the K nearest neighbours
4. Classify the observation into the class with the highest probability
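The steps above can be sketched from scratch in Python; the data and names here are illustrative, not taken from the linked guide:

```python
import math
from collections import Counter

def knn_classify(train_X, train_y, x, k):
    # Step 1: distance from x to every training observation
    dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(row, x)))
             for row in train_X]
    # Step 2: indices of the K nearest neighbours
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    # Step 3: class counts among the K neighbours (dividing by K gives
    # the conditional probability of each class)
    votes = Counter(train_y[i] for i in nearest)
    # Step 4: return the class with the highest probability (majority vote)
    return votes.most_common(1)[0][0]

train_X = [[1, 1], [1, 2], [5, 5], [6, 5]]
train_y = ["a", "a", "b", "b"]
print(knn_classify(train_X, train_y, [1.5, 1.5], k=3))  # a
```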

R code

library(class)

# train = training features, test = features to classify,
# cl = true classes of the training rows, k = number of neighbours (K)
pred <- knn(train = train_X, test = test_X, cl = train_y, k = K)
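The same classification can be run in Python with scikit-learn's KNeighborsClassifier; this is a sketch assuming scikit-learn is installed, with made-up data for illustration:

```python
from sklearn.neighbors import KNeighborsClassifier

train_X = [[1, 1], [1, 2], [5, 5], [6, 5]]
train_y = ["a", "a", "b", "b"]

clf = KNeighborsClassifier(n_neighbors=3)  # K = 3, odd to prevent ties
clf.fit(train_X, train_y)                  # train = train_X, cl = train_y
print(clf.predict([[1.5, 1.5]]))           # prints ['a']
```

Note that n_neighbors plays the role of the k argument in the R call above.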
