Summary of K-Nearest Neighbours Algorithms and R
https://kevinzakka.github.io/2016/07/13/k-nearest-neighbor/
Characteristics
- Supervised learning: the training dataset contains (x, y) pairs that capture the relationship between x and y, and the goal is
to infer y for a new sample in which only x is observed
- Non-parametric: no explicit assumption is made about the functional form of h(x), the estimator of y
Mechanics
Classify an observation by majority vote among the K training instances that are closest to it.
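1. Run through the whole training dataset, computing the distance d between x and each training observation
2. Find the K observations with the smallest d (an odd K prevents ties in two-class problems); call this set A
3. Estimate the conditional probability for each class as the fraction of points in A with that label: $P(y = j \mid X = x) = \frac{1}{K} \sum_{i \in A} \mathbb{1}\{y^{(i)} = j\}$
4. Classify the observation into the class with the highest probability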
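The four steps above can be sketched in a few lines of base R. This is only an illustrative sketch, not the implementation used by any package: the function name knn_classify, its arguments, and the toy data are all hypothetical, and Euclidean distance is an assumption, since the notes do not say which distance d is meant.

```r
# Hypothetical helper: classify a single observation x by K-nearest-neighbour vote.
# Assumption: d is the Euclidean distance (the notes leave d unspecified).
knn_classify <- function(train_x, train_y, x, k = 3) {
  train_x <- as.matrix(train_x)
  train_y <- factor(train_y)

  # Step 1: distance d between x and every training observation.
  d <- sqrt(rowSums(sweep(train_x, 2, x)^2))

  # Step 2: indices of the K observations with the smallest d (the neighbour set).
  nearest <- order(d)[seq_len(k)]

  # Step 3: estimated P(y = j | X = x) = share of the K neighbours with label j.
  probs <- table(train_y[nearest]) / k

  # Step 4: pick the class with the highest estimated probability
  # (which.max returns the first maximum if there is a tie).
  list(class = names(probs)[which.max(probs)], probs = probs)
}

# Toy data, made up for illustration: two well-separated groups in 2-D.
train_x <- rbind(c(1, 1), c(1, 2), c(2, 1), c(6, 6), c(6, 7), c(7, 6))
train_y <- c("a", "a", "a", "b", "b", "b")

result <- knn_classify(train_x, train_y, x = c(2, 2), k = 3)
print(result$class)
print(result$probs)
```

In this toy example the three nearest neighbours of (2, 2) all carry label "a", so the vote is unanimous. The sketch scans every training row for each query, which mirrors step 1 directly but becomes slow on large datasets.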
R code
library(class)