Lecture 05 - Nearest Neighbour

The document discusses instance-based learning and the k-nearest neighbors algorithm. The k-NN algorithm stores all training examples and classifies new examples based on the majority class of its k nearest neighbors. It can perform classification or regression. Distance weighting gives closer neighbors more influence. The algorithm is lazy, only generalizing to new queries. Advantages are robustness to noise and fast training, while disadvantages include the curse of dimensionality and slow classification. Case-based reasoning similarly retrieves similar past cases to solve new problems.

Uploaded by

Waseem Shahzad

INSTANCE-BASED LEARNING


K-Nearest Neighbors

Key Idea:
Simply store all training examples <xi, f(xi)>.

Thus the training algorithm is very simple: learning amounts to memorising the examples.


Classification Algorithm (1-nearest neighbor):
* Given query instance xq
* Locate the nearest training example xn
* Estimate f^(xq) ← f(xn), i.e. assign xq the class of xn
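A minimal Python sketch of this procedure; the toy data set and the function names (`training`, `one_nn`) are illustrative, not from the lecture:

```python
import math

# Toy training set of (feature vector, class label) pairs (illustrative only).
training = [((1.0, 1.0), "+"), ((2.0, 1.5), "+"), ((5.0, 5.0), "-")]

def euclidean(a, b):
    # Standard Euclidean distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def one_nn(xq):
    # Locate the nearest training example xn and return its label f(xn).
    xn, f_xn = min(training, key=lambda ex: euclidean(ex[0], xq))
    return f_xn

print(one_nn((1.5, 1.2)))  # the nearest stored example is a "+" one
```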


Classification Algorithm (k-nearest neighbor):

* Given query instance xq
* Locate the k nearest training examples x1, ..., xk
* Estimate the class of xq by taking a majority vote among the k nearest neighbours' class labels
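A sketch of the k-nearest-neighbour vote, again on made-up data; `Counter` performs the majority count:

```python
import math
from collections import Counter

# Illustrative training data: (feature vector, class label).
training = [((1, 1), "+"), ((2, 1), "+"), ((1.5, 2), "+"),
            ((3, 3), "-"), ((4, 4), "-")]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(xq, k=3):
    # Take the k training examples closest to the query instance xq ...
    neighbours = sorted(training, key=lambda ex: euclidean(ex[0], xq))[:k]
    # ... and return the majority class among their labels.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

print(knn_classify((2, 2), k=3))
```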


Note that 1-nearest neighbor classifies xq as positive, whereas 5-nearest neighbor classifies it as negative.


Classification Algorithm (k-nearest neighbor, real-valued labels):

* If the class labels are real-valued, take the mean of the target-function values ('f' values) of the k nearest neighbours:
  f^(xq) = (1/k) * Σ f(xi)
* Example: predicting a person's age
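For real-valued targets the same neighbour search applies, with the mean replacing the vote. A sketch using hypothetical (height, weight) → age data, invented for illustration:

```python
import math

# Hypothetical (height_cm, weight_kg) -> age examples, for illustration only.
training = [((160, 55), 25), ((165, 60), 30), ((170, 65), 35), ((185, 90), 60)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_regress(xq, k=3):
    # Mean of the target values f(xi) of the k nearest neighbours.
    neighbours = sorted(training, key=lambda ex: euclidean(ex[0], xq))[:k]
    return sum(f_xi for _, f_xi in neighbours) / k

print(knn_regress((166, 61), k=3))  # mean of the 3 closest ages
```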


Distance weighted kNN-algorithm

We weight the contribution of each of the k neighbours according to its distance from the query point xq: the closer the neighbour, the greater its weight.

f^(xq) = argmax over v of Σ wi · δ(v, f(xi)),   where wi = 1 / d(xq, xi)²

Here δ(a, b) = 1 if a = b and 0 otherwise, and d(xq, xi) is the distance between xq and xi.
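A sketch of the distance-weighted vote with wi = 1/d², on made-up data; as the lecture notes next, an exact match (distance zero) is assigned the class of the matching example directly, which the early return handles:

```python
import math
from collections import Counter

# Illustrative training data: (feature vector, class label).
training = [((1, 1), "+"), ((2, 1), "+"), ((3, 3), "-"), ((4, 4), "-")]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_knn_classify(xq, k=3):
    neighbours = sorted(training, key=lambda ex: euclidean(ex[0], xq))[:k]
    votes = Counter()
    for xi, label in neighbours:
        d = euclidean(xi, xq)
        if d == 0:
            return label            # exact match: assign the class of xi directly
        votes[label] += 1 / d ** 2  # wi = 1 / d(xq, xi)^2
    return votes.most_common(1)[0][0]

print(weighted_knn_classify((1.2, 1.0)))
```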


If d(xq, xi)² = 0 for some training example xi (i.e. xq coincides with xi), we assign the class of xi to xq. If several xi are equal to xq, we take a majority vote among them.

For real-valued target functions, the estimate is the weighted mean:

f^(xq) = Σ wi · f(xi) / Σ wi
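For real-valued targets, the distance-weighted estimate is the weighted mean Σ wi · f(xi) / Σ wi. A sketch on hypothetical (height, weight) → age data invented for illustration:

```python
import math

# Hypothetical (height_cm, weight_kg) -> age examples, illustration only.
training = [((160, 55), 25), ((165, 60), 30), ((170, 65), 35), ((185, 90), 60)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_knn_regress(xq, k=3):
    neighbours = sorted(training, key=lambda ex: euclidean(ex[0], xq))[:k]
    num = den = 0.0
    for xi, f_xi in neighbours:
        d = euclidean(xi, xq)
        if d == 0:
            return f_xi           # exact match: return its target value
        w = 1 / d ** 2            # wi = 1 / d(xq, xi)^2
        num += w * f_xi
        den += w
    return num / den              # sum(wi * f(xi)) / sum(wi)

print(weighted_knn_regress((166, 61)))
```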


If all the training examples are used to determine the classification of xq, the algorithm is called a global method; if only the nearest ones are used, it is called a local method.

For real-valued target functions, the global method is also known as Shepard's method.


Terminology

It is useful to know the following terms:

* Regression - approximating a real-valued target function
* Residual - the error f^(x) - f(x) in approximating the target function
* Kernel function - the function of distance, K(d(xq, xi)), used to determine the weight wi of each training example


For symbolic data, we have to adapt this method: instead of Euclidean distance, we use other measures based on the similarity of two instances.
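One simple measure of this kind (my example, not one named in the lecture) is the overlap distance: the number of attributes on which two instances disagree.

```python
def overlap_distance(a, b):
    # Count the attributes on which the two instances disagree
    # (0 = identical, len(a) = they disagree everywhere).
    return sum(1 for x, y in zip(a, b) if x != y)

# Two symbolic instances in the (age, income, student, credit-rating) format
# used in the example below.
x1 = ("30-40", "medium", "yes", "fair")
x2 = ("30-40", "high", "yes", "excellent")
print(overlap_distance(x1, x2))  # they differ on income and credit-rating
```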


Example: what class label will be assigned to the following unlabelled sample, assuming k = 3?

X = {age = 30-40, income = medium, student = yes, credit-rating = fair}


Lazy and Eager learning methods

Lazy learning methods defer the decision of how to generalize beyond the training data until a new query instance is observed.

Eager learning methods commit to the target model at training time.


Advantages
• It is robust to noise
• Training is fast

Disadvantages
• All attributes are used to compute distances, whereas only a few may be relevant (this problem of irrelevant attributes is called the curse of dimensionality)
• Classification is a slow process


Case-based Reasoning

In this method, instances are typically represented using symbolic descriptions.

When a new problem is presented, similar problems and their solutions are retrieved.

This method has been applied to problems such as:
- Conceptual design of mechanical devices based on a stored library of previous designs
- Reasoning about new legal cases based on previous rulings


The key issue is the similarity metric used.

One or more retrieved cases may form the basis of a solution to the new problem.

The process for combining multiple retrieved cases has to be based on a knowledge-based reasoning process.

There is usually a tight coupling between case retrieval, knowledge-based reasoning, and problem solving.


Reference: Chapter 8 of T. Mitchell, Machine Learning (McGraw-Hill, 1997)

