ML-Lec7

Instance-based learning methods, such as K-Nearest Neighbor, store training examples and classify new instances based on their proximity to these examples, postponing generalization until needed. While this approach allows for local approximations of complex target functions, it can be computationally expensive at classification time. Additionally, the methods may struggle with high-dimensional data if only a few attributes are relevant for classification.


Lecture 7: Instance-based Learning

Overview
• Introduction to instance-based learning
• K-Nearest Neighbor

Instance-based Learning - Overview
• In contrast to learning methods that construct a general, explicit description of the target function when training examples are provided, instance-based learning methods simply store the training examples.
• Generalizing beyond these examples is postponed until a new instance must be classified.
• Each time a new query instance is encountered, its relationship to the previously stored examples is examined in order to assign a target function value for the new instance.

Instance-based Learning - Overview
1. Instance-based learning methods are sometimes referred to as delayed/lazy learning methods because they delay processing until a new instance must be classified.
2. A key advantage of delayed/lazy learning is that instead of estimating the target function once for the entire instance space, these methods can estimate it locally and differently for each new instance to be classified.

Instance-based Learning - Overview
• Instance-based learning methods such as Nearest Neighbor are conceptually straightforward approaches to approximating real-valued or discrete-valued target functions.
1. Learning in these algorithms consists of simply storing the presented training data.
2. When a new query instance is encountered, a set of similar related instances is retrieved from memory and used to classify the new query instance.
Instance-based Learning - Overview
• Many techniques construct only a local approximation to the target function that applies in the neighborhood of the new query instance, and never construct an approximation designed to perform well over the entire instance space.
• This has significant advantages when the target function is very complex but can still be described by a collection of less complex local approximations.

Instance-based Learning - Disadvantages
1. One disadvantage of instance-based approaches is that the cost of classifying new instances can be high.
• This is due to the fact that nearly all computation takes place at classification time rather than when the training examples are first encountered.

Instance-based Learning - Disadvantages
2. A second disadvantage of many instance-based approaches, especially nearest neighbor approaches, is that they typically consider all attributes of the instances when attempting to retrieve similar training examples from memory.
• If the target concept depends on only a few of the many available attributes, then the instances that are truly most "similar" may well be a large distance apart.

K-nearest Neighbor Learning (Intro.)
• The most basic instance-based method is the k-NEAREST NEIGHBOR algorithm.
• This algorithm assumes all instances correspond to points in the n-dimensional space R^n; that is, the input attributes are numeric.
• The nearest neighbors of an instance are defined in terms of the standard Euclidean distance.
K-nearest Neighbor Learning (Euclidean distance)
• Let an arbitrary instance x be described by the feature vector <a_1(x), a_2(x), ..., a_n(x)>, where a_r(x) denotes the value of the rth attribute of instance x.
• Then the distance between two instances x_i and x_j is defined to be d(x_i, x_j), where

  d(x_i, x_j) \equiv \sqrt{ \sum_{r=1}^{n} \left( a_r(x_i) - a_r(x_j) \right)^2 }

K-nearest Neighbor Learning (output type)
• In nearest-neighbor learning the target function may be either discrete-valued or real-valued.
• Let us first consider learning discrete-valued target functions of the form f : R^n -> V, where V is the finite set {v_1, ..., v_s}.
• The k-NEAREST NEIGHBOR algorithm for approximating a discrete-valued target function is given on the next slide.
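The distance measure defined above is straightforward to compute directly. The following minimal Python sketch mirrors the definition; the function name euclidean_distance and the example vectors are illustrative assumptions, not part of the lecture.

```python
import math

def euclidean_distance(xi, xj):
    """Standard Euclidean distance between two feature vectors
    <a_1(x), ..., a_n(x)>, as defined on the slide above."""
    if len(xi) != len(xj):
        raise ValueError("instances must have the same number of attributes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

# Example: two instances described by three numeric attributes
print(euclidean_distance([1.0, 2.0, 0.5], [2.0, 0.0, 0.5]))  # sqrt(5) ~ 2.236
```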

K-nearest Neighbor Algorithm for approximating a discrete-valued function f : R^n -> V
• Training algorithm:
  • For each training example <x, f(x)>, add the example to the list training_examples.
• Classification algorithm:
  • Given a query instance x_q to be classified,
  • Let x_1, ..., x_k denote the k instances from training_examples that are nearest to x_q.
  • Return

    \hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} \delta(v, f(x_i))

  • where \delta(a, b) = 1 if a = b and 0 otherwise.

K-nearest Neighbor Algorithm for approximating a discrete-valued function f : R^n -> V
• As shown there, the value f̂(x_q) returned by this algorithm as its estimate of f(x_q) is just the most common value of f among the k training examples nearest to x_q.
• If we choose k = 1, then the 1-NEAREST NEIGHBOR algorithm assigns to f̂(x_q) the value f(x_i), where x_i is the training instance nearest to x_q.
• For larger values of k, the algorithm assigns the most common value among the k nearest training examples.
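A minimal Python sketch may help make the two phases concrete: training simply stores the examples, and classification takes a majority vote among the k nearest. The class name KNNClassifier and the toy data are illustrative assumptions, not part of the lecture.

```python
import math
from collections import Counter

class KNNClassifier:
    """k-NEAREST NEIGHBOR for a discrete-valued target function f: R^n -> V."""

    def __init__(self, k=3):
        self.k = k
        self.training_examples = []   # list of (x, f(x)) pairs

    def train(self, examples):
        # Training algorithm: simply store each <x, f(x)> pair.
        self.training_examples.extend(examples)

    def classify(self, xq):
        # Retrieve the k stored instances nearest to the query xq ...
        nearest = sorted(self.training_examples,
                         key=lambda ex: math.dist(ex[0], xq))[:self.k]
        # ... and return the most common target value among them
        # (arg max over v of the sum of delta(v, f(x_i))).
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

# Toy usage (illustrative data)
knn = KNNClassifier(k=3)
knn.train([((0, 0), "neg"), ((0, 1), "neg"), ((1, 0), "neg"),
           ((5, 5), "pos"), ((5, 6), "pos"), ((6, 5), "pos")])
print(knn.classify((4.5, 5.0)))   # -> "pos"
```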
K-nearest Neighbor Algorithm for approximating a discrete-valued function f : R^n -> V
• (Figure) 1-Nearest Neighbor classifies x_q as positive, while 5-Nearest Neighbor classifies x_q as negative. The diagram on the right-hand side shows the decision surface induced by 1-Nearest Neighbor.

K-nearest Neighbor Algorithm for approximating a continuous-valued function
• To approximate a real-valued target function, calculate the mean value of the k nearest training examples.
• Replace the final line of the previous algorithm by:

  \hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} f(x_i)}{k}
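A corresponding sketch for the continuous-valued case simply replaces the majority vote with the mean of the k nearest target values; the function name knn_regress and the toy data are illustrative assumptions.

```python
import math

def knn_regress(training_examples, xq, k=3):
    """k-NN for a real-valued target: return the mean f(x_i)
    over the k training examples nearest to the query xq."""
    nearest = sorted(training_examples,
                     key=lambda ex: math.dist(ex[0], xq))[:k]
    return sum(fx for _, fx in nearest) / len(nearest)

# Toy usage (illustrative data): predict a numeric value for xq = (2.0,)
data = [((1.0,), 1.5), ((2.0,), 2.1), ((3.0,), 2.9), ((10.0,), 9.8)]
print(knn_regress(data, (2.0,), k=3))   # mean of the 3 closest targets
```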

Distance-Weighted Nearest Neighbor Algorithm for discrete-valued target functions
• Weight the contribution of each of the k neighbors according to their distance to the query point x_q, giving greater weight to closer neighbors:

  \hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i)), \quad \text{where} \quad w_i \equiv \frac{1}{d(x_q, x_i)^2}

• If x_q exactly matches one of the training instances x_i, the denominator d(x_q, x_i)^2 is zero; in that case we assign f̂(x_q) to be f(x_i).

Distance-Weighted Nearest Neighbor Algorithm for continuous-valued target functions
• We can distance-weight the instances for real-valued target functions in a similar way:

  \hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}, \quad \text{where} \quad w_i \equiv \frac{1}{d(x_q, x_i)^2}
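Both distance-weighted rules can be sketched in Python as follows. The zero-distance (exact match) case is handled as described on the slide; the function names are illustrative assumptions, not part of the lecture.

```python
import math
from collections import defaultdict

def weighted_knn_classify(training_examples, xq, k=3):
    """Distance-weighted k-NN for a discrete-valued target."""
    nearest = sorted(training_examples, key=lambda ex: math.dist(ex[0], xq))[:k]
    scores = defaultdict(float)
    for xi, fxi in nearest:
        d = math.dist(xi, xq)
        if d == 0.0:
            return fxi                  # xq exactly matches a stored instance
        scores[fxi] += 1.0 / d ** 2     # w_i = 1 / d(xq, xi)^2
    return max(scores, key=scores.get)  # arg max over v of sum of w_i * delta(v, f(x_i))

def weighted_knn_regress(training_examples, xq, k=3):
    """Distance-weighted k-NN for a real-valued target."""
    nearest = sorted(training_examples, key=lambda ex: math.dist(ex[0], xq))[:k]
    num, den = 0.0, 0.0
    for xi, fxi in nearest:
        d = math.dist(xi, xq)
        if d == 0.0:
            return fxi                  # exact match: just return f(xi)
        w = 1.0 / d ** 2
        num += w * fxi
        den += w
    return num / den                    # sum(w_i * f(x_i)) / sum(w_i)
```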
Remark
• Note that all of the above variants of the k-NEAREST NEIGHBOR algorithm consider only the k nearest neighbors to classify the query point.
• Once we add distance weighting, there is no harm in allowing all training examples, because very distant examples will have very little effect on f̂(x_q).
• The only disadvantage of considering all examples is that our classifier will run more slowly.
• If all training examples are considered when classifying a new query instance, we call the algorithm a global method.
• If only the nearest training examples are considered, we call it a local method.

Remark
• The distance-weighted k-NEAREST NEIGHBOR algorithm is a highly effective inductive inference method for many practical problems.
• It is robust to noisy training data and quite effective when it is provided a sufficiently large set of training data.
• Note that by taking the weighted average of the k neighbors nearest to the query point, it can smooth out the impact of isolated noisy training examples.
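As a small illustration of the global variant, the distance-weighted rules sketched above can simply be applied to all stored examples rather than only the k nearest; this snippet assumes the weighted_knn_regress helper from the earlier sketch.

```python
def global_weighted_regress(training_examples, xq):
    # Global variant: every stored example contributes, but very distant
    # examples contribute almost nothing because w_i = 1 / d(xq, xi)^2.
    return weighted_knn_regress(training_examples, xq, k=len(training_examples))
```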

Remark
• Since the algorithm delays all processing until a new query is received, significant computation can be required to process each new query.
• Various methods have been developed for indexing the stored training examples so that the nearest neighbors can be identified more efficiently, at some additional cost in memory.
• One such indexing method is the kd-tree:
  https://www.ri.cmu.edu/pub_files/pub1/moore_andrew_1991_1/moore_andrew_1991_1.pdf
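As one concrete illustration of such indexing, the sketch below builds a kd-tree with SciPy's scipy.spatial.cKDTree and queries it for the nearest neighbors of a point. This assumes SciPy and NumPy are available; the random data is purely illustrative and not from the lecture.

```python
import numpy as np
from scipy.spatial import cKDTree

# Illustrative training data: 1000 random points in R^3
rng = np.random.default_rng(0)
points = rng.random((1000, 3))

tree = cKDTree(points)             # build the kd-tree index once, up front

xq = np.array([0.5, 0.5, 0.5])
dists, idx = tree.query(xq, k=5)   # distances and indices of the 5 nearest neighbors
print(idx, dists)
```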
