Instance-Based Learning
Overview
• Instance-Based Learning
• Comparison of Eager and Instance-Based Learning
• Instance Distances for Instance-Based Learning
• Nearest Neighbor (NN) Algorithm
• Advantages and Disadvantages of the NN algorithm
• Approaches to overcome the Disadvantages of the NN algorithm
• Locally weighted regression
• Radial basis functions
• Case based Reasoning
Different Learning Methods
• Eager Learning
– Learning = acquiring an explicit structure of a classifier
on the whole training set;
– Classification = an instance gets a classification using
the explicit structure of the classifier.
• Instance-Based Learning (Lazy Learning)
– Learning = storing all training instances
– Classification = an instance gets a classification equal to
the classification of the nearest instances to the instance.
–All learning methods presented so far construct a general
explicit description of the target function when examples are
provided
–In case of Instance Based learning,
– Examples are simply stored
– Generalizing is postponed until a new instance must be
classified
– Sometimes referred to as lazy learning
– In order to assign a target function value, its relationship
to the previously stored examples is examined
– IBL includes nearest neighbor, locally weighted regression, and case-based reasoning methods
Instance-Based Learning
 Advantages:
 Instead of estimating for the whole instance space, local
approximations to the target function are possible
 Especially if target function is complex but still decomposable
 Disadvantages:
 Classification costs are high (number of computations to index
each training example at query time)
 Efficient techniques for indexing examples are important to
reduce computational effort
 Typically all attributes are considered when attempting to
retrieve similar training examples from memory
 If the concept depends only on a few attributes, the truly most
similar instances may be far away
The Features of the Task of the NN Algorithm:
• The instance language comes with a set A with n attributes a1,
a2, … an.
• The domain of each attribute ai can be discrete or continuous.
• An instance x is represented as < a1(x), a2(x), … an(x) >,
where ai(x) is the value of the attribute ai for the instance x;
• The classes to be learned can be:
– Discrete: in this case we learn a discrete-valued function f(x), and the
co-domain C of the function consists of the classes c to be learned.
– Continuous: in this case we learn a continuous (real-valued) function f(x),
and the co-domain C of the function is a set of real values.
Nearest-Neighbor Algorithm (NN)
Distance Functions
The distance functions are composed from difference metrics da
w.r.t. attributes a, defined for each two instances xi and xj.
• If the attribute a is numerical, then:
da(xi, xj) = |a(xi) − a(xj)| / rangea
• If the attribute a is discrete, then:
da(xi, xj) = 0 if a(xi) = a(xj), and 1 otherwise.
Distance Functions
The main distance function for determining nearest
neighbors is the Euclidean distance:
d(xi, xj) = √( Σ a ∈ A da(xi, xj)² )
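A minimal Python sketch of these distance functions (a hedged illustration, assuming instances are given as sequences of attribute values with precomputed ranges for the numerical attributes; all names are illustrative):

```python
import math

def attribute_distance(xi_a, xj_a, attr_range=None):
    """Per-attribute difference metric d_a(x_i, x_j)."""
    if attr_range is not None:                      # numerical attribute
        return abs(xi_a - xj_a) / attr_range        # normalized by the attribute's range
    return 0.0 if xi_a == xj_a else 1.0             # discrete attribute: 0 if equal, 1 otherwise

def euclidean_distance(xi, xj, ranges):
    """Euclidean distance over all attributes: sqrt(sum_a d_a(x_i, x_j)^2)."""
    return math.sqrt(sum(attribute_distance(a_i, a_j, r) ** 2
                         for a_i, a_j, r in zip(xi, xj, ranges)))

# Example: one numerical attribute (range 10) and one discrete attribute (range None)
print(euclidean_distance([3.0, "red"], [7.0, "blue"], [10.0, None]))  # ≈ 1.077
```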
k-Nearest-Neighbor Algorithm
Classification & Decision Boundaries
[Figure: a query point q1 surrounded by positive (+) and negative (−) training instances]
1-NN: q1 is classified as positive
5-NN: q1 is classified as negative
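A hedged sketch of the basic k-NN classification rule illustrated above, using a majority vote among the k nearest training instances (the data and helper names are illustrative, not from the slides):

```python
from collections import Counter

def knn_classify(query, training_data, k, distance):
    """Classify `query` by majority vote among its k nearest training instances."""
    # training_data: list of (instance, label) pairs
    neighbors = sorted(training_data, key=lambda pair: distance(query, pair[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Illustrative 2-D data: choosing k = 1 vs k = 5 can flip the predicted class
data = [((1.0, 1.0), "+"), ((1.2, 0.9), "-"), ((3.0, 3.0), "-"),
        ((3.1, 2.9), "-"), ((2.8, 3.2), "-")]
euclid = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
print(knn_classify((1.1, 1.0), data, k=1, distance=euclid))  # "+"
print(knn_classify((1.1, 1.0), data, k=5, distance=euclid))  # "-"
```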
Distance Weighted Nearest-Neighbor Algorithm
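As a sketch of the standard distance-weighted variant for a discrete-valued target (one common formulation, assuming the same distance function as above): each of the k nearest neighbours votes with weight 1/d², and an exact match (d = 0) decides the class outright.

```python
from collections import defaultdict

def distance_weighted_knn(query, training_data, k, distance):
    """Vote with weight 1/d^2; an exact match (d == 0) decides the class outright."""
    neighbors = sorted(training_data, key=lambda pair: distance(query, pair[0]))[:k]
    weights = defaultdict(float)
    for instance, label in neighbors:
        d = distance(query, instance)
        if d == 0.0:
            return label                 # query coincides with a training instance
        weights[label] += 1.0 / d ** 2   # closer neighbours get larger votes
    return max(weights, key=weights.get)
```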
Advantages of the NN Algorithm
• The NN algorithm can estimate complex target classes
locally and differently for each new instance to be
classified;
• The NN algorithm provides good generalization accuracy
on many domains
• The NN algorithm learns very quickly;
• The NN algorithm is robust to noisy training data;
• The NN algorithm is intuitive and easy to understand
which facilitates implementation and modification.
Disadvantages of the NN Algorithm
• The NN algorithm has large storage requirements because it
has to store all the data
• The NN algorithm is slow at classification (query) time because all the
training instances have to be visited
• The accuracy of the NN algorithm degrades with increase of
noise in the training data
• The accuracy of the NN algorithm degrades with increase of
irrelevant attributes
Remarks
 Highly effective inductive inference method for many practical problems, provided a sufficiently large set of training examples is available
 Inductive bias of k-nearest neighbours: the assumption that the classification of xq will be similar to the classification of other instances that are nearby in Euclidean distance
 Because the distance is computed over all attributes, many irrelevant attributes can dominate it; this difficulty is referred to as the Curse of dimensionality
 Solutions to this problem:
 The axes of more relevant attributes can be stretched and those of less relevant attributes shortened, i.e. attributes can be weighted differently in the distance measure (as sketched below)
 The least relevant attributes can be eliminated from the instance space altogether
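A small illustrative sketch of the attribute-weighting idea (the weights are assumed to be chosen by hand or by cross-validation; a weight of zero eliminates the attribute):

```python
def weighted_euclidean(xi, xj, weights):
    """Euclidean distance with a per-attribute stretch factor; weight 0 removes an attribute."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, xi, xj)) ** 0.5

# Example: the second attribute is judged irrelevant, so its weight is set to 0
print(weighted_euclidean([1.0, 50.0], [2.0, -30.0], weights=[1.0, 0.0]))  # 1.0
```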
A note on terminology:
Regression means approximating a real valued target
function
Residual is the error f̂(x) − f(x) in approximating the target
function
Kernel function is the function of distance that is used to
determine the weight of each training example.
In other words, the kernel function is the function K such that
wi=K(d(xi,xq))
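One commonly used choice (one option among many, not mandated by the slides) is a Gaussian kernel of the distance, sketched below:

```python
import math

def gaussian_kernel(d, sigma=1.0):
    """K(d) = exp(-d^2 / (2 sigma^2)): the weight decays smoothly with distance."""
    return math.exp(-d ** 2 / (2 * sigma ** 2))

w_i = gaussian_kernel(d=0.5)   # weight of a training example at distance 0.5 from x_q
```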
Locally Weighted Linear Regression
• It is a generalization of the NN approach
• Why local?
• Because the function is approximated based only on data near the query point
• Why weighted?
• Because the contribution of each training example is weighted by its distance from the query point
• Why linear?
• The target function is approximated using a linear function f̂(x) = w0 + w1a1(x) + ... + wnan(x);
methods like gradient descent can be used to calculate the coefficients w0, w1, ..., wn
that minimize the error in fitting such a linear function
• Why regression?
• Approximating a real-valued target function
• ANNs require a global approximation to the target function but
here, just a local approximation is needed
• Therefore the error function has to be redefined
Possibilities to redefine the error criterion E
1. Minimize the squared error over just the k nearest neighbours:
E1(xq) ≡ (1/2) Σ x ∈ k nearest neighbours of xq (f(x) − f̂(x))²
2. Minimize the squared error over the entire set D, while weighting the error of each training example by some decreasing function K of its distance from xq:
E2(xq) ≡ (1/2) Σ x ∈ D (f(x) − f̂(x))² · K(d(xq, x))
3. Combine 1 and 2:
E3(xq) ≡ (1/2) Σ x ∈ k nearest neighbours of xq (f(x) − f̂(x))² · K(d(xq, x))
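A sketch of locally weighted linear regression based on criterion E3: fit a linear model over the k nearest neighbours by weighted least squares, with a Gaussian kernel supplying the weights K(d(xq, x)) (numpy-based; function and parameter names are illustrative):

```python
import numpy as np

def lwr_predict(xq, X, y, k=5, sigma=1.0):
    """Fit w0 + w·x by weighted least squares over the k nearest neighbours of xq (criterion E3)."""
    d = np.linalg.norm(X - xq, axis=1)            # distances of all training points to the query
    idx = np.argsort(d)[:k]                       # indices of the k nearest neighbours
    K = np.exp(-d[idx] ** 2 / (2 * sigma ** 2))   # Gaussian kernel weights K(d(xq, x))
    A = np.hstack([np.ones((k, 1)), X[idx]])      # design matrix with a bias column for w0
    W = np.sqrt(K)[:, None]                       # weighted least squares via row scaling
    w, *_ = np.linalg.lstsq(W * A, np.sqrt(K) * y[idx], rcond=None)
    return np.concatenate([[1.0], xq]) @ w        # evaluate the local linear model at xq

# Example: noisy samples of a nonlinear function, predicted locally at xq = 0.5
X = np.linspace(-3, 3, 40).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * np.random.randn(40)
print(lwr_predict(np.array([0.5]), X, y, k=7))
```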
Choice of the error criterion
 E2 is perhaps the most appealing criterion
 because it allows every training example to have an impact on the classification of xq
 However, computational effort grows with the number of
training examples
 E3 is a good approximation to E2 with constant effort
 Rederiving the gradient descent rule for this criterion gives
 ∆wj = η Σ x ∈ k nearest neighbours of xq K(d(xq, x)) (f(x) − f̂(x)) aj(x)
Remarks on locally weighted linear regression:
 In most cases, constant, linear, or quadratic functions are used as the local approximation to the target function
 Because costs for fitting more complex functions are
prohibitively high
 Simple approximations are good enough over a
sufficiently small subregion of instance space
RADIAL BASIS FUNCTIONS
 It is common to choose each kernel function Ku(d(xu,x)) to be a Gaussian centred at xu with some variance σu²:
 Ku(d(xu,x)) = e^(−d²(xu,x) / (2σu²))
 The approximation f̂(x) = w0 + Σ u=1..k wu Ku(d(xu,x)) can then be viewed as describing a two-layer network:
1. Layer 1 consists of units that compute the values of the various Ku(d(xu,x))
2. Layer 2 computes a linear combination of these results
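A compact sketch of such a two-layer RBF network, assuming the centres xu and a shared width σ are fixed in advance (e.g. chosen by clustering) and only the output-layer weights are fit by least squares:

```python
import numpy as np

def rbf_features(X, centers, sigma):
    """Layer 1: Gaussian units K_u(d(x_u, x)) for every instance and every centre."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)   # squared distances
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_rbf(X, y, centers, sigma):
    """Layer 2: fit w0 + sum_u w_u K_u(...) by ordinary least squares."""
    Phi = np.hstack([np.ones((len(X), 1)), rbf_features(X, centers, sigma)])
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def predict_rbf(X, w, centers, sigma):
    Phi = np.hstack([np.ones((len(X), 1)), rbf_features(X, centers, sigma)])
    return Phi @ w

# Example: 1-D regression with 5 evenly spaced centres
X = np.linspace(-3, 3, 50).reshape(-1, 1)
y = np.sin(X).ravel()
centers = np.linspace(-3, 3, 5).reshape(-1, 1)
w = fit_rbf(X, y, centers, sigma=1.0)
print(predict_rbf(np.array([[0.5]]), w, centers, sigma=1.0))
```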
CASE BASED REASONING
 Three important properties of NN and locally weighted regression methods:
1. They are lazy learners
2. A new query is classified by analyzing similar instances
3. Instances are represented as real-valued points in an n-dimensional space
• CBR is based on the first two principles
• But instances are represented using symbolic descriptions rather than real-valued points
• Example:
i. The CADET system uses CBR to assist in the design of simple mechanical devices such as water faucets
ii. Library: 75 designs and design fragments stored in memory
iii. Each instance is stored by describing both its structure and its qualitative function
iv. A new design problem is presented by specifying the desired function and requesting the corresponding structure
A STORED CASE AND A NEW PROBLEM
+ indicates that the variable at the arrowhead increases as the variable at its tail increases
− indicates that the variable at the arrowhead decreases as the variable at its tail increases
Generic Properties of CBR
(Distinguishable from NN method)
 Instances represented by rich symbolic descriptions
 Multiple cases may be combined to form solution to
new problem
 There may be tight coupling between case retrieval,
knowledge based reasoning and problem solving
 Summary:
 CBR is an instance-based learning method in which instances are rich relational descriptions, and in which the retrieval and combination of cases to answer the current query may rely on knowledge-based reasoning and search-intensive problem-solving methods.