Memory-Based Learning: ENPM808F: Robot Learning Summer 2017
Robot Learning
Summer 2017
Lecture 2:
Memory-Based Learning
Course Outline
• Motor Learning and the Evolution of Intelligence
• Memory-Based Learning
• Behavior-Based Robotics
• Reinforcement Learning
• Value versus Policy Iteration
• Q-Learning and Actor-Critic Models
• Robot Shaping and Evolving Behaviors
• Crossing the Reality Gap
• Imitation and Learning from Demonstration
• Deep Reinforcement Learning with CNNs
• On-line and Lifelong Learning
Global vs. Local Learning
Global Learning
Advantages:
• Compact Representation
• Automatic Resource Allocation
• Generally Continuous and Differentiable Mappings
Disadvantages:
• Very Slow Convergence
• Unpredictable Local Minima (may not converge to global minimum)
• Computationally Expensive
• Generalization Not Easily Controllable
• Comparatively Poor Accuracy (on some problems)
Local Learning
Advantages:
• Rapid Convergence
• Computationally Inexpensive
• No Local Minima
• Convergence Guaranteed
• Very High Accuracy
Disadvantages:
• Memory Intensive
• Resource Allocation Not Automatic
• Continuity and Differentiability of Mapping More Difficult to Guarantee
Secant Approximation to Tangent
Continuous CMAC
Curse of Dimensionality
vs.
Blessing of Non-Uniformity*
* Pedro Domingos
Lazy Learning methods store all of the training data and use it only
when called with a new input vector (query) to perform a
mapping. They make no assumptions about the overall
shape of the global mapping before the query is presented.
Also referred to as Instance-based Learning methods.
Examples include: k-Nearest Neighbors, Locally Weighted Regression,
and Case-Based Reasoning
Lazy versus Eager Learning
(figure: a query point $x_q$ among the stored training exemplars)
k-Nearest Neighbor
(figure: the query point $x_q$ and its nearest training exemplars)
k-Nearest Neighbor
k = 6
$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} f(x_i)}{k}$
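As a minimal sketch of this averaging rule (the NumPy helper below and its names are illustrative, not from the slides):

```python
import numpy as np

def knn_predict(X_train, y_train, x_q, k=6):
    """Plain k-NN regression: average the stored target values of the k nearest exemplars.
    (For discrete-valued targets, a majority vote over the same neighbors would be used.)"""
    dists = np.linalg.norm(X_train - x_q, axis=1)  # Euclidean distance to every stored exemplar
    nearest = np.argsort(dists)[:k]                # indices of the k closest exemplars
    return np.mean(y_train[nearest])               # f_hat(x_q) = (1/k) * sum_i f(x_i)
```

All stored points are kept; the only work happens at query time, which is what makes the method lazy.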
k-Nearest Neighbor
Requires:
• Training exemplars and queries map to points in $\Re^n$
• Small input vectors
• Sufficient density of training data to cover areas of interest
Advantages:
• No information is lost
• Fast training
• Can model highly complex surfaces
Disadvantages:
• Additional computational complexity to answer queries
• Weights all input attributes equally
• Suffers from Curse of Dimensionality
k-Nearest Neighbor Algorithm
(Real-valued)
Neighbors are weighted based upon their distance from the query point $x_q$.
Define the distance $d(x_q, x_i)$ between $x_q$ and each exemplar $x_i$.
Define weights $w_i \equiv \frac{1}{d(x_q, x_i)^2}$
Then, given query $x_q$:
$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$
Shepard’s Method
• If all training examples are used (rather than only the k nearest), the method is known as Shepard’s Method; since all points contribute to every prediction, it is a global learning algorithm.
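A sketch of the distance-weighted rule above; passing k=None uses every stored point, which corresponds to Shepard’s Method. The guard for a zero-distance exact match is my addition:

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_q, k=None):
    """Distance-weighted k-NN regression with weights w_i = 1 / d(x_q, x_i)^2.
    k=None uses all exemplars (Shepard's Method)."""
    d = np.linalg.norm(X_train - x_q, axis=1)
    if np.any(d == 0):                       # query coincides with a stored exemplar
        return y_train[int(np.argmin(d))]
    idx = np.argsort(d)[:k] if k is not None else np.arange(len(d))
    w = 1.0 / d[idx] ** 2                    # closer exemplars get larger weights
    return float(np.dot(w, y_train[idx]) / np.sum(w))
```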
Nearest Neighbor vs
Locally Weighted Regression
k-Nearest Neighbor
(Discrete)
Distance Weighted
k-Nearest Neighbor
(Continuous)
Locally Weighted
Regression
Locally Weighted Regression
LWR is a Lazy Learning method in which an approximation $\hat{f}(x)$ is
formed around each query point $x_q$.
• It is Local since only the points near $x_q$ are used.
$\hat{f}(x) = w_0 + w_1 a_1(x) + \cdots + w_n a_n(x)$
Locally Weighted Linear Regression
To weight the influence of points based upon distance, define a kernel function, e.g.
$K(d) \equiv \frac{1}{d}$
$K(d) \equiv \frac{1}{d^2}$
$K(d) \equiv e^{-d^2}$
To minimize the weighted error across the entire training set:
$E(x_q) \equiv \frac{1}{2} \sum_{x \in D} \bigl(f(x) - \hat{f}(x)\bigr)^2 \, K\bigl(d(x_q, x)\bigr)$
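Because $\hat{f}$ is linear in the weights, this criterion can also be minimized directly by weighted least squares at each query; a minimal sketch, assuming the Gaussian-style kernel $K(d) = e^{-d^2}$ from above (the function and variable names are mine):

```python
import numpy as np

def lwr_predict(X, y, x_q):
    """Locally weighted linear regression: fit w around x_q by minimizing
    E(x_q) = 1/2 * sum_x K(d(x_q, x)) * (f(x) - f_hat(x))^2, then evaluate at x_q."""
    A = np.hstack([np.ones((len(X), 1)), X])             # attributes with a leading 1 for w_0
    K = np.exp(-np.linalg.norm(X - x_q, axis=1) ** 2)    # kernel weights K(d(x_q, x))
    s = np.sqrt(K)[:, None]                              # scale rows so lstsq minimizes the weighted error
    w, *_ = np.linalg.lstsq(s * A, s[:, 0] * y, rcond=None)
    return float(w[0] + np.dot(w[1:], x_q))
```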
Locally Weighted Linear Regression
$\Delta w_j = \eta \sum_{x \in D} K\bigl(d(x_q, x)\bigr)\,\bigl(f(x) - \hat{f}(x)\bigr)\, a_j(x)$
with learning rate $\eta$, for each attribute $a_j(x)$ of the input vector $x$.
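The same criterion can instead be minimized iteratively with the update above; a hedged sketch of a single gradient step (again assuming $K(d) = e^{-d^2}$ and illustrative names):

```python
import numpy as np

def lwr_gradient_step(X, y, x_q, w, eta=0.01):
    """One gradient-descent update of the local weights w (w[0] is the bias term) for query x_q:
    delta w_j = eta * sum_x K(d(x_q, x)) * (f(x) - f_hat(x)) * a_j(x)."""
    A = np.hstack([np.ones((len(X), 1)), X])             # a_0(x)=1, a_1(x)..a_n(x)
    K = np.exp(-np.linalg.norm(X - x_q, axis=1) ** 2)    # kernel weights
    err = y - A @ w                                      # f(x) - f_hat(x) for every exemplar
    return w + eta * (K * err) @ A                       # apply the weighted update
```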
Locally Weighted Nonlinear Regression
$a(x) = \bigl(a_1(x),\, a_2(x),\, \ldots,\, a_n(x)\bigr)$, where the attributes $a_j$ may themselves be nonlinear functions of the input.
Radial Basis Func=on Networks
$\hat{f}(x) = w_0 + \sum_{u=1}^{k} w_u \, K_u\bigl(d(x_u, x)\bigr)$
The kernel function $K_u$ is defined so that
it decreases with increasing distance $d(x_u, x)$.
A common choice for $K_u$ is the
Gaussian function
$K_u\bigl(d(x_u, x)\bigr) = e^{-\frac{1}{2\sigma_u^2} d^2(x_u, x)}$
where $\sigma_u^2$ denotes the variance of the Gaussian centered at $x_u$.
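A sketch of this prediction equation, assuming (for brevity) a single shared variance for all Gaussian kernels; the helper name and the shared-variance choice are mine, not from the slides:

```python
import numpy as np

def rbf_predict(centers, w, x, sigma2=1.0):
    """f_hat(x) = w[0] + sum_u w[u] * K_u(d(x_u, x)), with Gaussian kernels
    K_u = exp(-d^2(x_u, x) / (2 * sigma_u^2)); one shared sigma2 is assumed here."""
    d2 = np.sum((centers - x) ** 2, axis=1)   # squared distances d^2(x_u, x) to every center
    phi = np.exp(-d2 / (2.0 * sigma2))        # Gaussian kernel activations
    return float(w[0] + np.dot(w[1:], phi))
```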
RBF Network Training
Minimize either the global error over the training set
$E \equiv \frac{1}{2} \sum_{x \in D} \bigl(f(x) - \hat{f}(x)\bigr)^2$
or a localized error function, e.g.
$E(x_q) \equiv \frac{1}{2} \sum_{x \in D} \bigl(f(x) - \hat{f}(x)\bigr)^2 \, K\bigl(d(x_q, x)\bigr)$
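If the kernel centers and variances are held fixed (e.g., one Gaussian per training point or per cluster), minimizing the global error above over the output weights reduces to linear least squares; a sketch under that assumption:

```python
import numpy as np

def rbf_fit_weights(centers, X, y, sigma2=1.0):
    """Fit w_0..w_k by least squares, minimizing E = 1/2 * sum_x (f(x) - f_hat(x))^2
    with the kernel centers and a shared variance held fixed."""
    d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)           # (N, k) squared distances
    Phi = np.hstack([np.ones((len(X), 1)), np.exp(-d2 / (2.0 * sigma2))])     # bias column + activations
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w
```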
Locally Weighted Learning
for Inverse Models
$u = \hat{f}^{-1}(x, y)$
A learned database implementing the inverse model.
Pros:
• The database is “trained” by adding new points $(x, u, y)$ (see the sketch after this list)
• If there is a monotonic relationship between $u$ and $y$, then there
are efficient methods for rapidly converging on the correct mapping
Cons:
• May not work if
Ø Vector space of actions and outcomes is not the same
Ø Mapping is not one-to-one
Ø Data include misleading noisy observations
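A sketch of such a database as a plain list of stored experiences: training appends a triple, and a query returns the stored action whose (state, outcome) pair lies closest to the current state and desired outcome. The class, its names, and the equal weighting of state and outcome distances are assumptions made for illustration:

```python
import numpy as np

class LazyInverseModel:
    """Memory of (x, u, y) triples, queried as u = f_inv(x, y_desired)."""
    def __init__(self):
        self.data = []                                   # stored (x, u, y) experiences

    def add(self, x, u, y):
        # "Training" is simply storing the observed experience.
        self.data.append((np.asarray(x), np.asarray(u), np.asarray(y)))

    def query(self, x, y_desired):
        # Nearest stored experience in the joint (state, outcome) space, equally weighted.
        costs = [np.linalg.norm(x - xi) + np.linalg.norm(y_desired - yi)
                 for xi, _, yi in self.data]
        return self.data[int(np.argmin(costs))][1]       # return the stored action u
```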
Locally Weighted Learning
for Forward Models
$y = \hat{f}(x, u)$
A learned database implementing the forward model.
Locally Weighted Learning
for Forward Models
Pros:
• The database is “trained” by adding new points $(x, u, y)$
• Allows “mental simulation,” or prediction of the effects of different actions
Cons:
• Requires a search of the database to find the action that corresponds
to the desired outcome for the current state (a minimal search sketch follows).
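A sketch of that search, treating the lazy forward model as a black-box predictor and scanning a set of candidate actions; the function name and the candidate-set idea are illustrative assumptions:

```python
import numpy as np

def search_actions(forward_model, x, y_desired, candidate_actions):
    """Pick the candidate action whose predicted outcome f_hat(x, u) is closest
    to the desired outcome for the current state x."""
    errors = [np.linalg.norm(y_desired - forward_model(x, u)) for u in candidate_actions]
    return candidate_actions[int(np.argmin(errors))]
```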
Combining Inverse and
Forward Models
An Inverse Model may be used to generate a good starting point
for a search of a Forward Model:
$u_0 = \hat{f}^{-1}(x, y)$
The action $u_0$ may then be used with a Lazy Forward Model:
$\hat{y} = \hat{f}(x, u_0)$
If $\hat{y}$ is close to the desired outcome $y$, then Newton’s Method may be used to
further refine $u$.
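A sketch of this combination for a scalar action and outcome: the inverse model supplies $u_0$, the forward model predicts $\hat{y}$, and Newton’s Method refines $u$ using a finite-difference estimate of the forward model’s slope. The step size, iteration count, and names are assumptions:

```python
import numpy as np

def refine_action(forward_model, x, u0, y_desired, steps=5, du=1e-3):
    """Newton's Method on g(u) = forward_model(x, u) - y_desired,
    starting from the inverse model's suggestion u0 (scalar u and y assumed)."""
    u = u0
    for _ in range(steps):
        g = forward_model(x, u) - y_desired
        slope = (forward_model(x, u + du) - forward_model(x, u - du)) / (2 * du)  # finite difference
        if abs(slope) < 1e-9:        # nearly flat: stop rather than divide by ~0
            break
        u = u - g / slope            # Newton update
    return u
```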
Locally Weighted Learning
for Robotic Control
• Atkeson, C. G., Moore, A. W., and Schaal, S., “Locally Weighted Learning for Control,”
Artificial Intelligence Review, 11(1-5): 75-113, 1997.
• Domingos, P., “A Few Useful Things to Know About Machine Learning,”
Communications of the ACM, 55(10): 78-87, 2012.
1) Program a Discrete CMAC and train it on a 1-D function (ref: Albus 1975, Fig. 5).
Explore the effect of overlap area on generalization and time to convergence.
2) Program a Continuous CMAC by allowing partial cell overlap, and modifying
the weight update rule accordingly. Compare the output of the Discrete CMAC
with that of the Continuous CMAC.
3) Discuss how you might use recurrent connections to train a CMAC to output
a desired trajectory without using time as an input (e.g., state only).