Neural Network Lectures: RBF 1
Is it a Neural Network?
Unsupervised Learning
Linear models
It is mathematically easy to fit linear models to data.
We can learn a lot about model-fitting in this relatively simple case.
There are many ways to make linear models more powerful while
retaining their nice mathematical properties:
By using non-linear, non-adaptive basis functions, we can get
generalised linear models that learn non-linear mappings from input
to output but are linear in their parameters; only the linear part of
the model learns.
By using kernel methods we can handle expansions of the raw data
that use a huge number of non-linear, non-adaptive basis functions.
By using large margin kernel methods we can avoid overfitting even
when we use huge numbers of basis functions.
But linear methods will not solve most AI problems.
They have fundamental limitations.
Some types of basis function: sigmoids, Gaussians, polynomials.
\[ y(\mathbf{x}, \mathbf{w}) = w_0 + w_1 x_1 + w_2 x_2 + \dots = \mathbf{w}^T \mathbf{x} \]
\[ y(\mathbf{x}, \mathbf{w}) = w_0 + w_1 \phi_1(\mathbf{x}) + w_2 \phi_2(\mathbf{x}) + \dots = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}) \]
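To make the "linear in the parameters" point concrete, here is a minimal sketch (assuming NumPy, Gaussian basis functions, and hand-picked centres; all names and values are illustrative) of building the feature expansion \(\boldsymbol{\phi}(\mathbf{x})\):

```python
import numpy as np

def gaussian_basis(x, centres, width=1.0):
    """phi_j(x) = exp(-(x - c_j)^2 / (2 * width^2)) for scalar inputs x."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)   # one row per training case
    return np.exp(-(x - centres) ** 2 / (2 * width ** 2))

def design_matrix(x, centres, width=1.0):
    """Prepend a column of ones so that w0 acts as the bias."""
    phi = gaussian_basis(x, centres, width)
    return np.hstack([np.ones((phi.shape[0], 1)), phi])

x = np.linspace(0.0, 1.0, 10)
centres = np.array([0.25, 0.5, 0.75])               # illustrative centres
Phi = design_matrix(x, centres)                      # shape (10, 4)
# y(x, w) = w^T phi(x): non-linear in x, but still linear in the parameters w.
```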
The squared error summed over the training cases is
\[ \text{error} = \sum_{n} \left( t_n - \mathbf{w}^T \mathbf{x}_n \right)^2 . \]
Minimising this error gives the optimal weights in closed form:
\[ \mathbf{w}^{*} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{t} \]
Here \((\mathbf{X}^T \mathbf{X})^{-1}\) is the inverse of the covariance matrix of the input vectors, \(\mathbf{t}\) is the vector of target values, and \(\mathbf{X}^T\) is the transposed design matrix, which has one input vector per column.
For example, a design matrix with one input vector per row and one input component per column:
\[ \mathbf{X} = \begin{pmatrix} 3.1 & 4.2 \\ 1.5 & 2.7 \\ 0.6 & 1.8 \end{pmatrix} \]
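A minimal NumPy sketch of the closed-form solution, reusing the example design matrix above (the target values are made up for illustration, and np.linalg.solve is used rather than forming the inverse explicitly):

```python
import numpy as np

X = np.array([[3.1, 4.2],
              [1.5, 2.7],
              [0.6, 1.8]])           # design matrix: one input vector per row
t = np.array([1.0, 0.5, 0.2])        # illustrative target values

# w* = (X^T X)^{-1} X^T t, computed without forming the inverse explicitly
w_star = np.linalg.solve(X.T @ X, X.T @ t)
print(w_star)
```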
A probabilistic interpretation: \( y_n = y(\mathbf{x}_n, \mathbf{w}) \) is the model's estimate of the most probable value of the correct answer \( t_n \). Assuming Gaussian output noise with standard deviation \(\sigma\),
\[ p(t_n \mid y_n) = p(y_n + \text{noise} = t_n \mid \mathbf{x}_n, \mathbf{w}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{(t_n - y_n)^2}{2\sigma^2} \right) \]
so that
\[ -\log p(t_n \mid y_n) = \log \sqrt{2\pi} + \log \sigma + \frac{(t_n - y_n)^2}{2\sigma^2} . \]
The first term is a constant, and \(\log \sigma\) can be ignored if \(\sigma\) is fixed and the same for every case, so maximising the probability of the targets is equivalent to minimising the squared error \((t_n - y_n)^2\).
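A quick numerical check of this relationship, as a sketch (NumPy; the value of \(\sigma\) and the sample target/prediction pair are arbitrary):

```python
import numpy as np

sigma = 0.3
t_n, y_n = 1.2, 0.9                  # arbitrary target and model prediction

# Negative log-likelihood written term by term, as above
nll = np.log(np.sqrt(2 * np.pi)) + np.log(sigma) \
      + (t_n - y_n) ** 2 / (2 * sigma ** 2)

# Direct evaluation of -log N(t_n | y_n, sigma^2) for comparison
density = np.exp(-(t_n - y_n) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
assert np.isclose(nll, -np.log(density))
# Only the squared-error term depends on the weights, so maximum likelihood
# fitting and least-squares fitting pick the same w.
```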
Multiple outputs
If there are multiple outputs we can often treat the learning
problem as a set of independent problems, one per output.
This is not true if the output noise is correlated and changes from
case to case.
Even though they are independent problems, we can save work because the inverse covariance of the input components, \((\mathbf{X}^T \mathbf{X})^{-1}\), only has to be computed once. For output \(k\) we have:
\[ \mathbf{w}_k^{*} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{t}_k \]
where \((\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T\) does not depend on \(k\).
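A sketch of that shared computation (NumPy, with randomly generated data standing in for a real design matrix and target matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))        # design matrix: 100 cases, 5 input components
T = rng.normal(size=(100, 3))        # one column of targets per output

# Solving with the whole target matrix reuses the factorisation of X^T X
# for every output column instead of redoing it once per output.
W = np.linalg.solve(X.T @ X, X.T @ T)   # shape (5, 3): one weight vector per output
```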
Instead of solving for the weights in one go, we can update them after each training case:
\[ \mathbf{w}^{(\tau+1)} = \mathbf{w}^{(\tau)} - \eta \, \nabla E_n\!\left(\mathbf{w}^{(\tau)}\right) \]
where \(\mathbf{w}^{(\tau+1)}\) is the weight vector after seeing training case \(\tau + 1\) and \(\eta\) is the learning rate.
This is called online learning. It can be more efficient if the dataset is very
redundant, and it is simple to implement in hardware.
It is also called stochastic gradient descent if the training cases are picked
at random.
Care must be taken with the learning rate to prevent divergent
oscillations, and the rate must decrease at the end to get a good fit.
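A minimal sketch of stochastic gradient descent for the linear model (NumPy; the synthetic data and the decaying learning-rate schedule are illustrative choices, not something prescribed by the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                    # design matrix
true_w = np.array([1.0, -2.0, 0.5])
t = X @ true_w + 0.1 * rng.normal(size=200)      # noisy targets

w = np.zeros(3)
eta0 = 0.05
for step in range(2000):
    n = rng.integers(len(t))                     # pick a training case at random
    eta = eta0 / (1 + step / 500)                # learning rate decreases over time
    grad = -(t[n] - X[n] @ w) * X[n]             # gradient of (1/2)(t_n - w^T x_n)^2
    w -= eta * grad
# w should now be close to true_w
```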
If we penalise large weights by adding \(\frac{\lambda}{2}\,\|\mathbf{w}\|^2\) to the squared error, the optimal weights become
\[ \mathbf{w}^{*} = (\lambda \mathbf{I} + \mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{t} \]
where \(\mathbf{I}\) is the identity matrix and \(\lambda\) controls the strength of the penalty.
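A sketch of the regularised solution (NumPy; the default \(\lambda = 0.1\) is an arbitrary illustrative value):

```python
import numpy as np

def ridge_weights(X, t, lam=0.1):
    """w* = (lambda*I + X^T X)^{-1} X^T t, without forming the inverse."""
    d = X.shape[1]
    return np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ t)
```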
Radial Basis Functions
Choose basis functions that depend only on the distance from a centre \(\mathbf{x}_i\):
\[ \phi_i(\mathbf{x}) = \phi(\|\mathbf{x} - \mathbf{x}_i\|), \]
where \(\|\cdot\|\) is the Euclidean norm. \(\phi_i\) is then called a radial basis function (RBF).
Common choices of RBF, with \(r = \|\mathbf{x} - \mathbf{x}_i\|\):
Linear: \(\phi(r) = r\)
Cubic: \(\phi(r) = r^3\)
Multiquadric: \(\phi(r) = \sqrt{r^2 + c^2}\)
Polyharmonic splines: \(\phi(r) = r^{2n} \log r,\ n \ge 1\), in 2D; \(\phi(r) = r^{2n-1},\ n \ge 1\), in 3D
Gaussian: \(\phi(r) = e^{-c r^2}\)
Compactly supported (Wendland): \(\phi(r) = (1 - r)_+^4 \,(4r + 1)\)
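A minimal sketch of fitting with radial basis functions (NumPy; the Gaussian RBF, placing one centre at every data point, and the 1-D toy data are all illustrative assumptions):

```python
import numpy as np

def gaussian_rbf(r, c=5.0):
    """phi(r) = exp(-c * r^2), one of the RBF choices listed above."""
    return np.exp(-c * r ** 2)

# Toy 1-D data; every data point also serves as an RBF centre x_i.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
t = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=x.size)

r = np.abs(x[:, None] - x[None, :])     # pairwise distances |x_n - x_i|
Phi = gaussian_rbf(r)                   # Phi[n, i] = phi(|x_n - x_i|)

# The same linear least-squares machinery as before, now with RBF features.
# A tiny ridge term keeps the solve well conditioned.
w = np.linalg.solve(Phi.T @ Phi + 1e-8 * np.eye(x.size), Phi.T @ t)

# Prediction at a new point: y(x) = sum_i w_i * phi(|x - x_i|)
x_new = 0.37
y_new = gaussian_rbf(np.abs(x_new - x)) @ w
```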