
Lecture-3

Learning in Feedforward
Neural Networks
Learning - A trick to learn θ

The perceptron fires when the weighted sum of its inputs reaches the threshold θ:

$$\sum_{i=1}^{n} w_i x_i \geq \theta \qquad\Longleftrightarrow\qquad \sum_{i=1}^{n} w_i x_i - \theta \geq 0$$

Written out:

$$w_1 x_1 + w_2 x_2 + \dots + w_n x_n - \theta > 0$$
$$w_1 x_1 + w_2 x_2 + \dots + w_n x_n + \theta \cdot (-1) > 0$$

The trick: θ behaves like one more weight attached to a constant input of -1, so it can be learned exactly like the other weights.
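To make the trick concrete, here is a minimal sketch (the inputs, weights and threshold below are hypothetical, not from the lecture) showing that the two firing conditions agree:

```python
import numpy as np

x = np.array([0.5, 0.8])           # hypothetical inputs
w = np.array([1.0, 2.0])           # hypothetical weights
theta = 1.5                        # hypothetical threshold

# Original rule: fire when the weighted sum reaches the threshold.
fires_original = (w @ x) >= theta

# Trick: append a constant -1 input and fold theta into the weights.
x_ext = np.append(x, -1.0)         # inputs become (x1, ..., xn, -1)
w_ext = np.append(w, theta)        # weights become (w1, ..., wn, theta)
fires_trick = (w_ext @ x_ext) >= 0

assert fires_original == fires_trick   # both rules agree
```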
Learning - A trick to learn θ
Linearly Separable Functions
• Definition: Sets of points in 2-D space are linearly separable if the sets can be separated by a straight line.
• Generalizing, a set of points in n-dimensional space is linearly separable if a hyperplane of (n-1) dimensions separates the sets.
• Example:

• Note: A perceptron cannot find weights for classification problems that are not linearly separable.
Linearly Separable Functions

• A black dot corresponds to an output value of 1; an empty dot corresponds to an output value of 0.
• Can represent AND, OR, NOT, etc., but not XOR

• Can learn majority functions:


Linearly Separable Functions
• However, perceptrons are limited to solving problems that
are linearly separable.
• If two classes are linearly separable, this means that we can draw a
single line to separate the two classes.
• We can do this easily for the AND and OR gates, but there is no
single line that can separate the classes for the XOR gate! This
means that we can’t use our single-layer perceptron to model an
XOR gate.
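To see this concretely, here is a minimal sketch (not from the lecture) that runs the classic perceptron update on the AND and XOR truth tables; it converges on AND but never on XOR:

```python
import numpy as np

def train_perceptron(X, y, epochs=100, eta=0.1):
    # Classic perceptron rule, with the threshold folded in as a bias
    # weight on a constant -1 input (the "trick" from earlier slides).
    Xb = np.hstack([X, -np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(Xb, y):
            out = 1 if w @ xi >= 0 else 0
            w += eta * (target - out) * xi
            errors += int(out != target)
        if errors == 0:
            return True    # converged: the data is linearly separable
    return False           # no convergence within `epochs`

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(train_perceptron(X, np.array([0, 0, 0, 1])))  # AND -> True
print(train_perceptron(X, np.array([0, 1, 1, 0])))  # XOR -> False
```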
Linearly Separable Functions: Example 1

• Construct a perceptron which correctly classifies the following data; use your knowledge of plane geometry to choose appropriate values for the weights w0, w1 and w2.

X1 X2 Class
0 1 -1
2 0 -1
1 1 +1
Linearly Separable Functions: Example 1

• The first step is to plot the data on a 2-D graph, and draw a
line which separates the positive from the negative data
points:

• This line has slope -1/2 and x2-intercept 5/4, so its equation is:
x2 = 5/4 - x1/2,
i.e. 2x1 + 4x2 - 5 = 0.
• Taking account of which side is positive, this corresponds to
these weights:
w0 = -5, w1 = 2, w2 = 4
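As a quick check (not part of the original slides), the following snippet verifies that these weights classify all three points correctly, assuming the perceptron outputs +1 when w0 + w1*x1 + w2*x2 > 0 and -1 otherwise:

```python
w0, w1, w2 = -5, 2, 4
for (x1, x2), label in [((0, 1), -1), ((2, 0), -1), ((1, 1), +1)]:
    pred = +1 if w0 + w1 * x1 + w2 * x2 > 0 else -1
    print((x1, x2), "label:", label, "prediction:", pred)  # all three match
```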
Single-layer perceptron

• Inputs are typically in the range [0, 1], where 0 is "off" and 1 is "on".
• Weights can be any real number (positive or negative).
What is the role of the bias in NN?

• Neuron bias can be shown as an additional input with a fixed value b and weight 1.
What is the role of the bias in NN?

• A bias value allows you to shift the activation function to the left or right, which may be critical for successful learning.

• A simple example. Consider a neural network that has no bias:

• The output of the network is computed by multiplying the input (x) by the weight (w0) and passing the result through some kind of activation function (e.g. a sigmoid function).
What is the role of the bias in NN?

• Here is the function that this network computes, for various values of w0:
What is the role of the bias in NN?

• Changing the weight w0 essentially changes the "steepness" of the sigmoid. That's useful, but what if you wanted the network to output 0 when x is 2? Just changing the steepness of the sigmoid won't really work -- you want to be able to shift the entire curve to the right.
• That's exactly what the bias allows you to do. If we add a bias
to that network, like so:
What is the role of the bias in NN?
• Then the output of the network becomes sig(w0*x + w1*1.0).
Here is what the output of the network looks like for various
values of w1:

• Having a weight of -5 for w1 shifts the curve to the right, which allows us to have a network that outputs (approximately) 0 when x is 2.
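A small sketch of this behaviour, assuming w0 = 1 and a standard sigmoid (the w1 values are those discussed above):

```python
import numpy as np

def sig(t):
    return 1.0 / (1.0 + np.exp(-t))

w0, x = 1.0, 2.0
for w1 in (0.0, -2.0, -5.0):   # bias weight shifts the curve horizontally
    print(f"w1 = {w1:4.1f}: sig(w0*x + w1) at x = 2 is {sig(w0 * x + w1):.3f}")
# With w1 = -5 the output at x = 2 is about 0.047, i.e. close to 0:
# the bias has shifted the whole sigmoid to the right.
```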
Bias of a Neuron
• We need the bias value to be added to the weighted sum ∑wixi so that we can shift the decision boundary away from the origin.

v = ∑wixi + b, where b is the bias

[Figure: three parallel decision boundaries in the (x1, x2) plane: x1 - x2 = -1, x1 - x2 = 0, and x1 - x2 = 1.]
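To connect the figure to the formula, take w1 = 1 and w2 = -1 (an assumption that matches the lines shown). The decision boundary v = 0 is then

$$x_1 - x_2 + b = 0 \qquad\Longleftrightarrow\qquad x_1 - x_2 = -b,$$

so b = 1, 0, -1 yields the three parallel lines x1 - x2 = -1, 0, 1: the bias translates the boundary away from the origin without rotating it.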
Bias as extra input

[Diagram: a neuron with inputs x0 = +1, x1, x2, ..., xm and weights w0, w1, w2, ..., wm feeding a summing function that produces v, followed by an activation function φ(·) that gives the output y.]

$$v = \sum_{j=0}^{m} w_j x_j, \qquad w_0 = b$$
Bias of a Neuron: Example
• In the perceptron below, what will the output be when the input is (0, 0)? What about inputs (0, 1), (1, 1) and (1, 0)? What if we change the bias weight to -0.5?
Bias of a Neuron: Example
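The slide's perceptron diagram is not reproduced here, so the sketch below assumes unit input weights and a bias weight of -1.5; under that assumption the neuron computes AND, and changing the bias weight to -0.5 turns it into OR:

```python
def perceptron(x1, x2, w1=1.0, w2=1.0, w_bias=-1.5):
    # Step activation: output 1 when w1*x1 + w2*x2 + w_bias >= 0, else 0.
    # The weights here are placeholders; substitute the diagram's values.
    return 1 if w1 * x1 + w2 * x2 + w_bias >= 0 else 0

for x in [(0, 0), (0, 1), (1, 1), (1, 0)]:
    print(x, perceptron(*x), perceptron(*x, w_bias=-0.5))  # AND vs. OR
```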
Perceptron's Learning Rule
• Choosing the weights and threshold θ for the perceptron is not easy! How can we learn the weights and threshold from examples?
• We can use a learning algorithm that adjusts the weights and threshold θ based on examples.
• Adjust the weights in such a way that the output of the ANN is consistent with the class labels of the training examples.
– Error function:

$$e = \sum_{i} \big[\, y_i - f(u(x_i)) \,\big]^2$$

– Find the weights wi that minimize the above error function
• e.g., gradient descent, backpropagation algorithm
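A minimal sketch of gradient descent on this squared error for a single sigmoid neuron (the AND-gate data below is a hypothetical stand-in; the lecture's own examples follow on the next slides):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Inputs with a constant +1 appended, so the bias b = w0 is learned too.
X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
y = np.array([0., 0., 0., 1.])     # hypothetical AND targets
w = np.zeros(3)
eta = 0.5

for _ in range(5000):
    out = sigmoid(X @ w)
    # de/dw for e = sum_i (y_i - f(u(x_i)))^2, f = sigmoid, u(x) = w.x:
    grad = -2 * X.T @ ((y - out) * out * (1 - out))
    w -= eta * grad

print(np.round(sigmoid(X @ w)))    # approximately [0, 0, 0, 1]
```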
Perceptron's Learning Rule
Perceptron's Learning Rule
Learning Rule: Example 1
Learning Rule: Example 1
Learning Rule: Example 2

• Do the same as in the previous slide, but using


learning rate η = 0.3.

• Do the same as in the previous slide, but considering


the threshold ɵ as a weight to be learnt.
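The data from the previous slide is not reproduced in this text, so this sketch uses hypothetical OR-gate examples; the learning rate η and the threshold-as-weight idea are the points being exercised:

```python
import numpy as np

def train(X, y, eta=0.3, epochs=50):
    # Threshold learned as a weight: append a constant -1 input, so the
    # last component of w plays the role of theta.
    Xb = np.hstack([X, -np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, t in zip(Xb, y):
            out = 1 if w @ xi >= 0 else 0
            w += eta * (t - out) * xi      # perceptron update rule
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])                 # hypothetical OR targets
print(train(X, y))                         # last entry is the learned theta
```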
Learning Rule: Example 3
Perceptron: A linear classifier
Applications of Perceptrons

• Perceptrons can be used only for linearly separable data.

– SPAM filter.

– Web page classification.

• However, if we use a transformation of the data to another space in which data separation is easier (kernels), perceptrons can have more applications (see the sketch after this list).

• Neural networks with several layers of artificial neurons (multilayer perceptrons) can be used as universal approximators (next lecture)!
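A sketch of the transformation idea (the product feature and the separating weights below are illustrative choices, not from the lecture): XOR is not linearly separable in (x1, x2), but adding the feature x1*x2 makes it so.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])                      # XOR targets

Phi = np.column_stack([X, X[:, 0] * X[:, 1]])   # map to (x1, x2, x1*x2)

# In the transformed space a single hyperplane separates the classes:
w, theta = np.array([1., 1., -2.]), 0.5
print((Phi @ w >= theta).astype(int))           # -> [0 1 1 0], matching y
```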
Limitations of Single Layer Perceptrons

• Uses only a binary (step) activation function
• Can be used only for linearly separable problems
• Training time is long
• Cannot solve linearly inseparable problems (e.g., XOR)


Example: Linearly Separable Problem

Can be solved using a single-layer feedforward perceptron.
Define input and output data
Generate random numbers for input
Generate random numbers for output
Create and train the perceptron
Plot the decision boundary
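The code behind these closing slides is not included in the scanned text. A hedged Python reconstruction of the same workflow (random linearly separable data, perceptron training, boundary plot) might look like this:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Define input and output data: random 2-D points, labelled by a known
# line so the problem is linearly separable by construction.
X = rng.uniform(-1, 1, size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # target boundary: x1 + x2 = 0

# Create and train the perceptron (threshold as a weight on a -1 input).
Xb = np.hstack([X, -np.ones((len(X), 1))])
w = np.zeros(3)
for _ in range(50):
    for xi, t in zip(Xb, y):
        out = 1 if w @ xi >= 0 else 0
        w += 0.1 * (t - out) * xi

# Plot the learned decision boundary w[0]*x1 + w[1]*x2 - w[2] = 0.
xs = np.linspace(-1, 1, 2)
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.plot(xs, (w[2] - w[0] * xs) / w[1])      # x2 = (theta - w1*x1) / w2
plt.show()
```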
