
Neural Network

Presented by
Md. Hafiz Ahamed
Lecturer
Dept. of Mechatronics Engineering
Rajshahi University of Engineering & Technology
Artificial neural networks
▪ Artificial neural networks are essentially modeled on the parallel architecture of animal brains, not
necessarily human ones. The network is based on a simple model of inputs and outputs.

▪ In biological terms, a neuron is a cell that can transmit and process chemical or electrical signals.

Main components of a neural network (or a parallel distributed model)

▪ A set of processing units (called neurons or cells);
▪ A state of activation yi for every unit, which is equivalent to the output of the unit;
▪ Connections between the units; each connection is defined by a weight wjk which determines the
effect that the signal of unit j has on unit k. A positive wjk is considered an excitation, a negative
wjk an inhibition;
▪ A propagation rule, which determines the effective input xi of a unit from its external inputs;
▪ An activation function f, which determines the new level of activation based on the effective input
xi(t) and the current activation yi(t);
▪ An external input (also known as bias or offset) θi for each unit;
▪ A method for information gathering (the learning rule);
▪ An environment within which the system must operate, providing input signals and, if necessary,
error signals.
Types of Neural Networks

Neural networks can be classified according to:


The nature of information processing carried out at individual nodes:
▪ single-layer network (perceptron);
▪ multi-layer network;
The connection geometry:
▪ feedforward network;
▪ backpropagation network;

Perceptron
▪ The basis for a neural network is the perceptron.
▪ It receives an input signal and then passes the value through some form of function.
▪ It outputs the result of the function.
▪ Consists of a single neuron with multiple inputs and a single output.
▪ It has restricted information processing capability.
▪ The information processing is done through a transfer function, which is either linear or
non-linear.
▪ The neuron can be trained to learn different simple tasks by modifying its threshold and
input weights.

Perceptron
The architecture of a simple perceptron
▪ X1, X2, ..., Xi, …, XN are inputs. These could be real numbers or Boolean values
depending on the problem.
▪ Y is the output and is Boolean.
▪ w1, w2, ..., wi, …, wN are weights of
the edges and are real-valued.
▪ θ is the threshold and is a real value.
▪ The role of the perceptron is to
classify a set of stimuli into one of
the two available classes.
▪ The decision regions are separated by a hyperplane.
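As a sketch of this computation (the weights, threshold, and inputs below are illustrative, not taken from the slides), the perceptron's classification rule can be written as:

```python
# Minimal perceptron forward pass: compute the weighted sum of the
# inputs and compare it against the threshold theta; the output Y is
# Boolean (0 or 1), i.e. one of the two available classes.
def perceptron_output(x, w, theta):
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net >= theta else 0

# Illustrative weights and threshold: one active input clears the threshold.
print(perceptron_output([1, 0], [0.6, 0.6], 0.5))  # -> 1
print(perceptron_output([0, 0], [0.6, 0.6], 0.5))  # -> 0
```

The set of points where the weighted sum exactly equals θ is the separating hyperplane.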
Activation Functions

(The activation-function plots from these slides are not reproduced here.)
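As a hedged sketch, the activation functions typically shown at this point in such a course (the hard threshold, the binary sigmoid, and the bipolar sigmoid) can be written as:

```python
import math

def step(x, theta=0.0):
    """Hard threshold: 1 if the input is at or above theta, else 0."""
    return 1 if x >= theta else 0

def sigmoid(x):
    """Binary sigmoid, output range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def bipolar_sigmoid(x):
    """Bipolar sigmoid, output range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

print(step(0.3), sigmoid(0.0), bipolar_sigmoid(0.0))  # -> 1 0.5 0.0
```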
How the Perceptron Learns a Task

If the perceptron gives a wrong (undesirable) output, then one of two things could have happened:

▪ The desired output is 0, but the net input is above threshold. So the actual output becomes 1. In such a
case we should decrease the weights.
▪ The desired output is 1, but the net input is below threshold. We should now increase the weights.

Weight update rules:
1. The perceptron rule, and
2. Delta rule

The Perceptron Rule


The algorithm starts with a random hyperplane and incrementally modifies it so that misclassified
points move closer to the correct side of the boundary. The algorithm stops when all training
examples are correctly classified.
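The rule above can be sketched as a training loop (the learning rate eta and the treatment of the threshold as a trainable value are assumptions, not fixed by the slides):

```python
# Perceptron rule sketch: on each misclassified example, nudge the weights
# toward the correct side of the hyperplane. eta (learning rate) is an
# assumed parameter.
def train_perceptron(examples, w, theta, eta=0.1, max_epochs=100):
    for _ in range(max_epochs):
        errors = 0
        for x, desired in examples:
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
            if y != desired:
                errors += 1
                # desired - y is +1 (increase weights) or -1 (decrease weights)
                w = [wi + eta * (desired - y) * xi for wi, xi in zip(w, x)]
                theta -= eta * (desired - y)  # threshold learns like a weight
        if errors == 0:
            break  # all training examples correctly classified: stop
    return w, theta
```

For linearly separable data (e.g. the AND function), this loop terminates with a separating hyperplane, as the convergence theorem below guarantees.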

Perceptron Convergence Theorem


The perceptron convergence theorem states that for any linearly separable data set, the perceptron
learning rule is guaranteed to find a solution in a finite number of steps.

Delta Rule

▪ A generalization of the perceptron training algorithm was presented by Widrow and Hoff as the least
mean square (LMS) learning procedure, also known as the delta rule.
▪ The functional difference from the perceptron training rule is:
1. The perceptron learning rule uses the output of the threshold function for learning;
2. The delta rule uses the net output without further mapping into output values.

▪ When the data are not linearly separable, we try to approximate the real concept using the delta rule.
▪ Use a gradient descent search to minimize the error:

E = ½ Σi (di − Yi)²

where:
• the sum runs over all training examples;
• di is the desired output for example i;
• Yi is the inner product w.x for example i.

To find a minimum of the error E in the space of weights, the delta rule works as follows:

➢ For a new training example X = (X1, X2,…, XN) the weights are updated according to the rule:
wi = wi + Δwi ……………….(1)
where Δwi = η(di − Yi)Xi and η is the learning rate.
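A minimal sketch of this update, assuming the standard Widrow–Hoff form Δwi = η(di − Yi)Xi with an assumed learning rate η:

```python
# Delta (LMS) rule sketch: Y is the raw inner product w.x (no threshold
# function applied during learning). eta is an assumed learning rate.
def delta_rule_update(x, d, w, eta=0.05):
    y = sum(wi * xi for wi, xi in zip(w, x))  # Y = w . x
    return [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]

# One update moves Y toward the desired output d.
x, d, w = [1.0, 2.0], 1.0, [0.0, 0.0]
w = delta_rule_update(x, d, w)
print(w, sum(wi * xi for wi, xi in zip(w, x)))  # Y rises from 0.0 to 0.25
```

Repeating the update over all training examples is one epoch of the gradient descent search on E.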

Differences between the perceptron and the delta rule:

1. The perceptron is based on an output from a step function whereas the delta rule uses the linear
combination of inputs directly;

2. The perceptron is guaranteed to converge to a consistent hypothesis assuming the data is linearly
separable. The delta rule converges in the limit but it does not need the condition of linearly separable
data.

Difficulties with the gradient descent method:


➢ Convergence to a minimum may take a long time;
➢ There is no guarantee of finding the global minimum.

Perceptron example for OR Function

With the initial configuration, the equation of the separator is:


w1X1 + w2X2 - θ = 0
which is:
-0.4X1 + 0.1X2 – 0.2 = 0
or:
4X1 - X2 + 2 = 0
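A quick numeric check (assuming the standard OR truth table) shows why training is still needed: with these initial weights, three of the four OR inputs fall on the wrong side of the separator.

```python
# Evaluate the initial separator -0.4*X1 + 0.1*X2 - 0.2 on the OR truth table.
w1, w2, theta = -0.4, 0.1, 0.2
or_table = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
misclassified = []
for (x1, x2), desired in or_table:
    y = 1 if w1 * x1 + w2 * x2 - theta >= 0 else 0
    if y != desired:
        misclassified.append((x1, x2))
print(misclassified)  # -> [(0, 1), (1, 0), (1, 1)]
```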
Limitations of the Perceptron

➢ Learning is efficient if weights are not very large;

➢ Attributes are weighted independently;

➢ Can only learn linear separators (hyperplanes); for example, it cannot learn exclusive OR.

Multi-layer Perceptron
The perceptron can be successfully used for the AND and OR logical functions, but a simple perceptron
cannot decide the XOR function.

➢ The XOR problem can be solved by introducing hidden units, which involves extending the network to a
multi-layer perceptron.

➢ Multilayer networks can learn not only multiple decision boundaries, but the boundaries may also be
nonlinear.
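A hand-wired example of such hidden units (the specific weights and thresholds below are illustrative; training would normally find them): h1 computes OR, h2 computes AND, and the output fires when OR holds but AND does not, which is exactly XOR.

```python
def step(x):
    """Hard threshold unit."""
    return 1 if x >= 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)    # hidden unit 1: OR
    h2 = step(x1 + x2 - 1.5)    # hidden unit 2: AND
    return step(h1 - h2 - 0.5)  # output: OR and not AND = XOR

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # -> [0, 1, 1, 0]
```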

A standard multilayer perceptron contains:

➢ An input layer;

➢ One or more hidden layers;

➢ An output layer.

Three important characteristics of a multilayer perceptron:

1. The neurons in the input, hidden and output layers use, in their mathematical model, activation
functions which are nonlinear and differentiable at every point;

2. The multilayer perceptron contains one or more hidden layers used for complex tasks;

3. It has a high degree of connectivity.

➢ To make nonlinear partitions of the space, each unit has to be defined as a nonlinear function.

➢ One solution is to use the sigmoid unit.

➢ The reason for using sigmoids is that, unlike linear thresholds, they are continuous and thus
differentiable at all points.

Backpropagation Learning Algorithm
The training of a network by backpropagation involves three stages:
1. The feedforward of the input training patterns;
2. The calculation and backpropagation of the associated error;
3. The adjustment of the weights.

➢ One of the most commonly used activation functions is the sigmoid function, f(x) = 1 / (1 + e^-x),


whose derivative is: f ’(x) = f(x)[1 - f(x)].

➢ Two other commonly used activation functions are the bipolar sigmoid and the hyperbolic tangent.
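The derivative identity can be checked numerically (a small sketch; the test point x = 0.7 is arbitrary):

```python
import math

def f(x):
    """Binary sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

# Compare the closed-form derivative f(x)*(1 - f(x)) with a central
# finite-difference estimate at an arbitrary test point.
x, h = 0.7, 1e-6
analytic = f(x) * (1 - f(x))
numeric = (f(x + h) - f(x - h)) / (2 * h)
print(abs(analytic - numeric))  # the two agree to many decimal places
```

This cheap derivative is one reason the sigmoid is popular in backpropagation: f ’(x) is computed from the already-available activation value.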

Backpropagation Learning Steps

1. Randomly initialize the weights

There are two situations which should be avoided:


➢ Too large initial weights will make the initial input signals to each neuron in the
hidden and output layers fall in the region where the derivative of the
sigmoid function has a very small value, so learning stalls.
➢ Too small initial weights will make the net input to a hidden or output neuron
close to zero, which also causes slow learning.
2. Feedforward

3. Backpropagation of the error

4. Weights update
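The four steps can be sketched for a small 2-2-1 network with sigmoid units (the network shape, initial weights, and learning rate below are illustrative assumptions):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_h, w_o):
    """Step 2 (feedforward): each weight list carries the bias last."""
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, y

def backprop_step(x, d, w_h, w_o, eta=0.5):
    """Steps 2-4 on one training example for a 2-2-1 sigmoid network."""
    h, y = forward(x, w_h, w_o)
    # Step 3 (backpropagate the error), using f'(net) = f(net)*(1 - f(net)).
    delta_o = (d - y) * y * (1 - y)
    delta_h = [delta_o * w_o[i] * h[i] * (1 - h[i]) for i in range(2)]
    # Step 4 (weight update); the bias acts as a weight on a constant input 1.
    new_w_o = [w_o[0] + eta * delta_o * h[0],
               w_o[1] + eta * delta_o * h[1],
               w_o[2] + eta * delta_o]
    new_w_h = [[w_h[i][0] + eta * delta_h[i] * x[0],
                w_h[i][1] + eta * delta_h[i] * x[1],
                w_h[i][2] + eta * delta_h[i]] for i in range(2)]
    return new_w_h, new_w_o

# Step 1 would normally initialize these weights randomly (small values);
# fixed illustrative values are used here so the run is reproducible.
w_h = [[0.1, 0.2, 0.0], [-0.1, 0.1, 0.0]]
w_o = [0.3, -0.2, 0.0]
_, y_before = forward((1, 0), w_h, w_o)
w_h, w_o = backprop_step((1, 0), 1, w_h, w_o)
_, y_after = forward((1, 0), w_h, w_o)
print(y_before, y_after)  # the output moves toward the desired value 1
```

A single step shrinks the error on that example; full training repeats the step over all patterns for many epochs.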
