
INDIAN INSTITUTE OF TECHNOLOGY ROORKEE

Artificial Neural Networks


ANN
Neural Networks
1. Neural Networks (NNs) are networks of neurons, for example, as found in real (i.e. biological) brains.

2. Artificial Neurons are crude approximations of the neurons found in brains. They may be physical devices, or purely mathematical constructs.

3. Artificial Neural Networks (ANNs) are networks of Artificial Neurons, and hence constitute crude approximations to parts of real brains. They may be physical devices, or simulated on conventional computers.

4. From a practical point of view, an ANN is just a parallel computational system consisting of many simple processing elements connected together in a specific way in order to perform a particular task.

5. One should never lose sight of how crude the approximations are, and how over-simplified our ANNs are compared to real brains.
Brains versus Computers
1. There are approximately 10 billion neurons in the human cortex, compared with tens of thousands of processors in the most powerful parallel computers.
2. Each biological neuron is connected to several thousand other neurons, similar to the connectivity in powerful parallel computers.
3. The smaller number of processing units can be compensated for by speed. The typical operating speed of biological neurons is measured in milliseconds (10⁻³ s), while a silicon chip can operate in nanoseconds (10⁻⁹ s).
4. The human brain is extremely energy efficient, using approximately 10⁻¹⁶ joules per operation per second, whereas the best computers today use around 10⁻⁶ joules per operation per second.
5. Brains have been evolving for tens of millions of years; computers have been evolving for only a few decades.
6. The brain is capable of adaptation by changing its connectivity, whereas computers are hard to make adaptive.

The brain uses massively parallel computation:
– approximately 10¹¹ neurons in the brain
– approximately 10⁴ connections per neuron
Brains versus Computers
A biological neuron:

• receives input signals generated by other neurons through its dendrites,
• integrates these signals in its cell body,
• then generates its own signal (a series of electric pulses) that travels along the axon, which in turn makes contact with the dendrites of other neurons.

• The points of contact between neurons are called synapses.
• Incoming impulses can be excitatory if they cause firing, or inhibitory if they hinder the firing of the response.

After carrying a pulse, an axon is in a state of non-excitability for a certain time called the refractory period.
Brains versus Computers

(Figures: the biological neuron; brain computation)
Biological Neural Networks (BNN)

Artificial Neural Networks (ANN)

A neural network consists of four main parts:

1. Processing units.
2. Weighted interconnections between the various processing units.
3. An activation rule, which acts on the set of input signals at a unit to produce a new output signal, or activation.
4. Optionally, a learning rule that specifies how to adjust the weights for a given input/output pair.
Definitions

Haykin :
A neural network is a massively parallel distributed processor that has a natural
propensity for storing experiential knowledge and making it available for use. It
resembles the brain in two respects:
– Knowledge is acquired by the network through a learning process.
– Interneuron connection strengths known as synaptic weights are used to store the
knowledge.

Zurada:
Artificial neural systems, or neural networks, are physical cellular systems
which can acquire, store, and utilize experiential knowledge.

Definitions

Mohamad H. Hassoun:
Neural Networks are neural in the sense that they may have been inspired by neuroscience, but not necessarily because they are faithful models of biological neural or cognitive phenomena.

J.A. Anderson:
It is not absolutely necessary to believe that neural network models have anything to do with the nervous system, but it helps, because we are able to use a large body of ideas and facts from …
Importance of ANN

• They are extremely powerful computational devices.
• Massive parallelism makes them very efficient.
• They can learn and generalize from training data – so there is no need for enormous feats of programming.
• They are particularly fault tolerant – this is equivalent to the graceful degradation* found in biological systems: ‘you could shoot every tenth neuron in the brain and not even notice it’.
• They are very noise tolerant – so they can cope with situations where normal systems would have difficulty.

* The property that enables a system to continue operating properly in the event of the failure of some of its components.
Artificial Neural Net

The figure shows a simple artificial neural net with two input neurons (X1, X2) and one output neuron (Y). The interconnection weights are given by W1 and W2.
Artificial Neural Net

Mathematical Model of Artificial Neuron

Artificial Neural Net

The neuron is the basic information-processing unit of an NN. It consists of:

1. A set of links, describing the neuron inputs, with weights W1, W2, …, Wm.

2. An adder function (linear combiner) for computing the weighted sum of the inputs (real numbers):

   u = ∑ Wj Xj, summed over j = 1, …, m

3. An activation function φ for limiting the amplitude of the neuron output:

   y = φ(u + b)

where b is the bias.
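To make the adder-plus-activation structure concrete, here is a minimal Python/NumPy sketch of a single neuron computing y = φ(∑ Wj Xj + b). The particular weights, bias and the binary step activation are illustrative assumptions, not values taken from the slides.

import numpy as np

def step(u, theta=0.0):
    """Binary step activation: 1 if the net input reaches the threshold, else 0."""
    return 1 if u >= theta else 0

def neuron_output(x, w, b):
    """Adder (linear combiner) followed by the activation function."""
    u = np.dot(w, x)        # weighted sum of the inputs
    return step(u + b)      # activation limits the output amplitude

# Illustrative values: two inputs with weights W1, W2 and a bias b
x = np.array([1.0, 0.0])
w = np.array([0.5, 0.5])
print(neuron_output(x, w, b=-0.4))   # prints 1, since 0.5 - 0.4 >= 0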
Artificial Neural Net

The bias value is added to the weighted sum ∑ wi xi so that the decision boundary can be shifted away from the origin:

Yin = ∑ wi xi + b, where b is the bias.

(Figure: the parallel lines x1 − x2 = −1, x1 − x2 = 0 and x1 − x2 = 1 in the (x1, x2) plane, illustrating how the bias shifts the boundary.)
Operation of a Neural Network

(Figure: operation of a neuron: an input vector x = (x0, x1, …, xn) is combined with a weight vector w = (w0j, w1j, …, wnj) to form a weighted sum, which is passed through an activation function f to give the output y.)
McCulloch-Pitts neuron model (1943)

(Figure: the McCulloch-Pitts neuron and its threshold activation function)

Networks of McCulloch-Pitts Neurons
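Since the figures are not reproduced here, the following is a rough Python sketch of the McCulloch-Pitts unit under one common textbook reading (binary inputs and outputs, a fixed threshold, and absolute inhibition); the AND/OR thresholds are assumed for illustration.

def mcculloch_pitts(excitatory, inhibitory, threshold):
    """Fire (output 1) only if no inhibitory input is active and the number of
    active excitatory inputs reaches the threshold (absolute inhibition)."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

# AND of two binary inputs uses threshold 2; OR uses threshold 1
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              mcculloch_pitts([x1, x2], [], threshold=2),   # AND
              mcculloch_pitts([x1, x2], [], threshold=1))   # OR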
Building Blocks of Artificial Neural Net

• Network Architecture (Connection between Neurons)

• Setting the Weights (Training)

• Activation Function

Network Architecture

• Input Layer: Each input unit may be designated by an attribute value possessed by the instance.

• Hidden Layer: Not directly observable; provides nonlinearities for the network.

• Output Layer: Encodes possible values.
Training Process

• Supervised Training - Providing the network with a series of sample inputs and comparing the output with the expected responses.

• Unsupervised Training - Similar input vectors are assigned to the same output unit.

• Reinforcement Training - The right answer is not provided, but an indication of whether the output is ‘right’ or ‘wrong’ is provided.
Activation Function

• ACTIVATION LEVEL – DISCRETE OR CONTINUOUS

• HARD LIMIT FUNCTION (DISCRETE)
  - Binary activation function
  - Bipolar activation function
  - Identity function

• SIGMOIDAL ACTIVATION FUNCTION (CONTINUOUS)
  - Binary sigmoidal activation function
  - Bipolar sigmoidal activation function
Activation Function

Activation functions:

(A) Identity

(B) Binary step

(C) Bipolar step

(D) Binary sigmoidal

(E) Bipolar sigmoidal

(F) Ramp
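
For reference, minimal Python/NumPy versions of these functions are sketched below; the threshold at zero, the steepness parameter lam, and the ramp's saturation limits are conventional assumptions rather than values given in the slides.

import numpy as np

def identity(x):
    return x

def binary_step(x):                      # hard limit, outputs in {0, 1}
    return np.where(x >= 0, 1, 0)

def bipolar_step(x):                     # hard limit, outputs in {-1, +1}
    return np.where(x >= 0, 1, -1)

def binary_sigmoid(x, lam=1.0):          # continuous, range (0, 1)
    return 1.0 / (1.0 + np.exp(-lam * x))

def bipolar_sigmoid(x, lam=1.0):         # continuous, range (-1, 1)
    return 2.0 / (1.0 + np.exp(-lam * x)) - 1.0

def ramp(x):                             # one common definition: clipped identity
    return np.clip(x, 0.0, 1.0)

x = np.linspace(-2.0, 2.0, 5)
print(binary_step(x), bipolar_sigmoid(x))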

Decision Boundaries/Linear Separability

Linear separability is the concept wherein the separation of the input space into
regions is based on whether the network response is positive or negative.

The decision boundary is the surface at which the output of the unit is precisely
equal to the threshold.

(Figure: the decision boundary w1 x1 + w2 x2 = θ plotted in the (x1, x2) plane for w1 = 1, w2 = 2, θ = 2; the boundary is a straight line whose slope is determined by the ratio of the weights.)
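The slide's example can be checked numerically. The short Python sketch below uses the sign convention that the response is positive when the net input reaches the threshold (an assumption) and evaluates points just above and below the boundary w1 x1 + w2 x2 = θ for w1 = 1, w2 = 2, θ = 2.

import numpy as np

w1, w2, theta = 1.0, 2.0, 2.0   # values from the slide's example

def region(x1, x2):
    """Positive response if the net input reaches the threshold, else negative."""
    return "positive" if w1 * x1 + w2 * x2 >= theta else "negative"

# Points on the boundary satisfy w1*x1 + w2*x2 = theta, i.e. x2 = (theta - w1*x1) / w2
for x1 in np.linspace(0.0, 2.0, 3):
    x2_b = (theta - w1 * x1) / w2
    print(f"x1={x1:.1f}  boundary x2={x2_b:.2f}  "
          f"above: {region(x1, x2_b + 0.1)}  below: {region(x1, x2_b - 0.1)}")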
Learning and Generalization

Learning: The network must learn decision surfaces from a set of training patterns so that these training patterns are classified correctly.

Generalization: After training, the network must also be able to generalize, i.e. correctly classify test patterns it has never seen before.

Usually we want our neural networks to learn well, and also to generalize well.
Perceptron

An arrangement of one input layer of neurons feeding forward to one output layer of neurons is known as a Perceptron.
Perceptron Network

The Perceptron network consists of three units, namely the sensory unit (input unit), the associative unit (hidden unit) and the response unit (output unit).
Perceptron Network

• Epoch: Presentation of the entire training set to the neural network. In the case of the AND function, an epoch consists of four sets of inputs being presented to the network (i.e. [0,0], [0,1], [1,0], [1,1]).

• Error: The error value is the amount by which the value output by the network differs from the target value. For example, if we required the network to output 0 and it outputs 1, then Error = -1.

• Target Value, T: When we are training a network we not only present it with the input but also with a value that we require the network to produce. For example, if we present the network with [1,1] for the AND function, the training value will be 1.
Perceptron Network

• Output, O: The output value from the neuron.

• Ij: Inputs being presented to the neuron.

• Wj: Weight from input neuron (Ij) to the output neuron.

• LR: The learning rate. This dictates how quickly the network converges. It is set by experimentation; it is typically 0.1.
Perceptron Learning

wi ← wi + Δwi, where Δwi = η (t − o) xi

where t = c(x) is the target value, o is the perceptron output, and η is a small constant (e.g., 0.1) called the learning rate.

• If the output is correct (t = o), the weights wi are not changed.

• If the output is incorrect (t ≠ o), the weights wi are changed such that the output of the perceptron for the new weights is closer to t.

• The algorithm converges to the correct classification if
  - the training data is linearly separable, and
  - η is sufficiently small.
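A minimal Python/NumPy sketch of this rule applied to the AND function follows; the learning rate, zero initial weights, the threshold-at-zero output convention and the epoch count are assumptions made for illustration.

import numpy as np

def train_perceptron(inputs, targets, lr=0.1, epochs=20):
    """Perceptron learning rule: w_i <- w_i + lr * (t - o) * x_i, bias treated as w_0."""
    w = np.zeros(inputs.shape[1] + 1)                      # bias plus one weight per input
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            o = 1 if w[0] + np.dot(w[1:], x) >= 0 else 0   # current output
            w[0] += lr * (t - o)                           # bias update
            w[1:] += lr * (t - o) * x                      # weight update
    return w

# AND function: one epoch is one pass over these four input patterns
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 0, 0, 1])
w = train_perceptron(X, T)
print(w, [1 if w[0] + np.dot(w[1:], x) >= 0 else 0 for x in X])

Because the AND patterns are linearly separable and the learning rate is small, the weights stop changing after a few epochs, in line with the convergence condition above.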
Perceptron Architecture

Learning Rules

Multiple Neuron Perceptron

Learning Rules

Consider a four-class decision problem; train a perceptron network to solve this problem using the perceptron learning rule.
Supervised Hebbian Learning

Linear Associator

Hebb’s learning law can be used in combination with a variety of neural network architectures. The network we will use is the linear associator.
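As a minimal sketch of the supervised Hebb rule for the linear associator in Python/NumPy (the orthonormal prototype patterns below are my assumption, not the slides' example), the weight matrix is formed as W = T Pᵀ and recall is a = W p.

import numpy as np

# Illustrative orthonormal prototype inputs (columns of P) and their targets (columns of T)
P = np.array([[1.0, 0.0],
              [0.0, 1.0]])
T = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

W = T @ P.T                   # supervised Hebb rule for the linear associator

# The linear associator output is a = W p; with orthonormal prototypes recall is exact
for q in range(P.shape[1]):
    print(W @ P[:, q])        # reproduces the q-th target column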
Steepest Descent Method

(Figures: steepest-descent trajectories with learning rates 0.01 and 0.035)

Steepest Descent Method

(Figures: steepest-descent trajectories with learning rates 0.039 and 0.041)
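The effect of the learning rate can be reproduced qualitatively with a short sketch. The quadratic function below is an assumption chosen so that its largest Hessian eigenvalue is 50, which makes steepest descent stable only for learning rates below 2/50 = 0.04, consistent with convergence at 0.039 and divergence at 0.041; it is not necessarily the function used in the slides.

import numpy as np

# Assumed quadratic: F(x) = x1^2 + 25*x2^2, with Hessian eigenvalues 2 and 50
def grad(x):
    return np.array([2.0 * x[0], 50.0 * x[1]])

def steepest_descent(x0, lr, steps=50):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x -= lr * grad(x)          # step along the negative gradient
    return x

for lr in (0.01, 0.035, 0.039, 0.041):
    print(lr, steepest_descent([0.5, 0.5], lr))   # 0.041 diverges, the others converge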
Steepest Descent Method

(Figures: steepest descent with minimization along a line; trajectory for Newton’s method)
Conjugate Gradient

1. Select the first search direction to be the negative of the gradient.
2. Select the learning rate αk to minimize the function along the search direction.
3. Select the next search direction according to the conjugate-direction update pk = −gk + βk pk−1.
4. If the algorithm has not converged, return to step 2.
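A minimal Python/NumPy sketch of these four steps on a quadratic F(x) = ½ xᵀAx follows; the matrix A and the starting point are assumptions, and for quadratics the exact line-search step and the Fletcher-Reeves coefficient have closed forms.

import numpy as np

# Assumed symmetric positive definite quadratic: F(x) = 0.5 * x^T A x, gradient A x
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

x = np.array([1.0, 0.0])
g = A @ x                                 # gradient at the starting point
p = -g                                    # step 1: first direction = negative gradient
for k in range(2):                        # an n-dimensional quadratic needs at most n steps
    alpha = (g @ g) / (p @ A @ p)         # step 2: learning rate minimizing F along p
    x = x + alpha * p
    g_new = A @ x
    beta = (g_new @ g_new) / (g @ g)      # step 3: Fletcher-Reeves coefficient
    p = -g_new + beta * p                 # next conjugate search direction
    g = g_new
print(x)                                  # essentially at the minimum (the origin)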
Widrow-Hoff Learning

ADALINE Network

Widrow-Hoff Learning
Mean Square Error

LMS Algorithm

Convergence Point
Stable Learning Rate
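
One common form of the LMS (Widrow-Hoff) update, sketched below in Python/NumPy, is w ← w + 2αex and b ← b + 2αe, where e is the error of the ADALINE's linear output; the bipolar AND training set, the learning rate and the factor-of-2 convention are assumptions for illustration.

import numpy as np

def train_adaline(X, T, lr=0.05, epochs=50):
    """LMS rule: w <- w + 2*lr*e*x, b <- b + 2*lr*e, with e = t - (w.x + b)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, T):
            e = t - (np.dot(w, x) + b)    # error of the linear output, before thresholding
            w += 2 * lr * e * x
            b += 2 * lr * e
    return w, b

# Bipolar AND function as the training set
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
T = np.array([-1, -1, -1, 1], dtype=float)
w, b = train_adaline(X, T)
print(w, b)     # approaches the least-squares solution w = [0.5, 0.5], b = -0.5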

Widrow-Hoff Learning
Adaptive Filter ADALINE
Tapped Delay Line

Multilayer Perceptrons
Three-Layer Network

Pattern Classification
Two-Layer XOR Network
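
One possible two-layer network for XOR, with hand-chosen weights that are assumptions rather than the slide's values, is sketched below in Python/NumPy: the hidden units compute OR and AND of the inputs, and the output unit fires when OR is true but AND is not.

import numpy as np

def step(x):
    return np.where(x >= 0, 1, 0)

# Hidden unit 1 computes x1 OR x2, hidden unit 2 computes x1 AND x2;
# the output unit fires when OR is true but AND is not, which is XOR.
W1 = np.array([[1.0, 1.0],      # OR:  x1 + x2 - 0.5 >= 0
               [1.0, 1.0]])     # AND: x1 + x2 - 1.5 >= 0
b1 = np.array([-0.5, -1.5])
W2 = np.array([1.0, -1.0])      # output: OR - AND - 0.5 >= 0
b2 = -0.5

def xor_net(x):
    h = step(W1 @ x + b1)       # hidden layer outputs
    return int(step(W2 @ h + b2))

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, xor_net(np.array(x, dtype=float)))   # prints 0, 1, 1, 0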

Function Approximation
Function Approximation Network

Function Approximation
Nominal Response of Network

Function Approximation
Effect of Parameter Changes on Network Response
