Artificial Neural Network
A neural network consists of interconnected processing elements called neurons that work together to produce an output function; the output of the network depends on the cooperation of the individual neurons within it. Well-designed neural networks are trainable systems that can often "learn" to solve complex problems from a set of exemplars and generalize the "acquired knowledge" to solve unforeseen problems, i.e. they are self-adaptive systems. The term "neural network" originally referred to a network of biological neurons. An artificial neural network consists of a set of highly interconnected entities called nodes or units. Each unit accepts a weighted set of inputs and responds with an output.
Mathematically, let $I = (I_1, I_2, \ldots, I_n)$ represent the set of inputs presented to a unit $U$. Each input has an associated weight that represents the strength of that particular connection. Let $W = (W_1, W_2, \ldots, W_n)$ represent the weight vector corresponding to the input vector $I$. Applied to $U$, these weighted inputs produce a net sum at $U$ given by

$$S = \sum_{i=1}^{n} W_i I_i = W \cdot I$$
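As a minimal sketch, the net sum is just a dot product of the weight and input vectors; the numbers below are made up for illustration:

```python
import numpy as np

# Made-up values for a unit U with n = 3 inputs.
I = np.array([0.5, -1.0, 2.0])   # input vector (I1, I2, I3)
W = np.array([0.2,  0.4, 0.1])   # weight vector (W1, W2, W3)

# Net sum at U: S = sum_i Wi * Ii, i.e. the dot product W . I
S = np.dot(W, I)
print(S)  # 0.2*0.5 + 0.4*(-1.0) + 0.1*2.0 = -0.1
```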
Patterns appearing on the input nodes or the output nodes of a network are viewed as samples from
probability densities and a network is viewed as a probabilistic model that assigns probabilities to
patterns. Biologically, we can also define a neuron. The human body is made up of a vast array of living
cells. Certain cells are interconnected in a way that allows them to communicate pain or to actuate fibres
or tissues. Some cells control the opening and closing of minuscule valves in the veins and arteries. These
specialized communication cells are called neurons. Neurons are equipped with long tentacle-like structures that stretch out from the cell body, permitting them to communicate with other neurons. The tentacles that take in signals from other cells and the environment itself are called dendrites, while the tentacles that carry signals from the neuron to other cells are called axons.
Figure: A neuron, showing dendrites, nucleus, and axons.
FEATURES OF ARTIFICIAL NEURAL NETWORKS (ANN)
Artificial neural networks may be physical devices or simulated on conventional computers. From a practical point of view, an ANN is just a parallel computational system consisting of many simple processing elements connected together in a specific way in order to perform a particular task. Some important features of artificial neural networks are as follows.
(1) Artificial neural networks are extremely powerful computational devices (Universal computers).
(2) ANNs are modeled on the basis of current brain theories, in which information is represented by
weights.
(3) ANNs have massive parallelism which makes them very efficient.
(4) They can learn and generalize from training data, so there is no need for enormous feats of programming.
(5) Storage is fault-tolerant, i.e. some portions of the neural net can be removed with only a small degradation in the quality of stored data.
(6) They are particularly fault-tolerant, which is equivalent to the "graceful degradation" found in biological systems.
(7) Data are naturally stored in the form of associative memory, which contrasts with conventional memory, in which data are recalled by specifying the address of that data.
(8) They are very noise-tolerant, so they can cope with situations where normal symbolic systems would have difficulty.
(9) In practice, they can do anything a symbolic/logic system can do, and more.
(10) Neural networks can extrapolate and interpolate from their stored information. They can also be trained: special training teaches the network to look for significant features or relationships in the data.
Let $I = (I_1, I_2, \ldots, I_j)$ be the input vector and let the activation function $f$ be the identity, so that the activation value of a unit is just its net sum. With $j$ inputs and $n$ output units, the weights form a $j \times n$ matrix, and the output of unit $k$ is calculated as follows.

$$O_k = (W_{1k}, W_{2k}, \ldots, W_{jk}) \begin{pmatrix} I_1 \\ I_2 \\ \vdots \\ I_j \end{pmatrix} = \sum_{m=1}^{j} W_{mk} I_m$$
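A small NumPy sketch of this matrix form, with made-up numbers:

```python
import numpy as np

# A single-layer network with j = 3 inputs and n = 2 output units.
# W[m, k] holds the weight W_mk from input m to output unit k (a j x n matrix).
W = np.array([[ 0.1, -0.2],
              [ 0.5,  0.3],
              [-0.4,  0.6]])
I = np.array([1.0, 2.0, 3.0])

# With an identity activation, O_k = sum_m W_mk * I_m,
# so the whole output vector is O = W^T I.
O = W.T @ I
print(O)  # [-0.1  2.2]
```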
Multilayer Network
A multilayer network has two or more layers of units, with the output from one layer serving as input to the next. Generally a multilayer network has three kinds of layers: an input layer, an output layer, and one or more hidden layers. Layers with no external output connections are referred to as hidden layers. A multilayer neural network structure is given in the figure below.
Figure: A multilayer neural network.
Any multilayer system with fixed weights and a linear activation function is equivalent to a single-layer linear system. Consider, for example, a two-layer system. If the input vector to the first layer is $I$, the first layer produces the output $O_1 = W_1 I$ and the second layer produces the output $O_2 = W_2 O_1$. Hence

$$O_2 = W_2 (W_1 I) = (W_2 W_1) I$$
So a linear system with any number $n$ of layers is equivalent to a single-layer linear system whose weight matrix is the product of the $n$ intermediate weight matrices. A multilayer system that is not linear, however, can provide more computational capability than a single-layer system. In general, multilayer networks have proven to be much more powerful than single-layer neural networks; any Boolean function, for instance, can be implemented by such a network. At the output layer of a multilayer neural network the output vector is compared to the expected output. If the difference is zero, no changes are made to the weights of the connections. If the difference is not zero, the error is calculated and propagated back through the network.
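The linear-collapse equivalence above is easy to check numerically; here is a minimal sketch with arbitrary random weights (the layer shapes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two linear layers with arbitrary fixed weights.
W1 = rng.standard_normal((4, 3))   # first layer: 3 inputs -> 4 units
W2 = rng.standard_normal((2, 4))   # second layer: 4 units -> 2 outputs
I = rng.standard_normal(3)         # an arbitrary input vector

# Layer by layer: O2 = W2 (W1 I); collapsed: O2 = (W2 W1) I.
layered = W2 @ (W1 @ I)
collapsed = (W2 @ W1) @ I
print(np.allclose(layered, collapsed))  # True: the two layers act as one
```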
Data are introduced into the system through an input layer. This is followed by processing in one or more intermediate (hidden) layers. Output data emerge from the network's final layer. The transfer functions contained in the individual neurons can be almost anything. The input layer, also called the zeroth layer of the network, serves only to redistribute input values and does no processing, so its output equals its input:

$$o_m^{(0)} = i_m$$
The input to each neuron in the first hidden layer is a summation over all weighted connections between the input (zeroth) layer and that neuron. We will call this weighted sum the net sum or net input. The contribution to the net input from input $i_m$ is the product of that input and its weight factor $w_m$; the total weighted input to the neuron is the summation of these individual signals plus a bias term $\theta$, described as follows.

$$\text{net} = \sum_{m=1}^{N_0} w_m i_m + \theta$$
The net sum to the neuron is transformed by the neuron's activation or transfer function $f$ to produce a new output value for the neuron. With back propagation, this transfer function is most commonly either a sigmoid or a linear function. In addition to the net sum, a bias term $\theta$ is generally added to offset the input. The bias can be treated as a weight coming from a unitary-valued input and denoted $w_0$. So the final output of the neuron is given by the following equation.
$$\text{Output} = f(\text{net sum}) = f\!\left(\sum_{m=1}^{N_0} w_m i_m + \theta\right) = f\!\left(\sum_{m=1}^{N_0} w_m i_m + w_0\right)$$
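A small sketch of a single neuron's output computed both ways (the inputs, weights, and bias are made-up values); it shows that adding $\theta$ to the net sum and folding the bias in as a weight $w_0$ on a unitary input $i_0 = 1$ give the same result:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Made-up inputs, weights and bias for a single neuron with N0 = 3 inputs.
i = np.array([0.5, -1.0, 2.0])
w = np.array([0.2,  0.4, 0.1])
theta = 0.3

# Direct form: output = f(sum_m w_m * i_m + theta).
out_direct = sigmoid(np.dot(w, i) + theta)

# Bias-as-weight form: w0 = theta acting on a unitary input i0 = 1.
i_aug = np.concatenate(([1.0], i))
w_aug = np.concatenate(([theta], w))
out_bias_weight = sigmoid(np.dot(w_aug, i_aug))

print(out_direct, out_bias_weight)  # both ~0.5498: the two forms agree
```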
But one question may arise in the reader's mind: why do we use hidden layers between the input and output layers? The answer is quite simple. Each layer in a multilayer neural network has its own specific function. The input layer accepts input signals from the outside world and redistributes these signals to all neurons in the hidden layer; it rarely includes computing neurons and thus does not process input patterns. The output layer accepts output signals (that is, stimulus patterns) from the hidden layer and establishes the output pattern of the entire network. Neurons in the hidden layer detect features: the weights of these neurons represent the features hidden in the input patterns, and these features are then used by the output layer in determining the output pattern. With one hidden layer we can represent any continuous function of the input signals, and with two hidden layers even discontinuous functions can be represented. A hidden layer "hides" its desired output: neurons in the hidden layer cannot be observed through the input/output behaviour of the network, and the desired output of the hidden layer is determined by the layer itself. Generally, there is no obvious way to know what the desired output of the hidden layer should be.
The goal of back propagation, as with most training algorithms, is to iteratively adjust the weights in the network to produce the desired output by minimizing the output error. The algorithm's goal is to solve the credit assignment problem. Back propagation is a gradient-descent approach in that it uses the minimization of first-order derivatives to find an optimal solution. The standard back propagation algorithm is given below.
Step 1: Build a network with the chosen number of input, hidden and output units.
Step 2: Initialise the weights to small random values.
Step 3: Choose a training pattern pair at random.
Step 4: Copy the input pattern to the input layer.
Step 5: Cycle the network so that the activations from the inputs generate the activations in the hidden and output layers.
Step 6: Calculate the error derivative between the output activation and the target output.
Step 7: Apply the method of back propagation to the summed products of the weights and errors in the output layer in order to calculate the error in the hidden units.
Step 8: Update the weights attached to each unit according to the error in that unit, the output from the unit below it and the learning parameters, repeating from Step 3 until the error is sufficiently low.
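A minimal NumPy sketch of these steps is given below. For brevity it uses a batch update over the whole toy training set (XOR) instead of choosing single patterns at random in Step 3; the network size, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)

# Toy training set (XOR), used only to illustrate the steps above.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# Steps 1-2: build a 2-2-1 network and initialise small random weights.
W1 = rng.uniform(-0.5, 0.5, (2, 2)); b1 = np.zeros(2)
W2 = rng.uniform(-0.5, 0.5, (2, 1)); b2 = np.zeros(1)
lr = 0.5  # learning rate

for epoch in range(10000):
    # Steps 3-5: present the input patterns and cycle the network forward.
    h = sigmoid(X @ W1 + b1)   # hidden activations
    y = sigmoid(h @ W2 + b2)   # output activations

    # Step 6: error derivative at the output (sigmoid derivative is y(1 - y)).
    d_out = (y - T) * y * (1 - y)

    # Step 7: back-propagate the weighted output errors to the hidden units.
    d_hid = (d_out @ W2.T) * h * (1 - h)

    # Step 8: update each weight from the unit's error and the output below it.
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid; b1 -= lr * d_hid.sum(axis=0)

print(y.round(2).ravel())  # typically approaches [0, 1, 1, 0]
```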
To derive the back propagation algorithm, let us consider the three-layer network shown in the figure below.
Figure: Three-layer back-propagation neural network.
To propagate error signals, we start at the output layer and work backward to the hidden layer. The error signal at the output of neuron $k$ at iteration $x$ is defined as the difference between the desired and actual outputs:

$$e_k(x) = d_k(x) - y_k(x)$$

where $d_k(x)$ is the desired output and $y_k(x)$ the actual output of neuron $k$ at iteration $x$.
Generally, computational learning theory is concerned with training classifiers on a limited amount of data. In the context of neural networks, a simple heuristic called early stopping often ensures that the network will generalize well to examples not in the training set. There are some problems with the back propagation algorithm, such as slow convergence and the possibility of ending up in a local minimum of the error function. Today, however, there are a variety of practical solutions that make back propagation in multilayer perceptrons the solution of choice for many machine learning tasks.
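As a minimal sketch of the early-stopping heuristic (the toy task, model, and hyper-parameters below are made-up assumptions): train while tracking the error on a held-out validation set, keep the best weights seen, and stop once the validation error has not improved for a fixed number of epochs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: noisy samples of y = 2x, split into train/validation.
x = rng.uniform(-1, 1, 40)
y = 2 * x + rng.normal(0, 0.1, 40)
x_tr, y_tr, x_va, y_va = x[:30], y[:30], x[30:], y[30:]

w = 0.0                              # single trainable weight
lr, patience = 0.05, 5
best_err, best_w, bad = np.inf, w, 0

for epoch in range(1000):
    # One gradient-descent step on the training squared error.
    w -= lr * 2 * np.mean((w * x_tr - y_tr) * x_tr)

    # Early stopping: watch the error on the held-out validation set.
    val_err = np.mean((w * x_va - y_va) ** 2)
    if val_err < best_err - 1e-9:
        best_err, best_w, bad = val_err, w, 0
    else:
        bad += 1
        if bad >= patience:          # no improvement for `patience` epochs
            break

w = best_w                           # restore the best weights seen
print(round(w, 2))                   # close to 2.0
```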