Computational Intelligence
From AI to Computational Intelligence
• Conventional AI involves methods characterized by
formalism and statistical analysis. This is also known as
symbolic AI, logical AI or neat AI. Methods include:
- Expert System: applies reasoning capabilities to reach a
conclusion.
- Case based reasoning: the process of solving new problems
based on the solutions of similar past problems.
- Bayesian reasoning: represents a set of variables together
with a joint probability distribution with explicit
independence assumptions.
- Behavior-based AI: a modular method of building AI systems
by hand.
Computational intelligence
• Computational intelligence involves iterative
development or learning. Learning is based on empirical
data. It is also known as non-Symbolic AI, Scruffy AI or
Soft Computing. Methods include:
- Neural Network: systems with very strong pattern
recognition capabilities.
- Fuzzy systems: techniques for reasoning under
uncertainty.
- Evolutionary Computation: applies biologically inspired
concepts such as populations, mutation and survival of
the fittest to generate better solutions to the problem.
- Hybrid intelligent systems: combine the above techniques.
Computational Paradigms
- Symbolic Logic & Reasoning
- Numerical Modelling & Search
- Approximate Reasoning
- Functional Optimization & Random Search
Hard vs soft computing
• Hard computing:
This is the conventional methodology; it relies on
the principles of accuracy, certainty and
inflexibility. It is suitable for mathematical
problems.
• Soft computing:
This is a modern approach premised on the ideas of
approximation, uncertainty and flexibility.
What is soft computing
[Figure: a biological neuron (dendrites, axon) alongside an artificial neuron: inputs x1, x2 with weights w1, w2 feed a summing unit producing output y.
Activation function: yin = x1w1 + x2w2; f(yin) = 1 if yin >= theta, and f(yin) = 0 otherwise.]
- A neuron receives inputs, determines the strength or weight of each input, calculates the total
weighted input, and compares this total with a threshold value.
- If the total weighted input is greater than or equal to the threshold value, the neuron produces an
output; if the total weighted input is less than the threshold value, no output is produced.
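The rule above can be sketched as a small function; the weights, inputs, and threshold value used in the test are illustrative assumptions, not values from the slides.

```python
def threshold_neuron(inputs, weights, threshold):
    """Output 1 if the total weighted input reaches the threshold, else 0."""
    y_in = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_in >= threshold else 0
```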
Why we use Artificial Neural
Network
• The human brain exhibits many desirable characteristics
that are not present in von Neumann computers. These include:
- Used to extract patterns and detect trends that are too
complex to be noticed
- Real time operation based on Massive parallelism
- Adaptive Learning ability
- Distributed representation and computation
- Self Organization
- Inherent(Natural) Contextual information processing
- Fault tolerance via redundant information coding
- Low energy consumption
Multidisciplinary point of view of NN
Neural networks draw on many disciplines:
- Neurobiology
- Artificial Intelligence
- Cognitive Psychology
- Mathematics (approximation theory, optimization)
- Economics (time series, data mining)
- Physics (statistical physics)
- Engineering (image/signal processing)
- Linguistics
History of ANN
• 1943 McCulloch-Pitts neurons
• 1949 Hebb’s law
• 1958 Perceptron (Rosenblatt)
• 1960 Adaline, better learning rule (Widrow,
Hoff)
• 1969 Limitations (Minsky, Papert)
• 1972 Kohonen nets, associative memory
• 1977 Brain State in a Box (Anderson)
• 1982 Hopfield net, constraint satisfaction
• 1986 Backpropagation (Rumelhart, Hinton,
McClelland)
• 1987-1990 Adaptive Resonance Theory
(Carpenter and Grossberg)
• 1988 Neocognitron, character recognition
(Fukushima), Radial Basis Function Network
(Broomhead and Lowe)
Basic models of ANN
The models of ANN are specified by three basic
entities namely:
- The model’s synaptic interconnections: a pattern of
connections between neurons
- The training or learning rules adopted for updating
and adjusting the connection weights: a method of
determining the connection weights
- Their activation functions: Function to compute
output signal from input signal
Connections:
• An ANN consists of a set of highly interconnected processing
elements(neurons) such that each processing element is found
to be connected through weights to the other processing
elements or to itself.
• The arrangement of neurons to form layers and the connection
pattern formed within and between layers is called the network
architecture.
• Neural networks are classified into single-layer or multilayer
neural nets.
• A layer is formed by taking a processing element and
combining it with other processing elements.
• Practically, a layer implies a stage; processing proceeds stage by
stage, i.e. the input stage and output stage are linked with each other.
• These linked interconnections lead to the formation of various
network architectures.
Neuron connection architecture
• Five types of neuron connection architectures:
- Single-layer feed-forward network
- Multilayer feed-forward network
- Single node with its own feedback
- Single-layer recurrent network
- Multilayer recurrent network
• A network is said to be feed-forward if no neuron in the output layer
is an input to a node in the same layer or in a preceding layer.
Equivalently, a neural network that does not contain cycles (feedback loops)
is called a feed-forward network or perceptron.
• When outputs can be directed back as inputs to nodes in the same or a
preceding layer, the result is a feedback network.
Single-layer feed forward
• A layer is formed by taking
processing elements and
combining them with other
processing elements.
• Input and output are
linked with each other.
• Inputs are connected to
the processing nodes with
various weights, resulting
in a series of outputs, one per
node.
Multilayer feed forward network
• A multilayer feed-forward network is formed by the interconnection of several
layers.
• The input layer receives the input and buffers the input signal.
• The output layer generates the output of the network, and any layer formed
between the input and output layers is called a hidden layer.
• The hidden layer is internal to the network and has no direct contact with
the external environment.
• In a fully connected network, every output from one layer is
connected to each and every node in the next layer.
Single node with its own feedback
• If the feedback of the output
of the processing element is
directed back as input to the
processing elements in the
same layer then it is called
lateral feedback.
• Recurrent networks are
feedback networks.
• The figure shows a simple
recurrent neural network
having a single neuron with
feedback to itself.
Single-layer recurrent network
• Single layer recurrent
network is a feedback
connection in which a
processing element’s
output can be directed
back to the processing
element itself or to the
other processing
element or to both.
Multilayer recurrent network
• A processing element
output can be directed
back to the nodes in a
preceding layer.
• Also, a processing
element output can be
directed back to the
processing element itself
and to other processing
elements in the same
layer.
on-center-off-surround or Lateral
inhibition structure
• In this structure, each
processing neuron
receives two different
classes of inputs:
Excitatory- input
from nearby
processing elements.
Inhibitory- input from
more distantly located
processing elements.
Learning
• Capability to learn is the main property of ANN.
• Learning or training is a process by which a neural network
adapts itself to a stimulus by making proper parameter
adjustments, resulting in the production of desired response.
• Types of learning:
- Parameter/weighted learning
- Structural learning
- Supervised learning
- Unsupervised learning
- Reinforcement learning
Parameter & structure Learning
• Parameter learning:
The learning that is used to update the
connecting weights in a neural net.
• Structure learning:
It focuses on the change in network structure
which includes the number of processing
elements as well as their connection types.
Supervised Learning
• During the training of ANN under
supervised learning, the input vector
is presented to the network, which will
give an output vector.
• This output vector is compared with
the desired output vector.
• An error signal is generated, if there is
a difference between the actual output
and the desired output vector.
• On the basis of this error signal, the
weights are adjusted until the actual
output is matched with the desired
output.
• This type of learning is done under the
supervision of a teacher; the learning
process is dependent on that supervision.
Unsupervised learning
• During the training of ANN under
unsupervised learning, the input vectors
of similar type are combined to form
clusters.
• When a new input pattern is applied, then
the neural network gives an output
response indicating the class to which the
input pattern belongs.
• There is no feedback from the
environment as to what should be the
desired output and if it is correct or
incorrect.
• Hence, in this type of learning, the
network itself must discover the patterns
and features from the input data, and the
relation for the input data over the output.
• This type of learning is done without the
supervision of a teacher.
• This learning process is independent.
Reinforcement learning
• This type of learning is used to
reinforce or strengthen the network
over some critic information.
• During the training of network
under reinforcement learning, the
network receives some feedback
from the environment.
• This makes it somewhat similar to
supervised learning. However, the
feedback obtained here is evaluative
not instructive, which means there is
no teacher as in supervised
learning.
• After receiving the feedback, the
network performs adjustments of the
weights to get better critic
information in future.
Comparison
Activation Function
• The activation function is used to calculate the output response
of a neuron.
• The sum of the weighted input signals is passed through an
activation function to obtain the response.
• Activation functions may be linear, threshold or non-linear:
Linear: The output is proportional to the total weighted
input.
Threshold: The output is set at one of two values, depending
on whether the total weighted input is greater than or less
than some threshold value.
Non‐linear: The output varies continuously but not linearly
as the input changes.
Activation function
• It may be defined as the extra force or effort applied over the input to obtain an exact output.
In ANN, we can also apply activation functions over the input to get the exact output.
Followings are some activation functions of interest −
Linear Activation Function:
It is also called the identity function as it performs no input editing. It can be defined as −
F(x)=x
Sigmoid Activation Function:
It is of two type as follows −
• Binary sigmoidal function − This activation function squashes the input into the range 0 to 1.
It is positive in nature. It is always bounded, which means its output cannot be less than 0 or
more than 1. It is also strictly increasing, which means the larger the input, the higher
the output. It can be defined as
F(x) = 1 / (1 + e^(−x))
Bipolar sigmoidal function − This activation function squashes the input into the range −1 to
1. Its output can be positive or negative in nature. It is always bounded, which means its output
cannot be less than −1 or more than 1. It is also strictly increasing in nature like the binary
sigmoid function. It can be defined as
F(x) = (1 − e^(−x)) / (1 + e^(−x))
Activation Function
[Figure: plots of four activation functions − the step function (0/+1), the sign function (−1/+1), the sigmoid function (output between 0 and 1), and the linear function y = mx + C, where C is the bias and X the input.]
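The four plotted functions can be sketched directly; the slope m and bias C defaults for the linear unit, and the threshold default for the step, are illustrative assumptions.

```python
import math

def step(x, theta=0.0):
    """Binary step: 1 if x >= theta, else 0."""
    return 1 if x >= theta else 0

def sign(x):
    """Sign (bipolar step): +1, 0, or -1."""
    return (x > 0) - (x < 0)

def sigmoid(x):
    """Binary sigmoid, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def linear(x, m=1.0, C=0.0):
    """Linear unit: y = mx + C."""
    return m * x + C
```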
Terminologies of ANN
• Weights: The weights contain information about the input signal. This
information is used by the net to solve a problem. Weight can be
represented in terms of matrix known as connection matrix.
• Bias: The bias included in the network has its impact in calculating the
net input. The bias is included by adding a component x₀ = 1 to the input
vector x. Thus, the input vector becomes
x = (1, x₁, …, xᵢ, …, xₙ)
The bias can be of two types: positive and negative. A positive bias helps
increase the net input of the network, and a negative bias helps
decrease it.
• Threshold: it is a set value based upon which the final output of the
network may be calculated. Threshold value is used in activation function.
• Learning Rate: It is used to control the amount of weight adjustment at
each step of training. The learning rate, ranging from 0 to 1, determines
the rate of learning at each time step. The learning rate is denoted by ‘α’.
Terminologies of ANN
• Momentum factor: Convergence is made faster if a
momentum factor is added to the weight updation process. If
momentum has to be used, the weights from one or more
previous training pattern must be saved. It helps the network
in large weight adjustment until the corrections are in the
same general direction for several patterns. It is used in
back propagation network.
• Vigilance Parameter: It is used in ART network. It is used to
control the degree of similarity required for patterns to be
assigned to the same cluster unit. It ranges from 0.7 to 1 to
perform useful work in controlling the number of clusters. It
is denoted by ‘ρ’.
Neural Network Learning Rules
• Being a complex adaptive system, learning in ANN implies that
a processing unit is capable of changing its input/output
behavior due to the change in environment.
• The need for learning arises because the activation function and the
input/output vectors are fixed once a particular network is constructed;
to change the input/output behavior, the weights must be adjusted.
• Hence, a method is required with the help of which the weights can be
modified. These methods are called learning rules, which are simply
algorithms or equations.
McCulloch-Pitts Neuron Model
(M-P Neuron Model)
• The first mathematical model of a biological neuron was given by
McCulloch-Pitts in 1943
• This model does not exhibit any learning but serves as a basic
building block that inspired further significant work in NN
research
• The M-P neurons are connected by directed weighted paths
• A connection path can be excitatory or inhibitory
• Excitatory connections have positive weights and inhibitory connections
have negative weights
• All excitatory connections entering a particular neuron have the
same weight
• The neuron is associated with a threshold value
• The neuron fires if the net input is greater than the threshold value
McCulloch-Pitts Neuron Model
[Figure: an M-P neuron Y with n excitatory inputs x1 … xn, each with weight w, and m inhibitory inputs xn+1 … xn+m, each with weight −p.]
• The activation of a McCulloch Pitts neuron is binary.
• Each neuron has a fixed threshold:
– f(yin) = 1 if yin >= θ
0 if yin < θ
• The threshold is set so that inhibition is absolute.
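A minimal sketch of an M-P neuron with absolute inhibition: any active inhibitory input prevents firing regardless of the excitatory drive. The AND-gate weights (w = 1, theta = 2) in the test are illustrative assumptions.

```python
def mp_neuron(excitatory, inhibitory, w, theta):
    """Fire iff no inhibitory input is active and the excitatory sum meets theta."""
    if any(inhibitory):          # absolute inhibition
        return 0
    y_in = sum(w * x for x in excitatory)
    return 1 if y_in >= theta else 0
```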
Example of McCulloch-Pitts Model
• It means that if any neuron, say yk, is to win, then its induced local
field (the output of its summation unit), say vk, must be the largest among
all the neurons in the network.
• Condition of sum total of weights − Another constraint of
the competitive learning rule is that the sum total of the weights to
a particular output neuron must be 1. For example, if
we consider neuron k then −
Σj wjk = 1
Step 5 − Now obtain the net input with the following relation −
yin = b + Σi xiwi (i = 1 to n)
Here ‘b’ is bias and ‘n’ is the total number of input neurons.
Step 6 − Apply the following activation function to obtain the final output.
Step 7 − Adjust the weight and bias as follows −
Case 1 − if y ≠ t then,
• wi(new)=wi(old)+αtxi
• b(new)=b(old)+αt
Case 2 − if y = t then,
• wi(new)=wi(old)
• b(new)=b(old)
Here ‘y’ is the actual output and ‘t’ is the desired/target output.
Step 8 − Test for the stopping condition, which would happen
when there is no change in weight.
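Steps 5-8 above can be sketched as a perceptron trainer for a single output unit, shown here on the bipolar AND function. The choice of AND as the example, theta = 0, and the conventional starting values (weights and bias 0, alpha = 1) are assumptions for illustration.

```python
def train_perceptron(samples, alpha=1.0, theta=0.0, epochs=10):
    """Perceptron rule: update w and b only when the output differs from the target."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        changed = False
        for x, t in samples:
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))  # Step 5
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)  # Step 6
            if y != t:                                       # Step 7, Case 1
                w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
                b += alpha * t
                changed = True
        if not changed:                                      # Step 8: stop
            break
    return w, b

# Bipolar AND: output 1 only when both inputs are 1
samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
w, b = train_perceptron(samples)
```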
Training Algorithm for Multiple Output Units
Here ‘b’ is bias and ‘n’ is the total number of input neurons.
Step 6 − Apply the following activation function to obtain the final output for each
output unit j = 1 to m −
Step 7 − Adjust the weight and bias
for x = 1 to n and j = 1 to m as
follows −
Case 1 − if yj ≠ tj then,
wij(new)=wij(old)+αtjxi
bj(new)=bj(old)+αtj
Case 2 − if yj = tj then,
wij(new)=wij(old)
bj(new)=bj(old)
Here ‘y’ is the actual output and ‘t’ is
the desired/target output.
Step 8 − Test for the stopping
condition, which will happen when
there is no change in weight.
Adaptive Linear Neuron (Adaline)
Here ‘b’ is bias and ‘n’ is the total number of input neurons.
Step 6 − Apply the following activation function to obtain the final output
−
Step 7 − Adjust the weight and bias as follows −
Case 1 − if y ≠ t then,
wi(new)=wi(old) + α(t−yin)xi
b(new)=b(old) + α(t−yin)
Case 2 − if y = t then,
wi(new)=wi(old)
b(new)=b(old)
Here ‘y’ is the actual output and ‘t’ is the desired/target output.
(t−yin) is the computed error.
Step 8 − Test for the stopping condition, which will happen when there
is no change in weight or the highest weight change occurred during
training is smaller than the specified tolerance.
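The Adaline (delta-rule) update in Step 7 moves the weights in proportion to the error (t − yin), and Step 8 stops when the largest weight change falls below a tolerance. The learning rate, tolerance, and epoch cap below are illustrative assumptions.

```python
def train_adaline(samples, alpha=0.1, tolerance=1e-3, max_epochs=1000):
    """Delta rule: w_i(new) = w_i(old) + alpha * (t - y_in) * x_i."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(max_epochs):
        largest_change = 0.0
        for x, t in samples:
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))
            err = t - y_in                       # (t - y_in) is the computed error
            for i in range(n):
                delta = alpha * err * x[i]
                w[i] += delta
                largest_change = max(largest_change, abs(delta))
            b += alpha * err
        if largest_change < tolerance:           # Step 8: stopping condition
            break
    return w, b
```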
Multiple Adaptive Linear Neuron (Madaline)
We know that only the weights and bias between the input and the
Adaline layer are to be adjusted, and the weights and bias between the
Adaline and the Madaline layer are fixed.
Step 1 − Initialize the following to start the training −
Weights Bias Learning rate α
For easy calculation and simplicity, weights and bias must be set equal
to 0 and the learning rate must be set equal to 1.
Step 2 − Continue step 3-8 when the stopping condition is not true.
Step 3 − Continue step 4-7 for every bipolar training pair s:t.
Step 4 − Activate each input unit as follows −
xi=si(i=1 to n)
Step 5 − Obtain the net input at each hidden layer, i.e. the Adaline layer
with the following relation −
Step 6 − Apply the following activation function to obtain the final output at the
Adaline and the Madaline layer −
• For training, BPN uses the binary sigmoid activation function. The training of BPN
has the following three phases:
Phase 1 − Feed Forward Phase
Phase 2 − Back Propagation of error
Phase 3 − Updating of weights
All these steps are combined in the algorithm as follows.
Step 1 − Initialize the following to start the training −
Weights Learning rate α
For easy calculation and simplicity, take some small random values.
Step 2 − Continue step 3-11 when the stopping condition is not true.
Step 3 − Continue step 4-10 for every training pair.
Phase 1
Step 4 − Each input unit receives input signal xi and sends it to the hidden unit for all i = 1 to n
Step 5 − Calculate the net input at the hidden unit using the following relation −
• Here b0j is the bias on the hidden unit, and vij is the weight on unit j of the hidden layer coming
from unit i of the input layer.
Now calculate the net output by applying the following activation
function
Send these output signals of the hidden layer units to the output
layer units.
Step 6 − Calculate the net input at the output layer unit using the
following relation −
• Here b0k is the bias on the output unit, and wjk is the weight on unit k of
the output layer coming from unit j of the hidden layer.
• Calculate the net output by applying the following activation
function
Phase 2
Step 7 − Compute the error-correcting term, in correspondence with the target
pattern received at each output unit, as follows −
On this basis, update the weight and bias as follows −
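The three phases can be sketched for a tiny 2-3-1 network with binary sigmoid units. The network size, the weight values, the learning rate, and the single training pair used in the test are illustrative assumptions; the error terms use the sigmoid derivative f'(x) = f(x)(1 − f(x)).

```python
import math

def f(x):
    """Binary sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, v, b_v, w, b_w):
    """Phase 1: feed forward through hidden layer z and output y."""
    z = [f(b_v[j] + sum(x[i] * v[i][j] for i in range(len(x))))
         for j in range(len(b_v))]
    y = f(b_w + sum(z[j] * w[j] for j in range(len(z))))
    return z, y

def train_step(x, t, v, b_v, w, b_w, alpha=0.5):
    z, y = forward(x, v, b_v, w, b_w)
    # Phase 2: back-propagate the error terms
    delta_k = (t - y) * y * (1 - y)                       # output unit
    delta_j = [delta_k * w[j] * z[j] * (1 - z[j]) for j in range(len(z))]
    # Phase 3: update weights and biases (hidden->output, then input->hidden)
    for j in range(len(z)):
        w[j] += alpha * delta_k * z[j]
        b_v[j] += alpha * delta_j[j]
        for i in range(len(x)):
            v[i][j] += alpha * delta_j[j] * x[i]
    b_w = b_w + alpha * delta_k
    return v, b_v, w, b_w
```

A single gradient step on one training pair should reduce the squared error for that pair.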
Architecture of ART1
It consists of the following two units −
Computational Unit − It is made up of the following :
• Input unit (F1 layer) − It further has the following two portions −
– F1a layer Input portion − In ART1, there is no processing in this portion;
it simply holds the input vectors. It is connected to the F1b layer interface portion.
– F1b layer Interface portion − This portion combines the signal from the input portion
with that of the F2 layer. The F1b layer is connected to the F2 layer through bottom-up
weights bij, and the F2 layer is connected to the F1b layer through top-down weights tji.
• Cluster Unit (F2 layer) − This is a competitive layer. The unit having the
largest net input is selected to learn the input pattern. The activations of all other
cluster units are set to 0.
• Reset Mechanism − The work of this mechanism is based upon the similarity
between the top-down weights and the input vector. If the degree of this
similarity is less than the vigilance parameter, then the cluster is not allowed to
learn the pattern and a reset occurs.
• Supplement Unit − The issue with
the reset mechanism is that the
layer F2 must be inhibited under
certain conditions yet must also be
available when some learning
happens. That is why two
supplemental units,
namely G1 and G2, are added
along with the reset unit R. They
are called gain control units.
These units receive and send
signals to the other units present
in the network. '+' indicates an
excitatory signal,
while '−' indicates an inhibitory
signal.
Associative Memory Network
• Associative memory network can store a set of
patterns as memories.
• It is presented with a key pattern and responds by
producing one of the stored patterns, which
closely resembles or relates to the key pattern
• Thus, the recall is through association of the key
pattern, with the help of information memorized.
• These types of memories are also called
content-addressable memories.
• Pattern association is the process of forming
associations between related patterns.
• The patterns to be associated may be of the
same type or of different types.
• Associative memory nets are simplified models
of the human brain which can associate similar
patterns.
• Associative neural nets are single-layer nets in
which the weights are determined so as to store
a set of pattern associations.
• Associative nets are of two types:
- If the input vector pair is the same as the output
vector pair, the result is an auto-associative
net.
- If the input vector pair is different from the
output vector pair, it is a hetero-associative
net.
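A hetero-associative net can be sketched with Hebb-rule weights, W = Σ over pairs of the outer product of the input and target patterns. Bipolar patterns are assumed, and the two stored pairs in the test are illustrative assumptions.

```python
def hebb_weights(pairs):
    """Store (s, t) pattern pairs: W[i][j] = sum over pairs of s_i * t_j."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    W = [[0] * m for _ in range(n)]
    for s, t in pairs:
        for i in range(n):
            for j in range(m):
                W[i][j] += s[i] * t[j]
    return W

def recall(x, W):
    """Present a key pattern x and recover the associated bipolar output."""
    m = len(W[0])
    y_in = [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(m)]
    return [1 if v >= 0 else -1 for v in y_in]
```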
Hopfield Network
• Neural networks were designed on analogy with
the brain.
• The brain’s memory, however, works by
association.
• For example, we can recognize a familiar face even
in an unfamiliar environment within 100-200 ms.
• We can also recall a complete sensory experience,
including sounds and scenes, when we hear only a
few bars of music. The brain routinely associates
one thing with another.
Hopfield Network
• Multilayer neural networks trained with the back-
propagation algorithm are used for pattern
recognition problems.
• However, to emulate the human memory’s
associative characteristics we need a different type
of network: a recurrent neural network.
• A recurrent neural network has feedback loops
from its outputs to its inputs. The presence of such
loops has a profound impact on the learning
capability of the network.
• The Hopfield network is probably the second
most popular type of neural network after the
back-propagation model.
• It is based on Hebbian learning but uses
binary neurons.
• The Hopfield network can be used as
- Associative Memory
- Optimization Problems
• The basic idea of Hopfield network is that
it can store a set of exemplar patterns as
multiple stable states.
• Given a new input pattern, which may be
partial or noisy, the network can converge to
the exemplar pattern that is nearest to the
input pattern.
• This is the basic concept of applying the
Hopfield network as associative memory.
Single-layer n-neuron Hopfield network
[Figure: input signals x1 … xn feed neurons 1 … n, which produce output signals y1 … yn; every neuron's output feeds back to all the other neurons.]
• A Hopfield network consists of a single layer of
neurons 1, 2, 3, … n.
• The network is fully interconnected, that is,
every neuron in the network is connected to
every other neuron.
• The network is recurrent, that is, it has
feedback capabilities.
• Each input/output xi, yi takes discrete
bipolar values, either 1 or -1.
• Each edge is associated with a weight wij which
satisfies the following conditions:
- The net has symmetrical weights with no self-
connections, i.e. wij = wji and wii = 0 (the
diagonal elements of the weight matrix of a
Hopfield net are zero).
- The Hopfield network is classified under
supervised learning, since at the beginning it is
given the correct exemplar patterns by a teacher.
Algorithm
• The weights to be used for the application algorithm
are obtained from the training algorithm.
• The activations are set from the input vector.
• The net input is calculated and the activation
function is applied.
• The output is calculated.
• The output is broadcast to all other units.
• The process is repeated until the net converges.
Step 1: Initialize weights to store pattern (use Hebb Rule)
while activation of net are not converge perform step 2 to 8
Step 2: For each input vector x, repeat step 3 to 7
Step 3: Set initial activation of the net equal to the external
input vector x
yi = xi (i=1,2,3…n)
Step 4: Perform steps 5 to 7 for each yi
Step 5: Compute the net input
yin i = xi + Σj yj wji
Step 6: Determine activation (output signal)
yi = 1 if yin i > theta
yi unchanged if yin i = theta
yi = 0 if yin i < theta
Step 7: Broadcast the value of yi to all other
units
Step 8: Test for convergence
The value of threshold theta is usually taken to
be zero
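The algorithm above can be sketched in a common bipolar (±1) variant: Hebb-rule weights with zero diagonal, then asynchronous recall that keeps the external input term xi as in Step 5. The stored pattern and the single-bit corruption in the test are illustrative assumptions.

```python
def hopfield_weights(patterns):
    """Hebb rule storage: W[i][j] = sum of p_i * p_j, with w_ii = 0."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                       # no self-connections
                    W[i][j] += p[i] * p[j]
    return W

def recall(x, W, theta=0, max_sweeps=10):
    """Asynchronous update until no unit changes (convergence)."""
    y = list(x)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            y_in = x[i] + sum(y[j] * W[j][i] for j in range(len(y)))
            new = 1 if y_in > theta else (-1 if y_in < theta else y[i])
            if new != y[i]:
                y[i], changed = new, True
        if not changed:                          # Step 8: converged
            break
    return y
```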
The Hopfield Network
•Example: Image reconstruction
•A 20×20 discrete Hopfield network was trained with 20
input patterns, including the one shown in the left figure and
19 random patterns like the one on the right.
The Hopfield Network
•After providing only one fourth of the “face” image
as initial input, the network is able to perfectly
reconstruct that image within only two iterations.
Problems with Hopfield Model
[Figure: a two-layer network with input signals x1, x2 feeding output signals y1, y2, y3; input layer and output layer.]
Kohonen Network
• It is a network of two layers, the first is the input layer
and second is the output layer called Kohonen Layer
• The neurons on the Kohonen layer are called Kohonen
Neurons
• Every input neuron is connected to every Kohonen
neuron with a variable associated weight.
• The network is feed-forward
• Input values representing patterns are presented
sequentially in time through the input layer, without
specifying desired output.
• The neurons in the Kohonen layer are arranged in
a one-dimensional or a two-dimensional array.
• In both cases, a neighborhood parameter or
radius (r) can be defined to indicate the
neighborhood of a specific neuron.
• The key principle for map formation is that training
should take place over an extended region of the
network centered on the maximally active node.
Training Algorithm
• Initially, the weights and learning rate are set.
• The input vectors to be clustered are presented to
the network.
• Once an input vector is given, the winner unit is
calculated from the current weights, either by the
Euclidean distance method or the sum-of-products
method.
• Based on the winner selection, the weights of the
particular winner unit are updated using
competitive learning.
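One training step can be sketched as winner selection by Euclidean distance followed by the competitive update w_j(new) = w_j(old) + alpha * (x − w_j(old)). A neighborhood radius of r = 0 (only the winner is updated), and the learning rate and initial weights in the test, are illustrative assumptions.

```python
def winner(x, weights):
    """Index of the Kohonen unit whose weight vector is closest to x."""
    def dist2(w):
        return sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return min(range(len(weights)), key=lambda j: dist2(weights[j]))

def update(x, weights, j, alpha=0.5):
    """Move the winning unit's weight vector toward the input."""
    weights[j] = [wi + alpha * (xi - wi) for xi, wi in zip(x, weights[j])]
    return weights
```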