AISC Notes2

PART-A

Soft Computing
Soft computing is the opposite of hard (conventional) computing. It refers to a group of
computational techniques that are based on artificial intelligence (AI) and natural selection.
It provides cost-effective solutions to complex real-life problems for which no hard
computing solution exists.

Zadeh coined the term soft computing in 1992. The objective of soft computing is to
provide good approximations and quick solutions for complex real-life problems.

Characteristics of Soft Computing


o Soft computing provides an approximate solution that is precise enough for real-life problems.
o The algorithms of soft computing are adaptive, so they keep working when the
environment changes.
o The concept of soft computing is based on learning from experimental data. This means
that soft computing does not require a mathematical model to solve the problem.
o Soft computing helps users solve real-world problems by providing approximate
results to problems that conventional and analytical models cannot solve.
o It is based on fuzzy logic, genetic algorithms, machine learning, ANNs, and expert
systems.

EXAMPLE:-

string1 = "xyz" and string2 = "xyw"

Problem 1: Are string1 and string2 the same?
Solution: No. The answer is simply No; no algorithm is needed to decide this.

Let's modify the problem a bit.

Problem 2: How similar are string1 and string2?
Solution: Through conventional programming, the answer is either Yes or No. According to
soft computing, however, the strings might be 80% similar.

Notice that soft computing gives us an approximate, graded answer.
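As a minimal illustration of such a graded answer, the sketch below uses Python's standard difflib module to score string similarity. The exact percentage depends on the similarity measure chosen: difflib reports about 67% for these strings, while other measures may report a different figure such as the 80% mentioned above.

```python
from difflib import SequenceMatcher

string1 = "xyz"
string2 = "xyw"

# Hard computing style: an exact, binary answer.
print(string1 == string2)  # False

# Soft computing style: a graded similarity score between 0 and 1.
similarity = SequenceMatcher(None, string1, string2).ratio()
print(f"{similarity:.0%} similar")  # about 67% for these strings
```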

Applications of soft computing


There are several applications of soft computing where it is used. Some of them are
listed below:

o It is widely used in game-playing programs such as poker and checkers.


o In kitchen appliances, such as microwaves and rice cookers.
o In widely used home appliances such as washing machines, heaters,
refrigerators, and air conditioners.
o Apart from these uses, it is also applied in robotics (for example,
emotional pet robots).
o Image processing and data compression are also popular applications of soft
computing.
o It is used for handwriting recognition.

Need for soft computing


Sometimes, conventional computing or analytical models do not provide a solution
to a real-world problem. In that case, we require another technique, such as soft
computing, to obtain an approximate solution.

o Hard computing is used for solving mathematical problems that need a precise
answer, but it fails to provide solutions for some real-life problems. For real-life
problems whose precise solution does not exist, soft computing helps.
o When conventional mathematical and analytical models fail, soft computing
helps; for example, soft computing can even be used to model the human mind.
o Analytical models can solve mathematical problems and are valid for
ideal cases. But real-world problems do not have an ideal case; they exist
in a non-ideal environment.
o Soft computing is not limited to theory; it also gives insights into real-life
problems.
o For all the above reasons, soft computing helps to model the human mind, which
is not possible with conventional mathematical and analytical models.

Elements of soft computing


Soft computing is viewed as a foundation component for an emerging field of conceptual
intelligence. Fuzzy Logic (FL), Machine Learning (ML), Neural Networks (NN), Probabilistic
Reasoning (PR), and Evolutionary Computation (EC) are the main constituents of soft computing.
These are also the techniques soft computing uses to resolve complex problems.

Many problems can be resolved effectively using these components. The following are three types
of techniques used by soft computing:

o Fuzzy Logic
o Artificial Neural Network (ANN)
o Genetic Algorithms
Fuzzy Logic (FL)
Fuzzy logic is a form of mathematical logic that tries to solve problems over an open and
imprecise spectrum of data, making it possible to draw an array of accurate conclusions
from vague inputs.

Fuzzy logic is designed to reach the best possible solution to a complex problem
from all the available information and input data. Fuzzy logic systems are considered
excellent solution finders.

Neural Network (ANN)


Neural networks, first developed in the 1950s, help soft computing solve real-world
problems that a computer cannot handle by itself. A human brain can easily
interpret real-world conditions, but a computer cannot.

An artificial neural network (ANN) emulates the network of neurons that makes up a human
brain, so that the computer or machine can learn and make decisions in a human-like manner.

Artificial neural networks (ANNs) consist of mutually interconnected units, analogous to
brain cells, and are created using regular computer programming. They resemble the human
neural system.

Genetic Algorithms (GA)


Genetic algorithms are inspired almost entirely by nature. They are search-based
algorithms that find their roots in natural selection and the concepts of genetics.

Genetic algorithms are a subset of the larger branch of computation known as
evolutionary computation.

Soft computing vs hard computing

Parameters       | Soft Computing                                                | Hard Computing
Computation time | Takes less computation time.                                  | Takes more computation time.
Dependency       | Depends on approximation and dispositionality.                | Based mainly on binary logic and numerical systems.
Computation type | Parallel computation.                                         | Sequential computation.
Result/Output    | Approximate result.                                           | Exact and precise result.
Example          | Neural networks such as Madaline, Adaline, and ART networks. | Any numerical problem solved with traditional methods on a personal computer.

PART-B
What is a neural network?
A neural network is a method in artificial intelligence that teaches computers to process data in a
way that is inspired by the human brain. It is a type of machine learning process, called deep
learning, that uses interconnected nodes or neurons in a layered structure that resembles the
human brain. It creates an adaptive system that computers use to learn from their mistakes and
improve continuously. Thus, artificial neural networks attempt to solve complicated problems, like
summarizing documents or recognizing faces, with greater accuracy.

Why are neural networks important?


Neural networks can help computers make intelligent decisions with limited human assistance.
This is because they can learn and model the relationships between input and output data that
are nonlinear and complex. For instance, they can do the following tasks.

Make generalizations and inferences

Neural networks can comprehend unstructured data and make general observations without
explicit training. For instance, they can recognize that two different input sentences have a
similar meaning:

• Can you tell me how to make the payment?


• How do I transfer money?

A neural network would know that both sentences mean the same thing. Or it would be able to
broadly recognize that Baxter Road is a place, but Baxter Smith is a person’s name.

What are neural networks used for?


Neural networks have several use cases across many industries, such as the following:

• Medical diagnosis by medical image classification


• Targeted marketing by social network filtering and behavioral data analysis
• Financial predictions by processing historical data of financial instruments
• Electrical load and energy demand forecasting
• Process and quality control
• Chemical compound identification

We give four of the important applications of neural networks below.


Computer vision

Computer vision is the ability of computers to extract information and insights from images and
videos. With neural networks, computers can distinguish and recognize images similar to
humans. Computer vision has several applications, such as the following:

• Visual recognition in self-driving cars so they can recognize road signs and other road
users
• Content moderation to automatically remove unsafe or inappropriate content from image
and video archives
• Facial recognition to identify faces and recognize attributes like open eyes, glasses, and
facial hair
• Image labeling to identify brand logos, clothing, safety gear, and other image details
Speech recognition

Neural networks can analyze human speech despite varying speech patterns, pitch, tone,
language, and accent. Virtual assistants like Amazon Alexa and automatic transcription software
use speech recognition to do tasks like these:

• Assist call center agents and automatically classify calls


• Convert clinical conversations into documentation in real time
• Accurately subtitle videos and meeting recordings for wider content reach
Natural language processing

Natural language processing (NLP) is the ability to process natural, human-created text. Neural
networks help computers gather insights and meaning from text data and documents. NLP has
several use cases, including in these functions:

• Automated virtual agents and chatbots


• Automatic organization and classification of written data
• Business intelligence analysis of long-form documents like emails and forms
• Indexing of key phrases that indicate sentiment, like positive and negative comments on
social media
• Document summarization and article generation for a given topic
Recommendation engines

Neural networks can track user activity to develop personalized recommendations. They can also
analyze all user behavior and discover new products or services that interest a specific user. For
example, Curalate, a Philadelphia-based startup, helps brands convert social media posts into
sales. Brands use Curalate’s intelligent product tagging (IPT) service to automate the collection
and curation of user-generated social content. IPT uses neural networks to automatically find and
recommend products relevant to the user’s social media activity. Consumers don't have to hunt
through online catalogs to find a specific product from a social media image. Instead, they can
use Curalate’s auto product tagging to purchase the product with ease.

How do neural networks work?


The human brain is the inspiration behind neural network architecture. Human brain cells, called
neurons, form a complex, highly interconnected network and send electrical signals to each other
to help humans process information. Similarly, an artificial neural network is made of artificial
neurons that work together to solve a problem. Artificial neurons are software modules, called
nodes, and artificial neural networks are software programs or algorithms that, at their core, use
computing systems to solve mathematical calculations.

Simple neural network architecture

A basic neural network has interconnected artificial neurons in three layers:

Input Layer

Information from the outside world enters the artificial neural network from the input layer. Input
nodes process the data, analyze or categorize it, and pass it on to the next layer.

Hidden Layer

Hidden layers take their input from the input layer or other hidden layers. Artificial neural
networks can have a large number of hidden layers. Each hidden layer analyzes the output from
the previous layer, processes it further, and passes it on to the next layer.

Output Layer

The output layer gives the final result of all the data processing by the artificial neural network. It
can have single or multiple nodes. For instance, if we have a binary (yes/no) classification
problem, the output layer will have one output node, which will give the result as 1 or 0. However,
if we have a multi-class classification problem, the output layer might consist of more than one
output node.

Deep neural network architecture


Deep neural networks, or deep learning networks, have several hidden layers with millions of
artificial neurons linked together. A number, called a weight, represents the connection between
one node and another. The weight is positive if one node excites another, or negative if
one node suppresses the other. Nodes with higher weight values have more influence on the
other nodes.
Theoretically, deep neural networks can map any input type to any output type. However, they
also need much more training as compared to other machine learning methods. They need
millions of examples of training data rather than perhaps the hundreds or thousands that a
simpler network might need.

What are the types of neural networks?


Artificial neural networks can be categorized by how the data flows from the input node to the
output node. Below are some examples:

Feedforward neural networks


Feedforward neural networks process data in one direction, from the input node to the output
node. Every node in one layer is connected to every node in the next layer. During training, a
feedforward network uses a feedback process (backpropagation) to improve its predictions over time.

Backpropagation algorithm
Artificial neural networks learn continuously by using corrective feedback loops to improve their
predictive analytics. In simple terms, you can think of the data flowing from the input node to the
output node through many different paths in the neural network. Only one path is the correct one
that maps the input node to the correct output node. To find this path, the neural network uses a
feedback loop, which works as follows:

1. Each node makes a guess about the next node in the path.
2. It checks if the guess was correct. Nodes assign higher weight values to paths that lead
to more correct guesses and lower weight values to node paths that lead to incorrect
guesses.
3. For the next data point, the nodes make a new prediction using the higher weight paths
and then repeat Step 1.
Convolutional neural networks

The hidden layers in convolutional neural networks perform specific mathematical functions, like
summarizing or filtering, called convolutions. They are very useful for image classification
because they can extract relevant features from images that are useful for image recognition and
classification. The new form is easier to process without losing features that are critical for
making a good prediction. Each hidden layer extracts and processes different image features,
like edges, color, and depth.

How to train neural networks?


Neural network training is the process of teaching a neural network to perform a task. Neural
networks learn by initially processing several large sets of labeled or unlabeled data. By using
these examples, they can then process unknown inputs more accurately.

Supervised learning
In supervised learning, data scientists give artificial neural networks labeled datasets that provide
the right answer in advance. For example, a deep learning network training in facial recognition
initially processes hundreds of thousands of images of human faces, with various terms related
to ethnic origin, country, or emotion describing each image.

The neural network slowly builds knowledge from these datasets, which provide the right answer
in advance. After the network has been trained, it starts making guesses about the ethnic origin
or emotion of a new image of a human face that it has never processed before.

Unsupervised learning

Unsupervised learning, also known as unsupervised machine learning, uses machine
learning algorithms to analyze and cluster unlabeled datasets. These algorithms discover
hidden patterns or data groupings without the need for human intervention.
Its ability to discover similarities and differences in information makes it the ideal solution
for exploratory data analysis, cross-selling strategies, customer segmentation, and image
recognition.

Reinforcement learning
Reinforcement Learning (RL) is the science of decision making. It is about
learning the optimal behavior in an environment to obtain maximum
reward. This optimal behavior is learned through interactions with the
environment and observations of how it responds, similar to children
exploring the world around them and learning the actions that help them
achieve a goal.
In the absence of a supervisor, the learner must independently discover the
sequence of actions that maximize the reward. This discovery process is
akin to a trial-and-error search. The quality of actions is measured by not
just the immediate reward they return, but also the delayed reward they
might fetch. As it can learn the actions that result in eventual success in an
unseen environment without the help of a supervisor, reinforcement
learning is a very powerful algorithm.

Supervised V/S Unsupervised learning

Aspect                   | Supervised Learning                                        | Unsupervised Learning
Input data               | Uses known, labeled data as input.                         | Uses unlabeled data as input.
Computational complexity | Less computationally complex.                              | More computationally complex.
Real time                | Uses off-line analysis.                                    | Uses real-time analysis of data.
Number of classes        | The number of classes is known.                            | The number of classes is not known.
Accuracy of results      | Accurate and reliable results.                             | Moderately accurate and reliable results.
Output data              | The desired output is given.                               | The desired output is not given.
Model                    | It is not possible to learn larger and more complex models than with unsupervised learning. | It is possible to learn larger and more complex models than with supervised learning.
Training data            | Training data is used to infer the model.                  | Training data is not used.
Another name             | Also called classification.                                | Also called clustering.
Test of model            | We can test our model.                                     | We cannot test our model.
Example                  | Optical character recognition.                             | Finding a face in an image.

Single Layer Perceptron in TensorFlow


The perceptron is a single processing unit of a neural network. Frank Rosenblatt first
proposed it in 1958 as a simple neuron used to classify its input into one of two
categories. The perceptron is a linear classifier used in supervised learning, and it
helps to classify the given input data.

A perceptron is a neural network unit that performs a precise computation to detect
features in the input data. The perceptron is mainly used to classify data into two parts;
therefore, it is also known as a Linear Binary Classifier.

The perceptron uses a step function that returns +1 if the weighted sum of its inputs is
greater than or equal to 0, and -1 otherwise.
The activation function maps the input to the required range, such as (0, 1)
or (-1, 1).

A regular neural network looks like this:

The perceptron consists of 4 parts.


o Input value or one input layer: The input layer of the perceptron is made of artificial
input neurons and takes the initial data into the system for further processing.
o Weights and bias:
Weight: It represents the strength of the connection between units. If
the weight from node 1 to node 2 is larger, then neuron 1 has a greater
influence on neuron 2.
Bias: It is the same as the intercept added in a linear equation. It is an additional
parameter whose task is to shift the output alongside the weighted sum of the inputs
to the next neuron.
o Net sum: It calculates the weighted sum of the inputs.
o Activation function: Whether a neuron is activated is determined by an activation
function, which takes the weighted sum plus the bias and maps it to the result.
A standard neural network looks like the below diagram.

How does it work?


The perceptron works on these simple steps which are given below:

a. In the first step, all the inputs x are multiplied by their weights w.
b. In this step, add all the multiplied values and call the result the weighted sum.

c. In the last step, apply the weighted sum to the chosen activation function.

For example: a unit step activation function.
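A minimal sketch of these three steps in Python; the weights, bias, and inputs below are hypothetical values chosen purely for illustration:

```python
import numpy as np

def unit_step(z):
    """Unit step activation: +1 if the weighted sum is >= 0, else -1."""
    return 1 if z >= 0 else -1

def perceptron(x, w, b):
    # Steps a and b: multiply each input by its weight, then sum (plus bias).
    z = np.dot(x, w) + b
    # Step c: apply the activation function to the weighted sum.
    return unit_step(z)

# Hypothetical 2-input perceptron.
x = np.array([1.0, 0.0])
w = np.array([0.5, 0.5])
b = -0.7
print(perceptron(x, w, b))  # -1, since 0.5 - 0.7 = -0.2 < 0
```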


There are two types of architecture. These types focus on the functionality of artificial
neural networks as follows-

o Single Layer Perceptron


o Multi-Layer Perceptron

Single Layer Perceptron


The single-layer perceptron was the first neural network model, proposed in 1958 by
Frank Rosenblatt. It is one of the earliest models for learning. Our goal is to find a
linear decision function determined by the weight vector w and the bias parameter b.

To understand the perceptron layer, it is necessary to comprehend artificial neural


networks (ANNs).

The artificial neural network (ANN) is an information processing system, whose


mechanism is inspired by the functionality of biological neural circuits. An artificial
neural network consists of several processing units that are interconnected.

This was the first neural model proposed. The content of the neuron's
local memory is a vector of weights.

The output of a single-layer perceptron is calculated by summing the input vector
multiplied element by element by the weight vector; this weighted sum is then fed
into an activation function, whose value appears at the output.
Let us focus on the implementation of a single-layer perceptron for an image
classification problem using TensorFlow. The best example of drawing a single-layer
perceptron is through the representation of "logistic regression."

Now, we have to perform the following necessary steps for training the logistic
regression, as the sketch after this list shows:

o The weights are initialized with random values at the start of training.
o For each element of the training set, the error is calculated as the difference
between the desired output and the actual output. The calculated error is used
to adjust the weights.
o The process is repeated until the error made on the entire training set is less
than the specified limit, or until the maximum number of iterations has been
reached.
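A minimal TensorFlow sketch of these steps; the toy dataset, learning rate, and iteration count are hypothetical values chosen for illustration:

```python
import tensorflow as tf

# Hypothetical toy data: four 2-feature samples with binary labels.
X = tf.constant([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = tf.constant([[0.], [0.], [0.], [1.]])

# Weights are initialized with random values at the start of training.
w = tf.Variable(tf.random.normal([2, 1]))
b = tf.Variable(tf.zeros([1]))

optimizer = tf.keras.optimizers.SGD(learning_rate=0.5)

for step in range(500):
    with tf.GradientTape() as tape:
        logits = tf.matmul(X, w) + b
        # Error between the desired output and the actual output.
        loss = tf.reduce_mean(
            tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
    grads = tape.gradient(loss, [w, b])
    optimizer.apply_gradients(zip(grads, [w, b]))  # adjust the weights

print(tf.sigmoid(tf.matmul(X, w) + b).numpy().round(2))
```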

Hidden Layer Perceptron in TensorFlow


A hidden layer is a layer of an artificial neural network that lies between the input
layer and the output layer. Its artificial neurons take in a set of weighted inputs
and produce an output through an activation function. Hidden layers are part of nearly
every neural network in which engineers simulate the types of activity that go on in
the human brain.

Hidden layers can be set up in several ways. In many cases, the weighted
inputs are randomly initialized; they are then fine-tuned and calibrated
through a process called backpropagation.

An artificial neuron in the hidden layer works like a biological neuron in
the brain: it takes in its probabilistic input signals, operates on them, and converts
them into an output, corresponding to the biological neuron's axon.

Layers after the input layer are called hidden because they are not directly exposed to the
input. The simplest network structure has a single neuron in the hidden layer
that directly outputs the value.

Multi-layer Perceptron in TensorFlow


The multi-layer perceptron defines the most complex architecture of artificial neural
networks. It is formed from multiple layers of perceptrons. TensorFlow
is a very popular deep learning framework released by Google, and this section will
guide you in building a neural network with that library. To understand what a multi-
layer perceptron is, it also helps to develop a multi-layer perceptron from scratch using
NumPy.

The pictorial representation of multi-layer perceptron learning is as shown below-

MLP networks are used in a supervised learning setting. The typical learning algorithm for
MLP networks is the backpropagation algorithm.
A multilayer perceptron (MLP) is a feedforward artificial neural network that generates
a set of outputs from a set of inputs. An MLP is characterized by several layers of
nodes connected as a directed graph between the input and output layers. An MLP uses
backpropagation for training the network. The MLP is a deep learning method.
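A minimal from-scratch forward pass in NumPy, with hypothetical layer sizes (3 inputs, 4 hidden units, 2 outputs) chosen for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical layer sizes: 3 inputs -> 4 hidden units -> 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)

def mlp_forward(x):
    h = sigmoid(x @ W1 + b1)     # hidden layer
    return sigmoid(h @ W2 + b2)  # output layer

print(mlp_forward(np.array([0.5, -1.0, 2.0])))
```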

BACKPROPAGATION LEARNING


Backpropagation is one of the important concepts of a neural network. Our task is to
classify the data as well as possible. For this, we have to update the weights and biases,
but how can we do that in a deep neural network? In the linear regression model, we
use gradient descent to optimize the parameters. Similarly, here we also use the gradient
descent algorithm, with backpropagation supplying the gradients.

For a single training example, the backpropagation algorithm calculates the gradient of
the error function with respect to each weight in the network. Backpropagation algorithms
are a set of methods used to efficiently train artificial neural networks following a
gradient descent approach that exploits the chain rule.

The main feature of backpropagation is that it is an iterative, recursive, and efficient
method for calculating the updated weights that improve the network until it is able
to perform the task for which it is being trained. Backpropagation requires the
derivatives of the activation functions to be known at network design time.

Now, how is the error function used in backpropagation, and how does backpropagation work?
Let us start with an example and work through it mathematically to understand exactly how
backpropagation updates the weights.
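A minimal numerical sketch of such an update, for a hypothetical 1-1-1 network (one input, one sigmoid hidden unit, one linear output) with a squared-error loss; all values are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 1-1-1 network: x -> hidden (sigmoid) -> output (linear).
x, target = 1.0, 0.0
w1, w2 = 0.6, -0.4
lr = 0.1  # learning rate

for step in range(3):
    # Forward pass.
    h = sigmoid(w1 * x)
    y = w2 * h
    error = 0.5 * (y - target) ** 2

    # Backward pass: chain rule from the error back to each weight.
    dE_dy = y - target
    dE_dw2 = dE_dy * h
    dE_dh = dE_dy * w2
    dE_dw1 = dE_dh * h * (1 - h) * x  # sigmoid'(z) = h * (1 - h)

    # Gradient descent update.
    w1 -= lr * dE_dw1
    w2 -= lr * dE_dw2
    print(f"step {step}: error = {error:.4f}")
```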
Advantages of Using the Backpropagation Algorithm in Neural Networks

• No previous knowledge of a neural network is needed, making it easy to implement.
• It's straightforward to program since there are no parameters besides the inputs.
• It doesn't need to learn the features of a function, speeding up the process.
• The model is flexible because of its simplicity, and applicable to many scenarios.

Limitations of Using the Backpropagation Algorithm in Neural Networks

• Training data can impact the performance of the model, so high-quality data is essential.
• Noisy data can also affect backpropagation, potentially tainting its results.
• It can take a while to train backpropagation models and get them up to speed.
• Backpropagation requires a matrix-based approach, which can lead to other issues.

Associative Memory Network


An associative memory network refers to a content-addressable memory structure that
associates relationships between a set of input patterns and output patterns. A
content-addressable memory structure is a kind of memory structure that enables the
recollection of data based on the degree of similarity between the input pattern and
the patterns stored in the memory.

Let's understand this concept with an example:


The figure referred to below (omitted here) illustrates a memory containing the names of
various people. If the given memory is content-addressable, an incomplete or slightly
misspelled key such as "Albert Einstien" is sufficient to recover the correct name
"Albert Einstein."

Under these conditions, this type of memory is robust and fault-tolerant, because the
memory model has some form of error-correction capability.

There are two types of associate memory- an auto-associative memory and hetero associative
memory.

Auto-associative memory:

An auto-associative memory recovers a previously stored pattern that most closely resembles
the current pattern. It is also known as an auto-associative correlator.
Let x[1], x[2], x[3], ..., x[M] be the stored pattern vectors, and
let x[m] be one element of this set, capturing characteristics obtained from the patterns.
The auto-associative memory will return the pattern vector x[m] when presented with a noisy
or incomplete version of x[m].

Hetero-associative memory:

In a hetero-associative memory, the recovered pattern is generally different from the input
pattern, not only in content but possibly also in type and format. It is also known as a
hetero-associative correlator.
Consider a number of key-response pairs {a(1), x(1)}, {a(2), x(2)}, ..., {a(M),
x(M)}. The hetero-associative memory will return the pattern vector x(m) when a noisy or
incomplete version of a(m) is given.

Neural networks are usually used to implement these associative memory models,
called neural associative memory (NAM). The linear associator is the simplest artificial
neural associative memory.

These models follow distinct neural network architecture to memorize data.

Working of Associative Memory:


Associative memory is a depository of associated patterns stored in some form. If the
depository is triggered with a pattern, the associated pattern pair appears at the output.
The input could be an exact or partial representation of a stored pattern.

If the memory is presented with an input pattern, say α, the associated
pattern ω is recovered automatically.

These are the terms which are related to the Associative memory network:

Encoding or memorization:

Encoding or memorization refers to building an associative memory. It implies
constructing an association weight matrix W such that when an input pattern is given,
the stored pattern connected with the input pattern is recovered. The individual
correlation matrices are formed as

(wij)k = (pi)k (qj)k

where (pi)k represents the ith component of pattern pk and (qj)k represents the jth
component of pattern qk, for i = 1, 2, ..., m and j = 1, 2, ..., n.

Constructing the association weight matrix W is accomplished by adding up the individual
correlation matrices wk, i.e.,

W = α Σ (k = 1 to z) wk

where α is a normalizing (proportionality) constant.
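A minimal NumPy sketch of this encoding and the subsequent recall, using hypothetical bipolar pattern pairs and α = 1:

```python
import numpy as np

# Hypothetical bipolar pattern pairs for a hetero-associative memory.
P = np.array([[ 1, -1,  1],
              [-1,  1, -1]])   # input patterns p_k
Q = np.array([[ 1,  1],
              [-1, -1]])       # output patterns q_k

# Encoding: W = sum over k of the outer products of p_k and q_k (alpha = 1).
W = sum(np.outer(p, q) for p, q in zip(P, Q))

# Recall: present a key and threshold the result back to bipolar values.
key = np.array([1, -1, 1])
print(np.sign(key @ W))  # recovers [1, 1], the pattern paired with the key
```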

Errors and noise:

The input pattern may contain errors and noise, or may be an incomplete version of
some previously encoded pattern. When a corrupted input pattern is presented, the
network will recover the stored pattern that is closest to the actual input pattern. The
presence of noise or errors results only in a partial decrease, rather than total
degradation, in the efficiency of the network. Thus, associative memories are robust
and fault-tolerant because many processing units perform highly parallel and
distributed computations.

Performance Measures:

The measures of associative memory performance for correct recovery are
memory capacity and content addressability. Memory capacity can be defined as the
maximum number of associated pattern pairs that can be stored and correctly
recovered. Content addressability refers to the ability of the network to recover the
correct stored pattern.

If input patterns are mutually orthogonal, perfect recovery is possible. If stored input
patterns are not mutually orthogonal, non-perfect recovery can happen due to
intersection among the patterns.

Associative memory models:


The linear associator is the simplest and most widely used associative memory model. It
is a collection of simple processing units which together have a quite complex collective
computational capability and behavior. The Hopfield model computes its output
recursively in time until the system becomes stable. Hopfield networks are constructed
using bipolar units and a learning process. The Hopfield model is an auto-
associative memory suggested by John Hopfield in 1982. Bidirectional
Associative Memory (BAM) and the Hopfield model are other popular artificial
neural network models used as associative memories.

Network architectures of Associate Memory Models:


The neural associative memory models follow various neural network architectures to
memorize data. A network comprises either a single layer or two layers. The linear
associator model is a feed-forward type network comprising two layers of
processing units, the first serving as the input layer and the other
as the output layer. The Hopfield model is a single layer of processing elements
where each unit is connected to every other unit in the network.
The bidirectional associative memory (BAM) model is the same as the linear
associator, but the associations are bidirectional.

The neural network architectures of these given models and the structure of the
corresponding association weight matrix w of the associative memory are depicted.

Linear Associator model (two layers):

The linear associator model is a feed-forward type network where produced output is
in the form of single feed-forward computation. The model comprises of two layers of
processing units, one work as an input layer while the other work as an output layer.
The input is directly associated with the outputs, through a series of weights. The
connections carrying weights link each input to every output. The addition of the
products of the weights and the input is determined in each neuron node. The
architecture of the linear associator is given below.
All p inputs units are associated to all q output units via associated weight matrix

W = [wij]p * q where wij describes the strength of the unidirectional association of


the ith input unit to the jth output unit.

The connection weight matrix stores the z different associated pattern pairs {(Xk, Yk);
k = 1, 2, 3, ..., z}. Constructing an associative memory means building the connection
weight matrix W such that when an input pattern is presented, the stored pattern
associated with that input pattern is recovered.

Adaptive Resonance Theory


The Adaptive Resonance Theory (ART) was developed as a hypothesis about human
cognitive information processing. The hypothesis has prompted neural models for pattern
recognition and unsupervised learning. ART systems have been used to explain various
types of cognitive and brain data.

Adaptive Resonance Theory addresses the stability-plasticity dilemma (stability can be
defined as the ability to retain what has been learned, and plasticity refers to being
flexible enough to acquire new information): how can learning proceed in response to a
huge stream of input patterns without losing the stability already achieved for
previously learned patterns? In other words, the stability-plasticity dilemma is
concerned with how a system can adapt to new data while keeping what was learned
before. For such a task, a feedback mechanism is included between the ART neural
network layers. In this neural network, information in the form of processing-element
outputs reflects back and forth between layers. If an appropriate pattern is built up,
resonance is reached, and adaptation can occur during this period.

Gain control enables L1 and L2 to distinguish the current stage of the running cycle.
An STM reset wave inhibits active L2 cells when mismatches between bottom-up and
top-down signals occur at L1. The comparison layer receives the binary external input
and passes it to the recognition layer, which is responsible for matching it to a
classification category. This outcome is passed back to the comparison layer to find
out whether the category matches the input vector. If there is a match, a new input
vector is read and the cycle begins again. If there is a mismatch, the orienting system
is in charge of preventing the previous category from being matched again in the
recognition layer. The two gains control the activity of the recognition and the
comparison layer, respectively. The reset wave specifically and enduringly inhibits
active L2 cells until the current input is shut off. The offset of the input pattern ends
its processing at L1 and triggers the offset of Gain 2. The Gain 2 offset causes a
steady decay of STM at L2 and thereby prepares L2 to encode the next input pattern
without bias.
ART1 Implementation process:
ART1 is a self-organizing neural network with input and output neurons mutually
coupled by bottom-up and top-down adaptive weights that perform recognition. To
start, the system is first trained according to adaptive resonance theory by feeding
reference pattern data, in the form of 5x5 matrices, into the neurons for
clustering within the output neurons. Next, the maximum number of nodes in L2 is
defined, followed by the vigilance parameter. The input pattern registers itself as
short-term memory activity across the field of nodes L1. Converging and diverging
pathways from L1 to the coding field L2, each weighted by an adaptive long-term memory
trace, transform the pattern into a net signal vector T. Internal competitive dynamics
at L2 further transform T, creating a compressed code or content-addressable memory.
With strong competition, activation is concentrated at the L2 node that receives the
maximal L1 → L2 signal. The primary objective of this work is divided into four phases:
comparison, recognition, search, and learning.

Advantages of Adaptive Resonance Theory (ART):

It can be combined with and used alongside other techniques to give more precise
outcomes.

It can be used in different fields such as face recognition, embedded systems,
robotics, target recognition, medical diagnosis, signature verification, etc.

It shows stability and is not disturbed by the wide range of inputs provided to it.
It has benefits over competitive learning: competitive learning cannot include new
clusters when considered necessary.

Application of ART:
ART stands for Adaptive Resonance Theory. ART neural networks, used for fast, stable
learning and prediction, have been applied in many areas. The applications include
target recognition, face recognition, medical diagnosis, signature verification, and mobile
robot control.

Target recognition:

The fuzzy ARTMAP neural network can be used for automatic classification of targets based on
their radar range profiles. Tests on synthetic data show that fuzzy ARTMAP can yield
substantial savings in memory requirements compared to k-nearest-neighbor (kNN)
classifiers. The use of multiwavelength profiles mainly improves the performance of
both kinds of classifiers.

Medical diagnosis:

Medical databases present huge numbers of challenges found in general information
management settings, where speed, usability, efficiency, and accuracy are the prime
concerns. A direct objective of improved computer-assisted medicine is to help
deliver intensive care in situations that may be less than ideal. Working with these
issues has stimulated several ART architecture developments, including ARTMAP-IC.
Signature verification:

Automatic signature verification is a well-known and active area of research with
various applications such as bank check confirmation, ATM access, etc. The training of
the network is done using ART1, which uses global features as the input vector; the
verification and recognition phase uses a two-step process. In the first step, the input
vector is matched with the stored reference vector that was used as the training
set, and in the second step, cluster formation takes place.

Mobile control robot:

Nowadays, we see a wide range of robotic devices. Their programming, referred to as
artificial intelligence, is still a field of active research. The human brain is an
interesting subject to model for such an intelligent system. Inspired by the structure
of the human brain, artificial neural networks emerged. Like the brain, an artificial
neural network contains numerous simple computational units, neurons, that are
mutually interconnected to allow signals to be transferred from neuron to
neuron. Artificial neural networks are used to solve many problems with good
results compared to other decision algorithms.

Limitations of ART:
Some ART networks are inconsistent, as their results depend on the order of the training
data and on the learning rate. In addition, ART does not guarantee stability in forming
clusters.
PART-C

What is Fuzzy Logic?


The word 'fuzzy' refers to things that are not clear or are vague. Sometimes, in real
life, we cannot decide whether a given problem or statement is true or false. In such
cases, fuzzy logic provides many values between true and false, giving us the
flexibility to find the best solution to the problem.

Example of fuzzy logic as compared with Boolean logic:

Fuzzy logic admits multiple logical values: the truth values of a variable or
proposition range between 0 and 1. This concept was introduced by Lotfi
Zadeh in 1965, based on his fuzzy set theory. It provides possibilities
that conventional computers do not offer but that are similar to the range of
possibilities generated by humans.

In the Boolean system, only two possibilities (0 and 1) exist, where 1 denotes the
absolute truth value and 0 denotes the absolute false value. But in the fuzzy system,
there are multiple possibilities present between the 0 and 1, which are partially false
and partially true.

Fuzzy logic can be implemented in systems such as microcontrollers, workstation-based
systems, or large network-based systems to achieve a definite output. It can be
implemented in hardware, software, or a combination of both.
Characteristics of Fuzzy Logic
Following are the characteristics of fuzzy logic:

1. This concept is flexible, and we can easily understand and implement it.
2. It helps to model and simplify the kind of reasoning humans use.
3. It is the best method for finding solutions to problems that are suited to
approximate or uncertain reasoning.
4. Rather than only two values, it offers the full range of intermediate values between
the two extreme solutions of a problem or statement.
5. It allows users to build or create non-linear functions of arbitrary
complexity.
6. In fuzzy logic, everything is a matter of degree.
7. In fuzzy logic, any logical system can be fuzzified.
8. It is modeled on natural language.
9. It is also used by quantitative analysts to improve the execution of their algorithms.
10. It also allows users to integrate it with programming.

Architecture of a Fuzzy Logic System


In the architecture of a fuzzy logic system, each component plays an important
role. The architecture consists of four different components, which are given below.

1. Rule Base
2. Fuzzification
3. Inference Engine
4. Defuzzification

Following diagram shows the architecture or process of a Fuzzy Logic system:


1. Rule Base
Rule Base is a component used for storing the set of rules and the If-Then conditions
given by the experts are used for controlling the decision-making systems. There are
so many updates that come in the Fuzzy theory recently, which offers effective
methods for designing and tuning of fuzzy controllers. These updates or developments
decreases the number of fuzzy set of rules.

2. Fuzzification
Fuzzification is a module or component for transforming the system inputs, i.e., it
converts the crisp number into fuzzy steps. The crisp numbers are those inputs which
are measured by the sensors and then fuzzification passed them into the control
systems for further processing. This component divides the input signals into following
five states in any Fuzzy Logic system:

o Large Positive (LP)


o Medium Positive (MP)
o Small (S)
o Medium Negative (MN)
o Large negative (LN)

3. Inference Engine
This component is a main component in any Fuzzy Logic system (FLS), because all the
information is processed in the Inference Engine. It allows users to find the matching degree
between the current fuzzy input and the rules. After the matching degree, this system
determines which rule is to be added according to the given input field. When all rules are
fired, then they are combined for developing the control actions.

4. Defuzzification
Defuzzification is a module or component, which takes the fuzzy set inputs generated by
the Inference Engine, and then transforms them into a crisp value. It is the last step in the
process of a fuzzy logic system. The crisp value is a type of value which is acceptable by the
user. Various techniques are present to do this, but the user has to select the best one for
reducing the errors.
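A minimal sketch of one widely used defuzzification technique, the centroid (center-of-gravity) method, over a hypothetical aggregated output fuzzy set:

```python
import numpy as np

# Hypothetical aggregated output fuzzy set, sampled over its universe.
x = np.linspace(0, 10, 101)                 # universe of discourse
mu = np.maximum(0, 1 - np.abs(x - 7) / 3)   # triangular membership peaked at 7

# Centroid defuzzification: the membership-weighted average of the universe.
crisp = np.sum(x * mu) / np.sum(mu)
print(round(crisp, 2))  # 7.0 for this symmetric fuzzy set
```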

Membership Function
The membership function is a function that represents the graph of a fuzzy set and
allows users to quantify a linguistic term. It is a graph that maps each element of the
universe X to a value between 0 and 1.

This function is also known as the indicator or characteristic function.

The membership function was introduced in Zadeh's first papers on fuzzy sets. For
a fuzzy set B, the membership function on X is defined as μB: X → [0, 1]. Each
element x of X is mapped to a value between 0 and 1, called the degree of
membership or membership value.
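A minimal sketch of one common membership function shape, the triangular function; the parameters and the "about 18 years old" modeling below are hypothetical choices for illustration:

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Degree to which age 18.5 belongs to the fuzzy set "about 18 years old",
# modeled here as a triangle over [16, 20] peaked at 18.
print(triangular(18.5, 16, 18, 20))  # 0.75
```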

Applications of Fuzzy Logic


Following are the different application areas where the Fuzzy Logic concept is widely
used:

1. It is used in Businesses for decision-making support system.


2. It is used in automotive systems for controlling traffic and speed, and for
improving the efficiency of automatic transmissions. Automotive systems also
use the shift-scheduling method for automatic transmissions.
3. This concept is also used in defence in various areas. Defence mainly uses
fuzzy logic systems for underwater target recognition and for the automatic
target recognition of thermal infrared images.
4. It is also widely used in pattern recognition and classification, in the form
of fuzzy-logic-based recognition and handwriting recognition. It is also used in
fuzzy image searching.
5. Fuzzy logic systems are also used in securities trading.
Advantages of Fuzzy Logic
Fuzzy Logic has various advantages or benefits. Some of them are as follows:

1. The methodology of this concept works similarly to human reasoning.

2. Any user can easily understand the structure of fuzzy logic.
3. It does not need a large memory, because the algorithms can be
described with little data.
4. It is widely used in all fields of life and easily provides effective solutions to
problems of high complexity.
5. This concept is based on the set theory of mathematics, which keeps it
simple.
6. It allows users to control machines and consumer products.
7. The development time of fuzzy logic is short compared with conventional
methods.
8. Due to its flexibility, any user can easily add and delete rules in an FLS system.

Disadvantages of Fuzzy Logic


Fuzzy Logic has various disadvantages or limitations. Some of them are as follows:

1. The run time of fuzzy logic systems is slow, and they can take a long time to
produce outputs.
2. Fuzzy systems are easy to understand only when they are simple.
3. The possibilities produced by a fuzzy logic system are not always accurate.
4. Many researchers propose different ways of solving a given statement using this
technique, which leads to ambiguity.
5. Fuzzy logic is not suitable for problems that require high accuracy.
6. Fuzzy logic systems need a lot of testing for verification and validation.
Fuzzy set v/s Crisp set

S.No | Crisp Set                                               | Fuzzy Set
1    | Defines the value as either 0 or 1.                     | Defines values between 0 and 1, including both 0 and 1.
2    | It is also called a classical set.                      | It specifies the degree to which something is true.
3    | It shows full membership.                               | It shows partial membership.
4    | E.g., "She is 18 years old."; "Rahul is 1.6 m tall."    | E.g., "She is about 18 years old."; "Rahul is about 1.6 m tall."
5    | Crisp sets are applied in digital design.               | Fuzzy sets are used in fuzzy controllers.
6    | It is bi-valued function logic.                         | It is infinite-valued function logic.
7    | Full membership means totally true/false, yes/no, 0/1.  | Partial membership means anywhere from true to false, yes to no, 0 to 1.

Crisp logic

Crisp logic refers to the class of formal logics that has been most
intensively studied and most widely used. The class is sometimes
also called standard logic.

Some of the properties used to characterize it are:

▪ Law of the excluded middle and double negation elimination;
▪ Law of non-contradiction, and the principle of explosion;
▪ Monotonicity of entailment and idempotency of entailment;
▪ Commutativity of conjunction;
▪ De Morgan duality: every logical operator is dual to another.
While these are not entailed by the preceding conditions,
contemporary discussions of classical logic normally include only
propositional and first-order logics (FOL).

Fuzzy Logic

The term fuzzy logic is a misnomer. It implies that in some way
the methodology is ill-defined or vague. This is in fact far from
the case. Fuzzy logic simply evolved from the need to model the
type of vague or ill-defined systems that are difficult to handle
using conventional binary-valued logic, but the methodology itself
is based on mathematical theory.
Difference between crisp logic and fuzzy logic

Crisp logic:
▪ Binary logic
▪ An event may either occur or not occur
▪ Uses an indicator (characteristic) function

Fuzzy logic:
▪ Continuous-valued logic
▪ Uses a membership function
▪ Considers degrees of membership

Fuzzification v/s Defuzzification

S.No. | Comparison | Fuzzification                                                            | Defuzzification
1     | Basic      | Precise (crisp) data is converted into imprecise (fuzzy) data.          | Imprecise (fuzzy) data is converted into precise (crisp) data.
2     | Definition | Fuzzification is the method of converting a crisp quantity into a fuzzy quantity. | Defuzzification is the inverse process of fuzzification, where the mapping is done to convert fuzzy results into crisp results.
3     | Example    | Like a voltmeter.                                                        | Like a stepper motor or a D/A converter.
4     | Methods    | Intuition, inference, rank ordering, angular fuzzy sets, neural networks, etc. | Maximum membership principle, centroid method, weighted average method, center of sums, etc.
5     | Complexity | It is quite simple.                                                      | It is quite complicated.
6     | Use        | It can use IF-THEN rules for fuzzifying the crisp value.                 | It uses center-of-gravity methods to find the centroid of the sets.

Min-Max composition

Fuzzy composition
Fuzzy composition can be defined just as it is for crisp (binary)
relations. Suppose R is a fuzzy relation on X × Y, S is a fuzzy relation
on Y × Z, and T is a fuzzy relation on X × Z.

Fuzzy max-min composition is defined as:

μT(x, z) = max over y in Y of min( μR(x, y), μS(y, z) )

It obtains the fuzzy relation T as a composition of the two fuzzy relations: for each
pair (x, z), it takes the minimum membership along each path through y, and then the
maximum of those minima.

Fuzzy max-product composition is defined as:

μT(x, z) = max over y in Y of ( μR(x, y) · μS(y, z) )

It obtains the fuzzy relation T by taking, for each pair (x, z), the maximum of the
products μR(x, y) · μS(y, z), and returns the result as the set T.
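A minimal NumPy sketch of both compositions, over small hypothetical relation matrices:

```python
import numpy as np

# Hypothetical fuzzy relations: R on X x Y (2x2) and S on Y x Z (2x3).
R = np.array([[0.6, 0.3],
              [0.2, 0.9]])
S = np.array([[1.0, 0.5, 0.3],
              [0.8, 0.4, 0.7]])

# Max-min composition: T(x, z) = max over y of min(R(x, y), S(y, z)).
T_maxmin = np.max(np.minimum(R[:, :, None], S[None, :, :]), axis=1)

# Max-product composition: T(x, z) = max over y of R(x, y) * S(y, z).
T_maxprod = np.max(R[:, :, None] * S[None, :, :], axis=1)

print(T_maxmin)   # e.g. T_maxmin[0, 0] = max(min(0.6, 1.0), min(0.3, 0.8)) = 0.6
print(T_maxprod)
```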

Fuzzy rule based system


Fuzzy rule-based systems (FRBSs) are well-known methods within soft computing, based on fuzzy
concepts, for addressing complex real-world problems. They have become a powerful way to tackle
issues such as uncertainty, imprecision, and non-linearity. They are commonly used for
identification, classification, and regression tasks.
FRBSs are also known as fuzzy inference systems or simply fuzzy systems. When applied to specific
tasks, they also may receive specific names such as fuzzy associative memories or fuzzy controllers.

FRBSs are an extension of classical rule-based systems (also known as production systems or expert
systems). Basically, they are expressed in the form “IF A THEN B” where A and B are fuzzy sets. A and
B are called the antecedent and consequent parts of the rule, respectively. Let us assume we are
trying to model the following problem: we need to determine the speed of a car considering some
factors such as the number of vehicles in the street and the width of the street. So, let us consider
three objects = {number of vehicles, width of street, speed of car} with linguistic values as follows:
Number of vehicles = {small, medium, large}. Width of street = {narrow, medium, wide}. Speed of car
= {slow, medium, fast}. Based on a particular condition, we can define a fuzzy IF-THEN rule as follows:
IF number of vehicles is small and width of street is medium THEN speed of car is fast.
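A minimal sketch of evaluating this rule, with hypothetical triangular membership functions and min as the AND operator (a common Mamdani-style choice):

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical crisp readings.
vehicles = 12   # number of vehicles in the street
width = 8       # width of the street in metres

# Hypothetical membership functions for the antecedent terms.
mu_vehicles_small = triangular(vehicles, 0, 5, 20)
mu_width_medium = triangular(width, 4, 8, 12)

# Rule: IF number of vehicles is small AND width of street is medium
#       THEN speed of car is fast.
firing_strength = min(mu_vehicles_small, mu_width_medium)
print(f"rule fires to degree {firing_strength:.2f} for 'speed is fast'")
```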

Predicate Logic
A predicate is an expression of one or more variables defined on some specific domain.
A predicate with variables can be made a proposition either by assigning a value to the
variable or by quantifying the variable.

o Consider E(x, y) denote "x = y"


o Consider X(a, b, c) denote "a + b + c = 0"
o Consider M(x, y) denote "x is married to y."

Quantifier:
The variable of predicates is quantified by quantifiers. There are two types of quantifier in
predicate logic - Existential Quantifier and Universal Quantifier.

Existential Quantifier:
If p(x) is a proposition over the universe U, then it is denoted as ∃x p(x) and read as "There
exists at least one value in the universe of variable x such that p(x) is true." The quantifier ∃ is
called the existential quantifier.

There are several ways to write a proposition, with an existential quantifier, i.e.,

(∃x∈A)p(x) or ∃x∈A such that p (x) or (∃x)p(x) or p(x) is true for some x ∈A.

Universal Quantifier:
If p(x) is a proposition over the universe U, then it is denoted as ∀x p(x) and read as "For
every x∈U, p(x) is true." The quantifier ∀ is called the universal quantifier.

There are several ways to write a proposition, with a universal quantifier.

∀x∈A,p(x) or p(x), ∀x ∈A Or ∀x,p(x) or p(x) is true for all x ∈A.
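For a finite universe, both quantifiers can be checked directly; a minimal Python sketch with a hypothetical predicate:

```python
# Quantifiers over a finite universe A.
A = [2, 4, 6, 8]

def p(x):
    return x % 2 == 0   # predicate p(x): "x is even"

print(all(p(x) for x in A))   # universal: for all x in A, p(x) holds -> True
print(any(x > 7 for x in A))  # existential: some x in A exceeds 7   -> True
```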


Fuzzy Decision Making
Fuzzy Control System
Fuzzy logic control (FLC) is the most active research area in the application
of fuzzy set theory, fuzzy reasoning, and fuzzy logic. The applications of FLC
extend from industrial process control to biomedical instrumentation and
securities trading. Compared with conventional control techniques, FLC is best
utilized in complex, ill-defined problems that can be controlled by an
efficient human operator without knowledge of their underlying dynamics.
A control system is an arrangement of physical components designed to
alter another physical system so that this system exhibits certain desired
characteristics. Two types of control systems exist: open-loop and
closed-loop control systems. In open-loop control systems, the input
control action is independent of the physical system output. On the other
hand, in a closed-loop control system, the input control action depends on
the physical system output. Closed-loop control systems are also known
as feedback control systems. The first step toward controlling any physical
variable is to measure it. A sensor measures the controlled signal, and a plant
is the physical system under control. In a closed-loop control system, the forcing
signals of the system inputs are determined by the output responses of the
system. The basic control problem is as follows:
the output of the physical system under control is adjusted with the help of
an error signal. The difference between the actual (measured) response of
the plant and the desired response gives the error signal. To obtain
satisfactory responses and characteristics for the closed-loop control
system, an additional system, called a compensator or controller, can be
added to the loop. The basic block diagram of the closed-loop control
system is shown in Figure 1. The fuzzy control rules are basically IF-THEN
rules.

Control System Design:


Designing a controller for a complex physical system involves the following
steps:
1. Decomposing the large-scale system into a collection of
subsystems.
2. Varying the plant dynamics slowly and linearizing the nonlinear
plant dynamics about a set of operating points.
3. Organizing a set of state variables, control variables, or output
features for the system under consideration.
4. Designing simple P, PD, or PID controllers for the subsystems.
Optimal controllers can also be designed.
Apart from these steps, there may be uncertainties arising from
external environmental conditions. The design of the controller should be
as close as possible to the optimal controller design, based on the
expert knowledge of the control engineer. This may be done through
numerical observations of the input-output relationship in the form of
linguistic, intuitive, and other kinds of information related to the
dynamics of the plant and the external environment. Finally, a supervisory
control system, either a manual operator or an automatic one, forms an extra
feedback control loop to tune and adjust the parameters of the controller,
to compensate for the variational effects caused by nonlinear and
unmodelled dynamics. In comparison with a conventional control system
design, an FLC system design should make the following assumptions,
in case it is selected. The plant under consideration should be observable
and controllable. A wide range of knowledge should exist, comprising a set of
expert linguistic rules, basic engineering common sense, a set of input/output
data, or an analytic controller model that can be fuzzified and from
which the fuzzy rule base can be formed. Also, for the
problem under consideration, a solution should exist, and the control
engineer should be looking for a "good" solution rather than
an optimal one. The controller, in this case,
should be designed to the best of our ability and within an acceptable range
of precision. It should be noted that the problems of stability and optimality
are ongoing problems in fuzzy controller design.
In designing a fuzzy logic controller, the process of forming fuzzy rules plays
a vital role. There are four structures of the fuzzy production rule system
(Weiss and Donnel, 1979) which are as follows:
1. A set of rules that represents the policies and heuristic strategies
of the expert decision-maker.
2. A set of input data that are assessed immediately prior to the
actual decision.
3. A method for evaluating any proposed action in terms of its
conformity to the expressed rules when there is available data.
4. A method for generating promising actions and determining when
to stop searching for better ones.
All the necessary parameters used in the fuzzy logic controller are defined
by membership functions. The rules are evaluated using techniques such as
approximate reasoning or interpolative reasoning. These four structures of
fuzzy rules help in obtaining the control surface that relates the control
action to the measured state or output variable. The control surface can
then be sampled down to a finite number of points and, based on this
information, a look-up table may be constructed. The look-up table
comprises the information about the control surface, which can be
downloaded into a read-only memory chip. This chip would constitute a
fixed controller for the plant.
Architecture and Operations of FLC System:
The basic architecture of a fuzzy logic controller is shown in Figure 2. The
principal components of an FLC system are a fuzzifier, a fuzzy rule base, a
fuzzy knowledge base, an inference engine, and a defuzzifier. It also
includes parameters for normalization. When the output from the defuzzifier
is not a control action for a plant, the system is a fuzzy logic decision
system. The fuzzifier converts crisp quantities into fuzzy quantities.
The fuzzy rule base stores knowledge about the operation of the process,
obtained from domain expertise. The fuzzy knowledge base stores the knowledge
about all the input-output fuzzy relationships. It includes the membership
functions defining the input variables to the fuzzy rule base and the output
variables to the plant under control. The inference engine is the kernel of an
FLC system, and it possesses the capability to simulate human decisions by
performing approximate reasoning to achieve the desired control strategy.
The defuzzifier converts the fuzzy quantities into crisp quantities from an
inferred fuzzy control action by the inference engine.

Fig 2: Basic architecture of an FLC system

The various steps involved in designing a fuzzy logic controller are as
follows:
• Step 1: Locate the input, output, and state variables of the plant
under consideration.
• Step 2: Split the complete universe of discourse spanned by each
variable into a number of fuzzy subsets, assigning each with a
linguistic label. The subsets include all the elements in the
universe.
• Step 3: Obtain the membership function for each fuzzy subset.
• Step 4: Assign the fuzzy relationships between the inputs or states
of fuzzy subsets on one side and the output of fuzzy subsets on the
other side, thereby forming the rule base.
• Step 5: Choose appropriate scaling factors for the input and output
variables for normalizing the variables to the [0, 1] or [-1, 1]
interval.
• Step 6: Carry out the fuzzification process.
• Step 7: Identify the output contributed from each rule using fuzzy
approximate reasoning.
• Step 8: Combine the fuzzy outputs obtained from each rule.
• Step 9: Finally, apply defuzzification to form a crisp output.
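As a hedged illustration of Steps 2-9, the Python sketch below implements a single-input controller. The triangular membership functions, the three-rule base, and the weighted-average defuzzifier are all assumptions chosen for brevity, not a prescribed design:

    def tri_mf(x, a, b, c):
        """Triangular membership function with feet a, c and peak b (Step 3)."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def flc(error):
        """One inference step for a single normalized input in [-1, 1] (Step 5)."""
        # Step 6: fuzzification: membership degrees in three fuzzy subsets (Step 2).
        neg = tri_mf(error, -1.5, -1.0, 0.0)
        zero = tri_mf(error, -1.0, 0.0, 1.0)
        pos = tri_mf(error, 0.0, 1.0, 1.5)
        # Steps 4 and 7: rule base and approximate reasoning.
        # IF error is negative THEN action = -1; IF zero THEN 0; IF positive THEN +1.
        strengths = [neg, zero, pos]
        actions = [-1.0, 0.0, 1.0]
        # Steps 8 and 9: combine rule outputs and defuzzify by weighted average.
        total = sum(strengths)
        return sum(s * a for s, a in zip(strengths, actions)) / (total + 1e-9)

    print(flc(0.4))  # a small positive error yields a moderate positive control action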
The above steps are performed and executed for a simple FLC system. The
following design elements are adopted for designing a general FLC system:
1. Fuzzification strategies and the interpretation of a fuzzifier.
2. Fuzzy knowledge base: Normalization of the parameters involved;
partitioning of input and output spaces; selection of membership
functions of a primary fuzzy set.
3. Fuzzy rule base: Selection of input and output variables; the
source from which fuzzy control rules are to be derived; types of
fuzzy control rules; completeness of fuzzy control rules.
4. Decision-making logic: The proper definition of fuzzy implication;
interpretation of the connective "and"; interpretation of the connective
"or"; inference engine.
5. Defuzzification strategies and the interpretation of a defuzzifier.
Applications:
FLC systems find a wide range of applications in various industrial and
commercial products and systems. In several applications related to
nonlinear, time-varying, ill-defined, and complex systems,
FLC systems have proved to be very efficient in comparison with other
conventional control systems. The applications of FLC systems include:
1. Traffic Control
2. Steam Engine
3. Aircraft Flight Control
4. Missile Control
5. Adaptive Control
6. Liquid-Level Control
7. Helicopter Model
8. Automobile Speed Controller
9. Braking System Controller
10. Process Control (includes cement kiln control)
11. Robotic Control
12. Elevator (Automatic Lift) Control
13. Automatic Running Control
14. Cooling Plant Control
15. Water Treatment
16. Boiler Control
17. Nuclear Reactor Control
18. Power Systems Control
19. Air Conditioner Control (Temperature Controller)
20. Biological Processes
21. Knowledge-Based System
22. Fault Detection Control Unit
23. Fuzzy Hardware implementation and Fuzzy Computers

Genetic Algorithm
A genetic algorithm is an adaptive heuristic search algorithm inspired by
"Darwin's theory of evolution in Nature." It is used to solve optimization problems
in machine learning. It is one of the important algorithms as it helps solve complex
problems that would take a long time to solve.

Genetic algorithms are widely used in different real-world applications, for
example, designing electronic circuits, code-breaking, image processing, and
artificial creativity.

In this topic, we will explain Genetic algorithm in detail, including basic terminologies
used in Genetic algorithm, how it works, advantages and limitations of genetic
algorithm, etc.

What is a Genetic Algorithm?


Before understanding the Genetic algorithm, let's first understand basic terminologies
to better understand this algorithm:

o Population: Population is the subset of all possible or probable solutions, which can
solve the given problem.
o Chromosomes: A chromosome is one of the solutions in the population for the given
problem, and a collection of genes generates a chromosome.
o Gene: A chromosome is divided into different genes; a gene is an element of the
chromosome.
o Allele: Allele is the value provided to the gene within a particular chromosome.
o Fitness Function: The fitness function is used to determine the individual's fitness level
in the population. It means the ability of an individual to compete with other
individuals. In every iteration, individuals are evaluated based on their fitness function.
o Genetic Operators: In a genetic algorithm, the best individuals mate to produce
offspring better than the parents. Here genetic operators play a role in changing the
genetic composition of the next generation.
o Selection

After calculating the fitness of every individual in the population, a selection process is
used to determine which of the individuals in the population will get to reproduce and
produce the offspring that will form the next generation.

Types of selection methods available:

o Roulette wheel selection
o Tournament selection
o Rank-based selection

So, now we can define a genetic algorithm as a heuristic search algorithm to solve
optimization problems. It is a subset of evolutionary algorithms, which is used in computing.
A genetic algorithm uses genetic and natural selection concepts to solve optimization
problems.

How Does a Genetic Algorithm Work?


The genetic algorithm works on the evolutionary generational cycle to generate high-quality
solutions. These algorithms use different operations that either enhance or replace the
population to give an improved fit solution.

It basically involves five phases to solve the complex optimization problems, which are given
as below:

o Initialization
o Fitness Assignment
o Selection
o Reproduction
o Termination

1. Initialization
The process of a genetic algorithm starts by generating a set of individuals, which is called
the population. Here each individual is a solution for the given problem. An individual
is characterized by a set of parameters called genes. Genes are joined into a string to
form a chromosome, which represents a solution to the problem. One of the most popular
techniques for initialization is the use of random binary strings.
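As a minimal sketch of this step (the population size and chromosome length below are illustrative choices only), random binary-string initialization can be written as:

    import random

    def init_population(pop_size, chrom_length):
        """Generate pop_size random binary chromosomes of chrom_length bits."""
        return [[random.randint(0, 1) for _ in range(chrom_length)]
                for _ in range(pop_size)]

    population = init_population(pop_size=6, chrom_length=8)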
2. Fitness Assignment
The fitness function is used to determine how fit an individual is, that is, the ability of an
individual to compete with other individuals. In every iteration, individuals are
evaluated based on their fitness function. The fitness function provides a fitness score
to each individual. This score further determines the probability of being selected for
reproduction. The higher the fitness score, the greater the chances of getting selected for
reproduction.

3. Selection
The selection phase involves the selection of individuals for the reproduction of
offspring. All the selected individuals are then arranged in pairs of two for
reproduction. These individuals transfer their genes to the next generation.

There are three types of selection methods available, which are:

o Roulette wheel selection
o Tournament selection
o Rank-based selection
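As an illustrative sketch of the first of these methods (assuming non-negative fitness values), roulette wheel selection can be implemented as:

    import random

    def roulette_wheel_select(population, fitnesses):
        """Pick one individual with probability proportional to its fitness."""
        total = sum(fitnesses)
        pick = random.uniform(0, total)
        running = 0.0
        for individual, fit in zip(population, fitnesses):
            running += fit
            if running >= pick:
                return individual
        return population[-1]  # fallback guarding against floating-point round-off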

4. Reproduction
After the selection process, the creation of a child occurs in the reproduction step. In
this step, the genetic algorithm uses two variation operators that are applied to the
parent population. The two operators involved in the reproduction phase are given
below:

o Crossover: Crossover plays the most significant role in the reproduction phase of the
genetic algorithm. In this process, a crossover point is selected at random within the
genes. Then the crossover operator swaps the genetic information of two parents from the
current generation to produce a new individual representing the offspring.

The genes of the parents are exchanged among themselves until the crossover point is
reached. These newly generated offspring are added to the population. This process is also
called recombination. Types of crossover styles available:
o One-point crossover
o Two-point crossover
o Uniform crossover
o Encoding-specific crossover operators
o Mutation
The mutation operator inserts random genes in the offspring (new child) to maintain
the diversity in the population. It can be done by flipping some bits in the
chromosomes.
Mutation helps in solving the issue of premature convergence and enhances
diversification. Types of mutation styles available:
o Flip bit mutation
o Gaussian mutation
o Exchange/Swap mutation

5. Termination
After the reproduction phase, a stopping criterion is applied as a basis for termination.
The algorithm terminates after the threshold fitness solution is reached. It will identify
the final solution as the best solution in the population.

General Workflow of a Simple Genetic Algorithm
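In place of a flow diagram, the following hedged sketch ties the five phases together for a toy problem (maximizing the number of 1-bits). The parameters, the tournament selection, and the OneMax fitness are illustrative choices, not the only possible design:

    import random

    def fitness(chrom):
        """Toy fitness: the count of 1-bits (the 'OneMax' problem)."""
        return sum(chrom)

    def tournament_select(pop, k=3):
        """Tournament selection: the best of k randomly drawn individuals."""
        return max(random.sample(pop, k), key=fitness)

    def one_point_crossover(p1, p2):
        """Swap all genes beyond a random crossover point."""
        point = random.randint(1, len(p1) - 1)
        return p1[:point] + p2[point:], p2[:point] + p1[point:]

    def mutate(chrom, pm=0.01):
        """Flip each bit with a low probability pm."""
        return [1 - g if random.random() < pm else g for g in chrom]

    def genetic_algorithm(pop_size=20, length=16, generations=50):
        # 1. Initialization: random binary strings.
        pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
        for _ in range(generations):  # 5. Termination: generation budget ...
            next_pop = []
            while len(next_pop) < pop_size:
                p1, p2 = tournament_select(pop), tournament_select(pop)  # 2-3. Fitness + selection
                c1, c2 = one_point_crossover(p1, p2)                     # 4. Reproduction: crossover
                next_pop += [mutate(c1), mutate(c2)]                     #    ... and mutation
            pop = next_pop
            if max(fitness(c) for c in pop) == length:  # ... or threshold fitness reached
                break
        return max(pop, key=fitness)

    best = genetic_algorithm()
    print(best, fitness(best))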


Advantages of Genetic Algorithm
o Genetic algorithms have strong parallel capabilities.
o It helps in optimizing various problems such as discrete functions, multi-objective
problems, and continuous functions.
o It provides a solution for a problem that improves over time.
o A genetic algorithm does not need derivative information.

Limitations of Genetic Algorithms


o Genetic algorithms are not efficient algorithms for solving simple problems.
o It does not guarantee the quality of the final solution to a problem.
o Repetitive calculation of fitness values may generate some computational challenges.

Difference between Genetic Algorithms and


Traditional Algorithms
o A search space is the set of all possible solutions to the problem. In the traditional
algorithm, only one set of solutions is maintained, whereas, in a genetic algorithm,
several sets of solutions in search space can be used.
o Traditional algorithms need more information in order to perform a search, whereas
genetic algorithms need only one objective function to calculate the fitness of an
individual.
o Traditional algorithms cannot work in parallel, whereas genetic algorithms can work
in parallel (the fitness calculations of the individuals are independent).
o One big difference is that rather than operating directly on candidate
solutions, genetic algorithms operate on their representations (or encodings),
frequently referred to as chromosomes.
o Traditional Algorithms can only generate one result in the end, whereas Genetic
Algorithms can generate multiple optimal results from different generations.
o A traditional algorithm is not likely to generate optimal results for such problems.
Genetic algorithms do not guarantee globally optimal results either, but there
is a great possibility of getting a near-optimal result for a problem, as they use genetic
operators such as crossover and mutation.
o Traditional algorithms are deterministic in nature, whereas Genetic algorithms are
probabilistic and stochastic in nature.

Various encoding methods


Biological Background :

Chromosome: All living organisms consist of cells. In each cell, there is the
same set of Chromosomes. Chromosomes are strings of DNA and consist of
genes, blocks of DNA. Each gene encodes a trait, for example, the color of
the eye.
Reproduction: During reproduction, combination (or crossover) occurs first.
Genes from parents combine to form a whole new chromosome. The newly
created offspring can then be mutated. The changes are mainly caused by
errors in copying genes from parents. The fitness of an organism is
measured by the success of the organism in its life.
Operation of Genetic Algorithms :
Two important elements required for any problem before a genetic
algorithm can be used for a solution are
• A method for representing a solution, e.g., a string of bits, numbers, or
characters.
• A method for measuring the quality of any proposed solution, using a
fitness function, e.g., determining the total weight.

Basic principles :

• An individual is characterized by a set of parameters: Genes


• The genes are joined into a string: Chromosome
• The chromosome forms the genotype
• The genotype contains all information to construct an
organism: Phenotype
• Reproduction is a “dumb” process on the chromosome of
the genotype
• Fitness is measured in the real world (‘Struggle for life’) of the
phenotype.
Algorithmic Phases :
Encoding using string :
Encoding of chromosomes is the first step in solving the problem, and it
depends heavily on the problem. It is the process of representing the
solution in the form of a string of bits that conveys the necessary
information. Just as in a chromosome each gene controls particular
characteristics of the individual, similarly each bit in the string represents
characteristics of the solution.
Encoding Methods :
• Binary Encoding: The most common method of encoding.
Chromosomes are strings of 1s and 0s, and each position in the
chromosome represents a particular characteristic of the solution.

• Permutation Encoding: Useful in ordering problems such as the Travelling
Salesman Problem (TSP). In TSP, every chromosome is a string of
numbers, each of which represents a city to be visited.
• Value Encoding: Used in problems where complicated values,
such as real numbers, are required and where binary encoding would
not suffice. Good for some problems, but it is often necessary to
develop some specific crossover and mutation techniques for these
chromosomes.
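As a brief hedged illustration (all values below are made up), the three encodings might look as follows in Python:

    binary_chrom = [1, 0, 1, 1, 0, 1, 0, 0]   # binary encoding: a string of bits

    tsp_chrom = [3, 0, 4, 1, 2]                # permutation encoding: visiting order of 5 cities

    value_chrom = [0.73, -1.42, 3.14, 0.05]    # value encoding: real-valued genes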

Fitness Function of GA

What is a Fitness Function?


Fitness Function (also known as the Evaluation Function)
evaluates how close a given solution is to the optimum solution of
the desired problem. It determines how fit a solution is.

Why we use Fitness Functions?


In genetic algorithms, each solution is generally represented as a
string of binary numbers, known as a chromosome. We have to
test these solutions and come up with the best set of solutions to
solve a given problem. Each solution, therefore, needs to be awarded
a score, to indicate how close it came to meeting the overall
specification of the desired solution. This score is generated by
applying the fitness function to the test results obtained from
the tested solution.

Generic Requirements of a Fitness Function


The following requirements should be satisfied by any fitness
function.

1. The fitness function should be clearly defined. The reader
should be able to clearly understand how the fitness score is
calculated.
2. The fitness function should be implemented efficiently. If
the fitness function becomes the bottleneck of the
algorithm, then the overall efficiency of the genetic
algorithm will be reduced.
3. The fitness function should quantitatively measure how fit
a given solution is in solving the problem.
4. The fitness function should generate intuitive results.
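As a hedged example satisfying these requirements (the problem itself, maximizing f(x) = x^2 over a binary-encoded integer, is an illustrative choice):

    def decode(chrom):
        """Interpret a binary chromosome as an unsigned integer x."""
        return int("".join(map(str, chrom)), 2)

    def fitness(chrom):
        """Clearly defined, cheap to compute, and quantitative: f(x) = x**2."""
        x = decode(chrom)
        return x * x

    print(fitness([1, 0, 1, 1]))  # x = 11, so the fitness score is 121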

Crossover in GA
Crossover is a genetic operator used to vary the programming of a
chromosome or chromosomes from one generation to the next. Crossover is
sexual reproduction. Two strings are picked from the mating pool at random
to crossover in order to produce superior offspring. The method chosen
depends on the Encoding Method.
Crossover mask: The choice of which parent contributes to each bit position
of the offspring is given by an additional string called the crossover mask,
similar to the bit masks used in the Unity game engine.

Different types of crossover:

Single Point Crossover: A crossover point on the parent organism string is
selected. All data beyond that point in the organism string is swapped
between the two parent organisms. Strings are characterized by positional
bias.

Two-Point Crossover: This is a specific case of an N-point crossover
technique. Two random points are chosen on the individual chromosomes
(strings) and the genetic material is exchanged at these points.
Uniform Crossover: Each gene (bit) is selected randomly from one of the
corresponding genes of the parent chromosomes. Tossing a coin for each
gene is an example technique, as sketched below.
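A hedged sketch of uniform crossover with an explicit crossover mask (the gene type and mask generation by coin toss are illustrative assumptions):

    import random

    def uniform_crossover(p1, p2):
        """Uniform crossover driven by a random crossover mask.

        Where the mask bit is 1, the first child takes the gene from p1;
        where it is 0, from p2 (and vice versa for the second child).
        """
        mask = [random.randint(0, 1) for _ in p1]  # one coin toss per gene
        c1 = [a if m else b for a, b, m in zip(p1, p2, mask)]
        c2 = [b if m else a for a, b, m in zip(p1, p2, mask)]
        return c1, c2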

The crossover between two good solutions may not always yield a better or
as good a solution. Since parents are good, the probability of the child being
good is high. If the offspring is not good (a poor solution), it will be removed in
the next iteration during "Selection".
Problems with Crossover:
• Depending on coding, simple crossovers can have a high chance to
produce illegal offspring.
E.g. in TSP with simple binary or path coding, most offspring will
be illegal because not all cities will be in the offspring and some
cities will be there more than once.
• Uniform crossover can often be modified to avoid this problem
E.g. in TSP with simple path coding:
Where the mask is 1, copy cities from one parent
Where the mask is 0, choose the remaining cities in the order of
the other parent.

Mutation in GA
Mutation may be defined as a small random tweak in the
chromosome, to get a new solution. It is used to maintain and
introduce diversity in the genetic population and is usually applied
with a low probability – pm. If the probability is very high, the GA
gets reduced to a random search.

Mutation is the part of the GA which is related to the
"exploration" of the search space. It has been observed that
mutation is essential to the convergence of the GA while
crossover is not.

Mutation Operators
In this section, we describe some of the most commonly used
mutation operators. Like the crossover operators, this is not an
exhaustive list and the GA designer might find a combination of
these approaches or a problem-specific mutation operator more
useful.

Bit Flip Mutation

In bit flip mutation, we select one or more random bits and
flip them. This is used for binary-encoded GAs.
Random Resetting

Random Resetting is an extension of the bit flip for the integer
representation. In this, a random value from the set of
permissible values is assigned to a randomly chosen gene.

Swap Mutation

In swap mutation, we select two positions on the chromosome at
random, and interchange the values. This is common in
permutation-based encodings.

Scramble Mutation

Scramble mutation is also popular with permutation
representations. In this, from the entire chromosome, a subset of
genes is chosen and their values are scrambled or shuffled
randomly.

Inversion Mutation

In inversion mutation, we select a subset of genes as in
scramble mutation, but instead of shuffling the subset, we merely
invert the entire string in the subset.
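Several of these operators can be sketched in a few lines each; this hedged Python version assumes list-encoded chromosomes and mutation sites chosen uniformly at random:

    import random

    def bit_flip(chrom, pm=0.05):
        """Binary encodings: flip each bit with probability pm."""
        return [1 - g if random.random() < pm else g for g in chrom]

    def swap_mutation(perm):
        """Permutation encodings: interchange values at two random positions."""
        p = perm[:]
        i, j = random.sample(range(len(p)), 2)
        p[i], p[j] = p[j], p[i]
        return p

    def scramble_mutation(perm):
        """Permutation encodings: shuffle a randomly chosen contiguous subset."""
        p = perm[:]
        i, j = sorted(random.sample(range(len(p)), 2))
        segment = p[i:j + 1]
        random.shuffle(segment)
        p[i:j + 1] = segment
        return p

    def inversion_mutation(perm):
        """Permutation encodings: reverse a randomly chosen contiguous subset."""
        p = perm[:]
        i, j = sorted(random.sample(range(len(p)), 2))
        p[i:j + 1] = reversed(p[i:j + 1])
        return p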

Convergence in GA
Convergence in terms of genetic algorithms is a special case when a genetic algorithm needs to
stop because every individual in the population is identical. There are full convergence and
premature convergence. Full convergence can be seen in algorithms using only crossover.
Premature convergence occurs when a population has converged to a single solution, but that
solution is not of as high quality as expected; for example, the population has become stuck. To
avoid premature convergence, a variety of diversity-generating techniques can be used. Convergence does
not always indicate a negative sign.
Multilevel optimization
GAs are stochastic methods for global search and optimization and belong to the
group of nature-inspired metaheuristics leading to the so-called natural
computing. It is a fast-growing interdisciplinary field in which a range of
techniques and methods are studied for dealing with large, complex, and dynamic
problems with various sources of potential uncertainties. GAs simultaneously
examine and manipulate a set of possible solutions. A gene is a part of a
chromosome (solution), which is the smallest unit of genetic information. Every
gene is able to assume different values called alleles. All genes of an organism form
a genome, which affects the appearance of an organism called phenotype. The
chromosomes are encoded using a chosen representation and each can be
thought of as a point in the search space of candidate solutions. Each individual is
assigned a score (fitness) value that allows assessing its quality. The members of
the initial population may be randomly generated or by using sophisticated
mechanisms by means of which an initial population of high-quality chromosomes
is produced. The reproduction operator selects (randomly or based on the
individual’s fitness) chromosomes from the population to be parents and enter
them in a mating pool. Parent individuals are drawn from the mating pool and
combined so that information is exchanged and passed to offspring depending
on the probability of the crossover operator. The new population is then subjected
to mutation and enters an intermediate population. The mutation operator
introduces diversity into the population and is generally applied with a
low probability to avoid disrupting crossover results. Finally, a selection scheme is
used to update the population giving rise to a new generation. The individuals
from the set of solutions, which is called population will evolve from generation to
generation by repeated applications of an evaluation procedure that is based on
genetic operators. Over many generations, the population becomes increasingly
uniform until it ultimately converges to optimal or near-optimal solutions. The
different steps of the multilevel weighted genetic algorithm are described as
follows:
Hybrid systems
Hybrid systems: A Hybrid system is an intelligent system that is framed by
combining at least two intelligent technologies like Fuzzy Logic, Neural
networks, Genetic algorithms, reinforcement learning, etc. The combination
of different techniques in one computational model makes these systems
possess an extended range of capabilities. These systems are capable of
reasoning and learning in an uncertain and imprecise environment. These
systems can provide human-like expertise like domain knowledge,
adaptation in noisy environments, etc.
Types of Hybrid Systems:
• Neuro-Fuzzy Hybrid systems
• Neuro Genetic Hybrid systems
• Fuzzy Genetic Hybrid systems
(A) Neuro-Fuzzy Hybrid systems:
The neuro-fuzzy system is based on a fuzzy system which is trained by a
learning algorithm derived from neural network theory. The learning process
operates only on the local information and causes only local changes in the
underlying fuzzy system. A neuro-fuzzy system can be seen as a 3-layer
feedforward neural network. The first layer represents input variables, the
middle (hidden) layer represents fuzzy rules and the third layer represents
output variables. Fuzzy sets are encoded as connection weights within the
layers of the network, which provides functionality in processing and
training the model.

Working flow:
• In the input layer, each neuron transmits external crisp signals
directly to the next layer.
• Each fuzzification neuron receives a crisp input and determines the
degree to which the input belongs to the input fuzzy set.
• The fuzzy rule layer receives neurons that represent fuzzy sets.
• An output neuron combines all inputs using fuzzy operation
UNION.
• Each defuzzification neuron represents the single output of the
neuro-fuzzy system.
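A hedged sketch of this flow for two inputs and two rules follows; the Gaussian membership parameters and rule consequents are made-up placeholders that a real neuro-fuzzy system would learn as connection weights:

    import math

    def gauss_mf(x, center, sigma):
        """Gaussian membership degree used by a fuzzification neuron."""
        return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

    def neuro_fuzzy_forward(x1, x2):
        # Input + fuzzification layers: crisp inputs to membership degrees.
        x1_low, x1_high = gauss_mf(x1, 0.0, 0.5), gauss_mf(x1, 1.0, 0.5)
        x2_low, x2_high = gauss_mf(x2, 0.0, 0.5), gauss_mf(x2, 1.0, 0.5)

        # Fuzzy rule layer: firing strength of each rule (AND taken as min).
        r1 = min(x1_low, x2_low)    # IF x1 is low AND x2 is low THEN y = 0.1
        r2 = min(x1_high, x2_high)  # IF x1 is high AND x2 is high THEN y = 0.9

        # Output/defuzzification layer: weighted average of rule consequents.
        strengths, consequents = [r1, r2], [0.1, 0.9]
        return sum(s * c for s, c in zip(strengths, consequents)) / (sum(strengths) + 1e-9)

    print(neuro_fuzzy_forward(0.9, 0.8))  # mostly 'high' inputs: output much closer to 0.9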
Advantages:
• It can handle numeric, linguistic, logical, and other kinds of information.
• It can manage imprecise, partial, vague, or imperfect information.
• It can resolve conflicts by collaboration and aggregation.
• It has self-learning, self-organizing and self-tuning capabilities.
• It can mimic the human decision-making process.
Disadvantages:
• Hard to develop a model from a fuzzy system
• Problems of finding suitable membership values for fuzzy systems
• Neural networks cannot be used if training data is not available.
Applications:
• Student Modelling
• Medical systems
• Traffic control systems
• Forecasting and predictions
(B) Neuro Genetic Hybrid systems:
A neuro-genetic hybrid system is a system that combines neural networks,
which are capable of learning various tasks from examples, classifying objects,
and establishing relations between them, with genetic algorithms, which serve as
important search and optimization techniques. Genetic algorithms can be
used to improve the performance of Neural Networks and they can be used
to decide the connection weights of the inputs. These algorithms can also
be used for topology selection and training networks.

Working Flow:
• GA repeatedly modifies a population of individual solutions. GA
uses three main types of rules at each step to create the next
generation from the current population:
1. Selection to select the individuals, called parents, that
contribute to the population at the next generation
2. Crossover to combine two parents to form children for
the next generation
3. Mutation to apply random changes to individual parents
in order to form children
• GA then sends the new child generation to ANN model as a new
input parameter.
• Finally, calculating the fitness by the developed ANN model is
performed.
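A hedged sketch of this loop follows, with each GA individual being the weight vector of a tiny 2-2-1 network; the XOR data, the truncation selection, and all parameters are illustrative stand-ins, and the fitness is the inverse mean squared error noted in the advantages below:

    import math
    import random

    # Toy training data (XOR), standing in for a real dataset.
    DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    def ann_output(w, x):
        """A tiny 2-2-1 feedforward network; w is a flat list of 9 weights."""
        h1 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
        h2 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
        return math.tanh(w[6] * h1 + w[7] * h2 + w[8])

    def fitness(w):
        """Inverse mean squared error of the network defined by weights w."""
        mse = sum((ann_output(w, x) - y) ** 2 for x, y in DATA) / len(DATA)
        return 1.0 / (mse + 1e-9)

    def evolve_weights(pop_size=30, generations=200):
        pop = [[random.uniform(-1, 1) for _ in range(9)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            parents = pop[: pop_size // 2]          # selection (truncation, for brevity)
            children = []
            while len(parents) + len(children) < pop_size:
                p1, p2 = random.sample(parents, 2)
                cut = random.randint(1, 8)          # one-point crossover on weight vectors
                child = p1[:cut] + p2[cut:]
                child[random.randrange(9)] += random.gauss(0, 0.3)  # Gaussian mutation
                children.append(child)
            pop = parents + children
        return max(pop, key=fitness)

    best_weights = evolve_weights()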
Advantages:
• GA is used for topology optimization, i.e., to select the number of
hidden layers, the number of hidden nodes, and the interconnection
pattern for an ANN.
• In GAs, the learning of ANN is formulated as a weight optimization
problem, usually using the inverse mean squared error as a fitness
measure.
• Control parameters such as learning rate, momentum rate,
tolerance level, etc are also optimized using GA.
• It can mimic the human decision-making process.
Disadvantages:
• Highly complex system.
• The accuracy of the system is dependent on the initial population.
• Maintenance costs are very high.
Applications:
• Face recognition
• DNA matching
• Animal and human research
• Behavioral system
(C) Fuzzy Genetic Hybrid systems:
A Fuzzy Genetic Hybrid System is developed to use fuzzy logic-based
techniques for improving and modeling Genetic algorithms and vice-versa.
Genetic algorithms have proved to be a robust and efficient tool to perform
tasks like generation of the fuzzy rule base, generation of membership
function, etc.
Three approaches that can be used to develop such a system are:
• Michigan Approach
• Pittsburgh Approach
• IRL Approach
Working Flow:
• Start with an initial population of solutions that represent the first
generation.
• Feed each chromosome from the population into the fuzzy logic
controller and compute a performance index.
• Create a new generation using evolution operators till some
condition is met.
Advantages:
• GAs are used to develop the best set of rules to be used by a fuzzy
inference engine
• GAs are used to optimize the choice of membership functions.
• A Fuzzy GA is a directed random search over all discrete fuzzy
subsets.
• It can mimic the human decision-making process.
Disadvantages:
• Interpretation of results is difficult.
• Difficult to build membership values and rules.
• Takes lots of time to converge.
Applications:
• Mechanical Engineering
• Electrical Engineering
• Artificial Intelligence
• Economics
