
ALARD CHARITABLE TRUST'S

ALARD COLLEGE OF ENGINEERING & MANAGEMENT

Department of Artificial Intelligence and Machine Learning

Unit I BASICS OF NEURAL NETWORKS

Review of Transistor as a switch, Logic gates and Truth Tables. Characteristic of Neural Networks,
Historical Development of Neural Networks, Biological Neuron and their artificial Model, McCulloch
Pitts Neuron Model, Thresholding Logic functions, Neural Network Learning rules, Perceptron
Learning Algorithm, Perceptron Model, Simulation of logic gates, Limitations of Perceptron Learning.

Unit 1 : Basics of Neural Network

Introduction to Artificial Neural Networks


ANN learning is robust to errors in the training data and has been successfully applied to learning
real-valued, discrete-valued, and vector-valued target functions in problems such as interpreting
visual scenes, speech recognition, and learning robot control strategies. The study of artificial neural
networks (ANNs) has been inspired in part by the observation that biological learning systems are
built of very complex webs of interconnected neurons in brains. The human brain contains a densely
interconnected network of approximately 10^11-10^12 neurons, each connected, on average, to
10^4-10^5 other neurons. Even though individual neurons are slow, the human brain makes
surprisingly complex decisions in roughly 10^-1 seconds. ANN systems are motivated to capture this
kind of highly parallel computation based on distributed representations. Generally, ANNs are built
out of a densely interconnected set of simple units, where each unit takes a number of real-valued
inputs and produces a single real-valued output. However, ANNs are only loosely motivated by
biological neural systems; there are many complexities of biological neural systems that are not
modeled by ANNs.


Difference between Biological Neurons and Artificial Neurons


1. Major components
   Biological: axons, dendrites, synapses.
   Artificial: nodes, inputs, outputs, weights, bias.

2. Signal flow
   Biological: Information from other neurons, in the form of electrical impulses, enters the dendrites at connection points called synapses. The information flows from the dendrites to the cell body, where it is processed. The output signal, a train of impulses, is then sent down the axon to the synapses of other neurons.
   Artificial: The arrangement and connections of the neurons make up the network, which has three layers. The first layer, called the input layer, is the only layer exposed to external signals. The input layer transmits signals to the neurons in the next layer, called a hidden layer. The hidden layer extracts relevant features or patterns from the received signals. The features or patterns considered important are then directed to the output layer, the final layer of the network.

3. Learning
   Biological: A synapse is able to increase or decrease the strength of the connection. This is where information is stored.
   Artificial: The artificial signals can be changed by weights, in a manner similar to the physical changes that occur in biological synapses.

4. Scale
   Biological: approximately 10^11 neurons.
   Artificial: 10^2-10^4 neurons with current technology.

Difference between the human brain and computers in terms of how information is processed.

1. Timing
   Human brain (biological neural network): works asynchronously.
   Computers (ANN): work synchronously.

2. Speed
   Biological neurons compute slowly (several ms per computation).
   Artificial neurons compute fast (< 1 nanosecond per computation).

3. Reliability
   The brain represents information in a distributed way because neurons are unreliable and can die at any time.
   In computer programs, every bit has to function as intended, otherwise the programs crash.

4. Adaptability
   Our brain changes its connectivity over time to represent new information and the requirements imposed on us.
   The connectivity between the electronic components in a computer never changes unless we replace the components.

5. Topology
   Biological neural networks have complicated topologies.
   ANNs often have simple, regular (e.g., layered or tree-like) structures.

6. Learning mechanism
   Researchers are still trying to find out how the brain actually learns.
   ANNs use gradient descent for learning.

Advantages of Using Artificial Neural Networks:


 Problems given to ANNs can have instances that are represented by many attribute-value pairs.
 ANNs can be used for problems where the target function output is discrete-valued, real-valued, or a
vector of several real- or discrete-valued attributes.
 ANN learning methods are quite robust to noise in the training data. The training examples may
contain errors without ruining the final output.
 ANNs are generally used where fast evaluation of the learned target function is required.
 ANNs can tolerate long training times, which depend on factors such as the number of weights in the
network, the number of training examples considered, and the settings of various learning algorithm
parameters.


Implementing Models of Artificial Neural Network


1. McCulloch-Pitts Model of Neuron
The McCulloch-Pitts neural model, the earliest ANN model, has only two types of inputs:
excitatory and inhibitory. The excitatory inputs have weights of positive magnitude and the
inhibitory inputs have weights of negative magnitude. The inputs of the McCulloch-Pitts neuron
can be either 0 or 1. It has a threshold function as an activation function: the output signal yout is
1 if the input ysum is greater than or equal to a given threshold value, and 0 otherwise. The
diagrammatic representation of the model is as follows:

McCulloch-Pitts Model

Simple McCulloch-Pitts neurons can be used to design logical operations. For that purpose, the
connection weights need to be decided correctly, along with the threshold value of the activation
function. For better understanding, let me consider an example:
John carries an umbrella if it is sunny or if it is raining. There are four given situations. I need to
decide when John will carry the umbrella. The situations are as follows:
 First scenario: It is not raining, nor is it sunny
 Second scenario: It is not raining, but it is sunny
 Third scenario: It is raining, and it is not sunny
 Fourth scenario: It is raining as well as it is sunny

TE ANN Dr. Shilpa Vikas Shinde


ALARD CHARITABLE TRUST'S
ALARD COLLEGE OF ENGINEERING & MANAGEMENT

Department of Artificial Intelligence and Machine Learning

To analyse the situations using the McCulloch-Pitts neural model, I can consider the input signals
as follows:
 X1: Is it raining?
 X2 : Is it sunny?
So, the value of both inputs can be either 0 or 1. We can set both connection weights w1 and
w2 to 1 and the threshold value to 1. So, the neural network model will look like:

Truth Table for this case will be:


Situation   x1   x2   ysum   yout
    1        0    0     0      0
    2        0    1     1      1
    3        1    0     1      1
    4        1    1     2      1

From the truth table built above for this problem, I can conclude that in the situations where the
value of yout is 1, John needs to carry an umbrella. Hence, he will need to carry an umbrella in
scenarios 2, 3 and 4.
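
As an illustration, here is a minimal Python sketch of this McCulloch-Pitts neuron (the function name and structure are my own, not from any library); it reproduces the truth table above:

```python
def mcculloch_pitts(inputs, weights, threshold):
    """McCulloch-Pitts neuron: output 1 if the weighted sum of
    binary inputs reaches the threshold, else 0."""
    y_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_sum >= threshold else 0

# Umbrella example: x1 = "is it raining?", x2 = "is it sunny?"
# with w1 = w2 = 1 and threshold = 1 (an OR operation).
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mcculloch_pitts((x1, x2), (1, 1), 1))
# Prints 1 (carry the umbrella) for scenarios 2, 3 and 4.
```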
2. Rosenblatt’s Perceptron
Rosenblatt’s perceptron is built around the McCulloch-Pitts neural model. The diagrammatic
representation is as follows:

Rosenblatt’s Perceptron

The perceptron receives a set of inputs x1, x2,....., xn. The linear combiner or adder node
computes the linear combination of the inputs applied to the synapses, with synaptic weights
w1, w2,......, wn. Then, the hard limiter checks whether the resulting sum is positive or negative.
If the input of the hard limiter node is positive, the output is +1, and if it is negative, the
output is -1. Mathematically, the hard limiter input is:

ysum = Σ wixi (summing i from 1 to n)

However, the perceptron includes an adjustable value, or bias, as an additional weight w0. This
additional weight is attached to a dummy input x0, which is assigned a value of 1. This
consideration modifies the above equation to:

ysum = Σ wixi (summing i from 0 to n, with x0 = 1)


The output is decided by the expression:

yout = +1 if ysum > 0
yout = -1 if ysum <= 0

The objective of the perceptron is to classify a set of inputs into two classes, c1 and c2. This can be
done using a very simple decision rule: assign the inputs to c1 if the output of the perceptron, i.e.
yout, is +1, and to c2 if yout is -1. So, for an n-dimensional signal space, i.e. a space for 'n' input
signals, the simplest form of perceptron will have two decision regions, corresponding to the two
classes, separated by a hyperplane defined by:

Σ wixi = 0 (summing i from 0 to n, with x0 = 1)

Therefore, for two input signals denoted by the variables x1 and x2, the decision boundary is a
straight line of the form:

w0 + w1x1 + w2x2 = 0, or equivalently x2 = -(w1/w2)x1 - (w0/w2)

So, for a perceptron having the values of synaptic weights w0, w1 and w2 as -2, 1/2 and 1/4
respectively, the linear decision boundary will be of the form:

-2 + (1/2)x1 + (1/4)x2 = 0, i.e. 2x1 + x2 = 8

So, any point (x1, x2) which lies above this decision boundary will be assigned to class c1, and the
points which lie below the boundary are assigned to class c2.
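
As a quick check, the sketch below (the helper name is illustrative, not from any library) evaluates this perceptron on one point on each side of the line 2x1 + x2 = 8:

```python
def perceptron_class(x1, x2, w0=-2.0, w1=0.5, w2=0.25):
    """Hard-limiter perceptron: +1 (class c1) if y_sum > 0, else -1 (class c2)."""
    y_sum = w0 * 1 + w1 * x1 + w2 * x2   # x0 is the dummy input fixed at 1
    return +1 if y_sum > 0 else -1

print(perceptron_class(4, 4))  # 2*4 + 4 = 12 > 8, above the boundary -> +1 (c1)
print(perceptron_class(1, 1))  # 2*1 + 1 = 3 < 8, below the boundary -> -1 (c2)
```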


Thus, we see that for a data set with linearly separable classes, perceptrons can always be
employed to solve classification problems using decision lines (for 2-dimensional space),
decision planes (for 3-dimensional space) or decision hyperplanes (for n-dimensional space).
Appropriate values of the synaptic weights can be obtained by training the perceptron. However,
one assumption for the perceptron to work properly is that the two classes should be linearly
separable, i.e. it must be possible to separate them with a linear boundary. Otherwise, if the
classes are non-linearly separable, the classification problem cannot be solved by a basic
perceptron.

Linear Vs Non-Linearly Separable Classes

Multi-layer perceptron: A basic perceptron works very successfully for data sets which possess
linearly separable patterns. However, in practical situations that is rarely the case. This was
exactly the point driven home by Minsky and Papert in their work in 1969. They showed that a basic
perceptron is not able to learn to compute even a simple 2-bit XOR function. So, let us understand
the reason.
Consider a truth table highlighting output of a 2 bit XOR function:


x1   x2   x1 XOR x2   Class
1     1       0        c2
1     0       1        c1
0     1       1        c1
0     0       0        c2

The data is not linearly separable: only a curved decision boundary can separate the classes
properly. To address this issue, one option is to use two decision boundary lines in place of
one.

Classification with two decision lines in the XOR function output

This is the philosophy used to design the multi-layer perceptron model. The major highlights of
this model are as follows:
 The neural network contains one or more intermediate layers between the input and output
nodes, which are hidden from both input and output nodes
 Each neuron in the network includes a non-linear activation function that is differentiable.
 The neurons in each layer are connected with some or all the neurons in the previous layer.


The McCulloch-Pitts Model of Neuron:


 The early model of an artificial neuron was introduced by Warren McCulloch and Walter Pitts in 1943.
The McCulloch-Pitts neural model is also known as a linear threshold gate.
 The neurons are connected by directed weighted paths. A connection path can be excitatory or
inhibitory.
 All excitatory connections entering a particular neuron have the same weight, and all inhibitory
connections entering it have the same (negative) weight.
Architecture:

 The connection weights from x1, x2,.......xn are excitatory, denoted by 'w', and the connection
weights from xn+1, xn+2,.......xn+m are inhibitory, denoted by '-p'.
-> The McCulloch-Pitts neuron Y has the activation function
f(yin) = 1 if yin >= Θ
f(yin) = 0 if yin < Θ
where Θ is the threshold value and yin is the total net input signal received by neuron Y,
given by yin = Σ xiwi.
-> The McCulloch-Pitts neuron will fire if it receives k or more excitatory inputs and no
inhibitory inputs, where
kw >= Θ > (k-1)w
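
A short Python sketch of this neuron, assuming the absolute-inhibition rule stated above (any active inhibitory input prevents firing); the function is illustrative, not a library call:

```python
def mp_neuron(excitatory, inhibitory, w, theta):
    """Fires (returns 1) only if no inhibitory input is active and
    the excitatory net input w * (number of active inputs) >= theta."""
    if any(inhibitory):            # absolute inhibition blocks firing
        return 0
    y_in = w * sum(excitatory)     # all excitatory links share weight w
    return 1 if y_in >= theta else 0

# AND-NOT example: fires for x1=1 only when the inhibitory input x2 is 0.
print(mp_neuron([1], [0], w=1, theta=1))  # -> 1
print(mp_neuron([1], [1], w=1, theta=1))  # -> 0
```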


Perceptron in Machine Learning


In Machine Learning and Artificial Intelligence, the Perceptron is one of the most commonly encountered terms.
It is a primary stepping stone for learning Machine Learning and Deep Learning technologies, and it consists of a
set of weights, input values or scores, and a threshold. The Perceptron is a building block of an Artificial Neural
Network. Frank Rosenblatt invented the Perceptron in 1957 for performing certain calculations to detect
capabilities in input data. The Perceptron is a linear Machine Learning algorithm used for supervised learning of
various binary classifiers. The algorithm enables neurons to learn from training elements, processing them one by
one. In this tutorial, "Perceptron in Machine Learning," we will discuss in-depth knowledge of the Perceptron and
its basic functions. Let's start with a basic introduction to the Perceptron.

What is the Perceptron model in Machine Learning?


Perceptron is a Machine Learning algorithm for the supervised learning of various binary classification tasks.
Further, the Perceptron can also be understood as an artificial neuron or neural network unit that helps to detect
certain computations on input data in business intelligence.

The Perceptron model is also treated as one of the best and simplest types of Artificial Neural Networks. It is a
supervised learning algorithm for binary classifiers. Hence, we can consider it a single-layer neural network with
four main parameters, i.e., input values, weights and bias, net sum, and an activation function.

What is Binary classifier in Machine Learning?


In Machine Learning, a binary classifier is a function that decides whether an input, represented as a vector of
numbers, belongs to a specific class.

Binary classifiers of this kind are linear classifiers. In simple words, we can understand a linear classifier as a
classification algorithm whose prediction is based on a linear predictor function combining a weight vector with
the feature vector.

Basic Components of Perceptron


Frank Rosenblatt invented the perceptron model as a binary classifier containing three main components.
These are as follows:


o Input Nodes or Input Layer:

This is the primary component of Perceptron which accepts the initial data into the system for further processing.
Each input node contains a real numerical value.

o Weight and Bias:

The weight parameter represents the strength of the connection between units. This is another important
component of the Perceptron. A weight is directly proportional to the strength of the associated input neuron in
deciding the output. Further, the bias can be considered the intercept term in a linear equation.

o Activation Function:

This is the final and most important component, which helps to determine whether the neuron will fire or not. The
activation function can be considered primarily as a step function.

Types of Activation functions:

o Sign function
o Step function, and
o Sigmoid function


The data scientist uses the activation function to make decisions suited to the problem statement and the desired
outputs. The activation function may differ (e.g., sign, step, or sigmoid) across perceptron models, depending on
whether the learning process is slow or suffers from vanishing or exploding gradients.

How does Perceptron work?


In Machine Learning, the Perceptron is considered a single-layer neural network that consists of four main
parameters named input values (input nodes), weights and bias, net sum, and an activation function. The perceptron
model begins with the multiplication of all input values by their weights, then adds these products together to
create the weighted sum. This weighted sum is then applied to the activation function 'f' to obtain the desired
output. This activation function is also known as the step function and is represented by 'f'.



This step function or Activation function plays a vital role in ensuring that output is mapped between required
values (0,1) or (-1,1). It is important to note that the weight of input is indicative of the strength of a node.
Similarly, an input's bias value gives the ability to shift the activation function curve up or down.

Perceptron model works in two important steps as follows:

Step-1

In the first step, multiply all input values by their corresponding weight values and then add the products to
determine the weighted sum. Mathematically, we can calculate the weighted sum as follows:

∑wi*xi = x1*w1 + x2*w2 + ... + xn*wn

Add a special term called bias 'b' to this weighted sum to improve the model's performance:

∑wi*xi + b

Step-2

In the second step, an activation function is applied with the above-mentioned weighted sum, which gives us output
either in binary form or a continuous value as follows:

Y = f(∑wi*xi + b)
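
These two steps amount to only a few lines of code. A minimal sketch (the names are illustrative), using a unit-step activation:

```python
def perceptron_output(x, w, b):
    """Step 1: weighted sum plus bias. Step 2: step activation f."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if weighted_sum > 0 else 0

# Example: 0.4*1.0 + 0.6*0.5 - 0.5 = 0.2 > 0, so the output is 1.
print(perceptron_output([1.0, 0.5], [0.4, 0.6], b=-0.5))
```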

Types of Perceptron Models


Based on the layers, Perceptron models are divided into two types. These are as follows:

1. Single-layer Perceptron Model


2. Multi-layer Perceptron model

Single Layer Perceptron Model:


This is one of the simplest types of Artificial Neural Networks (ANNs). A single-layered perceptron model consists
of a feed-forward network and includes a threshold transfer function inside the model. The main objective of the
single-layer perceptron model is to analyze linearly separable objects with binary outcomes.

In a single-layer perceptron model, the algorithm does not use recorded data, so it begins with randomly
initialized weight parameters. Further, it sums up all the weighted inputs. If the total sum is above a
pre-determined threshold, the model gets activated and shows the output value as +1.



If the outcome matches the pre-determined (threshold) value, the performance of the model is considered
satisfactory and the weights are left unchanged. However, the model produces errors when certain combinations of
weighted input values are fed into it. Hence, to find the desired output and minimize errors, some changes to the
weights are necessary, as sketched below.

"Single-layer perceptron can learn only linearly separable patterns."

Multi-Layered Perceptron Model:


Like a single-layer perceptron model, a multi-layer perceptron model has the same basic structure but a greater
number of hidden layers.

The multi-layer perceptron model is trained with the Backpropagation algorithm, which executes in two stages as
follows:

o Forward Stage: activations propagate from the input layer through the hidden layers and terminate at the
output layer.
o Backward Stage: in the backward stage, weight and bias values are modified as per the model's
requirement. In this stage, the error between the actual and desired output is propagated backward, starting at
the output layer and ending at the input layer.

Hence, a multi-layered perceptron model can be considered as an artificial neural network with multiple layers
in which the activation function does not remain linear, unlike in a single-layer perceptron model. Instead of
linear, the activation function can be sigmoid, TanH, ReLU, etc.

A multi-layer perceptron model has greater processing power and can process linear and non-linear patterns.
Further, it can also implement logic gates such as AND, OR, XOR, NAND, NOT, XNOR, NOR.

Advantages of Multi-Layer Perceptron:

o A multi-layered perceptron model can be used to solve complex non-linear problems.


o It works well with both small and large input data.
o It helps us to obtain quick predictions after the training.
o It helps to obtain the same accuracy ratio with large as well as small data.

Disadvantages of Multi-Layer Perceptron:

o In Multi-layer perceptron, computations are difficult and time-consuming.


o In a multi-layer Perceptron, it is difficult to determine how much each independent variable affects the
dependent variable.
o The model functioning depends on the quality of the training.

Perceptron Function
The perceptron function 'f(x)' is obtained by multiplying the input vector 'x' by the learned weight coefficients
'w', adding the bias, and thresholding the result.

Mathematically, we can express it as follows:

f(x) = 1 if w.x + b > 0
f(x) = 0 otherwise

o 'w' represents the real-valued weight vector


o 'b' represents the bias
o 'x' represents a vector of input x values.

Characteristics of Perceptron
The perceptron model has the following characteristics.

1. Perceptron is a machine learning algorithm for supervised learning of binary classifiers.


2. In Perceptron, the weight coefficient is automatically learned.
3. Initially, weights are multiplied with input features, and the decision is made whether the neuron is fired or
not.
4. The activation function applies a step rule to check whether the weighted sum is greater than zero.
5. The linear decision boundary is drawn, enabling the distinction between the two linearly separable classes
+1 and -1.
6. If the added sum of all input values is more than the threshold value, the neuron produces an output signal;
otherwise, no output is produced.


Limitations of Perceptron Model


A perceptron model has limitations as follows:

o The output of a perceptron can only be a binary number (0 or 1) due to the hard limit transfer function.
o Perceptron can only be used to classify linearly separable sets of input vectors. If the input classes are
non-linearly separable, the perceptron cannot classify them properly.

Future of Perceptron
The future of the Perceptron model is bright and significant, as it helps to interpret data by building intuitive
patterns and applying them in the future. Machine learning is a rapidly growing and continuously evolving
technology of Artificial Intelligence; hence, perceptron technology will continue to support and facilitate
analytical behavior in machines, which will in turn add to the efficiency of computers.

The perceptron model is continuously becoming more advanced and working efficiently on complex problems with
the help of artificial neurons.


Characteristics of Artificial Neural Network


 It is a neurally inspired mathematical model.
 It contains a large number of interconnected processing elements, called neurons, that carry out all operations.
 The information in the network is stored in the weighted links between neurons.
 The input signals arrive at the processing elements through connections and connection weights.
 It has the ability to learn, recall and generalize from the given data by suitable assignment and
adjustment of weights.
 The collective behavior of the neurons describes its computational power; no single neuron
carries specific information.
How does a simple neuron work?
Let there be two neurons X and Y transmitting a signal to another neuron Z. Then X and Y are
input neurons transmitting signals and Z is the output neuron receiving the signal. The input
neurons are connected to the output neuron over interconnection links (A and B) as shown in the
figure.

For the above neuron architecture, the net input is calculated as
I = xA + yB
where x and y are the activations of the input neurons X and Y. The output O of the output
neuron Z is obtained by applying the activation function over the net input:
O = f(I)
Output = Function (net input calculated)
The function applied over the net input is called the activation function. Various activation
functions are possible, as sketched below.
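
In code, this two-input neuron is a single weighted sum followed by one activation call. A sketch, assuming a binary-step activation (the text leaves the choice of f open):

```python
def f(net):
    """Activation function; a binary step is assumed here."""
    return 1 if net >= 0 else 0

x, y = 1.0, 0.5      # activations of the input neurons X and Y
A, B = 0.5, -0.2     # weights on the interconnection links
I = x * A + y * B    # net input to neuron Z: I = xA + yB
O = f(I)             # output of neuron Z: O = f(I)
print(I, O)          # 0.4 1
```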


Applications of Neural Networks

1. Every new technology needs assistance from the previous one, i.e. data from earlier systems,
and these data are analyzed so that the pros and cons can be studied correctly. Neural
networks make this kind of analysis possible.
2. Neural networks are suitable for research on animal behavior, predator/prey
relationships and population cycles.
3. It becomes easier to do proper valuation of property, buildings, automobiles, machinery,
etc. with the help of neural networks.
4. Neural networks can be used in betting on horse races and sporting events, and most
importantly in the stock market.
5. They can be used to predict the appropriate judgment for a crime by using a large database
of crime details as input and the resulting sentences as output.
6. Data mining, cleaning and validation can be achieved through neural networks by analyzing
data and determining which records are faulty (files diverging from their peers).
7. Neural networks can be used to identify targets from the echo patterns we get from
sonar, radar, seismic and magnetic instruments.
8. They can be used efficiently in employee hiring, so that a company can hire the right
employees based on the skills they have and their expected future productivity.
9. They have a large range of applications in medical research.
10. They can be used for fraud detection regarding credit cards, insurance or taxes by
analyzing past records.


Neural Representation of AND, OR, NOT, NOR, NAND, XOR and XNOR Logic Gates (Perceptron
Algorithm)

While taking the Udacity PyTorch course by Facebook, I found it difficult to understand
how the Perceptron works with logic gates (AND, OR, NOT, and so on). I decided to
check online resources, but as of the time of writing this, there was really no explanation
of how to go about it. So after personal reading, I finally understood how to do it,
which is the reason for this post.

Note: The purpose of this article is NOT to mathematically explain how the neural
network updates the weights, but to explain the logic behind how the values are being
changed in simple terms.

First, we need to know that the Perceptron algorithm states that:

Prediction (y`) = 1 if Wx+b > 0 and 0 if Wx+b ≤ 0

Also, the steps in this method are very similar to how Neural Networks learn, which is as
follows;


 Initialize weight values and bias

 Forward Propagate

 Check the error

 Backpropagate and Adjust weights and bias

 Repeat for all training examples

Now that we know the steps, let’s get up and running:

AND Gate

From our knowledge of logic gates, we know that an AND logic table is given by the
diagram below

AND Gate

The question is, what are the weights and bias for the AND perceptron?


First, we need to understand that the output of an AND gate is 1 only if both inputs (in this
case, x1 and x2) are 1. So, following the steps listed above;
Row 1
 From w1*x1+w2*x2+b, initializing w1, w2, as 1 and b as –1, we get;
x1(1)+x2(1)–1
 Passing the first row of the AND logic table (x1=0, x2=0), we get;
0+0–1 = –1
 From the Perceptron rule, if Wx+b≤0, then y`=0. Therefore, this row is correct, and no
need for Backpropagation.
Row 2
 Passing (x1=0 and x2=1), we get;
0+1–1 = 0
 From the Perceptron rule, if Wx+b≤0, then y`=0. This row is correct, as the output is 0
for the AND gate.
Row 3
 Passing (x1=1 and x2=0), we get;
1+0–1 = 0
 From the Perceptron rule, y`=0 again, which is correct. So the model works for rows 1, 2 and 3.
Row 4
 Passing (x1=1 and x2=1), we get;
1+1–1 = 1
 Again, from the perceptron rule, this is still valid.

Therefore, we can conclude that the model to achieve an AND gate, using the Perceptron
algorithm is;

x1+x2–1
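
A small sketch to verify this model against its truth table; `step` and `check` are illustrative helpers, and the same harness works for the OR, NOT, NOR and NAND models derived below:

```python
def step(v):
    """Perceptron prediction: y` = 1 if Wx+b > 0, else 0."""
    return 1 if v > 0 else 0

def check(model, table):
    """True if the linear model matches every row of the truth table."""
    return all(step(model(*row[:-1])) == row[-1] for row in table)

AND_TABLE = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
print(check(lambda x1, x2: x1 + x2 - 1, AND_TABLE))  # True
```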


OR Gate

OR Gate

From the diagram, the OR gate is 0 only if both inputs are 0.

Row 1

 From w1x1+w2x2+b, initializing w1, w2, as 1 and b as –1, we get;

x1(1)+x2(1)–1


 Passing the first row of the OR logic table (x1=0, x2=0), we get;
0+0–1 = –1
 From the Perceptron rule, if Wx+b≤0, then y`=0. Therefore, this row is correct.
Row 2
 Passing (x1=0 and x2=1), we get;
0+1–1 = 0
 From the Perceptron rule, if Wx+b <= 0, then y`=0. Therefore, this row is incorrect.
 So we want values that will make inputs x1=0 and x2=1 give y` a value of 1. If we
change w2 to 2, we have;
0+2–1 = 1
 From the Perceptron rule, this is correct for both rows 1 and 2.
Row 3
 Passing (x1=1 and x2=0), we get;
1+0–1 = 0
 From the Perceptron rule, if Wx+b <= 0, then y`=0. Therefore, this row is incorrect.
 Since it is similar to that of row 2, we can just change w1 to 2, we have;
2+0–1 = 1
 From the Perceptron rule, this is correct for rows 1, 2 and 3.
Row 4
 Passing (x1=1 and x2=1), we get;
2+2–1 = 3
 Again, from the perceptron rule, this is still valid. Quite Easy!
Therefore, we can conclude that the model to achieve an OR gate, using the Perceptron
algorithm is;
2x1+2x2–1


NOT Gate

NOT Gate
From the diagram, the output of a NOT gate is the inverse of a single input. So,
following the steps listed above;
Row 1
 From w1x1+b, initializing w1 as 1 (since single input), and b as –1, we get;


x1(1)–1
 Passing the first row of the NOT logic table (x1=0), we get;
0–1 = –1
 From the Perceptron rule, if Wx+b≤0, then y`=0. This row is incorrect, as the output is 1
for the NOT gate.
 So we want values that will make input x1=0 to give y` a value of 1. If we change b to 1,
we have;
0+1 = 1
 From the Perceptron rule, this works.
Row 2
 Passing (x1=1), we get;
1+1 = 2
 From the Perceptron rule, if Wx+b > 0, then y`=1. This row is incorrect, as the output
is 0 for the NOT gate.
 So we want values that will make input x1=1 to give y` a value of 0. If we change w1 to
–1, we have;
–1+1 = 0
 From the Perceptron rule, if Wx+b ≤ 0, then y`=0. Therefore, this works (for both row 1
and row 2).
Therefore, we can conclude that the model to achieve a NOT gate, using the Perceptron
algorithm is;
–x1+1


NOR Gate

NOR Gate
From the diagram, the NOR gate is 1 only if both inputs are 0.
Row 1
 From w1x1+w2x2+b, initializing w1 and w2 as 1, and b as –1, we get;
x1(1)+x2(1)–1
 Passing the first row of the NOR logic table (x1=0, x2=0), we get;
0+0–1 = –1
 From the Perceptron rule, if Wx+b≤0, then y`=0. This row is incorrect, as the output is 1
for the NOR gate.


 So we want values that will make input x1=0 and x2 = 0 to give y` a value of 1. If we
change b to 1, we have;
0+0+1 = 1
 From the Perceptron rule, this works.
Row 2
 Passing (x1=0, x2=1), we get;
0+1+1 = 2
 From the Perceptron rule, if Wx+b > 0, then y`=1. This row is incorrect, as the output is 0
for the NOR gate.
 So we want values that will make input x1=0 and x2 = 1 to give y` a value of 0. If we
change w2 to –1, we have;
0–1+1 = 0
 From the Perceptron rule, this is valid for both row 1 and row 2.
Row 3
 Passing (x1=1, x2=0), we get;
1+0+1 = 2
 From the Perceptron rule, if Wx+b > 0, then y`=1. This row is incorrect, as the output is 0
for the NOR gate.
 So we want values that will make inputs x1=1 and x2 = 0 give y` a value of 0. If we
change w1 to –1, we have;
–1+0+1 = 0

 From the Perceptron rule, this is valid for rows 1, 2 and 3.

Row 4


 Passing (x1=1, x2=1), we get;

-1-1+1 = -1

 From the Perceptron rule, this still works.

Therefore, we can conclude that the model to achieve a NOR gate, using the Perceptron
algorithm is;

-x1-x2+1


NAND Gate

From the diagram, the NAND gate is 0 only if both inputs are 1.
Row 1
 From w1x1+w2x2+b, initializing w1 and w2 as 1, and b as -1, we get;
x1(1)+x2(1)-1
 Passing the first row of the NAND logic table (x1=0, x2=0), we get;
0+0-1 = -1
 From the Perceptron rule, if Wx+b≤0, then y`=0. This row is incorrect, as the output is 1
for the NAND gate.

 So we want values that will make input x1=0 and x2 = 0 to give y` a value of 1. If we
change b to 1, we have;
0+0+1 = 1
 From the Perceptron rule, this works.
Row 2
 Passing (x1=0, x2=1), we get;
0+1+1 = 2
 From the Perceptron rule, if Wx+b > 0, then y`=1. This row is also correct (for both row
2 and row 3).


Row 4
 Passing (x1=1, x2=1), we get;
1+1+1 = 3
 This is not the expected output, as the output is 0 for a NAND combination of x1=1 and
x2=1.
 Changing values of w1 and w2 to -1, and value of b to 2, we get;
-1-1+2 = 0
 It works for all rows.
Therefore, we can conclude that the model to achieve a NAND gate, using the Perceptron
algorithm is;

-x1-x2+2


XNOR Gate

XNOR Gate
Now that we are done with the necessary basic logic gates, we can combine them to give an
XNOR gate.
The boolean representation of an XNOR gate is;
x1x2 + x1`x2`
Where ‘`' means inverse.
From the expression, we can say that the XNOR gate consists of an AND gate (x1x2), a
NOR gate (x1`x2`), and an OR gate.

This means we will have to combine 3 perceptrons:

 AND (x1+x2–1)

 NOR (-x1-x2+1)

 OR (2x1+2x2–1)


XOR Gate

XOR Gate
The boolean representation of an XOR gate is;
x1x`2 + x`1x2
We first simplify the boolean expression by adding the zero terms x`1x1 and x`2x2 (each
equals 0, so the value is unchanged):
x`1x2 + x1x`2 + x`1x1 + x`2x2
x1(x`1 + x`2) + x2(x`1 + x`2)

(x1 + x2)(x`1 + x`2)


(x1 + x2)(x1x2)`    [since x`1 + x`2 = (x1x2)` by De Morgan's law]

From the simplified expression, we can say that the XOR gate consists of an OR gate (x1 +
x2), a NAND gate ((x1x2)`, modeled by -x1-x2+2) and an AND gate (x1+x2–1) that combines them.

This means we will have to combine 3 perceptrons:

 OR (2x1+2x2–1)

 NAND (-x1-x2+2)

 AND (x1+x2–1)
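
Putting the three models together, a sketch of the resulting two-layer XOR network (OR and NAND form the hidden layer, AND the output neuron):

```python
def step(v):
    return 1 if v > 0 else 0

def xor(x1, x2):
    or_out = step(2 * x1 + 2 * x2 - 1)    # OR perceptron (hidden layer)
    nand_out = step(-x1 - x2 + 2)         # NAND perceptron (hidden layer)
    return step(or_out + nand_out - 1)    # AND perceptron (output layer)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))  # 0 0->0, 0 1->1, 1 0->1, 1 1->0
```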
