
6.1 BIOLOGICAL NEURON

• Neuron is the elementary nerve cell and a basic unit of the nervous system. A neuron has an information processing ability.

• Each neuron has three main regions: the cell body or Soma, the Axon and the Dendrites. The Soma contains the nucleus and processes the information. The Axon is a long fiber that serves as a transmission line.

• The end part of the Axon splits into a fine arborization that ends in a small bulb called a Synapse, almost touching the dendrite of the neighboring neuron.

Fig. 6.1.1 : Biological Neuron Model

• Dendrites accept the input from the neighboring neuron through the axon. Dendrites look like a tree structure and receive signals from other neurons. The synapse is the electro-chemical contact between the organs.

• They do not physically touch because they are separated by a cleft. The electric signals are sent through chemical interaction. The neuron sending the signal is called the pre-synaptic cell and the neuron receiving the electrical signal is called the post-synaptic cell.

• The electrical signals that the neurons use to convey the information of the brain are all identical. The brain can determine which type of information is being received based on the path of the signal.

• The brain analyzes all patterns of signals sent, and from that information it interprets the type of information received. There are different types of biological neurons.

• When neurons are classified by the processes they carry out, they are classified as unipolar neurons, bipolar neurons and multipolar neurons.

• Unipolar neurons have a single process. Their dendrites and axon are located on the same stem. These neurons are found in invertebrates. Bipolar neurons have two processes; their dendrites and axon are two separated processes. Multipolar neurons are commonly found in mammals. Some examples of these neurons are spinal motor neurons, pyramidal cells and Purkinje cells.

• When biological neurons are classified by function, they fall into three categories. The first group is sensory neurons. These neurons provide all information for perception and motor coordination. The second group provides information to muscles and glands; these are called motor neurons.

• The last group, the interneurons, contains all other neurons and has two subclasses. One group, called relay or projection interneurons, is usually found in the brain and connects different parts of it. The other group, called local interneurons, is only used in local circuits.

6.2 INTRODUCTION TO ANN

• An Artificial Neural Network (ANN) is inspired by the Biological Nervous System. The way a biological nervous system such as the brain processes information, in the same manner an ANN also processes information. An ANN is also called an information processing paradigm. It resembles the brain in two respects:
  o A learning process is used to acquire knowledge from the environment by the network.
  o Acquired knowledge is stored using interneuron connection strengths known as synaptic weights.

• The information processing system is composed of a large number of highly interconnected processing elements (neurons) which work together to solve specific problems.

• In a biological system the learning process involves adjustments to the synaptic connections that exist between the neurons; learning is carried out in an ANN in the same manner.
• ANN can be used in many applications. Pattern extraction and detection of trends is a tedious process for humans and for other computer techniques. Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends.

• One of the applications of ANN is adaptive learning: the data is given for training, and the network then uses this data to learn how to perform tasks based on it.

• ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.

• Other applications of ANN include Signal Processing, Pattern Recognition, Speech Recognition, Data Compression, Computer Vision and Game Playing.

6.3 MCCULLOCH - PITTS NEURON

• McCulloch and Pitts proposed a computational model that resembles the Biological Neuron model. These neurons were represented as models of biological networks turned into conceptual components for circuits that could perform computational tasks. The basic model of the artificial neuron is founded upon the functionality of the biological neuron.

• An artificial neuron is a mathematical function that resembles the biological neuron.

Neuron Model

• A neuron with a scalar input and no bias appears below. Fig. 6.3.1 shows a simple artificial neural net with n input neurons (X1, X2, ..., Xn) and one output neuron (Y). The interconnecting weights are given by W1, W2, ..., Wn.

Fig. 6.3.1 : Artificial Neuron Model

• The output of the above model is given as,

    Y = 1   if  net_i = Σj Wij Xj ≥ T
    Y = 0   if  net_i = Σj Wij Xj < T

  where 'i' represents the output neuron and 'j' represents the input neuron.

• The scalar input X is transmitted through a connection that multiplies its strength by the scalar weight W to form the product W·X, again a scalar. The weighted input W·X is the only argument of the transfer function f, which produces the scalar output Y. The neuron may have a scalar bias b. You can view the bias as simply being added to the product W·X. The bias is much like a weight, except that it has a constant input of 1.

• The transfer function net input n, again a scalar, is the sum of the weighted input W·X and the bias b. This sum is the argument of the transfer function. Here f is a transfer function, typically a step function or a sigmoid function, that takes the argument n and produces the output Y. Note that W and b are both adjustable scalar parameters of the neuron.

• The central idea of neural networks is that such parameters can be adjusted so that the network exhibits some desired or interesting behavior. Thus, you can train the network to do a particular job by adjusting the weight or bias parameters, or perhaps the network itself will adjust these parameters to achieve some desired end.

• As previously noted, the bias b is an adjustable scalar parameter of the neuron. It is not an input; however, the constant 1 that drives the bias is an input and can be treated as such when you consider the linear dependence of input vectors.
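The weighted-sum-plus-bias behaviour described above can be written directly in code. The following is a minimal illustrative sketch (Python, not from the textbook); the function names and the numeric values are assumptions chosen only to show the computation.

    # Sketch of the neuron model described above:
    # net = weighted sum of inputs plus bias, passed through a step transfer function.

    def step(net, threshold=0.0):
        """Step (threshold) transfer function: 1 if net >= threshold else 0."""
        return 1 if net >= threshold else 0

    def neuron_output(inputs, weights, bias=0.0, transfer=step):
        """Compute Y = f(sum_j W_j * X_j + b) for a single artificial neuron."""
        net = sum(w * x for w, x in zip(weights, inputs)) + bias
        return transfer(net)

    # Example usage with made-up values:
    print(neuron_output([1, 0, 1], [0.5, -0.2, 0.3], bias=-0.4))  # net = 0.4 -> 1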
Ex. 6.3.1 : Simulation of a NOT gate using the McCulloch-Pitts model. The truth table of the NOT gate is as follows:

    Input X    Output Y
      1           0
      0           1

Soln. : We assume the weight as W.

For the first row (i.e. input 1) we may write the net value as W × X = W × 1 = W. According to the McCulloch-Pitts model, if the output is 0 then the net value must be less than the threshold: W < T.

For the second row (i.e. input 0) we may write the net value as W × X = W × 0 = 0. According to the McCulloch-Pitts model, if the output is 1 then the net value must be greater than or equal to the threshold: 0 ≥ T.

Now we have two equations:
    1.  W < T          2.  0 ≥ T

Now select the values of W and T such that the above conditions get satisfied. One possible choice is T = 0, W = -1.

Using these values the NOT gate is represented as,

    X ──(W = -1)──▶ [ T = 0 ] ──▶ Y

Ex. 6.3.2 : Simulation of an AND gate using the McCulloch-Pitts model. The truth table of the AND gate is as follows:

    Input            Output
    X1      X2         Y
    0       0          0
    0       1          0
    1       0          0
    1       1          1

Soln. : We assume the weights as W1 for X1 and W2 for X2.

For the first row, we may write the net value as (W1 × X1) + (W2 × X2) = (W1 × 0) + (W2 × 0) = 0. According to the McCulloch-Pitts model, if the output is 0 then the net value must be less than the threshold: 0 < T.

For the second row, the net value is (W1 × 0) + (W2 × 1) = W2. Since the output is 0, the net value must be less than the threshold: W2 < T.

For the third row, the net value is (W1 × 1) + (W2 × 0) = W1. Since the output is 0, the net value must be less than the threshold: W1 < T.

For the fourth row, the net value is (W1 × 1) + (W2 × 1) = W1 + W2. Since the output is 1, the net value must be greater than or equal to the threshold: W1 + W2 ≥ T.

Now we have four equations:
    1.  0 < T          2.  W2 < T
    3.  W1 < T         4.  W1 + W2 ≥ T

Now select the values of W1, W2 and T such that the above conditions get satisfied. Let's assume T = 0.8, W2 = 0.3, W1 = 0.5.

Using these values the AND gate is represented as,

    X1 ──(W1 = 0.5)──▶
                        [ T = 0.8 ] ──▶ Y
    X2 ──(W2 = 0.3)──▶
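The threshold conditions derived in Ex. 6.3.1 and Ex. 6.3.2 can be checked mechanically. Below is a small illustrative sketch (Python, assumed code, not part of the textbook) that evaluates a McCulloch-Pitts unit with the weights and thresholds chosen above and prints the resulting truth tables.

    # Verify the NOT and AND gates built from McCulloch-Pitts units,
    # using the weights/thresholds selected in Ex. 6.3.1 and Ex. 6.3.2.

    def mcculloch_pitts(inputs, weights, threshold):
        """Fire (output 1) when the weighted sum of inputs reaches the threshold."""
        net = sum(w * x for w, x in zip(weights, inputs))
        return 1 if net >= threshold else 0

    # NOT gate: W = -1, T = 0
    for x in (0, 1):
        print("NOT", x, "->", mcculloch_pitts([x], [-1], threshold=0))

    # AND gate: W1 = 0.5, W2 = 0.3, T = 0.8
    for x1 in (0, 1):
        for x2 in (0, 1):
            print("AND", x1, x2, "->", mcculloch_pitts([x1, x2], [0.5, 0.3], threshold=0.8))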
6.4 DIFFERENCE BETWEEN BIOLOGICAL NEURON AND ARTIFICIAL NEURON

    Sr. No.  Points of Difference               Biological NN                              Artificial NN
    1        Processing elements                10^14 synapses                             10^8 transistors
    2        Speed                              Slow                                       Fast
    3        Processing                         Parallel execution                         One by one
    4        Size and complexity of operation   Less                                       More; difficult to implement complex operations
    5        Fault tolerance                    Exists                                     Doesn't exist
    6        Storage                            If new data is added, old is not erased    Erased
    7        Control mechanism                  Every neuron acts independently            CPU

6.5 ACTIVATION FUNCTIONS AND TYPES

6.5.1 Activation Functions in a Neural Network

• Artificial neural networks are an important part of many structures that are helping revolutionize the world around us.

• Now let us see how artificial neural networks show the required performance to find solutions to real-world problems. The answer for this is activation functions.

• Artificial neural networks use activation functions to compute many complex calculations in the hidden layers and then forward the result to the output layer. The main aim of activation functions is to introduce non-linear properties into the neural network.

• Activation functions convert the linear input signals of a node into non-linear output signals to facilitate the learning of high-order polynomials that go beyond one degree for deep networks.

• A distinct property of activation functions is that they are differentiable. This property helps them function during the backpropagation of the neural network.

• The activation function f(x) is used to give the output of a neuron in terms of a local field x or net. The various activation functions are discussed in the coming sections.

• Activation functions help neural networks make sense of complicated, high-dimensional and non-linear big data sets that have an intricate architecture - they contain multiple hidden layers in between the input and output layer.

6.5.2 Need for Non-linearity

UQ. Explain why we use a non-linearity function. (SPPU - Dec 18, 4 Marks)

• If we do not use activation functions then the output signal would be a linear function, and the output would be a polynomial of one degree.

• Even if linear equations are easy to solve, they have a limited complexity quotient. Due to this, linear equations have less power to learn complex functional mappings from data.

• From this we can say that if we do not use activation functions then the neural network would be a linear regression model with limited functionality.

• And this we do not expect from a neural network. The main purpose of neural networks is to perform extremely complicated calculations.

• Also, if we do not use activation functions then neural networks will not be able to learn. Neural networks will also not be able to model complicated data, including images, speech, videos, audio, etc.

6.5.3 Soft and Hard Limiting Function Types

UQ. Write a note on Sigmoid, Tanh, ReLU. (SPPU - Dec 19, 8 Marks)
UQ. State and explain 3 types of neurons that add non-linearity in their computations. (SPPU - Dec 18, 4 Marks)
UQ. Explain the sigmoid function.

• The hard limiting activation functions force a neuron to output 1 if its net input reaches a threshold, otherwise it outputs 0. This allows a neuron to make a decision or classification. It can say yes or no. This kind of neuron is often trained with the perceptron learning rule. Exactly the reverse of the hard limit is the soft limiting activation functions.

1. Linear

    Output = net

(Graph: a = purelin(n), Linear Transfer Function)

• The linear activation function gives an output the same as the input or net value.

• The Matlab toolbox has a function, purelin, to realize the mathematical linear transfer function shown above. Neurons of this type are used as linear approximators in Linear Filters.
2. Rectified Linear Unit (ReLU)

    Output = max(0, net) = net   if net ≥ 0
                         = 0     if net < 0

• The rectified linear unit is one of the most frequently used activation functions in deep learning models. The ReLU function is a fast-learning activation function that promises to deliver very good performance with appropriate and correct results.

• ReLU shows better generalization and performance when compared to other activation functions like the sigmoid and tanh functions. The function is a nearly linear function that retains the properties of linear models, which makes them easy to optimize with gradient-descent methods.

• The ReLU function performs a threshold operation on each input element where all values less than zero are set to zero.

3. Hard limit / Unipolar binary

    Output = 0   if net < 0
           = 1   if net ≥ 0

(Graph: a = hardlim(n), Hard-Limit Transfer Function)

• The hard-limit transfer function shown above limits the output of the neuron to either 0, if the net input argument n is less than 0, or 1, if n is greater than or equal to 0.

• This function is used in Perceptrons, to create neurons that make classification decisions.

4. Symmetrical Hard limit / Bipolar binary

    Output = -1   if net < 0
           =  1   if net ≥ 0

(Graph: a = hardlims(n), Symmetric Hard-Limit Transfer Function)

• The symmetrical hard-limit transfer function shown above limits the output of the neuron to either -1, if the net input argument n is less than 0, or 1, if n is greater than or equal to 0.

• This function is used in Perceptrons, to create neurons that make classification decisions.

5. Saturating linear

    Output = 0     if net < 0
           = net   if 0 ≤ net < 1
           = 1     if net ≥ 1

(Graph: a = satlin(n), Satlin Transfer Function)

6. Symmetrical saturating linear

    Output = -1    if net < -1
           = net   if -1 ≤ net < 1
           =  1    if net ≥ 1

(Graph: a = satlins(n), Symmetric Saturating Linear Transfer Function)
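Each of the limiting functions above is a one-line rule on the net input. The following sketch (illustrative Python, not part of the textbook; the Matlab-style names purelin, hardlim, etc. are borrowed from the graph labels) implements items 1-6 so their shapes can be compared numerically.

    def purelin(net):            # 1. Linear
        return net

    def relu(net):               # 2. Rectified Linear Unit
        return max(0.0, net)

    def hardlim(net):            # 3. Hard limit / unipolar binary
        return 1 if net >= 0 else 0

    def hardlims(net):           # 4. Symmetrical hard limit / bipolar binary
        return 1 if net >= 0 else -1

    def satlin(net):             # 5. Saturating linear
        return 0.0 if net < 0 else (net if net < 1 else 1.0)

    def satlins(net):            # 6. Symmetrical saturating linear
        return -1.0 if net < -1 else (net if net < 1 else 1.0)

    for f in (purelin, relu, hardlim, hardlims, satlin, satlins):
        print(f.__name__, [f(n) for n in (-1.8, -0.5, 0.0, 0.5, 1.8)])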
• The symbol in the square to the right of each transfer function graph shown above represents the associated transfer function. These icons replace the general f in the boxes of network diagrams to show the particular transfer function being used.

7. Unipolar continuous

    Output = 1 / (1 + e^(-λ·net))

(Graph: a = logsig(n), Log-sigmoid Transfer Function)

• The sigmoid transfer function shown above takes the input, which can have any value between plus and minus infinity, and squashes the output into the range 0 to 1.

• This transfer function is commonly used in backpropagation networks, in part because it is differentiable.

8. Tanh / Bipolar continuous

    Output = 2 / (1 + e^(-λ·net)) - 1

(Graph: a = tansig(n), Tan-sigmoid Transfer Function)

• This sigmoid transfer function also takes an input of any value between plus and minus infinity, and squashes the output into the range -1 to 1.

6.6 NEURAL NETWORK ARCHITECTURE

UQ. Write a note on feed forward network.
UQ. Explain the architecture of a feed forward neural network. Give its limitations. (SPPU - Oct 16, Jan 19, 8 Marks)

• Neurons are interconnected to each other. The arrangement of neurons is called the network architecture.

Fig. 6.6.1 : Network Architecture (Neural Network → Feed forward Network : Single Layer Perceptron Network, Multilayer Perceptron Network, Radial Basis Function Network)

1. Feed forward Network

• In this type of network the neurons are connected in a forward direction. A connection is allowed from layer i to layer j if and only if i < j.

Fig. 6.6.2 : Single Layer Feed forward Network

• The input vector, output vector and the weight matrix of the network are represented as follows,

    Input vector,   X = [X1, X2, ..., Xn]
    Output vector,  Y = [Y1, Y2, ..., Ym]
    Weight matrix,  W = [ W11  W12  W13  ...  W1n
                          W21  W22  W23  ...  W2n
                          ...
                          Wm1  Wm2  Wm3  ...  Wmn ]

• The net input to the jth neuron is calculated using the following equation,

    netj = Σi Wji · Xi      for j = 1 to m and i = 1 to n

• Once the net input is calculated, the output of the jth neuron is calculated by applying the activation function to the net input as follows,

    Yj = f(netj)

(i) Single Layer Perceptron Network

• Fig. 6.6.2 represents the single layer Perceptron network. In this type of network only the input and output layers are present.

• It consists of a single layer, where the inputs are directly connected to the outputs via a series of weights.

• The sum of the products of the weights and the inputs is calculated in each neuron node, and if the value is above some threshold (generally 0) the neuron fires and displays the output (generally 1), otherwise it inhibits the value (generally -1).

(ii) Multi-Layer Perceptron Network

• This type of network, besides having the input and output layers, also has one or more hidden layers in between the input and output layers.

Fig. 6.6.3 : Multi-Layer Perceptron Network

(iii) Radial Basis Function Network

• In this type of network a single hidden layer is present.

Fig. 6.6.4 : Radially connected (Radial Basis Function) neural network

2. Feedback / Recurrent Network

• A feedback network can be obtained from the feed forward network by connecting its output to its input.

• The inputs are applied at the initial instance, then the input is removed and the network remains autonomous thereafter (t > 0).

Fig. 6.6.5 : Feedback / Recurrent Network

• The recurrent networks differ from the feed-forward architecture: a recurrent network has at least one feedback loop.

    (i) Competitive networks : In this type of network the neurons of the output layer compete between themselves to find the maximum output.

    (ii) Self Organising Map (SOM) : The input neurons activate the closest output neuron.

    (iii) Hopfield Network : Each neuron is connected to every other neuron but not back to itself.
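To connect the pieces of Section 6.6, the sketch below (illustrative Python, not from the textbook) computes the single-layer feed-forward pass Yj = f(netj), with netj = Σi Wji·Xi, using an assumed 2 × 3 weight matrix and the unipolar continuous activation.

    import math

    def logsig(net, lam=1.0):
        """Unipolar continuous (log-sigmoid) activation."""
        return 1.0 / (1.0 + math.exp(-lam * net))

    def feed_forward(W, X, f=logsig):
        """Single-layer pass: row j of W holds the weights into output neuron j."""
        return [f(sum(w_ji * x_i for w_ji, x_i in zip(row, X))) for row in W]

    # Assumed example: 3 inputs, 2 output neurons.
    W = [[0.2, -0.5, 0.1],
         [0.4,  0.3, -0.2]]
    X = [1.0, 0.5, -1.0]
    print(feed_forward(W, X))   # outputs of the two neurons, each in (0, 1)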
6.7 TYPES OF LEARNING

UQ. Compare Supervised, Unsupervised and Semi-Supervised Learning with examples. (SPPU - Jan. 19, 6 Marks)
UQ. Explain supervised and unsupervised learning. (SPPU - Aug. 17, 5 Marks)

Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well: learning of a neural network means setting or updating the weights.

1. Supervised Learning

Fig. 6.7.1 : Supervised Learning

In this type of learning, when an input is applied the supervisor provides a desired response. The difference between the actual response (o) and the desired response (d) is calculated, called the error measure, which is used to correct the network parameters.

2. Unsupervised Learning

Fig. 6.7.2 : Unsupervised Learning

In this type of learning a supervisor is not present; due to this there is no idea or guess of the output. The network modifies its weights based on patterns of input and/or output.

(a) Hybrid Learning : In this type a combination of supervised and unsupervised learning is used.

(b) Competitive Learning : In this the output neurons compete between themselves. The neuron having the maximum response is declared the winner neuron and the weights of the winner neuron are modified; the others remain unchanged.

Ex. 6.7.1 : A single layer NN is to be designed with 6 inputs and 2 outputs. The outputs are to be limited to, and continuous over, the range 0 to 1. Answer the following questions:
1. How many neurons are required?
2. What are the dimensions of the weight matrix?
3. What kind of transfer function should be used?

Soln. : For each input and each output one neuron is required. Hence a total of 8 neurons are required.

In the weight matrix the number of rows represents the output neurons and the number of columns represents the input neurons. The first row represents all the incoming weights of the first output neuron and the second row represents all the incoming weights of the second output neuron. Hence the matrix dimension is 2 × 6, as follows,

    W = [ W11  W12  W13  W14  W15  W16
          W21  W22  W23  W24  W25  W26 ]

The outputs are to be limited to, and continuous over, the range 0 to 1. Hence we may use the unipolar continuous function,

    f(net) = 1 / (1 + exp(-λ·net))
Ex. 6.7.2 : Given a 2-input neuron with the following parameters: b = 1.2, W = [3, 2], X = [-5, 6]. Calculate the neuron's output for the following transfer functions:
1. Hard limit                       2. Symmetrical hard limit
3. Linear                           4. Saturating linear
5. Symmetrical saturating linear    6. Unipolar continuous
7. Bipolar continuous

Soln. : First we calculate the net value, which can be calculated as,

    neti = Σj Wij Xj

But in this particular problem a bias value is also given. Bias is a value added to the net to improve the performance of the network, so we will use the following formula,

    neti = Σj Wij Xj + b

    net = (-5 × 3) + (6 × 2) + 1.2 = -1.8

Now we will calculate the final output for the given activation functions.

1. Hard limit
    Output = 0 if net < 0;  = 1 if net ≥ 0.                                Hence, Y = 0

2. Symmetrical hard limit
    Output = -1 if net < 0;  = 1 if net ≥ 0.                               Hence, Y = -1

3. Linear
    Output = net.                                                          Hence, Y = -1.8

4. Saturating linear
    Output = 0 if net < 0;  = net if 0 ≤ net < 1;  = 1 if net ≥ 1.         Hence, Y = 0

5. Symmetrical saturating linear
    Output = -1 if net < -1;  = net if -1 ≤ net < 1;  = 1 if net ≥ 1.      Hence, Y = -1

6. Unipolar continuous
    Output = 1 / (1 + exp(-λ × net))
    We assume the value of λ equal to 1 (if the value of λ is not given, assume the standard value of 1).
                                                                           Hence, Y = 0.1418

7. Bipolar continuous
    Output = 2 / (1 + exp(-λ × net)) - 1.                                  Hence, Y = -0.716

Ex. 6.7.3 : Compute the output of the following network using the unipolar continuous function.

(Figure: inputs 0 and 1 feed two hidden nodes; H1 receives weights 4.83 and -4.83 with bias -2.82, H2 receives weights -4.63 and 4.6 with bias -2.74; the output node takes H1 and H2 through weights 5.73 and 5.83 with bias -2.86.)

Soln. : First we will calculate the net input and output of the hidden nodes,

    H1(net)    = (4.83 × 0) + (-4.83 × 1) - 2.82 = -7.65
    H1(output) = 4.758 × 10^-4

    H2(net)    = (-4.63 × 0) + (4.6 × 1) - 2.74 = 1.86
    H2(output) = 0.865

Now we will calculate the net input and output of the output node,

    O(net)    = (4.758 × 10^-4 × 5.73) + (0.865 × 5.83) - 2.86 = 2.186
    O(output) = 0.899
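The two worked examples above are easy to re-check in code. The sketch below (illustrative Python, not from the textbook) evaluates the Ex. 6.7.2 neuron (W = [3, 2], X = [-5, 6], b = 1.2) and the small two-layer network of Ex. 6.7.3; the printed values should agree with the hand calculations up to rounding.

    import math

    def logsig(net, lam=1.0):
        return 1.0 / (1.0 + math.exp(-lam * net))

    def neuron_net(weights, inputs, bias=0.0):
        return sum(w * x for w, x in zip(weights, inputs)) + bias

    # Ex. 6.7.2: net = 3*(-5) + 2*6 + 1.2 = -1.8
    net = neuron_net([3, 2], [-5, 6], bias=1.2)
    print("net =", net)
    print("unipolar continuous:", round(logsig(net), 4))          # ~0.1419
    print("bipolar continuous :", round(2 * logsig(net) - 1, 4))  # ~-0.7163

    # Ex. 6.7.3: two hidden sigmoid units feeding one sigmoid output unit
    h1 = logsig(neuron_net([4.83, -4.83], [0, 1], bias=-2.82))
    h2 = logsig(neuron_net([-4.63, 4.6], [0, 1], bias=-2.74))
    o  = logsig(neuron_net([5.73, 5.83], [h1, h2], bias=-2.86))
    print("H1 =", h1, "H2 =", h2, "O =", round(o, 3))             # O ~ 0.899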
6.8 PERCEPTRON

UQ. Explain the Perceptron training algorithm for linear classification.

Perceptron Architecture

One type of NN system is based on the "Perceptron". A Perceptron computes a sum of a weighted combination of its inputs; if the sum is greater than a certain threshold its output is 1, else -1.

Fig. 6.8.1 : Perceptron Architecture

• The output of the neuron is 1/0 or 1/-1; thus each neuron in the network divides the input space into two regions. This is useful to determine the boundary between these regions. Let's see an example of this.

(Figure: inputs P1 and P2 with weights W11 and W12 and bias b feeding a hard-limit neuron.)

• The output of the above network is given by,

    O = hardlim(W11 × P1 + W12 × P2 + b)

• The input vector for which the net input is zero determines the decision boundary,

    W11 × P1 + W12 × P2 + b = 0

• Let's take the values W11 = W12 = 1 and b = -1 and substitute them in the above equation; we get

    P1 + P2 - 1 = 0

• To draw the decision boundary we need to find the intercepting points on the P1 and P2 axes.
  Substituting P1 = 0 we get P2 = 1, i.e. (0, 1).
  Substituting P2 = 0 we get P1 = 1, i.e. (1, 0).

• Now, to find which decision region belongs to output = 1, let's pick one point, (2, 0), and substitute it in the following equation,

    O = hardlim(W^T P + b) = hardlim([1  1][2  0]^T + (-1)) = hardlim(1) = 1

(Figure: the line P1 + P2 - 1 = 0 through (0, 1) and (1, 0); the side containing (2, 0) is labelled Output = 1 and the other side Output = 0.)

• The decision boundary is always orthogonal to the weight vector and it always points towards the region where the neuron output is 1.

6.9 PERCEPTRON LEARNING ALGORITHM

UQ. Explain the Perceptron training algorithm for linear classification.

Fig. 6.9.1 : Perceptron Learning Rule

Perceptron learning is a supervised type of learning, as the desired response is present. It is applicable only for binary types of neurons (activation functions). The learning signal is the difference between the actual output and the desired output of the neuron, and it is used to update the weights.

    Learning signal,  r = di - oi

where oi is the output of the ith neuron and di is the desired response.

    Weight increment,  ΔWij = C × (di - oi) × Xj

where C is a constant, X is the input and j = 1 to n.

    Wnew = Wold + ΔWij

In Perceptron learning, weights are updated only if di ≠ oi.

If di = 1 and oi = -1, then
    ΔWij = C × [di - oi] × Xj = C × [1 - (-1)] × Xj = 2C·Xj

If di = -1 and oi = 1, then
    ΔWij = C × [di - oi] × Xj = C × [-1 - 1] × Xj = -2C·Xj

Hence,  ΔWij = ±2C·Xj
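As written, the update ΔWij = ±2C·Xj translates into a very small training loop. The following is an illustrative sketch (Python, assumed code with made-up training pairs, not taken from the examples that follow) of the bipolar perceptron learning rule.

    def sgn(net):
        return 1 if net >= 0 else -1

    def perceptron_train(W, samples, C=1.0, epochs=10):
        """samples: list of (X, d) pairs with bipolar desired responses d in {-1, +1}."""
        for _ in range(epochs):
            for X, d in samples:
                o = sgn(sum(w * x for w, x in zip(W, X)))
                if o != d:                        # update only when d != o
                    W = [w + C * (d - o) * x      # i.e. W += (+/-)2*C*X
                         for w, x in zip(W, X)]
        return W

    # Made-up, linearly separable training pairs:
    samples = [([1.0, 2.0], 1), ([-1.0, 1.5], -1), ([0.5, -1.0], 1)]
    print(perceptron_train([0.0, 0.0], samples, C=1.0))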
Ex. 6.9.1 : An initial weight vector W1 needs to be trained using three input vectors X1, X2 and X3 and their desired responses. Find the final weight vector using the Perceptron learning rule.

    W1 = [1, -1, 0, 0.5]^T,   X1 = [1, -2, 0, -1]^T,   X2 = [0, 1.5, -0.5, -1]^T,   X3 = [-1, 1, 0.5, -1]^T
    C = 0.1,   d1 = -1,   d2 = -1,   d3 = 1

Soln. : Perceptron learning is applicable only for binary functions, and in this problem the desired responses are given in the 1 and -1 format. Hence we solve this problem using the bipolar binary function.

For bipolar binary:  output = -1 if net < 0;  = 1 if net ≥ 0.

• Step 1 : When X1 is applied

    net1 = W1^T × X1 = [1  -1  0  0.5] [1  -2  0  -1]^T = 2.5
    o1 = f(net1) = 1

    As d1 ≠ o1,
    W2 = W1 + C × [d1 - o1] × X1 = [1, -1, 0, 0.5]^T + 0.1 × (-1 - 1) × [1, -2, 0, -1]^T = [0.8, -0.6, 0, 0.7]^T

• Step 2 : When X2 is applied

    net2 = W2^T × X2 = [0.8  -0.6  0  0.7] [0  1.5  -0.5  -1]^T = -1.6
    o2 = f(net2) = -1

    Since d2 = o2, weight updation is not required: W3 = W2.

• Step 3 : When X3 is applied

    net3 = W3^T × X3 = [0.8  -0.6  0  0.7] [-1  1  0.5  -1]^T = -2.1
    o3 = f(net3) = -1

    As d3 ≠ o3,
    W4 = W3 + C × [d3 - o3] × X3 = [0.8, -0.6, 0, 0.7]^T + 0.1 × (1 - (-1)) × [-1, 1, 0.5, -1]^T = [0.6, -0.4, 0.1, 0.5]^T

Ex. 6.9.2 : Implement the Perceptron learning rule for

    W1 = [0, 1, 0]^T,   X1 = [2, 1, -1]^T,   X2 = [0, -1, -1]^T,   C = 1,   d1 = -1,   d2 = 1

Repeat the training sequence until two correct responses in a row are achieved.

Soln. : Perceptron learning is applicable only for binary functions, and in this problem the desired responses are given in the 1 and -1 format. Hence we solve this problem using the bipolar binary function.

For bipolar binary:  output = -1 if net < 0;  = 1 if net ≥ 0.

• Step 1 : When X1 is applied

    net1 = W1^T × X1 = [0  1  0] [2  1  -1]^T = 1
    o1 = f(net1) = 1,  and d1 ≠ o1

    W2 = W1 + C × [d1 - o1] × X1 = [0, 1, 0]^T + 1 × (-1 - 1) × [2, 1, -1]^T = [-4, -1, 2]^T

• Step 2 : When X2 is applied

    net2 = W2^T × X2 = [-4  -1  2] [0  -1  -1]^T = -1
    o2 = f(net2) = -1 as net < 0,  and d2 ≠ o2

    W3 = W2 + C × [d2 - o2] × X2 = [-4, -1, 2]^T + 1 × (1 - (-1)) × [0, -1, -1]^T = [-4, -3, 0]^T

In the problem it is given that we have to repeat the training until two correct responses in a row are achieved, so we will again apply X1.

• Step 3 : When X1 is applied

    net3 = W3^T × X1 = [-4  -3  0] [2  1  -1]^T = -11
    o3 = f(net3) = -1

    As d1 = o3, weight updation is not required: W4 = W3.

• Step 4 : When X2 is applied

    net4 = W4^T × X2 = [-4  -3  0] [0  -1  -1]^T = 3
    o4 = f(net4) = 1

    As d2 = o4, weight updation is not required: W5 = W4.

Thus we have obtained the correct response twice in a row.

Ex. 6.9.3 : A single-neuron network using f(net) = sgn(net) has been trained using the pairs of Xi, di given below; the weight vector W4 obtained after the three training steps is given. Find the initial weight vector W1.

    C = 1,   d1 = -1,   d2 = 1,   d3 = -1

Soln. : This problem is different from the previous problems; here we have to find the initial weight vector by working backwards from the final weight vector.

• Step 1

    W4 = W3 + ΔW3, which can be written as W3 = W4 - ΔW3.
    As we know, ΔW3 = ±2 × C × X3, and since d3 = -1 we consider the negative sign:
    ΔW3 = -2 × C × X3,  hence  W3 = W4 + 2 × C × X3.

• Step 2

    W3 = W2 + ΔW2, which can be written as W2 = W3 - ΔW2.
    As we know, ΔW2 = ±2 × C × X2, and since d2 = 1 we consider the positive sign:
    ΔW2 = 2 × C × X2,  hence  W2 = W3 - 2 × C × X2.

• Step 3

    W2 = W1 + ΔW1, which can be written as W1 = W2 - ΔW1.
    As we know, ΔW1 = ±2 × C × X1, and since d1 = -1 we consider the negative sign:
    ΔW1 = -2 × C × X1,  hence  W1 = W2 + 2 × C × X1.

Substituting the given W4 and the input vectors X3, X2 and X1 into these three steps, in that order, gives the required initial weight vector W1.
Ex. 6.9.4 : Implement a Perceptron network for the AND function using the concept of the decision boundary.

    P1 = [0 0], t1 = 0;   P2 = [0 1], t2 = 0;   P3 = [1 0], t3 = 0;   P4 = [1 1], t4 = 1

Soln. : First we plot the given points. For the point P4 the target or desired response is given as 1; this is represented as a filled circle, and for the remaining points it is 0, which is represented as an empty circle. After plotting the points, the next step is to draw the decision boundary such that it divides the points into two regions according to their desired responses (i.e. output 1 and 0).

(Figure: the points (0,0), (0,1), (1,0) as empty circles and (1,1) as a filled circle, with a decision boundary separating (1,1) from the other three points.)

As we know, the weight vector is orthogonal to the decision boundary and it points towards the region where the neuron output is 1 (in this case, for P4 = (1, 1) the output is 1). Hence we will take W = [2 2].

To find the value of the bias, pick one point on the decision boundary, say P = [1.5 0]. Substituting these values in the equation W·P + b = 0, we get b = -3. We test the network for our calculated values as follows. Let's take P1 to test:

    O = hardlim(W^T P + b) = hardlim([2  2][0  0]^T + (-3)) = hardlim(-3) = 0

Ex. 6.9.5 : Solve the following classification problem with the Perceptron learning rule. Apply each input vector in order, for as many repetitions as it takes to ensure that the problem is solved. Draw a graph of the problem only after you have found a solution.

    W1 = [0 0],  b1 = 0,  C = 1
    P1 = [2 2]^T, t1 = 0;   P2 = [1 -2]^T, t2 = 1;   P3 = [-2 2]^T, t3 = 0;   P4 = [-1 1]^T, t4 = 1

In this problem the desired responses are given in the 1/0 format, so we will solve it using the unipolar (hard-limit) function.

Soln. :

Iteration 1

• Step 1 : When P1 is applied
    o1 = hardlim(W1 × P1 + b1) = hardlim(0) = 1
    Error, e = t1 - o1 = 0 - 1 = -1
    W2 = W1 + e × P1^T = [0 0] + (-1)[2 2] = [-2 -2]
    b2 = b1 + e = 0 + (-1) = -1

• Step 2 : When P2 is applied
    o2 = hardlim(W2 × P2 + b2) = hardlim(1) = 1
    Error, e = t2 - o2 = 1 - 1 = 0
    Since the error is 0, weight and bias updation is not required: W3 = W2, b3 = b2.

• Step 3 : When P3 is applied
    o3 = hardlim(W3 × P3 + b3) = hardlim(-1) = 0
    Error, e = t3 - o3 = 0 - 0 = 0
    Since the error is 0, weight and bias updation is not required: W4 = W3, b4 = b3.

• Step 4 : When P4 is applied
    o4 = hardlim(W4 × P4 + b4) = hardlim(-1) = 0
    Error, e = t4 - o4 = 1 - 0 = 1
    W5 = W4 + e × P4^T = [-2 -2] + 1 × [-1 1] = [-3 -1]
    b5 = b4 + e = -1 + 1 = 0

We have to repeat the iterations until all input vectors are correctly classified (i.e. error = 0 for all the input vectors).

Iteration 2

• Step 5 : When P1 is applied
    o5 = hardlim(W5 × P1 + b5) = hardlim(-8) = 0
    Error, e = t1 - o5 = 0 - 0 = 0
    Since the error is 0, weight and bias updation is not required: W6 = W5, b6 = b5.

• Step 6 : When P2 is applied
    o6 = hardlim(W6 × P2 + b6) = hardlim(-1) = 0
    Error, e = t2 - o6 = 1 - 0 = 1
    W7 = W6 + e × P2^T = [-3 -1] + 1 × [1 -2] = [-2 -3]
    b7 = b6 + e = 0 + 1 = 1

• Step 7 : When P3 is applied
    o7 = hardlim(W7 × P3 + b7) = hardlim(-1) = 0
    Error, e = t3 - o7 = 0
    Since the error is 0, weight and bias updation is not required: W8 = W7, b8 = b7.

• Step 8 : When P4 is applied
    o8 = hardlim(W8 × P4 + b8) = hardlim(0) = 1
    Error, e = t4 - o8 = 1 - 1 = 0
    Since the error is 0, weight and bias updation is not required: W9 = W8, b9 = b8.

In this iteration e = 0 for P1, P3 and P4 but not for P2; hence we will again go for Iteration 3.

Iteration 3

• Step 9 : When P1 is applied
    o9 = hardlim(W9 × P1 + b9) = hardlim(-9) = 0
    Error, e = t1 - o9 = 0
    Since the error is 0, weight and bias updation is not required: W10 = W9, b10 = b9.

• Step 10 : When P2 is applied
    o10 = hardlim(W10 × P2 + b10) = hardlim(5) = 1
    Error, e = t2 - o10 = 0
    Since the error is 0, weight and bias updation is not required: W11 = W10, b11 = b10.

In this iteration e = 0 for all the input vectors; thus we can say that we have found a solution.

The final weight vector and bias value are as follows: W = [-2 -3], b = 1.

Now we substitute these values in the following equation to find the equation of the decision boundary,

    W11 × P1 + W12 × P2 + b = 0
    -2·P1 - 3·P2 + 1 = 0

Now, to draw the decision boundary, we need to find the intercepting points.
By substituting P1 = 0 we get P2 = 0.33, i.e. point 1 = (0, 0.33).
By substituting P2 = 0 we get P1 = 0.5, i.e. point 2 = (0.5, 0).

(Figure: the four input points and the decision boundary -2·P1 - 3·P2 + 1 = 0 passing through (0, 0.33) and (0.5, 0).)

From the above diagram we can say that the decision boundary classifies the input vectors into two classes, corresponding to output 1 and output 0.
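Ex. 6.9.5 can be replayed in a few lines of code. The sketch below (illustrative Python, not from the textbook) runs the same hardlim perceptron rule, W ← W + e·P and b ← b + e, over the four training pairs until a full pass produces no error; it ends with W = [-2, -3] and b = 1, matching the hand calculation.

    def hardlim(net):
        return 1 if net >= 0 else 0

    P = [([2, 2], 0), ([1, -2], 1), ([-2, 2], 0), ([-1, 1], 1)]
    W, b = [0.0, 0.0], 0.0

    while True:
        total_error = 0
        for p, t in P:
            e = t - hardlim(sum(w * x for w, x in zip(W, p)) + b)
            W = [w + e * x for w, x in zip(W, p)]    # W_new = W_old + e*P
            b += e                                   # b_new = b_old + e
            total_error += abs(e)
        if total_error == 0:                         # stop when a whole pass is error-free
            break

    print(W, b)   # expected: [-2.0, -3.0] 1.0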
Machine Leaming (SPPU-SemS-IT)
(Introduction to Artificial Neural Network)... Page no
01 = - I
As O 1 = Yl weight updalion is not rcquirrd. ~ 6• t O SIGMOID NEURON
WI =0.5, W2=0.8.b=O • Sigmoh.J Neuron can be used for both bi11arv
For the second row C'lr1.i·.i·/Jlc-at1r111 1111(1 rewr.vsltm problems. The outpul wiil
he [I vuluc hclwcen 'O' nnd '/' and based on the value
Nel2 = (-IX 0.5) + (I x 0.8) + 0 = 0.J wr cnn use ii lo indic;1te whelhcr ii belongs to 'class O'
02 = I or 'c/u.1·.1· I' for binury classificarion. Por e,cample. the
As 02 -I-Y2 output value of 0.75 will belong to 'cla,fS I' whereas a
WI =Wl(old)+(Y2-02)(X I) value of 0.2 will belong to 'class o•. This clai:slfication
is based on the tliresliold.
= 0.5 + (- I - l) (- 1)
= 2.5
• In case of Regression, given the inputs xi, x2, x3, .. xn
we are trying to regress a probability using the Sigmoid
W2 = W2 (old)+ (Y2-02) (X2) function. We can relate it like giving a rating. wbere the
=Q.8+(-1-1)(1) =-1.2 values lie between 'O' to •/ •. ·
B = b (old) + (Y2-02) • In con1ras1 to Perceplron 0/1 function, 'S-shaped'
=0+(- I -1) =-2 Sigmoid family of functions gives us a smoother curve
- closer to bow humans mnke decisions. Given X and
For the third row Y. where X is a high dimensional belonging to
Net3 = (l X 2.5) + (- I X (- 1.2)) + (- 2) = l.7 Real-valued inputs. The approximate relationship
between Y and X is given by below mentioned
03 = I
Sigmoid Function.
As 03 Y3 f.
• In case of a I-dimensional input X. the sigmoid
Wl = Wl (old)+ (Y3-03) (Xl)
function that best describes the relationship between
= 2.5 + (-1 - l)(l) = 0.5 input and output is given by
W2 = W2 (old)+ (Y3 - 03) (X2) 1
Yp = -{wx+b)
= - 1.2 + (- 1- 1) (- 1) = 0.8

~
1 +e

b = b (old) + (Y3 - 03) ln case of a 2-dimensional input (which contains



2 input features), the sigmoid function that best
= - 2 + (- I - t)::: - 4 describes the input-output relation is given by
For the fourth row I
Net4 = (lx0.5)+(Ix0.8)+(-4)=-2.7 Yp = l+e-(w1 x1 +w1 x2 +b)

04 = -1 • In case of a high-dimensional input with many features.

AsO4 # Y4 the sigmoid functton is given by


WI= Wl(old)+(Y4-O4)(Xl) t
= 0.5+(1-(-l))(l) Yp= J+e-{wx++b)
• Unlike Perceptron and MP Neuron (binary output). the
= 25
W2 (old)+ (Y4-04) (X2) sigmoid function output hes between md ·1·. ·o·
W2 = 0.8 + (I - (-1)) (1) Another advantage of using Sigmoid Neuron is that it
:::
2.8 can deal with data that are not linearly separable. A
:::
b (old)+ (Y4 - 04) sigmoid neuron cannot completely separate positive
B ::: _ + (l _ (- 1)) points from negative points, but it can give a graded
::: 4 output which allows better interpretation. VNn'
= -2
28
~H=e=nce~th=e~fi=m~al~v-al_u_e_s_ar_e_w_1_==_2._s_.w_z_=__·_,b_=__
2
- _____ rt~----.:-.-~===~==-
"1rech•Neo Publications.. A SACHIN SHAH Venture
WI

(New Syllabus w.e.f academic year 21-22) (PS-46)


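A sigmoid neuron is simply the weighted sum from before passed through 1/(1 + e^-z). The sketch below (illustrative Python; the weights, bias and inputs are made-up values) computes Yp for the high-dimensional form given above and applies a 0.5 threshold for binary classification.

    import math

    def sigmoid_neuron(x, w, b):
        """Yp = 1 / (1 + exp(-(sum_i w_i*x_i + b)))."""
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1.0 / (1.0 + math.exp(-z))

    # Assumed example values
    w, b = [0.9, -0.4, 0.2], -0.1
    for x in ([1.0, 0.5, 2.0], [-1.5, 1.0, 0.0]):
        yp = sigmoid_neuron(x, w, b)
        print(x, "->", round(yp, 3), "class", 1 if yp >= 0.5 else 0)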
• By changing the values of 'w' and 'b', we can get different types of sigmoid functions.

• Let us see how the loss function is calculated:

    x1    x2    y      yp
    1     1     0.5    0.6
    2     1     0.8    0.7
    1     2     0.2    0.1
    2     2     0.9    0.5

    Loss = Σ (y - yp)²  over the 4 points  = 0.19

• Basically, Sigmoid is a family of functions where, by changing the values of 'w' and 'b', we can get various sigmoid functions.

• By changing the values of 'w' and 'b' we can change the slope of the curve and the displacement of the curve. We are given the input and output, and we have chosen to approximate the relation using a sigmoid function with parameters 'w' and 'b'.

• Using the learning algorithm we have to find the values of the parameters 'w' and 'b' such that, when we plug the input into the sigmoid function equation, we get a predicted output close to the true output. So we want to find the values of 'w' and 'b' such that the sigmoid function passes through the input-output points.

• If that happens, then we know that whenever we plug in a value from the training data, the output the sigmoid function gives is very close to the true output; in that case the values of 'w' and 'b' are learned correctly, so that the loss on the training data is minimized. This is the objective of our learning algorithm.

• How do we go about finding the values of 'w' and 'b'? Start with some random values of 'w' and 'b', as we are clueless about these values. Go over all the training points and compute the loss.

• Try to update the values of 'w' and 'b' such that we start from a random sigmoid function and slowly approach our desired sigmoid function, for which the loss is going to be minimal.

• This is a 'guess method', which has the disadvantage that the values of 'w' and 'b' may be reached with the loss increasing at some point. So this method doesn't always work. We need a principled way to find the values of 'w' and 'b'.

• So the goal of the learning algorithm is to find the values of 'w' and 'b' such that, from all the possible sigmoid functions, we find the one sigmoid function that is closest to the true relation between the input and output.

• The Sigmoid Neuron learning algorithm is to some extent similar to our Perceptron model learning algorithm, and is driven by the loss function. The algorithm goes this way:
  1. Initialize the values of 'w' and 'b' randomly.
  2. Calculate the predicted output values for the inputs using the sigmoid function with the randomly initialized 'w' and 'b'.
  3. Calculate the loss using the loss function.
  4. Update the values of 'w' and 'b' such that the loss steadily decreases, and repeat the above steps till the loss is '0' (or an acceptable minimal value), or the stated number of iterations occurs, or the change in loss between two consecutive iterations is negligible.

• If we simply guess the values of 'w' and 'b' at random, the loss may not steadily decrease; it may decrease and increase randomly. The error surface is obtained by plotting 'w' and 'b' against the loss for all possible combinations of 'w' and 'b'.

• Hence we need a principled way, based on the loss function, to find the values of 'w' and 'b' for which the predicted values of the output are close to the true output.
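The 'principled way' the text calls for is usually gradient descent on the squared-error loss. The sketch below (illustrative Python, not from the textbook) trains a one-input sigmoid neuron y = 1/(1 + e^-(wx+b)) by following the negative gradient of L = Σ(y - yp)²; the toy data and the learning rate are assumptions.

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Toy 1-D data (assumed values): inputs x and true outputs y in (0, 1)
    data = [(0.5, 0.2), (2.5, 0.9), (1.0, 0.4), (3.0, 0.95)]

    w, b, eta = 0.0, 0.0, 0.5           # random-ish start and a chosen learning rate
    for epoch in range(2000):
        dw = db = 0.0
        for x, y in data:
            yp = sigmoid(w * x + b)
            grad = 2 * (yp - y) * yp * (1 - yp)   # dL/dz for one point
            dw += grad * x
            db += grad
        w -= eta * dw                   # move against the gradient so the loss decreases
        b -= eta * db

    loss = sum((y - sigmoid(w * x + b)) ** 2 for x, y in data)
    print("w =", round(w, 3), "b =", round(b, 3), "loss =", round(loss, 4))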
6.11 MULTILAYER PERCEPTRON MODEL

6.11.1 Introduction

• A multilayer perceptron (MLP) is a type of feedforward artificial neural network. Most of the time the name multilayer perceptron is used to refer to any feedforward artificial neural network.

• Sometimes this term is strictly used to refer to networks which have multiple layers of perceptrons with threshold activation.

• In a multilayer perceptron network there are at least 3 layers of nodes: an input layer, a hidden layer and an output layer.

• Each node in a multilayer perceptron represents a neuron that uses a non-linear activation function, except for the input nodes. A multilayer perceptron network uses a supervised learning technique known as backpropagation for training.

• As we have seen earlier, a perceptron can only produce linear decision boundaries. But there are many interesting, real-life problems, such as speech recognition, image classification, text summarization, object detection etc., which are not linearly separable.

• These problems require non-linear boundaries. We can solve this problem by using a more complex network with more than one perceptron.

• A perceptron network learns by updating the weights until the prediction is satisfactory.

• Here we need multiple layers, and all the layers should be fully connected to each other, so that the input signal propagates through the network in a forward direction on a layer-by-layer basis. These neural networks are commonly referred to as a Multilayer Perceptron.

6.11.2 Learning Parameters - Weight and Bias

In the above sections we have seen how to update the weight and bias in the case of a single perceptron. Now we will see how to update the weight and bias for a multilayer perceptron. To understand this we will see the Delta learning rule and the Generalised Delta learning rule.

1. Delta Learning Rule

Fig. 6.11.1 : Delta Learning Rule

• Delta learning is a supervised type of learning, as the desired response is present. It is applicable only for continuous types of neurons (activation functions).

• The learning signal is the product of the difference between the actual output and desired output of the neuron and the derivative of the output.

• Learning signal,  r = [di - oi] · f'(neti)

  where oi is the output of the ith neuron, di is the desired response and f'(neti) is the derivative of the output.

• For unipolar continuous,  f'(neti) = O(1 - O)

• For bipolar continuous,   f'(neti) = ½(1 - O²)

• Weight increment,  ΔWij = C × [di - oi] × f'(neti) × Xj

  where C is a constant, X is the input and j = 1 to n.

    Wnew = Wold + ΔWij

Ex. 6.11.1 : Prove the following:
(a) For unipolar continuous, f'(neti) = O(1 - O)
(b) For bipolar continuous,  f'(neti) = ½(1 - O²)

(a) For unipolar continuous, f'(neti) = O(1 - O)

• As we know, for the unipolar continuous function

    f(net) = 1 / (1 + e^-x)

• We differentiate the above equation with respect to x,

    f'(net) = [ (1 + e^-x)·d/dx(1) - 1·d/dx(1 + e^-x) ] / (1 + e^-x)²
            = e^-x / (1 + e^-x)²

• We may write the above equation as

            = (1 + e^-x - 1) / (1 + e^-x)²
            = (1 + e^-x)/(1 + e^-x)² - 1/(1 + e^-x)²
            = 1/(1 + e^-x) - 1/(1 + e^-x)²
            = f(net) - f(net)²
            = O - O²  =  O(1 - O)

(b) For bipolar continuous, f'(neti) = ½(1 - O²)

• We know that for the bipolar continuous function

    f(net) = 2/(1 + e^-x) - 1

• When we differentiate the above equation with respect to x we get

    f'(net) = 2e^-x / (1 + e^-x)²                                   ...(1)

• We now have to show that the R.H.S., ½(1 - O²), equals the same expression:

    ½(1 - O²) = ½ [ 1 - (2/(1 + e^-x) - 1)² ]
              = ½ [ 1 - ( 4/(1 + e^-x)² - 4/(1 + e^-x) + 1 ) ]
              = ½ [ -4/(1 + e^-x)² + 4/(1 + e^-x) ]
              = -2/(1 + e^-x)² + 2/(1 + e^-x)

• Now we take the L.C.M.,

              = ( -2 + 2(1 + e^-x) ) / (1 + e^-x)²
              = 2e^-x / (1 + e^-x)²                                  ...(2)

• From Equations (1) and (2), L.H.S. = R.H.S.
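The delta-rule update ΔWij = C·(di - oi)·f'(neti)·Xj can be exercised on a single continuous neuron. The sketch below (illustrative Python; the training pair and constants are assumed values) performs repeated delta-rule updates for a unipolar continuous neuron, using f'(net) = O(1 - O) as proved above.

    import math

    def f(net):                      # unipolar continuous activation
        return 1.0 / (1.0 + math.exp(-net))

    def delta_rule_step(W, X, d, C=0.5):
        """One delta-rule update for a single continuous neuron."""
        o = f(sum(w * x for w, x in zip(W, X)))
        r = (d - o) * o * (1 - o)                 # learning signal r = (d - o) * f'(net)
        return [w + C * r * x for w, x in zip(W, X)]

    W = [0.5, -0.3]                  # assumed initial weights
    X, d = [1.0, 2.0], 0.9           # assumed training pair
    for _ in range(5):
        W = delta_rule_step(W, X, d)
        print([round(w, 4) for w in W])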
2. Generalised Delta Learning Rule

• We will derive a general expression for the weight increment ΔVji for any layer of neurons that is not an output layer.

Fig. 6.11.2 : Layer i (input) → Layer j (hidden) → Layer k (output), with signals Z, Y, O and weight matrices V (input-to-hidden) and W (hidden-to-output)

• The above is a simplified diagram of the multilayer perceptron network, which represents the three layers: input (i), hidden (j) and output (k).

• At the input layer the input Z is applied, Y represents the output of the hidden layer and O represents the output of the output layer. V is the weight matrix present between the input and hidden layers and W represents the weight matrix present between the hidden and output layers.

• According to the negative gradient descent formula, for the hidden layer

    ΔVji = -η · ∂E/∂Vji                                             ...(6.11.1)

• We may write the term ∂E/∂Vji as

    ∂E/∂Vji = ∂E/∂netj × ∂netj/∂Vji                                 ...(6.11.2)

• -(∂E/∂netj) is the error signal for the hidden layer, which can be represented as δYj, and ∂netj/∂Vji represents the input applied at the input layer, i.e. Zi.

• By substituting these values in Equation (6.11.2) and Equation (6.11.1) we get

    ΔVji = η × δYj × Zi                                             ...(6.11.3)

• Since we are saying that δYj = -∂E/∂netj, we can write this term as

    δYj = -∂E/∂Yj × ∂Yj/∂netj                                       ...(6.11.4)

• In the above equation the second term is nothing but fj'(netj), and the first term is the differentiation of the error with respect to Yj.

• As we know, the error is given as E = ½ Σk (dk - ok)², so

    ∂E/∂Yj = ∂/∂Yj [ ½ Σk (dk - f(netk(y)))² ]
           = -Σk (dk - ok) × ∂/∂Yj f(netk(y))
           = -Σk (dk - ok) × f'(netk) × ∂netk/∂Yj

• We can write (dk - ok) × f'(netk) = δok, which is the error present in the output layer, and ∂netk/∂Yj = Wkj, which is the weight present between the hidden layer and the output layer.

• Thus we get

    ∂E/∂Yj = -Σk δok · Wkj

• By substituting this in Equation (6.11.4) we get

    δYj = fj'(netj) × (Σk δok · Wkj)

• By substituting this in Equation (6.11.3) we get

    ΔVji = η × Zi × fj'(netj) × (Σk δok · Wkj)

  where fj'(netj) = ½(1 - O²) for the bipolar function and O(1 - O) for the unipolar function. This is the Generalised Delta learning rule.
6.11.3 Error Back Propagation Training Algorithm

In error backpropagation, the network learns during a training phase. The steps followed during learning are :

1. First the input is applied to the input layer to calculate the output of the hidden layer. The output of the hidden layer becomes the input of the next layer. Finally, the output of the output layer is calculated.

2. The desired and the actual output at the output layer are compared with each other and an error signal is generated.

3. The error of the output layer is back-propagated to the hidden layer so that the weights connected in each layer of the network can be properly adjusted.

4. When the backpropagation network is trained for the correct classification of the training data, testing data is applied to the network in order to check whether unseen patterns are correctly classified or not.

Fig. 6.11.3 : Error Back Propagation Training Algorithm (flowchart : begin training → apply the input and compute the outputs → calculate the error signal of the output layer, δo = (do - o)·f'(o) → adjust the weights of the output layer, W = W + η·δo·Y → calculate the error signal of the hidden layer → adjust the weights of the hidden layer, V = V + η·δY·Z → more patterns? → end training)

6.11.4 Learning Factors

If the network is not trained, the possible reasons are as follows; the solutions to avoid the respective problems are also given.

1. Initial weights

   If they are selected in such a way that the network requires a larger number of training cycles, training is slow. Adjusting the initial weights will train the network properly.

2. Fixing up the desired output

   Based on the input pattern, if we do not decide the desired output properly the network will not get trained.

3. Non-separable patterns

   If the features of two classes of patterns are similar, the network will not be trained properly. The features should be distinguishable; the Euclidian distance between two clusters must be properly specified, so that identifying the patterns will be easy. Increasing the number of hidden layers may improve the performance.

4. Learning constant

   The learning constant should be varied; when the learning rate increases, the speed of training increases. Values from 10^-3 to 10 have proved to be successful for the learning rate. The learning rate has to be increased for some training cycles and then it has to be reduced.

5. Momentum

   Momentum gives a push to training. Momentum is the previous weight adjustment,

    ΔW(t) = -η ΔE(t) + α ΔW(t - 1)

   where the α value varies from 0.1 to 0.8.

6. Steepness of the activation function

   It is advisable to keep λ at a standard value of 1. The choice and shape of the activation function strongly affect the speed of network training.
Ex. 6.11.2 : Classify the input of the following network using the unipolar continuous function and EBPTA.

    W = [-1  1],  W0 = -1,   V = [ 2  1
                                   1  2 ],   V0 = [0  -1],   t = 0.9,   η = 0.3

(Figure: a two-layer network with inputs Z1 = 0.6 and Z2 = 0; hidden node Y1 receives weights 2 and 1 with bias 0, hidden node Y2 receives weights 1 and 2 with bias -1; the output node receives Y1 and Y2 through weights -1 and 1 with bias -1.)

Soln. :

Feed forward stage

First, we will calculate the output of the hidden layer as,

    Y1 = f(0.6 × 2 + 1 × 0 + 0)  = f(1.2)   = 0.76
    Y2 = f(0.6 × 1 + 2 × 0 - 1)  = f(-0.4)  = 0.4

Now we will calculate the output of the output layer as,

    O = f(0.76 × (-1) + 0.4 × 1 + (-1)) = f(-1.36) = 0.2

Back propagation of error

First, we will calculate the error of the output layer as,

    δo = (t - O) f'(O) = (0.9 - 0.2) × 0.2 × (1 - 0.2) = 0.112

Now we will calculate the error of the hidden layer as,

    δY1 = δo × (-1) × f'(Y1) = 0.112 × (-1) × 0.76 × (1 - 0.76) = -0.02
    δY2 = δo × (1)  × f'(Y2) = 0.112 × (1)  × 0.4  × (1 - 0.4)  = 0.026

Updation of weights and bias

First we update the weights between the hidden layer and the output layer as,

    W11 = W11 + η × δo × Y1 = -1 + 0.3 × 0.112 × 0.76 = -0.975
    W12 = W12 + η × δo × Y2 =  1 + 0.3 × 0.112 × 0.4  =  1.013

The bias is updated as,

    W0 = W0 + η × δo = -1 + 0.3 × 0.112 = -0.966

Now we update the weights between the input layer and the hidden layer as,

    V11 = V11 + η × δY1 × Z1 = 2 + 0.3 × (-0.02) × 0.6 = 1.996
    V12 = V12 + η × δY1 × Z2 = 1 + 0.3 × (-0.02) × 0   = 1
    V21 = V21 + η × δY2 × Z1 = 1 + 0.3 × 0.026 × 0.6   = 1.005
    V22 = V22 + η × δY2 × Z2 = 2 + 0.3 × 0.026 × 0     = 2

The biases are updated as,

    V01 = V01 + η × δY1 = 0 + 0.3 × (-0.02) = -0.006
    V02 = V02 + η × δY2 = -1 + 0.3 × 0.026  = -0.992
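One forward pass and one EBPTA update of Ex. 6.11.2 can be checked with the short sketch below (illustrative Python, not from the textbook). It uses the unipolar continuous activation and the same η, weights and inputs as the example; the printed values should agree with the hand calculation up to rounding.

    import math

    f = lambda net: 1.0 / (1.0 + math.exp(-net))   # unipolar continuous

    # Network of Ex. 6.11.2
    Z = [0.6, 0.0]
    V, V0 = [[2.0, 1.0], [1.0, 2.0]], [0.0, -1.0]  # input -> hidden
    W, W0 = [-1.0, 1.0], -1.0                      # hidden -> output
    t, eta = 0.9, 0.3

    # Feed forward
    Y = [f(sum(v * z for v, z in zip(V[j], Z)) + V0[j]) for j in range(2)]
    O = f(sum(w * y for w, y in zip(W, Y)) + W0)

    # Back propagation of error
    d_o = (t - O) * O * (1 - O)                               # output-layer error signal
    d_Y = [d_o * W[j] * Y[j] * (1 - Y[j]) for j in range(2)]  # hidden-layer error signals

    # Weight / bias updates
    W  = [W[j] + eta * d_o * Y[j] for j in range(2)]
    W0 += eta * d_o
    V  = [[V[j][i] + eta * d_Y[j] * Z[i] for i in range(2)] for j in range(2)]
    V0 = [V0[j] + eta * d_Y[j] for j in range(2)]

    print("Y =", Y, "O =", O)
    print("W =", W, "W0 =", W0)
    print("V =", V, "V0 =", V0)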
6.12 LOSS FUNCTION - MEAN SQUARE ERROR

• The mean squared error is calculated as the average of the squared differences between the estimated and actual values.

• The result is always positive regardless of the sign of the predicted and actual values, and a perfect value is 0.0.

• The squaring means that larger mistakes result in more error than smaller mistakes, meaning that the model is punished for making larger mistakes.

• The MSE function is expressed as

    E_MSE = (1/N) Σ_{i=1}^{N} (actual_i - predicted_i)²

• Let us see an example of car price prediction to understand the concept of mean squared error.

    Car Number   Actual Price (in lakhs)   Predicted Price (in lakhs)   Error   Squared Error
        1                 10                          12                 -2           4
        2                  8                           8                  0           0
        3                 20                          24                 -4          16
        4                 14                          13                  1           1
                                                                         Sum         21

    MSE = 21 / 4 = 5.25

6.13 CONCEPT OF DEEP LEARNING

UQ. Explain deep learning with applications. (SPPU - Jan. 19, 8 Marks)
UQ. Explain deep learning. What are the challenges in it? (SPPU - June 18, 7 Marks)
UQ. Write a note on Deep learning. (SPPU - Dec. 16, June 16, 8 Marks)

6.13.1 Introduction to Deep Learning

• Deep learning is a branch of machine learning which is completely based on artificial neural networks; since a neural network mimics the human brain, deep learning is also a kind of mimicking of the human brain. In deep learning, we do not need to explicitly program everything.

• The concept of deep learning is not new; it has been around for decades. It is in hype nowadays because earlier we did not have that much processing power or that much data. As the processing power increased exponentially over the last 20 years, deep learning and machine learning came into the picture.

6.13.2 Architectures of Deep Learning

(1) Deep Neural Network : It is a neural network with a certain level of complexity, having multiple hidden layers in between the input and output layers. Deep neural networks are capable of modeling and processing non-linear relationships.

(2) Deep Belief Network (DBN) : It is a class of Deep Neural Network; it is a multi-layer belief network. Steps for training a DBN :
    (a) Learn a layer of features from the visible units using the Contrastive Divergence algorithm.
    (b) Treat the activations of previously trained features as visible units and then learn features of features.
    (c) Finally, the whole DBN is trained when the learning for the final hidden layer is achieved.

(3) Recurrent Neural Network : Allows for parallel and sequential computation. It is similar to the human brain (a large feedback network of connected neurons). Recurrent networks are able to remember important things about the input they received, and hence this enables them to be more precise.

6.13.3 Working

• First, we need to identify the actual problem in order to get the right solution, and it should be well understood; the feasibility of Deep Learning should also be checked (whether the problem fits Deep Learning or not). Second, we need to identify the relevant data, which should correspond to the actual problem and should be prepared accordingly. Third, choose the Deep Learning algorithm appropriately. Fourth, the algorithm should be used while training the dataset. Fifth, final testing should be done on the dataset.

    Step 1 : Understand the problem and check its suitability for deep learning.
    Step 2 : Identify relevant data and prepare it.
    Step 3 : Select the Deep Learning algorithm.
    Step 4 : Train the algorithm.
    Step 5 : Test the performance of the model.

6.13.4 Applications of Deep Learning

(1) Automatic Text Generation : A corpus of text is learned, and from this model new text is generated, word-by-word or character-by-character. The model is then capable of learning how to spell, punctuate and form sentences, or it may even capture the style.

(2) Healthcare : Helps in diagnosing various diseases and treating them.

(3) Automatic Machine Translation : Certain words, sentences or phrases in one language are transformed into another language (Deep Learning is achieving top results in the areas of text and images).

(4) Image Recognition : Recognizes and identifies people and objects in images, as well as understanding content and context. This area is already being used in Gaming, Retail, Tourism, etc.

(5) Predicting Earthquakes : Teaches a computer to perform viscoelastic computations, which are used in predicting earthquakes.

Chapter Ends...