The Multilayer Perceptron
The artificial neuron was first proposed in 1943 by McCulloch and Pitts. It consists of a summing function with an internal threshold and "weighted" inputs, as shown below.
"or a neuron recei#ing n inputs, each input xi $ i ranging from 1 to n% is weighted by multiplying it with a weight wi . The sum of the wixi products gi#es the net acti#ation of the neuron. This acti#ation #alue is sub&ected to a transfer function to produce the neuron's output. The weight #alue of the connection or lin( carrying signals from a neuron i to a neuron j is termed wij..
[Figure: a neuron j receiving weighted inputs over links with weights wij]
Transfer functions

One of the design issues for ANNs is the type of transfer function used to compute the output of a node from its net activation. Among the popular transfer functions are:

- Step function
- Signum function
- Sigmoid function
- Hyperbolic tangent function

In the step function, the neuron produces an output only when its net activation reaches a minimum value T, known as the threshold. For a binary neuron i, whose output is a 0 or 1 value, the step function can be summarised as:

outputi = 1 if activationi >= T
outputi = 0 otherwise
Another type of transfer function is the signum function:

outputi = +1 if activationi > 0
outputi =  0 if activationi = 0
outputi = -1 if activationi < 0

The sigmoid transfer function produces a continuous value in the range 0 to 1. It has the form:

outputi = 1 / (1 + e^(-gain . activationi))
The parameter gain is determined by the system designer. It affects the slope of the transfer function around zero. The multilayer perceptron uses the sigmoid as its transfer function. A variant of the sigmoid transfer function is the hyperbolic tangent function. It has the form:
outputi = (e^u - e^(-u)) / (e^u + e^(-u))

where u = gain . activationi. This function has a shape similar to the sigmoid (shaped like an S), with the difference that the value of outputi ranges between -1 and 1.
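These four transfer functions can be sketched in a few lines of Python (a minimal illustration; the function names are ours, and gain and the threshold T are the parameters described above):

import math

def step(activation, T=0.0):
    return 1 if activation >= T else 0

def signum(activation):
    return (activation > 0) - (activation < 0)  # +1, 0 or -1

def sigmoid(activation, gain=1.0):
    return 1.0 / (1.0 + math.exp(-gain * activation))

def hyperbolic_tangent(activation, gain=1.0):
    u = gain * activation
    return (math.exp(u) - math.exp(-u)) / (math.exp(u) + math.exp(-u))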
[Figure: plots of the step, signum, sigmoid and hyperbolic tangent transfer functions against activationi]
The error function Ep is defined to be proportional to the square of the difference between the target tpj and the output opj:

Ep = 1/2 Σj (tpj - opj)²     (1)

The activation of each unit j, for pattern p, can be written as:

netpj = Σi wij opi     (2)

The output from each unit j is determined by the non-linear transfer function fj:

opj = fj(netpj)

We assume fj to be the sigmoid function, f(net) = 1 / (1 + e^(-k.net)), where k is a positive constant that controls the "spread" of the function.

The delta rule implements weight changes that follow the path of steepest descent on a surface in weight space. The height of any point on this surface is equal to the error measure Ep. This can be shown by demonstrating that the derivative of the error measure with respect to each weight is proportional to the weight change dictated by the delta rule, with a negative constant of proportionality, i.e.:

Δpwij ∝ -∂Ep/∂wij
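One step worth making explicit: this sigmoid has the closed-form derivative f'(net) = k.f(net).(1 - f(net)), which is where the k.opj.(1 - opj) factors in the error terms below come from. A quick numerical check (the values here are arbitrary):

import math

k = 2.0
def f(net):
    return 1.0 / (1.0 + math.exp(-k * net))

net, h = 0.3, 1e-6
numeric = (f(net + h) - f(net - h)) / (2 * h)  # central-difference estimate
closed = k * f(net) * (1 - f(net))             # k.f.(1 - f)
print(abs(numeric - closed) < 1e-9)            # True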
The multilayer perceptron learning algorithm using the generalised delta rule
1. Initialise weights (to small random values) and the transfer function.
2. Present an input pattern.
3. Adjust the weights, starting from the output layer and working backwards:

wij(t+1) = wij(t) + η δpj opi

wij(t) represents the weight from node i to node j at time t, η is a gain term, and δpj is an error term for pattern p on node j.

For output layer units:

δpj = k opj (1 - opj)(tpj - opj)

For hidden layer units:

δpj = k opj (1 - opj) Σk δpk wjk

where the sum is over the k nodes in the following layer.

The learning rule in a multilayer perceptron is not guaranteed to produce convergence, and it is possible for the network to fall into a situation (the so-called local minima) in which it is unable to learn the correct output.
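A minimal sketch of steps 1 to 3 in Python for a single hidden layer (illustrative only: the layer sizes, eta, k and the weight ranges are assumed values, and a real run would present many different patterns over many iterations):

import math, random

def sigmoid(net, k=1.0):
    return 1.0 / (1.0 + math.exp(-k * net))

def train_pattern(x, target, w_ih, w_ho, eta=0.5, k=1.0):
    # Forward pass: input -> hidden -> output
    hidden = [sigmoid(sum(xi * w_ih[i][j] for i, xi in enumerate(x)), k)
              for j in range(len(w_ih[0]))]
    output = [sigmoid(sum(hj * w_ho[j][m] for j, hj in enumerate(hidden)), k)
              for m in range(len(w_ho[0]))]
    # Error terms: output layer first, then hidden (working backwards)
    delta_o = [k * o * (1 - o) * (t - o) for o, t in zip(output, target)]
    delta_h = [k * h * (1 - h) *
               sum(delta_o[m] * w_ho[j][m] for m in range(len(delta_o)))
               for j, h in enumerate(hidden)]
    # Weight updates: wij(t+1) = wij(t) + eta . delta_pj . o_pi
    for j in range(len(hidden)):
        for m in range(len(delta_o)):
            w_ho[j][m] += eta * delta_o[m] * hidden[j]
    for i in range(len(x)):
        for j in range(len(hidden)):
            w_ih[i][j] += eta * delta_h[j] * x[i]

# 1. Initialise weights to small random values (2 inputs, 3 hidden, 1 output)
random.seed(0)
w_ih = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_ho = [[random.uniform(-0.5, 0.5)] for _ in range(3)]
# 2-3. Present an input pattern and adjust the weights
for _ in range(1000):
    train_pattern([1.0, 0.0], [1.0], w_ih, w_ho)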
The local minima problem can be countered by:

- Lowering the gain term progressively
- Addition of more nodes for better representation of patterns
- Introduction of a momentum term, which determines the effect of past weight changes on the current direction of movement in weight space:

wij(t+1) = wij(t) + η δpj opi + α (wij(t) - wij(t-1))

where α is the momentum term, 0 < α < 1.
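As an illustrative fragment (alpha, prev_dw and the other names are assumptions extending the earlier sketch, not part of the notes), the momentum version of the update for one weight layer could look like:

def momentum_update(w, delta, out, prev_dw, eta=0.5, alpha=0.9):
    # w[j][m]: weight from node j to node m; prev_dw holds the last changes
    for j in range(len(w)):
        for m in range(len(w[j])):
            dw = eta * delta[m] * out[j] + alpha * prev_dw[j][m]
            w[j][m] += dw
            prev_dw[j][m] = dw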
"ault Tolerance +eural networ(s are highly fault tolerant. This characteristic is also (nown as !graceful degradation!. Cecause of its distributed nature, a neural networ( (eeps on wor(ing e#en when a significant fraction of its neurons and interconnections fail. *lso, relearning after damage can be relati#ely =uic(.
"or many of the applications of neural networ(s, the underlying principle is that of pattern recognition. Target identification from sonar echoes has been de#eloped. 9i#en only a day of training, the net produced 100G correct identification of the target, compared to 93G scored by a Cayesian classifier. There are many commercial applications of networ(s in character recognition. )ne such system performs signature #erification on ban( che=ues. +etwor(s ha#e been applied to the problems of aircraft identification, and to terrain matching for automatic na#igation.
The multilayer perceptron also has its problems:

1. Large number of iterations required for learning; not suitable for real-time learning.
2. No guaranteed solution. Remedies such as the "momentum term" add to computational cost. Other remedies: using estimates of transfer functions; using transfer functions with easy-to-compute derivatives; using estimates of error values, e.g., a single global error value for the hidden layer.
3. Scaling problem: networks do not scale up well from small research systems to larger real systems. Both too many and too few units slow down learning.
The question one might ask at this point is: does an effective system need to mimic nature exactly?
:6"6:6+C6Ceale, :., J Kac(son, T., !+eural Computing, *n ntroduction!, Cristol , .ilger, c1990. $Contains full deri#ation of the generalised delta rule. *#ailable at Murdoch library%