McCulloch–Pitts "unit": a_i ← g(in_i) = g(Σ_j W_{j,i} a_j)
♦ Multilayer perceptrons
♦ Applications of neural networks
[Figure (a), (b): unit activation g plotted as a function of in_i]
[Figure: a biological neuron, with dendrites, nucleus, axon, and synapses; the axon from another cell connects at a synapse]

McCulloch and Pitts: every Boolean function can be implemented
[Figure: threshold units implementing AND, OR, and NOT]
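A concrete illustration (not from the slides; the weight and threshold values below are my own choices): hard-threshold units compute AND, OR, and NOT.

def unit(weights, bias, inputs):
    """Hard-threshold (McCulloch-Pitts) unit: fires iff the weighted sum exceeds 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def AND(x1, x2): return unit([1, 1], -1.5, [x1, x2])   # fires only when both inputs fire
def OR(x1, x2):  return unit([1, 1], -0.5, [x1, x2])   # fires when either input fires
def NOT(x):      return unit([-1],    0.5, [x])        # inverts its single input

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b), "NOT a:", NOT(a))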
Feed-forward networks implement functions, have no internal state

Feed-forward network = a parameterized family of nonlinear functions;
adjusting the weights changes the function that the network represents

A single perceptron unit represents a linear separator in input space:
    Σ_j W_j x_j > 0,    i.e.    W · x > 0

Perceptron learning: simple weight update rule (gradient descent on the squared error),
    W_j ← W_j + α × Err × g′(in) × x_j,    where Err = y − g(in)
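A minimal sketch of this update rule in Python, assuming a sigmoid g; the learning rate, epoch count, and the OR training data are illustrative choices of my own.

import math, random

def g(z):                                  # sigmoid activation
    return 1.0 / (1.0 + math.exp(-z))

def train_perceptron(examples, n_inputs, alpha=0.5, epochs=1000):
    W = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]   # W[0] is the bias weight
    for _ in range(epochs):
        for x, y in examples:
            xb = [1.0] + list(x)                          # prepend the fixed bias input
            in_ = sum(w * xi for w, xi in zip(W, xb))     # weighted sum
            out = g(in_)
            err = y - out                                 # Err = y - g(in)
            gprime = out * (1.0 - out)                    # g'(in) for the sigmoid
            W = [w + alpha * err * gprime * xi for w, xi in zip(W, xb)]
    return W

# OR is linearly separable, so a single unit suffices
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
W = train_perceptron(data, n_inputs=2)
print([round(g(W[0] + W[1]*a + W[2]*b)) for (a, b), _ in data])    # typically [0, 1, 1, 1]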
[Figure: single-layer perceptron network, input units connected by weights W_{j,i} directly to output units; perceptron output plotted as a function of x1 and x2]

Output units all operate separately—no shared weights

Adjusting weights moves the location, orientation, and steepness of cliff
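A quick numerical illustration of the "cliff" (my own, with made-up weights): the output g(W_0 + W_1 x1 + W_2 x2) transitions along the line W_0 + W_1 x1 + W_2 x2 = 0; changing W_0 shifts the line, changing the ratio W_1 : W_2 rotates it, and scaling all the weights leaves the line fixed while sharpening the transition.

import math
g = lambda z: 1 / (1 + math.exp(-z))

W = (-1.0, 1.0, 1.0)          # illustrative weights: cliff along the line x1 + x2 = 1
for c in (1, 5):              # scaling the weights by c = 5 gives a much steeper cliff
    print([round(g(c * (W[0] + W[1]*x1 + W[2]*x2)), 3)
           for (x1, x2) in [(0.0, 0.0), (0.4, 0.4), (0.5, 0.5), (0.6, 0.6), (1.0, 1.0)]])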
[Figure: learning curves (test-set accuracy vs. training set size) for Perceptron and Decision tree, on MAJORITY with 11 inputs and on RESTAURANT data]

Perceptron learns the majority function easily; decision-tree learning (DTL) is hopeless
DTL learns the restaurant function easily; perceptron cannot represent it
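A small check (my own illustration) of why MAJORITY suits a perceptron: with all weights equal to 1 and a bias of -5.5 (a threshold at half the inputs), a single linear threshold unit computes MAJORITY on 11 inputs exactly.

from itertools import product

def majority_unit(x):                      # weights all 1, bias -5.5, hard threshold
    return 1 if sum(1 * xi for xi in x) - 5.5 > 0 else 0

assert all(majority_unit(x) == (1 if sum(x) >= 6 else 0)
           for x in product((0, 1), repeat=11))
print("a single threshold unit represents MAJORITY on 11 inputs")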
[Figure: multilayer network, input units a_k connected by weights W_{k,j} to hidden units a_j, which connect by weights W_{j,i} to output units a_i]

Hidden layer: back-propagate the error from the output layer:

    ∆_j = g′(in_j) Σ_i W_{j,i} ∆_i ,

where ∆_i = Err_i × g′(in_i) is the error term at output unit i.

Update rule for weights in hidden layer:

    W_{k,j} ← W_{k,j} + α × a_k × ∆_j .

This is the gradient-descent update, since

    ∂E/∂W_{k,j} = − Σ_i ∆_i W_{j,i} g′(in_j) a_k = − a_k ∆_j

(Most neuroscientists deny that back-propagation occurs in the brain)

[Figure: training curve, error on the training set vs. number of epochs]
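A minimal sketch of these updates for one hidden layer of sigmoid units; the network size, learning rate, epoch count, and the XOR data are illustrative assumptions, not from the slides.

import math, random

def g(z):                                   # sigmoid; g'(in) = g(in) * (1 - g(in))
    return 1.0 / (1.0 + math.exp(-z))

N_IN, N_HID = 2, 3
W_kj = [[random.uniform(-1, 1) for _ in range(N_HID)] for _ in range(N_IN + 1)]   # input -> hidden (row 0: bias)
W_ji = [random.uniform(-1, 1) for _ in range(N_HID + 1)]                          # hidden -> output (index 0: bias)

def forward(x):
    a_k = [1.0] + list(x)                                                          # input activations (bias first)
    a_j = [1.0] + [g(sum(a_k[k] * W_kj[k][j] for k in range(N_IN + 1)))
                   for j in range(N_HID)]                                          # hidden activations
    a_i = g(sum(a_j[j] * W_ji[j] for j in range(N_HID + 1)))                       # output activation
    return a_k, a_j, a_i

xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR: not linearly separable, needs the hidden layer
alpha = 0.5
for _ in range(10000):
    for x, y in xor:
        a_k, a_j, a_i = forward(x)
        Delta_i = (y - a_i) * a_i * (1 - a_i)                                      # output error term: Err * g'(in_i)
        Delta_j = [a_j[j + 1] * (1 - a_j[j + 1]) * W_ji[j + 1] * Delta_i           # back-propagated: g'(in_j) * W_ji * Delta_i
                   for j in range(N_HID)]
        for j in range(N_HID + 1):                                                 # W_ji <- W_ji + alpha * a_j * Delta_i
            W_ji[j] += alpha * a_j[j] * Delta_i
        for k in range(N_IN + 1):                                                  # W_kj <- W_kj + alpha * a_k * Delta_j
            for j in range(N_HID):
                W_kj[k][j] += alpha * a_k[k] * Delta_j[j]

print([round(forward(x)[2], 2) for x, _ in xor])    # typically close to [0, 1, 1, 0]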
[Figure: test-set performance vs. training set size on the RESTAURANT data]
Summary
Most brains have lots of neurons; each neuron ≈ linear–threshold unit (?)
Perceptrons (one-layer networks) insufficiently expressive
Multi-layer networks are sufficiently expressive; can be trained by gradient
descent, i.e., error back-propagation
Many applications: speech, driving, handwriting, fraud detection, etc.
Engineering, cognitive modelling, and neural system modelling
subfields have largely diverged