
Introduction to Computational Neuroscience

Artificial Neural Networks


Tambet Matiisen
15.10.2018
Artificial neural network

NB! Inspired by biology, not based on biology!


Applications
• Automatic speech recognition
• Automatic image tagging
• Machine translation
Learning objectives
• How do artificial neural networks work?
• What types of artificial neural networks are used for which tasks?
• What state-of-the-art results have been achieved with artificial neural networks?
Part 1

HOW DO NEURAL NETWORKS WORK?


Frank Rosenblatt (1957)

Added a learning rule to the McCulloch-Pitts neuron.

Perceptron

Prediction:
    z = 1, if Σ_i x_i·w_i + b > 0
    z = 0, otherwise

Learning:
    w_i ← w_i + (y − z)·x_i
    b ← b + (y − z)

[diagram: inputs x1, x2 are multiplied by weights w1, w2, summed together with the bias b, and thresholded to give the output z]
Let’s try it out!
x1 x2 y = x1 or x2
0 0 0
0 1 1
1 0 1
1 1 1

Algorithm: repeat
    z = 1, if x1·w1 + x2·w2 + b > 0, otherwise z = 0
    w1 ← w1 + (y − z)·x1
    w2 ← w2 + (y − z)·x2
    b ← b + (y − z)
until y = z holds for the entire dataset
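
A minimal sketch of this procedure in Python with NumPy (not from the slides; there is no separate learning rate, as in the update rule above):

    import numpy as np

    # OR truth table from the slide: inputs and targets
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 1, 1, 1])

    w = np.zeros(2)   # weights w1, w2
    b = 0.0           # bias

    converged = False
    while not converged:
        converged = True
        for xi, yi in zip(X, y):
            z = 1 if xi @ w + b > 0 else 0   # prediction
            if z != yi:
                w += (yi - z) * xi           # w_i <- w_i + (y - z) * x_i
                b += yi - z                  # b   <- b   + (y - z)
                converged = False

    print(w, b)   # for OR this converges, e.g. to w = [1, 1], b = 0

The loop terminates on the OR table above because the problem is linearly separable.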
Perceptron limitations

The perceptron learning algorithm converges only for linearly separable problems.

Minsky, Papert, “Perceptrons” (1969)


Multi-layer perceptrons
• Add non-linear activation functions.
• Add hidden layer(s).

Universal approximation theorem:

Any continuous function can be approximated to a given precision by a feed-forward neural network with a single hidden layer containing a finite number of neurons.
Forward propagation

[diagram: inputs x1, x2 feed two hidden units (weights w11, w12, w21, w22, biases b1, b2), whose outputs h1, h2 feed a single output unit (weights v1, v2, bias c)]

    a1 = x1·w11 + x2·w21 + b1        h1 = σ(a1)
    a2 = x1·w12 + x2·w22 + b2        h2 = σ(a2)

    z = h1·v1 + h2·v2 + c

    σ(x) = 1 / (1 + e^(−x))
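
The same forward pass written as a short Python/NumPy sketch (the weight values below are arbitrary illustrations, not taken from the slides):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # toy parameters for the 2-2-1 network in the diagram
    W = np.array([[0.5, -0.3],    # w11, w12
                  [0.8,  0.2]])   # w21, w22
    b = np.array([0.1, -0.1])     # b1, b2
    v = np.array([1.0, -1.0])     # v1, v2
    c = 0.5

    x = np.array([1.0, 2.0])      # inputs x1, x2

    a = x @ W + b                 # a_i = x1*w1i + x2*w2i + b_i
    h = sigmoid(a)                # h_i = sigma(a_i)
    z = h @ v + c                 # z = h1*v1 + h2*v2 + c
    print(z)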
Loss function

• Function approximation:  L = ½·(z − y)²

[plot: the loss (z − 10)² as a function of the output z, with its minimum at z = 10]

Now we just need to find weight values that minimize the loss function for all inputs. How do we do that?
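
For example (not from the slides): if the network outputs z = 12 while the target is y = 10, the loss is L = ½·(12 − 10)² = 2; an output of exactly z = 10 gives zero loss.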
Backpropagation

    ∂L/∂z = z − y

    ∂L/∂v1 = ∂L/∂z · ∂z/∂v1 = (z − y)·h1
    ∂L/∂v2 = ∂L/∂z · ∂z/∂v2 = (z − y)·h2
    ∂L/∂c  = ∂L/∂z = z − y

    ∂L/∂h1 = ∂L/∂z · ∂z/∂h1 = (z − y)·v1
    ∂L/∂a1 = ∂L/∂z · ∂z/∂h1 · ∂h1/∂a1 = (z − y)·v1·h1·(1 − h1)
    ∂L/∂a2 = ∂L/∂z · ∂z/∂h2 · ∂h2/∂a2 = (z − y)·v2·h2·(1 − h2)

    ∂L/∂w11 = ∂L/∂a1 · ∂a1/∂w11 = ∂L/∂a1 · x1    (similarly for w12, w21, w22)
    ∂L/∂b1  = ∂L/∂a1        ∂L/∂b2 = ∂L/∂a2

    σ′(x) = σ(x)·(1 − σ(x))

where a_i = x1·w1i + x2·w2i + b_i,  h_i = σ(a_i),  z = h1·v1 + h2·v2 + c,  L = ½·(z − y)²
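
These derivatives can be computed with a small Python/NumPy sketch that continues the forward-propagation example (again with arbitrary illustrative weights, not values from the slides):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # same toy 2-2-1 network as in the forward-propagation sketch
    W = np.array([[0.5, -0.3],
                  [0.8,  0.2]])
    b = np.array([0.1, -0.1])
    v = np.array([1.0, -1.0])
    c = 0.5

    x = np.array([1.0, 2.0])
    y = 10.0                        # target output

    # forward pass
    a = x @ W + b
    h = sigmoid(a)
    z = h @ v + c
    L = 0.5 * (z - y) ** 2

    # backward pass: the chain-rule formulas from the slide
    dL_dz = z - y
    dL_dv = dL_dz * h               # dL/dv_j = (z - y) * h_j
    dL_dc = dL_dz
    dL_dh = dL_dz * v               # dL/dh_j = (z - y) * v_j
    dL_da = dL_dh * h * (1 - h)     # sigma'(a_j) = h_j * (1 - h_j)
    dL_dW = np.outer(x, dL_da)      # dL/dw_ij = dL/da_j * x_i
    dL_db = dL_da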
Gradient Descent

    θ = {w_ij, v_j, b_j, c}

    θ ← θ − α · ∂L/∂θ

    α — learning rate

• Gradient descent finds weight values that result in a small loss.
• Gradient descent is guaranteed to find only a local minimum.
• But there are plenty of them and they are often good enough!
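
A self-contained sketch of the update rule θ ← θ − α·∂L/∂θ on a one-parameter toy loss (the learning rate 0.1 and the loss ½·(θ − 10)² are illustrative choices, not from the slides):

    # minimize L(theta) = 0.5 * (theta - 10)**2, whose gradient is theta - 10
    theta = 0.0
    alpha = 0.1                  # learning rate
    for step in range(100):
        grad = theta - 10.0      # dL/dtheta
        theta -= alpha * grad    # theta <- theta - alpha * dL/dtheta
    print(theta)                 # approaches 10, the minimizer of the loss

In a real network the same step is applied to every weight and bias, using the gradients computed by backpropagation.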
Other loss functions

• Binary classification:
    p = σ(z)
    L = −y·log(p) − (1 − y)·log(1 − p)

  [plot: −log(p) and −log(1 − p) as functions of p]

• Multi-class classification:
    p = softmax(z),  p_i = e^(z_i) / Σ_j e^(z_j)
    L = −Σ_i y_i·log(p_i) = −log(p_k),  where k is the correct class
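
Both losses in a short Python/NumPy sketch (the outputs and labels below are made-up illustrative values):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def softmax(z):
        e = np.exp(z - np.max(z))      # subtract the max for numerical stability
        return e / e.sum()

    # binary classification: scalar output z, label y in {0, 1}
    z, y = 1.5, 1
    p = sigmoid(z)
    bce = -(y * np.log(p) + (1 - y) * np.log(1 - p))

    # multi-class classification: one output per class, one-hot label
    z_vec = np.array([2.0, 0.5, -1.0])
    k = 0                              # index of the correct class
    probs = softmax(z_vec)
    ce = -np.log(probs[k])             # equals -sum_i y_i * log(p_i) for a one-hot y
    print(bce, ce)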
Things to remember...
 The perceptron was the first artificial neuron model,
invented in the late 1950s.
 A perceptron can learn only linearly separable
classification problems.
 Feed-forward networks with non-linear activation
functions and hidden layers can overcome
limitations of perceptrons.
 Multi-layer artificial neural networks are trained
using backpropagation and gradient descent.
Part 2

NEURAL NETWORK TAXONOMY


Simple feed-forward networks
• Architecture:
  – Each node is connected to all nodes of the previous layer.
  – Information moves in one direction only.
• Used for:
  – Function approximation
  – Simple classification problems
  – Not too many inputs (~100)

[diagram: input layer → hidden layer → output layer]
Convolutional neural networks
• Architecture:
  – Convolutional layer: local connections + weight sharing.
  – Pooling layer: translation invariance.
• Used for:
  – images and spatial data,
  – any other data with a locality property, e.g. adjacent characters make up a word.

[figure: 1-D example with input layer 0 1 2 -1 1 -3, shared weights 1 0 -1, convolutional layer scores -2 2 1 2, and max-pooling layer 2 2]
Hubel & Wiesel (1959)
• Performed experiments with an anesthetized cat.
• Discovered topographical mapping, sensitivity to orientation and hierarchical processing.
Convolution

Convolution matches the same pattern over the entire image and calculates a score for each match.
Example: edge detector

https://developer.apple.com/library/ios/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html
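
A minimal sketch of such a convolution in Python/NumPy, using the 1-D input row and shared weights shown in the architecture figure above:

    import numpy as np

    x = np.array([0, 1, 2, -1, 1, -3])   # input row from the figure
    w = np.array([1, 0, -1])             # the same weights are applied at every position

    # slide the weights over the input and take a dot product at each position
    scores = np.array([x[i:i + len(w)] @ w for i in range(len(x) - len(w) + 1)])
    print(scores)                        # [-2  2  1  2] -- the convolutional layer in the figure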
Pooling

Pooling achieves translation invariance by taking the maximum of adjacent convolution scores.
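
Continuing the sketch above, max-pooling over pairs of adjacent convolution scores reproduces the pooling layer from the figure:

    import numpy as np

    scores = np.array([-2, 2, 1, 2])         # convolution scores from the previous sketch

    # take the maximum over non-overlapping windows of 2 adjacent scores
    pooled = scores.reshape(-1, 2).max(axis=1)
    print(pooled)                            # [2 2] -- the pattern is detected regardless of its exact position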
Example: handwritten digit recognition

Y. LeCun et al., “Handwritten digit recognition: Applications of neural net chips and automatic learning”, 1989.

LeCun et al. (1989)


Recurrent neural networks
• Architecture:
  – Hidden layer nodes are connected to themselves.
  – This allows retaining internal state and memory.
• Used for:
  – speech recognition,
  – machine translation,
  – language modeling,
  – any time series.

[diagram: input layer → hidden layer (with a recurrent connection back to itself) → output layer]
Backpropagation through time

[diagram: the network unrolled over time; inputs x1…x4 update hidden states h1…h4 (each also depending on the previous state, starting from h0), which produce outputs z1…z4 compared with targets y1…y4]

    L = ½·(z − y)²   at each time step
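
A minimal forward pass of such a recurrent network in Python/NumPy (all weight values and the input sequence are made-up illustrations; backpropagation through time then runs the chain rule backwards through this unrolled loop):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # toy recurrent network: 1 input, 2 hidden units, 1 output
    Wxh = np.array([[0.5, -0.2]])       # input  -> hidden
    Whh = np.array([[0.3,  0.1],
                    [-0.4, 0.6]])       # hidden -> hidden (the recurrent connection)
    Why = np.array([[1.0], [-1.0]])     # hidden -> output

    xs = [np.array([1.0]), np.array([0.0]), np.array([2.0]), np.array([1.0])]
    h = np.zeros(2)                     # h0: initial hidden state

    # unrolled forward pass: the same weights are reused at every time step
    for t, x in enumerate(xs, start=1):
        h = sigmoid(x @ Wxh + h @ Whh)  # new state depends on the input and the previous state
        z = h @ Why                     # output at this time step
        print(f"z{t} = {z}")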
Different configurations
Autoencoders
• Architecture:
  – Input and output layers are the same.
  – The hidden layer functions as a “bottleneck”.
  – The network is trained to reconstruct the input from the hidden layer activations.
• Used for:
  – image semantic hashing
  – dimensionality reduction

[diagram: input layer → narrow hidden layer → output layer = input layer]
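
A minimal autoencoder forward pass in Python/NumPy (the layer sizes and random weights are illustrative assumptions; in practice the weights are trained by backpropagation to make the reconstruction loss small):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    rng = np.random.default_rng(0)

    n_in, n_hidden = 8, 3                  # the hidden layer is the "bottleneck"
    W_enc = 0.1 * rng.standard_normal((n_in, n_hidden))
    W_dec = 0.1 * rng.standard_normal((n_hidden, n_in))

    x = rng.random(n_in)                   # the input is also the target output
    h = sigmoid(x @ W_enc)                 # encode: compress 8 numbers into 3
    x_hat = sigmoid(h @ W_dec)             # decode: reconstruct the input
    loss = 0.5 * np.sum((x_hat - x) ** 2)  # reconstruction error to be minimized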
We didn’t talk about...
• Long Short Term Memory (LSTMs)
• Restricted Boltzmann Machines (RBMs)
• Echo State Networks / Liquid State Machines
• Hopfield Network
• Self-organizing maps (SOMs)
• Radial basis function networks (RBFs)

• But we covered the most important ones!


Things to remember...
Simple feed-forward networks are usually
used for function approximation and
classification with few input features.
Convolutional neural networks are mostly
used for images and spatial data.
Recurrent neural networks are used for
language modeling and time series.
Autoencoders are used for image semantic
hashing and dimensionality reduction.
Part 3

SOME STATE-OF-THE-ART RESULTS


Deep Learning
• Artificial neural networks and backpropagation
  have been around since the 1980s. What’s all this
  fuss about “deep learning”?

• What has changed:
  – we have much bigger datasets,
  – we have much faster computers (think GPUs),
  – we have learned a few tricks for training neural
    networks with very many layers.
Revolution of Depth

[figure: benchmark results of increasingly deep networks; human error ~5.1% shown for comparison]

Neural Image Processing
Instance Segmentation

https://www.youtube.com/watch?v=OOT3UIXZztE

https://github.com/matterport/Mask_RCNN
Image Captioning
Image Captioning Errors
Reinforcement learning

[diagram: the network receives the game screen and score and sends back actions; games shown: Pong, Breakout, Space Invaders, Seaquest, Beam Rider, Enduro]


http://sodeepdude.blogspot.com/2015/03/deepminds-atari-paper-replicated.html
Mnih et al., “Human-level control through deep reinforcement learning” (2015)
Skype Translator

https://www.youtube.com/watch?v=NhxCg2PA3ZI
Adversarial Examples

https://www.youtube.com/watch?v=XaQu7kkQBPc
Things to remember...
Artificial neural networks are state-of-the-art
in image recognition, speech recognition,
machine translation and many other fields.
Anything that you can do in 1 second, we can
probably train a neural network to do as well,
i.e. neural nets can do perception.
But in the end they are just reactive function
approximators and can be easily fooled. In
particular, they do not think like humans (yet).
Thank you!

[email protected]
