TensorFlow CNN

The document provides an overview of Convolutional Neural Networks (CNNs), which are effective for processing 2D data, particularly in image classification. It explains the concept of local receptive fields and the two main types of layers in CNNs: convolution and pooling. Additionally, it describes how images are represented as matrices and the application of convolution using sliding window functions.

Convolutional Neural Networks in TensorFlow

Overview

Convolutional NNs are one kind of NN architecture that works well with 2D data

Modeled on the visual cortex, they are amazing at image classification
How Do We See?

Viewing an Image

All neurons in the eye don't see the entire image

Each neuron has its own local receptive field

It reacts only to visual stimuli located in its receptive field

Some neurons react to more complex patterns that are combinations of lower-level patterns
Neural Networks

[Diagram: Layer 1 -> Layer 2 -> ... -> Layer N]

Sounds like a classic neural network problem
Two Kinds of Layers in CNNs

Convolution: local receptive field
Pooling: subsampling of inputs

Convolution

In this context, a sliding window function applied to a matrix

e.g. a matrix of pixels representing an image

Often called a kernel or filter

The kernel is applied element-wise in sliding-window fashion
Representing Images as Matrices

A 28 x 28 image = 784 pixels

A 6 x 6 image = 36 pixels:

0    0    0    0    0    0
0.2  0.8  0    0.3  0.6  0
0.2  0.9  0    0.3  0.8  0
0.3  0.8  0.7  0.8  0.9  0
0    0    0    0.2  0.8  0
0    0    0    0.2  0.2  0
Representing Images

Matrix (6 x 6):                    Kernel (3 x 3):

0    0    0    0    0    0         1 0 1
0.2  0.8  0    0.3  0.6  0         0 1 0
0.2  0.9  0    0.3  0.8  0         1 0 1
0.3  0.8  0.7  0.8  0.9  0
0    0    0    0.2  0.8  0
0    0    0    0.2  0.2  0

Convolution

The 3 x 3 kernel is laid over the 6 x 6 matrix and slid across it; at each position the overlapping elements are multiplied and summed, producing a 4 x 4 result:

1    1.2  1.1  0.9
1.9  2.7  2.5  1.9
1.0  2.1  2.4  1.4
1.0  1.8  2.0  1.8


[The original slides step the kernel across the matrix one position at a time, filling in the 4 x 4 result element by element. For example, the top-left element is (0·1 + 0·0 + 0·1) + (0.2·0 + 0.8·1 + 0·0) + (0.2·1 + 0.9·0 + 0·1) = 1.]
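The same computation can be reproduced with TensorFlow's convolution op. A minimal sketch (the reshapes to TensorFlow's [batch, height, width, channels] layout are implementation details, not from the slides):

```python
import tensorflow as tf

# The 6 x 6 image and 3 x 3 kernel from the slides
image = tf.constant([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.3, 0.6, 0.0],
    [0.2, 0.9, 0.0, 0.3, 0.8, 0.0],
    [0.3, 0.8, 0.7, 0.8, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.2, 0.8, 0.0],
    [0.0, 0.0, 0.0, 0.2, 0.2, 0.0],
])
kernel = tf.constant([
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
])

result = tf.nn.conv2d(
    tf.reshape(image, [1, 6, 6, 1]),    # one image, one channel
    tf.reshape(kernel, [3, 3, 1, 1]),   # one input channel, one filter
    strides=1,
    padding="VALID",                    # no zero padding
)
print(tf.reshape(result, [4, 4]))       # the 4 x 4 result shown above
```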


Choice of Kernel Function

Averaging neighbouring pixels ~ Blurring

Subtracting neighbouring pixels ~ Edge detection

Positive middle, negative neighbours ~ Sharpen

Negative corners, zero elsewhere ~ Edge enhance

More complex patterns ~ Emboss

Choice of Kernel Function

http://aishack.in/tutorials/image-convolution-examples/

[Examples from the link: blur, line detection, horizontal lines, edge detection; two such kernels are sketched below]
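For instance, a minimal sketch of two commonly used kernels matching the descriptions above (the exact values are illustrative, not taken from the slides):

```python
import tensorflow as tf

blur = tf.constant([[1., 1., 1.],
                    [1., 1., 1.],
                    [1., 1., 1.]]) / 9.0      # average neighbouring pixels ~ blur

sharpen = tf.constant([[ 0., -1.,  0.],
                       [-1.,  5., -1.],
                       [ 0., -1.,  0.]])      # positive middle, negative neighbours ~ sharpen

def apply_kernel(image_2d, kernel_3x3):
    """Convolve a single-channel 2D image with a 3x3 kernel."""
    x = tf.reshape(image_2d, [1, *image_2d.shape, 1])
    k = tf.reshape(kernel_3x3, [3, 3, 1, 1])
    return tf.nn.conv2d(x, k, strides=1, padding="SAME")[0, :, :, 0]
```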
Zero-padding, Stride Size
Narrow vs. Wide Convolution

[Diagram: the input matrix (i.e. the image) and the convolution result]

Narrow Convolution: little zero padding; the output is narrower than the input
Wide Convolution: lots of zero padding; the output is wider than the input
Without Zero Padding

The 6 x 6 matrix convolved with the 3 x 3 kernel (no padding) gives a 4 x 4 result.


Zero Padding

Padding the 6 x 6 matrix with two rings of zeros gives a 10 x 10 matrix; convolving with the 3 x 3 kernel gives an 8 x 8 result.

Padding with three rings of zeros gives a 12 x 12 matrix and a 10 x 10 result.

Zero Padding

With zero-padding, every element of the matrix will be passed into the filter

Can decide the number of zero columns to pad with

Use to get output larger than the input (see the sketch below)
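A minimal sketch of the padding options in TensorFlow (the tensor shapes are assumptions chosen to match the 6 x 6 example):

```python
import tensorflow as tf

x = tf.random.uniform([1, 6, 6, 1])   # a 6 x 6 single-channel image
k = tf.random.uniform([3, 3, 1, 1])   # a 3 x 3 kernel

narrow = tf.nn.conv2d(x, k, strides=1, padding="VALID")  # no padding      -> 4 x 4
same   = tf.nn.conv2d(x, k, strides=1, padding="SAME")   # keep input size -> 6 x 6
wide   = tf.nn.conv2d(x, k, strides=1,
                      padding=[[0, 0], [3, 3], [3, 3], [0, 0]])  # 3 rings -> 10 x 10

print(narrow.shape, same.shape, wide.shape)
```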
Stride Size

[The original slides step the kernel across the matrix: moving the window one column to the right is a horizontal stride of 1; moving it one row down is a vertical stride of 1.]

Stride size is an important hyperparameter in CNNs
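A minimal sketch of how stride affects the output size (shapes are assumptions, continuing the 6 x 6 example):

```python
import tensorflow as tf

x = tf.random.uniform([1, 6, 6, 1])
k = tf.random.uniform([3, 3, 1, 1])

stride_1 = tf.nn.conv2d(x, k, strides=1, padding="VALID")  # -> 4 x 4
stride_2 = tf.nn.conv2d(x, k, strides=2, padding="VALID")  # -> 2 x 2
print(stride_1.shape, stride_2.shape)
```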
Convolutional Neural Networks

Neural Networks for Image Classification

[Diagram: Corpus of Images -> Layer 1 -> Layer 2 -> ... -> Layer N -> ML-based Classifier]

Pixels go in and processed groups of pixels come out; each layer consists of individual, interconnected neurons
Parameter Explosion

Consider a 100 x 100 pixel image (10,000 pixels)

If first layer = 10,000 neurons

Interconnections ~ O(10,000 * 10,000)

100 million parameters to train the neural network!


Parameter Explosion

Dense, fully connected neural networks can’t cope

Convolutional neural networks to the rescue
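To see the scale of the problem, a minimal sketch comparing a fully connected first layer with a convolutional one on the 100 x 100 image from the previous slide (the single 3 x 3 filter is an illustrative choice):

```python
import tensorflow as tf

dense = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(100, 100, 1)),
    tf.keras.layers.Dense(10_000),              # 10,000 fully connected neurons
])

conv = tf.keras.Sequential([
    tf.keras.layers.Conv2D(1, kernel_size=3, input_shape=(100, 100, 1)),
])

print(dense.count_params())   # 100,010,000 -- roughly the 100 million above
print(conv.count_params())    # 10 -- one 3 x 3 kernel plus a bias, shared everywhere
```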


CNNs Introduced

Eye perceives visual stimulus in 2D visual field

Eye sends 2D image to visual cortex

Visual cortex adds depth perception

Individual neurons in cortex focus on small field

“Local receptive field”


CNNs Introduced

CNNs perform spectacularly well at many tasks

Particularly at image recognition

Dramatically fewer parameters than a DNN with similar performance
Inspirations for CNNs

Two Dimensions: data comes in expressed in 2D
Local Receptive Fields: neurons focus on narrow portions of the input
CNN Layers

Convolution layers - zoom in on specific bits of input

Successive layers aggregate inputs into higher level features

Pixels >> Lines >> Contours/Edges >> Object


Convolutional Layers

Feature Maps

[Diagram: image pixels feed the neurons of a convolutional layer, which produce a feature map]

Each neuron i in the convolutional layer is connected only to the pixels in its local receptive field

The number of neurons in the receptive field = the kernel size
Kernel Size

The convolutional kernel size is usually expressed in terms of the width and height of the receptive area

Use small convolutional kernels; they are more efficient

Stacking two 3x3 kernels is preferable to one 9x9 kernel (the parameter-count sketch below illustrates why)
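A minimal sketch comparing the parameter counts (the 28 x 28 x 32 input stands in for an intermediate feature map; the filter count of 32 is an illustrative choice):

```python
import tensorflow as tf

stacked = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=3, activation="relu",
                           input_shape=(28, 28, 32)),
    tf.keras.layers.Conv2D(32, kernel_size=3, activation="relu"),
])

single = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=9, activation="relu",
                           input_shape=(28, 28, 32)),
])

print(stacked.count_params())  # 2 * (3*3*32*32 + 32) = 18,496
print(single.count_params())   # 9*9*32*32 + 32       = 82,976
```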


Feature Maps

Stride: the distance between successive receptive fields

[Diagram: horizontal and vertical strides of the receptive fields between the pixel layer and the convolutional layer]

Zero padding may be needed at the edges
Feature Maps

All neurons in a feature map have the same weights and biases

Two big advantages over DNNs

- Dramatically fewer parameters to train

- CNN can recognise feature patterns independent of location


Feature Maps

The parameters of all neurons in a feature map are collectively called the filter

Why filter?

Because the weights highlight (filter) specific patterns from the input pixels
Filters

Horizontal Filter: the neuron will detect horizontal lines in the input
Vertical Filter: the neuron will detect vertical lines in the input
Feature Maps

Notice also that neurons are not connected to all pixels

CNNs are sparse neural networks


Convolutional Layer

Each convolutional layer consists of several feature maps of equal size

The different feature maps have different parameters


Convolutional Layer

Each neuron's receptive field includes the feature maps of all previous layers

This is how aggregated features are picked up

The CNN as a whole consists of multiple convolutional (and pooling) layers

More on pooling layers in a bit


CNNs

[Diagram: feature maps make up a convolutional layer; convolutional layers make up the CNN]

RGB Channels

[Diagram: each channel of an RGB input feeds the feature maps of the convolutional layer]
Output of a Convolution Layer Neuron

[Diagram: Input Image -> Layer 1 -> Layer 2 -> ... -> Layer L, with a neuron at map m, column c, row r]

Neuron output depends on corresponding* neurons from each preceding layer
(*corresponding: same receptive field and feature maps, different layers)
Pooling Layers
Two Kinds of Layers in CNNs

Convolutional: local receptive field
Pooling: subsampling of inputs

Convolution (recap): the 6 x 6 matrix convolved with the 3 x 3 kernel gives the 4 x 4 result shown earlier

Pooling Layers

Neurons in a pooling layer have no weights or biases

A pooling neuron simply applies some aggregation function to all of its inputs

Max, sum, average…


Max Pooling

Matrix (4 x 4):

0.2  0.8  0.3  0.6
0.2  0.9  0.3  0.8
0.3  0.8  0.8  0.9
0    0    0.2  0.8

Max, 2x2 filter, stride = 2

Pooling Result (2 x 2):

0.9  0.8
0.8  0.9
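A minimal sketch reproducing this result with TensorFlow's pooling op (the reshape to [batch, height, width, channels] is an implementation detail):

```python
import tensorflow as tf

x = tf.constant([
    [0.2, 0.8, 0.3, 0.6],
    [0.2, 0.9, 0.3, 0.8],
    [0.3, 0.8, 0.8, 0.9],
    [0.0, 0.0, 0.2, 0.8],
])

pooled = tf.nn.max_pool2d(
    tf.reshape(x, [1, 4, 4, 1]),
    ksize=2, strides=2, padding="VALID",   # 2x2 window, stride 2
)
print(tf.reshape(pooled, [2, 2]))          # [[0.9, 0.8], [0.8, 0.9]]
```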


Pooling Layers

Why use them?

- greatly reduce memory usage during training

- mitigate overfitting (via subsampling)

- make the NN recognise features independent of location (location invariance)
Pooling Layers

Pooling layers typically act on each channel independently

So, usually, output area < input area, but output depth = input depth


CNNs for Classification
Typical CNN Architecture

[Diagram: Convolutional -> ReLU -> Pooling -> Convolutional -> ReLU -> ...]

Alternating groups of convolutional and pooling layers

Each group of convolutional layers is usually followed by a ReLU layer

The output of each layer is also an image

However, successive outputs are smaller and smaller (due to the pooling layers)

As well as deeper and deeper (due to the feature maps in the convolutional layers)

This entire set of layers is then fed into a regular, feed-forward NN
Typical CNN Architecture

[Diagram: CNN Layers -> Feed-forward Layers (Fully Connected + ReLU, Fully Connected + ReLU) -> SoftMax -> Prediction]

This feed-forward part has a few fully connected layers with ReLU activation

Finally, a SoftMax prediction layer
Logistic Regression with One Neuron

[Diagram: inputs X1..Xn with weights W1..Wn -> affine transformation W1x + b1 -> (W2, b2) -> Softmax function -> P(Y = True), P(Y = False)]
SoftMax for Digit Classification

[Softmax function -> P(Y = 0), P(Y = 1), ..., P(Y = 9)]

SoftMax for Image Classification

[Softmax function -> P(Y = "cat"), P(Y = "bird"), ..., P(Y = "car")]
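A minimal sketch of what the softmax layer does, assuming some made-up raw scores (logits) for the 10 digits:

```python
import tensorflow as tf

logits = tf.constant([2.0, 0.5, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
probs = tf.nn.softmax(logits)        # P(Y = 0), ..., P(Y = 9), summing to 1
print(probs.numpy().round(3))
```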
Typical CNN Architecture

[Diagram: CNN Layers -> Feed-forward Layers (Fully Connected + ReLU) -> SoftMax -> Prediction: P(Y = 0), P(Y = 1), ..., P(Y = 9)]

This is the output layer, emitting probabilities


Typical CNN Architectures

[Diagram: alternating Convolutional and Pooling groups]

Alternating groups of convolutional and pooling layers

Each group of convolutional layers is usually followed by a ReLU layer

The image gets smaller and smaller (due to pooling)

Also deeper and deeper (due to convolution)
Typical CNN Architectures

[Diagram: Convolutional Layers -> Dense Feed-forward Layers]

At the output end of the CNN, a regular feedforward NN is stacked on:

- a few fully connected layers

- the inputs into these are small images

- ReLU activations

- finally, a Softmax prediction layer (a minimal end-to-end sketch follows below)
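Putting the pieces together, a minimal end-to-end sketch of this architecture in Keras, assuming 28 x 28 grayscale digit images; the filter counts and layer sizes are illustrative choices, not taken from the slides:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # Alternating convolutional (+ ReLU) and pooling groups
    tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2),     # image gets smaller...
    tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),     # ...and deeper (more feature maps)

    # Regular feed-forward part: fully connected layers with ReLU
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),

    # SoftMax prediction layer: P(Y = 0) ... P(Y = 9)
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```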
Typical CNN Architectures

[Diagram: image -> CNN -> P(Y = 0), ..., P(Y = 9)]

The input is an image

The outputs are probabilities