DL6 - Convnets 4
Convolutional Neural Net: Building Blocks
• Convolution layers
• Pooling layers
Are fully connected layers invariant to translation?
• Fully connected layer with image input: neuron $k$ computes $a_k = \sum_{i,j} W[i,j,k]\,\mathbf{x}[i,j]$ over the pixels $\mathbf{x}[i,j]$
• Translation invariant? No: the weights $W[i,j,k]$ are tied to pixel positions, so shifting the image changes the response (see the check below)
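A minimal numeric check of this claim; the layer size and the single-pixel input are assumptions for illustration:

```python
import torch

torch.manual_seed(0)
H = W = 8
x = torch.zeros(H, W)
x[2, 2] = 1.0                                # a single bright pixel
x_shift = torch.roll(x, shifts=1, dims=1)    # same pixel, shifted right

fc = torch.nn.Linear(H * W, 1, bias=False)   # one neuron k with weights W[i,j,k]
print(fc(x.flatten()), fc(x_shift.flatten()))
# The two responses differ (with probability 1 for random weights):
# the weights are tied to pixel positions, so the layer is NOT
# translation invariant.
```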
Why is invariance not enough?
• Shallow layer features: “eyes” + “nose” + “mouth”
• A deeper layer should detect “face”, but only if all the features are spatially related; invariant features discard exactly this spatial information
• $f$ is equivariant to $\tau$ if $f(\tau(\mathbf{x})) = \tau(f(\mathbf{x}))$ for all $\mathbf{x}$
• Example: edge detection; shifting the image shifts its edge map by the same amount
Convnets: built-in translation invariance
• Label unaffected by translation → the classifier should be translation invariant
• Features should translate with the image → convnet hidden layers should be translation equivariant
How do we build equivariant layers?
Exercise: Let $f(\mathbf{x}) = \sigma(W\mathbf{x})$, where $\sigma$ is a component-wise invertible non-linearity and $W$ is a linear map. Prove that $f$ is equivariant to a transformation family $\{\tau\}$ if and only if $W$ commutes with every $\tau$, i.e. $W\tau = \tau W$. (A numeric check for convolutions and shifts follows.)
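A numeric check of the “if” direction for the case we care about, a sketch assuming circular convolution as $W$ and a circular shift as $\tau$:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 16, 16)
w = torch.randn(1, 1, 3, 3)
tau = lambda t: torch.roll(t, shifts=(2, 3), dims=(-2, -1))         # translation
W = lambda t: F.conv2d(F.pad(t, (1, 1, 1, 1), mode="circular"), w)  # linear map

lhs = torch.tanh(W(tau(x)))   # f(tau(x))
rhs = tau(torch.tanh(W(x)))   # tau(f(x))
print(torch.allclose(lhs, rhs, atol=1e-6))  # True: here W tau = tau W
```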
2D Convolution on a single map (channel)
Examples of 2D Convolutions
• Convolving with an edge-detection kernel highlights intensity changes
• Convolving with a sharpening kernel boosts the image's high frequencies (see the sketch below)
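A sketch applying the two effects; the specific kernels below are the standard textbook choices, not taken from the slide:

```python
import torch
import torch.nn.functional as F

edge = torch.tensor([[0., 1., 0.],
                     [1., -4., 1.],
                     [0., 1., 0.]])        # Laplacian: responds to edges
sharpen = torch.tensor([[0., -1., 0.],
                        [-1., 5., -1.],
                        [0., -1., 0.]])    # identity + Laplacian

img = torch.rand(1, 1, 32, 32)             # any grayscale image
edges = F.conv2d(img, edge.view(1, 1, 3, 3), padding=1)
sharp = F.conv2d(img, sharpen.view(1, 1, 3, 3), padding=1)
```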
Zero Padding
• The input map is padded with a border of zeros before convolving; padding = 0, 1, 2 yield progressively larger output maps (see the shape check below)
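A quick shape check (sizes are assumptions), matching $H_{\text{out}} = H + 2P - K + 1$:

```python
import torch

x = torch.zeros(1, 1, 8, 8)
for p in (0, 1, 2):
    y = torch.nn.Conv2d(1, 1, kernel_size=3, padding=p)(x)
    print(p, y.shape)   # 6x6, 8x8, 10x10
```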
Convolution Layer: Single Input, Two Outputs
• Each learned kernel is convolved with the input map (image) to produce one output map
• Another learned kernel produces a second output map
Convolution Layer: Single Input, Many Outputs
• N learned kernels produce N output maps
Convolution Layer: Many Inputs, One Output
• Each input map is convolved with its own learned kernel and the results are summed into one output map
Convolution Layer: Many Inputs, Many Outputs
• M input maps, N output maps: M·N K×K kernels in total, one set of M kernels per output map
• Q: What does a 1×1 kernel do?
• A: “Point-wise convolution”: scalars multiplying entire maps, useful for cheaply combining maps (see the sketch below)
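A sketch of these shapes in PyTorch, with assumed M = 3, N = 2, K = 5:

```python
import torch

M, N, K = 3, 2, 5
conv = torch.nn.Conv2d(M, N, kernel_size=K, bias=False)
print(conv.weight.shape)        # (N, M, K, K): M*N KxK kernels
print(conv.weight.numel())      # M*N*K*K parameters

pointwise = torch.nn.Conv2d(M, N, kernel_size=1, bias=False)
print(pointwise.weight.shape)   # (N, M, 1, 1): N weighted sums of the M maps
```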
Grouped Convolution
• The M input maps are split into groups, and each group of output maps is computed from its own group of input maps with its own K×K kernels
• M input maps → M output maps, with kernels connecting only maps within the same group (parameter counts are sketched below)
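A parameter-count sketch (channel counts are assumptions): with groups = g, each output map sees only M/g input maps, cutting parameters by a factor of g.

```python
import torch

M = N = 8
full = torch.nn.Conv2d(M, N, 3, bias=False)
grouped = torch.nn.Conv2d(M, N, 3, groups=4, bias=False)
depthwise = torch.nn.Conv2d(M, M, 3, groups=M, bias=False)  # extreme case: one kernel per map
print(full.weight.numel(), grouped.weight.numel(), depthwise.weight.numel())
# 576, 144, 72
```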
Common Cheap Option: Separable Convolution
• Key idea: divide the convolution into two steps [introduced in the MobileNet architecture]
• Step 1 (depthwise): M K×K kernels, one per input map → M intermediate maps
• Step 2 (point-wise): M·N 1×1 kernels combine the intermediate maps → N output maps
• Q: #parameters in comparison to standard convolution? A: standard $MNK^2$, separable $MK^2 + MN$
• Q: #multiplications in comparison to standard convolution? A: the same ratio, since each parameter count is multiplied by the output map size (see the comparison below)
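A sketch of the two-step layer and the parameter comparison above, with assumed M = 32, N = 64, K = 3:

```python
import torch

M, N, K = 32, 64, 3
standard = torch.nn.Conv2d(M, N, K, padding=1, bias=False)
depthwise = torch.nn.Conv2d(M, M, K, padding=1, groups=M, bias=False)  # M KxK kernels
pointwise = torch.nn.Conv2d(M, N, 1, bias=False)                       # M*N 1x1 kernels

p_std = standard.weight.numel()                              # M*N*K^2 = 18432
p_sep = depthwise.weight.numel() + pointwise.weight.numel()  # M*K^2 + M*N = 2336
print(p_std, p_sep, p_std / p_sep)                           # roughly 8x cheaper

x = torch.randn(1, M, 28, 28)
y = pointwise(depthwise(x))   # same output shape as standard(x)
```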
Convolutional Neural Net: Building Blocks
• Convolution layers: translation equivariant ✓
• Pooling layers: no
Pooling Layers
• Result: single-pixel shifts can change the classification [Azulay & Weiss, JMLR 2019]
• The problem can be solved, even for fractional shifts [Hagay et al., CVPR 2023]
• But what is the problem?
What is the problem? Aliasing!
• Subsampling below the Nyquist rate lets high frequencies masquerade as low ones: the signal reconstructed from the subsampled values differs from the original $x$, and a small shift of $x$ changes the subsampled values (see the demo below)
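A minimal aliasing demo; the signal and its frequency are assumptions chosen to sit near the Nyquist limit:

```python
import numpy as np

t = np.arange(64)
x = np.sin(2 * np.pi * 0.45 * t)    # frequency close to Nyquist
a = x[::2]                          # stride-2 subsampling
b = np.roll(x, 1)[::2]              # same signal shifted by one sample
print(np.abs(a - b).max())          # large: the subsampled signals disagree
# Low-pass filtering before subsampling (anti-aliasing) removes this.
```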
Convnet Backpropagation
For a convolution layer $\mathbf{y} = \mathbf{w} \ast \mathbf{x}$, both gradients are themselves convolutions:
• Kernel gradient: $\frac{\partial L}{\partial \mathbf{w}} = \mathbf{x} \ast \frac{\partial L}{\partial \mathbf{y}}$ (correlate the input with the upstream gradient)
• Input gradient: $\frac{\partial L}{\partial \mathbf{x}} = \operatorname{flip}(\mathbf{w}) \ast \operatorname{zeropad}\!\big(\frac{\partial L}{\partial \mathbf{y}}\big)$ (a “full” convolution: zero-pad the upstream gradient, then convolve with the 180°-flipped kernel)
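A numeric check of both identities against autograd, a sketch with assumed single-channel sizes:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 6, 6, requires_grad=True)
w = torch.randn(1, 1, 3, 3, requires_grad=True)
y = F.conv2d(x, w)                  # y = w * x (cross-correlation), 4x4
g = torch.randn_like(y)             # upstream gradient dL/dy
y.backward(g)

# dL/dw: correlate the input with the upstream gradient
dw = F.conv2d(x.detach(), g)
# dL/dx: zero-pad dL/dy and correlate with the flipped kernel
dx = F.conv2d(F.pad(g, (2, 2, 2, 2)), torch.flip(w.detach(), [-2, -1]))
print(torch.allclose(dw, w.grad, atol=1e-4),
      torch.allclose(dx, x.grad, atol=1e-4))   # True True
```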
History & Architectures
History – first CNN (1993)
ImageNet challenge
• Innovations (AlexNet, 2012):
• Max pooling, ReLU nonlinearity
• More data and bigger model (7 hidden layers, 650K units, 60M params)
• GPU implementation (50x speedup over CPU)
• Trained on two GPUs for a week
• Dropout regularization (later)
ImageNet top-5 error (%) by year: 2012 (AlexNet) 16.4 → 2013 11.7 → 2014(1) (VGG) 7.3 → 2014(2) 6.7 → 2015 (ResNet) 3.57; human ≈ 5
Comparing architectures
https://culurciello.github.io/tech/2016/06/04/nets.html
Basic Residual Architectures
• A residual block computes $\mathbf{y} = \mathbf{x} + F(\mathbf{x})$: the layers learn a residual correction on top of an identity skip connection
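A minimal sketch of such a block, assuming the common two-conv design with batch norm:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))   # identity skip + residual

y = ResidualBlock(16)(torch.randn(1, 16, 32, 32))  # shape preserved
```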
More efficient variations
• Concatenation?
• Architecture search
• Standard hyperparameters (learning rate, etc.) need tuning for a new architecture
Neural Architecture Search (NAS)
• Motivation: automating the architectural design process
• Huge search space → use a reduced space
• Search on small datasets (CIFAR10), apply to large datasets (ImageNet)
• Optimization methods (a toy search sketch follows):
• Evolutionary algorithms [AmoebaNet]
• Reinforcement learning [NASNet]
• Grid search [EfficientNet]
• Gradient-based methods (e.g., DARTS) …
• Hardware objectives (FLOPS, power, latency) can be added [MnasNet, EfficientNetV2]
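A toy random-search sketch over a reduced space; the space, the builder, and the parameter-count score are stand-ins (assumptions) for a real proxy objective such as CIFAR10 accuracy:

```python
import random
import torch.nn as nn

space = {"depth": [2, 4, 6], "width": [16, 32, 64], "kernel": [3, 5]}

def build(cfg):
    layers, c = [], 3
    for _ in range(cfg["depth"]):
        layers += [nn.Conv2d(c, cfg["width"], cfg["kernel"], padding="same"),
                   nn.ReLU()]
        c = cfg["width"]
    return nn.Sequential(*layers)

best = None
for _ in range(10):                        # random search over the space
    cfg = {k: random.choice(v) for k, v in space.items()}
    model = build(cfg)
    params = sum(p.numel() for p in model.parameters())
    score = -params                        # stand-in: real NAS uses proxy accuracy,
    if best is None or score > best[0]:    # possibly penalized by FLOPS/latency
        best = (score, cfg)
print(best[1])
```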
Extensions
Other Uses in Vision Tasks
Convnets for Speech Classification
• The 1D speech signal (amplitude over time) is converted into a 2D input over time, e.g. a spectrogram, and fed to a convnet (see the sketch below)
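A sketch of this pipeline; all parameter values (sample rate, FFT size, channel counts) are assumptions:

```python
import torch

wave = torch.randn(16000)                  # 1 s of audio at 16 kHz
spec = torch.stft(wave, n_fft=256, hop_length=128,
                  window=torch.hann_window(256),
                  return_complex=True).abs()
x = spec.log1p()[None, None]               # (1, 1, freq, time) "image"
y = torch.nn.Conv2d(1, 8, kernel_size=3, padding=1)(x)
print(spec.shape, y.shape)
```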
Summary
• Hierarchical representation using local connectivity
• Invariance and Equivariance using convolutions
• Building blocks
• Architectures
• Extensions
Some slides and visuals adapted from the courses cs231n (Stanford) and 236278 (Technion), and from “A guide to convolution arithmetic for deep learning”.