Deep Residual Learning for
Image Recognition
Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun
Presented by – Sanjay Saha, School of Computing, NUS
CS6240 – Multimedia Analysis – Sem 2 AY2019/20
Objective | Problem Statement
Presented by – Sanjay Saha (sanjaysaha@u.nus.edu) School of Computing
Motivation
Performance of plain networks in a deeper architecture
Image source: paper
Main Idea
• Skip connections / shortcuts
• Trying to avoid: ‘vanishing gradients’ and ‘long training times’
Image source: Wikipedia
Contributions | Problem Statement
• These extremely deep residual nets are easy to optimize, but the
counterpart “plain” nets (that simply stack layers) exhibit higher
training error when the depth increases.
• These deep residual nets can easily enjoy accuracy gains from greatly
increased depth, producing results substantially better than previous
networks.
A residual learning framework to ease the training of networks that
are substantially deeper than those used previously.
[Figure: performance vs. depth]
Literature
Literature Review
• Partial solutions for vanishing gradients
• Batch Normalization – normalizes layer activations over each mini-batch.
• Smart initialization of weights – for example, Xavier initialization.
• Train portions of the network individually.
• Highway Networks
• Feature residual connections of the form
$Y = f(x) \cdot \mathrm{sigmoid}(Wx + b) + x \cdot (1 - \mathrm{sigmoid}(Wx + b))$
• Data-dependent gated shortcuts with parameters
• When gates are ‘closed’, the layers become ‘non-residual’.
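The gated form above can be sketched for a single scalar input. This is a minimal illustration rather than the Highway Networks implementation: `highway_unit`, `transform`, and the weight values are hypothetical names and numbers chosen for the example.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def highway_unit(x, w, b, f):
    """Scalar highway unit: Y = f(x) * gate + x * (1 - gate), gate = sigmoid(w*x + b)."""
    gate = sigmoid(w * x + b)
    return f(x) * gate + x * (1.0 - gate)

def transform(x):
    # Hypothetical transform f(x), standing in for the layer's learned mapping.
    return math.tanh(x)

x = 0.5
# Gate near 0: the shortcut is open and the input passes through almost unchanged.
open_shortcut = highway_unit(x, w=0.0, b=-20.0, f=transform)
# Gate near 1: the shortcut is closed and the unit computes f(x) only (non-residual).
closed_shortcut = highway_unit(x, w=0.0, b=20.0, f=transform)
print(open_shortcut, closed_shortcut)
```

Because the gate depends on learned parameters and on the input, the shortcut can be switched off; the ResNet shortcut has no gate and is always open.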
ResNet | Design | Architecture
Plain Block
$a^{[l]} \rightarrow a^{[l+1]} \rightarrow a^{[l+2]}$
$z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}$ (“linear”)
$a^{[l+1]} = g(z^{[l+1]})$ (“relu”)
$z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}$ (“output”)
$a^{[l+2]} = g(z^{[l+2]})$ (“relu on output”)
Image source: deeplearning.ai
Residual Block
$a^{[l]} \rightarrow a^{[l+1]} \rightarrow a^{[l+2]}$
$z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}$ (“linear”)
$a^{[l+1]} = g(z^{[l+1]})$ (“relu”)
$z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}$ (“output”)
$a^{[l+2]} = g(z^{[l+2]} + a^{[l]})$ (“relu on output plus input”)
Image source: deeplearning.ai
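The two blocks differ only in the last step, which a small sketch with fully connected layers can make concrete. The weights and input below are toy values chosen for illustration, and the helper names (`linear`, `plain_block`, `residual_block`) are assumptions of this sketch, not code from the paper.

```python
def relu(v):
    return [max(0.0, t) for t in v]

def linear(W, a, b):
    """Compute W a + b for a weight matrix W (list of rows), vector a, bias b."""
    return [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i
            for row, b_i in zip(W, b)]

def plain_block(a, W1, b1, W2, b2):
    a1 = relu(linear(W1, a, b1))                  # a[l+1] = g(z[l+1])
    return relu(linear(W2, a1, b2))               # a[l+2] = g(z[l+2])

def residual_block(a, W1, b1, W2, b2):
    a1 = relu(linear(W1, a, b1))                  # a[l+1] = g(z[l+1])
    z2 = linear(W2, a1, b2)                       # z[l+2]
    return relu([z + s for z, s in zip(z2, a)])   # a[l+2] = g(z[l+2] + a[l])

# Toy values, chosen only for illustration.
a_l = [1.0, 2.0]
W1 = [[0.5, -1.0], [0.0, 1.0]]
b1 = [0.0, 0.5]
W2 = [[1.0, 0.0], [-1.0, 1.0]]
b2 = [0.0, 0.0]

plain_out = plain_block(a_l, W1, b1, W2, b2)
residual_out = residual_block(a_l, W1, b1, W2, b2)
print(plain_out)      # [0.0, 2.5]
print(residual_out)   # [1.0, 4.5]
```

The first component shows the effect: the plain block zeroes it out, while the residual block carries the input value 1.0 through the shortcut.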
Skip Connections
• Shortcuts skip over one or more layers.
• The skipped layers are referred to as the residual part of the network.
• The input is added to the output of the residual part; the dimensions are usually the same, so an identity shortcut suffices.
• When the dimensions differ, another option is to use a projection to the output space.
• Identity shortcuts add no training parameters; a projection adds only its own weights.
Image source: towardsdatascience.com
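The two shortcut options can be sketched as follows. In the paper's formulation the projection is a learned matrix $W_s$, so in this sketch it carries weights of its own, while the identity case carries none. The function name `shortcut` and the matrix values are hypothetical.

```python
def shortcut(x, out_dim, Ws=None):
    """Return (shortcut vector, number of shortcut parameters) for input x.

    Identity when the dimensions already match; otherwise a linear
    projection Ws x, where Ws has out_dim rows and len(x) columns.
    """
    if len(x) == out_dim:
        return list(x), 0
    if Ws is None or len(Ws) != out_dim:
        raise ValueError("a projection with out_dim rows is needed when dimensions differ")
    projected = [sum(w * v for w, v in zip(row, x)) for row in Ws]
    return projected, out_dim * len(x)

x = [1.0, 2.0]
same, n_same = shortcut(x, out_dim=2)            # identity shortcut

Ws = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]        # hypothetical 3x2 projection
proj, n_proj = shortcut(x, out_dim=3, Ws=Ws)     # projection shortcut
print(same, n_same)   # [1.0, 2.0] 0
print(proj, n_proj)   # [1.0, 2.0, 1.5] 6
```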
ResNet Architecture
Image source: paper
Stacked Residual Blocks
• 3x3 conv layers
• 2x # of filters, with stride 2 to down-sample
• Avg. pool after the last conv layer
• FC layer to output classes
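The pairing of doubled filters with stride-2 down-sampling can be traced stage by stage. This is a sketch under stated assumptions: the defaults (56x56 feature maps with 64 filters entering the first stage, four stages) are chosen to match the paper's ImageNet configuration for the 34-layer network, and `stage_layout` is a hypothetical helper.

```python
def stage_layout(input_size=56, base_filters=64, num_stages=4):
    """List (spatial size, filters) per stage: stride 2 halves the size, filters double."""
    layout = []
    size, filters = input_size, base_filters
    for _ in range(num_stages):
        layout.append((size, filters))
        size //= 2       # a stride-2 convolution down-samples the feature map
        filters *= 2     # the number of filters doubles at the same point
    return layout

layout = stage_layout()
print(layout)   # [(56, 64), (28, 128), (14, 256), (7, 512)]
```

Halving the spatial size while doubling the filters keeps the per-layer cost roughly constant across stages.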
ResNet Architecture
BOTTLENECK
Input: 28x28x256
• 1x1 conv with 64 filters: 28x28x64
• 3x3 conv on 64 feature maps only
• 1x1 conv with 256 filters: 28x28x256
Image source: paper
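A quick weight count shows why the bottleneck is cheap. The sketch below ignores biases and batch normalization and compares the bottleneck on a 256-channel input with two stacked 3x3 convolutions on 64 and on 256 channels. The helper names are assumptions of this example.

```python
def conv_weights(kernel, in_ch, out_ch):
    """Weight count of a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_ch * out_ch

def block_weights(layers):
    return sum(conv_weights(k, i, o) for k, i, o in layers)

bottleneck = [(1, 256, 64), (3, 64, 64), (1, 64, 256)]   # 1x1 -> 3x3 -> 1x1
basic_64 = [(3, 64, 64), (3, 64, 64)]                    # two 3x3 convs on 64 channels
basic_256 = [(3, 256, 256), (3, 256, 256)]               # two 3x3 convs on 256 channels

bottleneck_w = block_weights(bottleneck)
basic_64_w = block_weights(basic_64)
basic_256_w = block_weights(basic_256)

# Multiply-accumulates on a 28x28 map (stride 1, same padding assumed).
bottleneck_macs = bottleneck_w * 28 * 28

print(bottleneck_w)     # 69632
print(basic_64_w)       # 73728
print(basic_256_w)      # 1179648
print(bottleneck_macs)  # 54591488
```

The bottleneck on 256 channels costs about the same as the two-layer block on 64 channels, and roughly 17 times less than two 3x3 convolutions applied directly to 256 channels.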
Summary | Advantages
Benefits of Bottleneck
• Less training time for deeper networks
• By keeping time complexity similar to the two-layer conv block.
• Hence, allows increasing the # of layers.
• And lower complexity overall: the 152-layer ResNet has 11.3 billion FLOPs, while the VGG-16/19 nets have 15.3/19.6 billion FLOPs.
Image source: paper
Summary – Advantages of ResNet over Plain Networks
• A deeper plain network tends to perform worse (higher training error), which is often attributed to vanishing and exploding gradients.
• In such cases, a ResNet will stop improving rather than decrease in performance: $a^{[l+2]} = g(z^{[l+2]} + a^{[l]}) = g(W^{[l+2]} a^{[l+1]} + b^{[l+2]} + a^{[l]})$
• If a layer is not ‘useful’, L2 regularization will bring its parameters very close to zero, resulting in $a^{[l+2]} = g(a^{[l]}) = a^{[l]}$ (when using ReLU, since $a^{[l]} \ge 0$ is itself a ReLU output).
• In theory, a ResNet can represent the same functions as a plain network, but in practice, due to the above, convergence is much faster.
• Identity shortcuts introduce no additional training parameters or complexity.
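The identity fallback can be checked numerically: with the second layer's weights and biases set to zero, the block returns its input unchanged. This is a toy sketch with made-up activations, and `residual_output` is a hypothetical helper.

```python
def relu(v):
    return [max(0.0, t) for t in v]

def residual_output(a_l, W2, b2, a_l1):
    """a[l+2] = g(W[l+2] a[l+1] + b[l+2] + a[l])."""
    z2 = [sum(w * t for w, t in zip(row, a_l1)) + b for row, b in zip(W2, b2)]
    return relu([z + s for z, s in zip(z2, a_l)])

a_l = [0.3, 0.0, 1.7]     # non-negative, as the output of an earlier ReLU
a_l1 = [5.0, 2.0, 0.4]    # arbitrary intermediate activation
zero_W = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3

out = residual_output(a_l, zero_W, zero_b, a_l1)
print(out == a_l)   # True
```

Whatever the intermediate activation is, zeroed parameters leave only the shortcut term, and ReLU does not alter a non-negative vector.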
Results
• ILSVRC 2015 classification winner (3.6% top-5 error), better than “human performance”!
Error rates (%) of ensembles. The top-5 error is on the
test set of ImageNet and reported by the test server
Error rates (%, 10-crop testing) on ImageNet
validation set
Error rates (%) of single-model results on
the ImageNet validation set
Plain vs. ResNet
Image source: paper
Plain vs. Deeper ResNet
Image source: paper
Conclusion | Future Trends
Conclusion
• Easy to optimize deep neural networks.
• Accuracy gains from greatly increased depth.
• Addressed: vanishing gradients and longer training duration.
Future Trends
• Identity Mappings in Deep Residual Networks suggests passing the input directly through to the final residual layer, allowing the network to easily propagate the input as an identity mapping in both the forward and backward passes. (He et al., 2016)
• Using Batch Normalization as pre-activation improves regularization.
• Reduce training time with random layer drops.
• ResNeXt: Aggregated Residual Transformations for Deep Neural Networks. (Xie et al., 2016)
Questions?

Debo: A Lightweight and Modular Infrastructure Management System in CDebo: A Lightweight and Modular Infrastructure Management System in C
Debo: A Lightweight and Modular Infrastructure Management System in C
ssuser49be50
 
Cyber Security Presentation(Neon)xu.pptx
Cyber Security Presentation(Neon)xu.pptxCyber Security Presentation(Neon)xu.pptx
Cyber Security Presentation(Neon)xu.pptx
vilakshbhargava
 
GROUP 7 CASE STUDY Real Life Incident.pptx
GROUP 7 CASE STUDY Real Life Incident.pptxGROUP 7 CASE STUDY Real Life Incident.pptx
GROUP 7 CASE STUDY Real Life Incident.pptx
mardoglenn21
 
Chronic constipation presentaion final.ppt
Chronic constipation presentaion final.pptChronic constipation presentaion final.ppt
Chronic constipation presentaion final.ppt
DrShashank7
 

ResNet basics (Deep Residual Network for Image Recognition)

  • 1. Deep Residual Learning for Image Recognition Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun Presented by – Sanjay Saha, School of Computing, NUS CS6240 – Multimedia Analysis – Sem 2 AY2019/20
  • 2. Objective | Problem Statement Presented by – Sanjay Saha (sanjaysaha@u.nus.edu), School of Computing
  • 3. Motivation: performance of plain networks in a deeper architecture. Image source: paper
  • 4. Main Idea • Skip connections (shortcuts) • Trying to avoid: ‘vanishing gradients’ and ‘long training times’. Image source: Wikipedia
  • 5. Contributions | Problem Statement • A residual learning framework to ease the training of networks that are substantially deeper than those used previously. • These extremely deep residual nets are easy to optimize, but the counterpart “plain” nets (that simply stack layers) exhibit higher training error when the depth increases. • These deep residual nets can easily enjoy accuracy gains from greatly increased depth, producing results substantially better than previous networks. [Figure: performance vs. depth]
  • 6. Literature
  • 7. Literature Review • Partial solutions for vanishing gradients • Batch Normalization – normalizes layer activations over each mini-batch. • Smart initialization of weights – for example, Xavier initialization. • Train portions of the network individually. • Highway Networks • Feature residual connections of the form $Y = f(x) \times \mathrm{sigmoid}(Wx + b) + x \times (1 - \mathrm{sigmoid}(Wx + b))$ • Data-dependent gated shortcuts with parameters • When the shortcut gates are ‘closed’, the layers become ‘non-residual’.
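The highway formula on slide 7 can be illustrated with a minimal scalar sketch. The transform `f` and the gate parameters `w` and `b` below are hypothetical values chosen only to show the two extremes of the gate; they are not taken from the slides or the paper.

```python
import math

def sigmoid(v):
    """Logistic function used as the gate activation."""
    return 1.0 / (1.0 + math.exp(-v))

def highway_unit(x, f, w, b):
    """Scalar highway unit: y = f(x) * T + x * (1 - T), with T = sigmoid(w*x + b)."""
    t = sigmoid(w * x + b)
    return f(x) * t + x * (1.0 - t)

def residual_unit(x, f):
    """Scalar residual unit: the shortcut is ungated and has no parameters."""
    return f(x) + x

f = lambda x: 2.0 * x  # hypothetical transform, for illustration only
x = 1.5

# Large positive bias: T is close to 1, so the shortcut term x * (1 - T) is
# "closed" and the unit behaves like the non-residual function f(x).
shortcut_closed = highway_unit(x, f, w=0.0, b=20.0)

# Large negative bias: T is close to 0, so the input is carried through.
shortcut_open = highway_unit(x, f, w=0.0, b=-20.0)

# The residual unit always adds the input, whatever f computes.
residual = residual_unit(x, f)
```

The contrast this is meant to show: the highway shortcut depends on learned gate parameters and on the data, whereas the residual shortcut is always on.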
  • 8. ResNet | Design | Architecture
  • 9. Plain Block ($a^{[l]} \to a^{[l+1]} \to a^{[l+2]}$) • $z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}$ (“linear”) • $a^{[l+1]} = g(z^{[l+1]})$ (“relu”) • $z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}$ (“output”) • $a^{[l+2]} = g(z^{[l+2]})$ (“relu on output”) Image source: deeplearning.ai
  • 10. Residual Block ($a^{[l]} \to a^{[l+1]} \to a^{[l+2]}$, with a shortcut from $a^{[l]}$) • $z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}$ (“linear”) • $a^{[l+1]} = g(z^{[l+1]})$ (“relu”) • $z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}$ (“output”) • $a^{[l+2]} = g(z^{[l+2]} + a^{[l]})$ (“relu on output plus input”) Image source: deeplearning.ai
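The equations on slides 9 and 10 can be sketched as a forward pass in plain Python. This is a minimal illustration with fully connected layers and hand-picked weights (all values below are assumptions for the example); the actual ResNet blocks use convolutions and batch normalization.

```python
def relu(v):
    """Element-wise ReLU, the activation g in the slides."""
    return [max(0.0, x) for x in v]

def linear(W, a, b):
    """Compute z = W a + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, a)) + b_i for row, b_i in zip(W, b)]

def plain_block(a_l, W1, b1, W2, b2):
    """Two stacked layers: a[l+2] = g(z[l+2])."""
    a1 = relu(linear(W1, a_l, b1))   # a[l+1] = g(z[l+1])
    z2 = linear(W2, a1, b2)          # z[l+2]
    return relu(z2)

def residual_block(a_l, W1, b1, W2, b2):
    """Same two layers plus the shortcut: a[l+2] = g(z[l+2] + a[l])."""
    a1 = relu(linear(W1, a_l, b1))
    z2 = linear(W2, a1, b2)
    return relu([z + a for z, a in zip(z2, a_l)])

# Hypothetical input and weights, chosen so the arithmetic is easy to follow.
a_l = [1.0, 2.0]
W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
W2, b2 = [[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0]

plain_out = plain_block(a_l, W1, b1, W2, b2)        # [0.0, 0.0]
residual_out = residual_block(a_l, W1, b1, W2, b2)  # [1.0, 0.5]
```

In this example the plain block zeroes out the signal, while the residual block still carries the input forward through the shortcut.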
  • 11. Skip Connections • Skip over the intermediate layers! • The layers that the shortcut bypasses are referred to as the residual part of the network. • The input is added to the output of this residual part, so the dimensions are usually the same. • When the dimensions differ, another option is to use a projection to the output space. • Identity shortcuts add no trainable parameters; projection shortcuts add only the projection weights. Image source: towardsdatascience.com
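The two shortcut options on slide 11 can be sketched as follows. In the paper, an identity shortcut has no parameters, while a projection shortcut multiplies the input by a learned matrix (written `W_s` here) to match dimensions. The helper names and the example values are hypothetical.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def add_shortcut(f_out, x, W_s=None):
    """Return f_out + x (identity shortcut) or f_out + W_s x (projection shortcut)."""
    skip = list(x) if W_s is None else matvec(W_s, x)
    if len(skip) != len(f_out):
        raise ValueError("shortcut and residual output must have the same dimension")
    return [f + s for f, s in zip(f_out, skip)]

def shortcut_params(W_s=None):
    """Number of trainable weights the shortcut itself contributes."""
    return 0 if W_s is None else sum(len(row) for row in W_s)

x = [1.0, 2.0]

# Same dimension: the identity shortcut needs no parameters.
same_dim = add_shortcut([0.5, -0.5], x)

# Residual output has 3 dimensions, input has 2: project with a 3x2 matrix.
W_s = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
projected = add_shortcut([0.0, 0.0, 1.0], x, W_s)
```

Here `shortcut_params()` is 0 for the identity case and 6 for the 3x2 projection, which is the only extra cost of the projection option.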
  • 12. ResNet Architecture Image source: paper
  • 13. ResNet Architecture: stacked residual blocks. Image source: paper
  • 14. ResNet Architecture • 3x3 conv layers • The number of filters doubles (2x) at each down-sampling stage • Stride of 2 to down-sample • Average pooling after the last conv layer • FC layer to output classes. Image source: paper
  • 17. ResNet Architecture • Input: 28x28x256 • 1x1 conv with 64 filters → 28x28x64. Image source: paper
  • 18. ResNet Architecture • Input: 28x28x256 • 1x1 conv with 64 filters → 28x28x64 • 3x3 conv on 64 feature maps only. Image source: paper
  • 19. ResNet Architecture (BOTTLENECK) • Input: 28x28x256 • 1x1 conv with 64 filters → 28x28x64 • 3x3 conv on 64 feature maps only • 1x1 conv with 256 filters → 28x28x256. Image source: paper
  • 20. Summary | Advantages
  • 21. Benefits of Bottleneck • Less training time for deeper networks • By keeping the time complexity similar to that of the two-layer conv block. • Hence, it allows increasing the number of layers. • And the deeper model is still cheaper to compute: the 152-layer ResNet has 11.3 billion FLOPs, while the VGG-16/19 nets have 15.3/19.6 billion FLOPs. (Figure input: 28x28x256.) Image source: paper
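The complexity claim on slide 21 can be checked with a rough weight count for the 28x28x256 example from slides 17 to 19. The sketch below assumes stride 1, "same" padding, and ignores biases and batch normalization. The comparison against a two-layer 3x3 block on 64 channels follows the paper's pairing of the two block designs; the helper name `conv_weights` is hypothetical.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out

# Bottleneck on a 256-channel input: 1x1 (256 -> 64), 3x3 (64 -> 64), 1x1 (64 -> 256).
bottleneck = (
    conv_weights(1, 256, 64)
    + conv_weights(3, 64, 64)
    + conv_weights(1, 64, 256)
)

# Two-layer 3x3 block on 64 channels: similar cost to the bottleneck.
basic_64 = 2 * conv_weights(3, 64, 64)

# Two-layer 3x3 block applied directly to 256 channels: far more expensive.
basic_256 = 2 * conv_weights(3, 256, 256)

# With stride 1 and same padding, each weight is used once per output position,
# so multiply-accumulates scale with the 28 x 28 spatial size.
positions = 28 * 28
bottleneck_macs = bottleneck * positions

print(bottleneck, basic_64, basic_256)  # 69632 73728 1179648
```

Under these assumptions the bottleneck handles 256-channel features for slightly less than the cost of the two-layer block on 64 channels, and for roughly 1/17 of the cost of two 3x3 convolutions on all 256 channels.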
  • 22. Summary – Advantages of ResNet over Plain Networks • A deeper plain network tends to perform badly because of vanishing and exploding gradients. • In such cases, a ResNet will stop improving rather than decrease in performance: $a^{[l+2]} = g(z^{[l+2]} + a^{[l]}) = g(W^{[l+2]} a^{[l+1]} + b^{[l+2]} + a^{[l]})$ • If a layer is not ‘useful’, L2 regularization will bring its parameters very close to zero, resulting in $a^{[l+2]} = g(a^{[l]}) = a^{[l]}$ (when using ReLU, since $a^{[l]} \ge 0$). • In theory, a ResNet can represent the same functions as a plain network, but in practice, because of the above, convergence is much faster. • Identity shortcuts introduce no additional trainable parameters or computational complexity.
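The identity argument on slide 22 can be checked numerically. The sketch below shrinks the second layer's weights toward zero, standing in for the effect the slide attributes to L2 regularization, and shows the block output approaching its input. The weights and the `scaled` helper are hypothetical; the input is non-negative because it is assumed to come from a previous ReLU.

```python
def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]

def affine(W, a, b):
    """Compute W a + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, a)) + b_i for row, b_i in zip(W, b)]

def residual_block(a_l, W1, b1, W2, b2):
    """a[l+2] = g(W[l+2] a[l+1] + b[l+2] + a[l])."""
    a1 = relu(affine(W1, a_l, b1))
    z2 = affine(W2, a1, b2)
    return relu([z + a for z, a in zip(z2, a_l)])

def scaled(M, s):
    """Scale every entry of matrix M by s."""
    return [[s * w for w in row] for row in M]

a_l = [0.5, 2.0]                      # non-negative, as the output of a ReLU
W1, b1 = [[1.0, -2.0], [3.0, 0.5]], [0.1, -0.1]
W2, b2 = [[-1.0, 2.0], [0.5, 1.0]], [0.0, 0.0]

# Shrink the second layer's weights: the output drifts toward the input.
small_out = residual_block(a_l, W1, b1, scaled(W2, 1e-3), b2)

# With the weights exactly zero, z[l+2] = 0 and the block is the identity.
identity_out = residual_block(a_l, W1, b1, scaled(W2, 0.0), b2)
```

A plain block with the same zeroed weights would output all zeros instead, which is why an unhelpful layer is harmless in the residual case but destructive in the plain case.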
  • 23. Results
  • 24. Results • ILSVRC 2015 classification winner (3.6% top-5 error), better than “human performance”! Table: error rates (%) of ensembles. The top-5 error is on the test set of ImageNet and reported by the test server.
  • 25. Results • Table: error rates (%, 10-crop testing) on the ImageNet validation set. • Table: error rates (%) of single-model results on the ImageNet validation set.
  • 26. Plain vs. ResNet Image source: paper
  • 27. Plain vs. Deeper ResNet Image source: paper
  • 28. Conclusion | Future Trends
  • 29. Conclusion • Deep neural networks become easy to optimize. • Accuracy gains from greatly increased depth. • Addressed: vanishing gradients and longer training duration.
  • 33. Future Trends • Identity Mappings in Deep Residual Networks proposes passing the input directly through to the final residual layer, allowing the network to easily learn to pass the input as an identity mapping in both the forward and backward passes (He et al. 2016). • Using Batch Normalization as pre-activation improves regularization. • Reduce training time with random layer drops. • ResNeXt: Aggregated Residual Transformations for Deep Neural Networks (Xie et al. 2016).
  • 34. Questions?