CNN Architectures - Transfer Learning
Day 3
CNN Architecture Decisions
➢ Number of Layers
➢ Number of filters
➢ Filter or Kernel Size
➢ Pooling
➢ Stride
➢ Fully Connected Layers
➢ Regularizers e.g. Batch Norm, Dropout
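Each of these decisions shows up directly when a network is defined in code. A minimal sketch, assuming PyTorch (the layer sizes are purely illustrative):

```python
import torch
import torch.nn as nn

# Illustrative only: each architecture decision from the list above maps to a layer choice.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),  # number of filters, kernel size, stride
    nn.BatchNorm2d(32),                                     # regularizer: Batch Norm
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),                  # pooling
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),  # number of layers: add more Conv blocks
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Dropout(0.5),                                        # regularizer: Dropout
    nn.Linear(64 * 56 * 56, 256),                           # fully connected layers
    nn.ReLU(),
    nn.Linear(256, 10),
)

x = torch.randn(1, 3, 224, 224)
print(model(x).shape)  # torch.Size([1, 10])
```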
What are the Best Practices?
ImageNet (image-net.org)
- ~22K categories
- Human labeled
- Models compared by Top-5 error rate
AlexNet (CNN based)
[Figure: AlexNet architecture]
- Input: 227x227x3
- Conv 1: 96 filters, 11x11, Stride = 4
- Max Pool: 3x3, S:2 (Overlapping)
- Conv 2: 256 filters, 5x5, S:1, Padding = 2
- Max Pool: 3x3, S:2
- Conv 3: 384 filters, 3x3, S:1, P:1
- Conv 4: 384 filters, 3x3, S:1, P:1
- Conv 5: 256 filters, 3x3, S:1, P:1
- Max Pool: 3x3, S:2
- FC 4096
- FC 4096
- FC 1000
- SoftMax
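A sketch of the stack above in PyTorch, assuming the figure's dimensions; it approximates AlexNet rather than reproducing the original two-GPU implementation:

```python
import torch
import torch.nn as nn

# AlexNet-style stack following the figure: 227x227x3 input,
# overlapping 3x3/stride-2 max pools, three FC layers (SoftMax applied in the loss).
alexnet = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),              # Conv 1: 96 filters, 11x11, S=4
    nn.MaxPool2d(kernel_size=3, stride=2),                               # overlapping max pool
    nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2), nn.ReLU(),  # Conv 2
    nn.MaxPool2d(3, 2),
    nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1), nn.ReLU(), # Conv 3
    nn.Conv2d(384, 384, kernel_size=3, stride=1, padding=1), nn.ReLU(), # Conv 4
    nn.Conv2d(384, 256, kernel_size=3, stride=1, padding=1), nn.ReLU(), # Conv 5
    nn.MaxPool2d(3, 2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(), nn.Dropout(0.5),           # FC 4096
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(0.5),                  # FC 4096
    nn.Linear(4096, 1000),                                              # FC 1000
)

print(alexnet(torch.randn(1, 3, 227, 227)).shape)  # torch.Size([1, 1000])
```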
AlexNet - Overlapping Max Pool
Example: 5x5 input, 3x3 filter, stride 2
1 4 5 2 7
5 3 6 3 6
7 2 1 1 4
3 9 4 6 7
4 2 5 1 2
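As a quick check of the arithmetic, a sketch (assuming PyTorch) applying the 3x3, stride-2 overlapping max pool to the 5x5 grid above:

```python
import torch
import torch.nn.functional as F

# The 5x5 example above, pooled with a 3x3 window and stride 2 (overlapping max pool).
x = torch.tensor([[1, 4, 5, 2, 7],
                  [5, 3, 6, 3, 6],
                  [7, 2, 1, 1, 4],
                  [3, 9, 4, 6, 7],
                  [4, 2, 5, 1, 2]], dtype=torch.float32)

out = F.max_pool2d(x.view(1, 1, 5, 5), kernel_size=3, stride=2)
print(out.view(2, 2))
# tensor([[7., 7.],
#         [9., 7.]])
```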
ReLU vs tanh: AlexNet used ReLU activations, which train considerably faster than tanh.
Data Augmentation
- Horizontal Flip
- Random Crop
- Inference (test-time) Augmentation
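A sketch of these augmentations with torchvision (assumed here; crop sizes are illustrative):

```python
from torchvision import transforms

# Typical training-time augmentation matching the list above.
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(227),       # random crop, resized to the network input
    transforms.RandomHorizontalFlip(p=0.5),  # horizontal flip
    transforms.ToTensor(),
])

# At inference time, augmentation is usually limited to deterministic resize/crop,
# or several augmented copies are averaged (test-time augmentation).
eval_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(227),
    transforms.ToTensor(),
])
```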
CNN based (deeper variant)
- FC 4096
- FC 4096
- Conv 1024, 3x3, S:1 (smaller filter size but more filters)
- Conv 512, 3x3, S:1
- Pool 3x3, S:2
- Input 224x224x3
- Training: GTX 580, 11-12 days
Building Deeper Networks
VGG (2014)
researchgate.net
VGG stack (top to bottom):
- SoftMax
- FC 1000
- FC 4096
- FC 4096
- Pool 3x3
- Conv 3x3, 512
- ...
- Conv 3x3, 128
- Conv 3x3, 128
- Pool
- Conv 3x3, 64
- Conv 3x3, 64
- Input
All Conv filters: 3x3, stride 1, pad 1
What should be the Filter Size?
➢ Smaller size filter or smaller receptive field?
➢ Pooling
[Figure: receptive fields - one 5x5 filter with ReLU vs. two stacked 3x3 filters, each with ReLU, covering the same 5x5 region of the input]
Input 30x30x64 in both cases:
- Two stacked Conv layers, 64 filters of 3x3, S=1, ReLU: 3x3x64x64 + 3x3x64x64 = 18x64x64 parameters
- One Conv layer, 64 filters of 5x5, S=1, ReLU: 5x5x64x64 = 25x64x64 parameters
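The same parameter counts reproduced in PyTorch (assumed library; bias terms disabled so the counts match the arithmetic above):

```python
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# Two stacked 3x3 convs vs. one 5x5 conv, both 64 -> 64 channels.
two_3x3 = nn.Sequential(
    nn.Conv2d(64, 64, 3, stride=1, padding=1, bias=False), nn.ReLU(),
    nn.Conv2d(64, 64, 3, stride=1, padding=1, bias=False), nn.ReLU(),
)
one_5x5 = nn.Conv2d(64, 64, 5, stride=1, padding=2, bias=False)

print(n_params(two_3x3))  # 73728  = 18 x 64 x 64
print(n_params(one_5x5))  # 102400 = 25 x 64 x 64
```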
Increasing Filters with Depth
researchgate.net
Ensembles
Model #1, Model #2, ..., Model #n → Average of multiple predictions
Ensembles in VGG
- VGG16
- VGG19
Reduces overfitting, improves accuracy
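A minimal sketch of ensembling, assuming PyTorch; the model list is a placeholder for separately trained networks (e.g. VGG16 and VGG19 runs), assumed to already be in eval() mode:

```python
import torch

@torch.no_grad()
def ensemble_predict(models, x):
    # Average of multiple predictions: softmax each model's output, then mean over models.
    probs = [torch.softmax(m(x), dim=1) for m in models]
    return torch.stack(probs).mean(dim=0)

# usage (hypothetical): preds = ensemble_predict([model1, model2, model3], batch)
```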
Summary - VGG (2014)
GoogLeNet
- 9 stacked Inception modules
- Error rate of 6.7%
[Figure: GoogLeNet - stem (Input → Conv → Pool → Conv → Conv → Pool) followed by Inception modules 1-9]
Convolution OR Pooling?
Naive Inception module (Convolution and Pooling in parallel):
- Input (Previous Layer): 28x28x256
- Branches: 128 1x1 Conv (S:1, P:0) | 192 3x3 Conv (S:1, P:1) | 96 5x5 Conv (S:1, P:2) | 3x3 MaxPool (S:1, P:1)
- Concatenation: 28x28x(128+192+96+256) = 28x28x672
- 5x5 Conv branch alone: 28x28x96x5x5x256 multiplies
- Total: 854M multiplies
- Computationally very complex
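The 854M figure can be reproduced branch by branch (the max-pool branch adds no multiplies):

```python
# Multiply count per branch = output_h * output_w * n_filters * kernel_h * kernel_w * input_depth
branch_1x1 = 28 * 28 * 128 * 1 * 1 * 256   #  ~25.7M
branch_3x3 = 28 * 28 * 192 * 3 * 3 * 256   # ~346.8M
branch_5x5 = 28 * 28 * 96 * 5 * 5 * 256    # ~481.7M

print((branch_1x1 + branch_3x3 + branch_5x5) / 1e6)  # ~854.2M multiplies
```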
Power of 1x1 Convolution
- 28x28x256 → 28x28x32 (32 filters of 1x1)
- Reduces depth
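A sketch in PyTorch (assumed) of a 1x1 convolution with 32 filters reducing the depth of the 28x28x256 volume:

```python
import torch
import torch.nn as nn

# A 1x1 convolution with 32 filters collapses the depth from 256 to 32
# while keeping the 28x28 spatial size.
reduce = nn.Conv2d(in_channels=256, out_channels=32, kernel_size=1)

x = torch.randn(1, 256, 28, 28)
print(reduce(x).shape)  # torch.Size([1, 32, 28, 28])
```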
Efficient Inception Module
[Figure: Inception module on a 28x28x256 input (Previous Layer) with 1x1 Conv bottlenecks before the 3x3 and 5x5 Convs and after the MaxPool; branch outputs concatenated]
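A sketch of such a module in PyTorch; the branch widths below are illustrative, not GoogLeNet's exact values:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Inception module with 1x1 bottlenecks before the larger convolutions."""
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, 1)                                  # 1x1 branch
        self.b2 = nn.Sequential(nn.Conv2d(in_ch, c3_red, 1), nn.ReLU(),    # 1x1 reduce, then 3x3
                                nn.Conv2d(c3_red, c3, 3, padding=1))
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c5_red, 1), nn.ReLU(),    # 1x1 reduce, then 5x5
                                nn.Conv2d(c5_red, c5, 5, padding=2))
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),      # pool, then 1x1 projection
                                nn.Conv2d(in_ch, pool_proj, 1))

    def forward(self, x):
        # Concatenation along the channel dimension.
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

m = InceptionModule(256, 128, 64, 192, 32, 96, 64)
print(m(torch.randn(1, 256, 28, 28)).shape)  # torch.Size([1, 480, 28, 28])
```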
GoogLeNet Architecture
Auxiliary Loss
Earlier network approaches vs. the GoogLeNet approach:
- Earlier: Conv output 7x7x1024 → FC Layer → 1024
- GoogLeNet: Conv output 7x7x1024 → Global Average Pooling → 1x1x1024
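A quick parameter comparison of the two heads (PyTorch assumed):

```python
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# Earlier approach: flatten 7x7x1024 and use a fully connected layer to 1024 outputs.
fc_head = nn.Sequential(nn.Flatten(), nn.Linear(7 * 7 * 1024, 1024))

# GoogLeNet approach: global average pooling collapses 7x7x1024 to 1x1x1024 with no weights.
gap_head = nn.AdaptiveAvgPool2d(1)

print(n_params(fc_head))   # 51,381,248 parameters
print(n_params(gap_head))  # 0 parameters
```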
Deeper Networks?
ResNet (2015)
- Ultra deep: 152 Layers
[Figure: ResNet architecture - Input → Conv 64, 7x7 → Pool → stacked 3x3 Conv blocks (64, 128, ...) → Pool]
Residual Block
[Figure: plain stacked Conv/ReLU layers vs. a residual block - X → Conv → ReLU → Conv gives F(X); a skip connection adds X, then ReLU]
H(X) = F(X) + X
F(X) = H(X) - X → the residual is a smaller value, easier to optimize
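A minimal residual block sketch in PyTorch (BatchNorm added as in standard ResNet blocks, though the figure shows only Conv and ReLU):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Minimal residual block: H(x) = F(x) + x (same channel count, stride 1)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))   # Conv -> ReLU
        out = self.bn2(self.conv2(out))         # Conv: this is F(x)
        return F.relu(out + x)                  # skip connection: F(x) + x, then ReLU

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```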
Summary - ResNet (2015)
[Figure: VGGNet stack shown for comparison - Input → 2x Conv 3x3, 64 → Pool → 2x Conv 3x3, 128 → Pool → 2x Conv 3x3, 256 → Pool → Conv 3x3, 512 blocks → Pool 3x3 → FC 4096 → FC 4096 → FC 1000 → SoftMax]
Identifying Flowers
- Daisy
- Roses
- Dandelion
- Tulips
- Sunflowers
Applying Transfer Learning
[Figure: ResNet 200 (Frozen Layers) → Flatten → Fully Connected → Fully Connected (5, SoftMax) → Daisy / Roses / Dandelion / Tulips / Sunflowers]
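A sketch of this setup with torchvision (assumed, version >= 0.13 for the weights argument); ResNet50 stands in for the figure's ResNet 200, which torchvision does not provide:

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

for p in backbone.parameters():          # freeze all pretrained layers
    p.requires_grad = False

backbone.fc = nn.Sequential(             # new head, trained from scratch
    nn.Linear(backbone.fc.in_features, 256),
    nn.ReLU(),
    nn.Linear(256, 5),                   # 5 flower classes; SoftMax applied in the loss
)
# Only the new head's parameters are updated during training.
```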
Do we keep all Layers Frozen?
More Options