ch4_CNN
DEEP LEARNING
CNN USE CASES
Object detection
RGB IMAGE
Shape : (n × m × 3)
n × m : image resolution
3 : number of channels (Red, Green, Blue)
GRAYSCALE IMAGE
Shape : (n × m × 1), a single intensity channel per pixel
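A quick way to see these shapes in practice (a sketch assuming Pillow and NumPy are available; "photo.jpg" is a hypothetical example file):

```python
import numpy as np
from PIL import Image  # assumes Pillow is installed

# "photo.jpg" is a hypothetical example file
img = Image.open("photo.jpg")

rgb = np.asarray(img.convert("RGB"))   # shape: (height, width, 3)
gray = np.asarray(img.convert("L"))    # shape: (height, width), one channel

print(rgb.shape)   # e.g. (480, 640, 3)  -> n*m resolution, 3 channels
print(gray.shape)  # e.g. (480, 640)
```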
Why CNN ?
Images contain a very large number of pixels!
A fully-connected neural network would require far too many parameters!
Translational invariance in images : the same pattern can appear anywhere, so weights should be shared across positions.
Yann LeCun (pioneer of convolutional networks)
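To make the "too many parameters" point concrete, a rough back-of-the-envelope comparison (the layer sizes below are illustrative assumptions, not taken from the slides):

```python
# Fully-connected: a 224x224x3 image flattened and fed to 1000 hidden units
fc_params = (224 * 224 * 3) * 1000          # about 150 million weights

# Convolutional: 64 filters of size 3x3x3, shared across all positions
conv_params = (3 * 3 * 3) * 64 + 64         # about 1.8 thousand weights (incl. biases)

print(f"fully-connected: {fc_params:,}  vs  conv layer: {conv_params:,}")
```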
WHAT IS A CNN?
A CNN is a type of artificial neural network designed to extract features from high-dimensional data (such as images) and to classify them.
Basic Architecture : convolutional and pooling layers for feature extraction, followed by fully-connected layers for classification.
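A minimal sketch of such a basic architecture in Keras (the layer sizes and counts are illustrative choices):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(64, 64, 3)),          # n x m x 3 RGB input (illustrative size)
    layers.Conv2D(32, 3, activation="relu"),  # convolution: feature extraction
    layers.MaxPooling2D(2),                   # pooling: spatial down-sampling
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # fully-connected classifier head
])
model.summary()
```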
CONVOLUTION LAYER : 1 CHANNEL
Convolving a 5x5x1 image with a 3x3x1 kernel (or filter) yields a 3x3x1 convolved feature.
By convention, the filter size f is usually odd in computer vision.
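The same 5x5 * 3x3 operation written out with NumPy (a sketch; stride 1, no padding, and no kernel flipping, i.e. the cross-correlation used in CNNs):

```python
import numpy as np

image = np.arange(25).reshape(5, 5)         # a 5x5 single-channel "image"
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])              # a 3x3 vertical-edge filter

out = np.zeros((3, 3))                       # (5 - 3 + 1) x (5 - 3 + 1)
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

print(out.shape)  # (3, 3) convolved feature
```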
CONVOLUTION LAYER : 3 CHANNELS
In the case of images with multiple channels (e.g. RGB), the kernel has the same depth as the input image, and the per-channel results are summed into a single output value.
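A Keras Conv2D layer makes this explicit: with an RGB input each filter has shape 3x3x3 (the filter count of 8 is an illustrative assumption):

```python
from tensorflow.keras import layers, Input, Model

inp = Input(shape=(32, 32, 3))                        # RGB input
conv = layers.Conv2D(filters=8, kernel_size=3)(inp)   # each filter has shape 3x3x3
model = Model(inp, conv)

# parameters = 3*3*3 weights per filter * 8 filters + 8 biases = 224
print(model.count_params())
```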
IMAGE KERNELS
https://setosa.io/ev/image-kernels/
MULTIPLE CONVOLUTION LAYERS
CONVOLUTIONAL LAYER - HYPERPARAMETERS : STRIDE
The stride is the step size by which the filter moves horizontally and vertically over the pixels of the input image during convolution.
Stride value :
To capture many fine-grained features : small stride.
If we are only interested in macro-level features : large stride.
Illustration : stride = 2
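A quick check of how the stride changes the output size (Keras sketch; the input size and filter count are illustrative):

```python
from tensorflow.keras import layers, Input

x = Input(shape=(28, 28, 1))
print(layers.Conv2D(4, 3, strides=1)(x).shape)  # (None, 26, 26, 4): fine-grained
print(layers.Conv2D(4, 3, strides=2)(x).shape)  # (None, 13, 13, 4): coarser, fewer positions
```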
CONVOLUTIONAL LAYER - HYPERPARAMETERS : PADDING
Downsides of convolution :
After each convolution operation the image shrinks : stacking many convolutions loses a lot of information.
During convolution, pixels in the corners and along the edges are covered by far fewer filter positions than central pixels :
a lot of information near the edges of the image is thrown away.
Solution : 'pad' the image with a border of extra (typically zero-valued) pixels.
Padding : p
Output image size : ((n + 2p - f) / s + 1) × ((n + 2p - f) / s + 1),
with n the input size, f the filter size, p the padding and s the stride (take the floor when s does not divide evenly).
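The same formula as a small helper function (a sketch; integer division gives the floor when the stride does not divide evenly):

```python
def conv_output_size(n, f, p=0, s=1):
    """Output side length for an n x n input, f x f filter, padding p, stride s."""
    return (n + 2 * p - f) // s + 1

print(conv_output_size(5, 3))             # 3  : the 5x5 * 3x3 example above
print(conv_output_size(5, 3, p=1))        # 5  : "same" padding preserves the size
print(conv_output_size(28, 3, p=0, s=2))  # 13 : a larger stride shrinks the output faster
```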
POOLING LAYERS
Pooling makes the representation more compact by reducing the spatial size of the feature maps, which in turn reduces the computation and the number of parameters to be learnt in the following layers.
The pooling layer itself has NO trainable parameters.
Illustrations :
EXAMPLE MAXPOOLING
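A minimal 2x2 max-pooling with stride 2, written out with NumPy (the input values are illustrative):

```python
import numpy as np

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 1],
                 [3, 4, 5, 6]])

# 2x2 max pooling with stride 2: keep the maximum of each 2x2 block
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 4]
#  [7 9]]
```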
TRANSFER LEARNING
A technique where knowledge acquired from solving one task is reused to enhance performance on a
related task.
Key points about transfer learning:
Transfer learning has been studied since the 1970s.
Transfer learning finds applications in various domains, including cancer subtype discovery, text
classification, medical imaging, and spam filtering.
By reusing information from previously learned tasks, transfer learning significantly improves
learning efficiency.
BENEFITS OF USING TRANSFER LEARNING
IMPLEMENTING TRANSFER LEARNING
1. Get the pre-trained model: obtain a pre-trained model suited to the problem at hand.
2. Create a base model: instantiate the base model using one of the well-known architectures, optionally loading pre-trained weights.
3. Freeze layers so they don't change during training: base_model.trainable = False
4. Add new trainable layers on top of the frozen base.
5. Train the new layers on the new dataset.
6. Enhance the model with fine-tuning (see the sketch below and the next slide).
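The six steps above as a Keras sketch (the choice of MobileNetV2, the input size, and the number of classes are illustrative assumptions, not prescribed by the slides):

```python
from tensorflow import keras
from tensorflow.keras import layers

# 1-2. Get / instantiate a pre-trained base model (ImageNet weights, no classifier head)
base_model = keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")

# 3. Freeze the base so its weights do not change during training
base_model.trainable = False

# 4. Add new trainable layers on top
inputs = keras.Input(shape=(160, 160, 3))
x = base_model(inputs, training=False)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(5, activation="softmax")(x)   # e.g. 5 new classes
model = keras.Model(inputs, outputs)

# 5. Train only the new layers on the new dataset
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)   # train_ds / val_ds: your data

# 6. (Optional) fine-tune afterwards; see the next slide
```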
FINE TUNING
Fine-tuning refers to taking a pre-trained model and further training it on a new dataset.
Fine-tuning can involve training the entire model, including the initial layers.
It is performed by unfreezing the base model (or part of it) and retraining the whole model on the new dataset at a very low learning rate.
The later layers may use a higher learning rate than the early layers so that they adapt more strongly to the new dataset.
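Continuing the sketch from the previous slide, fine-tuning unfreezes the base model (or part of it) and retrains with a very low learning rate:

```python
from tensorflow import keras

# base_model and model come from the transfer-learning sketch above

# Unfreeze the base model (or only its last few blocks)
base_model.trainable = True

# Re-compile with a very low learning rate so the pre-trained weights change slowly
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)
```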
WAYS TO FINE TUNE THE MODEL
Feature extraction : remove the output layer and use the rest of the network as a fixed feature extractor for the new dataset.
Use the architecture of the pre-trained model : keep the architecture but initialise all the weights randomly and train the model from scratch on the new dataset.
Train some layers while freezing others : keep the weights of the initial layers frozen and retrain only the higher layers (see the sketch below).
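The third strategy (freeze the initial layers, retrain the higher ones) can be written layer by layer; the cut-off index below is an illustrative choice:

```python
# base_model: the pre-trained Keras model from the earlier sketch
# Freeze the first 100 layers, retrain the rest (100 is an illustrative cut-off)
for layer in base_model.layers[:100]:
    layer.trainable = False
for layer in base_model.layers[100:]:
    layer.trainable = True
```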
CLASSIC CNN ARCHITECTURES
LeNet-5: 1998
AlexNet: 2012
VGG-Net : 2014
Inception-v1 to v3 : 2014–2015
ResNet: 2015
LENET-5
ALEXNET
VGG-NET
RESNET : RESIDUAL BLOCKS
RESNET
INCEPTION-1 :
Problems :
The salient information in an image varies greatly in size and location, which makes choosing the right kernel size for the convolution operation difficult.
Very deep networks are prone to overfitting, and it is also hard to pass gradient updates through the entire network.
Naively stacking large convolution operations is computationally expensive.
Solution : filters of several sizes operating at the same level (the Inception module), as sketched below.
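A minimal Inception-style block in the Keras functional API (filter counts are illustrative; the original GoogLeNet module also adds 1x1 "bottleneck" convolutions before the 3x3 and 5x5 branches):

```python
from tensorflow.keras import layers

def inception_block(x):
    # Parallel branches with different receptive fields, all applied to the same input
    b1 = layers.Conv2D(16, 1, padding="same", activation="relu")(x)   # 1x1
    b2 = layers.Conv2D(16, 3, padding="same", activation="relu")(x)   # 3x3
    b3 = layers.Conv2D(16, 5, padding="same", activation="relu")(x)   # 5x5
    b4 = layers.MaxPooling2D(3, strides=1, padding="same")(x)         # pooling branch
    # Concatenate along the channel axis so the network can pick the best scale
    return layers.concatenate([b1, b2, b3, b4])
```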
INCEPTION-1 :
Inception is a deep convolutional neural network architecture that was introduced in 2014.
It won the 2014 ImageNet Large-Scale Visual Recognition Challenge (ILSVRC 2014).
It was developed mostly by researchers at Google.
INCEPTION-1 : GOOGLENET
INCEPTION-3