Case Study: GoogleNet
AIAM
By Group:
Manoj Mishra
Prashant Sehrawat
Nischay
Mandeep Singh
ANN vs CNN
Problems in ANN:
1) The number of interactions between hidden-layer neurons is very high.
2) Vanishing or exploding gradients: in a very deep neural network, the gradient vanishes or explodes as it propagates backward during backpropagation.

Benefits of CNN:
1) Far fewer interactions in the architecture.
2) Vanishing or exploding gradients are much less of a problem.
Convolutional Neural Network
A CNN consists of multiple layers: the input layer, convolutional layers, pooling layers, and fully connected layers.
Now imagine taking a small patch of this image and running a small neural network, called a filter or kernel, on it, with say K outputs, and representing them vertically. Now slide that neural network across the whole image; as a result, we get another image with a different width, height, and depth. Instead of just the R, G, and B channels we now have more channels, but a smaller width and height. This operation is called convolution. If the patch size is the same as that of the image, it reduces to a regular neural network. Because of this small patch, we have fewer weights.
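A minimal sketch of this sliding-filter idea (plain NumPy, a single filter, stride 1, no padding; the 8×8 input and 3×3 kernel sizes are illustrative assumptions):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a small kernel over a 2-D image (stride 1, no padding)."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kH, j:j + kW]   # small patch of the image
            out[i, j] = np.sum(patch * kernel)  # one output value per position
    return out

# Illustrative example: 8x8 single-channel image, 3x3 filter
image = np.random.rand(8, 8)
kernel = np.random.rand(3, 3)
print(convolve2d(image, kernel).shape)  # (6, 6): smaller width and height
```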
Types of layers
Input layer
Convolutional layers
Activation layer
Pooling layer
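A minimal PyTorch sketch showing how these layer types are typically stacked (the channel counts and input size are illustrative assumptions, not values from the slides):

```python
import torch
import torch.nn as nn

# Illustrative stack of the layer types listed above
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolutional layer on an RGB input
    nn.ReLU(),                                   # activation layer
    nn.MaxPool2d(kernel_size=2),                 # pooling layer halves width and height
)

x = torch.randn(1, 3, 32, 32)  # input layer: one 32x32 RGB image
print(model(x).shape)          # torch.Size([1, 16, 16, 16])
```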
Convolutional Neural Network
Features of GoogleNet
2) 1×1 convolution:
One unique thing introduced by GoogleNet was the 1×1 convolution. These convolutions are used to decrease the number of parameters (weights and biases) of the architecture. A 1×1 convolution is also used as a bottleneck layer and helps reduce the number of computations.
Let's look at an example of a 1×1 convolution. Suppose we want to perform a 5×5 convolution with 48 filters without using a 1×1 convolution as an intermediate step; a rough cost comparison follows below.
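A back-of-the-envelope sketch of why the 1×1 bottleneck helps. The 5×5 convolution with 48 filters is from the example above; the 14×14×480 input feature map and the 16-channel bottleneck are illustrative assumptions:

```python
# Multiply counts for a 5x5 conv with 48 filters on an assumed 14x14x480 input
H, W, C_in, C_out = 14, 14, 480, 48

# Direct 5x5 convolution
direct = H * W * C_out * 5 * 5 * C_in             # ~112.9 million multiplies

# 1x1 bottleneck down to 16 channels first, then the 5x5 convolution
bottleneck = 16
reduced = (H * W * bottleneck * 1 * 1 * C_in       # 1x1 conv: ~1.5 million
           + H * W * C_out * 5 * 5 * bottleneck)   # 5x5 conv: ~3.8 million

print(f"direct: {direct:,}  with 1x1 bottleneck: {reduced:,}")
```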
Global average pooling is used at the end of the network. This method reduces the total number of parameters and minimizes overfitting. For example, consider a feature map with dimensions 10×10×32 (width, height, channels). Global average pooling performs an average operation across the width and height of each channel separately. This reduces the feature map to a vector whose length equals the number of channels. Benefits: reduced dimensionality, robustness to spatial variations, and computational efficiency.
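A minimal NumPy sketch of global average pooling on the 10×10×32 feature map from the example (channels-last layout assumed):

```python
import numpy as np

feature_map = np.random.rand(10, 10, 32)   # (width, height, channels)

# Average across width and height of each channel separately
gap = feature_map.mean(axis=(0, 1))

print(gap.shape)  # (32,) -> one value per channel, no extra parameters to learn
```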
Features of GoogleNet
22 layers deep, with no fully connected layer at the end.
Able to run on individual devices even with low computational resources.
Also called a Network in Network.
All the convolutions inside this architecture use Rectified Linear Units (ReLU) as their activation function.
The architecture also contains two auxiliary classifiers at the Inception (4a) and Inception (4d) layers to aid the training of this deep network and avoid vanishing gradients (when the gradients turn into extremely small values). These auxiliary classifiers help the gradient flow and keep it from diminishing too quickly as it propagates back through the deeper layers.
Auxiliary classifiers also help with model regularization. Since each classifier contributes to the final output, the network distributes its learning across different parts of the network. This distribution prevents the network from relying too heavily on specific features or layers, which reduces the chances of overfitting. Each auxiliary classifier consists of (a sketch follows the list below):
An average pooling layer with filter size 5×5 and stride 3.
A 1×1 convolution with 128 filters for dimension reduction, with ReLU activation.
A fully connected layer with 1024 outputs and ReLU activation.
Dropout regularization with dropout ratio 0.7.
A softmax classifier with a 1000-class output, like the main softmax classifier.
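A minimal PyTorch sketch of one auxiliary classifier following the layer list above (the 512 input channels and the 14×14 feature map entering Inception (4a) are assumptions here, not values from the slides):

```python
import torch
import torch.nn as nn

class AuxClassifier(nn.Module):
    """Auxiliary classifier branch as described above (input channels assumed to be 512)."""
    def __init__(self, in_channels=512, num_classes=1000):
        super().__init__()
        self.pool = nn.AvgPool2d(kernel_size=5, stride=3)       # 14x14 -> 4x4
        self.conv = nn.Conv2d(in_channels, 128, kernel_size=1)  # 1x1 conv, 128 filters
        self.relu = nn.ReLU()
        self.fc1 = nn.Linear(128 * 4 * 4, 1024)                 # fully connected, 1024 outputs
        self.dropout = nn.Dropout(p=0.7)                        # dropout ratio 0.7
        self.fc2 = nn.Linear(1024, num_classes)                 # classifier over 1000 classes

    def forward(self, x):
        x = self.pool(x)
        x = self.relu(self.conv(x))
        x = torch.flatten(x, 1)
        x = self.dropout(self.relu(self.fc1(x)))
        return self.fc2(x)  # raw logits; softmax is applied in the loss

x = torch.randn(1, 512, 14, 14)   # assumed feature map entering Inception (4a)
print(AuxClassifier()(x).shape)   # torch.Size([1, 1000])
```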
[Figure: Stacked Inception modules, with auxiliary classifiers at Inception (4a) and Inception (4d)]
[Figure: Main classifier and auxiliary classifier heads]
OUR PROCESS
THANK YOU