
Case Study: GoogleNet (Winner of the ILSVRC 2014 image classification challenge)
AIAM

Under the guidance of: Prof. Prageet Aeron

By Group 3:
• Manoj Mishra
• Prashant Sehrawat
• Nischay
• Mandeep Singh
1
ANN vs CNN

Problems in ANN:
1) The number of interactions between hidden-layer neurons is very high, since every neuron connects to every neuron in the next layer.
2) Vanishing or exploding gradients: in a very deep fully connected network, the gradient can vanish or explode as it propagates backward during backpropagation.

Benefits of CNN:
1) Far fewer interactions in the architecture, thanks to local connectivity and weight sharing.
2) Vanishing or exploding gradients are much less of a problem.
2
Convolutional Neural Network
Consists of multiple layers: the input layer, convolutional layers, pooling layers, and fully connected layers.

• The convolutional layer applies filters to the input image to extract features.
• The pooling layer downsamples the feature map to reduce computation.
• The fully connected layer makes the final prediction.
• The network learns the optimal filters through backpropagation and gradient descent (a code sketch follows below).
3
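To make the layer types above concrete, here is a minimal sketch of a small CNN in PyTorch: one convolutional layer, one pooling layer, and one fully connected layer. The network, filter counts, and input size are illustrative assumptions, not part of GoogleNet.

```python
# Minimal CNN sketch (assumes PyTorch is installed); illustrative only.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # convolutional layer: extracts features
        self.pool = nn.MaxPool2d(2)                              # pooling layer: downsamples the feature map
        self.fc = nn.Linear(16 * 16 * 16, num_classes)           # fully connected layer: final prediction

    def forward(self, x):
        x = torch.relu(self.conv(x))   # activation after convolution
        x = self.pool(x)               # 32x32 -> 16x16
        return self.fc(x.flatten(1))   # flatten and classify

model = TinyCNN()
out = model(torch.randn(1, 3, 32, 32))   # one 32x32 RGB image
print(out.shape)                          # torch.Size([1, 10])
```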
How Convolutional Layers Work
Imagine having an image. It can be represented as a cuboid with a length and width (the spatial dimensions of the image) and a depth (the channels; images generally have red, green, and blue channels).

Now imagine taking a small patch of this image and running a small neural network, called a filter or kernel, on it, with say K outputs, stacked vertically. Sliding that filter across the whole image produces another "image" with a different width, height, and depth: instead of just the R, G, and B channels we now have more channels, but smaller width and height. This operation is called convolution (see the short example below). If the patch size were the same as the image size, it would just be a regular neural network; because the patch is small, we have far fewer weights.
4
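A short sketch of this "sliding filter" idea: a single convolution turns a 3-channel image into a feature map with more channels but smaller width and height. The specific filter settings (64 filters, 7×7 kernel, stride 2) are illustrative assumptions.

```python
# Sliding a filter over an image changes channels and spatial size (assumes PyTorch).
import torch
import torch.nn as nn

image = torch.randn(1, 3, 224, 224)   # batch of one 224x224 RGB image
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=7, stride=2, padding=3)
features = conv(image)

print(image.shape)      # torch.Size([1, 3, 224, 224])  -> 3 channels
print(features.shape)   # torch.Size([1, 64, 112, 112]) -> more channels, smaller width/height
```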
Types of Layers
• Input layer
• Convolutional layers
• Activation layer
• Pooling layer
5
Convolutional Neural Network
• Processing is in series, layer after layer, so a deeper network takes a longer time.
6


Inception
• Two distinct classes out of the 1000 classes of the ILSVRC 2014 classification challenge; domain knowledge is required to distinguish between these classes.
• GoogleNet (or Inception V1) was proposed in the paper "Going Deeper with Convolutions".
• This architecture was the winner of the ILSVRC 2014 image classification challenge.
• It provided a significant decrease in error rate compared to previous winners AlexNet (winner of ILSVRC 2012) and ZF-Net (winner of ILSVRC 2013).
7
Features of GoogleNet
1) Inception Module:
In the Inception module, 1×1, 3×3, and 5×5 convolutions and 3×3 max pooling are performed in parallel on the input, and their outputs are stacked together to generate the final output. The idea is that convolution filters of different sizes handle objects at multiple scales better. A sketch of such a module follows below.

8
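A simplified sketch of one Inception block in PyTorch: parallel 1×1, 3×3, and 5×5 convolutions plus 3×3 max pooling, with the branch outputs concatenated along the channel dimension. The filter counts roughly follow the first Inception module of the paper, but activations and batch details are omitted for brevity, so treat it as an illustration rather than the exact architecture.

```python
# Simplified Inception block: parallel branches concatenated on channels (assumes PyTorch).
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 64, kernel_size=1)          # 1x1 branch
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, 96, kernel_size=1),                    # 1x1 bottleneck
            nn.Conv2d(96, 128, kernel_size=3, padding=1))           # 3x3 branch
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, 16, kernel_size=1),                    # 1x1 bottleneck
            nn.Conv2d(16, 32, kernel_size=5, padding=2))            # 5x5 branch
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),       # 3x3 max pooling
            nn.Conv2d(in_ch, 32, kernel_size=1))                    # pool projection

    def forward(self, x):
        # every branch preserves the spatial size, so outputs can be stacked on channels
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

block = InceptionBlock(192)
print(block(torch.randn(1, 192, 28, 28)).shape)   # torch.Size([1, 256, 28, 28])
```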
Features of GoogleNet
2) 1×1 convolution:
One unique thing introduced by GoogleNet was the 1×1 convolution. These convolutions are used to decrease the number of parameters (weights and biases) of the architecture. A 1×1 convolution is also used as a bottleneck layer and helps reduce the number of computations.
Let's look at an example. Suppose we want to perform a 5×5 convolution with 48 filters on a 14×14×480 input:
• Without a 1×1 convolution as an intermediate step: (14 × 14 × 48) × (5 × 5 × 480) = 112.9M multiplications.
• With a 1×1 bottleneck down to 16 channels first: (14 × 14 × 16) × (1 × 1 × 480) + (14 × 14 × 48) × (5 × 5 × 16) = 1.5M + 3.8M = 5.3M multiplications.
This leads to a computational saving of roughly 95% (verified in the short calculation below).
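The operation counts above can be reproduced directly; this is plain arithmetic on the slide's own numbers, not library code.

```python
# Multiply counts for the 5x5 convolution example: direct vs. 1x1 bottleneck.
direct = (14 * 14 * 48) * (5 * 5 * 480)                                        # ~112.9M multiplies
bottleneck = (14 * 14 * 16) * (1 * 1 * 480) + (14 * 14 * 48) * (5 * 5 * 16)    # ~1.5M + ~3.8M

print(f"direct:     {direct / 1e6:.1f}M")            # 112.9M
print(f"bottleneck: {bottleneck / 1e6:.1f}M")        # 5.3M
print(f"saving:     {100 * (1 - bottleneck / direct):.0f}%")   # ~95%
```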

3) Global Average Pooling:
Global average pooling is used at the end of the network. This method reduces the total number of parameters and minimizes overfitting. For example, consider a feature map with dimensions 10 × 10 × 32 (width, height, channels). Global average pooling averages over the width and height of each channel separately, reducing the feature map to a vector whose length equals the number of channels. Benefits: reduced dimensionality, robustness to spatial variations, and computational efficiency.
9
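A small sketch of global average pooling on the 10 × 10 × 32 example from the slide (assumes PyTorch; the tensor layout is batch, channels, height, width).

```python
# Global average pooling: average each channel over height and width.
import torch

feature_map = torch.randn(1, 32, 10, 10)   # (batch, channels, height, width)
gap = feature_map.mean(dim=(2, 3))         # average over H and W, one value per channel
print(gap.shape)                            # torch.Size([1, 32])
```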

Features of GoogleNet
• 22 layers deep, with no fully connected layer at the end.
• Able to run on individual devices, even with low computational resources.
• Also called a Network in Network.
• All the convolutions inside this architecture use Rectified Linear Units (ReLU) as their activation function.
• The architecture also contains two auxiliary classifiers, attached at the Inception (4a) and Inception (4d) layers, to help train the deep network and avoid vanishing gradients (when the gradients become extremely small). These auxiliary classifiers help the gradient flow and keep it from diminishing too quickly as it propagates back through the deeper layers.
• Auxiliary classifiers also help with model regularization. Since each classifier contributes to the final output, the network distributes its learning across different parts of the network. This distribution prevents the network from relying too heavily on specific features or layers, which reduces the chances of overfitting.
Each auxiliary classifier consists of (see the sketch after this list):
• An average pooling layer with filter size 5×5 and stride 3.
• A 1×1 convolution with 128 filters for dimension reduction, with ReLU activation.
• A fully connected layer with 1024 outputs and ReLU activation.
• Dropout regularization with dropout ratio = 0.7.
• A softmax classifier with 1000 class outputs, similar to the main softmax classifier.
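A sketch of one auxiliary classifier head following the list above: 5×5 average pooling with stride 3, a 1×1 convolution with 128 filters, a 1024-unit fully connected layer, dropout at 0.7, and a 1000-way output. The input channel count (512) and the 14×14 spatial size are assumptions for illustration.

```python
# Auxiliary classifier head sketch (assumes PyTorch); input size is an assumption.
import torch
import torch.nn as nn

aux_classifier = nn.Sequential(
    nn.AvgPool2d(kernel_size=5, stride=3),     # 14x14 -> 4x4
    nn.Conv2d(512, 128, kernel_size=1),        # 1x1 conv for dimension reduction
    nn.ReLU(inplace=True),
    nn.Flatten(),
    nn.Linear(128 * 4 * 4, 1024),              # fully connected layer
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.7),                         # dropout regularization
    nn.Linear(1024, 1000),                     # 1000-class logits (softmax applied in the loss)
)

print(aux_classifier(torch.randn(1, 512, 14, 14)).shape)   # torch.Size([1, 1000])
```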
[Figure: stacked Inception modules with the main classifier and auxiliary classifiers at Inception 4a and 4d]
10
OUR PROCESS

11
THANK YOU

40
