
Inception Architecture

Understanding the Inception (GoogLeNet) Architecture

Figure 1. GoogLeNet (Inception) architecture (Source: image from the original paper)

Inception Architecture & Applying It To A Real-World Dataset
Fun fact: the Inception model takes its name from a famous internet meme.
Index Of Contents
· Introduction
· The architecture of Inception V1
· How does this architecture reduce dimensionality?
· What is different in the Inception V3 network from the Inception V1 network?
· Using transfer learning (pre-trained Inception network) on an image classification problem
· References

Introduction
A more powerful deep neural network can be built by increasing the number of layers in the network.

This approach has two problems: increasing the number of layers may lead to overfitting, especially if you have limited labeled training data, and it increases the computational requirement.

Inception networks were created with the idea of increasing the capability of a deep neural network while efficiently using computational resources.
From the original paper:

We propose a deep convolutional neural network architecture codenamed “Inception”, which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. This was achieved by a carefully crafted design that allows for increasing the depth and width of the network while keeping the computational budget constant.

Inception networks are released in versions, each with some improvement over the previous one.

The architecture of Inception V1


Consider the images of peacocks below. The area of the image occupied by the peacock varies between the two images, so selecting the right kernel size becomes a difficult choice.

A large kernel size captures the global distribution of the image, while a small kernel size captures more local information.

The Inception architecture makes it possible to use filters of multiple sizes without increasing the depth of the network.

The different filters are applied in parallel instead of being stacked sequentially one after the other.

This is known as the naive version of the Inception module. The problem with this version was its huge number of parameters. To mitigate this, the authors came up with the architecture below (a minimal sketch follows).
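
For intuition, here is a minimal Keras sketch of such a module with 1×1 reductions on the expensive branches; the input shape and filter counts are illustrative assumptions, not values from the article.

from tensorflow.keras import Input, layers

# Minimal sketch of an Inception module (filter counts are illustrative)
inputs = Input(shape=(28, 28, 192))

# 1x1 convolution branch
branch1 = layers.Conv2D(64, (1, 1), padding='same', activation='relu')(inputs)

# 1x1 reduction followed by a 3x3 convolution
branch2 = layers.Conv2D(96, (1, 1), padding='same', activation='relu')(inputs)
branch2 = layers.Conv2D(128, (3, 3), padding='same', activation='relu')(branch2)

# 1x1 reduction followed by a 5x5 convolution
branch3 = layers.Conv2D(16, (1, 1), padding='same', activation='relu')(inputs)
branch3 = layers.Conv2D(32, (5, 5), padding='same', activation='relu')(branch3)

# 3x3 max pooling followed by a 1x1 convolution
branch4 = layers.MaxPooling2D((3, 3), strides=(1, 1), padding='same')(inputs)
branch4 = layers.Conv2D(32, (1, 1), padding='same', activation='relu')(bran4 := branch4)

# All branches keep the same spatial size, so they concatenate along channels
outputs = layers.concatenate([branch1, branch2, branch3, branch4], axis=-1)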
How does this architecture reduce dimensionality?
Adding a 1×1 convolution before a 5×5 convolution reduces the number of channels of the input that reaches the 5×5 convolution, which in turn reduces the number of parameters and the computational requirement.

Let me explain with an example.
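
Here is a rough worked calculation; the channel counts are assumptions for illustration, not figures from the article.

# Parameters of a direct 5x5 convolution on a 192-channel input with 32 filters:
direct = 5 * 5 * 192 * 32                        # = 153,600 weights

# Parameters when a 1x1 convolution first reduces 192 channels to 16:
reduced = 1 * 1 * 192 * 16 + 5 * 5 * 16 * 32     # = 3,072 + 12,800 = 15,872 weights

print(direct, reduced)                           # roughly a 10x reduction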


What is different in the Inception V3 network from the Inception V1 network?
Inception V3 is an extension of the V1 module. It uses techniques like factorizing larger convolutions into smaller ones (for example, a 5×5 convolution is factorized into two 3×3 convolutions) and asymmetric factorization (for example, factorizing a 3×3 filter into a 1×3 filter followed by a 3×1 filter).
These factorizations are done with the aim of reducing the number of parameters used in every Inception module.
Below is an image of the Inception V3 module; a sketch of the factorizations follows.
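
To make the idea concrete, here is a minimal Keras sketch of both factorizations; the input shape and filter counts are assumptions for illustration.

from tensorflow.keras import Input, layers

inputs = Input(shape=(35, 35, 64))   # assumed input shape

# Factorizing a 5x5 convolution into two stacked 3x3 convolutions:
# two 3x3 kernels cover the same 5x5 receptive field with
# 2 * (3*3) = 18 weights per channel pair instead of 5*5 = 25.
x = layers.Conv2D(64, (3, 3), padding='same', activation='relu')(inputs)
x = layers.Conv2D(64, (3, 3), padding='same', activation='relu')(x)

# Asymmetric factorization of a 3x3 convolution into 1x3 followed by 3x1:
# 3 + 3 = 6 weights per channel pair instead of 3*3 = 9.
y = layers.Conv2D(64, (1, 3), padding='same', activation='relu')(inputs)
y = layers.Conv2D(64, (3, 1), padding='same', activation='relu')(y)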
Using transfer learning (pre-trained Inception network) on an image classification problem
I will solve the same problem we solved in the last article (link at the start of this article) to compare the performance of a vanilla CNN with that of a pre-trained Inception network.

In case you have not read the previous article: we are trying to classify images into 6 different classes, the training data is fairly balanced, and with a convolutional neural network we were able to achieve a validation accuracy of 77%.

Let us now use an Inception model and train only the new layers we add on top of it, as below.
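
The train_generator and validation_generator used in the training step come from the previous article's data pipeline. As a minimal sketch of how they might be set up (the directory paths and batch size are placeholder assumptions):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1]
train_datagen = ImageDataGenerator(rescale=1./255)
validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    'data/train',                # placeholder path to the training images
    target_size=(150, 150),      # matches the InceptionV3 input size below
    batch_size=32,
    class_mode='categorical')    # 6 classes, one-hot labels

validation_generator = validation_datagen.flow_from_directory(
    'data/validation',           # placeholder path to the validation images
    target_size=(150, 150),
    batch_size=32,
    class_mode='categorical')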

# Import necessary libraries from TensorFlow
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras import layers
from tensorflow.keras import Model

# Path to the pre-trained InceptionV3 weights file (without the top classification layer)
local_weights_file = '../input/inception-weights/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'

# Initialize the InceptionV3 model without the top layer (for feature extraction)
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),  # input image size (150x150, 3 color channels)
                                include_top=False,          # exclude the top (classification) layer
                                weights=None)               # do not load default weights initially

# Load the pre-trained weights into the model from the file
pre_trained_model.load_weights(local_weights_file)

# Freeze the pre-trained layers so they are not updated during training
for layer in pre_trained_model.layers:
    layer.trainable = False

# Take the output of the 'mixed7' convolutional layer as the feature map
last_layer = pre_trained_model.get_layer('mixed7')
print('last layer output shape: ', last_layer.output_shape)
last_output = last_layer.output

# Add custom layers on top of the pre-trained model for classification
x = layers.Flatten()(last_output)               # flatten the 3D output of 'mixed7' to a 1D vector
x = layers.Dense(1024, activation='relu')(x)    # fully connected layer with 1024 neurons and ReLU activation
x = layers.Dropout(0.2)(x)                      # dropout with a 20% rate to prevent overfitting
x = layers.Dense(6, activation='softmax')(x)    # output layer with 6 units, one per class

# Create the final model from the pre-trained input and the custom layers on top
model = Model(pre_trained_model.input, x)

# Compile the model with the RMSprop optimizer, categorical cross-entropy loss, and accuracy metric
model.compile(optimizer=RMSprop(learning_rate=0.0001),  # 'learning_rate' replaces the deprecated 'lr' argument
              loss='categorical_crossentropy',          # loss for multi-class classification
              metrics=['acc'])                          # track accuracy during training

# Train the model using the training and validation data generators
history = model.fit(train_generator,                       # training data generator
                    epochs=10,                             # train for 10 epochs
                    validation_data=validation_generator)  # validation data for evaluation

We were able to get a validation accuracy of 90% by using the above architecture!
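
As a quick sanity check, the validation accuracy can be read back from the history object after training; a minimal sketch, assuming the fit call above has completed:

# The key is 'val_acc' because the model was compiled with metrics=['acc']
val_acc = history.history['val_acc']
print('best validation accuracy: {:.2%}'.format(max(val_acc)))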
