ML II - Unit IV

Application of CNN

1. Decoding facial recognition:
• Identifying every face in the picture
• Focusing on each face despite external factors such as light, angle, pose, etc.
• Identifying unique features
• Comparing the collected data with existing data in the database to match a face with a name.
2. Facial emotion recognition: CNNs have been used to help distinguish between different facial expressions such as anger, sadness, or happiness.
3. Analyzing documents:
• Convolutional neural networks can also be used for document analysis. This is useful not only for handwriting analysis but also plays a major role in character recognizers.

4. Understanding climate:
• CNNs can play a major role in the fight against climate change, especially in understanding why we see such drastic changes and how we might experiment with curbing their effects.
5. Grey areas:
• Introducing grey areas into CNNs is poised to provide a much more realistic picture of the real world.
6. Advertising:
• CNNs have already made a world of difference to advertising with the introduction of programmatic buying and data-driven personalized advertising.
7. Object detection: CNNs have been applied to object recognition, classifying objects based on the shapes and patterns found within an image. CNN models have been created that can detect a wide range of objects.
8. Self-driving or autonomous cars: CNNs have been used within the context of automated vehicles to enable them to detect obstacles and interpret street signs, often in conjunction with reinforcement learning.
9. Handwritten character recognition: CNNs can be used to recognize handwritten characters. A CNN takes the image of a character as input and breaks it down into smaller sections, identifying points that connect or overlap with other points in order to determine the shape of the larger character (a minimal sketch appears after this list).
10. Image captioning: CNNs are also being used to generate short captions describing what is contained within new images fed into the network, or to process multiple images at once, such as those found on social media sites.
11. Biometric authentication: CNNs have been used for biometric authentication of user identity by identifying certain physical characteristics associated with a person's face.
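
To make an application like handwritten character recognition (item 9) concrete, here is a minimal sketch of a small CNN classifier in PyTorch. The layer sizes, the 28x28 grayscale input, and the 10-class output (e.g., digits 0-9) are illustrative assumptions, not a reference implementation.

import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """A minimal CNN for 28x28 grayscale character images (e.g., digits 0-9)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),   # learn local strokes/edges
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 28x28 -> 14x14
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # combine strokes into larger parts
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(64 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)                          # break the image into feature maps
        return self.classifier(torch.flatten(x, 1))  # map features to class scores

logits = SmallCNN()(torch.randn(8, 1, 28, 28))  # a batch of 8 images
print(logits.shape)  # torch.Size([8, 10])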
ConvNet Architectures
• AlexNet:
AlexNet was primarily designed by Alex Krizhevsky and was the first convolutional network to use a GPU to boost performance.
1. The AlexNet architecture consists of 5 convolutional layers, 3 max-pooling layers, 2 normalization layers, 2 fully connected layers, and 1 softmax layer.
2. Each convolutional layer consists of convolutional filters and a nonlinear ReLU activation function.
3. The pooling layers perform max pooling.
4. The input size is fixed due to the presence of fully connected layers.
5. The input size is quoted in most places as 224x224x3, but due to the padding involved it works out to 227x227x3.
AlexNet contains 8 layers with weights (a sketch follows).
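
The layer stack above can be sketched in PyTorch as follows. This is an illustrative reconstruction of the 5-convolution, 3-max-pool, 3-fully-connected layout for a 227x227x3 input, not the original implementation; the two normalization (LRN) layers are omitted, and the softmax is assumed to be applied by the loss function (e.g., CrossEntropyLoss).

import torch
import torch.nn as nn

# Illustrative AlexNet-style stack: 5 conv + 3 max-pool + 3 FC layers, ReLU throughout.
alexnet = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),    # 227 -> 55
    nn.MaxPool2d(3, stride=2),                                # 55 -> 27
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),  # 27 -> 27
    nn.MaxPool2d(3, stride=2),                                # 27 -> 13
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),                                # 13 -> 6
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),  # softmax applied in the loss, not here
)
print(alexnet(torch.randn(1, 3, 227, 227)).shape)  # torch.Size([1, 1000])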
AlexNet
• ILSVRC (the ImageNet Large Scale Visual Recognition Challenge) is a competition in which research teams evaluate their algorithms on a huge dataset of labeled images (ImageNet) and compete to achieve higher accuracy on several visual recognition tasks.
• The approach was vindicated in 2012, when AlexNet won the challenge; it is described in the paper "ImageNet Classification with Deep Convolutional Neural Networks" by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton.
AlexNet Architecture (figure)
AlexNet Summary (figure)
VGG Architecture
• VGG stands for Visual Geometry Group; it is a standard deep
Convolutional Neural Network (CNN) architecture with multiple
layers.
• The "deep" refers to the number of layers: VGG-16 and VGG-19 consist of 16 and 19 weight layers, respectively.
• The VGG architecture is the basis of ground-breaking object
recognition models.
• Developed as a deep neural network, the VGGNet also surpasses
baselines on many tasks and datasets beyond ImageNet.
• Moreover, it remains one of the most popular image recognition architectures.
• Very small (3x3) convolutional filters are used in the construction of the VGG network. Thirteen convolutional layers and three fully connected layers make up VGG-16.
• The Conv-1 layer has 64 filters, Conv-2 has 128 filters, Conv-3 has 256 filters, and Conv-4 and Conv-5 have 512 filters each.
• Three fully connected (FC) layers follow the stack of convolutional layers: the first two have 4096 channels each, and the third performs 1000-way ILSVRC classification and thus contains 1000 channels (one for each class). The final layer is the softmax layer.
• The 16 in VGG16 refers to the 16 layers that have weights. VGG16 has thirteen convolutional layers, five max-pooling layers, and three dense layers, which sum to 21 layers, but only sixteen of them are weight layers, i.e., layers with learnable parameters.
• VGG16 takes an input tensor of size 224x224 with 3 RGB channels.
• VGG16 is an object detection and classification algorithm able to classify images into 1000 different categories with 92.7% (top-5) accuracy. It is one of the popular algorithms for image classification and is easy to use with transfer learning (see the sketch below).
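
Since VGG16 is easy to use with transfer learning, here is a minimal sketch using the pretrained model from torchvision (this assumes a recent torchvision and downloads the ImageNet weights on first use). Replacing the final 1000-way layer with a 10-class head is an illustrative assumption for a hypothetical downstream task.

import torch
import torch.nn as nn
from torchvision import models

# Load VGG16 pretrained on ImageNet (1000 classes).
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Freeze the convolutional feature extractor for transfer learning.
for p in vgg.features.parameters():
    p.requires_grad = False

# Replace the final 1000-way layer with a head for a new task
# (10 classes is an illustrative assumption).
vgg.classifier[6] = nn.Linear(4096, 10)

# VGG16 expects 224x224 RGB input.
out = vgg(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 10])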
GoogLeNet
• GoogLeNet is a 22-layer deep convolutional neural network, a variant of the Inception network developed by researchers at Google.
• GoogLeNet (or Inception V1) was proposed by researchers at Google (in collaboration with various universities) in 2014, in the research paper titled "Going Deeper with Convolutions".
• It provided a significant decrease in error rate compared to previous winners AlexNet (winner of ILSVRC 2012) and ZF-Net (winner of ILSVRC 2013), and a significantly lower error rate than VGG (the 2014 runner-up).
• This architecture was the winner of the ILSVRC 2014 image classification challenge.
• Today GoogLeNet is used for other computer vision tasks such as face detection and recognition, adversarial training, etc.
• Inception module: the building block of the network (shown in the figure); it applies convolutions of several filter sizes in parallel and concatenates their outputs.
GoogLeNet Architecture
• The GoogLeNet architecture is 22 layers deep (27 layers counting the pooling layers). In total, 9 inception modules are stacked linearly.
• The end of the last inception module is connected to a global average pooling layer. The architecture also uses techniques such as 1x1 convolutions in the middle of the network and global average pooling.
• 1×1 convolution: the Inception architecture uses 1×1 convolutions to decrease the number of parameters (weights and biases). By reducing the parameters, the depth of the architecture can also be increased.
• Example: a 5×5 convolution mapping a 14×14×480 volume to 14×14×48 requires (14 × 14 × 48) × (5 × 5 × 480) = 112.9M operations.
• With a 1×1 convolution first reducing the 480 channels to 16: (14 × 14 × 16) × (1 × 1 × 480) + (14 × 14 × 48) × (5 × 5 × 16) = 1.5M + 3.8M = 5.3M operations, which is much smaller than 112.9M (this arithmetic is reproduced in the sketch below).
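
The arithmetic above can be checked with a short PyTorch sketch. The channel counts (480 in, 16 after reduction, 48 out) and the 14x14 feature map are taken from the example; the multiply_adds helper is only a rough multiply-accumulate counter, not an official profiling tool.

import torch.nn as nn

# Inception-style 5x5 branch with a 1x1 "bottleneck", matching the slide's
# arithmetic: 480 input channels reduced to 16 before a 5x5 conv to 48 channels.
bottleneck_branch = nn.Sequential(
    nn.Conv2d(480, 16, kernel_size=1),            # 1x1 conv: channel reduction
    nn.ReLU(),
    nn.Conv2d(16, 48, kernel_size=5, padding=2),  # 5x5 conv on the reduced volume
    nn.ReLU(),
)

naive_branch = nn.Conv2d(480, 48, kernel_size=5, padding=2)  # direct 5x5 conv

def multiply_adds(convs, hw=14):
    """Rough multiply-accumulate count at an hw x hw output feature map."""
    return sum(hw * hw * c.out_channels
               * c.in_channels * c.kernel_size[0] * c.kernel_size[1]
               for c in convs)

print(multiply_adds([naive_branch]))  # 112,896,000 -> ~112.9M, as on the slide
print(multiply_adds([m for m in bottleneck_branch
                     if isinstance(m, nn.Conv2d)]))  # 5,268,480 -> ~5.3M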
ResNet
• Residual Network (ResNet) is a deep learning model used for computer vision applications. It is a Convolutional Neural Network (CNN) architecture designed to support hundreds or thousands of convolutional layers.
• Previous CNN architectures could not scale to a large number of layers, which limited their performance: when adding more layers, researchers ran into the "vanishing gradient" problem.
• To solve the problem of the vanishing/exploding gradient, this architecture introduced the concept of residual blocks.
• ResNet, proposed in 2015 by researchers at Microsoft Research, introduced this new Residual Network architecture.
• The network uses a technique called skip connections. A skip connection connects the activations of a layer to later layers by skipping some layers in between, forming a residual block. ResNets are built by stacking these residual blocks together (see the sketch below).
• F(x) := H(x) - x, which gives H(x) = F(x) + x.
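
A minimal sketch of a residual block in PyTorch, implementing H(x) = F(x) + x. The two-3x3-convolution form of F(x) with batch normalization follows the basic ResNet block; the fixed channel count (and hence the identity skip) is an illustrative simplification.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: output = F(x) + x (identity skip connection)."""
    def __init__(self, channels):
        super().__init__()
        # F(x): two 3x3 convolutions with batch norm, as in the basic ResNet block.
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # The skip connection adds the input directly to F(x): H(x) = F(x) + x.
        return torch.relu(self.f(x) + x)

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 56, 56)).shape)  # shape preserved: [1, 64, 56, 56]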
ResNet Architecture

• Network architecture: the network uses a 34-layer plain architecture inspired by VGG-19, to which shortcut connections are then added.
• These shortcut connections convert the architecture into a residual network.
• With these connections in place, subsequent research found that increasing the number of layers could significantly improve CNN performance.
• A skip connection does not add additional computational load to the network.
• This technique of adding the input of an earlier layer to the output of a subsequent layer is now very popular and has been applied to many other neural network architectures.
Thank You!
