
AlexNet - Introduction

The convolutional neural network (CNN) architecture known as AlexNet was created by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton.

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) was launched in 2010.

Hinton and a few other researchers were proven correct two years later with the publication of the paper “ImageNet Classification with Deep Convolutional Neural Networks” by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. The study employed a CNN to obtain a Top-5 error rate of 15.3% (the percentage of images whose true label is not among the model’s top 5 guesses). The second-best outcome lagged far behind at 26.2%. After the dust settled, deep learning became popular once more.
Named after Alex Krizhevsky, the architecture used in the 2012 study is known as AlexNet.
AlexNet Architecture

AlexNet Architecture - Overview

AlexNet is a deep architecture; the authors introduced padding to prevent the size of the feature maps from shrinking drastically.
The input to this model is an image of size 227x227x3.
Each of the first two convolutional layers is followed by an overlapping max-pooling layer.
The third, fourth and fifth convolutional layers are connected to each other directly.
The fifth convolutional layer is followed by an overlapping max-pooling layer, which is then connected to the fully connected layers.
The fully connected layers have 4096 neurons each, and the second fully connected layer feeds into a softmax classifier with 1000 classes.

 This was the first architecture that used GPUs to boost training performance.
 AlexNet consists of 5 convolution layers, 3 max-pooling layers, 2 normalization layers, 2 fully connected layers and 1 softmax layer.
 Each convolution layer consists of convolution filters and a non-linear activation function called ReLU.
 The pooling layers perform max pooling, and the input size is fixed due to the presence of the fully connected layers.
 The input size is quoted in most places as 224x224x3, but because of the padding involved it works out to 227x227x3.
 AlexNet has over 60 million parameters.

Key Features:
 ReLU is used as the activation function rather than tanh.
 Batch size of 128.
 SGD with momentum is used as the learning algorithm.
 Data augmentation is carried out, e.g. flipping, jittering, cropping, colour normalization, etc.

AlexNet was trained on GTX 580 GPUs with only 3 GB of memory each, which could not fit the entire network. The network was therefore split across 2 GPUs, with half of the neurons (feature maps) on each GPU.

Convolution and Max-Pooling Layers

Convolution and max-pooling layers are fundamental building blocks of AlexNet. These layers extract features and reduce spatial dimensions, enabling efficient processing while retaining critical image information.
First Convolution Layer
 Filters: 96 filters, each of size 11×11.
 Stride: 4.
 Activation: ReLU.
 Output Feature Map: 55x55x96.
Note: To calculate the output size of a convolution layer, use the formula:
Output size = (Input size - Filter size + 2 x Padding) / Stride + 1
For example, the first convolution layer gives (227 - 11 + 0) / 4 + 1 = 55.
The number of filters becomes the number of channels in the output feature map.
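As a quick check, this formula can be wrapped in a small Python helper (an illustrative sketch, not part of the original material):

    def conv_output_size(input_size, filter_size, stride, padding=0):
        # Spatial output size of a convolution or pooling layer.
        return (input_size - filter_size + 2 * padding) // stride + 1

    # First convolution layer: 227x227 input, 11x11 filters, stride 4, no padding
    print(conv_output_size(227, 11, 4))   # 55
    # First max-pooling layer: 55x55 input, 3x3 window, stride 2
    print(conv_output_size(55, 3, 2))     # 27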
First Max-Pooling Layer
 Pool Size: 3×3.
 Stride: 2.
 Output Feature Map: 27x27x96.
Second Convolution Layer
 Filters: 256 filters, each of size 5×5.
 Stride: 1, with padding of 2.
 Activation: ReLU.
 Output Feature Map: 27x27x256.
Second Max-Pooling Layer
 Pool Size: 3×3.
 Stride: 2.
 Output Feature Map: 13x13x256.
Third Convolution Layer
 Filters: 384 filters, each of size 3×3.
 Stride: 1, with padding of 1.
 Activation: ReLU.
 Output Feature Map: 13x13x384.
Fourth Convolution Layer
 Filters: 384 filters, each of size 3×3.
 Stride and Padding: Both set to 1.
 Activation: ReLU.
 Output Feature Map: Remains 13x13x384.
Final Convolution Layer
 Filters: 256 filters, each of size 3×3.
 Stride and Padding: Both set to 1.
 Activation: ReLU.
 Output Feature Map: 13x13x256.
 Increasing Filters: The number of filters
increases as we go deeper, allowing for
more complex feature extraction.
 Decreasing Filter Size: The filter size shrinks with depth, from the large 11×11 filters at the beginning to 3×3 filters deeper in the architecture, while the pooling layers progressively reduce the feature map size.
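For readers who want to see this stack in code, here is a minimal Keras (TensorFlow) sketch of the convolutional part described above, assuming a 227x227x3 input; it omits the local response normalization layers and the 2-GPU split of the original network:

    from tensorflow import keras
    from tensorflow.keras import layers

    conv_stack = keras.Sequential([
        layers.Input(shape=(227, 227, 3)),
        layers.Conv2D(96, 11, strides=4, activation="relu"),        # -> 55x55x96
        layers.MaxPooling2D(pool_size=3, strides=2),                # -> 27x27x96
        layers.Conv2D(256, 5, padding="same", activation="relu"),   # -> 27x27x256
        layers.MaxPooling2D(pool_size=3, strides=2),                # -> 13x13x256
        layers.Conv2D(384, 3, padding="same", activation="relu"),   # -> 13x13x384
        layers.Conv2D(384, 3, padding="same", activation="relu"),   # -> 13x13x384
        layers.Conv2D(256, 3, padding="same", activation="relu"),   # -> 13x13x256
        layers.MaxPooling2D(pool_size=3, strides=2),                # -> 6x6x256
    ])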
Fully Connected and Dropout Layers
After this, we have the first dropout layer, with the dropout rate set to 0.5.
Then comes the first fully connected layer with a ReLU activation function. The size of its output is 4096.
Next comes another dropout layer, with the dropout rate again fixed at 0.5.
This is followed by a second fully connected layer with 4096 neurons and ReLU activation.
Finally, we have the last fully connected layer, or output layer, with 1000 neurons, since the dataset has 1000 classes. The activation function used at this layer is softmax.
This is the architecture of the AlexNet model. It has a total of about 62.3 million learnable parameters.
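Continuing the sketch above, the classifier head can be written as follows (again illustrative Keras code rather than the authors' original implementation):

    from tensorflow import keras
    from tensorflow.keras import layers

    classifier_head = keras.Sequential([
        layers.Flatten(),                           # 6x6x256 -> 9216
        layers.Dropout(0.5),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(4096, activation="relu"),
        layers.Dense(1000, activation="softmax"),   # 1000 ImageNet classes
    ])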

Max Pooling
The main idea behind a pooling layer is to “accumulate” features from the maps generated by convolving a filter over an image.
Its function is to progressively reduce the spatial size of the representation, which reduces the number of parameters and computations in the network. The most common form of pooling is max pooling.
Pooling also helps against over-fitting by providing an abstracted form of the representation.
Max pooling is done by applying a max filter to (usually) non-overlapping sub-regions of the initial representation.
AlexNet used pooling windows of size 3×3 with a stride of 2 between adjacent windows, so the windows overlap.
Due to this overlapping nature of the max pooling, the top-1 error rate was reduced by 0.4% and the top-5 error rate by 0.3%, compared to using non-overlapping pooling windows of size 2×2 with a stride of 2, which would give the same output dimensions.
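To verify that the two pooling configurations really produce the same output size, the output-size formula from earlier can be reused (illustrative Python, assuming the conv_output_size helper defined above):

    # Overlapping pooling used by AlexNet: 3x3 window, stride 2
    print(conv_output_size(55, 3, 2))   # 27
    # Non-overlapping pooling: 2x2 window, stride 2
    print(conv_output_size(55, 2, 2))   # 27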

ReLU Non-Linearity
AlexNet demonstrated that deep CNNs can be trained much more quickly with the non-saturating ReLU activation function than with saturating activation functions like tanh or sigmoid.
The figure in the original paper shows that, with the aid of ReLUs (solid curve), the network reaches a 25% training error rate six times faster than an equivalent network using tanh (dotted curve). This was evaluated on the CIFAR-10 dataset.
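For reference, ReLU is simply f(x) = max(0, x); a minimal NumPy illustration:

    import numpy as np

    def relu(x):
        # ReLU passes positive values through unchanged and clips negatives to zero,
        # so it does not saturate for positive inputs.
        return np.maximum(0, x)

    print(relu(np.array([-2.0, 0.0, 3.0])))   # [0. 0. 3.]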
Data Augmentation
Overfitting can be avoided by showing the neural network various iterations of the same image. This produces more data and compels the network to learn the main qualities of the image instead of memorising incidental details.
Augmentation by Mirroring

Consider that our training set contains a picture of a cat. A cat can also be seen as its mirror image. This means that by simply flipping the image about the vertical axis, we can double the size of the training dataset.

Data Augmentation by Mirroring

Augmentation by Random Cropping of Images

Randomly cropping the original image also produces additional data that is simply the original data shifted.
The creators of AlexNet used random crops of size 227x227 taken from within the 256x256 image boundary as the network's inputs. Using this technique, they multiplied the size of the dataset by a factor of 2048.

Data Augmentation by Random Cropping
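A minimal NumPy sketch of these two augmentations (illustrative only; the original work used its own data pipeline):

    import numpy as np

    def augment(image, crop_size=227):
        # Random crop plus random horizontal flip of an HxWxC image array.
        h, w, _ = image.shape
        top = np.random.randint(0, h - crop_size + 1)
        left = np.random.randint(0, w - crop_size + 1)
        crop = image[top:top + crop_size, left:left + crop_size, :]
        if np.random.rand() < 0.5:
            crop = crop[:, ::-1, :]   # mirror about the vertical axis
        return crop

    image = np.random.rand(256, 256, 3)   # stand-in for a 256x256 training image
    print(augment(image).shape)           # (227, 227, 3)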

Dropout
During dropout, each neuron is removed from the neural network with a probability of 0.5.
A dropped neuron does not make any contribution to either forward or backward propagation.
As a result, each input is effectively processed by a different neural network architecture.
The learned weight parameters are therefore more robust and less prone to overfitting.
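A toy NumPy illustration of the masking described above (training-time behaviour only; real frameworks also rescale the surviving activations):

    import numpy as np

    def dropout(activations, p_drop=0.5):
        # Each neuron is kept with probability (1 - p_drop);
        # dropped neurons contribute nothing to the forward pass.
        mask = np.random.rand(*activations.shape) >= p_drop
        return activations * mask

    print(dropout(np.ones(10)))   # roughly half of the entries become 0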

AlexNet Summary
Architecture Implementation

Import Libraries and Load the Dataset

For the implementation, we will take a part of the ImageNet dataset by scraping images from the internet using the Python library Beautiful Soup, and we will pass this dataset to our model to check how the AlexNet architecture performs.
Pre-processing
Once we have scraped the images, we will store them according to their data labels and pre-process the data.
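A minimal sketch of the pre-processing step, assuming the scraped images have been saved into one folder per class under a hypothetical data/ directory (the folder name is an assumption; the image size and batch size follow the architecture and key features described earlier):

    import tensorflow as tf

    # Assumes the images are stored as data/<class_name>/<image>.jpg
    train_ds = tf.keras.utils.image_dataset_from_directory(
        "data",
        image_size=(227, 227),   # resize to AlexNet's input size
        batch_size=128,          # batch size used by AlexNet
    )
    # Scale pixel values to the [0, 1] range
    train_ds = train_ds.map(lambda x, y: (x / 255.0, y))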

Define the Model

We will be creating the AlexNet architecture from scratch. (Keras does not ship a pre-defined AlexNet the way it does for models such as ResNet50, but third-party implementations are widely available.)
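Stacking the two pieces sketched earlier gives the full model; this assumes the conv_stack and classifier_head snippets from above have been run:

    # Combine the convolutional feature extractor and the classifier head.
    model = keras.Sequential([conv_stack, classifier_head])
    model.build(input_shape=(None, 227, 227, 3))
    model.summary()   # about 62.3 million trainable parameters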
Initialize the training parameters
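A plausible configuration following the key features listed earlier (SGD with momentum); the exact learning rate here is an assumption, not taken from the text:

    model.compile(
        optimizer=keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )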

Train the model
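Training then reduces to a single call on the pre-processed dataset (the number of epochs is illustrative):

    history = model.fit(train_ds, epochs=10)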

Prediction
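Finally, a sketch of running predictions on a batch from the dataset:

    # Predict class probabilities for one batch and report the top class per image.
    images, labels = next(iter(train_ds))
    probs = model.predict(images)
    print(probs.argmax(axis=1)[:5])   # predicted class indices for the first 5 images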
ResNet50

ResNet50 is a deep convolutional neural network (CNN) architecture that was developed by Microsoft Research in 2015.

It is a variant of the popular ResNet architecture, which stands for “Residual Network.”

The “50” in the name refers to the number of layers in the network: it is 50 layers deep.

ResNet50 is a powerful image classification model that can be trained on large datasets and achieve state-of-the-art results.

One of its key innovations is the use of residual connections, which allow the network to learn a set of residual functions that map the input to the desired output.
These residual connections enable the network to learn much deeper architectures than was previously possible, without suffering from the problem of vanishing gradients.

The architecture of ResNet50 is divided into four main parts: the convolutional layers, the identity block, the convolutional block, and the fully connected layers.

The convolutional layers are responsible for extracting features from the input image, while the identity block and convolutional block are responsible for processing and transforming these features.

Finally, the fully connected layers are used to make the final classification.

The convolutional layers in ResNet50 consist of several convolutional layers followed by batch normalization and ReLU activation.
These layers are responsible for extracting features from the input image, such as edges, textures, and shapes.

The convolutional layers are followed by max pooling layers, which reduce the spatial dimensions of the feature maps while preserving the most important features.

The identity block and the convolutional block are the key building blocks of ResNet50.

The identity block passes the input through a series of convolutional layers and adds the input back to the output.

This allows the network to learn residual functions that map the input to the desired output.

The convolutional block is similar to the identity block. Both blocks use a 1x1 convolution to reduce the number of filters before the 3x3 convolution, but the convolutional block additionally applies a 1x1 convolution to the shortcut path so that the input can be reshaped to match the dimensions of the block's output.
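A simplified Keras sketch of these two blocks (filter counts are illustrative, and batch normalization is omitted for brevity even though the real ResNet50 applies it after every convolution):

    from tensorflow import keras
    from tensorflow.keras import layers

    def identity_block(x, filters):
        # Bottleneck residual block whose shortcut is the unchanged input.
        f1, f2, f3 = filters
        shortcut = x
        y = layers.Conv2D(f1, 1, activation="relu")(x)        # 1x1: reduce filters
        y = layers.Conv2D(f2, 3, padding="same", activation="relu")(y)
        y = layers.Conv2D(f3, 1)(y)                           # 1x1: restore filters
        y = layers.Add()([y, shortcut])                       # skip connection
        return layers.Activation("relu")(y)

    def conv_block(x, filters, strides=2):
        # Like the identity block, but the shortcut is reshaped by a 1x1 convolution.
        f1, f2, f3 = filters
        shortcut = layers.Conv2D(f3, 1, strides=strides)(x)   # match output shape
        y = layers.Conv2D(f1, 1, strides=strides, activation="relu")(x)
        y = layers.Conv2D(f2, 3, padding="same", activation="relu")(y)
        y = layers.Conv2D(f3, 1)(y)
        y = layers.Add()([y, shortcut])
        return layers.Activation("relu")(y)

    inputs = keras.Input(shape=(56, 56, 256))
    outputs = identity_block(inputs, (64, 64, 256))
    print(keras.Model(inputs, outputs).output_shape)          # (None, 56, 56, 256)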

The final part of ResNet50 is the fully connected layers.

These layers are responsible for making the final classification.

The output of the final fully connected layer is fed into a softmax activation function to produce the final class probabilities.
How it solved the problem of vanishing gradients:

Skip Connections

Skip connections, also known as residual connections, are a key feature of the ResNet50 architecture. They allow the network to learn deeper architectures without suffering from the problem of vanishing gradients.

Vanishing gradients are a problem that occurs when training deep neural networks: the gradients become very small as they are propagated backward through many layers, so the earlier layers receive almost no useful learning signal and find it difficult to learn and improve. This problem becomes more pronounced as the network becomes deeper.

Skip connections address this problem by allowing information to flow directly from the input of a block to its output, bypassing one or more layers. This allows the network to learn residual functions that map the input to the desired output, rather than having to learn the entire mapping from scratch.

In ResNet50, skip connections are used in both the identity block and the convolutional block. The identity block adds its input directly back to the output of its convolutional layers, while the convolutional block first passes the input through a 1x1 convolution so that its shape matches the output before the addition.

The use of skip connections allows ResNet50 to learn deeper architectures while still training effectively and avoiding vanishing gradients.

Summary:
In summary, ResNet50 is a cutting-edge deep
convolutional neural network architecture that
was developed by Microsoft Research in 2015.
It is a variant of the popular ResNet architecture and comprises 50 layers, enabling it to
learn much deeper architectures than
previously possible without encountering the
problem of vanishing gradients. The
architecture of ResNet50 is divided into four
main parts: the convolutional layers, the
identity block, the convolutional block, and the
fully connected layers. The convolutional layers
are responsible for extracting features from the
input image, the identity block and
convolutional block process and transform
these features, and the fully connected layers
make the final classification. ResNet50 has been
trained on the large ImageNet dataset,
achieving an error rate on par with human
performance, making it a powerful model for
various image classification tasks such as object
detection, facial recognition and medical image
analysis. Additionally, it has been used as a feature extractor for other tasks, such as object detection and semantic segmentation.
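For completeness, a pre-trained ResNet50 is available directly in Keras; a brief sketch of loading it both as a classifier and as a feature extractor (assumes TensorFlow is installed and the ImageNet weights can be downloaded):

    from tensorflow import keras

    # Full 1000-class ImageNet classifier
    classifier = keras.applications.ResNet50(weights="imagenet")

    # Feature extractor: drop the fully connected head and average-pool the final feature maps
    feature_extractor = keras.applications.ResNet50(
        weights="imagenet", include_top=False, pooling="avg"
    )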

“ResNet50, with its deep residual networks, opened the door for the training of even deeper architectures and helped push the boundaries of what was possible in computer vision.”
— Yann LeCun, Director of AI Research at Facebook
