
AlexNet: The First CNN to Win ImageNet

By Great Learning Team

Table of contents

1. AlexNet: History
2. CNN Architecture
3. AlexNet Architecture
4. Key Features of AlexNet
5. Data Augmentation
6. Results

This article is an AlexNet tutorial focused on exploring AlexNet, which became one of
the most popular CNN architectures.
History of AlexNet

AlexNet was primarily designed by Alex Krizhevsky. It was published with Ilya
Sutskever and Krizhevsky’s doctoral advisor Geoffrey Hinton, and is a Convolutional
Neural Network or CNN. Learn more about it in this CNN Course.

After competing in the ImageNet Large Scale Visual Recognition Challenge, AlexNet shot to
fame. It achieved a top-5 error of 15.3%, which was 10.8 percentage points lower than that of
the runner-up. The primary result of the original paper was that the depth of the model was
essential for its high performance. This was computationally expensive but was made feasible
by using GPUs, or Graphics Processing Units, during training.

CNN Architectures

Before exploring AlexNet, it is essential to understand what a convolutional neural
network is. Convolutional neural networks are a variant of neural networks in which the
hidden layers consist of convolutional layers, pooling layers, fully connected layers, and
normalization layers.

Convolution is the process of applying a filter over an image or signal to modify it. Now
what is pooling? It is a sample-based discretization process. Its main purpose is to
reduce the dimensionality of the input, allowing assumptions to be made about the
features contained in the binned sub-regions.
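
To make convolution and pooling concrete, here is a minimal NumPy sketch (illustrative only,
not code from the original article): it slides a small filter over an image and then
max-pools the resulting feature map, reducing its dimensions.

import numpy as np

def conv2d(image, kernel):
    # slide the kernel over the image and take the elementwise product sum at each position
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2, stride=2):
    # keep only the maximum value in each pooling window
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out

image = np.random.rand(8, 8)
vertical_edge_filter = np.array([[1.0, 0.0, -1.0],
                                 [1.0, 0.0, -1.0],
                                 [1.0, 0.0, -1.0]])
feature_map = conv2d(image, vertical_edge_filter)  # 6x6 feature map
pooled = max_pool(feature_map)                     # 3x3 after 2x2 pooling
print(feature_map.shape, pooled.shape)             # (6, 6) (3, 3)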

A detailed explanation of this can be found at Understanding Neural Networks.

A CNN architecture is a stack of distinct layers that transform the input volume into an
output volume (e.g., one holding the class scores) through differentiable functions.

In other words, one can understand a CNN architecture to be a specific arrangement of
the above-mentioned layers. Numerous variations of such arrangements have developed
over the years, resulting in several CNN architectures. The most common amongst them are:
1. LeNet-5 (1998)

2. AlexNet (2012)

3. ZFNet (2013)

4. GoogLeNet / Inception (2014)

5. VGGNet (2014)

6. ResNet (2015)

AlexNet Architecture
AlexNet was the first convolutional network that used GPUs to boost training performance.

1. The AlexNet architecture consists of 5 convolutional layers, 3 max-pooling layers, 2
normalization layers, 2 fully connected layers, and 1 softmax layer.

2. Each convolutional layer consists of convolutional filters and a nonlinear activation
function, ReLU.

3. The pooling layers are used to perform max pooling.


4. Input size is fixed due to the presence of fully connected layers.

5. The input size is mentioned in most places as 224x224x3, but due to the padding
involved it works out to 227x227x3.

6. AlexNet overall has 60 million parameters. A code sketch of this layer stack follows the list.
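
To make the layer counts above concrete, here is a hedged PyTorch sketch of the stack as
described. It is a single-GPU simplification that ignores the original two-GPU split; the
layer sizes follow the commonly cited AlexNet configuration.

import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),     # conv1
            nn.ReLU(inplace=True),
            nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),
            nn.MaxPool2d(kernel_size=3, stride=2),           # overlapping max pool
            nn.Conv2d(96, 256, kernel_size=5, padding=2),    # conv2
            nn.ReLU(inplace=True),
            nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1),   # conv3
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),   # conv4
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),   # conv5
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),   # fully connected 1
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),          # fully connected 2
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),   # feeds the softmax
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

model = AlexNet()
out = model(torch.randn(1, 3, 227, 227))  # 227x227x3 input, as noted above
print(out.shape)                          # torch.Size([1, 1000])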

Model Details

The model that won the competition was tuned with specific details:

1. ReLU as the activation function

2. Normalization layers (local response normalization), which are not common anymore

3. Batch size of 128


4. SGD with momentum as the learning algorithm (a minimal optimizer sketch follows this list)

5. Heavy data augmentation with flipping, jittering, cropping, color normalization, etc.

6. Ensembling of models to get the best results.
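
As a rough illustration of this training recipe, a minimal PyTorch optimizer setup might
look like the following. It reuses the AlexNet sketch from the previous section, and the
hyperparameter values are those reported for the original training run (batch size 128 would
be set on the data loader, not the optimizer).

import torch

model = AlexNet()  # the sketch from the previous section
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
criterion = torch.nn.CrossEntropyLoss()  # softmax + negative log-likelihood

def train_step(images, labels):
    # one SGD-with-momentum update on a mini-batch
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()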

AlexNet was trained on a GTX 580 GPU with only 3 GB of memory, which couldn't fit the
entire network. So the network was split across 2 GPUs, with half of the
neurons (feature maps) on each GPU.

This is the reason one can see a split in the architecture diagram.

Key Features

Overlapping Max Pooling


Max pooling is used to down-sample an image or a representation. It reduces the
dimensionality by allowing assumptions to be made about the features contained in the
binned sub-regions.

Overlapping max pool layers are similar to regular max pool layers, except that the adjacent
windows over which the max is calculated overlap each other. The authors of AlexNet
used pooling windows of size 3×3 with a stride of 2 between adjacent windows. Due
to this overlapping nature of max pooling, the top-1 error rate was reduced by 0.4% and the
top-5 error rate by 0.3%, compared to non-overlapping pooling windows of size 2×2 with a
stride of 2, which would give the same output dimensions.
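
A quick shape check (assuming PyTorch) confirms that, on AlexNet's 55×55 conv1 output, the
overlapping 3×3/stride-2 pooling and the non-overlapping 2×2/stride-2 alternative produce
the same output size:

import torch
import torch.nn as nn

x = torch.randn(1, 96, 55, 55)                        # e.g. AlexNet's conv1 output
overlapping = nn.MaxPool2d(kernel_size=3, stride=2)   # AlexNet's choice
non_overlapping = nn.MaxPool2d(kernel_size=2, stride=2)

print(overlapping(x).shape)      # torch.Size([1, 96, 27, 27])
print(non_overlapping(x).shape)  # torch.Size([1, 96, 27, 27]) -- same spatial size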

ReLU Nonlinearity
Using the ReLU non-linearity, AlexNet showed that deep CNNs can be trained much faster
than with saturating activation functions such as tanh or sigmoid. The figure in the
original paper shows that with ReLUs (solid curve), AlexNet could reach a 25% training
error rate six times faster than an equivalent network using tanh (dotted curve). This
was tested on the CIFAR-10 dataset.
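
A small NumPy illustration of why ReLU avoids saturation: its gradient stays at 1 for every
positive input, while tanh's gradient shrinks toward 0 for large inputs, which slows learning.

import numpy as np

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
relu = np.maximum(0.0, x)
tanh = np.tanh(x)

relu_grad = (x > 0).astype(float)  # 1 for positive inputs, so gradients do not vanish there
tanh_grad = 1.0 - tanh ** 2        # approaches 0 as |x| grows

print(relu_grad)           # [0. 0. 0. 1. 1.]
print(tanh_grad.round(3))  # approximately [0.01, 0.786, 1.0, 0.786, 0.01]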

Data Augmentation
Showing a neural network different variations of the same image helps prevent
overfitting. It also forces the network to learn the key features and effectively
generates additional training data.

Data Augmentation by Mirroring

Let's say we have an image of a cat in our training set. Its mirror image is also a valid
image of a cat. This means that we can double the size of the training dataset by simply
flipping the image about the vertical axis.

Source: https://www.learnopencv.com/wp-content/uploads/2018/05/AlexNet-Data-Augmentation-
Mirror-Image.jpg
Data Augmentation by Random Crops

Cropping the original image randomly also produces additional data that is just a
shifted version of the original.

The authors of AlexNet extracted random crops of size 227×227 from inside the 256×256
image boundary and used these as the network's inputs. Using this method, they
increased the size of the data by a factor of 2048.
Source: https://www.learnopencv.com/wp-content/uploads/2018/05/AlexNet-Data-Augmentation-
Random-Crops.jpg
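
Both augmentations can be sketched with torchvision transforms (an assumed modern
equivalent, not the authors' original pipeline):

from torchvision import transforms

augment = transforms.Compose([
    transforms.Resize(256),                  # scale the shorter side to 256
    transforms.RandomCrop(227),              # random 227x227 crop inside the 256x256 boundary
    transforms.RandomHorizontalFlip(p=0.5),  # mirror the image about the vertical axis
    transforms.ToTensor(),
])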

Dropout

During dropout, a neuron is dropped from the network with a probability of 0.5.
When a neuron is dropped, it does not contribute to forward propagation or backward
propagation. Every input therefore goes through a different network architecture. As a
result, the learned weight parameters are more robust and do not overfit easily.
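
A minimal PyTorch illustration of this behaviour: nn.Dropout zeroes activations with
probability 0.5 during training (scaling the survivors), and acts as an identity at
inference time.

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)   # each activation is zeroed with probability 0.5
x = torch.ones(8)

drop.train()
print(drop(x))   # roughly half the entries become 0, the survivors are scaled by 2

drop.eval()
print(drop(x))   # identity at inference: tensor([1., 1., 1., 1., 1., 1., 1., 1.])
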
Results

In the 2010 version of the ImageNet challenge, AlexNet vastly outpaced the second-best
model, with 37.5% top-1 error vs 47.5% top-1 error, and 17.0% top-5 error vs 37.55% top-5
error. AlexNet was able to recognize off-center objects, and most of its top-5 classes
for each image were reasonable. AlexNet won the 2012 competition with a top-5 error
rate of 15.3%, compared to the second-place top-5 error rate of 26.2%.
Lecture 9 Stanford University: https://www.youtube.com/watch?v=DAOcjicFr1Y

The success of AlexNet is mostly attributed to its ability to leverage GPUs for training,
which made it feasible to train its huge number of parameters.

In the following years, there were multiple improvements over AlexNet, resulting in
models like VGG, GoogLeNet, and later ResNet.
