BY: SURBHI SAROHA
 DIMENSIONALITY REDUCTION:
 Linear (PCA, LDA) and manifolds,
 metric learning – Autoencoders and dimensionality reduction in networks
 - Introduction to ConvNets - Architectures: AlexNet, VGG, Inception, ResNet
 - Training a ConvNet: weight initialization,
 batch normalization,
 hyperparameter optimization
 Dimensionality reduction is the process of reducing the number of features (or
dimensions) in a dataset while retaining as much information as possible.
 This can be done for a variety of reasons, such as to reduce the complexity of a
model, to improve the performance of a learning algorithm, or to make it easier to
visualize the data.
 There are several techniques for dimensionality reduction, including principal
component analysis (PCA), singular value decomposition (SVD), and linear
discriminant analysis (LDA).
 Each technique uses a different method to project the data onto a lower-
dimensional space while preserving important information.
 Dimensionality reduction is a technique used to reduce the number of features in
a dataset while retaining as much of the important information as possible.
 In other words, it is a process of transforming high-dimensional data into a lower-
dimensional space that still preserves the essence of the original data.
 In machine learning, high-dimensional data refers to data with a large number of
features or variables.
 The curse of dimensionality is a common problem in machine learning, where the
performance of the model deteriorates as the number of features increases.
 This is because the complexity of the model increases with the number of features,
and it becomes more difficult to find a good solution.
 In addition, high-dimensional data can also lead to overfitting, where the model
fits the training data too closely and does not generalize well to new data.
 Dimensionality reduction can help to mitigate these problems by reducing the
complexity of the model and improving its generalization performance. There are
two main approaches to dimensionality reduction: feature selection and feature
extraction.
 Feature Selection:
Feature selection involves selecting a subset of the original features that are most
relevant to the problem at hand. The goal is to reduce the dimensionality of the
dataset while retaining the most important features. There are several methods
for feature selection, including filter methods, wrapper methods, and embedded
methods. Filter methods rank the features based on their relevance to the target
variable, wrapper methods use the model's performance as the criterion for selecting
features, and embedded methods combine feature selection with the model
training process.
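As a concrete illustration of a filter method, here is a minimal sketch (assuming scikit-learn is available; the dataset and the choice of k are arbitrary) that ranks features by their ANOVA F-score against the target and keeps the top two:

```python
# Filter-method feature selection: rank features, keep the best k.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)            # 4 original features

selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)

print(X.shape, "->", X_reduced.shape)        # (150, 4) -> (150, 2)
print("Per-feature scores:", selector.scores_)
```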
 Feature Extraction:
Feature extraction involves creating new features by combining or transforming
the original features.
 The goal is to create a set of features that captures the essence of the original
data in a lower-dimensional space.
 There are several methods for feature extraction, including principal component
analysis (PCA), linear discriminant analysis (LDA), and t-distributed stochastic
neighbor embedding (t-SNE). PCA is a popular technique that projects the original
features onto a lower-dimensional space while preserving as much of the variance
as possible.
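The PCA projection described above can be sketched in a few lines (assuming scikit-learn; the dataset and the number of components are arbitrary choices):

```python
# Feature extraction with PCA: project 64-D digits onto 2 components.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)          # 64 features per sample

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X.shape, "->", X_2d.shape)             # (1797, 64) -> (1797, 2)
print("Variance retained:", pca.explained_variance_ratio_.sum())
```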
 There are two components of dimensionality reduction:
 Feature selection: In this, we try to find a subset of the original set of variables, or
features, to get a smaller subset that can be used to model the problem. It
usually involves three approaches:
 Filter
 Wrapper
 Embedded
 Feature extraction: This reduces the data in a high-dimensional space to a lower-
dimensional space, i.e., a space with fewer dimensions.
 Principal Component Analysis (PCA) was introduced by Karl Pearson.
 It works on the principle that when data in a higher-dimensional space is mapped
to a lower-dimensional space, the variance of the data in the lower-dimensional
space should be maximized.
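A from-scratch sketch of this maximum-variance idea (NumPy only; the data here is synthetic): center the data, eigendecompose the covariance matrix, and project onto the directions with the largest eigenvalues.

```python
# PCA from scratch: the kept directions maximize the projected variance.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                # 200 samples, 5 features

Xc = X - X.mean(axis=0)                      # 1. center the data
C = np.cov(Xc, rowvar=False)                 # 2. covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)         # 3. eigendecomposition

order = np.argsort(eigvals)[::-1]            # sort by decreasing variance
W = eigvecs[:, order[:2]]                    # top-2 principal directions
Z = Xc @ W                                   # 4. project to 2-D

# Variance along each kept direction equals the corresponding eigenvalue.
print(Z.var(axis=0, ddof=1))
print(eigvals[order[:2]])
```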
 Dimensionality reduction helps in data compression, and hence reduces storage space.
 It reduces computation time.
 It also helps remove redundant features, if any.
 Improved Visualization: High dimensional data is difficult to visualize, and
dimensionality reduction techniques can help in visualizing the data in 2D or 3D,
which can help in better understanding and analysis.
 Overfitting Prevention: High dimensional data may lead to overfitting in machine
learning models, which can lead to poor generalization performance.
Dimensionality reduction can help in reducing the complexity of the data, and
hence prevent overfitting.
 Feature Extraction: Dimensionality reduction can help in extracting important
features from high dimensional data, which can be useful in feature selection for
machine learning models.
 Data Preprocessing: Dimensionality reduction can be used as a preprocessing step
before applying machine learning algorithms to reduce the dimensionality of the
data and hence improve the performance of the model.
 Improved Performance: Dimensionality reduction can help in improving the
performance of machine learning models by reducing the complexity of the data,
and hence reducing the noise and irrelevant information in the data.
 It may lead to some amount of data loss.
 PCA tends to find linear correlations between variables, which is sometimes undesirable.
 PCA fails in cases where mean and covariance are not enough to define datasets.
 We may not know how many principal components to keep; in practice, some rules of
thumb are applied.
 Interpretability: The reduced dimensions may not be easily interpretable, and it may be difficult
to understand the relationship between the original features and the reduced dimensions.
 Overfitting: In some cases, dimensionality reduction may lead to overfitting, especially when the
number of components is chosen based on the training data.
 Sensitivity to outliers: Some dimensionality reduction techniques are sensitive to outliers, which
can result in a biased representation of the data.
 Computational complexity: Some dimensionality reduction techniques, such as manifold
learning, can be computationally intensive, especially when dealing with large datasets.
 Linear Discriminant Analysis (LDA) is one of the commonly used dimensionality
reduction techniques in machine learning for solving classification problems with
more than two classes. It is also known as Normal Discriminant Analysis (NDA)
or Discriminant Function Analysis (DFA).
 It can be used to project features from a higher-dimensional space into a lower-
dimensional space, reducing resource and computational costs.
 Although the logistic regression algorithm is limited to two-class problems, Linear
Discriminant Analysis is applicable to classification problems with more than two
classes.
 Linear Discriminant Analysis is one of the most popular dimensionality reduction
techniques used for supervised classification problems in machine learning.
 It is also used as a pre-processing step in machine learning and pattern-
classification applications.
 Whenever two or more classes with multiple features need to be separated
efficiently, Linear Discriminant Analysis is the most common technique for solving
such classification problems. For example, if we have two classes with multiple
features and classify them using a single feature, the classes may overlap; LDA
finds a projection that separates them better.
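A minimal LDA sketch (assuming scikit-learn; the three-class iris dataset is an arbitrary choice) that projects multi-feature data onto the discriminant axes:

```python
# LDA: supervised projection onto at most (n_classes - 1) axes.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)            # 3 classes, 4 features

lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)              # uses the labels, unlike PCA

print(X.shape, "->", X_lda.shape)            # (150, 4) -> (150, 2)
print("Training accuracy:", lda.score(X, y))
```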
 At the heart of deep learning lies the neural network, an intricate interconnected
system of nodes that mimics the human brain’s neural architecture.
 Neural networks excel at discerning intricate patterns and representations within
vast datasets, allowing them to make predictions, classify information, and
generate novel insights.
 Autoencoders emerge as a fascinating subset of neural networks, offering a unique
approach to unsupervised learning.
 Autoencoders are an adaptable and powerful class of architectures in the dynamic
field of deep learning, where neural networks constantly evolve to identify
complicated patterns and representations.
 With their ability to learn effective representations of data, these unsupervised
learning models have received considerable attention and are useful in a wide
variety of areas, from image processing to anomaly detection.
 Autoencoders are a specialized class of algorithms that can learn efficient
representations of input data with no need for labels.
 They are a class of artificial neural networks designed for unsupervised learning.
 Learning to compress and effectively represent input data without explicit labels
is the essential principle of an autoencoder.
 This is accomplished using a two-fold structure that consists of an encoder and a
decoder.
 The encoder transforms the input data into a reduced-dimensional representation,
which is often referred to as “latent space” or “encoding”.
 From that representation, a decoder rebuilds the initial input.
 This process of encoding and decoding forces the network to learn meaningful
patterns in the data and to identify its essential features.
 The general architecture of an autoencoder includes an encoder, a decoder, and a
bottleneck layer.
 Encoder
 The input layer takes the raw input data.
 The hidden layers progressively reduce the dimensionality of the input, capturing
important features and patterns. These layers compose the encoder.
 The bottleneck layer (latent space) is the final hidden layer, where the dimensionality is
significantly reduced.
 This layer represents the compressed encoding of the input data.
 Decoder
 The decoder takes the encoded representation from the bottleneck layer and expands it
back to the dimensionality of the original input.
 The hidden layers progressively increase the dimensionality and aim to reconstruct the
original input.
 The output layer produces the reconstructed output, which ideally should be as close as
possible to the input data.
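The architecture above can be sketched in a few lines (assuming PyTorch; the layer sizes are arbitrary choices, with a 784-dimensional input and a 32-dimensional bottleneck):

```python
# Minimal autoencoder: encoder -> bottleneck -> decoder.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: progressively reduce dimensionality to the bottleneck.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),      # bottleneck / latent space
        )
        # Decoder: expand the encoding back to the input dimensionality.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)                  # compressed representation
        return self.decoder(z)               # reconstruction of the input

model = AutoEncoder()
x = torch.randn(16, 784)                     # a dummy mini-batch
loss = nn.MSELoss()(model(x), x)             # reconstruction loss
loss.backward()
```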
 These are some groundbreaking CNN architectures that were proposed to achieve
better accuracy and to reduce the computational cost.
 AlexNet
 This network was very similar to LeNet-5 but deeper, with 8 layers, more
filters, stacked convolutional layers, max pooling, dropout, data augmentation,
ReLU activations and SGD.
 AlexNet was the winner of the ImageNet ILSVRC-2012 competition, designed by
Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton.
 It was trained on two Nvidia GeForce GTX 580 GPUs; therefore, the network was
split into two pipelines.
 AlexNet has 5 convolutional layers and 3 fully connected layers, and consists of
approximately 60 million parameters. A major drawback of this network was that it
had too many hyper-parameters.
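The parameter count is easy to verify (a quick sketch assuming torchvision is installed):

```python
# Sanity-check AlexNet's size with torchvision's reference implementation.
from torchvision import models

alexnet = models.alexnet()                   # untrained AlexNet
n_params = sum(p.numel() for p in alexnet.parameters())
print(f"AlexNet parameters: {n_params / 1e6:.1f} M")   # roughly 61 M
```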
 The major shortcoming of AlexNet's many hyper-parameters was addressed by VGG
Net, which replaced the large kernel-sized filters (11×11 and 5×5 in the first and second
convolution layers, respectively) with multiple 3×3 kernel-sized filters one after another.
 The architecture, developed by Simonyan and Zisserman, was the first runner-up of the
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014.
 The architecture consists of 3×3 convolutional filters with a stride of 1 and "same"
padding to preserve spatial dimensions, and 2×2 max-pooling layers with a stride of 2.
 In total, there are 16 layers in the network; the input is an RGB image of
dimension 224×224×3, followed by 5 blocks of convolutions (filters: 64, 128,
256, 512, 512), each followed by max pooling.
 The output of these layers is fed into three fully connected layers and a softmax
function in the output layer.
 In total there are 138 million parameters in VGG Net.
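A back-of-the-envelope sketch of why stacked 3×3 kernels help (plain Python; the channel count C is an arbitrary choice): two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution, but with fewer parameters.

```python
# Parameter count (ignoring biases) for the same 5x5 receptive field.
C = 64                                       # input and output channels

params_one_5x5 = 5 * 5 * C * C               # single 5x5 conv layer
params_two_3x3 = 2 * (3 * 3 * C * C)         # two stacked 3x3 conv layers

print(params_one_5x5, params_two_3x3)        # 102400 vs 73728
```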
 Drawbacks of VGG Net:
 1. Long training time
 2. Heavy model
 3. Computationally expensive
 4. Vanishing/exploding gradient problem
 The Inception network, also known as GoogLeNet, was proposed by developers at
Google in "Going Deeper with Convolutions" in 2014.
 The motivation for InceptionNet comes from the fact that salient parts of an
image can have large variation in size.
 Because of this, selecting the right kernel size becomes extremely difficult: big
kernels are suited to globally distributed features and small kernels to locally
distributed features.
 InceptionNet resolves this by stacking multiple kernels at the same level.
 Typically it applies 5×5, 3×3 and 1×1 filters in parallel, as sketched below.
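Here is a minimal Inception-style module (assuming PyTorch; the channel counts are arbitrary, and the 1×1 bottleneck reductions of the full GoogLeNet block are omitted for brevity):

```python
# Parallel 1x1, 3x3 and 5x5 convolutions at the same level, concatenated.
import torch
import torch.nn as nn

class MiniInception(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 16, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, 16, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, 16, kernel_size=5, padding=2)

    def forward(self, x):
        # Each branch preserves spatial size; outputs stack along channels.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)

y = MiniInception(3)(torch.randn(1, 3, 32, 32))
print(y.shape)                               # torch.Size([1, 48, 32, 32])
```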
 ResNet, the winner of the ILSVRC-2015 competition, is a family of deep networks
with over 100 layers.
 Residual networks are similar to VGG nets in their sequential approach, but
they also use "skip connections" and batch normalization, which help train
deep layers without hampering performance.
 After VGG Nets, as CNNs grew deeper, they became hard to train because of the
vanishing gradient problem, which makes the derivatives vanishingly small.
 As a result, the overall performance saturates or even degrades.
 The idea of skip connections came from highway networks, where gated shortcut
connections were used.
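A minimal residual block sketch (assuming PyTorch; this is a simplified version of the basic ResNet block, without downsampling) showing the skip connection and batch normalization together:

```python
# Residual block: output = ReLU(F(x) + x), so gradients can flow via x.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(ch)        # batch normalization
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(ch)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)            # skip connection adds the input

y = ResidualBlock(16)(torch.randn(1, 16, 8, 8))
print(y.shape)                               # torch.Size([1, 16, 8, 8])
```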
 While building and training neural networks, it is crucial to initialize the weights
appropriately to ensure a model with high accuracy.
 If the weights are not correctly initialized, it may give rise to the Vanishing Gradient
problem or the Exploding Gradient problem.
 Hence, selecting an appropriate weight initialization strategy is critical when training
DL models.
 The following notation should be kept in mind when studying weight
initialization techniques. It may vary across publications, but the form used here is
the most common, usually found in research papers.
 fan_in = number of input connections to the neuron
 fan_out = number of output connections from the neuron
 For example, for a neuron with three incoming connections and two outgoing
connections:
 fan_in = 3 (number of input connections to the neuron)
 fan_out = 2 (number of output connections from the neuron)
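This notation maps directly onto common initialization schemes; a small sketch (assuming PyTorch) for the same 3-in, 2-out layer:

```python
# fan_in/fan_out in practice: Xavier and He initialization.
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)                      # fan_in = 3, fan_out = 2
fan_out, fan_in = layer.weight.shape         # weight shape is (out, in)
print(fan_in, fan_out)                       # 3 2

# Xavier/Glorot: scales variance by both fan_in and fan_out.
nn.init.xavier_uniform_(layer.weight)

# He/Kaiming: scales variance by fan_in; suited to ReLU networks.
nn.init.kaiming_normal_(layer.weight, mode="fan_in", nonlinearity="relu")
nn.init.zeros_(layer.bias)
```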
 Internal covariate shift is a major challenge encountered while training deep learning
models.
 Batch normalization was introduced to address this issue.
 This section covers the fundamentals of, and the need for, batch normalization.
 What is Batch Normalization?
 Batch normalization was introduced by Sergey Ioffe and Christian Szegedy in 2015 to
mitigate the internal covariate shift problem in neural networks.
 The normalization process involves calculating the mean and variance of each feature
in a mini-batch and then scaling and shifting the features using these statistics.
 This ensures that the input to each layer remains roughly in the same distribution,
regardless of changes in the distribution of earlier layers’ outputs.
 Consequently, Batch Normalization helps in stabilizing the training process, enabling
higher learning rates and faster convergence.
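The computation for one feature can be written out directly (a NumPy sketch; gamma and beta stand for the learnable scale and shift):

```python
# Batch normalization for a single feature over one mini-batch.
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])           # one feature, batch of 4
gamma, beta, eps = 1.0, 0.0, 1e-5            # learnable scale and shift

mean = x.mean()                              # mini-batch mean
var = x.var()                                # mini-batch variance
x_hat = (x - mean) / np.sqrt(var + eps)      # normalize
y = gamma * x_hat + beta                     # scale and shift

print(y)                                     # ~[-1.34, -0.45, 0.45, 1.34]
```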
 Faster Convergence: Batch Normalization reduces internal covariate shift,
allowing for faster convergence during training.
 Higher Learning Rates: With Batch Normalization, higher learning rates can be
used without the risk of divergence.
 Regularization Effect: Batch Normalization introduces a slight regularization
effect that reduces the need for adding regularization techniques like dropout.
 What are Hyperparameters?
 Hyperparameters are the parameters that we set before training.
 Hyperparameters have a major impact on accuracy and efficiency while training
the model.
 Therefore, they need to be set carefully to obtain better and more efficient results.
 Hyperparameters are pre-established parameters that are not learned during the
training process. They control a machine learning model’s general behaviour,
including its architecture, regularisation strengths, and learning rates.
 The process of determining the ideal set of hyperparameters for a machine
learning model is known as hyperparameter optimization.
 Usually, strategies like grid search, random search, and more sophisticated ones
like genetic algorithms or Bayesian optimization are used to accomplish this.
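A minimal grid-search sketch (assuming scikit-learn; the model and the grid values are arbitrary choices):

```python
# Hyperparameter optimization by exhaustive grid search with CV.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(), param_grid, cv=5)   # 5-fold cross-validation
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print("Best CV accuracy:", search.best_score_)
```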