
NEERAJ KHARYA, DEPARTMENT OF COMPUTER APPLICATIONS, BIT DURG


Perceptron
 A perceptron is a computational model of a biological neuron
and is the simplest form of a neural network. The term is used
both for this basic unit and for the network built from it.
 Frank Rosenblatt invented the perceptron at the Cornell
Aeronautical Laboratory in 1957. A perceptron has one or
more inputs, a processing step, and exactly one output.
 The perceptron plays a central role in machine learning: it is
an algorithm for the supervised learning of binary classifiers.
 The perceptron is a linear classifier, that is, a classification
algorithm that makes its predictions with a linear predictor
function combining a set of weights with the feature vector.
Constituents of Perceptron
 Input values or input layer: The input layer of the
perceptron consists of artificial input neurons and takes the
initial data into the system for further processing.
 Weights: A weight represents the strength of the connection
between two units. If the weight from node 1 to node 2 is
larger, neuron 1 has a greater influence on neuron 2.
 Bias: It plays the same role as the intercept in a linear
equation. It is an additional parameter that shifts the weighted
sum of the inputs before the output is computed.
 Net sum: It is the weighted sum of all inputs, to which the
bias is added.
 Activation function: It determines whether the neuron is
activated or not. The activation function is applied to the
weighted sum plus the bias and gives the neuron’s output.
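
The constituents above can be put together in a few lines. This is a minimal sketch rather than code from the slides: the function names `net_sum`, `step`, and `perceptron_output` are illustrative, and the weights and bias are chosen by hand (an assumption) so that the unit behaves like a logical AND.

```python
def net_sum(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias (the 'net sum')."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


def step(z):
    """Threshold activation: the neuron fires (1) when z is non-negative."""
    return 1 if z >= 0 else 0


def perceptron_output(inputs, weights, bias):
    """Single output of a perceptron: activation applied to the net sum."""
    return step(net_sum(inputs, weights, bias))


# Hypothetical hand-picked parameters that make the unit act like AND.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, perceptron_output([a, b], AND_WEIGHTS, AND_BIAS))
```

With these values the neuron only fires when both inputs are 1, because only then does the weighted sum exceed the magnitude of the negative bias.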
Activation Function (figure not recovered)
Single Layer and Multi Layer Perceptron (MLP)
 Perceptrons can be categorized into single-layer and
multi-layer perceptrons.
 The single-layer perceptron organizes its neurons in a single
layer and is defined by its ability to classify inputs linearly:
the model uses a single hyperplane (a line in two dimensions),
determined by the learned weights, to separate the inputs into
two classes.
 In the multi-layer perceptron, neurons are arranged in multiple
layers. Each neuron of the first layer takes the inputs and
passes its response to the neurons of the second layer. This
process continues until the last layer is reached.
 The multi-layer perceptron is defined by its ability to use
several layers while classifying inputs, which makes it a more
powerful, but also more computationally demanding, algorithm.
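
The limit of a single hyperplane can be seen on two tiny datasets. The sketch below uses the classic perceptron learning rule, which the slides do not spell out, so treat the rule, the function names, and the epoch limit as assumptions. AND is linearly separable and is learned without errors; XOR is not, so at least one point stays misclassified.

```python
def predict(weights, bias, x):
    """Classify x as 1 or 0 depending on which side of the hyperplane it lies."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0


def train_perceptron(samples, epochs=25, lr=1):
    """Perceptron learning rule; returns (weights, bias, errors in last epoch)."""
    weights = [0] * len(samples[0][0])
    bias = 0
    errors = 0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                # Move the hyperplane towards the misclassified example.
                errors += 1
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
        if errors == 0:
            break  # every training example is classified correctly
    return weights, bias, errors


AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

if __name__ == "__main__":
    print("AND errors in last epoch:", train_perceptron(AND_DATA)[2])
    print("XOR errors in last epoch:", train_perceptron(XOR_DATA)[2])
```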
 An MLP is a neural network in which the mapping between
inputs and output is non-linear.
 A multilayer perceptron has an input layer, an output layer,
and one or more hidden layers, each made up of many neurons.
 In the perceptron, the neuron must use an activation function
that imposes a threshold (classically a step function), whereas
neurons in a multilayer perceptron can use other activation
functions, such as sigmoid or ReLU.
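
For comparison, the activation functions mentioned here can be written out directly. These are the standard textbook definitions; the function names are illustrative and the sample inputs are arbitrary.

```python
import math


def step(z):
    """Hard threshold: output jumps from 0 to 1 at z = 0."""
    return 1.0 if z >= 0 else 0.0


def sigmoid(z):
    """Smooth squashing function with outputs strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, z)


if __name__ == "__main__":
    for z in (-2.0, 0.0, 2.0):
        print(z, step(z), round(sigmoid(z), 4), relu(z))
```

The step function is not differentiable at the threshold and has zero slope elsewhere, which is one reason smooth functions such as the sigmoid are preferred when a network is trained with gradients.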

 The multilayer perceptron is a feedforward algorithm: just as
in the perceptron, the inputs are combined with the weights in
a weighted sum and passed through the activation function. The
difference is that the result of each layer is propagated to
the next layer as its input.
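
A feedforward pass can be sketched as a loop over layers. The example below is an illustration under stated assumptions, not the slides' own code: the weights are set by hand instead of being learned, a step activation is used throughout, and the names `layer_forward`, `mlp_forward`, and `XOR_LAYERS` are hypothetical. The hidden layer computes OR and NAND of the inputs and the output neuron combines them, which yields XOR.

```python
def step(z):
    return 1 if z >= 0 else 0


def layer_forward(inputs, weights, biases, activation):
    """One layer: each neuron applies the activation to its weighted sum."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Propagate the activations of each layer into the next one."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases, step)
    return activations


# Hand-picked (not learned) parameters: one (weights, biases) pair per layer.
XOR_LAYERS = [
    ([[1, 1], [-1, -1]], [-0.5, 1.5]),  # hidden layer: OR and NAND
    ([[1, 1]], [-1.5]),                 # output layer: AND of the hidden units
]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, mlp_forward([a, b], XOR_LAYERS))
```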

Backpropagation in Multi-Layer Perceptron
 Backpropagation is a supervised learning algorithm used to
train neural networks.
 It is the learning mechanism that allows the multilayer
perceptron to iteratively adjust the weights in the network,
with the goal of minimizing the cost function. It does so by
propagating the output error backwards through the layers to
obtain the gradient of the cost with respect to each weight.
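
To make the weight adjustment concrete, here is a small sketch for the smallest possible network: one input, one sigmoid hidden neuron, and one linear output, with a squared-error cost. The architecture, the cost, the parameter values, and all function names are assumptions chosen for illustration. The gradients from the chain rule are compared with finite-difference estimates as a sanity check.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return the hidden activation and the output of a 1-1-1 network."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y = w2 * h + b2
    return h, y


def loss(params, x, target):
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backprop(params, x, target):
    """Gradients of the loss w.r.t. (w1, b1, w2, b2) via the chain rule."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    d_y = y - target             # dL/dy at the output
    d_w2 = d_y * h
    d_b2 = d_y
    d_h = d_y * w2               # error propagated back to the hidden unit
    d_z1 = d_h * h * (1.0 - h)   # through the sigmoid derivative
    return [d_z1 * x, d_z1, d_w2, d_b2]


def numerical_gradient(params, x, target, eps=1e-6):
    """Central finite differences, used only to check backprop."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, x, target) - loss(down, x, target)) / (2 * eps))
    return grads


def gradient_step(params, x, target, lr=0.1):
    """One weight update in the direction that reduces the cost."""
    return [p - lr * g for p, g in zip(params, backprop(params, x, target))]
```

Repeating `gradient_step` over many examples is the iterative adjustment described above.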

Nonlinear Regression Model
 Nonlinear regression is a form of regression analysis in which
observational data are modeled by a function that is a
nonlinear combination of the model parameters and depends
on one or more independent variables.
 A nonlinear regression model is a curved function of one or
more X variables that is used to predict a Y variable.
 Nonlinear models are more complicated to develop than linear
models because the function is fitted through a series of
approximations (iterations), which may involve trial and error.
 To obtain accurate results from a nonlinear regression model,
make sure that the function you specify accurately describes
the relationship between the independent and dependent
variables.

 Nonlinear regression can be used, for example, to predict
population growth over time. A scatter plot of population
against time shows a relationship between the two, but a
nonlinear one, which calls for a nonlinear regression model.
A logistic population growth model can provide estimates of
the population for periods that were not measured, as well as
predictions of future population growth.
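
The logistic growth example can be sketched as follows. Everything numeric here is an assumption: the observations are synthetic values generated from the model itself, the carrying capacity and initial population are treated as known, and only the growth rate is fitted. The fit is a plain trial-and-error search over candidate rates, a deliberately simple stand-in for the iterative least-squares routines normally used.

```python
import math


def logistic_growth(t, capacity, rate, p0):
    """Logistic model P(t) = K / (1 + ((K - P0) / P0) * exp(-r * t))."""
    return capacity / (1.0 + ((capacity - p0) / p0) * math.exp(-rate * t))


def sum_squared_error(rate, times, observed, capacity, p0):
    """How badly a candidate growth rate reproduces the observations."""
    return sum(
        (logistic_growth(t, capacity, rate, p0) - y) ** 2
        for t, y in zip(times, observed)
    )


def fit_rate(times, observed, capacity, p0, candidates):
    """Trial-and-error search: keep the candidate rate with the smallest error."""
    return min(
        candidates,
        key=lambda r: sum_squared_error(r, times, observed, capacity, p0),
    )


# Synthetic observations (not real census data), generated with rate 0.3.
TIMES = [0, 5, 10, 15, 20, 25, 30]
OBSERVED = [logistic_growth(t, 1000.0, 0.3, 10.0) for t in TIMES]
CANDIDATE_RATES = [i / 100 for i in range(5, 100, 5)]

if __name__ == "__main__":
    rate = fit_rate(TIMES, OBSERVED, 1000.0, 10.0, CANDIDATE_RATES)
    print("fitted rate:", rate)
    # Estimate for a period that was not measured (t = 12).
    print("estimate at t=12:", logistic_growth(12, 1000.0, rate, 10.0))
```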

Multiclass Classification
 Classification is the process of dividing data into different
categories or groups by assigning labels to it, based on some
conditions.
 In other words, it groups the data points into classes. A
problem with more than two classes is a multiclass problem.
 Examples: Email (Spam / Not Spam): 2 classes (binary)
 Weather (Hot, Rainy, Cloudy, Snow): 4 classes
 Characters (A-Z, a-z): 52 classes
 Naïve Bayes, KNN, etc. are multiclass classifiers.
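
As an illustration of a multiclass classifier, here is a minimal k-nearest-neighbours sketch. The weather readings (temperature in °C, relative humidity in %) are made up for the example, only three of the classes are used, and the function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up training data: (temperature, humidity) -> weather label.
WEATHER = [
    ((35, 20), "Hot"), ((38, 25), "Hot"), ((33, 30), "Hot"),
    ((18, 90), "Rainy"), ((20, 95), "Rainy"), ((16, 85), "Rainy"),
    ((-5, 80), "Snow"), ((-2, 75), "Snow"), ((-8, 70), "Snow"),
]

if __name__ == "__main__":
    for query in [(36, 22), (17, 92), (-4, 78)]:
        print(query, knn_predict(WEATHER, query))
```

Note that the classifier compares the query with every stored example, so prediction time grows with the size of the training set.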

Deep Neural Networks
 Deep learning neural networks learn a mapping function from
inputs to outputs, which is achieved by updating the weights of
the network in response to the errors the model makes on the
training dataset.
 Updates are made to continually reduce this error until either
a good enough model is found or the learning process gets
stuck and stops.
 The process of training neural networks is the most challenging
part of using the technique in general and is by far the most
time consuming, both in terms of effort required to configure
the process and computational complexity required to execute
the process.

Learning Process
 Developing a model requires historical data from the domain that is
used as training data. This data is comprised of observations or
examples from the domain with input elements that describe the
conditions and an output element that captures what the
observation means.
 A neural network model uses the examples to learn how to map
specific sets of input variables to the output variable.
 It must do this in such a way that the mapping works well for the
training dataset and also for new examples not seen by the model
during training. This ability to work well on both training examples
and new examples is called the model’s ability to generalize.
 We can describe the broader problem that neural networks solve as
“function approximation.” They learn to approximate an unknown
underlying mapping function from a training dataset. They do this by
learning the weights, that is, the model parameters, for a specific
network structure that we design.
Learning Network Weights
 For many simpler machine learning algorithms, such as linear
regression, we can use linear algebra to calculate directly the
coefficients that minimize the squared error on a training
dataset.
 Similarly, for algorithms such as logistic regression or
support vector machines, we can use optimization algorithms
that offer convergence guarantees when finding an optimal set
of model parameters. Training these algorithms involves solving
a convex optimization problem, that is, the error surface is
shaped like a bowl with a single best solution.
 For a neural network, however, we can neither compute the
optimal set of weights directly nor obtain global convergence
guarantees for finding it. This is what makes training a neural
network challenging.
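
For contrast with neural networks, the directly computable case looks like this. The sketch fits a straight line by ordinary least squares using the closed-form solution for a single input variable; the function name `fit_line` and the sample points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x, both up to a factor of n.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Points that lie exactly on y = 2x + 1.
    print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))
```

No iteration is needed: the coefficients that minimize the squared error follow from the data in one step, which is exactly what is not available for neural network weights.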
 A model with a specific set of weights can be evaluated on the
training dataset, and the average error over all training
examples can be thought of as the error of the model. A change
to the model weights results in a change to the model error.
Therefore, we seek a set of weights that results in a model
with a small error.
 This involves repeatedly evaluating the model and updating the
model parameters in order to step down the error surface,
until a set of parameters is found that is good enough or the
search process gets stuck.
 The algorithm most commonly used to navigate the error surface
is called stochastic gradient descent, or SGD for short.
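
Stepping down the error surface can be sketched with a one-parameter model. The example fits y = w * x with one update per training example; the data, learning rate, and function names are assumptions, and the examples are visited in a fixed order (rather than shuffled, as SGD normally does) so that the run is reproducible.

```python
def mean_squared_error(w, samples):
    """Average error of the model y = w * x over all training examples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def sgd_fit(samples, lr=0.05, epochs=20, w=0.0):
    """Fit y = w * x with one gradient update per training example."""
    history = []
    for _ in range(epochs):
        for x, y in samples:
            grad = 2.0 * (w * x - y) * x  # derivative of (w*x - y)^2 w.r.t. w
            w -= lr * grad                # small step down the error surface
        history.append(mean_squared_error(w, samples))
    return w, history


# Illustrative data lying on y = 2x.
SAMPLES = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

if __name__ == "__main__":
    w, history = sgd_fit(SAMPLES)
    print("learned w:", w)
    print("error after first and last pass:", history[0], history[-1])
```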

Components of the Learning Algorithm
 Training a deep learning neural network with stochastic gradient
descent and backpropagation involves choosing a number of
components and hyperparameters, namely:
 Loss function: An error function must be chosen, often called the
objective function, cost function, or loss function. It is used to
estimate the performance of a model with a specific set of weights
on examples from the training dataset.
 Weight initialization: The procedure by which small random initial
values are assigned to the model weights at the beginning of the
training process.
 Batch size: The number of examples used to estimate the error
gradient before updating the model parameters.
 Learning rate: The amount by which each model parameter is updated
per cycle of the learning algorithm. This hyperparameter controls
how much the model weights are updated and, in turn, how fast the
model learns on the training dataset.

 Epochs: The training process must be repeated many times
until a good, or good enough, set of model parameters is
discovered. The total number of iterations is bounded by the
number of complete passes through the training dataset after
which training is terminated. This is referred to as the
number of training “epochs.”
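
The components listed above map onto the arguments of a training loop. The following is a minimal sketch under stated assumptions: a one-weight model y = w * x, mean squared error as the loss, synthetic data on y = 3x, and hypothetical names (`train`, `mse`, `DATA`). The number of weight updates equals the number of epochs times the number of batches per epoch.

```python
import random


def mse(w, data):
    """Loss function: mean squared error of y = w * x on the given examples."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(data, batch_size=4, learning_rate=0.5, epochs=5, seed=0):
    rng = random.Random(seed)
    w = rng.uniform(-0.1, 0.1)        # weight initialization: small random value
    updates = 0
    losses = [mse(w, data)]
    for _ in range(epochs):           # one epoch = one complete pass over the data
        order = list(data)
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            # Error gradient estimated from batch_size examples only.
            grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= learning_rate * grad  # learning rate scales the update
            updates += 1
        losses.append(mse(w, data))
    return w, updates, losses


# Synthetic data on y = 3x, used only for illustration.
DATA = [(i / 10, 3.0 * (i / 10)) for i in range(1, 9)]

if __name__ == "__main__":
    w, updates, losses = train(DATA)
    print("weight:", w, "updates:", updates)
    print("loss before and after training:", losses[0], losses[-1])
```

With eight examples and a batch size of four there are two updates per epoch, so five epochs give ten updates in total.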

Deep Learning Over Machine Learning
 Machine Learning is now used in almost every sector as a way
of making machines intelligent.
 Machine Learning is a set of algorithms that parse data,
learn from it, and then apply what they have learned to
make intelligent decisions.
 Deep Learning is a subset of Machine Learning that
achieves great power and flexibility by learning to
represent the world as a nested hierarchy of concepts, with
each concept defined in relation to simpler concepts, and
more abstract representations computed in terms of less
abstract ones.

 A deep learning technique learns categories incrementally
through its hidden-layer architecture, starting with low-level
categories. In image recognition, for example, this means
identifying light and dark areas first, then lines, and then
shapes, which eventually allows face recognition.

Features of Deep Learning
 A big advantage of deep learning, and a key to understanding
why it is becoming popular, is that it is powered by massive
amounts of data. The “Big Data Era” of technology will provide
many opportunities for new innovations in deep learning.
 An analogy: the deep learning models are the rocket engine,
and the huge amounts of data we can feed to these algorithms
are the fuel.
 Unlike traditional Machine Learning algorithms, Deep Learning
requires high-end machines. GPUs have become an integral part
of running Deep Learning algorithms.

 In traditional Machine Learning techniques, most of the
applied features need to be identified by a domain expert in
order to reduce the complexity of the data and make patterns
more visible to the learning algorithms.
 The biggest advantage of Deep Learning algorithms, as
discussed before, is that they try to learn high-level features
from data in an incremental manner. This largely removes the
need for domain expertise and hand-crafted feature extraction.

 Deep Learning techniques tend to solve the problem end to
end, whereas Machine Learning techniques need the problem to
be broken down into parts that are solved first, with their
results combined at the final stage.

 For example, a YOLO (You Only Look Once) CNN takes the image
as input and provides the location and name of the objects as
output.
 With usual Machine Learning algorithms like SVM, however, a
bounding-box object detection algorithm is required first to
identify all possible objects, whose HOG (Histogram of
Oriented Gradients) features are then given as input to the
learning algorithm in order to recognize the relevant objects.

 Deep Learning algorithms take a long time to train because of
their large number of parameters, whereas traditional Machine
Learning algorithms take a few seconds to a few hours to train.
 The situation is reversed in the testing phase. At test time,
a Deep Learning algorithm takes much less time to run, whereas
for k-nearest neighbors (a type of machine learning algorithm),
test time increases with the size of the data.*

*This does not apply to all machine learning algorithms, as some of them have short
testing times too.
When to Use Deep Learning Over Other Techniques?
1. Deep Learning outperforms other techniques if the data
size is large. With small data sizes, traditional Machine
Learning algorithms are preferable.
2. Deep Learning techniques need high-end infrastructure to
train in reasonable time.
3. When there is a lack of domain understanding for feature
introspection, Deep Learning techniques outshine others,
as you have to worry less about feature engineering.
4. Deep Learning really shines on complex problems such as
image classification, natural language processing, and
speech recognition.
