Neural Network (Perceptrons)
Classification dataset

Hours Study | Pass | Pass (encoded)
----------- | ---- | --------------
2           | No   | 0
3           | No   | 0
4           | Yes  | 1
5           | No   | 0
6           | Yes  | 1
7           | Yes  | 1
8           | Yes  | 1
9           | Yes  | 1
10          | ?    | ?
● Here, we are trying to predict whether a student will pass based on hours studied
● Hours Study, in this case, is the feature, and Pass is the label
● Pass is a categorical variable consisting of two values, {Yes, No}
● Predicting such a categorical outcome is referred to as a classification problem
● It is a supervised learning task
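Before training, the Yes/No labels are encoded as 1/0, exactly as in the right-hand column of the table. A minimal sketch in Python (the lists simply transcribe the table):

```python
# Encode the categorical Pass label {Yes, No} as {1, 0} for training.
hours = [2, 3, 4, 5, 6, 7, 8, 9]  # feature: hours studied
passed = ["No", "No", "Yes", "No", "Yes", "Yes", "Yes", "Yes"]  # label

encoding = {"No": 0, "Yes": 1}
y = [encoding[label] for label in passed]
print(y)  # [0, 0, 1, 0, 1, 1, 1, 1]
```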
x = feature value [for multiple features, it is a feature matrix multiplied by a weight matrix]
Let's assume:
Hours Study = x
Pass = y
● For multiple features on a single neuron, the architecture may look like:
● Here, there are two input feature values (x1, x2), output is (a)
● The loss function discussed in logistic regression can be used here; it is the
log loss applied to the sigmoid output (referred to here as the sigmoid loss function)
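A single neuron with two input features and a sigmoid output, together with the logistic-regression (log) loss, can be sketched as follows (the function names and sample values are illustrative):

```python
import math

def neuron(x1, x2, w1, w2, b):
    """Single neuron: weighted sum of two input features passed through a sigmoid."""
    z = w1 * x1 + w2 * x2 + b
    return 1.0 / (1.0 + math.exp(-z))  # activation a in (0, 1)

def log_loss(a, y):
    """Logistic-regression loss for one sample: -[y*ln(a) + (1-y)*ln(1-a)]."""
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

a = neuron(0.5, 0.2, w1=0.4, w2=-0.3, b=0.1)
print(round(a, 4), round(log_loss(a, 1), 4))
```

The loss shrinks toward 0 as the activation approaches the true label, and grows without bound as it approaches the wrong one.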
Update process
This is for a single data point. For n samples, the gradients of all the points have to be
summed up and then averaged.
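The batched update described above can be sketched as follows. With a sigmoid output and log loss, the per-sample gradients are dL/dw = (a − y)·x and dL/db = (a − y), a standard result; they are summed over the n samples and averaged before the step (function and variable names are illustrative):

```python
import math

def gradient_step(xs, ys, w, b, lr):
    """One gradient-descent step: sum per-sample gradients, average, then update."""
    n = len(xs)
    dw = db = 0.0
    for x, y in zip(xs, ys):
        a = 1.0 / (1.0 + math.exp(-(w * x + b)))  # sigmoid activation
        dw += (a - y) * x  # dL/dw for this sample (log loss + sigmoid)
        db += (a - y)      # dL/db for this sample
    return w - lr * dw / n, b - lr * db / n
```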
Some Basic Simulations

Let's have a toy dataset:

X    Y
0.1  0
0.2  0
0.3  1
0.4  ?

With initial parameters:
w = 0.7
b = 0.1

Thus, z = 0.7x + 0.1
● As the weights are randomly initialized, the model should not fit our data points well, resulting in a large error.
(Note that we should use the normalized version of all the values for faster
and proper convergence. For simplicity, we are ignoring that step here.)
● Now, calculating the sum of loss for the three points in the dataset:
Average loss value across the three data points is 0.928/3 = 0.309
● Let's try to reduce the loss value with one gradient step (learning rate = 1)
● w = 0.7 − 1 × 0.013 = 0.687
● b = 0.1 − 1 × 0.23 = −0.13
Recalculating the sum of loss:
We need to run this many times in a loop to get further improvement. (Depending on the nature of the data, good
enough convergence might not be possible.)
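The hand-computed step above can be checked in code. Assuming log loss with a sigmoid output and a learning rate of 1, the averaged gradients come out to about 0.013 for w and 0.23 for b, matching the update shown:

```python
import math

xs, ys = [0.1, 0.2, 0.3], [0, 0, 1]  # toy dataset (the 0.4 row has no label)
w, b, lr = 0.7, 0.1, 1.0

dw = db = 0.0
for x, y in zip(xs, ys):
    a = 1.0 / (1.0 + math.exp(-(w * x + b)))  # a = sigmoid(0.7x + 0.1)
    dw += (a - y) * x
    db += (a - y)
dw /= len(xs)  # averaged gradient, about 0.013
db /= len(xs)  # averaged gradient, about 0.23

w, b = w - lr * dw, b - lr * db
print(round(w, 3), round(b, 2))  # 0.687 -0.13
```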
But what about a larger network?
● This is a fairly large differentiation and, with an increasing layer count, can very quickly
get out of hand. But wait…
● While calculating the derivative w.r.t. w2, the red-marked portion has already been
calculated. We can just get the numerical values and plug them in.
● For each layer's gradient, we just need to store the gradient values already computed for the
later layers and plug them in, making the derivative calculation process much simpler.
● In such a case, we are just propagating the gradient back from the later layer, hence the
name ‘backpropagation’.
● No matter how complex an expression is, it can be broken down into simple arithmetic
expressions.
● It is possible to easily calculate the derivatives of these individual simple expressions,
obtain their values, apply the chain rule repeatedly, and propagate the values from the later
operations back into the earlier ones.
● This method is known as automatic differentiation and is capable of calculating all the
gradients in a single backward pass.
● This is how deep learning libraries are capable of calculating derivatives even in very
complex networks.
● Some other differentiation alternatives would be symbolic differentiation (can become
too complex in a large network) and numerical differentiation (not very accurate).
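The idea of breaking an expression into simple operations and propagating gradients backwards can be sketched with a tiny reverse-mode autodiff class. This is a toy version of what deep learning libraries do, not their actual implementation; class and method names are illustrative:

```python
class Value:
    """A scalar that records the operations that produced it, so gradients
    can be propagated back through the chain rule."""
    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        # d(a+b)/da = 1, d(a+b)/db = 1
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self, upstream=1.0):
        # Chain rule: multiply the upstream gradient by each local gradient
        # and push it back to the parent (later operation -> earlier operation).
        self.grad += upstream
        for parent, local in zip(self._parents, self._local_grads):
            parent.backward(upstream * local)

x, w, b = Value(2.0), Value(3.0), Value(1.0)
z = w * x + b  # z = 3*2 + 1 = 7
z.backward()
print(x.grad, w.grad, b.grad)  # 3.0 2.0 1.0
```

This recursive version is only correct for expression trees; real libraries process the operations in a topological order so that nodes used in several places accumulate their gradient exactly once.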