
U21AD505 - DEEP LEARNING PRINCIPLES AND PRACTICES
Google Classrooms: yklhb7j & m3a4kqh
COURSE OBJECTIVES:
• To present theoretical foundations, algorithms, methodologies, and
applications of neural networks and deep learning
• To design and develop an application-specific deep learning model
• To apply the deep learning models in various real-world applications
COURSE OUTCOMES:
• Upon completion of the course, the student will be able to
CO1: Understand the fundamentals of Deep learning (Understand)
CO2: Optimize and regularize the deep learning algorithms (Apply)
CO3: Design and implement Convolutional Neural Networks (Apply)
CO4: Analyze and design Recurrent Neural Networks (Apply)
CO5: Develop deep learning models to encode the original data and
reconstruct it (Apply)
Unit 1: FUNDAMENTALS OF DEEP LEARNING
Functions – Derivatives – Nested Functions – Chain Rule – Functions
with Multiple Inputs – Derivatives of Functions with Multiple Inputs –
Creating New Features – Derivatives of Functions – Computational
Graph – The Backward Pass – Supervised Learning – Linear Regression
– Training the Model – Assessing Our Model – Neural Networks from
Scratch
Functions
• What is a function?
• Math
• f1(x) = x²
• f2(x) = max(x, 0)
• Basic functions in NumPy
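A minimal NumPy sketch of these two functions, operating elementwise on arrays (the names square and relu are illustrative):

import numpy as np

def square(x: np.ndarray) -> np.ndarray:
    # f1(x) = x^2, applied to each element
    return np.power(x, 2)

def relu(x: np.ndarray) -> np.ndarray:
    # f2(x) = max(x, 0), applied to each element
    return np.maximum(x, 0)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(square(x))  # [4. 1. 0. 1. 4.]
print(relu(x))    # [0. 0. 0. 1. 2.]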
Derivatives
• The derivative of a function at a point is the “rate of change” of the
output of the function with respect to its input at that point.
• Math
df/du(a) = lim(Δ→0) [f(a + Δ) − f(a − Δ)] / (2Δ)
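As a sketch, this derivative can be approximated numerically by evaluating the function at two nearby points; the helper name deriv and the default Δ = 0.001 are illustrative choices:

import numpy as np

def deriv(func, input_: np.ndarray, delta: float = 0.001) -> np.ndarray:
    # Centered-difference approximation of df/du at each element of input_
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)

x = np.array([1.0, 2.0, 3.0])
print(deriv(lambda t: np.power(t, 2), x))  # approximately [2. 4. 6.]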
Nested Functions
• f2(f1(x)) = y
Chain Rule
• The derivative of a nested function is the product of the derivatives of the component functions:
(f2(f1(x)))′ = f2′(f1(x)) × f1′(x)
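A sketch of the chain rule applied numerically, reusing the centered-difference deriv helper from above (sigmoid here is just one convenient choice of differentiable function):

import numpy as np

def deriv(func, input_, delta=0.001):
    # centered-difference approximation, as above
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)

def square(x):
    return np.power(x, 2)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def chain_deriv(f2, f1, x):
    # (f2(f1(x)))' = f2'(f1(x)) * f1'(x)
    return deriv(f2, f1(x)) * deriv(f1, x)

x = np.array([0.5, 1.0, 1.5])
print(chain_deriv(sigmoid, square, x))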
Functions with Multiple Inputs
• Math
a = α(x, y) = x + y
s = σ(a)
f(x, y) = σ(x + y)
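A small sketch of this two-input function, taking σ as a parameter so any differentiable function can be plugged in:

import numpy as np

def multiple_inputs_add(x, y, sigma):
    # f(x, y) = sigma(x + y)
    a = x + y  # a = alpha(x, y)
    return sigma(a)

print(multiple_inputs_add(1.0, 2.0, lambda a: 1 / (1 + np.exp(-a))))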
Derivatives of Functions with Multiple Inputs
• By the chain rule, ∂f/∂x(x, y) = ∂σ/∂u(a) × ∂α/∂x(x, y) = ∂σ/∂u(x + y) × 1, and likewise for y
Creating New Features
• Perhaps the single most common operation in neural networks is to
form a “weighted sum” of these features. The weighted sum can
emphasize certain features and de-emphasize others, and can thus be
thought of as a new feature that is itself just a combination of the old
features.
• A concise way to express this mathematically is as the dot product of
the observation with a set of “weights” of the same length as the
features.
• N = ν(X, W) = X × W = x1 × w1 + x2 × w2 + ... + xn × wn
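A sketch of the weighted sum as a dot product in NumPy (the feature and weight values are illustrative):

import numpy as np

x = np.array([1.0, 2.0, 3.0])   # one observation with three features
w = np.array([0.5, -1.0, 2.0])  # weights, same length as the features

n = np.dot(x, w)  # x1*w1 + x2*w2 + x3*w3
print(n)          # 0.5 - 2.0 + 6.0 = 4.5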
Derivatives of Functions with Multiple Vector Inputs
• For vector functions, it isn’t immediately obvious what the
derivative is: if we write the dot product as ν(X, W) = N,
• the derivative with respect to each element turns out to be the
transpose of the other input: ∂ν/∂X = Wᵀ and ∂ν/∂W = Xᵀ
Computational Graph

• Now we’ll define the following straightforward operations on these
matrices:
1. Multiply the matrices together. As before, we’ll denote the
function that does this as ν(X, W) and the output as N, so that
N = ν(X, W).
2. Feed N through some differentiable function σ, and define S = σ(N).
• First, let’s multiply X and W:
• Next, we’ll feed this result through σ, which just means applying σ to
every element of the matrix X × W:
• Finally, we can simply sum up these elements to produce a single number, L:
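A sketch of this forward pass in NumPy, using sigmoid as the differentiable function σ and illustrative matrix shapes:

import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1 / (1 + np.exp(-x))

X = np.random.randn(3, 3)  # example input matrix
W = np.random.randn(3, 2)  # example weight matrix

N = np.dot(X, W)  # step 1: N = v(X, W)
S = sigmoid(N)    # step 2: S = sigma(N), applied elementwise
L = np.sum(S)     # step 3: L is the sum of the elements of S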

• Our goal is then the gradient of L with respect to each element of X and W.
The Backward Pass

• We can write L as Λ(σ(ν(X, W))). If this were a regular function, we
would just write the chain rule:
∂L/∂X = ∂Λ/∂u(S) × ∂σ/∂u(N) × ∂ν/∂X(X, W)
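A sketch of this backward pass in NumPy, multiplying the derivative of each operation in turn; the transposes follow from the dot-product derivatives noted earlier:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

X = np.random.randn(3, 3)
W = np.random.randn(3, 2)
N = np.dot(X, W)                      # forward pass, as above
S = sigmoid(N)

dLdS = np.ones_like(S)                # L is a sum, so dL/dS is all 1s
dSdN = sigmoid(N) * (1 - sigmoid(N))  # elementwise derivative of sigmoid
dLdN = dLdS * dSdN                    # chain rule, elementwise
dLdX = np.dot(dLdN, np.transpose(W))  # dN/dX = W transpose
dLdW = np.dot(np.transpose(X), dLdN)  # dN/dW = X transpose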
Supervised Learning
• Machine learning can be described as building algorithms that can
uncover or “learn” relationships in data;
• Supervised learning can be described as the subset of machine
learning dedicated to finding relationships between characteristics
of the data that have already been measured.
• The independent variables determine the dependent variables
• Categorical characteristics must be converted to numerical features
• Feature engineering
• Specifically, our data will be represented in a matrix X with n rows,
each of which represents an observation with k features, all of which
are numbers. Each row observation will be a vector, as in
xi = [xi1, xi2, xi3, ..., xik]
Linear Regression
• Linear regression is often shown as:
yi = β0 + β1 × xi1 + β2 × xi2 + ... + βk × xik + ε
• Diagram
• Equivalently, we can write the output of a linear regression model as
the dot product of each observation vector with a set of parameters,
plus an intercept
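A sketch of the linear regression forward pass as a batch matrix multiply plus an intercept (all dimensions are illustrative):

import numpy as np

batch_size, num_features = 4, 3
X = np.random.randn(batch_size, num_features)  # observations
W = np.random.randn(num_features, 1)           # one weight per feature
B = np.random.randn(1, 1)                      # intercept

P = np.dot(X, W) + B  # predictions: one weighted sum per observation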
Training the Model
• Backpropagation

• Calculating the Gradients


• Since L = Λ(P, Y) = (Y − P)² for each element in Y and P:
∂Λ/∂P = −2 × (Y − P), or dLdP = 2 × (P − Y)
• ∂P/∂N is just a matrix of 1s, of the same shape as N
(since P = N + B, adding the intercept changes nothing elementwise)
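Putting these derivatives together, a sketch of one gradient-descent training step (the learning rate is an illustrative choice):

import numpy as np

batch_size, num_features = 4, 3
X = np.random.randn(batch_size, num_features)
Y = np.random.randn(batch_size, 1)    # observed targets
W = np.random.randn(num_features, 1)
B = np.random.randn(1, 1)

N = np.dot(X, W)
P = N + B                             # forward pass

dLdP = 2 * (P - Y)                    # from L = (Y - P)^2 elementwise
dPdN = np.ones_like(N)                # P = N + B, so a matrix of 1s
dLdN = dLdP * dPdN
dLdW = np.dot(np.transpose(X), dLdN)  # gradient with respect to W
dLdB = dLdN.sum(axis=0)               # gradient with respect to B

learning_rate = 0.01                  # illustrative value
W -= learning_rate * dLdW
B -= learning_rate * dLdB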


Assessing Our Model
• Sample from a population
• Overfitting
• Training Set Versus Testing Set
• mean absolute error:
MAE = (1/n) × Σ |yi − pi|
• root mean squared error:
RMSE = √((1/n) × Σ (yi − pi)²)
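A sketch of both metrics in NumPy, given predictions preds and targets y:

import numpy as np

def mae(preds: np.ndarray, y: np.ndarray) -> float:
    # mean absolute error
    return np.mean(np.abs(y - preds))

def rmse(preds: np.ndarray, y: np.ndarray) -> float:
    # root mean squared error
    return np.sqrt(np.mean(np.power(y - preds, 2)))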


Neural Networks from Scratch
Step 1: A Bunch of Linear Regressions
• Linear regression involved doing a matrix multiplication with a set of
parameters: if our data X had dimensions [batch_size, num_features],
• then we multiplied it by a weight matrix W with dimensions
[num_features, 1] to get an output of dimension [batch_size, 1];
• this output is, for each observation in the batch, simply a weighted
sum of the original features.
• To do multiple linear regressions, we’ll simply multiply our input by a
weight matrix with dimensions [num_features, num_outputs],
resulting in an output of dimensions [batch_size, num_outputs]
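A sketch of this first step with illustrative dimensions:

import numpy as np

batch_size, num_features, num_outputs = 4, 3, 5
X = np.random.randn(batch_size, num_features)
W1 = np.random.randn(num_features, num_outputs)

M1 = np.dot(X, W1)  # [batch_size, num_outputs]: num_outputs weighted sums per observation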
Step 2: A Nonlinear Function
• We want the function we use here to be monotonic so that it
“preserves” information about the numbers that were fed in; a
non-monotonic choice such as squaring would map both −3 and 3 to 9,
losing the distinction between them.
• The nonlinearity will enable our neural network to model the
inherently nonlinear relationship between the features and the target
• Finally, the sigmoid function has the nice property that its derivative
can be expressed in terms of the function itself:
σ′(x) = σ(x) × (1 − σ(x))
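A sketch of the sigmoid and its derivative written in terms of the function itself:

import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1 / (1 + np.exp(-x))

def sigmoid_deriv(x: np.ndarray) -> np.ndarray:
    s = sigmoid(x)
    return s * (1 - s)  # sigma'(x) = sigma(x) * (1 - sigma(x))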
Step 3: Another Linear Regression
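The second linear regression maps the num_outputs intermediate features down to one prediction per observation. Putting the three steps together, a minimal sketch of the full forward pass (all names and dimensions are illustrative):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

batch_size, num_features, num_outputs = 4, 3, 5
X = np.random.randn(batch_size, num_features)
W1 = np.random.randn(num_features, num_outputs)
B1 = np.random.randn(1, num_outputs)
W2 = np.random.randn(num_outputs, 1)
B2 = np.random.randn(1, 1)

M1 = np.dot(X, W1) + B1   # step 1: a bunch of linear regressions
O1 = sigmoid(M1)          # step 2: a nonlinear function
P = np.dot(O1, W2) + B2   # step 3: another linear regression -> [batch_size, 1]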
