Lecture 1.1 Gradient Descent Algorithm

The document discusses closed-form equations and various types of gradient descent (Batch, Stochastic, Mini-batch) used in optimization for machine learning. It explains the definitions, properties, advantages, and disadvantages of each gradient descent type, along with mathematical formulations and numerical examples. The content is presented by Dr. Mainak Biswas and emphasizes the importance of these concepts in minimizing loss functions.


Lecture 1.1

Closed-form Equation, Types of Gradient Descent (Batch, Stochastic, Mini-batch): Definitions and Properties

Dr. Mainak Biswas


Closed-form Equation
• A closed-form equation is a mathematical expression that provides a direct way to compute a value without requiring iterative procedures or infinite series
  – Example:
    • Sum of an Arithmetic Series: the sum of the first $n$ terms of an arithmetic series with first term $a$ and common difference $d$ is
      $S_n = \frac{n}{2}\bigl(2a + (n - 1)d\bigr)$
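A quick Python illustration (the function names and the sample values a = 2, d = 3, n = 10 are my own, not from the lecture) contrasting the closed-form sum with the equivalent iterative computation:

def arithmetic_sum_closed_form(a, d, n):
    # S_n = (n/2) * (2a + (n - 1)d), computed directly
    return n * (2 * a + (n - 1) * d) / 2

def arithmetic_sum_iterative(a, d, n):
    # the same sum accumulated term by term
    return sum(a + i * d for i in range(n))

print(arithmetic_sum_closed_form(2, 3, 10))  # 155.0
print(arithmetic_sum_iterative(2, 3, 10))    # 155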

Gradient Descent
• Gradient Descent is an optimization algorithm used in
machine learning and deep learning to minimize the
loss function by updating the model's parameters in
the direction of the steepest descent
• The type of gradient descent depends on how much
data is used to compute the gradient at each iteration
• Gradient descent is also known as the steepest-descent (steepest downward slope) algorithm
• It is very important in machine learning, where it is
used to minimize a cost function

Loss function
$E(w) = \frac{1}{2N}\sum_{i=1}^{N}\bigl(f(x_i) - y_i\bigr)^2$

• Where $f(x_i) = w^T x_i$, then

$\frac{\partial E}{\partial w} = \frac{1}{N}\sum_{i=1}^{N}\bigl(f(x_i) - y_i\bigr)\,x_i$
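The loss and its gradient can be written as two short NumPy functions. This is a minimal sketch under the linear-model assumption $f(x_i) = w^T x_i$; the array shapes (X of shape N x d, y of shape N) are my own convention:

import numpy as np

def loss(w, X, y):
    # E(w) = 1/(2N) * sum_i (w^T x_i - y_i)^2
    residuals = X @ w - y
    return (residuals ** 2).sum() / (2 * len(y))

def gradient(w, X, y):
    # dE/dw = 1/N * sum_i (w^T x_i - y_i) x_i
    residuals = X @ w - y
    return X.T @ residuals / len(y)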

Mathematical Formulation of Gradient Descent

$w = w - \eta\,\nabla E(w)$

• $w$: Model parameters (weights)
• $\eta$: Learning rate
• $\nabla E(w)$: Gradient of the loss function $E(w)$ with respect to $w$
• It can also be written as:

$w = w - \eta \cdot \frac{1}{N}\sum_{i=1}^{N}\bigl(f(x_i) - y_i\bigr)\,x_i$
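A sketch of the update rule as code (the quadratic example E(w) = ||w||^2 and the learning rate are illustrative assumptions, not part of the slide):

import numpy as np

def gd_step(w, grad_E, eta):
    # one gradient-descent update: w <- w - eta * grad E(w)
    return w - eta * grad_E(w)

grad_E = lambda w: 2 * w          # gradient of E(w) = ||w||^2
w = np.array([4.0, -2.0])
for _ in range(3):
    w = gd_step(w, grad_E, eta=0.1)
print(w)                          # moves toward the minimiser at the origin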

Numerical Problem
• Let $E(w) = (w - 3)^2 + 2$, $\eta = 0.1$, $w = 0$; find $w$ and $E(w)$ for five iterations
• Since $x_i = 1$, the iterations can be solved in terms of $w$ only
• $\frac{\partial E}{\partial w} = 2(w - 3)$
• $w_{new} = w_{old} - \eta\,\nabla E(w_{old}) = w_{old} - 0.2(w_{old} - 3) = 0.8\,w_{old} + 0.6$

Sl    w        E(w)
1     0        11
2     0.6      7.76
3     1.0800   5.6864
4     1.4640   4.3593
5     1.7712   3.5099
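The table can be regenerated with a few lines of Python (an illustrative reproduction, not part of the original slides):

eta, w = 0.1, 0.0
for i in range(1, 6):
    E = (w - 3) ** 2 + 2
    print(f"{i}   w = {w:.4f}   E(w) = {E:.4f}")
    w = w - eta * 2 * (w - 3)     # equivalently w = 0.8*w + 0.6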

Batch Gradient Descent
• Batch Gradient Descent is an optimization algorithm
used to minimize a loss function by iteratively updating
the model's parameters using the entire dataset to
calculate the gradient
• Advantages:
– Computes the gradient with high precision using the entire
dataset
– Converges steadily towards the minimum
– Suitable for smooth and convex loss functions
• Disadvantages:
– Memory-intensive when the dataset is large
– Requires processing the entire dataset for each iteration
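A minimal NumPy sketch of Batch Gradient Descent for the linear model $f(x_i) = w^T x_i$; the learning rate, iteration count, and zero initialisation are my own assumptions:

import numpy as np

def batch_gd(X, y, eta=0.1, n_iters=200):
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / len(y)   # gradient computed on the ENTIRE dataset
        w -= eta * grad
    return w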

Stochastic Gradient Descent
• Stochastic Gradient Descent (SGD) is a variant of gradient
descent where the model parameters are updated using
only a single training example at a time, rather than the
entire dataset
• This leads to faster updates and can help the algorithm
escape local minima, making it suitable for large datasets
• Advantages
– Faster Updates
– Escaping Local Minima
– Scalability
• Disadvantages
– Noisy Convergence
– Requires More Iterations
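A minimal NumPy sketch of Stochastic Gradient Descent, updating on one randomly chosen example at a time; the shuffling scheme, learning rate, and epoch count are my own assumptions:

import numpy as np

def sgd(X, y, eta=0.01, n_epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):       # visit examples in random order
            grad_i = (X[i] @ w - y[i]) * X[i]   # gradient from a single example
            w -= eta * grad_i
    return w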

Mini-Batch Gradient Descent
• Mini-batch Gradient Descent is a hybrid approach between Batch
Gradient Descent and Stochastic Gradient Descent. It aims to combine the
advantages of both by updating the model parameters using a subset
(mini-batch) of the training data rather than the entire dataset (batch) or
just one data point (stochastic)
– Mini-batch: The dataset is divided into small batches, each containing a fixed number of training examples; the size of each mini-batch, denoted $b$, is a hyper-parameter
– Gradient Calculation: For each mini-batch, the gradient is calculated based on
the average of the training examples in that batch
– Weight Update: The model parameters are updated using the computed
gradient for the mini-batch
– Repeat for all mini-batches until convergence
• Advantages: Faster than Batch GD, Less Noisy than SGD
• Disadvantages: Choosing the Right Batch Size, Memory Considerations
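A minimal NumPy sketch of Mini-batch Gradient Descent; the batch size b, learning rate, epoch count, and shuffling are my own assumptions:

import numpy as np

def minibatch_gd(X, y, b=32, eta=0.05, n_epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        order = rng.permutation(len(y))
        for start in range(0, len(y), b):
            idx = order[start:start + b]                # one mini-batch of up to b examples
            Xb, yb = X[idx], y[idx]
            grad = Xb.T @ (Xb @ w - yb) / len(idx)      # gradient averaged over the batch
            w -= eta * grad
    return w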
