Gradient Descent
2 4/3/2017
Fundamental Problem in Machine Learning
\[ \min_h \sum_{i=1}^{n} \left( h(X_i) - Y_i \right)^2 \]
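The minimization above can be sketched for a hypothetical one-parameter linear model; the model h(x) = w*x and the data are illustrative choices, not from the slides.

```python
# Sum-of-squared-errors objective  sum_i (h(X_i) - Y_i)^2  for the
# hypothetical model h(x) = w*x (illustrative, not from the slides).
def squared_error(w, X, Y):
    return sum((w * x - y) ** 2 for x, y in zip(X, Y))

X = [1.0, 2.0, 3.0]
Y = [2.0, 4.0, 6.0]              # generated by y = 2x, so w = 2 is optimal
print(squared_error(2.0, X, Y))  # 0.0 at the true weight
print(squared_error(1.0, X, Y))  # 14.0 away from it
```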
What is Gradient
[Figure: plot of f(x) versus x for -3 ≤ x ≤ 4]
x      f(x) = x² + 5
6.0    41.00
5.9    39.81
5.8    38.64
5.7    37.49
5.6    36.36
5.5    35.25
5.4    34.16
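The table's neighbouring rows give a finite-difference estimate of the slope, which is what the gradient measures; a quick check against f'(x) = 2x:

```python
# Finite-difference slope from the table of f(x) = x**2 + 5: the change
# in f between neighbouring rows approximates the derivative f'(x) = 2x.
f = lambda x: x**2 + 5

x, h = 6.0, -0.1                 # step down the table, row to row
slope = (f(x + h) - f(x)) / h    # (39.81 - 41) / -0.1
print(slope)                     # ~11.9, close to f'(6) = 12
```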
Derivative
Taylor Series Expansion
\[ f(x+h) = f(x) + h f'(x) + \frac{h^2}{2!} f''(x) + \cdots + \frac{h^n}{n!} f^{(n)}(x) \]

Truncating after the second-order term gives the quadratic approximation:

\[ f(x+h) \approx f(x) + h f'(x) + \frac{h^2}{2!} f''(x) \]
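The quality of the second-order truncation can be checked numerically; f(x) = exp(x) is a convenient illustrative choice because all its derivatives are exp(x).

```python
import math

# Check the second-order Taylor approximation
#   f(x+h) ~ f(x) + h*f'(x) + (h**2/2!)*f''(x)
# for f(x) = exp(x), whose derivatives are all exp(x).
x, h = 1.0, 0.1
exact  = math.exp(x + h)
approx = math.exp(x) * (1 + h + h**2 / 2)
print(exact - approx)   # small positive error, of order h**3
```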
Newton’s Method
\[ f(x+h) \approx f(x) + h f'(x) + \frac{h^2}{2!} f''(x) \]

The maximum difference between f(x+h) and f(x) occurs when the derivative of the right-hand side with respect to h is zero, i.e. f'(x) + h f''(x) = 0:

\[ h^* = -\frac{f'(x)}{f''(x)} \]

In Newton's Method the value of x_{k+1} is chosen such that:

\[ x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)} \]
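A quick sketch of the update applied to the earlier function f(x) = x² + 5, with f'(x) = 2x and f''(x) = 2:

```python
# Newton's method  x_{k+1} = x_k - f'(x_k)/f''(x_k)  applied to
# f(x) = x**2 + 5, where f'(x) = 2x and f''(x) = 2.
def newton_step(x):
    return x - (2 * x) / 2

x = 6.0
x = newton_step(x)
print(x)  # 0.0 -- for a quadratic, a single Newton step lands on the minimum
```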
Gradient Method (Ascent or Descent)
Step 1: Choose the direction of search: d_k = -∇f(x_k) for descent, d_k = ∇f(x_k) for ascent.

\[ x_{k+1} = x_k + \lambda_k d_k \]

λ_k is the step size and d_k is the direction.
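The update rule can be sketched as a fixed-step descent loop; the test function and step size here are illustrative choices, not from the slides.

```python
# Minimal fixed-step gradient descent: x_{k+1} = x_k + lam*d_k with
# d_k = -grad f(x_k). Test function and step size are illustrative.
def gradient_descent(grad, x0, lam=0.1, steps=100):
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - lam * gi for xi, gi in zip(x, g)]
    return x

# f(x, y) = x**2 + y**2 has its minimum at the origin.
grad = lambda p: [2 * p[0], 2 * p[1]]
print(gradient_descent(grad, [3.0, -4.0]))  # approaches [0, 0]
```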
Optimal Step size
Gradient Descent Example
\[ f(x_1, x_2) = 5x_1^2 + x_2^2 + 4x_1 x_2 - 6x_1 - 4x_2 + 15 \]

Starting point X¹ = (0, 10), where f(0, 10) = 75. With ∇f = (10x_1 + 4x_2 - 6, 4x_1 + 2x_2 - 4) and step size 0.1:

\[ x_{k+1} = x_k + \lambda_k d_k \]

\[ X_1^2 = X_1^1 - 0.1 \frac{\partial f}{\partial x_1} = 0 - 0.1 \times 34 = -3.4 \]
\[ X_2^2 = X_2^1 - 0.1 \frac{\partial f}{\partial x_2} = 10 - 0.1 \times 16 = 8.4 \]
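The step can be reproduced directly; the signs of the linear terms are reconstructed so that f(0, 10) = 75 and the gradient components come out to 34 and 16, matching the slide's numbers.

```python
# Reproducing the slide's first step from X1 = (0, 10) with step 0.1.
# Signs reconstructed so f(0, 10) = 75 and grad = (34, 16).
def f(x1, x2):
    return 5*x1**2 + x2**2 + 4*x1*x2 - 6*x1 - 4*x2 + 15

def grad(x1, x2):
    return (10*x1 + 4*x2 - 6, 4*x1 + 2*x2 - 4)

x1, x2 = 0.0, 10.0
print(f(x1, x2))                   # 75.0
g1, g2 = grad(x1, x2)
print(g1, g2)                      # 34.0 16.0
print(x1 - 0.1*g1, x2 - 0.1*g2)    # -3.4  8.4 (up to floating point)
```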
Optimal Step Size
Maximize

\[ f(x_1, x_2) = 4x_1 + 6x_2 - 2x_1^2 - 2x_1 x_2 - 2x_2^2 \]

using

\[ x_{k+1} = x_k + \lambda_k d_k, \qquad d_k = \nabla f(x_k) \]
Starting at (x_1, x_2) = (1, 1):

\[ d_k = \nabla f(x_k) = (4 - 4x_1 - 2x_2,\; 6 - 2x_1 - 4x_2) = (-2, 0) \]
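The optimal step size comes from an exact line search: g(λ) = f(x + λd) is quadratic in λ, so its maximizer follows from g'(λ) = 0. The λ* = 0.25 worked out below is my derivation, not given on the slide.

```python
# Exact line search for the ascent example: maximize
#   f(x1, x2) = 4*x1 + 6*x2 - 2*x1**2 - 2*x1*x2 - 2*x2**2
# along d = grad f at (1, 1). Here g(lam) = f(1 - 2*lam, 1)
# = 4 + 4*lam - 8*lam**2, so g'(lam) = 4 - 16*lam = 0 gives lam = 0.25.
def f(x1, x2):
    return 4*x1 + 6*x2 - 2*x1**2 - 2*x1*x2 - 2*x2**2

x1, x2 = 1.0, 1.0
d1, d2 = 4 - 4*x1 - 2*x2, 6 - 2*x1 - 4*x2   # gradient direction (-2.0, 0.0)

lam = 0.25                                  # from g'(lam) = 0
print(x1 + lam*d1, x2 + lam*d2)             # new point (0.5, 1.0)
print(f(x1 + lam*d1, x2 + lam*d2))          # 4.5, up from f(1, 1) = 4.0
```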