4-Optimization of 2 Variables, Gradient Descent
• Problem: Minimize $f(x)$, where $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$.
• Solution:
  Find the stationary points $x^*$ by solving $\nabla f(x) = 0$.
  At each stationary point, evaluate the Hessian matrix $H(x^*)$, which is the matrix
  containing all second-order partial derivatives of $f$ w.r.t. $x_1, x_2, \ldots, x_n$:
  $H(x^*) = \begin{pmatrix}
    \frac{\partial^2 f}{\partial x_1^2}(x^*) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(x^*) \\
    \vdots & \ddots & \vdots \\
    \frac{\partial^2 f}{\partial x_n \partial x_1}(x^*) & \cdots & \frac{\partial^2 f}{\partial x_n^2}(x^*)
  \end{pmatrix}_{n \times n}$
  H(x*)                Conclusion
  Positive definite    Minimum
  Negative definite    Maximum
  Indefinite           Saddle point
  Semidefinite         No conclusion
Recall:
A square matrix is said to be
• positive definite if all its eigenvalues are positive (> 0),
• negative definite if all its eigenvalues are negative (< 0),
• positive semidefinite if all its eigenvalues are ≥ 0,
• negative semidefinite if all its eigenvalues are ≤ 0,
• indefinite if it has both positive and negative eigenvalues.
A quick numerical check of these criteria is sketched below.
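A minimal sketch of such a check, assuming NumPy is available; the helper `classify` and the tolerance `tol` are illustrative names, not from the slides. It inspects the signs of the eigenvalues of a symmetric matrix (such as a Hessian).

```python
# Classify a symmetric matrix by the signs of its eigenvalues (illustrative sketch).
import numpy as np

def classify(H, tol=1e-10):
    eig = np.linalg.eigvalsh(H)   # eigenvalues of a symmetric (Hermitian) matrix
    if np.all(eig > tol):
        return "positive definite"
    if np.all(eig < -tol):
        return "negative definite"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "indefinite"
    return "semidefinite"          # at least one (near-)zero eigenvalue

print(classify(np.array([[0.0, 3.0], [3.0, 0.0]])))     # indefinite
print(classify(np.array([[-6.0, 3.0], [3.0, -6.0]])))   # negative definite
```

The two matrices used in the check are the Hessians that appear in the worked example below.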
Solve
• Example: $f(x, y) = x^3 + y^3 + 3xy$
• Hints: compute $\nabla f = \begin{pmatrix} \partial f/\partial x \\ \partial f/\partial y \end{pmatrix}$ and $H = \begin{pmatrix} \partial^2 f/\partial x^2 & \partial^2 f/\partial x\,\partial y \\ \partial^2 f/\partial y\,\partial x & \partial^2 f/\partial y^2 \end{pmatrix}$,
  then solve $\nabla f = 0$ to find the stationary points.
• Example: $f(x, y) = x^3 + y^3 + 3xy$
• Solution: $\nabla f = \begin{pmatrix} 3x^2 + 3y \\ 3y^2 + 3x \end{pmatrix}$ and $H = \begin{pmatrix} 6x & 3 \\ 3 & 6y \end{pmatrix}$
  $\nabla f = 0 \;\Rightarrow\; \begin{pmatrix} 3x^2 + 3y \\ 3y^2 + 3x \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \;\Rightarrow\; x^2 = -y \text{ and } y^2 = -x$
  Solving simultaneously, we get $(0, 0)$ and $(-1, -1)$ as the stationary points.
  Now, to check for maxima and minima, we evaluate the Hessian matrix at these points.
  $H(0, 0) = \begin{pmatrix} 0 & 3 \\ 3 & 0 \end{pmatrix}$: the eigenvalues of this matrix are $3$ and $-3$ (check), i.e. $H$ is indefinite, which implies that $(0, 0)$ is a saddle point.
  $H(-1, -1) = \begin{pmatrix} -6 & 3 \\ 3 & -6 \end{pmatrix}$: the eigenvalues of this matrix are $-9$ and $-3$ (check), i.e. $H$ is negative definite, which implies that $(-1, -1)$ is a point of (local) maximum.
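As an independent check of this example (not part of the slides), the stationary points and Hessian eigenvalues can be reproduced symbolically, assuming SymPy is available:

```python
# Symbolic check of the worked example f(x, y) = x^3 + y^3 + 3xy.
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 + y**3 + 3*x*y

grad = [sp.diff(f, v) for v in (x, y)]   # gradient components
H = sp.hessian(f, (x, y))                # symbolic Hessian matrix

for point in sp.solve(grad, (x, y), dict=True):
    print(point, H.subs(point).eigenvals())
# Expected (up to ordering): {x: 0, y: 0} with eigenvalues {3, -3}  -> indefinite (saddle point)
#                            {x: -1, y: -1} with eigenvalues {-3, -9} -> negative definite (maximum)
```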
i.e. $\min_x \|Ax - b\|_p$
For $p = 2$, the problem is called ($L_2$) least squares, and the term $\|Ax - b\|_2^2$ is called the residual sum of squares (RSS).
• Example: find the least squares solution of $Ax = b$, where
  $A = \begin{pmatrix} 2 & 0 \\ -1 & 1 \\ 0 & 2 \end{pmatrix}$ and $b = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$
• Solution:
  Observe that the system is inconsistent, so we find the least squares solution.
  The objective function is
  $\|Ax - b\|^2 = (2x - 1)^2 + (-x + y)^2 + (2y + 1)^2$
  Finding the minimum by the method seen earlier, we get the solution $x = \frac{1}{3}$, $y = -\frac{1}{3}$ (check).
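The slides find this minimum by setting the gradient to zero; as an independent check (assuming NumPy is available), the same answer comes out of NumPy's built-in least squares routine:

```python
# Least squares solution of the inconsistent system A x = b above.
import numpy as np

A = np.array([[2.0, 0.0], [-1.0, 1.0], [0.0, 2.0]])
b = np.array([1.0, 0.0, -1.0])

sol, rss, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(sol)   # approximately [ 0.3333, -0.3333], i.e. x = 1/3, y = -1/3
print(rss)   # residual sum of squares ||Ax - b||^2, approximately 0.6667
```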
Algorithmic approach:
• In practice, computing and storing the full Hessian matrix requires a large amount of
  memory, which is infeasible for high-dimensional functions such as loss functions with
  large numbers of parameters. For such situations, first-order algorithmic methods like
  gradient descent and second-order methods like Newton's method have been developed.
Gradient Descent:
• An algorithmic method for finding a local minimum of a differentiable function.
• The algorithm is initiated by choosing random values for the parameters.
• The parameters are then improved gradually by taking steps proportional to the negative
  of the gradient (or approximate gradient) of the cost function at the current point.
• The process continues until the algorithm converges to a minimum, i.e. until the
  difference between successive iterates becomes stable or falls below a threshold
  (a minimal sketch of this loop is given below).
Note: If we instead take steps proportional to the positive of the gradient, we approach
a local maximum of that function; the procedure is then known as gradient ascent.
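A minimal sketch of this loop (illustrative only, assuming NumPy; the function name `gradient_descent`, the tolerance `tol`, and the iteration cap `max_iter` are my own choices). The stopping rule is the stability of successive iterates described above.

```python
# Generic gradient descent: step against the gradient until iterates stop changing much.
import numpy as np

def gradient_descent(grad, theta0, alpha=0.1, tol=1e-8, max_iter=10_000):
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        new_theta = theta - alpha * grad(theta)       # step proportional to -gradient
        if np.linalg.norm(new_theta - theta) < tol:   # successive iterates are stable
            return new_theta
        theta = new_theta
    return theta

# Example: minimize f(x, y) = x^2 + y^2, whose gradient is (2x, 2y).
print(gradient_descent(lambda t: 2 * t, theta0=[5.0, -3.0]))   # approaches [0, 0]
```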
Analogy:
• To get an idea of how Gradient Descent works, let us consider an analogy.
Analogy:
• Suppose you are somewhere on a mountainside and want to get down to the base camp.
  One of the ways is to use your feet to feel where the ground slopes downward most
  steeply; that is the direction in which you should take your first step. If you keep
  repeating this, step by step, you eventually reach the base camp.
Limitation:
• If there is a slight rise in the ground while you are going downhill, you may stop there
  thinking you have reached the lowest point, even though the base camp lies further down;
  in the same way, gradient descent can get stuck in a local minimum instead of reaching
  the global minimum.
• The cost function is denoted by $J(\theta)$, where $\theta = \begin{pmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \end{pmatrix}$, $\theta_i$ is the $i^{th}$ parameter, and the learning rate is denoted by $\alpha$,
• so that the iterative formula becomes $\theta := \theta - \alpha \nabla J(\theta)$. If we
  apply this formula individually to the components of $\theta$, then the formula for the
  $j^{th}$ component $\theta_j$ becomes
• $\theta_j := \theta_j - \alpha\,\dfrac{\partial J(\theta)}{\partial \theta_j}$ (a component-wise sketch is given below).
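A sketch of this component-wise rule, using a finite-difference approximation of each partial derivative (one way to form the "approximate gradient" mentioned earlier); the cost function `J` below is only an illustration, not from the slides.

```python
# Component-wise update theta_j := theta_j - alpha * dJ/dtheta_j,
# with each partial derivative approximated by a forward difference.
def partial(J, theta, j, h=1e-6):
    bumped = theta.copy()
    bumped[j] += h
    return (J(bumped) - J(theta)) / h

def step(J, theta, alpha=0.1):
    return [theta[j] - alpha * partial(J, theta, j) for j in range(len(theta))]

J = lambda t: t[0] ** 2 + t[1] ** 2   # example cost function
theta = [1.0, 1.0]
for _ in range(5):
    theta = step(J, theta)
print(theta)   # both components shrink toward 0
```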
• The value of the step size should not be too big, as it can skip over the minimum point
  and the optimization can fail. It is a hyper-parameter and you need to experiment with
  its values (a small illustration is given below).
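As a small illustration (using the simple cost $J(\theta) = \theta^2$, which also appears in the next example, and learning rates chosen only for demonstration): a modest step shrinks $\theta$ towards the minimum, while a too-large one overshoots and diverges.

```python
# Effect of the learning rate on the update theta := theta - alpha * 2*theta.
def run(alpha, theta=5.0, steps=20):
    for _ in range(steps):
        theta = theta - alpha * 2 * theta   # gradient of theta**2 is 2*theta
    return theta

print(run(alpha=0.1))   # shrinks toward 0 (update factor 0.8)
print(run(alpha=1.1))   # magnitude keeps growing (update factor -1.2), optimization fails
```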
Minimize $J(\theta) = \theta^2$
Solution:
$\theta := \theta - \alpha\,\dfrac{\partial J(\theta)}{\partial \theta}$
$\theta := \theta - 0.1\,\dfrac{\partial (\theta^2)}{\partial \theta}$
$\theta := \theta - 0.1 \cdot (2\theta)$
$\theta := 0.8\,\theta$
• Table generation:
• Here we start with $\theta = 5$ (and, in the second pair of columns, $\theta = -5$).
  Keep in mind that here the update is $\theta := 0.8\,\theta$, for our learning rate and cost function.

  θ     J(θ)        θ     J(θ)
  5     25         -5     25
  4     16         -4     16
  ⋮      ⋮          ⋮      ⋮
  0      0          0      0

• We can see that, as the number of iterations increases, the cost value goes down and the
  algorithm converges to the optimum value 0.
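The table can be reproduced with a few lines of code (a sketch assuming the same starting value and learning rate; starting from $\theta = -5$ behaves symmetrically):

```python
# Iterate theta := 0.8 * theta from theta = 5 and watch J(theta) = theta**2 decrease.
theta = 5.0
for i in range(10):
    print(i, round(theta, 4), round(theta ** 2, 4))   # iteration, theta, J(theta)
    theta = 0.8 * theta
```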
• Our cost function is $J(\theta) = \theta_1^2 + \theta_2^2$, where $\theta = \begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix}$, and let the learning rate be $\alpha = 0.1$.
• The component-wise updates are
  $\theta_1 := \theta_1 - 0.1\,\dfrac{\partial J(\theta)}{\partial \theta_1}$ and $\theta_2 := \theta_2 - 0.1\,\dfrac{\partial J(\theta)}{\partial \theta_2}$
  θ₁    θ₂    J(θ)
  1     1     2
  ⋮     ⋮     ⋮
  0     0     0
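A sketch reproducing this table by applying the two component-wise updates above (iteration count chosen only for illustration):

```python
# Two-variable gradient descent on J(theta) = theta1**2 + theta2**2 with alpha = 0.1.
theta1, theta2 = 1.0, 1.0
for i in range(25):
    print(i, round(theta1, 4), round(theta2, 4), round(theta1**2 + theta2**2, 4))
    theta1 = theta1 - 0.1 * 2 * theta1   # dJ/dtheta1 = 2*theta1
    theta2 = theta2 - 0.1 * 2 * theta2   # dJ/dtheta2 = 2*theta2
```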