Chapter 9 Newton's Method
An Introduction to Optimization
Spring, 2014
Wei-Ta Chu
Introduction
The steepest descent method uses only first derivatives in
selecting a suitable search direction. Newton's method
(sometimes called the Newton-Raphson method) uses first and
second derivatives, and indeed performs better than steepest
descent when the initial point is close to the minimizer.
Given a starting point, we construct a quadratic approximation
to the objective function that matches its first and second
derivative values at that point. We then minimize the
approximation (a quadratic function) instead of the original
objective function. The minimizer of the approximation is used
as the starting point of the next step, and the procedure is
repeated iteratively.
Introduction
We can obtain a quadratic approximation to a twice continuously
differentiable objective function f: R^n → R using the Taylor
series expansion of f about the current point x^(k), neglecting
terms of order three and higher:

f(x) ≈ f(x^(k)) + (x - x^(k))^T g^(k) + (1/2)(x - x^(k))^T F(x^(k))(x - x^(k)) =: q(x),

where g^(k) = ∇f(x^(k)) and F(x^(k)) is the Hessian of f at x^(k).
Applying the FONC to q yields

∇q(x) = g^(k) + F(x^(k))(x - x^(k)) = 0.

If F(x^(k)) > 0 (positive definite), then q achieves a minimum at

x^(k+1) = x^(k) - F(x^(k))^{-1} g^(k).
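The recursion above is only a few lines of code. A minimal sketch in
NumPy, where grad and hess stand for user-supplied callables returning
g^(k) and F(x^(k)) (the function name and signature are illustrative,
not from the text):

    import numpy as np

    def newton(x0, grad, hess, num_iter=10):
        # Pure Newton recursion: x <- x - F(x)^{-1} g(x).
        x = np.asarray(x0, dtype=float)
        for _ in range(num_iter):
            # Solve F(x) d = -g(x) rather than forming the inverse explicitly.
            d = np.linalg.solve(hess(x), -grad(x))
            x = x + d
        return x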
Example
Use Newton's method to minimize the Powell function

f(x1, x2, x3, x4) = (x1 + 10 x2)^2 + 5 (x3 - x4)^2 + (x2 - 2 x3)^4 + 10 (x1 - x4)^4,

using as the starting point x^(0) = [3, -1, 0, 1]^T. The gradient of f is

g(x) = [ 2(x1 + 10 x2) + 40(x1 - x4)^3,
         20(x1 + 10 x2) + 4(x2 - 2 x3)^3,
         10(x3 - x4) - 8(x2 - 2 x3)^3,
         -10(x3 - x4) - 40(x1 - x4)^3 ]^T,

and the Hessian F(x) follows by differentiating once more.
Example
Iteration 1. With x^(0) = [3, -1, 0, 1]^T we have f(x^(0)) = 215,

g(x^(0)) = [306, -144, -2, -310]^T,

F(x^(0)) = [  482    20     0  -480 ]
           [   20   212   -24     0 ]
           [    0   -24    58   -10 ]
           [ -480     0   -10   490 ],

so that

x^(1) = x^(0) - F(x^(0))^{-1} g(x^(0)) = [1.5873, -0.1587, 0.2540, 0.2540]^T,  f(x^(1)) = 31.8.
Example
Iteration 2. Proceeding in the same way,

x^(2) = [1.0582, -0.1058, 0.1694, 0.1694]^T,  f(x^(2)) = 6.28.
Example
Iteration 3.

x^(3) = [0.7037, -0.0704, 0.1121, 0.1111]^T,  f(x^(3)) = 1.24.

The iterates continue to approach the minimizer x* = 0, with f
decreasing at every step.
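The three iterations above can be reproduced with a short script. A
minimal NumPy sketch, with the gradient and Hessian of the Powell
function coded by hand from the expressions given earlier:

    import numpy as np

    def f(x):
        x1, x2, x3, x4 = x
        return (x1 + 10*x2)**2 + 5*(x3 - x4)**2 + (x2 - 2*x3)**4 + 10*(x1 - x4)**4

    def grad(x):
        x1, x2, x3, x4 = x
        return np.array([
            2*(x1 + 10*x2) + 40*(x1 - x4)**3,
            20*(x1 + 10*x2) + 4*(x2 - 2*x3)**3,
            10*(x3 - x4) - 8*(x2 - 2*x3)**3,
            -10*(x3 - x4) - 40*(x1 - x4)**3,
        ])

    def hess(x):
        x1, x2, x3, x4 = x
        a = 120*(x1 - x4)**2      # from the term 10 (x1 - x4)^4
        b = 12*(x2 - 2*x3)**2     # from the term (x2 - 2 x3)^4
        return np.array([
            [2 + a,       20,         0,      -a],
            [   20,  200 + b,      -2*b,       0],
            [    0,     -2*b,  10 + 4*b,     -10],
            [   -a,        0,       -10,  10 + a],
        ])

    x = np.array([3.0, -1.0, 0.0, 1.0])             # x^(0)
    for k in range(3):
        x = x + np.linalg.solve(hess(x), -grad(x))  # one Newton step
        print(k + 1, x, f(x))                       # f: 31.8, 6.28, 1.24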
Introduction
Observe that the kth iteration of Newton's method can be
written in two steps as
1. Solve F(x^(k)) d^(k) = -g^(k) for d^(k).
2. Set x^(k+1) = x^(k) + d^(k).
Step 1 requires the solution of an n x n system of linear
equations. Thus, an efficient method for solving systems of
linear equations is essential when using Newton's method.
As in the one-variable case, Newton's method can be viewed as
a technique for iteratively solving equations of the form
g(x) = 0, where g: R^n → R^n; in our case, g = ∇f.
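Because F(x^(k)) is symmetric, and positive definite near a strict
local minimizer, step 1 is usually carried out with a Cholesky
factorization rather than a general solver or an explicit inverse. A
minimal sketch using SciPy (the function name newton_step is
illustrative):

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def newton_step(x, grad, hess):
        # One iteration of Newton's method in the two-step form.
        g = grad(x)
        F = hess(x)
        c, low = cho_factor(F)        # factor F; requires F positive definite
        d = cho_solve((c, low), -g)   # step 1: solve F d = -g for d
        return x + d                  # step 2: x <- x + d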
Analysis of Newton’s Method
The convergence analysis of Newton's method when f is a
quadratic function is straightforward: Newton's method reaches
the point x* such that ∇f(x*) = 0 in just one step, starting from
any initial point x^(0).
Suppose that Q = Q^T is invertible and

f(x) = (1/2) x^T Q x - x^T b.

Then g(x) = ∇f(x) = Q x - b and F(x) = Q.
Hence, given any initial point x^(0), by Newton's algorithm

x^(1) = x^(0) - F(x^(0))^{-1} g^(0) = x^(0) - Q^{-1}(Q x^(0) - b) = Q^{-1} b = x*.
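This one-step property is easy to check numerically. A quick sketch
with a randomly generated symmetric positive definite Q (the particular
matrix and starting point are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    Q = A @ A.T + 5 * np.eye(5)        # symmetric positive definite
    b = rng.standard_normal(5)

    x0 = rng.standard_normal(5)        # any initial point
    g0 = Q @ x0 - b                    # gradient of (1/2) x^T Q x - x^T b at x0
    x1 = x0 - np.linalg.solve(Q, g0)   # one Newton step

    print(np.allclose(x1, np.linalg.solve(Q, b)))   # True: x^(1) = Q^{-1} b = x*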
Analysis of Newton's Method
For a general (nonquadratic) objective, suppose f ∈ C^3 and let
x* be a point where g(x*) = 0 and F(x*) is invertible. By
Taylor's theorem, for x^(k) sufficiently close to x*,

0 = g(x*) = g^(k) + F(x^(k))(x* - x^(k)) + O(||x* - x^(k)||^2).
Analysis of Newton’s Method
Subtracting x* from both sides of Newton's algorithm and
taking norms yields

||x^(k+1) - x*|| = ||x^(k) - x* - F(x^(k))^{-1} g^(k)||
                 = ||F(x^(k))^{-1} (g(x*) - g^(k) - F(x^(k))(x* - x^(k)))||.

Then, applying the Taylor expansion above, there exists a
constant c > 0 such that

||x^(k+1) - x*|| <= c ||x^(k) - x*||^2

for all x^(k) sufficiently close to x*.
Analysis of Newton’s Method
By induction, we obtain that if the initial point satisfies
c ||x^(0) - x*|| < 1, then

c ||x^(k) - x*|| <= (c ||x^(0) - x*||)^{2^k} → 0,

so x^(k) converges to x*, and the order of convergence is at
least 2.
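The error-squaring behavior is easy to observe numerically. A sketch on
the one-dimensional function f(x) = x^2 + e^x, chosen here purely for
illustration (it is strictly convex, so Newton's method converges to
the unique minimizer):

    import numpy as np

    g = lambda x: 2*x + np.exp(x)   # f'(x) for f(x) = x^2 + exp(x)
    h = lambda x: 2 + np.exp(x)     # f''(x), positive everywhere

    xs = 0.0                        # first find x* to machine precision
    for _ in range(50):
        xs -= g(xs) / h(xs)

    x = 1.0
    for k in range(6):
        x -= g(x) / h(x)
        print(k + 1, abs(x - xs))   # the error is roughly squared each step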
Analysis of Newton’s Method
Theorem 9.2: Let {x^(k)} be the sequence generated by Newton's
method for minimizing a given objective function f. If the
Hessian F(x^(k)) > 0 and g^(k) = ∇f(x^(k)) ≠ 0, then the search
direction

d^(k) = -F(x^(k))^{-1} g^(k)

from x^(k) to x^(k+1) = x^(k) + d^(k) is a descent direction, in
the sense that there exists an ᾱ > 0 such that for all α ∈ (0, ᾱ],

f(x^(k) + α d^(k)) < f(x^(k)).
Analysis of Newton’s Method
Proof: Let φ(α) = f(x^(k) + α d^(k)). Then, using the chain rule, we
obtain

φ'(α) = ∇f(x^(k) + α d^(k))^T d^(k).

Hence,

φ'(0) = ∇f(x^(k))^T d^(k) = -(g^(k))^T F(x^(k))^{-1} g^(k) < 0,

because F(x^(k))^{-1} > 0 and g^(k) ≠ 0. Thus, there exists an ᾱ > 0
so that for all α ∈ (0, ᾱ], φ(α) < φ(0). This implies that for all
α ∈ (0, ᾱ],

f(x^(k) + α d^(k)) < f(x^(k)),

which completes the proof.
Analysis of Newton’s Method
Theorem 9.2 motivates the following modification of Newton's
method:

x^(k+1) = x^(k) - α_k F(x^(k))^{-1} g^(k),

where

α_k = argmin_{α >= 0} f(x^(k) - α F(x^(k))^{-1} g^(k));

that is, at each iteration, we perform a line search in the
direction -F(x^(k))^{-1} g^(k).
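In practice, the exact minimization over α is often replaced by an
inexact line search. A minimal sketch of a damped Newton step using
backtracking with an Armijo sufficient-decrease test (a common
substitute for the exact rule stated above; the constants 0.5 and
1e-4 are conventional illustrative choices):

    import numpy as np

    def damped_newton_step(x, f, grad, hess, beta=0.5, sigma=1e-4):
        # Compute the Newton direction, then backtrack on the step size.
        g = grad(x)
        d = np.linalg.solve(hess(x), -g)
        alpha = 1.0
        while f(x + alpha * d) > f(x) + sigma * alpha * (g @ d):
            alpha *= beta              # shrink until sufficient decrease
        return x + alpha * d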
A drawback of Newton's method is that evaluation of F(x^(k))
for large n can be computationally expensive. Furthermore, we
have to solve the set of linear equations F(x^(k)) d^(k) = -g^(k).
In Chapters 10 and 11 we discuss methods that alleviate this
difficulty.
Another issue is that the Hessian matrix may not be positive
definite. In the next section we describe a simple modification
to overcome this problem.
Levenberg-Marquardt Modification
If the Hessian matrix F(x^(k)) is not positive definite, then the
search direction d^(k) = -F(x^(k))^{-1} g^(k) may not point in a
descent direction.
Levenberg-Marquardt modification:

x^(k+1) = x^(k) - (F(x^(k)) + μ_k I)^{-1} g^(k),   μ_k >= 0.
Levenberg-Marquardt Modification
Indeed, the matrix G = F(x^(k)) + μ_k I has the same eigenvectors
as F(x^(k)), and its eigenvalues are λ_1 + μ_k, ..., λ_n + μ_k,
where λ_1, ..., λ_n are the eigenvalues of F(x^(k)). Hence, if μ_k
is sufficiently large, then all eigenvalues of G are positive, G is
positive definite, and -(F(x^(k)) + μ_k I)^{-1} g^(k) is a descent
direction. Note that as μ_k → 0 the iteration approaches pure
Newton's method, whereas for very large μ_k it behaves like a
gradient method with a small step size.
Levenberg-Marquardt Modification
If we further introduce a step size α_k, we obtain the iteration

x^(k+1) = x^(k) - α_k (F(x^(k)) + μ_k I)^{-1} g^(k),

where α_k can be chosen by a line search to guarantee that f
decreases at each step.
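A minimal sketch of one Levenberg-Marquardt-modified Newton step: μ is
increased until the shifted Hessian admits a Cholesky factorization,
i.e., until it is positive definite (the initial μ and the growth
factor 10 are illustrative choices):

    import numpy as np

    def lm_newton_step(x, grad, hess, mu=1e-3):
        g = grad(x)
        F = hess(x)
        n = x.size
        while True:
            try:
                # Cholesky succeeds exactly when F + mu*I is positive definite.
                L = np.linalg.cholesky(F + mu * np.eye(n))
                break
            except np.linalg.LinAlgError:
                mu *= 10.0             # increase the shift and try again
        # Solve (F + mu*I) d = -g using the Cholesky factor L.
        d = np.linalg.solve(L.T, np.linalg.solve(L, -g))
        return x + d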
Newton’s Method for Nonlinear Least Squares
Consider the problem

minimize  Σ_{i=1}^m (r_i(x))^2,

where r_i: R^n → R, i = 1, ..., m, are given functions. This
particular problem is called a nonlinear least-squares problem.
As an example, suppose that we are given m measurements of a
process at m points in time. Let t_1, ..., t_m denote the
measurement times and y_1, ..., y_m the measurement values.
We wish to fit a sinusoid to the measurement data.
Newton’s Method for Nonlinear Least Squares
The equation of the sinusoid is

y = A sin(ωt + φ).

The fitting problem is to choose the parameters A, ω, and φ so
that the sinusoid best matches the data, in the sense of
minimizing

Σ_{i=1}^m (y_i - A sin(ω t_i + φ))^2.
Newton’s Method for Nonlinear Least Squares
Defining x = [A, ω, φ]^T, we write the objective function as

f(x) = Σ_{i=1}^m (r_i(x))^2,  where  r_i(x) = y_i - x_1 sin(x_2 t_i + x_3).

To apply Newton's method, we need to compute the gradient and
the Hessian of f. The jth component of ∇f(x) is

(∇f(x))_j = ∂f/∂x_j (x) = 2 Σ_{i=1}^m r_i(x) ∂r_i/∂x_j (x).

Letting J(x) denote the Jacobian matrix of r = [r_1, ..., r_m]^T,
with (i, j)th entry ∂r_i/∂x_j (x), we can write the gradient
compactly as

∇f(x) = 2 J(x)^T r(x).
Newton’s Method for Nonlinear Least Squares
We next compute the Hessian matrix of f. Its (k, j)th component
is given by

∂²f/∂x_k∂x_j (x) = 2 Σ_{i=1}^m ( ∂r_i/∂x_k (x) ∂r_i/∂x_j (x) + r_i(x) ∂²r_i/∂x_k∂x_j (x) ).

Letting S(x) be the matrix whose (k, j)th component is
Σ_{i=1}^m r_i(x) ∂²r_i/∂x_k∂x_j (x), we may write the Hessian as

F(x) = 2 (J(x)^T J(x) + S(x)).
Newton’s Method for Nonlinear Least Squares
Therefore, Newton’s method applied to the nonlinear least-
squares problem is given by
25
Example
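As a concrete illustration, here is a minimal Gauss-Newton sketch that
fits y = A sin(ωt + φ) to synthetic measurements; the true parameters,
noise level, and starting guess below are made-up values for
illustration, not the data from the course example:

    import numpy as np

    rng = np.random.default_rng(1)

    # Synthetic measurements of a noisy sinusoid.
    t = np.linspace(0.0, 10.0, 21)
    A_true, w_true, phi_true = 2.0, 1.5, 0.5
    y = A_true * np.sin(w_true * t + phi_true) + 0.05 * rng.standard_normal(t.size)

    def residual(x):                  # r_i(x) = y_i - A sin(w t_i + phi)
        A, w, phi = x
        return y - A * np.sin(w * t + phi)

    def jacobian(x):                  # J[i, j] = dr_i/dx_j
        A, w, phi = x
        s, c = np.sin(w * t + phi), np.cos(w * t + phi)
        return np.column_stack([-s, -A * t * c, -A * c])

    x = np.array([1.5, 1.4, 0.0])     # starting guess near the true parameters
    for _ in range(10):
        J, r = jacobian(x), residual(x)
        x = x + np.linalg.solve(J.T @ J, -J.T @ r)   # Gauss-Newton step
    print(x)                          # approaches [A_true, w_true, phi_true]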
Newton’s Method for Nonlinear Least Squares
A potential problem with the Gauss-Newton method is that the
matrix J(x^(k))^T J(x^(k)) may not be positive definite.
This problem can be overcome using a Levenberg-Marquardt
modification:

x^(k+1) = x^(k) - (J(x^(k))^T J(x^(k)) + μ_k I)^{-1} J(x^(k))^T r(x^(k)).

The resulting algorithm is often referred to as the
Levenberg-Marquardt algorithm.
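Continuing the sketch above, the Levenberg-Marquardt modification
changes only the linear system that is solved at each step (a fixed
illustrative μ is used here; practical implementations adapt μ between
iterations):

    import numpy as np

    def lm_least_squares_step(x, residual, jacobian, mu=1e-2):
        # One step of the Levenberg-Marquardt algorithm for min sum r_i(x)^2.
        J, r = jacobian(x), residual(x)
        # The mu*I shift keeps J^T J + mu*I positive definite.
        d = np.linalg.solve(J.T @ J + mu * np.eye(x.size), -J.T @ r)
        return x + d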