Taylor Series Notes
MATH 22B
Introduction
17.1. According to legend, Richard Feynman accepted a challenge to compute the
cube root of 1729.03 in a race against an abacus computation. Using linear approximation
and a bit of luck, he got 12.002384 with paper and pencil. The actual cube
root is 12.002383785691718123057. How did Feynman do it? The secret is linear
approximation: we approximate a function like $f(x) = x^{1/3}$ with a
linear function. The same can be done with functions of several variables. The linear
approximation is of the form $L(x) = f(a) + f'(a)(x - a)$.
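The cube root estimate can be reproduced with a few lines of Python. This is only a sketch of the linear approximation formula, not a reconstruction of Feynman's actual steps; the base point $a = 1728 = 12^3$ and the function name `linearize_cube_root` are assumptions chosen for illustration.

```python
def linearize_cube_root(x, root=12.0):
    """Evaluate L(x) = f(a) + f'(a)(x - a) for f(x) = x**(1/3) at a = root**3."""
    a = root ** 3                      # base point, here 1728 (an assumed choice)
    slope = 1.0 / (3.0 * root ** 2)    # f'(a) = a**(-2/3) / 3 = 1 / (3 * root**2)
    return root + slope * (x - a)

estimate = linearize_cube_root(1729.03)
exact = 1729.03 ** (1.0 / 3.0)
print(estimate, exact, abs(estimate - exact))
```

The correction term is just 1.03/432, which is why the computation is feasible with paper and pencil.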
17.2. One can also do higher order approximations. The function $f(x) = e^x$, for
example, has the linear approximation $L(x) = 1 + x$ at $a = 0$ and the quadratic
approximation $Q(x) = 1 + x + x^2/2$ at $a = 0$. To get the quadratic term, we just
need to make sure that the first and second derivatives at $x = a$ agree. This gives the
formula $Q(x) = f(a) + f'(a)(x - a) + f''(a)(x - a)^2/2$. Indeed, you can check that $f(x)$
and $Q(x)$ have the same first and second derivatives at $x = a$.
For the function $e^x$, for example, we have the $m$'th order approximation
$e^x \approx 1 + x + x^2/2! + \cdots + x^m/m!$.
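As a small numerical illustration, the partial sums of the exponential series at $a = 0$ can be evaluated directly. The helper name `taylor_exp` is hypothetical, and the sketch only uses the standard `math` module for comparison.

```python
import math

def taylor_exp(x, m):
    """m-th order Taylor polynomial of exp at a = 0: sum of x**k / k! for k = 0..m."""
    total, term = 0.0, 1.0             # term holds x**k / k!, starting with k = 0
    for k in range(m + 1):
        total += term
        term *= x / (k + 1)            # update to x**(k+1) / (k+1)!
    return total

for m in (1, 2, 5, 10):
    approx = taylor_exp(1.0, m)
    print(m, approx, abs(approx - math.e))
```

With `m = 1` and `m = 2` this reproduces the linear and quadratic approximations $1 + x$ and $1 + x + x^2/2$.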
17.3. The same can be done in higher dimensions. Everything is the same. We just
have to use the derivative $df$ rather than the usual derivative $f'$. We look here only at
linear and quadratic approximations of functions $\mathbb{R}^n \to \mathbb{R}$. The linear approximation is
then
$L(x) = f(a) + \nabla f(a)(x - a)$
where $\nabla f(a) = df(a) = [f_{x_1}(a), \cdots, f_{x_n}(a)]$ is the Jacobian matrix, which is a row
vector. Now, since we can see $df(x)$ as a map $\mathbb{R}^n \to \mathbb{R}^n$, the second derivative is a matrix
$d^2 f(x) = H(x)$. It is called the Hessian. It encodes all the second derivatives
$H_{ij}(x) = f_{x_i x_j}(x)$.
Lecture
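17.4. Given a function $f: \mathbb{R}^m \to \mathbb{R}^n$, its derivative $df(x)$ is the Jacobian matrix. For
every $x \in \mathbb{R}^m$, we can use the matrix $df(x)$ and a vector $v \in \mathbb{R}^m$ to get
$D_v f(x) = df(x)v \in \mathbb{R}^n$. For fixed $v$, this defines a map $x \in \mathbb{R}^m \to df(x)v \in \mathbb{R}^n$, like the original
$f$. Because $D_v$ is a map on $X = \{$ all functions from $\mathbb{R}^m \to \mathbb{R}^n \}$, one calls it an
operator. The Taylor formula $f(x + t) = e^{Dt} f(x)$ holds in arbitrary dimensions:
$f(x + tv) = e^{D_v t} f(x) = f(x) + \frac{D_v t f(x)}{1!} + \frac{D_v^2 t^2 f(x)}{2!} + \cdots$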
17.5. Proof. It is the single variable Taylor formula on the line $x + tv$. The directional derivative
$D_v f$ is there the usual derivative, as $\lim_{t \to 0} [f(x + tv) - f(x)]/t = D_v f(x)$. Technically,
we also need the sum to converge, as it does for functions built from polynomials, sin, cos and exp.
17.6. The Taylor formula can also be written down using successive derivatives $df, d^2 f, d^3 f$,
which are then called tensors. In the scalar case $n = 1$, the first derivative $df(x)$
leads to the gradient $\nabla f(x)$, the second derivative $d^2 f(x)$ to the Hessian matrix
$H(x)$, which is a bilinear form acting on pairs of vectors. The third derivative $d^3 f(x)$
then acts on triples of vectors, etc. One can still write as in one dimension
Theorem: $f(x) = f(x_0) + f'(x_0)(x - x_0) + f''(x_0) \frac{(x - x_0)^2}{2!} + \cdots$
if we write $f^{(k)} = d^k f$. For a polynomial, this just means that we first write down the
constant, then all linear terms, then all quadratic terms, then all cubic terms, etc.
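17.7. Assume $f: \mathbb{R}^m \to \mathbb{R}$ and stop the Taylor series after the first step. We get
$L(x_0 + v) = f(x_0) + \nabla f(x_0) \cdot v$.
It is customary to write this with $x = x_0 + v$, $v = x - x_0$ as
$L(x) = f(x_0) + \nabla f(x_0) \cdot (x - x_0)$.
This function is called the linearization of $f$. [2]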
17.8. If we stop the Taylor series after two steps, we get the function
$Q(x + v) = f(x) + df(x) \cdot v + v \cdot d^2 f(x) \cdot v/2$. The matrix $H(x) = d^2 f(x)$ is called the Hessian
matrix at the point $x$. Here too it is customary to eliminate $v$ by writing $x = x_0 + v$. The function
$Q(x) = f(x_0) + \nabla f(x_0) \cdot (x - x_0) + (x - x_0) \cdot H(x_0)(x - x_0)/2$
is called the quadratic approximation of $f$. The kernel of $Q - f(x_0)$ is the quadratic
manifold $Q(x) - f(x_0) = x \cdot Bx + Ax = 0$, where $A = df$ and $B = d^2 f/2$. It
approximates the surface $\{x \mid f(x) - f(x_0) = 0\}$ even better than the linear one. If
$|x - x_0|$ is of the order $\epsilon$, then $|f(x) - L(x)|$ is of the order $\epsilon^2$ and $|f(x) - Q(x)|$ is of
the order $\epsilon^3$. This follows from the exact Taylor formula with remainder. [3]
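The claimed error orders can be observed numerically. The sketch below is an illustration rather than a proof. It borrows the polynomial $f(x, y) = x^3 y^2 + x + y^3$ and its derivative values at $(1, 1)$ from example 17.11 of these notes, and the helper name `ratios` and the choice of direction $(\epsilon, \epsilon)$ are assumptions made here.

```python
def f(x, y):
    return x**3 * y**2 + x + y**3

def L(x, y):
    # Linearization at (1, 1): f = 3, gradient = [4, 5].
    return 3 + 4*(x - 1) + 5*(y - 1)

def Q(x, y):
    # Quadratic approximation at (1, 1) with Hessian [[6, 6], [6, 8]].
    dx, dy = x - 1, y - 1
    return L(x, y) + (6*dx*dx + 12*dx*dy + 8*dy*dy) / 2

def ratios(eps):
    """Return (|f - L| / eps**2, |f - Q| / eps**3) at the point (1 + eps, 1 + eps)."""
    x, y = 1 + eps, 1 + eps
    return abs(f(x, y) - L(x, y)) / eps**2, abs(f(x, y) - Q(x, y)) / eps**3

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, ratios(eps))
```

Both ratios stay bounded as `eps` shrinks, which is what "of the order $\epsilon^2$" and "of the order $\epsilon^3$" mean in practice.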
[Figure: curves labeled $L = C$, $f = C$ and $Q = C$.]
17.9. To get the tangent plane to a surface $f(x) = C$ one can just look at the linear
manifold $L(x) = C$. However, there is a better method:
The tangent plane to a surface $f(x, y, z) = C$ at $(x_0, y_0, z_0)$ is $ax + by + cz = d$,
where $[a, b, c]^T = \nabla f(x_0, y_0, z_0)$ and $d = a x_0 + b y_0 + c z_0$.
[2] Again: the linearization idea is of utmost importance because it brings in linear algebra.
[3] If $f \in C^{n+1}$, then $f(x + t) = \sum_{k=0}^{n} f^{(k)}(x) t^k/k! + \int_0^t (t - s)^n f^{(n+1)}(x + s) \, ds/n!$ (prove this by induction!)
Linear Algebra and Vector Analysis
Proof. Let $r(t)$ be a curve on the surface $S = \{f = C\}$ with $r(0) = x_0$. The chain rule assures
$\frac{d}{dt} f(r(t)) = \nabla f(r(t)) \cdot r'(t)$. But because $f(r(t)) = C$ is constant, this is zero, so $r'(t)$ is
perpendicular to the gradient. As this works for any curve, we are done.
Examples
17.11. Let $f: \mathbb{R}^2 \to \mathbb{R}$ be given as $f(x, y) = x^3 y^2 + x + y^3$. What is the quadratic
approximation at $(x_0, y_0) = (1, 1)$? We have $df(1, 1) = [4, 5]$ and
$\nabla f(1, 1) = \begin{bmatrix} f_x \\ f_y \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \end{bmatrix}, \quad H(1, 1) = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ 6 & 8 \end{bmatrix}$.
The linearization is $L(x, y) = 4(x - 1) + 5(y - 1) + 3$. The quadratic approximation
is $Q(x, y) = 3 + 4(x - 1) + 5(y - 1) + 6(x - 1)^2/2 + 12(x - 1)(y - 1)/2 + 8(y - 1)^2/2$.
This is the situation displayed to the left in Figure (2). For $v = [7, 2]^T$, the directional
derivative is $D_v f(1, 1) = \nabla f(1, 1) \cdot v = [4, 5]^T \cdot [7, 2]^T = 38$. The Taylor expansion given
at the beginning is a finite series because $f$ is a polynomial:
$f([1, 1] + t[7, 2]) = f(1 + 7t, 1 + 2t) = 3 + 38t + 247t^2 + 1023t^3 + 1960t^4 + 1372t^5$.
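The finite expansion along the line can be checked by expanding the polynomial in $t$ with exact integer arithmetic. The following is a minimal sketch in which polynomials are stored as coefficient lists (lowest degree first); the helper names `poly_mul`, `poly_add` and `poly_pow` are hypothetical.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two coefficient lists, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_pow(p, k):
    """Raise a polynomial to a non-negative integer power by repeated multiplication."""
    out = [1]
    for _ in range(k):
        out = poly_mul(out, p)
    return out

x = [1, 7]   # x(t) = 1 + 7t
y = [1, 2]   # y(t) = 1 + 2t

# f(x, y) = x^3 y^2 + x + y^3 evaluated on the line, as a polynomial in t
coeffs = poly_add(poly_add(poly_mul(poly_pow(x, 3), poly_pow(y, 2)), x), poly_pow(y, 3))
print(coeffs)  # [3, 38, 247, 1023, 1960, 1372]
```

The coefficient of $t$ is the directional derivative 38, and the constant term is $f(1, 1) = 3$.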
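17.12. For $f(x, y, z) = -x^4 + x^2 + y^2 + z^2$, the gradient and Hessian at $(1, 1, 1)$ are
$\nabla f(1, 1, 1) = \begin{bmatrix} f_x \\ f_y \\ f_z \end{bmatrix} = \begin{bmatrix} -2 \\ 2 \\ 2 \end{bmatrix}, \quad H(1, 1, 1) = \begin{bmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{bmatrix} = \begin{bmatrix} -10 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$.
The linearization is $L(x, y, z) = 2 - 2(x - 1) + 2(y - 1) + 2(z - 1)$. The quadratic
approximation
$Q(x, y, z) = 2 - 2(x - 1) + 2(y - 1) + 2(z - 1) + (-10(x - 1)^2 + 2(y - 1)^2 + 2(z - 1)^2)/2$
is the situation displayed to the right in Figure (2).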
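Gradient and Hessian entries like these are easy to get wrong by a sign, so a finite-difference check can be useful. The sketch below uses central differences; the function names `gradient` and `hessian` and the step sizes `h` are assumptions made for illustration, and the results are only accurate up to discretization and rounding error.

```python
def f(x, y, z):
    return -x**4 + x**2 + y**2 + z**2

def gradient(func, p, h=1e-5):
    """Central-difference approximation of the gradient of func at the point p."""
    g = []
    for i in range(len(p)):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        g.append((func(*up) - func(*dn)) / (2 * h))
    return g

def hessian(func, p, h=1e-4):
    """Central-difference approximation of the matrix of second partial derivatives."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                q = list(p)
                q[i] += si * h
                q[j] += sj * h
                return func(*q)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

point = (1.0, 1.0, 1.0)
print(gradient(f, point))   # approximately [-2, 2, 2]
print(hessian(f, point))    # approximately diag(-10, 2, 2)
```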
17.13. What is the tangent plane to the surface $f(x, y, z) = 1/10$ for
$f(x, y, z) = 10z^2 - x^2 - y^2 + 100x^4 - 200x^6 + 100x^8 - 200x^2 y^2 + 200x^4 y^2 + 100y^4$
at the point $(x, y, z) = (0, 0, 1/10)$? The gradient is $\nabla f(0, 0, 1/10) = [0, 0, 2]^T$. The
tangent plane equation is $2z = d$, where the constant $d$ is obtained by plugging in the
point. We end up with $2z = 2/10$. The linearization is $L(x, y, z) = 1/10 + 2(z - 1/10)$.
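The tangent plane recipe $ax + by + cz = d$ with $[a, b, c]^T = \nabla f(x_0, y_0, z_0)$ can be sketched in code as follows. The partial derivatives in `grad_f` were differentiated by hand for this illustration, and the names `grad_f`, `tangent_plane` and `linearization` are hypothetical.

```python
def f(x, y, z):
    return (10*z**2 - x**2 - y**2 + 100*x**4 - 200*x**6 + 100*x**8
            - 200*x**2*y**2 + 200*x**4*y**2 + 100*y**4)

def grad_f(x, y, z):
    # Partial derivatives worked out by hand from the formula for f.
    fx = -2*x + 400*x**3 - 1200*x**5 + 800*x**7 - 400*x*y**2 + 800*x**3*y**2
    fy = -2*y - 400*x**2*y + 400*x**4*y + 400*y**3
    fz = 20*z
    return (fx, fy, fz)

def tangent_plane(grad, p0):
    """Return (a, b, c, d) for the plane a*x + b*y + c*z = d through p0 with normal grad."""
    a, b, c = grad
    d = a*p0[0] + b*p0[1] + c*p0[2]
    return a, b, c, d

def linearization(p, p0):
    """Evaluate L(p) = f(p0) + grad f(p0) . (p - p0)."""
    g = grad_f(*p0)
    return f(*p0) + sum(gi * (pi - p0i) for gi, pi, p0i in zip(g, p, p0))

p0 = (0.0, 0.0, 0.1)
print(tangent_plane(grad_f(*p0), p0))
print(linearization(p0, p0))
```

Evaluating the linearization at the base point returns $f(0, 0, 1/10) = 1/10$ up to rounding, so the level set $L = 1/10$ and the plane $2z = 2/10$ describe the same plane $z = 1/10$.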
Figure 3.
Homework