
MATHEMATICAL THEORY FOR SOCIAL SCIENTISTS

The Mean Value Theorem


Rolle’s Theorem. If $f(x)$ is continuous in the closed interval $[a, b]$ and differentiable in the open interval $(a, b)$, and if $f(a) = f(b) = 0$, then there exists a number $c \in (a, b)$ such that $f'(c) = 0$.

When it is represented geometrically, this theorem should strike one as obvious; and to prove it formally may seem a waste of time. Nevertheless, in proving it, we can show how visual logic may be converted into verbal or algebraic logic. Some of the more punctilious issues of Mathematical Analysis cannot be represented adequately in diagrams, and it is for this reason that we prefer to rely upon algebraic methods when seeking firm proofs of analytic propositions.
To prove Rolle’s theorem algebraically, we should invoke the result that the function $f(x)$ must attain an upper bound (a maximum) or a lower bound (a minimum) in the interval $(a, b)$. Of course, the function might have both an upper and a lower bound in $(a, b)$. However, imagine that the function rises above the horizontal axis in the interval, and that it cuts the axis only at the points $a, b$, which are not included in the open interval. Then it has an upper bound but not a lower bound in $(a, b)$. If the function is horizontal over the interval, then every point in $(a, b)$ is both an upper bound and a lower bound. Since, in that case, $f'(x) = 0$ for every $x \in (a, b)$, there is nothing to prove.
We shall prove the theorem for the case where there is an upper bound in $(a, b)$ which corresponds to the point $c$. Then, by assumption, $f(c) \geq f(c + h)$ for small values of $h > 0$ which do not carry us outside the interval. It follows that

$$f'(c+) = \lim_{h \to 0+} \frac{f(c + h) - f(c)}{h} \leq 0.$$

Here the symbolism associated with the limit indicates that $h$ tends to 0 from above.
Now let $h$ be a small negative number such that $c + h < c$ remains in the interval $(a, b)$. Then

$$f'(c-) = \lim_{h \to 0-} \frac{f(c + h) - f(c)}{h} \geq 0,$$

where the symbolism associated with the limit indicates that $h$ tends to 0 from below. Now the assumption that $f(x)$ is differentiable in $(a, b)$ implies that $f'(c+) = f'(c-)$; and the only way in which this can be reconciled with the inequalities above is if $f'(c) = 0$. This proves the theorem in part. The rest of the proof, which concerns the case where there is a lower bound, follows along the same lines.
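
As a numerical illustration of Rolle’s theorem, the following Python sketch locates a point $c$ at which the derivative vanishes. The function $f(x) = \sin x$ on $[0, \pi]$ and the helper name `bisect` are choices made here for illustration and are not taken from the text; bisection is used only because $f'$ changes sign over the interval in this example.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Find a zero of g in [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # zero lies in the upper half
        else:
            hi = mid                # zero lies in the lower half
    return 0.5 * (lo + hi)

# Illustrative choice: f(x) = sin(x) vanishes at both ends of [0, pi].
f = math.sin
f_prime = math.cos
a, b = 0.0, math.pi

c = bisect(f_prime, a, b)
print(f"c = {c:.10f}, f'(c) = {f_prime(c):.2e}")
```

In this example the point found is the maximum of the function, which corresponds to the case of an upper bound treated in the proof.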


The following theorem, which is of prime importance in Mathematical Analysis, represents a generalisation of Rolle’s theorem, and it has a similar visual or geometric interpretation:

The Mean Value Theorem. If $f(x)$ is continuous in the interval $[a, b]$ and differentiable in the open interval $(a, b)$, then there exists a point $c \in (a, b)$ such that

$$\frac{f(b) - f(a)}{b - a} = f'(c).$$

Proof. The line passing through the points $(a, f(a))$ and $(b, f(b))$ of the function $f(x)$ has the equation

$$\ell(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a).$$

Define a function $\phi(x) = f(x) - \ell(x)$ which represents the vertical discrepancy between the line and the function.
Since $\ell(x)$ and $f(x)$ agree at the points $x = a, b$, we have $\phi(a) = \phi(b) = 0$. Therefore Rolle’s theorem applies to $\phi(x)$ when $x \in (a, b)$; and we may note in passing that $\phi$ is just the Greek version of $f$. It follows from the theorem that there exists a point $c \in (a, b)$ at which $\phi'(c) = 0$. That is,

$$\phi'(c) = f'(c) - \ell'(c) = f'(c) - \frac{f(b) - f(a)}{b - a} = 0.$$

When this equation is rearranged, we have the result which was to be proved.
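
The construction used in the proof can be followed numerically. The sketch below assumes the illustrative function $f(x) = x^3$ on $[0, 2]$, which is not an example from the text, and searches for a zero of $\phi'(x) = f'(x) - \ell'(x)$ by bisection; the helper name `bisect` is likewise hypothetical.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Find a zero of g in [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative choice: f(x) = x^3 on [a, b] = [0, 2].
f = lambda x: x**3
f_prime = lambda x: 3.0 * x**2
a, b = 0.0, 2.0

slope = (f(b) - f(a)) / (b - a)          # slope of the chord l(x)
phi_prime = lambda x: f_prime(x) - slope  # derivative of phi = f - l

c = bisect(phi_prime, a, b)
print(f"chord slope = {slope}, c = {c:.10f}, f'(c) = {f_prime(c):.10f}")
```

Here the chord has slope 4, so the mean value satisfies $3c^2 = 4$, which gives $c = 2/\sqrt{3}$.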

The mean value theorem can be represented in a way which conforms with some later results. Consider the equation of the linear function $\ell(x)$ which takes the same values as the function $f(x)$ at the points $x = a, b$. Setting $x = b$ in the equation and using $f(b) = \ell(b)$ gives rise to the expression

$$f(b) = f(a) + \frac{f(b) - f(a)}{b - a}(b - a) = f(a) + f'(c)(b - a),$$

where $c$ is a value in the interval $(a, b)$, as indicated by the mean value theorem. Let $h = b - a$. Then $c = a + \lambda h$ for some $\lambda \in (0, 1)$, and the equation above can be written in the form of

$$f(a + h) = f(a) + hf'(a + \lambda h).$$


In fact, this expression is valid not only for the points $a, b$ but for any $x, x + h \in [a, b]$; and, therefore, it is appropriate to write

$$f(x + h) = f(x) + hf'(x + \lambda h).$$
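
A short worked case may help to fix the meaning of $\lambda$. Taking $f(x) = x^2$, which is an example chosen here rather than one from the text, we have $f'(x) = 2x$, and the expansion can be carried out exactly:

```latex
\begin{align*}
f(x + h) &= (x + h)^2 = x^2 + 2xh + h^2 \\
         &= f(x) + h \cdot 2\left(x + \tfrac{1}{2}h\right) \\
         &= f(x) + h f'\left(x + \tfrac{1}{2}h\right).
\end{align*}
```

In this case $\lambda = \tfrac{1}{2}$ for every $x$ and $h$, so that the mean value lies at the midpoint of the interval. For other functions, $\lambda$ will generally depend on both $x$ and $h$.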

Linear Approximations and Newton’s Method


The equation above, which provides us with the value of the function $f$ at the point $x + h$, depends upon our knowing the precise point $x + \lambda h$, called the mean value, at which to evaluate the derivative $f'$. When the derivative is evaluated at the same point $x$ as the function itself is evaluated, we obtain an alternative equation in the form of

$$f(x + h) = f(x) + hf'(x) + r,$$

where $r$ is a remainder term.


This equation is the basis of Newton’s method of approximation, which is used for finding the root (i.e. the solution) of an algebraic equation $f(x) = 0$. Imagine that $\xi$ is an approximation to the value of $x$ which solves this equation, and let the exact value of the root be denoted by $x = \xi + h$. Then

$$0 = f(\xi + h) = f(\xi) + hf'(\xi) + r.$$

The solution for $h$ is

$$h = -\{f'(\xi)\}^{-1}f(\xi) - \{f'(\xi)\}^{-1}r;$$

and, therefore, the root may be expressed as

$$\xi + h = \xi - \{f'(\xi)\}^{-1}f(\xi) - \{f'(\xi)\}^{-1}r.$$

When seeking to evaluate this expression, we are likely to know all but the final term, which depends on the remainder $r$. On setting this term to zero, we obtain a value $\xi_1$ on the LHS of the equation which is an approximation to the root and which is liable to be a better approximation than was the original value $\xi$. To improve the approximation still further, we may replace $\xi$ on the RHS of the equation by $\xi_1$ so as to obtain $\xi_2$ on the LHS. The process can be repeated indefinitely in the expectation of generating ever-improving approximations to the root of the equation. In effect, we have specified an iterative method for finding the root. The method is attributable to Newton. The equation of the algorithm by which the $(r + 1)$th approximation is found from the $r$th approximation, where the index $r$ now counts iterations and is unrelated to the remainder term, is

$$\xi_{r+1} = \xi_r - \{f'(\xi_r)\}^{-1}f(\xi_r).$$
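
The iteration can be sketched in a few lines of Python. The function name `newton`, the stopping rule and the cubic test equation are assumptions made for this illustration; the text specifies only the updating formula itself.

```python
def newton(f, f_prime, x0, tol=1e-12, max_iter=50):
    """Iterate x <- x - f(x)/f'(x) until successive values agree to within tol."""
    x = x0
    for _ in range(max_iter):
        slope = f_prime(x)
        if slope == 0.0:
            raise ZeroDivisionError("derivative vanished; Newton step undefined")
        x_next = x - f(x) / slope
        # Relative stopping rule (an assumption; the text gives no criterion).
        if abs(x_next - x) <= tol * max(1.0, abs(x_next)):
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical test equation: x^3 - 2x - 5 = 0, started from x0 = 2.
g = lambda x: x**3 - 2.0 * x - 5.0
g_prime = lambda x: 3.0 * x**2 - 2.0
root = newton(g, g_prime, x0=2.0)
print(f"root = {root:.10f}, residual = {g(root):.2e}")
```

The method is not guaranteed to converge from an arbitrary starting value, which is why the sketch caps the number of iterations and guards against a vanishing derivative.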


Example. Consider the matter of finding the square root of the number $N > 0$, which is the root of the equation

$$f(x) = x^2 - N = 0.$$

The derivative of the function is $f'(x) = 2x$; and therefore the algorithm above becomes

$$\xi_{r+1} = \xi_r - \frac{1}{2\xi_r}\left(\xi_r^2 - N\right) = \frac{1}{2}\left(\xi_r + \frac{N}{\xi_r}\right).$$
The starting value, which may be denoted by $\xi_0$, can be an arbitrary positive number, and $\xi_0 = N$ is a reasonable choice. The convergence of the sequence $\{\xi_0, \xi_1, \xi_2, \ldots\}$ to the value of $x = \sqrt{N}$ is remarkably rapid; and, in fact, this algorithm forms part of the library of numerical routines which is built into the ROM of a microcomputer. The following sequence has been generated by a computer which has applied the algorithm to the case where $N = 2$ and $\xi_0 = 2$:

$$\begin{array}{ll}
\xi_0 = 2.00000000 & \xi_0^2 = 4.00000000 \\
\xi_1 = 1.50000000 & \xi_1^2 = 2.25000000 \\
\xi_2 = 1.41666663 & \xi_2^2 = 2.00694433 \\
\xi_3 = 1.41421568 & \xi_3^2 = 2.00000600 \\
\xi_4 = 1.41421354 & \xi_4^2 = 1.99999993
\end{array}$$

After three iterations, the approximation to $\sqrt{2}$ is acceptable. After a few more iterations, one can proceed no further on account of the finite (i.e. limited) accuracy with which the computer represents real numbers.
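
The square-root iteration is easily reproduced. The following Python sketch, in which the function name `newton_sqrt` and the number of iterations are choices made here, uses double-precision arithmetic, so the trailing digits that it prints may differ from those of the table above.

```python
def newton_sqrt(n, iterations=6):
    """Return [xi_0, xi_1, ...] from xi_{r+1} = (xi_r + n/xi_r)/2 with xi_0 = n."""
    xi = float(n)
    history = [xi]
    for _ in range(iterations):
        xi = 0.5 * (xi + n / xi)
        history.append(xi)
    return history

history = newton_sqrt(2.0)
for r, xi in enumerate(history):
    print(f"xi_{r} = {xi:.8f}   xi_{r}^2 = {xi * xi:.8f}")
```

Once the iterates agree to within the precision of the arithmetic, further iterations leave the value essentially unchanged, which is the limit on accuracy that the text describes.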

The Quadratic Mean Value Theorem


We have come to regard the mean value theorem as a theorem concerning the approximation of a continuous and differentiable function $f(x)$ over the interval $[a, a + h]$ by a linear function $\ell(x)$. Linear approximations are of fundamental importance and are used in many varied contexts. However, they have evident limitations; and, often, it is appropriate to consider more sophisticated approximations which have the potential for being more accurate. Therefore, at the least, we should consider using a quadratic function to approximate $f(x)$.
A familiar way of representing a quadratic function is via the equation $p(x) = ax^2 + bx + c$. We shall use other notation. For a start, the letters $a, b, c$ have been preempted for other uses. In the second place, it is helpful to place the “origin” of the argument at the beginning of the interval $[a, b]$. Therefore we shall represent the quadratic by

$$q(x) = q_0 + q_1(x - a) + q_2(x - a)^2.$$

In comparison with the linear function $\ell(x)$, there is an extra parameter which can be used in pursuit of an improved approximation to $f(x)$ over the interval. We shall continue to apply the endpoint conditions, which are that $q(a) = f(a)$ and $q(b) = f(b)$. However, there is a variety of ways in which we may use the additional degree of freedom which is afforded by the extra parameter $q_2$. The appropriate choice, for present purposes, is to constrain the derivative of $q(x)$ to equal that of $f(x)$ at the point $x = a$. The three conditions which determine the quadratic parameters are therefore

$$\begin{aligned}
f(a) &= q(a) = q_0, \\
f'(a) &= q'(a) = q_1, \\
f(b) &= q(b) = q_0 + q_1 h + q_2 h^2,
\end{aligned}$$

where $h = b - a$. The parameters $q_0$ and $q_1$ are determined immediately by the first two equations, whence the third equation gives

$$q_2 = \frac{f(b) - f(a) - hf'(a)}{h^2}.$$

Thus the approximating quadratic may be written as

$$q(x) = f(a) + (x - a)f'(a) + q_2(x - a)^2,$$

and this becomes

$$q(b) = f(b) = f(a) + hf'(a) + q_2 h^2$$

at the point $x = b$. The result which is known as the quadratic mean value theorem asserts that the parameter $q_2$ may be expressed in terms of the second-order derivative of the function $f(x)$ evaluated at some point $c \in (a, b)$, which is denoted by $f''(c)$:

The Quadratic Mean-Value Theorem. If $f(x)$ is continuous in the interval $[a, b]$ of length $h = b - a$ and twice differentiable in the open interval $(a, b)$, then there is a point $c \in [a, b]$ such that

$$f(b) = f(a) + hf'(a) + \tfrac{1}{2}h^2 f''(c).$$


Proof. Consider the function $\phi(x) = f(x) - q(x)$ which represents the discrepancy between the function $f(x)$ and its quadratic approximation $q(x)$. Since the two functions agree at the points $a, b$, it follows that $\phi(a) = \phi(b) = 0$. Therefore Rolle’s theorem can be applied to show that there is a point $c_1 \in (a, b)$ such that $\phi'(c_1) = 0$.
Now recall that the parameters of $q(x)$ have been determined to ensure that the condition $f'(a) = q'(a)$ is fulfilled. This implies that $\phi'(a) = 0$. It follows that Rolle’s theorem can be applied a second time to the derived function $\phi'(x)$. Thus it transpires that there is a point $c \in (a, c_1)$ at which $\phi''(c) = 0$. That is,

$$0 = \phi''(c) = f''(c) - q''(c) = f''(c) - 2q_2;$$

and hence the value $q_2 = \tfrac{1}{2}f''(c)$ is attributed to the quadratic parameter associated with $h^2$. ∎
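
The theorem can be checked numerically in a case where the point $c$ is available in closed form. The sketch below assumes $f(x) = e^x$ on $[0, 1]$, an example chosen here because $f = f' = f''$, so that the equation $f''(c) = 2q_2$ can be solved with a logarithm.

```python
import math

# Illustrative choice: f(x) = exp(x), for which f, f' and f'' coincide.
f = f_prime = f_second = math.exp
a, b = 0.0, 1.0
h = b - a

# Quadratic parameter from the endpoint and derivative conditions.
q2 = (f(b) - f(a) - h * f_prime(a)) / h**2

# Solve f''(c) = 2*q2, i.e. exp(c) = 2*q2, for the intermediate point c.
c = math.log(2.0 * q2)

reconstructed = f(a) + h * f_prime(a) + 0.5 * h**2 * f_second(c)
print(f"q2 = {q2:.8f}, c = {c:.8f}")
print(f"f(b) = {f(b):.8f}, reconstructed = {reconstructed:.8f}")
```

The value of $c$ obtained in this way lies strictly inside the interval, as the proof indicates.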

Since the quadratic mean value theorem applies not only in respect of the endpoints of the interval $[a, b]$ but also for any two points $x, x + h \in [a, b]$, it is convenient to use the following expression in representing the result:

$$f(x + h) = f(x) + hf'(x) + \tfrac{1}{2}h^2 f''(x + \lambda h),$$

where $x + \lambda h$, with $\lambda \in [0, 1]$, is some point in the interval $[x, x + h]$.

Taylor’s Theorem
The quadratic mean value theorem represents a stepping stone on the way to a general result known as Taylor’s theorem. This theorem indicates that, if the function $f(x)$ is $n$ times differentiable over the interval $[a, b]$, then there exists a polynomial $p(x)$ of degree $n$ which agrees with $f(x)$ at the points $a$ and $b$ and which shares with $f(x)$ its derivatives at the point $a$ up to the $(n - 1)$th. The $n$th derivative of this approximating polynomial, which is its final nonzero derivative, can be expressed in terms of the $n$th derivative of $f(x)$ evaluated at some point $c \in (a, b)$.

Taylor’s Theorem. If $f(x)$ is a function continuous and $n$ times differentiable in an interval $[a, b]$ of length $h = b - a$, then there exists a point $c \in [a, b]$ such that

$$f(b) = f(a) + hf'(a) + \frac{h^2}{2}f''(a) + \cdots + \frac{h^{n-1}}{(n - 1)!}f^{(n-1)}(a) + \frac{h^n}{n!}f^{(n)}(c).$$


Proof. Consider approximating $f(x)$ over the interval $[a, b]$ by the polynomial

$$p(x) = p_0 + p_1(x - a) + \cdots + p_{n-1}(x - a)^{n-1} + p_n(x - a)^n.$$

The polynomial may be constrained to agree with $f(x)$ at the endpoints $a, b$ such that $p(a) = f(a)$ and $p(b) = f(b)$. The first of these conditions gives $p_0 = f(a)$. Also, the derivatives of $p(x)$ up to the $(n - 1)$th may be constrained to agree with those of $f(x)$ at the point $a$:

$$p^{(k)}(a) = k!\,p_k = f^{(k)}(a); \quad k = 1, \ldots, n - 1.$$

These equalities provide the coefficients $p_1, \ldots, p_{n-1}$ of the approximating polynomial.
Now define $\phi(x) = f(x) - p(x)$ which represents the error of approximation. Then $\phi(a) = \phi(b) = 0$; and so, by Rolle’s theorem, there exists a point $c_1 \in (a, b)$ such that $\phi'(c_1) = 0$. But, by construction, we have $\phi'(a) = 0$, since the first derivatives of $f(x)$ and $p(x)$ have been made to agree at the point $a$. Therefore Rolle’s theorem can be applied again to show that there exists a point $c_2 \in (a, c_1)$ such that $\phi''(c_2) = 0$. Proceeding in this way, we can show that there is a sequence of points $c_1, c_2, \ldots, c_{n-1}$, with the ordering

$$a < c_{n-1} < \cdots < c_2 < c_1 < b,$$

such that $\phi'(c_1) = \phi''(c_2) = \cdots = \phi^{(n-1)}(c_{n-1}) = 0$. A final application of Rolle’s theorem shows that there exists a point $c \in (a, c_{n-1})$ such that

$$0 = \phi^{(n)}(c) = f^{(n)}(c) - p^{(n)}(c) = f^{(n)}(c) - n!\,p_n.$$

This establishes the final coefficient $p_n = f^{(n)}(c)/n!$ of the approximating polynomial; and the theorem is proved. ∎
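
As with the quadratic case, the theorem can be verified numerically when the equation for $c$ can be solved explicitly. The sketch below again assumes $f(x) = e^x$, every derivative of which is $e^x$; the function name `taylor_mean_point` and the default interval $[0, 1]$ are hypothetical choices made for the illustration.

```python
import math

def taylor_mean_point(n, a=0.0, b=1.0):
    """For f = exp, solve Taylor's formula of order n on [a, b] for the point c."""
    h = b - a
    # Terms up to the (n-1)th derivative, all evaluated at a.
    partial = sum(h**k / math.factorial(k) * math.exp(a) for k in range(n))
    # The final term must equal h^n/n! * exp(c).
    final_term = math.exp(b) - partial
    return math.log(final_term * math.factorial(n) / h**n)

points = [taylor_mean_point(n) for n in range(1, 6)]
for n, c in enumerate(points, start=1):
    print(f"n = {n}: c = {c:.6f}")
```

With $n = 1$ the calculation reduces to the mean value theorem, and with $n = 2$ it reduces to the quadratic mean value theorem.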

We shall delay using this theorem for a while. Our immediate objective is
to exploit the quadratic theorem.

The Maxima and Minima of Twice-Differentiable Functions

Much of economic theory is concerned with problems of optimisation where it is required to find the maximum or the minimum of a twice-differentiable function. In the following discussion, we shall consider only the minimisation of a function of a single variable. Since a problem of maximisation can be solved by minimising the negative of the function in question, there is no real omission in ignoring problems of maximisation. Later we shall consider the optimisation of functions of several variables, including cases where the variables are subject to constraints, i.e. where the values which can be assigned to the variables are not wholly independent of each other.
We should begin by giving a precise definition of the minimum of a univariate function.

A point $\xi$ is said to be a strict minimum of the function $f(x)$ if $f(\xi) < f(x)$ for all $x \neq \xi$ in a neighbourhood $(\xi - \epsilon, \xi + \epsilon)$ of $\xi$ or, equivalently, if $f(\xi) < f(\xi + h)$ whenever $0 < |h| < \epsilon$ for some small $\epsilon > 0$. The point is said to be a weak minimum of $f(x)$ if $f(\xi) \leq f(x)$ for all $x$ in the neighbourhood.

In effect, the point $\xi$ is a strict minimum if the value of $f$ increases with any small departure from $\xi$, whereas it is a weak minimum if the value fails to decrease. In general, a function may exhibit these properties at a number of points which are described as local minima. If there is a unique point at which the function is lowest, then this is called a global minimum.
It is not possible to demonstrate that an analytic function has a global
minimum without a complete knowledge of its derivatives of all orders. The
conditions which are sufficient for the existence of a local minimum are modest
by comparison.

Conditions for a Minimum. A continuous and twice differentiable function $f(x)$ has a minimum at the point $\xi$ if and only if $f'(\xi) = 0$ and $f''(x) \geq 0$ for all $x$ in a neighbourhood $(\xi - \epsilon, \xi + \epsilon)$ of $\xi$.

Proof. The quadratic mean-value theorem indicates that

$$f(\xi + h) = f(\xi) + hf'(\xi) + \frac{h^2}{2}f''(\xi + \lambda h)$$

for some value $\lambda \in [0, 1]$. Therefore the condition that $f(\xi + h) \geq f(\xi)$ when $|h| < \epsilon$ implies that

$$hf'(\xi) + \frac{h^2}{2}f''(\xi + \lambda h) \geq 0.$$

If $h > 0$, then this implies that

$$f'(\xi) + \frac{h}{2}f''(\xi + \lambda h) \geq 0,$$

and letting $h \to 0+$ shows that $f'(\xi) \geq 0$. On the other hand, if $h < 0$, then dividing by $h$ shows that

$$f'(\xi) + \frac{h}{2}f''(\xi + \lambda h) \leq 0,$$


and letting $h \to 0-$ shows that $f'(\xi) \leq 0$. The two inequalities can be reconciled only if $f'(\xi) = 0$.
Now if $f'(\xi) = 0$, then the inequality $f(\xi + h) \geq f(\xi)$ is satisfied for all $|h| < \epsilon$ if and only if $\tfrac{1}{2}h^2 f''(\xi + \lambda h) \geq 0$, which is if and only if $f''(\xi + \lambda h) \geq 0$. Letting $h \to 0$ establishes that $f''(\xi) \geq 0$ is necessary and sufficient.
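
The conditions can be examined numerically by combining them with Newton’s method, applied now to the equation $f'(x) = 0$. The sketch below assumes the illustrative function $f(x) = e^x - 2x$, which does not appear in the text; the neighbourhood radius `eps` and the grid of test points are likewise arbitrary choices, and a check on a finite grid is suggestive rather than conclusive.

```python
import math

# Illustrative choice: f(x) = exp(x) - 2x, with its first two derivatives.
f = lambda x: math.exp(x) - 2.0 * x
f_prime = lambda x: math.exp(x) - 2.0
f_second = lambda x: math.exp(x)

# Newton's method applied to f'(x) = 0 to locate the stationary point xi.
xi = 0.0
for _ in range(50):
    step = f_prime(xi) / f_second(xi)
    xi -= step
    if abs(step) < 1e-14:
        break

# Check the second-order condition on a grid covering (xi - eps, xi + eps).
eps = 0.1
grid = [xi - eps + 2.0 * eps * k / 20 for k in range(21)]
curvature_ok = all(f_second(x) >= 0.0 for x in grid)
is_lowest = all(f(xi) <= f(x) for x in grid)

print(f"xi = {xi:.10f}, f'(xi) = {f_prime(xi):.2e}")
print(f"f'' >= 0 on grid: {curvature_ok}, f(xi) lowest on grid: {is_lowest}")
```

For this function the stationary point is $\xi = \ln 2$, and the second derivative is positive everywhere, so both conditions are met.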
