$$f'(c-) = \lim_{h \to 0-} \frac{f(c + h) - f(c)}{h} \ge 0,$$
where the symbolism associated with the limit indicates that $h$ tends to 0
from below. Now the assumption that $f(x)$ is differentiable in $(a, b)$ implies that
$f'(c+) = f'(c-)$; and the only way in which this can be reconciled with the
inequalities above is if $f'(c) = 0$. This proves the theorem in part. The rest of
the proof, which concerns the case where there is a lower bound, follows along
the same lines.
MATHEMATICAL THEORY FOR SOCIAL SCIENTISTS
The Mean Value Theorem. If f (x) is continuous in the interval [a, b] and
differentiable in the open interval (a, b), then there exists a point c ∈ (a, b)
such that
$$\frac{f(b) - f(a)}{b - a} = f'(c).$$
Proof. The line passing through the coordinates {a, f (a)} and {b, f (b)} of
the function f (x) has the equation
$$\ell(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a).$$
Define a function $\phi(x) = f(x) - \ell(x)$ which represents the vertical discrepancy
between the line and the function.
Since $\ell(x)$ and $f(x)$ agree at the points $x = a, b$, we have $\phi(a) = \phi(b) = 0$.
Therefore Rolle’s theorem applies to $\phi(x)$ when $x \in (a, b)$; and we may note
in passing that $\phi$ is just the Greek version of $f$. It follows from the theorem
that there exists a point $c \in (a, b)$ at which $\phi'(c) = 0$. That is
$$0 = \phi'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}.$$
When this equation is rearranged, we have the result which was to be proved.
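To make the mean value theorem concrete, the point $c$ can be located numerically. The following Python sketch is an illustration only and is not part of the original argument: the function name `mean_value_point`, the use of bisection and the test function $f(x) = x^3$ on $[0, 2]$ are all assumptions chosen for the example. Bisection requires that $f'(x)$ minus the slope of the chord changes sign between $a$ and $b$, which is true here but need not be true in general.

```python
def mean_value_point(f, fprime, a, b, tol=1e-12):
    """Find c in (a, b) with f'(c) = (f(b) - f(a)) / (b - a) by bisection.

    Hypothetical helper: it assumes g(x) = f'(x) - slope has opposite
    signs at a and b, which is not guaranteed for every function.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return fprime(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change of f'(x) - slope on [a, b]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Keep the half-interval on which g still changes sign.
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


f = lambda x: x ** 3            # example function (an assumption)
fprime = lambda x: 3 * x ** 2   # its derivative

c = mean_value_point(f, fprime, 0.0, 2.0)
print(c)
```

For this example the chord has slope 4, so that $c$ solves $3c^2 = 4$, which gives $c = 2/\sqrt{3}$.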
The mean value theorem can be represented in a way which conforms with
some later results. Consider the equation of the linear function $\ell(x)$ which takes
the same values as the function $f(x)$ at the points $x = a, b$. Setting $x = b$ in
the equation and using $f(b) = \ell(b)$ gives rise to the expression
$$\begin{aligned} f(b) &= f(a) + \frac{f(b) - f(a)}{b - a}(b - a) \\ &= f(a) + f'(c)(b - a), \end{aligned}$$
where $c$ is a value in the interval $(a, b)$, as indicated by the mean value theorem.
Let $h = b - a$. Then $c = a + \lambda h$ for some $\lambda \in (0, 1)$, and the equation above
can be written in the form of
$$f(a + h) = f(a) + h f'(a + \lambda h).$$
In fact, this expression is valid not only for the points a, b but for any x, x + h ∈
[a, b]; and, therefore, it is appropriate to write
$$f(x + h) = f(x) + h f'(x + \lambda h).$$
$$f(x + h) = f(x) + h f'(x) + r,$$
$$0 = f(\xi + h) = f(\xi) + h f'(\xi) + r.$$
When seeking to evaluate this expression, we are likely to know all but the
final term r. On setting this to zero, we obtain a value ξ1 on the LHS of the
equation which is an approximation to the root, which is liable to be a better
approximation than was the original value ξ. To improve the approximation
still further, we may replace ξ on the RHS of the equation by ξ1 so as to obtain
ξ2 on the LHS. The process can be repeated indefinitely in the expectation
of generating ever-improving approximations to the root of the equation. In
effect, we have specified an iterative method for finding the root. The method
is attributable to Newton. The equation of the algorithm by which the $(r+1)$th
approximation is found from the $r$th approximation is
$$\xi_{r+1} = \xi_r - \{f'(\xi_r)\}^{-1} f(\xi_r).$$
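The iteration $\xi_{r+1} = \xi_r - f(\xi_r)/f'(\xi_r)$ translates almost directly into code. The sketch below is a minimal Python illustration under stated assumptions: the name `newton`, the stopping rule based on the size of the step, the iteration cap and the cubic test equation are choices made for the example and do not come from the text. No safeguard is included for the case where $f'(\xi_r) = 0$.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Iterate x_{r+1} = x_r - f(x_r) / f'(x_r) until the step is below tol.

    Hypothetical helper; it will fail with ZeroDivisionError if the
    derivative vanishes at an iterate.
    """
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


# Example equation (an assumption): x^3 - 2x - 5 = 0.
f = lambda x: x ** 3 - 2 * x - 5
fprime = lambda x: 3 * x ** 2 - 2

root = newton(f, fprime, x0=2.0)
print(root, f(root))
```

The stopping rule compares successive approximations, which mirrors the idea of repeating the process until the approximation ceases to improve.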
Example. Consider the matter of finding the square root of the number N > 0,
which is the root of the equation
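$$f(x) = x^2 - N = 0.$$
The derivative of the function is $f'(x) = 2x$; and therefore the algorithm above
becomes
$$\begin{aligned} \xi_{r+1} &= \xi_r - \frac{1}{2\xi_r}\left(\xi_r^2 - N\right) \\ &= \frac{1}{2}\left(\xi_r + \frac{N}{\xi_r}\right). \end{aligned}$$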
The starting value, which may be denoted by $\xi_0$, can be an arbitrary positive
number, and $\xi_0 = N$ is a reasonable choice. The convergence of the sequence
$\{\xi_0, \xi_1, \xi_2, \ldots\}$ to the value of $x = \sqrt{N}$ is remarkably rapid; and, in fact, this
algorithm forms part of the library of numerical routines which is built into the
ROM of a microcomputer. The following sequence has been generated by a
computer which has applied the algorithm to the case where $N = 2$ and $\xi_0 = 2$:
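A sequence of this kind can be regenerated with a few lines of Python. The sketch below is illustrative rather than a reproduction of the original computation: the function name `sqrt_iterates` and the choice of six steps are assumptions, and the printed digits depend on double-precision arithmetic.

```python
def sqrt_iterates(n, x0, steps=6):
    """Return [x0, x1, ..., x_steps] with x_{r+1} = (x_r + n / x_r) / 2."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(0.5 * (x + n / x))
    return xs


# The case N = 2 with starting value 2, as in the text.
for r, x in enumerate(sqrt_iterates(2.0, 2.0)):
    print(r, x)
```

The first step gives $\xi_1 = \tfrac{1}{2}(2 + 1) = 1.5$, and the later iterates settle on $\sqrt{2}$ within a handful of steps.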
the “origin” of the argument at the beginning of the interval [a, b]. Therefore
we shall represent the quadratic by
$$q(x) = q_0 + q_1(x - a) + q_2(x - a)^2.$$
In comparison with the linear function $\ell(x)$, there is an extra parameter which
can be used in pursuit of an improved approximation to $f(x)$ over the interval.
We shall continue to apply the endpoint conditions which are that q(a) = f (a)
and q(b) = f (b). However, there is a variety of ways in which we may use the
additional degree of freedom which is afforded by the extra parameter q2 . The
appropriate choice, for present purposes, is to constrain the derivative of q(x)
to equal that of f (x) at the point x = a. The three conditions which determine
the quadratic parameters are therefore
$$\begin{aligned} f(a) &= q(a) = q_0, \\ f'(a) &= q'(a) = q_1, \\ f(b) &= q(b) = q_0 + q_1 h + q_2 h^2, \end{aligned}$$
where $h = b - a$ and the last condition applies at the point $x = b$. The result which is known as the quadratic mean value
theorem asserts that the parameter $q_2$ may be expressed in terms of the second-order
derivative of the function $f(x)$ evaluated at some point $c \in (a, b)$, which
is denoted by $f''(c)$:
$$f(b) = f(a) + h f'(a) + \frac{h^2}{2} f''(c).$$
Proof. Consider the function $\phi(x) = f(x) - q(x)$ which represents the
discrepancy between the function $f(x)$ and its quadratic approximation $q(x)$.
Since the two functions agree at the points $a, b$, it follows that $\phi(a) = \phi(b) = 0$.
Therefore Rolle’s theorem can be applied to show that there is a point $c_1 \in (a, b)$
such that $\phi'(c_1) = 0$.
Now recall that the parameters of $q(x)$ have been determined to ensure
that the condition $f'(a) = q'(a)$ is fulfilled. This implies that $\phi'(a) = 0$. It
follows that Rolle’s theorem can be applied a second time to the derived function
$\phi'(x)$. Thus it transpires that there is a point $c \in (a, c_1)$ at which $\phi''(c) = 0$.
That is
$$\begin{aligned} 0 = \phi''(c) &= f''(c) - q''(c) \\ &= f''(c) - 2q_2; \end{aligned}$$
and hence the value $q_2 = \tfrac{1}{2} f''(c)$ is attributed to the quadratic parameter
associated with $h^2$. □
Since the quadratic mean value theorem applies not only in respect of the
endpoints of the interval $[a, b]$ but also for any two points $x, x + h \in [a, b]$, it is
convenient to use the following expression in representing the result:
$$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x + \lambda h),$$
where $x + \lambda h$, with $\lambda \in [0, 1]$, is some point in the interval $[x, x + h]$.
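The three conditions $q(a) = f(a)$, $q'(a) = f'(a)$ and $q(b) = f(b)$ determine the quadratic parameters explicitly, and the relation $q_2 = \tfrac{1}{2} f''(c)$ then identifies the intermediate point. The Python sketch below works this through for one case. It is an illustration under assumptions: the helper name `quadratic_coefficients` and the test function $f(x) = x^3$ on $[0, 2]$ are not from the text, and the final step uses the fact that $f''(x) = 6x$ for this particular function.

```python
def quadratic_coefficients(f, fprime, a, b):
    """Solve q(a) = f(a), q'(a) = f'(a), q(b) = f(b) for q0, q1, q2,
    where q(x) = q0 + q1 (x - a) + q2 (x - a)^2.  Hypothetical helper."""
    h = b - a
    q0 = f(a)
    q1 = fprime(a)
    q2 = (f(b) - q0 - q1 * h) / h ** 2
    return q0, q1, q2


f = lambda x: x ** 3            # example function (an assumption)
fprime = lambda x: 3 * x ** 2   # its first derivative

a, b = 0.0, 2.0
q0, q1, q2 = quadratic_coefficients(f, fprime, a, b)

# For f(x) = x^3 we have f''(x) = 6x, so q2 = f''(c) / 2 gives c = q2 / 3.
c = q2 / 3.0
print(q0, q1, q2, c)
```

Here $q_2 = 2$ and $c = 2/3$, which lies inside $(0, 2)$ as the theorem requires.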
Taylor’s Theorem
The quadratic mean value theorem represents a stepping stone on the way
to a general result known as Taylor’s theorem. This theorem indicates that,
if the function $f(x)$ is $n$ times differentiable over the interval $[a, b]$, then there
exists a polynomial $p(x)$ of degree $n$ which agrees with $f(x)$ at the points $a$ and
$b$ and which shares with $f(x)$ its derivatives at the point $a$ up to the $(n-1)$th.
The $n$th derivative of this approximating polynomial, which is its final nonzero
derivative, can be expressed in terms of the $n$th derivative of $f(x)$ evaluated at
some point $c \in (a, b)$.
$$f(b) = f(a) + h f'(a) + \frac{h^2}{2} f''(a) + \cdots + \frac{h^{n-1}}{(n-1)!} f^{(n-1)}(a) + \frac{h^n}{n!} f^{(n)}(c).$$
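The expansion can be checked numerically in a case where the point $c$ can be recovered in closed form. The Python sketch below takes $f(x) = e^x$, for which every derivative is again $e^x$, so that the final term $h^n f^{(n)}(c)/n!$ can be solved for $c$ by taking a logarithm. The function name `taylor_remainder_point_exp` and the interval $[0, 1]$ are assumptions made for the illustration.

```python
import math


def taylor_remainder_point_exp(a, b, n):
    """For f = exp, return the c that satisfies
    f(b) = sum_{k<n} h^k f^(k)(a) / k!  +  h^n f^(n)(c) / n!,  h = b - a.
    Hypothetical helper, valid only for the exponential function."""
    h = b - a
    partial = sum(h ** k * math.exp(a) / math.factorial(k) for k in range(n))
    remainder = math.exp(b) - partial
    # remainder = h^n exp(c) / n!, so solve for c.
    return math.log(remainder * math.factorial(n) / h ** n)


for n in (1, 2, 3, 4):
    print(n, taylor_remainder_point_exp(0.0, 1.0, n))
```

In each case the computed point lies strictly between the endpoints of the interval, in accordance with the statement of the theorem.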
Proof. Consider approximating $f(x)$ over the interval $[a, b]$ by the polynomial
$$p(x) = p_0 + p_1(x - a) + p_2(x - a)^2 + \cdots + p_n(x - a)^n.$$
The polynomial may be constrained to agree with $f(x)$ at the endpoints $a, b$
such that $p(a) = f(a)$ and $p(b) = f(b)$. The first of these conditions gives
$p_0 = f(a)$. Also, the derivatives of $p(x)$ up to the $(n-1)$th may be constrained
to agree with those of $f(x)$ at the point $a$:
$$f^{(k)}(a) = p^{(k)}(a) = k!\,p_k, \qquad k = 1, \ldots, n-1.$$
We shall delay using this theorem for a while. Our immediate objective is
to exploit the quadratic theorem.
of functions of several variables, including cases where the variables are subject
to constraints, i.e. where the values which can be assigned to the variables are
not wholly independent of each other.
We should begin by giving a precise definition of the minimum of a
univariate function.
In effect, the point $\xi$ is a strict minimum if the value of $f$ increases with any
small departure from $\xi$, whereas it is a weak minimum if the value fails to decrease.
In general, a function may exhibit these properties at a number of points which
are described as local minima. If there is a unique point at which the function
is lowest, then this is called a global minimum.
It is not possible to demonstrate that an analytic function has a global
minimum without a complete knowledge of its derivatives of all orders. The
conditions which are sufficient for the existence of a local minimum are modest
by comparison.
$$f(\xi + h) = f(\xi) + h f'(\xi) + \frac{h^2}{2} f''(\xi + \lambda h)$$
for some value $\lambda \in [0, 1]$. Therefore the condition that $f(\xi + h) \ge f(\xi)$ when
$|h| < \epsilon$ implies that
$$h f'(\xi) + \frac{h^2}{2} f''(\xi + \lambda h) \ge 0.$$
If $h > 0$, then this implies that
$$f'(\xi) + \frac{h}{2} f''(\xi + \lambda h) \ge 0,$$
and letting $h \to 0+$ shows that $f'(\xi) \ge 0$. On the other hand, if $h < 0$, then
dividing by $h$ shows that
$$f'(\xi) + \frac{h}{2} f''(\xi + \lambda h) \le 0,$$
and letting $h \to 0-$ shows that $f'(\xi) \le 0$. The two inequalities can be reconciled
only if $f'(\xi) = 0$.
Now if $f'(\xi) = 0$, then the inequality $f(\xi + h) \ge f(\xi)$ is satisfied for all
$|h| < \epsilon$ if and only if $\tfrac{1}{2} h^2 f''(\xi + \lambda h) \ge 0$, which is if and only if $f''(\xi + \lambda h) \ge 0$.
Letting $h \to 0$ establishes that $f''(\xi) \ge 0$ is necessary and sufficient.
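The first-order and second-order conditions can be checked numerically on a simple case. The Python sketch below is an illustration under assumptions: the test function $f(x) = e^x - 2x$ is not from the text, and the stationary point is found by applying Newton's iteration to $f'(x) = 0$. For this function $f''(x) = e^x$ is strictly positive everywhere, so the example does not probe the borderline case $f''(\xi) = 0$.

```python
import math

f = lambda x: math.exp(x) - 2.0 * x   # example function (an assumption)
f1 = lambda x: math.exp(x) - 2.0      # first derivative
f2 = lambda x: math.exp(x)            # second derivative

# Newton's iteration applied to the first-order condition f'(x) = 0.
xi = 0.0
for _ in range(50):
    step = f1(xi) / f2(xi)
    xi -= step
    if abs(step) < 1e-14:
        break

print(xi, f1(xi), f2(xi))

# The value of f should not fall with small departures from xi.
for h in (-1e-2, -1e-3, 1e-3, 1e-2):
    print(h, f(xi + h) - f(xi))
```

The stationary point is $\xi = \ln 2$, where $f'(\xi) = 0$ and $f''(\xi) = 2 > 0$, and each of the printed differences is positive.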