Directional Derivatives
For a function $\varphi : \mathbb{R} \to \mathbb{R}$, the derivative at a point $c$,
\[ \varphi'(c) = \lim_{h \to 0} \frac{\varphi(c + h) - \varphi(c)}{h}, \tag{3.2.1} \]
is the slope of the best affine approximation to $\varphi$ at $c$. We may also regard it as the slope of the graph of $\varphi$ at $(c, \varphi(c))$, or as the instantaneous rate of change of $\varphi(x)$ with respect to $x$ when $x = c$. As a prelude to finding the best affine approximations for a function $f : \mathbb{R}^n \to \mathbb{R}$, we will first discuss how to generalize (3.2.1) to this setting using the ideas of slopes and rates of change for our motivation.
Directional derivatives
Example Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ f(x, y) = 4 - 2x^2 - y^2, \]
the graph of which is pictured in Figure 3.2.1. If we imagine a bug moving along this surface, then the slope of the path encountered by the bug will depend both on the bug's position and the direction in which it is moving. For example, if the bug is above the point $(1, 1)$ in the $xy$-plane, moving in the direction of the vector $v = (-1, -1)$ will cause it to head directly towards the top of the graph, and thus have a steep rate of ascent, whereas moving in the direction of $-v = (1, 1)$ would cause it to descend at a fast rate. These two possibilities are illustrated by the red curve on the surface in Figure 3.2.1. For another example, heading around the surface above the ellipse
\[ 2x^2 + y^2 = 3 \]
in the $xy$-plane, which from $(1, 1)$ means heading initially in the direction of the vector $w = (-1, 2)$, would lead the bug around the side of the hill with no change in elevation, and hence a slope of 0. This possibility is illustrated by the green curve on the surface in Figure 3.2.1. Thus in order to talk about the slope of the graph of $f$ at a point, we must
specify a direction as well. For example, suppose the bug moves in the direction of $v$. If we let
\[ u = -\frac{1}{\sqrt{2}}(1, 1), \]
the direction of $v$, then, letting $c = (1, 1)$,
\[ \frac{f(c + hu) - f(c)}{h} \]
would, for any $h > 0$, represent an approximation to the slope of the graph of $f$ at $(1, 1)$ in the direction of $u$. As in single-variable calculus, we should expect that taking the limit as $h$ approaches 0 should give us the exact slope at $(1, 1)$ in the direction of $u$.

[Figure 3.2.1: Graph of $f(x, y) = 4 - 2x^2 - y^2$]

Copyright © 2001 by Dan Sloughter. Section 3.2, Directional Derivatives and the Gradient.

Now
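\[
\begin{aligned}
f(c + hu) - f(c) &= f\left(1 - \frac{h}{\sqrt{2}},\, 1 - \frac{h}{\sqrt{2}}\right) - f(1, 1) \\
&= 4 - 2\left(1 - \frac{h}{\sqrt{2}}\right)^2 - \left(1 - \frac{h}{\sqrt{2}}\right)^2 - 1 \\
&= 3 - 3\left(1 - \sqrt{2}\,h + \frac{h^2}{2}\right) \\
&= 3\sqrt{2}\,h - \frac{3h^2}{2} \\
&= h\left(3\sqrt{2} - \frac{3h}{2}\right),
\end{aligned}
\]
so
\[ \lim_{h \to 0} \frac{f(c + hu) - f(c)}{h} = \lim_{h \to 0} \left(3\sqrt{2} - \frac{3h}{2}\right) = 3\sqrt{2}. \]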
Hence the graph of $f$ has a slope of $3\sqrt{2}$ at $(1, 1)$ in the direction of $u$; a similar computation shows that the slope in the direction of
\[ \frac{w}{\|w\|} = \frac{1}{\sqrt{5}}(-1, 2) \]
is 0.
Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open ball about a point $c$. Given a unit vector $u$, we call
\[ D_u f(c) = \lim_{h \to 0} \frac{f(c + hu) - f(c)}{h}, \tag{3.2.2} \]
provided the limit exists, the directional derivative of $f$ in the direction of $u$ at $c$.
Example From our work above, if $f(x, y) = 4 - 2x^2 - y^2$ and
\[ u = -\frac{1}{\sqrt{2}}(1, 1), \]
then $D_u f(1, 1) = 3\sqrt{2}$.
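The limit in (3.2.2) can also be watched numerically. The following Python sketch evaluates the difference quotient for $f(x, y) = 4 - 2x^2 - y^2$ at $c = (1, 1)$ in the direction $u = -\frac{1}{\sqrt{2}}(1, 1)$ for a few shrinking values of $h$. The helper name `difference_quotient` and the particular step sizes are illustrative assumptions, not part of the text.

```python
import math

def f(x, y):
    # The surface from the example: f(x, y) = 4 - 2x^2 - y^2.
    return 4 - 2 * x**2 - y**2

def difference_quotient(func, c, u, h):
    # (f(c + h u) - f(c)) / h for a point c and a unit vector u in the plane.
    return (func(c[0] + h * u[0], c[1] + h * u[1]) - func(c[0], c[1])) / h

c = (1.0, 1.0)
u = (-1 / math.sqrt(2), -1 / math.sqrt(2))  # unit vector in the direction of v = (-1, -1)

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, difference_quotient(f, c, u, h))

print("3*sqrt(2) =", 3 * math.sqrt(2))
```

The printed quotients should approach $3\sqrt{2} \approx 4.2426$ as $h$ shrinks, in line with the exact expression $3\sqrt{2} - \frac{3h}{2}$ for the quotient.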
Directional derivatives in the direction of the standard basis vectors will be of special
importance.
Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open ball about a point $c$. If we consider $f$ as a function of $x = (x_1, x_2, \ldots, x_n)$ and let $e_k$ be the $k$th standard basis vector, $k = 1, 2, \ldots, n$, then we call $D_{e_k} f(c)$, if it exists, the partial derivative of $f$ with respect to $x_k$ at $c$.
Notations for the partial derivative of $f$ with respect to $x_k$ at an arbitrary point $x = (x_1, x_2, \ldots, x_n)$ include $D_{x_k} f(x_1, x_2, \ldots, x_n)$, $f_{x_k}(x_1, x_2, \ldots, x_n)$, and $\frac{\partial}{\partial x_k} f(x_1, x_2, \ldots, x_n)$.
Now suppose $f : \mathbb{R}^n \to \mathbb{R}$ and, for fixed $x = (x_1, x_2, \ldots, x_n)$, define $g : \mathbb{R} \to \mathbb{R}$ by
\[ g(t) = f(t, x_2, \ldots, x_n). \]
Then
\[
\begin{aligned}
f_{x_1}(x_1, x_2, \ldots, x_n) &= \lim_{h \to 0} \frac{f((x_1, x_2, \ldots, x_n) + he_1) - f(x_1, x_2, \ldots, x_n)}{h} \\
&= \lim_{h \to 0} \frac{f((x_1, x_2, \ldots, x_n) + (h, 0, \ldots, 0)) - f(x_1, x_2, \ldots, x_n)}{h} \\
&= \lim_{h \to 0} \frac{f(x_1 + h, x_2, \ldots, x_n) - f(x_1, x_2, \ldots, x_n)}{h} \\
&= \lim_{h \to 0} \frac{g(x_1 + h) - g(x_1)}{h} \\
&= g'(x_1).
\end{aligned}
\tag{3.2.3}
\]
In other words, we may compute the partial derivative $f_{x_1}(x_1, x_2, \ldots, x_n)$ by treating $x_2, x_3, \ldots, x_n$ as constants and differentiating with respect to $x_1$ as we would in single-variable calculus. The same statement holds for any coordinate: To find the partial derivative with respect to $x_k$, treat the other coordinates as constants and differentiate as if the function depended only on $x_k$.
Example If $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by
\[ f(x, y) = 3x^2 - 4xy^2, \]
then, treating $y$ as a constant and differentiating with respect to $x$,
\[ f_x(x, y) = 6x - 4y^2 \]
and, treating $x$ as a constant and differentiating with respect to $y$,
\[ f_y(x, y) = -8xy. \]
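As a quick sanity check on these formulas, the sketch below approximates each partial derivative of $f(x, y) = 3x^2 - 4xy^2$ by a central difference in one coordinate while the other is held fixed, and compares the result with $6x - 4y^2$ and $-8xy$. The helper `partial`, the step size, and the sample point $(2, 1)$ are assumptions made for illustration.

```python
def f(x, y):
    # f(x, y) = 3x^2 - 4xy^2 from the example.
    return 3 * x**2 - 4 * x * y**2

def partial(func, point, k, h=1e-6):
    # Central difference in the k-th coordinate, all other coordinates held fixed.
    forward = list(point)
    backward = list(point)
    forward[k] += h
    backward[k] -= h
    return (func(*forward) - func(*backward)) / (2 * h)

x, y = 2.0, 1.0
print(partial(f, (x, y), 0), 6 * x - 4 * y**2)   # numerical vs. exact f_x
print(partial(f, (x, y), 1), -8 * x * y)         # numerical vs. exact f_y
```

Each printed pair should agree to several decimal places; the approximation is only as good as the step size allows.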
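Example If $f : \mathbb{R}^4 \to \mathbb{R}$ is defined by
\[ f(w, x, y, z) = \log(w^2 + x^2 + y^2 + z^2), \]
then
\[ \frac{\partial}{\partial w} f(w, x, y, z) = \frac{2w}{w^2 + x^2 + y^2 + z^2}, \]
\[ \frac{\partial}{\partial x} f(w, x, y, z) = \frac{2x}{w^2 + x^2 + y^2 + z^2}, \]
\[ \frac{\partial}{\partial y} f(w, x, y, z) = \frac{2y}{w^2 + x^2 + y^2 + z^2}, \]
and
\[ \frac{\partial}{\partial z} f(w, x, y, z) = \frac{2z}{w^2 + x^2 + y^2 + z^2}. \]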
Example Suppose $g : \mathbb{R}^2 \to \mathbb{R}$ is defined by
\[
g(x, y) =
\begin{cases}
\dfrac{xy}{x^2 + y^2}, & \text{if } (x, y) \neq (0, 0), \\
0, & \text{if } (x, y) = (0, 0).
\end{cases}
\]
We saw in Section 3.1 that $\lim_{(x, y) \to (0, 0)} g(x, y)$ does not exist; in particular, $g$ is not continuous at $(0, 0)$. However,
\[ \frac{\partial}{\partial x} g(0, 0) = \lim_{h \to 0} \frac{g((0, 0) + h(1, 0)) - g(0, 0)}{h} = \lim_{h \to 0} \frac{g(h, 0)}{h} = \lim_{h \to 0} \frac{0}{h} = 0 \]
and
\[ \frac{\partial}{\partial y} g(0, 0) = \lim_{h \to 0} \frac{g((0, 0) + h(0, 1)) - g(0, 0)}{h} = \lim_{h \to 0} \frac{g(0, h)}{h} = \lim_{h \to 0} \frac{0}{h} = 0. \]
This shows that it is possible for a function to have partial derivatives at a point without being continuous at that point. However, we shall see in Section 3.3 that this function is not differentiable at $(0, 0)$; that is, $g$ does not have a best affine approximation at $(0, 0)$.
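The contrast between the two behaviors is easy to see numerically. The sketch below, written under the assumption that plain floating-point evaluation is adequate here, prints the difference quotients of $g$ along the two axes at the origin together with the value of $g$ on the diagonal $y = x$; the step sizes are arbitrary illustrative choices.

```python
def g(x, y):
    # g(x, y) = xy / (x^2 + y^2) away from the origin, and 0 at the origin.
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

for h in (1e-1, 1e-3, 1e-6):
    quotient_x = (g(h, 0.0) - g(0.0, 0.0)) / h   # difference quotient along the x-axis
    quotient_y = (g(0.0, h) - g(0.0, 0.0)) / h   # difference quotient along the y-axis
    on_diagonal = g(h, h)                        # value of g at (h, h)
    print(h, quotient_x, quotient_y, on_diagonal)
```

Both quotients are 0 for every $h$, so both partial derivatives at the origin are 0, while $g(h, h)$ stays at $\frac{1}{2}$ no matter how small $h$ is, which is why $g$ cannot be continuous at $(0, 0)$.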
The gradient
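Definition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is defined on an open ball containing the point $c$ and $\frac{\partial}{\partial x_k} f(c)$ exists for $k = 1, 2, \ldots, n$. We call the vector
\[ \nabla f(c) = \left( \frac{\partial}{\partial x_1} f(c), \frac{\partial}{\partial x_2} f(c), \ldots, \frac{\partial}{\partial x_n} f(c) \right) \tag{3.2.4} \]
the gradient of $f$ at $c$.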
Example If $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by
\[ f(x, y) = 3x^2 - 4xy^2, \]
then
\[ \nabla f(x, y) = (6x - 4y^2, -8xy). \]
Thus, for example, $\nabla f(2, 1) = (8, -16)$.
Example If $f : \mathbb{R}^4 \to \mathbb{R}$ is defined by
\[ f(w, x, y, z) = \log(w^2 + x^2 + y^2 + z^2), \]
then
\[ \nabla f(w, x, y, z) = \frac{2}{w^2 + x^2 + y^2 + z^2}(w, x, y, z). \]
Thus, for example,
\[ \nabla f(1, 2, 2, 1) = \frac{1}{5}(1, 2, 2, 1). \]
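A gradient is just the list of partial derivatives, so it can be approximated one coordinate at a time. The following sketch does this with central differences for the logarithm example and compares the result at $(1, 2, 2, 1)$ with $\frac{1}{5}(1, 2, 2, 1)$. The function name `gradient` and its default step size are hypothetical choices for this illustration.

```python
import math

def gradient(func, point, h=1e-6):
    # Approximate each partial derivative by a central difference.
    grad = []
    for k in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[k] += h
        backward[k] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

def f(w, x, y, z):
    # f(w, x, y, z) = log(w^2 + x^2 + y^2 + z^2), natural logarithm.
    return math.log(w**2 + x**2 + y**2 + z**2)

approx = gradient(f, (1.0, 2.0, 2.0, 1.0))
exact = [1 / 5, 2 / 5, 2 / 5, 1 / 5]
print(approx)
print(exact)
```

The two printed lists should match closely, which is a useful check when a gradient has been computed by hand.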
Notice that if $f : \mathbb{R}^n \to \mathbb{R}$, then $\nabla f : \mathbb{R}^n \to \mathbb{R}^n$; that is, we may view the gradient as a function which takes an $n$-dimensional vector for input and returns another $n$-dimensional vector. We call a function of this type a vector field.
Definition We say a function $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open set $U$ if $f$ is continuous on $U$ and, for $k = 1, 2, \ldots, n$, $\frac{\partial f}{\partial x_k}$ is continuous on $U$.
Now suppose $f : \mathbb{R}^2 \to \mathbb{R}$ is $C^1$ on some open ball containing the point $c = (c_1, c_2)$. Let $u = (u_1, u_2)$ be a unit vector and suppose we wish to compute the directional derivative $D_u f(c)$. From the definition, we have
\[
\begin{aligned}
D_u f(c) &= \lim_{h \to 0} \frac{f(c + hu) - f(c)}{h} \\
&= \lim_{h \to 0} \frac{f(c_1 + hu_1, c_2 + hu_2) - f(c_1, c_2)}{h} \\
&= \lim_{h \to 0} \frac{f(c_1 + hu_1, c_2 + hu_2) - f(c_1 + hu_1, c_2) + f(c_1 + hu_1, c_2) - f(c_1, c_2)}{h} \\
&= \lim_{h \to 0} \left( \frac{f(c_1 + hu_1, c_2 + hu_2) - f(c_1 + hu_1, c_2)}{h} + \frac{f(c_1 + hu_1, c_2) - f(c_1, c_2)}{h} \right).
\end{aligned}
\]
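For a fixed value of $h \neq 0$, define $\varphi : \mathbb{R} \to \mathbb{R}$ by
\[ \varphi(t) = f(c_1 + hu_1, c_2 + t). \tag{3.2.5} \]
Note that $\varphi$ is differentiable with
\[
\begin{aligned}
\varphi'(t) &= \lim_{s \to 0} \frac{\varphi(t + s) - \varphi(t)}{s} \\
&= \lim_{s \to 0} \frac{f(c_1 + hu_1, c_2 + t + s) - f(c_1 + hu_1, c_2 + t)}{s} \\
&= \frac{\partial}{\partial y} f(c_1 + hu_1, c_2 + t).
\end{aligned}
\tag{3.2.6}
\]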
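Hence if we define $\alpha : \mathbb{R} \to \mathbb{R}$ by
\[ \alpha(t) = \varphi(u_2 t) = f(c_1 + hu_1, c_2 + tu_2), \tag{3.2.7} \]
then $\alpha$ is differentiable with
\[ \alpha'(t) = u_2 \varphi'(u_2 t) = u_2 \frac{\partial}{\partial y} f(c_1 + hu_1, c_2 + tu_2). \tag{3.2.8} \]
By the Mean Value Theorem from single-variable calculus, there exists a number $a$ between 0 and $h$ such that
\[ \frac{\alpha(h) - \alpha(0)}{h} = \alpha'(a). \tag{3.2.9} \]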
Putting (3.2.7) and (3.2.8) into (3.2.9), we have
\[ \frac{f(c_1 + hu_1, c_2 + hu_2) - f(c_1 + hu_1, c_2)}{h} = u_2 \frac{\partial}{\partial y} f(c_1 + hu_1, c_2 + au_2). \tag{3.2.10} \]
Similarly, if we define $\beta : \mathbb{R} \to \mathbb{R}$ by
\[ \beta(t) = f(c_1 + tu_1, c_2), \tag{3.2.11} \]
then $\beta$ is differentiable,
\[ \beta'(t) = u_1 \frac{\partial}{\partial x} f(c_1 + tu_1, c_2), \tag{3.2.12} \]
and, using the Mean Value Theorem again, there exists a number $b$ between 0 and $h$ such that
\[ \frac{f(c_1 + hu_1, c_2) - f(c_1, c_2)}{h} = \frac{\beta(h) - \beta(0)}{h} = \beta'(b) = u_1 \frac{\partial}{\partial x} f(c_1 + bu_1, c_2). \tag{3.2.13} \]
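Putting (3.2.10) and (3.2.13) into our expression for $D_u f(c)$ above, we have
\[ D_u f(c) = \lim_{h \to 0} \left( u_2 \frac{\partial}{\partial y} f(c_1 + hu_1, c_2 + au_2) + u_1 \frac{\partial}{\partial x} f(c_1 + bu_1, c_2) \right). \tag{3.2.14} \]
Now both $a$ and $b$ approach 0 as $h$ approaches 0 and both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are assumed to be continuous, so evaluating the limit in (3.2.14) gives us
\[ D_u f(c) = u_2 \frac{\partial}{\partial y} f(c_1, c_2) + u_1 \frac{\partial}{\partial x} f(c_1, c_2) = \nabla f(c) \cdot u. \tag{3.2.15} \]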
A straightforward generalization of (3.2.15) to the case of a function $f : \mathbb{R}^n \to \mathbb{R}$ gives us the following theorem.
Theorem Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open ball containing the point $c$. Then for any unit vector $u$, $D_u f(c)$ exists and
\[ D_u f(c) = \nabla f(c) \cdot u. \tag{3.2.16} \]
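Example If $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by
\[ f(x, y) = 4 - 2x^2 - y^2, \]
then
\[ \nabla f(x, y) = (-4x, -2y). \]
If
\[ u = -\frac{1}{\sqrt{2}}(1, 1), \]
then
\[ D_u f(1, 1) = \nabla f(1, 1) \cdot u = (-4, -2) \cdot \left( -\frac{1}{\sqrt{2}}(1, 1) \right) = \frac{6}{\sqrt{2}} = 3\sqrt{2}, \]
as we saw in the first example of this section. Note also that
\[ D_{-u} f(1, 1) = \nabla f(1, 1) \cdot (-u) = (-4, -2) \cdot \left( \frac{1}{\sqrt{2}}(1, 1) \right) = -\frac{6}{\sqrt{2}} = -3\sqrt{2} \]
and, if
\[ w = \frac{1}{\sqrt{5}}(-1, 2), \]
\[ D_w f(1, 1) = \nabla f(1, 1) \cdot w = (-4, -2) \cdot \left( \frac{1}{\sqrt{5}}(-1, 2) \right) = 0, \]
as claimed earlier.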
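The three directional derivatives in this example reduce to dot products, which the short sketch below evaluates directly from (3.2.16). The gradient is entered by hand as $(-4x, -2y)$; the helper names `dot` and `grad_f` are assumptions made for the illustration.

```python
import math

def dot(a, b):
    # Ordinary dot product of two vectors of equal length.
    return sum(p * q for p, q in zip(a, b))

def grad_f(x, y):
    # Gradient of f(x, y) = 4 - 2x^2 - y^2, computed by hand.
    return (-4 * x, -2 * y)

u = (-1 / math.sqrt(2), -1 / math.sqrt(2))
minus_u = (1 / math.sqrt(2), 1 / math.sqrt(2))
w = (-1 / math.sqrt(5), 2 / math.sqrt(5))

g = grad_f(1.0, 1.0)
print(dot(g, u), 3 * math.sqrt(2))         # slope toward the top of the hill
print(dot(g, minus_u), -3 * math.sqrt(2))  # slope heading directly away
print(dot(g, w))                           # slope along the level curve
```

The first two values should agree with $\pm 3\sqrt{2}$ up to rounding, and the third should be 0 up to rounding, matching the slopes found from the definition.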
Example Suppose the temperature at a point in a metal cube is given by
\[ T(x, y, z) = 80 - 20x e^{-\frac{1}{20}(x^2 + y^2 + z^2)}, \]
where the center of the cube is taken to be at $(0, 0, 0)$. Then we have
\[ \frac{\partial}{\partial x} T(x, y, z) = 2x^2 e^{-\frac{1}{20}(x^2 + y^2 + z^2)} - 20 e^{-\frac{1}{20}(x^2 + y^2 + z^2)}, \]
\[ \frac{\partial}{\partial y} T(x, y, z) = 2xy e^{-\frac{1}{20}(x^2 + y^2 + z^2)}, \]
and
\[ \frac{\partial}{\partial z} T(x, y, z) = 2xz e^{-\frac{1}{20}(x^2 + y^2 + z^2)}, \]
so
\[ \nabla T(x, y, z) = e^{-\frac{1}{20}(x^2 + y^2 + z^2)} (2x^2 - 20, 2xy, 2xz). \]
Hence, for example, the rate of change of temperature at the origin in the direction of the unit vector
\[ u = \frac{1}{\sqrt{3}}(1, 1, 1) \]
is
\[ D_u T(0, 0, 0) = \nabla T(0, 0, 0) \cdot u = (-20, 0, 0) \cdot \left( \frac{1}{\sqrt{3}}(1, 1, 1) \right) = -\frac{20}{\sqrt{3}}. \]
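The same computation can be checked without differentiating by hand. The sketch below approximates $\nabla T(0, 0, 0)$ by central differences and takes the dot product with $u = \frac{1}{\sqrt{3}}(1, 1, 1)$. It assumes the temperature is $T(x, y, z) = 80 - 20x e^{-\frac{1}{20}(x^2 + y^2 + z^2)}$ and uses a hypothetical helper named `gradient`.

```python
import math

def T(x, y, z):
    # Assumed temperature: 80 - 20 x exp(-(x^2 + y^2 + z^2) / 20).
    return 80 - 20 * x * math.exp(-(x**2 + y**2 + z**2) / 20)

def gradient(func, point, h=1e-6):
    # Central-difference approximation of each partial derivative.
    grad = []
    for k in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[k] += h
        backward[k] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

u = (1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3))
grad_T = gradient(T, (0.0, 0.0, 0.0))
rate = sum(g * ui for g, ui in zip(grad_T, u))

print(grad_T)                      # should be close to (-20, 0, 0)
print(rate, -20 / math.sqrt(3))    # numerical vs. exact directional derivative
```

A negative rate means the temperature falls as one leaves the origin in the direction of $u$.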
An application of the Cauchy-Schwarz inequality to (3.2.16) shows us that
\[ |D_u f(c)| = |\nabla f(c) \cdot u| \le \|\nabla f(c)\| \|u\| = \|\nabla f(c)\|. \tag{3.2.17} \]
Thus the magnitude of the rate of change of $f$ in any direction at a given point never exceeds the length of the gradient vector at that point. Moreover, in our discussion of the Cauchy-Schwarz inequality we saw that we have equality in (3.2.17) if and only if $u$ is parallel to $\nabla f(c)$. Indeed, supposing $\nabla f(c) \neq 0$, when
\[ u = \frac{\nabla f(c)}{\|\nabla f(c)\|}, \]
we have
\[ D_u f(c) = \nabla f(c) \cdot u = \frac{\nabla f(c) \cdot \nabla f(c)}{\|\nabla f(c)\|} = \frac{\|\nabla f(c)\|^2}{\|\nabla f(c)\|} = \|\nabla f(c)\| \tag{3.2.18} \]
and
\[ D_{-u} f(c) = -\|\nabla f(c)\|. \tag{3.2.19} \]
Hence we have the following result.
Proposition Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open ball containing the point $c$. Then $D_u f(c)$ has a maximum value of $\|\nabla f(c)\|$ when $u$ is the direction of $\nabla f(c)$ and a minimum value of $-\|\nabla f(c)\|$ when $u$ is the direction of $-\nabla f(c)$.
In other words, the gradient vector points in the direction of the maximum rate of
increase of the function and the negative of the gradient vector points in the direction of
the maximum rate of decrease of the function. Moreover, the length of the gradient vector
tells us the rate of increase in the direction of maximum increase and its negative tells us
the rate of decrease in the direction of maximum decrease.
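One way to make this concrete is a brute-force search over directions. The sketch below takes the gradient $(-4, -2)$ of $f(x, y) = 4 - 2x^2 - y^2$ at $(1, 1)$, evaluates $\nabla f \cdot u$ for 3600 evenly spaced unit vectors $u = (\cos t, \sin t)$, and reports the best one. The grid resolution is an arbitrary assumption, so the search only locates the maximizing direction to within about a tenth of a degree.

```python
import math

grad = (-4.0, -2.0)          # gradient of f(x, y) = 4 - 2x^2 - y^2 at (1, 1)
norm = math.hypot(*grad)     # length of the gradient, sqrt(20)

# Evaluate the directional derivative grad . u over a grid of unit vectors.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_value, best_angle = max(
    (grad[0] * math.cos(t) + grad[1] * math.sin(t), t) for t in angles
)
best_u = (math.cos(best_angle), math.sin(best_angle))

print(best_value, norm)                              # largest rate found vs. gradient length
print(best_u, (grad[0] / norm, grad[1] / norm))      # best direction vs. normalized gradient
```

The largest rate found should be close to $\sqrt{20}$ and the best direction close to the normalized gradient, as the proposition predicts.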
Example As we saw above, if $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by
\[ f(x, y) = 4 - 2x^2 - y^2, \]
then
\[ \nabla f(x, y) = (-4x, -2y). \]
Thus $\nabla f(1, 1) = (-4, -2)$. Hence if a bug standing above $(1, 1)$ on the graph of $f$ wants to head in the direction of most rapid ascent, it should move in the direction of the unit vector
\[ u = \frac{\nabla f(1, 1)}{\|\nabla f(1, 1)\|} = -\frac{1}{\sqrt{5}}(2, 1). \]
If the bug wants to head in the direction of most rapid descent, it should move in the direction of the unit vector
\[ -u = \frac{1}{\sqrt{5}}(2, 1). \]
Moreover,
\[ D_u f(1, 1) = \|\nabla f(1, 1)\| = \sqrt{20} \]
and
\[ D_{-u} f(1, 1) = -\|\nabla f(1, 1)\| = -\sqrt{20}. \]
Figure 3.2.2 shows scaled values of $\nabla f(x, y)$ plotted for a grid of points $(x, y)$. The vectors are scaled so that they fit in the plot, without overlap, yet still show their relative magnitudes. This is another good geometric way to view the behavior of the function. Supposing our bug were placed on the side of the graph above $(1, 1)$ and that it headed up the hill in such a manner that it always chose the direction of steepest ascent, we can see that it would head more quickly toward the $y$-axis than toward the $x$-axis. More explicitly, if $C$ is the shadow of the path of the bug in the $xy$-plane, then the slope of $C$ at any point $(x, y)$ would be
\[ \frac{dy}{dx} = \frac{-2y}{-4x} = \frac{y}{2x}. \]
Hence
\[ \frac{1}{y} \frac{dy}{dx} = \frac{1}{2x}. \]
If we integrate both sides of this equality, we have
\[ \int \frac{1}{y} \frac{dy}{dx} \, dx = \int \frac{1}{2x} \, dx. \]
Thus
\[ \log |y| = \frac{1}{2} \log |x| + c \]
for some constant $c$, from which we have
\[ e^{\log |y|} = e^{\frac{1}{2} \log |x| + c}. \]
It follows that
\[ y = k\sqrt{|x|}, \]
where $k = \pm e^c$. Since $y = 1$ when $x = 1$, $k = 1$ and we see that $C$ is the graph of $y = \sqrt{x}$.

[Figure 3.2.2: Scaled gradient vectors for $f(x, y) = 4 - 2x^2 - y^2$]
Figure 3.2.2 shows C along with the plot of the gradient vectors of f, while Figure 3.2.3
shows the actual path of the bug on the graph of f.
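The curve $y = \sqrt{x}$ can be cross-checked by simulating the bug. The sketch below takes many small Euler steps from $(1, 1)$, each in the direction of the gradient $(-4x, -2y)$, and records how far the simulated path strays from $y = \sqrt{x}$. Stepping along the unnormalized gradient traces the same curve as stepping along the unit vector, only at a different speed. The step size and number of steps are arbitrary assumptions.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = 4 - 2x^2 - y^2.
    return (-4 * x, -2 * y)

x, y = 1.0, 1.0      # shadow of the bug's starting point in the xy-plane
step = 1e-4          # small step so that Euler's method tracks the true curve
max_gap = 0.0        # largest observed distance |y - sqrt(x)| along the path

for _ in range(5000):
    gx, gy = grad_f(x, y)
    x += step * gx
    y += step * gy
    max_gap = max(max_gap, abs(y - math.sqrt(x)))

print(x, y, max_gap)
```

The final point lies well up the hill toward the origin, and the recorded gap stays very small, consistent with the path's shadow being the graph of $y = \sqrt{x}$.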
[Figure 3.2.3: Graph of $f(x, y) = 4 - 2x^2 - y^2$ with path of most rapid ascent from $(1, 1, 1)$]
Example For a two-dimensional version of the temperature example discussed above, consider a metal plate heated so that its temperature at $(x, y)$ is given by
\[ T(x, y) = 80 - 20x e^{-\frac{1}{20}(x^2 + y^2)}. \]
Then
\[ \nabla T(x, y) = e^{-\frac{1}{20}(x^2 + y^2)} (2x^2 - 20, 2xy), \]
so, for example,
\[ \nabla T(0, 0) = (-20, 0). \]
Thus at the origin the temperature is increasing most rapidly in the direction of $u = (-1, 0)$ and decreasing most rapidly in the direction of $(1, 0)$. Moreover,
\[ D_u T(0, 0) = \|\nabla T(0, 0)\| = 20 \]
and
\[ D_{-u} T(0, 0) = -\|\nabla T(0, 0)\| = -20. \]
Note that
\[ D_{-u} T(0, 0) = \frac{\partial}{\partial x} T(0, 0) \]
and
\[ D_u T(0, 0) = -\frac{\partial}{\partial x} T(0, 0). \]
[Figure 3.2.4: Scaled gradient vectors for $T(x, y) = 80 - 20x e^{-\frac{1}{20}(x^2 + y^2)}$]
Figure 3.2.4 is a plot of scaled gradient vectors for this temperature function. From the plot it is easy to see which direction a bug placed on this metal plate would have to choose in order to warm up as rapidly as possible. It should also be clear that the temperature has a relative maximum around $(-3, 0)$ and a relative minimum around $(3, 0)$; these points are, in fact, exactly $(-\sqrt{10}, 0)$ and $(\sqrt{10}, 0)$.

1. […] $\frac{1}{\sqrt{5}}(1, 2)$. Find $D_u f(3, 1)$ directly from the definition (3.2.2).
2. For each of the following functions, find the partial derivatives with respect to each variable.
(a) $f(x, y) = \dfrac{4x}{x^2 + y^2}$
(b) $g(x, y) = 4xy^2 e^{y^2}$
(c) $f(x, y, z) = 3x^2 y^3 z^4 - 13x^2 y$
(d) $h(x, y, z) = 4xz e^{\frac{1}{x^2 + y^2 + z^2}}$
(e) $g(w, x, y, z) = \sin\left(\sqrt{w^2 + x^2 + 2y^2 + 3z^2}\right)$
3. Find the gradient of each of the following functions.
(a) $f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$
(b) $g(x, y, z) = \dfrac{1}{x^2 + y^2 + z^2}$
(c) $f(w, x, y, z) = \tan^{-1}(4w + 3x + 5y + z)$
4. Find $D_u f(c)$ for each of the following.
(a) $f(x, y) = 3x^2 + 5y^2$, $u = \frac{1}{\sqrt{13}}(3, 2)$, $c = (2, 1)$
(b) $f(x, y) = x^2 - 2y^2$, $u = \frac{1}{\sqrt{5}}(1, 2)$, $c = (2, 3)$
(c) $f(x, y, z) = \dfrac{1}{x^2 + y^2 + z^2}$, $u = \frac{1}{\sqrt{6}}(1, 2, 1)$, $c = (2, 2, 1)$
5. For each of the following, find the directional derivative of $f$ at the point $c$ in the direction of the specified vector $w$.
(a) $f(x, y) = 3x^2 y$, $w = (2, 3)$, $c = (2, 1)$
(b) $f(x, y, z) = \log(x^2 + 2y^2 + z^2)$, $w = (1, 2, 3)$, $c = (2, 1, 1)$
(c) $f(t, x, y, z) = tx^2 yz^2$, $w = (1, 1, 2, 3)$, $c = (2, 1, 1, 2)$
6. A metal plate is heated so that its temperature at a point $(x, y)$ is
\[ T(x, y) = 50y^2 e^{-\frac{1}{5}(x^2 + y^2)}. \]
A bug is placed at the point $(2, 1)$.
(a) The bug heads toward the point $(1, 2)$. What is the rate of change of temperature in this direction?
(b) In what direction should the bug head in order to warm up at the fastest rate? What is the rate of change of temperature in this direction?
(c) In what direction should the bug head in order to cool off at the fastest rate? What is the rate of change of temperature in this direction?
(d) Make a plot of the gradient vectors and discuss what it tells you about the temperatures on the plate.
7. A heat-seeking bug is a bug that always moves in the direction of the greatest increase in heat. Discuss the behavior of a heat-seeking bug placed on a metal plate heated so that the temperature at $(x, y)$ is given by
\[ T(x, y) = 100 - 40xy e^{-\frac{1}{10}(x^2 + y^2)}. \]
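8. Suppose $g : \mathbb{R}^2 \to \mathbb{R}$ is defined by
\[
g(x, y) =
\begin{cases}
\dfrac{xy}{x^2 + y^2}, & \text{if } (x, y) \neq (0, 0), \\
0, & \text{if } (x, y) = (0, 0).
\end{cases}
\]
We saw above that both partial derivatives of $g$ exist at $(0, 0)$, although $g$ is not continuous at $(0, 0)$.
(a) Show that neither $\frac{\partial g}{\partial x}$ nor $\frac{\partial g}{\partial y}$ is continuous at $(0, 0)$.
(b) Let
\[ u = \frac{1}{\sqrt{2}}(1, 1). \]
Show that $D_u g(0, 0)$ does not exist. In particular, $D_u g(0, 0) \neq \nabla g(0, 0) \cdot u$.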
9. Suppose the price of a certain commodity, call it commodity A, is $x$ dollars per unit and the price of another commodity, B, is $y$ dollars per unit. Moreover, suppose that $d_A(x, y)$ represents the number of units of A that will be sold at these prices and $d_B(x, y)$ represents the number of units of B that will be sold at these prices. These functions are known as the demand functions for A and B.
(a) Explain why it is reasonable to assume that
\[ \frac{\partial}{\partial x} d_A(x, y) < 0 \]
and
\[ \frac{\partial}{\partial y} d_B(x, y) < 0 \]
for all $(x, y)$.
(b) Suppose the two commodities are competitive. For example, they might be two different brands of the same product. In this case, what would be reasonable assumptions for the signs of
\[ \frac{\partial}{\partial y} d_A(x, y) \]
and
\[ \frac{\partial}{\partial x} d_B(x, y)? \]
(c) Suppose the two commodities complement each other. For example, commodity A might be a computer and commodity B a type of software. In this case, what would be reasonable assumptions for the signs of
\[ \frac{\partial}{\partial y} d_A(x, y) \]
and
\[ \frac{\partial}{\partial x} d_B(x, y)? \]
10. Suppose $P(x_1, x_2, \ldots, x_n)$ represents the total production per week of a certain factory as a function of $x_1$, the number of workers, and other variables, such as the size of the supply inventory, the number of hours the assembly lines run per week, and so on. Show that average productivity
\[ \frac{P(x_1, x_2, \ldots, x_n)}{x_1} \]
increases as $x_1$ increases if and only if
\[ \frac{\partial}{\partial x_1} P(x_1, x_2, \ldots, x_n) > \frac{P(x_1, x_2, \ldots, x_n)}{x_1}. \]
11. Suppose $f : \mathbb{R}^n \to \mathbb{R}$ is $C^1$ on an open ball about the point $c$.
(a) Given a unit vector $u$, what is the relationship between $D_u f(c)$ and $D_{-u} f(c)$?
(b) Is it possible that $D_u f(c) > 0$ for every unit vector $u$?