ACTIVE CALCULUS

Matthew Boelkins, David Austin & Steven Schlicker
Grand Valley State University


This text is disseminated via the Open Education Resource (OER) LibreTexts Project (https://LibreTexts.org) and like the
hundreds of other texts available within this powerful platform, it is freely available for reading, printing and
"consuming." Most, but not all, pages in the library have licenses that may allow individuals to make changes, save, and
print this book. Carefully consult the applicable license(s) before pursuing such effects.
Instructors can adopt existing LibreTexts texts or Remix them to quickly build course-specific resources to meet the needs
of their students. Unlike traditional textbooks, LibreTexts’ web based origins allow powerful integration of advanced
features and new technologies to support learning.

The LibreTexts mission is to unite students, faculty and scholars in a cooperative effort to develop an easy-to-use online
platform for the construction, customization, and dissemination of OER content to reduce the burdens of unreasonable
textbook costs to our students and society. The LibreTexts project is a multi-institutional collaborative venture to develop
the next generation of open-access texts to improve postsecondary education at all levels of higher learning by developing
an Open Access Resource environment. The project currently consists of 14 independently operating and interconnected
libraries that are constantly being optimized by students, faculty, and outside experts to supplant conventional paper-based
books. These free textbook alternatives are organized within a central environment that is both vertically (from advanced to
basic level) and horizontally (across different fields) integrated.
The LibreTexts libraries are Powered by MindTouch® and are supported by the Department of Education Open Textbook
Pilot Project, the UC Davis Office of the Provost, the UC Davis Library, the California State University Affordable
Learning Solutions Program, and Merlot. This material is based upon work supported by the National Science Foundation
under Grant Nos. 1246120, 1525057, and 1413739. Unless otherwise noted, LibreTexts content is licensed by CC BY-NC-SA 3.0.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do
not necessarily reflect the views of the National Science Foundation nor the US Department of Education.
Have questions or comments? For information about adoptions or adaptations contact [email protected]. More
information on our activities can be found via Facebook (https://facebook.com/Libretexts), Twitter
(https://twitter.com/libretexts), or our blog (http://Blog.Libretexts.org).

This text was compiled on 12/22/2021


TABLE OF CONTENTS
An Introductory Calculus Libretexts Textmap
Active Calculus
by Matt Boelkins, David Austin, and Steve Schlicker
Chapter 1: Understanding the Derivative


1.1: How do we Measure Velocity?
1.2: The Notion of Limit
1.3: The Derivative of a Function at a Point
1.4: The Derivative Function
1.5: Interpreting, Estimating, and Using the Derivative
1.6: The Second Derivative
1.7: Limits, Continuity, and Differentiability
1.8: The Tangent Line Approximation
1.E: Understanding the Derivative (Exercises)

Chapter 2: Computing Derivatives


2.1: Elementary Derivative Rules
2.2: The Sine and Cosine Functions
2.3: The Product and Quotient Rules
2.4: Derivatives of Other Trigonometric Functions
2.5: The Chain Rule
2.6: Derivatives of Inverse Functions
2.7: Derivatives of Functions Given Implicitly
2.8: Using Derivatives to Evaluate Limits
2.E: Computing Derivatives (Exercises)

Chapter 3: Using Derivatives


3.1: Using Derivatives to Identify Extreme Values
3.2: Using Derivatives to Describe Families of Functions
3.3: Global Optimization
3.4: Applied Optimization
3.5: Related Rates
3.E: Using Derivatives (Exercises)

Chapter 4: The Definite Integral


4.1: Determining Distance Traveled from Velocity
4.2: Riemann Sums
4.3: The Definite Integral
4.4: The Fundamental Theorem of Calculus
4.E: The Definite Integral (Exercises)

Chapter 5: Finding Antiderivatives and Evaluating Integrals


5.1: Constructing Accurate Graphs of Antiderivatives
5.2: The Second Fundamental Theorem of Calculus
5.3: Integration by Substitution
5.4: Integration by Parts
5.5: Other Options for Finding Algebraic Antiderivatives
5.6: Numerical Integration
5.E: Finding Antiderivatives and Evaluating Integrals (Exercises)

Chapter 6: Using Definite Integrals


6.1: Using Definite Integrals to Find Area and Length
6.2: Using Definite Integrals to Find Volume
6.3: Density, Mass, and Center of Mass
6.4: Physics Applications: Work, Force, and Pressure
6.5: Improper Integrals
6.E: Using Definite Integrals (Exercises)

Chapter 7: Differential Equations


7.1: An Introduction to Differential Equations
7.2: Qualitative Behavior of Solutions to Differential Equations
7.3: Euler's Method
7.4: Separable Differential Equations
7.5: Modeling with Differential Equations
7.6: Population Growth and the Logistic Equation
7.E: Differential Equations (Exercises)

Chapter 8: Sequences and Series


8.1: Sequences
8.2: Geometric Series
8.3: Series of Real Numbers
8.4: Alternating Series
8.5: Taylor Polynomials and Taylor Series
8.6: Power Series
8.E: Sequences and Series (Exercises)

Active Calculus is different from most existing calculus texts in at least the following ways: the text is free for download by students
and instructors; in the electronic format, graphics are in full color and there are live html links to java applets; the text is open source,
and interested instructors can gain access to the original source files upon request; the style of the text requires students to be active
learners — there are very few worked examples in the text, with there instead being 3-4 activities per section that engage students in
connecting ideas, solving problems, and developing understanding of key calculus concepts; each section begins with motivating
questions, a brief introduction, and a preview activity, all of which are designed to be read and completed prior to class; the exercises
are few in number and challenging in nature.

1: UNDERSTANDING THE DERIVATIVE


1.1: HOW DO WE MEASURE VELOCITY?
1.2: THE NOTION OF LIMIT
1.3: THE DERIVATIVE OF A FUNCTION AT A POINT
1.4: THE DERIVATIVE FUNCTION
1.5: INTERPRETING, ESTIMATING, AND USING THE DERIVATIVE
1.6: THE SECOND DERIVATIVE
1.7: LIMITS, CONTINUITY, AND DIFFERENTIABILITY
1.8: THE TANGENT LINE APPROXIMATION
1.E: UNDERSTANDING THE DERIVATIVE (EXERCISES)

2: COMPUTING DERIVATIVES
Throughout Chapter 2, we will be working to develop shortcut derivative rules that will help us to bypass the limit definition of the
derivative in order to quickly determine the formula for f'(x) when we are given a formula for f(x).

2.1: ELEMENTARY DERIVATIVE RULES

2.2: THE SINE AND COSINE FUNCTIONS
2.3: THE PRODUCT AND QUOTIENT RULES
2.4: DERIVATIVES OF OTHER TRIGONOMETRIC FUNCTIONS
2.5: THE CHAIN RULE
2.6: DERIVATIVES OF INVERSE FUNCTIONS
2.7: DERIVATIVES OF FUNCTIONS GIVEN IMPLICITLY
2.8: USING DERIVATIVES TO EVALUATE LIMITS
2.E: COMPUTING DERIVATIVES (EXERCISES)

3: USING DERIVATIVES
3.1: USING DERIVATIVES TO IDENTIFY EXTREME VALUES
3.2: USING DERIVATIVES TO DESCRIBE FAMILIES OF FUNCTIONS
3.3: GLOBAL OPTIMIZATION
3.4: APPLIED OPTIMIZATION
3.5: RELATED RATES
3.E: USING DERIVATIVES (EXERCISES)

4: THE DEFINITE INTEGRAL


4.1: DETERMINING DISTANCE TRAVELED FROM VELOCITY
4.2: RIEMANN SUMS
4.3: THE DEFINITE INTEGRAL
4.4: THE FUNDAMENTAL THEOREM OF CALCULUS
4.E: THE DEFINITE INTEGRAL (EXERCISES)

5: FINDING ANTIDERIVATIVES AND EVALUATING INTEGRALS


5.1: CONSTRUCTING ACCURATE GRAPHS OF ANTIDERIVATIVES
5.2: THE SECOND FUNDAMENTAL THEOREM OF CALCULUS
5.3: INTEGRATION BY SUBSTITUTION
5.4: INTEGRATION BY PARTS
5.5: OTHER OPTIONS FOR FINDING ALGEBRAIC ANTIDERIVATIVES
5.6: NUMERICAL INTEGRATION
5.E: FINDING ANTIDERIVATIVES AND EVALUATING INTEGRALS (EXERCISES)

6: USING DEFINITE INTEGRALS


6.1: USING DEFINITE INTEGRALS TO FIND AREA AND LENGTH
6.2: USING DEFINITE INTEGRALS TO FIND VOLUME
6.3: DENSITY, MASS, AND CENTER OF MASS
6.4: PHYSICS APPLICATIONS - WORK, FORCE, AND PRESSURE
6.5: IMPROPER INTEGRALS
6.E: USING DEFINITE INTEGRALS (EXERCISES)

7: DIFFERENTIAL EQUATIONS
7.1: AN INTRODUCTION TO DIFFERENTIAL EQUATIONS
7.2: QUALITATIVE BEHAVIOR OF SOLUTIONS TO DIFFERENTIAL EQUATIONS
7.3: EULER'S METHOD
7.4: SEPARABLE DIFFERENTIAL EQUATIONS
7.5: MODELING WITH DIFFERENTIAL EQUATIONS
7.6: POPULATION GROWTH AND THE LOGISTIC EQUATION
7.E: DIFFERENTIAL EQUATIONS (EXERCISES)

8: SEQUENCES AND SERIES


8.1: SEQUENCES
8.2: GEOMETRIC SERIES
8.3: SERIES OF REAL NUMBERS
8.4: ALTERNATING SERIES
8.5: TAYLOR POLYNOMIALS AND TAYLOR SERIES
8.6: POWER SERIES
8.E: SEQUENCES AND SERIES (EXERCISES)

9: MULTIVARIABLE AND VECTOR FUNCTIONS


9.1: FUNCTIONS OF SEVERAL VARIABLES AND THREE DIMENSIONAL SPACE
9.2: SECTION 2-
9.3: SECTION 3-
9.4: SECTION 4-
9.5: SECTION 5-
9.6: SECTION 6-

10: DERIVATIVES OF MULTIVARIABLE FUNCTIONS


10.1: SECTION 1-
10.2: SECTION 2-
10.3: SECTION 3-
10.4: SECTION 4-
10.5: SECTION 5-
10.6: SECTION 6-

11: MULTIPLE INTEGRALS


11.1: SECTION 1-
11.2: SECTION 2-
11.3: SECTION 3-
11.4: SECTION 4-
11.5: SECTION 5-
11.6: SECTION 6-

BACK MATTER
INDEX
GLOSSARY

CHAPTER OVERVIEW

1: UNDERSTANDING THE DERIVATIVE

1.1: HOW DO WE MEASURE VELOCITY?


The average velocity on [a,b] can be viewed geometrically as the slope of the line between the points (a,s(a)) and (b,s(b)) on the graph
of y=s(t). The instantaneous velocity of a moving object at a fixed time is estimated by considering average velocities on shorter and
shorter time intervals that contain the instant of interest.

1.2: THE NOTION OF LIMIT


Limits enable us to examine trends in function behavior near a specific point. In particular, taking a limit at a given point asks if the
function values nearby tend to approach a particular fixed value.

1.3: THE DERIVATIVE OF A FUNCTION AT A POINT


An idea that sits at the foundations of calculus is the instantaneous rate of change of a function. This rate of change is always
considered with respect to change in the input variable, often at a particular fixed input value. This is a generalization of the notion of
instantaneous velocity and essentially allows us to consider the question “how do we measure how fast a particular function is
changing at a given point?”

1.4: THE DERIVATIVE FUNCTION


The limit definition of the derivative produces a value for each x at which the derivative is defined, and this leads to a new function
whose formula is y = f'(x). Hence we talk both about a given function f and its derivative f'. It is especially important to note that
taking the derivative is a process that starts with a given function (f) and produces a new, related function (f').

1.5: INTERPRETING, ESTIMATING, AND USING THE DERIVATIVE


Regardless of the context of a given function y = f(x), the derivative always measures the instantaneous rate of change of the output
variable with respect to the input variable. The units on the derivative function y = f'(x) are units of f per unit of x. Again, this
measures how fast the output of the function f changes when the input of the function changes.

1.6: THE SECOND DERIVATIVE
A differentiable function f is increasing at a point or on an interval whenever its first derivative is positive, and decreasing whenever
its first derivative is negative. By taking the derivative of the derivative of a function f, we arrive at the second derivative, f''. The
second derivative measures the instantaneous rate of change of the first derivative, and thus the sign of the second derivative tells us
whether the slope of the tangent line to f is increasing or decreasing.

1.7: LIMITS, CONTINUITY, AND DIFFERENTIABILITY


A function f has limit L as x → a if and only if f has a left-hand limit at x = a, has a right-hand limit at x = a, and the left- and right-hand
limits are equal. A function f is continuous at x = a whenever f (a) is defined, f has a limit as x → a, and the value of the limit and the
value of the function agree. This guarantees that there is not a hole or jump in the graph of f at x = a. A function f is differentiable at x
= a whenever f' (a) exists.

1.8: THE TANGENT LINE APPROXIMATION


The principle of local linearity tells us that if we zoom in on a point where a function y = f (x) is differentiable, the function should
become indistinguishable from its tangent line. That is, a differentiable function looks linear when viewed up close.

1.E: UNDERSTANDING THE DERIVATIVE (EXERCISES)


These are homework exercises to accompany Chapter 1 of Boelkins et al. "Active Calculus" Textmap.

1.1: How do we Measure Velocity?
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How is the average velocity of a moving object connected to the values of its position function?
How do we interpret the average velocity of an object geometrically with regard to the graph of its position
function?
How is the notion of instantaneous velocity connected to average velocity?

Calculus can be viewed broadly as the study of change. A natural and important question to ask about any changing
quantity is “how fast is the quantity changing?” It turns out that in order to make the answer to this question precise,
substantial mathematics is required.
We begin with a familiar problem: a ball being tossed straight up in the air from an initial height. From this elementary
scenario, we will ask questions about how the ball is moving. These questions will lead us to begin investigating ideas that
will be central throughout our study of differential calculus and that have wide-ranging consequences. In a great deal of
our thinking about calculus, we will be well-served by remembering this first example and asking ourselves how the
various (sometimes abstract) ideas we are considering are related to the simple act of tossing a ball straight up in the air.

Preview Activity 1.1.1:

Suppose that the height s of a ball (in feet) at time t (in seconds) is given by the formula $s(t) = 64 - 16(t-1)^2$.

a. Construct an accurate graph of y = s(t) on the time interval 0 ≤ t ≤ 3 . Label at least six distinct points on
the graph, including the three points that correspond to when the ball was released, when the ball reaches its
highest point, and when the ball lands.
b. In everyday language, describe the behavior of the ball on the time interval 0 < t < 1 and on the time interval
1 < t < 3 . What occurs at the instant t = 1 ?
c. Consider the expression
\[ AV_{[0.5,1]} = \frac{s(1) - s(0.5)}{1 - 0.5}. \]
Compute the value of $AV_{[0.5,1]}$. What does this value measure geometrically? What does this value measure
physically? In particular, what are the units on $AV_{[0.5,1]}$?

Position and Average Velocity


Any moving object has a position that can be considered a function of time. When this motion is along a straight line, the
position is given by a single variable, and we usually let this position be denoted by s(t) , which reflects the fact that
position is a function of time. For example, we might view s(t) as telling the mile marker of a car traveling on a straight
highway at time t in hours; similarly, the function s described in Preview Activity 1.1.1 is a position function, where
position is measured vertically relative to the ground.
Not only does such a moving object have a position associated with its motion, but on any time interval, the object has an
average velocity. Think, for example, about driving from one location to another: the vehicle travels some number of miles
over a certain time interval (measured in hours), from which we can compute the vehicle’s average velocity. In this
situation, average velocity is the number of miles traveled divided by the time elapsed, which of course is given in miles
per hour. Similarly, the calculation of $AV_{[0.5,1]}$ in Preview Activity 1.1.1 found the average velocity of the ball on the time
interval [0.5, 1], measured in feet per second.


In general, we make the following definition: for an object moving in a straight line whose position at time t is given by
the function s(t) , the average velocity of the object on the interval from t = a to t = b , denoted $AV_{[a,b]}$, is given by the
formula
\[ AV_{[a,b]} = \frac{s(b) - s(a)}{b - a}. \tag{1.1.1} \]

Matthew Boelkins, David Austin & Steven Schlicker, 11/24/2021, https://math.libretexts.org/@go/page/4292
Note well: the units on $AV_{[a,b]}$ are “units of s per unit of t ,” such as “miles per hour” or “feet per second.”
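
The definition above translates directly into a short computation. The following Python sketch is illustrative only: the helper name `average_velocity` is our own choice rather than anything from the text, and we evaluate the position function from Preview Activity 1.1.1 on the interval [0, 1], an interval chosen purely for demonstration.

```python
def s(t):
    """Height of the ball (feet) at time t (seconds), from Preview Activity 1.1.1."""
    return 64 - 16 * (t - 1) ** 2


def average_velocity(position, a, b):
    """Average velocity on [a, b]: change in position divided by change in time."""
    if a == b:
        raise ValueError("the interval must have nonzero length")
    return (position(b) - position(a)) / (b - a)


# Units are units of s per unit of t, here feet per second.
av = average_velocity(s, 0, 1)
print(av, "feet per second")
```

Passing the position function as an argument keeps the helper reusable for any other position function s.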

Activity 1.1.2:

The following questions concern the position function given by $s(t) = 64 - 16(t-1)^2$, which is the same function
considered in Preview Activity 1.1.1.
a. Compute the average velocity of the ball on each of the following time intervals:
[0.4, 0.8], [0.7, 0.8], [0.79, 0.8], [0.799, 0.8], [0.8, 1.2], [0.8, 0.9], [0.8, 0.81], [0.8, 0.801]. Include units for
each value.
b. On the provided graph in Figure 1.1.1, sketch the line that passes through the points A = (0.4, s(0.4)) and
B = (0.8, s(0.8)). What is the meaning of the slope of this line? In light of this meaning, what is a
geometric way to interpret each of the values computed in the preceding question?
c. Use a graphing utility to plot the graph of $s(t) = 64 - 16(t-1)^2$ on an interval containing the value
t = 0.8 . Then, zoom in repeatedly on the point (0.8, s(0.8)). What do you observe about how the graph
appears as you view it more and more closely?


d. What do you conjecture is the velocity of the ball at the instant t = 0.8 ? Why?

Figure 1.1.1 : A partial plot of $s(t) = 64 - 16(t-1)^2$.

Instantaneous Velocity
Whether driving a car, riding a bike, or throwing a ball, we have an intuitive sense that any moving object has a velocity at
any given moment – a number that measures how fast the object is moving right now. For instance, a car’s speedometer
tells the driver what appears to be the car’s velocity at any given instant. In fact, the posted velocity on a speedometer is
really an average velocity that is computed over a very small time interval (by computing how many revolutions the tires
have undergone to compute distance traveled), since velocity fundamentally comes from considering a change in position
divided by a change in time. But if we let the time interval over which average velocity is computed become shorter and
shorter, then we can progress from average velocity to instantaneous velocity.
Informally, we define the instantaneous velocity of a moving object at time t = a to be the value that the average velocity
approaches as we take smaller and smaller intervals of time containing t = a to compute the average velocity. We will
develop a more formal definition of this momentarily, one that will end up being the foundation of much of our work in
first-semester calculus. For now, it is fine to think of instantaneous velocity this way: take average velocities on smaller
and smaller time intervals, and if those average velocities approach a single number, then that number will be the
instantaneous velocity at that point.
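
To see this informal definition in action, we can tabulate average velocities on shrinking intervals. The sketch below is a numerical experiment under our own assumptions: it reuses the position function from Preview Activity 1.1.1, picks the instant t = 0.5 and the interval widths arbitrarily, and uses intervals of the form [a, a + w], which is only one of several ways to choose small intervals containing t = a.

```python
def s(t):
    """Height of the ball (feet) at time t (seconds), from Preview Activity 1.1.1."""
    return 64 - 16 * (t - 1) ** 2


def average_velocity(position, a, b):
    """Average velocity on [a, b]."""
    return (position(b) - position(a)) / (b - a)


a = 0.5                               # instant of interest (our choice)
widths = [0.1, 0.01, 0.001, 0.0001]   # shrinking interval widths (our choice)
estimates = [average_velocity(s, a, a + w) for w in widths]

for w, av in zip(widths, estimates):
    print(f"width {w}: average velocity {av:.4f} ft/sec")
```

If the printed values settle toward a single number as the width shrinks, that number is our estimate of the instantaneous velocity at t = 0.5.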

Activity 1.1.3:

Each of the following questions concerns $s(t) = 64 - 16(t-1)^2$, the position function from Preview Activity 1.1.1.

a. Compute the average velocity of the ball on the time interval [1.5, 2]. What is different between this value
and the average velocity on the interval [0, 0.5]?
b. Use appropriate computing technology to estimate the instantaneous velocity of the ball at t = 1.5 .
Likewise, estimate the instantaneous velocity of the ball at t = 2 . Which value is greater?
c. How is the sign of the instantaneous velocity of the ball related to its behavior at a given point in time? That
is, what does positive instantaneous velocity tell you the ball is doing? Negative instantaneous velocity?
d. Without doing any computations, what do you expect to be the instantaneous velocity of the ball at t = 1 ?
Why?

At this point we have started to see a close connection between average velocity and instantaneous velocity, as well as how
each is connected not only to the physical behavior of the moving object but also to the geometric behavior of the graph of
the position function. In order to make the link between average and instantaneous velocity more formal, we will introduce
the notion of limit in Section 1.2. As a preview of that concept, we look at a way to consider the limiting value of average
velocity through the introduction of a parameter. Note that if we desire to know the instantaneous velocity at t = a of a
moving object with position function s, we are interested in computing average velocities on the interval [a, b] for smaller
and smaller intervals. One way to visualize this is to think of the value b as being b = a + h , where h is a small number
that is allowed to vary. Thus, we observe that the average velocity of the object on the interval [a, a + h] is
\[ AV_{[a,a+h]} = \frac{s(a+h) - s(a)}{h}, \tag{1.1.2} \]

with the denominator being simply h because (a + h) − a = h . Initially, it is fine to think of h being a small positive real
number; but it is important to note that we allow h to be a small negative number, too, as this enables us to investigate the
average velocity of the moving object on intervals prior to t = a , as well as following t = a . When h < 0 , $AV_{[a,a+h]}$
measures the average velocity on the interval [a + h, a] .
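
The remark about negative h can be checked numerically. In this sketch the helper name `av_h` and the values a = 0.5 and h = ±0.1 are our own illustrative choices, and the position function is again the one from Preview Activity 1.1.1.

```python
def s(t):
    """Height of the ball (feet) at time t (seconds), from Preview Activity 1.1.1."""
    return 64 - 16 * (t - 1) ** 2


def av_h(position, a, h):
    """Average velocity between t = a and t = a + h; h may be negative but not zero."""
    if h == 0:
        raise ValueError("h must be nonzero")
    return (position(a + h) - position(a)) / h


a = 0.5
after = av_h(s, a, 0.1)    # interval [0.5, 0.6], following t = a
before = av_h(s, a, -0.1)  # interval [0.4, 0.5], prior to t = a

# For h < 0, the same number comes from the usual formula on [a + h, a].
direct = (s(a) - s(a - 0.1)) / (a - (a - 0.1))
print(after, before, direct)
```

Up to floating-point rounding, `before` and `direct` agree, which is why a single formula in h covers intervals on both sides of t = a.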


To attempt to find the instantaneous velocity at t = a , we investigate what happens as the value of h approaches zero. We
consider this further in the following example.

Example 1.1.1:

For a falling ball whose position function is given by $s(t) = 16 - 16t^2$ (where s is measured in feet and t in seconds),
find an expression for the average velocity of the ball on a time interval of the form [0.5, 0.5 + h] where
−0.5 < h < 0.5 and h ≠ 0. Use this expression to compute the average velocity on [0.5, 0.75] and [0.4, 0.5], as well as
to make a conjecture about the instantaneous velocity at t = 0.5 .


Solution
We make the assumptions that −0.5 < h < 0.5 and h ≠ 0 because h cannot be zero (otherwise there is no interval on
which to compute average velocity) and because the function only makes sense on the time interval 0 ≤ t ≤ 1 , as this
is the duration of time during which the ball is falling. Observe that we want to compute and simplify
\[ AV_{[0.5,0.5+h]} = \frac{s(0.5+h) - s(0.5)}{(0.5+h) - 0.5}. \tag{1.1.3} \]

The most unusual part of this computation is finding s(0.5 + h) . To do so, we follow the rule that defines the function
s . In particular, since $s(t) = 16 - 16t^2$, we see that
\[
\begin{aligned}
s(0.5+h) &= 16 - 16(0.5+h)^2 \\
&= 16 - 16(0.25 + h + h^2) \\
&= 16 - 4 - 16h - 16h^2 \\
&= 12 - 16h - 16h^2.
\end{aligned}
\tag{1.1.4}
\]

Now, returning to our computation of the average velocity, we find that
\[
\begin{aligned}
AV_{[0.5,0.5+h]} &= \frac{s(0.5+h) - s(0.5)}{(0.5+h) - 0.5} \\
&= \frac{(12 - 16h - 16h^2) - (16 - 16(0.5)^2)}{0.5 + h - 0.5} \\
&= \frac{12 - 16h - 16h^2 - 12}{h} \\
&= \frac{-16h - 16h^2}{h}.
\end{aligned}
\tag{1.1.5}
\]

At this point, we note two things: first, the expression for average velocity clearly depends on h , which it must, since
as h changes the average velocity will change. Further, we note that since h can never equal zero, we may further
simplify the most recent expression. Removing the common factor of h from the numerator and denominator, it
follows that
\[ AV_{[0.5,0.5+h]} = -16 - 16h. \tag{1.1.6} \]

Now, for any small positive or negative value of h , we can compute the average velocity. For instance, to obtain the
average velocity on [0.5, 0.75], we let h = 0.25, and the average velocity is −16 − 16(0.25) = −20 ft/sec. To get
the average velocity on [0.4, 0.5], we let h = −0.1 , which tells us the average velocity is
−16 − 16(−0.1) = −14.4 ft/sec. Moreover, we can even explore what happens to $AV_{[0.5,0.5+h]}$ as h gets closer and
closer to zero. As h approaches zero, −16h will also approach zero, and thus it appears that the instantaneous velocity
of the ball at t = 0.5 should be −16 ft/sec.
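
As a sanity check on the algebra in Example 1.1.1, we can compare the unsimplified difference quotient with the simplified expression −16 − 16h for several values of h. This is only a numerical spot check, not a proof; the function names `av_direct` and `av_simplified` and the sample values of h are our own.

```python
def s(t):
    """Position (feet) of the falling ball from Example 1.1.1."""
    return 16 - 16 * t ** 2


def av_direct(h):
    """Average velocity on the interval between 0.5 and 0.5 + h, from the definition."""
    return (s(0.5 + h) - s(0.5)) / h


def av_simplified(h):
    """The simplified expression derived in the example, valid for h != 0."""
    return -16 - 16 * h


for h in [0.25, -0.1, 0.01, -0.01, 0.001]:
    print(f"h = {h}: direct {av_direct(h):.6f}, simplified {av_simplified(h):.6f}")
```

The two columns should agree up to floating-point rounding, and the values for h = 0.25 and h = −0.1 correspond to the intervals [0.5, 0.75] and [0.4, 0.5] computed in the example.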

Activity 1.1.4:

For the function given by $s(t) = 64 - 16(t-1)^2$ from Preview Activity 1.1.1, find the most simplified expression
you can for the average velocity of the ball on the interval [2, 2 + h] . Use your result to compute the average velocity
on [1.5, 2] and to estimate the instantaneous velocity at t = 2 . Finally, compare your results with your earlier work in Activity 1.1.1.

Summary
In this section, we encountered the following important ideas:
The average velocity on [a, b] can be viewed geometrically as the slope of the line between the points (a, s(a)) and
(b, s(b)) on the graph of y = s(t) , as shown in Figure 1.1.2.

Figure 1.1.2 : The graph of position function s together with the line through (a, s(a)) and (b, s(b)) whose slope is
$m = \frac{s(b) - s(a)}{b - a}$. The line’s slope is the average rate of change of s on the interval [a, b].

Given a moving object whose position at time t is given by a function s , the average velocity of the object on the time
interval [a, b] is given by $AV_{[a,b]} = \frac{s(b) - s(a)}{b - a}$. Viewing the interval [a, b] as having the form [a, a + h] , we
equivalently compute average velocity by the formula $AV_{[a,a+h]} = \frac{s(a+h) - s(a)}{h}$.
The instantaneous velocity of a moving object at a fixed time is estimated by considering average velocities on shorter
and shorter time intervals that contain the instant of interest.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

1.2: The Notion of Limit
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is the mathematical notion of limit and what role do limits play in the study of functions?
What is the meaning of the notation $\lim_{x \to a} f(x) = L$?

How do we go about determining the value of the limit of a function at a point?


How do we manipulate average velocity to compute instantaneous velocity?

Functions are at the heart of mathematics: a function is a process or rule that associates each individual input to exactly one
corresponding output. Students learn in courses prior to calculus that there are many different ways to represent functions,
including through formulas, graphs, tables, and even words. For example, the squaring function can be thought of in any of
these ways. In words, the squaring function takes any real number x and computes its square. The formulaic and graphical
representations go hand in hand, as $y = f(x) = x^2$ is one of the simplest curves to graph. Finally, we can also partially
represent this function through a table of values, essentially by listing some of the ordered pairs that lie on the curve, such
as (−2, 4), (−1, 1), (0, 0), (1, 1), and (2, 4).
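
The formula and table representations of the squaring function can be placed side by side in a few lines of Python. This is a small illustration of our own; the name `f` follows the text, while the choice of integer inputs from −2 to 2 simply reproduces the ordered pairs listed above.

```python
def f(x):
    """The squaring function, given as a formula."""
    return x ** 2


# A partial table of values: ordered pairs (x, f(x)) that lie on the curve.
table = [(x, f(x)) for x in range(-2, 3)]
print(table)  # [(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)]
```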
Functions are especially important in calculus because they often model important phenomena – the location of a moving
object at a given time, the rate at which an automobile is consuming gasoline at a certain velocity, the reaction of a patient
to the size of a dose of a drug – and calculus can be used to study how these output quantities change in response to
changes in the input variable. Moreover, thinking about concepts like average and instantaneous velocity leads us naturally
from an initial function to a related, sometimes more complicated function. As one example of this, think about the falling
ball whose position function is given by $s(t) = 64 - 16t^2$ and the average velocity of the ball on the interval [1, x].
Observe that
\[ AV_{[1,x]} = \frac{s(x) - s(1)}{x - 1} = \frac{(64 - 16x^2) - (64 - 16)}{x - 1} = \frac{16 - 16x^2}{x - 1}. \tag{1.2.1} \]

Now, two things are essential to note: this average velocity depends on x (indeed, $AV_{[1,x]}$ is a function of x), and our most
focused interest in this function occurs near x = 1 , which is where the function is not defined. Said differently, the
function $g(x) = \frac{16 - 16x^2}{x - 1}$ tells us the average velocity of the ball on the interval from t = 1 to t = x , and if we are
interested in the instantaneous velocity of the ball when t = 1 , we’d like to know what happens to g(x) as x gets closer
and closer to 1. At the same time, g(1) is not defined, because it leads to the quotient 0/0.
This is where the idea of limits comes in. By using a limit, we’ll be able to allow x to get arbitrarily close, but not equal, to
1 and fully understand the behavior of g(x) near this value. We’ll develop key language, notation, and conceptual
understanding in what follows, but for now we consider a preliminary activity that uses the graphical interpretation of a
function to explore points on a graph where interesting behavior occurs.
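
Before developing any formal machinery, we can get a feel for how g(x) behaves near x = 1 by evaluating it at nearby inputs. The following is a minimal Python sketch of that idea. The names s and g mirror the text, while the particular sample inputs are our own illustrative choice.

```python
def s(t):
    """Position of the falling ball from the text: s(t) = 64 - 16 t^2."""
    return 64 - 16 * t**2


def g(x):
    """Average velocity of the ball on the interval from t = 1 to t = x."""
    return (s(x) - s(1)) / (x - 1)


# Sample inputs on both sides of 1 (an arbitrary illustrative choice).
for x in [0.9, 0.99, 0.999, 1.001, 1.01, 1.1]:
    print(f"x = {x:<6} g(x) = {g(x):.4f}")

# g(1) itself cannot be evaluated, because the quotient is 0/0.
try:
    g(1)
except ZeroDivisionError:
    print("g(1) is undefined (division by zero)")
```

The printed values settle near −32 as the inputs approach 1 from either side, even though g(1) itself cannot be computed.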

Preview Activity 1.2.1:

Suppose that g is the function given by the graph below. Use the graph to answer each of the following questions.
a. Determine the values g(−2), g(−1), g(0), g(1) , and g(2) , if defined. If the function value is not defined, explain
what feature of the graph tells you this.
b. For each of the values a = −1 , a = 0 , and a = 2 , complete the following sentence: “As x gets closer and closer
(but not equal) to a, g(x) gets as close as we want to ____.”
c. What happens as x gets closer and closer (but not equal) to a = 1 ? Does the function g(x) get as close as we
would like to a single value?

Matthew Boelkins, David Austin & Steven


1.2.1 11/17/2021 https://math.libretexts.org/@go/page/4293
Schlicker
Figure 1.5: Graph of y = g(x) for Preview Activity 1.2.1.

The Notion of Limit


Limits can be thought of as a way to study the tendency or trend of a function as the input variable approaches a fixed
value, or even as the input variable increases or decreases without bound. We put off the study of the latter idea until
further along in the course when we will have some helpful calculus tools for understanding the end behavior of functions.
Here, we focus on what it means to say that “a function f has limit L as x approaches a .” To begin, we think about a
recent example.
In Preview Activity 1.2.1, you saw that for the given function g, as x gets closer and closer (but not equal) to 0, g(x) gets as
close as we want to the value 4. At first, this may feel counterintuitive, because the value of g(0) is 1, not 4. By their very
definition, limits regard the behavior of a function arbitrarily close to a fixed input, but the value of the function at the
fixed input does not matter. More formally¹, we say the following.

Definition: Limits
Given a function f , a fixed input x = a , and a real number L, we say that f has limit L as x approaches a , and write

\[ \lim_{x \to a} f(x) = L \tag{1.2.2} \]

provided that we can make f (x) as close to L as we like by taking x sufficiently close (but not equal) to a . If we
cannot make f (x) as close to a single value as we would like as x approaches a , then we say that f does not have a
limit as x approaches a .

For the function g pictured in Figure 1.5, we can make the following observations:
\[ \lim_{x \to -1} g(x) = 3, \quad \lim_{x \to 0} g(x) = 4, \quad \text{and} \quad \lim_{x \to 2} g(x) = 1, \tag{1.2.3} \]

but g does not have a limit as x → 1 . When working graphically, it suffices to ask if the function approaches a single value
from each side of the fixed input, while understanding that the function value right at the fixed input is irrelevant. This
reasoning explains the values of the first three stated limits. In a situation such as the jump in the graph of g at x = 1 , the
issue is that if we approach x = 1 from the left, the function values tend to get as close to 3 as we’d like, but if we
approach x = 1 from the right, the function values get as close to 2 as we’d like, and there is no single number that all of
these function values approach. This is why the limit of g does not exist at x = 1.
For any function f, there are typically three ways to answer the question “does f have a limit at x = a, and if so, what is
the limit?” The first is to reason graphically as we have just done with the example from Preview Activity 1.2.1. If we have
a formula for f(x), there are two additional possibilities:
1. evaluate the function at a sequence of inputs that approach a on either side, typically using some sort of computing
technology, and ask if the sequence of outputs seems to approach a single value;
2. use the algebraic form of the function to understand the trend in its output as the input values approach a .
The first approach only produces an approximation of the value of the limit, while the latter can often be used to determine
the limit exactly. The following example demonstrates both of these approaches, while also using the graphs of the
respective functions to help confirm our conclusions.
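
The numerical approach described in item 1 can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the helper name limit_table and the choice of distances 0.1, 0.01, and so on are hypothetical, and the function f is the one examined in the example that follows.

```python
def limit_table(f, a, steps=5):
    """Tabulate f just left and right of a, at distances 0.1, 0.01, ..., 10**-steps."""
    rows = []
    for k in range(1, steps + 1):
        h = 10.0 ** (-k)
        rows.append((h, f(a - h), f(a + h)))
    return rows


def f(x):
    """The function f(x) = (4 - x^2)/(x + 2) from Example 1.2.1."""
    return (4 - x**2) / (x + 2)


# Approach a = -1 from both sides and watch the outputs.
for h, left, right in limit_table(f, -1):
    print(f"h = {h:<8g} f(-1 - h) = {left:.6f}   f(-1 + h) = {right:.6f}")
```

Both columns of output drift toward 3. As the text notes, a table like this only suggests a value for the limit; it does not establish it.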
¹ What follows here is not what mathematicians consider the formal definition of a limit. To be completely precise, it is
necessary to quantify both what it means to say “as close to L as we like” and “sufficiently close to a .” That can be
accomplished through what is traditionally called the epsilon-delta definition of limits. The definition presented here is
sufficient for the purposes of this text.

Example 1.2.1:

For each of the following functions, we’d like to know whether or not the function has a limit at the stated a-values.
Use both numerical and algebraic approaches to investigate and, if possible, estimate or determine the value of the
limit. Compare the results with a careful graph of the function on an interval containing the points of interest.
a. $f(x) = \frac{4 - x^2}{x + 2}$; $a = -1$, $a = -2$
b. $g(x) = \sin\left(\frac{\pi}{x}\right)$; $a = 3$, $a = 0$

Solution
We first construct a graph of f along with tables of values near a = −1 and a = −2 .

Figure 1.6: Tables and graph for $f(x) = \frac{4 - x^2}{x + 2}$.
From the left table, it appears that we can make $f$ as close as we want to 3 by taking $x$ sufficiently close to −1, which
suggests that $\lim_{x \to -1} f(x) = 3$. This is also consistent with the graph of $f$. To see this a bit more rigorously and from
an algebraic point of view, consider the formula for $f$: $f(x) = \frac{4 - x^2}{x + 2}$. The numerator and denominator are each
polynomial functions, which are among the most well-behaved functions that exist. Formally, such functions are
continuous, which means that the limit of the function at any point is equal to its function value (see Section 1.7 for
more on the notion of continuity). Here, it follows that as $x \to -1$, $(4 - x^2) \to (4 - (-1)^2) = 3$, and
$(x + 2) \to (-1 + 2) = 1$, so as $x \to -1$, the numerator of $f$ tends to 3 and the denominator tends to 1, hence
\[ \lim_{x \to -1} f(x) = \frac{3}{1} = 3. \]

The situation is more complicated when $x \to -2$, due in part to the fact that $f(-2)$ is not defined. If we attempt to use
a similar algebraic argument regarding the numerator and denominator, we observe that as
$x \to -2$, $(4 - x^2) \to (4 - (-2)^2) = 0$, and $(x + 2) \to (-2 + 2) = 0$, so as $x \to -2$, the numerator of $f$ tends to 0
and the denominator tends to 0. We call 0/0 an indeterminate form and will revisit several important issues surrounding
such quantities later in the course. For now, we simply observe that this tells us there is somehow more work to do.
From the table and the graph, it appears that $f$ should have a limit of 4 at $x = -2$. To see algebraically why this is the
case, let’s work directly with the form of $f(x)$. Observe that
\[ \lim_{x \to -2} f(x) = \lim_{x \to -2} \frac{4 - x^2}{x + 2} = \lim_{x \to -2} \frac{(2 - x)(2 + x)}{x + 2}. \tag{1.2.4} \]

At this point, it is important to observe that since we are taking the limit as $x \to -2$, we are considering $x$ values that
are close, but not equal, to −2. Since we never actually allow $x$ to equal −2, the quotient $\frac{2 + x}{x + 2}$ has value 1 for every
possible value of $x$. Thus, we can simplify the most recent expression above, and now find that
\[ \lim_{x \to -2} f(x) = \lim_{x \to -2} (2 - x). \tag{1.2.5} \]
Because $2 - x$ is simply a linear function, this limit is now easy to determine, and its value clearly is 4. Thus, from
several points of view we’ve seen that $\lim_{x \to -2} f(x) = 4$.
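
The cancellation argument above can be checked numerically. The sketch below, which is only an illustration with helper names of our own choosing, compares f(x) with the simplified expression 2 − x at inputs near (but not equal to) −2, and confirms that f cannot be evaluated at −2 itself.

```python
def f(x):
    """Original formula: f(x) = (4 - x^2)/(x + 2), undefined at x = -2."""
    return (4 - x**2) / (x + 2)


def simplified(x):
    """Simplified formula 2 - x, which agrees with f wherever f is defined."""
    return 2 - x


# Compare the two formulas at inputs on either side of -2.
for h in [0.1, 0.01, 0.001]:
    for x in (-2 - h, -2 + h):
        print(f"x = {x:<8.4f} f(x) = {f(x):.6f}   2 - x = {simplified(x):.6f}")

# At x = -2 the original formula gives 0/0, while 2 - x equals 4.
try:
    f(-2)
except ZeroDivisionError:
    print("f(-2) is undefined; simplified(-2) =", simplified(-2))
```

The two columns agree at every sampled input, and both approach 4, which matches the limit found algebraically.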

Next we turn to the function g , and construct two tables and a graph.

Figure 1.7: Tables and graph for $g(x) = \sin\left(\frac{\pi}{x}\right)$.
First, as $x \to 3$, it appears from the data (and the graph) that the function is approaching approximately 0.866025. To
be precise, we have to use the fact that $\frac{\pi}{x} \to \frac{\pi}{3}$, and thus we find that $g(x) = \sin\left(\frac{\pi}{x}\right) \to \sin\left(\frac{\pi}{3}\right)$ as $x \to 3$. The exact
value of $\sin\left(\frac{\pi}{3}\right)$ is $\frac{\sqrt{3}}{2}$, which is approximately 0.8660254038. Thus, we see that
\[ \lim_{x \to 3} g(x) = \frac{\sqrt{3}}{2}. \tag{1.2.6} \]
As $x \to 0$, we observe that $\frac{\pi}{x}$ does not behave in an elementary way. When $x$ is positive and approaching zero, we are
dividing by smaller and smaller positive values, and $\frac{\pi}{x}$ increases without bound. When $x$ is negative and approaching
zero, $\frac{\pi}{x}$ decreases without bound. In this sense, as we get close to $x = 0$, the inputs to the sine function are growing
rapidly, and this leads to wild oscillations in the graph of $g$. It is an instructive exercise to plot the function
$g(x) = \sin\left(\frac{\pi}{x}\right)$ with a graphing utility and then zoom in on $x = 0$. Doing so shows that the function never settles
down to a single value near the origin and suggests that $g$ does not have a limit at $x = 0$.
How do we reconcile this with the right-hand table above, which seems to suggest that the limit of $g$ as $x$ approaches 0
may in fact be 0? Here we need to recognize that the data misleads us because of the special nature of the sequence
$\{0.1, 0.01, 0.001, \ldots\}$: when we evaluate $g(10^{-k})$, we get $g(10^{-k}) = \sin\left(\frac{\pi}{10^{-k}}\right) = \sin(10^k \pi) = 0$ for each positive
integer value of $k$. But if we take a different sequence of values approaching zero, say $\{0.3, 0.03, 0.003, \ldots\}$, then we
find that
\[ g(3 \cdot 10^{-k}) = \sin\left(\frac{\pi}{3 \cdot 10^{-k}}\right) = \sin\left(\frac{10^k \pi}{3}\right) = -\frac{\sqrt{3}}{2} \approx -0.866025. \tag{1.2.7} \]
That sequence of data would suggest that the value of the limit is $-\frac{\sqrt{3}}{2}$. Clearly the function cannot have two different
values for the limit, and this shows that $g$ has no limit as $x \to 0$.
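
The two sequences discussed above are easy to compare side by side. The following Python sketch is an illustration only (the range of k values is our own choice); it evaluates g along both sequences using floating-point arithmetic, so the outputs are close to, but not exactly, the values 0 and −√3/2 derived in the text.

```python
import math


def g(x):
    """The function g(x) = sin(pi / x) from Example 1.2.1."""
    return math.sin(math.pi / x)


for k in range(1, 6):
    x1 = 10.0 ** (-k)        # sequence 0.1, 0.01, 0.001, ...
    x2 = 3 * 10.0 ** (-k)    # sequence 0.3, 0.03, 0.003, ...
    print(f"k = {k}:  g({x1:g}) = {g(x1): .6f}   g({x2:g}) = {g(x2): .6f}")
```

Both input sequences approach 0, yet the outputs along the first stay near 0 while those along the second stay near −0.866025. No single table can reveal this on its own, which is why a table should be backed up by other reasoning.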

An important lesson to take from Example 1.2.1 is that tables can be misleading when determining the value of a limit.
While a table of values is useful for investigating the possible value of a limit, we should also use other tools to confirm
the value, if we think the table suggests the limit exists.

Activity 1.2.2:

Estimate the value of each of the following limits by constructing appropriate tables of values. Then determine the
exact value of the limit by using algebra to simplify the function. Finally, plot each function on an appropriate interval
to check your result visually.

a. $\lim_{x \to 1} \frac{x^2 - 1}{x - 1}$
b. $\lim_{x \to 0} \frac{(2 + x)^3 - 8}{x}$
c. $\lim_{x \to 0} \frac{\sqrt{x + 1} - 1}{x}$

This concludes a rather lengthy introduction to the notion of limits. It is important to remember that our primary
motivation for considering limits of functions comes from our interest in studying the rate of change of a function. To that
end, we close this section by revisiting our previous work with average and instantaneous velocity and highlighting the
role that limits play.

Instantaneous Velocity
Suppose that we have a moving object whose position at time $t$ is given by a function $s$. We know that the average
velocity of the object on the time interval $[a, b]$ is $AV_{[a,b]} = \frac{s(b) - s(a)}{b - a}$. We define the instantaneous velocity at $a$ to be the
limit of average velocity as $b$ approaches $a$. Note particularly that as $b \to a$, the length of the time interval gets shorter and
shorter (while always including $a$). In Section 1.3, we will introduce a helpful shorthand notation to represent the
instantaneous rate of change. For now, we will write $IV_{t=a}$ for the instantaneous velocity at $t = a$, and thus
\[ IV_{t=a} = \lim_{b \to a} AV_{[a,b]} = \lim_{b \to a} \frac{s(b) - s(a)}{b - a}. \tag{1.2.8} \]

Equivalently, if we think of the changing value b as being of the form b = a + h , where h is some small number, then we
may instead write
\[ IV_{t=a} = \lim_{h \to 0} AV_{[a,a+h]} = \lim_{h \to 0} \frac{s(a + h) - s(a)}{h}. \tag{1.2.9} \]

Again, the most important idea here is that to compute instantaneous velocity, we take a limit of average velocities as the
time interval shrinks. Two different activities offer the opportunity to investigate these ideas and the role of limits further.
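
To see the limit of average velocities at work, we can compute the average velocity on shorter and shorter intervals. The sketch below is a minimal illustration, not a definitive method: the helper name average_velocity and the list of h values are our own assumptions, and the position function is the falling ball s(t) = 64 − 16t² from the start of this section.

```python
def average_velocity(s, a, h):
    """Average velocity of position function s on the interval between a and a + h."""
    return (s(a + h) - s(a)) / h


def s(t):
    """Falling-ball position function used earlier in this section."""
    return 64 - 16 * t**2


# Shrink the interval toward t = 1 from the right (h > 0) and the left (h < 0).
for h in [0.1, 0.01, 0.001, -0.001, -0.01, -0.1]:
    print(f"h = {h:>6}  AV = {average_velocity(s, 1, h):.4f}")
```

As h shrinks toward 0 from either side, the average velocities approach −32, which is the value the limit assigns to the instantaneous velocity at t = 1.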

Activity 1.2.3

Consider a moving object whose position function is given by $s(t) = t^2$, where s is measured in meters and t is
measured in minutes.
a. Determine the most simplified expression for the average velocity of the object on the interval [3, 3 + h], where
h > 0.

b. Determine the average velocity of the object on the interval [3, 3.2]. Include units on your answer.
c. Determine the instantaneous velocity of the object when t = 3 . Include units on your answer.

The closing activity of this section asks you to make some connections among average velocity, instantaneous velocity,
and slopes of certain lines.

Activity 1.2.4

For the moving object whose position s at time t is given by the graph below, answer each of the following questions.
Assume that s is measured in feet and t is measured in seconds.

Figure 1.8: Plot of the position function y = s(t) in Activity 1.2.4.
a. Use the graph to estimate the average velocity of the object on each of the following intervals:
[0.5, 1], [1.5, 2.5], [0, 5]. Draw each line whose slope represents the average velocity you seek.

b. How could you use average velocities or slopes of lines to estimate the instantaneous velocity of the object at a
fixed time?
c. Use the graph to estimate the instantaneous velocity of the object when t = 2 . Should this instantaneous velocity at
t = 2 be greater or less than the average velocity on [1.5, 2.5] that you computed in (a)? Why?

Summary
In this section, we encountered the following important ideas:
Limits enable us to examine trends in function behavior near a specific point. In particular, taking a limit at a given
point asks if the function values nearby tend to approach a particular fixed value.
When we write $\lim_{x \to a} f(x) = L$, we read this as saying “the limit of f as x approaches a is L,” and this means that
we can make the value of f(x) as close to L as we want by taking x sufficiently close (but not equal) to a.
If we desire to know $\lim_{x \to a} f(x)$ for a given value of a and a known function f, we can estimate this value from the
graph of f or by generating a table of function values that result from a sequence of x-values that are closer and closer
to a. If we want the exact value of the limit, we need to work with the function algebraically and see if we can use
familiar properties of known, basic functions to understand how different parts of the formula for f change as x → a.
The instantaneous velocity of a moving object at a fixed time is found by taking the limit of average velocities of the
object over shorter and shorter time intervals that all contain the time of interest.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

1.3: The Derivative of a Function at a Point
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How is the average rate of change of a function on a given interval defined, and what does this quantity measure?
How is the instantaneous rate of change of a function at a particular point defined? How is the instantaneous rate of
change linked to average rate of change?
What is the derivative of a function at a given point? What does this derivative value measure? How do we
interpret the derivative value graphically?
How are limits used formally in the computation of derivatives?

Introduction
An idea that sits at the foundations of calculus is the instantaneous rate of change of a function. This rate of change is
always considered with respect to change in the input variable, often at a particular fixed input value. This is a
generalization of the notion of instantaneous velocity and essentially allows us to consider the question “how do we
measure how fast a particular function is changing at a given point?” When the original function represents the position of
a moving object, this instantaneous rate of change is precisely velocity, and might be measured in units such as feet per
second. But in other contexts, instantaneous rate of change could measure the number of cells added to a bacteria culture
per day, the number of additional gallons of gasoline consumed per mile for each additional mile per hour in a car’s
velocity, or the number of dollars added to a mortgage payment for each percentage increase in interest rate. Regardless of
the presence of a physical or practical interpretation of a function, the instantaneous rate of change may also be interpreted
geometrically in connection to the function’s graph, and this connection is also foundational to many of the main ideas in
calculus.
In what follows, we will introduce terminology and notation that makes it easier to talk about the instantaneous rate of
change of a function at a point. In addition, just as instantaneous velocity is defined in terms of average velocity, the more
general instantaneous rate of change will be connected to the more general average rate of change. Recall that for a moving
object with position function s, its average velocity on the time interval t = a to t = a + h is given by the quotient
\[ AV_{[a,a+h]} = \frac{s(a + h) - s(a)}{h}. \tag{1.3.1} \]

In a similar way, we make the following definition for an arbitrary function y = f (x).

Definition 1.2
For a function f, the average rate of change of f on the interval [a, a + h] is given by the value
\[ AV_{[a,a+h]} = \frac{f(a + h) - f(a)}{h}. \tag{1.3.2} \]

Equivalently, if we want to consider the average rate of change of f on [a, b], we compute
\[ AV_{[a,b]} = \frac{f(b) - f(a)}{b - a}. \tag{1.3.3} \]
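
The two formulas in this definition describe the same quantity, with b = a + h. The short Python sketch below illustrates this under our own assumptions: the function names are hypothetical, and the cubic used as f is an arbitrary example that does not come from the text.

```python
def avg_rate_interval(f, a, b):
    """Average rate of change of f on [a, b]: (f(b) - f(a)) / (b - a)."""
    return (f(b) - f(a)) / (b - a)


def avg_rate_step(f, a, h):
    """Average rate of change of f on [a, a + h]: (f(a + h) - f(a)) / h."""
    return (f(a + h) - f(a)) / h


def f(x):
    """Arbitrary illustrative function (not from the text)."""
    return x**3 - 2 * x


# The interval [1, 3] is the same as [1, 1 + 2], so the two forms agree.
print(avg_rate_interval(f, 1, 3))
print(avg_rate_step(f, 1, 2))
```

Here f(1) = −1 and f(3) = 21, so both calls compute 22/2 = 11.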

It is essential to understand how the average rate of change of f on an interval is connected to its graph.

Preview Activity 1.3.1:

Suppose that f is the function given by the graph below and that a and a + h are the input values as labeled on the x-
axis. Use the graph in Figure 1.3.1 to answer the following questions.

Matthew Boelkins, David Austin & Steven


1.3.1 12/15/2021 https://math.libretexts.org/@go/page/4294
Schlicker
Figure 1.3.1: Plot of y = f(x) for Preview Activity 1.3.1.
a. Locate and label the points (a, f (a)) and (a + h, f (a + h)) on the graph.
b. Construct a right triangle whose hypotenuse is the line segment from (a, f (a)) to (a + h, f (a + h)) . What
are the lengths of the respective legs of this triangle?
c. What is the slope of the line that connects the points (a, f (a)) and (a + h, f (a + h)) ?
d. Write a meaningful sentence that explains how the average rate of change of the function on a given interval
and the slope of a related line are connected.

The Derivative of a Function at a Point


Just as we defined instantaneous velocity in terms of average velocity, we now define the instantaneous rate of change of a
function at a point in terms of the average rate of change of the function f over related intervals. In addition, we give a
special name to “the instantaneous rate of change of f at a ,” calling this quantity “the derivative of f at a ,” with this value
being represented by the shorthand notation $f'(a)$. Specifically, we make the following definition.

Definition 1.3
Let f be a function and x = a a value in the function’s domain. We define the derivative of f with respect to x
evaluated at x = a, denoted $f'(a)$, by the formula
\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}, \tag{1.3.4} \]
provided this limit exists. Aloud, we read the symbol $f'(a)$ as either “f-prime at a” or “the derivative of f evaluated at
x = a.” Much of the next several chapters will be devoted to understanding, computing, applying, and interpreting
derivatives. For now, we make the following important notes.


The derivative of f at the value x = a is defined as the limit of the average rate of change of f on the interval
[a, a + h] as h → 0. It is possible for this limit not to exist, so not every function has a derivative at every point.

We say that a function that has a derivative at x = a is differentiable at x = a.

The derivative is a generalization of the instantaneous velocity of a position function: when y = s(t) is a
position function of a moving body, $s'(a)$ tells us the instantaneous velocity of the body at time t = a.

Because the units on $\frac{f(a+h) - f(a)}{h}$ are “units of f per unit of x,” the derivative has these very same units. For
instance, if s measures position in feet and t measures time in seconds, the units on $s'(a)$ are feet per second.

Because the quantity $\frac{f(a+h) - f(a)}{h}$ represents the slope of the line through (a, f(a)) and (a + h, f(a + h)), when
we compute the derivative we are taking the limit of a collection of slopes of lines, and thus the derivative itself
represents the slope of a particularly important line.
While all of the above ideas are important and we will add depth and perspective to them through additional time and
study, for now it is most essential to recognize how the derivative of a function at a given value represents the slope of a
certain line. Thus, we expand upon the last bullet item above.
As we move from an average rate of change to an instantaneous one, we can think of one point as “sliding towards”
another. In particular, provided the function has a derivative at (a, f(a)), the point (a + h, f(a + h)) will approach
(a, f(a)) as h → 0. Because this process of taking a limit is a dynamic one, it can be helpful to use computing technology
to visualize what the limit is accomplishing. While there are many different options³, one of the best is a Java applet in
which the user is able to control the point that is moving. See the examples referenced in the footnote here, or consider
building your own, perhaps using the fantastic free program GeoGebra⁴.
In Figure 1.3.2, we provide a sequence of figures with several different lines through the points (a, f(a)) and
(a + h, f(a + h)) that are generated by different values of h. These lines (shown in the first three figures in magenta) are
often called secant lines to the curve y = f(x). A secant line to a curve is simply a line that passes through two points that
lie on the curve. For each such line, the slope of the secant line is $m = \frac{f(a+h) - f(a)}{h}$, where the value of h depends on the
location of the point we choose. We can see in the diagram how, as h → 0, the secant lines start to approach a single line
that passes through the point (a, f(a)). In the situation where the limit of the slopes of the secant lines exists, we say that
the resulting value is the slope of the tangent line to the curve. This tangent line (shown in the right-most figure in green)
to the graph of y = f(x) at the point (a, f(a)) is the line through (a, f(a)) whose slope is $m = f'(a)$.

Figure 1.3.2 : A sequence of secant lines approaching the tangent line to f at (a, f (a)).
As we will see in subsequent study, the existence of the tangent line at x = a is connected to whether or not the function f
looks like a straight line when viewed up close at (a, f (a)), which can also be seen in Figure 1.3.3, where we combine the
four graphs in Figure 1.3.2 into the single one on the left, and then we zoom in on the box centered at (a, f (a)), with that
view expanded on the right (with two of the secant lines omitted). Note how the tangent line sits relative to the curve
y = f (x) at (a, f (a)) and how closely it resembles the curve near x = a .
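
The way secant lines settle toward a tangent line can also be explored numerically. The following Python sketch is an illustration under our own assumptions: the helper names are hypothetical, and the curve y = x² with a = 1 is an arbitrary example rather than the function pictured in the figures.

```python
def secant_slope(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def line_through(point, slope):
    """Return the function x -> y for the line with the given slope through point."""
    x0, y0 = point
    return lambda x: y0 + slope * (x - x0)


def f(x):
    """Arbitrary illustrative curve (not the one in the figures)."""
    return x**2


a = 1
for h in [1, 0.5, 0.1, 0.01]:
    m = secant_slope(f, a, h)
    secant = line_through((a, f(a)), m)
    # By construction the secant line meets the curve again at x = a + h.
    print(f"h = {h:<5} slope = {m:.4f}  secant(a + h) = {secant(a + h):.4f}  f(a + h) = {f(a + h):.4f}")
```

For this curve the secant slope works out to 2 + h, so the slopes approach 2 as h shrinks, and the secant lines approach the line through (1, 1) with slope 2.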

³ For a helpful collection of Java applets, consider the work of David Austin of Grand Valley State University at
http://gvsu.edu/s/5r, and the particularly relevant example at http://gvsu.edu/s/5s. For applets that have been built in
Geogebra, a nice example is the work of Marc Renault of Shippensburg University at http://gvsu.edu/s/5p, with the
example at http://gvsu.edu/s/5q being especially fitting for our work in this section. There are scores of other examples
posted by other authors on the internet.
⁴ Available for free download from http://geogebra.org.

Figure 1.3.3: A sequence of secant lines approaching the tangent line to f at (a, f(a)). At right, we zoom in on the point
(a, f(a)). The slope of the tangent line (in green) to f at (a, f(a)) is given by $f'(a)$.

At this time, it is most important to note that $f'(a)$, the instantaneous rate of change of f with respect to x at x = a, also
measures the slope of the tangent line to the curve y = f(x) at (a, f(a)). The following example demonstrates several key
ideas involving the derivative of a function.

Example 1.3.1:

For the function given by $f(x) = x - x^2$, use the limit definition of the derivative to compute $f'(2)$. In
addition, discuss the meaning of this value and draw a labeled graph that supports your explanation.
Solution. From the limit definition, we know that
\[ f'(2) = \lim_{h \to 0} \frac{f(2 + h) - f(2)}{h}. \tag{1.3.5} \]
Now we use the rule for $f$, and observe that $f(2) = 2 - 2^2 = -2$ and $f(2 + h) = (2 + h) - (2 + h)^2$. Substituting
these values into the limit definition, we have that
\[ f'(2) = \lim_{h \to 0} \frac{(2 + h) - (2 + h)^2 - (-2)}{h}. \tag{1.3.6} \]

Figure 1.3.4: The tangent line to $y = x - x^2$ at the point (2, −2).

Observe that with $h$ in the denominator and our desire to let $h \to 0$, we have to wait to take the limit (that is, we wait
to actually let $h$ approach 0). Thus, we do additional algebra. Expanding and distributing in the numerator,
\[ f'(2) = \lim_{h \to 0} \frac{2 + h - 4 - 4h - h^2 + 2}{h}. \tag{1.3.7} \]

Combining like terms, we have
\[ f'(2) = \lim_{h \to 0} \frac{-3h - h^2}{h}. \tag{1.3.8} \]

Next, we observe that there is a common factor of $h$ in both the numerator and denominator, which allows us to
simplify and find that
\[ f'(2) = \lim_{h \to 0} (-3 - h). \tag{1.3.9} \]
Finally, we are able to take the limit as $h \to 0$, and thus conclude that $f'(2) = -3$.
Now, we know that $f'(2)$ represents the slope of the tangent line to the curve $y = x - x^2$ at the point (2, −2); $f'(2)$ is
also the instantaneous rate of change of $f$ at the point (2, −2). Graphing both the function and the line through (2, −2)
with slope $m = f'(2) = -3$, we indeed see that by calculating the derivative, we have found the slope of the tangent
line at this point, as shown in Figure 1.3.4.
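
The algebra in this example can be sanity-checked numerically. The Python sketch below is only an illustration, with helper names of our own choosing. It compares the difference quotient for f(x) = x − x² at a = 2 with the simplified expression −3 − h, and then compares the curve with the line through (2, −2) of slope −3 at a nearby input.

```python
def f(x):
    """The function from Example 1.3.1: f(x) = x - x^2."""
    return x - x**2


def difference_quotient(f, a, h):
    """The quotient (f(a + h) - f(a)) / h whose limit as h -> 0 defines f'(a)."""
    return (f(a + h) - f(a)) / h


def tangent_line(x):
    """Line through (2, -2) with slope -3, the value found for f'(2)."""
    return -2 - 3 * (x - 2)


# The quotient should match -3 - h for every nonzero h, up to rounding.
for h in [0.1, 0.01, 0.001, -0.001]:
    dq = difference_quotient(f, 2, h)
    print(f"h = {h:>6}: quotient = {dq:.6f}   -3 - h = {-3 - h:.6f}")

# Near x = 2 the tangent line stays very close to the curve.
print(f(2.01), tangent_line(2.01))
```

The quotients approach −3 as h shrinks, and at x = 2.01 the curve and the line differ by only about 0.0001, in keeping with the picture of the tangent line hugging the curve near the point of tangency.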

The following activities will help you explore a variety of key ideas related to derivatives.

Activity 1.3.2:

Consider the function f whose formula is f (x) = 3 − 2x .


a. What familiar type of function is f ? What can you say about the slope of f at every value of x?
b. Compute the average rate of change of f on the intervals [1, 4], [3, 7], and [5, 5 + h] ; simplify each result as
much as possible. What do you notice about these quantities?
c. Use the limit definition of the derivative to compute the exact instantaneous rate of change of f with respect
to x at the value a = 1. That is, compute $f'(1)$ using the limit definition. Show your work. Is your result
surprising?
d. Without doing any additional computations, what are the values of $f'(2)$, $f'(\pi)$, and $f'(-\sqrt{2})$? Why?

Activity 1.3.3:

A water balloon is tossed vertically in the air from a window. The balloon’s height in feet at time t in seconds after
being launched is given by $s(t) = -16t^2 + 16t + 32$. Use this function to respond to each of the following questions.

a. Sketch an accurate, labeled graph of s on the axes provided in Figure 1.3.5. You should be able to do this
without using computing technology.

Figure 1.3.5: Axes for plotting y = s(t) in Activity 1.3.3.

b. Compute the average rate of change of s on the time interval [1, 2]. Include units on your answer and
write one sentence to explain the meaning of the value you found.
c. Use the limit definition to compute the instantaneous rate of change of s with respect to time, t, at the
instant a = 1. Show your work using proper notation, include units on your answer, and write one
sentence to explain the meaning of the value you found.
d. On your graph in (a), sketch two lines: one whose slope represents the average rate of change of s on
[1, 2], the other whose slope represents the instantaneous rate of change of s at the instant a = 1. Label
each line clearly.
e. For what values of a do you expect $s'(a)$ to be positive? Why? Answer the same questions when
“positive” is replaced by “negative” and “zero.”

Activity 1.3.4:

A rapidly growing city in Arizona has its population P at time t , where t is the number of decades after the year 2010,
modeled by the formula $P(t) = 25000e^{t/5}$. Use this function to respond to the following questions.

a. Sketch an accurate graph of P for t = 0 to t = 5 on the axes provided in Figure 1.3.6. Label the scale on the
axes carefully.

Figure 1.3.6: Axes for plotting y = P(t) in Activity 1.3.4.
b. Compute the average rate of change of P between 2030 and 2050. Include units on your answer and
write one sentence to explain the meaning (in everyday language) of the value you found.
c. Use the limit definition to write an expression for the instantaneous rate of change of P with respect to
time, t, at the instant a = 2. Explain why this limit is difficult to evaluate exactly.
d. Estimate the limit in (c) for the instantaneous rate of change of P at the instant a = 2 by using several
small h values. Once you have determined an accurate estimate of $P'(2)$, include units on your
answer, and write one sentence (using everyday language) to explain the meaning of the value you found.
e. On your graph above, sketch two lines: one whose slope represents the average rate of change of P on
[2, 4], the other whose slope represents the instantaneous rate of change of P at the instant a = 2.
f. In a carefully-worded sentence, describe the behavior of $P'(a)$ as a increases in value. What does this
reflect about the behavior of the given function P?

Summary
In this section, we encountered the following important ideas:
The average rate of change of a function f on the interval [a, b] is $\frac{f(b) - f(a)}{b - a}$. The units on the average rate of change are
units of f per unit of x, and the numerical value of the average rate of change represents the slope of the secant line
between the points (a, f(a)) and (b, f(b)) on the graph of y = f(x). If we view the interval as being [a, a + h] instead
of [a, b], the meaning is still the same, but the average rate of change is now computed by $\frac{f(a+h) - f(a)}{h}$.
The instantaneous rate of change with respect to x of a function f at a value x = a is denoted $f'(a)$ (read “the
derivative of f evaluated at a” or “f-prime at a”) and is defined by the formula
\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}, \tag{1.3.10} \]
provided the limit exists. Note particularly that the instantaneous rate of change at x = a is the limit of the average rate of
change on [a, a + h] as h → 0.
Provided the derivative $f'(a)$ exists, its value tells us the instantaneous rate of change of f with respect to x at x = a,
which geometrically is the slope of the tangent line to the curve y = f(x) at the point (a, f(a)). We even say that
$f'(a)$ is the slope of the curve y = f(x) at the point (a, f(a)).

Limits are the link between average rate of change and instantaneous rate of change: they allow us to move from the
rate of change over an interval to the rate of change at a single point.
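The passage from average to instantaneous rate of change can be seen numerically. The following is a minimal Python sketch, assuming the illustrative function f(x) = x² and the point a = 1 (neither is taken from this section); the helper name `average_rate_of_change` is hypothetical.

```python
def average_rate_of_change(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def f(x):
    # Illustrative function; an assumption made for this sketch.
    return x ** 2


a = 1.0
step_sizes = [0.1, 0.01, 0.001, -0.001, -0.01, -0.1]
rates = {h: average_rate_of_change(f, a, h) for h in step_sizes}

for h, rate in rates.items():
    print(f"h = {h:>7}: average rate of change on [a, a + h] = {rate:.4f}")
```

For this choice of f, the printed values cluster around 2 as h shrinks from either side, which is what we would expect if the limit defining the derivative at a = 1 equals 2.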

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

1.4: The Derivative Function

Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How does the limit definition of the derivative of a function f lead to an entirely new (but related) function $f'$?
What is the difference between writing $f'(a)$ and $f'(x)$?
How is the graph of the derivative function $f'(x)$ connected to the graph of f(x)?
What are some examples of functions f for which $f'$ is not defined at one or more points?

Given a function y = f(x), we now know that if we are interested in the instantaneous rate of change of the function at
x = a, or equivalently the slope of the tangent line to y = f(x) at x = a, we can compute the value $f'(a)$. In all of our
examples to date, we have arbitrarily identified a particular value of a as our point of interest: a = 1, a = 3, etc. But it is
not hard to imagine that we will often be interested in the derivative value for more than just one a-value, and possibly for
many of them. In this section, we explore how we can move from computing simply $f'(1)$ or $f'(3)$ to working more
generally with $f'(a)$, and indeed $f'(x)$. Said differently, we will work toward understanding how the so-called process of
“taking the derivative” generates a new function that is derived from the original function y = f(x). The following
preview activity starts us down this path.

Preview Activity 1.4.1

Consider the function $f(x) = 4x - x^2$.

a. Use the limit definition to compute the following derivative values: $f'(0)$, $f'(1)$, $f'(2)$, and $f'(3)$.

b. Observe that the work to find $f'(a)$ is the same, regardless of the value of a. Based on your work in (a),
what do you conjecture is the value of $f'(4)$? How about $f'(5)$? (Note: you should not use the limit
definition of the derivative to find either value.)

c. Conjecture a formula for $f'(a)$ that depends only on the value a. That is, in the same way that we have a
formula for f(x) (recall $f(x) = 4x - x^2$), see if you can use your work above to guess a formula for $f'(a)$
in terms of a.

The Derivative is Itself a Function


In your work in Preview Activity 1.4 with $f(x) = 4x - x^2$, you may have found several patterns. One comes from
observing that $f'(0) = 4$, $f'(1) = 2$, $f'(2) = 0$, and $f'(3) = -2$. That sequence of values leads us naturally to conjecture
that $f'(4) = -4$ and $f'(5) = -6$. Even more than these individual numbers, if we consider the role of 0, 1, 2, and 3 in the
process of computing the value of the derivative through the limit definition, we observe that the particular number has
very little effect on our work. To see this more clearly, we compute $f'(a)$, where a represents a number to be named later.
Following the now standard process of using the limit definition of the derivative,

\begin{align}
f'(a) &= \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \tag{1.4.1} \\
&= \lim_{h \to 0} \frac{4(a+h) - (a+h)^2 - (4a - a^2)}{h} \tag{1.4.2} \\
&= \lim_{h \to 0} \frac{4a + 4h - a^2 - 2ha - h^2 - 4a + a^2}{h} \tag{1.4.3} \\
&= \lim_{h \to 0} \frac{4h - 2ha - h^2}{h} \tag{1.4.4} \\
&= \lim_{h \to 0} \frac{h(4 - 2a - h)}{h} \tag{1.4.5} \\
&= \lim_{h \to 0} (4 - 2a - h). \tag{1.4.6}
\end{align}

Here we observe that neither 4 nor 2a depends on the value of h, so as

$$h \to 0, \quad (4 - 2a - h) \to (4 - 2a). \tag{1.4.7}$$

Thus, $f'(a) = 4 - 2a$.
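As a sanity check on the algebra, the difference quotient can be evaluated with a small nonzero h and compared against 4 − 2a. This is only a numerical sketch: the step size `h = 1e-6` and the helper names are choices made here, not part of the text.

```python
def f(x):
    return 4 * x - x ** 2


def difference_quotient(a, h):
    """(f(a + h) - f(a)) / h, the quantity inside the limit."""
    return (f(a + h) - f(a)) / h


def derivative_formula(a):
    """The formula obtained from the limit computation: f'(a) = 4 - 2a."""
    return 4 - 2 * a


h = 1e-6  # small but nonzero step; an arbitrary choice for this sketch
for a in range(6):
    approx = difference_quotient(a, h)
    exact = derivative_formula(a)
    print(f"a = {a}: difference quotient = {approx:.5f}, 4 - 2a = {exact}")
```

Because the simplified quotient is 4 − 2a − h, the two columns should differ by roughly h at every value of a, independent of which a is chosen.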

This observation is consistent with the specific values we found above: e.g., $f'(3) = 4 - 2(3) = -2$. And indeed, our
work with a confirms that while the particular value of a at which we evaluate the derivative affects the value of the
derivative, that value has almost no bearing on the process of computing the derivative. We note further that the letter
being used is immaterial: whether we call it a, x, or anything else, the derivative at a given value is simply given by “4
minus 2 times the value.” We choose to use x for consistency with the original function given by y = f(x), as well as for
the purpose of graphing the derivative function, and thus we have found that for the function $f(x) = 4x - x^2$, it follows
that $f'(x) = 4 - 2x$.

Because the value of the derivative function is so closely linked to the graphical behavior of the original function, it makes
sense to look at both of these functions plotted on the same domain. In Figure 1.18, on the left we show a plot of
$f(x) = 4x - x^2$ together with a selection of tangent lines at the points we’ve considered above. On the right, we show a
plot of $f'(x) = 4 - 2x$ with emphasis on the heights of the derivative graph at the same selection of points. Notice the
connection between colors in the left and right graphs: the green tangent line on the original graph is tied to the green point
on the right graph in the following way: the slope of the tangent line at a point on the lefthand graph is the
same as the height at the corresponding point on the righthand graph. That is, at each respective value of x, the slope of the
tangent line to the original function at that x-value is the same as the height of the derivative function at that x-value. Do
note, however, that the units on the vertical axes are different: in the left graph, the vertical units are simply the output
units of f. On the righthand graph of $y = f'(x)$, the units on the vertical axis are units of f per unit of x.

Figure 1.18: The graphs of $f(x) = 4x - x^2$ (at left) and $f'(x) = 4 - 2x$ (at right). Slopes on the graph of f correspond
to heights on the graph of $f'$.

Of course, this relationship between the graph of a function y = f(x) and its derivative is a dynamic one. An excellent
way to explore how the graph of f(x) generates the graph of $f'(x)$ is through a Java applet. See, for instance, the applets
at http://gvsu.edu/s/5C or http://gvsu.edu/s/5D, via the sites of Austin and Renault (see footnote 5).


In Section 1.3 when we first defined the derivative, we wrote the definition in terms of a value a to find $f'(a)$. As we have
seen above, the letter a is merely a placeholder, and it often makes more sense to use x instead. For the record, here we
restate the definition of the derivative.

Definition 1.4
Let f be a function and x a value in the function’s domain. We define the derivative of f with respect to x at the value
x, denoted $f'(x)$, by the formula $f'(x) = \lim_{h \to 0} \frac{f(x+h)-f(x)}{h}$, provided this limit exists.
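One way to make concrete the idea that the derivative is itself a function is a routine that takes a function f and returns another function. The sketch below is an approximation only: it fixes a small h rather than taking a limit, and the name `numerical_derivative` and the default `h=1e-6` are assumptions made for illustration.

```python
def numerical_derivative(f, h=1e-6):
    """Return a new function that approximates f' using a small fixed h."""
    def f_prime(x):
        # Difference quotient from the definition, with h held fixed.
        return (f(x + h) - f(x)) / h
    return f_prime


def f(x):
    return 4 * x - x ** 2


f_prime = numerical_derivative(f)  # f_prime is a function, not a number
for x in [0, 1, 2, 3]:
    print(f"x = {x}: approximate f'(x) = {f_prime(x):.4f}")
```

The same `f_prime` can be evaluated at any x in the domain, which mirrors the shift from computing a single value such as the derivative at 1 to working with a rule that applies at every x.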

5. David Austin, http://gvsu.edu/s/5r; Marc Renault, http://gvsu.edu/s/5p.

We may now take two different perspectives on thinking about the derivative function: given a graph of y = f(x), how
does this graph lead to the graph of the derivative function $y = f'(x)$? And given a formula for y = f(x), how does the
limit definition of the derivative generate a formula for $y = f'(x)$? Both of these issues are explored in the following
activities.

Exercise 1.4.1

For each given graph of y = f(x), sketch an approximate graph of its derivative function, $y = f'(x)$, on the axes
immediately below. The scale of the grid for the graph of f is 1 × 1; assume the horizontal scale of the grid for the
graph of $f'$ is identical to that for f. If necessary, adjust and label the vertical scale on the axes for $f'$.


When you are finished with all 8 graphs, write several sentences that describe your overall process for sketching the
graph of the derivative function, given the graph of the original function. What are the values of the derivative function
that you tend to identify first? What do you do thereafter? How do key traits of the graph of the derivative function
exemplify properties of the graph of the original function? For a dynamic investigation that allows you to
experiment with graphing $f'$ when given the graph of f, see http://gvsu.edu/s/8y (footnote 6).

6. Marc Renault, Calculus Applets Using Geogebra.

Now, recall the opening example of this section: we began with the function $y = f(x) = 4x - x^2$ and used the limit
definition of the derivative to show that $f'(a) = 4 - 2a$, or equivalently that $f'(x) = 4 - 2x$. We subsequently
graphed the functions f and $f'$ as shown in Figure 1.18. Following Activity 1.10, we now understand that we could
have constructed a fairly accurate graph of $f'(x)$ without knowing a formula for either f or $f'$. At the same time, it is
ideal to know a formula for the derivative function whenever it is possible to find one. In the next activity, we further
explore the more algebraic approach to finding $f'(x)$: given a formula for y = f(x), the limit definition of the
derivative will be used to develop a formula for $f'(x)$.


Activity 1.4.2

For each of the listed functions, determine a formula for the derivative function. For the first two, determine the
formula for the derivative by thinking about the nature of the given function and its slope at various points; do not use
the limit definition. For the latter four, use the limit definition. Pay careful attention to the function names and
independent variables. It is important to be comfortable with using letters other than f and x. For example, given a
function p(z), we call its derivative $p'(z)$.

a. $f(x) = 1$
b. $g(t) = t$
c. $p(z) = z^2$
d. $q(s) = s^3$
e. $F(t) = \frac{1}{t}$
f. $G(y) = \sqrt{y}$

Summary
In this section, we encountered the following important ideas:
The limit definition of the derivative, $f'(x) = \lim_{h \to 0} \frac{f(x+h)-f(x)}{h}$, produces a value for each x at which the derivative
is defined, and this leads to a new function whose formula is $y = f'(x)$. Hence we talk both about a given function f
and its derivative $f'$. It is especially important to note that taking the derivative is a process that starts with a given
function (f) and produces a new, related function ($f'$).
There is essentially no difference between writing $f'(a)$ (as we did regularly in Section 1.3) and writing $f'(x)$. In
either case, the variable is just a placeholder that is used to define the rule for the derivative function.
Given the graph of a function y = f(x), we can sketch an approximate graph of its derivative $y = f'(x)$ by observing
that heights on the derivative’s graph correspond to slopes on the original function’s graph.
In Activity 1.10, we encountered some functions that had sharp corners on their graphs, such as the shifted absolute
value function. At such points, the derivative fails to exist, and we say that f is not differentiable there. For now, it
suffices to understand this as a consequence of the jump that must occur in the derivative function at a sharp corner on
the graph of the original function.
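The failure of the derivative at a sharp corner can be illustrated numerically by comparing difference quotients with h > 0 and h < 0. The sketch below assumes a specific shifted absolute value function, g(x) = |x − 1|, with its corner at x = 1; the particular shift and the helper name `one_sided_slope` are assumptions, not taken from the activity.

```python
def g(x):
    # A shifted absolute value function; the shift of 1 is an assumption.
    return abs(x - 1)


def one_sided_slope(f, a, h):
    """Difference quotient (f(a + h) - f(a)) / h for a given nonzero h."""
    return (f(a + h) - f(a)) / h


corner = 1.0
steps = (0.1, 0.01, 0.001)
right_slopes = [one_sided_slope(g, corner, h) for h in steps]    # h > 0
left_slopes = [one_sided_slope(g, corner, -h) for h in steps]    # h < 0

print("slopes using h > 0:", right_slopes)
print("slopes using h < 0:", left_slopes)
```

The quotients with h > 0 stay near 1 while those with h < 0 stay near −1, so they do not approach a common value as h → 0. That disagreement is the jump in the derivative described above.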

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

1.5: Interpreting, Estimating, and Using the Derivative
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
In contexts other than the position of a moving object, what does the derivative of a function measure?
What are the units on the derivative function $f'$, and how are they related to the units of the original function f?

What is a central difference, and how can one be used to estimate the value of the derivative at a point from given
function data?
Given the value of the derivative of a function at a point, what can we infer about how the value of the function
changes nearby?

Introduction
An interesting and powerful feature of mathematics is that it can often be thought of both in abstract terms and in applied
ones. For instance, calculus can be developed almost entirely as an abstract collection of ideas that focus on properties of
arbitrary functions. At the same time, calculus can also be very directly connected to our experience of physical reality by
considering functions that represent meaningful processes. We have already seen that for a position function y = s(t) , say
for a ball being tossed straight up in the air, the ball’s velocity at time t is given by $v(t) = s'(t)$, the derivative of the
position function.
Further, recall that if s(t) is measured in feet and t in seconds, the units on $v(t) = s'(t)$ are feet per second. In what follows in
this section, we investigate several different functions, each with specific physical meaning, and think about how the units
on the independent variable, dependent variable, and the derivative function add to our understanding. To start, we
consider the familiar problem of a position function of a moving object.

Preview Activity 1.5.1

One of the longest stretches of straight (and flat) road in North America can be found on the Great Plains in the state of
North Dakota on state highway 46, which lies just south of the interstate highway I-94 and runs through the town of
Gackle. A car leaves town (at time t = 0 ) and heads east on highway 46; its position in miles from Gackle at time t in
minutes is given by the graph of the function in Figure 1.22. Three important points are labeled on the graph; where
the curve looks linear, assume that it is indeed a straight line.

Figure 1.22: The graph of y = s(t) , the position of the car along highway 46, which tells its distance in miles from
Gackle, ND, at time t in minutes.
a. In everyday language, describe the behavior of the car over the provided time interval. In particular, discuss
what is happening on the time intervals [57, 68] and [68, 104].
b. Find the slope of the line between the points (57, 63.8) and (104, 106.8). What are the units on this slope?
What does the slope represent?
c. Find the average rate of change of the car’s position on the interval [68, 104]. Include units on your answer.
d. Estimate the instantaneous rate of change of the car’s position at the moment t = 80 . Write a sentence to
explain your reasoning and the meaning of this value.

Units of the derivative function
As we now know, the derivative of the function f at a fixed value x is given by

$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}, \tag{1.5.1}$$

and this value has several different interpretations. If we set x = a, one meaning of $f'(a)$ is the slope of the tangent line at
the point (a, f(a)).
In alternate notation, we also sometimes equivalently write $\frac{df}{dx}$ or $\frac{dy}{dx}$ instead of $f'(x)$, and these notations help us to
further see the units (and thus the meaning) of the derivative as it is viewed as the instantaneous rate of change of f with
respect to x. Note that the units on the slope of the secant line, $\frac{f(x+h)-f(x)}{h}$, are “units of f per unit of x.” Thus, when we
take the limit to get $f'(x)$, we get these same units on the derivative $f'(x)$: units of f per unit of x. Regardless of the
function f under consideration (and regardless of the variables being used), it is helpful to remember that the units on the
derivative function are “units of output per unit of input,” in terms of the input and output of the original function.
For example, say that we have a function y = P(t), where P measures the population of a city (in thousands) at the start
of year t (where t = 0 corresponds to 2010 AD), and we are told that $P'(2) = 21.37$. What is the meaning of this value?
Well, since P is measured in thousands and t is measured in years, we can say that the instantaneous rate of change of the
city’s population with respect to time at the start of 2012 is 21.37 thousand people per year. We therefore expect that in the
coming year, about 21,370 people will be added to the city’s population.

Toward more accurate derivative estimates


It is also helpful to recall, as we first experienced in Section 1.3, that when we
want to estimate the value of $f'(x)$ at a given x, we can use the difference quotient $\frac{f(x+h)-f(x)}{h}$ with a relatively small
value of h. In doing so, we should use both positive and negative values of h in order to make sure we account for the
behavior of the function on both sides of the point of interest. To that end, we consider the following brief example to
demonstrate the notion of a central difference and its role in estimating derivatives.

Example 1.5.1

Suppose that y = f(x) is a function for which three values are known: f(1) = 2.5, f(2) = 3.25, and f(3) = 3.625.
Estimate $f'(2)$.

Solution.
We know that $f'(2) = \lim_{h \to 0} \frac{f(2+h)-f(2)}{h}$. But since we have neither a graph of y = f(x) nor a formula for the
function, we can neither sketch a tangent line nor evaluate the limit exactly. We can’t even use smaller and smaller
values of h to estimate the limit. Instead, we have just two choices: using h = −1 or h = 1, depending on which point
we pair with (2, 3.25). So, one estimate is

$$f'(2) \approx \frac{f(1) - f(2)}{1 - 2} = \frac{2.5 - 3.25}{-1} = 0.75. \tag{1.5.2}$$

The other is

$$f'(2) \approx \frac{f(3) - f(2)}{3 - 2} = \frac{3.625 - 3.25}{1} = 0.375. \tag{1.5.3}$$

Since the first approximation looks only backward from the point (2, 3.25) and the second approximation looks only
forward from (2, 3.25), it makes sense to average these two values in order to account for behavior on both sides of the
point of interest. Doing so, we find that

$$f'(2) \approx \frac{0.75 + 0.375}{2} = 0.5625. \tag{1.5.4}$$
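The arithmetic in this example is short enough to reproduce directly. The sketch below uses only the three data values given in the example; the variable names are chosen here for illustration.

```python
# Known data from the example: f(1) = 2.5, f(2) = 3.25, f(3) = 3.625.
values = {1: 2.5, 2: 3.25, 3: 3.625}

# Backward difference (h = -1) and forward difference (h = 1) at x = 2.
backward = (values[1] - values[2]) / (1 - 2)
forward = (values[3] - values[2]) / (3 - 2)

# Average of the two one-sided estimates.
average = (backward + forward) / 2

print(backward, forward, average)  # prints: 0.75 0.375 0.5625
```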

The intuitive approach of averaging the two estimates found in Example 1.4 in fact gives the best possible estimate of $f'(2)$
when we have just two function values for f on opposite sides of the point of interest. To see why, we think about the
diagram in Figure 1.23, which shows a possible function y = f(x) that satisfies the data given in Example 1.4. On the left,
we see the two secant lines with slopes that come from computing the backward difference $\frac{f(1)-f(2)}{1-2} = 0.75$ and the
forward difference $\frac{f(3)-f(2)}{3-2} = 0.375$. Note how the first such line’s slope overestimates the slope of the tangent line
at (2, f(2)), while the second line’s slope underestimates $f'(2)$. On the right, however, we see the secant line whose slope
is given by the central difference

$$\frac{f(3) - f(1)}{3 - 1} = \frac{3.625 - 2.5}{2} = \frac{1.125}{2} = 0.5625. \tag{1.5.5}$$

Note that this central difference has exactly the same value as the average of the forward difference and backward difference
(and it is straightforward to explain why this always holds), and moreover that the central difference yields a very good
approximation to the derivative’s value, in part because the secant line that uses both a point before and after the point of
tangency yields a line that is closer to being parallel to the tangent line.

Figure 1.23: At left, the graph of y = f(x) along with the secant line through (1, 2.5) and (2, 3.25), the secant line through
(2, 3.25) and (3, 3.625), as well as the tangent line. At right, the same graph along with the secant line through (1, 2.5) and
(3, 3.625), plus the tangent line.
In general, the central difference approximation to the value of the first derivative is given by

$$f'(a) \approx \frac{f(a+h) - f(a-h)}{2h}, \tag{1.5.6}$$

and this quantity measures the slope of the secant line to y = f(x) through the points (a − h, f(a − h)) and
(a + h, f(a + h)). Anytime we have symmetric data surrounding a point at which we desire to estimate the derivative, the
central difference is an ideal choice for so doing.
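The claim that the central difference always equals the average of the forward and backward differences (for the same step h > 0) can be verified with one line of algebra. The following worked equation is a sketch of that check, not a step taken from the text:

```latex
\frac{1}{2}\left[\frac{f(a+h)-f(a)}{h} + \frac{f(a-h)-f(a)}{-h}\right]
  = \frac{1}{2}\cdot\frac{\bigl(f(a+h)-f(a)\bigr) + \bigl(f(a)-f(a-h)\bigr)}{h}
  = \frac{f(a+h)-f(a-h)}{2h}.
```

The two copies of f(a) cancel, so the value of the function at the point of interest plays no role in the central difference.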


The following activities will further explore the meaning of the derivative in several different contexts while also viewing
the derivative from graphical, numerical, and algebraic perspectives.

Activity 1.5.2

A potato is placed in an oven, and the potato’s temperature F (in degrees Fahrenheit) at various points in time is taken
and recorded in the following table. Time t is measured in minutes.

t (min)    F(t) (°F)
0          70
15         180.5
30         251
45         296
60         324.5
75         342.8
90         354.5

a. Use a central difference to estimate the instantaneous rate of change of the temperature of the potato at
t = 30 . Include units on your answer.

b. Use a central difference to estimate the instantaneous rate of change of the temperature of the potato at
t = 60 . Include units on your answer.

c. Without doing any calculation, which do you expect to be greater: $F'(75)$ or $F'(90)$? Why?

d. Suppose it is given that $F(64) = 330.28$ and $F'(64) = 1.341$. What are the units on these two quantities?
What do you expect the temperature of the potato to be when t = 65? When t = 66? Why?
e. Write a couple of careful sentences that describe the behavior of the temperature of the potato on the time
interval [0, 90], as well as the behavior of the instantaneous rate of change of the temperature of the potato on
the same time interval.

Activity 1.5.3

A company manufactures rope, and the total cost of producing r feet of rope is C (r) dollars.
a. What does it mean to say that C (2000) = 800?
b. What are the units of $C'(r)$?

c. Suppose that $C(2000) = 800$ and $C'(2000) = 0.35$. Estimate $C(2100)$, and justify your estimate by writing
at least one sentence that explains your thinking.

d. Which of the following statements do you think is true, and why?
1. $C'(2000) < C'(3000)$
2. $C'(2000) = C'(3000)$
3. $C'(2000) > C'(3000)$

e. Suppose someone claims that $C'(5000) = -0.1$. What would the practical meaning of this derivative value
tell you about the approximate cost of the next foot of rope? Is this possible? Why or why not?

Activity 1.5.4

Researchers at a major car company have found a function that relates gasoline consumption to speed for a particular
model of car. In particular, they have determined that the consumption C , in liters per kilometer, at a given speed s ,
is given by a function C = f (s) , where s is the car’s speed in kilometers per hour.
a. Data provided by the car company tells us that f (80) = 0.015, f (90) = 0.02, and f (100) = 0.027. Use this
information to estimate the instantaneous rate of change of fuel consumption with respect to speed at s = 90 .
Be as accurate as possible, use proper notation, and include units on your answer.
b. By writing a complete sentence, interpret the meaning (in the context of fuel consumption) of “f(80) = 0.015.”

c. Write at least one complete sentence that interprets the meaning of the value of $f'(90)$ that you estimated in
(a).

In Section 1.4, we learned how to use the graph of a given function f to plot the graph of its derivative, $f'$. It is important
to remember that when we do so, not only does the scale on the vertical axis often have to change to accurately represent
$f'$, but the units on that axis also differ. For example, suppose that $P(t) = 400 - 330e^{-0.03t}$ tells us the temperature in
degrees Fahrenheit of a potato in an oven at time t in minutes. In Figure 1.24, we sketch the graph of P on the left and the
graph of $P'$ on the right.

Figure 1.24: Plot of $P(t) = 400 - 330e^{-0.03t}$ at left, and its derivative $P'(t)$ at right.
Note how the vertical scales differ not only in size but also in units: the units of P are °F, while those of $P'$
are °F/min. In all cases where we work with functions that have an applied context, it is helpful and instructive to think
carefully about the units involved and how they further inform the meaning of our computations.
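To see the two sets of units side by side, the sketch below tabulates the potato temperature P(t) = 400 − 330e^(−0.03t) and a central-difference estimate of its rate of change at a few times. The step `h = 0.001` minutes, the sample times, and the helper name `central_difference` are assumptions made for illustration.

```python
import math


def P(t):
    """Potato temperature in degrees Fahrenheit at time t in minutes."""
    return 400 - 330 * math.exp(-0.03 * t)


def central_difference(f, a, h):
    """Slope of the secant line through (a - h, f(a - h)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a - h)) / (2 * h)


h = 0.001  # minutes; an arbitrary small step chosen for this sketch
for t in [0, 30, 60, 90]:
    temperature = P(t)                    # units: degrees F
    rate = central_difference(P, t, h)    # units: degrees F per minute
    print(f"t = {t:2d} min: P = {temperature:6.1f} F, rate ~ {rate:5.2f} F/min")
```

The temperature column grows toward 400 while the rate column shrinks toward 0, which is why the two graphs need different vertical scales as well as different units.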

Summary
In this section, we encountered the following important ideas:
Regardless of the context of a given function y = f (x), the derivative always measures the instantaneous rate of
change of the output variable with respect to the input variable.
The units on the derivative function $y = f'(x)$ are units of f per unit of x. Again, this measures how fast the output of
the function f changes when the input of the function changes.
The central difference approximation to the value of the first derivative is given by

$$f'(a) \approx \frac{f(a+h) - f(a-h)}{2h}, \tag{1.5.7}$$

and this quantity measures the slope of the secant line to y = f(x) through the points (a − h, f(a − h)) and
(a + h, f(a + h)). The central difference generates a good approximation of the derivative’s value any time we have
symmetric data surrounding a point of interest.
Knowing the derivative and function values at a single point enables us to estimate other function values nearby. If, for
example, we know that $f'(7) = 2$, then we know that at x = 7, the function f is increasing at an instantaneous rate of 2
units of output for every one unit of input. Thus, we expect f(8) to be approximately 2 units greater than f(7). The
value is approximate because we don’t know that the rate of change stays the same as x changes.
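The last idea amounts to a one-line estimate: start at the known function value and add the rate of change times the change in input. The sketch below uses the derivative value 2 at x = 7 from the summary, but the function value f(7) = 10 is a hypothetical number introduced only so the code can run; `estimate_nearby` is likewise a made-up helper name.

```python
def estimate_nearby(f_a, fprime_a, dx):
    """Estimate f(a + dx) from f(a) and f'(a), assuming the rate stays near f'(a)."""
    return f_a + fprime_a * dx


f_7 = 10.0       # hypothetical value of f(7); not given in the text
fprime_7 = 2.0   # instantaneous rate of change at x = 7, as in the summary

estimate_f_8 = estimate_nearby(f_7, fprime_7, 1.0)
estimate_f_7_5 = estimate_nearby(f_7, fprime_7, 0.5)
print(estimate_f_8, estimate_f_7_5)  # prints: 12.0 11.0
```

The estimate is likely to be better for the smaller step of 0.5 than for the full step of 1, since the rate of change has less room to drift away from 2.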

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

1.6: The Second Derivative
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How does the derivative of a function tell us whether the function is increasing or decreasing at a point or on an
interval?
What can we learn by taking the derivative of the derivative (to achieve the second derivative) of a function f ?
What does it mean to say that a function is concave up or concave down? How are these characteristics connected
to certain properties of the derivative of the function?
What are the units of the second derivative? How do they help us understand the rate of change of the rate of
change?

Introduction
Given a differentiable function y = f(x), we know that its derivative, $y = f'(x)$, is a related function whose output at a
value x = a tells us the slope of the tangent line to y = f(x) at the point (a, f(a)). That is, heights on the derivative graph
tell us the values of slopes on the original function’s graph. Therefore, the derivative tells us important information about
the function f.

Figure 1.25: Two tangent lines on a graph demonstrate how the slope of the tangent line tells us whether the function is
rising or falling, as well as whether it is doing so rapidly or slowly.
At any point where $f'(x)$ is positive, the slope of the tangent line to f is positive, and therefore the function
f is increasing (or rising) at that point. Similarly, if $f'(a)$ is negative, we know that the graph of f is decreasing (or
falling) at that point.


In the next part of our study, we work to understand not only whether the function f is increasing or decreasing at a point
or on an interval, but also how the function f is increasing or decreasing. Comparing the two tangent lines shown in Figure
1.25, we see that at point A, the value of $f'(x)$ is positive and relatively close to zero, which coincides with the graph
rising slowly. By contrast, at point B, the derivative is negative and relatively large in absolute value, which is tied to the
fact that f is decreasing rapidly at B. It also makes sense to not only ask whether the value of the derivative function is
positive or negative and whether the derivative is large or small, but also to ask “how is the derivative changing?”
We also now know that the derivative, $y = f'(x)$, is itself a function. This means that we can consider taking its derivative
(the derivative of the derivative) and therefore ask questions like “what does the derivative of the derivative tell us about
how the original function behaves?” As we have done regularly in our work to date, we start with an investigation of a
familiar problem in the context of a moving object.

Preview Activity 1.6.1

The position of a car driving along a straight road at time t in minutes is given by the function y = s(t) that is
pictured in Figure 1.26. The car’s position function has units measured in thousands of feet. For instance, the point (2,
4) on the graph indicates that after 2 minutes, the car has traveled 4000 feet.

Figure 1.26: The graph of y = s(t) , the position of the car (measured in thousands of feet from its starting location) at
time t in minutes.
a. In everyday language, describe the behavior of the car over the provided time interval. In particular, you
should carefully discuss what is happening on each of the time intervals [0, 1], [1, 2], [2, 3], [3, 4], and [4, 5],
plus provide commentary overall on what the car is doing on the interval [0, 12].
b. On the lefthand axes provided in Figure 1.27, sketch a careful, accurate graph of $y = s'(t)$.
c. What is the meaning of the function $y = s'(t)$ in the context of the given problem? What can we say about
the car’s behavior when $s'(t)$ is positive? when $s'(t)$ is zero? when $s'(t)$ is negative?
d. Rename the function you graphed in (b) to be called y = v(t). Describe the behavior of v in words, using
phrases like “v is increasing on the interval . . .” and “v is constant on the interval . . ..”
e. Sketch a graph of the function $y = v'(t)$ on the righthand axes provided in Figure 1.27. Write at least one
sentence to explain how the behavior of $v'(t)$ is connected to the graph of y = v(t).

Figure 1.27: Axes for plotting $y = v(t) = s'(t)$ and $y = v'(t)$.

Increasing, decreasing, or neither


When we look at the graph of a function, there are features that strike us naturally, and common language can be used to
name these features. In many different settings so far, we have intuitively used the words increasing and decreasing to
describe a function’s graph. Here we connect these terms more formally to a function’s behavior on an interval of input
values.

Definition 1.5
Given a function f(x) defined on the interval (a, b), we say that f is increasing on (a, b) provided that for all x, y in
the interval (a, b), if x < y, then f(x) < f(y). Similarly, we say that f is decreasing on (a, b) provided that for all
x, y in the interval (a, b), if x < y, then f(x) > f(y).

Simply put, an increasing function is one that is rising as we move from left to right along the graph, and a decreasing
function is one that falls as the value of the input increases. For a function that has a derivative, we can use the sign of the
derivative to determine whether or not the function is increasing or decreasing.

Matthew Boelkins, David Austin & Steven


1.6.2 12/8/2021 https://math.libretexts.org/@go/page/4297
Schlicker
Let f be a function that is differentiable on an interval (a, b). We say that f is increasing on (a, b) if and only if
f'(x) > 0 for every x such that a < x < b; similarly, f is decreasing on (a, b) if and only if f'(x) < 0. If f'(a) = 0,
then we say f is neither increasing nor decreasing at x = a.

Figure 1.28: A function that is decreasing on the intervals −3 < x < −2 and 0 < x < 2 and increasing on −2 < x < 0
and 2 < x < 3 .
For example, the function pictured in Figure 1.28 is increasing on the entire interval −2 < x < 0. Note that at both
x = ±2 and x = 0, we say that f is neither increasing nor decreasing, because f'(x) = 0 at these values.
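
To see the sign test in action, the following minimal sketch classifies a function's behavior at a few inputs from the sign of its derivative. The function f(x) = x^3 − 3x is a hypothetical example chosen for illustration (it is not the function in Figure 1.28), and the helper names are assumptions of this sketch.

```python
def f_prime(x):
    """Derivative of the hypothetical example f(x) = x**3 - 3*x."""
    return 3 * x**2 - 3


def classify(slope):
    """Label the behavior of f at a point from the sign of f'(x)."""
    if slope > 0:
        return "increasing"
    if slope < 0:
        return "decreasing"
    return "neither"


# Sample a few inputs and report the sign-based classification.
for x in [-2, -1, 0, 1, 2]:
    print(f"x = {x:2d}: f'(x) = {f_prime(x):3d} -> {classify(f_prime(x))}")
```

For this example, the derivative is zero at x = ±1, so those inputs are labeled "neither," in keeping with the convention stated above.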

The Second Derivative


For any function, we are now accustomed to investigating its behavior by thinking about its derivative. Given a function f ,
its derivative is a new function, one that is given by the rule

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \tag{1.6.1} \]

Because f' is itself a function, it is perfectly feasible for us to consider the derivative of the derivative, which is the new
function y = [f'(x)]'. We call this resulting function the second derivative of y = f(x), and denote the second derivative
by y = f''(x). Due to the presence of multiple possible derivatives, we will sometimes call f' “the first derivative” of f,
rather than simply “the derivative” of f. Formally, the second derivative is defined by the limit definition of the derivative
of the first derivative:

\[ f''(x) = \lim_{h \to 0} \frac{f'(x + h) - f'(x)}{h}. \tag{1.6.2} \]

We note that all of the established meaning of the derivative function still holds, so when we compute y = f''(x), this new
function measures slopes of tangent lines to the curve y = f'(x), as well as the instantaneous rate of change of y = f'(x).

In other words, just as the first derivative measures the rate at which the original function changes, the second derivative
measures the rate at which the first derivative changes. This means that the second derivative tracks the instantaneous rate
of change of the instantaneous rate of change of f . That is, the second derivative will help us to understand how the rate of
change of the original function is itself changing.
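
As a brief worked illustration of taking the derivative of the derivative, consider the hypothetical example f(x) = x^3 (chosen here only for illustration). Applying the limit definition once gives f', and applying it again to f' gives f'':

```latex
\begin{align*}
f'(x)  &= \lim_{h \to 0} \frac{(x+h)^3 - x^3}{h}
        = \lim_{h \to 0} \left(3x^2 + 3xh + h^2\right) = 3x^2, \\
f''(x) &= \lim_{h \to 0} \frac{3(x+h)^2 - 3x^2}{h}
        = \lim_{h \to 0} \left(6x + 3h\right) = 6x.
\end{align*}
```

In this example f''(x) = 6x is positive for x > 0, so the slopes f'(x) = 3x^2 are increasing there, and negative for x < 0, where those slopes are decreasing.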

Concavity
In addition to asking whether a function is increasing or decreasing, it is also natural to inquire how a function is increasing
or decreasing. To begin, there are three basic behaviors that an increasing function can demonstrate on an interval, as
pictured in Figure 1.29: the function can increase more and more rapidly, increase at the same rate, or increase in a way
that is slowing down. Fundamentally, we are beginning to think about how a particular curve bends, with the natural
comparison being made to lines, which don’t bend at all. More than this, we want to understand how the bend in a
function’s graph is tied to behavior characterized by the first derivative of the function.
For the leftmost curve in Figure 1.29, picture a sequence of tangent lines to the curve. As we move from left to right, the
slopes of those tangent lines will increase. Therefore, the rate of change of the pictured function is increasing, and this
explains why we say this function is increasing at an increasing rate. For the rightmost graph in Figure 1.29, observe that
as x increases, the function increases but the slope of the tangent line decreases, hence this function is increasing at a
decreasing rate.
Of course, similar options hold for how a function can decrease. Here we must be extra careful with our language, since
decreasing functions involve negative slopes, and negative numbers present an interesting situation in the tension between
common language and mathematical language. For example, it can be tempting to say that “-100 is bigger than -2.” But we
must remember that when we say one number is greater than another, this describes how the numbers lie on a number line:
x < y provided that x lies to the left of y. So of course, -100 is less than -2. Informally, it might be helpful to say that
“-100 is more negative than -2.” This leads us to note particularly that when a function’s values are negative, and those
values subsequently get more negative, the function must be decreasing.

Figure 1.29: Three functions that are all increasing, but doing so at an increasing rate, at a constant rate, and at a
decreasing rate, respectively.
Now consider the three graphs shown in Figure 1.30. Clearly the middle graph demonstrates the behavior of a function
decreasing at a constant rate. If we think about a sequence of tangent lines to the first curve that progress from left to right,
we see that the slopes of these lines get less and less negative as we move from left to right. That means that the values of
the first derivative, while all negative, are increasing, and thus we say that the leftmost curve is decreasing at an increasing
rate.

Figure 1.30: From left to right, three functions that are all decreasing, but doing so in different ways.
This leaves only the rightmost curve in Figure 1.30 to consider. For that function, the slope of the tangent line is negative
throughout the pictured interval, but as we move from left to right, the slopes get more and more negative. Hence the slope
of the curve is decreasing, and we say that the function is decreasing at a decreasing rate.
This leads us to introduce the notion of concavity, which provides simpler language to describe some of these behaviors.
Informally, when a curve opens up on a given interval, like the upright parabola y = x^2 or the exponential growth function
y = e^x, we say that the curve is concave up on that interval. Likewise, when a curve opens down, such as the parabola
y = −x^2 or the opposite of the exponential function y = −e^x, we say that the function is concave down. This behavior is
linked to both the first and second derivatives of the function.


In Figure 1.31, we see two functions along with a sequence of tangent lines to each. On the lefthand plot where the
function is concave up, observe that the tangent lines to the curve always lie below the curve itself and that, as we move
from left to right, the slope of the tangent line is increasing. Said differently, the function f is concave up on the interval
shown because its derivative, f', is increasing on that interval. Similarly, on the righthand plot in Figure 1.31, where the
function shown is concave down, we see that the tangent lines always lie above the curve and that the value of the
slope of the tangent line is decreasing as we move from left to right. Hence, what makes f concave down on the interval is
the fact that its derivative, f', is decreasing.

Figure 1.31: At left, a function that is concave up; at right, one that is concave down.
We state these most recent observations formally as the definitions of the terms concave up and concave down.

Definition 1.6
Let f be a differentiable function on an interval (a, b). Then f is concave up on (a, b) if and only if f' is increasing on
(a, b); f is concave down on (a, b) if and only if f' is decreasing on (a, b).


The following activities lead us to further explore how the first and second derivatives of a function determine the behavior
and shape of its graph. We begin by revisiting Preview Activity 1.6.

Activity 1.6.2

The position of a car driving along a straight road at time t in minutes is given by the function y = s(t) that is
pictured in Figure 1.32. The car’s position function has units measured in thousands of feet. Remember that you
worked with this function and sketched graphs of y = v(t) = s'(t) and y = v'(t) in Preview Activity 1.6.

Figure 1.32: The graph of y = s(t) , the position of the car (measured in thousands of feet from its starting location) at
time t in minutes.
a. On what intervals is the position function y = s(t) increasing? decreasing? Why?
b. On which intervals is the velocity function y = v(t) = s'(t) increasing? decreasing? neither? Why?
c. Acceleration is defined to be the instantaneous rate of change of velocity, as the acceleration of an object
measures the rate at which the velocity of the object is changing. Say that the car’s acceleration function is
named a(t) . How is a(t) computed from v(t) ? How is a(t) computed from s(t) ? Explain.
d. What can you say about s'' whenever s' is increasing? Why?

e. Using only the words increasing, decreasing, constant, concave up, concave down, and linear, complete the
following sentences. For the position function s with velocity v and acceleration a,
on an interval where v is positive, s is ______.
on an interval where v is negative, s is ______.
on an interval where v is zero, s is ______.
on an interval where a is positive, v is ______.
on an interval where a is negative, v is ______.
on an interval where a is zero, v is ______.
on an interval where a is positive, s is ______.
on an interval where a is negative, s is ______.
on an interval where a is zero, s is ______.

The context of position, velocity, and acceleration is an excellent one in which to understand how a function, its first
derivative, and its second derivative are related to one another. In Activity 1.6.2, we can replace s, v, and a with an
arbitrary function f and its derivatives f' and f'', and essentially all the same observations hold. In particular, note that f'
is increasing if and only if f is concave up, and similarly f' is increasing if and only if f'' is positive. Likewise, f' is
decreasing if and only if f is concave down, and f' is decreasing if and only if f'' is negative.

Activity 1.6.3:

A potato is placed in an oven, and the potato’s temperature F (in degrees Fahrenheit) at various points in time is taken
and recorded in the following table. Time t is measured in minutes. In Activity 1.12, we computed approximations to
F'(30) and F'(60) using central differences. Those values are provided in the second table below, along
with several others computed in the same way.

t      F(t)
0      70
15     180.5
30     251
45     296
60     324.5
75     342.8
90     354.5

t      F'(t)
0      NA
15     6.03
30     3.85
45     2.45
60     1.56
75     1.00
90     NA

a. What are the units on the values of F'(t)?
b. Use a central difference to estimate the value of F''(30).
c. What is the meaning of the value of F''(30) that you have computed in (b) in terms of the potato’s
temperature? Write several careful sentences that discuss, with appropriate units, the values of F(30),
F'(30), and F''(30), and explain the overall behavior of the potato’s temperature at this point in time.

d. Overall, is the potato’s temperature increasing at an increasing rate, increasing at a constant rate, or
increasing at a decreasing rate? Why?
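
The entries in the second table can be reproduced with a short computation. The sketch below is a minimal illustration of the central difference, using the temperature data from the first table and taking F(60) = 324.5, the value consistent with the tabulated rates; the function and variable names are assumptions of this sketch. The same helper could be applied to a table of rates, which is left to the reader.

```python
# Temperature data from the first table (t in minutes, F in degrees Fahrenheit).
times = [0, 15, 30, 45, 60, 75, 90]
temps = [70, 180.5, 251, 296, 324.5, 342.8, 354.5]


def central_differences(xs, ys):
    """Estimate the rate of change at each interior point.

    Uses (y[i+1] - y[i-1]) / (x[i+1] - x[i-1]); endpoints are skipped
    because they lack a neighbor on one side (the "NA" table entries).
    """
    return {
        xs[i]: (ys[i + 1] - ys[i - 1]) / (xs[i + 1] - xs[i - 1])
        for i in range(1, len(xs) - 1)
    }


rates = central_differences(times, temps)
for t, rate in rates.items():
    print(f"F'({t}) is approximately {rate:.2f}")
```

Rounded to two decimal places, the results match the tabulated values 6.03, 3.85, 2.45, 1.56, and 1.00.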
Activity 1.6.4:

This activity builds on our experience and understanding of how to sketch the graph of f' given the graph of f.

Figure 1.33: Two given functions f, with axes provided for plotting f' and f'' below.

In Figure 1.33, given the respective graphs of two different functions f, sketch the corresponding graph of f' on the
first axes below, and then sketch f'' on the second set of axes. In addition, for each, write several careful sentences in
the spirit of those in Activity 1.6.2 that connect the behaviors of f, f', and f''. For instance, write something such as

f' is ______ on the interval ______, which is connected to the fact that f is ______ on the same interval, and f'' is
______ on the interval as well

but of course with the blanks filled in. Throughout, view the scale of the grid for the graph of f as being 1 × 1, and
assume the horizontal scale of the grid for the graph of f' is identical to that for f. If you need to adjust the vertical
scale on the axes for the graph of f' or f'', you should label that accordingly.

Summary
In this section, we encountered the following important ideas:
A differentiable function f is increasing at a point or on an interval whenever its first derivative is positive, and
decreasing whenever its first derivative is negative.
By taking the derivative of the derivative of a function f, we arrive at the second derivative, f''. The second derivative
measures the instantaneous rate of change of the first derivative, and thus the sign of the second derivative tells us
whether or not the slope of the tangent line to f is increasing or decreasing.
A differentiable function is concave up whenever its first derivative is increasing (or equivalently whenever its second
derivative is positive), and concave down whenever its first derivative is decreasing (or equivalently whenever its
second derivative is negative). Examples of functions that are everywhere concave up are y = x^2 and y = e^x;
examples of functions that are everywhere concave down are y = −x^2 and y = −e^x.

The units on the second derivative are “units of output per unit of input per unit of input.” They tell us how the value of
the derivative function is changing in response to changes in the input. In other words, the second derivative tells us the
rate of change of the rate of change of the original function.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

1.7: Limits, Continuity, and Differentiability
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What does it mean graphically to say that f has limit L as x → a ? How is this connected to having a left-hand
limit at x = a and having a right-hand limit at x = a ?
What does it mean to say that a function f is continuous at x = a ? What role do limits play in determining whether
or not a function is continuous at a point?
What does it mean graphically to say that a function f is differentiable at x = a ? How is this connected to the
function being locally linear?
How are the characteristics of a function having a limit, being continuous, and being differentiable at a given point
related to one another?

Introduction
In Section 1.2, we learned about how the concept of limits can be used to study the trend of a function near a fixed input
value. As we study such trends, we are fundamentally interested in knowing how well-behaved the function is at the given
point, say x = a . In this present section, we aim to expand our perspective and develop language and understanding to
quantify how the function acts and how its value changes near a particular point. Beyond thinking about whether or not the
function has a limit L at x = a, we will also consider the value of the function f(a) and how this value is related to
lim_{x→a} f(x), as well as whether or not the function has a derivative f'(a) at the point of interest. Throughout, we will
build on and formalize ideas that we have encountered in several settings.


We begin to consider these issues through the following preview activity that asks you to consider the graph of a function
with a variety of interesting behaviors.

Preview Activity 1.7.1

A function f defined on −4 < x < 4 is given by the graph in Figure 1.7.1. Use the graph to answer each of the
following questions. Note: to the right of x = 2, the graph of f is exhibiting infinite oscillatory behavior similar to the
function sin(π/x) that we encountered in the key example early in Section 1.2.

(a) For each of the values a = −3, −2, −1, 0, 1, 2, 3, determine whether or not lim_{x→a} f(x) exists. If the function
has a limit L at a given point, state the value of the limit using the notation lim_{x→a} f(x) = L. If the function
does not have a limit at a given point, write a sentence to explain why.

Figure 1.7.1 : The graph of y = f (x).


(b) For each of the values of a from part (a) where f has a limit, determine the value of f(a) at each such point.
In addition, for each such a value, does f(a) have the same value as lim_{x→a} f(x)?

(c) For each of the values a = −3, −2, −1, 0, 1, 2, 3, determine whether or not f'(a) exists. In particular, based
on the given graph, ask yourself if it is reasonable to say that f has a tangent line at (a, f(a)) for each of the
given a-values. If so, visually estimate the slope of the tangent line to find the value of f'(a).

Matthew Boelkins, David Austin & Steven


1.7.1 12/8/2021 https://math.libretexts.org/@go/page/5286
Schlicker
Having a limit at a point
In Section 1.2, we first encountered limits and learned that we say that f has limit L as x approaches a and write
lim_{x→a} f(x) = L provided that we can make the value of f(x) as close to L as we like by taking x sufficiently close to
(but not equal to) a. Here, we expand further on this definition and focus in more depth on what it means for a function
not to have a limit at a given value.
Essentially there are two behaviors that a function can exhibit at a point where it fails to have a limit. In Figure 1.7.2, at
left we see a function f whose graph shows a jump at a = 1. In particular, if we let x approach 1 from the left side, the
value of f approaches 2, while if we let x go to 1 from the right, the value of f tends to 3. Because the value of f does not
approach a single number as x gets arbitrarily close to 1 from both sides, we know that f does not have a limit at a = 1.
Since f does approach a single value on each side of a = 1, we can introduce the notion of left and right (or one-sided)
limits. We say that f has limit L_1 as x approaches a from the left and write

\[ \lim_{x \to a^-} f(x) = L_1 \tag{1.7.1} \]

provided that we can make the value of f(x) as close to L_1 as we like by taking x sufficiently close to a while always
having x < a. In this case, we call L_1 the left-hand limit of f as x approaches a. Similarly, we say L_2 is the right-hand
limit of f as x approaches a and write

\[ \lim_{x \to a^+} f(x) = L_2 \tag{1.7.2} \]

provided that we can make the value of f(x) as close to L_2 as we like by taking x sufficiently close to a while always
having x > a. In the graph of the function f in Figure 1.7.2, we see that

\[ \lim_{x \to 1^-} f(x) = 2 \quad \text{and} \quad \lim_{x \to 1^+} f(x) = 3 \tag{1.7.3} \]

and precisely because the left and right limits are not equal, the overall limit of f as x → 1 fails to exist.

Figure 1.7.2 : Functions f and g that each fail to have a limit at a = 1.


For the function g pictured at right in Figure 1.7.2, the function fails to have a limit at a = 1 for a different reason. While
the function does not have a jump in its graph at a = 1 , it is still not the case that g approaches a single value as x
approaches 1. In particular, due to the infinitely oscillating behavior of g to the right of a = 1, we say that the right-hand
limit of g as x → 1+ does not exist, and thus lim_{x→1} g(x) does not exist. To summarize, anytime either a left- or
right-hand limit fails to exist or the left- and right-hand limits are not equal to each other, the overall limit will not
exist. Said differently,

A function f has limit L as x → a if and only if

\[ \lim_{x \to a^-} f(x) = L = \lim_{x \to a^+} f(x). \tag{1.7.4} \]

That is, a function has a limit at x = a if and only if both the left- and right-hand limits at x = a exist and share the
same value.
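
One-sided limits can be explored numerically by evaluating a function at inputs that approach a from each side. The sketch below uses a hypothetical piecewise function with a jump at x = 1 (it is an assumption of this example, not the function graphed in Figure 1.7.2, though it shares the same one-sided limits of 2 and 3). A table of values like this can only suggest a limit; it does not prove one.

```python
def f(x):
    """Hypothetical function with a jump at x = 1."""
    return x + 1 if x < 1 else x + 2


# Approach a = 1 from the left (x < 1) and from the right (x > 1).
for k in range(1, 6):
    step = 10 ** -k
    print(f"step {step:.0e}: left {f(1 - step):.5f}   right {f(1 + step):.5f}")
```

The left-hand values settle toward 2 while the right-hand values settle toward 3, so the two one-sided limits disagree and the overall limit at x = 1 does not exist for this example.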

In Preview Activity 1.7.1, the function f given in Figure 1.7.1 only fails to have a limit at two values: at a = −2 (where the
left- and right-hand limits are 2 and −1, respectively) and at a = 2 (where lim_{x→2^+} f(x) does not exist). Note well that
even at values like a = −1 and a = 0 where there are holes in the graph, the limit still exists.

Activity 1.7.2

Consider a function that is piecewise-defined according to the formula

\[
f(x) = \begin{cases}
3(x + 2) + 2 & \text{for } -3 < x < -2 \\
\frac{2}{3}(x + 2) + 1 & \text{for } -2 \le x < -1 \\
\frac{2}{3}(x + 2) + 1 & \text{for } -1 < x < 1 \\
2 & \text{for } x = 1 \\
4 - x & \text{for } x > 1
\end{cases} \tag{1.7.5}
\]

Use the given formula to answer the following questions.

Figure 1.7.3: Axes for plotting the function y = f(x) in Activity 1.7.2.
(a) For each of the values a = −2, −1, 0, 1, 2, compute f(a).
(b) For each of the values a = −2, −1, 0, 1, 2, determine lim_{x→a^-} f(x) and lim_{x→a^+} f(x).
(c) For each of the values a = −2, −1, 0, 1, 2, determine lim_{x→a} f(x). If the limit fails to exist, explain why by
discussing the left- and right-hand limits at the relevant a-value.
(d) For which values of a is the following statement true?

\[ \lim_{x \to a} f(x) \neq f(a) \tag{1.7.6} \]

(e) On the axes provided in Figure 1.7.3, sketch an accurate, labeled graph of y = f (x). Be sure to carefully use
open circles (◦) and filled circles (•) to represent key points on the graph, as dictated by the piecewise formula.

Being continuous at a point


Intuitively, a function is continuous if we can draw it without ever lifting our pencil from the page. Alternatively, we might
say that the graph of a continuous function has no jumps or holes in it. We first consider three specific situations in Figure
1.7.4 where all three functions have a limit at a = 1 , and then work to make the idea of continuity more precise.

Figure 1.7.4 : Functions f ,g , and h that demonstrate subtly different behaviors at a = 1 .


Note that f(1) is not defined, which leads to the resulting hole in the graph of f at a = 1. We will naturally say that f is
not continuous at a = 1. For the next function g in Figure 1.7.4, we observe that while lim_{x→1} g(x) = 3, the value of
g(1) is 2, and thus the limit does not equal the function value. Here, too, we will say that g is not continuous, even though
the function is defined at a = 1. Finally, the function h appears to be the most well-behaved of all three, since at a = 1 its
limit and its function value agree. That is,

\[ \lim_{x \to 1} h(x) = 3 = h(1). \tag{1.7.7} \]

With no hole or jump in the graph of h at a = 1 , we desire to say that h is continuous there. More formally, we make the
following definition.

Definition 1.7

A function f is continuous at x = a provided that


(a) f has a limit as x → a ,
(b) f is defined at x = a , and
(c) lim_{x→a} f(x) = f(a).

Conditions (a) and (b) are technically contained implicitly in (c), but we state them explicitly to emphasize their individual
importance. In words, (c) essentially says that a function is continuous at x = a provided that its limit as x → a exists and
equals its function value at x = a . If a function is continuous at every point in an interval [a, b], we say the function is
“continuous on [a, b].” If a function is continuous at every point in its domain, we simply say the function is “continuous.”
Thus, continuous functions are particularly nice: to evaluate the limit of a continuous function at a point, all we need to do
is evaluate the function.
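
The comparison in condition (c) between the limit and the function value can be illustrated with a rough numerical check. The functions g and h below are hypothetical stand-ins that mimic the behavior described for Figure 1.7.4 (a limit of 3 at a = 1, with g(1) = 2 and h(1) = 3); the helper name, step, and tolerance are assumptions. Sampling nearby inputs is a heuristic only and cannot establish continuity.

```python
def g(x):
    """Hypothetical function whose limit at 1 is 3 but whose value there is 2."""
    return 2 if x == 1 else x + 2


def h(x):
    """Hypothetical function whose limit and value at 1 are both 3."""
    return x + 2


def looks_continuous(func, a, step=1e-9, tol=1e-6):
    """Heuristic: do values just left and right of a agree with func(a)?"""
    left, right = func(a - step), func(a + step)
    return abs(left - func(a)) < tol and abs(right - func(a)) < tol


print("g at 1:", looks_continuous(g, 1))  # nearby values are near 3, but g(1) = 2
print("h at 1:", looks_continuous(h, 1))  # nearby values and h(1) are all near 3
```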

Activity 1.7.3

This activity builds on your work in Preview Activity 1.7.1, using the same function f as given by the graph that is
repeated in Figure 1.7.5.
(a) At which values of a does lim_{x→a} f(x) not exist?
(b) At which values of a is f(a) not defined?
(c) At which values of a does f have a limit, but lim_{x→a} f(x) ≠ f(a)?
(d) State all values of a for which f is not continuous at x = a.
(e) Which condition is stronger, and hence implies the other: f has a limit at x = a or f is continuous at x = a?
Explain, and hence complete the following sentence: “If f ______ at x = a, then f ______ at x = a,” where you complete the
blanks with has a limit and is continuous, using each phrase once.

Figure 1.7.5: The graph of y = f(x) for Activity 1.7.3.

Being differentiable at a point


We recall that a function f is said to be differentiable at x = a whenever f'(a) exists. Moreover, for f'(a) to exist, we
know that the function y = f(x) must have a tangent line at the point (a, f(a)), since f'(a) is precisely the slope of this
line. In order to even ask if f has a tangent line at (a, f(a)), it is necessary that f be continuous at x = a: if f fails to have
a limit at x = a, if f(a) is not defined, or if f(a) does not equal the value of lim_{x→a} f(x), then it doesn’t even make sense
to talk about a tangent line to the curve at this point.


Indeed, it can be proved formally that if a function f is differentiable at x = a, then it must be continuous at x = a . So, if f
is not continuous at x = a , then it is automatically the case that f is not differentiable there. For example, in Figure 1.7.4
from our early discussion of continuity, both f and g fail to be differentiable at x = 1 because neither function is
continuous at x = 1 . But can a function fail to be differentiable at a point where the function is continuous?
In Figure 1.7.6, we revisit the situation where a function has a sharp corner at a point, something we encountered several
times in Section 1.4. For the pictured function f, we observe that f is clearly continuous at a = 1, since
lim_{x→1} f(x) = 1 = f(1).

But the function f in Figure 1.7.6 is not differentiable at a = 1 because f'(1) fails to exist. One way to see this is to
observe that f'(x) = −1 for every value of x that is less than 1, while f'(x) = +1 for every value of x that is greater than
1. That makes it seem that either +1 or −1 would be equally good candidates for the value of the derivative at x = 1.
Alternately, we could use the limit definition of the derivative to attempt to compute f'(1), and discover that the
derivative does not exist. A similar problem will be investigated in Activity 1.7.4. Finally, we can also see visually that the
function f in Figure 1.7.6 does not have a tangent line at (1, 1). When we zoom in on (1, 1) on the graph of f, no matter how
closely we examine the function, it will always look like a “V”, and never like a single line, which
tells us there is no possibility for a tangent line there.

Figure 1.7.6: A function f that is continuous at a = 1 but not differentiable at a = 1; at right, we zoom in on the point
(1, 1) in a magnified version of the box in the left-hand plot.


To make a more general observation, if a function does have a tangent line at a given point, when we zoom in on the point
of tangency, the function and the tangent line should appear essentially indistinguishable⁷. Conversely, if we have a
function such that when we zoom in on a point the function looks like a single straight line, then the function should have a
tangent line there, and thus be differentiable. Hence, a function that is differentiable at x = a will, up close, look more and
more like its tangent line at (a, f(a)), and thus we say that a function that is differentiable at x = a is locally linear.
To summarize the preceding discussion of differentiability and continuity, we make several important observations.
If f is differentiable at x = a , then f is continuous at x = a . Equivalently, if f fails to be continuous at x = a ,
then f will not be differentiable at x = a .
A function can be continuous at a point, but not be differentiable there. In particular, a function f is not
differentiable at x = a if the graph has a sharp corner (or cusp) at the point (a, f (a)).
If f is differentiable at x = a , then f is locally linear at x = a . That is, when a function is differentiable, it looks
linear when viewed up close because it resembles its tangent line there.
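
The contrast between a corner and a locally linear point shows up clearly in difference quotients taken from each side. The sketch below uses two hypothetical functions chosen for illustration: corner(x) = |x − 1| + 1, which is continuous at (1, 1) with slopes of −1 and +1 on either side, and smooth(x) = x^2. The function names and step sizes are assumptions of this example.

```python
def corner(x):
    """Hypothetical V-shaped function with a sharp corner at (1, 1)."""
    return abs(x - 1) + 1


def smooth(x):
    """Hypothetical function that is differentiable at x = 1."""
    return x**2


def difference_quotient(func, a, step):
    """Slope of the secant line through (a, func(a)) and (a + step, func(a + step))."""
    return (func(a + step) - func(a)) / step


for step in [0.1, 0.01, 0.001]:
    print(
        f"step {step}: "
        f"corner right {difference_quotient(corner, 1, step):.4f}, "
        f"corner left {difference_quotient(corner, 1, -step):.4f}, "
        f"smooth right {difference_quotient(smooth, 1, step):.4f}, "
        f"smooth left {difference_quotient(smooth, 1, -step):.4f}"
    )
```

For the corner, the right-hand quotients stay near +1 and the left-hand quotients stay near −1 no matter how small the step, so no single slope emerges. For the smooth function, both sides close in on the same value, 2, which is consistent with local linearity at x = 1.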

Activity 1.7.4

In this activity, we explore two different functions and classify the points at which each is not differentiable. Let g be
the function given by the rule g(x) = |x|, and let f be the function that we have previously explored in Preview
Activity 1.7.1, whose graph is given again in Figure 1.7.7.
(a) Reasoning visually, explain why g is differentiable at every point x such that x ≠ 0.
(b) Use the limit definition of the derivative to show that g'(0) = lim_{h→0} |h|/h.
(c) Explain why g'(0) fails to exist by using small positive and negative values of h.

Figure 1.7.7: The graph of y = f(x) for Activity 1.7.4.


(d) State all values of a for which f is not differentiable at x = a. For each, provide a reason for your
conclusion.
(e) True or false: if a function p is differentiable at x = b, then lim_{x→b} p(x) must exist. Why?

Summary
In this section, we encountered the following important ideas:
A function f has limit L as x → a if and only if f has a left-hand limit at x = a , has a right-hand limit at x = a ,
and the left- and right-hand limits are equal. Visually, this means that there can be a hole in the graph at x = a ,
but the function must approach the same single value from either side of x = a .
A function f is continuous at x = a whenever f (a) is defined, f has a limit as x → a , and the value of the limit
and the value of the function agree. This guarantees that there is not a hole or jump in the graph of f at x = a .
A function f is differentiable at x = a whenever f'(a) exists, which means that f has a tangent line at (a, f(a))
and thus f is locally linear at the value x = a. Informally, this means that the function looks like a line when
viewed up close at (a, f(a)) and that there is not a corner point or cusp at (a, f(a)).
Of the three conditions discussed in this section (having a limit at x = a , being continuous at x = a , and being
differentiable at x = a ), the strongest condition is being differentiable, and the next strongest is being
continuous. In particular, if f is differentiable at x = a , then f is also continuous at x = a , and if f is continuous
at x = a , then f has a limit at x = a .
⁷ See, for instance, http://gvsu.edu/s/6J for an applet (due to David Austin, GVSU) where zooming in shows the increasing
similarity between the tangent line and the curve.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

1.8: The Tangent Line Approximation
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is the formula for the general tangent line approximation to a differentiable function y = f (x) at the point
(a, f (a)) ?

What is the principle of local linearity and what is the local linearization of a differentiable function f at a point
(a, f (a)) ?

How does knowing just the tangent line approximation tell us information about the behavior of the original
function itself near the point of approximation? How does knowing the second derivative’s value at this point
provide us additional knowledge of the original function’s behavior?

Among all functions, linear functions are simplest. One of the powerful consequences of a function y = f(x) being
differentiable at a point (a, f(a)) is that, up close, the function y = f(x) is locally linear and looks like its tangent line at
that point. In certain circumstances, this allows us to approximate the original function f with a simpler function L that is
linear: this can be advantageous when we have limited information about f or when f is computationally or algebraically
complicated. We will explore all of these situations in what follows.
It is essential to recall that when f is differentiable at x = a, the value of f′(a) provides the slope of the tangent line to
y = f(x) at the point (a, f(a)). By knowing both a point on the line and the slope of the line, we are thus able to find the
equation of the tangent line. Preview Activity 1.8.1 will refresh these concepts through a key example and set the stage for
further study.

Preview Activity 1.8.1

Consider the function y = g(x) = −x² + 3x + 2.
a. Use the limit definition of the derivative to compute a formula for y = g′(x).
b. Determine the slope of the tangent line to y = g(x) at the value x = 2.
c. Compute g(2).
d. Find an equation for the tangent line to y = g(x) at the point (2, g(2)). Write your result in point-slope form.⁸

Figure 1.8.1: Axes for plotting y = g(x) and its tangent line at the point (2, g(2)).

The Tangent Line


Given a function f that is differentiable at x = a, we know that we can determine the slope of the tangent line to y = f(x)
at (a, f(a)) by computing f′(a). The resulting tangent line through (a, f(a)) with slope m = f′(a) has its equation in
point-slope form given by

\[ y - f(a) = f'(a)(x - a), \tag{1.8.1} \]

which we can also express as y = f′(a)(x − a) + f(a). Note well: there is a major difference between f(a) and f(x) in
this context. The former is a constant that results from using the given fixed value of a, while the latter is the general
expression for the rule that defines the function. The same is true for f′(a) and f′(x): we must carefully distinguish
between these expressions. Each time we find the tangent line, we need to evaluate the function and its derivative at a fixed
a-value.
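
As a brief illustration of the distinction between f(a) and f(x), consider a hypothetical example that is not taken from the text: the function f(x) = x² with a = 3. The computation below is only a sketch of how the fixed values f(3) and f′(3) enter the tangent line equation.

```latex
\begin{align*}
f(x) &= x^2, & f(3) &= 9, \\
f'(x) &= 2x, & f'(3) &= 6, \\
y - 9 &= 6(x - 3).
\end{align*}
```

Here f(3) = 9 and f′(3) = 6 are constants, while f(x) and f′(x) remain rules in the variable x.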

In Figure 1.8.2, we see a labeled plot of the graph of a function f and its tangent line at the point (a, f(a)). Notice how
when we zoom in we see the local linearity of f more clearly highlighted as the function and its tangent line are nearly
indistinguishable up close. This can also be seen dynamically in the Java applet at http://gvsu.edu/s/6J.

Figure 1.8.2: A function y = f(x) and its tangent line at the point (a, f(a)): at left, from a distance, and at right, up
close. At right, we label the tangent line function by y = L(x) and observe that for x near a, f(x) ≈ L(x).
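
The zooming idea can also be explored numerically. The following is a minimal sketch, assuming the illustrative choice f(x) = sin(x) at a = 1 (not an example from the text); the helper name `linearization` is hypothetical. It measures the gap between the function and its tangent line at the edge of a shrinking window, relative to the window's half-width.

```python
import math

def linearization(f, df, a):
    """Return the tangent line function L(x) = f(a) + f'(a)(x - a)."""
    fa, slope = f(a), df(a)
    return lambda x: fa + slope * (x - a)

# Illustrative assumption: f(x) = sin(x) at a = 1, with f'(x) = cos(x).
a = 1.0
L = linearization(math.sin, math.cos, a)

# Shrink the viewing window around a and record the gap |f(x) - L(x)|
# at the right edge of the window, relative to the window's half-width.
relative_gaps = []
for half_width in (1.0, 0.1, 0.01, 0.001):
    x = a + half_width
    gap = abs(math.sin(x) - L(x))
    relative_gaps.append(gap / half_width)
    print(f"half-width {half_width:<6} gap {gap:.2e}  gap/half-width {gap / half_width:.2e}")
```

If the relative gap keeps shrinking as the window narrows, the curve and its tangent line become harder to tell apart at that scale, which is what the zoomed view in the figure suggests.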

The Local Linearization


A slight change in perspective and notation will enable us to be more precise in discussing how the tangent line to
y = f(x) at (a, f(a)) approximates f near x = a. Taking the equation for the tangent line and solving for y, we observe
that the tangent line is given by

\[ y = f'(a)(x - a) + f(a) \tag{1.8.2} \]

and moreover that this line is itself a function of x. Replacing the variable y with the expression L(x), we call

\[ L(x) = f'(a)(x - a) + f(a) \tag{1.8.3} \]

the local linearization of f at the point (a, f(a)). In this notation, it is particularly important to observe that L(x) is
nothing more than a new name for the tangent line, and that for x close to a, we have that f(x) ≈ L(x).
Say, for example, that we know that a function y = f(x) has its tangent line approximation given by L(x) = 3 − 2(x − 1)
at the point (1, 3), but we do not know anything else about the function f. If we are interested in estimating a value of f(x)
for x near 1, such as f(1.2), we can use the fact that f(1.2) ≈ L(1.2) and hence

\[ f(1.2) \approx L(1.2) = 3 - 2(1.2 - 1) = 3 - 2(0.2) = 2.6. \tag{1.8.4} \]
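
The arithmetic in this example can be reproduced with a few lines of code. This is only a sketch: the function name `L` mirrors the notation of the text, and the variable name `estimate` is an assumption introduced for illustration.

```python
def L(x):
    """Tangent line approximation from the example: L(x) = 3 - 2(x - 1)."""
    return 3 - 2 * (x - 1)

# The line passes through the point of tangency (1, 3).
value_at_point = L(1)

# L(1.2) stands in for the unknown value f(1.2).
estimate = L(1.2)
print(f"L(1) = {value_at_point}, L(1.2) is approximately {estimate:.4f}")
```

Nothing about f beyond its tangent line at (1, 3) is used here, which is the point of the example: the estimate relies only on the value and the slope at x = 1.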

Again, much of the new perspective here is only in notation since y = L(x) is simply a new name for the tangent line
function. In light of this new notation and our observations above, we note that since L(x) = f(a) + f′(a)(x − a) and
L(x) ≈ f(x) for x near a, it also follows that we can write

\[ f(x) \approx f(a) + f'(a)(x - a) \quad \text{for } x \text{ near } a. \tag{1.8.5} \]

The next activity explores some additional important properties of the local linearization y = L(x) to a function f at a given
a-value.

Activity 1.8.2

Suppose it is known that for a given differentiable function y = g(x), its local linearization at the point where a = −1
is given by L(x) = −2 + 3(x + 1).
a. Compute the values of L(−1) and L′(−1).
b. What must be the values of g(−1) and g′(−1)? Why?
c. Do you expect the value of g(−1.03) to be greater than or less than the value of g(−1)? Why?
d. Use the local linearization to estimate the value of g(−1.03).
e. Suppose that you also know that g′′(−1) = 2. What does this tell you about the graph of y = g(x) at a = −1?
f. For x near −1, sketch the graph of the local linearization y = L(x) as well as a possible graph of y = g(x)
on the axes provided in Figure 1.8.3.

Figure 1.8.3: Axes for plotting y = L(x) and y = g(x).

As we saw in the example provided by Activity 1.8.2, the local linearization y = L(x) is a linear function that shares two
important values with the function y = f(x) that it is derived from. In particular, observe that since
L(x) = f(a) + f′(a)(x − a), it follows that L(a) = f(a). In addition, since L is a linear function, its derivative is its
slope. Hence, L′(x) = f′(a) for every value of x, and specifically L′(a) = f′(a). Therefore, we see that L is a linear
function that has both the same value and the same slope as the function f at the point (a, f(a)).
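
The two shared values can be checked directly from the formula for L. The short computation below only restates the reasoning above in symbols, treating f(a) and f′(a) as constants.

```latex
\begin{align*}
L(a) &= f(a) + f'(a)(a - a) = f(a), \\
L'(x) &= \frac{d}{dx}\bigl[f(a) + f'(a)(x - a)\bigr] = f'(a)
  \quad \text{for every } x, \text{ so } L'(a) = f'(a).
\end{align*}
```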
In situations where we know the linear approximation y = L(x), we therefore know the original function’s value and slope
at the point of tangency. What remains unknown, however, is the shape of the function f at the point of tangency. There are
essentially four possibilities, as enumerated in Figure 1.8.4.

Figure 1.8.4 : Four possible graphs for a nonlinear differentiable function and how it can be situated relative to its
tangent line at a point.
These stem from the fact that there are three options for the value of the second derivative: either f′′(a) < 0, f′′(a) = 0,
or f′′(a) > 0. If f′′(a) > 0, then we know the graph of f is concave up, and we see the first possibility on the left, where
the tangent line lies entirely below the curve. If f′′(a) < 0, then we find ourselves in the second situation (from left)
where f is concave down and the tangent line lies above the curve. In the situation where f′′(a) = 0 and f′′ changes sign at
x = a, the concavity of the graph will change, and we will see either the third or fourth option.⁹ A fifth option (that is not
very interesting) can occur, which is where the function f is linear, and so f(x) = L(x) for all values of x.
The plots in Figure 1.8.4 highlight yet another important thing that we can learn from the concavity of the graph near the
point of tangency: whether the tangent line lies above or below the curve itself. This is key because it tells us whether
the tangent line approximation’s values will be too large or too small in comparison to the true value of f. For instance,
in the first situation in the leftmost plot in Figure 1.8.4 where f′′(a) > 0, since the tangent line falls below the curve, we
know that L(x) ≤ f(x) for all values of x near a. We explore these ideas further in the following activity.
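
The link between concavity and over- or under-estimation can be checked numerically. The sketch below assumes two illustrative functions that are not from the text: f(x) = e^x at a = 0 (concave up) and f(x) = ln(x) at a = 1 (concave down). The helper name `linearization` is hypothetical.

```python
import math

def linearization(f, df, a):
    """Return the tangent line function L(x) = f(a) + f'(a)(x - a)."""
    fa, slope = f(a), df(a)
    return lambda x: fa + slope * (x - a)

# Illustrative concave-up case: f(x) = e^x at a = 0, where f''(0) = 1 > 0.
L_up = linearization(math.exp, math.exp, 0.0)
# Illustrative concave-down case: f(x) = ln(x) at a = 1, where f''(1) = -1 < 0.
L_down = linearization(math.log, lambda x: 1.0 / x, 1.0)

# Signed error L(x) - f(x): negative means the tangent line underestimates f.
errors_up = [L_up(x) - math.exp(x) for x in (-0.2, -0.1, 0.1, 0.2)]
errors_down = [L_down(x) - math.log(x) for x in (0.8, 0.9, 1.1, 1.2)]

print("concave up,   L - f:", [f"{e:+.4f}" for e in errors_up])
print("concave down, L - f:", [f"{e:+.4f}" for e in errors_down])
```

In the concave-up case every signed error should be negative (the tangent line sits below the curve), and in the concave-down case every signed error should be positive, on both sides of the point of tangency.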

Activity 1.8.3

This activity concerns a function f(x) about which the following information is known:
• f is a differentiable function defined at every real number x
• f(2) = −1
• y = f′(x) has its graph given in Figure 1.8.5

Figure 1.8.5: At center, a graph of y = f′(x); at left, axes for plotting y = f(x); at right, axes for plotting
y = f′′(x).

Your task is to determine as much information as possible about f (especially near the value a = 2) by responding to
the questions below.
(a) Find a formula for the tangent line approximation, L(x), to f at the point (2, −1).
(b) Use the tangent line approximation to estimate the value of f(2.07). Show your work carefully and clearly.
(c) Sketch a graph of y = f′′(x) on the righthand grid in Figure 1.8.5; label it appropriately.
(d) Is the slope of the tangent line to y = f(x) increasing, decreasing, or neither when x = 2? Explain.
(e) Sketch a possible graph of y = f(x) near x = 2 on the lefthand grid in Figure 1.8.5. Include a sketch of y = L(x)
(found in part (a)). Explain how you know the graph of y = f(x) looks like you have drawn it.
(f) Does your estimate in (b) over- or under-estimate the true value of f(2.07)? Why?

The idea that a differentiable function looks linear and can be well-approximated by a linear function is an important one
that finds wide application in calculus. For example, by approximating a function with its local linearization, it is possible
to develop an effective algorithm to estimate the zeroes of a function. Local linearity also helps us to make further sense of
certain challenging limits. For instance, we have seen that a limit such as

\[ \lim_{x \to 0} \frac{\sin(x)}{x} \tag{1.8.6} \]

is indeterminate because both its numerator and denominator tend to 0. While there is no algebra that we can do to
simplify sin(x)/x, it is straightforward to show that the linearization of f(x) = sin(x) at the point (0, 0) is given by
L(x) = x. Hence, for values of x near 0, sin(x) ≈ x. As such, for values of x near 0,

\[ \frac{\sin(x)}{x} \approx \frac{x}{x} = 1, \tag{1.8.7} \]

which makes plausible the fact that

\[ \lim_{x \to 0} \frac{\sin(x)}{x} = 1. \tag{1.8.8} \]
x→0 x

These ideas and other applications of local linearity will be explored later on in our work.
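
The plausibility argument for the limit of sin(x)/x can be supported with numerical evidence. The sketch below simply tabulates the ratio for a few inputs near 0 chosen for illustration; it is evidence for the limit, not a proof of it.

```python
import math

# Evaluate sin(x)/x for inputs approaching 0 from both sides.
inputs = (0.5, 0.1, 0.01, 0.001, -0.001, -0.01)
ratios = []
for x in inputs:
    ratio = math.sin(x) / x
    ratios.append(ratio)
    print(f"x = {x:>7}   sin(x)/x = {ratio:.8f}")
```

The ratios should move toward 1 as x shrinks in absolute value, consistent with sin(x) ≈ L(x) = x near 0.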

Summary
In this section, we encountered the following important ideas:
The tangent line to a differentiable function y = f(x) at the point (a, f(a)) is given in point-slope form by the
equation

\[ y - f(a) = f'(a)(x - a). \tag{1.8.9} \]

The principle of local linearity tells us that if we zoom in on a point where a function y = f(x) is differentiable, the
function should become indistinguishable from its tangent line. That is, a differentiable function looks linear when
viewed up close. We rename the tangent line to be the function y = L(x) where L(x) = f(a) + f′(a)(x − a) and
note that f(x) ≈ L(x) for all x near x = a.
If we know the tangent line approximation L(x) = f(a) + f′(a)(x − a), then because L(a) = f(a) and
L′(a) = f′(a), we also know both the value and the derivative of the function y = f(x) at the point where x = a. In
other words, the linear approximation tells us the height and slope of the original function. If, in addition, we know the
value of f′′(a), we then know whether the tangent line lies above or below the graph of y = f(x) depending on the
concavity of f.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)
8 Recall that a line with slope m that passes through (x₀, y₀) has equation y − y₀ = m(x − x₀), and this is the point-slope
form of the equation.

9 It is possible to have f′′(a) = 0 and have f′′ not change sign at x = a, in which case the graph will look like one of the
first two options.

1.E: Understanding the Derivative (Exercises)
1.1: How do we Measure Velocity?
1. A bungee jumper dives from a tower at time t = 0. Her height h (measured in feet) at time t (in seconds) is given by the
graph in Figure 1.3.

Figure 1.3: A bungee jumper’s height function


In this problem, you may base your answers on estimates from the graph or use the fact that the jumper’s height function is
given by s(t) = 100 cos(0.75t) · e^(−0.2t) + 100.
(a) What is the change in vertical position of the bungee jumper between t = 0 and t = 15?
(b) Estimate the jumper’s average velocity on each of the following time intervals: [0, 15], [0, 2], [1, 6], and [8, 10].
Include units on your answers.
(c) On what time interval(s) do you think the bungee jumper achieves her greatest average velocity? Why?
(d) Estimate the jumper’s instantaneous velocity at t = 5. Show your work and explain your reasoning, and include units on
your answer.
(e) Among the average and instantaneous velocities you computed in earlier questions, which are positive and which are
negative? What does negative velocity indicate?
2. A diver leaps from a 3 meter springboard. His feet leave the board at time t = 0, he reaches his maximum height of 4.5
m at t = 1.1 seconds, and enters the water at t = 2.45. Once in the water, the diver coasts to the bottom of the pool (depth
3.5 m), touches bottom at t = 7, rests for one second, and then pushes off the bottom. From there he coasts to the surface,
and takes his first breath at t = 13.
(a) Let s(t) denote the function that gives the height of the diver’s feet (in meters) above the water at time t. (Note that the
“height” of the bottom of the pool is −3.5 meters.) Sketch a carefully labeled graph of s(t) on the provided axes in Figure
1.4. Include scale and units on the vertical axis. Be as detailed as possible.

Figure 1.4: Axes for plotting s(t) in part (a) and v(t) in part (c) of the diver problem.
(b) Based on your graph in (a), what is the average velocity of the diver between t = 2.45 and t = 7? Is his average velocity
the same on every time interval within [2.45, 7]?
(c) Let the function v(t) represent the instantaneous vertical velocity of the diver at time t (i.e. the speed at which the height
function s(t) is changing; note that velocity in the upward direction is positive, while the velocity of a falling object is
negative). Based on your understanding of the diver’s behavior, as well as your graph of the position function, sketch a
carefully labeled graph of v(t) on the axes provided in Figure 1.4. Include scale and units on the vertical axis. Write several
sentences that explain how you constructed your graph, discussing when you expect v(t) to be zero, positive, negative,
relatively large, and relatively small.
(d) Is there a connection between the two graphs that you can describe? What can you say about the velocity graph when
the height function is increasing? decreasing? Make as many observations as you can.
3. According to the U.S. census, the population of the city of Grand Rapids, MI, was 181,843 in 1980; 189,126 in 1990;
and 197,800 in 2000.
(a) Between 1980 and 2000, by how many people did the population of Grand Rapids grow?
(b) In an average year between 1980 and 2000, by how many people did the population of Grand Rapids grow?
(c) Just like we can find the average velocity of a moving body by computing change in position over change in time, we
can compute the average rate of change of any function f. In particular, the average rate of change of a function f over an
interval [a, b] is the quotient (f(b) − f(a))/(b − a). What does the quantity (f(b) − f(a))/(b − a) measure on the graph of
y = f(x) over the interval [a, b]?
(d) Let P(t) represent the population of Grand Rapids at time t, where t is measured in years from January 1, 1980. What is
the average rate of change of P on the interval t = 0 to t = 20? What are the units on this quantity?
(e) If we assume the population of Grand Rapids is growing at a rate of approximately 4% per decade, we can model the
population function with the formula P(t) = 181843(1.04)^(t/10). Use this formula to compute the average rate of change
of the population on the intervals [5, 10], [5, 9], [5, 8], [5, 7], and [5, 6].
(f) How fast do you think the population of Grand Rapids was changing on January 1, 1985? Said differently, at what rate
do you think people were being added to the population of Grand Rapids as of January 1, 1985? How many additional
people should the city have expected in the following year? Why?

1.2: The Notion of Limit


1. Consider the function whose formula is f(x) = (16 − x⁴)/(x² − 4).
(a) What is the domain of f?
(b) Use a sequence of values of x near a = 2 to estimate the value of lim_{x→2} f(x), if you think the limit exists. If you
think the limit doesn’t exist, explain why.
(c) Evaluate lim_{x→2} f(x) exactly, if the limit exists, or explain how your work shows the limit fails to exist. Here you
should use algebra to factor and simplify the numerator and denominator of f(x) as you work to evaluate the limit. Discuss
how your findings compare to your results in (b).
(d) True or false: f(2) = −8. Why?
(e) True or false: (16 − x⁴)/(x² − 4) = −4 − x². Why? How is this equality connected to your work above with the function f?
(f) Based on all of your work above, construct an accurate, labeled graph of y = f(x) on the interval [1, 3], and write a
sentence that explains what you now know about lim_{x→2} (16 − x⁴)/(x² − 4).
2. Let g(x) = −|x + 3|/(x + 3).
(a) What is the domain of g?
(b) Use a sequence of values near a = −3 to estimate the value of lim_{x→−3} g(x), if you think the limit exists. If you
think the limit doesn’t exist, explain why.
(c) Evaluate lim_{x→−3} g(x) exactly, if the limit exists, or explain how your work shows the limit fails to exist. Here you
should use the definition of the absolute value function in the numerator of g(x) as you work to evaluate the limit. Discuss
how your findings compare to your results in (b). (Hint: |a| = a whenever a ≥ 0, but |a| = −a whenever a < 0.)
(d) True or false: g(−3) = −1. Why?
(e) True or false: −|x + 3|/(x + 3) = −1. Why? How is this equality connected to your work above with the function g?
(f) Based on all of your work above, construct an accurate, labeled graph of y = g(x) on the interval [−4, −2], and write a
sentence that explains what you now know about lim_{x→−3} g(x).
3. For each of the following prompts, sketch a graph on the provided axes of a function that has the stated properties.

Figure 1.9: Axes for plotting y = f (x) in (a) and y = g(x) in (b).
(a) y = f(x) such that
• f(−2) = 2 and lim_{x→−2} f(x) = 1
• f(−1) = 3 and lim_{x→−1} f(x) = 3
• f(1) is not defined and lim_{x→1} f(x) = 0
• f(2) = 1 and lim_{x→2} f(x) does not exist.
(b) y = g(x) such that
• g(−2) = 3, g(−1) = −1, g(1) = −2, and g(2) = 3
• At x = −2, −1, 1 and 2, g has a limit, and its limit equals the value of the function at that point.
• g(0) is not defined and lim_{x→0} g(x) does not exist.
4. A bungee jumper dives from a tower at time t = 0. Her height s in feet at time t in seconds is given by
s(t) = 100 cos(0.75t) · e^(−0.2t) + 100.
(a) Write an expression for the average velocity of the bungee jumper on the interval [1, 1 + h].
(b) Use computing technology to estimate the value of the limit as h → 0 of the quantity you found in (a).
(c) What is the meaning of the value of the limit in (b)? What are its units?

1.3: The Derivative of a Function at a Point


1. Consider the graph of y = f(x) provided in Figure 1.16.
(a) On the graph of y = f(x), sketch and label the following quantities:
• the secant line to y = f(x) on the interval [−3, −1] and the secant line to y = f(x) on the interval [0, 2].
• the tangent line to y = f(x) at x = −3 and the tangent line to y = f(x) at x = 0.

Figure 1.16: Plot of y = f (x).


(b) What is the approximate value of the average rate of change of f on [−3, −1]? On [0, 2]? How are these values related
to your work in (a)?
(c) What is the approximate value of the instantaneous rate of change of f at x = −3? At x = 0? How are these values
related to your work in (a)?
2. For each of the following prompts, sketch a graph on the provided axes in Figure 1.17 of a function that has the stated
properties.
(a) y = f(x) such that
• the average rate of change of f on [−3, 0] is −2 and the average rate of change of f on [1, 3] is 0.5, and
• the instantaneous rate of change of f at x = −1 is −1 and the instantaneous rate of change of f at x = 2 is 1.
(b) y = g(x) such that
• (g(3) − g(−2))/5 = 0 and (g(1) − g(−1))/2 = −1, and
• g′(2) = 1 and g′(−1) = 0
3. Suppose that the population, P, of China (in billions) can be approximated by the function P(t) = 1.15(1.014)^t where t is
the number of years since the start of 1993.
(a) According to the model, what was the total change in the population of China between January 1, 1993 and January 1,
2000? What will be the average rate of change of the population over this time period? Is this average rate of change
greater or less than the instantaneous rate of change of the population on January 1, 2000? Explain and justify, being sure
to include proper units on all your answers.

Figure 1.17: Axes for plotting y = f (x) in (a) and y = g(x) in (b).
(b) According to the model, what is the average rate of change of the population of China in the ten-year period starting on
January 1, 2012?
(c) Write an expression involving limits that, if evaluated, would give the exact instantaneous rate of change of the
population on today’s date. Then estimate the value of this limit (discuss how you chose to do so) and explain the meaning
(including units) of the value you have found.
(d) Find an equation for the tangent line to the function y = P(t) at the point where the t-value is given by today’s date.
4. The goal of this problem is to compute the value of the derivative at a point for several different functions, where for
each one we do so in three different ways, and then to compare the results to see that each produces the same value. For
each of the following functions, use the limit definition of the derivative to compute the value of f′(a) using three
different approaches: strive to use the algebraic approach first (to compute the limit exactly), then test your result using
numerical evidence (with small values of h), and finally plot the graph of y = f(x) near (a, f(a)) along with the appropriate
tangent line to estimate the value of f′(a) visually. Compare your findings among all three approaches; if you are unable
to complete the algebraic approach, still work numerically and graphically.
(a) f(x) = x² − 3x, a = 2
(b) f(x) = 1/x, a = 1
(c) f(x) = √x, a = 1
(d) f(x) = 2 − |x − 1|, a = 1
(e) f(x) = sin(x), a = π/2

1.4: The Derivative Function


1. Let f be a function with the following properties: f is differentiable at every value of x (that is, f has a derivative at every
point), f(−2) = 1, and f′(−2) = −2, f′(−1) = −1, f′(0) = 0, f′(1) = 1, and f′(2) = 2.
(a) On the axes provided at left in Figure 1.19, sketch a possible graph of y = f(x). Explain why your graph meets the
stated criteria.
(b) On the axes at right in Figure 1.19, sketch a possible graph of y = f′(x). What type of curve does the provided data
suggest for the graph of y = f′(x)?
(c) Conjecture a formula for the function y = f(x). Use the limit definition of the derivative to determine the corresponding
formula for y = f′(x). Discuss both graphical and algebraic evidence for whether or not your conjecture is correct.

Figure 1.19: Axes for plotting y = f(x) in (a) and y = f′(x) in (b).
2. Consider the function g(x) = x² − x + 3.
(a) Use the limit definition of the derivative to determine a formula for g′(x).
(b) Use a graphing utility to plot both y = g(x) and your result for y = g′(x); does your formula for g′(x) generate the
graph you expected?
(c) Use the limit definition of the derivative to find a formula for p′(x) where p(x) = 5x² − 4x + 12.
(d) Compare and contrast the formulas for g′(x) and p′(x) you have found. How do the constants 5, 4, 12, and 3 affect
the results?
3. Let g be a continuous function (that is, one with no jumps or holes in the graph) and suppose that a graph of y = g′(x)
is given by the graph on the right in Figure 1.20.

Figure 1.20: Axes for plotting y = g(x) and, at right, the graph of y = g′(x).
(a) Observe that for every value of x that satisfies 0 < x < 2, the value of g′(x) is constant. What does this tell you about
the behavior of the graph of y = g(x) on this interval?
(b) On what intervals other than 0 < x < 2 do you expect y = g(x) to be a linear function? Why?
(c) At which values of x is g′(x) not defined? What behavior does this lead you to expect to see in the graph of y = g(x)?
(d) Suppose that g(0) = 1. On the axes provided at left in Figure 1.20, sketch an accurate graph of y = g(x).
4. For each graph that provides an original function y = f(x) in Figure 1.21 (on the following page), your task is to sketch
an approximate graph of its derivative function, y = f′(x), on the axes immediately below. View the scale of the grid for the
graph of f as being 1 × 1, and assume the horizontal scale of the grid for the graph of f′ is identical to that for f. If you
need to adjust the vertical scale on the axes for the graph of f′, you should label that accordingly.

Figure 1.21: Graphs of y = f(x) and grids for plotting the corresponding graph of y = f′(x).

1.5: Interpreting, Estimating, and Using the Derivative


1. A cup of coffee has its temperature F (in degrees Fahrenheit) at time t given by the function F(t) = 75 + 110e^(−0.05t),
where time is measured in minutes.
(a) Use a central difference with h = 0.01 to estimate the value of F′(10).
(b) What are the units on the value of F′(10) that you computed in (a)? What is the practical meaning of the value of
F′(10)?
(c) Which do you expect to be greater: F′(10) or F′(20)? Why?
(d) Write a sentence that describes the behavior of the function y = F′(t) on the time interval 0 ≤ t ≤ 30. How do you think
its graph will look? Why?
2. The temperature change T (in Fahrenheit degrees) in a patient that is generated by a dose q (in milliliters) of a drug is
given by the function T = f(q).
(a) What does it mean to say f(50) = 0.75? Write a complete sentence to explain, using correct units.
(b) A person’s sensitivity, s, to the drug is defined by the function s(q) = f′(q). What are the units of sensitivity?
(c) Suppose that f′(50) = −0.02. Write a complete sentence to explain the meaning of this value. Include in your response
the information given in (a).
3. The velocity of a ball that has been tossed vertically in the air is given by v(t) = 16 − 32t, where v is measured in feet per
second, and t is measured in seconds. The ball is in the air from t = 0 until t = 2.

(a) When is the ball’s velocity greatest?
(b) Determine the value of v′(1). Justify your thinking.
(c) What are the units on the value of v′(1)? What do this value and the corresponding units tell you about the behavior
of the ball at time t = 1?
(d) What is the physical meaning of the function v′(t)?
4. The value, V, of a particular automobile (in dollars) depends on the number of miles, m, the car has been driven,
according to the function V = h(m).
(a) Suppose that h(40000) = 15500 and h(55000) = 13200. What is the average rate of change of h on the interval [40000,
55000], and what are the units on this value?
(b) In addition to the information given in (a), say that h(70000) = 11100. Determine the best possible estimate of
h′(55000) and write one sentence to explain the meaning of your result, including units on your answer.
(c) Which value do you expect to be greater: h′(30000) or h′(80000)? Why?
(d) Write a sentence to describe the long-term behavior of the function V = h(m), plus another sentence to describe the
long-term behavior of h′(m). Provide your discussion in practical terms regarding the value of the car and the rate at
which that value is changing.

1.6: The Second Derivative


1. Suppose that y = f(x) is a differentiable function for which the following information is known: f(2) = −3, f′(2) = 1.5,
f′′(2) = −0.25.
(a) Is f increasing or decreasing at x = 2? Is f concave up or concave down at x = 2?
(b) Do you expect f(2.1) to be greater than −3, equal to −3, or less than −3? Why?
(c) Do you expect f′(2.1) to be greater than 1.5, equal to 1.5, or less than 1.5? Why?
(d) Sketch a graph of y = f(x) near (2, f(2)) and include a graph of the tangent line.
2. For a certain function y = g(x), its derivative is given by the function pictured in Figure 1.34.

Figure 1.34: The graph of y = g′(x).


(a) What is the approximate slope of the tangent line to y = g(x) at the point (2, g(2))?
(b) How many real number solutions can there be to the equation g(x) = 0? Justify your conclusion fully and carefully by
explaining what you know about how the graph of g must behave based on the given graph of g′.
(c) On the interval −3 < x < 3, how many times does the concavity of g change? Why?
(d) Use the provided graph to estimate the value of g′′(2).
3. A bungee jumper’s height h (in feet) at time t (in seconds) is given in part by the data in the following table:

t     0.0    0.5    1.0    1.5    2.0    2.5    3.0    3.5    4.0    4.5    5.0
h(t)  200    184.2  159.9  131.9  104.7  81.8   65.5   56.8   55.5   60.4   69.8

t     5.5    6.0    6.5    7.0    7.5    8.0    8.5    9.0    9.5    10.0
h(t)  81.6   93.7   104.4  112.6  117.7  119.4  118.2  114.8  110.0  104.7

(a) Use the given data to estimate h′(4.5), h′(5), and h′(5.5). At which of these times is the bungee jumper rising most
rapidly?
(b) Use the given data and your work in (a) to estimate h′′(5).

(c) What physical property of the bungee jumper does the value of h′′(5) measure? What are its units?
(d) Based on the data, on what approximate time intervals is the function y = h(t) concave down? What is happening to the
velocity of the bungee jumper on these time intervals?
4. For each prompt that follows, sketch a possible graph of a function on the interval −3 < x < 3 that satisfies the stated
properties.
(a) y = f (x) such that f is increasing on −3 < x < 3, f is concave up on −3 < x < 0, and f is concave down on 0 < x < 3.
(b) y = g(x) such that g is increasing on −3 < x < 3, g is concave down on −3 < x < 0, and g is concave up on 0 < x < 3.
(c) y = h(x) such that h is decreasing on −3 < x < 3, h is concave up on −3 < x < −1, neither concave up nor concave down
on −1 < x < 1, and h is concave down on 1 < x < 3.
(d) y = p(x) such that p is decreasing and concave down on −3 < x < 0 and p is increasing and concave down on 0 < x < 3.

1.7: Limits, Continuity, and Differentiability


1. Consider the graph of the function y = p(x) that is provided in Figure 1.42. Assume that each portion of the graph of p is
a straight line, as pictured.

Figure 1.42: At left, the piecewise linear function y = p(x). At right, axes for plotting y = p′(x).
(a) State all values of a for which \(\lim_{x \to a} p(x)\) does not exist.
(b) State all values of a for which p is not continuous at a.
(c) State all values of a for which p is not differentiable at x = a.
(d) On the axes provided in Figure 1.42, sketch an accurate graph of y = p′(x).
2. For each of the following prompts, give an example of a function that satisfies the stated criteria. A formula or a graph,
with reasoning, is sufficient for each. If no such example is possible, explain why.
(a) A function f that is continuous at a = 2 but not differentiable at a = 2.
(b) A function g that is differentiable at a = 3 but does not have a limit at a = 3.
(c) A function h that has a limit at a = −2, is defined at a = −2, but is not continuous at a = −2.
(d) A function p that satisfies all of the following:
• p(−1) = 3 and \(\lim_{x \to -1} p(x) = 2\)
• p(0) = 1 and p′(0) = 0
• \(\lim_{x \to 1} p(x) = p(1)\) and p′(1) does not exist
3. Let h(x) be a function whose derivative y = h′(x) is given by the graph on the right in Figure 1.43.
(a) Based on the graph of y = h′(x), what can you say about the behavior of the function y = h(x)?
(b) At which values of x is y = h′(x) not defined? What behavior does this lead you to expect to see in the graph of y = h(x)?
(c) Is it possible for y = h(x) to have points where h is not continuous? Explain your answer.

(d) On the axes provided at left, sketch at least two distinct graphs that are possible functions y = h(x) that each have a derivative y = h′(x) that matches the provided graph at right. Explain why there are multiple possibilities for y = h(x).

Figure 1.43: Axes for plotting y = h(x) and, at right, the graph of y = h′(x).
4. Consider the function \(g(x) = \sqrt{|x|}\).
(a) Use a graph to explain visually why g is not differentiable at x = 0.
(b) Use the limit definition of the derivative to show that \(g'(0) = \lim_{h \to 0} \frac{\sqrt{|h|}}{h}\).
(c) Investigate the value of g′(0) by estimating the limit in (b) using small positive and negative values of h. For instance, you might compute \(\frac{\sqrt{|-0.01|}}{0.01}\). Be sure to use several different values of h (both positive and negative), including ones closer to 0 than 0.01. What do your results tell you about g′(0)?
(d) Use your graph in (a) to sketch an approximate graph of y = g′(x).

1.8: The Tangent Line Approximation


1. A certain function y = p(x) has its local linearization at a = 3 given by L(x) = −2x + 5.
(a) What are the values of p(3) and p′(3)? Why?
(b) Estimate the value of p(2.79).
(c) Suppose that p′′(3) = 0 and you know that p′′(x) < 0 for x < 3. Is your estimate in (b) too large or too small?
(d) Suppose that p′′(x) > 0 for x > 3. Use this fact and the additional information above to sketch an accurate graph of y = p(x) near x = 3. Include a sketch of y = L(x) in your work.
2. A potato is placed in an oven, and the potato’s temperature F (in degrees Fahrenheit) at various points in time is taken
and recorded in the following table. Time t is measured in minutes.
t      0     15     30    45    60     75     90
F(t)   70    180.5  251   296   324.5  342.8  354.5

(a) Use a central difference to estimate F′(60). Use this estimate as needed in subsequent questions.
(b) Find the local linearization y = L(t) to the function y = F(t) at the point where a = 60.
(c) Determine an estimate for F(63) by employing the local linearization.
(d) Do you think your estimate in (c) is too large or too small? Why?
3. An object moving along a straight line path has a differentiable position function y = s(t); s(t) measures the object’s
position relative to the origin at time t. It is known that at time t = 9 seconds, the object’s position is s(9) = 4 feet (i.e., 4
feet to the right of the origin). Furthermore, the object’s instantaneous velocity at t = 9 is −1.2 feet per second, and its
acceleration at the same instant is 0.08 feet per second per second.
(a) Use local linearity to estimate the position of the object at t = 9.34.
(b) Is your estimate likely too large or too small? Why?
(c) In everyday language, describe the behavior of the moving object at t = 9. Is it moving toward the origin or away from
it? Is its velocity increasing or decreasing?

4. For a certain function f, its derivative is known to be \(f'(x) = (x - 1)e^{-x^2}\). Note that you do not know a formula for y = f(x).
(a) At what x-value(s) is f′(x) = 0? Justify your answer algebraically, but include a graph of f′ to support your conclusion.
(b) Reasoning graphically, for what intervals of x-values is f′′(x) > 0? What does this tell you about the behavior of the original function f? Explain.
(c) Assuming that f (2) = −3, estimate the value of f (1.88) by finding and using the tangent line approximation to f at x =
2. Is your estimate larger or smaller than the true value of f (1.88)? Justify your answer.
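Several exercises in this set rely on the local linearization L(t) = F(a) + F′(a)(t − a). The following is a minimal sketch of that computation, assuming the potato-temperature table from exercise 2 and, so as not to answer the exercise itself, linearizing at a = 30 rather than a = 60. The helper name `local_linearization` is hypothetical.

```python
# Potato temperature data from exercise 2: time in minutes -> degrees Fahrenheit.
temperature = {0: 70, 15: 180.5, 30: 251, 45: 296, 60: 324.5, 75: 342.8, 90: 354.5}


def local_linearization(a, value_at_a, slope_at_a):
    """Return the function L(t) = F(a) + F'(a) * (t - a)."""
    return lambda t: value_at_a + slope_at_a * (t - a)


# Central-difference estimate of F'(30) from the neighbouring table entries.
slope = (temperature[45] - temperature[15]) / (45 - 15)

L = local_linearization(30, temperature[30], slope)
print(slope)   # estimated rate of change, in degrees per minute
print(L(33))   # linear estimate of the temperature three minutes later
```

Because the table suggests the temperature is rising at a decreasing rate, an estimate of this kind is worth comparing against the concavity reasoning requested in part (d).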

CHAPTER OVERVIEW
2: COMPUTING DERIVATIVES
Throughout Chapter 2, we will be working to develop shortcut derivative rules that will help us to bypass the limit definition of the
derivative in order to quickly determine the formula for f′(x) when we are given a formula for f(x).

2.1: ELEMENTARY DERIVATIVE RULES


The limit definition of the derivative leads to patterns among certain families of functions that enable us to compute derivative
formulas without resorting directly to the limit definition. If we are given a constant multiple of a function whose derivative we know,
or a sum of functions whose derivatives we know, the Constant Multiple and Sum Rules make it straightforward to compute the
derivative of the overall function.

2.2: THE SINE AND COSINE FUNCTION


In this section, we are going to work to conjecture formulas for the derivatives of the sine and cosine functions, primarily through a graphical argument.
To help set the stage for doing so, the following preview activity asks you to think about exponential functions and why it is
reasonable to think that the derivative of an exponential function is a constant times the exponential function itself.

2.3: THE PRODUCT AND QUOTIENT RULES


If a function is a sum, product, or quotient of simpler functions, then we can use the sum, product, or quotient rules to differentiate the
overall function in terms of the simpler functions and their derivatives. The product and quotient rules now complement the constant
multiple and sum rules and enable us to compute the derivative of any function that consists of sums, constant multiples, products,
and quotients of basic functions we already know how to differentiate.

2.4: DERIVATIVES OF OTHER TRIGONOMETRIC FUNCTIONS


The derivatives of the other four trigonometric functions are derived. These four rules for the derivatives of the tangent, cotangent,
secant, and cosecant can be used along with the rules for power functions, exponential functions, and the sine and cosine, as well as
the sum, constant multiple, product, and quotient rules, to quickly differentiate a wide range of different functions.

2.5: THE CHAIN RULE


In this section, we encountered the following important ideas: A composite function is one where the input variable x first passes
through one function, and then the resulting output passes through another.

2.6: DERIVATIVES OF INVERSE FUNCTIONS


Because each function represents a process, a natural question to ask is whether or not the particular process can be reversed. That is,
if we know the output that results from the function, can we determine the input that led to it? Connected to this question, we now
also ask: if we know how fast a particular process is changing, can we determine how fast the inverse process is changing?

2.7: DERIVATIVES OF FUNCTIONS GIVEN IMPLICITLY


Implicit differentiation is used to identify the derivative of a function y(x) from an equation where y cannot be solved for explicitly in
terms of x, but where portions of the curve can be thought of as being generated by explicit functions of x. In this case, we say that y
is an implicit function of x. In the process of implicit differentiation, we take the equation that generates an implicitly given curve and
differentiate both sides with respect to x while treating y as a function of x.

2.8: USING DERIVATIVES TO EVALUATE LIMITS


Derivatives can be used to help us evaluate indeterminate limits of the form \(\frac{0}{0}\) through L’Hopital’s Rule, which is developed by
replacing the functions in the numerator and denominator with their tangent line approximations. A version of L’Hopital’s Rule also
allows us to use derivatives to assist us in evaluating other indeterminate limits.

2.E: COMPUTING DERIVATIVES (EXERCISES)


These are homework exercises to accompany Chapter 2 of Boelkins et al. "Active Calculus" Textmap.

2.1: Elementary Derivative Rules
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What are alternate notations for the derivative?
How can we sometimes use the algebraic structure of a function f(x) to easily compute a formula for f′(x)?
What is the derivative of a power function of the form \(f(x) = x^n\)? What is the derivative of an exponential function of the form \(f(x) = a^x\)?

If we know the derivative of y = f (x), how is the derivative of y = kf (x) computed, where k is a constant?
If we know the derivatives of y = f (x) and y = g(x) , how is the derivative of y = f (x) + g(x) computed?

In Chapter 1, we developed the concept of the derivative of a function. We now know that the derivative f′ of a function f
measures the instantaneous rate of change of f with respect to x as well as the slope of the tangent line to y = f (x) at any
given value of x. To date, we have focused primarily on interpreting the derivative graphically or, in the context of
functions in a physical setting, as a meaningful rate of change. To actually calculate the value of the derivative at a specific
point, we have typically relied on the limit definition of the derivative.
In this present chapter, we will investigate how the limit definition of the derivative,

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \tag{2.1.1} \]

leads to interesting patterns and rules that enable us to quickly find a formula for f′(x) based on the formula for f(x)
without using the limit definition directly. For example, we already know that if f(x) = x, then it follows that f′(x) = 1.

While we could use the limit definition of the derivative to confirm this, we know it to be true because f (x) is a linear
function with slope 1 at every value of x. One of our goals is to be able to take standard functions, say ones such as
\[ g(x) = 4x^7 - \sin(x) + 3e^x, \tag{2.1.2} \]

and, based on the algebraic form of the function, be able to apply shortcuts to almost immediately determine the formula
for g′(x).
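Before developing shortcuts, it may help to see the difference quotient from (2.1.1) at work numerically. The sketch below is only an illustration under our own choices: the function \(x^2\), the input x = 3, and the particular values of h are assumptions made for the example, and `difference_quotient` is a name we introduce here.

```python
def difference_quotient(f, x, h):
    """Average rate of change of f on the interval from x to x + h."""
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x


# As h shrinks toward 0 (from either side), the quotients settle near a single value.
for h in [0.1, 0.01, 0.001, -0.001]:
    print(h, difference_quotient(square, 3.0, h))
```

The printed values cluster around 6, which is consistent with the Chapter 1 result that the derivative of \(x^2\) is 2x. The rules in this section let us reach such values without tabulating quotients.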

Preview Activity 2.1.1

Functions of the form \(f(x) = x^n\), where n = 1, 2, 3, …, are often called power functions. The first two questions
below revisit work we did earlier in Chapter 1, and the following questions extend those ideas to higher powers of x.
a. Use the limit definition of the derivative to find f′(x) for \(f(x) = x^2\).
b. Use the limit definition of the derivative to find f′(x) for \(f(x) = x^3\).
c. Use the limit definition of the derivative to find f′(x) for \(f(x) = x^4\). (Hint: \((a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4\). Apply this rule to \((x + h)^4\) within the limit definition.)
d. Based on your work in (a), (b), and (c), what do you conjecture is the derivative of \(f(x) = x^5\)? Of \(f(x) = x^{13}\)?
e. Conjecture a formula for the derivative of \(f(x) = x^n\) that holds for any positive integer n. That is, given \(f(x) = x^n\) where n is a positive integer, what do you think is the formula for f′(x)?

Some Key Notation


In addition to our usual f′ notation for the derivative, there are other ways to symbolically denote the derivative of a
function, as well as the instruction to take the derivative. We know that if we have a function, say \(f(x) = x^2\), we can
denote its derivative by f′(x), and we write f′(x) = 2x. Equivalently, if we are thinking more about the relationship
between y and x, we sometimes denote the derivative of y with respect to x with the symbol
\[ \frac{dy}{dx} \tag{2.1.3} \]

which we read “dee-y dee-x.” This notation comes from the fact that the derivative is related to the slope of a line, and
slope is measured by \(\frac{\Delta y}{\Delta x}\). Note that while we read \(\frac{\Delta y}{\Delta x}\) as “change in y over change in x,” we view the derivative symbol \(\frac{dy}{dx}\)
as a single symbol, not a quotient of two quantities¹. For example, if \(y = x^2\), we’ll write that the derivative is \(\frac{dy}{dx} = 2x\).
Furthermore, we use a variant of \(\frac{dy}{dx}\) notation to convey the instruction to take the derivative of a certain quantity with
respect to a given variable. In particular, if we write
\[ \frac{d}{dx}[\ ] \tag{2.1.4} \]
this means “take the derivative of the quantity in [ ] with respect to x.” To continue our example above with the squaring
function, here we may write \(\frac{d}{dx}[x^2] = 2x\).

It is important to note that the independent variable can be different from x. If we have \(f(z) = z^2\), we then write
f′(z) = 2z. Similarly, if \(y = t^2\), we can say \(\frac{dy}{dt} = 2t\). And changing the variable and derivative notation once more, it is
also true that \(\frac{d}{dq}[q^2] = 2q\). This notation may also be applied to second derivatives:
\[ f''(z) = \frac{d}{dz}\left[\frac{df}{dz}\right] = \frac{d^2 f}{dz^2}. \tag{2.1.5} \]

In what follows, we’ll be working to widely expand our repertoire of functions for which we can quickly compute the
corresponding derivative formula.

Constant, Power, and Exponential Functions


So far, we know the derivative formula for two important classes of functions: constant functions and power functions. For
the first kind, observe that if f(x) = c is a constant function, then its graph is a horizontal line with slope zero at every
point. Thus, \(\frac{d}{dx}[c] = 0\). We summarize this with the following rule.

Constant Functions
For any real number c, if f(x) = c, then f′(x) = 0.

Thus, if f(x) = 7, then f′(x) = 0. Similarly, \(\frac{d}{dx}[\sqrt{3}] = 0\).
For power functions, from your work in Preview Activity 2.1.1, you have conjectured that for any positive integer n, if
\(f(x) = x^n\), then \(f'(x) = nx^{n-1}\). Not only can this rule be formally proved to hold for any positive integer n, but it also holds for
any nonzero real number n (positive or negative).

Power Functions
For any nonzero real number n, if \(f(x) = x^n\), then \(f'(x) = nx^{n-1}\).
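The power rule can be checked numerically, which is not a proof but may build confidence that it extends beyond positive integers. The following sketch assumes a central-difference estimate with a small step (h = 1e-6) and the sample input x = 2. The helper names and the chosen exponents are ours, not the text's.

```python
def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x) using points on both sides of x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def power_rule(n, x):
    """Value of n * x**(n - 1), the derivative predicted by the power rule."""
    return n * x ** (n - 1)


x = 2.0
for n in [2, 3, 13, -1, 0.75]:
    numeric = central_difference(lambda t: t ** n, x)
    print(n, numeric, power_rule(n, x))
```

For each exponent, the two printed values should agree to several decimal places, including the negative and fractional cases.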

As we next turn to thinking about derivatives of combinations of basic functions, it will be instructive to have one more
type of basic function whose derivative formula we know. For now, we simply state this rule without explanation or
justification; we will explore why this rule is true in one of the exercises at the end of this section, plus we will encounter
graphical reasoning for why the rule is plausible in Preview Activity 2.2.

Exponential Functions
For any positive real number a, if \(f(x) = a^x\), then \(f'(x) = a^x \ln(a)\).

For instance, this rule tells us that if


\[ f(x) = 2^x \tag{2.1.6} \]

then

\[ f'(x) = 2^x \ln(2). \tag{2.1.7} \]

Similarly, for

\[ p(t) = 10^t \tag{2.1.8} \]

we have

\[ p'(t) = 10^t \ln(10). \tag{2.1.9} \]

It is especially important to note that when a = e, where e is the base of the natural logarithm function, we have that
\[ \frac{d}{dx}[e^x] = e^x \ln(e) = e^x \tag{2.1.10} \]

since ln(e) = 1. This is an extremely important property of the function \(e^x\): its derivative function is itself!

Finally, note carefully the distinction between power functions and exponential functions: in power functions, the variable
is in the base, as in \(x^2\), while in exponential functions, the variable is in the power, as in \(2^x\). As we can see from the rules,

this makes a big difference in the form of the derivative.
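The exponential rule is stated here without justification, so a numerical comparison may be reassuring. The sketch below assumes a central-difference estimate (step h = 1e-6) at the arbitrary input x = 1.5. The function names are hypothetical labels for this example only.

```python
import math


def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x) using points on both sides of x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def exponential_rule(a, x):
    """Value of a**x * ln(a), the derivative predicted by the exponential rule."""
    return a ** x * math.log(a)


x = 1.5
for a in [2.0, 10.0, math.e]:
    numeric = central_difference(lambda t: a ** t, x)
    print(a, numeric, exponential_rule(a, x))

# For a = e the factor ln(a) equals 1, so the derivative matches the function itself.
print(math.e ** x)
```

Note that the prediction for base 2 differs sharply from what the power rule would give for \(x^2\) at the same input, which echoes the distinction drawn above.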


The following activity will check your understanding of the derivatives of the three basic types of functions noted above.

Activity 2.1.1

Use the three rules above to determine the derivative of each of the following functions. For each, state your answer
using full and proper notation, labeling the derivative with its name. For example, if you are given a function h(z),
you should write “h′(z) =” or “\(\frac{dh}{dz} =\)” as part of your response.

a. \(f(t) = \pi\)
b. \(g(z) = 7^z\)
c. \(h(w) = w^{3/4}\)
d. \(p(x) = 3^{1/2}\)
e. \(r(t) = (\sqrt{2})^t\)
f. \(\frac{d}{dq}[q^{-1}]\)
g. \(m(t) = \frac{1}{t^3}\)

Constant Multiples and Sums of Functions


Of course, most of the functions we encounter in mathematics are more complicated than being simply constant, a power
of a variable, or a base raised to a variable power. In this section and several following, we will learn how to quickly
compute the derivative of a function constructed as an algebraic combination of basic functions. For instance, we’d like to
be able to understand how to take the derivative of a polynomial function such as
\[ p(t) = 3t^5 - 7t^4 + t^2 - 9, \tag{2.1.11} \]

which is a function made up of constant multiples and sums of powers of t . To that end, we develop two new rules: the
Constant Multiple Rule and the Sum Rule.
Say we have a function y = f (x) whose derivative formula is known. How is the derivative of y = kf (x) related to the
derivative of the original function? Recall that when we multiply a function by a constant k , we vertically stretch the graph
by a factor of |k| (and reflect the graph across y = 0 if k < 0 ). This vertical stretch affects the slope of the graph, making
the slope of the function y = kf (x) be k times as steep as the slope of y = f (x). In terms of the derivative, this is
essentially saying that when we multiply a function by a factor of k , we change the value of its derivative by a factor of k
as well. Thus², the Constant Multiple Rule holds:

The Constant Multiple Rule

For any real number k, if f(x) is a differentiable function with derivative f′(x), then \(\frac{d}{dx}[kf(x)] = kf'(x)\).

In words, this rule says that “the derivative of a constant times a function is the constant times the derivative of the
function.” For example, if
\[ g(t) = 3 \cdot 5^t, \tag{2.1.12} \]

we have

\[ g'(t) = 3 \cdot 5^t \ln(5). \tag{2.1.13} \]

Similarly,
\[ \frac{d}{dz}[5z^{-2}] = 5(-2z^{-3}). \tag{2.1.14} \]
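The footnote to the Constant Multiple Rule notes that it can be proved from properties of limits. A brief sketch of that argument, assuming f is differentiable at x and using the fact that a constant factor can be moved outside a limit, might read as follows.

```latex
\begin{align*}
\frac{d}{dx}[k f(x)]
  &= \lim_{h \to 0} \frac{k f(x + h) - k f(x)}{h} \\
  &= \lim_{h \to 0} k \cdot \frac{f(x + h) - f(x)}{h} \\
  &= k \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
  &= k f'(x).
\end{align*}
```

The first line is the limit definition (2.1.1) applied to the function kf(x). The remaining steps only factor out k and recognize the limit definition of f′(x).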

Next we examine what happens when we take a sum of two functions. If we have y = f (x) and y = g(x) , we can
compute a new function y = (f + g)(x) by adding the outputs of the two functions: (f + g)(x) = f (x) + g(x) . Not only
does this result in the value of the new function being the sum of the values of the two known functions, but also the slope
of the new function is the sum of the slopes of the known functions. Therefore³, we arrive at the following Sum Rule for
derivatives:

The Sum Rule

If f(x) and g(x) are differentiable functions with derivatives f′(x) and g′(x) respectively, then \(\frac{d}{dx}[f(x) + g(x)] = f'(x) + g'(x)\).

In words, the Sum Rule tells us that “the derivative of a sum is the sum of the derivatives.” It also tells us that any time we
take a sum of two differentiable functions, the result must also be differentiable. Furthermore, because we can view the
difference function

y = (f − g)(x) = f (x) − g(x) as y = f (x) + (−1 ⋅ g(x)), (2.1.15)

the Sum and Constant Multiple Rules together tell us that
\[ \frac{d}{dx}[f(x) + (-1 \cdot g(x))] = f'(x) - g'(x), \tag{2.1.16} \]

or that “the derivative of a difference is the difference of the derivatives.” Hence we can now compute derivatives of sums
and differences of elementary functions. For instance,
\[ \frac{d}{dw}(2^w + w^2) = 2^w \ln(2) + 2w, \tag{2.1.17} \]

and if
\[ h(q) = 3q^6 - 4q^{-3}, \tag{2.1.18} \]

then
\[ h'(q) = 3(6q^5) - 4(-3q^{-4}) = 18q^5 + 12q^{-4}. \tag{2.1.19} \]
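The way the power, Constant Multiple, and Sum Rules combine term by term can be mirrored in a few lines of code. The sketch below assumes a representation of our own choosing, in which a sum of terms \(c x^n\) is stored as a dictionary mapping each exponent n to its coefficient c. The function `differentiate_terms` is a hypothetical helper, not something from the text.

```python
def differentiate_terms(terms):
    """Differentiate a sum of terms c * x**n, given as {exponent: coefficient}.

    Each term c * x**n contributes (c * n) * x**(n - 1) by the power and
    Constant Multiple Rules; the Sum Rule lets us treat the terms one at a time.
    """
    derivative = {}
    for n, c in terms.items():
        if n != 0:  # a constant term has derivative 0 and contributes nothing
            derivative[n - 1] = derivative.get(n - 1, 0) + c * n
    return derivative


# h(q) = 3q^6 - 4q^(-3), as in (2.1.18)
h = {6: 3, -3: -4}
print(differentiate_terms(h))  # {5: 18, -4: 12}
```

The output encodes \(18q^5 + 12q^{-4}\), in agreement with (2.1.19). Applying the function to a polynomial also shows directly that the highest exponent drops by one.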

Activity 2.1.2

Use only the rules for constant, power, and exponential functions, together with the Constant Multiple and Sum Rules,
to compute the derivative of each function below with respect to the given independent variable. Note well that we do
not yet know any rules for how to differentiate the product or quotient of functions. This means that you may have to
do some algebra first on the functions below before you can actually use existing rules to compute the desired
derivative formula. In each case, label the derivative you calculate with its name using proper notation such as f′(x), h′(z), dr/dt, etc.

a. \(f(x) = x^{5/3} - x^4 + 2^x\)
b. \(g(x) = 14e^x + 3x^5 - x\)

c. \(h(z) = \sqrt{z} + \frac{1}{z^4} + 5^z\)
d. \(r(t) = \sqrt{53}\, t^7 - \pi e^t + e^4\)
e. \(s(y) = (y^2 + 1)(y^2 - 1)\)
f. \(q(x) = \frac{x^3 - x + 2}{x}\)
g. \(p(a) = 3a^4 - 2a^3 + 7a^2 - a + 12\)

In the same way that we have shortcut rules to help us find derivatives, we introduce some language that is simpler and
shorter. Often, rather than say “take the derivative of f ,” we’ll instead say simply “differentiate f .” This phrasing is tied to
the notion of having a derivative to begin with: if the derivative exists at a point, we say “f is differentiable,” which is tied
to the fact that f can be differentiated.
As we work more and more with the algebraic structure of functions, it is important to strive to develop a big picture view
of what we are doing. Here, we can note several general observations based on the rules we have so far. One is that the
derivative of any polynomial function will be another polynomial function, and that the degree of the derivative is one less
than the degree of the original function. For instance, if
\[ p(t) = 7t^5 - 4t^3 + 8t, \tag{2.1.20} \]

then p is a degree 5 polynomial, and its derivative,

\[ p'(t) = 35t^4 - 12t^2 + 8, \tag{2.1.21} \]

is a degree 4 polynomial. Additionally, the derivative of any exponential function is another exponential function: for
example, if \(g(z) = 7 \cdot 2^z\), then \(g'(z) = 7 \cdot 2^z \ln(2)\), which is also exponential.
Furthermore, while our current emphasis is on learning shortcut rules for finding derivatives without directly using the
limit definition, we should be certain not to lose sight of the fact that all of the meaning of the derivative that we
developed in Chapter 1 still holds. That is, anytime we compute a derivative, that derivative measures the instantaneous rate of
change of the original function, as well as the slope of the tangent line at any selected point on the curve. The following
activity asks you to combine the just-developed derivative rules with some key perspectives that we studied in Chapter 1.
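To illustrate how a derivative formula found by the shortcut rules still gives tangent-line information, here is a small sketch using the polynomial from (2.1.20) and its derivative from (2.1.21). The point of tangency t = 1 and the helper name `tangent_line` are our own choices for the example.

```python
def p(t):
    return 7 * t**5 - 4 * t**3 + 8 * t


def p_prime(t):
    # Derivative from the power, Constant Multiple, and Sum Rules, as in (2.1.21).
    return 35 * t**4 - 12 * t**2 + 8


def tangent_line(f, f_prime, a):
    """Return the function y = f(a) + f'(a) * (t - a)."""
    value, slope = f(a), f_prime(a)
    return lambda t: value + slope * (t - a)


line = tangent_line(p, p_prime, 1.0)
print(p_prime(1.0))        # slope of the tangent line at t = 1
print(line(1.1), p(1.1))   # tangent-line estimate next to the true value
```

The slope alone is a single number, while the tangent line is a full linear function built from that slope together with the point (1, p(1)).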

Activity 2.1.3

Each of the following questions asks you to use derivatives to answer key questions about functions. Be sure to think
carefully about each question and to use proper notation in your responses.

a. Find the slope of the tangent line to \(h(z) = \sqrt{z} + \frac{1}{z}\) at the point where z = 4.
b. A population of cells is growing in such a way that its total number in millions is given by the function
\(P(t) = 2(1.37)^t + 32\), where t is measured in days.
   i. Determine the instantaneous rate at which the population is growing on day 4, and include units on your answer.
   ii. Is the population growing at an increasing rate or growing at a decreasing rate on day 4? Explain.
c. Find an equation for the tangent line to the curve \(p(a) = 3a^4 - 2a^3 + 7a^2 - a + 12\) at the point
where a = −1.
d. What is the difference between being asked to find the slope of the tangent line (asked in (a)) and the
equation of the tangent line (asked in (c))?

Summary
In this section, we encountered the following important ideas:

Given a differentiable function y = f(x), we can express the derivative of f in several different notations: f′(x), \(\frac{df}{dx}\), \(\frac{dy}{dx}\), and \(\frac{d}{dx}[f(x)]\).

The limit definition of the derivative leads to patterns among certain families of functions that enable us to compute
derivative formulas without resorting directly to the limit definition. For example, if f is a power function of the form
\(f(x) = x^n\), then \(f'(x) = nx^{n-1}\) for any real number n other than 0. This is called the Rule for Power Functions.
We have stated a rule for derivatives of exponential functions in the same spirit as the rule for power functions: for any
positive real number a, if \(f(x) = a^x\), then \(f'(x) = a^x \ln(a)\).

If we are given a constant multiple of a function whose derivative we know, or a sum of functions whose derivatives
we know, the Constant Multiple and Sum Rules make it straightforward to compute the derivative of the overall
function. More formally, if f(x) and g(x) are differentiable with derivatives f′(x) and g′(x) and a and b are
constants, then
\[ \frac{d}{dx}[af(x) + bg(x)] = af'(x) + bg'(x). \tag{2.1.22} \]
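As a short worked illustration of (2.1.22), consider a function of our own choosing built from the two non-trigonometric terms of (2.1.2). Taking \(f(x) = x^7\), \(g(x) = e^x\), a = 4, and b = 3, the rules of this section combine as follows.

```latex
\begin{align*}
\frac{d}{dx}\left[4x^7 + 3e^x\right]
  &= 4 \cdot \frac{d}{dx}\left[x^7\right] + 3 \cdot \frac{d}{dx}\left[e^x\right] \\
  &= 4\left(7x^6\right) + 3e^x \\
  &= 28x^6 + 3e^x.
\end{align*}
```

The first step uses the Sum and Constant Multiple Rules together. The second uses the power rule and the fact that \(e^x\) is its own derivative.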

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)
________
1 That is, we do not say “dee-y over dee-x.”
2 The Constant Multiple Rule can be formally proved as a consequence of properties of limits, using the limit definition of
the derivative.
3 Like the Constant Multiple Rule, the Sum Rule can be formally proved as a consequence of properties of limits, using the
limit definition of the derivative.

2.2: The Sine and Cosine Function
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a graphical justification for why \(\frac{d}{dx}[a^x] = a^x \ln(a)\)?
What do the graphs of y = sin(x) and y = cos(x) suggest as formulas for their respective derivatives?
Once we know the derivatives of sin(x) and cos(x), how do previous derivative rules work when these functions
are involved?

Throughout Chapter 2, we will be working to develop shortcut derivative rules that will help us to bypass the limit
definition of the derivative in order to quickly determine the formula for f′(x) when we are given a formula for f(x). In
Section 2.1, we learned the rule for power functions, that if \(f(x) = x^n\), then \(f'(x) = nx^{n-1}\), and justified this in part due
to results from different n-values when applying the limit definition of the derivative. We also stated the rule for
exponential functions, that if a is a positive real number and \(f(x) = a^x\), then \(f'(x) = a^x \ln(a)\). Later in this present section,
we are going to work to conjecture formulas for the derivatives of the sine and cosine functions, primarily through a graphical argument. To
help set the stage for doing so, the following preview activity asks you to think about exponential functions and why it is
reasonable to think that the derivative of an exponential function is a constant times the exponential function itself.

Preview Activity 2.2.1

Consider the function \(g(x) = 2^x\), which is graphed in Figure 2.2.1.

a. At each of x = −2, −1, 0, 1, 2, use a straightedge to sketch an accurate tangent line to y = g(x).
b. Use the provided grid to estimate the slope of the tangent line you drew at each point in (a).
c. Use the limit definition of the derivative to estimate g′(0) by using small values of h, and compare the result
to your visual estimate for the slope of the tangent line to y = g(x) at x = 0 in (b).
d. Based on your work in (a), (b), and (c), sketch an accurate graph of y = g′(x) on the axes adjacent to the
graph of y = g(x).
e. Write at least one sentence that explains why it is reasonable to think that g′(x) = cg(x), where c is a
constant. In addition, calculate ln(2), and then discuss how this value, combined with your work above,
reasonably suggests that \(g'(x) = 2^x \ln(2)\).

Figure 2.2.1: At left, the graph of \(y = g(x) = 2^x\). At right, axes for plotting y = g′(x).
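The preview activity asks why g′(x) should be a constant multiple of g(x). One way to explore this numerically, offered as a sketch rather than a justification, is to estimate the slope at each of the activity's five points and divide by the function value there. The central-difference helper and its step size h = 1e-6 are assumptions of this example.

```python
import math


def g(x):
    return 2.0 ** x


def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x) using points on both sides of x."""
    return (f(x + h) - f(x - h)) / (2 * h)


# If g'(x) = c * g(x), the ratio slope / g(x) should be the same at every x.
for x in [-2, -1, 0, 1, 2]:
    slope = central_difference(g, x)
    print(x, slope, slope / g(x))

print(math.log(2))  # the natural logarithm of 2, for comparison with the ratios
```

If the ratios in the last column agree with one another and with the final printed value, that supports the conjecture in part (e).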

The sine and cosine functions


The sine and cosine functions are among the most important functions in all of mathematics. Sometimes called the circular
functions due to their genesis in the unit circle, these periodic functions play a key role in modeling repeating phenomena
such as the location of a point on a bicycle tire, the behavior of an oscillating mass attached to a spring, tidal elevations,
and more. Like polynomial and exponential functions, the sine and cosine functions are considered basic functions, ones

that are often used in the building of more complicated functions. As such, we would like to know formulas for
\(\frac{d}{dx}[\sin(x)]\) and \(\frac{d}{dx}[\cos(x)]\), and the next two activities lead us to that end.

Activity 2.2.1

Consider the function f (x) = sin(x), which is graphed in Figure 2.2.2 below. Note carefully that the grid in the
diagram does not have boxes that are 1 × 1, but rather approximately 1.57 × 1, as the horizontal scale of the grid is π/2
units per box.
a. At each of \(x = -2\pi, -\frac{3\pi}{2}, -\pi, -\frac{\pi}{2}, 0, \frac{\pi}{2}, \pi, \frac{3\pi}{2}, 2\pi\), use a straightedge to sketch an accurate tangent line to y = f(x).
b. Use the provided grid to estimate the slope of the tangent line you drew at each point. Pay careful attention to
the scale of the grid.
c. Use the limit definition of the derivative to estimate f′(0) by using small values of h, and compare the result
to your visual estimate for the slope of the tangent line to y = f(x) at x = 0 in (b). Using periodicity, what
does this result suggest about f′(2π)? About f′(−2π)?
d. Based on your work in (a), (b), and (c), sketch an accurate graph of y = f′(x) on the axes adjacent to the
graph of y = f(x).
e. What familiar function do you think is the derivative of f (x) = sin(x)?

Figure 2.2.2: At left, the graph of y = f(x) = sin(x).

Activity 2.2.2

Consider the function g(x) = cos(x), which is graphed in Figure 2.2.3 below. Note carefully that the grid in the
diagram does not have boxes that are 1 × 1, but rather approximately 1.57 × 1, as the horizontal scale of the grid is π/2
units per box.
a. At each of \(x = -2\pi, -\frac{3\pi}{2}, -\pi, 0, \frac{\pi}{2}, \pi, \frac{3\pi}{2}, 2\pi\), use a straightedge to sketch an accurate tangent line to y = g(x).
b. Use the provided grid to estimate the slope of the tangent line you drew at each point. Again, note the scale of the
axes and grid.
c. Use the limit definition of the derivative to estimate \(g'(\frac{\pi}{2})\) by using small values of h, and compare the result to
your visual estimate for the slope of the tangent line to y = g(x) at \(x = \frac{\pi}{2}\) in (b). Using periodicity,
what does this result suggest about \(g'(-\frac{3\pi}{2})\)? Can symmetry on the graph help you estimate other
slopes easily?
d. Based on your work in (a), (b), and (c), sketch an accurate graph of y = g′(x) on the axes adjacent to the graph of
y = g(x).

e. What familiar function do you think is the derivative of g(x) = cos(x)?

Figure 2.2.3: At left, the graph of y = g(x) = cos(x).

The results of the two preceding activities suggest that the sine and cosine functions not only have the beautiful
interrelationships that are learned in a course in trigonometry – connections such as the identities
\[ \sin^2(x) + \cos^2(x) = 1 \tag{2.2.1} \]

and
\[\cos\left(x - \frac{\pi}{2}\right) = \sin(x) \tag{2.2.2}\]

– but that they are even further linked through calculus, as the derivative of each involves the other. The following rules
summarize the results of the activities.⁴

Sine and Cosine Functions

For all real numbers \(x\),

\[\frac{d}{dx}[\sin(x)] = \cos(x) \tag{2.2.3}\]

and

\[\frac{d}{dx}[\cos(x)] = -\sin(x). \tag{2.2.4}\]

We have now added two functions to our library of basic functions whose derivatives we know, which now contains power functions, exponential functions, and the sine and cosine functions. The constant multiple and sum rules still hold, of course, and all of the inherent meaning of the derivative persists, regardless of the functions that are used to constitute a given choice of \(f(x)\). The following activity puts our new knowledge of the derivatives of \(\sin(x)\) and \(\cos(x)\) to work.
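The two rules above can also be checked numerically, in the spirit of part (c) of the preceding activities. The following Python sketch is our own illustration rather than part of the original text: the helper name `difference_quotient` and the sample inputs are assumptions chosen for the example. It evaluates the difference quotient from the limit definition for shrinking values of \(h\) and prints it next to the value the rule predicts.

```python
import math

def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

# Illustrative sample inputs (an assumption of this sketch); any real x works.
sample_points = [0.0, math.pi / 2, 2.0]

for x in sample_points:
    for h in (0.1, 0.001, 0.00001):
        sin_slope = difference_quotient(math.sin, x, h)
        cos_slope = difference_quotient(math.cos, x, h)
        print(f"x = {x:.4f}, h = {h:g}: "
              f"sin slope {sin_slope:+.6f} vs cos(x) {math.cos(x):+.6f}; "
              f"cos slope {cos_slope:+.6f} vs -sin(x) {-math.sin(x):+.6f}")
```

As \(h\) shrinks, each estimated slope should settle toward the value given by the corresponding rule. This is numerical evidence only; it does not replace the formal proof mentioned in the footnote.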

Activity 2.2.3

Answer each of the following questions. Where a derivative is requested, be sure to label the derivative function with
its name using proper notation.
a. Determine the derivative of \(h(t) = 3\cos(t) - 4\sin(t)\).
b. Find the exact slope of the tangent line to \(y = f(x) = 2x + \frac{\sin(x)}{2}\) at the point where \(x = \frac{\pi}{6}\).
c. Find the equation of the tangent line to \(y = g(x) = x^2 + 2\cos(x)\) at the point where \(x = \frac{\pi}{2}\).
d. Determine the derivative of \(p(z) = z^4 + 4^z + 4\cos(z) - \sin(\frac{\pi}{2})\).

e. The function P (t) = 24 + 8sin(t) represents a population of a particular kind of animal that lives on a
small island, where P is measured in hundreds and t is measured in decades since January 1, 2010. What is
the instantaneous rate of change of P on January 1, 2030? What are the units of this quantity? Write a
sentence in everyday language that explains how the population is behaving at this point in time.

Summary
In this section, we encountered the following important ideas:
If we consider the graph of an exponential function \(f(x) = a^x\) (where \(a > 1\)), the graph of \(f'(x)\) behaves similarly, appearing exponential and as a possibly scaled version of the original function \(a^x\). For \(f(x) = 2^x\), careful analysis of the graph and its slopes suggests that \(\frac{d}{dx}[2^x] = 2^x \ln(2)\), which is a special case of the rule we stated in Section 2.1.
By carefully analyzing the graphs of \(y = \sin(x)\) and \(y = \cos(x)\), plus using the limit definition of the derivative at select points, we found that

\[\frac{d}{dx}[\sin(x)] = \cos(x) \tag{2.2.5}\]

and

\[\frac{d}{dx}[\cos(x)] = -\sin(x). \tag{2.2.6}\]

We note that all previously encountered derivative rules still hold, but now may also be applied to functions involving
the sine and cosine, plus all of the established meaning of the derivative applies to these trigonometric functions as
well.

Contributors and Attributions
Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)
________
⁴ These two rules may be formally proved using the limit definition of the derivative and the expansion identities for \(\sin(x + h)\) and \(\cos(x + h)\).

2.3: The Product and Quotient Rules
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How does the algebraic structure of a function direct us in computing its derivative using shortcut rules?
How do we compute the derivative of a product of two basic functions in terms of the derivatives of the basic
functions?
How do we compute the derivative of a quotient of two basic functions in terms of the derivatives of the basic
functions?
How do the product and quotient rules combine with the sum and constant multiple rules to expand the library of
functions we can quickly differentiate?

So far, the basic functions we know how to differentiate include power functions (\(x^n\)), exponential functions (\(a^x\)), and the two fundamental trigonometric functions (\(\sin(x)\) and \(\cos(x)\)). With the sum and constant multiple rules, we can also compute the derivative of combined functions such as

\[f(x) = 7x^{11} - 4 \cdot 9^x + \pi \sin(x) - \sqrt{3} \cos(x), \tag{2.3.1}\]

because the function \(f\) is fundamentally a sum of basic functions. Indeed, we can now quickly say that

\[f'(x) = 77x^{10} - 4 \cdot 9^x \ln(9) + \pi \cos(x) + \sqrt{3} \sin(x). \tag{2.3.2}\]
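One way to gain confidence in a derivative formula such as Equation 2.3.2 is to compare it against a numerical slope estimate. The Python sketch below is an illustration we add here, not something from the original text: the helper `central_difference`, its step size, and the test inputs are all assumptions of the example.

```python
import math

def f(x):
    # Equation 2.3.1.
    return 7 * x**11 - 4 * 9**x + math.pi * math.sin(x) - math.sqrt(3) * math.cos(x)

def f_prime(x):
    # Equation 2.3.2, differentiated term by term.
    return (77 * x**10 - 4 * 9**x * math.log(9)
            + math.pi * math.cos(x) + math.sqrt(3) * math.sin(x))

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-0.5, 0.0, 1.0):  # hypothetical test inputs
    print(f"x = {x:+.1f}: formula {f_prime(x):.6f}, "
          f"numerical estimate {central_difference(f, x):.6f}")
```

At each input the two printed values should agree to several decimal places, which is what we expect if every term was differentiated correctly.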

But we can of course combine basic functions in ways other than multiplying them by constants and taking sums and differences. For example, we could consider the function that results from a product of two basic functions, such as

\[p(z) = z^3 \cos(z), \tag{2.3.3}\]

or another that is generated by the quotient of two basic functions, one like

\[q(t) = \frac{\sin(t)}{2^t}. \tag{2.3.4}\]

While the derivative of a sum is the sum of the derivatives, it turns out that the rules for computing derivatives of products and
quotients are more complicated. In what follows we explore why this is the case, what the product and quotient rules actually
say, and work to expand our repertoire of functions we can easily differentiate. To start, Preview Activity 2.3.1 asks you to
investigate the derivative of a product and quotient of two polynomials.

Preview Activity 2.3.1

Let \(f\) and \(g\) be the functions defined by \(f(t) = 2t^2\) and \(g(t) = t^3 + 4t\).

a. Determine \(f'(t)\) and \(g'(t)\).
b. Let \(p(t) = 2t^2(t^3 + 4t)\) and observe that \(p(t) = f(t) \cdot g(t)\). Rewrite the formula for \(p\) by distributing the \(2t^2\) term. Then, compute \(p'(t)\) using the sum and constant multiple rules.
c. True or false: \(p'(t) = f'(t) \cdot g'(t)\).
d. Let \(q(t) = \frac{t^3 + 4t}{2t^2}\) and observe that \(q(t) = \frac{g(t)}{f(t)}\). Rewrite the formula for \(q\) by dividing each term in the numerator by the denominator and simplify to write \(q\) as a sum of constant multiples of powers of \(t\). Then, compute \(q'(t)\) using the sum and constant multiple rules.
e. True or false: \(q'(t) = \frac{g'(t)}{f'(t)}\).

The Product Rule


As parts (b) and (d) of Preview Activity 2.3.1 show, it is not true in general that the derivative of a product of two functions is
the product of the derivatives of those functions. Indeed, the rule for differentiating a function of the form p(x) = f (x) · g(x) in
terms of the derivatives of f and g is more complicated than simply taking the product of the derivatives of f and g. To see
further why this is the case, as well as to begin to understand how the product rule actually works, we consider an example
involving meaningful functions.

Say that an investor is regularly purchasing stock in a particular company. Let N(t) be a function that represents the number of
shares owned on day t, where t = 0 represents the first day on which shares were purchased. Further, let S(t) be a function that
gives the value of one share of the stock on day t; note that the units on S(t) are dollars per share. Moreover, to compute the
total value on day \(t\) of the stock held by the investor, we use the function \(V(t) = N(t) \cdot S(t)\). By taking the product

\[V(t) = N(t)\ \text{shares} \cdot S(t)\ \text{dollars per share}, \tag{2.3.5}\]

we have the total value in dollars of the shares held. Observe that over time, both the number of shares and the value of a given share will vary. The derivative \(N'(t)\) measures the rate at which the number of shares held is changing, while \(S'(t)\) measures the rate at which the value per share is changing. The big question we’d like to answer is: how do these respective rates of change affect the rate of change of the total value function?
To help better understand the relationship among changes in N, S, and V, let’s consider some specific data. Suppose that on day
100, the investor owns 520 shares of stock and the stock’s current value is $27.50 per share. This tells us that N(100) = 520 and
S(100) = 27.50. In addition, say that on day 100, the investor purchases an additional 12 shares (so the number of shares held is rising at a rate of 12 shares per day), and that on that same day the price of the stock is rising at a rate of 0.75 dollars per share per day. Viewed in calculus notation, this tells us that

\[N'(100) = 12\ \text{(shares per day)} \quad \text{and} \quad S'(100) = 0.75\ \text{(dollars per share per day)}. \tag{2.3.6}\]

At what rate is the value of the investor’s total holdings changing on day 100?
Observe that the increase in total value comes from two sources: the growing number of shares, and the rising value of each share. If only the number of shares is rising (and the value of each share is constant), the rate at which total value would rise is found by computing the product of the current value of the shares with the rate at which the number of shares is changing. That is, the rate at which total value would change is given by

\[S(100) \cdot N'(100) = 27.50\ \frac{\text{dollars}}{\text{share}} \cdot 12\ \frac{\text{shares}}{\text{day}} = 330\ \frac{\text{dollars}}{\text{day}}. \tag{2.3.7}\]

Note particularly how the units make sense and explain that we are finding the rate at which the total value \(V\) is changing, measured in dollars per day. If instead the number of shares is constant, but the value of each share is rising, then the rate at which the total value would rise is found similarly by taking the product of the number of shares with the rate of change of share value. In particular, the rate at which total value is rising is

\[N(100) \cdot S'(100) = 520\ \text{shares} \cdot 0.75\ \frac{\text{dollars per share}}{\text{day}} = 390\ \frac{\text{dollars}}{\text{day}}. \tag{2.3.8}\]

Of course, when both the number of shares is changing and the value of each share is changing, we have to include both of these sources, and hence the rate at which the total value is rising is

\[V'(100) = S(100) \cdot N'(100) + N(100) \cdot S'(100) = 330 + 390 = 720\ \frac{\text{dollars}}{\text{day}}. \tag{2.3.9}\]

This tells us that we expect the total value of the investor’s holdings to rise by about $720 on the 100th day.

While this example highlights why the product rule is true, there are some subtle issues to recognize. For one, if the stock’s
value really does rise exactly $0.75 on day 100, and the number of shares really rises by 12 on day 100, then we’d expect
that
V (101) = N (101) ⋅ S(101) = 532 ⋅ 28.25 = 15029. (2.3.10)

If, as noted above, we expect the total value to rise by $720, then with \(V(100) = N(100) \cdot S(100) = 520 \cdot 27.50 = 14300\), it seems we should find that \(V(101) = V(100) + 720 = 15020\). Why do the two results differ by 9? One way to understand why this difference occurs is to recognize that \(N'(100) = 12\) represents an instantaneous rate of change, while our (informal) discussion has also thought of this number as the total change in the number of shares over the course of a single day. The formal proof of the product rule reconciles this issue by taking the limit as the change in the input tends to zero.
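The arithmetic behind this discrepancy can be laid out directly. The short Python sketch below is our own illustration (the variable names are assumptions, while the numbers come from the example above). It treats both rates as literal one-day changes and shows where the extra 9 dollars comes from: expanding \((N + \Delta N)(S + \Delta S) - NS\) gives \(S \Delta N + N \Delta S + \Delta N \Delta S\), and the last term is the part the product rule estimate leaves out.

```python
# Data from the day-100 example in the text.
shares = 520         # N(100)
price = 27.50        # S(100), dollars per share
shares_rate = 12     # N'(100), shares per day
price_rate = 0.75    # S'(100), dollars per share per day

# Product rule estimate of V'(100), in dollars per day.
value_rate = price * shares_rate + shares * price_rate

# Treat both rates as literal one-day changes and recompute the value.
value_day_100 = shares * price
value_day_101 = (shares + shares_rate) * (price + price_rate)

# Gap between the actual one-day change and the product rule estimate.
gap = (value_day_101 - value_day_100) - value_rate

# The gap is exactly the product of the two one-day changes.
cross_term = shares_rate * price_rate

print(value_rate, value_day_101, gap, cross_term)  # prints: 720.0 15029.0 9.0 9.0
```

The cross term \(12 \cdot 0.75 = 9\) is a product of two changes. If the time step were shortened, both changes would shrink, so this term would become negligible relative to the other two, which is the idea the limit in the formal proof makes precise.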

Next, we expand our perspective from the specific example above to the more general and abstract setting of a product \(P\) of two differentiable functions, \(f\) and \(g\). If we have \(P(x) = f(x) \cdot g(x)\), our work above suggests that

\[P'(x) = f(x)g'(x) + g(x)f'(x). \tag{2.3.11}\]

Indeed, a formal proof using the limit definition of the derivative can be given to show that the following rule, called the
product rule, holds in general.

Product Rule

If \(f\) and \(g\) are differentiable functions, then their product \(P(x) = f(x) \cdot g(x)\) is also a differentiable function, and

\[P'(x) = f(x)g'(x) + g(x)f'(x). \tag{2.3.12}\]

In light of the earlier example involving shares of stock, the product rule also makes sense intuitively: the rate of change of P
should take into account both how fast f and g are changing, as well as how large f and g are at the point of interest.
Furthermore, we note in words what the product rule says: if P is the product of two functions f (the first function) and g (the
second), then “the derivative of P is the first times the derivative of the second, plus the second times the derivative of the
first.” It is often a helpful mental exercise to say this phrasing aloud when executing the product rule. For example, if

\[P(z) = z^3 \cdot \cos(z), \tag{2.3.13}\]

we can now use the product rule to differentiate \(P\). The first function is \(z^3\) and the second function is \(\cos(z)\). By the product rule, \(P'\) will be given by the first, \(z^3\), times the derivative of the second, \(-\sin(z)\), plus the second, \(\cos(z)\), times the derivative of the first, \(3z^2\). That is,

\[P'(z) = z^3(-\sin(z)) + \cos(z) \cdot 3z^2 = -z^3 \sin(z) + 3z^2 \cos(z). \tag{2.3.14}\]
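To see the product rule at work on this example, the following Python sketch compares Equation 2.3.14 with a numerical slope estimate, and also with the tempting but incorrect guess that the derivative of a product is the product of the derivatives. It is an illustration under our own assumptions: the function names, the step size, and the input \(z = 2\) are not from the text.

```python
import math

def P(z):
    return z**3 * math.cos(z)

def P_prime(z):
    # Equation 2.3.14: first * (second)' + second * (first)'.
    return z**3 * (-math.sin(z)) + math.cos(z) * 3 * z**2

def product_of_derivatives(z):
    # The incorrect guess: (first)' * (second)'.
    return 3 * z**2 * (-math.sin(z))

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + h) - func(x - h)) / (2 * h)

z = 2.0  # hypothetical input
print("numerical estimate:    ", central_difference(P, z))
print("product rule:          ", P_prime(z))
print("product of derivatives:", product_of_derivatives(z))
```

The product rule value should match the numerical estimate closely, while the product of the derivatives lands visibly elsewhere, echoing parts (b) and (c) of Preview Activity 2.3.1.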

The following activity further explores the use of the product rule.

Activity 2.3.1

Use the product rule to answer each of the questions below. Throughout, be sure to carefully label any derivative you find
by name. It is not necessary to algebraically simplify any of the derivatives you compute.
a. Let \(m(w) = 3w^{17} 4^w\). Find \(m'(w)\).
b. Let \(h(t) = (\sin(t) + \cos(t))t^4\). Find \(h'(t)\).
c. Determine the slope of the tangent line to the curve \(y = f(x)\) at the point where \(a = 1\) if \(f\) is given by the rule \(f(x) = e^x \sin(x)\).
d. Find the tangent line approximation \(L(x)\) to the function \(y = g(x)\) at the point where \(a = -1\) if \(g\) is given by the rule \(g(x) = (x^2 + x)2^x\).

The Quotient Rule


Because quotients and products are closely linked, we can use the product rule to understand how to take the derivative of a quotient. In particular, let \(Q(x)\) be defined by

\[Q(x) = \frac{f(x)}{g(x)}, \tag{2.3.15}\]

where \(f\) and \(g\) are both differentiable functions. We desire a formula for \(Q'\) in terms of \(f\), \(g\), \(f'\), and \(g'\). It turns out that \(Q\) is differentiable everywhere that \(g(x) \neq 0\). Moreover, multiplying both sides of Equation 2.3.15 by \(g(x)\), we can observe that

\[f(x) = Q(x) \cdot g(x). \tag{2.3.16}\]

Thus, we can use the product rule to differentiate \(f\). Doing so,

\[f'(x) = Q(x)g'(x) + g(x)Q'(x). \tag{2.3.17}\]

Since we want to know a formula for \(Q'\), we work to solve this most recent equation for \(Q'(x)\), finding first that

\[Q'(x)g(x) = f'(x) - Q(x)g'(x). \tag{2.3.18}\]

Dividing both sides by \(g(x)\), we have

\[Q'(x) = \frac{f'(x) - Q(x)g'(x)}{g(x)}. \tag{2.3.19}\]

Finally, we also recall that

\[Q(x) = \frac{f(x)}{g(x)}. \tag{2.3.20}\]

Using this expression in the preceding equation and simplifying, we have

\[Q'(x) = \frac{f'(x) - \frac{f(x)}{g(x)} g'(x)}{g(x)} = \frac{f'(x) - \frac{f(x)}{g(x)} g'(x)}{g(x)} \cdot \frac{g(x)}{g(x)} = \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}. \tag{2.3.21}\]

This shows the fundamental argument for why the quotient rule holds.

Note: Quotient Rule


If \(f\) and \(g\) are differentiable functions, then their quotient \(Q(x) = \frac{f(x)}{g(x)}\) is also a differentiable function for all \(x\) where \(g(x) \neq 0\), and

\[Q'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}.\]

Like the product rule, it can be helpful to think of the quotient rule verbally. If a function \(Q\) is the quotient of a top function \(f\) and a bottom function \(g\), then \(Q'\) is given by “the bottom times the derivative of the top, minus the top times the derivative of the bottom, all over the bottom squared.” For example, if \(Q(t) = \frac{\sin(t)}{2^t}\), then we can identify the top function as \(\sin(t)\) and the bottom function as \(2^t\). By the quotient rule, we then have that \(Q'\) will be given by the bottom, \(2^t\), times the derivative of the top, \(\cos(t)\), minus the top, \(\sin(t)\), times the derivative of the bottom, \(2^t \ln(2)\), all over the bottom squared, \((2^t)^2\). That is,

\[Q'(t) = \frac{2^t \cos(t) - \sin(t) 2^t \ln(2)}{(2^t)^2}. \tag{2.3.22}\]

In this particular example, it is possible to simplify \(Q'(t)\) by removing a factor of \(2^t\) from both the numerator and denominator, hence finding that \(Q'(t) = \frac{\cos(t) - \sin(t)\ln(2)}{2^t}\). In general, we must be careful in doing any such simplification, as we don’t want to correctly execute the quotient rule but then find an incorrect overall derivative due to an algebra error. As such, we will often place more emphasis on correctly using derivative rules than we will on simplifying the result that follows. The next activity further explores the use of the quotient rule.
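Since algebra errors during simplification are a real risk, one inexpensive safeguard is to evaluate the unsimplified and simplified forms at a few inputs and confirm that they agree. The Python sketch below does this for the example \(Q(t) = \sin(t)/2^t\). It is our own illustration: the function names, the helper `central_difference`, and the sample inputs are assumptions.

```python
import math

def Q(t):
    return math.sin(t) / 2**t

def Q_prime_unsimplified(t):
    # Equation 2.3.22, straight from the quotient rule.
    return (2**t * math.cos(t) - math.sin(t) * 2**t * math.log(2)) / (2**t) ** 2

def Q_prime_simplified(t):
    # After cancelling the common factor 2^t from top and bottom.
    return (math.cos(t) - math.sin(t) * math.log(2)) / 2**t

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + h) - func(x - h)) / (2 * h)

for t in (0.0, 1.0, 3.0):  # hypothetical sample inputs
    print(f"t = {t}: unsimplified {Q_prime_unsimplified(t):.8f}, "
          f"simplified {Q_prime_simplified(t):.8f}, "
          f"numerical {central_difference(Q, t):.8f}")
```

Agreement at a handful of points does not prove that two formulas are identical, but a mismatch at even one point would reveal an error.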

Activity 2.3.2

Use the quotient rule to answer each of the questions below. Throughout, be sure to carefully label any derivative you find
by name. That is, if you’re given a formula for f (x), clearly label the formula you find for f' (x). It is not necessary to
algebraically simplify any of the derivatives you compute.
a. Let \(r(z) = \frac{3^z}{z^4 + 1}\). Find \(r'(z)\).
b. Let \(v(t) = \frac{\sin(t)}{\cos(t) + t^2}\). Find \(v'(t)\).
c. Determine the slope of the tangent line to the curve \(R(x) = \frac{x^2 - 2x - 8}{x^2 - 9}\) at the point where \(x = 0\).
d. When a camera flashes, the intensity \(I\) of light seen by the eye is given by the function \(I(t) = \frac{100t}{e^t}\), where \(I\) is measured in candles and \(t\) is measured in milliseconds. Compute \(I'(0.5)\), \(I'(2)\), and \(I'(5)\); include appropriate units on each value; and discuss the meaning of each.

Combining Rules
One of the challenges to learning to apply various derivative shortcut rules correctly and effectively is recognizing the
fundamental structure of a function. For instance, consider the function given by
\[f(x) = x \sin(x) + \frac{x^2}{\cos(x) + 2}. \tag{2.3.23}\]

How do we decide which rules to apply? Our first task is to recognize the overall structure of the given function. Observe that the function \(f\) is fundamentally a sum of two slightly less complicated functions, so we can apply the sum rule and get

\[f'(x) = \frac{d}{dx}\left[x \sin(x) + \frac{x^2}{\cos(x) + 2}\right] \tag{2.3.24}\]

\[= \frac{d}{dx}[x \sin(x)] + \frac{d}{dx}\left[\frac{x^2}{\cos(x) + 2}\right]. \tag{2.3.25}\]

Now, the left-hand term of Equation 2.3.25 is a product, so the product rule is needed there, while the right-hand term is a
quotient, so the quotient rule is required. Applying these rules respectively, we find that

\[f'(x) = (x \cos(x) + \sin(x)) + \frac{(\cos(x) + 2) 2x - x^2(-\sin(x))}{(\cos(x) + 2)^2} \tag{2.3.26}\]

\[= x \cos(x) + \sin(x) + \frac{2x \cos(x) + 4x + x^2 \sin(x)}{(\cos(x) + 2)^2}. \tag{2.3.27}\]
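The structure-first approach used here (sum rule on the outside, then the product rule on one piece and the quotient rule on the other) can be mirrored in code by computing each piece separately. The Python sketch below is an illustration we add under our own assumptions about names, step size, and test inputs. It builds \(f'(x)\) from its two parts and compares the total with a numerical slope estimate.

```python
import math

def f(x):
    return x * math.sin(x) + x**2 / (math.cos(x) + 2)

def f_prime(x):
    # Product rule applied to x * sin(x).
    product_part = x * math.cos(x) + math.sin(x)
    # Quotient rule applied to x^2 / (cos(x) + 2).
    bottom = math.cos(x) + 2
    quotient_part = (bottom * 2 * x - x**2 * (-math.sin(x))) / bottom**2
    # Sum rule: add the two pieces.
    return product_part + quotient_part

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-1.0, 0.5, 2.0):  # hypothetical test inputs
    print(f"x = {x:+.1f}: formula {f_prime(x):.8f}, "
          f"numerical {central_difference(f, x):.8f}")
```

Keeping the product and quotient pieces in separate variables makes it easier to locate a mistake if the two printed values disagree.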

We next consider how the situation changes with the function defined by

\[s(y) = \frac{y \cdot 7^y}{y^2 + 1}. \tag{2.3.28}\]

Overall, \(s\) is a quotient of two simpler functions, so the quotient rule will be needed. Here, we execute the quotient rule and use the notation \(\frac{d}{dy}\) to defer the computation of the derivative of the numerator and derivative of the denominator. Thus,

\[s'(y) = \frac{(y^2 + 1) \cdot \frac{d}{dy}[y \cdot 7^y] - y \cdot 7^y \cdot \frac{d}{dy}[y^2 + 1]}{(y^2 + 1)^2}. \tag{2.3.29}\]

Now, there remain two derivatives to calculate. The first one,

\[\frac{d}{dy}[y \cdot 7^y], \tag{2.3.30}\]

calls for use of the product rule, while the second,

\[\frac{d}{dy}[y^2 + 1], \tag{2.3.31}\]

takes only an elementary application of the sum rule. Applying these rules, we now have

\[s'(y) = \frac{(y^2 + 1)[y \cdot 7^y \ln(7) + 7^y \cdot 1] - y \cdot 7^y [2y]}{(y^2 + 1)^2}. \tag{2.3.32}\]

When taking a derivative that involves the use of multiple derivative rules, it is often helpful to use the notation \(\frac{d}{dx}[\ ]\) to wait to apply subsequent rules, as demonstrated in both of the preceding examples.

While some minor simplification is possible, we are content to leave \(s'(y)\) in its current form, having found the desired

derivative of s . In summary, to compute the derivative of s, we applied the quotient rule. In so doing, when it was time to
compute the derivative of the top function, we used the product rule; at the point where we found the derivative of the bottom
function, we used the sum rule. In general, one of the main keys to success in applying derivative rules is to recognize the
structure of the function, followed by the careful and diligent application of relevant derivative rules. The best way to get good
at this process is by doing a large number of exercises, and the next activity provides some practice and exploration to that end.
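The idea of deferring inner derivatives translates naturally into code: compute the derivative of the top and the derivative of the bottom as separate named quantities, then assemble them with the quotient rule. The Python sketch below does this for \(s(y) = \frac{y \cdot 7^y}{y^2 + 1}\). The variable names, the helper `central_difference`, and the test inputs are assumptions made for this illustration only.

```python
import math

def s(y):
    return y * 7**y / (y**2 + 1)

def s_prime(y):
    top = y * 7**y
    bottom = y**2 + 1
    # Deferred derivative of the top: product rule on y * 7^y.
    d_top = y * 7**y * math.log(7) + 7**y * 1
    # Deferred derivative of the bottom: sum rule on y^2 + 1.
    d_bottom = 2 * y
    # Quotient rule assembles the pieces, as in Equation 2.3.32.
    return (bottom * d_top - top * d_bottom) / bottom**2

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + h) - func(x - h)) / (2 * h)

for y in (-1.0, 0.0, 1.5):  # hypothetical test inputs
    print(f"y = {y:+.1f}: formula {s_prime(y):.8f}, "
          f"numerical {central_difference(s, y):.8f}")
```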

Exercise 2.3.3

Use relevant derivative rules to answer each of the questions below. Throughout, be sure to use proper notation and
carefully label any derivative you find by name.
a. Let \(f(r) = (5r^3 + \sin(r))(4^r - 2\cos(r))\). Find \(f'(r)\).
b. Let

\[p(t) = \frac{\cos(t)}{t^6 \cdot 6^t}. \tag{2.3.33}\]

Find \(p'(t)\).
c. Let

\[g(z) = 3z^7 e^z - 2z^2 \sin(z) + \frac{z}{z^2 + 1}. \tag{2.3.34}\]

Find \(g'(z)\).
d. A moving particle has its position in feet at time \(t\) in seconds given by the function

\[s(t) = \frac{3\cos(t) - \sin(t)}{e^t}. \tag{2.3.35}\]

Find the particle’s instantaneous velocity at the moment \(t = 1\).
e. Suppose that \(f(x)\) and \(g(x)\) are differentiable functions and it is known that \(f(3) = -2\), \(f'(3) = 7\), \(g(3) = 4\), and \(g'(3) = -1\). If \(p(x) = f(x) \cdot g(x)\) and \(q(x) = \frac{f(x)}{g(x)}\), calculate \(p'(3)\) and \(q'(3)\).

As the algebraic complexity of the functions we are able to differentiate continues to increase, it is important to remember that all of the derivative’s meaning continues to hold. Regardless of the structure of the function \(f\), the value of \(f'(a)\) tells us the instantaneous rate of change of \(f\) with respect to \(x\) at the moment \(x = a\), as well as the slope of the tangent line to \(y = f(x)\) at the point \((a, f(a))\).

Summary
In this section, we encountered the following important ideas:
If a function is a sum, product, or quotient of simpler functions, then we can use the sum, product, or quotient rules to
differentiate the overall function in terms of the simpler functions and their derivatives.
The product rule tells us that if \(P\) is a product of differentiable functions \(f\) and \(g\) according to the rule \(P(x) = f(x)g(x)\), then

\[P'(x) = f(x)g'(x) + g(x)f'(x). \tag{2.3.36}\]

The quotient rule tells us that if \(Q\) is a quotient of differentiable functions \(f\) and \(g\) according to the rule \(Q(x) = \frac{f(x)}{g(x)}\), then

\[Q'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{g(x)^2}. \tag{2.3.37}\]

The product and quotient rules now complement the constant multiple and sum rules and enable us to compute the derivative of any function that consists of sums, constant multiples, products, and quotients of basic functions we already know how to differentiate. For instance, if \(F\) has the form \(F(x) = \frac{2a(x) - 5b(x)}{c(x) \cdot d(x)}\), then \(F\) is fundamentally a quotient, and the numerator is a sum of constant multiples and the denominator is a product. Hence the derivative of \(F\) can be found by applying the quotient rule and then using the sum and constant multiple rules to differentiate the numerator and the product rule to differentiate the denominator.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

2.4: Derivatives of Other Trigonometric Functions
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What are the derivatives of the tangent, cotangent, secant, and cosecant functions?
How do the derivatives of tan(x), cot(x), sec(x), and csc(x) combine with other derivative rules we have
developed to expand the library of functions we can quickly differentiate?

One of the powerful themes in trigonometry is that the entire subject emanates from a very simple idea: locating a point on
the unit circle.

Figure 2.4.1 : The unit circle and the definition of the sine and cosine functions.
Because each angle θ corresponds to one and only one point (x, y) on the unit circle, the x- and y-coordinates of this point
are each functions of θ. Indeed, this is the very definition of cos(θ) and sin(θ) : cos(θ) is the x-coordinate of the point on
the unit circle corresponding to the angle θ, and sin(θ) is the y-coordinate. From this simple definition, all of trigonometry
is founded. For instance, the fundamental trigonometric identity,
\[\sin^2(\theta) + \cos^2(\theta) = 1, \tag{2.4.1}\]

is a restatement of the Pythagorean Theorem, applied to the right triangle shown in Figure 2.4.1.
We recall as well that there are four other trigonometric functions, each defined in terms of the sine and/or cosine
functions. These six trigonometric functions together offer us a wide range of flexibility in problems involving right
triangles. The tangent function is defined by
\[\tan(\theta) = \frac{\sin(\theta)}{\cos(\theta)}, \tag{2.4.2}\]

while the cotangent function is its reciprocal:

\[\cot(\theta) = \frac{\cos(\theta)}{\sin(\theta)}. \tag{2.4.3}\]

The secant function is the reciprocal of the cosine function,

\[\sec(\theta) = \frac{1}{\cos(\theta)}, \tag{2.4.4}\]

and the cosecant function is the reciprocal of the sine function,

\[\csc(\theta) = \frac{1}{\sin(\theta)}. \tag{2.4.5}\]

Because we know the derivatives of the sine and cosine function, and the other four trigonometric functions are defined in
terms of these familiar functions, we can now develop shortcut differentiation rules for the tangent, cotangent, secant, and
cosecant functions. In this section’s preview activity, we work through the steps to find the derivative of y = tan(x) .

Preview Activity 2.4.1

Consider the function \(f(x) = \tan(x)\), and remember that \(\tan(x) = \frac{\sin(x)}{\cos(x)}\).
a. What is the domain of \(f\)?
b. Use the quotient rule to show that one expression for \(f'(x)\) is

\[f'(x) = \frac{\cos(x)\cos(x) + \sin(x)\sin(x)}{\cos^2(x)}. \tag{2.4.6}\]

c. What is the Fundamental Trigonometric Identity? How can this identity be used to find a simpler form for \(f'(x)\)?
d. Recall that \(\sec(x) = \frac{1}{\cos(x)}\). How can we express \(f'(x)\) in terms of the secant function?
e. For what values of \(x\) is \(f'(x)\) defined? How does this set compare to the domain of \(f\)?

Derivatives of the cotangent, secant, and cosecant functions


In Preview Activity 2.4.1, we found that the derivative of the tangent function can be expressed in several ways, but most simply in terms of the secant function. Next, we develop the derivative of the cotangent function.
Let \(g(x) = \cot(x)\). To find \(g'(x)\), we observe that \(g(x) = \frac{\cos(x)}{\sin(x)}\) and apply the quotient rule. Hence

\[g'(x) = \frac{\sin(x)(-\sin(x)) - \cos(x)\cos(x)}{\sin^2(x)} \tag{2.4.7}\]

\[= -\frac{\sin^2(x) + \cos^2(x)}{\sin^2(x)}. \tag{2.4.8}\]

By the Fundamental Trigonometric Identity, we see that

\[g'(x) = -\frac{1}{\sin^2(x)}; \tag{2.4.9}\]

recalling Equation 2.4.5, it follows that we can most simply express \(g'\) by the rule

\[g'(x) = -\csc^2(x). \tag{2.4.10}\]

Note that neither \(g\) nor \(g'\) is defined when \(\sin(x) = 0\), which occurs at every integer multiple of \(\pi\). Hence we have the following rule.

Cotangent Function
For all real numbers \(x\) such that \(x \neq k\pi\), where \(k = 0, \pm 1, \pm 2, \ldots\),

\[\frac{d}{dx}[\cot(x)] = -\csc^2(x). \tag{2.4.11}\]

Observe that the shortcut rule for the cotangent function is very similar to the rule we discovered in Preview Activity 2.4.1 for the tangent function.

Tangent Function
For all real numbers \(x\) such that \(x \neq \frac{(2k + 1)\pi}{2}\), where \(k = 0, \pm 1, \pm 2, \ldots\),

\[\frac{d}{dx}[\tan(x)] = \sec^2(x). \tag{2.4.12}\]

In the next two activities, we develop the rules for differentiating the secant and cosecant functions.

Activity 2.4.1

Let \(h(x) = \sec(x)\) and recall that \(\sec(x) = \frac{1}{\cos(x)}\).

a. What is the domain of \(h\)?
b. Use the quotient rule to develop a formula for \(h'(x)\) that is expressed completely in terms of \(\sin(x)\) and \(\cos(x)\).
c. How can you use other relationships among trigonometric functions to write \(h'(x)\) only in terms of \(\tan(x)\) and \(\sec(x)\)?
d. What is the domain of \(h'\)? How does this compare to the domain of \(h\)?

Activity 2.11

Let \(p(x) = \csc(x)\) and recall that \(\csc(x) = \frac{1}{\sin(x)}\).

a. What is the domain of \(p\)?
b. Use the quotient rule to develop a formula for \(p'(x)\) that is expressed completely in terms of \(\sin(x)\) and \(\cos(x)\).
c. How can you use other relationships among trigonometric functions to write \(p'(x)\) only in terms of \(\cot(x)\) and \(\csc(x)\)?
d. What is the domain of \(p'\)? How does this compare to the domain of \(p\)?

The quotient rule has thus enabled us to determine the derivatives of the tangent, cotangent, secant, and cosecant functions,
expanding our overall library of basic functions we can differentiate. Moreover, we observe that just as the derivative of
any polynomial function is a polynomial, and the derivative of any exponential function is another exponential function, so
it is that the derivative of any basic trigonometric function is another function that consists of basic trigonometric
functions. This makes sense because all trigonometric functions are periodic, and hence their derivatives will be periodic,
too.
As has been and will continue to be the case throughout our work in Chapter 2, the derivative retains all of its fundamental
meaning as an instantaneous rate of change and as the slope of the tangent line to the function under consideration. Our
present work primarily expands the list of functions for which we can quickly determine a formula for the derivative.
Moreover, with the addition of tan(x), cot(x), sec(x), and csc(x) to our library of basic functions, there are many more
functions we can differentiate through the sum, constant multiple, product, and quotient rules.

Activity 2.4.2

Answer each of the following questions. Where a derivative is requested, be sure to label the derivative function with
its name using proper notation.
a. Let \(f(x) = 5\sec(x) - 2\csc(x)\). Find the slope of the tangent line to \(f\) at the point where \(x = \frac{\pi}{3}\).
b. Let \(p(z) = z^2 \sec(z) - z\cot(z)\). Find the instantaneous rate of change of \(p\) at the point where \(z = \frac{\pi}{4}\).
c. Let \(h(t) = \frac{\tan(t)}{t^2 + 1} - 2e^t \cos(t)\). Find \(h'(t)\).
d. Let \(g(r) = \frac{r \sec(r)}{5^r}\). Find \(g'(r)\).
e. When a mass hangs from a spring and is set in motion, the object’s position oscillates in a way that the size of the oscillations decreases. This is usually called a damped oscillation. Suppose that for a particular object, its displacement from equilibrium (where the object sits at rest) is modeled by the function \(s(t) = \frac{15\sin(t)}{e^t}\). Assume that \(s\) is measured in inches and \(t\) in seconds. Sketch a graph of this function for \(t \geq 0\) to see how it represents the situation described. Then compute \(\frac{ds}{dt}\), state the units on this function, and explain what it tells you about the object’s motion. Finally, compute and interpret \(s'(2)\).

Summary
In this section, we encountered the following important ideas:
The derivatives of the other four trigonometric functions are \(\frac{d}{dx}[\tan(x)] = \sec^2(x)\), \(\frac{d}{dx}[\cot(x)] = -\csc^2(x)\), \(\frac{d}{dx}[\sec(x)] = \sec(x)\tan(x)\), and \(\frac{d}{dx}[\csc(x)] = -\csc(x)\cot(x)\). Each derivative exists and is defined on the same domain as the original function. For example, both the tangent function and its derivative are defined for all real numbers \(x\) such that \(x \neq \frac{k\pi}{2}\), where \(k = \pm 1, \pm 3, \pm 5, \ldots\).
The above four rules for the derivatives of the tangent, cotangent, secant, and cosecant can be used along with the rules
for power functions, exponential functions, and the sine and cosine, as well as the sum, constant multiple, product, and
quotient rules, to quickly differentiate a wide range of different functions.
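The four rules summarized above can be spot-checked numerically. The Python sketch below is our own illustration and not part of the original text: the helper functions `sec`, `csc`, and `cot`, the helper `central_difference`, and the choice of \(x = 1\) (a point inside the domain of all four functions) are assumptions. It pairs each function with its claimed derivative and compares the rule against a numerical slope estimate.

```python
import math

def sec(x):
    return 1 / math.cos(x)

def csc(x):
    return 1 / math.sin(x)

def cot(x):
    return math.cos(x) / math.sin(x)

# Each entry pairs a function with the derivative rule stated in the summary.
rules = {
    "tan": (math.tan, lambda x: sec(x) ** 2),
    "cot": (cot, lambda x: -csc(x) ** 2),
    "sec": (sec, lambda x: sec(x) * math.tan(x)),
    "csc": (csc, lambda x: -csc(x) * cot(x)),
}

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for the limit."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 1.0  # hypothetical input where all four functions are defined
for name, (func, derivative) in rules.items():
    print(f"{name}: rule {derivative(x):+.8f}, "
          f"numerical {central_difference(func, x):+.8f}")
```

Such a check only works at inputs inside the domain. Near a multiple of \(\frac{\pi}{2}\), where some of these functions are undefined, the difference quotient becomes unreliable.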

Contributors and Attributions
Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

2.5: The Chain Rule
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a composite function and how do we recognize its structure algebraically?
Given a composite function \(C(x) = f(g(x))\) that is built from differentiable functions \(f\) and \(g\), how do we compute \(C'(x)\) in terms of \(f\), \(g\), \(f'\), and \(g'\)?

What is the statement of the Chain Rule?

In addition to learning how to differentiate a variety of basic functions, we have also been developing our ability to
understand how to use rules to differentiate certain algebraic combinations of them. For example, we not only know how
to take the derivative of \(f(x) = \sin(x)\) and \(g(x) = x^2\), but now we can quickly find the derivative of each of the
following combinations of f and g:

\[ s(x) = 3x^2 - 5\sin(x), \tag{2.5.1} \]

\[ p(x) = x^2 \sin(x), \tag{2.5.2} \]

and

\[ q(x) = \frac{\sin(x)}{x^2}. \tag{2.5.3} \]

Finding \(s'\) uses the sum and constant multiple rules, determining \(p'\) requires the product rule, and \(q'\) can be attained with
the quotient rule. Again, we note the importance of recognizing the algebraic structure of a given function in order to find
its derivative:

\[ s(x) = 3g(x) - 5f(x), \tag{2.5.4} \]

\[ p(x) = g(x) \cdot f(x), \tag{2.5.5} \]

and

\[ q(x) = \frac{f(x)}{g(x)}. \tag{2.5.6} \]

There is one more natural way to algebraically combine basic functions, and that is by composing them. For instance, let’s
consider the function
\[ C(x) = \sin(x^2) \tag{2.5.7} \]

and observe that any input x passes through a chain of functions. In particular, in the process that defines the function
C(x), x is first squared, and then the sine of the result is taken. Using an arrow diagram,

\[ x \to x^2 \to \sin(x^2). \tag{2.5.8} \]

In terms of the elementary functions f and g, we observe that x is first input in the function g, and then the result is used
as the input in f. Said differently, we can write

\[ C(x) = f(g(x)) = \sin(x^2) \tag{2.5.9} \]

and say that C is the composition of f and g . We will refer to g , the function that is first applied to x, as the inner function,
while f , the function that is applied to the result, is the outer function.
The main question that we answer in the present section is: given a composite function \(C(x) = f(g(x))\) that is built from
differentiable functions f and g, how do we compute \(C'(x)\) in terms of \(f\), \(g\), \(f'\), and \(g'\)? In the same way that the rate of
change of a product of two functions,

\[ p(x) = f(x) \cdot g(x), \tag{2.5.10} \]

depends on the behavior of both f and g, it makes sense intuitively that the rate of change of a composite function
\(C(x) = f(g(x))\) will also depend on some combination of f and g and their derivatives. The rule that describes how to
compute \(C'\) in terms of f and g and their derivatives will be called the chain rule. But before we can learn what the chain
rule says and why it works, we first need to be comfortable decomposing composite functions so that we can correctly
identify the inner and outer functions, as we did in the example above with

\[ C(x) = \sin(x^2). \tag{2.5.11} \]

Preview Activity 2.5.1:

For each function given below, identify its fundamental algebraic structure. In particular, is the given function a sum,
product, quotient, or composition of basic functions? If the function is a composition of basic functions, state a
formula for the inner function g and the outer function f so that the overall composite function can be written in the
form f (g(x)). If the function is a sum, product, or quotient of basic functions, use the appropriate rule to determine its
derivative.
a. \(h(x) = \tan(2^x)\)
b. \(p(x) = 2^x \tan(x)\)
c. \(r(x) = (\tan(x))^2\)
d. \(m(x) = e^{\tan(x)}\)
e. \(w(x) = \sqrt{x} + \tan(x)\)
f. \(z(x) = \sqrt{\tan(x)}\)

The Chain Rule


One of the challenges of differentiating a composite function is that it often cannot be written in an alternate algebraic
form. For instance, the function
\[ C(x) = \sin(x^2) \tag{2.5.12} \]

cannot be expanded or otherwise rewritten, so it presents no alternate approaches to taking the derivative. But other
composite functions can be expanded or simplified, and these present a way to begin to explore how the chain rule might
have to work. To that end, we consider two examples of composite functions that present alternate means of finding the
derivative.

Example 2.5.1: Derivative of a Composite Function of Polynomials

Let \(f(x) = -4x + 7\) and \(g(x) = 3x - 5\). Determine a formula for \(C(x) = f(g(x))\) and compute \(C'(x)\). How is \(C'\)
related to f and g and their derivatives?


Solution
By the rules given for f and g,

\[
\begin{aligned}
C(x) &= f(g(x)) \\
&= f(3x - 5) \\
&= -4(3x - 5) + 7 \\
&= -12x + 20 + 7 \\
&= -12x + 27.
\end{aligned}
\tag{2.5.13}
\]

Thus, \(C'(x) = -12\). Noting that \(f'(x) = -4\) and \(g'(x) = 3\), we observe that \(C'\) appears to be the product of \(f'\) and \(g'\).
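The observation in Example 2.5.1 can also be checked by direct computation. The following is a minimal sketch, assuming a hypothetical helper `compose_linear` (not part of the text) that represents a linear function mx + b by the pair (m, b).

```python
def compose_linear(outer, inner):
    """Compose two linear functions given as (slope, intercept) pairs.

    If outer(x) = m1*x + b1 and inner(x) = m2*x + b2, then
    outer(inner(x)) = (m1*m2)*x + (m1*b2 + b1).
    """
    m1, b1 = outer
    m2, b2 = inner
    return (m1 * m2, m1 * b2 + b1)


f = (-4, 7)   # f(x) = -4x + 7
g = (3, -5)   # g(x) = 3x - 5

C = compose_linear(f, g)
print(C)      # (slope, intercept) of C(x) = f(g(x))
```

The slope of the composition is the product of the two slopes, which matches the relationship between the derivatives observed in the example.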

From one perspective, Example 2.5.1 may be too elementary. Linear functions are the simplest of all functions, and
perhaps composing linear functions (which yields another linear function) does not exemplify the true complexity that is
involved with differentiating a composition of more complicated functions. At the same time, we should remember the
perspective that any differentiable function is locally linear, so any function with a derivative behaves like a line when

viewed up close. From this point of view, the fact that the derivatives of f and g are multiplied to find the derivative of
their composition turns out to be a key insight. We now consider a second example involving a nonlinear function to gain
further understanding of how differentiating a composite function involves the basic functions that combine to form it.

Example 2.5.2: Derivative of a Composite Function Involving Trigonometric Functions

Let \(C(x) = \sin(2x)\). Use the double angle identity to rewrite C as a product of basic functions, and use the product
rule to find \(C'\). Rewrite \(C'\) in the simplest form possible.

Solution
By the double angle identity for the sine function,

\[ C(x) = \sin(2x) = 2\sin(x)\cos(x). \tag{2.5.14} \]

Applying the product rule and simplifying,

\[ C'(x) = 2\sin(x)(-\sin(x)) + \cos(x)(2\cos(x)) = 2(\cos^2(x) - \sin^2(x)). \tag{2.5.15} \]

Next, we recall that one of the double angle identities for the cosine function tells us that

\[ \cos(2x) = \cos^2(x) - \sin^2(x). \tag{2.5.16} \]

Substituting this result in our expression for \(C'(x)\), we now have that \(C'(x) = 2\cos(2x)\).

So from Example 2.5.2, we see that if \(C(x) = \sin(2x)\), then \(C'(x) = 2\cos(2x)\). Letting \(g(x) = 2x\) and \(f(x) = \sin(x)\),
we observe that \(C(x) = f(g(x))\). Moreover, with \(g'(x) = 2\) and \(f'(x) = \cos(x)\), it follows that we can view the structure
of \(C'(x)\) as

\[ C'(x) = 2\cos(2x) = g'(x) f'(g(x)). \tag{2.5.17} \]

In this example, we see that for the composite function \(C(x) = f(g(x))\), the derivative \(C'\) is (as in the example involving
linear functions) constituted by multiplying the derivatives of f and g, but with the special condition that \(f'\) is evaluated at
\(g(x)\), rather than at x.
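As a sanity check on Example 2.5.2, the two forms of the derivative (Equations 2.5.15 and 2.5.17) can be compared against a numerical estimate. This is only a sketch: the helper `numerical_derivative` is a hypothetical name, and the central-difference step size is an assumption chosen for illustration.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def C(x):
    return math.sin(2 * x)


def C_prime_product_rule(x):
    # Equation 2.5.15: 2(cos^2(x) - sin^2(x))
    return 2 * (math.cos(x) ** 2 - math.sin(x) ** 2)


def C_prime_chain_form(x):
    # Equation 2.5.17: g'(x) * f'(g(x)) with g(x) = 2x and f(x) = sin(x)
    return 2 * math.cos(2 * x)


for x in (0.0, 0.5, 2.0):
    print(x, numerical_derivative(C, x), C_prime_product_rule(x), C_prime_chain_form(x))
```

At each sample point the three printed values should agree to several decimal places.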

It makes sense intuitively that these two quantities are involved in understanding the rate of change of a composite
function: if we are considering C (x) = f (g(x)) and asking how fast C is changing at a given x value as x changes, it
clearly matters how fast g is changing at x, as well as how fast f is changing at the value of g(x). It turns out that this
structure holds not only for the functions in Examples 2.5.1 and 2.5.2, but indeed for all differentiable functions, as is
stated in the Chain Rule. (Like other differentiation rules, the Chain Rule can be proved formally using the limit definition
of the derivative.)

Chain Rule

If g is differentiable at x and f is differentiable at g(x), then the composite function C defined by \(C(x) = f(g(x))\) is
differentiable at x and

\[ C'(x) = f'(g(x)) g'(x). \tag{2.5.18} \]

As with the product and quotient rules, it is often helpful to think verbally about what the chain rule says:

If C is a composite function defined by an outer function f and an inner function
g, then \(C'\) is given by the derivative of the outer function, evaluated at the inner
function, times the derivative of the inner function.
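The verbal form of the rule translates almost word for word into code. The sketch below assumes a hypothetical helper `chain_rule` that takes the outer derivative, the inner function, and the inner derivative, and applies it to the earlier example of sin(x^2). The numerical check is an illustration only, not a proof.

```python
import math


def chain_rule(f_prime, g, g_prime):
    """Build C'(x) = f'(g(x)) * g'(x) for the composite C(x) = f(g(x))."""
    def C_prime(x):
        return f_prime(g(x)) * g_prime(x)
    return C_prime


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x), used only as a sanity check."""
    return (func(x + h) - func(x - h)) / (2 * h)


# C(x) = sin(x^2): outer function f = sin, inner function g(x) = x^2.
def C(x):
    return math.sin(x ** 2)


C_prime = chain_rule(math.cos, lambda x: x ** 2, lambda x: 2 * x)

x = 1.3
print(C_prime(x), numerical_derivative(C, x))
```

Passing the inner function itself, and not only its derivative, is what lets the outer derivative be evaluated at g(x) rather than at x.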


At least initially in working particular examples requiring the chain rule, it can also be helpful to clearly identify the inner
function g and outer function f , compute their derivatives individually, and then put all of the pieces together to generate
the derivative of the overall composite function. To see what we mean by this, consider the function
\[ r(x) = (\tan(x))^2. \tag{2.5.19} \]

The function r is composite, with inner function \(g(x) = \tan(x)\) and outer function \(f(x) = x^2\). Organizing the key
information involving f, g, and their derivatives, we have

\[
\begin{aligned}
f(x) &= x^2, & g(x) &= \tan(x), \\
f'(x) &= 2x, & g'(x) &= \sec^2(x), \\
f'(g(x)) &= 2\tan(x).
\end{aligned}
\]

Applying the chain rule (Equation 2.5.18), which tells us that

\[ r'(x) = f'(g(x)) g'(x), \tag{2.5.20} \]

we find that for \(r(x) = (\tan(x))^2\), its derivative is

\[ r'(x) = 2\tan(x)\sec^2(x). \tag{2.5.21} \]

As a side note, we remark that another way to write r(x) is \(r(x) = \tan^2(x)\). Observe that in this format, the
composite nature of the function is more implicit, but this is common notation for powers of trigonometric functions:
\(\cos^4(x)\), \(\sin^5(x)\), and \(\sec^2(x)\) are all composite functions, with the outer function a power function and the inner
function a trigonometric one.

The chain rule now substantially expands the library of functions we can differentiate, as the following activity
demonstrates.

Activity 2.5.1: Inner vs. Outer Functions

For each function given below, identify an inner function g and outer function f to write the function in the form
\(f(g(x))\). Then, determine \(f'(x)\), \(g'(x)\), and \(f'(g(x))\), and finally apply the chain rule (Equation 2.5.18) to determine
the derivative of the given function.

a. \(h(x) = \cos(x^4)\)
b. \(p(x) = \sqrt{\tan(x)}\)
c. \(s(x) = 2^{\sin(x)}\)
d. \(z(x) = \cot^5(x)\)
e. \(m(x) = (\sec(x) + e^x)^9\)

Using multiple rules simultaneously

The chain rule now joins the sum, constant multiple, product, and quotient rules in our
collection of the different techniques for finding the derivative of a function through understanding its algebraic structure
and the basic functions that constitute it. It takes substantial practice to get comfortable with navigating multiple rules in a
single problem; using proper notation and taking a few extra steps can be particularly helpful as well. We demonstrate with
an example and then provide further opportunity for practice in the following activity.

Example 2.5.3: A More Complex Composite Function

Find a formula for the derivative of

\[ h(t) = 3^{t^2 + 2t} \sec^4(t). \tag{2.5.22} \]

Solution
We first observe that the most basic structure of h is that it is the product of two functions:

\[ h(t) = a(t) \cdot b(t) \tag{2.5.23} \]

where \(a(t) = 3^{t^2 + 2t}\) and \(b(t) = \sec^4(t)\). Therefore, we see that we will need to use the product rule to differentiate h.
When it comes time to differentiate a and b in their roles in the product rule, we observe that since each is a composite
function, the chain rule will be needed. We therefore begin by working separately to compute \(a'(t)\) and \(b'(t)\).

Writing
\[ a(t) = f(g(t)) = 3^{t^2 + 2t} \tag{2.5.24} \]

and finding the derivatives of f and g, we have

\[
\begin{aligned}
f(t) &= 3^t, & g(t) &= t^2 + 2t, \\
f'(t) &= 3^t \ln(3), & g'(t) &= 2t + 2, \\
f'(g(t)) &= 3^{t^2 + 2t} \ln(3).
\end{aligned}
\]

Thus, by the chain rule, it follows that

\[ a'(t) = f'(g(t)) g'(t) = 3^{t^2 + 2t} \ln(3)(2t + 2). \tag{2.5.25} \]

Turning next to b, we write

\[ b(t) = r(s(t)) = \sec^4(t) \tag{2.5.26} \]

and find the derivatives of r and s. Doing so,

\[
\begin{aligned}
r(t) &= t^4, & s(t) &= \sec(t), \\
r'(t) &= 4t^3, & s'(t) &= \sec(t)\tan(t), \\
r'(s(t)) &= 4\sec^3(t).
\end{aligned}
\]

By the chain rule (Equation 2.5.18), we now know that

\[ b'(t) = r'(s(t)) s'(t) = 4\sec^3(t)\sec(t)\tan(t) = 4\sec^4(t)\tan(t). \tag{2.5.27} \]

Now we are finally ready to compute the derivative of the overall function h. Recalling that

\[ h(t) = 3^{t^2 + 2t} \sec^4(t), \tag{2.5.28} \]

by the product rule we have

\[ h'(t) = 3^{t^2 + 2t} \frac{d}{dt}[\sec^4(t)] + \sec^4(t) \frac{d}{dt}[3^{t^2 + 2t}]. \tag{2.5.29} \]

From our work above with a and b, we know the derivatives of \(3^{t^2 + 2t}\) and \(\sec^4(t)\), and therefore

\[ h'(t) = 3^{t^2 + 2t} \cdot 4\sec^4(t)\tan(t) + \sec^4(t) \cdot 3^{t^2 + 2t} \ln(3)(2t + 2). \tag{2.5.30} \]
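The pieces of Example 2.5.3 can be assembled and compared with a numerical estimate. This is a hedged sketch: the function names mirror the example, while `sec` and `numerical_derivative` are hypothetical helpers, and the test point t = 0.4 is an arbitrary choice.

```python
import math


def sec(t):
    return 1 / math.cos(t)


def h(t):
    return 3 ** (t ** 2 + 2 * t) * sec(t) ** 4


def a_prime(t):
    # Equation 2.5.25
    return 3 ** (t ** 2 + 2 * t) * math.log(3) * (2 * t + 2)


def b_prime(t):
    # Equation 2.5.27
    return 4 * sec(t) ** 4 * math.tan(t)


def h_prime(t):
    # Product rule, Equation 2.5.30: a(t) * b'(t) + b(t) * a'(t)
    a = 3 ** (t ** 2 + 2 * t)
    b = sec(t) ** 4
    return a * b_prime(t) + b * a_prime(t)


def numerical_derivative(func, t, step=1e-6):
    """Central-difference estimate of func'(t)."""
    return (func(t + step) - func(t - step)) / (2 * step)


t = 0.4
print(h_prime(t), numerical_derivative(h, t))
```

Keeping `a_prime` and `b_prime` separate follows the example's advice to compute each chain rule piece before combining them with the product rule.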

Activity 2.5.2

For each of the following functions, find the function’s derivative. State the rule(s) you use, label relevant derivatives
appropriately, and be sure to clearly identify your overall answer.
a. \(p(r) = 4\sqrt{r^6 + 2e^r}\)
b. \(m(v) = \sin(v^2)\cos(v^3)\)
c. \(h(y) = \frac{\cos(10y)}{e^{4y} + 1}\)
d. \(s(z) = 2^{z^2 \sec(z)}\)
e. \(c(x) = \sin(e^{x^2})\)

The chain rule now adds substantially to our ability to do different familiar problems that involve derivatives. Whether
finding the equation of the tangent line to a curve, the instantaneous velocity of a moving particle, or the instantaneous rate
of change of a certain quantity, if the function under consideration involves a composition of other functions, the chain rule
is indispensable.

Activity 2.5.3

Use known derivative rules, including the chain rule, as needed to answer each of the following questions.
a. Find an equation for the tangent line to the curve \(y = \sqrt{e^x + 3}\) at the point where x = 0.
b. If

\[ s(t) = \frac{1}{(t^2 + 1)^3} \tag{2.5.31} \]

represents the position function of a particle moving horizontally along an axis at time t (where s is measured in
inches and t in seconds), find the particle’s instantaneous velocity at t = 1 . Is the particle moving to the left or right
at that instant?
c. At sea level, air pressure is 30 inches of mercury. At an altitude of h feet above sea level, the air pressure, P , in
inches of mercury, is given by the function

\[ P = 30e^{-0.0000323h}. \tag{2.5.32} \]

Compute dP /dh and explain what this derivative function tells you about air pressure, including a discussion of
the units on dP /dh. In addition, determine how fast the air pressure is changing for a pilot of a small plane passing
through an altitude of 1000 feet.
d. Suppose that f (x) and g(x) are differentiable functions and that the following information about them is known:

x     f(x)   f'(x)   g(x)   g'(x)
-1    2      -5      -3     4
2     -3     4       -1     2

If C(x) is a function given by the formula \(f(g(x))\), determine \(C'(2)\). In addition, if D(x) is the function \(f(f(x))\),
find \(D'(-1)\).

The Composite Version of Basic Function Rules


As we gain more experience with differentiating complicated functions, we will become more comfortable in the process
of simply writing down the derivative without taking multiple steps. We demonstrate part of this perspective here by
showing how we can find a composite rule that corresponds to two of our basic functions. For instance, we know that
\[ \frac{d}{dx}[\sin(x)] = \cos(x). \tag{2.5.33} \]

If we instead want to know

\[ \frac{d}{dx}[\sin(u(x))], \tag{2.5.34} \]

where u is a differentiable function of x, then this requires the chain rule with the sine function as the outer function.
Applying the chain rule (Equation 2.5.18),

\[ \frac{d}{dx}[\sin(u(x))] = \cos(u(x)) \cdot u'(x). \tag{2.5.35} \]

Similarly, since

\[ \frac{d}{dx}[a^x] = a^x \ln(a), \tag{2.5.36} \]

it follows by the chain rule that

\[ \frac{d}{dx}[a^{u(x)}] = a^{u(x)} \ln(a) \cdot u'(x). \tag{2.5.37} \]

In the process of getting comfortable with derivative rules, an excellent exercise is to write down a list of all basic
functions whose derivatives are known, list those derivatives, and then write the corresponding chain rule for the
composite version with the inner function being an unknown function u(x) and the outer function being the known basic
function. These versions of the chain rule are particularly simple when the inner function is linear, since the derivative of a
linear function is a constant. For instance,
\[ \frac{d}{dx}\left[(5x + 7)^{10}\right] = 10(5x + 7)^9 \cdot 5, \tag{2.5.38} \]

\[ \frac{d}{dx}[\tan(17x)] = 17\sec^2(17x), \tag{2.5.39} \]

and

\[ \frac{d}{dx}[e^{-3x}] = -3e^{-3x}. \tag{2.5.40} \]

Summary
In this section, we encountered the following important ideas:
A composite function is one where the input variable x first passes through one function, and then the resulting output
passes through another. For example, the function
\[ h(x) = 2^{\sin(x)} \tag{2.5.41} \]

is composite since

\[ x \to \sin(x) \to 2^{\sin(x)}. \tag{2.5.42} \]

Given a composite function \(C(x) = f(g(x))\) that is built from differentiable functions f and g, the chain rule tells us
that we compute \(C'(x)\) in terms of \(f\), \(g\), \(f'\), and \(g'\) according to the formula

\[ C'(x) = f'(g(x)) g'(x). \tag{2.5.43} \]

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

2.6: Derivatives of Inverse Functions
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is the derivative of the natural logarithm function?
What are the derivatives of the inverse trigonometric functions arcsin(x) and arctan(x)?
If g is the inverse of a differentiable function f, how is \(g'\) computed in terms of \(f\), \(f'\), and \(g\)?

Much of mathematics centers on the notion of function. Indeed, throughout our study of calculus, we are investigating the
behavior of functions, often doing so with particular emphasis on how fast the output of the function changes in response
to changes in the input. Because each function represents a process, a natural question to ask is whether or not the
particular process can be reversed. That is, if we know the output that results from the function, can we determine the input
that led to it? Connected to this question, we now also ask: if we know how fast a particular process is changing, can we
determine how fast the inverse process is changing?
As we have noted, one of the most important functions in all of mathematics is the natural exponential function \(f(x) = e^x\).

Because the natural logarithm, g(x) = ln(x) , is the inverse of the natural exponential function, the natural logarithm is
similarly important. One of our goals in this section is to learn how to differentiate the logarithm function, and thus expand
our library of basic functions with known derivative formulas. First, we investigate a more familiar setting to refresh some
of the basic concepts surrounding functions and their inverses.

Preview Activity 2.6.1


The equation \(y = \frac{5}{9}(x - 32)\) relates a temperature given in x degrees Fahrenheit to the corresponding temperature y
measured in degrees Celsius.
a. Solve the equation \(y = \frac{5}{9}(x - 32)\) for x to write x (Fahrenheit temperature) in terms of y (Celsius temperature).
b. Let \(C(x) = \frac{5}{9}(x - 32)\) be the function that takes a Fahrenheit temperature as input and produces the Celsius
temperature as output. In addition, let F(y) be the function that converts a temperature given in y degrees Celsius to
the temperature F(y) measured in degrees Fahrenheit. Use your work in (a) to write a formula for F(y).
c. Next consider the new function defined by \(p(x) = F(C(x))\). Use the formulas for F and C to determine an expression
for p(x) and simplify this expression as much as possible. What do you observe?
d. Now, let \(r(y) = C(F(y))\). Use the formulas for F and C to determine an expression for r(y) and simplify this
expression as much as possible. What do you observe?
e. What is the value of \(C'(x)\)? of \(F'(y)\)? How do these values appear to be related?

Basic Facts about Inverse Functions


A function \(f : A \to B\) is a rule that associates each element in the set A to one and only one element in the set B. We call A
the domain of f and B the codomain of f. If there exists a function \(g : B \to A\) such that \(g(f(a)) = a\) for every possible
choice of a in the set A and \(f(g(b)) = b\) for every b in the set B, then we say that g is the inverse of f. We often use the
notation \(f^{-1}\) (read “f-inverse”) to denote the inverse of f. Perhaps the most essential thing to observe about the inverse
function is that it undoes the work of f. Indeed, if \(y = f(x)\), then

\[ f^{-1}(y) = f^{-1}(f(x)) = x, \tag{2.6.1} \]

and this leads us to another key observation: writing \(y = f(x)\) and \(x = f^{-1}(y)\) say the exact same thing. The only difference
between the two equations is one of perspective – one is solved for x, while the other is solved for y. Here we briefly
remind ourselves of some key facts about inverse functions. For a function \(f : A \to B\),
f has an inverse if and only if f is one-to-one and onto (both terms are defined in the Note below);
provided \(f^{-1}\) exists, the domain of \(f^{-1}\) is the codomain of f, and the codomain of \(f^{-1}\) is the domain of f;
\(f^{-1}(f(x)) = x\) for every x in the domain of f and \(f(f^{-1}(y)) = y\) for every y in the codomain of f;
\(y = f(x)\) if and only if \(x = f^{-1}(y)\).
The last stated fact reveals a special relationship between the graphs of f and \(f^{-1}\). In particular, if we consider \(y = f(x)\) and
a point (x, y) that lies on the graph of f, then it is also true that \(x = f^{-1}(y)\), which means that the point (y, x) lies on the
graph of \(f^{-1}\).

Note
A function f is one-to-one provided that no two distinct inputs lead to the same output.
A function f is onto provided that every possible element of the codomain can be realized as an output of the
function for some choice of input from the domain.

This shows us that the graphs of f and \(f^{-1}\) are the reflections of one another across the line y = x, since reflecting across y
= x is precisely the geometric action that swaps the coordinates in an ordered pair. In Figure 2.8, we see this exemplified
for the function \(y = f(x) = 2^x\) and its inverse, with the points \((-1, \frac{1}{2})\) and \((\frac{1}{2}, -1)\) highlighting the reflection of the
curves across y = x.

Figure 2.8: A graph of a function \(y = f(x)\) along with its inverse, \(y = f^{-1}(x)\).
To close our review of important facts about inverses, we recall that the natural exponential function \(y = f(x) = e^x\) has an
inverse function, and its inverse is the natural logarithm, \(x = f^{-1}(y) = \ln(y)\). Indeed, writing \(y = e^x\) is interchangeable
with \(x = \ln(y)\), plus \(\ln(e^x) = x\) for every real number x and \(e^{\ln(y)} = y\) for every positive real number y.
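These facts about inverses are easy to observe numerically. The sketch below uses only Python's standard `math` module. The names `f` and `f_inverse` are illustrative, and the sample values are arbitrary.

```python
import math


# f(x) = 2^x and its inverse, the base-2 logarithm.
def f(x):
    return 2 ** x


def f_inverse(y):
    return math.log2(y)


# A point (x, y) on the graph of f reflects to (y, x) on the graph of the inverse.
x = -1
y = f(x)
print((x, y), (y, f_inverse(y)))

# The natural exponential and natural logarithm undo one another.
for value in (-2.0, 0.0, 3.5):
    print(value, math.log(math.exp(value)))   # ln(e^x) = x for every real x
for value in (0.25, 1.0, 40.0):
    print(value, math.exp(math.log(value)))   # e^(ln y) = y for every positive y
```

Because of floating-point rounding, the round trips may differ from the inputs in the last few digits.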

The Derivative of the Natural Logarithm Function


In what follows, we determine a formula for the derivative of \(g(x) = \ln(x)\). To do so, we take advantage of the fact that
we know the derivative of the natural exponential function, which is the inverse of g. In particular, we know that writing
\(g(x) = \ln(x)\) is equivalent to writing \(e^{g(x)} = x\). Now we differentiate both sides of this most recent equation. In particular,
we observe that

\[ \frac{d}{dx}\left[e^{g(x)}\right] = \frac{d}{dx}[x]. \tag{2.6.2} \]

The righthand side of Equation 2.6.2 is simply 1; applying the chain rule to the left side, we find that

\[ e^{g(x)} g'(x) = 1. \tag{2.6.3} \]

Since our goal is to determine \(g'(x)\), we solve Equation 2.6.3 for \(g'(x)\), so

\[ g'(x) = \frac{1}{e^{g(x)}}. \tag{2.6.4} \]

Finally, we recall that

\[ g(x) = \ln(x), \tag{2.6.5} \]

so the denominator of Equation 2.6.4 simplifies to

\[ e^{g(x)} = e^{\ln(x)} = x, \tag{2.6.6} \]

and thus

\[ g'(x) = \frac{1}{x}. \tag{2.6.7} \]

Natural Logarithm
For all positive real numbers x,
\[ \frac{d}{dx}[\ln(x)] = \frac{1}{x}. \tag{2.6.8} \]

This rule for the natural logarithm function now joins our list of other basic derivative rules that we have already
established. There are two particularly interesting things to note about the fact that \(\frac{d}{dx}[\ln(x)] = \frac{1}{x}\). One is that this rule
is restricted to only apply to positive values of x, as these are the only values for which the original function is defined.
The other is that for the first time in our work, differentiating a basic function of a particular type has led to a function of a
very different nature: the derivative of the natural logarithm is not another logarithm, nor even an exponential function, but
rather a rational one. Derivatives of logarithms may now be computed in concert with all of the rules known to date. For
instance, if \(f(t) = \ln(t^2 + 1)\), then by the chain rule, \(f'(t) = \frac{1}{t^2 + 1} \cdot 2t\).
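The closing example, the derivative of ln(t^2 + 1), can be compared with a numerical estimate. This is only a sketch, with `numerical_derivative` as a hypothetical helper and arbitrary sample points.

```python
import math


def numerical_derivative(func, t, step=1e-6):
    """Central-difference estimate of func'(t)."""
    return (func(t + step) - func(t - step)) / (2 * step)


def f(t):
    return math.log(t ** 2 + 1)


def f_prime(t):
    # Chain rule: (1 / (t^2 + 1)) * 2t
    return 2 * t / (t ** 2 + 1)


for t in (-3.0, 0.0, 1.0, 10.0):
    print(t, f_prime(t), numerical_derivative(f, t))
```

Note that negative inputs t are allowed here, since the inner function t^2 + 1 is always positive.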

Activity 2.6.1:

For each function given below, find its derivative.


a. \(h(x) = x^2 \ln(x)\)
b. \(p(t) = \frac{\ln(t)}{e^t + 1}\)
c. \(s(y) = \ln(\cos(y) + 2)\)
d. \(z(x) = \tan(\ln(x))\)
e. \(m(z) = \ln(\ln(z))\)

In addition to the important rule we have derived for the derivative of the natural log function, there are additional
interesting connections to note between the graphs of \(f(x) = e^x\) and \(f^{-1}(x) = \ln(x)\).
In Figure 2.9, we are reminded that since the natural exponential function has the property that its derivative is itself, the
slope of the tangent to \(y = e^x\) is equal to the height of the curve at that point. For instance, at the point \(A = (\ln(0.5), 0.5)\),
the slope of the tangent line is \(m_A = 0.5\), and at \(B = (\ln(5), 5)\), the tangent line’s slope is \(m_B = 5\). At the corresponding
points \(A'\) and \(B'\) on the graph of the natural logarithm function (which come from reflecting across the line y = x), we
know that the slope of the tangent line is the reciprocal of the x-coordinate of the point (since \(\frac{d}{dx}[\ln(x)] = \frac{1}{x}\)). Thus,
with \(A' = (0.5, \ln(0.5))\), we have \(m_{A'} = \frac{1}{0.5} = 2\), and at \(B' = (5, \ln(5))\), \(m_{B'} = \frac{1}{5}\).

Figure 2.9: A graph of the function \(y = e^x\) along with its inverse, \(y = \ln(x)\), where both functions are viewed using the
input variable x.
In particular, we observe that \(m_{A'} = \frac{1}{m_A}\) and \(m_{B'} = \frac{1}{m_B}\). This is not a coincidence, but in fact holds for any curve
\(y = f(x)\) and its inverse, provided the inverse exists and the slope in question is nonzero. One rationale for why this is the case is due to the reflection across y = x:
in so doing, we essentially change the roles of x and y, thus reversing the rise and run, which leads to the slope of the
inverse function at the reflected point being the reciprocal of the slope of the original function. At the close of this section,
we will also look at how the chain rule provides us with an algebraic formulation of this general phenomenon.
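The reciprocal relationship between the slopes at reflected points can be observed numerically. In this sketch the heights 0.5 and 5 correspond to the points A and B discussed above. The helper `numerical_derivative` and its step size are assumptions made for illustration.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Points A = (ln(0.5), 0.5) and B = (ln(5), 5) lie on y = e^x;
# their reflections (0.5, ln(0.5)) and (5, ln(5)) lie on y = ln(x).
for height in (0.5, 5.0):
    slope_on_exp = numerical_derivative(math.exp, math.log(height))
    slope_on_log = numerical_derivative(math.log, height)
    print(height, slope_on_exp, slope_on_log, slope_on_exp * slope_on_log)
```

The last printed column is the product of the two slopes, which should be very close to 1 when the slopes are reciprocals.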

Inverse Trigonometric Functions and their Derivatives
Trigonometric functions are periodic, so they fail to be one-to-one, and thus do not have inverses. However, if we restrict
the domain of each trigonometric function, we can force the function to be one-to-one. For instance, consider the sine
function on the domain \([-\frac{\pi}{2}, \frac{\pi}{2}]\).
Because no output of the sine function is repeated on this interval, the function is one-to-one and thus has an inverse. In
particular, if we view \(f(x) = \sin(x)\) as having domain \([-\frac{\pi}{2}, \frac{\pi}{2}]\) and codomain [-1, 1], then there exists an inverse
function \(f^{-1}\) such that

\[ f^{-1} : [-1, 1] \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]. \tag{2.6.9} \]

We call \(f^{-1}\) the arcsine (or inverse sine) function and write \(f^{-1}(y) = \arcsin(y)\). It is especially important to remember that
writing

\[ y = \sin(x) \quad \text{and} \quad x = \arcsin(y) \tag{2.6.10} \]

say the exact same thing. We often read “the arcsine of y” as “the angle whose sine is y.”

Figure 2.10: A graph of \(f(x) = \sin(x)\) (in blue), restricted to the domain \([-\frac{\pi}{2}, \frac{\pi}{2}]\), along with its inverse, \(f^{-1}(x) = \arcsin(x)\) (in magenta).
For example, we say that \(\frac{\pi}{6}\) is the angle whose sine is \(\frac{1}{2}\), which can be written more concisely as \(\arcsin(\frac{1}{2}) = \frac{\pi}{6}\),
which is equivalent to writing \(\sin(\frac{\pi}{6}) = \frac{1}{2}\).

Next, we determine the derivative of the arcsine function. Letting \(h(x) = \arcsin(x)\), our goal is to find \(h'(x)\). Since h(x) is
the angle whose sine is x, it is equivalent to write

\[ \sin(h(x)) = x. \tag{2.6.11} \]

Differentiating both sides of the previous equation, we have

\[ \frac{d}{dx}[\sin(h(x))] = \frac{d}{dx}[x], \tag{2.6.12} \]

and by the fact that the righthand side is simply 1 and by the chain rule applied to the left side,

\[ \cos(h(x)) h'(x) = 1. \tag{2.6.13} \]

Solving for \(h'(x)\), it follows that \(h'(x) = \frac{1}{\cos(h(x))}\). Finally, we recall that \(h(x) = \arcsin(x)\), so the denominator of \(h'(x)\)
is the function \(\cos(\arcsin(x))\), or in other words, “the cosine of the angle whose sine is x.” A bit of right triangle
trigonometry allows us to simplify this expression considerably. Let’s say that \(\theta = \arcsin(x)\), so that θ is the angle whose
sine is x. From this, it follows that we can picture θ as an angle in a right triangle with hypotenuse 1 and a vertical leg of
length x, as shown in Figure 2.11. The horizontal leg must be \(\sqrt{1 - x^2}\), by the Pythagorean Theorem. Now, note
particularly that \(\theta = \arcsin(x)\) since \(\sin(\theta) = x\), and recall that we want to know a different expression for \(\cos(\arcsin(x))\).

Figure 2.11: The right triangle that corresponds to the angle \(\theta = \arcsin(x)\).
From the figure, \(\cos(\arcsin(x)) = \cos(\theta) = \sqrt{1 - x^2}\). Thus, returning to our earlier work where we established that if

\[ h(x) = \arcsin(x), \tag{2.6.14} \]

then

\[ h'(x) = \frac{1}{\cos(\arcsin(x))}, \tag{2.6.15} \]

we have now shown that

\[ h'(x) = \frac{1}{\sqrt{1 - x^2}}. \tag{2.6.16} \]

Inverse sine
For all real numbers x such that \(-1 < x < 1\),

\[ \frac{d}{dx}[\arcsin(x)] = \frac{1}{\sqrt{1 - x^2}}. \tag{2.6.17} \]
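Both the right-triangle identity and the resulting derivative rule can be spot-checked with Python's `math.asin`. This sketch assumes a hypothetical `numerical_derivative` helper and uses sample points strictly inside (-1, 1), where the rule applies.

```python
import math


def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


for x in (-0.9, 0.0, 0.5):
    triangle_identity = math.cos(math.asin(x))   # cos(arcsin(x))
    horizontal_leg = math.sqrt(1 - x ** 2)       # leg length from the right triangle
    claimed = 1 / horizontal_leg                 # Equation 2.6.17
    estimate = numerical_derivative(math.asin, x)
    print(x, triangle_identity, horizontal_leg, claimed, estimate)
```

As x approaches 1 or -1 the claimed derivative grows without bound, which is why the rule excludes the endpoints.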

Activity 2.6.2:

The following prompts in this activity will lead you to develop the derivative of the inverse tangent function.
a. Let \(r(x) = \arctan(x)\). Use the relationship between the arctangent and tangent functions to rewrite this equation
using only the tangent function.
b. Differentiate both sides of the equation you found in (a). Solve the resulting equation for \(r'(x)\), writing \(r'(x)\) as
simply as possible in terms of a trigonometric function evaluated at r(x).
c. Recall that \(r(x) = \arctan(x)\). Update your expression for \(r'(x)\) so that it only involves trigonometric functions and
the independent variable x.
d. Introduce a right triangle with angle θ so that θ = arctan(x). What are the three sides of the triangle?
e. In terms of only x and 1, what is the value of cos(arctan(x))?
f. Use the results of your work above to find an expression involving only 1 and x for \(r'(x)\).

While derivatives for other inverse trigonometric functions can be established similarly, we primarily limit ourselves to the
arcsine and arctangent functions. With these rules added to our library of derivatives of basic functions, we can
differentiate even more functions using derivative shortcuts. In Activity 2.6.3, we see each of these rules at work.

Activity 2.6.3:

Determine the derivative of each of the following functions.


a. \(f(x) = x^3 \arctan(x) + e^x \ln(x)\)
b. \(p(t) = 2^{t \arcsin(t)}\)
c. \(h(z) = (\arcsin(5z) + \arctan(4 - z))^{27}\)
d. \(s(y) = \cot(\arctan(y))\)
e. \(m(v) = \ln(\sin^2(v) + 1)\)
f. \(g(w) = \arctan\left(\frac{\ln(w)}{1 + w^2}\right)\)

The link between the derivative of a function and the derivative of its inverse

In Figure 2.9, we saw an interesting
relationship between the slopes of tangent lines to the natural exponential and natural logarithm functions at points that
corresponded to reflection across the line y = x. In particular, we observed that for a point such as \((\ln(2), 2)\) on the graph
of \(f(x) = e^x\), the slope of the tangent line at this point is \(f'(\ln(2)) = 2\), while at the corresponding point \((2, \ln(2))\) on
the graph of \(f^{-1}(x) = \ln(x)\), the slope of the tangent line at this point is \((f^{-1})'(2) = \frac{1}{2}\), which is the reciprocal of
\(f'(\ln(2))\). That the two corresponding tangent lines have slopes that are reciprocals of one another is not a coincidence. If

we consider the general setting of a differentiable function f with differentiable inverse g such that y = f(x) if and only if x
= g(y), then we know that \(f(g(x)) = x\) for every x in the domain of \(f^{-1}\). Differentiating both sides of this equation with
respect to x, we have \(\frac{d}{dx}[f(g(x))] = \frac{d}{dx}[x]\), and by the chain rule, \(f'(g(x)) g'(x) = 1\). Solving for \(g'(x)\), we
have \(g'(x) = \frac{1}{f'(g(x))}\). Here we see that the slope of the tangent line to the inverse function g at the point (x, g(x)) is
precisely the reciprocal of the slope of the tangent line to the original function f at the point (g(x), f(g(x))) = (g(x), x).

Figure 2.12: A graph of function y = f(x) along with its inverse, \( y = g(x) = f^{-1}(x) \). Observe that the slopes of the two
tangent lines are reciprocals of one another.
To see this more clearly, consider the graph of the function y = f(x) shown in Figure 2.12, along with its inverse y = g(x). Given a point (a, b) that lies on the graph of f, we know that (b, a) lies on the graph of g; said differently, f(a) = b and g(b) = a. Now, applying the rule that \( g'(x) = 1/f'(g(x)) \) to the value x = b, we have

\[ g'(b) = \frac{1}{f'(g(b))} = \frac{1}{f'(a)}, \]

which is precisely what we see in the figure: the slope of the tangent line to g at (b, a) is the reciprocal of the slope of the tangent line to f at (a, b), since these two lines are reflections of one another across the line y = x.

Derivative of an inverse function: Suppose that f is a differentiable function with inverse g and that (a, b) is a point that lies on the graph of f at which \( f'(a) \neq 0 \). Then \( g'(b) = \frac{1}{f'(a)} \). More generally, for any x in the domain of \( g' \), we have \( g'(x) = 1/f'(g(x)) \).

The rules we derived for ln(x), arcsin(x), and arctan(x) are all just specific examples of this general property of the derivative of an inverse function. For example, with \( g(x) = \ln(x) \) and \( f(x) = e^x \), it follows that

\[ g'(x) = \frac{1}{f'(g(x))} = \frac{1}{e^{\ln(x)}} = \frac{1}{x}. \]
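The inverse-function rule can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `central_difference` and `inverse_slope` are hypothetical, and the derivative of f is only estimated with a symmetric difference quotient, so the agreement is approximate rather than exact.

```python
import math

def central_difference(func, x, h=1e-6):
    """Estimate func'(x) with a symmetric difference quotient (an approximation)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def inverse_slope(f, g, b):
    """Slope of the inverse g at x = b, computed as 1 / f'(g(b))."""
    return 1.0 / central_difference(f, g(b))

# f(x) = e^x with inverse g(x) = ln(x), checked at b = 2.
slope_ln = inverse_slope(math.exp, math.log, 2.0)
print(slope_ln, central_difference(math.log, 2.0))

# f(x) = tan(x) with inverse g(x) = arctan(x), checked at b = 1.5.
slope_atan = inverse_slope(math.tan, math.atan, 1.5)
print(slope_atan, 1.0 / (1.0 + 1.5 ** 2))
```

In each printed pair, the first number comes from the rule g'(b) = 1/f'(g(b)) and the second from either a direct estimate of g'(b) or the known formula, so the two entries should agree to several decimal places.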

Summary
In this section, we encountered the following important ideas:
For all positive real numbers x, \( \frac{d}{dx}[\ln(x)] = \frac{1}{x} \).
For all real numbers x such that −1 < x < 1, \( \frac{d}{dx}[\arcsin(x)] = \frac{1}{\sqrt{1 - x^2}} \). In addition, for all real numbers x, \( \frac{d}{dx}[\arctan(x)] = \frac{1}{1 + x^2} \).
If g is the inverse of a differentiable function f, then for any point x in the domain of \( g' \), \( g'(x) = \frac{1}{f'(g(x))} \).

2.7: Derivatives of Functions Given Implicitly
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What does it mean to say that a curve is an implicit function of x, rather than an explicit function of x?
How does implicit differentiation enable us to find a formula for dy/dx when y is an implicit function of x?
In the context of an implicit curve, how can we use \( \frac{dy}{dx} \) to answer important questions about the tangent line to the
curve?

In all of our studies with derivatives to date, we have worked in a setting where we can express a formula for the function
of interest explicitly in terms of x. But there are many interesting curves that are determined by an equation involving x
and y for which it is impossible to solve for y in terms of x. Perhaps the simplest and most natural of all such curves are
circles. Because of the circle’s symmetry, for each x value strictly between the endpoints of the horizontal diameter, there
are two corresponding y -values.

Figure 2.7.1: At left, the circle given by \( x^2 + y^2 = 16 \). In the middle, the portion of the circle \( x^2 + y^2 = 16 \) that has
been highlighted in the box at left. And at right, the lemniscate given by \( x^3 - y^3 = 6xy \).

For instance, in Figure 2.7.1, we have labeled \( A = (-3, \sqrt{7}) \) and \( B = (-3, -\sqrt{7}) \), and these points demonstrate that the circle fails the vertical line test. Hence, it is impossible to represent the circle through a single function of the form y = f(x). At the same time, portions of the circle can be represented explicitly as a function of x, such as the highlighted arc that is magnified in the center of Figure 2.7.1. Moreover, it is evident that the circle is locally linear, so we ought to be able to find a tangent line to the curve at every point; thus, it makes sense to wonder if we can compute \( \frac{dy}{dx} \) at any point on the circle, even though we cannot write y explicitly as a function of x. Finally, we note that the righthand curve in Figure 2.7.1 is called a lemniscate and is just one of many fascinating possibilities for implicitly given curves.

In working with implicit functions, we will often be interested in finding an equation for \( \frac{dy}{dx} \) that tells us the slope of the tangent line to the curve at a point (x, y). To do so, it will be necessary for us to work with y while thinking of y as a function of x, but without being able to write an explicit formula for y in terms of x. The following preview activity reminds us of some ways we can compute derivatives of functions in settings where the function’s formula is not known. For instance, recall the earlier example

\[ \frac{d}{dx}\left[ e^{u(x)} \right] = e^{u(x)} u'(x). \tag{2.7.1} \]

Preview Activity 2.7.1

Let f be a differentiable function of x (whose formula is not known) and recall that \( \frac{d}{dx}[f(x)] \) and \( f'(x) \) are
interchangeable notations. Determine each of the following derivatives of combinations of explicit functions of x, the
unknown function f, and an arbitrary constant c.
a. \( \frac{d}{dx}\left[ x^2 + f(x) \right] \)
b. \( \frac{d}{dx}\left[ x^2 f(x) \right] \)
c. \( \frac{d}{dx}\left[ c + x + f(x)^2 \right] \)
d. \( \frac{d}{dx}\left[ f(x^2) \right] \)
e. \( \frac{d}{dx}\left[ x f(x) + f(cx) + c f(x) \right] \)

Implicit Differentiation
Because a circle is perhaps the simplest of all curves that cannot be represented explicitly as a single function of x, we
begin our exploration of implicit differentiation with the example of the circle given by
\[ x^2 + y^2 = 16. \tag{2.7.2} \]

It is visually apparent that this curve is locally linear, so it makes sense for us to want to find the slope of the tangent line to the curve at any point, and moreover to think that the curve is differentiable. The big question is: how do we find a formula for \( \frac{dy}{dx} \), the slope of the tangent line to the circle at a given point on the circle? By viewing y as an implicit function of x, we essentially think of y as some function whose formula f(x) is unknown, but which we can differentiate. Just as y represents an unknown formula, so too its derivative with respect to x, \( \frac{dy}{dx} \), will be (at least temporarily) unknown.
Consider Equation 2.7.2 and view y as an unknown differentiable function of x. Differentiating both sides of Equation 2.7.2
with respect to x, we have

\[ \frac{d}{dx}\left[ x^2 + y^2 \right] = \frac{d}{dx}[16]. \tag{2.7.3} \]

On the right side of Equation 2.7.3, the derivative of the constant 16 is 0, and on the left we can apply the sum rule, so it
follows that

\[ \frac{d}{dx}\left[ x^2 \right] + \frac{d}{dx}\left[ y^2 \right] = 0. \tag{2.7.4} \]

Next, it is essential that we recognize the different roles being played by x and y. Since x is the independent variable, it is
the variable with respect to which we are differentiating, and thus

\[ \frac{d}{dx}\left[ x^2 \right] = 2x. \tag{2.7.5} \]

But y is the dependent variable and y is an implicit function of x. Thus, when we want to compute \( \frac{d}{dx}[y^2] \) it is identical to
the situation in Preview Activity 2.7.1 where we computed

\[ \frac{d}{dx}\left[ f(x)^2 \right]. \tag{2.7.6} \]

In both situations, we have an unknown function being squared, and we seek the derivative of the result. This requires the
chain rule, by which we find that

\[ \frac{d}{dx}\left[ y^2 \right] = 2y^1 \frac{dy}{dx}. \tag{2.7.7} \]

Therefore, continuing our work in differentiating both sides of Equation 2.7.2, we now have that

\[ 2x + 2y \frac{dy}{dx} = 0. \tag{2.7.8} \]

To find an expression for \( \frac{dy}{dx} \), we solve for it in Equation 2.7.8. Subtracting 2x from both sides and dividing by 2y,

\[ \frac{dy}{dx} = -\frac{2x}{2y} = -\frac{x}{y}. \tag{2.7.9} \]

There are several important things to observe about the result that

\[ \frac{dy}{dx} = -\frac{x}{y}. \tag{2.7.10} \]

First, this expression for the derivative involves both x and y. It makes sense that this should be the case, since for each value of x between −4 and 4, there are two corresponding points on the circle, and the slope of the tangent line is different at each of these points. Second, this formula is entirely consistent with our understanding of circles. If we consider the radius from the origin to the point (a, b), the slope of this line segment is \( m_r = \frac{b}{a} \). The tangent line to the circle at (a, b) will be perpendicular to the radius, and thus have slope \( m_t = -\frac{a}{b} \), as shown in Figure 2.7.2. Finally, the slope of the tangent line is zero at (0, 4) and (0, −4), and is undefined at (−4, 0) and (4, 0); all of these values are consistent with
Equation 2.7.10.
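As a quick consistency check, here is a worked sketch (an illustration added for this purpose, not part of the original derivation). It differentiates the explicit formula for the top half of the circle, confirms that the result agrees with Equation 2.7.10, and then evaluates the slope at the point A labeled in Figure 2.7.1.

```latex
% Top half of the circle written explicitly, then differentiated with the chain rule:
\[ y = \sqrt{16 - x^2}
   \quad\Longrightarrow\quad
   \frac{dy}{dx} = \frac{-2x}{2\sqrt{16 - x^2}} = -\frac{x}{\sqrt{16 - x^2}} = -\frac{x}{y}. \]

% Slope of the tangent line at A = (-3, \sqrt{7}), and the slope of the radius to A:
\[ m_t = \left. \frac{dy}{dx} \right|_{(-3, \sqrt{7})} = -\frac{-3}{\sqrt{7}} = \frac{3}{\sqrt{7}},
   \qquad
   m_r = \frac{\sqrt{7}}{-3},
   \qquad
   m_r \cdot m_t = -1. \]
```

The product of the two slopes is −1, which is the perpendicularity between the radius and the tangent line described above.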

Figure 2.7.2: The circle given by \( x^2 + y^2 = 16 \) with point (a, b) on the circle and the tangent line at that point, with
labeled slopes of the radial line, \( m_r \), and tangent line, \( m_t \).

Essentially the idea of an implicit function is that it can be broken into pieces where each piece can be viewed as an
explicit function of x, and the combination of those pieces constitutes the full implicit function. For the circle, we
could choose to take the top half as one explicit function of x, and the bottom half as another.

We consider the following more complicated example to investigate and demonstrate some additional algebraic issues that
arise in problems involving implicit differentiation.

Example 2.7.1

For the curve given implicitly by

\[ x^3 + y^2 - 2xy = 2 \tag{2.7.11} \]

and shown in Figure 2.7.3, find the slope of the tangent line at (−1, 1).

Figure 2.7.3: The curve \( x^3 + y^2 - 2xy = 2 \).
Solution

We begin by differentiating the curve’s equation implicitly. Taking the derivative of each side of Equation 2.7.11 with
respect to x,

\[ \frac{d}{dx}\left[ x^3 + y^2 - 2xy \right] = \frac{d}{dx}[2], \tag{2.7.12} \]

by the sum rule and the fact that the derivative of a constant is zero, we have

\[ \frac{d}{dx}\left[ x^3 \right] + \frac{d}{dx}\left[ y^2 \right] - \frac{d}{dx}[2xy] = 0. \tag{2.7.13} \]

For the three derivatives we now must execute, the first uses the simple power rule, the second requires the chain rule
(since y is an implicit function of x), and the third necessitates the product rule (again since y is a function of x).
Applying these rules, we now find that

\[ 3x^2 + 2y \frac{dy}{dx} - \left[ 2x \frac{dy}{dx} + 2y \right] = 0. \tag{2.7.14} \]

Remembering that our goal is to find an expression for \( \frac{dy}{dx} \) so that we can determine the slope of a particular tangent line, we want to solve the preceding equation for \( \frac{dy}{dx} \). To do so, we get all of the terms involving \( \frac{dy}{dx} \) on one side of the equation and then factor. Expanding and then subtracting \( 3x^2 - 2y \) from both sides, it follows that

\[ 2y \frac{dy}{dx} - 2x \frac{dy}{dx} = 2y - 3x^2. \tag{2.7.15} \]

Factoring the left side to isolate \( \frac{dy}{dx} \), we have

\[ (2y - 2x) \frac{dy}{dx} = 2y - 3x^2. \tag{2.7.16} \]

Finally, we divide both sides by (2y − 2x) and conclude that

\[ \frac{dy}{dx} = \frac{2y - 3x^2}{2y - 2x}. \tag{2.7.17} \]

Here again, the expression for \( \frac{dy}{dx} \) depends on both x and y. To find the slope of the tangent line at (−1, 1), we
substitute this point in the formula for \( \frac{dy}{dx} \), using the notation

\[ \left. \frac{dy}{dx} \right|_{(-1,1)} = \frac{2(1) - 3(-1)^2}{2(1) - 2(-1)} = -\frac{1}{4}. \tag{2.7.18} \]

This value matches our visual estimate of the slope of the tangent line shown in Figure 2.7.3.
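The slope found in Example 2.7.1 can be sanity-checked numerically. The sketch below is an added illustration, not the text's method: it relies on the observation that Equation 2.7.11 is a quadratic in y, so the branch through (−1, 1) can be written as y = x + √(x² − x³ + 2), and it compares a difference-quotient estimate of that branch's slope with Equation 2.7.17. The function names `dydx` and `upper_branch` are hypothetical.

```python
import math

def dydx(x, y):
    """Slope formula from Equation 2.7.17."""
    return (2 * y - 3 * x ** 2) / (2 * y - 2 * x)

def upper_branch(x):
    """Branch of x^3 + y^2 - 2xy = 2 through (-1, 1), from the quadratic formula in y."""
    return x + math.sqrt(x ** 2 - x ** 3 + 2)

x0, y0 = -1.0, 1.0
h = 1e-6

# Slope from implicit differentiation versus a symmetric difference quotient.
formula_slope = dydx(x0, y0)
numeric_slope = (upper_branch(x0 + h) - upper_branch(x0 - h)) / (2 * h)
print(formula_slope, numeric_slope)
```

Both printed values should be close to −1/4; the second is only an approximation because it comes from a finite difference.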

Example 2.7.1 shows that it is possible when differentiating implicitly to have multiple terms involving \( \frac{dy}{dx} \). Regardless of the particular curve involved, our approach will be similar each time. After differentiating, we expand so that each side of the equation is a sum of terms, some of which involve \( \frac{dy}{dx} \). Next, addition and subtraction are used to get all terms involving \( \frac{dy}{dx} \) on one side of the equation, with all remaining terms on the other. Finally, we factor to get a single instance of \( \frac{dy}{dx} \), and then divide to solve for \( \frac{dy}{dx} \).
Note, too, that since \( \frac{dy}{dx} \) is often a function of both x and y, we use the notation

\[ \left. \frac{dy}{dx} \right|_{(a,b)} \tag{2.7.19} \]

to denote the evaluation of \( \frac{dy}{dx} \) at the point (a, b). This is analogous to writing \( f'(a) \) when f depends on a single variable.

Finally, there is a big difference between writing \( \frac{d}{dx} \) and \( \frac{dy}{dx} \). For example,

\[ \frac{d}{dx}\left[ x^2 + y^2 \right] \tag{2.7.20} \]

gives an instruction to take the derivative with respect to x of the quantity \( x^2 + y^2 \), presumably where y is a function of x.
On the other hand,

\[ \frac{dy}{dx}\left( x^2 + y^2 \right) \tag{2.7.21} \]

means the product of the derivative of y with respect to x with the quantity \( x^2 + y^2 \). Understanding this notational
subtlety is essential.

The following activities present opportunities to explore several different problems involving implicit
differentiation.

Activity 2.7.1

Consider the curve defined by the equation \( x(y) = y^5 - 5y^3 + 4y \), whose graph is pictured in Figure 2.7.4.
a. Explain why it is not possible to express y as an explicit function of x.
b. Use implicit differentiation to find a formula for \( \frac{dy}{dx} \).
c. Use your result from part (b) to find an equation of the line tangent to the graph of x(y) at the point (0, 1).
d. Use your result from part (b) to determine all of the points at which the graph of x(y) has a vertical tangent line.

Figure 2.7.4: The curve \( x = y^5 - 5y^3 + 4y \).

Two natural questions to ask about any curve involve where the tangent line can be vertical or horizontal. To be horizontal,
the slope of the tangent line must be zero, while to be vertical, the slope must be undefined. It is typically the case when
differentiating implicitly that the formula for \( \frac{dy}{dx} \) is expressed as a quotient of functions of x and y, say

\[ \frac{dy}{dx} = \frac{p(x, y)}{q(x, y)}. \tag{2.7.22} \]

Thus, we observe that the tangent line will be horizontal precisely when the numerator is zero and the denominator is
nonzero, making the slope of the tangent line zero. Similarly, the tangent line will be vertical whenever q(x, y) = 0 and
p(x, y) ≠ 0, making the slope undefined. If both x and y are involved in an equation such as p(x, y) = 0, we try to solve

for one of them in terms of the other, and then use the resulting condition in the original equation that defines the curve to
find an equation in a single variable that we can solve to determine the point(s) that lie on the curve at which the condition
holds. It is not always possible to execute the desired algebra due to the possibly complicated combinations of functions
that often arise.
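To make this procedure concrete, here is a minimal sketch applied to the curve from Example 2.7.1, where p(x, y) = 2y − 3x² and q(x, y) = 2y − 2x. It is an added illustration under stated assumptions: setting p = 0 gives y = 3x²/2, substituting into x³ + y² − 2xy = 2 and multiplying by 4 gives 9x⁴ − 8x³ − 8 = 0, and the two bracketing intervals below were chosen by inspecting sign changes. The helper names (`bisect`, `single_variable`, and so on) are hypothetical.

```python
def numerator(x, y):
    return 2 * y - 3 * x ** 2            # p(x, y) from Equation 2.7.17

def denominator(x, y):
    return 2 * y - 2 * x                 # q(x, y) from Equation 2.7.17

def curve(x, y):
    return x ** 3 + y ** 2 - 2 * x * y - 2   # equals zero exactly on the curve

def single_variable(x):
    # Curve equation after substituting y = 3x^2/2 (from p = 0), multiplied by 4.
    return 9 * x ** 4 - 8 * x ** 3 - 8

def bisect(func, lo, hi, tol=1e-12):
    """Bisection root finder; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid        # same sign as left end: root is to the right
        else:
            hi = mid                     # sign change: root is to the left
    return (lo + hi) / 2

# Intervals on which single_variable changes sign (found by inspection).
horizontal_points = []
for lo, hi in [(-1.0, 0.0), (1.0, 2.0)]:
    x = bisect(single_variable, lo, hi)
    horizontal_points.append((x, 1.5 * x ** 2))

for x, y in horizontal_points:
    # Each point should lie on the curve with p = 0 and q nonzero.
    print(x, y, curve(x, y), numerator(x, y), denominator(x, y))
```

For each point printed, the curve residual and the numerator should be essentially zero while the denominator is not, which is the condition for a horizontal tangent line. The same pattern, with the roles of p and q exchanged, would locate vertical tangent lines.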

Activity 2.7.2

Consider the curve defined by the equation

\[ y(y^2 - 1)(y - 2) = x(x - 1)(x - 2), \tag{2.7.23} \]

whose graph is pictured in Figure 2.7.5. Through implicit differentiation, it can be shown that

\[ \frac{dy}{dx} = \frac{(x - 1)(x - 2) + x(x - 2) + x(x - 1)}{(y^2 - 1)(y - 2) + 2y^2(y - 2) + y(y^2 - 1)}. \tag{2.7.24} \]

Use this fact to answer each of the following questions.
a. Determine all points (x, y) at which the tangent line to the curve is horizontal. (Use technology appropriately to
find the needed zeros of the relevant polynomial function.)
b. Determine all points (x, y) at which the tangent line is vertical. (Use technology appropriately to find the needed
zeros of the relevant polynomial function.)
c. Find the equation of the tangent line to the curve at one of the points where x = 1 .

Figure 2.7.5: The curve \( y(y^2 - 1)(y - 2) = x(x - 1)(x - 2) \).

The closing activity in this section offers more opportunities to practice implicit differentiation.

Activity 2.7.3

For each of the following curves, use implicit differentiation to find dy/dx and determine the equation of the tangent
line at the given point.
a. \( x^3 - y^3 = 6xy \) at (−3, 3)
b. \( \sin(y) + y = x^3 + x \) at (0, 0)
c. \( 3x e^{-xy} = y^2 \) at (0.619061, 1)

Summary
In this section, we encountered the following important ideas:
When we have an equation involving x and y where y cannot be solved for explicitly in terms of x, but where portions
of the curve can be thought of as being generated by explicit functions of x, we say that y is an implicit function of x.
A good example of such a curve is the unit circle.
In the process of implicit differentiation, we take the equation that generates an implicitly given curve and differentiate both sides with respect to x while treating y as a function of x. In so doing, the chain rule leads \( \frac{dy}{dx} \) to arise, and then we may subsequently solve for \( \frac{dy}{dx} \) using algebra.
While \( \frac{dy}{dx} \) may now involve both the variables x and y, \( \frac{dy}{dx} \) still measures the slope of the tangent line to the curve, and thus this derivative may be used to decide when the tangent line is horizontal (\( \frac{dy}{dx} = 0 \)) or vertical (\( \frac{dy}{dx} \) is undefined), or
to find the equation of the tangent line at a particular point on the curve.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

2.8: Using Derivatives to Evaluate Limits
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How can derivatives be used to help us evaluate indeterminate limits of the form \( \frac{0}{0} \)?
What does it mean to say that \( \lim_{x \to \infty} f(x) = L \) and \( \lim_{x \to a} f(x) = \infty \)?
How can derivatives assist us in evaluating indeterminate limits of the form \( \frac{\infty}{\infty} \)?

Because differential calculus is based on the definition of the derivative, and the definition of the derivative involves a limit, there is a sense in which all of calculus rests on limits. In addition, the limit involved in the limit definition of the derivative is one that always generates an indeterminate form of \( \frac{0}{0} \). If f is a differentiable function for which \( f'(x) \) exists, then when we consider

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \tag{2.8.1} \]

it follows that not only does h → 0 in the denominator, but also (f(x + h) − f(x)) → 0 in the numerator, since f is continuous. Thus, the fundamental form of the limit involved in the definition of \( f'(x) \) is \( \frac{0}{0} \). Remember, saying a limit has an indeterminate form only means that we don’t yet know its value and have more work to do: indeed, limits of the form \( \frac{0}{0} \) can take on any value, as is evidenced by evaluating \( f'(x) \) for varying values of x for a function such as \( f(x) = x^2 \).

Of course, we have learned many different techniques for evaluating the limits that result from the derivative definition,
including a large number of shortcut rules that enable us to evaluate these limits quickly and easily. In this section, we
turn the situation upside-down: rather than using limits to evaluate derivatives, we explore how to use derivatives to
evaluate certain limits. This topic will combine several different ideas, including limits, derivative shortcuts, local linearity,
and the tangent line approximation.

Preview Activity 2.8.1

Let h be the function given by

\[ h(x) = \frac{x^5 + x - 2}{x^2 - 1}. \tag{2.8.2} \]

a. What is the domain of h?
b. Explain why

\[ \lim_{x \to 1} h(x) \tag{2.8.3} \]

results in an indeterminate form.
c. Next we will investigate the behavior of both the numerator and denominator of h near the point where x = 1. Let \( f(x) = x^5 + x - 2 \) and \( g(x) = x^2 - 1 \). Find the local linearizations of f and g at a = 1, and call these functions \( L_f(x) \) and \( L_g(x) \), respectively.
d. Explain why \( h(x) \approx \frac{L_f(x)}{L_g(x)} \) for x near a = 1.
e. Using your work from (c) and (d), evaluate

\[ \lim_{x \to 1} \frac{L_f(x)}{L_g(x)}. \tag{2.8.4} \]

What do you think your result tells us about \( \lim_{x \to 1} h(x) \)?
f. Investigate the function h(x) graphically and numerically near x = 1. What do you think is the value of \( \lim_{x \to 1} h(x) \)?

Using Derivatives to Evaluate Indeterminate Limits of the Form \( \frac{0}{0} \)
The fundamental idea of Preview Activity 2.8.1 (that we can evaluate an indeterminate limit of the form \( \frac{0}{0} \) by replacing each of the numerator and denominator with their local linearizations at the point of interest) can be generalized in a way that enables us to easily evaluate a wide range of limits. We begin by assuming that we have a function h(x) that can be written in the form \( h(x) = \frac{f(x)}{g(x)} \), where f and g are both differentiable at x = a and for which f(a) = g(a) = 0. We are
interested in finding a way to evaluate the indeterminate limit given by \( \lim_{x \to a} h(x) \).

Figure 2.8.1 : At left, the graphs of f and g near the value a, along with their tangent line approximations L_f and L_g at
x = a. At right, zooming in on the point a and the four graphs.
In Figure 2.8.1, we see a visual representation of the situation involving such functions f and g. In particular, we see that both f and g have an x-intercept at the point where x = a. In addition, since each function is differentiable, each is locally linear, and we can find their respective tangent line approximations \( L_f \) and \( L_g \) at x = a, which are also shown in the figure. Since we are interested in the limit of \( \frac{f(x)}{g(x)} \) as x → a, the individual behaviors of f(x) and g(x) as x → a are key to understand. Here, we take advantage of the fact that each function and its tangent line approximation become indistinguishable as x → a. First, let’s recall that

\[ L_f(x) = f'(a)(x - a) + f(a) \quad \text{and} \quad L_g(x) = g'(a)(x - a) + g(a). \tag{2.8.5} \]

The critical observation we make is that when taking the limit, because x is getting arbitrarily close to a, we can replace f
with \( L_f \) and replace g with \( L_g \), and thus we observe that

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{L_f(x)}{L_g(x)} = \lim_{x \to a} \frac{f'(a)(x - a) + f(a)}{g'(a)(x - a) + g(a)}. \tag{2.8.6} \]

Next, we remember a key fundamental assumption: that both f(a) = 0 and g(a) = 0, as this is precisely what makes the
original limit indeterminate. Substituting these values for f(a) and g(a) in the limit above, we now have

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(a)(x - a)}{g'(a)(x - a)} = \lim_{x \to a} \frac{f'(a)}{g'(a)}, \tag{2.8.7} \]

where the latter equality holds since x is approaching (but not equal to) a, so \( \frac{x - a}{x - a} = 1 \). Finally, we note that \( \frac{f'(a)}{g'(a)} \) is
constant with respect to x, and thus

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}. \tag{2.8.8} \]

We have, of course, implicitly made the assumption that \( g'(a) \neq 0 \), which is essential to the overall limit having the value \( \frac{f'(a)}{g'(a)} \). We summarize our work above with the statement of L’Hopital’s Rule, which is the formal name of the result
we have shown.

L’Hopital’s Rule
Let f and g be differentiable at x = a, and suppose that f(a) = g(a) = 0 and that \( g'(a) \neq 0 \). Then

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}. \tag{2.8.9} \]
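As a brief illustration of the boxed rule, consider a standard example that is not taken from the text's activities: the choice f(x) = sin(x), g(x) = x, and a = 0 is an assumption made here only for demonstration. Both functions are differentiable at 0, f(0) = g(0) = 0, and g'(0) = 1 ≠ 0, so the hypotheses hold.

```latex
% Hypotheses: f(0) = \sin(0) = 0, \; g(0) = 0, \; g'(0) = 1 \neq 0.
\[ \lim_{x \to 0} \frac{\sin(x)}{x} = \frac{f'(0)}{g'(0)} = \frac{\cos(0)}{1} = 1. \]
```

Graphically, this says that near x = 0 the sine curve and the line y = x have the same slope, so their ratio tends to 1.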

In practice, we typically work with a slightly more general version of L’Hopital’s Rule, which states that (under the identical assumptions as the boxed rule above and the extra assumption that \( g' \) is continuous at x = a)

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}, \]

provided the righthand limit exists.
This form reflects the fundamental benefit of L’Hopital’s Rule: if \( \frac{f(x)}{g(x)} \) produces an indeterminate limit of form \( \frac{0}{0} \) as x → a, it is equivalent to consider the limit of the quotient of the two functions’ derivatives, \( \frac{f'(x)}{g'(x)} \). For example, if we consider the limit from Preview Activity 2.8.1, \( \lim_{x \to 1} \frac{x^5 + x - 2}{x^2 - 1} \), by L’Hopital’s Rule we have that

\[ \lim_{x \to 1} \frac{x^5 + x - 2}{x^2 - 1} = \lim_{x \to 1} \frac{5x^4 + 1}{2x} = \frac{6}{2} = 3. \tag{2.8.10} \]
By being able to replace the numerator and denominator with their respective derivatives, we often move from an
indeterminate limit to one whose value we can easily determine.

Activity 2.8.2

Evaluate each of the following limits. If you use L’Hopital’s Rule, indicate where it was used, and be certain its
hypotheses are met before you apply it.
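a. \( \lim_{x \to 0} \frac{\ln(1 + x)}{x} \)
b. \( \lim_{x \to \pi} \frac{\cos(x)}{x} \)
c. \( \lim_{x \to 1} \frac{2 \ln(x)}{1 - e^{x - 1}} \)
d. \( \lim_{x \to 0} \frac{\sin(x) - x}{\cos(2x) - 1} \)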

While L’Hopital’s Rule can be applied in an entirely algebraic way, it is important to remember that the genesis of the rule is graphical: the main idea is that the slopes of the tangent lines to f and g at x = a determine the value of the limit of \( \frac{f(x)}{g(x)} \) as x → a. We see this in Figure 2.8.2, which is a modified version of Figure 2.8.1, where we can see from the grid that \( f'(a) = 2 \) and \( g'(a) = -1 \), hence by L’Hopital’s Rule,

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)} = \frac{2}{-1} = -2. \]

Indeed, what we observe is that it’s not the fact that f and g both approach zero that matters most, but rather the rate at which each approaches zero that determines the value of the limit. This is a good way to remember what L’Hopital’s Rule says: if f(a) = g(a) = 0, then the limit of \( \frac{f(x)}{g(x)} \) as x → a is given by the ratio of the slopes of f and g at x = a.

Figure 2.8.2 : Two functions f and g that satisfy L’Hopital’s Rule.

Activity 2.8.3

In this activity, we reason graphically from the following figure to evaluate limits of ratios of functions about which
some information is known.

Figure 2.8.3 : Three graphs referenced in the questions of Activity 2.8.3.
a. Use the left-hand graph to determine the values of f(2), \( f'(2) \), g(2), and \( g'(2) \). Then, evaluate \( \lim_{x \to 2} \frac{f(x)}{g(x)} \).
b. Use the middle graph to find p(2), \( p'(2) \), q(2), and \( q'(2) \). Then, determine the value of \( \lim_{x \to 2} \frac{p(x)}{q(x)} \).
c. Use the right-hand graph to compute r(2), \( r'(2) \), s(2), \( s'(2) \). Explain why you cannot determine the exact value of \( \lim_{x \to 2} \frac{r(x)}{s(x)} \) without further information being provided, but that you can determine the sign of \( \lim_{x \to 2} \frac{r(x)}{s(x)} \). In addition, state what the sign of the limit will be, with justification.

Limits Involving ∞
The concept of infinity, denoted ∞, arises naturally in calculus, like it does in much of mathematics. It is important to note from the outset that ∞ is a concept, but not a number itself. Indeed, the notion of ∞ naturally invokes the idea of limits. Consider, for example, the function \( f(x) = \frac{1}{x} \), whose graph is pictured in Figure 2.8.4. We note that x = 0 is not in the domain of f, so we may naturally wonder what happens as x → 0. As \( x \to 0^+ \), we observe that f(x) increases without bound. That is, we can make the value of f(x) as large as we like by taking x closer and closer (but not equal) to 0, while keeping x > 0. This is a good way to think about what infinity represents: a quantity is tending to infinity if there is no single number that the quantity is always less than.

Figure 2.8.4: The graph of \( f(x) = \frac{1}{x} \).

Recall that when we write \( \lim_{x \to a} f(x) = L \), this means that we can make f(x) as close to L as we’d like by taking x sufficiently close (but not equal) to a. We thus expand this notation and language to include the possibility that either L or a can be ∞. For instance, for \( f(x) = \frac{1}{x} \), we now write \( \lim_{x \to 0^+} \frac{1}{x} = \infty \), by which we mean that we can make \( \frac{1}{x} \) as large as we like by taking x sufficiently close (but not equal) to 0 while keeping x > 0. In a similar way, we naturally write \( \lim_{x \to \infty} \frac{1}{x} = 0 \), since we can make \( \frac{1}{x} \) as close to 0 as we’d like by taking x sufficiently large (i.e., by letting x increase without bound).

In general, we understand the notation \( \lim_{x \to a} f(x) = \infty \) to mean that we can make f(x) as large as we’d like by taking x sufficiently close (but not equal) to a, and the notation \( \lim_{x \to \infty} f(x) = L \) to mean that we can make f(x) as close to L as we’d like by taking x sufficiently large. This notation applies to left- and right-hand limits, plus we can also use limits involving −∞. For example, returning to Figure 2.8.4 and \( f(x) = \frac{1}{x} \), we can say that \( \lim_{x \to 0^-} \frac{1}{x} = -\infty \) and \( \lim_{x \to -\infty} \frac{1}{x} = 0 \). Finally, we write \( \lim_{x \to \infty} f(x) = \infty \) when we can make the value of f(x) as large as we’d like by taking x sufficiently large. For example, \( \lim_{x \to \infty} x^2 = \infty \).

Note particularly that limits involving infinity identify vertical and horizontal asymptotes of a function. If \( \lim_{x \to a} f(x) = \infty \), then x = a is a vertical asymptote of f, while if \( \lim_{x \to \infty} f(x) = L \), then y = L is a horizontal asymptote of f. Similar statements can be made using −∞, as well as with left- and right-hand limits as \( x \to a^- \) or \( x \to a^+ \).

In precalculus classes, it is common to study the end behavior of certain families of functions, by which we mean the behavior of a function as x → ∞ and as x → −∞. Here we briefly examine a library of some familiar functions and note the values of several limits involving ∞. For the natural exponential function \( e^x \), we note that \( \lim_{x \to \infty} e^x = \infty \) and \( \lim_{x \to -\infty} e^x = 0 \), while for the related exponential decay function \( e^{-x} \), observe that these limits are reversed, with \( \lim_{x \to \infty} e^{-x} = 0 \) and \( \lim_{x \to -\infty} e^{-x} = \infty \). Turning to the natural logarithm function, we have \( \lim_{x \to 0^+} \ln(x) = -\infty \) and \( \lim_{x \to \infty} \ln(x) = \infty \). While both \( e^x \) and ln(x) grow without bound as x → ∞, the exponential function does so much more quickly than the logarithm function does. We’ll soon use limits to quantify what we mean by “quickly.”

Figure 2.8.5: Graphs of some familiar functions whose end behavior as x → ±∞ is known. In the middle graph, \( f(x) = x^3 - 16x \) and \( g(x) = x^4 - 16x^2 - 8 \).
For polynomial functions of the form \( p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \), the end behavior depends on the sign of \( a_n \) and whether the highest power n is even or odd. If n is even and \( a_n \) is positive, then \( \lim_{x \to \infty} p(x) = \infty \) and \( \lim_{x \to -\infty} p(x) = \infty \), as in the plot of g in Figure 2.8.5. If instead \( a_n \) is negative, then \( \lim_{x \to \infty} p(x) = -\infty \) and \( \lim_{x \to -\infty} p(x) = -\infty \). In the situation where n is odd, then either \( \lim_{x \to \infty} p(x) = \infty \) and \( \lim_{x \to -\infty} p(x) = -\infty \) (which occurs when \( a_n \) is positive, as in the graph of f in Figure 2.8.5), or \( \lim_{x \to \infty} p(x) = -\infty \) and \( \lim_{x \to -\infty} p(x) = \infty \) (when \( a_n \) is negative).

A function can fail to have a limit as x → ∞. For example, consider the plot of the sine function at right in Figure 2.8.5. Because the function continues oscillating between −1 and 1 as x → ∞, we say that \( \lim_{x \to \infty} \sin(x) \) does not exist.

Finally, it is straightforward to analyze the behavior of any rational function as x → ∞. Consider, for example, the function \( q(x) = \frac{3x^2 - 4x + 5}{7x^2 + 9x - 10} \). Note that both \( (3x^2 - 4x + 5) \to \infty \) as x → ∞ and \( (7x^2 + 9x - 10) \to \infty \) as x → ∞. Here we say that \( \lim_{x \to \infty} q(x) \) has indeterminate form \( \frac{\infty}{\infty} \), much like we did when we encountered limits of the form \( \frac{0}{0} \). We can determine the value of this limit through a standard algebraic approach. Multiplying the numerator and denominator each by \( \frac{1}{x^2} \), we find that

\[ \lim_{x \to \infty} q(x) = \lim_{x \to \infty} \frac{3x^2 - 4x + 5}{7x^2 + 9x - 10} \cdot \frac{\frac{1}{x^2}}{\frac{1}{x^2}} = \lim_{x \to \infty} \frac{3 - 4\frac{1}{x} + 5\frac{1}{x^2}}{7 + 9\frac{1}{x} - 10\frac{1}{x^2}} = \frac{3}{7} \tag{2.8.11} \]

since \( \frac{1}{x^2} \to 0 \) and \( \frac{1}{x} \to 0 \) as x → ∞. This shows that the rational function q has a horizontal asymptote at \( y = \frac{3}{7} \). A
similar approach can be used to determine the limit of any rational function as x → ∞.
But how should we handle a limit such as

\[ \lim_{x \to \infty} \frac{x^2}{e^x}? \tag{2.8.12} \]

Here, both \( x^2 \to \infty \) and \( e^x \to \infty \), but there is not an obvious algebraic approach that enables us to find the limit’s
value. Fortunately, it turns out that L’Hopital’s Rule extends to cases involving infinity.

L’Hopital’s Rule (∞)
If f and g are differentiable and both approach zero or both approach ±∞ as x → a (where a is allowed to be ∞), then

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}. \tag{2.8.13} \]

To be technically correct, we need the additional hypothesis that \( g'(x) \neq 0 \) on an open interval that contains a or in
every neighborhood of infinity if a is ∞; this is almost always met in practice.

To evaluate the limit in Equation 2.8.12, we observe that we can apply L’Hopital’s Rule, since both \(x^2 \to \infty\) and
\(e^x \to \infty\). Doing so, it follows that
\[
\lim_{x\to\infty} \frac{x^2}{e^x} = \lim_{x\to\infty} \frac{2x}{e^x}. \tag{2.8.14}
\]
This updated limit is still indeterminate and of the form \(\frac{\infty}{\infty}\), but it is simpler since \(2x\) has replaced \(x^2\). Hence, we can
apply L’Hopital’s Rule again, by which we find that
\[
\lim_{x\to\infty} \frac{x^2}{e^x} = \lim_{x\to\infty} \frac{2x}{e^x} = \lim_{x\to\infty} \frac{2}{e^x}. \tag{2.8.15}
\]
Now, since 2 is constant and \(e^x \to \infty\) as \(x \to \infty\), it follows that \(\frac{2}{e^x} \to 0\) as \(x \to \infty\), which shows that \(\lim_{x\to\infty} \frac{x^2}{e^x} = 0\).
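The conclusion that this quotient tends to 0 can also be observed numerically. The following is a minimal sketch using only Python's standard library; the function name `ratio` is chosen here for illustration and does not come from the text.

```python
import math

def ratio(x: float) -> float:
    """The quotient x^2 / e^x from Equation 2.8.12."""
    return x**2 / math.exp(x)

# Sample the quotient at a few increasingly large inputs.
for x in (1.0, 10.0, 50.0, 100.0):
    print(f"x = {x:>5.0f}   x^2/e^x = {ratio(x):.3e}")
```

The sampled values shrink rapidly, which is consistent with (though not a proof of) the limit obtained from two applications of L’Hopital’s Rule.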

Activity 2.8.4

Evaluate each of the following limits. If you use L’Hopital’s Rule, indicate where it was used, and be certain its
hypotheses are met before you apply it.
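a. \(\lim_{x\to\infty} \frac{x}{\ln(x)}\)

b. \(\lim_{x\to\infty} \frac{e^x + x}{2e^x + x^2}\)

c. \(\lim_{x\to 0^+} \frac{\ln(x)}{\frac{1}{x}}\)

d. \(\lim_{x\to\frac{\pi}{2}^-} \frac{\tan(x)}{x - \frac{\pi}{2}}\)

e. \(\lim_{x\to\infty} x e^{-x}\)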

When we are considering the limit of a quotient of two functions \(\frac{f(x)}{g(x)}\) that results in an indeterminate form of \(\frac{\infty}{\infty}\), in
essence we are asking which function is growing faster without bound. We say that the function g dominates the function f
as \(x \to \infty\) provided that
\[
\lim_{x\to\infty} \frac{f(x)}{g(x)} = 0, \tag{2.8.16}
\]
whereas f dominates g provided that \(\lim_{x\to\infty} \frac{f(x)}{g(x)} = \infty\). Finally, if the value of
\[
\lim_{x\to\infty} \frac{f(x)}{g(x)} \tag{2.8.17}
\]
is finite and nonzero, we say that f and g grow at the same rate. For example, from earlier work we know that
\(\lim_{x\to\infty} \frac{x^2}{e^x} = 0\), so \(e^x\) dominates \(x^2\), while
\[
\lim_{x\to\infty} \frac{3x^2 - 4x + 5}{7x^2 + 9x - 10} = \frac{3}{7}, \tag{2.8.18}
\]
so
\[
f(x) = 3x^2 - 4x + 5 \tag{2.8.19}
\]
and
\[
g(x) = 7x^2 + 9x - 10 \tag{2.8.20}
\]
grow at the same rate.
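One informal way to explore dominance is to tabulate the quotient f(x)/g(x) at a few large inputs. The sketch below does this for the two examples just discussed; the helper `ratios` and the sample points are assumptions made for illustration, and a handful of samples can suggest a trend but cannot establish a limit.

```python
import math
from typing import Callable

def ratios(f: Callable[[float], float],
           g: Callable[[float], float],
           xs: list[float]) -> list[float]:
    """Return f(x)/g(x) at each sample point in xs."""
    return [f(x) / g(x) for x in xs]

xs = [10.0, 100.0, 500.0]

# x^2 versus e^x: the ratios shrink toward 0, consistent with e^x dominating x^2.
print(ratios(lambda x: x**2, math.exp, xs))

# The quadratics from Equations 2.8.19 and 2.8.20: the ratios level off near 3/7.
f = lambda x: 3 * x**2 - 4 * x + 5
g = lambda x: 7 * x**2 + 9 * x - 10
print(ratios(f, g, xs))
```

Ratios heading toward 0 point to g dominating f, ratios growing without bound point to f dominating g, and ratios settling at a finite nonzero value point to growth at the same rate.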

Summary
In this section, we encountered the following important ideas:
Derivatives can be used to help us evaluate indeterminate limits of the form \(\frac{0}{0}\) through L’Hopital’s Rule, which is
developed by replacing the functions in the numerator and denominator with their tangent line approximations. In
particular, if \(f(a) = g(a) = 0\) and f and g are differentiable at a, L’Hopital’s Rule tells us that \(\lim_{x\to a} \frac{f(x)}{g(x)} =
\lim_{x\to a} \frac{f'(x)}{g'(x)}\).
When we write \(x \to \infty\), this means that x is increasing without bound. We thus use ∞ along with limit notation to write
\(\lim_{x\to\infty} f(x) = L\), which means we can make \(f(x)\) as close to L as we like by choosing x to be sufficiently large,
and similarly \(\lim_{x\to a} f(x) = \infty\), which means we can make \(f(x)\) as large as we like by choosing x sufficiently
close to a.
A version of L’Hopital’s Rule also allows us to use derivatives to assist us in evaluating indeterminate limits of the
form \(\frac{\infty}{\infty}\). In particular, if f and g are differentiable and both approach zero or both approach ±∞ as \(x \to a\) (where
a is allowed to be ∞), then \(\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)}\).

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

2.E: Computing Derivatives (Exercises)
2.1: Elementary Derivative Rules
1. Let f and g be differentiable functions for which the following information is known: \(f(2) = 5\), \(g(2) = -3\), \(f'(2) = -1/2\),
\(g'(2) = 2\).
(a) Let h be the new function defined by the rule \(h(x) = 3f(x) - 4g(x)\). Determine \(h(2)\) and \(h'(2)\).
(b) Find an equation for the tangent line to \(y = h(x)\) at the point \((2, h(2))\).
(c) Let p be the function defined by the rule \(p(x) = -2f(x) + \frac{1}{2}g(x)\). Is p increasing, decreasing, or neither at a = 2? Why?
(d) Estimate the value of \(p(2.03)\) by using the local linearization of p at the point \((2, p(2))\).
2. Let functions p and q be the piecewise linear functions given by their respective graphs in Figure 2.1. Use the graphs to
answer the following questions.

Figure 2.1: The graphs of p (in blue) and q (in green).

(a) At what values of x is p not differentiable? At what values of x is q not differentiable? Why?
(b) Let \(r(x) = p(x) + 2q(x)\). At what values of x is r not differentiable? Why?
(c) Determine \(r'(-2)\) and \(r'(0)\).
(d) Find an equation for the tangent line to \(y = r(x)\) at the point \((2, r(2))\).
3. Consider the functions \(r(t) = t^t\) and \(s(t) = \arccos(t)\), for which you are given the facts that \(r'(t) = t^t(\ln(t) + 1)\) and
\(s'(t) = -\frac{1}{\sqrt{1 - t^2}}\). Do not be concerned with where these derivative formulas come from. We restrict our interest in both
functions to the domain 0 < t < 1.
(a) Let \(w(t) = 3t^t - 2\arccos(t)\). Determine \(w'(t)\).
(b) Find an equation for the tangent line to \(y = w(t)\) at the point \((\frac{1}{2}, w(\frac{1}{2}))\).
(c) Let \(v(t) = t^t + \arccos(t)\). Is v increasing or decreasing at the instant \(t = \frac{1}{2}\)? Why?
4. Let \(f(x) = a^x\). The goal of this problem is to explore how the value of a affects the derivative of \(f(x)\), without assuming
we know the rule for \(\frac{d}{dx}[a^x]\) that we have stated and used in earlier work in this section.
(a) Use the limit definition of the derivative to show that \(f'(x) = \lim_{h\to 0} \frac{a^x \cdot a^h - a^x}{h}\).
(b) Explain why it is also true that \(f'(x) = a^x \cdot \lim_{h\to 0} \frac{a^h - 1}{h}\).
(c) Use computing technology and small values of h to estimate the value of \(L = \lim_{h\to 0} \frac{a^h - 1}{h}\) when a = 2. Do
likewise when a = 3.
(d) Note that it would be ideal if the value of the limit L was 1, for then f would be a particularly special function: its
derivative would be simply \(a^x\), which would mean that its derivative is itself. By experimenting with different values of a
between 2 and 3, try to find a value for a for which \(L = \lim_{h\to 0} \frac{a^h - 1}{h} = 1\).
(e) Compute \(\ln(2)\) and \(\ln(3)\). What does your work in (b) and (c) suggest is true about \(\frac{d}{dx}[2^x]\) and \(\frac{d}{dx}[3^x]\)?
(f) How do your investigations in (d) lead to a particularly important fact about the function \(f(x) = e^x\)?

2.2: The Sine and Cosine Functions
1. Suppose that \(V(t) = 24 \cdot 1.07^t + 6\sin(t)\) represents the value of a person’s investment portfolio in thousands of dollars in
year t, where t = 0 corresponds to January 1, 2010.
(a) At what instantaneous rate is the portfolio’s value changing on January 1, 2012? Include units on your answer.
(b) Determine the value of \(V''(2)\). What are the units on this quantity and what does it tell you about how the portfolio’s
value is changing?
(c) On the interval 0 ≤ t ≤ 20, graph the function \(V(t) = 24 \cdot 1.07^t + 6\sin(t)\) and describe its behavior in the context of the
problem. Then, compare the graphs of the functions \(A(t) = 24 \cdot 1.07^t\) and \(V(t) = 24 \cdot 1.07^t + 6\sin(t)\), as well as the graphs
of their derivatives \(A'(t)\) and \(V'(t)\). What is the impact of the term \(6\sin(t)\) on the behavior of the function \(V(t)\)?
2. Let \(f(x) = 3\cos(x) - 2\sin(x) + 6\).
(a) Determine the exact slope of the tangent line to \(y = f(x)\) at the point where \(a = \frac{\pi}{4}\).
(b) Determine the tangent line approximation to \(y = f(x)\) at the point where \(a = \pi\).
(c) At the point where \(a = \frac{\pi}{2}\), is f increasing, decreasing, or neither?
(d) At the point where \(a = \frac{3\pi}{2}\), does the tangent line to \(y = f(x)\) lie above the curve, below the curve, or neither? How can
you answer this question without even graphing the function or the tangent line?
3. In this exercise, we explore how the limit definition of the derivative more formally shows that \(\frac{d}{dx}[\sin(x)] = \cos(x)\).
Letting \(f(x) = \sin(x)\), note that the limit definition of the derivative tells us that \(f'(x) = \lim_{h\to 0} \frac{\sin(x + h) - \sin(x)}{h}\).
(a) Recall the trigonometric identity for the sine of a sum of angles α and β: \(\sin(\alpha + \beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)\). Use
this identity and some algebra to show that \(f'(x) = \lim_{h\to 0} \frac{\sin(x)(\cos(h) - 1) + \cos(x)\sin(h)}{h}\).
(b) Next, note that as h changes, x remains constant. Explain why it therefore makes sense to say that
\(f'(x) = \sin(x) \cdot \lim_{h\to 0} \frac{\cos(h) - 1}{h} + \cos(x) \cdot \lim_{h\to 0} \frac{\sin(h)}{h}\).
(c) Finally, use small values of h to estimate the values of the two limits in (b): \(\lim_{h\to 0} \frac{\cos(h) - 1}{h}\) and \(\lim_{h\to 0} \frac{\sin(h)}{h}\).
(d) What do your results in (c) thus tell you about \(f'(x)\)?
(e) By emulating the steps taken above, use the limit definition of the derivative to argue convincingly that \(\frac{d}{dx}[\cos(x)] =
-\sin(x)\).

2.3: The Product and Quotient Rules


1. Let f and g be differentiable functions for which the following information is known: \(f(2) = 5\), \(g(2) = -3\), \(f'(2) = -1/2\),
\(g'(2) = 2\).
(a) Let h be the new function defined by the rule \(h(x) = g(x) \cdot f(x)\). Determine \(h(2)\) and \(h'(2)\).
(b) Find an equation for the tangent line to \(y = h(x)\) at the point \((2, h(2))\) (where h is the function defined in (a)).
(c) Let r be the function defined by the rule \(r(x) = \frac{g(x)}{f(x)}\). Is r increasing, decreasing, or neither at a = 2? Why?
(d) Estimate the value of \(r(2.06)\) (where r is the function defined in (c)) by using the local linearization of r at the point
\((2, r(2))\).
2. Consider the functions \(r(t) = t^t\) and \(s(t) = \arccos(t)\), for which you are given the facts that \(r'(t) = t^t(\ln(t) + 1)\) and
\(s'(t) = -\frac{1}{\sqrt{1 - t^2}}\). Do not be concerned with where these derivative formulas come from. We restrict our interest in both
functions to the domain 0 < t < 1.
(a) Let \(w(t) = t^t \arccos(t)\). Determine \(w'(t)\).
(b) Find an equation for the tangent line to \(y = w(t)\) at the point \((\frac{1}{2}, w(\frac{1}{2}))\).
(c) Let \(v(t) = \frac{t^t}{\arccos(t)}\). Is v increasing or decreasing at the instant \(t = \frac{1}{2}\)? Why?
3. Let functions p and q be the piecewise linear functions given by their respective graphs in Figure 2.5. Use the graphs to
answer the following questions.

Figure 2.5: The graphs of p (in blue) and q (in green).

(a) Let \(r(x) = p(x) \cdot q(x)\). Determine \(r'(-2)\) and \(r'(0)\).
(b) Are there values of x for which \(r'(x)\) does not exist? If so, which values, and why?
(c) Find an equation for the tangent line to \(y = r(x)\) at the point \((2, r(2))\).
(d) Let \(z(x) = \frac{q(x)}{p(x)}\). Determine \(z'(0)\) and \(z'(2)\).
(e) Are there values of x for which \(z'(x)\) does not exist? If so, which values, and why?
4. A farmer with large land holdings has historically grown a wide variety of crops. With the price of ethanol fuel rising, he
decides that it would be prudent to devote more and more of his acreage to producing corn. As he grows more and more
corn, he learns efficiencies that increase his yield per acre. In the present year, he used 7000 acres of his land to grow
corn, and that land had an average yield of 170 bushels per acre. At the current time, he plans to increase his number of
acres devoted to growing corn at a rate of 600 acres/year, and he expects that right now his average yield is increasing at a
rate of 8 bushels per acre per year. Use this information to answer the following questions.
(a) Say that the present year is t = 0, that \(A(t)\) denotes the number of acres the farmer devotes to growing corn in year t,
\(Y(t)\) represents the average yield in year t (measured in bushels per acre), and \(C(t)\) is the total number of bushels of corn
the farmer produces. What is the formula for \(C(t)\) in terms of \(A(t)\) and \(Y(t)\)? Why?
(b) What is the value of \(C(0)\)? What does it measure?
(c) Write an expression for \(C'(t)\) in terms of \(A(t)\), \(A'(t)\), \(Y(t)\), and \(Y'(t)\). Explain your thinking.
(d) What is the value of \(C'(0)\)? What does it measure?
(e) Based on the given information and your work above, estimate the value of \(C(1)\).
5. Let \(f(v)\) be the gas consumption (in liters/km) of a car going at velocity v (in km/hour). In other words, \(f(v)\) tells you
how many liters of gas the car uses to go one kilometer if it is traveling at v kilometers per hour. In addition, suppose that
\(f(80) = 0.05\) and \(f'(80) = 0.0004\).
(a) Let \(g(v)\) be the distance the same car goes on one liter of gas at velocity v. What is the relationship between \(f(v)\) and
\(g(v)\)? Hence find \(g(80)\) and \(g'(80)\).
(b) Let \(h(v)\) be the gas consumption in liters per hour of a car going at velocity v. In other words, \(h(v)\) tells you how many
liters of gas the car uses in one hour if it is going at velocity v. What is the algebraic relationship between \(h(v)\) and \(f(v)\)?
Hence find \(h(80)\) and \(h'(80)\).
(c) How would you explain the practical meaning of these function and derivative values to a driver who knows no
calculus? Include units on each of the function and derivative values you discuss in your response.

2.4: Derivatives of Other Trigonometric Functions


1. An object moving vertically has its height at time t (measured in feet, with time in seconds) given by the function
\(h(t) = 3 + \frac{2\cos(t)}{1.2^t}\).
(a) What is the object’s instantaneous velocity when t = 2?
(b) What is the object’s acceleration at the instant t = 2?
(c) Describe in everyday language the behavior of the object at the instant t = 2.
2. Let \(f(x) = \sin(x)\cot(x)\).
(a) Use the product rule to find \(f'(x)\).
(b) True or false: for all real numbers x, \(f(x) = \cos(x)\).
(c) Explain why the function that you found in (a) is almost the opposite of the sine function, but not quite. (Hint: convert
all of the trigonometric functions in (a) to sines and cosines, and work to simplify. Think carefully about the domain of f
and the domain of \(f'\).)
3. Let \(p(z)\) be given by the rule \(p(z) = \frac{z\tan(z)}{z^2\sec(z) + 1} + 3e^z + 1\).
(a) Determine \(p'(z)\).
(b) Find an equation for the tangent line to p at the point where z = 0.
(c) At z = 0, is p increasing, decreasing, or neither? Why?

2.5: The Chain Rule


1. Consider the basic functions \(f(x) = x^3\) and \(g(x) = \sin(x)\).
(a) Let \(h(x) = f(g(x))\). Find the exact instantaneous rate of change of h at the point where \(x = \frac{\pi}{4}\).
(b) Which function is changing most rapidly at x = 0.25: \(h(x) = f(g(x))\) or \(r(x) = g(f(x))\)? Why?
(c) Let \(h(x) = f(g(x))\) and \(r(x) = g(f(x))\). Which of these functions has a derivative that is periodic? Why?
2. Let \(u(x)\) be a differentiable function. For each of the following functions, determine the derivative. Each response will
involve u and/or \(u'\).
(a) \(p(x) = e^{u(x)}\)
(b) \(q(x) = u(e^x)\)
(c) \(r(x) = \cot(u(x))\)
(d) \(s(x) = u(\cot(x))\)
(e) \(a(x) = u(x^4)\)
(f) \(b(x) = u^4(x)\)
3. Let functions p and q be the piecewise linear functions given by their respective graphs in Figure 2.7. Use the graphs to
answer the following questions.
(a) Let \(C(x) = p(q(x))\). Determine \(C'(0)\) and \(C'(3)\).
(b) Find a value of x for which \(C'(x)\) does not exist. Explain your thinking.
(c) Let \(Y(x) = q(q(x))\) and \(Z(x) = q(p(x))\). Determine \(Y'(-2)\) and \(Z'(0)\).

Figure 2.7: The graphs of p (in blue) and q (in green).

4. If a spherical tank of radius 4 feet has h feet of water present in the tank, then the volume of water in the tank is given by
the formula \(V = \frac{\pi}{3} h^2 (12 - h)\).
(a) At what instantaneous rate is the volume of water in the tank changing with respect to the height of the water at the
instant h = 1? What are the units on this quantity?
(b) Now suppose that the height of water in the tank is being regulated by an inflow and outflow (e.g., a faucet and a drain)
so that the height of the water at time t is given by the rule \(h(t) = \sin(\pi t) + 1\), where t is measured in hours (and h is still
measured in feet). At what rate is the height of the water changing with respect to time at the instant t = 2?
(c) Continuing under the assumptions in (b), at what instantaneous rate is the volume of water in the tank changing with
respect to time at the instant t = 2?
(d) What are the main differences between the rates found in (a) and (c)? Include a discussion of the relevant units.
(d) What are the main differences between the rates found in (a) and (c)? Include a discussion of the relevant units.

2.6: Derivatives of Inverse Functions


1. Determine the derivative of each of the following functions. Use proper notation and clearly identify the derivative rules
you use.
(a) \(f(x) = \ln(2\arctan(x) + 3\arcsin(x) + 5)\)
(b) \(r(z) = \arctan(\ln(\arcsin(z)))\)
(c) \(q(t) = \arctan^2(3t) \arcsin^4(7t)\)
(d) \(g(v) = \ln\left(\frac{\arctan(v)}{\arcsin(v) + v^2}\right)\)
2. Consider the graph of \(y = f(x)\) provided in Figure 2.13 and use it to answer the following questions.

Figure 2.13: A function y = f(x) for use in Exercise 2.

(a) Use the provided graph to estimate the value of \(f'(1)\).
(b) Sketch an approximate graph of \(y = f^{-1}(x)\). Label at least three distinct points on the graph that correspond to three
points on the graph of f.
(c) Based on your work in (a), what is the value of \((f^{-1})'(-1)\)? Why?
3. Let \(f(x) = \frac{1}{4}x^3 + 4\).
(a) Sketch a graph of \(y = f(x)\) and explain why f is an invertible function.
(b) Let g be the inverse of f and determine a formula for g.
(c) Compute \(f'(x)\), \(g'(x)\), \(f'(2)\), and \(g'(6)\). What is the special relationship between \(f'(2)\) and \(g'(6)\)? Why?
4. Let \(h(x) = x + \sin(x)\).
(a) Sketch a graph of \(y = h(x)\) and explain why h must be invertible.
(b) Explain why it does not appear to be algebraically possible to determine a formula for \(h^{-1}\).
(c) Observe that the point \((\frac{\pi}{2}, \frac{\pi}{2} + 1)\) lies on the graph of \(y = h(x)\). Determine the value of \((h^{-1})'(\frac{\pi}{2} + 1)\).

2.7: Derivatives of Functions Given Implicitly


1. Consider the curve given by the equation \(2y^3 + y^2 - y^5 = x^4 - 2x^3 + x^2\). Find all points at which the tangent line to
the curve is horizontal or vertical. Be sure to use a graphing utility to plot this implicit curve and to visually check the
results of algebraic reasoning that you use to determine where the tangent lines are horizontal and vertical.
2. For the curve given by the equation \(\sin(x + y) + \cos(x - y) = 1\), find the equation of the tangent line to the curve at the
point \((\frac{\pi}{2}, \frac{\pi}{2})\).
3. Implicit differentiation gives us a different perspective from which to see why the rule \(\frac{d}{dx}[a^x] = a^x \ln(a)\) holds, if
we assume that \(\frac{d}{dx}[\ln(x)] = \frac{1}{x}\). This exercise leads you through the key steps to do so.
(a) Let \(y = a^x\). Rewrite this equation using the natural logarithm function to write x in terms of y (and the constant a).
(b) Differentiate both sides of the equation you found in (a) with respect to x, keeping in mind that y is implicitly a
function of x.
(c) Solve the equation you found in (b) for \(\frac{dy}{dx}\), and then use the definition of y to write \(\frac{dy}{dx}\) solely in terms of x. What
have you found?

2.8: Using Derivatives to Evaluate Limits


1. Let f and g be differentiable functions about which the following information is known: \(f(3) = g(3) = 0\), \(f'(3) = g'(3) = 0\),
\(f''(3) = -2\), and \(g''(3) = 1\). Let a new function h be given by the rule \(h(x) = \frac{f(x)}{g(x)}\). On the same set of axes,
sketch possible graphs of f and g near x = 3, and use the provided information to determine the value of \(\lim_{x\to 3} h(x)\).
Provide explanation to support your conclusion.
2. Find all vertical and horizontal asymptotes of the function \(R(x) = \frac{3(x - a)(x - b)}{5(x - a)(x - c)}\), where a, b, and c are
distinct, arbitrary constants. In addition, state all values of x for which R is not continuous. Sketch a possible graph of R,
clearly labeling the values of a, b, and c.
3. Consider the function \(g(x) = x^{2x}\), which is defined for all x > 0. Observe that \(\lim_{x\to 0^+} g(x)\) is indeterminate due to its
form of \(0^0\). (Think about how we know that \(0^k = 0\) for all k > 0, while \(b^0 = 1\) for all \(b \neq 0\), but that neither rule can apply
to \(0^0\).)
(a) Let \(h(x) = \ln(g(x))\). Explain why \(h(x) = 2x\ln(x)\).
(b) Next, explain why it is equivalent to write \(h(x) = \frac{2\ln(x)}{\frac{1}{x}}\).
(c) Use L’Hopital’s Rule and your work in (b) to compute \(\lim_{x\to 0^+} h(x)\).
(d) Based on the value of \(\lim_{x\to 0^+} h(x)\), determine \(\lim_{x\to 0^+} g(x)\).
4. Recall we say that function g dominates function f provided that \(\lim_{x\to\infty} f(x) = \infty\), \(\lim_{x\to\infty} g(x) = \infty\), and
\(\lim_{x\to\infty} \frac{f(x)}{g(x)} = 0\).
(a) Which function dominates the other: \(\ln(x)\) or \(\sqrt{x}\)?
(b) Which function dominates the other: \(\ln(x)\) or \(\sqrt[n]{x}\)? (n can be any positive integer)
(c) Explain why \(e^x\) will dominate any polynomial function.
(d) Explain why \(x^n\) will dominate \(\ln(x)\) for any positive integer n.
(e) Give any example of two nonlinear functions such that neither dominates the other.

CHAPTER OVERVIEW

3: USING DERIVATIVES
An Introductory Calculus Libretexts Textmap
Active Calculus
by Matt Boelkins, David Austin, and Steve Schlicker

Chapter 1: Understanding the Derivative


1.1: How do we Measure Velocity?
1.2: The Notion of Limit
1.3: The Derivative of a Function at a Point
1.4: The Derivative Function
1.5: Interpreting, Estimating, and Using the Derivative
1.6: The Second Derivative
1.7: Limits, Continuity, and Differentiability
1.8: The Tangent Line Approximation
1.E: Understanding the Derivative (Exercises)

Chapter 2: Computing Derivatives


2.1: Elementary Derivative Rules
2.2: The Sine and Cosine Functions
2.3: The Product and Quotient Rules
2.4: Derivatives of Other Trigonometric Functions
2.5: The Chain Rule
2.6: Derivatives of Inverse Functions
2.7: Derivatives of Functions Given Implicitly
2.8: Using Derivatives to Evaluate Limits
2.E: Computing Derivatives (Exercises)

Chapter 3: Using Derivatives


3.1: Using Derivatives to Identify Extreme Values
3.2: Using Derivatives to Describe Families of Functions
3.3: Global Optimization
3.4: Applied Optimization
3.5: Related Rates
3.E: Using Derivatives (Exercises)

Chapter 4: The Definite Integral


4.1: Determining Distance Traveled from Velocity
4.2: Riemann Sums
4.3: The Definite Integral
4.4: The Fundamental Theorem of Calculus
4.E: The Definite Integral (Exercises)

Chapter 5: Finding Antiderivatives and Evaluating Integrals


5.1: Constructing Accurate Graphs of Antiderivatives
5.2: The Second Fundamental Theorem of Calculus
5.3: Integration by Substitution
5.4: Integration by Parts
5.5: Other Options for Finding Algebraic Antiderivatives
5.6: Numerical Integration
5.E: Finding Antiderivatives and Evaluating Integrals (Exercises)

Chapter 6: Using Definite Integrals


6.1: Using Definite Integrals to Find Area and Length
6.2: Using Definite Integrals to Find Volume
6.3: Density, Mass, and Center of Mass
6.4: Physics Applications: Work, Force, and Pressure
6.5: Improper Integrals
6.E: Using Definite Integrals (Exercises)

Chapter 7: Differential Equations


7.1: An Introduction to Differential Equations
7.2: Qualitative Behavior of Solutions to Differential Equations
7.3: Euler's Method
7.4: Separable Differential Equations
7.5: Modeling with Differential Equations
7.6: Population Growth and the Logistic Equation
7.E: Differential Equations (Exercises)

Chapter 8: Sequences and Series


8.1: Sequences
8.2: Geometric Series
8.3: Series of Real Numbers
8.4: Alternating Series
8.5: Taylor Polynomials and Taylor Series
8.6: Power Series
8.E: Sequences and Series (Exercises)

3.1: USING DERIVATIVES TO IDENTIFY EXTREME VALUES


The critical numbers of a continuous function f are the values of p for which f′(p)=0 or f′(p) does not exist. These values are important
because they identify horizontal tangent lines or corner points on the graph, which are the only possible locations at which a local
maximum or local minimum can occur.

3.2: USING DERIVATIVES TO DESCRIBE FAMILIES OF FUNCTIONS


Given a family of functions that depends on one or more parameters, by investigating how critical numbers and locations where the
second derivative is zero depend on the values of these parameters, we can often accurately describe the shape of the function in terms
of the parameters. In particular, just as we can create first and second derivative sign charts for a single function, we often can do so
for entire families of functions.

3.3: GLOBAL OPTIMIZATION


To find relative extreme values of a function, we normally use a first derivative sign chart and classify all of the function’s critical
numbers. If instead we are interested in absolute extreme values, we first decide whether we are considering the entire domain of the
function or a particular interval. If we are working to find absolute extremes on a restricted interval, then we first identify all critical
numbers of the function that lie in the interval

3.4: APPLIED OPTIMIZATION


While there is no single algorithm that works in every situation where optimization is used, in most of the problems we consider, the
following steps are helpful: draw a picture and introduce variables; identify the quantity to be optimized and find relationships among
the variables; determine a function of a single variable that models the quantity to be optimized; decide the domain on which to
consider the function being optimized; use calculus to identify the absolute maximum and/or minimum.

3.5: RELATED RATES


When two or more related quantities are changing as implicit functions of time, their rates of change can be related by implicitly
differentiating the equation that relates the quantities themselves.

3.E: USING DERIVATIVES (EXERCISES)
These are homework exercises to accompany Chapter 3 of Boelkins et al. "Active Calculus" Textmap.

3.1: Using Derivatives to Identify Extreme Values
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What are the critical numbers of a function f and how are they connected to identifying the most extreme values the
function achieves?
How does the first derivative of a function reveal important information about the behavior of the function,
including the function’s extreme values?
How can the second derivative of a function be used to help identify extreme values of the function?

In many different settings, we are interested in knowing where a function achieves its least and greatest values. These can
be important in applications – say to identify a point at which maximum profit or minimum cost occurs – or in theory to
understand how to characterize the behavior of a function or a family of related functions. Consider the simple and familiar
example of a parabolic function such as
\[
s(t) = -16t^2 + 24t + 32, \tag{3.1.1}
\]

which is shown at left in Figure 3.1.1 and represents the height of an object tossed vertically: its maximum value
occurs at the vertex of the parabola and represents the highest value that the object reaches. Moreover, this maximum value
identifies an especially important point on the graph, the point at which the curve changes from increasing to decreasing.
More generally, for any function we consider, we can investigate where its lowest
and highest points occur in comparison to points nearby or to all possible points on the graph. Given a function f, we say
that \(f(c)\) is a global or absolute maximum provided that \(f(c) \geq f(x)\) for all x in the domain of f, and similarly call \(f(c)\) a
global or absolute minimum whenever \(f(c) \leq f(x)\) for all x in the domain of f. For instance, for the function g given at
right in Figure 3.1.1, g has a global maximum of \(g(c)\), but g does not appear to have a global minimum, as the graph of g
seems to decrease without bound. We note that the point \((c, g(c))\) marks a fundamental change in the behavior of g, where
g changes from increasing to decreasing; similar things happen at both \((a, g(a))\) and \((b, g(b))\), although these points are
not global mins or maxes.

Figure 3.1.1: At left, \(s(t) = -16t^2 + 24t + 32\) whose vertex is (3/4, 41); at right, a function g that demonstrates
several high and low points.
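For the parabola named in the caption of Figure 3.1.1, the global maximum can be located directly from the vertex formula. The sketch below is an illustration added here, not part of the original text; it assumes the caption's formula \(s(t) = -16t^2 + 24t + 32\) and then spot-checks the definition of a global maximum at a few arbitrarily chosen sample inputs.

```python
def s(t: float) -> float:
    """Height function from the caption of Figure 3.1.1."""
    return -16 * t**2 + 24 * t + 32

# For a parabola a*t^2 + b*t + c with a < 0, the maximum occurs at t = -b / (2a).
a, b = -16.0, 24.0
t_vertex = -b / (2 * a)
print(t_vertex, s(t_vertex))

# Spot-check s(c) >= s(t) at sample inputs on both sides of the vertex.
samples = [t_vertex + d for d in (-2.0, -0.5, -0.01, 0.01, 0.5, 2.0)]
print(all(s(t) <= s(t_vertex) for t in samples))
```

The computed vertex agrees with the point (3/4, 41) stated in the caption; the sample check only illustrates the inequality in the definition and is not a proof.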


For any function f , we say that f (c) is a local maximum or relative maximum provided that f (c) ≥ f (x) for all x near c,
while f (c) is called a local or relative minimum whenever f (c) ≤ f (x) for all x near c . Any maximum or minimum may
be called an extreme value of f . For example, in Figure 3.1.1, g has a relative minimum of g(b) at the point (b, g(b)) and a
relative maximum of g(a) at (a, g(a)) . We have already identified the global maximum of g as g(c) ; this global maximum
can also be considered a relative maximum.
We would like to use fundamental calculus ideas to help us identify and classify key function behavior, including the
location of relative extremes. Of course, if we are given a graph of a function, it is often straightforward to locate these
important behaviors visually. We investigate this situation in the following preview activity.

Preview Activity 3.1.1

Consider the function h given by the graph in Figure 3.1.2. Use the graph to answer each of the following questions.

Figure 3.1.2 : The graph of a function h on the interval [−3, 3].


a. Identify all of the values of c for which \(h(c)\) is a local maximum of h.
b. Identify all of the values of c for which \(h(c)\) is a local minimum of h.
c. Does h have a global maximum on the interval [−3, 3]? If so, what is the value of this global maximum?
d. Does h have a global minimum on the interval [−3, 3]? If so, what is its value?
e. Identify all values of c for which \(h'(c) = 0\).
f. Identify all values of c for which \(h'(c)\) does not exist.
g. True or false: every relative maximum and minimum of h occurs at a point where \(h'(c)\) is either zero or does not
exist.
h. True or false: at every point where \(h'(c)\) is zero or does not exist, h has a relative maximum or minimum.

Critical numbers and the First Derivative Test


If a function has a relative extreme value at a point (c, f (c)), the function must change its behavior at c regarding whether
it is increasing or decreasing before or after the point. For example, if a continuous function has a relative maximum at c ,
such as those pictured in the two leftmost functions in Figure 3.1.3, then it is both necessary and sufficient that the
function change from being increasing just before c to decreasing just after c . In the same way, a continuous function has a
relative minimum at c if and only if the function changes from decreasing to increasing at c . See, for instance, the two
functions pictured at right in Figure 3.1.3. There are only two possible ways for these changes in behavior to occur: either
\(f'(c) = 0\) or \(f'(c)\) is undefined.

Figure 3.1.3 : From left to right, a function with a relative maximum where its derivative is zero; a function with a
relative maximum where its derivative is undefined; a function with neither a maximum nor a minimum at a point where its
derivative is zero; a function with a relative minimum where its derivative is zero; and a function with a relative minimum
where its derivative is undefined.
Because these values of c are so important, we call them critical numbers. More specifically, we say that a function f has a
critical number at x = c provided that c is in the domain of f , and f (c) = 0 or f (c) is undefined. Critical numbers
′ ′

provide us with the only possible locations where the function f may have relative extremes. Note that not every critical
number produces a maximum or minimum; in the middle graph of Figure 3.1.3, the function pictured there has a

Matthew Boelkins, David Austin & Steven


3.1.2 12/22/2021 https://math.libretexts.org/@go/page/4306
Schlicker
horizontal tangent line at the noted point, but the function is increasing before and increasing after, so the critical number
does not yield a location where the function is greater than every value nearby, nor less than every value nearby.
We also sometimes say that, when c is a critical number, (c, f(c)) is a critical point of the function and f(c) is a
critical value.
The first derivative test summarizes how sign changes in the first derivative indicate the presence of a local maximum or
minimum for a given function.

First Derivative Test


If p is a critical number of a continuous function f that is differentiable near p (except possibly at x = p), then f has a
relative maximum at p if and only if f' changes sign from positive to negative at p, and f has a relative minimum at p
if and only if f' changes sign from negative to positive at p.


We consider an example to show one way the first derivative test can be used to identify the relative extreme values of a
function.

Example 3.1.1:

Let f be a function whose derivative is given by the formula $f'(x) = e^{-2x}(3 - x)(x + 1)^2$. Determine all critical
numbers of f and decide whether a relative maximum, relative minimum, or neither occurs at each.
Solution
Since we already have f'(x) written in factored form, it is straightforward to find the critical numbers of f. Since f'(x)
is defined for all values of x, we need only determine where f'(x) = 0. From the equation

$$f'(x) = e^{-2x}(3 - x)(x + 1)^2 = 0 \tag{3.1.2}$$

and the zero product property, it follows that x = 3 and x = −1 are critical numbers of f. (Note particularly that there
is no value of x that makes $e^{-2x} = 0$.)

Next, to apply the first derivative test, we’d like to know the sign of f'(x) at inputs near the critical numbers. Because

the critical numbers are the only locations at which f' can change sign, it follows that the sign of the derivative is the
same on each of the intervals created by the critical numbers: for instance, the sign of f' must be the same for every
x < −1 . We create a first derivative sign chart to summarize the sign of f' on the relevant intervals along with the

corresponding behavior of f .

Figure 3.1.4 : The first derivative sign chart for a function f whose derivative is given by the formula
$f'(x) = e^{-2x}(3 - x)(x + 1)^2$.
The first derivative sign chart in Figure 3.1.4 comes from thinking about the sign of each of the factors in the factored
form of f'(x) at one selected point in the interval under consideration. For instance, for x < −1, we could consider
x = −2 and determine the sign of $e^{-2x}$, $(3 - x)$, and $(x + 1)^2$ at the value x = −2. We note that both $e^{-2x}$ and
$(x + 1)^2$ are positive at every x other than x = −1, while $(3 - x)$ is also positive at x = −2. Hence, each of the three
factors in f' is positive, which we indicate by writing “+ + +.” Taking the product of three positive factors
results in a value that is positive, which we denote by the “+” in the interval to the left of x = −1 indicating the overall
sign of f'. And, since f' is positive on that interval, we further know that f is increasing, which we summarize by
writing “INC” to represent the corresponding behavior of f. In a similar way, we find that f' is positive and f is
increasing on −1 < x < 3, and f' is negative and f is decreasing for x > 3.


Now, by the first derivative test, to find relative extremes of f we look for critical numbers at which f' changes sign.
In this example, f' only changes sign at x = 3, where f' changes from positive to negative, and thus f has a relative
maximum at x = 3. While f has a critical number at x = −1, since f is increasing both before and after x = −1, f
has neither a minimum nor a maximum at x = −1.
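
The sign-chart reasoning of Example 3.1.1 can be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper names `sign` and `classify` and the test points −2, 0, and 4 are illustrative assumptions (any point inside each interval would serve).

```python
import math

def f_prime(x):
    # Derivative from Example 3.1.1: f'(x) = e^(-2x) (3 - x) (x + 1)^2
    return math.exp(-2 * x) * (3 - x) * (x + 1) ** 2

def sign(value):
    # Report the sign of a number as "+", "-", or "0"
    return "+" if value > 0 else "-" if value < 0 else "0"

def classify(left, right):
    # First derivative test: compare the sign of f' on either side
    if left == "+" and right == "-":
        return "relative maximum"
    if left == "-" and right == "+":
        return "relative minimum"
    return "neither"

critical_numbers = [-1, 3]
# One test point per interval: x < -1, -1 < x < 3, x > 3 (arbitrary choices)
test_points = [-2, 0, 4]
signs = [sign(f_prime(x)) for x in test_points]

# Pair each critical number with the signs on its left and right
results = {c: classify(signs[i], signs[i + 1])
           for i, c in enumerate(critical_numbers)}

print(signs)
print(results)
```

Because f' cannot change sign between consecutive critical numbers, a single test point per interval is enough to fill in the whole chart.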

Activity 3.1.1

Suppose that g(x) is a function continuous for every value of x ≠ 2 whose first derivative is

$$g'(x) = \frac{(x + 4)(x - 1)^2}{x - 2}.$$

Further, assume that it is known that g has a vertical asymptote at x = 2.


a. Determine all critical numbers of g .
b. By developing a carefully labeled first derivative sign chart, decide whether g has a local maximum, local
minimum, or neither at each critical number.
c. Does g have a global maximum? global minimum? Justify your claims.
d. What is the value of $\lim_{x \to \infty} g'(x)$? What does the value of this limit tell you about the long-term behavior of g?

e. Sketch a possible graph of y = g(x) .

The Second Derivative Test


Recall that the second derivative of a function tells us several important things about the behavior of the function itself. For
instance, if f'' is positive on an interval, then we know that f' is increasing on that interval and, consequently, that f is
concave up, which also tells us that throughout the interval the tangent line to y = f(x) lies below the curve at every point.
In the situation where we also know that f'(p) = 0, it turns out that the sign of the second derivative determines whether f has
a local minimum or local maximum at the critical number p.


In Figure 3.1.5, we see the four possibilities for a function f that has a critical number p at which f'(p) = 0, provided
f''(x) is not zero on an interval including p (except possibly at p). On either side of the critical number, f'' can be either
positive or negative, and hence f can be either concave up or concave down. In the first two graphs, f does not change
concavity at p, and in those situations, f has either a local minimum or local maximum. In particular, if f'(p) = 0 and
f''(p) < 0, then we know f is concave down at p with a horizontal tangent line, and this guarantees f has a local
maximum there. This fact, along with the corresponding statement for when f''(p) is positive, is stated in the second
derivative test.

Figure 3.1.5 : Four possible graphs of a function f with a horizontal tangent line at a critical point.

Second Derivative Test

If p is a critical number of a continuous function f such that f'(p) = 0 and f''(p) ≠ 0, then f has a relative maximum
at p if and only if f''(p) < 0, and f has a relative minimum at p if and only if f''(p) > 0.

In the event that f''(p) = 0, the second derivative test is inconclusive. That is, the test doesn’t provide us any information.
This is because if f''(p) = 0, it is possible that f has a local minimum, local maximum, or neither (see footnote 1).

Just as a first derivative sign chart reveals all of the increasing and decreasing behavior of a function, we can construct a second
derivative sign chart that demonstrates all of the important information involving concavity.
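
The decision logic of the second derivative test is short enough to write out directly. The sketch below is illustrative only: the function name `second_derivative_test` is an assumption, and the derivative values passed to it were computed by hand for a few simple power functions at p = 0.

```python
def second_derivative_test(f1_at_p, f2_at_p):
    """Classify a critical number p from the values f'(p) and f''(p)."""
    if f1_at_p != 0:
        # The test only applies where the tangent line is horizontal
        return "not applicable"
    if f2_at_p < 0:
        return "relative maximum"   # concave down with a horizontal tangent
    if f2_at_p > 0:
        return "relative minimum"   # concave up with a horizontal tangent
    return "inconclusive"           # f''(p) = 0 gives no information

# (f'(0), f''(0)) worked out by hand for each sample function
samples = {
    "x^2":  (0, 2),    # f' = 2x,    f'' = 2
    "-x^2": (0, -2),   # f' = -2x,   f'' = -2
    "x^4":  (0, 0),    # f' = 4x^3,  f'' = 12x^2
    "-x^4": (0, 0),    # f' = -4x^3, f'' = -12x^2
    "x^3":  (0, 0),    # f' = 3x^2,  f'' = 6x
}

verdicts = {name: second_derivative_test(d1, d2)
            for name, (d1, d2) in samples.items()}
for name, verdict in verdicts.items():
    print(name, "->", verdict)
```

The last three functions all return "inconclusive" even though x^4 has a minimum at 0, −x^4 has a maximum, and x^3 has neither, which is why the first derivative test is needed in those cases.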

Example 3.1.2:

Let f(x) be a function whose first derivative is

$$f'(x) = 3x^4 - 9x^2.$$

Construct both first and second derivative sign charts for f , fully discuss where f is increasing and decreasing and
concave up and concave down, identify all relative extreme values, and sketch a possible graph of f .
Solution
Since we know $f'(x) = 3x^4 - 9x^2$, we can find the critical numbers of f by solving $3x^4 - 9x^2 = 0$. Factoring, we
observe that

$$0 = 3x^2(x^2 - 3) = 3x^2(x + \sqrt{3})(x - \sqrt{3}), \tag{3.1.3}$$


so that $x = 0, \pm\sqrt{3}$ are the three critical numbers of f. It then follows that the first derivative sign chart for f is given
in Figure 3.1.6. Thus, f is increasing on the intervals $(-\infty, -\sqrt{3})$ and $(\sqrt{3}, \infty)$, while f is decreasing on
$(-\sqrt{3}, 0)$ and $(0, \sqrt{3})$. Note particularly that by the first derivative test, this information tells us that f has a local
maximum at $x = -\sqrt{3}$ and a local minimum at $x = \sqrt{3}$. While f also has a critical number at x = 0, neither a maximum nor
minimum occurs there since f' does not change sign at x = 0.

Figure 3.1.6 : The first derivative sign chart for f when $f'(x) = 3x^4 - 9x^2 = 3x^2(x^2 - 3)$.

Next, we move on to investigate concavity. Differentiating $f'(x) = 3x^4 - 9x^2$, we see that $f''(x) = 12x^3 - 18x$. Since we
are interested in knowing the intervals on which f'' is positive and negative, we first find where f''(x) = 0. Observe that

$$0 = 12x^3 - 18x = 12x\left(x^2 - \frac{3}{2}\right) = 12x\left(x + \sqrt{\frac{3}{2}}\right)\left(x - \sqrt{\frac{3}{2}}\right),$$

which implies that $x = 0, \pm\sqrt{3/2}$. Building a sign chart for f'' in the exact same way we do for f', we see the result
shown in Figure 3.1.7. Therefore, f is concave down on the intervals $(-\infty, -\sqrt{3/2})$ and $(0, \sqrt{3/2})$, and concave up on
$(-\sqrt{3/2}, 0)$ and $(\sqrt{3/2}, \infty)$.

Figure 3.1.7 : The second derivative sign chart for f when $f''(x) = 12x^3 - 18x = 12x\left(x^2 - \frac{3}{2}\right)$.

Putting all of the above information together, we now see a complete and accurate possible graph of f in Figure 3.1.8. The
point $A = (-\sqrt{3}, f(-\sqrt{3}))$ is a local maximum, as f is increasing prior to A and decreasing after; similarly, the point
$E = (\sqrt{3}, f(\sqrt{3}))$ is a local minimum.

Footnote 1: Consider the functions $f(x) = x^4$, $g(x) = -x^4$, and $h(x) = x^3$ at the critical point p = 0.

Figure 3.1.8 : A possible graph of the function f in Example 3.1.2.
Note, too, that f is concave down at A and concave up at E, which is consistent both with our second derivative sign
chart and the second derivative test. At points B and D, concavity changes, as we saw in the results of the second
derivative sign chart in Figure 3.1.7. Finally, at point C, f has a critical point with a horizontal tangent line, but
neither a maximum nor a minimum occurs there since f is decreasing both before and after C.
It is also the case that concavity changes at C.

While we completely understand where f is increasing and decreasing,
where f is concave up and concave down, and where f has relative extremes, we do not know any specific information
about the y-coordinates of points on the curve. For instance, while we know that f has a local maximum at $x = -\sqrt{3}$,
we don’t know the value of that maximum because we do not know $f(-\sqrt{3})$. Any vertical translation of our sketch of
f in Figure 3.1.8 would satisfy the given criteria for f.

Points B, C, and D in Figure 3.1.8 are locations at which the
concavity of f changes. We give a special name to any such point: if p is a value in the domain of a continuous function
f at which f changes concavity, then we say that (p, f(p)) is an inflection point of f. Just as we look for locations
where f changes from increasing to decreasing at points where f'(p) = 0 or f'(p) is undefined, so too we find where
f''(p) = 0 or f''(p) is undefined to see if there are points of inflection at these locations.

It is important at this point in
our study to remind ourselves of the big picture that derivatives help to paint: the sign of the first derivative f' tells us
whether the function f is increasing or decreasing, while the sign of the second derivative f'' tells us how the function
f is increasing or decreasing.
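
A zero of f'' is only a candidate for an inflection point; concavity must actually change there. As a hedged illustration, the Python sketch below tests the three candidates from Example 3.1.2 by comparing the sign of f'' just to the left and right of each. The helper name `changes_sign` and the offset `h = 0.1` are assumptions chosen for this example (the offset only needs to be smaller than the gap between neighboring candidates).

```python
import math

def f_double_prime(x):
    # Second derivative from Example 3.1.2: f''(x) = 12x^3 - 18x
    return 12 * x**3 - 18 * x

def changes_sign(g, p, h):
    # True when g has opposite signs at p - h and p + h
    return g(p - h) * g(p + h) < 0

r = math.sqrt(3 / 2)
candidates = [-r, 0.0, r]   # the solutions of f''(x) = 0
h = 0.1                     # small offset; the candidates are about 1.22 apart

inflection_x = [p for p in candidates if changes_sign(f_double_prime, p, h)]
print(inflection_x)

# Contrast: for x^4 the second derivative 12x^2 is zero at 0
# but does not change sign, so x = 0 is not an inflection point.
print(changes_sign(lambda x: 12 * x**2, 0.0, h))
```

All three candidates pass the check, matching the statement that B, C, and D are inflection points, while the contrasting function shows why the sign change has to be verified rather than assumed.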

Activity 3.1.2:

Suppose that g is a function whose second derivative, g'', is given by the following graph.

Figure 3.1.9 : The graph of y = g''(x).


a. Find all points of inflection of g .
b. Fully describe the concavity of g by making an appropriate sign chart.
c. Suppose you are given that g'(−1.67857351) = 0. Is there a local maximum, local minimum, or neither (for the
function g) at this critical point of g, or is it impossible to say? Why?

d. Assuming that g''(x) is a polynomial (and that all important behavior of g'' is seen in the graph above), what degree
polynomial do you think g(x) is? Why?

As we will see in more detail in the following section, derivatives also help us to understand families of functions that
differ only by changing one or more parameters. For instance, we might be interested in understanding the behavior of all
functions of the form $f(x) = a(x - h)^2 + k$ where a, h, and k are numbers that may vary. In the following activity, we
investigate a particular example where the value of a single parameter has considerable impact on how the graph appears.

Activity 3.1.3:

Consider the family of functions given by

$$h(x) = x^2 + \cos(kx) \tag{3.1.4}$$

where k is an arbitrary positive real number.


a. Use a graphing utility to sketch the graph of h for several different k-values, including k = 1, 3, 5, 10. Plot
$h(x) = x^2 + \cos(3x)$ on the axes provided below. What is the smallest value of k at which you think you can see
(just by looking at the graph) at least one inflection point on the graph of h?

Figure 3.1.10 : Axes for plotting y = h(x).


b. Explain why the graph of h has no inflection points if $k \le \sqrt{2}$, but infinitely many inflection points if $k > \sqrt{2}$.
c. Explain why, no matter the value of k , h can only have finitely many critical numbers.

Summary
In this section, we encountered the following important ideas:
The critical numbers of a continuous function f are the values of p for which f'(p) = 0 or f'(p) does not exist. These
values are important because they identify horizontal tangent lines or corner points on the graph, which are the only
possible locations at which a local maximum or local minimum can occur.
Given a differentiable function f, whenever f' is positive, f is increasing; whenever f' is negative, f is decreasing.
The first derivative test tells us that at any point where f changes from increasing to decreasing, f has a local
maximum, while conversely at any point where f changes from decreasing to increasing f has a local minimum.
Given a twice differentiable function f, if we have a horizontal tangent line at x = p and f''(p) is nonzero, then the
fact that f'' tells us the concavity of f will determine whether f has a maximum or minimum at x = p. In particular, if
f'(p) = 0 and f''(p) < 0, then f is concave down at p and f has a local maximum there, while if f'(p) = 0 and
f''(p) > 0, then f has a local minimum at p. If f'(p) = 0 and f''(p) = 0, then the second derivative does not tell us
whether f has a local extreme at p or not.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

3.2: Using Derivatives to Describe Families of Functions
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
Given a family of functions that depends on one or more parameters, how does the shape of the graph of a typical
function in the family depend on the value of the parameters?
How can we construct first and second derivative sign charts of functions that depend on one or more parameters
while allowing those parameters to remain arbitrary constants?

Mathematicians are often interested in making general observations, say by describing patterns that hold in a large number
of cases. For example, think about the Pythagorean Theorem: it doesn’t tell us something about a single right triangle, but
rather a fact about every right triangle, thus providing key information about every member of the right triangle family. In
the next part of our studies, we would like to use calculus to help us make general observations about families of functions
that depend on one or more parameters. People who use applied mathematics, such as engineers and economists, often
encounter the same types of functions in various settings where only small changes to certain constants occur. These
constants are called parameters.

We are already familiar with certain families of functions. For example,

$$f(t) = a\sin(b(t - c)) + d \tag{3.2.1}$$

is a stretched and shifted version of the sine function with amplitude a, period $2\pi/b$, phase shift c, and vertical shift d. We
understand from experience with trigonometric functions that a affects the size of the oscillation, b the rapidity of
oscillation, and c where the oscillation starts, as shown in Figure 3.13, while d affects the vertical positioning of the graph.

Figure 3.13: The graph of $f(t) = a\sin(b(t - c)) + d$ based on parameters a, b, c, and d.

In addition, there are several basic situations that we already understand completely. For instance, every function of the
form y = mx + b is a line with slope m and y-intercept (0, b). Note that the form y = mx + b allows us to consider every
possible line by using two parameters (except for vertical lines which are of the form x = a). Further, we understand that
the value of m affects the line’s steepness and whether the line rises or falls from left to right, while the value of b situates
the line vertically on the coordinate axes.

For other less familiar families of functions, we would like to use calculus to
understand and classify where key behavior occurs: where members of the family are increasing or decreasing, concave up
or concave down, where relative extremes occur, and more, all
in terms of the parameters involved. To get started, we revisit a common collection of functions to see how calculus
confirms things we already know.

Preview Activity 3.2.1

Let a, h, and k be arbitrary real numbers with a ≠ 0, and let f be the function given by the rule $f(x) = a(x - h)^2 + k$.
a. What familiar type of function is f? What information do you know about f just by looking at its form? (Think
about the roles of a, h, and k.)
Matthew Boelkins, David Austin & Steven
3.2.1 12/22/2021 https://math.libretexts.org/@go/page/4307
Schlicker
b. Next we use some calculus to develop familiar ideas from a different perspective. To start, treat a, h, and k as
constants and compute f'(x).
c. Find all critical numbers of f. (These will depend on at least one of a, h, and k.)
d. Assume that a < 0. Construct a first derivative sign chart for f.
e. Based on the information you’ve found above, classify the critical values of f as maxima or minima.

Describing families of functions in terms of parameters

Given a family of functions that depends on one or more
parameters, our goal is to describe the key characteristics of the overall behavior of each member of the family in terms of
those parameters. By finding the first and second derivatives and constructing first and second derivative sign charts (each
of which may depend on one or more of the parameters), we can often make broad conclusions about how each
member of the family will appear. The fundamental steps for this analysis are essentially identical to the work we did in
Section 3.1, as we demonstrate through the following example.

Example 3.2.1

Consider the two-parameter family of functions given by $g(x) = axe^{-bx}$, where a and b are positive real numbers.
Fully describe the behavior of a typical member of the family in terms of a and b, including the location of all critical
numbers, where g is increasing, decreasing, concave up, and concave down, and the long term behavior of g.
Solution
We begin by computing g'(x). By the product rule,

$$g'(x) = ax \frac{d}{dx}\left[e^{-bx}\right] + e^{-bx} \frac{d}{dx}[ax],$$

and thus by applying the chain rule and constant multiple rule, we find that

$$g'(x) = axe^{-bx}(-b) + e^{-bx}(a).$$

To find the critical numbers of g, we solve the equation g'(x) = 0. Here, it is especially helpful to factor g'(x). We thus
observe that setting the derivative equal to zero implies

$$0 = ae^{-bx}(-bx + 1).$$

Since we are given that a ≠ 0 and we know that $e^{-bx} \ne 0$ for all values of x, the only way the preceding equation can
hold is when −bx + 1 = 0. Solving for x, we find that x = 1/b, and this is therefore the only critical number of g.

Now, recall that we have shown $g'(x) = ae^{-bx}(1 - bx)$ and that the only critical number of g is x = 1/b. This enables us
to construct the first derivative sign chart for g that is shown in Figure 3.14.

Figure 3.14: The first derivative sign chart for $g(x) = axe^{-bx}$.

Note particularly that in $g'(x) = ae^{-bx}(1 - bx)$, the factor $ae^{-bx}$ is always positive, so the sign depends on the linear
factor (1 − bx), which is zero when x = 1/b. Note that this line has negative slope (−b), so (1 − bx) is positive for x < 1/b
and negative for x > 1/b. Hence we can not only conclude that g is always increasing for x < 1/b and decreasing for
x > 1/b, but also that g has a global maximum at (1/b, g(1/b)) and no local minimum.

We turn next to analyzing the concavity of g. With $g'(x) = -abxe^{-bx} + ae^{-bx}$, we differentiate to find that

$$g''(x) = -abxe^{-bx}(-b) + e^{-bx}(-ab) + ae^{-bx}(-b).$$

Combining like terms and factoring, we now have

$$g''(x) = ab^2xe^{-bx} - 2abe^{-bx} = abe^{-bx}(bx - 2).$$

Similar to our work with the first derivative, we observe that $abe^{-bx}$ is always positive,
and thus the sign of g'' depends on the sign of (bx − 2), which is zero when x = 2/b. Since (bx − 2) represents a line
with positive slope (b), the value of (bx − 2) is negative for x < 2/b and positive for x > 2/b, and thus the sign chart for
g'' is given by the one shown in Figure 3.15. Thus, g is concave down for all x < 2/b and concave up for all x > 2/b.

Figure 3.15: The second derivative sign chart for $g(x) = axe^{-bx}$.

Finally, we analyze the long term behavior of g by considering two limits. First, we note that

$$\lim_{x \to \infty} g(x) = \lim_{x \to \infty} axe^{-bx} = \lim_{x \to \infty} \frac{ax}{e^{bx}}.$$

Since this limit has indeterminate form ∞/∞, we can apply L’Hôpital’s Rule and thus find
that $\lim_{x \to \infty} g(x) = \lim_{x \to \infty} \frac{a}{be^{bx}} = 0$. In the other direction,

$$\lim_{x \to -\infty} g(x) = \lim_{x \to -\infty} axe^{-bx} = -\infty,$$

since ax → −∞ and $e^{-bx} \to \infty$ as x → −∞. Hence, as we move left on its graph, g decreases without bound, while as we
move to the right, g(x) → 0.

All of the above information now allows us to produce the graph of a typical member of this family of functions
without using a graphing utility (and without choosing particular values for a and b), as shown in Figure 3.16.

Figure 3.16: The graph of $g(x) = axe^{-bx}$.


We note that the value of b controls the horizontal location of the global maximum and the inflection point, as neither
depends on a. The value of a affects the vertical stretch of the graph. For example, the global maximum occurs at the
point $\left(\frac{1}{b}, g\left(\frac{1}{b}\right)\right) = \left(\frac{1}{b}, \frac{a}{b}e^{-1}\right)$, so the larger the value of a, the greater the value of the global maximum.
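
The conclusions of Example 3.2.1 hold for arbitrary positive a and b, but it can be reassuring to test them for one member of the family. The Python sketch below does so under stated assumptions: the sample values a = 2 and b = 0.5 are chosen arbitrarily, and the derivative formulas are the factored forms derived in the example.

```python
import math

def g(x, a, b):
    # Family from Example 3.2.1: g(x) = a x e^(-bx)
    return a * x * math.exp(-b * x)

def g_prime(x, a, b):
    # Factored first derivative: g'(x) = a e^(-bx) (1 - bx)
    return a * math.exp(-b * x) * (1 - b * x)

def g_double_prime(x, a, b):
    # Factored second derivative: g''(x) = a b e^(-bx) (bx - 2)
    return a * b * math.exp(-b * x) * (b * x - 2)

a, b = 2.0, 0.5          # sample parameter values (an arbitrary choice)
crit = 1 / b             # predicted critical number
infl = 2 / b             # predicted inflection location
max_value = g(crit, a, b)

print("critical number:", crit, "g' there:", g_prime(crit, a, b))
print("g' left/right:", g_prime(crit - 0.5, a, b), g_prime(crit + 0.5, a, b))
print("g'' left/right:", g_double_prime(infl - 0.5, a, b),
      g_double_prime(infl + 0.5, a, b))
print("max value:", max_value, "predicted a/(b e):", a / (b * math.e))
```

Changing `a` rescales the printed maximum value without moving `crit` or `infl`, while changing `b` moves both locations, which mirrors the roles of the two parameters described in the example.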

The kind of work we’ve completed in Example 3.2.1 can often be replicated for other families of functions that depend on
parameters. Normally we are most interested in determining all critical numbers, a first derivative sign chart, a second
derivative sign chart, and some analysis of the limit of the function as x → ∞. Throughout, we strive to work with the
parameters as arbitrary constants. If stuck, it is always possible to experiment with some particular values of the
parameters present to reduce the algebraic complexity of our work. The following sequence of activities offers several key
examples where we see that the values of different parameters substantially affect the behavior of individual functions
within a given family.

Activity 3.2.2

Consider the family of functions defined by $p(x) = x^3 - ax$, where a ≠ 0 is an arbitrary constant.
a. Find p'(x) and determine the critical numbers of p. How many critical numbers does p have?
b. Construct a first derivative sign chart for p. What can you say about the overall behavior of p if the constant a
is positive? Why? What if the constant a is negative? In each case, describe the relative extremes of p.
c. Find p''(x) and construct a second derivative sign chart for p. What does this tell you about the concavity of p?
What role does a play in determining the concavity of p?
d. Without using a graphing utility, sketch and label typical graphs of p(x) for the cases where a > 0 and a < 0. Label
all inflection points and local extrema.
e. Finally, use a graphing utility to test your observations above by entering and plotting the function $p(x) = x^3 - ax$
for at least four different values of a. Write several sentences to describe your overall conclusions about how the
behavior of p depends on a.

Activity 3.2.3

Consider the two-parameter family of functions of the form $h(x) = a(1 - e^{-bx})$, where a and b are positive real
numbers.
a. Find the first derivative and the critical numbers of h. Use these to construct a first derivative sign chart and
determine for which values of x the function h is increasing and decreasing.
b. Find the second derivative and build a second derivative sign chart. For which values of x is a function in this
family concave up? concave down?
c. What is the value of $\lim_{x \to \infty} a(1 - e^{-bx})$? $\lim_{x \to -\infty} a(1 - e^{-bx})$?
d. How does changing the value of b affect the shape of the curve?
e. Without using a graphing utility, sketch the graph of a typical member of this family. Write several sentences to
describe the overall behavior of a typical function h and how this behavior depends on a and b.

Activity 3.2.4

Let $L(t) = \frac{A}{1 + ce^{-kt}}$, where A, c, and k are all positive real numbers.
a. Observe that we can equivalently write $L(t) = A(1 + ce^{-kt})^{-1}$. Find L'(t) and explain why L has no critical
numbers. Is L always increasing or always decreasing? Why?
b. Given the fact that

$$L''(t) = Ack^2e^{-kt}\,\frac{ce^{-kt} - 1}{(1 + ce^{-kt})^3},$$

find all values of t such that L''(t) = 0 and
hence construct a second derivative sign chart. For which values of t is a function in this family concave up?
concave down?
c. What is the value of $\lim_{t \to \infty} \frac{A}{1 + ce^{-kt}}$? $\lim_{t \to -\infty} \frac{A}{1 + ce^{-kt}}$?
d. Find the value of L(t) at the inflection point found in (b).
e. Without using a graphing utility, sketch the graph of a typical member of this family. Write several sentences to
describe the overall behavior of a typical function L and how this behavior depends on A, c, and k.
f. Explain why it is reasonable to think that the function L(t) models the growth of a population over time in a
setting where the largest possible population the surrounding environment can support is A.

Summary
In this section, we encountered the following important ideas:
Given a family of functions that depends on one or more parameters, by investigating how critical numbers and
locations where the second derivative is zero depend on the values of these parameters, we can often accurately
describe the shape of the function in terms of the parameters.
In particular, just as we can create first and second derivative sign charts for a single function, we often can do so for
entire families of functions where critical numbers and possible inflection points depend on arbitrary constants. These
sign charts then reveal where members of the family are increasing or decreasing, concave up or concave down, and
help us to identify relative extremes and inflection points.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

3.3: Global Optimization
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What are the differences between finding relative extreme values and global extreme values of a function?
How is the process of finding the global maximum or minimum of a function over the function’s entire domain
different from determining the global maximum or minimum on a restricted domain?
For a function that is guaranteed to have both a global maximum and global minimum on a closed, bounded
interval, what are the possible points at which these extreme values occur?

We have seen that we can use the first derivative of a function to determine where the function is increasing or decreasing,
and the second derivative to know where the function is concave up or concave down. Each of these approaches provides
us with key information that helps us determine the overall shape and behavior of the graph, as well as whether the
function has a relative minimum or relative maximum at a given critical number. Remember that the difference between a
relative maximum and a global maximum is that there is a relative maximum of f at x = p if f(p) ≥ f(x) for all x near p,
while there is a global maximum at p if f(p) ≥ f(x) for all x in the domain of f. For instance,
in Figure 3.3.1, we see a function f that has a global maximum at x = c and a relative maximum at x = a, since f(c) is
greater than f(x) for every other value of x, while f(a) is only greater than the value of f(x) for x near a. Since the function
appears to decrease without bound, f has no global minimum, though clearly f has a relative minimum at x = b.

Figure 3.3.1 : A function f with a global maximum, but no global minimum.

Our emphasis in this section is on finding the global extreme values of a function (if they exist). In so doing, we will either be
interested in the behavior of the function over its entire domain or on some restricted portion. The former situation is
familiar and similar to work that we did in the two preceding sections of the text. We explore this through a particular
example in the following preview activity.

Preview Activity 3.3.1

Let

$$f(x) = 2 + \frac{3}{1 + (x + 1)^2}. \tag{3.3.1}$$

a. Determine all of the critical numbers of f .


b. Construct a first derivative sign chart for f and thus determine all intervals on which f is increasing or decreasing.
c. Does f have a global maximum? If so, why, and what is its value and where is the maximum attained? If not,
explain why.
d. Determine $\lim_{x \to \infty} f(x)$ and $\lim_{x \to -\infty} f(x)$.

Matthew Boelkins, David Austin & Steven Schlicker
3.3.1 11/28/2021 https://math.libretexts.org/@go/page/4308

e. Explain why f (x) > 2 for every value of x.


f. Does f have a global minimum? If so, why, and what is its value and where is the minimum attained? If not,
explain why.

Global Optimization
For the functions in Figure 3.3.1 and Preview Activity 3.3.1, we were interested in finding the global minimum and global
maximum on the entire domain, which turned out to be (−∞, ∞) for each. At other times, our perspective on a function
might be more focused due to some restriction on its domain. For example, rather than considering $f(x) = 2 + \frac{3}{1 + (x + 1)^2}$
for every value of x, perhaps instead we are only interested in those x for which 0 ≤ x ≤ 4, and we would like to know
which values of x in the interval [0, 4] produce the largest possible and smallest possible values of f. We are accustomed
to critical numbers playing a key role in determining the location of extreme values of a function; now, by restricting the
domain to an interval, it makes sense that the endpoints of the interval will also be important to consider, as we see in the
following activity. When limiting ourselves to a particular interval, we will often refer to the absolute maximum or
minimum value, rather than the global maximum or minimum.

Activity 3.3.2

Let

$$g(x) = \frac{1}{3}x^3 - 2x + 2. \tag{3.3.2}$$

a. Find all critical numbers of g that lie in the interval −2 ≤ x ≤ 3 .


b. Use a graphing utility to construct the graph of g on the interval −2 ≤ x ≤ 3 .
c. From the graph, determine the x-values at which the absolute minimum and absolute maximum of g occur on the
interval [−2, 3].
d. How do your answers change if we instead consider the interval −2 ≤ x ≤ 2 ?
e. What if we instead consider the interval −2 ≤ x ≤ 1 ?

In Activity 3.3.1, we saw how the absolute maximum and absolute minimum of a function on a closed, bounded interval
[a, b] depend not only on the critical numbers of the function, but also on the selected values of a and b. These
observations demonstrate several important facts that hold much more generally. First, we state an important result called
the Extreme Value Theorem.

The Extreme Value Theorem

If f is a continuous function on a closed interval [a, b], then f attains both an absolute minimum and absolute
maximum on [a, b]. That is, for some value $x_m$ such that $a \le x_m \le b$, it follows that $f(x_m) \le f(x)$ for all x in
[a, b]. Similarly, there is a value $x_M$ in [a, b] such that $f(x_M) \ge f(x)$ for all x in [a, b]. Letting $m = f(x_m)$ and
$M = f(x_M)$, it follows that $m \le f(x) \le M$ for all x in [a, b].

The Extreme Value Theorem tells us that, provided a function is continuous, on any closed interval [a, b] the function has to
achieve both an absolute minimum and an absolute maximum. Note, however, that this result does not tell us where these
extreme values occur, but rather only that they must exist. As seen in the examples of Activity 3.3.1, it is apparent that the
only possible locations for absolute extremes are the endpoints of the interval or a critical number (the latter being
where a relative minimum or maximum could occur, which is a potential location for an absolute extreme). Thus, we have
the following approach to finding the absolute maximum and minimum of a continuous function f on the interval [a, b]:
find all critical numbers of f that lie in the interval;
evaluate the function f at each critical number in the interval and at each endpoint of the interval;
from among the noted function values, the smallest is the absolute minimum of f on the interval, while the largest is
the absolute maximum.
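
The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the helper name `absolute_extrema` is hypothetical, and we assume the critical numbers have already been found by hand. We apply it to $f(x) = 2 + \frac{3}{1+(x+1)^2}$ on [0, 4], where the only critical number, x = −1, lies outside the interval, so only the endpoints are compared.

```python
def absolute_extrema(f, critical_numbers, a, b):
    """Closed-interval method: compare f at the endpoints and at the
    critical numbers that lie strictly inside [a, b]."""
    candidates = [a, b] + [c for c in critical_numbers if a < c < b]
    values = [(f(x), x) for x in candidates]
    # Each result is a pair (function value, x where it occurs).
    return min(values), max(values)


def f(x):
    return 2 + 3 / (1 + (x + 1) ** 2)


# f'(x) = -6(x + 1) / (1 + (x + 1)^2)^2 is zero only at x = -1
# (computed by hand; this is an input to the sketch, not something it finds).
critical_numbers = [-1.0]

(abs_min, x_min), (abs_max, x_max) = absolute_extrema(f, critical_numbers, 0, 4)
print(f"absolute minimum {abs_min:.4f} at x = {x_min}")
print(f"absolute maximum {abs_max:.4f} at x = {x_max}")
```

Under these assumptions, the absolute maximum on [0, 4] is f(0) = 3.5 and the absolute minimum is f(4) = 2 + 3/26 ≈ 2.1154, both at endpoints.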

Activity 3.3.3

Find the exact absolute maximum and minimum of each function on the stated interval.
a. $h(x) = xe^{-x}$ on [0, 3]
b. $p(t) = \sin(t) + \cos(t)$ on $[-\frac{\pi}{2}, \frac{\pi}{2}]$
c. $q(x) = \frac{x^2}{x-2}$ on [3, 7]
d. $f(x) = 4 - e^{-(x-2)^2}$ on (−∞, ∞)
e. $h(x) = xe^{-ax}$ on $[0, \frac{2}{a}]$ with a > 0
f. $f(x) = b - e^{-(x-a)^2}$ on (−∞, ∞) with a > 0 and b > 0.

One of the big lessons in finding absolute extreme values is the realization that the interval we choose has nearly the same
impact on the problem as the function under consideration. Consider, for instance, the function pictured in Figure 3.3.2. In
sequence, from left to right, as we see the interval under consideration change from [−2, 3] to [−2, 2] to [−2, 1], we move from
having two critical numbers in the interval with the absolute minimum at one critical number and the absolute maximum at
the right endpoint, to still having both critical numbers in the interval but then with the absolute minimum and maximum at
the two critical numbers, to finally having just one critical number in the interval with the absolute maximum at one
critical number and the absolute minimum at one endpoint. It is particularly essential to always remember to only consider
the critical numbers that lie within the interval.

Figure 3.3.2 : A function g considered on three different intervals.
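
The effect of the interval can be checked numerically. The sketch below assumes that the function g in Figure 3.3.2 is the one from Equation 3.3.2, $g(x) = \frac{1}{3}x^3 - 2x + 2$, whose critical numbers are ±√2 because g′(x) = x² − 2. The helper `absolute_extrema` is a hypothetical name introduced only for this illustration.

```python
import math


def g(x):
    return x**3 / 3 - 2 * x + 2


# g'(x) = x^2 - 2, so the critical numbers are -sqrt(2) and sqrt(2).
critical_numbers = [-math.sqrt(2), math.sqrt(2)]


def absolute_extrema(f, critical_numbers, a, b):
    """Compare f at the endpoints and at critical numbers inside [a, b]."""
    candidates = [a, b] + [c for c in critical_numbers if a < c < b]
    values = [(f(x), x) for x in candidates]
    return min(values), max(values)


results = {}
for a, b in [(-2, 3), (-2, 2), (-2, 1)]:
    results[(a, b)] = absolute_extrema(g, critical_numbers, a, b)
    (lo, x_lo), (hi, x_hi) = results[(a, b)]
    print(f"[{a}, {b}]: min {lo:.4f} at x = {x_lo:.4f}, "
          f"max {hi:.4f} at x = {x_hi:.4f}")
```

Only the candidate list changes from one interval to the next; the function and its critical numbers stay the same, which is why the choice of interval matters as much as the function itself.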

Moving Towards Applications


In Section 3.4, we will focus almost exclusively on applied optimization problems: problems where we seek to find the
absolute maximum or minimum value of a function that represents some physical situation. We conclude this current
section with an example of one such problem because it highlights the role that a closed, bounded domain can play in
finding absolute extrema. In addition, these problems often involve considerable preliminary work to develop the function
which is to be optimized, and this example demonstrates that process.

Example 3.3.1: Maximizing and Minimizing Area

A 20 cm piece of wire is cut into two pieces. One piece is used to form a square and the other an equilateral triangle.
How should the wire be cut to maximize the total area enclosed by the square and triangle? To minimize the area?
Solution
We begin by constructing a picture that exemplifies the given situation. The primary variable in the problem is where
we decide to cut the wire. We thus label that point x, and note that the remaining portion of the wire then has length
20 − x. As shown in Figure 3.3.3, we see that the x cm of the wire that are used to form the equilateral triangle result
in a triangle with three sides of length $\frac{x}{3}$.

Figure 3.3.3 : A 20 cm piece of wire cut into two pieces, one of which forms an equilateral triangle, the other which
yields a square.
For the remaining 20 − x cm of wire, the square that results will have each side of length $\frac{20-x}{4}$.
At this point, we note that there are obvious restrictions on x: in particular, 0 ≤ x ≤ 20 . In the extreme cases, all of the
wire is being used to make just one figure. For instance, if x = 0 , then all 20 cm of wire are used to make a square that
is 5 × 5 .
Now, our overall goal is to find the absolute minimum and absolute maximum areas that can be enclosed. We note that
the area of the triangle is

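$$A_\triangle = \frac{1}{2}bh = \frac{1}{2} \cdot \frac{x}{3} \cdot \frac{x\sqrt{3}}{6} \tag{3.3.3}$$

since the height of an equilateral triangle is √3 times half the length of the base. Further, the area of the square is

$$A_\square = s^2 = \left(\frac{20-x}{4}\right)^2. \tag{3.3.4}$$

Therefore, the total area function is

\begin{align}
A &= A_\triangle + A_\square \tag{3.3.5} \\
  &= \frac{\sqrt{3}x^2}{36} + \left(\frac{20-x}{4}\right)^2. \tag{3.3.6}
\end{align}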

Again, note that we are only considering this function on the restricted domain [0, 20] and we seek its absolute
minimum and absolute maximum.
Differentiating A(x) in Equation 3.3.6, we have
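$$A'(x) = \frac{\sqrt{3}x}{18} + 2\left(\frac{20-x}{4}\right)\left(-\frac{1}{4}\right) = \frac{\sqrt{3}}{18}x + \frac{1}{8}x - \frac{5}{2}. \tag{3.3.7}$$

Setting A′(x) = 0, it follows that

$$x = \frac{180}{4\sqrt{3}+9} \approx 11.3007 \tag{3.3.8}$$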

is the only critical number of A , and we note that this lies within the interval [0, 20].
Evaluating A(x) (Equation 3.3.6) at the critical number and endpoints, we see that
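$$\begin{aligned}
A\left(\frac{180}{4\sqrt{3}+9}\right) &= \frac{\sqrt{3}\left(\frac{180}{4\sqrt{3}+9}\right)^2}{36} + \left(\frac{20 - \frac{180}{4\sqrt{3}+9}}{4}\right)^2 \approx 10.8741 \\
A(0) &= 25 \\
A(20) &= \frac{\sqrt{3}}{36}(400) = \frac{100}{9}\sqrt{3} \approx 19.2450
\end{aligned} \tag{3.3.9}$$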

Thus, the absolute minimum occurs when x ≈ 11.3007 and results in the minimum area of approximately 10.8741
square centimeters, while the absolute maximum occurs when we invest all of the wire in the square (and none in the
triangle), resulting in 25 square centimeters of area. These results are confirmed by a plot of y = A(x) on the interval
[0, 20], as shown in Figure 3.3.4.
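
The values in Equation 3.3.9 can be reproduced with a short numerical check. The sketch below is illustrative only; the function name `area` and the variable names are our own, and the critical number is taken directly from Equation 3.3.8 rather than solved for by the code.

```python
import math


def area(x):
    """Total area when x cm goes to the triangle and 20 - x cm to the square."""
    triangle = math.sqrt(3) * x**2 / 36
    square = ((20 - x) / 4) ** 2
    return triangle + square


# Critical number from Equation 3.3.8.
x_crit = 180 / (4 * math.sqrt(3) + 9)

# Candidates for the absolute extremes on [0, 20].
candidates = {"x = 0": 0.0, "critical number": x_crit, "x = 20": 20.0}
for label, x in candidates.items():
    print(f"{label:16s} x = {x:8.4f}  A(x) = {area(x):8.4f}")

x_min = min(candidates.values(), key=area)
x_max = max(candidates.values(), key=area)
print(f"minimum area at x = {x_min:.4f}, maximum area at x = {x_max:.4f}")
```

Comparing the three candidate values confirms that the minimum occurs at the critical number and the maximum at the endpoint x = 0.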

Figure 3.3.4 : A plot of the area function from Example 3.3.1.

Activity 3.3.4

A piece of cardboard that is 10 × 15 (each measured in inches) is being made into a box without a top. To do so,
squares are cut from each corner of the cardboard and the remaining sides are folded up. If the box needs to be at least 1 inch
deep and no more than 3 inches deep, what is the maximum possible volume of the box? What is the minimum
volume? Justify your answers using calculus.
a. Draw a labeled diagram that shows the given information. What variable should we introduce to represent the
choice we make in creating the box? Label the diagram appropriately with the variable, and write a sentence to
state what the variable represents.
b. Determine a formula for the function V (that depends on the variable in (a)) that tells us the volume of the box.
c. What is the domain of the function V ? That is, what values of x make sense for input? Are there additional
restrictions provided in the problem?
d. Determine all critical numbers of the function V .
e. Evaluate V at each of the endpoints of the domain and at any critical numbers that lie in the domain.
f. What is the maximum possible volume of the box? The minimum?

The approaches shown in Example 3.3.1 and experienced in Activity 3.3.4 include standard steps that we undertake in
almost every applied optimization problem: we draw a picture to demonstrate the situation, introduce one or more
variables to represent quantities that are changing, work to find a function that models the quantity to be optimized, and
then decide an appropriate domain for that function. Once that work is done, we are in the familiar situation of finding the
absolute minimum and maximum of a function over a particular domain, at which time we apply the calculus ideas that we
have been studying to this point in Chapter 3.

Summary
In this section, we encountered the following important ideas:
To find relative extreme values of a function, we normally use a first derivative sign chart and classify all of the
function’s critical numbers. If instead we are interested in absolute extreme values, we first decide whether we are
considering the entire domain of the function or a particular interval.
In the case of finding global extremes over the function’s entire domain, we again use a first or second derivative sign
chart in an effort to make overall conclusions about whether or not the function can have an absolute maximum or
minimum. If we are working to find absolute extremes on a restricted interval, then we first identify all critical numbers
of the function that lie in the interval.
For a continuous function on a closed, bounded interval, the only possible points at which absolute extreme values
occur are the critical numbers and the endpoints. Thus, to find said absolute extremes, we simply evaluate the function
at each endpoint and each critical number in the interval, and then we compare the results to decide which is largest
(the absolute maximum) and which is smallest (the absolute minimum).

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

3.4: Applied Optimization
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
In a setting where a situation is described for which optimal parameters are sought, how do we develop a function
that models the situation and use calculus to find the desired maximum or minimum?

Near the conclusion of Section 3.3, we considered two examples of optimization problems where determining the function
to be optimized was part of a broader question. In Example 3.3.1, we sought to use a single piece of wire to build two
geometric figures (an equilateral triangle and square) and to understand how various choices for how to cut the wire led to
different values of the area enclosed. One of our conclusions was that in order to maximize the total combined area
enclosed by the triangle and square, all of the wire must be used to make a square. In the subsequent Activity 3.3.4, we
investigated how the volume of a box constructed from a piece of cardboard by removing squares from each corner and
folding up the sides depends on the size of the squares removed.
Both of these problems exemplify situations where there is not a function explicitly provided to optimize. Rather, we first
worked to understand the given information in the problem, drawing a figure and introducing variables, and then sought to
develop a formula for a function that models the quantity (area or volume, in the two examples, respectively) to be
optimized. Once the function was established, we then considered what domain was appropriate on which to pursue the
desired absolute minimum or maximum (or both). At this point in the problem, we are finally ready to apply the ideas of
calculus to determine and justify the absolute minimum or maximum. Thus, what is primarily different about problems of
this type is that the problem-solver must do considerable work to introduce variables and develop the correct function and
domain to represent the described situation.
Throughout what follows in the current section, the primary emphasis is on the reader solving problems. Initially, some
substantial guidance is provided, with the problems progressing to require greater independence as we move along.

Preview Activity 3.4.1

According to U.S. postal regulations, the girth plus the length of a parcel sent by mail may not exceed 108 inches,
where by “girth” we mean the perimeter of the smallest end. What is the largest possible volume of a rectangular
parcel with a square end that can be sent by mail? What are the dimensions of the package of largest volume?

Figure 3.4.1 : A rectangular parcel with a square end.


a. Let x represent the length of one side of the square end and y the length of the longer side. Label these quantities
appropriately on the image shown in Figure 3.4.1.
b. What is the quantity to be optimized in this problem? Find a formula for this quantity in terms of x and y .
c. The problem statement tells us that the parcel’s girth plus length may not exceed 108 inches. In order to maximize
volume, we assume that we will actually need the girth plus length to equal 108 inches. What equation does this
produce involving x and y ?
d. Solve the equation you found in (c) for one of x or y (whichever is easier).
e. Now use your work in (b) and (d) to determine a formula for the volume of the parcel so that this formula is a
function of a single variable.

f. Over what domain should we consider this function? Note that both x and y must be positive; how does the
constraint that girth plus length is 108 inches produce intervals of possible values for x and y ?
g. Find the absolute maximum of the volume of the parcel on the domain you established in (f) and hence also
determine the dimensions of the box of greatest volume. Justify that you’ve found the maximum using calculus.

More Applied Optimization Problems


Many of the steps in Preview Activity 3.4.1 are ones that we will execute in any applied optimization problem. We briefly
summarize those here to provide an overview of our approach in subsequent questions.
Draw a picture and introduce variables. It is essential to first understand what quantities are allowed to vary in the
problem and then to represent those values with variables. Constructing a figure with the variables labeled is almost
always an essential first step. Sometimes drawing several diagrams can be especially helpful to get a sense of the
situation. A nice example of this can be seen at http://gvsu.edu/s/99, where the choice of where to bend a piece of wire
into the shape of a rectangle determines both the rectangle’s shape and area.
Identify the quantity to be optimized as well as any key relationships among the variable quantities. Essentially this
step involves writing equations that involve the variables that have been introduced: one to represent the quantity
whose minimum or maximum is sought, and possibly others that show how multiple variables in the problem may be
interrelated.
Determine a function of a single variable that models the quantity to be optimized; this may involve using other
relationships among variables to eliminate one or more variables in the function formula. For example, in Preview
Activity 3.4.1, we initially found that $V = x^2 y$, but then the additional relationship that 4x + y = 108 (girth plus
length equals 108 inches) allows us to relate x and y and thus observe equivalently that y = 108 − 4x. Substituting for
y in the volume equation yields $V(x) = x^2(108 - 4x)$, and thus we have written the volume as a function of the single
variable x.
Decide the domain on which to consider the function being optimized. Often the physical constraints of the problem
will limit the possible values that the independent variable can take on. Thinking back to the diagram describing the
overall situation and any relationships among variables in the problem often helps identify the smallest and largest
values of the input variable.
Use calculus to identify the absolute maximum and/or minimum of the quantity being optimized. This always involves
finding the critical numbers of the function first. Then, depending on the domain, we either construct a first derivative
sign chart (for an open or unbounded interval) or evaluate the function at the endpoints and critical numbers (for a
closed, bounded interval), using ideas we’ve studied so far in Chapter 3.
Finally, we make certain we have answered the question: does the question seek the absolute maximum of a quantity, or
the values of the variables that produce the maximum? That is, finding the absolute maximum volume of a parcel is
different from finding the dimensions of the parcel that produce the maximum.
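
As a brief illustration of the last two steps, consider the parcel function $V(x) = x^2(108 - 4x)$ from Preview Activity 3.4.1. The computation below is a sketch under one stated assumption: we take the domain to be the closed interval [0, 27], allowing the degenerate parcels with x = 0 or y = 108 − 4x = 0 so that the closed-interval method applies.

```latex
\begin{align*}
V(x) &= x^2(108 - 4x) = 108x^2 - 4x^3, \qquad 0 \le x \le 27, \\
V'(x) &= 216x - 12x^2 = 12x(18 - x), \\
V'(x) &= 0 \iff x = 0 \ \text{or} \ x = 18, \\
V(0) &= 0, \qquad V(18) = 18^2 \cdot 36 = 11664, \qquad V(27) = 0.
\end{align*}
```

Under this assumption, the absolute maximum volume is 11664 cubic inches, and it is produced by the dimensions x = 18 inches and y = 108 − 4(18) = 36 inches. Stating both the value and the dimensions reflects the distinction made in the final step above.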

Activity 3.4.1: Soup Prices

A soup can in the shape of a right circular cylinder is to be made from two materials. The material for the side of the
can costs $0.015 per square inch and the material for the lids costs $0.027 per square inch. Suppose that we desire to
construct a can that has a volume of 16 cubic inches. What dimensions minimize the cost of the can?
a. Draw a picture of the can and label its dimensions with appropriate variables.
b. Use your variables to determine expressions for the volume, surface area, and cost of the can.
c. Determine the total cost function as a function of a single variable. What is the domain on which you should
consider this function?
d. Find the absolute minimum cost and the dimensions that produce this value.

Familiarity with common geometric formulas is particularly helpful in problems like the one in Activity 3.4.1. Sometimes
those involve perimeter, area, volume, or surface area. At other times, the constraints of a problem introduce right triangles
(where the Pythagorean Theorem applies) or other functions whose formulas provide relationships among variables
present.

Activity 3.4.2: Optimized Hikes I

A hiker starting at a point P on a straight road walks east towards point Q, which is on the road and 3 kilometers from
point P. Two kilometers due north of point Q is a cabin. The hiker will walk down the road for a while, at a pace of 8
kilometers per hour. At some point Z between P and Q, the hiker leaves the road and makes a straight line towards the
cabin through the woods, hiking at a pace of 3 kph, as pictured in Figure 3.4.2. In order to minimize the time to go
from P to Z to the cabin, where should the hiker turn into the forest?

Figure 3.4.2 : A hiker walks from P to Z to the cabin, as pictured.

In more geometric problems, we often use curves or functions to provide natural constraints. For instance, we could
investigate which isosceles triangle that circumscribes a unit circle has the smallest area, which you can explore for
yourself at http://gvsu.edu/s/9b. Or similarly, for a region bounded by a parabola, we might seek the rectangle of largest
area that fits beneath the curve, as shown at http://gvsu.edu/s/9c. The next activity is similar to the latter problem.

Activity 3.4.2: Optimized Hikes II

Consider the region in the x-y plane that is bounded by the x-axis and the function $f(x) = 25 - x^2$. Construct a
rectangle whose base lies on the x-axis and is centered at the origin, and whose sides extend vertically until they
intersect the curve $y = 25 - x^2$. Which such rectangle has the maximum possible area? Which such rectangle has the
greatest perimeter? Which has the greatest combined perimeter and area? (Challenge: answer the same questions in
terms of positive parameters a and b for the function $f(x) = b - ax^2$.)

Activity 3.4.3: Maximized Volume

A trough is being constructed by bending a 4 × 24 (measured in feet) rectangular piece of sheet metal. Two symmetric
folds 2 feet apart will be made parallel to the longest side of the rectangle so that the trough has cross-sections in the
shape of a trapezoid, as pictured in Figure 3.4.3. At what angle should the folds be made to produce the trough of
maximum volume?

Figure 3.4.3 : A cross-section of the trough formed by folding to an angle of θ.

Summary
In this section, we encountered the following important ideas:
While there is no single algorithm that works in every situation where optimization is used, in most of the problems we
consider, the following steps are helpful: draw a picture and introduce variables; identify the quantity to be optimized
and find relationships among the variables; determine a function of a single variable that models the quantity to be
optimized; decide the domain on which to consider the function being optimized; use calculus to identify the absolute
maximum and/or minimum of the quantity being optimized.

Contributors and Attributions
Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

3.5: Related Rates

Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
If two quantities that are related, such as the radius and volume of a spherical balloon, are both changing as implicit
functions of time, how are their rates of change related? That is, how does the relationship between the values of
the quantities affect the relationship between their respective derivatives with respect to time?

In most of our applications of the derivative so far, we have worked in settings where one quantity (often called y ) depends
explicitly on another (say x), and in some way we have been interested in the instantaneous rate at which y changes with
respect to x, leading us to compute $\frac{dy}{dx}$. These settings emphasize how the derivative enables us to quantify how the
quantity y is changing as x changes at a given x-value. We are next going to consider situations where multiple quantities
are related to one another and changing, but where each quantity can be considered an implicit function of the variable t ,
which represents time. Through knowing how the quantities are related, we will be interested in determining how their
respective rates of change with respect to time are related. For example, suppose that air is being pumped into a spherical
balloon in such a way that its volume increases at a constant rate of 20 cubic inches per second. It makes sense that since
the balloon’s volume and radius are related, by knowing how fast the volume is changing, we ought to be able to relate this
rate to how fast the radius is changing. More specifically, can we find how fast the radius of the balloon is increasing at the
moment the balloon’s diameter is 12 inches? The following preview activity leads you through the steps to answer this
question.

Preview Activity 3.5.1

A spherical balloon is being inflated at a constant rate of 20 cubic inches per second. How fast is the radius of the
balloon changing at the instant the balloon’s diameter is 12 inches? Is the radius changing more rapidly when d = 12
or when d = 16 ? Why?
a. Draw several spheres with different radii, and observe that as volume changes, the radius, diameter, and surface
area of the balloon also change.
b. Recall that the volume of a sphere of radius r is
$$V = \frac{4}{3}\pi r^3. \tag{3.5.1}$$
Note well that in the setting of this problem, both V and r are changing as time t changes, and thus both V
and r may be viewed as implicit functions of t, with respective derivatives $\frac{dV}{dt}$ and $\frac{dr}{dt}$. Differentiate both sides of
Equation 3.5.1 with respect to t (using the chain rule on the right) to find a formula for $\frac{dV}{dt}$ that depends on both
r and $\frac{dr}{dt}$.
c. At this point in the problem, by differentiating we have “related the rates” of change of V and r. Recall that we are
given in the problem that the balloon is being inflated at a constant rate of 20 cubic inches per second. Is this rate
the value of $\frac{dr}{dt}$ or $\frac{dV}{dt}$? Why?
d. From part (c), we know the value of $\frac{dV}{dt}$ at every value of t. Next, observe that when the diameter of the balloon is
12, we know the value of the radius. In the equation
$$\frac{dV}{dt} = 4\pi r^2 \frac{dr}{dt},$$
substitute these values for the relevant quantities and solve for the remaining unknown quantity, which is $\frac{dr}{dt}$.
How fast is the radius changing at the instant d = 12?
e. How is the situation different when d = 16 ? When is the radius changing more rapidly, when d = 12 or when
d = 16 ?


Related Rates Problems
In problems where two or more quantities can be related to one another, and all of the variables involved can be viewed as
implicit functions of time, t , we are often interested in how the rates of change of the individual quantities with respect to
time are themselves related; we call these related rates problems. Often these problems involve identifying one or more
key underlying geometric relationships to relate the variables involved. Once we have an equation establishing the
fundamental relationship among variables, we differentiate implicitly with respect to time to find connections among the
rates of change.
For example, consider the situation where sand is being dumped by a conveyor belt on a pile so that the sand forms a right
circular cone, as pictured in Figure 3.5.1. As sand falls from the conveyor belt onto the top of the pile, obviously several
features of the sand pile will change: the volume of the pile will grow, the height will increase, and the radius will get
bigger, too. All of these quantities are related to one another, and the rate at which each is changing is related to the rate at
which sand falls from the conveyor.

Figure 3.5.1 : A conical pile of sand.


The first key steps in any related rates problem involve identifying which variables are changing and how they are related.
In the current problem involving a conical pile of sand, we observe that the radius and height of the pile are related to the
volume of the pile by the standard equation for the volume of a cone,
$$V = \frac{1}{3}\pi r^2 h. \tag{3.5.2}$$

Viewing each of V , r, and h as functions of t , we can differentiate implicitly to determine an equation that relates their
respective rates of change. Taking the derivative of each side of the equation with respect to t,
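$$\frac{d}{dt}[V] = \frac{d}{dt}\left[\frac{1}{3}\pi r^2 h\right]. \tag{3.5.3}$$

On the left, $\frac{d}{dt}[V]$ is simply $\frac{dV}{dt}$. On the right, the situation is more complicated, as both r and h are implicit functions of
t, hence we have to use the product and chain rules. Doing so, we find that

\begin{align}
\frac{dV}{dt} &= \frac{d}{dt}\left[\frac{1}{3}\pi r^2 h\right] \tag{3.5.4} \\
&= \frac{1}{3}\pi r^2 \frac{d}{dt}[h] + \frac{1}{3}\pi h \frac{d}{dt}[r^2] \tag{3.5.5} \\
&= \frac{1}{3}\pi r^2 \frac{dh}{dt} + \frac{1}{3}\pi h \cdot 2r \frac{dr}{dt} \tag{3.5.6}
\end{align}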

Note particularly how we are using ideas from Section 2.7 on implicit differentiation. There we found that when y is an
implicit function of x,
$$\frac{d}{dx}[y^2] = 2y\frac{dy}{dx}. \tag{3.5.7}$$

The exact same thing is occurring here when we compute

$$\frac{d}{dt}[r^2] = 2r\frac{dr}{dt}. \tag{3.5.8}$$

With our arrival at the equation

$$\frac{dV}{dt} = \frac{1}{3}\pi r^2\frac{dh}{dt} + \frac{2}{3}\pi rh\frac{dr}{dt}, \tag{3.5.9}$$
we have now related the rates of change of V , h , and r. If we are given sufficient information, we may then find the value
of one or more of these rates of change at one or more points in time. Say, for instance, that we know the following:
a. sand falls from the conveyor in such a way that the height of the pile is always half the radius, and
b. sand falls from the conveyor belt at a constant rate of 10 cubic feet per minute.
With this information given, we can answer questions such as: how fast is the height of the sandpile changing at the
moment the radius is 4 feet? The information that the height is always half the radius tells us that for all values of t,
$h = \frac{1}{2}r$. Differentiating with respect to t, it follows that

$$\frac{dh}{dt} = \frac{1}{2}\frac{dr}{dt}. \tag{3.5.10}$$

These relationships enable us to relate $\frac{dV}{dt}$ exclusively to just one of r or h. Substituting the expressions involving r and
$\frac{dr}{dt}$ for h and $\frac{dh}{dt}$, we now have that

$$\frac{dV}{dt} = \frac{1}{3}\pi r^2 \cdot \frac{1}{2}\frac{dr}{dt} + \frac{2}{3}\pi r \cdot \frac{1}{2}r \cdot \frac{dr}{dt}. \tag{3.5.11}$$

Since sand falls from the conveyor at the constant rate of 10 cubic feet per minute, this tells us the value of $\frac{dV}{dt}$, the rate at
which the volume of the sand pile changes. In particular, $\frac{dV}{dt} = 10$ ft³/min. Furthermore, since we are interested in how
fast the height of the pile is changing at the instant r = 4, we use the value r = 4 along with $\frac{dV}{dt} = 10$ in Equation
3.5.11, and hence find that

$$10 = \frac{1}{3}\pi 4^2 \cdot \frac{1}{2}\left.\frac{dr}{dt}\right|_{r=4} + \frac{2}{3}\pi 4 \cdot \frac{1}{2} \cdot 4 \cdot \left.\frac{dr}{dt}\right|_{r=4} = \frac{8}{3}\pi\left.\frac{dr}{dt}\right|_{r=4} + \frac{16}{3}\pi\left.\frac{dr}{dt}\right|_{r=4}. \tag{3.5.12}$$

With only the value of $\left.\frac{dr}{dt}\right|_{r=4}$ remaining unknown, we solve Equation 3.5.12 for $\left.\frac{dr}{dt}\right|_{r=4}$ and find that

$$10 = 8\pi\left.\frac{dr}{dt}\right|_{r=4} \tag{3.5.13}$$

so that

$$\left.\frac{dr}{dt}\right|_{r=4} = \frac{10}{8\pi} \approx 0.39789 \text{ feet per minute.} \tag{3.5.14}$$

Because we were interested in how fast the height of the pile was changing at this instant, we want to know $\frac{dh}{dt}$ when
r = 4. Since $\frac{dh}{dt} = \frac{1}{2}\frac{dr}{dt}$ for all values of t, it follows that

$$\left.\frac{dh}{dt}\right|_{r=4} = \frac{5}{8\pi} \approx 0.19894 \text{ feet per minute.} \tag{3.5.15}$$

Note particularly how we distinguish between the notations $\frac{dr}{dt}$ and $\left.\frac{dr}{dt}\right|_{r=4}$. The former represents the rate of change of r
with respect to t at an arbitrary value of t, while the latter is the rate of change of r with respect to t at a particular
moment, in fact the moment r = 4. While we do not know the exact value of t, because information is provided about the
value of r, it is important to distinguish that we are using this more specific data.
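
The arithmetic in the sandpile computation can be verified with a few lines of Python. This is a minimal sketch with variable names of our own choosing; it simply evaluates Equation 3.5.11 at r = 4 with dV/dt = 10. Because the volume rate is given in cubic feet per minute and lengths are in feet, the resulting rates are in feet per minute.

```python
import math

dV_dt = 10.0   # cubic feet per minute, given in the problem
r = 4.0        # feet, the instant of interest

# With h = r/2 and dh/dt = (1/2) dr/dt, Equation 3.5.11 reads
# dV/dt = (1/6) pi r^2 dr/dt + (1/3) pi r^2 dr/dt.
coefficient = math.pi * r**2 / 6 + math.pi * r**2 / 3

dr_dt = dV_dt / coefficient   # solve for dr/dt at r = 4
dh_dt = dr_dt / 2             # from dh/dt = (1/2) dr/dt

print(f"dr/dt at r = 4: {dr_dt:.5f} feet per minute")
print(f"dh/dt at r = 4: {dh_dt:.5f} feet per minute")
```

The coefficient evaluates to 8π, so the printed values are 0.39789 and 0.19894, matching 10/(8π) and 5/(8π).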

The relationship between h and r, with $h = \frac{1}{2}r$ for all values of t, enables us to transition easily between questions
involving r and h. Indeed, had we known this information at the problem’s outset, we could have immediately simplified
our work. Using $h = \frac{1}{2}r$, it follows that since $V = \frac{1}{3}\pi r^2 h$, we can write V solely in terms of r to have

$$V = \frac{1}{3}\pi r^2\left(\frac{1}{2}r\right) = \frac{1}{6}\pi r^3. \tag{3.5.16}$$

Differentiating Equation 3.5.16 with respect to t implies

$$\frac{dV}{dt} = \frac{1}{2}\pi r^2\frac{dr}{dt}, \tag{3.5.17}$$

from which the same conclusions made earlier about $\frac{dr}{dt}$ and $\frac{dh}{dt}$ can be made.
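
To make that last claim concrete, here is the substitution written out as a short worked step. It uses only the given rate of 10 cubic feet per minute, the instant r = 4, and the relationship that the height is always half the radius.

```latex
\begin{align*}
10 &= \frac{1}{2}\pi (4)^2 \left.\frac{dr}{dt}\right|_{r=4}
    = 8\pi \left.\frac{dr}{dt}\right|_{r=4}, \\
\left.\frac{dr}{dt}\right|_{r=4} &= \frac{10}{8\pi} \approx 0.39789, \qquad
\left.\frac{dh}{dt}\right|_{r=4} = \frac{1}{2} \cdot \frac{10}{8\pi}
    = \frac{5}{8\pi} \approx 0.19894.
\end{align*}
```

Both rates are in feet per minute, and they agree with the values found from the two-variable equation.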

Our work with the sandpile problem above is similar in many ways to our approach in Preview Activity 3.5.1, and these
steps are typical of most related rates problems. In certain ways, they also resemble work we do in applied optimization
problems, and here we summarize the main approach for consideration in subsequent problems.
• Identify the quantities in the problem that are changing and choose clearly defined variable names for them. Draw one or more figures that clearly represent the situation.
• Determine all rates of change that are known or given and identify the rate(s) of change to be found.
• Find an equation that relates the variables whose rates of change are known to those variables whose rates of change are to be found.
• Differentiate implicitly with respect to $t$ to relate the rates of change of the involved quantities.
• Evaluate the derivatives and variables at the information relevant to the instant at which a certain rate of change is sought. Use proper notation to identify when a derivative is being evaluated at a particular instant, such as $\left. \frac{dr}{dt} \right|_{r=4}$.

In the first step of identifying changing quantities and drawing a picture, it is important to think about the dynamic ways
in which the involved quantities change. Sometimes a sequence of pictures can be helpful; for some already-drawn
pictures that can be easily modified as applets built in Geogebra, see the following links² which represent
• how a circular oil slick’s area grows as its radius increases http://gvsu.edu/s/9n;
• how the location of the base of a ladder and its height along a wall change as the ladder slides http://gvsu.edu/s/9o;
• how the water level changes in a conical tank as it fills with water at a constant rate http://gvsu.edu/s/9p (compare the problem in Activity 3.5.2);
• how a skateboarder’s shadow changes as he moves past a lamppost http://gvsu.edu/s/9q.

Drawing well-labeled diagrams and envisioning how different parts of the figure change is a key part of understanding
related rates problems and being successful at solving them.

² We again refer to the work of Prof. Marc Renault of Shippensburg University, found at http://gvsu.edu/s/5p.
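
As a brief illustration of these steps, consider the oil slick setting, with numbers that are hypothetical and chosen only for illustration: suppose the radius is growing at 2 feet per minute at the instant the radius is 10 feet. The changing quantities are the area $A$ and the radius $r$, which are related by $A = \pi r^2$. Differentiating implicitly with respect to $t$ and then evaluating at the instant of interest gives

\[ \frac{dA}{dt} = 2\pi r \frac{dr}{dt}, \qquad \left. \frac{dA}{dt} \right|_{r=10} = 2\pi (10)(2) = 40\pi \approx 125.7 \ \text{square feet per minute}. \]

Note that the substitution of $r = 10$ and $\frac{dr}{dt} = 2$ happens only after differentiating, since both quantities are changing with time.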

Activity 3.5.2

A water tank has the shape of an inverted circular cone (point down) with a base of radius 6 feet and a depth of 8 feet.
Suppose that water is being pumped into the tank at a constant instantaneous rate of 4 cubic feet per minute.
a. Draw a picture of the conical tank, including a sketch of the water level at a point in time when the tank is not yet
full. Introduce variables that measure the radius of the water’s surface and the water’s depth in the tank, and label
them on your figure.
b. Say that r is the radius and h the depth of the water at a given time, t . What equation relates the radius and height
of the water, and why?
c. Determine an equation that relates the volume of water in the tank at time t to the depth h of the water at that time.
d. Through differentiation, find an equation that relates the instantaneous rate of change of water volume with respect
to time to the instantaneous rate of change of water depth at time t .
e. Find the instantaneous rate at which the water level is rising when the water in the tank is 3 feet deep.
f. When is the water rising most rapidly: at h = 3 , h = 4 , or h = 5 ?

Recognizing familiar geometric configurations is one way that we relate the changing quantities in a given problem. For
instance, while the problem in Activity 3.5.2 is centered on a conical tank, one of the most important observations is that
there are two key right triangles present. In another setting, a right triangle might be indicative of an opportunity to take
advantage of the Pythagorean Theorem to relate the legs of the triangle. But in the conical tank, the fact that the water at
any time fills a portion of the tank in such a way that the ratio of radius to depth is constant turns out to be the most
important relationship with which to work. That enables us to write $r$ in terms of $h$ and reduce the overall problem to one
that involves only one variable, where the volume of water depends simply on $h$, and hence to subsequently relate $\frac{dV}{dt}$ and
$\frac{dh}{dt}$. In other situations where a changing angle is involved, a right triangle may offer the opportunity to find relationships
among various parts of the triangle using trigonometric functions.

Activity 3.5.3: Television Camera

A television camera is positioned 4000 feet from the base of a rocket launching pad. The angle of elevation of the
camera has to change at the correct rate in order to keep the rocket in sight. In addition, the auto-focus of the camera
has to take into account the increasing distance between the camera and the rocket. We assume that the rocket rises
vertically. (A similar problem is discussed and pictured dynamically at http://gvsu.edu/s/9t. Exploring the applet at the
link will be helpful to you in answering the questions that follow.)
a. Draw a figure that summarizes the given situation. What parts of the picture are changing? What parts are
constant? Introduce appropriate variables to represent the quantities that are changing.
b. Find an equation that relates the camera’s angle of elevation to the height of the rocket, and then find an equation
that relates the instantaneous rate of change of the camera’s elevation angle to the instantaneous rate of change of
the rocket’s height (where all rates of change are with respect to time).
c. Find an equation that relates the distance from the camera to the rocket to the rocket’s height, as well as an equation
that relates the instantaneous rate of change of distance from the camera to the rocket to the instantaneous rate of
change of the rocket’s height (where all rates of change are with respect to time).
d. Suppose that the rocket’s speed is 600 ft/sec at the instant it has risen 3000 feet. How fast is the distance from the
television camera to the rocket changing at that moment? If the camera is following the rocket, how fast is the
camera’s angle of elevation changing at that same moment?
e. If from an elevation of 3000 feet onward the rocket continues to rise at 600 feet/sec, will the rate of change of
distance with respect to time be greater when the elevation is 4000 feet than it was at 3000 feet, or less? Why?

In addition to being able to find instantaneous rates of change at particular points in time, we are often able to make more
general observations about how particular rates themselves will change over time. For instance, when a conical tank (point
down) is filling with water at a constant rate, we naturally intuit that the depth of the water should increase more slowly
over time. Note how carefully we need to speak: we mean to say that while the depth, $h$, of the water is increasing, its rate
of change $\frac{dh}{dt}$ is decreasing (both as a function of $t$ and as a function of $h$). These observations may often be made by
taking the general equation that relates the various rates and solving for one of them, and doing this without substituting
any particular values for known variables or rates. For instance, in the conical tank problem in Activity 3.5.2, we
established (using $r = \frac{3}{4}h$ from similar triangles) that

\[ \frac{dV}{dt} = \frac{9}{16}\pi h^2 \frac{dh}{dt}, \tag{3.5.18} \]

and hence

\[ \frac{dh}{dt} = \frac{16}{9\pi h^2} \frac{dV}{dt}. \tag{3.5.19} \]

Provided that $\frac{dV}{dt}$ is constant, it is immediately apparent that as $h$ gets larger, $\frac{dh}{dt}$ will get smaller, while always
remaining positive. Hence, the depth of the water is increasing at a decreasing rate.
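
To see this behavior numerically, here is a minimal sketch for the tank of Activity 3.5.2 (top radius 6 feet, depth 8 feet, filled at 4 cubic feet per minute). By similar triangles $r = \frac{3}{4}h$, so $V = \frac{1}{3}\pi \left(\frac{3}{4}h\right)^2 h$. The function name `depth_rate` and its parameters are illustrative assumptions, not notation from the text.

```python
import math

def depth_rate(h, dV_dt, top_radius=6.0, full_depth=8.0):
    """Rate of change of water depth in an inverted cone (point down).

    Similar triangles give r = k * h with k = top_radius / full_depth, so
    V = (1/3) * pi * k**2 * h**3 and dV/dt = pi * k**2 * h**2 * dh/dt.
    Solving for dh/dt gives the expression returned below.
    """
    k = top_radius / full_depth
    return dV_dt / (math.pi * k**2 * h**2)

# Water is pumped in at a constant 4 cubic feet per minute.
rates = {h: depth_rate(h, dV_dt=4.0) for h in (3.0, 4.0, 5.0)}
for h, rate in rates.items():
    print(f"h = {h} ft: dh/dt = {rate:.4f} ft/min")
```

Every computed rate is positive, and the rates shrink as $h$ grows, which matches the observation that the depth increases at a decreasing rate.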

Activity 3.5.4: Skateboarding

As pictured in the applet at http://gvsu.edu/s/9q, a skateboarder who is 6 feet tall rides under a 15-foot-tall lamppost at
a constant rate of 3 feet per second. We are interested in understanding how fast his shadow is changing at various
points in time.
a. Draw an appropriate right triangle that represents a snapshot in time of the skateboarder, lamppost, and his shadow.
Let x denote the horizontal distance from the base of the lamppost to the skateboarder and s represent the length of
his shadow. Label these quantities, as well as the skateboarder’s height and the lamppost’s height on the diagram.
b. Observe that the skateboarder and the lamppost represent parallel line segments in the diagram, and thus similar
triangles are present. Use similar triangles to establish an equation that relates x and s.
c. Use your work in (b) to find an equation that relates $\frac{dx}{dt}$ and $\frac{ds}{dt}$.
d. At what rate is the length of the skateboarder’s shadow increasing at the instant the skateboarder is 8 feet from the
lamppost?
e. As the skateboarder’s distance from the lamppost increases, is his shadow’s length increasing at an increasing rate,
increasing at a decreasing rate, or increasing at a constant rate?
f. Which is moving more rapidly: the skateboarder or the tip of his shadow? Explain, and justify your answer.

As we progress further into related rates problems, less direction will be provided. In the first three activities of this
section, we have been provided with guided instruction to build a solution in a step by step way. For the closing activity
and the following exercises, most of the detailed work is left to the reader.

Activity 3.5.5: Baseball

A baseball diamond is a square with sides 90 feet long. A batter hits a ball along the third base line and runs to first base. At what rate is the
distance between the ball and first base changing when the ball is halfway to third base, if at that instant the ball is
traveling 100 feet/sec? At what rate is the distance between the ball and the runner changing at the same instant, if at
the same instant the runner is 1/8 of the way to first base running at 30 feet/sec?

Summary
In this section, we encountered the following important ideas:
When two or more related quantities are changing as implicit functions of time, their rates of change can be related by
implicitly differentiating the equation that relates the quantities themselves. For instance, if the sides of a right triangle
are all changing as functions of time, say having lengths $x$, $y$, and $z$, then these quantities are related by the Pythagorean
Theorem: $x^2 + y^2 = z^2$. It follows by implicitly differentiating with respect to $t$ that their rates are related by the
equation

\[ 2x \frac{dx}{dt} + 2y \frac{dy}{dt} = 2z \frac{dz}{dt}, \]

so that if we know the values of $x$, $y$, and $z$ at a particular time, as well as two
of the three rates, we can deduce the value of the third.
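
The last step can be sketched in a few lines of code. The function below solves the differentiated Pythagorean relation for $\frac{dz}{dt}$; the function name and the sample values are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def hypotenuse_rate(x, y, dx_dt, dy_dt):
    """Solve 2x dx/dt + 2y dy/dt = 2z dz/dt for dz/dt, where z = sqrt(x**2 + y**2)."""
    z = math.hypot(x, y)
    return (x * dx_dt + y * dy_dt) / z

# Hypothetical instant: legs x = 3 and y = 4 (so z = 5),
# with x growing at 2 units/sec and y shrinking at 1 unit/sec.
dz_dt = hypotenuse_rate(3.0, 4.0, 2.0, -1.0)
print(dz_dt)  # (3*2 + 4*(-1)) / 5 = 0.4
```

The same relation can be rearranged to solve for $\frac{dx}{dt}$ or $\frac{dy}{dt}$ instead when $\frac{dz}{dt}$ is one of the known rates.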

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

3.E: Using Derivatives (Exercises)
3.1: Using Derivatives to Identify Extreme Values
Q3.1.1
This problem concerns a function about which the following information is known:
f is a differentiable function defined at every real number x
f (0) = −1/2

$y = f'(x)$ has its graph given at center in Figure 3.11

Figure 3.11: At center, a graph of $y = f'(x)$; at left, axes for plotting $y = f(x)$; at right, axes for plotting $y = f''(x)$.
a. Construct a first derivative sign chart for $f$. Clearly identify all critical numbers of $f$, where $f$ is increasing and
decreasing, and where $f$ has local extrema.
b. On the right-hand axes, sketch an approximate graph of $y = f''(x)$.

c. Construct a second derivative sign chart for f . Clearly identify where f is concave up and concave down, as well as all
inflection points.
d. On the left-hand axes, sketch a possible graph of y = f (x).

Q3.1.2
Suppose that $g$ is a differentiable function and $g'(2) = 0$. In addition, suppose that on $1 < x < 2$ and $2 < x < 3$ it is known
that $g'(x)$ is positive.
a. Does $g$ have a local maximum, local minimum, or neither at $x = 2$? Why?
b. Suppose that $g''(x)$ exists for every $x$ such that $1 < x < 3$. Reasoning graphically, describe the behavior of $g''(x)$ for
$x$-values near 2.
c. Besides being a critical number of g, what is special about the value x = 2 in terms of the behavior of the graph of g?

Q3.1.3
Suppose that h is a differentiable function whose first derivative is given by the graph in Figure 3.12.

Figure 3.12: The graph of $y = h'(x)$.


a. How many real number solutions can the equation h(x) = 0 have? Why?
b. If h(x) = 0 has two distinct real solutions, what can you say about the signs of the two solutions? Why?
c. Assume that $\lim_{x \to \infty} h'(x) = 3$, as appears to be indicated in Figure 3.12. How will the graph of $y = h(x)$ appear as
$x \to \infty$? Why?
d. Describe the concavity of y = h(x) as fully as you can from the provided information.

Q3.1.4
Let $p$ be a function whose second derivative is $p''(x) = (x + 1)(x - 2)e^{-x}$.
a. Construct a second derivative sign chart for $p$ and determine all inflection points of $p$.
b. Suppose you also know that $x = \frac{\sqrt{5} - 1}{2}$ is a critical number of $p$. Does $p$ have a local minimum, local maximum, or
neither at $x = \frac{\sqrt{5} - 1}{2}$? Why?
c. If the point $\left(2, \frac{12}{e^2}\right)$ lies on the graph of $y = p(x)$ and $p'(2) = -\frac{5}{e^2}$, find the equation of the tangent line to
$y = p(x)$ at the point where $x = 2$. Does the tangent line lie above the curve, below the curve, or neither at this value? Why?

3.2: Using Derivatives to Describe Families of Functions


Q3.2.1
Consider the one-parameter family of functions given by $p(x) = x^3 - ax^2$, where $a > 0$.
a. Sketch a plot of a typical member of the family, using the fact that each is a cubic polynomial with a repeated zero at
$x = 0$ and another zero at $x = a$.
b. Find all critical numbers of $p$.
c. Compute $p''$ and find all values for which $p''(x) = 0$. Hence construct a second derivative sign chart for $p$.
d. Describe how the location of the critical numbers and the inflection point of p change as a changes. That is, if the value
of a is increased, what happens to the critical numbers and inflection point?

Q3.2.2
Let $q(x) = \frac{e^{-x}}{x - c}$ be a one-parameter family of functions where $c > 0$.
a. Explain why $q$ has a vertical asymptote at $x = c$.
b. Determine $\lim_{x \to \infty} q(x)$ and $\lim_{x \to -\infty} q(x)$.
c. Compute $q'(x)$ and find all critical numbers of $q$.
d. Construct a first derivative sign chart for q and determine whether each critical number leads to a local minimum, local
maximum, or neither for the function q.
e. Sketch a typical member of this family of functions with important behaviors clearly labeled.

Q3.2.3
Let $E(x) = e^{-\frac{(x-m)^2}{2s^2}}$, where $m$ is any real number and $s$ is a positive real number.
a. Compute $E'(x)$ and hence find all critical numbers of $E$.
b. Construct a first derivative sign chart for $E$ and classify each critical number of the function as a local minimum, local
maximum, or neither.
c. It can be shown that $E''(x)$ is given by the formula

\[ E''(x) = e^{-\frac{(x-m)^2}{2s^2}} \left( \frac{(x-m)^2 - s^2}{s^4} \right). \]

Find all values of $x$ for which $E''(x) = 0$.
d. Determine $\lim_{x \to \infty} E(x)$ and $\lim_{x \to -\infty} E(x)$.
e. Construct a labeled graph of a typical function E that clearly shows how important points on the graph of y = E(x)
depend on m and s.

3.3: Global Optimization


Q3.3.1
Based on the given information about each function, decide whether the function has a global maximum, a global
minimum, neither, both, or that it is not possible to say without more information. Assume that each function is twice
differentiable and defined for all real numbers, unless noted otherwise. In each case, write one sentence to explain your
conclusion.
a. $f$ is a function such that $f''(x) < 0$ for every $x$.
b. $g$ is a function with two critical numbers $a$ and $b$ (where $a < b$), and $g'(x) < 0$ for $x < a$, $g'(x) < 0$ for $a < x < b$, and
$g'(x) > 0$ for $x > b$.

c. $h$ is a function with two critical numbers $a$ and $b$ (where $a < b$), and $h'(x) < 0$ for $x < a$, $h'(x) > 0$ for $a < x < b$, and
$h'(x) < 0$ for $x > b$. In addition, $\lim_{x \to \infty} h(x) = 0$ and $\lim_{x \to -\infty} h(x) = 0$.
d. $p$ is a function differentiable everywhere except at $x = a$ and $p''(x) > 0$ for $x < a$ and $p''(x) < 0$ for $x > a$.

Q3.3.2
For each family of functions that depends on one or more parameters, determine the function’s absolute maximum and
absolute minimum on the given interval.
a. $p(x) = x^3 - a^2 x$, $[0, a]$ $(a > 0)$
b. $r(x) = axe^{-bx}$, $\left[\frac{1}{2b}, b\right]$ $(a, b > 0)$
c. $w(x) = a(1 - e^{-bx})$, $[b, 3b]$ $(a, b > 0)$
d. $s(x) = \sin(kx)$, $\left[\frac{\pi}{3k}, \frac{5\pi}{6k}\right]$

Q3.3.3
For each of the functions described below (each continuous on $[a, b]$), state the location of the function’s absolute
maximum and absolute minimum on the interval $[a, b]$, or say there is not enough information provided to make a
conclusion. Assume that any critical numbers mentioned in the problem statement represent all of the critical numbers
the function has in $[a, b]$. In each case, write one sentence to explain your answer.
a. $f'(x) \le 0$ for all $x$ in $[a, b]$
b. $g$ has a critical number at $c$ such that $a < c < b$ and $g'(x) > 0$ for $x < c$ and $g'(x) < 0$ for $x > c$
c. $h(a) = h(b)$ and $h''(x) < 0$ for all $x$ in $[a, b]$
d. $p(a) > 0$, $p(b) < 0$, and for the critical number $c$ such that $a < c < b$, $p'(x) < 0$ for $x < c$ and $p'(x) > 0$ for $x > c$

Q3.3.4
Let $s(t) = 3\sin\left(2\left(t - \frac{\pi}{6}\right)\right) + 5$. Find the exact absolute maximum and minimum of $s$ on the provided intervals by testing
the endpoints and finding and evaluating all relevant critical numbers of $s$.
a. $\left[\frac{\pi}{6}, \frac{7\pi}{6}\right]$
b. $\left[0, \frac{\pi}{2}\right]$
c. $[0, 2\pi]$
d. $\left[\frac{\pi}{3}, \frac{5\pi}{6}\right]$

3.4: Applied Optimization


Q3.4.1
A rectangular box with a square bottom and closed top is to be made from two materials. The material for the side costs
$1.50 per square foot and the material for the bottom costs $3.00 per square foot. If you are willing to spend $15 on the
box, what is the largest volume it can contain? Justify your answer completely using calculus.

Q3.4.2
A farmer wants to start raising cows, horses, goats, and sheep, and desires to have a rectangular pasture for the animals to
graze in. However, no two different kinds of animals can graze together. In order to minimize the amount of fencing she
will need, she has decided to enclose a large rectangular area and then divide it into four equally sized pens by adding three
segments of fence inside the large rectangle that are parallel to two existing sides. She has decided to purchase 7500 ft of
fencing. What is the maximum possible area that each of the four pens will enclose?

Q3.4.3
Two vertical poles of heights 60 ft and 80 ft stand on level ground, with their bases 100 ft apart. A cable is stretched
from the top of one pole to some point on the ground between the poles, and then to the top of the other pole. What is the
minimum possible length of cable required? Justify your answer completely using calculus.

Q3.4.4
A company is designing propane tanks that are cylindrical with hemispherical ends. Assume that the company wants tanks
that will hold 1000 cubic feet of gas, and that the ends are more expensive to make, costing $5 per square foot, while the
cylindrical barrel between the ends costs $2 per square foot. Use calculus to determine the minimum cost to construct such
a tank.

3.5: Related Rates


Q3.5.1
A sailboat is sitting at rest near its dock. A rope attached to the bow of the boat is drawn in over a pulley that stands on a
post on the end of the dock that is 5 feet higher than the bow. If the rope is being pulled in at a rate of 2 feet per second,
how fast is the boat approaching the dock when the length of rope from bow to pulley is 13 feet?

Q3.5.2
A swimming pool is 60 feet long and 25 feet wide. Its depth varies uniformly from 3 feet at the shallow end to 15 feet at
the deep end, as shown in Figure 3.25. Suppose the pool has been emptied and is now being filled with water at a rate of
800 cubic feet per minute. At what rate is the depth of water (measured at the deepest point of the pool) increasing when
it is 5 feet deep at that end? Over time, describe how the depth of the water will increase: at an increasing rate, at a
decreasing rate, or at a constant rate. Explain.

Figure 3.25: The swimming pool described in Exercise 2.

Q3.5.3
A baseball diamond is a square with sides 90 feet long. Suppose a baseball player is advancing from second to third base at
the rate of 24 feet per second, and an umpire is standing on home plate. Let θ be the angle between the third baseline and
the line of sight from the umpire to the runner. How fast is θ changing when the runner is 30 feet from third base?

Q3.5.4
Sand is being dumped off a conveyor belt onto a pile in such a way that the pile forms in the shape of a cone whose radius
is always equal to its height. Assuming that the sand is being dumped at a rate of 10 cubic feet per minute, how fast is the
height of the pile changing when there are 1000 cubic feet on the pile?

CHAPTER OVERVIEW

4: THE DEFINITE INTEGRAL
An Introductory Calculus Libretexts Textmap
Active Calculus
by Matt Boelkins, David Austin, and Steve Schlicker
• Chapter 1

Chapter 1: Understanding the Derivative


1.1: How do we Measure Velocity?
1.2: The Notion of Limit
1.3: The Derivative of a Function at a Point
1.4: The Derivative Function
1.5: Interpreting, Estimating, and Using the Derivative
1.6: The Second Derivative
1.7: Limits, Continuity, and Differentiability
1.8: The Tangent Line Approximation
1.E: Understanding the Derivative (Exercises)

• Chapter 2

Chapter 2: Computing Derivatives


2.1: Elementary Derivative Rules
2.2: The Sine and Cosine Function
2.3: The Product and Quotient Rules
2.4: Derivatives of Other Trigonometric Functions
2.5: The Chain Rule
2.6: Derivatives of Inverse Functions
2.7: Derivatives of Functions Given Implicitly
2.8: Using Derivatives to Evaluate Limits
2.E: Computing Derivatives (Exercises)

• Chapter 3

Chapter 3: Using Derivatives


3.1: Using Derivatives to Identify Extreme Values
3.2: Using Derivatives to Describe Families of Functions
3.3: Global Optimization
3.4: Applied Optimization
3.5: Related Rates
3.E: Using Derivatives (Exercises)

• Chapter 4

Chapter 4: The Definite Integral


4.1: Determining Distance Traveled from Velocity
4.2: Riemann Sums
4.3: The Definite Integral
4.4: The Fundamental Theorem of Calculus
4.E: The Definite Integral (Exercises)

• Chapter 5

Chapter 5: Finding Antiderivatives and Evaluating Integrals


5.1: Constructing Accurate Graphs of Antiderivatives
5.2: The Second Fundamental Theorem of Calculus
5.3: Integration by Substitution
5.4: Integration by Parts
5.5: Other Options for Finding Algebraic Antiderivatives
5.6: Numerical Integration
5.E: Finding Antiderivatives and Evaluating Integrals (Exercises)

• Chapter 6

Chapter 6: Using Definite Integrals


6.1: Using Definite Integrals to Find Area and Length
6.2: Using Definite Integrals to Find Volume
6.3: Density, Mass, and Center of Mass
6.4: Physics Applications: Work, Force, and Pressure
6.5: Improper Integrals
6.E: Using Definite Integrals (Exercises)

• Chapter 7

Chapter 7: Differential Equations


7.1: An Introduction to Differential Equations
7.2: Qualitative Behavior of Solutions to Differential Equations
7.3: Euler's Method
7.4: Separable Differential Equations
7.5: Modeling with Differential Equations
7.6: Population Growth and the Logistic Equation
7.E: Differential Equations (Exercises)

• Chapter 8

Chapter 8: Sequences and Series


8.1: Sequences
8.2: Geometric Series
8.3: Series of Real Numbers
8.4: Alternating Series
8.5: Taylor Polynomials and Taylor Series
8.6: Power Series
8.E: Sequences and Series (Exercises)

4.1: DETERMINING DISTANCE TRAVELED FROM VELOCITY


If we know the velocity of a moving body at every point in a given interval and the velocity is positive throughout, we can estimate the
object’s distance traveled and in some circumstances determine this value exactly. In particular, when velocity is positive on an
interval, we can find the total distance traveled by finding the area under the velocity curve and above the t-axis on the given time
interval. We may only be able to estimate this area, depending on the shape of the velocity curve.

4.2: RIEMANN SUMS


A Riemann sum is simply a sum of products of the form $f(x_i)\Delta x$ that estimates the area between a positive function and the
horizontal axis over a given interval. If the function is sometimes negative on the interval, the Riemann sum estimates the difference
between the areas that lie above the horizontal axis and those that lie below the axis.

4.3: THE DEFINITE INTEGRAL


The Riemann sum of a continuous function provides an estimate of the net signed area bounded by the function and the horizontal
axis on the interval. Increasing the number of subintervals in the Riemann sum improves the accuracy of this estimate, and letting the
number of subintervals increase without bound results in the values of the corresponding Riemann sums approaching the exact value
of the enclosed net signed area.

4.4: THE FUNDAMENTAL THEOREM OF CALCULUS


We can find the exact value of a definite integral without taking the limit of a Riemann sum or using a familiar area formula by
finding the antiderivative of the integrand, and hence applying the Fundamental Theorem of Calculus.

4.E: THE DEFINITE INTEGRAL (EXERCISES)


These are homework exercises to accompany Chapter 4 of Boelkins et al. "Active Calculus" Textmap.

4.1: Determining Distance Traveled from Velocity
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
If we know the velocity of a moving body at every point in a given interval, can we determine the distance the
object has traveled on the time interval?
How is the problem of finding distance traveled related to finding the area under a certain curve?
What does it mean to antidifferentiate a function and why is this process relevant to finding distance traveled?
If velocity is negative, how does this impact the problem of finding distance traveled?

In the very first section of the text, we considered a situation where a moving object had a known position at time $t$. In
particular, we stipulated that a tennis ball tossed into the air had its height $s$ (in feet) at time $t$ (in seconds) given by
$s(t) = 64 - 16(t - 1)^2$. From this starting point, we investigated the average velocity of the ball on a given interval $[a, b]$,
computed by the difference quotient $\frac{s(b) - s(a)}{b - a}$, and eventually found that we could determine the exact instantaneous
velocity of the ball at time $t$ by taking the derivative of the position function,

\[ s'(t) = \lim_{h \to 0} \frac{s(t + h) - s(t)}{h}. \]

Thus, given a differentiable position function, we are able to know the exact velocity of the moving object at any point in
time.

Moreover, from this foundational problem involving position and velocity we have learned a great deal. Given a
differentiable function $f$, we are now able to find its derivative and use this derivative to determine the function’s
instantaneous rate of change at any point in the domain, as well as to find where the function is increasing or decreasing, is
concave up or concave down, and has relative extremes. The vast majority of the problems and applications we have
considered have involved the situation where a particular function is known and we seek information that relies on
knowing the function’s instantaneous rate of change. That is, we have typically proceeded from a function $f$ to its
derivative, $f'$, and then used the meaning of the derivative to help us answer important questions.

In a much smaller number of situations so far, we have encountered the reverse situation where we instead know the
derivative, $f'$, and have tried to deduce information about $f$. It is this particular problem that will be the focus of our
attention in most of Chapter 4: if we know the instantaneous rate of change of a function, are we able to determine the
function itself? To begin, we start with a more focused question: if we know the instantaneous velocity of an object moving
along a straight line path, can we determine its corresponding position function?
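
As a reminder of how the forward direction works, the short sketch below uses the tennis ball's height function from the text and compares average velocities on shrinking intervals with the derivative $s'(t) = -32(t - 1)$, obtained by the usual derivative rules. The function names and the choice of the instant $t = 2$ are assumptions made for illustration.

```python
def s(t):
    """Height of the tennis ball (feet) at time t (seconds)."""
    return 64 - 16 * (t - 1) ** 2

def average_velocity(a, b):
    """Difference quotient (s(b) - s(a)) / (b - a) on the interval [a, b]."""
    return (s(b) - s(a)) / (b - a)

def s_prime(t):
    """Instantaneous velocity, found by differentiating s."""
    return -32 * (t - 1)

# Average velocities on [2, 2 + h] approach s'(2) as h shrinks.
for h in (0.1, 0.01, 0.001):
    print(h, average_velocity(2, 2 + h), s_prime(2))
```

For this particular function the average velocity on $[2, 2 + h]$ simplifies to $-32 - 16h$, so the printed values approach $s'(2) = -32$ as $h$ tends to 0.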

Preview Activity 4.1.1:

Suppose that a person is taking a walk along a long straight path and walks at a constant rate of 3 miles per hour.

Figure 4.1: At left, axes for plotting y = v(t); at right, for plotting y = s(t).
a. On the left-hand axes provided in Figure 4.1, sketch a labeled graph of the velocity function $v(t) = 3$. Note that
while the scale on the two sets of axes is the same, the units on the right-hand axes differ from those on the left.
The right-hand axes will be used in question (d).
b. How far did the person travel during the two hours? How is this distance related to the area of a certain region
under the graph of y = v(t)?
c. Find an algebraic formula, s(t), for the position of the person at time t, assuming that s(0) = 0. Explain your
thinking.
d. On the right-hand axes provided in Figure 4.1, sketch a labeled graph of the position function y = s(t).
e. For what values of t is the position function s increasing? Explain why this is the case using relevant information
about the velocity function v.

Area under the graph of the velocity function

In Preview Activity 4.1.1, we encountered a fundamental fact: when a moving object’s velocity is constant (and positive),
the area under the velocity curve over a given interval tells us the distance the object traveled. As seen at left in
Figure 4.2, if we consider an object moving at 2 miles per hour over the time interval $[1, 1.5]$, then the area $A_1$ of the
shaded region under $y = v(t)$ on $[1, 1.5]$ is

\[ A_1 = 2 \ \frac{\text{miles}}{\text{hour}} \cdot \frac{1}{2} \ \text{hours} = 1 \ \text{mile}. \]

Figure 4.2: At left, a constant velocity function; at right, a non-constant velocity function.

This principle holds in general simply due to the fact that distance equals rate times time, provided the rate is constant.
Thus, if $v(t)$ is constant on the interval $[a, b]$, then the distance traveled on $[a, b]$ is the area $A$ that is given by

\[ A = v(a)(b - a) = v(a)\Delta t, \]

where $\Delta t$ is the change in $t$ over the interval. Note, too, that we could use any value of $v(t)$ on the interval $[a, b]$,
since the velocity is constant; we simply chose $v(a)$, the value at the interval’s left endpoint. For several examples where
the velocity function is piecewise constant, see http://gvsu.edu/s/9T.¹

The situation is obviously more complicated when the velocity function is not constant. At the same time, on relatively
small intervals on which $v(t)$ does not vary much, the area principle allows us to estimate the distance the moving object
travels on that time interval. For instance, for the non-constant velocity function shown at right in Figure 4.2, we see that
on the interval $[1, 1.5]$, velocity varies from $v(1) = 2.5$ down to $v(1.5) \approx 2.1$. Hence, one estimate for distance traveled is
the area of the pictured rectangle,

\[ A_2 = v(1)\Delta t = 2.5 \ \frac{\text{miles}}{\text{hour}} \cdot \frac{1}{2} \ \text{hours} = 1.25 \ \text{miles}. \]

Because $v$ is decreasing on $[1, 1.5]$ and the rectangle lies above the curve, clearly $A_2 = 1.25$ is an over-estimate of the
actual distance traveled.

If we want to estimate the area under the non-constant velocity function on a wider interval, say $[0, 3]$, it becomes
apparent that one rectangle probably will not give a good approximation. Instead, we could use the six rectangles pictured
in Figure 4.3, find the area of each rectangle, and add up the total.

Figure 4.3: Using six rectangles to estimate the area under $y = v(t)$ on $[0, 3]$.

Obviously there are choices to make and issues to understand: how many rectangles should we use? where should we
evaluate the function to decide the rectangle’s height? what happens if velocity is sometimes negative? can we attain the
exact area under any non-constant curve? These questions and more are ones we will study in what follows; for now it
suffices to realize that the simple idea of the area of a rectangle gives us a powerful tool for estimating both distance
traveled from a velocity function as well as the area under an arbitrary curve. To explore the setting of multiple rectangles
to approximate area under a non-constant velocity function, see the applet found at http://gvsu.edu/s/9U.²

¹ Marc Renault, calculus applets.

Matthew Boelkins, David Austin & Steven


4.1.2 11/21/2021 https://math.libretexts.org/@go/page/4313
Schlicker
Activity 4.1.2:

Suppose that a person is walking in such a way that her velocity varies slightly according to the information given in
the table below and graph given in Figure 4.4.

Figure 4.4: The graph of y = v(t).


a. Using the grid, graph, and given data appropriately, estimate the distance traveled by the walker during the two
hour interval from t = 0 to t = 2. You should use time intervals of width $\Delta t = 0.5$, choosing a way to use the function
consistently to determine the height of each rectangle in order to approximate distance traveled.
b. How could you get a better approximation of the distance traveled on [0, 2]? Explain, and then find this new
estimate.
c. Now suppose that you know that v is given by $v(t) = 0.5t^3 - 1.5t^2 + 1.5t + 1.5$. Remember that v is the derivative
of the walker’s position function, s. Find a formula for s so that $s' = v$.
d. Based on your work in (c), what is the value of s(2) − s(0)? What is the meaning of this quantity?

Two approaches: Area and Antidifferentiation


When the velocity of a moving object is positive, the object’s position is always increasing. While we will soon consider situations where velocity is negative and think about the ramifications of this condition on distance traveled, for now we continue to assume that we are working with a positive velocity function. In that setting, we have established that whenever v is actually constant, the exact distance traveled on an interval is the area under the velocity curve; furthermore, we have observed that when v is not constant, we can estimate the total distance traveled by finding the areas of rectangles that help to approximate the area under the velocity curve on the given interval. Hence, we see the importance of the problem of finding the area between a curve and the horizontal axis: besides being an interesting geometric question, in the setting of the curve being the (positive) velocity of a moving object, the area under the curve over an interval tells us the exact distance traveled on the interval. We can estimate this area any time we have a graph of the velocity function or a table of data that tells us some relevant values of the function.

In Activity 4.1.2, we also encountered an alternate approach to finding the distance traveled. In particular, if we know a formula for the instantaneous velocity, y = v(t), of the moving body at time t, then we realize that v must be the derivative of some corresponding position function s. If we can find a formula for s from the formula for v, it follows that we know the position of the object at time t. In addition, under the assumption that velocity is positive, the change in position over a given interval then tells us the distance traveled on that interval.

For a simple example, consider the situation from Preview Activity 4.1.1, where a person is walking along a straight line and has velocity function v(t) = 3 mph.

Figure 4.5: The velocity function v(t) = 3 and corresponding position function s(t) = 3t.

As pictured in Figure 4.5, we see the already noted relationship between area and distance traveled on the left-hand graph of the velocity function. In addition, because the velocity is constant at 3, we know that if s(t) = 3t (implicitly assuming that s(0) = 0), then $s'(t) = 3$, so s(t) = 3t is a function whose derivative is v(t). Furthermore, we now observe that s(1.5) = 4.5 and s(0.25) = 0.75, which are the respective locations of the person at times t = 1.5 and t = 0.25, and therefore s(1.5) − s(0.25) = 4.5 − 0.75 = 3.75 miles. This is not only the change in position on [0.25, 1.5], but also precisely the distance traveled on [0.25, 1.5], which can also be computed by finding the area under the velocity curve over the same interval.

There are profound ideas and connections present in this example that we will spend much of the remainder of Chapter 4 studying and exploring. For now, it is most important to observe that if we are given a formula for a velocity function v, it can be very helpful to find a function s that satisfies $s' = v$. In this context, we say that s is an antiderivative of v. More generally, just as we say that $f'$ is the derivative of f for a given function f, if we are given a function g and G is a function such that $G' = g$, we say that G is an antiderivative of g. For example, if $g(x) = 3x^2 + 2x$, an antiderivative of g is $G(x) = x^3 + x^2$, since $G'(x) = g(x)$. Note that we say “an” antiderivative of g rather than “the” antiderivative of g because $H(x) = x^3 + x^2 + 5$ is also a function whose derivative is g, and thus H is another antiderivative of g.
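
The two claims above (that both G and H are antiderivatives of g, and that the walker's change in position matches the area under v(t) = 3) can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the helper name `central_difference`, the step size, and the sample points are our own illustrative choices.

```python
def g(x):
    """The function from the text: g(x) = 3x^2 + 2x."""
    return 3 * x**2 + 2 * x

def G(x):
    """One antiderivative of g."""
    return x**3 + x**2

def H(x):
    """Another antiderivative of g; it differs from G by the constant 5."""
    return x**3 + x**2 + 5

def central_difference(F, x, h=1e-5):
    """Approximate F'(x) with a symmetric difference quotient (hypothetical helper)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# At several sample points, the slopes of G and H both agree with g.
sample_points = [-2.0, -0.5, 0.0, 1.0, 3.0]
max_error_G = max(abs(central_difference(G, x) - g(x)) for x in sample_points)
max_error_H = max(abs(central_difference(H, x) - g(x)) for x in sample_points)
print(max_error_G, max_error_H)  # both are tiny

# The walker with v(t) = 3 and s(t) = 3t on [0.25, 1.5]:
def s(t):
    return 3 * t

change_in_position = s(1.5) - s(0.25)      # 4.5 - 0.75
area_under_velocity = 3 * (1.5 - 0.25)     # height times width of the rectangle
print(change_in_position, area_under_velocity)
```

A numerical check like this does not prove that $G' = g$, but it is a quick way to catch an algebra mistake when finding an antiderivative by hand.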

Activity 4.1.3:

A ball is tossed vertically in such a way that its velocity function is given by v(t) = 32 − 32t, where t is measured in
seconds and v in feet per second. Assume that this function is valid for 0 ≤ t ≤ 2.
a. For what values of t is the velocity of the ball positive? What does this tell you about the motion of the ball on this
interval of time values?
b. Find an antiderivative, s, of v that satisfies s(0) = 0.
c. Compute the value of $s(1) - s(\tfrac{1}{2})$. What is the meaning of the value you find?
d. Using the graph of y = v(t) provided in Figure 4.6, find the exact area of the region under the velocity curve
between $t = \tfrac{1}{2}$ and t = 1. What is the meaning of the value you find?
e. Answer the same questions as in (c) and (d) but instead using the interval [0, 1].
f. What is the value of s(2) − s(0)? What does this result tell you about the flight of the ball? How is this value
connected to the provided graph of y = v(t)? Explain.

Note on the position function s(t) = 3t used with Figure 4.5: there we made the implicit assumption that s(0) = 0; we will
further discuss the different possibilities for values of s(0) in subsequent study.

Figure 4.6: The graph of y = v(t).

When velocity is negative

Most of our work in this section has occurred under the assumption that velocity is positive. This hypothesis guarantees that the movement of the object under consideration is always in a single direction, and hence ensures that the moving body’s change in position is the same as the distance it travels on a given interval. As we saw in Activity 4.1.3, there are natural settings in which a moving object’s velocity is negative; we would like to understand this scenario fully as well.

Consider a simple example where a person goes for a walk on a beach along a stretch of very straight shoreline that runs east-west. We can naturally assume that their initial position is s(0) = 0, and further stipulate that their position function increases as they move east from their starting location. For instance, a position of s = 1 mile represents being one mile east of the start location, while s = −1 tells us the person is one mile west of where they began walking on the beach.

Now suppose the person walks in the following manner. From the outset at t = 0, the person walks due east at a constant rate of 3 mph for 1.5 hours. After 1.5 hours, the person stops abruptly and begins walking due west at the constant rate of 4 mph and does so for 0.5 hours. Then, after another abrupt stop and start, the person resumes walking at a constant rate of 3 mph to the east for one more hour. What is the total distance the person traveled on the time interval t = 0 to t = 3? What is the person’s total change in position over that time?

On one hand, these are elementary questions to answer because the velocity involved is constant on each interval. From t = 0 to t = 1.5, the person traveled

$$D_{[0,1.5]} = 3 \text{ miles per hour} \cdot 1.5 \text{ hours} = 4.5 \text{ miles}.$$

Similarly, on t = 1.5 to t = 2, having a different rate, the distance traveled is

$$D_{[1.5,2]} = 4 \text{ miles per hour} \cdot 0.5 \text{ hours} = 2 \text{ miles}.$$

Finally, a similar calculation reveals that in the final hour, the person walked

$$D_{[2,3]} = 3 \text{ miles per hour} \cdot 1 \text{ hour} = 3 \text{ miles},$$

so the total distance traveled is

$$D = D_{[0,1.5]} + D_{[1.5,2]} + D_{[2,3]} = 4.5 + 2 + 3 = 9.5 \text{ miles}.$$

Since the velocity on 1.5 < t < 2 is actually v = −4, being negative to indicate motion in the westward direction, this tells us that the person first walked 4.5 miles east, then 2 miles west, followed by 3 more miles east. Thus, the walker’s total change in position is

change in position = 4.5 − 2 + 3 = 5.5 miles.

While we have been able to answer these questions fairly easily, it is also important to think about this problem graphically so that we can generalize our solution to the more complicated setting when velocity is not constant, as well as to note the particular impact that negative velocity has.

Figure 4.7: At left, the velocity function of the person walking; at right, the corresponding position function.

In Figure 4.7, we see how the distances we computed above can be viewed as areas: $A_1 = 4.5$ comes from taking rate times time (3 · 1.5), as do $A_2$ and $A_3$ for the second and third rectangles. The big new issue is that while $A_2$ is an area (and is therefore positive), because this area involves an interval on which the velocity function is negative, its area has a negative sign associated with it. This helps us to distinguish between distance traveled and change in position. The distance traveled is the sum of the areas,

$$D = A_1 + A_2 + A_3 = 4.5 + 2 + 3 = 9.5 \text{ miles}.$$

But the change in position has to account for the sign associated with the area, where those above the t-axis are considered positive while those below the t-axis are viewed as negative, so that

$$s(3) - s(0) = (+4.5) + (-2) + (+3) = 5.5 \text{ miles},$$

assigning the “−2” to the area in the interval [1.5, 2] because there velocity is negative and the person is walking in the “negative” direction. In other words, the person walks 4.5 miles in the positive direction, followed by two miles in the negative direction, and then 3 more miles in the positive direction.

This effect of velocity being negative is also seen in the graph of the function y = s(t), which has a negative slope (specifically, its slope is −4) on the interval 1.5 < t < 2 since the velocity is −4 on that interval, which shows the person’s position function is decreasing because the person is walking west, rather than east. On the intervals where the person is walking east, the velocity function is positive and the slope of the position function s is therefore also positive.

To summarize, we see that if velocity is sometimes negative, this makes the moving object’s change in position different from its distance traveled. By viewing the intervals on which velocity is positive and negative separately, we may compute the distance traveled on each such interval. Then, depending on whether we desire total distance traveled or total change in position, we either add all of these distances, or we subtract the distances covered while velocity is negative: negative velocity produces a negative change in position, while still contributing positively to total distance traveled.
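
The bookkeeping for the beach walk can be written out as a short computation. This is a minimal sketch, not part of the original text; the list name `legs` and the convention of storing each leg as a (velocity, duration) pair are our own assumptions.

```python
# Each leg of the walk as (velocity in mph, duration in hours); east is positive.
legs = [(3.0, 1.5), (-4.0, 0.5), (3.0, 1.0)]

# Change in position: signed areas, so the westward leg counts negatively.
change_in_position = sum(v * dt for v, dt in legs)

# Distance traveled: every leg contributes the (positive) area of its rectangle.
distance_traveled = sum(abs(v) * dt for v, dt in legs)

print(change_in_position)  # 5.5
print(distance_traveled)   # 9.5
```

The only difference between the two sums is the absolute value on the velocity, which is exactly the distinction between signed area and area.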

We close this section with one additional activity that further explores the effects of negative velocity on the problem of
finding change in position and total distance traveled.

Activity 4.1.4:

Suppose that an object moving along a straight line path has its velocity v (in meters per second) at time t (in seconds)
given by the piecewise linear function whose graph is pictured in Figure 4.8. We view movement to the right as being
in the positive direction (with positive velocity), while movement to the left is in the negative direction. Suppose
further that the object’s initial position at time t = 0 is s(0) = 1.

Figure 4.8: The velocity function of a moving object.

a. Determine the total distance traveled and the total change in position on the time interval 0 ≤ t ≤ 2. What is the
object’s position at t = 2?
b. On what time intervals is the moving object’s position function increasing? Why? On what intervals is the object’s
position decreasing? Why?
c. What is the object’s position at t = 8? How many total meters has it traveled to get to this point (including distance
in both directions)? Is this different from the object’s total change in position on t = 0 to t = 8?
d. Find the exact position of the object at t = 1, 2, 3, . . ., 8 and use this data to sketch an accurate graph of y = s(t) on
the axes provided at right. How can you use the provided information about y = v(t) to determine the concavity of s
on each relevant interval?

Summary
In this section, we encountered the following important ideas:
If we know the velocity of a moving body at every point in a given interval and the velocity is positive throughout, we
can estimate the object’s distance traveled and in some circumstances determine this value exactly.
In particular, when velocity is positive on an interval, we can find the total distance traveled by finding the area under
the velocity curve and above the t-axis on the given time interval. We may only be able to estimate this area, depending
on the shape of the velocity curve.
An antiderivative of a function f is a new function F whose derivative is f . That is, F is an antiderivative of f provided
that F 0 = f . In the context of velocity and position, if we know a velocity function v, an antiderivative of v is a
position function s that satisfies s 0 = v. If v is positive on a given interval, say [a, b], then the change in position, s(b) −
s(a), measures the distance the moving object traveled on [a, b].
In the setting where velocity is sometimes negative, the object is sometimes traveling in the opposite
direction, so the object backtracks. To
determine distance traveled, we have to think about the problem separately on intervals where velocity is positive and
negative and account for the change in position on each such interval.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

4.2: Riemann Sums
Learning Objectives

In this section, we strive to understand the ideas generated by the following important questions:
How can we use a Riemann sum to estimate the area between a given curve and the horizontal axis over a particular interval?
What are the differences among left, right, middle, and random Riemann sums?
How can we write Riemann sums in an abbreviated form?

In Section 4.1, we learned that if we have a moving object with velocity function v, whenever v(t) is positive, the area between y = v(t)
and the t-axis over a given time interval tells us the distance traveled by the object over that time period; in addition, if v(t) is sometimes
negative and we view the area of any region below the t-axis as having an associated negative sign, then the sum of these signed areas over
a given interval tells us the moving object’s change in position over the time interval. For instance, for the velocity function given in Figure
4.2.1, if the areas of shaded regions are $A_1$, $A_2$, and $A_3$ as labeled, then the total distance D traveled by the moving object on [a, b] is

$$D = A_1 + A_2 + A_3, \tag{4.2.1}$$

while the total change in the object’s position on [a, b] is

$$s(b) - s(a) = A_1 - A_2 + A_3. \tag{4.2.2}$$

Because the motion is in the negative direction on the interval where v(t) < 0, we subtract $A_2$ when determining the object’s total change
in position.

Figure 4.2.1 : A velocity function that is sometimes negative.


Of course, finding D and s(b) − s(a) for the situation given in Figure 4.2.1 presumes that we can actually find the areas represented by
$A_1$, $A_2$, and $A_3$. In most of our work in Section 4.1, such as in Activities 4.1.3 and 4.1.4, we worked with velocity functions that were either
constant or linear, so that by finding the areas of rectangles and triangles, we could find the area bounded by the velocity function and the
horizontal axis exactly. But when the curve that bounds a region is not one for which we have a known formula for area, we are unable to
find this area exactly. Indeed, this is one of our biggest goals in Chapter 4: to learn how to find the exact area bounded between a curve and
the horizontal axis for as many different types of functions as possible.

To begin, we expand on the ideas in Activity 4.1.2, where we
encountered a nonlinear velocity function and approximated the area under the curve using four and eight rectangles, respectively. In the
following preview activity, we focus on three different options for deciding how to find the heights of the rectangles we will use.

Preview Activity 4.2.1

A person walking along a straight path has her velocity in miles per hour at time t given by the function $v(t) = 0.25t^3 - 1.5t^2 + 3t + 0.25$,
for times in the interval 0 ≤ t ≤ 2. The graph of this function is also given in each of the three diagrams in Figure 4.2.2. Note that
in each diagram, we use four rectangles to estimate the area under y = v(t) on the interval [0, 2], but the method by which the four
rectangles’ respective heights are decided varies among the three individual graphs.

Matthew Boelkins, David Austin & Steven


4.2.1 11/21/2021 https://math.libretexts.org/@go/page/4314
Schlicker
Figure 4.2.2 : Three approaches to estimating the area under y = v(t) on the interval [0, 2].
a. How are the heights of rectangles in the left-most diagram being chosen? Explain, and hence determine the value of

$$S = A_1 + A_2 + A_3 + A_4 \tag{4.2.3}$$

by evaluating the function y = v(t) at appropriately chosen values and observing the width of each rectangle. Note, for example, that

$$A_3 = v(1) \cdot \frac{1}{2} = 2 \cdot \frac{1}{2} = 1. \tag{4.2.4}$$

b. Explain how the heights of rectangles are being chosen in the middle diagram and find the value of

$$T = B_1 + B_2 + B_3 + B_4. \tag{4.2.5}$$

c. Likewise, determine the pattern of how heights of rectangles are chosen in the right-most diagram and determine $U = C_1 + C_2 + C_3 + C_4$.
d. Of the estimates S, T, and U, which do you think is the best approximation of D, the total distance the person traveled on [0, 2]?
Why?

Sigma Notation
It is apparent from several different problems we have considered that summing the areas of rectangles is one of the main ways to approximate
the area under a curve over a given interval. Intuitively, we expect that using a larger number of thinner rectangles will provide a way to
improve the estimates we are computing. As such, we anticipate dealing with sums with a large number of terms. To do so, we introduce
the use of so-called sigma notation, named for the Greek letter Σ, which is the capital letter S in the Greek alphabet.

For example, say we are interested in the sum 1 + 2 + 3 + ⋯ + 100, which is the sum of the first 100 natural numbers. Sigma notation provides a shorthand
notation that recognizes the general pattern in the terms of the sum. It is equivalent to write

$$\sum_{k=1}^{100} k = 1 + 2 + 3 + \cdots + 100. \tag{4.2.6}$$

We read the symbol

$$\sum_{k=1}^{100} k \tag{4.2.7}$$

as “the sum from k equals 1 to 100 of k.” The variable k is usually called the index of summation, and the letter that is used for this variable
is immaterial. Each sum in sigma notation involves a function of the index; for example,

$$\sum_{k=1}^{10} (k^2 + 2k) = (1^2 + 2 \cdot 1) + (2^2 + 2 \cdot 2) + (3^2 + 2 \cdot 3) + \cdots + (10^2 + 2 \cdot 10), \tag{4.2.8}$$

and more generally,

$$\sum_{k=1}^{n} f(k) = f(1) + f(2) + \cdots + f(n). \tag{4.2.9}$$

Sigma notation allows us the flexibility to easily vary the function being used to track the pattern in the sum, as well as to adjust the number
of terms in the sum simply by changing the value of n. We test our understanding of this new notation in the following activity.
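
Sigma notation translates almost directly into a loop over the index of summation. The following minimal sketch (not part of the original text) evaluates the sums in (4.2.6) and (4.2.8); the helper name `sigma` is hypothetical and simply mirrors the general pattern in (4.2.9).

```python
def sigma(f, start, stop):
    """Return f(start) + f(start + 1) + ... + f(stop), mirroring sigma notation.

    Unlike Python's range, the upper limit `stop` is included, just as the
    upper limit written above the sigma is included in the sum.
    """
    return sum(f(k) for k in range(start, stop + 1))

# The sum of the first 100 natural numbers, as in (4.2.6).
first_hundred = sigma(lambda k: k, 1, 100)

# The sum of k^2 + 2k for k from 1 to 10, as in (4.2.8).
quadratic_sum = sigma(lambda k: k**2 + 2 * k, 1, 10)

print(first_hundred)  # 5050
print(quadratic_sum)  # 495
```

Changing the function passed to `sigma` changes the pattern of the terms, and changing `stop` changes the number of terms, which are the two kinds of flexibility described above.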

Activity 4.2.2

For each sum written in sigma notation, write the sum long-hand and evaluate the sum to find its value. For each sum written in
expanded form, write the sum in sigma notation.
a. $\sum_{k=1}^{5} (k^2 + 2)$
b. $\sum_{i=3}^{6} (2i - 1)$
c. $3 + 7 + 11 + 15 + \cdots + 27$
d. $4 + 8 + 16 + 32 + \cdots + 256$
e. $\sum_{i=1}^{6} \frac{1}{2^i}$

Riemann Sums
When a moving body has a positive velocity function y = v(t) on a given interval [a, b], we know that the area under the curve over the
interval is the total distance the body travels on [a, b]. While this is the fundamental motivating force behind our interest in the area
bounded by a function, we are also interested more generally in being able to find the exact area bounded by y = f(x) on an interval [a, b],
regardless of the meaning or context of the function f. For now, we continue to focus on determining an accurate estimate of this area
through the use of a sum of the areas of rectangles, doing so in the setting where f(x) ≥ 0 on [a, b]. Throughout, unless otherwise indicated,
we also assume that f is continuous on [a, b].

The first choice we make in any such approximation is the number of rectangles. If we
say that the total number of rectangles is n, and we desire n rectangles of equal width to subdivide the interval [a, b], then each rectangle
must have width $\Delta x = \frac{b-a}{n}$. With $x_0 = a$, we observe further that

$$x_1 = x_0 + \Delta x, \quad x_2 = x_0 + 2\Delta x, \tag{4.2.10}$$

and thus in general

$$x_i = a + i\,\Delta x \tag{4.2.11}$$

as pictured in Figure 4.2.3.

Figure 4.2.3: Subdividing the interval [a, b] into n subintervals of equal length $\Delta x$.

We use each subinterval $[x_i, x_{i+1}]$ as the base of a rectangle, and next must choose how to decide the height of
the rectangle that will be used to approximate the area under y = f(x) on the subinterval. There are three standard choices: use the left
endpoint of each subinterval, the right endpoint of each subinterval, or the midpoint of each. These are precisely the options encountered in
Preview Activity 4.2.1 and seen in Figure 4.2.2. We next explore how these choices can be reflected in sigma notation.

If we now consider an
arbitrary positive function f on [a, b] with the interval subdivided as shown in Figure 4.2.3, and choose to use left endpoints, then on each
interval of the form $[x_i, x_{i+1}]$, the area of the rectangle formed is given by $A_{i+1} = f(x_i) \cdot \Delta x$, as seen in Figure 4.2.4. If we let $L_n$ denote
the sum of the areas of rectangles whose heights are given by the function value at each respective left endpoint, then we see that

$$L_n = A_1 + A_2 + \cdots + A_{i+1} + \cdots + A_n = f(x_0) \cdot \Delta x + f(x_1) \cdot \Delta x + \cdots + f(x_i) \cdot \Delta x + \cdots + f(x_{n-1}) \cdot \Delta x. \tag{4.2.12}$$

In the more compact sigma notation, we have $L_n = \sum_{i=0}^{n-1} f(x_i)\,\Delta x$. Note particularly that since the index of summation begins at 0 and
ends at n − 1, there are indeed n terms in this sum. We call $L_n$ the left Riemann sum for the function f on the interval [a, b].

Figure 4.2.4: Subdividing the interval [a, b] into n subintervals of equal length $\Delta x$ and approximating the area under y = f(x) over [a, b]
using left rectangles.
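
The formula for the left Riemann sum can be turned into a few lines of code. The following is a minimal sketch, not part of the original text; the function name `left_riemann_sum` and the test function f(x) = x² on [0, 2] are our own illustrative choices.

```python
def left_riemann_sum(f, a, b, n):
    """Compute L_n: the sum of f(x_i) * dx for i = 0, 1, ..., n - 1.

    Here dx = (b - a) / n and x_i = a + i * dx, matching (4.2.11),
    so each rectangle's height is taken at its left endpoint.
    """
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(n))

def f(x):
    """An illustrative positive, increasing function on [0, 2]."""
    return x**2

L4 = left_riemann_sum(f, 0, 2, 4)
print(L4)  # 1.75

# Using more, thinner rectangles changes the estimate.
L100 = left_riemann_sum(f, 0, 2, 100)
print(L100)
```

By hand, L₄ uses Δx = 0.5 and the left endpoints 0, 0.5, 1, and 1.5, giving (0 + 0.25 + 1 + 2.25) · 0.5 = 1.75. Note that `range(n)` runs over i = 0, …, n − 1, which is exactly the index range in the sigma notation for Lₙ.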
There are now two fundamental issues to explore: the number of rectangles we choose to use and the selection of the pattern by which we
identify the height of each rectangle. It is best to explore these choices dynamically, and the applet found at http://gvsu.edu/s/a9 (Marc Renault, Geogebra Calculus Applets) is a
particularly useful one. There we see the image shown in Figure 4.2.5, but with the opportunity to adjust the slider bars for the left endpoint and the number of subintervals. By moving the sliders,
we can see how the heights of the rectangles change as we consider left endpoints, midpoints, and right endpoints, as well as the impact that
a larger number of narrower rectangles has on the approximation of the exact area bounded by the function and the horizontal axis.

Figure 4.2.5: A snapshot of the applet found at http://gvsu.edu/s/a9.

To see how the Riemann sums for right endpoints and midpoints are constructed, we consider Figure 4.2.6.

Figure 4.2.6: Riemann sums using right endpoints and midpoints.

For the sum with right endpoints, we see that the area of the
rectangle on an arbitrary interval $[x_i, x_{i+1}]$ is given by $B_{i+1} = f(x_{i+1}) \cdot \Delta x$, so that the sum of all such areas of rectangles is given by

$$R_n = B_1 + B_2 + \cdots + B_{i+1} + \cdots + B_n = f(x_1) \cdot \Delta x + f(x_2) \cdot \Delta x + \cdots + f(x_{i+1}) \cdot \Delta x + \cdots + f(x_n) \cdot \Delta x = \sum_{i=1}^{n} f(x_i)\,\Delta x. \tag{4.2.13}$$

We call $R_n$ the right Riemann sum for the function f on the interval [a, b]. For the sum that uses midpoints, we introduce the notation
$\overline{x}_{i+1} = \frac{x_i + x_{i+1}}{2}$ so that $\overline{x}_{i+1}$ is the midpoint of the interval $[x_i, x_{i+1}]$. For instance, for the rectangle with area $C_1$ in Figure 4.2.6, we now
have $C_1 = f(\overline{x}_1) \cdot \Delta x$. Hence, the sum of all the areas of rectangles that use midpoints is

$$M_n = C_1 + C_2 + \cdots + C_{i+1} + \cdots + C_n = f(\overline{x}_1) \cdot \Delta x + f(\overline{x}_2) \cdot \Delta x + \cdots + f(\overline{x}_{i+1}) \cdot \Delta x + \cdots + f(\overline{x}_n) \cdot \Delta x = \sum_{i=1}^{n} f(\overline{x}_i)\,\Delta x, \tag{4.2.14}$$

and we say that $M_n$ is the middle Riemann sum for f on [a, b].

When f(x) ≥ 0 on [a, b], each of the Riemann sums $L_n$, $R_n$, and $M_n$ provides
an estimate of the area under the curve y = f(x) over the interval [a, b]; momentarily, we will discuss the meaning of Riemann sums in the
setting when f is sometimes negative. We also recall that in the context of a nonnegative velocity function y = v(t), the corresponding
Riemann sums are approximating the distance traveled on [a, b] by the moving object with velocity function v.

There is a more general way
to think of Riemann sums, and that is to not restrict the choice of where the function is evaluated to determine the respective rectangle
heights. That is, rather than saying we’ll always choose left endpoints, or always choose midpoints, we simply say that a point $x^*_{i+1}$ will
be selected at random in the interval $[x_i, x_{i+1}]$ (so that $x_i \le x^*_{i+1} \le x_{i+1}$), which makes the Riemann sum given by

$$f(x^*_1) \cdot \Delta x + f(x^*_2) \cdot \Delta x + \cdots + f(x^*_{i+1}) \cdot \Delta x + \cdots + f(x^*_n) \cdot \Delta x = \sum_{i=1}^{n} f(x^*_i)\,\Delta x.$$

At http://gvsu.edu/s/a9, the applet noted earlier and
referenced in Figure 4.2.5, by unchecking the “relative” box at the top left, and instead checking “random,” we can easily explore the effect
of using random point locations in subintervals on a given Riemann sum. In computational practice, we most often use $L_n$, $R_n$, or $M_n$,
while the random Riemann sum is useful in theoretical discussions. In the following activity, we investigate several different Riemann sums
for a particular velocity function.
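
The four kinds of Riemann sums differ only in where each subinterval's sample point is chosen, which a single function can make explicit. This is a minimal sketch, not part of the original text; the function name `riemann_sum`, its `rule` parameter, the seed, and the test function f(x) = x² on [0, 2] are all our own illustrative assumptions.

```python
import random

def riemann_sum(f, a, b, n, rule="left", rng=None):
    """Sum f(sample point) * dx over n equal subintervals of [a, b].

    rule selects the sample point in each subinterval:
    "left", "right", "middle", or "random".
    """
    if rng is None:
        rng = random.Random()
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x_left = a + i * dx
        x_right = a + (i + 1) * dx
        if rule == "left":
            x = x_left
        elif rule == "right":
            x = x_right
        elif rule == "middle":
            x = (x_left + x_right) / 2
        elif rule == "random":
            x = rng.uniform(x_left, x_right)
        else:
            raise ValueError(f"unknown rule: {rule}")
        total += f(x) * dx
    return total

def f(x):
    """An illustrative positive, increasing function on [0, 2]."""
    return x**2

L4 = riemann_sum(f, 0, 2, 4, rule="left")
R4 = riemann_sum(f, 0, 2, 4, rule="right")
M4 = riemann_sum(f, 0, 2, 4, rule="middle")
# A fixed seed (an arbitrary choice) makes the random sum reproducible.
random4 = riemann_sum(f, 0, 2, 4, rule="random", rng=random.Random(0))

print(L4, R4, M4)  # 1.75 3.75 2.625
print(random4)
```

For this particular increasing function, every random sample value lies between the left-endpoint and right-endpoint values on its subinterval, so the random sum falls between L₄ and R₄; that ordering is a feature of this example, not of Riemann sums in general.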

Activity 4.2.3

Suppose that an object moving along a straight line path has its velocity in feet per second at time t in seconds given by

$$v(t) = \frac{2}{9}(t - 3)^2 + 2. \tag{4.2.15}$$

a. Carefully sketch the region whose exact area will tell you the value of the distance the object traveled on the time interval 2 ≤ t ≤ 5.
b. Estimate the distance traveled on [2, 5] by computing $L_4$, $R_4$, and $M_4$.
c. Does averaging $L_4$ and $R_4$ result in the same value as $M_4$? If not, what do you think the average of $L_4$ and $R_4$ measures?
d. For this question, think about an arbitrary function f, rather than the particular function v given above. If f is positive and increasing
on [a, b], will $L_n$ over-estimate or under-estimate the exact area under f on [a, b]? Will $R_n$ over- or under-estimate the exact area
under f on [a, b]? Explain.

When the function is sometimes negative

For a Riemann sum such as $L_n = \sum_{i=0}^{n-1} f(x_i)\,\Delta x$, we can of course compute the sum even
when f takes on negative values. We know that when f is positive on [a, b], the corresponding left Riemann sum $L_n$ estimates the area
bounded by f and the horizontal axis over the interval.

Figure 4.2.7: At left and center, two left Riemann sums for a function f that is sometimes negative; at right, the areas bounded by f on the
interval [a, d].

For a function such as the one pictured in Figure 4.2.7, where in the first figure a left Riemann sum is being taken with 12 subintervals over [a, d], we observe that the
function is negative on the interval b ≤ x ≤ c, and so for the four left endpoints that fall in [b, c], the terms $f(x_i)\,\Delta x$ have negative function
values. This means that those four terms in the Riemann sum produce an estimate of the opposite of the area bounded by y = f(x) and the x-axis
on [b, c].

In Figure 4.2.7, we also see evidence that increasing the number of rectangles used in a Riemann sum improves the
approximation of the area (or the opposite of the area) bounded by a curve. For instance, in the middle graph, we use 24
left rectangles, and from the shaded areas, it appears that we have decreased the error from the approximation that uses 12. When we
proceed to Section 4.3, we will discuss the natural idea of letting the number of rectangles in the sum increase without bound.

For now, it is
most important for us to observe that, in general, any Riemann sum of a continuous function f on an interval [a, b] approximates the
difference between the area that lies above the horizontal axis on [a, b] and under f and the area that lies below the horizontal axis on [a, b]
and above f. In the notation of Figure 4.2.7, we may say that

$$L_{24} \approx A_1 - A_2 + A_3, \tag{4.2.16}$$

where $L_{24}$ is the left Riemann sum using 24 subintervals shown in the middle graph, and $A_1$ and $A_3$ are the areas of the regions where f is
positive on the interval of interest, while $A_2$ is the area of the region where f is negative. We will also call the quantity $A_1 - A_2 + A_3$ the
net signed area bounded by f over the interval [a, d], where by the phrase “signed area” we indicate that we are attaching a minus sign to the
areas of regions that fall below the horizontal axis.

Finally, we recall from the introduction to this present section that in the context where the function f represents the velocity of a moving
object, the total sum of the areas bounded by the curve tells us the total distance traveled over the relevant time interval, while the total net
signed area bounded by the curve computes the object’s change in position on the interval.
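
To see the difference between net signed area and total area numerically, we can compare a Riemann sum of f with a Riemann sum of |f|. This is a minimal sketch, not part of the original text: the test function f(x) = x² − 1 on [0, 2] is our own example, and estimating total area by summing |f| at the sample points is one reasonable approach that we assume here rather than one the text has introduced.

```python
def midpoint_sum(f, a, b, n):
    """Compute M_n: sample f at the midpoint of each of n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

def f(x):
    """Illustrative function: negative on [0, 1), positive on (1, 2]."""
    return x**2 - 1

# Net signed area: regions below the axis count negatively.
net_signed_area = midpoint_sum(f, 0, 2, 1000)

# Total area: taking |f| flips the negative region above the axis first.
total_area = midpoint_sum(lambda x: abs(f(x)), 0, 2, 1000)

print(net_signed_area)  # close to 2/3
print(total_area)       # close to 2
```

For this example the region below the axis has area 2/3 and the region above has area 4/3, so the net signed area is 4/3 − 2/3 = 2/3 while the total area is 4/3 + 2/3 = 2. If f were a velocity function, these two numbers would be the change in position and the distance traveled, respectively.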

Activity 4.2.4

Suppose that an object moving along a straight line path has its velocity v (in feet per second) at time t (in seconds) given by
1 7
2
v(t) = t − 3t + . (4.2.17)
2 2

a. Compute M5, the middle Riemann sum, for v on the time interval [1, 5]. Be sure to clearly identify the value of 4t as well as the
locations of t , t , … , t . In addition, provide a careful sketch of the function and the corresponding rectangles that are being used
0 1 5

in the sum.
b. Building on your work in (a), estimate the total change in position of the object on the interval [1, 5].
c. Building on your work in (a) and (b), estimate the total distance traveled by the object on [1, 5].
d. Use appropriate computing technology5 to compute M10 and M20. What exact value do you think the middle sum eventually
approaches as n increases without bound? What does that number represent in the physical context of the overall problem?

Summary
In this section, we encountered the following important ideas:
A Riemann sum is simply a sum of products of the form \(f(x_i^*)\Delta x\) that estimates the area between a positive function and the horizontal
axis over a given interval. If the function is sometimes negative on the interval, the Riemann sum estimates the difference between the
areas that lie above the horizontal axis and those that lie below the axis.

The three most common types of Riemann sums are left, right, and middle sums; we can also work with a more general, random
Riemann sum. The only difference among these sums is the location of the point at which the
function is evaluated to determine the height of the rectangle whose area is being computed in the sum. For a left Riemann sum, we
evaluate the function at the left endpoint of each subinterval, while for right and middle sums, we use right endpoints and midpoints,
respectively.
⁵ For instance, consider the applet at http://gvsu.edu/s/a9 and change the function and adjust the locations of the blue points that
represent the interval endpoints a and b.
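The left, right, and middle Riemann sums are denoted \(L_n\), \(R_n\), and \(M_n\), with formulas
\[L_n = f(x_0)\Delta x + f(x_1)\Delta x + \cdots + f(x_{n-1})\Delta x = \sum_{i=0}^{n-1} f(x_i)\Delta x,\]
\[R_n = f(x_1)\Delta x + f(x_2)\Delta x + \cdots + f(x_n)\Delta x = \sum_{i=1}^{n} f(x_i)\Delta x,\]
\[M_n = f(\overline{x}_1)\Delta x + f(\overline{x}_2)\Delta x + \cdots + f(\overline{x}_n)\Delta x = \sum_{i=1}^{n} f(\overline{x}_i)\Delta x,\]
where \(x_0 = a\), \(x_i = a + i\Delta x\), and \(x_n = b\), using \(\Delta x = \frac{b-a}{n}\). For the midpoint sum, \(\overline{x}_i = (x_{i-1} + x_i)/2\).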
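
The three formulas differ only in where each sample point sits inside its subinterval, which the short Python sketch below tries to make concrete. The function name `riemann_sum`, its `rule` parameter, and the test function f(x) = x² on [0, 2] are choices made for this illustration, not notation from the text.

```python
def riemann_sum(f, a, b, n, rule="left"):
    """Left, right, or middle Riemann sum of f on [a, b] with n subintervals."""
    dx = (b - a) / n
    # Position of the sample point within each subinterval, as a fraction of dx.
    offset = {"left": 0.0, "middle": 0.5, "right": 1.0}[rule]
    return sum(f(a + (i + offset) * dx) for i in range(n)) * dx


def f(x):
    return x ** 2


left = riemann_sum(f, 0, 2, 4, "left")
middle = riemann_sum(f, 0, 2, 4, "middle")
right = riemann_sum(f, 0, 2, 4, "right")
print(left, middle, right)
```

Because f is increasing on [0, 2], the left sum underestimates the area, the right sum overestimates it, and the middle sum falls between the two.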

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand Valley State
University)

4.3: The Definite Integral
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How does increasing the number of subintervals affect the accuracy of the approximation generated by a Riemann
sum?
What is the definition of the definite integral of a function f over the interval [a, b]?
What does the definite integral measure exactly, and what are some of the key properties of the definite integral?

In Figure 4.3.1, we see visual evidence that increasing the number of rectangles in a Riemann sum improves the accuracy
of the approximation of the net signed area that is bounded by the given function on the interval under consideration. We
thus explore the natural idea of allowing the number of rectangles to increase without bound in an effort to compute the
exact net signed area bounded by a function on an interval. In addition, it is important to think about the differences among
left, right, and middle Riemann sums and the different results they generate as the value of n increases. As we have done
throughout our investigations with area, we begin with functions that are exclusively positive on the interval under
consideration.

Figure 4.3.1 : At left and center, two left Riemann sums for a function f that is sometimes negative; at right, the exact
areas bounded by f on the interval [a, d].

Preview Activity 4.3.1

Consider the applet found at http://gvsu.edu/s/aw⁶. There, you will initially see the situation shown in Figure 4.3.2.
Note that the value of the chosen Riemann sum is displayed next to the word “relative,” and that you can change the type of Riemann sum being
computed by dragging the point on the slider bar below the phrase “sample point placement.” Explore to see how you
can change the window in which the function is viewed, as well as the function itself. You can set the minimum and
maximum values of x by clicking and dragging on the blue points that set the endpoints; you can change the function
by typing a new formula in the “f(x)” window at the bottom; and you can adjust the overall window by “panning and
zooming” by using the Shift key and the scrolling feature of your mouse. More information on how to pan and zoom
can be found at http://gvsu.edu/s/Fl. Work accordingly to adjust the applet so that it uses a left Riemann sum with n = 5
subintervals for the function f(x) = 2x + 1. You should see the updated figure shown in Figure 4.3.3.

Figure 4.3.2: A right Riemann sum with 10 subintervals for the function \(f(x) = \sin(2x) - \frac{x^2}{10} + 3\) on the interval
[1, 7]. The value of the sum is \(R_{10} = 4.90595\).

Figure 4.3.3: A left Riemann sum with 5 subintervals for the function f(x) = 2x + 1 on the interval [1, 4]. The value of
the sum is \(L_5 = 16.2\).

⁶ Marc Renault, Shippensburg University, Geogebra Applets for Calculus, http://gvsu.edu/s/5p.

Then, answer the following questions.
a. Update the applet (and view window, as needed) so that the function being considered is f(x) = 2x + 1 on [1, 4], as
directed above. For this function on this interval, compute \(L_n\), \(M_n\), and \(R_n\) for n = 5, n = 25, and n = 100. What appears
to be the exact area bounded by f(x) = 2x + 1 and the x-axis on [1, 4]?
b. Use basic geometry to determine the exact area bounded by f(x) = 2x + 1 and the x-axis on [1, 4].
c. Based on your work in (a) and (b), what do you observe occurs when we increase the number of subintervals used
in the Riemann sum?
d. Update the applet to consider the function \(f(x) = x^2 + 1\) on the interval [1, 4] (note that you need to enter “x^2 +
1” for the function formula). Use the applet to compute \(L_n\), \(M_n\), and \(R_n\) for n = 5, n = 25, and n = 100. What do you
conjecture is the exact area bounded by \(f(x) = x^2 + 1\) and the x-axis on [1, 4]?
e. Why can we not compute the exact value of the area bounded by \(f(x) = x^2 + 1\) and the x-axis on [1, 4] using a
formula like we did in (b)?
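
For readers without access to the applet, sums like those requested in part (a) can also be computed with a few lines of Python. This is only a sketch: `riemann_sum` and its `offset` parameter are names chosen for this illustration and are not part of the applet.

```python
def riemann_sum(f, a, b, n, offset):
    """Riemann sum on [a, b]; offset 0, 0.5, 1 gives left, middle, right sums."""
    dx = (b - a) / n
    return sum(f(a + (i + offset) * dx) for i in range(n)) * dx


def f(x):
    return 2 * x + 1


# results[n] holds the triple (L_n, M_n, R_n) for f on [1, 4].
results = {}
for n in (5, 25, 100):
    results[n] = tuple(riemann_sum(f, 1, 4, n, offset) for offset in (0.0, 0.5, 1.0))
    L, M, R = results[n]
    print(f"n={n:3d}  L_n={L:.4f}  M_n={M:.4f}  R_n={R:.4f}")
```

The n = 5 row should reproduce the value \(L_5 = 16.2\) reported in Figure 4.3.3. Swapping in a different definition of `f` gives the corresponding sums for part (d).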

The Definition of the Definite Integral


In both examples in Preview Activity 4.3.1, we saw that as the number of rectangles got larger and larger, the values of \(L_n\),
\(M_n\), and \(R_n\) all grew closer and closer to the same value. It turns out that this occurs for any continuous function on an
interval [a, b], and even more generally for a Riemann sum using any point \(x_{i+1}^*\) in the interval \([x_i, x_{i+1}]\). Said differently,
as we let n → ∞, it doesn’t really matter where we choose to evaluate the function within a given subinterval, because
\[\lim_{n\to\infty} L_n = \lim_{n\to\infty} R_n = \lim_{n\to\infty} M_n = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i^*)\Delta x. \tag{4.3.1}\]

That these limits always exist (and share the same value) for a continuous⁷ function f allows us to make the following
definition.

Definition: Definite Integral
The definite integral of a continuous function f on the interval [a, b], denoted \(\int_a^b f(x)\,dx\), is the real number given
by
\[\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i^*)\Delta x, \tag{4.3.2}\]
where \(\Delta x = \frac{b-a}{n}\), \(x_i = a + i\Delta x\) (for i = 0, …, n), and \(x_i^*\) satisfies \(x_{i-1} \le x_i^* \le x_i\) (for i = 1, …, n).
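
The definition allows any sample point \(x_i^*\) in each subinterval. The following Python sketch explores that freedom by choosing each sample point at random; the function name `sampled_riemann_sum`, the fixed seed, and the choice f(x) = 2x + 1 on [1, 4] are assumptions for illustration only. The text later evaluates this particular integral geometrically as 18, which gives a value to compare against.

```python
import random


def sampled_riemann_sum(f, a, b, n, rng):
    """Riemann sum with one randomly chosen sample point per subinterval."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        left, right = a + (i - 1) * dx, a + i * dx
        x_star = rng.uniform(left, right)  # any point with x_{i-1} <= x_i^* <= x_i
        total += f(x_star) * dx
    return total


def f(x):
    return 2 * x + 1


rng = random.Random(0)  # fixed seed so that a run is repeatable
estimates = {n: sampled_riemann_sum(f, 1, 4, n, rng) for n in (10, 100, 1000, 10000)}
for n, value in estimates.items():
    print(n, value)
```

Even with sample points scattered arbitrarily, the estimates should settle toward the same limit as n grows, which is the behavior Equation (4.3.1) describes.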

We call the symbol \(\int\) the integral sign, the values a and b the limits of integration, and the function f the integrand. The
process of determining the real number \(\int_a^b f(x)\,dx\) is called evaluating the definite integral. While we will come to
understand that there are several different interpretations of the value of the definite integral, for now the most important is
that
\[\int_a^b f(x)\,dx \tag{4.3.3}\]
measures the net signed area bounded by y = f(x) and the x-axis on the interval [a, b]. For example, in the notation of the
definite integral, if f is the function pictured in Figure 4.3.4 and \(A_1\), \(A_2\), and \(A_3\) are the exact areas bounded by f and the
x-axis on the respective intervals [a, b], [b, c], and [c, d], then
\[\int_a^b f(x)\,dx = A_1, \tag{4.3.4}\]
\[\int_b^c f(x)\,dx = -A_2, \tag{4.3.5}\]
\[\int_c^d f(x)\,dx = A_3, \tag{4.3.6}\]
and
\[\int_a^d f(x)\,dx = A_1 - A_2 + A_3. \tag{4.3.7}\]

We can also use definite integrals to express the change in position and distance traveled by a moving object. In the setting
of a velocity function v on an interval [a, b], it follows from our work above and in preceding sections that the change in
position, s(b) − s(a) , is given by
\[s(b) - s(a) = \int_a^b v(t)\,dt. \tag{4.3.8}\]

⁷ It turns out that a function need not be continuous in order to have a definite integral. For our purposes, we assume that
the functions we consider are continuous on the interval(s) of interest. It is straightforward to see that any function that
is piecewise continuous on an interval of interest will also have a well-defined definite integral.

Figure 4.3.4: A continuous function f on the interval [a, d].
If the velocity function is nonnegative on [a, b], then \(\int_a^b v(t)\,dt\) tells us the distance the object traveled. When velocity is
sometimes negative on [a, b], the areas bounded by the function on intervals where v does not change sign can be found
using integrals, and the sum of these values will tell us the distance the object traveled.

If we wish to compute the value of a definite integral using the definition, we have to take the limit of a sum. While this is
possible to do in select circumstances, it is also tedious and time-consuming; moreover, computing these limits does not offer
much additional insight into the meaning or interpretation of the definite integral. Instead, in Section 4.4, we will learn the
Fundamental Theorem of Calculus, a result that provides a shortcut for evaluating a large class of definite integrals. This will
enable us to determine the exact net signed area bounded by a continuous function and the x-axis in many circumstances, including
examples such as \(\int_1^4 (x^2 + 1)\,dx\), which we approximated by Riemann sums in Preview Activity 4.3.1.

For now, our goal is to understand the meaning and properties of the definite integral, rather than how to actually compute its value
using ideas in calculus. Thus, we temporarily rely on the net signed area interpretation of the definite integral and observe
that if a given curve produces regions whose areas we can compute exactly through known area formulas, we can thus
compute the exact value of the integral. For instance, if we wish to evaluate the definite integral \(\int_1^4 (2x + 1)\,dx\), we can
observe that the region bounded by this function and the x-axis is the trapezoid shown in Figure 4.3.5, and by the known
formula for the area of a trapezoid, its area is \(A = \frac{1}{2}(3 + 9) \cdot 3 = 18\), so \(\int_1^4 (2x + 1)\,dx = 18\).

Figure 4.3.5 : The area bounded by f (x) = 2x + 1 and the x-axis on the interval [1, 4].

Activity 4.3.2

Use known geometric formulas and the net signed area interpretation of the definite integral to evaluate each of the
definite integrals below.
a. \(\int_0^1 3x \, dx\)
b. \(\int_{-1}^4 (2 - 2x) \, dx\)
c. \(\int_{-1}^1 \sqrt{1 - x^2} \, dx\)
d. \(\int_{-3}^4 g(x) \, dx\), where g is the function pictured in Figure 4.3.6. Assume that each portion of g is either part
of a line or part of a circle.

Figure 4.3.6: A function g that is piecewise defined; each piece of the function is part of a circle or part of a line.

Some Properties of the Definite Integral


With the perspective that the definite integral of a function f over an interval [a, b] measures the net signed area bounded
by f and the x-axis over the interval, we naturally arrive at several different standard properties of the definite integral. In
addition, it is helpful to remember that the definite integral is defined in terms of Riemann sums that fundamentally consist
of the areas of rectangles.
If we consider the definite integral \(\int_a^a f(x)\,dx\) for any real number a, it is evident that no area is being bounded because
the interval begins and ends with the same point. Hence,

If f is a continuous function and a is a real number, then
\[\int_a^a f(x)\,dx = 0. \tag{4.3.9}\]

Figure 4.3.7 : The area bounded by y = f (x) on the interval [a, c].
Next, we consider the results of subdividing a given interval. In Figure 4.3.7, we see that
\[\int_a^b f(x)\,dx = A_1, \tag{4.3.10}\]
\[\int_b^c f(x)\,dx = A_2, \tag{4.3.11}\]
and
\[\int_a^c f(x)\,dx = A_1 + A_2, \tag{4.3.12}\]

which is indicative of the following general rule.

Rule
If f is a continuous function and a, b, and c are real numbers, then
\[\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx. \tag{4.3.13}\]

While this rule is most apparent in the situation where a < b < c, it in fact holds in general for any values of a, b, and c.
This result is connected to another property of the definite integral, which states that if we reverse the order of the limits of
integration, we change the sign of the integral’s value. If f is a continuous function and a and b are real numbers, then
\[\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx. \tag{4.3.14}\]
This result makes sense because if we integrate from a to b, then in the defining Riemann sum \(\Delta x = \frac{b-a}{n}\), while if we
integrate from b to a, \(\Delta x = \frac{a-b}{n} = -\frac{b-a}{n}\), and this is the only change in the sum used to define the integral.

There are two additional properties of the definite integral that we need to understand. Recall that when we worked with derivative
rules in Chapter 2, we found that both the Constant Multiple Rule and the Sum Rule held. The Constant Multiple Rule tells
us that if f is a differentiable function and k is a constant, then
\[\frac{d}{dx}[kf(x)] = kf'(x), \tag{4.3.15}\]

and the Sum Rule states that if f and g are differentiable functions, then
\[\frac{d}{dx}[f(x) + g(x)] = f'(x) + g'(x). \tag{4.3.16}\]

These rules are useful because they enable us to deal individually with the simplest parts of certain functions and take
advantage of the elementary operations of addition and multiplying by a constant. They also tell us that the process of
taking the derivative respects addition and multiplying by constants in the simplest possible way.

Figure 4.3.8 : The areas bounded by y = f (x) and y = 2 f (x) on [a, b].
It turns out that similar rules hold for the definite integral. First, let’s consider the situation pictured in Figure 4.3.8, where
we examine the effect of multiplying a function by a factor of 2 on the area it bounds with the x-axis. Because multiplying
the function by 2 doubles its height at every x-value, we see that if we consider a typical rectangle from a Riemann sum,
the difference in area comes from the changed height of the rectangle: \(f(x_i)\) for the original function, versus \(2f(x_i)\) for the
doubled function, in the case of a left sum. Hence, in Figure 4.3.8, we see that for the pictured rectangles with areas A and
B, it follows that B = 2A. As this will happen in every such rectangle, regardless of the value of n and the type of sum we use,
we see that in the limit, the area of the red region bounded by y = 2f(x) will be twice that of the area of the blue region
bounded by y = f(x). As there is nothing special about the value 2 compared to an arbitrary constant k, it turns out that
the following general principle holds.

Constant Multiple Rule

If f is a continuous function and k is any real number, then
\[\int_a^b k \cdot f(x)\,dx = k \int_a^b f(x)\,dx. \tag{4.3.17}\]

Finally, we see a similar situation geometrically with the sum of two functions f and g. In particular, as shown in Figure
4.3.9, if we take the sum of two functions f and g, at every point in the interval, the height of the function f + g is given by
\((f + g)(x_i) = f(x_i) + g(x_i)\), which is the sum of the individual function values of f and g (taken at left endpoints). Hence, for
the pictured rectangles with areas A, B, and C, it follows that C = A + B, and because this will occur for every such
rectangle, in the limit the area of the gray region will be the sum of the areas of the blue and red regions.

Figure 4.3.9 : The areas bounded by y = f (x) and y = g(x) on [a, b], as well as the area bounded by y = f (x) + g(x).
Stated in terms of definite integrals, we have the following general rule.

Sum Rule
If f and g are continuous functions, then
\[\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \tag{4.3.18}\]

More generally, the Constant Multiple and Sum Rules can be combined to make the observation that for any continuous
functions f and g and any constants c and k,
\[\int_a^b [cf(x) \pm kg(x)]\,dx = c \int_a^b f(x)\,dx \pm k \int_a^b g(x)\,dx. \tag{4.3.19}\]
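
The properties in this section can be spot-checked numerically. The Python sketch below uses middle Riemann sums as stand-ins for the integrals; the helper `midpoint_sum`, the sample functions sin(x) and x², the interval [0, 5], and the constants c = 3 and k = −2 are all assumptions chosen for illustration. Agreement up to small numerical error is consistent with the rules but does not prove them.

```python
import math


def midpoint_sum(f, a, b, n=2000):
    """Middle Riemann sum; dx is negative when b < a, as in the definition."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


f = math.sin
g = lambda x: x ** 2
c, k = 3.0, -2.0

whole = midpoint_sum(f, 0, 5)
split = midpoint_sum(f, 0, 2) + midpoint_sum(f, 2, 5)
reversed_limits = midpoint_sum(f, 5, 0)
combined = midpoint_sum(lambda x: c * f(x) + k * g(x), 0, 5)
separate = c * midpoint_sum(f, 0, 5) + k * midpoint_sum(g, 0, 5)

print(whole, split)              # splitting the interval at 2: approximately equal
print(reversed_limits, -whole)   # reversing the limits flips the sign
print(combined, separate)        # constant multiple and sum rules together
```

Note that the sign change under reversed limits falls out of the code for the same reason given in the text: swapping a and b only changes the sign of Δx.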

Activity 4.3.3

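Suppose that the following information is known about the functions f, g, \(x^2\), and \(x^3\):
\(\int_0^2 f(x)\,dx = -3\); \(\int_2^5 f(x)\,dx = 2\)
\(\int_0^2 g(x)\,dx = 4\); \(\int_2^5 g(x)\,dx = -1\)
\(\int_0^2 x^2\,dx = \frac{8}{3}\); \(\int_2^5 x^2\,dx = \frac{117}{3}\)
\(\int_0^2 x^3\,dx = 4\); \(\int_2^5 x^3\,dx = \frac{609}{4}\)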

Use the provided information and the rules discussed in the preceding section to evaluate each of the following definite
integrals.
a. \(\int_5^2 f(x)\,dx\)
b. \(\int_0^5 g(x)\,dx\)
c. \(\int_0^5 (f(x) + g(x))\,dx\)
d. \(\int_2^5 (3x^2 - 4x^3)\,dx\)
e. \(\int_5^0 (2x^3 - 7g(x))\,dx\)

How the definite integral is connected to a function’s average value


One of the most valuable applications of the definite integral is that it provides a way to meaningfully discuss the average
value of a function, even for a function that takes on infinitely many values. Recall that if we wish to take the average of n
numbers \(y_1, y_2, \ldots, y_n\), we do so by computing
\[\text{Avg} = \frac{y_1 + y_2 + \cdots + y_n}{n}. \tag{4.3.20}\]

Since integrals arise from Riemann sums in which we add n values of a function, it should not be surprising that evaluating
an integral is something like averaging the output values of a function. Consider, for instance, the right Riemann sum \(R_n\)
of a function f, which is given by
\[R_n = f(x_1)\Delta x + f(x_2)\Delta x + \cdots + f(x_n)\Delta x = (f(x_1) + f(x_2) + \cdots + f(x_n))\Delta x. \tag{4.3.21}\]

Since \(\Delta x = \frac{b-a}{n}\), we can thus write
\[R_n = (f(x_1) + f(x_2) + \cdots + f(x_n)) \cdot \frac{b-a}{n} = (b-a)\,\frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n}. \tag{4.3.22}\]

Here, we see that the right Riemann sum with n subintervals is the length of the interval (b − a) times the average of the n
function values found at the right endpoints. And just as with our efforts to compute area, we see that the larger the value
of n we use, the more accurate our average of the values of f will be. Indeed, we will define the average value of f on [a, b]
to be
\[f_{\text{AVG}[a,b]} = \lim_{n\to\infty} \frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n}.\]
But we also know that for any continuous function f on [a, b], taking the limit of a Riemann sum leads precisely to the
definite integral. That is, \(\lim_{n\to\infty} R_n = \int_a^b f(x)\,dx\), and thus taking the limit as n → ∞ in Equation (4.3.22), we have that

\[\int_a^b f(x)\,dx = (b-a) \cdot f_{\text{AVG}[a,b]}. \tag{4.3.23}\]

Solving Equation 4.3.23 for \(f_{\text{AVG}[a,b]}\), we have the following general principle.

Average value of a function:

If f is a continuous function on [a, b], then its average value on [a, b] is given by the formula
\[f_{\text{AVG}[a,b]} = \frac{1}{b-a} \cdot \int_a^b f(x)\,dx. \tag{4.3.24}\]
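
The link between averaging sampled values and the integral formula can be checked directly. The sketch below, in Python, uses right endpoints as in the derivation above; the helper name `right_sum`, the choice f(x) = 2x + 1 on [1, 4], and n = 1000 are assumptions for illustration. Since the text finds \(\int_1^4 (2x+1)\,dx = 18\), the average value of this f on [1, 4] should be 18/3 = 6.

```python
def right_sum(f, a, b, n):
    """Right Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx


def f(x):
    return 2 * x + 1


a, b, n = 1, 4, 1000
dx = (b - a) / n

# Plain average of the n function values at the right endpoints.
mean_of_samples = sum(f(a + i * dx) for i in range(1, n + 1)) / n

# The same quantity written as (1 / (b - a)) times the Riemann sum.
average_estimate = right_sum(f, a, b, n) / (b - a)

print(mean_of_samples, average_estimate)
```

The two printed numbers agree up to rounding because they are the same expression rearranged, and both approach the average value as n increases.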

Observe that Equation 4.3.23 tells us another way to interpret the definite integral: the definite integral of a function f from
a to b is the length of the interval (b − a) times the average value of the function on the interval. In addition, Equation
4.3.23 has a natural visual interpretation when the function f is nonnegative on [a, b]. Consider Figure 4.3.10, where we
see at left the shaded region whose area is \(\int_a^b f(x)\,dx\), at center the shaded rectangle whose dimensions are (b − a) by
\(f_{\text{AVG}[a,b]}\), and at right these two figures superimposed. Specifically, note that in dark green we show the horizontal line
\(y = f_{\text{AVG}[a,b]}\). Thus, the area of the green rectangle is given by \((b-a) \cdot f_{\text{AVG}[a,b]}\), which is precisely the value of
\(\int_a^b f(x)\,dx\). Said differently, the area of the blue region in the
left figure is the same as that of the green rectangle in the center figure; this can also be seen by observing that the areas
\(A_1\) and \(A_2\) in the rightmost figure appear to be equal. Ultimately, the average value of a function enables us to construct a
rectangle whose area is the same as the value of the definite integral of the function on the interval. The Java applet⁸ at
http://gvsu.edu/s/az provides an opportunity to explore how the average value of the function changes as the interval
changes, through an image similar to that found in Figure 4.3.10.

Figure 4.3.10: A function y = f(x), the area it bounds, and its average value on [a, b].

Activity 4.3.4:
Suppose that \(v(t) = \sqrt{4 - (t-2)^2}\) tells us the instantaneous velocity of a moving object on the interval 0 ≤ t ≤ 4,
where t is measured in minutes and v is measured in meters per minute.
a. Sketch an accurate graph of y = v(t). What kind of curve is \(y = \sqrt{4 - (t-2)^2}\)?
b. Evaluate \(\int_0^4 v(t)\,dt\) exactly.
c. In terms of the physical problem of the moving object with velocity v(t), what is the meaning of \(\int_0^4 v(t)\,dt\)?
Include units on your answer.
d. Determine the exact average value of v(t) on [0, 4]. Include units on your answer.
e. Sketch a rectangle whose base is the line segment from t = 0 to t = 4 on the t-axis such that the rectangle’s area is
equal to the value of \(\int_0^4 v(t)\,dt\). What is the rectangle’s exact height?
f. How can you use the average value you found in (d) to compute the total distance traveled by the moving object
over [0, 4]?
⁸ David Austin, http://gvsu.edu/s/5r.

Summary
In this section, we encountered the following important ideas:

Any Riemann sum of a continuous function f on an interval [a, b] provides an estimate of the net signed area bounded
by the function and the horizontal axis on the interval. Increasing the number of subintervals in the Riemann sum
improves the accuracy of this estimate, and letting the number of subintervals increase without bound results in the
values of the corresponding Riemann sums approaching the exact value of the enclosed net signed area.
When we take the just described limit of Riemann sums, we arrive at what we call the definite integral of f over the
interval [a, b]. In particular, the symbol \(\int_a^b f(x)\,dx\) denotes the definite integral of f over [a, b], and this quantity is
defined by the equation \(\int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i^*)\Delta x\), where \(\Delta x = \frac{b-a}{n}\), \(x_i = a + i\Delta x\) (for i = 0, …, n),
and \(x_i^*\) satisfies \(x_{i-1} \le x_i^* \le x_i\) (for i = 1, …, n).
The definite integral \(\int_a^b f(x)\,dx\) measures the exact net signed area bounded by f and the horizontal axis on [a, b]; in
addition, the value of the definite integral is related to what we call the average value of the function on [a, b]:
\(f_{\text{AVG}[a,b]} = \frac{1}{b-a} \cdot \int_a^b f(x)\,dx\). In the setting where we consider the integral of a velocity function v, \(\int_a^b v(t)\,dt\)
measures the exact change in position of the moving object on [a, b]; when v is nonnegative, \(\int_a^b v(t)\,dt\) is the object’s
distance traveled on [a, b].
The definite integral is a sophisticated sum, and thus has some of the same natural properties that finite sums have.
Perhaps most important of these is how the definite integral respects sums and constant multiples of functions, which
can be summarized by the rule \(\int_a^b [cf(x) \pm kg(x)]\,dx = c\int_a^b f(x)\,dx \pm k\int_a^b g(x)\,dx\), where f and g are
continuous functions on [a, b] and c and k are arbitrary constants.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

4.4: The Fundamental Theorem of Calculus
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How can we find the exact value of a definite integral without taking the limit of a Riemann sum?
What is the statement of the Fundamental Theorem of Calculus, and how do antiderivatives of functions play a key
role in applying the theorem?
What is the meaning of the definite integral of a rate of change in contexts other than when the rate of change
represents velocity?

Much of our work in Chapter 4 has been motivated by the velocity-distance problem: if we know the instantaneous
velocity function, v(t), for a moving object on a given time interval [a, b], can we determine the distance it traveled on
[a, b]? If the velocity function is nonnegative on [a, b], the area bounded by y = v(t) and the t-axis on [a, b] is equal to the
distance traveled. This area is also the value of the definite integral \(\int_a^b v(t)\,dt\). If the velocity is sometimes negative, the
total area bounded by the velocity function still tells us distance traveled, while the net signed area tells us the object's
change in position.
For instance, for the velocity function in Figure 4.4.1, the total distance D traveled by the moving object on [a, b] is

D = A1 + A2 + A3 ,

and the total change in the object's position is

s(b) − s(a) = A1 − A2 + A3 .

The areas \(A_1\), \(A_2\), and \(A_3\) are each given by definite integrals, which may be computed by limits of Riemann sums (and in
special circumstances by geometric formulas).

Figure 4.4.1 . A velocity function that is sometimes negative.


We turn our attention to an alternate approach.

Exercise 4.4.1: Preview Activity

A student with a third floor dormitory window 32 feet off the ground tosses a water balloon straight up in the air with
an initial velocity of 16 feet per second. It turns out that the instantaneous velocity of the water balloon is given by
v(t) = −32t + 16, where v is measured in feet per second and t is measured in seconds.

1. Let s(t) represent the height of the water balloon above ground at time t, and note that s is an antiderivative of v.
That is, v is the derivative of s: \(s'(t) = v(t)\). Find a formula for s(t) that satisfies the initial condition that the
balloon is tossed from 32 feet above ground. In other words, make your formula for s satisfy s(0) = 32.
2. When does the water balloon reach its maximum height? When does it land?
3. Compute \(s(\tfrac{1}{2}) - s(0)\), \(s(2) - s(\tfrac{1}{2})\), and \(s(2) - s(0)\). What do these represent?
4. What is the total vertical distance traveled by the water balloon from the time it is tossed until the time it lands?

5. Sketch a graph of the velocity function y = v(t) on the time interval [0, 2]. What is the total net signed area
bounded by y = v(t) and the t -axis on [0, 2]? Answer this question in two ways: first by using your work above,
and then by using a familiar geometric formula to compute areas of certain relevant regions.

The Fundamental Theorem of Calculus


Suppose we know the position function s(t) and the velocity function v(t) of an object moving in a straight line, and for
the moment let us assume that v(t) is positive on [a, b]. Then, as shown in Figure 4.4.2, we know two different ways to
compute the distance, D, the object travels: one is that D = s(b) − s(a), the object's change in position. The other is the
area under the velocity curve, which is given by the definite integral, so \(D = \int_a^b v(t)\,dt\).

Figure 4.4.2 . Finding distance traveled when we know a velocity function v.


Since both of these expressions tell us the distance traveled, it follows that they are equal, so
\[s(b) - s(a) = \int_a^b v(t)\,dt. \tag{4.4.1}\]

Equation 4.4.1 holds even when velocity is sometimes negative, because s(b) − s(a), the object's change in position, is
also measured by the net signed area on [a, b], which is given by \(\int_a^b v(t)\,dt\).

Perhaps the most powerful fact Equation 4.4.1 reveals is that we can compute the integral's value if we can find a formula
for s. Remember, s and v are related by the fact that v is the derivative of s, or equivalently that s is an antiderivative of
v.

Example 4.4.1

Determine the exact distance traveled on [1, 5] by an object with velocity function \(v(t) = 3t^2 + 40\) feet per second.
The distance traveled on the interval [1, 5] is given by
\[D = \int_1^5 v(t)\,dt = \int_1^5 (3t^2 + 40)\,dt = s(5) - s(1),\]
where s is an antiderivative of v. Now, the derivative of \(t^3\) is \(3t^2\) and the derivative of 40t is 40, so it follows that
\(s(t) = t^3 + 40t\) is an antiderivative of v. Therefore,
\[\begin{aligned} D &= \int_1^5 (3t^2 + 40)\,dt = s(5) - s(1) \\ &= (5^3 + 40 \cdot 5) - (1^3 + 40 \cdot 1) = 284 \text{ feet.} \end{aligned}\]
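
As a sanity check on this example, the antiderivative calculation can be compared with a direct Riemann sum. The Python sketch below is only an illustration; the helper name `midpoint_sum` and the choice of 1000 subintervals are assumptions, not part of the example.

```python
def midpoint_sum(f, a, b, n):
    """Middle Riemann sum of f on [a, b] with n equal subintervals."""
    dt = (b - a) / n
    return sum(f(a + (i + 0.5) * dt) for i in range(n)) * dt


def v(t):
    # Velocity from the example, in feet per second.
    return 3 * t ** 2 + 40


def s(t):
    # The antiderivative of v found in the example.
    return t ** 3 + 40 * t


exact = s(5) - s(1)                   # total change in s on [1, 5]
approx = midpoint_sum(v, 1, 5, 1000)  # Riemann-sum estimate of the same integral

print(exact, approx)
```

The Riemann sum takes a thousand function evaluations to get close to 284, while the antiderivative gives the exact value from two.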

Note the key lesson of Example 4.4.1: to find the distance traveled, we need to compute the area under a curve, which is
given by the definite integral. But to evaluate the integral, we can find an antiderivative, s, of the velocity function, and
then compute the total change in s on the interval. In particular, we can evaluate the integral without computing the limit
of a Riemann sum.

Figure 4.4.3: The exact area of the region enclosed by \(v(t) = 3t^2 + 40\) on [1, 5].
It will be convenient to have a shorthand symbol for a function's antiderivative. For a continuous function f, we will often
denote an antiderivative of f by F, so that \(F'(x) = f(x)\) for all relevant x. Using the notation V in place of s (so that V
is an antiderivative of v) in Equation 4.4.1, we can write
\[V(b) - V(a) = \int_a^b v(t)\,dt. \tag{4.4.2}\]

Now, to evaluate the definite integral \(\int_a^b f(x)\,dx\) for an arbitrary continuous function f, we could certainly think of f as
representing the velocity of some moving object, and x as the variable that represents time. But Equations (4.4.1) and
(4.4.2) hold for any continuous velocity function, even when v is sometimes negative. So Equation 4.4.2 offers a shortcut
route to evaluating any definite integral, provided that we can find an antiderivative of the integrand. The Fundamental
Theorem of Calculus (FTC) summarizes these observations.

Theorem 4.4.1: Fundamental Theorem of Calculus


If f is a continuous function on [a, b], and F is any antiderivative of f, then \(\int_a^b f(x)\,dx = F(b) - F(a)\).

A common alternate notation for F(b) − F(a) is
\[F(b) - F(a) = F(x)\Big|_a^b,\]
where we read the righthand side as “the function F evaluated from a to b.” In this notation, the FTC says that
\[\int_a^b f(x)\,dx = F(x)\Big|_a^b.\]

The FTC opens the door to evaluating a wide range of integrals if we can find an antiderivative F for the integrand f. For
instance, since \(\frac{d}{dx}\left[\frac{1}{3}x^3\right] = x^2\), the FTC tells us that
\[\begin{aligned} \int_0^1 x^2\,dx &= \frac{1}{3}x^3\Big|_0^1 \\ &= \frac{1}{3}(1)^3 - \frac{1}{3}(0)^3 \\ &= \frac{1}{3}. \end{aligned}\]

But finding an antiderivative can be far from simple; it is often difficult or even impossible. While we can differentiate just
about any function, even some relatively simple functions don't have an elementary antiderivative. A significant portion of
integral calculus (which is the main focus of second semester college calculus) is devoted to the problem of finding
antiderivatives.

Exercise 4.4.1

Use the Fundamental Theorem of Calculus to evaluate each of the following integrals exactly. For each, sketch a graph
of the integrand on the relevant interval and write one sentence that explains the meaning of the value of the integral in
terms of the (net signed) area bounded by the curve.
1. \(\int_{-1}^4 (2 - 2x)\,dx\)
2. \(\int_0^{\pi} \sin(x)\,dx\)
3. \(\int_0^1 e^x\,dx\)
4. \(\int_{-1}^1 x^5\,dx\)
5. \(\int_0^2 (3x^3 - 2x^2 - e^x)\,dx\)

Basic antiderivatives
The general problem of finding an antiderivative is difficult. In part, this is due to the fact that we are trying to undo the
process of differentiating, and the undoing is much more difficult than the doing. For example, while it is evident that an
antiderivative of \(f(x) = \sin(x)\) is \(F(x) = -\cos(x)\) and that an antiderivative of \(g(x) = x^2\) is \(G(x) = \frac{1}{3}x^3\), combinations of \(f\) and \(g\) can be far more complicated. Consider the functions

\[ 5\sin(x) - 4x^2, \quad x^2\sin(x), \quad \frac{\sin(x)}{x^2}, \quad \text{and} \quad \sin(x^2). \]

What is involved in trying to find an antiderivative for each? From our experience with derivative rules, we know that
derivatives of sums and constant multiples of basic functions are simple to execute, but derivatives involving products,
quotients, and composites of familiar functions are more complicated. Therefore, it stands to reason that antidifferentiating
products, quotients, and composites of basic functions may be even more challenging. We defer our study of all but the
most elementary antiderivatives to later in the text.
We do note that whenever we know the derivative of a function, we have a function-derivative pair, so we also know the
antiderivative of a function. For instance, since we know that
\[ \frac{d}{dx}\left[-\cos(x)\right] = \sin(x), \]

we also know that F (x) = − cos(x) is an antiderivative of f (x) = sin(x). F and f together form a function-derivative
pair. Clearly, every basic derivative rule leads us to such a pair, and thus to a known antiderivative.
In Activity 4.4.3, we will construct a list of the basic antiderivatives we know at this time. Those rules will help us
antidifferentiate sums and constant multiples of basic functions. For example, since \(-\cos(x)\) is an antiderivative of \(\sin(x)\) and \(\frac{1}{3}x^3\) is an antiderivative of \(x^2\), it follows that

\[ F(x) = -5\cos(x) - \frac{4}{3}x^3 \]

is an antiderivative of \(f(x) = 5\sin(x) - 4x^2\), by the sum and constant multiple rules for differentiation.
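
One way to gain confidence in a proposed antiderivative is to differentiate it numerically and compare with the original function. The sketch below does this for the pair above using a central difference quotient; the step size `h`, the sample points, and the helper name `central_difference` are assumptions chosen for illustration.

```python
import math


def f(x):
    # The function we want to antidifferentiate.
    return 5 * math.sin(x) - 4 * x ** 2


def F(x):
    # The proposed antiderivative from the text.
    return -5 * math.cos(x) - (4 / 3) * x ** 3


def central_difference(g, x, h=1e-5):
    """Approximate g'(x) with a symmetric difference quotient."""
    return (g(x + h) - g(x - h)) / (2 * h)


sample_points = [-2.0, -0.5, 0.0, 1.0, 3.0]
max_error = max(abs(central_difference(F, x) - f(x)) for x in sample_points)
print(max_error)
```

A very small printed value suggests that \(F'(x) = f(x)\) at the sampled points, consistent with the sum and constant multiple rules.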
Finally, before proceeding to build a list of common functions whose antiderivatives we know, we recall that each function
has more than one antiderivative. Because the derivative of any constant is zero, we may add a constant of our choice to
any antiderivative. For instance, we know that \(G(x) = \frac{1}{3}x^3\) is an antiderivative of \(g(x) = x^2\). But we could also have chosen \(G(x) = \frac{1}{3}x^3 + 7\), since in this case as well, \(G'(x) = x^2\). If \(g(x) = x^2\), we say that the general antiderivative of \(g\) is

\[ G(x) = \frac{1}{3}x^3 + C, \]
where C represents an arbitrary real number constant. Regardless of the formula for g, including +C in the formula for its
antiderivative G results in the most general possible antiderivative.
Our current interest in antiderivatives is so that we can evaluate definite integrals by the Fundamental Theorem of
Calculus. For that task, the constant C is irrelevant, and we usually omit it. To see why, consider the definite integral
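\[ \int_0^1 x^2\,dx. \]

For the integrand \(g(x) = x^2\), suppose we find and use the general antiderivative \(G(x) = \frac{1}{3}x^3 + C\). Then, by the FTC,

\[
\begin{aligned}
\int_0^1 x^2\,dx &= \left(\frac{1}{3}x^3 + C\right)\Big|_0^1 \\
&= \left(\frac{1}{3}(1^3) + C\right) - \left(\frac{1}{3}(0^3) + C\right) \\
&= \frac{1}{3} + C - 0 - C \\
&= \frac{1}{3}.
\end{aligned}
\]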

Observe that the C -values appear as opposites in the evaluation of the integral and thus do not affect the definite integral's
value.
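
The cancellation of the constant can also be seen numerically. In this small sketch, the function name `G` and the particular values of `C` are illustrative assumptions; for each choice of `C`, the difference \(G(1) - G(0)\) comes out to the same number.

```python
def G(x, C):
    # General antiderivative of g(x) = x^2 with constant C.
    return x ** 3 / 3 + C


# Evaluate G(1) - G(0) for several arbitrary constants.
values = [G(1.0, C) - G(0.0, C) for C in (0.0, 7.0, -2.5, 100.0)]
print(values)
```

Up to floating-point rounding, every entry in the printed list equals 1/3.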
In the following activity, we work to build a list of basic functions whose antiderivatives we already know.

Activity 4.4.3

Use your knowledge of derivatives of basic functions to complete Table 4.4.5 of antiderivatives. For each entry, your
task is to find a function F whose derivative is the given function f . When finished, use the FTC and the results in the
table to evaluate the three given definite integrals.
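Table 4.4.5. Familiar basic functions and their antiderivatives.

given function, \(f(x)\) | antiderivative, \(F(x)\)
\(k\) (\(k\) is constant) |
\(x^n\), \(n \ne -1\) |
\(\frac{1}{x}\), \(x > 0\) |
\(\sin(x)\) |
\(\cos(x)\) |
\(\sec(x)\tan(x)\) |
\(\csc(x)\cot(x)\) |
\(\sec^2(x)\) |
\(\csc^2(x)\) |
\(e^x\) |
\(a^x\) \((a > 1)\) |
\(\frac{1}{1 + x^2}\) |
\(\frac{1}{\sqrt{1 - x^2}}\) |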

1. \(\int_0^1 (x^3 - x - e^x + 2)\,dx\)

2. \(\int_0^{\pi/3} (2\sin(t) - 4\cos(t) + \sec^2(t) - \pi)\,dt\)
3. \(\int_0^1 (\sqrt{x} - x^2)\,dx\)

The Total Change Theorem


Let us review three interpretations of the definite integral.
For a moving object with instantaneous velocity \(v(t)\), the object's change in position on the time interval \([a, b]\) is given by \(\int_a^b v(t)\,dt\), and whenever \(v(t) \ge 0\) on \([a, b]\), \(\int_a^b v(t)\,dt\) tells us the total distance traveled by the object on \([a, b]\).
For any continuous function \(f\), its definite integral \(\int_a^b f(x)\,dx\) represents the net signed area bounded by \(y = f(x)\) and the \(x\)-axis on \([a, b]\), where regions that lie below the \(x\)-axis have a minus sign associated with their area.
The value of a definite integral is linked to the average value of a function: for a continuous function \(f\) on \([a, b]\), its average value \(f_{\text{AVG}[a,b]}\) is given by

\[ f_{\text{AVG}[a,b]} = \frac{1}{b - a}\int_a^b f(x)\,dx. \]

The Fundamental Theorem of Calculus now enables us to evaluate exactly (without taking a limit of Riemann sums) any
definite integral for which we are able to find an antiderivative of the integrand.
A slight change in perspective allows us to gain even more insight into the meaning of the definite integral. Recall
Equation 4.4.2, where we wrote the Fundamental Theorem of Calculus for a velocity function \(v\) with antiderivative \(V\) as

\[ V(b) - V(a) = \int_a^b v(t)\,dt. \]

If we instead replace \(V\) with \(s\) (which represents position) and replace \(v\) with \(s'\) (since velocity is the derivative of position), Equation 4.4.2 then reads as

\[ s(b) - s(a) = \int_a^b s'(t)\,dt. \tag{4.4.3} \]
a

In words, this version of the FTC tells us that the total change in an object's position function on a particular interval is
given by the definite integral of the position function's derivative over that interval.
Of course, this result is not limited to only the setting of position and velocity. Writing the result in terms of a more general
function f , we have the Total Change Theorem.

Theorem 4.4.1: Total Change Theorem


If \(f\) is a continuously differentiable function on \([a, b]\) with derivative \(f'\), then \(f(b) - f(a) = \int_a^b f'(x)\,dx\). That is, the definite integral of the rate of change of a function on \([a, b]\) is the total change of the function itself on \([a, b]\).

The Total Change Theorem tells us more about the relationship between the graph of a function and that of its derivative.
Recall that heights on the graph of the derivative function are equal to slopes on the graph of the function itself. If instead
we know \(f'\) and are seeking information about \(f\), we can say the following:

differences in heights on \(f\) correspond to net signed areas bounded by \(f'\).

Figure 4.4.5: The graphs of \(f'(x) = 4 - 2x\) (at left) and an antiderivative \(f(x) = 4x - x^2\) (at right). Differences in heights on \(f\) correspond to net signed areas bounded by \(f'\).

To see why this is so, consider the difference \(f(1) - f(0)\). This value is 3, because \(f(1) = 3\) and \(f(0) = 0\), but also because the net signed area bounded by \(y = f'(x)\) on \([0, 1]\) is 3. That is,

\[ f(1) - f(0) = \int_0^1 f'(x)\,dx. \]
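
The same comparison can be sketched numerically for several intervals. The code below assumes the functions from Figure 4.4.5, \(f(x) = 4x - x^2\) and \(f'(x) = 4 - 2x\), and uses a midpoint Riemann sum (the helper `midpoint_sum` and the chosen intervals are illustrative, not from the text) to estimate the net signed area under \(f'\).

```python
def f(x):
    return 4 * x - x ** 2


def f_prime(x):
    return 4 - 2 * x


def midpoint_sum(g, a, b, n=1000):
    """Midpoint Riemann sum of g on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx


# Compare the change in height on f with the net signed area under f'.
for a, b in [(0.0, 1.0), (1.0, 3.0), (0.0, 4.0)]:
    print((a, b), f(b) - f(a), midpoint_sum(f_prime, a, b))
```

On \([1, 3]\) the positive and negative areas under \(f'\) cancel, which matches the fact that \(f(3) = f(1)\).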

In addition to this observation about area, the Total Change Theorem enables us to answer questions about a function
whose rate of change we know.

Example 4.4.1

Suppose that pollutants are leaking out of an underground storage tank at a rate of \(r(t)\) gallons/day, where \(t\) is measured in days. It is conjectured that \(r(t)\) is given by the formula \(r(t) = 0.0069t^3 - 0.125t^2 + 11.079\) over a certain 12-day period. The graph of \(y = r(t)\) is given in Figure 4.4.6. What is the meaning of \(\int_4^{10} r(t)\,dt\) and what is its value? What is the average rate at which pollutants are leaving the tank on the time interval \(4 \le t \le 10\)?

Figure 4.4.6 : The rate r(t) of pollution leaking from a tank, measured in gallons per day.
Solution
Since \(r(t) \ge 0\), the value of \(\int_4^{10} r(t)\,dt\) is the area under the curve on the interval \([4, 10]\). A Riemann sum for this area will have rectangles with heights measured in gallons per day and widths measured in days, so the area of each rectangle will have units of

\[ \frac{\text{gallons}}{\text{day}} \cdot \text{days} = \text{gallons}. \]

Thus, the definite integral tells us the total number of gallons of pollutant that leak from the tank from day 4 to day 10.
The Total Change Theorem tells us the same thing: if we let \(R(t)\) denote the total number of gallons of pollutant that have leaked from the tank up to day \(t\), then \(R'(t) = r(t)\), and

\[ \int_4^{10} r(t)\,dt = R(10) - R(4), \]

the number of gallons that have leaked from day 4 to day 10.
To compute the exact value of the integral, we use the Fundamental Theorem of Calculus. Antidifferentiating \(r(t) = 0.0069t^3 - 0.125t^2 + 11.079\), we find that

\[
\begin{aligned}
\int_4^{10} \left(0.0069t^3 - 0.125t^2 + 11.079\right) dt &= \left(0.0069 \cdot \frac{1}{4}t^4 - 0.125 \cdot \frac{1}{3}t^3 + 11.079t\right)\Big|_4^{10} \\
&\approx 44.282.
\end{aligned}
\]

Thus, approximately 44.282 gallons of pollutant leaked over the six-day time period.
To find the average rate at which pollutant leaked from the tank over \(4 \le t \le 10\), we compute the average value of \(r\) on \([4, 10]\). Thus,

\[ r_{\text{AVG}[4,10]} = \frac{1}{10 - 4}\int_4^{10} r(t)\,dt \approx \frac{44.282}{6} = 7.380 \]

gallons per day.
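
The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the function names `r` and `R` mirror the notation above, and `R` is one particular antiderivative of `r` (with constant term zero), chosen for convenience.

```python
def r(t):
    # Conjectured leak rate in gallons per day.
    return 0.0069 * t ** 3 - 0.125 * t ** 2 + 11.079


def R(t):
    # An antiderivative of r, found term by term with the power rule.
    return 0.0069 * t ** 4 / 4 - 0.125 * t ** 3 / 3 + 11.079 * t


total_gallons = R(10) - R(4)              # FTC: integral of r from 4 to 10
average_rate = total_gallons / (10 - 4)   # average value of r on [4, 10]
print(round(total_gallons, 3), round(average_rate, 3))
```

The printed values should match the 44.282 gallons and 7.380 gallons per day found above.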

Exercise 4.4.1

During a 40-minute workout, a person riding an exercise machine burns calories at a rate of \(c\) calories per minute, where the function \(y = c(t)\) is given in Figure 4.4.7. On the interval \(0 \le t \le 10\), the formula for \(c\) is \(c(t) = -0.05t^2 + t + 10\), while on \(30 \le t \le 40\), its formula is \(c(t) = -0.05t^2 + 3t - 30\).

Figure 4.4.7: The rate \(c(t)\) at which a person exercising burns calories, measured in calories per minute.
1. What is the exact total number of calories the person burns during the first 10 minutes of her workout?
2. Let C (t) be an antiderivative of c(t). What is the meaning of C (40) − C (0) in the context of the person
exercising? Include units on your answer.
3. Determine the exact average rate at which the person burned calories during the 40-minute workout.
4. At what time(s), if any, is the instantaneous rate at which the person is burning calories equal to the average rate at
which she burns calories, on the time interval 0 ≤ t ≤ 40?

Summary
We can find the exact value of a definite integral without taking the limit of a Riemann sum or using a familiar area formula by finding an antiderivative of the integrand and then applying the Fundamental Theorem of Calculus.

The Fundamental Theorem of Calculus says that if \(f\) is a continuous function on \([a, b]\) and \(F\) is an antiderivative of \(f\), then

\[ \int_a^b f(x)\,dx = F(b) - F(a). \]

Hence, if we can find an antiderivative for the integrand \(f\), evaluating the definite integral comes from simply computing the change in \(F\) on \([a, b]\).
A slightly different perspective on the FTC allows us to restate it as the Total Change Theorem, which says that

\[ \int_a^b f'(x)\,dx = f(b) - f(a), \]

for any continuously differentiable function \(f\). This means that the definite integral of the instantaneous rate of change of a function \(f\) on an interval \([a, b]\) is equal to the total change in the function \(f\) on \([a, b]\).

Exercises

1. Finding exact displacement.

The velocity function is \(v(t) = -t^2 + 4t - 3\) for a particle moving along a line. Find the displacement (net distance covered) of the particle during the time interval \([-1, 5]\).

displacement =

2. Evaluating the definite integral of a rational function.

The value of \(\int_2^4 \frac{1}{x^2}\,dx\) is

3. Evaluating the definite integral of a linear function.

Evaluate the definite integral

∫ (4x + 10) dx.


2

4. Evaluating the definite integral of a quadratic function.

Evaluate the definite integral

\[ \int_{-6}^{6} (36 - x^2)\,dx. \]


5. Simplifying an integrand before integrating.

Evaluate the definite integral

\[ \int_3^8 \frac{8x^2 + 3}{\sqrt{x}}\,dx. \]

6. Evaluating the definite integral of a trigonometric function.

Evaluate the definite integral

\[ \int_0^{\pi} 8\sin(x)\,dx. \]


7.
The instantaneous velocity (in meters per minute) of a moving object is given by the function \(v\) as pictured in Figure 4.4.10. Assume that on the interval \(0 \le t \le 4\), \(v(t)\) is given by \(v(t) = -\frac{1}{4}t^3 + \frac{3}{2}t^2 + 1\), and that on every other interval \(v\) is piecewise linear, as shown.

Figure 4.4.10. The velocity function of a moving body.


1. Determine the exact distance traveled by the object on the time interval 0 ≤ t ≤ 4.
2. What is the object's average velocity on [12, 24]?
3. At what time is the object's acceleration greatest?
4. Suppose that the velocity of the object is increased by a constant value c for all values of t. What value of c will
make the object's total distance traveled on [12, 24] be 210 meters?

8.
A function f is given piecewise by the formula
\[
f(x) =
\begin{cases}
-x^2 + 2x + 1, & \text{if } 0 \le x < 2 \\
-x + 3, & \text{if } 2 \le x < 3 \\
x^2 - 8x + 15, & \text{if } 3 \le x \le 5
\end{cases}.
\]

1. Determine the exact value of the net signed area enclosed by f and the x-axis on the interval [2, 5].
2. Compute the exact average value of f on [0, 5].

3. Find a formula for a function \(g\) on \(5 \le x \le 7\) so that if we extend the above definition of \(f\) so that \(f(x) = g(x)\) if \(5 \le x \le 7\), it follows that \(\int_0^7 f(x)\,dx = 0\).

9.
When an aircraft attempts to climb as rapidly as possible, its climb rate (in feet per minute) decreases as altitude
increases, because the air is less dense at higher altitudes. Given below is a table showing performance data for a certain
single engine aircraft, giving its climb rate at various altitudes, where c(h) denotes the climb rate of the airplane at an
altitude h.

h (feet) | 0 | 1000 | 2000 | 3000 | 4000 | 5000 | 6000 | 7000 | 8000 | 9000 | 10,000
c (ft/min) | 925 | 875 | 830 | 780 | 730 | 685 | 635 | 585 | 535 | 490 | 440

Let a new function called m(h) measure the number of minutes required for a plane at altitude h to climb the next foot
of altitude.
1. Determine a similar table of values for m(h) and explain how it is related to the table above. Be sure to explain the
units.
2. Give a careful interpretation of a function whose derivative is m(h). Describe what the input is and what the output
is. Also, explain in plain English what the function tells us.
3. Determine a definite integral whose value tells us exactly the number of minutes required for the airplane to ascend
to 10,000 feet of altitude. Clearly explain why the value of this integral has the required meaning.
4. Use the Riemann sum \(M_5\) to estimate the value of the integral you found in part 3. Include units on your result.

10.
In Chapter 1, we showed that for an object moving along a straight line with position function s(t), the object's
“average velocity on the interval [a, b]” is given by
\[ AV_{[a,b]} = \frac{s(b) - s(a)}{b - a}. \]

More recently in Chapter 4, we found that for an object moving along a straight line with velocity function \(v(t)\), the object's “average value of its velocity function on \([a, b]\)” is

\[ v_{\text{AVG}[a,b]} = \frac{1}{b - a}\int_a^b v(t)\,dt. \]

Are the “average velocity on the interval \([a, b]\)” and the “average value of the velocity function on \([a, b]\)” the same thing? Why or why not? Explain.

11.
In Table 4.4.5 in Activity 4.4.3, we noted that for \(x > 0\), the antiderivative of \(f(x) = \frac{1}{x}\) is \(F(x) = \ln(x)\). Here we observe that a key difference between \(f(x)\) and \(F(x)\) is that \(f\) is defined for all \(x \ne 0\), while \(F\) is only defined for \(x > 0\), and see how we can actually define the antiderivative of \(f\) for all values of \(x\).

1. Suppose that \(x < 0\), and let \(G(x) = \ln(-x)\). Compute \(G'(x)\).
2. Explain why \(G\) is an antiderivative of \(f\) for \(x < 0\).

3. Let \(H(x) = \ln(|x|)\), and recall that

\[
|x| =
\begin{cases}
-x, & \text{if } x < 0 \\
x, & \text{if } x \ge 0
\end{cases}.
\]

Explain why \(H(x) = G(x)\) for \(x < 0\) and \(H(x) = F(x)\) for \(x > 0\).
4. Now discuss why we say that the antiderivative of \(\frac{1}{x}\) is \(\ln(|x|)\) for all \(x \ne 0\).

4.E: The Definite Integral (Exercises)
4.1: Determining Distance Traveled from Velocity
1. Along the eastern shore of Lake Michigan from Lake Macatawa (near Holland) to Grand Haven, there is a bike path that
runs almost directly north-south. For the purposes of this problem, assume the road is completely straight, and that the
function s(t) tracks the position of the biker along this path in miles north of Pigeon Lake, which lies roughly halfway
between the ends of the bike path. Suppose that the biker’s velocity function is given by the graph in Figure 4.9 on the time
interval 0 ≤ t ≤ 4 (where t is measured in hours), and that

Figure 4.9: The graph of the biker’s velocity, y = v(t), at left. At right, axes to plot an approximate sketch of y = s(t).
(a) Approximately how far north of Pigeon Lake was the cyclist when she was the greatest distance away from Pigeon
Lake? At what time did this occur?
(b) What is the cyclist’s total change in position on the time interval 0 ≤ t ≤ 2? At t = 2, was she north or south of Pigeon
Lake?
(c) What is the total distance the biker traveled on 0 ≤ t ≤ 4? At the end of the ride, how close was she to the point at which
she started?
(d) Sketch an approximate graph of y = s(t), the position function of the cyclist, on the interval 0 ≤ t ≤ 4. Label at least four
important points on the graph of s.
2. A toy rocket is launched vertically from the ground on a day with no wind. The rocket’s vertical velocity at time t (in
seconds) is given by v(t) = 500 − 32t feet/sec.
(a) At what time after the rocket is launched does the rocket’s velocity equal zero? Call this time value a. What happens to
the rocket at t = a?
(b) Find the value of the total area enclosed by y = v(t) and the t-axis on the interval 0 ≤ t ≤ a. What does this area
represent in terms of the physical setting of the problem?
(c) Find an antiderivative s of the function v. That is, find a function s such that \(s'(t) = v(t)\).
(d) Compute the value of s(a) − s(0). What does this number represent in terms of the physical setting of the problem?
(e) Compute s(5) − s(1). What does this number tell you about the rocket’s flight?
3. An object moving along a horizontal axis has its instantaneous velocity at time t in seconds given by the function v
pictured in Figure 4.10, where v is measured in feet/sec. Assume that the curves that make up the parts of the graph of y = v(t) are either portions of straight lines or portions of circles.

Figure 4.10: The graph of y = v(t), the velocity function of a moving object.
(a) Determine the exact total distance the object traveled on 0 ≤ t ≤ 2.
(b) What is the value and meaning of s(5) − s(2), where y = s(t) is the position function of the moving object?
(c) On which time interval did the object travel the greatest distance: [0, 2], [2, 4], or [5, 7]?
(d) On which time interval(s) is the position function s increasing? At which point(s) does s achieve a relative maximum?
4. Filters at a water treatment plant become dirtier over time and thus become less effective; they are replaced every 30
days. During one 30-day period, the rate at which pollution passes through the filters into a nearby lake (in units of
particulate matter per day) is measured every 6 days and is given in the following table. The time t is measured in days
since the filters were replaced.

(a) Plot the given data on a set of axes with time on the horizontal axis and the rate of pollution on the vertical axis.
(b) Explain why the amount of pollution that entered the lake during this 30-day period would be given exactly by the area
bounded by y = p(t) and the t-axis on the time interval [0, 30].
(c) Estimate the total amount of pollution entering the lake during this 30-day period. Carefully explain how you
determined your estimate.

4.2: Riemann Sums


1. Consider the function f (x) = 3x + 4.
(a) Compute \(M_4\) for y = f (x) on the interval [2, 5]. Be sure to clearly identify the value of \(\Delta x\), as well as the locations of \(x_0, x_1, \ldots, x_4\). Include a careful sketch of the function and the corresponding rectangles being used in the sum.
(b) Use a familiar geometric formula to determine the exact value of the area of the region bounded by y = f (x) and the x-
axis on [2, 5].
(c) Explain why the values you computed in (a) and (b) turn out to be the same. Will this be true if we use a number
different than n = 4 and compute \(M_n\)? Will \(L_4\) or \(R_4\) have the same value as the exact area of the region found in (b)?
(d) Describe the collection of functions g for which it will always be the case that \(M_n\), regardless of the value of n, gives the exact net signed area bounded between the function g and the x-axis on the interval [a, b].
2. Let S be the sum given by \(S = ((1.4)^2 + 1) \cdot 0.4 + ((1.8)^2 + 1) \cdot 0.4 + ((2.2)^2 + 1) \cdot 0.4 + ((2.6)^2 + 1) \cdot 0.4 + ((3.0)^2 + 1) \cdot 0.4\).
(a) Assume that S is a right Riemann sum. For what function f and what interval [a, b] is S an approximation of the area under f and above the x-axis on [a, b]? Why?
(b) How does your answer to (a) change if S is a left Riemann sum? a middle Riemann sum?
(c) Suppose that S really is a right Riemann sum. What geometric quantity does S approximate?
(d) Use sigma notation to write a new sum R that is the right Riemann sum for the same function, but that uses twice as
many subintervals as S.
3. A car traveling along a straight road is braking and its velocity is measured at several different points in time, as given in
the following table.

(a) Plot the given data on a set of axes with time on the horizontal axis and the velocity on the vertical axis.
(b) Estimate the total distance traveled during the time the car brakes using a middle Riemann sum with 3 subintervals.
(c) Estimate the total distance traveled on [0, 1.8] by computing \(L_6\), \(R_6\), and \(\frac{1}{2}(L_6 + R_6)\).

(d) Assuming that v(t) is always decreasing on [0, 1.8], what is the maximum possible distance the car traveled before it
stopped? Why?
4. The rate at which pollution escapes a scrubbing process at a manufacturing plant increases over time as filters and other
technologies become less effective. For this particular example, assume that the rate of pollution (in tons per week) is
given by the function r that is pictured in Figure 4.18.

Figure 4.18: The rate, r(t), of pollution in tons per week.


(a) Use the graph to estimate the value of \(M_4\) on the interval [0, 4].
(b) What is the meaning of \(M_4\) in terms of the pollution discharged by the plant?
(c) Suppose that \(r(t) = 0.5e^{0.5t}\). Use this formula for r to compute \(L_5\) on [0, 4].
(d) Determine an upper bound on the total amount of pollution that can escape the plant during the pictured four week time
period that is accurate within an error of at most one ton of pollution.

4.3: The Definite Integral


1. The velocity of an object moving along an axis is given by the piecewise linear function v that is pictured in Figure 4.29.
Assume that the object is moving to the right when its velocity is positive, and
moving to the left when its velocity is negative. Assume that the given velocity function is valid for t = 0 to t = 4.

Figure 4.29: The velocity function of a moving object.


(a) Write an expression involving definite integrals whose value is the total change in position of the object on the interval
[0, 4].
(b) Use the provided graph of v to determine the value of the total change in position on [0, 4].
(c) Write an expression involving definite integrals whose value is the total distance traveled by the object on [0, 4]. What
is the exact value of the total distance traveled on [0, 4]?
(d) What is the object’s exact average velocity on [0, 4]?
(e) Find an algebraic formula for the object’s position function on [0, 1.5] that satisfies s(0) = 0.
2. Suppose that the velocity of a moving object is given by v(t) = t(t − 1)(t − 3), measured in feet per second, and that this
function is valid for 0 ≤ t ≤ 4.
(a) Write an expression involving definite integrals whose value is the total change in position of the object on the interval
[0, 4].
(b) Use appropriate technology (such as http://gvsu.edu/s/a99 ) to compute Riemann sums to estimate the object’s total
change in position on [0, 4]. Work to ensure that your estimate is accurate to two decimal places, and explain how you
know this to be the case.
(c) Write an expression involving definite integrals whose value is the total distance traveled by the object on [0, 4].
Footnote 9: Marc Renault, Shippensburg University.
(d) Use appropriate technology to compute Riemann sums to estimate the object’s total distance travelled on [0, 4]. Work
to ensure that your estimate is accurate to two decimal places, and explain how you know this to be the case.
(e) What is the object’s average velocity on [0, 4], accurate to two decimal places?
3. Consider the graphs of two functions f and g that are provided in Figure 4.30. Each piece of f and g is either part of a straight line or part of a circle.

Figure 4.30: Two functions f and g.


(a) Determine the exact value of \(\int_0^1 [f(x) + g(x)]\,dx\).
(b) Determine the exact value of \(\int_1^4 [2f(x) - 3g(x)]\,dx\).
(c) Find the exact average value of h(x) = g(x) − f (x) on [0, 4].
(d) For what constant c does the following equation hold? \(\int_0^4 c\,dx = \int_0^4 [f(x) + g(x)]\,dx\)
4. Let \(f(x) = 3 - x^2\) and \(g(x) = 2x^2\).
(a) On the interval [−1, 1], sketch a labeled graph of y = f (x) and write a definite integral whose value is the exact area
bounded by y = f (x) on [−1, 1].
(b) On the interval [−1, 1], sketch a labeled graph of y = g(x) and write a definite integral whose value is the exact area
bounded by y = g(x) on [−1, 1].
(c) Write an expression involving a difference of definite integrals whose value is the exact area that lies between y = f (x)
and y = g(x) on [−1, 1].
(d) Explain why your expression in (c) has the same value as the single integral \(\int_{-1}^{1} [f(x) - g(x)]\,dx\).
(e) Explain why, in general, if p(x) ≥ q(x) for all x in [a, b], the exact area between y = p(x) and y = q(x) is given by \(\int_a^b [p(x) - q(x)]\,dx\).

4.4: The Fundamental Theorem of Calculus

Exercises
1. The instantaneous velocity (in meters per minute) of a moving object is given by the function v as pictured in Figure
4.37. Assume that on the interval 0 ≤ t ≤ 4, v(t) is given by \(v(t) = -\frac{1}{4}t^3 + \frac{3}{2}t^2 + 1\), and that on every other interval v
is piecewise linear, as shown.
(a) Determine the exact distance traveled by the object on the time interval 0 ≤ t ≤ 4.
(b) What is the object’s average velocity on [12, 24]?
(c) At what time is the object’s acceleration greatest?

(d) Suppose that the velocity of the object is increased by a constant value c for all values of t. What value of c will make
the object’s total distance traveled on [12, 24] be 210 meters?

Figure 4.37: The velocity function of a moving body.


2. A function f is given piecewise by the formula

\[
f(x) =
\begin{cases}
-x^2 + 2x + 1, & \text{if } 0 \le x < 2 \\
-x + 3, & \text{if } 2 \le x < 3 \\
x^2 - 8x + 15, & \text{if } 3 \le x \le 5
\end{cases}
\]
(a) Determine the exact value of the net signed area enclosed by f and the x-axis on the interval [2, 5].
(b) Compute the exact average value of f on [0, 5].
(c) Find a formula for a function g on 5 ≤ x ≤ 7 so that if we extend the above definition of f so that f (x) = g(x) if 5 ≤ x ≤
7, it follows that \(\int_0^7 f(x)\,dx = 0\).
3. When an aircraft attempts to climb as rapidly as possible, its climb rate (in feet per minute) decreases as altitude
increases, because the air is less dense at higher altitudes. Given below is a table showing performance data for a certain
single engine aircraft, giving its climb rate at various altitudes, where c(h) denotes the climb rate of the airplane at an
altitude h.

Let a new function called m(h) measure the number of minutes required for a plane at altitude h to climb the next foot of
altitude.
(a) Determine a similar table of values for m(h) and explain how it is related to the table above. Be sure to explain the
units.
(b) Give a careful interpretation of a function whose derivative is m(h). Describe what the input is and what the output is.
Also, explain in plain English what the function tells us.
(c) Determine a definite integral whose value tells us exactly the number of minutes required for the airplane to ascend to
10,000 feet of altitude. Clearly explain why the value of this integral has the required meaning.
(d) Use the Riemann sum \(M_5\) to estimate the value of the integral you found in (c). Include units on your result.
4. In Chapter 1, we showed that for an object moving along a straight line with position function s(t), the object’s “average
velocity on the interval [a, b]” is given by \(AV_{[a,b]} = \frac{s(b) - s(a)}{b - a}\). More recently in Chapter 4, we found that for an object moving along a straight line with velocity function v(t), the object’s “average value of its velocity function on [a, b]” is \(v_{\text{AVG}[a,b]} = \frac{1}{b - a}\int_a^b v(t)\,dt\). Are the “average velocity on the interval [a, b]” and the “average value of the velocity function on [a, b]” the same thing? Why or why not? Explain.

CHAPTER OVERVIEW

5: FINDING ANTIDERIVATIVES AND EVALUATING INTEGRALS
An Introductory Calculus Libretexts Textmap
Active Calculus
by Matt Boelkins, David Austin, and Steve Schlicker
Chapter 1

Chapter 1: Understanding the Derivative


1.1: How do we Measure Velocity?
1.2: The Notion of Limit
1.3: The Derivative of a Function at a Point
1.4: The Derivative Function
1.5: Interpreting, Estimating, and Using the Derivative
1.6: The Second Derivative
1.7: Limits, Continuity, and Differentiability
1.8: The Tangent Line Approximation
1.E: Understanding the Derivative (Exercises)

• Chapter 2

Chapter 2: Computing Derivatives


2.1: Elementary Derivative Rules
2.2: The Sine and Cosine Function
2.3: The Product and Quotient Rules
2.4: Derivatives of Other Trigonometric Functions
2.5: The Chain Rule
2.6: Derivatives of Inverse Functions
2.7: Derivatives of Functions Given Implicitly
2.8: Using Derivatives to Evaluate Limits
2.E: Computing Derivatives (Exercises)

• Chapter 3

Chapter 3: Using Derivatives


3.1: Using Derivatives to Identify Extreme Values
3.2: Using Derivatives to Describe Families of Functions
3.3: Global Optimization
3.4: Applied Optimization
3.5: Related Rates
3.E: Using Derivatives (Exercises)

• Chapter 4

Chapter 4: The Definite Integral


4.1: Determining Distance Traveled from Velocity
4.2: Riemann Sums
4.3: The Definite Integral
4.4: The Fundamental Theorem of Calculus
4.E: The Definite Integral (Exercises)

• Chapter 5

Chapter 5: Finding Antiderivatives and Evaluating Integrals


5.1: Constructing Accurate Graphs of Antiderivatives
5.2: The Second Fundamental Theorem of Calculus
5.3: Integration by Substitution
5.4: Integration by Parts
5.5: Other Options for Finding Algebraic Antiderivatives
5.6: Numerical Integration
5.E: Finding Antiderivatives and Evaluating Integrals (Exercises)

• Chapter 6

Chapter 6: Using Definite Integrals


6.1: Using Definite Integrals to Find Area and Length
6.2: Using Definite Integrals to Find Volume
6.3: Density, Mass, and Center of Mass
6.4: Physics Applications: Work, Force, and Pressure
6.5: Improper Integrals
6.E: Using Definite Integrals (Exercises)

• Chapter 7

Chapter 7: Differential Equations


7.1: An Introduction to Differential Equations
7.2: Qualitative Behavior of Solutions to Differential Equations
7.3: Euler's Method
7.4: Separable Differential Equations
7.5: Modeling with Differential Equations
7.6: Population Growth and the Logistic Equation
7.E: Differential Equations (Exercises)

• Chapter 8

Chapter 8: Sequences and Series


8.1: Sequences
8.2: Geometric Series
8.3: Series of Real Numbers
8.4: Alternating Series
8.5: Taylor Polynomials and Taylor Series
8.6: Power Series
8.E: Sequences and Series (Exercises)

5.1: CONSTRUCTING ACCURATE GRAPHS OF ANTIDERIVATIVES


Given the graph of a function f, we can construct the graph of its antiderivative F provided that (a) we know a starting value of F, say
F(a), and (b) we can evaluate the integral \(\int_a^b f(x)\,dx\) exactly for relevant choices of a and b. Moreover, any function with at least one
antiderivative in fact has infinitely many, and the graphs of any two antiderivatives will differ only by a vertical translation.

5.2: THE SECOND FUNDAMENTAL THEOREM OF CALCULUS


The Second Fundamental Theorem of Calculus is the formal, more general statement of the preceding fact: if f is a continuous
function and c is any constant, then \(A(x) = \int_c^x f(t)\,dt\) is the unique antiderivative of f that satisfies A(c) = 0. Together, the First and
Second FTC enable us to formally see how differentiation and integration are almost inverse processes through the observations that
\(\int_c^x \frac{d}{dt}[f(t)]\,dt = f(x) - f(c)\) and \(\frac{d}{dx}\left[\int_c^x f(t)\,dt\right] = f(x)\).

5.3: INTEGRATION BY SUBSTITUTION


The technique of u-substitution helps us evaluate indefinite integrals of the form \(\int f(g(x))\,g'(x)\,dx\) through the substitutions u = g(x)
and du = g'(x) dx. A key part of choosing the expression in x to be represented by u is the identification of a function-derivative pair.
To do so, we often look for an “inner” function g(x) that is part of a composite function, while investigating whether g'(x) (or a
constant multiple of g'(x)) is present as a multiplying factor of the integrand.

5.4: INTEGRATION BY PARTS


Through the method of Integration by Parts, we can evaluate indefinite integrals that involve products of basic functions through a
substitution that enables us to effectively trade one of the functions in the product for its derivative, and the other for its antiderivative,
in an effort to find a different product of functions that is easier to integrate.

5.5: OTHER OPTIONS FOR FINDING ALGEBRAIC ANTIDERIVATIVES


The method of partial fractions enables any rational function to be antidifferentiated, because any polynomial function can be factored
into a product of linear and irreducible quadratic terms. Until the development of computer algebra systems, integral tables enabled
students of calculus to more easily evaluate integrals. Computer algebra systems can play an important role in finding antiderivatives,
though we must be cautious to watch for unusual or unfamiliar advanced functions.

5.6: NUMERICAL INTEGRATION
When we cannot use the First Fundamental Theorem of Calculus because the integrand lacks an elementary algebraic
antiderivative, we can estimate the integral’s value by using a sequence of Riemann sum approximations. The Trapezoid and
Midpoint Rules are two approaches to calculating such approximations.

5.E: FINDING ANTIDERIVATIVES AND EVALUATING INTEGRALS (EXERCISES)


These are homework exercises to accompany Chapter 5 of Boelkins et al. "Active Calculus" Textmap.

5.1: Constructing Accurate Graphs of Antiderivatives
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
Given the graph of a function’s derivative, how can we construct a completely accurate graph of the original
function?
How many antiderivatives does a given function have? What do those antiderivatives all have in common?
Given a function f, how does the rule \(A(x) = \int_0^x f(t)\,dt\) define a new function A?

A recurring theme in our discussion of differential calculus has been the question “Given information about the derivative
of an unknown function f, how much information can we obtain about f itself?” For instance, in Activity 1.22, we
explored the situation where the graph of y = f'(x) was known (along with the value of f at a single point) and
endeavored to sketch a possible graph of f near the known point. In Example 3.1 (and indeed throughout Section 3.1)
we investigated how the first derivative test enables us to use information regarding f' to determine where the original
function f is increasing and decreasing, as well as where f has relative extreme values. Further, if we know a formula or
graph of f', by computing f'' we can find where the original function f is concave up and concave down. Thus, the
combination of knowing f' and f'' enables us to fully understand the shape of the graph of f.

We returned to this question in even more detail in Section 4.1; there, we considered the situation where we knew the
instantaneous velocity of a moving object and worked from that information to determine as much information as possible
about the object’s position function. We found key connections between the net-signed area under the velocity function and
the corresponding change in position of the object; in Section 4.4, the Total Change Theorem further illuminated these
connections between f' and f in a more general setting, such as the one found in Figure 4.34, showing that the total
change in the value of f over an interval [a, b] is determined by the exact net-signed area bounded by f' and the x-axis on
the same interval. In what follows, we explore these issues still further, with a particular emphasis on the situation where
we possess an accurate graph of the derivative function along with a single value of the function f. From that information,
we desire to completely determine an accurate graph of f that not only represents correctly where f is increasing,
decreasing, concave up, and concave down, but also allows us to find an accurate function value at any point of interest to
us.

Preview Activity 5.1.1:

Suppose that the following information is known about a function f: the graph of its derivative, y = f'(x), is given in
Figure 5.1. Further, assume that f' is piecewise linear (as pictured) and that for x ≤ 0 and x ≥ 6, f'(x) = 0. Finally,
it is given that f(0) = 1.

Figure 5.1: At left, the graph of y = f'(x); at right, axes for plotting y = f(x).
(a) On what interval(s) is f an increasing function? On what intervals is f decreasing?
(b) On what interval(s) is f concave up? concave down?
(c) At what point(s) does f have a relative minimum? a relative maximum?
(d) Recall that the Total Change Theorem tells us that
\[ f(1) - f(0) = \int_0^1 f'(x)\,dx. \]
What is the exact value of f(1)?


(e) Use the given information and similar reasoning to that in (d) to determine the exact value of
f (2), f (3), f (4), f (5), and f (6).

(f) Based on your responses to all of the preceding questions, sketch a complete and accurate graph of y = f (x) on the
axes provided, being sure to indicate the behavior of f for x < 0 and x > 6 .

Constructing the graph of an antiderivative


Preview Activity 5.1 demonstrates that when we can find the exact area under a given graph on any given interval, it is
possible to construct an accurate graph of the given function’s antiderivative: that is, we can find a representation of a
function whose derivative is the given one. While we have considered this question at different points throughout our
study, it is important to note here that we now can determine not only the overall shape of the antiderivative, but also the
actual height of the antiderivative at any point of interest.
Indeed, this is one key consequence of the Fundamental Theorem of Calculus: if we know a function f and wish to know
information about its antiderivative, F, provided that we have some starting point a for which we know the value of F(a),
we can determine the value of F(b) via the definite integral. In particular, since \(F(b) - F(a) = \int_a^b f(x)\,dx\), it follows
that
\[ F(b) = F(a) + \int_a^b f(x)\,dx. \tag{5.1.1} \]

Moreover, in the discussion surrounding Figure 4.34, we made the observation that differences in heights of a function
correspond to net-signed areas bounded by its derivative. Rephrasing this in terms of a given function f and its
antiderivative F , we observe that on an interval [a, b],
differences in heights on the antiderivative (such as \(F(b) - F(a)\)) correspond to the net-signed area bounded by the
original function on the interval [a, b] (that is, \(\int_a^b f(x)\,dx\)).
For example, say that \(f(x) = x^2\) and that we are interested in an antiderivative of f that satisfies F(1) = 2. Thinking of
a = 1 and b = 2 in Equation (5.1.1), it follows from the Fundamental Theorem of Calculus that
\[
\begin{aligned}
F(2) &= F(1) + \int_1^2 x^2\,dx \\
     &= 2 + \left.\frac{1}{3}x^3\right|_1^2 \\
     &= 2 + \left(\frac{8}{3} - \frac{1}{3}\right) \\
     &= \frac{13}{3}.
\end{aligned}
\tag{5.1.2}
\]
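The computation above can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates the integral with a midpoint Riemann sum and adds the result to the known starting value F(1) = 2. The helper name `midpoint_sum` and the choice of 1000 subintervals are illustrative assumptions.

```python
def midpoint_sum(f, a, b, n=1000):
    """Approximate the integral of f on [a, b] using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(x):
    return x ** 2


F_at_1 = 2.0                                  # given starting value F(1) = 2
F_at_2 = F_at_1 + midpoint_sum(f, 1.0, 2.0)   # F(2) = F(1) + integral of f from 1 to 2

print(F_at_2)   # numerical estimate
print(13 / 3)   # value found by hand above
```

The two printed values should agree to several decimal places, which is what Equation (5.1.1) predicts.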

In this way, we see that if we are given a function f for which we can find the exact net-signed area bounded by f on a
given interval, along with one value of a corresponding antiderivative F , we can find any other value of F that we seek,
and in this way construct a completely accurate graph of F . We have two main options for finding the exact net-signed
area: using the Fundamental Theorem of Calculus (which requires us to find an algebraic formula for an antiderivative of
the given function f ), or, in the case where f has nice geometric properties, finding net-signed areas through the use of
known area formulas.

Activity 5.1.1:

Suppose that the function y = f (x) is given by the graph shown in Figure 5.2, and that the pieces of f are either
portions of lines or portions of circles. In addition, let F be an antiderivative of f and say that F (0) = −1 . Finally,
assume that for x ≤ 0 and x ≥ 7 , f (x) = 0 .

Figure 5.2: At left, the graph of y = f (x).
a. On what interval(s) is F an increasing function? On what intervals is F decreasing?
b. On what interval(s) is F concave up? concave down? neither?
c. At what point(s) does F have a relative minimum? a relative maximum?
d. Use the given information to determine the exact value of F (x) for x = 1, 2, . . . , 7. In addition, what are the
values of F (−1) and F (8)?
e. Based on your responses to all of the preceding questions, sketch a complete and accurate graph of y = F (x) on
the axes provided, being sure to indicate the behavior of F for x < 0 and x > 7 . Clearly indicate the scale on the
vertical and horizontal axes of your graph.
f. What happens if we change one key piece of information: in particular, say that G is an antiderivative of f and
G(0) = 0 . How (if at all) would your answers to the preceding questions change? Sketch a graph of G on the same

axes as the graph of F you constructed in (e).

Multiple antiderivatives of a single function


In the final question of Activity 5.1, we encountered a very important idea: a given function f has more than one
antiderivative. In addition, any antiderivative of f is determined uniquely by identifying the value of the desired
antiderivative at a single point. For example, suppose that f is the function given at left in Figure 5.3, and we say that F is
an antiderivative of f that satisfies F (0) = 1 .

Figure 5.3: At left, the graph of y = f (x). At right, three different antiderivatives of f .
Then, using Equation (5.1.1), we can compute F(1) = 1.5, F(2) = 1.5, F(3) = −0.5, F(4) = −2, F(5) = −0.5, and
F(6) = 1, plus we can use the fact that F' = f to ascertain where F is increasing and decreasing, concave up and concave
down, and has relative extremes and inflection points. Through work similar to what we encountered in Preview Activity
5.1 and Activity 5.1, we ultimately find that the graph of F is the one given in blue in Figure 5.3.
If we instead chose to consider a function G that is an antiderivative of f but has the property that G(0) = 3, then G will
have the exact same shape as F (since both share the derivative f), but G will be shifted vertically away from the graph
of F, as pictured in red in Figure 5.3. Note that \(G(1) - G(0) = \int_0^1 f(x)\,dx = 0.5\), just as F(1) − F(0) = 0.5, but since
G(0) = 3, G(1) = G(0) + 0.5 = 3.5, whereas F(1) = F(0) + 0.5 = 1.5, since F(0) = 1. In the same way, if we
assigned a different initial value to the antiderivative, say H(0) = −1, we would get still another antiderivative, as shown
in magenta in Figure 5.3.
This example demonstrates an important fact that holds more generally:

Definition
If G and H are both antiderivatives of a function f , then the function G − H must be constant.

To see why this result holds, observe that if G and H are both antiderivatives of f, then G' = f and H' = f. Hence,
\(\frac{d}{dx}[G(x) - H(x)] = G'(x) - H'(x) = f(x) - f(x) = 0\). Since the only way a function can have derivative zero is by
being a constant function, it follows that the function G − H must be constant.
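This fact can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than anything from the text: it uses the sample function f(x) = x², builds two antiderivatives G and H from the hypothetical initial values G(0) = 3 and H(0) = −1 using Equation (5.1.1), and evaluates their difference at several points. The helper names `midpoint_sum` and `antiderivative` are invented for this example.

```python
def midpoint_sum(f, a, b, n=1000):
    """Approximate the integral of f from a to b using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def antiderivative(f, x0, y0):
    """Return the antiderivative of f that takes the value y0 at x0."""
    return lambda x: y0 + midpoint_sum(f, x0, x)


def f(x):
    return x ** 2


G = antiderivative(f, 0.0, 3.0)    # G(0) = 3
H = antiderivative(f, 0.0, -1.0)   # H(0) = -1

# G - H should be the same constant, 3 - (-1) = 4, at every input.
differences = [G(x) - H(x) for x in (-2.0, 0.0, 1.0, 2.5)]
print(differences)
```

Every entry of `differences` is the gap between the two initial values, which is the vertical shift between the two graphs.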


Further, we now see that if a function has a single antiderivative, it must have infinitely many: we can add any constant of
our choice to the antiderivative and get another antiderivative. For this reason, we sometimes refer to the general
antiderivative of a function f. For example, if \(f(x) = x^2\), its general antiderivative is \(F(x) = \frac{1}{3}x^3 + C\), where we
include the “+C” to indicate that F includes all of the possible antiderivatives of f. To identify a particular antiderivative
of f, we must be provided a single value of the antiderivative F (this value is often called an initial condition). In the
present example, suppose that condition is F(2) = 3; substituting the value of 2 for x in \(F(x) = \frac{1}{3}x^3 + C\), we find that
\[ 3 = \frac{1}{3}(2)^3 + C, \tag{5.1.3} \]
and thus \(C = 3 - \frac{8}{3} = \frac{1}{3}\). Therefore, the particular antiderivative in this case is \(F(x) = \frac{1}{3}x^3 + \frac{1}{3}\).
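Solving for the constant from an initial condition is a one-line computation. The following minimal sketch repeats the step above in Python; the function name `general_antiderivative` is an assumption made for illustration.

```python
def general_antiderivative(x, C):
    """General antiderivative of f(x) = x^2: F(x) = (1/3) x^3 + C."""
    return x ** 3 / 3 + C


x0, y0 = 2.0, 3.0                          # initial condition F(2) = 3
C = y0 - general_antiderivative(x0, 0.0)   # solve y0 = (1/3) x0^3 + C for C

print(C)                                   # approximately 1/3
print(general_antiderivative(x0, C))       # recovers the initial value 3
```

The same two lines work for any other initial condition: change `x0` and `y0`, and a different particular antiderivative results.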

Activity 5.1.2:

For each of the following functions, sketch an accurate graph of the antiderivative that satisfies the given initial
condition. In addition, sketch the graph of two additional antiderivatives of the given function, and state the
corresponding initial conditions that each of them satisfy. If possible, find an algebraic formula for the antiderivative
that satisfies the initial condition.
a. original function: g(x) = |x| − 1 ; initial condition: G(−1) = 0 ; interval for sketch: [−2, 2]
b. original function: h(x) = sin(x); initial condition: H (0) = 1 ; interval for sketch: [0, 4π]
c. original function: \(p(x) = \begin{cases} x^2, & \text{if } 0 < x \le 1 \\ -(x-2)^2, & \text{if } 1 < x < 2 \\ 0, & \text{otherwise} \end{cases}\);
initial condition: P(0) = 1; interval for sketch: [−1, 3]

Functions Defined by Integrals


In Equation (5.1.1), we found an important rule that enables us to compute the value of the antiderivative F at a point b,
provided that we know F(a) and can evaluate the definite integral from a to b of f. Again, that rule is
\[ F(b) = F(a) + \int_a^b f(x)\,dx. \tag{5.1.4} \]

In several examples, we have used this formula to compute several different values of F (b) and then plotted the points
(b, F (b)) to assist us in generating an accurate graph of F . That suggests that we may want to think of b , the upper limit of

integration, as a variable itself. To that end, we introduce the idea of an integral function, a function whose formula
involves a definite integral.
Given a continuous function f, we define the corresponding integral function A according to the rule
\[ A(x) = \int_a^x f(t)\,dt. \tag{5.1.5} \]

Note particularly that because we are using the variable x as the independent variable in the function A , and x determines
the other endpoint of the interval over which we integrate (starting from a ), we need to use a variable other than x as the
variable of integration. A standard choice is t , but any variable other than x is acceptable.
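The distinction between the input x and the variable of integration t is mirrored when an integral function is written as code. The following is a minimal sketch under stated assumptions: the name `integral_function`, the midpoint approximation, and the sample integrand cos(t) with a = 0 are all illustrative choices, not part of the text.

```python
import math


def integral_function(f, a, n=2000):
    """Return A, where A(x) approximates the integral of f(t) dt from a to x."""
    def A(x):
        h = (x - a) / n   # x is the input of A and the upper limit of integration
        # t = a + (i + 0.5) * h plays the role of the variable of integration
        return h * sum(f(a + (i + 0.5) * h) for i in range(n))
    return A


A = integral_function(math.cos, 0.0)

print(A(0.0))           # integrating over an interval of zero width gives 0
print(A(math.pi / 2))   # net-signed area under cos(t) on [0, pi/2], close to 1
print(A(math.pi))       # positive and negative areas cancel, close to 0
```

Inside `A`, the name `x` fixes the endpoint while the sample points `t` range over the interval, which is why the two roles need different names.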

One way to think of the function A is as the “net-signed area from a up to x” function, where we consider the region
bounded by y = f(t) on the relevant interval. For example, in Figure 5.4, we see a given function f pictured at left, and its
corresponding area function (choosing a = 0), \(A(x) = \int_0^x f(t)\,dt\), shown at right.

Note particularly that the function A measures the net-signed area from t = 0 to t = x bounded by the curve y = f(t);
this value is then reported as the corresponding height on the graph of y = A(x). It is even more natural to think of this
relationship between f and A dynamically. At http://gvsu.edu/s/cz, we find a Java applet (by David Austin, Grand Valley
State University) that brings the static picture in Figure 5.4 to life. There, the user can move the red point on the
function f and see how the corresponding height changes at the light blue point on the graph of A.

Figure 5.4: At left, the graph of the given function f. At right, the area function \(A(x) = \int_0^x f(t)\,dt\).
The choice of a is somewhat arbitrary. In the activity that follows, we explore how the value of a affects the graph of the
integral function, as well as some additional related issues.

Exercise 5.1.3

Suppose that g is given by the graph at left in Figure 5.5 and that A is the corresponding integral function defined by
\(A(x) = \int_1^x g(t)\,dt\).

Figure 5.5: At left, the graph of y = g(t); at right, axes for plotting y = A(x), where A is defined by the formula
\(A(x) = \int_1^x g(t)\,dt\).

a. On what interval(s) is A an increasing function? On what intervals is A decreasing? Why?


b. On what interval(s) do you think A is concave up? concave down? Why?
c. At what point(s) does A have a relative minimum? a relative maximum?
d. Use the given information to determine the exact values of A(0), A(1), A(2), A(3), A(4), A(5), and A(6).
e. Based on your responses to all of the preceding questions, sketch a complete and accurate graph of y = A(x) on
the axes provided, being sure to indicate the behavior of A for x < 0 and x > 6 .
f. How does the graph of B compare to A if B is instead defined by \(B(x) = \int_0^x g(t)\,dt\)?

Summary
In this section, we encountered the following important ideas:
Given the graph of a function f, we can construct the graph of its antiderivative F provided that (a) we know a
starting value of F, say F(a), and (b) we can evaluate the integral \(\int_a^b f(x)\,dx\) exactly for relevant choices of a and b.
For instance, if we wish to know F(3), we can compute \(F(3) = F(a) + \int_a^3 f(x)\,dx\). When we combine this
information about the function values of F together with our understanding of how the behavior of F' = f affects the
overall shape of F, we can develop a completely accurate graph of the antiderivative F.


Because the derivative of a constant is zero, if F is an antiderivative of f , it follows that G(x) = F (x) + C will also
be an antiderivative of f . Moreover, any two antiderivatives of a function f differ precisely by a constant. Thus, any
function with at least one antiderivative in fact has infinitely many, and the graphs of any two antiderivatives will differ
only by a vertical translation.
Given a function f, the rule \(A(x) = \int_a^x f(t)\,dt\) defines a new function A that measures the net-signed area bounded by
f on the interval [a, x]. We call the function A the integral function corresponding to f.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

5.2: The Second Fundamental Theorem of Calculus
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How does the integral function \(A(x) = \int_1^x f(t)\,dt\) define an antiderivative of f?

What is the statement of the Second Fundamental Theorem of Calculus?


How do the First and Second Fundamental Theorems of Calculus enable us to formally see how differentiation and
integration are almost inverse processes?

In Section 4.4, we learned the Fundamental Theorem of Calculus (FTC), which from here forward will be referred to as the
First Fundamental Theorem of Calculus, as in this section we develop a corresponding result that follows it. In particular,
recall that the First FTC tells us that if f is a continuous function on [a, b] and F is any antiderivative of f (that is,
F' = f), then
\[ \int_a^b f(x)\,dx = F(b) - F(a). \tag{5.2.1} \]

We have typically used this result in two settings: (1) where f is a function whose graph we know and for which we can
compute the exact area bounded by f on a certain interval [a, b], we can compute the change in an antiderivative F over
the interval; and (2) where f is a function for which it is easy to determine an algebraic formula for an antiderivative, we
may evaluate the integral exactly and hence determine the net-signed area bounded by the function on the interval. For the
former, see Preview Activity 5.1 or Activity 5.1. For the latter, we can easily evaluate exactly integrals such as
\[ \int_1^4 x^2\,dx, \tag{5.2.2} \]
since we know that the function \(F(x) = \frac{1}{3}x^3\) is an antiderivative of \(f(x) = x^2\). Thus,
\[
\begin{aligned}
\int_1^4 x^2\,dx &= \left.\frac{1}{3}x^3\right|_1^4 \\
                 &= \frac{1}{3}(4)^3 - \frac{1}{3}(1)^3 \\
                 &= 21.
\end{aligned}
\tag{5.2.3}
\]
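To see how the exact value supplied by the First FTC relates to the Riemann sums of Chapter 4, the following minimal sketch evaluates F(4) − F(1) and compares it with midpoint sums using more and more subintervals. The helper name `midpoint_sum` and the subinterval counts are illustrative assumptions.

```python
def midpoint_sum(f, a, b, n):
    """Approximate the integral of f on [a, b] using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(x):
    return x ** 2


def F(x):
    """An antiderivative of f: F(x) = (1/3) x^3."""
    return x ** 3 / 3


exact = F(4.0) - F(1.0)   # First FTC: total change in F on [1, 4]
approximations = {n: midpoint_sum(f, 1.0, 4.0, n) for n in (5, 50, 500)}

print(exact)
for n, value in approximations.items():
    print(n, value, abs(value - exact))   # the error shrinks as n grows
```

The antiderivative gives the value in one step, while the Riemann sums only approach it as the number of subintervals increases.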
Here we see that the First FTC can be viewed from at least two perspectives: first, as a tool to find the difference
F(b) − F(a) for an antiderivative F of the integrand f. In this situation, we need to be able to determine the value of the
integral \(\int_a^b f(x)\,dx\) exactly, perhaps through known geometric formulas for area. It is possible that we may not have a
formula for F itself. From a second perspective, the First FTC provides a way to find the exact value of a definite integral,
and hence a certain net-signed area exactly, by finding an antiderivative of the integrand and evaluating its total change
over the interval. In this latter case, we need to know a formula for the antiderivative F, as this enables us to compute
net-signed areas exactly through definite integrals, as demonstrated in Figure 5.9.

Figure 5.9: At left, the graph of \(f(x) = x^2\) on the interval [1, 4] and the area it bounds. At right, the antiderivative
function \(F(x) = \frac{1}{3}x^3\), whose total change on [1, 4] is the value of the definite integral at left.

We recall further that the value of a definite integral may have additional meaning depending on context: change in
position when the integrand is a velocity function, total pollutant leaked from a tank when the integrand is the rate at which
pollution is leaking, or other total changes that correspond to a given rate function that is the integrand. In addition, the
value of the definite integral is always connected to the average value of a continuous function on a given interval:
\(f_{AVG[a,b]} = \frac{1}{b-a}\int_a^b f(x)\,dx\).
Next, remember that in the last part of Section 5.1, we studied integral functions of the form \(A(x) = \int_c^x f(t)\,dt\). Figure
5.4 is a particularly important image to keep in mind as we work with integral functions, and the corresponding Java applet
at http://gvsu.edu/s/cz is likewise foundational to our understanding of the function A. In what follows, we use the First
FTC to gain additional understanding of the function \(A(x) = \int_c^x f(t)\,dt\), where the integrand f is given (either through a
graph or a formula), and c is a constant. In particular, we investigate further the special nature of the relationship between
the functions A and f.

Preview Activity 5.2.1:

Consider the function A defined by the rule
\[ A(x) = \int_1^x f(t)\,dt, \]
where f(t) = 4 − 2t.
a. Compute A(1) and A(2) exactly.
b. Use the First Fundamental Theorem of Calculus to find an equivalent formula for A(x) that does not involve
integrals. That is, use the First FTC to evaluate \(\int_1^x (4 - 2t)\,dt\).
c. Observe that f is a linear function; what kind of function is A?
d. Using the formula you found in (b) that does not involve integrals, compute A'(x).
e. While we have defined f by the rule f(t) = 4 − 2t, it is equivalent to say that f is given by the rule
f(x) = 4 − 2x. What do you observe about the relationship between A and f?

The Second Fundamental Theorem of Calculus


The result of Preview Activity 5.2 is not particular to the function f(t) = 4 − 2t, nor to the choice of “1” as the lower
bound in the integral that defines the function A. For instance, if we let \(f(t) = \cos(t) - t\) and set \(A(x) = \int_2^x f(t)\,dt\), then
we can determine a formula for A without integrals by the First FTC. Specifically,
\[
\begin{aligned}
A(x) &= \int_2^x (\cos(t) - t)\,dt \\
     &= \left.\left(\sin(t) - \frac{1}{2}t^2\right)\right|_2^x \\
     &= \sin(x) - \frac{1}{2}x^2 - (\sin(2) - 2).
\end{aligned}
\tag{5.2.4}
\]
Differentiating A(x), since (sin(2) − 2) is constant, it follows that
\[ A'(x) = \cos(x) - x, \tag{5.2.5} \]
and thus we see that A'(x) = f(x). This tells us that for this particular choice of f, A is an antiderivative of f. More
specifically, since \(A(2) = \int_2^2 f(t)\,dt = 0\), A is the only antiderivative of f for which A(2) = 0.
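The three claims in this example (the closed formula for A, the value A(2) = 0, and A' = f) can each be spot-checked numerically. The sketch below is an illustration only; the names `A_closed` and `A_numeric`, the test point x = 3, and the step size used for the central difference are assumptions.

```python
import math


def midpoint_sum(f, a, b, n=1000):
    """Approximate the integral of f from a to b using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(t):
    return math.cos(t) - t


def A_closed(x):
    """Formula obtained from the First FTC: sin(x) - x^2/2 - (sin(2) - 2)."""
    return math.sin(x) - x ** 2 / 2 - (math.sin(2) - 2)


def A_numeric(x):
    """The integral function itself: integral of f(t) dt from 2 to x."""
    return midpoint_sum(f, 2.0, x)


x = 3.0
step = 1e-5
slope = (A_closed(x + step) - A_closed(x - step)) / (2 * step)  # estimate of A'(3)

print(A_closed(2.0))              # A(2) = 0
print(A_closed(x), A_numeric(x))  # formula and integral agree
print(slope, f(x))                # A'(3) matches f(3) = cos(3) - 3
```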

In general, if f is any continuous function, and we define the function A by the rule
\[ A(x) = \int_c^x f(t)\,dt, \tag{5.2.6} \]
where c is an arbitrary constant, then we can show that A is an antiderivative of f. To see why, let’s demonstrate that
A'(x) = f(x) by using the limit definition of the derivative. Doing so, we observe that
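\[
\begin{aligned}
A'(x) &= \lim_{h \to 0} \frac{A(x+h) - A(x)}{h} \\
      &= \lim_{h \to 0} \frac{\int_c^{x+h} f(t)\,dt - \int_c^x f(t)\,dt}{h} \\
      &= \lim_{h \to 0} \frac{\int_x^{x+h} f(t)\,dt}{h},
\end{aligned}
\tag{5.2.7}
\]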

where the final equality in the preceding chain follows from the fact that
\(\int_c^x f(t)\,dt + \int_x^{x+h} f(t)\,dt = \int_c^{x+h} f(t)\,dt\). Now, observe that for small values of h,
\[ \int_x^{x+h} f(t)\,dt \approx f(x) \cdot h, \tag{5.2.8} \]

by a simple left-hand approximation of the integral; because f is continuous, the error in this approximation shrinks
faster than h does. Thus, as we take the limit in Equation (5.2.7), it follows that
\[ A'(x) = \lim_{h \to 0} \frac{\int_x^{x+h} f(t)\,dt}{h} = \lim_{h \to 0} \frac{f(x) \cdot h}{h} = f(x). \tag{5.2.9} \]
Hence, A is indeed an antiderivative of f. In addition, \(A(c) = \int_c^c f(t)\,dt = 0\). The preceding argument demonstrates the
truth of the Second Fundamental Theorem of Calculus, which we state as follows.
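The limit in this argument can be watched numerically: the difference quotient of A is the integral of f over [x, x + h] divided by h, and it should approach f(x) as h shrinks. The following is a minimal sketch with assumed choices (the integrand cos(t) − t from the earlier example, the point x = 1, and the helper name `midpoint_sum`).

```python
import math


def midpoint_sum(f, a, b, n=1000):
    """Approximate the integral of f from a to b using n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(t):
    return math.cos(t) - t


x = 1.0
errors = []
for h in (0.1, 0.01, 0.001):
    # (A(x + h) - A(x)) / h equals the integral of f over [x, x + h] divided by h
    quotient = midpoint_sum(f, x, x + h) / h
    errors.append(abs(quotient - f(x)))
    print(h, quotient, f(x))
```

Each tenfold reduction in h brings the quotient roughly ten times closer to f(1), which is consistent with A'(x) = f(x).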

Theorem
(Second FTC) If f is a continuous function and c is any constant, then f has a unique antiderivative A that
satisfies A(c) = 0, and that antiderivative is given by the rule \(A(x) = \int_c^x f(t)\,dt\).

Activity 5.2.2:

Suppose that f is the function given in Figure 5.10 and that f is a piecewise function whose parts are either portions of
lines or portions of circles, as pictured. In addition, let A be the function defined by the rule \(A(x) = \int_2^x f(t)\,dt\).

Figure 5.10: At left, the graph of y = f (x). At right, axes for sketching y = A(x) .
a. What does the Second FTC tell us about the relationship between A and f ?
b. Compute A(1) and A(3) exactly.
c. Sketch a precise graph of y = A(x) on the axes at right that accurately reflects where A is increasing and
decreasing, where A is concave up and concave down, and the exact values of A at x = 0, 1, . . . , 7.
d. How is A similar to, but different from, the function F that you found in Activity 5.1?
e. With as little additional work as possible, sketch precise graphs of the functions \(B(x) = \int_3^x f(t)\,dt\) and
\(C(x) = \int_1^x f(t)\,dt\). Justify your results with at least one sentence of explanation.

Understanding Integral Functions


The Second FTC provides us with a means to construct an antiderivative of any continuous function. In particular, if we
are given a continuous function g and wish to find an antiderivative G of g, we can now say that
\[ G(x) = \int_c^x g(t)\,dt \tag{5.2.10} \]
provides the rule for such an antiderivative, and moreover that G(c) = 0. Note especially that we know that
G'(x) = g(x). We sometimes want to write this relationship between G and g from a different notational perspective. In
particular, observe that
\[ \frac{d}{dx}\left[\int_c^x g(t)\,dt\right] = g(x). \tag{5.2.11} \]

This result can be particularly useful when we’re given an integral function such as G and wish to understand properties of
its graph by recognizing that G'(x) = g(x), while not necessarily being able to exactly evaluate the definite integral
\(\int_c^x g(t)\,dt\). To see how this is the case, we consider the following example.

Example 5.2.1:

Investigate the behavior of the integral function
\[ E(x) = \int_0^x e^{-t^2}\,dt. \tag{5.2.12} \]

Solution
E is closely related to the well-known error function (see footnote 2), a function that is particularly important in probability and
statistics. It turns out that the function \(e^{-t^2}\) does not have an elementary antiderivative that we can express without
integrals. That is, whereas a function such as f(t) = 4 − 2t has elementary antiderivative \(F(t) = 4t - t^2\), we are
unable to find a simple formula for an antiderivative of \(e^{-t^2}\) that does not involve a definite integral. We will learn
more about finding (complicated) algebraic formulas for antiderivatives without definite integrals in the chapter on
infinite series.
Returning our attention to the function E, while we cannot evaluate E exactly for any value other than x = 0, we still
can gain a tremendous amount of information about the function E. To begin, applying the rule in Equation (5.2.11) to E,
it follows that
\[ E'(x) = \dfrac{d}{dx} \left[ \int^x_0 e^{-t^2}\,dt \right] = e^{-x^2}, \]

so we know a formula for the derivative of E . Moreover, we know that E(0) = 0 . This information is precisely the
type we were given in problems such as the one in Activity 3.1 and others in Section 3.1, where we were given
information about the derivative of a function, but lacked a formula for the function itself.
Here, using the first and second derivatives of \(E\), along with the fact that \(E(0) = 0\), we can determine more information about the behavior of \(E\). First, with \(E'(x) = e^{-x^2}\), we note that for all real numbers \(x\), \(e^{-x^2} > 0\), and thus \(E'(x) > 0\) for all \(x\). Thus \(E\) is an always increasing function. Further, we note that as \(x \to \infty\), \(E'(x) = e^{-x^2} \to 0\), hence the slope of the function \(E\) tends to zero as \(x \to \infty\) (and similarly as \(x \to -\infty\)). Indeed, it turns out (due to some more sophisticated analysis) that \(E\) has horizontal asymptotes as \(x\) increases or decreases without bound.
In addition, we can observe that \(E''(x) = -2xe^{-x^2}\), and that \(E''(0) = 0\), while \(E''(x) < 0\) for \(x > 0\) and \(E''(x) > 0\) for \(x < 0\). This information tells us that \(E\) is concave up for \(x < 0\) and concave down for \(x > 0\) with a point of inflection at \(x = 0\).
The only thing we lack at this point is a sense of how big \(E\) can get as \(x\) increases. If we use a midpoint Riemann sum with 10 subintervals to estimate \(E(2)\), we see that \(E(2) \approx 0.8822\); a similar calculation to estimate \(E(3)\) shows little change (\(E(3) \approx 0.8862\)), so it appears that as \(x\) increases without bound, \(E\) approaches a value just larger than 0.886, which aligns with the fact that \(E\) has horizontal asymptotes. Putting all of this information together (and using the symmetry of \(f(t) = e^{-t^2}\)), we see the results shown in Figure 5.11.
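The midpoint estimates quoted in this example can be reproduced with a few lines of code. What follows is only a minimal sketch and is not part of the original text: the helper name `midpoint_sum` is our own, and the comparison values assume the relationship \(E(x) = \frac{\sqrt{\pi}}{2}\,\text{erf}(x)\), which follows from the definition of the error function, together with Python's built-in `math.erf`.

```python
import math

def midpoint_sum(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def f(t):
    """The integrand f(t) = e^(-t^2)."""
    return math.exp(-t**2)

def E(x, n=10):
    """Estimate E(x), the integral of f from 0 to x, using n subintervals."""
    return midpoint_sum(f, 0.0, x, n)

for x in (2, 3):
    reference = math.sqrt(math.pi) / 2 * math.erf(x)
    print(f"E({x}) is about {E(x):.4f}; sqrt(pi)/2 * erf({x}) = {reference:.4f}")

# Assumed limiting value of E as x grows without bound.
print(f"sqrt(pi)/2 = {math.sqrt(math.pi) / 2:.4f}")
```

Increasing `n` or the upper limit `x` in this sketch is one way to experiment with how quickly the accumulated area levels off.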
² The error function is defined by the rule \(\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt\) and has the key property that \(0 \le \operatorname{erf}(x) < 1\) for all \(x \ge 0\) and moreover that \(\lim_{x \to \infty} \operatorname{erf}(x) = 1\).

Figure 5.11: At left, the graph of \(f(t) = e^{-t^2}\). At right, the integral function \(E(x) = \int_0^x e^{-t^2}\,dt\), which is the unique antiderivative of \(f\) that satisfies \(E(0) = 0\).


Again, \(E\) is the antiderivative of \(f(t) = e^{-t^2}\) that satisfies \(E(0) = 0\). Moreover, the values on the graph of \(y = E(x)\) represent the net-signed area of the region bounded by \(f(t) = e^{-t^2}\) from 0 up to \(x\). We see that the value of \(E\) increases rapidly near zero but then levels off as \(x\) increases since there is less and less additional accumulated area bounded by \(f(t) = e^{-t^2}\) as \(x\) increases.

Activity 5.2.3:
Suppose that \(f(t) = \dfrac{t}{1+t^2}\) and \(F(x) = \int_0^x f(t)\,dt\).
a. On the axes at left in Figure 5.12, plot a graph of \(f(t) = \dfrac{t}{1+t^2}\) on the interval \(-10 \le t \le 10\). Clearly label the vertical axes with appropriate scale.
b. What is the key relationship between \(F\) and \(f\), according to the Second FTC?
c. Use the first derivative test to determine the intervals on which \(F\) is increasing and decreasing.
d. Use the second derivative test to determine the intervals on which \(F\) is concave up and concave down. Note that \(f'(t)\) can be simplified to be written in the form \(f'(t) = \dfrac{1-t^2}{(1+t^2)^2}\).

e. Using technology appropriately, estimate the values of F (5) and F (10) through appropriate Riemann sums.

Figure 5.12: Axes for plotting f and F .


f. Sketch an accurate graph of \(y = F(x)\) on the righthand axes provided, and clearly label the vertical axes with appropriate scale.

Differentiating an Integral Function


We have seen that the Second FTC enables us to construct an antiderivative F of any continuous function f by defining F
by the corresponding integral function
\(F(x) = \int_c^x f(t)\,dt\). Said differently, if we have a function of the form \(F(x) = \int_c^x f(t)\,dt\), then we know that \(F'(x) = \frac{d}{dx}\left[\int_c^x f(t)\,dt\right] = f(x)\). This shows that integral functions, while perhaps having the most complicated formulas of any functions we have encountered, are nonetheless particularly simple to differentiate. For instance, if

\[ F(x) = \int_\pi^x \sin(t^2)\,dt, \tag{5.2.13} \]

then by the Second FTC, we know immediately that

\[ F'(x) = \sin(x^2). \tag{5.2.14} \]
Stating this result more generally for an arbitrary function f , we know by the Second FTC that
\[ \frac{d}{dx}\left[\int_c^x f(t)\,dt\right] = f(x). \tag{5.2.15} \]

In words, the last equation essentially says that “the derivative of the integral function whose integrand is \(f\), is \(f\).” In this sense, we see that if we first integrate the function \(f\) from \(t = c\) to \(t = x\), and then differentiate with respect to \(x\), these two processes “undo” one another.
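One way to see this “undoing” concretely is to approximate both processes numerically for \(F(x) = \int_\pi^x \sin(t^2)\,dt\). The sketch below is an illustration under our own assumptions, not part of the original text: the helper names `integral` and `derivative`, the midpoint sum with 2000 subintervals, and the central-difference step size are all choices made for this example.

```python
import math

def integral(f, a, b, n=2000):
    """Approximate the definite integral of f from a to b with a midpoint sum.
    If b < a, dx is negative, so the sign of the result flips as it should."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def g(t):
    """The integrand sin(t^2)."""
    return math.sin(t**2)

def F(x):
    """Integral function F(x): the integral of sin(t^2) from pi to x."""
    return integral(g, math.pi, x)

def derivative(func, x, h=1e-3):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0):
    print(f"x = {x}: F'(x) is about {derivative(F, x):.5f}, sin(x^2) = {g(x):.5f}")
```

The two printed columns should agree to several decimal places, even though we never found a formula for \(F\) itself.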
Taking a different approach, say we begin with a function f (t) and differentiate with respect to t . What happens if we
follow this by integrating the result from t = a to t = x ? That is, what can we say about the quantity
\[ \int_a^x \frac{d}{dt}[f(t)]\,dt? \tag{5.2.16} \]

Here, we use the First FTC and note that \(f(t)\) is an antiderivative of \(\frac{d}{dt}[f(t)]\). Applying this result and evaluating the antiderivative function, we see that

\[ \int_a^x \frac{d}{dt}[f(t)]\,dt = f(t)\Big|_a^x = f(x) - f(a). \tag{5.2.17} \]

Thus, we see that if we apply the processes of first differentiating f and then integrating the result from a to x, we return
to the function f , minus the constant value f (a). So in this situation, the two processes almost undo one another, up to the
constant f (a).
The observations made in the preceding two paragraphs demonstrate that differentiating and integrating (where we
integrate from a constant up to a variable) are almost inverse processes. In one sense, this should not be surprising:
integrating involves antidifferentiating, which reverses the process of differentiating. On the other hand, we see that there
is some subtlety involved, as integrating the derivative of a function does not quite produce the function itself. This is
connected to a key fact we observed in Section 5.1, which is that any function has an entire family of antiderivatives, and
any two of those antiderivatives differ only by a constant.
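The “almost inverse” relationship can also be checked numerically. In the sketch below, which is our own illustration rather than part of the original text, the sample function \(f(t) = \cos(t) + t^2\), the starting point \(a = 1\), and the helper name `integral` are all assumptions chosen for the example. Integrating \(f'\) from \(a\) to \(x\) should recover \(f(x) - f(a)\), not \(f(x)\) itself.

```python
import math

def integral(fn, a, b, n=2000):
    """Midpoint Riemann sum approximation of the integral of fn from a to b."""
    dx = (b - a) / n
    return sum(fn(a + (i + 0.5) * dx) for i in range(n)) * dx

def f(t):
    """Sample function chosen for illustration."""
    return math.cos(t) + t**2

def f_prime(t):
    """Derivative of the sample function, computed by hand."""
    return -math.sin(t) + 2 * t

a = 1.0
for x in (1.0, 2.5, 4.0):
    recovered = integral(f_prime, a, x)
    print(f"x = {x}: integral of f' = {recovered:.6f}, "
          f"f(x) - f(a) = {f(x) - f(a):.6f}, f(x) = {f(x):.6f}")
```

The first two printed values agree, while the third differs from them by the constant \(f(a)\), which is the subtlety described above.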

Activity 5.2.6:

Evaluate each of the following derivatives and definite integrals. Clearly cite whether you use the First or Second FTC
in so doing.
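a. \(\dfrac{d}{dx}\left[\int_4^x e^{t^2}\,dt\right]\)
b. \(\int_{-2}^x \dfrac{d}{dt}\left[\dfrac{t^4}{1+t^4}\right] dt\)
c. \(\dfrac{d}{dx}\left[\int_x^1 \cos(t^3)\,dt\right]\)
d. \(\int_3^x \dfrac{d}{dt}\left[\ln(1+t^2)\right] dt\)
e. \(\dfrac{d}{dx}\left[\int_4^{x^3} \sin(t^2)\,dt\right]\)
(Hint: Let \(F(x) = \int_4^x \sin(t^2)\,dt\) and observe that this problem is asking you to evaluate \(\dfrac{d}{dx}\left[F(x^3)\right]\).)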

Summary
In this section, we encountered the following important ideas:

For a continuous function \(f\), the integral function \(A(x) = \int_1^x f(t)\,dt\) defines an antiderivative of \(f\).
The Second Fundamental Theorem of Calculus is the formal, more general statement of the preceding fact: if \(f\) is a continuous function and \(c\) is any constant, then \(A(x) = \int_c^x f(t)\,dt\) is the unique antiderivative of \(f\) that satisfies \(A(c) = 0\).
Together, the First and Second FTC enable us to formally see how differentiation and integration are almost inverse processes through the observations that

\[ \int_c^x \frac{d}{dt}[f(t)]\,dt = f(x) - f(c) \tag{5.2.18} \]

and

\[ \frac{d}{dx}\left[\int_c^x f(t)\,dt\right] = f(x). \tag{5.2.19} \]

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

5.3: Integration by Substitution
Learning Objectives

In this section, we strive to understand the ideas generated by the following important questions:
How can we begin to find algebraic formulas for antiderivatives of more complicated algebraic functions?
What is an indefinite integral and how is its notation used in discussing antiderivatives?
How does the technique of u-substitution work to help us evaluate certain indefinite integrals, and how does this process rely on
identifying function-derivative pairs?

In Section 4.4, we learned the key role that antiderivatives play in the process of evaluating definite integrals exactly. In particular, the
Fundamental Theorem of Calculus tells us that if F is any antiderivative of f , then
\[ \int_a^b f(x)\,dx = F(b) - F(a). \tag{5.3.1} \]

Furthermore, we realized that each elementary derivative rule developed in Chapter 2 leads to a corresponding elementary antiderivative, as
summarized in Table 4.1. Thus, if we wish to evaluate an integral such as
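\[ \int_0^1 \left(x^3 - \sqrt{x} + 5^x\right) dx \tag{5.3.2} \]

it is straightforward to do so, since we can easily antidifferentiate \(f(x) = x^3 - \sqrt{x} + 5^x\). In particular, since a function \(F\) whose derivative is \(f\) is given by

\[ F(x) = \frac{1}{4}x^4 - \frac{2}{3}x^{3/2} + \frac{1}{\ln(5)}5^x, \tag{5.3.3} \]

the Fundamental Theorem of Calculus tells us that

\[ \begin{aligned} \int_0^1 \left(x^3 - \sqrt{x} + 5^x\right) dx &= \left[\frac{1}{4}x^4 - \frac{2}{3}x^{3/2} + \frac{1}{\ln(5)}5^x\right]_0^1 \\ &= \left(\frac{1}{4}(1)^4 - \frac{2}{3}(1)^{3/2} + \frac{1}{\ln(5)}5^1\right) - \left(0 - 0 + \frac{1}{\ln(5)}5^0\right) \\ &= -\frac{5}{12} + \frac{4}{\ln(5)}. \end{aligned} \tag{5.3.4} \]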
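As a quick sanity check on this arithmetic, we can compare the exact value with a numerical estimate of the same integral. The following is a minimal sketch and not part of the original text; the helper name `midpoint_sum` and the choice of 20000 subintervals are our own assumptions.

```python
import math

def f(x):
    """Integrand x^3 - sqrt(x) + 5^x."""
    return x**3 - math.sqrt(x) + 5**x

def F(x):
    """Antiderivative of f: x^4/4 - (2/3) x^(3/2) + 5^x / ln(5)."""
    return x**4 / 4 - (2 / 3) * x**1.5 + 5**x / math.log(5)

def midpoint_sum(fn, a, b, n):
    """Midpoint Riemann sum of fn on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(fn(a + (i + 0.5) * dx) for i in range(n)) * dx

exact = F(1) - F(0)                        # Fundamental Theorem of Calculus
closed_form = -5 / 12 + 4 / math.log(5)    # simplified exact value
approx = midpoint_sum(f, 0.0, 1.0, 20000)  # purely numerical estimate

print(f"F(1) - F(0)      = {exact:.6f}")
print(f"-5/12 + 4/ln(5)  = {closed_form:.6f}")
print(f"midpoint sum     = {approx:.6f}")
```

All three values should agree to several decimal places, which supports both the antiderivative and the simplification.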

Because an algebraic formula for an antiderivative of f enables us to evaluate the definite integral
\[ \int_a^b f(x)\,dx \tag{5.3.5} \]

exactly, we see that we have a natural interest in being able to find such algebraic antiderivatives. Note that we emphasize algebraic antiderivatives, as opposed to any antiderivative, since we know by the Second Fundamental Theorem of Calculus that

\[ G(x) = \int_a^x f(t)\,dt \tag{5.3.6} \]

is indeed an antiderivative of the given function f , but one that still involves a definite integral. One of our main goals in this section and
the one following is to develop understanding, in select circumstances, of how to “undo” the process of differentiation in order to find an
algebraic antiderivative for a given function.

Preview Activity 5.3.1

In Section 2.5, we learned the Chain Rule and how it can be applied to find the derivative of a composite function. In particular, if \(u\) is a differentiable function of \(x\), and \(f\) is a differentiable function of \(u(x)\), then

\[ \frac{d}{dx}[f(u(x))] = f'(u(x)) \cdot u'(x). \tag{5.3.7} \]

In words, we say that the derivative of a composite function \(c(x) = f(u(x))\), where \(f\) is considered the “outer” function and \(u\) the “inner” function, is “the derivative of the outer function, evaluated at the inner function, times the derivative of the inner function.”
(a) For each of the following functions, use the Chain Rule to find the function's derivative. Be sure to label each derivative by name (e.g., the derivative of \(g(x)\) should be labeled \(g'(x)\)).
i. \(g(x) = e^{3x}\)
ii. \(h(x) = \sin(5x + 1)\)
iii. \(p(x) = \arctan(2x)\)
iv. \(q(x) = (2 - 7x)^4\)
v. \(r(x) = 3^{4-11x}\)

(b) For each of the following functions, use your work in (a) to help you determine the general antiderivative³ of the function. Label each antiderivative by name (e.g., the antiderivative of \(m\) should be called \(M\)). In addition, check your work by computing the derivative of each proposed antiderivative.
i. \(m(x) = e^{3x}\)
ii. \(n(x) = \cos(5x + 1)\)
iii. \(s(x) = \dfrac{1}{1+4x^2}\)
iv. \(v(x) = (2 - 7x)^3\)
v. \(w(x) = 3^{4-11x}\)
³ Recall that the general antiderivative of a function includes “+C” to reflect the entire family of functions that share the same derivative.
(c) Based on your experience in parts (a) and (b), conjecture an antiderivative for each of the following functions. Test your conjectures by computing the derivative of each proposed antiderivative.
i. \(a(x) = \cos(\pi x)\)
ii. \(b(x) = (4x + 7)^{11}\)
iii. \(c(x) = xe^{x^2}\)

Reversing the Chain Rule: First Steps


In Preview Activity 5.3.1, we saw that it is usually straightforward to antidifferentiate a function of the form \(h(x) = f(u(x))\), whenever \(f\) is a familiar function whose antiderivative is known and \(u(x)\) is a linear function. For example, if we consider \(h(x) = (5x - 3)^6\), in this context the outer function \(f\) is \(f(u) = u^6\), while the inner function is \(u(x) = 5x - 3\). Since the antiderivative of \(f\) is

\[ F(u) = \frac{1}{7}u^7 + C, \tag{5.3.8} \]

we see that the antiderivative of \(h\) is

\[ H(x) = \frac{1}{7}(5x - 3)^7 \cdot \frac{1}{5} + C = \frac{1}{35}(5x - 3)^7 + C. \tag{5.3.9} \]

The inclusion of the constant \(\frac{1}{5}\) is essential precisely because the derivative of the inner function is \(u'(x) = 5\). Indeed, if we now compute \(H'(x)\), we find by the Chain Rule (and Constant Multiple Rule) that

\[ H'(x) = \frac{1}{35} \cdot 7(5x - 3)^6 \cdot 5 = (5x - 3)^6 = h(x), \tag{5.3.10} \]

and thus \(H\) is indeed the general antiderivative of \(h\). Hence, in the special case where the outer function is familiar and the inner function is linear, we can antidifferentiate composite functions according to the following rule. If \(h(x) = f(ax + b)\) and \(F\) is a known algebraic antiderivative of \(f\), then the general antiderivative of \(h\) is given by

\[ H(x) = \frac{1}{a}F(ax + b) + C. \tag{5.3.11} \]
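The rule for a linear inner function is easy to test numerically. The sketch below is our own illustration and not part of the original text: the helper names `antiderivative_of_linear_composite` and `derivative` are hypothetical, and the derivative is only approximated with a central difference. It builds \(H(x) = \frac{1}{a}F(ax + b)\) for the example \(h(x) = (5x - 3)^6\) and compares \(H'(x)\) with \(h(x)\).

```python
import math

def antiderivative_of_linear_composite(F, a, b):
    """Given an antiderivative F of f, return H with H(x) = (1/a) * F(a*x + b).
    The constant +C is omitted, since it does not affect the derivative."""
    return lambda x: F(a * x + b) / a

def derivative(func, x, step=1e-5):
    """Central-difference estimate of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

def h(x):
    """The composite function h(x) = (5x - 3)^6."""
    return (5 * x - 3) ** 6

# Outer function f(u) = u^6 has antiderivative F(u) = u^7 / 7; here a = 5, b = -3.
H = antiderivative_of_linear_composite(lambda u: u**7 / 7, 5, -3)

for x in (0.0, 0.5, 1.0):
    print(f"x = {x}: H'(x) is about {derivative(H, x):.6f}, h(x) = {h(x):.6f}")
```

Dropping the division by `a` in the helper makes the two columns differ by a factor of 5, which is exactly the role of the constant \(\frac{1}{5}\) discussed above.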

When discussing antiderivatives, it is often useful to have shorthand notation that indicates the instruction to find an antiderivative. Thus, in a similar way to how the notation \(\frac{d}{dx}[f(x)]\) represents the derivative of \(f(x)\) with respect to \(x\), we use the notation of the indefinite integral, \(\int f(x)\,dx\), to represent the general antiderivative of \(f\) with respect to \(x\). For instance, returning to the earlier example with \(h(x) = (5x - 3)^6\) above, we can rephrase the relationship between \(h\) and its antiderivative \(H\) through the notation

\[ \int (5x - 3)^6\,dx = \frac{1}{35}(5x - 3)^7 + C. \tag{5.3.12} \]

When we find an antiderivative, we will often say that we evaluate an indefinite integral; said differently, the instruction to evaluate an indefinite integral means to find the general antiderivative. Just as the notation \(\frac{d}{dx}[\,\square\,]\) means “find the derivative with respect to \(x\) of \(\square\),” the notation \(\int \square\,dx\) means “find a function of \(x\) whose derivative is \(\square\).”

Activity 5.3.2

Evaluate each of the following indefinite integrals. Check each antiderivative that you find by differentiating.

a. \(\int \sin(8 - 3x)\,dx\)

b. \(\int \sec^2(4x)\,dx\)

c. \(\int \dfrac{1}{11x - 9}\,dx\)

d. \(\int \csc(2x + 1)\cot(2x + 1)\,dx\)

e. \(\int \dfrac{1}{\sqrt{1 - 16x^2}}\,dx\)

f. \(\int 5^{-x}\,dx\)

Reversing the Chain Rule: u-substitution

Of course, a natural question arises from our recent work: what happens when the inner function is not a linear function? For example, can we find antiderivatives of such functions as \(g(x) = xe^{x^2}\) and \(h(x) = e^{x^2}\)? It is important to explicitly remember that differentiation and antidifferentiation are essentially inverse processes; that they are not quite inverse processes is due to the +C that arises when antidifferentiating. This close relationship enables us to take any known derivative rule and translate it to a corresponding rule for an indefinite integral. For example, since \(\frac{d}{dx}[x^5] = 5x^4\), we can equivalently write \(\int 5x^4\,dx = x^5 + C\). Recall that the Chain Rule states that

\[ \frac{d}{dx}[f(g(x))] = f'(g(x)) \cdot g'(x). \tag{5.3.13} \]

Restating this relationship in terms of an indefinite integral,

\[ \int f'(g(x))g'(x)\,dx = f(g(x)) + C. \tag{5.3.14} \]

Hence, Equation 5.3.14 tells us that if we can take a given function and view its algebraic structure as \(f'(g(x))g'(x)\) for some appropriate choices of \(f\) and \(g\), then we can antidifferentiate the function by reversing the Chain Rule. It is especially notable that both \(g(x)\) and \(g'(x)\) appear in the form of \(f'(g(x))g'(x)\); we will sometimes say that we seek to identify a function-derivative pair when trying to apply the rule in Equation 5.3.14. In the situation where we can identify a function-derivative pair, we will introduce a new variable \(u\) to represent the function \(g(x)\). Observing that with \(u = g(x)\), it follows in Leibniz notation that \(\frac{du}{dx} = g'(x)\), so that in terms of differentials⁴, \(du = g'(x)\,dx\). Now converting the indefinite integral of interest to a new one in terms of \(u\), we have

\[ \int f'(g(x))g'(x)\,dx = \int f'(u)\,du. \tag{5.3.15} \]

Provided that \(f'\) is an elementary function whose antiderivative is known, we can now easily evaluate the indefinite integral in \(u\), and then go on to determine the desired overall antiderivative of \(f'(g(x))g'(x)\). We call this process u-substitution. To see u-substitution at work, we consider the following example.

⁴ If we recall from the definition of the derivative that \(\frac{du}{dx} \approx \frac{\Delta u}{\Delta x}\) and use the fact that \(\frac{du}{dx} = g'(x)\), then we see that \(g'(x) \approx \frac{\Delta u}{\Delta x}\). Solving for \(\Delta u\), \(\Delta u \approx g'(x)\Delta x\). It is this last relationship that, when expressed in “differential” notation, enables us to write

\[ du = g'(x)\,dx \tag{5.3.16} \]

in the change of variable formula.

Example 5.3.1:

Evaluate the indefinite integral

\[ \int x^3 \cdot \sin(7x^4 + 3)\,dx \tag{5.3.17} \]

and check the result by differentiating.


Solution
We can make two key algebraic observations regarding the integrand, \(x^3 \cdot \sin(7x^4 + 3)\). First, \(\sin(7x^4 + 3)\) is a composite function; as such, we know we'll need a more sophisticated approach to antidifferentiating.
Second, \(x^3\) is almost the derivative of \(7x^4 + 3\); the only issue is a missing constant. Thus, \(x^3\) and \(7x^4 + 3\) are nearly a function-derivative pair. Furthermore, we know the antiderivative of \(f(u) = \sin(u)\). The combination of these observations suggests that we can evaluate the given indefinite integral by reversing the chain rule through u-substitution. Letting \(u\) represent the inner function of the composite function \(\sin(7x^4 + 3)\), we have \(u = 7x^4 + 3\), and thus \(\frac{du}{dx} = 28x^3\). In differential notation, it follows that

\[ du = 28x^3\,dx, \tag{5.3.18} \]

and thus

\[ x^3\,dx = \frac{1}{28}\,du. \tag{5.3.19} \]

We make this last observation because the original indefinite integral may now be written \(\int \sin(7x^4 + 3) \cdot x^3\,dx\), and so by substituting the expressions in \(u\) for \(x\) (specifically \(u\) for \(7x^4 + 3\) and \(\frac{1}{28}\,du\) for \(x^3\,dx\)), it follows that

\[ \int \sin(7x^4 + 3) \cdot x^3\,dx = \int \sin(u) \cdot \frac{1}{28}\,du. \tag{5.3.20} \]

Now we may evaluate the original integral by first evaluating the easier integral in \(u\), followed by replacing \(u\) by the expression \(7x^4 + 3\). Doing so, we find

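\[ \begin{aligned} \int \sin(7x^4 + 3) \cdot x^3\,dx &= \int \sin(u) \cdot \frac{1}{28}\,du = \frac{1}{28}\int \sin(u)\,du \\ &= \frac{1}{28}(-\cos(u)) + C = -\frac{1}{28}\cos(7x^4 + 3) + C. \end{aligned} \tag{5.3.21} \]

To check our work, we observe by the Chain Rule that

\[ \frac{d}{dx}\left[-\frac{1}{28}\cos(7x^4 + 3) + C\right] = -\frac{1}{28} \cdot (-1)\sin(7x^4 + 3) \cdot 28x^3 = \sin(7x^4 + 3) \cdot x^3, \tag{5.3.22} \]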

which is indeed the original integrand. An essential observation about our work in Example 5.3.1 is that the u-substitution only worked because the function multiplying \(\sin(7x^4 + 3)\) was \(x^3\). If instead that function was \(x^2\) or \(x^4\), the substitution process may not (and likely would not) have worked. This is one of the primary challenges of antidifferentiation: slight changes in the integrand make tremendous differences. For instance, we can use u-substitution with \(u = x^2\) and \(du = 2x\,dx\) to find that

\[ \int xe^{x^2}\,dx = \int e^u \cdot \frac{1}{2}\,du = \frac{1}{2}\int e^u\,du = \frac{1}{2}e^u + C = \frac{1}{2}e^{x^2} + C. \tag{5.3.23} \]

If, however, we consider the similar indefinite integral

\[ \int e^{x^2}\,dx, \tag{5.3.24} \]

the missing \(x\) to multiply \(e^{x^2}\) makes the u-substitution \(u = x^2\) no longer possible. Hence, part of the lesson of u-substitution is just how specialized the process is: it only applies to situations where, up to a missing constant, the integrand that is present is the result of applying the Chain Rule to a different, related function.
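The habit of checking an antiderivative by differentiating can also be carried out numerically. The sketch below is our own illustration, not part of the original text; the helper name `derivative`, the step size, and the sample points are assumptions. It compares a central-difference estimate of the derivative of \(-\frac{1}{28}\cos(7x^4 + 3)\) with the integrand \(x^3\sin(7x^4 + 3)\) from Example 5.3.1.

```python
import math

def derivative(func, x, step=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)

def integrand(x):
    """Original integrand x^3 * sin(7x^4 + 3)."""
    return x**3 * math.sin(7 * x**4 + 3)

def antiderivative(x):
    """Result of the u-substitution with u = 7x^4 + 3 (taking C = 0)."""
    return -math.cos(7 * x**4 + 3) / 28

for x in (-1.0, 0.3, 0.8, 1.2):
    print(f"x = {x}: derivative is about {derivative(antiderivative, x):.6f}, "
          f"integrand = {integrand(x):.6f}")
```

A numerical check of this kind cannot prove that an antiderivative is correct, but a mismatch at even one point is a reliable sign of an algebra error, such as a forgotten factor of \(\frac{1}{28}\).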

Activity 5.3.3

Evaluate each of the following indefinite integrals by using these steps:


Find two functions within the integrand that form (up to a possible missing constant) a function-derivative pair;
Make a substitution and convert the integral to one involving u and du;
Evaluate the new integral in u;
Convert the resulting function of u back to a function of x by using your earlier substitution;
Check your work by differentiating the function of x. You should come up with the integrand originally given.
a. \(\displaystyle \int \frac{x^2}{5x^3 + 1}\,dx\)
b. \(\displaystyle \int e^x \sin(e^x)\,dx\)
c. \(\displaystyle \int \frac{\cos(\sqrt{x})}{\sqrt{x}}\,dx\)

Evaluating Definite Integrals via u-substitution


We have just introduced u-substitution as a means to evaluate indefinite integrals of functions that can be written, up to a constant multiple, in the form \(f(g(x))g'(x)\). This same technique can be used to evaluate definite integrals involving such functions, though we need to be careful with the corresponding limits of integration. Consider, for instance, the definite integral

\[ \int_2^5 xe^{x^2}\,dx. \tag{5.3.25} \]

Whenever we write a definite integral, it is implicit that the limits of integration correspond to the variable of integration. To be more explicit, observe that

\[ \int_2^5 xe^{x^2}\,dx = \int_{x=2}^{x=5} xe^{x^2}\,dx. \tag{5.3.26} \]

When we execute a u-substitution, we change the variable of integration; it is essential to note that this also changes the limits of integration. For instance, with the substitution \(u = x^2\) and \(du = 2x\,dx\), it also follows that when \(x = 2\), \(u = 2^2 = 4\), and when \(x = 5\), \(u = 5^2 = 25\). Thus, under the change of variables of u-substitution, we now have

\[ \int_{x=2}^{x=5} xe^{x^2}\,dx = \int_{u=4}^{u=25} e^u \cdot \frac{1}{2}\,du = \frac{1}{2}e^u\Big|_{u=4}^{u=25} = \frac{1}{2}e^{25} - \frac{1}{2}e^4. \tag{5.3.27} \]

Alternatively, we could consider the related indefinite integral \(\int xe^{x^2}\,dx\), find the antiderivative \(\frac{1}{2}e^{x^2}\) through u-substitution, and then evaluate the original definite integral.
From that perspective, we'd have

\[ \int_2^5 xe^{x^2}\,dx = \frac{1}{2}e^{x^2}\Big|_2^5 = \frac{1}{2}e^{25} - \frac{1}{2}e^4, \tag{5.3.28} \]

which is, of course, the same result.
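The importance of changing the limits can be seen numerically. In the sketch below, which is our own illustration and not part of the original text, the helper name `midpoint_sum` and the choice of 20000 subintervals are assumptions. It estimates the integral in \(x\) on \([2, 5]\), the integral in \(u\) on \([4, 25]\), and, for contrast, the integral in \(u\) with the limits mistakenly left as 2 and 5.

```python
import math

def midpoint_sum(fn, a, b, n=20000):
    """Midpoint Riemann sum of fn on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(fn(a + (i + 0.5) * dx) for i in range(n)) * dx

# Original integral: the limits 2 and 5 are values of x.
in_x = midpoint_sum(lambda x: x * math.exp(x**2), 2.0, 5.0)

# After u = x^2 and du = 2x dx: the limits become u = 4 and u = 25.
in_u = midpoint_sum(lambda u: 0.5 * math.exp(u), 4.0, 25.0)

# A common mistake: new integrand, but the old x-limits.
wrong = midpoint_sum(lambda u: 0.5 * math.exp(u), 2.0, 5.0)

exact = 0.5 * math.exp(25) - 0.5 * math.exp(4)

print(f"integral in x on [2, 5]        : {in_x:.6e}")
print(f"integral in u on [4, 25]       : {in_u:.6e}")
print(f"exact value                    : {exact:.6e}")
print(f"integral in u on [2, 5] (wrong): {wrong:.6e}")
```

The first three values agree closely, while the last is smaller by many orders of magnitude, which shows how much the limits matter.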

Activity 5.3.1

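Evaluate each of the following definite integrals exactly through an appropriate u-substitution.
a. \(\displaystyle \int_1^2 \frac{x}{1 + 4x^2}\,dx\)
b. \(\displaystyle \int_0^1 e^{-x}(2e^{-x} + 3)^9\,dx\)
c. \(\displaystyle \int_{2/\pi}^{4/\pi} \frac{\cos\left(\frac{1}{x}\right)}{x^2}\,dx\)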

Summary
In this section, we encountered the following important ideas:
To begin to find algebraic formulas for antiderivatives of more complicated algebraic functions, we need to think carefully about how
we can reverse known differentiation rules. To that end, it is essential that we understand and recall known derivatives of basic
functions, as well as the standard derivative rules.
The indefinite integral provides notation for antiderivatives. When we write “\(\int f(x)\,dx\),” we mean “the general antiderivative of \(f\).” In particular, if we have functions \(f\) and \(F\) such that \(F' = f\), the following two statements say the exact same thing: \(\frac{d}{dx}[F(x)] = f(x)\) and \(\int f(x)\,dx = F(x) + C\). That is, \(f\) is the derivative of \(F\), and \(F\) is an antiderivative of \(f\).
The technique of u-substitution helps us evaluate indefinite integrals of the form \(\int f(g(x))g'(x)\,dx\) through the substitutions \(u = g(x)\) and \(du = g'(x)\,dx\), so that \(\int f(g(x))g'(x)\,dx = \int f(u)\,du\). A key part of choosing the expression in \(x\) to be represented by \(u\) is the identification of a function-derivative pair. To do so, we often look for an “inner” function \(g(x)\) that is part of a composite function, while investigating whether \(g'(x)\) (or a constant multiple of \(g'(x)\)) is present as a multiplying factor of the integrand.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand Valley State
University)

5.4: Integration by Parts
Learning Objectives

In this section, we strive to understand the ideas generated by the following important questions:
How do we evaluate indefinite integrals that involve products of basic functions such as \(\int x\sin(x)\,dx\) and \(\int xe^x\,dx\)?
What is the method of integration by parts and how can we consistently apply it to integrate products of basic functions?
How does the algebraic structure of functions guide us in identifying u and dv in using integration by parts?

In Section 5.3, we learned the technique of u-substitution for evaluating indefinite integrals that involve certain composite functions. For
example, the indefinite integral
\[ \int x^3\sin(x^4)\,dx \tag{5.4.1} \]

is perfectly suited to u-substitution, since not only is there a composite function present, but also the inner function’s derivative (up to a
constant) is multiplying the composite function. Through u-substitution, we learned a general situation where recognizing the algebraic
structure of a function can enable us to find its antiderivative. It is natural to ask similar questions to those we considered in Section 5.3
about functions with a different elementary algebraic structure: those that are the product of basic functions. For instance, suppose we are
interested in evaluating the indefinite integral

∫ x sin(x) dx. (5.4.2)

Here, there is not a composite function present, but rather a product of the basic functions
f (x) = x (5.4.3)

and
g(x) = sin(x). (5.4.4)

From our work in Section 2.3 with the Product Rule, we know that it is relatively complicated to compute the derivative of the product of
two functions, so we should expect that antidifferentiating a product should be similarly involved. In addition, intuitively we expect that
evaluating

∫ x sin(x) dx (5.4.5)

will involve somehow reversing the Product Rule. To that end, in Preview Activity 5.4.1 we refresh our understanding of the Product Rule
and then investigate some indefinite integrals that involve products of basic functions.

Preview Activity 5.4.1

In Section 2.3, we developed the Product Rule and studied how it is employed to differentiate a product of two functions. In particular,
recall that if f and g are differentiable functions of x, then
\[ \frac{d}{dx}[f(x) \cdot g(x)] = f(x) \cdot g'(x) + g(x) \cdot f'(x). \tag{5.4.6} \]

a. For each of the following functions, use the Product Rule to find the function's derivative. Be sure to label each derivative by name (e.g., the derivative of \(g(x)\) should be labeled \(g'(x)\)).
i. \(g(x) = x\sin(x)\)
ii. \(h(x) = xe^x\)
iii. \(p(x) = x\ln(x)\)
iv. \(q(x) = x^2\cos(x)\)
v. \(r(x) = e^x\sin(x)\)
b. Use your work in (a) to help you evaluate the following indefinite integrals. Use differentiation to check your work.
i. \(\int \left(xe^x + e^x\right) dx\)
ii. \(\int e^x(\sin(x) + \cos(x))\,dx\)
iii. \(\int \left(2x\cos(x) - x^2\sin(x)\right) dx\)
iv. \(\int \left(x\cos(x) + \sin(x)\right) dx\)
v. \(\int \left(1 + \ln(x)\right) dx\)

c. Observe that the examples in (b) work nicely because of the derivatives you were asked to calculate in (a). Each integrand in (b) is
precisely the result of differentiating one of the products of basic functions found in (a). To see what happens when an integrand is
still a product but not necessarily the result of differentiating an elementary product, we consider how to evaluate \(\int x\cos(x)\,dx\).
i. First, observe that \(\frac{d}{dx}[x\sin(x)] = x\cos(x) + \sin(x)\). Integrating both sides indefinitely and using the fact that the integral of a sum is the sum of the integrals, we find that
\[ \int \left(\frac{d}{dx}[x\sin(x)]\right) dx = \int x\cos(x)\,dx + \int \sin(x)\,dx. \]
In this last equation, evaluate the indefinite integral on the left side as well as the rightmost indefinite integral on the right.
ii. In the most recent equation from (i.), solve the equation for the expression \(\int x\cos(x)\,dx\).

Reversing the Product Rule: Integration by Parts


Problem (c) in Preview Activity 5.4.1 provides a clue for how we develop the general technique known as Integration by Parts, which
comes from reversing the Product Rule. Recall that the Product Rule states that
\[ \frac{d}{dx}[f(x)g(x)] = f(x)g'(x) + g(x)f'(x). \tag{5.4.7} \]

Integrating both sides of this equation indefinitely with respect to \(x\), it follows that

\[ \int \frac{d}{dx}[f(x)g(x)]\,dx = \int f(x)g'(x)\,dx + \int g(x)f'(x)\,dx. \tag{5.4.8} \]

On the left in Equation 5.4.8, we recognize that we have the indefinite integral of the derivative of a function which, up to an additional constant, is the original function itself. Temporarily omitting the constant that may arise, we equivalently have

\[ f(x)g(x) = \int f(x)g'(x)\,dx + \int g(x)f'(x)\,dx. \tag{5.4.9} \]

The most important thing to observe about Equation 5.4.9 is that it provides us with a choice of two integrals to evaluate. That is, in a situation where we can identify two functions \(f\) and \(g\), if we can integrate \(f(x)g'(x)\), then we know the indefinite integral of \(g(x)f'(x)\), and vice versa. To that end, we choose the first indefinite integral on the right side of Equation 5.4.9 and solve for it to generate the rule

\[ \int f(x)g'(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx. \tag{5.4.10} \]

Often we express Equation 5.4.10 in terms of the variables \(u\) and \(v\), where \(u = f(x)\) and \(v = g(x)\). Note that in differential notation,

\[ du = f'(x)\,dx \tag{5.4.11} \]

and

\[ dv = g'(x)\,dx \tag{5.4.12} \]

and thus we can state the rule for Integration by Parts in its most common form as follows.

\[ \int u\,dv = uv - \int v\,du. \tag{5.4.13} \]

To apply Integration by Parts, we look for a product of basic functions that we can identify as \(u\) and \(dv\). If we can antidifferentiate \(dv\) to find \(v\), and evaluating \(\int v\,du\) is not more difficult than evaluating \(\int u\,dv\), then this substitution usually proves to be fruitful. To demonstrate, we consider the following example.

Example 5.4.1:

Evaluate the indefinite integral

∫ x cos(x) dx (5.4.14)

using Integration by Parts.


Solution
Whenever we are trying to integrate a product of basic functions through Integration by Parts, we are presented with a choice for u and
dv. In the current problem, we can either let u = x and dv = cos(x) dx , or let u = cos(x) and dv = x dx . While there is not a
universal rule for how to choose \(u\) and \(dv\), a good guideline is this: do so in a way that \(\int v\,du\) is at least as simple as the original problem \(\int u\,dv\). In this setting, this leads us to choose \(u = x\) and \(dv = \cos(x)\,dx\), from which it follows that \(du = 1\,dx\) and \(v = \sin(x)\). With this substitution, the rule for Integration by Parts tells us that

\[ \int x\cos(x)\,dx = x\sin(x) - \int \sin(x) \cdot 1\,dx. \tag{5.4.15} \]

Observe that if we considered the alternate choice, and let \(u = \cos(x)\) and \(dv = x\,dx\), then \(du = -\sin(x)\,dx\) and \(v = \frac{1}{2}x^2\), from which we would write

\[ \int x\cos(x)\,dx = \frac{1}{2}x^2\cos(x) - \int \frac{1}{2}x^2(-\sin(x))\,dx. \tag{5.4.16} \]

Thus we have replaced the problem of integrating \(x\cos(x)\) with that of integrating \(\frac{1}{2}x^2\sin(x)\); the latter is clearly more complicated, which shows that this alternate choice is not as helpful as the first choice.
At this point, all that remains to do is evaluate the (simpler) integral \(\int \sin(x) \cdot 1\,dx\). Doing so, we find

\[ \int x\cos(x)\,dx = x\sin(x) - (-\cos(x)) + C = x\sin(x) + \cos(x) + C. \tag{5.4.17} \]

There are at least two additional important observations to make from Example 5.4.1. First, the general technique of Integration by Parts involves trading the problem of integrating the product of two functions for the problem of integrating the product of two related functions. In particular, we convert the problem of evaluating \(\int u\,dv\) for that of evaluating \(\int v\,du\). This perspective clearly shapes our choice of \(u\) and \(v\). In Example 5.4.1, the original integral to evaluate was \(\int x\cos(x)\,dx\), and through the substitution provided by Integration by Parts, we were instead able to evaluate

\[ \int \sin(x) \cdot 1\,dx. \tag{5.4.18} \]

Note that the original function x was replaced by its derivative, while cos(x) was replaced by its antiderivative. Second, observe that when
we get to the final stage of evaluating the last remaining antiderivative, it is at this step that we include the integration constant, +C.
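To see the trade between \(\int u\,dv\) and \(\int v\,du\) with actual numbers, we can apply the Integration by Parts rule between limits and compare both sides. The sketch below is our own illustration and not part of the original text: using the rule on a definite integral over \([0, 2]\), the helper name `midpoint_sum`, and the number of subintervals are all assumptions made for checking purposes.

```python
import math

def midpoint_sum(fn, a, b, n=2000):
    """Midpoint Riemann sum of fn on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(fn(a + (i + 0.5) * dx) for i in range(n)) * dx

def antiderivative(x):
    """Result from Example 5.4.1 with C = 0: x sin(x) + cos(x)."""
    return x * math.sin(x) + math.cos(x)

a, b = 0.0, 2.0

# Left side: the integral of u dv, with u = x and dv = cos(x) dx.
int_u_dv = midpoint_sum(lambda x: x * math.cos(x), a, b)

# Right side: uv evaluated from a to b, minus the integral of v du,
# where v = sin(x) and du = 1 dx.
uv = b * math.sin(b) - a * math.sin(a)
int_v_du = midpoint_sum(math.sin, a, b)

print(f"integral of x cos(x) on [0, 2]    : {int_u_dv:.6f}")
print(f"uv minus integral of sin(x)       : {uv - int_v_du:.6f}")
print(f"antiderivative(2) - antiderivative(0): {antiderivative(b) - antiderivative(a):.6f}")
```

All three printed values should agree to several decimal places, which is consistent with the antiderivative found in Example 5.4.1.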

Activity 5.4.2:

Evaluate each of the following indefinite integrals. Check each antiderivative that you find by differentiating.
a. \(\int te^{-t}\,dt\)
b. \(\int 4x\sin(3x)\,dx\)
c. \(\int z\sec^2(z)\,dz\)
d. \(\int x\ln(x)\,dx\)

Some Subtleties with Integration by Parts


There are situations where Integration by Parts is not an obvious choice, but the technique is appropriate nonetheless. One guide to
understanding why is the observation that integration by parts allows us to replace one function in a product with its derivative while
replacing the other with its antiderivative. For instance, consider the problem of evaluating \(\int \arctan(x)\,dx\). Initially, this problem seems ill-suited to Integration by Parts, since there does not appear to be a product of functions present. But if we note that

\[ \arctan(x) = \arctan(x)\cdot 1, \tag{5.4.19} \]

and realize that we know the derivative of \(\arctan(x)\) as well as the antiderivative of 1, we see the possibility for the substitution \(u = \arctan(x)\) and \(dv = 1\,dx\). We explore this substitution further in Activity 5.4.3.
In a related problem, if we consider

\[ \int t^3\sin(t^2)\,dt, \tag{5.4.20} \]

two key observations can be made about the algebraic structure of the integrand: there is a composite function present in \(\sin(t^2)\), and there is not an obvious function-derivative pair, as we have \(t^3\) present (rather than simply \(t\)) multiplying \(\sin(t^2)\). This problem exemplifies the situation where we sometimes use both u-substitution and Integration by Parts in a single problem. If we write \(t^3 = t\cdot t^2\) and consider the indefinite integral \(\int t\cdot t^2\cdot\sin(t^2)\,dt\), we can use a mix of the two techniques we have recently learned. First, let \(z = t^2\) so that \(dz = 2t\,dt\), and thus \(t\,dt = \frac{1}{2}\,dz\). (We are using the variable \(z\) to perform a “z-substitution” since \(u\) will be used subsequently in executing Integration by Parts.) Under this z-substitution, we now have

\[ \int t\cdot t^2\cdot\sin(t^2)\,dt = \int z\cdot\sin(z)\cdot\frac{1}{2}\,dz. \tag{5.4.21} \]

The remaining integral is a standard one that can be evaluated by parts. This, too, is explored further in Activity 5.4.3. The problems briefly introduced here exemplify that we sometimes must think creatively in choosing the variables for substitution in Integration by Parts,
as well as that it is entirely possible that we will need to use the technique of substitution for an additional change of variables within the
process of integrating by parts.
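The z-substitution in Equation 5.4.21 can be sanity-checked numerically without evaluating either antiderivative. The sketch below compares the two sides as definite integrals; the interval \(0 \le t \le 2\) (so \(0 \le z \le 4\)), the helper `midpoint_sum`, and the number of subintervals are assumptions made only for this illustration.

```python
import math

def midpoint_sum(f, a, b, n=4000):
    """Middle Riemann sum of f on [a, b] with n subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

# Left side of (5.4.21): integrate t^3 sin(t^2) for t from 0 to 2.
original = midpoint_sum(lambda t: t**3 * math.sin(t**2), 0.0, 2.0)

# Right side: z = t^2 sends t in [0, 2] to z in [0, 4], and t dt = (1/2) dz.
transformed = midpoint_sum(lambda z: 0.5 * z * math.sin(z), 0.0, 4.0)

print(f"integral in t: {original:.6f}")
print(f"integral in z: {transformed:.6f}")
```

Note that the limits of integration change along with the variable; the two estimates should agree up to the small error of the Riemann sums.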

Activity 5.4.3

Evaluate each of the following indefinite integrals, using the provided hints.
a. Evaluate \(\int \arctan(x)\,dx\) by using Integration by Parts with the substitution \(u = \arctan(x)\) and \(dv = 1\,dx\).
b. Evaluate \(\int \ln(z)\,dz\). Consider a similar substitution to the one in (a).
c. Use the substitution \(z = t^2\) to transform the integral \(\int t^3\sin(t^2)\,dt\) to a new integral in the variable \(z\), and evaluate that new integral by parts.
d. Evaluate \(\int s^5 e^{s^3}\,ds\) using an approach similar to that described in (c).
e. Evaluate \(\int e^{2t}\cos(e^t)\,dt\). You will find it helpful to note that \(e^{2t} = e^t\cdot e^t\).

Using Integration by Parts Multiple Times


We have seen that the technique of Integration by Parts is well suited to integrating the product of basic functions, and that it allows us to
essentially trade a given integrand for a new one where one function in the product is replaced by its derivative, while the other is replaced
by its antiderivative. The main goal in this trade of \(\int u\,dv\) for \(\int v\,du\) is to have the new integral not be more challenging to evaluate than the original one. At times, it turns out that it can be necessary to apply Integration by Parts more than once in order to ultimately evaluate a given indefinite integral. For example, if we consider \(\int t^2 e^t\,dt\) and let \(u = t^2\) and \(dv = e^t\,dt\), then it follows that \(du = 2t\,dt\) and \(v = e^t\), thus

\[ \int t^2 e^t\,dt = t^2 e^t - \int 2t e^t\,dt. \tag{5.4.22} \]

The integral on the right-hand side is simpler to evaluate than the one on the left, but it still requires Integration by Parts. Now letting \(u = 2t\) and \(dv = e^t\,dt\), we have \(du = 2\,dt\) and \(v = e^t\), so that

\[ \int t^2 e^t\,dt = t^2 e^t - \left(2t e^t - \int 2e^t\,dt\right). \tag{5.4.23} \]

Note the key role of the parentheses, as it is essential to distribute the minus sign to the entire value of the integral \(\int 2t e^t\,dt\). The final integral on the right in the most recent equation is a basic one; evaluating that integral and distributing the minus sign, we find

\[ \int t^2 e^t\,dt = t^2 e^t - 2t e^t + 2e^t + C. \tag{5.4.24} \]
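The repeated use of parts above follows a pattern that is easy to mechanize when the integrand is a polynomial times \(e^t\). The sketch below is one hypothetical way to do so: a polynomial is stored as a coefficient list `[c0, c1, c2, ...]` (assumed to end in a nonzero coefficient), and the function names are illustrative rather than taken from the text.

```python
def differentiate(coeffs):
    """Derivative of c0 + c1*t + c2*t^2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def integrate_poly_times_exp(coeffs):
    """Return (q, steps) where the integral of p(t) e^t dt is q(t) e^t + C.

    Each pass takes u = (current polynomial) and dv = e^t dt, so that
    integral(u e^t dt) = u e^t - integral(u' e^t dt).
    """
    q = [0] * len(coeffs)
    u, sign, steps = list(coeffs), 1, 0
    while u:
        for k, c in enumerate(u):
            q[k] += sign * c      # the term (sign) * u * e^t
        u = differentiate(u)      # the remaining integrand is u' * e^t
        sign = -sign              # minus sign in front of the remaining integral
        if u:
            steps += 1            # u was not constant, so this pass used parts
    return q, steps

q, steps = integrate_poly_times_exp([0, 0, 1])   # p(t) = t^2
print(q, steps)
```

For \(p(t) = t^2\) the returned coefficients are `[2, -2, 1]` after two applications of parts, which corresponds to \((t^2 - 2t + 2)e^t\) and agrees with Equation 5.4.24. Calling the function with `[0, 0, 0, 1]` reports three applications for \(t^3 e^t\).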

Of course, situations are possible where even more than two applications of Integration by Parts may be necessary. For instance, in the preceding example, it is apparent that if the integrand were \(t^3 e^t\) instead, we would have to use Integration by Parts three times.

Next, we consider the slightly different scenario presented by the indefinite integral \(\int e^t\cos(t)\,dt\). Here, we can choose to let \(u\) be either \(e^t\) or \(\cos(t)\); we pick \(u = \cos(t)\), and thus \(dv = e^t\,dt\). With \(du = -\sin(t)\,dt\) and \(v = e^t\), Integration by Parts tells us that

\[ \int e^t\cos(t)\,dt = e^t\cos(t) - \int e^t(-\sin(t))\,dt, \tag{5.4.25} \]

or equivalently that

\[ \int e^t\cos(t)\,dt = e^t\cos(t) + \int e^t\sin(t)\,dt. \tag{5.4.26} \]

Observe that the integral on the right in Equation 5.4.26, \(\int e^t\sin(t)\,dt\), while not more complicated than the original integral we want to evaluate, is essentially identical to \(\int e^t\cos(t)\,dt\). While the overall situation isn’t necessarily better than what we started with, the problem hasn’t gotten worse. Thus, we proceed by integrating by parts again. This time we let \(u = \sin(t)\) and \(dv = e^t\,dt\), so that \(du = \cos(t)\,dt\) and \(v = e^t\), which implies

\[ \int e^t\cos(t)\,dt = e^t\cos(t) + \left(e^t\sin(t) - \int e^t\cos(t)\,dt\right). \tag{5.4.27} \]

We seem to be back where we started, as two applications of Integration by Parts have led us back to the original problem, \(\int e^t\cos(t)\,dt\). But if we look closely at Equation 5.4.27, we see that we can use algebra to solve for the value of the desired integral. In particular, adding \(\int e^t\cos(t)\,dt\) to both sides of the equation, we have

\[ 2\int e^t\cos(t)\,dt = e^t\cos(t) + e^t\sin(t), \tag{5.4.28} \]

and therefore

\[ \int e^t\cos(t)\,dt = \frac{1}{2}\left(e^t\cos(t) + e^t\sin(t)\right) + C. \tag{5.4.29} \]

Note that since we never actually encountered an integral we could evaluate directly, we didn’t have the opportunity to add the integration
constant C until the final step, at which point we include it as part of the most general antiderivative that we sought from the outset in
evaluating an indefinite integral.
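Because the antiderivative in Equation 5.4.29 came from solving an equation rather than from a direct evaluation, it is worth an independent check. The sketch below uses the Fundamental Theorem of Calculus on the interval [0, 1] and compares with a middle Riemann sum; the interval, the helper `midpoint_sum`, and the value of `n` are assumptions chosen for illustration.

```python
import math

def F(t):
    """Antiderivative from Equation 5.4.29 with C = 0."""
    return 0.5 * (math.exp(t) * math.cos(t) + math.exp(t) * math.sin(t))

def midpoint_sum(f, a, b, n=2000):
    """Middle Riemann sum of f on [a, b] with n subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

# Fundamental Theorem of Calculus versus a direct numerical estimate.
via_antiderivative = F(1.0) - F(0.0)
via_riemann_sum = midpoint_sum(lambda t: math.exp(t) * math.cos(t), 0.0, 1.0)

print(f"F(1) - F(0)      = {via_antiderivative:.6f}")
print(f"middle sum M_2000 = {via_riemann_sum:.6f}")
```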

Activity 5.4.4

Evaluate each of the following indefinite integrals.


a. \(\int x^2\sin(x)\,dx\)
b. \(\int t^3\ln(t)\,dt\)
c. \(\int e^z\sin(z)\,dz\)
d. \(\int s^2 e^{3s}\,ds\)
e. \(\int t\arctan(t)\,dt\) (Hint: At a certain point in this problem, it is very helpful to note that \(\frac{t^2}{1+t^2} = 1 - \frac{1}{1+t^2}\).)

Evaluating Definite Integrals Using Integration by Parts


Just as we saw with u-substitution in Section 5.3, we can use the technique of Integration by Parts to evaluate a definite integral. Say, for
example, we wish to find the exact value of
\[ \int_0^{\pi/2} t\sin(t)\,dt. \tag{5.4.30} \]

One option is to evaluate the related indefinite integral to find that

\[ \int t\sin(t)\,dt = -t\cos(t) + \sin(t) + C, \tag{5.4.31} \]

and then use the resulting antiderivative along with the Fundamental Theorem of Calculus to find that

\[ \int_0^{\pi/2} t\sin(t)\,dt = \Big(-t\cos(t) + \sin(t)\Big)\Big|_0^{\pi/2} = \left(-\frac{\pi}{2}\cos\left(\frac{\pi}{2}\right) + \sin\left(\frac{\pi}{2}\right)\right) - \left(-0\cos(0) + \sin(0)\right) = 1. \tag{5.4.32} \]

Alternatively, we can apply Integration by Parts and work with definite integrals throughout. In this perspective, it is essential to remember to evaluate the product \(uv\) over the given limits of integration. To that end, using the substitution \(u = t\) and \(dv = \sin(t)\,dt\), so that \(du = dt\) and \(v = -\cos(t)\), we write

\[ \begin{aligned} \int_0^{\pi/2} t\sin(t)\,dt &= -t\cos(t)\Big|_0^{\pi/2} - \int_0^{\pi/2} (-\cos(t))\,dt \\ &= -t\cos(t)\Big|_0^{\pi/2} + \sin(t)\Big|_0^{\pi/2} \\ &= \left(-\frac{\pi}{2}\cos\left(\frac{\pi}{2}\right) + \sin\left(\frac{\pi}{2}\right)\right) - \left(-0\cos(0) + \sin(0)\right) = 1. \end{aligned} \tag{5.4.33} \]
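The definite form of Integration by Parts, in which \(uv\) is evaluated over the limits and \(\int v\,du\) is subtracted, can be mirrored step by step in a short computation. This is only a sketch: the helper `midpoint_sum`, the lambda names, and the choice of `n` are assumptions, and the remaining integral is estimated numerically rather than evaluated exactly.

```python
import math

def midpoint_sum(f, a, b, n=1000):
    """Middle Riemann sum of f on [a, b] with n subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

a, b = 0.0, math.pi / 2
u = lambda t: t                 # u = t, so du = dt
v = lambda t: -math.cos(t)      # dv = sin(t) dt, so v = -cos(t)

boundary_term = u(b) * v(b) - u(a) * v(a)   # uv evaluated from 0 to pi/2
remaining = midpoint_sum(v, a, b)           # integral of v du, with du = dt
by_parts = boundary_term - remaining

direct = midpoint_sum(lambda t: t * math.sin(t), a, b)

print(f"uv term           = {boundary_term:.6f}")
print(f"by parts estimate = {by_parts:.6f}")
print(f"direct estimate   = {direct:.6f}")
```

Both estimates should be close to the exact value of 1 found above; forgetting the boundary term would go unnoticed here only because it happens to be zero for this particular integral.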

As with any substitution technique, it is important to remember the overall goal of the problem, to use notation carefully and completely, and to think about our end result to ensure that it makes sense in the context of the question being answered.

When u-substitution and Integration by Parts Fail to Help


As we close this section, it is important to note that both integration techniques we have discussed apply in relatively limited circumstances. In particular, it is not hard to find examples of functions for which neither technique produces an antiderivative; indeed, there are many, many functions that appear elementary but that do not have an elementary algebraic antiderivative. For instance, if we consider the indefinite integrals

\[ \int e^{x^2}\,dx \tag{5.4.34} \]

and

\[ \int x\tan(x)\,dx, \tag{5.4.35} \]

neither u-substitution nor Integration by Parts proves fruitful. While there are other integration techniques, some of which we will consider briefly, none of them enables us to find an algebraic antiderivative for \(e^{x^2}\) or \(x\tan(x)\). There are at least two key observations to make: one, we do know from the Second Fundamental Theorem of Calculus that we can construct an integral antiderivative for each function; and two, antidifferentiation is much, much harder in general than differentiation. In particular, we observe that \(F(x) = \int_0^x e^{t^2}\,dt\) is an antiderivative of \(f(x) = e^{x^2}\), and \(G(x) = \int_0^x t\tan(t)\,dt\) is an antiderivative of \(g(x) = x\tan(x)\). But finding an elementary algebraic formula that doesn’t involve integrals for either \(F\) or \(G\) turns out not only to be impossible through u-substitution or Integration by Parts, but indeed impossible altogether.
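Even without an algebraic formula, the integral antiderivative \(F(x) = \int_0^x e^{t^2}\,dt\) can be computed and its slope compared with \(e^{x^2}\). The sketch below does this approximately; the use of a middle Riemann sum for \(F\), the difference quotient in `slope`, and the parameter values are all assumptions for illustration, and the agreement is only as good as those approximations.

```python
import math

def F(x, n=2000):
    """Approximate F(x) = integral of e^(t^2) from 0 to x with a middle Riemann sum."""
    dt = x / n
    return sum(math.exp((dt * (i + 0.5)) ** 2) for i in range(n)) * dt

def slope(x, h=1e-3):
    """Approximate F'(x) with a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.5, 1.0):
    print(f"x = {x}:  F'(x) ~ {slope(x):.5f}   e^(x^2) = {math.exp(x * x):.5f}")
```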

Summary
In this section, we encountered the following important ideas:
Through the method of Integration by Parts, we can evaluate indefinite integrals that involve products of basic functions such as \(\int x\sin(x)\,dx\) and \(\int x\ln(x)\,dx\) through a substitution that enables us to effectively trade one of the functions in the product for its derivative, and the other for its antiderivative, in an effort to find a different product of functions that is easier to integrate.
If we are given an integral whose algebraic structure we can identify as a product of basic functions in the form \(\int f(x)g'(x)\,dx\), we can use the substitution \(u = f(x)\) and \(dv = g'(x)\,dx\) and apply the rule \(\int u\,dv = uv - \int v\,du\) in an effort to evaluate the original integral \(\int f(x)g'(x)\,dx\) by instead evaluating \(\int v\,du = \int f'(x)g(x)\,dx\).
When deciding to integrate by parts, we normally have a product of functions present in the integrand and we have to select both \(u\) and \(dv\). That selection is guided by the overall principle that we desire the new integral \(\int v\,du\) to not be any more difficult or complicated than the original integral \(\int u\,dv\). In addition, it is often helpful to recognize if one of the functions present is much easier to differentiate than antidifferentiate (such as \(\ln(x)\)), in which case that function often is best assigned the variable \(u\). In any case, when choosing \(dv\), the corresponding function must be one that we can antidifferentiate.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand Valley State
University)

5.5: Other Options for Finding Algebraic Antiderivatives
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How does the method of partial fractions enable any rational function to be antidifferentiated?
What role have integral tables historically played in the study of calculus and how can a table be used to evaluate
integrals such as \(\int \sqrt{a^2 + u^2}\,du\)?
What role can a computer algebra system play in the process of finding antiderivatives?

In the preceding sections, we have learned two very specific antidifferentiation techniques: u-substitution and integration
by parts. The former is used to reverse the chain rule, while the latter to reverse the product rule. But we have seen that
each only works in very specialized circumstances. For example, while \(\int x e^{x^2}\,dx\) may be evaluated by u-substitution and \(\int x e^x\,dx\) by integration by parts, neither method provides a route to evaluate \(\int e^{x^2}\,dx\). That fact is not a particular shortcoming of these two antidifferentiation techniques, as it turns out there does not exist an elementary algebraic antiderivative for \(e^{x^2}\). Said differently, no matter what antidifferentiation methods we could develop and learn to execute, none of them will be able to provide us with a simple formula that does not involve integrals for a function \(F(x)\) that satisfies \(F'(x) = e^{x^2}\). In this section of the text, our main goals are to better understand some classes of functions
that can always be antidifferentiated, as well as to learn some options for so doing. At the same time, we want to recognize
that there are many functions for which an algebraic formula for an antiderivative does not exist, and also appreciate the
role that computing technology can play in helping us find antiderivatives of other complicated functions. Throughout, it is
helpful to remember what we have learned so far: how to reverse the chain rule through u-substitution, how to reverse the
product rule through integration by parts, and that overall, there are subtle and challenging issues to address when trying to
find antiderivatives.

Preview Activity 5.5.1:

For each of the indefinite integrals below, the main question is to decide whether the integral can be evaluated using u-
substitution, integration by parts, a combination of the two, or neither. For integrals for which your answer is
affirmative, state the substitution(s) you would use. It is not necessary to actually evaluate any of the integrals
completely, unless the integral can be evaluated immediately using a familiar basic antiderivative.
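a. \(\int x^2\sin(x^3)\,dx\), \(\int x^2\sin(x)\,dx\), \(\int \sin(x^3)\,dx\), \(\int x^5\sin(x^3)\,dx\)
b. \(\int \frac{1}{1+x^2}\,dx\), \(\int \frac{x}{1+x^2}\,dx\), \(\int \frac{2x+3}{1+x^2}\,dx\), \(\int \frac{e^x}{1+(e^x)^2}\,dx\)
c. \(\int x\ln(x)\,dx\), \(\int \frac{\ln(x)}{x}\,dx\), \(\int \ln(1+x^2)\,dx\), \(\int x\ln(1+x^2)\,dx\)
d. \(\int x\sqrt{1-x^2}\,dx\), \(\int \frac{1}{\sqrt{1-x^2}}\,dx\), \(\int \frac{x}{\sqrt{1-x^2}}\,dx\), \(\int \frac{1}{x\sqrt{1-x^2}}\,dx\)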

The Method of Partial Fractions


The method of partial fractions is used to integrate rational functions, and essentially involves reversing the process of finding a common denominator. For example, suppose we have the function \(R(x) = \frac{5x}{x^2 - x - 2}\) and want to evaluate

\[ \int \frac{5x}{x^2 - x - 2}\,dx. \]

Thinking algebraically, if we factor the denominator, we can see how \(R\) might come from the sum of two fractions of the form \(\frac{A}{x-2} + \frac{B}{x+1}\). In particular, suppose that

\[ \frac{5x}{(x-2)(x+1)} = \frac{A}{x-2} + \frac{B}{x+1}. \]

Multiplying both sides of this last equation by \((x-2)(x+1)\), we find that \(5x = A(x+1) + B(x-2)\). Since we want this equation to hold for every value of \(x\), we can use insightful choices of specific x-values to help us find \(A\) and \(B\). Taking \(x = -1\), we have \(5(-1) = A(0) + B(-3)\), and thus \(B = \frac{5}{3}\). Choosing \(x = 2\), it follows \(5(2) = A(3) + B(0)\), so \(A = \frac{10}{3}\). Therefore, we now know that

\[ \int \frac{5x}{x^2 - x - 2}\,dx = \int \left(\frac{10/3}{x-2} + \frac{5/3}{x+1}\right) dx. \]

This equivalent integral expression is straightforward to evaluate, and hence we find that

\[ \int \frac{5x}{x^2 - x - 2}\,dx = \frac{10}{3}\ln|x-2| + \frac{5}{3}\ln|x+1| + C. \]

It turns out that for any rational function \(R(x) = \frac{P(x)}{Q(x)}\) where the degree of the polynomial \(P\) is less than the degree of the polynomial \(Q\), the method of partial fractions can be used to rewrite the rational function as a sum of simpler rational functions of one of the following forms:

\[ \frac{A}{x-c}, \quad \frac{A}{(x-c)^n}, \quad \text{or} \quad \frac{Ax+B}{x^2+k}, \]

where \(A\), \(B\), and \(c\) are real numbers, and \(k\) is a positive real number. Because each of these basic forms is one we can antidifferentiate, partial fractions enables us to antidifferentiate any rational function.

A computer algebra system such as Maple, Mathematica, or WolframAlpha can be used to find the partial fraction decomposition of any rational function. In WolframAlpha, entering partial fraction 5x/(x^2-x-2) results in the output

\[ \frac{5x}{x^2 - x - 2} = \frac{10}{3(x-2)} + \frac{5}{3(x+1)}. \]

We will primarily use technology to generate partial fraction decompositions of rational functions, and then work from there to evaluate the integrals of interest using established methods.
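The "insightful choices of x-values" step for \(\frac{5x}{(x-2)(x+1)}\) can be reproduced with exact rational arithmetic. The following is a small sketch under the assumption of two distinct linear factors; the variable names and test points are illustrative, and the approach is not a general partial fractions routine.

```python
from fractions import Fraction

def numerator(x):
    """Numerator of R(x) = 5x / ((x - 2)(x + 1))."""
    return 5 * x

# From 5x = A(x + 1) + B(x - 2):
#   x = 2  makes the B term vanish, leaving 5(2) = A(3).
#   x = -1 makes the A term vanish, leaving 5(-1) = B(-3).
A = Fraction(numerator(2), 2 + 1)
B = Fraction(numerator(-1), -1 - 2)
print("A =", A, " B =", B)

# Check the decomposition exactly at a few points away from x = 2 and x = -1.
test_points = [-3, 0, 1, 3, 5]
for x in test_points:
    lhs = Fraction(numerator(x), (x - 2) * (x + 1))
    rhs = A / (x - 2) + B / (x + 1)
    print(x, lhs, rhs, lhs == rhs)
```

Using `Fraction` avoids floating-point rounding, so the comparison at each test point is exact.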

Activity 5.5.2:

For each of the following problems, evaluate the integral by using the partial fraction decomposition provided.
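a. \(\int \frac{1}{x^2 - 2x - 3}\,dx\), given that \(\frac{1}{x^2 - 2x - 3} = \frac{1/4}{x-3} - \frac{1/4}{x+1}\)
b. \(\int \frac{x^2 + 1}{x^3 - x^2}\,dx\), given that \(\frac{x^2+1}{x^3 - x^2} = -\frac{1}{x} - \frac{1}{x^2} + \frac{2}{x-1}\)
c. \(\int \frac{x - 2}{x^4 + x^2}\,dx\), given that \(\frac{x-2}{x^4 + x^2} = \frac{1}{x} - \frac{2}{x^2} + \frac{-x+2}{1+x^2}\)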

If the degree of P is greater than or equal to the degree of Q, long division may be used to write R as the sum of a
polynomial plus a rational function where the numerator’s degree is less than the denominator’s.

Using an Integral Table


Calculus has a long history, with key ideas going back as far as Greek mathematicians in 400-300 BC. Its main
foundations were first investigated and understood independently by Isaac Newton and Gottfried Wilhelm Leibniz in the
late 1600s, making the modern ideas of calculus well over 300 years old. It is instructive to realize that until the late 1980s,
the personal computer essentially did not exist, so calculus (and other mathematics) had to be done by hand for roughly
300 years. During the last 30 years, however, computers have revolutionized many aspects of the world we live in,
including mathematics. In this section we take a short historical tour to precede the following discussion of the role
computer algebra systems can play in evaluating indefinite integrals. In particular, we consider a class of integrals
involving certain radical expressions that, until the advent of computer algebra systems, were often evaluated using an
integral table. As seen in the short table of integrals found in Appendix A, there are also many forms of integrals that involve \(\sqrt{a^2 \pm w^2}\) and \(\sqrt{w^2 - a^2}\). These integral rules can be developed using a technique known as trigonometric substitution that we choose to omit; instead, we will simply accept the results presented in the table. To see how these rules are needed and used, consider the differences among

\[ \int \frac{1}{\sqrt{1-x^2}}\,dx, \quad \int \frac{x}{\sqrt{1-x^2}}\,dx, \quad \text{and} \quad \int \sqrt{1-x^2}\,dx. \]

The first integral is a familiar basic one, and results in \(\arcsin(x) + C\). The second integral can be evaluated using a standard u-substitution with \(u = 1 - x^2\). The third, however, is not familiar and does not lend itself to u-substitution. In Appendix A, we find the rule

\[ (8) \quad \int \sqrt{a^2 - u^2}\,du = \frac{u}{2}\sqrt{a^2 - u^2} + \frac{a^2}{2}\arcsin\left(\frac{u}{a}\right) + C. \]

Using the substitutions \(a = 1\) and \(u = x\) (so that \(du = dx\)), it follows that

\[ \int \sqrt{1-x^2}\,dx = \frac{x}{2}\sqrt{1-x^2} + \frac{1}{2}\arcsin(x) + C. \]

One important point to note is that whenever we are applying a rule in the table, we are doing a u-substitution. This is especially key when the situation is more complicated than allowing \(u = x\) as in the last example. For instance, say we wish to evaluate the integral \(\int \sqrt{9 + 64x^2}\,dx\). Once again we want to use a rule from the table, this time Rule (3), with \(a = 3\) and \(u = 8x\); we also choose the “+” option in the rule. With this substitution, it follows that \(du = 8\,dx\), so \(dx = \frac{1}{8}\,du\). Applying this substitution,

\[ \int \sqrt{9 + 64x^2}\,dx = \int \sqrt{9 + u^2}\cdot\frac{1}{8}\,du = \frac{1}{8}\int \sqrt{9 + u^2}\,du. \]

By Rule (3), we now find that

\[ \begin{aligned} \int \sqrt{9 + 64x^2}\,dx &= \frac{1}{8}\left(\frac{u}{2}\sqrt{u^2 + 9} + \frac{9}{2}\ln\left|u + \sqrt{u^2 + 9}\right|\right) + C \\ &= \frac{1}{8}\left(\frac{8x}{2}\sqrt{64x^2 + 9} + \frac{9}{2}\ln\left|8x + \sqrt{64x^2 + 9}\right|\right) + C. \end{aligned} \]

In problems such as this one, it is essential that we not forget to account for the factor of \(\frac{1}{8}\) that must be present in the evaluation.
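The role of the factor of \(\frac{1}{8}\) can be seen numerically. The sketch below evaluates the table-based antiderivative of \(\sqrt{9 + 64x^2}\) on [0, 1], once with the factor and once without, and compares both with a middle Riemann sum. The interval, the helper `midpoint_sum`, and the function name `table_rule_3` are assumptions introduced only for this illustration.

```python
import math

def midpoint_sum(f, a, b, n=2000):
    """Middle Riemann sum of f on [a, b] with n subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def table_rule_3(u, a=3.0):
    """Rule (3) with the '+' option: an antiderivative of sqrt(u^2 + a^2) in u."""
    root = math.sqrt(u * u + a * a)
    return 0.5 * u * root + 0.5 * a * a * math.log(abs(u + root))

def G(x):
    """Antiderivative of sqrt(9 + 64x^2), including the factor 1/8 from dx = du/8."""
    return table_rule_3(8 * x) / 8

numeric = midpoint_sum(lambda x: math.sqrt(9 + 64 * x * x), 0.0, 1.0)
with_factor = G(1.0) - G(0.0)
without_factor = table_rule_3(8.0) - table_rule_3(0.0)   # the 1/8 was forgotten

print(f"middle sum estimate : {numeric:.5f}")
print(f"with factor 1/8     : {with_factor:.5f}")
print(f"without factor 1/8  : {without_factor:.5f}")
```

The version that omits the factor is off by exactly a multiple of 8, which is the kind of error a quick numerical comparison catches immediately.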

Activity 5.5.3:

For each of the following integrals, evaluate the integral using u-substitution and/or an entry from the table found in
Appendix A.
a. \(\int \sqrt{x^2 + 4}\,dx\)
b. \(\int \frac{x}{\sqrt{x^2 + 4}}\,dx\)
c. \(\int \frac{2}{\sqrt{16 + 25x^2}}\,dx\)
d. \(\int \frac{1}{x^2\sqrt{49 - 36x^2}}\,dx\)

Using Computer Algebra Systems
A computer algebra system (CAS) is a computer program that is capable of executing symbolic mathematics. For a simple
example, if we ask a CAS to solve the equation \(ax^2 + bx + c = 0\) for the variable \(x\), where \(a\), \(b\), and \(c\) are arbitrary constants, the program will return \(x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\). While research to develop the first CAS dates to the 1960s, these
programs became more common and publicly available in the early 1990s. Two prominent early examples are the
programs Maple and Mathematica, which were among the first computer algebra systems to offer a graphical user
interface. Today, Maple and Mathematica are exceptionally powerful professional software packages that are capable of
executing an amazing array of sophisticated mathematical computations. They are also very expensive, as each is a
proprietary program. The CAS SAGE is an open-source, free alternative to Maple and Mathematica.
For the purposes of this text, when we need to use a CAS, we are going to turn instead to a similar, but somewhat different
computational tool, the web-based “computational knowledge engine” called WolframAlpha. There are two features of
WolframAlpha that make it stand out from the CAS options mentioned above: (1) unlike Maple and Mathematica,
WolframAlpha is free (provided we are willing to suffer through some pop-up advertising); and (2) unlike any of the three,
the syntax in WolframAlpha is flexible. Think of WolframAlpha as being a little bit like doing a Google search: the
program will interpret what is input, and then provide a summary of options. If we want to have WolframAlpha evaluate an
integral for us, we can provide it syntax such as integrate x^2 dx to which the program responds with \(\int x^2\,dx = \frac{x^3}{3} + \text{constant}\). While there is much to be enthusiastic about regarding CAS programs such as WolframAlpha, there are several
things we should be cautious about:
1. a CAS only responds to exactly what is input;
2. a CAS can answer using powerful functions from highly advanced mathematics; and
3. there are problems that even a CAS cannot do without additional human insight.
Although (1) likely goes without saying, we have to be careful with our input: if we enter syntax that defines a function other than the problem of interest, the CAS will work with precisely the function we define. For example, if we are interested in evaluating the integral

\[ \int \frac{1}{16 - 5x^2}\,dx, \tag{5.5.1} \]

and we mistakenly enter integrate 1/16 - 5x^2 dx a CAS will (correctly) reply with \(\frac{1}{16}x - \frac{5}{3}x^3\). It is essential that we are sufficiently well-versed in antidifferentiation to recognize that this function cannot be the one that we seek: integrating a rational function such as \(\frac{1}{16 - 5x^2}\), we expect the logarithm function to be present in the result.

Regarding (2), even for a relatively simple integral such as \(\int \frac{1}{16 - 5x^2}\,dx\), some CASs will invoke advanced functions rather than simple ones. For instance, if we use Maple to execute the command int(1/(16-5*x^2), x); the program responds with

\[ \int \frac{1}{16 - 5x^2}\,dx = \frac{\sqrt{5}}{20}\operatorname{arctanh}\left(\frac{\sqrt{5}}{4}x\right). \]

While this is correct (save for the missing arbitrary constant, which Maple never reports), the inverse hyperbolic tangent function is neither common nor familiar; a simpler way to express this function can be found by using the partial fractions method, and happens to be the result reported by WolframAlpha:

\[ \int \frac{1}{16 - 5x^2}\,dx = \frac{1}{8\sqrt{5}}\left(\log(4\sqrt{5} + 5x) - \log(4\sqrt{5} - 5x)\right) + \text{constant}. \tag{5.5.2} \]
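Both cautions can be illustrated with ordinary floating-point arithmetic. In the sketch below, which is not CAS output, Python's operator precedence plays the role of the mistyped input, and the two reported antiderivatives are compared at a few points inside the interval \(|x| < 4/\sqrt{5}\) where both formulas are defined. The function names and sample points are assumptions for illustration.

```python
import math

def intended(x):
    """The integrand we meant: 1 / (16 - 5x^2)."""
    return 1 / (16 - 5 * x**2)

def mistyped(x):
    """What 1/16 - 5x^2 actually means without parentheses."""
    return 1 / 16 - 5 * x**2

def arctanh_form(x):
    """Antiderivative in the inverse hyperbolic tangent form."""
    return math.sqrt(5) / 20 * math.atanh(math.sqrt(5) * x / 4)

def log_form(x):
    """Antiderivative in the logarithmic form of Equation 5.5.2."""
    s = math.sqrt(5)
    return (math.log(4 * s + 5 * x) - math.log(4 * s - 5 * x)) / (8 * s)

print("intended(1) =", intended(1.0), "  mistyped(1) =", mistyped(1.0))
for x in (-1.0, 0.0, 0.5, 1.5):
    print(f"x = {x:4.1f}   arctanh form = {arctanh_form(x): .10f}   log form = {log_form(x): .10f}")
```

The two antiderivative forms agree at every sample point, so the unfamiliar function is simply another way of writing the logarithmic answer, while the mistyped input defines a genuinely different function.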

Using sophisticated functions from more advanced mathematics is sometimes the way a CAS says to the user “I don’t know how to do this problem.” For example, if we want to evaluate \(\int e^{-x^2}\,dx\), and we ask WolframAlpha to do so, the input integrate exp(-x^2) dx results in the output

\[ \int e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\operatorname{erf}(x) + \text{constant}. \tag{5.5.3} \]

The function “erf(x)” is the error function, which is actually defined by an integral:

\[ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt. \tag{5.5.4} \]
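Equations 5.5.3 and 5.5.4 can be compared numerically, since Python's standard library provides `math.erf`. The sketch below estimates \(\int_0^x e^{-t^2}\,dt\) with a middle Riemann sum and compares it with \(\frac{\sqrt{\pi}}{2}\operatorname{erf}(x)\); the helper `midpoint_sum`, the sample values of \(x\), and `n` are assumptions for illustration.

```python
import math

def midpoint_sum(f, a, b, n=2000):
    """Middle Riemann sum of f on [a, b] with n subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def gaussian_area(x):
    """Numerical estimate of the integral of e^(-t^2) from 0 to x."""
    return midpoint_sum(lambda t: math.exp(-t * t), 0.0, x)

def via_erf(x):
    """The same quantity written with the error function."""
    return math.sqrt(math.pi) / 2 * math.erf(x)

for x in (0.5, 1.0, 2.0):
    print(f"x = {x}:  Riemann sum = {gaussian_area(x):.7f}   via erf = {via_erf(x):.7f}")
```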

So, in producing output involving an integral, the CAS has basically reported back to us the very question we asked.
Finally, as remarked at (3) above, there are times that a CAS will actually fail without some additional human insight. If
we consider the integral

\[ \int (1 + x)e^x\sqrt{1 + x^2 e^{2x}}\,dx \tag{5.5.5} \]

and ask WolframAlpha to evaluate

int (1+x) * exp(x) * sqrt(1+x^2 * exp(2x)) dx, (5.5.6)

the program thinks for a moment and then reports (no result found in terms of standard mathematical functions). But in fact this integral is not that difficult to evaluate. If we let \(u = xe^x\), then \(du = (1 + x)e^x\,dx\), which means that the preceding integral has form

\[ \int (1 + x)e^x\sqrt{1 + x^2 e^{2x}}\,dx = \int \sqrt{1 + u^2}\,du, \]

which is a straightforward one for any CAS to evaluate.
So, the above observations regarding computer algebra systems lead us to proceed with some caution: while any CAS is
capable of evaluating a wide range of integrals (both definite and indefinite), there are times when the result can mislead
us. We must think carefully about the meaning of the output, whether it is consistent with what we expect, and whether or
not it makes sense to proceed.
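The human insight in the last example, the substitution \(u = xe^x\), can be confirmed numerically. The sketch below compares a middle Riemann sum of the original integrand on [0, 1] with the closed form for \(\int_0^{e}\sqrt{1 + u^2}\,du\) obtained from Rule (3) of the integral table with \(a = 1\). The interval, the helper names, and `n` are assumptions made for illustration.

```python
import math

def midpoint_sum(f, a, b, n=4000):
    """Middle Riemann sum of f on [a, b] with n subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def original_integrand(x):
    return (1 + x) * math.exp(x) * math.sqrt(1 + x**2 * math.exp(2 * x))

def antiderivative_in_u(u):
    """Table Rule (3) with a = 1 and the '+' option."""
    root = math.sqrt(u * u + 1)
    return 0.5 * u * root + 0.5 * math.log(abs(u + root))

# u = x e^x sends x = 0 to u = 0 and x = 1 to u = e.
numeric = midpoint_sum(original_integrand, 0.0, 1.0)
closed_form = antiderivative_in_u(math.e) - antiderivative_in_u(0.0)

print(f"Riemann sum in x : {numeric:.6f}")
print(f"closed form in u : {closed_form:.6f}")
```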

Summary
In this section, we encountered the following important ideas:
The method of partial fractions enables any rational function to be antidifferentiated, because any polynomial function can be factored into a product of linear and irreducible quadratic terms. This allows any rational function to be written as the sum of a polynomial plus rational terms of the form \(\frac{A}{(x-c)^n}\) (where \(n\) is a natural number) and \(\frac{Bx + C}{x^2 + k}\) (where \(k\) is a positive real number).
Until the development of computer algebra systems, integral tables enabled students of calculus to more easily evaluate integrals such as \(\int \sqrt{a^2 + u^2}\,du\), where \(a\) is a positive real number. A short table of integrals may be found in Appendix A.
Computer algebra systems can play an important role in finding antiderivatives, though we must be cautious to use
correct input, to watch for unusual or unfamiliar advanced functions that the CAS may cite in its result, and to consider
the possibility that a CAS may need further assistance or insight from us in order to answer a particular question.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

5.6: Numerical Integration
Learning Objectives

In this section, we strive to understand the ideas generated by the following important questions:
How do we accurately evaluate a definite integral such as \(\int_0^1 e^{-x^2}\,dx\) when we cannot use the First
Fundamental Theorem of Calculus because the integrand lacks an elementary algebraic antiderivative? Are there
ways to generate accurate estimates without using extremely large values of n in Riemann sums?
What is the Trapezoid Rule, and how is it related to left, right, and middle Riemann sums?
How are the errors in the Trapezoid Rule and Midpoint Rule related, and how can they be used to develop an even
more accurate rule?

When we were first exploring the problem of finding the net-signed area bounded by a curve, we developed the concept of
a Riemann sum as a helpful estimation tool and a key step in the definition of the definite integral. In particular, as we
found in Section 4.2, recall that the left, right, and middle Riemann sums of a function \(f\) on an interval \([a, b]\) are denoted \(L_n\), \(R_n\), and \(M_n\), with formulas

\[ L_n = f(x_0)\Delta x + f(x_1)\Delta x + \cdots + f(x_{n-1})\Delta x = \sum_{i=0}^{n-1} f(x_i)\Delta x, \tag{5.6.1} \]

\[ R_n = f(x_1)\Delta x + f(x_2)\Delta x + \cdots + f(x_n)\Delta x = \sum_{i=1}^{n} f(x_i)\Delta x, \tag{5.6.2} \]

\[ M_n = f(\overline{x}_1)\Delta x + f(\overline{x}_2)\Delta x + \cdots + f(\overline{x}_n)\Delta x = \sum_{i=1}^{n} f(\overline{x}_i)\Delta x, \tag{5.6.3} \]

where \(x_0 = a\), \(x_i = a + i\Delta x\), \(x_n = b\), and \(\Delta x = \frac{b-a}{n}\). For the middle sum, note that \(\overline{x}_i = (x_{i-1} + x_i)/2\).

Further, recall that a Riemann sum is essentially a sum of (possibly signed) areas of rectangles, and that the value of \(n\) determines the number of rectangles, while our choice of left endpoints, right endpoints, or midpoints determines how we use the given function to find the heights of the respective rectangles we choose to use. Visually, we can see the similarities and differences among these three options in Figure 5.14, where we consider the function \(f(x) = \frac{1}{20}(x - 4)^3 + 7\) on the interval [1, 8], and use 5 rectangles for each of the Riemann sums.

Figure 5.14: Left, right, and middle Riemann sums for y = f (x) on [1, 8] with 5 subintervals.
While it is a good exercise to compute a few Riemann sums by hand, just to ensure that we understand how they work and
how varying the function, the number of subintervals, and the choice of endpoints or midpoints affects the result, it is of
course the case that using computing technology is the best way to determine Ln, Rn, and Mn going forward. Any
computer algebra system will offer this capability; as we saw in Preview Activity 4.3, a straightforward option that happens
to also be freely available online is the applet⁸ at http://gvsu.edu/s/a9. Note that we can adjust the formula for \(f(x)\), the
window of x- and y-values of interest, the number of subintervals, and the method. See Preview Activity 4.3 for any needed
reminders on how the applet works. In what follows in this section we explore several different alternatives, including left,
right, and middle Riemann sums, for estimating definite integrals. One of our main goals in the upcoming section is to
develop formulas that enable us to estimate definite integrals accurately without having to use exceptionally large numbers
of rectangles.
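For readers who prefer a few lines of code to an applet, the formulas for \(L_n\), \(R_n\), and \(M_n\) translate directly into a short function. This is a minimal sketch, not the applet's implementation; the function name `riemann_sums` and its return format are assumptions, and the example uses the function and interval from Figure 5.14.

```python
def riemann_sums(f, a, b, n):
    """Return (L_n, R_n, M_n) for f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]          # x_0, x_1, ..., x_n
    left = sum(f(x) for x in xs[:-1]) * dx           # heights at left endpoints
    right = sum(f(x) for x in xs[1:]) * dx           # heights at right endpoints
    middle = sum(f((xs[i] + xs[i + 1]) / 2) for i in range(n)) * dx  # midpoints
    return left, right, middle

def f(x):
    """The function pictured in Figure 5.14."""
    return (x - 4) ** 3 / 20 + 7

L5, R5, M5 = riemann_sums(f, 1, 8, 5)
print(f"L_5 = {L5:.4f}   R_5 = {R5:.4f}   M_5 = {M5:.4f}")
```

Because this particular \(f\) is increasing on [1, 8], the left sum underestimates and the right sum overestimates the exact value of the integral, with the middle sum falling between them.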

Preview Activity 5.6.1:

As we begin to investigate ways to approximate definite integrals, it will be insightful to compare results to integrals
whose exact values we know. To that end, the following sequence of questions centers on \(\int_0^3 x^2\,dx\).

a. Use the applet at http://gvsu.edu/s/a9 with the function \(f(x) = x^2\) on the window of x values from 0 to 3 to compute \(L_3\), the left Riemann sum with three subintervals.
b. Likewise, use the applet to compute \(R_3\) and \(M_3\), the right and middle Riemann sums with three subintervals, respectively.
c. Use the Fundamental Theorem of Calculus to compute the exact value of \(I = \int_0^3 x^2\,dx\).
d. We define the error in an approximation of a definite integral to be the difference between the integral’s exact value and the approximation’s value. What is the error that results from using \(L_3\)? From \(R_3\)? From \(M_3\)?
e. In what follows in this section, we will learn a new approach to estimating the value of a definite integral known as the Trapezoid Rule. The basic idea is to use trapezoids, rather than rectangles, to estimate the area under a curve. What is the formula for the area of a trapezoid with bases of length \(b_1\) and \(b_2\) and height \(h\)?
f. Working by hand, estimate the area under \(f(x) = x^2\) on [0, 3] using three subintervals and three corresponding trapezoids. What is the error in this approximation? How does it compare to the errors you calculated in (d)?

⁸ Marc Renault, Shippensburg University

The Trapezoid Rule


Throughout our work to date with developing and estimating definite integrals, we have used the simplest possible
quadrilaterals (that is, rectangles) to subdivide regions with complicated shapes. It is natural, however, to wonder if other
familiar shapes might serve us even better. In particular, our goal is to be able to accurately estimate \(\int_a^b f(x)\,dx\) without having to use extremely large values of \(n\) in Riemann sums. To this end, we consider an alternative to \(L_n\), \(R_n\), and \(M_n\), known as the Trapezoid Rule. The fundamental idea is simple: rather than using a rectangle to estimate the (signed) area bounded by \(y = f(x)\) on a small interval, we use a trapezoid. For example, in Figure 5.15, we estimate the area under the pictured curve using three subintervals and the trapezoids that result from connecting the corresponding points on the curve with straight lines. The biggest difference between the Trapezoid Rule and a left, right, or middle Riemann sum is that on each subinterval, the Trapezoid Rule uses two function values, rather than one, to estimate the (signed) area bounded by the curve. For instance, to compute \(D_1\), the area of the trapezoid generated by the curve \(y = f(x)\) in Figure 5.15 on \([x_0, x_1]\), we observe that the left base of this trapezoid has length \(f(x_0)\), while the right base has length \(f(x_1)\). In addition, the height of this trapezoid is \(x_1 - x_0 = \Delta x = \frac{b-a}{3}\). Since the area of a trapezoid is the average of the bases times the height, we have \(D_1 = \frac{1}{2}(f(x_0) + f(x_1))\cdot\Delta x\). Using similar computations for \(D_2\) and \(D_3\), we find that \(T_3\), the trapezoidal approximation to \(\int_a^b f(x)\,dx\), is given by

\[ T_3 = D_1 + D_2 + D_3 = \frac{1}{2}(f(x_0) + f(x_1))\Delta x + \frac{1}{2}(f(x_1) + f(x_2))\Delta x + \frac{1}{2}(f(x_2) + f(x_3))\Delta x. \]

Figure 5.15: Estimating $\int_a^b f(x) \, dx$ using three subintervals and trapezoids, rather than rectangles, where $a = x_0$
and $b = x_3$.
Because both left and right endpoints are being used, we recognize within the trapezoidal approximation the use of both left
and right Riemann sums. In particular, rearranging the expression for $T_3$ by removing a factor of $\frac{1}{2}$, grouping the
left endpoint evaluations of $f$, and grouping the right endpoint evaluations of $f$, we see that
$T_3 = \frac{1}{2}\left[(f(x_0)\Delta x + f(x_1)\Delta x + f(x_2)\Delta x) + (f(x_1)\Delta x + f(x_2)\Delta x + f(x_3)\Delta x)\right].$ (5.14)

At this point, we observe that two familiar sums have arisen. Since the left Riemann sum is
$L_3 = f(x_0)\Delta x + f(x_1)\Delta x + f(x_2)\Delta x$, and the right Riemann sum is
$R_3 = f(x_1)\Delta x + f(x_2)\Delta x + f(x_3)\Delta x$, substituting $L_3$ and $R_3$ for the corresponding
expressions in Equation 5.14, it follows that $T_3 = \frac{1}{2}[L_3 + R_3]$. We have thus seen the main ideas behind a very
important result: using trapezoids to estimate the (signed) area bounded by a curve is the same as averaging the estimates
generated by using left and right endpoints.

(The Trapezoid Rule) The trapezoidal approximation, $T_n$, of the definite integral $\int_a^b f(x) \, dx$ using $n$
subintervals is given by the rule
$T_n = \frac{1}{2}(f(x_0) + f(x_1))\Delta x + \frac{1}{2}(f(x_1) + f(x_2))\Delta x + \cdots + \frac{1}{2}(f(x_{n-1}) + f(x_n))\Delta x = \sum_{i=0}^{n-1} \frac{1}{2}(f(x_i) + f(x_{i+1}))\Delta x.$ (5.6.5)

Moreover,
$T_n = \frac{1}{2}[L_n + R_n].$ (5.6.6)
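
The identity $T_n = \frac{1}{2}[L_n + R_n]$ can be checked numerically. The following Python sketch is only an illustration and is not part of the original text: the helper names `left_sum`, `right_sum`, and `trapezoid_sum` are assumptions chosen here, and the test integrand $e^{-x^2}$ on $[0, 1]$ with $n = 4$ is an arbitrary choice.

```python
import math

def left_sum(f, a, b, n):
    """L_n: rectangles whose heights come from left endpoints."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(n)) * dx

def right_sum(f, a, b, n):
    """R_n: rectangles whose heights come from right endpoints."""
    dx = (b - a) / n
    return sum(f(a + (i + 1) * dx) for i in range(n)) * dx

def trapezoid_sum(f, a, b, n):
    """T_n: sum of trapezoid areas, (average of the two bases) * dx."""
    dx = (b - a) / n
    return sum(0.5 * (f(a + i * dx) + f(a + (i + 1) * dx)) * dx
               for i in range(n))

# Illustrative choice of integrand and interval (an assumption, not from the text).
f = lambda x: math.exp(-x * x)
a, b, n = 0.0, 1.0, 4

T = trapezoid_sum(f, a, b, n)
avg = 0.5 * (left_sum(f, a, b, n) + right_sum(f, a, b, n))
print(T, avg)  # the two values agree up to floating-point roundoff
```

Computing the trapezoid areas directly and averaging $L_n$ with $R_n$ are two routes to the same number, which is why the second route is usually preferred once $L_n$ and $R_n$ are already known.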

Exercise 5.6.2:

In this activity, we explore the relationships among the errors generated by left, right, midpoint, and trapezoid
approximations to the definite integral $\int_1^2 \frac{1}{x^2} \, dx$.
a. Use the First FTC to evaluate $\int_1^2 \frac{1}{x^2} \, dx$ exactly.
b. Use appropriate computing technology to compute the following approximations for $\int_1^2 \frac{1}{x^2} \, dx$:
$T_4$, $M_4$, $T_8$, and $M_8$.
c. Let the error of an approximation be the difference between the exact value of the definite integral and the resulting
approximation. For instance, if we let $E_{T,4}$ represent the error that results from using the trapezoid rule with 4
subintervals to estimate the integral, we have $E_{T,4} = \int_1^2 \frac{1}{x^2} \, dx - T_4$. Similarly, we compute the
error of the midpoint rule approximation with 8 subintervals by the formula $E_{M,8} = \int_1^2 \frac{1}{x^2} \, dx - M_8$.
Based on your work in (a) and (b) above, compute $E_{T,4}$, $E_{T,8}$, $E_{M,4}$, $E_{M,8}$.
d. Which rule consistently over-estimates the exact value of the definite integral? Which rule consistently
under-estimates the definite integral?
e. What behavior(s) of the function $f(x) = \frac{1}{x^2}$ lead to your observations in (d)?

Comparing the Midpoint and Trapezoid Rules


We know from the definition of the definite integral of a continuous function $f$ that if we let $n$ be large enough, we can
make the value of any of the approximations $L_n$, $R_n$, and $M_n$ as close as we’d like (in theory) to the exact value of
$\int_a^b f(x) \, dx$. Thus, it may be natural to wonder why we ever use any rule other than $L_n$ or $R_n$ (with a
sufficiently large $n$ value) to estimate a definite integral. One of the primary reasons is that as $n \to \infty$,
$\Delta x = \frac{b-a}{n} \to 0$, and thus in a Riemann sum calculation with a large $n$ value, we end up multiplying by a
number that is very close to zero. Doing so often generates roundoff error, as representing numbers close to zero accurately
is a persistent challenge for computers. Hence, we are exploring ways by which we can estimate definite integrals to high
levels of precision, but without having to use extremely large values of $n$. Paying close attention to patterns in errors,
such as those observed in Activity 5.15, is one way to begin to see some alternate approaches.

To begin, we make a comparison of the errors in the Midpoint and Trapezoid rules from two different perspectives. First,
consider a function of consistent concavity on a given interval, and picture approximating the area bounded on that interval
by both the Midpoint and Trapezoid rules using a single subinterval. As seen in Figure 5.16, it is evident that whenever the
function is concave up on an interval, the Trapezoid Rule with one subinterval, $T_1$, will overestimate the exact value of
the definite integral on that interval. Moreover, from a careful analysis of the line that bounds the top of the rectangle for
the Midpoint Rule (shown in magenta), we see that if we rotate this line segment until it is tangent to the curve at the point
on the curve used in the Midpoint Rule (as shown at right in Figure 5.16), the resulting trapezoid has the same area as $M_1$,
and this value is less than the exact value of the definite integral. Hence, when the function is concave up on the interval,
$M_1$ underestimates the integral’s true value. These observations extend easily to the situation where the function’s
concavity remains consistent but we use higher values of $n$ in the Midpoint and Trapezoid Rules. Hence, whenever $f$ is
concave up on $[a, b]$, $T_n$ will overestimate the value of $\int_a^b f(x) \, dx$, while $M_n$ will underestimate
$\int_a^b f(x) \, dx$. The reverse observations are true in the situation where $f$ is concave down.

Figure 5.16: Estimating $\int_a^b f(x) \, dx$ using a single subinterval: at left, the trapezoid rule; in the middle, the
midpoint rule; at right, a modified way to think about the midpoint rule.
Next, we compare the size of the errors between $M_n$ and $T_n$. Again, we focus on $M_1$ and $T_1$ on an interval where the
concavity of $f$ is consistent. In Figure 5.17, where the error of the Trapezoid Rule is shaded in red, while the error of the
Midpoint Rule is shaded lighter red, it is visually apparent that the error in the Trapezoid Rule is more significant.

Figure 5.17: Comparing the error in estimating $\int_a^b f(x) \, dx$ using a single subinterval: in red, the error from the
Trapezoid rule; in light red, the error from the Midpoint rule.
To see how much more significant, let’s consider two examples and some particular computations. If we let $f(x) = 1 - x^2$
and consider $\int_0^1 f(x) \, dx$, we know by the First FTC that the exact value of the integral is
$\int_0^1 (1 - x^2) \, dx = \left[ x - \frac{x^3}{3} \right]_0^1 = \frac{2}{3}$. Using appropriate technology to compute
$M_4$, $M_8$, $T_4$, and $T_8$, as well as the corresponding errors $E_{M,4}$, $E_{M,8}$, $E_{T,4}$, and $E_{T,8}$, as we did
in Activity 5.15, we find the results summarized in Table 5.1. Note that in the table, we also include the approximations and
their errors for the example $\int_1^2 \frac{1}{x^2} \, dx$ from Activity 5.15.

Table 5.1: Calculations of $T_4$, $M_4$, $T_8$, and $M_8$, along with corresponding errors, for the definite integrals
$\int_0^1 (1 - x^2) \, dx$ and $\int_1^2 \frac{1}{x^2} \, dx$.
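
Calculations of the kind summarized in Table 5.1 can be reproduced with a short script. The Python sketch below is an illustration rather than the authors' own computation; the function names `midpoint_sum`, `trapezoid_sum`, and `error_table` are assumptions introduced here. It computes the errors $E_{T,n}$ and $E_{M,n}$ (exact value minus approximation) for the two integrals with $n = 4$ and $n = 8$, along with the ratio $E_{M,n} / E_{T,n}$.

```python
def midpoint_sum(f, a, b, n):
    """M_n: rectangles whose heights come from subinterval midpoints."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def trapezoid_sum(f, a, b, n):
    """T_n: average of the two endpoint values on each subinterval, times dx."""
    dx = (b - a) / n
    return sum(0.5 * (f(a + i * dx) + f(a + (i + 1) * dx)) for i in range(n)) * dx

# (label, integrand, a, b, exact value from the First FTC)
cases = [
    ("1 - x^2 on [0, 1]", lambda x: 1 - x * x, 0.0, 1.0, 2 / 3),
    ("1/x^2 on [1, 2]", lambda x: 1 / (x * x), 1.0, 2.0, 0.5),
]

def error_table(cases, ns=(4, 8)):
    """Return rows (label, n, E_T, E_M) with error = exact - approximation."""
    rows = []
    for label, f, a, b, exact in cases:
        for n in ns:
            e_t = exact - trapezoid_sum(f, a, b, n)
            e_m = exact - midpoint_sum(f, a, b, n)
            rows.append((label, n, e_t, e_m))
    return rows

for label, n, e_t, e_m in error_table(cases):
    print(f"{label:>18}  n={n}  E_T={e_t:+.10f}  E_M={e_m:+.10f}  "
          f"E_M/E_T={e_m / e_t:+.4f}")
```

In each row the two errors have opposite signs, and the printed ratio sits close to $-\frac{1}{2}$, which is the pattern discussed in the surrounding text.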
Recall that for a given function $f$ and interval $[a, b]$, $E_{T,4} = \int_a^b f(x) \, dx - T_4$ calculates the difference
between the exact value of the definite integral and the approximation generated by the Trapezoid Rule with $n = 4$. If we look
at not only $E_{T,4}$, but also the other errors generated by using $T_n$ and $M_n$ with $n = 4$ and $n = 8$ in the two
examples noted in Table 5.1, we see an evident pattern. Not only is the sign of the error (which measures whether the rule
generates an over- or under-estimate) tied to the rule used and the function’s concavity, but the magnitude of the errors
generated by $T_n$ and $M_n$ seems closely connected. In particular, the errors generated by the Midpoint Rule seem to be about
half the size of those generated by the Trapezoid Rule. That is, we can observe in both examples that
$E_{M,4} \approx -\frac{1}{2} E_{T,4}$ and $E_{M,8} \approx -\frac{1}{2} E_{T,8}$, which demonstrates a property of the
Midpoint and Trapezoid Rules that turns out to hold in general: for a function of consistent concavity, the error in the
Midpoint Rule has the opposite sign and approximately half the magnitude of the error of the Trapezoid Rule. Said symbolically,
$E_{M,n} \approx -\frac{1}{2} E_{T,n}$. This important relationship suggests a way to combine the Midpoint and Trapezoid Rules
to create an even more accurate approximation to a definite integral.

Simpson’s Rule

When we first developed the Trapezoid Rule, we observed that it can equivalently be viewed as resulting from the average of the
Left and Right Riemann sums: $T_n = \frac{1}{2}(L_n + R_n)$. Whenever a function is always increasing or always decreasing on
the interval $[a, b]$, one of $L_n$ and $R_n$ will over-estimate the true value of $\int_a^b f(x) \, dx$, while the other will
under-estimate the integral. Said differently, the errors found in $L_n$ and $R_n$ will have opposite signs; thus, averaging
$L_n$ and $R_n$ eliminates a considerable amount of the error present in the respective approximations. In a similar way, it
makes sense to think about averaging $M_n$ and $T_n$ in order to generate a still more accurate approximation. At the same
time, we’ve just observed that $M_n$ is typically about twice as accurate as $T_n$. Thus, we instead choose to use the
weighted average
$S_{2n} = \frac{2M_n + T_n}{3}.$ (5.15)

The rule for $S_{2n}$ given by Equation 5.15 is usually known as Simpson’s Rule.[9] Note that we use “$S_{2n}$” rather than
“$S_n$” since the $n$ midpoints the Midpoint Rule uses are different from the $n + 1$ endpoints the Trapezoid Rule uses, and
thus Simpson’s Rule evaluates the function at $2n + 1$ points that divide $[a, b]$ into $2n$ pieces. We build upon the results
in Table 5.1 to see the approximations generated by Simpson’s Rule. In particular, in Table 5.2, we include all of the results
in Table 5.1, but include additional results for $S_8 = \frac{2M_4 + T_4}{3}$ and $S_{16} = \frac{2M_8 + T_8}{3}$.

[9] Thomas Simpson was an 18th century mathematician; his idea was to extend the Trapezoid rule, but rather than using straight
lines to build trapezoids, to use quadratic functions to build regions whose area was bounded by parabolas (whose areas he
could find exactly). Simpson’s Rule is often developed from the more sophisticated perspective of using interpolation by
quadratic functions.
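
The weights 2 and 1 in the rule $S_{2n} = \frac{2M_n + T_n}{3}$ can be motivated by a short calculation. The following is a sketch under the assumption that the observed relationship $E_{M,n} \approx -\frac{1}{2} E_{T,n}$ holds; the symbols $I$ (the exact value of the integral) and $E_{S,2n} = I - S_{2n}$ are notation introduced here by analogy with $E_{T,n}$ and $E_{M,n}$.

```latex
\begin{align*}
E_{S,2n} &= I - S_{2n} = I - \frac{2M_n + T_n}{3} \\
         &= \frac{2(I - M_n) + (I - T_n)}{3} = \frac{2E_{M,n} + E_{T,n}}{3} \\
         &\approx \frac{2\left(-\tfrac{1}{2}E_{T,n}\right) + E_{T,n}}{3} = 0.
\end{align*}
```

In words, weighting the Midpoint Rule twice as heavily as the Trapezoid Rule makes the leading parts of the two errors cancel, so whatever error remains in $S_{2n}$ comes only from how far $E_{M,n}$ is from exactly $-\frac{1}{2} E_{T,n}$.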

Table 5.2: Table 5.1 updated to include S8, S16, and the corresponding errors.
The results seen in Table 5.2 are striking. If we consider the $S_{16}$ approximation of $\int_1^2 \frac{1}{x^2} \, dx$, the
error is only $E_{S,16} = 0.0000019434$. By contrast, $L_8 = 0.5491458502$, so the error of that estimate is
$E_{L,8} = -0.0491458502$. Moreover, we observe that generating the approximations for Simpson’s Rule is almost no additional
work: once we have $L_n$, $R_n$, and $M_n$ for a given value of $n$, it is a simple exercise to generate $T_n$, and from there
to calculate $S_{2n}$. Finally, note that the error in the Simpson’s Rule approximations of $\int_0^1 (1 - x^2) \, dx$ is
zero![10]

These rules are not only useful for approximating definite integrals such as $\int_0^1 e^{-x^2} \, dx$, for which we cannot
find an elementary antiderivative of $e^{-x^2}$, but also for approximating definite integrals in the setting where we are
given a function through a table of data.
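
As a concrete illustration of how little extra work Simpson's Rule requires, the Python sketch below builds $S_{2n}$ directly from $M_n$ and $T_n$. It is a minimal example and not the authors' code; the function names `midpoint_sum`, `trapezoid_sum`, and `simpson` are assumptions introduced here.

```python
import math

def midpoint_sum(f, a, b, n):
    """M_n: rectangles whose heights come from subinterval midpoints."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def trapezoid_sum(f, a, b, n):
    """T_n: average of the two endpoint values on each subinterval, times dx."""
    dx = (b - a) / n
    return sum(0.5 * (f(a + i * dx) + f(a + (i + 1) * dx)) for i in range(n)) * dx

def simpson(f, a, b, n):
    """S_{2n} = (2 M_n + T_n) / 3, built from M_n and T_n on n subintervals."""
    return (2 * midpoint_sum(f, a, b, n) + trapezoid_sum(f, a, b, n)) / 3

# S_16 for the integral of 1/x^2 on [1, 2], whose exact value is 1/2.
s16 = simpson(lambda x: 1 / (x * x), 1.0, 2.0, 8)
print("error in S_16:", 0.5 - s16)

# S_16 for the integral of e^(-x^2) on [0, 1], which has no elementary antiderivative.
g16 = simpson(lambda x: math.exp(-x * x), 0.0, 1.0, 8)
print("S_16 estimate:", g16)
```

Note that `simpson(f, a, b, n)` returns $S_{2n}$, not $S_n$: the argument `n` counts the subintervals used by $M_n$ and $T_n$, matching the subscript convention in the text.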

Exercise 5.6.3:

A car traveling along a straight road is braking and its velocity is measured at several different points in time, as given
in the following table. Assume that v is continuous, always decreasing, and always decreasing at a decreasing rate, as is
suggested by the data.

a. Plot the given data on the set of axes provided in Figure 5.18 with time on the horizontal axis and the velocity on
the vertical axis.
b. What definite integral will give you the exact distance the car traveled on $[0, 1.8]$?
c. Estimate the total distance traveled on $[0, 1.8]$ by computing $L_3$, $R_3$, and $T_3$. Which of these under-estimates the
true distance traveled?
d. Estimate the total distance traveled on $[0, 1.8]$ by computing $M_3$. Is this an over- or under-estimate? Why?
e. Using your results from (c) and (d), improve your estimate further by using Simpson’s Rule.
f. What is your best estimate of the average velocity of the car on $[0, 1.8]$? Why? What are the units on this quantity?
[10] Similar to how the Midpoint and Trapezoid approximations are exact for linear functions, Simpson’s Rule approximations
are exact for quadratic and cubic functions. See additional discussion on this issue later in the section and in the exercises.

Figure 5.18: Axes for plotting the data in Activity 5.16.
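
When a function is known only through a table of equally spaced data, the same rules apply, with the tabulated values playing the role of $f(x_i)$. The Python sketch below illustrates one way to organize the computation; the function name `rules_from_samples` and the sample values are hypothetical and are not the data from the activity above. It assumes an odd number $2n + 1$ of equally spaced samples, treating every other sample as an endpoint of one of $n$ subintervals and the samples in between as midpoints.

```python
def rules_from_samples(ts, ys):
    """Return (L_n, R_n, T_n, M_n, S_2n) from 2n + 1 equally spaced samples."""
    n = (len(ys) - 1) // 2          # number of subintervals
    dx = (ts[-1] - ts[0]) / n       # width of each subinterval
    ends = ys[0::2]                 # n + 1 values at subinterval endpoints
    mids = ys[1::2]                 # n values at subinterval midpoints
    left = dx * sum(ends[:-1])
    right = dx * sum(ends[1:])
    trap = (left + right) / 2
    mid = dx * sum(mids)
    simp = (2 * mid + trap) / 3
    return left, right, trap, mid, simp

# Hypothetical, made-up data: a decreasing quantity sampled every 0.5 units.
ts = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [30, 28, 25, 21, 16, 10, 3]

L3, R3, T3, M3, S6 = rules_from_samples(ts, ys)
print(L3, R3, T3, M3, S6)
```

Because these made-up values are decreasing, $L_3$ is the largest estimate and $R_3$ the smallest; how $M_3$ and $T_3$ compare depends on the concavity suggested by the data.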

Overall observations regarding $L_n$, $R_n$, $T_n$, $M_n$, and $S_{2n}$

As we conclude our discussion of numerical approximation of definite integrals, it is important to summarize general trends in
how the various rules over- or under-estimate the true value of a definite integral, and by how much. To revisit some past
observations and see some new ones, we consider the following activity.

Activity 5.6.4:

Consider the functions $f(x) = 2 - x^2$, $g(x) = 2 - x^3$, and $h(x) = 2 - x^4$, all on the interval $[0, 1]$. For each of the
questions that require a numerical answer in what follows, write your answer exactly in fraction form.
a. On the three sets of axes provided in Figure 5.19, sketch a graph of each function on the interval $[0, 1]$, and
compute $L_1$ and $R_1$ for each. What do you observe?
b. Compute $M_1$ for each function to approximate $\int_0^1 f(x) \, dx$, $\int_0^1 g(x) \, dx$, and $\int_0^1 h(x) \, dx$,
respectively.
c. Compute $T_1$ for each of the three functions, and hence compute $S_2$ for each of the three functions.
d. Evaluate each of the integrals $\int_0^1 f(x) \, dx$, $\int_0^1 g(x) \, dx$, and $\int_0^1 h(x) \, dx$ exactly using the
First FTC.
e. For each of the three functions $f$, $g$, and $h$, compare the results of $L_1$, $R_1$, $M_1$, $T_1$, and $S_2$ to the true
value of the corresponding definite integral. What patterns do you observe?

Figure 5.19: Axes for plotting the functions in Activity 5.17.

The results seen in the examples in Activity 5.17 generalize nicely. For instance, for any function $f$ that is decreasing on
$[a, b]$, $L_n$ will over-estimate the exact value of $\int_a^b f(x) \, dx$, and for any function $f$ that is concave down on
$[a, b]$, $M_n$ will over-estimate the exact value of the integral. An excellent exercise is to write a collection of scenarios
of possible function behavior, and then categorize whether each of $L_n$, $R_n$, $T_n$, and $M_n$ is an over- or
under-estimate.

Finally, we make two important notes about Simpson’s Rule. When T. Simpson first developed this rule, his idea was to replace
the function $f$ on a given interval with a quadratic function that shared three values with the function $f$. In so doing, he
guaranteed that this new approximation rule would be exact for the definite integral of any quadratic polynomial. In one of the
pleasant surprises of numerical analysis, it turns out that even though it was designed to be exact for quadratic polynomials,
Simpson’s Rule is exact for any cubic polynomial: that is, if we are interested in an integral such as
$\int_2^5 (5x^3 - 2x^2 + 7x - 4) \, dx$, $S_{2n}$ will always be exact, regardless of the value of $n$. This is just one more
piece of evidence that shows how effective Simpson’s Rule is as an approximation tool for estimating definite integrals.[11]
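
The exactness of Simpson's Rule for cubic polynomials can be confirmed without any roundoff by working in exact rational arithmetic. The Python sketch below is an illustration only; the function name `simpson_exact` is an assumption introduced here, and the exact value $\frac{2979}{4}$ comes from evaluating the antiderivative $\frac{5}{4}x^4 - \frac{2}{3}x^3 + \frac{7}{2}x^2 - 4x$ at 5 and at 2.

```python
from fractions import Fraction

def simpson_exact(f, a, b, n):
    """S_{2n} = (2 M_n + T_n) / 3, computed with exact rational arithmetic."""
    a, b = Fraction(a), Fraction(b)
    dx = (b - a) / n
    # M_n: function values at the midpoints a + (2i + 1) * dx / 2
    mid = sum(f(a + (2 * i + 1) * dx / 2) for i in range(n)) * dx
    # T_n: average of the two endpoint values on each subinterval
    trap = sum((f(a + i * dx) + f(a + (i + 1) * dx)) / 2 for i in range(n)) * dx
    return (2 * mid + trap) / 3

# The cubic integrand from the text, on [2, 5].
p = lambda x: 5 * x**3 - 2 * x**2 + 7 * x - 4
exact = Fraction(2979, 4)  # value of the definite integral via the First FTC

for n in (1, 2, 5):
    print(n, simpson_exact(p, 2, 5, n), simpson_exact(p, 2, 5, n) == exact)
```

Using `Fraction` rather than floating-point numbers means the comparison with the exact value is a true equality test, not a check up to a tolerance.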

Summary
In this section, we encountered the following important ideas:

• For a definite integral such as $\int_0^1 e^{-x^2} \, dx$ when we cannot use the First Fundamental Theorem of Calculus because
the integrand lacks an elementary algebraic antiderivative, we can estimate the integral’s value by using a sequence of
Riemann sum approximations. Typically, we start by computing $L_n$, $R_n$, and $M_n$ for one or more chosen values of $n$.
• The Trapezoid Rule, which estimates $\int_a^b f(x) \, dx$ by using trapezoids, rather than rectangles, can also be viewed as
the average of Left and Right Riemann sums. That is, $T_n = \frac{1}{2}(L_n + R_n)$.
• The Midpoint Rule is typically twice as accurate as the Trapezoid Rule, and the signs of the respective errors of these
rules are opposites. Hence, by taking the weighted average $S_{2n} = \frac{2M_n + T_n}{3}$, we can build a much more accurate
approximation to $\int_a^b f(x) \, dx$ by using approximations we have already computed. The rule for $S_{2n}$ is known as
Simpson’s Rule, which can also be developed by approximating a given continuous function with pieces of quadratic
polynomials.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

5.E: Finding Antiderivatives and Evaluating Integrals (Exercises)
5.1: Constructing Accurate Graphs of Antiderivatives
1. A moving particle has its velocity given by the quadratic function $v$ pictured in Figure 5.6. In addition, it is given that
$A_1 = \frac{7}{6}$ and $A_2 = \frac{8}{3}$, as well as that for the corresponding position function $s$, $s(0) = 0.5$.
(a) Use the given information to determine s(1), s(3), s(5), and s(6).
(b) On what interval(s) is s increasing? On what interval(s) is s decreasing?
(c) On what interval(s) is s concave up? On what interval(s) is s concave down?

Figure 5.6: At left, the given graph of v. At right, axes for plotting s.
(d) Sketch an accurate, labeled graph of s on the axes at right in Figure 5.6.
(e) Note that $v(t) = -2 + \frac{1}{2}(t - 3)^2$. Find a formula for $s$.
2. A person exercising on a treadmill experiences different levels of resistance and thus burns calories at different rates,
depending on the treadmill’s setting. In a particular workout, the rate at which a person is burning calories is given by the
piecewise constant function c pictured in Figure 5.7. Note that the units on c are “calories per minute.”

Figure 5.7: At left, the given graph of c. At right, axes for plotting C.
(a) Let C be an antiderivative of c. What does the function C measure? What are its units?
(b) Assume that C(0) = 0. Determine the exact value of C(t) at the values t = 5, 10, 15, 20, 25, 30.
(c) Sketch an accurate graph of C on the axes provided at right in Figure 5.7. Be certain to label the scale on the vertical
axis.
(d) Determine a formula for C that does not involve an integral and is valid for 5 ≤ t ≤ 10.
3. Consider the piecewise linear function $f$ given in Figure 5.8. Let the functions $A$, $B$, and $C$ be defined by the rules
$A(x) = \int_{-1}^{x} f(t) \, dt$, $B(x) = \int_{0}^{x} f(t) \, dt$, and $C(x) = \int_{1}^{x} f(t) \, dt$.

Matthew Boelkins, David Austin & Steven


5.E.1 12/1/2021 https://math.libretexts.org/@go/page/5415
Schlicker
Figure 5.8: At left, the given graph of f . At right, axes for plotting A, B, and C.
(a) For the values x = −1, 0, 1, . . ., 6, make a table that lists corresponding values of A(x), B(x), and C(x).
(b) On the axes provided in Figure 5.8, sketch the graphs of A, B, and C.
(c) How are the graphs of A, B, and C related?
(d) How would you best describe the relationship between the function A and the function f?

5.2: The Second Fundamental Theorem of Calculus


1. Let $g$ be the function pictured at left in Figure 5.13, and let $F$ be defined by $F(x) = \int_2^x g(t) \, dt$. Assume that
the shaded areas have values $A_1 = 4.29$, $A_2 = 12.75$, $A_3 = 0.36$, and $A_4 = 1.79$. Assume further that the portion of
$A_2$ that lies between $x = 0.5$ and $x = 2$ is 6.06. Sketch a carefully labeled graph of $F$ on the axes provided, and
include a written analysis of how you know where $F$ is zero, increasing, decreasing, concave up (CCU), and concave down (CCD).

Figure 5.13: At left, the graph of g. At right, axes for plotting F.

2. The tide removes sand from the beach at a small ocean park at a rate modeled by the function
$R(t) = 2 + 5 \sin\left(\frac{4\pi t}{25}\right)$. A pumping station adds sand to the beach at a rate modeled by the function
$S(t) = \frac{15t}{1 + 3t}$. Both $R(t)$ and $S(t)$ are measured in cubic yards of sand per hour, $t$ is measured in hours, and
the valid times are $0 \le t \le 6$. At time $t = 0$, the beach holds 2500 cubic yards of sand.


(a) What definite integral measures how much sand the tide will remove during the time period 0 ≤ t ≤ 6? Why?
(b) Write an expression for Y(x), the total number of cubic yards of sand on the beach at time x. Carefully explain your
thinking and reasoning.
(c) At what instantaneous rate is the total number of cubic yards of sand on the beach at time t = 4 changing?
(d) Over the time interval 0 ≤ t ≤ 6, at what time t is the amount of sand on the beach least? What is this minimum value?
Explain and justify your answers fully.
3. When an aircraft attempts to climb as rapidly as possible, its climb rate (in feet per minute) decreases as altitude
increases, because the air is less dense at higher altitudes. Given below is a table showing performance data for a certain
single engine aircraft, giving its climb rate at various altitudes, where c(h) denotes the climb rate of the airplane at an
altitude h.

Let a new function m, that also depends on h, (say y = m(h)) measure the number of minutes required for a plane at
altitude h to climb the next foot of altitude.
a. Determine a similar table of values for m(h) and explain how it is related to the table above. Be sure to discuss the units
on m.
b. Give a careful interpretation of a function whose derivative is m(h). Describe what the input is and what the output is.
Also, explain in plain English what the function tells us.
c. Determine a definite integral whose value tells us exactly the number of minutes required for the airplane to ascend to
10,000 feet of altitude. Clearly explain why the value of this integral has the required meaning.
d. Determine a formula for a function M(h) whose value tells us the exact number of minutes required for the airplane to
ascend to h feet of altitude.
e. Estimate the values of M(6000) and M(10000) as accurately as you can. Include units on your results.

5.3: Integration by Substitution


Exercises
1. This problem centers on finding antiderivatives for the basic trigonometric functions other than $\sin(x)$ and $\cos(x)$.
(a) Consider the indefinite integral $\int \tan(x) \, dx$. By rewriting the integrand as $\tan(x) = \frac{\sin(x)}{\cos(x)}$
and identifying an appropriate function-derivative pair, make a $u$-substitution and hence evaluate $\int \tan(x) \, dx$.
(b) In a similar way, evaluate $\int \cot(x) \, dx$.
(c) Consider the indefinite integral $\int \frac{\sec^2(x) + \sec(x)\tan(x)}{\sec(x) + \tan(x)} \, dx$. Evaluate this integral
using the substitution $u = \sec(x) + \tan(x)$.
(d) Simplify the integrand in (c) by factoring the numerator. What is a far simpler way to write the integrand?
(e) Combine your work in (c) and (d) to determine $\int \sec(x) \, dx$.
(f) Using (c)-(e) as a guide, evaluate $\int \csc(x) \, dx$.
2. Consider the indefinite integral $\int x\sqrt{x - 1} \, dx$.
(a) At first glance, this integrand may not seem suited to substitution due to the presence of $x$ in separate locations in the
integrand. Nonetheless, using the composite function $\sqrt{x - 1}$ as a guide, let $u = x - 1$. Determine expressions for both
$x$ and $dx$ in terms of $u$.
(b) Convert the given integral in $x$ to a new integral in $u$.
(c) Evaluate the integral in (b) by noting that $\sqrt{u} = u^{1/2}$ and observing that it is now possible to rewrite the
integrand in $u$ by expanding through multiplication.
(d) Evaluate each of the integrals $\int x^2\sqrt{x - 1} \, dx$ and $\int x\sqrt{x^2 - 1} \, dx$. Write a paragraph to discuss
the similarities among the three indefinite integrals in this problem and the role of substitution and algebraic rearrangement
in each.
3. Consider the indefinite integral $\int \sin^3(x) \, dx$.
(a) Explain why the substitution $u = \sin(x)$ will not work to help evaluate the given integral.
(b) Recall the Fundamental Trigonometric Identity, which states that $\sin^2(x) + \cos^2(x) = 1$. By observing that
$\sin^3(x) = \sin(x) \cdot \sin^2(x)$, use the Fundamental Trigonometric Identity to rewrite the integrand as the product of
$\sin(x)$ with another function.
(c) Explain why the substitution $u = \cos(x)$ now provides a possible way to evaluate the integral in (b).
(d) Use your work in (a)-(c) to evaluate the indefinite integral $\int \sin^3(x) \, dx$.
(e) Use a similar approach to evaluate $\int \cos^3(x) \, dx$.
4. For the town of Mathland, MI, residential power consumption has shown certain trends over recent years. Based on data
reflecting average usage, engineers at the power company have modeled the town’s rate of energy consumption by the
function $r(t) = 4 + \sin(0.263t + 4.7) + \cos(0.526t + 9.4)$. Here, $t$ measures time in hours after midnight on a typical
weekday, and $r$ is the rate of consumption in megawatts[5] at time $t$. Units are critical throughout this problem.
(a) Sketch a carefully labeled graph of $r(t)$ on the interval $[0, 24]$ and explain its meaning. Why is this a reasonable model
of power consumption?
(b) Without calculating its value, explain the meaning of $\int_0^{24} r(t) \, dt$. Include appropriate units on your answer.
(c) Determine the exact amount of power Mathland consumes in a typical day.
(d) What is Mathland’s average rate of energy consumption in a given 24-hour period? What are the units on this quantity?
[5] The unit megawatt is itself a rate, which measures energy consumption per unit time. A megawatt-hour is the total
amount of energy that is equivalent to a constant stream of 1 megawatt of power being sustained for 1 hour.

5.4: Integration by Parts


1. Let $f(t) = te^{-2t}$ and $F(x) = \int_0^x f(t) \, dt$.
(a) Determine $F'(x)$.
(b) Use the First FTC to find a formula for $F$ that does not involve an integral.
(c) Is $F$ an increasing or decreasing function for $x > 0$? Why?
2. Consider the indefinite integral given by $\int e^{2x}\cos(e^x) \, dx$.
(a) Noting that $e^{2x} = e^x \cdot e^x$, use the substitution $z = e^x$ to determine a new, equivalent integral in the
variable $z$.
(b) Evaluate the integral you found in (a) using an appropriate technique.
(c) How is the problem of evaluating $\int e^{2x}\cos(e^{2x}) \, dx$ different from evaluating the integral in (a)? Do so.
(d) Evaluate each of the following integrals as well, keeping in mind the approach(es) used earlier in this problem:
• $\int e^{2x}\sin(e^x) \, dx$
• $\int e^{3x}\sin(e^{3x}) \, dx$
• $\int x e^{x^2}\cos(e^{x^2})\sin(e^{x^2}) \, dx$
3. For each of the following indefinite integrals, determine whether you would use $u$-substitution, integration by parts,
neither*, or both to evaluate the integral. In each case, write one sentence to explain your reasoning, and include a
statement of any substitutions used. (That is, if you decide in a problem to let $u = e^{3x}$, you should state that, as well
as that $du = 3e^{3x} \, dx$.) Finally, use your chosen approach to evaluate each integral. (* one of the following problems
does not have an elementary antiderivative and you are not expected to actually evaluate this integral; this will correspond
with a choice of “neither” among those given.)
(a) $\int x^2\cos(x^3) \, dx$
(b) $\int x^5\cos(x^3) \, dx$ (Hint: $x^5 = x^2 \cdot x^3$)
(c) $\int x\ln(x^2) \, dx$
(d) $\int \sin(x^4) \, dx$
(e) $\int x^3\sin(x^4) \, dx$
(f) $\int x^7\sin(x^4) \, dx$

5.5: Other Options for Finding Algebraic Antiderivatives


1. For each of the following integrals involving rational functions, (1) use a CAS to find the partial fraction decomposition
of the integrand; (2) evaluate the integral of the resulting function without the assistance of technology; (3) use a CAS to
evaluate the original integral to test and compare your result in (2).
(a) $\int \frac{x^3 + x + 1}{x^4 - 1} \, dx$

(b) $\int \frac{x^5 + x^2 + 3}{x^3 - 6x^2 + 11x - 6} \, dx$
(c) $\int \frac{x^2 - x - 1}{(x - 3)^3} \, dx$
2. For each of the following integrals involving radical functions, (1) use an appropriate $u$-substitution along with
Appendix A to evaluate the integral without the assistance of technology, and (2) use a CAS to evaluate the original
integral to test and compare your result in (1).
(a) $\int \frac{1}{x\sqrt{9x^2 + 25}} \, dx$
(b) $\int x\sqrt{1 + x^4} \, dx$
(c) $\int e^x\sqrt{4 + e^{2x}} \, dx$
(d) $\int \frac{\tan(x)}{\sqrt{9 - \cos^2(x)}} \, dx$
3. Consider the indefinite integral given by $\int \frac{\sqrt{x + \sqrt{1 + x^2}}}{x} \, dx$.
(a) Explain why $u$-substitution does not offer a way to simplify this integral by discussing at least two different options you
might try for $u$.
(b) Explain why integration by parts does not seem to be a reasonable way to proceed, either, by considering one option for
$u$ and $dv$.
(c) Is there any line in the integral table in Appendix A that is helpful for this integral?
(d) Evaluate the given integral using WolframAlpha. What do you observe?

5.6: Numerical Integration


1. Consider the definite integral $\int_0^1 x\tan(x) \, dx$.
(a) Explain why this integral cannot be evaluated exactly by using either $u$-substitution or by integrating by parts.
(b) Using 4 subintervals, compute $L_4$, $R_4$, $M_4$, $T_4$, and $S_4$.
(c) Which of the approximations in (b) is an over-estimate to the true value of $\int_0^1 x\tan(x) \, dx$? Which is an
under-estimate? How do you know?
2. For an unknown function $f(x)$, the following information is known.
• $f$ is continuous on $[3, 6]$;
• $f$ is either always increasing or always decreasing on $[3, 6]$;
• $f$ has the same concavity throughout the interval $[3, 6]$;
• As approximations to $\int_3^6 f(x) \, dx$, $L_4 = 7.23$, $R_4 = 6.75$, and $M_4 = 7.05$.
(a) Is $f$ increasing or decreasing on $[3, 6]$? What data tells you?
(b) Is $f$ concave up or concave down on $[3, 6]$? Why?
(c) Determine the best possible estimate you can for $\int_3^6 f(x) \, dx$, based on the given information.
[11] One reason that Simpson’s Rule is so effective is that $S_{2n}$ benefits from using $2n + 1$ points of data. Because it
combines $M_n$, which uses $n$ midpoints, and $T_n$, which uses the $n + 1$ endpoints of the chosen subintervals, $S_{2n}$
takes advantage of the maximum amount of information we have when we know function values at the endpoints and midpoints of
$n$ subintervals.
3. The rate at which water flows through Table Rock Dam on the White River in Branson, MO, is measured in thousands
of cubic feet per second (TCFS). As engineers open the floodgates, flow rates are recorded according to the following
chart.

(a) What definite integral measures the total volume of water to flow through the dam in the 60 second time period
provided by the table above?
(b) Use the given data to calculate $M_n$ for the largest possible value of $n$ to approximate the integral you stated in (a).
Do you think $M_n$ over- or underestimates the exact value of the integral? Why?
(c) Approximate the integral stated in (a) by calculating $S_n$ for the largest possible value of $n$, based on the given data.
(d) Compute $\frac{1}{60} S_n$ and $\frac{2000 + 2100 + 2400 + 3000 + 3900 + 5100 + 6500}{7}$. What quantity do both of these
values estimate? Which is a more accurate approximation?

CHAPTER OVERVIEW

6: USING DEFINITE INTEGRALS
An Introductory Calculus Libretexts Textmap
Active Calculus
by Matt Boelkins, David Austin, and Steve Schlicker
6.1: USING DEFINITE INTEGRALS TO FIND AREA AND LENGTH


A single definite integral may be used to represent the area between two curves. To find the area between two curves, we think about
slicing the region into thin rectangles. The shape of the region usually dictates whether we should use vertical rectangles of thickness
Δx or horizontal rectangles of thickness Δy.

6.2: USING DEFINITE INTEGRALS TO FIND VOLUME


Just as we can use definite integrals to add the areas of rectangular slices to find the exact area that lies between two curves, we can
also employ integrals to determine the volume of certain regions that have cross-sections of a particular consistent shape. We can use
a definite integral to find the volume of a three-dimensional solid of revolution that results from revolving a two-dimensional region
about a particular axis by taking slices perpendicular to the axis of revolution which will t

6.3: DENSITY, MASS, AND CENTER OF MASS


For an object of constant density D, with volume V and mass m, we know that m = D·V. If an object with constant cross-sectional
area (such as a thin bar) has its density distributed along an axis according to the function ρ(x), then we can find the mass of the object
between

6.4: PHYSICS APPLICATIONS - WORK, FORCE, AND PRESSURE


While there are many different formulas that we use in solving problems involving work, force, and pressure, it is important to
understand that the fundamental ideas behind these problems are similar to several others that we’ve encountered in applications of
the definite integral. In particular, the basic idea is to take a difficult problem and somehow slice it into more manageable pieces that
we understand, and then use a definite integral to add up these simpler pieces.

6.5: IMPROPER INTEGRALS


An integral can be improper if at least one of the limits of integration is ±∞, making the interval unbounded, or if the integrand has a
vertical asymptote. When we encounter an improper integral, we work to understand it by replacing the improper integral with a limit
of proper integrals.

6.E: USING DEFINITE INTEGRALS (EXERCISES)
These are homework exercises to accompany Chapter 6 of Boelkins et al. "Active Calculus" Textmap.

6.1: Using Definite Integrals to Find Area and Length
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How can we use definite integrals to measure the area between two curves?
How do we decide whether to integrate with respect to x or with respect to y when we try to find the area of a
region?
How can a definite integral be used to measure the length of a curve?

Early on in our work with the definite integral, we learned that if we have a nonnegative velocity function, $v$, for an object
moving along an axis, the area under the velocity function between $a$ and $b$ tells us the distance the object traveled on that
time interval. Moreover, based on the definition of the definite integral, that area is given precisely by
\[ \int_a^b v(t)\,dt. \]

Indeed, for any nonnegative function $f$ on an interval $[a, b]$, we know that
\[ \int_a^b f(x)\,dx \]
measures the area bounded by the curve and the $x$-axis between $x = a$ and $x = b$. Through our upcoming work in the
present section and chapter, we will explore how definite integrals can be used to represent a variety of different physically
important properties. In Preview Activity 6.1.1, we begin this investigation by seeing how a single definite integral may be
used to represent the area between two curves.

Preview Activity 6.1.1

Consider the functions given by $f(x) = 5 - (x - 1)^2$ and $g(x) = 4 - x$.

a. Use algebra to find the points where the graphs of f and g intersect.
b. Sketch an accurate graph of f and g on the axes provided, labeling the curves by name and the intersection points
with ordered pairs.
c. Find and evaluate exactly an integral expression that represents the area between $y = f(x)$ and the $x$-axis on the
interval between the intersection points of $f$ and $g$.
d. Find and evaluate exactly an integral expression that represents the area between $y = g(x)$ and the $x$-axis on the
interval between the intersection points of $f$ and $g$.
e. What is the exact area between $f$ and $g$ between their intersection points? Why?

Figure 6.1.1: Axes for plotting $f$ and $g$ in Preview Activity 6.1.1.

The Area Between Two Curves
Through Preview Activity 6.1.1, we encounter a natural way to think about the area between two curves: the area between
the curves is the area beneath the upper curve minus the area below the lower curve. For the functions
\[ f(x) = (x - 1)^2 + 1 \quad \text{and} \quad g(x) = x + 2, \]
shown in Figure 6.1.2, we see that the upper curve is $g(x) = x + 2$, and that the graphs intersect at $(0, 2)$ and $(3, 5)$. Note
that we can find these intersection points by solving the system of equations given by $y = (x - 1)^2 + 1$ and $y = x + 2$
through substitution: substituting $x + 2$ for $y$ in the first equation yields $x + 2 = (x - 1)^2 + 1$, so
$x + 2 = x^2 - 2x + 1 + 1$, and thus
\[ x^2 - 3x = x(x - 3) = 0, \]
from which it follows that $x = 0$ or $x = 3$. Using $y = x + 2$, we find the corresponding $y$-values of the intersection
points.

Figure 6.1.2: The areas bounded by the functions $f(x) = (x - 1)^2 + 1$ and $g(x) = x + 2$ on the interval $[0, 3]$.

On the interval $[0, 3]$, the area beneath $g$ is


\[ \int_0^3 (x + 2)\,dx = \frac{21}{2}, \]

while the area under f on the same interval is


\[ \int_0^3 [(x - 1)^2 + 1]\,dx = 6. \]

Thus, the area between the curves is


\[ A = \int_0^3 (x + 2)\,dx - \int_0^3 [(x - 1)^2 + 1]\,dx = \frac{21}{2} - 6 = \frac{9}{2}. \tag{6.1} \]
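The values $\frac{21}{2}$ and $6$ used here can be checked with the First FTC. The following worked evaluation is a sketch added for illustration; the antiderivatives shown are simply convenient choices.
\[ \int_0^3 (x + 2)\,dx = \left[ \tfrac{1}{2}x^2 + 2x \right]_0^3 = \tfrac{9}{2} + 6 = \tfrac{21}{2}, \qquad \int_0^3 [(x - 1)^2 + 1]\,dx = \left[ \tfrac{1}{3}(x - 1)^3 + x \right]_0^3 = \left( \tfrac{8}{3} + 3 \right) - \left( -\tfrac{1}{3} \right) = 6. \]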

A slightly different perspective is also helpful here: if we take the region between two curves and slice it up into thin
vertical rectangles (in the same spirit as we originally sliced the region between a single curve and the $x$-axis in Section
4.2), then we see that the height of a typical rectangle is given by the difference between the two functions. For example,
for the rectangle shown at left in Figure 6.1.3, we see that the rectangle’s height is $g(x) - f(x)$, while its width can be
viewed as $\Delta x$, and thus the area of the rectangle is
\[ A_{\text{rect}} = (g(x) - f(x))\,\Delta x. \]

Figure 6.1.3: The area bounded by the functions $f(x) = (x - 1)^2 + 1$ and $g(x) = x + 2$ on the interval $[0, 3]$.
The area between the two curves on [0, 3] is thus approximated by the Riemann sum
\[ A \approx \sum_{i=1}^{n} (g(x_i) - f(x_i))\,\Delta x, \]

and then as we let $n \to \infty$, it follows that the area is given by the single definite integral
\[ A = \int_0^3 (g(x) - f(x))\,dx. \tag{6.2} \]

In many applications of the definite integral, we will find it helpful to think of a “representative slice” and how the definite
integral may be used to add these slices to find the exact value of a desired quantity. Here, the integral essentially sums the
areas of thin rectangles.
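To make the "sum of thin rectangles" idea concrete, here is a minimal Python sketch, written for illustration under the assumption that right endpoints are used for the sample points $x_i$. The function name `riemann_area` is hypothetical and not part of the text. It computes the Riemann sum for the region between $g(x) = x + 2$ and $f(x) = (x - 1)^2 + 1$ on $[0, 3]$ and prints the result for increasing $n$, so the approach toward $\frac{9}{2}$ can be observed.

```python
def riemann_area(upper, lower, a, b, n):
    """Right-endpoint Riemann sum of (upper - lower) on [a, b] with n slices."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x_i = a + i * dx                         # right endpoint of the i-th slice
        total += (upper(x_i) - lower(x_i)) * dx  # area of one thin rectangle
    return total


def g(x):
    return x + 2             # upper curve on [0, 3]


def f(x):
    return (x - 1) ** 2 + 1  # lower curve on [0, 3]


# The sums should settle toward the exact area 9/2 as n grows.
for n in (10, 100, 1000, 10000):
    print(n, riemann_area(g, f, 0.0, 3.0, n))
```

Other sampling choices (left endpoints or midpoints) would serve equally well; all of them share the same limit as $n \to \infty$.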
Finally, whether we think of the area between two curves as the difference between the area bounded by the individual
curves (as in Equation 6.1) or as the limit of a Riemann sum that adds the areas of thin rectangles between the curves (as in
Equation 6.2), these two results are the same, since the difference of two integrals is the integral of the difference:
\[ \int_0^3 g(x)\,dx - \int_0^3 f(x)\,dx = \int_0^3 (g(x) - f(x))\,dx. \]

Moreover, our work so far in this section exemplifies the following general principle. If two curves $y = g(x)$ and
$y = f(x)$ intersect at $(a, g(a))$ and $(b, g(b))$, and for all $x$ such that $a \le x \le b$, $g(x) \ge f(x)$, then the area between
the curves is
\[ A = \int_a^b (g(x) - f(x))\,dx. \]

Activity 6.1.1

In each of the following problems, our goal is to determine the area of the region described. For each region,
i. determine the intersection points of the curves,
ii. sketch the region whose area is being found,
iii. draw and label a representative slice, and
iv. state the area of the representative slice.
Then, state a definite integral whose value is the exact area of the region, and evaluate the integral to find the numeric
value of the region’s area.
a. The finite region bounded by $y = \sqrt{x}$ and $y = \frac{1}{4}x$.
b. The finite region bounded by $y = 12 - 2x^2$ and $y = x^2 - 8$.
c. The area bounded by the $y$-axis, $f(x) = \cos(x)$, and $g(x) = \sin(x)$, where we consider the region formed by the
first positive value of $x$ for which $f$ and $g$ intersect.
d. The finite regions between the curves $y = x^3 - x$ and $y = x^2$.

Finding Area with Horizontal Slices
At times, the shape of a geometric region may dictate that we need to use horizontal rectangular slices, rather than vertical
ones. For instance, consider the region bounded by the parabola $x = y^2 - 1$ and the line $y = x - 1$, pictured in Figure
6.1.4. First, we observe that by solving the second equation for $x$ and writing $x = y + 1$, we can eliminate a variable
through substitution and find that $y + 1 = y^2 - 1$, and hence the curves intersect where
\[ y^2 - y - 2 = 0. \]
Thus, we find $y = -1$ or $y = 2$, so the intersection points of the two curves are $(0, -1)$ and $(3, 2)$. We see that if we
attempt to use vertical rectangles to slice up the area, at certain values of $x$ (specifically from $x = -1$ to $x = 0$, as
seen in the center graph of Figure 6.1.4), the curves that govern the top and bottom of the rectangle are one and the same.
This suggests, as shown in the rightmost graph in the figure, that we try using horizontal rectangles as a way to think about
the area of the region. For such a horizontal rectangle, note that its width depends on $y$, the height at which the rectangle is
constructed. In particular, at a height $y$ between $y = -1$ and $y = 2$, the right end of a representative rectangle is
determined by the line, $x = y + 1$, while the left end of the rectangle is determined by the parabola, $x = y^2 - 1$, and the
thickness of the rectangle is $\Delta y$.


Therefore, the area of the rectangle is
\[ A_{\text{rect}} = [(y + 1) - (y^2 - 1)]\,\Delta y, \]
from which it follows that the area between the two curves on the $y$-interval $[-1, 2]$ is approximated by the Riemann sum
\[ A \approx \sum_{i=1}^{n} [(y_i + 1) - (y_i^2 - 1)]\,\Delta y. \]

Figure 6.1.4: The area bounded by the functions $x = y^2 - 1$ and $y = x - 1$ (at left), with the region sliced vertically
(center) and horizontally (at right).


Taking the limit of the Riemann sum, it follows that the area of the region is
\[ A = \int_{y=-1}^{y=2} [(y + 1) - (y^2 - 1)]\,dy. \tag{6.3} \]

We emphasize that we are integrating with respect to $y$; this is dictated by the fact that we chose to use horizontal
rectangles whose widths depend on $y$ and whose thickness is denoted $\Delta y$. It is a straightforward exercise to evaluate the
integral in Equation 6.3 and find that $A = \frac{9}{2}$.
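For readers who want to check this value, one way to carry out the evaluation (a sketch, not the only route) is to simplify the integrand $(y + 1) - (y^2 - 1)$ to $-y^2 + y + 2$ and apply the First FTC:
\[ \int_{-1}^{2} (-y^2 + y + 2)\,dy = \left[ -\tfrac{1}{3}y^3 + \tfrac{1}{2}y^2 + 2y \right]_{-1}^{2} = \tfrac{10}{3} - \left( -\tfrac{7}{6} \right) = \tfrac{9}{2}. \]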

Just as with the use of vertical rectangles of thickness Δx , we have a general principle for finding the area between two
curves, which we state as follows.
If two curves $x = g(y)$ and $x = f(y)$ intersect at $(g(c), c)$ and $(g(d), d)$, and for all $y$ such that $c \le y \le d$, $g(y) \ge f(y)$,
then the area between the curves is
\[ A = \int_{y=c}^{y=d} (g(y) - f(y))\,dy. \]
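The following Python sketch, added for illustration, compares the two slicing strategies for the region between $x = y^2 - 1$ and $y = x - 1$. It assumes a simple midpoint Riemann sum (the helper name `midpoint_sum` is hypothetical). Horizontal slices need a single integral, while vertical slices need two, because the lower boundary switches from the parabola to the line at $x = 0$.

```python
import math


def midpoint_sum(h, a, b, n):
    """Midpoint Riemann sum of h on [a, b] with n slices."""
    step = (b - a) / n
    return sum(h(a + (i + 0.5) * step) for i in range(n)) * step


# Horizontal slices: right end x = y + 1 (line), left end x = y^2 - 1 (parabola).
horizontal = midpoint_sum(lambda y: (y + 1) - (y**2 - 1), -1.0, 2.0, 20000)

# Vertical slices, piece 1: on [-1, 0] the top and bottom both come from the
# parabola, written as y = sqrt(x + 1) and y = -sqrt(x + 1).
left_piece = midpoint_sum(lambda x: 2 * math.sqrt(x + 1), -1.0, 0.0, 20000)

# Vertical slices, piece 2: on [0, 3] the top is the parabola y = sqrt(x + 1)
# and the bottom is the line y = x - 1.
right_piece = midpoint_sum(lambda x: math.sqrt(x + 1) - (x - 1), 0.0, 3.0, 20000)

vertical = left_piece + right_piece

# Both approaches approximate the same area, 9/2.
print(horizontal, vertical)
```

The agreement of the two results reflects that the choice of slicing direction affects only how much work the setup takes, not the area itself.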

Activity 6.1.2

In each of the following problems, our goal is to determine the area of the region described. For each region,
i. determine the intersection points of the curves,
ii. sketch the region whose area is being found,
iii. draw and label a representative slice, and
iv. state the area of the representative slice.
Then, state a definite integral whose value is the exact area of the region, and evaluate the integral to find the numeric
value of the region’s area. Note well: At the step where you draw a representative slice, you need to make a choice
about whether to slice vertically or horizontally.
a. The finite region bounded by $x = y^2$ and $x = 6 - 2y^2$.
b. The finite region bounded by $x = 1 - y^2$ and $x = 2 - 2y^2$.
c. The area bounded by the $x$-axis, $y = x^2$, and $y = 2 - x$.
d. The finite regions between the curves $x = y^2 - 2y$ and $y = x$.

Finding the length of a curve


In addition to being able to use definite integrals to find the areas of certain geometric regions, we can also use the definite
integral to find the length of a portion of a curve. We use the same fundamental principle: we take a curve whose length we
cannot easily find, and slice it up into small pieces whose lengths we can easily approximate. In particular, we take a given
curve and subdivide it into small approximating line segments, as shown at left in Figure 6.1.5.

Figure 6.1.5: At left, a continuous function $y = f(x)$ whose length we seek on the interval $a = x_0$ to $b = x_3$. At right, a
close-up view of a portion of the curve.

To see how we find such a definite integral that measures arc length on the curve $y = f(x)$ from $x = a$ to $x = b$, we think
about the portion of length, $L_{\text{slice}}$, that lies along the curve on a small interval of length $\Delta x$, and estimate
the value of $L_{\text{slice}}$ using a well-chosen triangle. In particular, if we consider the right triangle with legs parallel
to the coordinate axes and hypotenuse connecting two points on the curve, as seen at right in Figure 6.1.5, we see that the
length, $h$, of the hypotenuse approximates the length, $L_{\text{slice}}$, of the curve between the two selected points. Thus,
\[ L_{\text{slice}} \approx h = \sqrt{(\Delta x)^2 + (\Delta y)^2}. \]

By algebraically rearranging the expression for the length of the hypotenuse, we see how a definite integral can be used to
compute the length of a curve. In particular, observe that by removing a factor of $(\Delta x)^2$, we find that

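\begin{align*}
L_{\text{slice}} &\approx \sqrt{(\Delta x)^2 + (\Delta y)^2} \\
&= \sqrt{(\Delta x)^2 \left( 1 + \frac{(\Delta y)^2}{(\Delta x)^2} \right)} \\
&= \sqrt{1 + \frac{(\Delta y)^2}{(\Delta x)^2}} \cdot \Delta x.
\end{align*}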

Furthermore, as $n \to \infty$ and $\Delta x \to 0$, it follows that $\frac{\Delta y}{\Delta x} \to \frac{dy}{dx} = f'(x)$. Thus, we can say that
\[ L_{\text{slice}} \approx \sqrt{1 + f'(x)^2}\,\Delta x. \]

Taking a Riemann sum of all of these slices and letting $n \to \infty$, we arrive at the following fact.
Given a differentiable function $f$ on an interval $[a, b]$, the total arc length, $L$, along the curve $y = f(x)$ from $x = a$ to
$x = b$ is given by
\[ L = \int_a^b \sqrt{1 + f'(x)^2}\,dx. \]
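Because arc length integrands involve a square root, they often have to be approximated numerically. The sketch below is an illustration only: it assumes the derivative $f'$ is supplied by hand and uses composite Simpson's Rule for the integral. The helper names `simpson` and `arc_length` are hypothetical, and the test curve $y = \frac{2}{3}x^{3/2}$ on $[0, 3]$ is chosen (not taken from the text) because its length can be found exactly: $f'(x) = \sqrt{x}$, so $L = \int_0^3 \sqrt{1 + x}\,dx = \frac{14}{3}$.

```python
import math


def simpson(h, a, b, n=1000):
    """Composite Simpson's Rule for the integral of h on [a, b]; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3


def arc_length(f_prime, a, b, n=1000):
    """Approximate L = integral of sqrt(1 + f'(x)^2) dx on [a, b]."""
    return simpson(lambda x: math.sqrt(1 + f_prime(x) ** 2), a, b, n)


# Illustrative curve: y = (2/3) x^(3/2), whose derivative is sqrt(x).
length = arc_length(math.sqrt, 0.0, 3.0)
print(length, 14 / 3)  # numerical estimate next to the exact value
```

For a curve whose arc length integral has no elementary antiderivative, the same function can be used with only `f_prime` changed.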

Activity 6.1.3

Each of the following questions somehow involves the arc length along a curve.
a. Use the definition and appropriate computational technology to determine the arc length along $y = x^2$ from
$x = -1$ to $x = 1$.
b. Find the arc length of $y = \sqrt{4 - x^2}$ on the interval $-2 \le x \le 2$. Find this value in two different ways:
1. by using a definite integral
2. by using a familiar property of the curve.
c. Determine the arc length of $y = xe^{3x}$ on the interval $[0, 1]$.
d. Will the integrals that arise in calculating arc length typically be ones that we can evaluate exactly using the First
FTC, or ones that we need to approximate? Why?
e. A moving particle is traveling along the curve given by $y = f(x) = 0.1x^2 + 1$, and does so at a constant rate of 7
cm/sec, where both $x$ and $y$ are measured in cm (that is, the curve $y = f(x)$ is the path along which the object
actually travels; the curve is not a “position function”). Find the position of the particle when $t = 4$ sec, assuming
that when $t = 0$, the particle’s location is $(0, f(0))$.

Summary
In this section, we encountered the following important ideas:
To find the area between two curves, we think about slicing the region into thin rectangles. If, for instance, the area of a
typical rectangle on the interval $x = a$ to $x = b$ is given by $A_{\text{rect}} = (g(x) - f(x))\,\Delta x$, then the exact area of the
region is given by the definite integral
\[ A = \int_a^b (g(x) - f(x))\,dx. \]
The shape of the region usually dictates whether we should use vertical rectangles of thickness $\Delta x$ or horizontal
rectangles of thickness $\Delta y$. We desire to have the height of the rectangle governed by the difference between two
curves: if those curves are best thought of as functions of $y$, we use horizontal rectangles, whereas if those curves are
best viewed as functions of $x$, we use vertical rectangles.
The arc length, $L$, along the curve $y = f(x)$ from $x = a$ to $x = b$ is given by
\[ L = \int_a^b \sqrt{1 + f'(x)^2}\,dx. \]

6.2: Using Definite Integrals to Find Volume
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How can we use a definite integral to find the volume of a three-dimensional solid of revolution that results from
revolving a two-dimensional region about a particular axis?
In what circumstances do we integrate with respect to y instead of integrating with respect to x?
What adjustments do we need to make if we revolve about a line other than the $x$- or $y$-axis?

Just as we can use definite integrals to add the areas of rectangular slices to find the exact area that lies between two
curves, we can also employ integrals to determine the volume of certain regions that have cross-sections of a particular
consistent shape. As a very elementary example, consider a cylinder of radius 2 and height 3, as pictured in Figure 6.2.1.

Figure 6.2.1 : A right circular cylinder.


While we know that we can compute the volume of any circular cylinder by the formula
\[ V = \pi r^2 h, \]
if we think about slicing the cylinder into thin pieces, we see that each is a cylinder of radius $r = 2$ and height (thickness)
$\Delta x$. Hence, the volume of a representative slice is
\[ V_{\text{slice}} = \pi \cdot 2^2 \cdot \Delta x. \]
Letting $\Delta x \to 0$ and using a definite integral to add the volumes of the slices, we find that
\[ V = \int_0^3 \pi \cdot 2^2\,dx. \]
Moreover, since
\[ \int_0^3 4\pi\,dx = 12\pi, \]

we have found that the volume of the cylinder is 12π. The principal problem of interest in our upcoming work will be to
find the volume of certain solids whose cross-sections are all thin cylinders (or washers) and to do so by using a definite
integral. To that end, we first consider another familiar shape in Preview Activity 6.2.1: a circular cone.

Preview Activity 6.2.1

Consider a circular cone of radius 3 and height 5, which we view horizontally as pictured in Figure 6.2.2. Our goal in
this activity is to use a definite integral to determine the volume of the cone.

Figure 6.2.2: The circular cone described in Preview Activity 6.2.1.
a. Find a formula for the linear function y = f (x) that is pictured in Figure 6.2.2.
b. For the representative slice of thickness Δx that is located horizontally at a location x (somewhere between x = 0
and x = 5 ), what is the radius of the representative slice? Note that the radius depends on the value of x.
c. What is the volume of the representative slice you found in (b)?
d. What definite integral will sum the volumes of the thin slices across the full horizontal span of the cone? What is
the exact value of this definite integral?
e. Compare the result of your work in (d) to the volume of the cone that comes from using the formula
$V_{\text{cone}} = \frac{1}{3}\pi r^2 h$.

The Volume of a Solid of Revolution


A solid of revolution is a three dimensional solid that can be generated by revolving one or more curves around a fixed
axis. For example, we can think of a circular cylinder as a solid of revolution: in Figure 6.2.1, this could be accomplished
by revolving the line segment from $(0, 2)$ to $(3, 2)$ about the $x$-axis. Likewise, the circular cone in Figure 6.2.2 is the solid
of revolution generated by revolving the portion of the line $y = 3 - \frac{3}{5}x$ from $x = 0$ to $x = 5$ about the $x$-axis. It is
particularly important to notice in any solid of revolution that if we slice the solid perpendicular to the axis of revolution,
the resulting cross-section is circular. We consider two examples to highlight some of the natural issues that arise in
determining the volume of a solid of revolution.

Example 6.2.1

Find the volume of the solid of revolution generated when the region $R$ bounded by $y = 4 - x^2$ and the $x$-axis is
revolved about the $x$-axis.
Solution
First, we observe that $y = 4 - x^2$ intersects the $x$-axis at the points $(-2, 0)$ and $(2, 0)$. When we take the region $R$ that
lies between the curve and the $x$-axis on this interval and revolve it about the $x$-axis, we get the three-dimensional
solid pictured in Figure 6.2.3.

Figure 6.2.3 : The solid of revolution in Example 6.2.1.
Taking a representative slice of the solid located at a value $x$ that lies between $x = -2$ and $x = 2$, we see that the
thickness of such a slice is $\Delta x$ (which is also the height of the cylinder-shaped slice), and that the radius of the slice is
determined by the curve $y = 4 - x^2$. Hence, we find that
\[ V_{\text{slice}} = \pi (4 - x^2)^2\,\Delta x, \]
since the volume of a cylinder of radius $r$ and height $h$ is $V = \pi r^2 h$. Using a definite integral to sum the volumes of
the representative slices, it follows that
\[ V = \int_{-2}^{2} \pi (4 - x^2)^2\,dx. \]

It is straightforward to evaluate the integral and find that the volume is
\[ V = \frac{512}{15}\pi. \tag{6.2.1} \]
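As a check on this value, one way to carry out the evaluation by hand (a sketch, assuming we expand the integrand and use the symmetry of the interval about $x = 0$) is
\[ \int_{-2}^{2} \pi (4 - x^2)^2\,dx = 2\pi \int_0^2 (16 - 8x^2 + x^4)\,dx = 2\pi \left[ 16x - \tfrac{8}{3}x^3 + \tfrac{1}{5}x^5 \right]_0^2 = 2\pi \left( 32 - \tfrac{64}{3} + \tfrac{32}{5} \right) = \tfrac{512}{15}\pi. \]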

For a solid such as the one in Example 6.2.1, where each cross-section is a cylindrical disk, we first find the volume of a
typical cross-section (noting particularly how this volume depends on x), and then we integrate over the range of x-values
through which we slice the solid in order to find the exact total volume. Often, we will be content with simply finding the
integral that represents the sought volume; if we desire a numeric value for the integral, we typically use a calculator or
computer algebra system to find that value.
The general principle we are using to find the volume of a solid of revolution generated by a single curve is often called
the disk method.
If $y = r(x)$ is a nonnegative continuous function on $[a, b]$, then the volume of the solid of revolution generated by
revolving the curve about the $x$-axis over this interval is given by
\[ V = \int_a^b \pi r(x)^2\,dx. \]
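When only a numeric value is wanted, the disk method translates directly into a sum of thin-disk volumes. The Python sketch below is an illustration under stated assumptions: it uses a midpoint Riemann sum, and the function name `disk_volume` is hypothetical. It is applied to the solid from Example 6.2.1, where $r(x) = 4 - x^2$ on $[-2, 2]$.

```python
import math


def disk_volume(r, a, b, n=2000):
    """Midpoint-rule approximation of V = integral of pi * r(x)^2 dx on [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx             # midpoint of the i-th slice
        total += math.pi * r(x) ** 2 * dx  # volume of one thin disk
    return total


# Radius function from Example 6.2.1: r(x) = 4 - x^2 on [-2, 2].
volume = disk_volume(lambda x: 4 - x**2, -2.0, 2.0)
print(volume, 512 * math.pi / 15)  # numerical estimate next to the exact value
```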

A different type of solid can emerge when two curves are involved, as we see in the following example.

Example 6.2.2

Find the volume of the solid of revolution generated when the finite region $R$ that lies between $y = 4 - x^2$ and
$y = x + 2$ is revolved about the $x$-axis.

Solution

First, we must determine where the curves $y = 4 - x^2$ and $y = x + 2$ intersect. Substituting the expression for $y$ from
the second equation into the first equation, we find that
\[ x + 2 = 4 - x^2. \]
Rearranging, it follows that
\[ x^2 + x - 2 = 0, \]
and the solutions to this equation are $x = -2$ and $x = 1$. The curves therefore cross at $(-2, 0)$ and $(1, 3)$.
When we take the region R that lies between the curves and revolve it about the x-axis, we get the three-dimensional
solid pictured at left in Figure 6.2.4.

Figure 6.2.4: At left, the solid of revolution in Example 6.2.2. At right, a typical slice with inner radius $r(x)$ and outer
radius $R(x)$.
Immediately we see a major difference between the solid in this example and the one in Example 6.2.1: here, the
three-dimensional solid of revolution isn’t “solid” in the sense that it has open space in its center. If we slice the solid
perpendicular to the axis of revolution, we observe that in this setting the resulting representative slice is not a solid
disk, but rather a washer, as pictured at right in Figure 6.2.4. Moreover, at a given location $x$ between $x = -2$ and
$x = 1$, the small radius $r(x)$ of the inner circle is determined by the curve $y = x + 2$, so $r(x) = x + 2$. Similarly, the
big radius $R(x)$ comes from the function $y = 4 - x^2$, and thus $R(x) = 4 - x^2$. Thus, to find the volume of a
representative slice, we compute the volume of the outer disk and subtract the volume of the inner disk. Since
\[ \pi R(x)^2\,\Delta x - \pi r(x)^2\,\Delta x = \pi [R(x)^2 - r(x)^2]\,\Delta x, \]
it follows that the volume of a typical slice is
\[ V_{\text{slice}} = \pi [(4 - x^2)^2 - (x + 2)^2]\,\Delta x. \]

Hence, using a definite integral to sum the volumes of the respective slices across the interval, we find that
\[ V = \int_{-2}^{1} \pi [(4 - x^2)^2 - (x + 2)^2]\,dx. \]
Evaluating the integral, the volume of the solid of revolution is
\[ V = \frac{108}{5}\pi. \]

The general principle we are using to find the volume of a solid of revolution generated by the region between two curves
is often called the washer method. If $y = R(x)$ and $y = r(x)$ are nonnegative continuous functions on $[a, b]$ that satisfy
$R(x) \ge r(x)$ for all $x$ in $[a, b]$, then the volume of the solid of revolution generated by revolving the region between
them about the $x$-axis over this interval is given by
\[ V = \int_a^b \pi [R(x)^2 - r(x)^2]\,dx. \]
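The washer method can be sketched numerically in the same way as the disk method. The Python code below is an illustration only: it assumes a midpoint Riemann sum, and the name `washer_volume` is hypothetical. It uses the radii from this example, $R(x) = 4 - x^2$ and $r(x) = x + 2$ on $[-2, 1]$.

```python
import math


def washer_volume(outer, inner, a, b, n=2000):
    """Midpoint-rule approximation of V = integral of pi * (R(x)^2 - r(x)^2) dx."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx  # midpoint of the i-th slice
        # Volume of one thin washer: outer disk minus inner disk.
        total += math.pi * (outer(x) ** 2 - inner(x) ** 2) * dx
    return total


# Radii from Example 6.2.2: R(x) = 4 - x^2 and r(x) = x + 2 on [-2, 1].
volume = washer_volume(lambda x: 4 - x**2, lambda x: x + 2, -2.0, 1.0)
print(volume, 108 * math.pi / 5)  # numerical estimate next to the exact value
```

Setting the inner radius to zero recovers the disk method as a special case.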

Activity 6.2.1

In each of the following questions, draw a careful, labeled sketch of the region described, as well as the resulting solid
that results from revolving the region about the stated axis. In addition, draw a representative slice and state the
volume of that slice, along with a definite integral whose value is the volume of the entire solid. It is not necessary to
evaluate the integrals you find.
a. The region $S$ bounded by the $x$-axis, the curve $y = \sqrt{x}$, and the line $x = 4$; revolve $S$ about the $x$-axis.
b. The region $S$ bounded by the $y$-axis, the curve $y = \sqrt{x}$, and the line $y = 2$; revolve $S$ about the $x$-axis.
c. The finite region $S$ bounded by the curves $y = \sqrt{x}$ and $y = x^3$; revolve $S$ about the $x$-axis.
d. The finite region $S$ bounded by the curves $y = 2x^2 + 1$ and $y = x^2 + 4$; revolve $S$ about the $x$-axis.
e. The region $S$ bounded by the $y$-axis, the curve $y = \sqrt{x}$, and the line $y = 2$; revolve $S$ about the $y$-axis. How does
the problem change considerably when we revolve about the $y$-axis?

Revolving about the y-axis


As seen in Activity 6.2.1, problem (e), the problem changes considerably when we revolve a given region about the $y$-axis.
Foremost, this is due to the fact that representative slices now have thickness $\Delta y$, which means that it becomes
necessary to integrate with respect to $y$. Let’s consider a particular example to demonstrate some of the key issues.

Example 6.2.3

Find the volume of the solid of revolution generated when the finite region $R$ that lies between $y = \sqrt{x}$ and $y = x^4$ is
revolved about the $y$-axis.


Solution.
We observe that, besides meeting at the origin, these two curves intersect when $x = 1$, hence at the point $(1, 1)$. When we take
the region $R$ that lies between the curves and revolve it about the $y$-axis, we get the three-dimensional solid pictured at left in Figure 6.2.5.

Figure 6.2.5: At left, the solid of revolution in Example 6.2.3. At right, a typical slice with inner radius $r(y)$ and outer
radius $R(y)$.
Now, it is particularly important to note that the thickness of a representative slice is $\Delta y$, and that the slices are only
cylindrical washers in nature when taken perpendicular to the $y$-axis. Hence, we envision slicing the solid horizontally,
starting at $y = 0$ and proceeding up to $y = 1$. Because the inner radius is governed by the curve $y = \sqrt{x}$, but from the
perspective that $x$ is a function of $y$, we solve for $x$ and get $x = y^2 = r(y)$.
In the same way, we need to view the curve $y = x^4$ (which governs the outer radius) in the form where $x$ is a function
of $y$, and hence $x = \sqrt[4]{y} = R(y)$. Therefore, we see that the volume of a typical slice is
\[ V_{\text{slice}} = \pi [R(y)^2 - r(y)^2]\,\Delta y = \pi [(\sqrt[4]{y})^2 - (y^2)^2]\,\Delta y. \]

Using a definite integral to sum the volume of all the representative slices from $y = 0$ to $y = 1$, the total volume is
\[ V = \int_{y=0}^{y=1} \pi [(\sqrt[4]{y})^2 - (y^2)^2]\,dy. \]
It is straightforward to evaluate the integral and find that $V = \frac{7}{15}\pi$.
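For reference, a sketch of that evaluation, assuming we first rewrite the integrand using $(\sqrt[4]{y})^2 = y^{1/2}$ and $(y^2)^2 = y^4$:
\[ \int_0^1 \pi [(\sqrt[4]{y})^2 - (y^2)^2]\,dy = \pi \int_0^1 (y^{1/2} - y^4)\,dy = \pi \left[ \tfrac{2}{3}y^{3/2} - \tfrac{1}{5}y^5 \right]_0^1 = \pi \left( \tfrac{2}{3} - \tfrac{1}{5} \right) = \tfrac{7}{15}\pi. \]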

Activity 6.2.2

In each of the following questions, draw a careful, labeled sketch of the region described, as well as the resulting solid
that results from revolving the region about the stated axis. In addition, draw a representative slice and state the
volume of that slice, along with a definite integral whose value is the volume of the entire solid. It is not necessary to
evaluate the integrals you find.
a. The region $S$ bounded by the $y$-axis, the curve $y = \sqrt{x}$, and the line $y = 2$; revolve $S$ about the $y$-axis.
b. The region $S$ bounded by the $x$-axis, the curve $y = \sqrt{x}$, and the line $x = 4$; revolve $S$ about the $y$-axis.
c. The finite region $S$ in the first quadrant bounded by the curves $y = 2x$ and $y = x^3$; revolve $S$ about the $x$-axis.
d. The finite region $S$ in the first quadrant bounded by the curves $y = 2x$ and $y = x^3$; revolve $S$ about the $y$-axis.
e. The finite region $S$ bounded by the curves $x = (y - 1)^2$ and $y = x - 1$; revolve $S$ about the $y$-axis.

Revolving about horizontal and vertical lines other than the coordinate axes
Just as we can revolve about one of the coordinate axes (y = 0 or x = 0 ), it is also possible to revolve around any
horizontal or vertical line. Doing so essentially adjusts the radii of cylinders or washers involved by a constant value. A
careful, well-labeled plot of the solid of revolution will usually reveal how the different axis of revolution affects the
definite integral we set up. Again, an example is instructive.

Example 6.2.4

Example 6.4.
Find the volume of the solid of revolution generated when the finite region S that lies between \( y = x^2 \) and \( y = x \) is
revolved about the line y = −1.
Solution
Graphing the region between the two curves in the first quadrant between their points of intersection ((0, 0) and (1, 1))
and then revolving the region about the line y = −1, we see the solid shown in Figure 6.2.6. Each slice of the solid
perpendicular to the axis of revolution is a washer, and the radii of each washer are governed by the curves \( y = x^2 \) and
\( y = x \). But we also see that there is one added change: the axis of revolution adds a fixed length to each radius.

Figure 6.2.6 : The solid of revolution described in Example 6.4.

In particular, the inner radius of a typical slice, r(x), is given by \( r(x) = x^2 + 1 \), while the outer radius is
\( R(x) = x + 1 \). Therefore, the volume of a typical slice is
\[ V_{\text{slice}} = \pi\left[R(x)^2 - r(x)^2\right]\Delta x = \pi\left[(x + 1)^2 - (x^2 + 1)^2\right]\Delta x. \]

Finally, we integrate to find the total volume, and
\[ V = \int_0^1 \pi\left[(x + 1)^2 - (x^2 + 1)^2\right] dx = \frac{7}{15}\pi. \]
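The effect of shifting the axis of revolution can also be explored numerically. The sketch below is illustrative only: the helper `washer_volume_about_horizontal_line` is a hypothetical name, it uses a midpoint Riemann sum, and it assumes the line y = c does not pass through the interior of the region.

```python
import math

def washer_volume_about_horizontal_line(top, bottom, c, a, b, n=100_000):
    """Approximate the volume obtained by revolving the region between
    y = top(x) and y = bottom(x), a <= x <= b, about the line y = c.
    Assumes the axis y = c does not cut through the region."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        d_top = abs(top(x) - c)        # distance from the axis to the upper curve
        d_bottom = abs(bottom(x) - c)  # distance from the axis to the lower curve
        outer, inner = max(d_top, d_bottom), min(d_top, d_bottom)
        total += math.pi * (outer ** 2 - inner ** 2) * dx
    return total

# Region between y = x and y = x^2 on [0, 1], as in the example.
about_y_minus_1 = washer_volume_about_horizontal_line(lambda x: x, lambda x: x ** 2, -1.0, 0.0, 1.0)
about_x_axis = washer_volume_about_horizontal_line(lambda x: x, lambda x: x ** 2, 0.0, 0.0, 1.0)

print(about_y_minus_1, 7 * math.pi / 15)  # axis y = -1
print(about_x_axis, 2 * math.pi / 15)     # axis y = 0, for comparison
```

Comparing the two calls shows how adding the constant 1 to each radius changes the volume, even though the region itself is unchanged.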

Activity 6.2.3

In each of the following questions, draw a careful, labeled sketch of the region described, as well as the solid
that results from revolving the region about the stated axis. In addition, draw a representative slice and state the
volume of that slice, along with a definite integral whose value is the volume of the entire solid. It is not necessary to
evaluate the integrals you find. For each prompt, use the finite region S in the first quadrant bounded by the curves
\( y = 2x \) and \( y = x^3 \).

a. Revolve S about the line y = −2 .


b. Revolve S about the line y = 4 .
c. Revolve S about the line x = −1 .
d. Revolve S about the line x = 5 .

Summary
In this section, we encountered the following important ideas:
We can use a definite integral to find the volume of a three-dimensional solid of revolution that results from revolving a
two-dimensional region about a particular axis by taking slices perpendicular to the axis of revolution which will then
be circular disks or washers.
If we revolve about a vertical line and slice perpendicular to that line, then our slices are horizontal and of thickness
Δy. This leads us to integrate with respect to y , as opposed to with respect to x when we slice a solid vertically.

If we revolve about a line other than the x- or y -axis, we need to carefully account for the shift that occurs in the radius
of a typical slice. Normally, this shift involves taking a sum or difference of the function along with the constant
connected to the equation for the horizontal or vertical line; a well-labeled diagram is usually the best way to decide the
new expression for the radius.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

6.3: Density, Mass, and Center of Mass
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How are mass, density, and volume related?
How is the mass of an object with varying density computed?
What is the center of mass of an object, and how are definite integrals used to compute it?

We have seen in several different circumstances how studying the units on the integrand and variable of integration enables
us to better understand the meaning of a definite integral. For instance, if v(t) is the velocity of an object moving along an
axis, measured in feet per second, while t measures time in seconds, then both the definite integral and its Riemann sum
approximation,
\[ \int_a^b v(t)\,dt \approx \sum_{i=1}^{n} v(t_i)\,\Delta t, \]
have their overall units given by the product of the units of v(t) and t:
(feet/sec)·(sec) = feet.
Thus,
\[ \int_a^b v(t)\,dt \]
measures the total change in position (in feet) of the moving object. This type of unit analysis will be particularly helpful
to us in what follows. To begin, in the following preview activity we consider two different definite integrals where the
integrand is a function that measures how a particular quantity is distributed over a region and think about how the units on
the integrand and the variable of integration indicate the meaning of the integral.

Preview Activity 6.3.1:

In each of the following scenarios, we consider the distribution of a quantity along an axis.
a. Suppose that the function \( c(x) = 200 + 100e^{-0.1x} \) models the density of traffic on a straight road, measured in
cars per mile, where x is the number of miles east of a major interchange, and consider the definite integral
\[ \int_0^2 (200 + 100e^{-0.1x})\,dx. \]
i. What are the units on the product c(x) ⋅ Δx?
ii. What are the units on the definite integral and its Riemann sum approximation given by
\[ \int_0^2 c(x)\,dx \approx \sum_{i=1}^{n} c(x_i)\,\Delta x? \]
iii. Evaluate the definite integral \( \int_0^2 c(x)\,dx = \int_0^2 (200 + 100e^{-0.1x})\,dx \) and write one sentence to explain the
meaning of the value you find.
b. On a 6 foot long shelf filled with books, the function B models the distribution of the weight of the books,
measured in pounds per inch, where x is the number of inches from the left end of the bookshelf. Let B(x) be
given by the rule \( B(x) = 0.5 + \frac{1}{(x + 1)^2} \).
i. What are the units on the product B(x) ⋅ Δx?
ii. What are the units on the definite integral and its Riemann sum approximation given by
\[ \int_{12}^{36} B(x)\,dx \approx \sum_{i=1}^{n} B(x_i)\,\Delta x? \]
iii. Evaluate the definite integral \( \int_0^{72} B(x)\,dx = \int_0^{72} \left(0.5 + \frac{1}{(x + 1)^2}\right) dx \) and write one sentence to explain the
meaning of the value you find.

Density
The mass of a quantity, typically measured in metric units such as grams or kilograms, is a measure of the amount of a
quantity. In a corresponding way, the density of an object measures the distribution of mass per unit volume. For instance,
if a brick has mass 3 kg and volume 0.002 m³, then the density of the brick is
\[ \frac{3\ \text{kg}}{0.002\ \text{m}^3} = 1500\ \frac{\text{kg}}{\text{m}^3}. \tag{6.3.1} \]
As another example, the mass density of water is 1000 kg/m³. Each of these relationships demonstrates the following
general principle.
For an object of constant density d, with mass m and volume V,
\[ d = \frac{m}{V} \tag{6.3.2} \]
or
\[ m = d \cdot V. \tag{6.3.3} \]
But what happens when the density is not constant?


If we consider the formula m = d ⋅ V, it is reminiscent of two other equations that we have used frequently in recent
work: for a body moving in a fixed direction, distance = rate · time, and, for a rectangle, its area is given by A = l ⋅ w.
These formulas hold when the principal quantities involved, such as the rate the body moves and the height of the
rectangle, are constant. When these quantities are not constant, we have turned to the definite integral for assistance. The
main idea in each situation is that by working with small slices of the quantity that is varying, we can use a definite
integral to add up the values of small pieces on which the quantity of interest (such as the velocity of a moving object) is
approximately constant.
For example, in the setting where we have a nonnegative velocity function that is not constant, over a short time interval
Δt we know that the distance traveled is approximately v(t)Δt, since v(t) is almost constant on a small interval, and for a
constant rate, distance = rate · time. Similarly, if we are thinking about the area under a nonnegative function f whose
value is changing, on a short interval Δx the area under the curve is approximately the area of the rectangle whose height is
f(x) and whose width is Δx: f(x)Δx. Both of these principles are represented visually in Figure 6.3.1.

Figure 6.3.1 : At left, estimating a small amount of distance traveled, v(t)Δt , and at right, a small amount of area under
the curve, f (x)Δx.
In a similar way, if we consider the setting where the density of some quantity is not constant, the definite integral enables
us to still compute the overall mass of the quantity. Throughout, we will focus on problems where the density varies in
only one dimension, say along a single axis, and think about how mass is distributed relative to location along the axis.
Let’s consider a thin bar of length b that is situated so its left end is at the origin, where x = 0, and assume that the bar has
constant cross-sectional area of 1 cm². We let the function ρ(x) represent the mass density function of the bar, measured
in grams per cubic centimeter. That is, given a location x, ρ(x) tells us approximately how much mass will be found in a
one-centimeter wide slice of the bar at x.

Figure 6.3.2 : A thin bar of constant cross-sectional area 1 cm² with density function ρ(x) g/cm³.

If we now consider a thin slice of the bar of width Δx, as pictured in Figure 6.3.2, the volume of such a slice is the
cross-sectional area times Δx. Since the cross-sections each have constant area 1 cm², it follows that the volume of the slice is
1Δx cm³. Moreover, since mass is the product of density and volume (when density is constant), we see that the mass of
this given slice is approximately
\[ \text{mass}_{\text{slice}} \approx \rho(x)\ \frac{\text{g}}{\text{cm}^3} \cdot 1\,\Delta x\ \text{cm}^3 = \rho(x) \cdot \Delta x\ \text{g}. \]

Hence, for the corresponding Riemann sum (and thus for the integral that it approximates),
\[ \sum_{i=1}^{n} \rho(x_i)\,\Delta x \approx \int_0^b \rho(x)\,dx, \tag{6.3.4} \]
we see that these quantities measure the mass of the bar between 0 and b. (The Riemann sum is an approximation, while
the integral will be the exact mass.)
At this point, we note that we will be focused primarily on situations where mass is distributed relative to horizontal
location, x, for objects whose cross-sectional area is constant. In that setting, it makes sense to think of the density
function ρ(x) with units “mass per unit length,” such as g/cm. Thus, when we compute ρ(x) ⋅ Δx on a small slice Δx, the
resulting units are g/cm ⋅ cm = g, which thus measures the mass of the slice. The general principle follows.

For an object of constant cross-sectional area whose mass is distributed along a single axis according to the function ρ(x)
(whose units are units of mass per unit of length), the total mass, M, of the object between x = a and x = b is given by
\[ M = \int_a^b \rho(x)\,dx. \tag{6.3.5} \]
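To see how the Riemann sum of slice masses approaches the integral, here is a minimal Python sketch. The density \( \rho(x) = 2 + x^2 \) g/cm on a bar from x = 0 to x = 3 cm is a hypothetical choice made only for illustration (it does not come from the text), and `mass_of_bar` is likewise an assumed helper name.

```python
def mass_of_bar(density, a, b, n=10_000):
    """Midpoint Riemann sum of density(x) * dx, approximating the mass between a and b."""
    dx = (b - a) / n
    return sum(density(a + (i + 0.5) * dx) * dx for i in range(n))

# Hypothetical density in g/cm (an assumption for this example, not from the text).
def rho(x):
    return 2 + x ** 2

coarse = mass_of_bar(rho, 0.0, 3.0, n=4)  # four thick slices
fine = mass_of_bar(rho, 0.0, 3.0)         # ten thousand thin slices
exact = 2 * 3 + 3 ** 3 / 3                # integral of 2 + x^2 on [0, 3] is 15 g

print(coarse, fine, exact)
```

With only four slices the estimate is visibly off, while the fine sum is essentially equal to the exact mass of 15 g, which is the sense in which the integral gives the exact mass.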

Exercise 6.3.1:

Consider the following situations in which mass is distributed in a non-constant manner.


a. Suppose that a thin rod with constant cross-sectional area of 1 cm² has its mass distributed according to the density
function \( \rho(x) = 2e^{-0.2x} \), where x is the distance in cm from the left end of the rod, and the units on ρ(x) are g/cm.
If the rod is 10 cm long, determine the exact mass of the rod.
b. Consider the cone that has a base of radius 4 m and a height of 5 m. Picture the cone lying horizontally with the
center of its base at the origin and think of the cone as a solid of revolution.
i. Write and evaluate a definite integral whose value is the volume of the cone.
ii. Next, suppose that the cone has uniform density of 800 kg/m³. What is the mass of the solid cone?
iii. Now suppose that the cone’s density is not uniform, but rather that the cone is most dense at its base. In
particular, assume that the density of the cone is uniform across cross sections parallel to its base, but that in
each such cross section that is a distance x units from the origin, the density of the cross section is given by the
function \( \rho(x) = 400 + \frac{200}{1 + x^2} \), measured in kg/m³. Determine and evaluate a definite integral whose value is
the mass of this cone of non-uniform density. Do so by first thinking about the mass of a given slice of the cone
x units away from the base; remember that in such a slice, the density will be essentially constant.
c. Let a thin rod of constant cross-sectional area 1 cm² and length 12 cm have its mass be distributed according to the
density function \( \rho(x) = \frac{1}{25}(x - 15)^2 \), measured in g/cm. Find the exact location z at which to cut the bar so that
the two pieces will each have identical mass.

Weighted Averages
class            grade   grade points   credits
chemistry        B+      3.3            5
calculus         A-      3.7            4
(third class)    B-      2.7            3
(fourth class)   B-      2.7            3

Table 6.1: A college student’s semester grades.


The concept of an average is a natural one, and one that we have used repeatedly as part of our understanding of the
meaning of the definite integral. If we have n values \( a_1, a_2, \ldots, a_n \), we know that their average is given by
\[ \frac{a_1 + a_2 + \cdots + a_n}{n}, \tag{6.3.6} \]
and for a quantity being measured by a function f on an interval [a, b], the average value of the quantity on [a, b] is
\[ \frac{1}{b - a} \int_a^b f(x)\,dx. \]
As we continue to think about problems involving the distribution of mass, it is natural to consider the idea of a weighted
average, where certain quantities involved are counted more in the average.
A common use of weighted averages is in the computation of a student’s GPA, where grades are weighted according to
credit hours. Let’s consider the scenario in Table 6.1.
If all of the classes were of the same weight (i.e., the same number of credits), the student’s GPA would simply be
calculated by taking the average
\[ \frac{3.3 + 3.7 + 2.7 + 2.7}{4} = 3.1. \tag{6.3.7} \]
But since the chemistry and calculus courses have higher weights (of 5 and 4 credits respectively), we actually compute
the GPA according to the weighted average
\[ \frac{3.3 \cdot 5 + 3.7 \cdot 4 + 2.7 \cdot 3 + 2.7 \cdot 3}{5 + 4 + 3 + 3} = 3.1\overline{6}. \tag{6.3.8} \]

The weighted average reflects the fact that chemistry and calculus, as courses with higher credits, have a greater impact on
the student’s grade point average. Note particularly that in the weighted average, each grade gets multiplied by its weight,
and we divide by the sum of the weights. In the following activity, we explore further how weighted averages can be used
to find the balancing point of a physical system.
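The GPA computation above can be reproduced with a short Python sketch. The function name `weighted_average` and the variable names are assumptions made for this illustration; the grade points and credits are the ones used in Equations (6.3.7) and (6.3.8).

```python
def weighted_average(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

grade_points = [3.3, 3.7, 2.7, 2.7]
credit_hours = [5, 4, 3, 3]

unweighted = sum(grade_points) / len(grade_points)  # every class counts equally
gpa = weighted_average(grade_points, credit_hours)  # each grade weighted by its credits

print(round(unweighted, 3), round(gpa, 3))
```

Each grade is multiplied by its weight and the result is divided by the sum of the weights, so the two higher-credit courses pull the GPA above the unweighted average.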

Activity 6.3.2:

For quantities of equal weight, such as two children on a teeter-totter, the balancing point is found by taking the
average of their locations. When the weights of the quantities differ, we use a weighted average of their respective
locations to find the balancing point.
a. Suppose that a shelf is 6 feet long, with its left end situated at x = 0. If one book of weight 1 lb is placed at
\( x_1 = 0 \), and another book of weight 1 lb is placed at \( x_2 = 6 \), what is the location of \( \overline{x} \), the point at which the shelf
would (theoretically) balance on a fulcrum?
b. Now, say that we place four books on the shelf, each weighing 1 lb: at \( x_1 = 0 \), at \( x_2 = 2 \), at \( x_3 = 4 \), and at \( x_4 = 6 \).
Find \( \overline{x} \), the balancing point of the shelf.
c. How does \( \overline{x} \) change if we change the location of the third book? Say the locations of the 1-lb books are \( x_1 = 0 \),
\( x_2 = 2 \), \( x_3 = 3 \), and \( x_4 = 6 \).

d. Next, suppose that we place four books on the shelf, but of varying weights: at \( x_1 = 0 \) a 2-lb book, at \( x_2 = 2 \) a 3-lb
book, at \( x_3 = 4 \) a 1-lb book, and at \( x_4 = 6 \) a 1-lb book. Use a weighted average of the locations to find \( \overline{x} \), the
balancing point of the shelf. How does the balancing point in this scenario compare to that found in (b)?
e. What happens if we change the location of one of the books? Say that we keep everything the same in (d), except
that \( x_3 = 5 \). How does \( \overline{x} \) change?
f. What happens if we change the weight of one of the books? Say that we keep everything the same in (d), except
that the book at \( x_3 = 4 \) now weighs 2 lbs. How does \( \overline{x} \) change?
g. Experiment with a couple of different scenarios of your choosing where you move the location of one of the books
to the left, or you decrease the weight of one of the books.
h. Write a couple of sentences to explain how adjusting the location of one of the books or the weight of one of the
books affects the location of the balancing point of the shelf. Think carefully here about how your changes should
be considered relative to the location of the balancing point \( \overline{x} \) of the current scenario.
Center of Mass
In Activity 6.3.2, we saw that the balancing point of a system of point-masses [1] (such as books on a shelf) is found by
taking a weighted average of their respective locations. In the activity, we were computing the center of mass of a system
of masses distributed along an axis, which is the balancing point of the axis on which the masses rest.
For a collection of n masses \( m_1, \ldots, m_n \) that are distributed along a single axis at the locations \( x_1, \ldots, x_n \), the center of
mass is given by
\[ \overline{x} = \frac{x_1 m_1 + x_2 m_2 + \cdots + x_n m_n}{m_1 + m_2 + \cdots + m_n}. \]
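The weighted-average formula for point-masses translates directly into code. The sketch below is illustrative: `center_of_mass` is an assumed helper name, and the three masses and locations are hypothetical values chosen for the example rather than data from the activity.

```python
def center_of_mass(locations, masses):
    """Weighted average of locations, with each location weighted by its mass."""
    if len(locations) != len(masses):
        raise ValueError("locations and masses must have the same length")
    total_mass = sum(masses)
    if total_mass <= 0:
        raise ValueError("total mass must be positive")
    moment = sum(x * m for x, m in zip(locations, masses))
    return moment / total_mass

# Hypothetical system (not from the text): three point-masses on an axis.
locations = [0.0, 4.0, 10.0]
masses = [2.0, 1.0, 3.0]

x_bar = center_of_mass(locations, masses)                 # (0*2 + 4*1 + 10*3) / 6
x_bar_equal = center_of_mass(locations, [1.0, 1.0, 1.0])  # ordinary average of the locations

print(x_bar, x_bar_equal)
```

Because the heaviest mass sits at x = 10, the balancing point moves to the right of the ordinary average of the three locations.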

What if we instead consider a thin bar over which density is distributed continuously? If the density is constant, it is
obvious that the balancing point of the bar is its midpoint. But if density is not constant, we must compute a weighted
average. Let’s say that the function ρ(x) tells us the density distribution along the bar, measured in g/cm. If we slice the
bar into small sections, this enables us to think of the bar as holding a collection of adjacent point-masses. For a slice of
thickness Δx at location \( x_i \), note that the mass of the slice, \( m_i \), satisfies \( m_i \approx \rho(x_i)\,\Delta x \).

[1] In the activity, we actually used weight rather than mass. Since weight is computed by the
gravitational constant times mass, the computations for the balancing point result in the same location regardless of
whether we use weight or mass, since the gravitational constant is present in both the numerator and denominator of the
weighted average.


i i i

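Taking n slices of the bar, we can approximate its center of mass by
\[ \overline{x} \approx \frac{x_1 \cdot \rho(x_1)\Delta x + x_2 \cdot \rho(x_2)\Delta x + \cdots + x_n \cdot \rho(x_n)\Delta x}{\rho(x_1)\Delta x + \rho(x_2)\Delta x + \cdots + \rho(x_n)\Delta x}. \]
Rewriting the sums in sigma notation, it follows that
\[ \overline{x} \approx \frac{\sum_{i=1}^{n} x_i \cdot \rho(x_i)\Delta x}{\sum_{i=1}^{n} \rho(x_i)\Delta x}. \tag{6.3.9} \]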

Moreover, it is apparent that the greater the number of slices, the more accurate our estimate of the balancing point will be,
and that the sums in Equation (6.3.9) can be viewed as Riemann sums. Hence, in the limit as n → ∞ , we find that the
center of mass is given by the quotient of two integrals.
For a thin rod of density ρ(x) distributed along an axis from x = a to x = b, the center of mass of the rod is given by
\[ \overline{x} = \frac{\int_a^b x\rho(x)\,dx}{\int_a^b \rho(x)\,dx}. \]
Note particularly that the denominator of \( \overline{x} \) is the mass of the bar, and that this quotient of integrals is simply the
continuous version of the weighted average of locations, x, along the bar.
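The quotient of integrals can be approximated by replacing each integral with a Riemann sum, exactly as in Equation (6.3.9). The following Python sketch assumes a hypothetical density \( \rho(x) = 1 + x \) g/cm on a bar from x = 0 to x = 2 cm; the density, the helper names `integrate` and `center_of_mass_of_bar`, and the midpoint rule are all choices made for illustration.

```python
def integrate(f, a, b, n=10_000):
    """Midpoint Riemann sum approximating the integral of f from a to b."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def center_of_mass_of_bar(density, a, b):
    """Quotient of the integral of x * rho(x) and the integral of rho(x) over [a, b]."""
    mass = integrate(density, a, b)
    moment = integrate(lambda x: x * density(x), a, b)
    return moment / mass

# Hypothetical density in g/cm (an assumption for this example, not from the text).
def rho(x):
    return 1 + x

x_bar = center_of_mass_of_bar(rho, 0.0, 2.0)
exact = (14 / 3) / 4  # integral of x + x^2 on [0, 2] is 14/3; the mass is 4

print(x_bar, exact)
```

Since this bar is denser toward its right end, the computed center of mass, 7/6, lies to the right of the midpoint x = 1.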

Activity 6.3.3:

Consider a thin bar of length 20 cm whose density is distributed according to the function ρ(x) = 4 + 0.1x , where
x = 0 represents the left end of the bar. Assume that ρ is measured in g/cm and x is measured in cm.

a. Find the total mass, M , of the bar.


b. Without doing any calculations, do you expect the center of mass of the bar to be equal to 10, less than 10, or
greater than 10? Why?
c. Compute \( \overline{x} \), the exact center of mass of the bar.
d. What is the average density of the bar?
e. Now consider a different density function, given by \( p(x) = 4e^{0.020732x} \), also for a bar of length 20 cm whose left
end is at x = 0. Plot both ρ(x) and p(x) on the same axes. Without doing any calculations, which bar do you
expect to have the greater center of mass? Why?
f. Compute the exact center of mass of the bar described in (e) whose density function is \( p(x) = 4e^{0.020732x} \). Check
the result against the prediction you made in (e).

Summary
In this section, we encountered the following important ideas:
For an object of constant density D, with volume V and mass m, we know that m = D ⋅ V .
If an object with constant cross-sectional area (such as a thin bar) has its density distributed along an axis according to
the function ρ(x), then we can find the mass of the object between x = a and x = b by
\[ m = \int_a^b \rho(x)\,dx. \tag{6.3.10} \]
For a system of point-masses distributed along an axis, say \( m_1, \ldots, m_n \) at locations \( x_1, \ldots, x_n \), the center of mass, \( \overline{x} \),
is given by the weighted average
\[ \overline{x} = \frac{\sum_{i=1}^{n} x_i m_i}{\sum_{i=1}^{n} m_i}. \tag{6.3.11} \]
If instead we have mass continuously distributed along an axis, such as by a density function ρ(x) for a thin bar of
constant cross-sectional area, the center of mass of the portion of the bar between x = a and x = b is given by
\[ \overline{x} = \frac{\int_a^b x\rho(x)\,dx}{\int_a^b \rho(x)\,dx}. \tag{6.3.12} \]
In each situation, \( \overline{x} \) represents the balancing point of the system of masses or of the portion of the bar.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

6.4: Physics Applications - Work, Force, and Pressure
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How do we measure the work accomplished by a varying force that moves an object a certain distance?
What is the total force exerted by water against a dam?
How are both of the above concepts and their corresponding use of definite integrals similar to problems we have
encountered in the past involving formulas such as “distance equals rate times time” and “mass equals density
times volume”?

In our work to date with the definite integral, we have seen several different circumstances where the integral enables us to
measure the accumulation of a quantity that varies, provided the quantity is approximately constant over small intervals.
For instance, based on the fact that the area of a rectangle is A = l ⋅ w, if we wish to find the area bounded by a
nonnegative curve y = f(x) and the x-axis on an interval [a, b], a representative slice of width Δx has area
\( A_{\text{slice}} = f(x)\Delta x \), and thus as we let the width of the representative slice tend to zero, we find that the exact area of the
region is
\[ A = \int_a^b f(x)\,dx. \tag{6.4.1} \]

In a similar way, if we know that the velocity of a moving object is given by the function y = v(t), and we wish to know
the distance the object travels on an interval [a, b] where v(t) is nonnegative, we can use a definite integral to generalize
the fact that d = r ⋅ t when the rate, r, is constant. More specifically, on a short time interval Δt, v(t) is roughly constant,
and hence for a small slice of time, \( d_{\text{slice}} = v(t)\Delta t \), and so as the width of the time interval Δt tends to zero, the exact
distance traveled is given by the definite integral
\[ d = \int_a^b v(t)\,dt. \tag{6.4.2} \]

Finally, when we recently learned about the mass of an object of non-constant density, we saw that since M = D ⋅ V
(mass equals density times volume, provided that density is constant), if we can consider a small slice of an object on
which the density is approximately constant, a definite integral may be used to determine the exact mass of the object. For
instance, if we have a thin rod whose cross sections have constant density, but whose density is distributed along the x-axis
according to the function y = ρ(x), it follows that for a small slice of the rod that is Δx thick, \( M_{\text{slice}} = \rho(x)\Delta x \). In the
limit as Δx → 0, we then find that the total mass is given by
\[ M = \int_a^b \rho(x)\,dx. \tag{6.4.3} \]

Note that all three of these situations are similar in that we have a basic rule (A = l ⋅ w, d = r ⋅ t, M = D ⋅ V ) where one
of the two quantities being multiplied is no longer constant; in each, we consider a small interval for the other variable in
the formula, calculate the approximate value of the desired quantity (area, distance, or mass) over the small interval, and
then use a definite integral to sum the results as the length of the small intervals is allowed to approach zero. It should be
apparent that this approach will work effectively for other situations where we have a quantity of interest that varies. We
next turn to the notion of work: from physics, a basic principle is that work is the product of force and distance. For
example, if a person exerts a force of 20 pounds to lift a 20-pound weight 4 feet off the ground, the total work
accomplished is
\begin{align*}
W &= F \cdot d \tag{6.4.4} \\
  &= 20 \cdot 4 \tag{6.4.5} \\
  &= 80 \text{ foot-pounds}. \tag{6.4.6}
\end{align*}
If force and distance are measured in English units (pounds and feet), then the units on work are foot-pounds. If instead we
work in metric units, where forces are measured in Newtons and distances in meters, the units on work are Newton-meters.
Figure 6.14: Three settings where we compute the accumulation of a varying quantity: the area under y = f (x), the
distance traveled by an object with velocity y = v(t) , and the mass of a bar with density function y = ρ(x).
Of course, the formula W = F ⋅ d only applies when the force is constant while it is exerted over the distance d . In
Preview Activity 6.4, we explore one way that we can use a definite integral to compute the total work accomplished when
the force exerted varies.

Preview Activity 6.4.1

A bucket is being lifted from the bottom of a 50-foot deep well; its weight (including the water), B , in pounds at a
height h feet above the water is given by the function B(h) . When the bucket leaves the water, the bucket and water
together weigh B(0) = 20 pounds, and when the bucket reaches the top of the well, B(50) = 12 pounds. Assume that
the bucket loses water at a constant rate (as a function of height, h ) throughout its journey from the bottom to the top
of the well.
a. Find a formula for B(h) .
b. Compute the value of the product B(5)Δh , where Δh = 2 feet. Include units on your answer. Explain why this
product represents the approximate work it took to move the bucket of water from h = 5 to h = 7 .
c. Is the value in (b) an over- or under-estimate of the actual amount of work it took to move the bucket from h = 5
to h = 7 ? Why?
d. Compute the value of the product B(22)Δh, where Δh = 0.25 feet. Include units on your answer. What is the
meaning of the value you found?
e. More generally, what does the quantity \( W_{\text{slice}} = B(h)\Delta h \) measure for a given value of h and a small positive
value of Δh?
f. Evaluate the definite integral \( \int_0^{50} B(h)\,dh \). What is the meaning of the value you find? Why?

Work
Because work is calculated by the rule W = F ⋅ d , whenever the force F is constant, it follows that we can use a definite
integral to compute the work accomplished by a varying force. For example, suppose that in a setting similar to the
problem posed in Preview Activity 6.4, we have a bucket being lifted in a 50-foot well whose weight at height h is given
by
\[ B(h) = 12 + 8e^{-0.1h}. \tag{6.4.7} \]

In contrast to the problem in the preview activity, this bucket is not leaking at a constant rate; but because the weight of the
bucket and water is not constant, we have to use a definite integral to determine the total work that results from lifting the
bucket. Observe that at a height h above the water, the approximate work to move the bucket a small distance Δh is
\[ W_{\text{slice}} = B(h)\Delta h = (12 + 8e^{-0.1h})\Delta h. \tag{6.4.8} \]

Hence, if we let Δh tend to 0 and take the sum of all of the slices of work accomplished on these small intervals, it follows
that the total work is given by

\[ W = \int_0^{50} B(h)\,dh = \int_0^{50} (12 + 8e^{-0.1h})\,dh. \tag{6.4.9} \]

While it is a straightforward exercise to evaluate this integral exactly using the First Fundamental Theorem of Calculus, in
applied settings such as this one we will typically use computing technology to find accurate approximations of integrals
that are of interest to us. Here, it turns out that
\[ W = \int_0^{50} (12 + 8e^{-0.1h})\,dh \approx 679.461 \text{ foot-pounds}. \tag{6.4.10} \]

Our work in Preview Activity 6.4 and in the most recent example above employs the following important general
principle.
For an object being moved in the positive direction along an axis, x, by a force F(x), the total work to move the object
from a to b is given by
\[ W = \int_a^b F(x)\,dx. \tag{6.4.11} \]
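As one example of using computing technology here, the bucket integral from Equation (6.4.10) can be approximated in a few lines of Python. This is a sketch under stated assumptions: `work_done` is a hypothetical helper, and composite Simpson's rule is simply one convenient numerical method, not a method prescribed by the text.

```python
import math

def work_done(force, a, b, n=1_000):
    """Composite Simpson's rule for the integral of force(x) dx from a to b (n must be even)."""
    if n % 2 != 0:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a) / n
    total = force(a) + force(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * force(a + i * h)
    return total * h / 3

# Weight of the bucket (in pounds) at height h feet, from Equation (6.4.7).
def bucket_weight(h):
    return 12 + 8 * math.exp(-0.1 * h)

work = work_done(bucket_weight, 0.0, 50.0)

# Exact value from the antiderivative 12h - 80e^(-0.1h), evaluated from 0 to 50.
exact = 600 + 80 * (1 - math.exp(-5))

print(f"{work:.3f} foot-pounds (exact: {exact:.3f})")
```

Both the numerical approximation and the antiderivative agree with the value of roughly 679.461 foot-pounds quoted above.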

Activity 6.4.1

Consider the following situations in which a varying force accomplishes work.


a. Suppose that a heavy rope hangs over the side of a cliff. The rope is 200 feet long and weighs 0.3 pounds per foot;
initially the rope is fully extended. How much work is required to haul in the entire length of the rope? (Hint: set up
a function F (h) whose value is the weight of the rope remaining over the cliff after h feet have been hauled in.)
b. A leaky bucket is being hauled up from a 100 foot deep well. When lifted from the water, the bucket and water
together weigh 40 pounds. As the bucket is being hauled upward at a constant rate, the bucket leaks water at a
constant rate so that it is losing weight at a rate of 0.1 pounds per foot. What function B(h) tells the weight of the
bucket after the bucket has been lifted h feet? What is the total amount of work accomplished in lifting the bucket
to the top of the well?
c. Now suppose that the bucket in (b) does not leak at a constant rate, but rather that its weight at a height h feet
above the water is given by \( B(h) = 25 + 15e^{-0.05h} \). What is the total work required to lift the bucket 100 feet?
What is the average force exerted on the bucket on the interval h = 0 to h = 100?
d. From physics, Hooke’s Law for springs states that the amount of force required to hold a spring that is compressed
(or extended) to a particular length is proportionate to the distance the spring is compressed (or extended) from its
natural length. That is, the force to compress (or extend) a spring x units from its natural length is F (x) = kx for
some constant k (which is called the spring constant.) For springs, we choose to measure the force in pounds and
the distance the spring is compressed in feet. Suppose that a force of 5 pounds extends a particular spring 4 inches
(1/3 foot) beyond its natural length.
i. Use the given fact that F (1/3) = 5 to find the spring constant k .
ii. Find the work done to extend the spring from its natural length to 1 foot beyond its natural length.
iii. Find the work required to extend the spring from 1 foot beyond its natural length to 1.5 feet beyond its natural
length.

Work: Pumping Liquid from a Tank


In certain geographic locations where the water table is high, residential homes with basements have a peculiar feature: in
the basement, one finds a large hole in the floor, and in the hole, there is water. For example, in Figure 6.15 we see a
sump crock.

Figure 6.15: A sump crock. Image credit to www.warreninspect.com/basement-moisture.
Essentially, a sump crock provides an outlet for water that may build up beneath the basement floor; of course, as that
water rises, it is imperative that the water not flood the basement. Hence, in the crock we see the presence of a floating
pump that sits on the surface of the water: this pump is activated by elevation, so when the water level reaches a particular
height, the pump turns on and pumps a certain portion of the water out of the crock, hence relieving the water buildup
beneath the foundation. One of the questions we’d like to answer is: how much work does a sump pump accomplish? To
that end, let’s suppose that we have a sump crock that has the shape of a frustum of a cone, as pictured in Figure 6.16.
Assume that the crock has a diameter of 3 feet at its surface, a diameter of 1.5 feet at its base, and a depth of 4 feet. In
addition, suppose that the sump pump is set up so that it pumps the water vertically up a pipe to a drain that is located at
ground level just outside a basement window. To accomplish this, the pump must send the water to a location 9 feet above
the surface of the sump crock.

Figure 6.16: A sump crock with approximately cylindrical cross-sections that is 4 feet deep, 1.5 feet in diameter at its base,
and 3 feet in diameter at its top.
It turns out to be advantageous to think of the depth below the surface of the crock as being the independent variable, so, in
problems such as this one we typically let the positive x-axis point down, and the positive y -axis to the right, as pictured in
the figure. As we think about the work that the pump does, we first realize that the pump sits on the surface of the water, so
it makes sense to think about the pump moving the water one “slice” at a time, where it takes a thin slice from the surface,
pumps it out of the tank, and then proceeds to pump the next slice below. For the sump crock described in this example,
each slice of water is cylindrical in shape. We see that the radius of each approximately cylindrical slice varies according
to the linear function y = f (x) that passes through the points (0, 1.5) and (4, 0.75), where x is the depth of the particular
slice in the tank; it is a straightforward exercise to find that f (x) = 1.5 − 0.1875x. Now we are prepared to think about
the overall problem in several steps:
a. determining the volume of a typical slice;
b. finding the weight of a typical slice, and thus the force that must be exerted on it (we assume that the weight
density of water is 62.4 pounds per cubic foot);
c. deciding the distance that a typical slice moves; and

d. computing the work to move a representative slice. Once we know the work it takes to move one slice, we use a
definite integral over an appropriate interval to find the total work.
Consider a representative cylindrical slice that sits on the surface of the water at a depth of x feet below the top of the
crock. It follows that the approximate volume of that slice is given by
\[V_{\text{slice}} = \pi f(x)^2 \, \Delta x = \pi (1.5 - 0.1875x)^2 \, \Delta x.\]
Since water weighs 62.4 lb/ft\(^3\), it follows that the approximate weight of a representative slice, which is also the
approximate force the pump must exert to move the slice, is
\[F_{\text{slice}} = 62.4 \cdot V_{\text{slice}} = 62.4\pi (1.5 - 0.1875x)^2 \, \Delta x.\]
Because the slice is located at a depth of x feet below the top of the crock, the slice being moved by the pump must move
x feet to get to the level of the basement floor, and then, as stated in the problem description, be moved another 9 feet to
reach the drain at ground level outside a basement window. Hence, the total distance a representative slice travels is
\(d_{\text{slice}} = x + 9\).
Finally, we note that the work to move a representative slice is given by
\[W_{\text{slice}} = F_{\text{slice}} \cdot d_{\text{slice}} = 62.4\pi (1.5 - 0.1875x)^2 \, \Delta x \cdot (x + 9),\]
since the force to move a particular slice is constant. We sum the work required to move slices throughout the tank (from
x = 0 to x = 4), let Δx → 0, and hence
\[W = \int_{0}^{4} 62.4\pi (1.5 - 0.1875x)^2 (x + 9) \, dx,\]
which, when evaluated using appropriate technology, shows that the total work is W = 3463.2π ≈ 10,880 foot-pounds.
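As a rough check on the integral above, the following sketch evaluates it numerically with a hand-written composite Simpson's rule, so no external libraries are needed. The helper names (`simpson`, `radius`, `work_density`) are illustrative choices and not part of the text; the constants are taken directly from the sump crock description.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

WEIGHT_DENSITY = 62.4    # lb/ft^3, weight density of water
LIFT_ABOVE_CROCK = 9.0   # ft from the top of the crock to the drain

def radius(x):
    """Radius (ft) of the slice at depth x feet: f(x) = 1.5 - 0.1875x."""
    return 1.5 - 0.1875 * x

def work_density(x):
    """Integrand: weight per foot of depth times the distance the slice travels."""
    return WEIGHT_DENSITY * math.pi * radius(x) ** 2 * (x + LIFT_ABOVE_CROCK)

total_work = simpson(work_density, 0.0, 4.0)
print(f"W = {total_work / math.pi:.1f} * pi = {total_work:.1f} foot-pounds")
```

Because the integrand is a cubic polynomial in x, Simpson's rule is exact here up to floating-point rounding, so the printed value can be compared directly with a hand computation.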
The preceding example demonstrates the standard approach to finding the work required to empty a tank filled with liquid.
The main task in each such problem is to determine the volume of a representative slice, followed by the force exerted on
the slice, as well as the distance such a slice moves. In the case where the units are metric, there is one key difference: in
the metric setting, rather than weight, we normally first find the mass of a slice. For instance, if distance is measured in
meters, the mass density of water is 1000 kg/m\(^3\). In that setting, we can find the mass of a typical slice (in kg). To
determine the force required to move it, we use F = ma, where m is the object’s mass and a is the gravitational acceleration,
9.81 m/s\(^2\) (equivalently, 9.81 N/kg). That is, in metric units, the weight density of water is 9810 N/m\(^3\).

Activity 6.4.2

In each of the following problems, determine the total work required to accomplish the described task. In parts (b) and
(c), a key step is to find a formula for a function that describes the curve that forms the side boundary of the tank.

Figure 6.17: A trough with triangular ends, as described in Activity 6.4.2, part (c).
a. Consider a vertical cylindrical tank of radius 2 meters and depth 6 meters. Suppose the tank is filled with 4 meters
of water of mass density 1000 kg/m\(^3\), and the top 1 meter of water is pumped over the top of the tank.
b. Consider a hemispherical tank with a radius of 10 feet. Suppose that the tank is full to a depth of 7 feet with water
of weight density 62.4 pounds/ft\(^3\), and the top 5 feet of water are pumped out of the tank to a tanker truck whose
height is 5 feet above the top of the tank.

c. Consider a trough with triangular ends, as pictured in Figure 6.17, where the tank is 10 feet long, the top is 5 feet
wide, and the tank is 4 feet deep. Say that the trough is full to within 1 foot of the top with water of weight density
62.4 pounds/ft\(^3\), and a pump is used to empty the tank until the water remaining in the tank is 1 foot deep.

Force due to Hydrostatic Pressure


When a dam is built, it is imperative for engineers to understand how much force water will exert against the face of the
dam. The first thing we realize is that the force exerted by the fluid is related to the natural concept of pressure. The
pressure a force exerts on a region is measured in units of force per unit of area: for example, the air pressure in a tire is
often measured in pounds per square inch (PSI). Hence, we see that the general relationship is given by
\[P = \frac{F}{A}, \quad \text{or} \quad F = P \cdot A,\]
where P represents pressure, F represents force, and A the area of the region being considered. Of course, in the equation F
= PA, we assume that the pressure is constant over the entire region A.
Most people know from experience that the deeper one dives underwater while swimming, the greater the pressure that is
exerted by the water. This is due to the fact that the deeper one dives, the more water there is right on top of the swimmer:
it is the force that “column” of water exerts that determines the pressure the swimmer experiences. To get water pressure
measured in its standard units (pounds per square foot), we say that the total water pressure is found by computing the total
weight of the column of water that lies above a region of area 1 square foot at a fixed depth. Such a rectangular column
with a 1 × 1 base and a depth of d feet has volume V = 1 · 1 · d ft\(^3\), and thus the corresponding weight of the water
overhead is 62.4d pounds. Since this is also the amount of force being exerted on a 1 square foot region at a depth d feet
underwater, we see that P = 62.4d (lb/ft\(^2\)) is the pressure exerted by water at depth d.
The understanding that P = 62.4d will tell us the pressure exerted by water at a depth of d, along with the fact that F = PA,
will now enable us to compute the total force that water exerts on a dam, as we see in the following example.

Example 6.4.3

Consider a trapezoid-shaped dam that is 60 feet wide at its base and 90 feet wide at its top, and assume the dam is 25
feet tall with water that rises to within 5 feet of the top of its face. Water weighs 62.4 pounds per cubic foot. How
much force does the water exert against the dam?
Solution
First, we sketch a picture of the dam, as shown in Figure 6.18. Note that, as in problems involving the work to pump
out a tank, we let the positive x-axis point down.
It is essential to use the fact that pressure is constant at a fixed depth. Hence, we consider a slice of water at constant
depth on the face, such as the one shown in the figure. First, the approximate area of this slice is the area of the
pictured rectangle. Since the width of that rectangle depends on the variable x (which represents how far the slice
lies from the top of the dam), we find a formula for the function y = f(x) that determines one side of the face of the
dam. Since f is linear, it is straightforward to find that \(y = f(x) = 45 - \frac{3}{5}x\). Hence, the approximate area of a
representative slice is
\[A_{\text{slice}} = 2f(x) \, \Delta x = 2\left(45 - \frac{3}{5}x\right) \Delta x.\]

Figure 6.18: A trapezoidal dam that is 25 feet tall, 60 feet wide at its base, 90 feet wide at its top, with the water line 5
feet down from the top of its face.
At any point on this slice, the depth is approximately constant, and thus the pressure can be considered constant. In
particular, we note that since x measures the distance to the top of the dam, and because the water rises to within 5 feet
of the top of the dam, the depth of any point on the representative slice is approximately (x − 5). Now, since pressure
is given by P = 62.4d, we have that at any point on the representative slice
\[P_{\text{slice}} = 62.4(x - 5).\]
Knowing both the pressure and area, we can find the force the water exerts on the slice. Using F = PA, it follows that
\[F_{\text{slice}} = P_{\text{slice}} \cdot A_{\text{slice}} = 62.4(x - 5) \cdot 2\left(45 - \frac{3}{5}x\right) \Delta x.\]

Finally, we use a definite integral to sum the forces over the appropriate range of x-values. Since the water rises to
within 5 feet of the top of the dam, we start at x = 5 and slice all the way to the bottom of the 25-foot-tall dam, where
x = 25. Hence,
\[F = \int_{x=5}^{x=25} 62.4(x - 5) \cdot 2\left(45 - \frac{3}{5}x\right) dx.\]
Using technology to evaluate the integral, we find F = 848,640 ≈ \(8.49 \times 10^5\) pounds.
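The slicing argument in this example can be sketched numerically as follows. This is only an illustration under the stated geometry: the limits run from the water line (x = 5) to the bottom of the 25-foot face, and the names `simpson`, `half_width`, and `force_density` are hypothetical helpers introduced here.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using n subintervals (n is made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

WEIGHT_DENSITY = 62.4  # lb/ft^3
WATER_LINE = 5.0       # ft below the top of the dam
DAM_HEIGHT = 25.0      # ft, so the bottom of the face is at x = 25

def half_width(x):
    """Half the width (ft) of the dam face at distance x from the top: f(x) = 45 - (3/5)x."""
    return 45.0 - 0.6 * x

def force_density(x):
    """Pressure at depth (x - 5) times the width of the slice, per foot of x."""
    pressure = WEIGHT_DENSITY * (x - WATER_LINE)
    return pressure * 2.0 * half_width(x)

total_force = simpson(force_density, WATER_LINE, DAM_HEIGHT)
print(f"F = {total_force:,.0f} pounds")
```

The width check `2 * half_width(25)` returns 60, matching the stated base width, which is a quick way to confirm that the upper limit of integration agrees with the dam's dimensions.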

Activity 6.4.4

In each of the following problems, determine the total force exerted by water against the surface that is described.
a. Consider a rectangular dam that is 100 feet wide and 50 feet tall, and suppose that water presses against the dam all
the way to the top.
b. Consider a semicircular dam with a radius of 30 feet. Suppose that the water rises to within 10 feet of the top of the
dam.
c. Consider a trough with triangular ends, as pictured in Figure 6.17, where the tank is 10 feet long, the top is 5 feet
wide, and the tank is 4 feet deep. Say that the trough is full to within 1 foot of the top with water of weight density
62.4 pounds/ft\(^3\). How much force does the water exert against one of the triangular ends?

While there are many different formulas that we use in solving problems involving work, force, and pressure, it is
important to understand that the fundamental ideas behind these problems are similar to several others that we’ve
encountered in applications of the definite integral. In particular, the basic idea is to take a difficult problem and somehow
slice it into more manageable pieces that we understand, and then use a definite integral to add up these simpler pieces.

Summary
In this section, we encountered the following important ideas:
To measure the work accomplished by a varying force that moves an object, we subdivide the problem into pieces on
which we can use the formula W = F · d, and then use a definite integral to sum the work accomplished on each piece.
To find the total force exerted by water against a dam, we use the formula F = P · A to measure the force exerted on a
slice that lies at a fixed depth, and then use a definite integral to sum the forces across the appropriate range of depths.

Because work is computed as the product of force and distance (provided force is constant), and the force water exerts
on a dam can be computed as the product of pressure and area (provided pressure is constant), problems involving
these concepts are similar to earlier problems we did using definite integrals to find distance (via “distance equals rate
times time”) and mass (“mass equals density times volume”).

6.5: Improper Integrals

Learning Objectives

In this section, we strive to understand the ideas generated by the following important questions:
What are improper integrals and why are they important?
What does it mean to say that an improper integral converges or diverges?
What are some typical improper integrals that we can classify as convergent or divergent?

Another important application of the definite integral regards how the likelihood of certain events can be measured. For
example, consider a company that manufactures incandescent light bulbs, and suppose that based on a large volume of test
results, they have determined that the fraction of light bulbs that fail between times t = a and t = b of use (where t is
measured in months) is given by
\[\int_{a}^{b} 0.3e^{-0.3t} \, dt. \tag{6.5.1}\]

For example, the fraction of light bulbs that fail during their third month of use is given by
\[\int_{2}^{3} 0.3e^{-0.3t} \, dt = -e^{-0.3t} \Big|_{2}^{3} = -e^{-0.9} + e^{-0.6} \approx 0.1422.\]
Thus about 14.22% of all lightbulbs fail between t = 2 and t = 3 . Clearly we could adjust the limits of integration to
measure the fraction of light bulbs that fail during any time period of interest.
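To illustrate adjusting the limits of integration, here is a small sketch that uses the antiderivative \(-e^{-0.3t}\) from the computation above. The function name `fraction_failing` is an illustrative choice, not notation from the text.

```python
import math

RATE = 0.3  # failure-rate parameter from the model 0.3 * e^(-0.3 t)

def fraction_failing(a, b, rate=RATE):
    """Fraction of bulbs that fail between months a and b.

    Uses the antiderivative -e^(-rate * t), so the result is
    e^(-rate * a) - e^(-rate * b).
    """
    return math.exp(-rate * a) - math.exp(-rate * b)

third_month = fraction_failing(2, 3)
first_year = fraction_failing(0, 12)
print(f"third month: {third_month:.4f}")
print(f"first year:  {first_year:.4f}")
```

Calling `fraction_failing` with other limits gives the fraction for any time period of interest without repeating the integration by hand.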

Preview Activity 6.5.1:

A company with a large customer base has a call center that receives thousands of calls a day. After studying the data
that represents how long callers wait for assistance, they find that the function \(p(t) = 0.25e^{-0.25t}\) models the time
customers wait in the following way: the fraction of customers who wait between t = a and t = b minutes is given by
\[\int_{a}^{b} p(t) \, dt. \tag{6.5.2}\]

Use this information to answer the following questions.


a. Determine the fraction of callers who wait between 5 and 10 minutes.
b. Determine the fraction of callers who wait between 10 and 20 minutes.
c. Next, let’s study the fraction who wait up to a certain number of minutes:
i. What is the fraction of callers who wait between 0 and 5 minutes?
ii. What is the fraction of callers who wait between 0 and 10 minutes?
iii. Between 0 and 15 minutes? Between 0 and 20?
d. Let F(b) represent the fraction of callers who wait between 0 and b minutes. Find a formula for F(b) that involves a
definite integral, and then use the First FTC to find a formula for F(b) that does not involve a definite integral.
e. What is the value of the limit \(\lim_{b \to \infty} F(b)\)? What is its meaning in the context of the problem?

Improper Integrals Involving Unbounded Intervals


In light of our example with light bulbs that fail, as well as with the problem involving customer wait time in Preview
Activity 6.5.1, we see that it is natural to consider questions where we desire to integrate over an interval whose upper limit
grows without bound. For example, if we are interested in the fraction of light bulbs that fail within the first b months of
use, we know that the expression
\[\int_{0}^{b} 0.3e^{-0.3t} \, dt \tag{6.5.3}\]
measures this value. To think about the fraction of light bulbs that fail eventually, we understand that we wish to find
\[\lim_{b \to \infty} \int_{0}^{b} 0.3e^{-0.3t} \, dt, \tag{6.5.4}\]
for which we will also use the notation
\[\int_{0}^{\infty} 0.3e^{-0.3t} \, dt. \tag{6.5.5}\]

Note particularly that we are studying the area of an unbounded region, as pictured in Figure 6.20.
Anytime we are interested in an integral for which the interval of integration is unbounded (that is, one for which at least
one of the limits of integration involves ∞), we say that the integral is improper.

Figure 6.20: At left, the area bounded by \(p(t) = 0.3e^{-0.3t}\) on the finite interval [0, b]; at right, the result of letting
b → ∞. By “· · ·” in the righthand figure, we mean that the region extends to the right without bound.
For instance, the integrals
\[\int_{1}^{\infty} \frac{1}{x^2} \, dx, \quad \int_{-\infty}^{0} \frac{1}{1 + x^2} \, dx, \quad \text{and} \quad \int_{-\infty}^{\infty} e^{-x^2} \, dx \tag{6.5.6}\]
are all improper due to having limits of integration that involve ∞. We investigate the value of any such integral by
replacing the improper integral with a limit of proper integrals; for an improper integral such as
\(\int_{0}^{\infty} f(x) \, dx\), we write
\[\int_{0}^{\infty} f(x) \, dx = \lim_{b \to \infty} \int_{0}^{b} f(x) \, dx. \tag{6.5.7}\]
We can then attempt to evaluate \(\int_{0}^{b} f(x) \, dx\) using the First FTC, after which we can evaluate the limit. An
immediate and important question arises: is it even possible for the area of such an unbounded region to be finite? The
following activity explores this issue and others in more detail.
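One way to probe this question numerically, using the light bulb model from earlier in the section, is to tabulate the proper integral from 0 to b for larger and larger b. The sketch below relies on the antiderivative \(-e^{-0.3t}\); the name `failed_by` is a hypothetical helper, and a table of values only suggests a limit rather than proving one.

```python
import math

def failed_by(b, rate=0.3):
    """Value of the proper integral of rate * e^(-rate * t) on [0, b]: 1 - e^(-rate * b)."""
    return 1.0 - math.exp(-rate * b)

upper_limits = [1, 10, 50, 100]
values = [failed_by(b) for b in upper_limits]

for b, value in zip(upper_limits, values):
    print(f"b = {b:>3}: {value:.10f}")
```

The tabulated values increase with b but stay below 1, which is consistent with the interpretation that every bulb eventually fails.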

Activity 6.5.1:

In this activity we explore the improper integrals \(\int_{1}^{\infty} \frac{1}{x} \, dx\) and
\(\int_{1}^{\infty} \frac{1}{x^{3/2}} \, dx\).
a. First we investigate \(\int_{1}^{\infty} \frac{1}{x} \, dx\).
i. Use the First FTC to determine the exact values of \(\int_{1}^{10} \frac{1}{x} \, dx\),
\(\int_{1}^{1000} \frac{1}{x} \, dx\), and \(\int_{1}^{100000} \frac{1}{x} \, dx\). Then, use your
calculator to compute a decimal approximation of each result.
ii. Use the First FTC to evaluate the definite integral \(\int_{1}^{b} \frac{1}{x} \, dx\) (which results in an expression
that depends on b).
iii. Now, use your work from (ii) to evaluate the limit given by
\[\lim_{b \to \infty} \int_{1}^{b} \frac{1}{x} \, dx. \tag{6.5.8}\]
b. Next, we investigate \(\int_{1}^{\infty} \frac{1}{x^{3/2}} \, dx\).
i. Use the First FTC to determine the exact values of \(\int_{1}^{10} \frac{1}{x^{3/2}} \, dx\),
\(\int_{1}^{1000} \frac{1}{x^{3/2}} \, dx\), and \(\int_{1}^{100000} \frac{1}{x^{3/2}} \, dx\). Then, use
your calculator to compute a decimal approximation of each result.
ii. Use the First FTC to evaluate the definite integral \(\int_{1}^{b} \frac{1}{x^{3/2}} \, dx\) (which results in an
expression that depends on b).
iii. Now, use your work from (ii) to evaluate the limit given by
\[\lim_{b \to \infty} \int_{1}^{b} \frac{1}{x^{3/2}} \, dx. \tag{6.5.9}\]
c. Plot the functions \(y = \frac{1}{x}\) and \(y = \frac{1}{x^{3/2}}\) on the same coordinate axes for the values
x = 0 … 10. How would you compare their behavior as x increases without bound? What is similar? What is different?
d. How would you characterize the value of \(\int_{1}^{\infty} \frac{1}{x} \, dx\)? of
\(\int_{1}^{\infty} \frac{1}{x^{3/2}} \, dx\)? What does this tell us about the respective
areas bounded by these two curves for x ≥ 1?

Convergence and Divergence


Our work so far has suggested that when we consider a nonnegative function f on an interval [1, ∞), such as
\[f(x) = \frac{1}{x} \tag{6.5.10}\]
or
\[f(x) = \frac{1}{x^{3/2}}, \tag{6.5.11}\]
there are at least two possibilities for the value of \(\lim_{b \to \infty} \int_{1}^{b} f(x) \, dx\): the limit is finite or
infinite. With these possibilities in mind, we introduce the following terminology.

Definition
If f(x) is nonnegative for x ≥ a, then we say that the improper integral \(\int_{a}^{\infty} f(x) \, dx\) converges provided
that
\[\lim_{b \to \infty} \int_{a}^{b} f(x) \, dx \tag{6.5.12}\]
exists and is finite. Otherwise, we say that \(\int_{a}^{\infty} f(x) \, dx\) diverges.

We normally restrict our interest to improper integrals for which the integrand is nonnegative. Further, we note that our
primary interest is in functions f for which \(\lim_{x \to \infty} f(x) = 0\), for if the function f does not approach 0 as
x → ∞, then it is impossible for \(\int_{a}^{\infty} f(x) \, dx\) to converge.

Activity 6.5.2:
Determine whether each of the following improper integrals converges or diverges. For each integral that converges,
find its exact value.
a. \(\int_{1}^{\infty} \frac{1}{x^2} \, dx\)
b. \(\int_{0}^{\infty} e^{-x/4} \, dx\)
c. \(\int_{2}^{\infty} \frac{9}{(x + 5)^{2/3}} \, dx\)
d. \(\int_{4}^{\infty} \frac{3}{(x + 2)^{5/4}} \, dx\)
e. \(\int_{0}^{\infty} x e^{-x/4} \, dx\)
f. \(\int_{1}^{\infty} \frac{1}{x^p} \, dx\), where p is a positive real number.

Improper Integrals Involving Unbounded Integrands


It is also possible for an integral to be improper due to the integrand being unbounded on the interval of integration. For
example, if we consider
\[\int_{0}^{1} \frac{1}{\sqrt{x}} \, dx, \tag{6.5.13}\]
we see that because \(f(x) = \frac{1}{\sqrt{x}}\) has a vertical asymptote at x = 0, f is not continuous on [0, 1], and the
integral is attempting to represent the area of the unbounded region shown at right in Figure 6.21.
Just as we did with improper integrals involving infinite limits, we address the problem of the integrand being unbounded
by replacing such an improper integral with a limit of proper integrals. For example, to evaluate
\(\int_{0}^{1} \frac{1}{\sqrt{x}} \, dx\), we replace 0 with a and let a approach 0 from the right. Thus,
\[\int_{0}^{1} \frac{1}{\sqrt{x}} \, dx = \lim_{a \to 0^+} \int_{a}^{1} \frac{1}{\sqrt{x}} \, dx, \tag{6.5.14}\]
and then we evaluate the proper integral
\[\int_{a}^{1} \frac{1}{\sqrt{x}} \, dx, \tag{6.5.15}\]
followed by taking the limit.

Figure 6.21: At left, the area bounded by \(f(x) = \frac{1}{\sqrt{x}}\) on the finite interval [a, 1]; at right, the result of
letting \(a \to 0^+\), where we see that the shaded region will extend vertically without bound.
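The replacement of the lower limit 0 by a small positive number a can be explored with a short sketch. It assumes the antiderivative \(2\sqrt{x}\) of \(\frac{1}{\sqrt{x}}\); the name `area_from` is a hypothetical helper, and the table of values only suggests what the limit should be.

```python
import math

def area_from(a):
    """Value of the proper integral of 1/sqrt(x) on [a, 1], via the antiderivative 2*sqrt(x)."""
    return 2.0 * math.sqrt(1.0) - 2.0 * math.sqrt(a)

lower_limits = [0.1, 0.01, 1e-4, 1e-8]
areas = [area_from(a) for a in lower_limits]

for a, area in zip(lower_limits, areas):
    print(f"a = {a:<8g} area on [a, 1] = {area:.6f}")
```

Even though the region grows without bound vertically as a shrinks, the computed areas increase toward a finite value.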
In the same way as with improper integrals involving unbounded regions, we will say that the improper integral converges
provided that this limit exists, and diverges otherwise. In the present example, we observe that
\[\begin{aligned}
\int_{0}^{1} \frac{1}{\sqrt{x}} \, dx &= \lim_{a \to 0^+} \int_{a}^{1} \frac{1}{\sqrt{x}} \, dx \\
&= \lim_{a \to 0^+} 2\sqrt{x} \, \Big|_{a}^{1} \\
&= \lim_{a \to 0^+} \left(2\sqrt{1} - 2\sqrt{a}\right) \\
&= 2,
\end{aligned} \tag{6.5.16}\]
and therefore the improper integral \(\int_{0}^{1} \frac{1}{\sqrt{x}} \, dx\) converges (to the value 2). We have to be
particularly careful with unbounded integrands, for they may arise in ways that may not initially be obvious. Consider, for
instance, the integral

\[\int_{1}^{3} \frac{1}{(x - 2)^2} \, dx. \tag{6.5.17}\]

At first glance we might think that we can simply apply the Fundamental Theorem of Calculus by antidifferentiating
\(\frac{1}{(x - 2)^2}\) to get \(-\frac{1}{x - 2}\) and then evaluate from 1 to 3. Were we to do so, we would be erroneously
applying the FTC because \(f(x) = \frac{1}{(x - 2)^2}\) fails to be continuous throughout the interval, as seen in Figure 6.22.
Such an incorrect application of the FTC leads to an impossible result (−2), which would itself suggest that something we
did must be wrong.

Figure 6.22: The function \(f(x) = \frac{1}{(x - 2)^2}\) on an interval including x = 2.

Indeed, we must address the vertical asymptote in \(f(x) = \frac{1}{(x - 2)^2}\) at x = 2 by writing
\[\int_{1}^{3} \frac{1}{(x - 2)^2} \, dx = \lim_{a \to 2^-} \int_{1}^{a} \frac{1}{(x - 2)^2} \, dx + \lim_{b \to 2^+} \int_{b}^{3} \frac{1}{(x - 2)^2} \, dx \tag{6.5.18}\]
and then evaluate two separate limits of proper integrals. For instance, doing so for the integral with a approaching 2 from
the left, we find
\[\begin{aligned}
\int_{1}^{2} \frac{1}{(x - 2)^2} \, dx &= \lim_{a \to 2^-} \int_{1}^{a} \frac{1}{(x - 2)^2} \, dx \\
&= \lim_{a \to 2^-} -\frac{1}{x - 2} \, \Big|_{1}^{a} \\
&= \lim_{a \to 2^-} \left(-\frac{1}{a - 2} + \frac{1}{1 - 2}\right) \\
&= \infty,
\end{aligned} \tag{6.5.19}\]
since \(\frac{1}{a - 2} \to -\infty\) as a approaches 2 from the left. Thus, the improper integral
\(\int_{1}^{2} \frac{1}{(x - 2)^2} \, dx\) diverges; similar work shows that \(\int_{2}^{3} \frac{1}{(x - 2)^2} \, dx\) also
diverges. From either of these two results, we can conclude that the original integral,
\(\int_{1}^{3} \frac{1}{(x - 2)^2} \, dx\), diverges, too.
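The contrast between the naive evaluation and the limit of proper integrals can be sketched as follows. The helper names `antiderivative` and `left_piece` are illustrative, and the listed upper limits are arbitrary sample values approaching 2 from the left.

```python
def antiderivative(x):
    """An antiderivative of 1/(x - 2)^2, valid on any interval that avoids x = 2."""
    return -1.0 / (x - 2.0)

def left_piece(a):
    """Value of the proper integral of 1/(x - 2)^2 on [1, a], for 1 <= a < 2."""
    return antiderivative(a) - antiderivative(1.0)

# Naive (incorrect) use of the FTC across the asymptote at x = 2:
naive = antiderivative(3.0) - antiderivative(1.0)
print(f"naive evaluation from 1 to 3: {naive}")

# Proper integrals on [1, a] as a approaches 2 from the left:
upper_limits = [1.9, 1.99, 1.999, 1.9999]
pieces = [left_piece(a) for a in upper_limits]
for a, piece in zip(upper_limits, pieces):
    print(f"a = {a}: integral on [1, a] = {piece:.1f}")
```

The naive value is negative even though the integrand is positive everywhere it is defined, while the proper integrals grow without bound as a approaches 2.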

Activity 6.5.3:

For each of the following definite integrals, decide whether the integral is improper or not. If the integral is proper,
evaluate it using the First FTC. If the integral is improper, determine whether the integral converges or diverges;
if the integral converges, find its exact value.
a. \(\int_{0}^{1} \frac{1}{x^{1/3}} \, dx\)
b. \(\int_{0}^{2} e^{-x} \, dx\)
c. \(\int_{1}^{4} \frac{1}{\sqrt{4 - x}} \, dx\)
d. \(\int_{-2}^{2} \frac{1}{x^2} \, dx\)
e. \(\int_{0}^{\pi/2} \tan(x) \, dx\)
f. \(\int_{0}^{1} \frac{1}{\sqrt{1 - x^2}} \, dx\)

Summary
In this section, we encountered the following important ideas:
An integral \(\int_{a}^{b} f(x) \, dx\) can be improper if at least one of a or b is ±∞, making the interval unbounded, or if
f has a vertical asymptote at x = c for some value of c that satisfies a ≤ c ≤ b. One reason that improper integrals are
important is that certain probabilities can be represented by integrals that involve infinite limits.
When we encounter an improper integral, we work to understand it by replacing the improper integral with a limit of
proper integrals. For instance, we write
\[\int_{a}^{\infty} f(x) \, dx = \lim_{b \to \infty} \int_{a}^{b} f(x) \, dx,\]
and then work to determine whether the limit exists and is finite. For any improper integral, if the resulting limit of proper
integrals exists and is finite, we say the improper integral converges. Otherwise, the improper integral diverges.
An important class of improper integrals is given by
\[\int_{1}^{\infty} \frac{1}{x^p} \, dx \tag{6.5.20}\]
where p is a positive real number. We can show that this improper integral converges whenever p > 1, and diverges
whenever 0 < p ≤ 1. A related class of improper integrals is \(\int_{0}^{1} \frac{1}{x^p} \, dx\), which converges for
0 < p < 1, and diverges for p ≥ 1.
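The behavior of the first class of integrals can be illustrated numerically. The sketch below evaluates the proper integral of \(\frac{1}{x^p}\) on [1, b] from its antiderivative for a few sample values of p; the function name `integral_to_b` and the chosen values of p and b are assumptions made for illustration only.

```python
import math

def integral_to_b(p, b):
    """Value of the proper integral of 1/x^p on [1, b], for p > 0 and b >= 1."""
    if p == 1:
        return math.log(b)                 # antiderivative is ln(x)
    return (b ** (1 - p) - 1) / (1 - p)    # antiderivative is x^(1-p) / (1-p)

upper_limits = [10, 1e3, 1e6]
for p in (0.5, 1, 2):
    row = ", ".join(f"{integral_to_b(p, b):.4f}" for b in upper_limits)
    print(f"p = {p}: {row}")
```

For p = 2 the values level off near a finite number as b grows, while for p = 1 and p = 0.5 they keep increasing, in line with the classification stated above.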

6.E: Using Definite Integrals (Exercises)
6.1: Using Definite Integrals to Find Area and Length
Exercises
1. Find the exact area of each described region.
a. The finite region between the curves \(x = y(y - 2)\) and \(x = -(y - 1)(y - 3)\).
b. The region between the sine and cosine functions on the interval \([\frac{\pi}{4}, \frac{3\pi}{4}]\).
c. The finite region between \(x = y^2 - y - 2\) and \(y = 2x - 1\).
d. The finite region between \(y = mx\) and \(y = x^2 - 1\), where m is a positive constant.
2. Let \(f(x) = 1 - x^2\) and \(g(x) = ax^2 - a\), where a is an unknown positive real number. For what value(s) of a is the
area between the curves f and g equal to 2?
3. Let \(f(x) = 2 - x^2\). Recall that the average value of any continuous function f on an interval [a, b] is
given by \(\frac{1}{b - a} \int_{a}^{b} f(x) \, dx\).
a. Find the average value of \(f(x) = 2 - x^2\) on the interval \([0, \sqrt{2}]\). Call this value r.
b. Sketch a graph of y = f(x) and y = r. Find their intersection point(s).
c. Show that on the interval \([0, \sqrt{2}]\), the amount of area that lies below y = f(x) and above y = r is equal to the
amount of area that lies below y = r and above y = f(x).
d. Will the result of (c) be true for any continuous function and its average value on any interval? Why?

6.2 Using Definite Integrals to Find Volume


1. Consider the curve \(f(x) = 3\cos(\frac{x^3}{4})\) and the portion of its graph that lies in the first quadrant between the y-axis and
the first positive value of x for which f (x) = 0. Let R denote the region bounded by this portion of f , the x-axis, and the y-
axis.
a. Set up a definite integral whose value is the exact arc length of f that lies along the upper boundary of R. Use
technology appropriately to evaluate the integral you find.
b. Set up a definite integral whose value is the exact area of R. Use technology appropriately to evaluate the integral you
find.
c. Suppose that the region R is revolved around the x-axis. Set up a definite integral whose value is the exact volume of
the solid of revolution that is generated. Use technology appropriately to evaluate the integral you find.
d. Suppose instead that R is revolved around the y-axis. If possible, set up an integral expression whose value is the exact
volume of the solid of revolution and evaluate the integral using appropriate technology. If not possible, explain why.
2. Consider the curves given by y = sin(x) and y = cos(x). For each of the following problems, you should include a sketch
of the region/solid being considered, as well as a labeled representative slice.
Sketch the region R bounded by the y-axis and the curves y = sin(x) and y = cos(x) up to the first positive value of x at
which they intersect. What is the exact intersection point of the curves?
a. Set up a definite integral whose value is the exact area of R.
b. Set up a definite integral whose value is the exact volume of the solid of revolution generated by revolving R about the
x-axis.
c. Set up a definite integral whose value is the exact volume of the solid of revolution generated by revolving R about the
y-axis.
d. Set up a definite integral whose value is the exact volume of the solid of revolution generated by revolving R about the
line y = 2.
e. Set up a definite integral whose value is the exact volume of the solid of revolution generated by revolving R about the
line x = −1.
3. Consider the finite region R that is bounded by the curves \(y = 1 + \frac{1}{2}(x - 2)^2\), \(y = \frac{1}{2}x^2\), and x = 0.
a. Determine a definite integral whose value is the area of the region enclosed by the two curves.
b. Find an expression involving one or more definite integrals whose value is the volume of the solid of revolution
generated by revolving the region R about the line y = −1.
c. Determine an expression involving one or more definite integrals whose value is the volume of the solid of revolution
generated by revolving the region R about the y-axis.
d. Find an expression involving one or more definite integrals whose value is the perimeter of the region R.

6.3 Density, Mass, and Center of Mass
Exercises
1. Let a thin rod of length a have density distribution function \(\rho(x) = 10e^{-0.1x}\), where x is measured in cm and ρ in
grams per centimeter.
a. If the mass of the rod is 30 g, what is the value of a?
b. For the 30g rod, will the center of mass lie at its midpoint, to the left of the midpoint, or to the right of the midpoint?
Why?
c. For the 30g rod, find the center of mass, and compare your prediction in (b).
d. At what value of x should the 30g rod be cut in order to form two pieces of equal mass?
2. Consider two thin bars of constant cross-sectional area, each of length 10 cm, with respective mass density functions
\(\rho(x) = \frac{1}{1 + x^2}\) and \(p(x) = e^{-0.1x}\).
a. Find the mass of each bar.
b. Find the center of mass of each bar.
c. Now consider a new 10 cm bar whose mass density function is f (x) = ρ(x) + p(x).
i. Explain how you can easily find the mass of this new bar with little to no additional work.
ii. Similarly, compute \(\int_{0}^{10} x f(x) \, dx\) as simply as possible, in light of earlier computations.
iii. True or false: the center of mass of this new bar is the average of the centers of mass of the two earlier bars.
Write at least one sentence to say why your conclusion makes sense.
3. Consider the curve given by \(y = f(x) = 2xe^{-1.25x} + (30 - x)e^{-0.25(30 - x)}\).
a. Plot this curve in the window x = 0 . . . 30, y = 0 . . . 3 (with constrained scaling so the units on the x and y axis are
equal), and use it to generate a solid of revolution about the x-axis. Explain why this curve could generate a reasonable
model of a baseball bat.
b. Let x and y be measured in inches. Find the total volume of the baseball bat generated by revolving the given curve
about the x-axis. Include units on your answer.
c. Suppose that the baseball bat has constant weight density, and that the weight density is 0.6 ounces per cubic inch. Find
the total weight of the bat whose volume you found in (b).
d. Because the baseball bat does not have constant cross-sectional area, we see that the amount of weight concentrated at
a location x along the bat is determined by the volume of a slice at location x. Explain why we can think about the
function ρ(x) = 0.6π f (x) 2 (where f is the function given at the start of the problem) as being the weight density
function for how the weight of the baseball bat is distributed from x = 0 to x = 30. 364 6.3. DENSITY, MASS, AND
CENTER OF MASS
e. Compute the center of mass of the baseball bat.

6.4 Physics Applications: Work, Force, and Pressure


1. Consider the curve $f(x) = 3\cos\left(\frac{x^3}{4}\right)$ and the portion of its graph that lies in the first quadrant between the y-axis and
the first positive value of $x$ for which $f(x) = 0$. Let $R$ denote the region bounded by this portion of $f$, the x-axis, and the
y-axis. Assume that $x$ and $y$ are each measured in feet.
a. Picture the coordinate axes rotated 90 degrees clockwise so that the positive x-axis points straight down, and the
positive y-axis points to the right. Suppose that R is rotated about the x axis to form a solid of revolution, and we
consider this solid as a storage tank. Suppose that the resulting tank is filled to a depth of 1.5 feet with water weighing
62.4 pounds per cubic foot. Find the amount of work required to lower the water in the tank until it is 0.5 feet deep, by
pumping the water to the top of the tank.
b. Again picture the coordinate axes rotated 90 degrees clockwise so that the positive x-axis points straight down, and the
positive y-axis points to the right. Suppose that R, together with its reflection across the x-axis, forms one end of a
storage tank that is 10 feet long. Suppose that the resulting tank is filled completely with water weighing 62.4 pounds
per cubic foot. Find a formula for a function that tells the amount of work required to lower the water by h feet.
c. Suppose that the tank described in (b) is completely filled with water. Find the total force due to hydrostatic pressure
exerted by the water on one end of the tank.

2. A cylindrical tank, buried on its side, has radius 3 feet and length 10 feet. It is filled completely with water whose
weight density is 62.4 lbs/ft$^3$, and the top of the tank is two feet underground.
a. Set up, but do not evaluate, an integral expression that represents the amount of work required to empty the top half of
the water in the tank to a truck whose tank lies 4.5 feet above ground.
b. With the tank now only half-full, set up, but do not evaluate, an integral expression that represents the total force due to
hydrostatic pressure against one end of the tank.

6.5: Improper Integrals


1. Determine, with justification, whether each of the following improper integrals converges or diverges.
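a. $\int_e^{\infty} \frac{\ln(x)}{x}\,dx$
b. $\int_e^{\infty} \frac{1}{x\ln(x)}\,dx$
c. $\int_e^{\infty} \frac{1}{x(\ln(x))^2}\,dx$
d. $\int_e^{\infty} \frac{1}{x(\ln(x))^p}\,dx$, where $p$ is a positive real number
e. $\int_0^1 \frac{\ln(x)}{x}\,dx$
f. $\int_0^1 \ln(x)\,dx$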
2. Sometimes we may encounter an improper integral for which we cannot easily evaluate the limit of the corresponding
proper integrals. For instance, consider $\int_1^{\infty} \frac{1}{1+x^3}\,dx$. While it is hard (or perhaps impossible) to find an antiderivative
for $\frac{1}{1+x^3}$, we can still determine whether the improper integral converges or diverges by comparison to a simpler
one. Observe that for all $x > 0$, $1 + x^3 > x^3$, and therefore $\frac{1}{1+x^3} < \frac{1}{x^3}$. It therefore follows that
\[ \int_1^b \frac{1}{1+x^3}\,dx < \int_1^b \frac{1}{x^3}\,dx \]
for every $b > 1$. If we let $b \to \infty$ so as to consider the two improper integrals $\int_1^{\infty} \frac{1}{1+x^3}\,dx$ and
$\int_1^{\infty} \frac{1}{x^3}\,dx$, we know that the larger of the two improper integrals converges. And thus, since the smaller one lies below a
convergent integral, it follows that the smaller one must converge, too. In particular, $\int_1^{\infty} \frac{1}{1+x^3}\,dx$ must converge, even
though we never explicitly evaluated the corresponding limit of proper integrals. We use this idea and similar ones in the
exercises that follow.
a. Explain why $x^2 + x + 1 > x^2$ for all $x \ge 1$, and hence show that $\int_1^{\infty} \frac{1}{x^2+x+1}\,dx$ converges by comparison to
$\int_1^{\infty} \frac{1}{x^2}\,dx$.
b. Observe that for each $x > 1$, $\ln(x) < x$. Explain why $\int_2^b \frac{1}{x}\,dx < \int_2^b \frac{1}{\ln(x)}\,dx$ for each $b > 2$. Why must it be true
that $\int_2^{\infty} \frac{1}{\ln(x)}\,dx$ diverges?
c. Explain why $\sqrt{\frac{x^4+1}{x^4}} > 1$ for all $x > 1$. Then, determine whether the improper integral
$\int_1^{\infty} \frac{1}{x} \cdot \sqrt{\frac{x^4+1}{x^4}}\,dx$ converges or diverges.
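
The comparison argument introduced in Exercise 2 can be illustrated numerically. The following is a minimal sketch, not part of the original exercises: the helper name `midpoint_integral`, the step count, and the sample upper limits are all assumptions chosen for illustration. A table like this suggests, but does not prove, that the smaller integral stays below the larger one, which in turn stays below 1/2.

```python
def midpoint_integral(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with a midpoint Riemann sum."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h


def smaller(x):
    # Integrand whose antiderivative is hard to find.
    return 1.0 / (1.0 + x**3)


def larger(x):
    # Simpler integrand used for comparison; its integral on [1, b] is (1 - 1/b^2)/2.
    return 1.0 / x**3


# Hypothetical sample of upper limits b (an assumption for illustration).
results = []
for b in (10, 100, 1000):
    s = midpoint_integral(smaller, 1.0, b)
    l = midpoint_integral(larger, 1.0, b)
    results.append((b, s, l))
    print(f"b = {b:5d}   smaller integral = {s:.6f}   larger integral = {l:.6f}")
```

As b grows, both columns level off, which is consistent with the conclusion that the improper integral of 1/(1 + x^3) over [1, ∞) converges.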

CHAPTER OVERVIEW

7: DIFFERENTIAL EQUATIONS
An Introductory Calculus Libretexts Textmap
Active Calculus
by Matt Boelkins, David Austin, and Steve Schlicker
Chapter 1

Chapter 1: Understanding the Derivative


1.1: How do we Measure Velocity?
1.2: The Notion of Limit
1.3: The Derivative of a Function at a Point
1.4: The Derivative Function
1.5: Interpreting, Estimating, and Using the Derivative
1.6: The Second Derivative
1.7: Limits, Continuity, and Differentiability
1.8: The Tangent Line Approximation
1.E: Understanding the Derivative (Exercises)

• Chapter 2

Chapter 2: Computing Derivatives


2.1: Elementary Derivative Rules
2.2: The Sine and Cosine Functions
2.3: The Product and Quotient Rules
2.4: Derivatives of Other Trigonometric Functions
2.5: The Chain Rule
2.6: Derivatives of Inverse Functions
2.7: Derivatives of Functions Given Implicitly
2.8: Using Derivatives to Evaluate Limits
2.E: Computing Derivatives (Exercises)

• Chapter 3

Chapter 3: Using Derivatives


3.1: Using Derivatives to Identify Extreme Values
3.2: Using Derivatives to Describe Families of Functions
3.3: Global Optimization
3.4: Applied Optimization
3.5: Related Rates
3.E: Using Derivatives (Exercises)

• Chapter 4

Chapter 4: The Definite Integral


4.1: Determining Distance Traveled from Velocity
4.2: Riemann Sums
4.3: The Definite Integral
4.4: The Fundamental Theorem of Calculus
4.E: The Definite Integral (Exercises)

• Chapter 5

Chapter 5: Finding Antiderivatives and Evaluating Integrals


5.1: Constructing Accurate Graphs of Antiderivatives
5.2: The Second Fundamental Theorem of Calculus
5.3: Integration by Substitution
5.4: Integration by Parts
5.5: Other Options for Finding Algebraic Antiderivatives
5.6: Numerical Integration
5.E: Finding Antiderivatives and Evaluating Integrals (Exercises)

• Chapter 6

Chapter 6: Using Definite Integrals


6.1: Using Definite Integrals to Find Area and Length
6.2: Using Definite Integrals to Find Volume
6.3: Density, Mass, and Center of Mass
6.4: Physics Applications: Work, Force, and Pressure
6.5: Improper Integrals
6.E: Using Definite Integrals (Exercises)

• Chapter 7

Chapter 7: Differential Equations


7.1: An Introduction to Differential Equations
7.2: Qualitative Behavior of Solutions to Differential Equations
7.3: Euler's Method
7.4: Separable Differential Equations
7.5: Modeling with Differential Equations
7.6: Population Growth and the Logistic Equation
7.E: Differential Equations (Exercises)

• Chapter 8

Chapter 8: Sequences and Series


8.1: Sequences
8.2: Geometric Series
8.3: Series of Real Numbers
8.4: Alternating Series
8.5: Taylor Polynomials and Taylor Series
8.6: Power Series
8.E: Sequences and Series (Exercises)

7.1: AN INTRODUCTION TO DIFFERENTIAL EQUATIONS


Here we introduce the concept of differential equations. A differential equation is an equation that provides a description of a function’s
derivative, which means that it tells us the function’s rate of change. Using this information, we would like to learn as much as
possible about the function itself. For instance, we would ideally like to have an algebraic description of the function.

7.2: QUALITATIVE BEHAVIOR OF SOLUTIONS TO DIFFERENTIAL EQUATIONS


Since the derivative at a point tells us the slope of the tangent line at this point, a differential equation gives us crucial information
about the tangent lines to the graph of a solution. We will use this information about the tangent lines to create a slope field for the
differential equation, which enables us to sketch solutions to initial value problems. Our aim will be to understand the solutions
qualitatively.

7.3: EULER'S METHOD


Euler’s method is an algorithm for approximating the solution to an initial value problem by following the tangent lines while we take
horizontal steps across the t-axis. If we wish to approximate y(t) for some fixed t by taking horizontal steps of size ∆t, then the error in
our approximation is proportional to ∆t.

7.4: SEPARABLE DIFFERENTIAL EQUATIONS


A separable differential equation is one that may be rewritten with all occurrences of the dependent variable multiplying the derivative
and all occurrences of the independent variable on the other side of the equation. We may find the solutions to certain separable
differential equations by separating variables, integrating with respect to t, and ultimately solving the resulting algebraic equation for
y. This technique allows us to solve many important differential equations.

7.5: MODELING WITH DIFFERENTIAL EQUATIONS


In our work to date, we have seen several ways that differential equations arise in the natural world, from the growth of a population
to the temperature of a cup of coffee. In this section, we will look more closely at how differential equations give us a natural way to
describe various phenomena. As we’ll see, the key is to focus on understanding the different factors that cause a quantity to change.

7.6: POPULATION GROWTH AND THE LOGISTIC EQUATION
The growth of the earth’s population is one of the pressing issues of our time. Will the population continue to grow? Or will it perhaps
level off at some point, and if so, when? In this section, we will look at two ways in which we may use differential equations to help
us address questions such as these. Before we begin, let’s consider again two important differential equations that we have seen in
earlier work in this chapter.

7.E: DIFFERENTIAL EQUATIONS (EXERCISES)


These are homework exercises to accompany Chapter 7 of Boelkins et al. "Active Calculus" Textmap.

7.1: An Introduction to Differential Equations
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a differential equation and what kinds of information can it tell us?
How do differential equations arise in the world around us?
What do we mean by a solution to a differential equation?

In previous chapters, we have seen that a function’s derivative tells us the rate at which the function is changing. More
recently, the Fundamental Theorem of Calculus helped us to determine the total change of a function over an interval when
we know the function’s rate of change. For instance, an object’s velocity tells us the rate of change of that object’s position.
By integrating the velocity over a time interval, we may determine by how much the position changes over that time
interval. In particular, if we know where the object is at the beginning of that interval, then we have enough information to
accurately predict where it will be at the end of the interval.
In this chapter, we will introduce the concept of differential equations and explore this idea in more depth. Simply said, a
differential equation is an equation that provides a description of a function’s derivative, which means that it tells us the
function’s rate of change. Using this information, we would like to learn as much as possible about the function itself. For
instance, we would ideally like to have an algebraic description of the function. As we’ll see, this may be too much to ask
in some situations, but we will still be able to make accurate approximations.

Preview Activity 7.1.1

The position of a moving object is given by the function s(t) , where s is measured in feet and t in seconds. We
determine that the velocity is v(t) = 4t + 1 feet per second.
a. How much does the position change over the time interval [0, 4]?
b. Does this give you enough information to determine s(4) , the position at time t = 4 ? If so, what is s(4) ? If not,
what additional information would you need to know to determine s(4) ?
c. Suppose you are told that the object’s initial position s(0) = 7 . Determine s(2) , the object’s position 2 seconds
later.
d. If you are told instead that the object’s initial position is s(0) = 3 , what is s(2) ?
e. If we only know the velocity v(t) = 4t + 1 , is it possible that the object’s position at all times is
$s(t) = 2t^2 + t - 4$? Explain how you know.

f. Are there other possibilities for s(t) ? If so, what are they?
g. If, in addition to knowing the velocity function is v(t) = 4t + 1 , we know the initial position s(0) , how many
possibilities are there for s(t) ?

What is a differential equation?
A differential equation is an equation that describes the derivative, or derivatives, of a
function that is unknown to us. For instance, the equation
\[ \frac{dy}{dx} = x \sin x \]

is a differential equation since it describes the derivative of a function y(x) that is unknown to us.
As many important examples of differential equations involve quantities that change in time, the independent variable in
our discussion will frequently be time t . For instance, in the preview activity, we considered the differential equation
\[ \frac{ds}{dt} = 4t + 1. \]

Knowing the velocity and the starting position of the object, we were able to find the position at any later time.
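
The idea that a velocity plus a starting position determines the position at later times can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names `velocity` and `position`, the midpoint Riemann sum, and the sample starting position of 0 feet are choices made here, not part of the text.

```python
def velocity(t):
    """Velocity from the preview activity, in feet per second."""
    return 4 * t + 1


def position(t, s0, steps=10_000):
    """Approximate s(t) = s(0) + (integral of v from 0 to t) with a midpoint Riemann sum.

    s0 is the starting position s(0); without it the position is not determined.
    """
    dt = t / steps
    change = sum(velocity((i + 0.5) * dt) for i in range(steps)) * dt
    return s0 + change


# Hypothetical starting position of 0 feet (an assumption for illustration).
print(position(1.0, 0.0))
```

Calling `position` with the same time but a different `s0` shifts the result by exactly the difference in starting positions, which mirrors the role of the initial position in the discussion above.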
Because differential equations describe the derivative of a function, they give us information about how that function
changes. Our goal will be to take this information and use it to predict the value of the function in the future; in this way,
differential equations provide us with something like a crystal ball. Differential equations arise frequently in our everyday
world. For instance, you may hear a bank advertising:
Your money will grow at a 3% annual interest rate with us.
This innocuous statement is really a differential equation. Let’s translate: let $A(t)$ be the amount of money you have in your
account at time $t$. On one hand, the rate at which your money grows is the derivative $\frac{dA}{dt}$. On the other hand, we are told
that this rate is $0.03A$. This leads to the differential equation
\[ \frac{dA}{dt} = 0.03A. \]
This differential equation has a slightly different feel than the previous example $\frac{ds}{dt} = 4t + 1$. In the earlier example, the
rate of change depends only on the independent variable $t$, and we may find $s(t)$ by integrating the velocity $4t + 1$. In the
banking example, however, the rate of change depends on the dependent variable $A$, so we’ll need some new techniques in
order to find $A(t)$.

Activity 7.1.1:

Express the following statements as differential equations. In each case, you will need to introduce notation to describe
the important quantities in the statement so be sure to clearly state what your notation means.
a. The population of a town grows continuously at an annual rate of 1.25%.
b. A radioactive sample loses 5.6% of its mass every day.
c. You have a bank account that continuously earns 4% interest every year. At the same time, you withdraw money
continually from the account at the rate of $1000 per year.
d. A cup of hot chocolate is sitting in a 70° room. The temperature of the hot chocolate cools continuously by 10% of
the difference between the hot chocolate’s temperature and the room temperature every minute.
e. A can of cold soda is sitting in a 70° room. The temperature of the soda warms continuously at the rate of 10% of
the difference between the soda’s temperature and the room’s temperature every minute.

Differential equations in the world around us
As we have noted, differential equations give a natural way to describe
phenomena we see in the real world. For instance, physical principles are frequently expressed as a description of how a
quantity changes. A good example is Newton’s Second Law, an important physical principle that says:
The product of an object’s mass and acceleration equals the force applied to it.
For instance, when gravity acts on an object near the earth’s surface, it exerts a force equal to $mg$, the mass of the object
times the gravitational constant $g$. We therefore have
\[ ma = mg \]
or
\[ \frac{dv}{dt} = g, \]
where $v$ is the velocity of the object, and $g = 9.8$ meters per second squared. Notice that this physical principle does not
tell us what the object’s velocity is, but rather how the object’s velocity changes.

Activity 7.1.2:

Shown below are two graphs depicting the velocity of falling objects. On the left is the velocity of a skydiver, while on
the right is the velocity of a meteorite entering the Earth’s atmosphere.

a. Begin with the skydiver’s velocity and use the given graph to measure the rate of change $dv/dt$ when the velocity
is $v = 0.5$, $1.0$, $1.5$, $2.0$, and $2.5$. Plot your values on the graph below. You will want to think carefully about this:
you are plotting the derivative $dv/dt$ as a function of velocity.

b. Now do the same thing with the meteorite’s velocity: use the given graph to measure the rate of change $dv/dt$
when the velocity is $v = 3.5$, $4.0$, $4.5$, and $5.0$. Plot your values on the graph above.
c. You should find that all your points lie on a line. Write the equation of this line being careful to use proper notation
for the quantities on the horizontal and vertical axes.
d. The relationship you just found is a differential equation. Write a complete sentence that explains its meaning.
e. By looking at the differential equation, determine the values of the velocity for which the velocity increases.
f. By looking at the differential equation, determine the values of the velocity for which the velocity decreases.
g. By looking at the differential equation, determine the values of the velocity for which the velocity remains
constant.

The point of this activity is to demonstrate how differential equations model processes in the real world. In this example,
two factors are influencing the velocities: gravity and wind resistance. The differential equation describes how these
factors influence the rate of change of the objects’ velocities.

Solving a Differential Equation
We have said that a differential equation is an equation that describes the derivative, or derivatives, of a function that is
unknown to us. By a solution to a differential equation, we mean simply a function that satisfies this description. For
instance, the first differential equation we looked at is
\[ \frac{ds}{dt} = 4t + 1, \]
which describes an unknown function $s(t)$. We may check that
\[ s(t) = 2t^2 + t \]
is a solution because it satisfies this description. Notice that
\[ s(t) = 2t^2 + t + 4 \]

is also a solution. If we have a candidate for a solution, it is straightforward to check whether it is a solution or not. Before
we demonstrate, however, let’s consider the same issue in a simpler context. Suppose we are given the equation
\[ 2x^2 - 2x = 2x + 6 \]
and asked whether $x = 3$ is a solution. To answer this question, we could replace the variable $x$ in the equation with the
symbol $\square$:
\[ 2\square^2 - 2\square = 2\square + 6. \]
To determine whether $x = 3$ is a solution, we can investigate the value of each side of the equation separately when the
value 3 is placed in $\square$ and see if indeed the two resulting values are equal. Doing so, we observe that
\[ 2\square^2 - 2\square = 2 \times 3^2 - 2 \times 3 = 12, \]
and
\[ 2\square + 6 = 2 \times 3 + 6 = 12. \]
Therefore, $x = 3$ is indeed a solution.


We will do the same thing with differential equations. Consider the differential equation
\[ \frac{dv}{dt} = 1.5 - 0.5v \quad \text{or,} \quad \frac{d\square}{dt} = 1.5 - 0.5\square. \]
Let’s ask whether $v(t) = 3 - 2e^{-0.5t}$ is a solution (at this time, don’t worry about why we chose this function; we will
learn techniques for finding solutions to differential equations soon enough). Using this formula for $v$, observe first that
\[ \frac{dv}{dt} = \frac{d\square}{dt} = \frac{d}{dt}\left[3 - 2e^{-0.5t}\right] = -2e^{-0.5t} \cdot (-0.5) = e^{-0.5t} \]
and
\[ 1.5 - 0.5v = 1.5 - 0.5\square = 1.5 - 0.5\left(3 - 2e^{-0.5t}\right) = 1.5 - 1.5 + e^{-0.5t} = e^{-0.5t}. \]
Since $\frac{dv}{dt}$ and $1.5 - 0.5v$ agree for all values of $t$ when $v = 3 - 2e^{-0.5t}$, we have indeed found a solution to the
differential equation.
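
The same check can be carried out numerically, which is sometimes a useful sanity test before doing the algebra. The sketch below is an illustration only: the helper `max_residual`, the central-difference step `h`, the sample times, and the deliberately wrong candidate `not_a_solution` are all assumptions introduced here. A small residual is evidence, not proof, that a candidate satisfies the equation.

```python
import math


def rhs(v):
    """Right-hand side of dv/dt = 1.5 - 0.5 v."""
    return 1.5 - 0.5 * v


def candidate(t):
    """The function checked in the text: v(t) = 3 - 2 e^(-0.5 t)."""
    return 3 - 2 * math.exp(-0.5 * t)


def not_a_solution(t):
    """A hypothetical non-solution, v(t) = t, included only for contrast."""
    return t


def max_residual(func, times, h=1e-6):
    """Largest gap between a central-difference estimate of func'(t) and rhs(func(t))."""
    worst = 0.0
    for t in times:
        derivative = (func(t + h) - func(t - h)) / (2 * h)
        worst = max(worst, abs(derivative - rhs(func(t))))
    return worst


times = [0.5 * k for k in range(21)]  # sample times 0, 0.5, ..., 10
print(max_residual(candidate, times))       # tiny: both sides agree
print(max_residual(not_a_solution, times))  # clearly nonzero
```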

Activity 7.1.3:

Consider the differential equation


\[ \frac{dv}{dt} = 1.5 - 0.5v. \]
Which of the following functions are solutions of this differential equation?
a. $v(t) = 1.5t - 0.25t^2$.
b. $v(t) = 3 + 2e^{-0.5t}$.

c. $v(t) = 3$.
d. $v(t) = 3 + Ce^{-0.5t}$, where $C$ is any constant.

This activity shows us something interesting. Notice that the differential equation has infinitely many solutions, which are
parametrized by the constant $C$ in $v(t) = 3 + Ce^{-0.5t}$.

In Figure 7.1, we see the graphs of these solutions for a few values of C , as labeled.

Figure 7.1: The family of solutions to the differential equation $\frac{dv}{dt} = 1.5 - 0.5v$.
Notice that the value of C is connected to the initial value of the velocity v(0) , since v(0) = 3 + C . In other words, while
the differential equation describes how the velocity changes as a function of the velocity itself, this is not enough
information to determine the velocity uniquely: we also need to know the initial velocity. For this reason, differential
equations will typically have infinitely many solutions, one corresponding to each initial value. We have seen this
phenomenon before: given only the velocity $v(t)$ of a moving object, we were not able to uniquely determine the
object’s position unless we also knew its initial position.
If we are given a differential equation and an initial value for the unknown function, we say that we have an initial value
problem. For instance,
\[ \frac{dv}{dt} = 1.5 - 0.5v, \quad \text{with } v(0) = 0.5 \]

is an initial value problem. In this situation, we know the value of v at one time and we know how v is changing.
Consequently, there should be exactly one function v that satisfies the initial value problem. This demonstrates the
following important general property of initial value problems.

Initial value problems

Initial value problems that are “well behaved” have exactly one solution, which exists in some interval around the
initial point.

We won’t worry about what “well behaved” means—it is a technical condition that will be satisfied by all the differential
equations we consider.
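
To see how an initial value singles out one member of the family $v(t) = 3 + Ce^{-0.5t}$, note that $v(0) = 3 + C$, so $C = v(0) - 3$. The sketch below applies this to the initial value problem above. The function name `solution_for_initial_value` is a hypothetical helper introduced for illustration, not notation from the text.

```python
import math


def solution_for_initial_value(v0):
    """Return the member of the family v(t) = 3 + C e^(-0.5 t) with v(0) = v0.

    Uses the relation v(0) = 3 + C from the text, so C = v0 - 3.
    """
    C = v0 - 3

    def v(t):
        return 3 + C * math.exp(-0.5 * t)

    return v


# Initial value problem from the text: dv/dt = 1.5 - 0.5 v with v(0) = 0.5.
v = solution_for_initial_value(0.5)
print(v(0.0))   # matches the initial value
print(v(20.0))  # long-run value is close to 3
```

Changing the argument of `solution_for_initial_value` selects a different curve from Figure 7.1; each initial value corresponds to exactly one value of $C$.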
To close this section, we note that differential equations may be classified based on certain characteristics they may
possess. Indeed, you may see many different types of differential equations in a later course in differential equations. For
now, we would like to introduce a few terms that are used to describe differential equations.
A first-order differential equation is one in which only the first derivative of the function occurs. For this reason,
\[ \frac{dv}{dt} = 1.5 - 0.5v \]
is a first-order equation while
\[ \frac{d^2y}{dt^2} = -10y \]

is a second-order differential equation. A differential equation is autonomous if the independent variable does not appear in
the description of the derivative. For instance,
\[ \frac{dv}{dt} = 1.5 - 0.5v \]
is autonomous because the description of the derivative $\frac{dv}{dt}$ does not depend on time. The equation
\[ \frac{dy}{dt} = 1.5t - 0.5y, \]

however, is not autonomous.

Summary
In this section, we encountered the following important ideas:
A differential equation is simply an equation that describes the derivative(s) of an unknown function.
Physical principles, as well as some everyday situations, often describe how a quantity changes, which leads to
differential equations.
A solution to a differential equation is a function whose derivatives satisfy the equation’s description. Differential
equations typically have infinitely many solutions, parametrized by the initial values.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

7.2: Qualitative Behavior of Solutions to Differential Equations
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a slope field?
How can we use a slope field to obtain qualitative information about the solutions of a differential equation?
What are stable and unstable equilibrium solutions of an autonomous differential equation?

In earlier work, we have used the tangent line to the graph of a function $f$ at a point $a$ to approximate the values of $f$ near
$a$. The usefulness of this approximation is that we need to know very little about the function; armed with only the value
$f(a)$ and the derivative $f'(a)$, we may find the equation of the tangent line and the approximation
\[ f(x) \approx f(a) + f'(a)(x - a). \tag{7.2.1} \]

Remember that a first-order differential equation gives us information about the derivative of an unknown function. Since
the derivative at a point tells us the slope of the tangent line at this point, a differential equation gives us crucial
information about the tangent lines to the graph of a solution. We will use this information about the tangent lines to create
a slope field for the differential equation, which enables us to sketch solutions to initial value problems. Our aim will be to
understand the solutions qualitatively. That is, we would like to understand the basic nature of solutions, such as their long-
range behavior, without precisely determining the value of a solution at a particular point.

Preview Activity 7.2.1

Let’s consider the initial value problem


\[ \frac{dy}{dt} = t - 2 \]
with $y(0) = 1$.
a. Use the differential equation to find the slope of the tangent line to the solution y(t) at t = 0 . Then use the initial
value to find the equation of the tangent line at t = 0 . Sketch this tangent line over the interval −0.25 ≤ t ≤ 0.25
on the axes provided.

b. Also shown in the given figure are the tangent lines to the solution $y(t)$ at the points $t = 1$, $2$, and $3$ (we will see
how to find these later). Use the graph to measure the slope of each tangent line and verify that each agrees with
the value specified by the differential equation.
c. Using these tangent lines as a guide, sketch a graph of the solution $y(t)$ over the interval $0 \le t \le 3$ so that the lines
are tangent to the graph of $y(t)$.
d. Use the Fundamental Theorem of Calculus to find $y(t)$, the solution to this initial value problem.
e. Graph the solution you found in (d) on the axes provided, and compare it to the sketch you made using the tangent
lines.

Slope fields
Preview Activity 7.2.1 shows that we may sketch the solution to an initial value problem if we know an appropriate
collection of tangent lines. Because we may use a given differential equation to determine the slope of the tangent line at
any point of interest, by plotting a useful collection of these, we can get an accurate sense of how certain solution curves
must behave. Let’s continue looking at the differential equation $\frac{dy}{dt} = t - 2$. If $t = 0$, this equation says that
\[ \frac{dy}{dt} = 0 - 2 = -2. \]

Note that this value holds regardless of the value of y . We will therefore sketch tangent lines for several values of y and
t = 0 with a slope of −2.

Let’s continue in the same way: if $t = 1$, the equation tells us that $\frac{dy}{dt} = 1 - 2 = -1$, and this holds regardless of the
value of $y$. We now sketch tangent lines for several values of $y$ and $t = 1$ with a slope of $-1$.

Similarly, we see that when t = 2 , dy/dt = 0 and when t =3 , dy/dt = 1 . We may therefore add to our growing
collection of tangent line plots to achieve the next figure.

In this figure, you may see the solutions to the differential equation emerge. However, for the sake of clarity, we will add
more tangent lines to provide the more complete picture shown below.

This most recent figure, which is called a slope field for the differential equation, allows us to sketch solutions just as we
did in the preview activity. Here, we will begin with the initial value y(0) = 1 and start sketching the solution by
following the tangent line, as shown in the next figure.

We then continue using this principle: whenever the solution passes through a point at which a tangent line is drawn, that
line is tangent to the solution. Doing so leads us to the following sequence of images.

In fact, we may draw solutions for any possible initial value, and doing this for several different initial values for y(0)

results in the graphs shown next.


Just as we have done for the most recent example with the equation $\frac{dy}{dt} = t - 2$, we can construct a slope field for any differential
equation of interest. The slope field provides us with visual information about how we expect solutions to the differential
equation to behave.
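
The construction of a slope field described above can be sketched as a short computation: at each grid point, evaluate the right-hand side of the differential equation and record a short segment with that slope. This is a minimal sketch under stated assumptions. The names `slope` and `slope_field`, the grid of points, and the segment half-length are illustrative choices, and the segments are returned as coordinate pairs rather than drawn, so any plotting tool could be used to display them.

```python
def slope(t, y):
    """Right-hand side of dy/dt = t - 2; the slope does not depend on y here."""
    return t - 2


def slope_field(t_values, y_values, half_length=0.2):
    """Return short tangent segments ((t0, y0), (t1, y1)) centered at each grid point."""
    segments = []
    for t in t_values:
        for y in y_values:
            m = slope(t, y)
            # Scale the horizontal and vertical offsets so every segment has the same length.
            dt = half_length / (1 + m * m) ** 0.5
            dy = m * dt
            segments.append(((t - dt, y - dy), (t + dt, y + dy)))
    return segments


# Grid matching the values of t used in the text; the y values are an assumption.
segments = slope_field([0, 1, 2, 3], [0, 1, 2])
for start, end in segments[:3]:
    print(start, end)
```

For a different differential equation, only the body of `slope` needs to change; for an autonomous equation it would depend on `y` alone.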

Activity 7.2.1

Consider the autonomous differential equation


\[ \frac{dy}{dt} = -\frac{1}{2}(y - 4). \]
a. Make a plot of $\frac{dy}{dt}$ versus $y$ on the axes provided. Looking at the graph, for what values of $y$ does $y$ increase and
for what values of $y$ does $y$ decrease?

b. Next, sketch the slope field for this differential equation on the axes provided.

c. Use your work in (b) to sketch the solutions that satisfy y(0) = 0 , y(0) = 2 , y(0) = 4 , and y(0) = 6 .
d. Verify that $y(t) = 4 + 2e^{-t/2}$ is a solution to the given differential equation with the initial value $y(0) = 6$.
Compare its graph to the one you sketched in (c).


e. What is special about the solution where y(0) = 4 ?

Equilibrium Solutions and Stability


As our work in Activity 7.2.1 demonstrates, first-order autonomous differential equations may have solutions that are constant. In fact,
these are quite easy to detect by inspecting the differential equation $dy/dt = f(y)$: constant solutions necessarily have a
zero derivative, so $dy/dt = 0 = f(y)$. For example, in Activity 7.2.1, we considered the equation
\[ \frac{dy}{dt} = f(y) = -\frac{1}{2}(y - 4). \tag{7.2.2} \]
Constant solutions are found by setting $f(y) = -\frac{1}{2}(y - 4) = 0$, which we immediately see implies that $y = 4$. Values of
$y$ for which $f(y) = 0$ in an autonomous differential equation $\frac{dy}{dt} = f(y)$ are usually called the equilibrium solutions of
the differential equation.
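
When $f(y) = 0$ cannot be solved by inspection, equilibrium solutions can be located numerically by scanning for sign changes of $f$ and refining with bisection. The sketch below does this for Equation (7.2.2). It is an illustration only: the function name `find_equilibria`, the search interval $[-10, 10]$, the sample count, and the tolerance are all assumptions, and a scan of this kind can miss zeros at which $f$ touches zero without changing sign.

```python
def f(y):
    """Right-hand side of Equation (7.2.2): dy/dt = -(1/2)(y - 4)."""
    return -0.5 * (y - 4)


def find_equilibria(f, low, high, samples=200, tol=1e-10):
    """Scan [low, high] for zeros of f and refine each sign change by bisection."""
    roots = []
    step = (high - low) / samples
    for i in range(samples):
        a = low + i * step
        b = low + (i + 1) * step
        fa, fb = f(a), f(b)
        if fa == 0.0:
            # A grid point happens to be an exact zero.
            roots.append(a)
        elif fa * fb < 0:
            # f changes sign on [a, b]; halve the interval until it is shorter than tol.
            while b - a > tol:
                mid = (a + b) / 2
                if f(a) * f(mid) <= 0:
                    b = mid
                else:
                    a = mid
            roots.append((a + b) / 2)
    return roots


equilibria = find_equilibria(f, -10.0, 10.0)
print(equilibria)
```

The scan reports a single value near $y = 4$, in agreement with the algebra above.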

Activity 7.2.2

Consider the autonomous differential equation


\[ \frac{dy}{dt} = -\frac{1}{2}y(y - 4). \tag{7.2.3} \]
a. Make a plot of $\frac{dy}{dt}$ versus $y$. Looking at the graph, for what values of $y$ does $y$ increase and for what values of $y$
does $y$ decrease?

b. Identify any equilibrium solutions of the given differential equation.
c. Now sketch the slope field for the given differential equation.

d. Sketch the solutions to the given differential equation that correspond to initial values $y(0) = -1, 0, 1, \ldots, 5$.
e. An equilibrium solution $\bar{y}$ is called stable if nearby solutions converge to $\bar{y}$. This means that if the initial condition
varies slightly from $\bar{y}$, then
\[ \lim_{t \to \infty} y(t) = \bar{y}. \tag{7.2.4} \]
Conversely, an equilibrium solution $\bar{y}$ is called unstable if nearby solutions are pushed away from $\bar{y}$. Using your
work above, classify the equilibrium solutions you found in (b) as either stable or unstable.
f. Suppose that $y(t)$ describes the population of a species of living organisms and that the initial value $y(0)$ is
positive. What can you say about the eventual fate of this population?
g. Remember that an equilibrium solution $\bar{y}$ satisfies $f(\bar{y}) = 0$. If we graph $dy/dt = f(y)$ as a function of $y$, for
which of the following differential equations is $\bar{y}$ a stable equilibrium and for which is $\bar{y}$ unstable? Why?

Summary
In this section, we encountered the following important ideas:
A slope field is a plot created by graphing the tangent lines of many different solutions to a differential equation.
Once we have a slope field, we may sketch the graph of solutions by drawing a curve that is always tangent to the lines
in the slope field.
Autonomous differential equations sometimes have constant solutions that we call equilibrium solutions. These may be
classified as stable or unstable, depending on the behavior of nearby solutions.
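
The stability idea in the last point can be checked numerically. The following is a minimal sketch, assuming we test an equilibrium of Equation (7.2.2) by looking at the sign of f(y) just below and just above it; the helper name `classify_equilibrium` and the offset `eps` are illustrative choices, not part of the text.

```python
def f(y):
    """Right-hand side of dy/dt = -(1/2)(y - 4), Equation (7.2.2)."""
    return -0.5 * (y - 4)


def classify_equilibrium(f, y_bar, eps=1e-3):
    """Classify the equilibrium y_bar from the sign of f on either side.

    Hypothetical helper: it only inspects two nearby points, so it assumes
    f does not change sign again within eps of y_bar.
    """
    below = f(y_bar - eps)   # direction of motion just below y_bar
    above = f(y_bar + eps)   # direction of motion just above y_bar
    if below > 0 and above < 0:
        return "stable"      # nearby solutions move toward y_bar
    if below < 0 and above > 0:
        return "unstable"    # nearby solutions move away from y_bar
    return "neither"         # other cases are not treated in this section


print(classify_equilibrium(f, 4.0))
```

For Equation (7.2.2), solutions below y = 4 increase and solutions above y = 4 decrease, which agrees with the behavior of the solution y(t) = 4 + 2e^{-t/2} verified in Activity 7.2.1.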

Contributors and Attributions
Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

7.3: Euler's Method
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is Euler’s method and how can we use it to approximate the solution to an initial value problem?
How accurate is Euler’s method?

In Section 7.2, we saw how a slope field can be used to sketch solutions to a differential equation. In particular, the slope
field is a plot of a large collection of tangent lines to a large number of solutions of the differential equation, and we sketch
a single solution by simply following these tangent lines. With a little more thought, we may use this same idea to
numerically approximate the solutions of a differential equation.

Preview Activity 7.3.1

Consider the initial value problem


\[ \frac{dy}{dt} = \frac{1}{2(y + 1)} \]

with y(0) = 0 .
a. Use the differential equation to find the slope of the tangent line to the solution y(t) at t = 0 . Then use the given
initial value to find the equation of the tangent line at t = 0 .
b. Sketch the tangent line on the axes below on the interval 0 ≤ t ≤ 2 and use it to approximate y(2), the value of the
solution at t = 2 .
c. Assuming that your approximation for y(2) is the actual value of y(2), use the differential equation to find the
slope of the tangent line to y(t) at t = 2 . Then, write the equation of the tangent line at t = 2 .
d. Add a sketch of this tangent line to your plot on the axes above on the interval 2 ≤ t ≤ 4 ; use this new tangent line
to approximate y(4), the value of the solution at t = 4 .
e. Repeat the same step to find an approximation for y(6).

Euler’s Method
Preview Activity 7.3.1 demonstrates the essence of an algorithm, which is known as Euler’s Method, that generates a
numerical approximation to the solution of an initial value problem. In this algorithm, we will approximate the solution by
taking horizontal steps of a fixed size that we denote by Δt.
Before explaining the algorithm in detail, let’s remember how we compute the slope of a line: the slope is the ratio of the
vertical change to the horizontal change, as shown in the following figure.

In other words, $m = \frac{\Delta y}{\Delta t}$. Said differently, the vertical change is the product of the slope and the horizontal change:
\[ \Delta y = m\,\Delta t. \]

Suppose that we would like to solve the initial value problem


\[ \frac{dy}{dt} = t - y, \quad y(0) = 1. \]

“Euler” is pronounced “Oy-ler.” Among other things, Euler is the mathematician credited with the famous number e; if
you incorrectly pronounce his name “You-ler,” you fail to appreciate his genius and legacy.

While there is an algorithm by which we can find an algebraic formula for the solution to this initial value problem, and we can check that this solution is
\[ y(t) = t - 1 + 2e^{-t}, \]
we are instead interested in generating an approximate solution by creating a sequence of points $(t_i, y_i)$, where $y_i \approx y(t_i)$.

For this first example, we choose Δt = 0.2. Since we know that y(0) = 1, we will take the initial point to be $(t_0, y_0) = (0, 1)$ and move horizontally by Δt = 0.2 to the point $(t_1, y_1)$. Therefore,
\[ t_1 = t_0 + \Delta t = 0.2. \]

The differential equation tells us that the slope of the tangent line at this point is
\[ m = \left.\frac{dy}{dt}\right|_{(0,1)} = 0 - 1 = -1. \]

Therefore, if we move along the tangent line by taking a horizontal step of size Δt = 0.2, we must also move vertically by
\[ \Delta y = m\,\Delta t = -1 \cdot 0.2 = -0.2. \]

We then have the approximation
\[ y(0.2) \approx y_1 = y_0 + \Delta y = 1 - 0.2 = 0.8. \]

At this point, we have executed one step of Euler’s method.

[Figure: the first Euler step from $(t_0, y_0)$ to $(t_1, y_1)$, with t on the horizontal axis and y on the vertical axis.]

Now we repeat this process: at $(t_1, y_1) = (0.2, 0.8)$, the differential equation tells us that the slope is
\[ m = \left.\frac{dy}{dt}\right|_{(0.2,\,0.8)} = 0.2 - 0.8 = -0.6. \]

If we move horizontally by Δt to $t_2 = t_1 + \Delta t = 0.4$, we must move vertically by
\[ \Delta y = -0.6 \cdot 0.2 = -0.12. \]

We consequently arrive at $y_2 = y_1 + \Delta y = 0.8 - 0.12 = 0.68$, which gives y(0.4) ≈ 0.68. Now we have completed the second step of Euler’s method.
If we continue in this way, we may generate the points $(t_i, y_i)$ shown at left in Figure 7.3.1. In situations where we are able to find a formula for the actual solution y(t), we can graph y(t) to compare it to the points generated by Euler’s method, as shown at right in Figure 7.3.1.
Because we need to generate a large number of points $(t_i, y_i)$, it is convenient to organize the implementation of Euler’s method in a table as shown. We begin with the given initial data.

Figure 7.3.1 : At left, the points and piecewise linear approximate solution generated by Euler’s method; at right, the
approximate solution compared to the exact solution (shown in blue).

From here, we compute the slope of the tangent line m = dy/dt using the formula for dy/dt from the differential
equation, and then we find Δy, the change in y , using the rule Δy = mΔt .

Next, we increase $t_i$ by Δt and $y_i$ by Δy to get the next point, and then we simply continue the process for however many steps we decide, eventually generating a table like the one that follows.
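
The table-building process described above can also be carried out in a few lines of code. What follows is a minimal sketch, assuming the initial value problem dy/dt = t − y, y(0) = 1 with Δt = 0.2 from this section; the function name `euler` and its parameters are illustrative, not notation from the text.

```python
def euler(f, t0, y0, dt, n_steps):
    """Return the points (t_i, y_i) generated by Euler's method.

    f(t, y) is the right-hand side of dy/dt = f(t, y).
    """
    points = [(t0, y0)]
    y = y0
    for i in range(1, n_steps + 1):
        t_prev, y_prev = points[-1]
        m = f(t_prev, y_prev)      # slope of the tangent line at the current point
        y = y_prev + m * dt        # Delta y = m * Delta t
        t = t0 + i * dt            # recompute t to avoid accumulated rounding error
        points.append((t, y))
    return points


points = euler(lambda t, y: t - y, t0=0.0, y0=1.0, dt=0.2, n_steps=5)
for t, y in points:
    print(f"t = {t:.1f}   y = {y:.4f}")
```

The first two rows after the initial point reproduce the values $y_1 = 0.8$ and $y_2 = 0.68$ computed by hand above.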

Activity 7.3.1

Consider the initial value problem


\[ \frac{dy}{dt} = 2t - 1, \quad y(0) = 0. \]

a. Use Euler’s method with Δt = 0.2 to approximate the solution at $t_i$ = 0.2, 0.4, 0.6, 0.8, and 1.0. Record your work in the following table, and sketch the points $(t_i, y_i)$ on the axes provided.

b. Find the exact solution to the original initial value problem and use this function to find the error in your approximation at each one of the points $t_i$.

c. Explain why the value $y_5$ generated by Euler’s method for this initial value problem produces the same value as a left Riemann sum for the definite integral $\int_0^1 (2t - 1)\, dt$.

d. How would your computations differ if the initial value were y(0) = 1 instead? What does this mean about different solutions to this differential equation?

Exercise 7.3.2

Consider the differential equation


\[ \frac{dy}{dt} = 6y - y^2. \]

a. Sketch the slope field for this differential equation on the axes provided at left below.
b. Identify any equilibrium solutions and determine whether they are stable or unstable.
c. What is the long-term behavior of the solution that satisfies the initial value y(0) = 1 ?
d. Using the initial value y(0) = 1, use Euler’s method with Δt = 0.2 to approximate the solution at $t_i$ = 0.2, 0.4, 0.6, 0.8, and 1.0. Sketch the points $(t_i, y_i)$ on the axes provided at right in (a). (Note the different horizontal scale on the two sets of axes.)


e. What happens if we apply Euler’s method to approximate the solution with y(0) = 6 ?

The Error in Euler’s Method


Since we are approximating the solutions to an initial value problem using tangent lines, we should expect that the error in
the approximation will be less when the step size is smaller. To explore this observation quantitatively, let’s consider the
initial value problem
\[ \frac{dy}{dt} = y, \quad y(0) = 1, \]

whose solution we can easily find.

Consider the question posed by this initial value problem: “what function do we know that is the same as its own derivative and has value 1 when t = 0?” It is not hard to see that the solution is $y(t) = e^t$. We now apply Euler’s method to approximate y(1) = e using several values of Δt. These approximations will be denoted by $E_{\Delta t}$, and these estimates provide us a way to see how accurate Euler’s method is.

To begin, we apply Euler’s method with a step size of Δt = 0.2. In that case, we find that $y(1) \approx E_{0.2} = 2.4883$. The error is therefore $y(1) - E_{0.2} = e - 2.4883 \approx 0.2300$.

Repeatedly halving Δt gives the following results, expressed in both tabular and graphical form.

Notice, both numerically and graphically, that the error is roughly halved when Δt is halved. This example illustrates the
following general principle.

Euler's Approximation
If Euler’s method is used to approximate the solution to an initial value problem at a point t̄, then the error is proportional to Δt. That is,

\[ y(\bar{t}) - E_{\Delta t} \approx K\,\Delta t \]

for some constant of proportionality K .
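
This principle can be observed directly by repeating the computation for dy/dt = y, y(0) = 1 with successively halved step sizes. The sketch below is one way to do so, assuming we only need the final value at t = 1; the function name `euler_final_value` is an illustrative choice.

```python
import math


def euler_final_value(f, t0, y0, t_end, n_steps):
    """Apply Euler's method with n_steps equal steps and return the final y."""
    dt = (t_end - t0) / n_steps
    y = y0
    for i in range(n_steps):
        t = t0 + i * dt        # left endpoint of the current step
        y += f(t, y) * dt      # Delta y = m * Delta t
    return y


previous_error = None
for n_steps in (5, 10, 20, 40, 80):
    dt = 1.0 / n_steps
    approx = euler_final_value(lambda t, y: y, 0.0, 1.0, 1.0, n_steps)
    error = math.e - approx
    ratio = "" if previous_error is None else f"   ratio = {error / previous_error:.3f}"
    print(f"dt = {dt:.4f}   E = {approx:.4f}   error = {error:.4f}{ratio}")
    previous_error = error
```

With five steps (Δt = 0.2) the approximation is 1.2^5 ≈ 2.4883, matching the value quoted above, and each printed ratio of successive errors is close to one half.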

Summary
In this section, we encountered the following important ideas:
Euler’s method is an algorithm for approximating the solution to an initial value problem by following the tangent lines while we take horizontal steps across the t-axis.
If we wish to approximate y( t̄ ) for some fixed t̄ by taking horizontal steps of size Δt, then the error in our
approximation is proportional to Δt.


7.4: Separable Differential Equations
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a separable differential equation?
How can we find solutions to a separable differential equation?
Are some of the differential equations that arise in applications separable?

In Sections 7.2 and 7.3, we have seen several ways to approximate the solution to an initial value problem. Given the
frequency with which differential equations arise in the world around us, we would like to have some techniques for
finding explicit algebraic solutions of certain initial value problems. In this section, we focus on a particular class of
differential equations (called separable) and develop a method for finding algebraic formulas for solutions to these
equations.
A separable differential equation is a differential equation whose algebraic structure permits the variables present to be
separated in a particular way. For instance, consider the equation
\[ \frac{dy}{dt} = ty. \tag{7.4.1} \]

We would like to separate the variables t and y so that all occurrences of t appear on the right-hand side, and all occurrences of y appear on the left and multiply dy/dt. We may do this in the preceding differential equation by dividing both sides by y:
\[ \frac{1}{y}\frac{dy}{dt} = t. \tag{7.4.2} \]

Note particularly that when we attempt to separate the variables in a differential equation, we require that the left-hand side
be a product in which the derivative dy/dt is one term. Not every differential equation is separable. For example, if we
consider the equation
\[ \frac{dy}{dt} = t - y, \tag{7.4.3} \]

it may seem natural to separate it by writing
\[ y + \frac{dy}{dt} = t. \tag{7.4.4} \]

As we will see, this will not be helpful since the left-hand side is not a product of a function of y with dy/dt.

Preview Activity 7.4.1:

In this preview activity, we explore whether certain differential equations are separable or not, and then revisit some
key ideas from earlier work in integral calculus.
a. Which of the following differential equations are separable? If the equation is separable, write the equation in the revised form $g(y)\,\frac{dy}{dt} = h(t)$.

1. $\frac{dy}{dt} = -3y$
2. $t\,\frac{dy}{dt} = ty - y$
3. $\frac{dy}{dt} = t + 1$
4. $\frac{dy}{dt} = t^2 - y^2$

b. Explain why any autonomous differential equation is guaranteed to be separable.

c. Why do we include the term “+C” in the expression
\[ \int x \, dx = \frac{x^2}{2} + C? \]

d. Suppose we know that a certain function f satisfies the equation
\[ \int f(x) \, dx = \int x \, dx. \]

What can you conclude about f?

Solving separable differential equations


Before we discuss a general approach to solving a separable differential equation, it is instructive to consider an example.

Example 7.4.1: Separable Differential Equation

Find all functions y that are solutions to the differential equation


\[ \frac{dy}{dt} = \frac{t}{y^2}. \]

Solution
We begin by separating the variables and writing
\[ y^2 \frac{dy}{dt} = t. \]

Integrating both sides of the equation with respect to the independent variable t shows that
\[ \int y^2 \frac{dy}{dt} \, dt = \int t \, dt. \]

Next, we notice that the left-hand side allows us to change the variable of antidifferentiation from t to y. In particular, $dy = \frac{dy}{dt}\, dt$, so we now have
\[ \int y^2 \, dy = \int t \, dt. \]

This most recent equation says that two families of antiderivatives are equal to one another. Therefore, when we find representative antiderivatives of both sides, we know they must differ by an arbitrary constant C. Antidifferentiating and including the integration constant C on the right, we find that
\[ \frac{y^3}{3} = \frac{t^2}{2} + C. \]

Again, note that it is not necessary to include an arbitrary constant on both sides of the equation; we know that $y^3/3$ and $t^2/2$ are in the same family of antiderivatives and must therefore differ by a single constant.

Finally, we may now solve the last equation above for y as a function of t, which gives
\[ y(t) = \sqrt[3]{\frac{3}{2} t^2 + 3C}. \]

Of course, the term 3C on the right-hand side represents 3 times an unknown constant. It is, therefore, still an unknown constant, which we will rewrite as C. We thus conclude that the function
\[ y(t) = \sqrt[3]{\frac{3}{2} t^2 + C} \]

is a solution to the original differential equation for any value of C.

Notice that because this solution depends on the arbitrary constant C , we have found an infinite family of solutions.
This makes sense because we expect to find a unique solution that corresponds to any given initial value.
For example, if we want to solve the initial value problem
\[ \frac{dy}{dt} = \frac{t}{y^2} \]

with y(0) = 2, we know that the solution has the form $y(t) = \sqrt[3]{\frac{3}{2} t^2 + C}$ for some constant C. We therefore must find the appropriate value for C that gives the initial value y(0) = 2. Hence,
\[ 2 = y(0) = \sqrt[3]{\frac{3}{2} \cdot 0^2 + C} = \sqrt[3]{C}, \]

which shows that $C = 2^3 = 8$. The solution to the initial value problem is then
\[ y(t) = \sqrt[3]{\frac{3}{2} t^2 + 8}. \]
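
A formula obtained by separating variables can always be checked against the differential equation itself. The sketch below does this numerically for the initial value problem dy/dt = t/y², y(0) = 2, assuming the solution with C = 8 found above; the central-difference step `h` and the helper name `max_residual` are illustrative choices.

```python
def y(t, C=8.0):
    """Candidate solution y(t) = cube root of (3/2) t^2 + C."""
    return (1.5 * t ** 2 + C) ** (1.0 / 3.0)


def max_residual(ts, h=1e-5):
    """Largest value of |dy/dt - t / y^2| over the sample times ts.

    dy/dt is estimated with a central difference, so the residual is
    only expected to be small, not exactly zero.
    """
    worst = 0.0
    for t in ts:
        dydt = (y(t + h) - y(t - h)) / (2 * h)
        worst = max(worst, abs(dydt - t / y(t) ** 2))
    return worst


sample_times = [0.0, 0.5, 1.0, 2.0, 5.0]
print("y(0) =", y(0.0))
print("largest residual:", max_residual(sample_times))
```

A residual near zero at every sample time, together with y(0) = 2, is consistent with the formula being the solution of the initial value problem.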

The strategy of Example 7.4.1 may be applied to any differential equation of the form $\frac{dy}{dt} = g(y) \cdot h(t)$, and any differential equation of this form is said to be separable. We work to solve a separable differential equation by writing
\[ \frac{1}{g(y)} \frac{dy}{dt} = h(t), \tag{7.4.5} \]

and then integrating both sides with respect to t. After integrating, we strive to solve algebraically for y in order to write y as a function of t.
We consider one more example before doing further exploration in some activities.

Example 7.4.2: Separable Differential Equation

Solve the differential equation


\[ \frac{dy}{dt} = 3y. \]

Solution
Following the same strategy as in Example 7.4.1, we have
\[ \frac{1}{y} \frac{dy}{dt} = 3. \]

Integrating both sides with respect to t,
\[ \int \frac{1}{y} \frac{dy}{dt} \, dt = \int 3 \, dt, \]

and thus
\[ \int \frac{1}{y} \, dy = \int 3 \, dt. \]

Antidifferentiating and including the integration constant, we find that
\[ \ln |y| = 3t + C. \]

Finally, we need to solve for y. Here, one point deserves careful attention. By the definition of the natural logarithm function, it follows that
\[ |y| = e^{3t + C} = e^{3t} e^{C}. \]

Since C is an unknown constant, $e^C$ is as well, though we do know that it is positive (because $e^x$ is positive for any x). When we remove the absolute value in order to solve for y, however, this constant may be either positive or negative. We will denote this updated constant (that accounts for a possible + or −) by C to obtain
\[ y(t) = C e^{3t}. \]

There is one more slightly technical point to make. Notice that y = 0 is an equilibrium solution to this differential equation. In solving the equation above, we began by dividing both sides by y, which is not allowed if y = 0. To be perfectly careful, therefore, we will typically consider the equilibrium solutions separately. In this case, notice that the final form of our solution captures the equilibrium solution by allowing C = 0.
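
As a quick check of this example, we can differentiate the family of functions directly. The short computation below is a sketch of that verification; the symbol $y_0$ for an initial value y(0) is introduced here only for illustration.

\[ y(t) = C e^{3t} \quad \Longrightarrow \quad \frac{dy}{dt} = 3 C e^{3t} = 3\, y(t), \]

so every member of the family satisfies dy/dt = 3y, including the equilibrium solution y = 0 when C = 0. If an initial value $y(0) = y_0$ were given, then $y_0 = C e^{0} = C$, and the corresponding solution would be $y(t) = y_0 e^{3t}$.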

Activity 7.4.1

Suppose that the population of a town is growing continuously at an annual rate of 3% per year.
a. Let P (t) be the population of the town in year t . Write a differential equation that describes the annual growth rate.
b. Find the solutions of this differential equation.
c. If you know that the town’s population in year 0 is 10,000, find the population P (t).
d. How long does it take for the population to double? This time is called the doubling time.
e. Working more generally, find the doubling time if the annual growth rate is k times the population.

Activity 7.4.2: Cooling Coffee

Suppose that a cup of coffee is initially at a temperature of 105° F and is placed in a 75° F room. Newton’s law of
cooling says that
\[ \frac{dT}{dt} = -k(T - 75), \]

where k is a constant of proportionality.


a. Suppose you measure that the coffee is cooling at one degree per minute at the time the coffee is brought into the
room. Use the differential equation to determine the value of the constant k .
b. Find all the solutions of this differential equation.
c. What happens to all the solutions as t → ∞ ? Explain how this agrees with your intuition.
d. What is the temperature of the cup of coffee after 20 minutes?
e. How long does it take for the coffee to cool to 80°?

Activity 7.4.3

Solve each of the following differential equations or initial value problems.


a. $\frac{dy}{dt} - (2 - t)y = 2 - t$
b. $\frac{1}{t} \frac{dy}{dt} = e^{t^2 - 2y}$
c. $y' = 2y + 2, \quad y(0) = 2$
d. $y' = 2y^2, \quad y(-1) = 2$
e. $\frac{dy}{dt} = \frac{-2ty}{t^2 + 1}, \quad y(0) = 4$

Summary
In this section, we encountered the following important ideas:
A separable differential equation is one that may be rewritten with all occurrences of the dependent variable
multiplying the derivative and all occurrences of the independent variable on the other side of the equation.

We may find the solutions to certain separable differential equations by separating variables, integrating with respect to
t , and ultimately solving the resulting algebraic equation for y .

This technique allows us to solve many important differential equations that arise in the world around us. For instance,
questions of growth and decay and Newton’s Law of Cooling give rise to separable differential equations. Later, we
will learn in Section 7.6 that the important logistic differential equation is also separable.


7.5: Modeling with Differential Equations
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How can we use differential equations to describe phenomena in the world around us?
How can we use differential equations to better understand these phenomena?

In our work to date, we have seen several ways that differential equations arise in the natural world, from the growth of a
population to the temperature of a cup of coffee. In this section, we will look more closely at how differential equations
give us a natural way to describe various phenomena. As we’ll see, the key is to focus on understanding the different
factors that cause a quantity to change.

Preview Activity 7.5.1

Any time that the rate of change of a quantity is related to the amount of a quantity, a differential equation naturally
arises. In the following two problems, we see two such scenarios; for each, we want to develop a differential equation
whose solution is the quantity of interest.
1. Suppose you have a bank account in which money grows at an annual rate of 3%.
i. If you have $10,000 in the account, at what rate is your money growing?
ii. Suppose that you are also withdrawing money from the account at $1,000 per year. What is the rate of change
in the amount of money in the account? What are the units on this rate of change?
2. Suppose that a water tank holds 100 gallons and that a salty solution, which contains 20 grams of salt in every
gallon, enters the tank at 2 gallons per minute.
i. How much salt enters the tank each minute?
ii. Suppose that initially there are 300 grams of salt in the tank. How much salt is in each gallon at this point in
time?
iii. Finally, suppose that evenly mixed solution is pumped out of the tank at the rate of 2 gallons per minute. How
much salt leaves the tank each minute?
iv. What is the total rate of change in the amount of salt in the tank?

Developing a Differential Equation


Preview activity 7.5.1 demonstrates the kind of thinking we will be doing in this section. In each of the two examples we
considered, there is a quantity, such as the amount of money in the bank account or the amount of salt in the tank, that is
changing due to several factors. The governing differential equation results from the total rate of change being the
difference between the rate of increase and the rate of decrease.

Example 7.5.1: Lake Michigan

In the Great Lakes region, rivers flowing into the lakes carry a great deal of pollution in the form of small pieces of plastic averaging 1 millimeter in diameter. In order to understand how the amount of plastic in Lake Michigan is changing, construct a model for how this type of pollution has built up in the lake.
Solution
First, some basic facts about Lake Michigan.
The volume of the lake is $5 \times 10^{12}$ cubic meters.
Water flows into the lake at a rate of $5 \times 10^{10}$ cubic meters per year. It flows out of the lake at the same rate.
Each cubic meter flowing into the lake contains roughly $3 \times 10^{-8}$ cubic meters of plastic pollution.

Let’s denote the amount of pollution in the lake by P (t), where P is measured in cubic meters of plastic and t in
years. Our goal is to describe the rate of change of this function; in other words, we want to develop a differential
equation describing P (t).
First, we will measure how P(t) increases due to pollution flowing into the lake. We know that $5 \times 10^{10}$ cubic meters of water enter the lake every year and each cubic meter of water contains $3 \times 10^{-8}$ cubic meters of pollution. Therefore, pollution enters the lake at the rate of
\[ \left( 5 \cdot 10^{10} \ \frac{\text{m}^3 \text{ water}}{\text{year}} \right) \cdot \left( 3 \cdot 10^{-8} \ \frac{\text{m}^3 \text{ plastic}}{\text{m}^3 \text{ water}} \right) = 1.5 \cdot 10^{3} \]
cubic meters of plastic per year.

Second, we will measure how P(t) decreases due to pollution flowing out of the lake. If the total amount of pollution is P cubic meters and the volume of Lake Michigan is $5 \cdot 10^{12}$ cubic meters, then the concentration of plastic pollution in Lake Michigan is
\[ \frac{P}{5 \cdot 10^{12}} \]
cubic meters of plastic per cubic meter of water.

Since $5 \cdot 10^{10}$ cubic meters of water flow out each year, the plastic pollution leaves the lake at the rate of
\[ \left( \frac{P}{5 \cdot 10^{12}} \ \frac{\text{m}^3 \text{ plastic}}{\text{m}^3 \text{ water}} \right) \cdot \left( 5 \cdot 10^{10} \ \frac{\text{m}^3 \text{ water}}{\text{year}} \right) = \frac{P}{100} \]
cubic meters of plastic per year.

The total rate of change of P is thus the difference between the rate at which pollution enters the lake and the rate at which pollution leaves the lake; that is,
\begin{align}
\frac{dP}{dt} &= 1.5 \cdot 10^{3} - \frac{P}{100} \tag{7.5.1} \\
&= \frac{1}{100} \left( 1.5 \cdot 10^{5} - P \right). \tag{7.5.2}
\end{align}
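
The “rate in minus rate out” bookkeeping behind this equation can be written out as a small computation. The following is a sketch using the figures quoted for Lake Michigan in this example; the constant and function names are illustrative.

```python
LAKE_VOLUME = 5e12          # cubic meters of water in the lake
WATER_FLOW = 5e10           # cubic meters of water per year, in and out
PLASTIC_PER_M3_IN = 3e-8    # cubic meters of plastic per cubic meter of inflow


def rate_in():
    """Plastic entering the lake, in cubic meters of plastic per year."""
    return WATER_FLOW * PLASTIC_PER_M3_IN


def rate_out(P):
    """Plastic leaving the lake per year when the lake holds P cubic meters."""
    concentration = P / LAKE_VOLUME    # plastic per cubic meter of lake water
    return concentration * WATER_FLOW  # equals P / 100


def dP_dt(P):
    """Total rate of change of P: rate in minus rate out."""
    return rate_in() - rate_out(P)


for P in (0.0, 7.5e4, 1.5e5):
    print(f"P = {P:.3g}   dP/dt = {dP_dt(P):.1f}")
```

The rate of change is positive when the lake holds little plastic and drops to approximately zero at P = 1.5 ⋅ 10^5 cubic meters, the value at which the right-hand side of the differential equation vanishes.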

We have now found a differential equation that describes the rate at which the amount of pollution is changing. To
better understand the behavior of P (t), we now apply some of the techniques we have recently developed.
Since this is an autonomous differential equation, we can sketch dP /dt as a function of P and then construct a slope
field, as shown in Figure 7.5.1.

Figure 7.5.1: Plots of $\frac{dP}{dt}$ vs. P and the slope field for the differential equation $\frac{dP}{dt} = \frac{1}{100} \left( 1.5 \cdot 10^{5} - P \right)$.

These plots both show that $P = 1.5 \cdot 10^{5}$ is a stable equilibrium. Therefore, we should expect that the amount of pollution in Lake Michigan will stabilize near $1.5 \cdot 10^{5}$ cubic meters of pollution. (Throughout, we assume that each cubic meter of water that flows out carries with it the plastic pollution it contains.) Next, assuming that there is initially no pollution in the lake, we will solve the initial value problem
\[ \frac{dP}{dt} = \frac{1}{100} \left( 1.5 \cdot 10^{5} - P \right), \quad P(0) = 0. \]

Separating variables, we find that


\[ \frac{1}{1.5 \cdot 10^{5} - P} \frac{dP}{dt} = \frac{1}{100}. \]

Integrating with respect to t, we have
\[ \int \frac{1}{1.5 \cdot 10^{5} - P} \frac{dP}{dt} \, dt = \int \frac{1}{100} \, dt \]

and thus changing variables on the left and antidifferentiating on both sides, we find that
\begin{align*}
\int \frac{dP}{1.5 \cdot 10^{5} - P} &= \int \frac{1}{100} \, dt \\
-\ln |1.5 \cdot 10^{5} - P| &= \frac{1}{100} t + C
\end{align*}

Finally, multiplying both sides by −1 and using the definition of the logarithm, we find that
\[ 1.5 \cdot 10^{5} - P = C e^{-t/100}. \tag{7.1} \]

This is a good time to determine the constant C. Since P = 0 when t = 0, we have
\[ 1.5 \cdot 10^{5} - 0 = C e^{0} = C. \]

In other words, $C = 1.5 \times 10^{5}$.

Using this value of C in Equation (7.1) and solving for P, we arrive at the solution
\[ P(t) = 1.5 \cdot 10^{5} \left( 1 - e^{-t/100} \right). \]

Superimposing the graph of P on the slope field we saw in Figure 7.5.1, we see, as shown in Figure 7.5.2, that, as expected, the amount of plastic pollution stabilizes around $1.5 \cdot 10^{5}$ cubic meters.
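
The same behavior can be seen without a slope field by comparing the exact solution with an Euler approximation of the initial value problem dP/dt = (1.5 ⋅ 10^5 − P)/100, P(0) = 0. The sketch below assumes a step size of one year; the names `exact` and `euler_lake` are illustrative.

```python
import math

P_EQ = 1.5e5   # equilibrium amount of plastic, in cubic meters


def exact(t):
    """Exact solution P(t) = 1.5e5 * (1 - exp(-t/100))."""
    return P_EQ * (1.0 - math.exp(-t / 100.0))


def euler_lake(t_end, dt=1.0):
    """Euler approximation of P(t_end), starting from P(0) = 0."""
    n_steps = round(t_end / dt)
    P = 0.0
    for _ in range(n_steps):
        P += (P_EQ - P) / 100.0 * dt   # Delta P = (dP/dt) * Delta t
    return P


for t in (100, 300, 600, 1000):
    print(f"t = {t:4d}   exact = {exact(t):10.1f}   Euler = {euler_lake(t):10.1f}")
```

Both columns climb toward 1.5 ⋅ 10^5 as t grows, and the Euler values track the exact ones closely because the step size is small compared with the 100-year time scale in the exponent.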

There are many important lessons to learn from Example 7.5.1. Foremost is how we can develop a differential equation by thinking about the “total rate = rate in − rate out” model. In addition, we note how we can bring together all of our available understanding (plotting $\frac{dP}{dt}$ vs. P, creating a slope field, solving the differential equation) to see how the differential equation describes the behavior of a changing quantity.

Figure 7.5.2: The solution P(t) and the slope field for the differential equation $\frac{dP}{dt} = \frac{1}{100} \left( 1.5 \times 10^{5} - P \right)$.

Of course, we can also explore what happens when certain aspects of the problem change. For instance, let’s suppose we are at a time when the amount of plastic pollution in Lake Michigan has stabilized at $1.5 \times 10^{5}$ cubic meters, and that new legislation is passed to prevent this type of pollution from entering the lake. So, there is no longer any inflow of plastic pollution to the lake. How does the amount of plastic pollution in Lake Michigan now change? For example, how long does it take for the amount of plastic pollution in the lake to halve?
Restarting the problem at time t = 0 , we now have the modified initial value problem
\[ \frac{dP}{dt} = -\frac{1}{100} P, \quad P(0) = 1.5 \cdot 10^{5}. \]

It is a straightforward and familiar exercise to find that the solution to this equation is $P(t) = 1.5 \cdot 10^{5} e^{-t/100}$. The time that it takes for half of the pollution to flow out of the lake is given by T where $P(T) = 0.75 \cdot 10^{5}$. Thus, we must solve the equation
\[ 0.75 \times 10^{5} = 1.5 \times 10^{5} e^{-T/100}, \]
or
\[ \frac{1}{2} = e^{-T/100}. \]

It follows that
\[ T = -100 \ln \left( \frac{1}{2} \right) \approx 69.3 \text{ years.} \]
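
The halving time can be confirmed with a one-line computation. This is a sketch assuming the decay model dP/dt = −P/100 from this example; `halving_time` is an illustrative helper for the general rate constant k, with k = 1/100 here.

```python
import math


def halving_time(k):
    """Time for a quantity obeying dP/dt = -k * P to fall to half its value."""
    return math.log(2) / k    # same as -(1/k) * ln(1/2)


P0 = 1.5e5                    # cubic meters of plastic at t = 0
T = halving_time(1 / 100)
print(f"T = {T:.1f} years")
print(f"P(T) = {P0 * math.exp(-T / 100):.1f}")
```

Note that the halving time depends only on the rate constant and not on the starting amount, so the pollution would halve again over each further period of the same length.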

In the upcoming activities, we explore some other natural settings in which differential equations model changing quantities.

Activity 7.5.1: Accrued Savings

Suppose you have a bank account that grows by 5% every year. Let A(t) be the amount of money in the account in
year t .
a. What is the rate of change of A with respect to t ?
b. Suppose that you are also withdrawing $10,000 per year. Write a differential equation that expresses the total rate
of change of A .
c. Sketch a slope field for this differential equation, find any equilibrium solutions, and identify them as either stable
or unstable. Write a sentence or two that describes the significance of the stability of the equilibrium solution.
d. Suppose that you initially deposit $100,000 into the account. How long does it take for you to deplete the account?
e. What is the smallest amount of money you would need to have in the account to guarantee that you never deplete
the money in the account?
f. If your initial deposit is $300,000, how much could you withdraw every year without depleting the account?

Activity 7.5.2: Morphine

A dose of morphine is absorbed from the bloodstream of a patient at a rate proportional to the amount in the
bloodstream.
a. Write a differential equation for M(t), the amount of morphine in the patient’s bloodstream, using k as the constant of proportionality.
b. Assuming that the initial dose of morphine is $M_0$, solve the initial value problem to find M(t). Use the fact that the half-life for the absorption of morphine is two hours to find the constant k.
c. Suppose that a patient is given morphine intravenously at the rate of 3 milligrams per hour. Write a differential
equation that combines the intravenous administration of morphine with the body’s natural absorption.
d. Find any equilibrium solutions and determine their stability.
e. Assuming that there is initially no morphine in the patient’s bloodstream, solve the initial value problem to
determine M (t). What happens to M (t) after a very long time?
f. To what rate should a doctor reduce the intravenous rate so that there is eventually 7 milligrams of morphine in the
patient’s bloodstream?

Summary
In this section, we encountered the following important ideas:
Differential equations arise in situations where we understand how various factors cause a quantity to change.
We may use the tools we have developed so far (slope fields, Euler’s method, and our method for solving separable equations) to understand a quantity described by a differential equation.


7.6: Population Growth and the Logistic Equation
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
How can we use differential equations to realistically model the growth of a population?
How can we assess the accuracy of our models?

The growth of the earth’s population is one of the pressing issues of our time. Will the population continue to grow? Or
will it perhaps level off at some point, and if so, when? In this section, we will look at two ways in which we may use
differential equations to help us address questions such as these. Before we begin, let’s consider again two important
differential equations that we have seen earlier in this chapter.

Preview Activity 7.6.1

Recall that one model for population growth states that a population grows at a rate proportional to its size.
a. We begin with the differential equation
\[ \frac{dP}{dt} = \frac{1}{2}P. \tag{7.6.1} \]

Sketch a slope field below as well as a few typical solutions on the axes provided.
b. Find all equilibrium solutions of Equation 7.6.1 and classify them as stable or unstable.
c. If P (0) is positive, describe the long-term behavior of the solution to Equation 7.6.1.
d. Let’s now consider a modified differential equation given by
\[ \frac{dP}{dt} = \frac{1}{2}P(3 - P). \]

As before, sketch a slope field as well as a few typical solutions on the axes provided.
e. Find any equilibrium solutions and classify them as stable or unstable.
f. If P (0) is positive, describe the long-term behavior of the solution.

The Earth’s Population


We will now begin studying the earth’s population. To get started, here are some data for the earth’s population in recent
years that we will use in our investigations.

Year                       1998   1999   2000   2001   2002   2005   2006   2007   2008   2009   2010
Population (in billions)   5.932  6.008  6.084  6.159  6.234  6.456  6.531  6.606  6.681  6.756  6.831

Activity 7.6.1: Growth Dynamics

Our first model will be based on the following assumption:


The rate of change of the population is proportional to the population.
On the face of it, this seems pretty reasonable. When there is a relatively small number of people, there will be fewer
births and deaths so the rate of change will be small. When there is a larger number of people, there will be more births
and deaths so we expect a larger rate of change. If P (t) is the population t years after the year 2000, we may express
this assumption as
\[ \frac{dP}{dt} = kP \tag{7.6.2} \]

where k is a constant of proportionality.


a. Use the data in the table to estimate the derivative $P'(0)$ using a central difference. Assume that t = 0 corresponds
to the year 2000.


b. What is the population P (0)?
c. Use these two facts to estimate the constant of proportionality k in the differential equation.
d. Now that we know the value of k , we have the initial value problem of Equation 7.6.2 with P (0) = 6.084. Find
the solution to this initial value problem.
e. What does your solution predict for the population in the year 2010? Is this close to the actual population given in
the table?
f. When does your solution predict that the population will reach 12 billion?
g. What does your solution predict for the population in the year 2500?
h. Do you think this is a reasonable model for the earth’s population? Why or why not? Explain your thinking using a
couple of complete sentences.

Our work in Activity 7.6.1 shows that the exponential model is fairly accurate for years relatively close to 2000.
However, if we go too far into the future, the model predicts increasingly large rates of change, which causes the
population to grow arbitrarily large. This does not make much sense since it is unrealistic to expect that the earth would be
able to support such a large population.
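To make this concrete, here is a minimal Python sketch of the exponential model, assuming the central-difference estimate of $P'(0)$ described in Activity 7.6.1 and the population values from the table. The names `exponential_model`, `predicted_2010`, and so on are illustrative choices, not notation from the text.

```python
import math

# Population in billions, taken from the table in this section.
population = {1999: 6.008, 2000: 6.084, 2001: 6.159, 2010: 6.831}

# Central difference estimate of P'(0), where t = 0 is the year 2000.
dP_dt_0 = (population[2001] - population[1999]) / 2
P0 = population[2000]
k = dP_dt_0 / P0  # per capita growth rate, since dP/dt = kP


def exponential_model(t):
    """Solution of dP/dt = kP with P(0) = P0, for t in years after 2000."""
    return P0 * math.exp(k * t)


predicted_2010 = exponential_model(10)
years_to_12_billion = math.log(12 / P0) / k
predicted_2500 = exponential_model(500)

print(f"k = {k:.5f}")
print(f"2010: predicted {predicted_2010:.3f}, actual {population[2010]}")
print(f"12 billion reached about {years_to_12_billion:.1f} years after 2000")
print(f"2500: predicted {predicted_2500:.0f} billion")
```

The prediction for 2010 lands within about 0.1 billion of the table value, while the prediction for 2500 is in the thousands of billions, which illustrates why the model cannot be trusted far into the future.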
The constant k in the differential equation has an important interpretation. Let’s rewrite the differential equation
\[ \frac{dP}{dt} = kP \]
by solving for k, so that we have
\[ k = \frac{dP/dt}{P}. \]

Viewed in this light, k is the ratio of the rate of change to the population; in other words, it is the contribution to the rate of
change from a single person. We call this the per capita growth rate.
In the exponential model we introduced in Activity 7.6.1, the per capita growth rate is constant. In particular, we are
assuming that when the population is large, the per capita growth rate is the same as when the population is small. It is
natural to think that the per capita growth rate should decrease when the population becomes large, since there will not be
enough resources to support so many people. In other words, we expect that a more realistic model would hold if we
assume that the per capita growth rate depends on the population P. In the previous activity, we computed the per capita
growth rate in a single year by computing k, the quotient of $\frac{dP}{dt}$ and P (which we did for t = 0). If we return to the data and
compute the per capita growth rate over a range of years, we generate the data shown in Figure 7.6.1, which shows how
the per capita growth rate is a function of the population, P.

Figure 7.6.1: A plot of per capita growth rate vs. population P.
From the data, we see that the per capita growth rate appears to decrease as the population increases. In fact, the points
seem to lie very close to a line, which is shown at two different scales in Figure 7.6.2.

Figure 7.6.2: The line that approximates per capita growth as a function of population, P.
Looking at this line carefully, we can find its equation to be
\[ \frac{dP/dt}{P} = 0.025 - 0.002P. \]
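The text reads this line off the figure. As a rough check, the sketch below estimates the per capita growth rate from the table with central differences and then fits a line by ordinary least squares. The least-squares fit is our assumption about how such a line could be obtained, not necessarily the authors’ method, and the variable names are illustrative.

```python
# Population in billions, from the table in this section.
years = [1998, 1999, 2000, 2001, 2002, 2005, 2006, 2007, 2008, 2009, 2010]
pops = [5.932, 6.008, 6.084, 6.159, 6.234, 6.456, 6.531, 6.606, 6.681, 6.756, 6.831]
table = dict(zip(years, pops))

# Per capita growth rate (dP/dt)/P at each year that has data one year
# before and one year after, using a central difference for dP/dt.
points = []
for year, P in table.items():
    if year - 1 in table and year + 1 in table:
        dP_dt = (table[year + 1] - table[year - 1]) / 2
        points.append((P, dP_dt / P))

# Ordinary least-squares line: rate ≈ intercept + slope * P.
n = len(points)
mean_P = sum(P for P, _ in points) / n
mean_r = sum(r for _, r in points) / n
slope = (sum((P - mean_P) * (r - mean_r) for P, r in points)
         / sum((P - mean_P) ** 2 for P, _ in points))
intercept = mean_r - slope * mean_P

print(f"per capita rate ≈ {intercept:.4f} + ({slope:.4f}) P")
```

With this data the fitted intercept and slope come out close to 0.025 and −0.002, in line with the equation above.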

If we multiply both sides by P , we arrive at the differential equation


\[ \frac{dP}{dt} = P(0.025 - 0.002P). \tag{7.6.3} \]

Graphing the dependence of $\frac{dP}{dt}$ on the population P, we see that this differential equation demonstrates a quadratic
relationship between $\frac{dP}{dt}$ and P, as shown in Figure 7.6.3.

Figure 7.6.3: A plot of $\frac{dP}{dt}$ vs. P for Equation 7.6.3.
Equation 7.6.3 is an example of the logistic equation, and is the second model for population growth that we will consider.
We have reason to believe that it will be more realistic since the per capita growth rate is a decreasing function of the
population.
Indeed, the graph in Figure 7.6.3 shows that there are two equilibrium solutions, P = 0 , which is unstable, and P = 12.5,
which is a stable equilibrium. The graph shows that any solution with P (0) > 0 will eventually stabilize around 12.5. In
other words, our model predicts the world’s population will eventually stabilize around 12.5 billion.
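The stability claims can also be checked numerically. The sketch below, with the hypothetical helper names `growth_rate` and `classify`, looks at the sign of $\frac{dP}{dt}$ just below and just above each equilibrium of Equation 7.6.3, which mirrors reading the signs off the graph.

```python
def growth_rate(P):
    """Right-hand side of Equation 7.6.3: dP/dt = P(0.025 - 0.002P)."""
    return P * (0.025 - 0.002 * P)


def classify(P_eq, step=0.1):
    """Classify an equilibrium by the sign of dP/dt on either side of it."""
    below = growth_rate(P_eq - step)
    above = growth_rate(P_eq + step)
    if below > 0 and above < 0:
        return "stable"      # nearby solutions move toward P_eq
    if below < 0 and above > 0:
        return "unstable"    # nearby solutions move away from P_eq
    return "semi-stable"


for P_eq in (0.0, 12.5):
    print(f"P = {P_eq}: {classify(P_eq)}")
```

The step size of 0.1 is an arbitrary choice; any step small enough that no other equilibrium lies within it would work for this check.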
A prediction for the long-term behavior of the population is a valuable conclusion to draw from our differential equation.
We would, however, like to answer some quantitative questions. For instance, how long will it take to reach a population
of 10 billion? To determine this, we need to find an explicit solution of the equation.

Solving the Logistic Differential Equation

Since we would like to apply the logistic model in more general situations, we state the logistic equation in its
more general form,
\[ \frac{dP}{dt} = kP(N - P). \tag{7.6.4} \]

The equilibrium solutions occur when P = 0 and when N − P = 0 (equivalently, $1 - \frac{P}{N} = 0$), which shows that P = N. The equilibrium at P = N is
called the carrying capacity of the population, for it represents the stable population that can be sustained by the
environment.
We now solve the logistic Equation 7.6.4, which is separable, so we separate the variables
\[ \frac{1}{P(N - P)}\frac{dP}{dt} = k, \]

and integrate to find that


\[ \int \frac{1}{P(N - P)}\,dP = \int k\,dt. \]

To find the antiderivative on the left, we use the partial fraction decomposition
\[ \frac{1}{P(N - P)} = \frac{1}{N}\left[\frac{1}{P} + \frac{1}{N - P}\right]. \]
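As a quick check of this decomposition, we can combine the two fractions on the right over a common denominator:
\[
\frac{1}{N}\left[\frac{1}{P} + \frac{1}{N - P}\right]
= \frac{1}{N}\cdot\frac{(N - P) + P}{P(N - P)}
= \frac{1}{N}\cdot\frac{N}{P(N - P)}
= \frac{1}{P(N - P)}.
\]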

Now we are ready to integrate, with


\[ \int \frac{1}{N}\left[\frac{1}{P} + \frac{1}{N - P}\right] dP = \int k\,dt. \]

On the left, observe that N is constant, so we can remove the factor of $\frac{1}{N}$ and antidifferentiate to find that
\[ \frac{1}{N}(\ln|P| - \ln|N - P|) = kt + C. \]

Multiplying both sides of this last equation by N and using an important rule of logarithms, we next find that
\[ \ln\left|\frac{P}{N - P}\right| = kNt + C. \]

From the definition of the logarithm, replacing $e^C$ with C, and letting C absorb the absolute value signs, we now know
that
\[ \frac{P}{N - P} = Ce^{kNt}. \]
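Since the constant C is renamed twice in the last two steps, it may help to see them written out with temporary names. Here $C_1$ and $C_2$ are labels we introduce only for this aside. Multiplying by N and combining the logarithms gives
\[
\ln|P| - \ln|N - P| = \ln\left|\frac{P}{N - P}\right| = kNt + C_1, \qquad C_1 = NC,
\]
and exponentiating both sides gives
\[
\left|\frac{P}{N - P}\right| = e^{kNt + C_1} = e^{C_1}e^{kNt},
\qquad\text{so}\qquad
\frac{P}{N - P} = C_2 e^{kNt}, \quad C_2 = \pm e^{C_1}.
\]
In the text, both $C_1$ and $C_2$ are simply written as C.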

At this point, all that remains is to determine C and solve algebraically for P .
If the initial population is $P(0) = P_0$, then it follows that
\[ C = \frac{P_0}{N - P_0}, \]
so
\[ \frac{P}{N - P} = \frac{P_0}{N - P_0}e^{kNt}. \]

We will solve this most recent equation for P by multiplying both sides by $(N - P)(N - P_0)$ to obtain
\begin{align}
P(N - P_0) &= P_0(N - P)e^{kNt} \tag{7.6.5} \\
&= P_0 N e^{kNt} - P_0 P e^{kNt}. \tag{7.6.6}
\end{align}

Swapping the left and right sides, expanding, and factoring, it follows that
\begin{align}
P_0 N e^{kNt} &= P(N - P_0) + P_0 P e^{kNt} \tag{7.6.7} \\
&= P(N - P_0 + P_0 e^{kNt}). \tag{7.6.8}
\end{align}

Dividing to solve for P , we see that


\[ P = \frac{P_0 N e^{kNt}}{N - P_0 + P_0 e^{kNt}}. \]

Finally, we choose to multiply the numerator and denominator by $\frac{1}{P_0}e^{-kNt}$ to obtain
\[ P(t) = \frac{N}{\left(\frac{N - P_0}{P_0}\right)e^{-kNt} + 1}. \tag{7.6.9} \]

While that was a lot of algebra, notice the result: we have found an explicit solution to the initial value problem
\[ \frac{dP}{dt} = kP(N - P), \quad P(0) = P_0, \]
and that solution is Equation 7.6.9.
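A numerical sanity check can catch algebra slips in a derivation like this one. The following sketch implements Equation 7.6.9 as a Python function and checks it against the initial condition, the long-term limit, and the differential equation itself. The parameter values are arbitrary illustrative choices, not values from the text.

```python
import math


def logistic_solution(t, k, N, P0):
    """Equation 7.6.9: solution of dP/dt = kP(N - P) with P(0) = P0."""
    return N / (((N - P0) / P0) * math.exp(-k * N * t) + 1)


# Illustrative parameter values (assumptions, not from the text).
k, N, P0 = 0.1, 10.0, 2.0

# Check the initial condition and the long-term behavior.
print(logistic_solution(0, k, N, P0))
print(logistic_solution(100, k, N, P0))

# Check the differential equation at one time with a central difference.
t, h = 1.5, 1e-5
P = logistic_solution(t, k, N, P0)
derivative = (logistic_solution(t + h, k, N, P0)
              - logistic_solution(t - h, k, N, P0)) / (2 * h)
print(derivative, k * P * (N - P))
```

The two numbers on the last line should agree to several decimal places, since the central difference only approximates the derivative.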

For the logistic equation describing the earth’s population that we worked with earlier in this section, we have
k = 0.002, N = 12.5, and $P_0 = 6.084$.
This gives the solution
\[ P(t) = \frac{12.5}{1.0546e^{-0.025t} + 1}, \tag{7.6.10} \]

whose graph is shown in Figure 7.6.4. Notice that the graph shows the population leveling off at 12.5 billion, as we
expected, and that the population will be around 10 billion in the year 2050. These results, which we have found using a
relatively simple mathematical model, agree fairly well with predictions made using a much more sophisticated model
developed by the United Nations.
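To answer quantitative questions such as when the population reaches a given size, we can solve Equation 7.6.10 for t. The sketch below does this under the values k = 0.002, N = 12.5, and $P_0 = 6.084$ used in this section. The helper names `population` and `years_to_reach` and the shorthand `A` for the coefficient $\frac{N - P_0}{P_0}$ are ours.

```python
import math

k, N, P0 = 0.002, 12.5, 6.084  # values used in this section
A = (N - P0) / P0               # coefficient 1.0546 in Equation 7.6.10


def population(t):
    """Equation 7.6.10, with t in years after 2000 and P in billions."""
    return N / (A * math.exp(-k * N * t) + 1)


def years_to_reach(P):
    """Solve population(t) = P for t, valid for 0 < P < N."""
    return math.log(A * P / (N - P)) / (k * N)


print(f"A = {A:.4f}")
print(f"P(50) = {population(50):.2f} billion")
print(f"10 billion at t = {years_to_reach(10):.1f} years after 2000")
```

Under these values the model gives about 9.6 billion in 2050 and reaches 10 billion roughly 58 years after 2000.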

Figure 7.6.4: The solution to the logistic equation modeling the earth’s population (Equation 7.6.10).
The logistic equation is useful in other situations, too, as it is good for modeling any situation in which limited growth is
possible. For instance, it could model the spread of a flu virus through a population contained on a cruise ship, the rate at
which a rumor spreads within a small town, or the behavior of an animal population on an island. Again, it is important to
realize that through our work in this section, we have completely solved the logistic equation, regardless of the values of
the constants N, k, and $P_0$.
Anytime we encounter a logistic equation, we can apply the formula we found in Equation 7.6.9.

Activity 7.6.2: Predicting Earth's Population

Consider the logistic equation


\[ \frac{dP}{dt} = kP(N - P) \]
with the graph of $\frac{dP}{dt}$ vs. P shown below.

a. At what value of P is the rate of change greatest?


b. Consider the model for the earth’s population that we created. At what value of P is the rate of change greatest?
How does that compare to the population in recent years?
c. According to the model we developed, what will the population be in the year 2100?
d. According to the model we developed, when will the population reach 9 billion?
e. Now consider the general solution to the general logistic initial value problem that we found, given by Equation
7.6.9. Verify algebraically that $P(0) = P_0$ and that $\lim_{t \to \infty} P(t) = N$.

Summary
In this section, we encountered the following important ideas:
If we assume that the rate of growth of a population is proportional to the population, we are led to a model in which
the population grows without bound and at a rate that grows without bound.
By assuming that the per capita growth rate decreases as the population grows, we are led to the logistic model of
population growth, which predicts that the population will eventually stabilize at the carrying capacity.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

7.E: Differential Equations (Exercises)
7.1: An Introduction to Differential Equations
Q7.1.1
Suppose that T(t) represents the temperature of a cup of coffee set out in a room, where T is expressed in degrees
Fahrenheit and t in minutes. A physical principle known as Newton’s Law of Cooling tells us that $\frac{dT}{dt} = -\frac{1}{15}T + 5$.
a. Suppose that T(0) = 105. What does the differential equation give us for the value of $\left.\frac{dT}{dt}\right|_{T=105}$? Explain in a
complete sentence the meaning of these two facts.
b. Is T increasing or decreasing at t = 0?
c. What is the approximate temperature at t = 1?
d. On the graph below, make a plot of dT/dt as a function of T.

e. For which values of T does T increase? For which values of T does T decrease?
f. What do you think is the temperature of the room? Explain your thinking.
g. Verify that $T(t) = 75 + 30e^{-t/15}$ is the solution to the differential equation with initial value T(0) = 105. What happens
to this solution after a long time?

Q7.1.2
Suppose that the population of a particular species is described by the function P(t), where P is expressed in millions.
Suppose further that the population’s rate of change is governed by the differential equation $\frac{dP}{dt} = f(P)$, where f(P) is the
function graphed below.

a. For which values of the population P does the population increase?


b. For which values of the population P does the population decrease?
c. If P(0) = 3, how will the population change in time?
d. If the initial population satisfies 0 < P(0) < 1, what will happen to the population after a very long time?
e. If the initial population satisfies 1 < P(0) < 3, what will happen to the population after a very long time?
f. If the initial population satisfies 3 < P(0), what will happen to the population after a very long time?

g. This model for a population’s growth is sometimes called “growth with a threshold.” Explain why this is an appropriate
name.

Q7.1.3
In this problem, we test further what it means for a function to be a solution to a given differential equation.
a. Consider the differential equation $\frac{dy}{dt} = y - t$. Determine whether the following functions are solutions to the given
differential equation.
(i) $y(t) = t + 1 + 2e^t$
(ii) y(t) = t + 1
(iii) y(t) = t + 2
b. When you weigh bananas in a scale at the grocery store, the height h of the bananas is described by the differential
equation $\frac{d^2h}{dt^2} = -kh$, where k is the spring constant, a constant that depends on the properties of the spring in the
scale. After you put the bananas in the scale, you (cleverly) observe that the height of the bananas is given by
h(t) = 4 sin(3t). What is the value of the spring constant?

7.2: Qualitative Behavior of Solutions to DE's


Q7.2.1
Consider the differential equation $\frac{dy}{dt} = t - y$.

a. Sketch a slope field on the plot below:


b. Sketch the solutions whose initial values are y(0) = −4, −3, . . ., 4.
c. What do your sketches suggest is the solution whose initial value is y(0) = −1? Verify that this is indeed the solution to
this initial value problem.
d. By considering the differential equation and the graphs you have sketched, what is the relationship between t and y at a
point where a solution has a local minimum?

Q7.2.2
Consider the situation from problem 2 of Section 7.1: Suppose that the population of a particular species is described by
the function P(t), where P is expressed in millions. Suppose further that the population’s rate of change is governed by the
differential equation $\frac{dP}{dt} = f(P)$, where f(P) is the function graphed below.

a. Sketch a slope field for this differential equation. You do not have enough information to determine the actual slopes,
but you should have enough information to determine where slopes are positive, negative, zero, large, or small, and
hence determine the qualitative behavior of solutions.
b. Sketch some solutions to this differential equation when the initial population P(0) > 0.
c. Identify any equilibrium solutions to the differential equation and classify them as stable or unstable.
d. If P(0) > 1, what is the eventual fate of the species?
e. If P(0) < 1, what is the eventual fate of the species?
f. Remember that we referred to this model for population growth as “growth with a threshold.” Explain why this
characterization makes sense by considering solutions whose initial value is close to 1.

Q7.2.3
The population of a species of fish in a lake is P(t) where P is measured in thousands of fish and t is measured in months.
The growth of the population is described by the differential equation $\frac{dP}{dt} = f(P) = P(6 - P)$.
a. Sketch a graph of f (P) = P(6 − P) and use it to determine the equilibrium solutions and whether they are stable or
unstable. Write a complete sentence that describes the long-term behavior of the fish population.
b. Suppose now that the owners of the lake allow fishers to remove 1000 fish from the lake every month (remember that
P(t) is measured in thousands of fish). Modify the differential equation to take this into account. Sketch the new graph
of dP/dt versus P. Determine the new equilibrium solutions and decide whether they are stable or unstable.
c. Given the situation in part (b), give a description of the long-term behavior of the fish population.
d. Suppose that fishermen remove h thousand fish per month. How is the differential equation modified?
e. What is the largest number of fish that can be removed per month without eliminating the fish population? If fish are
removed at this maximum rate, what is the eventual population of fish?

Q7.2.4
Let y(t) be the number of thousands of mice that live on a farm; assume time t is measured in years. (This problem is
based on an ecological analysis presented in a research paper by C.S. Holling: The Components of Predation as Revealed
by a Study of Small Mammal Predation of the European Pine Sawfly, The Canadian Entomologist 91: 293-320.)
a. The population of the mice grows at a yearly rate that is twenty times the number of mice. Express this as a differential
equation.
b. At some point, the farmer brings C cats to the farm. The number of mice that the cats can eat in a year is $M(y) = C\frac{y}{2 + y}$
thousand mice per year. Explain how this modifies the differential equation that you found in part (a).
c. Sketch a graph of the function M(y) for a single cat C = 1 and explain its features by looking, for instance, at the
behavior of M(y) when y is small and when y is large.
d. Suppose that C = 1. Find the equilibrium solutions and determine whether they are stable or unstable. Use this to
explain the long-term behavior of the mice population depending on the initial population of the mice.
e. Suppose that C = 60. Find the equilibrium solutions and determine whether they are stable or unstable. Use this to
explain the long-term behavior of the mice population depending on the initial population of the mice.
f. What is the smallest number of cats you would need to keep the mice population from growing arbitrarily large?

7.3: Euler's Method


Q7.3.1
Newton’s Law of Cooling says that the rate at which an object, such as a cup of coffee, cools is proportional to the
difference in the object’s temperature and room temperature. If T(t) is the object’s temperature and $T_r$ is room temperature,
this law is expressed as $\frac{dT}{dt} = -k(T - T_r)$, where k is a constant of proportionality. In this problem, temperature is
measured in degrees Fahrenheit and time in minutes.
a. Two calculus students, Alice and Bob, enter a 70° classroom at the same time. Each has a cup of coffee that is 100°.
The differential equation for Alice has a constant of proportionality k = 0.5, while the constant of proportionality for
Bob is k = 0.1. What is the initial rate of change for Alice’s coffee? What is the initial rate of change for Bob’s coffee?
b. What feature of Alice’s and Bob’s cups of coffee could explain this difference?

c. As the heating unit turns on and off in the room, the temperature in the room is $T_r = 70 + 10\sin t$. Implement Euler’s
method with a step size of ∆t = 0.1 to approximate the temperature of Alice’s coffee over the time interval 0 ≤ t ≤ 50.
This will most easily be performed using a spreadsheet such as Excel. Graph the temperature of her coffee and room
temperature over this interval.
d. In the same way, implement Euler’s method to approximate the temperature of Bob’s coffee over the same time
interval. Graph the temperature of his coffee and room temperature over the interval.
e. Explain the similarities and differences that you see in the behavior of Alice’s and Bob’s cups of coffee.
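For readers who prefer code to a spreadsheet, here is one possible way to set up the Euler’s method computation from parts (c) and (d) in Python. The function names `euler`, `room_temperature`, and `cooling` are hypothetical, and the sketch assumes the values k = 0.5 and k = 0.1, an initial temperature of 100°, and the step size ∆t = 0.1 stated in the problem.

```python
import math


def euler(f, t0, y0, dt, n_steps):
    """Euler's method: return lists of times and approximate values."""
    ts, ys = [t0], [y0]
    for i in range(n_steps):
        t, y = ts[-1], ys[-1]
        ys.append(y + dt * f(t, y))
        ts.append(t0 + (i + 1) * dt)
    return ts, ys


def room_temperature(t):
    """Room temperature Tr = 70 + 10 sin t from part (c)."""
    return 70 + 10 * math.sin(t)


def cooling(k):
    """Right-hand side of dT/dt = -k(T - Tr) for a given constant k."""
    return lambda t, T: -k * (T - room_temperature(t))


dt = 0.1
n_steps = round(50 / dt)
times, alice = euler(cooling(0.5), 0.0, 100.0, dt, n_steps)
_, bob = euler(cooling(0.1), 0.0, 100.0, dt, n_steps)
print(f"t = {times[-1]:.1f}: Alice {alice[-1]:.2f}, Bob {bob[-1]:.2f}")
```

The lists `times`, `alice`, and `bob` can then be passed to any plotting tool to produce the graphs the problem asks for.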

Q7.3.2
We have seen that the error in approximating the solution to an initial value problem is proportional to ∆t. That is, if $E_{\Delta t}$ is
the Euler’s method approximation to the solution to an initial value problem at t, then $y(t) - E_{\Delta t} \approx K\Delta t$ for some constant
of proportionality K. In this problem, we will see how to use this fact to improve our estimates, using an idea called
accelerated convergence.
a. We will create a new approximation by assuming the error is exactly proportional to ∆t, according to the formula
$y(t) - E_{\Delta t} = K\Delta t$. Using our earlier results from the initial value problem dy/dt = y and y(0) = 1 with ∆t = 0.2 and ∆t = 0.1, we
have y(1) − 2.4883 = 0.2K and y(1) − 2.5937 = 0.1K. This is a system of two linear equations in the unknowns y(1) and K.
Solve this system to find a new approximation for y(1). (You may remember that the exact value is y(1) = e = 2.71828....)
b. Use the other data, $E_{0.05} = 2.6533$ and $E_{0.025} = 2.6851$, to do similar work as in part (a) to obtain another approximation.
Which gives the better approximation? Why do you think this is?
c. Let’s now study the initial value problem $\frac{dy}{dt} = t - y$, y(0) = 0. Approximate y(0.3) by applying Euler’s method to
find approximations $E_{0.1}$ and $E_{0.05}$. Now use the idea of accelerated convergence to obtain a better approximation.
(For the sake of comparison, you may want to note that the actual value is y(0.3) = 0.0408.)
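The idea of accelerated convergence can be sketched in a few lines of Python. Assuming the coarse step is exactly twice the fine step, eliminating K from the two linear equations leaves the combination 2·E_fine − E_coarse. The helper names `euler_final` and `accelerate` are illustrative, and the example uses the initial value problem dy/dt = y, y(0) = 1 from part (a).

```python
import math


def euler_final(f, t0, y0, t_end, dt):
    """Euler's method approximation to y(t_end) with step size dt."""
    n_steps = round((t_end - t0) / dt)
    t, y = t0, y0
    for i in range(n_steps):
        y += dt * f(t, y)
        t = t0 + (i + 1) * dt
    return y


def accelerate(E_coarse, E_fine):
    """Eliminate K from y - E_coarse = 2hK and y - E_fine = hK."""
    return 2 * E_fine - E_coarse


f = lambda t, y: y  # dy/dt = y, y(0) = 1, so y(1) = e
E_02 = euler_final(f, 0.0, 1.0, 1.0, 0.2)
E_01 = euler_final(f, 0.0, 1.0, 1.0, 0.1)
improved = accelerate(E_02, E_01)
print(f"E_0.2 = {E_02:.4f}, E_0.1 = {E_01:.4f}, accelerated = {improved:.4f}")
print(f"errors: {math.e - E_01:.4f} before, {math.e - improved:.4f} after")
```

The same two functions can be reused for parts (b) and (c) by changing the step sizes and the right-hand side `f`.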

Q7.3.3
In this problem, we’ll modify Euler’s method to obtain better approximations to solutions of initial value problems. This
method is called the Improved Euler’s method. In Euler’s method, we walk across an interval of width ∆t using the slope
obtained from the differential equation at the left endpoint of the interval. Of course, the slope of the solution will most
likely change over this interval. We can improve our approximation by trying to incorporate the change in the slope over
the interval. Let’s again consider the initial value problem dy/dt = y and y(0) = 1, which we will approximate using steps of
width ∆t = 0.2. Our first interval is therefore 0 ≤ t ≤ 0.2. At t = 0, the differential equation tells us that the slope is 1, and
the approximation we obtain from Euler’s method is that $y(0.2) \approx y_1 = 1 + 1(0.2) = 1.2$. This gives us some idea for how
the slope has changed over the interval 0 ≤ t ≤ 0.2. We know the slope at t = 0 is 1, while the slope at t = 0.2 is 1.2, trusting
in the Euler’s method approximation. We will therefore refine our estimate of the initial slope to be the average of these
two slopes; that is, we will estimate the slope to be (1 + 1.2)/2 = 1.1. This gives the new approximation
$y(0.2) \approx y_1 = 1 + 1.1(0.2) = 1.22$. The first few steps look like this:

a. Continue with this method to obtain an approximation for y(1) = e.


b. Repeat this method with ∆t = 0.1 to obtain a better approximation for y(1).
c. We saw that the error in Euler’s method is proportional to ∆t. Using your results from parts (a) and (b), what power of ∆t
appears to be proportional to the error in the Improved Euler’s Method?
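The procedure described in this problem can be written as a short loop. The following is a sketch, under the assumption that each step averages the slope at the left endpoint with the slope at the point predicted by an ordinary Euler step, exactly as in the worked first step above. The function name `improved_euler` is ours.

```python
def improved_euler(f, t0, y0, t_end, dt):
    """Improved Euler's method: average the slopes at both ends of each step."""
    n_steps = round((t_end - t0) / dt)
    t, y = t0, y0
    values = [y0]
    for i in range(n_steps):
        slope_left = f(t, y)
        y_euler = y + dt * slope_left      # ordinary Euler prediction
        slope_right = f(t + dt, y_euler)   # slope at the predicted point
        y += dt * (slope_left + slope_right) / 2
        t = t0 + (i + 1) * dt
        values.append(y)
    return values


values = improved_euler(lambda t, y: y, 0.0, 1.0, 1.0, 0.2)
print([round(v, 4) for v in values])
```

The second entry of `values` reproduces the value 1.22 found above, and calling the function again with `dt = 0.1` gives the data needed for part (b).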

7.4: Separable Differential Equations

Q7.4.1
The mass of a radioactive sample decays at a rate that is proportional to its mass.
a. Express this fact as a differential equation for the mass M(t) using k for the constant of proportionality.
b. If the initial mass is $M_0$, find an expression for the mass M(t).
c. The half-life of the sample is the amount of time required for half of the mass to decay. Knowing that the half-life of
Carbon-14 is 5730 years, find the value of k for a sample of Carbon-14.
d. How long does it take for a sample of Carbon-14 to be reduced to one-quarter its original mass?
e. Carbon-14 naturally occurs in our environment; any living organism takes in Carbon-14 when it eats and breathes.
Upon dying, however, the organism no longer takes in Carbon-14. Suppose that you find remnants of a pre-historic
firepit. By analyzing the charred wood in the pit, you determine that the amount of Carbon-14 is only 30% of the
amount in living trees. Estimate the age of the firepit. (This approach is the basic idea behind radiocarbon dating.)

Q7.4.2
Consider the initial value problem $\frac{dy}{dt} = -\frac{t}{y}$, y(0) = 8.
a. Find the solution of the initial value problem and sketch its graph.
b. For what values of t is the solution defined?
c. What is the value of y at the last time that the solution is defined?
d. By looking at the differential equation, explain why we should not expect to find solutions with the value of y you
noted in (c).

Q7.4.3
Suppose that a cylindrical water tank with a hole in the bottom is filled with water. The water, of course, will leak out and
the height of the water will decrease. Let h(t) denote the height of the water. A physical principle called Torricelli’s Law
implies that the height decreases at a rate proportional to the square root of the height.
a. Express this fact using k as the constant of proportionality.
b. Suppose you have two tanks, one with k = −1 and another with k = −10. What physical differences would you expect to
find?
c. Suppose you have a tank for which the height decreases at 20 inches per minute when the water is filled to a depth of
100 inches. Find the value of k.
d. Solve the initial value problem for the tank in part (c), and graph the solution you determine.
e. How long does it take for the water to run out of the tank?
f. Is the solution that you found valid for all time t? If so, explain how you know this. If not, explain why not.

Q7.4.4
The Gompertz equation is a model that is used to describe the growth of certain populations. Suppose that P(t) is the
population of some organism and that $\frac{dP}{dt} = -P\ln\left(\frac{P}{3}\right) = -P(\ln P - \ln 3)$.
a. Sketch a slope field for P(t) over the range 0 ≤ P ≤ 6.
b. Identify any equilibrium solutions and determine whether they are stable or unstable.
c. Find the population P(t) assuming that P(0) = 1 and sketch its graph. What happens to P(t) after a very long time?
d. Find the population P(t) assuming that P(0) = 6 and sketch its graph. What happens to P(t) after a very long time?
e. Verify that the long-term behavior of your solutions agrees with what you predicted by looking at the slope field.

7.5: Modeling with Differential Equations


Q7.5.1
Congratulations, you just won the lottery! In one option presented to you, you will be paid one million dollars a year for
the next 25 years. You can deposit this money in an account that will earn 5% each year.
a. Set up a differential equation that describes the rate of change in the amount of money in the account. Two factors
cause the amount to grow: first, you are depositing one million dollars per year and second, you are earning 5%
interest.

b. If there is no amount of money in the account when you open it, how much money will you have in the account after 25
years?
c. The second option presented to you is to take a lump sum of 10 million dollars, which you will deposit into a similar
account. How much money will you have in that account after 25 years?
d. Do you prefer the first or second option? Explain your thinking.
e. At what time does the amount of money in the account under the first option overtake the amount of money in the
account under the second option?

Q7.5.2
When a skydiver jumps from a plane, gravity causes her downward velocity to increase at the rate of g ≈ 9.8 meters per
second squared. At the same time, wind resistance causes her velocity to decrease at a rate proportional to the velocity.
a. Using k to represent the constant of proportionality, write a differential equation that describes the rate of change of the
skydiver’s velocity.
b. Find any equilibrium solutions and decide whether they are stable or unstable. Your result should depend on k.
c. Suppose that the initial velocity is zero. Find the velocity v(t).
d. A typical terminal velocity for a skydiver falling face down is 54 meters per second. What is the value of k for this
skydiver?
e. How long does it take to reach 50% of the terminal velocity?

Q7.5.3
During the first few years of life, the rate at which a baby gains weight is proportional to the reciprocal of its weight.
a. Express this fact as a differential equation.
b. Suppose that a baby weighs 8 pounds at birth and 9 pounds one month later. How much will he weigh at one year?
c. Do you think this is a realistic model for a long time?

Q7.5.4
Suppose that you have a water tank that holds 100 gallons of water. A briny solution, which contains 20 grams of salt per
gallon, enters the tank at the rate of 3 gallons per minute. At the same time, the solution is well mixed, and water is
pumped out of the tank at the rate of 3 gallons per minute.
a. Since 3 gallons enter the tank every minute and 3 gallons leave every minute, what can you conclude about the
volume of water in the tank?
b. How many grams of salt enter the tank every minute?
c. Suppose that S(t) denotes the number of grams of salt in the tank in minute t. How many grams are there in each gallon
in minute t?
d. Since water leaves the tank at 3 gallons per minute, how many grams of salt leave the tank each minute?
e. Write a differential equation that expresses the total rate of change of S. Identify any equilibrium solutions and
determine whether they are stable or unstable.
f. Suppose that there is initially no salt in the tank. Find the amount of salt S(t) in minute t.
g. What happens to S(t) after a very long time? Explain how you could have predicted this only knowing how much salt
there is in each gallon of the briny solution that enters the tank.

7.6: Population Growth and the Logistic Equation


Q7.6.1
The logistic equation may be used to model how a rumor spreads through a group of people. Suppose that p(t) is the
fraction of people that have heard the rumor on day t. The equation $\frac{dp}{dt} = 0.2p(1 - p)$ describes how p changes. Suppose
initially that one-tenth of the people have heard the rumor; that is, p(0) = 0.1.
a. What happens to p(t) after a very long time?
b. Determine a formula for the function p(t).
c. At what time is p changing most rapidly?
d. How long does it take before 80% of the people have heard the rumor?

Q7.6.2
Suppose that b(t) measures the number of bacteria living in a colony in a Petri dish, where b is measured in thousands and t
is measured in days. One day, you measure that there are 6,000 bacteria and the per capita growth rate is 3. A few days
later, you measure that there are 9,000 bacteria and the per capita growth rate is 2.
a. Assume that the per capita growth rate (db/dt)/b is a linear function of b. Use the measurements to find this function and
write a logistic equation to describe db/dt.
b. What is the carrying capacity for the bacteria?
c. At what population is the number of bacteria increasing most rapidly?
d. If there are initially 1,000 bacteria, how long will it take to reach 80% of the carrying capacity?

Q7.6.3
Suppose that the population of a species of fish is controlled by the logistic equation dP/dt = 0.1P(10 − P), where P is
measured in thousands of fish and t is measured in years.
a. What is the carrying capacity of this population?
b. Suppose that a long time has passed and that the fish population is stable at the carrying capacity. At this time, humans
begin harvesting 20% of the fish every year. Modify the differential equation by adding a term to incorporate the
harvesting of fish.
c. What is the new carrying capacity?
d. What will the fish population be one year after the harvesting begins?
e. How long will it take for the population to be within 10% of the carrying capacity?

CHAPTER OVERVIEW

8: SEQUENCES AND SERIES
An Introductory Calculus Libretexts Textmap
Active Calculus
by Matt Boelkins, David Austin, and Steve Schlicker
Chapter 1

Chapter 1: Understanding the Derivative


1.1: How do we Measure Velocity?
1.2: The Notion of Limit
1.3: The Derivative of a Function at a Point
1.4: The Derivative Function
1.5: Interpreting, Estimating, and Using the Derivative
1.6: The Second Derivative
1.7: Limits, Continuity, and Differentiability
1.8: The Tangent Line Approximation
1.E: Understanding the Derivative (Exercises)

• Chapter 2

Chapter 2: Computing Derivatives


2.1: Elementary Derivative Rules
2.2: The Sine and Cosine Functions
2.3: The Product and Quotient Rules
2.4: Derivatives of Other Trigonometric Functions
2.5: The Chain Rule
2.6: Derivatives of Inverse Functions
2.7: Derivatives of Functions Given Implicitly
2.8: Using Derivatives to Evaluate Limits
2.E: Computing Derivatives (Exercises)

• Chapter 3

Chapter 3: Using Derivatives


3.1: Using Derivatives to Identify Extreme Values
3.2: Using Derivatives to Describe Families of Functions
3.3: Global Optimization
3.4: Applied Optimization
3.5: Related Rates
3.E: Using Derivatives (Exercises)

• Chapter 4

Chapter 4: The Definite Integral


4.1: Determining Distance Traveled from Velocity
4.2: Riemann Sums
4.3: The Definite Integral
4.4: The Fundamental Theorem of Calculus
4.E: The Definite Integral (Exercises)

• Chapter 5

Chapter 5: Finding Antiderivatives and Evaluating Integrals


5.1: Constructing Accurate Graphs of Antiderivatives
5.2: The Second Fundamental Theorem of Calculus
5.3: Integration by Substitution
5.4: Integration by Parts
5.5: Other Options for Finding Algebraic Antiderivatives
5.6: Numerical Integration
5.E: Finding Antiderivatives and Evaluating Integrals (Exercises)

• Chapter 6

Chapter 6: Using Definite Integrals


6.1: Using Definite Integrals to Find Area and Length
6.2: Using Definite Integrals to Find Volume
6.3: Density, Mass, and Center of Mass
6.4: Physics Applications: Work, Force, and Pressure
6.5: Improper Integrals
6.E: Using Definite Integrals (Exercises)

• Chapter 7

Chapter 7: Differential Equations


7.1: An Introduction to Differential Equations
7.2: Qualitative Behavior of Solutions to Differential Equations
7.3: Euler's Method
7.4: Separable Differential Equations
7.5: Modeling with Differential Equations
7.6: Population Growth and the Logistic Equation
7.E: Differential Equations (Exercises)

• Chapter 8

Chapter 8: Sequences and Series


8.1: Sequences
8.2: Geometric Series
8.3: Series of Real Numbers
8.4: Alternating Series
8.5: Taylor Polynomials and Taylor Series
8.6: Power Series
8.E: Sequences and Series (Exercises)

8.1: SEQUENCES
A sequence is a list of objects in a specified order. We will typically work with sequences of real numbers and can also think of a
sequence as a function from the positive integers to the set of real numbers. A sequence diverges if it does not converge.

8.2: GEOMETRIC SERIES


Many important sequences are generated through the process of addition.

8.3: SERIES OF REAL NUMBERS


An infinite series is a sum of the elements in an infinite sequence. The sequence of partial sums of a series \(\sum_{k=1}^{\infty} a_k\) tells us about
the convergence or divergence of the series. The series converges if the sequence of partial sums converges.

8.4: ALTERNATING SERIES


An alternating series is a series whose terms alternate in sign. An alternating series converges if and only if its sequence of partial
sums converges. The sequence of partial sums of a convergent alternating series oscillates around and converges to the sum of the
series if the sequence of nth terms converges to 0.

8.5: TAYLOR POLYNOMIALS AND TAYLOR SERIES


We can use Taylor polynomials to approximate complicated functions. This allows us to approximate values of complicated functions
using only addition, subtraction, multiplication, and division of real numbers. The Lagrange Error Bound shows us how to determine
the accuracy in using a Taylor polynomial to approximate a function.

8.6: POWER SERIES


We can often assume a solution to a given problem can be written as a power series, then use the information in the problem to
determine the coefficients in the power series. This method allows us to approximate solutions to certain problems using partial sums
of the power series; that is, we can find approximate solutions that are polynomials. The connection between power series and Taylor
series is that they are essentially the same thing.

8.E: SEQUENCES AND SERIES (EXERCISES)
These are homework exercises to accompany Chapter 8 of Boelkins et al. "Active Calculus" Textmap.

8.1: Sequences
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a sequence?
What does it mean for a sequence to converge?
What does it mean for a sequence to diverge?

We encounter sequences every day. Your monthly rent payments, the annual interest you earn on investments, a list of your
car’s miles per gallon every time you fill up; all are examples of sequences. Other sequences with which you may be
familiar include the Fibonacci sequence

1, 1, 2, 3, 5, 8, . . .

in which each entry is the sum of the two preceding entries and the triangular numbers

1, 3, 6, 10, 15, 21, 28, 36, 45, 55, . . .

which are numbers that correspond to the number of vertices seen in the triangles in Figure 8.1.

Figure 8.1: Triangular numbers

Sequences of integers are of such interest to mathematicians and others that they have a journal¹ devoted to them and an on-line encyclopedia² that catalogs a huge number of integer sequences and
their connections. Sequences are also used in digital recordings and digital images. To this point, most of our studies in
calculus have dealt with continuous information (e.g., continuous functions). The major difference we will see now is that
sequences model discrete instead of continuous information. We will study ways to represent and work with discrete
information in this chapter as we investigate sequences and series, and ultimately see key connections between the discrete
and continuous.

Preview Activity 8.1.1

Suppose you receive $5000 through an inheritance. You decide to invest this money into a fund that pays 8% annually,
compounded monthly. That means that each month your investment earns \(\frac{0.08}{12} \cdot P\) additional dollars, where P is
your principal balance at the start of the month. So in the first month your investment earns \(5000\left(\frac{0.08}{12}\right)\) or $33.33.

If you reinvest this money, you will then have $5033.33 in your account at the end of the first month. From this point
on, assume that you reinvest all of the interest you earn.
a. How much interest will you earn in the second month? How much money will you have in your account at the end
of the second month?
b. Complete Table 8.1 to determine the interest earned and total amount of money in this investment each month for
one year.
c. As we will see later, the amount of money \(P_n\) in the account after month n is given by

\[P_n = 5000\left(1 + \frac{0.08}{12}\right)^n\] (8.1.1)

Use this formula to check your calculations in Table 8.1. Then find the amount of money in the account after 5
years.
¹ The Journal of Integer Sequences at http://www.cs.uwaterloo.ca/journals/JIS/
² The On-Line Encyclopedia of Integer Sequences at http://oeis.org/

Month   Interest earned   Total amount of money in the account
0       $0                $5000.00
1       $33.33            $5033.33
2
3
4
5
6
7
8
9
10
11
12

Table 8.1: Interest


d. How many years will it be before the account has doubled in value to $10000?
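
The month-by-month bookkeeping in this activity can also be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original activity: the names `PRINCIPAL`, `MONTHLY_RATE`, `balances`, `closed_form`, and `months_to_double` are our own, and the code assumes interest is reinvested every month and ignores rounding to the nearest cent.

```python
# Sketch of Preview Activity 8.1.1: $5000 at 8% annual interest, compounded monthly.
# All names here are hypothetical helpers chosen for illustration.
PRINCIPAL = 5000.0
MONTHLY_RATE = 0.08 / 12


def balances(months):
    """Return [P_0, P_1, ..., P_months], reinvesting each month's interest."""
    out = [PRINCIPAL]
    for _ in range(months):
        out.append(out[-1] * (1 + MONTHLY_RATE))
    return out


def closed_form(n):
    """Balance after n months, from Equation (8.1.1)."""
    return PRINCIPAL * (1 + MONTHLY_RATE) ** n


def months_to_double():
    """Smallest n for which the balance is at least twice the principal."""
    n = 0
    while closed_form(n) < 2 * PRINCIPAL:
        n += 1
    return n


if __name__ == "__main__":
    table = balances(12)
    for month in range(1, 13):
        interest = table[month] - table[month - 1]
        print(f"{month:2d}  {interest:8.2f}  {table[month]:10.2f}")
    print("months needed to double:", months_to_double())
```

Comparing `balances(12)[12]` with `closed_form(12)` is one way to check the table against Equation (8.1.1).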

Sequences
As our discussion in the introduction and Preview Activity 8.1.1 illustrate, many discrete phenomena can be represented as
lists of numbers (like the amount of money in an account over a period of months). We call any such list a sequence.
In other words, a sequence is nothing more than a list of terms in some order. To be able to refer to a sequence in a general
sense, we often list the entries of the sequence with subscripts,

\[s_1, s_2, \ldots, s_n, \ldots,\]

where the subscript denotes the position of the entry in the sequence. More formally,

Definition: Sequences
A sequence is a list of terms \(s_1, s_2, s_3, \ldots\) in a specified order.

As an alternative to Definition 8.1, we can also consider a sequence to be a function f whose domain is the set of positive
integers. In this context, the sequence \(s_1, s_2, s_3, \ldots\) would correspond to the function f satisfying \(f(n) = s_n\) for each
positive integer n. This alternative view will be useful in many situations. We will often write the sequence
\(s_1, s_2, s_3, \ldots\) using the shorthand notation \(\{s_n\}\). The value \(s_n\) (alternatively \(s(n)\)) is called the nth term in the sequence.

If the terms are all 0 after some fixed value of n , we say the sequence is finite. Otherwise the sequence is infinite. We will
work with both finite and infinite sequences, but focus more on the infinite sequences. With infinite sequences, we are
often interested in their end behavior and the idea of convergent sequences.
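
Viewing a sequence as a function of the position n translates directly into code. The sketch below is illustrative only: the function names `triangular` and `fibonacci` are our own, and it encodes the two sequences from the introduction as functions of a positive integer n.

```python
# A sequence viewed as a function from the positive integers to the real numbers.
# The function names are hypothetical and chosen only for illustration.

def triangular(n):
    """nth triangular number: 1, 3, 6, 10, ..."""
    return n * (n + 1) // 2


def fibonacci(n):
    """nth Fibonacci number, starting 1, 1, 2, 3, 5, 8, ..."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    print([triangular(n) for n in range(1, 11)])
    print([fibonacci(n) for n in range(1, 7)])
```

Here `triangular(n)` plays the role of \(s_n\) (or \(s(n)\)) in the notation above.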

Activity 8.1.1

a. Let \(s_n\) be the nth term in the sequence 1, 2, 3, .... Find a formula for \(s_n\) and use appropriate technological tools to
draw a graph of entries in this sequence by plotting points of the form \((n, s_n)\) for some values of n. Most graphing
calculators can plot sequences; directions follow for the TI-84.
In the MODE menu, highlight SEQ in the FUNC line and press ENTER.
In the Y= menu, you will now see lines to enter sequences. Enter a value for nMin (where the sequence starts),
a function for u(n) (the nth term in the sequence), and the value of u(nMin).
Set your window coordinates (this involves choosing limits for n as well as the window coordinates XMin,
XMax, YMin, and YMax).
The GRAPH key will draw a plot of your sequence.
Using your knowledge of limits of continuous functions as x → ∞, decide if this sequence \(\{s_n\}\) has a limit as n → ∞. Explain your reasoning.

b. Let \(s_n\) be the nth term in the sequence \(1, \frac{1}{2}, \frac{1}{3}, \ldots\). Find a formula for \(s_n\). Draw a graph of some points in this
sequence. Using your knowledge of limits of continuous functions as x → ∞, decide if this sequence \(\{s_n\}\) has a
limit as n → ∞. Explain your reasoning.


c. Let \(s_n\) be the nth term in the sequence \(2, \frac{3}{2}, \frac{4}{3}, \frac{5}{4}, \ldots\). Find a formula for \(s_n\). Using your knowledge of limits of
continuous functions as x → ∞, decide if this sequence \(\{s_n\}\) has a limit as n → ∞. Explain your reasoning.

Next we formalize the ideas from Activity 8.1.1.

Activity 8.1.2

a. Recall our earlier work with limits involving infinity in Section 2.8. State clearly what it means for a continuous
function f to have a limit L as x → ∞ .
b. Given that an infinite sequence of real numbers is a function from the integers to the real numbers, apply the idea
from part (a) to explain what you think it means for a sequence \(\{s_n\}\) to have a limit as n → ∞.
c. Based on your response to (b), decide if the sequence \(\left\{\frac{1+n}{2+n}\right\}\) has a limit as n → ∞. If so, what is the limit? If not,
why not?

In Activities 8.1.1 and 8.1.2 we investigated the notion of a sequence \(\{s_n\}\) having a limit as n goes to infinity. If a
sequence \(\{s_n\}\) has a limit as n goes to infinity, we say that the sequence converges or is a convergent sequence. If the limit
of a convergent sequence is the number L, we use the same notation as we did for continuous functions and write

\[\lim_{n \to \infty} s_n = L.\]

If a sequence \(\{s_n\}\) does not converge then we say that the sequence \(\{s_n\}\) diverges. Convergence of sequences is a major
idea in this section and we describe it more formally as follows.

Definition: Convergence
A sequence \(\{s_n\}\) of real numbers converges to a number L if we can make all values of \(s_k\) for k ≥ n as close to L as
we want by choosing n to be sufficiently large.
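
To see what this definition asks for, it can help to test it numerically on an example. The sketch below is a hypothetical illustration, not taken from the text: it uses the made-up sequence \(s_n = \frac{3n+1}{n}\) with candidate limit 3, and the helper names `s`, `tail_is_within`, and `first_good_index` are our own. Because a program can only inspect finitely many terms, a successful check is evidence of convergence and not a proof.

```python
from fractions import Fraction

# Hypothetical example sequence s_n = (3n + 1)/n, kept exact with Fraction.
def s(n):
    return Fraction(3 * n + 1, n)


def tail_is_within(n, limit, tol, window=1000):
    """Check |s_k - limit| < tol for k = n, n+1, ..., n+window-1.

    Only a finite window of the tail is examined, so True is evidence
    for the definition being satisfied, not a proof.
    """
    return all(abs(s(k) - limit) < tol for k in range(n, n + window))


def first_good_index(limit, tol, search_to=10**5):
    """Smallest n <= search_to whose checked tail stays within tol of limit."""
    for n in range(1, search_to + 1):
        if tail_is_within(n, limit, tol):
            return n
    return None


if __name__ == "__main__":
    for tol in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
        print(tol, first_good_index(3, tol))
```

Shrinking the tolerance forces a larger starting index n, which is exactly the trade-off the definition describes.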

Remember, the idea of a sequence having a limit as n → ∞ is the same as the idea of a continuous function having a limit
as x → ∞ . The only new wrinkle here is that our sequences are discrete instead of continuous. We conclude this section
with a few more examples in the following activity.

Activity 8.1.3

Use graphical and/or algebraic methods to determine whether each of the following sequences converges or diverges.
a. \(\left\{\frac{1+2n}{3n-2}\right\}\)
b. \(\left\{\frac{5+3^n}{10+2^n}\right\}\)
c. \(\left\{\frac{10^n}{n!}\right\}\) (where ! is the factorial symbol and n! = n(n − 1)(n − 2) ⋯ (2)(1) for any positive integer n; by
convention we define 0! to be 1).

Summary
In this section, we encountered the following important ideas:

A sequence is a list of objects in a specified order. We will typically work with sequences of real numbers and can also
think of a sequence as a function from the positive integers to the set of real numbers.
A sequence \(\{s_n\}\) of real numbers converges to a number L if we can make every value of \(s_k\) for k ≥ n as close as we
want to L by choosing n sufficiently large.


A sequence diverges if it does not converge.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

8.2: Geometric Series
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a geometric series?
What is a partial sum of a geometric series?
What is a simplified form of the nth partial sum of a geometric series?
Under what conditions does a geometric series converge?
What is the sum of a convergent geometric series?

Many important sequences are generated through the process of addition. In Preview Activity 8.2.1 , we see a particular
example of a special type of sequence that is connected to a sum.

Preview Activity 8.2.1

Warfarin is an anticoagulant that prevents blood clotting; often it is prescribed to stroke victims in order to help ensure
blood flow. The level of warfarin has to reach a certain concentration in the blood in order to be effective. Suppose
warfarin is taken by a particular patient in a 5 mg dose each day. The drug is absorbed by the body and some is
excreted from the system between doses. Assume that at the end of a 24 hour period, 8% of the drug remains in the
body. Let Q(n) be the amount (in mg) of warfarin in the body before the (n + 1)st dose of the drug is administered.
a. Explain why Q(1) = 5 × 0.08 mg.
b. Explain why Q(2) = (5 + Q(1)) × 0.08 mg. Then show that Q(2) = (5 × 0.08)(1 + 0.08) mg.
c. Explain why Q(3) = (5 + Q(2)) × 0.08 mg. Then show that \(Q(3) = (5 \times 0.08)\left(1 + 0.08 + 0.08^2\right)\) mg.
d. Explain why Q(4) = (5 + Q(3)) × 0.08 mg. Then show that \(Q(4) = (5 \times 0.08)\left(1 + 0.08 + 0.08^2 + 0.08^3\right)\) mg.
e. There is a pattern that you should see emerging. Use this pattern to find a formula for Q(n), where n is an arbitrary
positive integer.
f. Complete Table 8.2 with values of Q(n) for the provided n -values (reporting Q(n) to 10 decimal places). What
appears to be happening to the sequence Q(n) as n increases?

Q(1)    0.40
Q(2)
Q(3)
Q(4)
Q(5)
Q(6)
Q(7)
Q(8)
Q(9)
Q(10)

Table 8.2: Values of Q(n) for selected values of n

Geometric Sums
In Preview Activity 8.2.1 we encountered the sum

\[(5 \times 0.08)\left(1 + 0.08 + 0.08^2 + 0.08^3 + \cdots + 0.08^{n-1}\right).\]

In order to evaluate the long-term level of warfarin in the patient’s system, we will want to fully understand the sum in this
expression. This sum has the form

\[a + ar + ar^2 + \cdots + ar^{n-1}\] (8.1)

where a = 5 × 0.08 and r = 0.08. Such a sum is called a geometric sum with ratio r. We will analyze this sum in more
detail in the next activity.

Activity 8.2.1

Let a and r be real numbers (with r ≠ 1) and let

\[S_n = a + ar + ar^2 + \cdots + ar^{n-1}.\]

In this activity we will find a shortcut formula for \(S_n\) that does not involve a sum of n terms.
a. Multiply \(S_n\) by r. What does the resulting sum look like?
b. Subtract \(rS_n\) from \(S_n\) and explain why

\[S_n - rS_n = a - ar^n.\] (8.2)

c. Solve Equation (8.2) for \(S_n\) to find a simple formula for \(S_n\) that does not involve adding n terms.

We can summarize the result of Activity 8.2.1 in the following way. A geometric sum \(S_n\) is a sum of the form

\[S_n = a + ar + ar^2 + \cdots + ar^{n-1},\] (8.3)

where a and r are real numbers such that r ≠ 1. The geometric sum \(S_n\) can be written more simply as

\[S_n = a + ar + ar^2 + \cdots + ar^{n-1} = \frac{a(1 - r^n)}{1 - r}.\] (8.4)
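
Equation (8.4) is easy to sanity-check by comparing it with a term-by-term sum. The following sketch is only an illustration: the function names are our own, and it uses Python's `fractions.Fraction` so that the comparison for rational a and r is exact, with no rounding involved.

```python
from fractions import Fraction

# Hypothetical helper names; a sketch comparing both sides of Equation (8.4).

def geometric_sum_direct(a, r, n):
    """Add the n terms a, ar, ar^2, ..., ar^(n-1) one at a time."""
    return sum(a * r**k for k in range(n))


def geometric_sum_closed(a, r, n):
    """Shortcut formula a(1 - r^n)/(1 - r), valid only when r != 1."""
    if r == 1:
        raise ValueError("the shortcut formula requires r != 1")
    return a * (1 - r**n) / (1 - r)


if __name__ == "__main__":
    a, r = Fraction(2), Fraction(1, 3)
    for n in range(1, 8):
        print(n, geometric_sum_direct(a, r, n), geometric_sum_closed(a, r, n))
```

The guard on r = 1 mirrors the hypothesis r ≠ 1: in that case the formula would divide by zero, while the direct sum is simply na.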

We now apply Equation 8.4 to the example involving warfarin from Preview Activity 8.2.1. Recall that

\[Q(n) = (5 \times 0.08)\left(1 + 0.08 + 0.08^2 + 0.08^3 + \cdots + 0.08^{n-1}\right) \text{ mg},\]

so Q(n) is a geometric sum with a = 5 × 0.08 = 0.4 and r = 0.08. Thus,

\[Q(n) = 0.4\left(\frac{1 - 0.08^n}{1 - 0.08}\right) = \frac{1}{2.3}\left(1 - 0.08^n\right).\]

Notice that as n goes to infinity, the value of \(0.08^n\) goes to 0. So,

\[\lim_{n \to \infty} Q(n) = \lim_{n \to \infty} \frac{1}{2.3}\left(1 - 0.08^n\right) = \frac{1}{2.3} \approx 0.435.\]

Therefore, the long-term level of warfarin in the blood under these conditions is \(\frac{1}{2.3}\), which is approximately 0.435 mg.
To determine the long-term effect of Warfarin, we considered a geometric sum of n terms, and then considered what
happened as n was allowed to grow without bound. In this sense, we were actually interested in an infinite geometric sum
(the result of letting n go to infinity in the finite sum). We call such an infinite geometric sum a geometric series.
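
The warfarin model can also be simulated dose by dose, which gives a numerical check on the limit found above. The sketch below is illustrative: the constant and function names are our own, and it assumes the same model as the Preview Activity (a 5 mg daily dose with 8% of the drug remaining after 24 hours).

```python
# Sketch of the warfarin model; all names are hypothetical.
DOSE = 5.0                 # mg taken each day
FRACTION_REMAINING = 0.08  # fraction of the drug left after 24 hours


def warfarin_levels(n_doses):
    """Return [Q(1), ..., Q(n_doses)] using Q(n) = (DOSE + Q(n-1)) * 0.08."""
    levels = []
    q = 0.0
    for _ in range(n_doses):
        q = (DOSE + q) * FRACTION_REMAINING
        levels.append(q)
    return levels


def closed_form(n):
    """Q(n) from the geometric sum formula with a = 0.4 and r = 0.08."""
    a = DOSE * FRACTION_REMAINING
    return a * (1 - FRACTION_REMAINING**n) / (1 - FRACTION_REMAINING)


if __name__ == "__main__":
    for n, q in enumerate(warfarin_levels(10), start=1):
        print(f"Q({n}) = {q:.10f}   closed form: {closed_form(n):.10f}")
```

Because r = 0.08 is small, the simulated values settle near 1/2.3 after only a few doses.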

Definition

A geometric series is an infinite sum of the form

\[a + ar + ar^2 + \cdots = \sum_{n=0}^{\infty} ar^n.\] (8.5)

The value of r in the geometric series in Equation 8.5 is called the common ratio of the series because the ratio of the
(n + 1)st term \(ar^n\) to the nth term \(ar^{n-1}\) is always r.

Geometric series are very common in mathematics and arise naturally in many different situations. As a familiar example,
suppose we want to write the number with repeating decimal expansion

\[N = 0.1212\overline{12}\]

as a rational number.
Observe that

\[N = 0.12 + 0.0012 + 0.000012 + \cdots = \frac{12}{100} + \frac{12}{100}\left(\frac{1}{100}\right) + \frac{12}{100}\left(\frac{1}{100}\right)^2 + \cdots,\]

which is an infinite geometric series with \(a = \frac{12}{100}\) and \(r = \frac{1}{100}\). In the same way that we were able to find a shortcut
formula for the value of a (finite) geometric sum, we would like to develop a formula for the value of an (infinite) geometric
series. We explore this idea in the following activity.

Activity 8.2.2

Let r ≠ 1 and a be real numbers and let

\[S = a + ar + ar^2 + \cdots + ar^{n-1} + \cdots\]

be an infinite geometric series. For each positive integer n, let

\[S_n = a + ar + ar^2 + \cdots + ar^{n-1}.\]

Recall that

\[S_n = a\,\frac{1 - r^n}{1 - r}.\]

a. What should we allow n to approach in order to have \(S_n\) approach S?
b. What is the value of \(\lim_{n \to \infty} r^n\) for
|r| > 1?
|r| < 1?
Explain.
c. If |r| < 1, use the formula for \(S_n\) and your observations in (a) and (b) to explain why S is finite and find a resulting
formula for S.
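From our work in Activity 8.2.2, we can now find the value of the geometric series

\[N = \frac{12}{100} + \frac{12}{100}\left(\frac{1}{100}\right) + \frac{12}{100}\left(\frac{1}{100}\right)^2 + \cdots.\]

In particular, using \(a = \frac{12}{100}\) and \(r = \frac{1}{100}\), we see that

\[N = \frac{12}{100}\left(\frac{1}{1 - \frac{1}{100}}\right) = \frac{12}{100}\left(\frac{100}{99}\right) = \frac{4}{33}.\]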
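
The conversion of a repeating decimal to a fraction can be reproduced with exact rational arithmetic. This is a small sketch under our own naming (`geometric_series_sum` and `partial_sum` are hypothetical helpers); it assumes the series value a/(1 − r) for |r| < 1 and uses Python's standard `fractions` module.

```python
from fractions import Fraction

# Hypothetical helper names; a sketch of N = 0.121212... as a geometric series.

def geometric_series_sum(a, r):
    """Value a/(1 - r) of the series a + ar + ar^2 + ..., assuming |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("the series converges only when |r| < 1")
    return a / (1 - r)


def partial_sum(a, r, n):
    """Sum of the first n terms, computed term by term."""
    return sum(a * r**k for k in range(n))


if __name__ == "__main__":
    a, r = Fraction(12, 100), Fraction(1, 100)
    N = geometric_series_sum(a, r)
    print(N)                            # exact value as a fraction
    print(float(partial_sum(a, r, 3)))  # 0.121212, the first three blocks
```

The partial sums reproduce the decimal expansion block by block, while the series value gives the exact fraction.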

It is important to notice that a geometric sum is simply the sum of a finite number of terms of a geometric series. In
other words, the geometric sum \(S_n\) for the geometric series

\[\sum_{k=0}^{\infty} ar^k\]

is

\[S_n = a + ar + ar^2 + \cdots + ar^{n-1} = \sum_{k=0}^{n-1} ar^k.\]

We also call this sum \(S_n\) the nth partial sum of the geometric series. We summarize our recent work with geometric
series as follows.
A geometric series is an infinite sum of the form

\[a + ar + ar^2 + \cdots = \sum_{n=0}^{\infty} ar^n,\] (8.6)

where a and r are real numbers such that r ≠ 0.
The nth partial sum \(S_n\) of the geometric series is

\[S_n = a + ar + ar^2 + \cdots + ar^{n-1}.\]

If |r| < 1, then using the fact that \(S_n = a\,\frac{1 - r^n}{1 - r}\), it follows that the sum S of the geometric series (8.6) is

\[S = \lim_{n \to \infty} S_n = \lim_{n \to \infty} a\,\frac{1 - r^n}{1 - r} = \frac{a}{1 - r}.\]

Activity 8.2.3

The formulas we have derived for the geometric series and its partial sum so far have assumed we begin indexing our
sums at n = 0 . If instead we have a sum that does not begin at n = 0, we can factor out common terms and use our
established formulas. This process is illustrated in the examples in this activity.
a. Consider the sum

\[\sum_{k=1}^{\infty} (2)\left(\frac{1}{3}\right)^k = (2)\left(\frac{1}{3}\right) + (2)\left(\frac{1}{3}\right)^2 + (2)\left(\frac{1}{3}\right)^3 + \cdots.\]

Remove the common factor of \((2)\left(\frac{1}{3}\right)\) from each term and hence find the sum of the series.

b. Next let a and r be real numbers with −1 < r < 1. Consider the sum

\[\sum_{k=3}^{\infty} ar^k = ar^3 + ar^4 + ar^5 + \cdots.\]

Remove the common factor of \(ar^3\) from each term and find the sum of the series.

c. Finally, we consider the most general case. Let a and r be real numbers with −1 < r < 1, let n be a positive
integer, and consider the sum

\[\sum_{k=n}^{\infty} ar^k = ar^n + ar^{n+1} + ar^{n+2} + \cdots.\]

Remove the common factor of \(ar^n\) from each term to find the sum of the series.

Summary
In this section, we encountered the following important ideas:
A geometric series is an infinite sum of the form

\[\sum_{k=0}^{\infty} ar^k\]

where a and r are real numbers and r ≠ 0.
For the geometric series \(\sum_{k=0}^{\infty} ar^k\), its nth partial sum is

\[S_n = \sum_{k=0}^{n-1} ar^k.\]

An alternate formula for the nth partial sum is

\[S_n = a\,\frac{1 - r^n}{1 - r}.\]

Whenever |r| < 1, the infinite geometric series \(\sum_{k=0}^{\infty} ar^k\) has the finite sum \(\frac{a}{1 - r}\).


8.3: Series of Real Numbers
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is an infinite series?
What is the nth partial sum of an infinite series?
How do we add up an infinite number of numbers? In other words, what does it mean for an infinite series of real
numbers to converge?
What does it mean for an infinite series of real numbers to diverge?

In Section 8.2, we encountered several situations where we naturally considered an infinite sum of numbers called a
geometric series. For example, by writing
\[N = 0.1212121212\cdots = \frac{12}{100} + \frac{12}{100} \cdot \frac{1}{100} + \frac{12}{100} \cdot \frac{1}{100^2} + \cdots\]

as a geometric series, we found a way to write the repeating decimal expansion of N as a single fraction: \(N = \frac{4}{33}\).
There are many other situations in mathematics where infinite sums of numbers arise, but often these are not geometric. In
this section, we begin exploring these other types of infinite sums. Preview Activity 8.3.1 provides a context in which we
see how one such sum is related to the famous number e.

Preview Activity 8.3.1

Have you ever wondered how your calculator can produce a numeric approximation for complicated numbers like e, π,
or ln(2)? After all, the only operations a calculator can really perform are addition, subtraction, multiplication, and
division, the operations that make up polynomials. This activity provides the first steps in understanding how this
process works. Throughout the activity, let \(f(x) = e^x\).

a. Find the tangent line to f at x = 0 and use this linearization to approximate e. That is, find a formula L(x) for the
tangent line, and compute L(1), since L(1) ≈ f(1) = e.
b. The linearization of \(e^x\) does not provide a good approximation to e since 1 is not very close to 0. To obtain a better
approximation, we alter our approach a bit. Instead of using a straight line to approximate e, we put an appropriate
bend in our estimating function to make it better fit the graph of \(e^x\) for x close to 0. With the linearization, both
the value and the first derivative of L agreed with those of f at x = 0. We will now use a quadratic approximation
\(P_2(x)\) to \(f(x) = e^x\) centered at x = 0 which has the property that \(P_2(0) = f(0)\), \(P_2'(0) = f'(0)\), and
\(P_2''(0) = f''(0)\).

i. Let \(P_2(x) = 1 + x + \frac{x^2}{2}\). Show that \(P_2(0) = f(0)\), \(P_2'(0) = f'(0)\), and \(P_2''(0) = f''(0)\). Then, use \(P_2(x)\) to
approximate e by observing that \(P_2(1) \approx f(1)\).

ii. We can continue approximating e with polynomials of larger degree whose higher derivatives agree with those
of f at 0. This turns out to make the polynomials fit the graph of f better for more values of x around 0. For
example, let \(P_3(x) = 1 + x + \frac{x^2}{2} + \frac{x^3}{6}\). Show that \(P_3(0) = f(0)\), \(P_3'(0) = f'(0)\), \(P_3''(0) = f''(0)\), and
\(P_3'''(0) = f'''(0)\). Use \(P_3(x)\) to approximate e in a way similar to how you did so with \(P_2(x)\) above.

Preview Activity 8.3.1 shows that an approximation to e using a linear polynomial is 2, an approximation to e using a
quadratic polynomial is 2.5, and an approximation using a cubic polynomial is 2.6667. As we will see later, if we continue
this process we can obtain approximations from quartic (degree 4), quintic (degree 5), and higher degree polynomials
giving us the following approximations to e :
linear: \(e \approx 1 + 1 = 2\)
quadratic: \(e \approx 1 + 1 + \frac{1}{2} = 2.5\)
cubic: \(e \approx 1 + 1 + \frac{1}{2} + \frac{1}{6} = 2.\overline{6}\)
quartic: \(e \approx 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} = 2.708\overline{3}\)
quintic: \(e \approx 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \frac{1}{120} = 2.71\overline{6}\)

We see an interesting pattern here. The number e is being approximated by the sum

\[1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \frac{1}{120} + \cdots + \frac{1}{n!}\] (8.11)

for increasing values of n. And just as we did with Riemann sums, we can use the summation notation as a shorthand⁴ for
writing the sum in Equation 8.11 so that

\[e \approx 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \frac{1}{120} + \cdots + \frac{1}{n!} = \sum_{k=0}^{n} \frac{1}{k!}.\] (8.3.1)

We can calculate this sum using as large an n as we want, and the larger n is the more accurate the approximation
(Equation 8.3.1) is. Ultimately, this argument shows that we can write the number e as the infinite sum:

\[e = \sum_{k=0}^{\infty} \frac{1}{k!}.\] (8.13)
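
The approximations listed above are partial sums of this series, and they are simple to generate by machine. The sketch below is illustrative only: the function name `e_partial_sum` is our own, and it builds each term 1/k! from the previous one rather than recomputing factorials. The comparison value `math.e` comes from Python's standard library.

```python
import math

# Hypothetical helper; a sketch of the partial sums 1/0! + 1/1! + ... + 1/n!.

def e_partial_sum(n):
    """Return the sum of 1/k! for k = 0, 1, ..., n."""
    total = 0.0
    term = 1.0  # 1/0!
    for k in range(n + 1):
        total += term
        term /= (k + 1)  # turns 1/k! into 1/(k+1)!
    return total


if __name__ == "__main__":
    for n in range(1, 11):
        approx = e_partial_sum(n)
        print(f"n = {n:2d}  sum = {approx:.10f}  error = {math.e - approx:.2e}")
```

The printed errors shrink quickly as n grows, which matches the observation that larger n gives a more accurate approximation.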

Note that 0! appears in Equation 8.3.1 and by definition, 0! = 1. This sum is an example of a series (or an infinite series).
Note that the series in Equation 8.13 is the sum of the terms of the (infinite) sequence \(\left\{\frac{1}{n!}\right\}\). In general, we use the
following notation and terminology.

Definition 8.3
An infinite series of real numbers is the sum of the entries in an infinite
sequence of real numbers. In other words, an infinite series is a sum of the form

\[a_1 + a_2 + \cdots + a_n + \cdots = \sum_{k=1}^{\infty} a_k,\]

where \(a_1, a_2, \ldots\) are real numbers.


We will normally use summation notation to identify a series. If the series adds the entries of a sequence \(\{a_n\}_{n \ge 1}\), then
we will write the series as

\[\sum_{k \ge 1} a_k\]

or

\[\sum_{k=1}^{\infty} a_k.\]

Note well: each of these notations is simply shorthand for the infinite sum \(a_1 + a_2 + \cdots + a_n + \cdots\).
Is it even possible to sum an infinite list of numbers? This question is one whose answer shouldn’t come as a surprise.
After all, we have used the definite integral to add up continuous (infinite) collections of numbers, so summing the entries
of a sequence might be even easier. Moreover, we have already examined the special case of geometric series in the
previous section. The next activity provides some more insight into how we make sense of the process of summing an
infinite list of numbers.

Activity 8.3.2

Consider the series

\[\sum_{k=1}^{\infty} \frac{1}{k^2}.\]

While it is physically impossible to add an infinite collection of numbers, we can, of course, add any finite collection
of them. In what follows, we investigate how understanding how to find the nth partial sum (that is, the sum of the first
n terms) enables us to make sense of the infinite sum.
a. Sum the first two numbers in this series. That is, find a numeric value for

\[\sum_{k=1}^{2} \frac{1}{k^2}.\]

b. Next, add the first three numbers in the series.
c. Continue adding terms in this series to complete Table 8.4. Carry each sum to at least 8 decimal places.

\(\sum_{k=1}^{1} \frac{1}{k^2} = 1\)      \(\sum_{k=1}^{6} \frac{1}{k^2} =\)
\(\sum_{k=1}^{2} \frac{1}{k^2} =\)        \(\sum_{k=1}^{7} \frac{1}{k^2} =\)
\(\sum_{k=1}^{3} \frac{1}{k^2} =\)        \(\sum_{k=1}^{8} \frac{1}{k^2} =\)
\(\sum_{k=1}^{4} \frac{1}{k^2} =\)        \(\sum_{k=1}^{9} \frac{1}{k^2} =\)
\(\sum_{k=1}^{5} \frac{1}{k^2} =\)        \(\sum_{k=1}^{10} \frac{1}{k^2} =\)

Table 8.4: Sums of some of the first terms of the series \(\sum_{k=1}^{\infty} \frac{1}{k^2}\)

d. The sums in the table in (c) form a sequence whose nth term is \(S_n = \sum_{k=1}^{n} \frac{1}{k^2}\). Based on your calculations in the
table, do you think the sequence \(\{S_n\}\) converges or diverges? Explain. How do you think this sequence \(\{S_n\}\) is related
to the series \(\sum_{k=1}^{\infty} \frac{1}{k^2}\)?

The example in Activity 8.3.2 illustrates how we define the sum of an infinite series. We can add up the first n terms of the
series to obtain a new sequence of numbers (called the sequence of partial sums). Provided that sequence converges, the
corresponding infinite series is said to converge, and we say that we can find the sum of the series.
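
The sequence of partial sums for the series in Activity 8.3.2 can be generated with a short loop. This is a sketch under our own naming (`partial_sum` is a hypothetical helper), intended only to show how the table entries arise and how the sums behave when more terms are included.

```python
# Hypothetical helper; partial sums S_n = 1/1^2 + 1/2^2 + ... + 1/n^2.

def partial_sum(n):
    """Return the sum of 1/k^2 for k = 1, 2, ..., n."""
    return sum(1.0 / k**2 for k in range(1, n + 1))


if __name__ == "__main__":
    for n in list(range(1, 11)) + [100, 1000, 10000]:
        print(f"S_{n} = {partial_sum(n):.8f}")
```

Running the loop past n = 10 shows the partial sums still increasing slowly, so the first ten terms give only a rough estimate of the value the sequence is approaching.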

Definition
The nth partial sum of the series \(\sum_{k=1}^{\infty} a_k\) is the finite sum \(S_n = \sum_{k=1}^{n} a_k\). In other words, the nth partial sum \(S_n\) of a
series is the sum of the first n terms in the series, or

\[S_n = a_1 + a_2 + \cdots + a_n.\]

We then investigate the behavior of a given series by examining the sequence

\[S_1, S_2, \ldots, S_n, \ldots\]

of its partial sums. If the sequence of partial sums converges to some finite number, then we say that the corresponding
series converges. Otherwise, we say the series diverges. From our work in Activity 8.3.2, the series

\[\sum_{k=1}^{\infty} \frac{1}{k^2}\]

appears to converge to some number near 1.54977. We formalize the concept of convergence and divergence of an infinite
series in the following definition.

Definition
The infinite series

\[ \sum_{k=1}^{\infty} a_k \]

converges (or is convergent) if the sequence \(\{S_n\}\) of partial sums converges, where

\[ S_n = \sum_{k=1}^{n} a_k. \]

If \(\lim_{n \to \infty} S_n = S\), then we call S the sum of the series \(\sum_{k=1}^{\infty} a_k\). That is,

\[ \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} S_n = S. \]

If the sequence of partial sums does not converge, then the series

\[ \sum_{k=1}^{\infty} a_k \]

diverges (or is divergent).
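
The definition can be explored numerically. The following is a minimal sketch, not part of the original text: the helper name `partial_sums` is hypothetical, and the series \(\sum 1/k^2\) is used only because it is the one from Activity 8.3.2. Printing a few partial sums suggests whether the sequence \(\{S_n\}\) is leveling off, though a finite computation can never prove convergence.

```python
from itertools import accumulate


def partial_sums(term, n):
    """Return the list [S_1, ..., S_n], where S_m = term(1) + ... + term(m)."""
    return list(accumulate(term(k) for k in range(1, n + 1)))


# Illustrative choice of series: a_k = 1/k^2.
sums = partial_sums(lambda k: 1 / k**2, 1000)

# Inspect a few partial sums to see how the sequence {S_n} behaves.
for n in (10, 100, 1000):
    print(n, round(sums[n - 1], 8))
```

Swapping in a different `term` function gives the partial sums of any other series whose terms can be computed directly.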

The early terms in a series do not contribute to whether or not the series converges or diverges. Rather, the convergence or
divergence of a series

\[ \sum_{k=1}^{\infty} a_k \]

is determined by what happens to the terms \(a_k\) for very large values of k. To see why, suppose that m is some constant
larger than 1. Then

\[ \sum_{k=1}^{\infty} a_k = (a_1 + a_2 + \cdots + a_m) + \sum_{k=m+1}^{\infty} a_k. \]

Since \(a_1 + a_2 + \cdots + a_m\) is a finite number, the series \(\sum_{k=1}^{\infty} a_k\) will converge if and only if
the series \(\sum_{k=m+1}^{\infty} a_k\) converges. Because the starting index of the series doesn’t affect whether the
series converges or diverges, we will often just write

\[ \sum a_k \]

when we are interested in questions of convergence/divergence and not necessarily the exact sum of a series.
In Section 8.2 we encountered the special family of infinite geometric series whose convergence or divergence we
completely determined. Recall that a geometric series is a special series of the form \(\sum_{k=0}^{\infty} ar^k\) where a
and r are real numbers (and \(r \neq 1\)). We found that the nth partial sum \(S_n\) of a geometric series is given by the
convenient formula

\[ S_n = a\,\frac{1 - r^n}{1 - r}, \]

and thus a geometric series converges if |r| < 1. Geometric series diverge for all other values of r. While we have
completely determined the convergence or divergence of geometric series, it is generally difficult to determine
whether a given nongeometric series converges or diverges. There are several tests we can use, which we will consider in the
following sections.
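
As a quick check of the partial sum formula, the sketch below compares a term-by-term sum with the closed form. It is only an illustration: the function names and the values a = 2, r = 0.5 are assumptions chosen for this example, not taken from the text.

```python
def geometric_partial_sum(a, r, n):
    """Add the first n terms a + a*r + ... + a*r**(n-1) directly."""
    return sum(a * r**k for k in range(n))


def geometric_formula(a, r, n):
    """Closed form a(1 - r^n)/(1 - r), valid only when r != 1."""
    return a * (1 - r**n) / (1 - r)


a, r = 2.0, 0.5  # hypothetical values with |r| < 1

for n in (5, 10, 20):
    print(n, geometric_partial_sum(a, r, n), geometric_formula(a, r, n))

# When |r| < 1, r^n tends to 0, so the partial sums approach a/(1 - r).
print("a/(1 - r) =", a / (1 - r))
```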

The Divergence Test


The first question we ask about any infinite series is usually “Does the series converge or diverge?” There is a
straightforward way to check that certain series diverge; we explore this test in the next activity.

Activity 8.3.3:

If the series \(\sum a_k\) converges, then an important result necessarily follows regarding the sequence \(\{a_n\}\). This
activity explores this result. Assume that the series \(\sum_{k=1}^{\infty} a_k\) converges and has sum equal to L.

a. What is the nth partial sum \(S_n\) of the series \(\sum_{k=1}^{\infty} a_k\)?

b. What is the (n − 1)st partial sum \(S_{n-1}\) of the series \(\sum_{k=1}^{\infty} a_k\)?

c. What is the difference between the nth partial sum and the (n − 1)st partial sum of the series
\(\sum_{k=1}^{\infty} a_k\)?

d. Since we are assuming that \(\sum_{k=1}^{\infty} a_k = L\), what does that tell us about \(\lim_{n \to \infty} S_n\)?
Why? What does that tell us about \(\lim_{n \to \infty} S_{n-1}\)? Why?

e. Combine the results of the previous two parts of this activity to determine
\(\lim_{n \to \infty} a_n = \lim_{n \to \infty} (S_n - S_{n-1})\).

The result of Activity 8.3.3 is the following important conditional statement:



If the series \(\sum_{k=1}^{\infty} a_k\) converges, then the sequence \(\{a_k\}\) of kth terms converges to 0. It is
logically equivalent to say that if the sequence \(\{a_k\}\) of kth terms does not converge to 0, then the series
\(\sum_{k=1}^{\infty} a_k\) cannot converge. This statement is called the Divergence Test.

The Divergence Test

If \(\lim_{k \to \infty} a_k \neq 0\) (which includes the case where the limit does not exist), then the series
\(\sum a_k\) diverges.

Activity 8.3.4:

Determine if the Divergence Test applies to the following series. If the test does not apply, explain why. If the test does
apply, what does it tell us about the series?
a. \(\sum \frac{k}{k+1}\)

b. \(\sum (-1)^k\)

c. \(\sum \frac{1}{k}\)

Note well: be very careful with the Divergence Test. This test only tells us what happens to a series if the terms of the
corresponding sequence do not converge to 0. If the sequence of the terms of the series does converge to 0, the Divergence
Test does not apply: indeed, as we will soon see, a series whose terms go to zero may either converge or diverge.

The Integral Test


The Divergence Test settles the questions of divergence or convergence of series \(\sum a_k\) in which
\(\lim_{k \to \infty} a_k \neq 0\). Determining the convergence or divergence of series \(\sum a_k\) in which
\(\lim_{k \to \infty} a_k = 0\) turns out to be more complicated. Often, we have to investigate the sequence of partial
sums or apply some other technique.
As an example, consider the harmonic series

\[ \sum_{k=1}^{\infty} \frac{1}{k}. \]

Footnote 5: This series is called harmonic because each term in the series after the first is the harmonic mean of the
term before it and the term after it. The harmonic mean of two numbers a and b is \(\frac{2ab}{a+b}\). See “What’s
Harmonic about the Harmonic Series”, by David E. Kullman (in the College Mathematics Journal, Vol. 32, No. 3 (May, 2001),
201-203) for an interesting discussion of the harmonic mean.
Table 8.5 shows some partial sums of this series.

Table 8.5: Sums of some of the first terms of the series \(\sum_{k=1}^{\infty} \frac{1}{k}\).
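\(\sum_{k=1}^{1} \frac{1}{k}\)    1              \(\sum_{k=1}^{6} \frac{1}{k}\)     2.450000000
\(\sum_{k=1}^{2} \frac{1}{k}\)    1.5            \(\sum_{k=1}^{7} \frac{1}{k}\)     2.592857143
\(\sum_{k=1}^{3} \frac{1}{k}\)    1.833333333    \(\sum_{k=1}^{8} \frac{1}{k}\)     2.717857143
\(\sum_{k=1}^{4} \frac{1}{k}\)    2.083333333    \(\sum_{k=1}^{9} \frac{1}{k}\)     2.828968254
\(\sum_{k=1}^{5} \frac{1}{k}\)    2.283333333    \(\sum_{k=1}^{10} \frac{1}{k}\)    2.928968254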

This information doesn’t seem to be enough to tell us if the series \(\sum_{k=1}^{\infty} \frac{1}{k}\) converges or
diverges. The partial sums could eventually level off to some fixed number or continue to grow without bound. Even if we
look at larger partial sums, such as \(\sum_{k=1}^{1000} \frac{1}{k} \approx 7.485470861\), the result doesn’t
particularly sway us one way or another. The Integral Test is one way to
determine whether or not the harmonic series converges, and we explore this further in the next activity.
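
The partial sums quoted above are easy to reproduce. The sketch below is an illustration only (the function name `harmonic_partial_sum` is hypothetical); it shows how slowly the sums grow, which is why a table of values alone cannot settle the question.

```python
import math


def harmonic_partial_sum(n):
    """Return 1 + 1/2 + ... + 1/n, summed with fsum to limit rounding error."""
    return math.fsum(1 / k for k in range(1, n + 1))


# The sums keep increasing, but very slowly.
for n in (10, 100, 1000, 10000):
    print(n, round(harmonic_partial_sum(n), 9))
```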

Activity 8.3.5:
Consider the harmonic series \(\sum_{k=1}^{\infty} \frac{1}{k}\). Recall that the harmonic series will converge provided
that its sequence of partial sums converges. The nth partial sum \(S_n\) of the series
\(\sum_{k=1}^{\infty} \frac{1}{k}\) is

\[
\begin{aligned}
S_n &= \sum_{k=1}^{n} \frac{1}{k} \\
    &= 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} \\
    &= 1(1) + (1)\left(\frac{1}{2}\right) + (1)\left(\frac{1}{3}\right) + \cdots + (1)\left(\frac{1}{n}\right).
\end{aligned}
\]

Through this last expression for \(S_n\), we can visualize this partial sum as a sum of areas of rectangles with heights
\(\frac{1}{m}\) (for m = 1, 2, ..., n) and bases of length 1, as shown in Figure 8.3, which uses the 9th partial sum. The
graph of the continuous function f defined by \(f(x) = \frac{1}{x}\) is overlaid on this plot.

a. Explain how this picture represents a particular Riemann sum.


b. What is the definite integral that corresponds to the Riemann sum you considered in (a)?

Figure 8.3: A picture of the 9th partial sum of the harmonic series as a sum of areas of rectangles.
c. Which is larger, the definite integral in (b), or the corresponding partial sum \(S_9\) of the series? Why?

d. If instead of considering the 9th partial sum, we consider the nth partial sum, and we let n go to infinity, we can
then compare the series \(\sum_{k=1}^{\infty} \frac{1}{k}\) to the improper integral \(\int_1^{\infty} \frac{1}{x}\,dx\).
Which of these quantities is larger? Why?

e. Does the improper integral \(\int_1^{\infty} \frac{1}{x}\,dx\) converge or diverge? What does that result, together
with your work in (d), tell us about the series \(\sum_{k=1}^{\infty} \frac{1}{k}\)?

The ideas from Activity 8.3.5 hold more generally. Suppose that f is a continuous, positive, decreasing function and that
\(a_k = f(k)\) for each value of k. Consider the corresponding series \(\sum_{k=1}^{\infty} a_k\). The partial sum

\[ S_n = \sum_{k=1}^{n} a_k \]

can always be viewed as a left hand Riemann sum of f(x) using rectangles with heights given by the values \(a_k\) and
bases of length 1. A representative picture is shown at left in Figure 8.4. Since f is a decreasing function, we have that

\[ S_n > \int_1^n f(x)\,dx. \]

Taking limits as n goes to infinity shows that

\[ \sum_{k=1}^{\infty} a_k \geq \int_1^{\infty} f(x)\,dx. \]

Therefore, if the improper integral \(\int_1^{\infty} f(x)\,dx\) diverges, so does the series
\(\sum_{k=1}^{\infty} a_k\).

Figure 8.4: Comparing an improper integral to a series


What’s more, if we look at the right hand Riemann sums of f on [1, n] as shown at right in Figure 8.4, we see that

\[ \int_1^{\infty} f(x)\,dx \geq \sum_{k=2}^{\infty} a_k. \]

So if \(\int_1^{\infty} f(x)\,dx\) converges, then so does \(\sum_{k=2}^{\infty} a_k\), which also means that the series
\(\sum_{k=1}^{\infty} a_k\) converges. Our preceding discussion has demonstrated the truth of the Integral Test.

The Integral Test


Let f be a real valued function and assume f is decreasing and positive for all x larger than some number c. Let
\(a_k = f(k)\) for each positive integer k.

1. If the improper integral \(\int_c^{\infty} f(x)\,dx\) converges, then the series \(\sum_{k=1}^{\infty} a_k\) converges.

2. If the improper integral \(\int_c^{\infty} f(x)\,dx\) diverges, then the series \(\sum_{k=1}^{\infty} a_k\) diverges.
Note that the Integral Test compares a given infinite series to a natural, corresponding improper integral and basically says
that the infinite series and corresponding improper integral both have the same convergence status. In the next activity, we
apply the Integral Test to determine the convergence or divergence of a class of important series.
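
The comparison behind the Integral Test can be checked numerically for the harmonic series. This is a sketch under stated assumptions: it takes f(x) = 1/x, uses the known antiderivative ln(x) so that the integral from 1 to n equals ln(n), and the helper name `harmonic` is hypothetical. The left and right Riemann sum arguments give ln(n) < S_n < 1 + ln(n) for n ≥ 2.

```python
import math


def harmonic(n):
    """Partial sum S_n = 1 + 1/2 + ... + 1/n of the harmonic series."""
    return math.fsum(1 / k for k in range(1, n + 1))


for n in (10, 1000, 100000):
    s_n = harmonic(n)
    lower = math.log(n)      # integral of 1/x from 1 to n (left-sum comparison)
    upper = 1 + math.log(n)  # a_1 plus the same integral (right-sum comparison)
    print(n, lower, s_n, upper, lower < s_n < upper)
```

Because ln(n) grows without bound, the lower bound forces the partial sums to grow without bound as well, which is the divergence conclusion of the test in this case.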

Activity 8.3.6:

The series \(\sum \frac{1}{k^p}\) are special series called p-series. We have already seen that the p-series with p = 1
(the harmonic series) diverges. We investigate the behavior of other p-series in this activity.

a. Evaluate the improper integral \(\int_1^{\infty} \frac{1}{x^2}\,dx\). Does the series
\(\sum_{k=1}^{\infty} \frac{1}{k^2}\) converge or diverge? Explain.

b. Evaluate the improper integral \(\int_1^{\infty} \frac{1}{x^p}\,dx\) where p > 1. For which values of p can we
conclude that the series \(\sum_{k=1}^{\infty} \frac{1}{k^p}\) converges?

c. Evaluate the improper integral \(\int_1^{\infty} \frac{1}{x^p}\,dx\) where p < 1. What does this tell us about the
corresponding p-series \(\sum_{k=1}^{\infty} \frac{1}{k^p}\)?

d. Summarize your work in this activity by completing the following statement. The p-series
\(\sum_{k=1}^{\infty} \frac{1}{k^p}\) converges if and only if ___________________________.

The Limit Comparison Test


The Integral Test allows us to determine the convergence of an entire family of series: the p-series. However, we have seen
that it is, in general, difficult to integrate functions, so the Integral Test is not one that we can use all of the time. In fact,
even for a relatively simple series like \(\sum \frac{k^2+1}{k^4+2k+2}\), the Integral Test is not an option. In this
section we will develop a test that we can apply to series of rational functions like this by comparing their behavior to
the behavior of p-series.

Activity 8.3.7:

Consider the series \(\sum \frac{k+1}{k^3+2}\). Since the convergence or divergence of a series only depends on the
behavior of the series for large values of k, we might examine the terms of this series more closely as k gets large.

a. By computing the value of \(\frac{k+1}{k^3+2}\) for k = 100 and k = 1000, explain why the terms
\(\frac{k+1}{k^3+2}\) are essentially \(\frac{k}{k^3}\) when k is large.

b. Let’s formalize our observations in (a) a bit more. Let \(a_k = \frac{k+1}{k^3+2}\) and \(b_k = \frac{k}{k^3}\).
Calculate

\[ \lim_{k \to \infty} \frac{a_k}{b_k}. \]

What does the value of the limit tell you about \(a_k\) and \(b_k\) for large values of k? Compare your response from
part (a).

c. Does the series \(\sum \frac{k}{k^3}\) converge or diverge? Why? What do you think that tells us about the convergence
or divergence of the series \(\sum \frac{k+1}{k^3+2}\)? Explain.
k +2

Activity 8.3.7 illustrates how we can compare one series with positive terms to another whose behavior (that is, whether
the series converges or diverges) we know. More generally, suppose we have two series \(\sum a_k\) and \(\sum b_k\) with
positive terms and we know the behavior of the series \(\sum a_k\). Recall that the convergence or divergence of a series
depends only on what happens to the terms of the series for large values of k, so if we know that \(a_k\) and \(b_k\) are
essentially proportional to each other for large k, then the two series \(\sum a_k\) and \(\sum b_k\) should behave the
same way. In other words, if there is a positive finite constant c such that

\[ \lim_{k \to \infty} \frac{b_k}{a_k} = c, \]

then \(b_k \approx c\,a_k\) for large values of k. So

\[ \sum b_k \approx \sum c\,a_k = c \sum a_k. \]

Since multiplying by a nonzero constant does not affect the convergence or divergence of a series, it follows that the series
\(\sum a_k\) and \(\sum b_k\) either both converge or both diverge. The formal statement of this fact is called the Limit
Comparison Test.

The Limit Comparison Test.


Let \(\sum a_k\) and \(\sum b_k\) be series with positive terms. If

\[ \lim_{k \to \infty} \frac{b_k}{a_k} = c \]

for some positive (finite) constant c, then \(\sum a_k\) and \(\sum b_k\) either both converge or both diverge.

In essence, the Limit Comparison Test shows that if we have a series \(\sum \frac{p(k)}{q(k)}\) of rational functions
where p(k) is a polynomial of degree m and q(k) a polynomial of degree l, then the series \(\sum \frac{p(k)}{q(k)}\) will
behave like the series \(\sum \frac{k^m}{k^l}\). So this test allows us to quickly and easily determine the convergence or
divergence of series whose summands are rational
functions.
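
To see the limit in the Limit Comparison Test take shape numerically, the sketch below tabulates the ratio \(b_k / a_k\) for one pair of series. The pair is a hypothetical example chosen for this illustration and does not come from the activities: \(a_k = 1/k^2\) (a convergent p-series) and \(b_k = (2k+3)/(k^3+5)\). Exact rational arithmetic avoids rounding in the terms themselves.

```python
from fractions import Fraction


def a(k):
    """Comparison series term: a_k = 1/k^2 (a convergent p-series)."""
    return Fraction(1, k**2)


def b(k):
    """Hypothetical series under study: b_k = (2k + 3)/(k^3 + 5)."""
    return Fraction(2 * k + 3, k**3 + 5)


# The ratio b_k / a_k should settle near a positive finite constant.
for k in (10, 100, 1000, 10000):
    print(k, float(b(k) / a(k)))
```

Here the ratios approach 2, a positive finite constant, so the test says the two series behave the same way. The numerical table only suggests the limit; the limit itself still has to be computed by hand.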

Activity 8.3.8:

Use the Limit Comparison Test to determine the convergence or divergence of the series

\[ \sum \frac{3k^2+1}{5k^4+2k+2} \]

by comparing it to the series \(\sum \frac{1}{k^2}\).

The Ratio Test


The Limit Comparison Test works well if we can find a series with known behavior to compare. But such series are not
always easy to find. In this section we will examine a test that allows us to examine the behavior of a series by comparing
it to a geometric series, without knowing in advance which geometric series we need.

Activity 8.3.9:

Consider the series defined by

\[ \sum_{k=1}^{\infty} \frac{2^k}{3^k - k}. \qquad (4.18) \]

This series is not a geometric series, but this activity will illustrate how we might compare this series to a geometric
one. Recall that a series \(\sum a_k\) is geometric if the ratio \(\frac{a_{k+1}}{a_k}\) is always the same. For the series
in Equation 4.18, note that \(a_k = \frac{2^k}{3^k - k}\).

a. To see if \(\sum \frac{2^k}{3^k - k}\) is comparable to a geometric series, we analyze the ratios of successive terms
in the series. Complete Table 8.6, listing your calculations to at least 8 decimal places.


k      \(\frac{a_{k+1}}{a_k}\)
10
20
21
22
23
24
25

Table 8.6: Ratios of successive terms in the series \(\sum \frac{2^k}{3^k - k}\)

b. Based on your calculations in Table 8.6, what can we say about the ratio \(\frac{a_{k+1}}{a_k}\) if k is large?

c. Do you agree or disagree with the statement: “the series \(\sum \frac{2^k}{3^k - k}\) is approximately geometric when
k is large”? If not, why not? If so, do you think the series \(\sum \frac{2^k}{3^k - k}\) converges or diverges? Explain.

We can generalize the argument in Activity 8.3.9 in the following way. Consider the series \(\sum a_k\). If

\[ \frac{a_{k+1}}{a_k} \approx r \]

for large values of k, then \(a_{k+1} \approx r a_k\) for large k and the series \(\sum a_k\) is approximately the
geometric series \(\sum a r^k\) for large k. Since the geometric series with ratio r converges only for −1 < r < 1, we see
that the series \(\sum a_k\) will converge if

\[ \lim_{k \to \infty} \frac{a_{k+1}}{a_k} = r \]

for a value of r such that |r| < 1. This result is known as the Ratio Test.

The Ratio Test


Let \(\sum a_k\) be an infinite series with positive terms. Suppose

\[ \lim_{k \to \infty} \frac{a_{k+1}}{a_k} = r. \]

1. If 0 ≤ r < 1, then the series \(\sum a_k\) converges.

2. If 1 < r, then the series \(\sum a_k\) diverges.

3. If r = 1, then the test is inconclusive.


Note well: The Ratio Test takes a given series and looks at the limit of the ratio of consecutive terms; in so doing, the test
is essentially asking, “is this series approximately geometric?” If the series can be thought of as essentially geometric, the
test uses the limiting common ratio to determine if the given series converges.
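
The ratios of consecutive terms can be tabulated directly. The sketch below uses a hypothetical series, \(\sum k^2/3^k\), chosen only for illustration (it is not one of the series in the activities), and the function names are assumptions. Exact fractions keep the ratios free of overflow and rounding until they are printed.

```python
from fractions import Fraction


def term(k):
    """Hypothetical example term a_k = k^2 / 3^k."""
    return Fraction(k**2, 3**k)


def ratio(k):
    """Ratio a_{k+1} / a_k of consecutive terms."""
    return term(k + 1) / term(k)


# For large k the ratios should settle near a single number r.
for k in (1, 10, 100, 1000):
    print(k, float(ratio(k)))
```

For this example the ratio simplifies to \((k+1)^2 / (3k^2)\), which tends to 1/3, so the Ratio Test indicates convergence. As with any numerical table, the printed values suggest the limit but do not replace computing it.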
We have now encountered several tests for determining convergence or divergence of series. The Divergence Test can be
used to show that a series diverges, but never to prove that a series converges. We used the Integral Test to determine the
convergence status of an entire class of series, the p-series. The Limit Comparison Test works well for series that involve
rational functions and which can therefore be compared to p-series. Finally, the Ratio Test allows us to compare our series
to a geometric series; it is particularly useful for series that involve kth powers and factorials. Two other tests, the Direct
Comparison Test and the Root Test, are discussed in the exercises. Now it is time for some practice.

Activity 8.3.10:

Determine whether each of the following series converges or diverges. Explicitly state which test you use.
a. \(\sum \frac{k}{2^k}\)

b. \(\sum \frac{k^3+2}{k^2+1}\)

c. \(\sum \frac{10^k}{k!}\)

d. \(\sum \frac{k^3 - 2k^2 + 1}{k^6 + 4}\)

Summary
In this section, we encountered the following important ideas:
An infinite series is a sum of the elements in an infinite sequence. In other words, an infinite series is a sum of the form


\[ a_1 + a_2 + \cdots + a_n + \cdots = \sum_{k=1}^{\infty} a_k \]

where \(a_k\) is a real number for each positive integer k.

The nth partial sum \(S_n\) of the series \(\sum_{k=1}^{\infty} a_k\) is the sum of the first n terms of the series. That is,

\[ S_n = a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{n} a_k. \]

The sequence \(\{S_n\}\) of partial sums of a series \(\sum_{k=1}^{\infty} a_k\) tells us about the convergence or
divergence of the series. In particular:

The series \(\sum_{k=1}^{\infty} a_k\) converges if the sequence \(\{S_n\}\) of partial sums converges. In this case we
say that the series is the limit of the sequence of partial sums and write

\[ \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} S_n. \]

The series \(\sum_{k=1}^{\infty} a_k\) diverges if the sequence \(\{S_n\}\) of partial sums diverges.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

8.4: Alternating Series
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is an alternating series?
What does it mean for an alternating series to converge?
Under what conditions does an alternating series converge? Why?
How well does the nth partial sum of a convergent alternating series approximate the actual sum of the series?
Why?
What is the difference between absolute convergence and conditional convergence?

In our study of series so far, almost every series that we’ve considered has exclusively nonnegative terms. Of course, it is
possible to consider series that have some negative terms. For instance, if we consider the geometric series

\[ 2 - \frac{4}{3} + \frac{8}{9} - \cdots + 2\left(\frac{-2}{3}\right)^n + \cdots, \qquad (8.4.1) \]

which has a = 2 and \(r = -\frac{2}{3}\), we see that not only do the terms alternate in sign, but also that this series
converges to

\[ S = \frac{a}{1-r} = \frac{2}{1 - \left(-\frac{2}{3}\right)} = \frac{6}{5}. \qquad (8.4.2) \]

In Preview Activity 8.4.1 and our following discussion, we investigate the behavior of similar series where consecutive
terms have opposite signs.

Preview Activity 8.4.1

An earlier preview activity showed how we can approximate the number e with linear, quadratic, and other polynomial
approximations. We use a similar approach in this activity to obtain linear and quadratic approximations to ln(2). Along
the way, we encounter a type of series that is different from most of the ones we have seen so far. Throughout this
activity, let \(f(x) = \ln(1 + x)\).

a. Find the tangent line to f at x = 0 and use this linearization to approximate ln(2) . That is, find L(x) , the tangent
line approximation to f (x), and use the fact that L(1) ≈ f (1) to estimate ln(2).
b. The linearization of ln(1 + x) does not provide a very good approximation to ln(2) since 1 is not that close to 0. To
obtain a better approximation, we alter our approach; instead of using a straight line to approximate ln(2), we use a
quadratic function to account for the concavity of ln(1 + x) for x close to 0. With the linearization, both the function’s
value and slope agree with the linearization’s value and slope at x = 0. We will now make a quadratic approximation
\(P_2(x)\) to \(f(x) = \ln(1 + x)\) centered at x = 0 with the property that \(P_2(0) = f(0)\), \(P_2'(0) = f'(0)\), and
\(P_2''(0) = f''(0)\).

i. Let \(P_2(x) = x - \frac{x^2}{2}\). Show that \(P_2(0) = f(0)\), \(P_2'(0) = f'(0)\), and \(P_2''(0) = f''(0)\). Use
\(P_2(x)\) to approximate ln(2) by using the fact that \(P_2(1) \approx f(1)\).

ii. We can continue approximating ln(2) with polynomials of larger degree whose derivatives agree with those of f at
0. This makes the polynomials fit the graph of f better for more values of x around 0. For example, let
\(P_3(x) = x - \frac{x^2}{2} + \frac{x^3}{3}\). Show that \(P_3(0) = f(0)\), \(P_3'(0) = f'(0)\), \(P_3''(0) = f''(0)\),
and \(P_3'''(0) = f'''(0)\). Taking a similar approach to preceding questions, use \(P_3(x)\) to approximate ln(2).

Matthew Boelkins, David Austin & Steven


8.4.1 12/15/2021 https://math.libretexts.org/@go/page/4344
Schlicker
iii. If we used a degree 4 or degree 5 polynomial to approximate ln(1 + x), what approximations of ln(2) do you
think would result? Use the preceding questions to conjecture a pattern that holds, and state the degree 4 and degree
5 approximations.

Preview Activity 8.4.1 gives us several approximations to ln(2): the linear approximation is 1 and the quadratic
approximation is \(1 - \frac{1}{2} = \frac{1}{2}\). If we continue this process we will obtain approximations from cubic,
quartic (degree 4), quintic (degree 5), and higher degree polynomials, giving us the following approximations to ln(2):

Linear       \(1\)                                                              1
Quadratic    \(1 - \frac{1}{2}\)                                                0.5
Cubic        \(1 - \frac{1}{2} + \frac{1}{3}\)                                  0.83̅
Quartic      \(1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4}\)                    0.583̅
Quintic      \(1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5}\)      0.783̅

The pattern here shows the fact that the number ln(2) can be approximated by the partial sums of the infinite series

\[ \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k} \qquad (8.15) \]

where the alternating signs are determined by the factor \((-1)^{k+1}\). Using computational technology, we find that
0.6881721793 is the sum of the first 100 terms in this series. As a comparison, ln(2) ≈ 0.6931471806. This shows that
even though the series (Equation 8.15) converges to ln(2), it must do so quite slowly, since the sum of the first 100 terms
isn’t particularly close to ln(2). We will investigate the issue of how quickly an alternating series converges later in this
section. Again, note particularly that the series 8.15 is different from the series we have considered earlier in that some of
the terms are negative. We call such a series an alternating series.
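
The 100-term sum quoted above can be reproduced with a few lines of code. This is a minimal sketch; the function name `alt_harmonic_partial_sum` is hypothetical, and `math.log(2)` is used only as the reference value for ln(2).

```python
import math


def alt_harmonic_partial_sum(n):
    """Return the sum of (-1)^(k+1) / k for k = 1, ..., n."""
    return math.fsum((-1) ** (k + 1) / k for k in range(1, n + 1))


# Compare a few partial sums with ln(2) to see how slowly they approach it.
for n in (5, 100, 10000):
    s_n = alt_harmonic_partial_sum(n)
    print(n, round(s_n, 10), round(math.log(2) - s_n, 10))
```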

Definition: alternating series

An alternating series is a series of the form

\[ \sum_{k=0}^{\infty} (-1)^k a_k, \]

where \(a_k \geq 0\) for each k. We have some flexibility in how we write an alternating series; for example, the series

\[ \sum_{k=1}^{\infty} (-1)^{k+1} a_k, \]

whose index starts at k = 1, is also alternating. As we will soon see, there are several very nice results that hold about
alternating series, while alternating series can also demonstrate some unusual behavior.

It is important to remember that most of the series tests we have seen in previous sections apply only to series with
nonnegative terms. Thus, alternating series require a different test. To investigate this idea, we return to the example in
Preview Activity 8.4.1.

Activity 8.4.1

Remember that, by definition, a series converges if and only if its corresponding sequence of partial sums converges.

a. Complete Table 8.7 by calculating the first few partial sums (to 10 decimal places) of the alternating series

\[ \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k}. \]

b. Plot the sequence of partial sums from part (a) in the plane. What do you notice about this sequence?

Activity 8.4.1 exemplifies the general behavior that any convergent alternating series will demonstrate. In this example,
we see that the partial sums of the alternating harmonic series oscillate around a fixed number that turns out to be the sum
of the series.
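Table 8.7: Partial sums of the alternating series \(\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k}\)

\(\sum_{k=1}^{1} (-1)^{k+1} \frac{1}{k} =\)        \(\sum_{k=1}^{6} (-1)^{k+1} \frac{1}{k} =\)
\(\sum_{k=1}^{2} (-1)^{k+1} \frac{1}{k} =\)        \(\sum_{k=1}^{7} (-1)^{k+1} \frac{1}{k} =\)
\(\sum_{k=1}^{3} (-1)^{k+1} \frac{1}{k} =\)        \(\sum_{k=1}^{8} (-1)^{k+1} \frac{1}{k} =\)
\(\sum_{k=1}^{4} (-1)^{k+1} \frac{1}{k} =\)        \(\sum_{k=1}^{9} (-1)^{k+1} \frac{1}{k} =\)
\(\sum_{k=1}^{5} (-1)^{k+1} \frac{1}{k} =\)        \(\sum_{k=1}^{10} (-1)^{k+1} \frac{1}{k} =\)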

Recall that if \(\lim_{k \to \infty} a_k \neq 0\), then the series \(\sum a_k\) diverges by the Divergence Test. From this
point forward, we will thus only consider alternating series

\[ \sum_{k=1}^{\infty} (-1)^{k+1} a_k \]

in which the sequence \(a_k\) consists of positive numbers that decrease to 0. For such a series, the nth partial sum
\(S_n\) satisfies

\[ S_n = \sum_{k=1}^{n} (-1)^{k+1} a_k. \]

Notice that

\(S_1 = a_1\).

\(S_2 = a_1 - a_2\), and since \(a_1 > a_2\) we have

\[ 0 < S_2 < S_1. \]

\(S_3 = S_2 + a_3\) and so \(S_2 < S_3\). But \(a_3 < a_2\), so \(S_3 < S_1\). Thus,

\[ 0 < S_2 < S_3 < S_1. \]

\(S_4 = S_3 - a_4\) and so \(S_4 < S_3\). But \(a_4 < a_3\), so \(S_2 < S_4\). Thus,

\[ 0 < S_2 < S_4 < S_3 < S_1. \]

\(S_5 = S_4 + a_5\) and so \(S_4 < S_5\). But \(a_5 < a_4\), so \(S_5 < S_3\). Thus,

\[ 0 < S_2 < S_4 < S_5 < S_3 < S_1. \]

This pattern continues as illustrated in Figure 8.4.1 (with n odd) so that each partial sum lies between the previous two
partial sums. Note further that the absolute value of the difference between the (n − 1)st partial sum \(S_{n-1}\) and the
nth partial sum \(S_n\) is

\[ |S_n - S_{n-1}| = a_n. \]

Since the sequence \(\{a_n\}\) converges to 0, the distance between successive partial sums becomes as close to zero as
we’d like. Because each partial sum also lies between the two before it, the partial sums are trapped in intervals whose
lengths shrink to 0, and thus the sequence of partial sums converges (even though we don’t know the exact value to which
the sequence of partial sums converges). The preceding discussion has demonstrated the truth of the Alternating Series
Test.

Figure 8.4.1 : Partial sums of an alternating series

Definition: The Alternating Series Test


Given an alternating series

\[ \sum (-1)^k a_k, \]

if the sequence \(\{a_k\}\) of positive terms decreases to 0 as k → ∞, then the alternating series converges.

Note particularly that if the limit of the sequence \(\{a_k\}\) is not 0, then the alternating series diverges.

Activity 8.4.2

Which series converge and which diverge? Justify your answers.


a. \(\sum_{k=1}^{\infty} \frac{(-1)^k}{k^2+2}\)

b. \(\sum_{k=1}^{\infty} \frac{(-1)^{k+1}\, 2k}{k+5}\)

c. \(\sum_{k=2}^{\infty} \frac{(-1)^k}{\ln(k)}\)

The argument for the Alternating Series Test also provides us with a method to determine how close the nth partial sum
\(S_n\) is to the actual sum of a convergent alternating series. To see how this works, let S be the sum of a convergent
alternating series, so

\[ S = \sum_{k=1}^{\infty} (-1)^{k+1} a_k. \]

Recall that the sequence of partial sums oscillates around the sum S so that

\[ |S - S_n| < |S_{n+1} - S_n| = a_{n+1}. \]

Therefore, the value of the term \(a_{n+1}\) provides an error estimate for how well the partial sum \(S_n\) approximates
the actual sum S. We summarize this fact in the statement of the Alternating Series Estimation Theorem.

Alternating Series Estimation Theorem

If the alternating series

\[ \sum_{k=1}^{\infty} (-1)^{k+1} a_k \]

converges and has sum S, and

\[ S_n = \sum_{k=1}^{n} (-1)^{k+1} a_k \]

is the nth partial sum of the alternating series, then

\[ \left| \sum_{k=1}^{\infty} (-1)^{k+1} a_k - S_n \right| \leq a_{n+1}. \]

Example 8.4.1

Let’s determine how well the 100th partial sum \(S_{100}\) of

\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} \]

approximates the sum of the series.

Solution.
If we let S be the sum of the series \(\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\), then we know that

\[ |S_{100} - S| < a_{101}. \]

Now

\[ a_{101} = \frac{1}{101} \approx 0.0099, \]

so the 100th partial sum is within 0.0099 of the sum of the series. We have discussed the fact (and will later verify) that

\[ S = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = \ln(2), \]

and so S ≈ 0.693147 while

\[ S_{100} = \sum_{k=1}^{100} \frac{(-1)^{k+1}}{k} \approx 0.6881721793. \]

We see that the actual difference between S and \(S_{100}\) is approximately 0.0049750013, which is indeed less than
0.0099.
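
The same comparison can be repeated for other values of n. The sketch below assumes, as the example does, that the sum of the alternating harmonic series is ln(2); the function name `partial_sum` is hypothetical. For each n it prints the actual error \(|S - S_n|\) next to the bound \(a_{n+1} = 1/(n+1)\) from the Alternating Series Estimation Theorem.

```python
import math


def partial_sum(n):
    """S_n = sum of (-1)^(k+1) / k for k = 1, ..., n."""
    return math.fsum((-1) ** (k + 1) / k for k in range(1, n + 1))


S = math.log(2)  # assumed exact sum of the alternating harmonic series

for n in (10, 100, 1000):
    error = abs(S - partial_sum(n))
    bound = 1 / (n + 1)  # a_{n+1}, the first omitted term
    print(n, error, bound, error <= bound)
```

In each row the actual error is roughly half of the bound, which matches the example: the theorem guarantees the bound, not that the bound is tight.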

Activity 8.4.3

Determine the number of terms it takes to approximate the sum of the convergent alternating series

\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k^4} \]

to within 0.0001.

Absolute and Conditional Convergence


A series such as

\[ 1 - \frac{1}{4} - \frac{1}{9} + \frac{1}{16} + \frac{1}{25} + \frac{1}{36} - \frac{1}{49} - \frac{1}{64} - \frac{1}{81} - \frac{1}{100} + \cdots \qquad (8.16) \]

whose terms are neither all nonnegative nor alternating is different from any series that we have considered to date. The
behavior of these series can be rather complicated, but there is an important connection between these arbitrary series that
have some negative terms and series with all nonnegative terms that we illustrate with the next activity.

Activity 8.4.4

a. Explain why the series

\[ 1 - \frac{1}{4} - \frac{1}{9} + \frac{1}{16} + \frac{1}{25} + \frac{1}{36} - \frac{1}{49} - \frac{1}{64} - \frac{1}{81} - \frac{1}{100} + \cdots \qquad (8.4.3) \]

must have a sum that is less than the series

\[ \sum_{k=1}^{\infty} \frac{1}{k^2}. \]

b. Explain why the series

\[ 1 - \frac{1}{4} - \frac{1}{9} + \frac{1}{16} + \frac{1}{25} + \frac{1}{36} - \frac{1}{49} - \frac{1}{64} - \frac{1}{81} - \frac{1}{100} + \cdots \]

must have a sum that is greater than the series

\[ \sum_{k=1}^{\infty} -\frac{1}{k^2}. \]

c. Given that the terms in the series

\[ 1 - \frac{1}{4} - \frac{1}{9} + \frac{1}{16} + \frac{1}{25} + \frac{1}{36} - \frac{1}{49} - \frac{1}{64} - \frac{1}{81} - \frac{1}{100} + \cdots \]

converge to 0, what do you think the previous two results tell us about the convergence status of this series?

As the example in Activity 8.4.4 suggests, if we have a series \(\sum a_k\) (some of whose terms may be negative) such that
\(\sum |a_k|\) converges, it turns out to always be the case that the original series, \(\sum a_k\), must also converge.
That is, if \(\sum |a_k|\) converges, then so must \(\sum a_k\).

As we just observed, this is the case for the series (8.16), since the corresponding series of the absolute values of its terms
is the convergent p-series \(\sum \frac{1}{k^2}\). At the same time, there are series like the alternating harmonic series
\(\sum (-1)^{k+1} \frac{1}{k}\) that converge, while the corresponding series of absolute values, \(\sum \frac{1}{k}\),
diverges. We distinguish between these behaviors by
introducing the following language.

Definition

Consider a series \(\sum a_k\).

1. The series \(\sum a_k\) converges absolutely (or is absolutely convergent) provided that \(\sum |a_k|\) converges.

2. The series \(\sum a_k\) converges conditionally (or is conditionally convergent) provided that \(\sum |a_k|\) diverges
and \(\sum a_k\) converges.

In this terminology, the series (Equation 8.16) converges absolutely while the alternating harmonic series is conditionally
convergent.
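
To illustrate absolute convergence numerically, the sketch below computes partial sums of the series (8.16) alongside the partial sums of the absolute values of its terms. It rests on an assumption that should be stated loudly: only the first ten terms of (8.16) are given in the text, and the code assumes the sign pattern continues in blocks of length 1, 2, 3, 4, ... with alternating sign. The function names are hypothetical.

```python
def block_signs(n):
    """First n signs of the assumed pattern +, -, -, +, +, +, -, -, -, -, ...

    Signs come in blocks of length 1, 2, 3, ... and flip after each block.
    """
    signs = []
    sign, block = 1, 1
    while len(signs) < n:
        signs.extend([sign] * block)
        sign, block = -sign, block + 1
    return signs[:n]


def partial_sums(n):
    """Return (signed partial sum, partial sum of absolute values) for n terms."""
    signs = block_signs(n)
    signed = sum(s / k**2 for k, s in enumerate(signs, start=1))
    absolute = sum(1 / k**2 for k in range(1, n + 1))
    return signed, absolute


for n in (10, 100, 1000):
    print(n, partial_sums(n))
```

The signed partial sums always stay between the negative and the positive of the absolute partial sums, and the latter are partial sums of a convergent p-series, which is the situation the definition of absolute convergence describes.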

Activity 8.4.5:

a. Consider the series \(\sum (-1)^k \frac{\ln(k)}{k}\).

i. Does this series converge? Explain.

ii. Does this series converge absolutely? Explain what test you use to determine your answer.

b. Consider the series \(\sum (-1)^k \frac{\ln(k)}{k^2}\).

i. Does this series converge? Explain.

ii. Does this series converge absolutely? Hint: Use the fact that \(\ln(k) < \sqrt{k}\) for large values of k and then
compare to an appropriate p-series.

Conditionally convergent series turn out to be very interesting. If the sequence \(\{a_n\}\) decreases to 0, but the series
\(\sum a_k\) diverges, the conditionally convergent series \(\sum (-1)^k a_k\) is right on the borderline of being a
divergent series. As a result, such a conditionally convergent series typically converges very slowly. Furthermore, some
very strange things can happen with conditionally convergent series, as illustrated in some of the exercises.

Summary of Tests for Convergence of Series


We have discussed several tests for convergence/divergence of series in our sections and in exercises. We close this section
of the text with a summary of all the tests we have encountered, followed by an activity that challenges you to decide
which convergence test to apply to several different series.

Geometric Series
The geometric series \(\sum ar^k\) with ratio r converges for \(-1 < r < 1\) and diverges for \(|r| \geq 1\).

The sum of the convergent geometric series \(\sum_{k=0}^{\infty} ar^k\) is \(\frac{a}{1-r}\).

Divergence Test
If the sequence \(a_n\) does not converge to 0, then the series \(\sum a_k\) diverges.

This is the first test to apply because the conclusion is simple. However, if \(\lim_{n\to\infty} a_n = 0\), no conclusion can be drawn.

Integral Test

Let f be a positive, decreasing function on an interval \([c, \infty)\) and let \(a_k = f(k)\) for each positive integer \(k \geq c\).

If \(\int_c^{\infty} f(t)\,dt\) converges, then \(\sum a_k\) converges.

If \(\int_c^{\infty} f(t)\,dt\) diverges, then \(\sum a_k\) diverges.

Use this test when f(x) is easy to integrate.

Direct Comparison Test


See Ex 4 in Section 8.3 .
Let \(0 \leq a_k \leq b_k\) for each positive integer k.

If \(\sum b_k\) converges, then \(\sum a_k\) converges.

If \(\sum a_k\) diverges, then \(\sum b_k\) diverges.

Use this test when you have a series with known behavior that you can compare to; this test can be difficult to apply.

Limit Comparison Test


Let \(a_n\) and \(b_n\) be sequences of positive terms. If
\[\lim_{k\to\infty} \frac{a_k}{b_k} = L\]
for some positive finite number L, then the two series \(\sum a_k\) and \(\sum b_k\) either both converge or both diverge.

Easier to apply in general than the comparison test, but you must have a series with known behavior to compare.
Useful to apply to series of rational functions.

Ratio Test
Let \(a_k \neq 0\) for each k and suppose
\[\lim_{k\to\infty} \frac{|a_{k+1}|}{|a_k|} = r.\]
If r < 1, then the series \(\sum a_k\) converges absolutely.

If r > 1, then the series \(\sum a_k\) diverges.

If r = 1, then the test is inconclusive.
This test is useful when a series involves factorials and powers.

Root Test
Let \(a_k \geq 0\) for each k and suppose
\[\lim_{k\to\infty} \sqrt[k]{a_k} = r.\]
If r < 1, then the series \(\sum a_k\) converges.

If r > 1, then the series \(\sum a_k\) diverges.

If r = 1, then the test is inconclusive.

In general, the Ratio Test can usually be used in place of the Root Test. However, the Root Test can be quick to use when \(a_k\) involves kth powers.

Alternating Series Test

If \(a_n\) is a positive, decreasing sequence so that \(\lim_{n\to\infty} a_n = 0\), then the alternating series \(\sum (-1)^{k+1} a_k\) converges.
This test applies only to alternating series; we assume that the terms \(a_n\) are all positive and that the sequence \(\{a_n\}\) is decreasing.

Alternating Series Estimation Theorem


Let \(S_n = \sum_{k=1}^{n} (-1)^{k+1} a_k\) be the nth partial sum of the alternating series \(\sum_{k=1}^{\infty} (-1)^{k+1} a_k\). Assume \(a_n > 0\) for each positive integer n, the sequence \(a_n\) decreases to 0, and \(\lim_{n\to\infty} S_n = S\). Then it follows that \(|S - S_n| < a_{n+1}\).

This bound can be used to determine the accuracy of the partial sum \(S_n\) as an approximation of the sum of a convergent alternating series.
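As an illustration of how this bound is used in practice, here is a minimal sketch (not from the original text) that finds how many terms of the alternating harmonic series guarantee a given accuracy. The function name `terms_needed` is hypothetical, and the code assumes the known sum ln(2) only to confirm the guarantee.

```python
import math

def terms_needed(a, tol):
    """Smallest n with a(n + 1) < tol, so that |S - S_n| < tol is guaranteed.

    Assumes a(n) is positive and decreases to 0, as the theorem requires.
    """
    n = 1
    while a(n + 1) >= tol:
        n += 1
    return n

def a(n):
    """Magnitude of the n-th term of the alternating harmonic series."""
    return 1 / n

n = terms_needed(a, 0.001)
S_n = sum((-1) ** (k + 1) * a(k) for k in range(1, n + 1))
print(n, S_n, abs(math.log(2) - S_n))
```

The actual error is typically smaller than the guaranteed bound; the theorem only promises that it cannot be larger.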

Activity 8.4.6:

For (a)-(j), use appropriate tests to determine the convergence or divergence of the following series. Throughout, if a
series is a convergent geometric series, find its sum.
a. \(\sum_{k=3}^{\infty} \dfrac{2}{\sqrt{k-2}}\)
b. \(\sum_{k=1}^{\infty} \dfrac{k}{1+2k}\)
c. \(\sum_{k=0}^{\infty} \dfrac{2k^2+1}{k^3+k+1}\)
d. \(\sum_{k=0}^{\infty} \dfrac{100^k}{k!}\)
e. \(\sum_{k=1}^{\infty} \dfrac{2^k}{5^k}\)
f. \(\sum_{k=1}^{\infty} \dfrac{k^3-1}{k^5+1}\)
g. \(\sum_{k=2}^{\infty} \dfrac{3^{k-1}}{7^k}\)
h. \(\sum_{k=2}^{\infty} \dfrac{1}{k^k}\)
i. \(\sum_{k=1}^{\infty} \dfrac{(-1)^{k+1}}{\sqrt{k+1}}\)
j. \(\sum_{k=2}^{\infty} \dfrac{1}{k \ln(k)}\)
k. Determine a value of n so that the nth partial sum \(S_n\) of the alternating series \(\sum_{n=2}^{\infty} \dfrac{(-1)^n}{\ln(n)}\) approximates the sum to within 0.001.

Summary
In this section, we encountered the following important ideas:
An alternating series is a series whose terms alternate in sign. In other words, an alternating series is a series of the form
\[\sum (-1)^k a_k\]
where \(a_k\) is a positive real number for each k.

An alternating series \(\sum_{k=1}^{\infty} (-1)^k a_k\) converges if and only if its sequence \(\{S_n\}\) of partial sums converges, where
\[S_n = \sum_{k=1}^{n} (-1)^k a_k.\]

The sequence of partial sums of a convergent alternating series oscillates around and converges to the sum of the series if the sequence of nth terms converges to 0. That is why the Alternating Series Test shows that the alternating series \(\sum_{k=1}^{\infty} (-1)^k a_k\) converges whenever the sequence \(\{a_n\}\) of nth terms decreases to 0.

The difference between the (n − 1)st partial sum \(S_{n-1}\) and the nth partial sum \(S_n\) of a convergent alternating series \(\sum_{k=1}^{\infty} (-1)^k a_k\) is \(|S_n - S_{n-1}| = a_n\). Since the partial sums oscillate around the sum S of the series, it follows that
\[|S - S_n| < a_n.\]
So the nth partial sum of a convergent alternating series \(\sum_{k=1}^{\infty} (-1)^k a_k\) approximates the actual sum of the series to within \(a_n\).

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

8.5: Taylor Polynomials and Taylor Series
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a Taylor polynomial? For what purposes are Taylor polynomials used?
What is a Taylor series?
How are Taylor polynomials and Taylor series different? How are they related?
How do we determine the accuracy when we use a Taylor polynomial to approximate a function?

In our work to date in Chapter 8, essentially every sum we have considered has been a sum of numbers. In particular, each infinite series that we have discussed has been a series of real numbers, such as
\[1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^k} + \cdots = \sum_{k=0}^{\infty} \frac{1}{2^k}. \tag{8.5.1}\]

In the remainder of this chapter, we will expand our notion of series to include series that involve a variable, say x. For instance, if in the geometric series in Equation 8.5.1 we replace the ratio \(r = \frac{1}{2}\) with the variable x, then we have the infinite (still geometric) series
\[1 + x + x^2 + \cdots + x^k + \cdots = \sum_{k=0}^{\infty} x^k. \tag{8.5.2}\]

Here we see something very interesting: since a geometric series converges whenever its ratio r satisfies |r| < 1, and the sum of a convergent geometric series is \(\frac{a}{1-r}\), we can say that for |x| < 1,
\[1 + x + x^2 + \cdots + x^k + \cdots = \frac{1}{1-x}. \tag{8.5.3}\]

Note well what Equation 8.5.3 states: the non-polynomial function \(\frac{1}{1-x}\) on the right is equal to the infinite polynomial expression on the left. Moreover, it appears natural to truncate the infinite sum on the left (whose terms get very small as k gets large) and say, for example, that
\[1 + x + x^2 + x^3 \approx \frac{1}{1-x} \tag{8.5.4}\]

for small values of x. This shows one way that a polynomial function can be used to approximate a non-polynomial
function; such approximations are one of the main themes in this section and the next.

A polynomial function can be used to approximate a non-polynomial function.
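To see how good the truncation in Equation 8.5.4 is, and how it degrades as x moves away from 0, one can simply evaluate both sides. The sketch below is an illustration only; the function name `truncated_geometric` is not from the text.

```python
def truncated_geometric(x, n):
    """Return 1 + x + x**2 + ... + x**n."""
    return sum(x ** k for k in range(n + 1))

for x in (0.1, 0.5, 0.9):
    approx = truncated_geometric(x, 3)   # the cubic 1 + x + x^2 + x^3
    exact = 1 / (1 - x)
    print(x, approx, exact, abs(exact - approx))
```

For x = 0.1 the cubic agrees with 1/(1 − x) to about three decimal places, while for x = 0.9 it is far off; this is why the approximation is stated only for small values of x.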


In Preview Activity 8.5.1, we begin our explorations of approximating non-polynomial functions with polynomials, from
which we will also develop ideas regarding infinite series that involve a variable, x.

Preview Activity 8.5.1

Preview Activity 8.5.3 showed how we can approximate the number e using linear, quadratic, and other polynomial functions; we then used similar ideas in Preview Activity 8.4 to approximate ln(2). In this activity, we review and extend the process to find the "best" quadratic approximation to the exponential function \(e^x\) around the origin. Let \(f(x) = e^x\) throughout this activity.

a. Find a formula for \(P_1(x)\), the linearization of f(x) at x = 0. (We label this linearization \(P_1\) because it is a first degree polynomial approximation.) Recall that \(P_1(x)\) is a good approximation to f(x) for values of x close to 0. Plot f and \(P_1\) near x = 0 to illustrate this fact.

b. Since \(f(x) = e^x\) is not linear, the linear approximation eventually is not a very good one. To obtain better approximations, we want to develop a different approximation that "bends" to make it more closely fit the graph of f near x = 0. To do so, we add a quadratic term to \(P_1(x)\). In other words, we let
\[P_2(x) = P_1(x) + c_2 x^2 \tag{8.5.5}\]
for some real number \(c_2\). We need to determine the value of \(c_2\) that makes the graph of \(P_2(x)\) best fit the graph of f(x) near x = 0.

Remember that \(P_1(x)\) was a good linear approximation to f(x) near 0; this is because \(P_1(0) = f(0)\) and \(P_1'(0) = f'(0)\). It is therefore reasonable to seek a value of \(c_2\) so that
\[P_2(0) = f(0), \tag{8.5.6}\]
\[P_2'(0) = f'(0), \text{ and} \tag{8.5.7}\]
\[P_2''(0) = f''(0). \tag{8.5.8}\]
Remember, we are letting \(P_2(x) = P_1(x) + c_2 x^2\).

i. Calculate \(P_2(0)\) to show that \(P_2(0) = f(0)\).

ii. Calculate \(P_2'(0)\) to show that \(P_2'(0) = f'(0)\).

iii. Calculate \(P_2''(x)\). Then find a value for \(c_2\) so that \(P_2''(0) = f''(0)\).

iv. Explain why the condition \(P_2''(0) = f''(0)\) will put an appropriate "bend" in the graph of \(P_2\) to make \(P_2\) fit the graph of f around x = 0.

Taylor Polynomials Preview


Activity 8.5 illustrates the first steps in the process of approximating complicated functions with polynomials. Using this
process we can approximate trigonometric, exponential, logarithmic, and other nonpolynomial functions as closely as we
like (for certain values of x) with polynomials. This is extraordinarily useful in that it allows us to calculate values of these
functions to whatever precision we like using only the operations of addition, subtraction, multiplication, and division,
which are operations that can be easily programmed in a computer.
We next extend the approach in Preview Activity 8.5.1 to arbitrary functions at arbitrary points. Let f be a function that has as many derivatives at a point x = a as we need. Since first learning it in Section 1.8, we have regularly used the linear approximation \(P_1(x)\) to f at x = a, which in one sense is the best linear approximation to f near a. Recall that \(P_1(x)\) is the tangent line to f at (a, f(a)) and is given by the formula
\[P_1(x) = f(a) + f'(a)(x - a). \tag{8.5.9}\]

If we proceed as in Preview Activity 8.5.1, we then want to find the best quadratic approximation
\[P_2(x) = P_1(x) + c_2 (x - a)^2 \tag{8.5.10}\]
so that \(P_2(x)\) more closely models f(x) near x = a. Consider the following calculations of the values and derivatives of \(P_2(x)\):
\[P_2(x) = P_1(x) + c_2 (x - a)^2 \tag{8.5.11}\]
\[P_2'(x) = P_1'(x) + 2 c_2 (x - a) \tag{8.5.12}\]
\[P_2''(x) = 2 c_2 \tag{8.5.13}\]
and then evaluated at x = a:
\[P_2(a) = P_1(a) = f(a) \tag{8.5.14}\]
\[P_2'(a) = P_1'(a) = f'(a) \tag{8.5.15}\]
\[P_2''(a) = 2 c_2. \tag{8.5.16}\]

To make \(P_2(x)\) fit f(x) better than \(P_1(x)\), we want \(P_2(x)\) and f(x) to have the same concavity at x = a. That is, we want to have
\[P_2''(a) = f''(a). \tag{8.5.17}\]
This implies that
\[2 c_2 = f''(a) \tag{8.5.18}\]
and thus
\[c_2 = \frac{f''(a)}{2}. \tag{8.5.19}\]
Therefore, the quadratic approximation \(P_2(x)\) to f centered at x = a is
\[P_2(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2. \tag{8.5.20}\]

This approach extends naturally to polynomials of higher degree. In this situation, we define polynomials
\[P_3(x) = P_2(x) + c_3 (x - a)^3 \tag{8.5.21}\]
\[P_4(x) = P_3(x) + c_4 (x - a)^4 \tag{8.5.22}\]
\[P_5(x) = P_4(x) + c_5 (x - a)^5 \tag{8.5.23}\]
and so on, with the general one being
\[P_n(x) = P_{n-1}(x) + c_n (x - a)^n. \tag{8.5.24}\]
The defining property of these polynomials is that for each n, \(P_n(x)\) must have its value and all its first n derivatives agree with those of f at x = a. In other words we require that
\[P_n^{(k)}(a) = f^{(k)}(a) \tag{8.5.25}\]
for all k from 0 to n.


To see the conditions under which this happens, suppose
\[P_n(x) = c_0 + c_1 (x - a) + c_2 (x - a)^2 + \cdots + c_n (x - a)^n. \tag{8.5.26}\]
Then
\[P_n^{(0)}(a) = c_0 \tag{8.5.27}\]
\[P_n^{(1)}(a) = c_1 \tag{8.5.28}\]
\[P_n^{(2)}(a) = 2 c_2 \tag{8.5.29}\]
\[P_n^{(3)}(a) = (2)(3) c_3 \tag{8.5.30}\]
\[P_n^{(4)}(a) = (2)(3)(4) c_4 \tag{8.5.31}\]
\[P_n^{(5)}(a) = (2)(3)(4)(5) c_5 \tag{8.5.32}\]
and, in general,
\[P_n^{(k)}(a) = (2)(3)(4) \cdots (k - 1)(k) c_k = k!\, c_k. \tag{8.5.33}\]
So having
\[P_n^{(k)}(a) = f^{(k)}(a) \tag{8.5.34}\]
means that
\[k!\, c_k = f^{(k)}(a) \tag{8.5.35}\]
and therefore
\[c_k = \frac{f^{(k)}(a)}{k!} \tag{8.5.36}\]
for each value of k. Using this expression for \(c_k\), we have found the formula for the degree n polynomial approximation of f that we seek.

Taylor Polynomials
The nth order Taylor polynomial of f centered at x = a is given by
\[P_n(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n \tag{8.5.37}\]
\[= \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k. \tag{8.5.38}\]
This degree n polynomial approximates f(x) near x = a and has the property that \(P_n^{(k)}(a) = f^{(k)}(a)\) for k = 0, …, n.

Example 8.5.1

Determine the third order Taylor polynomial for \(f(x) = e^x\), as well as the general nth order Taylor polynomial for f centered at x = 0.
Solution
We know that \(f'(x) = e^x\) and so \(f''(x) = e^x\) and \(f'''(x) = e^x\). Thus,
\[f(0) = f'(0) = f''(0) = f'''(0) = 1. \tag{8.5.39}\]
So the third order Taylor polynomial of \(f(x) = e^x\) centered at x = 0 is (Equation 8.5.38)
\[P_3(x) = f(0) + f'(0)(x - 0) + \frac{f''(0)}{2!}(x - 0)^2 + \frac{f'''(0)}{3!}(x - 0)^3 \tag{8.5.40}\]
\[= 1 + x + \frac{x^2}{2} + \frac{x^3}{6}. \tag{8.5.41}\]
In general, for the exponential function f we have \(f^{(k)}(x) = e^x\) for every positive integer k. Thus, the kth term in the nth order Taylor polynomial for f(x) centered at x = 0 is
\[\frac{f^{(k)}(0)}{k!}(x - 0)^k = \frac{1}{k!} x^k. \tag{8.5.42}\]
Therefore, the nth order Taylor polynomial for \(f(x) = e^x\) centered at x = 0 is
\[P_n(x) = 1 + x + \frac{x^2}{2!} + \cdots + \frac{1}{n!} x^n = \sum_{k=0}^{n} \frac{x^k}{k!}. \tag{8.5.43}\]
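The formula in Equation 8.5.43 uses only addition, multiplication, and division, so it is straightforward to program. The following is a minimal sketch, with the illustrative name `exp_taylor`, that evaluates \(P_n(x)\) for the exponential function and compares it with the library value of \(e^x\).

```python
import math

def exp_taylor(x, n):
    """Evaluate the n-th order Taylor polynomial of e**x centered at 0."""
    total = 0.0
    term = 1.0                 # the k = 0 term, x**0 / 0!
    for k in range(n + 1):
        total += term
        term *= x / (k + 1)    # turn x**k / k! into x**(k+1) / (k+1)!
    return total

for n in (1, 3, 5, 10, 15):
    approx = exp_taylor(1.0, n)
    print(n, approx, abs(math.e - approx))
```

Building each term from the previous one avoids computing large powers and factorials separately; it is a design choice for this sketch rather than something the text prescribes.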

Activity 8.5.2: Other Functions

We have just seen that the nth order Taylor polynomial centered at a = 0 for the exponential function \(e^x\) is
\[\sum_{k=0}^{n} \frac{x^k}{k!}. \tag{8.5.44}\]
In this activity, we determine small order Taylor polynomials for several other familiar functions, and look for general patterns that will help us find the Taylor series expansions a bit later.
a. Let \(f(x) = \frac{1}{1-x}\).
i. Calculate the first four derivatives of f(x) at x = 0. Then find the fourth order Taylor polynomial \(P_4(x)\) for \(\frac{1}{1-x}\) centered at 0.
ii. Based on your results from part (i), determine a general formula for \(f^{(k)}(0)\).
b. Let f(x) = cos(x).
i. Calculate the first four derivatives of f(x) at x = 0. Then find the fourth order Taylor polynomial \(P_4(x)\) for cos(x) centered at 0.
ii. Based on your results from part (i), find a general formula for \(f^{(k)}(0)\). (Think about how k being even or odd affects the value of the kth derivative.)
c. Let f(x) = sin(x).
i. Calculate the first four derivatives of f(x) at x = 0. Then find the fourth order Taylor polynomial \(P_4(x)\) for sin(x) centered at 0.
ii. Based on your results from part (i), find a general formula for \(f^{(k)}(0)\). (Think about how k being even or odd affects the value of the kth derivative.)

It is possible that an nth order Taylor polynomial is not a polynomial of degree n; that is, the order of the approximation can be different from the degree of the polynomial. For example, in Activity 8.5.2 we found that the second order Taylor polynomial \(P_2(x)\) centered at 0 for sin(x) is \(P_2(x) = x\). In this case, the second order Taylor polynomial is a degree 1 polynomial.

Taylor Series
In Activity 8.5.2 we saw that the fourth order Taylor polynomial \(P_4(x)\) for sin(x) centered at 0 is
\[P_4(x) = x - \frac{x^3}{3!}. \tag{8.5.45}\]
The pattern we found for the derivatives \(f^{(k)}(0)\) describes the higher-order Taylor polynomials, e.g.,
\[P_5(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} \tag{8.5.46}\]
\[P_7(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \tag{8.5.47}\]
\[P_9(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} \tag{8.5.48}\]
and so on. It is instructive to consider the graphical behavior of these functions; Figure 8.5.1 shows the graphs of a few of the Taylor polynomials centered at 0 for the sine function.

Figure 8.5.1: The order 1, 5, 7, and 9 Taylor polynomials centered at x = 0 for f(x) = sin(x).
Notice that \(P_1(x)\) is close to the sine function only for values of x that are close to 0, but as we increase the degree of the Taylor polynomial, the Taylor polynomials provide a better fit to the graph of the sine function over larger intervals. This illustrates the general behavior of Taylor polynomials: for any sufficiently well-behaved function, the sequence \(\{P_n(x)\}\) of Taylor polynomials converges to the function f on larger and larger intervals (though those intervals may not necessarily increase without bound). If the Taylor polynomials ultimately converge to f on its entire domain, we write
\[f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k. \tag{8.5.49}\]
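The improving fit over larger intervals can also be checked numerically rather than graphically. The sketch below is an illustration only; the name `sin_taylor` is not from the text. It evaluates the order 1, 5, and 9 Taylor polynomials for sin(x) at a point near 0 and at a point farther away.

```python
import math

def sin_taylor(x, n):
    """Evaluate the n-th order Taylor polynomial of sin(x) centered at 0."""
    total = 0.0
    for k in range(1, n + 1, 2):             # only odd powers appear
        sign = (-1) ** ((k - 1) // 2)         # +, -, +, - for k = 1, 3, 5, 7
        total += sign * x ** k / math.factorial(k)
    return total

for x in (0.5, 3.0):
    for n in (1, 5, 9):
        print(x, n, sin_taylor(x, n), abs(math.sin(x) - sin_taylor(x, n)))
```

Near x = 0.5 even the linear polynomial is reasonable, while at x = 3 it takes a higher order polynomial before the approximation becomes close.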

Definition: Taylor and Maclaurin Series


Let f be a function all of whose derivatives exist at x = a. The Taylor series for f centered at x = a is the series \(T_f(x)\) defined by
\[T_f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k. \tag{8.5.50}\]

In the special case where a = 0 in Equation 8.5.50, the Taylor series is also called the Maclaurin series for f. From Example 8.5.1 we know the nth order Taylor polynomial centered at 0 for the exponential function \(e^x\); thus, the Maclaurin series for \(e^x\) is
\[\sum_{k=0}^{\infty} \frac{x^k}{k!}. \tag{8.5.51}\]

Activity 8.5.3

In Activity 8.5.2 we determined small order Taylor polynomials for a few familiar functions, and also found general patterns in the derivatives evaluated at 0. Use that information to write the Taylor series centered at 0 for the following functions.
a. \(f(x) = \frac{1}{1-x}\)
b. f(x) = cos(x) (You will need to carefully consider how to indicate that many of the coefficients are 0. Think about a general way to represent an even integer.)
c. f(x) = sin(x) (You will need to carefully consider how to indicate that many of the coefficients are 0. Think about a general way to represent an odd integer.)

The next activity further considers the important issue of the x-values for which the Taylor series of a function converges
to the function itself.

Activity 8.5.4

a. Plot the graphs of several of the Taylor polynomials centered at 0 (of order at least 5) for \(e^x\) and convince yourself that these Taylor polynomials converge to \(e^x\) for every value of x.
b. Draw the graphs of several of the Taylor polynomials centered at 0 (of order at least 6) for cos(x) and convince yourself that these Taylor polynomials converge to cos(x) for every value of x. Write the Taylor series centered at 0 for cos(x).
c. Draw the graphs of several of the Taylor polynomials centered at 0 for \(\frac{1}{1-x}\). Based on your graphs, for what values of x do these Taylor polynomials appear to converge to \(\frac{1}{1-x}\)? How is this situation different from what we observe with \(e^x\) and cos(x)? In addition, write the Taylor series centered at 0 for \(\frac{1}{1-x}\).

The Maclaurin series for \(e^x\), sin(x), cos(x), and \(\frac{1}{1-x}\) will be used frequently, so we should be certain to know and recognize them well.

The Interval of Convergence of a Taylor Series


In the previous section (in Figure 8.6 and Activity 8.24) we observed that the Taylor polynomials centered at 0 for \(e^x\), cos(x), and sin(x) converged to these functions for all values of x in their domain, but that the Taylor polynomials centered at 0 for \(\frac{1}{1-x}\) converged to \(\frac{1}{1-x}\) for only some values of x. In fact, the Taylor polynomials centered at 0 for \(\frac{1}{1-x}\) converge to \(\frac{1}{1-x}\) on the interval (−1, 1) and diverge for all other values of x. So the Taylor series for a function f(x) does not need to converge for all values of x in the domain of f. Our observations to date suggest two natural questions: can we determine the values of x for which a given Taylor series converges? Moreover, given the Taylor series for a function f, does it actually converge to f(x) for those values of x for which the Taylor series converges?

Example 8.5.2: The Ratio Test

Graphical evidence suggests that the Taylor series centered at 0 for \(e^x\) converges for all values of x. To verify this, use the Ratio Test to determine all values of x for which the Taylor series
\[\sum_{k=0}^{\infty} \frac{x^k}{k!} \tag{8.21}\]
converges absolutely.
Solution
In previous work, we used the Ratio Test on series of numbers that did not involve a variable; recall, too, that the Ratio Test only applies to series of nonnegative terms. In this example, we have to address the presence of the variable x. Because we are interested in absolute convergence, we apply the Ratio Test to the series
\[\sum_{k=0}^{\infty} \left|\frac{x^k}{k!}\right| = \sum_{k=0}^{\infty} \frac{|x|^k}{k!}. \tag{8.5.52}\]
Now, observe that
\[\lim_{k\to\infty} \frac{a_{k+1}}{a_k} = \lim_{k\to\infty} \frac{\frac{|x|^{k+1}}{(k+1)!}}{\frac{|x|^k}{k!}} \tag{8.5.53}\]
\[= \lim_{k\to\infty} \frac{|x|^{k+1}\, k!}{|x|^k\, (k+1)!} \tag{8.5.54}\]
\[= \lim_{k\to\infty} \frac{|x|}{k+1} \tag{8.5.55}\]
\[= 0 \tag{8.5.56}\]
for any value of x. So the Taylor series (Equation 8.21) converges absolutely for every value of x, and thus converges for every value of x.

One key question remains: while the Taylor series for \(e^x\) converges for all x, what we have done does not tell us that this Taylor series actually converges to \(e^x\) for each x. We'll return to this question when we consider the error in a Taylor approximation near the end of this section.

We can apply the main idea from Example 8.5.2 in general. To determine the values of x for which a Taylor series
\[\sum_{k=0}^{\infty} c_k (x - a)^k \tag{8.5.57}\]
centered at x = a will converge, we apply the Ratio Test with \(a_k = |c_k (x - a)^k|\) and recall that the series to which the Ratio Test is applied converges if \(\lim_{k\to\infty} \frac{a_{k+1}}{a_k} < 1\).
Observe that
\[\frac{a_{k+1}}{a_k} = |x - a| \frac{|c_{k+1}|}{|c_k|}, \tag{8.5.58}\]
so when we apply the Ratio Test, we get that
\[\lim_{k\to\infty} \frac{a_{k+1}}{a_k} = \lim_{k\to\infty} |x - a| \left|\frac{c_{k+1}}{c_k}\right|. \tag{8.5.59}\]
Note further that \(c_k = \dfrac{f^{(k)}(a)}{k!}\), and say that
\[\lim_{k\to\infty} \left|\frac{c_{k+1}}{c_k}\right| = L. \tag{8.5.60}\]
Thus, we have found that
\[\lim_{k\to\infty} \frac{a_{k+1}}{a_k} = |x - a| \cdot L. \tag{8.5.61}\]

There are three important possibilities for L: L can be 0, a finite positive value, or infinite. Based on this value of L, we can therefore determine for which values of x the original Taylor series converges.
If L = 0, then the Taylor series converges on \((-\infty, \infty)\).
If L is infinite, then the Taylor series converges only at x = a.
If L is finite and nonzero, then the Taylor series converges absolutely for all x that satisfy
\[|x - a| \cdot L < 1. \tag{8.5.62}\]
In other words, the series converges absolutely for all x such that
\[|x - a| < \frac{1}{L}, \tag{8.5.63}\]
which is also the interval
\[\left(a - \frac{1}{L},\; a + \frac{1}{L}\right). \tag{8.5.64}\]

Because the Ratio Test is inconclusive when \(|x - a| \cdot L = 1\), the endpoints \(a \pm \frac{1}{L}\) have to be checked separately. It is important to notice that the set of x values at which a Taylor series converges is always an interval centered at x = a. For this reason, the set on which a Taylor series converges is called the interval of convergence. Half the length of the interval of convergence is called the radius of convergence. If the interval of convergence of a Taylor series is infinite, then we say that the radius of convergence is infinite.
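The limit L in Equation 8.5.60 can be estimated numerically by evaluating the ratio of consecutive coefficients at a large index. The sketch below is only a heuristic illustration, not a proof of convergence; the function name `coefficient_ratio` is hypothetical, and the series \(\sum 2^k x^k\) is an extra example introduced here purely for illustration.

```python
import math

def coefficient_ratio(c, k):
    """Return |c(k+1) / c(k)|, a finite-k estimate of the limit L."""
    return abs(c(k + 1) / c(k))

# Coefficients of three series centered at a = 0.
geometric = lambda k: 1.0                          # sum x**k, i.e. 1/(1 - x)
doubled = lambda k: 2.0 ** k                       # sum (2x)**k (illustrative)
exponential = lambda k: 1 / math.factorial(k)      # sum x**k / k!, i.e. e**x

for name, c in (("geometric", geometric),
                ("doubled", doubled),
                ("exponential", exponential)):
    estimates = [coefficient_ratio(c, k) for k in (10, 50, 100)]
    print(name, estimates)
```

For the geometric series the ratio is 1, suggesting radius of convergence 1/L = 1; for the doubled series the ratio is 2, suggesting radius 1/2; for the exponential series the ratio 1/(k + 1) shrinks toward 0, consistent with an infinite radius of convergence.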

Activity 8.5.5: Using the Ratio Test

a. Use the Ratio Test to explicitly determine the interval of convergence of the Taylor series for \(f(x) = \frac{1}{1-x}\) centered at x = 0.
b. Use the Ratio Test to explicitly determine the interval of convergence of the Taylor series for f(x) = cos(x) centered at x = 0.
c. Use the Ratio Test to explicitly determine the interval of convergence of the Taylor series for f(x) = sin(x) centered at x = 0.

The Ratio Test tells us how we can determine the set of x values for which a Taylor series converges absolutely. However,
just because a Taylor series for a function f converges, we cannot be certain that the Taylor series actually converges to
f (x) on its interval of convergence. To show why and where a Taylor series does in fact converge to the function f , we

next consider the error that is present in Taylor polynomials.

Error Approximations for Taylor Polynomials


We now know how to find Taylor polynomials for functions such as sin(x), as well as how to determine the interval of convergence of the corresponding Taylor series. We next develop an error bound that will tell us how well an nth order Taylor polynomial \(P_n(x)\) approximates its generating function f(x). This error bound will also allow us to determine whether a Taylor series on its interval of convergence actually equals the function f from which the Taylor series is derived. Finally, we will be able to use the error bound to determine the order of the Taylor polynomial \(P_n(x)\) for a function f that we need to ensure that \(P_n(x)\) approximates f(x) to any desired degree of accuracy.

In all of this, we need to compare \(P_n(x)\) to f(x). For this argument, we assume throughout that we center our approximations at 0 (a similar argument holds for approximations centered at a). We define the exact error, \(E_n(x)\), that results from approximating f(x) with \(P_n(x)\) by
\[E_n(x) = f(x) - P_n(x). \tag{8.5.65}\]
We are particularly interested in \(|E_n(x)|\), the distance between \(P_n\) and f. Note that since
\[P_n^{(k)}(0) = f^{(k)}(0) \tag{8.5.66}\]
for 0 ≤ k ≤ n, we know that
\[E_n^{(k)}(0) = 0 \tag{8.5.67}\]
for 0 ≤ k ≤ n. Furthermore, since \(P_n(x)\) is a polynomial of degree less than or equal to n, we know that
\[P_n^{(n+1)}(x) = 0. \tag{8.5.68}\]
Thus, since
\[E_n^{(n+1)}(x) = f^{(n+1)}(x) - P_n^{(n+1)}(x), \tag{8.5.69}\]
it follows that
\[E_n^{(n+1)}(x) = f^{(n+1)}(x) \tag{8.5.70}\]
for all x.
Suppose that we want to approximate f(x) at a number c close to 0 using \(P_n(c)\). If we assume \(|f^{(n+1)}(t)|\) is bounded by some number M on [0, c], so that \(|f^{(n+1)}(t)| \leq M\) for all 0 ≤ t ≤ c, then we can say that
\[|E_n^{(n+1)}(t)| = |f^{(n+1)}(t)| \leq M \tag{8.5.71}\]
for all t between 0 and c. Equivalently,
\[-M \leq E_n^{(n+1)}(t) \leq M \tag{8.22}\]
on [0, c]. Next, we integrate the three terms in the inequality 8.22 from t = 0 to t = x, and thus find that
\[\int_0^x -M \,dt \leq \int_0^x E_n^{(n+1)}(t) \,dt \leq \int_0^x M \,dt \tag{8.5.72}\]
for every value of x in [0, c]. Since \(E_n^{(n)}(0) = 0\), the First FTC tells us that
\[-Mx \leq E_n^{(n)}(x) \leq Mx \tag{8.5.73}\]
for every x in [0, c]. Integrating the most recent inequality, we obtain
\[\int_0^x -Mt \,dt \leq \int_0^x E_n^{(n)}(t) \,dt \leq \int_0^x Mt \,dt \tag{8.5.74}\]
and thus
\[-M \frac{x^2}{2} \leq E_n^{(n-1)}(x) \leq M \frac{x^2}{2} \tag{8.5.75}\]
for all x in [0, c]. Repeating this process until we have integrated a total of n + 1 times, we arrive at
\[-M \frac{x^{n+1}}{(n+1)!} \leq E_n(x) \leq M \frac{x^{n+1}}{(n+1)!} \tag{8.5.76}\]
for all x in [0, c]. This enables us to conclude that
\[|E_n(x)| \leq M \frac{|x|^{n+1}}{(n+1)!} \tag{8.5.77}\]
for all x in [0, c], which shows an important bound on the approximation's error, \(E_n\). Our work above was based on the approximation centered at a = 0; the argument may be generalized to hold for any value of a, which results in the following theorem.

The Lagrange Error Bound for \(P_n(x)\)

Let f be a continuous function with n + 1 continuous derivatives. Suppose that M is a positive real number such that \(|f^{(n+1)}(x)| \leq M\) on the interval [a, c]. If \(P_n(x)\) is the nth order Taylor polynomial for f(x) centered at x = a, then
\[|P_n(c) - f(c)| \leq M \frac{|c - a|^{n+1}}{(n+1)!}. \tag{8.5.78}\]

This error bound may now be used to tell us important information about Taylor polynomials and Taylor series, as we see
in the following examples and activities.

Exercise 8.5.6:

Determine how well the 10th order Taylor polynomial \(P_{10}(x)\) for sin(x), centered at 0, approximates sin(2).
Solution.
To answer this question we use f(x) = sin(x), c = 2, a = 0, and n = 10 in the Lagrange error bound formula. To use the bound, we also need to find an appropriate value for M. Note that the derivatives of f(x) = sin(x) are all equal to ± sin(x) or ± cos(x). Thus,
\[|f^{(n+1)}(x)| \leq 1\]
for any n and x. Therefore, we can choose M to be 1. Then
\[|P_{10}(2) - f(2)| \leq (1) \frac{|2 - 0|^{11}}{(11)!} = \frac{2^{11}}{(11)!} \approx 0.00005130671797.\]
So \(P_{10}(2)\) approximates sin(2) to within at most 0.00005130671797. A computer algebra system tells us that
\[P_{10}(2) \approx 0.9093474427 \text{ and } \sin(2) \approx 0.9092974268\]
with an actual difference of about 0.0000500159.
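The numbers in this solution can be reproduced without a computer algebra system. The following minimal sketch (the names `sin_taylor`, `bound`, and `actual` are illustrative) computes the Lagrange bound with M = 1 and compares it with the actual error.

```python
import math

def sin_taylor(x, n):
    """Evaluate the n-th order Taylor polynomial of sin(x) centered at 0."""
    total = 0.0
    for k in range(1, n + 1, 2):             # only odd powers appear
        sign = (-1) ** ((k - 1) // 2)
        total += sign * x ** k / math.factorial(k)
    return total

n, c, a, M = 10, 2.0, 0.0, 1.0
bound = M * abs(c - a) ** (n + 1) / math.factorial(n + 1)   # Lagrange bound
actual = abs(sin_taylor(c, n) - math.sin(c))                # true error

print("P_10(2)      =", sin_taylor(c, n))
print("sin(2)       =", math.sin(c))
print("error bound  =", bound)
print("actual error =", actual)
```

The actual error sits just below the bound here, which shows that the Lagrange bound can be quite sharp even though in general it is only an upper estimate.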

Activity 8.5.7:

Let \(P_n(x)\) be the nth order Taylor polynomial for sin(x) centered at x = 0. Determine how large we need to choose n so that \(P_n(2)\) approximates sin(2) to 20 decimal places.

Example 8.5.3:

Show that the Taylor series for sin(x) actually converges to sin(x) for all x.
Solution
Recall from the previous example that since f(x) = sin(x), we know
\[|f^{(n+1)}(x)| \leq 1\]
for any n and x. This allows us to choose M = 1 in the Lagrange error bound formula. Thus,
\[|P_n(x) - \sin(x)| \leq \frac{|x|^{n+1}}{(n+1)!} \tag{8.23}\]
for every x. We showed in earlier work that the Taylor series \(\sum_{k=0}^{\infty} \frac{x^k}{k!}\) converges for every value of x. Since the terms of any convergent series must approach zero, it follows that
\[\lim_{n\to\infty} \frac{x^{n+1}}{(n+1)!} = 0\]
for every value of x. Thus, taking the limit as n → ∞ in the inequality (8.23), it follows that
\[\lim_{n\to\infty} |P_n(x) - \sin(x)| = 0.\]
As a result, we can now write
\[\sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!}\]
for every real number x.

Activity 8.5.8:

a. Show that the Taylor series centered at 0 for cos(x) converges to cos(x) for every real number x.
b. Next we consider the Taylor series for \(e^x\).
i. Show that the Taylor series centered at 0 for \(e^x\) converges to \(e^x\) for every nonnegative value of x.
ii. Show that the Taylor series centered at 0 for \(e^x\) converges to \(e^x\) for every negative value of x.
iii. Explain why the Taylor series centered at 0 for \(e^x\) converges to \(e^x\) for every real number x. Recall that we earlier showed that the Taylor series centered at 0 for \(e^x\) converges for all x, and we have now completed the argument that the Taylor series for \(e^x\) actually converges to \(e^x\) for all x.
c. Let \(P_n(x)\) be the nth order Taylor polynomial for \(e^x\) centered at 0. Find a value of n so that \(P_n(5)\) approximates \(e^5\) correct to 8 decimal places.

Summary
In this section, we encountered the following important ideas:
We can use Taylor polynomials to approximate complicated functions. This allows us to approximate values of
complicated functions using only addition, subtraction, multiplication, and division of real numbers. The n th order
Taylor polynomial centered at x = a of a function f is
′′ (n) (
f (a) f (a) n
f k)(a)
′ 2 n k
Pn (x) = f (a) + f (a)(x − a) + (x − a) +⋅ ⋅ ⋅ + (x − a) =∑ (x − a) .
k=0
2! n! k!

The Taylor series centered at x = a for a function f is



∑k=0 f (k)(a)k!(x − a)k .
The n th order Taylor polynomial centered at a for f is the n th partial sum of its Taylor series centered at a . So the n th
order Taylor polynomial for a function f is an approximation to f on the interval where the Taylor series converges; for
the values of x for which the Taylor series converges to f we write
(

f k)(a)
f (x) = ∑k=0 (x − a)
k
.
k!

The Lagrange Error Bound shows us how to determine the accuracy in using a Taylor polynomial to approximate a
function. More specifically, if \(P_n(x)\) is the nth order Taylor polynomial for f centered at x = a and if M is an upper bound for \(|f^{(n+1)}(x)|\) on the interval [a, c], then
\[|P_n(c) - f(c)| \le M \frac{|c-a|^{n+1}}{(n+1)!}.\]
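The Lagrange Error Bound can be checked numerically. The following is a minimal sketch, not part of the original text, that compares the actual error of the Taylor polynomials for \(e^x\) at c = 1 with the bound. The function names `taylor_exp` and `lagrange_bound` are illustrative, and the choice M = 3 rests on the assumption that \(e^x \le e < 3\) on [0, 1].

```python
import math

def taylor_exp(x, n):
    """n-th order Taylor polynomial of e^x centered at 0, evaluated at x."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def lagrange_bound(M, c, a, n):
    """Lagrange error bound M * |c - a|^(n+1) / (n+1)!."""
    return M * abs(c - a) ** (n + 1) / math.factorial(n + 1)

c = 1.0
# Assumption: every derivative of e^x is e^x, and e^x <= e < 3 on [0, 1],
# so M = 3 is a valid upper bound for |f^(n+1)(x)| on that interval.
M = 3.0

for n in (2, 4, 6, 8):
    actual_error = abs(taylor_exp(c, n) - math.exp(c))
    bound = lagrange_bound(M, c, 0.0, n)
    print(n, actual_error, bound)
```

In each printed row the actual error should sit below the bound, and both shrink quickly as n grows because of the factorial in the denominator.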

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

Matthew Boelkins, David Austin & Steven


8.5.11 11/24/2021 https://math.libretexts.org/@go/page/4345
Schlicker
8.6: Power Series
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a power series?
What are some important uses of power series?
What is the connection between power series and Taylor series?

We have noted at several points in our work with Taylor polynomials and Taylor series that polynomial functions are the
simplest possible functions in mathematics, in part because they essentially only require addition and multiplication to
evaluate. Moreover, from the point of view of calculus, polynomials are especially nice: we can easily differentiate or
integrate any polynomial. In light of our work in Section 8.5, we now know that several important non-polynomials have
polynomial-like expansions. For example, for any real number x,
\[e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots.\]

As we continue our study of infinite series, there are two settings where other series like the one for \(e^x\) arise: one is where we are simply given an expression like
\[1 + 2x + 3x^2 + 4x^3 + \cdots\]
and we seek the values of x for which the expression makes sense, while another is where we are trying to find an unknown function f, and we think about the possibility that the function has expression
\[f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_k x^k + \cdots,\]
and we try to determine the values of the constants \(a_0, a_1, \ldots\). The latter situation is explored in Preview Activity 8.6.1.

Preview Activity 8.6.1

In Chapter 7, we learned some of the many important applications of differential equations, and learned some
approaches to solve or analyze them. Here, we consider an important approach that will allow us to solve a wider
variety of differential equations. Let’s consider the familiar differential equation from exponential population growth
given by

\[y' = ky, \tag{8.25}\]

where k is the constant of proportionality. While we can solve this differential equation using methods we have
already learned, we take a different approach now that can be applied to a much larger set of differential equations. For
the rest of this activity, let’s assume that k = 1 . We will use our knowledge of Taylor series to find a solution to the
differential Equation 8.25.
To do so, we assume that we have a solution y = f (x) and that f (x) has a Taylor series that can be written in the form

\[y = f(x) = \sum_{k=0}^{\infty} a_k x^k,\]
where the coefficients \(a_k\) are undetermined. Our task is to find the coefficients.

a. Assume that we can differentiate a power series term by term. By taking the derivative of f (x) with respect to x
and substituting the result into the differential equation (8.25), show that the equation
\[\sum_{k=1}^{\infty} k a_k x^{k-1} = \sum_{k=0}^{\infty} a_k x^k\]
must be satisfied in order for \(f(x) = \sum_{k=0}^{\infty} a_k x^k\) to be a solution of the DE.
b. Two series are equal if and only if they have the same coefficients on like power terms. Use this fact to find a relationship between \(a_1\) and \(a_0\).
c. Now write \(a_2\) in terms of \(a_1\). Then write \(a_2\) in terms of \(a_0\).

d. Write \(a_3\) in terms of \(a_2\). Then write \(a_3\) in terms of \(a_0\).
e. Write \(a_4\) in terms of \(a_3\). Then write \(a_4\) in terms of \(a_0\).
f. Observe that there is a pattern in (b)-(e). Find a general formula for \(a_k\) in terms of \(a_0\).
g. Write the series expansion for y using only the unknown coefficient \(a_0\). From this, determine what familiar functions satisfy the differential equation (8.25). (Hint: Compare to a familiar Taylor series.)

Power Series
As Preview Activity 8.6.1 shows, it can be useful to treat an unknown function as if it has a Taylor series, and then
determine the coefficients from other information. In other words, we define a function as an infinite series of powers of x
and then determine the coefficients based on something besides a formula for the function. This method of using series
illustrated in Preview Activity 8.6.1 to solve differential equations is a powerful and important one that allows us to
approximate solutions to many different types of differential equations even if we cannot explicitly solve them. This
approach is different from defining a Taylor series based on a given function, and these functions we define with arbitrary
coefficients are given a special name.
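The coefficient-matching idea described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the text's own method: it treats the equation y' = ky for a general constant k, and the names `series_coefficients` and `evaluate`, along with the sample values k = 2 and \(a_0 = 3\), are hypothetical. The partial sum is compared with \(a_0 e^{kx}\), the solution obtained by the earlier methods the text mentions.

```python
import math

def series_coefficients(k, a0, n_terms):
    """Coefficients a_0 .. a_{n_terms-1} of a power series solution of y' = k*y.

    Differentiating sum a_j x^j term by term gives sum (j+1) a_{j+1} x^j.
    Matching the coefficient of x^j with that of k * sum a_j x^j yields
    (j+1) a_{j+1} = k a_j, which is the recurrence used below.
    """
    coeffs = [a0]
    for j in range(n_terms - 1):
        coeffs.append(k * coeffs[j] / (j + 1))
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the partial sum of coeffs[j] * x**j."""
    return sum(c * x**j for j, c in enumerate(coeffs))

# Hypothetical sample values chosen only for illustration.
coeffs = series_coefficients(k=2.0, a0=3.0, n_terms=25)
x = 0.7
print(evaluate(coeffs, x), 3.0 * math.exp(2.0 * x))
```

The two printed numbers should agree to many decimal places, which shows how a polynomial built only from the differential equation and one starting value approximates the solution.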

Definition

A power series centered at x = a is a function of the form
\[\sum_{k=0}^{\infty} c_k (x-a)^k \tag{8.26}\]
where \(\{c_k\}\) is a sequence of real numbers and x is an independent variable.

We can substitute different values for x and test whether the resulting series converges or diverges. Thus, a power series
defines a function f whose domain is the set of x values for which the power series converges. We therefore write
\[f(x) = \sum_{k=0}^{\infty} c_k (x-a)^k.\]

It turns out that, on its interval of convergence, a power series is the Taylor series of the function that is the sum of the power series (see Exercise 2 in this section), so all of the techniques we developed in the previous section can be applied to power series as well.

Example 8.6.1

Consider the power series defined by
\[f(x) = \sum_{k=0}^{\infty} \frac{x^k}{2^k}.\]
What are f(1) and \(f\left(\frac{3}{2}\right)\)? Find a general formula for f(x) and determine the values for which this power series converges.

Solution
If we evaluate f at x = 1 we obtain the series
\[\sum_{k=0}^{\infty} \frac{1}{2^k},\]
which is a geometric series with ratio \(\frac{1}{2}\). So we can sum this series and find that
\[f(1) = \frac{1}{1 - \frac{1}{2}} = 2.\]
Similarly,

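\[f\left(\frac{3}{2}\right) = \sum_{k=0}^{\infty} \left(\frac{3}{4}\right)^k = \frac{1}{1 - \frac{3}{4}} = 4.\]
In general, f(x) is a geometric series with ratio \(\frac{x}{2}\) and
\[f(x) = \sum_{k=0}^{\infty} \left(\frac{x}{2}\right)^k = \frac{1}{1 - \frac{x}{2}} = \frac{2}{2 - x}\]
provided that \(-1 < \frac{x}{2} < 1\) (so that the ratio is less than 1 in absolute value). Thus, the power series that defines f converges for −2 < x < 2.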
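As a quick numerical check of this example, the sketch below compares partial sums of the series with the closed form \(\frac{2}{2-x}\) at a few points inside the interval, and shows the partial sums growing at a point outside it. The function names and the sample points are illustrative choices, not part of the text.

```python
def partial_sum(x, n):
    """Sum of (x/2)**k for k = 0, 1, ..., n."""
    return sum((x / 2) ** k for k in range(n + 1))

def closed_form(x):
    """Sum of the geometric series, valid only for -2 < x < 2."""
    return 2 / (2 - x)

# Points inside the interval of convergence: partial sums approach 2/(2 - x).
for x in (1.0, 1.5, -1.9):
    print(x, partial_sum(x, 500), closed_form(x))

# A point outside the interval: the partial sums keep growing.
print(partial_sum(2.5, 10), partial_sum(2.5, 20))
```

Convergence is slower near the endpoints (for example at x = −1.9, where the ratio is −0.95), which is why the sketch uses several hundred terms.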


As with Taylor series, we define the interval of convergence of a power series (8.26) to be the set of values of x for
which the series converges. In the same way as we did with Taylor series, we typically use the Ratio Test to find the
values of x for which the power series converges absolutely, and then check the endpoints separately if the radius of
convergence is finite.

Example 8.6.2

Let
\[f(x) = \sum_{k=1}^{\infty} \frac{x^k}{k^2}.\]

Determine the interval of convergence of this power series.


First we will draw graphs of some of the partial sums of this power series to get an idea of the interval of convergence.
Let
\[S_n(x) = \sum_{k=1}^{n} \frac{x^k}{k^2}\]
for each n ≥ 1. Figure 8.7 shows plots of \(S_{10}(x)\) (in red), \(S_{25}(x)\) (in blue), and \(S_{50}(x)\) (in green). The behavior of \(S_{50}\) particularly highlights that it appears to be converging to a particular curve on the interval (−1, 1), while growing without bound outside of that interval.

Figure 8.7: Graphs of partial sums of the power series \(\sum_{k=1}^{\infty} \frac{x^k}{k^2}\)

This suggests that the interval of convergence might be −1 < x < 1 . To more fully understand this power series, we
apply the Ratio Test to determine the values of x for which the power series converges absolutely. For the given series,
we have
\[a_k = \frac{x^k}{k^2},\]
so

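\begin{align*}
\lim_{k \to \infty} \frac{|a_{k+1}|}{|a_k|} &= \lim_{k \to \infty} \frac{\ \frac{|x|^{k+1}}{(k+1)^2}\ }{\ \frac{|x|^k}{k^2}\ } \\
&= \lim_{k \to \infty} |x| \left(\frac{k}{k+1}\right)^2 \\
&= |x| \lim_{k \to \infty} \left(\frac{k}{k+1}\right)^2 \\
&= |x|.
\end{align*}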

Therefore, the Ratio Test tells us that the given power series f (x) converges absolutely when |x| < 1 and diverges
when |x| > 1. Since the Ratio Test is inconclusive when |x| = 1, we need to check x = 1 and x = −1 individually.
When x = 1 , observe that
\[f(1) = \sum_{k=1}^{\infty} \frac{1}{k^2}.\]
This is a p-series with p > 1, which we know converges. When x = −1, we have
\[f(-1) = \sum_{k=1}^{\infty} \frac{(-1)^k}{k^2}.\]
This is an alternating series, and since the sequence \(\left\{\frac{1}{n^2}\right\}\) decreases to 0, the power series converges when x = −1 by the Alternating Series Test. Thus, the interval of convergence of this power series is −1 ≤ x ≤ 1.
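The Ratio Test computation and the endpoint checks in this example can be explored numerically. The sketch below is illustrative only; the function names are hypothetical. The reference values \(\frac{\pi^2}{6}\) and \(-\frac{\pi^2}{12}\) for the endpoint sums are standard results that are not derived in this section, and are included here only for comparison.

```python
import math

def ratio(x, k):
    """|a_{k+1}| / |a_k| for a_k = x**k / k**2, simplified to |x| * (k/(k+1))**2."""
    return abs(x) * (k / (k + 1)) ** 2

def partial_sum(x, n):
    """S_n(x) = sum of x**k / k**2 for k = 1, ..., n."""
    return sum(x**k / k**2 for k in range(1, n + 1))

# The ratio approaches |x| as k grows, as the limit computation shows.
print(ratio(0.8, 10), ratio(0.8, 1000))

# Endpoint behavior: both series settle toward finite values.
# The comparison values below are known sums, stated without proof.
print(partial_sum(1.0, 100000), math.pi**2 / 6)
print(partial_sum(-1.0, 100000), -math.pi**2 / 12)
```

The alternating series at x = −1 settles much faster than the p-series at x = 1, which matches the error estimate from the Alternating Series Test.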

Activity 8.6.1

Determine the interval of convergence of each power series.


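a. \(\sum_{k=1}^{\infty} \frac{(x-1)^k}{3k}\)
b. \(\sum_{k=1}^{\infty} k x^k\)
c. \(\sum_{k=1}^{\infty} \frac{k^2 (x+1)^k}{4^k}\)
d. \(\sum_{k=1}^{\infty} \frac{x^k}{(2k)!}\)
e. \(\sum_{k=1}^{\infty} k!\, x^k\)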

Manipulating Power Series


Recall that we know several power series expressions for important functions such as sin(x) and \(e^x\). Often, we can take a known power series expression for such a function and use that series expansion to find a power series for a different, but related, function. The next activity demonstrates one way to do this.

Activity 8.6.1
Our goal in this activity is to find a power series expansion for \(f(x) = \frac{1}{1+x^2}\) centered at x = 0.
While we could use the methods of Section 8.5 and differentiate \(f(x) = \frac{1}{1+x^2}\) several times to look for patterns and find the Taylor series for f(x), we seek an alternate approach because of how complicated the derivatives of f(x) quickly become.
a. What is the Taylor series expansion for \(g(x) = \frac{1}{1-x}\)? What is the interval of convergence of this series?

b. How is \(g(-x^2)\) related to f(x)? Explain, and hence substitute \(-x^2\) for x in the power series expansion for g(x). Given the relationship between \(g(-x^2)\) and f(x), how is the resulting series related to f(x)?

c. For which values of x will this power series expansion for f (x) be valid? Why?

In a previous section we determined several important Maclaurin series and their intervals of convergence. Here, we list
these key functions and remind ourselves of their corresponding expansions.
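\[\sin(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!} \quad \text{for } -\infty < x < \infty\]
\[\cos(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!} \quad \text{for } -\infty < x < \infty\]
\[e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!} \quad \text{for } -\infty < x < \infty\]
\[\frac{1}{1-x} = \sum_{k=0}^{\infty} x^k \quad \text{for } -1 < x < 1\]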

As we saw in the preceding activity, we can use these known series to find other power series expansions for related functions such as \(\sin(x^2)\), \(e^{5x^3}\), and \(\cos(x^5)\). Another important way that we can manipulate power series is illustrated in the next activity.
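Substitution into a known series can be tested numerically. As a minimal sketch, assuming we replace x by \(x^2\) in the Maclaurin series for sin(x) listed above, the partial sums below should approach \(\sin(x^2)\). The function name `sin_of_x_squared` and the sample points are illustrative.

```python
import math

def sin_of_x_squared(x, n):
    """Partial sum obtained by replacing x with x**2 in the Maclaurin series for sin.

    The k-th term is (-1)**k * (x**2)**(2k+1) / (2k+1)!.
    """
    return sum((-1) ** k * (x**2) ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n + 1))

for x in (0.5, 1.0, 2.0):
    print(x, sin_of_x_squared(x, 20), math.sin(x**2))
```

Because the series for sin(x) converges for every real number, the substituted series also converges for every x, so no interval restriction is needed in this particular case.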

Activity 8.6.2

Let f be the function given by the power series expansion
\[f(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!}.\]
a. Assume that we can differentiate a power series term by term, just like we can differentiate a (finite) polynomial. Use the fact that
\[f(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + (-1)^k \frac{x^{2k}}{(2k)!} + \cdots\]
to find a power series expansion for \(f'(x)\).
b. Observe that f(x) and \(f'(x)\) have familiar Taylor series. What familiar functions are these? What known relationship does our work demonstrate?
c. What is the series expansion for \(f''(x)\)? What familiar function is \(f''(x)\)?

It turns out that our work in the preceding activity holds more generally. The corresponding theorem, which we will not prove, states that we can differentiate a power series for a function f term by term and obtain the series expansion for \(f'\), and similarly we can integrate a series expansion for a function f term by term and obtain the series expansion for \(\int f(x)\,dx\).
For both, the radius of convergence of the resulting series is the same as the original, though it is possible that the
convergence status of the resulting series may differ at the endpoints. The formal statement of the Power Series
Differentiation and Integration Theorem follows.

Power Series Differentiation and Integration Theorem.


Suppose f(x) has a power series expansion
\[f(x) = \sum_{k=0}^{\infty} c_k x^k\]
so that the series converges absolutely to f(x) on the interval −r < x < r. Then, the power series \(\sum_{k=1}^{\infty} k c_k x^{k-1}\) obtained by differentiating the power series for f(x) term by term converges absolutely to \(f'(x)\) on the interval −r < x < r. That is,
\[f'(x) = \sum_{k=1}^{\infty} k c_k x^{k-1}, \quad \text{for } |x| < r.\]

Similarly, the power series \(\sum_{k=0}^{\infty} c_k \frac{x^{k+1}}{k+1}\) obtained by integrating the power series for f(x) term by term converges absolutely on the interval −r < x < r, and
\[\int f(x)\,dx = \sum_{k=0}^{\infty} c_k \frac{x^{k+1}}{k+1} + C, \quad \text{for } |x| < r.\]

This theorem validates the steps we took in the preceding activity. It is important to note that this result about differentiating and integrating power series tells us that we can differentiate and integrate term by term on the interior of the interval of convergence, but it does not tell us what happens at the endpoints of this interval. We always need to check what happens at the endpoints separately. More importantly, we can use the approach of differentiating or integrating a series term by term to find new series.
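Term-by-term differentiation and integration act directly on the list of coefficients, which makes them easy to sketch in code. The example below is an illustration, not the text's own construction: it represents a series by its first few coefficients \(c_0, c_1, \ldots\) (as exact fractions) and uses the Maclaurin coefficients of sin(x) and cos(x) from the list earlier in this section. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def sin_coeffs(n):
    """Coefficients c_0..c_n of the Maclaurin series for sin(x)."""
    return [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 1 else Fraction(0)
            for k in range(n + 1)]

def cos_coeffs(n):
    """Coefficients c_0..c_n of the Maclaurin series for cos(x)."""
    return [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 0 else Fraction(0)
            for k in range(n + 1)]

def differentiate(coeffs):
    """Term-by-term derivative: [c_0, c_1, c_2, ...] -> [1*c_1, 2*c_2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def integrate(coeffs, C=Fraction(0)):
    """Term-by-term antiderivative: [c_0, c_1, ...] -> [C, c_0/1, c_1/2, ...]."""
    return [C] + [c / (k + 1) for k, c in enumerate(coeffs)]

# Differentiating the sine coefficients gives the cosine coefficients,
# and integrating the cosine coefficients (with C = 0) gives them back.
print(differentiate(sin_coeffs(9)) == cos_coeffs(8))
print(integrate(cos_coeffs(8)) == sin_coeffs(9))
```

Working with truncated coefficient lists says nothing about convergence; the theorem above is what guarantees that the resulting series still represent the derivative and antiderivative on the interior of the interval.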

Example 8.6.3

Find a series expansion centered at x = 0 for arctan(x), as well as its interval of convergence.
Solution.
While we could differentiate arctan(x) repeatedly and look for patterns in the derivative values at x = 0 in an attempt to find the Maclaurin series for arctan(x) from the definition, it turns out to be far easier to use a known series in an insightful way. In the earlier activity on \(\frac{1}{1+x^2}\), we found that
\[\frac{1}{1+x^2} = \sum_{k=0}^{\infty} (-1)^k x^{2k}\]
for −1 < x < 1. Recall that
\[\frac{d}{dx}[\arctan(x)] = \frac{1}{1+x^2},\]
and therefore
\[\int \frac{1}{1+x^2}\,dx = \arctan(x) + C.\]
It follows that we can integrate the series for \(\frac{1}{1+x^2}\) term by term to obtain the power series expansion for arctan(x). Doing so, we find that
\begin{align*}
\arctan(x) &= \int \left(\sum_{k=0}^{\infty} (-1)^k x^{2k}\right) dx \\
&= \sum_{k=0}^{\infty} \left(\int (-1)^k x^{2k}\,dx\right) \\
&= \left(\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k+1}\right) + C.
\end{align*}
The Power Series Differentiation and Integration Theorem tells us that this equality is valid for at least −1 < x < 1. To find the value of the constant C, we can use the fact that arctan(0) = 0. So
\[0 = \arctan(0) = \left(\sum_{k=0}^{\infty} (-1)^k \frac{0^{2k+1}}{2k+1}\right) + C = C,\]
and we must have C = 0. Therefore,
\[\arctan(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k+1} \tag{8.27}\]
for at least −1 < x < 1.


It is a straightforward exercise to check that the power series
\[\sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k+1}\]

converges both when x = −1 and when x = 1; in each case, we have an alternating series with terms \(\frac{1}{2k+1}\) that decrease to 0, and thus the interval of convergence for the series expansion for arctan(x) in Equation (8.27) is −1 ≤ x ≤ 1.
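The series for arctan(x) is easy to test numerically. The sketch below, with the hypothetical function name `arctan_series`, compares partial sums with Python's `math.atan` inside the interval and then at the endpoint x = 1. That the endpoint sum equals arctan(1) = \(\frac{\pi}{4}\) is a known fact (it follows from Abel's theorem, which this section does not cover), so the second comparison is offered as an observation rather than a consequence of the theorem above.

```python
import math

def arctan_series(x, n):
    """Partial sum of (-1)**k * x**(2k+1) / (2k+1) for k = 0, ..., n."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(n + 1))

# Inside the interval the partial sums settle very quickly.
print(arctan_series(0.5, 20), math.atan(0.5))

# At the endpoint x = 1 the series still converges, but slowly:
# four times the partial sum creeps toward pi.
for n in (10, 1000, 100000):
    print(n, 4 * arctan_series(1.0, n), math.pi)
```

The slow convergence at x = 1 reflects the fact that the terms \(\frac{1}{2k+1}\) shrink only like \(\frac{1}{k}\), whereas for |x| < 1 the powers of x make the terms shrink geometrically.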

Activity 8.6.3:

Find a power series expansion for ln(1 + x) centered at x = 0 and determine its interval of convergence. (Hint: Use the Taylor series expansion for \(\frac{1}{1+x}\) centered at x = 0.)

Summary
In this section, we encountered the following important ideas:
A power series is a series of the form
\[\sum_{k=0}^{\infty} a_k x^k.\]

We can often assume a solution to a given problem can be written as a power series, then use the information in the
problem to determine the coefficients in the power series. This method allows us to approximate solutions to certain
problems using partial sums of the power series; that is, we can find approximate solutions that are polynomials.
The connection between power series and Taylor series is that they are essentially the same thing: on its interval of
convergence a power series is the Taylor series of its sum.

Contributors and Attributions


Matt Boelkins (Grand Valley State University), David Austin (Grand Valley State University), Steve Schlicker (Grand
Valley State University)

Matthew Boelkins, David Austin & Steven


8.6.7 11/10/2021 https://math.libretexts.org/@go/page/4346
Schlicker
8.E: Sequences and Series (Exercises)
8.1: Sequences
1. Finding limits of convergent sequences can be a challenge. However, there is a useful tool we can adapt from our study
of limits of continuous functions at infinity to use to find limits of sequences. We illustrate in this exercise with the
example of the sequence \(\frac{\ln(n)}{n}\).
(a) Calculate the first 10 terms of this sequence. Based on these calculations, do you think the sequence converges or
diverges? Why?
(b) For this sequence, there is a corresponding continuous function f defined by \(f(x) = \frac{\ln(x)}{x}\). Draw the graph of f(x) on the interval [0, 10] and then plot the entries of the sequence on the graph. What conclusion do you think we can draw about the sequence \(\left\{\frac{\ln(n)}{n}\right\}\) if \(\lim_{x \to \infty} f(x) = L\)? Explain.
(c) Note that f(x) has the indeterminate form \(\frac{\infty}{\infty}\) as x goes to infinity. What idea from differential calculus can we use to calculate \(\lim_{x \to \infty} f(x)\)? Use this method to find \(\lim_{x \to \infty} f(x)\). What, then, is \(\lim_{n \to \infty} \frac{\ln(n)}{n}\)?
2. We return to the example begun in Preview Activity 8.1 to see how to derive the formula for the amount of money in an
account at a given time. We do this in a general setting. Suppose you invest P dollars (called the principal) in an account
paying r% interest compounded monthly. In the first month you will receive \(\frac{r}{12}\) (here r is in decimal form; e.g., if we have 8% interest, we write \(\frac{0.08}{12}\)) of the principal P in interest, so you earn \(P\left(\frac{r}{12}\right)\) dollars in interest. Assume that you reinvest all interest. Then at the end of the first month your account will contain the original principal P plus the interest, or a total of \(P_1 = P + P\left(\frac{r}{12}\right) = P\left(1 + \frac{r}{12}\right)\) dollars.
(a) Given that your principal is now \(P_1\) dollars, how much interest will you earn in the second month? If \(P_2\) is the total amount of money in your account at the end of the second month, explain why \(P_2 = P_1\left(1 + \frac{r}{12}\right) = P\left(1 + \frac{r}{12}\right)^2\).
(b) Find a formula for \(P_3\), the total amount of money in the account at the end of the third month in terms of the original investment P.
(c) There is a pattern to these calculations. Let \(P_n\) be the total amount of money in the account at the end of the nth month. Find a formula for \(P_n\) in terms of the original investment P.
3. Sequences have many applications in mathematics and the sciences. In a recent paper [3], the authors write: “The incretin hormone glucagon-like peptide-1 (GLP-1) is capable of ameliorating glucose-dependent insulin secretion in subjects with diabetes. However, its very short half-life (1.5-5 min) in plasma represents a major limitation for its use in the clinical setting.” The half-life of GLP-1 is the time it takes for half of the hormone to decay in its medium. For this exercise, assume the half-life of GLP-1 is 5 minutes. So if A is the amount of GLP-1 in plasma at some time t, then only \(\frac{A}{2}\) of the hormone will be present after t + 5 minutes. Suppose \(A_0 = 100\) grams of the hormone are initially present in plasma.
(a) Let \(A_1\) be the amount of GLP-1 present after 5 minutes. Find the value of \(A_1\).
(b) Let \(A_2\) be the amount of GLP-1 present after 10 minutes. Find the value of \(A_2\).
(c) Let \(A_3\) be the amount of GLP-1 present after 15 minutes. Find the value of \(A_3\).
(d) Let \(A_4\) be the amount of GLP-1 present after 20 minutes. Find the value of \(A_4\).
(e) Let \(A_n\) be the amount of GLP-1 present after 5n minutes. Find a formula for \(A_n\).
(f) Does the sequence \(\{A_n\}\) converge or diverge? If the sequence converges, find its limit and explain why this value makes sense in the context of this problem.
(g) Determine the number of minutes it takes until the amount of GLP-1 in plasma is 1 gram.
[3] Hui H, Farilla L, Merkel P, Perfetti R. The short half-life of glucagon-like peptide-1 in plasma does not reflect its long-lasting beneficial effects, Eur J Endocrinol 2002 Jun;146(6):863-9.
4. Continuous data is the basis for analog information, like music stored on old cassette tapes or vinyl records. A digital
signal like on a CD or MP3 file is obtained by sampling an analog signal at some regular time interval and storing that
information. For example, the sampling rate of a compact disk is 44,100 samples per second. So a digital recording is only
an approximation of the actual analog information. Digital information can be manipulated in many useful ways that allow
for, among other things, noisy signals to be cleaned up and large collections of information to be compressed and stored in
much smaller space. While we won’t investigate these techniques in this chapter, this exercise is intended to give an idea
of the importance of discrete (digital) techniques. Let f be the continuous function defined by f (x) = sin(4x) on the interval
[0, 10]. A graph of f is shown in Figure 8.2. We approximate f by sampling, that is by partitioning the interval [0, 10] into
uniform subintervals and recording the values of f at the endpoints.

Figure 8.2: The graph of f (x) = sin(4x) on the interval [0, 10]
(a) Ineffective sampling can lead to several problems in reproducing the original signal. As an example, partition the
interval [0, 10] into 8 equal length subintervals and create a list of points (the sample) using the endpoints of each
subinterval. Plot your sample on the graph of f in Figure 8.2. What can you say about the period of your sample as
compared to the period of the original function?
(b) The sampling rate is the number of samples of a signal taken per second. As part (a) illustrates, sampling at too small a
rate can cause serious problems with reproducing the original signal (this problem of inefficient sampling leading to an
inaccurate approximation is called aliasing). There is an elegant theorem called the Nyquist-Shannon Sampling Theorem that says that human perception is limited, which allows the replacement of a continuous signal with a digital one without any perceived loss of information. This theorem also provides the lowest rate at which a signal can be sampled (called the Nyquist rate) without such a loss of information. The theorem states that we should sample at double the maximum desired frequency so that every cycle of the original signal will be sampled at two or more
points. Recall that the frequency of a sinusoidal function is the reciprocal of the period. Identify the frequency of the
function f and determine the number of partitions of the interval [0, 10] that give us the Nyquist rate.
(c) Humans cannot typically pick up signals above 20 kHz. Explain why, then, that information on a compact disk is
sampled at 44,100 Hz.

8.2: Geometric Series


1. There is an old question that is often used to introduce the power of geometric growth. Here is one version. Suppose you
are hired for a one month (30 days, working every day) job and are given two options to be paid. Option 1. You can be
paid $500 per day or Option 2. You can be paid 1 cent the first day, 2 cents the second day, 4 cents the third day, 8 cents the
fourth day, and so on, doubling the amount you are paid each day.
(a) How much will you be paid for the job in total under Option 1?
(b) Complete Table 8.3 to determine the pay you will receive under Option 2 for the first 10 days.
Day    Pay on this day    Total amount paid to date
1      $0.01              $0.01
2      $0.02              $0.03
3
4
5
6
7
8
9
10
Table 8.3: Option 2 payments
(c) Find a formula for the amount paid on day n, as well as for the total amount paid by day n. Use this formula to
determine which option (1 or 2) you should take.
2. Suppose you drop a golf ball onto a hard surface from a height h. The collision with the ground causes the ball to lose
energy and so it will not bounce back to its original height. The ball will then fall again to the ground, bounce back up, and
continue. Assume that at each bounce the ball rises back to a height \(\frac{3}{4}\) of the height from which it dropped. Let \(h_n\) be the

height of the ball on the nth bounce, with h0 = h. In this exercise we will determine the distance traveled by the ball and
the time it takes to travel that distance.
(a) Determine a formula for h1 in terms of h.
(b) Determine a formula for h2 in terms of h.
(c) Determine a formula for h3 in terms of h.
(d) Determine a formula for hn in terms of h.
(e) Write an infinite series that represents the total distance traveled by the ball. Then determine the sum of this series.
(f) Next, let’s determine the total amount of time the ball is in the air.
(i) When the ball is dropped from a height H, if we assume the only force acting on it is the acceleration due to gravity,
then the height of the ball at time t is given by \(H - \frac{1}{2}gt^2\). Use this formula to determine the time it takes for the ball to
hit the ground after being dropped from height H.
(ii) Use your work in the preceding item, along with that in (a)-(e) above to determine the total amount of time the ball is
in the air.
3. Suppose you play a game with a friend that involves rolling a standard six-sided die. Before a player can participate in
the game, he or she must roll a six with the die. Assume that you roll first and that you and your friend take alternate rolls.
In this exercise we will determine the probability that you roll the first six.
(a) Explain why the probability of rolling a six on any single roll (including your first turn) is \(\frac{1}{6}\).
(b) If you don’t roll a six on your first turn, then in order for you to roll the first six on your second turn, both you and your friend had to fail to roll a six on your first turns, and then you had to succeed in rolling a six on your second turn. Explain why the probability of this event is \(\left(\frac{5}{6}\right)\left(\frac{5}{6}\right)\left(\frac{1}{6}\right) = \left(\frac{5}{6}\right)^2\left(\frac{1}{6}\right)\).
(c) Now suppose you fail to roll the first six on your second turn. Explain why the probability is \(\left(\frac{5}{6}\right)\left(\frac{5}{6}\right)\left(\frac{5}{6}\right)\left(\frac{5}{6}\right)\left(\frac{1}{6}\right) = \left(\frac{5}{6}\right)^4\left(\frac{1}{6}\right)\) that you roll the first six on your third turn.
(d) The probability of you rolling the first six is the probability that you roll the first six on your first turn plus the probability that you roll the first six on your second turn plus the probability that you roll the first six on your third turn, and so on. Explain why this probability is
\[\frac{1}{6} + \left(\frac{5}{6}\right)^2\left(\frac{1}{6}\right) + \left(\frac{5}{6}\right)^4\left(\frac{1}{6}\right) + \cdots.\]
Find the sum of this series and determine the probability that you roll the first six.
4. The goal of a federal government stimulus package is to positively affect the economy. Economists and politicians quote
numbers like “k million jobs and a net stimulus to the economy of n billion of dollars.” Where do they get these numbers?
Let’s consider one aspect of a stimulus package: tax cuts. Economists understand that tax cuts or rebates can result in long-
term spending that is many times the amount of the rebate. For example, assume that for a typical person, 75% of her
entire income is spent (that is, put back into the economy). Further, assume the government provides a tax cut or rebate
that totals P dollars for each person.
(a) The tax cut of P dollars is income for its recipient. How much of this tax cut will be spent?
(b) In this simple model, we will say that the spent portion of the tax cut/rebate from part (a) then becomes income for another person who, in turn, spends 75% of this income. After this “second round” of spent income, how many total dollars have been added to the economy as a result of the original tax cut/rebate?
(c) This second round of spending becomes income for another group who spend 75% of this income, and so on. In economics this is called the multiplier effect. Explain why an original tax cut/rebate of P dollars will result in multiplied spending of \(0.75P(1 + 0.75 + 0.75^2 + \cdots)\) dollars.
(d) Based on these assumptions, how much stimulus will a 200 billion dollar tax cut/rebate to consumers add to the economy, assuming consumer spending remains consistent forever?
5. Like stimulus packages, home mortgages and foreclosures also impact the economy. A problem for many borrowers is
the adjustable rate mortgage, in which the interest rate can change (and usually increases) over the duration of the loan,
causing the monthly payments to increase beyond the ability of the borrower to pay. Most financial analysts recommend
fixed rate loans, ones for which the monthly payments remain constant throughout the term of the loan. In this exercise we
will analyze fixed rate loans. When most people buy a large ticket item like a car or a house, they have to take out a loan to
make the purchase. The loan is paid back in monthly installments until the entire amount of the loan, plus interest, is paid.
With a loan, we borrow money, say P dollars (called the principal), and pay off the loan at an interest rate of r%. To pay
back the loan we make regular monthly payments, some of which goes to pay off the principal and some of which is
charged as interest. In most cases, the interest is computed based on the amount of principal that remains at the beginning
of the month. We assume a fixed rate loan, that is one in which we make a constant monthly payment M on our loan,
beginning in the original month of the loan. Suppose you want to buy a house. You have a certain amount of money saved
to make a down payment, and you will borrow the rest to pay for the house. Of course, for the privilege of loaning you the
money, the bank will charge you interest on this loan, so the amount you pay back to the bank is more than the amount you
borrow. In fact, the amount you ultimately pay depends on three things: the amount you borrow (called the principal), the
interest rate, and the length of time you have to pay off the loan plus interest (called the duration of the loan). For this
example, we assume that the interest rate is fixed at r%. To pay off the loan, each month you make a payment of the same
amount (called installments). Suppose we borrow P dollars (our principal) and pay off the loan at an interest rate of r%
with regular monthly installment payments of M dollars. So in month 1 of the loan, before we make any payments, our
principal is P dollars. Our goal in this exercise is to find a formula that relates these three parameters to the time duration
of the loan. We are charged interest every month at an annual rate of r%, so each month we pay \(\frac{r}{12}\)% interest on the principal that remains. Given that the original principal is P dollars, we will pay \(\frac{r}{12}P\) dollars in interest on our first payment (with r written in decimal form). Since we paid M dollars in total for our first payment, the remainder of the payment \(\left(M - \frac{r}{12}P\right)\) goes to pay down the principal. So the principal remaining after the first payment (let’s call it \(P_1\)) is the original principal minus what we paid on the principal, or
\[P_1 = P - \left(M - \frac{r}{12}P\right) = \left(1 + \frac{r}{12}\right)P - M.\]
As long as \(P_1\) is positive, we still have to keep making payments to pay off the loan.
(a) Recall that the amount of interest we pay each time depends on the principal that remains. How much interest, in terms
of P1 and r, do we pay in the second installment?
(b) How much of our second monthly installment goes to pay off the principal? What is the principal P2, or the balance of
the loan, that we still have to pay off after making the second installment of the loan? Write your response in the form \(P_2 = (\ \ )P_1 - (\ \ )M\), where you fill in the parentheses.
(c) Show that \(P_2 = \left(1 + \frac{r}{12}\right)^2 P - \left[1 + \left(1 + \frac{r}{12}\right)\right]M\).
(d) Let \(P_3\) be the amount of principal that remains after the third installment. Show that \(P_3 = \left(1 + \frac{r}{12}\right)^3 P - \left[1 + \left(1 + \frac{r}{12}\right) + \left(1 + \frac{r}{12}\right)^2\right]M\).
(e) If we continue in the manner described in the problems above, then the remaining principal of our loan after n installments is
\[P_n = \left(1 + \frac{r}{12}\right)^n P - \left[\sum_{k=0}^{n-1} \left(1 + \frac{r}{12}\right)^k\right]M. \tag{8.7}\]
This is a rather complicated formula and one that is difficult to use. However, we can simplify the sum if we recognize part of it as a partial sum of a geometric series. Find a formula for the sum
\[\sum_{k=0}^{n-1} \left(1 + \frac{r}{12}\right)^k \tag{8.8}\]
and then a general formula for \(P_n\) that does not involve a sum.
(f) It is usually more convenient to write our formula for \(P_n\) in terms of years rather than months. Show that P(t), the principal remaining after t years, can be written as
\[P(t) = \left(P - \frac{12M}{r}\right)\left(1 + \frac{r}{12}\right)^{12t} + \frac{12M}{r}. \tag{8.9}\]
(g) Now that we have analyzed the general loan situation, we apply formula (8.9) to an actual loan. Suppose we charge
$1,000 on a credit card for holiday expenses. If our credit card charges 20% interest and we pay only the minimum
payment of $25 each month, how long will it take us to pay off the $1,000 charge? How much in total will we have paid on
this $1,000 charge? How much total interest will we pay on this loan?
(h) Now we consider larger loans, e.g. automobile loans or mortgages, in which we borrow a specified amount of money
over a specified period of time. In this situation, we need to find the monthly payment \(M\) that will take our outstanding
principal to 0 in the specified amount of time; that is, we want the value of \(M\) that makes \(P(t) = 0\) in formula (8.9).
If we set \(P(t) = 0\) and solve for \(M\), it follows that
\[ M = \frac{rP\left(1 + \frac{r}{12}\right)^{12t}}{12\left[\left(1 + \frac{r}{12}\right)^{12t} - 1\right]}. \]
(i) Suppose we want to borrow $15,000 to buy a car. We take out a 5 year loan at 6.25%. What will our monthly payments
be? How much in total will we have paid for this $15,000 car? How much total interest will we pay on this loan?

Matthew Boelkins, David Austin & Steven


8.E.4 12/8/2021 https://math.libretexts.org/@go/page/5372
Schlicker
(ii) Suppose you charge your books for winter semester on your credit card. The total charge comes to $525. If your credit
card has an interest rate of 18% and you pay $20 per month on the card, how long will it take before you pay off this debt?
How much total interest will you pay?
(iii) Say you need to borrow $100,000 to buy a house. You have several options on the loan:
– 30 years at 6.5%
– 25 years at 7.5%
– 15 years at 8.25%.
(a) What are the monthly payments for each loan?
(b) Which mortgage is ultimately the best deal (assuming you can afford the monthly payments)? In other words, for which
loan do you pay the least amount of total interest?
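The loan formulas in this exercise can be checked numerically. The following is a minimal sketch, assuming \(r\) is written as a decimal; the function names and the sample loan ($5,000 at 12% with $150 payments) are hypothetical and are not taken from the exercises. It compares the month-by-month recurrence \(P_k = \left(1 + \frac{r}{12}\right)P_{k-1} - M\) with the closed form \(\left(P - \frac{12M}{r}\right)\left(1 + \frac{r}{12}\right)^n + \frac{12M}{r}\).

```python
def balance_by_recurrence(principal, r, payment, n):
    """Apply P_k = (1 + r/12) * P_{k-1} - M once for each of n months."""
    balance = principal
    for _ in range(n):
        balance = (1 + r / 12) * balance - payment
    return balance


def balance_closed_form(principal, r, payment, n):
    """Evaluate (P - 12M/r) * (1 + r/12)^n + 12M/r after n monthly payments."""
    return (principal - 12 * payment / r) * (1 + r / 12) ** n + 12 * payment / r


# Hypothetical loan (an assumption for illustration only).
for n in (1, 12, 24):
    step_by_step = balance_by_recurrence(5000, 0.12, 150, n)
    closed = balance_closed_form(5000, 0.12, 150, n)
    print(n, round(step_by_step, 2), round(closed, 2))
```

The two columns should agree up to floating-point rounding, which is a useful sanity check on the algebra in parts (e) and (f).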

8.3: Series of Real Numbers

Exercises
1. In this exercise we investigate the sequence \(\left\{\frac{b^n}{n!}\right\}\) for any constant \(b\).
(a) Use the Ratio Test to determine if the series \(\sum \frac{10^k}{k!}\) converges or diverges.
(b) Now apply the Ratio Test to determine if the series \(\sum \frac{b^k}{k!}\) converges for any constant \(b\).
(c) Use your result from (b) to decide whether the sequence \(\left\{\frac{b^n}{n!}\right\}\) converges or diverges. If the
sequence \(\left\{\frac{b^n}{n!}\right\}\) converges, to what does it converge? Explain your reasoning.
2. There is a test for convergence similar to the Ratio Test called the Root Test. Suppose we have a series \(\sum a_k\) of
positive terms so that \(a_n \to 0\) as \(n \to \infty\).
(a) Assume \(\sqrt[n]{a_n} \to r\) as \(n\) goes to infinity. Explain why this tells us that \(a_n \approx r^n\) for large values of \(n\).
(b) Using the result of part (a), explain why \(\sum a_k\) looks like a geometric series when \(n\) is big. What is the ratio of the
geometric series to which \(\sum a_k\) is comparable?
(c) Use what we know about geometric series to determine the values of \(r\) for which \(\sum a_k\) converges if
\(\sqrt[n]{a_n} \to r\) as \(n \to \infty\).
3. The associative and distributive laws of addition allow us to add finite sums in any order we want. That is, if
\(\sum_{k=0}^{n} a_k\) and \(\sum_{k=0}^{n} b_k\) are finite sums of real numbers, then
\[ \sum_{k=0}^{n} a_k + \sum_{k=0}^{n} b_k = \sum_{k=0}^{n} (a_k + b_k). \]
However, we do need to be careful extending rules like this to infinite series.
(a) Let \(a_n = 1 + \frac{1}{2^n}\) and \(b_n = -1\) for each nonnegative integer \(n\).
(i) Explain why the series \(\sum_{k=0}^{\infty} a_k\) and \(\sum_{k=0}^{\infty} b_k\) both diverge.
(ii) Explain why the series \(\sum_{k=0}^{\infty} (a_k + b_k)\) converges.
(iii) Explain why
\[ \sum_{k=0}^{\infty} a_k + \sum_{k=0}^{\infty} b_k \neq \sum_{k=0}^{\infty} (a_k + b_k). \]
This shows that it is possible to have two divergent series \(\sum_{k=0}^{\infty} a_k\) and \(\sum_{k=0}^{\infty} b_k\) and yet
have the series \(\sum_{k=0}^{\infty} (a_k + b_k)\) converge.
(b) While part (a) shows that we cannot add series term by term in general, we can under reasonable conditions. The
problem in part (a) is that we tried to add divergent series. In this exercise we will show that if \(\sum a_k\) and \(\sum b_k\) are
convergent series, then \(\sum (a_k + b_k)\) is a convergent series and
\[ \sum (a_k + b_k) = \sum a_k + \sum b_k. \]
(i) Let \(A_n\) and \(B_n\) be the \(n\)th partial sums of the series \(\sum_{k=1}^{\infty} a_k\) and \(\sum_{k=1}^{\infty} b_k\),
respectively. Explain why
\[ A_n + B_n = \sum_{k=1}^{n} (a_k + b_k). \]
(ii) Use the previous result and properties of limits to show that
\[ \sum_{k=1}^{\infty} (a_k + b_k) = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k. \]
(Note that the starting point of the sum is irrelevant in this problem, so it doesn't matter where we begin the sum.)
(c) Use the prior result to find the sum of the series \(\sum_{k=0}^{\infty} \frac{2^k + 3^k}{5^k}\).
4. In the Limit Comparison Test we compared the behavior of a series to one whose behavior we know. In that test we use
the limit of the ratio of corresponding terms of the series to determine if the comparison is valid. In this exercise we see
how we can compare two series directly, term by term, without using a limit of a sequence. First we consider an example.

(a) Consider the series \(\sum \frac{1}{k^2}\) and \(\sum \frac{1}{k^2+k}\). We know that the series \(\sum \frac{1}{k^2}\) is a
\(p\)-series with \(p = 2 > 1\) and so \(\sum \frac{1}{k^2}\) converges. In this part of the exercise we will see how to use
information about \(\sum \frac{1}{k^2}\) to determine information about \(\sum \frac{1}{k^2+k}\). Let \(a_k = \frac{1}{k^2}\) and
\(b_k = \frac{1}{k^2+k}\).
(i) Let \(S_n\) be the \(n\)th partial sum of \(\sum \frac{1}{k^2}\) and \(T_n\) the \(n\)th partial sum of \(\sum \frac{1}{k^2+k}\). Which is
larger, \(S_1\) or \(T_1\)? Why?
(ii) Recall that \(S_2 = S_1 + a_2\) and \(T_2 = T_1 + b_2\). Which is larger, \(a_2\) or \(b_2\)? Based on that answer, which is larger,
\(S_2\) or \(T_2\)?
(iii) Recall that \(S_3 = S_2 + a_3\) and \(T_3 = T_2 + b_3\). Which is larger, \(a_3\) or \(b_3\)? Based on that answer, which is larger,
\(S_3\) or \(T_3\)?
(iv) Which is larger, \(a_n\) or \(b_n\)? Explain. Based on that answer, which is larger, \(S_n\) or \(T_n\)?
(v) Based on your response to the previous part of this exercise, what relationship do you expect there to be between
\(\sum \frac{1}{k^2}\) and \(\sum \frac{1}{k^2+k}\)? Do you expect \(\sum \frac{1}{k^2+k}\) to converge or diverge? Why?
(b) The example in the previous part of this exercise illustrates a more general result. Explain why the Direct Comparison
Test, stated here, works.
The Direct Comparison Test. If \(0 \le b_k \le a_k\) for every \(k\), then we must have \(0 \le \sum b_k \le \sum a_k\).
1. If \(\sum a_k\) converges, then \(\sum b_k\) converges.
2. If \(\sum b_k\) diverges, then \(\sum a_k\) diverges.
Important Note: This comparison test applies only to series with nonnegative terms.
(i) Use the Direct Comparison Test to determine the convergence or divergence of the series \(\sum \frac{1}{k-1}\). Hint:
Compare to the harmonic series.
(ii) Use the Direct Comparison Test to determine the convergence or divergence of the series \(\sum \frac{k}{k^3+1}\).
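The term-by-term comparison in Exercise 4(a) can be explored numerically. The sketch below is only an illustration, not part of the text: the helper name `partial_sums` is hypothetical, and exact rational arithmetic from Python's `fractions` module is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def partial_sums(term, n):
    """Return [S_1, ..., S_n], the partial sums of term(1) + term(2) + ..."""
    sums, total = [], Fraction(0)
    for k in range(1, n + 1):
        total += term(k)
        sums.append(total)
    return sums


def a(k):
    """a_k = 1 / k^2"""
    return Fraction(1, k * k)


def b(k):
    """b_k = 1 / (k^2 + k)"""
    return Fraction(1, k * k + k)


S = partial_sums(a, 50)
T = partial_sums(b, 50)

# Print a few pairs (S_n, T_n) side by side.
for n in (1, 2, 3, 10, 50):
    print(n, float(S[n - 1]), float(T[n - 1]))
```

A numerical table like this can suggest the relationship between \(S_n\) and \(T_n\), but it does not replace the argument the exercise asks for.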

8.4: Alternating Series

Exercises
1. Conditionally convergent series converge very slowly. As an example, consider the famous formula
\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \sum_{k=0}^{\infty} (-1)^k \frac{1}{2k+1}. \tag{8.17} \]
(We will derive this formula in upcoming work.) In theory, the partial sums of this series could be used to approximate \(\pi\).
(a) Show that the series in (8.17) converges conditionally.
(b) Let \(S_n\) be the \(n\)th partial sum of the series in (8.17). Calculate the error in approximating \(\frac{\pi}{4}\) with \(S_{100}\)
and explain why this is not a very good approximation.
(c) Determine the number of terms it would take in the series (8.17) to approximate \(\frac{\pi}{4}\) to 10 decimal places. (The fact
that it takes such a large number of terms to obtain even a modest degree of accuracy is why we say that conditionally
convergent series converge very slowly.)
2. We have shown that if \(\sum (-1)^{k+1} a_k\) is a convergent alternating series, then the sum \(S\) of the series lies between any
two consecutive partial sums \(S_n\). This suggests that the average \(\frac{S_n + S_{n+1}}{2}\) is a better approximation to \(S\)
than is \(S_n\).
(a) Show that \(\frac{S_n + S_{n+1}}{2} = S_n + \frac{1}{2}(-1)^{n+2} a_{n+1}\).
(b) Use this revised approximation in (a) with \(n = 20\) to approximate \(\ln(2)\) given that
\[ \ln(2) = \sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k}. \]
Compare this to the approximation using just \(S_{20}\). For your convenience, \(S_{20} = \frac{155685007}{232792560}\).
3. In this exercise, we examine one of the conditions of the Alternating Series Test. Consider the alternating series
\[ 1 - 1 + \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{9} + \frac{1}{4} - \frac{1}{16} + \cdots, \]
where the terms are selected alternately from the sequences \(\left\{\frac{1}{n}\right\}\) and \(\left\{-\frac{1}{n^2}\right\}\).
(a) Explain why the \(n\)th term of the given series converges to 0 as \(n\) goes to infinity.
(b) Rewrite the given series by grouping terms in the following manner:
\[ (1 - 1) + \left(\frac{1}{2} - \frac{1}{4}\right) + \left(\frac{1}{3} - \frac{1}{9}\right) + \left(\frac{1}{4} - \frac{1}{16}\right) + \cdots. \]
Use this regrouping to determine if the series converges or diverges.
(c) Explain why the condition that the sequence \(\{a_n\}\) decreases to a limit of 0 is included in the Alternating Series Test.
4. Conditionally convergent series exhibit interesting and unexpected behavior. In this exercise we examine the
conditionally convergent alternating harmonic series \(\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\) and discover that addition is not
commutative for conditionally convergent series. We will also encounter Riemann's Theorem concerning rearrangements of
conditionally convergent series. Before we begin, we remind ourselves that
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} = \ln(2), \]
a fact which will be verified in a later section.

(a) First we make a quick analysis of the positive and negative terms of the alternating harmonic series.
(i) Show that the series \(\sum_{k=1}^{\infty} \frac{1}{2k}\) diverges.
(ii) Show that the series \(\sum_{k=1}^{\infty} \frac{1}{2k+1}\) diverges.
(iii) Based on the results of the previous parts of this exercise, what can we say about the sums
\(\sum_{k=C}^{\infty} \frac{1}{2k}\) and \(\sum_{k=C}^{\infty} \frac{1}{2k+1}\) for any positive integer \(C\)? Be specific in your
explanation.
(b) Recall addition of real numbers is commutative; that is, \(a + b = b + a\) for any real numbers \(a\) and \(b\). This property is
valid for any sum of finitely many terms, but does this property extend when we add infinitely many terms together? The
answer is no, and something even more odd happens. Riemann's Theorem (after the nineteenth-century mathematician
Georg Friedrich Bernhard Riemann) states that a conditionally convergent series can be rearranged to converge to any
prescribed sum. More specifically, this means that if we choose any real number \(S\), we can rearrange the terms of the
alternating harmonic series \(\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\) so that the sum is \(S\). To understand how Riemann's
Theorem works, let's assume for the moment that the number \(S\) we want our rearrangement to converge to is positive. Our
job is to find a way to order the sum of terms of the alternating harmonic series to converge to \(S\).
(i) Explain how we know that, regardless of the value of \(S\), we can find a partial sum
\[ P_1 = \sum_{k=0}^{n_1} \frac{1}{2k+1} = 1 + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{2n_1+1} \]
of the positive terms of the alternating harmonic series that equals or exceeds \(S\). Let \(S_1 = P_1\).
(ii) Explain how we know that, regardless of the value of \(S_1\), we can find a partial sum
\[ N_1 = -\sum_{k=1}^{m_1} \frac{1}{2k} = -\frac{1}{2} - \frac{1}{4} - \frac{1}{6} - \cdots - \frac{1}{2m_1} \]
so that \(S_2 = S_1 + N_1 \le S\).
(iii) Explain how we know that, regardless of the value of \(S_2\), we can find a partial sum
\[ P_2 = \sum_{k=n_1+1}^{n_2} \frac{1}{2k+1} = \frac{1}{2(n_1+1)+1} + \frac{1}{2(n_1+2)+1} + \cdots + \frac{1}{2n_2+1} \]
of the remaining positive terms of the alternating harmonic series so that \(S_3 = S_2 + P_2 \ge S\).
(iv) Explain how we know that, regardless of the value of \(S_3\), we can find a partial sum
\[ N_2 = -\sum_{k=m_1+1}^{m_2} \frac{1}{2k} = -\frac{1}{2(m_1+1)} - \frac{1}{2(m_1+2)} - \cdots - \frac{1}{2m_2} \]
of the remaining negative terms of the alternating harmonic series so that \(S_4 = S_3 + N_2 \le S\).
(v) Explain why we can continue this process indefinitely and find a sequence \(\{S_n\}\) whose terms are partial sums of a
rearrangement of the terms in the alternating harmonic series so that \(\lim_{n \to \infty} S_n = S\).
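The process described in Exercise 4(b) can be imitated on a computer. The following is a rough sketch under stated assumptions, not the construction of the exercise itself: the function name `rearranged_partial_sums` is hypothetical, floating-point arithmetic stands in for exact sums, and the rule used is the greedy variant "add the next unused positive term while the running sum is at or below the target, otherwise add the next unused negative term."

```python
def rearranged_partial_sums(target, num_terms):
    """Partial sums of a greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ...

    Positive terms 1, 1/3, 1/5, ... are used while the running sum is at or
    below the target; negative terms -1/2, -1/4, ... are used otherwise.
    Every term of the original series is used at most once.
    """
    next_odd, next_even = 1, 2      # denominators of the next unused terms
    total, sums = 0.0, []
    for _ in range(num_terms):
        if total <= target:
            total += 1.0 / next_odd
            next_odd += 2
        else:
            total -= 1.0 / next_even
            next_even += 2
        sums.append(total)
    return sums


# Aim for S = 1.5 rather than ln(2); any real target could be used here.
sums = rearranged_partial_sums(target=1.5, num_terms=100_000)
print(sums[-1])
```

Once the running sum has crossed the target, it stays within the size of the most recently used terms, and those terms shrink to 0; this is the numerical shadow of the argument requested in part (v).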

8.5: Taylor Polynomials and Taylor Series

Exercises
1. In this exercise we investigate the Taylor series of polynomial functions.
(a) Find the 3rd order Taylor polynomial centered at \(a = 0\) for \(f(x) = x^3 - 2x^2 + 3x - 1\). Does your answer surprise you?
Explain.
(b) Without doing any additional computation, find the 4th, 12th, and 100th order Taylor polynomials (centered at \(a = 0\))
for \(f(x) = x^3 - 2x^2 + 3x - 1\). Why should you expect this?
(c) Now suppose \(f(x)\) is a degree \(m\) polynomial. Completely describe the \(n\)th order Taylor polynomial (centered at \(a = 0\))
for each \(n\).
2. The examples we have considered in this section have all been for Taylor polynomials and series centered at 0, but
Taylor polynomials and series can be centered at any value of \(a\). We look at examples of such Taylor polynomials in this
exercise.
(a) Let \(f(x) = \sin(x)\). Find the Taylor polynomials up through order four of \(f\) centered at \(x = \frac{\pi}{2}\). Then find the
Taylor series for \(f(x)\) centered at \(x = \frac{\pi}{2}\). Why should you have expected the result?
(b) Let \(f(x) = \ln(x)\). Find the Taylor polynomials up through order four of \(f\) centered at \(x = 1\). Then find the Taylor series
for \(f(x)\) centered at \(x = 1\).
(c) Use your result from (b) to determine which Taylor polynomial will approximate \(\ln(2)\) to two decimal places. Explain
in detail how you know you have the desired accuracy.
3. We can use known Taylor series to obtain other Taylor series, and we explore that idea in this exercise, as a preview of
work in the following section.
(a) Calculate the first four derivatives of \(\sin(x^2)\) and hence find the fourth order Taylor polynomial for \(\sin(x^2)\) centered
at \(a = 0\).
(b) Part (a) demonstrates the brute force approach to computing Taylor polynomials and series. Now we find an easier
method that utilizes a known Taylor series. Recall that the Taylor series centered at 0 for \(f(x) = \sin(x)\) is
\[ \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!}. \tag{8.24} \]
(i) Substitute \(x^2\) for \(x\) in the Taylor series (8.24). Write out the first several terms and compare to your work in part (a).
Explain why the substitution in this problem should give the Taylor series for \(\sin(x^2)\) centered at 0.
(ii) What should we expect the interval of convergence of the series for \(\sin(x^2)\) to be? Explain in detail.
4. Based on the examples we have seen, we might expect that the Taylor series for a function \(f\) always converges to the
values \(f(x)\) on its interval of convergence. We explore that idea in more detail in this exercise. Let
\[ f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases} \]
(a) Show, using the definition of the derivative, that \(f'(0) = 0\).
(b) It can be shown that \(f^{(n)}(0) = 0\) for all \(n \ge 2\). Assuming that this is true, find the Taylor series for \(f\) centered at 0.
(c) What is the interval of convergence of the Taylor series centered at 0 for \(f\)? Explain. For which values of \(x\) in the
interval of convergence of the Taylor series does the Taylor series converge to \(f(x)\)?
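The substitution idea in Exercise 3(b) can be checked numerically. The sketch below is an illustration only; the function name `sin_x2_series` is hypothetical. It sums the first few terms of the series obtained by replacing \(x\) with \(x^2\) in (8.24), namely \(\sum_{k \ge 0} (-1)^k \frac{x^{4k+2}}{(2k+1)!}\), and prints them next to the library value of \(\sin(x^2)\).

```python
import math


def sin_x2_series(x, num_terms):
    """Partial sum of sum_{k>=0} (-1)^k * x^(4k+2) / (2k+1)!."""
    return sum(
        (-1) ** k * x ** (4 * k + 2) / math.factorial(2 * k + 1)
        for k in range(num_terms)
    )


# Compare ten-term partial sums with math.sin at a few sample inputs.
for x in (0.5, 1.0, 1.5):
    print(x, sin_x2_series(x, 10), math.sin(x * x))
```

Agreement at a handful of points is evidence, not a proof; the exercise still asks why the substituted series must be the Taylor series of \(\sin(x^2)\).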

8.6: Power Series

Exercises
CHAPTER OVERVIEW
9: MULTIVARIABLE AND VECTOR FUNCTIONS
http://scholarworks.gvsu.edu/books/14/

9.1: FUNCTIONS OF SEVERAL VARIABLES AND THREE DIMENSIONAL SPACE


9.2: SECTION 2-
9.3: SECTION 3-
9.4: SECTION 4-
9.5: SECTION 5-
9.6: SECTION 6-

9.1: Functions of Several Variables and Three Dimensional Space
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a function of several variables? What do we mean by the domain of a function of several variables?
How do we find the distance between two points in \(\mathbb{R}^3\)? What is the equation of a sphere in \(\mathbb{R}^3\)?

What is a trace of a function of two variables? What does a trace tell us about a function?
What is a level curve of a function of two variables? What does a level curve tell us about a function?

Introduction
Throughout our mathematical careers we have studied functions of a single variable. We define a function of one variable
as a rule that assigns exactly one output to each input. We analyze these functions by looking at their graphs, calculating
limits, differentiating, integrating, and more. In this and following sections, we will study functions whose input is defined
in terms of more than one variable, and then analyze these functions by looking at their graphs, calculating limits,
differentiating, integrating, and more. We will see that many of the ideas from single variable calculus translate well to
functions of several variables, but we will have to make some adjustments as well.

Preview Activity 9.1.1:

When people buy a large ticket item like a car or a house, they often take out a loan to make the purchase. The loan is
paid back in monthly installments until the entire amount of the loan, plus interest, is paid. The monthly payment that
the borrower has to make depends on the amount P of money borrowed (called the principal), the duration t of the
loan in years, and the interest rate r. For example, if we borrow $18,000 to buy a car, the monthly payment M that we
need to make to pay off the loan is given by the formula
\[ M = \frac{1500r}{1 - \dfrac{1}{\left(1 + \frac{r}{12}\right)^{12t}}}. \]
The variables r and t are independent of each other, so using functional notation we write
\[ M(r, t) = \frac{1500r}{1 - \dfrac{1}{\left(1 + \frac{r}{12}\right)^{12t}}}. \]
(a) Find the monthly payments on this loan if the interest rate is 6% and the duration of the loan is 5 years.
(b) Evaluate M (0.05, 4). Explain in words what this calculation represents.
(c) Now consider only loans where the interest rate is 5%. Calculate the monthly payments as indicated in Table 9.1.
Round payments to the nearest penny.

Duration (in years)           2      3      4      5      6

Monthly payments (dollars)

Table 9.1: Monthly payments at an interest rate of 5%.

(d) Now consider only loans where the duration is 3 years. Calculate the monthly payments as indicated in Table 9.2.
Round payments to the nearest penny.

Interest rate                 0.03   0.05   0.07   0.09   0.11

Monthly payments (dollars)

Table 9.2: Monthly payments over three years.


(e) Describe as best you can the combinations of interest rates and durations of loans that result in a monthly payment
of $200.
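A function of two variables such as the monthly payment is easy to experiment with in code. The sketch below is a minimal illustration that assumes the standard amortization formula \(M = \frac{rP/12}{1 - (1 + r/12)^{-12t}}\) with \(r\) written as a decimal (for \(P = 18000\) the numerator is \(1500r\)); the names `monthly_payment` and `balance_after` are hypothetical, and `balance_after` assumes \(t\) is a whole number of years.

```python
def monthly_payment(r, t, principal=18_000):
    """Monthly payment on a loan at annual rate r (a decimal) over t years."""
    growth = (1 + r / 12) ** (12 * t)
    return (r * principal / 12) / (1 - 1 / growth)


def balance_after(r, t, principal=18_000):
    """Balance left after 12*t payments; should be (nearly) zero.

    Each month the balance grows by the factor (1 + r/12) and is then
    reduced by the payment. Assumes t is a whole number of years.
    """
    payment = monthly_payment(r, t, principal)
    balance = principal
    for _ in range(12 * t):
        balance = balance * (1 + r / 12) - payment
    return balance


print(round(monthly_payment(0.05, 4), 2))
print(balance_after(0.05, 4))
```

Holding one argument fixed and looping over the other, for example over durations with `r = 0.05`, gives one way to fill in a row of a table by machine rather than by hand.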

Matthew Boelkins, David Austin & Steven


9.1.1 11/24/2021 https://math.libretexts.org/@go/page/5933
Schlicker
Functions of Several Variables
Suppose we launch a projectile, using a golf club, a cannon, or some other device, from ground level. Under ideal
conditions (ignoring wind resistance, spin, or any other forces except the force of gravity) the horizontal distance the
object will travel depends on the initial velocity x the object is given, and the angle y at which it is launched. If we let f
represent the horizontal distance the object travels (its range), then f is a function of the two variables x and y , and we
represent f in functional notation by
\[ f(x, y) = \frac{x^2 \sin(2y)}{g} \tag{9.1.1} \]

where g is the acceleration due to gravity.1

Definition
A function f of two independent variables is a rule that assigns to each ordered pair (x, y) in some set D exactly one
real number f(x, y).

There is, of course, no reason to restrict ourselves to functions of only two variables; we can use any number of variables
we like. For example,
\[ f(x, y, z) = x^2 - 2xz + \cos(y) \tag{9.1.2} \]

defines f as a function of the three variables x, y, and z. In general, a function of n independent variables is a rule that
assigns to an ordered n-tuple \((x_1, x_2, \ldots, x_n)\) in some set D exactly one real number.

As with functions of a single variable, it is important to understand the set of inputs for which the function is defined.

Definition

The domain of a function f is the set of all inputs at which the function is defined.

Activity 9.1.1

Identify the domain of each of the following functions. Draw a picture of each domain in the
x-y plane.
(a) \(f(x, y) = x^2 + y^2\)
(b) \(f(x, y) = \sqrt{x^2 + y^2}\)
(c) \(Q(x, y) = \dfrac{x + y}{x^2 + y^2}\)
(d) \(s(x, y) = \dfrac{1}{\sqrt{1 - xy^2}}\)

Representing Functions of Two Variables


One of the techniques we use to study functions of one variable is to create a table of values. We can do the same for
functions of two variables, except that our tables will have to allow us to keep track of both input variables. We can do this
with a 2-dimensional table, where we list the x-values down the first column and the y-values across the first row. Let f be
the function defined by \(f(x, y) = \frac{x^2 \sin(2y)}{g}\) that gives the range of a projectile as a function of the initial
velocity x and launch angle y of the projectile. The value f(x, y) is then displayed in the location where the x row
intersects the y column, as shown in Table 9.3 (where we measure x in feet per second and y in radians).

x/y     0.2      0.4      0.6      0.8      1.0      1.2      1.4

25      7.6      14.0     18.2     19.5     17.8     13.2     6.5
50      30.4     56.0     72.8     78.1     71.0     52.8     26.2
75      68.4     126.1    163.8    175.7    159.8    118.7    58.9
100     121.7    224.2    291.3    312.4    284.2    211.1    104.7
125     190.1    350.3    455.1    488.1    444.0    329.8    163.6
150     273.8    504.4    655.3    702.8    639.3    474.9    235.5
175     372.7    686.5    892.0    956.6    870.2    646.4    320.6
200     486.8    896.7    1165.0   1249.5   1136.6   844.3    418.9
225     616.2    1134.9   1474.5   1581.4   1438.5   1068.6   530.0
250

Table 9.3: Values of \(f(x, y) = \frac{x^2 \sin(2y)}{g}\).

Activity 9.1.2

Complete the last row in Table 9.3 to provide the needed values of the function f .
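Rows of a table like Table 9.3 can also be generated by a short program. The sketch below is an illustration under stated assumptions: it takes \(g = 32\) feet per second squared (the value used later in this section), and the names `projectile_range`, `ANGLES`, and `table_row` are hypothetical.

```python
import math

G = 32.0  # acceleration due to gravity in ft/s^2 (assumed value)
ANGLES = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4]  # launch angles in radians


def projectile_range(x, y, g=G):
    """Range f(x, y) = x^2 * sin(2y) / g for initial speed x and launch angle y."""
    return x ** 2 * math.sin(2 * y) / g


def table_row(x):
    """One table row: the range at initial speed x for each angle in ANGLES."""
    return [round(projectile_range(x, y), 1) for y in ANGLES]


print(100, table_row(100))
```

Calling `table_row` with a different initial speed produces the corresponding row, so entries computed by hand can be compared against it.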

If f is a function of a single variable x, then we define the graph of f to be the set of points of the form (x, f(x)), where x
is in the domain of f . We then plot these points using the coordinate axes in order to visualize the graph. We can do a
similar thing with functions of several variables. Table 9.3 identifies points of the form (x, y, f (x, y)), and we define the
graph of f to be the set of these points.

Definition

The graph of a function f = f (x, y) is the set of points of the form (x, y, f (x, y)) , where the point (x, y) is in the
domain of f .

We also often refer to the graph of a function f of two variables as the surface generated by f . Points in the form
(x, y, f (x, y)) are in three dimensions, so plotting these points takes a bit more work than graphs of functions in two

dimensions. To plot these three-dimensional points, we need to set up a coordinate system with three mutually
perpendicular axes – the x-axis, the y-axis, and the z-axis (called the coordinate axes). There are essentially two different
ways we could set up a 3D coordinate system, as shown in Figures 9.1.1 and 9.1.2; thus, before we can proceed, we need to
establish a convention.

Figure 9.1.1 : A right hand system

Matthew Boelkins, David Austin & Steven


9.1.3 11/24/2021 https://math.libretexts.org/@go/page/5933
Schlicker
Figure 9.1.2 : A left hand system
The distinction between these two figures is subtle, but important. In the coordinate system shown in Figure 9.1.1, imagine
that you are sitting on the positive z-axis next to the label “z.” Looking down at the x- and y-axes, you see that the y-axis is
obtained by rotating the x-axis by 90° in the counterclockwise direction. Again sitting on the positive z-axis in Figure 9.1.2,
you see that the y-axis is obtained by rotating the x-axis by 90° in the clockwise direction.
We call the coordinate system in Figure 9.1.1 a right-hand system; if we point the index finger of our right hand along the
positive x-axis and our middle finger along the positive y-axis, then our thumb points in the direction of the positive z-axis.
Following mathematical conventions, we choose to use a right-hand system throughout this book.
Now that we have established a convention for a right hand system, we can draw a graph of the range function defined by
\(f(x, y) = \frac{x^2 \sin(2y)}{g}\). Note that the function f is continuous in both variables, so when we plot these points in the
right hand coordinate system, we can connect them all to form a surface in 3-space. The graph of the range function f is
shown in Figure 9.1.3.

Figure 9.1.3 : The range surface.


There are many graphing tools available for drawing three-dimensional surfaces.2 Since we will be able to visualize graphs
of functions of two variables, but not functions of more than two variables, we will primarily deal with functions of two
variables in this course. It is important to note, however, that the techniques we develop apply to functions of any number
of variables.
Notation: We let \(\mathbb{R}^2\) denote the set of all ordered pairs of real numbers in the plane (two copies of the real number
system) and let \(\mathbb{R}^3\) represent the set of all ordered triples of real numbers (which constitutes three-space).

Some Standard Equations in Three-Space


In addition to graphing functions, we will also want to understand graphs of some simple equations in three dimensions.
For example, in \(\mathbb{R}^2\), the graphs of the equations x = a and y = b, where a and b are constants, are lines parallel to the
coordinate axes. In the next activity we consider their three-dimensional analogs.

Matthew Boelkins, David Austin & Steven


9.1.4 11/24/2021 https://math.libretexts.org/@go/page/5933
Schlicker
Activity 9.1.3

(a) Consider the set of points (x, y, z) that satisfy the equation x = 2. Describe this set as best you can.
(b) Consider the set of points (x, y, z) that satisfy the equation y = −1 . Describe this set as best you can.
(c) Consider the set of points (x, y, z) that satisfy the equation z = 0 . Describe this set as best you can.

Activity 9.1.3 shows that an equation in which one variable is held constant describes a plane parallel to the plane
containing the other two coordinate axes. When we make the constant 0, we get the coordinate planes. The xy-plane satisfies
z = 0, the xz-plane satisfies y = 0, and the yz-plane satisfies x = 0 (see Figure 9.1.4).

Figure 9.1.4 : The coordinate planes.


On a related note, we define a circle in \(\mathbb{R}^2\) as the set of all points equidistant from a fixed point. In \(\mathbb{R}^3\),
we call the set of all points equidistant from a fixed point a sphere. To find the equation of a sphere, we need to understand
how to calculate the distance between two points in three-space, and we explore this idea in the next activity.

Activity 9.1.4

Let \(P = (x_0, y_0, z_0)\) and \(Q = (x_1, y_1, z_1)\) be two points in \(\mathbb{R}^3\). These two points form opposite vertices of a
rectangular box whose sides are planes parallel to the coordinate planes as illustrated in Figure 9.1.5, and the distance
between P and Q is the length of the diagonal shown in Figure 9.1.5.

Figure 9.1.5 : The distance formula in \(\mathbb{R}^3\).

(a) Consider one of the right triangles in the base of the box whose hypotenuse is shown as the red line in Figure 9.1.5.
What are the vertices of this triangle? Since this right triangle lies in a plane, we can use the Pythagorean Theorem to
find a formula for the length of the hypotenuse of this triangle. Find such a formula, which will be in terms of \(x_0\), \(y_0\),
\(x_1\), and \(y_1\).

(b) Now notice that the triangle whose hypotenuse is the blue segment connecting the points P and Q with a leg as the
hypotenuse of the triangle found in part (a) lies entirely in a plane, so we can again use the Pythagorean Theorem to
find the length of its hypotenuse. Explain why the length of this hypotenuse, which is the distance between the points
P and Q, is
\[ \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2}. \tag{9.1.3} \]

Matthew Boelkins, David Austin & Steven


9.1.5 11/24/2021 https://math.libretexts.org/@go/page/5933
Schlicker
The formula developed in Activity 9.1.4 is important to remember.

Definition

The distance between points \(P = (x_0, y_0, z_0)\) and \(Q = (x_1, y_1, z_1)\) (denoted as \(|PQ|\)) in \(\mathbb{R}^3\) is given by the
formula
\[ |PQ| = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2}. \tag{9.1.4} \]
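The two-step use of the Pythagorean Theorem behind the distance formula translates directly into code. The following sketch is illustrative only; the function name `distance` and the sample points are hypothetical. It first finds the diagonal of the base of the box and then the diagonal of the box itself.

```python
import math


def distance(p, q):
    """Distance between p = (x0, y0, z0) and q = (x1, y1, z1) in three-space."""
    (x0, y0, z0), (x1, y1, z1) = p, q
    base_diagonal = math.hypot(x1 - x0, y1 - y0)   # hypotenuse in the base of the box
    return math.hypot(base_diagonal, z1 - z0)      # hypotenuse joining p and q


# Sample points chosen for illustration.
print(distance((0, 0, 0), (1, 2, 2)))
print(distance((1, -1, 4), (3, 2, -2)))
```

Python's standard library also provides `math.dist` (version 3.8 and later), which computes the same quantity in one call; the two-step version is written out here only to mirror the geometric argument.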

Equation (9.2) can be used to derive the formula for a sphere centered at a point (x , y , z ) with radius r. Since the0 0 0

distance from any point (x, y, z) on such a sphere to the point (x , y , z ) is r, the point (x, y, z) will satisfy the
0 0 0

equation
−−−−−−−−−−−−−−−−−−−−−−−−−−−−
2 2 2
√ (x1 − x0 ) + (y1 − y0 ) + (z1 − z0 ) =r (9.1.5)

Squaring both sides, we come to the standard equation for a sphere.

Definition

The equation of a sphere with center (x 0, y0 , z0 ) and radius r is


2 2 2 2
(x1 − x0 ) + (y1 − y0 ) + (z1 − z0 ) =r . (9.1.6)

This makes sense if we compare this equation to its two-dimensional analogue, the equation of a circle of radius r in
the plane centered at \((x_0, y_0)\):

\[ (x - x_0)^2 + (y - y_0)^2 = r^2. \tag{9.1.7} \]
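The standard equation of a sphere can be used as a membership test. The sketch below is one way to do that in Python; the helper name `on_sphere`, the tolerance, and the sample center and points are assumptions made for illustration.

```python
import math

def on_sphere(point, center, r, tol=1e-9):
    """Return True when `point` satisfies (x-x0)^2 + (y-y0)^2 + (z-z0)^2 = r^2."""
    x, y, z = point
    x0, y0, z0 = center
    lhs = (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2
    # Compare with a small tolerance so floating-point inputs behave sensibly.
    return math.isclose(lhs, r ** 2, abs_tol=tol)

center = (1, 2, 3)  # hypothetical center
print(on_sphere((3, 4, 4), center, 3))  # 4 + 4 + 1 = 9 = 3^2, so True
print(on_sphere((1, 2, 3), center, 3))  # the center is not on the sphere, so False
```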

Traces
When we study functions of several variables we are often interested in how each individual variable affects the function
in and of itself. In Preview Activity 9.1, we saw that the monthly payment on an $18,000 loan depends on the interest rate
and the duration of the loan. However, if we fix the interest rate, the monthly payment depends only on the duration of the
loan, and if we set the duration the payment depends only on the interest rate. This idea of keeping one variable constant
while we allow the other to change will be an important tool for us when studying functions of several variables.
As another example, consider again the range function f defined by

\[ f(x, y) = \frac{x^2 \sin(2y)}{g} \tag{9.1.8} \]

where x is the initial velocity of an object in feet per second, y is the launch angle in radians, and g is the acceleration due
to gravity (32 feet per second squared). If we hold the launch angle constant at y = 0.6 radians, we can consider f a
function of the initial velocity alone. In this case we have

\[ f(x) = \frac{x^2}{32} \sin(2 \cdot 0.6). \tag{9.1.9} \]

We can plot this curve on the surface by tracing out the points on the surface when y = 0.6, as shown in Figure 9.6. The
graph and the formula clearly show that f is quadratic in the x-direction. More descriptively, as we increase the launch
velocity while keeping the launch angle constant, the range increases in proportion to the square of the initial velocity.
Similarly, if we fix the initial velocity at 150 feet per second, we can consider the range as a function of the launch angle
only. In this case we have

\[ f(y) = \frac{150^2 \sin(2y)}{32}. \tag{9.1.10} \]

We can again plot this curve on the surface by tracing out the points on the surface when x = 150, as shown in Figure 9.7.
The graph and the formula clearly show that f is sinusoidal in the y-direction. More descriptively, as we increase the
launch angle while keeping the initial velocity constant, the range is proportional to the sine of twice the launch angle.

Figure 9.6: The trace with y = 0.6 .

Figure 9.7: The trace with x = 150.


The curves we define when we fix one of the independent variables in our two variable function are called traces.

Definition

A trace of a function f of two independent variables x and y is a curve of the form z = f (c, y) or z = f (x, c) , where
c is a constant.
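To make the idea of a trace concrete, here is a small Python sketch that fixes one input of the range function and returns a function of the remaining input. The helper names `trace_fixed_y` and `trace_fixed_x` are hypothetical; the constants 0.6, 150, and 32 come from the discussion above.

```python
import math

G = 32  # acceleration due to gravity in feet per second squared, as in the text

def f(x, y):
    """Range of a projectile: x is launch speed (ft/s), y is launch angle (radians)."""
    return x ** 2 * math.sin(2 * y) / G

def trace_fixed_y(c):
    """The trace z = f(x, c): a function of x alone."""
    return lambda x: f(x, c)

def trace_fixed_x(c):
    """The trace z = f(c, y): a function of y alone."""
    return lambda y: f(c, y)

range_vs_speed = trace_fixed_y(0.6)   # the y = 0.6 trace
range_vs_angle = trace_fixed_x(150)   # the x = 150 trace

# Doubling the speed along the y = 0.6 trace quadruples the range.
print(range_vs_speed(200) / range_vs_speed(100))
```

Tabulating `range_vs_angle` over several angles between 0 and pi/2 shows the sinusoidal behavior of the x = 150 trace.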

Understanding trends in the behavior of functions of two variables can be challenging, as can sketching their graphs; traces
help us with each of these tasks.

Activity 9.1.5

In the following questions, we investigate the use of traces to better understand a function through both tables and
graphs.
(a) Identify the y = 0.6 trace for the range function f defined by \(f(x, y) = \frac{x^2 \sin(2y)}{g}\) by
highlighting or circling the appropriate cells in Table 9.3. Write a sentence to describe the behavior of the function
along this trace.
(b) Identify the x = 150 trace for the range function by highlighting or circling the appropriate cells in
Table 9.3. Write a sentence to describe the behavior of the function along this trace.

Figure 9.8: Coordinate axes to sketch traces.
(c) For the function g defined by \(g(x, y) = x^2 + y^2 + 1\), explain the type of function that each trace in the x direction
will be (keeping y constant). Plot the y = −4, y = −2, y = 0, y = 2, and y = 4 traces in the 3-dimensional coordinate
system provided in Figure 9.8.
(d) For the function g defined by \(g(x, y) = x^2 + y^2 + 1\), explain the type of function that each trace in the y direction
will be (keeping x constant). Plot the x = −4, x = −2, x = 0, x = 2, and x = 4 traces in the 3-dimensional coordinate
system in Figure 9.8.
(e) Describe the surface generated by the function g .

Contour Maps and Level Curves


We have all seen topographic maps such as the one of the Porcupine Mountains in the upper peninsula of Michigan shown
in Figure 9.9. The curves on these maps show the regions of constant altitude. The contours also depict changes in
altitude: contours that are close together signify steep ascents or descents, while contours that are far apart indicate only
slight changes in elevation. Thus, contour maps tell us a lot about three-dimensional surfaces. Mathematically, if f (x, y)
represents the altitude at the point (x, y), then each contour is the graph of an equation of the form f (x, y) = k , for some
constant k .

Figure 9.9: Contour map of the Porcupine Mountains

Activity 9.1.6

On the topographical map of the Porcupine Mountains in Figure 9.9,


(a) identify the highest and lowest points you can find;
(b) from a point of your choice, determine a path of steepest ascent that leads to the highest point;
(c) from that same initial point, determine the least steep path that leads to the highest point.

Definition 9.5.
A level curve (or contour) of a function f of two independent variables x and y is a curve of the form k = f (x, y) ,
where k is a constant.

Topographical maps can be used to create a three-dimensional surface from the two-dimensional contours or level curves.
For example, level curves of the range function defined by \(f(x, y) = \frac{x^2 \sin(2y)}{32}\) plotted in the xy-plane are shown in
Figure 9.10. If we lift these contours and plot them at their respective heights, then we get a picture of the surface itself, as
illustrated in Figure 9.11.
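One way to generate points on a level curve of the range function is to solve f(x, y) = k for x. For 0 < y < π/2 and k > 0 this gives x = sqrt(32k / sin(2y)). The Python sketch below uses that rearrangement; the function name `level_curve_points` and the sample values of k and y are assumptions chosen for illustration.

```python
import math

def f(x, y):
    """Range function from the text, with g = 32."""
    return x ** 2 * math.sin(2 * y) / 32

def level_curve_points(k, angles):
    """Points (x, y) with x > 0 on the level curve f(x, y) = k.

    Assumes k > 0 and every angle lies strictly between 0 and pi/2,
    so that sin(2y) is positive.
    """
    return [(math.sqrt(32 * k / math.sin(2 * y)), y) for y in angles]

angles = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]  # radians
for k in (100, 300, 500):
    pts = level_curve_points(k, angles)
    # Every generated point should give the same output k.
    print(k, all(math.isclose(f(x, y), k) for x, y in pts))
```

Plotting each list of points in the xy-plane, one list per value of k, would produce a contour picture in the spirit of Figure 9.10.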

Figure 9.10: Several level curves.

Figure 9.11: Level curves at the appropriate height.


The use of level curves and traces can help us construct the graph of a function of two variables.

Activity 9.1.7

(a) Let \(f(x, y) = x^2 + y^2\). Draw the level curves f(x, y) = k for k = 1, k = 2, k = 3, and k = 4 on the left set of
axes given in Figure 9.12. (You decide on the scale of the axes.) Explain what the surface defined by f looks like.

Figure 9.12: Left: Level curves for \(f(x, y) = x^2 + y^2\). Right: Level curves for \(g(x, y) = \sqrt{x^2 + y^2}\).

(b) Let \(g(x, y) = \sqrt{x^2 + y^2}\). Draw the level curves g(x, y) = k for k = 1, k = 2, k = 3, and k = 4 on the right set
of axes given in Figure 9.12. (You decide on the scale of the axes.) Explain what the surface defined by g looks like.
(c) Compare and contrast the graphs of f and g . How are they alike? How are they different? Use traces for each
function to help answer these questions.

The traces and level curves of a function of two variables are curves in space. In order to understand these traces and level
curves better, we will first spend some time learning about vectors and vector-valued functions in the next few sections, and
then return to our study of functions of several variables once we have more mathematical tools to support that study.

A gallery of functions
We end this section by considering a collection of functions and illustrating their graphs and some level curves.

Figure 9.13: \(z = x^2 + y^2\)

Figure 9.14: \(z = 4 - (x^2 + y^2)\)

Figure 9.15: \(z = \sqrt{x^2 + y^2}\)

Figure 9.16: \(z = x^2 - y^2\)

Figure 9.17: \(z = \sin(x) + \sin(y)\)

Figure 9.18: \(z = y^2 - x^3 + x\)

Figure 9.19: \(z = xye^{-x^2-y^2}\)

Summary
A function f of several variables is a rule that assigns a unique number to an ordered collection of independent inputs. The
domain of a function of several variables is the set of all inputs for which the function is defined. In \(\mathbb{R}^3\), the distance
between points \(P = (x_0, y_0, z_0)\) and \(Q = (x_1, y_1, z_1)\) (denoted as \(|PQ|\)) is given by the formula

\[ |PQ| = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2}, \tag{9.1.11} \]

and thus the equation of a sphere with center \((x_0, y_0, z_0)\) and radius r is

\[ (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = r^2. \tag{9.1.12} \]

A trace of a function f of two independent variables x and y is a curve of the form z = f (c, y) or z = f (x, c) , where c is
a constant. A trace tells us how the function depends on a single independent variable if we treat the other independent
variable as a constant.
A level curve of a function f of two independent variables x and y is a curve of the form k = f (x, y) , where k is a
constant. A level curve describes the set of inputs that lead to a specific output of the function.

9.2: Section 2-
Learning Objectives
In this section, we strive to understand the ideas generated by the following important questions:
What is a vector?
What does it mean for two vectors to be equal?
How do we add two vectors together and multiply a vector by a scalar?
How do we determine the magnitude of a vector? What is a unit vector, and how do we find a unit vector in the
direction of a given vector?

Introduction
If we are at a point x in the domain of a function of one variable, there are only two directions in which we can move: in
the positive or negative x-direction. If, however, we are at a point (x, y) in the domain of a function of two variables, there
are many directions in which we can move. Thus, it is important for us to have a means to indicate direction, and we will
do so using vectors.

Preview Activity 9.2.1:

After working out, Sarah and John leave the Recreation Center on the Grand Valley State University Allendale campus
(a map of which is given in Figure 9.20) to go to their next classes (see footnote 4). Suppose we record Sarah’s movement on the map
in a pair \(\langle x, y \rangle\) (we will call this pair a vector), where x is the horizontal distance (in feet) she moves (with east as
the positive direction) and y is the vertical distance (in feet) she moves (with north as the positive direction). We do
the same for John. Throughout, use the legend to estimate your responses as best you can.
(a) What is the vector \(\mathbf{v}_1 = \langle x, y \rangle\) that describes Sarah’s movement if she walks directly in a straight line path from
the Recreation Center to the entrance at the northwest end of Mackinac Hall? (Assume a straight line path, even if
there are buildings in the way.) Explain how you found this vector. What is the total distance in feet between the
Recreation Center and the entrance to Mackinac Hall? Measure the number of feet directly and then explain how to
calculate this distance in terms of x and y.
(b) What is the vector \(\mathbf{v}_2 = \langle x, y \rangle\) that describes John’s change in position if he walks directly from the Recreation
Center to Au Sable Hall? How many feet are there between the Recreation Center and Au Sable Hall in terms of x and y?
(c) What is the vector \(\mathbf{v}_3 = \langle x, y \rangle\) that describes the change in position if John walks directly from Au Sable Hall to
the northwest entrance of Mackinac Hall to meet up with Sarah after class? What relationship do you see among the
vectors \(\mathbf{v}_1\), \(\mathbf{v}_2\), and \(\mathbf{v}_3\)? Explain why this relationship should hold.

Figure 9.20: Grand Valley State University Allendale campus map.


4: GVSU campus map from http://www.gvsu.edu/homepage/files/p.../allendale.pdf, used with permission from
GVSU, credit to illustrator Chris Bessert.

Representations of Vectors
Preview Activity 9.2.1 shows how we can record the magnitude and direction of a change in position using an ordered pair
of numbers < x, y >. There are many other quantities, such as force and velocity, that possess the attributes of magnitude
and direction, and we will call each such quantity a vector.

Definition: Vector
A vector is any quantity that possesses the attributes of magnitude and direction.

We can represent a vector geometrically as a directed line segment, with the magnitude as the length of the segment and an
arrowhead indicating direction, as shown in Figure 9.21.
Figure 9.21: A vector.
Figure 9.22: Representations of the same vector.
According to the definition, a vector possesses the attributes of length (magnitude) and direction; the vector’s position,
however, is not mentioned. Consequently, we regard as equal any two vectors having the same magnitude and direction, as
shown in Figure 9.22.

Note
Two vectors are equal provided they have the same magnitude and direction.

This means that the same vector may be drawn in the plane in many different ways. For instance, suppose that we would
like to draw the vector < 3, 4 > , which represents a horizontal change of three units and a vertical change of four units.
We may place the tail of the vector (the point from which the vector originates) at the origin and the tip (the terminal point
of the vector) at (3, 4), as illustrated in Figure 9.23. A vector with its tail at the origin is said to be in standard position.
Alternatively, we may place the tail of the vector < 3, 4 > at another point, such as Q(1, 1). After a displacement of three
units to the right and four units up, the tip of the vector is at the point R(4, 5) (see Figure 9.24).
Figure 9.23: A vector in standard position

Figure 9.24: A vector between two points



In this example, the vector led to the directed line segment from Q to R, which we denote as \(\overrightarrow{QR}\). We may also turn the
situation around: given the two points Q and R, we obtain the vector \(\langle 3, 4 \rangle\) because we move horizontally three units
and vertically four units to get from Q to R. In other words, \(\overrightarrow{QR} = \langle 3, 4 \rangle\). In general, the vector \(\overrightarrow{QR}\) from the point
\(Q = (q_1, q_2)\) to \(R = (r_1, r_2)\) is found by taking the difference of coordinates, so that

\[ \overrightarrow{QR} = \langle r_1 - q_1, r_2 - q_2 \rangle. \]


We will use boldface letters to represent vectors, such as \(\mathbf{v} = \langle 3, 4 \rangle\), to distinguish them from scalars. The entries of a
vector are called its components; in the vector \(\langle 3, 4 \rangle\), the x component is 3 and the y component is 4. We use pointed
brackets \(\langle \ , \ \rangle\) and the term components to distinguish a vector from a point ( , ) and its coordinates. There is, however, a
close connection between vectors and points. Given a point P, we will frequently consider the vector \(\overrightarrow{OP}\) from the origin
O to P. For instance, if P = (3, 4), then \(\overrightarrow{OP} = \langle 3, 4 \rangle\) as in Figure 9.25. In this way, we think of a point P as defining a
vector \(\overrightarrow{OP}\) whose components agree with the coordinates of P.
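The "difference of coordinates" rule translates directly into code. The following Python sketch (the function name `vector_between` is an illustrative choice) computes the vector from one point to another and reproduces the example with Q(1, 1) and R(4, 5), as well as the vector from the origin to P = (3, 4).

```python
def vector_between(q, r):
    """Components of the vector from point q to point r (tip minus tail)."""
    return tuple(ri - qi for qi, ri in zip(q, r))

Q = (1, 1)
R = (4, 5)
print(vector_between(Q, R))  # (3, 4), matching the example in the text

O = (0, 0)
P = (3, 4)
print(vector_between(O, P))  # (3, 4): the components agree with the coordinates of P
```

Because the function works component by component, it applies unchanged to points in three or more dimensions.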

Figure 9.25: A point defines a vector


While we often illustrate vectors in the plane since it is easier to draw pictures, different situations call for the use of
vectors in three or more dimensions. For instance, a vector \(\mathbf{v}\) in n-dimensional space, \(\mathbb{R}^n\), has n components and may be
represented as

\[ \mathbf{v} = \langle v_1, v_2, v_3, \ldots, v_n \rangle. \]
The next activity will help us to become accustomed to vectors and operations on vectors in three dimensions.

Activity 9.2.1:

As a class, determine a coordinatization of your classroom, agreeing on some convenient set of axes (e.g., an
intersection of walls and floor) and some units in the x, y , and z directions (e.g., using lengths of sides of floor,
ceiling, or wall tiles). Let O be the origin of your coordinate system. Then, choose three points, A , B , and C in the
room, and complete the following.
(a) Determine the coordinates of the points A , B , and C .
(b) Determine the components of the indicated vectors.
(i) \(\overrightarrow{OA}\) (ii) \(\overrightarrow{OB}\) (iii) \(\overrightarrow{OC}\) (iv) \(\overrightarrow{AB}\) (v) \(\overrightarrow{AC}\) (vi) \(\overrightarrow{BC}\)
Equality of Vectors
Because location is not mentioned in the definition of a vector, any two vectors that have the same magnitude and direction
are equal. It is helpful to have an algebraic way to determine when this occurs. That is, if we know the components of two
vectors u and v, we will want to be able to determine algebraically when u and v are equal. There is an obvious set of
conditions that we use.

Note
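Two vectors \(\mathbf{u} = \langle u_1, u_2 \rangle\) and \(\mathbf{v} = \langle v_1, v_2 \rangle\) in \(\mathbb{R}^2\) are equal if and only if their corresponding components are equal:
\(u_1 = v_1\) and \(u_2 = v_2\). More generally, two vectors \(\mathbf{u} = \langle u_1, u_2, \ldots, u_n \rangle\) and \(\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle\) in \(\mathbb{R}^n\) are equal if and
only if \(u_i = v_i\) for each possible value of i.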

Operations on Vectors
Vectors are not numbers, but we can now represent them with components that are real numbers. As such, we naturally
wonder if it is possible to add two vectors together, multiply two vectors, or combine vectors in any other ways. In this
section, we will study two operations on vectors: vector addition and scalar multiplication. To begin, we investigate a
natural way to add two vectors together, as well as to multiply a vector by a scalar.

Activity 9.2.2:

Let \(\mathbf{u} = \langle 2, 3 \rangle\), \(\mathbf{v} = \langle -1, 4 \rangle\).

(a) Using the two specific vectors above, what is the natural way to define the vector sum \(\mathbf{u} + \mathbf{v}\)?
(b) In general, how do you think the vector sum \(\mathbf{a} + \mathbf{b}\) of vectors \(\mathbf{a} = \langle a_1, a_2 \rangle\) and \(\mathbf{b} = \langle b_1, b_2 \rangle\) in \(\mathbb{R}^2\) should be defined?
Write a formal definition of a vector sum based on your intuition.
(c) In general, how do you think the vector sum \(\mathbf{a} + \mathbf{b}\) of vectors \(\mathbf{a} = \langle a_1, a_2, a_3 \rangle\) and
\(\mathbf{b} = \langle b_1, b_2, b_3 \rangle\) in \(\mathbb{R}^3\) should be defined? Write a formal definition of a vector sum
based on your intuition.
(d) Returning to the specific vector \(\mathbf{v} = \langle -1, 4 \rangle\) given above, what is the natural way to
define the scalar multiple \(\frac{1}{2}\mathbf{v}\)?
(e) In general, how do you think a scalar multiple of a vector \(\mathbf{a} = \langle a_1, a_2 \rangle\) in \(\mathbb{R}^2\) by a scalar
c should be defined? How about for a scalar multiple of a vector \(\mathbf{a} = \langle a_1, a_2, a_3 \rangle\) in \(\mathbb{R}^3\)
by a scalar c? Write a formal definition of a scalar multiple of a vector based on your
intuition.

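We can now add vectors and multiply vectors by scalars, and thus we can add together scalar multiples of vectors. This
allows us to define vector subtraction, \(\mathbf{v} - \mathbf{u}\), as the sum of \(\mathbf{v}\) and −1 times \(\mathbf{u}\), so that

\[ \mathbf{v} - \mathbf{u} = \mathbf{v} + (-1)\mathbf{u}. \]

Using vector addition and scalar multiplication, we will often represent vectors in terms of the special vectors \(\mathbf{i} = \langle 1, 0 \rangle\)
and \(\mathbf{j} = \langle 0, 1 \rangle\). For instance, we can write the vector \(\langle a, b \rangle\) in \(\mathbb{R}^2\) as

\[ \langle a, b \rangle = a\langle 1, 0 \rangle + b\langle 0, 1 \rangle = a\mathbf{i} + b\mathbf{j}, \]

which means that

\[ \langle 2, -3 \rangle = 2\mathbf{i} - 3\mathbf{j}. \]

In the context of \(\mathbb{R}^3\), we let \(\mathbf{i} = \langle 1, 0, 0 \rangle\), \(\mathbf{j} = \langle 0, 1, 0 \rangle\), and \(\mathbf{k} = \langle 0, 0, 1 \rangle\), and we can write the vector \(\langle a, b, c \rangle\) in \(\mathbb{R}^3\) as

\[ \langle a, b, c \rangle = a\langle 1, 0, 0 \rangle + b\langle 0, 1, 0 \rangle + c\langle 0, 0, 1 \rangle = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}. \]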
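The component-wise definitions of the vector sum and scalar multiple (stated formally in this section's summary) can be sketched in Python as follows. The helper names `add`, `scale`, and `subtract` and the sample vectors are illustrative assumptions. Vectors are stored as tuples.

```python
def add(u, v):
    """Vector sum: add corresponding components."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Scalar multiple: multiply every component by c."""
    return tuple(c * a for a in v)

def subtract(v, u):
    """v - u, defined as v + (-1)u."""
    return add(v, scale(-1, u))

i, j = (1, 0), (0, 1)      # standard unit vectors in R^2
u, v = (4, 6), (3, -2)     # sample vectors

print(add(u, v))                        # (7, 4)
print(subtract((7, 4), u))              # (3, -2)
print(add(scale(2, i), scale(-3, j)))   # (2, -3), that is, 2i - 3j
```

Writing subtraction in terms of `add` and `scale` mirrors the definition v − u = v + (−1)u rather than introducing a new operation.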

The vectors \(\mathbf{i}\), \(\mathbf{j}\), and \(\mathbf{k}\) are called the standard unit vectors (as we will learn momentarily, unit vectors have length 1), and
are important in the physical sciences.

Properties of Vector Operations

We know that the scalar sum 1 + 2 is equal to the scalar sum 2 + 1. This is called the commutative property of scalar
addition. Any time we define operations on objects (like addition of vectors) we usually want to know what kinds of
properties the operations have. For example, is addition of vectors a commutative operation? To answer this question we
take two arbitrary vectors \(\mathbf{v}\) and \(\mathbf{u}\), add them together, and see what happens. Let \(\mathbf{v} = \langle v_1, v_2 \rangle\) and
\(\mathbf{u} = \langle u_1, u_2 \rangle\). Now we use the fact that \(v_1\), \(v_2\), \(u_1\), and \(u_2\) are scalars, and that the
addition of scalars is commutative, to see that

\[ \mathbf{v} + \mathbf{u} = \langle v_1, v_2 \rangle + \langle u_1, u_2 \rangle = \langle v_1 + u_1, v_2 + u_2 \rangle = \langle u_1 + v_1, u_2 + v_2 \rangle = \langle u_1, u_2 \rangle + \langle v_1, v_2 \rangle = \mathbf{u} + \mathbf{v}. \]

So the vector sum is a commutative operation. Similar arguments can be used to show the following properties of vector
addition and scalar multiplication.

Theorem

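Let \(\mathbf{v}\), \(\mathbf{u}\), and \(\mathbf{w}\) be vectors in \(\mathbb{R}^n\) and let a and b be scalars. Then

1. \(\mathbf{v} + \mathbf{u} = \mathbf{u} + \mathbf{v}\)
2. \((\mathbf{v} + \mathbf{u}) + \mathbf{w} = \mathbf{v} + (\mathbf{u} + \mathbf{w})\)
3. The vector \(\mathbf{0} = \langle 0, 0, \ldots, 0 \rangle\) has the property that \(\mathbf{v} + \mathbf{0} = \mathbf{v}\). The vector \(\mathbf{0}\) is called the
zero vector.
4. \((-1)\mathbf{v} + \mathbf{v} = \mathbf{0}\). The vector \((-1)\mathbf{v} = -\mathbf{v}\) is called the additive inverse of the vector \(\mathbf{v}\).
5. \((a + b)\mathbf{v} = a\mathbf{v} + b\mathbf{v}\)
6. \(a(\mathbf{v} + \mathbf{u}) = a\mathbf{v} + a\mathbf{u}\)
7. \((ab)\mathbf{v} = a(b\mathbf{v})\)
8. \(1\mathbf{v} = \mathbf{v}\).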

We verified the first property for vectors in \(\mathbb{R}^2\); it is straightforward to verify that the rest of the eight properties just noted
hold for all vectors in \(\mathbb{R}^n\).
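A numerical spot check is not a proof, but it can build confidence in the eight properties. The Python sketch below tests each property on one arbitrary choice of integer vectors in \(\mathbb{R}^3\) and integer scalars. The helper names and sample values are assumptions made for illustration. Integers are used so that every comparison is exact.

```python
def add(u, v):
    """Component-wise vector sum."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Component-wise scalar multiple."""
    return tuple(c * a for a in v)

# Arbitrary sample vectors and scalars (integers keep the arithmetic exact).
v, u, w = (1, -2, 3), (4, 0, -5), (-7, 6, 2)
a, b = 3, -4
zero = (0, 0, 0)

checks = {
    "1. v + u = u + v": add(v, u) == add(u, v),
    "2. (v + u) + w = v + (u + w)": add(add(v, u), w) == add(v, add(u, w)),
    "3. v + 0 = v": add(v, zero) == v,
    "4. (-1)v + v = 0": add(scale(-1, v), v) == zero,
    "5. (a + b)v = av + bv": scale(a + b, v) == add(scale(a, v), scale(b, v)),
    "6. a(v + u) = av + au": scale(a, add(v, u)) == add(scale(a, v), scale(a, u)),
    "7. (ab)v = a(bv)": scale(a * b, v) == scale(a, scale(b, v)),
    "8. 1v = v": scale(1, v) == v,
}

for name, holds in checks.items():
    print(name, holds)
```

Each check reduces to the corresponding property of real-number arithmetic applied in every component, which is exactly how the general proofs proceed.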

Geometric Interpretation of Vector Operations


Next, we explore a geometric interpretation of vector addition and scalar multiplication that allows us to visualize these
operations. Let \(\mathbf{u} = \langle 4, 6 \rangle\) and \(\mathbf{v} = \langle 3, -2 \rangle\). Then \(\mathbf{w} = \mathbf{u} + \mathbf{v} = \langle 7, 4 \rangle\), as shown on the left in Figure 9.26.
If we think of these vectors as displacements in the plane, we find a geometric way to envision vector addition. For
instance, the vector \(\mathbf{u} + \mathbf{v}\) will represent the displacement obtained by following the displacement \(\mathbf{u}\) with the displacement \(\mathbf{v}\).
We may picture this by placing the tail of \(\mathbf{v}\) at the tip of \(\mathbf{u}\), as seen in the center of Figure 9.26.
Of course, vector addition is commutative so we obtain the same sum if we place the tail of \(\mathbf{u}\) at the tip of \(\mathbf{v}\). We therefore
see that \(\mathbf{u} + \mathbf{v}\) appears as the diagonal of the parallelogram determined by \(\mathbf{u}\) and \(\mathbf{v}\), as shown on the right of Figure 9.26.
Vector subtraction has a similar interpretation. On the left in Figure 9.27, we see vectors \(\mathbf{u}\), \(\mathbf{v}\), and \(\mathbf{w} = \mathbf{u} + \mathbf{v}\). If we rewrite
\(\mathbf{v} = \mathbf{w} - \mathbf{u}\), we have the arrangement of Figure 9.28. In other words, to form the difference \(\mathbf{w} - \mathbf{u}\), we draw a vector
from the tip of \(\mathbf{u}\) to the tip of \(\mathbf{w}\).
Figure 9.26: A vector sum (left), summing displacements (center), the parallelogram law (right)
Figure 9.27: Vector addition
Figure 9.28: Vector subtraction
In a similar way, we may geometrically represent a scalar multiple of a vector. For instance, if \(\mathbf{v} = \langle 2, 3 \rangle\), then \(2\mathbf{v} = \langle 4, 6 \rangle\).
As shown in Figure 9.29, multiplying \(\mathbf{v}\) by 2 leaves the direction unchanged, but stretches \(\mathbf{v}\) by a factor of 2. Also,
\(-2\mathbf{v} = \langle -4, -6 \rangle\), which shows that multiplying by a negative scalar gives a vector pointing in the opposite direction of \(\mathbf{v}\).
Figure 9.29: Scalar multiplication
Activity 9.2.3:

Figure 9.30
Figure 9.31
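Suppose that \(\mathbf{u}\) and \(\mathbf{v}\) are the vectors shown in Figure 9.30.
(a) On Figure 9.30, sketch the vectors \(\mathbf{u} + \mathbf{v}\), \(\mathbf{v} - \mathbf{u}\), \(2\mathbf{u}\), \(-2\mathbf{u}\), and \(-3\mathbf{v}\).
(b) What is \(0\mathbf{v}\)?
(c) On Figure 9.31, sketch the vectors \(-3\mathbf{v}\), \(-2\mathbf{v}\), \(-1\mathbf{v}\), \(2\mathbf{v}\), and \(3\mathbf{v}\).
(d) Give a geometric description of the set of vectors \(t\mathbf{v}\) where t is any scalar.
(e) On Figure 9.31, sketch the vectors \(\mathbf{u} - 3\mathbf{v}\), \(\mathbf{u} - 2\mathbf{v}\), \(\mathbf{u} - \mathbf{v}\), \(\mathbf{u} + \mathbf{v}\), and \(\mathbf{u} + 2\mathbf{v}\).
(f) Give a geometric description of the set of vectors \(\mathbf{u} + t\mathbf{v}\) where t is any scalar.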

The Magnitude of a Vector


By definition, vectors have both direction and magnitude (or length). We now investigate how to calculate the magnitude
of a vector. Since a vector v can be represented by a directed line segment, we can use the distance formula to calculate the
length of the segment. This length is the magnitude of the vector \(\mathbf{v}\) and is denoted \(|\mathbf{v}|\).

Activity 9.2.4:

Figure 9.32: The vector defined by A and B.


Figure 9.33: An arbitrary vector, v.
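(a) Let A = (2, 3) and B = (4, 7), as shown in Figure 9.32. Compute \(|\overrightarrow{AB}|\).
(b) Let \(\mathbf{v} = \langle v_1, v_2 \rangle\) be the vector in \(\mathbb{R}^2\) with components \(v_1\) and \(v_2\) as shown in Figure 9.33.
Use the distance formula to find a general formula for \(|\mathbf{v}|\).
(c) Let \(\mathbf{v} = \langle v_1, v_2, v_3 \rangle\) be a vector in \(\mathbb{R}^3\). Use the distance formula to find a general formula
for \(|\mathbf{v}|\).
(d) Suppose that \(\mathbf{u} = \langle 2, 3 \rangle\) and \(\mathbf{v} = \langle -1, 2 \rangle\). Find \(|\mathbf{u}|\), \(|\mathbf{v}|\), and \(|\mathbf{u} + \mathbf{v}|\). Is it true that
\(|\mathbf{u} + \mathbf{v}| = |\mathbf{u}| + |\mathbf{v}|\)?
(e) Under what conditions will \(|\mathbf{u} + \mathbf{v}| = |\mathbf{u}| + |\mathbf{v}|\)? (Hint: Think about how \(\mathbf{u}\), \(\mathbf{v}\), and \(\mathbf{u} + \mathbf{v}\)
form the sides of a triangle.)
(f) With the vector \(\mathbf{u} = \langle 2, 3 \rangle\), find the lengths of \(2\mathbf{u}\), \(3\mathbf{u}\), and \(-2\mathbf{u}\), respectively, and use
proper notation to label your results.
(g) If t is any scalar, how is \(|t\mathbf{u}|\) related to \(|\mathbf{u}|\)?
(h) A unit vector is a vector whose magnitude is 1. Of the vectors \(\mathbf{i}\), \(\mathbf{j}\), and \(\mathbf{i} + \mathbf{j}\), which are
unit vectors?
(i) Find a unit vector \(\mathbf{v}\) whose direction is the same as \(\mathbf{u} = \langle 2, 3 \rangle\). (Hint: Consider the result
of part (g).)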

Summary
A vector is any object that possesses the attributes of magnitude and direction. Examples of vector quantities are
position, velocity, acceleration, and force.
Two vectors are equal if they have the same direction and magnitude. Notice that position is not considered, so a vector
is independent of its location.
If \(\mathbf{u} = \langle u_1, u_2, \ldots, u_n \rangle\) and \(\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle\) are two vectors in \(\mathbb{R}^n\), then their vector sum is the vector

\[ \mathbf{u} + \mathbf{v} = \langle u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n \rangle. \]

If \(\mathbf{u} = \langle u_1, u_2, \ldots, u_n \rangle\) is a vector in \(\mathbb{R}^n\) and c is a scalar, then the scalar multiple \(c\mathbf{u}\) is the vector

\[ c\mathbf{u} = \langle cu_1, cu_2, \ldots, cu_n \rangle. \]

The magnitude of the vector \(\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle\) in \(\mathbb{R}^n\) is the scalar

\[ |\mathbf{v}| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}. \]

A vector \(\mathbf{u}\) is a unit vector provided that \(|\mathbf{u}| = 1\). If \(\mathbf{v}\) is a nonzero vector, then the vector \(\frac{\mathbf{v}}{|\mathbf{v}|}\) is a unit vector with the same
direction as \(\mathbf{v}\).
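The magnitude and unit-vector facts above can be sketched in Python as follows. The function names `magnitude` and `unit_vector` and the sample vector are illustrative assumptions. The zero vector is rejected explicitly because it has no direction to normalize.

```python
import math

def magnitude(v):
    """Length of a vector: the square root of the sum of squared components."""
    return math.sqrt(sum(a * a for a in v))

def unit_vector(v):
    """The vector v / |v|, which has length 1 and the same direction as v."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(a / m for a in v)

v = (3, 4)  # sample vector
print(magnitude(v))     # 5.0
print(unit_vector(v))   # (0.6, 0.8)
```

Dividing by the magnitude is a scalar multiplication by 1/|v|, so the direction is preserved while the length is rescaled to 1.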

9.3: Section 3-

Welcome to the Mathematics Library. This Living Library is a principal hub of the LibreTexts project, which is a multi-
institutional collaborative venture to develop the next generation of open-access texts to improve postsecondary education at
all levels of higher learning. The LibreTexts approach is highly collaborative where an Open Access textbook environment is
under constant revision by students, faculty, and outside experts to supplant conventional paper-based books.


CHAPTER OVERVIEW
10: DERIVATIVES OF MULTIVARIABLE FUNCTIONS
http://scholarworks.gvsu.edu/books/14/

10.1: SECTION 1-
10.2: SECTION 2-
10.3: SECTION 3-
10.4: SECTION 4-
10.5: SECTION 5-
10.6: SECTION 6-

10.1: Section 1-



CHAPTER OVERVIEW
11: MULTIPLE INTEGRALS
http://scholarworks.gvsu.edu/books/14/

11.1: SECTION 1-
11.2: SECTION 2-
11.3: SECTION 3-
11.4: SECTION 4-
11.5: SECTION 5-
11.6: SECTION 6-

11.1: Section 1-


11.3: Section 3-


Index
A
absolute extrema
    3.3: Global Optimization
absolute minimum
    3.1: Using Derivatives to Identify Extreme Values
alternating series test
    8.4: Alternating Series
autonomous differential equation
    7.2: Qualitative Behavior of Solutions to Differential Equations

C
carrying capacity
    7.6: Population Growth and the Logistic Equation
chain rule
    2.5: The Chain Rule
common ratio
    8.2: Geometric Series
Concavity
    1.6: The Second Derivative
constant multiple rule
    2.1: Elementary Derivative Rules
continuity
    1.7: Limits, Continuity, and Differentiability
Convergence
    6.5: Improper Integrals
convergent sequences
    8.1: Sequences
corresponding theorem
    8.6: Power Series

D
Derivative of cosine function
    2.2: The Sine and Cosine Function
Derivative of sine function
    2.2: The Sine and Cosine Function
Differentiability (two variables)
    1.7: Limits, Continuity, and Differentiability
Divergence
    6.5: Improper Integrals
Divergence Test
    8.3: Series of Real Numbers

E
equilibrium solutions
    7.2: Qualitative Behavior of Solutions to Differential Equations
Euler’s Method
    7.3: Euler's Method
Extreme Value Theorem
    3.3: Global Optimization

F
First Derivative Test
    3.1: Using Derivatives to Identify Extreme Values
force
    6.4: Physics Applications - Work, Force, and Pressure
fundamental theorem of calculus
    4.4: The Fundamental Theorem of Calculus

G
Geometric Sums
    8.2: Geometric Series
global minimum
    3.1: Using Derivatives to Identify Extreme Values
Global Optimization
    3.3: Global Optimization
Graphing Antiderivatives
    5.1: Constructing Accurate Graphs of Antiderivatives

H
harmonic series
    8.3: Series of Real Numbers

I
implicit differentiation
    2.7: Derivatives of Functions Given Implicitly
IMPROPER INTEGRAL
    6.5: Improper Integrals
index of summation
    4.2: Riemann Sums
inflection points
    3.2: Using Derivatives to Describe Families of Functions
inner function
    2.5: The Chain Rule
Integration by Parts
    5.4: Integration by Parts
interval of convergence
    8.6: Power Series

L
lemniscate
    2.7: Derivatives of Functions Given Implicitly
limit
    1.7: Limits, Continuity, and Differentiability
Logistic equation
    7.6: Population Growth and the Logistic Equation

M
MACLAURIN SERIES
    8.5: Taylor Polynomials and Taylor Series
midpoint rule
    5.6: Numerical Integration

O
outer function
    2.5: The Chain Rule

P
partial fractions
    5.5: Other Options for Finding Algebraic Antiderivatives
per capita growth rate
    7.6: Population Growth and the Logistic Equation
Population growth
    7.6: Population Growth and the Logistic Equation
power rule
    2.1: Elementary Derivative Rules
power series
    8.6: Power Series
Power Series Differentiation and Integration Theorem
    8.6: Power Series
pressure
    6.4: Physics Applications - Work, Force, and Pressure

Q
quotient rule
    2.3: The Product and Quotient Rules
Quotient Rules
    2.3: The Product and Quotient Rules

R
related rates
    3.5: Related Rates
Riemann sum
    4.2: Riemann Sums

S
Second Derivative
    1.6: The Second Derivative
second derivative test
    3.1: Using Derivatives to Identify Extreme Values
Separable Differential Equations
    7.4: Separable Differential Equations
Sigma Notation
    4.2: Riemann Sums
slope field
    7.2: Qualitative Behavior of Solutions to Differential Equations
Sum Rule
    2.1: Elementary Derivative Rules
sump crock
    6.4: Physics Applications - Work, Force, and Pressure

T
tangent line approximation
    1.8: The Tangent Line Approximation
Taylor polynomial
    8.5: Taylor Polynomials and Taylor Series
Taylor series
    8.5: Taylor Polynomials and Taylor Series
The Integral Test
    8.3: Series of Real Numbers
The Lagrange Error Bound
    8.5: Taylor Polynomials and Taylor Series
The Limit Comparison Test
    8.3: Series of Real Numbers
The Ratio Test
    8.3: Series of Real Numbers
The Second Fundamental Theorem of Calculus
    5.2: The Second Fundamental Theorem of Calculus
Trapezoid Rule
    5.6: Numerical Integration

W
work
    6.4: Physics Applications - Work, Force, and Pressure
