Algebra and Calculus:

Mathematical Modeling
for Business, Economics, and Finance
© 2014 Edoh Y. Amiran

ISBN–13: 9781500774936
ISBN–10: 1500774936
Introduction

To make informed decisions, for personal finances, at work, and as contributors to public
debates and elections, one must understand the financial state and economic behavior
of individuals, cities, counties, countries, and markets. The quality of life, and indeed the
type of lifestyles available in the future, depend on our management of natural, financial,
and human resources, and decisions about the use of resources are made, increasingly and
on a large scale, by informed citizens, and by people in government, business, and finance
who use quantitative models.
Quantitative understanding plays an important role in military strategy and planning,
in managing daily personal finances and investments for retirement, in placing emergency
services to maximize the number of people served and to minimize risks, in urban and rural
planning, in managing waste and cleaning hazardous waste sites, in hiring and firing of
employees, in setting interest rates that affect the well-being of entire nations, in allocating
the use of public lands and water resources, in allocating fishing permits, in licensing and
routing air traffic, in engineering and design, in production and marketing of goods and
services, in industrial development, and in development of intellectual properties. Indeed,
there is no part of modern life in which quantitative analysis is entirely absent.
In short, those who wish to play a role in making decisions need to understand the
principles and techniques behind mathematical, economic, and financial models.
This text aims to illustrate the types of thinking required to understand quantitative
models in decision science, economics, and finance. The ideas are introduced in an explained
context, and the tools used for analysis are the ones which naturally arise in the
context. The important notions are the settings from economics and finance, the mathematical
ideas and tools, and especially the connections among them. The thoughts are
directed from and towards a mathematical perspective.
To continue one’s development as a decision maker, one would continue studying not
only Mathematics, Economics, and Finance, but also many other fields in the physical
sciences, humanities, and social sciences.
Table of Contents
1. Quantitative Models.
A. Quantitative and qualitative variables.
B. Relations among variables.
C. Models and decisions.
D. Chapter Summary.

2. Numbers and Calculations: A Review.

A. Numbers.
B. Operations with numbers.
C. Absolute value.
D. Powers.
E. Rules for variables.

3. Graphs and Equations.

A. Graphs.
B. Equations.
C. Inequalities.
D. Graphing equations.

4. Linear and Polynomial Expressions.

A. Linear relations and equations.
B. Quadratic expressions and equations.
C. Polynomials and equations with powers.
D. Proportions and inverse proportions.

5. Functions.
A. Definition of a function.
B. Graphs of functions.
C. Operations with functions.
D. Summary for functions.

6. Exponential Growth, Financial Interest, and Sums.

A. Exponentials.
B. The base e.
C. Logarithm.
D. Exponential and logarithmic equations and models.
E. Financial series and exponentials.
7. Probability.
A. Probabilities of events.
B. Random variables.
C. Average and expected value.
D. Cumulative distribution.

8. Approximation.
A. Improved approximations.
B. Limits: exact approximations.
C. Continuity.
D. Slope and marginal rates.
E. Some differentiation rules.
F. Linearization.

9. Further Uses of Differentiation.

A. Derivatives of higher order.
B. Implicit differentiation: a use of the chain rule.
C. Related rates: another use of the chain rule.
D. Behavior of functions: increasing, decreasing, critical points, and concavity.

10. Optimization and Further Analysis.

A. Models using optimization.
B. Proportional rates of change.
C. Eventual behavior.

11. Rates of Change of Exponentials and Logarithms.

12. Differential Equations and Anti-Differentiation.

13. Definite Integrals, Area, and Accumulated Value.

A. The definite integral.
B. The fundamental theorem of calculus.
C. Some anti-differentiation or integration techniques.
D. Improper integrals.
14. An Extremely Brief Introduction to Functions of Several Variables.
A. Space and coordinates.
B. Parameterized paths.
C. Functions.
D. Velocity vectors.
E. Vectors and functions.

15. Optimization for Functions of Two Variables.

A. Critical points and curvature near them.
B. Constraints and level sets.
C. Relaxation of constraints and rates of return.

Index.
1. Quantitative Models.
A. Quantitative and Qualitative Variables.
Some items of interest to people involve numbers, such as the amount a buyer would
pay each month for a particular car after a particular down payment. The quantities
involved can, in such cases, be thought of as variables whose relations and amounts can
be considered.
Other items of interest do not involve numbers. For example, whether a particular
person prefers cherry pie over apple pie is a qualitative question.
Of course, sometimes there are connections among quantitative and qualitative variables,
and this is typically interesting. For example, it has been shown experimentally that
people choose different snacks when the analytic portion of their brain is heavily taxed.
In this experiment people were given a number to memorize and were asked to walk to
another room where they would repeat the number. On the way they were offered snacks
and those who were given longer numbers to memorize were much more likely to choose an
unhealthy snack. This might be taken as an interesting observation regarding qualitative
behavior, or one might try to model the relationship between preferences and mental tasks
more formally.

A quantitative variable is an item or symbol that represents the amount or value
of something.
Examples of quantitative variables include a person’s age, the temperature at a given
place and time, the distance of an airplane from an airport, the market value of a particular
commodity, the debt of a certain person or country, the length of a table, the height of a
mountain above sea-level, the speed of water in a river, the rate at which people save their
earnings in a country, and the size of a resistor on a printed electric circuit.

A qualitative variable is an item or symbol that represents a fact or preference but
not the amount of anything.
Examples of qualitative variables include a person’s gender, the color of a car, whether
a person is sensitive to heat, which colors a person sees, the shape of a knife (such as
serrated as opposed to smooth, or straight as opposed to curved), what brands a person
prefers (if any), the name assigned to a concept in a language or by different languages,
whether a person can swim, which archaeological sites a person would like to visit, the
sound made by a frog, and whether a certain plant can withstand frost.

A statistical variable is an item or symbol that describes the state of a process
or summarizes other variables. Statistical variables may summarize either quantitative or
qualitative variables.
An example of a statistical variable is the average number of boys born to a family in
a particular state or country. This average is a number, but it summarizes a qualitative
variable, namely whether the child is a boy or a girl. The type of pie preferred by the
largest number of people in the US is also a statistical variable.
Other examples of statistical variables include the losses or earnings of an investor at
a specified time, the average flow of a river at a certain place and time of year, the range
of heights of horses, the average worker productivity for a country, the median number of
years it takes to obtain a college degree, the minimum speed at which a bicycle can remain
upright for at least a minute, the total production of goods and services in a given country,
the average income for a resident of a particular county, and the percentage of the national
income received by the most wealthy ten percent of the population.

Example 1. Question: A person can afford car payments of 200 dollars a month.
The interest rate is 6 percent. The person wants to compute the maximum loan that she
or he can afford. Is the size of the loan a quantitative or qualitative variable? Explain.

Answer: The size of the loan is an amount (of money), so it is a quantitative variable.

Discussion: Quantitative variables always have a value and this value often changes
when other quantities change. In this example, if the size of the payment that the person
can afford goes up, then the loan they can afford also goes up. If the interest rate goes up,
then the loan the person can afford gets smaller.

Example 2. Question: A person is buying a car. The tulip-red version of the car
costs 170 dollars less than the quartz-blue version of the same car. Which car should the
person buy?

Answer: We cannot tell. This decision is based on the person’s preference over colors,
which is a qualitative variable. It also depends on how much that preference is worth to
the person, which is a quantitative variable.

Discussion: With more information one could answer the original question. For example,
if the person prefers the red over the blue, then certainly she or he would buy
the less expensive car. If the person prefers the blue over the red, then the question becomes
whether the difference in color is worth the difference in price. With a model of the
importance of color and of the additional cost, we could decide what the purchaser will
do.

Example 3. Question: If asked for the fastest vehicle on land, would the response be
a quantitative or qualitative variable?
Answer: The name or make or type of the fastest vehicle on land are qualitative
variables. (The speed of this vehicle is a quantitative variable but that answers a different
question.)

Example 4. Question: A person considers two options for roofing a house with a
roof that measures 1000 square feet. A metal roof costs 3 dollars a square foot, and costs
1000 dollars to install. A shingle roof costs 1 dollar a square foot, and costs 2000 dollars to
install. A metal roof will last for 50 years, while a shingle roof will last for 30 years. The
person finds the metal roof uglier than the shingle roof, but not too bad. Which of the
variables considered is qualitative and which is quantitative? Based on the quantitative
variables, what can you say about the cost of the roof on an annual basis?
Answer: Except for the appearance of the roofs in question (that the metal roof is
uglier) all the variables are quantitative.
The metal roof costs 3 × 1000 for materials and 1000 for its installation. So the total
cost of the metal roof is 4000. Since this roof lasts 50 years, its annual cost is 4000/50 = 80
(dollars per year). The shingle roof costs 1×1000 for materials and 2000 for its installation.
So the total cost of the shingle roof is 3000. Since this roof lasts 30 years, its annual cost
is 3000/30 = 100 (dollars per year).
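
The annual-cost comparison above can be sketched in a few lines of Python. This is only an illustration and not part of the original text: the function name `annual_cost` and its parameter names are assumptions chosen for readability, and the calculation simply spreads the total cost evenly over the life of the roof, ignoring interest and maintenance.

```python
# Annualized cost of each roofing option from Example 4.
# The function and argument names are illustrative, not from the text.

def annual_cost(area_sqft, price_per_sqft, install_cost, lifetime_years):
    """Total cost (materials plus installation) spread evenly over the roof's life."""
    total = area_sqft * price_per_sqft + install_cost
    return total / lifetime_years

metal = annual_cost(1000, 3, 1000, 50)    # 4000 dollars over 50 years
shingle = annual_cost(1000, 1, 2000, 30)  # 3000 dollars over 30 years

print(f"metal:   {metal:.2f} dollars per year")
print(f"shingle: {shingle:.2f} dollars per year")
```

On these numbers the metal roof is cheaper per year even though its total cost is higher; the qualitative variable (appearance) is deliberately left out of the calculation.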

Example 5. Question: The following represents the orders for soup at a restaurant
one evening. The order includes the type of soup ordered and the time at which the order
was placed.

Soup type :  chicken  onion  bean   onion  chicken  onion  bean
Time      :  4:15     4:50   5:32   6:01   6:20     6:33   6:45
Which of the variables involved is quantitative and which is qualitative? Find at least two
new variables that help describe the situation.
Answer: The variables described are the type of soup, which is a qualitative variable,
and the time of the order, which is a quantitative variable. One variable that describes
the situation is the number of soups of a certain type, such as the number of dishes of
chicken soup ordered (two of them), which is a quantitative variable. Another variable
that describes the situation is the type of soup that was ordered most frequently, namely
onion, which is a qualitative variable. Another variable that describes the situation is the
average time between orders. Since there were 6 new orders between 4:15 and 6:45, the
average time between orders was 2.5/6 = 5/12 hours (or 25 minutes).

Discussion: The descriptive variables that were mentioned above are all statistical
variables in that they summarize information about other variables.
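
The statistical variables in this example can be computed directly from the order list. The following is a minimal sketch, assuming the orders are stored as two parallel Python lists; the variable names and the helper `to_minutes` are hypothetical and serve only to illustrate the summary calculations.

```python
from collections import Counter

# Orders from Example 5, stored as two parallel lists (an assumed layout).
soups = ["chicken", "onion", "bean", "onion", "chicken", "onion", "bean"]
times = ["4:15", "4:50", "5:32", "6:01", "6:20", "6:33", "6:45"]

def to_minutes(clock):
    """Convert a time such as '4:15' to minutes after 0:00."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)

# Quantitative summary: how many orders of each type of soup.
counts = Counter(soups)

# Qualitative summary: the type of soup ordered most frequently.
most_common_soup, most_common_count = counts.most_common(1)[0]

# Quantitative summary: average time between consecutive orders, in minutes.
minutes = [to_minutes(t) for t in times]
gaps = [later - earlier for earlier, later in zip(minutes, minutes[1:])]
average_gap = sum(gaps) / len(gaps)

print(counts["chicken"], most_common_soup, average_gap)
```

Seven orders leave six gaps between consecutive orders, which is why the total elapsed time is divided by 6 rather than by 7.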

Example 6. Question: Is shoe size a quantitative variable or a qualitative variable?

Answer: In any given system of measurement (such as those developed in the US, UK,
or EU), the shoe size is an amount, in that the larger size corresponds to a longer foot, so
we could think of shoe size as a quantitative variable.
Discussion: This is a tricky variable because the correspondence between the shoe’s
labelled size and other measurable quantities is not exact. Different manufacturers label
their shoes with different sizes, and length and volume vary among shoes with the same
size.

EXERCISES.
1. A person saves 10 percent of her or his income. The person’s income is 2000 dollars a
month. Is the size of the person’s savings a quantitative or qualitative variable? Explain.
2. A person is deciding on retirement plans and is considering investing with one of three
funds. One fund promises a rate of return of 3 percent, the second has a variable rate of
return, and the third invests in government bonds. Is the person’s choice of an investment
a quantitative or qualitative variable? Explain.
3. Is the probability of an accident acceptable to a company a quantitative or qualitative
variable? Explain.
4. A person considers two types of insulation for a house whose walls measure 3,000 square
feet. Fiberglass mats cost 1 dollar a square foot, and cost 200 dollars to install. Blown-in
insulation costs 0.70 dollars a square foot, and costs 2000 dollars to install. The person will
be warmer with the blown-in insulation. Which of the variables considered is qualitative
and which is quantitative? Can you think of a way to combine the variables so that the
final decision is based only on costs?
5. The following represents the recreational activities of 10 friends.

Activity             :  running  hiking  skiing  bicycling  canoeing  fishing
Number participating :  4        6       5       7          8         2

Which of the variables involved is quantitative and which is qualitative? What is the
probability that one of the friends is a bicycle rider?
6. Is the risk that a person is willing to accept as part of a recreational activity (such as
climbing mountains, boating, or riding an ATV) a quantitative variable or a qualitative
variable?
7. A fire department classifies the situations to which it responds as a false alarm (FA), a
personal medical emergency (PM), a medical emergency resulting from an accident (MA),
an electrical fire (EF), a fire with hazardous chemicals present (HF), a vehicle fire (VF),
or a structural fire (SF). During a period of 50 days, the following occurred:

Classification   :  FA   PM  MA  EF  HF  VF  SF
Number occurring :  152  55  99  44  8   21  70
Which of the variables involved is quantitative and which is qualitative? Create at least
one statistical variable associated with this setting.

B. Relations Among Variables.

Interesting questions arise, typically, when variables change together. When quantitative
variables change together, we say that there is a relation among those variables.
For example, it has been suggested that when an earthquake is about to occur the
movement in the rocks sometimes releases radon gas, which in turn interacts with gases
in portions of the atmosphere and changes the pattern of ionization in the atmosphere.
At the time of this writing, it is not known whether these changes always occur together,
and a precise relation between the timing of an earthquake and the rise in ionization has
not been established.

In order to discuss the relations among variables we often use symbols to represent
the value of the variable. For example, when we write that z is the width of a stripe on a
zebra we might not have a particular stripe on a particular zebra in mind, and we do not
know the numerical value of the width at the moment. However, any relation having to
do with the width of a stripe on a zebra will be a relation having to do with z.

We refer to a symbol as a variable when it is to be replaced by numbers. For
instance, if x is the age of a person, then for any particular person there is a particular
value (a particular number) for x. As another example, if y is the maximum heart rate of
an individual, then for any particular person there is a value for y.

A variable y is a function of a variable x when for each value of x there is a corresponding
value of y. We also call the rule that assigns the value to y the function and
write y = f(x). In this notation x and y are variables, and f is the label we’ve given to
the function computing y from x.

For example, if x is the age of a person and y is the maximum heart rate of this
person, then there is some dependence of heart rate on age, so y is a function of x.
For another example, let x be the length of a person’s head (chin to top of forehead)
and y the person’s arm length. One could suggest that the length of the arm is three times
the length of the head; symbolically, y = f(x) = 3x. Artists use this kind of information
in their drawings.
When y = f(x) we call y the dependent variable and we call x the independent
variable.

Example 1. Suppose that when x is the length of a person’s head and y is the
person’s arm length, y = 3x. What is the length of the person’s arm if the person’s head
is 8.5 inches long? What is the length of the person’s arm if the person’s head is 9 inches
long? What is the length of the person’s arm if the person’s head is 3/4 of a foot in length?
According to the relation given, a person whose head is 8.5 inches long would have an
arm that is 25.5 inches long, a head of length 9 inches corresponds to an arm of length 27
inches, and a head of length 3/4 foot corresponds to an arm length of 9/4 feet or 2.25 feet.
Discussion: The head of length 9 inches and that of 3/4 foot have the same length,
and so do the arms of length 27 inches and 2.25 feet. A proportional relation does not
change when we change the units used to describe the lengths (unless we change the units
for one variable but not for the other).
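
The point about units can be checked numerically. The sketch below is an illustration only: the function name `arm_length` and the constant `INCHES_PER_FOOT` are assumptions introduced here, and the rule y = 3x is the one suggested in the example, not a measured fact.

```python
# The proportional relation y = 3x from Example 1.
# Names are illustrative; the factor 3 is the one suggested in the text.

INCHES_PER_FOOT = 12

def arm_length(head_length):
    """Arm length in the same unit as the head length that is passed in."""
    return 3 * head_length

arm_in_inches = arm_length(9)      # head measured in inches
arm_in_feet = arm_length(3 / 4)    # the same head measured in feet

print(arm_length(8.5))                     # arm for an 8.5 inch head, in inches
print(arm_in_inches)                       # inches
print(arm_in_feet)                         # feet
print(arm_in_feet * INCHES_PER_FOOT)       # the feet result converted to inches
```

Because the same function is used whether the input is in inches or in feet, the two results describe the same arm once they are converted to a common unit.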

Example 2. Let A be the area of a square and let s be the length of the side of the
square. Write A as a function of s.
The relation is $A = s^2$.
To illustrate the relationship numerically we choose a few values for the length of the
side and calculate the area:

s 3 4.2 1 0 2 5 11
A 9 17.64 1 0 4 25 121

Discussion: This mathematical expression makes sense for any value of s, but the
relation between area and length only makes sense for s ≥ 0 since the length of the side of
a square is positive (or zero, if we shrink the square to nothing).

Example 3. Let A be the area of a square and let s be the length of the side of the
square. Write s as a function of A.
The relation is $s = \sqrt{A}$, for $A \ge 0$.
Numerically:

A 9 17.64 1 0 4 25 121
s 3 4.2 1 0 2 5 11

Discussion: This relation makes sense only if A ≥ 0, which is fine because the area of
a square is never negative.
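
Examples 2 and 3 describe the same relation with the roles of the variables exchanged. A minimal sketch of both directions follows; the function names are hypothetical, and the checks for negative values reflect the restrictions s ≥ 0 and A ≥ 0 discussed above.

```python
import math

def area_from_side(s):
    """Area of a square with side length s (s must not be negative)."""
    if s < 0:
        raise ValueError("the side of a square cannot be negative")
    return s ** 2

def side_from_area(a):
    """Side length of a square with area a (a must not be negative)."""
    if a < 0:
        raise ValueError("the area of a square cannot be negative")
    return math.sqrt(a)

# Tabulate both directions for the side lengths used in the examples.
for s in [3, 4.2, 1, 0, 2, 5, 11]:
    a = area_from_side(s)
    print(s, round(a, 2), round(side_from_area(a), 2))
```

Applying one function and then the other returns the starting value (up to rounding), which is what it means for each variable to be a function of the other here.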

Example 4. The temperature measured in centigrade is 0 when water freezes. The
temperature measured in Fahrenheit is 32 when water freezes. When the temperature
increases by 1 degree on the centigrade scale, the temperature increases by 1.8 degrees on
the Fahrenheit scale. Express the relation between the two measures symbolically.
Let C denote the temperature in centigrade and F denote the temperature in Fahrenheit.
Then F = 32 + 1.8C.
Discussion: Let us check that the relationship agrees with the original description. If
C = 0, then F = 32 + 1.8 × 0 = 32 + 0 = 32. If C increases by 10, then F should increase
by 18: for C = 10, F = 32 + 1.8 × 10 = 32 + 18 = 50. So this also is correct. A few values
are:

C -40 -20 0 10 20 30 40
F -40 -4 32 50 68 86 104
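
The table of values can be reproduced with a short calculation. This is a sketch under the relation F = 32 + 1.8C derived above; the function name is an assumption made for this illustration.

```python
def fahrenheit_from_centigrade(c):
    """Apply the relation F = 32 + 1.8 C from Example 4."""
    return 32 + 1.8 * c

centigrade_values = [-40, -20, 0, 10, 20, 30, 40]
table = [(c, fahrenheit_from_centigrade(c)) for c in centigrade_values]

for c, f in table:
    print(f"C = {c:4d}   F = {f:6.1f}")
```

The first row shows that the two scales give the same reading at 40 degrees below zero.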

Example 5. Suppose that the price, in dollars, that consumers are willing to pay for
each item when q items are available is p = 100 − 0.2q. Which variable is being used as the
dependent variable? Which as the independent variable? For what values of the variables
does the relation make sense?
The quantity q is the independent variable, and the price p is the dependent variable.
Since the quantity of items must be zero or positive, q ≥ 0. Since the price, interpreted
with its usual meaning, only makes sense if it is zero or positive, p ≥ 0 as well.
Discussion: The restriction p ≥ 0 can also be interpreted as a restriction on the
quantity q. You can check that if q = 500 then p = 0, so q ≤ 500 must be satisfied (for
larger q the price becomes negative so the model no longer makes sense).
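
The restrictions on q and p can be written as a simple check. The sketch below assumes the relation p = 100 − 0.2q from the example; the function names `price` and `in_domain` are hypothetical.

```python
def price(q):
    """Price consumers are willing to pay when q items are available."""
    return 100 - 0.2 * q

def in_domain(q):
    """True when both the quantity and the resulting price are zero or positive."""
    return q >= 0 and price(q) >= 0

for q in [0, 100, 500, 501]:
    print(q, price(q), in_domain(q))
```

The check fails for q = 501 because the price would be negative, in agreement with the restriction q ≤ 500.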

Example 6. Consider the average annual savings for residents of a certain country
and the average income in the country. Which of the following relations makes sense for
the savings, S, and the income, I? (a) $S = 5I$, (b) $S = 0.03I$, (c) $S = 10,000 - 0.1I$,
(d) $S = 0.1\sqrt{I}$, (e) $S = 0.1I^2$.
We explore a few of these relationships numerically to see whether they make sense.

For (a):
I 10,000 20,000 50,000 100,000 1,000,000
S 50,000 100,000 250,000 500,000 5,000,000
We expect that only part of the income will be saved, so (a) does not make sense.
That is, the savings cannot be greater than the income. A similar check shows that for (e)
the amount saved becomes larger than income if income exceeds 10, so (e) also does not
make sense.
For (c):
I 10,000 20,000 50,000 100,000 1,000,000
S 9,000 8,000 5,000 0 -90,000

As income rises we expect the savings would also rise. So the relation (c) does not
make sense. The relations in (b) and (d) both make some sense.
For (b):
I 10,000 20,000 50,000 100,000 1,000,000
S 300 600 1,500 3,000 30,000

The relation in (b) makes some sense, even though we do not have a reason for it. A
similar check shows that the relation in (d) also makes some sense.
Discussion: One can imagine different relations between savings and income under
unusual or different circumstances. For example, if income rises and people think that this
is part of a trend, they might actually spend some of their savings, expecting to make the
difference up in the future.
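
The numerical checks in this example can be automated. The sketch below tabulates all five candidate relations at the income levels used above and applies the two expectations stated in the text: savings should not exceed income (or be negative), and savings should rise as income rises. The names `relations`, `is_plausible`, and `plausible` are assumptions made for this illustration, and passing these two tests shows only that a relation is not ruled out, not that it is correct.

```python
import math

incomes = [10_000, 20_000, 50_000, 100_000, 1_000_000]

# The five candidate relations between savings S and income I.
relations = {
    "a": lambda i: 5 * i,
    "b": lambda i: 0.03 * i,
    "c": lambda i: 10_000 - 0.1 * i,
    "d": lambda i: 0.1 * math.sqrt(i),
    "e": lambda i: 0.1 * i ** 2,
}

def is_plausible(rule):
    """Check the two expectations from the text at the tabulated incomes."""
    savings = [rule(i) for i in incomes]
    within_income = all(0 <= s <= i for s, i in zip(savings, incomes))
    rises = all(s1 < s2 for s1, s2 in zip(savings, savings[1:]))
    return within_income and rises

plausible = sorted(label for label, rule in relations.items() if is_plausible(rule))

for label, rule in relations.items():
    print(label, [round(rule(i), 2) for i in incomes])
print("not ruled out:", plausible)
```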

Example 7. Suppose that during each year the amount saved by the residents of a
country is 2% of income. Write the amount saved as a function of the income.
Let S denote the amount saved and let I be income (both for the same year). Then
S = 0.02 I.

Example 8. Suppose that when a certain clinic treats 500 patients a month the
number of nurses, $N$, and the number of medical doctors, $M$, have to satisfy the relation
$N + N \cdot M^2 = 500$. If the clinic plans to employ 10 nurses (and still plans to treat 500
patients), how many doctors must the clinic plan to employ?
Since $N = 10$, the number of doctors must satisfy $10 + 10 \cdot M^2 = 500$. So
$10 \cdot M^2 = 500 - 10 = 490$ and $M^2 = 49$. There must be 7 doctors, because $7^2 = 49$
(and 7 is the only positive number for which this is true).

Discussion: In this example the relation between the number of nurses and the number
of doctors was described by an equation rather than by a function. In general, it might be
possible to describe one variable as a function of the other, and it might even be possible
to choose which variable will be the independent variable. We will return to this topic
after examining methods for solving equations. In the current example, as it turns out,
$N = 500/(1 + M^2)$ and $M = \sqrt{(500 - N)/N}$, so the number of nurses can be written as
a function of the number of doctors and also the number of doctors can be written as a
function of the number of nurses.
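
Both ways of solving the clinic equation can be written as functions. This is a minimal sketch that assumes the relation N + N·M² = 500 from the example; the function names and the `patients` parameter are hypothetical additions for illustration.

```python
import math

def doctors_needed(nurses, patients=500):
    """Solve nurses + nurses * M**2 = patients for the positive value of M."""
    if nurses <= 0 or nurses > patients:
        raise ValueError("need 0 < nurses <= patients for a solution")
    return math.sqrt((patients - nurses) / nurses)

def nurses_needed(doctors, patients=500):
    """Solve N + N * doctors**2 = patients for N."""
    return patients / (1 + doctors ** 2)

print(doctors_needed(10))   # doctors when the clinic employs 10 nurses
print(nurses_needed(7))     # nurses when the clinic employs 7 doctors
```

For most inputs the results are not whole numbers, so in practice a clinic would have to round to a staffing level that still satisfies its needs.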

EXERCISES.

1. Suppose that the population of a certain country is growing by 2 percent each year. (a)
If the initial population is 10 million, what is it after 1 year? (b) What is the population
after 2 years? (c) If s is the size of the population at the start and p is the size of the
population a year later, how are the two variables related?

2. If a clinic employs N nurses and M doctors, then it can serve $N + N^2 \cdot M$ patients. (a)
Suppose a clinic has 8 nurses and 12 doctors. How many patients can this clinic serve?
(b) If the clinic can hire one more nurse, how many patients will it be able to serve? (c)
If the clinic can hire one more doctor, how many patients will it be able to serve?

3. Personal income in a country is, by definition, the country’s total income divided by
the country’s total population. Suppose that a country has 50 million people (a million
is 1,000,000) and the total income is 100 billion U.S. dollars (in the U.S. a billion is
1,000,000,000; the name has historically referred to a different number in the U.K.). (a) Calculate the
personal income in this country. (b) Suppose the total income remains the same but the
population increases by 1 million (say, during the following year) and calculate the personal
income at the end of the year. (c) Suppose the total income increases by 2 billion and the
population remains the same and calculate the personal income at the end of the year.
(d) Suppose the population increases by 1 million and that the total income increases by
2 billion and calculate the personal income at the end of the year.

4. Worker productivity increases with investment. Let K denote the amount invested in
a factory (say in dollars per year) and let f denote the total production (number of units
made each year). Assume that the total amount of labor available (the number of workers
and hours) is not changed. Which of the following relations make sense? Explain which
relations are in agreement with the first statement and which make the most sense. (a)
$f = 0.1\sqrt{K}$, (b) $f = 1000 + 0.03K$, (c) $f = 50,000 - 0.01K$, (d) $f = 500 + \sqrt{K} - 0.2K^2$.

5. Factory productivity increases with investment and also increases with labor. Let
K denote the amount invested in a factory (say in dollars per year), let L denote the
amount of labor available (say in hours per year), and let f denote the total production
(number of units made each year). Which of the following relations make sense? Explain
which relations are in agreement with the first statement and which make the most sense.
(a) $f = 0.1\sqrt{K}\sqrt{L}$, (b) $f = 1000 + 0.03K + 0.02L$, (c) $f = 50,000 - 0.1K + 0.3L$, (d)
$f = 500 + \sqrt{K} - 0.003L^2$.

6. Denote the total capital available in a country by K, the total population in the country
by P , and the total income in the country by I. Suppose that P is unchanged and K is
increased. How would you expect I to change? In particular, why might I increase or
decrease? Explain.

7. Suppose that the price that consumers are willing to pay for a bicycle is p = 10,000 − 0.01q,
where q is the number of bicycles produced. (a) When only 1,000 bicycles are made,
how much are consumers willing to pay for each of them (notice, this requires only the top-
paying 1,000 consumers)? (b) When 900,000 bicycles are made, how much are consumers
willing to pay for each of them?

8. Suppose that the price that producers are willing to accept for a bicycle is m =
500 + 0.04q, where q is the number of bicycles produced. (a) When only 1,000 bicycles are
made, how much are producers willing to accept for each of them (notice, this requires only
the 1,000 manufacturers most willing to produce bicycles)? (b) When 100,000 bicycles are
made, how much are producers willing to accept for each of them?

9. Suppose that producers and consumers of bicycles behave as in problems 7 and 8. That
is, consumers will pay p = 10,000 − 0.01q when q bicycles are available and producers
require a price of m = 500 + 0.04q. (a) Suppose 100,000 bicycles are produced. What
price will producers set, and will consumers buy all of the bicycles? (b) Suppose 160,000
bicycles are produced. What price will producers set, and will consumers buy all of the
bicycles? (c) Suppose 200,000 bicycles are produced. What price will producers set, and
will consumers buy all of the bicycles?

10. The number of eggs that a robin lays depends on the fat reserves that the bird
has accumulated. Suppose that f is the fat accumulation, and e is the number of eggs
laid. Suppose that in nature 3 < f < 20 (grams) and 1 ≤ e ≤ 9. Tabulate the following
relationships and decide whether they agree with the description given (the number of eggs
in nature is a whole number, so the relationships are approximate). (a) $e = -0.1 + 0.45f$,
(b) $e = 11 - 0.73f$, (c) $e = -1.4 + 0.9f - 0.02f^2$.

C. Models and Decisions.

Mathematical models consist of relations among variables that are used to answer a
question about the setting being described.
Mathematical models may include several variables and relations, and the relations
may be described analytically – through functions or equations, graphically, numerically,
or qualitatively.

Example 1. To illustrate the ingredients in a model we discuss an economic model
of a market for one good. The description consists of the behavior of producers (sellers of
the good), the behavior of consumers (buyers), and of combining these to describe what
actually happens in the market. That is, the questions are how much of the good changes
hands and what is the market price of the good.
Since the behavior of a market for any particular good is expected to share features
with the market for many other goods, it makes sense to describe the behavior of producers
and consumers using general features. People often describe such general behavior pictorially
by graphing the functions (rather than giving specific symbolic relations). Graphs
are pictorial representations of numbers (as we will see later), so in this section we use
numerical examples.
The variables of interest are the quantity of the good and its price. One would expect
producers, as a whole, to provide more of the item if the price is higher. That is, new
producers will enter production and existing producers will increase production only if the
market price is sufficiently high. So the price and the quantity produced might have values
similar to the following.
For producers:
quantity 2,500 5,000 10,000 20,000 40,000
price 100 150 250 600 1,500
One would expect consumers, as a whole, to require a lower price if they are to buy
more of the item. That is, new consumers will purchase the item and existing consumers
will increase consumption only if the market price is sufficiently low. So the price and the
quantity purchased might have values similar to the following.
For consumers:
quantity 2,500 5,000 10,000 20,000 40,000
price 1,000 450 250 200 150
A market exchange requires agreement between consumers and producers on the quantity
and price. So the quantitative version of the question is “find the quantity (of the
good) for which producers and consumers agree on the price”.

A look at the two tables of values shows that consumers and producers agree on
both the quantity and price when 10,000 units are exchanged at a price of 250 dollars an
item. So 250 dollars would be the market price, and 10,000 units would be produced and
purchased. This answers our original questions.
Discussion: In our example, if consumers were to buy 20,000 units then the price
would have to be 200 dollars per item. But producers would not be willing to produce
20,000 units for any less than 600 dollars an item. Hence there will not be agreement on
exchanging 20,000 units.
On the opposite side, if an exchange of 5,000 units is considered, then producers would
supply the items for 150 dollars each while there are enough consumers willing to pay 450
dollars an item for each of these 5,000 units. So producers will be glad to produce 5,000
units for any price at or above 150 and consumers will buy 5,000 units for any price below
450, which means that more than 5,000 units would be produced, and the price would be
between 150 and 450, but agreement on a quantity and price requires further examination.

As is often the case, models depend on our assumptions and give us some insight into
the process being described. In the market example, the assumption was that producers
and consumers behave according to the values in the tables. The insight gained is not
only the answers to the questions regarding the market price and quantity but also the
observation that some consumers pay less per item than they would be willing to pay and
that some producers get more per item than they are willing to accept. This shows that
there are economic benefits to the exchange, that is, it makes some consumers and some
producers better-off than they would be without the exchange.

Example 2. Next we decide the optimal time for replanting an orchard. The variables
involved are the time since the orchard was planted and the total amount produced during
the life of the orchard. The table below gives some data for the two variables, where time
is measured in years and production is measured in tons. Notice that it makes sense that
the orchard starts producing slowly when the trees are small, and that the production per
year increases at first but decreases when the orchard gets older (this varies by the type
of tree – oranges or almonds or apples decrease production more rapidly than olives or
grapes or walnuts).
Before we consider the data, we should decide on an approach. What determines the
optimal replanting time? The orchard will be in production for many years and should
be managed to have the maximum yield per year, or said differently, it should have the
maximum average yield during each cycle. Hence we should maximize the annual yield
(the yield per year) where the independent variable is the duration of the cycle, that is the
number of years before the orchard is replanted. In other words, the quantitative question
is “at what replanting time is the average yield over the replanting cycle the greatest?”
Orchard data:
year 6 12 18 30 42 54 66
total production 0 4 10 17 22 26 29
From the orchard data table we can calculate the annual production rate over the life
of the orchard by dividing the total produced by the number of years that elapsed. For
example, during the first 18 years the amount produced was 10 tons and the elapsed time was
18 years, so the annual production was 10/18 ≈ 0.556 tons per year.
Orchard’s averages:
period to 12 to 18 to 30 to 42 to 54 to 66
annual production 4/12 10/18 17/30 22/42 26/54 29/66

From the orchard’s averages table we see that after 18 years annual production is about
0.556 tons per year, after 30 years production increases to about 0.567 tons per year, and
after 42 years annual production decreases to about 0.524 tons per year. Based on this
information the average yield is greatest for a cycle of about 30 years. Since the average
at 30 years exceeds the averages at both 18 and 42 years, the orchard should be replanted
after more than 18 years but less than 42 years.
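The averages can be recomputed with a few lines of Python. This is a sketch that uses only the tabulated data; the variable names total_production, average_yield, and best_years are ours. Exact fractions are used so that no rounding enters before the comparison.

```python
from fractions import Fraction

# Orchard data from the example: years since planting -> total production (tons).
total_production = {6: 0, 12: 4, 18: 10, 30: 17, 42: 22, 54: 26, 66: 29}

# Average yield per year for a replanting cycle of each tabulated length.
average_yield = {
    years: Fraction(tons, years) for years, tons in total_production.items()
}

# The tabulated cycle length with the greatest average yield.
best_years = max(average_yield, key=average_yield.get)

for years, average in average_yield.items():
    print(f"{years:2d} years: {float(average):.3f} tons per year")
print("largest tabulated average at", best_years, "years")
```

The program can only compare the cycle lengths that appear in the table; it says nothing about lengths between the tabulated ones.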

Example 3. This example concerns consumer behavior in markets in which goods are
exchanged – bought and sold. A good could be concrete, such as tomatoes measured in,
say, pounds, or more vaguely defined, such as good health, or a service, such as arranging
for a sale or a loan.
Suppose that a consumer chooses among different combinations of goods based on
a preference called a “utility”. The consumer chooses the combination that yields the
greatest utility.
Case 1.
Let a denote the quantity (in pounds) of apples purchased by Mr. Appetite in a week,
while b is the number of baseball games that he attends. Suppose that a pound of apples
costs $1.59 while a baseball game costs $12. Suppose Mr. Appetite has $40 to devote to
apples and baseball games each week. Assume that Mr. Appetite’s utility for apples and
baseball is represented by
u(a, b) = a + 6b.

Clearly Mr. Appetite can afford at most 3 baseball games. Here are his options
Number of games pounds of apples (rounded) utility (rounded)

0 25.16 25.16
1 17.61 23.61
2 10.06 22.06
3 2.52 20.52
Mr. Appetite would choose to spend all his budget on apples, buying 25.16 pounds. We
could also say that when apples cost $1.59, Mr. Appetite purchases 25.16 pounds.
When a baseball game costs $12 but the price of apples rises to $1.70, Mr. Appetite
would still spend all $40 on apples (how do we know this?!). But now he will only buy
23.53 pounds.
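The table of options can be reproduced with a short program. This is a sketch under the assumptions of the example (whole numbers of games, and the entire budget spent); the function names options and best_option are ours.

```python
BUDGET = 40.00      # dollars per week
GAME_PRICE = 12.00  # dollars per baseball game


def utility(apples, games):
    """Utility from the example: u(a, b) = a + 6b."""
    return apples + 6 * games


def options(apple_price):
    """List (games, pounds of apples, utility) for each affordable number of games."""
    rows = []
    games = 0
    while games * GAME_PRICE <= BUDGET:
        # Whatever is left after paying for the games is spent on apples.
        apples = (BUDGET - games * GAME_PRICE) / apple_price
        rows.append((games, apples, utility(apples, games)))
        games += 1
    return rows


def best_option(apple_price):
    """Return the option with the greatest utility."""
    return max(options(apple_price), key=lambda row: row[2])


for games, apples, u in options(1.59):
    print(f"{games} games, {apples:6.2f} pounds of apples, utility {u:6.2f}")

print(best_option(1.59))
print(best_option(1.70))
```

Calling options with a different apple price gives the corresponding table for that price.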
Questions. a. How much do apples need to cost per pound before Mr. Appetite goes
to a baseball game (assuming a game still costs $12)? b. If the price of apples rises above
the threshold you discovered in response to question (a), what does Mr. Appetite do? c.
Is this behavior believable? (this question asks whether our utility function is realistic).
Case 2.
Let h denote the quantity (in pounds) of honey purchased by Ms. B in a year, while
y is the quantity (in pounds) of yams that she buys. Suppose that a pound of honey costs
$3 while a pound of yams costs $1.10. Suppose Ms. B has $50 to devote to honey and
yams each year. Assume that Ms. B’s utility for these goods is represented by

u(h, y) = h · y.

Ms. B can afford up to approximately 16.67 pounds of honey, and up to about 45.45 pounds
of yams. Here are some of her options:

honey yams (rounded) utility (rounded)

0 45.45 0
3 37.27 111.82
6 29.09 174.55
9 20.91 188.18
12 12.73 152.73
16.66 0.018 0.303

Based on the calculation above, Ms. B will purchase between 6 and 12 pounds of honey
with the corresponding amount of yams (why is it not safe to assume that the maximum
utility is reached between, say, 9 and 12 pounds of honey?).
To improve our understanding of Ms. B’s decision we should refine our test values.
After some experimentation, we calculated the following

honey yams (rounded) utility (rounded)
8.2 23.09 189.35
8.3 22.82 189.39
8.4 22.55 189.38
8.5 22.27 189.32
It appears that Ms. B will purchase approximately 8.3 pounds of honey and 22.82 pounds
of yams.
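The experimentation with test values can be automated. The following sketch assumes, as in the tables, that Ms. B spends her entire budget, so the amount of yams is determined by the amount of honey; the names yams_affordable, candidates, and best_honey are ours.

```python
BUDGET = 50.00      # dollars per year
HONEY_PRICE = 3.00  # dollars per pound
YAM_PRICE = 1.10    # dollars per pound


def yams_affordable(honey):
    """Pounds of yams bought with what the honey leaves of the budget."""
    return (BUDGET - HONEY_PRICE * honey) / YAM_PRICE


def utility(honey):
    """Utility u(h, y) = h * y when the whole budget is spent."""
    return honey * yams_affordable(honey)


# Test values 8.0, 8.1, ..., 9.0 pounds of honey.
candidates = [8 + k / 10 for k in range(11)]
best_honey = max(candidates, key=utility)

for honey in candidates:
    print(f"{honey:4.1f}  {yams_affordable(honey):6.2f}  {utility(honey):7.2f}")
print("best test value:", round(best_honey, 1))
```

A finer list of candidates (steps of 0.01, say) narrows the answer further, at the cost of more calculations.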
Questions. a. How much honey does Ms. B purchase if the price of honey rises to $3.10
(with the price of yams remaining the same)? b. How many pounds of yams does Ms. B purchase if
the price of honey rises to $3.10? c. Is this behavior believable?

EXERCISES.

1. We are interested in income and population growth in developing countries. It seems
that as income rises (from an initially very low level) the population’s health would improve
since people will have greater access to medical care. Thus the rate at which the population
is growing will initially increase with income (more live births and fewer deaths). The
income is often measured in dollars per year for households, on average, and the population’s
growth rate is often measured as the difference between live births and deaths per
100,000 people. As income increases further, the population growth rate often decreases,
because people depend on their ability to save for their well-being rather than depending
on their children. Decide on values for the following table that agree with the description
(many values are possible):
Population and income:
income 100 500 1,000 10,000 50,000 100,000
population growth 1,000 ___ ___ ___ ___ 100

2. We are interested in savings and population growth in developing countries. It seems
that as income rises (from an initially very low level) the rate at which people save would
increase. The income is often measured in dollars per year for households, on average, and
the savings rate is often measured as the percentage of earnings saved. Decide on values
for the following table that agree with the description and with your ideas about the way
in which the savings rate might change.
Income and the savings rate:
income ($) 100 500 1,000 10,000 50,000 100,000
savings (%) 0 8

3. Convert your table from exercise 1 into percentages. For example, at an income of 100
there are 1,000 additional people for each group of 100,000, so the population’s growth
rate is 1,000/100,000 = 0.01 = 1% and for an income of 100,000 there are 100 additional
people for each group of 100,000, so the population’s growth rate is 0.1%.
4. Examining your tables for the second and third problems, does the rate of growth of
income ever keep up with the rate of growth of the population? What does this mean in
terms of the income available per person?

D. Summary.

Chapter 1 is about the setting for quantitative models.

Definitions included qualitative variables, quantitative variables, statistical
variables, independent variables, dependent variables, functions, and relations
among variables.
We used models that described the relations between variables numerically. We were
able to decide what happens in these settings after we converted the questions asked into
quantitative questions involving our variables.
2. Numbers and Calculations: A Review
A. Numbers.
The real numbers consist of points on a straight line called the number line. We
think of the line as a horizontal line from left to right. Each number on the number line
consists of a distance from zero and a direction. Positive numbers are positioned to the
right of zero and negative numbers are positioned to the left of zero.
Special types of numbers include the natural numbers, integers, rational numbers, and
irrational numbers.
The natural numbers consist of 1, 2, 3, . . .
The integers consist of the natural numbers, zero, and the negatives of the natural
numbers: . . . , −3, −2, −1, 0, 1, 2, 3, . . .
Rational numbers can be written as ratios of integers: r = m/n for integers m and
n with n ≠ 0. For example, 11/7 is a rational number, and 0.23 = 23/100 is a rational number.
Decimal numbers are numbers written in terms of 10 and its powers. For example
123 means 1 × 100 + 2 × 10 + 3. And 12.34 means 1 × 10 + 2 + 3 × (1/10) + 4 × (1/100).
In our examples 123 = 123/1 and 12.34 = 1234/100 so both these numbers are rational.
There are lengths that are not rational. For example a square with an area of 3 has
a side whose length is not rational.
The same number can be represented in different ways. For example 2/5 and 0.4 and
4/10 and 44/110 are the same number.

Example 1. To calculate with fractions we need a common unit. Find 1/2 + 1/3.
A solution would be
1/2 + 1/3 = 3/6 + 2/6 = 5/6.
Discussion: The goal was to add two fractions that are not comparable. A solution involved
making the two fractions comparable by writing them using the same unit, namely using
increments of 1/6. That meant using 6 as a denominator for both fractions.
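Calculations with fractions can be checked with exact rational arithmetic. The sketch below uses the Fraction type from Python's standard library, which finds the common denominator and reduces the result; the variable names are ours.

```python
from fractions import Fraction

half = Fraction(1, 2)
third = Fraction(1, 3)

# Fraction rewrites both numbers in sixths and adds them.
total = half + third
print(total)  # 5/6

# Different ways of writing the same number compare as equal.
same = Fraction(2, 5) == Fraction(4, 10) == Fraction(44, 110) == Fraction("0.4")
print(same)  # True
```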

Any number can be approximated by decimal numbers. We’ll discuss approximations
in much more detail later, but an example is that using 8 non-zero digits 1/3 ≈
0.33333333. The number of digits used in the approximation depends on the intended use
of the approximation.
In some contexts numbers are usually represented as percentages. The words “per
cent” mean out of one hundred. So 5 percent is 5/100 = 0.05. The usual notation is “%”,
so 5 percent is written 5%. The contexts in which percentages are used typically involve
proportional quantities, such as the amount of a tax as a proportion of the value of a sale,
or the amount of growth in the size of a population as a proportion of the size of the
population.
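As a small illustration of a percentage used as a proportion, the following sketch computes a tax as a percentage of a sale. The numbers (a 5% tax on a sale of 80 dollars) and the function name percent_of are ours and are purely hypothetical.

```python
def percent_of(rate_percent, amount):
    """Return rate_percent percent of amount; "per cent" means out of one hundred."""
    return amount * rate_percent / 100


# Hypothetical illustration: a 5% tax on a sale of 80 dollars.
tax = percent_of(5, 80)
print(tax)  # 4.0
```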

EXERCISES.
1. Write 2.3 in a way that demonstrates that it is a rational number.
2. Explain how you can tell without much calculation that 23/24 is greater than 16/17.
Hint: both numbers are close to 1.
3. Calculate (2/5) − (1/8) and write this result as a ratio of two integers.
4. Calculate (2/7) + (1/3) and write this result as a ratio of two integers.
5. Which is the larger of 3/7 and 5/11?
6. Write 3/8 as a decimal number.
7. Write −1.6 in a way that demonstrates that it is a rational number.
8. Explain how you can tell without much calculation that −21/11 is greater than −53/27.
Hint: both numbers are close to −2.
9. Calculate (7/5) − (1/8) and write this result as a ratio of two integers.
10. Calculate (11/6) + (4/7) and write this result as a ratio of two integers.
11. Which is the larger of −5/7 and −5/11?
12. Which is the larger of −7/8 and 1/131?

B. Operations with numbers.

For any two numbers we can take their sum and their product. We think of the sum as
resulting from combining the distances and directions of the two numbers, or alternatively
as represented by moving according to one of the numbers followed by moving according
to the other. For positive numbers we think of their product as the area of a rectangle
whose sides have the positive numbers as lengths.

Example 1. We think of the sum 4 + 7 = 11 as moving 4 units of length from zero
to the right, followed by 7 more units of length to the right, and thus ending 11 units to
the right of zero. We think of the sum 4 + (−7) = −3 as moving 4 units of length from
zero to the right, followed by 7 units of length to the left, and thus ending 3 units to the
left of zero which is the number −3.

Example 2. We think of the product 4 × 7 = 28 as the area of a rectangle with length
4 on one side and 7 on the other. The resulting area has 28 squares of size 1 × 1. If length
were measured in meters, then area would be measured in square meters.

For any number we can find its negative. That is, the number whose distance from
zero is the same but which lies on the opposite side of zero. Symbolically we write the
negative of x as −x. For example if x = 3.4, then −x = −3.4 and if x = −2 then −x = 2.
Addition is a commutative operation, that is the result remains the same even if we
reverse the order of the two numbers. Symbolically, x + y = y + x for any numbers x and
y. Addition is also associative, that is, when three or more numbers are added the result
does not depend on how they are grouped. Symbolically, (x + y) + z = x + (y + z).
The easiest way to decide whether some rule for numbers is valid is to check it with
a few numbers. This does not always lead to a correct decision, but it usually does if we
check enough numbers. For example, let us check whether (x + y)(x + z) = (y + x)(y + z).
We first try x = 1, y = 2, and z = 3, so the left hand side is (1 + 2)(1 + 3) = 3 × 4 = 12
and the right hand side is (2 + 1)(2 + 3) = 3 × 5 = 15. We conclude that the suggested rule
does not apply.
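Checking a rule with numbers is a task well suited to a computer. The sketch below tries every combination of a few sample values and reports the first combination for which the two sides differ; the function name find_counterexample and the choice of sample values are ours. Finding no counterexample does not prove a rule, it only fails to refute it.

```python
from itertools import product


def find_counterexample(left, right, values=(-2, -1, 0, 1, 2, 3)):
    """Return the first (x, y, z) among the sample values where the sides differ, or None."""
    for x, y, z in product(values, repeat=3):
        if left(x, y, z) != right(x, y, z):
            return (x, y, z)
    return None


# The suggested rule (x + y)(x + z) = (y + x)(y + z) fails for some values.
suggested = find_counterexample(
    lambda x, y, z: (x + y) * (x + z),
    lambda x, y, z: (y + x) * (y + z),
)
print(suggested)

# Associativity of addition holds for every sample, so no counterexample is found.
associative = find_counterexample(
    lambda x, y, z: (x + y) + z,
    lambda x, y, z: x + (y + z),
)
print(associative)  # None
```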
Multiplication is also a commutative operation and it too is associative.
Multiplication distributes over addition, that is u×(y +z) = u×y +u×z, but addition
does not distribute over multiplication. For instance, with u = 4, y = 2, and z = 3,
u × (y + z) = 4 × 5 = 20 and u × y + u × z = 8 + 12 = 20. However, u + (y × z) = 4 + 6 = 10
while (u + y) × (u + z) = 6 × 7 = 42, and 10 ≠ 42.

Further discussion. We did not list all of the properties of real numbers and operations
with numbers. The usual definition for zero is that x + 0 = x for any number
x, and the usual definition of the negative of a number, x, is that there is a number y with
x + y = 0, and we denote that special number y by −x.
With some care one can show that these definitions, together with the distributive
property, imply that 0 × x = 0 for any number x and that (−x) × y = −(x × y). Hence we can think
of multiplication by −1 as reversing the direction of the number: (−1) × x = −x. It
follows that multiplying by −1 twice reverses the direction twice, ending up in the original
direction: (−1) × (−x) = x.

EXERCISES.

1. Which of the following are positive for all numbers a?

(a) a × a, (b) (−a) × (−a), (c) −a × a, (d) (a + 3) × a.
2. True or false? For all numbers x, y, and z one has x(y + z) = xy + xz.
3. True or false? For all numbers x, y, and z, xy + z = xz + y.
4. True or false? For all numbers x, and y, one has x(x + y) = xy + x.
5. Suppose one starts at the point 2.4 on the number line and moves 3 units to the left.
Is the resulting position a positive number? If one moves 2 units to the left, is the result
a positive number?
6. How far does one have to move along the number line to get from 4 to 5.6? How far
does one have to move to get from 5.6 to 4?

C. Absolute Value.
The absolute value of a number is the distance of this number from zero along the
number line.
Computationally, if the number is negative then its absolute value is the negative of
the number and if the number is positive or zero then the absolute value of the number is the
same as the number.
Example 1. Calculate the absolute values of 1, 2.4, −3.2, 0, and −3.356.
Solution: The distance of 1 from zero is 1; the distance of 2.4 from zero is 2.4; the
distance of −3.2 from zero is 3.2; the distance of 0 from zero is 0; and the distance of
−3.356 from zero is 3.356.
Notation: We write the absolute value using two lines, one before and the other after the
number. So, for instance, the absolute value of −1.76 is written |−1.76|.
Example 2. Calculate the absolute values of 1.4, −4, −9.2, 11, and −56.
Solution: |1.4| = 1.4; |−4| = 4; |−9.2| = 9.2; |11| = 11; and |−56| = 56.

The distance between two numbers is the absolute value of their difference:

distance(a, b) = |b − a| .

Example 3. Calculate the distance between 7.3 and 5.6.

Solution: The distance is |5.6 − 7.3| = |−1.7| = 1.7.
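Absolute values and distances are available directly in most programming languages. In the Python sketch below the built-in function abs computes the absolute value, and distance is a helper name of ours. Decimal numbers are stored only approximately by a computer, so the result is rounded before it is printed.

```python
def distance(a, b):
    """Distance between two points on the number line: |b - a|."""
    return abs(b - a)


print([abs(v) for v in (1.4, -4, -9.2, 11, -56)])  # [1.4, 4, 9.2, 11, 56]
print(round(distance(7.3, 5.6), 10))  # 1.7
```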

EXERCISES.
1. Calculate the absolute values of 1.3, 11/24, −1.3, −13/15, 4 − 3, 3 − 4, 25 − 17.2,
17.2 − 25, 6 − 23.4, and −23.4 + 6.
2. Is it true that the absolute value of the difference between two numbers does not depend
on the order in which the difference is calculated? That is, is |a − b| = |b − a| for any two
numbers a and b?
3. Find all points whose distance from 5 is 2.
4. What is the ratio of the distance between 40 and 15 to the distance between 6 and 11?
5. Which is greater, the distance between 5 and 9 or the distance between −4 and −1?
6. Which is greater, the distance between 5 and 9 or the distance between −4 and 1?

D. Powers.
Our intuition for powers is greatly aided by calculating some examples by hand (rather
than by using a calculator). This also helps remember what the powers mean.
We define $b^x$ for three special cases, the first being the case where x is a positive
integer, the second being the case where x is zero or a negative integer, and the third being
the case where x is a ratio of two integers. This will make sense once we do it.
For a power that is a natural number, the base number is multiplied by itself the
specified number of times. Symbolically, $b^n = b \cdot b \cdots b$ where the right hand side has n
copies of b.
Example 1. Illustrating this with some numbers, we have

$2^3 = 2 \cdot 2 \cdot 2 = 8$, $1.5^2 = 1.5 \cdot 1.5 = 2.25$, $(-3)^4 = (-3) \cdot (-3) \cdot (-3) \cdot (-3) = 81$,

$1.1^4 = 1.1 \times 1.1 \times 1.1 \times 1.1 = 1.4641$,

$(-1)^3 = (-1) \cdot (-1) \cdot (-1) = -1$, $0.4^3 = 0.4 \cdot 0.4 \cdot 0.4 = 0.064$,

$(-1/3)^2 = (-1/3) \cdot (-1/3) = 1/9$, $(-0.1)^4 = 0.0001$, $5^3 = 125$.

From this definition, we can obtain two rules for powers, namely
$b^{n+m} = b^n \times b^m$, $(b^n)^m = b^{nm}$.
We think of these rules by considering the number of copies of the base, b, that appear.
In $b^{n+m}$ there are n + m copies of b and the same is true in $b^n \times b^m$. In $(b^n)^m$ there are
m copies of numbers each one of which contains n copies of b so there are nm copies of b
altogether.
Example 2. Continuing the theme of checking rules with numbers,

$3^2 \times 3^5 = 3^7$, since 9 × 243 = 2187, $9^4 = (3^2)^4 = 3^8$,

$0.5^3 \times 0.5^2 = 0.5^5$, since 0.125 × 0.25 = 0.03125, $(1/8)^2 = ((1/2)^3)^2 = (1/2)^6$.
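The two rules for powers can be checked by counting copies of the base, exactly as in the definition. The following sketch defines the power by repeated multiplication (the function name power is ours) and uses exact fractions so that no rounding interferes with the comparisons.

```python
from fractions import Fraction


def power(base, n):
    """b^n for a natural number n: multiply n copies of the base."""
    result = 1
    for _ in range(n):
        result *= base
    return result


half = Fraction(1, 2)

# First rule: b^(n+m) = b^n * b^m.
rule_sum = power(3, 2 + 5) == power(3, 2) * power(3, 5) == 2187
rule_sum_half = power(half, 3 + 2) == power(half, 3) * power(half, 2) == Fraction(1, 32)

# Second rule: (b^n)^m = b^(n*m).
rule_product = power(9, 4) == power(power(3, 2), 4) == power(3, 2 * 4)
rule_product_half = power(Fraction(1, 8), 2) == power(power(half, 3), 2) == power(half, 6)

print(rule_sum, rule_sum_half, rule_product, rule_product_half)  # True True True True
```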
For a natural number n and a positive base b, we think of $b^{1/n}$ by thinking about the
property
$(b^k)^m = b^{k \times m}$, so $(b^{1/n})^n = b^{(1/n) \times n} = b^1 = b$.
When $b^{1/n}$ is raised to the power n we get the base, b.

Example 3. $4^{1/2} = \sqrt{4} = 2$, $27^{1/3} = 3$ since $3^3 = 27$, $125^{1/3} = 5$, and $0.0001^{1/4} = 0.1$.
Caution. The interpretation of the power 1/n as a root requires, in general, that the
base be positive.
For a power that is the negative of a natural number, the result is 1 divided by the
base the specified number of times. One can think of this as a direct extension of the
rules for powers that we found for powers that are natural numbers, or one can note the
progression from higher to lower powers as shown immediately below.
Example 4. With the base 3, as the power decreases by 1, the result is divided by 3.
$3^4 = 81$, $3^3 = 27$, $3^2 = 9$, $3^1 = 3$, $3^0 = 1$, $3^{-1} = 1/3$, $3^{-2} = 1/9$, $3^{-3} = 1/27$, $3^{-4} = 1/81$.

Example 5. From the rules one would get $2^{-1} \times 16 = 2^{-1} \times 2^4 = 2^{-1+4} = 2^3 = 8$,
so $2^{-1} = 8/16 = 0.5$.
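The progression from higher to lower powers can be tabulated exactly. This sketch uses Python's Fraction type, which accepts negative integer powers and returns exact fractions; the names powers and steps_ok are ours.

```python
from fractions import Fraction

three = Fraction(3)

# Powers of 3 from 3^4 down to 3^(-4), kept as exact fractions.
powers = {n: three ** n for n in range(4, -5, -1)}
for n, value in powers.items():
    print(f"3^{n} = {value}")

# Each step down in the power divides the previous result by 3.
steps_ok = all(powers[n - 1] == powers[n] / 3 for n in range(4, -4, -1))
print(steps_ok)  # True
```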
We can combine the definitions above to calculate more complicated powers that are
of the form $b^{n/m}$ with n an integer and m a natural number.
Example 6. $4^{3/2} = (\sqrt{4})^3 = 2^3 = 8$, or alternatively $4^{3/2} = \sqrt{4^3} = \sqrt{64} = 8$. And
$27^{4/3} = 81$, $25^{-1/2} = (\sqrt{25})^{-1} = 1/5$, and $0.0001^{-3/4} = 0.1^{-3} = 10^3 = 1{,}000$.
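Powers with fractional exponents can be computed by following the definition: take the root first and then raise it to an integer power. The sketch below does this with ordinary decimal arithmetic, so the results are approximations and are compared using math.isclose rather than exact equality; the function name rational_power is ours.

```python
import math


def rational_power(base, n, m):
    """b^(n/m) for a positive base: take the m-th root, then raise it to the power n."""
    root = base ** (1 / m)
    return root ** n


examples = {
    "4^(3/2)": rational_power(4, 3, 2),
    "27^(4/3)": rational_power(27, 4, 3),
    "25^(-1/2)": rational_power(25, -1, 2),
    "0.0001^(-3/4)": rational_power(0.0001, -3, 4),
}
for name, value in examples.items():
    print(name, "is approximately", round(value, 6))

# The two routes in Example 6 agree: the cube of the root and the root of the cube.
same = math.isclose(math.sqrt(4) ** 3, math.sqrt(4 ** 3))
print(same)  # True
```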
Vexing problems. We are left with two problems. First, since a power can be
written in more than one way, do all the definitions agree? For instance do the definitions
for $4^{0.2}$ agree when we use 0.2 = 1/5 and when we use 0.2 = 2/10?
The second, much harder, problem is that of defining $b^x$ for a positive base b when x
cannot be written as the ratio of an integer and a natural number.

EXERCISES.

1. Calculate the following.

(a) $2^5$, (b) $2^{(1+3)}$, (c) $(2^4)^3$, (d) $1.3^2$, (e) $(2 \times 3)^3$, (f) $1.1^5 / 1.1^3$.

2. Write each of the following as a single power of 3.

(a) 9 × 9, (b) 27/3, (c) $9^4$, (d) 81, (e) $81^5$, (f) $3^{0.5 \times 4}$.

3. Calculate the following, and say which properties of exponentiation can be used to
check that the calculation is correct.
.
−3 −1 −3 −1 3
5
, (e) (2 × 3) , (f ) 2 2−1 .
−2 3

(a) 2 , (b) 3 , (c) 2 × 2 , (d) 2
4. Write each of the following as an integer or using only a combination of integers and
square roots.

(a) $4^{1/2}$, (b) $8^{1/3}$, (c) $3^{1/2}$, (d) $81^{1/2}$, (e) $8^{1/2}$, (f) $8^{5/3}$.

5. Write each of the following as a single power of 3.

(a) $9 \times 9^{1/2}$, (b) $27^{1/3}$, (c) $9^{2.5}$, (d) $81\sqrt{3}$, (e) $81^{0.25}$, (f) 9/81.

E. Rules for variables.

We reviewed the properties of numbers because quantitative variables are numbers
(whose value is not known or fixed). Thus to decide whether a rule applies to variables we
need to decide whether that rule applies to all numbers.
For example, is it true that for variables x and y with positive values, (a) $\sqrt{x \times y} = \sqrt{x} \times \sqrt{y}$,
and is it true that (b) $\sqrt{x + y} = \sqrt{x} + \sqrt{y}$? We decide whether the rule in (a)
applies by first checking with some numbers and then using the definition of the square
root:
For x = 4 and y = 9, $\sqrt{x \times y} = \sqrt{36} = 6$ and $\sqrt{x} \times \sqrt{y} = \sqrt{4} \times \sqrt{9} = 2 \times 3 = 6$. So the
rule seems to hold. Now by definition $\sqrt{x \times y}$ is the number whose square is x × y. Let us
check whether $\sqrt{x} \times \sqrt{y}$ satisfies this requirement. For x ≥ 0 and y ≥ 0 we calculate

$(\sqrt{x} \times \sqrt{y})^2 = (\sqrt{x} \times \sqrt{y})(\sqrt{x} \times \sqrt{y}) = \sqrt{x} \sqrt{y} \sqrt{x} \sqrt{y} = (\sqrt{x} \sqrt{x})(\sqrt{y} \sqrt{y}) = x \times y$.

We have found that the rule (a) is correct.

For the rule in (b), we first check with some numbers, say x = 4 and y = 9 again. We
have $\sqrt{x + y} = \sqrt{13}$ and $\sqrt{x} + \sqrt{y} = 2 + 3 = 5$. Clearly $\sqrt{13} \neq 5$ because $5^2 = 25 \neq 13$.
So the rule suggested in (b) is incorrect.
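Both suggested rules for square roots can be tested on several pairs of positive numbers at once. In the sketch below the function names and the sample pairs are ours; square roots are computed approximately, so the two sides are compared using math.isclose.

```python
import math


def product_rule_holds(x, y):
    """Check sqrt(x * y) = sqrt(x) * sqrt(y) for one pair of values."""
    return math.isclose(math.sqrt(x * y), math.sqrt(x) * math.sqrt(y))


def sum_rule_holds(x, y):
    """Check sqrt(x + y) = sqrt(x) + sqrt(y) for one pair of values."""
    return math.isclose(math.sqrt(x + y), math.sqrt(x) + math.sqrt(y))


samples = [(4, 9), (1, 1), (2, 8), (0.25, 16)]

for x, y in samples:
    print(f"x = {x}, y = {y}: rule (a) {product_rule_holds(x, y)}, rule (b) {sum_rule_holds(x, y)}")
```

For these positive samples rule (a) holds every time and rule (b) fails every time, in agreement with the argument above.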
Moral. The moral of this section is that rules apply to variables only when they apply
to every value of the variables. Moreover, if one is not sure of a rule then it is important
to check the rule with some numbers.
Example 1. Is the rule $(x + y)(x - y) = x^2 - y^2$ correct?
We produce a table with a sample of values for x and y:
 x    y    $(x + y)(x - y)$      $x^2 - y^2$
 1    2    3 · (−1) = −3         1 − 4 = −3
 2    2    4 · 0 = 0             4 − 4 = 0
 5    2    7 · 3 = 21            25 − 4 = 21
−1    3    2 · (−4) = −8         1 − 9 = −8
−2   −4    (−6) · 2 = −12        4 − 16 = −12
It seems, from these values, that the rule is correct. We can check the rule symbolically
by expanding (x + y)(x − y), namely

$(x + y)(x - y) = x \cdot x + x \cdot (-y) + y \cdot x + y \cdot (-y) = x^2 - x \cdot y + x \cdot y - y^2 = x^2 - y^2$.

We have found the rule

$(x + y)(x - y) = x^2 - y^2$.

Example 2. Is the rule $(x + y)^2 = x^2 + y^2$ correct?

We start with a table with the same sample of values for x and y as above:
 x    y    $(x + y)^2$     $x^2 + y^2$
 1    2    $3^2 = 9$       1 + 4 = 5
 2    2
 5    2
−1    3
−2   −4
There is no point in continuing the table since the first values show that the rule is
not correct.
We can look for a rule symbolically by expanding $(x + y)^2 = (x + y)(x + y)$, namely

$(x + y)(x + y) = x \cdot x + x \cdot y + y \cdot x + y \cdot y = x^2 + x \cdot y + x \cdot y + y^2 = x^2 + 2xy + y^2$.

We have found the rule

$(x + y)^2 = x^2 + 2xy + y^2$.
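The rules from the two examples can be compared on the sample values with a short program. This is a sketch only; the variable names are ours, and agreement on a handful of values supports a rule without proving it.

```python
# The sample of values for x and y used in the two examples.
samples = [(1, 2), (2, 2), (5, 2), (-1, 3), (-2, -4)]

for x, y in samples:
    square_of_sum = (x + y) ** 2
    sum_of_squares = x ** 2 + y ** 2
    expansion = x ** 2 + 2 * x * y + y ** 2
    print(x, y, square_of_sum, sum_of_squares, expansion)

# The rule (x + y)^2 = x^2 + y^2 fails for at least one sample.
wrong_rule_fails = any((x + y) ** 2 != x ** 2 + y ** 2 for x, y in samples)

# The rules found by expanding hold for every sample.
found_rule_holds = all((x + y) ** 2 == x ** 2 + 2 * x * y + y ** 2 for x, y in samples)
difference_of_squares_holds = all((x + y) * (x - y) == x ** 2 - y ** 2 for x, y in samples)

print(wrong_rule_fails, found_rule_holds, difference_of_squares_holds)  # True True True
```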

EXERCISES.

For each of the following statements decide whether it is true or false.

1. $\sqrt{x} + x = (x^2 + x)/x$ for x > 0.
2. $x + 1 = (x^2 + x)/x$ for x > 0.
3. $(\sqrt{x} + \sqrt{y})(\sqrt{x} - \sqrt{y}) = x - y$ for x ≥ 0 and y ≥ 0.
4. $(x + 1)(x + 3) = x^2 + 4x + 3$.
5. $(x + 2)(x - 3) = x^2 - 6$.
6. $x^2 + 2 = (0.5x^3 + x)/(0.5x)$ for x > 0.
7. $x^2 + 2 = (x^3 + 2x)/x$ for x > 0.
8. $(x + 3y)^2 = x^2 + 9y^2$.
9. $(x + 1)(x + 3) = 7x$.
10. $(x + 2)(x - 3) = x^2 - x - 6$.
11. $(x - 3)(x + 3) = x^2 + 9$.
12. $(x + 3)(x - 3) = x^2 - 9$.
13. $x^2 + 6x = (x + 3)^2 - 9$.
14. $x^3 + 2x^2 = (x^2 + 2x)x$.
15. $(x + y)(x - y) = x^2 - y^2$.
16. $(x - 1)(x - 3) = (x + 1)(x + 3)$.
17. $(x^2 + x)(x - 3) = x^3 - 2x^2 - 3x$.
18. $(x^2 + 3x)(x - 3) = x^3 + x^2 - 9x$.
19. $(x^2 + 3x)(x - 3) = x^3 - 9x$.
20. $(x + 5)x^2 = x^3 + 5x^2$.
21. $x^2 + 5x + 6 = (x + 3)(x + 2)$.
22. $x^4 + 2x^2 = (x^2 + 2)x^2$.
23. $(x + 1)^4 = x^4 + 4x^3 + 6x^2 + 4x + 1$.
24. $(x + 1)^4 = x^4 + x^3 + x^2 + x + 1$.
25. $(x^2 + x)(x^2 - 3) = x^4 + x^3 - 3x^2 - 3x$.
3. Graphs and Equations
This chapter introduces the setting, some of the questions, and some of the tools, that
we will use to examine quantitative information and quantitative models.

A. Graphs.
Graphs provide one of the most useful means for organizing quantitative information.
Using a graph one can present data, summarize data, describe relations, present important
characteristics of a model, and keep track of the functions being analyzed. Using graphical
analysis one can often decide, to a great extent, what occurs in the setting being modeled.

A graph can describe a relation between two variables by plotting the joint values of
the two variables. The graph will have a horizontal axis and a vertical axis. The horizontal
axis represents values of one variable and the vertical axis represents values of the second
variable. A point in the graph represents the combination of values. The value of one
variable, called the independent variable, corresponds to the coordinate on the horizontal
axis and the value of the second variable, called the dependent variable, is on the vertical
axis.

Figure. The point (3, 4) with its horizontal coordinate 3 and vertical coordinate 4.

Example 1. Plot the points (1, 1), (−1, 2), (2, 3), (3, 4), and (0, −1.5) on a set of
axes with the horizontal values ranging approximately between −3 and 7 and the vertical
values ranging approximately between −2 and 4. Label the horizontal axis with an x and
the vertical axis with a y.
Figure. Some points with their coordinates.

Example 2. Suppose that the first variable is the age of an orchard and the second
variable is the total amount of fruit produced by the orchard up to that time as in example
2 from chapter 1.C. In that example we had 7 data points: (6, 0), (12, 4), (18, 10), (30, 17),
(42, 22), (54, 26), and (66, 29). The graph of these data is
Figure. The orchard production data.

Since the orchard actually produces fruit at times not included in our data, it makes
some sense to connect the points in our data to suggest a continuous relation between
time and the amount produced. (Of course, this is a simplification, since production is
seasonal.)
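Connecting the data points by straight segments also gives a way to estimate production at times that are not in the table. The sketch below assumes straight segments between consecutive data points, which is only one of many possible ways to connect them; the function name interpolate is ours.

```python
# Orchard data points: (age in years, total production in tons).
data = [(6, 0), (12, 4), (18, 10), (30, 17), (42, 22), (54, 26), (66, 29)]


def interpolate(points, t):
    """Estimate the value at t on the straight segments joining consecutive points."""
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the range of the data")


print(interpolate(data, 24))  # 13.5
print(interpolate(data, 30))  # 17.0
```

At a tabulated age the estimate reproduces the data; between tabulated ages it is only as good as the assumption that production grows steadily from one data point to the next.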

Figure. Orchard production as a continuous graph.

Example 3. We turn our attention to example 1 from chapter 1.C. which involved
two sets of data. We will plot the price for producers as a function of the quantity and the
price for consumers as a function of the quantity.

Figure. Price as a function of quantity for producers (supply) and for consumers (demand).

It is now even clearer that for quantities below 10,000 items the price offered by
consumers will be greater than the price required by producers, so producers will increase
production. For quantities above 10,000 items the price offered by consumers will be
smaller than the price required by producers, so producers will keep production at 10,000
items.
For producers of any item, not just the one for which the data was given, one expects
a similar graph that starts at a relatively low price if the quantity is small and requires a
higher price for production to increase. For consumers one expects a similar graph that
starts at a relatively high price if the quantity is small and the item is rare, and requires
a lower price for consumption to increase. Hence one, in general, expects a similar market
situation in which producers and consumers agree on a price and quantity.

Figure. Generic graphs of supply (producers) and demand (consumers), with price plotted against quantity and the market agreement marked.

Example 4. As a country develops greater efficiency in producing goods and services,
the income of each person can increase. This efficiency requires infrastructure and raw
materials supplied by capital and depends on the availability of capital per person. Hence,
in a model of the economy of a developing country three important variables are the total
income of the country, the size of the population, and the total amount of capital.

Let us consider the dependence of population growth and capital growth on income.
If income is high, then the saving rate would be high, that is, people would be able to
invest more of their income. This would mean a greater proportional increase of capital
over a fixed period of time, say one year.

The graph of the proportional growth rate of capital as a function of income might look like this:
Figure. Percentage growth rate of capital and personal income.

If income is high, what would we expect of the growth in the population? Historically,
in very poor countries the birth rate is high, but so is the death rate, so the growth
rate of the population is fairly low. As income increases, living conditions improve and
the population grows more rapidly. However, when income becomes high, people tend to
depend less on having many children and the population’s growth rate decreases.
Figure. Percentage growth rate of population and personal income.

Example 5. Suppose that a car loan will be paid using monthly installments of 200
dollars each. The interest rate is fixed (this example uses 6 percent a year or 0.5 percent
each month) and the payments will be made until the loan is paid off. The duration of the
loan will increase with the loan amount. The following table shows the (rounded) number
of months for different loan amounts (in dollars). The graph of these values follows the
table.
Loan amount and duration:
amount 12,000 15,000 18,000 21,000 24,000 27,000
duration 45 58 72 87 103 120

Figure. Time it takes to pay a loan at 200 a month: duration (months) against amount (thousands of dollars).
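The duration of a loan can be computed by following the balance month by month. The sketch below assumes that interest of 0.5 percent is added to the balance at the end of each month and that the payment is then subtracted, with a smaller final payment if needed; the function name months_to_repay is ours. Under these assumptions the durations listed in the table (45, 58, 72, 87, 103, and 120 months) are obtained with a payment of 300 dollars a month, while a payment of 200 dollars a month gives longer durations, for example 72 months for a loan of 12,000 dollars.

```python
def months_to_repay(amount, payment, monthly_rate):
    """Number of monthly payments needed to pay off a loan.

    Each month interest is added to the balance and then the payment is
    subtracted; the last payment may be smaller than the others.
    """
    if payment <= amount * monthly_rate:
        raise ValueError("the payment does not cover the monthly interest")
    balance = amount
    months = 0
    while balance > 0:
        balance = balance * (1 + monthly_rate) - payment
        months += 1
    return months


amounts = [12000, 15000, 18000, 21000, 24000, 27000]

for amount in amounts:
    print(amount, months_to_repay(amount, 300, 0.005), months_to_repay(amount, 200, 0.005))
```

Each printed line shows the loan amount followed by the number of months at 300 dollars a month and at 200 dollars a month.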

Computers and Calculators

Calculators, computers, tablets, and other computing devices often have programs
that are excellent at graphing mathematical expressions. If you have access to such a
device, this would be an excellent time in the course to learn how to use it for graphing.
Keep in mind that most graphing devices allow the user to choose the range of values
for the independent variable that are graphed and the range of values of the dependent
variable displayed. So it makes sense to be sure that any features of interest appear in any
given graph by choosing the ranges of values correctly.

EXERCISES.

1. Plot the points (1, 2), (−1, 2), (3, 3), (4, 4), and (−1, −2.5) on a set of axes with the
horizontal values ranging approximately between −5 and 5 and the vertical values ranging
between −5 and 5.
2. Suppose that the number of bacteria in a flask varies with the time as given in the table
below. Graph this relationship.
Bacterial population:
time (hour) 2 3 4 5 6 7
number (thousands) 4 5.5 7.5 10.5 14.5 20

3. The average height of a population depends in part on the nutrition available. In
general, as a country develops, incomes rise and nutrition improves. (a) Graph a plausible
relation between income and the average height of the population in a developing country.
(b) How does this graph reflect the fact that children are the most affected, so the height
of adults does not vary with nutrition?
4. The graph below shows the turbidity in a certain river (in parts per million) during a
year (time is measured in months with 1 representing January). (a) Read the values from
the graph and supply the numbers to fill the table. (b) During which month was turbidity
the lowest? (c) During which month was turbidity the highest?

[Figure: turbidity in parts per million (vertical axis) plotted against time in months (horizontal axis).]

time (month)       2   3   6   8   9   11
turbidity (ppm)

5. Plot the points (1, 2), (2, 4), (3, 6), (4, 8), and (5, 10) on a set of axes with the horizontal
values ranging approximately between 0 and 7 and the vertical values ranging between 0
and 12. Connect these points with line segments. Geometrically, what is the graph?
6. Plot the points (1, 4), (−1, 5), (3, 3), and (6, 1.5) on a set of axes with the horizon-
tal and vertical values ranging appropriately. Connect these points with line segments.
Geometrically, what is the graph?
7. Plot the points (1, 2), (2, 5), (3, 10), (−1, 2), and (0, 1) on a set of axes with the
horizontal and vertical values ranging appropriately. Connect these points with a curve.
Geometrically, how would you describe the way this graph curves?
8. Plot the points (1, −2), (2, −5), (3, −10), (−1, −2), and (0, −1) on a set of axes with the
horizontal and vertical values ranging appropriately. Connect these points with a curve.
Geometrically, how would you describe the way this graph curves?
9. Plot the points representing the age and total production of the orchard described in
example 2 in section 1.C. Place the age on the horizontal axis and the production on the
vertical axis.
10. Plot the points representing the total production and age of the orchard described in
example 2 in section 1.C. Place the production on the horizontal axis and the age on the
vertical axis.
11. Plot the points representing the combinations of the number of games and amount of
apples described in example 3 in section 1.C. Place the number of games on the horizontal
axis and the amount of apples on the vertical axis. Label each point with the value of the
utility for its combination.
12. Plot the points representing the area and length of a square described in example 2
in section 1.B. Place the length on the horizontal axis and the area on the vertical axis.
Connect these points with a curve.
13. Plot the points representing the area and length of a square described in example 3
in section 1.B. Place the area on the horizontal axis and the length on the vertical axis.
Connect these points with a curve.
14. Draw a graph with annual income on the horizontal axis and the amount saved on the
vertical axis. Decide on plausible values for annual incomes of $15,000, $35,000, $60,000,
$130,000, and $200,000 and include the corresponding points on your graph. Consider that
the average rate of savings for the whole population in different countries ranges between
none and about 9 percent.

B. Equations.

An equation is a statement in which one quantity is declared equal to some other
quantity. An equation usually includes a variable (or more than one variable), and we
typically wish to determine which values of the variable make the declaration correct.
Example 1. Determine a value of x for which 3x + 2 = 11. In this case there is only
one value of x for which the equation is satisfied, namely x = 3. Let us check that x = 3
satisfies the equation:
3 × 3 + 2 = 9 + 2 = 11.

We will examine the issue of finding all values that satisfy an equation soon, but for
now we want to understand what we mean by a solution.

Definition. A value for a variable is a solution to an equation if, when we replace
the variable with this value, the resulting equation is satisfied (that is, it gives a correct
mathematical statement).
Example 2. Determine which of the values x = 1, x = 2, x = 3, and x = 4 are
solutions to the equation
$x^3 - 9x^2 + 26x = 24$.

By trying out the different values we find that x = 2, x = 3, and x = 4, are solutions, but
x = 1 is not. The calculations are:

$2^3 - 9 \cdot 2^2 + 26 \cdot 2 = 8 - 9 \cdot 4 + 52 = 8 - 36 + 52 = 24$;

$3^3 - 9 \cdot 3^2 + 26 \cdot 3 = 27 - 9 \cdot 9 + 78 = 27 - 81 + 78 = 24$;

$4^3 - 9 \cdot 4^2 + 26 \cdot 4 = 64 - 9 \cdot 16 + 104 = 64 - 144 + 104 = 24$;

$1^3 - 9 \cdot 1^2 + 26 \cdot 1 = 1 - 9 \cdot 1 + 26 = 1 - 9 + 26 = 18 \neq 24$.
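
Checking candidate values by substitution is mechanical, so it is easy to repeat on a computer. The lines below are a minimal Python sketch of the check in Example 2; the function name `solves` is an illustrative choice and not from the text.

```python
def solves(x):
    """Return True when x satisfies x^3 - 9x^2 + 26x = 24."""
    return x**3 - 9 * x**2 + 26 * x == 24


# Substitute each candidate and report the left-hand side and the verdict.
for x in (1, 2, 3, 4):
    print(x, x**3 - 9 * x**2 + 26 * x, solves(x))
```

Because the candidates are whole numbers, the arithmetic is exact and the comparison with 24 is reliable.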

Example 3. Solve the equation 3x + 2 = 7.

Solving means finding the value or values that can be substituted into the symbol x to
make the equation valid. To perform the solving, we operate using legitimate mathematical
operations until the symbol x is isolated. Below are the steps we choose for this equation:
Subtract 2 from both sides, that is, from the quantity on the left side of the equation
and also from the quantity on the right side of the equation: 3x + 2 − 2 = 7 − 2 or 3x = 5.
Divide by 3: 3x/3 = 5/3 or x = 5/3. This is the only number that works, so the solution
is x = 5/3.
Example 4. Solve the equation 5x + 2 = −2x − 4.
Subtract 2 from both sides: 5x = −2x − 6. Add 2x: 5x + 2x = −6 or 7x = −6.
Divide by 7: x = −6/7. This is the solution.
Example 5. Solve the equation $3x^2 = 27$.
Divide by 3: $3x^2/3 = 27/3$ or $x^2 = 9$. We now need to decide which numbers, when
squared, give 9. One of these numbers is the square root of 9, namely $\sqrt{9} = 3$. Moreover,
if we think of the value x = −3, then we would realize that $(-3)^2 = (-3)(-3) = 9$. In
conclusion, there are two solutions, x = 3 and x = −3.

The examples above suggest that the most important notion when solving equations
for a variable is that any operations or manipulations must be correct for all numbers
(since the values of the particular variables are still unknown). The next thing to keep in
mind is that the goal is to isolate the unknown variable.
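
The isolating steps used in Examples 3 and 4 can be written once for any equation of the form ax + b = cx + d. The following is a minimal Python sketch under the assumption that the coefficients are whole numbers; the function name `solve_linear` is illustrative and not from the text. Fractions are used so that answers such as 5/3 stay exact.

```python
from fractions import Fraction


def solve_linear(a, b, c, d):
    """Solve a*x + b = c*x + d for x, assuming integer coefficients and a != c."""
    # Subtract b from both sides:      a*x = c*x + (d - b)
    # Subtract c*x from both sides:    (a - c)*x = d - b
    # Divide both sides by (a - c):    x = (d - b) / (a - c)
    if a == c:
        raise ValueError("the x terms cancel, so x cannot be isolated")
    return Fraction(d - b, a - c)


print(solve_linear(3, 2, 0, 7))    # Example 3: 3x + 2 = 7
print(solve_linear(5, 2, -2, -4))  # Example 4: 5x + 2 = -2x - 4
```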

EXERCISES.

1. For each of the following values decide whether it solves the equation $x^3 - 6x^2 + 11x - 6 = 0$.
(a) x = 1, (b) x = 0, (c) x = −2, (d) x = 4, (e) x = 3, (f) x = 2.
2. For each of the following values decide whether it solves the equation $x^3 - 2x + 1 = 0$.
(a) x = 1, (b) x = 0, (c) $x = -0.5 + 0.5\sqrt{5}$, (d) $x = -0.5 - 0.5\sqrt{5}$, (e) x = −1, (f) x = 2.
3. For each of the following values decide whether it solves the equation $x^3 - 2x^2 + 1 = 0$.
(a) x = 1, (b) x = 0, (c) $x = 0.5 + 0.5\sqrt{5}$, (d) $x = 0.5 - 0.5\sqrt{5}$, (e) x = −2, (f) x = 3.
4. For each of the following values decide whether it solves the equation $x^4 = 16$. (a)
x = −1, (b) x = 5, (c) x = 2, (d) $x = 2 - \sqrt{5}$, (e) x = −2, (f) x = 3.
5. For each of the following values decide whether it solves the equation $x^2 = 8$. (a)
x = −1, (b) x = 0.4, (c) $x = 2\sqrt{2}$, (d) $x = 2 - \sqrt{2}$, (e) $x = -2\sqrt{2}$, (f) x = 3.
6. Solve for the values of x that satisfy $x^3 = -27$.
7. Solve for the value of x in 3x + 4 = 25.
8. Solve for the values of x that satisfy $0.5x^2 = 8$.
9. Solve for the values of x that satisfy 0.3x + 0.4 = 11.
10. Solve for the value of x in $0.1x^3 + 2 = 1.2$.
11. Solve for the value of x that satisfies $x^3 = 7$.
12. Solve for the value of x in 11x + 4/3 = −5.
13. Solve for the values of x that satisfy $2x^2 = 7$.
14. Solve for the values of x that satisfy −0.3x + 5/3 = 4.
15. Solve for the value of x in $-x^3 + 30 = 13$.

C. Inequalities.

An inequality is a statement comparing one quantity to another. For example, 3 ≤ 4
is a valid inequality. Also 5 < 5.3 is a valid inequality.
Example 1. For each of the following inequalities decide whether it is valid, that is
it is a true statement, or invalid – a false statement:
(a) 3 > 2. True. Justification: 3 is to the right of 2 on the number line.
(b) 3 ≥ 2.4. True. Justification: 3 is to the right of 2.4 on the number line.
(c) 3 > 3.4. False. Justification: 3 is to the left of 3.4 on the number line.
(d) −5 ≥ −4. False. Justification: −5 is to the left of −4 on the number line.
(e) −3 < −2. True. Justification: −3 is to the left of −2 on the number line.
(f) x + 1 ≥ x for any number x. True. Justification: regardless of the value of x, x + 1
is to the right of x on the number line.
(g) 2x ≥ x for any number x. False. Justification: For negative x, 2x is to the left of
x on the number line. For example, if x = −2 then 2x = −4 and −4 < −2. (For positive
x, 2x is to the right of x on the number line, but the statement claims that the inequality
is true for all numbers.)
Example 2. For which values of x is 3x + 6 ≥ 9?
We proceed as we did for equalities in isolating the symbol x. Subtract 6: 3x ≥ 9−6 =
3. Divide by 3: x ≥ 1. The solution is all numbers x for which x ≥ 1.
Let us check this with some value. For instance, 2 ≥ 1 so x = 2 should work. We
check: 3 · 2 + 6 = 6 + 6 = 12 > 9 which is a valid inequality, so this value is correct.
Example 3. For which values of x is −x > 7?
We want to isolate the symbol x, so we multiply both sides by (−1). To proceed we
need to understand what this does to an inequality.
A few test values show us what to expect: 3 < 5 and −3 > −5, −2 < 1 and 2 > −1,
−4 < −3 and 4 > 3. We see that multiplying by (−1) reverses the ordering of the
numbers. This is not surprising since multiplying by a negative number flips the number
line: (−1) · 3 = −3 is on the opposite side of 0 from 3.
Back to our original problem, −x > 7 becomes (−1) · (−x) < (−1) · 7 = −7, or
x < −7. Let us check this with a number: for instance, x = −8 < −7 is a valid inequality.
Is −x = −(−8) > 7? Yes, because −(−8) = 8 > 7.
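
The reversal rule from Example 3 can be sketched in a few lines of Python. This is an illustration only: the function name `solve_ax_gt_b` is an assumption, and the sketch handles just inequalities of the form ax > b with whole-number a and b and a nonzero. It first repeats the test values from the text and then isolates x, reversing the direction when a is negative.

```python
from fractions import Fraction


def solve_ax_gt_b(a, b):
    """Describe the x with a*x > b, for integers a and b with a nonzero.

    Returns the direction of the resulting inequality and the bound.
    """
    if a == 0:
        raise ValueError("a must be nonzero to isolate x")
    bound = Fraction(b, a)
    # Dividing by a positive number keeps the direction;
    # dividing by a negative number reverses it.
    direction = ">" if a > 0 else "<"
    return direction, bound


# Test values from the text: multiplying by -1 reverses the order.
for small, large in [(3, 5), (-2, 1), (-4, -3)]:
    print(small < large, -small > -large)

print(solve_ax_gt_b(-1, 7))  # the inequality -x > 7
```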

Example 4. For which values of x is |x| ≥ 2?

If x ≥ 0 then |x| = x and the inequality is x ≥ 2. If x ≤ 0 then |x| = −x and the
inequality becomes −x ≥ 2 or x ≤ −2. Hence |x| ≥ 2 if x ≥ 2 or x ≤ −2.
Example 5. For which values of x is $x^2 \geq 9$?
We will approach this inequality in two ways. First we will consider replacing the
inequality with an equation. The idea is to find the values at which the inequality is just
satisfied. Once we know these “edge” values, we will need to decide what happens for
other values.
The corresponding equation is $x^2 = 9$ for which the solutions are x = −3 and x = 3.
Therefore, we need to decide what inequality is satisfied in each of the three cases: for
values x with x < −3, for values x with −3 < x < 3, and for values x with x > 3. We
test a number in each case. For instance, −4 < −3 and $(-4)^2 = 16 > 9$, so $x^2 > 9$ for
x < −3. For −3 < x < 3 we can take, for instance, x = 0 and $0^2 = 0 < 9$. So $x^2 < 9$ for
−3 < x < 3. Finally, for instance, x = 3.5 > 3 and $(3.5)^2 = 12.25 > 9$, so $x^2 > 9$ for x > 3.
We conclude that $x^2 \geq 9$ when x ≤ −3 and when x ≥ 3.
Our second approach to this problem is to first re-interpret the original inequality. If
x ≥ 0, then $x^2 \geq 9$ is the same as $\sqrt{x^2} \geq \sqrt{9} = 3$ or x ≥ 3. On the other hand, if x ≤ 0,
then $x^2 \geq 9$ is the same as $\sqrt{x^2} \geq \sqrt{9} = 3$, but $\sqrt{x^2} = -x$, so −x ≥ 3 and therefore
x ≤ −3.
The conclusion is the same: $x^2 \geq 9$ when x ≤ −3 and when x ≥ 3.
Typically the first of the two approaches we saw is easier to implement.
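
The first approach (find the edge values, then test one number in each interval) can be sketched as follows. This is a minimal Python illustration, not the author's procedure verbatim: the helper names `sample_points` and `holds` are assumptions, and the test values it picks (one unit outside the edges and the midpoint between them) differ from the ones chosen in the text.

```python
def sample_points(edges):
    """Pick one test value in each interval cut out by the sorted edge values."""
    points = [edges[0] - 1]
    points += [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    points.append(edges[-1] + 1)
    return points


def holds(x):
    """The inequality from Example 5: x^2 >= 9."""
    return x**2 >= 9


edges = [-3, 3]  # solutions of the corresponding equation x^2 = 9
results = [(x, holds(x)) for x in sample_points(edges)]
print(results)
```

The inequality holds at the test values to the left of −3 and to the right of 3 and fails at the one between them, which matches the conclusion x ≤ −3 or x ≥ 3.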
Example 6. A flower grower sells plants in 4 inch pots for 3 dollars each. The
materials in each pot cost 1.50 dollars, and the grower pays 8% in sales taxes on each
sale. How many pots must the grower sell to have a net income of at least 200 dollars from
the sale?
Let x denote the number of pots sold. The revenue from the sale is 3x, and the cost of
materials is 1.5x. The taxes are paid on the revenue, so the cost of taxes is 0.08 × 3x = 0.24x.
Hence the net income is 3x − 1.5x − 0.24x = 1.26x. For the net income to be at least 200, the
grower must have 1.26x ≥ 200, or x ≥ 200/1.26 ≈ 158.73. Since the number of pots is a whole
number, the grower must sell at least 159 pots to have a net income of at least 200 dollars
from the sale.
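
The same calculation can be sketched in Python with the price, material cost, and tax rate as parameters. The function name `min_pots` and its argument names are illustrative assumptions; decimal strings and exact fractions are used so that rounding up to a whole pot is not disturbed by floating-point error.

```python
import math
from fractions import Fraction


def min_pots(target, price="3", material="1.50", tax_rate="0.08"):
    """Smallest whole number of pots whose net income is at least target."""
    price = Fraction(price)
    material = Fraction(material)
    tax_rate = Fraction(tax_rate)
    # Net income per pot: revenue minus materials minus tax on the revenue.
    net_per_pot = price - material - tax_rate * price
    if net_per_pot <= 0:
        raise ValueError("each sale loses money, so no number of pots suffices")
    # Round up, because a fraction of a pot cannot be sold.
    return math.ceil(Fraction(target) / net_per_pot)


print(min_pots(200))
```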

EXERCISES.

1. For each of the following values decide whether it solves the inequality $x^3 - 6x^2 \geq 3$.
(a) x = −2, (b) x = 0, (c) x = 2, (d) x = 5, (e) x = 7, (f) x = 11.
2. For each of the following values decide whether it solves the inequality $x^2 - 2x + 1 \geq 0$.
(a) x = −5, (b) x = −2, (c) x = 0, (d) x = 2, (e) $x = \sqrt{5}$, (f) x = 3, (g) x = 4.
3. For each of the following values decide whether it solves the inequality 2x + |x| ≥ 3. (a)
x = −1, (b) x = 0, (c) x = 1, (d) x = 2, (e) x = 3.
4. Describe all values of x that satisfy the inequality 3x + 2 ≤ 5.
5. Describe all values of x that satisfy the inequality 3x + 2 > 5.
6. Describe all values of x that satisfy the inequality $x^2 + 2 \leq 6$.
7. Describe all values of x that satisfy the inequality $3x^2 - 1 \geq 11$.
8. Describe all values of x that satisfy the inequality −2x + 1 > 6.
9. Describe all values of x that satisfy the inequality $-3x^2 \leq 6$.
10. Describe all values of x that satisfy the inequality $x^2 + 3 \geq 8$.
11. Describe all values of x that satisfy the inequality |x| ≤ 5.
12. Describe all values of x that satisfy the inequality |x − 2| ≤ 5.
13. Describe all values of x that satisfy the inequality |x + 3| ≤ 5.
14. Describe all values of x that satisfy the inequality −3x + 1 ≤ 6.
15. Describe all values of x that satisfy the inequality $-0.1x^2 + 0.4 \geq 4$.
16. Describe all values of x that satisfy the inequality $x^2 \geq 8x$.
17. Describe all values of x that satisfy the inequality $x^2 - 5x \leq 0$.
18. Describe all values of x that satisfy the inequality |x − 7| ≥ 3.
19. Describe all values of x that satisfy the inequality |x| ≥ 2x.
20. A batch of pancakes costs a restaurant $1 in ingredients and 0.2 hours of labor. Suppose
the restaurant gets $7 for the pancakes. How much can the restaurant pay per hour of
labor if it is to make at least $3 in net revenue from this sale (to pay fixed costs and for
profit)?
21. A clinic treats patients using a combination of doctors and nurses. Suppose that when
√
m doctors and n nurses are employed (per month) the clinic can treat p = 100 m · n
patients. (a) If the clinic wants to treat at least 1000 patients and has 7 nurses on hand,
how many doctors does it need? (b) If the clinic wants to treat at least 1000 patients and
has 5 nurses on hand, how many doctors does it need? (c) If the clinic wants to treat at
least 1000 patients and has 9 nurses on hand, how many doctors does it need?

D. Graphing Equations.

Graphing is a powerful tool for summarizing a situation, and this can be helpful when
keeping track of the solutions to an equation or an inequality.

Example 1. Draw a graph corresponding to solving $x^2 + x = 4 - x^2$.

Solution: Here we will graph the left-hand side and the right-hand side on the same set of
axes and think of the solutions as the values of the x-coordinate at the points of intersection.

[Figure: graphs of $x^2 + x$ and $4 - x^2$ on the same axes, for x between −5 and 5.]

Combined graph of $x^2 + x$ and $4 - x^2$.

It is clear that there are two solutions, one near −1.7 and the other near 1.2. If we
were to solve the equation exactly, we would write it as $2x^2 + x - 4 = 0$ and use the
quadratic formula. The quadratic formula is described later.
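
The values read from the graph can be refined numerically. The following is a minimal Python sketch that uses bisection, a technique not discussed in the text: it repeatedly halves an interval on which the difference between the two sides changes sign. The names `bisect` and `gap`, and the starting intervals from −3 to 0 and from 0 to 3, are assumptions chosen by looking at where the graphs cross.

```python
def bisect(f, lo, hi, tol=1e-9):
    """Find a zero of f between lo and hi, assuming f changes sign there."""
    if f(lo) * f(hi) > 0:
        raise ValueError("f must have opposite signs at lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid  # the sign change is in the left half
        else:
            lo = mid  # the sign change is in the right half
    return (lo + hi) / 2


def gap(x):
    """Left-hand side minus right-hand side of x^2 + x = 4 - x^2."""
    return (x**2 + x) - (4 - x**2)


left = bisect(gap, -3, 0)
right = bisect(gap, 0, 3)
print(round(left, 3), round(right, 3))
```

The two values agree with the estimates of about −1.7 and 1.2 read from the graph.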

Example 2. Draw a graph corresponding to solving |x − 2| = 3.

Solution: Here we will graph the left-hand side and the right-hand side on the same set of
axes and think of the solutions as the values of the x-coordinate at the points of intersection.

[Figure: graphs of y = |x − 2| and y = 3 on the same axes.]

Combined sketch of y = |x − 2| and y = 3.

It is clear that there are two solutions, one at x = −1 and the other at x = 5. If we
were to solve the equation algebraically, we would separate |x − 2| into the two cases x ≥ 2
and x < 2. The reader is invited to calculate the solutions algebraically. Even without
estimating the solutions, it should be clear from the graph that there are two solutions.

Example 3. Draw a graph corresponding to solving $x + x^2 = x^3 + x^4$.

[Figure: graphs of $x + x^2$ and $x^3 + x^4$ on the same axes, for x between −3 and 3.]

Combined graph of $x + x^2$ and $x^3 + x^4$

It seems that there are three solutions, one near x = −1, a second near x = 0,
and the third near x = 1. One can check that these are in fact solutions. Writing the
equation as $x^4 + x^3 - x^2 - x = 0$ we can use the factors we’ve found to check that
$x^4 + x^3 - x^2 - x = x (x - 1) (x + 1)^2$ and we’ve found all the solutions.
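
The factorization can be double-checked by comparing the two expressions at several whole numbers. The lines below are a minimal Python sketch with illustrative function names. Two polynomials of degree 4 that agree at more than 4 points are identical, so agreement at the eleven values from −5 to 5 confirms the factorization.

```python
def expanded(x):
    """The polynomial x^4 + x^3 - x^2 - x."""
    return x**4 + x**3 - x**2 - x


def factored(x):
    """The claimed factorization x (x - 1) (x + 1)^2."""
    return x * (x - 1) * (x + 1) ** 2


values = range(-5, 6)
agree = all(expanded(x) == factored(x) for x in values)
roots = [x for x in values if expanded(x) == 0]
print(agree, roots)
```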

Example 4. Draw a graph corresponding to solving $\sqrt{x - 2} = x^2 - 3$.

Solution: Here we will graph the left-hand side and the right-hand side on the same set
of axes. Notice that the domain of the left-hand side is x ≥ 2 while the domain of the
right-hand side is all numbers.

[Figure: graphs of $\sqrt{x - 2}$ and $x^2 - 3$ on the same axes.]

Combined graph of $\sqrt{x - 2}$ and $x^2 - 3$.

It is clear that there are no solutions to this equation.

EXERCISES.

1. Use a graph to explain why |x − 2.1| = 5 has two solutions.

2. Graph the equation $x^3 = x + 1$ and approximate the solutions from the graph.
3. Use a graph to explain why |x − 2.1| = −0.00001 has no solutions.
4. Graph the equation $x^3 - 2x = x + 3$ and approximate the solutions from the graph.
5. Graph the equation $x^3 - 2x = x + 1$ and approximate the solutions from the graph.
6. Graph the equation $x^3 - 2x = x + 2$ and approximate the solutions from the graph.
7. Graph the equation $x^3 - 2x = x^2 + 1$ and approximate the solutions from the graph.
8. Graph the equation $x^3 + 3x^2 + 5x - 1 = 0$ and approximate the solutions from the
graph.
9. Graph the equation $\sqrt{x^2 + 3} = x^2 + 1$ and approximate the solutions from the graph.
10. Graph the equation $\sqrt{x^2 + 3} = x^2 + 0.5$ and approximate the solutions from the graph.
11. Graph the equation $x^4 - x^2 + 0.3x = 0$ and approximate the solutions from the graph.

4. Linear and Polynomial Expressions.

Familiar examples of the type of relations quantified in this chapter are the size of the
payment as a function of the size of a loan, the length of a shadow in relation to the height
of the object casting the shadow, the area of a roof as a function of the length of a side for
houses with similar proportions, the rate at which a population increases as a function of
the starting size of the population, the amount of energy used to travel a certain distance
by car as a function of the speed of the travel, the efficiency of a solar cell as a function
of the intensity of the light striking it, the power required for a radio station to reach
customers a certain distance away, or the cost of a manufacturing plant as a function of
its manufacturing capacity.
We describe the relations in the order of their mathematical complexity, and use the
descriptions to answer quantitative questions about these relations.

A. Linear Relations and Equations.

A relation is called linear, or more precisely linear in the variable, when the variable
appears only with a coefficient multiplying it.
Symbolically, a linear relation using a variable x has the form

y = ax + b

where a and b are numbers. The term a x is called the linear term or the first order
term and the number b is called the constant term or simply the constant.
An equation involving only a linear relation is called a linear equation. As mentioned
before, a solution to an equation is a number that makes the equation hold. For example,
the linear equation 3 x = 4.5 has x = 1.5 as its only solution, while the linear equation
2 x + 5 = 2 x + 3 has no solutions, and the linear equation 3(x + 2) = 3 x + 6 holds for any
number x.
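
The three possibilities just described (one solution, no solution, every number a solution) can be told apart by comparing coefficients. The following is a minimal Python sketch for equations written as ax + b = cx + d; the function name `classify_linear` and the returned labels are illustrative assumptions. Decimal coefficients are passed as strings so that the arithmetic stays exact.

```python
from fractions import Fraction


def classify_linear(a, b, c, d):
    """Describe the solutions of a*x + b = c*x + d.

    Coefficients may be integers or decimal strings such as "4.5".
    """
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    if a != c:
        # The x terms do not cancel, so x can be isolated.
        return "one solution", (d - b) / (a - c)
    if b == d:
        # Both sides are the same expression.
        return "every number is a solution", None
    # The x terms cancel and leave a false statement such as 5 = 3.
    return "no solution", None


print(classify_linear(3, 0, 0, "4.5"))  # 3x = 4.5
print(classify_linear(2, 5, 2, 3))      # 2x + 5 = 2x + 3
print(classify_linear(3, 6, 3, 6))      # 3(x + 2) = 3x + 6, expanded
```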
Example 1. The state and local sales tax in Bellingham, WA, currently total 8.7%.
A neighborhood association has a plant sale where (for convenience) the total purchase
amounts will be $1, $2, and $5. What are the original prices of the plants before tax?

Solution: The quantities that we do not know are the original prices. Let x be the
original price before tax and let p be the price with the tax.
The relation is that the total price includes the original price and the tax. So
p = x + 0.087x = (1 + 0.087)x = 1.087x, and the three prices are p = 1, p = 2, and
p = 5.
We solve the equations to get x = p/1.087, and the original prices are 1/1.087 ≈ 0.92,
2/1.087 ≈ 1.84, and 5/1.087 ≈ 4.60.
In this example, we rounded the answers to the nearest cent, since we are discussing
plant prices. So we conclude that the original prices are 92 cents, 1.84 dollars, and 4.60
dollars for the plants with total prices of 1 dollar, 2 dollars, and 5 dollars (respectively).
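
The computation x = p/1.087 is easy to repeat for other totals or tax rates. The lines below are a minimal Python sketch; the function name `price_before_tax` is an illustrative assumption, and the default rate is the 8.7% used in this example.

```python
def price_before_tax(total, tax_rate=0.087):
    """Return the pre-tax price, rounded to the nearest cent."""
    # total = price + tax_rate * price = (1 + tax_rate) * price
    return round(total / (1 + tax_rate), 2)


for total in (1, 2, 5):
    print(total, price_before_tax(total))
```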
Example 2. Find all solutions to the equation 3x − 4 = 7x − 5. We want to isolate
x so we’ll collect the terms involving x and the terms not involving x by moving, in this
instance, the 3x term to the right and the −5 to the left:

3x − 4 = 7x − 5, 3x − 4 − 3x + 5 = 7x − 5 − 3x + 5, −4 + 5 = 7x − 3x, 1 = 4x .

Now that there is only one term involving x, we can solve by dividing by the coefficient, 4:

1 = 4x, x = 1/4.

It turns out that this equation has a unique solution.

Example 3. Find all solutions to the equation

$\frac{2}{x} + 4 = \frac{5}{x}$.

This seems like a new type of equation, but we want to isolate x so we’ll multiply by x to
get it out of the denominator:

$\left(\frac{2}{x} + 4\right) x = \frac{5}{x}\, x, \quad \frac{2}{x}\, x + 4x = 5, \quad 2 + 4x = 5, \quad 4x = 3, \quad x = 3/4$.

Example 4. Find all solutions to the equation

$(x + 3)(x - 2) + 4 = x^2 + 3x + 5$.

We want to isolate x so it seems like a good idea to collect the terms involving x and
see what emerges. Since the left hand side includes a product while the right hand side
includes powers of x, it makes sense to expand the left hand side in terms of powers of x.
So $(x + 3)(x - 2) + 4 = x^2 + 3x - 2x - 6 + 4 = x^2 + x - 2$ is the left hand side. We get:

$(x+3)(x-2)+4 = x^2+3x+5$, which becomes $x^2+x-2 = x^2+3x+5$, $x-2 = 3x+5$, $-2x = 7$, $x = -7/2$.

Example 5. The price at which manufacturers would supply a certain quantity of an
item is called the supply function. Suppose that when a quantity q is to be manufactured,
the required price, in dollars, is p = 10 + 0.0003q.
Graph the price as a function of the quantity. Determine the price at which 10,000
items would be manufactured, and the price at which 15,000 items would be manufactured.

[Figure: producers' supply, with price p in dollars (vertical axis) plotted against quantity q from 0 to 20,000 (horizontal axis).]

Graph of p = 10 + 0.0003q.

When the quantity is 10,000 items, the price manufacturers will accept is p = 10 +
0.0003 × 10,000 = 13 dollars. When the quantity is 15,000 items, the price is p = 10 +
0.0003 × 15,000 = 14.50 dollars.
Discussion: Does it make sense that to induce manufacturers to supply a larger amount
the price must be raised? Consider what it would take for the same factory to increase
production, and what it would take to add a new factory in order to increase supply.
Example 6. The price at which consumers would buy a certain quantity of an item
is called the demand function. Suppose that when a quantity q is to be sold, the required
price, in dollars, is p = 20 − 0.0005q.
Graph the price as a function of the quantity. Determine the price at which 10,000
items would be purchased, and the price at which 15,000 items would be purchased. If
supply is as in example 5, that is p = 10 + 0.0003q, then at which quantity do producers
and consumers agree on a price?

[Figure: consumers' demand, with price p in dollars (vertical axis) plotted against quantity q from 0 to 20,000 (horizontal axis).]

Graph of p = 20 − 0.0005q

When the quantity is 10,000 items, the price consumers are willing to pay is p =
20 − 0.0005 × 10,000 = 15 dollars. When the quantity is 15,000 items, the price is
p = 20 − 0.0005 × 15,000 = 12.50 dollars. For consumers and producers to agree on a
price, the price at which manufacturers are willing to produce the quantity and at which
consumers are willing to purchase the quantity must be the same, so

10 + 0.0003q = p = 20 − 0.0005q.

The resulting equation is 0.0008q = 10, or q = 10/0.0008 = 12,500.
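
Setting supply equal to demand works the same way for any pair of linear price relations. The following is a minimal Python sketch; the function name `equilibrium` and the (intercept, slope) representation are illustrative assumptions. It also reports the agreed price, obtained by substituting the quantity back into the supply relation: 10 + 0.0003 × 12,500 = 13.75 dollars.

```python
from fractions import Fraction


def equilibrium(supply, demand):
    """Find the quantity and price where two linear price relations agree.

    Each relation is a pair (intercept, slope) meaning p = intercept + slope*q.
    Decimal strings keep the arithmetic exact.
    """
    s0, s1 = (Fraction(v) for v in supply)
    d0, d1 = (Fraction(v) for v in demand)
    if s1 == d1:
        raise ValueError("parallel lines: no single agreed quantity")
    # s0 + s1*q = d0 + d1*q  gives  (s1 - d1)*q = d0 - s0
    q = (d0 - s0) / (s1 - d1)
    p = s0 + s1 * q
    return q, p


q, p = equilibrium(supply=(10, "0.0003"), demand=(20, "-0.0005"))
print(q, float(p))
```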

Discussion: Does it make sense that to induce consumers to buy a larger amount the
price must be lowered? Consider what it would take to get the same consumer to purchase
more of a certain item and what it would take to get purchases from new consumers in
order to increase demand.
Example 7. Suppose that a shopper has a budget of 30 dollars with which to buy
potatoes and rice (for the month). Suppose that potatoes cost 1 dollar per pound and rice
costs 1.50 dollars per pound. Let p denote the amount of potatoes purchased (in pounds)
and let r denote the amount of rice purchased (in pounds). Describe the relation between
the two amounts. How many pounds of rice can this shopper buy if he or she purchases 9
pounds of potatoes?

The total amount spent on potatoes and rice is 1·p+1.50·r so the budget requirement
is p+1.50r = 30. If 9 pounds of potatoes are purchased, then the equation for rice becomes
9 + 1.50r = 30, and the solution is r = 21/1.50 = 14.
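It helps to understand the setting of such a budget requirement if we graph the relation
between potatoes and rice.

[Figure: the budget line p + 1.50r = 30, with rice in pounds (vertical axis) plotted against potatoes in pounds (horizontal axis); the point (9, 14) is marked.]

The budget constraint of 30 dollars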
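
Points on the budget line can be computed by solving p + 1.50r = 30 for r. The lines below are a minimal Python sketch; the function name `rice_affordable` and its parameter names are illustrative assumptions, with defaults taken from Example 7.

```python
def rice_affordable(potatoes, budget=30.0, potato_price=1.0, rice_price=1.5):
    """Pounds of rice that exhaust the budget after buying the given potatoes."""
    spent_on_potatoes = potato_price * potatoes
    if spent_on_potatoes > budget:
        raise ValueError("the potatoes alone exceed the budget")
    # potato_price*p + rice_price*r = budget, solved for r
    return (budget - spent_on_potatoes) / rice_price


for potatoes in (0, 9, 30):
    print(potatoes, rice_affordable(potatoes))
```

The values for 0 and 30 pounds of potatoes give the two ends of the budget line: all of the budget spent on rice, or all of it spent on potatoes.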

The most important notion when solving equations is that any operations or manipu-
lations must be correct for all numbers (since the values of the particular variables are still
unknown). What we decide to do depends on the particular equation. Recognizing the
type of an equation and hence an approach to finding solutions to it requires experience.
The best way to get good at solving equations is to work through examples in order to
gain useful experience. Many students find it helpful to write short notes justifying each
operation.
Example 8. An economy “decides” to spend some portion of its capital on (private)
manufacturing and the rest of its capital on (public) infrastructure. Suppose 100 billion
dollars of capital are available in one year. Suppose there is a 15% cost for investing in
manufacturing, that is, of every dollar designated for manufacturing there is a loss of 15
cents in financing and administration. Suppose that for infrastructure there is a 20% loss,
due to the cost of collecting taxes and administering the project. Describe the relation
between the net amounts of capital devoted to manufacturing and to infrastructure. If a
net amount of 60 billion is to be devoted to manufacturing, what net amount will remain
for infrastructure?
Solution (with notes): The variables are the amount allocated to manufacturing, m,
and the amount allocated to infrastructure, i. We will also need the net amounts, so let
n denote the net amount spent on manufacturing and let j be the net amount spent on
infrastructure.
We’ll remind ourselves: m = manufacturing. i = infrastructure. (total amounts in
billions) n = net for manufacturing. j = net for infrastructure.
The relation for the total amounts is m + i = 100. Also, n = m − 0.15m = 0.85m,
and j = i − 0.2i = 0.8i. We solve this for a relation of n and j:
Solve for m: m = n/0.85 = (20/17)n. Solve for i: i = j/0.8 = (5/4)j. Substitute in
m + i = 100: (20/17)n + (5/4)j = 100. This is the relation between the net amounts.
When the net amount spent on manufacturing is 60, n = 60 and (20/17)60 + (5/4)j =
100. We want to solve for j. Multiply by 4/5: (20×4×60)/(5×17)+j = 100×4/5. Simplify
arithmetic: (960/17) + j = 80. Subtract 960/17: j = 80 − (960/17) = 400/17 ≈ 23.53.
Report the result: The relation between net amounts is

$\frac{20}{17} n + \frac{5}{4} j = 100$

and if 60 billion are invested directly into manufacturing we find that 400/17 ≈ 23.53
billion are invested directly into infrastructure.
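
The steps of this solution can be retraced with exact fractions. The following is a minimal Python sketch; the names `TOTAL`, `KEEP_M`, `KEEP_I`, and `net_infrastructure` are illustrative assumptions, with the 15% and 20% losses from Example 8 written as the fractions of each dollar that remain.

```python
from fractions import Fraction

TOTAL = 100                  # capital available, in billions of dollars
KEEP_M = Fraction(85, 100)   # share of manufacturing money left after the 15% cost
KEEP_I = Fraction(80, 100)   # share of infrastructure money left after the 20% loss


def net_infrastructure(net_manufacturing):
    """Net amount left for infrastructure, given the net amount for manufacturing."""
    gross_m = net_manufacturing / KEEP_M   # m = n / 0.85
    gross_i = TOTAL - gross_m              # m + i = 100
    if gross_i < 0:
        raise ValueError("manufacturing alone would use more than the total capital")
    return gross_i * KEEP_I                # j = 0.8 i


j = net_infrastructure(60)
print(j, float(j))
```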

EXERCISES.

1. Find the solution(s) to the equation 3x − 4 = 4x − 3.

2. Find all solutions to the equation

$\frac{3}{x} + 4 = \frac{5}{x}$.

3. Find all solutions to the equation

$(x + 3)(x - 2) + 4 = x^2 + 7x + 3$.

4. The state and local sales tax is currently 10%. What are the original prices of items
whose total purchase prices (including the sales tax) are $1, $2, and $5 ?
5. Find the solutions to the equation 4x + 0.2 = 1.6x − 0.3.
6. Find all solutions to the equation

$\frac{3 + x}{x + 2} = \frac{5 + x}{x}$.

7. Suppose that when a quantity q is to be sold, the price acceptable to consumers, in
dollars, is p = 40 − 0.001q. (a) For what values of q does this relation make sense (hint:
prices cannot drop below zero)? (b) Graph the price as a function of quantity. (c) What
is the price at which 5,000 units are purchased?
8. Suppose that when a quantity q is to be manufactured, the price acceptable to producers,
in dollars, is p = 10 + 0.003q. (a) Graph the price as a function of quantity. (b) What is
the price at which 5,000 units are manufactured? (c) What is the price at which 10,000
units are manufactured?
9. Suppose demand and supply are as in problems 7 and 8. (a) What is the quantity about
which consumers and producers agree? (b) What is the market price on which consumers
and producers agree?
10. A clinic spends 140 thousands of dollars a year on each doctor and 80 thousands of
dollars a year on each nurse. (a) Write an equation relating the number of doctors and
nurses that the clinic can hire with a budget of 1.4 million dollars a year. (b) Graph the
relation in part (a).
11. A consumer buys snacks and beverages at a convenience store. The tax rate on snacks
is 10 percent and the tax rate on beverages is 15 percent, and these taxes are added to
the price listed on the shelves in the store. Suppose the consumer will spend 10 dollars.
(a) What is the listed price on snacks if all the money is spent on snacks? (b) What is
the listed price on beverages if all the money is spent on beverages? (c) Write the budget
constraint for this consumer in terms of the listed shelf prices.

B. Quadratic Expressions and Equations.

A relation is called quadratic in the variable, when the variable appears only with
coefficients multiplying it and its second power.
Symbolically, a quadratic relation using a variable x has the form

$y = a x^2 + b x + c$

where a, b, and c are numbers. The term $a x^2$ is called the second order term.
An equation involving only a quadratic relation is called a quadratic equation.

Example 1. The total energy, E, used to travel 10 miles at a speed of v in a particular
car is
$E = 0.0005v^2 - 0.05v + 1.45$
where v is measured in miles per hour (mph) and E is measured in gallons of gasoline.
Calculate the amount of energy used when traveling at 50, 60, and 70 mph.
Solution: Using 50, 60, and 70 for v we find that the energy used is 0.2 gallons, 0.25
gallons, and 0.4 gallons, respectively.
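
The three values come from substituting each speed into the quadratic expression. The lines below are a minimal Python sketch of that substitution; the function name `energy` is an illustrative assumption.

```python
def energy(v):
    """Gallons used to travel 10 miles at v mph, according to the model in Example 1."""
    return 0.0005 * v**2 - 0.05 * v + 1.45


for v in (50, 60, 70):
    print(v, round(energy(v), 2))
```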
Example 2. The simplest equation involving a quadratic expression is of the form
$x^2 = 6$. Find the values of x that solve this equation.
Solution: There are two quantities whose square is 6, namely $x_1 = \sqrt{6}$ and $x_2 = -\sqrt{6}$.
We could approximate the square root (we chose 3 digits here) by $\sqrt{6} \approx 2.45$.

Example 3. In the setting of example 1, find how fast the car can travel if the total
energy available is 0.5 gallons.
Solution: We use the same variables: E for the energy, in gallons, and v for the
speed, in mph. We still have $E = 0.0005 v^2 - 0.05 v + 1.45$. We want to find v with
$0.0005 v^2 - 0.05 v + 1.45 = 0.5$. We will solve this equation by putting it into a form that
involves only a squared quantity and numbers (that is, there will be no first order term).
To simplify the numbers subtract 0.5 from both sides and then multiply everything by
2000:
$v^2 - 100v + 1900 = 0.$
Now notice that we can get the terms involving v in the combination $v^2 - 100v$ by squaring
$(v - 50)$ because $(v - 50)^2 = v^2 - 100v + 2500$:

$v^2 - 100v + 1900 = v^2 - 100v + 2500 - 2500 + 1900 = (v - 50)^2 - 2500 + 1900 = (v - 50)^2 - 600.$

Back in the equation:

$(v - 50)^2 - 600 = 0$, so $(v - 50)^2 = 600$, and $v - 50 = \sqrt{600}$ or $v - 50 = -\sqrt{600}$.

Finally solving for v we get two solutions $v_1 = 50 - \sqrt{600}$ and $v_2 = 50 + \sqrt{600}$. We could
approximate the square root of 600 to get $v_1 \approx 25.505$ and $v_2 \approx 74.495$ (we chose 5 digits
arbitrarily). These are the two speeds at which it takes 0.5 gallons to travel 10 miles
(according to our model of fuel consumption).
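
The completing-the-square steps above can be sketched in code. This is an illustration under the assumption that the equation has already been scaled so that the coefficient of the squared term is 1; the function name and the parameters p and q are ours.

```python
import math

def solve_by_completing_square(p, q):
    """Solve v^2 + p*v + q = 0 by rewriting it as (v + p/2)^2 = (p/2)^2 - q."""
    shift = p / 2
    rhs = shift**2 - q          # the number left on the right hand side
    if rhs < 0:
        return []               # a square cannot equal a negative number
    root = math.sqrt(rhs)
    return sorted({-shift - root, -shift + root})

# The equation from Example 3: v^2 - 100v + 1900 = 0.
print([round(v, 3) for v in solve_by_completing_square(-100, 1900)])
```

With p = −100 and q = 1900 the shift is −50 and the right hand side is 600, matching the hand calculation.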

Any quadratic equation in the variable x can be put into the standard form

$a x^2 + b x + c = 0.$

And this equation has 0, 1, or 2 (real) solutions. The easiest way to find the solutions is
to remember the following formula, which is called the quadratic formula.

$x_1 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}, \quad x_2 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}.$

When $x_1$ and $x_2$ make sense, we get a solution, and if they are different, then we have two
solutions.
Notice that the quadratic formula actually tells us how many (real) solutions we have.
Namely, if $b^2 - 4ac < 0$ then $\sqrt{b^2 - 4ac}$ does not make sense as a real number and there
is no solution. If $b^2 - 4ac = 0$ then $\sqrt{b^2 - 4ac} = 0$ and $x_1 = x_2$ and there is a unique
solution. Finally, if $b^2 - 4ac > 0$ then $\sqrt{b^2 - 4ac} > 0$ and there are two different solutions.
Terminology: For the equation $ax^2 + bx + c = 0$, the number $b^2 - 4ac$ is called the
discriminant.
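
The three cases of the discriminant translate directly into a small routine. The sketch below is one possible way to organize the calculation; the function name real_solutions is our own.

```python
import math

def real_solutions(a, b, c):
    """Return the real solutions of a*x^2 + b*x + c = 0 (a != 0) in increasing order."""
    if a == 0:
        raise ValueError("not a quadratic equation: a must be non-zero")
    disc = b * b - 4 * a * c             # the discriminant
    if disc < 0:
        return []                        # no real solution
    if disc == 0:
        return [-b / (2 * a)]            # one (repeated) solution
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

print(real_solutions(1, 6, 11))       # negative discriminant
print(real_solutions(1, 6, 9))        # zero discriminant
print(real_solutions(1, -100, 1900))  # positive discriminant
```

Because the routine works with floating-point numbers, a discriminant that is zero in exact arithmetic may come out slightly positive or negative for some inputs, so the count of solutions should be read with that in mind.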

It is useful to understand the origin of the quadratic formula, and also when it is easier
not to use it.
Example 4. A square swimming pool has a liner held by a border with width 1 foot.
The total area of the structure may not exceed 150 square feet (there are regulations on
the total impermeable surface area allowed on this site). How long a side can the pool
have?
Solution: Let x denote the length of a side and A denote the area covered by the pool
(with its border). Then the length of the side with the border of 1 foot on each end is x + 2
and the area of the whole structure is $A = (x + 2)^2$. Since the allowed area is A = 150, the
longest allowed side has length x with $(x + 2)^2 = 150$. We could expand the expression
on the left hand side and put this equation into standard form, but that would be silly
because we can simply take the square root of both sides:

$x + 2 = \sqrt{150}$ or $x + 2 = -\sqrt{150}$. So $x_1 = -2 + \sqrt{150}$ or $x_2 = -2 - \sqrt{150}$.

And since the length of a side of the pool only makes sense if it is a positive number, we
have the solution $x = \sqrt{150} - 2 \approx 10.247$ feet.
The moral of this example is that one of the operations involved in solving a quadratic
equation is taking a square root.
Example 5. This is a fairly involved example showing that sometimes it is nice to
write quadratic expressions in special forms. It is provided to give a better understanding
of the ideas involved in examining quadratic expressions. Show that for every value of x
the combination $u = x^2 + 6x + 11$ is positive, by showing that $u > l^2$ for some expression l.
Solution: Let us pause for a moment, since this is a new and fairly sophisticated idea.
So first, remember that any number squared is non-negative. Hence $u > l^2$ would really show
that u is positive. We are working under the assumption that $l^2$ plus something positive
would give u. Now u has an $x^2$ term, so we might as well try l = x + something. We
reverse engineer this by plugging in:

$l^2 = (x + \text{something})^2 = x^2 + 2x \cdot \text{something} + \text{something}^2.$

Taking another look at u tells us to guess that “something” should be 3. This gives

$l^2 = (x + 3)^2 = x^2 + 2 \cdot 3x + 3^2 = x^2 + 6x + 9.$

Comparing one last time to u we see that $u = (x + 3)^2 + 2$. So we found that u ≥ 2 which is
certainly positive. Incidentally, we discovered that $x^2 + 6x + 11 = 0$ has no real solutions.
So we now understand why some quadratic equations have no solutions.
Example 6. Graph the relation $y = x^2 + 6x + 11$. Can you modify the variable y by
adding or subtracting a number c so that the equation y + c = 0 has exactly one solution?
Solution: Notice on the graph of $y = x^2 + 6x + 11$ below that the lowest value of y is
achieved at (−3, 2).

[Figure: the graph of $y = x^2 + 6x + 11$, with the vertex at (−3, 2).]

With c = −2, $u = y + c = x^2 + 6x + 11 - 2 = x^2 + 6x + 9 = (x + 3)^2$. The equation
$(x + 3)^2 = 0$ has the unique solution x = −3 (why?). Graphically, we’ve moved the graph
of y down by two units to get the graph of u.
Example 7. For the graph $y = ax^2 + bx + c$, the point with $x = -b/(2a)$ and
$y = c - b^2/(4a)$ is called the vertex of the quadratic expression. Show that this point
represents a minimum value for y if a > 0 and a maximum value for y if a < 0.
Solution: One can rewrite $y = ax^2 + bx + c$ as

$y = ax^2 + bx + c = a\left(x + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}.$

The first term is always non-negative when a > 0, and it is zero at x = −b/(2a), so x = −b/(2a)
yields the minimum value of $y = c - b^2/(4a)$. The first term is always non-positive when a < 0,
so x = −b/(2a) yields a maximum value for y.
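
The vertex formulas are easy to evaluate numerically. Here is a small sketch (the function name vertex is ours) applied to the quadratic of Example 6 and, as a further illustration, to the fuel model of Example 1.

```python
def vertex(a, b, c):
    """Return the vertex (x, y) of y = a*x^2 + b*x + c, assuming a != 0."""
    x = -b / (2 * a)
    y = c - b * b / (4 * a)
    return x, y

print(vertex(1, 6, 11))             # the quadratic from Example 6
print(vertex(0.0005, -0.05, 1.45))  # the fuel model from Example 1
```

For the fuel model a > 0, so the vertex is a minimum: under that model the least fuel for the 10 mile trip is used at about 50 mph, where the model gives about 0.2 gallons.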

EXERCISES.

1. Find the solutions to the equation $x^2 = 144$.

2. Find the solutions to the equation $x^2 - 6x + 9 = 49$.
3. Find the solutions to the equation $2x^2 - 12x + 18 = 98$.
4. Find the solutions to the equation $3x^2 + 2x - 1 = 0$.
5. Find the solutions to the equation $-3x^2 + 2x + 7 = 1$.
6. Find all solutions to the equation

$\frac{3 + x}{x} = \frac{5}{x + 2}.$

7. Suppose a ≠ 0. Calculate

$\left(x + \frac{b}{2a}\right)^2.$

8. Suppose a ≠ 0. Show that $ax^2 + bx + c = 0$ can be rewritten as

$\left(x + \frac{b}{2a}\right)^2 = \left(\frac{b}{2a}\right)^2 - \frac{c}{a}.$

9. Suppose a ≠ 0. Show that

$\left(\frac{b}{2a}\right)^2 - \frac{c}{a} = \frac{b^2 - 4ac}{(2a)^2}.$

10. Suppose a ≠ 0 and $b^2 - 4ac \ge 0$. Show that $ax^2 + bx + c = 0$ can be rewritten as

$\left(x + \frac{b}{2a}\right)^2 = \left(\frac{\sqrt{b^2 - 4ac}}{2a}\right)^2.$

11. The height, in feet, of a bottle tossed upward at a speed of 7 feet per second from
the 15th floor of a building at time t seconds is $h = 180 + 7t - 16t^2$. At what time does
the bottle reach the ground (h = 0 represents the ground)?
12. The velocity, in feet per second (directed upward), of a bottle tossed upward at a
speed of 7 feet per second from the 15th floor of a building at time t seconds is v = 7 − 32t.
With what velocity does the bottle reach the ground? (The time at which the bottle
reaches the ground was calculated in the previous exercise.) Why would you expect this
velocity to be negative (what is the meaning of the sign)?
13. A milk carton in the shape of a (closed) rectangular box will have a square base
with side length x and a height of 16 centimeters. What is the length of the base if the
carton is to contain 946 cubic centimeters (1 quart)?
14. An open box is made out of a square sheet of cardboard that is originally 12
inches long on each side. To make the box, a (small) square with side length x is cut out
of each corner, and then the sides are folded up. Is it possible to create such a box whose
volume is 100 cubic inches? 120 cubic inches? 140 cubic inches? If it is possible to create
the box, calculate the size of the cut (x).
15. When the quantity of a good sold is x, the price for which each unit sells is
p = 1200 − 3x. (a) For which range of values of the quantity of the good does this price
relationship make sense? (b) Revenue is quantity times price. For what quantities of this
good is the revenue 90, 000? 110, 000? 130, 000?
16. Suppose that the supply for raspberries relates price (p, measured in dollars per
pound) to quantity (q, the number of pounds produced) according to
$p = 1 + 0.002q - 0.0000001q^2$. (a) Find the value of q at the vertex of this quadratic. (b) Explain why this
description of supply only makes sense for a quantity below this vertex value. Hint: if the
quantity supplied is greater, would the price have to be higher or lower?
17. Suppose that the demand for raspberries relates price to quantity (pounds consumed)
according to $p = 21 - 0.004q + 0.0000001q^2$. (a) Find the value of q at the vertex of
this quadratic. (b) Explain why this description of demand only makes sense for a quantity
below this vertex value.
18. Suppose that the supply for raspberries relates price (p, measured in dollars per
pound) to quantity (q, the number of pounds produced) according to
$p = 1 + 0.002q - 0.0000001q^2$, as above. Suppose that the demand for raspberries relates price to quantity
(pounds consumed) according to $p = 21 - 0.004q + 0.0000001q^2$, also as above. What are
the quantity and price on which producers and consumers agree?
19. Calculate the revenue to producers when the supply and demand for raspberries
are (as above) $p = 1 + 0.002q - 0.0000001q^2$ and $p = 21 - 0.004q + 0.0000001q^2$.

C. Polynomials and Equations with Powers (that we can solve).

A polynomial is a function that involves powers of the variable with natural numbers
as exponents. The general form of a polynomial in x is

$y = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots + c_n x^n.$

The polynomial above is called a polynomial of degree n (the highest power that appears,
so $c_n \neq 0$).
Example 1. In discussions of developing economies the ratio of capital to population
plays a role. Let r denote the ratio of the total capital in an economy to the total number
of people and suppose that with income, I, $r = 0.3 I^3 - 100$. Is capital per person rising
with income?
Solution: Since the relation here makes sense for I ≥ 0 let us assume that we have two
positive numbers, a and b with a < b. We claim that then $a^3 < b^3$. To see this write

$b^3 = b \times b \times b > a \times b \times b > a \times a \times b > a \times a \times a = a^3.$

Hence r is increasing with I. Discussion: It actually makes sense that personal income
and capital per person will rise together because for any individual if there is more capital
invested (in tools or machines or infrastructure) then the individual will be able to produce
more and have a greater income. And if the individual has more income then she or he
can save more, and hence capital increases.
Example 2. An investment grows by 5% each year for 3 years. Suppose the initial
amount invested is A. Calculate the amount at the end of the period.
Solution: Let M be the amount at the end of the 3 year period. The investment grows
by a factor of 1.05 each year, because after one year there is the principal to which is
added 0.05 of the principal. That is, after one year we have A + 0.05A = 1.05A. After the
second year the amount is multiplied by 1.05 again, and after the third year the amount
is multiplied by 1.05 a third time. Hence $M = 1.05^3 A$. Discussion: this relation involved
a power ($1.05^3 = 1.157625$) but M = 1.157625A is a linear function.
Example 3. An investment has a rate of return of r each year for 3 years. Suppose
the initial amount invested is 1000 dollars. Calculate the amount at the end of the period.
Solution: Let M be the amount at the end of the 3 year period. As in the previous
example, the investment grows by a factor of 1 + r each year. So after one year the
amount is $1000(1 + r)$, after two years the amount is $1000(1 + r)^2$, and after 3 years,
$M = 1000(1 + r)^3$.
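
The pattern in Examples 2 and 3 (multiply by the growth factor once per year) can be sketched as a one-line function. The name amount_after and its parameter names are our own.

```python
def amount_after(principal, rate, years):
    """Amount after compounding once a year at the given rate for a whole number of years."""
    return principal * (1 + rate) ** years

# Example 3 with the rate of Example 2: 1000 dollars at 5% for 3 years.
print(round(amount_after(1000, 0.05, 3), 3))
```

With principal fixed, the result is a polynomial of degree 3 in the rate; with the rate fixed, it is a linear function of the principal, as the discussion in Example 2 points out.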

We will examine three types of equations involving powers that we can solve exactly.
The first type is that of a polynomial whose factors we can guess. The second type is that
of an equation in which the variable appears only once (with some power of it). The third
type is that of an equation that is quadratic after some manipulation of the variable.
Example 4. Find all values of x for which $x^4 + x^3 = 0$.
Solution: We notice that there is a factor of $x^3$ in each term on the left hand side of the
equation. Thus we can re-write $x^4 + x^3 = x^3 (x + 1)$, and the equation is $x^3 (x + 1) = 0$.
Since when two numbers are multiplied the result is zero only when (at least) one of the
numbers is zero, the solutions are x = 0 and x = −1.
Example 5. Find the value of x for which $x^{11} = 2048$.
Solution: Because the variable only appears once, we can take the 11th root of both sides
to get $x = 2048^{1/11} = 2$.
Example 6. Find all values of x for which $x^6 + 3x^3 + 1.25 = 0$.
Solution: This looks impossible until we notice that it is really a quadratic equation for
$x^3$ because $x^6 = (x^3)^2$. Using the quadratic formula we find that

$x^3 = \frac{-3 - \sqrt{3^2 - 4 \cdot 1 \cdot 1.25}}{2} = \frac{-3 - 2}{2} = -2.5$, or $x^3 = \frac{-3 + 2}{2} = -0.5$.

From the first of these we get $x = (-2.5)^{1/3} \approx -1.3572$ and from the second
$x = (-0.5)^{1/3} \approx -0.7937$.
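
The substitution used in this example (treat $x^3$ as the unknown, then take cube roots) can be sketched as follows. The helper names are ours, and the cube root of a negative number is taken by hand because raising a negative float to the power 1/3 in Python does not return the real root.

```python
import math

def real_cube_root(u):
    """Real cube root of u, valid for negative u as well."""
    return math.copysign(abs(u) ** (1 / 3), u)

def solve_quadratic_in_cube(a, b, c):
    """Real solutions of a*x^6 + b*x^3 + c = 0 via the substitution u = x^3."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    u_values = {(-b - root) / (2 * a), (-b + root) / (2 * a)}
    return sorted(real_cube_root(u) for u in u_values)

print([round(x, 4) for x in solve_quadratic_in_cube(1, 3, 1.25)])
```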
Example 7. A treasury note pays $10,000 in 5 years and costs $8,500. Assume that
the interest is compounded once a year. What is the annual interest rate for this note?
Solution: Write r for the annual return on the investment as a proportion of the value at
the beginning of the year. Then the value after 1 year is (1 + r) × (starting value) and
the value after 5 years is $(1 + r)^5$ × (starting value). So our equation for r is

$8{,}500 \times (1 + r)^5 = 10{,}000.$

We can solve for r by dividing both sides by 8,500 and taking the fifth root:

$(1 + r)^5 = \frac{10{,}000}{8{,}500} = \frac{20}{17}, \quad 1 + r = \left(\frac{20}{17}\right)^{1/5}, \quad r = \left(\frac{20}{17}\right)^{1/5} - 1 \approx 0.03304.$

So the annual interest rate is approximately 3.304 percent.
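
The same calculation works for any price, payoff, and whole number of years. A minimal sketch, assuming annual compounding as in the example (the function name annual_rate is ours):

```python
def annual_rate(price, payoff, years):
    """Annual rate r with price * (1 + r)**years == payoff, compounding once a year."""
    return (payoff / price) ** (1 / years) - 1

r = annual_rate(8500, 10000, 5)
print(round(100 * r, 3), "percent")
```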
Example 8. Find all values of x for which $|x| = x^2 - 3$.
Solution: To put this equation into power-land, where we have a chance of finding a
solution, we can recall that either x < 0 in which case |x| = −x, or x ≥ 0 and then |x| = x.
So we will consider these two cases separately.
Suppose first that x < 0. Then the equation becomes $-x = x^2 - 3$ or $x^2 + x - 3 = 0$. This
is a quadratic equation with solutions $x_1 = (-1 - \sqrt{13})/2$ and $x_2 = (-1 + \sqrt{13})/2$. Of
these only $x_1$ satisfies our assumption that x < 0, and $x_2$ is not a valid solution.
Now consider the other possibility, namely x ≥ 0. Then the equation becomes $x = x^2 - 3$
or $x^2 - x - 3 = 0$. This is a quadratic equation with solutions $x_1 = (1 - \sqrt{13})/2$ and
$x_2 = (1 + \sqrt{13})/2$. Of these only $x_2$ satisfies our assumption that x ≥ 0, and $x_1$ is not a
valid solution.
The conclusion is that the original equation has two solutions, $x_1 = (-1 - \sqrt{13})/2$ and
$x_2 = (1 + \sqrt{13})/2$. Discussion: Even though the original equation involved a function that
is not a polynomial, we converted it to (two) equations involving polynomials.
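
The case analysis in this example can be mirrored step by step in code. The sketch below is specific to this one equation; the function name is our own.

```python
import math

def solve_abs_equation():
    """Solve |x| = x^2 - 3 by treating the cases x < 0 and x >= 0 separately."""
    root = math.sqrt(13)  # both quadratics have discriminant 1 + 12 = 13
    solutions = []
    # Case x < 0: -x = x^2 - 3, that is x^2 + x - 3 = 0.
    for x in ((-1 - root) / 2, (-1 + root) / 2):
        if x < 0:
            solutions.append(x)
    # Case x >= 0: x = x^2 - 3, that is x^2 - x - 3 = 0.
    for x in ((1 - root) / 2, (1 + root) / 2):
        if x >= 0:
            solutions.append(x)
    return solutions

print([round(x, 4) for x in solve_abs_equation()])
```

Each candidate is kept only when it satisfies the assumption of the case that produced it, which is exactly how the two invalid candidates are discarded in the solution.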
Example 9. Find all values of x for which

$\frac{x + 3}{x + 1} + \frac{x}{x + 4} = 2.$

Solution: To clear the denominators, we multiply both sides by (x+1)(x+4). The equation
becomes

$\frac{x + 3}{x + 1}(x + 1)(x + 4) + \frac{x}{x + 4}(x + 1)(x + 4) = 2(x + 1)(x + 4),$

so $(x + 3)(x + 4) + x(x + 1) = 2(x + 1)(x + 4) = 2x^2 + 10x + 8$,

or $x^2 + 7x + 12 + x^2 + x = 2x^2 + 10x + 8$. This simplifies further to
$2x^2 + 8x + 12 = 2x^2 + 10x + 8$ and to 4 = 2x, so x = 2. Checking this solution in the original equation we
have (2 + 3)/(2 + 1) + 2/(2 + 4) = 5/3 + 2/6 = 5/3 + 1/3 = 2 which is correct.
Example 10. Find all values of x for which $\sqrt[3]{x + 3} = 2$.
Solution: As written the equation involves roots rather than powers. But we can use
the third power to remove the third root. After cubing both sides the equation becomes
$x + 3 = 2^3 = 8$. Thus the solution is x = 5.

Example 11. Find all values of x for which $\sqrt{x + 3} = 2x + 1$. Sketch the graphs of
the right hand side and left hand side on the same set of axes.
Solution: Notice that the right hand side must be non-negative for the equation to make
sense (and so 2x + 1 ≥ 0) and that x ≥ −3 is required to make the left hand side
mathematically meaningful. With these conditions in mind, no solution is lost by squaring
both sides, although squaring may produce candidates that violate 2x + 1 ≥ 0. We get
$x + 3 = 4x^2 + 4x + 1$ or $0 = 4x^2 + 3x - 2$.
The candidate solutions are

$x_1 = \frac{-3 + \sqrt{41}}{8} \approx 0.4254, \quad x_2 = \frac{-3 - \sqrt{41}}{8} \approx -1.1754$

and it is clear that the second of these has 2x + 1 < 0. So the only solution is
$x = (\sqrt{41} - 3)/8$.
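
Squaring both sides can introduce candidates that do not solve the original equation, so each candidate should be checked. The following sketch does this for the equation of this example; the function name is ours.

```python
import math

def solve_sqrt_equation():
    """Solve sqrt(x + 3) = 2x + 1 by squaring, then discard extraneous candidates."""
    # Squaring both sides gives 4x^2 + 3x - 2 = 0.
    a, b, c = 4, 3, -2
    root = math.sqrt(b * b - 4 * a * c)
    candidates = [(-b + root) / (2 * a), (-b - root) / (2 * a)]
    # Keep only candidates for which both sides are defined and actually agree.
    return [x for x in candidates
            if x + 3 >= 0 and math.isclose(math.sqrt(x + 3), 2 * x + 1, abs_tol=1e-9)]

print([round(x, 4) for x in solve_sqrt_equation()])
```

The second candidate is rejected because its right hand side, 2x + 1, is negative while a square root is never negative.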

[Figure: the intersection of the graphs of $y = \sqrt{x + 3}$ and $y = 2x + 1$, with the solution marked.]

EXERCISES.

1. Calculate the solutions to the equation $x^3 = -8$.

2. Calculate the solutions to the equation

$\frac{3 + x}{x} = \frac{5}{x^2}.$

3. Calculate the solutions to the equation $x^4 + 3x^2 - 2 = 0$. Keep in mind that $x^2$
must be a non-negative number.

√
4. Calculate the solutions to the equation x = x2 + 5. Graph the two sides of the
equation on the same set of axes.
√
5. Calculate the solutions to the equation x + 2 = x + 8. Graph the two sides of the
equation on the same set of axes.
6. Calculate the solutions to the equation $x^6 + 4x^3 = 32$.
7. Calculate the solutions to the equation $3x^4 + x^2 - 14 = 0$.
8. Calculate the solutions to the equation x + 2 = |x − 2|. Graph the two sides of the
equation on the same set of axes.
√
9. Calculate the solutions to the equation |x − 3| = x. Graph the two sides of the
equation on the same set of axes.
10. Calculate the solutions to the equation

$\frac{3}{x + 4} = \frac{5}{x + x^2}.$

11. Calculate the solutions to the equation $x^2 (x + 3) (x - 2) (x - 4) (x - 5) = 0$.

12. Calculate the solutions to the equation $x (x + 2) = x^2 (x + 2)$.
13. Calculate the solutions to the equation $x^2 (x + 2) = (2x - 1) (x + 2)$.
14. Calculate the solutions to the equation $x^2 (x + 2) = 3 (2x^2 + x - 6)$.
15. Calculate the solutions to the equation $x^{1/3} = 3$.
√
16. Calculate the solutions to the equation x + 2 = 3.
17. A car sales business will lease a property and has 5080 dollars (a month) budgeted
for this. The property will be a fenced square lot. Such lots lease for 2 dollars a square
foot (per month), and the fencing rents for 0.4 dollars a linear foot (per month). What
size lot can the business afford with this budget?
18. A rectangular box will have a square base and will be twice as tall as it is wide
along one side of the base. What dimensions give the box a volume of 54 cubic feet?
19. A rectangular box will have a square base and its height will be 75 percent of its
width along one side of the base. What dimensions give the box a volume of 54 cubic feet?
20. Calculate the solutions to the equation

$\frac{6}{2x^3 + x^2 + 4} = \frac{3}{x^3 + x^2}.$

21. Calculate the solutions to the equation $(x^2 + 3) (x + 3) (x^2 - 2x - 8) = 0$.

22. Calculate the solutions to the equation $(x^2 + 1)^2 - 5x^2 = 1$.

D. Proportions, and Inverse Proportions.

The notion of proportions is important because it characterizes the nature of the

relationship among variables. For instance, a reasonable measure of production should
increase by the same factor as the ingredients. That is, if we have two factories that are
exactly the same and each factory uses the same materials, labor, and services, then the
total production should be double that of one such factory.

Definition. Two quantities are proportional when their ratio is constant. Symbol-
ically, f and x are proportional when there is a non-zero constant k such that f = k x.
Two quantities are inversely proportional when one is proportional to the reciprocal
of the other. Symbolically, f and x are inversely proportional when there is a non-zero
constant k such that f = k /x.
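
Given a table of paired values, the definition suggests a direct test: the ratios f/x must all agree for proportional quantities, and the products f·x must all agree for inversely proportional ones. The sketch below is an illustration only; the function names and the sample numbers are hypothetical, and it assumes all x values are non-zero.

```python
import math

def proportionality_constant(xs, fs):
    """Return k when every pair satisfies f = k * x with one non-zero k, else None."""
    ratios = [f / x for x, f in zip(xs, fs)]
    k = ratios[0]
    if k != 0 and all(math.isclose(r, k) for r in ratios):
        return k
    return None

def inverse_proportionality_constant(xs, fs):
    """Return k when every pair satisfies f = k / x with one non-zero k, else None."""
    products = [f * x for x, f in zip(xs, fs)]
    k = products[0]
    if k != 0 and all(math.isclose(p, k) for p in products):
        return k
    return None

xs = [1, 2, 4]                                          # hypothetical sample values
print(proportionality_constant(xs, [3, 6, 12]))         # ratios are all 3
print(inverse_proportionality_constant(xs, [12, 6, 3])) # products are all 12
print(proportionality_constant(xs, [12, 6, 3]))         # ratios differ
```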

Example 1. The area of a disk is proportional to the square of the radius of the disk.
Write this using variables.

Solution: Let A denote the area and let r denote the radius. Then $A = k \cdot r^2$ for some
constant k. As the reader knows, the constant in this case is the number π.

Example 2. The amount that can be borrowed at a certain interest rate and with a
fixed number of payments is proportional to the size of the monthly payments. Write this
using variables.

Solution: Let A denote the amount that can be borrowed and let P denote the size of each
payment. Then A = k · P , where k is a constant. (Here the value of k would change if the
interest rate or the number of payments changed.)

Example 3. In a certain industry, the labor required to keep production at a certain

level is inversely proportional to the amount of capital invested. Express this as a relation
among variables.

Solution: Let L denote the amount of labor needed, and let C denote the amount of capital
needed. When the production is constant, L = k /C for some constant k. (Alternatively,
L · C = k.)

EXERCISES.

1. The growth in savings in a country is proportional to the income of people in this

country. Express this as a relation among variables.

2. The area of a rectangle with sides of lengths x and y is kept at 230, so x · y = 230.
What is the relation of y and x? Are these variables proportional? inversely proportional?

3. The annual increase in a population in a certain country is proportional to the

population size. Express this as a relation among variables.

4. The length of a shadow is proportional to the height of the object casting the
shadow. Express this as a relation among variables.

5. Environmental quality is proportional to the square root of technological innovation

(which includes cleaner energy production and chemical production). Express this as a
relation among variables.

6. Environmental quality is inversely proportional to the square root of manufacturing

(due to the pollution produced). Express this as a relation among variables.

7. The inequality in income distribution is proportional to the difference between the

rate of return on investment and the rate of economic growth. Express this as a relation
among the three variables (income inequality, the rate of return on investment, and the
rate of economic growth).

8. In developed countries the annual rate of growth of the population is inversely

proportional to the cube root of income. Express this as a relation among variables.

9. A person’s foot is (typically) 12 percent longer than her or his hand. Express this
as a relation among variables.

10. The length of an animal’s tongue is inversely proportional to the square of the
tongue’s diameter (even though the tongue has no bones, it can be moved by muscle
contractions because its volume is constant). Express this as a relation among variables.

11. An animal’s lifespan is inversely proportional to its heart rate (the number of
times the heart beats in a minute). Express this as a relation among variables. (Animals
that have long maturation periods, such as humans, are an exception to this rule.)
5. Functions.

We have examined various relations between variables. These included linear rela-
tions, quadratic relations, polynomials, and some non-integer powers. The notion of a
function captures those relations for which it is possible to think of one variable having
been produced from another.
This way of thinking about a relation between variables allows one to consider how
one variable changes with the other, and it also allows us to think of complicated relations
in terms of simpler component relations. Of course, the notion of one quantity depending
on another also reflects what we think is actually happening in some situations.

A. Definition of a function.

A function is a mapping from one set to another that assigns a unique element of
the second set to each of the elements in the first set. For a function f from a set A to a
set B we write f : A → B.
The domain of a function is the set containing the elements on which the function
operates, the set A above. To each element in the domain the function assigns a
corresponding element in the co-domain, the set B above.
The range of a function is the collection of elements that are assigned to elements of
the function’s domain.
Of particular interest to us will be functions whose domain is some interval or union
of intervals of (real) numbers, and whose range also consists of a collection of numbers.
For functions that assign a value to a number, the number in the domain is called the
independent variable and the value assigned is called the dependent variable.

Example 1. The area of a circle can be viewed as a function of the circle’s radius.
In this example, the area, A, is the dependent variable, the radius, r, is the independent
variable, and the function is given by the formula $A = f(r) = \pi r^2$. The domain is all
non-negative numbers (lengths are either zero or positive), and the co-domain is all
non-negative real numbers (areas are either zero or positive). The range is all of the co-domain.

Example 2. To each number x assign a number y so that $y^2 = x$. Is this a function?
If this is a function, what is its domain?

Solution: This is not a function because the assignment is not unique. For example, if
x = 4, then there are two values for y, namely $y_1 = -2$ and $y_2 = 2$.

Example 3. To each number x assign a number y so that y = 1/x. Is this a function?

If this is a function, what is its domain?
Solution: The operation 1/x produces a number as long as x ≠ 0. So f(x) = 1/x is a
function whose domain is x ≠ 0.

Example 4. For each of the following expressions decide whether f(x) is a function
of x, and if so find its domain.

(a) $f(x) = \frac{\sqrt{x}}{x^2 + 3}$, (b) $f(x) = \frac{x}{x + 1}$, (c) $f(x) = \sqrt{x + 5}$, (d) $f(x) = \frac{\sqrt{x^3}}{x^2 - 9}$.

Solution: Each of these is a function. The domains are: (a) x ≥ 0 so that the square root
is defined. (b) x ≠ −1 so that x + 1 ≠ 0 and so that we can divide by x + 1. (c) x ≥ −5
so that the square root is defined. (d) 0 ≤ x and |x| ≠ 3, since we need the square root to
be defined (so $x^3 \ge 0$) and also need the denominator to be a number other than zero.

Example 5. Find the range of the function y = f (x) = 1/x.

Solution: Notice that 1/x is never zero. For any non-zero value of y there is an x with
f(x) = y. To find this x we solve 1/x = y getting 1 = x y and x = 1/y. The range is thus y ≠ 0.
Example 6. Find the range of the function $y = f(x) = \sqrt{x - 4}$.
Solution: Notice that the domain is x ≥ 4. If y is zero or a positive value, there is an x
with f(x) = y, namely $x = y^2 + 4$. If y is negative, then there is no value whose square
root is y. The range is thus y ≥ 0.

Example 7. Define f by the rule

$f(x) = \begin{cases} x^2, & x > 0 \\ -x, & x \le 0 \end{cases}$

Find the domain and range of f.

The notation above means that if we start with a value of x for which x > 0, then the
function assigns the value $x^2$, and if we start with a value of x for which x ≤ 0, then
the function assigns the value −x. For instance, f(0.3) = 0.09, f(11) = 121, f(−2) = 2,
f(0) = 0, f(−3.2) = 3.2, and $f(\pi) = \pi^2$.

Solution: The function assigns a unique value to every real number, so the domain is all
numbers. The range is all non-negative values because if y = 0 then f(0) = 0 and if y > 0
then there is at least one value of x to which the function assigns y. In fact, there are two
values of x that are assigned the value y, namely $x_1 = \sqrt{y}$ and $x_2 = -y$. If y is negative,
then there is no value of x to which y is assigned.
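
A rule given by cases translates naturally into a conditional. The sketch below encodes the function of this example and, following the solution, lists the inputs that are assigned a given value y; the helper name preimages is our own.

```python
import math

def f(x):
    """Piecewise rule from Example 7: x^2 for x > 0, and -x for x <= 0."""
    return x**2 if x > 0 else -x

def preimages(y):
    """All x with f(x) == y, following the reasoning in the solution."""
    if y < 0:
        return []                 # negative values are not in the range
    if y == 0:
        return [0]
    return [-y, math.sqrt(y)]     # one input from each branch of the rule

print(f(11), f(-2), f(0), f(-3.2))
print(preimages(4))
```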

Example 8. Find the domain and range of f (x) = |x|.

Solution: The domain is all numbers. The range is all non-negative values.

A function is injective or one-to-one when it assigns a value in the range to only one
number in the domain. Symbolically, f (a) = f (b) implies that a = b.

A function is surjective or onto when the range is the entire co-domain. Symbolically,
for any y in the co-domain there is an x with f (x) = y.

Example 9. For each of the following functions take the co-domain to be all real
numbers. Decide whether the function is one-to-one and whether it is onto. (a) $f(x) = x^3$,
(b) $f(x) = x^2$, (c) $f(x) = \sqrt{x}$, (d) f(x) = 2 + |x|, (e) f(x) = 1/x, (f) f(x) = 1/|x|.
Solution: For (a), f is both one-to-one and onto. For (b), f is not one-to-one and not onto.
For instance f(2) = 4 = f(−2) so f is not one-to-one, and there is no x with f(x) = −1.2
so f is not onto. For (c), f is one-to-one but not onto. The range is y ≥ 0. For (d), f is not
one-to-one and not onto. The range is y ≥ 2. For (e), f is one-to-one but not onto. The range is y ≠ 0.
For (f), f is not one-to-one and not onto. The range is y > 0.
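
A single pair of different inputs with the same output is enough to show that a function is not one-to-one. The sketch below searches a list of sample inputs for such a pair; the function name find_collision is ours. Note the limitation: finding no pair among finitely many samples does not prove that a function is one-to-one.

```python
def find_collision(f, samples):
    """Return two distinct sample inputs with the same output, or None if none is found."""
    seen = {}
    for x in samples:
        y = f(x)
        if y in seen and seen[y] != x:
            return seen[y], x
        seen[y] = x
    return None

samples = [-2, -1, 1, 2]
print(find_collision(lambda x: x**3, samples))  # no collision among these samples
print(find_collision(lambda x: x**2, samples))  # a pair showing x^2 is not one-to-one
```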

EXERCISES.
1. To each person we associate a variable, d, depending on what the person is willing
to eat. If the person eats chicken and fish and vegetables, we assign the value d = 3. If the
person eats beef and vegetables but no fish, we set d = 4. If the person eats chicken and
vegetables but no fish and beef, we set d = 2. Otherwise, we set d = 1. Does this define a
function?
2. For each of the following functions state its domain and its range. (Here x and
√ √ √
y stand for numbers.) (a) y = x − 3, (b) y = x2 + 4, (c) y = x + 3, (d)
√
y = x2 − 4, (e) y = x3 , (f) y = x4 .
3. For each of the following functions state its domain and its range. (Here x and
y stand for numbers.) (a) $y = x^3 + x^2$, (b) $y = x^3 - x^2$, (c) $y = x^4 - 5$, (d)
$y = x^4 + x^2$, (e) $y = x^3 + x^5$. Hint: you might want to consider what happens for
specific values of x, such as x = 0, and also for large values of x or values of x very far to
the left of zero.
4. For each of the following functions state whether it is one-to-one. (a) y = 3x + 5,
√ √
(b) y = x2 + 4, (c) y = x + 3, (d) y = |x + 1|, (e) y = x3 , (f) y = x4 .
5. For each of the following functions state whether it is onto. Here we let the co-
√ √
domain be all numbers. (a) y = 3x + 5, (b) y = x2 + 4, (c) y = x, (d)
y = 4 + 5x3 , (e) y = x3 , (f) y = x4 .
6. Sketch a graph of the function f and decide whether this function is one-to-one
and whether it is onto.

$f(x) = \begin{cases} x^3, & x > 0 \\ -x^3, & x \le 0. \end{cases}$

7. Sketch a graph of the function f and decide whether this function is one-to-one
and whether it is onto.

$f(x) = \begin{cases} x + 2, & x > 0 \\ x^3, & x \le 0. \end{cases}$

8. Sketch a graph of the function f and decide whether this function is one-to-one
and whether it is onto.

$f(x) = \begin{cases} x - 3, & x > 0 \\ x^3, & x \le 0. \end{cases}$

9. Sketch a graph of the function f and decide whether this function is one-to-one
and whether it is onto.

$f(x) = \begin{cases} x, & x > 0 \\ x^3, & x \le 0. \end{cases}$

10. The US has had a census every 10 years since 1790. If you turned this into a
function, what would you choose as the independent variable and what as the dependent
variable? What is the domain? Describe a convenient co-domain.
11. Recall that a number is rational if it is the ratio of two integers. If x is rational,
set f (x) = 1. If x is not rational, set f (x) = 0. Is this assignment rule a function? If f is
a function, what is its domain? range?
12. At any given time a bank charges an interest rate on a 5 year car loan. Consider
the interest rate as a function of time. Is this function likely to be one-to-one? Interest
rates are usually multiples of 1/4 of a percent. Sketch a possible graph of the interest rate
as a function of time.
√
13. What are the domains of (a) g(x) = (x+2)/(x2 −9) (b) g(x) = ( x + 2)/(x2 −9)
√
(c) g(x) = ( x + 2)/(x + 9).

B. Graphs of Functions.

As we saw previously, we can represent any relation between two variables, x and y,
in the plane. A graph of a function f is the plot of the relation y = f (x).

Example 1. When we are not familiar with a function, a first step in sketching the
graph is to calculate the coordinates of a few of the points on the graph. For example, for
f (x) = |x|,
x      -4   -3.2   -1   0   1   2   3
f(x)    4    3.2    1   0   1   2   3

[Figure: the graph of f(x) = |x|.]

Example 2. Graph the function $f(x) = x^2$.

Some points on the graph:

x      -2   -1.5   -1   0   1   1.5    2
f(x)    4   2.25    1   0   1   2.25   4

We have graphed a quadratic relation before, so the graphing of the points in the table
and the curve corresponding to $y = x^2$ is left to the reader.

Example 3. Graph the function $f(x) = x^4 - x^2$.

Some points on the graph:

x      -1.5     -1   -0.5      0   0.5       1   1.5
f(x)   2.8125    0   -0.1875   0   -0.1875   0   2.8125

[Figure: the graph of $f(x) = x^4 - x^2$.]

Example 4. Graph the functions f (x) = x and g(x) = |x − 2| on the same set of
axes. Use the graph to approximate the values of x for which |x − 2| > x.

Some points on the graph of g:

x -1 0 1 2 3 4 5
g(x) 3 2 1 0 1 2 3

[Figure: The graphs of y = x and y = |x − 2| on the same set of axes]

We can see from the graph that \(|x - 2| > x\) for \(x < 1\). Typically a graph helps decide
the general shape of the solution to such inequalities.

Example 5. Graph the relationship \(xy = 3 + x^2\) and decide whether there is a
function so that \(y = f(x)\).

Some points on the graph:

x    -1    0    1    2     3
y    -4    ?    4    3.5   4

[Figure: The graph of xy = 3 + x^2]

The relationship does not define y as a function of x for x = 0, because there is no
point on the graph whose x-value is zero, but otherwise the relation assigns a unique value
of y to any value of x. So this relationship does define a function with domain \(x \neq 0\).
This is an example of what is sometimes called the vertical line test. That is, if a
vertical line through any x-value hits at most one point on the graph of the relation then
the relation defines y as a function of x (for those x-values in the domain).

Example 6. Graph the relationship \(4x + y = 3\) and decide whether there is a function
so that \(y = f(x)\).
The reader is asked to provide the graph. It does represent the function f (x) = 3 − 4x,
which can be seen by solving for y.

Example 7. Graph the relationship \(4x^2 + y^4 = 3\) and decide whether there is a
function so that \(y = f(x)\).
The graph, which the reader is asked to provide, is a closed oval curve (it resembles an
ellipse) and is not the graph of a function of x because there are values of x for which
there is more than one value of y in the relation. For instance \((\sqrt{2}/2, 1)\) and
\((\sqrt{2}/2, -1)\) both satisfy the relation.

Example 8. Suppose that production is \(f(x, y) = 100\,x^{1/3} y^{2/3}\) where x is the amount
of labor (in hours) and y is the amount invested (in thousands of dollars). Suppose the
manufacturer wants to produce f = 2000 units. Does this determine the investment needed
as a function of the available labor? For instance, if labor is changed from 100 hours to
110 hours, how much can be saved in investment?

Solution: With production held at 2000, the relation between labor and investment becomes
\(100\,x^{1/3} y^{2/3} = 2000\), or \(x^{1/3} y^{2/3} = 20\), or \(x y^2 = 20^3 = 8000\). The graph of
\(x y^2 = 8000\) appears below. Since y > 0 is the only possibility, this relation does determine
y as a function of x, namely \(y = \sqrt{8000/x}\).

[Figure: The graph of x y^2 = 8000]

When x = 100, \(y = \sqrt{80} \approx 8.94427\), and when x = 110, \(y = \sqrt{800/11} \approx 8.52803\), so
the additional 10 hours save about 0.41624 thousand dollars (or 416.24 dollars).
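
The arithmetic in Example 8 can be checked with a few lines of code. This is a minimal Python sketch under the assumptions of the example (production fixed at 2000 units); the function name `investment_needed` is hypothetical and not part of the text.

```python
import math

def investment_needed(labor_hours, target=2000.0):
    """Investment y (in thousands of dollars) solving 100 * x**(1/3) * y**(2/3) = target."""
    # Rearranged: y**(2/3) = target / (100 * x**(1/3)), so y is that quantity to the power 3/2.
    return (target / (100.0 * labor_hours ** (1.0 / 3.0))) ** 1.5

y_100 = investment_needed(100)
y_110 = investment_needed(110)
savings = y_100 - y_110   # in thousands of dollars

print(round(y_100, 5), round(y_110, 5), round(savings, 5))
```

Because each labor value yields exactly one positive investment value, the code also illustrates why the relation defines y as a function of x.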

Functions can often be stretched or translated to create new functions whose features
are similar to those of the original function.

For example, data compression often uses a base function to represent large-scale fea-
tures of the data and refined versions of the same function to capture small-scale features.
The collection might look something like this:

[Figure: The graphs of y = h(x), y = f(x), and y = g(x)]

The base function shown above is

\[
h(x) = \begin{cases}
0 & \text{for } x < 0 \\
x & \text{for } 0 \le x \le 1 \\
2 - x & \text{for } 1 < x \le 2 \\
0 & \text{for } x > 2
\end{cases}
\]

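The first smaller function is \(f(x) = 0.5\,h(2x)\) and the smaller bump to its right is
\(g(x) = 0.5\,h(2(x - 1))\). Defined directly in terms of x,

\[
f(x) = \begin{cases}
0 & \text{for } x < 0 \\
x & \text{for } 0 \le x \le 1/2 \\
1 - x & \text{for } 1/2 < x \le 1 \\
0 & \text{for } x > 1
\end{cases}
\qquad
g(x) = \begin{cases}
0 & \text{for } x < 1 \\
x - 1 & \text{for } 1 \le x \le 3/2 \\
2 - x & \text{for } 3/2 < x \le 2 \\
0 & \text{for } x > 2
\end{cases}
\]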

The reader should check the values of the functions at selected values of x.
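
One way to carry out this check is by computer. The following is a minimal Python sketch that compares the scaled versions of h with the direct piecewise formulas at a few sample points; the names `f_direct` and `g_direct` are hypothetical labels for the formulas written directly in terms of x.

```python
def h(x):
    """Base 'tent' function from the text."""
    if x < 0 or x > 2:
        return 0.0
    if x <= 1:
        return float(x)
    return 2.0 - x

def f(x):
    return 0.5 * h(2 * x)          # half-height, half-width copy of h

def g(x):
    return 0.5 * h(2 * (x - 1))    # the same bump shifted right by 1

def f_direct(x):
    if x < 0 or x > 1:
        return 0.0
    return float(x) if x <= 0.5 else 1.0 - x

def g_direct(x):
    if x < 1 or x > 2:
        return 0.0
    return x - 1.0 if x <= 1.5 else 2.0 - x

samples = [-0.5, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.5]
for x in samples:
    print(x, f(x), f_direct(x), g(x), g_direct(x))
```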

Example 9. Graph the functions f (x) = |x|, g(x) = |x − 3|, and h(x) = 2|x − 3| on
the same set of axes. Label the graphs.

[Figure: The graphs of y = f(x) = |x|, y = g(x) = |x − 3|, and y = h(x) = 2|x − 3|]

Notice that the graph of g is the same as that of f except shifted (to the right by 3 units),
and that the graph of h is the same as that of g except stretched vertically by a factor of 2
or, equivalently, compressed horizontally toward x = 3 by a factor of 2 (both points
of view are valid).

Example 10. The reader is asked to graph the functions \(f(x) = \sqrt{x}\), \(g(x) = \sqrt{x - 4}\),
and \(h(x) = \sqrt{x/2}\) on the same set of axes. Label the graphs.
Notice that the graph of g is the same as that of f except shifted (to the right by 4 units)
and that the graph of h is the same as that of f except vertically compressed by a factor
of \(\sqrt{2}\) or horizontally stretched by a factor of 2.

Example 11. The reader is asked to graph the functions \(f(x) = x^2\), \(g(x) = x^2 + 1\),
\(j(x) = (x/2)^2\), and \(h(x) = 4x^2\) on the same set of axes. Label the graphs.
Notice that the graph of g is shifted up from that of f. In the graph of j, compared to
that of f, the vertical direction has been compressed by a factor of 4, or, alternatively, the
horizontal direction has been stretched by a factor of 2. How does the graph of h compare?

EXERCISES.
1. Calculate at least 5 points on the graph of \(f(x) = x^2 + x\) and then sketch the
graph. Check your answer with a computer or a graphing calculator if you have access to
one.
2. Calculate at least 4 points on the graph of \(f(x) = x^2 - x\) and then sketch the
graph. Check your answer with a computer or a graphing calculator if you have access to
one.
3. Calculate at least 5 points on the graph of \(f(x) = x^3 - x\) and then sketch the
graph. Check your answer with a computer or a graphing calculator if you have access to
one.
4. Sketch the graph of \(f(x) = 1/x\) with values of x between −5 and −0.1, and for x
between 0.1 and 5.
5. For which numbers is \(f(x) = 1/(x^2 - 1)\) defined? Sketch the graph of f with
−3 ≤ x ≤ 3 and with reasonable values of x.
6. Sketch the graph of f(x) = 1 + x with values of x between −3 and 3.
7. Sketch the graph of f(x) = 1 + 3x with values of x between −3 and 3.
8. Sketch the graph of f(x) = 2 + x with values of x between −3 and 3.
9. Sketch the graph of f(x) = 2 − |x| with values of x between −3 and 3.
10. Sketch the graphs of \(f(x) = x^2\) and \(g(x) = x^3\) with values of x between −2 and 2
and using the same set of axes for both graphs. Label the graphs.
11. Sketch the graphs of \(f(x) = 3 - x^2\) and \(g(x) = -1 + x^2\) with values of x between
−3 and 3 and using the same set of axes for both graphs.
12. Sketch the graphs of \(f(x) = x^4\) and \(g(x) = x^3\) with values of x between −2 and 2
and using the same set of axes for both graphs.
13. Choose an appropriate range of values for x and sketch the graph of
\[
f(x) = \begin{cases}
x & \text{for } x < 0 \\
x^2 & \text{for } 0 \le x \le 1 \\
2 - x & \text{for } 1 < x .
\end{cases}
\]

14. Choose an appropriate range of values for x and sketch the graph of
\[
g(x) = \begin{cases}
x & \text{for } x < 0 \\
3x^2 & \text{for } 0 \le x \le 1 \\
2 - x & \text{for } 1 < x .
\end{cases}
\]

15. Choose an appropriate range of values for x and sketch the graph of
\[
f(x) = \begin{cases}
-(x + 1)^2 & \text{for } x < 0 \\
(x - 1)^2 & \text{for } 0 \le x .
\end{cases}
\]

16. Choose an appropriate range of values for x and sketch the graphs of \(f(x) = (3x)^2\),
\(g(x) = 4x^2\), \(h(x) = (2x - 4)^2\), and \(i(x) = (3x - 1)^2\) on the same set of axes. Label each
graph.

C. Operations with functions.

Operations with functions come from two sources. The first is the fact that a function,
once evaluated, gives a number. The second source is that a function is an operation, or
transformation, of numbers to numbers.

Using addition, multiplication, division and subtraction of numbers we can define, for
two functions f and g:

\[
(f+g)(x) = f(x)+g(x), \quad (f-g)(x) = f(x)-g(x), \quad (fg)(x) = f(x)g(x), \quad (f/g)(x) = f(x)/g(x).
\]

Example 1. For the functions \(f(x) = x^2\), \(g(x) = x + 3\), and \(h(x) = \sqrt{x}\), calculate
\(f + g\), \(f + h\), and \(f \cdot g\).
Solution:
\[
(f + g)(x) = x^2 + x + 3, \quad (f + h)(x) = x^2 + \sqrt{x}, \quad (fg)(x) = x^2 (x + 3) = x^3 + 3x^2 .
\]

Example 2. For the functions \(f(x) = x^2\) and \(g(x) = x + 3\), calculate \(f/g\).

Solution: For x = −3, f(−3) = 9 and g(−3) = 0 so f/g does not make sense. For \(x \neq -3\),

\[
(f/g)(x) = \frac{x^2}{x+3} = \frac{(x + 3)(x - 3) + 9}{x+3} = x - 3 + \frac{9}{x+3} .
\]

Thinking of functions as operations we define, for two functions f and g, the composition:
\[
(f \circ g)(x) = f(g(x)), \qquad (g \circ f)(x) = g(f(x)).
\]

We call the first of these f composed with g and the second g composed with f.

Example 3. For the functions \(f(x) = x + 3\) and \(h(x) = \sqrt{x}\), calculate \(f \circ h\) and
\(h \circ f\).

Solution: \((f \circ h)(x) = f(\sqrt{x}) = \sqrt{x} + 3\), and \((h \circ f)(x) = h(x + 3) = \sqrt{x + 3}\).

Example 4. For the functions \(f(x) = x^2\) and \(g(x) = 5\), calculate \(f \circ g\) and \(g \circ f\).

Solution: \((f \circ g)(x) = f(5) = 25\), and \((g \circ f)(x) = g(x^2) = 5\).

Example 5. For the functions \(f(x) = 1/(x + 1)\) and \(g(x) = (1/x) - 1\), calculate \(f \circ g\)
and \(g \circ f\), and find the domains of these compositions.

Solution: \((f \circ g)(x) = 1/\big((1/x) - 1 + 1\big) = x\), and \((g \circ f)(x) = (x + 1) - 1 = x\).
The domain of \(f \circ g\) is \(x \neq 0\), and the domain of \(g \circ f\) is \(x \neq -1\).
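
The compositions in Example 5 can be checked numerically. The following is a minimal Python sketch; the helper `compose` is a hypothetical name introduced for illustration, and the sample points are arbitrary choices that avoid the excluded values.

```python
def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

def f(x):
    return 1 / (x + 1)      # undefined at x = -1

def g(x):
    return 1 / x - 1        # undefined at x = 0

f_after_g = compose(f, g)   # x -> f(g(x))
g_after_f = compose(g, f)   # x -> g(f(x))

for x in [0.5, 2.0, 3.0, -4.0]:
    print(x, f_after_g(x), g_after_f(x))
```

Evaluating `f_after_g(0)` or `g_after_f(-1)` raises a `ZeroDivisionError`, which matches the excluded points in the two domains.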

What we have in the previous example is a function that undoes what the previous
function did. That is, starting with a number, giving it to the function f and then giving
the result to the function g returned the number with which we started.

Start with a function f (x) defined on a domain D. Suppose there is a function g so

that for any x in D we have g(f (x)) = x. Then g is the inverse of f .

Example 6. Find an inverse for \(f(x) = x + 3\).

Solution: Set \(g(u) = u - 3\). Then \((g \circ f)(x) = g(x + 3) = (x + 3) - 3 = x\). This is true for
any x and thus g is an inverse for f. It is interesting to note that \(f \circ g\) is also the identity:
\((f \circ g)(u) = u\).
Example 7. Find an inverse for \(f(x) = \sqrt{x}\).

Solution: Notice that the domain of f is \(x \ge 0\). Set \(g(u) = u^2\). Then
\((g \circ f)(x) = g(\sqrt{x}) = (\sqrt{x})^2 = x\) (the square is always positive or zero, but we
already know that x is non-negative because it is in the domain of f). This is true for any x
in the domain of f and thus g is an inverse for f. It is interesting to note that \(f \circ g\) is
not the identity: only for non-negative u is it true that \((f \circ g)(u) = u\).

Example 8. Find an inverse for \(f(x) = (x + 3)/(x - 2)\).

Solution: Notice that the domain of f is \(x \neq 2\). To find an inverse we need to find x in
terms of y where \(y = (x + 3)/(x - 2)\). Here are some steps in the process:

\[
y = \frac{x+3}{x-2}, \quad (x-2)y = x+3, \quad yx - 2y = x+3, \quad yx - x = 3+2y, \quad (y-1)x = 3+2y, \quad x = \frac{3 + 2y}{y-1} .
\]

Set \(g(u) = (3 + 2u)/(u - 1)\). Then

\[
(g \circ f)(x) = \frac{3 + 2(x + 3)/(x - 2)}{((x + 3)/(x - 2)) - 1}
= \frac{(3(x - 2) + 2x + 6)/(x - 2)}{(x + 3 - (x - 2))/(x - 2)}
= \frac{3x - 6 + 2x + 6}{x + 3 - x + 2} = \frac{5x}{5} = x .
\]

Notice that the domain of g is \(u \neq 1\), which is exactly the range of f. You can check that
\((f \circ g)(u) = u\) and that the range of g is exactly the domain of f.
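
The check suggested above can be done exactly with rational arithmetic. This is a minimal Python sketch using the standard `fractions` module; the sample points are an arbitrary illustrative choice.

```python
from fractions import Fraction

def f(x):
    return (x + 3) / (x - 2)       # domain: x != 2

def g(u):
    return (3 + 2 * u) / (u - 1)   # domain: u != 1

# Quarter-integer sample points between -5 and 5, skipping the excluded values.
xs = [Fraction(n, 4) for n in range(-20, 21) if Fraction(n, 4) != 2]
us = [Fraction(n, 4) for n in range(-20, 21) if Fraction(n, 4) != 1]

g_of_f = [g(f(x)) for x in xs]     # should reproduce xs
f_of_g = [f(g(u)) for u in us]     # should reproduce us

print(g_of_f == xs, f_of_g == us)
```

Exact fractions avoid rounding, so equality can be tested directly. The sketch only samples finitely many points; the algebra in the example is what establishes the identity for every x in the domain.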

Notation. The inverse for f is denoted \(f^{-1}\). Since we think of \(f^{-1}\) as a function, we
often write \(f^{-1}(x)\), that is, we continue to call the independent variable x.

Example 9. The inverse of \(f(x) = 3x + 6\) is \(f^{-1}(u) = (u - 6)/3 = u/3 - 2\). Alternatively,
we can write \(f^{-1}(x) = x/3 - 2\).

Example 10. Find \(f^{-1}\) for \(f(x) = 0.5x - 1\).

Solution: \(f^{-1}(x) = 2x + 2\). We get this by reversing the operations of f: first we add 1
and then we multiply by 2.

Example 11. For \(f(x) = 2x - 1\) and \(g(x) = 3x + 1\) find \(f \circ g\), \(f^{-1}\), \(g^{-1}\), and \(g^{-1} \circ f^{-1}\).

Solution: \((f \circ g)(x) = 6x + 1\), \(f^{-1}(x) = x/2 + 1/2\), \(g^{-1}(x) = x/3 - 1/3\), and
\((g^{-1} \circ f^{-1})(x) = g^{-1}(x/2 + 1/2) = x/6 + 1/6 - 1/3 = x/6 - 1/6\).

Example 12. Suppose that \(f^{-1}\) and \(g^{-1}\) are defined. Then \((f \circ g)^{-1} = g^{-1} \circ f^{-1}\).

To see this, apply \(g^{-1} \circ f^{-1}\) to \((f \circ g)(x)\):

\[
(g^{-1} \circ f^{-1})((f \circ g)(x)) = g^{-1}(f^{-1}(f(g(x)))) = g^{-1}(g(x)) = x .
\]
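
The order of the inverses matters, and this can be seen with the functions of Example 11. The following is a minimal Python sketch using exact fractions; the names `right_order` and `wrong_order` are hypothetical labels introduced for illustration.

```python
from fractions import Fraction

def f(x):      return 2 * x - 1
def g(x):      return 3 * x + 1
def f_inv(x):  return (x + 1) / 2
def g_inv(x):  return (x - 1) / 3

def right_order(y):
    """Apply g^{-1} after f^{-1}: the inverse of f composed with g."""
    return g_inv(f_inv(y))

def wrong_order(y):
    """Apply f^{-1} after g^{-1}: in general NOT the inverse of f composed with g."""
    return f_inv(g_inv(y))

xs = [Fraction(n, 2) for n in range(-6, 7)]
recovered = [right_order(f(g(x))) for x in xs]
not_recovered = [wrong_order(f(g(x))) for x in xs]

print(recovered == xs, not_recovered == xs)
```

Undoing a composition means undoing the last operation first, which is why \(f^{-1}\) is applied before \(g^{-1}\).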

Example 13. For \(f(x) = 2x + 4\) find \(f^{-1}\), and graph f and \(f^{-1}\) on the same set of
axes.
Solution: \(f^{-1}(x) = x/2 - 2\).

[Figure: The graphs of y = f(x) = 2x + 4 and y = f^{-1}(x) = x/2 − 2]

Example 14. For \(f(x) = 2 + x^3\) find \(f^{-1}\), and graph f and \(f^{-1}\) on the same set of
axes.
Solution: \(f^{-1}(x) = (x - 2)^{1/3}\).

[Figure: The graphs of y = f(x) = x^3 + 2 and y = f^{-1}(x) = (x − 2)^{1/3}]

Example 15. For \(f(x) = 2 + \sqrt{x}\) find \(f^{-1}\), and graph f and \(f^{-1}\) on the same set of
axes.
Solution: The domain of f is \(x \ge 0\) and \(f^{-1}(x) = (x - 2)^2\) for \(x \ge 2\).

[Figure: The graphs of y = f(x) = 2 + x^{0.5} and y = f^{-1}(x) = (x − 2)^2]

Explain the domains and ranges for f and for \(f^{-1}\).

EXERCISES.
1. For the functions \(f(x) = x^3\), \(g(x) = x - 5\), and \(h(x) = \sqrt{x}\), calculate \(f + g\), \(f \cdot h\), and
\(f \cdot g\).
2. For the functions \(f(x) = x^3\), \(g(x) = x - 5\), and \(h(x) = \sqrt{x}\), calculate \(f \circ h\), \(h \circ f\), \(f \circ g\),
and \(g \circ f\).
3. For the functions \(f(x) = 3x^2 + 3\) and \(h(x) = \sqrt{x}\), calculate \(f \circ h\) and \(h \circ f\). What are
the domains of f, of h, of \(f \circ h\), and of \(h \circ f\)?
4. For the functions \(f(x) = (x + 2)/(x - 1)\) and \(h(x) = (x + 3)/(x + 5)\), calculate \(f \circ h\)
and \(h \circ f\). What are the domains of \(f \circ h\) and of \(h \circ f\)?
5. For \(f(x) = 2 + x^3\) calculate \(f^{-1}\), and graph f and \(f^{-1}\) on the same set of axes.
6. For \(f(x) = 3 + x^2\) calculate \(f^{-1}\), and graph f and \(f^{-1}\) on the same set of axes. Your
graph should indicate the correct domains for the two functions.
7. For \(f(x) = 4x + 7\) calculate \(f^{-1}\), and check algebraically that \((f \circ f^{-1})(x) = x\) and
\((f^{-1} \circ f)(x) = x\).
8. For \(f(x) = 3.6 - x/3\) calculate \(f^{-1}\), and check algebraically that \((f \circ f^{-1})(x) = x\).
9. For \(f(x) = 0.4x + 1.3\) calculate \(f^{-1}\), and check algebraically that \((f \circ f^{-1})(x) = x\).
10. For \(f(x) = (x + 3)/(x + 4)\) calculate \(f^{-1}\), and check algebraically that \((f \circ f^{-1})(x) = x\).
11. For \(f(x) = (x + 3)/(x + 4)\) calculate \(f^{-1}\). What are the domains of f and of \(f^{-1}\)?
What are the ranges of f and of \(f^{-1}\)?
12. For \(f(x) = x^5\) calculate \(f^{-1}\), and check algebraically that \((f \circ f^{-1})(x) = x\). What are
the domains of f and of \(f^{-1}\)?
13. Let the domain be all real numbers. Does \(f(x) = x^2\) have an inverse?
14. Let the domain be all non-negative numbers, \(x \ge 0\). Does \(f(x) = x^2\) have an inverse?
15. Let the domain be all non-negative numbers, \(x \ge 0\). Does \(f(x) = |x|\) have an inverse?
16. Does \(f(x) = x|x|\) have an inverse? If so, graph f and \(f^{-1}\).
17. Does \(f(x) = x^3 - x\) have an inverse? If so, graph f and \(f^{-1}\).
18. Does \(f(x) = x + |x|\) have an inverse? If so, graph f and \(f^{-1}\).
19. Does \(f(t) = t^7\) have an inverse? If so, graph f and \(f^{-1}\).
20. Restrict the domain so that \(f(t) = t^2 + 6t\) has an inverse. Graph f and \(f^{-1}\).
21. Does \(f(t) = t^{3/5}\) have an inverse? If not, is there some restriction on the domain that
makes this function invertible?
22. Does \(f(t) = 2t^2 - t^4\) have an inverse? If not, is there some restriction on the domain
that makes this function invertible? Hint: the graph shows a maximum value of f = 1
achieved at t = −1 and at t = 1.
23. Set \(f(x) = x^3\) and \(g(x) = 4 + 5x\). Calculate \(h = f \circ g\) and \(h^{-1}\). How is \(h^{-1}\) related to
\(f^{-1}\) and \(g^{-1}\)?
24. Set \(f(x) = x^3\) and \(g(x) = 4 + 5x\). Calculate \(h = g \circ f\) and \(h^{-1}\). How is \(h^{-1}\) related to
\(f^{-1}\) and \(g^{-1}\)?
25. Use a symbolic calculation to show that if f is invertible and g is invertible then
\(h = f \circ g\) is invertible and \(h^{-1} = g^{-1} \circ f^{-1}\).
26. Suppose that f is invertible (f is a function from A to B). Show that f is onto. Hint:
The inverse must be defined on every y in B. What must happen to \(x = f^{-1}(y)\)?
27. Suppose that f is invertible (f is a function from A to B). Show that f is one-to-one.
Hint: if \(y = f(x) = f(z)\), what does the inverse do to y?
28. Suppose that f is invertible and denote its inverse by \(f^{-1}\). Show that \((f \circ f^{-1})(y) = y\).
Hint: since a previous problem shows that f is onto, there is an x with f(x) = y.
29. Suppose that \(h = f \circ g\) is invertible. Does it follow that g is invertible?

D. Summary for Functions.

Given two spaces A and B, a function is a rule assigning a result, y in the codomain
B, to every element, x, in the domain A. The collection of all results (all y in B actually
achieved) is called the range.
If f(x) = f(z) only when x = z, then f is one-to-one or injective.
If for each y in B, there is an x with f(x) = y, then f is onto or surjective.
Functions can be added or multiplied by numbers (\(f + g\) and \(c f\) are defined for
functions f and g and a number c and these are also functions).
The composition of f with g is \(f \circ g\) defined by \((f \circ g)(x) = f(g(x))\).
The inverse of f is a function \(f^{-1}\) from the codomain of f to the domain of f
satisfying \((f^{-1} \circ f)(x) = x\), and consequently \((f \circ f^{-1})(y) = y\).
The graph of \(f^{-1}\) is the reflection of the graph of f across the diagonal (switching the
first and second coordinates in the plane).
6. Exponential Growth, Financial Interest, and Sums

In describing growth over time, of populations, of an economy, of an investment, or in
many other examples, the notion of proportional growth plays a major role. Exponential
functions describe proportional growth or proportional decay, and can also be combined
to describe other behaviors.

A. Exponentials.

For any positive base b and for any exponent x, the value \(b^x\) is a function of the type
which we now investigate.
Recall that for any rational number x = m/n (where m and n are integers) we defined
\(b^x\) in terms of powers and roots of b. We extend this definition to real numbers by suggesting
that if x is close to m/n then \(b^x\) is close to \(b^{m/n}\). In order to show that such an extension
is possible we need to understand two issues, the first being what is meant by “close”. The
second issue is whether different approximations to x yield consistent values for \(b^x\). We
ignore both these issues, and simply assume that \(b^x\) is defined.

Mathematical Example. We discuss one example to illustrate what is entailed in
computing \(b^x\) when x is not rational. The point here is only to explain the two issues that
we are ignoring. Consider approximating \(3^{\sqrt{2}}\).
Since \(\sqrt{2}\) is not rational, we first need to approximate it. We will start with \(\sqrt{2} \approx 1.4\),
since \(1.4^2 = 1.96 < 2 < 1.5^2 = 2.25\). Now we can write \(3^{1.4}\) as \(3^{7/5}\), which is the 5th
root of \(3^7 = 2187\). Since \(4.65^5 < 2187 < 4.66^5\), we have \(3^{1.4} \approx 4.65\). We could have
taken a different view and written \(3^{1.4}\) as \(3^{14/10}\), which is the 10th root of \(3^{14} = 4{,}782{,}969\).
Since \(4.65^{10} \approx 4{,}726{,}389.7\) (and \(4.66^{10} \approx 4{,}829{,}021.8\)) we would still conclude that
\(3^{1.4} \approx 4.65\). In other words, the representation of the exponent as a fraction did not change our
conclusion. To improve our approximation we would need to estimate \(\sqrt{2}\) more accurately
and also show that the errors resulting from these approximations are small. We will
discuss approximations more carefully in chapter 8 and conclude that \(3^{\sqrt{2}} \approx 4.7\) to two
significant digits.
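
The two issues discussed in this example can be explored numerically. The following is a minimal Python sketch in floating-point arithmetic; it illustrates the approximations but does not replace the careful error analysis mentioned above, and the list of decimal exponents is an arbitrary choice.

```python
import math

# The first approximation from the text, 3**1.4, computed two ways.
fifth_root = (3 ** 7) ** (1 / 5)      # 5th root of 3**7 = 2187
tenth_root = (3 ** 14) ** (1 / 10)    # 10th root of 3**14 = 4,782,969

# Closer rational approximations to sqrt(2) give closer values of 3**x.
exponents = [1.4, 1.41, 1.414, 1.4142]
approximations = [3.0 ** q for q in exponents]
target = 3.0 ** math.sqrt(2)

print(f"{fifth_root:.4f} {tenth_root:.4f}")
for q, value in zip(exponents, approximations):
    print(f"3^{q} is about {value:.5f}, error {abs(value - target):.5f}")
```

The two roots agree, which reflects the fact that 7/5 and 14/10 represent the same exponent, and the errors shrink as the exponent approaches \(\sqrt{2}\).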

Returning to exponentials, recall the properties of exponential functions,

(1) \(b^{x+z} = b^x\, b^z\)

(2) \((b^x)^z = b^{xz}\)

The first property can be viewed as showing that an exponential grows in proportion
to itself. If the exponent is increased by z then the result is multiplied by \(b^z\).

Example 1. Suppose that the size of a population of bacteria after t days is \(3{,}000 \times 2^t\).
By what factor does the size of the population grow from day 6 to day 7? from day 12 to
day 13? from day 43 to day 44?
Solution: We calculate that the population sizes are \(p(6) = 3{,}000 \times 2^6 = 192{,}000\) and
\(p(7) = 3{,}000 \times 2^7 = 384{,}000\), so p(7)/p(6) = 384,000/192,000 = 2. Similarly, p(13)/p(12) =
24,576,000/12,288,000 = 2 and p(44) = 52,776,558,133,248,000, while
p(43) = 26,388,279,066,624,000, so p(44)/p(43) = 2.
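
Python integers are exact at any size, so the large values in this example can be verified directly. The following is a minimal sketch; the function name `population` is a hypothetical label for the formula \(3{,}000 \times 2^t\).

```python
def population(t):
    """Population after t whole days, using exact integer arithmetic."""
    return 3000 * 2 ** t

for day in (6, 12, 43):
    today = population(day)
    tomorrow = population(day + 1)
    # The ratio is 2 regardless of the day, as property (1) predicts.
    print(f"p({day}) = {today:,}   p({day + 1}) = {tomorrow:,}   ratio = {tomorrow // today}")
```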

Example 2. Suppose that the value of an investment increases by 10 percent each
year. During which year does the investment double in size?
Solution: The first step is to describe the setting mathematically. After one year, the
investment is 1.1 times its value at the start of the year. Thus the factor by which it has
grown is 1.1 after one year, \(1.1 \times 1.1 = 1.1^2 = 1.21\) after two years,
\(1.1 \times 1.1 \times 1.1 = 1.1^3 = 1.331\) after three years, and \(1.1^t\) after t years. To decide when
the investment doubles we can calculate with different numbers of years. Our starting guess is
6 years, which gives \(1.1^6 = 1.771561\) so it takes longer. Next \(1.1^7 = 1.9487171\), and
\(1.1^8 = 2.14358881\). We conclude that the investment doubles during the eighth year.
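
The trial-and-error search in this example is easy to automate. The following is a minimal Python sketch; the function name `years_to_multiply` is hypothetical, and the loop assumes a positive growth rate (otherwise it would never terminate).

```python
def years_to_multiply(rate, factor=2.0):
    """Smallest whole number of years t with (1 + rate)**t >= factor.

    Assumes rate > 0 and factor >= 1.
    """
    years = 0
    growth = 1.0
    while growth < factor:
        growth *= 1.0 + rate   # one more year of growth
        years += 1
    return years

print(years_to_multiply(0.10))
```

The returned value is the year during which the target factor is first reached, so for 10 percent growth it agrees with the conclusion of the example.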

Example 3. For the function \(f(x) = 3^x\) calculate the factor by which f grows when
x changes by 2.
Solution: \(f(x + 2) = 3^{x+2} = 3^x \cdot 3^2\). So the function grows by a factor of \(3^2 = 9\) when
x changes by 2 (and this is true for any value of x).

Example 4. For the function \(f(x) = 0.7^x\) calculate the factor by which f grows when
x changes by 3.
Solution: \(f(x + 3) = 0.7^{x+3} = 0.7^x \cdot 0.7^3\). So the function changes by a factor of
\(0.7^3 = 0.343\) when x changes by 3 (and this is true for any value of x). Since this factor
is less than 1, the function actually decreases.

Considering the second of the properties of exponentials above, \((b^x)^z = b^{xz}\), we
see that any exponential can be written in terms of any other one.

Example 5. Write \(4^x\) as a power of 2.

Solution: Since \(4 = 2^2\), \(4^x = (2^2)^x = 2^{2x}\).

Example 6. Write \((1/9)^x\) as a power of 3.

Solution: Since \(1/9 = 3^{-2}\), \((1/9)^x = (3^{-2})^x = 3^{-2x}\).

Example 7. Write \((\sqrt{2})^x\) as a power of 2.

Solution: Since \(\sqrt{2} = 2^{0.5}\), \((\sqrt{2})^x = (2^{0.5})^x = 2^{0.5x}\).

EXERCISES.
1. Suppose that the size of a population of bacteria after t hours is \(400 \times 1.5^t\). By
what factor does the size of the population grow from hour 6 to hour 7? from hour 52 to
hour 53?
2. Suppose that the size of a population of bacteria after t hours is \(400 \times 1.5^t\). By
what factor does the size of the population grow from hour 6 to hour 8? from hour 52 to
hour 54?
3. For the function \(f(x) = 5^x\) calculate the factor by which f grows when x changes
by 2.
4. For the function \(f(x) = 0.2^x\) calculate the factor by which f grows when x changes
by 2.
5. For the function \(f(x) = 5^x\) calculate the factor by which f grows when x changes
by 3.
6. Write \(8^x\) as a power of 2.
7. Write \(0.5^x\) as a power of 2.
8. Write \(25^x\) as a power of 5.
9. Write \(5^x\) as a power of 25.

B. The base e.
There is a particular number that turns out to be useful as a base. This is the number
e. The reason this number is natural as a base is that the exponential with base e grows at
a rate proportional to itself with the constant of proportionality being 1.
Consider, then, an example for which the growth of the quantity is the quantity,
namely an interest rate of 100%. We construct a model that describes an investment at a
nominal annual interest rate of 100% when interest is paid continuously.

Set A = original amount invested.

Set n = the number of times interest is calculated each year (this number will be increased).
Let R = the resulting value at the end of one year.

If the interest is only paid at the end of the year (n = 1), then the interest is A and
the amount at the end is R = A + A = 2A. If the interest is paid twice a year (n = 2),
then the interest during each period is 0.5 of the original amount and the amount at the
end of the first half year is A + 0.5A = 1.5A. This amount serves as the investment for
the second half-year. So at the end of the year R = 1.5A + 0.5(1.5A) = 2.25A. If the
interest is paid three times a year (n = 3), then the interest during each period is 1/3
of the original amount and the amount at the end of the first third of a year (after 4
months) is A + A/3 = 4A/3. This amount serves as the investment for the second third of
a year. So at the end of eight months the amount is 4A/3 + (1/3)(4A/3) = 16A/9. This
amount serves as the investment for the last third of a year. So at the end of the year
R = 16A/9 + (1/3)(16A/9) = 64A/27 ≈ 2.370370A.
Using our variables, after the first period, which lasts for 1/n of a year, the amount
grows by a factor of \(1 + (1/n)\), that is from A to \([1 + (1/n)]A\). Over the remaining portions
of the year, the amount grows by the same factor n − 1 more times:
\[
A, \quad \left(1+\frac{1}{n}\right) A, \quad \left(1+\frac{1}{n}\right)^{2} A, \quad \left(1+\frac{1}{n}\right)^{3} A, \quad \left(1+\frac{1}{n}\right)^{4} A, \quad \ldots, \quad R = \left(1+\frac{1}{n}\right)^{n} A .
\]
The table gives some rounded numerical values for \((1 + (1/n))^n\):

n =      1   2      3       12         365        5,000      100,000
R/A ≈    2   2.25   64/27   2.613035   2.714567   2.718010   2.718268
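
The values in the table can be recomputed in a few lines. The following is a minimal Python sketch; the function name `growth_factor` is hypothetical, and the results are floating-point approximations.

```python
import math

def growth_factor(n):
    """Value of one unit after one year at 100% nominal interest, compounded n times."""
    return (1 + 1 / n) ** n

for n in (1, 2, 3, 12, 365, 5000, 100000):
    print(f"n = {n:>7}   R/A = {growth_factor(n):.6f}")

# For comparison, the limiting value as n grows without bound.
print(f"e           = {math.e:.6f}")
```

The printed values increase with n and stay below e, consistent with the discussion that follows.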

As the number of times that the interest is compounded increases (that is n becomes
very large), the ratio of the resulting value at the end of the year to the amount invested
approaches a particular number. The number e is the number that describes the ratio of the
resulting value to the principal invested with 100% interest and continuous compounding.

We summarize the discussion so far by stating that et is the factor by which we multiply
an initial investment that has paid interest at 100% a year compounded continuously for
a duration of t years.
More generally, if a nominal interest rate of r per year is compounded continuously
for t years the amount at the end, V (for value), and the amount at the beginning, P (for
principal), are related by

Continuous compounding: \(V = P e^{rt}\). Rate of interest r, duration t.

Mathematical Explanation. The calculation leading to this formula is that \((1 + (1/n))^n\)
is replaced by \((1 + (r/n))^{nt}\), and if we change to a new variable m, with n = rm, we get
\[
\left(1 + \frac{r}{rm}\right)^{rmt} = \left[\left(1 + \frac{1}{m}\right)^{m}\right]^{rt} \approx e^{rt}, \quad \text{for large } m .
\]
Justifying this carefully requires the notion of a limit and observations about the continuity
of functions, both of which are important ideas from calculus.

Example 1. An account starts out with $1,000 and receives interest at 3% compounded
continuously. Assume that no money is taken out of the account. How much is
in the account after 4 years?
Solution: The interest rate is r = 0.03 and the duration is 4 years, so the amount in the
account after 4 years is \(1000\,e^{0.03 \times 4} = 1000\,e^{0.12}\). Rounded to the nearest cent this amount
is $1,127.50.

Example 2. A machine is bought for $15,000 and loses value at 5% compounded
continuously (one would also say that the value depreciates at 5%). How much is the
machine worth after 10 years?
Solution: The interest rate is r = −0.05 and the duration is 10 years, so the value after 10
years is \(15000\,e^{-0.05 \times 10} = 15000\,e^{-0.5}\). Rounded to the nearest cent this value is $9,097.96.
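
Both examples apply the same formula, \(V = P e^{rt}\), with a positive rate for growth and a negative rate for depreciation. The following is a minimal Python sketch; the function name `continuous_value` is hypothetical.

```python
import math

def continuous_value(principal, rate, years):
    """Value V = P * e**(r * t) under continuous compounding."""
    return principal * math.exp(rate * years)

account = continuous_value(1000, 0.03, 4)       # Example 1: growth at 3%
machine = continuous_value(15000, -0.05, 10)    # Example 2: depreciation at 5%

print(f"{account:.2f}  {machine:.2f}")
```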

Example 3. Graph \(e^x\) after calculating its value at x = 0 and its approximate values
at x = −2, x = −1, x = 1, and x = 2.
Solution: We know that \(e^0 = 1\). The other values are obtained from a calculator:
\(e^{-2} \approx 0.135\), \(e^{-1} \approx 0.368\), \(e^{1} \approx 2.718\), and \(e^{2} \approx 7.389\). The graph follows.

[Figure: The graph of y = e^x]

Example 4. Graph \(e^{0.1x}\) after calculating its value at x = 0 and its approximate
values at x = −2, x = −1, x = 1, and x = 2.

Solution: We know that \(e^0 = 1\). The other values are obtained from a calculator:
\(e^{-0.2} \approx 0.82\), \(e^{-0.1} \approx 0.90\), \(e^{0.1} \approx 1.11\), and \(e^{0.2} \approx 1.22\).

[Figure: Graph of y = e^{0.1x} showing e^{0.1·1} ≈ 1.11]

Example 5. Graph \(e^{-x}\) after calculating its value at x = 0 and its approximate
values at x = −2, x = −1, x = 1, and x = 2.

Solution: We know that \(e^0 = 1\). The other values are obtained from a calculator:
\(e^{2} \approx 7.39\), \(e^{1} \approx 2.72\), \(e^{-1} \approx 0.37\), and \(e^{-2} \approx 0.14\).

[Figure: The graph of y = e^{-x}]

Notice that this graph is what we would get by reflecting \(y = e^x\) across the y-axis.

Example 6. The half-life of strontium 90 is 29.1 years. Write a model that represents
the amount of radioactive strontium 90 present after t years, when the initial amount is
A.
Solution: There are several ways to think about this situation. In this case, we’ll think in
terms of the “half-life”. What this means is that as 29.1 years pass, half the radioactive
isotope decays, so the remaining amount of radioactive material is half what it was before.
Therefore, the factor that describes the remaining amount is 1/2 when time advances by
29.1. Let t denote time, A be the initial amount (at t = 0), and R be the remaining
amount of radioactive material. The model is

\[
R = A \left(\frac{1}{2}\right)^{t/29.1} = A\, 2^{-t/29.1} .
\]

Example 7. The income from a business is $30,000 a year and it is discounted by 2%
each year. Assume that the discounting is continuous. Calculate the value of the income
as a function of time.
Solution: Denote time by t (in years). The income at time t is discounted by a factor of
\(e^{-0.02t}\) so it is \(30{,}000\, e^{-0.02t}\) dollars per year.

EXERCISES.
1. An account starts out with $500 and receives interest at 4% compounded contin-
uously. Assume that no money is taken out of the account. How much is in the account
after 3 years? after 12 years?
2. An account starts out with $100 and receives interest at 5% compounded contin-
uously. Assume that no money is taken out of the account. How much is in the account
after 3 years? after 12 years?
3. An account starts out with $1,000 and receives interest at 3% compounded contin-
uously. Assume that no money is taken out of the account. How much is in the account
after 20 years?
4. A loan of $1,000 pays interest at 9% compounded continuously. Assume that no
money is paid for 6 years. How much is owed on the loan?
5. A machine is bought for $100,000 and loses value at 8% compounded continuously.
How much is the machine worth after 5 years? after 10 years?
6. A machine is bought for $100,000 and loses half its value in 17 years. Use a power
of 2 to write the value of the machine as a function of time, t.
7. The size of the population of an intestinal parasite starts at 10 and doubles every
8 hours. Use a power of 2 to write the size of the population as a function of time.
8. An account starts out with $1,000 and receives interest at 3% compounded contin-
uously for 5 years. Then the account receives interest at 5% compounded continuously for
the next 5 years. Assume that no money is taken out of the account. How much is in the
account in the end?
9. A new car is bought for $20,000. It loses 20 percent of its value once it becomes
used (you may assume this happens immediately once the car is purchased). Subsequently
the car loses 2% of its value each year. Assume the 2% rate is compounded continuously.
Calculate the value of the car at time t years.
10. Graph \(e^x\) and \(e^{2x}\) on the same set of axes.
11. Graph \(e^x\) and \(e^{-x}\) on the same set of axes.
12. Graph \(e^x\) and \(e^{x/2}\) on the same set of axes.
13. Graph \(e^{x/4}\) and \(e^{x/2}\) on the same set of axes.
14. The income from a business is $20,000 a year and it is discounted by 3% each
year. Assume that the discounting is continuous. Calculate the value of the income as a
function of time.
15. The income from a business is $36,000 a year and it is discounted by 2% each
year. Assume that the discounting is continuous. Calculate the value of the income as a
function of time.
16. The income from a business is $24,000 a year and it is discounted by 2.5% each
year. Assume that the discounting is continuous. Calculate the value of the income as a
function of time.
17. Graph \(e^{-0.04t}\).

C. Logarithm.

Associated with each exponential function is its inverse, the logarithm for that base:

\(\log_b(y) = x\) when \(b^x = y\).

For the base e the logarithm is called the natural logarithm:

\(\ln(y) = x\) when \(e^x = y\).

Example 1. Graph \(\ln x\) showing its value at x = 1 and its approximate values at
x = 0.5, x = 2, and x = 3.
Solution: We know that \(e^0 = 1\) so \(\ln(1) = 0\). The other values are obtained from a
calculator: \(\ln 0.5 \approx -0.69\), \(\ln 2 \approx 0.69\), and \(\ln 3 \approx 1.1\).

[Figure: Graph of y = ln(x) showing ln 0.5 ≈ −0.69, ln 2 ≈ 0.69, and ln 3 ≈ 1.1]

Notice that this graph is what we would get by reflecting \(y = e^x\) across the diagonal line
(y = x).

It is very useful in calculations to note that the logarithm allows us to write any
exponential and logarithm in terms of the exponential with base e:
\[
b = e^{\ln(b)}, \quad \text{so} \quad b^x = \left(e^{\ln(b)}\right)^x = e^{\ln(b)\,x}, \quad \text{and} \quad \log_b(y) = \ln(y)/\ln(b) .
\]

Example 2. Show that \(\log_5(3) = \ln(3)/\ln(5)\) really works.

Solution: We need to check whether \(5^{\ln(3)/\ln(5)} = 3\). Now
\[
5^{\ln(3)/\ln(5)} = \left(e^{\ln(5)}\right)^{\ln(3)/\ln(5)} = e^{\ln(5)\ln(3)/\ln(5)} = e^{\ln(3)} = 3 .
\]

Example 3. The half-life of strontium 90 is 29.1 years. Use an exponential with base
e to model the amount of radioactive strontium 90 present after t years, when the initial
amount is A.
Solution: Let t denote time, A be the initial amount (at t = 0), and R be the remaining
amount of radioactive material. We found previously that \(R = A\, 2^{-t/29.1}\), and
\(2 = e^{\ln 2} \approx e^{0.69315}\). So
\(R \approx A \left(e^{0.69315}\right)^{-t/29.1} = A\, e^{-0.69315\,t/29.1} \approx A\, e^{-0.02382\,t}\).
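
The base-2 model and the base-e model describe the same decay, which can be confirmed numerically. The following is a minimal Python sketch; the function names and the sample times are hypothetical choices made for illustration, and `math.log` is the natural logarithm.

```python
import math

HALF_LIFE = 29.1   # years, strontium 90 (value taken from the text)

def remaining_base2(initial, t):
    """Remaining amount using R = A * 2**(-t / half-life)."""
    return initial * 2 ** (-t / HALF_LIFE)

def remaining_base_e(initial, t):
    """Remaining amount using R = A * e**(-k t) with k = ln(2) / half-life."""
    k = math.log(2) / HALF_LIFE   # decay constant, roughly 0.02382 per year
    return initial * math.exp(-k * t)

for t in (0, 10, 29.1, 58.2, 100):
    print(t, round(remaining_base2(100, t), 4), round(remaining_base_e(100, t), 4))
```

After one half-life the amount is halved, and after two half-lives it is quartered, in either form of the model.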

Notation: We usually omit the parentheses when applying the logarithm. For instance,
we write “ln 5” instead of “ln(5)”.

Example 4. Write \(\log_5 16\), \(5^{11}\), \(\log_7 1.23\), \(7^{1.23}\), and \(\log_{15} 34\) in terms of the natural
logarithm and the exponential with base e.
Solution: \(\log_5 16 = (\ln 16)/(\ln 5)\), \(5^{11} = e^{11 \ln 5}\), \(\log_7 1.23 = (\ln 1.23)/(\ln 7)\),
\(7^{1.23} = e^{1.23 \ln 7}\), and \(\log_{15} 34 = (\ln 34)/(\ln 15)\).
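Example 5. Calculate the following exactly: \(\log_{e^2} e^3\), \(\log_2 16\), \(\log_3 3\sqrt{3}\), \(4^{1.5}\), and
\(\log_5 10 - \log_5 2\).
Solution:

\[ \log_{e^2} e^3 = \frac{\ln e^3}{\ln e^2} = \frac{3}{2}, \quad \log_2 16 = \log_2 2^4 = 4, \quad \log_3 3\sqrt{3} = \log_3 3^{1.5} = 1.5, \]
\[ 4^{1.5} = 8, \quad \log_5 10 - \log_5 2 = \log_5 \frac{10}{2} = 1. \]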

Example 6. Find the value of \(\log_{11} x^2\) in terms of the natural logarithm.

Solution: \(\log_{11} x^2 = \ln(x^2)/\ln(11)\). Or also, when x > 0, \(\log_{11} x^2 = 2\log_{11} x = 2\ln(x)/\ln(11)\).

Example 7. Calculate the following using a calculator (or table of values) and round
to 5 significant digits: \(\log_7 e^3\), \(\log_5 16\), \(\log_3 11\), and \(\ln 4.5^{1,000,000}\).
Solution: Calculators will typically give values for natural logarithms and logarithms
in base 10. We'll use the former. \(\log_7 e^3 = 3\log_7 e = 3/\ln 7 \approx 1.5417\),
\(\log_5 16 = \ln(16)/\ln(5) \approx 1.7227\), \(\log_3 11 = \ln(11)/\ln(3) \approx 2.1827\), and
\(\ln 4.5^{1,000,000} = 10^6 \ln(4.5) \approx 1,504,100\).
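
A computer can stand in for the calculator here. The sketch below uses Python's standard `math` module; the helper `round_sig` and the dictionary labels are illustrative names introduced for this example. Note that the last value is computed as 1,000,000 × ln 4.5, since a direct floating-point evaluation of 4.5 raised to the millionth power would overflow.

```python
import math

def round_sig(value, digits=5):
    """Round value to the given number of significant digits."""
    return float(f"{value:.{digits}g}")

# Each entry rewrites the logarithm in terms of the natural logarithm first.
values = {
    "log_7(e^3)": 3 / math.log(7),
    "log_5(16)": math.log(16) / math.log(5),
    "log_3(11)": math.log(11) / math.log(3),
    "ln(4.5^1000000)": 1_000_000 * math.log(4.5),
}

for name, v in values.items():
    print(name, round_sig(v))
```

The last entry illustrates why the rule ln(u^v) = v ln(u) is useful in practice and not only on paper.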

Summary. Properties of exponentials and logarithms

For any b > 0:
\(b^x \times b^z = b^{x+z}\)        \(\ln(u \times v) = \ln(u) + \ln(v)\)
\((b^x)^z = b^{xz}\)                \(\ln(u^v) = v\ln(u)\)
\(b^x = e^{x \ln b}\)               \(\log_b(x) = \ln(x)/\ln(b)\)

In finance: with t denoting time, P denoting the amount at time t = 0 (“principal”),
V denoting value at time t (note that t could be negative or positive), and r denoting the
interest rate that is compounded continuously

\[ V = P e^{rt}. \]

EXERCISES.
1. Write \(\log_3 14\), \(5^{-3}\), \(\log_7 11.7\), \(7^2\), and \(\log_{11} 33\) in terms of the natural logarithm or
the exponential with base e.
2. Write \(\log_{2.3} 11\), \(0.6^{11}\), \(\log_3 1.43\), and \(\log_9 41\) in terms of the natural logarithm or
the exponential with base e.
3. Use the properties of the exponential and the definition of ln (that is, \(\ln(y) = x\)
when \(e^x = y\)) to show that \(\ln(xz) = \ln(x) + \ln(z)\).
4. Use the properties of the exponential and the definition of ln (that is, \(\ln(y) = x\)
when \(e^x = y\)) to show that \(\ln(x^z) = z\ln(x)\).
5. Write \(\log_{2.3} 11\), \(0.6^{11}\), \(\log_3 1.43\), and \(\log_9 41\) in terms of the natural logarithm or
the exponential with base e. Use a calculator or computer to approximate these numbers
with 3 decimal places.
6. Write \((1/4)^x\) as a power of 2.
7. An account starts out with $2,000 and receives interest at 3% compounded contin-
uously. Assume that no money is taken out of the account. How much is in the account
after 14 years?
8. An account starts out with $2,000 and receives interest at 1.5% compounded
continuously. Assume that no money is taken out of the account. How much is in the
account after 3 years?
9. A wood lot costs $2,000,000. Suppose the value rises at a proportional growth rate
of 3% continuously. Calculate the value of the wood lot after 40 years.
10. A tractor is bought for $100,000 and loses value at 3% compounded continuously
(one would also say that the value depreciates at 3%). How much is the tractor worth
after 10 years?
11. Graph \(e^{x/3}\) after calculating its value at x = 0 and its approximate values at
x = −2, x = −1, x = 1, and x = 2.
12. The half-life of strontium 90 is 29.1 years. Write a model that represents the
amount of radioactive strontium 90 present after t years, when the initial amount is 2
grams.
13. Graph ln(3x) showing the corresponding value of x if y = 0 and the approximate
values of y at x = 0.5, x = 1, x = 2, and x = 3. How does this graph compare to the
graph of ln(x)?
14. Graph \(\ln(x^2)\) showing the corresponding value of x if y = 0 and the approximate
values of y at x = 0.5, x = 1, x = 2, and x = 3. How does this graph compare to the
graph of ln(x)?
15. Graph ln(6x) showing the corresponding value of x if y = 0 and the approximate
values of y at x = 0.5, x = 1, x = 2, and x = 3. How does this graph compare to the
graph of ln(x)?
16. Graph \(\ln(x^3)\) showing the corresponding value of x if y = 0 and the approximate
values of y at x = 0.5, x = 1, x = 2, and x = 3. How does this graph compare to the
graph of ln(x)?
17. Write \(\log_4 12\), \(4^7\), \(\log_4 1.23\), \(4^{1.23}\), and \(\log_4 16\) in terms of the natural logarithm
and the exponential with base e.
18. Write \(\log_7 12\), \(7^7\), \(\log_7 1.23\), \(7^{1.23}\), and \(\log_7 16\) in terms of the natural logarithm
and the exponential with base e.
19. Calculate the following exactly: \(\log_{e^3} e^6\), \(\log_2 8\), \(\log_3 \sqrt{3}\), \(4^{2.5}\), and \(\log_3 18 - \log_3 2\).
20. Calculate the following exactly: \(\log_3 81\), \(\log_4 8\), \(9^{-1.5}\), and \(\log_3 12 - \log_3 4\).
21. Calculate the following using a calculator (or table of values) and round to 3
significant digits: \(\log_7 e^3\), \(\log_3 13\), \(\log_4 11\), and \(\ln 3^{1,000,123}\).

D. Exponential and logarithmic equations and models.

Many natural and social variables are related by exponential or logarithmic functions.
Which function is chosen depends on what variable is chosen as the independent variable
and what variable is chosen as the dependent variable. Moreover, whichever model we
choose, using the model to answer questions often requires the use of both logarithms and
exponentials.

Example 1. Suppose that a certain bond will pay $10,000 at the end of 10 years.
Suppose the current bid on the bond is $7,600. What is the continuous interest rate for
this bond?
Solution: Let r denote the interest rate. We assumed that interest is compounded continu-
ously. Then \(10,000 = 7,600\,e^{10r}\) and \(e^{10r} = 10,000/7,600 = 25/19\). To solve for r we use
the logarithm to get \(10r = \ln(25/19)\), and \(r = 0.1\ln(25/19) \approx 0.02744\) (or approximately
2.744 percent).
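
As a check on this computation, here is a short Python sketch. The function name `continuous_rate` is a hypothetical helper introduced for illustration; it simply solves future = present × e^(r × years) for r.

```python
import math

def continuous_rate(present, future, years):
    """Rate r that solves future = present * exp(r * years)."""
    return math.log(future / present) / years

r = continuous_rate(7_600, 10_000, 10)
print(f"{r:.5f}")                   # about 0.02744
print(7_600 * math.exp(r * 10))     # recovers roughly 10,000
```

The same helper applies to any bond quoted by its current bid, its payout, and its time to maturity.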
Example 2. Find x if \(e^{x^2 + x} - 17 = 0\).
Solution: We have \(e^{x^2 + x} = 17\) or \(x^2 + x = \ln(17)\) or \(x^2 + x - \ln(17) = 0\). Now we
can use the quadratic formula to obtain the solutions
\[ x = \frac{-1 + \sqrt{1 + 4\ln(17)}}{2} \quad\text{and}\quad x = \frac{-1 - \sqrt{1 + 4\ln(17)}}{2}. \]
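
Both solutions can be checked by substituting them back into the original equation. The following Python sketch does this numerically; the variable names are illustrative.

```python
import math

c = math.log(17)                      # ln(17)
disc = math.sqrt(1 + 4 * c)           # square root of the discriminant
roots = [(-1 + disc) / 2, (-1 - disc) / 2]

for x in roots:
    # exp(x**2 + x) should come out very close to 17 for each root
    print(x, math.exp(x**2 + x))
```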
Example 3. Find z with \(\ln(z^4) - \ln(\sqrt{z}) = 3\).
Solution: We rewrite the equation as \(4\ln(z) - (1/2)\ln(z) = 3\), so \((7/2)\ln(z) = 3\), and
\(\ln z = 6/7\) or \(z = e^{6/7}\).

Example 4. As light passes through the water of the ocean it is scattered and
absorbed in proportion to the intensity of light present and to the depth that the light
travels. This means that the light available at a certain depth is an exponential function
of depth and is proportional to the light intensity on the surface. In variables, \(I = A e^{ch}\),
where h is the depth, I is the light intensity at depth h, A is the intensity at the surface,
and c is some constant. It is measured that at 10 feet the light intensity in Puget Sound is
60% of what it is on the surface. At what depth is the light 1/6 of the light on the surface?
Solution: We can use the information to find the value of c. With h = 10 we have
I = 0.6A, so \(0.6A = A e^{10c}\). Hence \(c = 0.1\ln 0.6 \approx -0.05108256\). We find h at which
I = A/6: \(A/6 \approx A e^{-0.05108256\,h}\) and \(-0.05108256\,h \approx \ln(1/6) = -\ln 6\). So we get
\(h \approx \ln(6)/0.05108256 \approx 35.08\) feet.

An alternative approach consists of noting that for each multiple of 10 feet the light inten-
sity is multiplied by a factor of 0.6, so to get to 1/6 of the light would require x multiples
of 10 feet where \(0.6^x = 1/6\). Taking the logarithm of both sides we find \(x\ln(0.6) = \ln(1/6)\),
so \(x = \ln(1/6)/\ln(0.6) \approx 3.508\). The depth is \(h = 10x \approx 35.08\) feet.
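
The calculation can be sketched in Python as follows. The function `depth_for_fraction` and its default arguments (10 feet, 60%) are illustrative and encode the measurement quoted in the example; they are not part of the original text.

```python
import math

def depth_for_fraction(fraction, ref_depth=10.0, ref_fraction=0.6):
    """Depth h (in feet) at which I/A equals `fraction`, assuming I = A*exp(c*h)
    and that the intensity is `ref_fraction` of the surface value at `ref_depth`."""
    c = math.log(ref_fraction) / ref_depth   # negative decay constant
    return math.log(fraction) / c

print(round(depth_for_fraction(1 / 6), 2))   # close to 35.08 feet
```

Because c is computed inside the function, the same sketch works for other bodies of water once a reference measurement is known.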

Example 5. Suppose that a certain bond is to return 5% for 10 years. The bond will
be worth $3,000 at maturity (10 years from the present). What is the present value of the
bond?
Solution: Let P denote the present value. We assume that interest is compounded contin-
uously. Then \(3,000 = P e^{0.05 \times 10} = P e^{0.5}\) and \(P = 3,000/e^{0.5} \approx 1,819.59\).

Example 6. In financial models there is often a relation between two variables, say
the value of a stock, v, and time, t, that is not linear or a polynomial, but some other
power. For instance, as part of the Box-Cox method, it is assumed that for a short time
near t = 1 one might have \(v = A\,t^{\lambda}\) for some unknown parameter λ, and with A being the
value of the stock at time t = 1. The question is how to determine λ from data.
Solution: Taking the logarithm of both sides of \(v = A\,t^{\lambda}\) yields \(\ln v = \ln A + \lambda \ln t\). So λ
is the slope of the line that fits the logarithm of the data. Fitting a line through data is
much easier than fitting a curve with an unknown parameter.
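
To make the idea concrete, the sketch below fits a least-squares line through the points (ln t, ln v) using only the Python standard library. The data are synthetic and noise-free, generated from v = 50 t^0.3 purely for illustration; the function name `fit_power_law` is likewise an assumption of this sketch, and ordinary least squares is one reasonable way to fit the line, not necessarily the method the Box-Cox procedure prescribes.

```python
import math

def fit_power_law(ts, vs):
    """Least-squares line through (ln t, ln v); returns (A, lam) for v = A * t**lam."""
    xs = [math.log(t) for t in ts]
    ys = [math.log(v) for v in vs]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    lam = sxy / sxx                    # slope of the fitted line
    ln_a = y_mean - lam * x_mean       # intercept, equal to ln(A)
    return math.exp(ln_a), lam

# Synthetic data (illustrative only): v = 50 * t**0.3 near t = 1.
ts = [0.8, 0.9, 1.0, 1.1, 1.2]
vs = [50 * t ** 0.3 for t in ts]
A, lam = fit_power_law(ts, vs)
print(A, lam)   # recovers A near 50 and lam near 0.3
```

With real, noisy data the recovered slope would only approximate λ, but the procedure is the same.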

Example 7. After a patient is given a certain medicine it takes 1 hour for the medica-
tion to be released and then the amount of medication in the blood declines exponentially.
It is found that a patient starts with 1 milligram per liter (medication per volume of
blood) one hour after the administration of the medicine and that 4 hours later the level
has declined to 0.3 mg/L. How often should the medication be administered if a level of
0.4 mg/L is the desired minimum?
Solution: Let t be the time (in hours) since the medication is released, and let M denote the con-
centration of the medicine (in mg/L). Then \(M = 1 \cdot e^{-ct}\) for some (positive) constant c. We want
t so that M = 0.4 because the medication should be administered to keep that level (or
higher).
Calculating c from the data, we have \(0.3 = e^{-4c}\), or \(c = -0.25\ln 0.3 \approx 0.300993\). The
level drops to 0.4 when \(-0.300993\,t \approx \ln 0.4\) or when \(t \approx 3.044\) hours. So the medication
should be administered about every 3 hours. (We are assuming that the
amount given would be calculated to raise the concentration to 1 mg/L each time.)

Example 8. The half-life of radioactive carbon (a version of C-14) is about 5700
years. How long does it take for there to be only 1% of the original radioactive carbon in
an object?
Solution: The assumption here is that the half-life is the time that it takes for half the
radioactive material to break down. Let t denote time, in years. The proportion of the
original amount, p, satisfies \(p = e^{ct}\) for some constant c (which we expect to be negative
because the proportion is 1 at time t = 0 and decreases thereafter). We have \(0.5 = e^{5700c}\)
so \(c = \ln(0.5)/5700 \approx -0.000121605\). To find the time t for which p = 0.01 we solve
\(0.01 = e^{ct}\). Thus \(t = \ln(0.01)/c \approx 37,869.98\) or about 37,870 years.
As an alternative approach to this question, one could model the radioactive decay of the
carbon by \(p = 0.5^{t/5700}\). The resulting equation is \(0.01 = 0.5^{t/5700}\) and can be solved by
taking the logarithm of both sides: \(\ln(0.01) = (t/5700)\ln(0.5)\) or \(t = 5700\,(\ln 0.01/\ln 0.5)\).
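
The alternative formula translates directly into a few lines of Python. The function name `years_until_fraction` and the constant name are illustrative choices; the half-life of 5700 years is the approximate value used in the text.

```python
import math

HALF_LIFE_C14 = 5700.0  # years, the approximate value used in the text

def years_until_fraction(fraction, half_life=HALF_LIFE_C14):
    """Time t for which 0.5 ** (t / half_life) equals `fraction`."""
    return half_life * math.log(fraction) / math.log(0.5)

print(round(years_until_fraction(0.01)))   # about 37,870 years
```

Passing a different fraction gives the age for any measured proportion of the original radioactive carbon.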

Example 9. Why do people use the notion of half-life? What is the full life of a
radioactive isotope? How do scientists know how much radioactive carbon there is in
something to begin with? That is, why can we measure how much of the original amount
is left when the original amount was never measured?
Discussion: It takes arbitrarily long for all of a radio-active element to decay because the
amount of decay gets proportionately smaller when only a small amount of active material
is left. So there is no notion of the full life of a radioactive isotope. Scientists have to
assume that the proportion of radio-active carbon remains approximately uniform in the
air, from which plants absorb the carbon. It is only after the carbon is used in the plant
(from which an animal might get it) that the radio-active portion decays relative to the
stable non-radioactive carbon.

Example 10. The half-life of radioactive carbon (a version of C-14) is about 5700
years. A mammoth bone is found to contain 11% of the original radioactive carbon. How
old is the bone?
Solution: Let t denote time, in years. The proportion of the original amount, p, satisfies
\(p = e^{ct}\) for \(c = \ln(0.5)/5700 \approx -0.000121605\) (which we found in an earlier example). To
find the time t for which p = 0.11 we solve \(0.11 = e^{ct}\). Thus \(t = \ln(0.11)/c \approx 18,151.22\)
years.

Example 11. Heated objects cool down in proportion to the difference between their
temperature and the surrounding temperature. As a result the excess heat, that is the
heat above the ambient temperature, decays exponentially. Suppose that a cup of tea was
made with boiling water, at 212°F, and was left in a room at 65°F. After 2 minutes the
tea was at 180°F. How long must one wait for the tea to reach 120°F?
Solution: Let t denote time (in minutes) and H denote the temperature of the tea (in °F). Then the
temperature is 65°F plus the excess heat, which starts at 212 − 65 = 147. It is this excess heat which
decays. In terms of the variables, \(H = 65 + 147e^{ct}\). Now we know that at t = 2 the temperature is
H = 180, so \(180 = 65 + 147e^{2c}\) and \(c = \ln(115/147)/2\). The temperature reaches 120 for
t which satisfies \(120 = 65 + 147e^{ct}\), or \(t = \ln(55/147)/c\). Writing this with the value of c
found above, \(t = 2\ln(55/147)/\ln(115/147) \approx 8.0089\) minutes.
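
The two steps of the solution (find c from one observation, then solve for t) can be sketched in Python. The function `cooling_time` and its default arguments are illustrative; the defaults simply encode the numbers of this example.

```python
import math

def cooling_time(target, start=212.0, ambient=65.0, t_obs=2.0, temp_obs=180.0):
    """Minutes until the temperature reaches `target`, assuming the excess over
    the ambient temperature decays exponentially: H = ambient + excess0 * exp(c*t)."""
    excess0 = start - ambient
    # Decay constant from the single observation (t_obs, temp_obs).
    c = math.log((temp_obs - ambient) / excess0) / t_obs
    return math.log((target - ambient) / excess0) / c

print(round(cooling_time(120.0), 2))   # about 8.01 minutes
```

Changing the keyword arguments adapts the sketch to other starting temperatures, rooms, or observations.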

Example 12. The strength of earthquakes is reported according to a logarithmic
scale. The advantage is that small numbers can be used to describe both strong and
weak earthquakes. The commonly used scale, called the Richter scale, assigns a number
R according to
\[ R = \log_{10}(E/E_0) \]
where E is the energy of the quake and \(E_0\) is a fixed value at which the Richter scale
is zero. How much more energetic was the 2005 earthquake in northern California which
measured 7.2 on the Richter scale compared to the 2007 earthquake in the Nisqually delta
which measured 6.2 on the Richter scale?
Solution: Let \(E_2\) be the energy of the California quake, and let \(E_1\) be the energy of the
Nisqually quake. Then \(7.2 = \log_{10}(E_2/E_0)\) and \(6.2 = \log_{10}(E_1/E_0)\) for some (positive)
constant \(E_0\). To compare the two energies, \(E_2/E_0 = 10^{7.2}\) and \(E_1/E_0 = 10^{6.2}\) and
\(E_2/E_1 = 10^{7.2}/10^{6.2} = 10\). So the California quake carried 10 times as much energy.

EXERCISES.
1. A certain municipal bond will pay $15,000 at the end of 8 years. Suppose the
current bid on the bond is $11,000. What is the continuous interest rate for this bond?
2. A certain bond will pay $10,000 at the end of 7 years. Suppose the current bid on
the bond is $8,600. What is the continuous interest rate for this bond?
3. A certain bond will pay $5,000 at the end of 20 years. Suppose the current bid on
the bond is $1,300. What is the continuous interest rate for this bond?
4. A certain bond is to return 6% for 10 years. The bond will be worth $12,000 at
maturity (10 years from the present). What is the present value of the bond?
5. A certain bond is to return 4% for 20 years. The bond will be worth $10,000 at
maturity. What is the present value of the bond?
6. A patient is given a certain medication by intravenous injection, so the medication
appears in the blood immediately. It is found that a patient starting with 1 milligram per
liter (medication per volume of blood) has a level of 0.3 mg/L four hours later. How
often should the medication be administered if a level of 0.7 mg/L is the desired minimum?
7. A patient has 5 liters of blood and is given a certain medication by intravenous
injection, so the medication appears in the blood immediately. It is found that a patient
starting with an injection of 3 milligrams has a level of 0.2 mg/L four hours later. How
often should the medication be administered if a level of 0.4 mg/L is the desired minimum?
8. As above, a patient has 5 liters of blood and is given a certain medication by
intravenous injection, so the medication appears in the blood immediately. It is found
that a patient starting with an injection of 3 milligrams has a level of 0.2 mg/L four
hours later. How often should the medication be administered if a level of 0.4 mg/L is the
desired minimum? And what amount of medication should be injected at each subsequent
instance (to reach the level achieved by the initial injection of 3 milligrams)?
9. The half life of radioactive carbon (a version of C-14) is about 5700 years. A
mammoth bone is found to contain 6% of the original radioactive carbon. How old is the
bone?
10. The half life of radioactive carbon (a version of C-14) is about 5700 years. A
dinosaur bone is found to contain 0.5% of the original radioactive carbon. How old is the
bone?
11. The half life of radioactive carbon is about 5700 years. A mummified dog is found
to contain 70% of the original radioactive carbon. How old is the mummy?
12. Suppose that a cup of tea was made with boiling water, at 212°F, and was left in
a room at 70°F. After 2 minutes the tea was at 190°F. How long must one wait for
the tea to reach 150°F?
13. Suppose that a body at 98.6°F was left in a room at 65°F. When found, the
body’s temperature was 80°F and an hour after being found its temperature was
75°F. How long was the body in the room before it was found?
14. Suppose that a person whose body temperature is 98.6°F falls in a lake whose
water temperature is 40°F. After 10 minutes the person’s temperature drops to 97.8°F.
How long does the person have before her or his body temperature reaches 96°F (a level
dangerous to organ function)?
15. How much more energetic is an earthquake which measures 9.2 on the Richter
scale, the largest level ever recorded, compared to the 2011 earthquake in Chile which
measured 8.1 on the Richter scale?

16. The intensity of sound is, like earthquakes, measured on a logarithmic scale with
a base 10 logarithm. The units are called SPL (sound pressure level). How much more
energetic is a sound at 107 SPL, the limit for safety near machinery, than a sound at 80
SPL, which is a comfortable listening level? What is the proportion of energies for a very
large noise measuring 120 SPL as compared to a sound at 80 SPL?
17. Insulation for houses and buildings is classified according to the rate of heat
lost. Suppose that insulation that is 3.5 inches deep reduces the rate of loss of heat by
70 percent (so 30 percent is still lost). What is the rate of loss with insulation that is 5.5
inches deep? In developing this model consider what might be reasonable assumptions on
the cumulative effect of additional insulation.
18. Suppose that a cup of tea was made with boiling water, at 212°F, and was left in
a room at 70°F. After 2 minutes the tea was at 190°F. Calculate the temperature
of the tea as a function of time.
19. Suppose that a cup of tea behaves as above. However, after the tea was made it
was mixed with milk with the milk at 40°F. The ratio of milk to tea is 1 to 10. Suppose
that the temperature of the combination of the tea and milk is the weighted average of
the two temperatures (1 part 40°F and 10 parts 212°F). Calculate the temperature of the
drink after 4 minutes.
20. Suppose that a cup of tea behaves as above. However, the tea cools for 4 minutes
and then the milk is added. Calculate the resulting temperature.

E. Financial Series and Exponentials.

Our goal is to compute the cumulative value of a sequence of payments. For example,
how large should a payment be if the borrower will repay a loan of $11,000 in 60 monthly
installments? Or what investment is justified in a property that produces an income of
$2,000 each month?
The approach taken involves understanding geometric sequences and series. In outline,
we define such sequences using the familiar notion of compound interest. Then we compute
the sum of geometric series. And finally, we put these summation formulae to work in
financial applications.

A sequence of numbers is an ordered list of numbers. The number that holds the
nth place in the list is called the nth term of the sequence.

Example 1. The sequence {2, 4, 6, 8, . . .} has the first 4 terms specified. The third
term is 6 and the fourth term is 8. We might guess a formula for the general term by
stating that the nth term is \(a_n = 2n\). This formula does agree with the terms given in
that when n is 1, \(a_1 = 2 \times 1 = 2\), when n is 2, \(a_2 = 2 \times 2 = 4\), \(a_3 = 2 \times 3 = 6\), and
\(a_4 = 2 \times 4 = 8\).

Example 2. The sequence {2, −4, 6, −8, 10, . . .} has the first 5 terms specified. The
third term is 6 and the fourth term is −8. We might guess a formula for the general term
by stating that the nth term is \(a_n = (-1)^{n-1}\,2n\). This formula does agree with the first 5
terms.

Example 3. The sequence {2, 4, 8, 16, 32, . . .} has the first 5 terms specified. The
third term is 8 and the fourth term is 16. We might guess a formula for the general term
by stating that the nth term is \(a_n = 2^n\). This formula does agree with the first 5 terms.

Definition. For any investment, P , whose value at the end is V (after a duration of t
years, say), the amount V is called the future value of the investment. The amount P is
called the principal or present value of the investment.

Example 4. Simple interest is interest that is not compounded. Suppose that simple
interest at 10% a year is paid on a loan of $3,000. The loan will be paid back after 6
months. What is the future value of the loan?
Solution: The interest rate is 0.10 and is paid over half a year. So the interest paid is
0.5 × 0.10 × 3,000 = 150. The future value is the principal and the interest, or $3,150.

Example 5. Suppose that simple interest at 10% a year is paid on a loan of $3,000.
The loan will be paid back in full (with the interest) after 4 years. What is the future
value of the loan?
Solution: The interest rate is 0.10 and is paid over 4 years. So the interest paid is
4 × 0.10 × 3,000 = 1,200. The future value is the principal and the interest, or $4,200.

Example 6. The interest rate on a loan of $10,000 is 5% and is compounded every
year. The loan is paid after 7 years. Calculate the future value of the loan at the end of
the 7 year period.
Solution: After each year the principal is increased by 5% over its value at the beginning
of the year. Hence the principal is multiplied by 1.05 each year. The resulting value after
7 years is \(10,000 \times (1.05)^7 \approx 14,071.00\) dollars.

Example 7. A payment of $200,000 will be made 20 years into the future. Suppose
the interest rate used to discount this payment is 4.8% and is compounded annually.
Calculate the present value of the payment.
Solution: After each year the principal is decreased by a factor corresponding to the interest
rate of 4.8%. Hence the principal is divided by 1.048 each year. The resulting value after
discounting 20 times is \(200,000/(1.048^{20}) \approx 78,307.68\) dollars.
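
Examples 6 and 7 are two directions of the same calculation: multiply by (1 + rate) once per year to move forward, divide to move back. A minimal Python sketch, with function names chosen here for illustration and annual compounding assumed:

```python
def future_value(principal, rate, years):
    """Value after `years` of annual compounding at `rate`."""
    return principal * (1 + rate) ** years

def present_value(future, rate, years):
    """Value today of an amount `future` received `years` from now."""
    return future / (1 + rate) ** years

print(f"{future_value(10_000, 0.05, 7):,.2f}")       # Example 6, about 14,071
print(f"{present_value(200_000, 0.048, 20):,.2f}")   # Example 7, about 78,308
```

Discounting and then compounding over the same period returns the original amount, which is a quick consistency check on both functions.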

In finance we are often interested in the sum of the terms of a sequence or of the first part
of a sequence.

Example 8. A loan of $2,400 is taken out at 3% interest. At the end of each month,
the interest on the remaining balance is paid, and in addition $100 of the principal is paid.
Calculate the total amount paid on the loan.
Solution: First note that it takes 24 months to pay the loan back. At the beginning of the
first month, the principal is $2,400. Since the monthly interest rate is 0.03/12 = 0.0025,
the interest paid on this $2,400 is 2,400 × 0.0025 = 6. The total paid at the end of the first
month is this interest plus $100 for a total of 106. At the beginning of the second month,
the principal is $2,300. The interest paid on this is 2,300 × 0.0025 = 6 − 0.25. The total
paid at the end of the second month is this interest plus $100 for a total of 106 − 0.25.
At the beginning of the third month, the principal is $2,200. The interest paid on this is
2,200 × 0.0025 = 6 − 0.50. The total paid at the end of the third month is this interest
plus $100 for a total of 106 − 0.50. This pattern continues, so the amount paid at the end
of month m is 106 − (m − 1) · 0.25. The total amount paid is

106 + (106 − 0.25) + (106 − 0.50) + . . . + (106 − 5.75) = 106 + 105.75 + 105.50 + . . . + 100.25.

To calculate the value of this sum, it is useful to know that for a sequence of n terms, with
values \(a_1 = s\), \(a_2 = s + d\), \(a_3 = s + 2d\), and so on, the sum is

s + (s + d) + (s + 2d) + . . . + (s + (n − 1)d) = n s + 0.5 n(n − 1)d.

In our example,

106 + (106 − 0.25) + (106 − 0.50) + . . . + (106 − 5.75)
= 24 · 106 + 0.5 · 24 · 23 · (−0.25) = 2,544 − 69 = 2,475.
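
The total can also be found by stepping through the loan month by month and compared with the arithmetic-sum formula. The sketch below assumes, as in the example, that interest is charged monthly at one twelfth of the annual rate on the outstanding balance; the function names are illustrative.

```python
def total_paid(principal, annual_rate, principal_payment):
    """Sum of all monthly payments: interest on the balance plus a fixed
    repayment of principal, until the balance reaches zero."""
    monthly_rate = annual_rate / 12
    balance = principal
    total = 0.0
    while balance > 0:
        total += balance * monthly_rate + principal_payment
        balance -= principal_payment
    return total

def arithmetic_sum(s, d, n):
    """s + (s + d) + ... + (s + (n - 1) d) = n*s + 0.5*n*(n - 1)*d."""
    return n * s + 0.5 * n * (n - 1) * d

print(total_paid(2_400, 0.03, 100))    # month-by-month total, about 2475
print(arithmetic_sum(106, -0.25, 24))  # closed form, 2475.0
```

The loop and the closed form agree up to rounding, which confirms that the payments form an arithmetic sequence with first term 106 and common difference −0.25.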

A sequence in which the terms differ by a constant amount is called an arithmetic
sequence. To understand the formula for arithmetic sums,

s + (s + d) + (s + 2d) + . . . + (s + (n − 1)d) = n s + 0.5 n(n − 1)d,

notice that s appears n times, and that d appears n − 1 times and with the coefficients
1, 2, 3, . . . , (n − 1). If we pair 1 with n − 1, 2 with n − 2, 3 with n − 3, and so on, then when
n − 1 is even we get (n − 1)/2 = 0.5(n − 1) pairs and each pair adds to exactly n. The reader can check that
if n − 1 is not divisible by two, then the value of the middle term is n/2 and so one has
0.5(n − 2)n from the pairs, plus 0.5n from the middle term. So the sum of the coefficients in all
cases is 0.5(n − 1)n, and the terms involving d add up to 0.5(n − 1)n d.

Example 9. A loan of $6,000 will be paid by paying the principal in 20 monthly

installments of $300 each. The interest rate is 4% and will be paid on the outstanding
balance each month. Calculate the total amount paid on the loan.
Solution: For this example we’ll consider the last payment first. The last payment will
consist of the last $300 of the principal plus the interest paid on this last $300. Since the
monthly interest rate is 0.04/12, the amount of interest paid is 300 × 0.04/12 = 1. The
total paid on the last month is then 300 + 1. On the second to last month, the payment
will consist of $300 of the principal plus the interest paid on the remaining balance of $600.
The total paid on the second to last month is then 300 + 2. On the third to last month,
the payment will consist of $300 of the principal plus the interest paid on the remaining
balance of $900. The total paid on the third to last month is then 300 + 3. Since there are
a total of 20 payments, the total amount paid is

\[ (300 + 1) + (300 + 2) + (300 + 3) + \dots + (300 + 20) = \frac{20}{2}(301 + 320) = 6,210. \]

We often want the sum of loan payments or income that include compounded interest
or are discounted using compounded interest.

Example 10. (The setting; the solution is completed later.) Suppose that a loan with an
annual interest rate of 5% is available and a person is willing to make payments of $100
a month for 4 years. Assume that the interest is compounded monthly. How much
can the person borrow?
Solution Method: The loan is made in the present (time = 0) and the payments are made
in the future (in 48 installments, each a month apart). We will calculate what each future
payment is worth in the present.
When the annual interest rate is r, the monthly interest rate is r/12. Hence, a payment
in the amount of F made m months into the future will contribute \(P_m\) to the present value
where
\[ F = P_m \left(1 + \frac{r}{12}\right)^m. \]
So the present value of this future payment, the value that it contributes to the size of the
loan, is
\[ P_m = F \Big/ \left(1 + \frac{r}{12}\right)^m = F\left(1 + \frac{r}{12}\right)^{-m}. \]
Before continuing with this example, let us recognize the form of this sequence of
contributions and analyze the sum of such terms.

A sequence for which the ratio of terms is constant is called a geometric sequence. The
sum of all the terms is called a geometric series.

Example 11. The sequence {1, 2, 4, 8, 16, 32, . . .} is a geometric sequence for which
the common ratio is 2. The corresponding series is 1 + 2 + 4 + 8 + 16 + 32 + . . ..

Example 12. The sequence \(\{1, 1.05, 1.05^2, 1.05^3, \dots\}\) is a geometric sequence for
which the common ratio is 1.05. The corresponding series is \(1 + 1.05 + 1.05^2 + 1.05^3 + \dots\).

Example 13. The sequence \(\{1, 0.95, 0.95^2, 0.95^3, \dots\}\) is a geometric sequence for
which the common ratio is 0.95. The corresponding series is \(1 + 0.95 + 0.95^2 + 0.95^3 + \dots\).

In general, a geometric series is a multiple of \(1 + a + a^2 + a^3 + \dots\), where a is the common
ratio of the terms. We calculate the value of such a series.
Set
\[ s_n = 1 + a + a^2 + a^3 + \dots + a^{n-2} + a^{n-1} + a^n. \]

To calculate the value of \(s_n\) we observe that if we multiply this sum by the common ratio
a, every term moves one place along, so the two sums differ only in the first and last terms:

\[ a\,s_n = a + a^2 + a^3 + a^4 + \dots + a^{n-1} + a^n + a^{n+1}. \]

Subtracting, \(s_n - a\,s_n = 1 - a^{n+1}\), so \((1 - a)\,s_n = 1 - a^{n+1}\).

The case a = 1 is pretty boring, since then all n + 1 terms equal 1 and \(s_n = n + 1\). If \(a \neq 1\),
then
\[ \text{Geometric sum:}\quad s_n = 1 + a + a^2 + a^3 + \dots + a^n = \frac{1 - a^{n+1}}{1 - a}, \quad a \neq 1. \]
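
The closed form can be compared against adding the terms one at a time. In this Python sketch the function name `geometric_sum` is an illustrative choice, and the case a = 1 is deliberately excluded because the formula divides by 1 − a.

```python
def geometric_sum(a, n):
    """1 + a + a**2 + ... + a**n via the closed form (requires a != 1)."""
    if a == 1:
        raise ValueError("the closed form requires a != 1")
    return (1 - a ** (n + 1)) / (1 - a)

a, n = 1.05, 10
direct = sum(a ** k for k in range(n + 1))   # term-by-term sum
print(direct, geometric_sum(a, n))           # the two agree up to rounding
```

For long sums the closed form needs one power and one division, while the direct sum needs n additions, which is what makes the formula useful for loans with hundreds of payments.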

Example 14. In our loan example from above, the annual interest rate was 5% and
the monthly payments were $100 for 48 months. The present value of the payment
made in the mth month is
\[ P_m = 100\left(1 + \frac{0.05}{12}\right)^{-m} = 100\left(\frac{12.05}{12}\right)^{-m} = 100\left(\frac{12}{12.05}\right)^{m}. \]
Suppose now that the first payment is made at the very beginning and the second payment
is made after one month and that the last one is at the end of 47 months (the beginning
of the 48th month). The sum of the present values is
\[ \text{Total} = 100\left[1 + \frac{12}{12.05} + \left(\frac{12}{12.05}\right)^2 + \left(\frac{12}{12.05}\right)^3 + \dots + \left(\frac{12}{12.05}\right)^{47}\right] \]
and we can use the formula for the geometric sum, \(s_n\), with n = 47 and a = 12/12.05, so
\[ \text{Total} = 100\left[1 - \left(\frac{12}{12.05}\right)^{48}\right] \Big/ \left[1 - \frac{12}{12.05}\right] \approx 100 \times 43.6039 = 4,360.39. \]

This is the amount that the person can borrow.

Example 15. In our loan example just above, suppose that the first payment is made
at the end of the first month and the second payment is made at the end of the second
month and that the last payment is at the end of 48 months. The sum of the present
values is
\[ \text{Total} = 100\left[\frac{12}{12.05} + \left(\frac{12}{12.05}\right)^2 + \left(\frac{12}{12.05}\right)^3 + \dots + \left(\frac{12}{12.05}\right)^{48}\right] \]
and we can see that the total is now what we had before but multiplied once more by
a = 12/12.05:
\[ \text{Total} = 100\,\frac{12}{12.05}\left[1 - \left(\frac{12}{12.05}\right)^{48}\right] \Big/ \left[1 - \frac{12}{12.05}\right] \approx 100 \times 43.4230 = 4,342.30. \]
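
The two timing conventions of Examples 14 and 15 differ only by one extra factor of a. A Python sketch, assuming monthly compounding at one twelfth of the annual rate; the function name and the flag `first_payment_now` are illustrative.

```python
def loan_present_value(payment, annual_rate, n_payments, first_payment_now):
    """Present value of equal monthly payments, discounted monthly.

    first_payment_now=True  : payments at months 0, 1, ..., n_payments - 1
    first_payment_now=False : payments at months 1, 2, ..., n_payments
    """
    a = 1 / (1 + annual_rate / 12)                   # monthly discount factor
    total = payment * (1 - a ** n_payments) / (1 - a)
    return total if first_payment_now else a * total

print(round(loan_present_value(100, 0.05, 48, True), 2))    # Example 14, about 4,360
print(round(loan_present_value(100, 0.05, 48, False), 2))   # Example 15, about 4,342
```

Delaying every payment by one month lowers the amount that can be borrowed, since each payment is discounted once more.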

Example 16. Suppose that a retirement account will have deposits of $100 made
monthly for the next 40 years, and that interest on the account will be paid monthly at an
annual interest rate of 5%. How much will be in the account at the end of the 40 years?
Solution: The future value of the deposit made at the very end is $100. The value of the
deposit made in the mth month before the end of the 40-year period is
\[ F_m = 100\left(1 + \frac{0.05}{12}\right)^m. \]
Suppose now that the first payment is made after one month and so on for the remaining
12 × 40 − 1 = 479 months (480 payments, with the last getting no interest). The sum of
the future values is the amount in the account at the end:
\[ \text{Total} = 100\left[1 - \left(1 + \frac{0.05}{12}\right)^{480}\right] \Big/ \left[1 - \left(1 + \frac{0.05}{12}\right)\right] \approx 100 \times 1526.020156 \approx 152,602.02. \]
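
The closed form can be cross-checked by simulating the account month by month: each month the balance earns interest and then receives a deposit. This Python sketch uses illustrative function names and assumes a monthly rate of one twelfth of the annual rate, as in the example.

```python
def simulated_balance(deposit, annual_rate, n_deposits):
    """Balance right after the last deposit, built up one month at a time."""
    monthly = annual_rate / 12
    balance = 0.0
    for _ in range(n_deposits):
        balance = balance * (1 + monthly) + deposit
    return balance

def closed_form_balance(deposit, annual_rate, n_deposits):
    """Same balance from the geometric sum with ratio 1 + annual_rate/12."""
    g = 1 + annual_rate / 12
    return deposit * (1 - g ** n_deposits) / (1 - g)

print(round(simulated_balance(100, 0.05, 480), 2))
print(round(closed_form_balance(100, 0.05, 480), 2))   # about 152,602
```

Only $48,000 of the final balance comes from deposits; the rest is accumulated interest.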

Example 17. Suppose that an investor is considering purchasing a business that will
have a net income of $3,000 a month for the next 20 years. This investor is considering the
value of the business in light of an annual interest rate of 6%. Assume that the business
will have no resale value after the 20 years (or, that costs during this time will balance
this resale value). How much is the investor willing to pay for the business?
Solution: The present value of the income (received payment) in the mth month is
\[ P_m = 3,000\left(1 + \frac{0.06}{12}\right)^{-m} = 3,000\left(\frac{1}{1.005}\right)^{m}. \]
Suppose now that the first payment is made at the very beginning and the second payment
is made after one month and so on for the remaining 12 × 20 − 1 = 239 months. The sum
of the present values is the amount that the investor is willing to pay:
\[ \text{Total} = 3,000\left[1 - \left(\frac{1}{1.005}\right)^{240}\right] \Big/ \left[1 - \frac{1}{1.005}\right] \approx 3,000 \times 140.278676 \approx 420,836.03. \]

Example 18. In the previous example, what is the value of the business if it lasts
forever, rather than just for 20 years?
Solution: Extending the payments forever amounts to replacing the 240 payments above
with arbitrarily many payments. As the power of (1/1.005) grows, the size of the result
gets very small. That is, \((1/1.005)^n\) is very close to zero when n is very large. The sum of
the present values becomes arbitrarily close to
\[ \text{Total} = 3,000\,\big[1 - 0\big] \Big/ \left[1 - \frac{1}{1.005}\right] = 3,000 \times \frac{1.005}{0.005} = 3,000 \times 201 = 603,000. \]
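
The approach to the limiting value can be seen numerically by lengthening the horizon. In this Python sketch the function name `income_present_value` and the chosen horizons are illustrative; payments are assumed to start immediately, as in Example 17.

```python
def income_present_value(payment, annual_rate, n_payments):
    """Present value of monthly payments at months 0, 1, ..., n_payments - 1."""
    a = 1 / (1 + annual_rate / 12)          # monthly discount factor
    return payment * (1 - a ** n_payments) / (1 - a)

# Limiting value when the payments never stop: payment / (1 - a).
perpetuity = 3_000 * (1 + 0.06 / 12) / (0.06 / 12)

for years in (20, 50, 100, 200):
    print(years, round(income_present_value(3_000, 0.06, 12 * years), 2))
print("limit", round(perpetuity, 2))        # about 603,000
```

The 20-year value is already about 70% of the limit, because payments far in the future are discounted heavily.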

Summary.

In our financial applications the main relation is between present and future values.
Let r be the nominal annual interest rate.
Suppose that compounding occurs n times a year.
Suppose that the payment occurs m periods into the future.
Then the future value F and the present value P are related by
\[ F = P\left(1 + \frac{r}{n}\right)^{m} \quad\text{or}\quad P = F \Big/ \left(1 + \frac{r}{n}\right)^{m} = F\left(1 + \frac{r}{n}\right)^{-m}. \]
The sum of payments with compounded interest is the sum of a geometric series:

\[ 1 + a + a^2 + \dots + a^m = \frac{1 - a^{m+1}}{1 - a}, \qquad a \neq 1. \]

With a total number of m future payments, and a payment made n times a year
starting at time zero, and with each payment having size A, the present value is
\[ \text{Total present value} = A\,\frac{1 - \left(1 + \frac{r}{n}\right)^{-m-1}}{1 - \left(1 + \frac{r}{n}\right)^{-1}}. \]
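With m future payments made n times a year after a first payment at time zero (m + 1
payments in all), and with each payment having size A, the accumulated value at the end is
\[ \text{Total future value} = A\,\frac{1 - \left(1 + \frac{r}{n}\right)^{m+1}}{1 - \left(1 + \frac{r}{n}\right)} = \frac{nA}{r}\left[\left(1 + \frac{r}{n}\right)^{m+1} - 1\right]. \]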

EXERCISES.
1. A sequence starts with {1, 3, 5, 7, . . .}. Write the next term in the sequence and
give a formula for the nth term, \(a_n\).
2. A sequence starts with \(\{1, 1.2, 1.2^2, 1.2^3, \dots\}\). Write the next term in the sequence
and give a formula for the nth term, \(a_n\).
3. A sequence starts with {1, 1.1, 1.21, 1.331, 1.4641, . . .}. Write the next term in the
sequence and give a formula for the nth term, \(a_n\).
4. Evaluate the sum 3 + 5 + 7 + . . . + 103. The terms in this series are \(a_n = 1 + 2n\).
5. Evaluate the sum \(1 + 2 + 4 + \dots + 2^{35}\). The terms in this series are \(a_n = 2^{n-1}\).
6. Evaluate the sum \(1 + 1.05 + 1.1025 + \dots + 1.05^{47}\). The terms in this series are
\(a_n = 1.05^{n-1}\).
7. Evaluate the sum \(1 + 0.5 + 0.25 + \dots + 0.5^{99}\). The terms in this series are \(a_n = 0.5^{n-1}\).
8. Evaluate the sum \(1 + 0.99 + 0.9801 + \dots + 0.99^{99}\). The terms in this series are
\(a_n = 0.99^{n-1}\).

9. Suppose that a car loan with an annual interest rate of 6% is available and a person
is willing to make payments of $200 a month for 5 years. Assume that the interest
is compounded monthly. How much can the person borrow?

10. Suppose that a car loan with an annual interest rate of 4% is available and a
person is willing to make payments of $200 a month for 5 years. Assume that the
interest is compounded monthly. How much can the person borrow?

11. Suppose that a car loan with an annual interest rate of 5% is available and a
person is willing to make payments of $200 a month for 5 years. Assume that the
interest is compounded monthly. How much can the person borrow?

12. Suppose that an investor is considering purchasing a business that will have a net
income of $1,000 a month for the next 20 years. This investor is considering the value of
the business in light of an annual interest rate of 4%. Assume that the business will have
no resale value after the 20 years. How much is the investor willing to pay for the business?

13. Suppose that an investor is considering purchasing a business that will have a net
income of $1,000 a month for the next 40 years. This investor is considering the value of
the business in light of an annual interest rate of 4%. Assume that the business will have
no resale value after the 40 years. How much is the investor willing to pay for the business?

14. Suppose that an investor is considering purchasing a business that will have a net
income of $5,000 a month for the next 20 years. This investor is considering the value of
the business in light of an annual interest rate of 4%. Assume that the business will have
no resale value after the 20 years. How much is the investor willing to pay for the business?

15. Suppose that an investor is considering purchasing a business that will have a net
income of $1,000 a month for the next 20 years. This investor considers the value of the
business in light of an annual interest rate of 7%. Assume that the business will have no
resale value after the 20 years. How much is the investor willing to pay for the business?

16. Suppose that a savings account will have deposits of $300 made monthly for the
next 25 years, and that interest on the account will be paid monthly at an annual interest
rate of 3%. How much will be in the account at the end of the 25 years?

17. Suppose that a home loan with an annual interest rate of 4% is available and a
person is willing to make payments of $1,000 a month for 30 years. Assume that
the interest is compounded monthly. How much can the person borrow?
18. Suppose that a loan with an annual interest rate of 4.5% is available and a person
is willing to make payments of $1,000 a month for 30 years. Assume that the
interest is compounded monthly. How much can the person borrow?
19. Suppose that a savings account will have deposits of $300 made monthly for the
next 30 years, and that interest on the account will be paid monthly at an annual interest
rate of 3%. How much will be in the account at the end of the 30 years?
20. Suppose that a savings account will have deposits of $300 made monthly for the
next 25 years, and that interest on the account will be paid monthly at an annual interest
rate of 4%. How much will be in the account at the end of the 25 years?
21. An annuity is an account that pays income into the future, in exchange for a single
purchase price at the beginning. Suppose that an annuity will pay an income of $2,000 a
month for the next 20 years. The annual interest rate is 4%. Calculate the purchase price
for this annuity.
22. Suppose that an annuity will pay an income of $2,000 a month for the next 30
years and that the annual interest rate is 4%. Calculate the purchase price for this annuity.
23. Suppose that an annuity will pay an income of $2,000 a month for the next 40
years and that the annual interest rate is 4%. Calculate the purchase price for this annuity.
24. Suppose that an annuity will pay an income of $2,000 a month for the next 30
years and that the annual interest rate is 3%. Calculate the purchase price for this annuity.
25. Suppose that an annuity will pay an income of $2,000 a month for the next 30
years and that the annual interest rate is 2%. Calculate the purchase price for this annuity.
7. Probability

Many situations involve either risk or uncertainty. To describe these situations, and
to decide on the optimal strategies in such situations, notions from probability are needed.
For instance, suppose that a manufacturer of light emitting diodes (LEDs) seeks a
strategy for testing the lights. There is a fairly small likelihood that any given LED is
faulty, so it makes sense to consider testing a string of lights together. If the string of
LEDs passes the test, then all are shipped, and if the string fails the test, then all the
LEDs are scrapped (even though some might still function correctly). The savings might
come in reducing the number of tests required to ship a given number of LEDs.

A. Probabilities of Events.

Probabilities represent the likelihood that something happens. One can think of this as
the frequency with which an event occurs within a sequence of experiments. For instance,
if the event is that an LED is faulty, and we have checked 1000 LEDs and found that 3 of
them did not function properly, then we would estimate the probability of an LED being
faulty at 3/1000 = 0.003.
Notation: For an event A we denote its probability by P (A).

Example 1. Suppose that the probability of an LED being defective is 0.003. What
is the probability that an LED is good?
Solution: Something must happen, so the total probability of something happening is
1. The probability of a good LED is thus 1 − 0.003 = 0.997. Notice that in terms of
frequencies, if among 1000 LEDs we have 3 defective lights, then we have 997 lights that
function properly.
We can consider the probability of a series of events. Two events, A and B, are independent
of one another if the likelihood of either one is not changed by the occurrence of the
other. In terms of frequencies, the relative frequency with which the event A occurs
in all experiments equals the relative frequency with which the event A occurs in those
experiments in which B occurs.

Example 2. Suppose that in a population of 100,000 people there are 8,000 who
are left-handed, 40,000 who drink coffee regularly, and 3,200 who are left-handed coffee
drinkers. Is the event that a person drinks coffee regularly independent of the event that
a person is left-handed?
Solution: The likelihood that a person is left-handed, in the population as a whole, is
8,000/100,000 = 0.08. The likelihood that a person drinks coffee, in the population as
a whole, is 40,000/100,000 = 0.4. The likelihood that a person is left-handed, in the
population of coffee drinkers, is 3,200/40,000 = 0.08. The likelihood that a person drinks
coffee, in the population of left-handed people, is 3,200/8,000 = 0.4. Hence the two events
are independent.
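
The comparison of frequencies in this solution can be sketched in Python. The function name and the second, altered count below are hypothetical and serve only as illustration. Comparing the frequency of A among those with B to the frequency of A overall is the same as checking that (count of both) × (total) equals (count of A) × (count of B), which avoids rounding.

```python
def counts_are_independent(total, count_a, count_b, count_both):
    """True when the relative frequency of A among those with B equals that of A overall."""
    # count_both / count_b == count_a / total, rewritten without division.
    return count_both * total == count_a * count_b


# Counts from Example 2: left-handed people and regular coffee drinkers.
example_result = counts_are_independent(100_000, 8_000, 40_000, 3_200)

# Hypothetical variation: 4,000 left-handed coffee drinkers instead of 3,200.
altered_result = counts_are_independent(100_000, 8_000, 40_000, 4_000)
print(example_result, altered_result)
```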

The events A and B are independent of one another when

P (A and B) = P (A) · P (B).

Suppose that an experiment or observation is repeated several times. If the experiments
are independent of one another then the probabilities multiply.

Example 3. Suppose that the probability of an LED being defective is 0.003. Suppose
that we examine a pair of LEDs together. What is the probability of having at least one
defective LED in the pair?
Solution: There are 4 possible combinations. Both LEDs may be good. The first LED
may be defective while the second is good. The first LED may be good with the second
being defective. And both LEDs may be defective. The last 3 combinations all have at
least one defective LED. The probability of an LED being good is 0.997 so the probability
of at least one defective LED is

0.003 × 0.997 + 0.997 × 0.003 + 0.003 × 0.003 = 0.005991.

Example 4. Suppose that the probability of an LED being defective is 0.003. Suppose
that we examine 5 LEDs together. What is the probability of having at least one defective
LED among the 5?
Solution: In this instance it is expedient to recognize an alternative description of having
at least one defective LED. Namely, if all 5 LEDs are good, then none is defective. Hence
the probability that at least one LED is defective is 1 minus the probability that all 5
are good. Now, since the status of each LED is independent of that of the others, the
probability that all 5 LEDs are good is \(0.997^5\). Thus

\[ P(\text{at least one defective LED}) = 1 - 0.997^5 \approx 0.014910270. \]
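
Examples 3 and 4 use two routes to the same kind of answer: adding up the combinations that contain a defect, and subtracting the probability that all are good from 1. The Python sketch below is illustrative only (the function names are not from the text); it implements both routes so that they can be compared for any number of LEDs.

```python
from itertools import product


def at_least_one_by_complement(p_defective, count):
    """1 minus the probability that all `count` independent items are good."""
    return 1 - (1 - p_defective) ** count


def at_least_one_by_enumeration(p_defective, count):
    """Add the probabilities of every good/defective combination containing a defect."""
    total = 0.0
    for combo in product([True, False], repeat=count):  # True marks a defective item
        if any(combo):
            prob = 1.0
            for defective in combo:
                prob *= p_defective if defective else 1 - p_defective
            total += prob
    return total


pair_probability = at_least_one_by_enumeration(0.003, 2)
five_probability = at_least_one_by_complement(0.003, 5)
print(pair_probability, five_probability)
```

Enumeration examines 2^count combinations, so it is practical only for short strings of LEDs, while the complement works for any count.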

EXERCISES.
1. When a coin is flipped the result is either heads or tails. Suppose that the probability
of both outcomes is the same, and hence is 1/2. Suppose that in a series of coin
tosses the outcomes are independent. (a) Calculate the probability of having 3 heads in 3
tosses. (b) Calculate the probability of having 2 heads in 3 tosses. (c) Describe the event
of having 2 heads in 3 tosses in terms of the number of tails.
2. When a coin is flipped the result is either heads or tails. Suppose that the probability
of the two outcomes is not the same. Suppose that in a series of coin tosses the
outcomes are independent. (a) Calculate the probability of having heads and then tails,
in 2 tosses. (b) Calculate the probability of having tails and then heads (in two tosses).
(c) Explain how one could combine the results of parts (a) and (b) to obtain a fair coin
using an unfair coin.
3. The probability of winning in a certain lottery is 1 in 50 million. Suppose a
person buys a single lottery ticket every week for 30 years, that is 1560 times. What is
the likelihood that this person wins the lottery (at least once)? Hint: First calculate the
probability that the person never wins.
4. When a coin is flipped the result is either heads (H) or tails (T). Suppose that
the probability of both outcomes is the same and that the coin is flipped 4 times. Then
the probability of any particular sequence of 4 outcomes is 1/16. (a) How many sequences
include exactly 2 H and 2 T? (b) What is the probability of obtaining exactly two heads
and two tails?
5. When a coin is flipped the result is either heads (H) or tails (T). Suppose that
the probability of both outcomes is the same and that the coin is flipped 5 times. Then
the probability of any particular sequence of 5 outcomes is 1/32. (a) How many sequences
include exactly 3 H and 2 T? (b) What is the probability of obtaining exactly three heads
and two tails?
6. Suppose that in a sample of 1000 ducks, 650 were mallards and 520 were female.
Of the female ducks, 330 were mallards. In this sample, are the two traits (mallard and
female) independent?
7. Suppose that in a sample of 230 students in fourth grade 150 performed at grade
level on a math test and 180 performed at grade level on a language test. Of those who
performed at grade level on the language test, 125 performed at grade level on the math
test. (a) Based on this sample, is performing at grade level on a language test independent
of performing at grade level on a math test? (b) If a student performed at grade level on
a language test, is that student more or less likely to have performed at grade level on a
math test?
8. A group of 3000 US residents aged 70 years old was tracked for 5 years. 1200 of
these people had smoked for at least 5 years at some point in their lives. During the five
years, 250 of these people died, and of those who died 120 were among the smokers. Are
dying and having been a smoker independent for this sample?
9. Sometimes dependence (lack of independence) suggests a link among the variables,
but the link is indirect, that is, it is caused by a link to a different variable. A group of
10,000 US residents was selected at random and followed over the last 4 years. In this
sample, 500 died during the 4 years. Of the 10,000, 800 had seen the landing of Apollo
11 on the moon when it occurred (in 1969), and of those 800 who saw the first lunar
landing 58 had died. (a) Show that watching the first lunar landing (live) and dying are
not independent in this sample. (b) Are those who watched the first lunar landing more
or less likely to have died than the rest of the sample? (c) Would you expect the 800 who
watched the landing to be younger or older than the rest of the sample? (d) What other
characteristics of the group of 800 who watched the Apollo 11 landing might be different
than those of the rest of the sample?

B. Random Variables.

Events can involve the value of a variable. This description is useful when values are
combined to yield a compound variable, when assessing the sizes of risks and the expected
results of taking a risk, and when describing the behavior of measures such as statistical
measures.

Example 1. Suppose that a 6-sided die is rolled and the number that appears on its
top face is recorded. This represents a random variable whose value is one of 1, 2, 3, 4, 5,
and 6. When the die is fair the probability of each number appearing is 1/6.

Example 2. Suppose that two fair 6-sided dice are rolled and the sum of the two
numbers on the top faces is recorded. This represents a random variable whose value is
one of 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12.

A variable X which can take on different values with assigned probabilities is a
random variable.

We will consider two types of random variables, discrete and continuous.

A discrete random variable can assume discrete values, v1 , v2 , v3 , . . . , vn . It is also
possible that there are infinitely many discrete values.
A continuous random variable can assume values in some interval or collection of
intervals.

Example 3. Let X represent the sum of the values displayed on two fair 6-sided dice.
This is a discrete random variable. Calculate the probabilities of the possible values.
Solution: There is 1 combination that yields 2, namely 1 on each of the dice. There are 2
combinations that yield 3, namely 1 on the first die and 2 on the second or 2 on the first
die and 1 on the second. There are 3 combinations that yield 4, namely 1 on the first die
and 3 on the second or 2 on the first die and 2 on the second or 3 on the first die and 1
on the second. Similarly there are 4 combinations that yield 5, 5 combinations that yield
6, 6 combinations that yield 7, 5 combinations that yield 8, 4 combinations that yield 9, 3
combinations that yield 10, 2 combinations that yield 11, and 1 combination that yields
12. The probability of any one combination is (1/6) × (1/6) = 1/36. The probabilities are
therefore
\[ P(X = 2) = \frac{1}{36} = P(X = 12), \quad P(X = 3) = \frac{2}{36} = P(X = 11), \quad P(X = 4) = \frac{3}{36} = P(X = 10), \]
\[ P(X = 5) = P(X = 9) = \frac{4}{36}, \quad P(X = 6) = P(X = 8) = \frac{5}{36}, \quad P(X = 7) = \frac{6}{36}. \]
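
The counting in this solution can be reproduced by listing all 36 combinations. The Python sketch below is one possible way to do so (the function name is an illustrative choice); it uses exact fractions, so the probabilities appear in lowest terms, for example 2/36 is shown as 1/18.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_sum_distribution(sides=6):
    """Map each possible sum of two fair dice to its exact probability."""
    faces = range(1, sides + 1)
    # Count how many of the sides * sides equally likely combinations give each sum.
    counts = Counter(first + second for first, second in product(faces, repeat=2))
    total = sides * sides
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}


distribution = two_dice_sum_distribution()
for value, probability in distribution.items():
    print(value, probability)
```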

Example 4. Suppose that a state has 5 million citizens of voting age. Let X denote
the number of people who vote in a particular election. Suppose that the probability that
one person votes (in this state) is 0.6. Discuss the probability that X = 3 million.
Solution: The number of people who vote in a given election is indeed a discrete random
variable. Any particular combination of 3 million voters and 2 million non-voters has the
probability
\[ \mathrm{pr} = 0.6^{3{,}000{,}000} \times 0.4^{2{,}000{,}000}. \]

Because of the high power, this number is difficult to approximate in decimal form (it is
also very small, compared to 0.6 and 0.4). For the combinations with exactly 3 million
voters and 2 million non-voters we need to count the combinations that divide the 5 million
citizens of voting age into two groups of this size. There is clearly a very large number of
such combinations, but we have not examined any technique for calculating this number.
Conclusions: this question, even though reasonable, is not within our means of calculation
at this time.

Example 5. Suppose that T represents the duration of a phone call. This is a
random variable in that the durations of phone calls vary and there is some probability for
a call reaching a particular duration. For instance, there is some chance that a call lasts
1.5 minutes or less. Since the duration can, in theory, be any positive number, this is a
continuous random variable.

Example 6. Suppose a number is equally likely to fall at any value between 0 and 1.
What is the probability that this number is less than 0.76?
Solution: Let X denote the number. Since the proportion of numbers between 0 and 0.76
(the length of the interval [0, 0.76]) to numbers between 0 and 1 (the interval [0, 1]) is 0.76,
P(X ≤ 0.76) = 0.76.

Example 7. Suppose that 1000 LEDs are examined, and that there is a probability of
0.003 that an LED is defective. Suppose that whether one diode is defective is independent
of the condition of any other diode. Let X be the number of defective LEDs. Calculate
the probabilities that X = 0 and that X = 1.
Solution: For X = 0 all LEDs have to be good. The probability of one diode being good is
1 − 0.003 = 0.997. So \(P(X = 0) = 0.997^{1000} \approx 0.049563\). For X = 1 one LED is defective
and all remaining 999 LEDs have to be good. There are 1000 ways for one diode to be
defective with the rest being good, because any choice of the one defective diode (from the
1000) gives one of these combinations. So \(P(X = 1) = 1000 \times 0.003 \times 0.997^{999} \approx 0.149137\).
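
The two probabilities in this solution can be computed directly. The following Python sketch is a minimal illustration with function names chosen here, assuming independent items that each have the same probability of being defective.

```python
def prob_no_defects(p_defective, count):
    """All `count` independent items are good."""
    return (1 - p_defective) ** count


def prob_exactly_one_defect(p_defective, count):
    """One item is defective and the other count - 1 are good."""
    # There are `count` choices for which item is the defective one.
    return count * p_defective * (1 - p_defective) ** (count - 1)


p_zero = prob_no_defects(0.003, 1000)
p_one = prob_exactly_one_defect(0.003, 1000)
print(round(p_zero, 6), round(p_one, 6))
```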

EXERCISES.
1. Write a list of the combinations of two dice that give a total of 7.
2. Suppose that two fair dice are tossed (resulting in two integers valued between 1
and 6). Calculate the probability that the number on the first die is strictly greater than
the number on the second.
3. An experiment consists of finding a person wearing shoes and checking the person’s
shoe size. Is the resulting random variable discrete or continuous? Explain.
4. An experiment consists of measuring the length of a person’s foot. Is the resulting
random variable discrete or continuous? Explain.
5. Suppose that 5 batteries are connected together to run a flashlight. If at least 3
batteries work well, then the flashlight functions properly. Suppose that the probability
that one battery works is 0.95. What is the probability that the flashlight functions
properly?
6. Suppose that 1000 LEDs are examined. Consider the event that exactly 2 LEDs
are defective. Explain why this can happen in 1000 × 999/2 ways. That is, suppose that
we have 1000 LEDs in a row (so they are ordered) and explain why there are 499, 500
different combinations that have exactly two defective LEDs.
7. Suppose that 1000 LEDs are examined, and that there is a probability of 0.003
that an LED is defective. Suppose that whether one diode is defective is independent of
the condition of any other diode. Let X be the number of defective LEDs. Calculate
the probability that X = 2. Hint: a previous problem shows that there are 499,500
combinations with exactly two defective LEDs.
8. Suppose a number is equally likely to fall at any value between 0 and 4. What is
the probability that this number is less than 2? 1.4? 3.1?
9. In a survey respondents were asked whether they agree or disagree with each of a
list of statements. The results were thus either “agree”, “disagree”, or no answer. Suppose
that 90 percent of the respondents answered a question, and that twice as many respondents
agreed with the statement (compared with disagreeing with it). What percentage of
respondents agreed with the statement? What proportion disagreed with the statement?
10. A survey regarding the preferred color for beverages received 770 responses. 300
preferred green beverages, 220 preferred red beverages, and 100 preferred brown beverages.
What is the probability that a person from the survey prefers green beverages? brown
beverages? red beverages? Do some of the people surveyed prefer some other color?

C. Average and Expected Value.

The average of a collection of numbers is the sum of the values divided by the number
of items in the collection. Probability has a different point of view but arrives at a similar
formulation.
In this section we will define and calculate the expected value only for random variables
that achieve a finite number of values. A definition for variables with continuous values
will be given much later, once the mathematical machinery is in place.

Definition. Suppose that a (random) variable X can have values \(v_1, v_2, \dots, v_n\) and that
the probability of obtaining the value \(v_i\) is \(p_i\). Then the expected value of X is

\[ E(X) = p_1 v_1 + p_2 v_2 + \dots + p_n v_n. \]

The interpretation of the expected value is that each of the possible values of the variable
(\(v_1\) through \(v_n\)) gets the weight of its likelihood of appearance.

Example 1. A game is played in which a coin is tossed repeatedly. A player wins
a dollar when a head appears and loses a dollar when a tail appears. Suppose that a
particular coin used in this game is flipped 100 times and it comes up heads 48 times and
tails 52 times. What is the average amount gained or lost in each of the 100 games?
Solution: The player won a dollar 48 times, so the total gain was 48 dollars, and lost a dollar
52 times, so the total loss was 52 dollars. The net value of 100 games was 48 − 52 = −4
and so the average gain was −4/100 = −0.04.

Example 2. A game is played in which a coin is tossed repeatedly. A player wins
a dollar when a head appears and loses a dollar when a tail appears. Let X denote the
random variable representing the outcome of the game. That is, X = 1 if the player wins
and X = −1 when the player loses. Suppose that the probability of heads is 0.48 and of
tails is 0.52. What is the expected value of the game?
Solution: The value X = 1 has probability 0.48 and the value X = −1 has probability
0.52. Hence the expected value of the game is E(X) = 0.48 × 1 + 0.52 × (−1) = −0.04.

Example 3. A six sided die is cast, with the sides equally likely to appear. The
random variable X represents the number (1 through 6) that appears. Calculate the
expected value of X.
Solution:
\[ E(X) = \frac{1}{6}\cdot 1 + \frac{1}{6}\cdot 2 + \frac{1}{6}\cdot 3 + \frac{1}{6}\cdot 4 + \frac{1}{6}\cdot 5 + \frac{1}{6}\cdot 6 = \frac{7}{2}. \]
Discussion: Notice that if all values appear then the resulting average is (1 + 2 + 3 +
4 + 5 + 6)/6 = 3.5. Thus when all the outcomes are equally likely, the expected value is
the sum of the results divided by the number of results.
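
The definition of the expected value translates directly into a short computation. The Python sketch below is illustrative (the function name is an assumption, not the text's notation); it uses exact fractions and is applied to the coin game of Example 2 and the die of Example 3.

```python
from fractions import Fraction


def expected_value(values, probabilities):
    """Weighted sum p1*v1 + p2*v2 + ... + pn*vn for a finite list of values."""
    if sum(probabilities) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * v for p, v in zip(probabilities, values))


# Example 2: win 1 with probability 0.48, lose 1 with probability 0.52.
coin_game = expected_value([1, -1], [Fraction(48, 100), Fraction(52, 100)])

# Example 3: a fair six-sided die.
fair_die = expected_value(range(1, 7), [Fraction(1, 6)] * 6)
print(coin_game, fair_die)
```

Exact fractions are used so that the check on the probabilities is not affected by rounding; the results print in lowest terms.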

Example 4. Suppose that 1000 light emitting diodes are examined, and that the
probability of one diode having a fault is 0.003. Let the random variable X represent the
number (0 through 1000) of faulty diodes. Calculate the expected value of X.
Solution: The formal calculation is difficult, because one would have to calculate the number
of combinations that give X = k for each integer k. Then one would have to calculate
the probability of each such combination (we actually know that this probability is \(0.003^k \cdot 0.997^{1000-k}\)).
Finally one would have to multiply the number of combinations by the probability and by
the resulting value of X, and add all these numbers. It is much easier to observe that we
expect 3 out of every batch of 1000 diodes to have faults, that is E(X) = 0.003 × 1000 = 3.
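
With a computer the formal calculation is feasible. The Python sketch below is only an illustration: it assumes that the number of combinations with exactly k faulty diodes is the binomial coefficient, a count the text has not developed and which is taken here from Python's `math.comb`. The function name is also an assumption.

```python
from math import comb


def expected_successes_by_direct_sum(trials, p):
    """Sum k * P(X = k) over k = 0 .. trials for independent trials."""
    total = 0.0
    for k in range(trials + 1):
        # comb(trials, k) counts the combinations with exactly k successes;
        # each such combination has probability p**k * (1 - p)**(trials - k).
        prob_k = comb(trials, k) * p**k * (1 - p) ** (trials - k)
        total += k * prob_k
    return total


faulty_diodes = expected_successes_by_direct_sum(1000, 0.003)
print(faulty_diodes)
```

The result should agree, up to rounding error, with the shortcut 0.003 × 1000.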

Fact. Assume that N independent experiments are conducted. Suppose the probability
of a certain outcome is p (the usual language is to call this outcome a “success”). Let X
denote the number of successes among the experiments.
Then E(X) = N · p.

Example 5. Suppose that a lender gives out 1000 loans, and that for each loan the
probability of default is 3 percent. Let X denote the number of loans which end in default.
Calculate the expected number of defaults.
Solution: There are N = 1000 loans and the probability of default is p = 0.03, so the
expected number of defaults is E(X) = 1000 × 0.03 = 30.

Example 6. Suppose that a bank gives out loans that will last 10 years, and that
the probability of default during the 10 years is 3 percent. Suppose that in case of default
the bank collects none of the money lent. Finally assume the loans are all for the same
amount and that the interest rate on the loans is 4 percent. Calculate the bank’s expected
return on the loans.
Solution: When a loan is not repaid, the amount returned is, by assumption, 0. This
happens with a probability of 0.03. When a loan is repaid, the amount returned is \(1.04^{10} \approx 1.480244\)
times the amount of the loan. This happens with a probability of 1 − 0.03 = 0.97.
So in proportion to the original amount of the loans the expected return is

0 × 0.03 + 1.480244 × 0.97 ≈ 1.435837 .

It may be interesting to note that the resulting annual interest rate obtained by the bank
is \(1.435837^{0.1} - 1 \approx 0.0368\).
Another way to think about this example is to consider a repeated “experiment”
in which a loan of 100 dollars is given out. Out of 1000 repetitions, 970 will result in
a return of 148.02 dollars and 30 will result in no returns. Hence the total return is
970 × 148.02 = 143,579.40 (dollars). The bank originally lent out 1000 × 100 = 100,000
(dollars) and so its return rate is 43,579.40/100,000 = 0.435794 in 10 years.
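
The bank's expected return can be sketched in Python under the same assumptions as the example: a defaulted loan returns nothing, and a repaid loan returns the amount lent with interest compounded annually. The function names are illustrative.

```python
def expected_return_factor(annual_rate, years, p_default):
    """Expected amount returned per dollar lent."""
    repaid_factor = (1 + annual_rate) ** years
    # Default returns 0 with probability p_default; otherwise the loan is repaid in full.
    return 0 * p_default + repaid_factor * (1 - p_default)


def effective_annual_rate(total_factor, years):
    """Annual rate that compounds to `total_factor` over `years` years."""
    return total_factor ** (1 / years) - 1


loan_factor = expected_return_factor(0.04, 10, 0.03)
bank_rate = effective_annual_rate(loan_factor, 10)
print(round(loan_factor, 6), round(bank_rate, 4))
```

Changing the default probability in the first call shows how quickly defaults erode the 4 percent nominal rate.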

Example 7. Suppose that a state has 5 million citizens of voting age. Let X denote
the number of people who vote in a particular election. Suppose that the probability that
one person votes (in this state) is 0.6. Calculate the average number of voters in the
election.
Solution: Since it is assumed that the probability of one person voting is independent of
the probability of any other person voting, we can view the number of voters as the number
of successes in 5 million repeated experiments. Hence \(E(X) = 5 \times 10^6 \cdot 0.6 = 3 \times 10^6\).

As the example regarding elections makes clear, the expected value may be an outcome
with a low probability of occurrence. To understand more about the distribution of values
one would next need to know how much the actual values differ from the average. The
relevant quantity is called the variance and is defined below for the curious.

Definition. For a random variable X with expected value E(X) = µ, the variance of X
is \(V(X) = E((X - \mu)^2)\). The standard deviation of X is the square root of the variance.
The usual notation is \(\sigma = \sqrt{V}\).

Example 8. A six sided die is cast, with the sides equally likely to appear. The
random variable X represents the number (1 through 6) that appears. Calculate the
variance of X and its standard deviation.
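Solution: We have found that E(X) = 3.5. The variance, V, and standard deviation, σ, are

\[ V(X) = \frac{1}{6}(1-3.5)^2 + \frac{1}{6}(2-3.5)^2 + \frac{1}{6}(3-3.5)^2 + \frac{1}{6}(4-3.5)^2 + \frac{1}{6}(5-3.5)^2 + \frac{1}{6}(6-3.5)^2, \]
\[ V(X) = \frac{2\,(6.25 + 2.25 + 0.25)}{6} = \frac{35}{12} \approx 2.916667, \qquad \sigma = \sqrt{35/12} \approx 1.707825. \]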
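
The arithmetic in this example can be checked by computing the variance exactly from its definition. The Python sketch below is illustrative (the function name is an assumption) and uses exact fractions for the variance before taking the square root.

```python
from fractions import Fraction
from math import sqrt


def variance(values, probabilities):
    """E((X - mu)^2), where mu = E(X), for a finite list of values."""
    mean = sum(p * v for p, v in zip(probabilities, values))
    return sum(p * (v - mean) ** 2 for p, v in zip(probabilities, values))


die_values = range(1, 7)
die_probabilities = [Fraction(1, 6)] * 6
die_variance = variance(die_values, die_probabilities)  # exact fraction
die_std = sqrt(die_variance)  # converted to a decimal approximation
print(die_variance, die_std)
```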

In experiments and observations one usually obtains a sample, meaning a collection
of values from a larger population. One can then define the sample average and calculate
other sample statistics. Our discussion is aimed at being able to describe settings
involving probability (but does not include sample statistics).

EXERCISES.
1. Let X be the sum of the numbers appearing on two fair dice. Calculate the expected
value for this random variable.
2. Suppose that the number of customers looking for a particular item at a store (per
day) is between 5 and 10. Denote this number by X, and assume that the probability
for different values of X is given in the table below. Calculate the expected value for the
number of customers.

X= 5 6 7 8 9 10
P = 0.68 0.23 0.061 0.022 0.0046 0.0024

3. A county somewhere in the western US has a rural population of 50,000 voters
and an urban population of 60,000 voters. The probability that a rural voter chooses
candidate A is 0.53, and the probability that a rural voter chooses candidate B is 0.47.
The probability that an urban voter chooses candidate A is 0.48, and the probability that
an urban voter chooses candidate B is 0.52. What is the expected number of votes for each
candidate?
4. A county somewhere in the southern US has a rural population of 50,000 voters
and an urban population of 80,000 voters. The probability that a rural voter chooses
candidate A is 0.53, and the probability that a rural voter chooses candidate B is 0.47.
The probability that an urban voter chooses candidate A is 0.48, and the probability that
an urban voter chooses candidate B is 0.52. What is the expected number of votes for each
candidate?
5. Suppose that during a certain period of time an investor invests 1 million dollars
in each of 10 ventures. Suppose that the probability of failure of each venture is 0.3. How
much must the investor make on the successful ventures, on average, to break even? How
much must the investor make on the successful ventures, on average, to earn a total of 6
percent return on her or his total investment?
6. Suppose that a 6-sided die is not fair, and that the probability of a 5 or a 2
appearing is 0.2 while the probabilities for the remaining 4 values are equal. Calculate the
average value of the outcome of one toss.
7. Let X represent the sum of the values displayed on two fair 6-sided dice. Calculate
the average value of X. Suggestion: the probabilities were calculated in the previous
section.
8. Suppose that two 6-sided dice are not fair, and that for each of the two the
probability of a 5 or a 2 appearing is 0.2 while the probabilities for the remaining 4 values
are equal. Let X represent the sum of the values displayed on the two dice. Calculate
the probability of each value of X. Suggestion: this calculation involves calculating the
probabilities of the 36 possible combinations. A table or spreadsheet might be useful.
9. As above, suppose that two 6-sided dice are not fair, and that for each of the two
the probability of a 5 or a 2 appearing is 0.2 while the probabilities for the remaining 4
values are equal. Let X represent the sum of the values displayed on the two dice. Calculate
the average value of X. Suggestion: the probabilities are calculated in a previous problem.
10. Suppose the probability of an alligator being female is p = 0.25+0.01(T −50) where
T is the average daily temperature in degrees Fahrenheit (this holds for 60 < T < 100).
When the (average daily) temperature is 80 degrees, what is the expected number of female
alligators in a sample of 100 alligators? In a sample of 6 alligators?
11. As above, suppose the probability of an alligator being female is p = 0.25 +
0.01(T − 50) where T is the average daily temperature in degrees Fahrenheit. Suppose that
out of 6 alligators 4 are female. How likely is this outcome if the temperature is 80 degrees?
If the temperature is 90 degrees? Suggestion: given a particular temperature, the value
of p is fixed. You may take it as a fact that there are 15 combinations of 6 alligators that
include exactly 4 females and 2 males. Hence you can calculate the probability of this
outcome.
12. As above, suppose the probability of an alligator being female is p = 0.25 +
0.01(T − 50) where T is the average daily temperature in degrees Fahrenheit. Suppose that
out of 6 alligators 3 are female. How likely is this outcome if the temperature is 80 degrees?
If the temperature is 90 degrees? Suggestion: given a particular temperature, the value
of p is fixed. You may take it as a fact that there are 20 combinations of 6 alligators that
include exactly 3 females and 3 males. Hence you can calculate the probability of this
outcome.

D. Cumulative Distribution.

The notion of an accumulated probability is useful in characterizing the spatial or
population spread of a quantity such as income or athletic ability.

Example 1. Suppose that x represents a proportion of the population. For instance,

x = 0.4 represents 40 percent of the people. One quantity of interest is income and a
good way of describing the distribution of income is the accumulated share, z = f (x). For
instance, if the 40 percent of the population that receive the lowest incomes have, in total,
10 percent of the income in the country, then z = 0.1 for x = 0.4, or f (0.4) = 0.1.

Example 2. Suppose that x represents a proportion of the population, and z represents
the accumulated share of income. Which relation between z and x represents a
completely even distribution of income?
Solution: Suppose that income is completely evenly distributed. Then the likelihood that
any particular dollar of income is given to any particular individual is the same for all
dollars and individuals. Hence the proportion of income, z, given to a certain proportion
of the population, x, is the same proportion. That is z = x (with the domain for x being
the interval [0, 1]).

Example 3. Suppose that x represents a proportion of the population, and z represents
the accumulated share of income. Suppose that for one country $z = f(x) = x^2$
and for a second country $z = g(x) = x^3$. Which country has a more even distribution of
income?
Solution: Since the relevant values of x are 0 < x < 1, one has $x^3 < x^2 < x$. Thus the
country with distribution f has a more even distribution of income than the country with
distribution g (though both distributions are not completely even). The reader is invited
to graph f and g and label the horizontal axis “proportion of population” and the vertical
axis “proportion of income”.
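The comparison in Example 3 can also be checked numerically. The following is a minimal sketch, assuming an arbitrary grid of sample proportions (the grid and the names `f_is_more_even` and `grid` are illustrative, not from the text); it tabulates both accumulated shares next to the completely even distribution z = x.

```python
# Accumulated income share for the two countries in Example 3.
def f(x):
    return x ** 2

def g(x):
    return x ** 3

# Sample proportions strictly between 0 and 1 (an arbitrary illustrative grid).
grid = [k / 10 for k in range(1, 10)]

for x in grid:
    print(f"x = {x:.1f}   even: {x:.3f}   f(x): {f(x):.3f}   g(x): {g(x):.3f}")

# On this grid g(x) < f(x) < x, so f stays closer to the even distribution z = x.
f_is_more_even = all(g(x) < f(x) < x for x in grid)
print(f_is_more_even)
```

A grid can only suggest the ordering; the inequality for every x between 0 and 1 is the algebraic statement in the solution.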

A stochastic variable X has the cumulative distribution function F , when

F (t) = P (X ≤ t).

Example 4. Suppose that X represents the duration of a phone call, in minutes.

Then a cumulative distribution F(t) should be zero for negative t, and the probability
of a call lasting t minutes or fewer should increase to 1 as t increases. Such cumulative
distributions are given, for t ≥ 0, by $F(t) = 1 - 2^{-t}$ and by $G(t) = 1 - 3^{-t}$.

Example 5. Suppose that X represents the outcome of a flip of a (fair) coin, by

setting X = −1 if the coin lands with tails showing and X = 1 if the coin lands with
heads.
Then the cumulative distribution function is
\[
F(t) = \begin{cases} 0, & t < -1, \\ 0.5, & -1 \le t < 1, \\ 1, & 1 \le t. \end{cases}
\]

Example 6. Suppose that X represents a variable that is uniformly distributed

between 0 and 3.
Then the cumulative distribution function is
\[
F(t) = \begin{cases} 0, & t < 0, \\ \tfrac{1}{3}\,t, & 0 \le t < 3, \\ 1, & 3 \le t. \end{cases}
\]
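The two piecewise formulas in Examples 5 and 6 translate directly into code. This is a small sketch; the function names `coin_cdf` and `uniform_cdf` are chosen here for illustration and do not come from the text.

```python
def coin_cdf(t):
    """F(t) = P(X <= t) for the coin of Example 5 (X = -1 or X = 1, each with probability 0.5)."""
    if t < -1:
        return 0.0
    if t < 1:
        return 0.5
    return 1.0

def uniform_cdf(t):
    """F(t) = P(X <= t) for X uniformly distributed between 0 and 3 (Example 6)."""
    if t < 0:
        return 0.0
    if t < 3:
        return t / 3
    return 1.0

for t in (-2, -1, 0, 1, 1.5, 3, 4):
    print(t, coin_cdf(t), uniform_cdf(t))
```

Each function checks the cases in increasing order of t, which mirrors the way the piecewise definitions are read from top to bottom.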

EXERCISES.
1. Suppose that X represents the outcome of the roll of a fair die. Calculate the
cumulative distribution function for X.
2. Suppose that x represents a proportion of the population, and z represents the
accumulated share of income. Suppose that for one country $z = f(x) = x\sqrt{x}$ and for a
second country $z = g(x) = x^2$. Which country has a more even distribution of income?
3. Suppose that x represents a proportion of the population, and z represents the
accumulated share of income. Suppose that for one country $z = f(x) = 0.5\,x\,(1 + x^2)$
and for a second country $z = g(x) = x^3$. Which country has a more even distribution of
income? Suggestion: think about comparing the two functions at some value of x between
0 and 1.
4. In the phone call example, which of the two cumulative distributions corresponds
to phone calls that last longer (F or G)?
5. Suppose that X represents a variable that is uniformly distributed between 2 and
4. Calculate the cumulative distribution function for X.
6. Suppose that X represents a variable that is uniformly distributed between 1 and
4. Calculate the cumulative distribution function for X.
7. Can $F(t) = 2 - t^3$ with 0 ≤ t ≤ 1 be a cumulative distribution function?
8. Can $F(t) = 1 - e^{-t^3}$ with 0 ≤ t be a cumulative distribution function?
9. Can $F(t) = 1 + e^{-t^2}$ with 0 ≤ t be a cumulative distribution function?
10. Can $F(t) = e^{-t^2}$ with 0 ≤ t be a cumulative distribution function?
CALCULUS
8. Approximation

Some important quantities are defined as approximations. Of particular importance

in calculus are rates of change and accumulated values.
For example, a manufacturer will produce additional units until the rate of increase
in the cost of production equals the price that the producer can get for the additional
unit. This rate is obtained from the cost function using an approximation process called
differentiation. Another example is that of consumer surplus, which is the accumulation
of the value acquired by each consumer as a result of participating in the market. The
consumer surplus is calculated from the demand function and the market price using an
approximation process called integration.

A. Improved Approximations.

Approximation plays a crucial role in defining certain functions. That we can system-
atically improve an approximation is the main idea in Calculus.
We illustrate the use of approximation with a few examples.
Example 1. We approximate $\sqrt{2}$ with decimal numbers. By definition, $x = \sqrt{2}$ when
$x \ge 0$ and $x^2 = 2$.
We calculate that $1.4^2 = 1.96$ and $1.5^2 = 2.25$. Hence $1.4 < \sqrt{2} < 1.5$. To improve
this approximation, the next guess should be between 1.4 and 1.5 and it makes sense to
guess closer to 1.4. The table below lists a succession of guesses that get closer to $\sqrt{2}$.

x        x^2
1.42     2.0164
1.413    1.996569
1.414    1.999396
1.415    2.002225
1.4147   2.00137609
1.4146   2.00109316
1.4144   2.00052736
1.4142   1.99996164
1.4143   2.00024449

Notice that the guesses were chosen to get closer to $\sqrt{2}$ until it became clear that a change
in the opposite direction was needed. For instance, 1.42 is too big, so 1.413 was used. Since
1.413 was too small, the guess was increased to 1.414 which was still too small, so the next
guess was 1.415. Since 1.415 turned out to be too large, the next guess was smaller.
The conclusion of this example is that $1.4142 < \sqrt{2} < 1.4143$. If we want only four
significant digits, then $\sqrt{2} \approx 1.414$ will do. If we want five significant digits, then we would
have to check whether 1.4142 is more accurate (that is, closer to $\sqrt{2}$) than 1.4143. If we
want greater accuracy we would have to guess some more.
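The guessing in Example 1 can be made systematic. The sketch below is one possible scheme, not the procedure used in the text (where guesses are chosen by eye): it fixes one decimal digit at a time, raising the newest digit as long as the square stays at or below 2. Whole-number arithmetic is used so that no rounding occurs; the name `sqrt2_lower_bound` is hypothetical.

```python
def sqrt2_lower_bound(digits):
    """Largest decimal with `digits` places whose square is at most 2.

    Returns (value, scale); the decimal itself is value / scale.
    """
    scale = 1
    value = 1                      # 1**2 <= 2 < 2**2, so the integer part is 1
    for _ in range(digits):
        scale *= 10
        value *= 10                # open a new decimal place, starting at digit 0
        # Raise the newest digit while the square stays at or below 2.
        while (value + 1) ** 2 <= 2 * scale ** 2:
            value += 1
    return value, scale

value, scale = sqrt2_lower_bound(4)
print(value / scale, (value + 1) / scale)   # lower and upper ends of the bracket
```

With four decimal places this reproduces the bracket found in the example, and asking for more digits narrows the bracket further.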
At this point you might be wondering what algorithm your calculator or computer
uses to produce the value it gives when you ask for $\sqrt{2}$. Most manufacturers don’t describe
their algorithms in their manuals, but rest assured that the calculator does not have all
possible results stored in memory.

Example 2. We approximate a solution x to $x^6 + 3x^4 - x^3 + x^2 + 4x - 5 = 0$ with
decimal numbers until the error is less than 0.001.
We will calculate values for $p(x) = x^6 + 3x^4 - x^3 + x^2 + 4x - 5$ with different values of
x. When we find a value of x for which p(x) < 0 we will change x until we get a positive
value for p(x). It is easy to calculate that p(0) = −5 and p(1) = 3 so we will start with a
guess between 0 and 1. The table below gives the values computed:

x       p(x)
0.5     −2.671875
0.8     −0.181056
0.9     1.180741
0.83    0.1878030
0.82    0.06140409
0.81    −0.0615098
0.815   −0.0004812
0.816   0.0118269

We conclude that there is a solution z to $z^6 + 3z^4 - z^3 + z^2 + 4z - 5 = 0$ with
0.815 < z < 0.816. Of course there may be other solutions as well, and since this polynomial
equation involves a polynomial whose degree is even, there must be at least one more (real)
solution.
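The search in Example 2 can be automated by always guessing the midpoint of the current interval. This is a sketch of that idea (usually called bisection), offered as one systematic alternative to the hand-picked guesses in the table; the helper name `bisect` and its parameters are illustrative.

```python
def p(x):
    return x**6 + 3*x**4 - x**3 + x**2 + 4*x - 5

def bisect(func, low, high, tolerance):
    """Assumes func(low) < 0 <= func(high); shrinks [low, high] until it is narrower than tolerance."""
    while high - low >= tolerance:
        mid = (low + high) / 2
        if func(mid) < 0:
            low = mid      # the sign change lies to the right of mid
        else:
            high = mid     # the sign change lies at or to the left of mid
    return low, high

low, high = bisect(p, 0.0, 1.0, 0.001)
print(low, high)
```

Because p changes sign between the two ends of every interval produced, a solution always remains inside, and each step halves the possible error.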

Example 3. We approximate $3^{\sqrt{2}}$ with decimal numbers to two significant digits.
First, $1.4 < \sqrt{2} < 1.5$ so $3^{1.4} < 3^{\sqrt{2}} < 3^{1.5}$.
To approximate $3^{1.4}$ write this as $3^{7/5}$ which is the 5th root of $3^7 = 2187$. We
calculate that $4.65^5 = 2174.0261540625$ (and $4.66^5 = 2197.5035404576$) so $4.65 < 3^{1.4}$. To
approximate $3^{1.5}$ write this as $3^{3/2}$ which is the square root of $3^3 = 27$. Now calculate that
$5.197^2 = 27.008809$ (and $5.196^2 = 26.998416$) so $3^{1.5} < 5.197$.
We have obtained our first approximation: $4.65 < 3^{1.4} < 3^{\sqrt{2}} < 3^{1.5} < 5.197$.
To get a better approximation we need a better approximation of $\sqrt{2}$. We’ll use
$1.414 < \sqrt{2} < 1.4143$ to get $3^{1.414} < 3^{\sqrt{2}} < 3^{1.4143}$.
To approximate $3^{1.414}$ write this as $3^{707/500}$ which is the 500th root of $3^{707}$. Guessing
and checking for this value is even harder, but it is possible and one can check that
$4.727 < 3^{1.414}$. To approximate $3^{1.4143}$ write this as $3^{14143/10000}$ which is the 10000th root
of $3^{14143}$. These numbers would hurt our calculator but it is true that $3^{1.4143} < 4.730$.
This approximation is good to 2 digits (in fact it turns out that 4.73 is correct to three
digits). So we conclude that $3^{\sqrt{2}} \approx 4.7$ to two significant digits.
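The four inequalities used in Example 3 involve numbers too large for most calculators, but they can be checked with whole-number arithmetic, which Python carries out exactly for integers of any size. The sketch below clears the decimals and the fractional exponents from each inequality first; the variable names are illustrative.

```python
# Each comparison clears denominators so that only whole numbers are involved.

# 4.65 < 3**(7/5)           is the same as  465**5 < 3**7 * 100**5
lower_first = 465**5 < 3**7 * 100**5

# 3**(3/2) < 5.197          is the same as  3**3 * 1000**2 < 5197**2
upper_first = 3**3 * 1000**2 < 5197**2

# 4.727 < 3**(707/500)      is the same as  4727**500 < 3**707 * 1000**500
lower_second = 4727**500 < 3**707 * 1000**500

# 3**(14143/10000) < 4.730  is the same as  3**14143 * 1000**10000 < 4730**10000
upper_second = 3**14143 * 1000**10000 < 4730**10000

print(lower_first, upper_first, lower_second, upper_second)
```

Raising both sides of an inequality between positive numbers to the same positive whole-number power keeps the inequality, which is why each rewritten comparison is equivalent to the original one.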
It is amazing how much approximation our calculators will do for us. The fact is, how-
ever, that, as we will see, only the approximation to one exponential function is necessary.
It is nonetheless intriguing that the approximation can be done at all!

Example 4. Approximate a solution to $x^x = 3$ with an error of no more than 0.03.
We do this by guessing and then increasing our guess if the result is too low and
decreasing it if the result is too high. Since $1^1 = 1$ and $2^2 = 4$, our first guess will be
between 1 and 2. The calculations were carried out using a calculator that is accurate to
12 digits (but the methods used by the calculator remain mysterious).

x      x^x
1.5    1.837
1.8    2.881
1.9    3.386
1.87   3.224
1.84   3.071
1.82   2.974

This is sufficient for our desired accuracy because we know that the number x with
$x^x = 3$ satisfies 1.82 < x < 1.84.
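The table in Example 4 can be reproduced with a few lines of code. This is only a sketch: it relies on the computer's built-in power operation, whose inner workings are as hidden from us as the calculator's.

```python
# The guesses used in Example 4, in the order they were tried.
guesses = [1.5, 1.8, 1.9, 1.87, 1.84, 1.82]

for x in guesses:
    print(f"{x:<5} {x ** x:.3f}")

# The final bracket: the value at 1.82 is below 3 and the value at 1.84 is above 3.
bracket_holds = 1.82 ** 1.82 < 3 < 1.84 ** 1.84
print(bracket_holds)
```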

While the examples above are meant mostly to illustrate what we mean by an ap-
proximation, it is an important observation that in all these cases the approximations can
be improved to reach any accuracy that we desire.

EXERCISES.
1. Approximate $\sqrt{1.6}$ by finding two numbers a and b so that $a^2 \le 1.6$ and $1.6 \le b^2$
and the difference b − a is less than 0.04.
2. Approximate a solution z to $z^6 + 3z^4 - z^3 + z^2 + 4z - 5 = 0$ with decimal numbers
until the error is less than 0.01 and with the requirement −2 < z < −1. This is a different
solution than the one approximated in the example above.
3. Approximate $(x^2 - 4)/(x - 2)$ when x is close to 2 by using three different values
of x between 1.9 and 2.1.
4. Approximate $(x^2 - 9)/(x - 3)$ when x is close to 3 by using three different values
of x between 2.95 and 3.06.
5. Approximate $(x^2 - 4)/(x + 2)$ when x is close to −2 by using three different values
of x between −2.03 and −1.92.
6. Approximate a solution x to $p(x) = x^4 - x^3 + 4x - 7 = 0$ with decimal numbers
until the error is less than 0.01. Notice that p(1) = −3 and p(2) = 9. Also p(−1) < 0 and
p(−2) > 0, so there are at least two solutions.
7. Approximate $(x^3 - 8)/(x - 2)$ when x is close to 2 by using three different values
of x between 1.9 and 2.1.
8. Approximate $(x^4 - 16)/(x - 2)$ when x is close to 2 by using three different values
of x between 1.9 and 2.1.
9. Approximate $(x^2 - 4)/(x - 2)$ when x is close to 2 by using three different values
of x between 1.999 and 2.001.
10. Approximate a solution x to $p(x) = x^4 - x^3 + 4x - 7 = 0$ with decimal numbers
until the error is less than 0.001.
11. Approximate $5^{1/3}$ with decimal numbers until the error is less than 0.01.
12. Approximate $\sqrt{10}$ by finding two numbers a and b so that $a^2 \le 10$ and $10 \le b^2$
and the difference b − a is less than 0.003.

B. Limits: Exact Approximations.

A limit is a value that the values of a function approach. We will now consider this
statement more carefully, discussing the context, a slightly less vague definition, a definition
using graphs, a definition using numbers, and a formal definition.
Limits represent an amazingly powerful idea: the idea that an approximation can be
improved to the point of being exact. These are no longer approximations of a value, but
actually the value itself. We will soon see that the idea allows a quantitative description
of the way functions change.
We are concerned with the values of a function, say f , that depends on a variable, say
x. The action takes place as the variable is close to a number, say a.

Example 1. Set $f(x) = 1 + x^2$. Suppose that x is close in value to 2. So think of x,
numerically, as 1.999 or 2.000003 or 1.9999999998 or maybe even closer to 2 than these.
The question is, “what is f(x) getting close to?”
Intuitively, the answer is that f(x) is getting close to $1 + 2^2 = 5$. In this case, our
intuition is correct!
Notation: We write x → a to mean that the variable x gets arbitrarily close to the
value a, and we read this “x approaches a”. In this notation our previous example can be
summarized by writing
as $x \to 2$, $(1 + x^2) \to 5$.
Definition. An informal definition of a limit. The function f (x) has a limit value
L as x approaches a provided that when x is sufficiently close to a, f (x) is very close to
L.

Example 2. This example shows that the notion of a limit is required in some cases
– it is not just a matter of entering the value of the variable as in our previous example.
Consider
\[
f(x) = \frac{x^2 + 3x - 4}{x - 1} .
\]

If x = 1, then f (1) does not make sense – because 1 is not in the domain of f . Let us
explore this example numerically.

x f(x)
0.999 4.999
1.0003 5.0003
0.99999996 4.99999996
1.0000000001 5.0000000001
0.99999999999 4.99999999999
1.00000000000002 5.0000000000000
0.9999999999999998 5.0000000000000

Evidently, for x = 0.9999999999999998 and for x = 1.00000000000002 the calculator used

to compute the values can no longer tell the difference between f (x) and 5. This suggests
a definition for limits that uses the notion of precision.
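The rounding seen in the last rows of the table is a feature of the calculator, not of the function. As a sketch of this point, the code below repeats some of the computations with exact fractions from Python's standard `fractions` module (the choice of inputs follows the table; the variable names are illustrative).

```python
from fractions import Fraction

def f(x):
    return (x**2 + 3*x - 4) / (x - 1)

# Exact rational arithmetic avoids the rounding that limits a calculator.
inputs = [Fraction(s) for s in ("0.999", "1.0003", "0.99999996", "1.0000000001")]
values = [f(x) for x in inputs]

for x, value in zip(inputs, values):
    # Show how far f(x) is from 5; the gap shrinks as x gets closer to 1.
    print(float(x), float(value - 5))
```

In exact arithmetic the value of f(x) never equals 5 for these inputs; it only gets closer, and the gap between f(x) and 5 matches the gap between x and 1.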
Definition. A formal definition of a limit. This definition is good enough to solve
even more complicated mysteries about limits, but its use can be cumbersome and we will
not use this definition. The function f(x) has a limit value L as x approaches the
point a provided that for any precision $\varepsilon > 0$ there is a tolerance $\delta > 0$ so that whenever
x is within this tolerance of a (and x ≠ a) it follows that f(x) is indistinguishable from L up to
the precision $\varepsilon$. Symbolically, whenever $0 < |x - a| < \delta$ it follows that $|f(x) - L| < \varepsilon$.

Example 3. Another way of examining a function’s limit value is using the function’s
graph. Consider again the function

\[
f(x) = \frac{x^2 + 3x - 4}{x - 1} .
\]

Here is its graph:

[Figure: the graph of f.]

The graph shows that x = 1 is not in the domain of the function and it also shows
that when x is near 1 the value of f is near 5.

Example 4. Another way of examining a function’s limit value is algebraic manipulation.
When this approach is available, it is often the quickest and clearest. Consider
again the function
\[
f(x) = \frac{x^2 + 3x - 4}{x - 1} .
\]
When x ≠ 1 we have
\[
f(x) = \frac{x^2 + 3x - 4}{x - 1} = \frac{(x + 4)(x - 1)}{x - 1} = x + 4 \quad \text{for } x \neq 1 .
\]

The expression x + 4 is easy to evaluate at x = 1 and the limit value is 5.

Definition. An operational definition of a limit. This definition is the one we, in practice,
will use. The function f (x) has a limit value L as x approaches the value a provided that
using numerical evidence, graphical evidence, or an algebraic calculation we can conclude
that the value of f (x) is indistinguishable from L when x gets closer and closer to a.
Notation. Another notation for the limit puts the variable under the abbreviation “lim”
and acts on the function:
\[
\lim_{x \to 2} (x^2 - 3) = 1, \quad \text{or, for } f(x) = \frac{x^2 - 1}{x + 1}, \quad \lim_{x \to -1} f(x) = -2 .
\]

Definition. One-sided limit. We can examine the behavior of a function, f , when the
variable, x, approaches the value a only from one side. We write $x \to a^+$ to mean that x
gets very close to a, but only values x > a are considered, and we write $x \to a^-$ to mean
that x gets very close to a, but x < a.
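Example 5. Set f(x) = |x|/x. We examine this quantity as x approaches 0.
\[
\lim_{x \to 0^+} \frac{|x|}{x} = \lim_{x \to 0^+} \frac{x}{x} = \lim_{x \to 0^+} 1 = 1, \quad \text{and} \quad
\lim_{x \to 0^-} \frac{|x|}{x} = \lim_{x \to 0^-} \frac{-x}{x} = \lim_{x \to 0^-} (-1) = -1 .
\]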

Please note that we used the fact that for x > 0 we have |x| = x while for x < 0 we have
|x| = −x. In this example, the one-sided limits exist, but there is no limit value as x → 0
because no single value is reached.
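The two one-sided limits of Example 5 can be seen numerically by approaching 0 separately from each side. The sample points below (powers of 1/10) are an arbitrary choice made for this sketch.

```python
def f(x):
    return abs(x) / x

# x = 0.1, 0.01, ... approaches 0 from above; x = -0.1, -0.01, ... from below.
right = [f(10.0 ** -k) for k in range(1, 6)]
left = [f(-(10.0 ** -k)) for k in range(1, 6)]

print(right)
print(left)
```

The values from the right and from the left settle on different numbers, which is the numerical sign that no single limit value exists as x → 0.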
Summary. The limit value of a function is determined using a process in which we
analyze the values of the function as the variable approaches a given number.

EXERCISES.
1. Use a table of values to approximate the limit of $f(x) = (x^3 - x)/(x - 1)$ as x → 1.
2. Use a graph to approximate the limit of $f(x) = (x^3 - x)/(x - 1)$ as x approaches 1.
3. Show algebraically that for x ≠ 1, $(x^3 - x)/(x - 1) = x^2 + x$.
4. Use a table of values to approximate the limit of $f(x) = (x^3 - 4x)/(x - 2)$ as x → 2.
5. Calculate $(x^3 - 4x)/(x - 2)$ algebraically for x ≠ 2.
6. Does the limit of f(x) = |x| exist as x approaches 0?
7. Does the limit of $f(x) = x^2/|x|$ exist as x approaches 0?
8. Does the limit of f(x) = 1/x exist as x approaches 0 from above (that is $x \to 0^+$)?
9. Show algebraically that $x - 4 = (\sqrt{x} - 2)(\sqrt{x} + 2)$. What does this tell you about
the limit of $(x - 4)/(\sqrt{x} - 2)$ as x approaches 4? What does this tell you about the limit
of $(\sqrt{x} - 2)/(x - 4)$ as x approaches 4?
10. Define a function f by f(x) = x when x > 2, and f(x) = 6 − 2x for x ≤ 2. Does
the limit of f(x) exist as x approaches 2?
11. Define a function f by f(x) = 2x when x > 1, and f(x) = 6 − 2x for x ≤ 1. Does
f(x) have a limit as x approaches 1?
12. Define a function f by
\[
f(x) = \begin{cases} x, & x \le 3, \\ 4 - x, & x > 3. \end{cases}
\]
Is the limit of f(x) as x → 3 defined?
13. Define a function f by
\[
f(x) = \begin{cases} 2x, & x \le 3, \\ 4 - x, & x > 3. \end{cases}
\]
Is the limit of f(x) as x → 3 defined?

14. When x ≠ 0 let f(x) = 1/x. Is the limit of f(x) as x → 0 defined? Are values of
f(x) all very large and positive or very large and negative when x is near 0?
15. When x ≠ 0 let $f(x) = x^{-2}$. Is the limit of f(x) as x → 0 defined? Are values of
f(x) all very large and positive or very large and negative when x is near 0?
16. Two inputs, capital in the amount x and labor in the amount l, are used in
production. The amount produced is $A = x\sqrt{l}$. Suppose that production is to be kept at
1000 so that $x\sqrt{l} = 1000$. Use some values of capital, x, that are positive and near 0 to
explore what happens to the amount of labor as a result.

C. Continuity

Definition. A function, f, is continuous at a, when (1) f is defined at and near a, (2)
as x → a, f(x) has a limit value, say L, and (3) f(a) = L. Written symbolically,
\[
\lim_{x \to a} f(x) = f(a) .
\]

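Example 1. The following functions are continuous at x = 0.
\[
\frac{x+1}{x-1}, \quad x^3 + 5x, \quad |x|, \quad
f(x) = \begin{cases} x^3/x^2, & x \neq 0 \\ 0, & x = 0 \end{cases}, \quad \frac{x^5}{x+7},
\]
\[
x^{1/3}, \quad \frac{x^2}{1+x^2}, \quad \frac{x^2 - 7x + 3}{x - 7}, \quad \sqrt{|x|}, \quad
f(x) = \begin{cases} x, & x \le 0 \\ 2x, & x > 0 \end{cases} .
\]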

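Example 2. The following functions are not continuous at x = 0.
\[
\frac{x+1}{x}, \quad \sqrt{x}, \quad
f(x) = \begin{cases} |x|/x, & x \neq 0 \\ 0, & x = 0 \end{cases}, \quad x^{-0.3}, \quad \frac{3}{x}, \quad
f(x) = \begin{cases} 2 + x, & x \le 0 \\ x, & x > 0 \end{cases} .
\]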

Graphically, a function is continuous at a point at which its graph is an unbroken (or
connected) path. However, a function may be continuous at points at which the graph is
not a connected path – examples of this kind of behavior are quite complicated.
Definition. A function is continuous on an interval (a, b) when it is continuous at every
point in the interval. When a function is continuous on the whole real line, we sometimes
simply say that it is continuous.
Example 3. The following are graphs of continuous functions (for the values of the
variable included in the sketch).

[Figure: Graphs of continuous functions.]

Example 4. The following are graphs of functions showing discontinuities.

[Figure: Graphs showing discontinuities.]

EXERCISES.
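1. For each of the following functions decide whether it is continuous at x = 1.
\[
\text{(a) } \frac{x+1}{x-1}, \quad \text{(b) } x^3 + 5x, \quad \text{(c) } |x|, \quad
\text{(d) } f(x) = \begin{cases} x^3/x^2, & x \neq 0 \\ 0, & x = 0 \end{cases}, \quad
\text{(e) } \frac{x^5}{x+7}, \quad \text{(f) } x^{1/3} .
\]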

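2. For each of the following functions decide whether it is continuous at x = 4.
\[
\text{(a) } \frac{x^2}{x+4}, \quad \text{(b) } \frac{x}{x+4}, \quad \text{(c) } |x|, \quad
\text{(d) } f(x) = \begin{cases} x^3/x^2, & x \neq 0 \\ 0, & x = 0 \end{cases}, \quad
\text{(e) } \frac{x^5}{x+7}, \quad \text{(f) } x^{1/3} .
\]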

3. The US income taxes behave in a way similar to the following. The amount of tax,
t, paid on an income of x is defined using the following algorithm. If x ≤ 8,500 then t = 0.
If 8,500 < x ≤ 20,000 then t = 0.1x − 850. If 20,000 < x ≤ 50,000 then t = 0.25x − 3,850.
And if 50,000 < x then t = 0.3x − 6,350. Is this tax function continuous?
4. Is the tax function in the previous problem increasing? That is, if a person earns
a larger income does she or he pay a higher tax?
5. Is f (x) = x |x| continuous? If not, at which points is it not continuous?
6. Is f (x) = x/ |x| continuous? If not, at which points is it not continuous?
7. Define f (x) = 1/n if x = m/n where m and n are integers with no common factors.
Otherwise (that is when x is not rational) set f (x) = 0. This function is continuous at
irrational values of x and at zero, but is not continuous at rational points other than zero.
Give an explanation. A vague explanation will do since a careful explanation requires facts
we’ve not seen in this text.

D. Slope and Marginal Rates

Lines in the plane are characterized by having a constant slope. We think of the slope
as the rise divided by the run, which is a measure of how steeply the line is inclined.

Example 1. We will graph the lines y = x, y = x + 3, y = 2x, and y = 2 − x.
Decide from the graph which line is which, and try to do this without using the units on
the graph.

[Figure: the four lines graphed on the same axes.]

From a computational point of view, a line is determined by two points on it. From
any two points we can find the slope.

Example 2. Find the slope of the line that contains the points (1, 2) and (3, 5).
Solution: The slope of this line is
\[
\text{slope} = \frac{5 - 2}{3 - 1} = \frac{3}{2} .
\]
The following sketch illustrates this calculation of the slope.

[Figure: The slope of the line containing (1, 2) and (3, 5): the rise is 5 − 2 = 3, the run is 3 − 1 = 2, and the slope is 3/2.]

Example 3. Find the slope of the line that contains the points (0, 3) and (4, 9). Are
these points on the line from the previous example?
Solution: The slope of this line is
\[
\text{slope} = \frac{9 - 3}{4 - 0} = \frac{6}{4} = \frac{3}{2} .
\]
This line has the same slope as the line in the previous example, but (0, 3) and (4, 9) are
not on the previous line. For instance, the horizontal change from (1, 2) to (0, 3) is by −1
(meaning left by 1 unit of length) while the vertical change is by 1, so the slope of the line
containing (1, 2) and (0, 3) is 1/(−1) = −1.

Example 4. Find an equation relating the x and y coordinates of the line that
contains the points (1, 2) and (3, 5).
Solution: We already know that the slope of this line is 3/2. We want a line with this
slope and containing, say (1, 2). So

\[
y = \frac{3}{2}\,x + \text{something}, \quad \text{and} \quad 2 = \frac{3}{2} \cdot 1 + \text{something},
\]
which gives us y = 1.5x + 0.5. One can check that this is correct by checking that the
coordinates of the point (3, 5) also satisfy this equation: 5 = 1.5 × 3 + 0.5.
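The two steps of this solution (find the slope, then solve for the "something") can be written as a short function. This is a sketch; the name `line_through` is hypothetical, and the function assumes the two points have different x coordinates.

```python
def line_through(point1, point2):
    """Return (slope, intercept) of the line y = slope*x + intercept through two points.

    Assumes the two points have different x coordinates.
    """
    (x1, y1), (x2, y2) = point1, point2
    slope = (y2 - y1) / (x2 - x1)      # rise divided by run
    intercept = y1 - slope * x1        # the "something" in the text
    return slope, intercept

slope, intercept = line_through((1, 2), (3, 5))
print(slope, intercept)

# Check, as in the text, that the second point satisfies the equation.
print(slope * 3 + intercept == 5)
```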

Example 5. Find an equation relating the x and y coordinates of the line that
contains the points (0, 3) and (4, 9).
Solution: We already know that the slope of this line is 1.5. We want a line with this
slope and containing, say (0, 3). This gives us y = 1.5x + 3. One can check that this is
correct by checking that the coordinates of the point (4, 9) also satisfy this equation.

Definition. Given the function f (x), the slope of the secant line to the graph of f
containing the points with x = a and with x = b is the slope of the line through the points
(a, f (a)) and (b, f (b)).
The definition above is simpler to understand if we draw a picture: we pick two points
on the graph and find the slope of the line containing them.

Example 6. Find an equation of the secant line to the graph of $f(x) = x^2$ that
contains the points with x = 1 and x = 2.
Solution: The two points on the graph are (1, 1) and (2, 4). The slope is
(4 − 1)/(2 − 1) = 3 and the line contains, say (1, 1). This gives us y = 3x − 2. This is graphed next.

[Figure: The graphs $y = x^2$ and y = 3x − 2 showing (1, 1) and (2, 4).]

It seems clear that the secant line that we find will depend on the function whose
graph we’re considering and on both the points that we choose.

Example 7. Find an equation of the secant line to the graph of $f(x) = x^2$ that
contains the points with x = 1 and x = 1.5.
Solution: The two points on the graph are (1, 1) and (1.5, 2.25). The slope is
(2.25 − 1)/(1.5 − 1) = 2.5 and the line contains, say (1, 1). This gives us y = 2.5x − 1.5.
Let’s graph this last secant line and also the previous one on the same axes.

[Figure: the graphs of $y = x^2$, y = 3x − 2, and y = 2.5x − 1.5 on the same axes.]

Example 8. Find an equation of the secant line to the graph of $f(x) = 1 + x^3$ that
contains the points with x = 1 and x = 3.
Solution: The two points on the graph are (1, 2) and (3, 28). The slope is 13 and the
line contains, say (1, 2). This gives us y = 13x − 11.
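The secant lines of Examples 6 through 8 all follow the same recipe, which can be sketched as a function of the graph's function f and the two values of x. The name `secant_line` is chosen here for illustration.

```python
def secant_line(f, a, b):
    """Return (slope, intercept) of the secant line through (a, f(a)) and (b, f(b))."""
    slope = (f(b) - f(a)) / (b - a)
    intercept = f(a) - slope * a       # the line passes through (a, f(a))
    return slope, intercept

def square(x):
    return x ** 2

def one_plus_cube(x):
    return 1 + x ** 3

print(secant_line(square, 1, 2))          # Example 6
print(secant_line(square, 1, 1.5))        # Example 7
print(secant_line(one_plus_cube, 1, 3))   # Example 8
```

Changing either the function or one of the two points changes the line that is returned.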
Definition. Given the function f(x), the slope of the tangent line to the graph of f
containing the point with x = a is the slope of the line through the point (a, f(a)) that is
the limit of secant lines through the points (a, f(a)) and (b, f(b)) as b gets close to a.
The definition above is not so simple so we’ll examine it carefully: we will draw a
picture, we will compute some examples numerically, and we’ll get a formula for computing
the slope from the function.

Example 9. The following sketch illustrates what we mean by the tangent line to the
graph of $f(x) = 1 + x^3$ at the point with x = 1. The graph looks something like this:

[Figure: the graph of $y = 1 + x^3$ with its tangent line at x = 1.]

Example 10. We will eventually find the tangent line to the graph of $f(x) = 1 + x^3$
at the point with x = 1. The line will pass through the point (1, f(1)) = (1, 2). We
consider a progression of secant lines: First consider, for the second point, the point
(2, f(2)) = (2, 9). The secant line containing (1, 2) and (2, 9) has slope 7. Next consider the
point (1.2, f(1.2)) = (1.2, 2.728). The secant line containing (1, 2) and (1.2, 2.728) has slope
3.64. Next consider the point (1.1, f(1.1)) = (1.1, 2.331). The secant line containing (1, 2)
and (1.1, 2.331) has slope 3.31. Next consider the point (1.01, f(1.01)) = (1.01, 2.030301).
The secant line containing (1, 2) and (1.01, 2.030301) has slope 3.0301. Finally consider
the point (1.0002, f(1.0002)) = (1.0002, 2.000600120008). The secant line containing (1, 2)
and (1.0002, 2.000600120008) has slope 3.00060004. It seems plausible that the slopes of
the secant lines are getting close to 3. The table below summarizes what we’ve calculated.

base-point   second point               slope

(1, 2)       (2, 9)                     7
(1, 2)       (1.2, 2.728)               3.64
(1, 2)       (1.1, 2.331)               3.31
(1, 2)       (1.01, 2.030301)           3.0301
(1, 2)       (1.0002, 2.000600120008)   3.00060004
(1, 2)       → (1.0, 2.0)               → 3.0 ?
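The slopes in the table can be recomputed exactly. The sketch below uses Python's standard `fractions` module so that no rounding enters the calculation; the list of second points is taken from the table and the variable names are illustrative.

```python
from fractions import Fraction

def f(x):
    return 1 + x ** 3

base = Fraction(1)
second_points = [Fraction(s) for s in ("2", "1.2", "1.1", "1.01", "1.0002")]

# Slope of the secant line through (1, f(1)) and (b, f(b)) for each second point b.
slopes = [(f(b) - f(base)) / (b - base) for b in second_points]

for b, slope in zip(second_points, slopes):
    print(float(b), float(slope))
```

Adding second points even closer to 1 to the list gives slopes even closer to 3.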

Of course, any trend that can be seen numerically can be seen (and often more quickly)
algebraically – if it can be done algebraically at all.

Example 11. Find the slope of the tangent line to the graph of $f(x) = 1 + x^3$ at the
point with x = 1 algebraically.
Solution: The slope of the tangent line is the limit of the slopes of the secant lines.
The secant lines pass through the two points (1, 2) and $(b, 1 + b^3)$. The slope is
\[
\text{slope} = \frac{1 + b^3 - 2}{b - 1} = \frac{b^3 - 1}{b - 1} = \frac{(b^2 + b + 1)(b - 1)}{b - 1} = b^2 + b + 1 .
\]
Now we can see what happens as b approaches 1, namely that the slope approaches 3.

Example 12. Find the tangent line to the graph of $f(x) = 1 + x^3$ at the point with
x = 1.
Solution: The slope of the tangent line is 3, as we’ve explored numerically and calcu-
lated algebraically. The line passes through the point (1, 2), so its equation is y = 3x − 1.
Definition. Given the function f(x), the derivative of f with respect to x at a is the
number
\[
f'(a) = \lim_{\Delta x \to 0} \frac{f(a + \Delta x) - f(a)}{\Delta x} .
\]
Comments: 1. This is an important definition. 2. In the above the symbol ∆x (read “delta x”)
is thought of as a small change in x. It is a number that gets very close to 0 (so it can be
either positive or negative, but always small in size). 3. Throughout the process of finding
the derivative, the value at the base-point x = a is fixed.

Definition. Given the function f(x), the function derived from f is the function whose
value at x is f'(x).
In finding f'(x) we first fix the base-point and evaluate the limit that gives the derivative,
and only then consider a new base-point.

Example 13. Find the derivative of $f(x) = x^2$ at the point with x = a, and find
f'(x).
Solution: The limit in the derivative involves the two points $(a, a^2)$ and
$(a + \Delta x, (a + \Delta x)^2)$. The quotient is
\[
\text{quotient} = \frac{(a + \Delta x)^2 - a^2}{\Delta x} = \frac{a^2 + 2a\,\Delta x + (\Delta x)^2 - a^2}{\Delta x}
= \frac{2a\,\Delta x + (\Delta x)^2}{\Delta x} = 2a + \Delta x .
\]
Now we can see what happens as ∆x approaches 0, namely that the quotient approaches
2a. We conclude that f'(a) = 2a and f'(x) = 2x.
Terminology: The function derived from f is often called the derivative of f (without
specifying the base point).

Example 14. Find the derivative of f(x) = 1/x at the point with x = a, and find
f'(x).
Solution: The limit in the derivative involves the two points (a, 1/a) and
(a + ∆x, 1/(a + ∆x)). The quotient is
\[
\text{quotient} = \frac{\frac{1}{a + \Delta x} - \frac{1}{a}}{\Delta x}
= \frac{\frac{a - (a + \Delta x)}{a(a + \Delta x)}}{\Delta x}
= \frac{\frac{-\Delta x}{a(a + \Delta x)}}{\Delta x}
= \frac{-1}{a(a + \Delta x)} .
\]
Now we can see what happens as ∆x approaches 0, namely that the quotient approaches
$-1/a^2$. We conclude that $f'(a) = -1/a^2$ and $f'(x) = -1/x^2$.
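The quotients in Examples 13 and 14 can be evaluated for a sequence of shrinking values of ∆x. The sketch below assumes an arbitrary base-point a = 2 and uses exact fractions from Python's standard `fractions` module; the names `difference_quotient`, `square`, and `reciprocal` are illustrative.

```python
from fractions import Fraction

def difference_quotient(f, a, dx):
    """(f(a + dx) - f(a)) / dx, the quotient whose limit as dx -> 0 is the derivative."""
    return (f(a + dx) - f(a)) / dx

def square(x):
    return x ** 2

def reciprocal(x):
    return 1 / x

a = Fraction(2)                                    # an arbitrary base-point for illustration
steps = [Fraction(1, 10 ** k) for k in range(1, 6)]  # dx = 0.1, 0.01, ..., 0.00001

for dx in steps:
    q_square = difference_quotient(square, a, dx)
    q_reciprocal = difference_quotient(reciprocal, a, dx)
    print(float(dx), float(q_square), float(q_reciprocal))
```

For this base-point the first column of quotients approaches 2a = 4 and the second approaches −1/a² = −0.25, in agreement with the algebraic results.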
Marginal rates. The derivative of a quantity f(x) with respect to x is a rate because it
involves the change in one variable relative to the other. For example, if x is time since the
establishment of a business, measured in months, and f(x) is the number of individuals
who have become customers (measured, say, by the number of different entries in a
database of customers), then f(7) − f(5) is the number of new customers in the 6th and 7th
months, and 7 − 5 is the elapsed time. Hence (f(7) − f(5))/(7 − 5) is the average number
of new customers per month during the 6th and 7th months. Similarly, f'(8) is the rate,
in customers per month, at which the business is recruiting new customers at the end of
its 8th month.

Example 15. Suppose that f(x) measures the cost, in dollars, of providing health
care to x residents of the US. What is the meaning of the derivative f'(x)?
Solution: The derivative involves a quotient of the change in the values of f and the
change in x. In this example, the change in the values of f is a change in the cost of
healthcare (in dollars) and the change in the values of x is a change in the number of
people to whom care is provided. So f'(x) represents the cost of healthcare per person.
The starting position is the number of people, so the change in the number of people
is relative to this starting population size. In conclusion, f'(x) is the marginal cost of
providing healthcare to an additional person.

Example 16. Suppose that \(f(x)\) measures the cost, in dollars, of producing x
bicycles. What is the meaning of the derivative \(f'(x)\)? Suppose \(f'(1000) = 500\) for a
certain manufacturer. What does this tell us about this manufacturer’s behavior in the
market?
Solution: The derivative involves a quotient of the change in the values of f and
the change in x. In this example, the change in the values of f is a change in the cost
of producing bicycles (in dollars) and the change in the values of x is a change in the
number of bicycles made. So \(f'(x)\) represents the additional cost per additional bicycle
(in dollars per bicycle) when production changes from x bicycles. If \(f'(1000) = 500\), then
when 1000 bicycles are produced it costs approximately an additional 500 dollars to make
an additional bicycle. Said differently,
the marginal cost of manufacturing a bicycle is 500 dollars when production is at 1000
bicycles. So the manufacturer will make more bicycles if the additional bicycle can be sold
for more than 500 dollars, and will decrease production when the additional bicycle can
only be sold for less than 500 dollars.

Example 17. Suppose that \(f(x)\) measures the cost, in dollars, of providing education
for a year to x students at a certain university. What is the meaning of the derivative
\(f'(12{,}500)\)? If the state pays the university $564 a year for an additional student and the
student pays $7,200 a year in tuition, and \(f'(12{,}500) = 8{,}400\), would the university make
money by admitting the 12,501st student?
Solution: In this example \(f'(x)\) represents the cost of education per additional student. The
starting value of x is the number of students, so the change measures a change in the
number of people admitted. So \(f'(12{,}500)\) is the cost of educating an additional student
for a year, when the university already has 12,500 students. If this marginal cost is $8,400,
then the university gains from admitting an additional student only when the total received
in revenue is higher. The total received is the tuition and the added payment from the state.
This total is $7,764, so the university would lose money if it admits the 12,501st student.
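
The comparison in Example 17 is a comparison of marginal revenue with marginal cost. As a small sketch (the variable names are assumptions introduced here for illustration), the arithmetic can be written out in Python:

```python
# Figures taken from Example 17; the variable names are illustrative only.
marginal_cost = 8_400   # f'(12,500): dollars per additional student per year
state_payment = 564     # dollars per additional student per year
tuition = 7_200         # dollars per student per year

# Revenue brought in by one additional student.
marginal_revenue = state_payment + tuition

# Negative means the additional student costs more than he or she brings in.
net_change = marginal_revenue - marginal_cost
print(marginal_revenue, net_change)
```

The net change is negative, which is the sense in which the university would lose money on the additional student.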

Summary We discussed the notion of the slope of a graph as the increase in the dependent
variable (the y) in proportion to the change in the independent variable (the x). This gave
rise, through the secant-line approximation, to the rate of change of the dependent variable
as a function of the independent variable, which is the derivative of the function.
The notion of a limit allowed us to go from discrete changes in the variables to the
“instantaneous” change that is the derivative.

EXERCISES.
1. Find an equation for the line through (1, 4) and (3, 2).
2. Find an equation for the line through (1, 4) and (3, 3).
3. Find an equation for the line through (1, 1) and (3, 2).
4. Find an equation for the line through the points on the graph of f (x) = 1/x with
x = 2 and with x = 3.
5. Find an equation for the line through the points on the graph of f (x) = 1/x with
x = 2 and with x = 2.1.
6. Find an equation for the line through the points on the graph of f (x) = 1/x with
x = 2 and with x = 2.001.
7. Find an equation for the line through the points on the graph of \(f(x) = x^4\) with
x = 1 and with x = 2.
8. Find an equation for the line through the points on the graph of \(f(x) = x^4\) with
x = 1 and with x = 1.1.
9. Find an equation for the line through the points on the graph of \(f(x) = x^4\) with
x = 1 and with x = 0.99.
10. Calculate the slope of the line through (2, 4) and \((2 + h, (2 + h)^2)\).
11. Calculate the slope of the line tangent to \(y = x^2\) at (3, 9).
12. Calculate the slope of the line through (2, 0.5) and (2 + h, 1/(2 + h)).
13. Suppose that v represents the value of a bond and t represents time. What is the
meaning of the derivative of v with respect to t? Would you expect this rate to be positive,
negative, or zero?
14. Suppose that b represents the remaining debt on a loan and t represents time.
What is the meaning of the derivative of b with respect to t? Would you expect this rate
to be positive, negative, or zero?
15. Suppose that p is the amount of milk given by a cow per day (in gallons per day)
and that t is time (in days). What is the meaning of the derivative of p with respect to t?
16. Suppose that r represents the total revenue for a manufacturer of bicycles, and
let x represent the number of bicycles sold by this manufacturer. Suppose the current
market price is p dollars per bicycle. Under what assumptions is the marginal revenue
for this manufacturer equal to p? If these assumptions fail to hold, would you expect the
derivative of r with respect to x to exceed p or be less than p?
17. Suppose that l represents the location of a car along a road and t represents time.
What is the meaning of the derivative of l with respect to t?
18. Suppose that i represents personal income and t represents time. Is an economy
developing when the derivative of i with respect to t is positive?
19. Suppose that I represents the total national income and P represents the total
population. Suppose that i represents personal income and t represents time. Then i =
I/P . Is an economy developing when the derivative of I with respect to t is positive? If
you cannot decide, then what information is missing?
20. Suppose that h represents a person’s height and t represents the person’s age.
Would you expect this derivative of h with respect to t to be positive, negative, or zero?
How does your answer depend on t?
21. Suppose that h represents the height of water in a drinking glass and v represents
the volume of water in the glass. Would you expect the rate of change of h with v to
be positive, negative, or zero? What assumptions on the shape of the glass lead to the
conclusion that the derivative of v with h is constant? What assumptions on the shape of
the glass lead to the conclusion that the derivative of h with v is constant?
22. Suppose that l represents the location of a bicyclist along a road and t represents
time. Suppose that the cyclist is moving in the same direction the whole time so l is
increasing with t. Assume also that at the beginning the road has a hill (first up, then
down) and then the road has a slight uphill grade. How would the derivative of l with
respect to t change with time? Sketch a possible graph of this derivative and mark it with
“hill”, “downhill”, and “slight grade”.
23. A manufacturer has a marginal cost and a marginal revenue (for the marginal
unit of production). Suppose that the marginal cost is higher than the marginal revenue.
What does this mean to the manufacturer?
24. A manufacturer has a marginal cost and a marginal revenue (for the marginal
unit of production). Suppose that the marginal cost is lower than the marginal revenue.
What does this mean to the manufacturer?
25. A manufacturer has a marginal cost and a marginal revenue (for the marginal unit
of production). Suppose that the market price of the items produced drops (say because
of dumping by a competitor). How does this change the marginal cost and the marginal
revenue?
26. A manufacturer has a marginal cost and a marginal revenue (for the marginal
unit of production). Suppose that the cost of raw materials or components is dropping
over time (say because of technological innovation). How will the marginal cost and the
marginal revenue change over time?

E. Some Differentiation Rules

It turns out that differentiation is very useful, not only as a way of thinking about
functions, but also as a way of understanding some important features of functions. One
of the reasons differentiation is so useful is that there are fairly quick ways in which to
calculate derivatives – there are many examples for which we do not have to calculate the
derivative using a limit of quotients.
It is easiest to remember differentiation rules rather than derive them for each function
that we differentiate. Some rules follow below and others will be added to them later. To
help in remembering these rules, we tend to name them. That way we keep reminding
ourselves of the rule as we use it.

Power Rule: \(f(x) = x^{\alpha}\), \(\alpha \neq 0\), \(f'(x) = \alpha x^{\alpha - 1}\)

Product Rule: \(f(x) = g(x) \cdot h(x)\), \(f'(x) = g'(x) \cdot h(x) + g(x) \cdot h'(x)\)

Quotient Rule: \(f(x) = \dfrac{g(x)}{h(x)}\), \(f'(x) = \dfrac{g'(x) \cdot h(x) - g(x) \cdot h'(x)}{(h(x))^2}\)

Chain Rule: \(f(x) = g(h(x))\), \(f'(x) = g'(h(x)) \cdot h'(x)\)

There are some important properties of differentiation that are perhaps easier to
remember without a formula. First, the derivative of a constant is zero. Second, differen-
tiation distributes over addition and commutes with multiplication by numbers.
Let’s make sense of the last two statements. A function whose value is constant is
not changing, so its rate of change is zero. For example, if the temperature of the water
at Old Faithful geyser in Yellowstone National Park is always 199 degrees (°F), then the
rate at which the temperature is changing over time is 0.
If the cost of a gallon of fuel was rising by 5 cents per month during August of 2009,
then the price of 2 gallons was rising by 10 cents per month and the price of 1.3 gallons
was rising by 6.5 cents per month. So, for example, if a customer has a can containing 1.3
gallons (for her or his lawn mower, let’s say), then when the customer filled that can at
the end of August the cost was 6.5 cents higher than it had been a month earlier.
Finally, if the cost of a gallon of fuel was rising by 5 cents per month during August
of 2009, and the price of a cup of coffee was falling by 3 cents per month at that time,
then a customer buying 5 gallons of gas and two cups of coffee saw an increase at a rate
of 5 × 5 − 2 × 3 = 19 cents per month (buying the same items a month later cost
19 cents more than buying them at the beginning of the month).
Cultural Note: A mathematical operation that commutes with multiplication by num-
bers and distributes over addition is called a linear operation.

Example 1. Assume that \(f'(2) = 3\). Find the derivative of g(x) = 5 + 7f(x) at x = 2.

Solution: Since g(x) is a sum, the rate of change of g is the sum of the rates of change of
its two parts. Since 5 is a constant, it does not change as x changes. Since the second part
of g is 7f(x) we have \(g'(x) = 7 \cdot f'(x)\) so \(g'(2) = 7 \cdot f'(2) = 7 \times 3 = 21\).

Example 2. Assume that \(f'(1) = 5\) and \(g'(1) = 7\). Find \(h'(1)\) for h(x) = 0.2f(x) −
1.3g(x).
Solution: \(h'(1) = 0.2 \cdot f'(1) - 1.3 \cdot g'(1) = 0.2 \times 5 - 1.3 \times 7 = -8.1\).
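
Linearity can be checked numerically on concrete functions. In the sketch below, the functions \(x^5\) and \(x^7\) are hypothetical stand-ins chosen only because their derivatives at 1 are 5 and 7, matching the data of Example 2; the text itself does not specify f and g. The helper central_difference is likewise an assumption of this sketch: it estimates a derivative from two nearby points.

```python
def central_difference(func, x, dx=1e-6):
    """Estimate the derivative of func at x from two nearby points."""
    return (func(x + dx) - func(x - dx)) / (2 * dx)


def f(x):
    return x ** 5  # hypothetical choice with f'(1) = 5


def g(x):
    return x ** 7  # hypothetical choice with g'(1) = 7


def h(x):
    return 0.2 * f(x) - 1.3 * g(x)


estimate = central_difference(h, 1.0)  # numerical slope of h at x = 1
by_rule = 0.2 * 5 - 1.3 * 7            # linearity applied to f'(1) and g'(1)
print(estimate, by_rule)
```

The two printed numbers agree to several decimal places, as linearity predicts.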

Example 3. Find the derivative of \(f(x) = x^5\).

Solution: The function is a power of the variable, so the power rule applies. The power is
5 and the rule tells us that
\[
f'(x) = 5x^{5-1} = 5x^4 .
\]

Example 4. Find the line tangent to the graph of \(f(x) = x^{3/2}\) at the point (1, 1).
Solution: The function is a power of the variable, so the power rule applies (the power is
3/2). The derivative, evaluated at x = 1, will give the slope of the tangent line, and the
fact that the line contains the point (1, 1) will allow us to decide on an equation for this
line.
\[
f'(x) = \frac{3}{2} x^{3/2 - 1} = \frac{3}{2} x^{1/2}, \qquad f'(1) = \frac{3}{2} \sqrt{1} = \frac{3}{2} = 1.5 .
\]
The tangent line is y = 1.5x − 0.5.
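
The steps of Example 4 (slope from the power rule, then the intercept from the known point) can be sketched in Python. The function name power_rule_derivative is an assumption made for this illustration.

```python
def power_rule_derivative(alpha, x):
    """Derivative of x**alpha at x, by the power rule (alpha != 0)."""
    return alpha * x ** (alpha - 1)


alpha = 1.5          # the power in f(x) = x**(3/2)
x0, y0 = 1.0, 1.0    # the point of tangency from Example 4

slope = power_rule_derivative(alpha, x0)
# A line with this slope through (x0, y0) is y = slope * x + intercept.
intercept = y0 - slope * x0
print(slope, intercept)
```

The printed slope and intercept match the tangent line y = 1.5x − 0.5 found above.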

Example 5. Find the slope of the tangent line to the curve \(y = 1/\sqrt{x}\) at the point
(4, 0.5).
Solution: The function is a power of the variable since \(y = 1/\sqrt{x} = x^{-1/2}\). The slope of
the tangent line is the derivative of y with respect to x evaluated at x = 4.

\[
y' = -\frac{1}{2} x^{-3/2}, \qquad \text{slope} = y'(4) = -\frac{1}{2} \cdot \frac{1}{(\sqrt{4})^3} = -\frac{1}{2} \cdot \frac{1}{2^3} = -\frac{1}{16} .
\]

Example 6. Find all values of x for which the graph of \(f(x) = x^3 - 3x^2 + 5\) has a
horizontal tangent line.
Solution: The line tangent to the graph is horizontal when its slope is zero. Hence we want
to find values of x for which \(f'(x) = 0\). So

\[
0 = f'(x) = 3x^2 - 3 \cdot 2x = 3x^2 - 6x = 3x(x - 2) .
\]

The solutions are x = 0 and x = 2.

Example 7. Find the derivative of \(f(x) = x(3 + x)^2\).

Solution: We recognize that f is a combination of powers, so one can find the derivative
by expanding f as a sum of powers. We find that \(f(x) = x(3 + x)^2 = x(9 + 6x + x^2) =
9x + 6x^2 + x^3\) and \(f'(x) = 9 + 12x + 3x^2\).

In order to return to the interpretation of derivatives, we will notice the distinction between
an average rate and a marginal rate.

Example 8. The cost, in dollars, to a store of procuring x bicycles is
\(C(x) = 10{,}000 + 400x + 5x^2\) (these costs are due to fixed costs for being a dealer for the
bicycle brand, the amount paid for each bicycle, and the cost of storing the bicycles). Each
bicycle sells for $1,000. Find the revenue as a function of the number of bicycles that the
store sells. Find the average cost and average revenue as functions of the number of
bicycles sold.
Solution: We will assume that the store only purchases (from the manufacturer) the bi-
cycles that it actually sells. The revenue from the sale of x bicycles is the selling price of
each bicycle times the number sold, so revenue is \(R(x) = 1{,}000x\). The average cost, per
bicycle, is the total cost divided by the number of bicycles. Similarly, the average revenue,
per bicycle, is the total revenue divided by the number of bicycles. When x bicycles are
sold, the average cost and average revenue are

\[
\text{Average Cost} = AC(x) = \frac{10{,}000 + 400x + 5x^2}{x} = \frac{10{,}000}{x} + 400 + 5x,
\]
\[
\text{Average Revenue} = AR(x) = \frac{1{,}000x}{x} = 1{,}000 .
\]
The units for both of these are dollars per bicycle (for each of the x bicycles sold).

Marginal cost, revenue, and profit. In Economics, the cost or value of an additional
unit is called the marginal value. Examples: If we know the cost of production as a
function of the number of items produced, then the cost of producing the additional unit
is called the marginal cost. If we know the revenue from the sale of a certain number of
items, then the revenue from the sale of an additional unit is called the marginal revenue.
If we know the profit from a sale as a function of the number of items sold, then the profit
from the sale of an additional unit is called the marginal profit.

Example 9. The cost to a store of procuring x bicycles is \(C(x) = 10{,}000 + 400x + 5x^2\).
Each bicycle sells for $1,000. Find the revenue and profit as functions of the number of
bicycles that the store sells. Find the marginal cost, marginal revenue, and marginal profit
as functions of the number of bicycles sold.
Solution: We found in the previous example that the revenue is \(R(x) = 1{,}000x\). The profit
is what remains from the revenue after the costs are paid, so the profit is

\[
\text{profit} = f(x) = 1{,}000x - (10{,}000 + 400x + 5x^2) = -10{,}000 + 600x - 5x^2 .
\]

When x bicycles are sold, the marginal cost, marginal revenue, and marginal profit are

\[
MC = C'(x) = 400 + 10x, \qquad MR = R'(x) = 1{,}000, \qquad MP = f'(x) = 600 - 10x .
\]
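
To see how the marginal quantities relate to the cost of an additional unit, the sketch below compares the derivative formulas of Example 9 with the exact change from one more bicycle. The sales level of 50 bicycles is a hypothetical value chosen for illustration; it does not appear in the text.

```python
def cost(x):
    return 10_000 + 400 * x + 5 * x ** 2


def revenue(x):
    return 1_000 * x


def profit(x):
    return revenue(x) - cost(x)


def marginal_cost(x):
    return 400 + 10 * x    # C'(x)


def marginal_profit(x):
    return 600 - 10 * x    # f'(x)


x = 50  # hypothetical sales level, for illustration only
extra_cost = cost(x + 1) - cost(x)        # exact cost of the 51st bicycle
extra_profit = profit(x + 1) - profit(x)  # exact profit from the 51st bicycle
print(marginal_cost(x), extra_cost)       # 900 versus 905
print(marginal_profit(x), extra_profit)   # 100 versus 95
```

The derivative and the exact one-unit change differ slightly because the cost is not linear; the derivative is the instantaneous rate, while the difference is an average over one whole unit.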

Example 10. A ski shop finds that when a (one hour group) lesson costs $35 there are
100 students per day who sign up for a class, and that when the lesson costs $40 only 90
students sign up for a class on any given day. Suppose that the rate at which the number
of students taking lessons declines with the price is constant. Suppose also that the ski
shop offers 5 classes each day and pays $20 to the instructor of each of these classes. Find
the revenue, cost, and profit from ski lessons. Find the marginal profit as a function of the
price charged per lesson. Calculate the marginal profit when the price is $40. Would the
ski shop want to set the price higher or lower than $40?
Solution: The variable of interest here is the price of a ski lesson. Let p denote the price
of a ski lesson. Let n denote the number of people that take a lesson on any given day.
The data shows that for p = 35, n = 100, and that for p = 40, n = 90. For each dollar
of increase in the price, 2 fewer students signed up, and since the rate of change of n with
p is constant, we get n = 100 − 2(p − 35) = 170 − 2p. The revenue is the product of the
number of students and the price of a lesson. Hence the revenue is \(R = n \cdot p = 170p - 2p^2\).
The cost is $100 each day since there are 5 lessons and the instructor is paid $20 for each
of them. Profit is revenue less cost, so

\[
\text{profit} = f(p) = 170p - 2p^2 - 100 .
\]

When the price of a lesson is p the marginal profit is

\[
MP = f'(p) = 170 - 4p, \qquad MP(40) = f'(40) = 170 - 160 = 10 .
\]

When the price is $40, the marginal profit is 10, so the daily profit increases by approximately
$10 when the price of a lesson is raised by $1. So the shop would increase
profits if it charged more than $40 for each lesson.
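
The ski-shop model of Example 10 can be tabulated directly. The following Python sketch uses the demand line and the cost from the example; the function names are assumptions made here. Setting the marginal profit to zero to locate the price at which profit stops increasing is a step added by this sketch, not one taken in the text.

```python
def students(p):
    return 170 - 2 * p             # demand line from Example 10


def profit(p):
    return students(p) * p - 5 * 20  # revenue minus five instructor payments


def marginal_profit(p):
    return 170 - 4 * p             # f'(p)


print(marginal_profit(40))         # 10
print(profit(41) - profit(40))     # 8, the exact gain from a $1 increase

# Price at which the marginal profit is zero (170 - 4p = 0).
p_star = 170 / 4
print(p_star, profit(p_star))
```

The exact gain of $8 is close to, but not equal to, the marginal profit of $10, which is why the example says "approximately".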

Example 11. Find the derivative of \(f(x) = (x^5 + x - 2)(x^4 + x^3 + 1)\).

Solution: We could, in theory, expand f(x) as a polynomial and differentiate each term
using the power rule (there would be 9 terms). However, it is easier to think of f(x) as a
product:
\[
f'(x) = (5x^4 + 1)(x^4 + x^3 + 1) + (x^5 + x - 2)(4x^3 + 3x^2) .
\]

Example 12. Find the derivative of \(f(x) = (x + 2)^3\) by thinking of f as a repeated
product.
Solution: We think of \(f(x) = (x + 2)^3 = (x + 2)[(x + 2)(x + 2)]\) and apply the product rule
twice to get
\[
f'(x) = 1 \cdot [(x + 2)(x + 2)] + (x + 2) \cdot [(x + 2)(x + 2)]'
\]
\[
= [(x + 2)(x + 2)] + (x + 2)[1 \cdot (x + 2) + (x + 2) \cdot 1] = 3(x + 2)(x + 2) = 3(x + 2)^2 .
\]

Example 13. Find the derivative of \(f(x) = (5x + 1)/(2x + 3)\).

Solution: The function is a quotient of two terms, so the quotient rule applies. Thus

\[
f'(x) = \frac{5 \cdot (2x + 3) - (5x + 1) \cdot 2}{(2x + 3)^2} = \frac{10x + 15 - 10x - 2}{(2x + 3)^2} = \frac{13}{(2x + 3)^2} .
\]

Example 14. Find the points at which the graph of \(f(x) = x^2/(x + 4)\) has a horizontal
tangent line.
Solution: We are looking for values of x (and y = f(x)) for which the slope of the tangent
line is zero. The slope is the derivative of f with respect to x and the function is a quotient
of two terms, so the quotient rule applies. Thus

\[
f'(x) = \frac{2x \cdot (x + 4) - x^2 \cdot 1}{(x + 4)^2} = \frac{2x^2 + 8x - x^2}{(x + 4)^2} = \frac{x^2 + 8x}{(x + 4)^2} = \frac{x(x + 8)}{(x + 4)^2} .
\]

Now \(f'(x) = 0\) only when the numerator is zero. So the values of x are x = 0 and x = −8
and the corresponding points on the graph are (0, 0) and (−8, −16). We conclude that the
tangent lines for this graph are horizontal at (0, 0) and at (−8, −16).
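
A quick numerical check of Example 14 is sketched below. It evaluates the quotient-rule result at the two candidate points and compares the formula with a two-point slope estimate at x = 1, a test point chosen arbitrarily for this illustration; the helper names are assumptions.

```python
def f(x):
    return x ** 2 / (x + 4)


def f_prime(x):
    return x * (x + 8) / (x + 4) ** 2   # result of the quotient rule


def central_difference(func, x, dx=1e-6):
    """Estimate the derivative of func at x from two nearby points."""
    return (func(x + dx) - func(x - dx)) / (2 * dx)


for x in (0, -8):
    # The slope should be zero at both points of horizontal tangency.
    print(x, f(x), f_prime(x))

# Compare the formula with a numerical slope at an arbitrary test point.
print(f_prime(1.0), central_difference(f, 1.0))
```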

Example 15. Find the derivative of \(f(x) = x^5 + x^{-2}\) and evaluate it for x = 2.
Solution: The power rule tells us that

\[
f'(x) = 5x^4 - 2x^{-3}, \qquad f'(2) = 5 \cdot 2^4 - 2 \cdot 2^{-3} = 80 - 0.25 = 79.75 .
\]

Example 16. Imagine that we are considering designs for health clinics. Using
a certain configuration, the cost of treating x patients in a clinic on any given day is
\(C(x) = 100{,}000 + 2{,}000x + 200x^2\). (These are fixed costs, costs associated directly with
the treatment of an individual, and the costs associated with logistics for a larger number
of people.) For what number of patients is the average cost equal to the marginal cost?
Suppose that we are planning to build a number of clinics. What does the previous
calculation tell us about the most cost effective size for the clinics?
Solution: The average cost, A, is A(x) = C(x)/x, and the marginal cost is \(MC(x) = C'(x)\).
So we find
\[
A = \frac{100{,}000 + 2{,}000x + 200x^2}{x} = \frac{100{,}000}{x} + 2{,}000 + 200x, \qquad MC = 2{,}000 + 400x .
\]
The average cost equals the marginal cost when x satisfies

\[
\frac{100{,}000}{x} + 2{,}000 + 200x = 2{,}000 + 400x \quad \text{or} \quad 100{,}000 + 2{,}000x + 200x^2 = 2{,}000x + 400x^2 .
\]

So we get \(100{,}000 = 200x^2\), and \(x^2 = 500\), so \(x = \sqrt{500} \approx 22.4\), which gives between 22 and 23 patients.
In deciding whether to treat more patients in existing clinics or to build an additional
clinic, notice that the cost we would like to minimize is the cost per patient. When the
marginal cost is larger than the average cost, the additional patient costs us more than
the cost per patient so far. When the marginal cost is lower than the average cost, the
additional patient costs us less than the cost per patient so far. So the most efficient size
for our clinics is achieved when between 22 and 23 patients are treated each day.
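
The claim that the average cost is lowest where it meets the marginal cost can be checked numerically. In this sketch the function names are assumptions; the formulas are those of Example 16.

```python
import math


def average_cost(x):
    return 100_000 / x + 2_000 + 200 * x


def marginal_cost(x):
    return 2_000 + 400 * x


x_star = math.sqrt(500)  # where average cost equals marginal cost
print(x_star, average_cost(x_star), marginal_cost(x_star))

# Whole numbers of patients on either side of x_star.
for x in (21, 22, 23, 24):
    print(x, round(average_cost(x), 2), marginal_cost(x))
```

Below the crossing point the marginal cost is under the average (so one more patient pulls the average down), and above it the marginal cost is over the average (so one more patient pushes the average up).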

Example 17. Find the derivative of \(f(x) = (x^5 + 3x - 9)^{112}\).

Solution: We could, in theory, expand f(x) as a polynomial and differentiate each term
using the power rule (there would be hundreds of terms). However, it is easier to think of f(x) as a
composition: with \(u = u(x) = x^5 + 3x - 9\) we can think of f as a power, namely \(f(u) = u^{112}\).
Then differentiating with respect to u we get \(f'(u) = 112u^{111}\) and differentiating u with
respect to x we get \(u'(x) = 5x^4 + 3\). We conclude that

\[
f'(x) = 112(x^5 + 3x - 9)^{111} \cdot (5x^4 + 3) .
\]

Example 18. As a balloon rises in the air it is observed that its diameter is 7 inches
and that it is increasing at 0.25 inches per minute. Suppose that the balloon is roughly
spherical and its volume, v, is related to its diameter, x, by \(v = 0.5x^3\). At what rate is the
balloon’s volume increasing over time?
Solution: Let t denote time (in minutes). In terms of the variables, the observation tells
us that \(dx/dt = 0.25\). We want to find \(dv/dt\) at x = 7. We put all this together:

\[
\frac{dv}{dt} = \frac{dv}{dx} \cdot \frac{dx}{dt} = 0.5 \cdot 3x^2 \times 0.25 \quad \text{and at } x = 7, \quad \frac{dv}{dt} = 0.5 \cdot 3 \cdot 7^2 \times 0.25 = 18.375 .
\]

In words, we said that the rate at which volume is increasing over time is the rate at which
the volume increases for each inch of the diameter times the rate at which the diameter is
changing over time.
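
The chain-rule computation of Example 18 is a product of two rates, which the following Python sketch spells out. The names dv_dx and dx_dt are illustrative assumptions mirroring the notation of the example.

```python
def dv_dx(x):
    """Rate of change of volume with diameter for v = 0.5 * x**3."""
    return 0.5 * 3 * x ** 2


dx_dt = 0.25   # inches per minute, the observed growth of the diameter
x = 7          # inches, the observed diameter

# Chain rule: dv/dt = (dv/dx) * (dx/dt)
dv_dt = dv_dx(x) * dx_dt
print(dv_dt)   # 18.375 (cubic inches per minute)
```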

Example 19. A product is introduced into the market. After 11 weeks the number
of stores carrying the product is 50 and is rising by 3 percent per week. The number of
items sold in each store is rising by 17 items per week. How quickly are sales of the new
product rising?
Solution: Let t denote time (in weeks), x denote the number of stores carrying the product,
and v be the volume of sales. In these variables, the information is that at t = 11, x = 50,
and
\[
\frac{dx}{dt} = 0.03x \quad \text{while} \quad \frac{dv}{dx} = 17 .
\]
Since x = 50,
\[
\frac{dx}{dt} = 1.5 \quad \text{and} \quad \frac{dv}{dt} = \frac{dv}{dx} \cdot \frac{dx}{dt} = 17 \times 1.5 = 25.5 .
\]

Example 20. Find the derivative of \(f(x) = (5x^{11} + 2x^4 + 3)^{23}\) at x = 1.

Solution: The function is a composition of two functions, so the chain rule applies.

\[
f'(x) = 23(5x^{11} + 2x^4 + 3)^{22} \cdot (55x^{10} + 8x^3) .
\]

\[
f'(1) = 23(5 + 2 + 3)^{22} \cdot (55 + 8) = 23 \cdot 10^{22} \cdot 63 = 1449 \times 10^{22} .
\]
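
Because Python integers have arbitrary precision, the large value in Example 20 can be confirmed exactly. This is only a sketch; the function name f_prime and the local names inner and inner_prime are assumptions that mirror the two factors in the chain rule.

```python
def f_prime(x):
    """Chain-rule derivative of (5x^11 + 2x^4 + 3)^23."""
    inner = 5 * x ** 11 + 2 * x ** 4 + 3      # the inside function
    inner_prime = 55 * x ** 10 + 8 * x ** 3   # its derivative
    return 23 * inner ** 22 * inner_prime


value = f_prime(1)
print(value)
print(value == 1449 * 10 ** 22)   # True
```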

Example 21. Find the values of x at which the graph of \(f(x) = (x^2 - 3x + 4)^3\) has a
horizontal tangent line.
Solution: We are looking for values of x for which the slope of the tangent line is zero. The
slope is \(f'(x) = 3(x^2 - 3x + 4)^2 (2x - 3)\) and this is zero if \(x^2 - 3x + 4 = 0\) or if 2x − 3 = 0.
The first equation, \(x^2 - 3x + 4 = 0\), has no solutions (because the discriminant is
\(9 - 4 \cdot 1 \cdot 4 = -7 < 0\)). The
second equation is solved by x = 3/2. We conclude that the tangent line for this graph is
horizontal only for x = 3/2.

Terminology: When the function is denoted by a variable, for example y = y(x), we often
think of the quotient in the derivative as a ratio of changes: the change in y divided by
the change in x, and we write
\[
y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} .
\]

Example 22. Suppose that \(v(t) = \sqrt{16 + 11t}\) is the value, in dollars, of a bottle of
wine that has been kept for t years (since its production). Find \(v'(t)\). Evaluate \(v'(3)\) and
interpret its meaning.
Solution: We think of the function as \(v = (16 + 11t)^{0.5}\). Then
\(v'(t) = 0.5\,(16 + 11t)^{-0.5} \times 11\). Now we can calculate that \(v'(3) = 11/(2\sqrt{49}) = 11/14\). This means that
when the wine is 3 years old its value is rising by about 11/14 dollars per year. This
calculation may be useful to a wine merchant who is deciding when to sell the wine, for
example. While considering the wine merchant, notice that the value of the bottle at time
3 years is \(v(3) = \sqrt{16 + 33} = 7\), so the merchant is gaining 11/14 of a dollar (per year) on
an investment worth 7 dollars, which is a return of \((11/14)/7 = 11/98\) (dollars per year
per dollar invested) or just over 11.2 percent.
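
The rate of return at the end of Example 22 is the derivative divided by the current value. A minimal Python sketch follows, with function names chosen here for illustration:

```python
import math


def value(t):
    """Value of the bottle, in dollars, after t years."""
    return math.sqrt(16 + 11 * t)


def value_rate(t):
    """Chain-rule derivative v'(t) = 11 / (2 * sqrt(16 + 11 t))."""
    return 11 / (2 * math.sqrt(16 + 11 * t))


t = 3
rate_of_return = value_rate(t) / value(t)   # dollars per year per dollar invested
print(value(t), value_rate(t), rate_of_return)
```

A merchant could compare this rate of return with what the same 7 dollars would earn elsewhere when deciding whether to keep holding the bottle.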

EXERCISES.
1. For some functions f and g, \(f'(2) = 3\) and \(g'(2) = 5\). Calculate \(h'(2)\) when (a)
h(x) = g(x) − f(x), (b) h(x) = 2f(x) + g(x), and (c) h(x) = 11g(x) + 1.76.
2. Calculate the slope of the tangent line to the curve \(y = 1/\sqrt{x^2}\) at the point (1, 1).
3. Calculate the derivative of \(f(x) = x^4 + 2x^2 + 5\).
4. Calculate the slope of the tangent line to the curve \(y = x^3 + \sqrt{2x}\) at the point
(2, 10) and give an equation for this line.
5. For some functions f and g, f(2) = 1, g(2) = 7, \(f'(2) = 3\) and \(g'(2) = 5\). Calculate
\(h'(2)\) when (a) h(x) = g(x) · f(x), (b) h(x) = g(x)/f(x), and (c) h(x) = 3g(x) + f(x)/2.
6. Find the points in the x-y plane at which the tangent lines to the curve \(y = x^3 + 3x^2\)
are horizontal.
7. Calculate the derivative of \(g(x) = (x^4 + 2x^2 + 5)/(x^2 + 3)\).
8. For some functions f and g, f(1) = 2, f(2) = 1, \(f'(1) = 7\), \(f'(2) = 3\), g(1) = 2,
g(2) = 1, \(g'(1) = 6\), and \(g'(2) = 5\). Calculate \(h'(2)\) when (a) h(x) = g(f(x)), and (b)
h(x) = f(g(x)). Calculate \(h'(1)\) when (c) h(x) = g(f(x)), and (d) h(x) = f(g(x)).
9. Find the values of x for which the tangent lines to the curve \(y = x^4 - x^2\) are
horizontal.
10. Differentiate \(f(x) = \sqrt{x^3 - 6x + 5}\).
11. Find the slope of the tangent line to the curve \(y = \sqrt{x} + x^2\) at the point (1, 2).
12. The cost to a store of purchasing x cereal boxes is \(C(x) = 1{,}000 + x + 0.001x^2\).
Each cereal box sells for $4.50 (assume all the purchased boxes do get sold). (a) Find
the revenue and profit as functions of the number of cereal boxes that the store sells. (b)
Calculate the marginal cost, marginal revenue, and marginal profit as functions of the
number of cereal boxes sold.
13. The cost to a manufacturer of building x boats is \(C(x) = 100{,}000 + 3{,}000x + x^2\).
Each boat sells for $5,000. (a) Find the revenue and profit as functions of the number of
boats that the manufacturer builds. (b) Calculate the marginal profit as a function of the
number of boats built. (c) Suppose 1000 boats are built. Calculate the profit. (d) When
1000 boats are built, is the profit increasing (with more production)? decreasing?
14. The cost to a store of purchasing x cereal boxes is \(C(x) = 1{,}000 + x + 0.001x^2\).
Each cereal box sells for $4.50 (assume all the purchased boxes do get sold). (a) Suppose
that 1000 boxes of cereal are sold in one year. Calculate the profit from these sales. (b)
Suppose that 2000 boxes of cereal are sold in one year. Calculate the profit with this new
number of sales. (c) Calculate the profit per item with sales as in (a) and as in (b).
15. The cost, in dollars, to a store of procuring x bicycles is \(C(x) = 200{,}000 +
400x + 5x^2\). Each bicycle sells for $3,000. Find the revenue as a function of the number
of bicycles that the store sells. (a) Find the average cost and average revenue as functions
of the number of bicycles sold. (b) Calculate the marginal cost. (c) Suppose 1,300 of
these bicycles are sold. Calculate the average cost, average revenue, marginal revenue,
and marginal cost. (d) Which of the numbers in part (c) are most important to the
store’s decision regarding ordering bicycles? (e) Which of the numbers in part (c) are most
important to the store’s balance sheet?
16. A kayak school finds that when a certain training class costs $700 there are 100
students per summer who sign up for this class, and that when the class costs $800 only
90 students sign up for the class. Suppose that the rate at which the number of students
taking the class declines with the price is constant. Suppose the cost of offering this class is $300
per student. (a) Find the revenue, cost, and profit from these classes as functions of the
price charged per class. (b) Calculate the marginal profit when the price is $700. (c)
Interpret this number in terms of the school’s decision regarding the price of its class.
17. A pizza shop finds that when a pizza costs $15, 100 pizzas are sold per day, and
when a pizza costs $10, 150 pizzas are sold. Suppose the relation between the price and
the number of pizzas sold is linear. (a) Find the revenue as a function of the price charged
per pizza. (b) Calculate the marginal revenue when the price is $14. (c) Calculate the
marginal revenue when the price is $12. (d) Would the revenue be greater with a price of
$12 or $14?
18. A ski shop finds that when a (one hour group) lesson costs $35 there are 70
students per day who sign up for a class, and that when the lesson costs $40 only 60
students sign up for a class on any given day. Suppose that the relation between the
number of students taking lessons and the price is linear. (a) Find the revenue from ski
lessons. (b) Calculate the marginal revenue as a function of the price charged per lesson.
(c) Calculate the marginal revenue when the price is $40.
19. A pizza shop finds that when a pizza costs $15, 100 pizzas are sold per day, and
when a pizza costs $10, 150 pizzas are sold. Suppose the relation between the price and
the number of pizzas sold is linear. Suppose the cost of making a pizza is fixed (does not
depend on the number of pizzas made). Would the cost of making a pizza change the
shop’s decision about how much to charge for each pizza? and if so, in what way?
20. A product is introduced into the market. After 11 weeks the number of stores
carrying the product is 100 and is rising by 3 percent (per week). The number of items
sold in each store is rising by 17 items per week. How quickly are sales of the new product
rising?
21. A product is introduced into the market. After 11 weeks the number of stores
carrying the product is 100 and is rising by 3 percent. The number of items sold in each
store is rising by 12 items per week. How quickly are sales of the new product rising?
22. Imagine that we are considering designs for distribution centers. The cost of
sending out x items on any given day is \(C(x) = 5{,}000 + 2x + 0.1x^2\). (These are fixed costs,
costs associated directly with each shipment, and the costs associated with logistics for a
larger number of parcels.) (a) Calculate the average cost. (b) Calculate the marginal cost.
(c) For what number of items is the average cost equal to the marginal cost? (d) What
does the previous calculation tell us about the most cost effective size for the distribution
centers?
23. Imagine that we are considering the best time for replanting an orchard. Annual
production from an orchard of age a is \(X(a) = 20a + 0.4a^2 - 0.008a^3\). (Production is
measured in bushels per acre.) (a) Calculate the marginal production rate and the average
production rate over the life of the orchard. (b) At what age is the average production
rate equal to the marginal production rate? (c) Suppose that we can ignore the cost of
replanting. At which age should the orchard be replanted? Explain the reasoning behind
this decision.

F. Linearization

The linearization of a function provides an alternative approach to the derivative, and
this point of view is useful in understanding the product rule and the chain rule. It is also
useful when estimating changes, as we’ll see later.

Example 1. Suppose that T (x) is the total amount that a consumer is willing to pay
for x crayons, in dollars. Suppose T(64) = 7 and \(T'(64) = 0.1\). Interpret this information
in words. How much is the consumer willing to pay per crayon, on average, when she or
he buys 64 crayons? Approximately how much is the consumer willing to pay for a total
of 66 crayons?
Solution: The consumer is willing to pay 7 dollars for the 64 crayons, and is willing
to pay an additional 0.1 dollars (ten cents) for each additional crayon. When buying 64
crayons, the consumer is willing to pay, on average, 7/64 dollars per crayon (or 10.9375
cents per crayon). For 66 crayons, the consumer would be willing to pay approximately 7
dollars for the initial 64 crayons and 0.1 for each of the two additional crayons or 7+2×0.1 =
7.20 dollars.

Suppose that the (independent) variable x is changing from x to \(x + \Delta x\). Then the
corresponding change in the function is \(\Delta f = f(x + \Delta x) - f(x)\). With this notation, the
derivative is

\[ f'(x) = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} \]

and for \(\Delta x\) near zero we can think of \(\Delta f\) in terms of the derivative of f, namely

\[ \Delta f \approx f'(x) \cdot \Delta x . \]

Example 2. Suppose that when x is the number of shoes produced at a factory,
measured in thousands of pairs, the cost of production is \(C(x) = 3{,}000{,}000 + 10{,}000x + 20x^2\)
dollars. Suppose that 5,000 pairs are currently produced (so x = 5). Approximately
how much does it cost to produce 7 more pairs of shoes?
Solution: The marginal cost is \(C'(x) = 10{,}000 + 40x\) and currently x = 5, so the marginal
cost is \(C'(5) = 10{,}000 + 200 = 10{,}200\). Now the change in x is 7/1,000 = 0.007 (because
x represents the number of pairs of shoes in thousands). We calculate that

\[ \Delta C \approx 10{,}200 \cdot 0.007 = 71.40 . \]

So an additional 7 pairs cost about $71.40 to produce.

The advantage of the approach we saw in this example is that we get a meaningful
estimate for the additional cost without a lot of computation (that is, once we calculate
the marginal cost we can evaluate it fairly easily and estimate the change in the cost very
easily).
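
As a numerical check of the estimate in Example 2, the following Python sketch compares the linear estimate C'(5)·Δx with the exact change C(5.007) − C(5). The function and variable names are illustrative choices and are not part of the text.

```python
def cost(x):
    """Cost in dollars when x thousand pairs are produced (Example 2)."""
    return 3_000_000 + 10_000 * x + 20 * x**2


def marginal_cost(x):
    """Derivative of the cost with respect to x."""
    return 10_000 + 40 * x


x0 = 5          # current production, in thousands of pairs
dx = 7 / 1000   # 7 additional pairs, expressed in thousands

linear_estimate = marginal_cost(x0) * dx        # C'(5) * dx
exact_change = cost(x0 + dx) - cost(x0)         # C(5.007) - C(5)

print(f"linear estimate: {linear_estimate:.4f} dollars")
print(f"exact change:    {exact_change:.4f} dollars")
```

The two values differ by roughly a tenth of a cent. The gap is the term 20(Δx)^2 that the linear estimate leaves out.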

Example 3. Suppose that when x is the number of shoes produced at a factory,
measured in thousands of pairs, the marginal cost of production is \(C'(x) = 10{,}000 + 40x\).
Suppose that 5,000 pairs are currently produced (so x = 5). Approximately how many
additional pairs of shoes can be produced with an investment of 150 dollars?
Solution: The marginal cost is currently \(C'(5) = 10{,}000 + 200 = 10{,}200\). For a change
in x of size h, the additional cost is approximately \(\Delta C \approx 10{,}200 \cdot h\). So if this additional
cost is 150 dollars, then \(150 = \Delta C \approx 10{,}200 \cdot h\), and \(h \approx 1/68\) thousand pairs, which is
about 14.7 pairs. So between 14 and 15 additional pairs of shoes can be produced.

Example 4. Use linearization to justify the product rule.

Solution: Suppose now that we have a variable x and two functions f(x) and g(x). We
are interested in the rate of change of the product \(p(x) = f(x) \cdot g(x)\). Let us calculate

\[ \Delta p = f(x + \Delta x) \cdot g(x + \Delta x) - f(x) \cdot g(x) , \]

\[ f(x + \Delta x) = f(x) + \Delta f \approx f(x) + f'(x)\,\Delta x , \]

\[ g(x + \Delta x) = g(x) + \Delta g \approx g(x) + g'(x)\,\Delta x , \]

so

\[ \Delta p \approx \bigl(f(x) + f'(x)\,\Delta x\bigr)\bigl(g(x) + g'(x)\,\Delta x\bigr) - f(x) \cdot g(x) \]

or

\[ \Delta p \approx f'(x)g(x)\,\Delta x + f(x)g'(x)\,\Delta x + f'(x)g'(x)\,(\Delta x)^2 . \]

Since \(\Delta x \to 0\), the last term, which involves \((\Delta x)^2\), approaches zero even more rapidly. So
the linear approximation becomes

\[ \Delta(f \cdot g)(x) \approx f'(x)g(x)\,\Delta x + f(x)g'(x)\,\Delta x . \]

To summarize this in words, the change in f · g has two parts, one coming from the change
in f which is scaled by the factor g(x) and the second coming from the change in g which
is multiplied by the factor f (x).

Example 5. Suppose that for a certain brand of sunglasses the number of sunglasses
sold was 113,000 and was rising by 2,000 sunglasses per month while the price was $52
and was rising by 3 dollars per month. Use this to approximate the rate at which revenue
was changing.
Solution: The revenue is the product of the number of items sold and the price at which
they sell. Hence the revenue on the 113,000 sunglasses sold was rising by $3 each for
a total of 113, 000 × 3 = 339, 000 and the additional sunglasses sold brought in about
2, 000 × 52 = 104, 000. At this moment in time the revenue was rising by approximately

113, 000 × 3 + 2, 000 × 52 = 339, 000 + 104, 000 = 443, 000 dollars per month.
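
The size of the term that the product rule drops can be seen directly in Example 5. The Python sketch below assumes, purely for illustration, that sales and price keep changing at the stated rates for one full month, and compares the linear estimate with the exact one-month change in revenue. The variable names are not from the text.

```python
quantity, price = 113_000, 52   # current sales (sunglasses) and price (dollars)
dq, dp = 2_000, 3               # monthly change in sales and in price

# Linear (product rule) estimate: f * dg + g * df
linear_rate = quantity * dp + dq * price

# Exact change in revenue if both rates hold for one whole month (an assumption)
exact_one_month = (quantity + dq) * (price + dp) - quantity * price

# The difference is the product of the two changes, the (delta x)^2-type term
neglected_term = dq * dp

print(linear_rate, exact_one_month, neglected_term)
```

The neglected term, 2,000 × 3 = 6,000 dollars, is small next to the 443,000 dollar estimate, which is why the linear approximation is useful here.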

Example 6. Use linearization to justify the chain rule.

Solution: We are interested in the rate of change of a composition of functions. A function
u acts on the variable x producing u = u(x), and a function f acts on the result, f =
f (u) = f (u(x)).
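For instance, \(f(x) = (x^3 + 2x)^{57}\) has \(u(x) = x^3 + 2x\) as the “inside” function.
Using linearization,

\[ \frac{\Delta f}{\Delta x} = \frac{\Delta f}{\Delta u} \cdot \frac{\Delta u}{\Delta x} \approx \frac{df}{du}(u) \cdot \frac{du}{dx}(x) . \]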
Since, in this case, ∆u and ∆x are not zero, the multiplication and division by ∆u is
legitimate and is a useful way to understand and remember the chain rule. (If ∆u = 0,
then a change in x creates no change in u so there is no change in f .)

Example 7. Find \(f'(x)\) for \(f(x) = (x^3 + 2x)^{57}\).

Solution: We think of f as \(f(u) = u^{57}\) with \(u(x) = x^3 + 2x\). Then \(f'(u) = 57u^{56}\) and
\(u'(x) = 3x^2 + 2\). Thus

\[ f'(x) = 57u^{56} \cdot (3x^2 + 2) = 57(x^3 + 2x)^{56}(3x^2 + 2) . \]
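
A chain rule computation like this one can be checked numerically. The Python sketch below compares the formula 57u^56 · (3x^2 + 2) with a central difference quotient at x = 0.5. The test point, the step size h, and the helper name central_difference are arbitrary illustrative choices.

```python
def f(x):
    return (x**3 + 2 * x) ** 57


def f_prime(x):
    """Chain rule: outer derivative 57*u**56 times inner derivative 3*x**2 + 2."""
    u = x**3 + 2 * x
    return 57 * u**56 * (3 * x**2 + 2)


def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient, an approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 0.5
exact = f_prime(x0)
approx = central_difference(f, x0)
relative_gap = abs(approx - exact) / abs(exact)

print(f"chain rule: {exact:.6e}")
print(f"numerical:  {approx:.6e}")
print(f"relative gap: {relative_gap:.2e}")
```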

Differentials

The differential is used in two ways. One is as an aid in integration (which we will see
later). The other is to emphasize the use of the derivative in approximations. For y = f(x)
we write

\[ dy = \frac{df}{dx}(x)\, dx \]

to remind us that the change in y is approximately the derivative (of y with respect to x)
times the change in x.

Example 8. A square room is being measured (imagine that it will be covered
in wooden flooring, say, and we want to know how much flooring to purchase).
Suppose that the measure is 15 feet on each side. Suppose the error in the measurement
can be up to 0.1 foot. What might the error in the area be?
Solution: Let x be the side length and A be the area. Then \(A = x^2\) and \(dA/dx = 2x\). So
at x = 15, dA = 30 dx. That means that an error of 0.1 (feet) in x, that is dx ≈ 0.1, can
lead to an error of 3 (square feet) in the area, that is dA ≈ 30 × 0.1.

Example 9. Suppose that a manufacturer is measuring resin in a cube of side length
4 cm (so the volume is 64 cc). How accurately must the side of the cube be measured to
make sure that the measurement of the volume is within 1 percent?
Solution: Let x be the side length, and V be the volume. Since \(V = x^3\), \(dV = 3x^2\,dx\) so at
x = 4, dV = 48 dx. For the error to be less than 1 percent, one needs 48 dx < 0.01 × 64 =
0.64. This requires the error in x to be less than 0.64/48 = 1/75 cm. (The error in the side
length can be no more than (1/75)/4 = 1/300 of the length, or 1/3 of a percent, in this case.)
Example 10. The frequency of a stringed instrument is \(f = k\sqrt{t}\), where t is the
tension in the string, and k is a positive constant. Suppose that for t = 100 (kg·m/sec^2)
the frequency is 440 (Hz). If the tension changes by 1%, how much does the frequency
change?
Solution: Calculate first that \(k = f/\sqrt{t} = 440/10 = 44\). Now \(df = \frac{k}{2\sqrt{t}}\,dt\). At
t = 100, \(k/(2\sqrt{t}) = k/20 = 2.2\) so df = 2.2 dt. A change of 1% in t gives a change of 1 in t
(dt ≈ 1), so the frequency changes by about 2.2 Hz, which is 0.5% of the frequency.
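
To see how close the differential is to the true change in Example 10, the Python sketch below computes both. It is only a check of the arithmetic; the names frequency, df, and exact_change are illustrative.

```python
import math

k = 440 / math.sqrt(100)        # constant from f = k * sqrt(t) with f(100) = 440


def frequency(t):
    return k * math.sqrt(t)


t0 = 100
dt = 0.01 * t0                  # a 1% change in tension

df = k / (2 * math.sqrt(t0)) * dt                 # differential estimate
exact_change = frequency(t0 + dt) - frequency(t0)
percent_change = 100 * df / frequency(t0)

print(f"differential: {df:.4f} Hz ({percent_change:.2f}% of the frequency)")
print(f"exact change: {exact_change:.4f} Hz")
```

The pattern here is general for a square-root relationship: a small percentage change in t produces about half that percentage change in f.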

Continuity and differentiability.

Linearization also can be used to see that when a function is differentiable, the function is
continuous. Suppose that a function is differentiable at some point x = a. Then

\[ f(a + \Delta x) - f(a) \approx f'(a) \times \Delta x , \]

with the approximation being valid when \(\Delta x\) gets small. As x approaches a, \(\Delta x\) is
approaching zero, so \(f(a + \Delta x) - f(a) \to 0\). It follows that f is continuous at x = a. Said
another way, if the graph of f has a tangent line at (a, f(a)), then this graph does not
have a jump at this point.

Question: Is the opposite implication also true? If a function is continuous at x = a,

does it follow that it has a derivative at that point?
Solution: The answer is that continuity does not imply differentiability. We will examine
a few examples.

Example 11. Set \(f(x) = |x|\).

Consider the behavior near x = 0. As x → 0, |x| → 0 as well, and f(0) = |0| = 0, so
f is continuous at x = 0. Next we examine the difference quotients used in differentiation. For
\(\Delta x\) near zero and \(\Delta x < 0\),

\[ \frac{f(0 + \Delta x) - f(0)}{\Delta x} = \frac{|\Delta x| - 0}{\Delta x} = \frac{-\Delta x}{\Delta x} = -1 . \]

For \(\Delta x\) near zero and \(\Delta x > 0\),

\[ \frac{f(0 + \Delta x) - f(0)}{\Delta x} = \frac{|\Delta x| - 0}{\Delta x} = \frac{\Delta x}{\Delta x} = 1 , \]

and so the limit as \(\Delta x \to 0\) does not exist. So f(x) = |x| is not differentiable at x = 0.

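Example 12. Set

\[ f(x) = \begin{cases} 2x, & x < 1 \\ 3x - 1, & x \ge 1 . \end{cases} \]

Consider the behavior near x = 1. Note that f(1) = 2 and that f is continuous at
x = 1. For \(\Delta x > 0\), \(1 + \Delta x > 1\) so

\[ \frac{f(1 + \Delta x) - f(1)}{\Delta x} = \frac{(3(1 + \Delta x) - 1) - 2}{\Delta x} = \frac{3\,\Delta x}{\Delta x} = 3 . \]

For \(\Delta x < 0\), \(1 + \Delta x < 1\) so

\[ \frac{f(1 + \Delta x) - f(1)}{\Delta x} = \frac{2(1 + \Delta x) - 2}{\Delta x} = \frac{2\,\Delta x}{\Delta x} = 2 . \]

Hence f is not differentiable at x = 1.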
Conclusion: A function that is differentiable at x = a is continuous at x = a. However,
a function that is continuous at x = a may or may not be differentiable at this point.
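
The one-sided difference quotients in Examples 11 and 12 can also be computed numerically. The Python sketch below is a rough illustration with a fixed small step h (an arbitrary choice); it is not a proof, but it shows the left and right quotients settling on different values. The helper name one_sided_quotients is hypothetical.

```python
def one_sided_quotients(f, a, h=1e-6):
    """Return the difference quotients of f at a from the left and from the right."""
    left = (f(a - h) - f(a)) / (-h)
    right = (f(a + h) - f(a)) / h
    return left, right


def absolute_value(x):
    return abs(x)


def piecewise(x):
    """The function of Example 12: 2x for x < 1 and 3x - 1 for x >= 1."""
    return 2 * x if x < 1 else 3 * x - 1


abs_left, abs_right = one_sided_quotients(absolute_value, 0.0)
pw_left, pw_right = one_sided_quotients(piecewise, 1.0)

print("|x| at 0:        left", abs_left, " right", abs_right)
print("piecewise at 1:  left", round(pw_left, 6), " right", round(pw_right, 6))
```

When the two quotients approach different values, as they do here, the derivative does not exist at that point even though the function is continuous there.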

Example 13. Is the following function differentiable at x = 0?

\[ f(x) = \begin{cases} x^2, & x < 0 \\ x^3, & x \ge 0 . \end{cases} \]

Solution: This function is differentiable at x = 0 because for \(\Delta x > 0\), \(f(\Delta x) - f(0) = (\Delta x)^3\)
while for \(\Delta x < 0\), \(f(\Delta x) - f(0) = (\Delta x)^2\), and both of these, when divided by \(\Delta x\), still
approach zero as \(\Delta x \to 0\).

Example 14. Is the following function differentiable at x = 1?

\[ f(x) = \begin{cases} x^2, & x < 1 \\ 2x, & x \ge 1 . \end{cases} \]

Solution: This function is not differentiable at x = 1 because it is not continuous there. By
definition f(1) = 2, and if x approaches 1 from the left, f(x) → 1.

EXERCISES.
1. Suppose that when x is the distance of highway (in miles) outfitted with safety
median barriers, the cost is \(C(x) = 3{,}000{,}000 + 1{,}000x + 0.03x^2\) dollars (the \(x^2\) term is due
to procurement costs rising with demand for the needed hardware). Suppose that 6,000
miles are currently being outfitted. (a) Approximately how much does it cost to outfit
an additional 50 miles? (b) Approximately how many additional miles can be outfitted for
100,000 dollars?
2. Suppose that when x is the distance of highway (in miles) outfitted with safety
median barriers, the cost is \(C(x) = 3{,}000{,}000 + 1{,}000x + 0.03x^2\) dollars (the \(x^2\) term is due
to procurement costs rising with demand for the needed hardware). Suppose that 12,000
miles are currently being outfitted. (a) Approximately how much does it cost to outfit an
additional 50 miles? (b) Approximately how many additional miles can be outfitted for
100,000 dollars?
3. Calculate the differential of \(f(x) = \sqrt{x} + 0.7x^3\) at x = 1.
4. Calculate the differential of \(f(x) = \sqrt{x} + 0.7x^3\) at x = 4.
5. Suppose that m(x) is the amount (in dollars) that a consumer is willing to pay for
x pounds of rice. Suppose \(m(50) = 70\) and \(m'(50) = 1.30\). (a) Interpret this information
in words. (b) How much is the consumer willing to pay per pound of rice, on average,
when she or he buys 50 pounds? (c) Approximately how much is the consumer willing to pay
for a total of 53 pounds?
6. Suppose that m(x) is the amount (in dollars) that a consumer is willing to pay for
x pounds of rice. Suppose \(m(50) = 70\) and \(m'(50) = 1.30\). In this example the average
amount per pound is higher than the marginal amount. Does this make sense? Explain.
7. Suppose that for a certain brand of sunglasses the number of sunglasses sold was
50,000 and was rising by 200 sunglasses per month while the price was $40 and was falling
by 2 dollars per month. Use this to approximate the rate at which revenue was changing.
8. Suppose that for a certain brand of sunglasses the number of sunglasses sold was
50,000 and was rising by s sunglasses per month while the price was $40 and was falling by
2 dollars per month. Calculate the least value of s (added sales) needed to keep revenue
from falling.
9. Suppose that the amount of an item produced with 100 hours of labor is \(f = 10x^{0.4}\),
where x measures the expenditure on materials. Suppose x changes by 1 percent. By
approximately what percentage does production change? Suggestion: the change in x is
dx ≈ 0.01 x. The percentage change in f is 100 df/f.
10. Suppose that the amount of an item produced with 100 hours of labor is
\(f = 10x^{0.7}\), where x measures the expenditure on materials. Suppose x changes by 1 percent.
By approximately what percentage does production change?
11. The frequency of a stringed instrument is \(f = 44\sqrt{t}\), where t is the tension in the
string. Suppose that the current frequency is 437 (Hz) and the desired frequency is 440
(Hz). By what percentage should the tension be changed to reach the desired frequency?
12. The frequency of a stringed instrument is \(f = k/\sqrt{L}\), where L is the length at
which the string is held. Suppose that the current frequency is 494 (Hz) and the desired
frequency is 523 (Hz). By approximately what percentage should the length be shortened
to reach the desired frequency?
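13. Is the following function continuous at x = 0? Is it differentiable at x = 0?

\[ f(x) = \begin{cases} x, & x < 0 \\ x^3, & x \ge 0 . \end{cases} \]

14. Is the following function continuous at x = 0? Is it differentiable at x = 0?

\[ f(x) = \begin{cases} x^2, & x < 0 \\ -x^2, & x \ge 0 . \end{cases} \]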

15. In this exercise we will use linearization to take two steps in calculating an
approximate solution to \(x^3 + x^2 + 0.3 = 0\). Set \(f(x) = x^3 + x^2 + 0.3\). (a) Calculate
f(−1). (b) Calculate the differential of f at −1 and use it to approximate the change in
x (from −1) for which the change in f is −0.3. (c) Calculate the differential of f at −1.3
and use it to approximate the change in x (from −1.3) for which the change in f is 0.207.
9. Further Uses of Differentiation

A. Derivatives of Higher Order

Since the derivative of a function can be viewed as a function, one can try to differen-
tiate the new function. If the new function is differentiable, then this works. We will see
a bit later that the second derivative describes the way in which the graph of a function
curves, and is useful in deciding maximum and minimum values for a function.

Example 1. Calculate the second derivative of \(f(x) = x^3\).

Solution: The first derivative is \(f'(x) = 3x^2\) so the second derivative is \(6x\).

Notation: we write the second derivative of f(x) with respect to x as

\[ \frac{d^2 f}{dx^2}(x) = \frac{d}{dx} f'(x) = f''(x) . \]

Similar notation is used for the third derivative of a function, and even higher order
derivatives can be taken.

Example 2. Calculate the third derivative of \(f(x) = x^7\).

Solution: The first derivative is \(f'(x) = 7x^6\) so the second derivative is \(f''(x) = 42x^5\), and
the third derivative is \(f'''(x) = 210x^4\).

Example 3. Find the fourth order derivative of \(f(x) = x^4\).

Solution: The first derivative is \(f'(x) = 4x^3\), the second derivative is \(f''(x) = 12x^2\), the
third derivative is \(f'''(x) = 24x\), and

\[ \frac{d^4 f}{dx^4}(x) = 24 . \]
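
Repeated differentiation of a polynomial is mechanical enough to automate. The Python sketch below stores a polynomial as a list of coefficients (an illustrative representation, with coeffs[k] holding the coefficient of x^k) and applies the power rule term by term. The function names are hypothetical.

```python
def differentiate(coeffs):
    """Power rule on a coefficient list: the term c*x**k becomes k*c*x**(k-1)."""
    derivative = [k * c for k, c in enumerate(coeffs)][1:]
    return derivative or [0]    # the derivative of a constant is 0


def nth_derivative(coeffs, n):
    """Differentiate n times in a row."""
    for _ in range(n):
        coeffs = differentiate(coeffs)
    return coeffs


x_to_the_7 = [0] * 7 + [1]      # represents x**7
x_to_the_4 = [0] * 4 + [1]      # represents x**4

third_of_x7 = nth_derivative(x_to_the_7, 3)     # coefficients of 210*x**4
fourth_of_x4 = nth_derivative(x_to_the_4, 4)    # the constant 24

print(third_of_x7)
print(fourth_of_x4)
```

Differentiating once more sends the constant 24 to 0, so every derivative of x^4 beyond the fourth is zero.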

Example 4. Sketch the graph of \(f(x) = x^3\) and its first and second derivatives.
Solution: The first derivative is \(f'(x) = 3x^2\), and the second derivative is \(f''(x) = 6x\).

[Figure: graphs of y = f(x), y = f'(x), and y = f''(x) for x between −2 and 2.]

Example 5. Find the first and second derivatives of

\[ f(x) = \begin{cases} x^2, & x \ge 0 \\ -x^2, & x < 0 . \end{cases} \]

Solution: The first derivative is

\[ f'(x) = \begin{cases} 2x, & x \ge 0 \\ -2x, & x < 0 . \end{cases} \]

Notice that this derivative makes sense for \(x \ne 0\), and that \(f'(0) = 0\) actually holds,
agreeing with the formula.
For \(x \ne 0\), we can use the power rule again to find \(f''(x)\). However, at x = 0 the
second derivative on the left is −2 while on the right it is +2, so \(f''(0)\) does not exist. In
conclusion, for \(x \ne 0\)

\[ f''(x) = \begin{cases} 2, & x > 0 \\ -2, & x < 0 . \end{cases} \]

EXERCISES.
1. Calculate the first two derivatives of \(f(x) = x^{0.31} + x^{4.2}\). What are the domains
for each of these functions?
2. Calculate the first three derivatives of \(f(x) = x^{11} + x^{-2}\). What are the domains
for each of these functions?
3. Calculate the first two derivatives of \(f(x) = x^5\) and sketch the graphs of these
functions.
4. Calculate the first two derivatives of \(f(x) = x^2 |x|\). What are the domains for each
of these functions?
5. Calculate the first two derivatives of \(f(x) = x\,|x|\). What are the domains for each
of these functions?
6. Find the first and second derivatives of

\[ f(x) = \begin{cases} x^4, & x \ge 0 \\ -x^3, & x < 0 . \end{cases} \]

7. Find the first and second derivatives of

\[ f(x) = \begin{cases} x^{3.4}, & x \ge 0 \\ x^4, & x < 0 . \end{cases} \]

B. Implicit Differentiation: a use of the chain rule

In some instances the relation between two variables, say x and y, does not give one
variable in terms of the other. It is still possible in such cases to find the derivatives, but
the derivatives may involve both variables.

Example 1. Find the derivative of y with respect to x when x = 2 and y = 1 and
\(y^5 - y^2 + x^3 + x = 10\).
Solution: The idea is to take the derivative with respect to x on both sides of the equation,
and then solve for \(y'\) in terms of x and y. Recall that \(y'\) is the derivative of y with respect
to x and that every term is being differentiated with respect to x (not with respect to y).

\[ 5y^4 \cdot y' - 2y \cdot y' + 3x^2 + 1 = 0 . \]

We gather the terms involving \(y'\) and solve:

\[ 5y^4 \cdot y' - 2y \cdot y' = -3x^2 - 1, \qquad (5y^4 - 2y)\,y' = -3x^2 - 1, \qquad y' = \frac{-3x^2 - 1}{5y^4 - 2y} . \]

With x = 2 and y = 1 we get \(y' = -13/3\).

Example 2. Find the derivative of y with respect to x when x = 1 and \(y^3 + x^3 + x = 10\).

Solution: Two different approaches work here. We can solve for y as a function of x and
then differentiate or we can differentiate implicitly.
When x = 1 we have \(y^3 + 1^3 + 1 = 10\) or \(y^3 = 8\), so y = 2. Differentiating implicitly
we get

\[ 3y^2 \cdot y' + 3x^2 + 1 = 0, \qquad 3(2^2)\,y' + 3(1^2) + 1 = 0, \qquad 12y' + 4 = 0, \qquad y' = -1/3 . \]

Alternatively, for any x we have \(y^3 = 10 - x - x^3\), so \(y = (10 - x - x^3)^{1/3}\), and

\[ y' = \frac{1}{3}(10 - x - x^3)^{-2/3}(-1 - 3x^2), \]

and for x = 1,

\[ y' = \frac{1}{3}(10 - 1 - 1^3)^{-2/3}(-1 - 3) = \frac{1}{3}\, 8^{-2/3}(-4) = \frac{-4}{3 \cdot (8^{1/3})^2} = \frac{-4}{3 \cdot 4} = -\frac{1}{3} . \]
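
The agreement between the two approaches in Example 2 can be confirmed numerically. In the Python sketch below, implicit_slope solves the differentiated equation for y', and the explicit formula for y is differentiated with a central difference. The names and the step size h are illustrative choices.

```python
def implicit_slope(x, y):
    """Solve 3*y**2 * y' + 3*x**2 + 1 = 0 for y'."""
    return -(3 * x**2 + 1) / (3 * y**2)


def explicit_y(x):
    """y as a function of x, from y**3 = 10 - x - x**3 (valid near x = 1)."""
    return (10 - x - x**3) ** (1 / 3)


x0 = 1.0
y0 = explicit_y(x0)             # the cube root of 8

slope_from_implicit = implicit_slope(x0, y0)

h = 1e-6
slope_from_explicit = (explicit_y(x0 + h) - explicit_y(x0 - h)) / (2 * h)

print(slope_from_implicit, slope_from_explicit)
```

Both values are close to −1/3, as in the text. The implicit formula has the advantage that it still works when the equation cannot be solved for y.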

Example 3. Suppose that when the amount of labor provided per week is x (hours
of labor) and the monetary investment is y (dollars) the quantity produced is \(17x^{1/3}y^{2/3}\)
(items). Suppose that currently x = 1000 and y = 8000. At what rate can monetary
investment be reduced if additional labor is available (but production is to remain at the
same level)?
Solution: We see how the quantities change when x changes. Since production is kept
constant,

\[ \frac{d}{dx}\left(17x^{1/3}y^{2/3}\right) = 0, \quad\text{so}\quad 17\left(\frac{1}{3}x^{-2/3}y^{2/3} + \frac{2}{3}x^{1/3}y^{-1/3}\,y'\right) = 0 , \]

and for x = 1000 and y = 8000,

\[ \frac{1}{3}\,1000^{-2/3}\,8000^{2/3} + \frac{2}{3}\,1000^{1/3}\,8000^{-1/3}\,y' = 0, \quad\text{or}\quad \frac{1}{3}\cdot\frac{1}{100}\cdot 400 + \frac{2}{3}\cdot 10\cdot\frac{1}{20}\cdot y' = 0 , \]

and we conclude that \(y' = -4\). In practical terms this says that for every additional hour
of labor the investment can be reduced by about 4 dollars.
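
One way to test the conclusion of Example 3 is to add one hour of labor, remove 4 dollars of investment, and see whether production stays (nearly) the same. The Python sketch below does this. The formula y' = −y/(2x) used in investment_rate comes from solving the differentiated equation for y', and the function names are illustrative.

```python
def production(x, y):
    """Items produced with x hours of labor and y dollars of investment."""
    return 17 * x ** (1 / 3) * y ** (2 / 3)


def investment_rate(x, y):
    """y' along a level curve: solving the differentiated equation gives -y / (2x)."""
    return -y / (2 * x)


x0, y0 = 1000, 8000
rate = investment_rate(x0, y0)

before = production(x0, y0)
after = production(x0 + 1, y0 + rate)   # one more hour, 4 dollars less

print(f"rate: {rate} dollars per hour")
print(f"production before: {before:.3f}, after: {after:.3f}")
```

Production changes by only a few hundredths of an item out of 68,000, which is what one expects when moving a short distance along the tangent to the level curve.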

Example 4. Find the equation of the line tangent to \(x^4 + x^2 + y^4 = 3\) at (1, −1).

Solution: Two different approaches work here. We can solve for y as a function of x, being
careful to include the negative branch of the fourth root, and then differentiate or we can
differentiate implicitly. The second is easier, so we’ll take that approach.
Differentiating implicitly we get

\[ 4x^3 + 2x + 4y^3 \cdot y' = 0, \quad\text{so}\quad 4 + 2 - 4y' = 0, \quad 6 = 4y', \quad\text{and}\quad y' = 3/2 . \]

Now that we know the slope, the equation of the tangent line is y = 1.5x − 2.5 (because
the line includes y = −1 when x = 1).

EXERCISES.
1. Find the derivative of y with respect to x when x = −3 and y = 2 and
\(y^4 + 2y^2 - 2x^2 = 6\).
2. Find the derivative of y with respect to x when x = 1 and y = −1 and
\(y^4 + y^3 + x^3 - x = 0\).
3. Find the derivative of y with respect to x when x = 0 and y = 1 and \(y^6 + y^2 + 7x = 2\).
4. Find the derivative of y with respect to x when x = 0 and \(y^3 + y + 5x + x^2 = 2\).
Suggestion: One value of y that satisfies this equation (with x = 0) is y = 1. Factor
\(y^3 + y - 2\) to show there is no other.
5. Suppose that the height along a mountain at the location with coordinates x and
y is \(x^2 + y^2\). Suppose that a person is currently located at x = 10 and y = 5 and wants
to move but remain at the same height. At what rate should y vary with x so the person
remains at the same level?
6. Suppose that when the amount of labor provided per week is x (hours) and the
monetary investment is y (dollars) the quantity produced is \(x^{1/3}y^{2/3}\) (items). Suppose
that currently x = 8000 and y = 1000. At what rate can monetary investment be reduced
if additional labor is available (and production is to remain at the same level)?
7. Suppose that when the amount of labor provided per week is x (hours) and the
monetary investment is y (dollars) the quantity produced is \(100x^{1/2}y^{1/2}\). Suppose that
currently x = 900 and y = 4000. At what rate can monetary investment be reduced if
additional labor is available (and production is to remain at the same level)?
8. Suppose that when the amount of labor provided per week is x (hours) and the
monetary investment is y (dollars) the quantity produced is \(100x^{1/2}y^{1/2}\). Suppose that
currently x = 4000 and y = 900. At what rate can monetary investment be reduced if
additional labor is available (and production is to remain at the same level)?
9. Find the equation of the line tangent to \(x^3 + x^2 + y^4 = 3\) at (1, 1).
10. Find the equation of the line tangent to \(x^4 + 5x + y^7 + 3y = 30\) at (2, 1).
√
11. Find the equation of the line tangent to x4 + x2 + 3y + y = 15 at (1, 4).

C. Related Rates: another use of the chain rule

In the 20th century, the amount of carbon dioxide in the Earth’s atmosphere was
changing in a way that related to the size of the human population. It makes sense, then,
that if we know the rate of increase in the population then we would also know the rate
of increase in carbon dioxide.
This is an example of related rates. Because the amount of carbon dioxide is related
to the size of the population, the rate of change in carbon dioxide over time is related to
the rate of change in the population over time.

Example 1. Suppose that \(y = x^2\) and x depends on t. It is known that when x = 7
the rate of change of x with t is dx/dt = 5. Find the rate of change of y with t.
Solution:

\[ \frac{dy}{dt} = \frac{dy}{dx} \cdot \frac{dx}{dt} = 2x \cdot \frac{dx}{dt} \]

and for x = 7, dy/dt = 2 × 7 × 5 = 70.

Example 2. The radius of a disk is 11 feet and is increasing at 3 feet per minute. How
quickly is the area of the disk increasing?
Solution: Let r be the radius of the disk, A its area, and let t be time. The information
gives dr/dt = 3, and for disks \(A = \pi r^2\). We want to find dA/dt. So we calculate that

\[ \frac{dA}{dt} = \frac{dA}{dr} \cdot \frac{dr}{dt} = 2\pi r \cdot 3 \]

and since r = 11 the area is changing at a rate dA/dt = 66π square feet per minute.

Example 3. A 17 foot ladder leans against a wall and its foot is sliding away along
the ground. Assume that the ground is perpendicular to the wall and that the foot of the
ladder is moving away from the wall at 3 feet per second. How quickly is the top of the
ladder descending when the base of the ladder is 10 feet away from the wall?
Solution: Let h denote the height of the top of the ladder and let x be the distance of
the foot of the ladder from the wall. Let t denote time. We want to find dh/dt. Then
\(h^2 + x^2 = 17^2\), dx/dt = 3, and x = 10. We use implicit differentiation with respect to t
(because it is easier) to find that

\[ 2h\,\frac{dh}{dt} + 2x\,\frac{dx}{dt} = 0, \quad\text{so}\quad \frac{dh}{dt} = -\frac{2x}{2h}\,\frac{dx}{dt} . \]

It turns out that we need the value of both x and h. Since x = 10, \(h^2 + 10^2 = 17^2 = 289\)
gives \(h^2 = 189\) and \(h = \sqrt{189}\). Thus \(dh/dt = -30/\sqrt{189} \approx -2.182\) feet per second.
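
The ladder computation can be checked by writing the height as an explicit function of time and differentiating numerically. The Python sketch below assumes, for illustration only, that the foot keeps moving at a constant 3 feet per second, so that x(t) = 10 + 3t with t = 0 the moment of interest. The names are hypothetical.

```python
import math

LADDER_LENGTH = 17.0


def foot_distance(t):
    """Distance of the foot from the wall, assuming constant speed (an assumption)."""
    return 10.0 + 3.0 * t


def top_height(t):
    """Height of the top of the ladder, from h**2 + x**2 = 17**2."""
    return math.sqrt(LADDER_LENGTH**2 - foot_distance(t) ** 2)


# Rate from implicit differentiation: dh/dt = -(x / h) * dx/dt
formula_rate = -(10.0 / math.sqrt(189.0)) * 3.0

# Rate from a central difference quotient at t = 0
step = 1e-6
numeric_rate = (top_height(step) - top_height(-step)) / (2 * step)

print(f"formula: {formula_rate:.4f} ft/s, numerical: {numeric_rate:.4f} ft/s")
```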

Example 4. A company estimates that the weekly cost and revenue (both in dollars)
when x items are produced and sold are

\[ \text{cost} = C(x) = 30000 + 4x + x^2, \qquad \text{revenue} = R(x) = 700x - x^2 . \]

Find the rate at which profit is changing when there are 100 items produced (each week)
and the number of items produced is rising by 3 items per week.
Solution: Profit is f(x) = R(x) − C(x). Let t denote time (in weeks). Then

\[ \frac{df}{dt} = \frac{dR}{dt} - \frac{dC}{dt} = (700 - 2x)\frac{dx}{dt} - (4 + 2x)\frac{dx}{dt} = (696 - 4x)\frac{dx}{dt} = 296 \times 3 = 888 . \]

The units here would be dollars per week.

Example 5. A reservoir has a square bottom that is 400 feet by 400 feet. The sides
of the reservoir slope at a ratio of 1 (the width at the side increases by the same amount as
the height). Suppose water is entering the reservoir at 300 cubic feet per second (the flow
rate of a small stream). How quickly is the water in the reservoir rising when the water is
10 feet deep?
Solution: We know the rate at which the volume of water in the reservoir is changing over
time. This and the shape of the reservoir should allow us to find how quickly the water
level rises. We choose t as time, x as the length of a side of the water surface, h as the depth
of the water in the reservoir, and v for the volume of water in the reservoir. We have

\[ x = 400 + 2h, \quad\text{and}\quad \frac{dv}{dt} = 300 . \]

We want to find dh/dt when h = 10.

We write everything in terms of h and v, and use the volume of a pyramid to calculate v.
If the sloping sides are extended downward they meet at a point 200 feet below the bottom,
so the water fills a pyramid of height 200 + h with a pyramid of height 200 removed:

\[ v = \frac{1}{3}(200 + h)\,x^2 - \frac{1}{3}\cdot 200 \cdot 400^2 , \]

\[ v = \frac{1}{3}(200 + h)\bigl(2(200 + h)\bigr)^2 - \frac{1}{3}\cdot 200 \cdot (2 \cdot 200)^2 = \frac{4}{3}(200 + h)^3 - \frac{4}{3}\,200^3 . \]

And for h = 10,

\[ \frac{dv}{dh} = 4(200 + h)^2 = 4 \cdot 210^2 = 176400, \qquad \frac{dv}{dh}\cdot\frac{dh}{dt} = \frac{dv}{dt} = 300, \quad\text{so}\quad 176400\,\frac{dh}{dt} = 300 . \]

We can finally solve for dh/dt to get dh/dt = 300/176400 = 1/588 feet per second.
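
The reservoir calculation reduces to dividing the inflow by dv/dh. The Python sketch below carries this out and also checks that dv/dh equals the area of the water surface, (400 + 2h)^2, which is a convenient way to sanity-check the volume formula. The names are illustrative.

```python
def volume(h):
    """Volume of water at depth h: a pyramid of height 200 + h minus one of height 200."""
    return (4 / 3) * (200 + h) ** 3 - (4 / 3) * 200**3


def dv_dh(h):
    """Derivative of the volume with respect to depth."""
    return 4 * (200 + h) ** 2


inflow = 300        # cubic feet per second
depth = 10          # feet

surface_area = (400 + 2 * depth) ** 2       # area of the square water surface
rise_rate = inflow / dv_dh(depth)           # dh/dt = (dv/dt) / (dv/dh)

print(f"dv/dh = {dv_dh(depth)} square feet, surface area = {surface_area} square feet")
print(f"dh/dt = {rise_rate:.6f} feet per second")
```

That dv/dh matches the surface area is no accident: a thin layer of water of thickness dh has volume approximately equal to (surface area) × dh.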

EXERCISES.
1. Suppose that \(y = x^{3/2}\) and x depends on t. It is known that when x = 4 the rate
of change of x with t is dx/dt = 5. Find the rate of change of y with t.
2. Suppose that \(y = 3x + x^2\) and x depends on t. It is known that when x = 4 the
rate of change of x with t is dx/dt = 5. Calculate the rate of change of y with t.
3. The side of a cube has length 5 and is increasing at 1 foot per minute. How quickly
is the volume of the cube increasing?
4. The side of a cube has length 5 and is increasing at 1 foot per minute. How quickly
is the surface area of the cube increasing?
5. A company estimates that the weekly cost and revenue (in dollars) when x items
are produced and sold are

\[ \text{cost} = C(x) = 100{,}000 + 80x + 0.002x^2, \qquad \text{revenue} = R(x) = 800x - 0.03x^2 . \]

Find the rate at which profit is changing when there are 1000 items produced (each week)
and the number of items produced is rising by 5 items per week.
6. As above, a company estimates that the weekly cost and revenue when x items are
produced and sold are

\[ \text{cost} = C(x) = 100{,}000 + 80x + 0.002x^2, \qquad \text{revenue} = R(x) = 800x - 0.03x^2 . \]

Suppose there are 1000 items produced (each week). What rate of increase in sales is
required so that profits rise by (about) 5,000 dollars per week?
7. As above, a company estimates that the weekly cost and revenue when x items are
produced and sold are

\[ \text{cost} = C(x) = 100{,}000 + 80x + 0.002x^2, \qquad \text{revenue} = R(x) = 800x - 0.03x^2 . \]

Suppose there are 800 items produced (each week). What rate of increase in sales is
required so that profits rise by (about) 5,000 dollars per week?
8. A reservoir is in the shape of a cone with its vertex at the lowest point. The height
of the reservoir is 60 feet and the radius of the disk at the top of the reservoir is 20 feet.
Suppose water is entering the reservoir at 100π cubic feet per second. How quickly is the
water in the reservoir rising when the water is 20 feet deep?
9. As above, a reservoir is in the shape of a cone with its vertex at the lowest point.
The height of the reservoir is 60 feet and the radius of the disk at the top of the reservoir
is 20 feet. Suppose water is entering the reservoir at 100π cubic feet per second. How
quickly is the water in the reservoir rising when the water is 40 feet deep?
10. The unit of volume of a certain liquid increases with temperature according to
\(v = 1 + 0.01h + 0.00002h^3\), where h is the temperature in degrees centigrade. At what
rate is a unit of the liquid’s volume increasing with time if the temperature is 10 degrees
and is rising at 2 degrees per hour?

D. Behavior of Functions: increasing, decreasing, critical points, and concavity

We will soon be interested in understanding the behavior of functions in context. We
prepare for this by discussing the techniques that allow us to describe such behavior.

A function is called increasing on an interval if when the independent variable
increases, the function increases as well. More formally, if \(x_1 \le x_2\) then \(f(x_1) \le f(x_2)\).

A function is called decreasing on an interval if when the independent variable
increases, the function decreases in value. That is, if \(x_1 \le x_2\) then \(f(x_1) \ge f(x_2)\).

Note that the independent variable is always increasing and that the question is
whether the function increases or decreases with it.

In addition, in symbolic terms, a function f(x) is strictly increasing if whenever
\(x_1 < x_2\) we have \(f(x_1) < f(x_2)\), and a function f(x) is strictly decreasing if whenever
\(x_1 < x_2\) we have \(f(x_2) < f(x_1)\).

Example 1. Find the intervals on which \(f(x) = x^2\) is increasing.

Solution: Suppose that \(x_1 < x_2\). If these values are both negative, then \(|x_1| > |x_2|\), so
\(x_1^2 > x_2^2\) and \(f(x_1) > f(x_2)\), so f is decreasing on the interval x < 0. If \(x_1\) and \(x_2\) are both
positive, then \(x_1^2 < x_2^2\) and \(f(x_1) < f(x_2)\), so f is increasing for x > 0. When \(x_1 < 0 < x_2\)
the relative sizes of \(f(x_1)\) and \(f(x_2)\) are not determined.

The derivative is particularly well suited for deciding whether a function is increasing
or decreasing near a particular value x = a. Suppose f is differentiable at a.
If \(f'(a) > 0\), then f is increasing near a.
If \(f'(a) < 0\), then f is decreasing near a.

Example 2. Find the intervals on which \(f(x) = x^3 - 12x\) is increasing.

Solution: Consider \(f'(x) = 3x^2 - 12\). Since \(f'(x) = 3(x^2 - 4) = 3(x + 2)(x - 2)\) we see that
this derivative changes signs at x = −2 and at x = 2. We have \(f'(x) > 0\) for x < −2 and
for x > 2 and f is increasing on those intervals.
Discussion: it is sometimes useful to keep track of the behavior of the function on the
number line representing the domain. Below is the sketch for the last example.

[Figure: number line from −5 to 5, marked increasing for x < −2, decreasing for −2 < x < 2, and increasing for x > 2.]
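
The sign chart for Example 2 can be produced by evaluating the derivative at one test point in each interval between the critical numbers. The Python sketch below does this; the test points −3, 0, and 3 are arbitrary choices inside the three intervals, and the names are illustrative.

```python
def f_prime(x):
    """Derivative of f(x) = x**3 - 12*x."""
    return 3 * x**2 - 12


critical_numbers = [-2, 2]
intervals = ["x < -2", "-2 < x < 2", "x > 2"]
test_points = [-3, 0, 3]        # one point chosen inside each interval

behavior = [
    "increasing" if f_prime(point) > 0 else "decreasing"
    for point in test_points
]

for interval, label in zip(intervals, behavior):
    print(f"{interval:>12}: {label}")
```

One test point per interval is enough because the derivative is continuous here and can only change sign at a critical number.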

Example 3. Find the intervals on which \(f(x) = x^4 - x^2\) is increasing.

Solution: Here \(f'(x) = 4x^3 - 2x = 2x(2x^2 - 1)\) and this derivative changes signs at \(x = -1/\sqrt{2}\),
at x = 0, and at \(x = 1/\sqrt{2}\). We have \(f'(x) > 0\) for \(-1/\sqrt{2} < x < 0\) and for \(x > 1/\sqrt{2}\) and f is
increasing on those intervals.

Example 4. Find the intervals on which \(f(x) = x^{3/2} - x\) is increasing.

Solution: Notice that the domain of this function is x ≥ 0. Here \(f'(x) = (3/2)x^{1/2} - 1\) and
this derivative is zero for x = 4/9. We have \(f'(x) \le 0\) for 0 < x ≤ 4/9 and \(f'(x) > 0\) for
x > 4/9 and f is increasing on the interval x > 4/9.

Example 5. When the quantity manufactured is x, the price of each item is
p = 400 − 12x. For what quantities is revenue increasing?
Solution: Set R = revenue, and note that x ≥ 0. Here \(R(x) = x(400 - 12x) = 400x - 12x^2\)
and the marginal revenue is \(R'(x) = 400 - 24x\). The marginal revenue is positive for
x < 50/3 and revenue is increasing on 0 < x < 50/3.

Definition. A value c of x is a critical number, or sometimes a critical point, for the
function f when either \(f'(c) = 0\) or when \(f'(c)\) is not defined. The idea is that in either of
these situations, the sign of \(f'(x)\) could change at x = c.

Example 6. Find the critical numbers for

\[ f(x) = x - \frac{1}{x} . \]

Solution: \(f'(x) = 1 + 1/x^2\) so \(f'(x)\) is never zero. However, f is not defined at x = 0 so
this is a critical number. The sketch below is a graph for this function. Notice that as
\(x \to 0^-\) the function gets very large, while as \(x \to 0^+\) the function gets very negative.

[Figure: graph of y = x − 1/x for x between −2 and 2.]

Example 7. Find the critical numbers for

\[ f(x) = \frac{x^2}{1 + x^2} . \]

Solution:

\[ f'(x) = \frac{2x(1 + x^2) - x^2 \cdot 2x}{(1 + x^2)^2} = \frac{2x}{(1 + x^2)^2} . \]

Hence \(f'(x) = 0\) only for x = 0. Both f and \(f'\) are defined for all x. The only critical
number is x = 0.

Critical numbers are useful in finding the maximum and minimum values of a function.
This makes differentiation an extremely powerful tool in optimization and in any theory
regarding the best fit of functions or statistics to data. Here we explore the basic ideas
involved in finding a maximum or minimum value of a function either relative to nearby
values or on an interval.

Example 8. Find the critical numbers and the minimum and maximum values for
f (x) = x4 − x2 on the interval [−2, 2].
Further Uses of Differentiation 169

We calculate the derivative and set it equal to zero.

df 1
(x) = 4x3 − 2x = 0 so , x = 0 or x2 = .
dx 2
√ √
We consider f (−2) = 12, f (−1/ 2) = −1/4, f (0) = 0, f (1/ 2) = −1/4, and f (2) = 12.
The conclusion is that f has a minimum value of −1/4 and a maximum value of 12 on the
interval [−2, 2].
Discussion: The reader is invited to sketch the graph of this function and notice
the minimum and maximum values, as well as the horizontal tangents to the graph at
$x = -1/\sqrt{2}$, $x = 0$, and $x = 1/\sqrt{2}$.
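
The comparison in Example 8 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the names `f`, `candidates`, and `values` are illustrative choices. It evaluates f at the endpoints of the interval and at the critical numbers, then picks the smallest and largest values.

```python
import math

def f(x):
    # The function from Example 8.
    return x**4 - x**2

# Endpoints of [-2, 2] together with the critical numbers,
# which are the solutions of 4x^3 - 2x = 0.
candidates = [-2.0, -1 / math.sqrt(2), 0.0, 1 / math.sqrt(2), 2.0]
values = {x: f(x) for x in candidates}

minimum = min(values.values())  # close to -1/4, up to rounding
maximum = max(values.values())  # attained at both endpoints
print(minimum, maximum)
```

On a closed interval it is enough to compare these finitely many values, which is why no second derivative is needed here.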

Example 9. Find the critical numbers for $f(x) = x^3$ and decide the minimum and
maximum values of $f(x)$ on $[-2, 4]$.
The function $f(x) = x^3$ has derivative $3x^2$, which is zero only at $x = 0$, so $x = 0$ is the
only critical number. The derivative is positive for every other x, so f is always increasing. It has a minimum
value of $f(-2) = -8$ and a maximum value of $f(4) = 64$ on $[-2, 4]$.

Definition. When describing the features of functions, we say that f (x) has a local max-
imum at x = c when f (c) ≥ f (x) for x near c. We say that f (x) has a local minimum
at x = c when f (c) ≤ f (x) for x near c. We say that f (x) has an absolute maximum at
x = c when f (c) ≥ f (x) for all x in the domain of f , and we say that f (x) has an absolute
minimum at x = c when f (c) ≤ f (x) for all x in the domain of f . An absolute maximum
or minimum is also called a global maximum or minimum.

Example 10. We analyzed $f(x) = x^2/(1 + x^2)$ in a previous example. It has a local
and global minimum at $x = 0$ and it has no local or global maximum.
Discussion: Notice that if x is very close to zero the dominant term in the denominator
is the 1 and so the function is very close to $x^2/1 = x^2$ for x near 0. This shows that $f(0) = 0$
is a local minimum. If $x^2$ is large (that is, x is far from 0), then the ratio of the numerator
to the denominator is close to 1 but never reaches this value.

Example 11. Find the local and global minimum and maximum for $f(x) = x^3 - x^2$.
Solution: Here $f'(x) = 3x^2 - 2x = x(3x - 2)$, so the critical points are at $x = 0$ and at $x = 2/3$, and $f(0) = 0$ while $f(2/3) =
-4/27$. As x gets very large so does $f(x)$, and as x gets very negative so does $f(x)$. So
this function has a local minimum at $x = 2/3$ and a local maximum at $x = 0$. However
$f(x)$ does not have an absolute minimum or an absolute maximum.
Discussion: sketch the graph of this function and notice the features mentioned above.

The slope tells us the direction at which a curve is “moving”. The rate at which the
slope is changing tells us how the curve is turning. The name usually used for this turning
is concavity.

Example 12. The prototype for a function that has a critical point at which the slope
is turning upward (as the independent variable increases) is $f(x) = x^2$. This function has
a local minimum at the critical point.

Example 13. The prototype for a function that has a critical point at which the slope
is turning downward (as the independent variable increases) is $f(x) = -x^2$. This function
has a local maximum at the critical point.

Definition. A function f(x) is concave up at $x = a$ when its second derivative is positive
there and is concave down when its second derivative is negative there:

$$\frac{d^2 f}{dx^2}(a) > 0 \ \text{ means concave up,} \qquad \frac{d^2 f}{dx^2}(a) < 0 \ \text{ means concave down.}$$

Example 14. Find the concavity of $f(x) = x^4 - x^2$ at its critical points.

We already found that the critical points are at $x = -1/\sqrt{2}$, $x = 0$, and $x = 1/\sqrt{2}$.
The second derivative of f is $f''(x) = 12x^2 - 2$. So $f''(-1/\sqrt{2}) = 4$, $f''(0) = -2$, and
$f''(1/\sqrt{2}) = 4$ from which we conclude that the graph is concave up at $x = -1/\sqrt{2}$ and at
$x = 1/\sqrt{2}$ and concave down at $x = 0$.

Example 15. Find the concavity of $f(x) = x^3$.

The second derivative of f is $f''(x) = 6x$. So the graph is concave down for $x < 0$ and
concave up for $x > 0$.

A function f(x) has an inflection point at $x = a$ when its second derivative changes
signs as x passes through a. That is, $f''(a) = 0$ and the graph is concave up on one side
of $x = a$ and concave down on the other.

Example 16. Find the concavity and inflection points of $f(x) = 4x^5 - 5x^4$.
Solution: The second derivative of f is $f''(x) = 20x^2(4x - 3)$. So the graph is concave
down for $x < 3/4$ and concave up for $x > 3/4$. Even though the second derivative is zero
at $x = 0$, the only inflection point is at $x = 3/4$.
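
The distinction in Example 16 between a zero of the second derivative and an actual inflection point can be illustrated with a short Python sketch. The helper name `changes_sign` and the step size `h` are assumptions made here for illustration; they do not come from the text.

```python
def second_derivative(x):
    # f''(x) for f(x) = 4x^5 - 5x^4, in the factored form from Example 16.
    return 20 * x**2 * (4 * x - 3)

def changes_sign(g, a, h=1e-3):
    # True when g has opposite signs just to the left and just to the right of a.
    return g(a - h) * g(a + h) < 0

inflection_at_zero = changes_sign(second_derivative, 0.0)
inflection_at_three_quarters = changes_sign(second_derivative, 0.75)
print(inflection_at_zero, inflection_at_three_quarters)
```

Sampling at a small distance on each side is only a heuristic, but it is enough to see that the sign does not change at x = 0 while it does change at x = 3/4.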

Example 17. Graph a differentiable function f (x) whose domain is the entire line so
that f is always increasing and has an inflection point at x = 2.
Solution:

[Figure: graph of an increasing function with an inflection point at $x = 2$, drawn for $-5 \le x \le 5$.]

One formula that works for this is $f(x) = (x - 2)^3$. The function used in the graph is
a different one.

Example 18. Graph a differentiable function f(x) with the following features: f is
decreasing for $x < 1$, increasing for $1 < x < 4$, and decreasing for $x > 4$. It has inflection
points at $x = 2$ and at $x = 6$.
[Figure: graph of a function with the listed features, drawn for $0 \le x \le 6.4$.]

Notice that f must be concave up at $x = 1$ and concave down at $x = 4$ and so it must
be concave up for $x > 6$.

Example 19. Find where f(x) is increasing, its concavity and its inflection points,
given that its derivative is $x^3 - 3x^2$.

Solution: The derivative is $x^3 - 3x^2 = x^2(x - 3)$, which is zero at $x = 0$ and at $x = 3$. Considering the signs of the
derivative we see that f(x) is increasing for $3 < x$. The second derivative of f is $f''(x) =
3x^2 - 6x$ so it is zero for $x = 0$ and for $x = 2$. This second derivative is negative only for
$0 < x < 2$. Thus the graph of f is concave up for $x < 0$ and for $x > 2$ and concave down
for $0 < x < 2$. The concavity changes at $x = 0$ and at $x = 2$, so these are the inflection points.
Discussion: The reader is invited to graph a function with the features described.
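
A sign table is one way to organize the reasoning in Example 19. The Python sketch below is illustrative only: the sample points and the helper `sign` are choices made here, with one sample taken from each interval between the zeros of the first and second derivatives.

```python
def fprime(x):
    # The derivative given in Example 19.
    return x**3 - 3 * x**2

def fsecond(x):
    # Its derivative, the second derivative of f.
    return 3 * x**2 - 6 * x

def sign(v):
    return "+" if v > 0 else "-" if v < 0 else "0"

# One sample point in each of (-inf, 0), (0, 2), (2, 3), (3, inf).
samples = [-1, 1, 2.5, 4]
table = [(x, sign(fprime(x)), sign(fsecond(x))) for x in samples]
for row in table:
    print(row)
```

The first sign column shows f increasing only beyond x = 3, and the second shows the concavity switching at x = 0 and x = 2.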

Example 20. The total cost of producing x units in a particular type of facility is
reported as $C(x) = x^2 + 30x + 15{,}129$. Find the average cost per unit produced when
the total number produced is x (this is a measure of the efficiency of a facility of the
size required to maintain this batch size). For which values of x is this cost increasing?
decreasing? What batch size minimizes average cost?
Solution: The average cost is $A(x) = C(x)/x = x + 30 + 15{,}129/x$. The rate of change of
average cost as the batch size is changed is

$$\frac{dA}{dx} = 1 - \frac{15{,}129}{x^2}$$

so the average cost is decreasing until $x^2 = 15{,}129$, that is for $0 < x < 123$, and increasing
for $x > 123$. The average cost is minimized if $x = 123$ units are produced.
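
As a quick check of Example 20, the following Python sketch searches whole-number batch sizes for the smallest average cost. The search range 1 to 500 is an arbitrary assumption made for illustration.

```python
def average_cost(x):
    # A(x) = C(x) / x with C(x) = x^2 + 30x + 15129, as in Example 20.
    return (x**2 + 30 * x + 15129) / x

# Brute-force search over whole-number batch sizes (range chosen for illustration).
best = min(range(1, 501), key=average_cost)
print(best, average_cost(best))
```

The search agrees with the calculus: the best batch size is the square root of 15,129.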

EXERCISES.
1. Find the intervals on which $f(x) = x^3 + 6x$ is increasing.
2. Find the intervals on which $f(x) = x^4 - 12x$ is increasing.
3. Find the intervals on which $f(x) = x^3 + 6x^2 - x$ is increasing.
4. Find the critical numbers for $f(x) = \dfrac{x + 2}{x^2 - 2x + 3}$.
5. Find the critical numbers for $f(x) = x^4 - x^3 - x^2$.
6. Find the critical numbers for $f(x) = x^7 + x^5 + x^3$.
7. Find the critical numbers for $f(x) = \dfrac{x^2 + 3x}{1 + x^2}$.
8. Find the critical numbers for $f(x) = x^3 + x^2$ and decide the minimum and maximum
values of $f(x)$ on $[-2, 1]$.

9. Find the critical numbers for $f(x) = x^3 + x^2$ and decide the minimum and maximum
values of $f(x)$ on $[-2, 0]$.
10. Find the critical numbers for $f(x) = x^3 + x^2$ and decide the minimum and
maximum values of $f(x)$ on $[-0.5, 0.2]$.
11. Find where $f(x) = x^5 - x^3$ is increasing, where it is decreasing, its concavity, and
its inflection points.
12. Find the critical points for $f(x) = x^3 - 4x^2 + x$ and decide the concavity there.
13. Find the critical points for $f(x) = x^5 + x^3 - 2x$ and decide the concavity there.
Suggestion: consider the equation for critical points as an equation in $x^2$.
14. Find where f(x) is increasing, its concavity, and its inflection points, given that
its derivative is $3x + 4$.
15. Find where f(x) is increasing, its concavity, and its inflection points, given that
its derivative is $x^2 - 4x$.
16. Graph a differentiable function f (x) whose domain is the entire line so that f is
always increasing and has an inflection point at x = 5.
17. Graph a differentiable function f (x) whose domain is the entire line so that f is
increasing for x < 2, decreasing for x > 2, and has an inflection point at x = 4.
18. Suppose a differentiable function f (x) whose domain is the entire line is decreasing
for x < −1, and has an inflection point at x = −1. Can f be decreasing for x > −1? can
it be increasing for x > −1?
19. Suppose a differentiable function f (x) whose domain is the entire line is decreasing
for x < −1, and has an inflection point at x = −1. Is it possible that f is bounded? That
is, that there is a number b so that for any value of x, −b < f (x) < b?
10. Optimization and Further Analysis

One of the main goals of this course is to develop the mathematical instruments that
allow one to decide what quantity maximizes or minimizes an objective function. The
modeling described here is often used in making decisions.

A. Models Using Optimization.

Example 1. Maximization of profits.

Suppose that a firm wishes to maximize profits. Suppose that when q units are
produced and sold, the total revenue is $5q$ and the total cost is $q^3 - 2{,}000q^2 + 100{,}000q$.
We assume that the selling price is fixed (at 5 dollars per unit) and that the cost rises
fairly sharply for small quantities and for large quantities while increasing more slowly for
moderate quantities.
The objective here is to maximize profit, and profit is the total revenue less the total
cost. Here is a summary:
q = quantity sold. The values that make sense are q ≥ 0.
f = profit.
$f = 5q - q^3 + 2{,}000q^2 - 100{,}000q$.
Find q so that f is maximized.
To analyze the profit function f, we first collect terms. We have
$f = -q^3 + 2{,}000q^2 - 99{,}995q$.
Since this is a cubic function with a negative leading coefficient, profits first fall, then
rise, and finally fall again. So the values of q for which profit might be maximized are the
initial profit, at q = 0, before profit falls at the beginning, and the value of q at the end of
the rise (before profits fall at the end).
Follow the discussion: Sketch a graph of this profit, f , as a function of the quantity,
q, and identify the two points where profit might be maximized.
To calculate the value of q at the end of the rise in profit, notice that while the profit
is rising $df/dq > 0$. For larger values of q, where profit is falling, $df/dq < 0$. Since the
derivative is continuous, the profit is instantaneously neither rising nor falling at the point
$q_b$ when

$$\frac{df}{dq}(q_b) = 0.$$

To find the value of $q_b$ we calculate:

$$\frac{df}{dq}(q) = -3q^2 + 4{,}000q - 99{,}995, \quad\text{so}\quad -3q_b^2 + 4{,}000q_b - 99{,}995 = 0.$$

Using the quadratic formula we find not one but two points, namely

$$q_a = \frac{1}{6}\left(4000 - \sqrt{14{,}800{,}060}\right) \approx 25.486, \qquad q_b = \frac{1}{6}\left(4000 + \sqrt{14{,}800{,}060}\right) \approx 1307.847.$$

Follow the discussion: Why is the point with profit increasing on the left and decreasing
on the right the second of the two points we found?
Finally, we can decide the maximum profit. At the beginning, $q = 0$ and $f(0) = 0$.
At the end of the rise in profit, $q = q_b$ and $f(q_b) \approx f(1307.847) \approx 1{,}053{,}124{,}472.78$.
The maximum profit is obtained when between 1307 and 1308 units are produced and the
maximum profit is approximately 1,053,124,472.78.
Notice that the marginal cost is

$$MC = \frac{d}{dq}\left(q^3 - 2{,}000q^2 + 100{,}000q\right) = 3q^2 - 4{,}000q + 100{,}000$$

so the marginal cost at $q_b = (4000 + \sqrt{14{,}800{,}060})/6$ is

$$MC(q_b) = \frac{3}{36}\left(4000^2 + 8000\sqrt{14{,}800{,}060} + 14{,}800{,}060\right) - 4000 \times \frac{1}{6}\left(4000 + \sqrt{14{,}800{,}060}\right) + 100{,}000$$

$$= -99{,}995 + 100{,}000 = 5.$$
In other words, the marginal cost equals the price of an item, which is the amount of
revenue generated by the additional item.
Follow the discussion: Often people say that to maximize profit the marginal cost
must equal the marginal revenue. Why does this make sense? Specifically, if a producer’s
marginal revenue is larger than her or his marginal cost, what would you expect the
producer to do?
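
The numbers in Example 1 can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic only; the function names `profit` and `marginal_cost` are chosen here and are not part of the text.

```python
import math

def profit(q):
    # f = -q^3 + 2000 q^2 - 99995 q, the collected profit function.
    return -q**3 + 2000 * q**2 - 99995 * q

def marginal_cost(q):
    # Derivative of the total cost q^3 - 2000 q^2 + 100000 q.
    return 3 * q**2 - 4000 * q + 100000

# Quadratic formula applied to -3q^2 + 4000q - 99995 = 0.
disc = 4000**2 - 4 * 3 * 99995
q_a = (4000 - math.sqrt(disc)) / 6
q_b = (4000 + math.sqrt(disc)) / 6

print(disc, round(q_a, 3), round(q_b, 3))
print(profit(q_b), marginal_cost(q_b))
```

At q_b the marginal cost comes out equal to the selling price of 5 (up to rounding), which is the marginal revenue in this model.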

Example 2. Artificially simple example.

To capture the technical features of the optimization process, consider the silly example
of finding the minimum value of $g(x) = 3x^2 + 4x + 5$. (This example is silly because it
is easily solved with algebra alone.)
This function is a quadratic with a positive leading coefficient, so it is decreasing for
values of x to the left of the value of x at which the minimum occurs, and increasing for
x to the right of the value of x at which the minimum occurs. Denote the value of x at
which the minimum occurs by $x_m$. Then

$$\frac{dg}{dx}(x) < 0 \ \text{ when } x < x_m, \quad\text{and}\quad \frac{dg}{dx}(x) > 0 \ \text{ when } x > x_m.$$

As in example 1, the value of x which minimizes g is found by solving

$$\frac{dg}{dx}(x_m) = 0, \quad\text{so}\quad 6x_m + 4 = 0$$

which gives us $x_m = -2/3$ and the minimum value of g is $g(-2/3) = 11/3$.

Example 3. Quality Control and Sample Size.

Assume that a manufacturer has its product inspected with a frequency x times per
day. Suppose that the cost of each inspection is c (for some c > 0). When the number
of inspections increases, the chance of catching a problem with production also increases.
Suppose, then, that the number of defective units sold is $y = 100/x^2$. Suppose also that
the cost of having the defective units, in replacement and reduced reputation, is $L \cdot y$.
The total cost of inspections and their consequences is $T = c \cdot x + 100L/x^2$. The rate
at which total cost changes with the number of inspections is

$$\frac{dT}{dx} = c - \frac{200L}{x^3}.$$

Clearly, $x = 0$ leads to unbounded total costs, so $x \ge 1$. The total cost stops decreasing
when its rate of change reaches zero, that is, when $c - 200L/x^3 = 0$. This will be the number of inspections with the
minimum total cost. The optimal number of inspections is thus $x = (200L/c)^{1/3}$.
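
To see the formula of Example 3 at work, here is a small Python sketch. The parameter values c = 25 and L = 1000 are hypothetical, chosen only so that the answer is a round number; they do not appear in the text.

```python
def total_cost(x, c, L):
    # T = c*x + 100*L / x^2, the total cost from Example 3.
    return c * x + 100 * L / x**2

def optimal_inspections(c, L):
    # Solution of c - 200*L / x^3 = 0.
    return (200 * L / c) ** (1 / 3)

# Hypothetical parameter values, for illustration only.
c, L = 25, 1000
x_star = optimal_inspections(c, L)
print(x_star, total_cost(20, c, L))
```

With these assumed values the optimum is about 20 inspections per day, and the total cost at 20 is lower than at 19 or 21.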

Example 4. Labor versus leisure.

Consider a person who is deciding how much time to spend working. In this model
we will assume that both work and leisure contribute to the person’s utility and that the
person will maximize utility. Here is our formulation of the variables and their relations:
n= number of hours worked per day.
w=hourly wage (dollars per hour worked).
p=price of material goods (dollars per generic item).
v=material income (number of generic items earned per day).
r=number of hours spent on rest and recreation per day.
u=person’s utility for real income and rest/recreation.

$$v = (w/p) \cdot n$$

We assume the person has 16 hours per day for these activities and that

$$u(n, r) = v^{0.55} r^{0.45} = \left(\frac{w}{p}\right)^{0.55} n^{0.55} r^{0.45}.$$

Let us make some observations about this optimization problem. The trade-off is
between the utility due to work, and the goods that it enables the person to have, and the
utility due to leisure activities. Utility is not infinite because there are only 16 hours per
day, so the person has to trade one activity for the other. As a quick exploration, if n = 0
then r = 16 and utility is u(0, 16) = 0. Similarly u(16, 0) = 0. When 0 < n < 16 then also
0 < r < 16 and u > 0.
Clearly the number of hours cannot be negative, and we have assumed that n+r = 16,
so utility, in fact, is determined once the number of hours worked, n, is chosen. Here is
our new formulation of the variables and their relations:
n= number of hours worked per day. 0 < n < 16.
r=number of hours spent on rest and recreation per day. r = 16 − n.
u=person’s utility for work and rest/recreation.
w/p=material income per hour worked.
$$u(n) = \left(\frac{w}{p}\right)^{0.55} n^{0.55} r^{0.45} = \left(\frac{w}{p}\right)^{0.55} n^{0.55} (16 - n)^{0.45}.$$

At the value of n that maximizes u, we will again have u increasing from the left and
decreasing to the right. (Recall that as a function of n alone u(0) = u(16) = 0 while u is
positive for 0 < n < 16.) The values of n at which u might have a maximum are those
values of n that satisfy

$$\frac{du}{dn}(n) = 0, \quad\text{so}\quad 0.55\,n^{-0.45}(16 - n)^{0.45} + 0.45\,n^{0.55}(16 - n)^{-0.55}(-1) = 0.$$

Follow the discussion: The factor w/p is part of the derivative. Why is w/p not involved
in the equation above?
To simplify the last equation, multiply by $n^{0.45}(16 - n)^{0.55}$ and collect the coefficients
to get

$$0.55(16 - n) - 0.45n = 0, \quad\text{so}\quad n = 0.55 \times 16 = 8.8.$$

Since we already knew that utility had a maximum and since there is only one candidate
value for this maximum we can conclude that utility reaches the maximum value of $u =
(w/p)^{0.55} \cdot 8.8^{0.55}(16 - 8.8)^{0.45} \approx 8.04\,(w/p)^{0.55}$ when the person works $n = 8.8$ hours a day and spends $r = 7.2$
hours a day on rest and recreation.
What is most believable about example 4 is that people do make decisions of the
type described. What is least believable is that the person’s utility takes this particular
functional form with the specific parameters assigned. Because consumer behavior plays
an important role in financial and policy decisions, models of consumer behavior and
experiments for determining the parameters of such models are a subject of hundreds
of journal articles each year. The analysis of these models uses the type of technique
illustrated in this example, and some of the techniques of multivariable calculus.
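
The result of Example 4 can be cross-checked without calculus by evaluating the utility on a fine grid. The Python sketch below assumes w/p = 1 (the parameter `wage_ratio` is a name introduced here) and a grid step of 0.01 hours; both are illustrative choices.

```python
def utility(n, wage_ratio=1.0):
    # u(n) = (w/p)^0.55 * n^0.55 * (16 - n)^0.45, with r = 16 - n.
    r = 16 - n
    return (wage_ratio * n) ** 0.55 * r ** 0.45

# Candidate work hours 0.01, 0.02, ..., 15.99 (step chosen for illustration).
grid = [k / 100 for k in range(1, 1600)]
best_n = max(grid, key=utility)
print(best_n, utility(best_n))
```

Because w/p only rescales the utility by a positive constant, changing `wage_ratio` changes the maximum value but not the maximizing number of hours.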

Example 5. Scheduling Service.

An office manager is deciding on a service schedule for printers. The decision amounts
to choosing between reliability and the cost of service.
Suppose that an unscheduled maintenance costs 300 dollars (due to the cost of ob-
taining emergency service and the cost of making up for the work lost while the printer
was not available). Suppose that scheduled maintenance costs 50 dollars (there is no loss
of work because the maintenance is scheduled while the office is not using the printers).
Assume that the average weekly cost of having a service interval of length s weeks is

$$c(s) = \frac{50}{s} + \frac{300s}{1 + s}.$$

Our strategy for minimizing the cost is to first find values of s for which the cost is
(instantaneously) neither increasing nor decreasing, and then to analyze what occurs near
these values. Here is the summary of our model so far:
s =service interval (in weeks).
c = cost of scheduled and unscheduled maintenance (in dollars).
c(s) = 50/s + 300s/(1 + s).
Goal: minimize c.
The values of s for which cost is neither increasing nor decreasing satisfy

$$\frac{dc}{ds}(s) = 0 \;\Rightarrow\; \frac{-50}{s^2} + \frac{300}{(1 + s)^2} = 0.$$

After some simplification one obtains $5s^2 - 2s - 1 = 0$ and $s_a = (1 + \sqrt{6})/5$ and $s_b =
(1 - \sqrt{6})/5$.
We need to decide what happens to the cost near $s = 0$ (since the range of s is $s \ge 0$)
and near $s_a \approx 0.69$ (why are we ignoring $s_b$?).

As $s \to 0$, $c(s) \to \infty$, so the minimum is attained (if at all) for $s > 0$. Consider the
cost near $s_a = (1 + \sqrt{6})/5$. If the cost at $s_a$ is a minimum, then for $s < s_a$ we must have
$c(s) > c(s_a)$ so the slope of $c(s)$ is negative near $s_a$ for $s < s_a$. Additionally, if the cost at
$s_a$ is a minimum, then for $s > s_a$ we must have $c(s) > c(s_a)$ so the slope of $c(s)$ is positive
near $s_a$ for $s > s_a$. Hence the slope goes from negative to zero to positive as s increases
through the value $s_a$. In other words, the slope is increasing near $s_a$. Therefore the second
derivative at $s_a$ must be positive if $c(s)$ at $s_a$ is a minimum. Let us check.

$$\frac{d^2c}{ds^2}(s) = \frac{100}{s^3} - \frac{600}{(1 + s)^3}, \qquad \frac{d^2c}{ds^2}\left(\frac{1 + \sqrt{6}}{5}\right) = \frac{12500\,(210 + 60\sqrt{6})}{(1 + \sqrt{6})^3 (6 + \sqrt{6})^3} \approx 180.21 > 0.$$

This calculation tells us that when we consider values of s near $s = (1 + \sqrt{6})/5$ the cost is
smallest at $s_a$. Since we already know that this is the only candidate point for a minimum,
we need look no further.
We conclude that the cost is minimized when the printers are serviced approximately
every 0.69 weeks (or about 1.45 times a week, or about every 3.45 work days, or between 75
and 76 times a year). The minimum weekly cost (of scheduled service and the consequences
of unscheduled maintenance) is approximately $c \approx 194.95$ dollars.
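
The figures quoted in Example 5 can be verified with a short Python sketch. The function names are chosen here for illustration; the formulas are the ones derived in the example.

```python
import math

def cost(s):
    # c(s) = 50/s + 300 s / (1 + s), the average weekly cost.
    return 50 / s + 300 * s / (1 + s)

def cost_second_derivative(s):
    # d^2c/ds^2 = 100/s^3 - 600/(1 + s)^3.
    return 100 / s**3 - 600 / (1 + s) ** 3

# Positive root of 5s^2 - 2s - 1 = 0.
s_a = (1 + math.sqrt(6)) / 5

print(round(s_a, 2), round(cost(s_a), 2), round(cost_second_derivative(s_a), 2))
print(round(1 / s_a, 2), round(5 * s_a, 2), round(52 / s_a, 1))
```

The second line converts the interval into services per week, work days between services (assuming a 5-day week), and services per year (assuming 52 weeks).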

Process Summary: Maximization and Minimization.

1. Choose variables, determine the relevant values of these variables, and determine
relations among the variables. State the objective.
2. Formulate the objective either as optimizing a function of one variable, or as
optimizing a function of several variables with a set of constraints.
3. Find the value(s) of the variable(s) at which the optimum occurs. (When we cannot
get an exact value for the critical point, or when we know the accuracy needed in advance,
we would use numerical methods to approximate the value of the critical point.)
4. Find the optimal value of the objective function (if this makes sense in the setting)
and report the results (the values of the variables and the optimal value) in a manner
consistent with the original description of the setting.

Technical Summary: Local Maximization and Minimization.

A function f(x) has a critical point at a value $x = c$ when $f'(c) = 0$. This critical
point is a local minimum when $f''(c) > 0$. This critical point is a local maximum
when $f''(c) < 0$. If $f''(c) = 0$, then the critical point may be either a local minimum or
a local maximum or neither (there is not sufficient information in the first two derivatives
to decide).

Memory aid: A simple example of a minimum is $f(x) = x^2$ at $c = 0$ where $f'(0) = 0$,
and $f''(0) = 2 > 0$. A simple example of a maximum is $f(x) = -x^2$ at $c = 0$ where
$f'(0) = 0$, and $f''(0) = -2 < 0$.
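
The second derivative test in the technical summary amounts to a three-way decision, which the Python sketch below spells out. The function name and the returned labels are illustrative; the third case uses $f(x) = x^3$ at $c = 0$, where both derivatives vanish.

```python
def second_derivative_test(f_second_at_c):
    # Classify a critical point c (where f'(c) = 0) from the value of f''(c).
    if f_second_at_c > 0:
        return "local minimum"
    if f_second_at_c < 0:
        return "local maximum"
    return "inconclusive"

# f(x) = x^2 at c = 0 has f''(0) = 2; f(x) = -x^2 has f''(0) = -2;
# f(x) = x^3 has f''(0) = 0, so the test cannot decide.
results = [second_derivative_test(v) for v in (2, -2, 0)]
print(results)
```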

Example 6. Find and determine the type of the critical points of $f(x) = x^3 + x^2 -
3x + 4$. Here $f'(x) = 3x^2 + 2x - 3$ and $f'(x) = 0$ for

$$x_1 = \frac{-2 - \sqrt{2^2 - 4 \cdot 3 \cdot (-3)}}{2 \cdot 3} = \frac{-2 - \sqrt{40}}{6}, \qquad x_2 = \frac{-2 + \sqrt{40}}{6}.$$

The second derivative is $f''(x) = 6x + 2$ and $f''(x_1) = -\sqrt{40} < 0$ while $f''(x_2) = \sqrt{40} > 0$.
We conclude that f has a local maximum at $x_1$, while f has a local minimum at $x_2$.
For polynomial functions it is easier to find critical points when we have common
factors.

Example 7. Find the critical points of $f(x) = (x - 1)^3 (x + 2)^2$. Here

$$f'(x) = 3(x - 1)^2 (x + 2)^2 + 2(x - 1)^3 (x + 2)$$

$$= (x - 1)^2 (x + 2)\bigl(3(x + 2) + 2(x - 1)\bigr) = (x - 1)^2 (x + 2)(5x + 4).$$

The critical points are $x_1 = 1$, $x_2 = -2$ and $x_3 = -4/5$.
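
Example 7 stops at locating the critical points. As an optional follow-up, the Python sketch below classifies them by looking at the sign of the factored derivative on each side, in the spirit of the slope argument used in Example 5. The helper `classify` and the step `h` are assumptions made for illustration.

```python
def fprime(x):
    # Factored derivative from Example 7.
    return (x - 1) ** 2 * (x + 2) * (5 * x + 4)

def classify(a, h=1e-3):
    # Compare the slope just left and just right of the critical point a.
    left, right = fprime(a - h), fprime(a + h)
    if left > 0 > right:
        return "local maximum"
    if left < 0 < right:
        return "local minimum"
    return "neither"

results = {a: classify(a) for a in (-2.0, -0.8, 1.0)}
print(results)
```

At x = 1 the factor $(x - 1)^2$ does not change sign, so the slope is positive on both sides and the critical point is neither a maximum nor a minimum.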

EXERCISES.
1. Suppose that a firm wishes to maximize profits. Suppose that when q units are
produced and sold, the total revenue is $25q$ and the total cost is $q^3 - 2{,}000q^2 + 100{,}000q$.
How many units should the firm produce?
2. Find the minimum value of $g(x) = x^4 + 32x + 5$.
3. A manufacturer of LEDs tests its product in batches of size n. Suppose that the
cost of producing each item is 0.30 dollars, and suppose that the cost of each test is 35
dollars (per batch). Suppose that the probability of one LED being faulty is p = 0.002.
Suppose also that the whole test batch is discarded if one LED in the batch is faulty
(and that if the test is passed then all the LEDs can be sold). (a) Calculate the expected
number of LEDs that are ready for sale, from a batch of size n using this testing scheme.
(b) Calculate the total cost of producing and testing the LEDs in the batch. (c) Calculate
the cost per item for each of the LEDs that is ready for sale.
4. A manufacturer of LEDs tests the product in batches of size n as above (with the
cost of production being 0.30 dollars per LED and the cost of a test being 35 dollars per

batch, and the probability of an LED being faulty at 0.002 also as above). What batch
size minimizes the cost of producing an LED that is ready for sale?
5. A manufacturer of LEDs tests its product in batches of size n. Suppose that the
cost of producing each item is c, and suppose that the cost of each test is T (c > 0 and
T > 0, and it makes sense to assume that T is significantly larger than c). Suppose that
the probability of one LED being faulty is p > 0. Suppose that the whole test batch is
discarded if one LED in the batch is faulty and that otherwise the LEDs are all ready for
sale. What batch size minimizes the cost of producing an LED that is ready for sale?
6. As above, a manufacturer of LEDs tests its product in batches of size n. Suppose
the cost of production for one LED is 0.30 dollars and the cost of a test is 35 dollars,
and the probability of an LED being faulty is 0.002. Suppose that an investment in new
technology (of a fixed amount) could yield a 5 percent reduction in production cost (per
item) or a 10 percent reduction in testing costs (per batch). Which improvement is more
beneficial?
7. Because the air in Earth’s atmosphere is thinner at higher altitudes, airplanes that
fly at higher altitudes experience less drag and save energy on that portion of the flight.
However, getting to high altitude requires greater energy. Suppose that the energy used to
lift a plane to a height h (miles above the surface of the Earth) is $g = 2000h$, and that the
energy used to travel a distance of 1000 miles at height h is $f = (643 \times 10^{11}) \cdot (4000 + h)^{-2}$.
At what elevation should this plane fly to minimize the total energy used?
8. Suppose, as above, that the energy used to lift a plane to a height h (miles) is
$g = 2000h$. Suppose the energy used to travel a distance of 2000 miles at height h is
$f = (1286 \times 10^{11}) \cdot (4000 + h)^{-2}$, that is, the energy used for travel is proportional to the
distance traveled. At what elevation should this plane fly to minimize the total energy
used?
9. A common model of production assumes that the amount produced is

$$m(c, l) = A \cdot c^{\alpha} l^{1-\alpha}, \qquad 0 < \alpha < 1.$$

Here c represents the total amount of capital (the investment in machinery and raw materials)
and l represents the total amount of labor (the investment in hours of work and
training). A depends on the specific industry involved and on the units used. Suppose that
for a particular industry $\alpha = 0.5$ and that for a particular manufacturer $c + 12l = 240$ is
the total invested. What combination of c and l yields the greatest amount produced?
10. Suppose production is as described above, with $m(c, l) = A \cdot c^{\alpha} l^{1-\alpha}$, $0 < \alpha < 1$,
but now with $\alpha = 0.4$ and $c + 12l = 240$. What combination of c and l yields the greatest
amount produced?
11. Suppose production is as described above, with $m(c, l) = A \cdot c^{\alpha} l^{1-\alpha}$, $0 < \alpha < 1$,
but now with $\alpha = 0.6$ and $c + 12l = 240$. What combination of c and l yields the greatest
amount produced?
12. Suppose production is as described above, with $m(c, l) = A \cdot c^{\alpha} l^{1-\alpha}$, $0 < \alpha < 1$,
but now with $\alpha = 0.5$ and $c + 12l = 300$. What combination of c and l yields the greatest
amount produced?
13. Suppose production is as described above, with $m(c, l) = A \cdot c^{\alpha} l^{1-\alpha}$, $0 < \alpha < 1$,
but now with $\alpha = 0.5$ and $c + 10l = 240$. What combination of c and l yields the greatest
amount produced?
14. Suppose production is as described above, with $m(c, l) = A \cdot c^{\alpha} l^{1-\alpha}$, $0 < \alpha < 1$,
and with the budget $c + 12l = B$. (a) Review the results of the previous problems on
production and decide how $\alpha$ influences the importance of capital and labor (how the
amount allocated to the two varies with $\alpha$). (b) How does the budget, B, influence the
importance of capital and labor?
15. Find and determine the type of the critical points of $f(x) = 2x^3 - 5x^2 + 3x + 5$.
16. Find and determine the type of the critical points of $f(x) = -2x^3 + 5x^2 - 3x - 11$.
17. Find and determine the type of the critical points of $f(x) = 2x^3 + x^2 + 3x + 5$.
18. Find and determine the type of the critical points of $f(x) = 2x^3 - 5x^2 + 4x + 5$.
19. A car has a service interval of 5,000 miles. The cost of this service is 50 dollars.
If a longer service interval of length m is used, then the average cost of the eventual repair
needed is $100(m - 5{,}000)^2$. What service interval minimizes the cost of having the car?
Suggestion: the cost to be minimized should be the cost per mile (of driving).
20. Find the critical points of $f(x) = (x - 1)^4 (x - 2)^3 (x - 3)^2$.
21. Find the critical points of $f(x) = (x - 1)^4 (x^3 + x^2)$ and determine their type.
Suggestion: the second derivative test might fail at some of these but you should still be
able to decide whether the point in question gives a minimum or a maximum.
22. Suppose that $f(x) = c_0 + c_1 x + c_2 x^2 + \ldots + c_n x^n$ for some positive integer n.
How many critical points might f have? Suggestion: what is the highest power of the
polynomial for $df/dx$?
23. An ambulance station will be placed somewhere along the road connecting two
rural communities. The response time to an emergency call is r = 2 + l, where r is

measured in minutes and l is the distance in miles. Suppose the distance between the two
communities is 6 miles (by road), and that the larger community has 300 residents while
the smaller has 100 residents. Where should the station be located to minimize the average
response time? You should assume that the number of emergency calls is proportional to
the population size.
24. As in the previous question, an ambulance station will be placed somewhere along
a 6 mile road connecting two communities, A and B. The response time to an emergency
call is r = c+m·l. Suppose that community A has 3 times as many residents as community
B. Where should the station be located to minimize the average response time? You should
assume that the number of emergency calls is proportional to the population size.
25. As in the previous question, an ambulance station will be placed somewhere along
a 6 mile road connecting two communities, A and B. The response time to an emergency
call is r = c + m · l. Suppose that community A has k > 1 times as many residents as
community B. Where should the station be located to minimize the average response time?
You should assume that the number of emergency calls is proportional to the population
size.
26. Hiring a physician costs 600 dollars a day, and hiring a nurse costs 200 dollars a day.
When p physicians (doctors) and n nurses are hired, a clinic can serve c = 10p + n · p + 6n
clients each day. Suppose a clinic has a budget that averages 9,000 a day for these wages.
Which combination of nurses and physicians serves the greatest number of clients? Note
that since the hiring is a daily average, the results do not have to be integers.
27. As above, hiring a physician costs 600 dollars a day, and hiring a nurse costs 200
dollars a day. When p physicians and n nurses are hired, a clinic can serve c = 10p+n·p+6n
clients each day. Suppose a clinic has a budget averaging 9,100 a day for these wages.
Which combination of nurses and physicians serves the greatest number of clients?

B. Proportional Rates of Change

For any quantity q and a change in it by $\Delta q$, one can consider the relative change. This
is the change as a proportion of the original amount:

$$\text{relative change} = \frac{\Delta q}{q}.$$

The price elasticity of demand is the absolute value of the relative rate of change
in the quantity demanded (by consumers) as compared to the relative rate of change in
the price. For a quantity x and a price p this is:

$$\text{price elasticity of demand} \approx -\frac{(\Delta x)/x}{(\Delta p)/p} = -\frac{p/x}{(\Delta p)/(\Delta x)} \approx -\frac{p/x}{dp/dx}.$$

Example 1. Calculate the price elasticity of demand when 150 units are sold if the
price of a unit is $p = 1000 - x - 0.03x^2$ when x units are sold.
Solution: $dp/dx = -1 - 0.06x$. When 150 units are sold, the price is $p = 175$, and
$dp/dx = -10$. So the price elasticity of demand is

$$-\frac{p/x}{dp/dx} = -\frac{p}{x} \Big/ \frac{dp}{dx} = -\frac{175}{150} \Big/ (-10) = \frac{7}{60}.$$

Example 2. Calculate the price elasticity of demand when 3000 units are sold if the
price of a unit is $p = 200 - 0.05x$ when x units are sold.
Solution: $dp/dx = -0.05$. When 3000 units are sold, the price is $p = 50$. So the price
elasticity of demand is

$$-\frac{p/x}{dp/dx} = -\frac{p}{x} \Big/ \frac{dp}{dx} = -\frac{50}{3000} \Big/ (-0.05) = \frac{1}{3}.$$

Example 3. Calculate the price elasticity of demand when 100 units are sold if the
price of a unit is $p = 5000 - 20x$ when x units are sold.
Solution: $dp/dx = -20$. When 100 units are sold, the price is $p = 3000$. So the price
elasticity of demand is

$$-\frac{p/x}{dp/dx} = -\frac{p}{x} \Big/ \frac{dp}{dx} = -\frac{3000}{100} \Big/ (-20) = 1.5.$$
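
The three elasticity calculations above follow one pattern, which the Python sketch below captures in a single function. The name `elasticity` and the use of lambda expressions for the price and its derivative are illustrative choices, not notation from the text.

```python
def elasticity(price, dprice, x):
    # Price elasticity of demand: -(p/x) / (dp/dx), evaluated at x units.
    return -(price(x) / x) / dprice(x)

# Example 1: p = 1000 - x - 0.03 x^2 at x = 150.
e1 = elasticity(lambda x: 1000 - x - 0.03 * x**2, lambda x: -1 - 0.06 * x, 150)
# Example 2: p = 200 - 0.05 x at x = 3000.
e2 = elasticity(lambda x: 200 - 0.05 * x, lambda x: -0.05, 3000)
# Example 3: p = 5000 - 20 x at x = 100.
e3 = elasticity(lambda x: 5000 - 20 * x, lambda x: -20, 100)

print(e1, e2, e3)
```

The printed values should agree with 7/60, 1/3, and 1.5 up to floating-point rounding.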

Follow the discussion: When the price elasticity of demand is less than 1, then in
relative terms the demand is falling more slowly than the price is rising. When the price
elasticity of demand is greater than 1, then in relative terms the demand is falling more
quickly than the price is rising.
The relative rate of change of a population of size P when it changes by ∆P is the
quantity ∆P/P . Similarly, the relative rate of change of capital K when it changes by ∆K
is the quantity ∆K/K. Suppose now that at time t a country has population P , capital
K and total income I. Personal income is thus I/P and per capita capital is K/P .

Example 4. Suppose that per capita income increases over time exactly when per
capita capital increases over time. Show that per capita income increases exactly when
the proportional rate of growth of capital is higher than the proportional rate of growth
of the population.
Solution: Per capita capital is K/P, and it increases over time when

\[ \frac{d}{dt}\left(\frac{K}{P}\right) > 0, \quad\text{or}\quad \frac{P\,(dK/dt) - K\,(dP/dt)}{P^2} > 0. \]

Since the population is always positive in size, the last inequality is equivalent to
P (dK/dt) > K (dP/dt). Dividing both sides by P · K (which is positive) gives (dK/dt)/K > (dP/dt)/P, or
that the proportional rate of growth of capital is greater than the proportional rate of
growth of the population. By assumption, this is exactly when per capita income increases.

Example 5. Suppose that per capita income increases exactly when the proportional
rate of growth of capital is higher than the proportional rate of growth of the population.
Assume that a certain country has population P = 20 million, capital K = 500,000
million, and that dP/dt = 0.4 (million) while dK/dt = 5,000 (million). Is personal income
increasing in this country?
Solution: The proportional rates of growth are (dP/dt)/P = 0.4/20 = 0.02 (or 2 percent),
and (dK/dt)/K = 5,000/500,000 = 0.01 (or 1 percent). Capital is growing proportionally
more slowly than the population, so personal income is not increasing in this country.
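
The comparison in Example 5 can be sketched in a few lines of Python. The helper names `proportional_growth` and `per_capita_rising` are hypothetical, introduced only for this illustration; the numbers are those of Example 5.

```python
def proportional_growth(rate, level):
    """Proportional (relative) rate of growth: (dQ/dt) / Q."""
    return rate / level


def per_capita_rising(dK_dt, K, dP_dt, P):
    """True when capital grows proportionally faster than population."""
    return proportional_growth(dK_dt, K) > proportional_growth(dP_dt, P)


# Example 5 data, in millions: P = 20, K = 500,000, dP/dt = 0.4, dK/dt = 5,000
population_rate = proportional_growth(0.4, 20)       # about 0.02
capital_rate = proportional_growth(5000, 500000)     # about 0.01
print(population_rate, capital_rate)
print(per_capita_rising(5000, 500000, 0.4, 20))      # False
```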

EXERCISES.
1. The price of a unit is \(p = 1000 - x - 0.03x^2\) when x units are sold. Calculate the
price elasticity of demand when 100 units are sold.
2. The price of a unit is \(p = 1000 - x - 0.03x^2\) when x units are sold. Calculate the
price elasticity of demand when 160 units are sold.
3. The price of a unit is p = 5000 − 20x when x units are sold. (An example above
calculates the price elasticity of demand when 100 units are sold.) (a) Calculate the revenue
as a function of the number of units sold. (b) Calculate the number of units that maximize
revenue.
4. The price of a unit is p = 200 − 0.05x when x units are sold. (a) Calculate the
revenue as a function of the number of units sold. (b) Calculate the number of units that
maximize revenue.
5. Suppose that per capita income increases exactly when the proportional rate of
growth of capital is higher than the proportional rate of growth of the population. Assume
that a certain country has population P = 20 million, capital K = 200,000 million, and
that dP/dt = 0.4 (million per year) while dK/dt = 5,000 (million per year). Is personal
income increasing in this country?
6. Suppose that per capita income increases exactly when the proportional rate of
growth of capital is higher than the proportional rate of growth of the population. Assume
that a certain country has population P = 50 million, capital K = 500,000 million, and
that dP/dt = 0.4 (million/year) while dK/dt = 5,000 (million/year). Is personal income
increasing in this country?
7. Suppose that per capita income increases exactly when the proportional rate of
growth of capital is higher than the proportional rate of growth of the population. Assume
that a certain country has population P = 100 million, capital K = 50,000,000 million,
and that dP/dt = 0.8 (million/year) while dK/dt = 350,000 (million/year). Is personal
income increasing in this country?
8. Graph a plausible relationship between the proportional rate of growth of capital
(at a certain instant in time) and personal income (at that time). Suggestion: capital
grows when people save money.
9. Graph a plausible relationship between the proportional rate of growth of a country’s
population and personal income. Suggestion: consider that poor countries have, on the
whole, less health care and other services. How would people react to these conditions?
10. Suppose that per capita income increases exactly when the proportional rate of
growth of capital is higher than the proportional rate of growth of the population. Suppose
that people from a certain country work abroad and send money back to their home country
(such payments are called remittances). Does this increase or decrease the likelihood that
personal income in the home country increases? Explain.

C. Eventual Behavior.

It is often useful to understand the behavior of a function when one of the variables
(either the independent variable or the dependent variable) is large.

Example 1. The cost of producing x units in a year is \(C = 10x^2 + 100x + 16{,}000\).
What production level minimizes average cost?

Solution: Average cost is \(A(x) = 10x + 100 + 16{,}000/x\). Notice that as x approaches 0
(from above), the average cost gets very large, due to the term 16,000/x. We would write

\[ A(x) \to \infty \quad\text{as}\quad x \to 0^+ . \]

Also, as x gets very large, the average cost gets very large, due to the term 10x. We would
write

\[ A(x) \to \infty \quad\text{as}\quad x \to \infty . \]

These observations are useful because they make it plausible that the average cost truly
has a minimum for some positive value of production.
Calculating the critical points: \(dA/dx = 10 - 16{,}000/x^2 = 0\) for \(x^2 = 1600\), or x = 40 (since x > 0).
Hence we know that x = 40 is the number of units that minimizes the average cost of
production.
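
As a quick numerical check of Example 1, the following Python sketch evaluates the average cost at the critical point and at its integer neighbours. The function name `average_cost` and the variable `x_star` are assumptions made for this illustration.

```python
import math


def average_cost(x):
    """A(x) = C(x)/x for C(x) = 10 x^2 + 100 x + 16,000."""
    return 10 * x + 100 + 16000 / x


# Critical point from dA/dx = 10 - 16,000 / x^2 = 0
x_star = math.sqrt(16000 / 10)

print(x_star, average_cost(x_star))        # 40.0 900.0
print(average_cost(39), average_cost(41))  # both slightly above 900
```

Values on either side of x = 40 are larger, which agrees with the observation that A(x) grows without bound both as x approaches 0 from above and as x becomes large.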

Example 2. The value of a certain good to a consumer is described by the utility

\[ u(x) = k\left(\frac{x}{1+x}\right)^{1/2}, \qquad 0 < k < 1,\ x > 0, \]

with x being the amount of the good. What is the largest level of satisfaction derived from
this good?
Solution: We calculate the rate of change of utility with the good:

\[ \frac{du}{dx}(x) = k \cdot \frac{1}{2}\left(\frac{x}{1+x}\right)^{-1/2} \frac{1}{(1+x)^2}, \quad\text{so}\quad \frac{du}{dx} > 0. \]

It follows that the utility is always increasing. This means that the consumer derives more
utility from more of the good, which makes sense. As the amount increases,

\[ \lim_{x\to\infty} k\left(\frac{x}{1+x}\right)^{1/2} = k \cdot 1 = k. \]

The utility therefore approaches k but never attains it.

Example 3. Calculate the minimum and maximum values of the function
\(f(x) = x^2/(1 + x^2)\). Describe in words how this would be useful in graphing this function.

Solution: We have seen (and it is possible to check again) that f has its only critical
number at x = 0 and that f(0) = 0. It is clear that for \(x \neq 0\) both the numerator and the
denominator in f are positive, so f(x) > 0. Hence f(0) = 0 is the minimum value for f.
Since \(x^2 < 1 + x^2\), f(x) < 1. As \(x \to \infty\), the numerator and denominator are, relatively,
of the same size, and \(f(x) \to 1\). Similarly, as \(x \to -\infty\), we also have \(f(x) \to 1\).

In summary, the values of \(f(x) = x^2/(1 + x^2)\) are always between 0 and 1, with f
decreasing for negative x, having a minimum at x = 0, and increasing for x > 0. This
function does not reach a maximum value.

EXERCISES.
1. The cost of producing x units in a year is \(C = x^2 + 60x + 1000\). What production
level minimizes average cost?
2. The resistance, R (in kOhms), of an electronic filter depends on the frequency x
(in kHz) according to R = 10/(x − 1) + 4/(x − 20). What frequency, with 1 < x < 20,
minimizes resistance?
3. The total amount produced by an orchard during its first t years is
\(A = 4 + 6t + (4/900)t^2\), where 3 ≤ t ≤ 50. What age, t, maximizes average annual production over the
life of the orchard?
4. The total amount produced by a worker during the first t years after being hired
is \(A = 9 + 11t + 0.0064t^2\), where t > 2 (due to training the company will not hire a worker
for less than 2 years). What duration, t, maximizes average annual production for this
worker? What is this maximum value?
5. A consumer works to be able to consume, hence leisure, l, and the amount of the
good consumed are related. For some good its amount, x, satisfies l + 0.4x = 10. The
resulting utility to the consumer is

\[ u(x) = x^{1/2}\, l^{1/2}, \qquad 0 < x < 25. \]

(a) Interpret l + 0.4x = 10 by considering this equation as describing the consumer’s use
of time. (b) Describe the utility as a function of x alone. (c) What is the largest level of
utility for this consumer?
6. A consumer’s utility when the amount of a good consumed is x is

\[ u(x) = \left(\frac{x}{1+x}\right)^{1/2} (10 - 0.2x)^{1/2}, \qquad 0 < x < 50. \]

What is the largest level of utility for this consumer?

7. Calculate the minimum and maximum values of the function \(f(x) = x/(1 + x^3)\).
Be careful to consider the behavior near the value that is not in the domain of f as well
as the behavior for large |x|.
8. Calculate the minimum and maximum values of the function \(f(x) = x^3/(1 + x^4)\).
9. Calculate the largest level of utility for a consumer whose utility when the amount
of a good consumed is x is

\[ u(x) = \left(\frac{x}{1+x}\right)^{0.5}\left(\frac{9}{10+x}\right)^{0.5}, \qquad 0 < x. \]

11. Rates of Change of Exponentials and Logarithms.

In describing growth over time, of populations, of an economy, of an investment, or

Recall that for any positive number b and any number x, one can define the value
of b raised to the power x, which is denoted \(b^x\). Recall also the properties of exponential
functions, \(b^{x+z} = b^x b^z\) and \((b^x)^z = b^{xz}\). Associated with each exponential function is its
inverse, the logarithm for that base: \(\log_b(y) = x\) when \(b^x = y\).

We have already seen that the number e is useful as a base for continuously com-
pounded interest. The reason this number is natural as a base is that its growth is pro-
portional to itself with the constant of proportionality being 1.

Fact: The slope of \(y = e^x\) at x = 0 is 1.

For the base e the logarithm is called the natural logarithm:

\[ \ln(y) = x \quad\text{when}\quad e^x = y. \]

The rules of differentiation for exponentials and logarithms are

\[ \frac{d}{dx} e^x = e^x, \qquad \frac{d}{dx}\ln(x) = \frac{1}{x}. \]

From these it follows that

\[ \frac{d}{dx} b^x = b^x \ln b, \qquad \frac{d}{dx}\log_b(x) = \frac{1}{\ln(b)\,x}. \]

In fact, all these rules follow from the first one, the derivative of \(e^x\) with respect to x.
That rule, in turn, is a result of the properties of exponentials and the fact that the slope
of \(y = e^x\) at x = 0 is 1.
The details follow.
Recall that \(e^{x+z} = e^x e^z\). So \(e^x = e^{a+(x-a)} = e^a e^{x-a}\), and in the derivative:

\[ \frac{d\,e^x}{dx}(a) = \lim_{x\to a}\frac{e^x - e^a}{x-a} = \lim_{x\to a}\frac{e^a e^{x-a} - e^a}{x-a} = \lim_{x\to a}\frac{e^a\,(e^{x-a}-1)}{x-a} = e^a \lim_{x\to a}\frac{e^{x-a}-1}{x-a}. \]

The last limit is the slope at x − a = 0: setting h = x − a we have

\[ \lim_{x\to a}\frac{e^{x-a}-1}{x-a} = \lim_{h\to 0}\frac{e^h - e^0}{h} = 1. \]

The fact that the slope of \(y = e^h\) at h = 0 is 1 is what makes e such a special base for
exponentiation.
Now consider u = ln(x). By definition, \(e^u = x\). We differentiate this equation with
respect to x to get

\[ \frac{d}{dx} e^u = e^u \frac{du}{dx} = 1, \quad\text{so}\quad \frac{d}{dx}\ln(x) = \frac{du}{dx} = \frac{1}{e^u} = \frac{1}{x}. \]

Finally, \(b^x = e^{x\ln b}\) and \(\log_b(x) = \ln(x)/\ln(b)\) give

\[ \frac{d}{dx} b^x = e^{x \ln b}\,\ln b = b^x \ln b, \qquad \frac{d}{dx}\log_b(x) = \frac{1}{\ln(b)}\cdot\frac{1}{x}. \]
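
The role of the slope at 0 can be seen numerically. The sketch below estimates slopes with a central difference; the helper names `slope_at_zero` and `numeric_derivative` and the step size `h` are assumptions made for this illustration, and the printed values are approximations rather than exact results.

```python
import math


def slope_at_zero(base, h=1e-6):
    """Central-difference estimate of the slope of y = base**x at x = 0."""
    return (base**h - base**(-h)) / (2 * h)


def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


for b in (2.0, math.e, 3.0):
    # The slope at 0 is close to ln(b); it is 1 only for b = e.
    print(b, slope_at_zero(b), math.log(b))

# d/dx b^x = b^x ln(b) and d/dx log_b(x) = 1 / (ln(b) x), checked at x = 2, b = 3
print(numeric_derivative(lambda x: 3.0**x, 2.0), 9 * math.log(3))
print(numeric_derivative(lambda x: math.log(x, 3), 2.0), 1 / (math.log(3) * 2))
```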

Example 1. The hyperbolic cosine function is \(f(x) = (e^x + e^{-x})/2\). Find its derivative
with respect to x.
Solution:
\[ \frac{d}{dx}\,\frac{e^x + e^{-x}}{2} = \frac{e^x + e^{-x}(-1)}{2} = \frac{e^x - e^{-x}}{2}. \]

Example 2. Find the rate of change of \(1000e^{-0.04t}\) with respect to t when t = 5.

Solution: The derivative with respect to t is \(1000e^{-0.04t}(-0.04) = -40e^{-0.04t}\), so when
t = 5 the rate of change is \(-40e^{-0.2}\).

Example 3. The hyperbolic sine function is \(g(x) = (e^x - e^{-x})/2\). Find its derivative
with respect to x.
Solution:
\[ \frac{d}{dx}\,\frac{e^x - e^{-x}}{2} = \frac{e^x - e^{-x}(-1)}{2} = \frac{e^x + e^{-x}}{2}. \]

Example 4. Find the equation of the line tangent to \(y = x^2 e^{3x}\) at \((1, e^3)\).

Solution: The slope of the tangent line is \(dy/dx = 2xe^{3x} + x^2 e^{3x}\cdot 3\), so at x = 1 the slope
is \(5e^3\). The tangent line is \(y = e^3 + 5e^3(x - 1) = 5e^3 x - 4e^3\).

Example 5. Find the maximum value of \(xe^{-x}\).

Solution: To familiarize ourselves with the setting, notice that \(f(x) = xe^{-x}\) is negative for
x < 0 and approaches 0 as x → ∞. Finding the rate of change, \(df/dx = e^{-x} + xe^{-x}(-1) = e^{-x}(1 - x)\).
So the critical point is at x = 1. Since \(f(1) = e^{-1} > 0\) this is the maximum
value. We can confirm that the graph is concave down at x = 1 by finding the second
derivative: \(d^2f/dx^2 = -e^{-x} - e^{-x} + xe^{-x} = xe^{-x} - 2e^{-x}\), and at x = 1 the second derivative
with respect to x is \(-e^{-1} < 0\).

Example 6. The cumulative distribution function for the exponential distribution is
\(F(x) = 1 - e^{-\lambda x}\) where λ is a constant (this symbol is called lambda). Find the rate of
change of this cumulative distribution function.
Solution: \(dF/dx = 0 - e^{-\lambda x}(-\lambda) = \lambda e^{-\lambda x}\). This derivative is called the probability density
function for the exponential distribution.

Example 7. The probability density function for the normal distribution is

\[ f(x) = \frac{e^{-(x-\mu)^2/(2\sigma^2)}}{\sqrt{2\pi}\,\sigma}, \]

where µ and σ are constants and σ > 0 (these symbols are “mu”
and “sigma”). Find the rate of change of this density function. Where does it have a
maximum or minimum?
Solution:
\[ \frac{df}{dx} = \frac{e^{-(x-\mu)^2/(2\sigma^2)}}{\sqrt{2\pi}\,\sigma}\cdot\frac{-(x-\mu)}{\sigma^2}. \]
This derivative is only zero at x = µ and \(f(\mu) = 1/(\sqrt{2\pi}\,\sigma)\) is a maximum value. As
x → ∞, f(x) → 0 and as x → −∞, f(x) → 0. So the function does not have a minimum
value, but gets close to zero.

Example 8. Set \(f(x) = x^3 e^{-x}\). Find its maximum and minimum values.
Solution: \(df/dx = 3x^2 e^{-x} - x^3 e^{-x}\). So the critical values are x = 0 and x = 3. The second
derivative is \(d^2f/dx^2 = (6x - 6x^2 + x^3)e^{-x} = (6 - 6x + x^2)xe^{-x}\). The second derivative
is negative at x = 3 so that is a local maximum. For x = 0, f(0) = 0 and both the first
and second derivatives are zero. Examining the values for x < 0 we see that f(x) < 0 for
x < 0, so x = 0 is neither a maximum nor a minimum. (In fact, the graph at (0, 0) has a
horizontal tangent and an inflection point.)

Example 9. Find the slope of the line tangent to \(y = 3^x\) at x = 2.

Solution: \(dy/dx = 3^x \ln(3)\), so at x = 2 the slope is 9 ln(3).

Example 10. Find the maximum value of \(f(x) = x\,5^{-x^2}\).
Solution: The critical points are found by setting the rate of change to zero. Here
\(df/dx = 5^{-x^2} + x\,5^{-x^2}\ln(5)\,(-2x)\). So we get \(5^{-x^2} - 2(\ln 5)x^2\,5^{-x^2} = 0\), or
\((1 - (2\ln 5)x^2)\,5^{-x^2} = 0\), so \(x^2 = 1/(2\ln 5)\). Before \(x = -1/\sqrt{2\ln 5}\), f is decreasing,
between the two critical points f is increasing, and after \(x = +1/\sqrt{2\ln 5}\), f is
decreasing. Hence the maximum value is \(f(1/\sqrt{2\ln 5}) = 5^{-1/(2\ln 5)}/\sqrt{2\ln 5}\).

Example 11. Find an equation for the line tangent to \(xe^y + ye^x = 1\) at the point
(1, 0).
Solution: We want to find the slope, dy/dx. So we differentiate with respect to x to get

\[ e^y + x e^y \frac{dy}{dx} + \frac{dy}{dx}\, e^x + y e^x = 0. \quad\text{So}\quad \frac{dy}{dx} = -\frac{e^y + y e^x}{x e^y + e^x}. \]

Thus the slope at (1, 0) is dy/dx = −1/(1 + e) and the tangent line is

\[ y = \frac{1}{1+e} - \frac{1}{1+e}\,x. \]

Example 12. Find x as a function of z when \(z^2 \cdot e^x = 7\).

Solution: Taking logarithms (base e) of both sides, \(\ln(z^2) + \ln(e^x) = \ln(7)\) or
2 ln(z) + x = ln(7). So x = ln(7) − 2 ln(z).

Example 13. The value of a certain machine at time t is \(V = 50{,}000e^{-0.08t}\) dollars.
(Here t is measured in years and t ≥ 0). At what time will the machine have a value of
$15,000?
Solution: We want t with \(15{,}000 = 50{,}000e^{-0.08t}\). So \(e^{-0.08t} = 3/10 = 0.3\) and
−0.08t = ln 0.3, or t = −ln(0.3)/0.08. This is approximately 15.05 years (about 15 years and 18 days
and 3 hours).
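
The arithmetic in Example 13 can be reproduced with a short Python sketch. The variable names are chosen for this illustration, and the conversion to days assumes a 365-day year, which the text does not state.

```python
import math

initial_value = 50_000   # dollars, V(0)
decay_rate = 0.08        # per year
target_value = 15_000    # dollars

# Solve target = initial * exp(-decay_rate * t) for t.
t = -math.log(target_value / initial_value) / decay_rate

years = int(t)
days_total = (t - years) * 365          # assumes a 365-day year
days = int(days_total)
hours = (days_total - days) * 24

print(round(t, 2), years, days, round(hours))  # 15.05 15 18 3
```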

Example 14. Find the rate of change of \(z = x^2 \ln(x)\) with respect to x.

Solution:
\[ \frac{d}{dx}\left(x^2 \ln(x)\right) = 2x\ln(x) + x^2\cdot\frac{1}{x} = 2x\ln(x) + x. \]

Example 15. Find the minimum and maximum values of \(f(x) = x\ln(x^3)\) on the
interval [0.1, 4].
Solution: It is easier to compute if we notice that \(\ln(x^3) = 3\ln(x)\), so that our function
is f(x) = 3x ln(x). We calculate that df/dx = 3 ln(x) + 3. This derivative is zero if
ln x = −1 or \(x = e^{-1} = 1/e\). For x < 1/e the function is decreasing and for x > 1/e
the function is increasing. We compute that f(0.1) = 0.3 ln(0.1) = −0.3 ln(10), and
f(4) = 12 ln(4). Thus, on the interval [0.1, 4], the local (relative) and absolute minimum
is \(f(1/e) = f(e^{-1}) = -3/e\) and the absolute maximum is f(4) = 12 ln(4).
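
The method of Example 15 (compare the function at the endpoints and at the critical points inside the interval) can be sketched in Python as follows. The names `f`, `candidates`, and `values` are chosen for this illustration only.

```python
import math


def f(x):
    """f(x) = x ln(x^3) = 3 x ln(x), defined for x > 0."""
    return 3 * x * math.log(x)


# Candidates: the endpoints of [0.1, 4] and the critical point x = 1/e.
candidates = [0.1, 1 / math.e, 4.0]
values = {x: f(x) for x in candidates}

x_min = min(values, key=values.get)
x_max = max(values, key=values.get)
print(x_min, values[x_min])  # minimum at x = 1/e, value close to -3/e
print(x_max, values[x_max])  # maximum at x = 4, value close to 12 ln(4)
```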

Example 16. Find the rate of change of t with p when \(p = 1/(1 + e^t)\) for t > 0.
Solution: We’ll first solve for t in terms of p. We have \(1 + e^t = 1/p\), \(e^t = (1/p) - 1 = (1 - p)/p\),
and t = ln((1 − p)/p). The rate of change of t with p is \(dt/dp = (p/(1 - p)) \times (-p^{-2})\), or
dt/dp = −1/(p(1 − p)).

Example 17. Find the slope of the line tangent to \(e^x + \ln(y) = 1\) at (0, 1).
Solution: We differentiate with respect to x in order to find dy/dx.

\[ e^x + \frac{1}{y}\frac{dy}{dx} = 0, \qquad \frac{dy}{dx} = -y e^x. \]

At x = 0 and y = 1, the slope is \(-1 \cdot e^0 = -1\).

Example 18. Find the maximum value of \(f(x) = \ln(1 + x - x^2)\).

Solution: We analyze this using the derivative.

\[ \frac{df}{dx}(x) = (1 - 2x)\,\frac{1}{1 + x - x^2}. \]

The critical numbers are x = 1/2 and those values of x for which \(1 + x - x^2 = 0\). However,
the domain of f requires that \(1 + x - x^2 > 0\), or \((1 - \sqrt{5})/2 < x < (1 + \sqrt{5})/2\). Inside this
domain, for x < 1/2 the function is increasing and for x > 1/2 the function is decreasing.
The maximum value is achieved at x = 1/2 and f(1/2) = ln(5/4).

Example 19. Suppose that a certain bond will pay $10,000 at the end of 10 years.
Suppose the current bid on the bond is $7,600. What is the interest rate for this bond?
Solution: Let r denote the interest rate. We assume that interest is compounded
continuously. Then \(10{,}000 = 7{,}600\,e^{r \times 10}\) and \(e^{10r} = 10{,}000/7{,}600 = 25/19\). Taking the
logarithm we get 10 r = ln(25/19), and r = 0.1 ln(25/19) ≈ 0.02744 (or approximately
2.744 percent).
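
The same calculation applies to any bond priced under continuous compounding. In the Python sketch below, the function name `continuous_rate` and its parameter names are assumptions made for illustration.

```python
import math


def continuous_rate(present_value, future_value, years):
    """Rate r solving future_value = present_value * exp(r * years)."""
    return math.log(future_value / present_value) / years


r = continuous_rate(7_600, 10_000, 10)
print(round(r, 5))  # 0.02744
```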

Example 20. It is assumed that the total number of a certain item sold by time
t follows the model \(N = Ae^{k/t}\) for some constants A and k. Suppose that 100,000 items
were sold in the first year after the product was introduced, and that a total of 250,000
were sold by the end of the second year. Find the constants in the model and the time
at which the total number of items sold reaches 500,000. How many items would be sold
eventually? At what rate is this product being sold?
Solution: We have N(1) = 100,000 so \(Ae^{k} = 100{,}000\), and N(2) = 250,000 so
\(Ae^{k/2} = 250{,}000\). The ratio of these gives \(2.5 = e^{k/2}/e^{k} = e^{k/2-k} = e^{-k/2}\). Taking ln of
both sides, −k/2 = ln 2.5 and k = −2 ln 2.5. Putting this into N(1) = 100,000 gives
\(Ae^{-2\ln 2.5} = 100{,}000\) or \(A = 100{,}000\,e^{2\ln 2.5} = 625{,}000\). So far we have found that
\(N(t) = 625{,}000\,e^{-2\ln(2.5)/t}\).
The number of items sold reaches 500,000 when \(N(t) = 625{,}000\,e^{-2\ln(2.5)/t} = 500{,}000\),
or when \(e^{-2\ln(2.5)/t} = 500/625 = 4/5\), so \(-2\ln(2.5)/t = \ln(4/5)\) and
\(t = (-2\ln 2.5)/\ln(4/5) \approx 8.2126\) years.
According to this model eventually a total of 625,000 items will be sold, since as
t → ∞, \(e^{k/t} \to e^0 = 1\).
The rate of sales is \(dN/dt = 625{,}000\,e^{-2\ln(2.5)/t}\,(2\ln(2.5)/t^2)\) items per year.
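
The fitting steps of Example 20 can be sketched in Python. The function name `fit_sales_model` is hypothetical; it assumes, as in the example, that the two data points are the totals at t = 1 and t = 2.

```python
import math


def fit_sales_model(n1, n2):
    """Return (A, k) for N(t) = A * exp(k / t) given N(1) = n1 and N(2) = n2."""
    # N(2) / N(1) = exp(k/2 - k) = exp(-k/2)
    k = -2 * math.log(n2 / n1)
    A = n1 * math.exp(-k)
    return A, k


A, k = fit_sales_model(100_000, 250_000)
t_500k = k / math.log(500_000 / A)  # solve A * exp(k / t) = 500,000 for t

# Approximately 625000, -1.8326 and 8.2126
print(round(A), round(k, 4), round(t_500k, 4))
```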

EXERCISES.
1. Set \(f(x) = e^{3x} + e^{-3x}\). Find the rate of change of f with respect to x.
2. Set \(f(x) = 3e^x - 3e^{-x}\). Find the rate of change of f with respect to x.
3. Set \(g(x) = e^{5x} + \ln(5x)\). Find the rate of change of g with respect to x.
4. Set \(g(x) = e^{-6x} + 5\ln(x)\). Find the rate of change of g with respect to x.
5. Find the rate of change of \(f(x) = x^4 e^x\) with respect to x.
6. Calculate the rate of change of f(x) = x ln(x) with respect to x.
7. Calculate the rate of change of \(f(x) = e^{x^2}\) with respect to x.
8. Calculate the rate of change of \(f(x) = e^{x^3}\) with respect to x.
9. Calculate the rate of change of \(f(x) = e^{x^5}\) with respect to x.
10. Calculate the rate of change of \(h(x) = \ln(x) \cdot e^{x^2}\) with respect to x.
11. Set \(f(x) = e^{x^3} + e^{-x}\). Calculate the rate of change of f with respect to x.
12. Find x as a function of z when \(z^2 \cdot e^x = 8\). Then calculate the rate of change of x
with respect to z.
13. Find x as a function of z when \(z^3 \cdot e^x = 2\). Then calculate the rate of change of x
with respect to z.
14. Find x as a function of y when \(y = y^2 + \ln x\). Then calculate the rate of change
of x with respect to y.
15. Find the slope of the line tangent to \(y = 5^x\) at x = 1.

16. Find the slope of the line tangent to \(y = e^x + 3^x\) at x = 0.
17. Find the maximum value of f (x) = ln(9 − x2 ). Suggestion: first consider the
domain of f .
18. Find the maximum value of \(f(x) = x^2 e^{x^2}\).
19. Find the maximum value of \(g(u) = ue^u\).
20. Find the maximum value of \(f(x) = \ln(1 + x - x^2)\) on the interval [2, 6].
21. The value of a certain machine at time t is \(V = 50{,}000e^{-0.08t}\) dollars. (Here t is
measured in years and t ≥ 0). At what time will the machine have a value of $10,000?
22. The value of a certain machine at time t is \(V = 50{,}000e^{-0.08t}\) dollars. (Here t is
measured in years and t ≥ 0). At what rate is the value changing over time?
23. Recall that the probability density function, f (x) of a stochastic variable X whose
cumulative distribution function is G(x) = P (X ≤ x) is given by f = dG/dx. A variable
is uniformly distributed in [0, 3] when G(x) = 0 for x < 0, G(x) = x/3 for 0 ≤ x < 3, and
G(x) = 1 for x ≥ 3. Calculate the probability density function for this variable. Explain
why it makes sense to set f (x) = 0 when x < 0 and when x > 3.
24. The population of a certain country, in millions, at time t years is \(P(t) = 55e^{0.03t}\).
At what rate is the population increasing with time? What is the population’s relative
rate of growth?
25. The capital in a certain country, in billions of dollars, at time t years is
\(K(t) = 800e^{0.02t}\). At what rate is capital increasing with time? What is the relative rate of growth
of capital?
26. The total income in a certain country, in billions of dollars, at time t years is
\(I(t) = 20e^{0.025t}\). At what rate is the income increasing with time? In the same country
the population, in millions, is \(P(t) = 55e^{0.03t}\). Calculate the personal income W(t) in this
country. Is personal income increasing over time?
27. The population of a certain country, in millions, at time t years is \(P(t) = 50e^{0.03t}\).
The capital, in billions of dollars, is \(K(t) = 800e^{0.02t}\). Does the relative rate of growth of
capital exceed the population’s relative rate of growth?
28. The population of a certain country, in millions, at time t years is \(P(t) = 12e^{0.015t}\).
The capital, in billions of dollars, is \(K(t) = 800e^{0.03t}\). Suppose that personal income is
W(t) = 0.01K(t)/P(t). Is personal income increasing over time?
29. Suppose that a painting has the market value of \(600(1 + 0.07t^2)\), where time,
t, is measured in years since its creation. Assume the discount rate is 0.04 (4 percent).
Calculate the present value, v(t), of the painting at time t. At what rate is v(t) changing?
Is the present value ever maximized? If so, at what time?
30. Suppose that a painting has the market value of 300(1 + 0.5t), where time, t,
is measured in years since its creation. Assume the discount rate is 0.02 (2 percent).
Calculate the present value, v(t), of the painting at time t. At what rate is v(t) changing?
Is the present value ever maximized? If so, at what time?
12. Differential Equations and Anti-Differentiation

Differential equations are equations that involve a derivative.

One type of equation is of the form

\[ \frac{dy}{dx} = f(x), \]
which is relatively simple for two reasons. First, only one derivative appears and only one
quantity is being differentiated. Secondly, the right hand side involves only the independent
variable.
Equations of this type lend themselves well to solution by guessing.

Example 1. Find a function y so that \(dy/dx = x^3 + x^2 + 5\).

Solution: The right hand side is a polynomial, so we guess a function whose derivative is
a polynomial, namely another polynomial. Here \(y = x^4/4 + x^3/3 + 5x + 1.34\) works fine.

A function g(x) so that dg/dx = f (x) is called an anti-derivative for f (x), or an indefinite
integral for f (x). The general anti-derivative is written
\[ \int f(x)\,dx = g(x) + C. \]

Example 2. Find the indefinite integral for \(f(x) = x^5 + e^x\).

Solution: We guess a power for the power and an exponential for the exponential.

\[ \int \left(x^5 + e^x\right) dx = \frac{1}{6}x^6 + e^x + C. \]

Another differential equation that lends itself to guessing is one of the form

\[ \frac{dy}{dx} = h(y). \]

And of particular interest to us is a rate of change that is proportional to the quantity:

\[ \frac{dy}{dx} = r\,y. \]

We know that the exponential function satisfies equations of this type. In other words, a
solution is given by \(y = Ae^{rx}\) for some constant A (which is not yet determined).

Example 3. When a product is introduced into the market, the number of additional
shoppers exposed to the product is proportional to the number of shoppers who have not
been exposed to the product. Suppose that the total number of shoppers is 100,000. Write
a differential equation that models the situation.
Solution: Let t denote time, and let x be the number of shoppers exposed to the product.
The number of shoppers not exposed to the product is 100,000 − x so the model is

\[ \frac{dx}{dt} = r\,(100{,}000 - x). \]
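
To see how such a model behaves, the sketch below steps the equation forward with Euler's method and compares the result with the closed-form solution x = N − (N − x(0))e^{−rt}. The value r = 0.5 and the starting value x(0) = 0 are hypothetical; the text specifies neither, and the function names are chosen for this illustration.

```python
import math

TOTAL_SHOPPERS = 100_000
r = 0.5      # hypothetical proportionality constant (not given in the text)
x0 = 0.0     # hypothetical initial number of exposed shoppers


def euler_exposure(t_end, steps):
    """Approximate x(t_end) for dx/dt = r (100,000 - x) with Euler steps."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x += r * (TOTAL_SHOPPERS - x) * dt
    return x


def exact_exposure(t):
    """Closed-form solution with x(0) = x0: x = N - (N - x0) exp(-r t)."""
    return TOTAL_SHOPPERS - (TOTAL_SHOPPERS - x0) * math.exp(-r * t)


print(euler_exposure(4.0, 10_000), exact_exposure(4.0))
```

With these assumed values both numbers are close to each other, and the number of exposed shoppers levels off below the total of 100,000.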

We now turn our attention in a more systematic way to finding indefinite integrals.
We have seen three examples above, that of polynomials, that of exponentials, and that of
logarithms (which one was that?). The formulae are:

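\[ \text{(I1)}\qquad \int x^{\alpha}\,dx = \frac{1}{\alpha+1}\,x^{\alpha+1} + C \quad\text{for } \alpha \neq -1 \]

\[ \text{(I2)}\qquad \int e^{rx}\,dx = \frac{1}{r}\,e^{rx} + C \quad\text{for } r \neq 0 \]

\[ \text{(I3)}\qquad \int \frac{1}{x}\,dx = \ln(x) + C \]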

Example 4. Solve the differential equation \(dy/dx = x^3 + 5\) with y(0) = 2.
Solution:

\[ \int (x^3 + 5)\,dx = \frac{1}{4}x^4 + 5x + C, \quad\text{and } y(0) = 2, \text{ so } y = \frac{1}{4}x^4 + 5x + 2. \]

Example 5. Find the general solution to the differential equation \(dy/dx = x^{3/2} + \frac{2}{x} + e^{-x}\).
Solution: We guess a power for the first power, and our first integration formula, (I1),
tells us to raise the power by one. So our guess is \(a\,x^{5/2}\) for some constant a. When we
differentiate \(x^{5/2}\) we get \(\frac{5}{2}x^{3/2}\) and we want a to cancel the \(\frac{5}{2}\). In conclusion, our guess is
\(\frac{2}{5}x^{5/2}\) for the first term.
For the second term the story is a bit more complicated. While \(\frac{2}{x} = 2x^{-1}\) and so is a
power, the power is −1, so the applicable formula is (I3) (and we know that the derivative
of \(x^{-1+1} = x^0 = 1\) is zero). Our guess, then, is a multiple of ln(x), namely 2 ln(x).
For the third term, since it is an exponential we will guess an exponential. Also, because
the derivative of the exponential has the same exponent, we’ll also keep the exponent
as −x. So we will guess a multiple of \(e^{-x}\). We see that what works is \(-e^{-x}\). In conclusion,

\[ y(x) = \int \left(x^{3/2} + \frac{2}{x} + e^{-x}\right) dx = \frac{2}{5}x^{5/2} + 2\ln(x) - e^{-x} + C. \]

Example 6. Find y = y(x) with \(dy/dx = x^{-2} + e^{2x}\), and with y(1) = 3.
Solution: We guess a multiple of \(x^{-2+1} = x^{-1}\) for the first term and a multiple of \(e^{2x}\) for
the second. We decide the correct coefficients by first calculating the derivatives and then
adjusting to get what we want:

\[ y(x) = \int \left(x^{-2} + e^{2x}\right) dx = -x^{-1} + \frac{1}{2}e^{2x} + C. \]

For this function, \(y(1) = -1 + \frac{1}{2}e^{2} + C\) and we want y(1) = 3, so we choose
\(C = 3 + 1 - \frac{1}{2}e^{2} = 4 - \frac{1}{2}e^{2}\). In conclusion \(y = -x^{-1} + \frac{1}{2}e^{2x} + 4 - \frac{1}{2}e^{2}\).

Every differentiation rule can be used to enrich our library of rules for solving differential
equations – indefinite integrals, in this case. We have the chain rule and the product rule,
and they both produce integration formulae. We now consider the chain rule in reverse.

The setting is that f(x) = f(u(x)) as for the chain rule.

\[ \text{(I4)}\qquad \int \frac{df}{du}\,\frac{du}{dx}\,dx = f(u(x)) + C \]
This formula is much easier to interpret after we look at some examples.

Example 7. Find y = y(x) with \(dy/dx = 2x\,e^{x^2}\).
Solution: We consider \(u = x^2\) which turns \(e^{x^2}\) into \(e^u\). With this substitution, du/dx = 2x
and our integral becomes

\[ y(x) = \int 2x\,e^{x^2}\,dx = \int e^u\,\frac{du}{dx}\,dx = e^u + C = e^{x^2} + C. \]

It is a good idea to check that the integral is correct by differentiating the result.

\[ \frac{d}{dx}\left(e^{x^2} + C\right) = e^{x^2}\,\frac{d}{dx}(x^2) = e^{x^2}\,2x. \]

So the guess is correct.
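
The check by differentiation can also be done numerically. The Python sketch below compares a central-difference estimate of the derivative of the candidate anti-derivative with the integrand at a few sample points. The helper name `numeric_derivative`, the step size, and the sample points are assumptions made for this illustration; agreement at sample points supports the answer but is not a proof.

```python
import math


def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def antiderivative(x):
    """Candidate from Example 7: y = exp(x^2), taking C = 0."""
    return math.exp(x**2)


def integrand(x):
    """Right-hand side of the differential equation: 2 x exp(x^2)."""
    return 2 * x * math.exp(x**2)


for x in (-1.0, 0.0, 0.5, 1.5):
    # The two printed values should agree to several decimal places.
    print(x, numeric_derivative(antiderivative, x), integrand(x))
```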

Example 8. Find y = y(x) with \(dy/dx = x\,e^{x^2}\).
Solution: We consider \(u = x^2\) (again) which turns \(e^{x^2}\) into \(e^u\). With this substitution,
du/dx = 2x and our integral becomes

\[ y(x) = \int x\,e^{x^2}\,dx = \int e^u\,\frac{1}{2}\frac{du}{dx}\,dx = \frac{1}{2}e^u + C = \frac{1}{2}e^{x^2} + C. \]

Check that the integral is correct by differentiating the result.

Example 9. Find y = y(x) with \(dy/dx = (x^2 + 3x + 2)^{11}(2x + 3)\).
Solution: We consider \(u = x^2 + 3x + 2\) which turns \((x^2 + 3x + 2)^{11}\) into \(u^{11}\). With this
substitution, du/dx = 2x + 3 and our integral becomes

\[ y(x) = \int (x^2 + 3x + 2)^{11}(2x + 3)\,dx = \int u^{11}\,\frac{du}{dx}\,dx = \frac{1}{12}u^{12} + C = \frac{1}{12}(x^2 + 3x + 2)^{12} + C. \]

Check that this integral is correct by differentiating the result.

Example 10. Find y = y(x) with \(dy/dx = x/(1 + x^2)\).
Solution: We consider \(u = 1 + x^2\) which turns \(1/(1 + x^2)\) into \(1/u\). With this substitution,
du/dx = 2x and our integral becomes

\[ y(x) = \int \frac{x}{1 + x^2}\,dx = \int \frac{1}{u}\,\frac{1}{2}\frac{du}{dx}\,dx = \frac{1}{2}\ln(u) + C = \frac{1}{2}\ln(1 + x^2) + C. \]

This example shows that it is useful to think ahead to what the du/dx will be. In this
example we set \(u = 1 + x^2\) knowing that the x in the numerator will appear in the du/dx
terms.

Example 11. Find y = y(x) with \(dy/dx = (x^3 + 2)^{101} x^2\).
Solution: We consider \(u = x^3 + 2\) which turns \((x^3 + 2)^{101}\) into \(u^{101}\). With this substitution,
\(du/dx = 3x^2\) and our integral becomes

\[ y(x) = \int (x^3 + 2)^{101} x^2\,dx = \int u^{101}\,\frac{1}{3}\frac{du}{dx}\,dx = \frac{1}{102}\cdot\frac{1}{3}\,u^{102} + C = \frac{1}{306}(x^3 + 2)^{102} + C. \]

Example 12. Find y = y(x) with \(dy/dx = x^4 e^{-x^5}\).
Solution: We consider \(u = -x^5\) which turns \(e^{-x^5}\) into \(e^u\). With this substitution,
\(du/dx = -5x^4\) and our integral becomes

\[ y(x) = \int x^4 e^{-x^5}\,dx = \int e^u\,\frac{1}{-5}\frac{du}{dx}\,dx = -\frac{1}{5}e^u + C = -\frac{1}{5}e^{-x^5} + C. \]
Sometimes it is convenient to think of the substitution as a change in the variables and so
relate the change in x to the change in u:

\[ \frac{du}{dx} = g(x) \quad\text{is equivalent to}\quad du = g(x)\,dx \quad\text{or}\quad dx = \frac{1}{g(x)}\,du. \]

Example 13. Find y = y(x) with \(dy/dx = x^6/(x^7 + 5)\).

Solution: We consider \(u = x^7 + 5\). Then \(du = 7x^6\,dx\) and our integral becomes

\[ y(x) = \int \frac{x^6}{x^7 + 5}\,dx = \int \frac{x^6}{u}\,\frac{1}{7x^6}\,du = \frac{1}{7}\int \frac{1}{u}\,du = \frac{1}{7}\ln(u) + C = \frac{1}{7}\ln(x^7 + 5) + C. \]

Check that this indefinite integral is correct by differentiating the result.

Summary: A differential equation is solved by reversing the differentiation process.
Essentially, we have learned a few patterns that allow us to guess a function whose derivative
is prescribed.
Meanings of this process will be easier to understand after definite integrals are defined.
For now, think of anti-differentiation as finding a quantity from its marginal value.

Example 14. Suppose that if a facility produces x carbon-fiber bicycles (during the
time that this facility is in use), then the marginal cost of manufacturing an additional
bicycle is 500 + 0.01x dollars. Suppose that the fixed cost of building the factory is 20
million dollars. Calculate the total cost when x bicycles are manufactured.
Solution: The total cost, C(x), satisfies the differential equation

\[ \frac{dC}{dx}(x) = 500 + 0.01x \]

and so \(C(x) = 500x + 0.005x^2 + S\), where S is some constant. The fixed cost is the
cost before any bicycles are made, so it must be the total cost for x = 0. That is,
C(0) = 20,000,000. It follows that \(C(x) = 20{,}000{,}000 + 500x + 0.005x^2\) dollars.

Example 15. Suppose that if a facility produces v airplanes during its existence,
then the marginal cost of manufacturing a plane is 500,000 + 400v dollars. Suppose that
the fixed cost of building the factory is 30 million dollars. Calculate the total cost when
v airplanes are manufactured. Calculate the average cost of making an airplane at this
facility.
Solution: The total cost, C(v), satisfies the differential equation

\[ \frac{dC}{dv}(v) = 500{,}000 + 400v, \quad\text{with } C(0) = 30{,}000{,}000, \]

and so the total cost is \(C(v) = 500{,}000v + 200v^2 + 30{,}000{,}000\) dollars. The average cost
of producing a plane when v are made is A(v) = C(v)/v. That is, the average cost is
\(A(v) = 500{,}000 + 200v + (30{,}000{,}000/v)\) dollars.

Example 16. Suppose that a facility produces v airplanes during its existence and
that the average cost per plane is \(A(v) = 500{,}000 + 200v + (30{,}000{,}000/v)\) dollars.
Calculate the production level that minimizes average cost.
Solution: dA/dv = 0 when \(200 - 30{,}000{,}000/v^2 = 0\), that is, \(v^2 = 150{,}000\), which occurs for 387 < v < 388
(v is positive). We calculate A(387) ≈ 654,919.38 and A(388) ≈ 654,919.59. Notice that
the average cost is minimized when the average cost equals the marginal cost (the marginal
cost dC/dv is 654,800 at v = 387 and 655,200 at v = 388). Explain this: if the initial cost and marginal cost are
positive, why would a manufacturer minimize the average cost when the average cost equals
the cost of an additional unit?
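
The figures quoted in Example 16 can be reproduced with a short Python sketch. The function names `marginal_cost` and `average_cost` and the variable `v_star` are chosen for this illustration; the formulas are those of Examples 15 and 16.

```python
import math


def marginal_cost(v):
    """dC/dv = 500,000 + 400 v dollars."""
    return 500_000 + 400 * v


def average_cost(v):
    """A(v) = C(v)/v = 500,000 + 200 v + 30,000,000 / v dollars."""
    return 500_000 + 200 * v + 30_000_000 / v


# dA/dv = 200 - 30,000,000 / v^2 = 0 gives v^2 = 150,000
v_star = math.sqrt(30_000_000 / 200)
print(round(v_star, 3))  # 387.298

for v in (387, 388):
    print(v, round(average_cost(v), 2), marginal_cost(v))
```

At the non-integer minimizer `v_star` the average cost and the marginal cost coincide, which is the observation the example asks the reader to explain.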

EXERCISES.
1. Find a function y(x) with \(dy/dx = x^4\) and y(0) = 3.
2. Find a function y(x) with \(dy/dx = x^4\) and y(2) = 3.
3. Find a general solution to \(df/dx = e^{3x}\).
4. Find a function y(x) with \(dy/dx = (1/x) + (1/x^2) - x^2\) and y(1) = 2.
5. Find a function y(x) with \(dy/dx = x/(x^2 + 1)\) and y(0) = 0.
6. Find a function y(x) with \(dy/dx = e^{-0.03x}\) and y(0) = 17.
7. Find a function y(x) with \(dy/dx = e^{-0.07x}\) and y(0) = 10,000.
8. Find a function y(x) with \(dy/dx = e^{0.02x}\) and y(0) = 10,000.
9. Find a function y(x) with dy/dx = 4/(5 − x) and y(0) = 11. For which values of x
is y defined?
10. Find the general solution f(x) to \(df/dx = (x^4 + 3)^{103} x^3\).
11. Find the general solution f(x) to \(df/dx = x\,e^{-9x^2}\).
12. As a new product is introduced, the increase in the number of people owning the
new item, say a particular game, is proportional to the number of interactions between
people who already own the item and people who do not have the item. Let t denote
time and z = z(t) be the number of people who own the new product. Suppose the total
population of potential customers is 200,000,000 people, and that initially 2,000 people
have the introduced product (at t = 0). Write a differential equation describing this
situation.
13. The marginal cost of treating an additional patient at a specialty clinic when p
patients are being served is \(mc = 60 - p + 0.01p^2\). Calculate the total cost of treating p
patients, assuming that the fixed cost is 5 million dollars. Notice that the costs are over
the life-time of the clinic.
14. The cost of treating patients at a clinic is as above. Calculate the average cost
(per patient) of treating p patients. For how many patients is the marginal cost equal to
the average cost? (It makes sense to approximate this optimal number with an integer.)
15. Consider the cost of treating patients at a clinic, as above. Denote the number
of patients for which marginal cost equals average cost by q. Is it less expensive to treat
2q patients in two clinics each of which treats q patients, or is it less expensive to treat all
the patients in one clinic?
16. Consider consumers ordered by the value of an item to them. That is, if x denotes
the number of consumers, then v(x) is the value of the item to the marginal xth consumer.
Suppose v(x) = 1000 − 4x. (a) In this setting, v represents the marginal value of the
items purchased (dollars for the additional buyer). Assume that the total value of no
items is zero. Calculate the total consumer value when 100 consumers purchase the item.
(b) Calculate the total consumer value when 200 consumers purchase the item. (c) If the
market value (price) for the item is 100, calculate the number of consumers who purchase
the item. Assume that a consumer will purchase an item if the item’s price is below the
value of the item to this consumer.
13. Definite Integrals, Area, and Accumulated Value.

The product of an increment in a variable and some function of the variable often
has meaning. For example, if the variable is the height of a satellite above the surface of
the Earth and the function is the force of gravity on the satellite at that height, then the
product of an increment in the height times the force is the amount of energy needed to
move the satellite from the starting height through the increment in height.
Another example is provided by producers’ surplus. When the variable is the number
of units produced and the function is the profit on the sale of a unit when a certain number
of units are sold, then the increment in the number of units sold times the marginal profit
is the net gain. The sum over all items sold is the total surplus to producers in this market.
A third motivating example comes from probability. Many stochastic variables are
continuous and can achieve any of an infinite range of values. The distribution of these
values is then described using a density function, and the probability of the variable having
a value in any particular small interval is approximately the length of the interval times
the density function over the interval. Descriptive measures for the stochastic variable,
such as the average and variance, involve the product of the length of the interval and a
power of the variable multiplying the density function.

A. The Definite Integral.

We think of the definite integral of a function f (x), with respect to x, and over the
interval [a, b] as the sum of the function’s values over this interval weighted by the length.
For example, if the interval is [0, 5] and the function, f , has f (x) = 7 when 0 ≤ x ≤ 2
and f (x) = 9 when 2 ≤ x ≤ 5, then the value of 7 gets weighted by the length of [0, 2]
and the value of 9 gets weighted by the length of [2, 5]. So the integral has the value
7 × 2 + 9 × 3 = 41.
Geometrically, this is easier to visualize for a positive function f (x), where we think
of the integral over the interval [a, b] as the area under the curve y = f (x) and above the
x-axis.
The integral is the size of the shaded area in the sketch:
[Figure: the region between a curve y = f(x) and the x-axis is shaded; the horizontal axis runs from -5 to 5.]

The area of a rectangle is (the length of) its base multiplied by (the length of) its height.
It is possible to approximate the shaded area with rectangles, and this approach works to
define the integral of a continuous function even if some of the values of the function are
not positive. We take this approach to defining the integral, and develop a more general
definition in the exercises, where we encounter some functions for which integrals do not
exist.

Definition. The definite integral of a continuous function f on an interval [a, b], where
a < b are finite numbers, is defined as follows:
(a) Fix a positive integer n, and let \(x_i = a + i(b - a)/n\), where i is an integer between 0
and n.
(b) For each i with 1 ≤ i ≤ n pick a point \(c_i\) in the sub-interval \([x_{i-1}, x_i]\).
(c) Calculate the sum
\[ S(f, \{c_1, \ldots, c_n\}) := \sum_{i=1}^{n} f(c_i) \times \frac{b-a}{n} = \bigl( f(c_1) + f(c_2) + \cdots + f(c_n) \bigr) \frac{b-a}{n} . \]

(d) The integral of f on [a, b] is the limit as n increases to infinity of the sums in (c).

Theorem. Suppose that f is a continuous function on [a, b]. There is a number I so that
the sums in (c) above approach I regardless of the choice of the \(c_i\) made in step (b) above.
That is,
\[ \lim_{n \to \infty} S(f, \{c_1, \ldots, c_n\}) = I, \quad \text{for any choice of } c_i . \]
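
The definition and the theorem can be illustrated numerically. The sketch below is an illustration only: the helper name riemann_sum and the three rules for picking the points are assumptions made here, and the function x^2 on [1, 3] is used as a sample continuous function.

```python
def riemann_sum(f, a, b, n, pick):
    """Sum of f(c_i) * (b - a)/n, where c_i = pick(x_{i-1}, x_i)."""
    width = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        left = a + (i - 1) * width    # x_{i-1}
        right = a + i * width         # x_i
        total += f(pick(left, right)) * width
    return total

def f(x):
    return x * x

# Three different ways to choose c_i inside [x_{i-1}, x_i]
choices = {
    "left": lambda lo, hi: lo,
    "midpoint": lambda lo, hi: (lo + hi) / 2,
    "right": lambda lo, hi: hi,
}

for n in (10, 100, 1000):
    sums = {name: round(riemann_sum(f, 1, 3, n, pick), 4)
            for name, pick in choices.items()}
    print(n, sums)
```

As n grows, the three sums move toward the same number, as the theorem asserts for any choice of the points.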

Notation. The definite integral of f on [a, b] is written as
\[ \int_{[a,b]} f = \int_a^b f(x)\,dx . \]

We can think of the integral as sampling the values of f on the interval [a, b] with
an equal spacing. Step (a) above generates a collection of numbers \(a = x_0 < x_1 < \ldots < x_{n-1} < x_n = b\)
that are equally spaced through the interval [a, b]. The number \(c_i\) with
\(x_{i-1} \le c_i \le x_i\) is trapped inside the sub-interval \([x_{i-1}, x_i]\) which has length (b − a)/n.
Finally, the weight that the value \(f(c_i)\) receives is this length (b − a)/n.

Theorem. Suppose that f is a piecewise continuous function on [a, b]. Then the process
described in the definition of the integral of a continuous function also defines an integral
for f .
By a piecewise continuous function on [a, b] we mean a function that is defined (and
hence has finite values) at each point of [a, b] and is continuous at all but finitely many
points of [a, b].

Example 1. Suppose that f is a continuous function defined on [a, b]. Then
\[ \frac{1}{b-a} \int_a^b f(x)\,dx \]
is the average value of f on [a, b]. For instance, for \(f(x) = x^2\) and the interval [1, 3]
the average value is
\[ \text{average} = \frac{1}{2} \int_1^3 x^2\,dx . \]

At this point a calculation of this value directly from the definition is algebraically
messy, and we will wait for an easier calculation. We expect the value to be somewhere
between f (1) = 1 and f (3) = 9. Let us calculate the average over the points 1, 1.5, 2, 2.5, 3
in the interval:

\[ \text{approximate average} = \frac{1}{5}\left(1^2 + 1.5^2 + 2^2 + 2.5^2 + 3^2\right) = 22.5/5 = 4.5 . \]
Discussion: The average of f(1) = 1 and f(3) = 9 is 5. Do you expect the true average of
\(f(x) = x^2\) to be above or below 5? Are there “more” numbers for which the value of f is
close to 1 or are there “more” numbers in [1, 3] for which the value of f is close to 9?
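
The five-point average above can be repeated with more sample points. This is a minimal sketch; the helper name sample_average is chosen here for illustration, and the points are equally spaced with both endpoints included, as in the example.

```python
def sample_average(f, a, b, count):
    # Average of f over `count` equally spaced points from a to b (endpoints included)
    points = [a + k * (b - a) / (count - 1) for k in range(count)]
    return sum(f(x) for x in points) / count

def square(x):
    return x**2

for count in (5, 50, 500, 5000):
    print(count, sample_average(square, 1, 3, count))
```

With five points the result is the 4.5 computed above. Watching how the value moves as the number of points grows is one way to approach the discussion question.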

The great number of applications of the definite integral comes from the fact that the
integral accumulates products of a change in the variable and a value of a function. In the
definite integral we are adding quantities where the change in the variable – the horizontal
distance – is multiplied by the value of the function – the height. In fact, the height can be
signed, that is, the value of the function can be either positive (as for area) or negative.

Example 2. Let x be the number of items sold. Let p(x) be the price that consumers
are willing to pay for each item when x items are sold. Then when x changes by 1 and
the price is p(x) the product of the change in x and p(x) is the value to consumers of the
marginal item. The definite integral of p(x) with respect to x is the accumulated value of
all the items sold.
For instance, if p(x) = 100 − 0.04x dollars then consumers are willing to pay something
when the number of items sold is between 0 and 2,500. To illustrate this, if 1,000 items
are available, then consumers are willing to pay $60 for each item (which means that 999
consumers are willing to pay more than $60 and one consumer is willing to pay exactly
$60). So the value of the 1001st item is about 1 × 60. When 2,000 items are available,
then consumers are willing to pay $20 for each item (which means that 1999 consumers
are willing to pay more than $20 and one consumer is willing to pay exactly $20). So the
value of the 2001st item is about 1 × 20. Adding all these quantities times the prices gives
the total value:
\[ \text{Total possible consumer value} = \int_0^{2500} (100 - 0.04x)\,dx . \]

In this example the area corresponding to the integral is a triangle with height 100 (when
x = 0) and a base of length 2,500 (when the price becomes zero), so the total area is
2500 × 100/2 = 125,000. In other words,
\[ \text{Total possible consumer value} = \int_0^{2500} (100 - 0.04x)\,dx = 125,000 \text{ dollars.} \]
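
The idea of adding the value of each item, one unit at a time, can be compared with the triangle area. The sketch below assumes the price function of this example and treats each item as an increment of 1 in x; the variable names are illustrative.

```python
def price(x):
    # p(x) = 100 - 0.04x dollars
    return 100 - 0.04 * x

# Each item contributes (change in x) * (price) = 1 * p(x)
unit_sum = sum(price(x) * 1 for x in range(2500))

# Area of the triangle with base 2,500 and height 100
triangle_area = 2500 * 100 / 2

print(unit_sum, triangle_area)
```

The item-by-item sum is close to the integral but not equal to it, because an increment of one whole item is a coarse subdivision of the interval [0, 2500].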

Discussion: Typically we are actually interested in consumer surplus, which is the value
to consumers of the quantity sold less the total cost to consumers. To calculate consumer
surplus we need to know the actual market price and market quantity.

Convention regarding orientation of intervals.

It is conventional, and we will follow this convention, to think of an interval as having an
orientation from the number written on the left to the number written on the right. Hence
[b, a] is the same interval as [a, b] but with the opposite orientation. With this convention,
a switch to the opposite orientation gives, by definition, the negative of the integral:
\[ \int_b^a f(x)\,dx = - \int_a^b f(x)\,dx . \]

Example 3.
\[ \int_3^1 x^2\,dx = - \int_1^3 x^2\,dx . \]
As a consequence of this convention we have a nice rule regarding integration of the
same function, f, on a combination of intervals:
\[ \text{Interval addition:} \quad \int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx . \]

Example 4. Probability Density Functions. When a stochastic variable is continuous,
the probability is assigned to an interval of values, and this is done using a probability
density function. A variable X has probability density function f(x) when
\[ \text{Probability}(a < X \le b) = \int_a^b f(x)\,dx . \]

For instance if the probability density function is \(f(x) = e^{-x}\) for x ≥ 0, and f(x) = 0 for
x < 0, then the probability that X lies between 2 and 4.2 is
\[ \int_2^{4.2} e^{-x}\,dx . \]

We will be able to evaluate this number after the next section.

Example 5. The probability density function for a variable that is uniformly distributed
between 3 and 6 is
\[ f(x) = \begin{cases} 0, & x \le 3 \\ 1/3, & 3 < x \le 6 \\ 0, & x > 6 \end{cases} \]

Calculate the probability that such a variable lies between 3.5 and 5.
Solution: Since our function is constant on the interval of interest, the definite integral is
the length of the interval times the value of the function and the probability is
\[ \int_{3.5}^{5} (1/3)\,dx = (5 - 3.5) \times (1/3) = 0.5 . \]

EXERCISES.
1. Define a function f by f(x) = 3 when x is between 0 and 5, and by f(x) = 4 when
x is between 5 and 7. Recall that this is written
\[ f(x) = \begin{cases} 3 & x \in [0, 5] \\ 4 & x \in [5, 7] \end{cases} . \]
Calculate \(\int_0^7 f(x)\,dx\).
2. Set
\[ f(x) = \begin{cases} 0 & x \in [0, 4] \\ 2 & x \in [4, 8] \end{cases} . \]
Calculate \(\int_0^8 f(x)\,dx\).
3. Set
\[ f(x) = \begin{cases} 0 & x \in [0, 4] \\ -2 & x \in [4, 8] \end{cases} . \]
Calculate \(\int_0^8 f(x)\,dx\).
4. Set
\[ f(x) = \begin{cases} 0 & x \in [0, 4] \\ 2 & x \in [4, 8] \\ 3 & x \in [8, 12] \end{cases} . \]
Calculate \(\int_0^{10} f(x)\,dx\). Hint: the interval for the integral is not [0, 12].
5. Set
\[ f(x) = \begin{cases} 3 & x \in [0, 4] \\ -2 & x \in [4, 8] \end{cases} . \]
Calculate \(\int_0^8 f(x)\,dx\).
6. Set f(x) = 11 − x. Calculate \(\int_0^8 f(x)\,dx\). Hint: Think of the area between the
graph of f and the x-axis and over the interval [0, 8].
7. Set f(x) = 3 − x. Calculate \(\int_0^6 f(x)\,dx\).
8. Set f(x) = 3 − x. Calculate \(\int_0^9 f(x)\,dx\).
9. Suppose that producers have a cost of p = 10 + 0.002x dollars for each item, say a
bicycle pump, when they are asked to produce x items. Suppose that 10,000 such bicycle
pumps are produced, and that they sell for 30 dollars each. (a) Calculate the marginal
profit that the producers make when x bicycle pumps are produced (here x is some value
between 0 and 10,000). (b) The total profit is the profit per item times the number of
items for which this profit is generated. Represent the total profit as an integral. (c) Use
geometry to calculate the value of the integral in part (b).
10. Suppose that the maximum price that consumers are willing to pay for each item,
say a bicycle pump, when there are x items available is p = 60 − 0.003x dollars. To
understand this better, notice that bicycle pump x is bought by a consumer who is willing
to pay 60 − 0.003x – that is, the value to this consumer does not change with the market
quantity or price. However, if the quantity changes, then the consumers involved change
(if the quantity available is larger and the price is lower, then more consumers are willing
to purchase a bicycle pump). Suppose that 10,000 such bicycle pumps are available, and
that they sell for 30 dollars each. (a) The consumer surplus on an item is the price that
the consumer is willing to pay for the item less the price that the consumer is actually
paying. Calculate the marginal surplus for the consumer who purchases bicycle pump x.
(b) The total consumer surplus is the surplus per item times the number of items for which
this surplus is generated. Represent the total consumer surplus as an integral. (c) Use
geometry to calculate the value of the integral in part (b).
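11. Suppose that some function f has
\[ \int_1^5 f(x)\,dx = 7, \quad \int_3^5 f(x)\,dx = 4, \quad \int_3^8 f(x)\,dx = 9, \quad \text{and} \quad \int_5^{10} f(x)\,dx = -2 . \]

Calculate
\[ \text{(a)} \int_1^3 f(x)\,dx, \quad \text{(b)} \int_5^3 f(x)\,dx, \quad \text{(c)} \int_1^8 f(x)\,dx, \quad \text{(d)} \int_8^{10} f(x)\,dx, \quad \text{and (e)} \int_3^{10} f(x)\,dx . \]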

Project: Riemann Integrals - The General Case

Given a function f(x) and an interval V = [a, b], we want to decide whether the
integral \(\int_V f = \int_a^b f(x)\,dx\) exists. And if it does, then we want to define this value. For a
constant function f(x) ≡ h, the integral we want is \(\int_a^b f(x)\,dx = h \times (b - a)\). We wish to
extend this definition to functions that are not constant, and we proceed by hoping that
our functions will be close to constant on small intervals.
The main difference in this setting, as opposed to the setting we had before, is that
if the function f is not continuous, then the values of f cannot simply be sampled. We
have to make sure that the value on each sub-interval is close to all values of f on the
sub-interval.

We split [a, b] into n subintervals (typically of equal lengths), evaluate the maximum
and minimum values of f on each subinterval, and use the resulting estimates on the
integral. When the maximum or minimum is not available, we use upper or lower bounds
(respectively). If the resulting estimates of the integral converge, then we have succeeded
in integrating the function on the interval. The definitions begin below.
Definition. An upper bound for a function f on an interval [a, b] is a number u
so that for every x ∈ [a, b], f (x) ≤ u.
A lower bound for a function f on an interval [a, b] is a number l so that for every
x ∈ [a, b], l ≤ f (x).

Example 6. Notice that if a function has an upper bound on an interval, then it has
many upper bounds on that interval. For example, \(f(x) = x^2\) has f(x) ≤ 7 on [1, 2], but
also f (x) ≤ 4.0003 and many other upper bounds.
12. Let \(f(x) = x^2\). Consider the sub-intervals [1, 1.5] and [1.5, 2]. (a) Find an upper
bound for f on each sub-interval. (b) Find a lower bound for f on each sub-interval. (c)
What is the smallest possible gap between an upper bound and a lower bound on each
sub-interval?
13. Let \(f(x) = -x^2\). Consider the sub-intervals [0, 1.5] and [1.5, 3]. (a) Find an upper
bound for f on each sub-interval. (b) Find a lower bound for f on each sub-interval. (c)
What is the smallest possible gap between an upper bound and a lower bound on each
sub-interval?
14. Let \(f(x) = x^3\). Consider the sub-intervals [1, 2], [2, 3], and [3, 4]. (a) Is 8 an upper
bound for f on each sub-interval? (b) Is 1 a lower bound for f on each sub-interval? (c)
What is the smallest possible gap between an upper bound and a lower bound on each
sub-interval?
Notation: Fix a function f, an interval [a, b], and a natural number n. Divide
[a, b] into n subintervals \([x_{j-1}, x_j]\) of equal length, that is, \(x_k = a + k(b - a)/n\) for k =
0, 1, 2, 3, . . . , n. For each interval \([x_{j-1}, x_j]\), j = 1, 2, . . . , n, let u(f, j, n) and l(f, j, n) be
(respectively) some choice of upper and lower bounds for f on \([x_{j-1}, x_j]\). Let
\[ U(f, n) = \sum_{j=1}^{n} u(f, j, n) \times \frac{b-a}{n} \quad \text{and} \quad L(f, n) = \sum_{j=1}^{n} l(f, j, n) \times \frac{b-a}{n} \]
where we have used the above choice of upper and lower bounds on each subinterval.
We call U(f, n) and L(f, n) an upper estimate and a lower estimate for \(\int_a^b f\). One
thinks of these as upper and lower estimates for \(\sum_{j=1}^{n} f(c_j) \times \frac{b-a}{n}\), where \(f(c_j)\) is somehow
the typical value (whatever “typical” means) of f on \([x_{j-1}, x_j]\).

Definition. A function f that is bounded on the interval [a, b] is integrable on
[a, b] and its integral on the interval is I when for any given ε > 0 there is an n and a
choice of upper bounds and lower bounds for f on \([x_{j-1}, x_j]\) with j = 1, 2, . . . , n, so that
I − L(f, n) < ε and U(f, n) − I < ε.

In words, this definition says that I is close to both the upper and lower estimates
when sufficiently small sub-intervals are used to create the approximations. Notice that ε is
chosen first. Then one chooses an n and upper and lower bounds for f on each subinterval
to get upper and lower estimates for the integral.
15. Let \(f(x) = x^2\) and consider the interval V = [1, 3]. Divide V into n = 3
subintervals of equal length. Let \(u(f, j, 3) = x_j^2\) and \(l(f, j, 3) = x_{j-1}^2\), where j is 1, 2,
or 3. These are the best upper and lower estimates (the ones with the smallest gap).
Calculate U(f, 3) and L(f, 3).
16. Let \(f(x) = x^2\) and consider the interval V = [1, 3]. Divide V into n = 6
subintervals of equal length. Let \(u(f, j, 6) = x_j^2\) and \(l(f, j, 6) = x_{j-1}^2\), where j is 1, 2,
3, 4, 5, or 6. Calculate U(f, 6) and L(f, 6).
17. Let \(f(x) = x^2\) and consider the interval V = [1, 3]. Divide V into n = 10
subintervals of equal length. Let \(u(f, j, 10) = x_j^2\) and \(l(f, j, 10) = x_{j-1}^2\), where j is an
integer between 1 and 10. Calculate U(f, 10) and L(f, 10).
18. Let \(f(x) = -x^3\) and consider the interval V = [0, 5]. Divide V into n = 5
subintervals of equal length. Let \(u(f, j, 5) = -x_{j-1}^3\) and \(l(f, j, 5) = -x_j^3\), j between 1 and 5,
which are the best upper and lower estimates. Calculate U(f, 5) and L(f, 5).
19. Let \(f(x) = -x^3\) and consider the interval V = [0, 5]. Divide V into n = 10
subintervals of equal length. Let \(u(f, j, 10) = -x_{j-1}^3\) and \(l(f, j, 10) = -x_j^3\), j between 1 and
10. Calculate U(f, 10) and L(f, 10).
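Example 7. Consider \(\int_1^2 x\,dx\). Fix n, and choose \(u(f, j, n) = x_j\) and \(l(f, j, n) = x_{j-1}\)
on \([x_{j-1}, x_j]\), j = 1, 2, . . . , n (here \(x_j = 1 + j/n\)). Then
\[ U(f, n) = \sum_{j=1}^{n} u(f, j, n) \times \frac{b-a}{n} = \sum_{j=1}^{n} \frac{n+j}{n} \times \frac{1}{n} = \frac{1}{n^2}\left( n^2 + \frac{n(n+1)}{2} \right) = 1 + \frac{n+1}{2n} \]

\[ L(f, n) = \sum_{j=1}^{n} l(f, j, n) \times \frac{b-a}{n} = \sum_{j=1}^{n} \frac{n+j-1}{n} \times \frac{1}{n} = \frac{1}{n^2}\left( n^2 + \frac{(n-1)n}{2} \right) = 1 + \frac{n-1}{2n} \]

Given ε > 0, choose n > 1/ε and the upper and lower estimates as above. Then
\[ U(f, n) - 1.5 = \frac{1}{2n} < \frac{1}{n} < \varepsilon \quad \text{and} \quad 1.5 - L(f, n) = \frac{1}{2n} < \frac{1}{n} < \varepsilon . \]

One concludes that \(\int_1^2 x = 1.5\).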
Computing integrals directly from the definition, as in the example above, is not
efficient. The main use of the definition is to decide integrability in complicated cases.
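
The upper and lower estimates of Example 7 can be computed directly and compared with the closed forms found there. This is a minimal sketch for f(x) = x on [1, 2]; the function name upper_lower is chosen here for illustration.

```python
def upper_lower(n):
    # f(x) = x on [a, b] = [1, 2], so (b - a)/n = 1/n and x_j = 1 + j/n
    width = 1 / n
    xs = [1 + j / n for j in range(n + 1)]
    upper = sum(xs[j] * width for j in range(1, n + 1))       # u(f, j, n) = x_j
    lower = sum(xs[j - 1] * width for j in range(1, n + 1))   # l(f, j, n) = x_{j-1}
    return upper, lower

for n in (4, 10, 100):
    upper, lower = upper_lower(n)
    # Compare with the closed forms 1 + (n+1)/(2n) and 1 + (n-1)/(2n)
    print(n, upper, 1 + (n + 1) / (2 * n), lower, 1 + (n - 1) / (2 * n))
```

Both estimates stay within 1/(2n) of 1.5, which matches the choice n > 1/ε in the example.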

Example 8. Denote the rational numbers by Q and let
\[ f(x) = \begin{cases} 1 & x \in Q \\ 0 & x \notin Q \end{cases} \]

Then this function is not integrable on any interval containing more than one point.
20. Show that f in the previous example is not integrable on [0, 1]. Note that every
non-empty subinterval contains both rational and irrational numbers.

Example 9. For x ∈ [0, 1] and natural numbers m, set
\[ f(x) = \begin{cases} 1/m & x = 1/m, \text{ some natural number } m \\ 0 & \text{otherwise} \end{cases} \]
21. Is f integrable on [0, 1]? If so, what is the value of \(\int_0^1 f\)? If not, how can we see
this?

Example 10. A rational number is k/m in lowest terms if k and m have no common
integer factors. For example, 3/25 and 1/7 are in lowest terms, but 3/21 is not. Set

\[ f(x) = \begin{cases} 1/m & x = k/m \text{ in lowest terms} \\ 0 & x \notin Q \end{cases} \]

Consider the integral of f on the interval [0, 1]. For any fixed value of n there are finitely
many values of x for which f (x) ≥ 1/n, so it seems that if we take sufficiently small sub-
intervals centered around those values of x we can keep the total estimate for the integral
arbitrarily close to zero. To decide whether this function has an integral we have to count
how many sub-intervals contain rational numbers with a denominator of a given size. To
start, we need to choose a collection of sub-intervals.
22. Let f (x) = 1/m when x = k/m in lowest terms, and set f (x) = 0 otherwise.
Divide [0, 1] into 3 subintervals of equal length and use this to estimate the integral of f
on [0, 1].
23. Let f (x) = 1/m when x = k/m in lowest terms, and set f (x) = 0 otherwise.
Divide [0, 1] into 7 subintervals of equal length and use this to estimate the integral of f
on [0, 1].

B. The Fundamental Theorem of Calculus.

Definite integrals are well suited to representing accumulated quantities, but areas are not
easy to calculate. However, there is a relation between anti-derivatives and definite integrals
that makes the calculation much easier.

Theorem. The Fundamental Theorem of Calculus for Definite Integrals.

Assume that f (x) is a continuous function defined on the interval [a, b].
Assume that G(x) is a differentiable function so that dG/dx = f at all points on [a, b].
Then
\[ \int_a^b f(x)\,dx = G(b) - G(a). \]

Example 1. Calculate the average value of \(f(x) = x^2\) on the interval [1, 3].
Solution: We have seen that the average value is
\[ \text{average} = \frac{1}{2} \int_1^3 x^2\,dx . \]

Now \(G(x) = x^3/3\) has \(dG/dx = x^2\) so
\[ \text{average} = \frac{1}{2} \int_1^3 x^2\,dx = \frac{1}{2}\left( 3^3/3 - 1^3/3 \right) = \frac{1}{6}(27 - 1) = \frac{13}{3} = 4\tfrac{1}{3} . \]
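
The value obtained from the anti-derivative can be compared with a direct sum from the definition of the integral. The following is a minimal sketch; the midpoint choice of sample points and the number of sub-intervals are assumptions made here for illustration.

```python
def G(x):
    # Anti-derivative of x^2
    return x**3 / 3

exact_integral = G(3) - G(1)         # Fundamental Theorem: G(b) - G(a)
average = exact_integral / (3 - 1)   # average value on [1, 3]

# Sum from the definition, with c_i taken as the midpoint of each sub-interval
n = 100_000
width = (3 - 1) / n
midpoint_sum = sum((1 + (i + 0.5) * width) ** 2 * width for i in range(n))

print(exact_integral, midpoint_sum, average)
```

The two values of the integral agree to many decimal places, while the anti-derivative needs only two evaluations of G.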

Example 2. Suppose that consumers are willing to pay \(p(x) = 500 - \sqrt{x}\) dollars for
an item when x items are available. Suppose 90,000 items are sold. Calculate the total
consumer surplus.
Solution: Since 90,000 items are sold, the price must be p(90,000) = 200, and the marginal
surplus is \(p(x) - 200 = 300 - \sqrt{x}\). This function has the anti-derivative \(G(x) = 300x - (2/3)x^{3/2}\).
Notice that the marginal surplus gives us the rate of increase in the surplus
given that x have already been sold. Hence the total consumer surplus is
\[ \text{surplus} = \int_0^{90,000} (300 - \sqrt{x})\,dx = G(90,000) - G(0) = 9,000,000. \]

Notation. Since we often take an anti-derivative and then evaluate at the endpoints, this
is denoted with a square right bracket:
\[ G(x) \Big]_a^b = G(b) - G(a). \]

Example 3. The rate of production in a factory t years after it starts operating
is \(f(t) = 5t(1 - e^{-t^2})\). Calculate the average rate of production for a factory that has
operated for 10, 20, and h years.
Solution: The average will involve the integral of the rate of production so we begin by
finding a function G(t) with dG/dt = f. This is easier to do if we write f as a sum of two
functions, \(f(t) = 5t - 5te^{-t^2}\). Then \(G(t) = 2.5t^2 + 2.5e^{-t^2}\) works. The averages are
\[ A(10) = \frac{1}{10} \int_0^{10} 5t(1 - e^{-t^2})\,dt = \frac{1}{10} \left( 2.5t^2 + 2.5e^{-t^2} \right) \Big]_0^{10} \]
\[ = 25 + 0.25e^{-100} - 0.25 = 24.75 + 0.25e^{-100} \approx 24.75, \]
\[ A(20) = \frac{1}{20} \int_0^{20} 5t(1 - e^{-t^2})\,dt = \frac{1}{20} \left( 2.5t^2 + 2.5e^{-t^2} \right) \Big]_0^{20} \approx 49.875, \]
\[ A(h) = \frac{1}{h} \int_0^{h} 5t(1 - e^{-t^2})\,dt = 2.5h + 2.5e^{-h^2}/h - 2.5/h. \]
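
The averages in Example 3 follow from G alone, so they are easy to tabulate. This is a small sketch using the anti-derivative found in the solution; the function name average_rate is chosen here for illustration.

```python
import math

def G(t):
    # Anti-derivative of f(t) = 5t(1 - exp(-t^2)) from the solution
    return 2.5 * t**2 + 2.5 * math.exp(-t**2)

def average_rate(h):
    # A(h) = (G(h) - G(0)) / h
    return (G(h) - G(0)) / h

for h in (1, 10, 20):
    print(h, average_rate(h))
```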

Example 4. A rental property will generate $10,000 in income each year for 25
years. The discount rate is 0.02 (2 percent) and suppose income is generated continuously.
Calculate the total present value of the property.
Solution: To calculate the present value, income generated at time t is multiplied by a
factor of \(e^{-0.02t}\). Since it is assumed that income is continuously generated, during ∆ of a
year the income generated is 10,000 × ∆. Hence the total present value of income received
during ∆ is \(10,000e^{-0.02t}\)∆.
The total present value is therefore the integral of \(10,000e^{-0.02t}\) with respect to the
variable t. The function \(10,000e^{-0.02t}\) represents the income density in present dollars. It
is this income density that multiplies the increment in the variable to yield the integral.
\[ \text{Total present value} = \int_0^{25} 10,000e^{-0.02t}\,dt = -500,000e^{-0.02t} \Big]_0^{25} \]
\[ = -500,000e^{-0.5} + 500,000 \approx 196,734.67. \]

Discussion: If the income were not discounted to its present value, it would amount to
250,000 so the present value has to be less than 250,000.
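
The calculation in Example 4 depends only on the income rate, the discount rate, and the number of years, so it can be wrapped in a small function. This is a sketch under the example's assumptions (constant income rate, continuous discounting); the function and parameter names are chosen here for illustration.

```python
import math

def present_value(income_rate, discount_rate, years):
    # Integral of income_rate * exp(-discount_rate * t) for t from 0 to years,
    # evaluated with the anti-derivative -(income_rate/discount_rate) * exp(-discount_rate * t)
    return income_rate * (1 - math.exp(-discount_rate * years)) / discount_rate

pv = present_value(10_000, 0.02, 25)
print(round(pv, 2))
```

The same function applies to other constant income streams by changing the three arguments.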

Example 5. When a stochastic variable, X, is uniformly distributed in [0, 5], its
density function is f(x) = 1/5 for 0 ≤ x ≤ 5. This means that the probability of finding
X between a and b, where 0 ≤ a < b ≤ 5, is \(\int_a^b (1/5)\,dx\). Calculate (a) the probability
that X lies between 0 and 2, (b) the probability that X lies between 1 and 3, and (c) the
probability that X lies between 4 and 5.
Solution: These definite integrals can also be calculated directly as areas. Here we’ll use
the anti-derivative of 1/5 = 0.2 which is G(x) = 0.2x. Hence
\[ P(0 \le X \le 2) = \int_0^2 0.2\,dx = 0.2x \Big]_0^2 = 0.4 - 0 = 0.4, \]
\[ P(1 \le X \le 3) = \int_1^3 0.2\,dx = 0.2x \Big]_1^3 = 0.6 - 0.2 = 0.4, \]
\[ \text{and} \quad P(4 \le X \le 5) = \int_4^5 0.2\,dx = 1 - 0.8 = 0.2 . \]

Example 6. When a stochastic variable, X, has the density function f(x), which is
positive for a ≤ x ≤ b, its average or mean value or expected value is
\[ \text{Average value of } X = E(X) = \int_a^b x f(x)\,dx . \]

Suppose that X is uniformly distributed in [0, 5]. Calculate the average value of X.
Solution:
\[ \text{Average} = E(X) = \int_0^5 x (1/5)\,dx = 0.1x^2 \Big]_0^5 = 2.5 - 0 = 2.5 . \]

Theorem. The Fundamental Theorem of Calculus for Differential Equations.

Assume that f (x) is a continuous function defined on the interval [a, b].
Define g(x) for a ≤ x ≤ b by
\[ g(x) := \int_a^x f(s)\,ds . \]

Then g(x) satisfies the differential equation
\[ \frac{dg}{dx}(x) = f(x) . \]

Example 7. Find a function u(x) with \(du/dx = x^2 e^x\) and u(1) = 2.

Solution: Using the fundamental theorem of calculus for differential equations,
\[ u(x) = 2 + \int_1^x s^2 e^s\,ds . \]

Notice that for x = 1 the integral above is zero, so the condition u(1) = 2 is satisfied.
Discussion: One could also guess directly at the differential equation. A good guess involves
a product of a power and an exponential. The guess should start with \(x^2 e^x\) (why?), and
eventually one gets \(u(x) = x^2 e^x - 2xe^x + 2e^x + 2 - e\).
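
The formula reached in the discussion can be checked numerically, in the spirit of verifying a solution by differentiating. The sketch below uses a central difference to approximate du/dx; the step size and the helper name derivative are assumptions made here.

```python
import math

def u(x):
    # Candidate solution from the discussion
    return x**2 * math.exp(x) - 2 * x * math.exp(x) + 2 * math.exp(x) + 2 - math.e

def derivative(func, x, h=1e-6):
    # Central-difference approximation of the derivative at x
    return (func(x + h) - func(x - h)) / (2 * h)

print(u(1))   # should be close to the required value 2
for x in (0.0, 1.0, 2.5):
    # Compare the approximate derivative with x^2 e^x
    print(x, derivative(u, x), x**2 * math.exp(x))
```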

Example 8. Alternative definition of logarithm. As a consequence of the fundamental
theorem of calculus for differential equations we can construct a solution using
a definite integral.
We define a new function L(x) for x > 0 by
\[ L(x) := \int_1^x \frac{1}{t}\,dt , \quad x > 0. \]

Notice that the interval of integration is [1, x] if x ≥ 1 or [x, 1] if 0 < x ≤ 1, and that
in either case f (x) = 1/x is continuous on the interval of integration. Also, by definition
L(1) = 0 and dL/dx = 1/x. Hence the function L agrees with the natural logarithm. Some
of the exercises, in the next section – after we learn how to change variables – explore this
alternative definition of the logarithm. (Earlier, we defined the logarithm as the inverse of
the exponential, and the exponential was defined vaguely as an extension of the process of
taking powers.)
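
The claim that L agrees with the natural logarithm can be tested numerically. The sketch below approximates the defining integral with midpoint sums and compares with the library logarithm; the number of sub-intervals is an arbitrary choice made here.

```python
import math

def L(x, n=100_000):
    # Midpoint-sum approximation of the integral of 1/t from 1 to x, for x > 0.
    # When x < 1 the width is negative, which matches the orientation convention
    # for integrals taken from the larger endpoint to the smaller one.
    width = (x - 1) / n
    return sum(width / (1 + (i + 0.5) * width) for i in range(n))

for x in (0.5, 1.0, 2.0, 10.0):
    print(x, L(x), math.log(x))
```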

EXERCISES.
1. Evaluate \(\int_0^2 (x^2 - x)\,dx\).
2. Evaluate \(\int_0^2 (3x + e^x)\,dx\).
3. Evaluate \(\int_0^2 (x + e^{0.05x^2})\,dx\).
4. Evaluate \(\int_0^2 (5x - 7 + (3/x))\,dx\).
5. Evaluate \(\int_0^2 (12x^{11} - 5e^{-0.2x} + 6)\,dx\).
6. A rental property will generate $16,000 in income each year for 20 years. The
discount rate is 0.02 (2 percent). Suppose income is generated continuously, and calculate
the total present value of the property.
7. Suppose that it costs producers \(c(x) = 50 + 0.2\sqrt{x} + 0.001x\) dollars to produce an
item when x items are available. Suppose 90,000 items are sold. (a) At what price are the
items sold? (b) What is the marginal profit when the number of units produced is x? (c)
Calculate the producers’ total profit.
8. Suppose that it costs producers \(c(x) = 25 + 0.03x - 10^{-6}x^2\) dollars to produce an
item when x items are available. Suppose consumers are willing to pay p(x) = 300 − 0.03x
dollars to purchase an item when x items are available. Notice that this model only makes
sense for 0 < x < 10,000 because the price must be positive. (a) At what price are the
items sold? That is, at which number of items do producers and consumers reach an
agreement on the price and what is the resulting price? (b) What is the marginal profit
when the number of units produced is x? (c) What is the marginal consumer benefit when
x units are sold? (d) Calculate the producers’ total profit. (e) Calculate the total consumer
surplus.
9. An investment is made at a constant rate of $3,000 a year in a retirement account
with a continuous interest rate of 0.03 (3 percent). This investment continues for 20 years.
(a) Calculate the value of the income density that has been in the account for t years. Note
that it is assumed that this income has earned interest for t years, not that the income
was deposited t years after the account was opened. (b) Calculate the total amount in the
account at the end of the 20 years.
10. The rate of production from an orchard is \(f(t) = 20t + 3t^2 - 0.01t^3\) at year t, where
4 < t < 300. (a) Calculate the average rate of production for this orchard as a function of
time. (b) After 50 years, is the orchard’s average rate of production higher or lower than
its (instantaneous) rate of production? (c) Would it be worthwhile to replant the orchard
after 50 years? Explain.
11. A stochastic variable, X, has the density function \(f(x) = 0.25x^3\) for 0 < x ≤ 2,
and f (x) = 0 otherwise. Calculate the average value of X.
12. A stochastic variable, X, has the density function f (x) = 2x/9 for 0 < x ≤ 3, and
f (x) = 0 otherwise. Calculate the average value of X.
13. A stochastic variable, X, has the density function \(f(x) = x^2/3\) for −1 < x ≤ 2,
and f (x) = 0 otherwise. Calculate the average value of X.

C. Some Anti-Differentiation or Integration Techniques – briefly.

It is common to look up anti-differentiation formulae that have been gathered over
the years. These are available as “integral tables” in reference books, in many symbolic
calculators, in mathematical computation packages on computers, and from web-based
sources.
It is always a good idea to check that a formula used does in fact apply to the equation
at hand by differentiating. It is also good to understand how some integration formulae
arise from differentiation rules.

This section presents a few integration techniques and examples that apply them.

Direct anti-differentiation.
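A few indefinite integrals come directly from the differentiation rules for powers, exponentials,
and logarithms. These appeared in the section on differential equations.

\[ \text{(I1)} \quad \int x^{\alpha}\,dx = \frac{1}{\alpha+1}\,x^{\alpha+1} + C \quad \text{for } \alpha \ne -1 \]

\[ \text{(I2)} \quad \int e^{rx}\,dx = \frac{1}{r}\,e^{rx} + C \quad \text{for } r \ne 0 \]

\[ \text{(I3)} \quad \int \frac{1}{x}\,dx = \ln(x) + C \]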

Substitution.

The setting is that of the chain rule, applied to a composite function $f(u(x))$. The main
focus is on spotting that this setting applies. Recall the formula:
\[ \text{(I4: Substitution)} \quad \int \frac{df}{du}\,\frac{du}{dx}\,dx = f(u(x)) + C \]

Example 1. The number of orders of size s (items) placed with a certain company
(in a week) is $N(s) = 7000s/(1 + s^2)$ where $0 \le s \le 100$. Calculate the average size of an
order.
Solution:
\[ \text{Average} = \frac{1}{100}\int_0^{100} \frac{7000s}{1+s^2}\,ds. \]
In order to find an anti-derivative, observe that the denominator, $1 + s^2$, has derivative $2s$
which is the numerator multiplied by a constant. Hence the substitution $u = 1 + s^2$ might
help. In terms of u, $1/(1 + s^2)$ is $1/u$ which has anti-derivative $f(u) = \ln(u)$. So
\[ \text{Average size} = \frac{1}{100}\int_0^{100} \frac{7000s}{1+s^2}\,ds
   = \frac{1}{100}\int_{s=0}^{s=100} \frac{7000}{u}\cdot\frac{1}{2}\cdot 2s\,ds \]
\[ = \frac{1}{100}\int_{s=0}^{s=100} \frac{7000}{u}\cdot\frac{1}{2}\,\frac{du}{ds}\,ds
   = 35\ln(1+s^2)\Big|_0^{100} = 35\ln(10{,}001) \approx 322.36541. \]
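As a sanity check on Example 1, the average can be approximated numerically and compared with the closed form $35\ln(10{,}001)$. The sketch below assumes a hand-written composite Simpson's rule (the helper `simpson` is not from the text); any numerical integration routine would serve.

```python
import math

def simpson(f, a, b, n=200_000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

N = lambda s: 7000 * s / (1 + s**2)   # integrand from Example 1
numeric = simpson(N, 0, 100) / 100     # average over 0 <= s <= 100
closed_form = 35 * math.log(10_001)    # result of substituting u = 1 + s^2

print(round(numeric, 4), round(closed_form, 4))
```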

An alternative and equivalent form for integration by substitution changes the variable
both for the function and for the increment. The formula becomes
\[ \text{(I4 alternative)} \quad \int f(u(x))\,dx = \int f(u(x))\,\frac{1}{du/dx}\,du. \]

This formulation is simpler, but leaves the task of finding an anti-derivative for the new
function $g(u) = f(u)/(du/dx)$.
Example 2. Evaluate $\int_0^3 x e^{-x^2/2}\,dx$.
Solution: Notice that x is a constant multiple of the derivative of $x^2$. Hence the substitution
$u = -x^2/2$ might work. We get
\[ u = -\frac{x^2}{2}, \quad \frac{du}{dx} = -x, \quad dx = \frac{1}{du/dx}\,du = \frac{1}{-x}\,du: \]
\[ \int_0^3 x e^{-x^2/2}\,dx = \int_{x=0}^{x=3} x e^{u}\,\frac{1}{-x}\,du
   = \int_{x=0}^{x=3} -e^{u}\,du = -e^{u}\Big|_{x=0}^{x=3}
   = -e^{-9/2} + e^{0} = 1 - e^{-9/2}. \]

It is sometimes convenient to write the endpoints explicitly in terms of the substituted
variable:
\[ \text{(I4 with endpoints)} \quad \int_a^b f(u(x))\,dx = \int_{u(a)}^{u(b)} f(u)\,\frac{dx}{du}\,du. \]

Example 3. Evaluate $\int_0^2 x^2 (x^3 + 1)^{11}\,dx$.
Solution: Notice that $x^2$ is a constant multiple of the derivative of $x^3$. Hence the substitution
$u = x^3 + 1$ might work. We get
\[ u = x^3 + 1, \quad \frac{du}{dx} = 3x^2, \quad dx = \frac{1}{3x^2}\,du: \]
\[ \int_0^2 x^2 (x^3+1)^{11}\,dx = \int_{u=1}^{u=9} x^2 (x^3+1)^{11}\,\frac{1}{3x^2}\,du
   = \int_1^9 \frac{1}{3}\,u^{11}\,du = \frac{1}{36}\,u^{12}\Big|_1^9 = \frac{1}{36}\,(9^{12} - 1). \]
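The endpoint form of substitution can be checked numerically on Examples 2 and 3: integrating the original integrand over the x-interval and the substituted integrand over the u-interval should give the same number. This is a sketch under the assumption of a hand-written Simpson's rule (`simpson` is a helper introduced here, not part of the text).

```python
import math

def simpson(f, a, b, n=20_000):
    """Composite Simpson's rule; also works when b < a (signed integral)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Example 2: u = -x^2/2 sends x = 0, 3 to u = 0, -4.5; the integrand becomes -e^u.
lhs2 = simpson(lambda x: x * math.exp(-x**2 / 2), 0, 3)
rhs2 = simpson(lambda u: -math.exp(u), 0, -4.5)
exact2 = 1 - math.exp(-4.5)

# Example 3: u = x^3 + 1 sends x = 0, 2 to u = 1, 9; the integrand becomes u^11 / 3.
lhs3 = simpson(lambda x: x**2 * (x**3 + 1)**11, 0, 2)
rhs3 = simpson(lambda u: u**11 / 3, 1, 9)
exact3 = (9**12 - 1) / 36

print(lhs2, rhs2, exact2)
print(lhs3, rhs3, exact3)
```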

Algebraic manipulation.

Sometimes writing a function in a different form makes it easier to find anti-derivatives.

The two main approaches discussed here involve rational expressions. The first is reducing
powers in the numerator, and the second is separating factors in a common denominator.

Example 4. The density function for a stochastic variable is $f(x) = 0.5/(1 + x)$ for
$0 \le x \le e^2 - 1$. This might represent, for instance, the duration of a cell phone conversation
(a low probability of a long conversation). Calculate the average value of this stochastic
variable.
Solution: the average is given by $\int_0^{e^2-1} x f(x)\,dx$.
\[ E(X) = \int_0^{e^2-1} x\,\frac{0.5}{1+x}\,dx = 0.5\int_0^{e^2-1} \frac{1+x-1}{1+x}\,dx
   = 0.5\int_0^{e^2-1} \left(1 + \frac{-1}{1+x}\right) dx \]
\[ = \Big[\,0.5x - 0.5\ln(1+x)\Big]_0^{e^2-1} = 0.5(e^2 - 1) - 0.5\ln(e^2) - 0 = 0.5e^2 - 1.5 \approx 2.1945. \]

Example 5. According to the widely used logistic equation, a population of size
p(t) at time t satisfies the differential equation
\[ \frac{dp}{dt}(t) = r\,p(t)\left(1 - \frac{1}{K}\,p(t)\right) = \frac{r}{K}\,p\,(K - p). \]
This equation can be written as
\[ \frac{K}{p(K-p)}\,\frac{dp}{dt} = r, \]
and so the chain rule applies and we can solve the equation using the integral
\[ G(p) = \int \frac{K}{p(K-p)}\,dp. \]
Find G(p).
Solution: The denominator has two factors. If the quotient can be written as a sum of two
fractions each of which has only one of the factors, then anti-derivatives could be found.
\[ G(p) = \int \frac{K}{p(K-p)}\,dp = \int \left(\frac{1}{p} + \frac{1}{K-p}\right) dp
   = \ln(p) - \ln(K-p) + C = \ln\!\left(\frac{p}{K-p}\right) + C. \]

Discussion: In ecology and biology, the parameter r is the natural growth rate, and K
represents the carrying capacity of the environment in which the population lives.
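To see how G(p) solves the logistic equation: since $\frac{d}{dt}G(p(t)) = r$, we get $\ln\big(p/(K-p)\big) = rt + C$, and solving for p gives $p(t) = K/(1 + A e^{-rt})$ with $A = (K - p(0))/p(0)$, valid for $0 < p < K$. The sketch below checks this numerically; the parameter values are illustrative assumptions, not taken from the text.

```python
import math

def G(p, K):
    """Anti-derivative from Example 5 (with C = 0), valid for 0 < p < K."""
    return math.log(p / (K - p))

def logistic(t, r, K, p0):
    """Solution of dp/dt = r p (1 - p/K) with p(0) = p0, obtained from G."""
    A = (K - p0) / p0
    return K / (1 + A * math.exp(-r * t))

r, K, p0 = 0.3, 50.0, 2.0  # illustrative growth rate, capacity, initial size

for t in (0.0, 1.0, 5.0, 20.0):
    p = logistic(t, r, K, p0)
    # G(p(t)) - G(p(0)) should equal r * t
    print(t, round(p, 4), round(G(p, K) - G(p0, K), 6), r * t)
```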

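The example above is a special case of a technique called integration by partial fractions.
\[ \text{(I5: Partial Fractions)} \quad \int \frac{ax+b}{(x-c)(x-d)}\,dx = \int \left(\frac{A}{x-c} + \frac{B}{x-d}\right) dx, \]
\[ \text{for } A = \frac{b + ac}{c - d}, \quad \text{and} \quad B = \frac{b + ad}{d - c}. \]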
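The coefficients in (I5) are easy to compute and to spot-check. The following sketch (the function name is an assumption made for illustration) computes A and B and compares both sides of the decomposition at a sample point. Writing $K/(p(K-p))$ as $-K/((p-0)(p-K))$ recovers the decomposition of Example 5.

```python
def partial_fraction_coefficients(a, b, c, d):
    """Return (A, B) with (a x + b)/((x - c)(x - d)) = A/(x - c) + B/(x - d)."""
    if c == d:
        raise ValueError("the two factors must be distinct (c != d)")
    A = (b + a * c) / (c - d)
    B = (b + a * d) / (d - c)
    return A, B

# Example 5 with K = 5: K/(p(K - p)) = -K/((p - 0)(p - K)), so a = 0, b = -K.
print(partial_fraction_coefficients(0, -5.0, 0, 5.0))  # A = 1, B = -1

# Spot-check the identity for (2x + 3)/((x - 1)(x + 4)) at x = 2.5.
A, B = partial_fraction_coefficients(2, 3, 1, -4)
x = 2.5
left = (2 * x + 3) / ((x - 1) * (x + 4))
right = A / (x - 1) + B / (x + 4)
print(left, right)
```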

Products and integration by parts.

When the anti-differentiation involves a product, it can be useful to guess a product for
the anti-derivative. The product rule produces two portions or parts when a product is
differentiated, and the corresponding integration technique is called integration by parts.

Example 6. An investment yields income at a rate of $v(t) = 10{,}000 + 500t$ dollars
per year at time t years, where $0 \le t \le 50$. Assume a discount rate of 2 percent and
calculate the total present value of the investment.
Solution: The contribution to the present value is made at a rate of $(10{,}000 + 500t)\,e^{-0.02t}$
and this is integrated over $0 \le t \le 50$. The first piece can be anti-differentiated directly:
\[ \int_0^{50} 10{,}000\,e^{-0.02t}\,dt = -500{,}000\,e^{-0.02t}\Big|_0^{50} = 500{,}000\,(1 - e^{-1}) \approx 316{,}060.28. \]

For the second piece, the product $500\,t\,e^{-0.02t}$, consider what happens when the product
$-25{,}000\,t\,e^{-0.02t}$ is differentiated:
\[ \frac{d}{dt}\left(-25{,}000\,t\,e^{-0.02t}\right) = -25{,}000\,e^{-0.02t} + 500\,t\,e^{-0.02t}. \]
Clearly, what we want is the second part, so we should subtract the first!
\[ \frac{d}{dt}\left(-25{,}000\,t\,e^{-0.02t}\right) + 25{,}000\,e^{-0.02t} = 500\,t\,e^{-0.02t}, \]
or
\[ \int_0^{50} 500\,t\,e^{-0.02t}\,dt = -25{,}000\,t\,e^{-0.02t}\Big|_0^{50} + \int_0^{50} 25{,}000\,e^{-0.02t}\,dt \]
\[ = \Big[-25{,}000\,t\,e^{-0.02t} - 1{,}250{,}000\,e^{-0.02t}\Big]_0^{50} = -2{,}500{,}000\,e^{-1} + 1{,}250{,}000 \approx 330{,}301.40. \]

The total present value of this investment is approximately 316,060.28 + 330,301.40 =
646,361.68 dollars.
The formula that captures the procedure above is
\[ \text{(I6: By Parts)} \quad \int f(x)\,\frac{dg}{dx}(x)\,dx = f(x)\,g(x) - \int g(x)\,\frac{df}{dx}(x)\,dx. \]
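The two pieces of Example 6 can be cross-checked numerically. The sketch below evaluates the closed forms $500{,}000(1 - e^{-1})$ and $-2{,}500{,}000\,e^{-1} + 1{,}250{,}000$ and compares their sum with a direct numerical integration of $(10{,}000 + 500t)e^{-0.02t}$ over $0 \le t \le 50$. The Simpson helper is an assumption introduced for illustration.

```python
import math

def simpson(f, a, b, n=20_000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

rate = 0.02
income = lambda t: 10_000 + 500 * t

numeric_total = simpson(lambda t: income(t) * math.exp(-rate * t), 0, 50)
first_piece = 500_000 * (1 - math.exp(-1))               # constant part of the income
second_piece = 1_250_000 - 2_500_000 * math.exp(-1)      # by-parts result for 500 t e^(-0.02 t)

print(round(first_piece, 2), round(second_piece, 2))
print(round(first_piece + second_piece, 2), round(numeric_total, 2))
```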

Example 7. Find an anti-derivative for $f(x) = x^2 e^x$.

Solution: This is a product and we will repeatedly differentiate the power and anti-differentiate
the exponential.
\[ \int x^2 e^x\,dx = x^2 e^x - \int 2x\,e^x\,dx = x^2 e^x - 2x\,e^x + 2\int e^x\,dx = x^2 e^x - 2x\,e^x + 2e^x + C. \]

Example 8. When a new technology is introduced successfully, its use spreads rapidly
(until use drops and the product is replaced). Suppose that the income from a new product
is proportional to the square of the duration since the product’s introduction. Specifically
suppose that the income in millions of dollars is $y = 100\,t^2$ with $0 \le t \le 3$ representing
time in years. Suppose the discount rate is 4 percent. Find the total present value of the
income.
Solution: The present value of the rate of contribution to income at time t is $y\,e^{-0.04t}$ and
hence the total present value of the income is
\[ \text{Total yield} = \int_0^3 100\,t^2 e^{-0.04t}\,dt. \]

The calculations start with the product of $f(t) = t^2$ and $dg/dt = e^{-0.04t}$. One gets
\[ \int_0^3 100\,t^2 e^{-0.04t}\,dt = 100\,t^2\left(-25\,e^{-0.04t}\right)\Big]_0^3 - \int_0^3 100\cdot 2t\left(-25\,e^{-0.04t}\right) dt \]
\[ = -2{,}500\,t^2 e^{-0.04t}\Big]_0^3 + 5{,}000\,t\left(-25\,e^{-0.04t}\right)\Big]_0^3 - 5{,}000\int_0^3 1\cdot\left(-25\,e^{-0.04t}\right) dt \]
\[ = -2{,}500\,t^2 e^{-0.04t}\Big]_0^3 - 125{,}000\,t\,e^{-0.04t}\Big]_0^3 - 3{,}125{,}000\,e^{-0.04t}\Big]_0^3 \]
\[ = -3{,}522{,}500\,e^{-0.12} + 3{,}125{,}000 \approx 822.762. \]

In conclusion, the total present value of the income during the first 3 years is approximately
822.762 million dollars.

EXERCISES.
1. Evaluate
\[ \text{(a)} \int_2^4 \frac{x^2-1}{x-1}\,dx, \quad \text{(b)} \int_2^4 \frac{x^2-1}{x}\,dx, \quad
   \text{(c)} \int_2^4 \frac{x}{x^2-1}\,dx, \quad \text{(d)} \int_2^4 \frac{x^2+1}{x-1}\,dx. \]

2. Evaluate
\[ \text{(a)} \int_1^3 (x^2+3)\,x^{1/2}\,dx, \quad \text{(b)} \int_0^1 e^{0.03t}\,dt, \quad
   \text{(c)} \int_0^2 \frac{e^x}{e^x+3}\,dx. \]

3. Evaluate
\[ \int_1^3 \frac{x+2}{x^2+4x+3}\,dx. \quad \text{Note: } x^2 + 4x + 3 = (x+1)(x+3). \]

4. A bicycle tube has a circular cross-section with radius 1.5 cm and its thickness is
g(x) = 0.1 + 0.004 |x − 25| where x represents the distance from one end of the tube and
0 ≤ x ≤ 50. Suppose that the density of the material is 8 grams per cubic centimeter.
Calculate the total mass of the tube. Physics facts: mass is volume times density. Assume
that the area of a cross-section is 2π times the radius times the thickness, so the volume
is this area multiplied by the length of a segment of the tube (the x direction).
5. Evaluate
\[ \text{(a)} \int_0^{30} x\,e^{-0.05x}\,dx, \quad \text{(b)} \int_0^1 t^3 (t^2+3)^{23}\,dt. \quad \text{Hint: } t^3 = t^2\cdot t. \]

6. Evaluate
\[ \text{(a)} \int_3^{10} x\ln(x)\,dx, \quad \text{(b)} \int_3^{10} \ln(t)\,dt. \quad \text{Hint: } \ln(t) = 1\cdot\ln(t). \]

7. The income from an investment at time t years (since it is made) is $500 + 40t + 3t^2$,
where 0 ≤ t ≤ 15. For this investment the discount rate is 5 percent. Calculate the total
present value of the investment.
8. Find antiderivatives
\[ \text{(a)} \int x^4\sqrt{x^5+1}\,dx, \quad \text{(b)} \int t\,e^{at}\,dt, \quad \text{(c)} \int x\,e^x (e^x + x)\,dx. \]

9. When the amount of items sold (in a month) is x, consumers are willing to pay
$150\,e^{-0.2x}$ per item while producers are willing to accept $e^{-20}(50 + x)$. Notice that the
market quantity resulting from this is x = 100. Calculate the total consumers’ surplus and
the total producers’ surplus.
10. When a new product is introduced, the number of people who have purchased
it increases in proportion to the number of interactions between the people who have
already purchased the product and those potential customers who have not yet purchased
the product. Suppose a population has 100 million potential customers, p represents the
number of people (in millions) who have purchased the product, and t represents time (in
months). Suppose that at the beginning (t = 0) there are 0.1 million customers (through
special promotions, say, p(0) = 0.1). Suppose that it is found that at the beginning the
number of customers is increasing by 0.4995 million a month. This description says that
for some constant a
\[ \frac{dp}{dt}(t) = a\,p(t)\,\big(100 - p(t)\big), \quad p(0) = 0.1, \quad \frac{dp}{dt}(0) = 0.4995. \]

(a) Calculate the value of the constant a. (b) Find p(t), the number of customers at time
t. (c) Calculate the number of customers after 12 months.

Logarithms and exponentials.

Define the function L(x) for x > 0 by $L(x) = \int_1^x (1/t)\,dt$.

11. Show that L(2) > 1/2.

12. Show that L(1/2) < −1/2.

13. Show that L(x · z) = L(x) + L(z). Hint: The interval [1, x · z] can be split into the
two intervals [1, x] and [x, x · z]. Consider the change of variables u = t/x for the second
interval, and notice that the number x is a constant in this instance – the variables are u
and t.

14. What is the range of L? Hint: you might find what you showed in the previous
three problems useful.

15. Show that if 0 < a < b, then L(a) < L(b). Notice that there are two interesting
cases, namely 0 < a < b < 1 (in which case L(a) and L(b) are negative) and 1 < a < b (in
which case L(a) and L(b) are positive).

16. The previous problem shows that L is a strictly increasing function, and hence
has an inverse. Let G be the inverse of L. That is, for any y in the range of L, G(y) = x
when L(x) = y. Show that G(y + w) = G(y) · G(w).

17. The fundamental theorem of calculus tells us that $(dL/dx)(x) = 1/x$. Using this,
show directly from the definition of G(y), that is from L(G(y)) = y, that $(dG/dy)(y) = G(y)$.

D. Improper Integrals.

Improper integrals involve one of two infinities. Either the integral is over an interval
that is not bounded (hence its length is infinite), or the integral involves a function that
is not bounded (and hence not defined at some values of the independent variable).

Example 1. The function $x e^{-x}$ is defined for all $0 \le x$ and its value approaches 0
as x gets very large. However, we cannot break the interval $0 \le x$ into a finite number of
subintervals of finite length. So we do not have a definition of
\[ \int_0^{\infty} x\,e^{-x}\,dx = \;?? \]

Example 2. The function $f(x) = 1/\sqrt{x}$ is not defined for x = 0. So if an interval
includes 0, then we cannot find an upper bound (or evaluate the maximum or least upper
bound) of f on that interval. Hence we do not have a definition of
\[ \int_0^1 \frac{1}{\sqrt{x}}\,dx = \;?? \]

We will define the integrals in both these cases (unbounded interval of integration and
undefined function) as extensions of the integrals we have defined. And it will turn out
that the integrals in both the examples above can be evaluated.

Definition. Assume that a function f is bounded on the interval [r, b] for every r with
a < r < b. Then f is integrable on [a, b], and its integral on the interval is I, when f is
integrable on each such [r, b] and
\[ \lim_{r\to a^+}\int_r^b f = \lim_{r\to a,\; r>a}\int_r^b f = I. \]

The same criterion is applied when f is bounded on the interval [a, r] for every r with
a < r < b, and when b is replaced by ∞:
\[ \int_a^b f = \lim_{r\to b^-}\int_a^r f \quad \text{and} \quad \int_a^{\infty} f = \lim_{r\to\infty}\int_a^r f. \]

Recall that $\lim_{r\to b^-}$ means that r approaches b and r < b.

Terminology. When an integral includes an unbounded interval or a function which
is not defined on the interval of integration, the integral is an improper integral. When
the value of this integral is defined, the integral is said to converge. If the integral is not
defined, then we say that it diverges or simply that it is undefined.
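Example 3. For $\int_0^1 (1/\sqrt{x})\,dx$,
\[ \int_r^1 \frac{1}{\sqrt{x}}\,dx = 2 - 2\sqrt{r} \ \text{ for } 0 < r < 1, \quad \text{and} \quad 2 - 2\sqrt{r} \to 2 \ \text{ when } r \to 0,\ r > 0. \]
Hence $\int_0^1 (1/\sqrt{x})\,dx = 2$.

Example 4. For $\int_0^{\infty} e^{-x}\,dx$,
\[ \int_0^r e^{-x}\,dx = 1 - e^{-r}, \quad \text{and} \quad 1 - e^{-r} \to 1 \ \text{ when } r \to \infty. \]
Hence $\int_0^{\infty} e^{-x}\,dx = 1$.

Example 5.
\[ \int_0^r x\,e^{-x}\,dx = \Big[-x e^{-x} - e^{-x}\Big]_0^r = 1 - e^{-r} - r e^{-r}, \quad \text{and} \quad 1 - e^{-r} - r e^{-r} \to 1 \ \text{ when } r \to \infty. \]
Hence $\int_0^{\infty} x\,e^{-x}\,dx = 1$.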
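The limit in Example 5 can be watched numerically by integrating $x e^{-x}$ over $[0, r]$ for growing r and comparing with the closed form $1 - e^{-r} - r e^{-r}$. This is a sketch only; the Simpson helper and the chosen values of r are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=20_000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def partial_integrals(f, a, upper_limits):
    """Integral of f on [a, r] for each r, to watch the limit r -> infinity."""
    return [(r, simpson(f, a, r)) for r in upper_limits]

table = partial_integrals(lambda x: x * math.exp(-x), 0, [5, 10, 20, 40])
for r, value in table:
    closed_form = 1 - math.exp(-r) - r * math.exp(-r)
    print(r, round(value, 8), round(closed_form, 8))
```

The printed values settle toward 1 as r grows, which is what convergence of the improper integral means.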

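Example 6. For $f(x) = 1/x$ on $x \ge 2$,
\[ \int_2^{\infty} f(x)\,dx = \lim_{r\to\infty}\int_2^r \frac{1}{x}\,dx = \lim_{r\to\infty}\big(\ln(r) - \ln(2)\big) = \infty. \quad \text{(diverges)} \]

Example 7. For $f(x) = 1/x^2$ on $x \ge 3$,
\[ \int_3^{\infty} f(x)\,dx = \lim_{r\to\infty}\int_3^r \frac{1}{x^2}\,dx = \lim_{r\to\infty}\left(\frac{1}{3} - \frac{1}{r}\right) = \frac{1}{3}. \]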

Example 8. Decide whether $\int_{-1}^{2} (1/x^2)\,dx$ converges.
Solution: The issue is that $1/x^2$ is not defined at x = 0 which is contained in the interval
of integration. The integral is defined, then, only if both the integral on [−1, 0] and the
integral on [0, 2] are defined. Calculating the first of these,
\[ \int_{-1}^{0} \frac{1}{x^2}\,dx = \lim_{r\to 0^-}\int_{-1}^{r} \frac{1}{x^2}\,dx
   = \lim_{r\to 0^-}\left[-\frac{1}{x}\right]_{-1}^{r} = \lim_{r\to 0^-}\left(-1 - \frac{1}{r}\right), \]
which does not exist. Hence the original integral diverges.

Example 9. The probability density function for a certain variable is $f(x) = c\,e^{-cx}$
where c is a positive constant and x is any non-negative value ($x \ge 0$). Recall that the
average value for such a variable is $\int_{-\infty}^{\infty} x f(x)\,dx$. Calculate the average value.
Solution:
\[ \int_0^{\infty} x\,c\,e^{-cx}\,dx = \lim_{b\to\infty}\int_0^b x\,c\,e^{-cx}\,dx
   = \lim_{b\to\infty}\left[-x e^{-cx} - \frac{1}{c}\,e^{-cx}\right]_0^b = \frac{1}{c}. \]

Example 10. It can be shown that $f(x) = 2/(\pi(1 + x^2))$ is a probability density
function, where x > 0. Calculate the average value of $q(x) = x/(1 + x^2)^2$ over the interval
$x \ge 1$.
Solution: the weight given to q(x) on the interval of size ∆ containing x is f(x)∆.
\[ \int_1^{\infty} \frac{x}{(1+x^2)^2}\,f(x)\,dx = \frac{2}{\pi}\int_1^{\infty} \frac{x}{(1+x^2)^3}\,dx
   = \frac{2}{\pi}\lim_{b\to\infty}\int_1^b \frac{x}{(1+x^2)^3}\,dx \]
\[ = \frac{2}{\pi}\lim_{b\to\infty}\left[\frac{-1/4}{(1+x^2)^2}\right]_1^b = \frac{1}{8\pi}. \]

Example 11. The preservation of a certain park is valued at 100,000 dollars a year.
Suppose this value is discounted at 1.5 percent (annually and continuously). Calculate the
total present value of preserving this park.
Solution:
\[ \int_0^{\infty} 100{,}000\,e^{-0.015t}\,dt = \lim_{b\to\infty}\int_0^b 100{,}000\,e^{-0.015t}\,dt \]
\[ = \lim_{b\to\infty}\left[\frac{100{,}000}{-0.015}\,e^{-0.015t}\right]_0^b = \frac{100{,}000}{0.015} \approx 6{,}666{,}666.67. \]
So preserving the park (for all future generations) is worth nearly 7 million dollars.
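The computation in Example 11 follows a pattern: a constant income stream A discounted continuously at rate r has present value $A(1 - e^{-rT})/r$ over T years, and $A/r$ over an unbounded horizon. The sketch below encodes that pattern; the function name and its `years` parameter are hypothetical, introduced only for illustration.

```python
import math

def present_value(payment, rate, years=None):
    """Present value of a constant continuous income stream.

    years=None means an unbounded horizon (the improper integral).
    """
    if rate <= 0:
        raise ValueError("the improper integral converges only for rate > 0")
    if years is None:
        return payment / rate
    return payment * (1 - math.exp(-rate * years)) / rate

# Example 11: 100,000 dollars a year discounted at 1.5 percent.
forever = present_value(100_000, 0.015)
print(round(forever, 2))

# Finite horizons approach the unbounded-horizon value from below.
for T in (50, 200, 1000):
    print(T, round(present_value(100_000, 0.015, years=T), 2))
```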

EXERCISES.
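1. For each of the following improper integrals, decide whether it converges (or fails
to converge).
\[ \text{(a)} \int_1^{\infty} \frac{1}{t^{1.1}}\,dt, \quad \text{(b)} \int_1^{\infty} \frac{1}{t^{0.98}}\,dt, \quad
   \text{(c)} \int_1^{\infty} \frac{1}{t^{1.005}}\,dt, \quad \text{(d)} \int_1^{\infty} \frac{t^2}{e^t}\,dt, \quad
   \text{(e)} \int_1^{\infty} \frac{t}{5^t}\,dt. \]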

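2. For each of the following improper integrals, decide whether it converges (or fails
to converge).
\[ \text{(a)} \int_0^1 \frac{1}{t^{1.1}}\,dt, \quad \text{(b)} \int_0^1 \frac{1}{t^{0.98}}\,dt, \quad
   \text{(c)} \int_0^1 \frac{1}{t^{1.005}}\,dt, \quad \text{(d)} \int_0^1 \frac{t^2}{1+t^3}\,dt, \quad
   \text{(e)} \int_0^1 \frac{1}{t^{0.3}}\,dt. \]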

3. The normal distribution for the stochastic variable X has the probability density
$f(x) = e^{-(x-\mu)^2/2\sigma^2}/(\sqrt{2\pi}\,\sigma)$, where x is any real number. Calculate the average value of X.
Suggestion: since f is a probability density function, you may use that $\int_{-\infty}^{\infty} f = 1$.
4. Use the probability density function $f(x) = 3e^{-3x}$, x > 0, to calculate the average
values of $g(x) = x$ and of $h(x) = x^2$ over the interval $x \ge 0$.
5. Suppose a stochastic variable X has the probability density function $f(x) =
2/(\pi(1 + x^2))$, where x > 0. Does X have an average value?
6. The preservation of a certain park is valued at 100,000 dollars a year. Suppose this
value is discounted at 3 percent (annually and continuously). Calculate the total present
value of preserving this park (forever).
7. The preservation of a certain park is valued at 100,000 dollars a year. Suppose
this value is discounted at 3 percent (annually and continuously). Calculate the present
value of preserving this park for 100 years.
8. A landmark draws 50,000 dollars a year in tourism to a certain region. Suppose this
value is discounted at 2 percent (annually and continuously). Calculate the total present
value of preserving this landmark.
9. An annuity will pay its owner 70,000 dollars a year for 20 years. The issuer of the
annuity expects funds to grow at 3.5 percent (annually and continuously). Calculate the
total cost of the annuity. Suggestion: the assumption here is that the person buying the
annuity has to pay the amount that will produce the continuous income stream that pays
for the annuity.
10. An annuity will pay its owner 70,000 dollars a year for 40 years. The issuer of
the annuity expects funds to grow at 3.5 percent (annually and continuously). Calculate
the total cost of the annuity.
11. An annuity will pay its owner 70,000 dollars a year forever (the owner is a
museum that will last indefinitely). The issuer of the annuity expects funds to grow at 3.5
percent (annually and continuously). Calculate the total cost of the annuity.
14. An Extremely Brief Introduction to
Functions of Several Variables

A. Space and Coordinates.

In two dimensional space, $\mathbb{R}^2$, we describe a location in space by means of two
coordinates, which we typically call the ex coordinate and the wye coordinate. We think of
these coordinates as representing signed distances on the x-axis and y-axis, which are two
perpendicular lines. We refer to the space of two dimensions as the plane. This is the
space in which we’ve drawn graphs of functions of one variable (using points (x, y) when
y = f(x) holds).
Symbolically, a point in $\mathbb{R}^2$ is given by a pair of numbers, (x, y). For example, (0, 0) is
the point at which the x-axis and y-axis intersect, and (1, 2.3) is the point corresponding
to moving 1 unit along the x-axis and 2.3 units along the y-axis.
In three dimensional space, $\mathbb{R}^3$, we describe a location in space by means of three
coordinates, which we typically call the ex coordinate, the wye coordinate, and the zee
coordinate. We think of these coordinates as representing signed distances on the x-axis,
y-axis, and z-axis, which are three perpendicular lines. We refer to the space of triples of
numbers as 3-space.
There are two conventions for sketching $\mathbb{R}^3$. We will typically place the y-axis
horizontally with the z-axis vertical and the x-axis “coming out of the page”. The other
conventional sketch places the z-axis vertically with the x-axis and the y-axis equally
angled to the sides.
Symbolically, a point in $\mathbb{R}^3$ is given by three numbers, (x, y, z). For example, (0, 0, 0)
is the point at which the x-axis, y-axis, and z-axis intersect, and (1, −1, 3) is the point
corresponding to moving 1 unit along the x-axis, −1 units along the y-axis (that is, moving
left along the y-axis), and moving 3 units along the z-axis.
Two points in the plane or in $\mathbb{R}^3$ are separated by a distance. For us the distance
will be the length of the line segment between the two points.
In $\mathbb{R}^2$, the distance between (x, y) and (u, v) is
\[ \text{distance in two variables:} \quad \sqrt{(u-x)^2 + (v-y)^2}. \]

In $\mathbb{R}^3$, the distance between (x, y, z) and (u, v, w) is
\[ \text{distance in three variables:} \quad \sqrt{(u-x)^2 + (v-y)^2 + (w-z)^2}. \]
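Both distance formulas are the same computation applied to two or three coordinates. A minimal sketch (the function name `distance` is chosen here for illustration):

```python
import math

def distance(p, q):
    """Length of the line segment between points p and q (same dimension)."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

print(distance((1, 2), (2, 5)))        # sqrt(1 + 9) = sqrt(10)
print(distance((1, 0, 0), (0, 2, 3)))  # sqrt(1 + 4 + 9) = sqrt(14)
```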

Example 1. Sketch the points (1, 2) and (2, 5), the line segment between these points,
and the point (−2, 1).
Solution: Notice that these points live in $\mathbb{R}^2$.

[Figure: the points (1, 2), (2, 5), and (−2, 1) in the plane, with the line segment between (1, 2) and (2, 5).]

Example 2. Sketch the points (1, 0, 0) and (0, 2, 3) and the line segment between
these points.
Solution: Notice that these points live in $\mathbb{R}^3$. We’ll describe some steps that might help
visualizing these points and line. With the y-axis horizontal and the z-axis vertical, and
with the x-axis at a $2\pi/3$ angle to the positive y direction, (0, 2, 3) would be higher and to
the right of (1, 0, 0), and the line would appear to be coming slightly into the y-z plane.

[Figure: the points (1, 0, 0) and (0, 2, 3) and the line segment between them, drawn with axes x, y, and z.]

Example 3. Sketch the surface $\{(x, y, z) \mid z = 4 - x^2 - y^2\}$.

Solution: Notice that this surface is a collection of points in $\mathbb{R}^3$. The surface intersects
the x-axis at (2, 0, 0), the y-axis at (0, 2, 0), and the z-axis at (0, 0, 4). It looks a bit like
a bowl that opens downwards. In particular, the intersection of this surface with the x-y
plane (where z = 0) is a circle of radius 2. For a fixed value of x = 0, the slice in the
y-z plane is a downward parabola. Some points that are on this surface are $(\sqrt{2}, \sqrt{2}, 0)$,
$(\sqrt{2}, 1, 1)$, (1, −1, 2), $(\sqrt{3}, 1, 0)$, (0.9, −1.5, 0.94), (0, 1, 3), (−1.2, 1.3, 0.87), (−1, 1.5, 0.75)
and (1.1, 0.8, 2.15).
The actual sketching is left to the reader – a color version using a mathematics package
on a computer is probably the most revealing.
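Before sketching, it can help to confirm that sample points really lie on the surface $z = 4 - x^2 - y^2$. The following sketch checks the points listed in Example 3; the helper name `height` is an assumption made for illustration.

```python
import math

def height(x, y):
    """z-coordinate of the surface z = 4 - x^2 - y^2 above the point (x, y)."""
    return 4 - x**2 - y**2

points = [
    (math.sqrt(2), math.sqrt(2), 0), (math.sqrt(2), 1, 1), (1, -1, 2),
    (math.sqrt(3), 1, 0), (0.9, -1.5, 0.94), (0, 1, 3),
    (-1.2, 1.3, 0.87), (-1, 1.5, 0.75), (1.1, 0.8, 2.15),
]

for x, y, z in points:
    # A small tolerance absorbs floating-point rounding.
    print((round(x, 3), y, z), math.isclose(height(x, y), z, abs_tol=1e-9))
```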

EXERCISES.
1. Which two of the following three points are closest? (1, 2, 3), (1, −2, 3), and (2, 3, 1).
2. Which of the following four points is closest to the y-z plane? (1, 2, 3), (−2, −2, 3),
(2, 2, 1), and (3, 3, 1).
3. Sketch the curve in $\mathbb{R}^2$ defined by $x^2 + y^4 = 6$.
4. Sketch the curve in $\mathbb{R}^2$ defined by $x \cdot y = 6$.
5. Describe the surface in $\mathbb{R}^3$ given by $x^2 + y^2 + z^2 = 9$ in words. (What would you
call this shape?)

B. Parameterized Paths.

Definition. A parameterized path in two dimensions is a continuous map from one
dimension to two dimensions. Symbolically, and using coordinates, a path α has
\[ \alpha : \mathbb{R} \to \mathbb{R}^2, \quad \alpha(t) = (x(t), y(t)), \quad t \text{ in } I. \]

We call t the parameter, I the values of the parameter, and x(t) and y(t) are the path’s
coordinates.

Example 1. $\alpha(t) = (t, t^2)$, with t any number, describes a parabola. With $-1 \le t \le 1$,
$\alpha(t) = (\sqrt{1 - t^2}, t)$ describes half a circle.
Notice that for any given value of the parameter, the path is “at” a point. That is,
for a given value of t the path produces a point (x(t), y(t)).
A parameterized path in three dimensions is a continuous map from one dimension to
three dimensions. Symbolically, and using coordinates, a path α has
\[ \alpha : \mathbb{R} \to \mathbb{R}^3, \quad \alpha(t) = (x(t), y(t), z(t)), \quad t \text{ in } I. \]

Example 2. The curve $\alpha(t) = (t, t^2, 3)$, with t any number, describes a parabola which
is “suspended” at a height of 3, and $\alpha(t) = (t, \sqrt{1 - t^2}, t^2)$ with $-1 \le t \le 1$ describes an
arc on the cylinder of radius 1 centered along the z-axis, whose height $z = t^2$ varies along
the arc. $\alpha(t) = (t^2, t^3, t)$ with $-2 \le t \le 2$ describes a curve through (0, 0, 0) whose
projection onto the x-y plane has a cusp at the origin.

Example 3. Parameterize the path which is a line from (1, 2, 3) to (4, 7, 7) and so
that the parameter changes by 3 over the path.
Solution: The parameter, t, must appear only with the first power to make the path a
line. Since 0 ≤ t ≤ 3 along the path, we adjust the coefficients of t accordingly. The path
is α(t) = (1 + t, 2 + (5/3)t, 3 + (4/3)t).

Example 4. Parameterize the path which is a line from (0, −2, 3) to (1, 1, 7) and so
that the parameter changes by 5 over the path. Do this in two ways and compare them.
Solution: First, let the parameter, t, have the values 0 ≤ t ≤ 5 along the path, so we
adjust the coefficients of t accordingly. The path is α(t) = (0.2t, −2 + 0.6t, 3 + 0.8t).
Second, let the parameter, s, have the values −2 ≤ s ≤ 3 along the path. The path is
now β(s) = (0.4 + 0.2s, −0.8 + 0.6s, 4.6 + 0.8s). Both these paths are lines and both have
the same coefficients for the variables. In fact, s = t − 2 gives the correspondence of the
points.
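The recipe used in Examples 3 and 4 (start at the first point and add a fraction of the displacement that grows linearly with the parameter) can be sketched as follows. The function name `line_path` and its arguments are hypothetical, introduced only for illustration.

```python
import math

def line_path(start, end, t0, t1):
    """Return a path alpha, linear in t, with alpha(t0) = start and alpha(t1) = end."""
    if t0 == t1:
        raise ValueError("the parameter must change over the path")
    def alpha(t):
        u = (t - t0) / (t1 - t0)  # fraction of the segment covered at parameter t
        return tuple(a + u * (b - a) for a, b in zip(start, end))
    return alpha

# Example 4: from (0, -2, 3) to (1, 1, 7), parameter changing by 5, in two ways.
alpha = line_path((0, -2, 3), (1, 1, 7), 0, 5)   # 0 <= t <= 5
beta = line_path((0, -2, 3), (1, 1, 7), -2, 3)   # -2 <= s <= 3

print(alpha(2.5))        # matches (0.2 t, -2 + 0.6 t, 3 + 0.8 t) at t = 2.5
print(beta(2.5 - 2))     # same point, since s = t - 2
```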

EXERCISES.
1. Sketch the curve $\alpha(t) = (t^5, t^2)$ with $-1 \le t \le 1$.
2. Sketch the curve α(t) = (3 + 2t, 4 − 3t) with −1 ≤ t ≤ 2.
3. Sketch the curve α(t) = (t, −t) with −2 ≤ t ≤ 2.
4. Parameterize the path which is a line from (0, 2, 3) to (3, 1, 7) and so that the
parameter changes by 1 over the path.
5. Parameterize the path which is a line from (0, 2, 3) to (3, 1, 7) and so that the
parameter changes by 3 over the path.
6. Parameterize the path which is a line from (0, 2, 3) to (1, −1, 5) and so that the
parameter changes by 1 over the path.

7. Parameterize the path which is a line from (0, 1, −2) to (3, 1, 6) and so that the
parameter changes by 2 over the path.

C. Functions.

A function on $\mathbb{R}^2$ or on $\mathbb{R}^3$ assigns a (real) value to each point in its domain.

Example 1. $f(x, y) = x^2 + y^2$, $f(x, y) = xy$, $f(x, y) = 8.23$, $f(x, y) = 3x^2 - y^2$,
$f(x, y) = (x^2 - y)^5$, $g(x, y, z) = xy - z$, $g(x, y, z) = x + z^2$, $g(x, y, z) = x^2 + y^2 - z^3$, and
$g(x, y, z) = x + y^2 + z^3$ are all functions of several variables.
Definition. For a function f = f(x, y) on $\mathbb{R}^2$, its graph is the collection of all points
in 3-space with z = f(x, y). That is, the collection is all points (x, y, z) with (x, y) in the
domain of f and with z = f(x, y).

Example 2. The graph of $f(x, y) = x^2 + y^2$ looks like a bowl with its lowest point at
(0, 0, 0). The graph of $f(x, y) = xy$ looks like a mountain pass, with the saddle at (0, 0, 0),
the valley along the line y = −x, and the mountain peaks rising along y = x.

Example 3. The graph of $f(x, y) = -x^2 - y^2$ looks like an upside-down bowl with its
highest point at (0, 0, 0). The graph of $f(x, y) = -x^4 - y^4$ looks very similar, only flatter
near (0, 0, 0) and steeper further away from the origin, (0, 0, 0).

Example 4. The graph of f (x, y) = x+3y is a plane. The graph of h(x, y) = 4+x+3y
is a plane parallel to the graph of f . The graph of g(x, y) = 2x + 7y is a plane whose
height changes more rapidly.
Definition. For a function f = f(x, y) on $\mathbb{R}^2$, its level set through (a, b) is the
collection of all points in 2-space with f(x, y) = f(a, b). That is, the collection is all points
(x, y) with (x, y) in the domain of f and with f(x, y) = f(a, b).

Example 5. The level set of $f(x, y) = x^2 + y^2$ through (1, 2) is the circle $x^2 + y^2 =
1^2 + 2^2 = 5$. The level set of f through (−3, 4) is the circle $x^2 + y^2 = 25$.

Example 6. The level set of f (x, y) = x y through (1, 1) is the curve xy = 1 with x
and y both positive, or y = 1/x with x > 0.

Example 7. The level set of $f(x, y) = x^2 - y^2$ through (2, 1) is the hyperbola
$x^2 - y^2 = 3$. It contains the points $(\sqrt{3}, 0)$, (2, −1), $(5, \sqrt{22})$, $(5, -\sqrt{22})$, $(7, \sqrt{46})$, and, of
course, many others.

Example 8. The level sets of f (x, y) = x + 3y are parallel lines in the x-y plane.
They all have the slope −1/3 (with y as a function of x).
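Membership in a level set is a one-line test: a point (x, y) is on the level set of f through (a, b) exactly when f(x, y) = f(a, b). The sketch below applies this to the circle of Example 5, to the hyperbola $x^2 - y^2 = 3$ (treated here as a level set of $g(x, y) = x^2 - y^2$), and to a line from Example 8. The helper name and the tolerance are assumptions made for illustration.

```python
import math

def on_level_set(f, base, point, tol=1e-9):
    """True if `point` lies on the level set of f through `base`."""
    return math.isclose(f(*point), f(*base), abs_tol=tol)

circle = lambda x, y: x**2 + y**2
hyperbola = lambda x, y: x**2 - y**2
plane = lambda x, y: x + 3 * y

print(on_level_set(circle, (1, 2), (2, 1)))                   # both give 5
print(on_level_set(circle, (1, 2), (-3, 4)))                  # 25 is a different level
print(on_level_set(hyperbola, (2, 1), (5, math.sqrt(22))))    # 25 - 22 = 3
print(on_level_set(plane, (0, 1), (3, 0)))                    # same line x + 3y = 3
```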

EXERCISES.
1. What is the domain of $f(x, y) = \sqrt{x^2 + y^2}$ ?
2. What is the domain of $f(x, y) = 1/(x^2 + y^2)$ ?
3. What is the domain of $g(x, y, z) = \sqrt{z^2 - x^2 + y^2}$ ?
4. Describe and compare the graphs of $f(x, y) = \sqrt{x^2 + y^2}$ and $g(x, y) = x^2 + y^2$.
5. Describe and compare the graphs of $f(x, y) = x^2 - y^2$ and $g(x, y) = 2xy$.
6. Describe and compare the graphs of $f(x, y) = x^2 + y^2$ and $g(x, y) = x^6 + y^6$.
7. Give a formula for the intersection of the graph of $f(x, y) = x \cdot y^3$ and the planes
x = 1, x = 3, and x = −2.
8. Give a formula for the intersection of the graph of $f(x, y) = x \cdot y^3$ and the planes
y = 1, y = 3, and y = −2.
9. Does $x^2 + y^4 + z^2 = 3$ agree with the graph of a function of two variables?
10. Describe and sketch the level set of $f(x, y) = xy$ through (2, 3).
11. Describe and sketch the level set of $f(x, y) = x^2$ through (2, 3).
12. Describe and sketch the level set of $f(x, y) = x^4 + y^4$ through (1, 1).
13. Describe and sketch the level set of $f(x, y) = x^2 + xy + y^2$ through (2, −2).
14. Describe and sketch the level set of $g(x, y) = x^2 - y^2$ through (3, 0).

D. Velocity Vectors.

To any path and any given point on the path we can associate a velocity. Intuitively,
we think that the path describes the position of an object at any given time, and its
velocity at any given time describes the speed and direction in which the object is moving.
One can think of velocity vectors in two ways. The easiest approach uses displacement
vectors in coordinates, and we describe it first. The second approach calculates the change
in the values of a function along the path, and we consider this action of a path second.
To make a long story short, the velocity vector $\vec{v}$ corresponding to the path $\alpha(t) =
(x(t), y(t), z(t))$ for t = s has coordinates
\[ \vec{v} = \left(\frac{d}{dt}x(s),\ \frac{d}{dt}y(s),\ \frac{d}{dt}z(s)\right). \]

Example 1. The path $\alpha(t) = (1 + 3t, t^2, 2 + t^5)$ has the velocity vector $\vec{v} = (3, 2s, 5s^4)$
when t = s. To continue this example, if s = 1 then the path passes through the point
(4, 1, 3) and has the velocity $\vec{v} = (3, 2, 5)$ there.
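The velocity vector of Example 1 can be approximated without symbolic differentiation by differencing the path coordinate by coordinate. This is a sketch assuming a central-difference approximation; the function name `velocity` and the step size are choices made here for illustration.

```python
def velocity(path, s, h=1e-6):
    """Approximate velocity vector of `path` at t = s using central differences."""
    ahead = path(s + h)
    behind = path(s - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

# Path from Example 1; its exact velocity at t = s is (3, 2s, 5s^4).
alpha = lambda t: (1 + 3 * t, t**2, 2 + t**5)

print(alpha(1))             # the point (4, 1, 3)
print(velocity(alpha, 1))   # approximately (3, 2, 5)
```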

Coordinates for Vectors.

We can (and do) declare versions of $\mathbb{R}^2$ and $\mathbb{R}^3$ that we call vectors.

In the vector version of $\mathbb{R}^2$ we have coordinates $\hat{\imath}$ and $\hat{\jmath}$ and for numbers a and b we
write the pair of numbers as
\[ a\hat{\imath} + b\hat{\jmath}. \]

In the vector version of $\mathbb{R}^3$ we have coordinates $\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$ and for numbers a, b, and
c we write the triple of numbers as
\[ a\hat{\imath} + b\hat{\jmath} + c\hat{k}. \]

We think of these as displacement vectors, in that we can start at a base-point and
move according to the vector. Starting at (x, y, z) and displacing with the vector $a\hat{\imath} + b\hat{\jmath} + c\hat{k}$
we end at the point (x + a, y + b, z + c).
Definition. The length of a vector $\vec{v} = a\hat{\imath} + b\hat{\jmath} + c\hat{k}$ is
\[ \|\vec{v}\| = \sqrt{a^2 + b^2 + c^2}. \]

It is easy to see that what we have done in defining a displacement vector is moved
along a very special path, namely we start with t = 0 along α(t) = (x + a · t, y + b · t, z + c · t)
and end at α(1) when t = 1. In other words, a displacement vector describes motion along
a default path consisting of a straight line for a duration of 1. The length of the vector
is the distance traveled along this path.
Alternatively, for any two points we can think of the displacement vector that repre-
sents the difference between them. This requires that we designate one of the two points as
a starting point and the second as an end-point. The resulting difference has coordinates
that are the differences along each of the 2 or 3 coordinates.
In coordinates, the displacement vector from (x, y, z) to (u, v, w) has the change u − x
in the first coordinate, the change v − y in the second coordinate, and the change w − z in
the third coordinate. And so we write this vector as
\[ (u - x)\hat{\imath} + (v - y)\hat{\jmath} + (w - z)\hat{k}. \]

Notice that in this last formulation we have forgotten the starting point and retained
only the information about the motion (or displacement).
We can also “remember” the forgotten information in a useful way. For two displacement
vectors
\[ \vec{v} = v_1\hat{\imath} + v_2\hat{\jmath} + v_3\hat{k}, \quad \text{and} \quad \vec{w} = w_1\hat{\imath} + w_2\hat{\jmath} + w_3\hat{k}, \]
we can think of the starting point for $\vec{w}$ as being at the end-point of $\vec{v}$. This will give
a new displacement going from the starting point of $\vec{v}$ to the end-point of $\vec{w}$. The new
displacement is the sum of the two vectors:
\[ \vec{v} + \vec{w} = (v_1 + w_1)\hat{\imath} + (v_2 + w_2)\hat{\jmath} + (v_3 + w_3)\hat{k}. \]

We can also think of this way of adding two displacement vectors as combining the
motion described by the corresponding velocity vectors. For instance, if a plane is flying
with an air velocity of $\vec{v}$ and it is subject to a wind with velocity $\vec{w}$, then the resulting
combined velocity is $\vec{v} + \vec{w}$. Suppose the paths start at zero and the default path
corresponding to $\vec{v}$ is α(t), 0 ≤ t ≤ 1, and the default path corresponding to $\vec{w}$ is β(t), then the
path corresponding to $\vec{v} + \vec{w}$ is α(t) + β(t), 0 ≤ t ≤ 1.
In addition to addition, we can also scale the displacement – which we think of as
either shrinking or expanding the displacement and either keeping or reversing its direction.
For a vector $\vec{v} = v_1\hat{\imath} + v_2\hat{\jmath} + v_3\hat{k}$ we can scale it by a number s to get
\[ s\vec{v} = (s \cdot v_1)\hat{\imath} + (s \cdot v_2)\hat{\jmath} + (s \cdot v_3)\hat{k}. \]

If all this motivation is too complicated, we can also declare that the vector version
of $\mathbb{R}^3$ consists of triples of numbers called vectors with addition and multiplication by
numbers defined using coordinates: for $\vec{v} = v_1\hat{\imath} + v_2\hat{\jmath} + v_3\hat{k}$, $\vec{w} = w_1\hat{\imath} + w_2\hat{\jmath} + w_3\hat{k}$, and a
number s,
\[ \vec{v} + \vec{w} = (v_1 + w_1)\hat{\imath} + (v_2 + w_2)\hat{\jmath} + (v_3 + w_3)\hat{k}, \]
\[ \text{and} \quad s\vec{v} = (s \cdot v_1)\hat{\imath} + (s \cdot v_2)\hat{\jmath} + (s \cdot v_3)\hat{k}. \]

Dot product and angles.

Definition. The dot product of a vector $\vec{v} = v_1\hat{\imath} + v_2\hat{\jmath} + v_3\hat{k}$ and a vector $\vec{w} = w_1\hat{\imath} + w_2\hat{\jmath} + w_3\hat{k}$ is the number
$$\vec{v} \cdot \vec{w} = v_1 w_1 + v_2 w_2 + v_3 w_3.$$

There is a geometric meaning to the dot product. Namely

$$\vec{v} \cdot \vec{w} = \|\vec{v}\| \cdot \|\vec{w}\| \cdot \cos(A),$$

where $A$ is the angle between the two vectors. We have not defined the cosine function in this book. If you have not seen this function, then this geometric meaning can be understood as giving the projection of $\vec{v}$ on the vector $\vec{w}$, scaled by the length of $\vec{w}$.
[Figure: plot in the plane showing a vector labeled W.]

The dot product has a geometric meaning, and is thus used extensively to visualize
the relations of curves and surfaces. One of the useful features is that the dot product of
non-zero vectors is zero exactly when they are perpendicular. We will not use the geometric
properties of the dot product here, but the reader should be aware of the potential to do
so.

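Example 2. Calculate the length of $\vec{v}$, the length of $\vec{w}$, and the dot product of $\vec{v}$ and $\vec{w}$, where $\vec{v} = 1\hat{\imath} - 2\hat{\jmath} + 3\hat{k}$ and $\vec{w} = 1\hat{\imath} + 2\hat{\jmath} + 1\hat{k}$.
Solution:
$$\|\vec{v}\| = \sqrt{1 + 4 + 9} = \sqrt{14}, \qquad \|\vec{w}\| = \sqrt{1 + 4 + 1} = \sqrt{6}, \qquad \vec{v} \cdot \vec{w} = 1 - 4 + 3 = 0.$$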
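The calculation in Example 2 can be checked with a few lines of code. The following is only a minimal sketch; the function names `dot` and `length` are chosen here for illustration and do not come from the text.

```python
import math

def dot(v, w):
    """Dot product: the sum of the products of matching coordinates."""
    return sum(vi * wi for vi, wi in zip(v, w))

def length(v):
    """Length of a vector: the square root of its dot product with itself."""
    return math.sqrt(dot(v, v))

v = (1, -2, 3)   # the vector 1i - 2j + 3k from Example 2
w = (1, 2, 1)    # the vector 1i + 2j + 1k from Example 2

print(length(v))   # square root of 14
print(length(w))   # square root of 6
print(dot(v, w))   # 0, so the two vectors are perpendicular
```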

EXERCISES.
1. Calculate the velocity vector for the path $\alpha(t) = (t^2, t^3, t^4)$ at $t = 0$ and at $t = 1$.
2. Calculate the velocity vector for the path $\alpha(t) = (2t, 3t, 4t)$ at $t = 0$ and at $t = 1$.
3. Find a path $\beta(t)$ in $\mathbb{R}^2$ so that $\beta(0) = (0, 0)$ and the velocity vector of $\beta$ at $t = 0$ is $\vec{v} = 1\hat{\imath} - 2\hat{\jmath}$.
4. Find a path $\beta(t)$ in $\mathbb{R}^2$ so that $\beta(0) = (3, 7)$ and the velocity vector of $\beta$ at $t = 0$ is $\vec{v} = 1\hat{\imath} - 2\hat{\jmath}$.
5. Find a path $\alpha(t)$ in $\mathbb{R}^3$ so that $\alpha(0) = (3, 4, 7)$ and so that the velocity vector of $\alpha$ at $t = 0$ is $\vec{v} = 1\hat{\imath} - 2\hat{\jmath} + 3\hat{k}$.
6. Calculate the length of the vector $\vec{v} = 2\hat{\imath} - 2\hat{\jmath}$.
7. Calculate the length of the vector $\vec{v} = 2\hat{\imath} - 2\hat{\jmath} + 1\hat{k}$.
8. Calculate the velocity vector for the path $\alpha(t) = (2 + t, 1 + 3t, -3 + 4t)$ at $t = 0$. Which point does the path pass through at $t = 0$?
9. Calculate the velocity vector for the path $\alpha(t) = (2t, 3t, 4t)$ at $t = 1$. Which point does the path pass through at $t = 1$?
10. Find a path $\beta(t)$ in $\mathbb{R}^2$ so that the velocity vector of $\beta$ at $t = 0$ is $\vec{v} = -2\hat{\imath} + 5\hat{\jmath}$.
11. Find a path $\beta(t)$ in $\mathbb{R}^3$ so that $\beta(0) = (-3, 4, 1)$ and so that the velocity vector of $\beta$ at $t = 0$ is $\vec{v} = 0.7\hat{\imath} - 2\hat{\jmath} + 1.2\hat{k}$.
12. Find a path $\alpha(t)$ in $\mathbb{R}^3$ so that $\alpha(0) = (3, 4, 7)$ and so that the velocity vector of $\alpha$ at $t = 0$ is $\vec{v} = 2\hat{\imath} - 2\hat{\jmath} + 2.3\hat{k}$.
13. Calculate the length of the vector $\vec{v} = 3\hat{\imath} - 5\hat{\jmath}$.
14. Calculate the length of the vector $\vec{v} = -2\hat{\imath} - 3\hat{\jmath} - 1\hat{k}$.

E. Vectors and Functions.

We finally get to understanding how vectors represent changes in the variables and
how these changes lead to a change in a function.

Partial derivatives.

Definition. When the limit below is defined, the derivative of a function $f(x, y, z)$ with respect to the vector $\vec{v} = v_1\hat{\imath} + v_2\hat{\jmath} + v_3\hat{k}$ at the point $(a, b, c)$ is

$$\frac{\partial f}{\partial \vec{v}}(a, b, c) = \lim_{t \to 0} \frac{f(a + t \cdot v_1,\, b + t \cdot v_2,\, c + t \cdot v_3) - f(a, b, c)}{t}.$$

This derivative is also called the partial derivative of $f(x, y, z)$ with respect to $\vec{v}$ at $(a, b, c)$. The word “partial” is used because the change in the variables is only in the direction of $\vec{v}$.

If we think of $\vec{v}$ as a velocity vector then we are thinking of the way in which $f$ changes along the path $\alpha(t)$ that passes through $(a, b, c)$ at $t = 0$ with velocity $\vec{v}$. If we think of $\vec{v}$ as a displacement vector then we are thinking of the limit of the quotient of the change in $f$ for the displacement $t \cdot \vec{v}$ starting at $(a, b, c)$ relative to the change in $t$.

Example 1. For $f(x, y, z) = x^2 + 3y + z$ and $\vec{v} = 1\hat{\imath} + 2\hat{\jmath} + 3\hat{k}$ at $(1, 0, 2)$

$$\frac{\partial f}{\partial \vec{v}}(1, 0, 2) = \lim_{t \to 0} \frac{(1 + t)^2 + 3(0 + 2t) + (2 + 3t) - 3}{t} = \lim_{t \to 0} \frac{2t + t^2 + 6t + 3t}{t} = 11.$$

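Example 2. For $f(x, y) = x^2 + y^3$ and $\vec{v} = 1\hat{\imath} + 2\hat{\jmath}$ at $(1, -1)$

$$\frac{\partial f}{\partial \vec{v}}(1, -1) = \lim_{t \to 0} \frac{(1 + t)^2 + (-1 + 2t)^3 - 0}{t} = \lim_{t \to 0} \frac{2t + t^2 + 3 \cdot 2t + 3 \cdot (-1) \cdot 4t^2 + 8t^3}{t} = 2 + 6 = 8.$$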

Example 3. For $f(x, y) = xy$ and $\vec{v} = 1\hat{\imath} + 2\hat{\jmath}$ at $(3, 2)$

$$\frac{\partial f}{\partial \vec{v}}(3, 2) = \lim_{t \to 0} \frac{(3 + t)(2 + 2t) - 6}{t} = \lim_{t \to 0} \frac{8t + 2t^2}{t} = 8.$$

Example 4. For $f(x, y, z) = x - 3y + 2z$ and $\vec{v} = 1\hat{\imath} - 5\hat{\jmath} + 7\hat{k}$ at $(1, 3, 2)$

$$\frac{\partial f}{\partial \vec{v}}(1, 3, 2) = \lim_{t \to 0} \frac{(1 + t) - 3(3 - 5t) + 2(2 + 7t) - (-4)}{t} = \lim_{t \to 0} \frac{t + 15t + 14t}{t} = 30.$$
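
The limits in the examples above can be explored numerically by evaluating the quotient for smaller and smaller values of $t$. This is a minimal sketch, not a substitute for the algebra; the helper name `directional_quotient` is introduced here only for illustration.

```python
def directional_quotient(f, point, v, t):
    """Quotient (f(point + t*v) - f(point)) / t for a single nonzero t."""
    moved = [p + t * vi for p, vi in zip(point, v)]
    return (f(*moved) - f(*point)) / t

def f1(x, y, z):
    # Function from Example 1
    return x**2 + 3*y + z

def f3(x, y):
    # Function from Example 3
    return x * y

# As t shrinks, the quotients approach the limits 11 and 8 found above.
for t in (0.1, 0.01, 0.001):
    q1 = directional_quotient(f1, (1, 0, 2), (1, 2, 3), t)
    q3 = directional_quotient(f3, (3, 2), (1, 2), t)
    print(t, q1, q3)
```

For these two functions the quotient can be simplified by hand to $11 + t$ and $8 + 2t$, which is why the printed values approach 11 and 8.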
A special case of partial differentiation occurs when the vector is the unit vector along a coordinate. The partial derivative with respect to $\hat{\imath}$ is called the partial derivative with respect to $x$, the partial derivative with respect to $\hat{\jmath}$ is called the partial derivative with respect to $y$, and the partial derivative with respect to $\hat{k}$ is called the partial derivative with respect to $z$. Symbolically,

$$\frac{\partial f}{\partial x}(a, b, c) = \frac{\partial f}{\partial \hat{\imath}}(a, b, c), \qquad \frac{\partial f}{\partial y}(a, b, c) = \frac{\partial f}{\partial \hat{\jmath}}(a, b, c), \qquad \frac{\partial f}{\partial z}(a, b, c) = \frac{\partial f}{\partial \hat{k}}(a, b, c).$$

These partial derivatives are relatively easy to calculate symbolically. To do this, we think
of the variable that is not changing as a constant and use the symbolic differentiation rules
that we know.

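Example 5. For $f(x, y) = xy$

$$\frac{\partial f}{\partial x}(x, y) = y, \qquad \frac{\partial f}{\partial y}(x, y) = x.$$

Example 6. For $f(x, y) = x^2 + 7xy + y^3$

$$\frac{\partial f}{\partial x}(x, y) = 2x + 7y, \qquad \frac{\partial f}{\partial y}(x, y) = 7x + 3y^2.$$

To continue this example, at $(1, -1)$

$$\frac{\partial f}{\partial x}(1, -1) = -5, \qquad \frac{\partial f}{\partial y}(1, -1) = 10.$$

Example 7. For $f(x, y, z) = x^5 + xy + y^3 + xyz$

$$\frac{\partial f}{\partial x}(x, y, z) = 5x^4 + y + yz, \qquad \frac{\partial f}{\partial y}(x, y, z) = x + 3y^2 + xz, \qquad \frac{\partial f}{\partial z}(x, y, z) = xy.$$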
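
A symbolic answer can be sanity-checked by nudging one variable while holding the others fixed. The sketch below uses a central difference, which is an approximation technique assumed here and not a method introduced in the text; the name `partial` and the step `h` are illustrative choices.

```python
def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to the coordinate at position `index`."""
    up = list(point)
    down = list(point)
    up[index] += h      # nudge only one coordinate up ...
    down[index] -= h    # ... and down; the others stay constant
    return (f(*up) - f(*down)) / (2 * h)

def f6(x, y):
    # Function from Example 6
    return x**2 + 7*x*y + y**3

fx = partial(f6, (1, -1), 0)   # estimate of the x-partial at (1, -1)
fy = partial(f6, (1, -1), 1)   # estimate of the y-partial at (1, -1)
print(fx, fy)                  # close to -5 and 10
```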

The chain rule.

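Supposing that the function is sufficiently nice, there is a relationship between the partial derivative with respect to a vector $\vec{v} = v_1\hat{\imath} + v_2\hat{\jmath} + v_3\hat{k}$ and the partial derivatives with respect to the coordinates. Namely,

$$\frac{\partial f}{\partial \vec{v}}(a, b, c) = \frac{\partial f}{\partial x}(a, b, c)\, v_1 + \frac{\partial f}{\partial y}(a, b, c)\, v_2 + \frac{\partial f}{\partial z}(a, b, c)\, v_3.$$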
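
As a quick check of this relationship, the sketch below recomputes the answers of Examples 1, 3, and 4 from the coordinate partial derivatives, which were worked out by hand for this illustration. The function name `derivative_along` is hypothetical.

```python
def derivative_along(partials, v):
    """Combine coordinate partial derivatives with the components of v."""
    return sum(p * vi for p, vi in zip(partials, v))

# Example 1: f = x^2 + 3y + z has partials (2x, 3, 1), which is (2, 3, 1) at (1, 0, 2).
ex1 = derivative_along((2, 3, 1), (1, 2, 3))
# Example 3: f = xy has partials (y, x), which is (2, 3) at (3, 2).
ex3 = derivative_along((2, 3), (1, 2))
# Example 4: f = x - 3y + 2z has constant partials (1, -3, 2).
ex4 = derivative_along((1, -3, 2), (1, -5, 7))

print(ex1, ex3, ex4)   # 11 8 30, matching the limits computed earlier
```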

To examine this connection between partial derivatives with respect to the coordinates and a partial derivative with respect to a vector, consider the linear approximation for $f(x, y)$. If the function is linear then

$$f(x, y) = f(a, b) + c_1 (x - a) + c_2 (y - b).$$

In that case,

$$\frac{\partial f}{\partial \vec{v}}(a, b) = c_1 \cdot v_1 + c_2 \cdot v_2 \quad\text{and}\quad \frac{\partial f}{\partial x}(a, b) = c_1, \quad \frac{\partial f}{\partial y}(a, b) = c_2.$$

Definition. A function $f(x, y, z)$ is differentiable at $(a, b, c)$ when there is a linear function $L(x, y, z) = c_1 (x - a) + c_2 (y - b) + c_3 (z - c)$ so that, with $\vec{w} = (x - a)\hat{\imath} + (y - b)\hat{\jmath} + (z - c)\hat{k}$,
$$\lim_{\|\vec{w}\| \to 0} \frac{f(x, y, z) - f(a, b, c) - L(x, y, z)}{\|\vec{w}\|} = 0.$$

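Now suppose that the variables $x$ and $y$ depend on some third variable, $t$, so that $x = x(t)$ and $y = y(t)$. At what rate does $f$ change with $t$?
We could think of $(x(t), y(t))$ as a description of a path. The vector tangent to this path at $t$ is

$$\vec{v} = \frac{dx}{dt}(t)\,\hat{\imath} + \frac{dy}{dt}(t)\,\hat{\jmath}$$

and so $df/dt$ is the derivative $\partial f/\partial \vec{v}$ that we already calculated:

$$\frac{df}{dt}(t) = \frac{\partial f}{\partial \vec{v}} = \frac{\partial f}{\partial x}(x(t), y(t))\, \frac{dx}{dt}(t) + \frac{\partial f}{\partial y}(x(t), y(t))\, \frac{dy}{dt}(t).$$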

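Now suppose that the variables $x$ and $y$ depend on two other variables, $s$ and $t$, so that $x = x(s, t)$ and $y = y(s, t)$. At what rate does $f$ change with, say, $s$?
As $s$ changes and $t$ does not, $(x(s, t), y(s, t))$ describes a path in the plane, and this path has the tangent vector

$$\vec{v} = \frac{\partial x}{\partial s}(s, t)\,\hat{\imath} + \frac{\partial y}{\partial s}(s, t)\,\hat{\jmath}$$

and so $\partial f/\partial s$ is $\partial f/\partial \vec{v}$ for this new vector:

$$\frac{\partial f}{\partial s}(s, t) = \frac{\partial f}{\partial \vec{v}} = \frac{\partial f}{\partial x}(x(s, t), y(s, t))\, \frac{\partial x}{\partial s}(s, t) + \frac{\partial f}{\partial y}(x(s, t), y(s, t))\, \frac{\partial y}{\partial s}(s, t).$$

Similarly,

$$\frac{\partial f}{\partial t}(s, t) = \frac{\partial f}{\partial x}(x(s, t), y(s, t))\, \frac{\partial x}{\partial t}(s, t) + \frac{\partial f}{\partial y}(x(s, t), y(s, t))\, \frac{\partial y}{\partial t}(s, t).$$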

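Example 8. Set $f(x, y) = x^5 + xy + y^3$. Let $x = x(s, t) = s^2 - t$, and $y = y(s, t) = 2s + t^2$. Calculate $\partial f/\partial s$ when $(s, t) = (2, 1)$.
First note that $x(2, 1) = 3$, and $y(2, 1) = 5$. Then

$$\frac{\partial f}{\partial x}(3, 5) = 410, \qquad \frac{\partial f}{\partial y}(3, 5) = 78, \qquad \frac{\partial x}{\partial s}(2, 1) = 4, \qquad \frac{\partial y}{\partial s}(2, 1) = 2,$$

$$\frac{\partial f}{\partial s}(2, 1) = 410 \times 4 + 78 \times 2 = 1796.$$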
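
The chain rule answer in Example 8 can be compared against a direct numerical estimate obtained by changing $s$ slightly in the composite function. This is a sketch under the assumption that a central difference with a small step is accurate enough here; all names are illustrative.

```python
def f(x, y):
    return x**5 + x*y + y**3

def x_of(s, t):
    return s**2 - t

def y_of(s, t):
    return 2*s + t**2

s, t = 2, 1
x, y = x_of(s, t), y_of(s, t)      # the point (3, 5)

# Chain rule: partials of f at (x, y) times partials of x and y in s.
df_dx = 5*x**4 + y                 # 410
df_dy = x + 3*y**2                 # 78
dx_ds = 2*s                        # 4
dy_ds = 2                          # 2
chain = df_dx*dx_ds + df_dy*dy_ds  # 1796

# Direct estimate: change s a little while t stays fixed.
def composite(s, t):
    return f(x_of(s, t), y_of(s, t))

h = 1e-6
numeric = (composite(s + h, t) - composite(s - h, t)) / (2*h)
print(chain, numeric)
```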
∂s

The differential: total derivative.

When a function $f = f(x, y, z)$ is differentiable it is useful to summarize how it changes with all the variables. Our notation uses $df$ to represent the differential of $f$. We use $dx$ to capture the portion of the total derivative due to $x$, $dy$ to capture the portion of the total derivative due to $y$, and $dz$ to capture the portion of the total derivative due to $z$. Therefore,
$$df_{(a,b,c)} = \frac{\partial f}{\partial x}(a, b, c)\, dx + \frac{\partial f}{\partial y}(a, b, c)\, dy + \frac{\partial f}{\partial z}(a, b, c)\, dz.$$

Example 9. Set $f(x, y, z) = x^2 z + xy + y^3 z + 7z$. Calculate $df$ at $(2, 3, 1)$. Use this differential to approximate the change in $f$ when the point changes to $(2.2, 2.9, 1.1)$.
At $(2, 3, 1)$, $\partial f/\partial x = 2 \cdot 2 \cdot 1 + 3 = 7$, $\partial f/\partial y = 2 + 3 \cdot 3^2 \cdot 1 = 29$, and $\partial f/\partial z = 2^2 + 3^3 + 7 = 38$. So $df = 7\,dx + 29\,dy + 38\,dz$. The change in $x$ is $dx \approx 2.2 - 2 = 0.2$, the change in $y$ is $dy \approx 2.9 - 3 = -0.1$, and the change in $z$ is $dz \approx 1.1 - 1 = 0.1$. Thus the change in $f$ is approximately $df \approx 7 \cdot 0.2 + 29 \cdot (-0.1) + 38 \cdot 0.1 = 2.3$.
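
To see how good the approximation in Example 9 is, one can compare it with the actual change in $f$. The sketch below does this; the variable names are illustrative only.

```python
def f(x, y, z):
    return x**2*z + x*y + y**3*z + 7*z

partials = (7, 29, 38)                # partial derivatives at (2, 3, 1), from Example 9
changes = (2.2 - 2, 2.9 - 3, 1.1 - 1) # dx, dy, dz

# Differential: each partial derivative times the change in its variable.
estimate = sum(p * d for p, d in zip(partials, changes))

# Actual change in the value of f between the two points.
actual = f(2.2, 2.9, 1.1) - f(2, 3, 1)

print(estimate, actual)   # about 2.3 and about 2.23
```

The two numbers differ slightly because the differential ignores the curvature of $f$; the gap shrinks as the changes in the variables get smaller.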

There is a geometric version of the differential, called the gradient vector. Geomet-
rically, the gradient points in the direction in which the function increases most rapidly,
and the size of the gradient is the rate at which the function changes in this direction.

EXERCISES.
1. Use the quotient definition of the derivative to calculate $\partial f/\partial \vec{v}$ at $(2, 1)$, where $f(x, y) = x^2 + 3xy$ and $\vec{v} = 2\hat{\imath} - 1\hat{\jmath}$.
2. Use the quotient definition of the derivative to calculate $\partial f/\partial \vec{v}$ at $(3, -1)$, where $f(x, y) = x^2 + 5y$ and $\vec{v} = 2\hat{\imath} + 4\hat{\jmath}$.
3. For $(x, y) \ne (0, 0)$, set $f(x, y) = (x^2 y)/(x^2 + y^4)$ and let $f(0, 0) = 0$. Use the quotient definition of the derivative to calculate $\partial f/\partial \vec{v}$ at $(0, 0)$, where (a) $\vec{v} = \hat{\imath}$, (b) $\vec{v} = \hat{\jmath}$, (c) $\vec{v} = \hat{\imath} + \hat{\jmath}$. (d) Is this function differentiable at $(0, 0)$?
4. Use symbolic differentiation to calculate $\partial f/\partial \vec{v}$ at $(2, 1, 0)$, where $f(x, y, z) = x^2 + 3xy + z^7$ and $\vec{v} = -1\hat{\imath} + 2\hat{\jmath} + 3\hat{k}$.
5. Use symbolic differentiation to calculate $\partial f/\partial \vec{v}$ at $(2, 1, 0)$, where $f(x, y, z) = x^2 + 3xy + z^7$ and $\vec{v} = 2\hat{\imath} + 1\hat{\jmath} - 3\hat{k}$.
6. Set $f(x, y) = x^5 + xy + y^3$. Let $x = x(t) = t^4$, and $y = y(t) = t^2$. Calculate $df/dt$ when $t = 1$.
7. Set $f(x, y) = e^x + xy + 1/y$. Let $x = x(s, t) = 2s - t$, and $y = y(s, t) = -s + t$. Calculate $\partial f/\partial s$ when $(s, t) = (0, 1)$.
8. Set $f(x, y) = e^x + xy + 1/y$. Let $x = x(s, t) = 2s - t$, and $y = y(s, t) = -s + t$. Calculate $\partial f/\partial t$ when $(s, t) = (0, 1)$.
9. Set $f(x, y) = x^2 + 4xy + y^2$. Calculate $\partial f/\partial \vec{v}$ at $(1, 2)$, where $\vec{v} = 1\hat{\imath} + 3\hat{\jmath}$.
10. Set $f(x, y) = x^2 + 4xy + y^2$. Calculate $\partial f/\partial \vec{v}$ at $(1, 2)$, where $\vec{v} = 1\hat{\imath} + 3\hat{\jmath}$.
11. Set $f(x, y) = x^2 + 4xy + y^2$. Calculate $\partial f/\partial \vec{v}$ at $(1, 2)$, where $\vec{v} = -3\hat{\imath} + 5\hat{\jmath}$.
12. Set $f(x, y) = x^2 + 4xy + y^2$. Calculate $\partial f/\partial \vec{v}$ at $(1, 2)$, where $\vec{v} = 3\hat{\imath} + 1\hat{\jmath}$.
13. Set $f(x, y) = x^5 + xy + y^3$. Let $x = x(s, t) = s^2 - t$, and $y = y(s, t) = 2s + t^2$. Calculate $\partial f/\partial t$ when $(s, t) = (2, 1)$.
14. Set $f(x, y, z) = xz + xy + yz + 7z$. Calculate $df$ at $(2, 3, 4)$. Use this differential to approximate the change in $f$ when the point changes to $(2.1, 3.05, 4.01)$.
15. Set $f(x, y, z) = x^3 + y^4 + yz$. Calculate $df$ at $(1, 1, 3)$. Use this differential to approximate the change in $f$ when the point changes to $(0.9, 1.02, 2.95)$.
16. Set $f(x, y, z) = xyz + y^2 z^2 + z^5$. Calculate $df$ at $(1, 2, 1)$. Use this differential to approximate the change in $f$ when the point changes to $(1.1, 2.1, 1.1)$.
15. Optimization for Functions of Two Variables
We consider a function of two variables, and seek the maximum or minimum value for
this function, and the combination of the variables that yields this optimum. The nature
of this type of problem is similar when there are more variables, but deciding the type for
a critical point is more difficult.
Examples of optimization include minimizing costs, maximizing profits, minimizing risk while meeting a certain goal, minimizing losses due to variation in product quality, finding the optimal location for emergency services, scheduling flights or deliveries, deciding the combination of consumable goods that maximizes utility, and many other applications.

A. Critical points and curvature near them.

Consider a candidate point $(a, b)$ where a function $f = f(x, y)$ is tested for having a minimum or maximum. If $(\partial f/\partial x)(a, b) \ne 0$ then the value of $f$ can be increased and decreased by changing the value of $x$. Similarly, if $(\partial f/\partial y)(a, b) \ne 0$ then the value of $f$ can be increased and decreased by changing the value of $y$.

Definition. A point $(a, b)$ is a critical point for a function $f = f(x, y)$ when either $f$ is not differentiable at $(a, b)$ or when

$$\frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0.$$

Geometrically, the graph z = f (x, y) has a horizontal tangent plane at a critical point.
Whether this point is a maximum or minimum or neither is determined by the way in which
the surface z = f (x, y) curves away from (a, b). We illustrate this with several examples,
and then state a computational method for determining the curvature.

Example 1. Set $f(x, y) = x^2 + xy + y^2$. Find the critical points of $f$ and decide the behavior of $f$ near them.
Solution: $\partial f/\partial x = 2x + y = 0$ and $\partial f/\partial y = x + 2y = 0$ give $y = -2x$ and then $-3x = 0$. So the only critical point is $(0, 0)$. Now $x^2 + xy + y^2 = 0.5(x + y)^2 + 0.5x^2 + 0.5y^2$, so $f(x, y) > 0$ when $(x, y) \ne (0, 0)$. Hence $f(0, 0) = 0$ is a minimum.

Example 2. Set $f(x, y) = x^2 + 3xy + y^2$. Find the critical points of $f$ and decide the behavior of $f$ near them.
Solution: $\partial f/\partial x = 2x + 3y = 0$ and $\partial f/\partial y = 3x + 2y = 0$ give $y = -2x/3$ and then $5x/3 = 0$. So the only critical point is $(0, 0)$. Now for $y = x = t$, with $t \ne 0$ near zero, $f(x, y) = f(t, t) = 5t^2 > 0$. For $x = -t$ and $y = t$, $f(x, y) = f(-t, t) = -t^2 < 0$. Hence $f(0, 0) = 0$ is neither a minimum nor a maximum.

Example 3. Set $f(x, y) = x^2 - 3y^2$. Find the critical points of $f$ and decide the behavior of $f$ near them.
Solution: $\partial f/\partial x = 2x = 0$ and $\partial f/\partial y = -6y = 0$ give $x = 0$ and $y = 0$. So the only critical point is $(0, 0)$. Now if $y = 0$ and $x \ne 0$, then $f(x, 0) = x^2 > 0$. And if $x = 0$ and $y \ne 0$, then $f(0, y) = -3y^2 < 0$. Hence $f(0, 0) = 0$ is neither a minimum nor a maximum.

Example 4. Set $f(x, y) = -5x^2 - 3y^2$. Find the critical points of $f$ and decide the behavior of $f$ near them.
Solution: The only critical point is $(0, 0)$. If $(x, y) \ne (0, 0)$ then $f(x, y) < 0$. Hence $f(0, 0) = 0$ is a maximum.

Example 5. Set $f(x, y) = 2 - 4x - 6y + x^2 + xy + y^2$. Find the critical points of $f$ and decide the behavior of $f$ near them.
Solution: $\partial f/\partial x = -4 + 2x + y = 0$ and $\partial f/\partial y = -6 + x + 2y = 0$ give $y = 4 - 2x$ and then $-6 + x + 8 - 4x = 0$. So the only critical point is $(2/3, 8/3)$. Deciding the behavior near this critical point requires some algebra:

$$2 - 4x - 6y + x^2 + xy + y^2 = -\frac{22}{3} + \left(x - \frac{2}{3}\right)^2 + \left(x - \frac{2}{3}\right)\left(y - \frac{8}{3}\right) + \left(y - \frac{8}{3}\right)^2$$
$$= -\frac{22}{3} + 0.5\left(x - \frac{2}{3}\right)^2 + 0.5\left[\left(x - \frac{2}{3}\right) + \left(y - \frac{8}{3}\right)\right]^2 + 0.5\left(y - \frac{8}{3}\right)^2.$$
It follows that $f(2/3, 8/3) = -22/3$ is a minimum.

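Deciding the curvature at $(a, b)$ requires the second order derivatives at the point. The notation is
$$\frac{\partial^2 f}{\partial x^2}(a, b) = \frac{\partial}{\partial x}\frac{\partial f}{\partial x}(a, b), \quad\text{and}\quad \frac{\partial^2 f}{\partial y\,\partial x}(a, b) = \frac{\partial}{\partial y}\frac{\partial f}{\partial x}(a, b).$$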

Proposition. Suppose $f = f(x, y)$ is differentiable near $(a, b)$ and $(\partial f/\partial x)(a, b) = 0$ and $(\partial f/\partial y)(a, b) = 0$. The type of the critical point $(a, b)$ is determined as follows. Set
$$H(f; (a, b)) = \frac{\partial^2 f}{\partial x^2}(a, b)\, \frac{\partial^2 f}{\partial y^2}(a, b) - \frac{\partial^2 f}{\partial x\,\partial y}(a, b)\, \frac{\partial^2 f}{\partial y\,\partial x}(a, b).$$

If $H(f; (a, b)) > 0$ and $(\partial^2 f/\partial x^2)(a, b) > 0$ then $f(a, b)$ is a local minimum.
If $H(f; (a, b)) > 0$ and $(\partial^2 f/\partial x^2)(a, b) < 0$ then $f(a, b)$ is a local maximum.
If $H(f; (a, b)) < 0$ then $(a, b)$ is a saddle point: $f(a, b)$ is neither a minimum nor a maximum.
If $H(f; (a, b)) = 0$ then the second order derivatives do not determine the behavior of $f$ near $(a, b)$.
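
The test in the proposition is mechanical enough to write as a short function. The sketch below is one possible way to do so and assumes the second derivatives at the critical point have already been computed by hand; the function name `classify` and the returned labels are illustrative.

```python
def classify(fxx, fyy, fxy, fyx=None):
    """Apply the second derivative test at a critical point.

    fxx, fyy, fxy, fyx are the second order partial derivatives evaluated
    at the critical point. If fyx is omitted it is taken equal to fxy.
    """
    if fyx is None:
        fyx = fxy
    H = fxx * fyy - fxy * fyx
    if H > 0 and fxx > 0:
        return "local minimum"
    if H > 0 and fxx < 0:
        return "local maximum"
    if H < 0:
        return "saddle point"
    return "undetermined"

# Example 1 of this section: x^2 + xy + y^2 has fxx = 2, fyy = 2, fxy = 1.
print(classify(2, 2, 1))      # local minimum
# Example 2 of this section: x^2 + 3xy + y^2 has fxx = 2, fyy = 2, fxy = 3.
print(classify(2, 2, 3))      # saddle point
# Example 4 of this section: -5x^2 - 3y^2 has fxx = -10, fyy = -6, fxy = 0.
print(classify(-10, -6, 0))   # local maximum
```

These agree with the conclusions reached by algebra in Examples 1, 2, and 4.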

Example 6. Set $f(x, y) = 11 + 6x + y + 2x^2 + y^2$. Find the critical points of $f$ and decide the behavior of $f$ near them.
Solution: $\partial f/\partial x = 6 + 4x = 0$ and $\partial f/\partial y = 1 + 2y = 0$ give $x = -3/2$ and $y = -1/2$. The second derivatives are $\partial^2 f/\partial x^2 = 4$, $\partial^2 f/\partial x\,\partial y = 0$, and $\partial^2 f/\partial y^2 = 2$. Thus $H(f; (-3/2, -1/2)) = 8 > 0$, and since $\partial^2 f/\partial x^2 = 4 > 0$, $f(-3/2, -1/2) = 6.25$ is a local minimum.

Example 7. Set $f(x, y) = 11 + 6x + 6y + 2x^2 + y^3$. Find the critical points of $f$ and decide the behavior of $f$ near them.

Example 8. Set $f(x, y) = 11 + 6x + 2x^2 + 3xy + y^2$. Find the critical points of $f$ and decide the behavior of $f$ near them.
Solution: $\partial f/\partial x = 6 + 4x + 3y = 0$ and $\partial f/\partial y = 3x + 2y = 0$ give $y = -3x/2$ and then $6 - x/2 = 0$, so $x = 12$ and $y = -18$. The second derivatives are $\partial^2 f/\partial x^2 = 4$, $\partial^2 f/\partial x\,\partial y = 3$, and $\partial^2 f/\partial y^2 = 2$. Thus $H(f; (12, -18)) = 8 - 9 = -1 < 0$, and so $(12, -18)$ is a saddle point: $f(12, -18) = 47$ is neither a minimum nor a maximum.

To understand the proposition we need to think about the graph z = f (x, y). This
graph is a surface and the graph of the linearization of f at (a, b) is a plane. This plane is
tangent to the surface at (a, b), and at a critical point the plane tangent to the graph of
f is a horizontal plane. To understand the behavior of f near (a, b) we have to know how
the graph of f curves away from the horizontal tangent plane. The most significant terms
in determining this curving are the quadratic terms, and these quadratic terms are given
by the second derivatives.

The second order approximation to $f(x, y)$ near $(a, b)$ is

$$f(x, y) \approx f(a, b) + L(x, y) + Q(x, y), \qquad L(x, y) = \frac{\partial f}{\partial x}(a, b)\,(x - a) + \frac{\partial f}{\partial y}(a, b)\,(y - b),$$

$$Q(x, y) = \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(a, b)\,(x - a)^2 + \frac{\partial^2 f}{\partial x\,\partial y}(a, b)\,(x - a)(y - b) + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(a, b)\,(y - b)^2.$$

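To see why this leads to our test in the proposition, compare the second order terms in $Q(x, y)$ to a standard quadratic $q(s, t) = a s^2 + b s t + c t^2$. In the exercises you will see algebraically that for $(s, t) \ne (0, 0)$, $q(s, t) > 0$ if $4ac - b^2 > 0$ and $a > 0$ (or equivalently $c > 0$); that for $(s, t) \ne (0, 0)$, $q(s, t) < 0$ if $4ac - b^2 > 0$ and $a < 0$ (or equivalently $c < 0$); and that $q(s, t)$ attains both signs if $4ac - b^2 < 0$. Matching coefficients with $Q$ (the coefficient $a$ with $\frac{1}{2}\partial^2 f/\partial x^2$, $b$ with $\partial^2 f/\partial x\,\partial y$, and $c$ with $\frac{1}{2}\partial^2 f/\partial y^2$), the quantity $4ac - b^2$ becomes $H$ when the two mixed second derivatives agree.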

EXERCISES.
1. An emergency care facility is located at the point $(x, y)$ on a regional map and will serve three cities. The cities are located at $(0, 0)$, $(5, 0)$, and $(2, 4)$. Find the location for the facility that minimizes the sum of the squares of the three distances from the care facility to the cities. (The assumption is that the probability of recovery for a victim is inversely proportional to the square of the travel time from the cities to the facility.)
2. Set $f(x, y) = x^2 + 3xy + 3y^2 - 7x - 12y + 2.4$. Find the critical points of $f$ and decide the behavior of $f$ near them.
3. Set $f(x, y) = x^3 + x^2 + 4xy + y^2 - 11x - y$. Find the critical points of $f$ and decide the behavior of $f$ near them.
4. Set $f(x, y) = x^3 - 3x^2 + 4xy + y^2 + 11x - 2y$. Find the critical points of $f$ and decide the behavior of $f$ near them.
5. Set $f(x, y) = 11 + 2x^4 + 3x^2 y^2 + y^4$. Find the critical points of $f$ and decide the behavior of $f$ near them.
6. Set $f(x, y) = 11 + x^4 + y^5$. Find the critical points of $f$ and decide the behavior of $f$ near them.
7. Set $q(s, t) = a s^2 + b s t + c t^2$. Show that if $a$ and $c$ have opposite signs, then there are values of $(s, t)$ near $(0, 0)$ for which $q > 0$ and there are (other) values of $(s, t)$ near $(0, 0)$ for which $q < 0$.
8. Set $q(s, t) = a s^2 + b s t + c t^2$. Assume that $a$ and $c$ are both positive. Write $q/a = s^2 + (b/a)\, s t + (c/a)\, t^2$ and then write $q/a = (s + (b/2a)\, t)^2 + r$. Calculate $r$ and give conditions under which $r > 0$. Show that if $r > 0$ then for all values of $(s, t)$ near $(0, 0)$ one has $q > 0$.
9. Set $q(s, t) = a s^2 + b s t + c t^2$. Assume that $a$ and $c$ are both negative. Write $q/(-a) = -s^2 - (b/a)\, s t - (c/a)\, t^2$ and then write $q/(-a) = -(s + (b/2a)\, t)^2 - r$. Calculate $r$ and give conditions under which $r > 0$. Show that if $r > 0$ then for all values of $(s, t)$ near $(0, 0)$ one has $q < 0$.
10. An emergency care facility is located at the point $(x, y)$ on a regional map and will serve three cities. The cities are located at $(0, 0)$, $(5, 0)$, and $(2, 4)$. Approximate the location for the facility that minimizes the sum of the three distances from the care facility to the cities. The equations one gets are difficult to solve exactly, so an approximation is sought (you might sketch the locations and obtain good starting values for points to try in the equation for critical points, then improve the guess).

B. Constraints and level sets.

Constrained optimization refers to optimizing a quantity when restrictions on the variables are present. For instance, one might wish to maximize the amount produced with the restriction that the budget for ingredients not exceed a given cost. A similar problem involves minimizing the cost of producing a required amount of an item.

The general setting with two variables is to maximize or minimize f = f (x, y) subject
to a constraint g(x, y) = c, where c is a constant. One way to think about this problem
is geometric, the other is variational. We will describe the variational approach because it
generalizes to other settings, and because we are not assuming knowledge of angles. For
the sake of completeness we will then discuss the geometric approach briefly.
Because $g(x, y)$ must be kept constant, the only allowed variations are ones that do not change $g$. Hence, an allowed variation is a vector $\vec{v}$ so that $\partial g/\partial \vec{v} = 0$. When $f$ reaches a maximum or minimum at $(a, b)$ and $\vec{v}$ is an allowed variation, $f$ must not increase or decrease along $\vec{v}$. It follows that $\partial f/\partial \vec{v} = 0$. We summarize this conclusion:
$$\frac{\partial g}{\partial \vec{v}}(a, b) = 0 \quad\text{implies}\quad \frac{\partial f}{\partial \vec{v}}(a, b) = 0.$$
Since this must hold for any $\vec{v}$ for which $(\partial g/\partial \vec{v})(a, b) = 0$, the change in $f$ must be a multiple of the change in $g$. Symbolically,

$$df_{(a,b)} = \lambda\, dg_{(a,b)} \quad\text{for some } \lambda.$$

Definition. The parameter λ appearing in this equation is called the Lagrange multi-
plier for the constrained optimization problem.

Example 1. Set $f(x, y) = 11 + 2x + 3xy + y^2$. Find the maximum value of $f$ on the line $x + 3y = 6$.
Solution: The constraint here is $g(x, y) = x + 3y = 6$. At the point $(x, y)$, $df = (2 + 3y)\,dx + (3x + 2y)\,dy$ while $dg = 1\,dx + 3\,dy$. Therefore $df = \lambda\, dg$ becomes $2 + 3y = \lambda \cdot 1$ and $3x + 2y = \lambda \cdot 3$. We also have the constraint $x + 3y = 6$. Combining the equations we get $3(2 + 3y) = 3\lambda = 3x + 2y$ or $x = 2 + (7/3)y$. Putting this in $x + 3y = 6$ gives $y = 3/4$. Thus $(x, y) = (3.75, 0.75)$ and $\lambda = 4.25$. The point $(3.75, 0.75)$ is a maximum because the quadratic term, $3xy + y^2$, is decreasing along $x + 3y = 6$ if either $x$ or $y$ is large. (Along $x + 3y = 6$, if either variable is large, then the other has large absolute value and is negative.)
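
The three equations in this example ($2 + 3y = \lambda$, $3x + 2y = 3\lambda$, and $x + 3y = 6$) are linear in $x$, $y$, and $\lambda$, so they can also be solved mechanically. The sketch below uses exact fractions and a small elimination routine written for this illustration; `solve_linear` is a hypothetical helper, not something defined in the text.

```python
from fractions import Fraction

def solve_linear(augmented):
    """Solve a square linear system given as rows [coefficients..., right side]
    by Gauss-Jordan elimination with exact fractions."""
    m = [[Fraction(e) for e in row] for row in augmented]
    n = len(m)
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [e / m[col][col] for e in m[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Unknowns in order (x, y, lambda):
#   3y - lambda = -2,   3x + 2y - 3*lambda = 0,   x + 3y = 6
x, y, lam = solve_linear([[0, 3, -1, -2],
                          [3, 2, -3, 0],
                          [1, 3, 0, 6]])
value = 11 + 2*x + 3*x*y + y**2
print(x, y, lam, value)   # 15/4 3/4 17/4 55/2
```

In decimals these are $x = 3.75$, $y = 0.75$, $\lambda = 4.25$, and $f = 27.5$, in agreement with the solution above.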

To visualize what we’ve done geometrically consider the level sets for f and the con-
straint g(x, y) = x + 3y = 6 (which is a level set for g). The level set for f at the maximum
is the one with the highest value that intersects the constraint.
[Figure: the level sets f = 27.5 and f = 31 together with the constraint line x + 3y = 6. Constrained maximum is f(3.75, 0.75) = 27.5.]

To complete the geometric description, we need to describe the tangency of the level set for $f$ and the constraint (the level set for $g$). This can be done using the gradient vector that is, by definition, a vector that calculates the rate of change of $f$ along any other vector using the dot product. Symbolically, the gradient is denoted $\nabla f_{(a,b)}$ and

$$\frac{\partial f}{\partial \vec{v}}(a, b) = \nabla f_{(a,b)} \cdot \vec{v}, \quad\text{so}\quad \nabla f_{(a,b)} = \frac{\partial f}{\partial x}(a, b)\,\hat{\imath} + \frac{\partial f}{\partial y}(a, b)\,\hat{\jmath}.$$

The condition for tangency is now $\nabla f_{(a,b)} = \lambda \nabla g_{(a,b)}$ for some $\lambda$, which is the same condition we reached through variational principles.

Example 2. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = 100\, x^{0.4} y^{0.6}$. Suppose the cost of $x$ is 5 dollars per unit and the cost of $y$ is 6 dollars per unit. Find the maximum that can be produced when the total cost for ingredients is 120 dollars.
Solution: The constraint here is the cost (or budget) so $g(x, y) = 5x + 6y = 120$. At the point $(x, y)$, $df = 40\, x^{-0.6} y^{0.6}\, dx + 60\, x^{0.4} y^{-0.4}\, dy$ while $dg = 5\,dx + 6\,dy$. Therefore $df = \lambda\, dg$ becomes $40\, x^{-0.6} y^{0.6} = \lambda \cdot 5$ and $60\, x^{0.4} y^{-0.4} = \lambda \cdot 6$. We also have the constraint $5x + 6y = 120$. Combining the equations we get $10\, x^{0.4} y^{-0.4} = \lambda = 8\, x^{-0.6} y^{0.6}$ or $8y = 10x$. Putting this in $5x + 6y = 120$ gives $x = 9.6$ and $y = 12$. The point $(9.6, 12)$ is a maximum because the value of $f$ clearly increases if both $x$ and $y$ are increased, and we have increased them as much as the constraint allows. The largest amount that can be produced with a budget of 120 is $f(9.6, 12) \approx 1097.5321$.
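
One way to gain confidence in this answer is to compare the production at $(9.6, 12)$ with the production at a few other points on the same budget line. The sketch below does this; the sample values of $x$ are arbitrary choices made for this illustration.

```python
def production(x, y):
    return 100 * x**0.4 * y**0.6

budget = 120

# The condition 8y = 10x means y = 1.25x, so the budget line
# 5x + 6y = budget becomes 12.5x = budget.
x_best = budget / 12.5
y_best = 1.25 * x_best
best = production(x_best, y_best)

# Other points that spend the whole budget: pick x, then solve 5x + 6y = budget for y.
others = [production(x, (budget - 5*x) / 6) for x in (8.0, 9.0, 10.0, 11.0)]

print(x_best, y_best, best)   # 9.6, 12.0, and about 1097.53
print(others)                 # each value is smaller than the one above
```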

Example 3. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = 100\, x^{0.4} y^{0.6}$. Suppose the cost of $x$ is 5 dollars per unit and the cost of $y$ is 6 dollars per unit. Find the minimum cost when 1000 items are produced.
Solution: The constraint here is the amount produced ($f = 1000$) and we wish to minimize the cost $g(x, y) = 5x + 6y$. At the point $(x, y)$, $dg = \lambda\, df$ becomes $5 = \lambda\, 40\, x^{-0.6} y^{0.6}$ and $6 = \lambda\, 60\, x^{0.4} y^{-0.4}$. We also have the constraint $100\, x^{0.4} y^{0.6} = 1000$. Combining the equations we get $10\, x^{0.4} y^{-0.4} = 1/\lambda = 8\, x^{-0.6} y^{0.6}$ or $8y = 10x$. Putting this in $100\, x^{0.4} y^{0.6} = 1000$ gives $x^{0.4} (5/4)^{0.6} x^{0.6} = 10$, or $x = 10 \cdot 0.8^{0.6} \approx 8.7469$ and $y = 12.5 \cdot 0.8^{0.6} \approx 10.9336$. This point minimizes cost because the value of $g$ clearly decreases if both $x$ and $y$ are decreased, and we have decreased them as much as the constraint allows. The minimum cost of producing 1000 units is approximately $5 \cdot 8.7469 + 6 \cdot 10.9336 \approx 109.34$ dollars.

Discussion: As can be seen in the last two examples, maximizing production with a fixed
cost is the same, mathematically, as minimizing cost with a fixed target for production.
The two solutions agree on the proportions of ingredients used, and the amounts used
depend on the target value. (The amounts of the ingredients would have been identical
in the two examples if the budget in the first example were 109.34 dollars or the required
production in the second example were 1097.5321.)

EXERCISES.
1. Set $f(x, y) = x^2 + 5xy + y^2$. Find the maximum value of $f$ on the line $x + y = 7$.
2. Set $f(x, y) = x^2 + y^2$. Find the maximum value of $f$ on the line $x + 3y = 4$, or decide that such a value does not exist.
3. Set $f(x, y) = 11 + 2x + xy + 5y$. Find the maximum value of $f$ on the line $2x + 3y = 10$.
4. Set $f(x, y) = x^2 + y^2$. Find the minimum value of $f$ on the line $x + 3y = 4$, or decide that such a value does not exist.
5. Set $f(x, y) = 3x + 2x^2 + xy + 5y + 6y^2$. Find the minimum value of $f$ on the line $2x + 3y = 10$, or decide that such a value does not exist.
6. Set $f(x, y) = x^2 y^3$. Find the maximum value of $f$ on the line $x + 2y = 4$, or decide that such a value does not exist.
7. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = 7\, x^{0.5} y^{0.5}$. Suppose the cost of $x$ is 1 dollar per unit and the cost of $y$ is 4 dollars per unit. Find the minimum cost when 700 items are produced.
8. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = 7\, x^{0.5} y^{0.5}$. Suppose the cost of $x$ is 1 dollar per unit and the cost of $y$ is 4 dollars per unit. Find the minimum cost when 350 items are produced.
9. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = 7\, x^{0.5} y^{0.5}$. Suppose the cost of $x$ is 5 dollars per unit and the cost of $y$ is 4 dollars per unit. Find the minimum cost when 700 items are produced.
10. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = 7\, x^{0.5} y^{0.5}$. Suppose the cost of $x$ is 1 dollar per unit and the cost of $y$ is 4 dollars per unit. Find the largest number of items that can be produced with a budget of 400 dollars.
11. The amount of energy produced using the amounts $x$ of coal and $y$ of biodiesel is $E(x, y) = 0.2x + y$. The amount of pollution produced as a result is $p = 10x + 0.5y$. Suppose that the economic cost of pollution is $C(p) = 12p + 0.3\, p^2$. Minimize the cost of pollution given that $E = 100$ is the amount of energy to be produced.
12. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = x^{0.6} y^{0.9}$. Suppose the cost of $x$ is 4 dollars per unit and the cost of $y$ is 4 dollars per unit. Find the largest number of items that can be produced with a budget of 100 dollars.
13. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = x^{0.6} y^{0.9}$. Suppose the cost of $x$ is 4 dollars per unit and the cost of $y$ is 4 dollars per unit. Find the largest number of items that can be produced with a budget of 200 dollars.

C. Relaxation of constraints and rates of return.

A natural question after finding an optimal combination is how matters change with
small changes in conditions. One question is how the result changes with the constraint.
For instance, if a certain quantity of a good can be produced with a given budget,
then how much more can be produced if the budget is increased?
A second question is how the optimal combination of variables changes if the model
changes. For instance, if production as a function of labor and parts depends on a param-
eter, and this parameter is estimated with an error of 1 percent, then what is the error in
the choice of the optimal combination of labor and parts? This second question falls under
the title of “sensitivity” and will be explored briefly in the exercises.
The answer to the first question, about changing the constraint, turns out to come from interpreting the Lagrange multiplier of the previous section.

Example 1. Suppose two ingredients, $x$ and $y$, are used to manufacture an item, and that the amount produced is $f(x, y) = 100\, x^{0.4} y^{0.6}$. Suppose the cost of $x$ is 5 dollars per unit and the cost of $y$ is 6 dollars per unit. Find the maximum that can be produced when the total cost for ingredients is 120 dollars. Then find the maximum that can be produced when the total cost for ingredients is 121 dollars.
Solution: The constraint here is the cost (or budget) so $g(x, y) = 5x + 6y = 120$. We found previously that $df = \lambda\, dg$ becomes $40\, x^{-0.6} y^{0.6} = \lambda \cdot 5$ and $60\, x^{0.4} y^{-0.4} = \lambda \cdot 6$ and results in $8y = 10x$. Putting this in $5x + 6y = 120$ gives $x = 9.6$ and $y = 12$. The point $(9.6, 12)$ is a maximum and the largest amount that can be produced with a budget of 120 is $f(9.6, 12) \approx 1097.5321$. Putting $8y = 10x$ into $5x + 6y = 121$ gives $x = 9.68$ and $y = 12.1$. The point $(9.68, 12.1)$ is a maximum and the largest amount that can be produced with a budget of 121 is $f(9.68, 12.1) \approx 1106.6782$.
Discussion: The difference in the amount produced is $f(9.68, 12.1) - f(9.6, 12) \approx 9.146$. This is the difference in $f$ resulting from a change by 1 in the constraint $g$. Examining $df = \lambda\, dg$ we see that $\lambda \approx 9.146$ is that same rate of change. Indeed, solving for $\lambda$ from $40\, x^{-0.6} y^{0.6} = \lambda \cdot 5$ (or equivalently from $60\, x^{0.4} y^{-0.4} = \lambda \cdot 6$) gives $\lambda = 8\, (5/4)^{0.6} \approx 9.14610$.
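The agreement between the multiplier and the extra production can be reproduced numerically. The sketch below reuses the relation $8y = 10x$ from the solution; the function name `max_production` is an illustrative choice.

```python
def max_production(budget):
    """Largest production for a given budget, using y = 1.25x on the
    budget line 5x + 6y = budget (so 12.5x = budget)."""
    x = budget / 12.5
    y = 1.25 * x
    return 100 * x**0.4 * y**0.6

# Lagrange multiplier from 40 x^(-0.6) y^(0.6) = 5*lambda with y/x = 5/4.
lam = 8 * 1.25**0.6

# Extra production obtained from one more dollar of budget.
gain = max_production(121) - max_production(120)

print(lam, gain)   # both about 9.146
```

In this particular example the maximum production is proportional to the budget, so the gain from one extra dollar matches the multiplier up to rounding. In general the multiplier only approximates the gain for small changes in the constraint.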
Summary: The Lagrange multiplier is the rate of change of the objective with the constraint. Symbolically, when $f(x, y)$ is maximized (or minimized) subject to the constraint $g(x, y) = c$, a condition for optimization at $x = a$ and $y = b$ is $df(a, b) = \lambda\, dg(a, b)$, and if $c$ changes by a small amount, $f$ changes by $\lambda$ times that amount.

Example 2. Suppose, as above, that two ingredients, x and y, are used to manufacture
an item, and that the amount produced is f (x, y) = 100 x0.4 y 0.6 and the cost of x is 5 dollars
per unit and the cost of y is 6 dollars per unit. Suppose the current budget is 120 dollars.
If a manufactured item is worth 12 cents, is it worth increasing production?
Solution: We found previously that λ ≈ 9.14610. It follows that an increase in the
budget by 1 dollar results in approximately 9.14610 additional items and in an additional
revenue of approximately 9.14610 × 0.12 ≈ 1.09753 dollars. Since this exceeds the extra
dollar spent, it is worth increasing production.
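The decision rule in this example can be sketched in a few lines. This is an illustration under the assumptions of Example 1. The variable names are hypothetical, and the break-even price is an added observation that follows from the same reasoning: expansion pays exactly when the price times λ exceeds one dollar.

```python
# Marginal analysis for Example 2, using the multiplier from Example 1.

lam = 8 * 1.25**0.6           # additional items per additional budget dollar
price = 0.12                  # value of one item, in dollars

extra_revenue = lam * price   # revenue generated by one more budget dollar
break_even_price = 1 / lam    # price at which one more dollar just pays for itself

worth_expanding = extra_revenue > 1

print(f"extra revenue per dollar: {extra_revenue:.5f}")
print(f"break-even price:         {break_even_price:.5f}")
print(f"worth expanding:          {worth_expanding}")
```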

Example 3. Suppose a clinic employs doctors and nurses and treats patients. Let
m denote the number of doctors on staff (during an hour of operation) and let n denote
the number of nurses. The number of patients treated is $p = 5\,m^{0.7} n^{0.3}$. Suppose doctors
cost 60 dollars per hour and nurses cost 30 dollars per hour. What is the optimal ratio
of doctors and nurses? Suppose that after the cost of materials used for each patient, the
revenue from each patient is 19 dollars. What is the optimal size for the clinic? What
would you tell prospective investors for this clinic?
Solution: The budget is g(m, n) = 60m + 30n. The resulting constrained optimization
problem gives $5 \times 0.7\, m^{-0.3} n^{0.3} = 60\lambda$ and $5 \times 0.3\, m^{0.7} n^{-0.7} = 30\lambda$. Combining these we
find that $m/n = 7/6$ and $\lambda = 0.05\,(7/6)^{0.7}$. We conclude that the optimal ratio of doctors
to nurses is 7 : 6 and that an additional dollar in the budget would result in $0.05\,(7/6)^{0.7}$
more patients being treated. With a revenue of 19 dollars per patient, one additional
dollar results in $19 \times 0.05\,(7/6)^{0.7} \approx 1.0582453$ dollars in revenue. Hence it is always worth
increasing the size of the clinic. Investors who add a dollar to the budget would generate
about 1.058 dollars in revenue, or a profit of about 5.8 percent.
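The clinic figures can be reproduced with a short computation. This is a sketch only: the helper names `patients` and `optimal_staff` are hypothetical, and the hourly budget of 600 dollars is an arbitrary value chosen for illustration, since the example does not fix a budget.

```python
# Numerical check of Example 3 (clinic staffing).

def patients(m, n):
    """Patients treated per hour with m doctors and n nurses."""
    return 5 * m**0.7 * n**0.3

def optimal_staff(budget):
    """Staffing in the optimal 7 : 6 ratio that exhausts the budget.

    With m = 7k and n = 6k the cost is 60*7k + 30*6k = 600k.
    """
    k = budget / 600
    return 7 * k, 6 * k

lam = 0.05 * (7 / 6)**0.7          # patients per additional budget dollar
revenue_per_dollar = 19 * lam      # revenue per additional budget dollar

budget = 600                       # assumed hourly budget, for illustration only
m, n = optimal_staff(budget)
patients_per_dollar = patients(m, n) / budget

print(f"staff at budget {budget}: {m:.1f} doctors, {n:.1f} nurses")
print(f"patients per dollar: {patients_per_dollar:.6f} (multiplier: {lam:.6f})")
print(f"revenue per dollar:  {revenue_per_dollar:.6f}")
```

Patients per dollar at the optimum matches the multiplier for any budget, which is why the conclusion in this model does not depend on the current size of the clinic.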

Example 4. A cylindrical can has height h and radius r. The cost of the material is
1 cent per square centimeter and the cost of the joins or folds where the top and bottom
meet the sides is 2 cents per centimeter. The volume of the can should be $192\pi$ cubic
centimeters. Minimize the cost of the can.
Solution: The volume of the can is $V = \pi r^2 h = 192\pi$. The length of the joins is $2\pi r$
each and there are two of them, so their cost is $2 \cdot 2 \cdot 2\pi r = 8\pi r$. The area of the can
consists of the two disks at the top and bottom, each with area $\pi r^2$, and the sides with
area $2\pi r h$. So the total cost is $C = 8\pi r + 2\pi r^2 + 2\pi r h$. We order the variables $(r, h)$
and $dC = \lambda\, dV$ becomes $8\pi + 4\pi r + 2\pi h = \lambda \cdot 2\pi r h$ and $2\pi r = \lambda \cdot \pi r^2$. Solving these
equations we get $\lambda = 2/r$, and $8 + 4r + 2h = 4h$, so $h = 2r + 4$. Inserting this in the
constraint (the volume), $r^2 h = 192$, gives $2r^3 + 4r^2 = 192$. We would, in general, have to
look up formulae for cubic equations to solve this, but in this case $r = 4$ works. Since
$2r^3 + 4r^2$ is increasing for positive r, this is the only positive solution. Hence $r = 4$
and $h = 12$ describe the can with minimal cost, $C = 160\pi \approx 502.65$ cents.
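When the cubic has no obvious root, it can be solved numerically instead of by formula. The sketch below uses bisection, a technique swapped in here for illustration and not the method of the text. The function names and the search interval [0, 10] are assumptions. It then compares the cost at the root with the cost of nearby cans of the same volume.

```python
import math

# Numerical check of Example 4 (can of volume 192*pi).

def height(r):
    """Height forced by the volume constraint r^2 h = 192."""
    return 192 / r**2

def cost(r):
    """Cost in cents of the can of radius r that satisfies the constraint."""
    h = height(r)
    return 8 * math.pi * r + 2 * math.pi * r**2 + 2 * math.pi * r * h

def cubic(r):
    """Left side minus right side of 2r^3 + 4r^2 = 192."""
    return 2 * r**3 + 4 * r**2 - 192

# Bisection on [0, 10]: cubic(0) < 0 < cubic(10), and the cubic is
# increasing for r > 0, so there is exactly one positive root.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    if cubic(mid) < 0:
        lo = mid
    else:
        hi = mid
r_star = (lo + hi) / 2

print(f"r = {r_star:.6f}, h = {height(r_star):.6f}")
print(f"cost at r:        {cost(r_star):.4f}")
print(f"cost at r - 0.1:  {cost(r_star - 0.1):.4f}")
print(f"cost at r + 0.1:  {cost(r_star + 0.1):.4f}")
```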

EXERCISES.
1. Suppose two ingredients, x and y, are used to manufacture an item, and that the
amount produced is $f(x, y) = 60\,x^{0.5} y^{0.5}$. Suppose the cost of x is 1 dollar per unit and
the cost of y is 4 dollars per unit. (a) Find the maximum number of items that can be
produced with a budget of 1200 dollars. (b) How many additional units can be made
if an additional dollar is invested? (c) At which price (per unit) is it worth increasing
production?
2. Suppose two ingredients, x and y, are used to manufacture an item, and that the
amount produced is $f(x, y) = 60\,x^{0.5} y^{0.5}$. Suppose the cost of x is 1 dollar per unit and
the cost of y is 4 dollars per unit. (a) Find the minimum cost of producing 18,000 items.
(b) Thinking of the production, f , as the constraint, what is the meaning of the Lagrange
multiplier in this setting?
3. A car company makes two models of a particular car. Let x denote the amount
of one model (units might be the number of cars per week) and let y denote the amount
of the second model. The cost of production is $C = 30x^2 + 90xy + 24y^2$. Suppose the
manufacturer wishes to minimize the cost of making 100 cars. (a) How much of each car
should be made? (b) What is the cost of making an additional car? (Why does this cost
not depend on which model is being made?)
4. What are the proportions of a rectangle which is inscribed in an ellipse with
semi-minor axis of length 1 and semi-major axis of length 3 and has maximal area? In
symbols, let the opposite corners of the rectangle be at (0, 0) and at (x, y) where x > 0,
y > 0, and (x, y) lies on the ellipse $x^2 + y^2/9 = 1$. Maximize the area of the rectangle.
5. Let the opposite corners of the rectangle be at (0, 0) and at (x, y) where x > 0,
y > 0, and (x, y) lies on the ellipse $x^2/a^2 + y^2/b^2 = 1$. Maximize the area of the rectangle.
6. In South America the ancient fishing industry and agriculture industry supported
one another and made protein available to the population. Suppose that to initiate either
fishing or agriculture an investment in terracing and tools must be made. Let A be the
initial investment in tools for agriculture and let F be the initial investment in tools for
fishing. Suppose that A + 2F = 100 represents the total available for initial investment.
Suppose that the amount of protein available to the population at time t is
$P = (A + F)(A + tF)$, where $t \ge 5$. (a) Consider a fixed time t (with $t \ge 5$), and maximize the
amount of protein available to the population at that time. (b) Does the best strategy
depend on the time?
7. Suppose two ingredients, x and y, are used to manufacture an item, and that the
amount produced is $f(x, y) = 60\,x^{0.4} y^{0.6}$. Suppose the cost of x is 2 dollars per unit and
the cost of y is 4 dollars per unit. (a) Find the maximum number of items that can be
produced with a budget of 1200 dollars. (b) How many additional units can be made if an
additional dollar is invested?
8. Suppose two ingredients, x and y, are used to manufacture an item, and that the
amount produced is $f(x, y) = 60\,x^{0.405} y^{0.595}$. Suppose the cost of x is 2 dollars per unit
and the cost of y is 4 dollars per unit. (a) Find the maximum number of items that can
be produced with a budget of 1200 dollars. (b) How many additional units can be made
if an additional dollar is invested? (c) Compare the results of this problem to those of the
previous problem.
9. Suppose two ingredients, x and y, are used to manufacture an item, and that the
amount produced is $f(x, y) = 60\,x^{\alpha} y^{1-\alpha}$. Suppose the cost of x is 2 dollars per unit and
the cost of y is 4 dollars per unit. (a) Find the maximum number of items that can be
produced with a budget of 1200 dollars. Denote this maximum amount by m(α). (b)
Calculate the rate of change of m(α) with α. (c) Calculate the proportional rate of change
of m(α) with α.
Index
Absolute value , 20.
Actuarial value , 115.
Algebraic manipulation (for integrals) , 221.
Alligator , 117.
Annuity , 106.
Arithmetic sequence , 99.
Arithmetic series (or sum) , 99.
Average cost , 202.
Average value (on an interval) , 207.
Bacterial growth , 32, 81.
Batch size , 180.
Binomial expected value , 115.
Bond pricing , 92.
Budget , 46.
Capital per capita , 30.
Car loan , 2.
Carbon dating , 93.
Chain rule , 141.
Chain rule (vector version) , 242.
Cobb-Douglas , 181.
Co-domain , 62.
Composition of functions , 74.
Compounded interest , 81.
Computers and calculators , 31.
Concave down , 170.
Concave up , 170.
Consumer demand , 7.
Consumer surplus , 208, 215.
Continuity and differentiability , 154.
Continuous , 129.
Continuous compounding , 83.
Continuous random variable , 110.
Convergent integral , 227.
Convex , 170.
Cooling , 94.
Coordinates , 26, 231.
Cost of quality control , 176.
Critical number , 167.
Critical point , 167.
Critical points (functions of two variables) , 246.
Cumulative distribution function , 119.
Curvature , 247.
Decreasing , 166.
Definite integral for a continuous function , 206.
Degree , 55.
Dependent variable , 6, 62.
Depreciation , 84.
Derivative , 136.
Derivative with respect to a vector , 240.
Derived function , 136.
Developing country , 15, 30.
Differentiable function (vector version) , 242.
Differential (of a function of one variable) , 153.
Differential (function of several variables) , 243.
Differential (function of three variables) , 244.
Discrete random variable , 110.
Discriminant , 51.
Distance (in 2 and 3 dimensional spaces) , 231.
Divergent integral , 227.
Domain , 62.
Dot product , 238.
Duration of a phone call , 111.
Elasticity , 184.
Emergency services , 182.
Events (probability) , 107.
Expected number of voters , 115.
Expected value , 113.
Expected value of a continuous stochastic variable , 217.
First order term , 43.
Frequency of occurrence , 107.
Function , 5, 62.
Future value , 86, 98.
Geometric sequence , 101.
Geometric series , 101.
Geometric sum , 101.
Global maximum , 169.
Global minimum , 169.
Gradient , 244.
Graph , 26.
Graph (function of 2 variables) , 235.
Gravity , 53.
Half-life , 86.
Home loan , 105.
Improper integral , 227.
Income distribution , 118.
Income per capita , 184.
Income stream (total present value) , 103.
Increasing , 166.
Indefinite integral , 198.
Independent events , 107.
Independent variable , 6, 62.
Inequality , 36.
Injective , 64.
Inflection point , 170.
Inner product , 238.
Instantaneous rate , 138.
Insulation , 4.
Integers , 17.
Integrable (function) , 212.
Integral , 198.
Integration by parts , 222, 223.
Integration formulae , 199.
Inverse , 75.
Isocline , 235.
Lagrange multiplier , 250.
Length of a vector , 237.
Level set , 235.

Light intensity , 92.
Limit (informal definition) , 125.
Limit , 126.
Limit (operational definition) , 127.
Limit from above , 127.
Limit from below , 127.
Linear , 43.
Linear operation , 141.
Loan default , 115.
Local maximum , 169.
Local minimum , 169.
Locating emergency services , 182.
Logarithm , 88, 217.
Logarithms and exponentials , 226.
Logistic equation , 222.
Lower bound , 212.
Marginal cost , 143.
Marginal profit , 143.
Marginal rate , 136.
Marginal revenue , 143.
Marginal value , 136.
Market , 11.
Maximum , 169.
Medical clinic , 8.
Medical dosing , 93.
Minimum , 169.
Natural logarithm , 88.
Natural numbers , 17.
One-sided limit , 127.
One-to-one , 64.
Onto , 64.
Optimization , 179.
Orchard , 12, 27.
Parameterized path , 233.
Partial derivative , 240.
Partial derivatives , 241.

Partial fractions , 222.
Path , 233.
Patients served , 38.
Percentage , 18.
Personal income , 30.
Point of inflection , 170.
Point-wise operations , 74.
Poisson distribution , 112.
Population’s growth , 184.
Power rule , 140.
Present value , 98, 216.
Price elasticity of demand , 184.
Principal , 98.
Probability , 107.
Probability density function , 192, 209.
Producer supply , 10.
Product rule , 140.
Productivity and capital , 9.
Properties of exponentials and logarithms , 90.
Quadratic , 49.
Quadratic formula , 50.
Quantitative variable , 1.
Qualitative variable , 1.
Quotient rule , 141.
Random variable , 110.
Range , 62.
Raspberries , 54.
Rate of return , 56.
Rational numbers , 17.
Real numbers , 17.
Relation , 5.
Relative change , 183.
Relative rate of change , 184.
Restaurant , 3.
Retirement account , 102.
Roofing , 3.
Rules for powers , 21.
Sample (statistical) , 116.
Saving and income , 7.
Savings’ rate , 4.
Secant line , 134.
Second derivative , 158.
Second derivative test (two variables) , 247.
Second order approximation (two variables) , 249.
Sequence , 97.
Slope , 133, 134.
Solution (to an equation) , 34.
Sound pressure , 97.
Standard deviation , 116.
Statistical variable , 1.
Strictly decreasing , 166.
Strictly increasing , 166.
Substitution , 220.
Sum of two vectors , 238.
Supply and demand , 29.
Surjective , 64.
Tangent line , 134.
Terms (of a sequence) , 97.
The robin and the egg , 10.
Total present value , 102, 216.
Turbidity , 32.
Uniform distribution , 112, 216.
Upper bound , 212.
Utility of work and leisure , 176.
Variable , 5.
Variance , 116.
Vector addition , 238.
Vector algebra , 238.
Vector’s length , 237.
Vector scaling , 238.
Vector’s size , 237.
Vertex , 53.
Waiting time , 111.
