Cao 4

This document provides an overview of key concepts and skills to be learned in a chapter on probability. It defines key terms like random variable, probability, and events. It explains how to calculate probabilities of single and joint events using rules of multiplication, addition, and total probability. It also covers expected value, variance, covariance, Bayes' formula, and counting methods. The chapter aims to equip readers with essential probability tools for making investment decisions involving risk and uncertainty.


CHAPTER 4

PROBABILITY CONCEPTS

LEARNING OUTCOMES
After completing this chapter, you will be able to do the following:
■ Define a random variable.
■ Explain the two defining properties of probability.
■ Explain how the term event is used in probability.
■ Distinguish among empirical, a priori, and subjective probabilities.
■ State the probability of an event in terms of odds for or against the event.
■ Identify probabilities that are not consistent.
■ Describe the investment consequences of probabilities that are not consistent.
■ Distinguish between unconditional and conditional probabilities.
■ Define a joint probability.
■ State the multiplication rule for probabilities.
■ Calculate the joint probability of two events, using the multiplication rule for
probabilities.
■ State the addition rule for probabilities.
■ Calculate the probability that at least one of two events will occur, using the
addition rule.
■ Distinguish between dependent and independent events.
■ State the multiplication rule for independent events.
■ Calculate a joint probability of any number of independent events.
■ Calculate an unconditional probability, using the total probability rule.
■ Define and calculate expected value.
■ Explain the use of expected value in investment applications.
■ Define variance and standard deviation.
■ Explain the use of variance and standard deviation in investment applications.
■ Explain the use of conditional expectation in investment applications.
■ Calculate an expected value, using the total probability rule for expected value.
■ Diagram an investment problem, using a tree diagram.
■ Explain the properties of expected value.
■ Explain the properties of variance.
■ Define and calculate covariance.
■ Explain the properties of covariance.
■ Explain the relationship among covariance, standard deviation, and correlation.
■ Explain the concept of covariance matrices.
■ Calculate the expected return on a portfolio.
■ Explain the inputs to calculating the variance of return on a portfolio.
■ Calculate the variance of return on a portfolio.
■ Calculate covariance, given a joint probability function.
■ State Bayes' formula.
■ Calculate an updated probability, using Bayes' formula.
■ Calculate the number of ways a specified number of steps can be done, using
the multiplication rule of counting.
■ Solve counting problems using the factorial, combination, and permutation
notations.
■ Distinguish between problems for which different counting methods are
appropriate.
■ Calculate the number of ways to choose r objects from a total of n objects,
where the order in which the r objects are listed does not matter.
■ Calculate the number of ways to choose r objects from a total of n objects,
where the order in which the r objects are listed does matter.

1 INTRODUCTION
All investment decisions are made in an environment of risk. The tools that allow us to
make decisions with consistency and logic in this setting come under the heading of prob-
ability. This chapter presents the essential probability tools needed to frame and address
many real world problems involving risk. We illustrate how these tools apply to such issues
as predicting investment manager performance, forecasting financial variables, and pricing
a bond so that it fairly compensates bondholders for default risk. In contrast to most intro-
ductions to probability, we de-emphasize mathematics but explore concepts important to
investments more fully. One such concept is independence, as independence relates to the
predictability of returns and financial variables. Another concept which receives special at-
tention is expectation, as analysts continually look to the future in their analyses and deci-
sions. Analysts and investors must also cope with variability. We present variance, or dis-
persion around expectation, as a risk concept important in investments. You will acquire
specific skills in using portfolio expected return and variance.
The basic tools of probability, including expected value and variance, are set out in
Section 2 of this chapter. Section 3 introduces covariance and correlation (measures of re-
latedness between random quantities) and the principles for calculating portfolio expected
return and variance. Two topics end the chapter: Bayes' formula and outcome counting.
Bayes' formula is a procedure for updating beliefs based on new information. In several
areas, including a widely used option pricing model, the calculation of probabilities
involves defining and counting outcomes. The chapter ends with a discussion of principles
and shortcuts for counting.

2 PROBABILITY, EXPECTED VALUE, AND VARIANCE


The probability concepts and tools that an investment analyst needs to know for most of his
or her work are relatively few and simple. However, they require thought to apply. This
section presents the essential tools for working with probability, expectation, and variance,
drawing on examples from equity and fixed income analysis.
An investor's concerns center on returns. The return on a risky asset is an example of
a random variable, a quantity whose outcomes are uncertain. For example, an investor's
expectation in making an investment may be that it will earn a return of 14 percent. How
likely is that return? Fourteen percent is a particular value or outcome of the random vari-
able. In probability discussions, an outcome is also one type of event. An event is any out-
come or specified set of outcomes of a random variable. To return to the question: How
likely is that return of 14 percent?
The answer to this question is a probability. A probability is a number between 0 and
1 that gives the chance that a stated event occurs. If the probability is 0.10 that a stock
earns a return of 14 percent, there is a 10 percent chance of that return happening. If an
event is impossible, it has a probability of 0. If an event is certain to happen, it has a
probability of 1. If an event is impossible or a sure thing, it is not random at all. So 0 and 1 serve
as the two endpoints of probability.
To save words, it is common to use a capital letter in italics, such as A, to represent an
event, after it has been defined. P with parentheses stands for "the probability of (the event in
parentheses)" as in P(E) for "the probability of event E." Probability as a function of the
distinct possible outcomes of a random variable is the probability function of the random
variable. There are two properties of probability which together constitute its definition.

• Definition of Probability. The two defining properties of a probability are as follows:
1. 0 ≤ P(E) ≤ 1: the probability of any event E is a number between 0 and 1.
2. The sum of the probabilities of any list of mutually exclusive and exhaustive
events equals 1.

In the above definition, the term mutually exclusive events means that only one
event can occur at a time; exhaustive means that the events cover all possible outcomes.
The most basic kind of mutually exclusive and exhaustive events is the set of the distinct
possible outcomes of the random variable. If we have that set and the assignment of
probabilities to those outcomes (the probability distribution of the random variable), we have
a complete description of the random variable.
Suppose we have a statement of the possible outcomes of stock returns and we know
their probabilities. But we are interested in the probability of a more complex event than a
particular outcome: What is the probability that the stock earns a return above the risk-free
rate? (We use italics to highlight statements that define events, in this chapter.) The
probability of any event is the sum of the probabilities of the distinct outcomes (here, stock
return outcomes) included in the definition of the event. So if the risk-free rate is 4 percent,
we would sum the probabilities of returns above 4 percent. And that raises a question: How
do we, in practice, obtain probabilities?
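
As a minimal sketch of the two ideas above, the following Python fragment checks the two defining properties of probability for a small distribution and then sums outcome probabilities to obtain the probability of an event. The return outcomes and their probabilities are hypothetical values chosen for illustration, and the function names are our own.

```python
from fractions import Fraction

# Hypothetical distribution of stock returns: outcome -> probability.
# These numbers are illustrative assumptions, not data from the chapter.
return_distribution = {
    Fraction(-5, 100): Fraction(2, 10),
    Fraction(2, 100): Fraction(3, 10),
    Fraction(8, 100): Fraction(4, 10),
    Fraction(14, 100): Fraction(1, 10),
}


def is_valid_distribution(dist):
    """Check the two defining properties of probability."""
    each_between_0_and_1 = all(0 <= p <= 1 for p in dist.values())
    sums_to_1 = sum(dist.values()) == 1
    return each_between_0_and_1 and sums_to_1


def event_probability(dist, in_event):
    """Sum the probabilities of the distinct outcomes included in the event."""
    return sum(p for outcome, p in dist.items() if in_event(outcome))


# Event: the stock earns a return above the risk-free rate (4 percent).
risk_free_rate = Fraction(4, 100)
p_above_risk_free = event_probability(
    return_distribution, lambda r: r > risk_free_rate
)

print(is_valid_distribution(return_distribution))
print(float(p_above_risk_free))
```

Exact fractions are used so that the check that the probabilities sum to 1 is not disturbed by floating-point rounding.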

In investments, the probability of an event is very often estimated from data, as a rel-
ative frequency of occurrence. This is an empirical probability. We will point out empiri-
cal probabilities in several places in which they are used in this chapter. Relationships have
to be stable through time for empirical probabilities to be accurate. We cannot calculate an
empirical probability of an event not in the historical record, or a reliable empirical proba-
bility for a very rare event. There are cases, then, in which we may adjust an empirical prob-
ability to take account of perceptions of changing relationships. In other cases, we do not
have an empirical probability to use at all. We may also make a personal assessment of prob-
ability without reference to any particular data. Each of these three probabilities is a sub-
jective probability, one drawing on personal or subjective judgment. Subjective probabili-
ties are of great importance in investments. Investors, in making buy and sell decisions that
determine asset prices, often draw on subjective probabilities. Subjective probabilities ap-
pear in various places in this chapter, notably in our discussion of Bayes' formula. In a more
narrow range of well-defined problems, we can sometimes deduce probabilities by reason-
ing about the problem. The resulting probability is an a priori probability, one based on
logical analysis rather than on observation or personal judgment. We will use this type of
probability in Example 4-6. The counting methods we discuss later are particularly impor-
tant in calculating an a priori probability. Because a priori and empirical probabilities gen-
erally do not vary from person to person, they are often grouped as objective probabilities.
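
An empirical probability is a relative frequency, which can be sketched in a few lines of Python. The return history below is invented for illustration only, and the helper name is our own.

```python
# Hypothetical record of quarterly returns in percent (assumed data).
historical_returns = [3.1, -1.2, 0.4, 5.0, -2.3, 1.8, 2.2, -0.5]


def empirical_probability(observations, in_event):
    """Relative frequency of the event in the historical record."""
    occurrences = sum(1 for x in observations if in_event(x))
    return occurrences / len(observations)


# Event: the quarterly return is positive.
p_positive = empirical_probability(historical_returns, lambda r: r > 0)

# An event absent from the record gets a relative frequency of 0, which
# is not a usable estimate of the chance of a rare event.
p_crash = empirical_probability(historical_returns, lambda r: r < -10)

print(p_positive, p_crash)
```

The second calculation illustrates the limitation noted above: a relative frequency says nothing reliable about events that are missing from, or very rare in, the historical record.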
In business, we often meet probabilities stated in terms of odds, as "the odds for E"
or the "odds against E," for example.1 These terms can be defined as follows:

• Probability Stated as Odds. Given a probability P(E),

1. Odds for E = P(E)/[1 - P(E)]. In words, the odds for E are the probability of E
divided by 1 minus the probability of E. Given odds for E of "a to b" (for example,
"7 to 2"), the implied probability of E is a/(a + b).
2. Odds against E = [1 - P(E)]/P(E), the reciprocal of odds for E. Given odds
against E of "a to b," the implied probability of E is b/(a + b).

As an example of Statement 1, if P(E) = 1/3, the odds for E are (1/3)/(2/3) = 1/2, or "1 to
2." For odds of "1 to 2," the implied probability is 1/(1 + 2) = 1/3, as expected. As
an example of Statement 2, in wagering it is common to speak in terms of the odds against
something. For odds of "2 to 1" against E (an implied probability of E of 1/3), a $1 wager
on E, if successful, returns $2 in profits plus the $1 staked in the wager. The bet's
anticipated profit is $0 because (1/3 probability of winning) × ($2 profit if the wager is won) +
(2/3 probability of losing) × (-$1 loss if the wager is lost) = 0. This is an example of an
expected value calculation, which we define later.
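
The conversions between probabilities and odds can be sketched in Python as follows. The function names are our own, and exact fractions are used so the results match the arithmetic in the text.

```python
from fractions import Fraction


def odds_for(p):
    """Odds for E = P(E) / [1 - P(E)]."""
    return p / (1 - p)


def odds_against(p):
    """Odds against E = [1 - P(E)] / P(E)."""
    return (1 - p) / p


def probability_from_odds_for(a, b):
    """Odds for E of 'a to b' imply P(E) = a / (a + b)."""
    return Fraction(a, a + b)


def probability_from_odds_against(a, b):
    """Odds against E of 'a to b' imply P(E) = b / (a + b)."""
    return Fraction(b, a + b)


p = Fraction(1, 3)
print(odds_for(p))                      # odds for E of "1 to 2"
print(probability_from_odds_for(1, 2))  # back to P(E) = 1/3

# A $1 wager on E at odds of "2 to 1" against E.
p_win = probability_from_odds_against(2, 1)
expected_profit = p_win * 2 + (1 - p_win) * (-1)
print(expected_profit)
```

The last three lines reproduce the wager in the text: a 1/3 chance of a $2 profit and a 2/3 chance of a $1 loss give an anticipated profit of zero.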

EXAMPLE 4-1. Profiting from Inconsistent Probabilities.

You are examining the common stock of two firms in the same industry in which an
important antitrust decision will be announced next week. The first firm, SmithCo
Corporation, will benefit by a governmental decision that there is no antitrust obsta-
cle related to a merger in which it is involved. You believe that SmithCo's share price
reflects a 0.85 probability of such a decision. A second firm, Selbert Corporation,
will equally benefit from a "go ahead" ruling. Surprisingly, you believe Selbert stock
reflects only a 0.50 probability of a favorable decision. Assuming your analysis is
correct, what investment strategy would profit from this pricing discrepancy?

1
In certain econometric and statistical applications, probability is also stated as odds.

You start by thinking about the logical possibilities. One possibility is that the
probability of 0.50 reflected in Selbert's share price is accurate. In that case, Selbert
is fairly valued but SmithCo is overvalued, as its current share price overestimates
the probability of a "go ahead" decision. The second possibility is that the probabil-
ity of 0.85 is accurate. In that case, SmithCo shares are fairly valued, but Selbert
shares, which build in a lower probability of a favorable decision, are undervalued.
You diagram the situation as shown in Table 4-1.

TABLE 4-1 Worksheet for Investment Problem

             True Probability of a "Go Ahead" Decision
             0.50                     0.85
SmithCo      Shares Overvalued        Shares Fairly Valued
Selbert      Shares Fairly Valued     Shares Undervalued

The 0.50 probability column shows that Selbert shares are a better value than
SmithCo shares. Selbert shares are also a better value if a 0.85 probability is accu-
rate. On average, SmithCo shares are overvalued and Selbert shares are undervalued.
Your investment actions depend on your confidence in your analysis and on
any investment constraints you face (such as constraints on selling stock short).2 A
conservative strategy would be to buy Selbert shares and reduce or eliminate any
current position in SmithCo. The most aggressive strategy is to short SmithCo stock
(relatively overvalued) and simultaneously buy the stock of Selbert (relatively un-
dervalued). This is known as a pairs arbitrage trade: a trade in two closely related
stocks involving the short sale of one and the purchase of the other.
The prices of SmithCo and Selbert shares reflect probabilities that are not con-
sistent. According to the Dutch Book Theorem,3 one of the most important proba-
bility results for investments, inconsistent probabilities create profit opportunities. In
our example, investors, by their buy and sell decisions to exploit the inconsistent
probabilities, should eliminate the profit opportunity and inconsistency.
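
The wagering idea behind the Dutch Book Theorem can be sketched in Python. The illustration uses a $100 bet on X at odds of 10 to 1 against X, combined with a $600 bet against X at odds of 1 to 1 against X; the function names and the payoff convention (at odds of "a to b" against X, a bet on X risks b to win a) are our own assumptions.

```python
def profit_betting_on(stake, a, b, occurs):
    """Profit from staking `stake` ON X at odds of 'a to b' against X."""
    return stake * a / b if occurs else -stake


def profit_betting_against(stake, a, b, occurs):
    """Profit from staking `stake` AGAINST X at odds of 'a to b' against X."""
    return -stake if occurs else stake * b / a


def combined_profit(occurs):
    """Total profit from holding both wagers for a given outcome of X."""
    on_x = profit_betting_on(100, 10, 1, occurs)
    against_x = profit_betting_against(600, 1, 1, occurs)
    return on_x + against_x


# The two sets of odds imply P(X) = 1/11 and P(X) = 1/2, which are inconsistent.
for occurs in (True, False):
    print(occurs, combined_profit(occurs))
```

Under these assumptions the combined position earns $400 if X occurs and $500 if it does not, so the profit is positive whatever the outcome. That is what makes the pair of wagers a Dutch book.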

2
Selling short or shorting stock is selling borrowed shares in the hope that you can repurchase them later at a
lower price.
3
The theorem's name comes from the terminology of wagering. Suppose someone places a $100 bet on X at
odds of 10 to 1 against X, and later he is able to place a $600 bet against X at odds of 1 to 1 against X. Whatever
the outcome of X, that person makes a riskless profit ($400 if X occurs, $500 if X does not occur) because the
implied probabilities are inconsistent. He is said to have made a Dutch book in X. Ramsey (1931) presented the
problem of consistent probabilities. See also Lo (1999).

Probabilities are either unconditional or conditional. The probability in answer to the
straightforward question, What is the probability of this event A?, is an unconditional
probability, denoted P(A). Unconditional probabilities are also frequently referred to as
marginal probabilities.4 Suppose the question is: What is the probability that the stock
earns a return above the risk-free rate? The answer is an unconditional probability that can
be viewed as the ratio of two quantities. In the numerator is the sum of the probabilities of
stock returns above the risk-free rate. In the denominator is 1, the sum of the probabilities
of all possible returns.
Contrast the question, What is the probability of A? with the question, What is the
probability of A, given that B has occurred? The probability in answer to this last question
is a conditional probability, denoted P(A | B) (read: "the probability of A given B"). For
example, suppose we want to know the probability that the stock earns a return above the
risk-free rate, given that the stock earns a positive return. With the words "given that" we
are restricting returns to those larger than 0 percent; this is a new element in contrast to the
question that brought forth an unconditional probability. The conditional probability is
calculated as the ratio of two quantities. The numerator is the sum of the probabilities of stock
returns above the risk-free rate; in this particular case, the numerator is the same as it was
in the unconditional case. The denominator, however, changes from 1 to the sum of the
probabilities for all outcomes (returns) above 0 percent; the denominator is a number less
than 1, as negative returns are possible.5 To review, an unconditional probability is the
probability of an event without any restriction; it might even be thought of as a stand-alone
probability. A conditional probability, in contrast, is a probability of an event given that
another event has occurred.
Investors continually seek an information edge that will help improve their forecasts.
In mathematical terms, they are attempting to frame their view of the future using proba-
bilities conditioned on relevant information or events. Investors do not ignore useful infor-
mation; they adjust their probabilities to reflect it. Thus, the concepts of conditional proba-
bility and conditional expectation, which are discussed later, are extremely important in
investment analysis and financial markets. To state an exact definition of conditional prob-
ability, we need to introduce the concept of joint probability.
Suppose we ask the question: What is the probability of both A and B happening?
The answer to this question is a joint probability, denoted P(AB) (read: "the probability of
A and B"). If we think of the probability of A and the probability of B as sets built of the
outcomes of one or more random variables, the joint probability of A and B is the sum of
the probabilities of the outcomes they have in common. For example, consider two events:
the stock earns a return above the risk-free rate (A) and the stock earns a positive return
(B). The outcomes of A are contained within (are a subset of) the outcomes of B, so P(AB)
equals P(A). We can now state a definition of conditional probability that provides a for-
mula for calculating it.

• Definition of Conditional Probability. The conditional probability of A given that
B has occurred is equal to the joint probability of A and B divided by the probability
of B (assumed not to equal 0).

P(A | B) = P(AB)/P(B), P(B) ≠ 0 (4-1)
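
A short Python sketch of Equation 4-1 follows, using a hypothetical return distribution (the outcomes and probabilities are assumed for illustration, and the function names are our own). Event A is the stock earns a return above the risk-free rate of 4 percent, and event B is the stock earns a positive return.

```python
from fractions import Fraction

# Hypothetical distribution of stock returns: outcome -> probability.
return_distribution = {
    Fraction(-5, 100): Fraction(2, 10),
    Fraction(2, 100): Fraction(3, 10),
    Fraction(8, 100): Fraction(4, 10),
    Fraction(14, 100): Fraction(1, 10),
}


def probability(dist, in_event):
    """Sum the probabilities of the outcomes included in the event."""
    return sum(p for outcome, p in dist.items() if in_event(outcome))


def conditional_probability(dist, in_a, in_b):
    """P(A | B) = P(AB) / P(B), defined only when P(B) is not 0."""
    p_b = probability(dist, in_b)
    if p_b == 0:
        raise ValueError("P(B) must not equal 0")
    p_ab = probability(dist, lambda x: in_a(x) and in_b(x))
    return p_ab / p_b


def above_risk_free(r):
    return r > Fraction(4, 100)


def positive(r):
    return r > 0


def negative(r):
    return r < 0


p_a = probability(return_distribution, above_risk_free)
p_a_given_positive = conditional_probability(
    return_distribution, above_risk_free, positive
)
p_a_given_negative = conditional_probability(
    return_distribution, above_risk_free, negative
)
print(p_a, p_a_given_positive, p_a_given_negative)
```

With these assumed numbers, conditioning on a positive return raises the probability from 1/2 to 5/8, while conditioning on a negative return drops it to 0. Conditioning can therefore move a probability in either direction.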

4
In analyses of probabilities presented in tables, unconditional probabilities usually appear at the ends or
margins of the table, thus the term marginal probability. Because of possible confusion with the way marginal
is used in economics (roughly meaning incremental), we use the term unconditional probability throughout this
discussion.
5
In this example, the conditional probability is larger than the unconditional probability. We cannot generalize
from this example, however. For instance, the probability that the stock earns a return above the risk-free rate
given that the stock earns a negative return is 0.

Sometimes we know the conditional probability P(A | B) and we want to know the joint
probability P(AB). We can obtain the joint probability from the following multiplication
rule for probabilities, which is Equation 4-1 rearranged.

• Multiplication Rule for Probabilities. The joint probability of A and B can be
expressed as

P(AB) = P(A | B)P(B) (4-2)

Equation 4-2 states that the joint probability of A and B equals the probability of A given B
times the probability of B. As P(AB) = P(BA), the expression P(AB) = P(BA) =
P(B | A)P(A) is equivalent to Equation 4-2.

EXAMPLE 4-2. Conditional Probabilities and Predictability of Mutual Fund
Performance (1).

Kahn and Rudd (1995) examined whether historical performance predicts future per-
formance for a sample of mutual funds that included 300 actively managed U.S. do-
mestic equity funds. One approach they used involved calculating each fund's expo-
sure to a set of style indexes (the term style captures the distinctions of growth/value
and large-capitalization/mid-capitalization/small-capitalization). After establishing a
style benchmark (a comparison portfolio matched to the fund's style) for each fund,
Kahn and Rudd computed the fund's selection return for two periods. They defined
selection return as fund return minus the fund's style-benchmark return. The first pe-
riod was October 1990 to March 1992. The top 50 percent of funds by selection re-
turn for that period were labeled winners; the bottom 50 percent were labeled losers.
Based on selection return in the next period, April 1992 to September 1993, the top
50 percent of funds were tagged as winners and the bottom 50 percent as losers for
that period. An excerpt from their results is given in Table 4-2. The winner-winner
entry, for example, shows that 79 of the 150 first-period winner funds were winners
in the second period (52.6% = 79/150).

TABLE 4-2 Equity Selection Returns

Period 1: October 1990 to March 1992
Period 2: April 1992 to September 1993
Entries are number of funds (percent of row total in parentheses)

                   Period 2 Winner    Period 2 Loser
Period 1 winner    79 (52.6%)         71 (47.4%)
Period 1 loser     71 (46.9%)         79 (53.1%)

Source: Kahn and Rudd (1995), Table 3.

1. The four entries in parentheses in the table can be viewed as conditional
probabilities. State the four events that define the four conditional
probabilities.
2. Restate the four entries of the table as conditional probabilities. Use the form
P(this event | that event) = number.
3. Are the conditional probabilities in Part 2 empirical, a priori, or subjective
probabilities?
4. Using information in the table, calculate the probability of the event a fund is a
loser in both period 1 and period 2. (Note that because 50 percent of funds
are categorized as losers in each period, the unconditional probability that a
fund is labeled a loser in either period is 0.5.)

Solution to 1. The four events needed to define the conditional probabilities
are as follows:

Fund is a period 1 winner
Fund is a period 1 loser
Fund is a period 2 winner
Fund is a period 2 loser

Solution to 2.
From row 1:

P(fund is a period 2 winner | fund is a period 1 winner) = 0.526
P(fund is a period 2 loser | fund is a period 1 winner) = 0.474

From row 2:

P(fund is a period 2 winner | fund is a period 1 loser) = 0.469
P(fund is a period 2 loser | fund is a period 1 loser) = 0.531

Solution to 3. These probabilities are calculated from data, so are empirical
probabilities.
Solution to 4. The estimated probability is 0.266. We use Equation 4-2:

P(fund is a period 2 loser and fund is a period 1 loser) = P(fund is a period 2
loser | fund is a period 1 loser) × P(fund is a period 1 loser) = 0.531 × 0.50
= 0.2655, or a probability of 0.266.
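
The calculation in Part 4 can be sketched in Python. The conditional probabilities are the percentages reported in Table 4-2 and the 0.50 unconditional probabilities come from the problem statement; the variable and function names are our own.

```python
# P(period 2 result | period 1 result), keyed as (period 1, period 2),
# taken from the percentages reported in Table 4-2.
conditional = {
    ("winner", "winner"): 0.526,
    ("winner", "loser"): 0.474,
    ("loser", "winner"): 0.469,
    ("loser", "loser"): 0.531,
}

# Unconditional probabilities of the period 1 labels (50 percent each).
p_period1 = {"winner": 0.50, "loser": 0.50}


def joint_probability(period1, period2):
    """Multiplication rule: P(AB) = P(A | B) P(B)."""
    return conditional[(period1, period2)] * p_period1[period1]


p_loser_both = joint_probability("loser", "loser")
print(round(p_loser_both, 4))
```

The same function gives the other three joint probabilities, for example `joint_probability("winner", "winner")` for funds that are winners in both periods.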

When we have two events, A and B, that we are interested in, we often want to know
the probability that either A or B occurs. Here by or we mean an inclusive-or: that either A
or B occurs, or both A and B occur. To put this another way, the probability of A or B is the
probability that at least one of the two events occurs. Such probabilities are calculated
using the addition rule for probabilities.

• Addition Rule for Probabilities. Given events A and B, the probability that A or B
occurs, or both occur, is equal to the probability that A occurs, plus the probability
that B occurs, minus the probability that both A and B occur.

P(A or B) = P(A) + P(B) - P(AB) (4-3)

If we think of the individual probabilities of A and B as sets built of outcomes of one
or more random variables as shown in Figure 4-1, the first step in calculating the
probability of A or B is to sum the probabilities of the outcomes in A to obtain P(A). If A and B
share any outcomes, then adding P(B) to P(A) would count the probabilities of those
shared outcomes twice. So we add to P(A) the quantity [P(B) - P(AB)], which is the
probability of outcomes in B net of the probability of any outcomes already counted when
we computed P(A). This is illustrated in Figure 4-1, where we avoid double-counting the
outcomes in the intersection of A and B by subtracting P(AB). As an example of the
calculation, if P(A) = 0.50, P(B) = 0.40, and P(AB) = 0.20, then P(A or B) = 0.50 + 0.40 -
0.20 = 0.70. Only if the two events A and B were mutually exclusive, so that P(AB) = 0,
would it be correct to state that P(A or B) = P(A) + P(B).
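
A minimal Python sketch of the addition rule follows; the function name and the input check are our own additions.

```python
def probability_a_or_b(p_a, p_b, p_ab):
    """Addition rule: P(A or B) = P(A) + P(B) - P(AB)."""
    if p_ab > min(p_a, p_b):
        raise ValueError("P(AB) cannot exceed P(A) or P(B)")
    return p_a + p_b - p_ab


# The example from the text: events that share some outcomes.
p_overlapping = probability_a_or_b(0.50, 0.40, 0.20)

# Mutually exclusive events: P(AB) = 0, so the probabilities simply add.
p_exclusive = probability_a_or_b(0.50, 0.40, 0.0)

print(round(p_overlapping, 2), round(p_exclusive, 2))
```

Subtracting P(AB) is what prevents the shared outcomes from being counted twice; with mutually exclusive events there is nothing to subtract.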

FIGURE 4-1 Addition Rule for Probabilities
[Figure: events A and B drawn as overlapping sets; the intersection is labeled "A and B."]

The next example shows how much useful information can be obtained using the few prob-
ability rules presented to this point.

EXAMPLE 4-3. Probability of a Limit Order Executing.

You have two buy limit orders outstanding on the same stock. A limit order to buy
stock at a stated price is an order to buy at that price or lower. A number of vendors,
including an Internet service that you use, supply the estimated probability that a
limit order will be filled within a stated time horizon, given the current stock price
and the price limit. One buy order (order 1) was placed at a price limit of $10. The
probability that it will execute within one hour is 0.35. The second buy order (order
2) was placed at a price limit of $9.75; it has a 0.25 probability of executing within
the same one-hour time frame.

1. What is the probability that either order 1 or order 2 will execute?
2. What is the probability that order 2 executes, given that order 1 executes?

Solution to 1. The probability is 0.35. The calculation uses the addition rule
for probabilities:

P(order 1 executes or order 2 executes) = P(order 1 executes) + P(order 2
executes) - P(order 1 executes and order 2 executes) = 0.35 + 0.25 - 0.25
= 0.35

Note that P(order 1 executes and order 2 executes) = P(order 1 executes | order 2
executes)P(order 2 executes) = 1 × 0.25 = 0.25. P(order 1 executes | order 2
executes) = 1 because, if order 2 executes, it is certain that order 1 also executes: Price
must pass through $10 to reach $9.75.

Note that the outcomes for which order 2 executes are a subset of the outcomes for
which order 1 executes. After you count the probability that order 1 executes, you
have counted the probability of the outcomes for which order 2 also executes.
Therefore, the answer to the question is the probability that order 1 executes, 0.35.
Solution to 2. If the first order executes, the probability that the second
order executes (stated as a percent) is 71.4 percent. In the solution to Part 1, you
found that P(order 1 executes and order 2 executes) = P(order 1 executes | order 2
executes)P(order 2 executes) = 1 × 0.25 = 0.25. An equivalent way to state this
joint probability is useful here:

P(order 1 executes and order 2 executes) = 0.25 =
P(order 2 executes | order 1 executes) × P(order 1 executes)

Now P(order 1 executes) = 0.35 was a given, so you have one equation in one
unknown:

0.25 = P(order 2 executes | order 1 executes) × 0.35

You conclude that P(order 2 executes | order 1 executes) = 0.25/0.35 = 5/7, or about
0.714.
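
The two solutions can be traced step by step in Python. The inputs are the probabilities given in the example; the variable names are our own.

```python
p_order1 = 0.35  # P(order 1 executes), limit price $10
p_order2 = 0.25  # P(order 2 executes), limit price $9.75

# If order 2 executes, order 1 must have executed as well, because
# price must pass through $10 to reach $9.75.
p_order1_given_order2 = 1.0

# Multiplication rule (Equation 4-2): the joint probability.
p_both = p_order1_given_order2 * p_order2

# Part 1, addition rule (Equation 4-3): at least one order executes.
p_either = p_order1 + p_order2 - p_both

# Part 2, conditional probability (Equation 4-1).
p_order2_given_order1 = p_both / p_order1

print(round(p_both, 2), round(p_either, 2), round(p_order2_given_order1, 3))
```

The only modeling judgment is the value of `p_order1_given_order2`; everything else follows from the multiplication and addition rules.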

Of great interest to investment analysts are the concepts of independence and de-
pendence. These concepts bear on such basic investment questions as which financial vari-
ables are useful for investment analysis, whether asset returns can be predicted, and
whether superior investment managers can be selected on the basis of their past records.
Two events are independent if the occurrence of one event does not affect the proba-
bility of occurrence of the other event.

• Definition of Independent Events. Two events A and B are independent if and only
if P(A | B) = P(A) or, equivalently, P(B | A) = P(B).

When two events are not independent, they are dependent: the occurrence of one is related
to the probability of occurrence of the other. If we are trying to forecast one event, infor-
mation about a dependent event may be useful, but information about an independent event
will not be useful.
When two events are independent, the multiplication rule for probabilities, Equation
4-2, simplifies as follows.

• Multiplication Rule for Independent Events. When two events are independent,
the joint probability of A and B equals the product of the individual probabilities of
A and B.

P(AB) = P(A)P(B) (4-4)

Thus, if we are interested in two independent events with probabilities of 0.75 and 0.50,
respectively, the probability that they both occur is 0.375 = 0.75 × 0.50. The multiplication
rule for independent events generalizes to more than two events; for example, if A, B, and
C are independent events, then P(ABC) = P(A)P(B)P(C).

EXAMPLE 4-4. BankCorp's Earnings per Share (1).

As part of your work as a banking industry analyst, you build models for forecasting
earnings per share (EPS) of the banks you cover. Today you are studying BankCorp.
The historical record shows that in 55 percent of recent quarters BankCorp's EPS
has increased sequentially, and in 45 percent of quarters EPS has decreased or
remained unchanged sequentially.6 At this point in your analysis, you are assuming
that changes in sequential EPS are independent.
Earnings per share for 2Q:2001 (that is, EPS for the second quarter of 2001)
was larger than EPS for 1Q:2001.

1. What is the probability that 3Q:2001 EPS will be larger than 2Q:2001 EPS (a
positive change in sequential EPS)?
2. What is the probability of two negative changes in sequential EPS (3Q:2001
EPS smaller than 2Q:2001 EPS, and 4Q:2001 EPS smaller than 3Q:2001
EPS)?

Solution to 1. Under the assumption of independence, the probability that
3Q:2001 EPS will be larger than 2Q:2001 EPS is the unconditional probability of
positive change, 0.55. That 2Q:2001 EPS was larger than 1Q:2001 EPS is not
useful information, as the next change in EPS is independent of the prior change.
Solution to 2. The probability of two negative changes in a row is
0.2025 = 0.45 × 0.45.
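
Under the independence assumption, the probability of a run of identical changes is a power of the single-quarter probability. The Python sketch below uses the probabilities from the example; the function name is our own.

```python
p_increase = 0.55      # P(EPS increases sequentially)
p_not_increase = 0.45  # P(EPS decreases or is unchanged)


def probability_of_run(p, length):
    """Probability that an event with probability p occurs in each of
    `length` independent quarters (multiplication rule, Equation 4-4)."""
    return p ** length


# Part 1: the prior quarter's change carries no information.
p_next_increase = p_increase

# Part 2: two negative changes in a row.
p_two_negative = probability_of_run(p_not_increase, 2)

print(p_next_increase, round(p_two_negative, 4))
```

If the independence assumption were dropped, the second probability would instead require a conditional probability for the later quarter, as in Equation 4-2.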

The following example illustrates how hard it is to satisfy a set of independent crite-
ria even when, individually, the criteria may not be stringent.

EXAMPLE 4-5. Screening Stocks for Investment.


You have developed a stock screen, that is, a set of criteria for selecting stocks. Your
investment universe (the set of securities from which you make your choices) is the
Russell 1000, an index of 1,000 large-capitalization U.S. equities. Your criteria
capture different aspects of the selection problem; you believe that the criteria are
independent, to a close approximation.

Criterion                                   Percent of Russell 1000 Stocks
                                            Meeting Criterion
First valuation criterion                   50%
Second valuation criterion                  50%
Analyst coverage criterion                  25%
Profitability criterion for company         55%
Financial strength criterion for company    67%

How many stocks do you expect to pass your screen?

[6] Sequential comparisons of quarterly EPS are with the immediately prior quarter. A sequential comparison
stands in contrast to a comparison with the same quarter one year ago (another frequent type of comparison).
184 Chapter 4 Probability Concepts

Only 23 stocks out of 1,000 pass through your screen. If you define five
events (the stock passes the first valuation criterion, the stock passes the second
valuation criterion, the stock passes the analyst coverage criterion, the company
passes the profitability criterion, and the company passes the financial strength
criterion), say events A, B, C, D, and E, respectively, then the probability that a
stock will pass all five criteria, under independence, is

P(ABCDE) = P(A)P(B)P(C)P(D)P(E) = 0.50 × 0.50 × 0.25 × 0.55 × 0.67 = 0.023031

Although only one of the five criteria is even moderately strict (the strictest lets
25 percent of stocks through), the probability that a stock can pass all five is only
0.023031, or about 2 percent. The size of the list of candidate investments is
0.023031 × 1,000 = 23.031, or 23 stocks.
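
The screening arithmetic can be reproduced in a few lines of Python. This is a sketch under the example's independence assumption; the names pass_rates, universe_size, and expected_survivors are illustrative and do not come from the original text.

```python
import math

# Pass rates for the five criteria, assumed independent as in the example.
pass_rates = [0.50, 0.50, 0.25, 0.55, 0.67]
universe_size = 1_000  # number of stocks in the Russell 1000

# Multiplication rule for independent events: P(ABCDE) is the product.
p_pass_all = math.prod(pass_rates)
expected_survivors = p_pass_all * universe_size

print(round(p_pass_all, 6))       # 0.023031
print(round(expected_survivors))  # 23
```

Changing any single entry of pass_rates shows how quickly the candidate list shrinks or grows as one criterion is tightened or loosened.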

An area of intense interest to investment managers and their clients is whether past
records of performance are useful in identifying repeat winners and losers. The following
example shows how this issue relates to the concept of independence.

EXAMPLE 4-6. Conditional Probabilities and Predictability of Mutual Fund
Performance (2).

The purpose of the Kahn and Rudd (1995) study, introduced in Example 4-2, was to
address the question of repeat mutual fund winners and losers. If whether a fund is a
loser in one period is independent of whether it is a winner in the next period, the
practical value of performance ranking is questionable. Using the four events defined in
Example 4-2 as building blocks, we can define the following events to address the
issue of predictability of mutual fund performance:

Fund is a period 1 winner and fund is a period 2 winner
Fund is a period 1 winner and fund is a period 2 loser
Fund is a period 1 loser and fund is a period 2 winner
Fund is a period 1 loser and fund is a period 2 loser

In Part 4 of Example 4-2, you calculated that

P(fund is a period 2 loser and fund is a period 1 loser) = 0.266

If the ranking in one period is independent of the ranking in the next period,
what would you expect P(fund is a period 2 loser and fund is a period 1 loser) to be?
Interpret the calculated probability 0.266.
By the multiplication rule for independent events, P(fund is a period 2 loser
and fund is a period 1 loser) = P(fund is a period 2 loser) × P(fund is a period 1
loser). Because 50 percent of funds are categorized as losers in each period, the
unconditional probability that a fund is labeled a loser in either period is 0.50. Thus
P(fund is a period 2 loser) × P(fund is a period 1 loser) = 0.50 × 0.50 = 0.25. If
whether a fund is a loser in one period is independent of whether a fund is a loser in
the other period, we conclude that P(fund is a period 2 loser and fund is a period 1
loser) = 0.25. This is an a priori probability because it is obtained from reasoning
about the problem. You could also reason that the four events described above define
categories, and that if funds were randomly assigned to the four categories, there is a
1/4 probability of fund is a period 1 loser and fund is a period 2 loser. If the
classifications in period 1 and period 2 were dependent, then the assignment of funds to
categories would not be random. The calculated probability of 0.266 is only slightly
above 0.25. Is this apparent slight amount of predictability the result of chance? A
test conducted by Kahn and Rudd indicated a 35.6 percent chance of observing the
tabled data if the period 1 and period 2 rankings were independent.
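
The comparison made in this example can be laid out as a short calculation. The Python sketch below is illustrative only; the variable names are assumptions, and the conditional probability it prints is derived here from the example's figures by dividing the joint probability by P(period 1 loser).

```python
# Figures from the example.
p_loser_each_period = 0.50  # half of funds are labeled losers in each period
p_joint_observed = 0.266    # observed P(period 2 loser and period 1 loser)

# Joint probability implied by independence (multiplication rule).
p_joint_if_independent = p_loser_each_period * p_loser_each_period

# Conditional probability implied by the observed joint probability:
# P(period 2 loser | period 1 loser) = P(joint) / P(period 1 loser).
p_repeat_loser = p_joint_observed / p_loser_each_period

print(p_joint_if_independent)    # 0.25
print(round(p_repeat_loser, 3))  # 0.532
```

Under independence the conditional probability would equal the unconditional 0.50, so the gap between 0.532 and 0.50 is another way of viewing the slight predictability discussed above.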

In investments, the question of whether one event (or characteristic) provides infor-
mation about another event (or characteristic) arises in both time-series settings (across
time) and cross-sectional settings (across units at a given point in time). Examples 4-4 and
4-6 illustrated independence in a time-series setting. Example 4-5 illustrated independence
in a cross-sectional setting. Independence/dependence relationships are often also
explored in both settings using regression analysis, a technique we discuss in a later chapter.
In many practical problems, we logically analyze a problem as follows: We formu-
late scenarios that we think are important for understanding the likelihood of an event that
we are interested in. We then estimate the probability of the event, given the scenario.
When the scenarios (conditioning events) are mutually exclusive and exhaustive, no possi-
ble outcomes are left out. We can then analyze the event using the total probability rule.
This rule explains the unconditional probability of the event in terms of probabilities con-
ditional on the scenarios.
The total probability rule is stated below for two cases. Part 1 gives the simplest case,
where we have two scenarios. One new notation is introduced. If we have an event or
scenario S, the event not-S, called the complement of S, is written S^C. [7] Note that
P(S) + P(S^C) = 1, as either S or not-S must occur. Part 2 states the rule for the general
case of n mutually exclusive and exhaustive events or scenarios.

• The Total Probability Rule.

1. $$P(A) = P(A \mid S)P(S) + P(A \mid S^C)P(S^C) \tag{4-5}$$

2. $$P(A) = P(A \mid S_1)P(S_1) + P(A \mid S_2)P(S_2) + \cdots + P(A \mid S_n)P(S_n) \tag{4-6}$$

where S_1, S_2, ..., S_n are mutually exclusive and exhaustive scenarios or events.

Equation 4-6 states the following: The probability of any event [P(A)] can be
expressed as a weighted average of the probabilities of the event, given scenarios [terms such
as P(A | S_1)]; the weights applied to these conditional probabilities are the respective
probabilities of the scenarios [terms such as P(S_1) multiplying P(A | S_1)], and the scenarios must
be mutually exclusive and exhaustive. Among other applications, this rule is needed to
understand Bayes' formula, which we discuss later in the chapter.
In the next example, we use the total probability rule to develop a consistent set of
views about BankCorp's earnings per share.

[7] For readers familiar with mathematical treatments of probability, S, a notation usually reserved for a concept
called the sample space, is being appropriated to stand for scenario.

EXAMPLE 4-7. BankCorp's Earnings per Share (2).

You are continuing your investigation into whether you can predict the direction of
changes in BankCorp's quarterly EPS. You define four events:

Event                                                               Probability
A   = change in sequential EPS is positive next quarter             0.55
A^C = change in sequential EPS is 0 or negative next quarter        0.45
S   = change in sequential EPS is positive the prior quarter        0.55
S^C = change in sequential EPS is 0 or negative the prior quarter   0.45

On inspecting the data, you observe some persistence in EPS changes: increases tend
to be followed by increases, and decreases by decreases. The first probability
estimate you develop is P(change in sequential EPS is positive next quarter | change in
sequential EPS is 0 or negative the prior quarter) = P(A | S^C) = 0.40. The most
recent quarter's EPS (2Q:2001) is announced, and the change is a positive sequential
change (the event S). You are interested in forecasting EPS for 3Q:2001.

1. Write this statement in probability notation: "The probability that the change
in sequential EPS is positive next quarter, given that the change in sequential
EPS is positive the prior quarter."
2. Calculate the probability in Part 1. (Calculate the probability that is consistent
with your other probabilities or beliefs.)

Solution to 1. In probability notation, this statement is written P(A | S).

Solution to 2. The probability that the change in sequential EPS is positive
for 3Q:2001, given the positive change in sequential EPS for 2Q:2001, is 0.673.
The values of the probabilities needed for the P(A | S) calculation are already
known: P(A) = 0.55, P(S) = 0.55, P(S^C) = 0.45, and P(A | S^C) = 0.40. According
to Equation 4-5,

P(A) = P(A | S)P(S) + P(A | S^C)P(S^C)
0.55 = P(A | S) × 0.55 + 0.40 × 0.45

Solving for the unknown, P(A | S) = (0.55 − 0.40 × 0.45)/0.55 = 0.672727, or
0.673.
You conclude that P(change in sequential EPS is positive next quarter | change
in sequential EPS is positive the prior quarter) = 0.673. Any other probability is not
consistent with your other estimated probabilities. Reflecting the persistence in EPS
changes, this conditional probability of a positive EPS change, 0.673, is greater than
the unconditional probability of an EPS increase, 0.55.
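
The consistency calculation in Solution to 2 can be sketched in Python. The helper total_probability and the variable names below are illustrative assumptions, not part of the original example; the helper implements the weighted-average form of the total probability rule, and the rearrangement mirrors the algebra above.

```python
def total_probability(cond_probs, scenario_probs):
    """Return P(A) as the scenario-weighted average of P(A | S_i)."""
    if abs(sum(scenario_probs) - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(c * s for c, s in zip(cond_probs, scenario_probs))

p_a = 0.55            # P(A)
p_s = 0.55            # P(S)
p_s_c = 0.45          # P(S^C)
p_a_given_s_c = 0.40  # P(A | S^C)

# Rearranging Equation 4-5 for the unknown conditional probability.
p_a_given_s = (p_a - p_a_given_s_c * p_s_c) / p_s
print(round(p_a_given_s, 3))  # 0.673

# Plugging the result back into the rule should recover P(A).
p_a_check = total_probability([p_a_given_s, p_a_given_s_c], [p_s, p_s_c])
```

The round trip through total_probability is a quick way to confirm that a set of conditional and unconditional probabilities is internally consistent.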

In the chapter on statistical concepts and market returns, we discussed the concept of
a weighted average or weighted mean. The example highlighted in that chapter was that
portfolio return is a weighted average of the returns on the individual assets in the portfo-
lio, where the weight applied to each asset's return is the fraction of the portfolio invested
in that asset. The total probability rule, which is a rule for stating an unconditional
probability in terms of conditional probabilities, is also a weighted average. In that formula,
probabilities are used as weights. Part of the definition of weighted average is that the
weights sum to 1. Probabilities of mutually exclusive and exhaustive events do sum to 1
(that is part of the definition of probability). The next weighted average we discuss, the
expected value of a random variable, also uses probabilities as weights.
The expected value of a random variable is an essential quantitative concept in in-
vestments. Investors continually make use of expected values: in estimating the rewards of
alternative investments, in forecasting EPS and other corporate financial variables and ra-
tios, and in assessing any other factor that may affect their financial position. The expected
value of a random variable is defined as follows:

• Definition of Expected Value. The expected value of a random variable is the
probability-weighted average of the possible outcomes of the random variable. For a
random variable X, the expected value of X is denoted E(X).

Expected value (for example, expected stock return) looks either to the future, as a
forecast, or to the "true" value of the mean (the population mean, discussed in the chapter on
statistical concepts and market returns). We should distinguish expected value from the
concepts of historical or sample mean. The sample mean also summarizes in a single
number a central value. However, the sample mean presents a central value for a particular set
of observations as an equally weighted average of those observations. To summarize, the
contrast is forecast versus historical, or population versus sample.

EXAMPLE 4-8. BankCorp's Earnings Per Share (3).

You continue with your analysis of BankCorp's EPS. In Table 4-3, you have
recorded a probability distribution for BankCorp's EPS for the current fiscal year.

TABLE 4-3 Probability Distribution for BankCorp's EPS

Probability    EPS
0.15           $2.60
0.45           $2.45
0.24           $2.20
0.16           $2.00
Sum = 1.00

What is the expected value of BankCorp's EPS for the current fiscal year?
Following the definition of expected value, list each outcome, weight it by its
probability, and sum the terms.

E(EPS) = (0.15 × $2.60) + (0.45 × $2.45) + (0.24 × $2.20) + (0.16 × $2.00) = $2.3405

The expected value of EPS is $2.34.
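
The same calculation can be written as a small Python function. This is a minimal sketch; the function name expected_value and the (probability, outcome) pair representation are assumptions made for illustration.

```python
def expected_value(distribution):
    """Probability-weighted average of outcomes.

    distribution is a list of (probability, outcome) pairs.
    """
    total_prob = sum(p for p, _ in distribution)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for p, x in distribution)

# Table 4-3: probability distribution for BankCorp's EPS.
eps_distribution = [(0.15, 2.60), (0.45, 2.45), (0.24, 2.20), (0.16, 2.00)]
print(round(expected_value(eps_distribution), 4))  # 2.3405
```

The check on the sum of probabilities guards against a distribution that omits an outcome.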



An equation that summarizes your calculation in Example 4-8 is

$$E(X) = P(x_1)x_1 + P(x_2)x_2 + \cdots + P(x_n)x_n = \sum_{i=1}^{n} P(x_i)x_i \tag{4-7}$$

where x_i is one of n possible outcomes of the random variable X. [8]
The expected value is our forecast. Because we are discussing random quantities, we
cannot count on an individual forecast being realized (although we hope that, on average,
forecasts will be accurate). It is important, as a result, to measure the risk we face. Variance
and standard deviation measure the dispersion of outcomes around the expected value or
forecast.

• Definition of Variance. The variance of a random variable is the expected value
(the probability-weighted average) of squared deviations from the random variable's
expected value.

$$\sigma^2(X) = E\{[X - E(X)]^2\} \tag{4-8}$$

The two notations for variance are σ²(X) and Var(X).

Variance is a number greater than or equal to 0 because it is a probability-weighted sum of
squared terms. If variance is 0, there is no dispersion or risk. The outcome is certain, and the
quantity X is not random at all. Variance greater than 0 indicates dispersion of outcomes. Increasing
variance indicates increasing dispersion, all else equal. Variance of X is a quantity in the
squared units of X. For example, if the random variable is return in percent, variance of re-
turn is in units of percent squared. Standard deviation is easier to interpret than variance, as
it is in the same units as the random variable. If the random variable is return in percent,
standard deviation of return is also in units of percent.

• Definition of Standard Deviation. Standard deviation is the positive square root
of variance.

The best way to become familiar with these concepts is to work examples.

EXAMPLE 4-9. BankCorp's Earnings Per Share (4).

In Example 4-8, you calculated the expected value of BankCorp's EPS as $2.34,
which is your forecast. Now you want to measure the dispersion around your fore-
cast. Table 4-4 shows your view of the probability distribution of EPS.

[8] For simplicity, we model all random variables in this chapter as discrete random variables, which have a
countable set of outcomes. For continuous random variables, which are discussed along with discrete random
variables in the chapter on common probability distributions, the operation corresponding to summation is
integration.

TABLE 4-4 Probability Distribution for BankCorp's EPS

Probability    EPS
0.15           $2.60
0.45           $2.45
0.24           $2.20
0.16           $2.00
Sum = 1.00

What are the variance and standard deviation of BankCorp's EPS for the current fis-
cal year?
The order of calculation is always expected value, then variance, then standard
deviation. Expected value has already been calculated. Following the definition of
variance above, calculate the deviation of each outcome from the mean or expected
value, square each deviation, weight (multiply) each squared deviation by its
probability of occurrence, then sum these terms.

σ²(EPS) = P($2.60)[$2.60 − E(EPS)]² + P($2.45)[$2.45 − E(EPS)]²
          + P($2.20)[$2.20 − E(EPS)]² + P($2.00)[$2.00 − E(EPS)]²
        = [0.15 × ($2.60 − $2.34)²] + [0.45 × ($2.45 − $2.34)²]
          + [0.24 × ($2.20 − $2.34)²] + [0.16 × ($2.00 − $2.34)²]
        = 0.01014 + 0.005445 + 0.004704 + 0.018496
        = 0.038785 dollars squared

Standard deviation is the positive square root of 0.038785 dollars squared.

σ(EPS) = (0.038785)^(1/2) = $0.196939, or approximately $0.20.
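
The order of calculation described in this example (expected value, then variance, then standard deviation) translates directly into code. The Python sketch below is illustrative; the variable names are assumptions. It carries the unrounded mean of $2.3405 rather than $2.34, so its variance differs from the hand calculation in the sixth decimal place.

```python
import math

# Table 4-4: (probability, EPS outcome) pairs.
eps_distribution = [(0.15, 2.60), (0.45, 2.45), (0.24, 2.20), (0.16, 2.00)]

# Step 1: expected value.
mean = sum(p * x for p, x in eps_distribution)

# Step 2: variance, the probability-weighted average of squared deviations.
variance = sum(p * (x - mean) ** 2 for p, x in eps_distribution)

# Step 3: standard deviation, the positive square root of variance.
std_dev = math.sqrt(variance)

print(round(variance, 4))  # 0.0388
print(round(std_dev, 4))   # 0.1969
```

Variance is in dollars squared, while standard deviation is in dollars, which is why the latter is easier to compare with the $2.34 forecast.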

An equation that summarizes your calculation of variance in Example 4-9 is

$$\sigma^2(X) = P(x_1)[x_1 - E(X)]^2 + P(x_2)[x_2 - E(X)]^2 + \cdots + P(x_n)[x_n - E(X)]^2 = \sum_{i=1}^{n} P(x_i)[x_i - E(X)]^2 \tag{4-9}$$

where x_i is one of n possible outcomes of the random variable X.


In investments, we make use of any relevant information available in making our
forecasts. When we refine our expectations or forecasts, we are typically making
adjustments based on new information or events; in these cases we are using conditional
expected values. The expected value of a random variable X given an event or scenario S is
denoted E(X | S). Suppose the random variable X can take on n distinct outcomes x_1, x_2,
..., x_n. The expected value of X conditional on S is the first outcome, x_1, times the
probability of the first outcome given S, P(x_1 | S), plus the second outcome, x_2, times the
probability of the second outcome given S, P(x_2 | S), and so forth.

$$E(X \mid S) = P(x_1 \mid S)x_1 + P(x_2 \mid S)x_2 + \cdots + P(x_n \mid S)x_n \tag{4-10}$$

We will illustrate this equation shortly.



Parallel to the total probability rule for stating unconditional probabilities in terms of
conditional probabilities, there is a principle for stating (unconditional) expected values in
terms of conditional expected values. This principle is the total probability rule for
expected value.

• The Total Probability Rule for Expected Value.

1. $$E(X) = E(X \mid S)P(S) + E(X \mid S^C)P(S^C) \tag{4-11}$$

2. $$E(X) = E(X \mid S_1)P(S_1) + E(X \mid S_2)P(S_2) + \cdots + E(X \mid S_n)P(S_n) \tag{4-12}$$

where S_1, S_2, ..., S_n are mutually exclusive and exhaustive scenarios or events.

The general case, Part 2, states that the expected value of X equals the expected value of X
given Scenario 1, E(X | S_1), times the probability of Scenario 1, P(S_1), plus the expected
value of X given Scenario 2, E(X | S_2), times the probability of Scenario 2, P(S_2), and so
forth.
To use this principle, we formulate mutually exclusive and exhaustive scenarios that
are useful for understanding the outcomes of the random variable. This approach was
employed in developing the probability distribution of BankCorp's EPS in Examples 4-8 and
4-9.
The earnings of BankCorp are interest rate sensitive, benefiting from a declining in-
terest rate environment. Suppose there is a 0.60 probability that BankCorp will operate in
a declining interest rate environment in the current fiscal year, and a 0.40 probability that it
will operate in a stable interest rate environment (assessing the chance of an increasing in-
terest rate environment as negligible). If a declining interest rate environment occurs, the
probability that EPS will be $2.60 is estimated at 0.25, and the probability that EPS will be
$2.45 is estimated at 0.75. Note that 0.60, the probability of a declining interest rate
environment, times 0.25, the probability of $2.60 EPS given a declining interest rate
environment, equals 0.15, the (unconditional) probability of $2.60 given in the table in
Examples 4-8 and 4-9 above. The probabilities are consistent. Also, 0.60 × 0.75 = 0.45, the
probability of $2.45 EPS given in Table 4-3. The tree diagram in Figure 4-2 shows the rest of
the analysis.

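FIGURE 4-2 BankCorp's Forecasted EPS

[Tree diagram. Root: E(EPS) = $2.34. The branch for declining interest rates (Prob. = 0.60)
leads to EPS = $2.60 with Prob = 0.15 and EPS = $2.45 with Prob = 0.45. The branch for
stable interest rates (Prob. = 0.40) leads to EPS = $2.20 with Prob = 0.24 and
EPS = $2.00 with Prob = 0.16.]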

Given a declining interest rate environment, we are at the node of the tree that
branches off to outcomes of $2.60 and $2.45. We can find expected EPS given a declining
interest rate environment as follows, using Equation 4-10:

E(EPS | declining interest rate environment) = (0.25 × $2.60) + (0.75 × $2.45) = $2.4875

If interest rates are stable,

E(EPS | stable interest rate environment) = (0.60 × $2.20) + (0.40 × $2.00) = $2.12

Once we have the new piece of information that interest rates are stable, for example, we
revise our original expectation of EPS from $2.34 downward to $2.12. Now using the total
probability rule for expected value (Part 1),

E(EPS) = E(EPS | declining interest rate environment)
         × P(declining interest rate environment)
       + E(EPS | stable interest rate environment)
         × P(stable interest rate environment)

So E(EPS) = ($2.4875 × 0.60) + ($2.12 × 0.40) = $2.34.
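
The two-stage calculation above (conditional expected values first, then the total probability rule for expected value) can be sketched in Python. The dictionary layout and the names scenarios, conditional_expectation, and unconditional_mean are illustrative assumptions.

```python
# Scenario tree from the text:
# scenario -> (P(scenario), [(P(EPS | scenario), EPS), ...])
scenarios = {
    "declining rates": (0.60, [(0.25, 2.60), (0.75, 2.45)]),
    "stable rates": (0.40, [(0.60, 2.20), (0.40, 2.00)]),
}

def conditional_expectation(branches):
    """E(X | S): outcomes weighted by their probabilities given the scenario."""
    return sum(p * x for p, x in branches)

# Conditional expected EPS for each scenario (Equation 4-10).
cond_means = {
    name: conditional_expectation(branches)
    for name, (_, branches) in scenarios.items()
}

# Total probability rule for expected value (Equation 4-12).
unconditional_mean = sum(
    p_s * cond_means[name] for name, (p_s, _) in scenarios.items()
)

print({name: round(m, 4) for name, m in cond_means.items()})
print(round(unconditional_mean, 4))  # 2.3405
```

The result matches the expected value computed directly from the unconditional distribution, which is the consistency requirement the text emphasizes.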


This amount is identical to the estimate of the expected value of EPS calculated di-
rectly from the probability distribution in Example 4-8. Just as our probabilities must be
consistent, so must our expected values, unconditional and conditional; otherwise our in-
vestment actions may create profit opportunities for other investors at our expense.
To review, we first developed the factors or scenarios that influence the outcome of
the event of interest. After assigning probabilities to these scenarios, we formed expecta-
tions conditioned on the different scenarios. Then we worked backward to formulate an ex-
pected value as of today. In the problem just worked, EPS was the event of interest, and the
interest rate environment was the factor influencing EPS.
We can also calculate the variance of EPS given each scenario:

σ²(EPS | declining interest rate environment)
  = P($2.60 | declining interest rate environment)
    × [$2.60 − E(EPS | declining interest rate environment)]²
  + P($2.45 | declining interest rate environment)
    × [$2.45 − E(EPS | declining interest rate environment)]²
  = [0.25 × ($2.60 − $2.4875)²] + [0.75 × ($2.45 − $2.4875)²] = 0.004219

σ²(EPS | stable interest rate environment)
  = P($2.20 | stable interest rate environment)
    × [$2.20 − E(EPS | stable interest rate environment)]²
  + P($2.00 | stable interest rate environment)
    × [$2.00 − E(EPS | stable interest rate environment)]²
  = [0.60 × ($2.20 − $2.12)²] + [0.40 × ($2.00 − $2.12)²] = 0.0096

These are conditional variances, the variance of EPS given a declining interest rate
environment and the variance of EPS given a stable interest rate environment. The
relationship between unconditional variance and conditional variance is a relatively advanced
topic. [9] The main points are that variance, like expected value, has a conditional counterpart
to the unconditional concept, and that we can use conditional variance to assess risk given
a particular scenario.
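
The conditional variances, and the two-term decomposition of unconditional variance described in footnote 9, can be checked with a short Python sketch. The names below are illustrative assumptions. The sketch uses the unrounded mean of $2.3405, so the total differs from the rounded hand calculation in the sixth decimal place.

```python
# scenario -> (P(scenario), [(P(EPS | scenario), EPS), ...])
scenarios = {
    "declining rates": (0.60, [(0.25, 2.60), (0.75, 2.45)]),
    "stable rates": (0.40, [(0.60, 2.20), (0.40, 2.00)]),
}

def mean(branches):
    return sum(p * x for p, x in branches)

def variance(branches):
    m = mean(branches)
    return sum(p * (x - m) ** 2 for p, x in branches)

cond_mean = {name: mean(b) for name, (_, b) in scenarios.items()}
cond_var = {name: variance(b) for name, (_, b) in scenarios.items()}

# Term (1): expected value of the conditional variances.
expected_cond_var = sum(p_s * cond_var[n] for n, (p_s, _) in scenarios.items())

# Term (2): variance of the conditional expected values.
grand_mean = sum(p_s * cond_mean[n] for n, (p_s, _) in scenarios.items())
var_of_cond_mean = sum(
    p_s * (cond_mean[n] - grand_mean) ** 2 for n, (p_s, _) in scenarios.items()
)

total_variance = expected_cond_var + var_of_cond_mean
print(round(cond_var["declining rates"], 6), round(cond_var["stable rates"], 6))
print(round(total_variance, 4))  # 0.0388
```

The conditional variances (about 0.0042 and 0.0096) are both far smaller than the unconditional variance, because most of the dispersion in EPS comes from uncertainty about which interest rate scenario will occur.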

EXAMPLE 4-10. BankCorp's Earnings Per Share (5).

Continuing with BankCorp, you focus now on BankCorp's cost structure. One
model you are researching for BankCorp's operating costs is

Y = a + bX

where Y is a forecast of operating costs in millions of dollars and X is the number of
branch offices. (This model was developed using regression analysis, which we will
discuss in a later chapter.) You interpret the intercept a as fixed costs and b as the
variable cost per branch office. You estimate the equation as

Y = 12.5 + 0.65X

BankCorp currently has 66 branch offices, and the equation estimates operating costs of
12.5 + 0.65 × 66 = $55.4 million. You have two scenarios for growth, pictured in
the tree diagram in Figure 4-3.

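FIGURE 4-3 BankCorp's Forecasted Operating Costs

[Tree diagram. Root: Expected Op. Costs = ? The high-growth branch (probability 0.80, as
used in the solution) leads to terminal nodes Branches = 125 and Branches = 100, each with
conditional probability 0.50. The Low Growth branch (Probability = 0.20) leads to terminal
nodes Branches = 80 (conditional probability 0.85) and Branches = 70 (conditional
probability 0.15). Each terminal node asks Op. Costs = ? and Prob = ?]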

[9] The unconditional variance of EPS is the sum of two terms: (1) the expected value (probability-weighted
average) of the conditional variances (parallel to the total probability rules), and (2) the variance of conditional
expected values of EPS. The second term arises because the variability in conditional expected value is a source
of risk. Term (1) is E[σ²(EPS | interest rate environment)] = P(declining interest rate environment) ×
σ²(EPS | declining interest rate environment) + P(stable interest rate environment) × σ²(EPS | stable interest
rate environment) = (0.60 × 0.004219) + (0.40 × 0.0096) = 0.006371. Term (2) is
σ²[E(EPS | interest rate environment)] = [0.60 × ($2.4875 − $2.34)²] + [0.40 × ($2.12 − $2.34)²] = 0.032414.
Summing the two terms, unconditional variance equals 0.006371 + 0.032414 = 0.038785.



1. Compute the forecasted operating costs given the different numbers of branch
offices, using Y = 12.5 + 0.65X. State the probability of each level of the
number of branch offices. These are the answers to the questions in the
terminal boxes of the tree diagram.
2. Compute the expected value of operating costs, given the high-growth
scenario. Also calculate the expected value of operating costs, given the
low-growth scenario.
3. Answer the question in the initial box of the tree: What are BankCorp's ex-
pected operating costs?

Solution to 1. Using Y = 12.5 + 0.65X, from top to bottom you have

Operating Costs                             Probability
Y = 12.5 + 0.65 × 125 = $93.75 million      0.80 × 0.50 = 0.40
Y = 12.5 + 0.65 × 100 = $77.50 million      0.80 × 0.50 = 0.40
Y = 12.5 + 0.65 × 80 = $64.50 million       0.20 × 0.85 = 0.17
Y = 12.5 + 0.65 × 70 = $58.00 million       0.20 × 0.15 = 0.03
                                            Sum = 1.00

Solution to 2. U.S. dollar amounts are in millions.

E(operating costs | high growth)
  = (0.50 × $93.75) + (0.50 × $77.50) = $85.625
E(operating costs | low growth)
  = (0.85 × $64.50) + (0.15 × $58.00) = $63.53

Solution to 3. U.S. dollar amounts are in millions.

E(operating costs) = E(operating costs | high growth) × P(high growth)
                   + E(operating costs | low growth) × P(low growth)
                   = ($85.625 × 0.80) + ($63.53 × 0.20) = $81.206

BankCorp's expected operating costs are $81.206 million.
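
The three solutions can be reproduced by walking the tree in code. This Python sketch is illustrative; the function operating_costs and the tree layout are assumptions, with the probabilities taken from the solution above.

```python
def operating_costs(branches, fixed=12.5, per_branch=0.65):
    """Estimated operating costs in $ millions: Y = 12.5 + 0.65X."""
    return fixed + per_branch * branches

# scenario -> (P(scenario), [(P(branch count | scenario), branch count), ...])
tree = {
    "high growth": (0.80, [(0.50, 125), (0.50, 100)]),
    "low growth": (0.20, [(0.85, 80), (0.15, 70)]),
}

# Expected operating costs given each growth scenario.
cond_costs = {
    name: sum(p * operating_costs(n) for p, n in leaves)
    for name, (_, leaves) in tree.items()
}

# Total probability rule for expected value.
expected_costs = sum(p_s * cond_costs[name] for name, (p_s, _) in tree.items())

print({name: round(c, 3) for name, c in cond_costs.items()})
print(f"{expected_costs:.3f}")
```

Because the sketch carries the low-growth conditional value as 63.525 rather than the rounded $63.53, it gives 81.205 instead of the 81.206 shown in the hand calculation.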

We will see conditional probabilities again when we discuss Bayes' formula. This
section has only introduced some of the problems that can be addressed using probability
tools. The following problem draws on these tools, as well as on analytical skills.

EXAMPLE 4-11. The Default Risk Premium for a One-Period Debt
Instrument.
As the co-manager of a short-term bond portfolio, you are reviewing the pricing of a
speculative grade, one-year maturity, zero-coupon bond. For this type of bond, the
return is the difference between the amount paid and the principal value received at
maturity. Your goal is to estimate an appropriate default risk premium for this bond.
You define the default risk premium as the extra return above the risk-free return
that will compensate investors for default risk. If R is the promised return
(yield-to-maturity) on the debt instrument and R_f is the risk-free rate, the default risk
premium is R − R_f. You assess the probability that the bond defaults as P(the bond defaults) =
0.06. Looking at current money market yields, you find that one-year Treasury bills
(T-bills) are offering a return of 5.8 percent, an estimate of R_f. As a first step, you
make the simplifying assumption that bondholders will recover nothing in the event
of a default. What is the minimum default risk premium you should require for this
instrument?
The challenge in this type of problem is to find a starting point. In many prob-
lems, including this one, an effective first step is to divide up the possible outcomes
into mutually exclusive and exhaustive events in an economically logical way. Here,
from the viewpoint of a bondholder, the two events that affect returns are the bond
defaults and the bond does not default. These two events cover all outcomes. How do
these events affect a bondholder's returns? A second step is to compute the value of
the bond for the two events. We don't have specifics on bond face value, but we can
compute value per $1 or one unit of currency invested. (It is useful to use symbols so
that a sensitivity analysis can be done.)

              The Bond Defaults    The Bond Does Not Default
Bond value    $0                   $(1 + R)

The third step is to find the expected value of the bond (per $1 invested).

E(bond) = $0 × P(the bond defaults) + $(1 + R) × [1 − P(the bond defaults)]

So E(bond) = $(1 + R) × [1 − P(the bond defaults)]. The expected value of the
T-bill per $1 invested is (1 + R_f). In fact, this value is certain because the T-bill is
risk-free. The next step requires economic reasoning. You want the default premium to be
large enough so that you expect to at least break even. This will happen if the
expected value of the bond equals the expected value of the T-bill per $1 invested.

Expected Value of Bond = Expected Value of T-Bill
$(1 + R) × [1 − P(the bond defaults)] = $(1 + R_f)

Solving for the promised return on the bond, you find R = {(1 + R_f)/[1 − P(the bond
defaults)]} − 1. Substituting in the values in the statement of the problem, R =
[1.058/(1 − 0.06)] − 1 = 1.12553 − 1 = 0.12553, or about 12.55 percent, and the
default risk premium is R − R_f = 12.55% − 5.8% = 6.75%.
You require a default risk premium of at least 675 basis points. You can state
the matter as follows: If the bond is priced to yield 12.55 percent, you will earn a 675
basis-point spread and receive the bond principal with 94 percent probability. If the
bond defaults, however, you will lose everything. With a premium of 675 basis
points, you expect to just break even relative to an investment in T-bills. Because an
investment in the zero-coupon bond has variability, if you are risk averse you might
demand a higher risk premium than 675 basis points.
This analysis is a starting point. Bondholders usually recover part of their in-
vestment after a default. A next step would be to incorporate a recovery rate. That
problem is left for the end-of-chapter problems.
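
Because the example is worked in symbols, the sensitivity analysis it mentions is easy to sketch in code. The Python function below is illustrative only; the name break_even_yield is an assumption, and it keeps the example's simplifying assumption that bondholders recover nothing in a default.

```python
def break_even_yield(risk_free_rate, p_default):
    """Promised yield R solving (1 + R)(1 - p_default) = 1 + risk_free_rate.

    Assumes zero recovery in default, as in the example.
    """
    if not 0 <= p_default < 1:
        raise ValueError("p_default must be in [0, 1)")
    return (1 + risk_free_rate) / (1 - p_default) - 1

r_f = 0.058  # one-year T-bill return from the example
r = break_even_yield(r_f, 0.06)
premium = r - r_f
print(round(r, 5), round(premium, 4))  # 0.12553 0.0675

# Sensitivity of the minimum premium to the assumed default probability.
for p in (0.02, 0.04, 0.06, 0.08):
    print(p, round(break_even_yield(r_f, p) - r_f, 4))
```

The loop shows that the minimum premium rises more than proportionally with the default probability, since the probability appears in the denominator.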
Portfolio Expected Return and Variance 195

In this section, we have treated random variables such as EPS as stand-alone quanti-
ties. We have not explored how descriptors such as expected value and variance of EPS
may be functions of other random variables such as sales and costs. To analyze portfolios,
we must understand how portfolio expected return and variance of return are a function of
characteristics of the individual securities' returns. When we look at the dispersion or vari-
ance of portfolio return, we see that how individual security returns move together or co-
vary is important. New concepts, covariance and correlation, are needed. These new con-
cepts are introduced in the next section, which deals with portfolio expected return and
variance of return.

3 PORTFOLIO EXPECTED RETURN AND VARIANCE


Modern portfolio theory (MPT) makes frequent use of the idea that investment opportunities
can be evaluated using expected return as a measure of reward and variance of return
as a measure of risk. Fundamental skills are the calculation and interpretation of portfolio
expected return and variance of return. In this section, we will develop an understanding of
portfolio expected return and variance of return. [10] Portfolio return is determined by the
returns on the individual holdings. As a result, the calculation of portfolio variance, as a
function of the individual asset returns, is more complex than the variance calculations
illustrated in the previous section.
We work with an example of a portfolio that is 50 percent invested in an S&P 500
index fund, 25 percent invested in a U.S. long-term corporate bond fund, and 25 percent in-
vested in an EAFE index fund. Table 4-5 shows these weights.

TABLE 4-5 Portfolio Weights

Asset Class                       Weights
S&P 500                           0.50
U.S. long-term corporate bonds    0.25
MSCI EAFE                         0.25

The first question is: What is the expected return on the portfolio? In the previous
section, we defined the expected value of a random variable as the probability-weighted
average of the possible outcomes. Portfolio return, we know, is a weighted average of the
returns on the securities in the portfolio. Similarly, the expected return on a portfolio is a
weighted average of the expected returns on the securities in the portfolio, using exactly
the same weights. When we have estimated the expected returns on the individual securi-
ties, we immediately have portfolio expected return. This convenient fact follows from the
properties of expected value.

[10] Although we outline a number of basic concepts in this section, we do not present mean-variance analysis per
se. For extended treatments, consult standard investment textbooks such as Bodie, Kane, and Marcus (1999),
Elton and Gruber (1995), Reilly and Brown (2000), and Sharpe, Alexander, and Bailey (1998).

• Properties of Expected Value. Let w_i be any constant and R_i be a random variable.

1. The expected value of a constant times a random variable equals the constant
times the expected value of the random variable.

2. The expected value of a weighted sum of random variables equals the weighted
sum of the expected values, using the same weights.

$$E(w_1R_1 + w_2R_2 + \cdots + w_nR_n) = w_1E(R_1) + w_2E(R_2) + \cdots + w_nE(R_n) \tag{4-13}$$

Suppose we have a random variable with a given expected value. We then multiply each
outcome by 2, doubling the value of each outcome. The random variable's expected value
doubles as well. That is the meaning of Part 1. The second statement generalizes the
principle; it is the rule that directly leads to the expression for portfolio expected return.
A portfolio with n securities is defined by its portfolio weights, $w_1, w_2, \ldots, w_n$, which
sum to 1. So portfolio return, $R_p$, is $R_p = w_1 R_1 + w_2 R_2 + \cdots + w_n R_n$. We can state
the following principle:

• Calculation of Portfolio Expected Return. Given a portfolio with n securities, the
expected return on the portfolio is a weighted average of the expected returns on the
component securities.

$$E(R_p) = E(w_1 R_1 + w_2 R_2 + \cdots + w_n R_n) = w_1 E(R_1) + w_2 E(R_2) + \cdots + w_n E(R_n)$$

Suppose we have estimated expected returns on the assets in the portfolio, as given in
Table 4-6.

TABLE 4-6 Weights and Expected Returns

Asset Class                       Weight    Expected Return (%)

S&P 500                           0.50      13
U.S. long-term corporate bonds    0.25      6
MSCI EAFE                         0.25      15

We calculate the expected return on the portfolio as 11.75 percent:

$$E(R_p) = w_1 E(R_1) + w_2 E(R_2) + w_3 E(R_3) = (0.50 \times 13\%) + (0.25 \times 6\%) + (0.25 \times 15\%) = 11.75\%$$
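
The weighted-average calculation above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the original text: the function name `portfolio_expected_return` and the dictionary layout are assumptions made for the example, while the weights and expected returns are those of Table 4-6.

```python
# Weights and expected returns (in percent) taken from Table 4-6.
weights = {
    "S&P 500": 0.50,
    "U.S. long-term corporate bonds": 0.25,
    "MSCI EAFE": 0.25,
}
expected_returns = {
    "S&P 500": 13.0,
    "U.S. long-term corporate bonds": 6.0,
    "MSCI EAFE": 15.0,
}


def portfolio_expected_return(weights, expected_returns):
    """Weighted average of asset expected returns, using the portfolio weights."""
    return sum(w * expected_returns[asset] for asset, w in weights.items())


portfolio_er = portfolio_expected_return(weights, expected_returns)
print(f"E(Rp) = {portfolio_er:.2f}%")
```

Because expected value is linear, no information about how the assets move together is needed at this stage; that information only enters when we turn to variance.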

In the previous section, we studied variance as a measure of dispersion of outcomes
around the expected value. Here we are interested in portfolio variance of return as a
measure of investment risk. Letting $R_p$ stand for the return on the portfolio, portfolio
variance is $\sigma^2(R_p) = E\{[R_p - E(R_p)]^2\}$ according to Equation 4-8. How do we implement
this definition? In the chapter on statistical concepts and market returns, we learned how to
calculate a historical or sample variance based on a sample of returns. Now we are considering
variance in a forward-looking sense. We will use information about the individual assets in the
portfolio to obtain portfolio variance of return. To avoid clutter in notation, we write $ER_p$
for $E(R_p)$. We need the concept of covariance.

• Definition of Covariance. Given two random variables $R_i$ and $R_j$, the covariance
between $R_i$ and $R_j$ is

$$\mathrm{Cov}(R_i, R_j) = E[(R_i - ER_i)(R_j - ER_j)] \qquad \text{(4-14)}$$

Alternative notations are $\sigma(R_i, R_j)$ and $\sigma_{ij}$.

Equation 4-14 states that the covariance between two random variables is the probability-
weighted average of the cross-product of each random variable's deviation from its own
expected value. We will return to discuss covariance after we establish the need for the
concept. Working from the definition of variance, we find

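$$
\begin{aligned}
\sigma^2(R_p) &= E[(R_p - ER_p)^2] \\
&= E\{[w_1 R_1 + w_2 R_2 + w_3 R_3 - E(w_1 R_1 + w_2 R_2 + w_3 R_3)]^2\} \\
&= E\{[w_1 R_1 + w_2 R_2 + w_3 R_3 - w_1 ER_1 - w_2 ER_2 - w_3 ER_3]^2\} && \text{(using Equation 4-13)} \\
&= E\{[w_1(R_1 - ER_1) + w_2(R_2 - ER_2) + w_3(R_3 - ER_3)]^2\} && \text{(rearranging)} \\
&= E\{[w_1(R_1 - ER_1) + w_2(R_2 - ER_2) + w_3(R_3 - ER_3)] \\
&\qquad \times [w_1(R_1 - ER_1) + w_2(R_2 - ER_2) + w_3(R_3 - ER_3)]\} && \text{(what squaring means)} \\
&= E[w_1 w_1 (R_1 - ER_1)(R_1 - ER_1) + w_1 w_2 (R_1 - ER_1)(R_2 - ER_2) + w_1 w_3 (R_1 - ER_1)(R_3 - ER_3) \\
&\qquad + w_2 w_1 (R_2 - ER_2)(R_1 - ER_1) + w_2 w_2 (R_2 - ER_2)(R_2 - ER_2) + w_2 w_3 (R_2 - ER_2)(R_3 - ER_3) \\
&\qquad + w_3 w_1 (R_3 - ER_3)(R_1 - ER_1) + w_3 w_2 (R_3 - ER_3)(R_2 - ER_2) + w_3 w_3 (R_3 - ER_3)(R_3 - ER_3)] && \text{(doing the multiplication)} \\
&= w_1^2 E[(R_1 - ER_1)^2] + w_1 w_2 E[(R_1 - ER_1)(R_2 - ER_2)] + w_1 w_3 E[(R_1 - ER_1)(R_3 - ER_3)] \\
&\qquad + w_2 w_1 E[(R_2 - ER_2)(R_1 - ER_1)] + w_2^2 E[(R_2 - ER_2)^2] + w_2 w_3 E[(R_2 - ER_2)(R_3 - ER_3)] \\
&\qquad + w_3 w_1 E[(R_3 - ER_3)(R_1 - ER_1)] + w_3 w_2 E[(R_3 - ER_3)(R_2 - ER_2)] + w_3^2 E[(R_3 - ER_3)^2] && \text{(recalling that the } w_i \text{ terms are constants)} \\
&= \boldsymbol{w_1^2 \sigma^2(R_1)} + w_1 w_2 \mathrm{Cov}(R_1, R_2) + w_1 w_3 \mathrm{Cov}(R_1, R_3) \\
&\qquad + \mathit{w_1 w_2\,Cov(R_1, R_2)} + \boldsymbol{w_2^2 \sigma^2(R_2)} + w_2 w_3 \mathrm{Cov}(R_2, R_3) \\
&\qquad + \mathit{w_1 w_3\,Cov(R_1, R_3)} + \mathit{w_2 w_3\,Cov(R_2, R_3)} + \boldsymbol{w_3^2 \sigma^2(R_3)} && \text{(4-15)}
\end{aligned}
$$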

The last step follows from the definitions of variance and covariance.¹¹ For the italicized
covariance terms below the diagonal, we used the fact that the order of variables in
covariance does not matter: $\mathrm{Cov}(R_2, R_1) = \mathrm{Cov}(R_1, R_2)$, for example. As we
will show, the diagonal variance terms $\sigma^2(R_1)$, $\sigma^2(R_2)$, and $\sigma^2(R_3)$ can be
expressed as $\mathrm{Cov}(R_1, R_1)$, $\mathrm{Cov}(R_2, R_2)$, and $\mathrm{Cov}(R_3, R_3)$,
respectively. Using this fact, the most compact way to state Equation 4-15 is
$\sigma^2(R_p) = \sum_{i=1}^{3} \sum_{j=1}^{3} w_i w_j \mathrm{Cov}(R_i, R_j)$. The double summation
signs say: "Set i = 1 and let j run from 1 to 3; then set i = 2 and let j run from 1 to 3; next
set i = 3 and let j run from 1 to 3; finally, add the nine terms." This expression generalizes
for a portfolio of any size n to

$$\sigma^2(R_p) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j \mathrm{Cov}(R_i, R_j) \qquad \text{(4-16)}$$

¹¹ The calculations leading to Equation 4-15 demonstrate the first of the following useful facts
about variance. Let w be any constant, and let R be any random variable: (1) The variance of a
constant times a random variable equals the constant squared times the variance of the random
variable, or $\sigma^2(wR) = w^2 \sigma^2(R)$; (2) The variance of a constant plus a random variable
equals the variance of the random variable, or $\sigma^2(w + R) = \sigma^2(R)$.

We see from Equation 4-15 that individual variances of return (the bolded diagonal
terms) constitute part, but not all, of portfolio variance. The three variances are actually
outnumbered by the six covariance terms off the diagonal. For three assets, the ratio is 1 to
2, or 50 percent. If there are 20 assets, there are 20 variance terms and 20 × 20 −
20 = 380 off-diagonal covariance terms. The ratio of variance terms to off-diagonal
covariance terms is less than 6 to 100, or 6 percent. A first observation, then, is this: As the
number of holdings increases, covariance¹² becomes increasingly important, all else equal.
What exactly is the effect of covariance on portfolio variance? The covariance terms
capture how the co-movements of returns affect portfolio variance. For example, consider
two stocks: one tends to have high returns (relative to its expected return) when the other
has low returns (relative to its expected return). The returns on one stock tend to offset the
returns on the other stock, lowering the variability or variance of returns on the portfolio.
Like variance, the units of covariance are hard to interpret, and we will introduce a more
intuitive concept shortly. Meanwhile, from the definition of covariance we can establish
two essential observations about covariance.

• Facts About Covariance.

1. We can interpret the sign of covariance as follows:
Covariance of returns is negative if, when the return on one asset is above its
expected value, the return on the other asset is below its expected value (an average
inverse relationship between returns).
Covariance of returns is 0 if returns on the assets are unrelated.
Covariance of returns is positive if, when the return on one asset is above its
expected value, the return on the other asset is above its expected value (an average
positive relationship between returns).
2. The covariance of a random variable with itself (own covariance) is its own
variance. $\mathrm{Cov}(R, R) = E\{[R - E(R)][R - E(R)]\} = E\{[R - E(R)]^2\} = \sigma^2(R)$

A complete list of the covariances constitutes all the statistical data needed to compute
portfolio variance of return. Covariances are often presented in a square format called a
covariance matrix. Table 4-7 summarizes the inputs for portfolio expected return and
variance of return.

¹² Where the meaning of covariance as "off-diagonal covariance" is obvious, as it is here, we omit
the qualifying words. Covariance is usually used in this sense.



TABLE 4-7 Inputs to Portfolio Expected Return and Variance

Panel A: Inputs for Portfolio Expected Return

Stock      A         B         C
           E(R_A)    E(R_B)    E(R_C)

Panel B: Covariance Matrix: The Inputs for Portfolio Variance of Return

Stock    A                B                C
A        Cov(R_A, R_A)    Cov(R_A, R_B)    Cov(R_A, R_C)
B        Cov(R_B, R_A)    Cov(R_B, R_B)    Cov(R_B, R_C)
C        Cov(R_C, R_A)    Cov(R_C, R_B)    Cov(R_C, R_C)

With three assets, the covariance matrix has $3^2 = 3 \times 3 = 9$ entries, but it is
customary to treat the diagonal terms, the variances, separately from the off-diagonal terms.
This is natural, as security variance is a single-variable concept. So there are 9 − 3 = 6
covariances, excluding variances. But $\mathrm{Cov}(R_B, R_A) = \mathrm{Cov}(R_A, R_B)$,
$\mathrm{Cov}(R_C, R_A) = \mathrm{Cov}(R_A, R_C)$, and $\mathrm{Cov}(R_C, R_B) = \mathrm{Cov}(R_B, R_C)$.
The covariance matrix below the diagonal is the mirror image of the covariance matrix above the
diagonal. As a result, there are only 3 = 6/2 distinct covariance terms to estimate. In general,
for n securities there are n(n − 1)/2 distinct covariances to estimate, and n variances to estimate.
Suppose we have the covariance matrix shown in Table 4-8:

TABLE 4-8 Covariance Matrix

                                  S&P 500    U.S. Long-Term     MSCI EAFE
                                             Corporate Bonds

S&P 500                           400        45                 189
U.S. Long-Term Corporate Bonds    45         81                 38
MSCI EAFE                         189        38                 441

Let us take Equation 4-15 and group variance terms together. We have:

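$$
\begin{aligned}
\sigma^2(R_p) &= w_1^2 \sigma^2(R_1) + w_2^2 \sigma^2(R_2) + w_3^2 \sigma^2(R_3) \\
&\qquad + 2 w_1 w_2 \mathrm{Cov}(R_1, R_2) + 2 w_1 w_3 \mathrm{Cov}(R_1, R_3) + 2 w_2 w_3 \mathrm{Cov}(R_2, R_3) && \text{(4-17)} \\
&= (0.50)^2(400) + (0.25)^2(81) + (0.25)^2(441) + 2(0.50)(0.25)(45) \\
&\qquad + 2(0.50)(0.25)(189) + 2(0.25)(0.25)(38) \\
&= 100 + 5.0625 + 27.5625 + 11.25 + 47.25 + 4.75 = 195.875
\end{aligned}
$$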

The variance is 195.875. Standard deviation of return is $(195.875)^{1/2} = 14$ percent
(rounded). To summarize, the portfolio has an expected annual return of 11.75 percent and a
standard deviation of return of 14 percent.
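
The double summation in Equation 4-16 translates directly into a nested loop. The following Python sketch is an illustration under stated assumptions rather than part of the original text: the function names `portfolio_variance` and `variance_terms_only` are invented for the example, and the inputs are the weights of Table 4-5 and the covariance matrix of Table 4-8 (asset order: S&P 500, bonds, EAFE).

```python
import math

# Portfolio weights (Table 4-5) and covariance matrix (Table 4-8).
weights = [0.50, 0.25, 0.25]
cov = [
    [400, 45, 189],
    [45, 81, 38],
    [189, 38, 441],
]


def portfolio_variance(weights, cov):
    """Sum w_i * w_j * Cov(R_i, R_j) over all i and j, as in Equation 4-16."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


def variance_terms_only(weights, cov):
    """Contribution of the diagonal (own-variance) terms alone."""
    return sum(w * w * cov[i][i] for i, w in enumerate(weights))


variance = portfolio_variance(weights, cov)
std_dev = math.sqrt(variance)
diagonal_part = variance_terms_only(weights, cov)

print(f"Portfolio variance:           {variance}")
print(f"Portfolio standard deviation: {std_dev:.2f}")
print(f"Variance terms only:          {diagonal_part}")
```

The nested loop visits each off-diagonal pair twice, which is why the grouped form of the formula carries a factor of 2 on each covariance term.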

Let us look at the first three terms in the calculation above. Their sum, 132.625 =
100 + 5.0625 + 27.5625, is the contribution of the individual variances to portfolio
variance. If the returns on the three assets were independent, according to a fact given above,
covariances would be 0 and the standard deviation of portfolio return would be
$(132.625)^{1/2} = 11.52$ percent as compared to 14 percent before. The portfolio would have
less risk. Suppose the covariance terms were negative. Then a negative number would be
added to 132.625, so portfolio variance and risk would be even smaller. At the same time,
we have not changed expected return. For the same expected portfolio return, the portfolio
has less risk. This risk reduction is a diversification benefit, meaning a risk-reduction
benefit from holding a portfolio of assets. The diversification benefit increases with
decreasing covariance. This observation is a key insight of modern portfolio theory. It is
even more intuitively stated when we can use the concept of correlation. Then we can say
that as long as security returns are not perfectly positively correlated, diversification
benefits are possible. Furthermore, the smaller the correlation between security returns, the
greater the cost of not diversifying (in terms of risk-reduction benefits forgone), all else
equal.

• Definition of Correlation. The correlation between two random variables, $R_i$ and $R_j$,
is defined as $\rho(R_i, R_j) = \mathrm{Cov}(R_i, R_j) / [\sigma(R_i)\sigma(R_j)]$. Alternative
notations are $\mathrm{Corr}(R_i, R_j)$ and $\rho_{ij}$.

Frequently, covariance is substituted out using the relationship
$\mathrm{Cov}(R_i, R_j) = \rho(R_i, R_j)\sigma(R_i)\sigma(R_j)$. The division indicated in the
definition makes correlation a pure number (one without a unit of measurement) and places bounds
on its largest and smallest possible values. Using the above definition, we can state a
correlation matrix from data in the covariance matrix alone. Table 4-9 shows the correlation matrix.

TABLE 4-9 Correlation Matrix of Returns

                                  S&P 500    U.S. Long-Term     MSCI EAFE
                                             Corporate Bonds

S&P 500                           1.00       0.25               0.45
U.S. Long-Term Corporate Bonds    0.25       1.00               0.20
MSCI EAFE                         0.45       0.20               1.00

For example, the covariance between long-term bonds and EAFE is 38, from Table 4-8.
The standard deviation of long-term bond returns is $(81)^{1/2} = 9$ percent, and that of EAFE
returns is $(441)^{1/2} = 21$ percent, from the diagonal terms in Table 4-8. The correlation
ρ(Return on long-term bonds, Return on EAFE) is 38/(9 × 21) = 0.201, rounded to 0.20. The
correlation of the S&P 500 with itself equals 1: The calculation is its own covariance (which is
its variance) divided by its standard deviation squared (which also equals its variance).
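
As a check on Table 4-9, the conversion from a covariance matrix to a correlation matrix can be sketched in Python. This is an illustrative sketch, not part of the original text; the function name `correlation_matrix` is an assumption for the example, and the input is the covariance matrix of Table 4-8.

```python
import math

# Covariance matrix from Table 4-8 (order: S&P 500, bonds, EAFE).
cov = [
    [400, 45, 189],
    [45, 81, 38],
    [189, 38, 441],
]


def correlation_matrix(cov):
    """Divide each covariance by the product of the two standard deviations."""
    n = len(cov)
    # Standard deviations are the square roots of the diagonal (variance) terms.
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]


corr = correlation_matrix(cov)
for row in corr:
    print([round(value, 2) for value in row])
```

Each diagonal entry divides a variance by itself, so the diagonal of the result is always 1.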

• Properties of Correlation.
1. Correlation is a number between −1 and +1:
$$-1 \le \rho(X, Y) \le +1$$
2. A correlation of 0 (uncorrelated variables) indicates an absence of any linear
(straight-line) relationship between the variables.¹³ Increasingly positive correlation
indicates an increasingly strong positive linear relationship (up to 1, which
indicates a perfect linear relationship). Increasingly negative correlation
indicates an increasingly strong negative (inverse) linear relationship (down to −1,
which indicates a perfect inverse linear relationship).¹⁴

EXAMPLE 4-12. Portfolio Expected Return and Variance of Return.


You have a portfolio of two mutual funds, A and B, 75 percent invested in A, as
shown in Table 4-10.

TABLE 4-10 Mutual Fund Expected Returns, Return Variances, and Covariances

Fund    A                B
        E(R_A) = 20%     E(R_B) = 12%

Covariance Matrix

Fund    A      B
A       625    120
B       120    196

1. Calculate the expected return on the portfolio.


2. Calculate the correlation matrix for this problem. Carry two decimal places.
3. Compute portfolio standard deviation of return.

Solution to 1. $E(R_p) = w_A E(R_A) + (1 - w_A) E(R_B) = 0.75 \times 20\% + 0.25 \times 12\% = 18\%$.
Portfolio weights must sum to 1: $w_B = 1 - w_A$.
Solution to 2. $\sigma(R_A) = (625)^{1/2} = 25$ percent, $\sigma(R_B) = (196)^{1/2} = 14$
percent. There is one distinct covariance, and thus one distinct correlation:

$$\rho(R_A, R_B) = \mathrm{Cov}(R_A, R_B) / [\sigma(R_A)\sigma(R_B)] = 120/(25 \times 14) = 0.342857, \text{ or } 0.34.$$

Table 4-11 shows the correlation matrix.

¹³ If the correlation is 0, $R_1 = a + bR_2 + \text{error}$, with b = 0.
¹⁴ If the correlation is positive, $R_1 = a + bR_2 + \text{error}$, with b > 0. If the correlation is negative, b < 0.

TABLE 4-11 Correlation Matrix

A B

A 1.00 0.34
B 0.34 1.00

Diagonal terms are always equal to 1.

Solution to 3.

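$$
\begin{aligned}
\sigma^2(R_p) &= w_A^2 \sigma^2(R_A) + w_B^2 \sigma^2(R_B) + 2 w_A w_B \mathrm{Cov}(R_A, R_B) \\
&= (0.75)^2(625) + (0.25)^2(196) + 2(0.75)(0.25)(120) \\
&= 351.5625 + 12.25 + 45 = 408.8125 \\
\sigma(R_p) &= (408.8125)^{1/2} = 20.22 \text{ percent}
\end{aligned}
$$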

How do we estimate return covariance and correlation? Frequently, we make forecasts
on the basis of historical covariance or other methods based on historical return data,
such as a market model regression.¹⁵ We can also calculate covariance using the joint
probability function of the random variables, if that can be estimated. The joint probability
function of two random variables X and Y, denoted P(X, Y), gives the probability of
joint occurrences of values of X and Y. For example, P(3, 2) is the probability that X equals
3 and Y equals 2.
Suppose that the joint probability function of the returns on BankCorp stock ($R_B$)
and the returns on NewBank stock ($R_N$) has the simple structure given in Table 4-12.

TABLE 4-12 Joint Probability Function of BankCorp and NewBank Returns
(Entries are joint probabilities)

              R_N = 20%    R_N = 16%    R_N = 10%

R_B = 25%     0.20         0            0
R_B = 12%     0            0.50         0
R_B = 10%     0            0            0.30

The expected return on BankCorp stock is (0.20 X 25%) + (0.50 X 12%) + (0.30 X
10%) = 14%. The expected return on NewBank stock is (0.20 X 20%) + (0.50 X
16%) + (0.30 X 10%) = 15%. The joint probability function above might reflect an
analysis based on whether banking industry conditions are good, average, or poor. Table
4-13 presents the calculation of covariance.

¹⁵ See any of the textbooks mentioned in footnote 10.

TABLE 4-13 Covariance Calculations

Banking Industry    Deviations    Deviations    Product of     Probability     Probability-Weighted
Condition           BankCorp      NewBank       Deviations     of Condition    Product

Good                25 − 14       20 − 15       55             0.20            11
Average             12 − 14       16 − 15       −2             0.50            −1
Poor                10 − 14       10 − 15       20             0.30            6
                                                                               Cov(R_B, R_N) = 16

Note: Expected return for BankCorp is 14% and for NewBank, 15%.

The first and second columns of numbers show, respectively, the deviations of BankCorp
and NewBank returns from their mean or expected value. The next column shows the
product of the deviations. For example, for good industry conditions, (25 − 14) ×
(20 − 15) = 11 × 5 = 55. Then 55 is multiplied or weighted by 0.20, the probability that
banking industry conditions are good: 55 × 0.20 = 11. The calculations for average and
poor banking conditions follow the same pattern. Summing up these probability-weighted
products, we find that $\mathrm{Cov}(R_B, R_N) = 16$.
A formula for computing the covariance between random variables $R_i$ and $R_j$ is

$$\mathrm{Cov}(R_i, R_j) = \sum_i \sum_j P(R_i, R_j)(R_i - ER_i)(R_j - ER_j) \qquad \text{(4-18)}$$

The formula tells us to sum all possible cross-products of the two random variables
weighted by the appropriate joint probability. In the example we just worked, as you can
see from Table 4-13, only three joint probabilities are non-zero. Therefore, in computing
the covariance of returns in this case, we need to consider only three cross-products:

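$$
\begin{aligned}
\mathrm{Cov}(R_B, R_N) &= P(25, 20) \times [(25 - 14) \times (20 - 15)] + P(12, 16) \times [(12 - 14) \times (16 - 15)] \\
&\qquad + P(10, 10) \times [(10 - 14) \times (10 - 15)] \\
&= (0.20 \times 11 \times 5) + [0.50 \times (-2) \times 1] + [0.30 \times (-4) \times (-5)] \\
&= 11 - 1 + 6 = 16
\end{aligned}
$$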
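
The probability-weighted sum of cross-products can also be sketched in Python. This is an illustrative sketch, not part of the original text: the dictionary representation of the joint probability function and the function names `expected_values` and `covariance` are assumptions made for the example. Only the three non-zero joint probabilities of Table 4-12 are stored, since zero-probability pairs contribute nothing to the sum.

```python
# Joint probability function from Table 4-12, keyed by
# (BankCorp return %, NewBank return %). Zero-probability pairs are omitted.
joint = {
    (25, 20): 0.20,
    (12, 16): 0.50,
    (10, 10): 0.30,
}


def expected_values(joint):
    """Expected value of each variable under the joint probability function."""
    e_x = sum(p * x for (x, _), p in joint.items())
    e_y = sum(p * y for (_, y), p in joint.items())
    return e_x, e_y


def covariance(joint):
    """Probability-weighted sum of cross-products of deviations."""
    e_x, e_y = expected_values(joint)
    return sum(p * (x - e_x) * (y - e_y) for (x, y), p in joint.items())


e_bank, e_new = expected_values(joint)
cov_bn = covariance(joint)
print(f"E(R_B) = {e_bank:.2f}%, E(R_N) = {e_new:.2f}%")
print(f"Cov(R_B, R_N) = {cov_bn:.2f}")
```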

One theme of this chapter has been independence. Two random variables are independent
when every possible pair of events (one event corresponding to a value of X and
another event corresponding to a value of Y) are independent events. When two random
variables are independent, their joint probability function simplifies.

• Definition of Independence for Random Variables. Two random variables X and Y


are independent if and only if P(X, Y) = P(X)P(Y).

For example, given independence, P(3, 2) = P(3)P(2). We multiply the individual proba-
bilities. Independence is a stronger property than uncorrelatedness because correlation ad-
dresses only linear relationships. The following condition holds for uncorrelated random
variables, and therefore also holds for independent random variables.

• Multiplication Rule for the Expected Value of the Product of Uncorrelated Random
Variables. The expected value of the product of uncorrelated random variables
is the product of their expected values.

E(XY) = E(X)E(Y) if X and Y are uncorrelated.

Many financial variables, such as revenue (price times quantity), are the product of random
quantities. When applicable, the above rule simplifies calculating expected value of a
product of random variables.¹⁶
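
To see why the qualifier "uncorrelated" matters, the BankCorp and NewBank returns of Table 4-12 make a useful counterexample. The Python sketch below is illustrative only and not part of the original text; the function names `marginals` and `is_independent` and the tolerance value are assumptions made for the example. It checks the factorization P(X, Y) = P(X)P(Y) for every pair of values and compares E(XY) with E(X)E(Y).

```python
from itertools import product

# Joint probability function from Table 4-12; omitted pairs have probability 0.
joint = {
    (25, 20): 0.20,
    (12, 16): 0.50,
    (10, 10): 0.30,
}


def marginals(joint):
    """Marginal probability functions of X and Y from the joint function."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y


def is_independent(joint, tol=1e-12):
    """True if P(x, y) = P(x)P(y) for every pair of possible values."""
    p_x, p_y = marginals(joint)
    return all(abs(joint.get((x, y), 0.0) - p_x[x] * p_y[y]) <= tol
               for x, y in product(p_x, p_y))


p_x, p_y = marginals(joint)
e_x = sum(p * x for x, p in p_x.items())
e_y = sum(p * y for y, p in p_y.items())
e_xy = sum(p * x * y for (x, y), p in joint.items())

print("Independent:", is_independent(joint))
print(f"E(XY) = {e_xy:.2f}, E(X)E(Y) = {e_x * e_y:.2f}")
```

Here the two quantities differ, and the gap between E(XY) and E(X)E(Y) is exactly the covariance of the two returns, so the multiplication rule does not apply to this pair.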

4 TOPICS IN PROBABILITY
In the remainder of the chapter we discuss two topics that can be important in solving in-
vestment problems. We start with Bayes' formula: what probability theory has to say about
learning from experience. Then we move to a discussion of shortcuts and principles for
counting.

4.1 BAYES' FORMULA
When we make decisions involving investments, we often start with viewpoints based on
our experience and knowledge. These viewpoints may be changed or confirmed by new
knowledge and observations. Bayes' formula is a rational method for adjusting our
viewpoints as we confront new information.¹⁷ Bayes' formula and related concepts have been
applied in many business and investment decision-making contexts, including the
evaluation of mutual fund performance.¹⁸
Bayes' formula makes use of Equation 4-6, the total probability rule. To review, that
rule expressed the probability of an event as a weighted average of the probabilities of the
event, given a set of scenarios. Bayes' formula works in reverse, or more precisely, reverses
the "given that" information. Bayes' formula uses the occurrence of the event to infer the
probability of the scenario generating it.¹⁹ In many applications, including the one
illustrating its use in this section, an individual is updating his beliefs concerning the causes
that may have produced a new observation.
To illustrate Bayes' formula, we work through an investment example that you can
adapt to any actual problem. Suppose you are an investor in the stock of DriveMed, Inc.
Security analysts make forecasts of earnings per share of the firms they cover, and various
services report consensus EPS estimates. Positive earnings surprises relative to consensus
EPS estimates often result in positive stock returns, and negative surprises often have the
opposite effect. DriveMed will release last quarter's EPS and you are interested in which of
these three events happened: last quarter's earnings exceeded the consensus EPS estimate,
or last quarter's earnings exactly met the consensus EPS estimate, or last quarter's earn-
ings fell short of the consensus EPS estimate. This list of the alternatives is mutually ex-
clusive and exhaustive. You expect that when the actual earnings become public, you will
be benefited or hurt as an investor by the reaction of the stock price to the news.
On the basis of your own research, you jot down the following prior probabilities
(or priors, for short) concerning these three events:

¹⁶ Otherwise, the calculation depends on conditional expected value; the calculation can be expressed as
E(XY) = E[X E(Y | X)].
¹⁷ Named after the Reverend Thomas Bayes (1702-1761).
¹⁸ See Baks, Metrick, and Wachter (2001).
¹⁹ For that reason, Bayes' formula is sometimes called an inverse probability.

• P(EPS exceeded consensus) = 0.45


• P(EPS met consensus) = 0.30
• P(EPS fell short of consensus) = 0.25

These probabilities are prior in the sense that they reflect only what you know now, before
the arrival of any new information.
The next day, DriveMed announces that it is expanding factory capacity in Singapore
and Ireland to meet increased sales demand. You now assess this new information. The de-
cision to expand capacity relates not only to current demand, but probably also to the prior
quarter's sales demand. You know that sales demand is positively related to EPS. So now it
appears more likely that last quarter's EPS will exceed the consensus.
The question you have is this: In light of the new information, what is my updated
probability that the prior quarter's EPS exceeded the consensus estimate?
Bayes' formula provides a rational method for accomplishing this updating. We can
abbreviate the new information as DriveMed expands. The first step in applying Bayes'
formula is to calculate the probability of the new information (here: DriveMed expands),
given a list of events or scenarios that may have generated it. The list of events should
cover all possibilities, as it does here. Formulating these conditional probabilities is the key
step in the updating process. Suppose your view is

P(DriveMed expands | EPS exceeded consensus) = 0.75
P(DriveMed expands | EPS met consensus) = 0.20
P(DriveMed expands | EPS fell short of consensus) = 0.05

Conditional probabilities of an observation (here: DriveMed expands) are sometimes
referred to as likelihoods. Again, likelihoods are required for the updating.
Next, you combine these conditional probabilities or likelihoods with your prior
probabilities to get the unconditional probability for DriveMed expanding, P(DriveMed
expands), as follows:

P(DriveMed expands) =
P(DriveMed expands | EPS exceeded consensus) ×
P(EPS exceeded consensus) +
P(DriveMed expands | EPS met consensus) ×
P(EPS met consensus) +
P(DriveMed expands | EPS fell short of consensus) ×
P(EPS fell short of consensus)
= 0.75 × 0.45 + 0.20 × 0.30 + 0.05 × 0.25 = 0.41, or 41%

This is Equation 4-6, the total probability rule, in action. Now we can answer the question
on your mind. According to Bayes' formula,

P(EPS exceeded consensus | DriveMed expands)
= [P(DriveMed expands | EPS exceeded consensus)/
P(DriveMed expands)] × P(EPS exceeded consensus)
= (0.75/0.41) × 0.45 = 1.829268 × 0.45 = 0.823171

Prior to DriveMed's announcement, you thought the probability that DriveMed would beat
consensus expectations was 45 percent. On the basis of your interpretation of the
announcement, you update that probability to 82.3 percent. This updated probability is called
your posterior probability because it reflects or comes after the new information.
The Bayes' calculation takes the prior probability, which was 45 percent, and multiplies
it by a ratio, the first term on the right-hand side of the equal sign. In the denominator
of the ratio is the probability that DriveMed expands, as you view it without considering
(conditioning on) anything else. Therefore, this probability is unconditional. The
numerator is the probability that DriveMed expands, if last quarter's EPS actually
exceeded the consensus estimate. This last probability is larger than the unconditional
probability in the denominator, so the ratio (1.83 roughly) is greater than 1. As a result, your
updated or posterior probability is larger than your prior probability. Thus, the ratio reflects
the impact of the new information on your prior beliefs. The following is a general
statement of Bayes' formula:

• Bayes' Formula. Given a set of prior probabilities for an event of interest, if you re-
ceive new information, the rule for updating your probability of the event is

Updated probability of event given the new information
= [Probability of the new information given event/
Unconditional probability of the new information] × Prior probability of event
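
The updating rule can be sketched in Python using the DriveMed priors and likelihoods given earlier in this section. This is an illustrative sketch, not part of the original text: the function name `bayes_update` and the short event labels are assumptions made for the example. The function first applies the total probability rule to get the unconditional probability of the new information, then rescales each prior by its likelihood.

```python
# Prior probabilities of the three mutually exclusive, exhaustive events.
priors = {"exceeded": 0.45, "met": 0.30, "fell short": 0.25}

# Likelihoods: P(DriveMed expands | event).
likelihoods = {"exceeded": 0.75, "met": 0.20, "fell short": 0.05}


def bayes_update(priors, likelihoods):
    """Return P(new information) and the posterior probability of each event."""
    # Total probability rule: weight each likelihood by its prior.
    p_info = sum(likelihoods[event] * priors[event] for event in priors)
    # Bayes' formula: (likelihood / unconditional probability) x prior.
    posteriors = {event: likelihoods[event] / p_info * priors[event]
                  for event in priors}
    return p_info, posteriors


p_expands, posteriors = bayes_update(priors, likelihoods)
print(f"P(DriveMed expands) = {p_expands:.2f}")
for event, probability in posteriors.items():
    print(f"P(EPS {event} consensus | DriveMed expands) = {probability:.4f}")
```

Because the events are mutually exclusive and exhaustive, the posteriors returned by such a function should sum to 1, which serves as a check on the inputs.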

EXAMPLE 4-13. Inferring Whether DriveMed's EPS Met Consensus EPS.
You are still an investor in DriveMed stock. To review the givens, your prior
probabilities are P(EPS exceeded consensus) = 0.45, P(EPS met consensus) = 0.30, and
P(EPS fell short of consensus) = 0.25. You also have the following conditional
probabilities:

P(DriveMed expands | EPS exceeded consensus) = 0.75
P(DriveMed expands | EPS met consensus) = 0.20
P(DriveMed expands | EPS fell short of consensus) = 0.05

Recall that you updated your probability that last quarter's EPS exceeded the
consensus estimate from 45 percent to 82.3 percent after DriveMed announced that it
would expand. Now you want to update your other priors.

1. Update your prior probability that DriveMed's EPS met consensus.


2. Update your prior probability that DriveMed's EPS fell short of consensus.
3. Show that the three updated probabilities sum to 1. (Carry each probability to
four decimal places.)
4. Suppose, because of lack of prior beliefs about whether DriveMed met con-
sensus, you updated on the basis of prior probabilities that all three possibili-
ties were equally likely: P(EPS exceeded consensus) = P(EPS met
consensus) = P(EPS fell short of consensus) = 1/3. What is your estimate of
the probability P(EPS exceeded consensus | DriveMed expands)?

Solution to 1. The probability is P(EPS met consensus | DriveMed expands) =

[P(DriveMed expands | EPS met consensus)/
P(DriveMed expands)] × P(EPS met consensus)

The probability P(DriveMed expands) is found by taking each of the three conditional
probabilities in the statement of the problem, such as P(DriveMed expands |
EPS exceeded consensus); multiplying each one by the prior probability of the
conditioning event, such as P(EPS exceeded consensus); then adding the three
products. The calculation is unchanged from the problem in the text above:
P(DriveMed expands) = 0.75 × 0.45 + 0.20 × 0.30 + 0.05 × 0.25 = 0.41, or 41
percent. The other probabilities needed, P(DriveMed expands | EPS met consensus)
= 0.20 and P(EPS met consensus) = 0.30, are givens. So

P(EPS met consensus | DriveMed expands)
= [P(DriveMed expands | EPS met consensus)/
P(DriveMed expands)] × P(EPS met consensus)
= (0.20/0.41) × 0.30 = 0.487805 × 0.30 = 0.146341

After taking account of the announcement on expansion, your updated probability
that last quarter's EPS for DriveMed just met consensus is 14.6 percent compared to
your prior probability of 30 percent.
Solution to 2. P(DriveMed expands) was already calculated as 41 percent.
Recall that P(DriveMed expands | EPS fell short of consensus) = 0.05 and P(EPS
fell short of consensus) = 0.25 are givens.

P(EPS fell short of consensus | DriveMed expands)
= [P(DriveMed expands | EPS fell short of consensus)/
P(DriveMed expands)] × P(EPS fell short of consensus)
= (0.05/0.41) × 0.25 = 0.121951 × 0.25 = 0.030488

As a result of the announcement, you have revised your probability that DriveMed's
EPS fell short of consensus from 25 percent (your prior probability) to 3 percent.
Solution to 3. The sum of the three updated probabilities is

P(EPS exceeded consensus | DriveMed expands) + P(EPS met consensus |
DriveMed expands) + P(EPS fell short of consensus | DriveMed expands)
= 0.8232 + 0.1463 + 0.0305 = 1.0000

The three events (EPS exceeded consensus, EPS met consensus, EPS fell short of
consensus) are mutually exclusive and exhaustive: One of these events or statements
must be true, so the conditional probabilities must sum to 1. Whether we are talking
about conditional or unconditional probabilities, whenever we have a complete list
of the distinct possible events or outcomes, the probabilities must sum to 1. This is a
check on your work.
Solution to 4. With equal priors, P(DriveMed expands) = (1/3)(0.75 + 0.20 + 0.05)
= 1/3. According to Bayes' formula, P(EPS exceeded consensus |
DriveMed expands) = [0.75/(1/3)] × (1/3) = 0.75 or 75 percent. This probability
is identical to your estimate of P(DriveMed expands | EPS exceeded consensus)
because the three conditional probabilities here sum to 1; more generally, with equal
priors the updated probabilities are proportional to the conditional probabilities.
When a decision-maker is uninformed, his beliefs are completely determined by the
data or new information. The assumption of equal prior probabilities is called a
diffuse prior.

4.2 PRINCIPLES OF COUNTING
The first step in addressing a question often involves determining the different logical
possibilities. We may also want to know the number of ways each of these possibilities can
happen. In the back of our mind is often a question about probability. How likely is it that I
will observe this particular possibility? Records of success and failure are an example.
When we evaluate a market timer's record, one well-known evaluation method uses counting
methods presented in this section.²⁰ An important investment model, the Binomial Option
Pricing Model, incorporates the combination formula that you will learn shortly. The
methods of this section are also useful for calculating what were called a priori probabilities
in Section 2. When we can assume that the possible outcomes of a random variable are
equally likely, the probability of an event equals the number of possible outcomes
favorable for the event divided by the total number of outcomes.
In counting, enumeration (counting the outcomes one by one) is of course the most
basic resource. What we discuss in this section are shortcuts and principles. Without these
shortcuts and principles, counting the total number of outcomes can be very difficult and
prone to error. The first and basic principle of counting is the multiplication rule.

• Multiplication Rule of Counting. If one thing can be done in $n_1$ ways, and a second
thing, given the first, can be done in $n_2$ ways, and a third thing, given the first two
things, can be done in $n_3$ ways, and so on for k things, then the number of ways the k
things can be done is $n_1 \times n_2 \times n_3 \times \cdots \times n_k$.

Suppose we have three steps in an investment decision process. The first step can
be done in two ways, the second in four ways, and the third in three ways. Following the
multiplication rule, there are 2 X 4 X 3 = 24 ways in which we can carry out the three
steps.
Another illustration is the assignment of members of a group to an equal number of
positions. For example, suppose you want to assign three security analysts to cover three
different industries. In how many ways can the assignments be made? The first analyst
may be assigned in three different ways. Then two industries remain. The second analyst
can be assigned in two different ways. Then one industry remains. The third and last ana-
lyst can be assigned in only one way. The total number of different assignments equals
3 X 2 X I = 6. The compact notation for the multiplication we have just performed is 3 !
(read: 3 factorial). If we had n analysts, the number of ways we could assign them to n
tasks would be

n! = n × (n - 1) × (n - 2) × (n - 3) × ... × 1

or n factorial. (By convention, 0! = 1.) To review, in this application we repeatedly carry
out an operation (here, job assignment) until we use up all members of a group (here, three
analysts). With n members in the group, the multiplication formula reduces to n factorial.21
The next type of counting problem can be called labeling problems.22 We want to
give each object in a group a label, to place it in a category. The following example illus-
trates this type of problem.
A mutual fund guide ranked 18 bond mutual funds by year 2000 total returns. The
guide also assigned each fund one of five risk labels: high risk (4 funds), above average
risk (4 funds), average risk (3 funds), below average risk (4 funds), and low risk (3 funds);
as 4 + 4 + 3 + 4 + 3 = 18, all the funds are accounted for. How many different ways

20
Henriksson and Merton (1981).
21
The shortest explanation of n factorial is that it is the number of ways we can order n objects in a row. A
characteristic of the problems to which we apply this counting method is that we use up all the members of a
group (sampling without replacement).
22
This discussion follows Kemeny, Schleifer, Snell, and Thompson (1972) in terminology and approach.

can we take 18 mutual funds and label 4 of them high risk, 4 above-average risk, 3 average
risk, 4 below-average risk, and 3 low risk, so each fund is labeled?
The answer is close to 13 billion. We can label 18 funds high risk (the first slot), then
17 funds, then 16 funds, then 15 funds (now we have 4 funds in the high risk group); then
we can label 14 funds above average risk, then 13 funds, and so forth. There are 18! possible
sequences. However, order of assignment within a category does not matter. For example,
whether a fund occupies the first or third slot of the four funds labeled high risk, the
fund has the same label (high risk). Thus, there are 4! ways to assign a given group of 4
funds to the 4 high risk slots. Making the same argument for the other categories, in total
there are 4! × 4! × 3! × 4! × 3! equivalent sequences. To eliminate such redundancies
from the 18! total, we divide 18! by 4! × 4! × 3! × 4! × 3!. We have 18!/(4! × 4! ×
3! × 4! × 3!) = 18!/(24 × 24 × 6 × 24 × 6) = 12,864,852,000. This procedure
generalizes as follows.

• Multinomial Formula (The General Formula for Labeling Problems). The number
of ways that n objects can be labeled with k different labels, with n₁ of the first
type, n₂ of the second type, and so on, with n₁ + n₂ + ... + nₖ = n, is given by

\[ \frac{n!}{n_1! \times n_2! \times \cdots \times n_k!} \]
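
The labeling count for the 18 bond funds can be reproduced in a few lines of Python. This is a minimal sketch; the helper name `multinomial` is our own and not part of any library.

```python
from math import factorial

def multinomial(counts):
    """Ways to label sum(counts) objects, with counts[i] objects receiving label i."""
    ways = factorial(sum(counts))
    for c in counts:
        # Each partial quotient is itself an integer, so floor division is exact.
        ways //= factorial(c)
    return ways

# High, above average, average, below average, and low risk fund counts.
fund_label_counts = [4, 4, 3, 4, 3]
ways_to_label = multinomial(fund_label_counts)

print(f"{ways_to_label:,}")
```

With only two labels, the same helper reduces to the combination formula: `multinomial([3, 2])` counts the ways to choose 3 objects out of 5.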

The special case of the general rule for when there are just two different labels (k =
2) is especially important. The special case is called the combination formula. A combination
is a listing in which order of listing does not matter. We state the combination formula
in a traditional way, but no new concepts are involved. Using the notation in the formula
below, the number of objects with the first label is r = n₁, and the number with the second
label is n - r = n₂ (there are just two categories, so n₁ + n₂ = n). Here is the formula.

• Combination Formula (The Binomial Formula). The number of ways that we can
choose r objects from a total of n objects, where the order in which the r objects are
listed does not matter, is

\[ {}_{n}C_{r} = \binom{n}{r} = \frac{n!}{(n - r)! \times r!} \]

Here \({}_{n}C_{r}\) and \(\binom{n}{r}\) are shorthand notations for n!/[(n - r)!r!] (read: n choose r, or n
combination r).
If we label the r objects as belongs to the group and the remaining objects as does
not belong to the group, whatever the group of interest, the combination formula tells us
how many ways we can select a group of size r. We can illustrate this formula with the bi-
nomial option pricing model (BOPM). The BOPM describes the movement of the under-
lying asset as a series of moves, price up (U) or price down (D). For example, two se-
quences of five moves containing three up moves, such as UUUDD and UDUUD, result in
the same final stock price. At least for an option with a payoff dependent on final stock
price, the number but not the order of up moves in a sequence matters. How many se-
quences of five moves belong to the group with three up moves? The answer is 10, calcu-
lated using the combination formula ("5 choose 3"):

₅C₃ = n!/[(n - r)!r!] = 5!/[(5 - 3)!3!] = (5 × 4 × 3 × 2 × 1)/[(2 × 1)(3 × 2 × 1)] = 120/12 = 10 ways
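
The count of 10 can be confirmed by listing every five-move sequence and keeping those with exactly three up moves. The sketch below is illustrative; the variable names are our own.

```python
from itertools import product
from math import comb

n_moves = 5   # total moves in the sequence
n_up = 3      # required number of up moves

# Build all 2**5 sequences of U and D, then keep those with exactly three U's.
sequences = [
    "".join(seq)
    for seq in product("UD", repeat=n_moves)
    if seq.count("U") == n_up
]

print(len(sequences), comb(n_moves, n_up))
print(sequences)
```

Both UUUDD and UDUUD appear in the list, and `comb(5, 3)` equals `comb(5, 2)`, since choosing the three up moves also fixes the two down moves.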

A useful fact can be illustrated as follows: ₅C₃ = 5!/(2!3!) equals ₅C₂ = 5!/(3!2!), as
3 + 2 = 5; ₅C₄ = 5!/(1!4!) equals ₅C₁ = 5!/(4!1!), as 4 + 1 = 5. This symmetrical
relationship can save work when we need to calculate many possible combinations.
Suppose jurors want to select three companies out of a group of five to receive the
first-, second-, and third-place awards for the best annual report. In how many ways can the
jurors make the three awards? Order does matter if we want to distinguish among the three
awards (the rank within the group of 3); it is clear that the question considers order impor-
tant. On the other hand, if the question was "In how many ways can the jurors choose three
winners, without regard to place of finish?" we would use the combination formula.
To address the first question above, we need to count ordered listings such as first
place: New Company, second place: Fir Company, third place: Well Company. An
ordered listing is known as a permutation, and the formula that counts the number of
permutations is known as the permutation formula.23

• Permutation Formula. The number of ways that we can choose r objects from a
total of n objects, where the order in which the r objects are listed does matter, is

\[ {}_{n}P_{r} = \frac{n!}{(n - r)!} \]

So the jurors have ₅P₃ = n!/(n - r)! = 5!/(5 - 3)! = (5 × 4 × 3 × 2 × 1)/(2 × 1) =
120/2 = 60 ways in which they can make their awards. To see why this formula works,
note that (5 × 4 × 3 × 2 × 1)/(2 × 1) reduces to 5 × 4 × 3, after cancellation of terms.
This counts the number of ways to fill three slots choosing from a group of five companies,
according to the multiplication rule of counting. This number is naturally larger than it would
be if order did not matter (compare 60 to the value of 10 for "5 choose 3" that we calculated
above). For example, first place: Well Company, second place: Fir Company, third
place: New Company contains the same three companies as first place: New Company,
second place: Fir Company, third place: Well Company. If we were concerned with
award winners (without regard to place of finish), the two listings would count as one
combination. But when we are concerned with order of finish, the listings count as two
permutations.
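
The contrast between 60 permutations and 10 combinations can be checked by enumeration. In the sketch below, the names Company D and Company E are hypothetical placeholders for the two companies the example does not name.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

companies = ["New Company", "Fir Company", "Well Company",
             "Company D", "Company E"]   # last two names are hypothetical

# Ordered listings: (first place, second place, third place).
ordered_awards = list(permutations(companies, 3))

# Unordered groups of three winners.
unordered_winners = list(combinations(companies, 3))

print(len(ordered_awards), perm(5, 3))      # permutation formula
print(len(unordered_winners), comb(5, 3))   # combination formula
```

Each group of three winners can be ranked in 3! = 6 ways, which is why the permutation count is six times the combination count.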
Answering the following questions may help you apply the counting methods we
have presented in this section.

1. Does the thing that I want to count have a finite number of possible outcomes? If
the answer is yes, you may be able to use a tool in this section, and you can go to
the second question. If the answer is no, the number of outcomes is infinite, and the
tools in this section do not apply.
2. Do I want to assign every member of a group of size n to one of n slots (or tasks)?
If the answer is yes, use n factorial. If the answer is no, go to the third question.
3. Do I want to count the number of ways to apply one of three or more labels to each
member of a group? If the answer is yes, use the multinomial formula. If the an-
swer is no, go to the fourth question.
4. Do I want to count the number of ways that I can choose r objects from a total of n,
where the order in which I list the r objects does not matter (can I give the r objects
a label)? If the answer to these questions is yes, the combination formula applies. If
the answer is no, go to the fifth question.

23
A more formal definition states that a permutation is an ordered subset of n distinct objects.

5. Do I want to count the number of ways I can choose r objects from a total of n,
where the order in which I list the r objects is important? If the answer is yes, the
permutation formula applies. If the answer is no, go to question 6.
6. Can the multiplication rule of counting be used? If it cannot, you may have to count
the possibilities one by one, or use more advanced techniques than those presented
here. 24

5 SUMMARY
In this chapter, we have discussed the essential concepts and tools of probability. We have
applied probability, expected value, and variance to a range of investment problems.

• Probability is a number between 0 and 1 that describes the chance that a stated event
will occur.
• A random variable is a quantity whose outcome is uncertain.
• An event is any outcome or specified set of outcomes of a random variable.
• The probability of an event E is denoted P(E).
• Mutually exclusive events can only occur one at a time. Exhaustive events cover or
contain all possible outcomes.
• The two defining properties of a probability are, first, that 0 ≤ probability of any
event ≤ 1 and, second, that the sum of the probabilities of any list of mutually exclusive
and exhaustive events equals 1.
• A probability estimated from data as a relative frequency of occurrence is an empirical
probability. A probability obtained based on logical analysis is an a priori probability.
A probability drawing on personal or subjective judgment is a subjective
probability.
• A probability of an event E, P(E), can be stated as odds for E = P(E)/[1 - P(E)] or
odds against E = [1 - P(E)]/P(E).
• Probabilities that are not consistent create profit opportunities, according to the
Dutch Book Theorem.
• A probability of an event not conditioned on another event is an unconditional prob-
ability. The unconditional probability of an event A is denoted P(A). Unconditional
probabilities are also called marginal probabilities.
• A probability of an event given (conditioned on) another event is a conditional probability.
The probability of an event A given an event B is denoted P(A | B).
• The probability of both A and B occurring is the joint probability of A and B, denoted
P(AB).
• P(A | B) = P(AB)/P(B), P(B) ≠ 0.
• The multiplication rule for probabilities is P(AB) = P(A | B)P(B).
• The probability that A or B occurs, or both occur, is denoted by P(A or B).
• The addition rule for probabilities is P(A or B) = P(A) + P(B) - P(AB).
• When events are independent, the occurrence of one event does not affect the proba-
bility of occurrence of the other event. Otherwise, the events are dependent.

24
Feller (1957) contains a very full treatment of counting problems and solution methods.

• Two events A and B are independent if and only if P(A | B) = P(A) and
P(B | A) = P(B).
• The multiplication rule for independent events states that if A and B are independent
events, P(AB) = P(A)P(B).
• If S₁, S₂, ..., Sₙ are mutually exclusive and exhaustive scenarios or events, then
P(A) = P(A | S₁)P(S₁) + P(A | S₂)P(S₂) + ... + P(A | Sₙ)P(Sₙ).
• The expected value of a random variable is a probability-weighted average of the
possible outcomes of the random variable. For a random variable X, the expected
value of X is denoted E(X).
• The variance of a random variable is the expected value (the probability-weighted
average) of squared deviations from its expected value E(X): σ²(X) = E{[X -
E(X)]²}, where σ²(X) stands for the variance of X. An alternative notation for the
variance of X is Var(X).
• Variance is a measure of dispersion about the mean. Increasing variance indicates in-
creasing dispersion. Variance is measured in squared units of the original variable.
• Standard deviation is the positive square root of variance.
• Standard deviation measures dispersion (as does variance), but it is measured in the
same units as the variable.
• If w₁, w₂, ..., wₙ are constants and R₁, R₂, ..., Rₙ are random variables, then
E(w₁R₁ + w₂R₂ + ... + wₙRₙ) = w₁E(R₁) + w₂E(R₂) + ... + wₙE(Rₙ).
• The properties of variance include the following, where w and a are constants and R
is a random variable: σ²(wR) = w²σ²(R) and σ²(a + R) = σ²(R).
• Covariance is a measure of the co-movement (linear association) between random
variables.
• The covariance between two random variables Rᵢ and Rⱼ is the expected value of the
cross-product of the deviations of the two random variables from their respective
means: Cov(Rᵢ, Rⱼ) = E{[Rᵢ - E(Rᵢ)][Rⱼ - E(Rⱼ)]}.
• The covariance of a random variable with itself is its own variance:
Cov(R, R) = σ²(R).
• Correlation is a number between -1 and +1 that measures the co-movement (linear
association) between two random variables: ρ(Rᵢ, Rⱼ) = Cov(Rᵢ, Rⱼ)/[σ(Rᵢ) σ(Rⱼ)].
• When return correlation is less than + 1, diversification reduces risk.
• To calculate the variance of return on a portfolio of n assets, the inputs needed are the
n expected returns on the individual securities, n variances of return on the individ-
ual securities, and n(n - 1)/2 distinct covariances.
• Portfolio variance of return is

\[ \sigma^2(R_p) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j \,\mathrm{Cov}(R_i, R_j) \]

• The calculation of covariance in a forward-looking sense requires the specification
of a joint probability function, which gives the probability of joint occurrences of
values of the two random variables.
• When two random variables are independent, the joint probability function is the
product of the individual probability functions of the random variables.
• When two random variables are uncorrelated, the expected value of the product
equals the product of the expected values: E(XY) = E(X)E(Y).
• Bayes' formula is a method for updating probabilities based on new information.

• Bayes' formula is expressed as follows: Updated probability of event given the new
information = [(Probability of the new information given event)/(Unconditional
probability of the new information)] X Prior probability of event.
• The multiplication rule of counting says, for example, that if the first step in a
process can be done in 10 ways, the second step, given the first, can be done in 5
ways, and the third step, given the first two, can be done in 7 ways, then the steps can
be carried out in 10 × 5 × 7 = 350 ways.
• The number of ways to assign every member of a group of size n to n slots is
n! = n × (n - 1) × (n - 2) × (n - 3) × ... × 1. (By convention, 0! = 1.)
• The number of ways that n objects can be labeled with k different labels, with n₁ of
the first type, n₂ of the second type, and so on, with n₁ + n₂ + ... + nₖ = n, is given
by n!/(n₁! × n₂! × ... × nₖ!). This expression is the multinomial formula.
• A special case of the multinomial formula is the combination formula. The number
of ways that we can choose r objects from a total of n objects, where the order in
which the r objects are listed does not matter, is

\[ {}_{n}C_{r} = \binom{n}{r} = \frac{n!}{(n - r)! \times r!} \]

• The number of ways that we can choose r objects from a total of n objects, where the
order in which the r objects are listed does matter, is

\[ {}_{n}P_{r} = \frac{n!}{(n - r)!} \]

This expression is the permutation formula.



PROBLEMS

1. Define the following terms:
a. Probability
b. Conditional probability
c. Event
d. Independent events
e. Variance
2. State three mutually exclusive and exhaustive events describing the reaction of a firm's
stock price to a corporate earnings announcement on the day of the announcement.
3. Label each of the following as an empirical, a priori, or subjective probability.
a. The probability that U.S. stock returns exceed long-term corporate bond returns
over a 10-year period, based on Ibbotson Associates data.
b. An updated (posterior) probability of an event arrived at using Bayes' formula and
judgment on prior probabilities.
c. The probability of one particular outcome when there are exactly 12 equally likely
possible outcomes.
d. A historical probability of default for double-B-rated bonds, adjusted to reflect your
perceptions of changes in the quality of double-B-rated issuance.
4. You are comparing two firms, BestRest Corporation and Relaxin, Inc. The exports of
both firms stand to benefit substantially from the removal of import restrictions on
their products in a large export market. The price of BestRest Corporation shares re-
flects a probability of 0.90 that the restrictions will be removed within the year. The
price of Relaxin stock, however, reflects a 0.50 probability that the restrictions will be
removed within that time frame. By all other information related to valuation, the two
stocks appear comparably valued. How would you characterize the implied probabili-
ties reflected in share prices? Which stock is relatively overvalued, and which stock is
relatively undervalued, compared to the other?
5. Suppose you have two limit orders outstanding on two different stocks. The probabil-
ity that the first limit order executes before the close of trading is 0.45. The probability
that the second limit order executes before the close of trading is 0.20. The probability
that the two orders both execute before the close of trading is 0.10. What is the proba-
bility that at least one of the two limit orders executes before the close of
trading?
6. You are using the following three criteria to select a list of 500 companies that may be-
come acquisition targets:

Criterion                                            Fraction of the 500 Companies
                                                     Meeting the Criterion
Product lines compatible                             0.20
Company will increase combined sales growth rate     0.45
Balance sheet impact manageable                      0.78

If the criteria are independent, how many companies will pass the screen?
7. You apply both valuation criteria and financial strength criteria in choosing stocks.
The probability that a randomly selected stock (from your investment universe)
meets your valuation criteria is 0.25. Given that a stock meets your valuation criteria,
the probability that the stock meets your financial strength criteria is 0.40. What is the
probability that a stock meets both your valuation and financial strength criteria?

8. Suppose that 5 percent of the stocks meeting your stock selection criteria are in the
telecommunications (telecom) industry. Also, dividend-paying telecom stocks are 1
percent of the total number of stocks meeting your selection criteria. What is the probability
that a stock is dividend-paying, given that it is a telecom stock that has met your
stock selection criteria?
9. The following two facts were cited in a report from Fitch data service. 25
• In 2000, the volume of defaulted U.S. high-yield debt was $27.9 billion. The aver-
age market size of the high-yield bond market during 2000 was $550 billion.
• The average recovery rate for defaulted U.S. high-yield bonds in 2000 (defined as
average price one month after default) was $0.27 on the dollar.

Address the following three tasks:


a. On the basis of the first fact given above, calculate the default rate on U.S. high-
yield debt in 2000. Interpret this default rate as a probability.
b. State the probability computed in Part a as an odds against default.
c. The quantity 1 minus the recovery rate given in the second fact above is the expected
loss per $1 of principal value, given that default has occurred. Suppose you
are told that an institution held a diversified high-yield bond portfolio in 2000.
Using the information in both facts, what was the institution's expected loss in
2000?
10. You are given the following probability distribution for the annual sales of ElStop
Corporation:

Probability Distribution for ElStop Annual Sales
(in millions of dollars)

Probability    Sales
0.20           $275
0.40           $250
0.25           $200
0.10           $190
0.05           $180
Sum = 1.00

a. Calculate the expected value of ElStop's annual sales.
b. Calculate the variance of ElStop's annual sales.
c. Calculate the standard deviation of ElStop's annual sales.
25
"High Yield Defaults Soar in 2000," February 12, 2001.

11. Suppose the prospects for recovery of principal for a defaulted bond issue depend on
which of two economic scenarios prevails. Scenario 1 has probability 0.75 and will
result in recovery of $0.90 per $1 principal value with probability 0.45, or in recovery of
$0.80 per $1 principal value with probability 0.55. Scenario 2 has probability 0.25 and
will result in recovery of $0.50 per $1 principal value with probability 0.85, or in re-
covery of $0.40 per $1 principal value with probability 0.15.
a. Compute the probability of each of the four possible recovery amounts: $0.90,
$0.80, $0.50, and $0.40.
b. Compute the expected recovery, given the first scenario.
c. Compute the expected recovery, given the second scenario.
d. Compute the expected recovery.
e. Graph the information in a tree diagram.
12. Suppose we have the expected daily returns (in terms of U.S. dollars), standard devia-
tions, and correlations shown in the table below.

U.S., German, and Italian Bond Returns
(U.S. Dollar Daily Returns in Percent)

                      U.S. Bonds    German Bonds    Italian Bonds
Expected Return       0.029         0.021           0.073
Standard Deviation    0.409         0.606           0.635

Correlation Matrix

                      U.S. Bonds    German Bonds    Italian Bonds
U.S. Bonds            1             0.09            0.10
German Bonds                        1               0.70
Italian Bonds                                       1

Source: Kool (2000), Table I (excerpted and adapted)

a. Using the data given above, construct a covariance matrix for the daily returns on
U.S., German, and Italian bonds.
b. State the expected return and variance of return on a portfolio 70 percent
invested in U.S. bonds, 20 percent in German bonds, and 10 percent in Italian
bonds.
c. Calculate the standard deviation of return on a portfolio 70 percent invested
in U.S. bonds, 20 percent in German bonds, and 10 percent in Italian
bonds.
13. The variance of a portfolio of stocks depends on the variances of each individual stock
in the portfolio and also the covariances among the stocks in the portfolio. If you have
five stocks, how many unique covariances (excluding variances) must you use in order
to compute the variance of return on your portfolio? (Recall that the covariance of a
stock with itself is the stock's variance.)
14. Calculate the covariance of the returns on Bedolf Corporation (R_B) with the returns on
Zedock Corporation (R_Z), using the following data.

Bedolf and Zedock Returns

              R_Z = 15%    R_Z = 10%    R_Z = 5%
R_B = 30%     0.25         0            0
R_B = 15%     0            0.50         0
R_B = 10%     0            0            0.25

Note: Entries are joint probabilities.

15. You have developed a set of criteria for evaluating distressed credits. Firms that do not
receive a passing score are classed as likely to go bankrupt within 12 months. You
gathered the following information when validating the criteria:
• Forty percent of the companies to which the test is administered will go bankrupt
within 12 months: P(non-survivor) = 0.40.
• Fifty-five percent of the companies to which the test is administered pass it: P(pass test)
= 0.55.
• The probability that a firm will pass the test (and be classed as a 12-month survivor),
given that it will subsequently survive 12 months, is 0.85: P(pass test | survivor) =
0.85.

a. What is P(pass test | non-survivor)?
b. Using Bayes' formula, calculate the probability that a firm is a survivor, given that
it passes the test; that is, calculate P(survivor | pass test).
c. What is the probability that a firm is a non-survivor, given that it does not pass the
test?
d. Is the test effective?
16. On a day in March, 3,292 issues traded on the NYSE: 1,303 advanced, 1,764 declined,
and 225 were unchanged. In how many ways could this have happened? (Set up the
problem but do not solve it.)
17. Your firm intends to select 4 of 10 vice presidents for the investment committee. How
many different groups of 4 are possible?
18. As in Example 4-11, you are reviewing the pricing of a speculative grade, one-year
maturity, zero-coupon bond. Your goal is to estimate an appropriate default risk pre-
mium for this bond. The default risk premium is defined as the extra return above the
risk-free return that will compensate investors for default risk. If R is the promised return
(yield-to-maturity) on the debt instrument and R_F is the risk-free rate, the default
risk premium is R - R_F. You assess that the probability that the bond defaults is
0.06: P(the bond defaults) = 0.06. One-year T-bills are offering a return of 5.8 percent,
an estimate of R_F. In contrast to your approach in Example 4-11, you now do not
make the simplifying assumption that bondholders will recover nothing in the event of
a default. Rather, you now assume that recovery will be $0.35 on the dollar, given
default.
a. Denote the fraction of principal recovered in default as θ. Following the model of
Example 4-11, develop a general expression for the promised return R on this bond.
b. Given your expression for R and the estimate of R_F, state the minimum default risk
premium you should require for this instrument.

SOLUTIONS

1. a. Probability is defined by the following two properties: (1) the probability of any
event is a number between 0 and 1, and (2) the sum of the probabilities of any list
of mutually exclusive and exhaustive events equals 1.
b. Conditional probability is the probability of a stated event, given that another event
has occurred. For example, P(A | B) is the probability of A, given that B has
occurred.
c. An event is any specified outcome or set of outcomes of a random variable.
d. Two events are independent if the occurrence of one event does not affect the probability
of occurrence of the other event. In symbols, two events A and B are
independent if and only if P(A | B) = P(A) or, equivalently, P(B | A) = P(B).
e. The variance of a random variable is the expected value (the probability-weighted
average) of squared deviations from the random variable's expected value. In
symbols, σ²(X) = E{[X - E(X)]²}.
2. One logical set of three mutually exclusive and exhaustive events for the reaction of a
firm's stock price on the day of a corporate earnings announcement is as follows
(wording may vary):

• Stock price increases on the day of the announcement.
• Stock price does not change on the day of the announcement.
• Stock price decreases on the day of the announcement.

In fact, there are an unlimited number of ways to split up the possible outcomes into
three mutually exclusive and exhaustive events. For example, the following list also
answers this question satisfactorily:

• Stock price increases by more than 4 percent on the day of the announcement.
• Stock price increases by 0 percent to 4 percent on the day of the announcement.
• Stock price decreases on the day of the announcement.

3. The probability in Part a is an empirical probability. The probability in Part b is a
subjective probability. The probability in Part c is an a priori probability. The probability
in Part d is a subjective probability.
4. The implied probabilities of 0.90 and 0.50 are inconsistent in that they create a poten-
tial profit opportunity. The shares of BestRest are relatively overvalued compared to
Relaxin, as their price incorporates a much higher probability of the favorable event
(lifting of the trade restriction) than the shares of Relaxin, which are relatively under-
valued compared to BestRest.
5. The probability that at least one of the two orders executes is given by the addition
rule for probabilities. Letting A stand for the event that the first limit order executes
before the close of trading and letting B stand for the event that the second limit order
executes before the close of trading, P(A or B) = P(A) + P(B) - P(AB) = 0.45 +
0.20 - 0.10 = 0.55. The probability that at least one of the two orders executes be-
fore the close of trading is 0.55.
6. According to the multiplication rule for independent events, the probability of a com-
pany passing all three criteria is the product of the three probabilities. Labeling the
event that a company passes the first, second, and third criteria A, B, and C, respectively,
P(ABC) = P(A)P(B)P(C) = 0.20 × 0.45 × 0.78 = 0.0702. As a consequence,
0.0702 × 500 = 35.10 or 35 companies pass the screen.
7. Use Equation 4-2, the multiplication rule for probabilities, P(AB) = P(A | B)P(B),
defining A as the event that a stock meets the financial strength criteria and defining B as the
event that a stock meets the valuation criteria. Then P(AB) = 0.40 × 0.25 = 0.10.
The probability that a stock meets both the financial strength and valuation criteria, stated as a
percent, is 10 percent.
8. Use Equation 4-1 to find this conditional probability: P(stock is dividend-paying | telecom
stock that meets criteria) = P(stock is dividend-paying and telecom stock that
meets criteria)/P(telecom stock that meets criteria) = 0.01/0.05 = 0.20.
9. a. The default rate was ($27.9 billion)/($550 billion) = 0.050727 or 5.1 percent. This
can be interpreted as the probability that $1 invested in a market-value weighted
portfolio of U.S. high-yield bonds was subject to default in 2000.
b. The odds against an event E are [1 - P(E)]/P(E). In this case, the odds
against default are (1 - 0.051)/0.051 = 18.608, or "18.6 to 1."
c. First, note that E(loss | bond defaults) = $1 - $0.27 = $0.73. According to the
total probability rule for expected value, E(loss) = E(loss | bond defaults)P(bond
defaults) + E(loss | bond does not default)P(bond does not default)
= $0.73 × 0.051 + $0.0 × 0.949 = 0.03723, or $0.03723. Thus, the institution's
expected loss was approximately 4 cents per dollar of principal value
invested.
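
The three parts of this solution can be retraced numerically. The sketch below is illustrative and uses our own variable names; like the solution, it rounds the default probability to 0.051 before computing the odds and the expected loss.

```python
defaulted_volume = 27.9   # defaulted U.S. high-yield debt in 2000, $ billions
market_size = 550.0       # average size of the high-yield market, $ billions
recovery_rate = 0.27      # recovery per $1 of principal, given default

# Part a: default rate, interpreted as a probability.
default_rate = defaulted_volume / market_size
p_default = round(default_rate, 3)   # 0.051, the rounded value used in the solution

# Part b: odds against default, [1 - P(E)] / P(E).
odds_against = (1 - p_default) / p_default

# Part c: total probability rule for expected value.
loss_given_default = 1 - recovery_rate
expected_loss = loss_given_default * p_default + 0.0 * (1 - p_default)

print(f"default rate  = {default_rate:.6f}")
print(f"odds against  = {odds_against:.3f} to 1")
print(f"expected loss = ${expected_loss:.5f} per $1 of principal")
```

Using the unrounded default rate instead of 0.051 would change the odds slightly, so the rounding choice is worth stating whenever odds are quoted.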
10. a. Using Equation 4-6 for the expected value of a random variable (where dollar
amounts are in millions)

E(Sales) = (0.20 × $275) + (0.40 × $250) + (0.25 × $200)
+ (0.10 × $190) + (0.05 × $180) = $233 million

b. Using Equation 4-8 for variance,

σ²(Sales) = P($275)[$275 - E(Sales)]² + P($250)[$250 - E(Sales)]²
+ P($200)[$200 - E(Sales)]² + P($190)[$190 - E(Sales)]²
+ P($180)[$180 - E(Sales)]²
= [0.20 × ($275 - $233)²] + [0.40 × ($250 - $233)²] + [0.25 ×
($200 - $233)²] + [0.10 × ($190 - $233)²] + [0.05 × ($180
- $233)²]
= 352.80 + 115.60 + 272.25 + 184.90 + 140.45
= 1,066 (million dollars)²
c. The standard deviation of annual sales is [1,066 (million dollars)²]^(1/2) =
$32.649655 million, or $32.65 million.
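
The same three quantities follow directly from the probability distribution. This is a minimal sketch with variable names of our own choosing; sales are in millions of dollars.

```python
from math import sqrt

# (probability, sales in $ millions) pairs from the ElStop distribution.
distribution = [(0.20, 275), (0.40, 250), (0.25, 200), (0.10, 190), (0.05, 180)]

# Expected value: probability-weighted average of the outcomes.
expected_sales = sum(p * x for p, x in distribution)

# Variance: probability-weighted average of squared deviations from the mean.
variance_sales = sum(p * (x - expected_sales) ** 2 for p, x in distribution)

# Standard deviation: positive square root of the variance.
std_dev_sales = sqrt(variance_sales)

print(f"E(Sales)   = {expected_sales:.2f}")
print(f"Var(Sales) = {variance_sales:.2f}")
print(f"SD(Sales)  = {std_dev_sales:.6f}")
```

Note that the variance is in squared millions of dollars, while the standard deviation is back in millions of dollars.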
11. a. Outcomes associated with Scenario 1: The probability of recovering $0.90 is
0.3375 = 0.45 (the probability of $0.90 recovery per $1 principal value, given
Scenario 1) X 0.75 (the probability of Scenario 1). The probability of recovering
$0.80 is 0.4125 = 0.55 X 0.75.
Outcomes associated with Scenario 2: The probability of recovering $0.50 is
0.2125 = 0.85 (the probability of $0.50 recovery per $1 principal value, given Sce-
nario 2) X 0.25 (the probability of Scenario 2). The probability of recovering $0.40
is 0.0375 = 0.15 X 0.25.

b. E(recovery | Scenario 1) = (0.45 × $0.90) + (0.55 × $0.80) = $0.845
c. E(recovery | Scenario 2) = (0.85 × $0.50) + (0.15 × $0.40) = $0.485
d. E(recovery) = ($0.845 × 0.75) + ($0.485 × 0.25) = $0.755
e. Tree diagram:

Expected Recovery = $0.755
    Scenario 1, Probability = 0.75
        Recovery = $0.90, Prob = 0.3375
        Recovery = $0.80, Prob = 0.4125
    Scenario 2, Probability = 0.25
        Recovery = $0.50, Prob = 0.2125
        Recovery = $0.40, Prob = 0.0375
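
The tree can also be walked in code: multiply along each branch for the joint probabilities, then weight the conditional expectations by the scenario probabilities. The data layout below is our own illustrative choice.

```python
# scenario name -> (scenario probability, [(conditional probability, recovery per $1)])
scenarios = {
    "Scenario 1": (0.75, [(0.45, 0.90), (0.55, 0.80)]),
    "Scenario 2": (0.25, [(0.85, 0.50), (0.15, 0.40)]),
}

joint_prob = {}         # recovery amount -> unconditional probability
conditional_mean = {}   # scenario name -> E(recovery | scenario)

for name, (p_scenario, outcomes) in scenarios.items():
    conditional_mean[name] = sum(p * recovery for p, recovery in outcomes)
    for p, recovery in outcomes:
        # Multiplication rule: P(recovery and scenario) = P(recovery | scenario) P(scenario).
        joint_prob[recovery] = p * p_scenario

# Total probability rule for expected value.
expected_recovery = sum(
    scenarios[name][0] * conditional_mean[name] for name in scenarios
)

print(joint_prob)
print(conditional_mean)
print(f"E(recovery) = {expected_recovery:.3f}")
```

The four joint probabilities sum to 1, which is a useful check that the tree covers all outcomes.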

12. a. The diagonal entries in the covariance matrix are the variances, found by squaring
the standard deviations.

2
Var(U.S. bond returns)= 0.409 = 0.167281
2
Var(German bond returns) = 0.606 = 0.367236
2
Var(Italian bond returns) = 0.635 = 0.403225

The covariances are found using the relationship Cov(R;, R) = p(R;,R)a(R;)a(R).


There are three distinct covariances:

• Cov(U.S. bond returns, German bond returns) = p(U.S. bond returns, German
bond returns)a(U.S. bond returns)a(German bond returns) = 0.09 X 0.409 X
0.606 = 0.022307
• Cov(U.S. bond returns, Italian bond returns) = p(U.S. bond returns, Italian
bond returns)a(U.S. bond returns)a(Italian bond returns) = 0.10 X 0.409X
0.635 = 0.025972
• Cov(German bond returns, Italian bond returns) = ρ(German bond returns,
Italian bond returns)σ(German bond returns)σ(Italian bond returns) = 0.70 ×
0.606 × 0.635 = 0.269367

Covariance Matrix of Returns, 1996 to 1998

                 U.S. Bonds    German Bonds    Italian Bonds
U.S. Bonds       0.167281      0.022307        0.025972
German Bonds     0.022307      0.367236        0.269367
Italian Bonds    0.025972      0.269367        0.403225

b. Using Equation 4-16, we find

\[
\begin{aligned}
\sigma^2(R_p) &= w_1^2\sigma^2(R_1) + w_2^2\sigma^2(R_2) + w_3^2\sigma^2(R_3) + 2w_1w_2\,\mathrm{Cov}(R_1, R_2) \\
&\quad + 2w_1w_3\,\mathrm{Cov}(R_1, R_3) + 2w_2w_3\,\mathrm{Cov}(R_2, R_3) \\
&= (0.70)^2(0.167281) + (0.20)^2(0.367236) + (0.10)^2(0.403225) \\
&\quad + 2(0.70)(0.20)(0.022307) + 2(0.70)(0.10)(0.025972) \\
&\quad + 2(0.20)(0.10)(0.269367) \\
&= 0.081968 + 0.014689 + 0.004032 + 0.006246 + 0.003636 + 0.010775 \\
&= 0.121346
\end{aligned}
\]

c. The standard deviation of this portfolio is \(\sigma(R_p) = (0.121346)^{1/2} = 0.348348\), or
34.8 percent.
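
The steps of Problem 12 (build the covariance matrix from standard deviations and correlations, then apply the portfolio variance formula) can be sketched as follows. The sketch assumes the standard deviations (0.409, 0.606, 0.635), correlations (0.09, 0.10, 0.70), and weights (0.70, 0.20, 0.10) quoted in this solution; the variable names are illustrative.

```python
import math

# Inputs quoted in the solution, in the order U.S., German, Italian bonds.
sigma = [0.409, 0.606, 0.635]          # standard deviations of returns
corr = [
    [1.00, 0.09, 0.10],
    [0.09, 1.00, 0.70],
    [0.10, 0.70, 1.00],
]                                      # correlation matrix
weights = [0.70, 0.20, 0.10]           # portfolio weights

n = len(sigma)

# Cov(R_i, R_j) = rho(R_i, R_j) * sigma(R_i) * sigma(R_j);
# the diagonal entries reduce to the variances sigma_i^2.
cov = [[corr[i][j] * sigma[i] * sigma[j] for j in range(n)] for i in range(n)]

# Portfolio variance as the double sum of w_i * w_j * Cov(R_i, R_j),
# which is equivalent to the expanded three-asset formula.
port_var = sum(
    weights[i] * weights[j] * cov[i][j] for i in range(n) for j in range(n)
)
port_std = math.sqrt(port_var)

for row in cov:
    print("  ".join(f"{entry:.6f}" for entry in row))
print(f"Portfolio variance = {port_var:.6f}")
print(f"Portfolio std dev  = {port_std:.6f}")
```

The double sum counts each off-diagonal covariance twice, which is why the expanded formula carries a factor of 2 on every covariance term.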
13. A covariance matrix for five assets has 5 × 5 = 25 entries. Subtracting the five
diagonal variance terms, we have 25 - 5 = 20 off-diagonal entries. Because the
covariance matrix is symmetric, only 10 entries are unique (10 = 20/2). Hence, you must
use 10 unique covariances in your five-stock portfolio variance calculation.
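
The counting argument generalizes to any number of assets. A minimal sketch, with a hypothetical helper name:

```python
def covariance_counts(n_assets):
    """Return (total entries, off-diagonal entries, unique covariances)
    for the covariance matrix of n_assets assets."""
    total = n_assets * n_assets        # n x n matrix
    off_diagonal = total - n_assets    # remove the n variance terms
    unique = off_diagonal // 2         # symmetry halves the count
    return total, off_diagonal, unique

print(covariance_counts(5))
```

The last value equals n(n - 1)/2, so the number of distinct covariances grows roughly with the square of the number of assets.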
14. The covariance is 25, computed as follows. First, we calculate expected values:

\[
\begin{aligned}
E(R_B) &= (0.25 \times 30\%) + (0.50 \times 15\%) + (0.25 \times 10\%) = 17.5\% \\
E(R_Z) &= (0.25 \times 15\%) + (0.50 \times 10\%) + (0.25 \times 5\%) = 10\%
\end{aligned}
\]

Then we find the covariance as follows:

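\[
\begin{aligned}
\mathrm{Cov}(R_B, R_Z) &= P(30, 15) \times [(30 - 17.5) \times (15 - 10)] + P(15, 10) \times [(15 - 17.5) \times (10 - 10)] \\
&\quad + P(10, 5) \times [(10 - 17.5) \times (5 - 10)] \\
&= (0.25 \times 12.5 \times 5) + [0.50 \times (-2.5) \times 0] + [0.25 \times (-7.5) \times (-5)] \\
&= 15.625 + 0 + 9.375 = 25
\end{aligned}
\]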
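
The covariance calculation from a joint probability function can be sketched in a few lines. The example assumes the three joint outcomes and probabilities used in the solution (returns in percent); the tuple layout and variable names are illustrative.

```python
# Each entry: (joint probability, return on first stock in %, return on second stock in %),
# as used in the solution above.
joint = [
    (0.25, 30, 15),
    (0.50, 15, 10),
    (0.25, 10, 5),
]

# Expected values of the two returns.
e_first = sum(p * r1 for p, r1, _ in joint)
e_second = sum(p * r2 for p, _, r2 in joint)

# Covariance: probability-weighted sum of the products of deviations
# from the respective expected values.
covariance = sum(p * (r1 - e_first) * (r2 - e_second) for p, r1, r2 in joint)

print(f"E(first)  = {e_first}%")
print(f"E(second) = {e_second}%")
print(f"Covariance = {covariance}")
```

Because the returns are entered in percent, the covariance is in units of percent squared.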

15. a. We can set up the equation using the total probability rule:

P(pass test) = P(pass test | survivor)P(survivor)
+ P(pass test | non-survivor)P(non-survivor)
0.55 = 0.85 × 0.60 + P(pass test | non-survivor) × 0.40,

as P(survivor) = 1 - P(non-survivor) = 1 - 0.40 = 0.60.

Thus P(pass test | non-survivor) = (0.55 - 0.85 × 0.60)/0.40 = 0.10.


b. P(survivor | pass test) = [P(pass test | survivor)/P(pass test)] × P(survivor)
= (0.85/0.55) × 0.60 = 0.927273

The information that a firm passes the test causes you to update your probability
that the firm is a survivor from 0.60 to approximately 0.927.
c. According to Bayes' formula, P(non-survivor | not pass test) = [P(not pass test |
non-survivor)/P(not pass test)] × P(non-survivor) = [P(not pass test |
non-survivor)/0.45] × 0.40.

We can set up the following equation to obtain P(not pass test | non-survivor):

P(not pass test) = P(not pass test | non-survivor)P(non-survivor)
+ P(not pass test | survivor)P(survivor)
0.45 = P(not pass test | non-survivor) × 0.40 + 0.15 × 0.60

where P(not pass test | survivor) = 1 - P(pass test | survivor) = 1 - 0.85 = 0.15.

So P(not pass test | non-survivor) = (0.45 - 0.15 × 0.60)/0.40 = 0.90. Using
this result with the formula above, we find P(non-survivor | not pass test) =
[0.90/0.45] × 0.40 = 0.80. Seeing that a firm does not pass the test causes us to
update the probability that it is a non-survivor from 0.40 to 0.80.
d. Passing the test greatly increases our confidence that the firm is a survivor. Failing
the test doubles the probability that the firm is a non-survivor. Therefore, the test
appears to be useful.
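
The updating in Problem 15 can be reproduced with a short script. This is a sketch under the inputs stated in the solution (P(survivor) = 0.60, P(pass test) = 0.55, P(pass test | survivor) = 0.85); the variable names are illustrative.

```python
# Inputs stated in the solution.
p_survivor = 0.60
p_pass = 0.55
p_pass_given_survivor = 0.85

p_non_survivor = 1 - p_survivor
p_fail = 1 - p_pass

# Part a: solve the total probability rule for P(pass | non-survivor).
p_pass_given_non = (p_pass - p_pass_given_survivor * p_survivor) / p_non_survivor

# Part b: Bayes' formula, P(survivor | pass).
p_survivor_given_pass = p_pass_given_survivor * p_survivor / p_pass

# Part c: Bayes' formula, P(non-survivor | not pass),
# using P(not pass | non-survivor) = 1 - P(pass | non-survivor).
p_fail_given_non = 1 - p_pass_given_non
p_non_given_fail = p_fail_given_non * p_non_survivor / p_fail

print(f"P(pass | non-survivor)     = {p_pass_given_non:.2f}")
print(f"P(survivor | pass)         = {p_survivor_given_pass:.6f}")
print(f"P(non-survivor | not pass) = {p_non_given_fail:.2f}")
```

Part c here uses the complement of the part a result rather than re-solving the total probability rule for not passing; both routes give the same conditional probability.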
16. This is a labeling problem where we assign each NYSE issue a label: advanced,
declined, or unchanged. The expression to count the number of ways 3,292 issues can be
assigned to these three categories such that 1,303 advanced, 1,764 declined, and 225
were unchanged is 3,292!/(1,303! × 1,764! × 225!).
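
The multinomial (labeling) count is far too large to write out by hand, but Python's arbitrary-precision integers can evaluate it exactly. A minimal sketch; the function name `multinomial` is hypothetical.

```python
import math

def multinomial(*counts):
    """Number of ways to assign sum(counts) items to labeled categories
    with the given category sizes: n! / (n1! * n2! * ... * nk!)."""
    result = math.factorial(sum(counts))
    for k in counts:
        result //= math.factorial(k)   # each division is exact
    return result

# 3,292 issues: 1,303 advanced, 1,764 declined, 225 unchanged.
ways = multinomial(1303, 1764, 225)

# The exact integer is enormous, so report only its order of magnitude.
print(f"log10(number of ways) = {math.log10(ways):.1f}")
```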

17. We find the answer using the combination formula \(\binom{n}{r} = n!/[(n - r)!\,r!]\). Here,
n = 10 and r = 4, so the answer is 10!/[(10 - 4)!4!] = 3,628,800/(720 ×
24) = 210.
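
As a quick check, the factorial formula can be compared with the built-in combination function in Python's standard library (`math.comb`, available in Python 3.8 and later):

```python
import math

n, r = 10, 4

# Combination formula written out with factorials.
by_formula = math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

# The same count from the standard library.
by_comb = math.comb(n, r)

print(by_formula, by_comb)
```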
18. a. The two events that affect a bondholder's returns are "the bond defaults" and
"the bond does not default." First, compute the value of the bond for the two events per
$1 invested.

               The Bond Defaults    The Bond Does Not Default
Bond value     θ × $(1 + R)         $(1 + R)

Second, find the expected value of the bond (per $1 invested):

E(bond) = θ × $(1 + R) × P(the bond defaults) + $(1 + R)
× [1 - P(the bond defaults)]

On the other hand, the expected value of the T-bill is the certain value (1 + R_F).
Setting the expected value of the bond to the expected value of the T-bill permits us
to find the promised return on the bond such that bondholders expect to break even.

θ × $(1 + R) × P(the bond defaults) + $(1 + R)
× [1 - P(the bond defaults)] = $(1 + R_F)

Rearranging the left-hand side,

(1 + R) × {θ × P(the bond defaults) + [1 - P(the bond defaults)]}
= (1 + R_F)
R = (1 + R_F)/{θ × P(the bond defaults)
+ [1 - P(the bond defaults)]} - 1

b. For this problem, R_F = 0.058, P(the bond defaults) = 0.06, 1 - P(the bond
defaults) = 0.94, and θ = 0.35.

R = [1.058/(0.35 × 0.06 + 0.94)] - 1 = 0.100937, or 10.1 percent

With a recovery rate of 35 cents on the dollar, a minimum default risk premium
of about 430 basis points is required, calculated as 4.3% = 10.1% - 5.8%.
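
The break-even relationship can be wrapped in a small function to see how the required promised return responds to the inputs. This is a sketch assuming the single-period setup of the solution; the function and parameter names are hypothetical.

```python
def break_even_promised_return(risk_free, p_default, recovery_rate):
    """Promised return R at which the bond's expected value per $1 invested
    equals the certain T-bill value (1 + risk_free).

    Solves (1 + R) * [recovery_rate * p_default + (1 - p_default)] = 1 + risk_free.
    """
    expected_fraction_received = recovery_rate * p_default + (1 - p_default)
    return (1 + risk_free) / expected_fraction_received - 1

# Inputs from part b: 5.8% T-bill return, 6% default probability,
# recovery of 35 cents on the dollar.
promised = break_even_promised_return(0.058, 0.06, 0.35)
premium_bp = (promised - 0.058) * 10_000   # default risk premium in basis points

print(f"Promised return      = {promised:.6f}")
print(f"Default risk premium = {premium_bp:.0f} basis points")
```

Using the unrounded promised return gives a premium of roughly 429 basis points; the 430 in the text comes from rounding the return to 10.1 percent before subtracting.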
