Statistics
OpenStax
Rice University
6100 Main Street MS-375
Houston, Texas 77005
©2018 Rice University. Textbook content produced by OpenStax is licensed under a Creative Commons
Attribution 4.0 International License (CC BY 4.0). Under this license, any user of this textbook or the textbook
contents herein must provide proper attribution as follows:
- If you redistribute this textbook in a digital format (including but not limited to PDF and HTML), then you
must retain on every page the following attribution:
“Download for free at https://openstax.org/details/books/introductory-business-statistics.”
- If you redistribute this textbook in a print format, then you must include on every physical page the
following attribution:
“Download for free at https://openstax.org/details/books/introductory-business-statistics.”
- If you redistribute part of this textbook, then you must retain in every digital format page view (including
but not limited to PDF and HTML) and on every physical printed page the following attribution:
“Download for free at https://openstax.org/details/books/introductory-business-statistics.”
- If you use this textbook as a bibliographic reference, please include
https://openstax.org/details/books/introductory-business-statistics in your citation.
Trademarks
The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, OpenStax CNX logo,
OpenStax Tutor name, Openstax Tutor logo, Connexions name, Connexions logo, Rice University name, and
Rice University logo are not subject to the license and may not be reproduced without the prior and express
written consent of Rice University.
OPENSTAX
OpenStax provides free, peer-reviewed, openly licensed textbooks for introductory college and Advanced Placement®
courses and low-cost, personalized courseware that helps students learn. A nonprofit ed tech initiative based at Rice
University, we’re committed to helping students access the tools they need to complete their courses and meet their
educational goals.
RICE UNIVERSITY
OpenStax, OpenStax CNX, and OpenStax Tutor are initiatives of Rice University. As a leading research university with a
distinctive commitment to undergraduate education, Rice University aspires to path-breaking research, unsurpassed
teaching, and contributions to the betterment of our world. It seeks to fulfill this mission by cultivating a diverse community
of learning and discovery that produces leaders across the spectrum of human endeavor.
FOUNDATION SUPPORT
OpenStax is grateful for the tremendous support of our sponsors. Without their strong engagement, the goal
of free access to high-quality textbooks would remain just a dream.
Laura and John Arnold Foundation (LJAF) actively seeks opportunities to invest in organizations and
thought leaders that have a sincere interest in implementing fundamental changes that not only
yield immediate gains, but also repair broken systems for future generations. LJAF currently focuses
its strategic investments on education, criminal justice, research integrity, and public accountability.
The William and Flora Hewlett Foundation has been making grants since 1967 to help solve social
and environmental problems at home and around the world. The Foundation concentrates its
resources on activities in education, the environment, global development and population,
performing arts, and philanthropy, and makes grants to support disadvantaged communities in the
San Francisco Bay Area.
Calvin K. Kazanjian was the founder and president of Peter Paul (Almond Joy), Inc. He firmly believed
that the more people understood about basic economics the happier and more prosperous they
would be. Accordingly, he established the Calvin K. Kazanjian Economics Foundation Inc, in 1949 as a
philanthropic, nonpolitical educational organization to support efforts that enhanced economic
understanding.
Guided by the belief that every life has equal value, the Bill & Melinda Gates Foundation works to
help all people lead healthy, productive lives. In developing countries, it focuses on improving
people’s health with vaccines and other life-saving tools and giving them the chance to lift
themselves out of hunger and extreme poverty. In the United States, it seeks to significantly
improve education so that all young people have the opportunity to reach their full potential. Based
in Seattle, Washington, the foundation is led by CEO Jeff Raikes and Co-chair William H. Gates Sr.,
under the direction of Bill and Melinda Gates and Warren Buffett.
The Maxfield Foundation supports projects with potential for high impact in science, education,
sustainability, and other areas of social importance.
Our mission at The Michelson 20MM Foundation is to grow access and success by eliminating
unnecessary hurdles to affordability. We support the creation, sharing, and proliferation of more
effective, more affordable educational content by leveraging disruptive technologies, open
educational resources, and new models for collaboration between for-profit, nonprofit, and public
entities.
The Bill and Stephanie Sick Fund supports innovative projects in the areas of Education, Art, Science
and Engineering.
PREFACE
Welcome to Introductory Business Statistics, an OpenStax resource. This textbook was written to increase student access to
high-quality learning materials, maintaining the highest standards of academic rigor at little to no cost.
About OpenStax
OpenStax is a nonprofit based at Rice University, and it’s our mission to improve student access to education. Our first
openly licensed college textbook was published in 2012, and our library has since scaled to over 25 books for college
and AP® courses used by hundreds of thousands of students. OpenStax Tutor, our low-cost personalized learning tool, is
being used in college courses throughout the country. Through our partnerships with philanthropic foundations and our
alliance with other educational resource organizations, OpenStax is breaking down the most common barriers to learning
and empowering students and instructors to succeed.
About OpenStax resources
Customization
Introductory Business Statistics is licensed under a Creative Commons Attribution 4.0 International (CC BY) license, which
means that you can distribute, remix, and build upon the content, as long as you provide attribution to OpenStax and its
content contributors.
Because our books are openly licensed, you are free to use the entire book or pick and choose the sections that are most
relevant to the needs of your course. Feel free to remix the content by assigning your students certain chapters and sections
in your syllabus, in the order that you prefer. You can even provide a direct link in your syllabus to the sections in the web
view of your book.
Instructors also have the option of creating a customized version of their OpenStax book. The custom version can be made
available to students in low-cost print or digital form through their campus bookstore. Visit the Instructor Resources section
of your book page on OpenStax.org for more information.
Errata
All OpenStax textbooks undergo a rigorous review process. However, as in any professional-grade textbook, errors
sometimes occur. Since our books are web based, we can make updates periodically when deemed pedagogically necessary.
If you have a correction to suggest, submit it through the link on your book page on OpenStax.org. Subject matter experts
review all errata suggestions. OpenStax is committed to remaining transparent about all updates, so you will also find a list
of past errata changes on your book page on OpenStax.org.
Format
You can access this textbook for free in web view or PDF through OpenStax.org, and for a low cost in print.
About Introductory Business Statistics
Introductory Business Statistics is designed to meet the scope and sequence requirements of the one-semester statistics
course for business, economics, and related majors. Core statistical concepts and skills have been augmented with practical
business examples, scenarios, and exercises. The result is a meaningful understanding of the discipline which will serve
students in their business careers and real-world experiences.
Coverage and scope
Introductory Business Statistics began as a customized version of OpenStax Introductory Statistics by Barbara Illowsky and
Susan Dean. Statistics faculty at The University of Oklahoma have used the business statistics adaptation for several years,
and the author has continually refined it based on student success and faculty feedback.
The book is structured in a similar manner to most traditional statistics textbooks. The most significant topical changes
occur in the latter chapters on regression analysis. Discrete probability density functions have been reordered to provide a
logical progression from simple counting formulas to more complex continuous distributions. Many additional homework
assignments have been added, as well as new, more mathematical examples.
Introductory Business Statistics places a significant emphasis on the development and practical application of formulas so
that students have a deeper understanding of their interpretation and application of data. To achieve this unique approach,
the author included a wealth of additional material and purposely de-emphasized the use of the scientific calculator. Specific
changes regarding formula use include:
Figure 1.1 We encounter statistics in our daily lives more often than we probably realize and from many different
sources, like the news. (credit: David Sim)
Introduction
You are probably asking yourself the question, "When and where will I use statistics?" If you read any newspaper, watch
television, or use the Internet, you will see statistical information. There are statistics about crime, sports, education,
politics, and real estate. Typically, when you read a newspaper article or watch a television news program, you are given
sample information. With this information, you may make a decision about the correctness of a statement, claim, or "fact."
Statistical methods can help you make the "best educated guess."
Since you will undoubtedly be given statistical information at some point in your life, you need to know some techniques
for analyzing the information thoughtfully. Think about buying a house or managing a budget. Think about your chosen
profession. The fields of economics, business, psychology, education, biology, law, computer science, police science, and
early childhood development require at least one course in statistics.
Included in this chapter are the basic ideas and words of probability and statistics. You will soon understand that statistics
and probability work together. You will also learn how data are gathered and how "good" data can be distinguished from
"bad."
Probability
Probability is a mathematical tool used to study randomness. It deals with the chance (the likelihood) of an event occurring.
For example, if you toss a fair coin four times, the outcomes may not be two heads and two tails. However, if you toss
the same coin 4,000 times, the outcomes will be close to half heads and half tails. The expected theoretical probability of
heads in any one toss is 1/2 or 0.5. Even though the outcomes of a few repetitions are uncertain, there is a regular pattern
of outcomes when there are many repetitions. After reading about the English statistician Karl Pearson who tossed a coin
24,000 times with a result of 12,012 heads, one of the authors tossed a coin 2,000 times. The results were 996 heads. The
fraction 996/2,000 is equal to 0.498 which is very close to 0.5, the expected probability.
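The pattern described above can be explored with a short simulation. This is a minimal sketch, assuming Python's pseudorandom generator is an acceptable stand-in for a fair coin; the function name proportion_of_heads and the seed value are illustrative choices, not part of the text.

```python
import random

def proportion_of_heads(num_tosses, rng):
    """Toss a simulated fair coin num_tosses times and return the fraction of heads."""
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses

rng = random.Random(2018)  # fixed seed (an arbitrary choice) so the run is repeatable
for n in (4, 40, 400, 4000):
    # Small runs can stray far from 0.5; large runs tend to settle near it.
    print(n, proportion_of_heads(n, rng))

# The result reported in the text: 996 heads in 2,000 tosses.
print(996 / 2000)  # 0.498
```

The proportions for small numbers of tosses will typically vary widely, while the proportion for 4,000 tosses should land close to 0.5.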
The theory of probability began with the study of games of chance such as poker. Predictions take the form of probabilities.
To predict the likelihood of an earthquake, of rain, or whether you will get an A in this course, we use probabilities. Doctors
use probability to determine the chance of a vaccination causing the disease the vaccination is supposed to prevent. A
stockbroker uses probability to determine the rate of return on a client's investments. You might use probability to decide to
buy a lottery ticket or not. In your study of statistics, you will use the power of mathematics through probability calculations
to analyze and interpret your data.
Key Terms
In statistics, we generally want to study a population. You can think of a population as a collection of persons, things, or
objects under study. To study the population, we select a sample. The idea of sampling is to select a portion (or subset)
of the larger population and study that portion (the sample) to gain information about the population. Data are the result of
sampling from a population.
Because it takes a lot of time and money to examine an entire population, sampling is a very practical technique. If you
wished to compute the overall grade point average at your school, it would make sense to select a sample of students who
attend the school. The data collected from the sample would be the students' grade point averages. In presidential elections,
opinion poll samples of 1,000–2,000 people are taken. The opinion poll is supposed to represent the views of the people
in the entire country. Manufacturers of canned carbonated drinks take samples to determine if a 16-ounce can contains 16
ounces of carbonated drink.
From the sample data, we can calculate a statistic. A statistic is a number that represents a property of the sample. For
example, if we consider one math class to be a sample of the population of all math classes, then the average number of
points earned by students in that one math class at the end of the term is an example of a statistic. The statistic is an estimate
of a population parameter, in this case the mean. A parameter is a numerical characteristic of the whole population that
can be estimated by a statistic. Since we considered all math classes to be the population, then the average number of points
earned per student over all the math classes is an example of a parameter.
One of the main concerns in the field of statistics is how accurately a statistic estimates a parameter. The accuracy really
depends on how well the sample represents the population. The sample must contain the characteristics of the population
in order to be a representative sample. We are interested in both the sample statistic and the population parameter in
inferential statistics. In a later chapter, we will use the sample statistic to test the validity of the established population
parameter.
A variable, or random variable, usually notated by capital letters such as X and Y, is a characteristic or measurement that
can be determined for each member of a population. Variables may be numerical or categorical. Numerical variables
take on values with equal units such as weight in pounds and time in hours. Categorical variables place the person or
thing into a category. If we let X equal the number of points earned by one math student at the end of a term, then X is a
numerical variable. If we let Y be a person's party affiliation, then some examples of Y include Republican, Democrat, and
Independent. Y is a categorical variable. We could do some math with values of X (calculate the average number of points
earned, for example), but it makes no sense to do math with values of Y (calculating an average party affiliation makes no
sense).
Data are the actual values of the variable. They may be numbers or they may be words. Datum is a single value.
Two words that come up often in statistics are mean and proportion. If you were to take three exams in your math classes
and obtain scores of 86, 75, and 92, you would calculate your mean score by adding the three exam scores and dividing by
three (your mean score would be 84.3 to one decimal place). If, in your math class, there are 40 students and 22 are men
and 18 are women, then the proportion of men students is 22/40 and the proportion of women students is 18/40. Mean and
proportion are discussed in more detail in later chapters.
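The arithmetic in this paragraph can be checked with a few lines of code. This is only a sketch; the variable names are illustrative.

```python
from fractions import Fraction

# Mean: add the three exam scores and divide by three.
exam_scores = [86, 75, 92]
mean_score = sum(exam_scores) / len(exam_scores)
print(round(mean_score, 1))  # 84.3

# Proportion: the count in a category divided by the total count.
class_size = 40
men, women = 22, 18
proportion_men = Fraction(men, class_size)      # 22/40 reduces to 11/20
proportion_women = Fraction(women, class_size)  # 18/40 reduces to 9/20
print(float(proportion_men), float(proportion_women))  # 0.55 0.45
```

The two proportions add to 1 because every student in the class falls into exactly one of the two categories.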
NOTE
The words "mean" and "average" are often used interchangeably. The substitution of one word for the other is
common practice. The technical term is "arithmetic mean," and "average" is technically a center location. However, in
practice among non-statisticians, "average" is commonly accepted for "arithmetic mean."
Example 1.1
Determine what the key terms refer to in the following study. We want to know the average (mean) amount
of money first year college students spend at ABC College on school supplies that do not include books. We
randomly surveyed 100 first year students at the college. Three of those students spent $150, $200, and $225,
respectively.
Solution 1.1
The population is all first year students attending ABC College this term.
The sample could be all students enrolled in one section of a beginning statistics course at ABC College (although
this sample may not represent the entire population).
The parameter is the average (mean) amount of money spent (excluding books) by first year college students at
ABC College this term: the population mean.
The statistic is the average (mean) amount of money spent (excluding books) by first year college students in the
sample.
The variable could be the amount of money spent (excluding books) by one first year student. Let X = the amount
of money spent (excluding books) by one first year student attending ABC College.
The data are the dollar amounts spent by the first year students. Examples of the data are $150, $200, and $225.
1.1 Determine what the key terms refer to in the following study. We want to know the average (mean) amount of
money spent on school uniforms each year by families with children at Knoll Academy. We randomly survey 100
families with children in the school. Three of the families spent $65, $75, and $95, respectively.
Example 1.2
Solution 1.2
1. f; 2. g; 3. e; 4. d; 5. b; 6. c
8 Chapter 1 | Sampling and Data
Example 1.3
Table 1.1
Cars with dummies in the front seats were crashed into a wall at a speed of 35 miles per hour. We want to know
the proportion of dummies in the driver’s seat that would have had head injuries, if they had been actual drivers.
We start with a simple random sample of 75 cars.
Solution 1.3
The population is all cars containing dummies in the front seat.
The sample is the 75 cars, selected by a simple random sample.
The parameter is the proportion of driver dummies (if they had been real people) who would have suffered head
injuries in the population.
The statistic is proportion of driver dummies (if they had been real people) who would have suffered head injuries
in the sample.
The variable X = the number of driver dummies (if they had been real people) who would have suffered head
injuries.
The data are either: yes, had head injury, or no, did not.
Example 1.4
Solution 1.4
The population is all medical doctors listed in the professional directory.
The parameter is the proportion of medical doctors who have been involved in one or more malpractice suits in
the population.
The sample is the 500 doctors selected at random from the professional directory.
The statistic is the proportion of medical doctors who have been involved in one or more malpractice suits in the
sample.
The variable X = the number of medical doctors who have been involved in one or more malpractice suits.
The data are either: yes, was involved in one or more malpractice lawsuits, or no, was not.
The data are the number of books students carry in their backpacks. You sample five students. Two students carry
three books, one student carries four books, one student carries two books, and one student carries one book. The
numbers of books (three, four, two, and one) are the quantitative discrete data.
1.5 The data are the number of machines in a gym. You sample five gyms. One gym has 12 machines, one gym has
15 machines, one gym has ten machines, one gym has 22 machines, and the other gym has 20 machines. What type of
data is this?
The data are the weights of backpacks with books in them. You sample the same five students. The weights (in
pounds) of their backpacks are 6.2, 7, 6.8, 9.1, 4.3. Notice that backpacks carrying three books can have different
weights. Weights are quantitative continuous data.
1.6 The data are the areas of lawns in square feet. You sample five houses. The areas of the lawns are 144 sq. feet,
160 sq. feet, 190 sq. feet, 180 sq. feet, and 210 sq. feet. What type of data is this?
Example 1.7
You go to the supermarket and purchase three cans of soup (19 ounces tomato bisque, 14.1 ounces lentil, and 19
ounces Italian wedding), two packages of nuts (walnuts and peanuts), four different kinds of vegetable (broccoli,
cauliflower, spinach, and carrots), and two desserts (16 ounces pistachio ice cream and 32 ounces chocolate chip
cookies).
Name data sets that are quantitative discrete, quantitative continuous, and qualitative (categorical).
Solution 1.7
One Possible Solution:
• The three cans of soup, two packages of nuts, four kinds of vegetables and two desserts are quantitative
discrete data because you count them.
• The weights of the soups (19 ounces, 14.1 ounces, 19 ounces) are quantitative continuous data because you
measure weights as precisely as possible.
• Types of soups, nuts, vegetables and desserts are qualitative (categorical) data because they name categories rather than amounts.
Try to identify additional data sets in this example.
Example 1.8
The data are the colors of backpacks. Again, you sample the same five students. One student has a red backpack,
two students have black backpacks, one student has a green backpack, and one student has a gray backpack. The
colors red, black, black, green, and gray are qualitative (categorical) data.
1.8 The data are the colors of houses. You sample five houses. The colors of the houses are white, yellow, white, red,
and white. What type of data is this?
NOTE
You may collect data as numbers and report it categorically. For example, the quiz scores for each student are recorded
throughout the term. At the end of the term, the quiz scores are reported as A, B, C, D, or F.
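As a rough illustration of reporting numeric data categorically, the sketch below converts quiz scores to letter grades. The cutoffs (90, 80, 70, 60) and the sample scores are assumptions made for the example; the text does not specify a grading scale.

```python
def letter_grade(score):
    """Map a numeric quiz score (0-100) to a letter; the cutoffs are assumed."""
    cutoffs = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
    for minimum, letter in cutoffs:
        if score >= minimum:
            return letter
    return "F"

quiz_scores = [95, 82, 77, 64, 51]  # made-up scores for illustration
print([letter_grade(s) for s in quiz_scores])  # ['A', 'B', 'C', 'D', 'F']
```

The recorded scores are quantitative, but the reported letters are categorical.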
Example 1.9
Work collaboratively to determine the correct data type (quantitative or qualitative). Indicate whether quantitative
data are continuous or discrete. Hint: Data that are discrete often start with the words "the number of."
a. the number of pairs of shoes you own
b. the type of car you drive
c. the distance from your home to the nearest grocery store
d. the number of classes you take per school year
e. the type of calculator you use
f. weights of sumo wrestlers
Solution 1.9
Items a, d, and g are quantitative discrete; items c, f, and h are quantitative continuous; items b and e are
qualitative, or categorical.
1.9 Determine the correct data type (quantitative or qualitative) for the number of cars in a parking lot. Indicate
whether quantitative data are continuous or discrete.
Example 1.10
A statistics professor collects information about the classification of her students as freshmen, sophomores,
juniors, or seniors. The data she collects are summarized in the pie chart in Figure 1.2. What type of data does this
graph show?
Figure 1.2
Solution 1.10
This pie chart shows the students in each year, which is qualitative (or categorical) data.
1.10 The registrar at State University keeps records of the number of credit hours students complete each semester.
The data he collects are summarized in the histogram. The class boundaries are 10 to less than 13, 13 to less than 16,
16 to less than 19, 19 to less than 22, and 22 to less than 25.
Figure 1.3
Tables are a good way of organizing and displaying data. But graphs can be even more helpful in understanding the data.
There are no strict rules concerning which graphs to use. Two graphs that are used to display qualitative (categorical) data
are pie charts and bar graphs.
In a pie chart, categories of data are represented by wedges in a circle and are proportional in size to the percent of
individuals in each category.
In a bar graph, the length of the bar for each category is proportional to the number or percent of individuals in each
category. Bars may be vertical or horizontal.
A Pareto chart consists of bars that are sorted into order by category size (largest to smallest).
Look at Figure 1.4 and Figure 1.5 and determine which graph (pie or bar) you think displays the comparisons better.
It is a good idea to look at a variety of graphs to see which is the most helpful in displaying the data. We might make
different choices of what we think is the “best” graph depending on the data and the context. Our choice also depends on
what we are using the data for.
Figure 1.4
Figure 1.5
Characteristic/Category                                              Percent
Full-Time Students                                                   40.9%
Students who intend to transfer to a 4-year educational institution  48.6%
Students under age 25                                                61.0%
TOTAL                                                                150.5%
Figure 1.6
Ethnicity          Frequency              Percent
Asian              8,794                  36.1%
Black              1,412                  5.8%
Filipino           1,298                  5.3%
Hispanic           4,180                  17.1%
Native American    146                    0.6%
Pacific Islander   236                    1.0%
White              5,978                  24.5%
TOTAL              22,044 out of 24,382   90.4% out of 100%
Figure 1.7
The following graph is the same as the previous graph but the “Other/Unknown” percent (9.6%) has been included. The
“Other/Unknown” category is large compared to some of the other categories (Native American, 0.6%, Pacific Islander
1.0%). This is important to know when we think about what the data are telling us.
This particular bar graph in Figure 1.8 can be difficult to understand visually. The graph in Figure 1.9 is a Pareto chart.
The Pareto chart has the bars sorted from largest to smallest and is easier to read and interpret.
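The ordering behind a Pareto chart can be sketched in a few lines, using the frequencies from the table above. The "Other/Unknown" count is not given directly in the text; here it is assumed to be the 24,382 total minus the listed categories.

```python
enrollment = {
    "Asian": 8794, "Black": 1412, "Filipino": 1298, "Hispanic": 4180,
    "Native American": 146, "Pacific Islander": 236, "White": 5978,
}
total_students = 24382
# Assumption: students not counted in any listed category are "Other/Unknown".
enrollment["Other/Unknown"] = total_students - sum(enrollment.values())

# A Pareto chart orders the bars from the largest category to the smallest.
pareto_order = sorted(enrollment.items(), key=lambda item: item[1], reverse=True)
for category, count in pareto_order:
    print(f"{category:<17}{count:>7,}{100 * count / total_students:>7.1f}%")
```

Sorting puts the largest categories first, so the relative sizes are easier to compare than in an unsorted bar graph.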
Figure 1.10
Sampling
Gathering information about an entire population often costs too much or is virtually impossible. Instead, we use a sample
of the population. A sample should have the same characteristics as the population it is representing. Most statisticians
use various methods of random sampling in an attempt to achieve this goal. This section will describe a few of the most
common methods. There are several different methods of random sampling. In each form of random sampling, each
member of a population initially has an equal chance of being selected for the sample. Each method has pros and cons. The
easiest method to describe is called a simple random sample. Any group of n individuals is equally likely to be chosen as
any other group of n individuals if the simple random sampling technique is used. In other words, each sample of the same
size has an equal chance of being selected.
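A simple random sample can be drawn with Python's standard library, as in the following sketch. The population of 100 numbered members, the sample size of 10, and the seed are assumptions chosen for illustration.

```python
import random

population = list(range(1, 101))  # hypothetical population: members numbered 1-100
rng = random.Random(7)            # fixed seed (arbitrary) for a repeatable draw

# random.sample draws without replacement, so no member appears twice,
# and every group of 10 members is equally likely to be chosen.
simple_random_sample = rng.sample(population, k=10)
print(sorted(simple_random_sample))
```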
Besides simple random sampling, there are other forms of sampling that involve a chance process for getting the sample.
Other well-known random sampling methods are the stratified sample, the cluster sample, and the systematic
sample.
To choose a stratified sample, divide the population into groups called strata and then take a proportionate number
from each stratum. For example, you could stratify (group) your college population by department and then choose a
proportionate simple random sample from each stratum (each department) to get a stratified random sample. To choose
a simple random sample from each department, number each member of the first department, number each member of
the second department, and do the same for the remaining departments. Then use simple random sampling to choose
proportionate numbers from the first department and do the same for each of the remaining departments. Those numbers
picked from the first department, picked from the second department, and so on represent the members who make up the
stratified sample.
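The stratified procedure can be sketched as follows. The department names, their sizes, and the helper stratified_sample are hypothetical; the sketch assumes each stratum's share is rounded to a whole number of members.

```python
import random

def stratified_sample(strata, sample_size, rng):
    """Draw a proportionate simple random sample from each stratum."""
    population_size = sum(len(members) for members in strata.values())
    chosen = {}
    for name, members in strata.items():
        # Proportionate share of the sample, rounded to a whole number of members.
        share = round(sample_size * len(members) / population_size)
        chosen[name] = rng.sample(members, share)
    return chosen

# Hypothetical college of 200 people in three departments.
departments = {
    "Accounting": [f"A{i}" for i in range(100)],
    "Marketing": [f"M{i}" for i in range(60)],
    "Finance": [f"F{i}" for i in range(40)],
}
sample = stratified_sample(departments, sample_size=20, rng=random.Random(7))
print({name: len(members) for name, members in sample.items()})
# {'Accounting': 10, 'Marketing': 6, 'Finance': 4}
```

With other department sizes, rounding can make the total differ slightly from the requested sample size, so a real survey design would need a rule for handling the remainder.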
To choose a cluster sample, divide the population into clusters (groups) and then randomly select some of the clusters.
All the members from these clusters are in the cluster sample. For example, if you randomly sample four departments
from your college population, the four departments make up the cluster sample. Divide your college faculty by department.
The departments are the clusters. Number each department, and then choose four different numbers using simple random
sampling. All members of the four departments with those numbers are the cluster sample.
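A cluster sample might be sketched like this. The six departments, their member lists, and the helper cluster_sample are hypothetical stand-ins for a real faculty roster.

```python
import random

def cluster_sample(clusters, num_clusters, rng):
    """Randomly pick whole clusters and return every member of each one."""
    chosen_names = rng.sample(sorted(clusters), num_clusters)
    return {name: clusters[name] for name in chosen_names}

# Hypothetical faculty lists for six departments (the clusters).
faculty_by_department = {
    f"Dept {d}": [f"Dept {d} member {i}" for i in range(1, 6)]
    for d in range(1, 7)
}
sample = cluster_sample(faculty_by_department, num_clusters=4, rng=random.Random(7))
print(sorted(sample))  # names of the four selected departments
```

Unlike the stratified sample, randomness enters only in the choice of clusters; every member of a chosen cluster is included.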
To choose a systematic sample, randomly select a starting point and take every nth piece of data from a listing of the
population. For example, suppose you have to do a phone survey. Your phone book contains 20,000 residence listings. You
must choose 400 names for the sample. Number the population 1–20,000 and then use a simple random sample to pick a
number that represents the first name in the sample. Then choose every fiftieth name thereafter until you have a total of 400
names (you might have to go back to the beginning of your phone list). Systematic sampling is frequently chosen because
it is a simple method.
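The phone-book example can be sketched as below. The helper systematic_sample is hypothetical, and the listings are represented simply by the numbers 1 to 20,000; wrapping around the list uses modular arithmetic.

```python
import random

def systematic_sample(listing, sample_size, rng):
    """Pick a random start, then take every k-th entry, wrapping around the list."""
    step = len(listing) // sample_size       # 20,000 // 400 = 50
    start = rng.randrange(len(listing))      # random starting position
    return [listing[(start + i * step) % len(listing)] for i in range(sample_size)]

phone_book = list(range(1, 20001))  # listings numbered 1-20,000
names = systematic_sample(phone_book, sample_size=400, rng=random.Random(7))
print(len(names), len(set(names)))  # 400 400
```

Because 400 steps of 50 cover the 20,000 listings exactly once, no listing is chosen twice even when the selection wraps back to the beginning.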
A type of sampling that is non-random is convenience sampling. Convenience sampling involves using results that are
readily available. For example, a computer software store conducts a marketing study by interviewing potential customers
who happen to be in the store browsing through the available software. The results of convenience sampling may be very
good in some cases and highly biased (favor certain outcomes) in others.
Sampling data should be done very carefully. Collecting data carelessly can have devastating results. Surveys mailed to
households and then returned may be very biased (they may favor a certain group). It is better for the person conducting the
survey to select the sample respondents.
True random sampling is done with replacement. That is, once a member is picked, that member goes back into the
population and thus may be chosen more than once. However, for practical reasons, in most populations, simple random
sampling is done without replacement. Surveys are typically done without replacement. That is, a member of the
population may be chosen only once. Most samples are taken from large populations and the sample tends to be small in
comparison to the population. Since this is the case, sampling without replacement is approximately the same as sampling
with replacement because the chance of picking the same individual more than once with replacement is very low.
In a college population of 10,000 people, suppose you want to pick a sample of 1,000 randomly for a survey. For any
particular sample of 1,000, if you are sampling with replacement,
• the chance of picking the first person is 1,000 out of 10,000 (0.1000);
• the chance of picking a different second person for this sample is 999 out of 10,000 (0.0999);
• the chance of picking the same person again is 1 out of 10,000 (very low).
If you are sampling without replacement,
• the chance of picking the first person for any particular sample is 1,000 out of 10,000 (0.1000);
• the chance of picking a different second person is 999 out of 9,999 (0.0999);
• you do not replace the first person before picking the next person.
Compare the fractions 999/10,000 and 999/9,999. For accuracy, carry the decimal answers to four decimal places. To four
decimal places, these numbers are equivalent (0.0999).
Sampling without replacement instead of sampling with replacement becomes a mathematical issue only when the
population is small. For example, if the population is 25 people, the sample is ten, and you are sampling with replacement
for any particular sample, then the chance of picking the first person is ten out of 25, and the chance of picking a different
second person is nine out of 25 (you replace the first person).
If you sample without replacement, then the chance of picking the first person is ten out of 25, and then the chance of
picking the second person (who is different) is nine out of 24 (you do not replace the first person).
Compare the fractions 9/25 and 9/24. To four decimal places, 9/25 = 0.3600 and 9/24 = 0.3750. To four decimal places,
these numbers are not equivalent.
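The two comparisons above can be checked with a few lines of Python. This is a minimal sketch; the function name `second_pick_chances` is only illustrative and does not come from the text.

```python
from fractions import Fraction

def second_pick_chances(population, sample):
    """Chance of picking a different second person for a particular sample,
    with replacement and without replacement."""
    with_replacement = Fraction(sample - 1, population)
    without_replacement = Fraction(sample - 1, population - 1)
    return with_replacement, without_replacement

# Large population (10,000; sample of 1,000) and small population (25; sample of 10)
for population, sample in [(10_000, 1_000), (25, 10)]:
    w, wo = second_pick_chances(population, sample)
    print(f"N={population}: with={float(w):.4f}  without={float(wo):.4f}")
```

For the large population the two values agree to four decimal places; for the small population they do not.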
When you analyze data, it is important to be aware of sampling errors and nonsampling errors. The actual process of
sampling causes sampling errors. For example, the sample may not be large enough. Factors not related to the sampling
18 Chapter 1 | Sampling and Data
process cause nonsampling errors. A defective counting device can cause a nonsampling error.
In reality, a sample will never be exactly representative of the population so there will always be some sampling error. As a
rule, the larger the sample, the smaller the sampling error.
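The rule that larger samples tend to have smaller sampling error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population (the whole numbers 1 to 10,000), the sample sizes, and the number of repetitions are all made up for illustration.

```python
import random
import statistics

rng = random.Random(42)              # fixed seed so the run is repeatable
population = list(range(1, 10001))   # hypothetical population values
true_mean = statistics.mean(population)

def average_error(sample_size, repetitions=200):
    """Average distance between a sample mean and the population mean."""
    errors = []
    for _ in range(repetitions):
        sample = rng.sample(population, sample_size)  # without replacement
        errors.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(errors)

errors = {n: average_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"sample size {n:>4}: average error about {err:.1f}")
```

The average error shrinks as the sample size grows, although it never reaches zero.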
In statistics, a sampling bias is created when a sample is collected from a population and some members of the population
are not as likely to be chosen as others (remember, each member of the population should have an equally likely chance of
being chosen). When a sampling bias happens, there can be incorrect conclusions drawn about the population that is being
studied.
Critical Evaluation
We need to evaluate the statistical studies we read about critically and analyze them before accepting the results of the
studies. Common problems to be aware of include
• Problems with samples: A sample must be representative of the population. A sample that is not representative of the
population is biased. Biased samples that are not representative of the population give results that are inaccurate and
not valid.
• Self-selected samples: Responses only by people who choose to respond, such as call-in surveys, are often unreliable.
• Sample size issues: Samples that are too small may be unreliable. Larger samples are better, if possible. In some
situations, small samples are unavoidable and can still be used to draw conclusions. Examples: crash testing cars
or medical testing for rare conditions.
• Undue influence: collecting data or asking questions in a way that influences the response
• Non-response or refusal of subject to participate: The collected responses may no longer be representative of the
population. Often, people with strong positive or negative opinions may answer surveys, which can affect the results.
• Causality: A relationship between two variables does not mean that one causes the other to occur. They may be related
(correlated) because of their relationship through a different variable.
• Self-funded or self-interest studies: A study performed by a person or organization in order to support their claim. Is
the study impartial? Read the study carefully to evaluate the work. Do not automatically assume that the study is good,
but do not automatically assume the study is bad either. Evaluate it on its merits and the work done.
• Misleading use of data: improperly displayed graphs, incomplete data, or lack of context
• Confounding: When the effects of multiple factors on a response cannot be separated. Confounding makes it difficult
or impossible to draw valid conclusions about the effect of each factor.
Example 1.11
A study is done to determine the average tuition that San Jose State undergraduate students pay per semester.
Each student in the following samples is asked how much tuition he or she paid for the Fall semester. What is the
type of sampling in each case?
a. A sample of 100 undergraduate San Jose State students is taken by organizing the students’ names by
classification (freshman, sophomore, junior, or senior), and then selecting 25 students from each.
b. A random number generator is used to select a student from the alphabetical listing of all undergraduate
students in the Fall semester. Starting with that student, every 50th student is chosen until 75 students are
included in the sample.
c. A completely random method is used to select 75 students. Each undergraduate student in the fall semester
has the same probability of being chosen at any stage of the sampling process.
d. The freshman, sophomore, junior, and senior years are numbered one, two, three, and four, respectively.
A random number generator is used to pick two of those years. All students in those two years are in the
sample.
e. An administrative assistant is asked to stand in front of the library one Wednesday and to ask the first 100
undergraduate students he encounters what they paid for tuition the Fall semester. Those 100 students are
the sample.
Solution 1.11
a. stratified; b. systematic; c. simple random; d. cluster; e. convenience
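The four random methods named in the solution can be sketched in Python. The roster below is hypothetical (400 made-up students, 100 per class year), and the systematic sample is kept small (eight students, every 50th) so that it fits this roster.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical roster: 400 students, 100 in each class year
years = ["freshman", "sophomore", "junior", "senior"]
roster = [(f"student{i:03d}", years[i % 4]) for i in range(400)]

# Simple random: every student is equally likely to be chosen
simple = rng.sample(roster, 75)

# Systematic: random starting point, then every k-th student on the list
k = 50
start = rng.randrange(len(roster))
systematic = [roster[(start + j * k) % len(roster)] for j in range(8)]

# Stratified: a simple random sample of 25 from each class year (stratum)
stratified = []
for year in years:
    stratum = [s for s in roster if s[1] == year]
    stratified.extend(rng.sample(stratum, 25))

# Cluster: randomly choose two class years and take everyone in them
chosen_years = rng.sample(years, 2)
cluster = [s for s in roster if s[1] in chosen_years]

print(len(simple), len(systematic), len(stratified), len(cluster))
```

Convenience sampling is not shown because it involves no random selection.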
Example 1.12
Determine the type of sampling used (simple random, stratified, systematic, cluster, or convenience).
a. A soccer coach selects six players from a group of boys aged eight to ten, seven players from a group of
boys aged 11 to 12, and three players from a group of boys aged 13 to 14 to form a recreational soccer team.
b. A pollster interviews all human resource personnel in five different high tech companies.
c. A high school educational researcher interviews 50 high school female teachers and 50 high school male
teachers.
d. A medical researcher interviews every third cancer patient from a list of cancer patients at a local hospital.
e. A high school counselor uses a computer to generate 50 random numbers and then picks students whose
names correspond to the numbers.
f. A student interviews classmates in his algebra class to determine how many pairs of jeans a student owns,
on the average.
Solution 1.12
a. stratified; b. cluster; c. stratified; d. systematic; e. simple random; f. convenience
If we were to examine two samples representing the same population, even if we used random sampling methods for the
samples, they would not be exactly the same. Just as there is variation in data, there is variation in samples. As you become
accustomed to sampling, the variability will begin to seem natural.
Example 1.13
Suppose ABC College has 10,000 part-time students (the population). We are interested in the average amount of
money a part-time student spends on books in the fall term. Asking all 10,000 students is an almost impossible
task.
Suppose we take two different samples.
First, we use convenience sampling and survey ten students from a first term organic chemistry class. Many of
these students are taking first term calculus in addition to the organic chemistry class. The amount of money they
spend on books is as follows:
$128; $87; $173; $116; $130; $204; $147; $189; $93; $153
The second sample is taken using a list of senior citizens who take P.E. classes and taking every fifth senior citizen
on the list, for a total of ten senior citizens. They spend:
$50; $40; $36; $15; $50; $100; $40; $53; $22; $22
It is unlikely that any student is in both samples.
a. Do you think that either of these samples is representative of (or is characteristic of) the entire 10,000 part-time
student population?
Solution 1.13
a. No. The first sample probably consists of science-oriented students. Besides the chemistry course, some of
them are also taking first-term calculus. Books for these classes tend to be expensive. Most of these students are,
more than likely, paying more than the average part-time student for their books. The second sample is a group of
senior citizens who are, more than likely, taking courses for health and interest. The amount of money they spend
on books is probably much less than the average part-time student. Both samples are biased. Also, in both cases,
Solution 1.13
b. No. For these samples, each member of the population did not have an equally likely chance of being chosen.
Now, suppose we take a third sample. We choose ten different part-time students from the disciplines of
chemistry, math, English, psychology, sociology, history, nursing, physical education, art, and early childhood
development. (We assume that these are the only disciplines in which part-time students at ABC College are
enrolled and that an equal number of part-time students are enrolled in each of the disciplines.) Each student is
chosen using simple random sampling. Using a calculator, random numbers are generated and a student from
a particular discipline is selected if he or she has a corresponding number. The students spend the following
amounts:
$180; $50; $150; $85; $260; $75; $180; $200; $200; $150
c. Is the sample biased?
Solution 1.13
c. The sample is unbiased, but a larger sample would be recommended to increase the likelihood that the sample
will be close to representative of the population. However, for a biased sampling technique, even a large sample
runs the risk of not being representative of the population.
Students often ask if it is "good enough" to take a sample, instead of surveying the entire population. If the survey
is done well, the answer is yes.
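To see how far apart the three samples in Example 1.13 are, their means can be computed directly. This is a minimal sketch; the variable names are illustrative. The means alone do not show whether a sample is biased, since bias depends on how the sample was chosen.

```python
from statistics import mean

# Dollar amounts spent on books, copied from Example 1.13
chemistry_class = [128, 87, 173, 116, 130, 204, 147, 189, 93, 153]
senior_citizens = [50, 40, 36, 15, 50, 100, 40, 53, 22, 22]
ten_disciplines = [180, 50, 150, 85, 260, 75, 180, 200, 200, 150]

for name, sample in [("chemistry class", chemistry_class),
                     ("senior citizens", senior_citizens),
                     ("ten disciplines", ten_disciplines)]:
    print(f"{name}: mean = ${mean(sample):.2f}")
```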
1.13 A local radio station has a fan base of 20,000 listeners. The station wants to know if its audience would prefer
more music or more talk shows. Asking all 20,000 listeners is an almost impossible task.
The station uses convenience sampling and surveys the first 200 people they meet at one of the station’s music concert
events. 24 people said they’d prefer more talk shows, and 176 people said they’d prefer more music.
Do you think that this sample is representative of (or is characteristic of) the entire 20,000 listener population?
Variation in Data
Variation is present in any set of data. For example, 16-ounce cans of beverage may contain more or less than 16 ounces of
liquid. In one study, eight 16 ounce cans were measured and produced the following amount (in ounces) of beverage:
15.8; 16.1; 15.2; 14.8; 15.8; 15.9; 16.0; 15.5
Measurements of the amount of beverage in a 16-ounce can may vary because different people make the measurements or
because the exact amount, 16 ounces of liquid, was not put into the cans. Manufacturers regularly run tests to determine if
the amount of beverage in a 16-ounce can falls within the desired range.
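A check like the manufacturer's test can be sketched as follows. The tolerance of 0.5 ounces is a made-up value for illustration only; the text does not state the desired range.

```python
# Measured amounts (in ounces) from the eight cans in the study
ounces = [15.8, 16.1, 15.2, 14.8, 15.8, 15.9, 16.0, 15.5]

label = 16.0      # amount printed on the can
tolerance = 0.5   # hypothetical allowed deviation, not taken from the text

below = sum(1 for x in ounces if x < label)
above = sum(1 for x in ounces if x > label)
in_range = [x for x in ounces if abs(x - label) <= tolerance]

print(f"smallest: {min(ounces)}, largest: {max(ounces)}")
print(f"{below} cans below 16 ounces, {above} above")
print(f"{len(in_range)} of {len(ounces)} cans within {tolerance} ounces of the label")
```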
Be aware that as you take data, your data may vary somewhat from the data someone else is taking for the same purpose.
This is completely natural. However, if two or more of you are taking the same data and get very different results, it is time
for you and the others to reevaluate your data-taking methods and your accuracy.
Variation in Samples
It was mentioned previously that two or more samples from the same population, taken randomly, and having close to
the same characteristics of the population will likely be different from each other. Suppose Doreen and Jung both decide
to study the average amount of time students at their college sleep each night. Doreen and Jung each take samples of 500
students. Doreen uses systematic sampling and Jung uses cluster sampling. Doreen's sample will be different from Jung's
sample. Even if Doreen and Jung used the same sampling method, in all likelihood their samples would be different. Neither
Levels of Measurement
The way a set of data is measured is called its level of measurement. Correct statistical procedures depend on a researcher
being familiar with levels of measurement. Not every statistical operation can be used with every set of data. Data can be
classified into four levels of measurement. They are (from lowest to highest level):
• Nominal scale level
• Ordinal scale level
• Interval scale level
• Ratio scale level
Data that is measured using a nominal scale is qualitative (categorical). Categories, colors, names, labels and favorite
foods along with yes or no responses are examples of nominal level data. Nominal scale data are not ordered. For example,
trying to rank people according to their favorite food does not make any sense. Putting pizza first and sushi second is not
meaningful.
Smartphone companies are another example of nominal scale data. The data are the names of the companies that make
smartphones, but there is no agreed upon order of these brands, even though people may have personal preferences. Nominal
scale data cannot be used in calculations.
Data that is measured using an ordinal scale is similar to nominal scale data but there is a big difference. The ordinal scale
data can be ordered. An example of ordinal scale data is a list of the top five national parks in the United States. The top
five national parks in the United States can be ranked from one to five but we cannot measure differences between the data.
Another example of using the ordinal scale is a cruise survey where the responses to questions about the cruise are
“excellent,” “good,” “satisfactory,” and “unsatisfactory.” These responses are ordered from the most desired response to the
least desired. But the differences between two pieces of data cannot be measured. Like the nominal scale data, ordinal scale
data cannot be used in calculations.
Data that is measured using the interval scale is similar to ordinal level data because it has a definite ordering, but
unlike ordinal data, the differences between interval scale data can be measured. Interval scale data, however, does not
have a true zero starting point.
Temperature scales like Celsius (C) and Fahrenheit (F) are measured by using the interval scale. In both temperature
measurements, 40° is equal to 100° minus 60°. Differences make sense. But 0 degrees is not a true starting point because,
in both scales, 0 is not the absolute lowest temperature. Temperatures like -10° F and -15° C exist and are colder than 0.
Interval level data can be used in calculations, but one type of comparison cannot be done. 80° C is not four times as hot as
20° C (nor is 80° F four times as hot as 20° F). There is no meaning to the ratio of 80 to 20 (or four to one).
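One way to see why the ratio has no meaning is to convert the same two temperatures to other scales. This is a short sketch; the Kelvin scale is not mentioned in the text and is added here only as an example of a temperature scale whose zero is the lowest possible temperature.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def celsius_to_kelvin(c):
    # Kelvin is an addition for illustration; 0 K is absolute zero
    return c + 273.15

hot_c, cool_c = 80, 20

ratio_c = hot_c / cool_c
ratio_f = celsius_to_fahrenheit(hot_c) / celsius_to_fahrenheit(cool_c)
ratio_k = celsius_to_kelvin(hot_c) / celsius_to_kelvin(cool_c)

print(f"Celsius ratio:    {ratio_c:.2f}")
print(f"Fahrenheit ratio: {ratio_f:.2f}")
print(f"Kelvin ratio:     {ratio_k:.2f}")
```

The same pair of temperatures gives a different ratio on each scale, so "four times as hot" depends on where the scale places its zero.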
Data that is measured using the ratio scale takes care of the ratio problem and gives you the most information. Ratio scale
data is like interval scale data, but it has a 0 point and ratios can be calculated. For example, four multiple choice statistics
final exam scores are 80, 68, 20 and 92 (out of a possible 100 points). The exams are machine-graded.
The data can be put in order from lowest to highest: 20, 68, 80, 92.
The differences between the data have meaning. The score 92 is more than the score 68 by 24 points. Ratios can be
calculated. The smallest possible score is 0. So 80 is four times 20. The score of 80 is four times better than the score of 20.
Frequency
Twenty students were asked how many hours they worked per day. Their responses, in hours, are as follows: 5; 6; 3; 3; 2;
4; 7; 5; 2; 3; 5; 6; 5; 4; 4; 3; 5; 2; 5; 3.
Table 1.5 lists the different data values in ascending order and their frequencies.
A frequency is the number of times a value of the data occurs. According to Table 1.5, there are three students who work
two hours, five students who work three hours, and so on. The sum of the values in the frequency column, 20, represents
the total number of students included in the sample.
A relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all
outcomes to the total number of outcomes. To find the relative frequencies, divide each frequency by the total number of
students in the sample–in this case, 20. Relative frequencies can be written as fractions, percents, or decimals.
The sum of the values in the relative frequency column of Table 1.6 is 20/20, or 1.
Cumulative relative frequency is the accumulation of the previous relative frequencies. To find the cumulative relative
frequencies, add all the previous relative frequencies to the relative frequency for the current row, as shown in Table 1.7.
DATA VALUE   FREQUENCY   RELATIVE FREQUENCY   CUMULATIVE RELATIVE FREQUENCY
2            3           3/20 or 0.15         0.15
3            5           5/20 or 0.25         0.15 + 0.25 = 0.40
4            3           3/20 or 0.15         0.40 + 0.15 = 0.55
5            6           6/20 or 0.30         0.55 + 0.30 = 0.85
6            2           2/20 or 0.10         0.85 + 0.10 = 0.95
7            1           1/20 or 0.05         0.95 + 0.05 = 1.00
Table 1.7 Frequency Table of Student Work Hours with Relative and Cumulative Relative Frequencies
The last entry of the cumulative relative frequency column is one, indicating that one hundred percent of the data has been
accumulated.
NOTE
Because of rounding, the relative frequency column may not always sum to one, and the last entry in the cumulative
relative frequency column may not be one. However, they each should be close to one.
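The frequency, relative frequency, and cumulative relative frequency columns for the student work hours can be reproduced with a short Python sketch. The function name `frequency_table` is illustrative; exact fractions are used so that rounding does not affect the running total.

```python
from collections import Counter
from fractions import Fraction

# Hours worked per day by the twenty students
hours = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

def frequency_table(data):
    """Return rows of (value, frequency, relative frequency, cumulative relative frequency)."""
    counts = Counter(data)
    total = len(data)
    rows = []
    cumulative = Fraction(0)
    for value in sorted(counts):
        relative = Fraction(counts[value], total)
        cumulative += relative
        rows.append((value, counts[value], relative, cumulative))
    return rows

for value, freq, rel, cum in frequency_table(hours):
    print(f"{value:>5} {freq:>5} {float(rel):>6.2f} {float(cum):>6.2f}")
```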
Table 1.8 represents the heights, in inches, of a sample of 100 male semiprofessional soccer players.
HEIGHTS (INCHES)   FREQUENCY   RELATIVE FREQUENCY   CUMULATIVE RELATIVE FREQUENCY
59.95–61.95        5           5/100 = 0.05         0.05
61.95–63.95        3           3/100 = 0.03         0.05 + 0.03 = 0.08
63.95–65.95        15          15/100 = 0.15        0.08 + 0.15 = 0.23
65.95–67.95        40          40/100 = 0.40        0.23 + 0.40 = 0.63
67.95–69.95        17          17/100 = 0.17        0.63 + 0.17 = 0.80
69.95–71.95        12          12/100 = 0.12        0.80 + 0.12 = 0.92
71.95–73.95        7           7/100 = 0.07         0.92 + 0.07 = 0.99
73.95–75.95        1           1/100 = 0.01         0.99 + 0.01 = 1.00
The data in this table have been grouped into the following intervals:
• 59.95 to 61.95 inches
• 61.95 to 63.95 inches
• 63.95 to 65.95 inches
• 65.95 to 67.95 inches
• 67.95 to 69.95 inches
• 69.95 to 71.95 inches
• 71.95 to 73.95 inches
• 73.95 to 75.95 inches
In this sample, there are five players whose heights fall within the interval 59.95–61.95 inches, three players whose heights
fall within the interval 61.95–63.95 inches, 15 players whose heights fall within the interval 63.95–65.95 inches, 40 players
whose heights fall within the interval 65.95–67.95 inches, 17 players whose heights fall within the interval 67.95–69.95
inches, 12 players whose heights fall within the interval 69.95–71.95 inches, seven players whose heights fall within the
interval 71.95–73.95 inches, and one player whose height falls within the interval 73.95–75.95 inches. All heights fall
between the endpoints of an interval and not at the endpoints.
Example 1.14
From Table 1.8, find the percentage of heights that are less than 65.95 inches.
Solution 1.14
If you look at the first, second, and third rows, the heights are all less than 65.95 inches. There are 5 + 3 + 15 = 23
players whose heights are less than 65.95 inches. The percentage of heights less than 65.95 inches is then 23/100,
or 23%. This percentage is the cumulative relative frequency entry in the third row.
1.14 Table 1.9 shows the amount, in inches, of annual rainfall in a sample of towns.
RAINFALL (INCHES)   FREQUENCY   RELATIVE FREQUENCY   CUMULATIVE RELATIVE FREQUENCY
4.97–6.99           7           7/50 = 0.14          0.12 + 0.14 = 0.26
6.99–9.01           15          15/50 = 0.30         0.26 + 0.30 = 0.56
9.01–11.03          8           8/50 = 0.16          0.56 + 0.16 = 0.72
11.03–13.05         9           9/50 = 0.18          0.72 + 0.18 = 0.90
13.05–15.07         5           5/50 = 0.10          0.90 + 0.10 = 1.00
Table 1.9
From Table 1.9, find the percentage of rainfall that is less than 9.01 inches.
Example 1.15
From Table 1.8, find the percentage of heights that fall between 61.95 and 65.95 inches.
Solution 1.15
Add the relative frequencies in the second and third rows: 0.03 + 0.15 = 0.18 or 18%.
1.15 From Table 1.9, find the percentage of rainfall that is between 6.99 and 13.05 inches.
Example 1.16
Use the heights of the 100 male semiprofessional soccer players in Table 1.8. Fill in the blanks and check your
answers.
a. The percentage of heights that are from 67.95 to 71.95 inches is: ____.
b. The percentage of heights that are from 67.95 to 73.95 inches is: ____.
c. The percentage of heights that are more than 65.95 inches is: ____.
d. The number of players in the sample who are between 61.95 and 71.95 inches tall is: ____.
e. What kind of data are the heights?
f. Describe how you could gather this data (the heights) so that the data are characteristic of all male
semiprofessional soccer players.
Remember, you count frequencies. To find the relative frequency, divide the frequency by the total number of
data values. To find the cumulative relative frequency, add all of the previous relative frequencies to the relative
frequency for the current row.
Solution 1.16
a. 29%
b. 36%
c. 77%
d. 87
e. quantitative continuous
f. get rosters from each team and choose a simple random sample from each
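The answers to Examples 1.14 through 1.16 can be checked from the frequencies in Table 1.8. This is a minimal sketch; the helper names `count_between` and `percent_between` are illustrative and only handle limits that coincide with interval endpoints.

```python
# (lower bound, upper bound, frequency) for each interval of soccer player heights
heights_table = [
    (59.95, 61.95, 5),
    (61.95, 63.95, 3),
    (63.95, 65.95, 15),
    (65.95, 67.95, 40),
    (67.95, 69.95, 17),
    (69.95, 71.95, 12),
    (71.95, 73.95, 7),
    (73.95, 75.95, 1),
]

total = sum(freq for _, _, freq in heights_table)  # 100 players

def count_between(low, high):
    """Number of players in intervals lying entirely between low and high."""
    return sum(freq for lower, upper, freq in heights_table
               if lower >= low and upper <= high)

def percent_between(low, high):
    return 100 * count_between(low, high) / total

print(percent_between(59.95, 65.95))  # Example 1.14: less than 65.95 inches
print(percent_between(61.95, 65.95))  # Example 1.15
print(percent_between(67.95, 71.95))  # Example 1.16 a
print(percent_between(67.95, 73.95))  # Example 1.16 b
print(percent_between(65.95, 75.95))  # Example 1.16 c: more than 65.95 inches
print(count_between(61.95, 71.95))    # Example 1.16 d
```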
Example 1.17
Nineteen people were asked how many miles, to the nearest mile, they commute to work each day. The data are
as follows: 2; 5; 7; 3; 2; 10; 18; 15; 20; 7; 10; 18; 5; 12; 13; 12; 4; 5; 10. Table 1.10 was produced:
DATA   FREQUENCY   RELATIVE FREQUENCY   CUMULATIVE RELATIVE FREQUENCY
3      3           3/19                 0.1579
4      1           1/19                 0.2105
5      3           3/19                 0.1579
7      2           2/19                 0.2632
10     3           4/19                 0.4737
12     2           2/19                 0.7895
13     1           1/19                 0.8421
15     1           1/19                 0.8948
18     1           1/19                 0.9474
20     1           1/19                 1.0000
b. True or False: Three percent of the people surveyed commute three miles. If the statement is not correct,
what should it be? If the table is incorrect, make the corrections.
c. What fraction of the people surveyed commute five or seven miles?
d. What fraction of the people surveyed commute 12 miles or more? Less than 12 miles? Between five and 13
miles (not including five and 13 miles)?
Solution 1.17
a. No. The frequency column sums to 18, not 19. Not all cumulative relative frequencies are correct.
b. False. The frequency for three miles should be one; for two miles (left out), two. The cumulative relative
frequency column should read: 0.1053, 0.1579, 0.2105, 0.3684, 0.4737, 0.6316, 0.7368, 0.7895, 0.8421,
0.9474, 1.0000.
c. 5/19
d. 7/19, 12/19, 7/19
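The corrected frequencies and the fractions in parts c and d can be recomputed from the raw commuting data. This is a short sketch with illustrative variable names. When rounded to four decimal places, the first cumulative relative frequency, 2/19 = 0.10526..., is 0.1053.

```python
from collections import Counter
from fractions import Fraction

# Commuting distances in miles for the nineteen people surveyed
miles = [2, 5, 7, 3, 2, 10, 18, 15, 20, 7, 10, 18, 5, 12, 13, 12, 4, 5, 10]
counts = Counter(miles)
total = len(miles)

# Cumulative relative frequency for each distinct value, rounded to four places
running = 0
cumulative = {}
for value in sorted(counts):
    running += counts[value]
    cumulative[value] = round(running / total, 4)
    print(f"{value:>3} {counts[value]:>3} {cumulative[value]:.4f}")

five_or_seven = Fraction(counts[5] + counts[7], total)
twelve_or_more = Fraction(sum(c for v, c in counts.items() if v >= 12), total)
less_than_twelve = Fraction(sum(c for v, c in counts.items() if v < 12), total)
between_5_and_13 = Fraction(sum(c for v, c in counts.items() if 5 < v < 13), total)
print(five_or_seven, twelve_or_more, less_than_twelve, between_5_and_13)
```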
1.17 Table 1.9 represents the amount, in inches, of annual rainfall in a sample of towns. What fraction of towns
surveyed get between 11.03 and 13.05 inches of rainfall each year?
Example 1.18
Table 1.11 contains the total number of deaths worldwide as a result of earthquakes for the period from 2000 to
2012.
Table 1.11
Solution 1.18
a. 97,118 (11.8%)
b. 41.6%
c. 67,092/823,356 or 0.081 or 8.1%
d. 27.8%
e. Quantitative discrete
f. Quantitative continuous
1.18 Table 1.12 contains the total number of fatal motor vehicle traffic crashes in the United States for the period
from 1994 to 2011.
Table 1.12
accomplished by the random assignment of experimental units to treatment groups. When subjects are assigned treatments
randomly, all of the potential lurking variables are spread equally among the groups. At this point the only difference
between groups is the one imposed by the researcher. Different outcomes measured in the response variable, therefore, must
be a direct result of the different treatments. In this way, an experiment can prove a cause-and-effect connection between
the explanatory and response variables.
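Random assignment of experimental units to treatment groups can be sketched in Python as follows. The subject IDs, the two group names, and the function `assign_treatments` are all hypothetical and chosen only for illustration.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatment groups in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

subjects = [f"subject{i:02d}" for i in range(1, 21)]  # hypothetical IDs
groups = assign_treatments(subjects, ["active", "placebo"], seed=1)

for treatment, members in groups.items():
    print(treatment, len(members), sorted(members))
```

Because the shuffle decides which group each subject joins, neither the researcher nor the subject influences the assignment.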
The power of suggestion can have an important influence on the outcome of an experiment. Studies have shown that the
expectation of the study participant can be as important as the actual medication. In one study of performance-enhancing
drugs, researchers noted:
Results showed that believing one had taken the substance resulted in [performance] times almost as fast as those associated
with consuming the drug itself. In contrast, taking the drug without knowledge yielded no significant performance
increment.[1]
When participation in a study prompts a physical response from a participant, it is difficult to isolate the effects of the
explanatory variable. To counter the power of suggestion, researchers set aside one treatment group as a control group.
This group is given a placebo treatment–a treatment that cannot influence the response variable. The control group helps
researchers balance the effects of being in an experiment with the effects of the active treatments. Of course, if you are
participating in a study and you know that you are receiving a pill which contains no actual medication, then the power of
suggestion is no longer a factor. Blinding in a randomized experiment preserves the power of suggestion. When a person
involved in a research study is blinded, he or she does not know who is receiving the active treatment(s) and who is receiving
the placebo treatment. A double-blind experiment is one in which both the subjects and the researchers involved with the
subjects are blinded.
Example 1.19
The Smell & Taste Treatment and Research Foundation conducted a study to investigate whether smell can
affect learning. Subjects completed mazes multiple times while wearing masks. They completed the pencil and
paper mazes three times wearing floral-scented masks, and three times with unscented masks. Participants were
assigned at random to wear the floral mask during the first three trials or during the last three trials. For each
trial, researchers recorded the time it took to complete the maze and the subject’s impression of the mask’s scent:
positive, negative, or neutral.
a. Describe the explanatory and response variables in this study.
b. What are the treatments?
c. Identify any lurking variables that could interfere with this study.
d. Is it possible to use blinding in this study?
Solution 1.19
a. The explanatory variable is scent, and the response variable is the time it takes to complete the maze.
b. There are two treatments: a floral-scented mask and an unscented mask.
c. All subjects experienced both treatments. The order of treatments was randomly assigned so there were no
differences between the treatment groups. Random assignment eliminates the problem of lurking variables.
d. Subjects will clearly know whether they can smell flowers or not, so subjects cannot be blinded in this study.
Researchers timing the mazes can be blinded, though. The researcher who is observing a subject will not
know which mask is being worn.
1. McClung, M. Collins, D. “Because I know it will!”: placebo effects of an ergogenic aid on athletic performance.
Journal of Sport & Exercise Psychology. 2007 Jun. 29(3):382-94. Web. April 30, 2013.
KEY TERMS
Average also called mean or arithmetic mean; a number that describes the central tendency of the data
Categorical Variable variables that take on values that are names or labels
Cluster Sampling a method for selecting a random sample and dividing the population into groups (clusters); use
simple random sampling to select a set of clusters. Every individual in the chosen clusters is included in the sample.
Continuous Random Variable a random variable (RV) whose outcomes are measured; the height of trees in the
forest is a continuous RV.
Control Group a group in a randomized experiment that receives an inactive treatment but is otherwise managed
exactly as the other groups
Convenience Sampling a nonrandom method of selecting a sample; this method selects individuals that are easily
accessible and may result in biased data.
Cumulative Relative Frequency The term applies to an ordered set of observations from smallest to largest. The
cumulative relative frequency is the sum of the relative frequencies for all values that are less than or equal to the
given value.
Data a set of observations (a set of possible outcomes); most data can be put into two groups: qualitative (an attribute
whose value is indicated by a label) or quantitative (an attribute whose value is indicated by a number).
Quantitative data can be separated into two subgroups: discrete and continuous. Data is discrete if it is the result of
counting (such as the number of students of a given ethnic group in a class or the number of books on a shelf). Data
is continuous if it is the result of measuring (such as distance traveled or weight of luggage)
Discrete Random Variable a random variable (RV) whose outcomes are counted
Double-blinding the act of blinding both the subjects of an experiment and the researchers who work with the subjects
Explanatory Variable the independent variable in an experiment; the value controlled by researchers
Informed Consent Any human subject in a research study must be cognizant of any risks or costs associated with the
study. The subject has the right to know the nature of the treatments included in the study, their potential risks, and
their potential benefits. Consent must be given freely by an informed, fit participant.
Institutional Review Board a committee tasked with oversight of research programs that involve human subjects
Lurking Variable a variable that has an effect on a study even though it is neither an explanatory variable nor a
response variable
Mathematical Models a description of a phenomenon using mathematical concepts, such as equations, inequalities,
distributions, etc.
Nonsampling Error an issue that affects the reliability of sampling data other than natural variation; it includes a
variety of human errors including poor study design, biased sampling methods, inaccurate information provided by
study participants, data entry errors, and poor analysis.
Numerical Variable variables that take on values that are indicated by numbers
Observational Study a study in which the independent variable is not manipulated by the researcher
Parameter a number that is used to represent a population characteristic and that generally cannot be determined easily
Placebo an inactive treatment that has no real effect on the response variable
Population all individuals, objects, or measurements whose properties are being studied
Probability a number between zero and one, inclusive, that gives the likelihood that a specific event will occur
Proportion the number of successes divided by the total number in the sample
Random Assignment the act of organizing experimental units into treatment groups using random methods
Random Sampling a method of selecting a sample that gives every member of the population an equal chance of
being selected.
Relative Frequency the ratio of the number of times a value of the data occurs in the set of all outcomes to the total
number of outcomes
Representative Sample a subset of the population that has the same characteristics as the population
Response Variable the dependent variable in an experiment; the value that is measured for change at the end of an
experiment
Sampling Bias not all members of the population are equally likely to be selected
Sampling Error the natural variation that results from selecting a sample to represent a larger population; this variation
decreases as the sample size increases, so selecting larger samples reduces sampling error.
Sampling with Replacement Once a member of the population is selected for inclusion in a sample, that member is
returned to the population for the selection of the next individual.
Sampling without Replacement A member of the population may be chosen for inclusion in a sample only once. If
chosen, the member is not returned to the population before the next selection.
Simple Random Sampling a straightforward method for selecting a random sample; give each member of the
population a number. Use a random number generator to select a set of labels. These randomly selected labels
identify the members of your sample.
Statistic a numerical characteristic of the sample; a statistic estimates the corresponding population parameter.
Statistical Models a description of a phenomenon using probability distributions that describe the expected behavior
of the phenomenon and the variability in the expected observations.
Stratified Sampling a method for selecting a random sample used to ensure that subgroups of the population are
represented adequately; divide the population into groups (strata). Use simple random sampling to identify a
proportionate number of individuals from each stratum.
Systematic Sampling a method for selecting a random sample; list the members of the population. Use simple
random sampling to select a starting point in the population. Let k = (number of individuals in the
population)/(number of individuals needed in the sample). Choose every kth individual in the list starting with the
one that was randomly selected. If necessary, return to the beginning of the population list to complete your sample.
CHAPTER REVIEW
2. Andrew Gelman, “Open Data and Open Methods,” Ethics and Statistics, http://www.stat.columbia.edu/~gelman/
research/published/ChanceEthics1.pdf (accessed May 1, 2013).
34 Chapter 1 | Sampling and Data
HOMEWORK
Use the following information to answer the next three exercises: A Lake Tahoe Community College instructor is interested
in the mean number of days Lake Tahoe Community College math students are absent from class during a quarter.
9. What is the population she is interested in?
a. all Lake Tahoe Community College students
b. all Lake Tahoe Community College English students
c. all Lake Tahoe Community College students in her classes
d. all Lake Tahoe Community College math students
10. Consider the following:
X = number of days a Lake Tahoe Community College math student is absent
In this case, X is an example of a:
a. variable.
b. population.
c. statistic.
d. data.
11. The instructor’s sample produces a mean number of days absent of 3.5 days. This value is an example of a:
a. parameter.
b. data.
c. statistic.
d. variable.
32. Name the sampling method used in each of the following situations:
a. A woman in the airport is handing out questionnaires to travelers asking them to evaluate the airport’s service.
She does not ask travelers who are hurrying through the airport with their hands full of luggage, but instead asks
all travelers who are sitting near gates and not taking naps while they wait.
b. A teacher wants to know if her students are doing homework, so she randomly selects rows two and five and then
calls on all students in row two and all students in row five to present the solutions to homework problems to the
class.
c. The marketing manager for an electronics chain store wants information about the ages of its customers. Over
the next two weeks, at each store location, 100 randomly selected customers are given questionnaires to fill out
asking for information about age, as well as about other variables of interest.
d. The librarian at a public library wants to determine what proportion of the library users are children. The librarian
has a tally sheet on which she marks whether books are checked out by an adult or a child. She records this data
for every fourth patron who checks out books.
e. A political party wants to know the reaction of voters to a debate between the candidates. The day after the debate,
the party’s polling staff calls 1,200 randomly selected phone numbers. If a registered voter answers the phone or
is available to come to the phone, that registered voter is asked whom he or she intends to vote for and whether
the debate changed his or her opinion of the candidates.
33. A “random survey” was conducted of 3,274 people of the “microprocessor generation” (people born since 1971, the
year the microprocessor was invented). It was reported that 48% of those individuals surveyed stated that if they had $2,000
to spend, they would use it for computer equipment. Also, 66% of those surveyed considered themselves relatively savvy
computer users.
a. Do you consider the sample size large enough for a study of this type? Why or why not?
b. Based on your “gut feeling,” do you believe the percents accurately reflect the U.S. population for those
individuals born since 1971? If not, do you think the percents of the population are actually higher or lower than
the sample statistics? Why?
Additional information: The survey, reported by Intel Corporation, was filled out by individuals who visited the
Los Angeles Convention Center to see the Smithsonian Institution's road show called “America’s Smithsonian.”
c. With this additional information, do you feel that all demographic and ethnic groups were equally represented at
the event? Why or why not?
d. With the additional information, comment on how accurately you think the sample statistics reflect the population
parameters.
34. The Well-Being Index is a survey that follows trends of U.S. residents on a regular basis. There are six areas of
health and wellness covered in the survey: Life Evaluation, Emotional Health, Physical Health, Healthy Behavior, Work
Environment, and Basic Access. Some of the questions used to measure the Index are listed below.
Identify the type of data obtained from each question used in this survey: qualitative (categorical), quantitative discrete, or
quantitative continuous.
a. Do you have any health problems that prevent you from doing any of the things people your age can normally do?
b. During the past 30 days, for about how many days did poor health keep you from doing your usual activities?
c. In the last seven days, on how many days did you exercise for 30 minutes or more?
d. Do you have health insurance coverage?
35. In advance of the 1936 Presidential Election, a magazine titled Literary Digest released the results of an opinion
poll predicting that the Republican candidate Alf Landon would win by a large margin. The magazine sent post cards
to approximately 10,000,000 prospective voters. These prospective voters were selected from the subscription list of the
magazine, from automobile registration lists, from phone lists, and from club membership lists. Approximately 2,300,000
people returned the postcards.
a. Think about the state of the United States in 1936. Explain why a sample chosen from magazine subscription lists,
automobile registration lists, phone books, and club membership lists was not representative of the population of
the United States at that time.
b. What effect does the low response rate have on the reliability of the sample?
c. Are these problems examples of sampling error or nonsampling error?
d. During the same year, George Gallup conducted his own poll of 30,000 prospective voters. These researchers used
a method they called "quota sampling" to obtain survey answers from specific subsets of the population. Quota
sampling is an example of which sampling method described in this module?
36. Crime-related and demographic statistics for 47 US states in 1960 were collected from government agencies, including
the FBI's Uniform Crime Report. One analysis of this data found a strong connection between education and crime
indicating that higher levels of education in a community correspond to higher crime rates.
Which of the potential problems with samples discussed in Section 1.2 could explain this connection?
37. YouPolls is a website that allows anyone to create and respond to polls. One question posted April 15 asks:
“Do you feel happy paying your taxes when members of the Obama administration are allowed to ignore their tax
liabilities?”[3]
As of April 25, 11 people responded to this question. Each participant answered “NO!”
Which of the potential problems with samples discussed in this module could explain this connection?
38. A scholarly article about response rates begins with the following quote:
“Declining contact and cooperation rates in random digit dial (RDD) national telephone surveys raise serious concerns
about the validity of estimates drawn from such research.”[4]
The Pew Research Center for People and the Press admits:
“The percentage of people we interview – out of all we try to interview – has been declining over the past decade or
more.”[5]
a. What are some reasons for the decline in response rate over the past decade?
b. Explain why researchers are concerned with the impact of the declining response rate on public opinion polls.
3. lastbaldeagle. 2013. On Tax Day, House to Call for Firing Federal Workers Who Owe Back Taxes. Opinion poll posted
online at: http://www.youpolls.com/details.aspx?id=12328 (accessed May 1, 2013).
4. Scott Keeter et al., “Gauging the Impact of Growing Nonresponse on Estimates from a National RDD Telephone
Survey,” Public Opinion Quarterly 70 no. 5 (2006), http://poq.oxfordjournals.org/content/70/5/759.full
(accessed May 1, 2013).
5. Frequently Asked Questions, Pew Research Center for the People & the Press, http://www.people-press.org/
methodology/frequently-asked-questions/#dont-you-have-trouble-getting-people-to-answer-your-polls (accessed May 1,
2013).
40. Sixty adults with gum disease were asked the number of times per week they used to floss before their diagnosis. The
(incomplete) results are shown in Table 1.14.
Data   Frequency   Relative frequency   Cumulative relative frequency
2      3           3/19                 0.2632
4      1           1/19                 0.3158
5      3           3/19                 0.4737
7      2           2/19                 0.5789
10     2           2/19                 0.6842
12     2           2/19                 0.7895
15     1           1/19                 0.8421
20     1           1/19                 1.0000
Table 1.15
a. Fix the errors in Table 1.15. Also, explain how someone might have arrived at the incorrect number(s).
b. Explain what is wrong with this statement: “47 percent of the people surveyed have lived in the U.S. for 5 years.”
c. Fix the statement in b to make it correct.
d. What fraction of the people surveyed have lived in the U.S. five or seven years?
e. What fraction of the people surveyed have lived in the U.S. at most 12 years?
f. What fraction of the people surveyed have lived in the U.S. fewer than 12 years?
g. What fraction of the people surveyed have lived in the U.S. from five to 20 years, inclusive?
42. How much time does it take to travel to work? Table 1.16 shows the mean commute time by state for workers at least
16 years old who are not working at home. Find the mean travel time, and round off the answer properly.
24.0 24.3 25.9 18.9 27.5 17.9 21.8 20.9 16.7 27.3
18.2 24.7 20.0 22.6 23.9 18.0 31.4 22.3 24.0 25.5
24.7 24.6 28.1 24.9 22.6 23.6 23.4 25.7 24.8 25.5
21.2 25.7 23.1 23.0 23.9 26.0 16.3 23.1 21.4 21.5
27.0 27.0 18.6 31.7 23.3 30.1 22.9 23.3 21.7 18.6
Table 1.16
43. Forbes magazine published data on the best small firms in 2012. These were firms which had been publicly traded for
at least a year, have a stock price of at least $5 per share, and have reported annual revenue between $5 million and $1
billion. Table 1.17 shows the ages of the chief executive officers for the first 60 ranked firms.
Table 1.17
Figure 1.11 (panels (a) and (b))
Use the following information to answer the next two exercises: Table 1.18 contains data on hurricanes that have made
direct hits on the U.S. between 1851 and 2004. A hurricane is given a strength category rating based on the minimum wind
speed generated by the storm.
44. What is the relative frequency of direct hits that were category 4 hurricanes?
a. 0.0768
b. 0.0659
c. 0.2601
d. Not enough information to calculate
45. What is the relative frequency of direct hits that were AT MOST a category 3 storm?
a. 0.3480
b. 0.9231
c. 0.2601
d. 0.3370
REFERENCES
SOLUTIONS
2
a. all children who take ski or snowboard lessons
b. a group of these children
c. the population mean age of children who take their first snowboard lesson
d. the sample mean age of children who take their first snowboard lesson
e. X = the age of one child who takes his or her first ski or snowboard lesson
f. values for X, such as 3, 7, and so on
4
a. the clients of the insurance companies
b. a group of the clients
c. the mean health costs of the clients
d. the mean health costs of the sample
e. X = the health costs of one client
f. values for X, such as 34, 9, 82, and so on
6
a. all the clients of this counselor
b. a group of clients of this marriage counselor
c. the proportion of all her clients who stay married
d. the proportion of the sample of the counselor’s clients who stay married
e. X = the number of couples who stay married
f. yes, no
8
a. all people (maybe in a certain geographic area, such as the United States)
b. a group of the people
c. the proportion of all people who will buy the product
d. the proportion of the sample who will buy the product
e. X = the number of people who will buy it
f. buy, not buy
10 a
12 quantitative discrete, 150
14 qualitative, Oakland A’s
16 quantitative discrete, 11,234 students
18 qualitative, Crest
20 quantitative continuous, 47.3 years
22 b
24
a. The survey was conducted using six similar flights.
The survey would not be a true representation of the entire population of air travelers.
Conducting the survey on a holiday weekend will not produce representative results.
b. Conduct the survey during different times of the year.
26 Answers will vary. Sample Answer: You could use a systematic sampling method. Stop the tenth person as they leave
one of the buildings on campus at 9:50 in the morning. Then stop the tenth person as they leave a different building on
campus at 1:50 in the afternoon.
28 Answers will vary. Sample Answer: Many people will not respond to mail surveys. If they do respond to the surveys,
you can’t be sure who is responding. In addition, mailing lists can be incomplete.
30 b
32 convenience; cluster; stratified; systematic; simple random
34
a. qualitative (categorical)
b. quantitative discrete
c. quantitative discrete
d. qualitative (categorical)
36 Causality: The fact that two variables are related does not guarantee that one variable is influencing the other. We
cannot assume that crime rate impacts education level or that education level impacts crime rate. Confounding: There are
many factors that define a community other than education level and crime rate. Communities with high crime rates and
high education levels may have other lurking variables that distinguish them from communities with lower crime rates
and lower education levels. Because we cannot isolate these variables of interest, we cannot draw valid conclusions about
the connection between education and crime. Possible lurking variables include police expenditures, unemployment levels,
region, average age, and size.
38
a. Possible reasons: increased use of caller ID, decreased use of landlines, increased use of private numbers, voice mail,
privacy managers, hectic nature of personal schedules, decreased willingness to be interviewed
b. When a large number of people refuse to participate, the sample may not have the same characteristics as the
population. Perhaps the majority of people willing to participate are doing so because they feel strongly about the
subject of the survey.
40
a.
Table 1.19
b. 5.00%
c. 93.33%
42 The sum of the travel times is 1,173.1. Divide the sum by 50 to calculate the mean value: 23.462. Because each state’s
travel time was measured to the nearest tenth, round this calculation to the nearest hundredth: 23.46.
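The arithmetic in Solution 42 can be checked with a short script. This is a minimal sketch that assumes the 50 values are typed in exactly as they appear in Table 1.16; the variable names are ours.

```python
# Mean commute times by state, copied row by row from Table 1.16.
times = [
    24.0, 24.3, 25.9, 18.9, 27.5, 17.9, 21.8, 20.9, 16.7, 27.3,
    18.2, 24.7, 20.0, 22.6, 23.9, 18.0, 31.4, 22.3, 24.0, 25.5,
    24.7, 24.6, 28.1, 24.9, 22.6, 23.6, 23.4, 25.7, 24.8, 25.5,
    21.2, 25.7, 23.1, 23.0, 23.9, 26.0, 16.3, 23.1, 21.4, 21.5,
    27.0, 27.0, 18.6, 31.7, 23.3, 30.1, 22.9, 23.3, 21.7, 18.6,
]

total = sum(times)             # sum of the travel times
mean = total / len(times)      # divide by the number of values (50)

# The data are given to the nearest tenth, so report the mean to the
# nearest hundredth (one more decimal place than the data).
print(f"sum = {total:.1f}, mean = {mean:.2f}")
```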
44 b
2 | DESCRIPTIVE
STATISTICS
Figure 2.1 When you have large amounts of data, you will need to organize it in a way that makes sense. These
ballots from an election are rolled together with similar ballots to keep them organized. (credit: William Greeson)
Introduction
Once you have collected data, what will you do with it? Data can be described and presented in many different formats. For
example, suppose you are interested in buying a house in a particular area. You may have no clue about the house prices, so
you might ask your real estate agent to give you a sample data set of prices. Looking at all the prices in the sample often is
overwhelming. A better way might be to look at the median price and the variation of prices. The median and variation are
just two ways that you will learn to describe data. Your agent might also provide you with a graph of the data.
In this chapter, you will study numerical and graphical ways to describe and display your data. This area of statistics is called
"Descriptive Statistics." You will learn how to calculate, and even more importantly, how to interpret these measurements
and graphs.
A statistical graph is a tool that helps you learn about the shape or distribution of a sample or a population. A graph can be
a more effective way of presenting data than a mass of numbers because we can see where data clusters and where there are
only a few data values. Newspapers and the Internet use graphs to show trends and to enable readers to compare facts and
figures quickly. Statisticians often graph data first to get a picture of the data. Then, more formal tools may be applied.
Some of the types of graphs that are used to summarize and organize data are the dot plot, the bar graph, the histogram, the
stem-and-leaf plot, the frequency polygon (a type of broken line graph), the pie chart, and the box plot. In this chapter, we
will briefly look at stem-and-leaf plots, line graphs, and bar graphs, as well as frequency polygons, and time series graphs.
Our emphasis will be on histograms and box plots.
Example 2.1
For Susan Dean's spring pre-calculus class, scores for the first exam were as follows (smallest to largest):
33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94;
96; 100
Stem Leaf
3 3
4 299
5 355
6 1378899
7 2348
8 03888
9 0244446
10 0
The stemplot shows that most scores fell in the 60s, 70s, 80s, and 90s. Eight out of the 31 scores, or approximately
26% $\left(\frac{8}{31}\right)$, were in the 90s or 100, a fairly high number of As.
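The stemplot in Example 2.1 can also be built programmatically. The sketch below is one possible approach, not a standard library routine: the function name stemplot and its leaf_unit parameter are our own, and it assumes non-negative data in which the final digit (in units of leaf_unit) is the leaf and the remaining digits form the stem.

```python
from collections import defaultdict

def stemplot(values, leaf_unit=1):
    """Return stem-and-leaf rows; each leaf is the last digit in units of leaf_unit."""
    stems = defaultdict(list)
    for v in sorted(values):
        # Split each value into a stem (leading digits) and a leaf (final digit).
        stem, leaf = divmod(round(v / leaf_unit), 10)
        stems[stem].append(leaf)
    rows = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}".rstrip())
    return rows

scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]
rows = stemplot(scores)
print("\n".join(rows))

# Share of scores in the 90s or 100.
high = sum(1 for s in scores if s >= 90)
print(high, len(scores), round(high / len(scores), 2))
```

For data recorded to one decimal place, calling stemplot(data, leaf_unit=0.1) puts the tenths digit in the leaf position.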
2.1 For the Park City basketball team, scores for the last 30 games were as follows (smallest to largest):
32; 32; 33; 34; 38; 40; 42; 42; 43; 44; 46; 47; 47; 48; 48; 48; 49; 50; 50; 51; 52; 52; 52; 53; 54; 56; 57; 57; 60; 61
Construct a stem plot for the data.
The stemplot is a quick way to graph data and gives an exact picture of the data. You want to look for an overall pattern
and any outliers. An outlier is an observation of data that does not fit the rest of the data. It is sometimes called an extreme
value. When you graph an outlier, it will appear not to fit the pattern of the graph. Some outliers are due to mistakes (for
example, writing down 50 instead of 500) while others may indicate that something unusual is happening. It takes some
background information to explain outliers, so we will cover them in more detail later.
Example 2.2
The data are the distances (in kilometers) from a home to local supermarkets. Create a stemplot using the data:
1.1; 1.5; 2.3; 2.5; 2.7; 3.2; 3.3; 3.3; 3.5; 3.8; 4.0; 4.2; 4.5; 4.5; 4.7; 4.8; 5.5; 5.6; 6.5; 6.7; 12.3
Do the data seem to have any concentration of values?
NOTE
The leaves are to the right of the decimal.
Solution 2.2
The value 12.3 may be an outlier. Values appear to concentrate at three and four kilometers.
Stem Leaf
1 15
2 357
3 23358
4 025578
5 56
6 57
7
8
9
10
11
12 3
Table 2.2
2.2 The following data show the distances (in miles) from the homes of off-campus statistics students to the college.
Create a stem plot using the data and identify any outliers:
0.5; 0.7; 1.1; 1.2; 1.2; 1.3; 1.3; 1.5; 1.5; 1.7; 1.7; 1.8; 1.9; 2.0; 2.2; 2.5; 2.6; 2.8; 2.8; 2.8; 3.5; 3.8; 4.4; 4.8; 4.9; 5.2;
5.5; 5.7; 5.8; 8.0
Example 2.3
A side-by-side stem-and-leaf plot allows a comparison of the two data sets in two columns. In a side-by-side
stem-and-leaf plot, two sets of leaves share the same stem. The leaves are to the left and the right of the stems.
Table 2.4 and Table 2.5 show the ages of presidents at their inauguration and at their death. Construct a side-
by-side stem-and-leaf plot using this data.
Solution 2.3
Table 2.3
Another type of graph that is useful for specific data values is a line graph. In the particular line graph shown in Example
2.4, the x-axis (horizontal axis) consists of data values and the y-axis (vertical axis) consists of frequency points. The
frequency points are connected using line segments.
Example 2.4
In a survey, 40 mothers were asked how many times per week a teenager must be reminded to do his or her chores.
The results are shown in Table 2.6 and in Figure 2.2.
Table 2.6
Figure 2.2
2.4 In a survey, 40 people were asked how many times per year they had their car in the shop for repairs. The results
are shown in Table 2.7. Construct a line graph.
Table 2.7
Bar graphs consist of bars that are separated from each other. The bars can be rectangles or they can be rectangular boxes
(used in three-dimensional plots), and they can be vertical or horizontal. The bar graph shown in Example 2.5 has age
groups represented on the x-axis and proportions on the y-axis.
Example 2.5
By the end of 2011, Facebook had over 146 million users in the United States. Table 2.8 shows three age groups,
the number of users in each age group, and the proportion (%) of users in each age group. Construct a bar graph
using this data.
Table 2.8
Solution 2.5
Figure 2.3
2.5 The population in Park City is made up of children, working-age adults, and retirees. Table 2.9 shows the three
age groups, the number of people in the town from each age group, and the proportion (%) of people in each age group.
Construct a bar graph showing the proportions.
Table 2.9
Example 2.6
The columns in Table 2.10 contain: the race or ethnicity of students in U.S. Public Schools for the class of
2011, percentages for the Advanced Placement examinee population for that class, and percentages for the overall
student population. Create a bar graph with the student race or ethnicity (qualitative data) on the x-axis, and the
Advanced Placement examinee population percentages on the y-axis.
Table 2.10
Solution 2.6
Figure 2.4
2.6 Park City is broken down into six voting districts. The table shows the percent of the total registered voter
population that lives in each district as well as the percent total of the entire population that lives in each district.
Construct a bar graph that shows the registered voter population by district.
Table 2.11
Example 2.7
Below is a two-way table showing the types of pets owned by men and women:
Table 2.12
Given these data, calculate the conditional distributions for the subpopulation of men who own each pet type.
Solution 2.7
Men who own dogs = 4/8 = 0.5
Men who own cats = 2/8 = 0.25
Men who own fish = 2/8 = 0.25
Note: The proportions in a conditional distribution must sum to one. In this case, 0.5 + 0.25 + 0.25 = 1; therefore,
the solution "checks".
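The calculation in Solution 2.7 can be written as a short script. This sketch assumes only the counts for men that appear in the solution (4 dog owners, 2 cat owners, and 2 fish owners out of 8 men); the variable names are ours.

```python
# Counts for the subpopulation of men, taken from Solution 2.7.
men = {"dogs": 4, "cats": 2, "fish": 2}

total_men = sum(men.values())   # 8 men in all

# Conditional distribution: each count divided by the subpopulation total.
conditional = {pet: count / total_men for pet, count in men.items()}

for pet, proportion in conditional.items():
    print(f"Men who own {pet} = {men[pet]}/{total_men} = {proportion}")

# Check: the proportions must add up to one.
print("sum =", sum(conditional.values()))
```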
For example, if three students in Mr. Ahab's English class of 40 students received from 90% to 100%, then f = 3, n = 40,
and $RF = \frac{f}{n} = \frac{3}{40} = 0.075$. 7.5% of the students received 90–100%. 90–100% are quantitative measures.
To construct a histogram, first decide how many bars or intervals, also called classes, represent the data. Many
histograms consist of five to 15 bars or classes for clarity. The number of bars needs to be chosen. Choose a starting point
for the first interval to be less than the smallest data value. A convenient starting point is a lower value carried out to one
more decimal place than the value with the most decimal places. For example, if the value with the most decimal places is
6.1 and this is the smallest value, a convenient starting point is 6.05 (6.1 – 0.05 = 6.05). We say that 6.05 has more precision.
If the value with the most decimal places is 2.23 and the lowest value is 1.5, a convenient starting point is 1.495 (1.5 – 0.005
= 1.495). If the value with the most decimal places is 3.234 and the lowest value is 1.0, a convenient starting point is 0.9995
(1.0 – 0.0005 = 0.9995). If all the data happen to be integers and the smallest value is two, then a convenient starting point
is 1.5 (2 – 0.5 = 1.5). Also, when the starting point and other boundaries are carried to one additional decimal place, no data
value will fall on a boundary. The next two examples go into detail about how to construct a histogram using continuous
data and how to create a histogram using discrete data.
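The rule for a convenient starting point described above can be expressed as a small function. This is a sketch under our own naming (starting_point and max_decimals are not from the text); it uses Python's decimal module so that results such as 6.05 are exact rather than subject to floating-point rounding.

```python
from decimal import Decimal

def starting_point(smallest, max_decimals):
    """Subtract half a unit in the next decimal place from the smallest value.

    smallest     -- the smallest data value, given as a string such as "6.1"
    max_decimals -- the most decimal places carried by any data value
    """
    # 0.5 when max_decimals is 0, 0.05 when it is 1, 0.005 when it is 2, ...
    half_unit = Decimal(5) / (Decimal(10) ** (max_decimals + 1))
    return Decimal(smallest) - half_unit

# The four cases worked in the text.
print(starting_point("6.1", 1))   # 6.1 - 0.05
print(starting_point("1.5", 2))   # 1.5 - 0.005
print(starting_point("1.0", 3))   # 1.0 - 0.0005
print(starting_point("2", 0))     # 2 - 0.5
```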
Example 2.8
The following data are the heights (in inches to the nearest half inch) of 100 male semiprofessional soccer players.
The heights are continuous data, since height is measured.
60; 60.5; 61; 61; 61.5
63.5; 63.5; 63.5
64; 64; 64; 64; 64; 64; 64; 64.5; 64.5; 64.5; 64.5; 64.5; 64.5; 64.5; 64.5
66; 66; 66; 66; 66; 66; 66; 66; 66; 66; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 67; 67; 67;
67; 67; 67; 67; 67; 67; 67; 67; 67; 67.5; 67.5; 67.5; 67.5; 67.5; 67.5; 67.5
68; 68; 69; 69; 69; 69; 69; 69; 69; 69; 69; 69; 69.5; 69.5; 69.5; 69.5; 69.5
70; 70; 70; 70; 70; 70; 70.5; 70.5; 70.5; 71; 71; 71
72; 72; 72; 72.5; 72.5; 73; 73.5
74
The smallest data value is 60. Since the data with the most decimal places has one decimal (for instance, 61.5),
we want our starting point to have two decimal places. Since the numbers 0.5, 0.05, 0.005, etc. are convenient
numbers, use 0.05 and subtract it from 60, the smallest value, for the convenient starting point.
60 – 0.05 = 59.95 which is more precise than, say, 61.5 by one decimal place. The starting point is, then, 59.95.
The largest value is 74, so 74 + 0.05 = 74.05 is the ending value.
Next, calculate the width of each bar or class interval. To calculate this width, subtract the starting point from the
ending value and divide by the number of bars (you must choose the number of bars you desire). Suppose you
choose eight bars.
$\frac{74.05 - 59.95}{8} = 1.76$
NOTE
We will round up to two and make each bar or class interval two units wide. Rounding up to two is one
way to prevent a value from falling on a boundary. Rounding to the next number is often necessary even if
it goes against the standard rules of rounding. For this example, using 1.76 as the width would also work.
A guideline that some follow for the number of bars or class intervals is to take the square root of the
number of data values and then round to the nearest whole number, if necessary. For example, if there are
150 values of data, take the square root of 150 and round to 12 bars or intervals.
Figure 2.5
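The steps of Example 2.8 (starting point, ending value, class width, and class frequencies) can be traced in code. This is a sketch only: the heights are stored as a value-to-count dictionary purely to keep the listing short, the variable names are ours, and the choice of eight bars follows the example.

```python
import math

# Heights from Example 2.8, stored as value -> number of players.
height_counts = {
    60: 1, 60.5: 1, 61: 2, 61.5: 1, 63.5: 3, 64: 7, 64.5: 8,
    66: 10, 66.5: 11, 67: 12, 67.5: 7, 68: 2, 69: 10, 69.5: 5,
    70: 6, 70.5: 3, 71: 3, 72: 3, 72.5: 2, 73: 1, 73.5: 1, 74: 1,
}
heights = [h for h, count in height_counts.items() for _ in range(count)]

start = min(heights) - 0.05        # starting point, 59.95
end = max(heights) + 0.05          # ending value, 74.05
bars = 8                           # number of bars chosen in the example
raw_width = (end - start) / bars   # about 1.76
width = math.ceil(raw_width)       # rounded up to 2

# Count how many heights fall in each class interval.
frequencies = [0] * bars
for h in heights:
    index = int((h - start) // width)
    frequencies[index] += 1

for i, f in enumerate(frequencies):
    left = start + i * width
    print(f"{left:.2f} to {left + width:.2f}: {f}")

# Square-root guideline for the number of bars, using the text's 150 values.
print(round(math.sqrt(150)))
```

Because the boundaries carry one more decimal place than the data, no height can land exactly on a boundary, so the floor division assigns every value to exactly one class.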
2.8 The following data are the shoe sizes of 50 male students. The sizes are continuous data since shoe size is
measured. Construct a histogram and calculate the width of each bar or class interval. Suppose you choose six bars.
9; 9; 9.5; 9.5; 10; 10; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 10.5; 10.5; 10.5; 10.5
11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5; 11.5; 11.5; 11.5; 11.5
12; 12; 12; 12; 12; 12; 12; 12.5; 12.5; 12.5; 12.5; 14
Example 2.9
Create a histogram for the following data: the number of books bought by 50 part-time college students at ABC
College. The number of books is discrete data, since books are counted.
1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1
2; 2; 2; 2; 2; 2; 2; 2; 2; 2
3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3
4; 4; 4; 4; 4; 4
5; 5; 5; 5; 5
6; 6
Eleven students buy one book. Ten students buy two books. Sixteen students buy three books. Six students buy
four books. Five students buy five books. Two students buy six books.
Because the data are integers, subtract 0.5 from 1, the smallest data value and add 0.5 to 6, the largest data value.
Then the starting point is 0.5 and the ending value is 6.5.
Next, calculate the width of each bar or class interval. If the data are discrete and there are not too many different
values, a width that places the data values in the middle of the bar or class interval is the most convenient. Since
the data consist of the numbers 1, 2, 3, 4, 5, 6, and the starting point is 0.5, a width of one places the 1 in the
middle of the interval from 0.5 to 1.5, the 2 in the middle of the interval from 1.5 to 2.5, the 3 in the middle of the
interval from 2.5 to 3.5, the 4 in the middle of the interval from _______ to _______, the 5 in the middle of the
interval from _______ to _______, and the _______ in the middle of the interval from _______ to _______ .
Solution 2.9
• 3.5 to 4.5
• 4.5 to 5.5
• 6
• 5.5 to 6.5
Calculate the number of bars as follows:
$\frac{6.5 - 0.5}{\text{number of bars}} = 1$
where 1 is the width of a bar. Therefore, bars = 6.
The following histogram displays the number of books on the x-axis and the frequency on the y-axis.
Figure 2.6
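For discrete data such as the book counts in Example 2.9, the class frequencies are simply the counts of each distinct value. The following sketch, with variable names of our own choosing, tallies the data, computes the number of bars, and lists each interval with its frequency and relative frequency.

```python
from collections import Counter

# Number of books bought by the 50 students in Example 2.9.
books = [1] * 11 + [2] * 10 + [3] * 16 + [4] * 6 + [5] * 5 + [6] * 2

frequency = Counter(books)        # value -> number of students

start = min(books) - 0.5          # starting point, 0.5
end = max(books) + 0.5            # ending value, 6.5
width = 1                         # puts each integer in the middle of its bar
bars = int((end - start) / width)
print("number of bars:", bars)

for value in sorted(frequency):
    left, right = value - 0.5, value + 0.5
    rf = frequency[value] / len(books)   # relative frequency, f / n
    print(f"{left} to {right}: f = {frequency[value]}, RF = {rf:.2f}")
```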
Example 2.10
Table 2.13
Solution 2.10
Figure 2.7
Some values in this data set fall on boundaries for the class intervals. A value is counted in a class interval if it
falls on the left boundary, but not if it falls on the right boundary. Different researchers may set up histograms for
the same data in different ways. There is more than one correct way to set up a histogram.
Frequency Polygons
Frequency polygons are analogous to line graphs, and just as line graphs make continuous data visually easy to interpret, so
too do frequency polygons.
To construct a frequency polygon, first examine the data and decide on the number of intervals, or class intervals, to use on
the x-axis and y-axis. After choosing the appropriate ranges, begin plotting the data points. After all the points are plotted,
draw line segments to connect them.
Example 2.11
Table 2.14
Figure 2.8
The first label on the x-axis is 44.5. This represents an interval extending from 39.5 to 49.5. Since the lowest test
score is 54.5, this interval is used only to allow the graph to touch the x-axis. The point labeled 54.5 represents
the next interval, or the first “real” interval from the table, and contains five scores. This reasoning is followed
for each of the remaining intervals with the point 104.5 representing the interval from 99.5 to 109.5. Again, this
interval contains no data and is only used so that the graph will touch the x-axis. Looking at the graph, we say
that this distribution is skewed because one side of the graph does not mirror the other side.
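The plotted points of a frequency polygon are the class midpoints paired with the class frequencies, with one empty class added at each end so that the graph touches the x-axis. The sketch below illustrates this using the class boundaries described above (49.5 to 99.5 in steps of 10). Only the first frequency, five, is stated in the text; the other frequencies are placeholder values assumed for illustration, and the function name polygon_points is ours.

```python
def polygon_points(boundaries, frequencies):
    """Return (midpoint, frequency) pairs, padded with an empty class on each side."""
    width = boundaries[1] - boundaries[0]
    # Add one extra boundary below the first class and one above the last.
    padded_bounds = [boundaries[0] - width] + list(boundaries) + [boundaries[-1] + width]
    padded_freqs = [0] + list(frequencies) + [0]
    midpoints = [(lo + hi) / 2 for lo, hi in zip(padded_bounds, padded_bounds[1:])]
    return list(zip(midpoints, padded_freqs))

boundaries = [49.5, 59.5, 69.5, 79.5, 89.5, 99.5]
# First count (5) is from the text; the remaining counts are hypothetical.
frequencies = [5, 10, 30, 40, 15]

points = polygon_points(boundaries, frequencies)
for midpoint, f in points:
    print(midpoint, f)
```

Connecting these points in order with line segments gives the polygon; the first and last points (44.5 and 104.5) have frequency zero.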
2.11 Construct a frequency polygon of U.S. Presidents’ ages at inauguration shown in Table 2.15.
Table 2.15
Frequency polygons are useful for comparing distributions. This is achieved by overlaying the frequency polygons drawn
for different data sets.
Example 2.12
We will construct an overlay frequency polygon comparing the scores from Example 2.11 with the students’
final numeric grade.
Table 2.16
Table 2.17
Figure 2.9
Example 2.13
The following data shows the Annual Consumer Price Index, each month, for ten years. Construct a time series
graph for the Annual Consumer Price Index data only.
Table 2.18
Table 2.19
Solution 2.13
Figure 2.10
2.13 The following table is a portion of a data set from www.worldbank.org. Use the table to construct a time series
graph for CO2 emissions for the United States.
CO2 Emissions
Ukraine United Kingdom United States
2003 352,259 540,640 5,681,664
2004 343,121 540,409 5,790,761
2005 339,029 541,990 5,826,394
2006 327,797 542,045 5,737,615
2007 328,357 528,631 5,828,697
2008 323,657 522,247 5,656,839
2009 272,176 474,579 5,299,563
Table 2.20
aware of such deception, but perhaps more importantly to educate so that others do not make the same errors inadvertently.
Again, the goal is to enlighten with visuals that tell the story of the data. Pie charts have a number of common problems
when used to convey the message of the data. Too many pieces of the pie overwhelm the reader. No more than perhaps five or
six categories ought to be used to give an idea of the relative importance of each piece. This is, after all, the goal of a
pie chart: to show which subset matters most relative to the others. If there are more components than this, then perhaps an
alternative approach would be better, or perhaps some can be consolidated into an "other" category. Pie charts cannot show
changes over time, although we see this attempted all too often. In federal, state, and city finance documents, pie charts
are often presented to show the components of revenue available to the governing body for appropriation: income tax, sales
tax, motor vehicle taxes, and so on. In and of itself this is interesting information and can be nicely done with a pie
chart. The error occurs when two years are set side-by-side. Because the total revenues change year to year, but the size
of the pie is fixed, no real information is provided and the relative size of each piece of the pie cannot be meaningfully
compared.
Histograms can be very helpful in understanding the data. Properly presented, they can be a quick visual way to present
probabilities of different categories by the simple visual of comparing relative areas in each category. Here the error,
purposeful or not, is to vary the width of the categories. This of course makes comparison to the other categories impossible.
It does embellish the importance of the category with the expanded width because it has a greater area, inappropriately, and
thus visually "says" that that category has a higher probability of occurrence.
Time series graphs perhaps are the most abused. A plot of some variable across time should never be presented on axes
that change part way across the page either in the vertical or horizontal dimension. Perhaps the time frame is changed from
years to months. Perhaps this is to save space or because monthly data was not available for early years. In either case this
confounds the presentation and destroys any value of the graph. If this is not done to purposefully confuse the reader, then
it certainly is either lazy or sloppy work.
Changing the units of measurement of the axis can smooth out a drop or accentuate one. If you want to show large changes,
then measure the variable in small units, pennies rather than thousands of dollars. And of course to continue the fraud, be
sure that the axis does not begin at zero, zero. If it begins at zero, zero, then it becomes apparent that the axis has been
manipulated.
Perhaps you have a client that is concerned with the volatility of the portfolio you manage. An easy way to present the data
is to use long time periods on the time series graph. Use months or better, quarters rather than daily or weekly data. If that
doesn't get the volatility down then spread the time axis relative to the rate of return or portfolio valuation axis. If you want
to show "quick" dramatic growth, then shrink the time axis. Any positive growth will show visually "high" growth rates.
Do note that if the growth is negative then this trick will show the portfolio is collapsing at a dramatic rate.
Again, the goal of descriptive statistics is to convey meaningful visuals that tell the story of the data. Purposeful
manipulation is fraud and unethical at the worst, but even at its best, making these types of errors will lead to confusion in
the analysis.
Example 2.14
For the following 13 real estate prices, calculate the IQR and determine if any prices are potential outliers. Prices
are in dollars.
389,950; 230,500; 158,000; 479,000; 639,000; 114,950; 5,500,000; 387,000; 659,000; 529,000; 575,000;
488,800; 1,095,000
Solution 2.14
Order the data from smallest to largest.
114,950; 158,000; 230,500; 387,000; 389,950; 479,000; 488,800; 529,000; 575,000; 639,000; 659,000;
1,095,000; 5,500,000
M = 488,800
$Q_1 = \frac{230{,}500 + 387{,}000}{2} = 308{,}750$
$Q_3 = \frac{639{,}000 + 659{,}000}{2} = 649{,}000$
IQR = 649,000 – 308,750 = 340,250
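The remaining step, checking for potential outliers, can be sketched in Python. This is a minimal illustration, not a definitive method: it assumes the quartiles are the medians of the lower and upper halves of the ordered data (as in the solution above) and that the 1.5 × IQR rule used in Example 2.23 sets the cutoffs. The function name `quartiles` is hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3), taking Q1 and Q3 as the medians of the
    lower and upper halves (the middle value is left out when n is odd)."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]      # values below the median position
    upper = data[-half:]     # values above the median position
    return median(lower), median(data), median(upper)

prices = [389_950, 230_500, 158_000, 479_000, 639_000, 114_950, 5_500_000,
          387_000, 659_000, 529_000, 575_000, 488_800, 1_095_000]

q1, m, q3 = quartiles(prices)
iqr = q3 - q1

# Assumed rule: values beyond 1.5 * IQR from the quartiles are potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [p for p in prices if p < lower_fence or p > upper_fence]

print(q1, m, q3, iqr)
print(lower_fence, upper_fence, outliers)
```

Under these assumptions the upper cutoff is 1,159,375, so only the 5,500,000 price is flagged; the 1,095,000 price falls just inside the cutoff.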
66 Chapter 2 | Descriptive Statistics
Example 2.15
For the two data sets in the test scores example, find the following:
a. The interquartile range. Compare the two interquartile ranges.
b. Any outliers in either set.
Solution 2.15
The five number summary for the day and night classes is
Table 2.21
Example 2.16
Fifty statistics students were asked how much sleep they get per school night (rounded to the nearest hour). The
results were:
AMOUNT OF SLEEP PER                   RELATIVE     CUMULATIVE RELATIVE
SCHOOL NIGHT (HOURS)    FREQUENCY     FREQUENCY    FREQUENCY
4                       2             0.04         0.04
5                       5             0.10         0.14
6                       7             0.14         0.28
7                       12            0.24         0.52
8                       14            0.28         0.80
9                       7             0.14         0.94
10                      3             0.06         1.00
Table 2.22
Find the 28th percentile. Notice the 0.28 in the "cumulative relative frequency" column. Twenty-eight percent of
50 data values is 14 values. There are 14 values less than the 28th percentile. They include the two 4s, the five 5s,
and the seven 6s. The 28th percentile is between the last six and the first seven. The 28th percentile is 6.5.
Find the median. Look again at the "cumulative relative frequency" column and find 0.52. The median is the
50th percentile or the second quartile. 50% of 50 is 25. There are 25 values less than the median. They include the
two 4s, the five 5s, the seven 6s, and eleven of the 7s. The median or 50th percentile is between the 25th, or seven,
and 26th, or seven, values. The median is seven.
Find the third quartile. The third quartile is the same as the 75th percentile. You can "eyeball" this answer. If you
look at the "cumulative relative frequency" column, you find 0.52 and 0.80. When you have all the fours, fives,
sixes and sevens, you have 52% of the data. When you include all the 8s, you have 80% of the data. The 75th
percentile, then, must be an eight. Another way to look at the problem is to find 75% of 50, which is 37.5, and
round up to 38. The third quartile, Q3, is the 38th value, which is an eight. You can check this answer by counting
the values. (There are 37 values below the third quartile and 12 values above.)
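The counting approach used in this example can be sketched in Python. This is an illustration under stated assumptions: the location is k percent of n; when that location is a whole number, the percentile is taken as the average of that value and the next one, and otherwise the location is rounded up. The function name `percentile_from_counts` is hypothetical.

```python
def percentile_from_counts(freq, k):
    """k-th percentile (integer k, 0 < k < 100) from a {value: frequency} table,
    using the counting approach of this example."""
    # Expand the table into the ordered list of individual data values.
    data = [value for value, count in sorted(freq.items()) for _ in range(count)]
    n = len(data)
    whole, remainder = divmod(k * n, 100)   # integer arithmetic avoids rounding error
    if remainder == 0:
        # Location is a whole number: average that value and the next one.
        return (data[whole - 1] + data[whole]) / 2
    # Otherwise round the location up (1-based position whole + 1).
    return data[whole]

sleep_hours = {4: 2, 5: 5, 6: 7, 7: 12, 8: 14, 9: 7, 10: 3}

print(percentile_from_counts(sleep_hours, 28))  # 28th percentile
print(percentile_from_counts(sleep_hours, 50))  # median
print(percentile_from_counts(sleep_hours, 75))  # third quartile
```

The three printed values agree with the answers worked out above (6.5, 7, and 8).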
2.16 Forty bus drivers were asked how many hours they spend each day running their routes (rounded to the nearest
hour). Find the 65th percentile.
Table 2.23
Example 2.17
Solution 2.17
Using the data from the frequency table, we have:
a. The 80th percentile is between the last eight and the first nine in the table (between the 40th and 41st values).
Therefore, we need to take the mean of the 40th and 41st values. The 80th percentile = $\frac{8 + 9}{2} = 8.5$
b. The 90th percentile will be the 45th data value (location is 0.90(50) = 45) and the 45th data value is nine.
c. Q1 is also the 25th percentile. The 25th percentile location calculation: P25 = 0.25(50) = 12.5 ≈ 13, the 13th
data value. Thus, the 25th percentile is six.
• Calculate $i = \frac{k}{100}(n + 1)$
• If i is an integer, then the kth percentile is the data value in the ith position in the ordered set of data.
• If i is not an integer, then round i up and round i down to the nearest integers. Average the two data values in these two
positions in the ordered data set. This is easier to understand in an example.
Example 2.18
Listed are 29 ages for Academy Award winning best actors in order from smallest to largest.
18; 21; 22; 25; 26; 27; 29; 30; 31; 33; 36; 37; 41; 42; 47; 52; 55; 57; 58; 62; 64; 67; 69; 71; 72; 73; 74; 76; 77
a. Find the 70th percentile.
b. Find the 83rd percentile.
Solution 2.18
a. k = 70
i = the index
n = 29
$i = \frac{k}{100}(n + 1) = \left(\frac{70}{100}\right)(29 + 1) = 21$. Twenty-one is an integer, and the data value in the 21st position in
the ordered data set is 64. The 70th percentile is 64 years.
b. k = 83
i = the index
n = 29
$i = \frac{k}{100}(n + 1) = \left(\frac{83}{100}\right)(29 + 1) = 24.9$, which is NOT an integer. Round it down to 24 and up to 25. The
age in the 24th position is 71 and the age in the 25th position is 72. Average 71 and 72. The 83rd percentile is
71.5 years.
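The index formula can be sketched in Python as follows. This is a minimal illustration assuming the data are already in ascending order and that the index falls inside the data set; the function name `percentile_by_index` is hypothetical.

```python
def percentile_by_index(sorted_data, k):
    """k-th percentile (integer k) using i = k(n + 1)/100 on ordered data."""
    n = len(sorted_data)
    whole, remainder = divmod(k * (n + 1), 100)   # i = whole + remainder/100
    if remainder == 0:
        # i is an integer: take the value in the i-th position.
        return sorted_data[whole - 1]
    # i is not an integer: average the values in the positions on either side of i.
    return (sorted_data[whole - 1] + sorted_data[whole]) / 2

ages = [18, 21, 22, 25, 26, 27, 29, 30, 31, 33, 36, 37, 41, 42, 47,
        52, 55, 57, 58, 62, 64, 67, 69, 71, 72, 73, 74, 76, 77]

print(percentile_by_index(ages, 70))  # 70th percentile
print(percentile_by_index(ages, 83))  # 83rd percentile
```

The two results match parts a and b of the solution (64 and 71.5 years).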
2.18 Listed are 29 ages for Academy Award winning best actors in order from smallest to largest.
18; 21; 22; 25; 26; 27; 29; 30; 31; 33; 36; 37; 41; 42; 47; 52; 55; 57; 58; 62; 64; 67; 69; 71; 72; 73; 74; 76; 77
Calculate the 20th percentile and the 55th percentile.
Example 2.19
Listed are 29 ages for Academy Award winning best actors in order from smallest to largest.
18; 21; 22; 25; 26; 27; 29; 30; 31; 33; 36; 37; 41; 42; 47; 52; 55; 57; 58; 62; 64; 67; 69; 71; 72; 73; 74; 76; 77
a. Find the percentile for 58.
b. Find the percentile for 25.
Solution 2.19
a. Counting from the bottom of the list, there are 18 data values less than 58. There is one value of 58.
x = 18 and y = 1. $\frac{x + 0.5y}{n}(100) = \frac{18 + 0.5(1)}{29}(100) = 63.79$. 58 is the 64th percentile.
b. Counting from the bottom of the list, there are three data values less than 25. There is one value of 25.
x = 3 and y = 1. $\frac{x + 0.5y}{n}(100) = \frac{3 + 0.5(1)}{29}(100) = 12.07$. Twenty-five is the 12th percentile.
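The calculation in this solution, where x counts the values below the data value and y counts the values equal to it, can be sketched in Python. The function name `percentile_rank` is hypothetical.

```python
def percentile_rank(data, value):
    """Percentile of `value` within `data`: (x + 0.5y) / n * 100."""
    x = sum(1 for d in data if d < value)    # number of values below
    y = sum(1 for d in data if d == value)   # number of values equal
    return (x + 0.5 * y) / len(data) * 100

ages = [18, 21, 22, 25, 26, 27, 29, 30, 31, 33, 36, 37, 41, 42, 47,
        52, 55, 57, 58, 62, 64, 67, 69, 71, 72, 73, 74, 76, 77]

print(round(percentile_rank(ages, 58)))  # rounds to the 64th percentile
print(round(percentile_rank(ages, 25)))  # rounds to the 12th percentile
```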
Understanding how to interpret percentiles properly is important not only when describing data, but also when calculating
probabilities in later chapters of this text.
NOTE
When writing the interpretation of a percentile in the context of the given data, the sentence should contain the
following information.
• information about the context of the situation being considered
• the data value (value of the variable) that represents the percentile
• the percent of individuals or items with data values below the percentile
• the percent of individuals or items with data values above the percentile.
Example 2.20
On a timed math test, the first quartile for time it took to finish the exam was 35 minutes. Interpret the first quartile
in the context of this situation.
Solution 2.20
• Twenty-five percent of students finished the exam in 35 minutes or less.
• Seventy-five percent of students finished the exam in 35 minutes or more.
• A low percentile could be considered good, as finishing more quickly on a timed exam is desirable. (If you
take too long, you might not be able to finish.)
Example 2.21
On a 20 question math test, the 70th percentile for number of correct answers was 16. Interpret the 70th percentile
in the context of this situation.
Solution 2.21
• Seventy percent of students answered 16 or fewer questions correctly.
• Thirty percent of students answered 16 or more questions correctly.
• A higher percentile could be considered good, as answering more questions correctly is desirable.
2.21 On a 60 point written assignment, the 80th percentile for the number of points earned was 49. Interpret the 80th
percentile in the context of this situation.
Example 2.22
At a community college, it was found that the 30th percentile of credit units that students are enrolled for is seven
units. Interpret the 30th percentile in the context of this situation.
Solution 2.22
• Thirty percent of students are enrolled in seven or fewer credit units.
Example 2.23
Sharpe Middle School is applying for a grant that will be used to add fitness equipment to the gym. The principal
surveyed 15 anonymous students to determine how many minutes a day the students spend exercising. The results
from the 15 anonymous students are shown.
0 minutes; 40 minutes; 60 minutes; 30 minutes; 60 minutes
10 minutes; 45 minutes; 30 minutes; 300 minutes; 90 minutes;
30 minutes; 120 minutes; 60 minutes; 0 minutes; 20 minutes
Determine the following five values.
Min = 0
Q1 = 20
Med = 40
Q3 = 60
Max = 300
If you were the principal, would you be justified in purchasing new fitness equipment? Since 75% of the students
exercise for 60 minutes or less daily, and since the IQR is 40 minutes (60 – 20 = 40), we know that half of the
students surveyed exercise between 20 minutes and 60 minutes daily. This seems a reasonable amount of time
spent exercising, so the principal would be justified in purchasing the new equipment.
However, the principal needs to be careful. The value 300 appears to be a potential outlier.
Q3 + 1.5(IQR) = 60 + (1.5)(40) = 120.
The value 300 is greater than 120 so it is a potential outlier. If we delete it and calculate the five values, we get
the following values:
Min = 0
Q1 = 20
Q3 = 60
Max = 120
We still have 75% of the students exercising for 60 minutes or less daily and half of the students exercising
between 20 and 60 minutes a day. However, 15 students is a small sample and the principal should survey more
students to be sure of his survey results.
NOTE
The words “mean” and “average” are often used interchangeably. The substitution of one word for the other is
common practice. The technical term is “arithmetic mean” and “average” is technically a center location. Formally,
the arithmetic mean is called the first moment of the distribution by mathematicians. However, in practice among
non-statisticians, “average” is commonly accepted for “arithmetic mean.”
When each value in the data set is not unique, the mean can be calculated by multiplying each distinct value by its frequency
and then dividing the sum by the total number of data values. The letter used to represent the sample mean is an x with a
bar over it (pronounced “x bar”): $\bar{x}$.
The Greek letter μ (pronounced "mew") represents the population mean. One of the requirements for the sample mean to
be a good estimate of the population mean is for the sample taken to be truly random.
To see that both ways of calculating the mean are the same, consider the sample:
1; 1; 1; 2; 2; 3; 4; 4; 4; 4; 4
$\bar{x} = \frac{1 + 1 + 1 + 2 + 2 + 3 + 4 + 4 + 4 + 4 + 4}{11} = 2.7$
$\bar{x} = \frac{3(1) + 2(2) + 1(3) + 5(4)}{11} = 2.7$
In the second calculation, the frequencies are 3, 2, 1, and 5.
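A short Python sketch shows that the two calculations agree. The variable names are illustrative only.

```python
from collections import Counter

sample = [1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 4]

# First way: add every value and divide by the number of values.
mean_direct = sum(sample) / len(sample)

# Second way: multiply each distinct value by its frequency, then divide
# the sum of those products by the total number of values.
freq = Counter(sample)
mean_weighted = sum(value * f for value, f in freq.items()) / sum(freq.values())

print(dict(freq))
print(round(mean_direct, 1), round(mean_weighted, 1))
```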
You can quickly find the location of the median by using the expression $\frac{n + 1}{2}$.
The letter n is the total number of data values in the sample. If n is an odd number, the median is the middle value of
the ordered data (ordered smallest to largest). If n is an even number, the median is equal to the two middle values added
together and divided by two after the data has been ordered. For example, if the total number of data values is 97, then
$\frac{n + 1}{2} = \frac{97 + 1}{2} = 49$. The median is the 49th value in the ordered data. If the total number of data values is 100, then
$\frac{n + 1}{2} = \frac{100 + 1}{2} = 50.5$. The median occurs midway between the 50th and 51st values. The location of the median and
the value of the median are not the same. The upper case letter M is often used to represent the median. The next example
illustrates the location of the median and the value of the median.
Example 2.24
AIDS data indicating the number of months a patient with AIDS lives after taking a new antibody drug are as
follows (smallest to largest):
3; 4; 8; 8; 10; 11; 12; 13; 14; 15; 15; 16; 16; 17; 17; 18; 21; 22; 22; 24; 24; 25; 26; 26; 27; 27; 29; 29; 31; 32; 33;
33; 34; 34; 35; 37; 40; 44; 44; 47;
Calculate the mean and the median.
Solution 2.24
The calculation for the mean is:
$\bar{x} = \frac{3 + 4 + (8)(2) + 10 + 11 + 12 + 13 + 14 + (15)(2) + (16)(2) + \dots + 35 + 37 + 40 + (44)(2) + 47}{40} = 23.6$
To find the median, M, first use the formula for the location. The location is:
$\frac{n + 1}{2} = \frac{40 + 1}{2} = 20.5$
Starting at the smallest value, the median is located between the 20th and 21st values (the two 24s):
3; 4; 8; 8; 10; 11; 12; 13; 14; 15; 15; 16; 16; 17; 17; 18; 21; 22; 22; 24; 24; 25; 26; 26; 27; 27; 29; 29; 31; 32; 33;
33; 34; 34; 35; 37; 40; 44; 44; 47
$M = \frac{24 + 24}{2} = 24$
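The difference between the location of the median and its value can be sketched in Python. This assumes the data are already ordered; the function name `median_with_location` is hypothetical.

```python
def median_with_location(sorted_data):
    """Return (location, value) of the median for data in ascending order."""
    n = len(sorted_data)
    location = (n + 1) / 2                      # position, not the value itself
    if n % 2 == 1:
        value = sorted_data[(n + 1) // 2 - 1]   # single middle value
    else:
        # Average the two values on either side of the location.
        value = (sorted_data[n // 2 - 1] + sorted_data[n // 2]) / 2
    return location, value

months = [3, 4, 8, 8, 10, 11, 12, 13, 14, 15, 15, 16, 16, 17, 17, 18, 21, 22, 22,
          24, 24, 25, 26, 26, 27, 27, 29, 29, 31, 32, 33, 33, 34, 34, 35, 37, 40,
          44, 44, 47]

location, value = median_with_location(months)
mean = sum(months) / len(months)
print(location, value, round(mean, 1))
```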
Example 2.25
Suppose that in a small town of 50 people, one person earns $5,000,000 per year and the other 49 each earn
$30,000. Which is the better measure of the "center": the mean or the median?
Solution 2.25
$\bar{x} = \frac{5{,}000{,}000 + 49(30{,}000)}{50} = 129{,}400$
M = 30,000
(There are 49 people who earn $30,000 and one person who earns $5,000,000.)
The median is a better measure of the "center" than the mean because 49 of the values are 30,000 and one is
5,000,000. The 5,000,000 is an outlier. The 30,000 gives us a better sense of the middle of the data.
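The effect of the single large income can be checked with Python's standard `statistics` module. This is only a quick sketch of the comparison made in this solution.

```python
from statistics import mean, median

# One person earns $5,000,000; the other 49 each earn $30,000.
incomes = [5_000_000] + [30_000] * 49

print(mean(incomes))    # pulled far above what 49 of the 50 people earn
print(median(incomes))  # stays at the typical income
```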
Another measure of the center is the mode. The mode is the most frequent value. There can be more than one mode in a
data set as long as those values have the same frequency and that frequency is the highest. A data set with two modes is
called bimodal.
Example 2.26
Solution 2.26
The most frequent score is 72, which occurs five times. Mode = 72.
Example 2.27
Five real estate exam scores are 430, 430, 480, 480, 495. The data set is bimodal because the scores 430 and 480
each occur twice.
When is the mode the best measure of the "center"? Consider a weight loss program that advertises a mean weight
loss of six pounds the first week of the program. The mode might indicate that most people lose two pounds the
first week, making the program less appealing.
NOTE
The mode can be calculated for qualitative data as well as for quantitative data. For example, if the data set
is: red, red, red, green, green, yellow, purple, black, blue, the mode is red.
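Modes can be found with `multimode` from Python's standard `statistics` module (available in Python 3.8 and later), which returns every value tied for the highest frequency. The sketch below reuses the bimodal exam scores and the color data from this section.

```python
from statistics import multimode

scores = [430, 430, 480, 480, 495]
colors = ["red", "red", "red", "green", "green",
          "yellow", "purple", "black", "blue"]

print(multimode(scores))  # two modes: the data set is bimodal
print(multimode(colors))  # works for qualitative data as well
```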
frequencies); therefore, you cannot compute an exact mean for the data set. What we must do is estimate the actual mean
by calculating the mean of a frequency table. A frequency table is a data representation in which grouped data is displayed
along with the corresponding frequencies. To calculate the mean from a grouped frequency table we can apply the basic
definition of mean: $\text{mean} = \frac{\text{data sum}}{\text{number of data values}}$. We simply need to modify the definition to fit within the restrictions
of a frequency table.
Since we do not know the individual data values we can instead find the midpoint of each interval. The midpoint
is $\frac{\text{lower boundary} + \text{upper boundary}}{2}$. We can now modify the mean definition to be
$\text{Mean of Frequency Table} = \frac{\sum fm}{\sum f}$, where f = the frequency of the interval and m = the midpoint of the interval.
Example 2.28
A frequency table displaying Professor Blount’s last statistics test is shown. Find the best estimate of the class
mean.
Table 2.24
Solution 2.28
• Find the midpoints for all intervals
Table 2.25
• Calculate the sum of the products of each interval frequency and midpoint: $\sum fm$
• $\mu = \frac{\sum fm}{\sum f} = \frac{1460.25}{19} = 76.86$
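The same steps can be sketched in Python. The intervals and frequencies below are hypothetical values chosen only to illustrate the calculation; they are not the data from Table 2.24. The function name `grouped_mean` is also hypothetical.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
table = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

def grouped_mean(rows):
    """Estimate the mean of grouped data as sum(f * m) / sum(f),
    where m is the midpoint of each interval."""
    total_fm = sum(f * (lower + upper) / 2 for lower, upper, f in rows)
    total_f = sum(f for _, _, f in rows)
    return total_fm / total_f

print(grouped_mean(table))  # (2*5 + 5*15 + 3*25) / 10
```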
2.28 Maris conducted a study on the effect that playing video games has on memory recall. As part of her study, she
compiled the following data:
Table 2.26
What is the best estimate for the mean number of hours spent playing video games?
This unit is here to remind you of material that you once studied and said at the time “I am sure that I will never need this!”
Here are the formulas for a population mean and the sample mean. The Greek letter μ is the symbol for the population
mean and $\bar{x}$ is the symbol for the sample mean. Both formulas have a mathematical symbol that tells us how to make
the calculations. It is called Sigma notation because the symbol is the Greek capital letter sigma: Σ. Like all mathematical
symbols it tells us what to do: just as the plus sign tells us to add and the x tells us to multiply. These are called mathematical
operators. The Σ symbol tells us to add a specific list of numbers.
Let’s say we have a sample of animals from the local animal shelter and we are interested in their average age. If we list
each value, or observation, in a column, you can give each one an index number. The first number will be number 1 and the
second number 2 and so on.
Animal Age
1 9
2 1
3 8.5
4 10.5
5 10
6 8.5
7 12
8 8
9 1
10 9.5
Table 2.27
Each observation represents a particular animal in the sample. Purr is animal number one and is a 9 year old cat, Toto is
animal number 2 and is a 1 year old puppy and so on.
To calculate the mean we are told by the formula to add up all these numbers, ages in this case, and then divide the sum by
10, the total number of animals in the sample.
Animal number one, the cat Purr, is designated as X1, animal number 2, Toto, is designated as X2 and so on through Dundee
who is animal number 10 and is designated as X10.
The i in the formula tells us which of the observations to add together. In this case it is X1 through X10 which is all of them.
We know which ones to add by the indexing notation, the i = 1 and the n or capital N for the population. For this example
the indexing notation would be i = 1 and because it is a sample we use a small n on the top of the Σ which would be 10.
The standard deviation requires the same mathematical operator and so it would be helpful to recall this knowledge from
your past.
The sum of the ages is found to be 78 and dividing by 10 gives us the sample mean age as 7.8 years.
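The summation described here, adding the observations from i = 1 up to n and dividing by n, can be written out as an explicit loop in Python. This is a sketch for illustration; in practice `sum(ages) / len(ages)` gives the same result.

```python
ages = [9, 1, 8.5, 10.5, 10, 8.5, 12, 8, 1, 9.5]   # X1 through X10
n = len(ages)

total = 0
for i in range(1, n + 1):      # i runs from 1 to n, as under and over the sigma
    total += ages[i - 1]       # Python lists start at 0, so X_i is ages[i - 1]

sample_mean = total / n
print(total, sample_mean)
```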
The formula for the geometric mean, $\tilde{x}$, is $\tilde{x} = \left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}}$,
where Π is another mathematical operator that tells us to multiply all the $x_i$ numbers in the same way capital Greek sigma
tells us to add all the $x_i$ numbers. Remember that a fractional exponent calls for a root of the number; thus an
exponent of 1/3 is the cube root of the number.
The geometric mean answers the question, “if all the quantities had the same value, what would that value have to be in
order to achieve the same product?” The geometric mean gets its name from the fact that when redistributed in this way
the sides form a geometric shape for which all sides have the same length. To see this, take the example of the numbers
10, 51.2 and 8. The geometric mean is found by multiplying these three numbers together (4,096) and taking the cube
root because there are three numbers among which this product is to be distributed. Thus the geometric mean of these three
numbers is 16. This describes a 16 × 16 × 16 cube, which has a volume of 4,096 units.
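The product-then-root calculation can be sketched in Python using `math.prod` (Python 3.8 and later). The function name `geometric_mean` is illustrative; floating-point arithmetic may return a value that differs from 16 in the last decimal places, so the result is rounded for display.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("the geometric mean requires positive values")
    return math.prod(values) ** (1 / len(values))

sides = [10, 51.2, 8]
print(math.prod(sides))                  # the product to be redistributed
print(round(geometric_mean(sides), 6))   # the common side length
```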
The geometric mean is relevant in Economics and Finance for dealing with growth: growth of markets, investments,
populations, and other variables whose growth is of interest. Imagine that our box of 4,096 units (perhaps dollars)
is the value of an investment after three years and that the investment returns in percents were the three numbers in our
example. The geometric mean will provide us with the answer to the question, what is the average rate of return: 16 percent.
The arithmetic mean of these three numbers is 23.1 percent. The reason for this difference, 16 versus 23.1, is that the
arithmetic mean is additive and thus does not account for the interest on the interest, compound interest, embedded in the
investment growth process. The same issue arises when asking for the average rate of growth of a population or sales or
market penetration, etc., knowing the annual rates of growth. The formula for the geometric mean rate of return, or any
other growth rate, is:
$r_s = \left(x_1 \cdot x_2 \cdots x_n\right)^{\frac{1}{n}} - 1$
Manipulating the formula for the geometric mean can also provide a calculation of the average rate of growth between two
periods knowing only the initial value $a_0$, the ending value $a_n$, and the number of periods, n. The following formula
provides this information:
$\left(\frac{a_n}{a_0}\right)^{\frac{1}{n}} = \tilde{x}$
Finally, we note that the formula for the geometric mean requires that all numbers be positive, greater than zero. The reason
of course is that the root of a negative number is undefined for use outside of mathematical theory. There are ways to
avoid this problem however. In the case of rates of return and other simple growth problems we can convert the negative
values to meaningful positive equivalent values. Imagine that the annual returns for the past three years are +12%, -8%, and
+2%. Using the decimal multiplier equivalents of 1.12, 0.92, and 1.02, allows us to compute a geometric mean of 1.0167.
Subtracting 1 from this value gives the geometric mean of +1.67% as a net rate of population growth (or financial return).
From this example we can see that the geometric mean provides us with this formula for calculating the geometric (mean)
rate of return for a series of annual rates of return:
$r_s = \tilde{x} - 1$
where $r_s$ is the average rate of return and $\tilde{x}$ is the geometric mean of the returns during some number of time periods. Note
that the length of each time period must be the same.
As a general rule one should convert the percent values to their decimal equivalent multipliers. It is important to recognize
that when dealing with percents, the geometric mean of percent values does not equal the geometric mean of the decimal
multiplier equivalents and it is the decimal multiplier equivalent geometric mean that is relevant.
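The conversion to decimal multipliers can be sketched in Python with `statistics.geometric_mean` (Python 3.8 and later). The function name `average_growth_rate` and the starting value of 100 are hypothetical, chosen only to show that the endpoint formula gives the same rate.

```python
from statistics import geometric_mean

# Annual returns of +12%, -8%, and +2%, expressed as decimal multipliers.
returns = [0.12, -0.08, 0.02]
multipliers = [1 + r for r in returns]

# Geometric mean rate of return: geometric mean of the multipliers, minus 1.
avg_rate = geometric_mean(multipliers) - 1
print(round(avg_rate * 100, 2), "percent per year")

def average_growth_rate(a0, an, periods):
    """Average growth rate per period from the initial and ending values only."""
    return (an / a0) ** (1 / periods) - 1

# Hypothetical starting value of 100 grown by the three multipliers above.
ending = 100 * multipliers[0] * multipliers[1] * multipliers[2]
print(round(average_growth_rate(100, ending, 3) * 100, 2), "percent per year")
```

Both calculations give the same rate, about 1.67 percent, because the ending value divided by the initial value is simply the product of the multipliers.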
Figure 2.11
The histogram displays a symmetrical distribution of data. A distribution is symmetrical if a vertical line can be drawn at
some point in the histogram such that the shape to the left and the right of the vertical line are mirror images of each other.
The mean, the median, and the mode are each seven for these data. In a perfectly symmetrical distribution, the mean
and the median are the same. This example has one mode (unimodal), and the mode is the same as the mean and median.
In a symmetrical distribution that has two modes (bimodal), the two modes would be different from the mean and median.
The histogram for the data: 4; 5; 6; 6; 6; 7; 7; 7; 7; 8 is not symmetrical. The right-hand side seems "chopped off" compared
to the left side. A distribution of this type is called skewed to the left because it is pulled out to the left. We can formally
measure the skewness of a distribution just as we can mathematically measure the center weight of the data or its general
"spreadness". The mathematical formula for skewness is: $a_3 = \sum \frac{(x_i - \bar{x})^3}{n s^3}$. The greater the deviation from zero, the greater
the degree of skewness. If the skewness is negative then the distribution is skewed left as in Figure 2.12. A positive
measure of skewness indicates right skewness such as Figure 2.13.
Figure 2.12
The mean is 6.3, the median is 6.5, and the mode is seven. Notice that the mean is less than the median, and they are
both less than the mode. The mean and the median both reflect the skewing, but the mean reflects it more so.
The histogram for the data: 6; 7; 7; 7; 7; 8; 8; 8; 9; 10, is also not symmetrical. It is skewed to the right.
Figure 2.13
The mean is 7.7, the median is 7.5, and the mode is seven. Of the three statistics, the mean is the largest, while the mode
is the smallest. Again, the mean reflects the skewing the most.
To summarize, generally if the distribution of data is skewed to the left, the mean is less than the median, which is often less
than the mode. If the distribution of data is skewed to the right, the mode is often less than the median, which is less than
the mean.
As with the mean, median and mode, and as we will see shortly, the variance, there are mathematical formulas that give us
precise measures of these characteristics of the distribution of the data. Again looking at the formula for skewness we see
that this is a relationship between the mean of the data and the individual observations cubed.
$a_3 = \sum \frac{(x_i - \bar{x})^3}{n s^3}$
where s is the sample standard deviation of the data, $X_i$, $\bar{x}$ is the arithmetic mean, and n is the sample size.
Formally the arithmetic mean is known as the first moment of the distribution. The second moment we will see is the
variance, and skewness is the third moment. The variance measures the squared differences of the data from the mean and
skewness measures the cubed differences of the data from the mean. While a variance can never be a negative number,
the measure of skewness can and this is how we determine if the data are skewed right or left. The skewness for a normal
distribution is zero, and any symmetric data should have skewness near zero. Negative values for the skewness indicate
data that are skewed left and positive values for the skewness indicate data that are skewed right. By skewed left, we mean
that the left tail is long relative to the right tail. Similarly, skewed right means that the right tail is long relative to the left
tail. The skewness characterizes the degree of asymmetry of a distribution around its mean. While the mean and standard
deviation are dimensional quantities (this is why we will take the square root of the variance), that is, they have the same units
as the measured quantities $X_i$, the skewness is conventionally defined in such a way as to make it nondimensional. It is a
pure number that characterizes only the shape of the distribution. A positive value of skewness signifies a distribution with
an asymmetric tail extending out towards more positive X and a negative value signifies a distribution whose tail extends
out towards more negative X. A zero measure of skewness will indicate a symmetrical distribution.
Skewness and symmetry become important when we discuss probability distributions in later chapters.
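The skewness formula can be sketched in Python and applied to the two data sets shown in the histograms above. This is a direct translation of the formula, using the sample standard deviation from the standard `statistics` module; other software may use a slightly different definition of skewness, so values can differ between packages.

```python
from statistics import mean, stdev

def skewness(data):
    """a3 = sum((x_i - x_bar)^3) / (n * s^3), with s the sample standard deviation."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)
    return sum((x - x_bar) ** 3 for x in data) / (n * s ** 3)

left_skewed = [4, 5, 6, 6, 6, 7, 7, 7, 7, 8]
right_skewed = [6, 7, 7, 7, 7, 8, 8, 8, 9, 10]

print(skewness(left_skewed))    # negative: skewed left
print(skewness(right_skewed))   # positive: skewed right
```

Because the two data sets are mirror images of each other, the two skewness values have the same size and opposite signs.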
is measured in pounds-squared. One reason to use the standard deviation is to return to the original units of measurement
by taking the square root of the variance. Further, squaring the deviations greatly inflates their values. For example, a
deviation of 10 from the mean when squared is 100, but a deviation of 100 from the mean is 10,000. What this does is place
great weight on outliers when calculating the variance.
Types of Variability in Samples
When trying to study a population, a sample is often used, either for convenience or because it is not possible to access the
entire population. Variability is the term used to describe the differences that may occur in these outcomes. Common types
of variability include the following:
• Observational or measurement variability
• Natural variability
• Induced variability
• Sample variability
Here are some examples to describe each type of variability.
Example 1: Measurement variability
Measurement variability occurs when there are differences in the instruments used to measure or in the people using those
instruments. If we are gathering data on how long it takes for a ball to drop from a height by having students measure the
time of the drop with a stopwatch, we may experience measurement variability if the two stopwatches used were made by
different manufacturers: For example, one stopwatch measures to the nearest second, whereas the other one measures to the
nearest tenth of a second. We also may experience measurement variability because two different people are gathering the
data. Their reaction times in pressing the button on the stopwatch may differ; thus, the outcomes will vary accordingly. The
differences in outcomes may be affected by measurement variability.
Example 2: Natural variability
Natural variability arises from the differences that naturally occur because members of a population differ from each other.
For example, if we have two identical corn plants and we expose both plants to the same amount of water and sunlight,
they may still grow at different rates simply because they are two different corn plants. The difference in outcomes may be
explained by natural variability.
Example 3: Induced variability
Induced variability is the counterpart to natural variability; this occurs because we have artificially induced an element of
variation (that, by definition, was not present naturally): For example, we assign people to two different groups to study
memory, and we induce a variable in one group by limiting the amount of sleep they get. The difference in outcomes may
be affected by induced variability.
Example 4: Sample variability
Sample variability occurs when multiple random samples are taken from the same population. For example, if I conduct four
surveys of 50 people randomly selected from a given population, the differences in outcomes may be affected by sample
variability.
Example 2.29
In a fifth grade class, the teacher was interested in the average age and the sample standard deviation of the ages of
her students. The following data are the ages for a SAMPLE of n = 20 fifth grade students. The ages are rounded
to the nearest half year:
9; 9.5; 9.5; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5;
$\bar{x} = \frac{9 + 9.5(2) + 10(4) + 10.5(4) + 11(6) + 11.5(3)}{20} = 10.525$
The average age is 10.53 years, rounded to two places.
The variance may be calculated by using a table. Then the standard deviation is calculated by taking the square
root of the variance. We will explain the parts of the table after calculating s.
Table 2.28
The sample variance, $s^2$, is equal to the sum of the last column (9.7375) divided by the total number of data values
minus one (20 – 1):
$s^2 = \frac{9.7375}{20 - 1} = 0.5125$
The sample standard deviation s is equal to the square root of the sample variance:
$s = \sqrt{0.5125} = 0.715891$, which is rounded to two decimal places, s = 0.72.
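The same calculation can be sketched in Python, step by step: the deviations from the mean, their squares, the sample variance, and the sample standard deviation. The variable names are illustrative only.

```python
from math import sqrt

ages = [9, 9.5, 9.5, 10, 10, 10, 10, 10.5, 10.5, 10.5, 10.5,
        11, 11, 11, 11, 11, 11, 11.5, 11.5, 11.5]

n = len(ages)
x_bar = sum(ages) / n

# Squared deviation of every data value from the mean.
squared_deviations = [(x - x_bar) ** 2 for x in ages]

variance = sum(squared_deviations) / (n - 1)   # divide by n - 1 for a sample
s = sqrt(variance)

print(x_bar, round(sum(squared_deviations), 4), round(variance, 4), round(s, 2))
```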
Example 2.30
Use the following data (first exam scores) from Susan Dean's spring pre-calculus class:
33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94;
96; 100
a. Create a chart containing the data, frequencies, relative frequencies, and cumulative relative frequencies to
three decimal places.
b. Calculate the following to one decimal place:
i. The sample mean
ii. The sample standard deviation
iii. The median
iv. The first quartile
v. The third quartile
vi. IQR
Solution 2.30
a. See Table 2.29
b. i. The sample mean = 73.5
ii. The sample standard deviation = 17.9
iii. The median = 73
iv. The first quartile = 61
v. The third quartile = 90
vi. IQR = 90 – 61 = 29
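The values in part b can be checked with Python's standard `statistics` module. This is a sketch under one assumption worth noting: `quantiles` with its default `method="exclusive"` places the quartiles at positions based on n + 1, which for these 31 scores lands exactly on the 8th, 16th, and 24th values. Other quartile conventions can give slightly different results on other data sets.

```python
from statistics import mean, median, quantiles, stdev

scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]

q1, q2, q3 = quantiles(scores, n=4)   # default method="exclusive"

print(round(mean(scores), 1))    # sample mean
print(round(stdev(scores), 1))   # sample standard deviation
print(median(scores), q1, q3, q3 - q1)
```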
Table 2.29
Just as we could not find the exact mean, neither can we find the exact standard deviation. Remember that standard deviation
describes numerically the expected deviation a data value has from the mean. In simple English, the standard deviation
allows us to compare how “unusual” individual data is compared to the mean.
Example 2.31
Class    Frequency, f    Midpoint, m    f·m    $f(m - \bar{x})^2$
Table 2.30
For this data set, we have the mean, $\bar{x} = 7.58$, and the standard deviation, $s_x = 3.5$. This means that a randomly
selected data value would be expected to be 3.5 units from the mean. If we look at the first class, we see that the
class midpoint is equal to one. This is almost two full standard deviations from the mean since 7.58 – 3.5 – 3.5 =
0.58. While the formula for calculating the standard deviation is not complicated,
$s_x = \sqrt{\frac{\sum f(m - \bar{x})^2}{n - 1}}$
where $s_x$ = sample standard deviation, $\bar{x}$ = sample mean, the calculations are tedious. It is usually best to use
technology when performing the calculations.
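As one way of letting technology do the tedious part, the following Python sketch applies the grouped-data formula to class midpoints and frequencies. The midpoints and frequencies are assumed values chosen only to be consistent with the mean of 7.58 and standard deviation of 3.5 stated above; they are not taken from Table 2.30, and the function name is illustrative.

```python
from math import sqrt


def grouped_mean_and_sd(midpoints, freqs):
    """Approximate sample mean and standard deviation from class midpoints m and frequencies f."""
    n = sum(freqs)
    x_bar = sum(f * m for m, f in zip(midpoints, freqs)) / n
    sum_sq = sum(f * (m - x_bar) ** 2 for m, f in zip(midpoints, freqs))
    return x_bar, sqrt(sum_sq / (n - 1))


# Hypothetical grouped data (assumption, for illustration only)
midpoints = [1, 4, 7, 10, 13, 16]
freqs = [1, 6, 10, 7, 0, 2]

x_bar, s_x = grouped_mean_and_sd(midpoints, freqs)
print("approximate mean:", round(x_bar, 2))
print("approximate standard deviation:", round(s_x, 1))
```

Because every value in a class is treated as if it sat at the class midpoint, both results are estimates rather than exact values.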
Sample: $x = \bar{x} + zs$; $z = \frac{x - \bar{x}}{s}$
Population: $x = \mu + z\sigma$; $z = \frac{x - \mu}{\sigma}$
Table 2.31
Example 2.32
Two students, John and Ali, from different high schools, wanted to find out who had the highest GPA when
compared to his school. Which student had the highest GPA when compared to his school?
Table 2.32
Solution 2.32
For each student, determine how many standard deviations (#ofSTDEVs) his GPA is away from the average, for
his school. Pay careful attention to signs when comparing and interpreting the answer.
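This kind of comparison can be sketched in Python. The function names are illustrative, and the sample figures are the batting averages listed in Table 2.59 of this chapter rather than the GPA values, so the sketch shows the method and not the answer to this example.

```python
def z_score(x, mean, sd):
    """Signed number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def value_from_z(z, mean, sd):
    """Inverse relationship: the value that lies z standard deviations from the mean."""
    return mean + z * sd


# (individual value, group mean, group standard deviation) from Table 2.59
players = {
    "Fredo": (0.158, 0.166, 0.012),
    "Karl": (0.177, 0.189, 0.015),
}

z = {name: z_score(*figures) for name, figures in players.items()}
for name, score in z.items():
    print(name, round(score, 2))

# The larger (less negative) z-score is the better result relative to its own group
best = max(z, key=z.get)
print("higher relative to own team:", best)
```

Both z-scores are negative, which is why the sign matters: each player is below his team's mean, and the one closer to zero is the stronger result relative to his team.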
2.32 Two swimmers, Angie and Beth, from different teams, wanted to find out who had the fastest time for the 50
meter freestyle when compared to her team. Which swimmer had the fastest time when compared to her team?
Table 2.33
The following lists give a few facts that provide a little more insight into what the standard deviation tells us about the
distribution of the data.
For ANY data set, no matter what the distribution of the data is:
• At least 75% of the data is within two standard deviations of the mean.
• At least 89% of the data is within three standard deviations of the mean.
• At least 95% of the data is within 4.5 standard deviations of the mean.
• This is known as Chebyshev's Rule.
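The three percentages above follow from the general form of Chebyshev's Rule, which guarantees that at least 1 – 1/k² of the data lies within k standard deviations of the mean for any k > 1. A minimal Python sketch, with an illustrative function name:

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4.5):
    print(k, "standard deviations: at least", round(100 * chebyshev_lower_bound(k), 1), "%")
```

These are guaranteed minimums. A particular data set will often have a larger share of its values inside each interval.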
For data having a Normal Distribution, which we will examine in great detail later:
• Approximately 68% of the data is within one standard deviation of the mean.
• Approximately 95% of the data is within two standard deviations of the mean.
• More than 99% of the data is within three standard deviations of the mean.
• This is known as the Empirical Rule.
• It is important to note that this rule only applies when the shape of the distribution of the data is bell-shaped and
symmetric. We will learn more about this when studying the "Normal" or "Gaussian" probability distribution in later
chapters.
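The Empirical Rule percentages can be checked against the normal distribution itself. The sketch below relies on the standard result that the fraction of a normal distribution within k standard deviations of the mean equals erf(k/√2); the function name is illustrative.

```python
from math import erf, sqrt


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(k, "standard deviation(s):", round(100 * fraction_within(k), 2), "%")
```

Comparing these figures with the Chebyshev bounds shows how much more can be said once the shape of the distribution is known to be bell-shaped and symmetric.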
Coefficient of Variation
Another useful way to compare distributions besides simple comparisons of means or standard deviations is to adjust for
differences in the scale of the data being measured. Quite simply, a large variation in data with a large mean is different than
the same variation in data with a small mean. To adjust for the scale of the underlying data the Coefficient of Variation (CV)
has been developed. Mathematically:
$CV = \frac{s}{\bar{x}} \cdot 100$, conditioned upon $\bar{x} \neq 0$, where s is the standard deviation of the data and $\bar{x}$ is the mean.
We can see that this measures the variability of the underlying data as a percentage of the mean value, the center of weight of
the data set. This measure is useful in comparing risk where an adjustment is warranted because of differences in the scale of
two data sets. In effect, the scale is changed to a common scale, percentage differences, which allows direct comparison of the
magnitudes of variation of two or more data sets.
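A short Python sketch of the Coefficient of Variation follows. The two data sets are hypothetical and were chosen so that they have the same standard deviation but very different means; the function name is illustrative.

```python
from statistics import mean, stdev


def coefficient_of_variation(data):
    """Sample standard deviation expressed as a percentage of the sample mean."""
    x_bar = mean(data)
    if x_bar == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(data) / x_bar * 100


# Hypothetical data (assumption): identical spread, different scale
small_scale = [8, 10, 12]
large_scale = [98, 100, 102]

print("standard deviations:", stdev(small_scale), stdev(large_scale))
print("CV, small scale:", round(coefficient_of_variation(small_scale), 1), "%")
print("CV, large scale:", round(coefficient_of_variation(large_scale), 1), "%")
```

The same variation of two units is a much larger share of a mean of 10 than of a mean of 100, which is the adjustment for scale that the CV is meant to capture.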
KEY TERMS
Frequency the number of times a value of the data occurs
Frequency Table a data representation in which grouped data is displayed along with the corresponding frequencies
Histogram a graphical representation in x-y form of the distribution of data in a data set; x represents the data and y
represents the frequency, or relative frequency. The graph consists of contiguous rectangles.
Interquartile Range or IQR, is the range of the middle 50 percent of the data values; the IQR is found by subtracting
the first quartile from the third quartile.
Mean (arithmetic) a number that measures the central tendency of the data; a common name for mean is 'average.' The
term 'mean' is a shortened form of 'arithmetic mean.' By definition, the mean for a sample (denoted by $\bar{x}$) is
$\bar{x} = \frac{\text{Sum of all values in the sample}}{\text{Number of values in the sample}}$, and the mean for a population (denoted by μ) is
$\mu = \frac{\text{Sum of all values in the population}}{\text{Number of values in the population}}$.
Mean (geometric) a measure of central tendency that provides a measure of average geometric growth over multiple
time periods.
Median a number that separates ordered data into halves; half the values are the same number or smaller than the median
and half the values are the same number or larger than the median. The median may or may not be part of the data.
Outlier an observation that does not fit the rest of the data
Percentile a number that divides ordered data into hundredths; percentiles may or may not be part of the data. The
median of the data is the second quartile and the 50th percentile. The first and third quartiles are the 25th and the 75th
percentiles, respectively.
Quartiles the numbers that separate the data into quarters; quartiles may or may not be part of the data. The second
quartile is the median of the data.
Relative Frequency the ratio of the number of times a value of the data occurs in the set of all outcomes to the number
of all outcomes
Standard Deviation a number that is equal to the square root of the variance and measures how far data values are
from their mean; notation: s for sample standard deviation and σ for population standard deviation.
Variance mean of the squared deviations from the mean, or the square of the standard deviation; for a set of data, a
deviation can be represented as $x - \bar{x}$, where x is a value of the data and $\bar{x}$ is the sample mean. The sample
variance is equal to the sum of the squares of the deviations divided by the difference of the sample size and one.
CHAPTER REVIEW
represents a discrete value. Some bar graphs present bars clustered in groups of more than one (grouped bar graphs), and
others show the bars divided into subparts to show cumulative effect (stacked bar graphs). Bar graphs are especially useful
when categorical data is being used.
A histogram is a graphic version of a frequency distribution. The graph consists of bars of equal width drawn adjacent to
each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies.
The heights of the bars correspond to frequency values. Histograms are typically used for large, continuous, quantitative
data sets. A frequency polygon can also be used when graphing large data sets with data points that repeat. The data usually
goes on the x-axis with the frequency being graphed on the y-axis. Time series graphs can be helpful when looking at large
amounts of data for one variable over a period of time.
• $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}}$ is the formula for calculating the standard deviation of a sample.
To calculate the standard deviation of a population, we would use the population mean, μ, and the formula
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$ or $\sigma = \sqrt{\frac{\sum f(x - \mu)^2}{N}}$.
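The difference between the sample and population formulas, and between the raw-data and frequency-weighted versions, can be seen in a small Python sketch. The frequency table is hypothetical, and the variable names are illustrative.

```python
from math import sqrt
from statistics import pstdev, stdev

# Hypothetical frequency table (assumption): each value x occurs f times
table = {2: 1, 4: 3, 5: 2, 7: 1, 9: 1}

n = sum(table.values())
center = sum(f * x for x, f in table.items()) / n
sum_sq = sum(f * (x - center) ** 2 for x, f in table.items())

s = sqrt(sum_sq / (n - 1))   # sample formula: divide by n - 1
sigma = sqrt(sum_sq / n)     # population formula: divide by N

# Expanding the table into raw data should give the same results without the f weights
raw = [x for x, f in table.items() for _ in range(f)]

print("sample s:", round(s, 3), "| library stdev:", round(stdev(raw), 3))
print("population sigma:", round(sigma, 3), "| library pstdev:", round(pstdev(raw), 3))
```

The sample value is always a little larger than the population value for the same data, because the same sum of squares is divided by a smaller number.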
FORMULA REVIEW
2.2 Measures of the Location of the Data
$i = \left(\frac{k}{100}\right)(n + 1)$
where i = the ranking or position of a data value,
k = the kth percentile,
n = total number of data.
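The percentile formula can be sketched in Python as follows. The data are the first exam scores from Example 2.30. The handling of a position that is not a whole number (averaging the two neighboring values) is an assumption of this sketch, as are the function names.

```python
def percentile_position(k, n):
    """Position i = (k / 100)(n + 1) of the kth percentile among n ordered values."""
    return k * (n + 1) / 100


def percentile_value(ordered, k):
    """Value at the kth percentile of data already sorted from smallest to largest."""
    n = len(ordered)
    i = percentile_position(k, n)
    if not 1 <= i <= n:
        raise ValueError("position falls outside the data for this k and n")
    lower = int(i)  # position rounded down (positions start at 1)
    if i == lower:
        return ordered[lower - 1]
    # Assumption: when i is not a whole number, average the two neighboring values
    return (ordered[lower - 1] + ordered[lower]) / 2


# First exam scores from Example 2.30 (n = 31), already in order
scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73,
          74, 78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]

for k in (25, 40, 50, 75):
    print(k, "th percentile: position", percentile_position(k, len(scores)),
          "value", percentile_value(scores, k))
```

For the 25th, 50th, and 75th percentiles the positions are whole numbers (8, 16, and 24), so the values are read directly from the ordered list.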
PRACTICE
For the next three exercises, use the data to construct a line graph.
1. In a survey, 40 people were asked how many times they visited a store before making a major purchase. The results are
shown in Table 2.34.
Table 2.34
2. In a survey, several people were asked how many years it has been since they purchased a mattress. The results are shown
in Table 2.35.
Table 2.35
3. Several children were asked how many TV shows they watch each day. The results of the survey are shown in Table
2.36.
Table 2.36
4. The students in Ms. Ramirez’s math class have birthdays in each of the four seasons. Table 2.37 shows the four seasons,
the number of students who have birthdays in each season, and the percentage (%) of students in each group. Construct a
bar graph showing the number of students.
Table 2.37
5. Using the data from Ms. Ramirez’s math class supplied in Exercise 2.4, construct a bar graph showing the percentages.
6. David County has six high schools. Each school sent students to participate in a county-wide science competition. Table
2.38 shows the percentage breakdown of competitors from each school, and the percentage of the entire student population
of the county that goes to each school. Construct a bar graph that shows the population percentage of competitors from each
school.
Table 2.38
7. Use the data from the David County science competition supplied in Exercise 2.6. Construct a bar graph that shows the
county-wide population percentage of students at each school.
8. Sixty-five randomly selected car salespersons were asked the number of cars they generally sell in one week. Fourteen
people answered that they generally sell three cars; nineteen generally sell four cars; twelve generally sell five cars; nine
generally sell six cars; eleven generally sell seven cars. Complete the table.
Table 2.39
9. What does the frequency column in Table 2.39 sum to? Why?
10. What does the relative frequency column in Table 2.39 sum to? Why?
11. What is the difference between relative frequency and frequency for each data value in Table 2.39?
12. What is the difference between cumulative relative frequency and relative frequency for each data value?
13. To construct the histogram for the data in Table 2.39, determine appropriate minimum and maximum x and y values
and the scaling. Sketch the histogram. Label the horizontal and vertical axes with words. Include numerical scaling.
Figure 2.14
Table 2.40
Table 2.41
Table 2.42
15. Construct a frequency polygon from the frequency distribution for the 50 highest ranked countries for depth of hunger.
Table 2.43
16. Use the two frequency tables to compare the life expectancy of men and women from 20 randomly selected countries.
Include an overlaid frequency polygon and discuss the shapes of the distributions, the center, the spread, and any outliers.
What can we conclude about the life expectancy of women compared to men?
Table 2.44
Table 2.45
17. Construct a time series graph for (a) the number of male births, (b) the number of female births, and (c) the total
number of births.
Table 2.46
Table 2.47
Table 2.48
18. The following data sets list full time police per 100,000 citizens along with homicides per 100,000 citizens for the city
of Detroit, Michigan during the period from 1961 to 1973.
Table 2.49
Table 2.50
a. Construct a double time series graph using a common x-axis for both sets of data.
b. Which variable increased the fastest? Explain.
c. Did Detroit’s increase in police officers have an impact on the murder rate? Explain.
Grade Frequency
a.
49.5–59.5 2
59.5–69.5 3
69.5–79.5 8
79.5–89.5 12
89.5–99.5 5
Table 2.51
Table 2.52
Table 2.53
Use the following information to answer the next three exercises: The following data show the lengths of boats moored in a
marina. The data are ordered from smallest to largest: 16; 17; 19; 20; 20; 21; 23; 24; 25; 25; 25; 26; 26; 27; 27; 27; 28; 29;
30; 32; 33; 33; 34; 35; 37; 39; 40
37. Calculate the mean.
38. Identify the median.
39. Identify the mode.
Use the following information to answer the next three exercises: Sixty-five randomly selected car salespersons were
asked the number of cars they generally sell in one week. Fourteen people answered that they generally sell three cars;
nineteen generally sell four cars; twelve generally sell five cars; nine generally sell six cars; eleven generally sell seven cars.
Table 2.54
44. A group of children are measured to determine the average height of the group. The results are in Table 2.55 below.
What is the mean height of the group to the nearest hundredth of an inch?
Table 2.55
45. A person compares prices for five automobiles. The results are in Table 2.56. What is the mean price of the cars the
person has considered?
Price
$20,987
$22,008
$19,998
$23,433
$21,444
Table 2.56
46. A customer protection service has obtained 8 bags of candy that are supposed to contain 16 ounces of candy each. The
candy is weighed to determine if the average weight is at least the claimed 16 ounces. The results are given in Table
2.57. What is the mean weight of a bag of candy in the sample?
Weight in Ounces
15.65
16.09
16.01
15.99
16.02
16.00
15.98
16.08
Table 2.57
47. A teacher records the following grades for a class: 70, 72, 79, 81, 82, 82, 83, 90, and 95. What is the mean of these grades?
48. A family is polled to see the mean of the number of hours per day the television set is on. The results, starting with
Sunday, are 6, 3, 2, 3, 1, 3, and 7 hours. What is the average number of hours the family had the television set on to the
nearest whole number?
49. A city received the following rainfall for a recent year. What is the mean number of inches of rainfall the city received
monthly, to the nearest hundredth of an inch? Use Table 2.58.
Table 2.58
50. A football team scored the following points in its first 8 games of the new season. Starting at game 1 and in order the
scores are 14, 14, 24, 21, 7, 0, 38, and 28. What is the mean number of points the team scored in these eight games?
Figure 2.15
66. Describe the relationship between the mode and the median of this distribution.
Figure 2.16
67. Describe the relationship between the mean and the median of this distribution.
Figure 2.17
Figure 2.18
69. Describe the relationship between the mode and the median of this distribution.
Figure 2.19
70. Are the mean and the median exactly the same in this distribution? Why or why not?
Figure 2.20
Figure 2.21
72. Describe the relationship between the mode and the median of this distribution.
Figure 2.22
73. Describe the relationship between the mean and the median of this distribution.
Figure 2.23
74. The mean and median for the data are the same.
3; 4; 5; 5; 6; 6; 6; 6; 7; 7; 7; 7; 7; 7; 7
Is the data perfectly symmetrical? Why or why not?
75. Which is the greatest, the mean, the mode, or the median of the data set?
11; 11; 12; 12; 12; 12; 13; 15; 17; 22; 22; 22
76. Which is the least, the mean, the mode, or the median of the data set?
56; 56; 56; 58; 59; 60; 62; 64; 64; 65; 67
77. Of the three measures, which tends to reflect skewing the most, the mean, the mode, or the median? Why?
78. In a perfectly symmetrical distribution, when would the mode be different from the mean and median?
Baseball Player Batting Average Team Batting Average Team Standard Deviation
Fredo 0.158 0.166 0.012
Karl 0.177 0.189 0.015
Table 2.59
82. Use Table 2.59 to find the value that is three standard deviations:
a. above the mean
b. below the mean
83. Find the standard deviation for the following frequency tables using the formula. Check the calculations with the TI 83/84.
Grade Frequency
a.
49.5–59.5 2
59.5–69.5 3
69.5–79.5 8
79.5–89.5 12
89.5–99.5 5
Table 2.60
Table 2.61
Table 2.62
HOMEWORK
Table 2.63
a. Use a random number generator to randomly pick eight states. Construct a bar graph of the obesity rates of those
eight states.
b. Construct a bar graph for all the states beginning with the letter "A."
c. Construct a bar graph for all the states beginning with the letter "M."
85. Suppose that three book publishers were interested in the number of fiction paperbacks adult consumers purchase per
month. Each publisher conducted a survey. In the survey, adult consumers were asked the number of fiction paperbacks they
had purchased the previous month. The results are as follows:
a. Find the relative frequencies for each survey. Write them in the charts.
b. Use the frequency column to construct a histogram for each publisher's survey. For Publishers A and B, make bar
widths of one. For Publisher C, make bar widths of two.
c. In complete sentences, give two reasons why the graphs for Publishers A and B are not identical.
d. Would you have expected the graph for Publisher C to look like the other two graphs? Why or why not?
e. Make new histograms for Publisher A and Publisher B. This time, make bar widths of two.
f. Now, compare the graph for Publisher C to the new graphs for Publishers A and B. Are the graphs more similar
or more different? Explain your answer.
86. Often, cruise ships conduct all on-board transactions, with the exception of gambling, on a cashless basis. At the end
of the cruise, guests pay one bill that covers all onboard transactions. Suppose that 60 single travelers and 70 couples were
surveyed as to their on-board bills for a seven-day cruise from Los Angeles to the Mexican Riviera. Following is a summary
of the bills for each group.
87. Twenty-five randomly selected students were asked the number of movies they watched the previous week. The results
are as follows.
Table 2.69
88. The percentage of people who own at most three t-shirts costing more than $19 each is approximately:
a. 21
b. 59
c. 41
d. Cannot be determined
89. If the data were collected by asking the first 111 people who entered the store, then the type of sampling is:
a. cluster
b. simple random
c. stratified
d. convenience
90. Following are the 2010 obesity rates by U.S. states and Washington, DC.
Table 2.70
Construct a bar graph of obesity rates of your state and the four states closest to your state. Hint: Label the x-axis with the
states.
92. Six hundred adult Americans were asked by telephone poll, "What do you think constitutes a middle-class income?"
The results are in Table 2.71. Also, include left endpoint, but not the right endpoint.
Table 2.71
Table 2.72
a. What is the best estimate of the average obesity percentage for these countries?
b. The United States has an average obesity rate of 33.9%. Is this rate above average or below?
c. How does the United States compare to other countries?
94. Table 2.73 gives the percent of children under five considered to be underweight. What is the best estimate for the
mean percentage of underweight children?
Table 2.73
Table 2.74
Table 2.75
96. A standardized test is given to ten people at the beginning of the school year with the results given in Table 2.76 below.
At the end of the year the same people were again tested.
a. What is the average improvement?
b. Does it matter if the means are subtracted, or if the individual values are subtracted?
Table 2.76
97. A small class of 7 students has a mean grade of 82 on a test. If six of the grades are 80, 82, 86, 90, 90, and 95, what is
the other grade?
98. A class of 20 students has a mean grade of 80 on a test. Nineteen of the students have a mean grade between 79 and 82,
inclusive.
a. What is the lowest possible grade of the other student?
b. What is the highest possible grade of the other student?
99. If the mean of 20 prices is $10.39, and 5 of the items with a mean of $10.99 are sampled, what is the mean of the other
15 prices?
• μ = 1000 FTES
• median = 1,014 FTES
• σ = 474 FTES
• first quartile = 528.5 FTES
• third quartile = 1,447.5 FTES
• n = 29 years
106. A sample of 11 years is taken. About how many are expected to have an FTES of 1014 or above? Explain how you
determined your answer.
107. 75% of all years have an FTES:
a. at or below: _____
b. at or above: _____
108. The population standard deviation = _____
109. What percent of the FTES were from 528.5 to 1447.5? How do you know?
110. What is the IQR? What does the IQR represent?
111. How many standard deviations away from the mean is the median?
Additional Information: The population FTES for 2005–2006 through 2010–2011 was given in an updated report. The data
are reported here.
Table 2.77
112. Calculate the mean, median, standard deviation, the first quartile, the third quartile and the IQR. Round to one decimal
place.
113. Compare the IQR for the FTES for 1976–77 through 2004–2005 with the IQR for the FTES for 2005–2006 through
2010–2011. Why do you suppose the IQRs are so different?
114. Three students were applying to the same graduate school. They came from schools with different grading systems.
Which student had the best GPA when compared to other students at his school? Explain how you determined your answer.
Table 2.78
115. A music school has budgeted to purchase three musical instruments. They plan to purchase a piano costing $3,000, a
guitar costing $550, and a drum set costing $600. The mean cost for a piano is $4,000 with a standard deviation of $2,500.
The mean cost for a guitar is $500 with a standard deviation of $200. The mean cost for drums is $700 with a standard
deviation of $100. Which cost is the lowest when compared to other instruments of the same type? Which cost is the highest
when compared to other instruments of the same type? Justify your answer.
116. An elementary school class ran one mile with a mean of 11 minutes and a standard deviation of three minutes. Rachel,
a student in the class, ran one mile in eight minutes. A junior high school class ran one mile with a mean of nine minutes
and a standard deviation of two minutes. Kenji, a student in the class, ran 1 mile in 8.5 minutes. A high school class ran one
mile with a mean of seven minutes and a standard deviation of four minutes. Nedda, a student in the class, ran one mile in
eight minutes.
a. Why is Kenji considered a better runner than Nedda, even though Nedda ran faster than he?
b. Who is the fastest runner with respect to his or her class? Explain why.
117. The most obese countries in the world have obesity rates that range from 11.4% to 74.6%. This data is summarized in
Table 2.79.
Table 2.79
What is the best estimate of the average obesity percentage for these countries? What is the standard deviation for the listed
obesity rates? The United States has an average obesity rate of 33.9%. Is this rate above average or below? How “unusual”
is the United States’ obesity rate compared to the average rate? Explain.
118. Table 2.80 gives the percent of children under five considered to be underweight.
Table 2.80
What is the best estimate for the mean percentage of underweight children? What is the standard deviation? Which
interval(s) could be considered unusual? Explain.
119. Javier and Ercilia are supervisors at a shopping mall. Each was given the task of estimating the mean distance that
shoppers live from the mall. They each randomly surveyed 100 shoppers. The samples yielded the following information.
             Javier       Ercilia
$\bar{x}$    6.0 miles    6.0 miles
s            4.0 miles    7.0 miles
Table 2.81
Figure 2.24
Use the following information to answer the next three exercises: We are interested in the number of years students in
a particular elementary statistics class have lived in California. The information in the following table is from the entire
section.
Table 2.82
# of movies Frequency
0 5
1 9
2 6
3 4
4 1
Table 2.83
a. Find the sample mean $\bar{x}$.
b. Find the approximate sample standard deviation, s.
124. Forty randomly selected students were asked the number of pairs of sneakers they owned. Let X = the number of pairs
of sneakers owned. The results are as follows:
X Frequency
1 2
2 5
3 8
4 12
5 12
6 0
7 1
Table 2.84
a. Find the sample mean $\bar{x}$
b. Find the sample standard deviation, s
c. Construct a histogram of the data.
d. Complete the columns of the chart.
e. Find the first quartile.
f. Find the median.
g. Find the third quartile.
h. What percent of the students owned at least five pairs?
i. Find the 40th percentile.
j. Find the 90th percentile.
k. Construct a line graph of the data
l. Construct a stemplot of the data
125. Following are the published weights (in pounds) of all of the team members of the San Francisco 49ers from a previous
year.
177; 205; 210; 210; 232; 205; 185; 185; 178; 210; 206; 212; 184; 174; 185; 242; 188; 212; 215; 247; 241; 223; 220; 260;
245; 259; 278; 270; 280; 295; 275; 285; 290; 272; 273; 280; 285; 286; 200; 215; 185; 230; 250; 241; 190; 260; 250; 302;
265; 290; 276; 228; 265
a. Organize the data from smallest to largest value.
b. Find the median.
c. Find the first quartile.
d. Find the third quartile.
e. The middle 50% of the weights are from _______ to _______.
f. If our population were all professional football players, would the above data be a sample of weights or the
population of weights? Why?
g. If our population included every team member who ever played for the San Francisco 49ers, would the above data
be a sample of weights or the population of weights? Why?
h. Assume the population was the San Francisco 49ers. Find:
i. the population mean, μ.
ii. the population standard deviation, σ.
iii. the weight that is two standard deviations below the mean.
iv. When Steve Young, quarterback, played football, he weighed 205 pounds. How many standard
deviations above or below the mean was he?
i. That same year, the mean weight for the Dallas Cowboys was 240.08 pounds with a standard deviation of 44.38
pounds. Emmitt Smith weighed in at 209 pounds. With respect to his team, who was lighter, Smith or Young? How
did you determine your answer?
126. One hundred teachers attended a seminar on mathematical problem solving. The attitudes of a representative sample
of 12 of the teachers were measured before and after the seminar. A positive number for change in attitude indicates that a
teacher's attitude toward math became more positive. The 12 change scores are as follows:
3; 8; –1; 2; 0; 5; –3; 1; –1; 6; 5; –2
a. What is the mean change score?
b. What is the standard deviation for this population?
c. What is the median change score?
d. Find the change score that is 2.2 standard deviations below the mean.
127. Refer to Figure 2.25 and determine which of the following are true and which are false. Explain your solution to each
part in complete sentences.
Figure 2.25
a. The medians for both graphs are the same.
b. We cannot determine if any of the means for both graphs is different.
c. The standard deviation for graph b is larger than the standard deviation for graph a.
d. We cannot determine if any of the third quartiles for both graphs is different.
128. In a recent issue of the IEEE Spectrum, 84 engineering conferences were announced. Four conferences lasted two
days. Thirty-six lasted three days. Eighteen lasted four days. Nineteen lasted five days. Four lasted six days. One lasted
seven days. One lasted eight days. One lasted nine days. Let X = the length (in days) of an engineering conference.
a. Organize the data in a chart.
b. Find the median, the first quartile, and the third quartile.
c. Find the 65th percentile.
d. Find the 10th percentile.
e. The middle 50% of the conferences last from _______ days to _______ days.
f. Calculate the sample mean of days of engineering conferences.
g. Calculate the sample standard deviation of days of engineering conferences.
h. Find the mode.
i. If you were planning an engineering conference, which would you choose as the length of the conference: mean;
median; or mode? Explain why you made that choice.
j. Give two reasons why you think that three to five days seem to be popular lengths of engineering conferences.
129. A survey of enrollment at 35 community colleges across the United States yielded the following figures:
6414; 1550; 2109; 9350; 21828; 4300; 5944; 5722; 2825; 2044; 5481; 5200; 5853; 2750; 10012; 6357; 27000; 9414; 7681;
3200; 17500; 9200; 7380; 18314; 6557; 13713; 17768; 7493; 2771; 2861; 1263; 7285; 28165; 5080; 11622
a. Organize the data into a chart with five intervals of equal width. Label the two columns "Enrollment" and
"Frequency."
b. Construct a histogram of the data.
c. If you were to build a new community college, which piece of information would be more valuable: the mode or
the mean?
d. Calculate the sample mean.
e. Calculate the sample standard deviation.
f. A school with an enrollment of 8000 would be how many standard deviations away from the mean?
Use the following information to answer the next two exercises. X = the number of days per week that 100 clients use a
particular exercise facility.
x Frequency
0 3
1 12
2 33
3 28
4 11
5 9
6 4
Table 2.85
132. Suppose that a publisher conducted a survey asking adult consumers the number of fiction paperback books they had
purchased in the previous month. The results are summarized in Table 2.86.
Table 2.86
a. Are there any outliers in the data? Use an appropriate numerical test involving the IQR to identify outliers, if any,
and clearly state your conclusion.
b. If a data value is identified as an outlier, what should be done about it?
c. Are any data values further than two standard deviations away from the mean? In some situations, statisticians
may use this criterion to identify data values that are unusual, compared to the other data values. (Note that this
criterion is most appropriate to use for data that is mound-shaped and symmetric, rather than for skewed data.)
d. Do parts a and c of this problem give the same answer?
e. Examine the shape of the data. Which part, a or c, of this question gives a more appropriate result for this data?
f. Based on the shape of the data which is the most appropriate measure of center for this data: mean, median or
mode?
SOLUTIONS
1
Figure 2.26
Figure 2.27
Figure 2.28
Figure 2.29
9 65
11 The relative frequency shows the proportion of data points that have each value. The frequency tells the number of data
points that have each value.
13 Answers will vary. One possible histogram is shown:
Figure 2.30
15 Find the midpoint for each class. These will be graphed on the x-axis. The frequency values will be graphed on the
y-axis.
Figure 2.31
17
Figure 2.32
19
a. The 40th percentile is 37 years.
b. The 78th percentile is 70 years.
21 Jesse graduated 37th out of a class of 180 students. There are 180 – 37 = 143 students ranked below Jesse. There is
one rank of 37. x = 143 and y = 1. $\frac{x + 0.5y}{n}(100) = \frac{143 + 0.5(1)}{180}(100) = 79.72$. Jesse’s rank of 37 puts him at the 80th
percentile.
23
a. For runners in a race it is more desirable to have a high percentile for speed. A high percentile means a higher speed
which is faster.
b. 40% of runners ran at speeds of 7.5 miles per hour or less (slower). 60% of runners ran at speeds of 7.5 miles per hour
or more (faster).
25 When waiting in line at the DMV, the 85th percentile would be a long wait time compared to the other people waiting.
85% of people had shorter wait times than Mina. In this context, Mina would prefer a wait time corresponding to a lower
percentile. 85% of people at the DMV waited 32 minutes or less. 15% of people at the DMV waited 32 minutes or longer.
27 The manufacturer and the consumer would be upset. This is a large repair cost for the damages, compared to the other
cars in the sample. INTERPRETATION: 90% of the crash tested cars had damage repair costs of $1700 or less; only 10%
had damage repair costs of $1700 or more.
29 You can afford 34% of houses. 66% of the houses are too expensive for your budget. INTERPRETATION: 34% of
houses cost $240,000 or less. 66% of houses cost $240,000 or more.
31 4
33 6 – 4 = 2
35 6
37 Mean: 16 + 17 + 19 + 20 + 20 + 21 + 23 + 24 + 25 + 25 + 25 + 26 + 26 + 27 + 27 + 27 + 28 + 29 + 30 + 32 + 33 + 33
+ 34 + 35 + 37 + 39 + 40 = 738; 738/27 = 27.33
39 The most frequent lengths are 25 and 27, which occur three times. Mode = 25, 27
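As a check on solutions 37 and 39, the following sketch recomputes the mean and the modes with Python's standard statistics module. The list name lengths is an assumption made here; the values are the 27 numbers summed in solution 37.

```python
from statistics import mean, multimode

lengths = [16, 17, 19, 20, 20, 21, 23, 24, 25, 25, 25, 26, 26, 27,
           27, 27, 28, 29, 30, 32, 33, 33, 34, 35, 37, 39, 40]

total = sum(lengths)        # numerator of the mean
average = mean(lengths)     # total divided by the number of values
modes = multimode(lengths)  # every value tied for most frequent

print(total, len(lengths), round(average, 2), modes)
```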
41 4
44 39.48 in.
45 $21,574
46 15.98 ounces
47 81.56
48 4 hours
49 2.01 inches
50 18.25
51 10
52 14.15
53 14
54 14.78
55 44%
56 100%
57 6%
58 33%
59 The data are symmetrical. The median is 3 and the mean is 2.85. They are close, and the mode lies close to the middle
of the data, so the data are symmetrical.
61 The data are skewed right. The median is 87.5 and the mean is 88.2. Even though they are close, the mode lies to the
left of the middle of the data, and there are many more instances of 87 than any other number, so the data are skewed right.
63 When the data are symmetrical, the mean and median are close or the same.
65 The distribution is skewed right because it looks pulled out to the right.
67 The mean is 4.1 and is slightly greater than the median, which is four.
69 The mode and the median are the same. In this case, they are both five.
71 The distribution is skewed left because it looks pulled out to the left.
73 The mean and the median are both six.
75 The mode is 12, the median is 12.5, and the mean is 15.1. The mean is the largest.
77 The mean tends to reflect skewing the most because it is affected the most by outliers.
79 s = 34.5
81 For Fredo: z = (0.158 – 0.166)/0.012 = –0.67. For Karl: z = (0.177 – 0.189)/0.015 = –0.8. Fredo’s z-score of –0.67
is higher than Karl’s z-score of –0.8. For batting average, higher values are better, so Fredo has a better batting
average compared to his team.
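The z-score comparison can be reproduced with a short sketch. The helper name z_score is hypothetical; the inputs are each player's batting average, his team's mean, and his team's standard deviation, as used in the solution.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


fredo = z_score(0.158, 0.166, 0.012)
karl = z_score(0.177, 0.189, 0.015)

# Both z-scores are negative; the one closer to zero is the better
# performance relative to the player's own team.
print(round(fredo, 2), round(karl, 2), fredo > karl)
```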
83
a. sx = √(∑fm²/n − x̄²) = √(193157.45/30 − 79.5²) = 10.88
b. sx = √(∑fm²/n − x̄²) = √(380945.3/101 − 60.94²) = 7.62
c. sx = √(∑fm²/n − x̄²) = √(440051.5/86 − 70.66²) = 11.14
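The three computations in solution 83 follow one pattern, sketched below. The function name grouped_std_dev is an assumption for illustration; its arguments are the sum of f·m², the number of data values n, and the mean, all taken from the solution.

```python
from math import sqrt


def grouped_std_dev(sum_f_m_squared, n, mean):
    """Standard deviation from a frequency table: sqrt(sum(f*m^2)/n - mean^2)."""
    return sqrt(sum_f_m_squared / n - mean ** 2)


part_a = grouped_std_dev(193157.45, 30, 79.5)
part_b = grouped_std_dev(380945.3, 101, 60.94)
part_c = grouped_std_dev(440051.5, 86, 70.66)

print(round(part_a, 2), round(part_b, 2), round(part_c, 2))
```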
84
a. Example solution for using the random number generator for the TI-84+ to generate a simple random sample of 8
states. Instructions are as follows.
Number the entries in the table 1–51 (Includes Washington, DC; Numbered vertically)
Press MATH
Arrow over to PRB
Press 5:randInt(
Enter 51,1,8)
Eight numbers are generated (use the right arrow key to scroll through the numbers). The numbers correspond to the
numbered states (for this example: {47 21 9 23 51 13 25 4}). If any numbers are repeated, generate a different number
by using 5:randInt(51,1). Here, the states (and Washington DC) are {Arkansas, Washington DC, Idaho, Maryland,
Michigan, Mississippi, Virginia, Wyoming}.
Corresponding percents are {30.1, 22.2, 26.5, 27.1, 30.9, 34.0, 26.0, 25.1}.
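For readers without a TI-84+, the same kind of simple random sample can be drawn in Python. This is only a sketch of an equivalent procedure, not the calculator method; the constant names are assumptions made here.

```python
import random

NUM_ENTRIES = 51   # 50 states plus Washington, DC, numbered 1-51
SAMPLE_SIZE = 8

rng = random.Random()  # unseeded, so each run gives a different sample

# random.sample draws without replacement, so no number can repeat and
# the "generate a different number" step is never needed.
picks = rng.sample(range(1, NUM_ENTRIES + 1), SAMPLE_SIZE)
print(sorted(picks))
```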
Figure 2.33
b.
Figure 2.34
c.
Figure 2.35
86
Figure 2.36
c. In the following histogram, the data values that fall on the right boundary are counted in the class interval, while values
that fall on the left boundary are not counted (with the exception of the first interval where values on both boundaries
are included).
Figure 2.37
88 c
90 Answers will vary.
92
a. 1 – (0.02+0.09+0.19+0.26+0.18+0.17+0.02+0.01) = 0.06
b. 0.19+0.26+0.18 = 0.63
c. Check student’s solution.
d. 40th percentile will fall between 30,000 and 40,000
80th percentile will fall between 50,000 and 75,000
e. Check student’s solution.
96
a. 20
b. No
97 51
98
a. 42
b. 99
99 $10.19
100 17%
101 $30,772.48
102 4.4%
103 7.24%
104 -1.27%
106 The median value is the middle value in the ordered list of data values. The median value of a set of 11 will be the 6th
number in order. Six years will have totals at or below the median.
108 474 FTES
110 919
112
• mean = 1,809.3
• median = 1,812.5
• standard deviation = 151.2
• first quartile = 1,690
• third quartile = 1,935
• IQR = 245
113 Hint: Think about the number of years covered by each time period and what happened to higher education during
those periods.
115 For pianos, the cost of the piano is 0.4 standard deviations BELOW the mean. For guitars, the cost of the guitar is 0.25
standard deviations ABOVE the mean. For drums, the cost of the drum set is 1.0 standard deviations BELOW the mean.
Of the three, the drums cost the lowest in comparison to the cost of other instruments of the same type. The guitar costs the
most in comparison to the cost of other instruments of the same type.
117
• x̄ = 23.32
• Using the TI 83/84, we obtain a standard deviation of: sx = 12.95.
• The obesity rate of the United States is 10.58% higher than the average obesity rate.
• Since the standard deviation is 12.95, we see that 23.32 + 12.95 = 36.27 is the obesity percentage that is one standard
deviation from the mean. The United States obesity rate is slightly less than one standard deviation from the mean.
Therefore, we can assume that the United States, while 34% obese, does not have an unusually high percentage of
obese people.
120 a
122 b
123
a. 1.48
b. 1.12
125
a. 174; 177; 178; 184; 185; 185; 185; 185; 188; 190; 200; 205; 205; 206; 210; 210; 210; 212; 212; 215; 215; 220; 223;
228; 230; 232; 241; 241; 242; 245; 247; 250; 250; 259; 260; 260; 265; 265; 270; 272; 273; 275; 276; 278; 280; 280;
285; 285; 286; 290; 290; 295; 302
b. 241
c. 205.5
d. 272.5
e. 205.5, 272.5
f. sample
g. population
h. i. 236.34
ii. 37.50
iii. 161.34
iv. 0.84 std. dev. below the mean
i. Young
127
a. True
b. True
c. True
d. False
129
a.
Enrollment       Frequency
1000-5000        10
5000-10000       16
10000-15000      3
15000-20000      3
20000-25000      1
25000-30000      2
Table 2.89
131 a
3 | PROBABILITY TOPICS
Figure 3.1 Meteor showers are rare, but the probability of them occurring can be calculated. (credit: Navicore/flickr)
Introduction
It is often necessary to "guess" about the outcome of an event in order to make a decision. Politicians study polls to guess
their likelihood of winning an election. Teachers choose a particular course of study based on what they think students can
comprehend. Doctors choose the treatments needed for various diseases based on their assessment of likely results. You
may have visited a casino where people play games chosen because of the belief that the likelihood of winning is good. You
may have chosen your course of study based on the probable availability of jobs.
You have, more than likely, used probability. In fact, you probably have an intuitive sense of probability. Probability deals
with the chance of an event occurring. Whenever you weigh the odds of whether or not to do your homework or to study
for an exam, you are using probability. In this chapter, you will learn how to solve probability problems using a systematic
approach.
3.1 | Terminology
Probability is a measure that is associated with how certain we are of outcomes of a particular experiment or activity.
An experiment is a planned operation carried out under controlled conditions. If the result is not predetermined, then the
experiment is said to be a chance experiment. Flipping one fair coin twice is an example of an experiment.
A result of an experiment is called an outcome. The sample space of an experiment is the set of all possible outcomes.
Three ways to represent a sample space are: to list the possible outcomes, to create a tree diagram, or to create a Venn
diagram. The uppercase letter S is used to denote the sample space. For example, if you flip one fair coin, S = {H, T} where
H = heads and T = tails are the outcomes.
An event is any combination of outcomes. Upper case letters like A and B represent events. For example, if the experiment
is to flip one fair coin, event A might be getting at most one head. The probability of an event A is written P(A).
The probability of any outcome is the long-term relative frequency of that outcome. Probabilities are between zero and
one, inclusive (that is, zero and one and all numbers between these values). P(A) = 0 means the event A can never happen.
P(A) = 1 means the event A always happens. P(A) = 0.5 means the event A is equally likely to occur or not to occur. For
example, if you flip one fair coin repeatedly (from 20 to 2,000 to 20,000 times) the relative frequency of heads approaches
0.5 (the probability of heads).
Equally likely means that each outcome of an experiment occurs with equal probability. For example, if you toss a fair,
six-sided die, each face (1, 2, 3, 4, 5, or 6) is as likely to occur as any other face. If you toss a fair coin, a Head (H) and a
Tail (T) are equally likely to occur. If you randomly guess the answer to a true/false question on an exam, you are equally
likely to select a correct answer or an incorrect answer.
To calculate the probability of an event A when all outcomes in the sample space are equally likely, count the number
of outcomes for event A and divide by the total number of outcomes in the sample space. For example, if you toss a fair
dime and a fair nickel, the sample space is {HH, TH, HT, TT} where T = tails and H = heads. The sample space has four
outcomes. A = getting one head. There are two outcomes that meet this condition {HT, TH}, so P(A) = 2/4 = 0.5.
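The counting rule for equally likely outcomes can be illustrated with a short enumeration. The variable names below are assumptions; the first letter of each outcome stands for the dime and the second for the nickel.

```python
from fractions import Fraction
from itertools import product

# All ordered results of tossing two coins: HH, HT, TH, TT.
sample_space = ["".join(pair) for pair in product("HT", repeat=2)]

# Event A = getting exactly one head.
event_a = [outcome for outcome in sample_space if outcome.count("H") == 1]

# Count the outcomes in A and divide by the size of the sample space.
p_a = Fraction(len(event_a), len(sample_space))
print(sample_space, event_a, p_a)
```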
Suppose you roll one fair six-sided die, with the numbers {1, 2, 3, 4, 5, 6} on its faces. Let event E = rolling a number that
is at least five. There are two outcomes {5, 6}. P(E) = 2/6. If you were to roll the die only a few times, you would not be
surprised if your observed results did not match the probability. If you were to roll the die a very large number of times, you
would expect that, overall, 2/6 of the rolls would result in an outcome of "at least five". You would not expect exactly 2/6.
The long-term relative frequency of obtaining this result would approach the theoretical probability of 2/6 as the number of
repetitions grows larger and larger.
This important characteristic of probability experiments is known as the law of large numbers which states that as the
number of repetitions of an experiment is increased, the relative frequency obtained in the experiment tends to become
closer and closer to the theoretical probability. Even though the outcomes do not happen according to any set pattern or
order, overall, the long-term observed relative frequency will approach the theoretical probability. (The word empirical is
often used instead of the word observed.)
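A small simulation can make the law of large numbers concrete. The sketch below rolls a simulated fair die and reports the relative frequency of "at least five" for increasing numbers of rolls; the function name, the seed, and the roll counts are arbitrary choices made here.

```python
import random


def relative_frequency_at_least_five(num_rolls, rng):
    """Fraction of num_rolls simulated fair-die rolls that show 5 or 6."""
    hits = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) >= 5)
    return hits / num_rolls


rng = random.Random(0)  # fixed seed (an arbitrary choice) so runs are repeatable
for n in (20, 2_000, 200_000):
    # The relative frequency should drift toward 2/6 as n grows.
    print(n, relative_frequency_at_least_five(n, rng))
```

With only 20 rolls the observed frequency can be far from 2/6; with many more rolls it should settle close to it.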
It is important to realize that in many situations, the outcomes are not equally likely. A coin or die may be unfair, or biased.
Two math professors in Europe had their statistics students test the Belgian one Euro coin and discovered that in 250 trials,
a head was obtained 56% of the time and a tail was obtained 44% of the time. The data seem to show that the coin is not a
fair coin; more repetitions would be helpful to draw a more accurate conclusion about such bias. Some dice may be biased.
Look at the dice in a game you have at home; the spots on each face are usually small holes carved out and then painted to
make the spots visible. Your dice may or may not be biased; it is possible that the outcomes may be affected by the slight
weight differences due to the different numbers of holes in the faces. Gambling casinos make a lot of money depending on
outcomes from rolling dice, so casino dice are made differently to eliminate bias. Casino dice have flat faces; the holes are
completely filled with paint having the same density as the material that the dice are made out of so that each face is equally
likely to occur. Later we will learn techniques to use to work with probabilities for events that are not equally likely.
We get the same result by using the formula. Remember that S has six outcomes.
P(A | B) = P(A ∩ B) / P(B)
= [(the number of outcomes that are 2 or 3 and even in S) / 6] / [(the number of outcomes that are even in S) / 6]
= (1/6) / (3/6) = 1/3
Odds
The odds of an event present the probability as a ratio of success to failure. This is common in various gambling formats.
Mathematically, the odds of an event can be defined as:
P(A) / (1 − P(A))
where P(A) is the probability of success and 1 − P(A) is the probability of failure. Odds are always quoted as
"numerator to denominator," e.g. 2 to 1. Here the probability of winning is twice that of losing; thus, the probability of
winning is 2/3, or about 0.67. A probability of winning of 0.60 would generate odds in favor of winning of 3 to 2. While
the calculation of odds can be useful in gambling venues in determining payoff amounts, it is not helpful for understanding
probability or statistical theory.
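The conversion between probabilities and odds can be sketched as follows. Both function names are hypothetical helpers; exact fractions are used so that "3 to 2" appears as 3/2 rather than a rounded decimal.

```python
from fractions import Fraction


def odds_in_favor(p):
    """Odds of an event with probability p, defined as p / (1 - p)."""
    p = Fraction(p)
    return p / (1 - p)


def probability_from_odds(numerator, denominator):
    """Probability of success when the odds are 'numerator to denominator'."""
    return Fraction(numerator, numerator + denominator)


print(odds_in_favor("0.60"))        # 3/2, that is, odds of 3 to 2
print(probability_from_odds(2, 1))  # 2/3, about 0.67
```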
Understanding Terminology and Symbols
It is important to read each problem carefully to think about and understand what the events are. Understanding the wording
is the first very important step in solving probability problems. Reread the problem several times if necessary. Clearly
identify the event of interest. Determine whether there is a condition stated in the wording that would indicate that the
probability is conditional; carefully identify the condition, if any.
Example 3.1
The sample space S is the whole numbers starting at one and less than 20.
a. S = _____________________________
Let event A = the even numbers and event B = numbers greater than 13.
b. A = _____________________, B = _____________________
c. P(A) = _____________, P(B) = ________________
d. A ∩ B = ____________________, A OR B = ________________
e. P(A ∩ B) = _________, P(A ∪ B) = _____________
f. A′ = _____________, P(A′) = _____________
g. P(A) + P(A′) = ____________
h. P(A | B) = ___________, P(B | A) = _____________; are the probabilities equal?
Solution 3.1
a. S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}
b. A = {2, 4, 6, 8, 10, 12, 14, 16, 18}, B = {14, 15, 16, 17, 18, 19}
c. P(A) = 9/19, P(B) = 6/19
d. A ∩ B = {14, 16, 18}, A OR B = {2, 4, 6, 8, 10, 12, 14, 15, 16, 17, 18, 19}
e. P(A ∩ B) = 3/19, P(A ∪ B) = 12/19
f. A′ = {1, 3, 5, 7, 9, 11, 13, 15, 17, 19}, P(A′) = 10/19
g. P(A) + P(A′) = 1 (9/19 + 10/19 = 1)
h. P(A | B) = P(A ∩ B) / P(B) = 3/6, P(B | A) = P(A ∩ B) / P(A) = 3/9. No, the probabilities are not equal.
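The parts of Solution 3.1 can be checked with Python sets, where & is intersection, | is union, and set difference gives the complement. This is a sketch under the assumption that all 19 outcomes are equally likely; the helper name prob is hypothetical.

```python
from fractions import Fraction

S = set(range(1, 20))               # whole numbers from 1 to 19
A = {n for n in S if n % 2 == 0}    # the even numbers
B = {n for n in S if n > 13}        # numbers greater than 13


def prob(event, space=S):
    """Probability of event when every outcome in space is equally likely."""
    return Fraction(len(event), len(space))


p_a_and_b = prob(A & B)        # P(A ∩ B)
p_a_or_b = prob(A | B)         # P(A ∪ B)
p_not_a = prob(S - A)          # P(A′)
p_a_given_b = prob(A & B, B)   # P(A | B): B is the reduced sample space
p_b_given_a = prob(A & B, A)   # P(B | A): A is the reduced sample space

print(p_a_and_b, p_a_or_b, p_not_a, p_a_given_b, p_b_given_a)
```

Fraction reduces its results, so 3/6 prints as 1/2 and 3/9 prints as 1/3.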
3.1 The sample space S is all the ordered pairs of two whole numbers, the first from one to three and the second from
one to four (Example: (1, 4)).
a. S = _____________________________
Let event A = the sum is even and event B = the first number is prime.
b. A = _____________________, B = _____________________
c. P(A) = _____________, P(B) = ________________
d. A ∩ B = ____________________, A ∪ B = ________________
e. P(A ∩ B) = _________, P(A ∪ B) = _____________
f. B′ = _____________, P(B′) = _____________
g. P(A) + P(A′) = ____________
h. P(A | B) = ___________, P(B | A) = _____________; are the probabilities equal?
Example 3.2
A fair, six-sided die is rolled. Describe the sample space S, identify each of the following events with a subset of
S and compute its probability (an outcome is the number of dots that show up).
a. Event T = the outcome is two.
b. Event A = the outcome is an even number.
c. Event B = the outcome is less than four.
d. The complement of A.
e. A | B
f. B | A
g. A ∩ B
h. A ∪ B
i. A ∪ B′
j. Event N = the outcome is a prime number.
k. Event I = the outcome is seven.
Solution 3.2
a. T = {2}, P(T) = 1/6
e. A | B = {2}, P(A | B) = 1/3
f. B | A = {2}, P(B | A) = 1/3
g. A ∩ B = {2}, P(A ∩ B) = 1/6
Example 3.3
Table 3.1 describes the distribution of a random sample S of 100 individuals, organized by gender and whether
they are right- or left-handed.
Right-handed Left-handed
Males 43 9
Females 44 4
Table 3.1
Let’s denote the events M = the subject is male, F = the subject is female, R = the subject is right-handed, L = the
subject is left-handed. Compute the following probabilities:
a. P(M)
b. P(F)
c. P(R)
d. P(L)
e. P(M ∩ R)
f. P(F ∩ L)
g. P(M ∪ F)
h. P(M ∪ R)
i. P(F ∪ L)
j. P(M')
k. P(R | M)
l. P(F | L)
m. P(L | F)
Solution 3.3
a. P(M) = 0.52
b. P(F) = 0.48
c. P(R) = 0.87
d. P(L) = 0.13
e. P(M ∩ R) = 0.43
f. P(F ∩ L) = 0.04
g. P(M ∪ F) = 1
h. P(M ∪ R) = 0.96
i. P(F ∪ L) = 0.57
j. P(M') = 0.48
k. P(R | M) = 0.8269 (rounded to four decimal places)
l. P(F | L) = 0.3077 (rounded to four decimal places)
m. P(L | F) = 0.0833
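The probabilities in Solution 3.3 all come from the four counts in Table 3.1. The sketch below shows one way to compute several of them; the dictionary layout and the helper names prob and cond_prob are assumptions made here, not part of the text.

```python
from fractions import Fraction

# Counts from Table 3.1, keyed by (gender, handedness).
counts = {("M", "R"): 43, ("M", "L"): 9, ("F", "R"): 44, ("F", "L"): 4}
total = sum(counts.values())  # 100 individuals


def prob(event):
    """P(event), where event is a function that accepts or rejects a table cell."""
    return Fraction(sum(n for cell, n in counts.items() if event(cell)), total)


def cond_prob(event, given):
    """P(event | given) = P(event ∩ given) / P(given)."""
    return prob(lambda cell: event(cell) and given(cell)) / prob(given)


def male(cell):
    return cell[0] == "M"


def female(cell):
    return cell[0] == "F"


def right(cell):
    return cell[1] == "R"


def left(cell):
    return cell[1] == "L"


p_m_or_r = prob(lambda cell: male(cell) or right(cell))   # P(M ∪ R)
p_f_or_l = prob(lambda cell: female(cell) or left(cell))  # P(F ∪ L)
p_r_given_m = cond_prob(right, male)                      # P(R | M)
p_f_given_l = cond_prob(female, left)                     # P(F | L)
p_l_given_f = cond_prob(left, female)                     # P(L | F)

print(float(p_m_or_r), float(p_f_or_l))
print(round(float(p_r_given_m), 4), round(float(p_f_given_l), 4),
      round(float(p_l_given_f), 4))
```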
Independent Events
Two events are independent if one of the following is true:
• P(A|B) = P(A)
• P(B| A) = P(B)
• P(A ∩ B) = P(A)P(B)
Two events A and B are independent if the knowledge that one occurred does not affect the chance the other occurs. For
example, the outcomes of two rolls of a fair die are independent events. The outcome of the first roll does not change the
probability for the outcome of the second roll. To show two events are independent, you must show only one of the above
conditions. If two events are NOT independent, then we say that they are dependent.
Sampling may be done with replacement or without replacement.
• With replacement: If each member of a population is replaced after it is picked, then that member has the possibility
of being chosen more than once. When sampling is done with replacement, then events are considered to be
independent, meaning the result of the first pick will not change the probabilities for the second pick.
• Without replacement: When sampling is done without replacement, each member of a population may be chosen
only once. In this case, the probabilities for the second pick are affected by the result of the first pick. The events are
considered to be dependent or not independent.
If it is not known whether A and B are independent or dependent, assume they are dependent until you can show
otherwise.
Example 3.4
You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts and
spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, J (jack), Q (queen), K (king) of
that suit.
a. Sampling with replacement:
Suppose you pick three cards with replacement. The first card you pick out of the 52 cards is the Q of spades. You
put this card back, reshuffle the cards and pick a second card from the 52-card deck. It is the ten of clubs. You
put this card back, reshuffle the cards and pick a third card from the 52-card deck. This time, the card is the Q of
spades again. Your picks are {Q of spades, ten of clubs, Q of spades}. You have picked the Q of spades twice.
You pick each card from the 52-card deck.
b. Sampling without replacement:
Suppose you pick three cards without replacement. The first card you pick out of the 52 cards is the K of hearts.
You put this card aside and pick the second card from the 51 cards remaining in the deck. It is the three of
diamonds. You put this card aside and pick the third card from the remaining 50 cards in the deck. The third card
is the J of spades. Your picks are {K of hearts, three of diamonds, J of spades}. Because you have picked the
cards without replacement, you cannot pick the same card twice. The probability of picking the three of diamonds
is called a conditional probability because it is conditioned on what was picked first. This is true also of the
probability of picking the J of spades. The probability of picking the J of spades is actually conditioned on both
the previous picks.
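The two sampling schemes in Example 3.4 map onto two functions in Python's random module. This is a sketch only: the card labels and the seed are arbitrary choices made here, and the cards drawn will differ from those in the example.

```python
import random

ranks = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [f"{rank} of {suit}" for suit in suits for rank in ranks]

rng = random.Random(3)  # arbitrary seed so the run is repeatable

# With replacement: every pick comes from the full 52-card deck,
# so the same card can show up more than once.
with_replacement = rng.choices(deck, k=3)

# Without replacement: each picked card is set aside,
# so the three cards are always distinct.
without_replacement = rng.sample(deck, k=3)

print(with_replacement)
print(without_replacement)
```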
3.4 You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts and
spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, J (jack), Q (queen), K (king) of that suit.
Three cards are picked at random.
a. Suppose you know that the picked cards are Q of spades, K of hearts and Q of spades. Can you decide if the
sampling was with or without replacement?
b. Suppose you know that the picked cards are Q of spades, K of hearts, and J of spades. Can you decide if the
sampling was with or without replacement?
Example 3.5
You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts, and
spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, J (jack), Q (queen), and K (king)
of that suit. S = spades, H = Hearts, D = Diamonds, C = Clubs.
a. Suppose you pick four cards, but do not put any cards back into the deck. Your cards are QS, 1D, 1C, QD.
b. Suppose you pick four cards and put each card back before you pick the next card. Your cards are KH, 7D,
6D, KH.
Which of a. or b. did you sample with replacement and which did you sample without replacement?
Solution 3.5
a. Without replacement; b. With replacement
3.5 You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts, and
spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, J (jack), Q (queen), and K (king) of
that suit. S = spades, H = Hearts, D = Diamonds, C = Clubs. Suppose that you sample four cards without replacement.
Which of the following outcomes are possible? Answer the same question for sampling with replacement.
a. QS, 1D, 1C, QD
b. KH, 7D, 6D, KH
c. QS, 7D, 6D, KS
For example, suppose the sample space S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Let A = {1, 2, 3, 4, 5}, B = {4, 5, 6, 7, 8}, and C =
{7, 9}. A ∩ B = {4, 5}. P(A ∩ B) = 2/10 and is not equal to zero. Therefore, A and B are not mutually exclusive. A and
C do not have any numbers in common so P(A ∩ C) = 0. Therefore, A and C are mutually exclusive.
If it is not known whether A and B are mutually exclusive, assume they are not until you can show otherwise. The
following examples illustrate these definitions and terms.
Example 3.6
3.6 Draw two cards from a standard 52-card deck with replacement. Find the probability of getting at least one black
card.
Example 3.7
Solution 3.7
Look at the sample space in Example 3.6.
a. Zero (0) or one (1) tails occur when the outcomes HH, TH, HT show up. P(F) = 3/4
e. Getting all tails occurs when tails shows up on both coins (TT). H’s outcomes are HH and HT.
J and H have nothing in common so P(J ∩ H) = 0. J and H are mutually exclusive.
3.7 A box has two balls, one white and one red. We select one ball, put it back in the box, and select a second ball
(sampling with replacement). Find the probability of the following events:
a. Let F = the event of getting the white ball twice.
b. Let G = the event of getting two balls of different colors.
c. Let H = the event of getting white on the first pick.
d. Are F and G mutually exclusive?
e. Are G and H mutually exclusive?
Example 3.8
Roll one fair, six-sided die. The sample space is {1, 2, 3, 4, 5, 6}. Let event A = a face is odd. Then A = {1, 3, 5}.
Let event B = a face is even. Then B = {2, 4, 6}.
• Find the complement of A, A′. The complement of A, A′, is B because A and B together make up the sample
space. P(A) + P(B) = P(A) + P(A′) = 1. Also, P(A) = 3/6 and P(B) = 3/6.
• Let event C = odd faces larger than two. Then C = {3, 5}. Let event D = all even faces smaller than five.
Then D = {2, 4}. P(C ∩ D) = 0 because you cannot have an odd and even face at the same time. Therefore,
C and D are mutually exclusive events.
• Let event E = all faces less than five. E = {1, 2, 3, 4}.
Are C and E mutually exclusive events? (Answer yes or no.) Why or why not?
Solution 3.8
No. C = {3, 5} and E = {1, 2, 3, 4}. P(C ∩ E) = 1/6. To be mutually exclusive, P(C ∩ E) must be zero.
• Find P(C | A). This is a conditional probability. Recall that the event C is {3, 5} and event A is {1, 3, 5}. To
find P(C | A), find the probability of C using the sample space A. You have reduced the sample space from
the original sample space {1, 2, 3, 4, 5, 6} to {1, 3, 5}. So, P(C | A) = 2/3.
3.8 Let event A = learning Spanish. Let event B = learning German. Then A ∩ B = learning Spanish and German.
Suppose P(A) = 0.4 and P(B) = 0.2 . P(A ∩ B) = 0.08 . Are events A and B independent? Hint: You must show
ONE of the following:
• P(A|B) = P(A)
• P(B| A) = P(B)
• P(A ∩ B) = P(A)P(B)
Example 3.9
Let event G = taking a math class. Let event H = taking a science class. Then, G ∩ H = taking a math class and
a science class. Suppose P(G) = 0.6, P(H) = 0.5 , and P(G ∩ H) = 0.3. Are G and H independent?
If G and H are independent, then you must show ONE of the following:
• P(G|H) = P(G)
• P(H |G) = P(H)
• P(G ∩ H) = P(G)P(H)
NOTE
The choice you make depends on the information you have. You could choose any of the methods here
because you have the necessary information.
Solution 3.9
P(G | H) = P(G ∩ H) / P(H) = 0.3 / 0.5 = 0.6 = P(G)
Solution 3.9
P(G)P(H) = (0.6)(0.5) = 0.3 = P(G ∩ H)
Since G and H are independent, knowing that a person is taking a science class does not change the chance that
he or she is taking a math class. If the two events had not been independent (that is, they are dependent) then
knowing that a person is taking a science class would change the chance he or she is taking math. For practice,
show that P(H |G) = P(H) to show that G and H are independent events.
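The independence checks in Example 3.9 can be sketched in Python. The function name is_independent is a hypothetical helper; math.isclose is used instead of == because decimal probabilities are stored as floating-point approximations.

```python
from math import isclose


def is_independent(p_a, p_b, p_a_and_b):
    """True when P(A ∩ B) = P(A)P(B), allowing for floating-point rounding."""
    return isclose(p_a_and_b, p_a * p_b)


# Example 3.9: G = taking a math class, H = taking a science class.
p_g, p_h, p_g_and_h = 0.6, 0.5, 0.3

p_g_given_h = p_g_and_h / p_h   # P(G | H), compare with P(G)
p_h_given_g = p_g_and_h / p_g   # P(H | G), compare with P(H)

print(is_independent(p_g, p_h, p_g_and_h))
print(isclose(p_g_given_h, p_g), isclose(p_h_given_g, p_h))
```

Any one of the three comparisons is enough; the sketch evaluates all three only to show that they agree.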
3.9 In a bag, there are six red marbles and four green marbles. The red marbles are marked with the numbers 1, 2, 3,
4, 5, and 6. The green marbles are marked with the numbers 1, 2, 3, and 4.
• R = a red marble
• G = a green marble
• O = an odd-numbered marble
• The sample space is S = {R1, R2, R3, R4, R5, R6, G1, G2, G3, G4}.
S has ten outcomes. What is P(G ∩ O) ?
Example 3.10
Let event C = taking an English class. Let event D = taking a speech class.
Suppose P(C) = 0.75 , P(D) = 0.3 , P(C|D) = 0.75 and P(C ∩ D) = 0.225 .
Solution 3.10
a. Yes, because P(C|D) = P(C) .
c. P(D | C) = P(C ∩ D) / P(C) = 0.225 / 0.75 = 0.3
3.10 A student goes to the library. Let events B = the student checks out a book and D = the student checks out a
DVD. Suppose that P(B) = 0.40 , P(D) = 0.30 and P(B ∩ D) = 0.20 .
a. Find P(B|D) .
b. Find P(D|B) .
Example 3.11
In a box there are three red cards and five blue cards. The red cards are marked with the numbers 1, 2, and 3, and
the blue cards are marked with the numbers 1, 2, 3, 4, and 5. The cards are well-shuffled. You reach into the box
(you cannot see into it) and draw one card.
Let R = red card is drawn, B = blue card is drawn, E = even-numbered card is drawn.
The sample space S = {R1, R2, R3, B1, B2, B3, B4, B5}. S has eight outcomes.
• P(R) = 3/8. P(B) = 5/8. P(R ∩ B) = 0. (You cannot draw one card that is both red and blue.)
• P(E) = 3/8. (There are three even-numbered cards, R2, B2, and B4.)
• P(E | B) = 2/5. (There are five blue cards: B1, B2, B3, B4, and B5. Out of the blue cards, there are two even
cards: B2 and B4.)
• P(B | E) = 2/3. (There are three even-numbered cards: R2, B2, and B4. Out of the even-numbered cards, two
are blue: B2 and B4.)
• The events R and B are mutually exclusive because P(R ∩ B) = 0.
• Let G = card with a number greater than 3. G = {B4, B5}. P(G) = 2/8. Let H = blue card numbered between
one and four, inclusive. H = {B1, B2, B3, B4}. P(G | H) = 1/4. (The only card in H that has a number greater
than three is B4.) Since 2/8 = 1/4, P(G) = P(G | H), which means that G and H are independent.
Are the events of rooting for the away team and wearing blue independent? Are they mutually exclusive?
Example 3.12
In a particular college class, 60% of the students are female. Fifty percent of all students in the class have long
hair. Forty-five percent of the students are female and have long hair. Of the female students, 75% have long hair.
Let F be the event that a student is female. Let L be the event that a student has long hair. One student is picked
randomly. Are the events of being female and having long hair independent?
• The following probabilities are given in this example:
• P(F)=0.60; P(L)=0.50
• P(F ∩ L) = 0.45
• P(L|F) = 0.75
NOTE
The choice you make depends on the information you have. You could use the first or last condition on
the list for this example. You do not know P(F|L) yet, so you cannot use the second condition.
Solution 1
Check whether P(F ∩ L) = P(F)P(L) . We are given that P(F ∩ L) = 0.45 , but
P(F)P(L) = (0.60)(0.50) = 0.30 . The events of being female and having long hair are not independent because
P(F ∩ L) does not equal P(F)P(L) .
Solution 2
Check whether P(L|F) equals P(L) . We are given that P(L|F) = 0.75 , but P(L) = 0.50 ; they are not equal.
The events of being female and having long hair are not independent.
Interpretation of Results
The events of being female and having long hair are not independent; knowing that a student is female changes
the probability that a student has long hair.
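The second condition on the list can also be checked once P(F | L) is computed from the given values. The sketch below does this with exact fractions; the variable names are assumptions made here.

```python
from fractions import Fraction

p_f = Fraction(60, 100)         # P(F), student is female
p_l = Fraction(50, 100)         # P(L), student has long hair
p_f_and_l = Fraction(45, 100)   # P(F ∩ L)

p_l_given_f = p_f_and_l / p_f   # P(L | F), should match the given 0.75
p_f_given_l = p_f_and_l / p_l   # P(F | L), not given in the example

# Independence would require P(F | L) = P(F).
independent = (p_f_given_l == p_f)
print(p_l_given_f, p_f_given_l, independent)
```

Here P(F | L) works out to 9/10, which differs from P(F) = 0.60, in agreement with Solutions 1 and 2.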
3.12 Mark is deciding which route to take to work. His choices are I = the Interstate and F = Fifth Street.
• P(I) = 0.44 and P(F) = 0.56
• P(I ∩ F) = 0 because Mark will take only one route to work.
What is the probability of P(I ∪ F) ?
Example 3.13
a. Toss one fair coin (the coin has two sides, H and T). The outcomes are ________. Count the outcomes. There
are ____ outcomes.
b. Toss one fair, six-sided die (the die has 1, 2, 3, 4, 5 or 6 dots on a side). The outcomes are
Solution 3.13
a. H and T; 2
b. 1, 2, 3, 4, 5, 6; 6
c. 2(6) = 12
d. T1, T2, T3, T4, T5, T6, H1, H2, H3, H4, H5, H6
f. B = {H3}; P(B) = 1/12
g. Yes, because P(A ∩ B) = 0
h. P(A ∩ B) = 0. P(A)P(B) = (3/12)(1/12). P(A ∩ B) does not equal P(A)P(B), so A and B are dependent.
3.13 A box has two balls, one white and one red. We select one ball, put it back in the box, and select a second ball
(sampling with replacement). Let T be the event of getting the white ball twice, F the event of picking the white ball
first, S the event of picking the white ball in the second drawing.
a. Compute P(T) .
One easy way to remember the multiplication rule is that the word "and" means that the event has to satisfy two conditions.
For example, the name drawn from the class roster is to be both a female and a sophomore. It is harder to satisfy two
conditions than only one, and when we multiply two probabilities, each between zero and one, the result is never larger
than either factor. This reflects the increasing difficulty of satisfying two conditions.
Example 3.14
Klaus is trying to choose where to go on vacation. His two choices are: A = New Zealand and B = Alaska
• Klaus can only afford one vacation. The probability that he chooses A is P(A) = 0.6 and the probability that
he chooses B is P(B) = 0.35.
• P(A ∩ B) = 0 because Klaus can only afford to take one vacation
• Therefore, the probability that he chooses either New Zealand or Alaska is
P(A ∪ B) = P(A) + P(B) = 0.6 + 0.35 = 0.95 . Note that the probability that he does not choose to go
anywhere on vacation must be 0.05.
Example 3.15
Carlos plays college soccer. He makes a goal 65% of the time he shoots. Carlos is going to attempt two goals in a
row in the next game. A = the event Carlos is successful on his first attempt. P(A) = 0.65. B = the event Carlos is
successful on his second attempt. P(B) = 0.65. Carlos tends to shoot in streaks. The probability that he makes the
second goal given that he made the first goal is 0.90.
Solution 3.15
a. The problem is asking you to find P(A ∩ B) = P(B ∩ A) . Since P(B | A) = 0.90: P(B ∩ A) = P(B | A)
P(A) = (0.90)(0.65) = 0.585
Carlos makes the first and second goals with probability 0.585.
b. What is the probability that Carlos makes either the first goal or the second goal?
Solution 3.15
b. The problem is asking you to find P(A ∪ B).
Solution 3.15
c. No, they are not, because P(B ∩ A) = 0.585.
P(B)P(A) = (0.65)(0.65) = 0.423
0.423 ≠ 0.585 = P(B ∩ A)
So, P(B ∩ A) is not equal to P(B)P(A).
Solution 3.15
d. No, they are not because P(A ∩ B) = 0.585.
To be mutually exclusive, P(A ∩ B) must equal zero.
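The four parts of Example 3.15 can be worked through in one short sketch. It assumes the general addition rule P(A ∪ B) = P(A) + P(B) − P(A ∩ B) for part b; the variable names are chosen here for illustration.

```python
from fractions import Fraction

p_a = Fraction("0.65")          # P(A): Carlos makes the first goal
p_b = Fraction("0.65")          # P(B): Carlos makes the second goal
p_b_given_a = Fraction("0.90")  # P(B | A)

# a. Multiplication rule: P(A ∩ B) = P(B | A) P(A)
p_a_and_b = p_b_given_a * p_a

# b. Addition rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
p_a_or_b = p_a + p_b - p_a_and_b

# c. Independent only if P(A ∩ B) = P(A)P(B)
independent = (p_a_and_b == p_a * p_b)

# d. Mutually exclusive only if P(A ∩ B) = 0
mutually_exclusive = (p_a_and_b == 0)

print(float(p_a_and_b), float(p_a_or_b), independent, mutually_exclusive)
```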
3.15 Helen plays basketball. For free throws, she makes the shot 75% of the time. Helen must now attempt two free
throws. C = the event that Helen makes the first shot. P(C) = 0.75. D = the event Helen makes the second shot. P(D)
= 0.75. The probability that Helen makes the second free throw given that she made the first is 0.85. What is the
probability that Helen makes both free throws?
Example 3.16
A community swim team has 150 members. Seventy-five of the members are advanced swimmers. Forty-
seven of the members are intermediate swimmers. The remainder are novice swimmers. Forty of the advanced
swimmers practice four times a week. Thirty of the intermediate swimmers practice four times a week. Ten of
the novice swimmers practice four times a week. Suppose one member of the swim team is chosen randomly.
Solution 3.16
a. 28/150
b. What is the probability that the member practices four times a week?
Solution 3.16
b. 80/150
c. What is the probability that the member is an advanced swimmer and practices four times a week?
Solution 3.16
c. 40/150
d. What is the probability that a member is an advanced swimmer and an intermediate swimmer? Are being an
advanced swimmer and an intermediate swimmer mutually exclusive? Why or why not?
Solution 3.16
d. P(advanced ∩ intermediate) = 0, so these are mutually exclusive events. A swimmer cannot be an advanced
swimmer and an intermediate swimmer at the same time.
e. Are being a novice swimmer and practicing four times a week independent events? Why or why not?
Solution 3.16
e. No, these are not independent events.
P(novice ∩ practices four times per week) = 0.0667
P(novice)P(practices four times per week) = 0.0996
0.0667 ≠ 0.0996
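The counts in Example 3.16 can be turned into exact probabilities with Python's `fractions` module. The sketch below uses dictionary keys and variable names chosen for illustration; the numbers themselves come from the example.

```python
from fractions import Fraction

total = 150
members = {"advanced": 75, "intermediate": 47}
# "The remainder are novice swimmers."
members["novice"] = total - sum(members.values())

# Members at each level who practice four times a week
four_times = {"advanced": 40, "intermediate": 30, "novice": 10}

p_novice = Fraction(members["novice"], total)
p_four = Fraction(sum(four_times.values()), total)
p_adv_and_four = Fraction(four_times["advanced"], total)
p_novice_and_four = Fraction(four_times["novice"], total)

# Independence test: does P(novice ∩ four times) equal P(novice) P(four times)?
independent = p_novice_and_four == p_novice * p_four

print(p_novice, p_four, p_adv_and_four)
print(float(p_novice_and_four), float(p_novice * p_four), independent)
```

Fractions are printed in lowest terms, so 28/150 appears as 14/75. The two decimal values on the last line round to the 0.0667 and 0.0996 given in part e.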
3.16 A school has 200 seniors of whom 140 will be going to college next year. Forty will be going directly to work.
The remainder are taking a gap year. Fifty of the seniors going to college play sports. Thirty of the seniors going
directly to work play sports. Five of the seniors taking a gap year play sports. What is the probability that a senior is
taking a gap year?
Example 3.17
Felicity attends Modesto JC in Modesto, CA. The probability that Felicity enrolls in a math class is 0.2 and the
probability that she enrolls in a speech class is 0.65. The probability that she enrolls in a math class given that she
enrolls in a speech class is 0.25.
Let: M = math class, S = speech class, M | S = math given speech
a. What is the probability that Felicity enrolls in math and speech?
Find P(M ∩ S) = P(M | S)P(S).
b. What is the probability that Felicity enrolls in math or speech classes?
Find P(M ∪ S) = P(M) + P(S) - P(M ∩ S).
c. Are M and S independent? Is P(M | S) = P(M)?
d. Are M and S mutually exclusive? Is P(M ∩ S) = 0?
Solution 3.17
a. 0.1625, b. 0.6875, c. No, d. No
3.17 A student goes to the library. Let events B = the student checks out a book and D = the student checks out a DVD.
Example 3.18
Studies show that about one woman in seven (approximately 14.3%) who live to be 90 will develop breast cancer.
Suppose that of those women who develop breast cancer, a test is negative 2% of the time. Also suppose that in
the general population of women, the test for breast cancer is negative about 85% of the time. Let B = woman
develops breast cancer and let N = tests negative. Suppose one woman is selected at random.
a. What is the probability that the woman develops breast cancer? What is the probability that woman tests
negative?
Solution 3.18
a. P(B) = 0.143; P(N) = 0.85
b. Given that the woman has breast cancer, what is the probability that she tests negative?
Solution 3.18
b. P(N | B) = 0.02
c. What is the probability that the woman has breast cancer AND tests negative?
Solution 3.18
c. P(B ∩ N) = P(B)P(N | B) = (0.143)(0.02) = 0.0029
d. What is the probability that the woman has breast cancer or tests negative?
Solution 3.18
d. P(B ∪ N) = P(B) + P(N) - P(B ∩ N) = 0.143 + 0.85 - 0.0029 = 0.9901
Solution 3.18
e. No. P(N) = 0.85; P(N | B) = 0.02. So, P(N | B) does not equal P(N).
Solution 3.18
f. No. P(B ∩ N) = 0.0029. For B and N to be mutually exclusive, P(B ∩ N) must be zero.
3.18 A school has 200 seniors of whom 140 will be going to college next year. Forty will be going directly to work.
The remainder are taking a gap year. Fifty of the seniors going to college play sports. Thirty of the seniors going
directly to work play sports. Five of the seniors taking a gap year play sports. What is the probability that a senior is
going to college and plays sports?
Example 3.19
Solution 3.19
a. 0.98; b. 0.1401; c. 0.857; d. 0.15
3.19 A student goes to the library. Let events B = the student checks out a book and D = the student checks out a
DVD. Suppose that P(B) = 0.40, P(D) = 0.30 and P(D | B) = 0.5.
a. Find P(B′).
b. Find P(D ∩ B).
c. Find P(B | D).
d. Find P(D ∩ B′).
e. Find P(D | B′).
Example 3.20
Suppose a study of speeding violations and drivers who use cell phones produced the following fictional data:
Table 3.2
The total number of people in the sample is 755. The row totals are 305 and 450. The column totals are 70 and
685. Notice that 305 + 450 = 755 and 70 + 685 = 755.
Calculate the following probabilities using the table.
Solution 3.20
a. (number of cell phone users)/(total number in study) = 305/755
Solution 3.20
b. (number that had no violation)/(total number in study) = 685/755
c. Find P(Driver had no violation in the last year ∩ was a cell phone user).
Solution 3.20
c. 280/755
d. Find P(Driver is a cell phone user ∪ driver had no violation in the last year).
Solution 3.20
d. (305/755 + 685/755) − 280/755 = 710/755
e. Find P(Driver is a cell phone user | driver had a violation in the last year).
Solution 3.20
e. 25/70 (The sample space is reduced to the number of drivers who had a violation.)
f. Find P(Driver had no violation last year | driver was not a cell phone user)
Solution 3.20
f. 405/450 (The sample space is reduced to the number of drivers who were not cell phone users.)
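The solutions to Example 3.20 can be reproduced from a small contingency table stored in code. In the sketch below, the four cell counts are an assumption: they are inferred from the row totals, column totals, and fractions quoted in the solutions. The helper `count` and the labels are hypothetical names used only for illustration.

```python
from fractions import Fraction

# Cell counts inferred from the totals and solutions of Example 3.20
# (assumption: they match the entries of Table 3.2).
counts = {
    ("cell", "violation"): 25,
    ("cell", "no_violation"): 280,
    ("no_cell", "violation"): 45,
    ("no_cell", "no_violation"): 405,
}
total = sum(counts.values())

def count(phone=None, record=None):
    """Number of drivers matching the given row and/or column label."""
    return sum(n for (p, r), n in counts.items()
               if phone in (None, p) and record in (None, r))

p_cell = Fraction(count(phone="cell"), total)
p_no_violation = Fraction(count(record="no_violation"), total)
p_both = Fraction(count("cell", "no_violation"), total)
p_either = p_cell + p_no_violation - p_both  # addition rule

# Conditional probabilities reduce the sample space to one column or row.
p_cell_given_violation = Fraction(count("cell", "violation"),
                                  count(record="violation"))
p_no_violation_given_no_cell = Fraction(count("no_cell", "no_violation"),
                                        count(phone="no_cell"))

print(total, p_either, p_cell_given_violation, p_no_violation_given_no_cell)
```

Fractions are printed in lowest terms, so 710/755 appears as 142/151, 25/70 as 5/14, and 405/450 as 9/10.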
3.20 Table 3.3 shows the number of athletes who stretch before exercising and how many had injuries within the
past year.
Table 3.3
Example 3.21
Table 3.4 shows a random sample of 100 hikers and the areas of hiking they prefer.
Sex The Coastline Near Lakes and Streams On Mountain Peaks Total
Female 18 16 ___ 45
Male ___ ___ 14 55
Total ___ 41 ___ ___
Solution 3.21
a.
Sex The Coastline Near Lakes and Streams On Mountain Peaks Total
Female 18 16 11 45
Male 16 25 14 55
Total 34 41 25 100
b. Are the events "being female" and "preferring the coastline" independent events?
Let F = being female and let C = preferring the coastline.
1. Find P(F ∩ C) .
2. Find P(F)P(C)
Are these two numbers the same? If they are, then F and C are independent. If they are not, then F and C are not
independent.
Solution 3.21
b.
1. P(F ∩ C) = 18/100 = 0.18
2. P(F)P(C) = (45/100)(34/100) = (0.45)(0.34) = 0.153
P(F ∩ C) ≠ P(F)P(C), so the events F and C are not independent.
c. Find the probability that a person is male given that the person prefers hiking near lakes and streams. Let M =
being male, and let L = prefers hiking near lakes and streams.
1. What word tells you this is a conditional?
2. Fill in the blanks and calculate the probability: P(___ | ___) = ___.
3. Is the sample space for this problem all 100 hikers? If not, what is it?
Solution 3.21
c.
1. The word 'given' tells you that this is a conditional.
2. P(M | L) = 25/41
3. No, the sample space for this problem is the 41 hikers who prefer lakes and streams.
d. Find the probability that a person is female or prefers hiking on mountain peaks. Let F = being female, and let
P = prefers mountain peaks.
1. Find P(F).
2. Find P(P).
3. Find P(F ∩ P) .
4. Find P(F ∪ P) .
Solution 3.21
d.
1. P(F) = 45/100
2. P(P) = 25/100
3. P(F ∩ P) = 11/100
4. P(F ∪ P) = 45/100 + 25/100 - 11/100 = 59/100
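The completed hiker table from Solution 3.21 lends itself to a quick check in code. The following sketch stores the six interior counts and derives the totals. The helper names `row_total` and `col_total` are illustrative assumptions.

```python
from fractions import Fraction

# Interior counts from the completed table in Solution 3.21
table = {
    "female": {"coastline": 18, "lakes": 16, "peaks": 11},
    "male":   {"coastline": 16, "lakes": 25, "peaks": 14},
}
total = sum(sum(row.values()) for row in table.values())

def row_total(sex):
    return sum(table[sex].values())

def col_total(area):
    return sum(row[area] for row in table.values())

# Part b: compare P(F ∩ C) with P(F) P(C)
p_f = Fraction(row_total("female"), total)
p_c = Fraction(col_total("coastline"), total)
p_f_and_c = Fraction(table["female"]["coastline"], total)
independent = p_f_and_c == p_f * p_c

# Part c: the sample space shrinks to the hikers who prefer lakes and streams
p_m_given_l = Fraction(table["male"]["lakes"], col_total("lakes"))

# Part d: addition rule for "female or prefers mountain peaks"
p_peaks = Fraction(col_total("peaks"), total)
p_f_and_peaks = Fraction(table["female"]["peaks"], total)
p_f_or_peaks = p_f + p_peaks - p_f_and_peaks

print(independent, p_m_given_l, p_f_or_peaks)
```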
3.21 Table 3.6 shows a random sample of 200 cyclists and the routes they prefer. Let M = males and H = hilly path.
Table 3.6
a. Out of the males, what is the probability that the cyclist prefers a hilly path?
b. Are the events “being male” and “preferring the hilly path” independent events?
Example 3.22
Muddy Mouse lives in a cage with three doors. If Muddy goes out the first door, the probability that he gets caught
by Alissa the cat is 1/5 and the probability he is not caught is 4/5. If he goes out the second door, the probability he
gets caught by Alissa is 1/4 and the probability he is not caught is 3/4. The probability that Alissa catches Muddy
coming out of the third door is 1/2 and the probability she does not catch Muddy is 1/2. It is equally likely that
Muddy will choose any of the three doors so the probability of choosing each door is 1/3.
Not Caught 4/15 3/12 1/6 ____
• The first entry 1/15 = (1/5)(1/3) is P(Door One ∩ Caught)
• The entry 4/15 = (4/5)(1/3) is P(Door One ∩ Not Caught)
Verify the remaining entries.
a. Complete the probability contingency table. Calculate the entries for the totals. Verify that the lower-right
corner entry is 1.
Solution 3.22
a.
Solution 3.22
b. 41/60
c. What is the probability that Muddy chooses Door One ∪ Door Two given that Muddy is caught by Alissa?
Solution 3.22
c. 9/19
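The entries of the probability contingency table in Example 3.22 come from multiplying each conditional probability by the probability of choosing that door. A minimal sketch of that calculation, with variable names chosen for illustration, is:

```python
from fractions import Fraction

p_door = Fraction(1, 3)  # each door is equally likely
# P(Caught | Door) as given in the example
p_caught_given_door = {1: Fraction(1, 5), 2: Fraction(1, 4), 3: Fraction(1, 2)}

# Joint probabilities: P(Door ∩ Caught) = P(Caught | Door) P(Door)
joint_caught = {d: p * p_door for d, p in p_caught_given_door.items()}
joint_not_caught = {d: (1 - p) * p_door for d, p in p_caught_given_door.items()}

# Row totals of the table
p_caught = sum(joint_caught.values())
p_not_caught = sum(joint_not_caught.values())

# P(Door One ∪ Door Two | Caught): restrict the sample space to "Caught"
p_door_1_or_2_given_caught = (joint_caught[1] + joint_caught[2]) / p_caught

print(p_caught, p_not_caught, p_caught + p_not_caught, p_door_1_or_2_given_caught)
```

The two row totals add to 1, which is the lower-right corner entry that part a asks you to verify.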
Example 3.23
Table 3.9 contains the number of crimes per 100,000 inhabitants from 2008 to 2011 in the U.S.
Solution 3.23
a. 0.0294, b. 0.1551, c. 0.7165, d. 0.2365, e. 0.2575
3.23 Table 3.10 relates the weights and heights of a group of individuals participating in an observational study.
Table 3.10
Tree Diagrams
Sometimes, when the probability problems are complex, it can be helpful to graph the situation. Tree diagrams can be used
to visualize and solve conditional probabilities.
Tree Diagrams
A tree diagram is a special type of graph used to determine the outcomes of an experiment. It consists of "branches" that
are labeled with either frequencies or probabilities. Tree diagrams can make some probability problems easier to visualize
and solve. The following example illustrates how to use a tree diagram.
Example 3.24
In an urn, there are 11 balls. Three balls are red (R) and eight balls are blue (B). Draw two balls, one at a time,
with replacement. "With replacement" means that you put the first ball back in the urn before you select the
second ball. The tree diagram using frequencies that shows all the possible outcomes follows.
The first set of branches represents the first draw. The second set of branches represents the second draw. Each of
the outcomes is distinct. In fact, we can list each red ball as R1, R2, and R3 and each blue ball as B1, B2, B3, B4,
B5, B6, B7, and B8. Then the nine RR outcomes can be written as:
R1R1; R1R2; R1R3; R2R1; R2R2; R2R3; R3R1; R3R2; R3R3
The other outcomes are similar.
There are a total of 11 balls in the urn. Draw two balls, one at a time, with replacement. There are 11(11) = 121
outcomes, the size of the sample space.
Solution 3.24
a. B1R1; B1R2; B1R3; B2R1; B2R2; B2R3; B3R1; B3R2; B3R3; B4R1; B4R2; B4R3; B5R1; B5R2; B5R3; B6R1;
B6R2; B6R3; B7R1; B7R2; B7R3; B8R1; B8R2; B8R3
Solution 3.24
b. P(RR) = (3/11)(3/11) = 9/121
Solution 3.24
c. P(RB ∪ BR) = (3/11)(8/11) + (8/11)(3/11) = 48/121
d. Using the tree diagram, calculate P(R on 1st draw ∩ B on 2nd draw) .
Solution 3.24
d. P(R on 1st draw ∩ B on 2nd draw) = (3/11)(8/11) = 24/121
e. Using the tree diagram, calculate P(R on 2nd draw | B on 1st draw).
Solution 3.24
Solution 3.24
f. P(BB) = 64/121
g. Using the tree diagram, calculate P(B on the 2nd draw | R on the first draw).
Solution 3.24
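Because the sample space in Example 3.24 has only 121 outcomes, it can be listed in full and counted directly. The sketch below does this with `itertools.product`. The helper names `colors` and `prob` are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# Label the balls as in the example: R1..R3 and B1..B8.
balls = [f"R{i}" for i in range(1, 4)] + [f"B{i}" for i in range(1, 9)]

# With replacement, every ordered pair (first, second) is possible.
outcomes = list(product(balls, repeat=2))

def colors(outcome):
    """Reduce an outcome such as ('B2', 'R1') to its color pattern 'BR'."""
    return outcome[0][0] + outcome[1][0]

def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

p_rr = prob(lambda o: colors(o) == "RR")
p_rb_or_br = prob(lambda o: colors(o) in ("RB", "BR"))
p_rb = prob(lambda o: colors(o) == "RB")
p_bb = prob(lambda o: colors(o) == "BB")

# Conditional probability: P(R on 2nd | B on 1st) = P(BR) / P(B on 1st)
p_b_first = prob(lambda o: colors(o)[0] == "B")
p_r2_given_b1 = prob(lambda o: colors(o) == "BR") / p_b_first

print(len(outcomes), p_rr, p_rb_or_br, p_rb, p_bb, p_r2_given_b1)
```

With replacement, the conditional probability of red on the second draw equals 3/11, the same as the unconditional probability of red, because the first draw does not change the contents of the urn.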
3.24 In a standard deck, there are 52 cards. 12 cards are face cards (event F) and 40 cards are not face cards (event N).
Draw two cards, one at a time, with replacement. All possible outcomes are shown in the tree diagram as frequencies.
Using the tree diagram, calculate P(FF).
Figure 3.3
Example 3.25
An urn has three red marbles and eight blue marbles in it. Draw two marbles, one at a time, this time without
replacement, from the urn. "Without replacement" means that you do not put the first marble back before you
select the second marble. Following is a tree diagram for this situation. The branches are labeled with probabilities
instead of frequencies. The numbers at the ends of the branches are calculated by multiplying the numbers on the
two corresponding branches, for example, (3/11)(2/10) = 6/110.
NOTE
If you draw a red on the first draw from the three red possibilities, there are two red marbles left to draw on
the second draw. You do not put back or replace the first marble after you have drawn it. You draw without
replacement, so that on the second draw there are ten marbles left in the urn.
a. P(RR) = ________
Solution 3.25
a. P(RR) = (3/11)(2/10) = 6/110
Solution 3.25
b. P(RB ∪ BR) = (3/11)(8/10) + (8/11)(3/10) = 48/110
Solution 3.25
c. P(R on 2nd | B on 1st) = 3/10
Solution 3.25
d. P(R on 1st ∩ B on 2nd) = (3/11)(8/10) = 24/110
e. Find P(BB).
Solution 3.25
e. P(BB) = (8/11)(7/10)
Solution 3.25
f. Using the tree diagram, P(B on 2nd | R on 1st) = P(B | R) = 8/10.
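The tree for Example 3.25 can also be built in code by multiplying along the branches, just as the tree diagram does. This is a sketch under the assumption of two draws and two colors. The function name `draw_two` is hypothetical.

```python
from fractions import Fraction

def draw_two(red, blue):
    """Return {(first, second): probability} for two draws without replacement."""
    tree = {}
    for first, n_first in (("R", red), ("B", blue)):
        p_first = Fraction(n_first, red + blue)
        # One marble of the first color is gone before the second draw.
        remaining = {"R": red, "B": blue}
        remaining[first] -= 1
        for second, n_second in remaining.items():
            p_second_given_first = Fraction(n_second, red + blue - 1)
            tree[(first, second)] = p_first * p_second_given_first
    return tree

tree = draw_two(red=3, blue=8)

p_rr = tree[("R", "R")]
p_rb_or_br = tree[("R", "B")] + tree[("B", "R")]
p_bb = tree[("B", "B")]

# Conditional probability read from the tree: P(B on 2nd | R on 1st)
p_r_first = tree[("R", "R")] + tree[("R", "B")]
p_b2_given_r1 = tree[("R", "B")] / p_r_first

print(p_rr, p_rb_or_br, p_bb, p_b2_given_r1, sum(tree.values()))
```

The four branch-end probabilities sum to 1, a useful check that no branch of the tree has been left out.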
If we are using probabilities, we can label the tree in the following general way.
3.25 In a standard deck, there are 52 cards. Twelve cards are face cards (F) and 40 cards are not face cards (N). Draw
two cards, one at a time, without replacement. The tree diagram is labeled with all possible probabilities.
Figure 3.5
Example 3.26
A litter of kittens available for adoption at the Humane Society has four tabby kittens and five black kittens. A
family comes in and randomly selects two kittens (without replacement) for adoption.
a. (1/2)(1/2) b. (4/9)(4/9) c. (4/9)(3/8) d. (4/9)(5/9)
a. (4/9)(5/9) b. (4/9)(5/8) c. (4/9)(5/9) + (5/9)(4/9) d. (4/9)(5/8) + (5/9)(4/8)
c. What is the probability that a tabby is chosen as the second kitten when a black kitten was chosen as the
first?
d. What is the probability of choosing two kittens of the same color?
Solution 3.26
a. c, b. d, c. 4/8, d. 32/72
3.26 Suppose there are four red balls and three yellow balls in a box. Two balls are drawn from the box without
replacement. What is the probability that one ball of each coloring is selected?
Example 3.27
Suppose an experiment has the outcomes 1, 2, 3, ... , 12 where each outcome has an equal chance of occurring.
Let event A = {1, 2, 3, 4, 5, 6} and event B = {6, 7, 8, 9}. Then A intersect B = A ∩ B = {6} and A union B =
A ∪ B = {1, 2, 3, 4, 5, 6, 7, 8, 9}. The Venn diagram is as follows:
Figure 3.6
Figure 3.6 shows the most basic relationship among these numbers. First, the numbers are in groups called sets: set A
and set B. Some numbers are in both sets; we say they are in set A and in set B. The English word "and" is used here in
the sense of having the characteristics of both A and B, or in this case, being a part of both A and B. This condition is
called the INTERSECTION of the two sets. All members that are part of both sets constitute the intersection of the two sets.
The intersection is written as A ∩ B where ∩ is the mathematical symbol for intersection. The statement A ∩ B is read as
"A intersect B." You can remember this by thinking of the intersection of two streets.
There are also numbers that belong to at least one of the two sets. Such a number does not have to be in BOTH sets;
it only has to be in one of them. These numbers are called the UNION of the two sets, and in this case they are the
numbers 1-5 (from set A exclusively), 7-9 (from set B exclusively), and also 6, which is in both sets A and B. The symbol
for the UNION is ∪ , thus A ∪ B = the numbers 1-9, which excludes the numbers 10, 11, and 12. The values 10, 11, and 12
are part of the universe, but are not in either of the two sets.
Translating the English word "AND" into the mathematical logic symbol ∩ , intersection, and the word "OR" into the
mathematical symbol ∪ , union, provides a very precise way to discuss the issues of probability and logic. The general
terminology for the three areas of the Venn diagram in Figure 3.6 is shown in Figure 3.7.
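The translation of "and" into intersection and "or" into union maps directly onto set operations in many programming languages. As an illustration, the following Python sketch rebuilds the sets of Example 3.27. The helper `prob` is a hypothetical name and relies on the stated assumption that all 12 outcomes are equally likely.

```python
from fractions import Fraction

sample_space = set(range(1, 13))   # outcomes 1, 2, ..., 12
A = {1, 2, 3, 4, 5, 6}
B = {6, 7, 8, 9}

intersection = A & B               # outcomes in A AND in B
union = A | B                      # outcomes in A OR in B (or both)
neither = sample_space - union     # in the universe but in neither set

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

print(sorted(intersection), sorted(union), sorted(neither))
# The addition rule holds: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
print(prob(union) == prob(A) + prob(B) - prob(intersection))
```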
3.27 Suppose an experiment has outcomes black, white, red, orange, yellow, green, blue, and purple, where each
outcome has an equal chance of occurring. Let event C = {green, blue, purple} and event P = {red, yellow, blue}.
Then C ∩ P = {blue} and C ∪ P = {green, blue, purple, red, yellow} . Draw a Venn diagram representing this
situation.
Example 3.28
Flip two fair coins. Let A = tails on the first coin. Let B = tails on the second coin. Then A = {TT, TH} and B =
{TT, HT}. Therefore, A ∩ B = {TT} . A ∪ B = {TH, TT, HT} .
The sample space when you flip two fair coins is X = {HH, HT, TH, TT}. The outcome HH is in NEITHER A
NOR B. The Venn diagram is as follows:
Figure 3.7
3.28 Roll a fair, six-sided die. Let A = a prime number of dots is rolled. Let B = an odd number of dots is rolled. Then
A = {2, 3, 5} and B = {1, 3, 5}. Therefore, A ∩ B = {3, 5} . A ∪ B = {1, 2, 3, 5} . The sample space for rolling a
fair die is S = {1, 2, 3, 4, 5, 6}. Draw a Venn diagram representing this situation.
Example 3.29
A person with type O blood and a negative Rh factor (Rh-) can donate blood to any person with any blood
type. Four percent of African Americans have type O blood and a negative Rh factor, 5−10% of African
Americans have the Rh- factor, and 51% have type O blood.
Figure 3.8
The “O” circle represents the African Americans with type O blood. The “Rh-” oval represents the African
Americans with the Rh- factor.
We will take the average of 5% and 10% and use 7.5% as the percent of African Americans who have the
Rh- factor. Let O = African American with Type O blood and R = African American with Rh- factor.
a. P(O) = ___________
b. P(R) = ___________
c. P(O ∩ R) = ___________
d. P(O ∪ R) = ____________
e. In the Venn Diagram, describe the overlapping area using a complete sentence.
f. In the Venn Diagram, describe the area in the rectangle but outside both the circle and the oval using a
complete sentence.
Example 3.30
Forty percent of the students at a local college belong to a club and 50% work part time. Five percent of the
students work part time and belong to a club. Draw a Venn diagram showing the relationships. Let C = student
belongs to a club and PT = student works part time.
Figure 3.9
• the probability that the student belongs to a club given that the student works part time.
P(C | PT) = P(C ∩ PT) / P(PT) = 0.05 / 0.50 = 0.1
• the probability that the student belongs to a club OR works part time.
P(C ∪ PT) = P(C) + P(PT) - P(C ∩ PT) = 0.40 + 0.50 - 0.05 = 0.85
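The four regions of the Venn diagram in Example 3.30 can be computed from the three given percentages. The sketch below uses exact fractions. The region labels in the dictionary are descriptive names chosen here for illustration.

```python
from fractions import Fraction

p_c = Fraction(40, 100)        # P(C): belongs to a club
p_pt = Fraction(50, 100)       # P(PT): works part time
p_c_and_pt = Fraction(5, 100)  # P(C ∩ PT)

# Conditional probability: P(C | PT) = P(C ∩ PT) / P(PT)
p_c_given_pt = p_c_and_pt / p_pt

# Addition rule: P(C ∪ PT) = P(C) + P(PT) - P(C ∩ PT)
p_c_or_pt = p_c + p_pt - p_c_and_pt

# The four non-overlapping regions of the Venn diagram
regions = {
    "club only": p_c - p_c_and_pt,
    "part time only": p_pt - p_c_and_pt,
    "both": p_c_and_pt,
    "neither": 1 - p_c_or_pt,
}

print(float(p_c_given_pt), float(p_c_or_pt))
print({name: float(p) for name, p in regions.items()})
```

The four regions do not overlap and together cover the whole rectangle, so their probabilities must add to 1.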
In order to solve Example 3.30 we had to draw upon the concept of conditional probability from the previous section.
There we used tree diagrams to track the changes in the probabilities, because the sample space changed as we drew
without replacement. In short, conditional probability is the chance that something will happen given that some other event
has already happened. Put another way, it is the probability that something will happen conditioned upon something else
also being true. In Example 3.30 the probability P(C | PT) is the conditional probability that the randomly
drawn student is a member of a club, conditioned upon the fact that the student also works part time. This allows us
to see the relationship between Venn diagrams and the probability postulates.
3.30 Fifty percent of the workers at a factory work a second job, 25% have a spouse who also works, 5% work a
second job and have a spouse who also works. Draw a Venn diagram showing the relationships. Let W = works a
second job and S = spouse also works.
3.30 In a bookstore, the probability that the customer buys a novel is 0.6, and the probability that the customer buys
a non-fiction book is 0.4. Suppose that the probability that the customer buys both is 0.2.
a. Draw a Venn diagram representing the situation.
b. Find the probability that the customer buys either a novel or a non-fiction book.
c. In the Venn diagram, describe the overlapping area using a complete sentence.
d. Suppose that some customers buy only compact disks. Draw an oval in your Venn diagram representing this
event.
Example 3.31
A set of 20 German Shepherd dogs is observed. 12 are male, 8 are female, 10 have some brown coloring, and 5
have some white sections of fur. Answer the following using Venn Diagrams.
Draw a Venn diagram simply showing the sets of male and female dogs.
Solution 3.31
The Venn diagram below demonstrates the situation of mutually exclusive events. Because a dog cannot be both male
and female, there is no intersection. Being male precludes being female and being female precludes being male: in
this case, the two categories of the characteristic gender are therefore mutually exclusive. A Venn diagram shows
this as two sets with no intersection. The intersection is said to be the null set, written with the mathematical
symbol ∅.
Figure 3.10
Draw a second Venn diagram illustrating that 10 of the male dogs have brown coloring.
Solution 3.31
The Venn diagram below shows the overlap between male and brown where the number 10 is placed in it.
This represents Male ∩ Brown : both male and brown. This is the intersection of these two characteristics. To
get the union of Male and Brown, add the two circled areas and subtract the overlap. In proper terms, the number of
dogs in Male ∪ Brown = number in Male + number in Brown − number in Male ∩ Brown. If we did not subtract the
intersection, we would have double counted some of the dogs.
Figure 3.11
Now draw a situation depicting a scenario in which the non-shaded region represents "No white fur and female,"
or White fur′ ∩ Female. The prime on "White fur" indicates "not white fur." The prime on a set means not in
that set, e.g. A′ means not A. Sometimes, the notation used is a line above the letter. For example, Ā = A′ .
Solution 3.31
Figure 3.12
Remember that probability is simply the proportion of the objects we are interested in relative to the total number of objects.
This is why we can see the usefulness of the Venn diagrams. Example 3.31 shows how we can use Venn diagrams to count
the number of dogs in the union of brown and male by reminding us to subtract the intersection of brown and male. We can
see the effect of this directly on probabilities in the addition rule.
Example 3.32
Let's sample 50 students who are in a statistics class. 20 are freshmen and 30 are sophomores. 15 students get a
"B" in the course, and 5 students both get a "B" and are freshmen.
Find the probability of selecting a student who either earns a "B" OR is a freshman. We are translating the word
OR to the mathematical symbol for the addition rule, which is the union of the two sets.
Solution 3.32
We know that there are 50 students in our sample, so we know the denominator of our fraction to give us
probability. We need only to find the number of students that meet the characteristics we are interested in, i.e. any
freshman and any student who earned a grade of "B." With the Addition Rule of probability, we can skip directly
to probabilities.
Let "A" = the event that the student is a freshman, and let "B" = the event that the student earns a grade of "B."
Below we can see the process for using Venn diagrams to solve this.
P(A) = 20/50 = 0.40 , P(B) = 15/50 = 0.30 , and P(A ∩ B) = 5/50 = 0.10 .
Therefore, P(A ∪ B) = 0.40 + 0.30 − 0.10 = 0.60 .
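The reason for subtracting the intersection can be seen by giving each of the 50 students an ID number and forming the sets explicitly. The assignment of IDs below is purely hypothetical. Only the counts (20 freshmen, 15 students with a "B", 5 in both groups) come from Example 3.32.

```python
from fractions import Fraction

n_students = 50

# Hypothetical ID assignment: students 0-19 are freshmen, and the 15 students
# who earned a "B" are numbers 15-29, so exactly five of them are freshmen.
freshmen = set(range(0, 20))
b_students = set(range(15, 30))

both = freshmen & b_students
either = freshmen | b_students

# Adding the two counts counts the five overlapping students twice.
naive_count = len(freshmen) + len(b_students)
correct_count = len(freshmen) + len(b_students) - len(both)

p_either = Fraction(len(either), n_students)

print(len(both), naive_count, correct_count, float(p_either))
```

The size of the union matches the corrected count, and dividing by 50 gives the 0.60 found with the addition rule.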
Figure 3.13
If two events are mutually exclusive, then, like the example where we diagram the male and female dogs, the
addition rule is simplified to just P(A ∪ B) = P(A) + P(B) − 0 . This is true because, as we saw earlier, the
intersection of mutually exclusive events is the null set, ∅. The diagrams below demonstrate this.
Figure 3.14
P(A|B) = 0.10/0.30 = 1/3
Figure 3.15
The multiplication rule must also be altered if the two events are independent. Independent events are defined as a situation
where the conditional probability is simply the probability of the event of interest. Formally, independence of events is
defined as P(A|B) = P(A) or P(B| A) = P(B) . When flipping coins, the outcome of the second flip is independent of the
outcome of the first flip; coins do not have memory. The Multiplication Rule of Probability for independent events thus
becomes:
P(A ∩ B) = P(A) ⋅ P(B)
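The coin-flip claim can be verified by listing the four equally likely outcomes of two flips. In the sketch below, A and B are the events "tails on the first flip" and "tails on the second flip". The helper `prob` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips of a fair coin: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

A = {o for o in outcomes if o[0] == "T"}   # tails on the first flip
B = {o for o in outcomes if o[1] == "T"}   # tails on the second flip

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

# Conditional probability from its definition: P(A | B) = P(A ∩ B) / P(B)
p_a_given_b = prob(A & B) / prob(B)

print(p_a_given_b == prob(A))              # the definition of independence
print(prob(A & B) == prob(A) * prob(B))    # the multiplication rule for independent events
```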
One easy way to remember this is to consider what we mean by the word "and." We see that the Multiplication Rule has
translated the word "and" to the Venn notation for intersection. Therefore, the outcome must meet the two conditions of
freshman and grade of "B" in the above example. It is harder, that is, less probable, to meet two conditions than to meet
just one of them. We can see the logic of the Multiplication Rule in the fact that the product of two fractions between
zero and one is smaller than either fraction.
The Rules of Probability developed with the help of Venn diagrams are also useful when we wish to calculate
probabilities from data arranged in a contingency table.
Example 3.33
Table 3.11 is from a sample of 200 people who were asked how much education they completed. The columns
represent the highest education they completed, and the rows separate the individuals by male and female.
Less than High School Grad High School Grad Some College College Grad Total
Male 5 15 40 60 120
Female 8 12 30 30 80
Total 13 27 70 90 200
Table 3.11
Now, we can use this table to answer probability questions. The following examples are designed to help
understand the format above while connecting the knowledge to both Venn diagrams and the probability rules.
What is the probability that a selected person both finished college and is female?
Solution 3.33
This is a simple task of finding the value where the two characteristics intersect on the table, and then applying
the postulate of probability, which states that the probability of an event is the number of outcomes that match
the event in which we are interested as a proportion of all possible outcomes. Here,
P(College Grad ∩ Female) = 30/200 = 0.15.
Solution 3.33
This task involves the use of the addition rule to solve for this probability.
P(College Grad ∪ Female) = P(F) + P(CG) − P(F ∩ CG)
Solution 3.33
Here we must use the conditional probability rule (the modified multiplication rule) to solve for this probability.
P(HS Grad | Male) = P(HS Grad ∩ Male) / P(Male) = (15/200) / (120/200) = 15/120 = 0.125
Can we conclude that the level of education attained by these 200 people is independent of the gender of the
person?
Solution 3.33
There are two ways to approach this test. The first method tests whether the probability of the intersection of two
events equals the product of their separate probabilities, remembering that if two events are independent then
P(A) ⋅ P(B) = P(A ∩ B). For simplicity's sake, we can use calculated values from above.
Does P(College Grad ∩ Female) = P(CG) ⋅ P(F)?
30/200 ≠ (90/200) ⋅ (80/200) because 0.15 ≠ 0.18.
Therefore, gender and education here are not independent.
The second method is to test if the conditional probability of A given B is equal to the probability of A. Again for
simplicity, we can use an already calculated value from above.
Does P(HS Grad | Male) = P(HS Grad)?
15/120 ≠ 27/200 because 0.125 ≠ 0.135.
Therefore, again gender and education here are not independent.
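Both independence tests in Example 3.33 can be run directly on Table 3.11. The sketch below stores the table as nested dictionaries. The short column keys and the helper functions `p_sex`, `p_level`, and `p_joint` are hypothetical names introduced for illustration.

```python
from fractions import Fraction

levels = ["less_than_hs", "hs_grad", "some_college", "college_grad"]
table = {
    "male":   dict(zip(levels, [5, 15, 40, 60])),
    "female": dict(zip(levels, [8, 12, 30, 30])),
}
total = sum(sum(row.values()) for row in table.values())

def p_sex(sex):
    return Fraction(sum(table[sex].values()), total)

def p_level(level):
    return Fraction(sum(row[level] for row in table.values()), total)

def p_joint(sex, level):
    return Fraction(table[sex][level], total)

p_cg_and_f = p_joint("female", "college_grad")
p_cg_or_f = p_sex("female") + p_level("college_grad") - p_cg_and_f   # addition rule
p_hs_given_m = p_joint("male", "hs_grad") / p_sex("male")            # conditional rule

# First method: does P(CG ∩ F) equal P(CG) P(F)?
product_test = p_cg_and_f == p_level("college_grad") * p_sex("female")
# Second method: does P(HS Grad | Male) equal P(HS Grad)?
conditional_test = p_hs_given_m == p_level("hs_grad")

print(float(p_cg_and_f), float(p_cg_or_f), float(p_hs_given_m))
print(product_test, conditional_test)
```

Both tests fail for this sample, which matches the conclusion reached by each method in the solution.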
KEY TERMS
Conditional Probability the likelihood that an event will occur given that another event has already occurred
Contingency Table the method of displaying a frequency distribution as a table with rows and columns to show how
two variables may be dependent (contingent) upon each other; the table provides an easy way to calculate
conditional probabilities.
Dependent Events If two events are NOT independent, then we say that they are dependent.
Event a subset of the set of all outcomes of an experiment; the set of all outcomes of an experiment is called a sample
space and is usually denoted by S. An event is an arbitrary subset in S. It can contain one outcome, two outcomes,
no outcomes (empty subset), the entire sample space, and the like. Standard notations for events are capital letters
such as A, B, C, and so on.
Independent Events The occurrence of one event has no effect on the probability of the occurrence of another event.
Events A and B are independent if one of the following is true:
1. P(A|B) = P(A)
2. P(B|A) = P(B)
3. P(A ∩ B) = P(A)P(B)
Mutually Exclusive Two events are mutually exclusive if the probability that they both happen at the same time is
zero. If events A and B are mutually exclusive, then P(A ∩ B) = 0.
Probability a number between zero and one, inclusive, that gives the likelihood that a specific event will occur; the
foundation of statistics is given by the following 3 axioms (by A.N. Kolmogorov, 1930’s): Let S denote the sample
space and A and B are two events in S. Then:
• 0 ≤ P(A) ≤ 1
• If A and B are any two mutually exclusive events, then P(A ∪ B) = P(A) + P(B).
• P(S) = 1
Sampling with Replacement If each member of a population is replaced after it is picked, then that member has the
possibility of being chosen more than once.
Sampling without Replacement When sampling is done without replacement, each member of a population may be
chosen only once.
The Complement Event The complement of event A consists of all outcomes that are NOT in A.
The Conditional Probability of A | B P(A | B) is the probability that event A will occur given that the event B has
already occurred.
The Intersection: the ∩ Event An outcome is in the event A ∩ B if the outcome is in both A and B at the same
time.
The Union: the ∪ Event An outcome is in the event A ∪ B if the outcome is in A or is in B or is in both A and B.
Tree Diagram the useful visual representation of a sample space and events in the form of a “tree” with branches
marked by possible outcomes together with associated probabilities (frequencies, relative frequencies)
Venn Diagram the visual representation of a sample space and events in the form of circles or ovals showing their
intersections
CHAPTER REVIEW
3.1 Terminology
In this module we learned the basic terminology of probability. The set of all possible outcomes of an experiment is called
the sample space. Events are subsets of the sample space, and they are assigned a probability that is a number between zero
and one, inclusive.
FORMULA REVIEW
3.3 Two Basic Rules of Probability
The multiplication rule: P(A ∩ B) = P(A | B)P(B)
The addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
PRACTICE
3.1 Terminology
1. In a particular college class, there are male and female students. Some students have long hair and some students have
short hair. Write the symbols for the probabilities of the events for parts a through j. (Note that you cannot find numerical
answers here. You were not given enough information to find any probability values yet; concentrate on understanding the
symbols.)
• Let F be the event that a student is female.
• Let M be the event that a student is male.
• Let S be the event that a student has short hair.
• Let L be the event that a student has long hair.
a. The probability that a student does not have long hair.
b. The probability that a student is male or has short hair.
c. The probability that a student is a female and has long hair.
d. The probability that a student is male, given that the student has long hair.
e. The probability that a student has long hair, given that the student is male.
f. Of all the female students, the probability that a student has short hair.
g. Of all students with long hair, the probability that a student is female.
h. The probability that a student is female or has long hair.
i. The probability that a randomly selected student is a male student with short hair.
j. The probability that a student is female.
Use the following information to answer the next four exercises. A box is filled with several party favors. It contains 12
hats, 15 noisemakers, ten finger traps, and five bags of confetti.
Let H = the event of getting a hat.
Let N = the event of getting a noisemaker.
Let F = the event of getting a finger trap.
Let C = the event of getting a bag of confetti.
2. Find P(H).
3. Find P(N).
4. Find P(F).
5. Find P(C).
Use the following information to answer the next six exercises. A jar of 150 jelly beans contains 22 red jelly beans, 38
yellow, 20 green, 28 purple, 26 blue, and the rest are orange.
Let B = the event of getting a blue jelly bean.
Let G = the event of getting a green jelly bean.
Let O = the event of getting an orange jelly bean.
Let P = the event of getting a purple jelly bean.
Let R = the event of getting a red jelly bean.
Let Y = the event of getting a yellow jelly bean.
6. Find P(B).
7. Find P(G).
8. Find P(P).
9. Find P(R).
10. Find P(Y).
Use the following information to answer the next six exercises. There are 23 countries in North America, 12 countries in
South America, 47 countries in Europe, 44 countries in Asia, 54 countries in Africa, and 14 in Oceania (Pacific Ocean
region).
Let A = the event that a country is in Asia.
Let E = the event that a country is in Europe.
Let F = the event that a country is in Africa.
Let N = the event that a country is in North America.
Let O = the event that a country is in Oceania.
Let S = the event that a country is in South America.
12. Find P(A).
13. Find P(E).
14. Find P(F).
15. Find P(N).
16. Find P(O).
17. Find P(S).
18. What is the probability of drawing a red card in a standard deck of 52 cards?
19. What is the probability of drawing a club in a standard deck of 52 cards?
20. What is the probability of rolling an even number of dots with a fair, six-sided die numbered one through six?
21. What is the probability of rolling a prime number of dots with a fair, six-sided die numbered one through six?
Use the following information to answer the next two exercises. You see a game at a local fair. You have to throw a dart at a
color wheel. Each section on the color wheel is equal in area.
Figure 3.16
Use the following information to answer the next ten exercises. On a baseball team, there are infielders and outfielders.
Some players are great hitters, and some players are not great hitters.
Let I = the event that a player is an infielder.
Let O = the event that a player is an outfielder.
Let H = the event that a player is a great hitter.
Let N = the event that a player is not a great hitter.
24. Write the symbols for the probability that a player is not an outfielder.
25. Write the symbols for the probability that a player is an outfielder or is a great hitter.
26. Write the symbols for the probability that a player is an infielder and is not a great hitter.
27. Write the symbols for the probability that a player is a great hitter, given that the player is an infielder.
28. Write the symbols for the probability that a player is an infielder, given that the player is a great hitter.
29. Write the symbols for the probability that of all the outfielders, a player is not a great hitter.
30. Write the symbols for the probability that of all the great hitters, a player is an outfielder.
31. Write the symbols for the probability that a player is an infielder or is not a great hitter.
32. Write the symbols for the probability that a player is an outfielder and is a great hitter.
33. Write the symbols for the probability that a player is an infielder.
34. What is the word for the set of all possible outcomes?
35. What is conditional probability?
36. A shelf holds 12 books. Eight are fiction and the rest are nonfiction. Each is a different book with a unique title. The
fiction books are numbered one to eight. The nonfiction books are numbered one to four. Randomly select one book.
Let F = event that book is fiction
Let N = event that book is nonfiction
What is the sample space?
37. What is the sum of the probabilities of an event and its complement?
Use the following information to answer the next two exercises. You are rolling a fair, six-sided number cube. Let E = the
event that it lands on an even number. Let M = the event that it lands on a multiple of three.
38. What does P(E | M) mean in words?
39. What does P(E ∪ M) mean in words?
42. U and V are mutually exclusive events. P(U) = 0.26; P(V) = 0.37. Find:
a. P(U ∩ V) =
b. P(U |V) =
c. P(U ∪ V) =
Table 3.12
day, there were 759 African Americans, 788 Native Hawaiians, 800 Latinos, 2,305 Japanese Americans, and 3,970 Whites.
59. Complete the table using the data provided. Suppose that one person from the study is randomly selected. Find the
probability that person smoked 11 to 20 cigarettes per day.
60. Suppose that one person from the study is randomly selected. Find the probability that person smoked 11 to 20 cigarettes
per day.
61. Find the probability that the person was Latino.
62. In words, explain what it means to pick one person from the study who is “Japanese American AND smokes 21 to 30
cigarettes per day.” Also, find the probability.
63. In words, explain what it means to pick one person from the study who is “Japanese American ∪ smokes 21 to 30
cigarettes per day.” Also, find the probability.
64. In words, explain what it means to pick one person from the study who is “Japanese American | that person smokes 21
to 30 cigarettes per day.” Also, find the probability.
65. Prove that smoking level/day and ethnicity are dependent events.
Use the following information to answer the next two exercises. Suppose that you have eight cards. Five are green and three
are yellow. The cards are well shuffled.
66. Suppose that you randomly draw two cards, one at a time, with replacement.
Let G1 = first card is green
Let G2 = second card is green
a. Draw a tree diagram of the situation.
b. Find P(G1 ∩ G2).
c. Find P(at least one green).
d. Find P(G2 | G1).
e. Are G2 and G1 independent events? Explain why or why not.
67. Suppose that you randomly draw two cards, one at a time, without replacement.
G1 = first card is green
G2 = second card is green
a. Draw a tree diagram of the situation.
b. Find P(G1 ∩ G2).
c. Find P(at least one green).
d. Find P(G2 | G1).
e. Are G2 and G1 independent events? Explain why or why not.
Use the following information to answer the next two exercises. The percent of licensed U.S. drivers (from a recent year)
that are female is 48.60. Of the females, 5.03% are age 19 and under; 81.36% are age 20–64; 13.61% are age 65 or over. Of
the licensed U.S. male drivers, 5.04% are age 19 and under; 81.43% are age 20–64; 13.53% are age 65 or over.
HOMEWORK
3.1 Terminology
72.
Figure 3.17 The graph in Figure 3.17 displays the sample sizes and percentages of people in different age and gender
groups who were polled concerning their approval of Mayor Ford’s actions in office. The total number in the sample of all
the age groups is 1,045.
a. Define three events in the graph.
b. Describe in words what the entry 40 means.
c. Describe in words the complement of the entry in part b.
d. Describe in words what the entry 30 means.
e. Out of the males and females, what percent are males?
f. Out of the females, what percent disapprove of Mayor Ford?
g. Out of all the age groups, what percent approve of Mayor Ford?
h. Find P(Approve | Male).
i. Out of the age groups, what percent are more than 44 years old?
j. Find P(Approve | Age < 35).
73. Explain what is wrong with the following statements. Use complete sentences.
a. If there is a 60% chance of rain on Saturday and a 70% chance of rain on Sunday, then there is a 130% chance of
rain over the weekend.
b. The probability that a baseball player hits a home run is greater than the probability that he gets a successful hit.
Figure 3.18
74. Find the probability that an Emotional Health Index Score is 82.7.
75. Find the probability that an Emotional Health Index Score is 81.0.
76. Find the probability that an Emotional Health Index Score is more than 81.
77. Find the probability that an Emotional Health Index Score is between 80.5 and 82.
78. If we know an Emotional Health Index Score is 81.5 or more, what is the probability that it is 82.7?
79. What is the probability that an Emotional Health Index Score is 80.7 or 82.7?
80. What is the probability that an Emotional Health Index Score is less than 80.2, given that it is already less than 81?
81. What occupation has the highest emotional index score?
82. What occupation has the lowest emotional index score?
83. What is the range of the data?
84. Compute the average EHIS.
85. If all occupations are equally likely for a certain individual, what is the probability that he or she will have an occupation
with lower than average EHIS?
88.
a. List the sample space of the 38 possible outcomes in roulette.
b. You bet on red. Find P(red).
c. You bet on -1st 12- (1st Dozen). Find P(-1st 12-).
d. You bet on an even number. Find P(even number).
e. Is getting an odd number the complement of getting an even number? Why?
f. Find two mutually exclusive events.
g. Are the events Even and 1st Dozen independent?
89. Compute the probability of winning the following types of bets:
a. Betting on two lines that touch each other on the table as in 1-2-3-4-5-6
b. Betting on three numbers in a line, as in 1-2-3
c. Betting on one number
d. Betting on four numbers that touch each other to form a square, as in 10-11-13-14
e. Betting on two numbers that touch each other on the table, as in 10-11 or 10-13
f. Betting on 0-00-1-2-3
g. Betting on 0-1-2; or 0-00-2; or 00-2-3
90. Compute the probability of winning the following types of bets:
a. Betting on a color
b. Betting on one of the dozen groups
c. Betting on the range of numbers from 1 to 18
d. Betting on the range of numbers 19–36
e. Betting on one of the columns
f. Betting on an even or odd number (excluding zero)
91. Suppose that you have eight cards. Five are green and three are yellow. The five green cards are numbered 1, 2, 3, 4,
and 5. The three yellow cards are numbered 1, 2, and 3. The cards are well shuffled. You randomly draw one card.
• G = card drawn is green
• E = card drawn is even-numbered
a. List the sample space.
b. P(G) = _____
c. P(G | E) = _____
d. P(G ∩ E) = _____
e. P(G ∪ E) = _____
f. Are G and E mutually exclusive? Justify your answer numerically.
92. Roll two fair dice separately. Each die has six faces.
a. List the sample space.
b. Let A be the event that either a three or four is rolled first, followed by an even number. Find P(A).
c. Let B be the event that the sum of the two rolls is at most seven. Find P(B).
d. In words, explain what “P(A | B)” represents. Find P(A | B).
e. Are A and B mutually exclusive events? Explain your answer in one to three complete sentences, including
numerical justification.
f. Are A and B independent events? Explain your answer in one to three complete sentences, including numerical
justification.
93. A special deck of cards has ten cards. Four are green, three are blue, and three are red. When a card is picked, its color
is recorded. An experiment consists of first picking a card and then tossing a coin.
a. List the sample space.
b. Let A be the event that a blue card is picked first, followed by landing a head on the coin toss. Find P(A).
c. Let B be the event that a red or green is picked, followed by landing a head on the coin toss. Are the events A and
B mutually exclusive? Explain your answer in one to three complete sentences, including numerical justification.
d. Let C be the event that a red or blue is picked, followed by landing a head on the coin toss. Are the events A and
C mutually exclusive? Explain your answer in one to three complete sentences, including numerical justification.
94. An experiment consists of first rolling a die and then tossing a coin.
a. List the sample space.
b. Let A be the event that either a three or a four is rolled first, followed by landing a head on the coin toss. Find
P(A).
c. Let B be the event that the first and second tosses land on heads. Are the events A and B mutually exclusive?
Explain your answer in one to three complete sentences, including numerical justification.
95. An experiment consists of tossing a nickel, a dime, and a quarter. Of interest is the side the coin lands on.
a. List the sample space.
b. Let A be the event that there are at least two tails. Find P(A).
c. Let B be the event that the first and second tosses land on heads. Are the events A and B mutually exclusive?
Explain your answer in one to three complete sentences, including justification.
96. Consider the following scenario:
Let P(C) = 0.4.
Let P(D) = 0.5.
Let P(C | D) = 0.6.
a. Find P(C ∩ D).
b. Are C and D mutually exclusive? Why or why not?
c. Are C and D independent events? Why or why not?
d. Find P(C ∪ D).
e. Find P(D | C).
97. Y and Z are independent events.
a. Rewrite the basic Addition Rule P(Y ∪ Z) = P(Y) + P(Z) - P(Y ∩ Z) using the information that Y and Z are
independent events.
b. Use the rewritten rule to find P(Z) if P(Y ∪ Z) = 0.71 and P(Y) = 0.42.
98. G and H are mutually exclusive events. P(G) = 0.5; P(H) = 0.3
a. Explain why the following statement MUST be false: P(H | G) = 0.4.
b. Find P(H ∪ G).
c. Are G and H independent or dependent events? Explain in a complete sentence.
99. Approximately 281,000,000 people over age five live in the United States. Of these people, 55,000,000 speak a
language other than English at home. Of those who speak another language at home, 62.3% speak Spanish.
Let: E = speaks English at home; E′ = speaks another language at home; S = speaks Spanish;
Finish each probability statement by matching the correct answer.
Table 3.14
100. In 1994, the U.S. government held a lottery to issue 55,000 Green Cards (permits for non-citizens to work legally in the
U.S.). Renate Deutsch, from Germany, was one of approximately 6.5 million people who entered this lottery. Let G = won
green card.
a. What was Renate’s chance of winning a Green Card? Write your answer as a probability statement.
b. In the summer of 1994, Renate received a letter stating she was one of 110,000 finalists chosen. Once the finalists
were chosen, assuming that each finalist had an equal chance to win, what was Renate’s chance of winning a
Green Card? Write your answer as a conditional probability statement. Let F = was a finalist.
c. Are G and F independent or dependent events? Justify your answer numerically and also explain why.
d. Are G and F mutually exclusive events? Justify your answer numerically and explain why.
101. Three professors at George Washington University did an experiment to determine if economists are more selfish
than other people. They dropped 64 stamped, addressed envelopes with $10 cash in different classrooms on the George
Washington campus. 44% were returned overall. From the economics classes 56% of the envelopes were returned. From
the business, psychology, and history classes 31% were returned.
Let: R = money returned; E = economics classes; O = other classes
a. Write a probability statement for the overall percent of money returned.
b. Write a probability statement for the percent of money returned out of the economics classes.
c. Write a probability statement for the percent of money returned out of the other classes.
d. Is money being returned independent of the class? Justify your answer numerically and explain it.
e. Based upon this study, do you think that economists are more selfish than other people? Explain why or why not.
Include numbers to justify your answer.
102. The following table of data obtained from www.baseball-almanac.com shows hit information for four players.
Suppose that one hit from the table is randomly selected.
Table 3.15
Are "the hit being made by Hank Aaron" and "the hit being a double" independent events?
a. Yes, because P(hit by Hank Aaron | hit is a double) = P(hit by Hank Aaron)
b. No, because P(hit by Hank Aaron | hit is a double) ≠ P(hit is a double)
c. No, because P(hit is by Hank Aaron | hit is a double) ≠ P(hit by Hank Aaron)
d. Yes, because P(hit is by Hank Aaron | hit is a double) = P(hit is a double)
103. United Blood Services is a blood bank that serves more than 500 hospitals in 18 states. According to their website,
a person with type O blood and a negative Rh factor (Rh-) can donate blood to any person with any blood type. Their data
show that 43% of people have type O blood and 15% of people have Rh- factor; 52% of people have type O or Rh- factor.
a. Find the probability that a person has both type O blood and the Rh- factor.
b. Find the probability that a person does NOT have both type O blood and the Rh- factor.
104. At a college, 72% of courses have final exams and 46% of courses require research papers. Suppose that 32% of
courses have a research paper and a final exam. Let F be the event that a course has a final exam. Let R be the event that a
course requires a research paper.
a. Find the probability that a course has a final exam or a research project.
b. Find the probability that a course has NEITHER of these two requirements.
105. In a box of assorted cookies, 36% contain chocolate and 12% contain nuts. Of those, 8% contain both chocolate and
nuts. Sean is allergic to both chocolate and nuts.
a. Find the probability that a cookie contains chocolate or nuts (he can't eat it).
b. Find the probability that a cookie does not contain chocolate or nuts (he can eat it).
106. A college finds that 10% of students have taken a distance learning class and that 40% of students are part time
students. Of the part time students, 20% have taken a distance learning class. Let D = event that a student takes a distance
learning class and E = event that a student is a part time student
a. Find P(D ∩ E).
b. Find P(E | D).
c. Find P(D ∪ E).
d. Using an appropriate test, show whether D and E are independent.
e. Using an appropriate test, show whether D and E are mutually exclusive.
Table 3.16
107. What is the probability that a randomly selected senator has an “Other” affiliation?
108. What is the probability that a randomly selected senator is up for reelection in November 2016?
109. What is the probability that a randomly selected senator is a Democrat and up for reelection in November 2016?
110. What is the probability that a randomly selected senator is a Republican or is up for reelection in November 2014?
111. Suppose that a member of the US Senate is randomly selected. Given that the randomly selected senator is up for
reelection in November 2016, what is the probability that this senator is a Democrat?
112. Suppose that a member of the US Senate is randomly selected. What is the probability that the senator is up for
reelection in November 2014, knowing that this senator is a Republican?
113. The events “Republican” and “Up for reelection in 2016” are ________
a. mutually exclusive.
b. independent.
c. both mutually exclusive and independent.
d. neither mutually exclusive nor independent.
114. The events “Other” and “Up for reelection in November 2016” are ________
a. mutually exclusive.
b. independent.
c. both mutually exclusive and independent.
d. neither mutually exclusive nor independent.
115. Table 3.17 gives the number of suicides estimated in the U.S. for a recent year by age, race (black or white), and sex.
We are interested in possible relationships between age, race, and sex. We will let suicide victims be our population.
Table 3.17
Table 3.18
Table 3.19
119. In a previous year, the weights of the members of the San Francisco 49ers and the Dallas Cowboys were published
in the San Jose Mercury News. The factual data were compiled into the following table.
Table 3.20
For the following, suppose that you randomly select one player from the 49ers or Cowboys.
a. Find the probability that his shirt number is from 1 to 33.
b. Find the probability that he weighs at most 210 pounds.
c. Find the probability that his shirt number is from 1 to 33 AND he weighs at most 210 pounds.
d. Find the probability that his shirt number is from 1 to 33 OR he weighs at most 210 pounds.
e. Find the probability that his shirt number is from 1 to 33 GIVEN that he weighs at most 210 pounds.
Use the following information to answer the next two exercises. This tree diagram shows the tossing of an unfair coin
followed by drawing one bead from a cup containing three red (R), four yellow (Y) and five blue (B) beads. For the coin,
P(H) = 2/3 and P(T) = 1/3, where H is heads and T is tails.
Figure 3.20
Table 3.21
For the following, suppose that you randomly select one player from the 49ers or Cowboys.
If having a shirt number from one to 33 and weighing at most 210 pounds were independent events, then what should be
true about P(Shirt# 1–33|≤ 210 pounds)?
124. The probability that a male develops some form of cancer in his lifetime is 0.4567. The probability that a male has at
least one false positive test result (meaning the test comes back positive for cancer when the man does not have it) is 0.51. Some of
the following questions do not have enough information for you to answer them. Write “not enough information” for those
answers. Let C = a man develops cancer in his lifetime and P = man has at least one false positive.
a. P(C) = ______
b. P(P|C) = ______
c. P(P|C') = ______
d. If a test comes up positive, based upon numerical values, can you assume that man has cancer? Justify numerically
and explain why or why not.
125. Given events G and H: P(G) = 0.43; P(H) = 0.26; P(H ∩ G) = 0.14
a. Find P(H ∪ G).
b. Find the probability of the complement of event (H ∩ G).
c. Find the probability of the complement of event (H ∪ G).
REFERENCES
3.1 Terminology
“Countries List by Continent.” Worldatlas, 2013. Available online at http://www.worldatlas.com/cntycont.htm (accessed
May 2, 2013).
Samuel, T. M. “Strange Facts about RH Negative Blood.” eHow Health, 2013. Available online at
http://www.ehow.com/facts_5552003_strange-rh-negative-blood.html (accessed May 2, 2013).
“United States: Uniform Crime Report – State Statistics from 1960–2011.” The Disaster Center. Available online at
http://www.disastercenter.com/crime/ (accessed May 2, 2013).
Data from Clara County Public H.D.
Data from the American Cancer Society.
Data from The Data and Story Library, 1996. Available online at http://lib.stat.cmu.edu/DASL/ (accessed May 2, 2013).
Data from the Federal Highway Administration, part of the United States Department of Transportation.
Data from the United States Census Bureau, part of the United States Department of Commerce.
Data from USA Today.
“Environment.” The World Bank, 2013. Available online at http://data.worldbank.org/topic/environment (accessed May 2,
2013).
“Search for Datasets.” Roper Center: Public Opinion Archives, University of Connecticut, 2013. Available online at
http://www.ropercenter.uconn.edu/data_access/data/search_for_datasets.html (accessed May 2, 2013).
SOLUTIONS
1
a. P(L′) = P(S)
b. P(M ∪ S)
c. P(F ∩ L)
d. P(M | L)
e. P(L | M)
f. P(S | F)
g. P(F | L)
h. P(F ∪ L)
i. P(M ∩ S)
j. P(F)
3 P(N) = 15/42 = 5/14 = 0.36
5 P(C) = 5/42 = 0.12
7 P(G) = 20/150 = 2/15 = 0.13
9 P(R) = 22/150 = 11/75 = 0.15
13 P(E) = 47/194 = 0.24
15 P(N) = 23/194 = 0.12
17 P(S) = 12/194 = 6/97 = 0.06
19 13/52 = 1/4 = 0.25
21 3/6 = 1/2 = 0.5
23 P(R) = 4/8 = 0.5
25 P(O ∪ H)
27 P(H | I)
29 P(N | O)
31 P(I ∪ N)
33 P(I)
35 The likelihood that an event will occur given that another event has already occurred.
37 1
39 the probability of landing on an even number or a multiple of three
41 P(J) = 0.3
43 P(Q ∩ R) = P(Q)P(R); 0.1 = (0.4)P(R); P(R) = 0.25
45 0.376
47 C | L means, given the person chosen is a Latino Californian, the person is a registered voter who prefers life in prison
without parole for a person convicted of first degree murder.
49 L ∩ C is the event that the person chosen is a Latino California registered voter who prefers life without parole over
the death penalty for a person convicted of first degree murder.
51 0.6492
53 No, because P(L ∩ C) does not equal 0.
57 P(being a female musician ∩ learning music in school) = 38/130 = 19/65 = 0.29
P(being a female musician)P(learning music in school) = (72/130)(62/130) = 4,464/16,900 = 1,116/4,225 = 0.26
No, they are not independent because P(being a female musician ∩ learning music in school) is not equal to
P(being a female musician)P(learning music in school).
58
Figure 3.21
60 35,065/100,450
62 To pick one person from the study who is Japanese American AND smokes 21 to 30 cigarettes per day means that the
person has to meet both criteria: both Japanese American and smokes 21 to 30 cigarettes. The sample space should include
everyone in the study. The probability is 4,715/100,450.
64 To pick one person from the study who is Japanese American given that person smokes 21-30 cigarettes per day, means
that the person must fulfill both criteria and the sample space is reduced to those who smoke 21-30 cigarettes per day. The
probability is 4,715/15,273.
66
a.
Figure 3.22
b. P(GG) = (5/8)(5/8) = 25/64
d. P(G | G) = 5/8
e. Yes, they are independent because the first card is placed back in the bag before the second card is drawn; the
composition of cards in the bag remains the same from draw one to draw two.
68
Table 3.22
b. P(F) = 0.486
c. P(>64 | F) = 0.1361
d. P(>64 and F) = P(F) P(>64|F) = (0.486)(0.1361) = 0.0661
e. P(>64 | F) is the percentage of female drivers who are 65 or older and P(>64 ∩ F) is the percentage of drivers who
are female and 65 or older.
f. P(>64) = P(>64 ∩ F) + P(>64 ∩ M) = 0.1356
g. No, being female and 65 or older are not mutually exclusive because they can occur at the same time:
P(>64 ∩ F) = 0.0661.
70
Table 3.23
b. If we assume that all walkers are alone and that none from the other two groups travel alone (which is a big
assumption) we have: P(Alone) = 0.7318 + 0.0390 = 0.7708.
c. Making the same assumptions as in (b), we have: (0.7708)(1,000) = 771
d. (0.1332)(1,000) = 133
73
a. You can't calculate the joint probability without knowing the probability of both events occurring, which is not in the
information given; the probabilities should be multiplied, not added; and probability is never greater than 100%.
b. A home run by definition is a successful hit, so he has to have at least as many successful hits as home runs.
75 0
77 0.3571
79 0.2142
81 Physician (83.7)
83 83.7 − 79.6 = 4.1
85 P(Occupation < 81.3) = 0.5
87
a. The Forum Research surveyed 1,046 Torontonians.
b. 58%
c. 42% of 1,046 = 439 (rounding to the nearest integer)
d. 0.57
e. 0.60.
89
a. P(Betting on two lines that touch each other on the table) = 6/38
f. P(Betting on 0-00-1-2-3) = 5/38
91
a. {G1, G2, G3, G4, G5, Y1, Y2, Y3}
b. 5/8
c. 2/3
d. 2/8
e. 6/8
f. No, because P(G ∩ E) does not equal 0.
93
NOTE
The coin toss is independent of the card picked first.
95
a. S = {(HHH), (HHT), (HTH), (HTT), (THH), (THT), (TTH), (TTT)}
b. 4/8
c. Yes, because if A has occurred (at least two tails), it is impossible for the first two tosses to both land on heads. In other words, P(A ∩ B) = 0.
97
a. If Y and Z are independent, then P(Y ∩ Z) = P(Y)P(Z), so P(Y ∪ Z) = P(Y) + P(Z) - P(Y)P(Z).
b. 0.5
99 iii; i; iv; ii
101
a. P(R) = 0.44
b. P(R | E) = 0.56
c. P(R | O) = 0.31
d. No, whether the money is returned is not independent of which class the money was placed in. There are several ways
to justify this mathematically, but one is that the money placed in economics classes is not returned at the same overall
rate; P(R | E) ≠ P(R).
e. No, this study definitely does not support that notion; in fact, it suggests the opposite. The money placed in the
economics classrooms was returned at a higher rate than the money placed in all classes collectively; P(R | E) > P(R).
103
a. P(type O ∪ Rh-) = P(type O) + P(Rh-) - P(type O ∩ Rh-)
0.52 = 0.43 + 0.15 - P(type O ∩ Rh-); solve to find P(type O ∩ Rh-) = 0.06
6% of people have type O, Rh- blood
b. P(NOT(type O ∩ Rh-)) = 1 - P(type O ∩ Rh-) = 1 - 0.06 = 0.94
94% of people do not have type O, Rh- blood
105
a. Let C = the event that the cookie contains chocolate. Let N = the event that the cookie contains nuts.
b. P(C ∪ N) = P(C) + P(N) - P(C ∩ N) = 0.36 + 0.12 - 0.08 = 0.40
c. P(NEITHER chocolate NOR nuts) = 1 - P(C ∪ N) = 1 - 0.40 = 0.60
107 0
109 10/67
111 10/34
113 d
115
Table 3.24
Table 3.25
c. 22,050/29,760
d. 330/29,760
e. 2,000/29,760
f. 23,720/29,760
g. 5,010/6,020
117 b
119
a. 26/106
b. 33/106
c. 21/106
d. (26/106) + (33/106) - (21/106) = (38/106)
e. 21/33
121 a
124
a. P(C) = 0.4567
b. not enough information
c. not enough information
d. No, because over half (0.51) of men have at least one false positive test.
126
a. P(J ∪ K) = P(J) + P(K) − P(J ∩ K); 0.45 = 0.18 + 0.37 − P(J ∩ K); solve to find P(J ∩ K) = 0.10
4 | DISCRETE RANDOM VARIABLES
Figure 4.1 You can use probability and discrete random variables to calculate the likelihood of lightning striking the
ground five times during a half-hour thunderstorm. (Credit: Leszek Leszczynski)
Introduction
A student takes a ten-question, true-false quiz. Because the student had such a busy schedule, he or she could not study and
guesses randomly at each answer. What is the probability of the student passing the test with at least a 70%?
Small companies might be interested in the number of long-distance phone calls their employees make during the peak time
of the day. Suppose the historical average is 20 calls. What is the probability that the employees make more than 20 long-
distance phone calls during the peak time?
These two examples illustrate two different types of probability problems involving discrete random variables. Recall that
discrete data are data that you can count, that is, the random variable can only take on whole number values. A random
variable describes the outcomes of a statistical experiment in words. The values of a random variable can vary with each
repetition of an experiment, often called a trial.
When we looked at the sample space for flipping three coins, we could easily write out the full sample space and thus could easily
count the number of outcomes that met our desired result, e.g., x = 1, where X is the random variable defined as the number of
heads.
As the number of items in the sample space grows, as with a full deck of 52 cards, writing out the sample
space becomes impractical.
We see that probabilities are nothing more than counting the events in each group we are interested in and dividing by the
number of elements in the universe, or sample space. This is easy enough if we are counting sophomores in a Stat class,
but in more complicated cases listing all the possible outcomes may take a lifetime. There are, for example, 36 possible
outcomes from throwing just two six-sided dice where the random variable is the sum of the number of spots on the
up-facing sides. If there were four dice then the total number of possible outcomes would become 1,296. There are more than
2.5 million possible 5-card poker hands in a standard deck of 52 cards. Obviously, keeping track of all these possibilities
and counting them to get at a single probability would be tedious at best.
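To make the growth of the sample space concrete, the following minimal Python sketch lists the dice outcomes by brute force and counts them. It assumes fair six-sided dice; the variable names are chosen only for illustration.

```python
from collections import Counter
from itertools import product

faces = range(1, 7)  # faces of one six-sided die

# Every ordered outcome of throwing two dice and four dice.
two_dice = list(product(faces, repeat=2))
four_dice = list(product(faces, repeat=4))

print(len(two_dice))   # 36
print(len(four_dice))  # 1296

# Counting outcomes by the sum of the spots on two dice.
sum_counts = Counter(a + b for a, b in two_dice)
print(sum_counts[7], "of", len(two_dice), "outcomes sum to 7")  # 6 of 36
```

Listing every outcome works for a handful of dice, but the list grows by a factor of six with each added die, which is why a counting formula becomes attractive.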
An alternative to listing the complete sample space and counting the elements we are interested in is to skip the
listing step and simply work out how many elements the sample space contains, then do the appropriate division. If
we are after a probability, we do not need to see each and every element in the sample space; we only need to know
how many elements there are. Counting formulas were invented to do just this. They tell us the number of unordered subsets
of a certain size that can be created from a set of unique elements. By unordered it is meant that, for example, when dealing
cards, it does not matter if you got {ace, ace, ace, ace, king} or {king, ace, ace, ace, ace} or {ace, king, ace, ace, ace} and
so on. Each of these subsets is the same because each has four aces and one king.
Combinatorial Formula
$$\binom{n}{x} = {}_nC_x = \frac{n!}{x!\,(n-x)!}$$
This is the formula that tells the number of unique unordered subsets of size x that can be created from n unique elements.
The formula is read “n combinatorial x.” Sometimes it is read as “n choose x.” The exclamation point "!" is called a
factorial and tells us to multiply together all the whole numbers from 1 through the number before the "!"; thus
4! is 1·2·3·4 = 24. By definition 0! = 1. The formula is called the Combinatorial Formula. It is also called the Binomial
Coefficient, for reasons that will be clear shortly. While this mathematical concept was understood long before 1653, Blaise
Pascal is given major credit for his proof that he published in that year. Further, he developed a generalized method of
calculating the values for combinatorials known to us as the Pascal Triangle. Pascal was one of the geniuses of an era
of extraordinary intellectual advancement which included the work of Galileo, Rene Descartes, Isaac Newton, William
Shakespeare and the refinement of the scientific method, the very rationale for the topic of this text.
Let’s find the hard way the total number of combinations of the four aces in a deck of cards if we were going to take them
two at a time. The sample space would be:
S = {(Spade, Heart), (Spade, Diamond), (Spade, Club), (Diamond, Club), (Heart, Diamond), (Heart, Club)}
There are 6 combinations; formally, six unique unordered subsets of size 2 that can be created from 4 unique elements. To
use the combinatorial formula we would solve the formula as follows:
$$\binom{4}{2} = \frac{4!}{(4-2)!\,2!} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{2 \cdot 1 \cdot 2 \cdot 1} = 6$$
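The "hard way" and the formula can be compared directly. The following is a minimal Python sketch, not part of the original text; the helper name `n_choose_x` and the list `aces` are illustrative assumptions.

```python
import itertools
import math

aces = ["Spade", "Heart", "Diamond", "Club"]

# The "hard way": list every unordered pair of aces.
pairs = list(itertools.combinations(aces, 2))

def n_choose_x(n, x):
    """Combinatorial formula n! / (x! (n - x)!)."""
    return math.factorial(n) // (math.factorial(x) * math.factorial(n - x))

print(pairs)
print(len(pairs), n_choose_x(4, 2))  # both counts are 6
```

Listing works for four aces; the formula gives the same count without building the list.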
If we wanted to know the number of unique 5 card poker hands that could be created from a 52 card deck we simply
compute:
$$\binom{52}{5}$$
where 52 is the total number of unique elements from which we are drawing and 5 is the size of the group we are
drawing.
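As a quick check of the size of this sample space, the coefficient can be evaluated with Python's standard library. This sketch assumes Python 3.8 or later, where `math.comb` is available; the variable name is illustrative.

```python
import math

# Number of distinct 5-card hands from a 52-card deck, order ignored.
poker_hands = math.comb(52, 5)
print(poker_hands)  # 2598960
```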
With the combinatorial formula we can count the number of elements in a sample space without having to write each one
of them down, truly a lifetime's work for just the number of 5 card hands from a deck of 52 cards. We can now apply this
tool to a very important probability density function, the hypergeometric distribution.
Remember, a probability density function computes probabilities for us. We simply put the appropriate numbers in the
formula and we get the probability of specific events. However, for these formulas to work they must be applied only to
cases for which they were designed.
And if we did not care what else we had in our hand for the other three cards we would compute:
$$\binom{48}{3} = \frac{48!}{3!\,45!} = 17{,}296$$
Putting this together, we can compute the probability of getting exactly two aces in a 5 card poker hand as:
$$\frac{\binom{4}{2}\binom{48}{3}}{\binom{52}{5}} = 0.0399$$
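The arithmetic behind this probability can be reproduced in a few lines. This is a sketch assuming Python 3.8 or later; the variable names are illustrative and do not come from the text.

```python
import math

ways_two_aces = math.comb(4, 2)        # choose 2 of the 4 aces
ways_three_others = math.comb(48, 3)   # choose 3 of the 48 non-aces
all_hands = math.comb(52, 5)           # every possible 5-card hand

p_two_aces = ways_two_aces * ways_three_others / all_hands
print(ways_two_aces, ways_three_others, round(p_two_aces, 4))  # 6 17296 0.0399
```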
This solution is really just the probability distribution known as the Hypergeometric. The generalized formula is:
$$h(x) = \frac{\binom{A}{x}\binom{N-A}{n-x}}{\binom{N}{n}}$$
where x = the number we are interested in coming from the group with A objects.
h(x) is the probability of x successes, in n attempts, when A successes (aces in this case) are in a population that contains N
elements. The hypergeometric distribution is an example of a discrete probability distribution because there is no possibility
of partial success, that is, there can be no poker hands with 2 1/2 aces. Said another way, a discrete random variable has
to be a whole, or counting, number only. This probability distribution works in cases where the probability of a success
changes with each draw. Another way of saying this is that the events are NOT independent. In using a deck of cards, we
are sampling WITHOUT replacement. If we put each card back after it was drawn then the hypergeometric distribution would be
an inappropriate Pdf.
For the hypergeometric to work,
206 Chapter 4 | Discrete Random Variables
1. the population must be dividable into two and only two independent subsets (aces and non-aces in our example). The
random variable X = the number of items from the group of interest.
2. the experiment must have changing probabilities of success with each experiment (the fact that cards are not replaced
after the draw in our example makes this true in this case). Another way to say this is that you sample without
replacement and therefore each pick is not independent.
3. the random variable must be discrete, rather than continuous.
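The general formula h(x) can be written as a small function and applied to the aces example. This is a minimal sketch assuming Python 3.8 or later; the function name `hypergeom_pmf` is a hypothetical helper, with parameters named after the symbols A, N, and n in the formula.

```python
import math

def hypergeom_pmf(x, A, N, n):
    """h(x): probability of x successes in n draws without replacement
    from N items, A of which are successes."""
    return math.comb(A, x) * math.comb(N - A, n - x) / math.comb(N, n)

# Number of aces in a 5-card hand: A = 4 aces, N = 52 cards, n = 5 draws.
dist = {x: hypergeom_pmf(x, 4, 52, 5) for x in range(0, 5)}
for x, prob in dist.items():
    print(x, round(prob, 4))

# The probabilities over all possible values of x add up to one.
print(sum(dist.values()))
```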
Example 4.1
A candy dish contains 30 jelly beans and 20 gumdrops. Ten candies are picked at random. What is the probability
that 5 of the 10 are gumdrops? The two groups are jelly beans and gumdrops. Since the probability question asks
for the probability of picking gumdrops, the group of interest (first group A in the formula) is gumdrops. The
size of the group of interest (first group) is 20. The size of the second group is 30. The size of the sample is 10
(jelly beans or gumdrops). Let X = the number of gumdrops in the sample of 10. X takes on the values x = 0, 1,
2, ..., 10.
a. What is the probability statement written mathematically?
b. What is the hypergeometric probability density function written out to solve this problem?
c. What is the answer to the question "What is the probability of drawing 5 gumdrops in 10 picks from the dish?"
Solution 4.1
a. P(x = 5)
b. $$P(x = 5) = \frac{\binom{20}{5}\binom{30}{5}}{\binom{50}{10}}$$
c. P(x = 5) = 0.215
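The answer in part c can be checked numerically. This is a sketch assuming Python 3.8 or later; the variable names mirror the symbols A, N, n, and x of the hypergeometric formula.

```python
import math

A, N, n, x = 20, 50, 10, 5   # gumdrops, all candies, sample size, gumdrops wanted

p_five_gumdrops = math.comb(A, x) * math.comb(N - A, n - x) / math.comb(N, n)
print(round(p_five_gumdrops, 3))  # 0.215
```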
4.1 A bag contains letter tiles. Forty-four of the tiles are vowels, and 56 are consonants. Seven tiles are picked at
random. You want to know the probability that four of the seven tiles are vowels. What is the group of interest, the size
of the group of interest, and the size of the sample?
Binomial Formula
$$b(x) = \binom{n}{x} p^{x} q^{n-x}$$
where b(x) is the probability of x successes in n trials when the probability of a success in ANY ONE TRIAL is p. And of
course q = (1 − p) is the probability of a failure in any one trial.
We can see now why the combinatorial formula is also called the binomial coefficient: it reappears here in
the binomial probability function. For the binomial formula to work, the probability of a success in any one trial must be
the same from trial to trial, or in other words, the outcomes of each trial must be independent. Flipping a coin is a binomial
process because the probability of getting a head in one flip does not depend upon what has happened in PREVIOUS flips.
(At this time it should be noted that using p for the parameter of the binomial distribution is a violation of the rule that
population parameters are designated with Greek letters. In many textbooks θ (pronounced theta) is used instead of p and
this is how it should be.)
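The binomial formula translates directly into code. The sketch below assumes Python 3.8 or later; `binom_pmf` is a hypothetical helper name, and the three-coin illustration reuses the coin-flip setting mentioned earlier in the chapter.

```python
import math

def binom_pmf(x, n, p):
    """b(x): probability of x successes in n independent trials."""
    q = 1 - p
    return math.comb(n, x) * p**x * q**(n - x)

# Number of heads in 3 flips of a fair coin.
for x in range(4):
    print(x, binom_pmf(x, 3, 0.5))
```

For the fair coin the four probabilities are 0.125, 0.375, 0.375, and 0.125, which is the same as counting 1, 3, 3, and 1 of the 8 outcomes in the full sample space.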
Just like a set of data, a probability density function has a mean and a standard deviation that describe it. For the
binomial distribution these are given by the formulas:
$$\mu = np$$
$$\sigma = \sqrt{npq}$$
Notice that p is the only parameter in these equations. The binomial distribution is thus seen as coming from the one-
parameter family of probability distributions. In short, we know all there is to know about the binomial once we know p,
the probability of a success in any one trial.
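The shortcut formulas for the mean and standard deviation of the binomial, np and the square root of npq, can be compared with a direct calculation from the probabilities. In this sketch the values n = 20 and p = 0.3 are arbitrary illustrations, not taken from the text, and `binom_pmf` is a hypothetical helper.

```python
import math

def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.3   # illustrative values only
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed from the definition, outcome by outcome.
mean = sum(x * px for x, px in enumerate(probs))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(probs))

print(mean, n * p)
print(math.sqrt(variance), math.sqrt(n * p * (1 - p)))
```

Each printed pair should agree up to floating-point rounding.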
In probability theory, under certain circumstances, one probability distribution can be used to approximate another. We say
that one is the limiting distribution of the other. If a small number is to be drawn from a large population, even if there is no
replacement, we can still use the binomial even though this is not a binomial process. If there is no replacement it violates
the independence rule of the binomial. Nevertheless, we can use the binomial to approximate a probability that is really a
hypergeometric distribution if we are drawing fewer than 10 percent of the population, i.e. n is less than 10 percent of N in
the formula for the hypergeometric function. The rationale for this argument is that when drawing a small percentage of the
population we do not alter the probability of a success from draw to draw in any meaningful way. Imagine drawing from
not one deck of 52 cards but from 6 decks of cards. Drawing an ace on the first draw changes the conditional
probability of an ace on the second draw far less when there are 24 aces to draw from than it would if there were only 4.
This ability to use one probability distribution to estimate others will become very valuable to us later.
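The six-deck argument can be illustrated numerically. The sketch below compares the exact hypergeometric probability of two aces in five cards, for one deck and for six decks, with the binomial value that treats the draws as independent. The helper names are hypothetical, and the choice of x = 2 and n = 5 is only an illustration.

```python
import math

def hypergeom_pmf(x, A, N, n):
    return math.comb(A, x) * math.comb(N - A, n - x) / math.comb(N, n)

def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, x = 5, 2                                 # two aces in a five-card draw
one_deck = hypergeom_pmf(x, 4, 52, n)       # n is about 9.6% of N
six_decks = hypergeom_pmf(x, 24, 312, n)    # n is about 1.6% of N
binomial = binom_pmf(x, n, 4 / 52)          # same p, draws treated as independent

print(round(one_deck, 4), round(six_decks, 4), round(binomial, 4))
```

The six-deck value sits closer to the binomial value than the one-deck value does, which is the point of the 10 percent guideline.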
There are four characteristics of a binomial experiment.
1. There are a fixed number of trials. Think of trials as repetitions of an experiment. The letter n denotes the number of
trials.
2. The random variable, x , number of successes, is discrete.
3. There are only two possible outcomes, called "success" and "failure," for each trial. The letter p denotes the probability
of a success on any one trial, and q denotes the probability of a failure on any one trial. p + q = 1.
4. The n trials are independent and are repeated using identical conditions. Think of this as drawing WITH replacement.
Because the n trials are independent, the outcome of one trial does not help in predicting the outcome of another trial.
Another way of saying this is that for each individual trial, the probability, p, of a success and probability, q, of a
failure remain the same. For example, randomly guessing at a true-false statistics question has only two outcomes.
If a success is guessing correctly, then a failure is guessing incorrectly. Suppose Joe always guesses correctly on any
statistics true-false question with a probability p = 0.6. Then, q = 0.4. This means that for every true-false statistics
question Joe answers, his probability of success (p = 0.6) and his probability of failure (q = 0.4) remain the same.
The outcomes of a binomial experiment fit a binomial probability distribution. The random variable X = the number of
successes obtained in the n independent trials.
The mean, μ, and variance, $\sigma^2$, for the binomial probability distribution are μ = np and $\sigma^2 = npq$. The standard deviation, σ,
is then $\sigma = \sqrt{npq}$.
Any experiment that has characteristics three and four and where n = 1 is called a Bernoulli Trial (named after Jacob
Bernoulli who, in the late 1600s, studied them extensively). A binomial experiment takes place when the number of
successes is counted in one or more Bernoulli Trials.
Example 4.2
Suppose you play a game that you can only either win or lose. The probability that you win any game is 55%,
and the probability that you lose is 45%. Each game you play is independent. If you play the game 20 times, write
the function that describes the probability that you win 15 of the 20 times. Here, if you define X as the number of
wins, then X takes on the values 0, 1, 2, 3, ..., 20. The probability of a success is p = 0.55. The probability of a
failure is q = 0.45. The number of trials is n = 20. The probability question can be stated mathematically as P(x =
15).
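One way to write the requested function is as a short piece of code. This is a sketch, assuming Python 3.8 or later; `binom_pmf` is a hypothetical helper name.

```python
import math

def binom_pmf(x, n, p):
    """Probability of exactly x wins in n independent games."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

# n = 20 games, p = 0.55 chance of winning each one, x = 15 wins.
p_15_wins = binom_pmf(15, 20, 0.55)
print(round(p_15_wins, 4))
```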
4.2 A trainer is teaching a dolphin to do tricks. The probability that the dolphin successfully performs the trick is 35%,
and the probability that the dolphin does not successfully perform the trick is 65%. Out of 20 attempts, you want to
find the probability that the dolphin succeeds 12 times. Find the P(X=12) using the binomial Pdf.
Example 4.3
A fair coin is flipped 15 times. Each flip is independent. What is the probability of getting more than ten heads?
Let X = the number of heads in 15 flips of the fair coin. X takes on the values 0, 1, 2, 3, ..., 15. Since the coin is
fair, p = 0.5 and q = 0.5. The number of trials is n = 15. State the probability question mathematically.
Solution 4.3
P(x > 10)
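A "more than" question is answered by adding the probabilities of every qualifying value. The sketch below, with a hypothetical `binom_pmf` helper, sums P(x = 11) through P(x = 15).

```python
import math

def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 15, 0.5
# "More than ten heads" means 11, 12, 13, 14, or 15 heads.
p_more_than_10 = sum(binom_pmf(x, n, p) for x in range(11, n + 1))
print(round(p_more_than_10, 4))  # 0.0592
```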
Example 4.4
Approximately 70% of statistics students do their homework in time for it to be collected and graded. Each
student does homework independently. In a statistics class of 50 students, what is the probability that at least 40
will do their homework on time? Students are selected randomly.
a. This is a binomial problem because there is only a success or a __________, there are a fixed number of trials,
and the probability of a success is 0.70 for each trial.
Solution 4.4
a. failure
b. If we are interested in the number of students who do their homework on time, then how do we define X?
Solution 4.4
b. X = the number of statistics students who do their homework on time
Solution 4.4
c. 0, 1, 2, …, 50
Solution 4.4
d. Failure is defined as a student who does not complete his or her homework on time.
The probability of a success is p = 0.70. The number of trials is n = 50.
e. If p + q = 1, then what is q?
Solution 4.4
e. q = 0.30
f. The words "at least" translate as what kind of inequality for the probability question P(x ____ 40).
Solution 4.4
f. greater than or equal to (≥)
The probability question is P(x ≥ 40).
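Once the question is stated as P(x ≥ 40), it can be evaluated by summing the binomial probabilities from 40 through 50. This is a sketch with a hypothetical `binom_pmf` helper; the numerical answer is not given in the text, so none is asserted here.

```python
import math

def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 50, 0.70
# "At least 40" includes 40 itself.
p_at_least_40 = sum(binom_pmf(x, n, p) for x in range(40, n + 1))
print(round(p_at_least_40, 4))
```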
4.4 Sixty-five percent of people pass the state driver’s exam on the first try. A group of 50 individuals who have taken
the driver’s exam is randomly selected. Give two reasons why this is a binomial problem.
4.4 During the 2013 regular NBA season, DeAndre Jordan of the Los Angeles Clippers had the highest field goal
completion rate in the league. DeAndre scored with 61.3% of his shots. Suppose you choose a random sample of 80
shots made by DeAndre during the 2013 season. Let X = the number of shots that scored points.
a. What is the probability distribution for X?
b. Using the formulas, calculate the (i) mean and (ii) standard deviation of X.
c. Find the probability that DeAndre scored with 60 of these shots.
d. Find the probability that DeAndre scored with more than 50 of these shots.
Example 4.5
You play a game of chance that you can either win or lose (there are no other possibilities) until you lose. Your
probability of losing is p = 0.57. What is the probability that it takes five games until you lose? Let X = the number
of games you play until you lose (includes the losing game). Then X takes on the values 1, 2, 3, ... (could go on
indefinitely). The probability question is P(x = 5).
4.5 You throw darts at a board until you hit the center area. Your probability of hitting the center area is p = 0.17. You
want to find the probability that it takes eight throws until you hit the center. What values does X take on?
Example 4.6
A safety engineer feels that 35% of all industrial accidents in her plant are caused by failure of employees to
follow instructions. She decides to look at the accident reports (selected randomly and replaced in the pile after
reading) until she finds one that shows an accident caused by failure of employees to follow instructions. On
average, how many reports would the safety engineer expect to look at until she finds a report showing an
accident caused by employee failure to follow instructions? What is the probability that the safety engineer will
have to examine at least three reports until she finds a report showing an accident caused by employee failure to
follow instructions?
Let X = the number of accidents the safety engineer must examine until she finds a report showing an accident
caused by employee failure to follow instructions. X takes on the values 1, 2, 3, .... The first question asks you
to find the expected value or the mean. The second question asks you to find P(x ≥ 3). ("At least" translates to a
"greater than or equal to" symbol).
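Both quantities can be computed from p = 0.35 alone. The sketch below assumes the trials-until-first-success form of the geometric distribution, whose mean is 1/p; needing at least three reports is the same as the first two reports both being failures. Variable names are illustrative.

```python
p = 0.35          # probability a report shows failure to follow instructions

expected_reports = 1 / p
p_at_least_3 = (1 - p) ** 2          # first two reports are both failures

# Cross-check: 1 - P(x = 1) - P(x = 2)
check = 1 - p - (1 - p) * p

print(round(expected_reports, 2), round(p_at_least_3, 4), round(check, 4))
```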
4.6 An instructor feels that 15% of students get below a C on their final exam. She decides to look at final exams
(selected randomly and replaced in the pile after reading) until she finds one that shows a grade below a C. We want to
know the probability that the instructor will have to examine at least ten exams until she finds one with a grade below
a C. What is the probability question stated mathematically?
Example 4.7
Suppose that you are looking for a student at your college who lives within five miles of you. You know that 55%
of the 25,000 students do live within five miles of you. You randomly contact students from the college until one
says he or she lives within five miles of you. What is the probability that you need to contact four people?
This is a geometric problem because you may have a number of failures before you have the one success you
desire. Also, the probability of a success stays approximately the same each time you ask a student if he or she
lives within five miles of you. There is no definite number of trials (number of times you ask a student).
a. Let X = the number of ____________ you must ask ____________ one says yes.
Solution 4.7
a. Let X = the number of students you must ask until one says yes.
Solution 4.7
b. 1, 2, 3, …, (total number of students)
Solution 4.7
c. p = 0.55; q = 0.45
Solution 4.7
d. P(x = 4)
$$P(X = x) = (1 - p)^{x-1} p$$
for x = 1, 2, 3, ....
The expected value of X, the mean of this distribution, is 1/p. This tells us how many trials we have to expect until we get
the first success including in the count the trial that results in success. The above form of the Geometric distribution is used
for modeling the number of trials until the first success. The number of trials includes the one that is a success: x = all trials
including the one that is a success. This can be seen in the form of the formula. If X = number of trials including the success,
then we must raise the probability of failure, (1 − p), to the power of the number of failures, x − 1, and multiply by the
probability of success, p.
By contrast, the following form of the geometric distribution is used for modeling number of failures until the first success:
$$P(X = x) = (1 - p)^{x} p$$
for x = 0, 1, 2, 3, ....
In this case the trial that is a success is not counted as a trial in the formula: x = number of failures. The expected value,
or mean, of this distribution is $\mu = \frac{1-p}{p}$. This tells us how many failures to expect before we have a success. In either case,
the sequence of probabilities is a geometric sequence.
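The two forms describe the same experiment with different bookkeeping, which a short sketch can make concrete. The function names and the value p = 0.25 are illustrative assumptions; the means are approximated by summing a long but finite run of terms.

```python
def geom_trials_pmf(x, p):
    """x = number of trials up to and including the first success (1, 2, 3, ...)."""
    return (1 - p) ** (x - 1) * p

def geom_failures_pmf(x, p):
    """x = number of failures before the first success (0, 1, 2, ...)."""
    return (1 - p) ** x * p

p = 0.25  # illustrative value only

# Success on the 4th trial is the same event as 3 failures before the success.
print(geom_trials_pmf(4, p), geom_failures_pmf(3, p))

# Approximate the two means; the omitted tail is negligible.
mean_trials = sum(x * geom_trials_pmf(x, p) for x in range(1, 2000))
mean_failures = sum(x * geom_failures_pmf(x, p) for x in range(0, 2000))
print(mean_trials, 1 / p)
print(mean_failures, (1 - p) / p)
```

The two means differ by exactly one, the trial on which the success occurs.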
Example 4.8
Assume that the probability of a defective computer component is 0.02. Components are randomly selected. Find
the probability that the first defect is caused by the seventh component tested. How many components do you
expect to test until one is found to be defective?
Let X = the number of computer components tested until the first defect is found.
X takes on the values 1, 2, 3, ... where p = 0.02. X ~ G(0.02)
Find P(x = 7). Answer: $P(x = 7) = (1 - 0.02)^{7-1} \times 0.02 = 0.0177$.
The probability that the seventh component is the first defect is 0.0177.
The graph of X ~ G(0.02) is:
Figure 4.2
The y-axis contains the probability of x, where X = the number of computer components tested. Notice that the
probabilities decline by a common ratio: each probability is the same fixed multiple, (1 − p), of the one before it. Such a
sequence is called a geometric progression, and thus the name for this probability density function.
The number of components that you would expect to test until you find the first defective component is the mean,
µ = 50.
The formula for the mean for the random variable defined as number of trials until first success is
$\mu = \frac{1}{p} = \frac{1}{0.02} = 50$
See Example 4.9 for an example where the geometric random variable is defined as number of failures until
first success. The expected value of this formula for the geometric will be different from this version of the
distribution.
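The formula for the variance is $\sigma^2 = \left(\frac{1}{p}\right)\left(\frac{1}{p} - 1\right) = \left(\frac{1}{0.02}\right)\left(\frac{1}{0.02} - 1\right) = 2{,}450$
The standard deviation is $\sigma = \sqrt{\left(\frac{1}{p}\right)\left(\frac{1}{p} - 1\right)} = \sqrt{\left(\frac{1}{0.02}\right)\left(\frac{1}{0.02} - 1\right)} = 49.5$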
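The figures in this example follow from p = 0.02 and can be reproduced as below. This is a sketch using the trials-until-first-success form of the geometric distribution; the variable names are illustrative.

```python
import math

p = 0.02

p_seventh = (1 - p) ** (7 - 1) * p      # six good components, then a defect
mean = 1 / p
variance = (1 / p) * (1 / p - 1)
std_dev = math.sqrt(variance)

print(round(p_seventh, 4), round(mean, 1), round(variance, 1), round(std_dev, 1))
# 0.0177 50.0 2450.0 49.5
```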
Example 4.9
The lifetime risk of developing pancreatic cancer is about one in 78 (1.28%). Let X = the number of people you
ask before one says he or she has pancreatic cancer. The random variable X in this case includes only the number
of trials that were failures and does not count the trial that was a success in finding a person who had the disease.
The appropriate formula for this random variable is the second one presented above. Then X is a discrete random
variable with a geometric distribution: $X \sim G\left(\frac{1}{78}\right)$ or X ~ G(0.0128).
a. What is the probability that you ask 9 people before one says he or she has pancreatic cancer? This is
asking, what is the probability that you ask 9 people unsuccessfully and the tenth person is a success?
b. What is the probability that you must ask 20 people?
c. Find the (i) mean and (ii) standard deviation of X.
Solution 4.9
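a. $P(x = 9) = (1 - 0.0128)^{9} \times 0.0128 = 0.0114$
b. Asking 20 people means 19 failures followed by a success on the twentieth: $P(x = 19) = (1 - 0.0128)^{19} \times 0.0128 = 0.01$
c. i. Mean = $\mu = \frac{1-p}{p} = \frac{1 - 0.0128}{0.0128} = 77.12$
ii. Standard Deviation = $\sigma = \sqrt{\frac{1-p}{p^2}} = \sqrt{\frac{1 - 0.0128}{0.0128^2}} \approx 77.62$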
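These answers use the failures-before-first-success form of the geometric distribution and can be checked as follows. This is a sketch; `geom_failures_pmf` is a hypothetical helper, and part b is read as 19 unsuccessful asks followed by a success on the twentieth.

```python
import math

def geom_failures_pmf(x, p):
    """Probability of exactly x failures before the first success."""
    return (1 - p) ** x * p

p = 0.0128

part_a = geom_failures_pmf(9, p)     # 9 failures, success on the 10th person
part_b = geom_failures_pmf(19, p)    # 19 failures, success on the 20th person
mean = (1 - p) / p
std_dev = math.sqrt((1 - p) / p**2)

print(round(part_a, 4), round(part_b, 4), mean, round(std_dev, 2))
```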
4.9 The literacy rate for a nation measures the proportion of people age 15 and over who can read and write. The
literacy rate for women in The United Colonies of Independence is 12%. Let X = the number of women you ask until
one says that she is literate.
a. What is the probability distribution of X?
b. What is the probability that you ask five women before one says she is literate?
c. What is the probability that you must ask ten women?
Example 4.10
A baseball player has a batting average of 0.320. This is the general probability that he gets a hit each time he is
at bat.
What is the probability that he gets his first hit in the third trip to bat?
Solution 4.10
$P(x = 3) = (1 - 0.32)^{3-1} \times 0.32 = 0.1480$
In this case the sequence is failure, failure, success.
How many trips to bat do you expect the hitter to need before getting a hit?
Solution 4.10
$\mu = \frac{1}{p} = \frac{1}{0.320} = 3.125 \approx 3$
This is the expected number of trips to bat up to and including the first hit, and therefore the mean of the distribution.
Example 4.11
There is an 80% chance that a Dalmatian dog has 13 black spots. You go to a dog show and count the spots on
Dalmatians. What is the probability that you will review the spots on 3 dogs before you find one that has 13 black
spots?
Solution 4.11
$P(x = 3) = (1 - 0.80)^{3} \times 0.80 = 0.0064$
Example 4.12
A bank expects to receive six bad checks per day, on average. What is the probability of the bank getting fewer
than five bad checks on any given day? Of interest is the number of checks the bank receives in one day, so the
time interval of interest is one day. Let X = the number of bad checks the bank receives in one day. If the bank
expects to receive six bad checks per day then the average is six checks per day. Write a mathematical statement
for the probability question.
Solution 4.12
P(x < 5)
Example 4.13
You notice that a news reporter says "uh," on average, two times per broadcast. What is the probability that the
news reporter says "uh" more than two times per broadcast?
This is a Poisson problem because you are interested in knowing the number of times the news reporter says "uh"
during a broadcast.
Solution 4.13
a. one broadcast measured in minutes
b. What is the average number of times the news reporter says "uh" during one broadcast?
Solution 4.13
b. 2
Solution 4.13
c. Let X = the number of times the news reporter says "uh" during one broadcast.
x = 0, 1, 2, 3, ...
Solution 4.13
d. P(x > 2)
Example 4.14
Leah's answering machine receives about six telephone calls between 8 a.m. and 10 a.m. What is the probability
that Leah receives more than one call in the next 15 minutes?
Let X = the number of calls Leah receives in 15 minutes. (The interval of interest is 15 minutes or 1/4 hour.)
x = 0, 1, 2, 3, ...
If Leah receives, on the average, six telephone calls in two hours, and there are eight 15 minute intervals in two
hours, then Leah receives
$\left(\frac{1}{8}\right)(6) = 0.75$ calls in 15 minutes, on average. So, μ = 0.75 for this problem.
X ~ P(0.75)
Find P(x > 1). P(x > 1) = 0.1734
Probability that Leah receives more than one telephone call in the next 15 minutes is about 0.1734.
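The value 0.1734 comes from the complement rule: P(x > 1) = 1 − P(x = 0) − P(x = 1). The sketch below uses a hypothetical `poisson_pmf` helper built from the Poisson formula with mean μ.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x events when the mean number of events is mu."""
    return mu**x * math.exp(-mu) / math.factorial(x)

mu = 6 / 8    # six calls in two hours, eight 15-minute intervals
p_more_than_one = 1 - poisson_pmf(0, mu) - poisson_pmf(1, mu)
print(mu, round(p_more_than_one, 4))  # 0.75 0.1734
```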
The graph of X ~ P(0.75) is:
Figure 4.3
The y-axis contains the probability of x where X = the number of calls in 15 minutes.
Example 4.15
According to a survey a university professor gets, on average, 7 emails per day. Let X = the number of emails a
professor receives per day. The discrete random variable X takes on the values x = 0, 1, 2 …. The random variable
X has a Poisson distribution: X ~ P(7). The mean is 7 emails.
a. What is the probability that an email user receives exactly 2 emails per day?
b. What is the probability that an email user receives at most 2 emails per day?
c. What is the standard deviation?
Solution 4.15
a. $P(x = 2) = \frac{\mu^{x} e^{-\mu}}{x!} = \frac{7^{2} e^{-7}}{2!} = 0.022$
b. $P(x \le 2) = \frac{7^{0} e^{-7}}{0!} + \frac{7^{1} e^{-7}}{1!} + \frac{7^{2} e^{-7}}{2!} \approx 0.030$
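The three parts of this example can be worked through numerically. This is a sketch with a hypothetical `poisson_pmf` helper; part c uses the fact that the standard deviation of a Poisson random variable is the square root of its mean.

```python
import math

def poisson_pmf(x, mu):
    return mu**x * math.exp(-mu) / math.factorial(x)

mu = 7   # average emails per day

exactly_two = poisson_pmf(2, mu)
at_most_two = sum(poisson_pmf(x, mu) for x in range(0, 3))
std_dev = math.sqrt(mu)

print(round(exactly_two, 4), round(at_most_two, 4), round(std_dev, 2))
# 0.0223 0.0296 2.65
```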
Example 4.16
Text message users receive or send an average of 41.5 text messages per day.
a. How many text messages does a text message user receive or send per hour?
b. What is the probability that a text message user receives or sends two messages per hour?
c. What is the probability that a text message user receives or sends more than two messages per hour?
Solution 4.16
a. Let X = the number of texts that a user sends or receives in one hour. The average number of texts received
per hour is $\frac{41.5}{24} \approx 1.7292$.
b. $P(x = 2) = \frac{\mu^{x} e^{-\mu}}{x!} = \frac{1.729^{2} e^{-1.729}}{2!} = 0.265$
c. $P(x > 2) = 1 - P(x \le 2) = 1 - \left[\frac{1.729^{0} e^{-1.729}}{0!} + \frac{1.729^{1} e^{-1.729}}{1!} + \frac{1.729^{2} e^{-1.729}}{2!}\right] = 0.250$
Example 4.17
On May 13, 2013, starting at 4:30 PM, the probability of low seismic activity for the next 48 hours in Alaska
was reported as about 1.02%. Use this information for the next 200 days to find the probability that there will be
low seismic activity in ten of the next 200 days. Use both the binomial and Poisson distributions to calculate the
probabilities. Are they close?
Solution 4.17
Let X = the number of days with low seismic activity.
Using the binomial distribution:
• $P(x = 10) = \frac{200!}{10!\,(200 - 10)!} \times 0.0102^{10} \times (1 - 0.0102)^{190} = 0.000039$
Using the Poisson distribution:
• Calculate μ = np = 200(0.0102) = 2.04
• $P(x = 10) = \frac{\mu^{x} e^{-\mu}}{x!} = \frac{2.04^{10} e^{-2.04}}{10!} = 0.000045$
We expect the approximation to be good because n is large (greater than 20) and p is small (less than 0.05). The
results are close—both probabilities reported are almost 0.
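Both calculations can be placed side by side in code. This is a sketch assuming Python 3.8 or later; the variable names are illustrative. Note that the binomial probability includes the factor (1 − p) raised to the number of failures, 190.

```python
import math

n, p, x = 200, 0.0102, 10

# Binomial: exact under the independence assumption.
binomial = math.comb(n, x) * p**x * (1 - p)**(n - x)

# Poisson approximation with mu = np.
mu = n * p
poisson = mu**x * math.exp(-mu) / math.factorial(x)

print(f"{binomial:.6f} {poisson:.6f}")  # 0.000039 0.000045
```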
Example 4.18
A survey of 500 seniors in the Price Business School yields the following information. 75% go straight to work
after graduation. 15% go on to work on their MBA. 9% stay to get a minor in another program. 1% go on to get a
Master's in Finance.
What is the probability that more than 2 seniors go to graduate school for their Master's in finance?
Solution 4.18
This is clearly a binomial probability distribution problem. The choices are binary when we define the results as
"Graduate School in Finance" versus "all other options." The random variable is discrete, and the events are, we
could assume, independent. Solving as a binomial problem, we have:
Binomial Solution
n * p = 500 * 0.01 = 5 = µ
$P(0) = \frac{500!}{0!\,(500 - 0)!}\, 0.01^{0} (1 - 0.01)^{500 - 0} = 0.00657$
$P(1) = \frac{500!}{1!\,(500 - 1)!}\, 0.01^{1} (1 - 0.01)^{500 - 1} = 0.03318$
$P(2) = \frac{500!}{2!\,(500 - 2)!}\, 0.01^{2} (1 - 0.01)^{500 - 2} = 0.08363$
Adding all 3 together = 0.12339
1 − 0.12339 = 0.87661
Poisson approximation
n * p = 500 * 0.01 = 5 = µ
n * p * (1 − p) = 500 * 0.01 * (0.99) ≈ 5 = $\sigma^2$ = µ
$P(x) = \frac{e^{-np}(np)^{x}}{x!}$
$P(0) = \frac{e^{-5} \cdot 5^{0}}{0!}$, $P(1) = \frac{e^{-5} \cdot 5^{1}}{1!}$, $P(2) = \frac{e^{-5} \cdot 5^{2}}{2!}$
P(0) + P(1) + P(2) = 0.0067 + 0.0337 + 0.0842 ≈ 0.1247
1 − 0.1247 = 0.8753
An approximation that is off by about one one-thousandth is certainly an acceptable approximation.
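The comparison in this example can be rerun with a few lines of code. This is a sketch assuming Python 3.8 or later; the helper names are hypothetical.

```python
import math

def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    return mu**x * math.exp(-mu) / math.factorial(x)

n, p = 500, 0.01
mu = n * p

# P(more than 2) = 1 - [P(0) + P(1) + P(2)] under each model.
binomial_answer = 1 - sum(binom_pmf(x, n, p) for x in range(3))
poisson_answer = 1 - sum(poisson_pmf(x, mu) for x in range(3))

print(round(binomial_answer, 4), round(poisson_answer, 4))
print(round(abs(binomial_answer - poisson_answer), 4))
```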
KEY TERMS
Bernoulli Trials an experiment with the following characteristics:
1. There are only two possible outcomes called “success” and “failure” for each trial.
2. The probability p of a success is the same for any trial (so the probability q = 1 − p of a failure is the same for
any trial).
Binomial Experiment a statistical experiment that satisfies the following three conditions:
1. There are a fixed number of trials, n.
2. There are only two possible outcomes, called "success" and, "failure," for each trial. The letter p denotes the
probability of a success on one trial, and q denotes the probability of a failure on one trial.
3. The n trials are independent and are repeated using identical conditions.
Binomial Probability Distribution a discrete random variable (RV) that arises from Bernoulli trials; there are a fixed
number, n, of independent trials. “Independent” means that the result of any trial (for example, trial one) does not
affect the results of the following trials, and all trials are conducted under the same conditions. Under these
circumstances the binomial RV X is defined as the number of successes in n trials. The mean is μ = np and the
standard deviation is $\sigma = \sqrt{npq}$. The probability of exactly x successes in n trials is
$P(X = x) = \binom{n}{x} p^{x} q^{n-x}$.
Geometric Distribution a discrete random variable (RV) that arises from the Bernoulli trials; the trials are repeated
until the first success. The geometric variable X is defined as the number of trials until the first success. The mean is
$\mu = \frac{1}{p}$ and the standard deviation is $\sigma = \sqrt{\frac{1}{p}\left(\frac{1}{p} - 1\right)}$. The probability that the first success occurs on the
xth trial is given by the formula $P(X = x) = p(1 - p)^{x-1}$, where one wants to know the probability for the number
of trials until the first success: the xth trial is the first success.
An alternative formulation of the geometric distribution asks the question: what is the probability of x failures until
the first success? In this formulation the trial that resulted in the first success is not counted. The formula for this
presentation of the geometric is: $P(X = x) = p(1 - p)^{x}$
The expected value in this form of the geometric distribution is $\mu = \frac{1-p}{p}$
The easiest way to keep these two forms of the geometric distribution straight is to remember that p is the
probability of success and (1−p) is the probability of failure. In the formula the exponents simply count the number
of successes and number of failures of the desired outcome of the experiment. Of course the sum of these two
numbers must add to the number of trials in the experiment.
Poisson Probability Distribution a discrete random variable (RV) that counts the number of times a certain event
will occur in a specific interval; characteristics of the variable:
• The probability that the event occurs in a given interval is the same for all intervals.
• The events occur with a known mean and independently of the time since the last event.
The distribution is defined by the mean μ of the event in the interval. The mean is μ = np. The standard deviation
is $\sigma = \sqrt{\mu}$. The probability of having exactly x occurrences in the interval is $P(x) = \frac{\mu^{x} e^{-\mu}}{x!}$. The Poisson distribution is
often used to approximate the binomial distribution, when n is “large” and p is “small” (a general rule is that np
should be greater than or equal to 25 and p should be less than or equal to 0.01).
Probability Distribution Function (PDF) a mathematical description of a discrete random variable (RV), given
either in the form of an equation (formula) or in the form of a table listing all the possible outcomes of an
experiment and the probability associated with each outcome.
Random Variable (RV) a characteristic of interest in a population being studied; common notation for variables are
upper case Latin letters X, Y, Z,...; common notation for a specific value from the domain (set of all possible values
of a variable) are lower case Latin letters x, y, and z. For example, if X is the number of children in a family, then x
represents a specific integer 0, 1, 2, 3,.... Variables in statistics differ from variables in intermediate algebra in the
two following ways.
• The domain of the random variable (RV) is not necessarily a numerical set; the domain may be expressed in
words; for example, if X = hair color then the domain is {black, blond, gray, green, orange}.
• We can tell what specific value x the random variable X takes only after performing the experiment.
CHAPTER REVIEW
4.0 Introduction
The characteristics of a probability distribution or density function (PDF) are as follows:
1. Each probability is between zero and one, inclusive (inclusive means to include zero and one).
2. The sum of the probabilities is one.
probability of a success on one trial and q denotes the probability of a failure on one trial.
3. The n trials are independent and are repeated using identical conditions.
The outcomes of a binomial experiment fit a binomial probability distribution. The random variable X = the number of
successes obtained in the n independent trials. The mean of X can be calculated using the formula μ = np, and the standard
deviation is given by the formula $\sigma = \sqrt{npq}$.
$P(x) = \frac{n!}{x!(n - x)!} \cdot p^x q^{(n - x)}$
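As a rough illustration, the binomial formulas above can be evaluated in a few lines of Python. The function names and the values n = 20 and p = 0.02 are chosen here only for illustration and are not part of the chapter review.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(x) = n! / (x!(n - x)!) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def binomial_mean_sd(n, p):
    """Return (mu, sigma) with mu = np and sigma = sqrt(npq)."""
    q = 1 - p
    return n * p, sqrt(n * p * q)

# Illustrative values only: n = 20 independent trials, p = 0.02 per trial.
n, p = 20, 0.02
mu, sigma = binomial_mean_sd(n, p)
p_zero = binomial_pmf(0, n, p)                            # probability of no successes
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))  # should be very close to 1

print(mu, sigma, p_zero, total)
```

Summing the probabilities over x = 0, 1, ..., n is a quick sanity check that the function describes a valid probability distribution.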
An alternative formulation of the geometric distribution asks the question: what is the probability of x failures until the first
success? In this formulation the trial that resulted in the first success is not counted. The formula for this presentation of the
geometric is:
$P(X = x) = p(1 - p)^x$
The easiest way to keep these two forms of the geometric distribution straight is to remember that p is the probability of
success and (1−p) is the probability of failure. In the formula the exponents simply count the number of successes and
number of failures of the desired outcome of the experiment. Of course the sum of these two numbers must add to the
number of trials in the experiment.
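A minimal sketch of the two forms of the geometric distribution follows. The function names and the value p = 0.4 are assumptions made for this illustration. The point is that "first success on trial 3" and "2 failures before the first success" describe the same event, so both forms give the same probability.

```python
def geometric_trials_pmf(x, p):
    """P(X = x) = p(1 - p)^(x - 1): the first success occurs on trial x (x = 1, 2, 3, ...)."""
    return p * (1 - p) ** (x - 1)

def geometric_failures_pmf(x, p):
    """P(X = x) = p(1 - p)^x: x failures occur before the first success (x = 0, 1, 2, ...)."""
    return p * (1 - p) ** x

p = 0.4  # illustrative probability of success on a single trial

# First success on trial 3 means exactly 2 failures came first.
by_trials = geometric_trials_pmf(3, p)
by_failures = geometric_failures_pmf(2, p)

print(by_trials, by_failures)
```

In both functions the exponent on (1 - p) counts failures and the single factor p counts the one success.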
FORMULA REVIEW
4.1 Hypergeometric Distribution
$h(x) = \frac{\binom{A}{x}\binom{N - A}{n - x}}{\binom{N}{n}}$
4.2 Binomial Distribution
X ~ B(n, p) means that the discrete random variable X has a binomial probability distribution with n trials and
probability of success p.
X = the number of successes in n independent trials
n = the number of independent trials
X takes on the values x = 0, 1, 2, 3, ..., n
p = the probability of a success for any trial
q = the probability of a failure for any trial
p + q = 1
q = 1 – p
The mean of X is μ = np. The standard deviation of X is $\sigma = \sqrt{npq}$.
$P(x) = \frac{n!}{x!(n - x)!} \cdot p^x q^{(n - x)}$
where P(X) is the probability of X successes in n trials when the probability of a success in ANY ONE TRIAL is p.
4.3 Geometric Distribution
$P(X = x) = p(1 - p)^{x - 1}$
X ~ G(p) means that the discrete random variable X has a geometric probability distribution with probability of
success in a single trial p.
X = the number of independent trials until the first success
X takes on the values x = 1, 2, 3, ...
p = the probability of a success for any trial
q = the probability of a failure for any trial
p + q = 1
q = 1 – p
The mean is $\mu = \frac{1}{p}$.
The standard deviation is $\sigma = \sqrt{\frac{1 - p}{p^2}} = \sqrt{\frac{1}{p}\left(\frac{1}{p} - 1\right)}$.
4.4 Poisson Distribution
X ~ P(μ) means that X has a Poisson probability distribution where X = the number of occurrences in the interval of
interest.
X takes on the values x = 0, 1, 2, 3, ...
The mean μ or λ is typically given.
The variance is $\sigma^2 = \mu$, and the standard deviation is $\sigma = \sqrt{\mu}$.
When P(μ) is used to approximate a binomial distribution, μ = np where n represents the number of independent trials
and p represents the probability of success in a single trial.
$P(x) = \frac{\mu^x e^{-\mu}}{x!}$
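The hypergeometric formula in 4.1 is built from three combinations, which Python exposes as math.comb. The sketch below reads N as the population size, A as the number of items of interest in the population, n as the sample size, and x as the number of items of interest in the sample. The numbers N = 20, A = 6, and n = 5 are hypothetical and serve only to exercise the formula.

```python
from math import comb

def hypergeometric_pmf(x, N, A, n):
    """h(x) = C(A, x) * C(N - A, n - x) / C(N, n), sampling without replacement."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# Hypothetical setup: 20 items, 6 of interest, 5 drawn without replacement.
N, A, n = 20, 6, 5
probs = [hypergeometric_pmf(x, N, A, n) for x in range(0, min(A, n) + 1)]

for x, h in enumerate(probs):
    print(x, round(h, 4))
print("sum of probabilities:", sum(probs))
```

Because the listed x values cover every possible outcome, the probabilities should sum to one, up to floating-point rounding.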
PRACTICE
4.0 Introduction
Use the following information to answer the next five exercises: A company wants to evaluate its attrition rate, in other
words, how long new hires stay with the company. Over the years, they have established the following probability
distribution.
Let X = the number of years a new hire will stay with the company.
Let P(x) = the probability that a new hire will stay with the company x years.
x P(x)
0 0.12
1 0.18
2 0.30
3 0.15
4
5 0.10
6 0.05
Table 4.1
2. P(x = 4) = _______
3. P(x ≥ 5) = _______
4. On average, how long would you expect a new hire to stay with the company?
5. What does the column “P(x)” sum to?
Use the following information to answer the next six exercises: A baker is deciding how many batches of muffins to make to
sell in his bakery. He wants to make enough to sell every one and no fewer. Through observation, the baker has established
a probability distribution.
x P(x)
1 0.15
2 0.35
3 0.40
4 0.10
Table 4.2
Use the following information to answer the next four exercises: Ellen has music practice three days a week. She practices
for all of the three days 85% of the time, two days 8% of the time, one day 4% of the time, and no days 3% of the time. One
week is selected at random.
10. Define the random variable X.
11. Construct a probability distribution table for the data.
12. We know that for a probability distribution function to be discrete, it must have two characteristics. One is that the sum
of the probabilities is one. What is the other characteristic?
Use the following information to answer the next five exercises: Javier volunteers in community events each month. He
does not do more than five events in a month. He attends exactly five events 35% of the time, four events 25% of the time,
three events 20% of the time, two events 10% of the time, one event 5% of the time, and no events 5% of the time.
13. Define the random variable X.
14. What values does x take on?
15. Construct a PDF table.
16. Find the probability that Javier volunteers for less than three events each month. P(x < 3) = _______
17. Find the probability that Javier volunteers for at least one event each month. P(x > 0) = _______
x P(x)
Table 4.3
24. On average (μ), how many would you expect to answer yes?
25. What is the standard deviation (σ)?
26. What is the probability that at most five of the freshmen reply “yes”?
27. What is the probability that at least two of the freshmen reply “yes”?
data from 203,967 incoming first-time, full-time freshmen from 270 four-year colleges and universities in the U.S. 71.3%
of those students replied that, yes, they believe that same-sex couples should have the right to legal marital status. Suppose
that you randomly select freshman from the study until you find one who replies “yes.” You are interested in the number of
freshmen you must ask.
28. In words, define the random variable X.
29. X ~ _____(_____,_____)
30. What values does the random variable X take on?
31. Construct the probability distribution function (PDF). Stop at x = 6.
x P(x)
1
2
3
4
5
6
Table 4.4
32. On average (μ), how many freshmen would you expect to have to ask until you found one who replies "yes?"
33. What is the probability that you will need to ask fewer than three freshmen?
Use the following information to answer the next six exercises: On average, eight teens in the U.S. die from motor vehicle
injuries per day. As a result, states across the country are debating raising the driving age.
41. Assume the event occurs independently in any given day. In words, define the random variable X.
42. X ~ _____(_____,_____)
43. What values does X take on?
44. For the given values of the random variable X, fill in the corresponding probabilities.
45. Is it likely that there will be no teens killed from motor vehicle injuries on any given day in the U.S? Justify your answer
numerically.
46. Is it likely that there will be more than 20 teens killed from motor vehicle injuries on any given day in the U.S.? Justify
your answer numerically.
HOMEWORK
56. On average, for every 25 patients calling in, how many do you expect to have the flu?
57. People visiting video rental stores often rent more than one DVD at a time. The probability distribution for DVD rentals
per customer at Video To Go is given in Table 4.5. There is a five-video limit per customer at this store, so nobody ever rents
more than five DVDs.
x P(x)
0 0.03
1 0.50
2 0.24
3
4 0.07
5 0.04
Table 4.5
63. A student takes a 32-question multiple-choice exam, but did not study and randomly guesses each answer. Each question
has three possible choices for the answer. Find the probability that the student guesses more than 75% of the questions
correctly.
64. Six different colored dice are rolled. Of interest is the number of dice that show a one.
a. In words, define the random variable X.
b. List the values that X may take on.
c. On average, how many dice would you expect to show a one?
d. Find the probability that all six dice show a one.
e. Is it more likely that three or that four dice will show a one? Justify your answer numerically.
65. More than 96 percent of the very largest colleges and universities (more than 15,000 total enrollments) have some online
offerings. Suppose you randomly pick 13 such institutions. We are interested in the number that offer distance learning
courses.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. On average, how many schools would you expect to offer such courses?
e. Find the probability that at most ten offer such courses.
f. Is it more likely that 12 or that 13 will offer such courses? Justify your answer numerically, and respond in a
complete sentence.
66. Suppose that about 85% of graduating students attend their graduation. A group of 22 graduating students is randomly
chosen.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. How many are expected to attend their graduation?
e. Find the probability that 17 or 18 attend.
f. Based on numerical values, would you be surprised if all 22 attended graduation? Justify your answer numerically.
67. At The Fencing Center, 60% of the fencers use the foil as their main weapon. We randomly survey 25 fencers at The
Fencing Center. We are interested in the number of fencers who do not use the foil as their main weapon.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. How many are expected not to use the foil as their main weapon?
e. Find the probability that six do not use the foil as their main weapon.
f. Based on numerical values, would you be surprised if all 25 did not use foil as their main weapon? Justify your
answer numerically.
68. Approximately 8% of students at a local high school participate in after-school sports all four years of high school. A
group of 60 seniors is randomly chosen. Of interest is the number who participated in after-school sports all four years of
high school.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. How many seniors are expected to have participated in after-school sports all four years of high school?
e. Based on numerical values, would you be surprised if none of the seniors participated in after-school sports all
four years of high school? Justify your answer numerically.
f. Based upon numerical values, is it more likely that four or that five of the seniors participated in after-school
sports all four years of high school? Justify your answer numerically.
69. The chance of an IRS audit for a tax return with over $25,000 in income is about 2% per year. We are interested in the
expected number of audits a person with that income has in a 20-year period. Assume each year is independent.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. How many audits are expected in a 20-year period?
e. Find the probability that a person is not audited at all.
f. Find the probability that a person is audited more than twice.
70. It has been estimated that only about 30% of California residents have adequate earthquake supplies. Suppose you
randomly survey 11 California residents. We are interested in the number who have adequate earthquake supplies.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. What is the probability that at least eight have adequate earthquake supplies?
e. Is it more likely that none or that all of the residents surveyed will have adequate earthquake supplies? Why?
f. How many residents do you expect will have adequate earthquake supplies?
71. There are two similar games played for Chinese New Year and Vietnamese New Year. In the Chinese version, fair dice
with numbers 1, 2, 3, 4, 5, and 6 are used, along with a board with those numbers. In the Vietnamese version, fair dice with
pictures of a gourd, fish, rooster, crab, crayfish, and deer are used. The board has those six objects on it, also. We will play
with bets being $1. The player places a bet on a number or object. The “house” rolls three dice. If none of the dice show the
number or object that was bet, the house keeps the $1 bet. If one of the dice shows the number or object bet (and the other
two do not show it), the player gets back his or her $1 bet, plus $1 profit. If two of the dice show the number or object bet
(and the third die does not show it), the player gets back his or her $1 bet, plus $2 profit. If all three dice show the number
or object bet, the player gets back his or her $1 bet, plus $3 profit. Let X = number of matches and Y = profit per game.
a. In words, define the random variable X.
b. List the values that X may take on.
c. List the values that Y may take on. Then, construct one PDF table that includes both X and Y and their
probabilities.
d. Calculate the average expected matches over the long run of playing this game for the player.
e. Calculate the average expected earnings over the long run of playing this game for the player.
f. Determine who has the advantage, the player or the house.
72. According to The World Bank, only 9% of the population of Uganda had access to electricity as of 2009. Suppose we
randomly sample 150 people in Uganda. Let X = the number of people who have access to electricity.
a. What is the probability distribution for X?
b. Using the formulas, calculate the mean and standard deviation of X.
c. Find the probability that 15 people in the sample have access to electricity.
d. Find the probability that at most ten people in the sample have access to electricity.
e. Find the probability that more than 25 people in the sample have access to electricity.
73. The literacy rate for a nation measures the proportion of people age 15 and over that can read and write. The literacy
rate in Afghanistan is 28.1%. Suppose you choose 15 people in Afghanistan at random. Let X = the number of people who
are literate.
a. Sketch a graph of the probability distribution of X.
b. Using the formulas, calculate the (i) mean and (ii) standard deviation of X.
c. Find the probability that more than five people in the sample are literate. Is it more likely that three people or
four people are literate?
75. Suppose that the probability that an adult in America will watch the Super Bowl is 40%. Each person is considered
independent. We are interested in the number of adults in America we must survey until we find one who will watch the
Super Bowl.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. How many adults in America do you expect to survey until you find one who will watch the Super Bowl?
e. Find the probability that you must ask seven people.
f. Find the probability that you must ask three or four people.
76. It has been estimated that only about 30% of California residents have adequate earthquake supplies. Suppose we
are interested in the number of California residents we must survey until we find a resident who does not have adequate
earthquake supplies.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. What is the probability that we must survey just one or two residents until we find a California resident who does
not have adequate earthquake supplies?
e. What is the probability that we must survey at least three California residents until we find a California resident
who does not have adequate earthquake supplies?
f. How many California residents do you expect to need to survey until you find a California resident who does not
have adequate earthquake supplies?
g. How many California residents do you expect to need to survey until you find a California resident who does
have adequate earthquake supplies?
77. In one of its Spring catalogs, L.L. Bean® advertised footwear on 29 of its 192 catalog pages. Suppose we randomly
survey 20 pages. We are interested in the number of pages that advertise footwear. Each page may be picked more than
once.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. How many pages do you expect to advertise footwear on them?
e. Is it probable that all twenty will advertise footwear on them? Why or why not?
f. What is the probability that fewer than ten will advertise footwear on them?
g. Reminder: A page may be picked more than once. We are interested in the number of pages that we must
randomly survey until we find one that has footwear advertised on it. Define the random variable X and give its
distribution.
h. What is the probability that you only need to survey at most three pages in order to find one that advertises
footwear on it?
i. How many pages do you expect to need to survey in order to find one that advertises footwear?
78. Suppose that you are performing the probability experiment of rolling one fair six-sided die. Let F be the event of
rolling a four or a five. You are interested in how many times you need to roll the die in order to obtain the first four or five
as the outcome.
• p = probability of success (event F occurs)
• q = probability of failure (event F does not occur)
a. Write the description of the random variable X.
b. What are the values that X can take on?
c. Find the values of p and q.
d. Find the probability that the first occurrence of event F (rolling a four or five) is on the second trial.
79. Ellen has music practice three days a week. She practices for all of the three days 85% of the time, two days 8% of the
time, one day 4% of the time, and no days 3% of the time. One week is selected at random. What values does X take on?
80. The World Bank records the prevalence of HIV in countries around the world. According to their data, “Prevalence of
HIV refers to the percentage of people ages 15 to 49 who are infected with HIV.”[1] In South Africa, the prevalence of HIV
is 17.3%. Let X = the number of people you test until you find a person infected with HIV.
a. Sketch a graph of the distribution of the discrete random variable X.
b. What is the probability that you must test 30 people to find one with HIV?
c. What is the probability that you must ask ten people?
d. Find the (i) mean and (ii) standard deviation of the distribution of X.
81. According to a recent Pew Research poll, 75% of millennials (people born between 1981 and 1995) have a profile on
a social networking site. Let X = the number of millennials you ask until you find a person without a profile on a social
networking site.
a. Describe the distribution of X.
b. Find the (i) mean and (ii) standard deviation of X.
c. What is the probability that you must ask ten people to find one person without a profile on a social networking site?
d. What is the probability that you must ask 20 people to find one person without a profile on a social networking site?
e. What is the probability that you must ask at most five people?
1. “Prevalence of HIV, total (% of populations ages 15-49),” The World Bank, 2013. Available online at
http://data.worldbank.org/indicator/SH.DYN.AIDS.ZS?order=wbapi_data_value_2011+wbapi_data_value+wbapi_data_value-last&sort=desc
(accessed May 15, 2013).
87. Fertile, female cats produce an average of three litters per year. Suppose that one fertile, female cat is randomly chosen.
In one year, find the probability she produces:
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _______
d. Find the probability that she has no litters in one year.
e. Find the probability that she has at least two litters in one year.
f. Find the probability that she has exactly three litters in one year.
88. The chance of having an extra fortune in a fortune cookie is about 3%. Given a bag of 144 fortune cookies, we are
interested in the number of cookies with an extra fortune. Two distributions may be used to solve this problem, but only use
one distribution to solve the problem.
a. In words, define the random variable X.
b. List the values that X may take on.
c. How many cookies do we expect to have an extra fortune?
d. Find the probability that none of the cookies have an extra fortune.
e. Find the probability that more than three have an extra fortune.
f. As n increases, what happens involving the probabilities using the two distributions? Explain in complete
sentences.
89. According to the South Carolina Department of Mental Health web site, for every 200 U.S. women, the average number
who suffer from anorexia is one. Out of a randomly chosen group of 600 U.S. women determine the following.
a. In words, define the random variable X.
b. List the values that X may take on.
c. Give the distribution of X. X ~ _____(_____,_____)
d. How many are expected to suffer from anorexia?
e. Find the probability that no one suffers from anorexia.
f. Find the probability that more than four suffer from anorexia.
90. The chance of an IRS audit for a tax return with over $25,000 in income is about 2% per year. Suppose that 100 people
with tax returns over $25,000 are randomly picked. We are interested in the number of people audited in one year. Use a
Poisson distribution to answer the following questions.
a. In words, define the random variable X.
b. List the values that X may take on.
c. How many are expected to be audited?
d. Find the probability that no one was audited.
e. Find the probability that at least three were audited.
91. Approximately 8% of students at a local high school participate in after-school sports all four years of high school. A
group of 60 seniors is randomly chosen. Of interest is the number that participated in after-school sports all four years of
high school.
a. In words, define the random variable X.
b. List the values that X may take on.
c. How many seniors are expected to have participated in after-school sports all four years of high school?
d. Based on numerical values, would you be surprised if none of the seniors participated in after-school sports all
four years of high school? Justify your answer numerically.
e. Based on numerical values, is it more likely that four or that five of the seniors participated in after-school sports
all four years of high school? Justify your answer numerically.
92. On average, Pierre, an amateur chef, drops three pieces of egg shell into every two cake batters he makes. Suppose that
you buy one of his cakes.
a. In words, define the random variable X.
b. List the values that X may take on.
c. On average, how many pieces of egg shell do you expect to be in the cake?
d. What is the probability that there will not be any pieces of egg shell in the cake?
e. Let’s say that you buy one of Pierre’s cakes each week for six weeks. What is the probability that there will not
be any egg shell in any of the cakes?
f. Based upon the average given for Pierre, is it possible for there to be seven pieces of shell in the cake? Why?
Use the following information to answer the next two exercises: The average number of times per week that Mrs. Plum’s
cats wake her up at night because they want to play is ten. We are interested in the number of times her cats wake her up
each week.
93. In words, the random variable X = _________________
a. the number of times Mrs. Plum’s cats wake her up each week.
b. the number of times Mrs. Plum’s cats wake her up each hour.
c. the number of times Mrs. Plum’s cats wake her up each night.
d. the number of times Mrs. Plum’s cats wake her up.
94. Find the probability that her cats will wake her up no more than five times next week.
a. 0.5000
b. 0.9329
c. 0.0378
d. 0.0671
SOLUTIONS
1
x P(x)
0 0.12
1 0.18
2 0.30
3 0.15
4 0.10
5 0.10
6 0.05
Table 4.6
x P(x)
0 0.03
1 0.04
2 0.08
3 0.85
Table 4.7
15
x P(x)
0 0.05
1 0.05
2 0.10
3 0.20
4 0.25
5 0.35
Table 4.8
17 1 – 0.05 = 0.95
18 X = the number of business majors in the sample.
19 2, 3, 4, 5, 6, 7, 8, 9
20 X = the number that reply “yes”
22 0, 1, 2, 3, 4, 5, 6, 7, 8
24 5.7
26 0.4151
28 X = the number of freshmen selected from the study until one replied "yes" that same-sex couples should have the right
to legal marital status.
30 1,2,…
32 1.4
35 0, 1, 2, 3, 4, …
37 0.0485
39 0.0214
41 X = the number of U.S. teens who die from motor vehicle injuries per day.
43 0, 1, 2, 3, 4, ...
45 No
48
a. X = the number of pages that advertise footwear
b. 0, 1, 2, 3, ..., 20
c. 3.03
d. 1.5197
50
a. X = the number of Patriots picked
b. 0, 1, 2, 3, 4
c. Without replacement
53 X = the number of patients calling in claiming to have the flu, who actually have the flu. X = 0, 1, 2, ...25
55 0.0165
57
a. X = the number of DVDs a Video to Go customer rents
b. 0.12
c. 0.11
d. 0.77
59 d. 4.43
61 c
63
• X = number of questions answered correctly
• X ~ $B\left(32, \frac{1}{3}\right)$
• We are interested in MORE THAN 75% of 32 questions correct. 75% of 32 is 24. We want to find P(x > 24). The
event "more than 24" is the complement of "less than or equal to 24."
• P(x > 24) = 0
• The probability of getting more than 75% of the 32 questions correct when randomly guessing is very small and
practically zero.
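The claim that this probability is practically zero can be checked numerically. The sketch below is one possible check using only the standard library. It computes the tail both through the complement described above and through a direct sum over x = 25, ..., 32. The variable names are illustrative.

```python
from math import comb

n, p = 32, 1 / 3  # X ~ B(32, 1/3): 32 questions, three choices each

def binomial_pmf(x):
    """Probability of exactly x correct guesses out of n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Complement rule: P(x > 24) = 1 - P(x <= 24).
p_at_most_24 = sum(binomial_pmf(x) for x in range(0, 25))
p_more_than_24 = 1 - p_at_most_24

# Direct sum over the upper tail; less affected by floating-point cancellation.
p_tail = sum(binomial_pmf(x) for x in range(25, n + 1))

print(p_more_than_24, p_tail)
```

Both routes give a value far below 0.0001, which is why the answer rounds to zero at four decimal places.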
65
a. X = the number of colleges and universities that have online offerings.
b. 0, 1, 2, …, 13
c. X ~ B(13, 0.96)
d. 12.48
e. 0.0135
f. P(x = 12) = 0.3186; P(x = 13) = 0.5882. It is more likely to get 13.
67
a. X = the number of fencers who do not use the foil as their main weapon
b. 0, 1, 2, 3,... 25
c. X ~ B(25,0.40)
d. 10
e. 0.0442
f. The probability that all 25 do not use the foil is almost zero. Therefore, it would be very surprising.
69
a. X = the number of audits in a 20-year period
b. 0, 1, 2, …, 20
c. X ~ B(20, 0.02)
d. 0.4
e. 0.6676
f. 0.0071
71
1. X = the number of matches
2. 0, 1, 2, 3
3. In dollars: −1, 1, 2, 3
4. $\frac{1}{2}$
5. The answer is −0.0787. You lose about eight cents, on average, per game.
6. The house has the advantage.
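The long-run averages in this solution can be reproduced with exact fractions. The sketch below assumes, as the exercise states, three fair dice, a 1/6 chance of a match on each die, and profits of −1, 1, 2, and 3 dollars for 0, 1, 2, and 3 matches. The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 6)                    # chance that one die matches the bet
matches = [0, 1, 2, 3]                # possible values of X
profit = {0: -1, 1: 1, 2: 2, 3: 3}    # Y in dollars for each value of X

# X is binomial with n = 3 trials and success probability 1/6.
prob = {x: comb(3, x) * p**x * (1 - p) ** (3 - x) for x in matches}

expected_matches = sum(x * prob[x] for x in matches)
expected_profit = sum(profit[x] * prob[x] for x in matches)

print(expected_matches, expected_profit, float(expected_profit))
```

Working in fractions shows that the expected profit is exactly −17/216 dollars per game, which rounds to the −0.0787 reported above.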
73
a. X ~ B(15, 0.281)
Figure 4.4
75
a. X = the number of adults in America who are surveyed until one says he or she will watch the Super Bowl.
b. X ~ G(0.40)
c. 2.5
d. 0.0187
e. 0.2304
77
a. X = the number of pages that advertise footwear
b. X takes on the values 0, 1, 2, ..., 20
c. X ~ $B\left(20, \frac{29}{192}\right)$
d. 3.02
e. No
f. 0.9997
g. X = the number of pages we must survey until we find one that advertises footwear. X ~ $G\left(\frac{29}{192}\right)$
h. 0.3881
i. 6.6207 pages
79 0, 1, 2, and 3
81
a. X ~ G(0.25)
b. i. Mean = $\mu = \frac{1}{p} = \frac{1}{0.25} = 4$
ii. Standard Deviation = $\sigma = \sqrt{\frac{1 - p}{p^2}} = \sqrt{\frac{1 - 0.25}{0.25^2}} \approx 3.4641$
82
a. X ~ P(5.5); μ = 5.5; $\sigma = \sqrt{5.5} \approx 2.3452$
b. P(x ≤ 6) ≈ 0.6860
c. There is a 15.7% probability that the law staff will receive more calls than they can handle.
d. P(x > 8) = 1 – P(x ≤ 8) ≈ 1 – 0.8944 = 0.1056
84 Let X = the number of defective bulbs in a string. Using the Poisson distribution:
• μ = np = 100(0.03) = 3
• X ~ P(3)
• P(x ≤ 4) ≈ 0.8153
Using the binomial distribution:
• X ~ B(100, 0.03)
• P(x ≤ 4) = 0.8179
The Poisson approximation is very good—the difference between the probabilities is only 0.0026.
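The comparison in this solution can be reproduced as follows. This is a sketch using only the standard library, with helper names chosen for illustration. It evaluates P(x ≤ 4) under both X ~ P(3) and X ~ B(100, 0.03).

```python
from math import comb, exp, factorial

def poisson_cdf(k, mu):
    """P(x <= k) for a Poisson distribution with mean mu."""
    return sum(mu**x * exp(-mu) / factorial(x) for x in range(k + 1))

def binomial_cdf(k, n, p):
    """P(x <= k) for a binomial distribution with n trials and success probability p."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

n, p = 100, 0.03
mu = n * p                      # mean used for the Poisson approximation

pois = poisson_cdf(4, mu)
binom = binomial_cdf(4, n, p)
diff = abs(binom - pois)

print(round(pois, 4), round(binom, 4), round(diff, 4))
```

Changing n and p while holding np fixed is a simple way to see how the gap between the two answers shrinks as n grows and p gets smaller.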
86
a. X = the number of children for a Spanish woman
b. 0, 1, 2, 3,...
c. 0.2299
d. 0.5679
e. 0.4321
88
a. X = the number of fortune cookies that have an extra fortune
b. 0, 1, 2, 3,... 144
c. 4.32
d. 0.0124 or 0.0133
e. 0.6300 or 0.6264
f. As n gets larger, the probabilities get closer together.
90
a. X = the number of people audited in one year
b. 0, 1, 2, ..., 100
c. 2
d. 0.1353
e. 0.3233
92
a. X = the number of shell pieces in one cake
b. 0, 1, 2, 3,...
c. 1.5
d. 0.2231
e. 0.0001
f. Yes
94 d
5 | CONTINUOUS RANDOM VARIABLES
Figure 5.1 The heights of these radish plants are continuous random variables. (Credit: Rev Stan)
Introduction
Continuous random variables have many applications. Baseball batting averages, IQ scores, the length of time a long
distance telephone call lasts, the amount of money a person carries, the length of time a computer chip lasts, rates of return
from an investment, and SAT scores are just a few. The field of reliability depends on a variety of continuous random
variables, as do all areas of risk analysis.
NOTE
The values of discrete and continuous random variables can be ambiguous. For example, if X is equal to the number
of miles (to the nearest mile) you drive to work, then X is a discrete random variable. You count the miles. If X is
the distance you drive to work, then you measure values of X and X is a continuous random variable. For a second
example, if X is equal to the number of books in a backpack, then X is a discrete random variable. If X is the weight of
a book, then X is a continuous random variable because weights are measured. How the random variable is defined is
very important.
242 Chapter 5 | Continuous Random Variables
Figure 5.2 The graph shows a Uniform Distribution with the area between x = 3 and x = 6 shaded to represent the
probability that the value of the random variable X is in the interval between three and six.
Figure 5.3 The graph shows an Exponential Distribution with the area between x = 2 and x = 4 shaded to represent
the probability that the value of the random variable X is in the interval between two and four.
Figure 5.4 The graph shows the Standard Normal Distribution with the area between x = 1 and x = 2 shaded to
represent the probability that the value of the random variable X is in the interval between one and two.
Example 5.1
Consider the function $f(x) = \frac{1}{20}$ for 0 ≤ x ≤ 20. x = a real number. The graph of $f(x) = \frac{1}{20}$ is a horizontal line.
However, since 0 ≤ x ≤ 20, f(x) is restricted to the portion between x = 0 and x = 20, inclusive.
Figure 5.5
The area between $f(x) = \frac{1}{20}$ where 0 ≤ x ≤ 20 and the x-axis is the area of a rectangle with base = 20 and height
= $\frac{1}{20}$.
AREA = $20\left(\frac{1}{20}\right) = 1$
Suppose we want to find the area between $f(x) = \frac{1}{20}$ and the x-axis where 0 < x < 2.
Figure 5.6
AREA = $(2 - 0)\left(\frac{1}{20}\right) = 0.1$
(2 – 0) = 2 = base of a rectangle
REMINDER
area of a rectangle = (base)(height).
The area corresponds to a probability. The probability that x is between zero and two is 0.1, which can be written
mathematically as P(0 < x < 2) = P(x < 2) = 0.1.
Suppose we want to find the area between $f(x) = \frac{1}{20}$ and the x-axis where 4 < x < 15.
Figure 5.7
AREA = $(15 - 4)\left(\frac{1}{20}\right) = 0.55$
(15 – 4) = 11 = the base of a rectangle
The area corresponds to the probability P(4 < x < 15) = 0.55.
Suppose we want to find P(x = 15). On an x-y graph, x = 15 is a vertical line. A vertical line has no width (or zero
width). Therefore, P(x = 15) = (base)(height) = $(0)\left(\frac{1}{20}\right) = 0$.
Figure 5.8
P(X ≤ x), which can also be written as P(X < x) for continuous distributions, is called the cumulative distribution
function or CDF. Notice the "less than or equal to" symbol. We can also use the CDF to calculate P(X > x).
The CDF gives "area to the left" and P(X > x) gives "area to the right." We calculate P(X > x) for continuous
distributions as follows: P(X > x) = 1 – P (X < x).
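For the rectangle-shaped density used in this example, the CDF is simply the area to the left of x. The sketch below is a hypothetical helper, not part of the text, that returns that area for f(x) = 1/20 on 0 ≤ x ≤ 20 and then uses it to get an interval probability and an area to the right.

```python
def uniform_cdf(x, a=0.0, b=20.0):
    """P(X <= x) for X uniform on [a, b]: the area to the left of x."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)   # base (x - a) times height 1 / (b - a)

p_left = uniform_cdf(2)                        # P(x < 2), area to the left
p_right = 1 - uniform_cdf(2)                   # P(x > 2), area to the right
p_between = uniform_cdf(15) - uniform_cdf(4)   # P(4 < x < 15)

print(p_left, p_right, p_between)
```

Subtracting two CDF values gives the area between them, which matches the (base)(height) calculations shown earlier in the example.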
Figure 5.9
Label the graph with f(x) and x. Scale the x and y axes with the maximum x and y values. $f(x) = \frac{1}{20}$, 0 ≤ x ≤ 20.
To calculate the probability that x is between two values, look at the following graph. Shade the region between x
= 2.3 and x = 12.7. Then calculate the shaded area of a rectangle.
Figure 5.10
P(2.3 < x < 12.7) = (base)(height) = $(12.7 - 2.3)\left(\frac{1}{20}\right) = 0.52$
5.1 Consider the function $f(x) = \frac{1}{8}$ for 0 ≤ x ≤ 8. Draw the graph of f(x) and find P(2.5 < x < 7.5).
$f(x) = \frac{1}{b - a}$ for a ≤ x ≤ b
$\mu = \frac{a + b}{2}$ and $\sigma = \sqrt{\frac{(b - a)^2}{12}}$
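As a quick numerical illustration of the mean and standard deviation formulas for a uniform distribution, here is a small Python sketch. The function names are assumptions made for this example, and the endpoints 0 and 20 reuse the rectangle from the earlier figures.

```python
import math


def uniform_mean(a, b):
    """Theoretical mean of U(a, b): the midpoint of the interval."""
    return (a + b) / 2


def uniform_sd(a, b):
    """Theoretical standard deviation of U(a, b): sqrt((b - a)^2 / 12)."""
    return math.sqrt((b - a) ** 2 / 12)


mu = uniform_mean(0, 20)
sigma = uniform_sd(0, 20)
print(mu, round(sigma, 4))  # prints: 10.0 5.7735
```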
5.1 The data that follow are the number of passengers on 35 different charter fishing boats. The sample mean = 7.9 and
the sample standard deviation = 4.33. The data follow a uniform distribution where all values between and including
zero and 14 are equally likely. State the values of a and b. Write the distribution in proper notation, and calculate the
theoretical mean and standard deviation.
1 12 4 10 4 14 11
7 11 4 13 2 4 6
3 10 0 12 6 9 10
5 13 4 10 14 12 11
6 10 11 0 11 13 2
Table 5.1
Example 5.2
The amount of time, in minutes, that a person must wait for a bus is uniformly distributed between zero and 15
minutes, inclusive.
a. What is the probability that a person waits fewer than 12.5 minutes?
Solution 5.2
a. Let X = the number of minutes a person must wait for a bus. a = 0 and b = 15. X ~ U(0, 15). Write the probability
density function. $f(x) = \frac{1}{15 - 0} = \frac{1}{15}$ for 0 ≤ x ≤ 15.
Find P (x < 12.5). Draw a graph.
$P(x < 12.5) = (\text{base})(\text{height}) = (12.5 - 0)\left(\frac{1}{15}\right) = 0.8333$
The probability a person waits less than 12.5 minutes is 0.8333.
Figure 5.11
b. On the average, how long must a person wait? Find the mean, μ, and the standard deviation, σ.
Solution 5.2
b. $\mu = \frac{a + b}{2} = \frac{15 + 0}{2} = 7.5$. On the average, a person must wait 7.5 minutes.
$\sigma = \sqrt{\frac{(b - a)^2}{12}} = \sqrt{\frac{(15 - 0)^2}{12}} = 4.3$. The standard deviation is 4.3 minutes.
c. Ninety percent of the time, the time a person must wait falls below what value?
NOTE
This asks for the 90th percentile.
Solution 5.2
c. Find the 90th percentile. Draw a graph. Let k = the 90th percentile.
$0.90 = (k)\left(\frac{1}{15}\right)$
k = (0.90)(15) = 13.5
The 90th percentile is 13.5 minutes. Ninety percent of the time, a person must wait at most 13.5 minutes.
Figure 5.12
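The percentile calculation in part c amounts to inverting the area formula: solve (k – a)/(b – a) = p for k. A minimal Python sketch follows; the function name `uniform_percentile` is a hypothetical helper introduced only for illustration.

```python
def uniform_percentile(p, a, b):
    """Return k such that P(X < k) = p for X ~ U(a, b)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    # Area to the left of k is (k - a) / (b - a); solve for k.
    return a + p * (b - a)


k = uniform_percentile(0.90, 0, 15)
print(round(k, 2))  # prints: 13.5
```

The same helper gives any other percentile of the waiting time by changing p.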
5.2 The total duration of baseball games in the major league in the 2011 season is uniformly distributed between 447
hours and 521 hours inclusive.
a. Find a and b and describe what they represent.
b. Write the distribution.
c. Find the mean and the standard deviation.
d. What is the probability that the duration of games for a team for the 2011 season is between 480 and 500 hours?
An alternative form of the exponential distribution formula recognizes what is often called the decay factor. The decay factor
simply measures how rapidly the probability of an event declines as the random variable X increases. When the notation
using the decay parameter m is used, the probability density function is presented as:
$f(x) = me^{-mx}$
where $m = \frac{1}{\mu}$
In order to calculate probabilities for specific probability density functions, the cumulative distribution function is
used. The cumulative distribution function (cdf) is simply the integral of the pdf from zero to x and is:
$F(x) = \int_0^x \frac{1}{\mu} e^{-\frac{t}{\mu}} \, dt = 1 - e^{-\frac{x}{\mu}}$
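One way to convince yourself that the cdf is the accumulated area under the pdf is to approximate that area numerically and compare it with the closed form $1 - e^{-x/\mu}$. The Python sketch below does this with a simple midpoint rule; the value μ = 4, the evaluation point x = 5, and the helper names are assumptions chosen only for illustration.

```python
import math


def exp_pdf(x, mu):
    """Exponential pdf written with the mean: (1/mu) * e^(-x/mu)."""
    return (1 / mu) * math.exp(-x / mu)


def exp_cdf(x, mu):
    """Closed-form cdf: 1 - e^(-x/mu)."""
    return 1 - math.exp(-x / mu)


def integrate_pdf(x, mu, steps=100_000):
    """Midpoint-rule approximation of the area under the pdf from 0 to x."""
    width = x / steps
    return sum(exp_pdf((i + 0.5) * width, mu) for i in range(steps)) * width


mu = 4
numeric = integrate_pdf(5, mu)
closed_form = exp_cdf(5, mu)
print(round(numeric, 4), round(closed_form, 4))
```

The two printed values should agree to the displayed precision, which is what the integral relationship predicts.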
Example 5.3
Let X = amount of time (in minutes) a postal clerk spends with a customer. The time is known from historical data
to have an average amount of time equal to four minutes.
It is given that μ = 4 minutes, that is, the average time the clerk spends with a customer is 4 minutes. Remember
that we are still doing probability and thus we have to be told the population parameters such as the mean. To
do any calculations, we need to know the mean of the distribution: the historical time to provide a service, for
example. Knowing the historical mean allows the calculation of the decay parameter, m.
$m = \frac{1}{\mu}$. Therefore, $m = \frac{1}{4} = 0.25$.
When the notation uses the decay parameter, m, the probability density function is presented as
$f(x) = me^{-mx}$, which is simply the original formula with m substituted for $\frac{1}{\mu}$, or $f(x) = \frac{1}{\mu} e^{-\frac{1}{\mu}x}$.
To calculate probabilities for an exponential probability density function, we need to use the cumulative distribution
function. The probability density function itself, whose curve is graphed below, is:
$f(x) = 0.25e^{-0.25x}$ where x is at least zero and m = 0.25.
For example, $f(5) = 0.25e^{(-0.25)(5)} = 0.072$. In other words, the function has a value of 0.072 when x = 5.
The graph is as follows:
Figure 5.13
mean.
5.3 The amount of time spouses shop for anniversary cards can be modeled by an exponential distribution with the
average amount of time equal to eight minutes. Write the distribution, state the probability density function, and graph
the distribution.
Example 5.4
a. Using the information in Example 5.3, find the probability that a clerk spends four to five minutes with a
randomly selected customer.
Solution 5.4
a. Find P(4 < x < 5).
The cumulative distribution function (CDF) gives the area to the left.
$P(X < x) = 1 - e^{-mx}$
$P(x < 5) = 1 - e^{(-0.25)(5)} = 0.7135$ and $P(x < 4) = 1 - e^{(-0.25)(4)} = 0.6321$
P(4 < x < 5) = 0.7135 – 0.6321 = 0.0814
Figure 5.14
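The subtraction of two "area to the left" values in Solution 5.4 can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic; `exp_cdf` is an illustrative helper name, not something defined in the text.

```python
import math


def exp_cdf(x, m):
    """P(X < x) = 1 - e^(-m x) for x > 0, and 0 otherwise."""
    return 1 - math.exp(-m * x) if x > 0 else 0.0


m = 1 / 4  # decay parameter for a mean of four minutes
p = exp_cdf(5, m) - exp_cdf(4, m)
print(round(p, 4))  # prints: 0.0814
```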
5.4 The number of days ahead travelers purchase their airline tickets can be modeled by an exponential distribution
with the average amount of time equal to 15 days. Find the probability that a traveler will purchase a ticket fewer than
ten days in advance. How many days do half of all travelers wait?
Example 5.5
On the average, a certain computer part lasts ten years. The length of time the computer part lasts is exponentially
distributed.
a. What is the probability that a computer part lasts more than 7 years?
Solution 5.5
a. Let x = the amount of time (in years) a computer part lasts.
μ = 10 so $m = \frac{1}{\mu} = \frac{1}{10} = 0.1$
Find P(x > 7). Draw the graph.
P(x > 7) = 1 – P(x < 7).
Since $P(X < x) = 1 - e^{-mx}$, then $P(X > x) = 1 - (1 - e^{-mx}) = e^{-mx}$.
$P(x > 7) = e^{(-0.1)(7)} = 0.4966$. The probability that a computer part lasts more than seven years is 0.4966.
Figure 5.15
b. On the average, how long would five computer parts last if they are used one after another?
Solution 5.5
b. On the average, one computer part lasts ten years. Therefore, five computer parts, if they are used one right
after the other would last, on the average, (5)(10) = 50 years.
d. What is the probability that a computer part lasts between nine and 11 years?
Solution 5.5
d. Find P(9 < x < 11). Draw the graph.
Figure 5.16
$P(9 < x < 11) = P(x < 11) - P(x < 9) = (1 - e^{(-0.1)(11)}) - (1 - e^{(-0.1)(9)}) = 0.6671 - 0.5934 = 0.0737$. The
probability that a computer part lasts between nine and 11 years is 0.0737.
5.5 On average, a pair of running shoes can last 18 months if used every day. The length of time running shoes last is
exponentially distributed. What is the probability that a pair of running shoes last more than 15 months? On average,
how long would six pairs of running shoes last if they are used one after the other? Eighty percent of running shoes
last at most how long if used every day?
Example 5.6
Suppose that the length of a phone call, in minutes, is an exponential random variable with decay parameter $\frac{1}{12}$.
The decay parameter is another way to view 1/λ. If another person arrives at a public telephone just before
you, find the probability that you will have to wait more than five minutes. Let X = the length of a phone call, in
minutes.
What are m, μ, and σ? The probability that you must wait more than five minutes is _______ .
Solution 5.6
• $m = \frac{1}{12}$
• μ = 12
• σ = 12
P(x > 5) = 0.6592
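The values in Solution 5.6 follow directly from the decay parameter: for an exponential random variable both the mean and the standard deviation equal 1/m, and the "area to the right" is $e^{-mx}$. A short Python sketch, with variable names chosen here for illustration:

```python
import math

m = 1 / 12                    # decay parameter given in the example
mu = 1 / m                    # mean of an exponential random variable
sigma = 1 / m                 # standard deviation equals the mean
p_wait = math.exp(-m * 5)     # P(X > 5) = e^(-m * 5)

print(round(mu, 4), round(sigma, 4), round(p_wait, 4))  # prints: 12.0 12.0 0.6592
```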
Example 5.7
The time spent waiting between events is often modeled using the exponential distribution. For example, suppose
that an average of 30 customers per hour arrive at a store and the time between arrivals is exponentially
distributed.
a. On average, how many minutes elapse between two successive arrivals?
b. When the store first opens, how long on average does it take for three customers to arrive?
c. After a customer arrives, find the probability that it takes less than one minute for the next customer to
arrive.
d. After a customer arrives, find the probability that it takes more than five minutes for the next customer to
arrive.
e. Is an exponential distribution reasonable for this situation?
Solution 5.7
a. Since we expect 30 customers to arrive per hour (60 minutes), we expect one customer to arrive every two
minutes on average.
b. Since one customer arrives every two minutes on average, it will take six minutes on average for three
customers to arrive.
Figure 5.17
Figure 5.18
e. This model assumes that a single customer arrives at a time, which may not be reasonable since people
might shop in groups, leading to several customers arriving at the same time. It also assumes that the flow
of customers does not change throughout the day, which is not valid if some times of the day are busier than
others.
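Parts c and d of Example 5.7 can be computed from the exponential cdf once the decay parameter is known. The sketch below assumes, as in part a, a mean of two minutes between arrivals, so m = 1/2 per minute; the variable names are illustrative and the code is offered as a way to check a hand calculation, not as the text's own solution.

```python
import math

mu = 60 / 30      # mean minutes between arrivals (30 customers per hour)
m = 1 / mu        # decay parameter, per minute

p_less_than_1 = 1 - math.exp(-m * 1)   # P(T < 1), area to the left of 1
p_more_than_5 = math.exp(-m * 5)       # P(T > 5), area to the right of 5

print(round(p_less_than_1, 4), round(p_more_than_5, 4))  # prints: 0.3935 0.0821
```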
Example 5.8
Refer back to the postal clerk again where the time a postal clerk spends with his or her customer has an
exponential distribution with a mean of four minutes. Suppose a customer has spent four minutes with a postal
clerk. What is the probability that he or she will spend at least an additional three minutes with the postal clerk?
Figure 5.19
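Example 5.8 is an application of the memoryless property, P(X > x + t | X > x) = P(X > t). The Python sketch below checks numerically that the conditional probability of exceeding seven minutes, given that four minutes have already passed, equals the unconditional probability of exceeding three minutes. The helper name `exp_sf` is an assumption introduced for this example.

```python
import math


def exp_sf(x, m):
    """P(X > x) = e^(-m x), the area to the right of x."""
    return math.exp(-m * x)


m = 1 / 4  # decay parameter for a mean of four minutes

conditional = exp_sf(7, m) / exp_sf(4, m)   # P(X > 7 | X > 4)
unconditional = exp_sf(3, m)                # P(X > 3)

print(round(conditional, 4), round(unconditional, 4))  # prints: 0.4724 0.4724
```

The two numbers match because the time already spent with the clerk has no effect on the remaining time.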
The formula for the exponential distribution: $f(x) = me^{-mx} = \frac{1}{\mu} e^{-\frac{1}{\mu}x}$, where m = the rate parameter and
μ = the average time between occurrences.
We see that the exponential is the cousin of the Poisson distribution and they are linked through this formula. There are
important differences that make each distribution relevant for different types of probability problems.
First, the Poisson has a discrete random variable, x, where time, a continuous variable, is artificially broken into discrete
pieces. We saw that the number of occurrences of an event in a given time interval, x, follows the Poisson distribution.
For example, the number of times the telephone rings per hour. By contrast, the time between occurrences follows the
exponential distribution. For example, the telephone just rang; how long will it be until it rings again? We are measuring
the length of time of the interval, a continuous random variable (exponential), not the number of events during an
interval (Poisson).
The Exponential Distribution v. the Poisson Distribution
A visual way to show both the similarities and differences between these two distributions is with a time line.
Figure 5.20
The random variable for the Poisson distribution is discrete and thus counts events during a given time period, t1 to t2 on
Figure 5.20, and calculates the probability of that number occurring. The number of events, four in the graph, is measured
in counting numbers; therefore, the random variable of the Poisson is a discrete random variable.
The exponential probability distribution calculates probabilities of the passage of time, a continuous random variable. In
Figure 5.20 this is shown as the bracket from t1 to the next occurrence of the event marked with a triangle.
A classic Poisson distribution question is "how many people will arrive at my checkout window in the next hour?"
Classic exponential distribution questions are "how long will it be until the next person arrives?" or a variant, "how long
will the person remain here once they have arrived?"
Again, the formula for the exponential distribution is:
$f(x) = me^{-mx}$ or $f(x) = \frac{1}{\mu} e^{-\frac{1}{\mu}x}$
We see immediately the similarity between the exponential formula and the Poisson formula.
$P(x) = \frac{\mu^x e^{-\mu}}{x!}$
Both probability density functions are based upon the relationship between time and exponential growth or decay. The “e”
in the formula is a constant with the approximate value of 2.71828 and is the base of the natural logarithmic exponential
growth formula. When people say that something has grown exponentially this is what they are talking about.
An example of the exponential and the Poisson will make clear the differences between the two. It will also show the
interesting applications they have.
Poisson Distribution
Suppose that historically 10 customers arrive at the checkout lines each hour. Remember that this is still probability so we
have to be told these historical values. We see this is a Poisson probability problem.
We can put this information into the Poisson probability density function and get a general formula that will calculate the
probability of any specific number of customers arriving in the next hour.
The formula is for any value of the random variable we choose, and so the x is put into the formula. This is the formula:
$f(x) = \frac{10^x e^{-10}}{x!}$
As an example, the probability of 15 people arriving at the checkout counter in the next hour would be
$P(x = 15) = \frac{10^{15} e^{-10}}{15!} = 0.0347$
Here we have inserted x = 15 and calculated that the probability that 15 people will arrive in the next hour is 0.0347.
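The Poisson formula is straightforward to evaluate with Python's standard library, which is a convenient way to check a hand calculation. In this sketch the function name `poisson_pmf` is an illustrative assumption; the mean of 10 arrivals per hour and x = 15 come from the example.

```python
import math


def poisson_pmf(x, mu):
    """P(X = x) = mu^x * e^(-mu) / x! for a Poisson mean of mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


p15 = poisson_pmf(15, 10)
print(round(p15, 4))
```

Changing the first argument gives the probability of any other specific number of arrivals in the hour.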
Exponential Distribution
If we keep the same historical facts that 10 customers arrive each hour, but we now are interested in the service time a
person spends at the counter, then we would use the exponential distribution. The exponential probability function for any
value of x, the random variable, for this particular checkout counter historical data is:
$f(x) = \frac{1}{0.1} e^{-\frac{x}{0.1}} = 10e^{-10x}$
To calculate µ, the historical average service time, we simply divide the number of people that arrive per hour, 10 , into the
time period, one hour, and have µ = 0.1. Historically, people spend 0.1 of an hour at the checkout counter, or 6 minutes.
This explains the .1 in the formula.
There is a natural confusion with µ in both the Poisson and exponential formulas. They have different meanings, although
they have the same symbol. The mean of the exponential is one divided by the mean of the Poisson. If you are given the
historical number of arrivals you have the mean of the Poisson. If you are given an historical length of time between events
you have the mean of an exponential.
Continuing with our example at the checkout clerk; if we wanted to know the probability that a person would spend 9
minutes or less checking out, then we use this formula. First, we convert to the same time units which are parts of one
hour. Nine minutes is 0.15 of one hour. Next we note that we are asking for a range of values. This is always the case for a
continuous random variable. We write the probability question as:
$P(x \le 0.15) = 1 - e^{-10x}$
We can now put the numbers into the formula and we have our result.
$P(x \le 0.15) = 1 - e^{-10(0.15)} = 0.7769$
The probability that a customer will spend 9 minutes or less checking out is 0.7769.
We see that we have a high probability of getting out in less than nine minutes and a tiny probability of having 15 customers
arriving in the next hour.
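The exponential half of the checkout example can be checked the same way. The sketch below converts nine minutes to hours so that the units match the rate of 10 per hour, then evaluates the cumulative distribution function $1 - e^{-mx}$; the variable names are assumptions made for this illustration.

```python
import math

arrivals_per_hour = 10
mu_hours = 1 / arrivals_per_hour   # mean of the exponential, in hours (0.1)
m = 1 / mu_hours                   # decay parameter, per hour

x_hours = 9 / 60                   # nine minutes expressed in hours (0.15)
p = 1 - math.exp(-m * x_hours)     # P(X <= 0.15)

print(round(p, 4))  # prints: 0.7769
```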
KEY TERMS
Conditional Probability the likelihood that an event will occur given that another event has already occurred.
decay parameter The decay parameter describes the rate at which probabilities decay to zero for increasing values of
x. It is the value m in the probability density function $f(x) = me^{-mx}$ of an exponential random variable. It is also
equal to $m = \frac{1}{\mu}$, where μ is the mean of the random variable.
Exponential Distribution a continuous random variable (RV) that appears when we are interested in the intervals of
time between some random events, for example, the length of time between emergency arrivals at a hospital. The
mean is $\mu = \frac{1}{m}$ and the standard deviation is $\sigma = \frac{1}{m}$. The probability density function is $f(x) = me^{-mx}$ or
$f(x) = \frac{1}{\mu} e^{-\frac{1}{\mu}x}$, x ≥ 0, and the cumulative distribution function is $P(X \le x) = 1 - e^{-mx}$ or
$P(X \le x) = 1 - e^{-\frac{1}{\mu}x}$.
memoryless property For an exponential random variable X, the memoryless property is the statement that
knowledge of what has occurred in the past has no effect on future probabilities. This means that the probability that
X exceeds x + t, given that it has exceeded x, is the same as the probability that X would exceed t if we had no
knowledge about it. In symbols we say that P(X > x + t|X > x) = P(X > t).
Poisson distribution If there is a known average of μ events occurring per unit time, and these events are independent
of each other, then the number of events X occurring in one unit of time has the Poisson distribution. The probability
of x events occurring in one unit time is equal to $P(X = x) = \frac{\mu^x e^{-\mu}}{x!}$.
Uniform Distribution a continuous random variable (RV) that has equally likely outcomes over the domain, a < x < b;
it is often referred to as the rectangular distribution because the graph of the pdf has the form of a rectangle. The
mean is $\mu = \frac{a + b}{2}$ and the standard deviation is $\sigma = \sqrt{\frac{(b - a)^2}{12}}$. The probability density function is
$f(x) = \frac{1}{b - a}$ for a < x < b or a ≤ x ≤ b. The cumulative distribution is $P(X \le x) = \frac{x - a}{b - a}$.
CHAPTER REVIEW
Figure 5.21
The cumulative distribution function (cdf) of X is defined by P (X ≤ x). It is a function of x that gives the probability that
the random variable is less than or equal to x.
Figure 5.22
The probability P(c < X < d) may be found by computing the area under f(x), between c and d. Since the corresponding area
is a rectangle, the area may be found simply by multiplying the width and the height.
FORMULA REVIEW
5.1 Properties of Continuous Probability Density Functions
Probability density function (pdf) f(x):
• f(x) ≥ 0
• The total area under the curve f(x) is one.
Cumulative distribution function (cdf): P(X ≤ x)
5.2 The Uniform Distribution
X = a real number between a and b (in some instances, X can take on the values a and b). a = smallest X; b = largest X
X ~ U (a, b)
The mean is $\mu = \frac{a + b}{2}$
The standard deviation is $\sigma = \sqrt{\frac{(b - a)^2}{12}}$
Probability density function: $f(x) = \frac{1}{b - a}$ for a ≤ X ≤ b
Area to the Left of x: $P(X < x) = (x - a)\left(\frac{1}{b - a}\right)$
Area to the Right of x: $P(X > x) = (b - x)\left(\frac{1}{b - a}\right)$
Area Between c and d: $P(c < x < d) = (\text{base})(\text{height}) = (d - c)\left(\frac{1}{b - a}\right)$
• pdf: $f(x) = \frac{1}{b - a}$ for a ≤ x ≤ b
• cdf: $P(X \le x) = \frac{x - a}{b - a}$
5.3 The Exponential Distribution
• pdf: $f(x) = me^{-mx}$ where x ≥ 0 and m > 0
• cdf: $P(X \le x) = 1 - e^{-mx}$
• mean $\mu = \frac{1}{m}$
PRACTICE
Figure 5.23
2. Which type of distribution does the graph illustrate?
Figure 5.24
Figure 5.25
4. What does the shaded area represent? P(___< x < ___)
Figure 5.26
5. What does the shaded area represent? P(___< x < ___)
Figure 5.27
6. For a continuous probability distribution, 0 ≤ x ≤ 15. What is P(x > 15)?
7. What is the area under f(x) if the function is a continuous probability density function?
8. For a continuous probability distribution, 0 ≤ x ≤ 10. What is P(x = 7)?
9. A continuous probability function is restricted to the portion between x = 0 and 7. What is P(x = 10)?
10. f(x) for a continuous probability function is $\frac{1}{5}$, and the function is restricted to 0 ≤ x ≤ 5. What is P(x < 0)?
11. f(x), a continuous probability function, is equal to $\frac{1}{12}$, and the function is restricted to 0 ≤ x ≤ 12. What is
P(0 < x < 12)?
Figure 5.28
13. Find the probability that x falls in the shaded area.
Figure 5.29
14. Find the probability that x falls in the shaded area.
Figure 5.30
15. f(x), a continuous probability function, is equal to $\frac{1}{3}$ and the function is restricted to 1 ≤ x ≤ 4. Describe
$P\left(x > \frac{3}{2}\right)$.
Table 5.2
The sample mean = 2.50 and the sample standard deviation = 0.8302.
The distribution can be written as X ~ U(1.5, 4.5).
16. What type of distribution is this?
17. In this distribution, outcomes are equally likely. What does this mean?
18. What is the height of f(x) for the continuous probability distribution?
19. What are the constraints for the values of x?
20. Graph P(2 < x < 3).
21. What is P(2 < x < 3)?
22. What is P(x < 3.5 | x < 4)?
23. What is P(x = 1.5)?
24. Find the probability that a randomly selected home has more than 3,000 square feet given that you already know the
house has more than 2,000 square feet.
Use the following information to answer the next eight exercises. A distribution is given as X ~ U(0, 12).
25. What is a? What does it represent?
26. What is b? What does it represent?
27. What is the probability density function?
28. What is the theoretical mean?
29. What is the theoretical standard deviation?
30. Draw the graph of the distribution for P(x > 9).
31. Find P(x > 9).
Use the following information to answer the next eleven exercises. The age of cars in the staff parking lot of a suburban
college is uniformly distributed from six months (0.5 years) to 9.5 years.
32. What is being measured here?
33. In words, define the random variable X.
34. Are the data discrete or continuous?
35. The interval of values for x is ______.
36. The distribution for X is ______.
37. Write the probability density function.
Figure 5.31
b. Identify the following values:
i. Lowest value for $\bar{x}$: _______
ii. Highest value for $\bar{x}$: _______
iii. Height of the rectangle: _______
iv. Label for x-axis (words): _______
v. Label for y-axis (words): _______
39. Find the average age of the cars in the lot.
40. Find the probability that a randomly chosen car in the lot was less than four years old.
a. Sketch the graph, and shade the area of interest.
Figure 5.32
b. Find the probability. P(x < 4) = _______
41. Considering only the cars less than 7.5 years old, find the probability that a randomly chosen car in the lot was less than
four years old.
a. Sketch the graph, shade the area of interest.
Figure 5.33
b. Find the probability. P(x < 4 | x < 7.5) = _______
42. What has changed in the previous two problems that made the solutions different?
43. Find the third quartile of ages of cars in the lot. This means you will have to find the value such that $\frac{3}{4}$, or 75%, of the
cars are at most (less than or equal to) that age.
a. Sketch the graph, and shade the area of interest.
Figure 5.34
b. Find the value k such that P(x < k) = 0.75.
c. The third quartile is _______
Use the following information to answer the next seven exercises. A distribution is given as X ~ Exp(0.75).
54. What is m?
55. What is the probability density function?
56. What is the cumulative distribution function?
57. Draw the distribution.
58. Find P(x < 4).
59. Find the 30th percentile.
60. Find the median.
61. Which is larger, the mean or the median?
Use the following information to answer the next 16 exercises. Carbon-14 is a radioactive element with a half-life of about
5,730 years. Carbon-14 is said to decay exponentially. The decay rate is 0.000121. We start with one gram of carbon-14.
We are interested in the time (years) it takes to decay carbon-14.
62. What is being measured here?
63. Are the data discrete or continuous?
64. In words, define the random variable X.
65. What is the decay rate (m)?
66. The distribution for X is ______.
67. Find the amount (percent of one gram) of carbon-14 lasting less than 5,730 years. This means, find P(x < 5,730).
a. Sketch the graph, and shade the area of interest.
Figure 5.35
b. Find the probability. P(x < 5,730) = __________
68. Find the percentage of carbon-14 lasting longer than 10,000 years.
a. Sketch the graph, and shade the area of interest.
Figure 5.36
b. Find the probability. P(x > 10,000) = ________
69. Thirty percent (30%) of carbon-14 will decay within how many years?
a. Sketch the graph, and shade the area of interest.
Figure 5.37
b. Find the value k such that P(x < k) = 0.30.
HOMEWORK
long–term parking center is supposed to arrive every eight minutes. The waiting times for the train are known to follow a
uniform distribution.
77. What is the average waiting time (in minutes)?
a. zero
b. two
c. three
d. four
78. The probability of waiting more than seven minutes given a person has waited more than four minutes is?
a. 0.125
b. 0.25
c. 0.5
d. 0.75
79. The time (in minutes) until the next bus departs a major bus depot follows a distribution with $f(x) = \frac{1}{20}$ where x goes
from 25 to 45 minutes.
a. Define the random variable. X = ________
b. Graph the probability distribution.
c. The distribution is ______________ (name of distribution). It is _____________ (discrete or continuous).
d. μ = ________
e. σ = ________
f. Find the probability that the time is at most 30 minutes. Sketch and label a graph of the distribution. Shade the
area of interest. Write the answer in a probability statement.
g. Find the probability that the time is between 30 and 40 minutes. Sketch and label a graph of the distribution.
Shade the area of interest. Write the answer in a probability statement.
h. P(25 < x < 55) = _________. State this in a probability statement, similarly to parts g and h, draw the picture, and
find the probability.
80. Suppose that the value of a stock varies each day from $16 to $25 with a uniform distribution.
a. Find the probability that the value of the stock is more than $19.
b. Find the probability that the value of the stock is between $19 and $22.
c. Given that the stock is greater than $18, find the probability that the stock is more than $21.
81. A fireworks show is designed so that the time between fireworks is between one and five seconds, and follows a uniform
distribution.
a. Find the average time between fireworks.
b. Find the probability that the time between fireworks is greater than four seconds.
82. The number of miles driven by a truck driver falls between 300 and 700, and follows a uniform distribution.
a. Find the probability that the truck driver goes more than 650 miles in a day.
b. Find the probability that the truck driver goes between 400 and 650 miles in a day.
84. Suppose that the useful life of a particular car battery, measured in months, decays with parameter 0.025. We are
interested in the life of the battery.
a. Define the random variable. X = _________________________________.
b. Is X continuous or discrete?
c. On average, how long would you expect one car battery to last?
d. On average, how long would you expect nine car batteries to last, if they are used one after another?
e. Find the probability that a car battery lasts more than 36 months.
f. Seventy percent of the batteries last at least how long?
85. The percent of persons (ages five and older) in each state who speak a language at home other than English is
approximately exponentially distributed with a mean of 9.848. Suppose we randomly pick a state.
a. Define the random variable. X = _________________________________.
b. Is X continuous or discrete?
c. μ = ________
d. σ = ________
e. Draw a graph of the probability distribution. Label the axes.
f. Find the probability that the percent is less than 12.
g. Find the probability that the percent is between eight and 14.
h. The percent of all individuals living in the United States who speak a language at home other than English is 13.8.
Use the following information to answer the next three exercises. The average lifetime of a certain new cell phone is three
years. The manufacturer will replace any cell phone failing within two years of the date of purchase. The lifetime of these
cell phones is known to follow an exponential distribution.
88. The decay rate is:
a. 0.3333
b. 0.5000
c. 2
d. 3
89. What is the probability that a phone will fail within two years of the date of purchase?
a. 0.8647
b. 0.4866
c. 0.2212
d. 0.9997
REFERENCES
SOLUTIONS
1 Uniform Distribution
3 Normal Distribution
5 P(6 < x < 7)
7 one
9 zero
11 one
13 0.625
15 The probability is equal to the area from $x = \frac{3}{2}$ to x = 4 above the x-axis and up to $f(x) = \frac{1}{3}$.
17 It means that the value of x is just as likely to be any number between 1.5 and 4.5.
19 1.5 ≤ x ≤ 4.5
21 0.3333
23 zero
24 0.6
26 b is 12, and it represents the highest value of x.
28 six
30
Figure 5.38
b. $\frac{3.5}{7}$
43
a. Check student's solution.
b. k = 7.25
c. 7.25
45 No, outcomes are not equally likely. In this distribution, more people require a little bit of time, and fewer people require
a lot of time, so it is more likely that someone will require less time.
47 five
49 $f(x) = 0.2e^{-0.2x}$
51 0.5350
53 6.02
55 $f(x) = 0.75e^{-0.75x}$
57
Figure 5.39
59 0.4756
61 The mean is larger. The mean is $\frac{1}{m} = \frac{1}{0.75} \approx 1.33$, which is greater than 0.9242.
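Percentiles of an exponential distribution, including the median, come from solving $1 - e^{-mk} = p$ for k, which gives $k = -\ln(1 - p)/m$. The Python sketch below applies this to X ~ Exp(0.75); the function name `exp_percentile` is a hypothetical helper used only for illustration.

```python
import math


def exp_percentile(p, m):
    """Return k such that P(X < k) = p for an exponential with decay parameter m."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1 - p) / m


m = 0.75
median = exp_percentile(0.50, m)
p30 = exp_percentile(0.30, m)
mean = 1 / m

print(round(p30, 4), round(median, 4), round(mean, 4))  # prints: 0.4756 0.9242 1.3333
```

Because the exponential distribution is skewed to the right, the mean exceeds the median for every value of m.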
63 continuous
65 m = 0.000121
67
a. Check student's solution
b. P(x < 5,730) = 0.5001
69
a. Check student's solution.
b. k = 2947.73
b. $f(x) = \frac{1}{8}$ where 1 ≤ x ≤ 9
c. five
d. 2.3
e. $\frac{15}{32}$
f. $\frac{333}{800}$
g. $\frac{2}{3}$
75
a. X represents the length of time a commuter must wait for a train to arrive on the Red Line.
b. Graph the probability distribution.
c. $f(x) = \frac{1}{8}$ where 0 ≤ x ≤ 8
d. four
e. 2.31
f. $\frac{1}{8}$
g. $\frac{1}{8}$
77 d
78 b
80
a. The probability density function of X is $\frac{1}{25 - 16} = \frac{1}{9}$.
$P(X > 19) = (25 - 19)\left(\frac{1}{9}\right) = \frac{6}{9} = \frac{2}{3}$.
Figure 5.40
b. $P(19 < X < 22) = (22 - 19)\left(\frac{1}{9}\right) = \frac{3}{9} = \frac{1}{3}$.
Figure 5.41
c. This is a conditional probability question. P(x > 21 | x > 18). You can do this two ways:
◦ Draw the graph where a is now 18 and b is still 25. The height is $\frac{1}{(25 - 18)} = \frac{1}{7}$.
So, $P(x > 21 \mid x > 18) = (25 - 21)\left(\frac{1}{7}\right) = \frac{4}{7}$.
◦ Use the formula: $P(x > 21 \mid x > 18) = \frac{P(x > 21 \cap x > 18)}{P(x > 18)} = \frac{P(x > 21)}{P(x > 18)} = \frac{(25 - 21)}{(25 - 18)} = \frac{4}{7}$.
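The second approach, dividing one "area to the right" by another, is easy to verify in code. This Python sketch assumes the stock value is uniform on [16, 25] as stated in the exercise; `uniform_sf` is an illustrative helper name.

```python
def uniform_sf(x, a, b):
    """P(X > x) for X ~ U(a, b): the area of the rectangle to the right of x."""
    x = min(max(x, a), b)   # values outside [a, b] are clamped to the endpoints
    return (b - x) / (b - a)


a, b = 16, 25
# P(x > 21 | x > 18) = P(x > 21) / P(x > 18), since x > 21 implies x > 18.
p_conditional = uniform_sf(21, a, b) / uniform_sf(18, a, b)

print(round(p_conditional, 4))  # prints: 0.5714
```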
82
a. $P(X > 650) = \frac{700 - 650}{700 - 300} = \frac{50}{400} = \frac{1}{8} = 0.125$.
84
a. X = the useful life of a particular car battery, measured in months.
b. X is continuous.
c. 40 months
d. 360 months
e. 0.4066
f. 14.27
86
a. X = the time (in years) after reaching age 60 that it takes an individual to retire
b. X is continuous.
c. five
d. five
e. Check student’s solution.
f. 0.1353
g. before
h. 18.3
88 a
90 c
92 Let X = the number of no-hitters throughout a season. Since the duration of time between no-hitters is exponential, the
number of no-hitters per season is Poisson with mean λ = 3.
Therefore, $P(X = 0) = \frac{3^0 e^{-3}}{0!} = e^{-3} \approx 0.0498$
NOTE
You could let T = duration of time between no-hitters. Since the time is exponential and there are 3 no-hitters per
season, then the time between no-hitters is $\frac{1}{3}$ season. For the exponential, $\mu = \frac{1}{3}$.
Therefore, $m = \frac{1}{\mu} = 3$ and T ∼ Exp(3).
a. The desired probability is $P(T > 1) = 1 - P(T < 1) = 1 - (1 - e^{-3}) = e^{-3} \approx 0.0498$.
b. Let T = duration of time between no-hitters. We find P(T > 2|T > 1), and by the memoryless property this is simply
P(T > 1), which we found to be 0.0498 in part a.
c. Let X = the number of no-hitters in a season. Assume that X is Poisson with mean λ = 3. Then P(X > 3) = 1 – P(X ≤ 3)
= 0.3528.
94
a. $\frac{100}{9} = 11.11$
b. P(X > 10) = 1 – P(X ≤ 10) = 1 – Poissoncdf(11.11, 10) ≈ 0.5532.
c. The number of people with Type B blood encountered roughly follows the Poisson distribution, so the number
of people X who arrive between successive Type B arrivals is roughly exponential with mean μ = 9 and $m = \frac{1}{9}$.
The cumulative distribution function of X is $P(X < x) = 1 - e^{-\frac{x}{9}}$. Thus, $P(X > 20) = 1 - P(X \le 20) =
1 - \left(1 - e^{-\frac{20}{9}}\right) \approx 0.1084$.
NOTE
We could also deduce that each person arriving has an 8/9 chance of not having Type B blood. So the probability
that none of the first 20 people who arrive have Type B blood is $\left(\frac{8}{9}\right)^{20} \approx 0.0948$. (The geometric distribution is
more appropriate than the exponential because the number of people between Type B people is discrete instead of
continuous.)
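The two answers in this solution, one from the continuous exponential model and one from the discrete count of people, can be compared side by side. The Python sketch below simply evaluates both expressions; the variable names are assumptions chosen for illustration.

```python
import math

# Exponential approximation: mean of 9 people between Type B arrivals.
m = 1 / 9
p_exponential = math.exp(-m * 20)        # P(X > 20) = e^(-20/9)

# Discrete calculation: each arrival has an 8/9 chance of not being Type B.
p_not_type_b = 8 / 9
p_discrete = p_not_type_b ** 20          # none of the first 20 are Type B

print(round(p_exponential, 4), round(p_discrete, 4))  # prints: 0.1084 0.0948
```

The gap between the two values shows the size of the error introduced by treating a count of people as a continuous quantity.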
96 Let T = duration (in minutes) between successive visits. Since patients arrive at a rate of one patient every seven
minutes, μ = 7 and the decay constant is $m = \frac{1}{7}$. The cdf is $P(T < t) = 1 - e^{-\frac{t}{7}}$.
a. $P(T < 2) = 1 - e^{-\frac{2}{7}} \approx 0.2485$.
b. $P(T > 15) = 1 - P(T < 15) = 1 - \left(1 - e^{-\frac{15}{7}}\right) = e^{-\frac{15}{7}} \approx 0.1173$.
c. $P(T > 15 \mid T > 10) = P(T > 5) = 1 - \left(1 - e^{-\frac{5}{7}}\right) = e^{-\frac{5}{7}} \approx 0.4895$.
d. Let X = # of patients arriving during a half-hour period. Then X has the Poisson distribution with a mean of $\frac{30}{7}$,
X ∼ Poisson$\left(\frac{30}{7}\right)$. Find P(X > 8) = 1 – P(X ≤ 8) ≈ 0.0311.
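The Poisson tail probability in part d requires summing nine terms of the Poisson formula, which is tedious by hand. A minimal Python sketch of that sum follows; `poisson_cdf` is an illustrative helper written for this example rather than a calculator or library function.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k): sum of mu^i * e^(-mu) / i! for i = 0, 1, ..., k."""
    return sum(mu ** i * math.exp(-mu) / math.factorial(i) for i in range(k + 1))


mu = 30 / 7   # expected patients in 30 minutes at one patient per 7 minutes
p_more_than_8 = 1 - poisson_cdf(8, mu)

print(round(p_more_than_8, 4))  # prints: 0.0311
```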