Week 02
PRACTICE QUESTIONS:
Anderson Chapter 2: 1, 2, 4, 6, 7, 8, 9, 11, 12, 13, 15, 16, 17, 20, 21, 24, 27, 28, 39, 41, 42, 43, 44,
45, 46, 48
Newbold Chapter 1: 1.32, 1.33, 1.34, 1.36, 1.52
Black (2010) Chapter 2: 2.1, 2.2, 2.4, 2.6, 2.8, 2.9, 2.17, 2.18, 2.19, 2.21, 2.26
A percent frequency distribution summarizes the percent frequency of the data for each class.
A common graphical presentation of quantitative data is a histogram.
Questions
Question 1: The response to a question has three alternatives: A, B, and C. A sample of 120 responses
provides 60 A, 24 B, and 36 C. Show the frequency and relative frequency distributions.
Solution:
Response   Frequency
A          60
B          24
C          36
Total      120

Response   Relative frequency
A          60/120 = 0.5
B          24/120 = 0.2
C          36/120 = 0.3
Total      1
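As a quick check of the tables above, the relative frequencies can be recomputed from the counts. This is a minimal sketch; the variable names are illustrative and not part of the original solution.

```python
# Counts taken from Question 1: 60 A, 24 B, 36 C.
frequency = {"A": 60, "B": 24, "C": 36}

total = sum(frequency.values())  # 120 responses in all

# Relative frequency = class frequency / total number of observations.
relative_frequency = {label: count / total for label, count in frequency.items()}

for label, count in frequency.items():
    print(f"{label:<6} {count:>4} {relative_frequency[label]:.2f}")
print(f"{'Total':<6} {total:>4} {sum(relative_frequency.values()):.2f}")
```

The relative frequencies must sum to 1, which is a useful sanity check on any hand-built table.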
Question 2: Leverock’s Waterfront Steakhouse in Madeira Beach, Florida, uses a questionnaire to ask
customers how they rate the server, food quality, cocktails, prices, and atmosphere at the restaurant.
Each characteristic is rated on a scale of outstanding (O), very good (V), good (G), average (A), and
poor (P). Use descriptive statistics to summarize the following data collected on food quality. What is
your feeling about the food quality ratings at the restaurant?
Solution:
50 1
The food quality ratings lean towards outstanding and very good, the two categories with the highest
frequencies. This suggests that most customers are satisfied with the food quality at the restaurant.
Question 3: A shortage of candidates has required school districts to pay higher salaries and offer extras
to attract and retain school district superintendents. The following data show the annual base salary
($1000s) for superintendents in 20 districts in the greater Rochester, New York, area (The Rochester
Democrat and Chronicle, February 10, 2008).
187 184 174 185
175 172 202 197
165 208 215 164
162 172 182 156
172 175 170 183
Use classes of 150–159, 160–169, and so on in the following.
a. Show the frequency distribution.
b. Show the percent frequency distribution.
c. Show the cumulative percent frequency distribution.
d. Develop a histogram for the annual base salary.
e. Do the data appear to be skewed? Explain.
f. What percentage of the superintendents make more than $200,000?
Solution:
a to c.
d.
e. The distribution is skewed to the right: the highest bars of the histogram are on the left, and the
tail extends toward the higher salaries.
f. The percentage of the superintendents that make more than $200,000 is the sum of the percent
frequencies of the categories "200-209" and "210-219": 10% + 5% = 15%
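The frequency, percent frequency, and cumulative percent frequency distributions for parts a to c can be rebuilt directly from the 20 salaries listed in the question. The following is a sketch under that assumption, using the classes the question specifies; the variable names are illustrative.

```python
# Annual base salaries ($1000s) from Question 3.
salaries = [187, 184, 174, 185, 175, 172, 202, 197, 165, 208,
            215, 164, 162, 172, 182, 156, 172, 175, 170, 183]

n = len(salaries)
rows = []
cumulative_percent = 0.0

# Classes 150-159, 160-169, ..., 210-219, as specified in the question.
for lower in range(150, 220, 10):
    upper = lower + 9
    freq = sum(lower <= s <= upper for s in salaries)
    percent = 100 * freq / n
    cumulative_percent += percent
    rows.append((f"{lower}-{upper}", freq, percent, cumulative_percent))

print("Class    Freq  Percent  Cum. percent")
for label, freq, percent, cum in rows:
    print(f"{label}  {freq:>4}  {percent:>7.1f}  {cum:>12.1f}")

# Part f: share of salaries above $200,000 (values are in $1000s).
percent_over_200 = 100 * sum(s > 200 for s in salaries) / n
print(f"More than $200,000: {percent_over_200:.0f}%")
```

The last two classes contribute 10% and 5%, which agrees with the 15% given in part f.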
Question 4: NRF/BIG research provided results of a consumer holiday spending survey (USA Today,
December 20, 2005). The following data provide the dollar amount of holiday spending for a sample of
25 consumers.
1200 850 740 590 340
450 890 260 610 350
1780 180 850 2050 770
800 1090 510 520 220
1450 280 1120 200 350
a. What is the lowest holiday spending? The highest?
b. Use a class width of $250 to prepare a frequency distribution and a percent frequency distribution for
the data.
c. Prepare a histogram and comment on the shape of the distribution.
d. What observations can you make about holiday spending?
Solution:
a. The lowest holiday spending is 180 dollars, while the highest holiday spending is 2050 dollars.
b.
c. [Histogram: Holiday Spending]
The histogram is skewed to the right, because the highest bars are to the left in the histogram.
d. Holiday spending varies between $180 and $2050, while most people spend between $180 and
$930.
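The frequency table and histogram for parts b and c are not reproduced here, so the sketch below rebuilds them from the 25 data values. It assumes the first class starts at the minimum observation ($180), which is consistent with the $180 to $930 range quoted in part d; a textbook solution might start the classes at $0 instead, which would change the counts.

```python
# Holiday spending (dollars) for the 25 consumers in Question 4.
spending = [1200, 850, 740, 590, 340,
            450, 890, 260, 610, 350,
            1780, 180, 850, 2050, 770,
            800, 1090, 510, 520, 220,
            1450, 280, 1120, 200, 350]

CLASS_WIDTH = 250
START = min(spending)  # assumption: first class begins at the lowest value

# Build classes START..START+249, START+250..START+499, and so on.
counts = {}
lower = START
while lower <= max(spending):
    upper = lower + CLASS_WIDTH - 1
    counts[(lower, upper)] = sum(lower <= x <= upper for x in spending)
    lower += CLASS_WIDTH

# Print the table with a simple text histogram (one * per observation).
for (lo, hi), freq in counts.items():
    percent = 100 * freq / len(spending)
    print(f"{lo:>4}-{hi:<4} {freq:>2} {percent:5.1f}%  {'*' * freq}")
```

Under this assumption the first three classes hold 19 of the 25 observations, and the remaining classes form the long right tail.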
Questions
Question 1:
Solution:
Household size is a quantitative variable, which is also a discrete variable because its possible values are
1, 2, 3, and so on. Therefore, the number of people in your household is discrete, quantitative data.
Question 2:
Solution:
Height is a quantitative variable, which is also a continuous variable because height can conceptually be
any positive number.
Question 3: A store sells from 0 to 12 computers per day. Is the amount of daily computer sales a discrete
or continuous random variable?
Solution: A discrete variable is one whose values are obtained by counting. Daily computer sales can only
be 0, 1, 2, ..., 12, a finite and countable set of values, so the number of daily computer sales is a
discrete random variable.
Question 4: A factory production process produces a small number of defective parts in its daily
production. Is the number of defective parts a discrete or continuous random variable?
Solution: The number of defective parts is obtained by counting, so it can only take the values 0, 1, 2,
and so on. It is therefore a discrete random variable.
Question 5: For each of the following, indicate if a discrete or a continuous random variable provides the
best definition:
a. The number of cars that arrive each day for repair in a two-person repair shop.
b. The number of cars produced annually by General Motors
c. Total daily e-commerce sales in dollars
d. The number of passengers that are bumped from a specific airline flight 3 days before Christmas.
Solution:
a. Since the daily number of cars that arrive at the repair shop is countable, we consider it a discrete
variable.
b. Since the number of cars produced by General Motors each year is countable, we consider it a
discrete variable.
c. We consider total daily e-commerce sales in dollars to be a continuous variable: the total is a
measured amount that can take fractional (decimal) values, and over its wide range it is treated
as lying on a continuum rather than being counted.
d. Since the number of people who involuntarily gave up their flight three days before Christmas is
something countable, we consider it a discrete random variable.
Grouped and ungrouped data
Raw data, or data that have not been summarized in any way, are sometimes referred to as ungrouped
data.
Data that have been organized into a frequency distribution are called grouped data.
Questions
Question 1: The Higher Education Research Institute at UCLA provides statistics on the most popular
majors among incoming college freshmen. The five most popular majors are Arts and Humanities (A),
Business Administration (B), Engineering (E), Professional (P), and Social Science (S) (The New York
Times Almanac, 2006). A broad range of other (O) majors, including biological science, physical science,
computer science, and education, are grouped together. The majors selected for a sample of 64 college
freshmen follow.
b.
c. 5 + 6 + 13 + 11 + 7 = 42 out of 64 freshmen chose one of the five most popular majors (all
categories except for O).
42 / 64 = 0.65625 or 65.625%
d. The most popular major is Business Administration, since "O" represents a broad range of other
majors and "B" has the highest frequency excluding "O".
13 / 64 = 0.203125 or 20.3125% of freshmen selected the Business Administration major.
Question 2: The following data represent the afternoon high temperatures for 50 construction days during
a year in St. Louis.
42 70 64 47 66 69 73 38 48 25
55 85 10 24 45 31 62 47 63 84
16 40 81 15 35 17 40 36 44 17
38 79 35 36 23 64 75 53 31 60
31 38 52 16 81 12 61 43 30 33
a. Construct a frequency distribution for the data using five class intervals.
b. Construct a frequency distribution for the data using 10 class intervals.
c. Examine the results of (a) and (b) and comment on the usefulness of the frequency distribution in terms
of temperature summarization capability.
Solution:
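a. Range = 85 – 10 = 75
Class width = Range / number of classes = 75/5 = 15
Class interval   Frequency
10 – under 25    9
25 – under 40    13
40 – under 55    11
55 – under 70    9
70 – 85          8
Total            50
b. Class width = Range / number of classes = 75/10 = 7.5, rounded up to 8
Class interval   Frequency
10 – under 18    7
18 – under 26    3
26 – under 34    5
34 – under 42    9
42 – under 50    7
50 – under 58    3
58 – under 66    6
66 – under 74    4
74 – under 82    4
82 – under 90    2
Total            50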
c. Both tables are useful summaries of the afternoon high temperatures for the 50 construction days.
The five-class distribution gives a compact overview, while the ten-class distribution shows in more
detail where the temperatures concentrate (for example, the peak at 34 – under 42). A summary of
this kind can help construction planners judge which temperatures to expect on working days.
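The two frequency distributions can be checked against the raw data. The sketch below assumes classes of the form "lower to under upper", with the last class also including its upper limit so that the maximum (85) is counted; the function name `frequency_distribution` is illustrative.

```python
import math

# Afternoon high temperatures for the 50 construction days in Question 2.
temps = [42, 70, 64, 47, 66, 69, 73, 38, 48, 25,
         55, 85, 10, 24, 45, 31, 62, 47, 63, 84,
         16, 40, 81, 15, 35, 17, 40, 36, 44, 17,
         38, 79, 35, 36, 23, 64, 75, 53, 31, 60,
         31, 38, 52, 16, 81, 12, 61, 43, 30, 33]

def frequency_distribution(data, num_classes):
    """Return (class width, [(lower, upper, frequency), ...]).

    Classes are [lower, upper); the last class also includes its upper limit.
    """
    low, high = min(data), max(data)
    width = math.ceil((high - low) / num_classes)  # round the width up
    table = []
    for i in range(num_classes):
        lower = low + i * width
        upper = lower + width
        is_last = i == num_classes - 1
        freq = sum(lower <= x < upper or (is_last and x == upper) for x in data)
        table.append((lower, upper, freq))
    return width, table

for k in (5, 10):
    width, table = frequency_distribution(temps, k)
    print(f"{k} classes, width {width}")
    for lower, upper, freq in table:
        print(f"  {lower:>2} - {upper:<2}  {freq}")
```

Rounding the class width up (7.5 becomes 8) guarantees that the classes cover the whole range of the data.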
Question 3: Data for a sample of 55 members of the Baseball Hall of Fame in Cooperstown, New York,
are shown here. Each observation indicates the primary position played by the Hall of Famers: pitcher
(P), catcher (H), 1st base (1), 2nd base (2), 3rd base (3), shortstop (S), left field (L), center field (C), and
right field (R).
b. Position P has the largest frequency, so pitcher is the position with the most Hall of Famers.
c. Position 3 has the smallest frequency, so 3rd base is the position with the fewest Hall of Famers.
d. Looking only at the frequencies for L, C, and R, position R has the largest frequency so right field
is the outfield position with the most Hall of Famers.
e. Infielders are positions 1, 2, 3, and S, which have a sum of:
5 + 4 + 2 + 5 = 16
Outfielders are positions L, C, and R, which have a sum of:
6 + 5 + 7 = 18
Therefore, outfielders provide more Hall of Famers than infielders.
Question 4: A packaging process is supposed to fill small boxes of raisins with approximately 50 raisins
so that each box will weigh the same. However, the number of raisins in each box will vary. Suppose 100
boxes of raisins are randomly sampled, the raisins counted, and the following data are obtained.
Construct a frequency distribution for these data. What does the frequency distribution reveal about the
box fills?
Solution:
Range covered by the classes = 61 – 39 = 22
No. of classes = 11
Class width = 22/11 = 2
Class interval    Frequency
39 – under 41     2
41 – under 43     1
43 – under 45     5
45 – under 47     10
47 – under 49     18
49 – under 51     13
51 – under 53     15
53 – under 55     15
55 – under 57     7
57 – under 59     9
59 – under 61     5
Total             100
The distribution reveals that only 13 of the 100 boxes of raisins contain approximately 50 raisins
(49 – under 51). However, 71 of the 100 boxes contain between 45 and 55 raisins. There are a few
boxes (5) that have 9 or more extra raisins (59 – under 61) and two boxes that have 10 or 11 fewer
raisins (39 – under 41) than the boxes are supposed to contain.
Question 5: The owner of a fast-food restaurant ascertains the ages of a sample of customers. From these
data, the owner constructs the frequency distribution shown. For each class interval of the frequency
distribution, determine the class midpoint, the relative frequency, and the cumulative frequency.
What does the relative frequency tell the fast-food restaurant owner about customer ages?
Solution:
Class interval   Frequency   Relative frequency
0 – under 5      6           0.069767
5 – under 10     8           0.093023
10 – under 15    17          0.197674
15 – under 20    23          0.267442
20 – under 25    18          0.209302
25 – under 30    10          0.116279
30 – under 35    4           0.046512
The relative frequencies tell us that a customer is most likely to be in the 15 – under 20 age class,
which has a relative frequency of 0.267, and that over two thirds of the customers (0.674) are between
10 and 25 years of age.
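The question also asks for class midpoints and cumulative frequencies. A minimal sketch that derives them from the class limits and frequencies in the solution table follows; the variable names are illustrative.

```python
from itertools import accumulate

# Class limits ("lower – under upper") and frequencies from the solution table.
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25), (25, 30), (30, 35)]
frequencies = [6, 8, 17, 23, 18, 10, 4]

total = sum(frequencies)

# Midpoint = average of the two class limits.
midpoints = [(lower + upper) / 2 for lower, upper in classes]
# Relative frequency = class frequency / total.
relative = [f / total for f in frequencies]
# Cumulative frequency = running total of the frequencies.
cumulative = list(accumulate(frequencies))

print("Class        Midpoint  Freq  Rel. freq  Cum. freq")
for (lower, upper), mid, f, r, c in zip(classes, midpoints, frequencies, relative, cumulative):
    print(f"{lower:>2} - under {upper:<2}  {mid:>6.1f}  {f:>4}  {r:>9.6f}  {c:>9}")

# Share of customers aged 10 to under 25 (classes 3 to 5).
share_10_to_25 = sum(relative[2:5])
print(f"Aged 10 to under 25: {share_10_to_25:.3f}")
```

The final cumulative frequency equals the sample size (86), which confirms that no class was dropped.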
Questions
Question 1: Consider the following frequency distribution.
total 50 1 100 50 1
Question 2: Construct a histogram and an ogive for the data in above exercise.
Solutions:
[Histogram: frequency by class]
[Ogive: cumulative frequency by class]
Question 3: A doctor’s office staff studied the waiting times for patients who arrive at the office with a
request for emergency service. The following data with waiting times in minutes were collected over a
one-month period.
2 5 10 12 4 4 5 17 11 8 9 8 12 21 6 8 7 13 18 3
Use classes of 0–4, 5–9, and so on in the following:
a. Show the frequency distribution.
b. Show the relative frequency distribution.
c. Show the cumulative frequency distribution.
d. Show the cumulative relative frequency distribution.
e. What proportion of patients needing emergency service wait 9 minutes or less?
Solution:
a to d:
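The four distributions in parts a to d, and the proportion asked for in part e, can be computed from the 20 waiting times given in the question. This is a sketch using the classes 0–4, 5–9, and so on; the variable names are illustrative.

```python
# Waiting times (minutes) from Question 3.
waits = [2, 5, 10, 12, 4, 4, 5, 17, 11, 8, 9, 8, 12, 21, 6, 8, 7, 13, 18, 3]

n = len(waits)
table = []
cum_freq = 0

# Classes 0-4, 5-9, 10-14, 15-19, 20-24.
for lower in range(0, 25, 5):
    upper = lower + 4
    freq = sum(lower <= w <= upper for w in waits)
    cum_freq += freq
    table.append((f"{lower}-{upper}", freq, freq / n, cum_freq, cum_freq / n))

print("Class  Freq  Rel.  Cum.  Cum. rel.")
for label, freq, rel, cum, cum_rel in table:
    print(f"{label:<6} {freq:>4}  {rel:.2f}  {cum:>4}  {cum_rel:.2f}")

# Part e: cumulative relative frequency through the class 5-9.
proportion_9_or_less = table[1][4]
print(f"Wait 9 minutes or less: {proportion_9_or_less:.2f}")
```

Part e is read straight off the cumulative relative frequency column at the class 5–9.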
Question 4: Sorting through unsolicited e-mail and spam affects the productivity of office workers. An
InsightExpress survey monitored office workers to determine the unproductive time per day devoted to
unsolicited e-mail and spam (USA Today, November 13, 2003). The following data show a sample of
time in minutes devoted to this task.
2 4 8 4
8 1 2 32
12 1 5 7
5 5 3 4
24 19 4 14
Summarize the data by constructing the following:
a. A frequency distribution (classes 1–5, 6–10, 11–15, 16–20, and so on)
b. A relative frequency distribution
c. A cumulative frequency distribution
d. A cumulative relative frequency distribution
e. An ogive
f. What percentage of office workers spend 5 minutes or less on unsolicited e-mail and spam? What
percentage of office workers spend more than 10 minutes a day on this task?
Solution:
a to d:
e. [Ogive: cumulative frequency versus time in minutes]
f. The percentage of office workers who spend 5 minutes or less is the cumulative relative frequency of
the class 1–5:
0.6 = 60%
The percentage of office workers who spend more than 10 minutes is the sum of the relative frequencies
of the classes above 10 minutes:
0.1 + 0.05 + 0.05 + 0 + 0.05 = 0.25 = 25%
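The points of the ogive in part e are the cumulative frequencies at the upper class limits, and the two percentages in part f follow from them. A minimal sketch based on the 20 observations in the question; the names `ogive_points`, `pct_5_or_less`, and `pct_more_than_10` are illustrative.

```python
# Minutes per day spent on unsolicited e-mail and spam (Question 4).
minutes = [2, 4, 8, 4, 8, 1, 2, 32, 12, 1,
           5, 7, 5, 5, 3, 4, 24, 19, 4, 14]

n = len(minutes)

# Upper limits of the classes 1-5, 6-10, ..., 31-35.
upper_limits = range(5, 40, 5)

# Each ogive point is (upper class limit, number of observations <= that limit).
ogive_points = [(u, sum(m <= u for m in minutes)) for u in upper_limits]
for upper, cum in ogive_points:
    print(f"<= {upper:>2} minutes: {cum:>2}")

# Part f, read from the cumulative frequencies at 5 and 10 minutes.
pct_5_or_less = 100 * ogive_points[0][1] / n
pct_more_than_10 = 100 * (n - ogive_points[1][1]) / n
print(f"5 minutes or less: {pct_5_or_less:.0f}%")
print(f"More than 10 minutes: {pct_more_than_10:.0f}%")
```

Computing "more than 10 minutes" as the complement of the cumulative frequency at 10 gives the same 25% as summing the relative frequencies of the upper classes.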
Question 5: The following data are the average weekly mortgage interest rates for a 40-week period.
Questions
Question 1:
Solution:
Step 1: First we arrange the leading digits of each data value to the left of a vertical line. To the right of
the vertical line, we record the last digit for each data value.
Step 2: Continue adding the data values accordingly, for instance 112 shows the leading digits 11 to the
left of the line and the last digit 2 to the right of the line. Similarly, the data value 72 shows the leading
digit 7 to the left of the line and last digit 2 to the right of the line.
Step 3: Next, sort the digits on each line into rank order.
Key: 6 | 8 means 68
5 | 8 7
6 | 4 8 5
7 | 0 2 5 6 5 8 2
8 | 3 0 2 5
Question 3: A psychologist developed a new test of adult intelligence. The test was administered to
20 individuals, and the following data were obtained.
114 99 131 124 117 102 106 127 119 115
98 104 144 151 132 106 125 122 118 118
Construct a stem-and-leaf display for the data.
Solution:
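One way to build the display is to split each score into a stem (the leading digits) and a leaf (the last digit), then sort the leaves on each stem. The sketch below applies this to the 20 test scores; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Test scores from Question 3.
scores = [114, 99, 131, 124, 117, 102, 106, 127, 119, 115,
          98, 104, 144, 151, 132, 106, 125, 122, 118, 118]

def stem_and_leaf(data):
    """Map each stem (value // 10) to its sorted list of leaves (value % 10)."""
    leaves = defaultdict(list)
    for value in sorted(data):          # sorting first keeps the leaves in rank order
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    # Include every stem between the smallest and largest so gaps stay visible.
    return {stem: leaves.get(stem, []) for stem in range(min(leaves), max(leaves) + 1)}

display = stem_and_leaf(scores)
print("Key: 11 | 4 means 114")
for stem, leaf_list in display.items():
    print(f"{stem:>2} | {' '.join(map(str, leaf_list))}")
```

Each row of the display is one stem, so the length of a row shows how many scores fall in that group of ten.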
Question 4: Most major ski resorts offer family programs that provide ski and snowboarding instruction
for children. The typical classes provide four to six hours on the snow with a certified instructor. The
daily rate for a group lesson at 15 ski resorts follows (The Wall Street Journal, January 20, 2006).
b. The daily rates range from $75 to $145, with most falling between $75 and $115. There also appear
to be a few unusually high daily rates: $137, $145, and $145.
Question 5: The 2004 Naples, Florida, minimarathon (13.1 miles) had 1228 registrants (Naples Daily
News, January 17, 2004). Competition was held in six age groups. The following data show the ages for a
sample of 40 individuals who participated in the marathon.
49 33 40 37 56 44 46 57 55 32
50 52 43 64 40 46 24 30 37 43
31 43 50 36 61 27 44 35 31 43
52 43 66 31 50 72 26 59 21 47
a. Show a stretched stem-and-leaf display.
b. What age group had the largest number of runners?
c. What age occurred most frequently?
d. A Naples Daily News feature article emphasized the number of runners who were “20-something.”
What percentage of the runners were in the 20-something age group?
Solution:
a.
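A stretched stem-and-leaf display uses two rows per stem: one for leaves 0–4 and one for leaves 5–9. The sketch below builds it from the 40 ages in the question and also computes the quantities asked for in parts b to d; the function and variable names are illustrative.

```python
from collections import Counter

# Ages of the 40 sampled runners from Question 5.
ages = [49, 33, 40, 37, 56, 44, 46, 57, 55, 32,
        50, 52, 43, 64, 40, 46, 24, 30, 37, 43,
        31, 43, 50, 36, 61, 27, 44, 35, 31, 43,
        52, 43, 66, 31, 50, 72, 26, 59, 21, 47]

def stretched_stem_and_leaf(data):
    """Map (stem, half) to sorted leaves; half is 0 for leaves 0-4 and 1 for 5-9."""
    rows = {}
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf <= 4 else 1
        rows.setdefault((stem, half), []).append(leaf)
    return rows

rows = stretched_stem_and_leaf(ages)
for (stem, half), leaves in rows.items():
    print(f"{stem} | {' '.join(map(str, leaves))}")

# Part b: the five-year group (one row of the display) with the most runners.
(stem, half), leaves = max(rows.items(), key=lambda item: len(item[1]))
largest_group = (10 * stem + 5 * half, 10 * stem + 5 * half + 4)

# Part c: the age that occurs most often.
most_common_age, occurrences = Counter(ages).most_common(1)[0]

# Part d: percentage of runners aged 20 to 29.
percent_twenties = 100 * sum(20 <= a <= 29 for a in ages) / len(ages)

print(f"Largest group: {largest_group[0]}-{largest_group[1]}")
print(f"Most frequent age: {most_common_age} ({occurrences} runners)")
print(f"20-something: {percent_twenties:.0f}%")
```

Splitting each stem in two gives five-year groups, which makes the concentration of runners in their early forties easier to see than a one-row-per-decade display would.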