Problem 1: Organizing Categorical Variables: Solution
2.1 A categorical variable has three categories, with the following frequencies of
occurrence:
Category Frequency
A 13
B 28
C 9
a. Compute the percentage of values in each category.
b. What conclusions can you reach concerning the categories?
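The percentages asked for in part a can be checked with a short calculation. The sketch below is only an illustration using the three frequencies given in the problem; the function name percentages is a hypothetical helper, not part of the original solution.

```python
# Frequencies from Problem 2.1 (categories A, B, C).
frequencies = {"A": 13, "B": 28, "C": 9}


def percentages(freqs):
    """Return each category's share of the total as a percentage."""
    total = sum(freqs.values())
    return {category: 100 * count / total for category, count in freqs.items()}


shares = percentages(frequencies)
for category, pct in shares.items():
    print(f"{category}: {pct:.0f}%")
```

With a total of 50 values, category B accounts for more than half of them, which is the kind of observation part b asks for.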
2.2 The following data represent the responses to two questions asked in a
survey of 40 college students majoring in business: What is your gender? (M=Male;
F=Female) and What is your major? (A=Accounting; C=Computer Information
Systems; M=Marketing)
Gender M
Major A
Gender F
Major A
Gender M
Major C
Gender F
Major C
A. Tally the data into a contingency table where the two rows represent the
gender categories and the three columns represent the academic major
categories.
Solution:
Gender Accounting
Male 14
Female 6
TOTAL 20
Gender Accounting
Male 70%
Female 30%
Total 100%
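The 70% / 30% split for Accounting is a percentage based on the column (major) total. As a rough sketch, the same calculation can be applied to every major using the counts from the contingency table in this solution (Male 14, 9, 2; Female 6, 6, 3). The names counts and column_percentages are illustrative assumptions.

```python
# Contingency table counts from the solution: rows are gender, columns are major.
counts = {
    "Male": {"Accounting": 14, "Computer Information Systems": 9, "Marketing": 2},
    "Female": {"Accounting": 6, "Computer Information Systems": 6, "Marketing": 3},
}


def column_percentages(table):
    """Express each cell as a percentage of its column (major) total."""
    majors = list(next(iter(table.values())).keys())
    column_totals = {m: sum(row[m] for row in table.values()) for m in majors}
    return {
        gender: {m: 100 * row[m] / column_totals[m] for m in majors}
        for gender, row in table.items()
    }


by_major = column_percentages(counts)
for gender, row in by_major.items():
    print(gender, {major: f"{pct:.0f}%" for major, pct in row.items()})
```

Dividing by row totals or by the grand total of 40 instead gives the other two percentage tables commonly built from a contingency table.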
2.4 The Edmunds.com NHTSA Complaints Activity Report contains consumer vehicle complaint
submissions by automaker, brand, and category (data extracted from edmun.in/Ybmpuz). The following
tables, stored in Automaker1 and Automaker2, represent complaints received by automaker and
complaints received by category for January 2013.
Automaker Number
American Honda 169
Chrysler LLC 439
Ford Motor Company 440
General Motors 551
Nissan Motors Corporation 467
Toyota Motor Sales 332
Other 516
Category Number
Airbags and Seatbelts 201
Body and Glass 182
Brakes 163
Fuel/emission/exhaust system 240
Interior electronics/hardware 279
Powertrain 1,148
Steering 397
Tires and Wheels 71
Solution:
Automaker Number Percentage
American Honda 169 5.80%
Chrysler LLC 439 15.07%
Ford Motor Company 440 15.10%
General Motors 551 18.91%
Nissan Motors Corporation 467 16.03%
Toyota Motor Sales 332 11.39%
Other 516 17.71%
TOTAL 2,914 100.00%
B. What conclusions can you reach about the complaints for the different
automakers?
- From the table in part A, General Motors has the highest percentage of
complaints (18.91%), while American Honda has the lowest percentage of
complaints (5.80%).
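The percentage column and the highest/lowest comparison can be reproduced from the complaint counts. This is a minimal sketch using the January 2013 figures listed in the problem; the variable names are illustrative.

```python
# Complaints received by automaker, January 2013 (from the problem statement).
complaints = {
    "American Honda": 169,
    "Chrysler LLC": 439,
    "Ford Motor Company": 440,
    "General Motors": 551,
    "Nissan Motors Corporation": 467,
    "Toyota Motor Sales": 332,
    "Other": 516,
}

total = sum(complaints.values())

# Percentage of all complaints attributed to each automaker, to two decimals.
percentage = {name: round(100 * n / total, 2) for name, n in complaints.items()}

highest = max(complaints, key=complaints.get)
lowest = min(complaints, key=complaints.get)

for name, pct in percentage.items():
    print(f"{name}: {pct:.2f}%")
print("Highest:", highest, "| Lowest:", lowest)
```

Note that "Other" pools several automakers, so its 17.71% share is not directly comparable with the single-company rows.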
Solution:
Category Number Percentage
Airbags and Seatbelts 201 7.50%
Body and glass 182 6.79%
Brakes 163 6.08%
Fuel/emission/exhaust system 240 8.95%
Interior electronics/hardware 279 10.41%
Powertrain 1,148 42.82%
Steering 397 14.81%
Tires and Wheels 71 2.65%
TOTAL 2,681 100.00%
D. What conclusions can you reach about the complaints for different categories?
- From the table in part C, Powertrain has the highest percentage of complaints,
with 42.82%, followed by Steering with 14.81%. The lowest percentage of complaints
is in the Tires and Wheels category, with 2.65%. Thus, automakers should focus on addressing powertrain
problems, as that category has the highest number of complaints according to the Complaints Activity Report from
Edmunds.com.
M M F M F
C C M A C
M M M M F
A A M C M
M M M F M
C A A M M
M M M M F
C A A A A
Gender Computer Info. Systems Marketing TOTAL
Male 9 2 25
Female 6 3 15
TOTAL 15 5 40
Major Female TOTAL
Accounting 15% 50%
Computer Info. Systems 15% 37.50%
Marketing 7.50% 12.50%
TOTAL 37.50% 100%
Female TOTAL
40% 100%
40% 100%
20% 100%
100% 100%
88 78 78 73 91 78 85
Solution:
Exam scores in Mathematics (ordered array)
73 78 78 78 85 88 91
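An ordered array is simply the raw values sorted from smallest to largest. A one-line check of the array above, using Python's built-in sorted:

```python
# Raw exam scores in Mathematics, as listed in the problem.
scores = [88, 78, 78, 73, 91, 78, 85]

# The ordered array places the values from smallest to largest.
ordered = sorted(scores)
print(ordered)
```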
2.13 In late 2011 and early 2012, the Universal Health Care Foundation of
Connecticut surveyed small business owners across the state that employed 50 or
fewer employees. The purpose of the study was to gain insight on the current small
business health-care environment. Small business owners were asked if they offered
health-care plans to their employees and if so, what portion (%) of the employee
monthly health-care premium the business paid. The following frequency distribution
was formed to summarize the portion of premium paid for 89 small businesses that
offer health-care plans to employees.
a. What percentage of small businesses pay less than 26% of the employee monthly
health-care premium?
- (2 + 4)/89 x 100 = 6/89 x 100 = 6.74%
- Therefore, 6.74% of small businesses pay less than 26% of the employee
monthly health-care premium.
b. What percentage of small businesses pay between 26% and 75% of the employee
monthly health-care premium?
- (16 + 21)/89 x 100 = 37/89 x 100 = 41.57%
- Therefore, 41.57% of small businesses pay between 26% and 75% of the
employee monthly health-care premium.
c. What percentage of small businesses pay more than 75% of the employee monthly
health-care premium?
- (23 + 23)/89 x 100 = 46/89 x 100 = 51.69%
- Therefore, 51.69% of small businesses pay more than 75% of the employee
monthly health-care premium.
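The three answers can be verified from the class frequencies used in the calculations above (2, 4, 16, 21, 23, 23). The frequency distribution itself is not reproduced in this solution, so the sketch below assumes those six counts are listed in increasing order of premium share, with the first two classes below 26% and the last two above 75%.

```python
# Class frequencies taken from the calculations above.
# Assumption: classes are ordered from the lowest to the highest premium share.
class_counts = [2, 4, 16, 21, 23, 23]
total = sum(class_counts)


def share(counts):
    """Percentage of all businesses that fall in the given classes."""
    return round(100 * sum(counts) / total, 2)


less_than_26 = share(class_counts[:2])
between_26_and_75 = share(class_counts[2:4])
more_than_75 = share(class_counts[4:])

print(total, less_than_26, between_26_and_75, more_than_75)
```

The three shares add up to 100% apart from rounding, which is a quick check that no class was counted twice or left out.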
6 8 6 5 4 3 6
8 9 4 7 9 4 5
6 6 6 7 8 8 8
7 7 8 8 9 9 9
8
8
9
9
PROBLEM 3: VISUALIZING CATEGORICAL VARIABLES
Activity Percentage
Attending class/lab 9%
Sleeping 24%
Socializing, recreation, other 51%
Studying 7%
Working, volunteering, student clubs 9%
[Figures: bar chart (Activity vs. Percentage) and Pareto chart of percentage of time by activity]
b. Which graphical method do you think is best for portraying these data?
- I don't think there is a single "best" method for portraying these data; it is largely a matter of taste. The pie chart
may be the easiest of these to use, as it compares all of the pieces together. The Pareto chart shows more than the
others, as it includes the cumulative amounts and the sequence from largest to smallest. The bar chart gives a
decent sense of the distribution of the various activities.
c. What conclusions can you reach concerning what college students do with their time?
- I can conclude that these college students seem to spend considerable time (51%) on recreational activities and
like to socialize with people in their free time.
2.26 The following data were recorded on consumer complaints in a hotel:
Heating 30
Cleaning 100
Towels 50
Theft 2
Noise 10
Room service 20
[Figure: Pareto chart of hotel complaints by Complaint Type (Cleaning, Towels, Heating, Room service, Noise, Theft), with cumulative percentage]
b. What were the top and bottom 50% of complaints received for?
- The top 47.17% of complaints were received for cleaning issues alone; the remaining 52.83% of
the complaints comprise towels, heating, room service, noise, and theft, in that order.
c. Based on results of a and b, what would you advise the hotel management to
prioritize?
- Based on the results, the management should prioritize the improvement of the cleaning
services.
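The figures behind parts b and c come from sorting the complaint counts and accumulating their shares, which is what a Pareto chart displays. A minimal sketch using the six counts given in the problem; pareto_table is a hypothetical helper name.

```python
# Hotel complaint counts from Problem 2.26.
complaints = {
    "Heating": 30,
    "Cleaning": 100,
    "Towels": 50,
    "Theft": 2,
    "Noise": 10,
    "Room service": 20,
}


def pareto_table(freqs):
    """Rows of (name, count, percentage, cumulative percentage), largest first."""
    total = sum(freqs.values())
    rows = []
    running = 0
    for name, count in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((name, count, 100 * count / total, 100 * running / total))
    return rows


table = pareto_table(complaints)
for name, count, pct, cum in table:
    print(f"{name:<13}{count:>4}{pct:>8.2f}%{cum:>8.2f}%")
```

The first row shows cleaning at 47.17% of the 212 complaints, which supports the recommendation to prioritize cleaning services.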
2.27 The Edmunds.com NHTSA Complaints Activity Report contains consumer vehicle
complaint submissions by automaker, brand, and category. The following tables, stored in
Automaker1 and Automaker2, represent complaints received by automaker and complaints
received by category for January 2013.
Automaker Number
American Honda 169
Chrysler LLC 439
Ford Motor Company 440
General Motors 551
Nissan Motors Corporation 467
Toyota Motor Sales 332
Other 516
TOTAL 2,914
[Figure: bar chart of Number of complaints by Automaker]
Automaker Percentage
American Honda 6%
Chrysler LLC 15%
Ford Motor Company 15%
General Motors 19%
Nissan Motors Corporation 16%
Toyota Motor Sales 11%
Other 18%
TOTAL 100%
b. Which graphical method do you think is best for portraying these data?
- For me, the most suitable method for portraying these data is the bar chart,
because the purpose of the activity report conducted by Edmunds.com is simply to
compare the categories and to see which automaker received the highest
percentage of complaints.
Category Number
Airbags and seatbelts 201
Body and glass 182
Brakes 163
Fuel/emission/exhaust system 240
Interior electronics/hardware 279
Powertrain 1,148
Steering 397
Tires and wheels 71
Category Number
Powertrain 1,148
Steering 397
Interior electronics/hardware 279
Fuel/emission/exhaust system 240
Airbags and seatbelts 201
Body and glass 182
Brakes 163
Tires and wheels 71
[Figure: Pareto chart, Category of Complaints]
d. Discuss the "vital few" and "trivial many" reasons for the categories of complaints.
- Based on our chart, about 77% of all complaints come from only half of the categories. The "vital
few" are the following four categories: (1) powertrain, (2) steering, (3) interior electronics/hardware, and (4)
fuel/emission/exhaust system. The "trivial many" are the other four categories: (1) airbags and seatbelts, (2) body
and glass, (3) brakes, and (4) tires and wheels. Thus, in order to spend their resources wisely, automakers should
focus their efforts on the "vital few" in order to solve the largest number of complaints with the fewest resources.
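The share attributed to the "vital few" can be checked by ranking the categories and summing the top four. This sketch uses the brake count of 163 from the category table in Problem 2.4, which is the value consistent with the stated total of 2,681; the variable names are illustrative.

```python
# Complaints by category, January 2013. Brakes = 163, matching the 2,681 total.
categories = {
    "Airbags and seatbelts": 201,
    "Body and glass": 182,
    "Brakes": 163,
    "Fuel/emission/exhaust system": 240,
    "Interior electronics/hardware": 279,
    "Powertrain": 1148,
    "Steering": 397,
    "Tires and wheels": 71,
}

total = sum(categories.values())

# Rank categories from most to fewest complaints, as a Pareto chart does.
ranked = sorted(categories.items(), key=lambda kv: kv[1], reverse=True)

vital_few = [name for name, _ in ranked[:4]]
vital_few_share = 100 * sum(count for _, count in ranked[:4]) / total

print(vital_few)
print(f"Share of complaints in the top four categories: {vital_few_share:.2f}%")
```

Half of the categories account for roughly three quarters of the complaints, and powertrain alone for more than 40%.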
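What college students do with their time (Pareto chart data)
Activity Percentage Cumulative Percentage
Socializing, recreation, other 51% 51%
Sleeping 24% 75%
Attending class/lab 9% 84%
Working, volunteering, student clubs 9% 93%
Studying 7% 100%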
54 69 98 93 53 74
Solution:
Stem-and-leaf display
5 | 3 4
6 | 9
7 | 4
8 |
9 | 3 8
Stem-and-leaf display
5 | 0
6 |
7 | 4 4 6
8 | 1 9
9 | 2
Solution:
Ordered array
50 74 74 76 81 89 92
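A stem-and-leaf display splits each value into a stem (the tens digit) and a leaf (the units digit). The sketch below rebuilds the display for the first data set (54 69 98 93 53 74); the function name stem_and_leaf and the "|" separator are illustrative choices, and it assumes non-negative two-digit integers.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines with the tens digit as stem and units digit as leaf."""
    leaves = defaultdict(list)
    for value in sorted(values):
        leaves[value // 10].append(value % 10)
    lines = []
    # Include empty stems so gaps in the data remain visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem} | {row}".rstrip())
    return lines


data = [54, 69, 98, 93, 53, 74]
display = stem_and_leaf(data)
print("\n".join(display))
```

Reading a display back into an ordered array is the reverse step: join each stem with each of its leaves, row by row.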
PROBLEM 5: VISUALIZING TWO NUMERICAL VARIABLES
x: 7 5 8 3 6 0 2 4 9 5 8
y: 1 5 4 9 8 0 6 2 7 5 4
Solution:
x y
7 1
5 5
8 4
3 9
6 8
0 0
2 6
4 2
9 7
5 5
8 4
[Figure: scatter plot of Y versus X]
2.49 The following is a series of annual sales (in $millions) over an 11-year
period (2002 to 2012):
Year : 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
Sales: 13.0 17.0 19.0 20.0 20.5 20.5 20.5 20.0 19.0 17.0 13.0
Year Sales (millions of $)
2002 13.0
2003 17.0
2004 19.0
2005 20.0
2006 20.5
2007 20.5
2008 20.5
2009 20.0
2010 19.0
2011 17.0
2012 13.0
[Figure: time-series plot of Sales (millions of $) by Year]