
Problem 1: Organizing Categorical Variables: Solution

This document contains summaries of categorical variable analysis problems. For the first problem, the percentages of occurrences in each of three categories (A, B, C) are computed. Category B has the highest percentage at 56% while C is the lowest at 18%. For the second problem, contingency tables are constructed from survey data on student gender and major. The tables show that males majoring in accounting make up 35% of respondents and powertrain complaints make up 42.82% of all complaints in an automaker complaints report.

Uploaded by

Argieshi GC
Copyright
© All Rights Reserved

Problem 1: ORGANIZING CATEGORICAL VARIABLES

2.1 A categorical variable has three categories, with the following frequencies of
occurrence:

Category   Frequency
A          13
B          28
C          9

a. Compute the percentage of values in each category.
b. What conclusions can you reach concerning the categories?

A. Compute the percentage of values in each category.

Solution:

ORGANIZING CATEGORICAL VARIABLES

Category   Frequency   Percentage of Frequencies
A          13          26%
B          28          56%
C          9           18%
TOTAL      50          100%

B. What conclusions can you reach concerning the categories?

- From the table in part A, you can conclude that category B has the highest
percentage of frequencies (56%), while category C has the lowest (18%).
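
Each percentage in part A is the category frequency divided by the total of 50. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative and are not part of the original worksheet.

```python
# Frequencies from problem 2.1.
frequencies = {"A": 13, "B": 28, "C": 9}

total = sum(frequencies.values())

# Percentage of values in each category: frequency / total * 100.
percentages = {cat: 100 * n / total for cat, n in frequencies.items()}

for cat, pct in percentages.items():
    print(f"{cat}: {pct:.0f}%")

# Categories with the largest and smallest share.
highest = max(percentages, key=percentages.get)
lowest = min(percentages, key=percentages.get)
print(f"Highest: {highest}, lowest: {lowest}")
```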

2.2 The following data represent the responses to two questions asked in a
survey of 40 college students majoring in business: What is your gender? (M=Male;
F=Female) and What is your major? (A=Accounting; C=Computer Information
Systems; M=Marketing)

Gender  M M M F M F F M F M
Major   A C C M A C A A C C
Gender  F M M M M F F M F F
Major   A A A M C M A A A C
Gender  M M M M F M F F M M
Major   C C A A M M C A A A
Gender  F M M M M F M F M M
Major   C C A A A A C C A C

A. Tally the data into a contingency table where the two rows represent the
gender categories and the three columns represent the academic major
categories.

Solution:
Gender   Accounting   Computer Info. System   Marketing   TOTAL
Male     14           9                       2           25
Female   6            6                       3           15
TOTAL    20           15                      5           40
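
The tally in part A can be reproduced by counting (gender, major) pairs. The sketch below assumes the 40 responses from the worksheet are entered so that each gender code is paired with the major code directly beneath it; the string encoding and the variable names are illustrative choices, not part of the original solution.

```python
from collections import Counter

# Survey responses from problem 2.2, one string segment per worksheet row.
genders = "MMMFMFFMFM" "FMMMMFFMFF" "MMMMFMFFMM" "FMMMMFMFMM"
majors = "ACCMACAACC" "AAAMCMAAAC" "CCAAMMCAAA" "CCAAAACCAC"

# Count each (gender, major) combination.
counts = Counter(zip(genders, majors))

print("Gender  A   C   M   TOTAL")
for g in "MF":
    row = [counts[(g, m)] for m in "ACM"]
    print(f"{g:<7} " + " ".join(f"{n:<3}" for n in row) + f" {sum(row)}")

# Column totals across both genders.
column_totals = [sum(counts[(g, m)] for g in "MF") for m in "ACM"]
print("TOTAL   " + " ".join(f"{n:<3}" for n in column_totals) + f" {sum(column_totals)}")
```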

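B. Construct contingency tables based on percentages of all 40 student
responses, based on row percentages, and based on column percentages.

CONTINGENCY TABLES BASED ON OVERALL TOTAL PERCENTAGE

Gender   Accounting   Computer Info. System   Marketing   TOTAL
Male     14           9                       2           25
Female   6            6                       3           15
TOTAL    20           15                      5           40

Each count is divided by the overall total of 40:

                        Male            Female
Accounting              14/40 = 35%     6/40 = 15%
Computer Info. System   9/40 = 22.5%    6/40 = 15%
Marketing               2/40 = 5%       3/40 = 7.5%

Academic Major          Male     Female   TOTAL
Accounting              35%      15%      50%
Computer Info. System   22.50%   15%      37.50%
Marketing               5%       7.50%    12.50%
TOTAL                   62.50%   37.50%   100%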

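CONTINGENCY TABLE BASED ON ROW PERCENTAGE

Gender   Accounting   Computer Info. System   Marketing   TOTAL
Male     14           9                       2           25
Female   6            6                       3           15
TOTAL    20           15                      5           40

Each count is divided by its gender total (25 males, 15 females):

                        Male           Female
Accounting              14/25 = 56%    6/15 = 40%
Computer Info. System   9/25 = 36%     6/15 = 40%
Marketing               2/25 = 8%      3/15 = 20%

Academic Major          Male   Female   TOTAL
Accounting              56%    40%      50%
Computer Info. System   36%    40%      37.50%
Marketing               8%     20%      12.50%
TOTAL                   100%   100%     100%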

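CONTINGENCY TABLE BASED ON COLUMN PERCENTAGE

Gender   Accounting   Computer Info. System   Marketing   TOTAL
Male     14           9                       2           25
Female   6            6                       3           15
TOTAL    20           15                      5           40

Each count is divided by its major total (20, 15, and 5):

                        Male           Female
Accounting              14/20 = 70%    6/20 = 30%
Computer Info. System   9/15 = 60%     6/15 = 40%
Marketing               2/5 = 40%      3/5 = 60%

Gender   Accounting   Computer Info. System   Marketing   Total
Male     70%          60%                     40%         62.50% (25/40)
Female   30%          40%                     60%         37.50% (15/40)
Total    100%         100%                    100%        100.00%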
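
The three percentage tables differ only in the denominator: the grand total, the row (gender) total, or the column (major) total. The sketch below computes all three from the tallied counts. The dictionary layout and the names are assumptions made for illustration.

```python
# Counts from the contingency table in problem 2.2 (rows: gender, columns: major).
majors = ["Accounting", "Computer Info. System", "Marketing"]
counts = {
    "Male": [14, 9, 2],
    "Female": [6, 6, 3],
}

grand_total = sum(sum(row) for row in counts.values())
row_totals = {g: sum(row) for g, row in counts.items()}
col_totals = [sum(row[j] for row in counts.values()) for j in range(len(majors))]

# Overall percentages: each cell divided by the grand total.
overall = {g: [100 * n / grand_total for n in row] for g, row in counts.items()}

# Row percentages: each cell divided by its gender total.
by_row = {g: [100 * n / row_totals[g] for n in row] for g, row in counts.items()}

# Column percentages: each cell divided by its major total.
by_col = {
    g: [100 * n / col_totals[j] for j, n in enumerate(row)]
    for g, row in counts.items()
}

for name, table in [("Overall", overall), ("Row", by_row), ("Column", by_col)]:
    print(f"{name} percentages")
    for g, row in table.items():
        print(f"  {g:<7}" + "".join(f"{p:8.2f}%" for p in row))
```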

2.4 The Edmunds.com NHTSA Complaints Activity Report contains consumer vehicle complaint
submissions by automaker, brand, and category (data extracted from edmun.in/Ybmpuz). The following
tables, stored in Automaker1 and Automaker2, represent complaints received by automaker and
complaints received by category for January 2013.

Automaker                   Number
American Honda              169
Chrysler LLC                439
Ford Motor Company          440
General Motors              551
Nissan Motors Corporation   467
Toyota Motor Sales          332
Other                       516

Category                        Number
Airbags and Seatbelts           201
Body and Glass                  182
Brakes                          163
Fuel/emission/exhaust system    240
Interior electronics/hardware   279
Powertrain                      1,148
Steering                        397
Tires and Wheels                71

A. Compute the percentage of complaints for each automaker.

Solution:
Automaker                   Number   Percentage
American Honda              169      5.80%
Chrysler LLC                439      15.07%
Ford Motor Company          440      15.10%
General Motors              551      18.91%
Nissan Motors Corporation   467      16.03%
Toyota Motor Sales          332      11.39%
Other                       516      17.71%
TOTAL                       2,914    100.00%

B. What conclusions can you reach about the complaints for the different
automakers?

- From the table in part A, you can conclude that General Motors has the
highest percentage of complaints (18.91%), while American Honda has the
lowest (5.80%).

C. Compute the percentage of complaints for each category.

Solution:
Category                        Number   Percentage
Airbags and Seatbelts           201      7.50%
Body and Glass                  182      6.79%
Brakes                          163      6.08%
Fuel/emission/exhaust system    240      8.95%
Interior electronics/hardware   279      10.41%
Powertrain                      1,148    42.82%
Steering                        397      14.81%
Tires and Wheels                71       2.65%
TOTAL                           2,681    100.00%

D. What conclusions can you reach about the complaints for different categories?
- From the table in part C, you can conclude that the Powertrain category has the highest percentage of
complaints with 42.82%, followed by Steering with 14.81%. Meanwhile, the lowest percentage of complaints
is in the Tires and Wheels category with 2.65%. Thus, automakers should focus on addressing powertrain
issues, as this category has the highest number of complaints according to the Edmunds.com
Complaints Activity Report.

PROBLEM 2: ORGANIZING NUMERICAL VARIABLES

2.11 Construct an ordered array, given the following data
from a sample of exam scores in Mathematics.

88 78 78 73 91 78 85

Solution:
Exam scores in Mathematics (ordered array)

73 78 78 78 85 88 91

2.12 Construct an ordered array for 30 students' GPA.

5 4 6 7 9 8 5 6 8 6 5 4 3 6 8
9 7 6 5 9 5 7 8 9 4 7 9 4 5 8

Solution: Ordered array

3 4 4 4 4 5 5 5 5 5 5 6 6 6 6
6 7 7 7 7 8 8 8 8 8 9 9 9 9 9

Can you draw any meaningful conclusions? Why or why not?

- Yes, the ordered array enables you to quickly see that the students' GPAs
range from 3 to 9.
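
An ordered array lists every value from smallest to largest. The following is a minimal sketch using Python's built-in `sorted`, assuming the 30 GPA values of problem 2.12 are entered as a single list (the variable names are illustrative):

```python
# GPA values for the 30 students in problem 2.12.
gpas = [
    5, 4, 6, 7, 9, 8, 5, 6, 8, 6, 5, 4, 3, 6, 8,
    9, 7, 6, 5, 9, 5, 7, 8, 9, 4, 7, 9, 4, 5, 8,
]

# An ordered array is the data sorted from smallest to largest.
ordered = sorted(gpas)

print(ordered)
print(f"Smallest: {ordered[0]}, largest: {ordered[-1]}")
```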

2.13 In late 2011 and early 2012, the Universal Health Care Foundation of
Connecticut surveyed small business owners across the state that employed 50 or
fewer employees. The purpose of the study was to gain insight on the current small
business health-care environment. Small business owners were asked if they offered
health-care plans to their employees and if so, what portion (%) of the employee
monthly health-care premium the business paid. The following frequency distribution
was formed to summarize the portion of premium paid for 89 small businesses that
offer health-care plans to employees.

Portion of Premium Paid (%)   Frequency
Less than 1%                  2
1% but less than 26%          4
26% but less than 51%         16
51% but less than 76%         21
76% but less than 100%        23
100%                          23

a. What percentage of small businesses pay less than 26% of the employee monthly
health-care premium?
- (2 + 4)/89 x 100 = 6/89 x 100 = 6.74%
- Therefore, 6.74% of small businesses pay less than 26% of the employee
monthly health-care premium.

b. What percentage of small businesses pay between 26% and 75% of the employee
monthly health-care premium?
- (16 + 21)/89 x 100 = 37/89 x 100 = 41.57%
- Therefore, 41.57% of small businesses pay between 26% and 75% of the
employee monthly health-care premium.

c. What percentage of small businesses pay more than 75% of the employee monthly
health-care premium?
- (23 + 23)/89 x 100 = 46/89 x 100 = 51.69%
- Therefore, 51.69% of small businesses pay more than 75% of the employee
monthly health-care premium.
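
Each answer above adds the frequencies of adjacent classes and divides by the 89 businesses. The following is a short sketch of that calculation. The helper `share` and its index arguments are hypothetical names introduced for illustration only.

```python
# Frequency distribution from problem 2.13: (class label, frequency).
distribution = [
    ("Less than 1%", 2),
    ("1% but less than 26%", 4),
    ("26% but less than 51%", 16),
    ("51% but less than 76%", 21),
    ("76% but less than 100%", 23),
    ("100%", 23),
]

total = sum(freq for _, freq in distribution)


def share(first, last):
    """Percentage of businesses in classes first..last (inclusive indices)."""
    count = sum(freq for _, freq in distribution[first:last + 1])
    return 100 * count / total


print(f"a. Less than 26%: {share(0, 1):.2f}%")
print(f"b. 26% to 75%:    {share(2, 3):.2f}%")
print(f"c. More than 75%: {share(4, 5):.2f}%")
```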
PROBLEM 3: VISUALIZING CATEGORICAL VARIABLES

2.25 What do college students do with their time? A survey of 3,000
traditional-age students was taken, with the results as follows:

Activity                               Percentage
Attending class/lab                    9%
Sleeping                               24%
Socializing, recreation, other         51%
Studying                               7%
Working, volunteering, student clubs   9%

a. Construct a bar chart, a pie chart, and a
Pareto chart.

[Figure: Bar chart of what college students do with their time. Horizontal axis: Percentage; vertical axis: Activity.]

[Figure: Pie chart of what college students do with their time. Slices: Attending class/lab 9%; Sleeping 24%; Socializing, recreation, other 51%; Studying 7%; Working, volunteering, student clubs 9%.]

Activity                               Total Percentage   Cumulative Percentage
Socializing, recreation, other         51%                51%
Sleeping                               24%                75%
Attending class/lab                    9%                 84%
Working, volunteering, student clubs   9%                 93%
Studying                               7%                 100%

[Figure: Pareto chart of what college students do with their time. Horizontal axis: Activity; bars: Total Percentage; line: Cumulative Percentage.]

b. Which graphical method do you think is best for portraying these data?

- I think there is no single "best" method for portraying these data. It is largely a matter of taste. The pie chart
would arguably be the easiest of these to use, as it compares all of the pieces together. The Pareto chart shows
more data than the others, as it includes the cumulative amounts and the sequence from largest to smallest. The bar
chart gives a decent sense of the distribution of the various activities.

c. What conclusions can you reach concerning what college students do with their time?

- I can conclude that these college students seem to spend considerable time on recreational activities, and they
also like to socialize with people in their free time.

2.26 The following data have been recorded on consumer complaints in a hotel:

Complaint Type   Number of Consumer Complaints
Heating          30
Cleaning         100
Towels           50
Theft            2
Noise            10
Room service     20

a. Construct a Pareto chart.

Complaint Type   Number of Consumer Complaints   Total Percentage   Cumulative Percentage
Cleaning         100                             47%                47%
Towels           50                              24%                71%
Heating          30                              14%                85%
Room service     20                              9%                 94%
Noise            10                              5%                 99%
Theft            2                               1%                 100%

[Figure: Pareto chart of consumer complaints in a hotel. Horizontal axis: Complaint Type; left axis: No. of Consumer Complaints; right axis: Cumulative Percentage.]

b. What were the top and bottom 50% of complaints received for?

- The top 47.17% of complaints were received for cleaning issues alone; the remaining 52.83%
comprise towels, heating, room service, noise, and theft, in that order.

c. Based on results of a and b, what would you advise the hotel management to
prioritize?

- Based on the results, the management should prioritize the improvement of the cleaning
services.
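
The figures behind a Pareto chart come from sorting the categories by count and accumulating their share of the total. The sketch below works through the hotel complaint data. The tuple layout and variable names are illustrative, and plotting is left out so the example needs no third-party library.

```python
# Hotel complaint counts from problem 2.26.
complaints = {
    "Heating": 30,
    "Cleaning": 100,
    "Towels": 50,
    "Theft": 2,
    "Noise": 10,
    "Room service": 20,
}

total = sum(complaints.values())

# Pareto ordering: categories sorted from most to fewest complaints.
ordered = sorted(complaints.items(), key=lambda item: item[1], reverse=True)

# Each row holds (name, count, percentage of total, cumulative percentage).
pareto = []
running = 0
for name, count in ordered:
    running += count
    pareto.append((name, count, 100 * count / total, 100 * running / total))

print(f"{'Complaint Type':<14} {'Count':>5} {'Percent':>8} {'Cumulative':>11}")
for name, count, pct, cum in pareto:
    print(f"{name:<14} {count:>5} {pct:>7.2f}% {cum:>10.2f}%")
```

Sorting before accumulating is what lets the cumulative column show how few categories account for most complaints.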

2.27 The Edmunds.com NHTSA Complaints Activity Report contains consumer vehicle
complaint submissions by automaker, brand, and category. The following tables, stored in
Automaker1 and Automaker2, represent complaints received by automaker and complaints
received by category for January 2013.

Automaker                   Number   Percentage
American Honda              169      5.80%
Chrysler LLC                439      15.07%
Ford Motor Company          440      15.10%
General Motors              551      18.91%
Nissan Motors Corporation   467      16.03%
Toyota Motor Sales          332      11.39%
Other                       516      17.71%
TOTAL                       2,914    100.00%

a. Construct a bar chart and a pie chart for the
complaints received by automaker.

[Figure: Bar chart of the complaints received by automaker. Horizontal axis: Number (0 to 600); vertical axis: Automaker.]

[Figure: Pie chart of the complaints received by automaker.]

Automaker                   Percentage
American Honda              6%
Chrysler LLC                15%
Ford Motor Company          15%
General Motors              19%
Nissan Motors Corporation   16%
Toyota Motor Sales          11%
Other                       18%
TOTAL                       100%

b. Which graphical method do you think is best for portraying these data?

- For me, the most suitable method for portraying these data is the bar chart,
because the purpose of the Edmunds.com activity report is simply to
compare the categories and to see which automaker received the highest
percentage of complaints.

Category                        Number
Airbags and seatbelts           201
Body and glass                  182
Brakes                          63
Fuel/emission/exhaust system    240
Interior electronics/hardware   279
Powertrain                      1,148
Steering                        397
Tires and wheels                71

Category                        Number   Total Percentage   Cumulative Percentage
Powertrain                      1,148    44%                44%
Steering                        397      15%                60%
Interior electronics/hardware   279      11%                71%
Fuel/emission/exhaust system    240      9%                 80%
Airbags and seatbelts           201      8%                 88%
Body and glass                  182      7%                 95%
Tires and wheels                71       3%                 98%
Brakes                          63       2%                 100%

c. Construct a Pareto chart for the categories of
complaints.

[Figure: Pareto chart of categories of complaints. Horizontal axis: Category of Complaints; left axis: No. of Complaints; right axis: Cumulative Percentage.]

d. Discuss the "vital few" and "trivial many" reasons for the categories of complaints.

- Based on our chart, around 80% of all complaints come from only half of the total number of categories. The
"vital few" are the following four categories: (1) powertrain, (2) steering, (3) interior electronics/hardware, and
(4) fuel/emission/exhaust system. The "trivial many" are the other four categories: (1) airbags and seatbelts,
(2) body and glass, (3) tires and wheels, and (4) brakes. Thus, in order to spend their resources wisely, automakers
should focus their efforts on the "vital few" in order to solve the largest number of complaints with the fewest
resources.


PROBLEM 4: VISUALIZING NUMERICAL VARIABLES

2.33 Construct a stem-and-leaf display, given the following
data from a sample of midterm exam scores in finance:

54 69 98 93 53 74

Solution:

5 | 3 4
6 | 9
7 | 4
8 |
9 | 3 8
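
A stem-and-leaf display splits each score into a tens digit (the stem) and a units digit (the leaf). The following is a minimal sketch of that construction for the six finance scores, assuming two-digit values. The names are illustrative, and the `|` separator is a presentation choice.

```python
from collections import defaultdict

# Midterm exam scores from problem 2.33.
scores = [54, 69, 98, 93, 53, 74]

# Split each score into a stem (tens digit) and a leaf (units digit).
leaves = defaultdict(list)
for score in sorted(scores):
    stem, leaf = divmod(score, 10)
    leaves[stem].append(leaf)

# Build one row per stem from the smallest to the largest, including empty stems.
display = []
for stem in range(min(leaves), max(leaves) + 1):
    row = f"{stem} | " + " ".join(str(leaf) for leaf in leaves.get(stem, []))
    display.append(row.rstrip())

print("\n".join(display))
```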

2.34 Construct an ordered array, given the following
stem-and-leaf display from a sample of n=7 midterm exam scores in
information systems.

5 | 0
6 |
7 | 4 4 6
8 | 1 9
9 | 2

Solution:
Ordered array

50 74 74 76 81 89 92
PROBLEM 5: VISUALIZING TWO NUMERICAL VARIABLES

2.48 The following is a set of data from a sample of n=11 items.

x: 7 5 8 3 6 0 2 4 9 5 8
y: 1 5 4 9 8 0 6 2 7 5 4

a. Construct a scatter plot.

Solution:

x   y
7   1
5   5
8   4
3   9
6   8
0   0
2   6
4   2
9   7
5   5
8   4

[Figure: Scatter plot of the 11 (x, y) pairs. Horizontal axis: x (0 to 10); vertical axis: Y (0 to 10).]

b. Is there a relationship between x and y?
Explain.

- There is no apparent relationship between x and y: the points are scattered
with no clear upward or downward pattern.
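
The visual judgment from the scatter plot can be cross-checked numerically. The sketch below computes the sample correlation coefficient r for the 11 pairs. This check is an addition for illustration and is not part of the original solution, which relies on the plot alone.

```python
import math

# Data from problem 2.48.
x = [7, 5, 8, 3, 6, 0, 2, 4, 9, 5, 8]
y = [1, 5, 4, 9, 8, 0, 6, 2, 7, 5, 4]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

# Sample correlation coefficient r; values near 0 suggest little linear association.
r = sxy / math.sqrt(sxx * syy)
print(f"r = {r:.3f}")
```

A value of r close to zero is consistent with the absence of a clear linear pattern, although it does not rule out a nonlinear relationship.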

2.49 The following is a series of annual sales (in $millions) over an 11-year
period (2002 to 2012):

Year : 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
Sales: 13.0 17.0 19.0 20.0 20.5 20.5 20.5 20.0 19.0 17.0 13.0

a. Construct a time series plot.

Solution:

Year   Sales
2002   13
2003   17
2004   19
2005   20
2006   20.5
2007   20.5
2008   20.5
2009   20
2010   19
2011   17
2012   13

[Figure: Time series plot. Horizontal axis: Year; vertical axis: Sales (millions of $), 0 to 25.]

b. Does there appear to be any change in annual sales
over time? Explain.

- Yes. Annual sales increase from 2002 to 2006, hold steady at $20.5 million
from 2006 to 2008, and then decline from 2009 to 2012.
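
The direction of change can be read off the year-over-year differences in the sales series. The following is a small sketch of that calculation; the variable names and the trend labels are illustrative choices.

```python
# Annual sales (in $millions) from problem 2.49.
years = list(range(2002, 2013))
sales = [13.0, 17.0, 19.0, 20.0, 20.5, 20.5, 20.5, 20.0, 19.0, 17.0, 13.0]

# Year-over-year change: this year's sales minus the previous year's.
changes = [later - earlier for earlier, later in zip(sales, sales[1:])]

for year, change in zip(years[1:], changes):
    if change > 0:
        trend = "increase"
    elif change < 0:
        trend = "decrease"
    else:
        trend = "no change"
    print(f"{year}: {change:+.1f} ({trend})")
```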