MGS2150_Lecture2_Notes_II
1
Introduction
2
Summarizing Data for Two Variables Using Tables
3
Crosstabulation
4
Crosstabulation - Example
• In the left margin, the row labels correspond to the three rating categories
for the quality rating variable.
• In the top margin, the column labels show that the meal price data have been
grouped into four classes.
• Each restaurant is associated with a cell appearing in one of the rows and
one of the columns of the crosstabulation.
• Count the number of restaurants that belong to each of the cells.
                            Meal Price
Quality Rating   $10-19   $20-29   $30-39   $40-49   Total
Good                 42       40        2        0      84
Very Good            34       64       46        6     150
Excellent             2       14       28       22      66
Total                78      118       76       28     300
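The counting procedure in the bullets above can be sketched in a few lines of Python. This is only an illustration: the six records below are hypothetical and are not drawn from the 300-restaurant data set, and the names `restaurants`, `cells`, `row_totals`, and `col_totals` are made up for this sketch.

```python
from collections import Counter

# Hypothetical records: (quality rating, meal price class).
# Invented for illustration; NOT the Los Angeles restaurant data.
restaurants = [
    ("Good", "$10-19"), ("Very Good", "$20-29"), ("Good", "$20-29"),
    ("Excellent", "$30-39"), ("Very Good", "$40-49"), ("Excellent", "$30-39"),
]

# Each restaurant belongs to exactly one (row, column) cell; count per cell.
cells = Counter(restaurants)

# Margins of the crosstabulation: row totals, column totals, grand total.
row_totals = Counter(rating for rating, _ in restaurants)
col_totals = Counter(price_class for _, price_class in restaurants)
grand_total = sum(cells.values())

for (rating, price_class), n in sorted(cells.items()):
    print(f"{rating:<10} {price_class:<7} {n}")
```

Cells that never occur are simply absent from `cells`; looking one up returns 0, which matches the zero entries of a crosstabulation.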
6
Crosstabulation of Quality Rating and Meal Price Data for
300 Los Angeles Restaurants
7
Construct a Crosstabulation with PivotTable
8
Initial PivotTable Fields Task Pane and PivotTable for the
Restaurant Data
9
Final PivotTable for the Restaurant Data
• Editing Options:
Step 1. Right-click cell B4 in the PivotTable or any other cell
containing meal prices.
Step 2. Choose Group from the list of options that appears.
Step 3. When the Grouping dialog box appears:
Enter 10 in the Starting at: box; Enter 49 in the Ending at: box
Enter 10 in the By: box; Click OK
Step 4. Right-click on Excellent in cell A5
Select Move and click Move “Excellent” to End
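For readers working outside Excel, the grouping in Step 3 (start at 10, end at 49, step by 10) can be approximated as below. This is a rough sketch, not a description of how Excel implements grouping: the function name `group_prices` is hypothetical, prices are assumed to be whole dollars, and values outside the range are returned as `None` by choice.

```python
def group_prices(prices, start=10, end=49, by=10):
    """Assign each whole-dollar price to a class of width `by`.

    Classes run from `start` through `end`; prices outside that range
    get None (an arbitrary choice made for this sketch).
    """
    labels = []
    for price in prices:
        if not start <= price <= end:
            labels.append(None)
            continue
        lower = start + ((price - start) // by) * by
        upper = min(lower + by - 1, end)
        labels.append(f"${lower}-{upper}")
    return labels


# Hypothetical meal prices, used only to exercise the function.
sample_prices = [18, 22, 28, 33, 44, 49]
classes = group_prices(sample_prices)
print(classes)
```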
10
Final PivotTable for the Restaurant Data (Cont’d)
11
Crosstabulation: Row Percentages
12
Row Percentages for Each Quality Rating Category
                            Meal Price
Quality Rating   $10-19   $20-29   $30-39   $40-49    Total
Good              50.0%    47.6%     2.4%     0.0%   100.0%
Very Good         22.7%    42.7%    30.7%     4.0%   100.0%
Excellent          3.0%    21.2%    42.4%    33.3%   100.0%
• Of the restaurants with the lowest quality rating (Good), 50% have the lowest
meal prices ($10-19).
• Of the restaurants with an Excellent quality rating, the greatest percentages
are in the more expensive classes ($30-39 and $40-49).
• Restaurants with higher meal prices tended to receive higher quality ratings.
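Row percentages divide each cell count by its row total. The sketch below recomputes them from the crosstabulation counts; the names `counts` and `row_percentages` are illustrative. Each cell is rounded independently to one decimal, so a value may differ in the last digit from a table whose entries were adjusted to make every row sum to exactly 100%.

```python
# Cell counts from the crosstabulation; columns are the price classes
# $10-19, $20-29, $30-39, $40-49.
counts = {
    "Good":      [42, 40, 2, 0],
    "Very Good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}


def row_percentages(table):
    """Return each cell as a percentage of its row total (one decimal)."""
    result = {}
    for label, row in table.items():
        total = sum(row)
        result[label] = [round(100 * n / total, 1) for n in row]
    return result


row_pct = row_percentages(counts)
for label, percentages in row_pct.items():
    print(f"{label:<10}", percentages)
```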
13
Crosstabulation: Column Percentages
                            Meal Price
Quality Rating   $10-19   $20-29   $30-39   $40-49
Good              53.8%    33.9%     2.6%     0.0%
Very Good         43.6%    54.2%    60.5%    21.4%
Excellent          2.6%    11.9%    36.8%    78.6%
Total            100.0%   100.0%   100.0%   100.0%
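Column percentages divide each cell count by its column total instead of its row total. A minimal sketch using the crosstabulation counts follows; the variable names are illustrative.

```python
# Cell counts from the crosstabulation, one list per quality rating.
price_classes = ["$10-19", "$20-29", "$30-39", "$40-49"]
counts = {
    "Good":      [42, 40, 2, 0],
    "Very Good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}

# zip(*rows) transposes the table so each tuple holds one price class.
col_totals = [sum(column) for column in zip(*counts.values())]

# Each cell as a percentage of its column total, rounded to one decimal.
col_pct = {
    label: [round(100 * n / total, 1) for n, total in zip(row, col_totals)]
    for label, row in counts.items()
}

for label, percentages in col_pct.items():
    print(f"{label:<10}", percentages)
```

Reading down a column answers a different question than reading across a row: here, what share of restaurants in a given price class earned each rating.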
15
Simpson’s Paradox: Example
• Western University has only one women’s softball scholarship remaining for the
coming year. The final two players that Western is considering are Allison and Emily.
The coaching staff has concluded that the speed and defensive skills are virtually
identical for the two players, and that the final decision will be based on which
player has the best batting average. Crosstabulations of each player’s batting
performance in their junior and senior years of high school are as follows:
Allison
Outcome          Junior        Senior
Hit              15 (38%)      75 (30%)
No Hit           25 (63%)     175 (70%)
Total At-Bats    40 (100%)    250 (100%)

Emily
Outcome          Junior        Senior
Hit              70 (35%)      35 (29%)
No Hit          130 (65%)      85 (71%)
Total At-Bats   200 (100%)    120 (100%)
Allison had the higher batting average in both her junior year and
senior year.
16
Simpson’s Paradox: Example (Cont’d)
• When the results are aggregated for each player, a different picture emerges:
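– Crosstabulations of each player’s batting performance over the combined two years
(junior and senior) of high school are as follows:
Outcome          Allison       Emily
Hit              90 (31%)     105 (33%)
No Hit          200 (69%)     215 (67%)
Total At-Bats   290 (100%)    320 (100%)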
• Based on aggregated crosstabulation, Emily has the higher batting average over the
combined two years.
• This result contradicts the conclusion we reached with the unaggregated crosstabulation.
• The decision maker will have to decide whether the unaggregated or the aggregated form
of the crosstabulation is more helpful in reaching the desired conclusion.
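The reversal can be checked directly from the hit and at-bat counts in the example. The sketch below computes each player's batting average by year and over both years combined; the dictionary layout and names are illustrative.

```python
# (hits, total at-bats) per player and year, taken from the example.
at_bats = {
    "Allison": {"Junior": (15, 40), "Senior": (75, 250)},
    "Emily":   {"Junior": (70, 200), "Senior": (35, 120)},
}

# Unaggregated view: batting average for each player in each year.
by_year = {
    player: {year: hits / total for year, (hits, total) in years.items()}
    for player, years in at_bats.items()
}

# Aggregated view: pool hits and at-bats across both years, then divide.
combined = {}
for player, years in at_bats.items():
    hits = sum(h for h, _ in years.values())
    total = sum(t for _, t in years.values())
    combined[player] = hits / total

for player in at_bats:
    print(player, by_year[player], round(combined[player], 3))
```

Allison leads in each year separately, yet most of her at-bats fall in the senior year, when both players hit worse. Pooling therefore weights her lower senior average heavily and lets Emily come out ahead overall.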
17
Summarizing Data for Two Variables Using Graphical
Displays
18
Scatter Diagram and Trendline
19
Types of Relationships Depicted by Scatter Diagram
20
Scatter Diagram and Trendline: Example
        Number of      Sales
Week    Commercials    Volume
1       2              50
2       5              57
3       1              41
.       .              .
.       .              .
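A trendline is commonly fitted by least squares. The sketch below assumes that method and uses only the three weeks visible in the table above, so its slope and intercept are not those of the full data set. The function name `fit_trendline` is hypothetical.

```python
def fit_trendline(x, y):
    """Least-squares line y = intercept + slope * x for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Only weeks 1-3 are shown in the table; the remaining weeks are elided
# in the source and are deliberately not filled in here.
commercials = [2, 5, 1]
sales = [50, 57, 41]

slope, intercept = fit_trendline(commercials, sales)
print(f"sales = {intercept:.2f} + {slope:.2f} * commercials")
```

A positive slope corresponds to an upward-sloping trendline: weeks with more commercials tend to show higher sales volume.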
21
Scatter Diagram and Trendline: Example (Cont’d)
22
Excel: Construct a Scatter Diagram
23
Excel: Construct a Scatter Diagram (Cont’d)
24
Side-by-Side Bar Charts
25
Side-by-Side Bar Charts – Example
• As the price increases (left to right), the height of the light blue bars
decreases and the height of the dark blue bars generally increases.
→ As price increases, the quality rating tends to be better.
• The very good rating tends to be more prominent in the middle price
categories.
26
Excel: Construct a Side-by-Side Bar Chart
27
Stacked Bar Chart
• A stacked bar chart is a bar chart in which each bar is broken into
rectangular segments of a different color showing the relative
frequency of each class.
• Because percentage frequencies are displayed, all bars are of the
same height, extending to the 100% mark.
• Example - Zagat’s restaurant reviews: As price increases, the quality
rating tends to be better.
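To make the equal-height property concrete, the sketch below draws a text version of a 100% stacked bar for each price class from the crosstabulation counts. The symbols, the bar width of 50 characters, and the function name `stacked_bar` are arbitrary choices made for this illustration.

```python
price_classes = ["$10-19", "$20-29", "$30-39", "$40-49"]
counts = {
    "Good":      [42, 40, 2, 0],
    "Very Good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}
ratings = list(counts)
symbols = {"Good": ".", "Very Good": "o", "Excellent": "#"}  # arbitrary
WIDTH = 50  # characters that represent 100%


def stacked_bar(column_counts, width=WIDTH):
    """Build one bar whose segments are proportional to the counts."""
    total = sum(column_counts)
    bar, running = "", 0
    for rating, n in zip(ratings, column_counts):
        # Rounding cumulative boundaries keeps every bar exactly `width` long.
        start = round(width * running / total)
        running += n
        end = round(width * running / total)
        bar += symbols[rating] * (end - start)
    return bar


bars = {
    price_class: stacked_bar(column)
    for price_class, column in zip(price_classes, zip(*counts.values()))
}
for price_class, bar in bars.items():
    print(f"{price_class} |{bar}|")
```

Every bar has the same length because each is scaled to its own column total. Only the split between segments changes, and the `#` (Excellent) segment grows as the price class rises.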
28
Excel: Construct a Stacked Bar Chart
• You can easily change the side-by-side bar chart to a stacked bar
chart using the following steps.
Step 1. Click on the bar chart. Click the Design tab on the Ribbon.
In the Type group, click Change Chart Type.
Step 2. When the Change Chart Type dialog box appears:
Select the 100% Stacked Columns option. Click OK.
29
Data Visualization
30
Creating Effective Graphical Displays
31
Choosing the Type of Graphical Display
32
Summary of Graphical Displays Used to Make
Comparisons and Show Relationships
33
Data Dashboards
34
Data Dashboard: Example
35
A Summary of Tabular and Graphical Displays of Data
36